Science.gov

Sample records for expression sequence tags

  1. Expressed Sequence Tags from Developing Castor Seeds.

    PubMed Central

    Van De Loo, F. J.; Turner, S.; Somerville, C.

    1995-01-01

    To expand the availability of genes encoding enzymes and structural proteins associated with storage lipid synthesis and deposition, partial nucleotide sequences, or expressed sequence tags (ESTs), were obtained for 743 cDNA clones derived from developing seeds of castor (Ricinus communis L.). Enrichment for seed-specific cDNA clones was obtained by selecting clones that did not detectably hybridize to first-strand cDNA from leaf mRNA. Similarly, clones that hybridized to storage proteins or other highly abundant mRNA species from developing seeds were selected against. To enrich for endomembrane-associated proteins, some clones were selected for sequencing by immunological screening with antibodies prepared against partially purified endoplasmic reticulum membranes. Comparison of the deduced amino acid sequences of the ESTs with the public data bases resulted in the assignment of putative identities of 49% of the clones selected by differential hybridization and 71% of the clones selected by immunological screening. Open reading frames in 100 of the ESTs exhibited higher homology to 78 different nonplant gene products than to any previously known plant gene product. PMID:12228533

  2. Analysis of expressed sequence tags from Plasmodium falciparum.

    PubMed

    Chakrabarti, D; Reddy, G R; Dame, J B; Almira, E C; Laipis, P J; Ferl, R J; Yang, T P; Rowe, T C; Schuster, S M

    1994-07-01

    An initiative was undertaken to sequence all genes of the human malaria parasite Plasmodium falciparum in an effort to gain a better understanding at the molecular level of the parasite that inflicts much suffering in the developing world. 550 random complementary DNA clones were partially sequenced from the intraerythrocytic form of the parasite as one of the approaches to analyze the transcribed sequences of its genome. The sequences, after editing, generated 389 expressed sequence tag sites and over 105 kb of DNA sequences. About 32% of these clones showed significant homology with other genes in the database. These clones represent 340 new Plasmodium falciparum expressed sequence tags.

  3. Expressed sequence tags from the halophyte Limonium sinense.

    PubMed

    Chen, Shi-Hua; Guo, Shan Li; Wang, Zeng Lan; Zhao, Ji Qiang; Zhao, Yan Xiu; Zhang, Hui

    2007-02-01

    Halophytes can grow under high-salinity conditions. As in glycophytes, their salt tolerance is genetically complex. There are many morphological and physiological studies of halophytes, but very little information is available at the molecular level on why they are salt-tolerant. Limonium sinense is a salt-secreting halophyte that excretes salts through multicellular glands. Here, we report the construction and sequence analysis of a cDNA library made from leaf tissue of L. sinense. Among the 1082 expressed sequence tags (ESTs) obtained, 684 unique genes were identified: 429 showed homology to previously identified genes, and 255 matched uncharacterized genes. Compared with other EST databases, some characteristic features were observed, such as an abundance of genes related to the cytoskeleton, intracellular traffic, and membrane transport, which may be specific to halophytes. PMID:17364815

  4. Expressed sequence tag analysis in tef (Eragrostis tef (Zucc) Trotter).

    PubMed

    Yu, Ju-Kyung; Sun, Qi; Rota, Mauricio La; Edwards, Hugh; Tefera, Hailu; Sorrells, Mark E

    2006-04-01

    Tef (Eragrostis tef (Zucc.) Trotter) is the most important cereal crop in Ethiopia; however, there is very little DNA sequence information available for this species. Expressed sequence tags (ESTs) were generated from 4 cDNA libraries: seedling leaf, seedling root, and inflorescence of E. tef and seedling leaf of Eragrostis pilosa, a wild relative of E. tef. Clustering of 3603 sequences produced 530 clusters and 1890 singletons, resulting in 2420 tef unigenes. Approximately 3/4 of tef unigenes matched protein or nucleotide sequences in public databases. Annotation of unigenes associated 68% of the putative tef genes with gene ontology categories. Identification of the translated unigenes for conserved protein domains revealed 389 protein family domains (Pfam), the most frequent of which was protein kinase. A total of 170 ESTs containing simple sequence repeats (EST-SSRs) were identified and 80 EST-SSR markers were developed. In addition, 19 single-nucleotide polymorphism (SNP) and (or) insertion-deletion (indel) and 34 intron fragment length polymorphism (IFLP) markers were developed. The EST database and molecular markers generated in this study will be valuable resources for further tef genetic research.

  5. Analysis of expressed sequence tags from the Ulva prolifera (Chlorophyta)

    NASA Astrophysics Data System (ADS)

    Niu, Jianfeng; Hu, Haiyan; Hu, Songnian; Wang, Guangce; Peng, Guang; Sun, Song

    2010-01-01

    In 2008, a green tide broke out before the sailing competition of the 29th Olympic Games in Qingdao. The causative species was determined to be Enteromorpha prolifera (Ulva prolifera O. F. Müller), a familiar green macroalga along the coastline of China. Rapid accumulation of a large biomass of floating U. prolifera prompted research on different aspects of this species. In this study, we constructed a nonnormalized cDNA library from the thalli of U. prolifera and acquired 10 072 high-quality expressed sequence tags (ESTs). These ESTs were assembled into 3 519 nonredundant gene groups, including 1 446 clusters and 2 073 singletons. After annotation with the nr database, a large number of genes were found to be related to chloroplast and ribosomal proteins. GO functional classification showed that 1 418 ESTs participated in photosynthesis and 1 359 ESTs were responsible for the generation of precursor metabolites and energy. In addition, rather comprehensive carbon fixation pathways were found in U. prolifera using KEGG. Some stress-related and signal transduction-related genes were also found in this study. All the evidence indicated that U. prolifera had the substance and energy foundation for intense photosynthesis and rapid proliferation. Phylogenetic analysis of cytochrome c oxidase subunit I revealed that this green-tide causative species is most closely affiliated with Pseudendoclonium akinetum (Ulvophyceae).

  6. Identification of human chromosome 22 transcribed sequences with ORF expressed sequence tags

    PubMed Central

    de Souza, Sandro J.; Camargo, Anamaria A.; Briones, Marcelo R. S.; Costa, Fernando F.; Nagai, Maria Aparecida; Verjovski-Almeida, Sergio; Zago, Marco A.; Andrade, Luis Eduardo C.; Carrer, Helaine; El-Dorry, Hamza F. A.; Espreafico, Enilza M.; Habr-Gama, Angelita; Giannella-Neto, Daniel; Goldman, Gustavo H.; Gruber, Arthur; Hackel, Christine; Kimura, Edna T.; Maciel, Rui M. B.; Marie, Suely K. N.; Martins, Elizabeth A. L.; Nóbrega, Marina P.; Paçó-Larson, Maria Luisa; Pardini, Maria Inês M. C.; Pereira, Gonçalo G.; Pesquero, João Bosco; Rodrigues, Vanderlei; Rogatto, Silvia R.; da Silva, Ismael D. C. G.; Sogayar, Mari C.; de Fátima Sonati, Maria; Tajara, Eloiza H.; Valentini, Sandro R.; Acencio, Marcio; Alberto, Fernando L.; Amaral, Maria Elisabete J.; Aneas, Ivy; Bengtson, Mário Henrique; Carraro, Dirce M.; Carvalho, Alex F.; Carvalho, Lúcia Helena; Cerutti, Janete M.; Corrêa, Maria Lucia C.; Costa, Maria Cristina R.; Curcio, Cyntia; Gushiken, Tsieko; Ho, Paulo L.; Kimura, Elza; Leite, Luciana C. C.; Maia, Gustavo; Majumder, Paromita; Marins, Mozart; Matsukuma, Adriana; Melo, Analy S. A.; Mestriner, Carlos Alberto; Miracca, Elisabete C.; Miranda, Daniela C.; Nascimento, Ana Lucia T. O.; Nóbrega, Francisco G.; Ojopi, Élida P. B.; Pandolfi, José Rodrigo C.; Pessoa, Luciana Gilbert; Rahal, Paula; Rainho, Claudia A.; da Rós, Nancy; de Sá, Renata G.; Sales, Magaly M.; da Silva, Neusa P.; Silva, Tereza C.; da Silva, Wilson; Simão, Daniel F.; Sousa, Josane F.; Stecconi, Daniella; Tsukumo, Fernando; Valente, Valéria; Zalcberg, Heloisa; Brentani, Ricardo R.; Reis, Luis F. L.; Dias-Neto, Emmanuel; Simpson, Andrew J. G.

    2000-01-01

    Transcribed sequences in the human genome can be identified with confidence only by alignment with sequences derived from cDNAs synthesized from naturally occurring mRNAs. We constructed a set of 250,000 cDNAs that represent partial expressed gene sequences and that are biased toward the central coding regions of the resulting transcripts. They are termed ORF expressed sequence tags (ORESTES). The 250,000 ORESTES were assembled into 81,429 contigs. Of these, 1,181 (1.45%) were found to match sequences in chromosome 22 with at least one ORESTES contig for 162 (65.6%) of the 247 known genes, for 67 (44.6%) of the 150 related genes, and for 45 of the 148 (30.4%) EST-predicted genes on this chromosome. Using a set of stringent criteria to validate our sequences, we identified a further 219 previously unannotated transcribed sequences on chromosome 22. Of these, 171 were in fact also defined by EST or full length cDNA sequences available in GenBank but not utilized in the initial annotation of the first human chromosome sequence. Thus despite representing less than 15% of all expressed human sequences in the public databases at the time of the present analysis, ORESTES sequences defined 48 transcribed sequences on chromosome 22 not defined by other sequences. All of the transcribed sequences defined by ORESTES coincided with DNA regions predicted as encoding exons by genscan (http://genes.mit.edu/GENSCAN.html). PMID:11070084

  7. Shotgun sequencing of the human transcriptome with ORF expressed sequence tags

    PubMed Central

    Dias Neto, Emmanuel; Garcia Correa, Ricardo; Verjovski-Almeida, Sergio; Briones, Marcelo R. S.; Nagai, Maria Aparecida; da Silva, Wilson; Zago, Marco Antonio; Bordin, Silvana; Costa, Fernando Ferreira; Goldman, Gustavo Henrique; Carvalho, Alex F.; Matsukuma, Adriana; Baia, Gilson S.; Simpson, David H.; Brunstein, Adriana; de Oliveira, Paulo S. L.; Bucher, Philipp; Jongeneel, C. Victor; O'Hare, Michael J.; Soares, Fernando; Brentani, Ricardo R.; Reis, Luis F. L.; de Souza, Sandro J.; Simpson, Andrew J. G.

    2000-01-01

    Theoretical considerations predict that amplification of expressed gene transcripts by reverse transcription–PCR using arbitrarily chosen primers will result in the preferential amplification of the central portion of the transcript. Systematic, high-throughput sequencing of such products would result in an expressed sequence tag (EST) database consisting of central, generally coding regions of expressed genes. Such a database would add significant value to existing public EST databases, which consist mostly of sequences derived from the extremities of cDNAs, and facilitate the construction of contigs of transcript sequences. We tested our predictions, creating a database of 10,000 sequences from human breast tumors. The data confirmed the central distribution of the sequences, the significant normalization of the sequence population, the frequent extension of contigs composed of existing human ESTs, and the identification of a series of potentially important homologues of known genes. This approach should make a significant contribution to the early identification of important human genes, the deciphering of the draft human genome sequence currently being compiled, and the shotgun sequencing of the human transcriptome. PMID:10737800
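
    The predicted central bias can be illustrated with a toy simulation. The sketch below is not the authors' model: it assumes, purely for illustration, that the two arbitrary primers anneal at independent, uniformly distributed positions on a transcript of unit length and that the product spans the interval between them. The function name and parameters are hypothetical.

```python
import random


def simulate_coverage(n_amplicons=20000, n_bins=10, seed=0):
    """Estimate how often each region of a unit-length transcript is amplified.

    Assumed toy model (not taken from the abstract): two primers anneal at
    independent uniform positions; the amplicon is the interval between them.
    Returns, for each bin midpoint, the fraction of amplicons covering it.
    """
    rng = random.Random(seed)
    hits = [0] * n_bins
    for _ in range(n_amplicons):
        left, right = sorted((rng.random(), rng.random()))
        for i in range(n_bins):
            midpoint = (i + 0.5) / n_bins
            if left <= midpoint <= right:
                hits[i] += 1
    return [h / n_amplicons for h in hits]


coverage = simulate_coverage()
for i, frac in enumerate(coverage):
    print(f"bin {i}: {frac:.3f}")
```

    Under this assumption a position x is covered with probability 2x(1 - x), which peaks at the transcript midpoint, so the simulated coverage is highest in the central bins and lowest at the extremities.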

  8. Identification of expressed resistance gene analogs from peanut (Arachis hypogaea L.) expressed sequence tags.

    PubMed

    Liu, Zhanji; Feng, Suping; Pandey, Manish K; Chen, Xiaoping; Culbreath, Albert K; Varshney, Rajeev K; Guo, Baozhu

    2013-05-01

    Low genetic diversity makes peanut (Arachis hypogaea L.) very vulnerable to plant pathogens, causing severe yield loss and reduced seed quality. Several hundred partial genomic DNA sequences have been identified as nucleotide-binding-site leucine-rich repeat (NBS-LRR) resistance (R) genes, but only a small portion has been found to have expressed transcripts. We aimed to identify resistance gene analogs (RGAs) from peanut expressed sequence tags (ESTs) and to develop polymorphic markers. The protein sequences of 54 known R genes were used to identify homologs from peanut ESTs from public databases. A total of 1,053 ESTs corresponding to six different classes of known R genes were recovered and assembled into 156 contigs and 229 singletons as peanut-expressed RGAs. There were 69 that encoded NBS-LRR proteins, 191 that encoded protein kinases, 82 that encoded LRR-PK/transmembrane proteins, 28 that encoded toxin reductases, 11 that encoded LRR-domain containing proteins and four that encoded TM-domain containing proteins. Twenty-eight simple sequence repeats (SSRs) were identified from 25 peanut-expressed RGAs. One SSR polymorphic marker (RGA121) was identified. Two polymerase chain reaction-based markers (Ahsw-1 and Ahsw-2) developed from RGA013 were homologous to the Tomato Spotted Wilt Virus (TSWV) resistance gene. All three markers were mapped on the same linkage group AhIV. These expressed RGAs are a source for RGA-tagged marker development and identification of peanut resistance genes.

  9. Peanut (Arachis hypogaea) Expressed Sequence Tag Project: Progress and Application

    PubMed Central

    Feng, Suping; Wang, Xingjun; Zhang, Xinyou; Dang, Phat M.; Holbrook, C. Corley; Culbreath, Albert K.; Wu, Yaoting; Guo, Baozhu

    2012-01-01

    Many plant ESTs have been sequenced as an alternative to whole genome sequences; this includes peanut, because of its genome size and complexity. The US peanut research community held the historic 2004 Atlanta Genomics Workshop and named the EST project a main priority. As of August 2011, the peanut research community had deposited 252,832 ESTs in the public NCBI EST database, and this resource has provided the community with valuable tools and core foundations for various genome-scale experiments ahead of the whole genome sequencing project. These EST resources have been used for marker development, gene cloning, microarray gene expression and genetic map construction. The peanut EST sequence resources have thus been shown to have a wide range of applications and fulfilled an essential role at a time of need. The EST project then contributed to the second historic event, the Peanut Genome Project 2010 Inaugural Meeting, also held in Atlanta, where it was decided to sequence the entire peanut genome. After the completion of peanut whole genome sequencing, ESTs or transcriptome data will continue to play an important role in filling knowledge gaps, identifying particular genes and exploring gene function. PMID:22745594

  10. Gene expression profile in the anterior regeneration of the earthworm using expressed sequence tags.

    PubMed

    Cho, Sung-Jin; Lee, Myung Sik; Tak, Eun Sik; Lee, Eun; Koh, Ki Seok; Ahn, Chi Hyun; Park, Soon Cheol

    2009-01-01

    In order to gain insight into the gene expression profiles associated with anterior regeneration of the earthworm, Perionyx excavatus, we analyzed 1,159 expressed sequence tags (ESTs) derived from a cDNA library of early anterior regenerated tissue. Among the 1,159 ESTs analyzed, 622 (53.7%) showed significant similarity to known genes and represented 338 genes, of which 233 were singletons and 105 were represented by two or more ESTs. While 663 ESTs (57.2%) were sequenced only once, 308 ESTs (26.6%) appeared 2 to 5 times, and 188 ESTs (16.2%) were sequenced more than 5 times. A total of 803 genes were categorized into 15 groups according to their biological functions. Among the 1,159 ESTs sequenced, we found several genes encoding signaling molecules, such as Notch and Distal-less. The ESTs used in this study should provide a resource for future research in earthworm regeneration. PMID:19129665

  11. Development of peanut EST (expressed sequence tag)-based genomic resources and tools

    Technology Transfer Automated Retrieval System (TEKTRAN)

    U.S. Peanut Genome Initiative (PGI) has widely recognized the need for peanut genome tools and resources development for mitigating peanut allergens and food safety. Genomics such as Expressed Sequence Tag (EST), microarray technologies, and whole genome sequencing provides robotic tools for profili...

  12. Comprehensive Genetic Database of Expressed Sequence Tags for Coccolithophorids

    NASA Astrophysics Data System (ADS)

    Ranji, Mohammad; Hadaegh, Ahmad R.

    Coccolithophorids are unicellular, marine, golden-brown algae (Haptophyta) commonly found in near-surface waters in patchy distributions. They belong to the Phytoplankton family that is known to be responsible for much of the earth reproduction. Phytoplankton, just like plants, live on the energy obtained by photosynthesis, which produces oxygen. A substantial amount of the oxygen in the earth's atmosphere is produced by phytoplankton through photosynthesis. The single-celled Emiliania huxleyi is the most commonly known species of coccolithophorid and is known for extracting bicarbonate (HCO3) from its environment and producing calcium carbonate to form coccoliths. Coccolithophorids are one of the world's primary producers, contributing about 15% of the average oceanic phytoplankton biomass to the oceans. They produce elaborate, minute calcite platelets (coccoliths), covering the cell to form a coccosphere and supplying up to 60% of the bulk pelagic calcite deposited on the sea floors. In order to understand the genetics of coccolithophorids and the complexities of their biochemical reactions, we decided to build a database to store a complete profile of these organisms' genomes. Although a variety of such databases currently exist (http://www.geneservice.co.uk/home/), none have yet been developed to comprehensively address the sequencing efforts underway by the coccolithophorid research community. This database is called CocooExpress and is available to the public (http://bioinfo.csusm.edu) for both data queries and sequence contribution.

  13. Automated SNP detection in expressed sequence tags: statistical considerations and application to maritime pine sequences.

    PubMed

    Dantec, Loïck Le; Chagné, David; Pot, David; Cantin, Olivier; Garnier-Géré, Pauline; Bedon, Frank; Frigerio, Jean-Marc; Chaumeil, Philippe; Léger, Patrick; Garcia, Virginie; Laigret, Frédéric; De Daruvar, Antoine; Plomion, Christophe

    2004-02-01

    We developed an automated pipeline for the detection of single nucleotide polymorphisms (SNPs) in expressed sequence tag (EST) data sets, by combining three DNA sequence analysis programs: Phred, Phrap and PolyBayes. This application requires access to the individual electropherogram traces. First, a reference set of 65 SNPs was obtained from the sequencing of 30 gametes in 13 maritime pine (Pinus pinaster Ait.) gene fragments (6671 bp), resulting in a frequency of 1 SNP every 102.6 bp. Second, parameters of the three programs were optimized in order to retrieve as many true SNPs as possible, while keeping the rate of false positives as low as possible. Overall, the efficiency of detection of true SNPs was 83.1%. However, this rate varied largely as a function of the rare SNP allele frequency: down to 41% for rare SNP alleles (frequency < 10%), up to 98% for allele frequencies above 10%. Third, the detection method was applied to the 18498 assembled maritime pine (Pinus pinaster Ait.) ESTs, allowing the identification of a total of 1400 candidate SNPs in contigs containing between 4 and 20 sequence reads. These genetic resources, described for the first time in a forest tree species, were made available at http://www.pierroton.inra/genetics/Pinesnps. We also derived an analytical expression for the SNP detection probability as a function of the SNP allele frequency, the number of haploid genomes used to generate the EST sequence database, and the sample size of the contigs considered for SNP detection. The frequency of the SNP allele was shown to be the main factor influencing the probability of SNP detection.
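
    The abstract does not give the analytical expression, so the following is only a simplified stand-in showing why allele frequency and contig depth matter. It assumes that each read in a contig carries the rare allele independently with probability equal to the allele frequency, and that a SNP is called when at least a fixed number of reads carry it. It ignores the number of haploid genomes sampled, which the authors' expression includes. The function name and the threshold of two reads are hypothetical.

```python
from math import comb


def detection_probability(allele_freq, n_reads, min_minor_reads=2):
    """Probability that at least `min_minor_reads` of `n_reads` carry the rare allele.

    Simplified binomial model (an assumption, not the published expression):
    reads are independent draws and each carries the rare allele with
    probability `allele_freq`.
    """
    if not 0.0 <= allele_freq <= 1.0:
        raise ValueError("allele_freq must be between 0 and 1")
    return sum(
        comb(n_reads, k) * allele_freq**k * (1.0 - allele_freq) ** (n_reads - k)
        for k in range(min_minor_reads, n_reads + 1)
    )


# Compare a rare and a common allele at the contig depths named in the abstract.
for depth in (4, 20):
    for freq in (0.05, 0.30):
        p = detection_probability(freq, depth)
        print(f"depth={depth:2d} freq={freq:.2f} P(detect)={p:.3f}")
```

    Even in this reduced model the detection probability falls steeply for low-frequency alleles in shallow contigs, which is in line with the abstract's finding that allele frequency is the main factor.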

  14. Comparative mapping of expressed sequence tags containing microsatellites in rainbow trout (Oncorhynchus mykiss)

    PubMed Central

    Rexroad, Caird E; Rodriguez, Maria F; Coulibaly, Issa; Gharbi, Karim; Danzmann, Roy G; DeKoning, Jenefer; Phillips, Ruth; Palti, Yniv

    2005-01-01

    Background: Comparative genomics, through the integration of genetic maps from species of interest with whole genome sequences of other species, will facilitate the identification of genes affecting phenotypes of interest. The development of microsatellite markers from expressed sequence tags will serve to increase marker densities on current salmonid genetic maps and initiate in silico comparative maps with species whose genomes have been fully sequenced. Results: Eighty-nine polymorphic microsatellite markers were generated for rainbow trout, of which at least 74 amplify in other salmonids. Fifty-five have been associated with functional annotation and 30 were mapped on existing genetic maps. Homologous sequences were identified for 20 of the microsatellite-containing ESTs to identify comparative assignments within the Tetraodon, mouse, and/or human genomes. Conclusion: The addition of microsatellite markers constructed from expressed sequence tag data will facilitate the development of high-density genetic maps for rainbow trout and comparative maps with other salmonids and better studied species. PMID:15836796

  15. Mining and survey of simple sequence repeats in expressed sequence tags of dicotyledonous species.

    PubMed

    Kumpatla, Siva P; Mukhopadhyay, Snehasis

    2005-12-01

    Simple sequence repeat (SSR) markers are widely used in many plant and animal genomes due to their abundance, hypervariability, and suitability for high-throughput analysis. Development of SSR markers using molecular methods is time consuming, laborious, and expensive. Use of computational approaches to mine ever-increasing sequences such as expressed sequence tags (ESTs) in public databases permits rapid and economical discovery of SSRs. Most such efforts to date have focused on mining SSRs from monocotyledonous ESTs. In this study, we have computationally mined and examined the abundance of SSRs in more than 1.54 million ESTs belonging to 55 dicotyledonous species. The frequency of ESTs containing SSRs among species ranged from 2.65% to 16.82%. Dinucleotide repeats were found to be the most abundant followed by tri- or mono-nucleotide repeats. The motifs A/T, AG/GA/CT/TC, and AAG/AGA/GAA/CTT/TTC/TCT were the predominant mono-, di-, and tri-nucleotide SSRs, respectively. Most of the mononucleotide SSRs contained 15-25 repeats, whereas the majority of the di- and tri-nucleotide SSRs contained 5-10 repeats. The comprehensive SSR survey data presented here demonstrate the potential of in silico mining of ESTs for rapid development of SSR markers for genetic analysis and applications in dicotyledonous crops.
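
    In silico SSR mining of this kind can be sketched with regular expressions. The example below is a minimal illustration, not the tool used in the study. The minimum repeat counts (15 for mononucleotides, 5 for di- and trinucleotides) are assumptions loosely based on the repeat ranges quoted in the abstract, and the function name is hypothetical.

```python
import re

# Assumed minimum repeat counts per motif length (illustrative thresholds only).
MIN_REPEATS = {1: 15, 2: 5, 3: 5}


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return (start, motif, repeat_count) for each perfect SSR in `sequence`.

    Motifs are reported as found; phase and strand variants (for example AG,
    GA, CT, TC) are not merged into one class here.
    """
    sequence = sequence.upper()
    found = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # A motif of unit_len bases followed by at least (min_rep - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(1)
            # Skip motifs such as "AA" that are really mononucleotide runs.
            if unit_len > 1 and len(set(motif)) == 1:
                continue
            found.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(found)


example = "GGC" + "AG" * 6 + "TTC" + "A" * 16 + "CGT" + "AAG" * 5 + "C"
for start, motif, count in find_ssrs(example):
    print(start, motif, count)
```

    Applied to a set of ESTs, the fraction of sequences with at least one hit gives the kind of per-species SSR frequency reported above. A production pipeline would also need to handle imperfect repeats and redundancy among ESTs.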

  16. Identification of Differential Gene Expression in Brassica rapa Nectaries through Expressed Sequence Tag Analysis

    PubMed Central

    Hampton, Marshall; Xu, Wayne W.; Kram, Brian W.; Chambers, Emily M.; Ehrnriter, Jerad S.; Gralewski, Jonathan H.; Joyal, Teresa; Carter, Clay J.

    2010-01-01

    Background: Nectaries are the floral organs responsible for the synthesis and secretion of nectar. Despite their central roles in pollination biology, very little is understood about the molecular mechanisms underlying nectar production. This project was undertaken to identify genes potentially involved in mediating nectary form and function in Brassica rapa. Methodology and Principal Findings: Four cDNA libraries were created using RNA isolated from the median and lateral nectaries of B. rapa flowers, with one normalized and one non-normalized library being generated from each tissue. Approximately 3,000 clones from each library were randomly sequenced from the 5′ end to generate a total of 11,101 high quality expressed sequence tags (ESTs). Sequence assembly of all ESTs together allowed the identification of 1,453 contigs and 4,403 singleton sequences, with the Basic Local Alignment Search Tool (BLAST) being used to identify 4,138 presumptive orthologs to Arabidopsis thaliana genes. Several genes differentially expressed between median and lateral nectaries were initially identified based upon the number of BLAST hits represented by independent ESTs, and later confirmed via reverse transcription polymerase chain reaction (RT PCR). RT PCR was also used to verify the expression patterns of eight putative orthologs to known Arabidopsis nectary-enriched genes. Conclusions/Significance: This work provided a snapshot of gene expression in actively secreting B. rapa nectaries, and also allowed the identification of differential gene expression between median and lateral nectaries. Moreover, 207 orthologs to known nectary-enriched genes from Arabidopsis were identified through this analysis. The results suggest that genes involved in nectar production are conserved amongst the Brassicaceae, and also supply clones and sequence information that can be used to probe nectary function in B. rapa. PMID:20098697

  17. Development of expressed sequence tag and expressed sequence tag–simple sequence repeat marker resources for Musa acuminata

    PubMed Central

    Passos, Marco A. N.; de Oliveira Cruz, Viviane; Emediato, Flavia L.; de Camargo Teixeira, Cristiane; Souza, Manoel T.; Matsumoto, Takashi; Rennó Azevedo, Vânia C.; Ferreira, Claudia F.; Amorim, Edson P.; de Alencar Figueiredo, Lucio Flavio; Martins, Natalia F.; de Jesus Barbosa Cavalcante, Maria; Baurens, Franc-Christophe; da Silva, Orzenil Bonfim; Pappas, Georgios J.; Pignolet, Luc; Abadie, Catherine; Ciampi, Ana Y.; Piffanelli, Pietro; Miller, Robert N. G.

    2012-01-01

    Background and aims: Banana (Musa acuminata) is a crop contributing to global food security. Many varieties lack resistance to biotic stresses, due to sterility and narrow genetic background. The objective of this study was to develop an expressed sequence tag (EST) database of transcripts expressed during compatible and incompatible banana–Mycosphaerella fijiensis (Mf) interactions. Black leaf streak disease (BLSD), caused by Mf, is a destructive disease of banana. Microsatellite markers were developed as a resource for crop improvement. Methodology: cDNA libraries were constructed from in vitro-infected leaves from BLSD-resistant M. acuminata ssp. burmannicoides Calcutta 4 (MAC4) and susceptible M. acuminata cv. Cavendish Grande Naine (MACV). Clones were 5′-end Sanger sequenced, ESTs assembled with TGICL and unigenes annotated using BLAST, Blast2GO and InterProScan. Mreps was used to screen for simple sequence repeats (SSRs), with markers evaluated for polymorphism using 20 diploid (AA) M. acuminata accessions contrasting in resistance to Mycosphaerella leaf spot diseases. Principal results: A total of 9333 high-quality ESTs were obtained for MAC4 and 3964 for MACV, which assembled into 3995 unigenes. Of these, 2592 displayed homology to genes encoding proteins with known or putative function, and 266 to genes encoding proteins with unknown function. Gene ontology (GO) classification identified 543 GO terms, 2300 unigenes were assigned to EuKaryotic orthologous group categories and 312 mapped to Kyoto Encyclopedia of Genes and Genomes pathways. A total of 624 SSR loci were identified, with trinucleotide repeat motifs the most abundant in MAC4 (54.1 %) and MACV (57.6 %). Polymorphism across M. acuminata accessions was observed with 75 markers. Alleles per polymorphic locus ranged from 2 to 8, totalling 289. The polymorphism information content ranged from 0.08 to 0.81. Conclusions: This EST collection offers a resource for studying functional genes, including

  18. Identification of grapevine rootstock cultivars using expressed sequence tag-simple sequence repeats.

    PubMed

    Fan, X C; Chu, J Q; Liu, C H; Sun, X; Fang, J G

    2014-09-26

    Grapevine (Vitis) rootstock varieties or cultivars are used to confer resistance and tolerance to insect and disease pests, unfavorable soil conditions, and other environmental conditions to cultivars that are susceptible to these conditions but otherwise have desired properties. The need to genotype and thoroughly identify grapevine rootstock varieties in the grape industry has become increasingly critical as more and more varieties are bred or selected. Although DNA markers have advantageous applications in plant identification, markers developed from classic DNA fingerprint analysis methods are not practical for plant cultivar identification. The manual cultivar identification diagram (MCID), which was previously developed in our research group, has been shown to select DNA markers that are relatively more exploitable in identifications of genotyped plant individuals. Using this MCID strategy and expressed sequence tag-simple sequence repeat (EST-SSR) markers, we identified 22 grapevine rootstock cultivars of diverse origin. All cultivars were clearly separated by fingerprints of seven pairs of EST-SSR primers and the grapevine rootstock CID (V-R-CID) generated is both practical and referable for the identification of any grapevine rootstock cultivars studied here. Furthermore, fewer primers can be used to distinguish all cultivars using this approach since the fingerprint from each primer pair could be used several times once it is generated. This initial version of V-R-CID can be made more informative with the identification and incorporation of more cultivars, thus providing better service to the grape industry.

  19. Initiation of a Sarcocystis neurona expressed sequence tag (EST) sequencing project: a preliminary report.

    PubMed

    Howe, D K

    2001-02-26

    To accelerate genetic and molecular characterization of Sarcocystis neurona, the primary causative agent of equine protozoal myeloencephalitis (EPM), a sequencing project has been initiated that will generate approximately 7000-8000 expressed sequence tags (ESTs) from this apicomplexan parasite. Poly(A)(+) RNA was isolated from culture-derived S. neurona merozoites, and a cDNA library was constructed in a unidirectional lambda phage cloning vector. Sixty phage clones were randomly picked from the library, and the cDNA inserts were amplified from these clones using the T3 and T7 primers that flank the multi-cloning site of the lambda vector. This analysis demonstrated that 100% (60/60) of the clones selected from this library contained recombinant cDNA inserts ranging in size from 0.4 to 4.0 kilobases (kb) with an average size of 1.23kb. Single-pass sequencing from the 5' end of the 60 amplified cDNAs produced high-quality nucleotide sequence from 53 of the clones. Comparison of these ESTs to the current gene databases revealed significant matches for 10 of the ESTs, six of which are similar to sequences from other Apicomplexa (i.e., Toxoplasma gondii). Importantly, none of the ESTs were of obvious mammalian origin, thus indicating that the cDNAs in this library were derived primarily from parasite mRNA and not from mRNA of the bovine turbinate host cells. Collectively, these data indicate that the described cDNA library will provide an excellent substrate for generating a portion of the ESTs that are planned from S. neurona. This sequencing project will greatly hasten gene discovery for this protozoan pathogen thereby enhancing efforts towards the development of improved diagnostics, treatments, and preventatives for EPM. In addition, the S. neurona ESTs will represent a significant contribution to the extensive database of sequences from the Apicomplexa. Comparative analyses of these apicomplexan sequences will likely offer a multitude of important information

  20. Expressed sequence tags: normalization and subtraction of cDNA libraries.

    PubMed

    Soares, Marcelo Bento; de Fatima Bonaldo, Maria; Hackett, Jeremiah D; Bhattacharya, Debashish

    2009-01-01

    Expressed sequence tags (ESTs) provide a rapid and efficient approach for gene discovery and analysis of gene expression in eukaryotes. ESTs have also become particularly important with recent expanded efforts in complete genome sequencing of understudied, nonmodel eukaryotes such as protists and algae. For these projects, ESTs provide an invaluable source of data for gene identification and prediction of exon-intron boundaries. The generation of EST data, although straightforward in concept, nonetheless requires great care to ensure the highest efficiency and return on the investment of time and funds. To this end, key steps in the process include generation of a normalized cDNA library to facilitate a high gene discovery rate, followed by serial subtraction of normalized libraries to maintain the discovery rate. Here we describe in detail protocols for normalization and subtraction of cDNA libraries, followed by an example using the toxic dinoflagellate Alexandrium tamarense.

  1. Massively parallel sequencing and analysis of expressed sequence tags in a successful invasive plant

    PubMed Central

    Prentis, Peter J.; Woolfit, Megan; Thomas-Hall, Skye R.; Ortiz-Barrientos, Daniel; Pavasovic, Ana; Lowe, Andrew J.; Schenk, Peer M.

    2010-01-01

    Background Invasive species pose a significant threat to global economies, agriculture and biodiversity. Despite progress towards understanding the ecological factors associated with plant invasions, limited genomic resources have made it difficult to elucidate the evolutionary and genetic factors responsible for invasiveness. This study presents the first expressed sequence tag (EST) collection for Senecio madagascariensis, a globally invasive plant species. Methods We used pyrosequencing of one normalized and two subtractive libraries, derived from one native and one invasive population, to generate an EST collection. ESTs were assembled into contigs, annotated by BLAST comparison with the NCBI non-redundant protein database and assigned gene ontology (GO) terms from the Plant GO Slim ontologies. Key Results Assembly of the 221 746 sequence reads resulted in 12 442 contigs. Over 50 % (6183) of 12 442 contigs showed significant homology to proteins in the NCBI database, representing approx. 4800 independent transcripts. The molecular transducer GO term was significantly over-represented in the native (South African) subtractive library compared with the invasive (Australian) library. Based on NCBI BLAST hits and literature searches, 40 % of the molecular transducer genes identified in the South African subtractive library are likely to be involved in response to biotic stimuli, such as fungal, bacterial and viral pathogens. Conclusions This EST collection is the first representation of the S. madagascariensis transcriptome and provides an important resource for the discovery of candidate genes associated with plant invasiveness. The over-representation of molecular transducer genes associated with defence responses in the native subtractive library provides preliminary support for aspects of the enemy release and evolution of increased competitive ability hypotheses in this successful invasive. This study highlights the contribution of next-generation sequencing

  2. Single nucleotide polymorphisms from Theobroma cacao expressed sequence tags associated with witches' broom disease in cacao.

    PubMed

    Lima, L S; Gramacho, K P; Carels, N; Novais, R; Gaiotto, F A; Lopes, U V; Gesteira, A S; Zaidan, H A; Cascardo, J C M; Pires, J L; Micheli, F

    2009-07-14

    In order to increase the efficiency of cacao tree resistance to witches' broom disease, which is caused by Moniliophthora perniciosa (Tricholomataceae), we looked for molecular markers that could help in the selection of resistant cacao genotypes. Among the different markers useful for developing marker-assisted selection, single nucleotide polymorphisms (SNPs) constitute the most common type of sequence difference between alleles and can be easily detected by in silico analysis from expressed sequence tag libraries. We report the first detection and analysis of SNPs from cacao-M. perniciosa interaction expressed sequence tags, using bioinformatics. Selection based on analysis of these SNPs should be useful for developing cacao varieties resistant to this devastating disease.

  3. Analysis and Functional Annotation of an Expressed Sequence Tag Collection for Tropical Crop Sugarcane

    PubMed Central

    Vettore, André L.; da Silva, Felipe R.; Kemper, Edson L.; Souza, Glaucia M.; da Silva, Aline M.; Ferro, Maria Inês T.; Henrique-Silva, Flavio; Giglioti, Éder A.; Lemos, Manoel V.F.; Coutinho, Luiz L.; Nobrega, Marina P.; Carrer, Helaine; França, Suzelei C.; Bacci, Maurício; Goldman, Maria Helena S.; Gomes, Suely L.; Nunes, Luiz R.; Camargo, Luis E.A.; Siqueira, Walter J.; Van Sluys, Marie-Anne; Thiemann, Otavio H.; Kuramae, Eiko E.; Santelli, Roberto V.; Marino, Celso L.; Targon, Maria L.P.N.; Ferro, Jesus A.; Silveira, Henrique C.S.; Marini, Danyelle C.; Lemos, Eliana G.M.; Monteiro-Vitorello, Claudia B.; Tambor, José H.M.; Carraro, Dirce M.; Roberto, Patrícia G.; Martins, Vanderlei G.; Goldman, Gustavo H.; de Oliveira, Regina C.; Truffi, Daniela; Colombo, Carlos A.; Rossi, Magdalena; de Araujo, Paula G.; Sculaccio, Susana A.; Angella, Aline; Lima, Marleide M.A.; de Rosa, Vicente E.; Siviero, Fábio; Coscrato, Virginia E.; Machado, Marcos A.; Grivet, Laurent; Di Mauro, Sonia M.Z.; Nobrega, Francisco G.; Menck, Carlos F.M.; Braga, Marilia D.V.; Telles, Guilherme P.; Cara, Frank A.A.; Pedrosa, Guilherme; Meidanis, João; Arruda, Paulo

    2003-01-01

    To contribute to our understanding of the genome complexity of sugarcane, we undertook a large-scale expressed sequence tag (EST) program. More than 260,000 cDNA clones were partially sequenced from 26 standard cDNA libraries generated from different sugarcane tissues. After the processing of the sequences, 237,954 high-quality ESTs were identified. These ESTs were assembled into 43,141 putative transcripts. Of the assembled sequences, 35.6% presented no matches with existing sequences in public databases. A global analysis of the whole SUCEST data set indicated that 14,409 assembled sequences (33% of the total) contained at least one cDNA clone with a full-length insert. Annotation of the 43,141 assembled sequences associated almost 50% of the putative identified sugarcane genes with protein metabolism, cellular communication/signal transduction, bioenergetics, and stress responses. Inspection of the translated assembled sequences for conserved protein domains revealed 40,821 amino acid sequences with 1415 Pfam domains. Reassembling the consensus sequences of the 43,141 transcripts revealed a 22% redundancy in the first assembling. This indicated that possibly 33,620 unique genes had been identified and indicated that >90% of the sugarcane expressed genes were tagged. PMID:14613979
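
    The redundancy arithmetic in the abstract above can be checked with a short sketch. The function names are hypothetical and only the figures quoted in the abstract (43,141 assembled sequences, 22% redundancy, 33,620 unique genes) are used. Applying the rounded 22% gives roughly 33,650, so the reported 33,620 presumably reflects an unrounded redundancy of about 22.07%; that reading is an assumption, not something the abstract states.

```python
def unique_gene_estimate(n_assembled: int, redundancy: float) -> int:
    """Estimate unique genes after discounting redundant assembled sequences."""
    return round(n_assembled * (1.0 - redundancy))


def implied_redundancy(n_assembled: int, n_unique: int) -> float:
    """Back out the redundancy fraction implied by a reported unique-gene count."""
    return 1.0 - n_unique / n_assembled


if __name__ == "__main__":
    # Figures quoted in the abstract; nothing else is assumed.
    print(unique_gene_estimate(43_141, 0.22))
    print(f"{implied_redundancy(43_141, 33_620):.4f}")
```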

  4. Alternative splicing and expression profile analysis of expressed sequence tags in domestic pig.

    PubMed

    Zhang, Liang; Tao, Lin; Ye, Lin; He, Ling; Zhu, Yuan-Zhong; Zhu, Yue-Dong; Zhou, Yan

    2007-02-01

    Domestic pig (Sus scrofa domestica) is one of the most important mammals to humans. Alternative splicing is a cellular mechanism in eukaryotes that greatly increases the diversity of gene products. Expressed sequence tags (ESTs) have been widely used for gene discovery, expression profile analysis, and alternative splicing detection. In this study, a total of 712,905 ESTs extracted from 101 different non-normalized EST libraries of the domestic pig were analyzed. These EST libraries cover the nervous system, digestive system, immune system, and meat production related tissues from embryo, newborn, and adult pigs, contributing to the analysis of alternative splicing variants as well as expression profiles at various stages and in various tissues. A modified approach was designed to cluster and assemble large EST datasets, aiming to detect alternative splicing together with the EST abundance of each splicing variant. Much effort was made to classify alternative splicing into different types and to apply different filters to each type to get more reliable results. Finally, a total of 1,223 genes with an average of 2.8 splicing variants were detected among 16,540 unique genes. The overview of expression profiles would change when alternative splicing is taken into account. PMID:17572361

  6. Identification of molecular motors in the Woods Hole squid, Loligo pealei: an expressed sequence tag approach.

    PubMed

    DeGiorgis, Joseph A; Cavaliere, Kimberly R; Burbach, J Peter H

    2011-10-01

    The squid giant axon and synapse are unique systems for studying neuronal function. While a few nucleotide and amino acid sequences have been obtained from squid, large-scale genetic and proteomic information is lacking. We have been particularly interested in motors present in axons and their roles in transport processes. Here, to obtain genetic data and to identify motors expressed in squid, we initiated an expressed sequence tag project by single-pass sequencing mRNAs isolated from the stellate ganglia of the Woods Hole squid, Loligo pealei. A total of 22,689 high-quality expressed sequence tag (EST) sequences were obtained and subjected to basic local alignment search tool (BLAST) analysis. Seventy-six percent of these sequences matched genes in the National Center for Biotechnology Information databases. By CAP3 analysis this library contained 2459 contigs and 7568 singletons. Mining for motors successfully identified six kinesins, six myosins, a single dynein heavy chain, as well as components of the dynactin complex, and motor light chains and accessory proteins. This initiative demonstrates that EST projects represent an effective approach to obtain sequences of interest.

  7. Comprehensive Functional Analyses of Expressed Sequence Tags in Common Wheat (Triticum aestivum)

    PubMed Central

    Manickavelu, Alagu; Kawaura, Kanako; Oishi, Kazuko; Shin-I, Tadasu; Kohara, Yuji; Yahiaoui, Nabila; Keller, Beat; Abe, Reina; Suzuki, Ayako; Nagayama, Taishi; Yano, Kentaro; Ogihara, Yasunari

    2012-01-01

    About 1 million expressed sequence tag (EST) sequences comprising 125.3 Mb of nucleotides were collected from 51 cDNA libraries constructed from a variety of tissues and organs under a range of conditions, including abiotic stresses and pathogen challenges, in common wheat (Triticum aestivum). Expressed sequence tags were assembled with stringent parameters after processing with inbuilt scripts, resulting in 37 138 contigs and 215 199 singlets. Of the assembled sequences, 10.6% presented no matches with existing sequences in public databases. Functional characterization of wheat unigenes was carried out by gene ontology annotation and by mining transcription factors, full-length cDNAs, and miRNA target sites. A bioinformatics strategy was developed to discover single-nucleotide polymorphisms (SNPs) within our large EST resource, and SNPs between and within (homoeologous) cultivars are reported. Digital gene expression analysis was performed to find tissue-specific gene expression, and correspondence analysis was carried out on four selected biotic stress-related libraries to identify common and specific gene expression. The assembly and associated information provide a framework for future investigation in functional genomics. PMID:22334568

  8. Analysis of expressed sequence tags from Brassica rapa L. ssp. pekinensis.

    PubMed

    Lim, J Y; Shin, C S; Chung, E J; Kim, J S; Kim, H U; Oh, S J; Choi, W B; Ryou, C S; Kim, J B; Kwon, M S; Chung, T Y; Song, S I; Kim, J K; Nahm, B H; Hwang, Y S; Eun, M Y; Lee, J S; Cheong, J J; Choi, Y D

    2000-08-31

    Non-redundant expressed sequence tags (ESTs) were generated from six different organs at various developmental stages of Chinese cabbage, Brassica rapa L. ssp. pekinensis. Of the 1,295 ESTs, 915 (71%) showed significantly high homology in nucleotide or deduced amino acid sequences with other sequences deposited in databases, while 380 did not show similarity to any sequences. Briefly, 598 ESTs matched proteins of identified biological function, 177 matched hypothetical proteins or non-annotated Arabidopsis genome sequences, and 140 matched other ESTs. About 82% of the top-scored matching sequences were from Arabidopsis or Brassica, but overall 558 (43%) ESTs matched Arabidopsis ESTs at the nucleotide sequence level. This observation strongly supports the idea that gene-expression profiles of Chinese cabbage differ from those of Arabidopsis, despite their genome structures being similar to each other. Moreover, sequence analyses of 21 Brassica ESTs revealed that their primary structure is different from those of corresponding annotated sequences of Arabidopsis genes. Our data suggest that direct prediction of Brassica gene expression patterns based on information from Arabidopsis genome research has some limitations. Thus, information obtained from the Brassica EST study is useful not only for understanding the unique developmental processes of the plant, but also for the study of Arabidopsis genome structure.

  9. From expressed sequence tags to 'epigenomics': an understanding of disease processes.

    PubMed

    Zweiger, G; Scott, R W

    1997-12-01

    Expressed sequence tags (ESTs) are at the forefront of technological change that is sweeping the biomedical research community. ESTs provide a high throughput means for identifying gene transcripts and monitoring complex gene expression patterns. EST-based technologies coupled with sophisticated computer analysis tools enable the informational content and output of the genome to be accessed and evaluated on a scale immensely larger than previously possible. EST-based technologies are being used to understand disease processes and to find better disease treatments, and will allow biology to move from single gene to multigene, or even more complex epigenetic, explanations for disease.

  10. Evaluation of anonymous and expressed sequence tag derived polymorphic microsatellite markers in the tobacco budworm Heliothis virescens (Lepidoptera: noctuidae)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Polymorphic genetic markers were identified and characterized using a partial genomic library of Heliothis virescens enriched for simple sequence repeats (SSR) and nucleotide sequences of expressed sequence tags (EST). Nucleotide sequences of 192 clones from the partial genomic library yielded 147 u...

  11. Analyses of an Expressed Sequence Tag Library from Taenia solium, Cysticerca

    PubMed Central

    Lundström, Jonas; Salazar-Anton, Fernando; Sherwood, Ellen; Andersson, Björn; Lindh, Johan

    2010-01-01

    Background Neurocysticercosis is a disease caused by the oral ingestion of eggs from the human parasitic worm Taenia solium. Although drugs are available they are controversial because of the side effects and poor efficiency. An expressed sequence tag (EST) library is a method used to describe the gene expression profile and sequence of mRNA from a specific organism and stage. Such information can be used in order to find new targets for the development of drugs and to get a better understanding of the parasite biology. Methods and Findings Here an EST library consisting of 5760 sequences from the pig cysticerca stage has been constructed. In the library 1650 unique sequences were found and of these, 845 sequences (52%) were novel to T. solium and not identified within other EST libraries. Furthermore, 918 sequences (55%) were of unknown function. Amongst the 25 most frequently expressed sequences 6 had no relevant similarity to other sequences found in the Genbank NR DNA database. A prediction of putative signal peptides was also performed and 4 among the 25 were found to be predicted with a signal peptide. Proposed vaccine and diagnostic targets T24, Tsol18/HP6 and Tso31d could also be identified among the 25 most frequently expressed. Conclusions An EST library has been produced from pig cysticerca and analyzed. More than half of the different ESTs sequenced contained a sequence with no suggested function and 845 novel EST sequences have been identified. The library increases the knowledge about what genes are expressed and to what level. It can also be used to study different areas of research such as drug and diagnostic development together with parasite fitness via e.g. immune modulation. PMID:21200421

  12. Preparation and analysis of an expressed sequence tag library from the toxic dinoflagellate Alexandrium catenella.

    PubMed

    Uribe, Paulina; Fuentes, Daniela; Valdés, Jorge; Shmaryahu, Amir; Zúñiga, Alicia; Holmes, David; Valenzuela, Pablo D T

    2008-01-01

    Dinoflagellates of the genus Alexandrium are photosynthetic microalgae that are extremely important because of the impact of some toxic species on the shellfish aquaculture industry. Alexandrium catenella is the species responsible for the production of paralytic shellfish poisoning in Chile and other geographical areas. We have constructed a cDNA library from midexponential cells of A. catenella grown in culture free of associated bacteria and sequenced 10,850 expressed sequence tags (ESTs) that were assembled into 1,021 contigs and 5,475 singletons for a total of 6,496 unigenes. Approximately 41.6% of the unigenes showed similarity to genes with predicted function. A significant number of unigenes showed similarity with genes from other dinoflagellates, plants, and other protists. Among the identified genes, the most highly expressed correspond to those coding for proteins of luminescence, carbohydrate metabolism, and photosynthesis. The sequences of 9,847 ESTs have been deposited in GenBank (accession numbers EX 454357-464203).
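
    The unigene count reported above is simply contigs plus singletons, and the share of reads absorbed into multi-read contigs follows from it. A minimal sketch of that bookkeeping, using only the numbers in the abstract; the class name and the `redundancy` definition (one minus unigenes per EST) are illustrative assumptions, not taken from the paper:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssemblySummary:
    """Counts from an EST clustering/assembly run."""
    n_ests: int
    n_contigs: int
    n_singletons: int

    @property
    def n_unigenes(self) -> int:
        # Every contig and every singleton counts as one putative unigene.
        return self.n_contigs + self.n_singletons

    @property
    def redundancy(self) -> float:
        # Fraction of reads that did not contribute a new unigene.
        return 1.0 - self.n_unigenes / self.n_ests


if __name__ == "__main__":
    summary = AssemblySummary(n_ests=10_850, n_contigs=1_021, n_singletons=5_475)
    print(summary.n_unigenes)
    print(f"{summary.redundancy:.3f}")
```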

  13. Large-scale detection and application of expressed sequence tag single nucleotide polymorphisms in Nicotiana.

    PubMed

    Wang, Y; Zhou, D; Wang, S; Yang, L

    2015-01-01

    Single nucleotide polymorphisms (SNPs) are widespread in the Nicotiana genome. Using an alignment and variation detection method, we developed 20,607,973 SNPs, based on the expressed sequence tag sequences of 10 Nicotiana species. The replacement rate was much higher than the transversion rate in the SNPs, and SNPs are widely distributed across Nicotiana. In vitro verification indicated that all of the SNPs were high quality and accurate. Evolutionary relationships between 15 varieties were investigated by polymerase chain reaction with a special primer; the specific 302 locus of these sequence results clearly indicated the origin of Zhongyan 100. A database of Nicotiana SNPs (NSNP) was developed to store and search for SNPs in Nicotiana. NSNP is a tool for researchers to develop SNP markers from sequence data. PMID:26214460
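
    The abstract above compares a "replacement" rate with a transversion rate. Assuming "replacement" refers to transitions (purine-to-purine or pyrimidine-to-pyrimidine changes), which is an interpretation and not stated in the source, the classification could be sketched as follows. The function names and the toy SNP list are hypothetical and are not data from the study:

```python
from collections import Counter

PURINES = frozenset("AG")
PYRIMIDINES = frozenset("CT")
BASES = PURINES | PYRIMIDINES


def classify_snp(ref: str, alt: str) -> str:
    """Label a single-base substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref!r} -> {alt!r}")
    # A transition stays within one chemical class (A<->G or C<->T).
    same_class = (ref in PURINES) == (alt in PURINES)
    return "transition" if same_class else "transversion"


def ts_tv_ratio(snps) -> float:
    """Transition/transversion ratio for an iterable of (ref, alt) pairs."""
    counts = Counter(classify_snp(ref, alt) for ref, alt in snps)
    if counts["transversion"] == 0:
        raise ValueError("no transversions observed; ratio is undefined")
    return counts["transition"] / counts["transversion"]


if __name__ == "__main__":
    toy_snps = [("A", "G"), ("C", "T"), ("A", "T")]  # illustrative only
    print(ts_tv_ratio(toy_snps))
```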

  14. Large-scale detection and application of expressed sequence tag single nucleotide polymorphisms in Nicotiana.

    PubMed

    Wang, Y; Zhou, D; Wang, S; Yang, L

    2015-07-14

    Single nucleotide polymorphisms (SNPs) are widespread in the Nicotiana genome. Using an alignment and variation detection method, we developed 20,607,973 SNPs, based on the expressed sequence tag sequences of 10 Nicotiana species. The replacement rate was much higher than the transversion rate in the SNPs, and SNPs are widely distributed across Nicotiana. In vitro verification indicated that all of the SNPs were high quality and accurate. Evolutionary relationships between 15 varieties were investigated by polymerase chain reaction with a special primer; the specific 302 locus of these sequence results clearly indicated the origin of Zhongyan 100. A database of Nicotiana SNPs (NSNP) was developed to store and search for SNPs in Nicotiana. NSNP is a tool for researchers to develop SNP markers from sequence data.

  15. Simple sequence repeat marker development from bacterial artificial chromosome end sequences and expressed sequence tags of flax (Linum usitatissimum L.).

    PubMed

    Cloutier, Sylvie; Miranda, Evelyn; Ward, Kerry; Radovanovic, Natasa; Reimer, Elsa; Walichnowski, Andrzej; Datla, Raju; Rowland, Gordon; Duguid, Scott; Ragupathy, Raja

    2012-08-01

    Flax is an important oilseed crop in North America and is mostly grown as a fibre crop in Europe. As a self-pollinated diploid with a small estimated genome size of ~370 Mb, flax is well suited for fast progress in genomics. In the last few years, important genetic resources have been developed for this crop. Here, we describe the assessment and comparative analyses of 1,506 putative simple sequence repeats (SSRs) of which, 1,164 were derived from BAC-end sequences (BESs) and 342 from expressed sequence tags (ESTs). The SSRs were assessed on a panel of 16 flax accessions with 673 (58 %) and 145 (42 %) primer pairs being polymorphic in the BESs and ESTs, respectively. With 818 novel polymorphic SSR primer pairs reported in this study, the repertoire of available SSRs in flax has more than doubled from the combined total of 508 of all previous reports. Among nucleotide motifs, trinucleotides were the most abundant irrespective of the class, but dinucleotides were the most polymorphic. SSR length was also positively correlated with polymorphism. Two dinucleotide (AT/TA and AG/GA) and two trinucleotide (AAT/ATA/TAA and GAA/AGA/AAG) motifs and their iterations, different from those reported in many other crops, accounted for more than half of all the SSRs and were also more polymorphic (63.4 %) than the rest of the markers (42.7 %). This improved resource promises to be useful in genetic, quantitative trait loci (QTL) and association mapping as well as for anchoring the physical/genetic map with the whole genome shotgun reference sequence of flax.

  16. Rapid in silico cloning of genes using expressed sequence tags (ESTs).

    PubMed

    Gill, R W; Sanseau, P

    2000-01-01

    Expressed sequence tags (ESTs) are short single-pass DNA sequences obtained from either end of cDNA clones. These ESTs are derived from a vast number of cDNA libraries obtained from different species. Human ESTs make up the bulk of the data and have been widely used to identify new members of gene families, as markers on the human chromosomes, to discover polymorphism sites and to compare expression patterns in different tissues or pathological states. Information strategies have been devised to query EST databases. Since most of the analysis is performed with a computer, the term "in silico" strategy has been coined. In this chapter we will review the current status of EST databases, the pros and cons of EST-type data and describe possible strategies to retrieve meaningful information. PMID:10874996

  17. Analysis of expressed sequence tags from a naked foraminiferan Reticulomyxa filosa.

    PubMed

    Burki, Fabien; Nikolaev, Sergey I; Bolivar, Ignacio; Guiard, Jackie; Pawlowski, Jan

    2006-08-01

    Foraminifers are a major component of modern marine ecosystems and one of the most important oceanic producers of calcium carbonate. They are a key phylogenetic group among amoeboid protists, but our knowledge of their genome is still mostly limited to a few conserved genes. Here, we report the first study of expressed genes by means of expressed sequence tags (ESTs) from the freshwater naked foraminiferan Reticulomyxa filosa. Cluster analysis of 1630 valid ESTs enabled the identification of 178 groups of related sequences and 871 singlets. Approximately 50% of the putative unique 1059 ESTs could be annotated using Blast searches against the protein database SwissProt + TrEMBL. The EST database described here is the first step towards gene discovery in Foraminifera and should provide the basis for new insights into the genomic and transcriptomic characteristics of these interesting but poorly understood protists.

  18. Mining and gene ontology based annotation of SSR markers from expressed sequence tags of Humulus lupulus.

    PubMed

    Singh, Swati; Gupta, Sanchita; Mani, Ashutosh; Chaturvedi, Anoop

    2012-01-01

    Humulus lupulus, commonly known as hops, is a member of the family Moraceae. Many projects are currently underway, leading to the accumulation of voluminous genomic and expressed sequence tag (EST) sequences in public databases. The genetically characterized domains in these databases are limited due to the non-availability of reliable molecular markers. A large set of EST sequences is available for hops. In the present study, simple sequence repeat (SSR) markers extracted from EST data are used as molecular markers for genetic characterization. A total of 25,495 EST sequences were examined and assembled to get full-length sequences. The maximum frequency was shown by mononucleotide SSR motifs, i.e. 60.44% in contigs and 62.16% in singletons, whereas the minimum frequencies were observed for hexanucleotide SSRs in contigs (0.09%) and pentanucleotide SSRs in singletons (0.12%). Most trinucleotide motifs code for glutamic acid (GAA), while AT/TA was the most frequent repeat among dinucleotide SSRs. Flanking primer pairs were designed in silico for the SSR-containing sequences. Functional categorization of SSR-containing sequences was done through gene ontology terms such as biological process, cellular component and molecular function.
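
    SSR mining of the kind described above is commonly done by scanning each sequence for tandemly repeated short motifs. The sketch below shows one way to do that with regular expressions. The minimum repeat counts per motif length are assumed defaults chosen for illustration; the abstract does not give the thresholds or the tool that was actually used, and `find_ssrs` is a hypothetical name.

```python
import re

# Assumed thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for each perfect SSR found in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, n_min in sorted(min_repeats.items()):
        # One captured motif followed by at least n_min - 1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, n_min - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit, e.g. "ATAT".
            if any(
                motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    toy = "GGC" + "AT" * 7 + "CCG" + "GAA" * 5 + "T"  # illustrative sequence
    for start, motif, copies in find_ssrs(toy):
        print(start, motif, copies)
```

    Reported positions are 0-based. Compound or imperfect repeats are not handled, which dedicated SSR tools usually do.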

  19. Expressed sequence tags from a NaCl-treated Suaeda salsa cDNA library.

    PubMed

    Zhang, L; Ma, X L; Zhang, Q; Ma, C L; Wang, P P; Sun, Y F; Zhao, Y X; Zhang, H

    2001-04-18

    Past efforts to improve plant tolerance to osmotic stress have had limited success owing to the genetic complexity of stress responses. The first step towards cataloging and categorizing genetically complex abiotic stress responses is the rapid discovery of genes by the large-scale partial sequencing of randomly selected cDNA clones or expressed sequence tags (ESTs). Suaeda salsa, which can survive seawater-level salinity, is a favorite halophytic model for salt-tolerance research. We constructed a NaCl-treated cDNA library of Suaeda salsa and sequenced 1048 randomly selected clones, out of which 1016 clones produced readable sequences (773 showed homology to previously identified genes, 227 matched unknown protein coding regions, 16 anomalous sequences or sequences of bacterial origin were excluded from further analysis). By sequence analysis we identified 492 unique clones: 315 showed homology to previously identified genes, 177 matched unknown protein coding regions (101 of which have been found before in other organisms and 76 are completely novel). All our EST data are available on the Internet. We believe that our dbEST and the associated DNA materials will be a useful resource for scientists engaged in stress-tolerance studies. PMID:11313146

  1. Gene ontology based characterization of expressed sequence tags (ESTs) of Brassica rapa cv. Osome.

    PubMed

    Arasan, Senthil Kumar Thamil; Park, Jong-In; Ahmed, Nasar Uddin; Jung, Hee-Jeong; Lee, In-Ho; Cho, Yong-Gu; Lim, Yong-Pyo; Kang, Kwon-Kyoo; Nou, Ill-Sup

    2013-07-01

    Chinese cabbage (Brassica rapa) is widely recognized for its economic importance and contribution to human nutrition, but abiotic and biotic stresses are the main obstacles to its quality, nutritional status and production. In this study, 3,429 expressed sequence tag (EST) sequences were generated from a B. rapa cv. Osome cDNA library and the unique transcripts were classified functionally using a gene ontology (GO) hierarchy and the Kyoto encyclopedia of genes and genomes (KEGG). KEGG orthology and the structural domain data were obtained from the biological database for stress related genes (SRG). EST datasets provided a wide outlook of functional characterization of B. rapa cv. Osome. In silico analysis revealed 83% of ESTs to be well annotated towards reeds one dimensional concept. Clustering of ESTs returned 333 contigs and 2,446 singlets, giving a total of 3,284 putative unigene sequences. This dataset contained 1,017 EST sequences functionally annotated to stress responses, from which the expression of randomly selected SRGs was analyzed against cold, salt, drought, ABA, water and PEG stresses. Most of the SRGs showed differential expression under these stresses. Thus, the EST dataset is very important for discovering potential genes related to stress resistance in Chinese cabbage, and can be a useful resource for genetic engineering of Brassica sp. PMID:23898551

  2. Refined annotation of the Arabidopsis genome by complete expressed sequence tag mapping.

    PubMed

    Zhu, Wei; Schlueter, Shannon D; Brendel, Volker

    2003-06-01

    Expressed sequence tags (ESTs) currently encompass more entries in the public databases than any other form of sequence data. Thus, EST data sets provide a vast resource for gene identification and expression profiling. We have mapped the complete set of 176,915 publicly available Arabidopsis EST sequences onto the Arabidopsis genome using GeneSeqer, a spliced alignment program incorporating sequence similarity and splice site scoring. About 96% of the available ESTs could be properly aligned with a genomic locus, with the remaining ESTs deriving from organelle genomes and non-Arabidopsis sources or displaying insufficient sequence quality for alignment. The mapping provides verified sets of EST clusters for evaluation of EST clustering programs. Analysis of the spliced alignments suggests corrections to current gene structure annotation and provides examples of alternative and non-canonical pre-mRNA splicing. All results of this study were parsed into a database and are accessible via a flexible Web interface at http://www.plantgdb.org/AtGDB/.

  4. Development of polymorphic microsatellite markers based on expressed sequence tags in Populus cathayana (Salicaceae).

    PubMed

    Tian, Z Z; Zhang, F Q; Cai, Z Y; Chen, S L

    2016-01-01

    Populus cathayana occupies a large area within the northern, central, and southwestern regions of China, and is considered to be an important reforestation species in western China. In order to investigate the population genetic structure of this species, 10 polymorphic microsatellite loci were identified based on expressed sequence tags from de novo sequencing on the Illumina HiSeq 2000 platform. All microsatellite primers were tested on 48 P. cathayana individuals from four locations on the Qinghai-Tibet Plateau. The observed heterozygosity ranged from 0.000 to 1.000, and the null-allele frequency ranged from 0.000 to 0.904. These microsatellite markers may be a useful tool in genetic studies on P. cathayana and closely related species. PMID:27525845

  5. TBestDB: a taxonomically broad database of expressed sequence tags (ESTs)

    PubMed Central

    O'Brien, Emmet A.; Koski, Liisa B.; Zhang, Yue; Yang, LiuSong; Wang, Eric; Gray, Michael W.; Burger, Gertraud; Lang, B. Franz

    2007-01-01

    The TBestDB database contains ∼370 000 clustered expressed sequence tag (EST) sequences from 49 organisms, covering a taxonomically broad range of poorly studied, mainly unicellular eukaryotes, and includes experimental information, consensus sequences, gene annotations and metabolic pathway predictions. Most of these ESTs have been generated by the Protist EST Program, a collaboration among six Canadian research groups. EST sequences are read from trace files up to a minimum quality cut-off, vector and linker sequence is masked, and the ESTs are clustered using phrap. The resulting consensus sequences are automatically annotated by using the AutoFACT program. The datasets are automatically checked for clustering errors due to chimerism and potential cross-contamination between organisms, and suspect data are flagged in or removed from the database. Access to data deposited in TBestDB by individual users can be restricted to those users for a limited period. With this first report on TBestDB, we open the database to the research community for free processing, annotation, interspecies comparisons and GenBank submission of EST data generated in individual laboratories. For instructions on submission to TBestDB, contact tbestdb@bch.umontreal.ca. The database can be queried at . PMID:17202165

  6. Analysis of expressed sequence tags from the green alga Dunaliella salina (Chlorophyta).

    PubMed

    Zhao, Rui; Cao, Yu; Xu, Hui; Lv, Linfeng; Qiao, Dairong; Cao, Yi

    2011-12-01

    The unicellular green alga Dunaliella salina (Dunal) Teodor. is a novel model photosynthetic eukaryote for studying photosystems, high salinity acclimation, and carotenoid accumulation. In spite of such significance, there have been limited studies on the Dunaliella genome, transcriptome, and proteome. To further investigate D. salina, a cDNA library was constructed and sequenced. Here, we present the analysis of the 2,282 expressed sequence tags (ESTs) generated together with 3,990 ESTs from dbEST. A total of 4,148 unique sequences (UniSeqs) were identified, of which 56.1% had sequence similarity with Uniprot entries, suggesting that a large number of unique genes may be harbored by Dunaliella. Additionally, protein family domains were identified to further characterize these sequences. We also compared EST sequences with different complete eukaryotic genomes from several animals, plants, and fungi. We observed notable differences between D. salina and other organisms. This EST collection and its annotation provided a significant resource for basic and applied research on D. salina and laid the foundation for a systematic analysis of the transcriptome basis of green algae development and diversification.

  7. Mining expressed sequence tag (EST) libraries for cancer-associated genes.

    PubMed

    Schmitt, Armin O

    2010-01-01

    Originally established at the beginning of the 1990s as a direct route to gene finding, expressed sequence tags (ESTs) still lend themselves as a means to analyze gene expression in almost all human tissues. The types of questions that can be addressed using public EST libraries range from tissue-specific gene profiling to the comparison between tissues in diseased and healthy states. Thanks to a multitude of web-based bioinformatics resources, mining in EST libraries is not restricted to experts in the field of data analysis, but can readily be performed by the medical or life scientist. In this chapter, a couple of case studies are presented that guide scientists to the most useful online resources so that they can conduct their own research.

  8. Cloning, analysis and functional annotation of expressed sequence tags from the Earthworm Eisenia fetida

    PubMed Central

    Pirooznia, Mehdi; Gong, Ping; Guan, Xin; Inouye, Laura S; Yang, Kuan; Perkins, Edward J; Deng, Youping

    2007-01-01

    Background: Eisenia fetida, commonly known as red wiggler or compost worm, belongs to the Lumbricidae family of the Annelida phylum. Little is known about its genome sequence although it has been extensively used as a test organism in terrestrial ecotoxicology. In order to understand its gene expression response to environmental contaminants, we cloned 4032 cDNAs or expressed sequence tags (ESTs) from two E. fetida libraries enriched with genes responsive to ten ordnance-related compounds using suppressive subtractive hybridization-PCR. Results: A total of 3144 good quality ESTs (GenBank dbEST accession number EH669363–EH672369 and EL515444–EL515580) were obtained from the raw clone sequences after cleaning. Clustering analysis yielded 2231 unique sequences including 448 contigs (from 1361 ESTs) and 1783 singletons. Comparative genomic analysis showed that 743 or 33% of the unique sequences shared high similarity with existing genes in the GenBank nr database. Provisional function annotation assigned 830 Gene Ontology terms to 517 unique sequences based on their homology with the annotated genomes of four model organisms: Drosophila melanogaster, Mus musculus, Saccharomyces cerevisiae, and Caenorhabditis elegans. Seven percent of the unique sequences were further mapped to 99 Kyoto Encyclopedia of Genes and Genomes pathways based on their matching Enzyme Commission numbers. All the information is stored in and retrievable from a high-performance, web-based, and user-friendly relational database called EST model database or ESTMD version 2. Conclusion: The ESTMD containing the sequence and annotation information of 4032 E. fetida ESTs is publicly accessible at . PMID:18047730
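The counts reported in this abstract can be cross-checked with a few lines of arithmetic. The sketch below is only an illustration: the function name `clustering_summary` and the `redundancy` measure are hypothetical and do not come from the paper; only the input numbers are taken from the abstract.

```python
def clustering_summary(contigs, singletons, ests_in_contigs):
    """Summarise an EST clustering run from three reported counts.

    The 'redundancy' value (share of ESTs collapsed by clustering) is an
    illustrative measure, not one reported in the abstract.
    """
    unique = contigs + singletons             # each contig or singleton is one unique sequence
    total_ests = ests_in_contigs + singletons  # every EST is either in a contig or a singleton
    return {
        "unique_sequences": unique,
        "total_ests": total_ests,
        "redundancy": 1 - unique / total_ests,
    }


# Counts taken from the E. fetida abstract above.
summary = clustering_summary(contigs=448, singletons=1783, ests_in_contigs=1361)
annotated_percent = round(743 / summary["unique_sequences"] * 100)
print(summary["unique_sequences"], summary["total_ests"], annotated_percent)
```

With these inputs the contigs and singletons add up to the 2231 unique sequences, the ESTs in contigs plus singletons add up to the 3144 good quality ESTs, and 743 of 2231 rounds to the stated 33%.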

  9. Studies of a Biochemical Factory: Tomato Trichome Deep Expressed Sequence Tag Sequencing and Proteomics

    PubMed Central

    Schilmiller, Anthony L.; Miner, Dennis P.; Larson, Matthew; McDowell, Eric; Gang, David R.; Wilkerson, Curtis; Last, Robert L.

    2010-01-01

    Shotgun proteomics analysis allows hundreds of proteins to be identified and quantified from a single sample at relatively low cost. Extensive DNA sequence information is a prerequisite for shotgun proteomics, and it is ideal to have sequence for the organism being studied rather than from related species or accessions. While this requirement has limited the set of organisms that are candidates for this approach, next generation sequencing technologies make it feasible to obtain deep DNA sequence coverage from any organism. As part of our studies of specialized (secondary) metabolism in tomato (Solanum lycopersicum) trichomes, 454 sequencing of cDNA was combined with shotgun proteomics analyses to obtain in-depth profiles of genes and proteins expressed in leaf and stem glandular trichomes of 3-week-old plants. The expressed sequence tag and proteomics data sets combined with metabolite analysis led to the discovery and characterization of a sesquiterpene synthase that produces β-caryophyllene and α-humulene from E,E-farnesyl diphosphate in trichomes of leaf but not of stem. This analysis demonstrates the utility of combining high-throughput cDNA sequencing with proteomics experiments in a target tissue. These data can be used for dissection of other biochemical processes in these specialized epidermal cells. PMID:20431087

  10. Mining of SSR markers from Expressed Sequence Tags of bamboo species

    PubMed Central

    Ramalakshmi, Oviya Iyappan; Piramanayagam, Shanmughavel

    2010-01-01

    With the ever-increasing number of Expressed Sequence Tags (ESTs) from various sequencing projects, ESTs have become a valuable, first-hand source for in silico mining of simple sequence repeat (SSR) markers. We examined a total of 3419 EST sequences from three bamboo species, namely Phyllostachys edulis, Bambusa oldhamii, and Dendrocalamus sinicus, for the presence of di- to hexanucleotide microsatellites. The frequency of SSR-containing ESTs varied from 5.36% in B. oldhamii to 13.05% in P. edulis. No SSRs were found in D. sinicus. Trinucleotide repeats (49.34%) were most frequent in P. edulis, while no comparable difference among repeat types was found in B. oldhamii. Flanking primer pairs were also designed in silico for the sequences containing SSRs, and their position on the genome was hypothesized using similarity searching. SSRs located in open reading frames (ORFs) were given functional annotation using Gene Ontology. Polymorphic SSRs were also detected using a new pipeline, polySSR. The polymorphism level was very low (2.43%), and the position of the polymorphic SSRs was determined. The development of SSRs and the study of polymorphism will help in the further study of intra- and inter- gene flow, genetic structure, variability, linkage mapping, and evolutionary relationships in bamboo. PMID:21364824

  11. Development of an Expressed Sequence Tag (EST) Resource for Wheat (Triticum aestivum L.)

    PubMed Central

    Lazo, G. R.; Chao, S.; Hummel, D. D.; Edwards, H.; Crossman, C. C.; Lui, N.; Matthews, D. E.; Carollo, V. L.; Hane, D. L.; You, F. M.; Butler, G. E.; Miller, R. E.; Close, T. J.; Peng, J. H.; Lapitan, N. L. V.; Gustafson, J. P.; Qi, L. L.; Echalier, B.; Gill, B. S.; Dilbirligi, M.; Randhawa, H. S.; Gill, K. S.; Greene, R. A.; Sorrells, M. E.; Akhunov, E. D.; Dvořák, J.; Linkiewicz, A. M.; Dubcovsky, J.; Hossain, K. G.; Kalavacharla, V.; Kianian, S. F.; Mahmoud, A. A.; Miftahudin; Ma, X.-F.; Conley, E. J.; Anderson, J. A.; Pathan, M. S.; Nguyen, H. T.; McGuire, P. E.; Qualset, C. O.; Anderson, O. D.

    2004-01-01

    This report describes the rationale, approaches, organization, and resource development leading to a large-scale deletion bin map of the hexaploid (2n = 6x = 42) wheat genome (Triticum aestivum L.). Accompanying reports in this issue detail results from chromosome bin-mapping of expressed sequence tags (ESTs) representing genes onto the seven homoeologous chromosome groups and a global analysis of the entire mapped wheat EST data set. Among the resources developed was the first extensive public wheat EST collection (113,220 ESTs). Described are protocols for sequencing, sequence processing, EST nomenclature, and the assembly of ESTs into contigs. These contigs plus singletons (unassembled ESTs) were used for selection of distinct sequence motif unigenes. Selected ESTs were rearrayed, validated by 5′ and 3′ sequencing, and amplified for probing a series of wheat aneuploid and deletion stocks. Images and data for all Southern hybridizations were deposited in databases and were used by the coordinators for each of the seven homoeologous chromosome groups to validate the mapping results. Results from this project have established the foundation for future developments in wheat genomics. PMID:15514037

  12. Expressed sequence tag analysis of functional genes associated with adventitious rooting in Liriodendron hybrids.

    PubMed

    Zhong, Y D; Sun, X Y; Liu, E Y; Li, Y Q; Gao, Z; Yu, F X

    2016-06-24

    Liriodendron hybrids (Liriodendron chinense x L. tulipifera) are important landscaping and afforestation hardwood trees. To date, little genomic research on adventitious rooting has been reported in these hybrids, as well as in the genus Liriodendron. In the present study, we used adventitious roots to construct the first cDNA library for Liriodendron hybrids. A total of 5176 expressed sequence tags (ESTs) were generated and clustered into 2921 unigenes. Among these unigenes, 2547 had significant homology to the non-redundant protein database representing a wide variety of putative functions. Homologs of these genes regulated many aspects of adventitious rooting, including those for auxin signal transduction and root hair development. Results of quantitative real-time polymerase chain reaction showed that AUX1, IRE, and FB1 were highly expressed in adventitious roots and the expression of AUX1, ARF1, NAC1, RHD1, and IRE increased during the development of adventitious roots. Additionally, 181 simple sequence repeats were identified from 166 ESTs and more than 91.16% of these were dinucleotide and trinucleotide repeats. To the best of our knowledge, the present study reports the identification of the genes associated with adventitious rooting in the genus Liriodendron for the first time and provides a valuable resource for future genomic studies. Expression analysis of selected genes could allow us to identify regulatory genes that may be essential for adventitious rooting.

  13. Preparation and analysis of an expressed sequence tag library from the toxic dinoflagellate Alexandrium catenella.

    PubMed

    Uribe, Paulina; Fuentes, Daniela; Valdés, Jorge; Shmaryahu, Amir; Zúñiga, Alicia; Holmes, David; Valenzuela, Pablo D T

    2008-01-01

    Dinoflagellates of the genus Alexandrium are photosynthetic microalgae that are extremely important because of the impact of some toxic species on the shellfish aquaculture industry. Alexandrium catenella is the species responsible for paralytic shellfish poisoning in Chile and other geographical areas. We have constructed a cDNA library from mid-exponential cells of A. catenella grown in culture free of associated bacteria and sequenced 10,850 expressed sequence tags (ESTs) that were assembled into 1,021 contigs and 5,475 singletons for a total of 6,496 unigenes. Approximately 41.6% of the unigenes showed similarity to genes with predicted function. A significant number of unigenes showed similarity with genes from other dinoflagellates, plants, and other protists. Among the identified genes, the most highly expressed are those coding for proteins of luminescence, carbohydrate metabolism, and photosynthesis. The sequences of 9,847 ESTs have been deposited in GenBank (accession numbers EX 454357-464203). PMID:18478293

  14. Large scale in silico identification of MYB family genes from wheat expressed sequence tags.

    PubMed

    Cai, Hongsheng; Tian, Shan; Dong, Hansong

    2012-10-01

    The MYB proteins constitute one of the largest transcription factor families in plants. Much research has been performed to determine their structures, functions, and evolution, especially in the model plants, Arabidopsis, and rice. However, this transcription factor family has been much less studied in wheat (Triticum aestivum), for which no genome sequence is yet available. Despite this, expressed sequence tags are an important resource that permits opportunities for large scale gene identification. In this study, a total of 218 sequences from wheat were identified and confirmed to be putative MYB proteins, including 1RMYB, R2R3-type MYB, 3RMYB, and 4RMYB types. A total of 36 R2R3-type MYB genes with complete open reading frames were obtained. The putative orthologs were assigned in rice and Arabidopsis based on the phylogenetic tree. Tissue-specific expression pattern analyses confirmed the predicted orthologs, and this meant that gene information could be inferred from the Arabidopsis genes. Moreover, the motifs flanking the MYB domain were analyzed using the MEME web server. The distribution of motifs among wheat MYB proteins was investigated and this facilitated subfamily classification.

  15. Identification of immunological expressed sequence tags in the mealworm beetle Tenebrio molitor.

    PubMed

    Dobson, Adam J; Johnston, Paul R; Vilcinskas, Andreas; Rolff, Jens

    2012-12-01

    Understanding the evolutionary ecology of immune responses to persistent infection could provide fundamental insight into temporal dynamics or interactive mechanisms that could be co-opted for antibiotic treatment regimes. Additionally, identification of novel molecules involved in these processes could provide novel compounds for biotechnological development. The beetle Tenebrio molitor displays a high level of induced antimicrobial activity coincident with persistent immuno-resistant Staphylococcus aureus, and is the first invertebrate model for persistent infection. Here we present expressed sequence tags (ESTs) detected by suppression-subtraction hybridization of Tenebrio larvae after infection with S. aureus. Amongst others, we identified mRNAs coding for various oxidative enzymes and two antimicrobial peptides. These ESTs provide a foundation for mechanistic study of Tenebrio's immune system. PMID:23041376

  16. Analysis and functional annotation of expressed sequence tags of water buffalo.

    PubMed

    Bajetha, Garima; Bhati, Jyotika; Sarika; Iquebal, M A; Rai, Anil; Arora, Vasu; Kumar, Dinesh

    2013-01-01

    An elucidated genome of the domestic river buffalo will contribute enormously to the economy and to a better understanding of genome evolution. An attempt is made to obtain genomic information on buffalo, based on the total Expressed Sequence Tags (ESTs) of Bubalus bubalis available in the public domain. These ESTs were annotated and classified into 15 different functional categories based on their homology to known proteins. Interestingly, 41.79% of the contigs were found to be buffalo-specific novel ESTs with respect to the other species used in the analysis, which need further study. Also, 224 putative single nucleotide polymorphisms (pSNPs) were detected. This study will provide a base for further genomic studies of buffalo and for comparative studies, offering a starting point for the genome annotation of the organism. Supplementary materials are available for this article online.

  17. Development of expressed sequence tag-simple sequence repeat markers for Chrysanthemum morifolium and closely related species.

    PubMed

    Liu, H; Zhang, Q X; Sun, M; Pan, H T; Kong, Z X

    2015-01-01

    With the development of chrysanthemum breeding in recent years, an increasing number of wild species in genera related to Chrysanthemum were introduced to extend the genetic resources and facilitate the genetic improvement of chrysanthemums via hybridization. However, few simple sequence repeat (SSR) markers are available for marker-assisted breeding and population genetic studies of chrysanthemum and closely related species. Expressed sequence tags (ESTs) in public databases and cross-species transferable markers are considered to be a cost-effective means for developing sequence-based markers. In this study, 25 EST-SSRs were successfully developed from Chrysanthemum EST sequences for Chrysanthemum morifolium and closely related species. In total, 4164 unigene sequences were assembled from 7180 ESTs of chrysanthemum in GenBank, which were subsequently used to screen for the presence of microsatellites with the SSRIT software. The screening criteria were 8, 5, 4, and 3 repeating units for di-, tri-, tetra-, and penta- and higher-order nucleotides, respectively. Moreover, 310 SSR loci from 296 sequences were identified, and 198 primer pairs for SSR amplification were designed with the Primer Premier 5.0 software, of which 25 SSR loci showed polymorphic amplification in 52 species and varieties belonging to Chrysanthemum, Ajania, and Opisthopappus. The application of EST-SSR markers to the identification of intergeneric hybrids between Chrysanthemum and Ajania was demonstrated. Therefore, EST-SSRs can be developed for species that lack gene sequences or ESTs by utilizing ESTs of closely related species. PMID:26214436
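The screening criteria quoted in this abstract (at least 8, 5, 4, and 3 repeat units for di-, tri-, tetra-, and penta- or higher-order motifs) can be illustrated with a small regular-expression scan. This is a minimal sketch and not the SSRIT software: the function name `find_ssrs` is hypothetical, only perfect repeats are detected, and treating "higher-order" as motifs of 5 and 6 bases is an assumption.

```python
import re

# Minimum repeat counts per motif length, following the thresholds in the abstract.
# Capping "higher-order" motifs at 6 bases is an assumption of this sketch.
MIN_REPEATS = {2: 8, 3: 5, 4: 4, 5: 3, 6: 3}


def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in MIN_REPEATS.items():
        # One motif of unit_len bases followed by at least (min_rep - 1) further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT"),
            # so an (AT)n run is reported once as a dinucleotide repeat.
            if any(motif == motif[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit_len))
    return sorted(hits)


example = "GG" + "AT" * 8 + "CC" + "GCA" * 5 + "TT"
print(find_ssrs(example))
```

On the example string the scan reports the (AT)8 run at position 2 and the (GCA)5 run at position 20; a run of only seven AT units falls below the dinucleotide threshold and is not reported.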

  18. Analysis of expressed sequence tags from the red alga Griffithsia okiensis.

    PubMed

    Lee, Hyoungseok; Lee, Hong Kum; An, Gynheung; Lee, Yoo Kyung

    2007-12-01

    Red algae are distributed globally, and the group contains several commercially important species. Griffithsia okiensis is one of the most extensively studied red algal species. In this study, we conducted expressed sequence tag (EST) analysis and synonymous codon usage analysis using cultured G. okiensis samples. A total of 1,104 cDNA clones were sequenced using a cDNA library made from samples collected from Dolsan Island, on the southern coast of Korea. The clustering analysis of these sequences allowed for the identification of 1,048 unigene clusters consisting of 36 consensus and 1,012 singleton sequences. BLASTX searches generated 532 significant hits (E-value < 10^-4), and via further Gene Ontology analysis we constructed a functional classification of 434 unigenes. Our codon usage analysis showed that unigene clusters with more than three ESTs had higher GC contents (76.5%) at the third position of the codons than the singletons. Also, the majority of the optimal codons of G. okiensis and Chondrus crispus belonging to Bangiophycidae were C-ending, whereas those of Porphyra yezoensis belonging to Florideophycidae were G-ending. An orthologous gene search of the P. yezoensis EST database resulted in the identification of 39 unigenes commonly expressed in the two rhodophytes, which have putative functions for structural proteins, protein degradation, signal transduction, stress response, and physiological processes. Although experiments have been conducted on a limited scale, this study provides a material basis for the development of microarrays useful for gene expression studies, as well as useful information for the comparative genomic analysis of red algae.
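The GC content at the third codon position mentioned in this abstract is simple to compute once a coding sequence is in frame. The following is a minimal sketch under that assumption; the function name `gc3_content` is hypothetical and the abstract does not describe the authors' own procedure.

```python
def gc3_content(cds):
    """Fraction of codons whose third base is G or C.

    Assumes cds is an in-frame coding sequence starting at codon position 1;
    any trailing partial codon is ignored.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3   # drop an incomplete final codon
    third_bases = cds[2:usable:3]      # bases at codon position 3
    if not third_bases:
        raise ValueError("sequence shorter than one codon")
    return sum(base in "GC" for base in third_bases) / len(third_bases)


# Codons ATG GCC AAA TTC have third bases G, C, A, C.
print(gc3_content("ATGGCCAAATTC"))
```

For the example sequence three of the four third-position bases are G or C, so the function returns 0.75.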

  19. Mining of haplotype-based expressed sequence tag single nucleotide polymorphisms in citrus

    PubMed Central

    2013-01-01

    Background: Single nucleotide polymorphisms (SNPs), the most abundant variations in a genome, have been widely used in various studies. Detection and characterization of citrus haplotype-based expressed sequence tag (EST) SNPs will greatly facilitate further utilization of these gene-based resources. Results: In this paper, haplotype-based SNPs were mined out of publicly available citrus expressed sequence tags (ESTs) from different citrus cultivars (genotypes) individually and collectively for comparison. There were a total of 567,297 ESTs belonging to 27 cultivars in varying numbers and consequentially yielding different numbers of haplotype-based quality SNPs. Sweet orange (SO) had the most (213,830) ESTs, generating 11,182 quality SNPs in 3,327 out of 4,228 usable contigs. Summed from all the individually mining results, a total of 25,417 quality SNPs were discovered: 15,010 (59.1%) were transitions (AG and CT), 9,114 (35.9%) were transversions (AC, GT, CG, and AT), and 1,293 (5.0%) were insertion/deletions (indels). A vast majority of SNP-containing contigs consisted of only 2 haplotypes, as expected, but the percentages of 2 haplotype contigs varied widely in these citrus cultivars. BLAST of the 25,417 25-mer SNP oligos to the Clementine reference genome scaffolds revealed 2,947 SNPs had “no hits found”, 19,943 had 1 unique hit / alignment, 1,571 had one hit and 2+ alignments per hit, and 956 had 2+ hits and 1+ alignment per hit. Of the total 24,293 scaffold hits, 23,955 (98.6%) were on the main scaffolds 1 to 9, and only 338 were on 87 minor scaffolds. Most alignments had 100% (25/25) or 96% (24/25) nucleotide identities, accounting for 93% of all the alignments. Considering almost all the nucleotide discrepancies in the 24/25 alignments were at the SNP sites, it served well as in silico validation of these SNPs, in addition to and consistent with the rate (81%) validated by sequencing and SNaPshot assay. Conclusions: High-quality EST-SNPs from different

  20. Pyrosequence analysis of expressed sequence tags for Manduca sexta hemolymph proteins involved in immune responses.

    PubMed

    Zou, Zhen; Najar, Fares; Wang, Yang; Roe, Bruce; Jiang, Haobo

    2008-06-01

    The tobacco hornworm Manduca sexta is widely used as a model organism to investigate the biochemical basis of insect physiological processes, but little transcriptome information is available. To get a broad view of the larval hemolymph proteins, particularly those related to immunity, we synthesized and sequenced cDNA fragments from a mixture of eight total RNA samples: fat body and hemocytes from larvae injected with killed bacteria, fat body, hemocytes, integument and trachea from naïve larvae, and fat body and hemocytes from wandering larvae. Using massively parallel pyrosequencing, we obtained 95,458 M. sexta expressed sequence tags (ESTs) at an average size of 185 bp per read. A majority of the sequences (69,429 reads) could be assembled into 7231 contigs with an average size of 300 bp, 1178 of which had significant similarity with Drosophila genes from various functional groups. Only approximately 8% (606) of the contigs matched known M. sexta cDNA sequences, representing 186 of the 375 unique NCBI entries. The remaining 6625 contigs represented newly discovered cDNA segments from this well studied biochemical model insect. A search of the 7231 contigs using Tribolium castaneum, Drosophila melanogaster, and Bombyx mori immunity-related sequences revealed 424 cDNA contigs with significant similarity (E-value < 1 × 10^-5). These included 218 previously unknown M. sexta sequences coding for putative defense molecules such as pattern recognition receptors, serine proteinases, serpins, Spätzle, Toll-like receptors, intracellular signaling molecules, and antimicrobial peptides. PMID:18510979

  1. Characterization of genic microsatellite markers derived from expressed sequence tags in Pacific abalone (Haliotis discus hannai)

    NASA Astrophysics Data System (ADS)

    Li, Qi; Shu, Jing; Zhao, Cui; Liu, Shikai; Kong, Lingfeng; Zheng, Xiaodong

    2010-01-01

    Simple sequence repeat (SSR) markers were developed from the expressed sequence tags (ESTs) of Pacific abalone (Haliotis discus hannai). Repeat motifs were found in 4.95% of the ESTs at a frequency of one repeat every 10.04 kb of EST sequences, after redundancy elimination. Seventeen polymorphic EST-SSRs were developed. The number of alleles per locus varied from 2 to 17, with an average of 6.8 alleles per locus. The expected and observed heterozygosities ranged from 0.159 to 0.928 and from 0.132 to 0.922, respectively. Twelve of the 17 loci (70.6%) were successfully amplified in H. diversicolor. Seventeen loci segregated in three families, with three showing the presence of null alleles (17.6%). The adequate level of variability and low frequency of null alleles observed in H. discus hannai, together with the high rate of transportability across Haliotis species, make this set of EST-SSR markers an important tool for comparative mapping, marker-assisted selection, and evolutionary studies, not only in the Pacific abalone, but also in related species.
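Observed and expected heterozygosity, as reported per locus in this abstract, can be computed from diploid genotypes as follows. This is a sketch under stated assumptions: the function name `heterozygosity` is hypothetical, expected heterozygosity uses the plain formula 1 minus the sum of squared allele frequencies without a small-sample correction, and the abstract does not say which estimator the authors used.

```python
from collections import Counter


def heterozygosity(genotypes):
    """Return (observed, expected) heterozygosity for one locus.

    genotypes: list of (allele_a, allele_b) tuples, one per diploid individual.
    Expected heterozygosity is 1 - sum(p_i ** 2) over allele frequencies p_i,
    with no small-sample correction (an assumption of this sketch).
    """
    n = len(genotypes)
    if n == 0:
        raise ValueError("no genotypes")
    observed = sum(a != b for a, b in genotypes) / n
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = 2 * n  # two allele copies per individual
    expected = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return observed, expected


# Four individuals at a two-allele locus; allele labels are arbitrary.
print(heterozygosity([(1, 1), (1, 2), (2, 2), (1, 2)]))
```

In the example two of four individuals are heterozygous and both alleles have frequency 0.5, so observed and expected heterozygosity are both 0.5. A locus where observed heterozygosity falls well below expected is one common hint of null alleles.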

  2. A wing expressed sequence tag resource for Bicyclus anynana butterflies, an evo-devo model

    PubMed Central

    Beldade, Patrícia; Rudd, Stephen; Gruber, Jonathan D; Long, Anthony D

    2006-01-01

    Background: Butterfly wing color patterns are a key model for integrating evolutionary developmental biology and the study of adaptive morphological evolution. Yet, despite the biological, economic, and educational value of butterflies, they are still relatively under-represented in terms of available genomic resources. Here, we describe an Expressed Sequence Tag (EST) project for Bicyclus anynana that has identified the largest available collection to date of expressed genes for any butterfly. Results: By targeting cDNAs from developing wings at the stages when pattern is specified, we biased gene discovery towards genes potentially involved in pattern formation. Assembly of 9,903 ESTs from a subtracted library allowed us to identify 4,251 genes, of which 2,461 were annotated based on BLAST analyses against relevant gene collections. Gene prediction software identified 2,202 peptides, of which 215 longer than 100 amino acids had no homology to any known proteins and, thus, potentially represent novel or highly diverged butterfly genes. We combined gene and Single Nucleotide Polymorphism (SNP) identification by constructing cDNA libraries from pools of outbred individuals, and by sequencing clones from the 3' end to maximize alignment depth. Alignments of multi-member contigs allowed us to identify over 14,000 putative SNPs, with 316 genes having at least one high confidence double-hit SNP. We furthermore identified 320 microsatellites in transcribed genes that can potentially be used as genetic markers. Conclusion: Our project was designed to combine gene and sequence polymorphism discovery and has generated the largest gene collection available for any butterfly and many potential markers in expressed genes. These resources will be invaluable for exploring the potential of B. anynana in particular, and butterflies in general, as models in ecological, evolutionary, and developmental genetics. PMID:16737530

  3. Application of Cydia pomonella expressed sequence tags: Identification and expression of three general odorant binding proteins in codling moth

    PubMed Central

    Garczynski, Stephen F.; Coates, Brad S.; Unruh, Thomas R.; Schaeffer, Scott; Jiwan, Derick; Koepke, Tyson; Dhingra, Amit

    2014-01-01

    The codling moth, Cydia pomonella, is one of the most important pests of pome fruits in the world, yet the molecular genetics and the physiology of this insect remain poorly understood. A combined assembly of 8 341 expressed sequence tags was generated from Roche 454 GS-FLX sequencing of eight tissue-specific cDNA libraries. Putative chemosensory proteins (12) and odorant binding proteins (OBPs) (18) were annotated, which included three putative general OBP (GOBP), one more than typically reported for other Lepidoptera. To further characterize CpomGOBPs, we cloned cDNA copies of their transcripts and determined their expression patterns in various tissues. Cloning and sequencing of the 698 nt transcript for CpomGOBP1 resulted in the prediction of a 163 amino acid coding region, and subsequent RT-PCR indicated that the transcripts were mainly expressed in antennae and mouthparts. The 1 289 nt (160 amino acid) CpomGOBP2 and the novel 702 nt (169 amino acid) CpomGOBP3 transcripts are mainly expressed in antennae, mouthparts, and female abdomen tips. These results indicate that next generation sequencing is useful for the identification of novel transcripts of interest, and that codling moth expresses a transcript encoding a new member of the GOBP subfamily. PMID:23956229

  4. Functional annotation of an expressed sequence tag library from Haliotis diversicolor and analysis of its plant-like sequences.

    PubMed

    Jiang, Jing-Zhe; Zhang, Wei; Guo, Zhi-Xun; Cai, Chen-Chen; Su, You-Lu; Wang, Rui-Xuan; Wang, Jiang-Yong

    2011-09-01

    The small abalone, Haliotis diversicolor, is a widely distributed and cultured species in the subtropical coastal area of China. To identify and classify functional genes of this important species, a normalized expressed sequence tag (EST) library, including 7069 high quality ESTs from the total body of H. diversicolor, was analyzed. A total of 4781 unigenes were assembled and 2991 novel abalone genes were identified. The GC content, codon and amino acid usage of the transcriptome were analyzed. For the accurate annotation of the abalone library, different influencing factors were evaluated. The gene ontology (GO) database provided a higher annotation rate (69.6%), and sequences longer than 800bp were easily subjected to a BLAST search. The taxonomy of the BLAST results showed that lancelet and invertebrates are most closely related to abalone. Sixty-seven identified plant-like genes were further examined by reverse transcription-polymerase chain reaction (RT-PCR) and sequencing; only seven of these were real transcripts in abalone. Phylogenetic trees were also constructed to illustrate the positions of two Cystatin sequences and one Calmodulin protein sequence identified in abalone. To perform functional classification, three different databases (GO, KEGG and COG) were used and 60 immune or disease-related unigenes were determined. This work has greatly enlarged the known gene pool of H. diversicolor and will have important implications for future molecular and genetic analyses in this organism.

  5. Sequencing, Analysis, and Annotation of Expressed Sequence Tags for Camelus dromedarius

    PubMed Central

    Al-Swailem, Abdulaziz M.; Shehata, Maher M.; Abu-Duhier, Faisel M.; Al-Yamani, Essam J.; Al-Busadah, Khalid A.; Al-Arawi, Mohammed S.; Al-Khider, Ali Y.; Al-Muhaimeed, Abdullah N.; Al-Qahtani, Fahad H.; Manee, Manee M.; Al-Shomrani, Badr M.; Al-Qhtani, Saad M.; Al-Harthi, Amer S.; Akdemir, Kadir C.; Otu, Hasan H.

    2010-01-01

    Despite its economic, cultural, and biological importance, there has not been a large scale sequencing project to date for Camelus dromedarius. With the goal of sequencing complete DNA of the organism, we first established and sequenced camel EST libraries, generating 70,272 reads. Following trimming, chimera check, repeat masking, cluster and assembly, we obtained 23,602 putative gene sequences, out of which over 4,500 potentially novel or fast evolving gene sequences do not carry any homology to other available genomes. Functional annotation of sequences with similarities in nucleotide and protein databases has been obtained using Gene Ontology classification. Comparison to available full length cDNA sequences and Open Reading Frame (ORF) analysis of camel sequences that exhibit homology to known genes show more than 80% of the contigs with an ORF>300 bp and ∼40% hits extending to the start codons of full length cDNAs suggesting successful characterization of camel genes. Similarity analyses are done separately for different organisms including human, mouse, bovine, and rat. The accompanying web portal, CAGBASE (http://camel.kacst.edu.sa/), hosts a relational database containing annotated EST sequences and analysis tools with the possibility to add sequences from public domain. We anticipate our results to provide a home base for genomic studies of camel and other comparative studies enabling a starting point for whole genome sequencing of the organism. PMID:20502665
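
    The "ORF>300 bp" criterion mentioned in this abstract can be illustrated with a minimal sketch. It is not the pipeline used by the authors: the function name `longest_orf_length` is hypothetical, only the three forward reading frames are scanned, and an ORF is assumed to run from the first ATG to the next in-frame stop codon (TAA, TAG or TGA).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def longest_orf_length(seq):
    """Length in bp (start codon through stop codon) of the longest
    ATG-initiated ORF in the three forward reading frames.
    Illustrative only: the reverse strand is not examined."""
    seq = seq.upper()
    best = 0
    for frame in range(3):
        start = None  # index of the ATG that opened the current ORF
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                best = max(best, i + 3 - start)
                start = None
    return best

# A contig passes the filter when its longest ORF exceeds 300 bp.
contig = "ATG" + "GCT" * 100 + "TAA"
passes = longest_orf_length(contig) > 300
```

    A production analysis would also scan the reverse complement and handle ORFs truncated at the contig ends, which are common in EST assemblies.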

  6. Identification, analysis, and linkage mapping of expressed sequence tags from the Australian sheep blowfly

    PubMed Central

    2011-01-01

    Background The Australian sheep blowfly Lucilia cuprina (Wiedemann) (Diptera: Calliphoridae) is a destructive pest of sheep, a model organism for insecticide resistance research, and a valuable tool for medical and forensic professionals. However, genomic information on L. cuprina is still sparse. Results We report here the construction of an embryonic and 2 larval cDNA libraries for L. cuprina. A total of 29,816 expressed sequence tags (ESTs) were obtained and assembled into 7,464 unique clusters. The sequence collection captures a great diversity of genes, including those related to insecticide resistance (e.g., 12 cytochrome P450s, 2 glutathione S transferases, and 6 esterases). Compared to Drosophila melanogaster, codon preference is different in 13 of the 18 amino acids encoded by redundant codons, reflecting the lower overall GC content in L. cuprina. In addition, we demonstrated that the ESTs could be converted into informative gene markers by capitalizing on the known gene structures in the model organism D. melanogaster. We successfully assigned 41 genes to their respective chromosomes in L. cuprina. The relative locations of these loci revealed high but incomplete chromosomal synteny between L. cuprina and D. melanogaster. Conclusions Our results represent the first major transcriptomic undertaking in L. cuprina. These new genetic resources could be useful for the blowfly and insect research community. PMID:21827708
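
    The link drawn above between codon preference and overall GC content rests on two simple tallies, sketched here under stated assumptions. The function names `gc_content` and `codon_counts` are hypothetical, the input is assumed to be an in-frame coding sequence, and nothing here reproduces the authors' actual analysis.

```python
from collections import Counter

def gc_content(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        return 0.0
    return sum(b in "GC" for b in bases) / len(bases)

def codon_counts(cds):
    """Count codons in an in-frame coding sequence.
    Trailing bases that do not fill a codon are ignored."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))

# Two synonymous alanine codons: GCT (AT-ending) versus GCC (GC-ending).
counts = codon_counts("GCTGCCGCTAA")
```

    Comparing such counts per amino acid between two species is one way to see which synonymous codons each species prefers; a lower GC content tends to go with more A- or T-ending codons.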

  7. Analysis of expressed sequence tag loci on wheat chromosome group 4.

    PubMed

    Miftahudin; Ross, K; Ma, X-F; Mahmoud, A A; Layton, J; Milla, M A Rodriguez; Chikmawati, T; Ramalingam, J; Feril, O; Pathan, M S; Momirovic, G Surlan; Kim, S; Chema, K; Fang, P; Haule, L; Struxness, H; Birkes, J; Yaghoubian, C; Skinner, R; McAllister, J; Nguyen, V; Qi, L L; Echalier, B; Gill, B S; Linkiewicz, A M; Dubcovsky, J; Akhunov, E D; Dvorák, J; Dilbirligi, M; Gill, K S; Peng, J H; Lapitan, N L V; Bermudez-Kandianis, C E; Sorrells, M E; Hossain, K G; Kalavacharla, V; Kianian, S F; Lazo, G R; Chao, S; Anderson, O D; Gonzalez-Hernandez, J; Conley, E J; Anderson, J A; Choi, D-W; Fenton, R D; Close, T J; McGuire, P E; Qualset, C O; Nguyen, H T; Gustafson, J P

    2004-10-01

    A total of 1918 loci, detected by the hybridization of 938 expressed sequence tag unigenes (ESTs) from 26 Triticeae cDNA libraries, were mapped to wheat (Triticum aestivum L.) homoeologous group 4 chromosomes using a set of deletion, ditelosomic, and nulli-tetrasomic lines. The 1918 EST loci were not distributed uniformly among the three group 4 chromosomes; 41, 28, and 31% mapped to chromosomes 4A, 4B, and 4D, respectively. This pattern is in contrast to the cumulative results of EST mapping in all homoeologous groups, as reported elsewhere, that found the highest proportion of loci mapped to the B genome. Sixty-five percent of these 1918 loci mapped to the long arms of homoeologous group 4 chromosomes, while 35% mapped to the short arms. The distal regions of chromosome arms showed higher numbers of loci than the proximal regions, with the exception of 4DL. This study confirmed the complex structure of chromosome 4A that contains two reciprocal translocations and two inversions, previously identified. An additional inversion in the centromeric region of 4A was revealed. A consensus map for homoeologous group 4 was developed from 119 ESTs unique to group 4. Forty-nine percent of these ESTs were found to be homoeologous to sequences on rice chromosome 3, 12% had matches with sequences on other rice chromosomes, and 39% had no matches with rice sequences at all. Limited homology (only 26 of the 119 consensus ESTs) was found between wheat ESTs on homoeologous group 4 and the Arabidopsis genome. Forty-two percent of the homoeologous group 4 ESTs could be classified into functional categories on the basis of blastX searches against all protein databases. PMID:15514042

  8. Analysis of Expressed Sequence Tag Loci on Wheat Chromosome Group 4

    PubMed Central

    Miftahudin; Ross, K.; Ma, X.-F.; Mahmoud, A. A.; Layton, J.; Milla, M. A. Rodriguez; Chikmawati, T.; Ramalingam, J.; Feril, O.; Pathan, M. S.; Momirovic, G. Surlan; Kim, S.; Chema, K.; Fang, P.; Haule, L.; Struxness, H.; Birkes, J.; Yaghoubian, C.; Skinner, R.; McAllister, J.; Nguyen, V.; Qi, L. L.; Echalier, B.; Gill, B. S.; Linkiewicz, A. M.; Dubcovsky, J.; Akhunov, E. D.; Dvořák, J.; Dilbirligi, M.; Gill, K. S.; Peng, J. H.; Lapitan, N. L. V.; Bermudez-Kandianis, C. E.; Sorrells, M. E.; Hossain, K. G.; Kalavacharla, V.; Kianian, S. F.; Lazo, G. R.; Chao, S.; Anderson, O. D.; Gonzalez-Hernandez, J.; Conley, E. J.; Anderson, J. A.; Choi, D.-W.; Fenton, R. D.; Close, T. J.; McGuire, P. E.; Qualset, C. O.; Nguyen, H. T.; Gustafson, J. P.

    2004-01-01

    A total of 1918 loci, detected by the hybridization of 938 expressed sequence tag unigenes (ESTs) from 26 Triticeae cDNA libraries, were mapped to wheat (Triticum aestivum L.) homoeologous group 4 chromosomes using a set of deletion, ditelosomic, and nulli-tetrasomic lines. The 1918 EST loci were not distributed uniformly among the three group 4 chromosomes; 41, 28, and 31% mapped to chromosomes 4A, 4B, and 4D, respectively. This pattern is in contrast to the cumulative results of EST mapping in all homoeologous groups, as reported elsewhere, that found the highest proportion of loci mapped to the B genome. Sixty-five percent of these 1918 loci mapped to the long arms of homoeologous group 4 chromosomes, while 35% mapped to the short arms. The distal regions of chromosome arms showed higher numbers of loci than the proximal regions, with the exception of 4DL. This study confirmed the complex structure of chromosome 4A that contains two reciprocal translocations and two inversions, previously identified. An additional inversion in the centromeric region of 4A was revealed. A consensus map for homoeologous group 4 was developed from 119 ESTs unique to group 4. Forty-nine percent of these ESTs were found to be homoeologous to sequences on rice chromosome 3, 12% had matches with sequences on other rice chromosomes, and 39% had no matches with rice sequences at all. Limited homology (only 26 of the 119 consensus ESTs) was found between wheat ESTs on homoeologous group 4 and the Arabidopsis genome. Forty-two percent of the homoeologous group 4 ESTs could be classified into functional categories on the basis of blastX searches against all protein databases. PMID:15514042

  9. Development of polymorphic expressed sequence tag-single sequence repeat markers in the common Chinese cuttlefish, Sepiella maindroni.

    PubMed

    Li, R H; Lu, S K; Zhang, C L; Song, W W; Mu, C K; Wang, C L

    2014-01-01

    The common Chinese cuttlefish (Sepiella maindroni) is one of the popular edible cephalopods consumed across Asia. To facilitate the population genetic investigation of this species, we developed fourteen polymorphic microsatellite markers from expressed sequence tags of S. maindroni. The number of alleles at each locus ranged from 6 to 10 with an average of 7.9 alleles per locus. The ranges of observed and expected heterozygosity were from 0.615 to 0.962 and 0.685 to 0.888, respectively. Four loci were found to deviate significantly from Hardy-Weinberg equilibrium. The polymorphism information content ranged from 0.638 to 0.833. These polymorphic microsatellite loci will be helpful for the population genetic, genetic linkage map, and other genetic studies of S. maindroni. PMID:25117305
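
    For readers unfamiliar with the per-locus statistics reported here, the following sketch computes expected heterozygosity and polymorphism information content (PIC) from allele frequencies. It uses the simple textbook forms, He = 1 - sum(p_i^2) and PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2; whether the authors applied these exact forms or a sample-size-corrected estimator is not stated in the abstract, and the function names are hypothetical.

```python
def expected_heterozygosity(freqs):
    """He = 1 - sum(p_i^2), with no small-sample correction."""
    return 1.0 - sum(p * p for p in freqs)

def polymorphism_information_content(freqs):
    """PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2."""
    homozygosity = sum(p * p for p in freqs)
    cross = 0.0
    for i in range(len(freqs)):
        for j in range(i + 1, len(freqs)):
            cross += 2.0 * freqs[i] ** 2 * freqs[j] ** 2
    return 1.0 - homozygosity - cross

# Illustrative locus with four equally frequent alleles (made-up data).
freqs = [0.25, 0.25, 0.25, 0.25]
he = expected_heterozygosity(freqs)
pic = polymorphism_information_content(freqs)
```

    PIC is never larger than He for the same frequencies, which is consistent with the reported PIC range sitting slightly below the expected heterozygosity range.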

  10. A comprehensive nonredundant expressed sequence tag collection for the developing Rattus norvegicus heart.

    PubMed

    Laffin, Jennifer J S; Scheetz, Todd E; Bonaldo, Maria de Fatima; Reiter, Rebecca S; Chang, Shereen; Eyestone, Mari; Abdulkawy, Hakeem; Brown, Bartley; Roberts, Chad; Tack, Dylan; Kucaba, Tamara; Lin, Jim Jung-Ching; Sheffield, Val C; Casavant, Thomas L; Soares, M Bento

    2004-04-13

    Congenital heart defects affect approximately 1,000,000 people in the United States, with 40,000 new births contributing to that number every year. A large percentage of these defects can be attributed to septal defects. We assembled a nonredundant collection of over 12,000 expressed sequence tags (ESTs) from a total of 30,000 ESTs, with the ultimate goal of identifying spatially and/or temporally regulated genes during heart septation. These ESTs were compiled from nonnormalized, normalized, and serially subtracted cDNA libraries derived from two sets of tissue samples. The first includes microdissected rat hearts from embryonic (E) days E13, E15, and E16.5-E18.5 and adult heart. The second includes hearts from embryonic days E17, E19, and E21 and postnatal (P) days P1, P12, P74, and P200. Over 6,000 novel ESTs were identified in the libraries derived from these two sets of tissues, all of which have been contributed to the NCBI rat UniGene collection. It is anticipated that such EST and cDNA clone resources will prove invaluable to gene expression studies aimed at the understanding of the molecular mechanisms underlying heart septation defects.

  11. Isolation of expressed sequence tags of Agaricus bisporus and their assignment to chromosomes.

    PubMed Central

    Sonnenberg, A S; de Groot, P W; Schaap, P J; Baars, J J; Visser, J; Van Griensven, L J

    1996-01-01

    The genome of the cultivated basidiomycete Agaricus bisporus Horst U1 and of its homokaryotic parents has been characterized by using an optimized method of pulsed-field gel electrophoresis. Expressed sequence tags obtained as expressed cDNAs from a primordial tissue-derived cDNA library and a number of previously isolated genes were used to identify the individual chromosomes of the parental lines of Horst U1. The genome consists of 13 chromosomes, and its total size is 31 Mb. For those chromosomes that could not be resolved by contour-clamped homogeneous electric field electrophoresis, the segregation of marker genes was studied in a set of 86 homokaryotic offspring of Horst U1. At least two markers were assigned to each individual chromosome. In this way all individual chromosomes were unequivocally identified. The large size difference observed between the homologous chromosomes IX, harboring the rDNA repeat, was shown to be largely due to a higher copy number of rDNA in parental strain H97 than in parental strain H39. PMID:8953726

  12. Transcriptome analysis of the Amazonian viper Bothrops atrox venom gland using expressed sequence tags (ESTs).

    PubMed

    Neiva, Márcia; Arraes, Fabricio B M; de Souza, Jonso Vieira; Rádis-Baptista, Gandhi; Prieto da Silva, Alvaro R B; Walter, Maria Emilia M T; Brigido, Marcelo de Macedo; Yamane, Tetsuo; López-Lozano, Jorge Luiz; Astolfi-Filho, Spartaco

    2009-03-15

    Bothrops atrox is a highly dangerous pit viper in the Brazilian Amazon region. We produced a global catalogue of gene transcripts to identify the main toxin and other protein families present in the B. atrox venom gland. We prepared a directional cDNA library, from which a set of 610 high quality expressed sequence tags (ESTs) were generated by bioinformatics processing. Our data indicated a predominance of transcripts encoding mainly metalloproteinases (59% of the toxins). The expression pattern of the B. atrox venom was similar to Bothrops insularis, Bothrops jararaca and Bothrops jararacussu in terms of toxin type, although some differences were observed. B. atrox showed a higher amount of the PIII class of metalloproteinases which correlates well with the observed intense hemorrhagic action of its toxin. Also, the PLA2 content was the second highest in this sample compared to the other three Bothrops transcriptomes. To our knowledge, this work is the first transcriptome analysis of an Amazonian rain forest pit viper and it will contribute to the body of knowledge regarding the gene diversity of the venom gland of members of the Bothrops genus. Moreover, our results can be used for future studies with other snake species from the Amazon region to investigate differences in gene patterns or phylogenetic relationships. PMID:19708221

  13. Gene cataloging and expression profiling in human gastric cancer cells by expressed sequence tags.

    PubMed

    Kim, Nam-Soon; Hahn, Yoonsoo; Oh, Jung-Hwa; Lee, Ju-Yeon; Oh, Kyung-Jin; Kim, Jeong-Min; Park, Hong-Seog; Kim, Sangsoo; Song, Kyu-Sang; Rho, Seung-Moo; Yoo, Hyang-Sook; Kim, Yong Sung

    2004-06-01

    To understand the molecular mechanism associated with gastric carcinogenesis, we identified genes expressed in gastric cancer cell lines and tissues. Of 97,609 high-quality ESTs sequenced from 36 cDNA libraries, 92,545 were coalesced into 10,418 human Unigene clusters (Build 151). The gene expression profile was produced by counting the cluster frequencies in each library. Although the profiles of highly expressed genes varied greatly from library to library, those genes related to cell structure formation, heat shock proteins, the glycolysis pathway, and the signaling pathway were highly represented in human gastric cancer cell lines and in primary tumors. Conversely, the genes encoding immunoglobulins, ribosomal proteins, and digestive proteins were down-regulated in gastric cancer cell lines and tissues compared to normal tissues. The transcription levels of some of these genes were confirmed by RT-PCR. We found that genes related to cell adhesion, apoptosis, and cytoskeleton formation were particularly up-regulated in the gastric cancer cell lines established from malignant ascites compared to those from primary tumors. This comprehensive molecular profiling of human gastric cancer should be useful for elucidating the genetic events associated with human gastric cancer. PMID:15177556
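
    The step "counting the cluster frequencies in each library" can be sketched as follows. This is an illustration only, not the authors' code: the function name `expression_profile`, the library names and the cluster identifiers are hypothetical, and each EST is assumed to have already been assigned to one cluster.

```python
from collections import Counter, defaultdict

def expression_profile(assignments):
    """Build a per-library expression profile from EST assignments.

    assignments: iterable of (library, cluster_id) pairs, one per EST.
    Returns {library: {cluster_id: fraction of that library's ESTs}}.
    """
    counts = defaultdict(Counter)
    for library, cluster in assignments:
        counts[library][cluster] += 1
    profile = {}
    for library, cluster_counts in counts.items():
        total = sum(cluster_counts.values())
        profile[library] = {c: n / total for c, n in cluster_counts.items()}
    return profile

# Made-up assignments: three ESTs from one library, one from another.
ests = [
    ("tumor_lib", "cluster_A"),
    ("tumor_lib", "cluster_A"),
    ("tumor_lib", "cluster_B"),
    ("normal_lib", "cluster_B"),
]
profile = expression_profile(ests)
```

    Dividing by the library total matters because libraries differ in sequencing depth; raw counts alone would make deeply sequenced libraries look more highly expressed across the board.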

  14. Expressed sequence tags reveal genetic diversity and putative virulence factors of the pathogenic oomycete Pythium insidiosum.

    PubMed

    Krajaejun, Theerapong; Khositnithikul, Rommanee; Lerksuthirat, Tassanee; Lowhnoo, Tassanee; Rujirawat, Thidarat; Petchthong, Thanom; Yingyong, Wanta; Suriyaphol, Prapat; Smittipat, Nat; Juthayothin, Tada; Phuntumart, Vipaporn; Sullivan, Thomas D

    2011-07-01

    Oomycetes are unique eukaryotic microorganisms that share a mycelial morphology with fungi. Many oomycetes are pathogenic to plants, and a more limited number are pathogenic to animals. Pythium insidiosum is the only oomycete that is capable of infecting both humans and animals, and causes a life-threatening infectious disease, called "pythiosis". In the majority of pythiosis patients life-long handicaps result from the inevitable radical excision of infected organs, and many die from advanced infection. A better understanding of P. insidiosum pathogenesis at the molecular level could lead to new forms of treatment. Genetic and genomic information is lacking for P. insidiosum, so we have undertaken an expressed sequence tag (EST) study, and report on the first dataset of 486 ESTs, assembled into 217 unigenes. Of these, 144 had significant sequence similarity with known genes, including 47 with ribosomal protein homology. Potential virulence factors included genes involved in antioxidation, thermal adaptation, immunomodulation, and iron and sterol binding. Effectors resembling pathogenicity factors of plant-pathogenic oomycetes were also discovered, such as a CBEL-like protein (possible involvement in host cell adhesion and hemagglutination), a putative RXLR effector (possibly involved in host cell modulation) and elicitin-like (ELL) proteins. Phylogenetic analysis mapped P. insidiosum ELLs to several novel clades of oomycete elicitins (ELIs), and homology modeling predicted that P. insidiosum ELLs should bind sterols. Most of the P. insidiosum ESTs showed homology to sequences in the genome or EST databases of other oomycetes, but one putative gene, with unknown function, was found to be unique to P. insidiosum. The EST dataset reported here represents the first steps in identifying genes of P. insidiosum and beginning transcriptome analysis. This genetic information will facilitate understanding of pathogenic mechanisms of this devastating pathogen. PMID:21724174

  15. Genome-wide characterization and selection of expressed sequence tag simple sequence repeat primers for optimized marker distribution and reliability in peach

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Expressed sequence tag (EST) simple sequence repeats (SSRs) in Prunus were mined, and flanking primers designed and used for genome-wide characterization and selection of primers to optimize marker distribution and reliability. A total of 12,618 contigs were assembled from 84,727 ESTs, along with 34...

  16. Desiccation survival in an Antarctic nematode: molecular analysis using expressed sequenced tags

    PubMed Central

    Adhikari, Bishwo N; Wall, Diana H; Adams, Byron J

    2009-01-01

    Background Nematodes are the dominant soil animals in Antarctic Dry Valleys and are capable of surviving desiccation and freezing in an anhydrobiotic state. Genes induced by desiccation stress have been successfully enumerated in nematodes; however we have little knowledge of gene regulation by Antarctic nematodes which can survive multiple environmental stresses. To address this problem we investigated the genetic responses of a nematode species, Plectus murrayi, that is capable of tolerating Antarctic environmental extremes, in particular desiccation and freezing. In this study, we provide the first insight into the desiccation induced transcriptome of an Antarctic nematode through cDNA library construction and suppressive subtractive hybridization. Results We obtained 2,486 expressed sequence tags (ESTs) from 2,586 clones derived from the cDNA library of desiccated P. murrayi. The 2,486 ESTs formed 1,387 putative unique transcripts of which 523 (38%) had matches in the model-nematode Caenorhabditis elegans, 107 (7%) in nematodes other than C. elegans, 153 (11%) in non-nematode organisms and 605 (44%) had no significant match to any sequences in the current databases. The 1,387 unique transcripts were functionally classified by using Gene Ontology (GO) hierarchy and the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. The results indicate that the transcriptome contains a group of transcripts from diverse functional areas. The subtractive library of desiccated nematodes showed 80 transcripts differentially expressed during desiccation stress, of which 28% were metabolism related, 19% were involved in environmental information processing, 28% involved in genetic information processing and 21% were novel transcripts. Expression profiling of 14 selected genes by quantitative Real-time PCR showed 9 genes significantly up-regulated, 3 down-regulated and 2 continuously expressed in response to desiccation. Conclusion The establishment of a desiccation EST

  17. Gene expression analysis of volatile-rich male flowers of dioecious Pandanus fascicularis using expressed sequence tags.

    PubMed

    Vinod, M S; Sankararamasubramanian, H M; Priyanka, R; Ganesan, G; Parida, Ajay

    2010-07-15

    Pandanus fascicularis is dioecious with the female plant producing a non-scented fruit while the male produces a flower rich in volatiles. The essential oil extracted from the flowers is economically exploited as a natural flavouring agent as well as for its therapeutic properties. Molecular dissection of this distinct flower for identifying the genes responsible for its aroma by way of expressed sequence tags (ESTs) has not been initiated in spite of its economic viability. A male flower-specific cDNA library was constructed and 977 ESTs were generated. CAP3 analysis performed on the dataset revealed 83 contigs (549 ESTs) and 428 singlets, thereby yielding a total of 511 unigenes. Functional annotation using the BLAST2GO software resulted in 1952 Gene ontology (GO) functional classification terms for 621 sequences. Unknown proteins were further analysed with InterProScan to determine their functional motifs. RNA gel blot analysis of 26 functionally distinct transcripts potentially involved in flowering and volatile generation, using vegetative and reproductive tissues of both the sexes, revealed differential expression profiles. In addition to an overview of genes expressed, candidate genes with expression that are modulated predominantly in the male inflorescence were also identified. This is the first report on generation of ESTs to determine the subset of genes that can be used as potential candidates for future attempts aimed towards its genetic and genome analysis including metabolic engineering of floral volatiles in this economically important plant.

  18. Expressed sequence tag analysis in Cycas, the most primitive living seed plant

    PubMed Central

    Brenner, Eric D; Stevenson, Dennis W; McCombie, Richard W; Katari, Manpreet S; Rudd, Stephen A; Mayer, Klaus FX; Palenchar, Peter M; Runko, Suzan J; Twigg, Richard W; Dai, Guangwei; Martienssen, Rob A; Benfey, Phillip N; Coruzzi, Gloria M

    2003-01-01

    Background Cycads are ancient seed plants (living fossils) with origins in the Paleozoic. Cycads are sometimes considered a 'missing link' as they exhibit characteristics intermediate between vascular non-seed plants and the more derived seed plants. Cycads have also been implicated as the source of 'Guam's dementia', possibly due to the production of S(+)-beta-methyl-alpha, beta-diaminopropionic acid (BMAA), which is an agonist of animal glutamate receptors. Results A total of 4,200 expressed sequence tags (ESTs) were created from Cycas rumphii and clustered into 2,458 contigs, of which 1,764 had low-stringency BLAST similarity to other plant genes. Among those cycad contigs with similarity to plant genes, 1,718 cycad 'hits' are to angiosperms, 1,310 match genes in gymnosperms and 734 match lower (non-seed) plants. Forty-six contigs were found that matched only genes in lower plants and gymnosperms. Upon obtaining the complete sequence from the clones of 37/46 contigs, 14 still matched only gymnosperms. Among those cycad contigs common to higher plants, ESTs were discovered that correspond to those involved in development and signaling in present-day flowering plants. We purified a cycad EST for a glutamate receptor (GLR)-like gene, as well as ESTs potentially involved in the synthesis of the GLR agonist BMAA. Conclusions Analysis of cycad ESTs has uncovered conserved and potentially novel genes. Furthermore, the presence of a glutamate receptor agonist, as well as a glutamate receptor-like gene in cycads, supports the hypothesis that such neuroactive plant products are not merely herbivore deterrents but may also serve a role in plant signaling. PMID:14659015

  19. Development and characterization of novel expressed sequence tag-derived simple sequence repeat markers in Hevea brasiliensis (rubber tree).

    PubMed

    An, Z W; Li, Y C; Zhai, Q L; Xie, L L; Zhao, Y H; Huang, H S

    2013-11-22

    Cultivated clones of Hevea brasiliensis have a narrow genetic base. In order to broaden the genetic base, it is first necessary to investigate the genetic diversity of wild populations. Expressed sequence tag-simple sequence repeat (EST-SSR) markers were developed to investigate the genetic diversity of Hevea populations. Four hundred and thirty microsatellites were identified and 148 primers were designed to amplify the loci. Twenty-nine primer pairs were synthesized and evaluated for their ability to detect genetic polymorphisms among 40 wild accessions of H. brasiliensis. Twenty-one of the 29 loci were polymorphic. The number of alleles per locus in the 40 accessions ranged from 2 to 7. H(O) and H(E) at each locus ranged from 0.0000 to 0.9000 and from 0.0000 to 0.8704, respectively. All 21 loci could amplify in H. brasiliensis, H. pauciflora, H. nitida, H. spruceana, and H. camargoana. The EST-SSR primers developed herein can be used in genetic diversity and structure studies in H. brasiliensis.
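
    Identifying microsatellites in EST sequences, as described above, amounts to scanning for short tandem repeats. The sketch below does this with a regular expression. The search criteria are assumptions, since the abstract does not give them: repeat units of 2 to 6 nt, at least five tandem copies, and homopolymer runs excluded. The function name `find_ssrs` is hypothetical and this is not the tool the authors used.

```python
import re

def find_ssrs(seq, min_repeats=5, min_unit=2, max_unit=6):
    """Return (start, unit, copies) for each perfect tandem repeat.

    The lazy quantifier makes the regex try the shortest unit first,
    so (AG)6 is reported with unit 'AG' rather than 'AGAG'.
    """
    seq = seq.upper()
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        if len(set(unit)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits

# Made-up sequence containing (AG)6 and (CAG)5.
example = "TTGAC" + "AG" * 6 + "CCTGA" + "CAG" * 5 + "T"
ssrs = find_ssrs(example)
```

    Primer design would then target the flanking sequence on either side of each reported start and end position.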

  20. A Hybrid Distance Measure for Clustering Expressed Sequence Tags Originating from the Same Gene Family

    PubMed Central

    Ng, Keng-Hoong; Ho, Chin-Kuan; Phon-Amnuaisuk, Somnuk

    2012-01-01

    Background Clustering is a key step in the processing of Expressed Sequence Tags (ESTs). The primary goal of clustering is to put ESTs from the same transcript of a single gene into a unique cluster. Recent EST clustering algorithms mostly adopt the alignment-free distance measures, where they tend to yield acceptable clustering accuracies with reasonable computational time. Despite the fact that these clustering methods work satisfactorily on a majority of the EST datasets, they have a common weakness. They are prone to deliver unsatisfactory clustering results when dealing with ESTs from the genes derived from the same family. The root cause is the distance measures applied on them are not sensitive enough to separate these closely related genes. Methodology/Principal Findings We propose a hybrid distance measure that combines the global and local features extracted from ESTs, with the aim to address the clustering problem faced by ESTs derived from the same gene family. The clustering process is implemented using the DBSCAN algorithm. We test the hybrid distance measure on the ten EST datasets, and the clustering results are compared with the two alignment-free EST clustering tools, i.e. wcd and PEACE. The clustering results indicate that the proposed hybrid distance measure performs relatively better (in terms of clustering accuracy) than both EST clustering tools. Conclusions/Significance The clustering results provide support for the effectiveness of the proposed hybrid distance measure in solving the clustering problem for ESTs that originate from the same gene family. The improvement of clustering accuracies on the experimental datasets has supported the claim that the sensitivity of the hybrid distance measure is sufficient to solve the clustering problem. PMID:23071763
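
    To make the general approach concrete, here is a minimal sketch of density-based clustering over an alignment-free distance. It is not the hybrid measure proposed in this paper, whose global and local features are not detailed in the abstract; a plain shared k-mer distance stands in for it. The function names, the choice k=3, and the `eps` and `min_pts` values are all assumptions for illustration.

```python
from collections import Counter

def kmer_profile(seq, k=3):
    """Multiset of overlapping k-mers in a sequence."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))

def kmer_distance(a, b, k=3):
    """1 - (shared k-mers / k-mers in the shorter sequence)."""
    pa, pb = kmer_profile(a, k), kmer_profile(b, k)
    shared = sum((pa & pb).values())
    smaller = min(sum(pa.values()), sum(pb.values()))
    return 1.0 - shared / smaller if smaller else 1.0

def dbscan(items, distance, eps, min_pts):
    """Return one label per item: cluster index >= 0, or -1 for noise."""
    n = len(items)
    neighbors = [
        [j for j in range(n) if distance(items[i], items[j]) <= eps]
        for i in range(n)
    ]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neighbors[i]) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neighbors[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neighbors[j]) >= min_pts:
                queue.extend(neighbors[j])  # j is a core point: expand
    return labels

# Made-up reads: two pairs that share no 3-mers with each other.
reads = ["ACGTACGTAC", "ACGTACGTAC", "GGGTTTGGGTTT", "GGGTTTGGGTTT"]
labels = dbscan(reads, kmer_distance, eps=0.2, min_pts=2)
```

    A single k-mer distance like this one is exactly the kind of measure the abstract describes as too coarse for gene families, since paralogs share most of their k-mers; the paper's contribution is a more sensitive measure plugged into the same clustering step.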

  1. Chromosome-specific physical localisation of expressed sequence tag loci in Corchorus olitorius L.

    PubMed

    Joshi, A; Das, S K; Samanta, P; Paria, P; Sen, S K; Basu, A

    2014-11-01

    Jute (Corchorus spp.), as a natural fibre-producing species, ranks next only to cotton. Inadequate understanding of its genetic architecture is a major lacuna for genetic improvement of this crop in terms of yield and quality. Establishment of a physical map provides a genomic tool that helps in positional cloning of valuable genes. In this report, an attempt was initiated to study association and localisation of single copy expressed sequence tag (EST) loci in the genome of Corchorus olitorius. The chromosome-specific association of EST was determined based on the appearance of an extra signal for a single copy cDNA probe in mitotic interphase nuclei of specific trisomic(s) for fluorescence in situ hybridisation, and validated using a cDNA fragment of the 26S rRNA gene (600 bp) as molecular probe. The probe exhibited three signals in meiotic interphase nuclei of trisomic 5, instead of two as observed in diploids and other trisomics, indicating its association with chromosome 5. Subsequent hybridisation of the same probe on the pachytene chromosomes of diploids confirmed that 26S rRNA occupies the terminal end of the short arm of chromosome 5 in C. olitorius. Subsequently, chromosome-specific association of 63 single copy EST and their physical localisation were determined on chromosomes 2, 4, 5 and 7. The study describes chromosome-specific physical localisation of genes in jute. The approach used here could be a step towards construction of genome-wide physical maps for any recalcitrant plant species like jute. PMID:24628982

  3. An expressed sequence tag survey of gene expression in the pond snail Lymnaea stagnalis, an intermediate vector of trematodes [corrected].

    PubMed

    Davison, A; Blaxter, M L

    2005-05-01

    The pond snail Lymnaea stagnalis is an intermediate vector for the liver fluke Fasciola hepatica, a common parasite of ruminants and humans. Yet, despite being a disease of medical and economic importance, as well as a potentially useful comparative tool, the genetics of the relationship between Lymnaea and Fasciola has barely been investigated. As a complement to forthcoming F. hepatica expressed sequence tags (ESTs), we generated 1320 ESTs from L. stagnalis central nervous system (CNS) libraries. We estimate that these sequences derive from 771 different genes, of which 374 showed significant similarity to proteins in public databases, and 169 were similar to ESTs from the snail vector Biomphalaria glabrata. These L. stagnalis ESTs will provide insight into the function of the snail CNS, as well as the molecular components of behaviour and response to parasitism. In the future, the comparative analysis of Lymnaea/Fasciola with Biomphalaria/Schistosoma will help to understand both conserved and divergent aspects of the host-parasite relationship. The L. stagnalis ESTs will also assist gene prediction in the forthcoming B. glabrata genome sequence. The dataset is available for searching on the world-wide web at http://zeldia.cap.ed.ac.uk/mollusca.html.

  4. Expressed sequence tags from Atta laevigata and identification of candidate genes for the control of pest leaf-cutting ants

    PubMed Central

    2011-01-01

    Background Leafcutters are the highest evolved within Neotropical ants in the tribe Attini and model systems for studying caste formation, labor division and symbiosis with microorganisms. Some species of leafcutters are agricultural pests controlled by chemicals which affect other animals and accumulate in the environment. Aiming to provide genetic basis for the study of leafcutters and for the development of more specific and environmentally friendly methods for the control of pest leafcutters, we generated expressed sequence tag data from Atta laevigata, one of the pest ants with broad geographic distribution in South America. Results The analysis of the expressed sequence tags allowed us to characterize 2,006 unique sequences in Atta laevigata. Sixteen of these genes had a high number of transcripts and are likely positively selected for high level of gene expression, being responsible for three basic biological functions: energy conservation through redox reactions in mitochondria; cytoskeleton and muscle structuring; regulation of gene expression and metabolism. Based on leafcutters' lifestyle and reports of genes involved in key processes of other social insects, we identified 146 sequences as potential targets for controlling pest leafcutters. The targets are responsible for antixenobiosis, development and longevity, immunity, resistance to pathogens, pheromone function, cell signaling, behavior, polysaccharide metabolism and arginine kinase activity. Conclusion The generation and analysis of expressed sequence tags from Atta laevigata have provided important genetic basis for future studies on the biology of leaf-cutting ants and may contribute to the development of a more specific and environmentally friendly method for the control of agricultural pest leafcutters. PMID:21682882

  5. Gene Discovery and Expression Profile Analysis through Sequencing of Expressed Sequence Tags from Different Developmental Stages of the Chytridiomycete Blastocladiella emersonii

    PubMed Central

    Ribichich, Karina F.; Salem-Izacc, Silvia M.; Georg, Raphaela C.; Vêncio, Ricardo Z. N.; Navarro, Luci D.; Gomes, Suely L.

    2005-01-01

    Blastocladiella emersonii is an aquatic fungus of the chytridiomycete class which diverged early from the fungal lineage and is notable for the morphogenetic processes which occur during its life cycle. Its particular taxonomic position makes this fungus an interesting system to be considered when investigating phylogenetic relationships and studying the biology of lower fungi. To contribute to the understanding of the complexity of the B. emersonii genome, we present here a survey of expressed sequence tags (ESTs) from various stages of the fungal development. Nearly 20,000 cDNA clones from 10 different libraries were partially sequenced from their 5′ end, yielding 16,984 high-quality ESTs. These ESTs were assembled into 4,873 putative transcripts, of which 48% presented no matches with existing sequences in public databases. As a result of Gene Ontology (GO) project annotation, 1,680 ESTs (35%) were classified into biological processes of the GO structure, with transcription and RNA processing, protein biosynthesis, and transport as prevalent processes. We also report full-length sequences, useful for construction of molecular phylogenies, and several ESTs that showed high similarity with known proteins, some of which were not previously described in fungi. Furthermore, we analyzed the expression profile (digital Northern analysis) of each transcript throughout the life cycle of the fungus using Bayesian statistics. The in silico approach was validated by Northern blot analysis with good agreement between the two methodologies. PMID:15701807

  6. Expressed sequence tag (EST) profiling in hyper saline shocked Dunaliella salina reveals high expression of protein synthetic apparatus components.

    PubMed

    Alkayal, Fadi; Albion, Rebecca L; Tillett, Richard L; Hathwaik, Leyla T; Lemos, Mark S; Cushman, John C

    2010-11-01

    The unicellular, halotolerant, green alga, Dunaliella salina (Chlorophyceae) has the unique ability to adapt and grow in a wide range of salt conditions from about 0.05 to 5.5M. To better understand the molecular basis of its salinity tolerance, a complementary DNA (cDNA) library was constructed from D. salina cells adapted to 2.5M NaCl, salt-shocked at 3.4M NaCl for 5h, and used to generate an expressed sequence tag (EST) database. ESTs were obtained for 2831 clones representing 1401 unique transcripts. Putative functions were assigned to 1901 (67.2%) ESTs after comparison with protein databases. An additional 154 (5.4%) ESTs had significant similarity to known sequences whose functions are unclear and 776 (27.4%) had no similarity to known sequences. For those D. salina ESTs for which functional assignments could be made, the largest functional categories included protein synthesis (35.7%), energy (photosynthesis) (21.4%), primary metabolism (13.8%) and protein fate (6.8%). Within the protein synthesis category, the vast majority of ESTs (80.3%) encoded ribosomal proteins representing about 95% of the approximately 82 subunits of the cytosolic ribosome indicating that D. salina invests substantial resources in the production and maintenance of protein synthesis. The increased mRNA expression upon salinity shock was verified for a small set of selected genes by real-time, quantitative reverse-transcription-polymerase chain reaction (qRT-PCR). This EST collection also provided important new insights into the genetic underpinnings for the biosynthesis and utilization of glycerol and other osmoprotectants, the carotenoid biosynthetic pathway, reactive oxygen-scavenging enzymes, and molecular chaperones (heat shock proteins) not described previously for D. salina. EST discovery also revealed the existence of RNA interference and signaling pathways associated with osmotic stress adaptation. 
    The unknown ESTs described here provide a rich resource for the identification...

  7. Adult midgut expressed sequence tags from the tsetse fly Glossina morsitans morsitans and expression analysis of putative immune response genes

    PubMed Central

    Lehane, M J; Aksoy, S; Gibson, W; Kerhornou, A; Berriman, M; Hamilton, J; Soares, M B; Bonaldo, M F; Lehane, S; Hall, N

    2003-01-01

    Background Tsetse flies transmit African trypanosomiasis leading to half a million cases annually. Trypanosomiasis in animals (nagana) remains a massive brake on African agricultural development. While trypanosome biology is widely studied, knowledge of tsetse flies is very limited, particularly at the molecular level. This is a serious impediment to investigations of tsetse-trypanosome interactions. We have undertaken an expressed sequence tag (EST) project on the adult tsetse midgut, the major organ system for establishment and early development of trypanosomes. Results A total of 21,427 ESTs were produced from the midgut of adult Glossina morsitans morsitans and grouped into 8,876 clusters or singletons potentially representing unique genes. Putative functions were ascribed to 4,035 of these by homology. Of these, a remarkable 3,884 had their most significant matches in the Drosophila protein database. We selected 68 genes with putative immune-related functions, macroarrayed them and determined their expression profiles following bacterial or trypanosome challenge. In both infections many genes are downregulated, suggesting a malaise response in the midgut. Trypanosome and bacterial challenge result in upregulation of different genes, suggesting that different recognition pathways are involved in the two responses. The most notable block of genes upregulated in response to trypanosome challenge are a series of Toll and Imd genes and a series of genes involved in oxidative stress responses. Conclusions The project increases the number of known Glossina genes by two orders of magnitude. Identification of putative immunity genes and their preliminary characterization provides a resource for the experimental dissection of tsetse-trypanosome interactions. PMID:14519198

  8. Expressed sequence tags and molecular cloning and characterization of gene encoding pinoresinol/lariciresinol reductase from Podophyllum hexandrum.

    PubMed

    Wankhede, Dhammaprakash Pandhari; Biswas, Dipul Kumar; Rajkumar, Subramani; Sinha, Alok Krishna

    2013-12-01

    Podophyllotoxin, an aryltetralin lignan, is the source of important anticancer drugs etoposide, teniposide, and etopophos. Roots/rhizome of Podophyllum hexandrum form one of the most important sources of podophyllotoxin. In order to understand genes involved in podophyllotoxin biosynthesis, two suppression subtractive hybridization libraries were synthesized, one each from root/rhizome and leaves using high and low podophyllotoxin-producing plants of P. hexandrum. Sequencing of clones identified a total of 1,141 Expressed Sequence Tags (ESTs) resulting in 354 unique ESTs. Several unique ESTs showed sequence similarity to the genes involved in metabolism, stress/defense responses, and signalling pathways. A few ESTs also showed high sequence similarity with genes which were shown to be involved in podophyllotoxin biosynthesis in other plant species such as pinoresinol/lariciresinol reductase. A full length coding sequence of pinoresinol/lariciresinol reductase (PLR) has been cloned from P. hexandrum which was found to encode protein with 311 amino acids and show sequence similarity with PLR from Forsythia intermedia and Linum spp. Spatial and stress-inducible expression pattern of PhPLR and other known genes of podophyllotoxin biosynthesis, secoisolariciresinol dehydrogenase (PhSDH), and dirigent protein oxidase (PhDPO) have been studied. All the three genes showed wounding and methyl jasmonate-inducible expression pattern. The present work would form a basis for further studies to understand genomics of podophyllotoxin biosynthesis in P. hexandrum.

  9. Confirming single nucleotide polymorphisms from expressed sequence tag datasets derived from three cattle cDNA libraries.

    PubMed

    Lee, Seung-Hwan; Park, Eung-Woo; Cho, Yong-Min; Lee, Ji-Woong; Kim, Hyoung-Yong; Lee, Jun-Heon; Oh, Sung-Jong; Cheong, Il-Cheong; Yoon, Du-Hak

    2006-03-31

    Using the Phred/Phrap/Polyphred/Consed pipeline established in the National Livestock Research Institute of Korea, we predicted candidate coding single nucleotide polymorphisms (cSNPs) from 7,600 expressed sequence tags (ESTs) derived from three cDNA libraries (liver, M. longissimus dorsi, and intermuscular fat) of Hanwoo (Korean native cattle) steers. From the 7,600 ESTs, 829 contigs comprising more than two EST reads were assembled using the Phrap assembler. Based on the contig analysis, 201 candidate cSNPs were identified in 129 contigs, in which transitions (69%) outnumbered transversions (31%). To verify whether the predicted cSNPs are real, 17 SNPs involved in lipid and energy metabolism were selected from the ESTs. Twelve of these were confirmed to be real while five were identified as artifacts, possibly due to expressed sequence tag sequence error. Further analysis of the 12 verified cSNPs was performed using the program BLASTX. Five were identified as nonsynonymous cSNPs, five were synonymous cSNPs, and two SNPs were located in 3'-UTRs. Our data indicated that a relatively high SNP prediction rate (71%) from a large EST database could produce abundant cSNPs rapidly, which can be used as valuable genetic markers in cattle.
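
    The abstract above reports the split between transitions and transversions and a 71% confirmation rate (12 of 17 tested SNPs). As a small illustration of how those two quantities are defined, here is a sketch; the function name and the example allele pairs are hypothetical and are not taken from the study's pipeline.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snp(ref, alt):
    """Label a single-base substitution as a transition or a transversion.

    Transition: purine <-> purine (A/G) or pyrimidine <-> pyrimidine (C/T).
    Transversion: any purine <-> pyrimidine change.
    """
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError(f"not a single-base substitution: {ref}>{alt}")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


# Hypothetical allele pairs, for illustration only.
for ref, alt in [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]:
    print(f"{ref}>{alt}: {classify_snp(ref, alt)}")

# Confirmation rate from the abstract: 12 of the 17 tested SNPs were real.
confirmed, tested = 12, 17
print(f"confirmation rate: {100 * confirmed / tested:.1f}%")
```

    The computed rate is about 70.6%, which the abstract rounds to 71%.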

  10. Generation and Analysis of Expressed Sequence Tags from Olea europaea L.

    PubMed Central

    Ozdemir Ozgenturk, Nehir; Oruç, Fatma; Sezerman, Ugur; Kuçukural, Alper; Vural Korkut, Senay; Toksoz, Feriha; Un, Cemal

    2010-01-01

    Olive (Olea europaea L.), an important source of edible oil, originated in the Near East. In this study, two cDNA libraries were constructed from young olive leaves and immature olive fruits to generate ESTs, discover novel genes, and investigate the function of unknown olive genes. A total of 3840 randomly selected colonies from both libraries were sequenced. The 2228 readable sequences from olive leaf and 1506 from olive fruit were assembled into 205 and 69 contigs, respectively, whereas 2478 were singletons. Putative functions of all 2752 differentially expressed unique sequences were assigned by BLAST homology and annotated using BLAST2GO. While 1339 ESTs showed no homology to the database, 2024 ESTs had homology (under 80%) with hypothetical, putative, expressed, and unknown proteins in NCBI GenBank. A further 635 unique EST sequences were identified, by over 80% homology, as genes of known function in other species that had not previously been described in the Olea family. Only 3.1% of all ESTs showed similarity to the olive sequences already in NCBI. The generated EST data and consensus sequences were submitted to NCBI as a resource for functional genomic studies of olive. PMID:21197085

  11. Comparative analysis of the Acyrthosiphon pisum genome and expressed sequence tag-based gene sets from other aphid species.

    PubMed

    Ollivier, M; Legeai, F; Rispe, C

    2010-03-01

    To study gene repertoires and their evolution within aphids, we compared the complete genome sequence of Acyrthosiphon pisum (reference gene set) and expressed sequence tag (EST) data from three other species: Myzus persicae, Aphis gossypii and Toxoptera citricida. We assembled ESTs, predicted coding sequences, and identified potential pairs of orthologues (reciprocal best hits) with A. pisum. Pairwise comparisons show that a fraction of the genes evolve fast (high ratio of non-synonymous to synonymous rates), including many genes shared by aphids but with no hit in Uniprot. A detailed phylogenetic study for four fast-evolving genes (C002, JHAMT, Apo and GH) shows that rate accelerations are often associated with duplication events. We also compare compositional patterns between the two tribes of aphids, Aphidini and Macrosiphini.

  12. Development, characterization and cross species amplification of polymorphic microsatellite markers from expressed sequence tags of turmeric (Curcuma longa L.).

    PubMed

    Siju, S; Dhanya, K; Syamkumar, S; Sasikumar, B; Sheeja, T E; Bhat, A I; Parthasarathy, V A

    2010-02-01

    Expressed sequence tags (ESTs) from turmeric (Curcuma longa L.) were used for the screening of type and frequency of Class I (hypervariable) simple sequence repeats (SSRs). A total of 231 microsatellite repeats were detected from 12,593 EST sequences of turmeric after redundancy elimination. The average density of Class I SSRs amounts to one SSR per 17.96 kb of EST. Mononucleotides were the most abundant class of microsatellite repeat in turmeric ESTs, followed by trinucleotides. A robust set of 17 polymorphic EST-SSRs was developed and used for evaluating 20 turmeric accessions. The number of alleles detected ranged from 3 to 8 per locus. The developed markers were also evaluated in 13 related species of C. longa, confirming a high rate (100%) of cross-species transferability. The polymorphic microsatellite markers generated from this study could be used for genetic diversity analysis and resolving the taxonomic confusion prevailing in the genus.

  13. Immune gene discovery by expressed sequence tag (EST) analysis of hemocytes in the ridgetail white prawn Exopalaemon carinicauda

    PubMed Central

    Duan, Yafei; Liu, Ping; Li, Jitao; Li, Jian; Chen, Ping

    2013-01-01

    The ridgetail white prawn Exopalaemon carinicauda is one of the most important commercial species in eastern China. However, little information on immune genes in E. carinicauda has been reported. To identify distinctive genes associated with immunity, an expressed sequence tag (EST) library was constructed from hemocytes of E. carinicauda. A total of 3411 clones were sequenced, yielding 2853 ESTs with an average sequence length of 436 bp. The cluster and assembly analysis yielded 1053 unique sequences including 329 contigs and 724 singletons. Blast analysis identified 593 (56.3%) of the unique sequences as orthologs of genes from other organisms (E-value < 1e-5). Based on the COG and Gene Ontology (GO), 593 unique sequences were classified. Through comparison with previous studies, 153 genes assembled from 367 ESTs have been identified as possibly involved in defense or immune functions. These genes are grouped into seven categories according to their putative functions in the shrimp immune system: antimicrobial peptides, prophenoloxidase activating system, antioxidant defense systems, chaperone proteins, clottable proteins, pattern recognition receptors and other immune-related genes. According to EST abundance, the major immune-related genes were thioredoxin (141, 4.94% of all ESTs) and calmodulin (14, 0.49% of all ESTs). The EST sequences of E. carinicauda hemocytes provide important information on the immune system and lay the groundwork for development of molecular markers related to disease resistance in prawn species. PMID:23092732

  14. Analysis of expressed sequence tags from the anamorphic basidiomycetous yeast, Pseudozyma antarctica, which produces glycolipid biosurfactants, mannosylerythritol lipids.

    PubMed

    Morita, Tomotake; Konishi, Masaaki; Fukuoka, Tokuma; Imura, Tomohiro; Kitamoto, Dai

    2006-07-15

    Pseudozyma antarctica T-34 secretes a large amount of biosurfactants (BS), mannosylerythritol lipids (MEL), from different carbon sources such as hydrocarbons and vegetable oils. The detailed biosynthetic pathway of MEL remained unknown due to lack of genetic information on the anamorphic basidiomycetous yeasts, including the genus Pseudozyma. Here, in order to obtain genetic information on P. antarctica T-34, we constructed a cDNA library from yeast cells producing MEL from soybean oil and identified the genes expressed through the creation of an expressed sequence tag (EST) library. We generated 398 ESTs, assembled into 146 contiguous sequences. Based upon a BLAST search similarity cut-off of E ... sequences in the protein database; 60.3% of all contiguous sequences shared significant identities to hypothetical proteins of Ustilago maydis, which is a smut fungus and BS producer. Based on the gene expression study using real-time reverse transcriptase-PCR, the predicted genes, such as mannosyltransferase and acyltransferase, were demonstrated to be highly involved in MEL biosynthesis in soybean oil-grown cells. PMID:16845679

  15. Probing essential oil biosynthesis and secretion by functional evaluation of expressed sequence tags from mint glandular trichomes.

    PubMed

    Lange, B M; Wildung, M R; Stauber, E J; Sanchez, C; Pouchnik, D; Croteau, R

    2000-03-14

    Functional genomics approaches, which use combined computational and expression-based analyses of large amounts of sequence information, are emerging as powerful tools to accelerate the comprehensive understanding of cellular metabolism in specialized tissues and whole organisms. As part of an ongoing effort to identify genes of essential oil (monoterpene) biosynthesis, we have obtained sequence information from 1,316 randomly selected cDNA clones, or expressed sequence tags (ESTs), from a peppermint (Mentha x piperita) oil gland secretory cell cDNA library. After bioinformatic selection, candidate genes putatively involved in essential oil biosynthesis and secretion have been subcloned into suitable expression vectors for functional evaluation in Escherichia coli. On the basis of published and preliminary data on the functional properties of these clones, it is estimated that the ESTs involved in essential oil metabolism represent about 25% of the described sequences. An additional 7% of the recognized genes code for proteins involved in transport processes, and a subset of these is likely involved in the secretion of essential oil terpenes from the site of synthesis to the storage cavity of the oil glands. The integrated approaches reported here represent an essential step toward the development of a metabolic map of oil glands and provide a valuable resource for defining molecular targets for the genetic engineering of essential oil formation. PMID:10717007

  16. Expressed sequence tag analysis of khat (Catha edulis) provides a putative molecular biochemical basis for the biosynthesis of phenylpropylamino alkaloids

    PubMed Central

    Hagel, Jillian M.; Krizevski, Raz; Kilpatrick, Korey; Sitrit, Yaron; Marsolais, Frédéric; Lewinsohn, Efraim; Facchini, Peter J.

    2011-01-01

    Khat (Catha edulis Forsk.) is a flowering perennial shrub cultivated for its neurostimulant properties resulting mainly from the occurrence of (S)-cathinone in young leaves. The biosynthesis of (S)-cathinone and the related phenylpropylamino alkaloids (1S,2S)-cathine and (1R,2S)-norephedrine is not well characterized in plants. We prepared a cDNA library from young khat leaves and sequenced 4,896 random clones, generating an expressed sequence tag (EST) library of 3,293 unigenes. Putative functions were assigned to > 98% of the ESTs, providing a key resource for gene discovery. Candidates potentially involved at various stages of phenylpropylamino alkaloid biosynthesis from L-phenylalanine to (1S,2S)-cathine were identified. PMID:22215969

  17. Analysis of expressed sequence tags from cDNA library of Fusarium culmorum infected barley (Hordeum vulgare L.) roots

    PubMed Central

    Tufan, Feyza; Uçarlı, Cüneyt; Gürel, Filiz

    2015-01-01

    Fusarium culmorum is one of the most common and globally important causal agents of root and crown rot diseases of cereals. These diseases cause grain yield loss and reduced grain quality in barley. In this study, we have analyzed an expressed sequence tag (EST) database derived from F. culmorum infected barley root tissues available at the National Center for Biotechnology Information (NCBI). The 2294 sequences were assembled into 1619 non-redundant sequences consisting of 359 contigs and 1260 singletons using the program CAP3. BLASTX analysis for these sequences was conducted in order to find similar sequences in all databases. Gene Ontology search, enzyme search, KEGG mapping and InterProScan search were done using Blast2GO 3.0.7 tool. By BLASTX analysis, 41.7%, 7.7%, 3.2% and 47.4% of ESTs were categorized as annotated, unannotated, not mapping and without blast hits, respectively. BLASTX analysis revealed that the majority of top hits were barley proteins (43.5%). Based on Gene Ontology classification, 38.3%, 31.3%, and 16% of ESTs were assigned to molecular function, biological process, and cellular component GO terms, respectively. Most abundant GO terms were as follows: 157 sequences were related to response to stress (biological process), 207 sequences were related to ion binding (molecular function), and 160 sequences were related to plastid (cellular component). Furthermore, based on KEGG mapping, 369 sequences could be assigned to 264 enzymes and 83 different KEGG pathways. According to Enzyme Commission (EC) distribution, 94 sequences were transferases (EC2) while 70 sequences were hydrolases (EC3). PMID:25780278

  18. Molecular diversification based on analysis of expressed sequence tags from the venom glands of the Chinese bird spider Ornithoctonus huwena.

    PubMed

    Jiang, Liping; Peng, Li; Chen, Jinjun; Zhang, Yongqun; Xiong, Xia; Liang, Songping

    2008-06-15

    The bird spider Ornithoctonus huwena is one of the most venomous spiders in China. Its venom has been investigated but usually only the most abundant components have been analyzed. To characterize the primary structure of O. huwena toxins, a list of transcripts within the venom gland was made using the expressed sequence tag (EST) strategy. We generated 468 ESTs from a directional cDNA library of O. huwena venom glands. All ESTs were grouped into 24 clusters and 65 singletons, of which 68.00% of total ESTs belong to toxin-like sequences, 13.00% are similar to body peptide transcripts and 19.00% have no significant similarity to any known sequences. Precursors of all toxin-like sequences can be classified into eight different superfamilies (HWTX-I superfamily, HWTX-II superfamily, HWTX-X superfamily, HWTX-XIV superfamily, HWTX-XV superfamily, HWTX-XVI superfamily, HWTX-XVII superfamily, HWTX-XVIII superfamily) except HWTX-XI and HWTX-XIII, according to the identity of their precursor sequences. The results have predictive value for the discovery of various groups of pharmacologically distinct toxins in complex venoms, and for understanding the relationship of spider toxin evolution based on the diversification of cDNA sequences, primary structure of precursor peptides, three-dimensional structure motifs and biological functions.

  19. In silico identification of miRNAs and their targets from the expressed sequence tags of Raphanus sativus

    PubMed Central

    Muvva, Charuvaka; Tewari, Lata; Aruna, Kasoju; Ranjit, Pabbati; MD, Zahoorullah S; MD, K A Matheen; Veeramachaneni, Hemanth

    2012-01-01

    MicroRNAs (miRNAs) are a novel, growing family of endogenous, small, non-coding, single-stranded RNA molecules directly involved in regulating gene expression at the posttranscriptional level. High conservation of miRNAs in plants provides the foundation for identification of new miRNAs in other plant species through homology alignment. Here, previously known plant miRNAs were BLASTed against the Expressed Sequence Tag (EST) database of Raphanus sativus, and according to a series of filtering criteria, a total of 48 miRNAs belonging to 9 miRNA families were identified, and 16 potential target genes were subsequently predicted, most of which seemed to encode transcription factors or enzymes participating in regulation of development, growth and other physiological processes. Overall, our findings lay the foundation for further research on miRNA function in R. sativus. PMID:22359443

  20. Developing expressed sequence tag libraries and the discovery of simple sequence repeat markers for two species of raspberry (Rubus L.)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background: Due to a relatively high level of codominant inheritance and transferability within and among taxonomic groups, simple sequence repeat (SSR) markers are important elements in comparative mapping and delineation of genomic regions associated with traits of economic importance. Expressed S...

  1. Development of expressed sequence tag resources for Vanda Mimi Palmer and data mining for EST-SSR.

    PubMed

    Teh, Seow-Ling; Chan, Wai-Sun; Abdullah, Janna Ong; Namasivayam, Parameswari

    2011-08-01

    Vanda Mimi Palmer (VMP) is a highly sought-after fragrant orchid hybrid in Malaysia. It is economically important in the cosmetic and beauty industries and is also a famous potted ornamental plant. To date, no work on fragrance-related genes of vandaceous orchids has been reported from other research groups, although floral fragrance and volatiles have been extensively studied. An expressed sequence tag (EST) resource was developed for VMP principally to mine any potential fragrance-related expressed sequence tag-simple sequence repeat (EST-SSR) for future development as markers in the identification of fragrant vandaceous orchids endemic to Malaysia. Clustering, annotation and assembling of the ESTs identified 1,196 unigenes, comprising 966 singletons and 230 contigs. The VMP dbEST was functionally classified by gene ontology (GO) into three groups: molecular functions (51.2%), cellular components (16.4%) and biological processes (24.6%), while the remaining 7.8% showed no hits with a GO identifier. A total of 112 EST-SSRs (9.4%) were mined, in which at least five units of di-, tri-, tetra-, penta-, or hexa-nucleotide repeats were predicted. The di-nucleotide motif repeats appeared to be the most frequent repeats among the detected SSRs, with the AT/TA types the most abundant among the dimerics, while AAG/TTC and AGA/TCT types were the most frequent trimerics. The mined EST-SSRs are believed to be useful in the development of EST-SSR markers applicable to the screening and characterization of fragrance-related transcripts in closely related species.
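
    The mining criterion stated in the abstract above (at least five repeat units of a di- to hexa-nucleotide motif) can be expressed as a regular expression. The sketch below is an assumption about how such a scan might look, not the tool the authors used; the function name and the test sequence are hypothetical.

```python
import re


def find_ssrs(seq, min_units=5):
    """Return (motif, units, start) for each di- to hexa-nucleotide repeat.

    The lazy {2,6}? quantifier makes the regex try the shortest motif first,
    so (AT)n is reported as a dinucleotide repeat rather than as (ATAT)n.
    """
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_units - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        units = len(m.group(0)) // len(motif)
        hits.append((motif, units, m.start()))
    return hits


# Hypothetical sequence: (AT)6 and (AAG)5 embedded in short flanks.
example = "GGC" + "AT" * 6 + "CCT" + "AAG" * 5 + "TTA"
for motif, units, start in find_ssrs(example):
    print(f"({motif}){units} at position {start}")

# Share of unigenes carrying an SSR, from the abstract: 112 of 1,196.
print(f"{100 * 112 / 1196:.1f}% of unigenes")
```

    Real SSR miners usually apply different minimum unit counts per motif length and report flanking sequence for primer design; this sketch uses a single threshold to stay close to the criterion quoted in the abstract.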

  2. Identification of Anhydrobiosis-related Genes from an Expressed Sequence Tag Database in the Cryptobiotic Midge Polypedilum vanderplanki (Diptera; Chironomidae)

    PubMed Central

    Cornette, Richard; Kanamori, Yasushi; Watanabe, Masahiko; Nakahara, Yuichi; Gusev, Oleg; Mitsumasu, Kanako; Kadono-Okuda, Keiko; Shimomura, Michihiko; Mita, Kazuei; Kikawada, Takahiro; Okuda, Takashi

    2010-01-01

    Some organisms are able to survive the loss of almost all their body water content, entering a latent state known as anhydrobiosis. The sleeping chironomid (Polypedilum vanderplanki) lives in the semi-arid regions of Africa, and its larvae can survive desiccation in an anhydrobiotic form during the dry season. To unveil the molecular mechanisms of this resistance to desiccation, an anhydrobiosis-related Expressed Sequence Tag (EST) database was obtained from the sequences of three cDNA libraries constructed from P. vanderplanki larvae after 0, 12, and 36 h of desiccation. The database contained 15,056 ESTs distributed into 4,807 UniGene clusters. ESTs were classified according to gene ontology categories, and putative expression patterns were deduced for all clusters on the basis of the number of clones in each library; expression patterns were confirmed by real-time PCR for selected genes. Among up-regulated genes, antioxidants, late embryogenesis abundant (LEA) proteins, and heat shock proteins (Hsps) were identified as important groups for anhydrobiosis. Genes related to trehalose metabolism and various transporters were also strongly induced by desiccation. Those results suggest that the oxidative stress response plays a central role in successful anhydrobiosis. Similarly, protein denaturation and aggregation may be prevented by marked up-regulation of Hsps and the anhydrobiosis-specific LEA proteins. A third major feature is the predicted increase in trehalose synthesis and in the expression of various transporter proteins allowing the distribution of trehalose and other solutes to all tissues. PMID:20833722

  3. An expressed sequence tag database of T-cell-enriched activated chicken splenocytes: sequence analysis of 5251 clones.

    PubMed

    Tirunagaru, V G; Sofer, L; Cui, J; Burnside, J

    2000-06-01

    The cDNA and gene sequences of many mammalian cytokines and their receptors are known. However, corresponding information on avian cytokines is limited due to the lack of cross-species activity at the functional level or strong homology at the molecular level. To improve the efficiency of identifying cytokines and novel chicken genes, a directionally cloned cDNA library from T-cell-enriched activated chicken splenocytes was constructed, and the partial sequence of 5251 clones was obtained. Sequence clustering indicates that 2357 (42%) of the clones are present as a single copy, and 2961 are distinct clones, demonstrating the high level of complexity of this library. Comparisons of the sequence data with known DNA sequences in GenBank indicate that approximately 25% of the clones match known chicken genes, 39% have similarity to known genes in other species, and 11% had no match to any sequence in the database. Several previously uncharacterized chicken cytokines and their receptors were present in our library. This collection provides a useful database for cataloging genes expressed in T cells and a valuable resource for future investigations of gene expression in avian immunology. A chicken EST Web site (http://udgenome.ags.udel.edu/chickest/chick.htm) has been created to provide access to the data, and a set of unique sequences has been deposited with GenBank (Accession Nos. AI979741-AI982511). Our new Web site (http://www.chickest.udel.edu) will be active as of March 3, 2000, and will also provide keyword-searching capabilities for BLASTX and BLASTN hits of all our clones. PMID:10860659

  4. Flavonoid biosynthesis genes putatively identified in the aromatic plant Polygonum minus via Expressed Sequences Tag (EST) analysis.

    PubMed

    Roslan, Nur Diyana; Yusop, Jastina Mat; Baharum, Syarul Nataqain; Othman, Roohaida; Mohamed-Hussein, Zeti-Azura; Ismail, Ismanizan; Noor, Normah Mohd; Zainal, Zamri

    2012-01-01

    P. minus is an aromatic plant, the leaf of which is widely used as a food additive and in the perfume industry. The leaf also accumulates secondary metabolites that act as active ingredients, such as flavonoids. Due to limited genomic and transcriptomic data, the biosynthetic pathway of flavonoids is currently unclear. Identification of candidate genes involved in the flavonoid biosynthetic pathway will significantly contribute to understanding the biosynthesis of active compounds. We have constructed a standard cDNA library from P. minus leaves, and two normalized full-length enriched cDNA libraries were constructed from stem and root organs in order to create a gene resource for the biosynthesis of secondary metabolites, especially flavonoid biosynthesis. Large-scale sequencing of the P. minus cDNA libraries identified 4196 expressed sequence tags (ESTs), which were deposited in dbEST at the National Center for Biotechnology Information (NCBI). From the three constructed cDNA libraries, 11 ESTs encoding seven genes were mapped to the flavonoid biosynthetic pathway. Finally, three flavonoid biosynthetic pathway-related ESTs, chalcone synthase, CHS (JG745304), flavonol synthase, FLS (JG705819) and leucoanthocyanidin dioxygenase, LDOX (JG745247), were selected for further examination by quantitative RT-PCR (qRT-PCR) in different P. minus organs. Expression was detected in leaf, stem and root. Gene expression studies have been initiated in order to better understand the underlying physiological processes.

  5. Integration of Expressed Sequence Tag Data Flanking Predicted RNA Secondary Structures Facilitates Novel Non-Coding RNA Discovery

    PubMed Central

    Krzyzanowski, Paul M.; Price, Feodor D.; Muro, Enrique M.; Rudnicki, Michael A.; Andrade-Navarro, Miguel A.

    2011-01-01

    Many computational methods have been used to predict novel non-coding RNAs (ncRNAs), but none, to our knowledge, has explicitly investigated the impact of integrating existing cDNA-based Expressed Sequence Tag (EST) data that flank structural RNA predictions. To determine whether flanking EST data can assist in microRNA (miRNA) prediction, we identified genomic sites encoding putative miRNAs by combining functional RNA predictions with flanking EST data in a model consistent with miRNAs undergoing cleavage during maturation. In both human and mouse genomes, we observed that the inclusion of flanking ESTs adjacent to and not overlapping predicted miRNAs significantly improved the performance of various methods of miRNA prediction, including direct high-throughput sequencing of small RNA libraries. We analyzed the expression of hundreds of miRNAs predicted to be expressed during myogenic differentiation using a customized microarray and identified several known and predicted myogenic miRNA hairpins. Our results indicate that integrating ESTs flanking structural RNA predictions improves the quality of cleaved miRNA predictions and suggest that this strategy can be used to predict other non-coding RNAs undergoing cleavage during maturation. PMID:21698286

  6. Flavonoid Biosynthesis Genes Putatively Identified in the Aromatic Plant Polygonum minus via Expressed Sequences Tag (EST) Analysis

    PubMed Central

    Roslan, Nur Diyana; Yusop, Jastina Mat; Baharum, Syarul Nataqain; Othman, Roohaida; Mohamed-Hussein, Zeti-Azura; Ismail, Ismanizan; Noor, Normah Mohd; Zainal, Zamri

    2012-01-01

    P. minus is an aromatic plant, the leaf of which is widely used as a food additive and in the perfume industry. The leaf also accumulates secondary metabolites that act as active ingredients, such as flavonoids. Due to limited genomic and transcriptomic data, the biosynthetic pathway of flavonoids is currently unclear. Identification of candidate genes involved in the flavonoid biosynthetic pathway will significantly contribute to understanding the biosynthesis of active compounds. We have constructed a standard cDNA library from P. minus leaves, and two normalized full-length enriched cDNA libraries were constructed from stem and root organs in order to create a gene resource for the biosynthesis of secondary metabolites, especially flavonoid biosynthesis. Large-scale sequencing of the P. minus cDNA libraries identified 4196 expressed sequence tags (ESTs), which were deposited in dbEST at the National Center for Biotechnology Information (NCBI). From the three constructed cDNA libraries, 11 ESTs encoding seven genes were mapped to the flavonoid biosynthetic pathway. Finally, three flavonoid biosynthetic pathway-related ESTs, chalcone synthase, CHS (JG745304), flavonol synthase, FLS (JG705819) and leucoanthocyanidin dioxygenase, LDOX (JG745247), were selected for further examination by quantitative RT-PCR (qRT-PCR) in different P. minus organs. Expression was detected in leaf, stem and root. Gene expression studies have been initiated in order to better understand the underlying physiological processes. PMID:22489118

  7. Development of expressed sequence tag-simple sequence repeat markers for genetic characterization and population structure analysis of Praxelis clematidea (Asteraceae).

    PubMed

    Wang, Q Z; Huang, M; Downie, S R; Chen, Z X

    2016-01-01

    Invasive plants tend to spread aggressively in new habitats and an understanding of their genetic diversity and population structure is useful for their management. In this study, expressed sequence tag-simple sequence repeat (EST-SSR) markers were developed for the invasive plant species Praxelis clematidea (Asteraceae) from 5548 Stevia rebaudiana (Asteraceae) expressed sequence tags (ESTs). A total of 133 microsatellite-containing ESTs (2.4%) were identified, of which 56 (42.1%) were hexanucleotide repeat motifs and 50 (37.6%) were trinucleotide repeat motifs. Of the 24 primer pairs designed from these 133 ESTs, 7 (29.2%) resulted in significant polymorphisms. The number of alleles per locus ranged from 5 to 9. The relatively high genetic diversity (H = 0.2667, I = 0.4212, and P = 100%) of P. clematidea was related to high gene flow (Nm = 1.4996) among populations. The coefficient of population differentiation (GST = 0.2500) indicated that most genetic variation occurred within populations. A Mantel test suggested that there was significant correlation between genetic distance and geographical distribution (r = 0.3192, P = 0.012). These results further support the transferability of EST-SSR markers between closely related genera of the same family. PMID:27323082
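
    The summary statistics in this abstract can be cross-checked with a short calculation. The abstract does not state how gene flow was estimated; the sketch below assumes the widely used relation Nm = 0.5 (1 - GST) / GST, and the function name `gene_flow_from_gst` is hypothetical.

```python
def gene_flow_from_gst(gst):
    """Estimate gene flow (Nm) from GST.

    Assumed estimator (not confirmed by the abstract): Nm = 0.5 * (1 - GST) / GST.
    """
    if not 0.0 < gst < 1.0:
        raise ValueError("GST must lie strictly between 0 and 1")
    return 0.5 * (1.0 - gst) / gst


def percent(part, whole):
    """Percentage rounded to one decimal place, as reported in the abstract."""
    return round(100.0 * part / whole, 1)


# Values quoted in the abstract.
print(gene_flow_from_gst(0.2500))  # compare with the reported Nm = 1.4996
print(percent(133, 5548))          # SSR-containing ESTs among all ESTs
print(percent(56, 133))            # hexanucleotide motifs
print(percent(50, 133))            # trinucleotide motifs
print(percent(7, 24))              # polymorphic primer pairs
```

    Under this assumption the rounded GST of 0.2500 gives Nm = 1.5, which is close to the reported 1.4996; the small gap would be consistent with GST having been rounded to four decimals before publication.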

  8. SSH Analysis of Endosperm Transcripts and Characterization of Heat Stress Regulated Expressed Sequence Tags in Bread Wheat.

    PubMed

    Goswami, Suneha; Kumar, Ranjeet R; Dubey, Kavita; Singh, Jyoti P; Tiwari, Sachidanand; Kumar, Ashok; Smita, Shuchi; Mishra, Dwijesh C; Kumar, Sanjeev; Grover, Monendra; Padaria, Jasdeep C; Kala, Yugal K; Singh, Gyanendra P; Pathak, Himanshu; Chinnusamy, Viswanathan; Rai, Anil; Praveen, Shelly; Rai, Raj D

    2016-01-01

    Heat stress is one of the major problems in agriculturally important cereal crops, especially wheat. Here, we have constructed a subtracted cDNA library from the endosperm of HS-treated (42°C for 2 h) wheat cv. HD2985 by suppression subtractive hybridization (SSH). We identified ~550 recombinant clones ranging from 200 to 500 bp with an average size of 300 bp. Sanger sequencing was performed with 205 positive clones to generate the differentially expressed sequence tags (ESTs). Most of the ESTs were observed to be localized on the long arm of chromosome 2A and associated with heat stress tolerance and metabolic pathways. The identified ESTs were searched by BLAST against the Ensemble, TriFLD, and TIGR databases, and the predicted CDS were translated and aligned with the protein sequences available in the pfam and InterProScan 5 databases to predict the differentially expressed proteins (DEPs). We observed eight different types of post-translational modifications (PTMs) in the DEPs corresponding to the cloned ESTs: 147 sites with phosphorylation, 21 sites with sumoylation, 237 with palmitoylation, 96 sites with S-nitrosylation, 3066 calpain cleavage sites, and 103 tyrosine nitration sites, predicted to sense the heat stress and regulate the expression of stress genes. Twelve DEPs were observed to have transmembrane helices (TMH) in their structure, predicted to play the role of sensors of HS. Quantitative Real-Time PCR of randomly selected ESTs showed very high relative expression of HSP17 under HS; up-regulation was observed more in wheat cv. HD2985 (thermotolerant), as compared to HD2329 (thermosusceptible) during grain-filling. The abundance of transcripts was further validated through northern blot analysis. The ESTs and their corresponding DEPs can be used as molecular markers for screening or in targeted precision breeding programs. PTMs identified in the DEPs can be used to elucidate the thermotolerance mechanism of wheat, a novel step toward the development of "climate-smart" wheat.

  9. SSH Analysis of Endosperm Transcripts and Characterization of Heat Stress Regulated Expressed Sequence Tags in Bread Wheat

    PubMed Central

    Goswami, Suneha; Kumar, Ranjeet R.; Dubey, Kavita; Singh, Jyoti P.; Tiwari, Sachidanand; Kumar, Ashok; Smita, Shuchi; Mishra, Dwijesh C.; Kumar, Sanjeev; Grover, Monendra; Padaria, Jasdeep C.; Kala, Yugal K.; Singh, Gyanendra P.; Pathak, Himanshu; Chinnusamy, Viswanathan; Rai, Anil; Praveen, Shelly; Rai, Raj D.

    2016-01-01

    Heat stress is one of the major problems in agriculturally important cereal crops, especially wheat. Here, we have constructed a subtracted cDNA library from the endosperm of HS-treated (42°C for 2 h) wheat cv. HD2985 by suppression subtractive hybridization (SSH). We identified ~550 recombinant clones ranging from 200 to 500 bp with an average size of 300 bp. Sanger sequencing was performed with 205 positive clones to generate the differentially expressed sequence tags (ESTs). Most of the ESTs were observed to be localized on the long arm of chromosome 2A and associated with heat stress tolerance and metabolic pathways. The identified ESTs were searched by BLAST against the Ensemble, TriFLD, and TIGR databases, and the predicted CDS were translated and aligned with the protein sequences available in the pfam and InterProScan 5 databases to predict the differentially expressed proteins (DEPs). We observed eight different types of post-translational modifications (PTMs) in the DEPs corresponding to the cloned ESTs: 147 sites with phosphorylation, 21 sites with sumoylation, 237 with palmitoylation, 96 sites with S-nitrosylation, 3066 calpain cleavage sites, and 103 tyrosine nitration sites, predicted to sense the heat stress and regulate the expression of stress genes. Twelve DEPs were observed to have transmembrane helices (TMH) in their structure, predicted to play the role of sensors of HS. Quantitative Real-Time PCR of randomly selected ESTs showed very high relative expression of HSP17 under HS; up-regulation was observed more in wheat cv. HD2985 (thermotolerant), as compared to HD2329 (thermosusceptible) during grain-filling. The abundance of transcripts was further validated through northern blot analysis. The ESTs and their corresponding DEPs can be used as molecular markers for screening or in targeted precision breeding programs. PTMs identified in the DEPs can be used to elucidate the thermotolerance mechanism of wheat, a novel step toward the development of

  10. SSH Analysis of Endosperm Transcripts and Characterization of Heat Stress Regulated Expressed Sequence Tags in Bread Wheat.

    PubMed

    Goswami, Suneha; Kumar, Ranjeet R; Dubey, Kavita; Singh, Jyoti P; Tiwari, Sachidanand; Kumar, Ashok; Smita, Shuchi; Mishra, Dwijesh C; Kumar, Sanjeev; Grover, Monendra; Padaria, Jasdeep C; Kala, Yugal K; Singh, Gyanendra P; Pathak, Himanshu; Chinnusamy, Viswanathan; Rai, Anil; Praveen, Shelly; Rai, Raj D

    2016-01-01

    Heat stress is one of the major problems in agriculturally important cereal crops, especially wheat. Here, we have constructed a subtracted cDNA library from the endosperm of HS-treated (42°C for 2 h) wheat cv. HD2985 by suppression subtractive hybridization (SSH). We identified ~550 recombinant clones ranging from 200 to 500 bp with an average size of 300 bp. Sanger sequencing was performed with 205 positive clones to generate the differentially expressed sequence tags (ESTs). Most of the ESTs were observed to be localized on the long arm of chromosome 2A and associated with heat stress tolerance and metabolic pathways. The identified ESTs were searched by BLAST against the Ensemble, TriFLD, and TIGR databases, and the predicted CDS were translated and aligned with the protein sequences available in the pfam and InterProScan 5 databases to predict the differentially expressed proteins (DEPs). We observed eight different types of post-translational modifications (PTMs) in the DEPs corresponding to the cloned ESTs: 147 sites with phosphorylation, 21 sites with sumoylation, 237 with palmitoylation, 96 sites with S-nitrosylation, 3066 calpain cleavage sites, and 103 tyrosine nitration sites, predicted to sense the heat stress and regulate the expression of stress genes. Twelve DEPs were observed to have transmembrane helices (TMH) in their structure, predicted to play the role of sensors of HS. Quantitative Real-Time PCR of randomly selected ESTs showed very high relative expression of HSP17 under HS; up-regulation was observed more in wheat cv. HD2985 (thermotolerant), as compared to HD2329 (thermosusceptible) during grain-filling. The abundance of transcripts was further validated through northern blot analysis. The ESTs and their corresponding DEPs can be used as molecular markers for screening or in targeted precision breeding programs. PTMs identified in the DEPs can be used to elucidate the thermotolerance mechanism of wheat, a novel step toward the development of "climate-smart" wheat.

  11. Identification and Validation of Expressed Sequence Tags from Pigeonpea (Cajanus cajan L.) Root.

    PubMed

    Kumar, Ravi Ranjan; Yadav, Shailesh; Joshi, Shourabh; Bhandare, Prithviraj P; Patil, Vinod Kumar; Kulkarni, Pramod B; Sonkawade, Swati; Naik, G R

    2014-01-01

    Pigeonpea (Cajanus cajan (L.) Millsp.) is an important food legume crop of rainfed agriculture in the arid and semiarid tropics of the world. It has a deep and extensive root system, which serves a number of important physiological and metabolic functions in plant development and growth. In order to identify genes associated with pigeonpea root, ESTs were generated from the root tissues of pigeonpea (GRG-295 genotype) using a normalized cDNA library. A total of 105 high-quality ESTs were generated by sequencing 250 random clones, which resulted in 72 unigenes comprising 25 contigs and 47 singlets. The ESTs were assigned to 9 functional categories on the basis of their putative function. In order to validate the possible expression of transcripts, four genes, namely, S-adenosylmethionine synthetase, phosphoglycerate kinase, serine carboxypeptidase, and methionine aminopeptidase, were further analyzed by reverse transcriptase PCR. The identified transcripts and their possible root-associated functions will also be a valuable resource for functional genomics studies in legume crops. PMID:24895494

  12. Identification of odorant-binding protein genes from antennal expressed sequence tags of the onion fly, Delia antiqua.

    PubMed

    Mitaka, Hayato; Matsuo, Takashi; Miura, Nami; Ishikawa, Yukio

    2011-03-01

    Insect odorant-binding proteins (OBPs) are thought to play a crucial role in the chemosensation of hydrophobic molecules such as pheromones and host chemicals. The onion fly, Delia antiqua, is a specialist feeder of Allium plants, and utilizes a host odorant n-dipropyl disulfide as a cue for its oviposition. Because n-dipropyl disulfide is a highly hydrophobic compound, some OBPs might be indispensable for perception of it. However, no OBP gene has been identified in D. antiqua. Here, to obtain the DNA sequences of D. antiqua OBPs, we performed an analysis of antennal expressed sequence tags (ESTs). Among 288 EST clones, eight D. antiqua OBP genes were identified for the first time. Phylogenetic analysis revealed that each D. antiqua OBP gene is more closely related to its Drosophila orthologs than to the other D. antiqua OBP genes, suggesting that these OBP genes had emerged before the divergence of Delia and Drosophila species. All of the eight D. antiqua OBPs are expressed not only in the antennae but also in the legs, suggesting additional roles in the taste perception of non-volatile compounds. These findings serve as an important basis for understanding the molecular mechanisms underlying the host adaptations of D. antiqua. PMID:20848218

  13. Chromosome Bin Map of Expressed Sequence Tags in Homoeologous Group 1 of Hexaploid Wheat and Homoeology With Rice and Arabidopsis

    PubMed Central

    Peng, J. H.; Zadeh, H.; Lazo, G. R.; Gustafson, J. P.; Chao, S.; Anderson, O. D.; Qi, L. L.; Echalier, B.; Gill, B. S.; Dilbirligi, M.; Sandhu, D.; Gill, K. S.; Greene, R. A.; Sorrells, M. E.; Akhunov, E. D.; Dvořák, J.; Linkiewicz, A. M.; Dubcovsky, J.; Hossain, K. G.; Kalavacharla, V.; Kianian, S. F.; Mahmoud, A. A.; Miftahudin; Conley, E. J.; Anderson, J. A.; Pathan, M. S.; Nguyen, H. T.; McGuire, P. E.; Qualset, C. O.; Lapitan, N. L. V.

    2004-01-01

    A total of 944 expressed sequence tags (ESTs) generated 2212 EST loci mapped to homoeologous group 1 chromosomes in hexaploid wheat (Triticum aestivum L.). EST deletion maps and the consensus map of group 1 chromosomes were constructed to show EST distribution. EST loci were unevenly distributed among chromosomes 1A, 1B, and 1D with 660, 826, and 726, respectively. The number of EST loci was greater on the long arms than on the short arms for all three chromosomes. The distribution of ESTs along chromosome arms was nonrandom with EST clusters occurring in the distal regions of short arms and middle regions of long arms. Duplications of group 1 ESTs in other homoeologous groups occurred at a rate of 35.5%. Seventy-five percent of wheat chromosome 1 ESTs had significant matches with rice sequences (E ≤ e−10), where large regions of conservation occurred between wheat consensus chromosome 1 and rice chromosome 5 and between the proximal portion of the long arm of wheat consensus chromosome 1 and rice chromosome 10. Only 9.5% of group 1 ESTs showed significant matches to Arabidopsis genome sequences. The results presented are useful for gene mapping and evolutionary and comparative genomics of grasses. PMID:15514039

  14. Generation of 10,154 expressed sequence tags from a leafy gametophyte of a marine red alga, Porphyra yezoensis.

    PubMed

    Nikaido, I; Asamizu, E; Nakajima, M; Nakamura, Y; Saga, N; Tabata, S

    2000-06-30

    A total of 10,154 5'-end expressed sequence tags (EST) were established from the normalized and size-selected cDNA libraries of a marine red alga, Porphyra yezoensis. Among the ESTs, 2140 were unique species, and the remaining 8014 were grouped into 1127 species. Database search of the 3267 non-redundant ESTs by BLAST algorithm showed that the sequences of 1080 species (33.1%) have similarity to those of registered genes from various organisms including higher plants, mammals, yeasts, and cyanobacteria, while 2187 (66.9%) are novel. Codon usage analysis in the coding regions of 101 non-redundant EST groups showing significant similarity to known genes indicated the higher GC contents at the third position of codons (79.4%) than the first (62.2%) and the second position (45.0%), suggesting that the genome has been exposed to high GC pressure during evolution. The sequence data of individual ESTs are available at the web site http://www.kazusa.or.jp/en/plant/porphyra/EST/.
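
    The position-specific GC content reported in this abstract (first, second and third codon positions) can be computed as in the following minimal sketch. It assumes each input string is an in-frame coding sequence starting at the first codon position; the function name `gc_by_codon_position` is hypothetical and is not taken from the study.

```python
def gc_by_codon_position(cds_list):
    """Return GC percentage at codon positions 1, 2 and 3.

    Assumption: every sequence in cds_list is in frame (starts at position 1
    of a codon). Trailing incomplete codons and ambiguous bases are ignored.
    """
    gc = [0, 0, 0]
    total = [0, 0, 0]
    for cds in cds_list:
        cds = cds.upper()
        usable = len(cds) - len(cds) % 3  # drop an incomplete last codon
        for i in range(usable):
            base = cds[i]
            if base not in "ACGT":
                continue
            pos = i % 3
            total[pos] += 1
            if base in "GC":
                gc[pos] += 1
    return tuple(100.0 * g / t if t else 0.0 for g, t in zip(gc, total))


# Toy input: codons ATG, GCC and AAC.
gc1, gc2, gc3 = gc_by_codon_position(["ATGGCC", "AAC"])
print(round(gc1, 1), round(gc2, 1), round(gc3, 1))
```

    A strong excess of GC at the third position relative to the first two, as reported here (79.4% versus 62.2% and 45.0%), is the pattern this kind of tally is meant to expose.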

  15. Comprehensive analysis of expressed sequence tags from the pulp of the red mutant 'Cara Cara' navel orange (Citrus sinensis Osbeck).

    PubMed

    Ye, Jun-Li; Zhu, An-Dan; Tao, Neng-Guo; Xu, Qiang; Xu, Juan; Deng, Xiu-Xin

    2010-10-01

    Expressed sequence tag (EST) analysis of the pulp of the red-fleshed mutant 'Cara Cara' navel orange provided a starting point for gene discovery and transcriptome survey during citrus fruit maturation. Interpretation of the EST datasets revealed that the mutant pulp transcriptome contained a high proportion of stress-response-related genes, such as the type III metallothionein-like gene (6.0%), heat shock protein (2.8%), Cu/Zn superoxide dismutase (0.8%), and late embryogenesis abundant protein 5 (0.8%). A total of 133 transcripts were found to be differentially expressed between the red mutant and its orange-colored wild-type genotype 'Washington' via digital expression analysis. Among them, genes involved in metabolism, defense/stress and signal transduction were statistically overrepresented. Fifteen transcription factors, comprising NAM, ATAF, and CUC transcription factor (NAC); myeloblastosis (MYB); myelocytomatosis (MYC); basic helix-loop-helix (bHLH); and basic leucine zipper (bZIP) domain members, were also included. The data reflected the distinct expression profile and the unique regulatory module associated with these two genotypes. Eight genes identified as differentially expressed by digital analysis were validated by quantitative real-time polymerase chain reaction. For structural polymorphism, both simple sequence repeat and single nucleotide polymorphism (SNP) loci were surveyed; dinucleotide repeats showed a bias toward AG/GA/TC/CT repeats (52.5%) and against GC/CG repeats (0%). SNP analysis found that transitions (73%) outnumbered transversions (27%). Seventeen potential cultivar-specific and 387 heterozygous SNP loci were detected from the 'Cara Cara' and 'Washington' EST pool.
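
    The transition/transversion split mentioned in this abstract follows the standard definition: a transition exchanges two purines (A, G) or two pyrimidines (C, T), and any other substitution is a transversion. A minimal sketch, with hypothetical function names and toy allele pairs rather than data from the study:

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_snp(ref, alt):
    """Classify a biallelic SNP as 'transition' or 'transversion'."""
    ref, alt = ref.upper(), alt.upper()
    if ref not in BASES or alt not in BASES or ref == alt:
        raise ValueError("expected two different bases from A, C, G, T")
    pair = {ref, alt}
    # Same chemical class (purine<->purine or pyrimidine<->pyrimidine).
    if pair <= PURINES or pair <= PYRIMIDINES:
        return "transition"
    return "transversion"


def transition_fraction(snps):
    """Fraction of transitions among an iterable of (ref, alt) pairs."""
    kinds = [classify_snp(r, a) for r, a in snps]
    return kinds.count("transition") / len(kinds)


# Toy allele pairs (illustrative only).
toy = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C")]
print(transition_fraction(toy))
```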

  16. Analysis and functional annotation of expressed sequence tags (ESTs) from multiple tissues of oil palm (Elaeis guineensis Jacq.)

    PubMed Central

    Ho, Chai-Ling; Kwan, Yen-Yen; Choi, Mei-Chooi; Tee, Sue-Sean; Ng, Wai-Har; Lim, Kok-Ang; Lee, Yang-Ping; Ooi, Siew-Eng; Lee, Weng-Wah; Tee, Jin-Ming; Tan, Siang-Hee; Kulaveerasingam, Harikrishna; Alwee, Sharifah Shahrul Rabiah Syed; Abdullah, Meilina Ong

    2007-01-01

    Background: Oil palm is the second largest source of edible oil, which contributes approximately 20% of the world's production of oils and fats. In order to understand the molecular biology involved in in vitro propagation, flowering, efficient utilization of nitrogen sources and root diseases, we have initiated an expressed sequence tag (EST) analysis on oil palm. Results: In this study, six cDNA libraries from oil palm zygotic embryos, suspension cells, shoot apical meristems, young flowers, mature flowers and roots were constructed. We have generated a total of 14537 expressed sequence tags (ESTs) from these libraries, from which 6464 tentative unique contigs (TUCs) and 2129 singletons were obtained. Approximately 6008 of these tentative unique genes (TUGs) have significant matches to the non-redundant protein database, of which 2361 were assigned to one or more Gene Ontology categories. Predominant transcripts and differentially expressed genes were identified in multiple oil palm tissues. Homologues of genes involved in many aspects of flower development were also identified among the EST collection, such as CONSTANS-like, AGAMOUS-like (AGL)2, AGL20, LFY-like, SQUAMOSA, and SQUAMOSA binding protein (SBP). The majority of them are the first representatives in oil palm, providing opportunities to explore the cause of epigenetic homeotic flowering abnormality in oil palm, given the importance of flowering in fruit production. The transcript levels of two flowering-related genes, EgSBP and EgSEP, were analysed in the flower tissues of various developmental stages. Gene homologues for enzymes involved in oil biosynthesis, utilization of nitrogen sources, and scavenging of oxygen radicals were also uncovered among the oil palm ESTs. Conclusion: The EST sequences generated will allow comparative genomic studies between oil palm and other monocotyledonous and dicotyledonous plants, development of gene-targeted markers for the reference genetic map, design and

  17. A High-Throughput Data Mining of Single Nucleotide Polymorphisms in Coffea Species Expressed Sequence Tags Suggests Differential Homeologous Gene Expression in the Allotetraploid Coffea arabica

    PubMed Central

    Vidal, Ramon Oliveira; Mondego, Jorge Maurício Costa; Pot, David; Ambrósio, Alinne Batista; Andrade, Alan Carvalho; Pereira, Luiz Filipe Protasio; Colombo, Carlos Augusto; Vieira, Luiz Gonzaga Esteves; Carazzolle, Marcelo Falsarella; Pereira, Gonçalo Amarante Guimarães

    2010-01-01

    Polyploidization constitutes a common mode of evolution in flowering plants. This event provides the raw material for the divergence of function in homeologous genes, leading to phenotypic novelty that can contribute to the success of polyploids in nature or their selection for use in agriculture. Mounting evidence underlined the existence of homeologous expression biases in polyploid genomes; however, strategies to analyze such transcriptome regulation remained scarce. Important factors regarding homeologous expression biases remain to be explored, such as whether this phenomenon influences specific genes, how paralogs are affected by genome doubling, and what is the importance of the variability of homeologous expression bias to genotype differences. This study reports the expressed sequence tag assembly of the allopolyploid Coffea arabica and one of its direct ancestors, Coffea canephora. The assembly was used for the discovery of single nucleotide polymorphisms through the identification of high-quality discrepancies in overlapped expressed sequence tags and for gene expression information indirectly estimated by the transcript redundancy. Sequence diversity profiles were evaluated within C. arabica (Ca) and C. canephora (Cc) and used to deduce the transcript contribution of the Coffea eugenioides (Ce) ancestor. The assignment of the C. arabica haplotypes to the C. canephora (CaCc) or C. eugenioides (CaCe) ancestral genomes allowed us to analyze gene expression contributions of each subgenome in C. arabica. In silico data were validated by the quantitative polymerase chain reaction and allele-specific combination TaqMAMA-based method. The presence of differential expression of C. arabica homeologous genes and its implications in coffee gene expression, ontology, and physiology are discussed. PMID:20864545

  18. Identification, Characterization, and Mapping of Expressed Sequence Tags from an Embryonic Zebrafish Heart cDNA Library

    PubMed Central

    Ton, Christopher; Hwang, David M.; Dempsey, Adam A.; Tang, Hong-Chang; Yoon, Jennifer; Lim, Mindy; Mably, John D.; Fishman, Mark C.; Liew, Choong-Chin

    2000-01-01

    The generation of expressed sequence tags (ESTs) has proven to be a rapid and economical approach by which to identify and characterize expressed genes. We generated 5102 ESTs from a 3-d-old embryonic zebrafish heart cDNA library. Of these, 57.6% matched to known genes, 14.2% matched only to other ESTs, and 27.8% showed no match to any ESTs or known genes. Clustering of all ESTs identified 359 unique clusters comprising 1771 ESTs, whereas the remaining 3331 ESTs did not cluster. This estimates the number of unique genes identified in the data set to be approximately 3690. A total of 1242 unique known genes were used to analyze the gene expression patterns in the zebrafish embryonic heart. These were categorized into seven categories on the basis of gene function. The largest class of genes represented those involved in gene/protein expression (25.9% of known transcripts). This class was followed by genes involved in metabolism (18.7%), cell structure/motility (16.4%), cell signaling and communication (9.6%), cell/organism defense (7.1%), and cell division (4.4%). Unclassified genes constituted the remaining 17.91%. Radiation hybrid mapping was performed for 102 ESTs and comparison of map positions between zebrafish and human identified new synteny groups. Continued comparative analysis will be useful in defining the boundaries of conserved chromosome segments between zebrafish and humans, which will facilitate the transfer of genetic information between the two organisms and improve our understanding of vertebrate evolution. [The sequence data described in this paper have been submitted to the GenBank data library under accession nos. BE693120–BE693210 and BE704450.] PMID:11116087
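
    The estimate of roughly 3690 unique genes in this abstract follows from counting each cluster once and each unclustered EST once. A minimal sketch of that arithmetic, using only the numbers quoted above; the function name `estimate_unique_genes` is hypothetical.

```python
def estimate_unique_genes(n_clusters, n_clustered_ests, n_singletons, n_total):
    """Return (unique gene estimate, redundancy fraction).

    Each cluster and each unclustered EST (singleton) is counted as one gene.
    """
    if n_clustered_ests + n_singletons != n_total:
        raise ValueError("clustered ESTs and singletons must sum to the total")
    unique = n_clusters + n_singletons
    redundancy = 1.0 - unique / n_total
    return unique, redundancy


# Numbers quoted in the abstract: 359 clusters holding 1771 ESTs,
# 3331 unclustered ESTs, 5102 ESTs in total.
unique, redundancy = estimate_unique_genes(359, 1771, 3331, 5102)
print(unique)                       # 3690
print(f"{redundancy:.1%} of ESTs are redundant under this count")
```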

  19. Analysis of expressed sequence tags (ESTs) from a normalized cDNA library and isolation of EST simple sequence repeats from the invasive cotton mealybug Phenacoccus solenopsis.

    PubMed

    Li, Hui; Lang, Kun-Ling; Fu, Hai-Bin; Shen, Chang-Peng; Wan, Fang-Hao; Chu, Dong

    2015-12-01

    The cotton mealybug, Phenacoccus solenopsis Tinsley, is a serious and invasive pest. At present, genetic resources for studying P. solenopsis are limited, and this negatively affects genetic research on the organism and, consequently, translational work to improve management of this pest. In the present study, expressed sequence tags (ESTs) were analyzed from a normalized complementary DNA library of P. solenopsis. In addition, EST-derived microsatellite loci (also known as simple sequence repeats or SSRs) were isolated and characterized. A total of 1107 high-quality ESTs were acquired from the library. Clustering and assembly analysis resulted in 785 unigenes, which were classified functionally into 23 categories according to the Gene Ontology database. Seven EST-based SSR markers were developed in this study and are expected to be useful in characterizing how this invasive species was introduced, as well as providing insights into its genetic microevolution.

  20. Expressed Sequence Tags Analysis and Design of Simple Sequence Repeats Markers from a Full-Length cDNA Library in Perilla frutescens (L.).

    PubMed

    Seong, Eun Soo; Yoo, Ji Hye; Choi, Jae Hoo; Kim, Chang Heum; Jeon, Mi Ran; Kang, Byeong Ju; Lee, Jae Geun; Choi, Seon Kang; Ghimire, Bimal Kumar; Yu, Chang Yeon

    2015-01-01

    Perilla frutescens is valuable as a medicinal plant as well as a natural medicine and functional food. However, comparative genomics analyses of P. frutescens are limited due to a lack of gene annotations and characterization. A full-length cDNA library from P. frutescens leaves was constructed to identify functional gene clusters and probable EST-SSR markers via analysis of 1,056 expressed sequence tags. Unigene assembly was performed using basic local alignment search tool (BLAST) homology searches, and the unigenes were annotated with Gene Ontology (GO) terms. Primer pairs were designed for a total of 18 simple sequence repeats (SSRs). This study is the first to report comparative genomics and EST-SSR markers from P. frutescens; these will help gene discovery and provide an important source for functional genomics and molecular genetic research in this interesting medicinal plant.

  1. Expressed Sequence Tags Analysis and Design of Simple Sequence Repeats Markers from a Full-Length cDNA Library in Perilla frutescens (L.)

    PubMed Central

    Seong, Eun Soo; Yoo, Ji Hye; Choi, Jae Hoo; Kim, Chang Heum; Jeon, Mi Ran; Kang, Byeong Ju; Lee, Jae Geun; Choi, Seon Kang; Ghimire, Bimal Kumar; Yu, Chang Yeon

    2015-01-01

    Perilla frutescens is valuable as a medicinal plant as well as a natural medicine and functional food. However, comparative genomics analyses of P. frutescens are limited due to a lack of gene annotations and characterization. A full-length cDNA library from P. frutescens leaves was constructed to identify functional gene clusters and probable EST-SSR markers via analysis of 1,056 expressed sequence tags. Unigene assembly was performed using basic local alignment search tool (BLAST) homology searches, and the unigenes were annotated with Gene Ontology (GO) terms. Primer pairs were designed for a total of 18 simple sequence repeats (SSRs). This study is the first to report comparative genomics and EST-SSR markers from P. frutescens; these will help gene discovery and provide an important source for functional genomics and molecular genetic research in this interesting medicinal plant. PMID:26664999

  2. Identification and characterization of 43 microsatellite markers derived from expressed sequence tags of the sea cucumber (Apostichopus japonicus)

    NASA Astrophysics Data System (ADS)

    Jiang, Qun; Li, Qi; Yu, Hong; Kong, Lingfeng

    2011-06-01

    The sea cucumber Apostichopus japonicus is a commercially and ecologically important species in China. A total of 3056 potential unigenes were generated after assembling 7597 A. japonicus expressed sequence tags (ESTs) downloaded from GenBank. Two hundred and fifty microsatellite-containing ESTs (8.18%) and 299 simple sequence repeats (SSRs) were detected. The average density of SSRs was 1 per 7.403 kb of EST after redundancy elimination. Di-nucleotide repeat motifs appeared to be the most abundant type with a percentage of 69.90%. Of the 126 primer pairs designed, 90 amplified the expected products and 43 showed polymorphism in 30 individuals tested. The number of alleles per locus ranged from 2 to 26 with an average of 7.0 alleles, and the observed and expected heterozygosities varied from 0.067 to 1.000 and from 0.066 to 0.959, respectively. These new EST-derived microsatellite markers would provide sufficient polymorphism for population genetic studies and genome mapping of this sea cucumber species.
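
    Detecting microsatellites in assembled ESTs, as described in the record above, is commonly done by scanning each sequence for short motifs repeated in tandem. The abstract does not name the tool or thresholds used, so the following is only a minimal Python sketch under assumed settings: the function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are hypothetical choices for illustration, and only perfect repeats of 2 to 6 bases are reported.

```python
import re

# Hypothetical minimum repeat counts per motif length (assumed, not from the paper).
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}

def find_ssrs(seq):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for size, min_rep in MIN_REPEATS.items():
        # One motif of `size` bases followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA", "ATAT").
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((m.start(), motif, len(m.group(0)) // size))
    return sorted(hits)

# Made-up sequence for illustration: a 7-copy AT repeat flanked by other bases.
print(find_ssrs("GGC" + "AT" * 7 + "CCG"))  # [(3, 'AT', 7)]

# Share of unigenes containing a microsatellite, from the abstract's figures.
print(round(100 * 250 / 3056, 2))  # 8.18
```

    The per-length thresholds matter: lowering them reports more, shorter repeats, which changes both the SSR count and the density figure.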

  3. Gene expression profiling of coelomic cells and discovery of immune-related genes in the earthworm, Eisenia andrei, using expressed sequence tags.

    PubMed

    Tak, Eun Sik; Cho, Sung-Jin; Park, Soon Cheol

    2015-01-01

    The coelomic cells of the earthworm consist of leukocytes, chloragocytes, and coelomocytes, which play an important role in innate immune reactions. To gain insight into the expression profiles of coelomic cells of the earthworm Eisenia andrei, we analyzed 1151 expressed sequence tags (ESTs) derived from a cDNA library of the coelomic cells. Among the 1151 ESTs analyzed, 493 ESTs (42.8%) showed significant similarity to known genes and represented 164 unique genes, of which 93 were singletons and 71 were represented by two or more ESTs. Among the 164 unique genes, we found 24 immune-related and cell defense genes. Furthermore, real-time PCR analysis showed that mRNA levels of lysenin-related proteins in coelomic cells of E. andrei were upregulated after the injection of Bacillus subtilis bacteria. This EST data set provides a valuable resource for future research on the earthworm immune system.

  4. ESTPiper – a web-based analysis pipeline for expressed sequence tags

    PubMed Central

    Tang, Zuojian; Choi, Jeong-Hyeon; Hemmerich, Chris; Sarangi, Ankita; Colbourne, John K; Dong, Qunfeng

    2009-01-01

    Background: EST sequencing projects are increasing in scale and scope as genome sequencing technologies migrate from core sequencing centers to individual research laboratories. Effectively, generating EST data is no longer a bottleneck for investigators. However, processing large amounts of EST data remains a non-trivial challenge for many. Web-based EST analysis tools are proving to be the most convenient option for biologists when performing their analysis, so these tools must continuously improve their utility to keep in step with the growing needs of research communities. We have developed a web-based EST analysis pipeline called ESTPiper, which streamlines typical large-scale EST analysis components. Results: The intuitive web interface guides users through each step of base calling, data cleaning, assembly, genome alignment, annotation, analysis of gene ontology (GO), and microarray oligonucleotide probe design. Each step is modularized, so a user can execute the steps separately or together in batch mode. In addition, the user has control over the parameters used by the underlying programs. Extensive documentation of ESTPiper's functionality is embedded throughout the web site to facilitate understanding of the required input and interpretation of the computational results. The user can also download intermediate results and port files to separate programs for further analysis. In addition, our server provides a time-stamped description of the run history for reproducibility. The pipeline can also be installed locally, allowing researchers to modify ESTPiper to suit their own needs. Conclusion: ESTPiper streamlines the typical process of EST analysis. The pipeline was initially designed in part to support the Daphnia pulex cDNA sequencing project. A web server hosting ESTPiper is now provided to support projects of all sizes. The software is also freely available from the authors for local installations. PMID:19383159

  5. Rediscovering medicinal plants' potential with OMICS: microsatellite survey in expressed sequence tags of eleven traditional plants with potent antidiabetic properties.

    PubMed

    Sahu, Jagajjit; Sen, Priyabrata; Choudhury, Manabendra Dutta; Dehury, Budheswar; Barooah, Madhumita; Modi, Mahendra Kumar; Talukdar, Anupam Das

    2014-05-01

    Herbal medicines and traditionally used medicinal plants present an untapped potential for novel molecular target discovery using systems science and OMICS biotechnology-driven strategies. Since up to 40% of the world's poor people have no access to government health services, traditional and folk medicines are often the only therapeutics available to them. In this vein, North East (NE) India is recognized for its rich bioresources. As part of the Indo-Burma hotspot, it is regarded as an epicenter of biodiversity for several plants having myriad traditional uses, including medicinal use. However, the improvement of these valuable bioresources through molecular breeding strategies, for example, using genic microsatellites or Simple Sequence Repeats (SSRs) or Expressed Sequence Tags (ESTs)-derived SSRs, has not been fully utilized on a large scale to date. In this study, we identified a total of 47,700 microsatellites from 109,609 ESTs of 11 medicinal plants (pineapple, papaya, noyontara, bitter orange, bermuda grass, ratalu, barbados nut, mango, mulberry, lotus, and guduchi) having proven antidiabetic properties. A total of 58,159 primer pairs were designed for the non-redundant 8060 SSR-positive ESTs and putative functions were assigned to 4483 unique contigs. Among the identified microsatellites, excluding mononucleotide repeats, di-/trinucleotides are predominant, among which repeat motifs of AG/CT and AAG/CTT were most abundant. Similarity search of SSR-containing ESTs and antidiabetic gene sequences revealed 11 microsatellites linked to antidiabetic genes in five plants. GO term enrichment analysis revealed a total of 80 enriched GO terms widely distributed in 53 biological processes, 17 molecular functions, and 10 cellular components associated with the 11 markers. The present study therefore provides concrete insights into the frequency and distribution of SSRs in important medicinal resources. The microsatellite markers reported here markedly add to the genetic

  7. Comparative analysis of expressed sequence tag (EST) libraries in the seagrass Zostera marina subjected to temperature stress.

    PubMed

    Reusch, Thorsten B H; Veron, Amelie S; Preuss, Christoph; Weiner, January; Wissler, Lothar; Beck, Alfred; Klages, Sven; Kube, Michael; Reinhardt, Richard; Bornberg-Bauer, Erich

    2008-01-01

    Global warming is associated with increasing stress and mortality on temperate seagrass beds, in particular during periods of high sea surface temperatures during summer months, adding to existing anthropogenic impacts, such as eutrophication and habitat destruction. We compare several expressed sequence tag (EST) libraries in the ecologically important seagrass Zostera marina (eelgrass) to elucidate the molecular genetic basis of adaptation to environmental extremes. We compared the tentative unigene (TUG) frequencies of libraries derived from leaf and meristematic tissue from a control situation with two experimentally imposed temperature stress conditions and found that TUG composition is markedly different among these conditions (all P < 0.0001). Under heat stress, we find that 63 TUGs are differentially expressed (d.e.) at 25 °C compared with lower, no-stress condition temperatures (4 °C and 17 °C). Approximately one-third of d.e. eelgrass genes were characteristic for the stress response of the terrestrial plant model Arabidopsis thaliana. The changes in gene expression suggest complex photosynthetic adjustments among light-harvesting complexes, reaction center subunits of photosystem I and II, and components of the dark reaction. Genes encoding heat shock proteins and reactive oxygen scavengers were also identified, but their overall frequency was too low to perform statistical tests. In all conditions, the most abundant transcript (3-15%) was a putative metallothionein gene with unknown function. We also find evidence that heat stress may translate to enhanced infection by protists. A total of 210 TUGs contain one or more microsatellites as potential candidates for gene-linked genetic markers. Data are publicly available in a user-friendly database at http://www.uni-muenster.de/Evolution/ebb/Services/zostera.

  8. Mining expressed sequence tags of rapeseed (Brassica napus L.) to predict the drought responsive regulatory network.

    PubMed

    Shamloo-Dashtpagerdi, Roohollah; Razi, Hooman; Ebrahimie, Esmaeil

    2015-07-01

    It is of great significance to understand the regulatory mechanisms by which plants deal with drought stress. Two EST libraries derived from rapeseed (Brassica napus) leaves under non-stressed and drought-stressed conditions were analyzed in order to obtain the transcriptomic landscape of drought-exposed B. napus plants, and also to identify and characterize significant drought-responsive regulatory genes and microRNAs. The functional ontology analysis revealed a substantial shift in the B. napus transcriptome to govern cellular drought responsiveness via different stress-activated mechanisms. The activity of transcription factor and protein kinase modules generally increased in response to drought stress. Twenty-six regulatory genes, consisting of 17 transcription factor genes, eight protein kinase genes, and one protein phosphatase gene, showed significant alterations in expression in response to drought stress. We also found six microRNAs that were differentially expressed during drought stress, supporting the involvement of a post-transcriptional level of regulation in the B. napus drought response. The drought-responsive regulatory network shed light on the significance of some regulatory components involved in the biosynthesis and signaling of various plant hormones (abscisic acid, auxin, and brassinosteroids), the ubiquitin-proteasome system, and signaling through reactive oxygen species (ROS). Our findings suggest a complex and multi-level regulatory system modulating the response to drought stress in B. napus. PMID:26261397

  9. A new view of insect-crustacean relationships II. Inferences from expressed sequence tags and comparisons with neural cladistics.

    PubMed

    Andrew, David R

    2011-05-01

    The enormous diversity of Arthropoda has complicated attempts by systematists to deduce the history of this group in terms of phylogenetic relationships and phenotypic change. Traditional hypotheses regarding the relationships of the major arthropod groups (Chelicerata, Myriapoda, Crustacea, and Hexapoda) focus on suites of morphological characters, whereas phylogenomics relies on large amounts of molecular sequence data to infer evolutionary relationships. The present discussion is based on expressed sequence tags (ESTs) that provide large numbers of short molecular sequences and so provide an abundant source of sequence data for phylogenetic inference. This study presents well-supported phylogenies of diverse arthropod and metazoan outgroup taxa obtained from publicly-available databases. An in-house bioinformatics pipeline has been used to compile and align conserved orthologs from each taxon for maximum likelihood inferences. This approach resolves many currently accepted hypotheses regarding internal relationships between the major groups of Arthropoda, including monophyletic Hexapoda, Tetraconata (Crustacea + Hexapoda), Myriapoda, and Chelicerata sensu lato (Pycnogonida + Euchelicerata). "Crustacea" is a paraphyletic group with some taxa more closely related to the monophyletic Hexapoda. These results support studies that have utilized more restricted EST data for phylogenetic inference, yet they differ in important regards from recently published phylogenies employing nuclear protein-coding sequences. The present results do not, however, depart from other phylogenies that resolve Branchiopoda as the crustacean sister group of Hexapoda. Like other molecular phylogenies, EST-derived phylogenies alone are unable to resolve morphological convergences or evolved reversals and thus omit what may be crucial events in the history of life. For example, molecular data are unable to resolve whether a Hexapod-Branchiopod sister relationship infers a branchiopod

  11. Construction and evaluation of cDNA libraries for large-scale expressed sequence tag sequencing in wheat (Triticum aestivum L.).

    PubMed

    Zhang, D; Choi, D W; Wanamaker, S; Fenton, R D; Chin, A; Malatrasi, M; Turuspekov, Y; Walia, H; Akhunov, E D; Kianian, P; Otto, C; Simons, K; Deal, K R; Echenique, V; Stamova, B; Ross, K; Butler, G E; Strader, L; Verhey, S D; Johnson, R; Altenbach, S; Kothari, K; Tanaka, C; Shah, M M; Laudencia-Chingcuanco, D; Han, P; Miller, R E; Crossman, C C; Chao, S; Lazo, G R; Klueva, N; Gustafson, J P; Kianian, S F; Dubcovsky, J; Walker-Simmons, M K; Gill, K S; Dvorák, J; Anderson, O D; Sorrells, M E; McGuire, P E; Qualset, C O; Nguyen, H T; Close, T J

    2004-10-01

    A total of 37 original cDNA libraries and 9 derivative libraries enriched for rare sequences were produced from Chinese Spring wheat (Triticum aestivum L.), five other hexaploid wheat genotypes (Cheyenne, Brevor, TAM W101, BH1146, Butte 86), tetraploid durum wheat (T. turgidum L.), diploid wheat (T. monococcum L.), and two other diploid members of the grass tribe Triticeae (Aegilops speltoides Tausch and Secale cereale L.). The emphasis in the choice of plant materials for library construction was reproductive development subjected to environmental factors that ultimately affect grain quality and yield, but roots and other tissues were also included. Partial cDNA expressed sequence tags (ESTs) were examined by various measures to assess the quality of these libraries. All ESTs were processed to remove cloning system sequences and contaminants and then assembled using CAP3. Following these processing steps, this assembly yielded 101,107 sequences derived from 89,043 clones, which defined 16,740 contigs and 33,213 singletons, a total of 49,953 "unigenes." Analysis of the distribution of these unigenes among the libraries led to the conclusion that the enrichment methods were effective in reducing the most abundant unigenes and to the observation that the most diverse libraries were from tissues exposed to environmental stresses including heat, drought, salinity, or low temperature. PMID:15514038

  12. Exploiting expressed sequence tag databases for the development and characterization of gene-derived simple sequence repeat markers in the opium poppy (Papaver somniferum L.) for forensic applications.

    PubMed

    Lee, Eun Jung; Jin, Gang Nam; Lee, Kyung Lyong; Han, Myun Soo; Lee, Yang Han; Yang, Moon Sik

    2011-09-01

    Simple sequence repeat (SSR) markers in the opium poppy (Papaver somniferum L.) were identified from an expressed sequence tag (EST) database comprising 20,340 sequences. In total, 2780 SSR-containing sequences were identified. The most frequent microsatellite had an AT/TA motif (37%). Twenty-two opium poppy EST-SSR markers were developed, and polymorphisms of six markers (psom 2, 4, 12, 13, 17, and 22) were utilized in 135 individuals under narcotic control investigation. An average of three alleles per locus (range: 2-5 alleles) with a mean heterozygosity of 0.167 was detected. Six loci identified 29 unique profiles in 135 individuals. The EST-SSR markers exhibited small degrees of genetic differentiation (fixation index = 0.727, p < 0.001). Other variable markers will be needed to facilitate the forensic identification of the opium poppy for future cases. To determine the potential for cross-species amplification, six markers were tested in five species of the genus Papaver and two of the genus Eschscholzia. The psom 4 and psom 17 primer pairs were transferable. This is the first study to report SSR markers of the opium poppy.
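
    Per-locus statistics such as the number of alleles and heterozygosity, reported in the record above, can be derived directly from diploid genotype calls. The abstract does not state which estimator was used, so the sketch below assumes the plain (uncorrected) expected heterozygosity, 1 minus the sum of squared allele frequencies. The function name `heterozygosity` and the genotype values are made up for illustration and are not data from the study.

```python
from collections import Counter

def heterozygosity(genotypes):
    """genotypes: list of (allele_a, allele_b) tuples for one locus.

    Returns (observed, expected, n_alleles), where expected is the
    uncorrected value 1 - sum(p_i ** 2) over allele frequencies p_i.
    """
    n = len(genotypes)
    # Observed: fraction of individuals carrying two different alleles.
    observed = sum(a != b for a, b in genotypes) / n
    # Allele counts across both chromosomes of every individual.
    counts = Counter(allele for pair in genotypes for allele in pair)
    total = 2 * n
    expected = 1.0 - sum((c / total) ** 2 for c in counts.values())
    return observed, expected, len(counts)

# Hypothetical allele sizes (in bp) for four individuals at one locus.
sample = [(150, 150), (150, 154), (154, 154), (150, 150)]
ho, he, n_alleles = heterozygosity(sample)
print(ho, he, n_alleles)  # 0.25 0.46875 2
```

    Published studies often apply a small-sample correction to the expected value, so results from this sketch may differ slightly from figures produced by population genetics packages.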

  13. Construction and Evaluation of cDNA Libraries for Large-Scale Expressed Sequence Tag Sequencing in Wheat (Triticum aestivum L.)

    PubMed Central

    Zhang, D.; Choi, D. W.; Wanamaker, S.; Fenton, R. D.; Chin, A.; Malatrasi, M.; Turuspekov, Y.; Walia, H.; Akhunov, E. D.; Kianian, P.; Otto, C.; Simons, K.; Deal, K. R.; Echenique, V.; Stamova, B.; Ross, K.; Butler, G. E.; Strader, L.; Verhey, S. D.; Johnson, R.; Altenbach, S.; Kothari, K.; Tanaka, C.; Shah, M. M.; Laudencia-Chingcuanco, D.; Han, P.; Miller, R. E.; Crossman, C. C.; Chao, S.; Lazo, G. R.; Klueva, N.; Gustafson, J. P.; Kianian, S. F.; Dubcovsky, J.; Walker-Simmons, M. K.; Gill, K. S.; Dvořák, J.; Anderson, O. D.; Sorrells, M. E.; McGuire, P. E.; Qualset, C. O.; Nguyen, H. T.; Close, T. J.

    2004-01-01

    A total of 37 original cDNA libraries and 9 derivative libraries enriched for rare sequences were produced from Chinese Spring wheat (Triticum aestivum L.), five other hexaploid wheat genotypes (Cheyenne, Brevor, TAM W101, BH1146, Butte 86), tetraploid durum wheat (T. turgidum L.), diploid wheat (T. monococcum L.), and two other diploid members of the grass tribe Triticeae (Aegilops speltoides Tausch and Secale cereale L.). The emphasis in the choice of plant materials for library construction was reproductive development subjected to environmental factors that ultimately affect grain quality and yield, but roots and other tissues were also included. Partial cDNA expressed sequence tags (ESTs) were examined by various measures to assess the quality of these libraries. All ESTs were processed to remove cloning system sequences and contaminants and then assembled using CAP3. Following these processing steps, this assembly yielded 101,107 sequences derived from 89,043 clones, which defined 16,740 contigs and 33,213 singletons, a total of 49,953 “unigenes.” Analysis of the distribution of these unigenes among the libraries led to the conclusion that the enrichment methods were effective in reducing the most abundant unigenes and to the observation that the most diverse libraries were from tissues exposed to environmental stresses including heat, drought, salinity, or low temperature. PMID:15514038

  14. Micro- and minisatellite-expressed sequence tag (EST) markers discriminate between populations of Rhipicephalus appendiculatus.

    PubMed

    Kanduma, Esther G; Mwacharo, Joram M; Sunter, Jack D; Nzuki, Inosters; Mwaura, Stephen; Kinyanjui, Peter W; Kibe, Michael; Heyne, Heloise; Hanotte, Olivier; Skilton, Robert A; Bishop, Richard P

    2012-06-01

    Biological differences, including vector competence for the protozoan parasite Theileria parva have been reported among populations of Rhipicephalus appendiculatus (Acari: Ixodidae) from different geographic regions. However, the genetic diversity and population structure of this important tick vector remain unknown due to the absence of appropriate genetic markers. Here, we describe the development and evaluation of a panel of EST micro- and minisatellite markers to characterize the genetic diversity within and between populations of R. appendiculatus and other rhipicephaline species. Sixty-six micro- and minisatellite markers were identified through analysis of the R. appendiculatus Gene Index (RaGI) EST database and selected bacterial artificial chromosome (BAC) sequences. These were used to genotype 979 individual ticks from 10 field populations, 10 laboratory-bred stocks, and 5 additional Rhipicephalus species. Twenty-nine markers were polymorphic and therefore informative for genetic studies while 6 were monomorphic. Primers designed from the remaining 31 loci did not reliably generate amplicons. The 29 polymorphic markers discriminated populations of R. appendiculatus and also 4 other Rhipicephalus species, but not R. zambeziensis. The percentage Principal Component Analysis (PCA) implemented using Multiple Co-inertia Analysis (MCoA) clustered populations of R. appendiculatus into 2 groups. Individual markers however differed in their ability to generate the reference typology using the MCoA approach. This indicates that different panels of markers may be required for different applications. The 29 informative polymorphic micro- and minisatellite markers are the first available tools for the analysis of the phylogeography and population genetics of R. appendiculatus. PMID:22789728

  15. Exploring the Host Parasitism of the Migratory Plant-Parasitic Nematode Ditylenchus destuctor by Expressed Sequence Tags Analysis

    PubMed Central

    Peng, Huan; Gao, Bing-li; Kong, Ling-an; Yu, Qing; Huang, Wen-kun; He, Xu-feng; Long, Hai-bo; Peng, De-liang

    2013-01-01

    The potato rot nematode, Ditylenchus destructor, is a very destructive nematode pest on many agriculturally important crops worldwide, but the molecular characterization of its parasitism of plants has been limited. The effectors involved in nematode parasitism of plants for several sedentary endo-parasitic nematodes such as Heterodera glycines, Globodera rostochiensis and Meloidogyne incognita have been identified and extensively studied over the past two decades. Ditylenchus destructor, as a migratory plant-parasitic nematode, has different feeding behavior, life cycle and host response. Comparing the transcriptome and parasitome among different types of plant-parasitic nematodes is the way to understand more fully the parasitic mechanism of plant nematodes. We undertook the approach of sequencing expressed sequence tags (ESTs) derived from a mixed-stage cDNA library of D. destructor. This is the first study of D. destructor ESTs. A total of 9800 ESTs were grouped into 5008 clusters including 3606 singletons and 1402 multi-member contigs, representing a catalog of D. destructor genes. Implementing a bioinformatics workflow, we found 1391 clusters have no match in the available gene database; 31 clusters only have similarities to genes identified from D. africanus, the most closely related species to D. destructor; 1991 clusters were annotated using Gene Ontology (GO); 1550 clusters were assigned enzyme commission (EC) numbers; and 1211 clusters were mapped to 181 KEGG biochemical pathways. 22 ESTs had similarities to reported nematode effectors. Interestingly, most of the effectors identified in this study are involved in host cell wall degradation or modification, such as 1,4-beta-glucanase, 1,3-beta-glucanase, pectate lyase, chitinases and expansin, or host defense suppression such as calreticulin, annexin and venom allergen-like protein. This result implies that the migratory plant-parasitic nematode D. destructor secretes similar effectors to those of sedentary

  16. Diversity analysis in Cannabis sativa based on large-scale development of expressed sequence tag-derived simple sequence repeat markers.

    PubMed

    Gao, Chunsheng; Xin, Pengfei; Cheng, Chaohua; Tang, Qing; Chen, Ping; Wang, Changbiao; Zang, Gonggu; Zhao, Lining

    2014-01-01

    Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SSRs were identified from 3,624 ESTs. From there, a total of 3,442 complementary primer pairs were designed as SSR markers. Among these markers, trinucleotide repeat motifs (50.99%) were the most abundant, followed by hexanucleotide (25.13%), dinucleotide (16.34%), tetranucleotide (3.8%), and pentanucleotide (3.74%) repeat motifs, respectively. The AAG/CTT trinucleotide repeat (17.96%) was the most abundant motif detected in the SSRs. One hundred and seventeen EST-SSR markers were randomly selected to evaluate primer quality in 24 cannabis varieties. Among these 117 markers, 108 (92.31%) were successfully amplified and 87 (74.36%) were polymorphic. Forty-five polymorphic primer pairs were selected to evaluate genetic diversity and relatedness among the 115 cannabis genotypes. The results showed that 115 varieties could be divided into 4 groups primarily based on geography: Northern China, Europe, Central China, and Southern China. Moreover, the coefficient of similarity when comparing cannabis from Northern China with the European group cannabis was higher than that when comparing with cannabis from the other two groups, owing to a similar climate. This study outlines the first large-scale development of SSR markers for cannabis. These data may serve as a foundation for the development of genetic linkage, quantitative trait loci mapping, and marker-assisted breeding of cannabis.

  17. Diversity analysis in Cannabis sativa based on large-scale development of expressed sequence tag-derived simple sequence repeat markers.

    PubMed

    Gao, Chunsheng; Xin, Pengfei; Cheng, Chaohua; Tang, Qing; Chen, Ping; Wang, Changbiao; Zang, Gonggu; Zhao, Lining

    2014-01-01

    Cannabis sativa L. is an important economic plant for the production of food, fiber, oils, and intoxicants. However, lack of sufficient simple sequence repeat (SSR) markers has limited the development of cannabis genetic research. Here, large-scale development of expressed sequence tag simple sequence repeat (EST-SSR) markers was performed to obtain more informative genetic markers, and to assess genetic diversity in cannabis (Cannabis sativa L.). Based on the cannabis transcriptome, 4,577 SSRs were identified from 3,624 ESTs. From there, a total of 3,442 complementary primer pairs were designed as SSR markers. Among these markers, trinucleotide repeat motifs (50.99%) were the most abundant, followed by hexanucleotide (25.13%), dinucleotide (16.34%), tetranucleotide (3.8%), and pentanucleotide (3.74%) repeat motifs, respectively. The AAG/CTT trinucleotide repeat (17.96%) was the most abundant motif detected in the SSRs. One hundred and seventeen EST-SSR markers were randomly selected to evaluate primer quality in 24 cannabis varieties. Among these 117 markers, 108 (92.31%) were successfully amplified and 87 (74.36%) were polymorphic. Forty-five polymorphic primer pairs were selected to evaluate genetic diversity and relatedness among the 115 cannabis genotypes. The results showed that 115 varieties could be divided into 4 groups primarily based on geography: Northern China, Europe, Central China, and Southern China. Moreover, the coefficient of similarity when comparing cannabis from Northern China with the European group cannabis was higher than that when comparing with cannabis from the other two groups, owing to a similar climate. This study outlines the first large-scale development of SSR markers for cannabis. These data may serve as a foundation for the development of genetic linkage, quantitative trait loci mapping, and marker-assisted breeding of cannabis. PMID:25329551

  18. Expressed sequence-tag analysis of ovaries of Brachiaria brizantha reveals genes associated with the early steps of embryo sac differentiation of apomictic plants.

    PubMed

    Silveira, Erica Duarte; Guimarães, Larissa Arrais; de Alencar Dusi, Diva Maria; da Silva, Felipe Rodrigues; Martins, Natália Florencio; do Carmo Costa, Marcos Mota; Alves-Ferreira, Márcio; de Campos Carneiro, Vera Tavares

    2012-02-01

    In apomixis, an asexual mode of plant reproduction through seeds, an unreduced megagametophyte is formed due to circumvented or altered meiosis. The embryo develops autonomously from the unreduced egg cell, independently of fertilization. Brachiaria is a genus of tropical forage grasses that reproduces sexually or by apomixis. A limited number of studies have reported the sequencing of apomixis-related genes and a few Brachiaria sequences have been deposited at genebank databases. This work shows sequencing and expression analyses of expressed sequence-tags (ESTs) of the genus Brachiaria and points to transcripts from ovaries with preferential expression at megasporogenesis in apomictic plants. Of the 11 differentially expressed sequences from immature ovaries of sexual and apomictic Brachiaria brizantha obtained from macroarray analysis, 9 were preferentially detected in ovaries of apomicts, as confirmed by RT-qPCR. A putative involvement in early steps of Panicum-type embryo sac differentiation is suggested for four sequences from B. brizantha ovaries: BbrizHelic, BbrizRan, BbrizSec13 and BbrizSti1. Two of these, BbrizSti1 and BbrizHelic, with similarity to a gene coding for a stress-induced protein and a helicase, respectively, are preferentially expressed in the early stages of apomictic ovary development, especially in the nucellus, in a stage previous to the differentiation of aposporous initials, as verified by in situ hybridization.

  19. Development and characterization of 1,827 expressed sequence tag-derived simple sequence repeat markers for ramie (Boehmeria nivea L. Gaud).

    PubMed

    Liu, Touming; Zhu, Siyuan; Fu, Lili; Tang, Qingming; Yu, Yongting; Chen, Ping; Luan, Mingbao; Wang, Changbiao; Tang, Shouwei

    2013-01-01

    Ramie (Boehmeria nivea L. Gaud) is one of the most important natural fiber crops, and improvement of fiber yield and quality is the main goal in efforts to breed superior cultivars. However, efforts aimed at enhancing the understanding of ramie genetics and developing more effective breeding strategies have been hampered by the shortage of simple sequence repeat (SSR) markers. In our previous study, we had assembled de novo 43,990 expressed sequence tags (ESTs). In the present study, we searched these previously assembled ESTs for SSRs and identified 1,685 ESTs (3.83%) containing 1,878 SSRs. Next, we designed 1,827 primer pairs complementary to regions flanking these SSRs, and these regions were designated as SSR markers. Among these markers, dinucleotide and trinucleotide repeat motifs were the most abundant types (36.4% and 36.3%, respectively), whereas tetranucleotide, pentanucleotide, and hexanucleotide motifs represented <10% of the markers. The motif AG/CT was the most abundant, accounting for 28.74% of the markers. One hundred EST-SSR markers (97 SSRs located in genes encoding transcription factors and 3 SSRs in genes encoding cellulose synthases) were amplified using polymerase chain reaction for detecting 24 ramie varieties. Of these 100 markers, 98 markers were successfully amplified and 81 markers were polymorphic, with 2-6 alleles among the 24 varieties. Analysis of the genetic diversity of all 24 varieties revealed similarity coefficients that ranged from 0.51 to 0.80. The EST-SSRs developed in this study represent the first large-scale development of SSR markers for ramie. These SSR markers could be used for development of genetic and physical maps, quantitative trait loci mapping, genetic diversity studies, association mapping, and cultivar fingerprinting.
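
    The abstract reports similarity coefficients of 0.51 to 0.80 among the 24 varieties but does not name the coefficient used. As a minimal sketch, assuming each variety is scored as a 0/1 presence/absence profile across marker alleles, and assuming the Jaccard coefficient as a stand-in, a pairwise similarity could be computed as below. The function name and the example profiles are hypothetical and are not data from the study.

```python
def jaccard_similarity(profile_a, profile_b):
    """Jaccard coefficient between two equal-length 0/1 allele profiles.

    Assumption: the study's actual coefficient is not stated in the
    abstract; Jaccard is used here purely for illustration.
    """
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same length")
    # Alleles present in both varieties.
    shared = sum(1 for a, b in zip(profile_a, profile_b) if a and b)
    # Alleles present in at least one variety.
    either = sum(1 for a, b in zip(profile_a, profile_b) if a or b)
    return shared / either if either else 0.0


# Hypothetical allele profiles for two varieties (1 = band present).
variety_1 = [1, 1, 0, 1, 0]
variety_2 = [1, 0, 0, 1, 1]
similarity = jaccard_similarity(variety_1, variety_2)
```

    Alleles absent from both varieties are ignored by this coefficient, which is one reason it is often preferred over simple matching for marker data.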

  20. Identification of genes expressed in human CD34+ hematopoietic stem/progenitor cells by expressed sequence tags and efficient full-length cDNA cloning

    PubMed Central

    Mao, Mao; Fu, Gang; Wu, Ji-Sheng; Zhang, Qing-Hua; Zhou, Jun; Kan, Li-Xin; Huang, Qiu-Hua; He, Kai-Li; Gu, Bai-Wei; Han, Ze-Guang; Shen, Yu; Gu, Jian; Yu, Ya-Ping; Xu, Shu-Hua; Wang, Ya-Xin; Chen, Sai-Juan; Chen, Zhu

    1998-01-01

    Hematopoietic stem/progenitor cells (HSPCs) possess the potential for self-renewal, proliferation, and differentiation toward different lineages of blood cells. These cells not only play a primordial role in hematopoietic development but also have important clinical application. Characterization of the gene expression profile in CD34+ HSPCs may lead to a better understanding of the regulation of normal and pathological hematopoiesis. In the present work, genes expressed in human umbilical cord blood CD34+ cells were catalogued by partially sequencing a large number of cDNA clones [or expressed sequence tags (ESTs)] and analyzing these sequences with the tools of bioinformatics. Among 9,866 ESTs thus obtained, 4,697 (47.6%) showed identity to known genes in the GenBank database, 2,603 (26.4%) matched to the ESTs previously deposited in a public domain database, 1,415 (14.3%) were previously undescribed ESTs, and the remaining 1,151 (11.7%) were mitochondrial DNA, ribosomal RNA, or repetitive (Alu or L1) sequences. Integration of ESTs of known genes generated a profile including 855 genes that could be divided into different categories according to their functions. Some (8.2%) of the genes in this profile were considered related to early hematopoiesis. The possible functions of ESTs corresponding to so far unknown genes were approached by means of homology and functional motif searches. Moreover, attempts were made to generate libraries enriched for full-length cDNAs, to better explore the genes in HSPCs. Nearly 60% of the cDNA clones of mRNA under 2 kb in our libraries had 5′ ends upstream of the first ATG codon of the ORF. With this satisfactory result, we have developed an efficient working system that allowed fast sequencing of 32 full-length cDNAs, 16 of them being mapped to the chromosomes with radiation hybrid panels. This work may lay a basis for the further research on the molecular network of hematopoietic regulation. PMID:9653160
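
    The category breakdown quoted in the abstract can be checked with a few lines of arithmetic. The sketch below uses only the counts given above; the dictionary keys and the helper name are illustrative labels, not terms from the paper.

```python
# Counts as reported in the abstract.
TOTAL_ESTS = 9866
CATEGORY_COUNTS = {
    "known genes (GenBank)": 4697,
    "previously deposited ESTs": 2603,
    "previously undescribed ESTs": 1415,
    "mitochondrial DNA, rRNA, or repeats": 1151,
}


def category_percentages(counts, total):
    """Return each category's share of the total, rounded to one decimal."""
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


percentages = category_percentages(CATEGORY_COUNTS, TOTAL_ESTS)
# The four categories should account for every EST.
all_accounted_for = sum(CATEGORY_COUNTS.values()) == TOTAL_ESTS
```

    The four counts sum to the stated total, and the rounded shares agree with the percentages in the abstract.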

  1. Analyses of expressed sequence tags from the maize foliar pathogen Cercospora zeae-maydis identify novel genes expressed during vegetative, infectious, and reproductive growth

    PubMed Central

    Bluhm, Burton H; Dhillon, Braham; Lindquist, Erika A; Kema, Gert HJ; Goodwin, Stephen B; Dunkle, Larry D

    2008-01-01

    Background The ascomycete fungus Cercospora zeae-maydis is an aggressive foliar pathogen of maize that causes substantial losses annually throughout the Western Hemisphere. Despite its impact on maize production, little is known about the regulation of pathogenesis in C. zeae-maydis at the molecular level. The objectives of this study were to generate a collection of expressed sequence tags (ESTs) from C. zeae-maydis and evaluate their expression during vegetative, infectious, and reproductive growth. Results A total of 27,551 ESTs was obtained from five cDNA libraries constructed from vegetative and sporulating cultures of C. zeae-maydis. The ESTs, grouped into 4088 clusters and 531 singlets, represented 4619 putative unique genes. Of these, 36% encoded proteins similar (E value ≤ 10^-5) to characterized or annotated proteins from the NCBI non-redundant database representing diverse molecular functions and biological processes based on Gene Ontology (GO) classification. We identified numerous, previously undescribed genes with potential roles in photoreception, pathogenesis, and the regulation of development as well as Zephyr, a novel, actively transcribed transposable element. Differential expression of selected genes was demonstrated by real-time PCR, supporting their proposed roles in vegetative, infectious, and reproductive growth. Conclusion Novel genes that are potentially involved in regulating growth, development, and pathogenesis were identified in C. zeae-maydis, providing specific targets for characterization by molecular genetics and functional genomics. The EST data establish a foundation for future studies in evolutionary and comparative genomics among species of Cercospora and other groups of plant pathogenic fungi. PMID:18983654
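
    The similarity criterion above (an E value of at most 1e-5 against the NCBI non-redundant database) amounts to a threshold filter over BLAST hits. The following is a minimal sketch, assuming the hits are available as BLAST+ tabular output (-outfmt 6) with the default column order, in which the E value is the 11th tab-separated field. The abstract does not describe the authors' actual pipeline; the function name and the sample rows are invented for illustration.

```python
def annotated_queries(tabular_lines, max_evalue=1e-5):
    """Return query IDs that have at least one hit with E value <= max_evalue.

    Assumes default BLAST+ tabular columns: qseqid, sseqid, pident, length,
    mismatch, gapopen, qstart, qend, sstart, send, evalue, bitscore.
    """
    keep = set()
    for line in tabular_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query_id, evalue = fields[0], float(fields[10])
        if evalue <= max_evalue:
            keep.add(query_id)
    return keep


# Invented example rows (not data from the study).
sample_hits = [
    "cluster_0001\tprot_A\t85.0\t200\t30\t0\t1\t600\t1\t200\t2e-30\t250",
    "cluster_0002\tprot_B\t40.0\t50\t30\t0\t1\t150\t1\t50\t0.5\t30",
    "cluster_0003\tprot_C\t60.0\t90\t36\t0\t1\t270\t1\t90\t1e-05\t60",
]
similar = annotated_queries(sample_hits)
```

    Dividing the size of the returned set by the number of unique sequences would give the annotated fraction, which the abstract reports as 36% of 4619 putative unique genes.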

  2. A Unique Set of 11,008 Onion Expressed Sequence Tags Reveals Expressed Sequence and Genomic Differences between the Monocot Orders Asparagales and Poales

    PubMed Central

    Kuhl, Joseph C.; Cheung, Foo; Yuan, Qiaoping; Martin, William; Zewdie, Yayeh; McCallum, John; Catanach, Andrew; Rutherford, Paul; Sink, Kenneth C.; Jenderek, Maria; Prince, James P.; Town, Christopher D.; Havey, Michael J.

    2004-01-01

    Enormous genomic resources have been developed for plants in the monocot order Poales; however, it is not clear how representative the Poales are for the monocots as a whole. The Asparagales are a monophyletic order sister to the lineage carrying the Poales and possess economically important plants such as asparagus, garlic, and onion. To assess the genomic differences between the Asparagales and Poales, we generated 11,008 unique ESTs from a normalized cDNA library of onion. Sequence analyses of these ESTs revealed microsatellite markers, single nucleotide polymorphisms, and homologs of transposable elements. Mean nucleotide similarity between rice and the Asparagales was 78% across coding regions. Expressed sequence and genomic comparisons revealed strong differences between the Asparagales and Poales for codon usage and mean GC content, GC distribution, and relative GC content at each codon position, indicating that genomic characteristics are not uniform across the monocots. The Asparagales were more similar to eudicots than to the Poales for these genomic characteristics. PMID:14671025

  3. A molecular analysis of desiccation tolerance mechanisms in the anhydrobiotic nematode Panagrolaimus superbus using expressed sequenced tags

    PubMed Central

    2012-01-01

    Background Some organisms can survive extreme desiccation by entering into a state of suspended animation known as anhydrobiosis. Panagrolaimus superbus is a free-living anhydrobiotic nematode that can survive rapid environmental desiccation. The mechanisms that P. superbus uses to combat the potentially lethal effects of cellular dehydration may include the constitutive and inducible expression of protective molecules, along with behavioural and/or morphological adaptations that slow the rate of cellular water loss. In addition, inducible repair and revival programmes may also be required for successful rehydration and recovery from anhydrobiosis. Results To identify constitutively expressed candidate anhydrobiotic genes we obtained 9,216 ESTs from an unstressed mixed stage population of P. superbus. We derived 4,009 unigenes from these ESTs. These unigene annotations and sequences can be accessed at http://www.nematodes.org/nembase4/species_info.php?species=PSC. We manually annotated a set of 187 constitutively expressed candidate anhydrobiotic genes from P. superbus. Notable among those is a putative lineage expansion of the lea (late embryogenesis abundant) gene family. The most abundantly expressed sequence was a member of the nematode specific sxp/ral-2 family that is highly expressed in parasitic nematodes and secreted onto the surface of the nematodes' cuticles. There were 2,059 novel unigenes (51.7% of the total), 149 of which are predicted to encode intrinsically disordered proteins lacking a fixed tertiary structure. One unigene may encode an exo-β-1,3-glucanase (GHF5 family), most similar to a sequence from Phytophthora infestans. GHF5 enzymes have been reported from several species of plant parasitic nematodes, with horizontal gene transfer (HGT) from bacteria proposed to explain their evolutionary origin. This P. superbus sequence represents another possible HGT event within the Nematoda. The expression of five of the 19 putative stress response

  4. An analysis of expressed sequence tags of developing castor endosperm using a full-length cDNA library

    PubMed Central

    Lu, Chaofu; Wallis, James G; Browse, John

    2007-01-01

    Background Castor seeds are a major source for ricinoleate, an important industrial raw material. Genomics studies of the castor plant will provide critical information for understanding seed metabolism, for effectively engineering ricinoleate production in transgenic oilseeds, or for genetically improving castor plants by eliminating toxic and allergic proteins in seeds. Results Full-length cDNAs are useful resources in annotating genes and in providing functional analysis of genes and their products. We constructed a full-length cDNA library from developing castor endosperm, and obtained 4,720 ESTs from 5'-ends of the cDNA clones representing 1,908 unique sequences. The most abundant transcripts are genes encoding storage proteins, ricin, agglutinin and oleosins. Several other sequences are also very numerous, including two acidic triacylglycerol lipases, and the oleate hydroxylase (FAH12) gene that is responsible for ricinoleate biosynthesis. The role(s) of the lipases in developing castor seeds are not clear, and co-expression of a lipase and FAH12 did not result in significant changes in hydroxy fatty acid accumulation in transgenic Arabidopsis seeds. Only one oleate desaturase (FAD2) gene was identified in our cDNA sequences. Sequence and functional analyses of the castor FAD2 were carried out since it had not been characterized previously. Overexpression of castor FAD2 in a FAH12-expressing Arabidopsis line resulted in decreased accumulation of hydroxy fatty acids in transgenic seeds. Conclusion Our results suggest that transcriptional regulation of FAD2 and FAH12 genes may be one of the mechanisms that contribute to a high level of ricinoleate accumulation in castor endosperm. The full-length cDNA library will be used to search for additional genes that affect ricinoleate accumulation in seed oils. Our EST sequences will also be useful to annotate the castor genome, which whole sequence is being generated by shotgun sequencing at the Institute for Genome

  5. Analysis of expressed sequence tags from a significant livestock pest, the stable fly (Stomoxys calcitrans), identifies transcripts with a putative role in chemosensation and sex determination.

    PubMed

    Olafson, Pia Untalan; Lohmeyer, Kimberly H; Dowd, Scot E

    2010-07-01

    The stable fly, Stomoxys calcitrans L. (Diptera: Muscidae), is one of the most significant pests of livestock in the United States. The identification of targets for the development of novel control for this pest species, focusing on those molecules that play a role in successful feeding and reproduction, is critical to mitigating its impact on confined and rangeland livestock. A database was developed representing genes expressed at the immature and adult life stages of the stable fly, comprising data obtained from pyrosequencing both immature and adult stages and from small-scale sequencing of an antennal/maxillary palp-expressed sequence tag library. The full-length sequences and expression of 21 transcripts that may have a role in chemosensation are presented, including 13 odorant-binding proteins, 6 chemosensory proteins, and 2 odorant receptors. Transcripts with potential roles in sex determination and reproductive behaviors are identified, including evidence for the sex-specific expression of stable fly doublesex- and transformer-like transcripts. The current database will be a valuable tool for target identification and for comparative studies with other Diptera. PMID:20572127

  6. Analysis of expressed sequence tags from a significant livestock pest, the stable fly (Stomoxys calcitrans), identifies transcripts with a putative role in chemosensation and sex determination.

    PubMed

    Olafson, Pia Untalan; Lohmeyer, Kimberly H; Dowd, Scot E

    2010-07-01

    The stable fly, Stomoxys calcitrans L. (Diptera: Muscidae), is one of the most significant pests of livestock in the United States. The identification of targets for the development of novel control for this pest species, focusing on those molecules that play a role in successful feeding and reproduction, is critical to mitigating its impact on confined and rangeland livestock. A database was developed representing genes expressed at the immature and adult life stages of the stable fly, comprising data obtained from pyrosequencing both immature and adult stages and from small-scale sequencing of an antennal/maxillary palp-expressed sequence tag library. The full-length sequences and expression of 21 transcripts that may have a role in chemosensation are presented, including 13 odorant-binding proteins, 6 chemosensory proteins, and 2 odorant receptors. Transcripts with potential roles in sex determination and reproductive behaviors are identified, including evidence for the sex-specific expression of stable fly doublesex- and transformer-like transcripts. The current database will be a valuable tool for target identification and for comparative studies with other Diptera.

  7. Existence of microsatellites in expressed sequence tags of common carp ( Cyprinus carpio L.) available in GenBank dbEST database

    NASA Astrophysics Data System (ADS)

    Hu, Jingjie; Wang, Xiaolong; Hu, Xiaoli; Bao, Zhenmin

    2006-01-01

    Common carp expressed sequence tags (ESTs) were analyzed for the existence of microsatellites, or simple sequence repeats (SSRs). In the NCBI dbEST database, a total of 10612 sequences were registered before December 31, 2004. A complete search of 2-6 nucleotide microsatellites resulted in the identification of 513 SSR-containing ESTs, accounting for 4.8% of the total. Cluster analysis indicated that 73 sequences of SSR-containing ESTs fell into 27 groups and the remaining 440 ESTs were independent. A total of 467 unique SSR-containing ESTs were identified. These EST-SSRs contained a variety of simple sequence types, and di- and tri-nucleotide repeats were the most abundant, accounting for 42.1% and 27.9% of the whole, respectively. Of the dinucleotide repeats, CA/TG was the most abundant, followed by GA/TC. BLASTx search showed that 38.1% of the SSR loci could be associated with genes or proteins of known or unknown function. BLASTx searches of SSR-containing ESTs also showed high frequencies (98/179) of hits on zebrafish sequences.

  8. Existence of microsatellites in expressed sequence tags of common carp ( Cyprinus carpio L.) available in GenBank dbEST database

    NASA Astrophysics Data System (ADS)

    Jingjie, Hu; Xiaolong, Wang; Xiaoli, Hu; Zhenmin, Bao

    2006-01-01

    Common carp expressed sequence tags (ESTs) were analyzed for the existence of microsatellites, or simple sequence repeats (SSRs). In the NCBI dbEST database, a total of 10612 sequences were registered before December 31, 2004. A complete search of 2-6 nucleotide microsatellites resulted in the identification of 513 SSR-containing ESTs, accounting for 4.8% of the total. Cluster analysis indicated that 73 sequences of SSR-containing ESTs fell into 27 groups and the remaining 440 ESTs were independent. A total of 467 unique SSR-containing ESTs were identified. These EST-SSRs contained a variety of simple sequence types, and di- and tri-nucleotide repeats were the most abundant, accounting for 42.1% and 27.9% of the whole, respectively. Of the dinucleotide repeats, CA/TG was the most abundant, followed by GA/TC. BLASTx search showed that 38.1% of the SSR loci could be associated with genes or proteins of known or unknown function. BLASTx searches of SSR-containing ESTs also showed high frequencies (98/179) of hits on zebrafish sequences.
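
    A search for microsatellites with motifs of 2-6 nucleotides, as described above, can be sketched with regular expressions. This is an illustration only: the abstract does not state which tool or which minimum repeat counts were used, so the thresholds in MIN_REPEATS are assumptions, and the function name and test sequence are hypothetical. Only perfect (uninterrupted) repeats are reported.

```python
import re

# Minimum number of tandem copies per motif length.
# These thresholds are illustrative assumptions, not the study's settings.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return sorted (start, motif, copies) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for size, min_rep in sorted(min_repeats.items()):
        # One motif of `size` bases followed by at least min_rep - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (size, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs built from a shorter unit (e.g. "AA" or "CACA"),
            # so each repeat is reported once under its shortest motif.
            if any(motif == motif[:k] * (size // k)
                   for k in range(1, size) if size % k == 0):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // size))
    return sorted(hits)


# Hypothetical EST fragment: eight CA repeats and five AAG repeats.
example = "GGG" + "CA" * 8 + "TTT" + "AAG" * 5 + "C"
ssrs = find_ssrs(example)
```

    Start positions are 0-based. Counting the sequences with a non-empty result and dividing by the number of sequences searched would give the SSR-containing fraction, reported in the abstract as 513 of 10612 (4.8%).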

  9. The first set of expressed sequence tags (EST) from the medicinal mushroom Agaricus subrufescens delivers resource for gene discovery and marker development.

    PubMed

    Foulongne-Oriol, Marie; Lapalu, Nicolas; Férandon, Cyril; Spataro, Cathy; Ferrer, Nathalie; Amselem, Joelle; Savoie, Jean-Michel

    2014-09-01

    Agaricus subrufescens is one of the most important culinary-medicinal cultivable mushrooms with potentially high-added-value products and extended agronomical valorization. The development of A. subrufescens-related technologies is hampered by, among other things, the lack of suitable molecular tools. Thus, this mushroom is considered a genomic orphan species with a very limited number of available molecular markers or sequences. To fill this gap, this study reports the generation and analysis of the first set of expressed sequence tags (EST) for A. subrufescens. cDNA fragments obtained from young sporophores (SP) and vegetative mycelium in liquid culture (CL) were sequenced using 454 pyrosequencing technology. After the assembly process, 4,989 and 5,125 sequences were obtained in SP and CL libraries, respectively. About 87% of the EST had significant similarity with Agaricus bisporus-predicted proteins, and 79% correspond to known proteins. Functional categorization according to Gene Ontology could be assigned to 49% of the sequences. Some gene families potentially involved in bioactive compound biosynthesis could be identified. A total of 232 simple sequence repeats (SSRs) were identified, and a set of 40 EST-SSR polymorphic markers were successfully developed. This EST dataset provides a new resource for gene discovery and molecular marker development. It constitutes a solid basis for further genetic and genomic studies in A. subrufescens.

  10. Identification and isolation of full-length cDNA sequences by sequencing and analysis of expressed sequence tags from guarana (Paullinia cupana).

    PubMed

    Figueirêdo, L C; Faria-Campos, A C; Astolfi-Filho, S; Azevedo, J L

    2011-01-01

    The current intense production of biological data, generated by sequencing techniques, has created an ever-growing volume of unanalyzed data. We reevaluated data produced by the guarana (Paullinia cupana) transcriptome sequencing project to identify cDNA clones with complete coding sequences (full-length clones) and complete sequences of genes of biotechnological interest, contributing to the knowledge of biological characteristics of this organism. We analyzed 15,490 ESTs of guarana in search of clones with complete coding regions. A total of 12,402 sequences were analyzed using BLAST, and 4697 full-length clones were identified, responsible for the production of 2297 different proteins. Eighty-four clones were identified as full-length for N-methyltransferase and 18 were sequenced in both directions to obtain the complete genome sequence, and confirm the search made in silico for full-length clones. Phylogenetic analyses were made with the complete genome sequences of three clones, which showed only 0.017% dissimilarity; these are phylogenetically close to the caffeine synthase of Theobroma cacao. The search for full-length clones allowed the identification of numerous clones that had the complete coding region, demonstrating this to be an efficient and useful tool in the process of biological data mining. The sequencing of the complete coding region of identified full-length clones corroborated the data from the in silico search, strengthening its efficiency and utility. PMID:21732283

  11. Identification and isolation of full-length cDNA sequences by sequencing and analysis of expressed sequence tags from guarana (Paullinia cupana).

    PubMed

    Figueirêdo, L C; Faria-Campos, A C; Astolfi-Filho, S; Azevedo, J L

    2011-06-21

    The current intense production of biological data, generated by sequencing techniques, has created an ever-growing volume of unanalyzed data. We reevaluated data produced by the guarana (Paullinia cupana) transcriptome sequencing project to identify cDNA clones with complete coding sequences (full-length clones) and complete sequences of genes of biotechnological interest, contributing to the knowledge of biological characteristics of this organism. We analyzed 15,490 ESTs of guarana in search of clones with complete coding regions. A total of 12,402 sequences were analyzed using BLAST, and 4697 full-length clones were identified, responsible for the production of 2297 different proteins. Eighty-four clones were identified as full-length for N-methyltransferase and 18 were sequenced in both directions to obtain the complete genome sequence, and confirm the search made in silico for full-length clones. Phylogenetic analyses were made with the complete genome sequences of three clones, which showed only 0.017% dissimilarity; these are phylogenetically close to the caffeine synthase of Theobroma cacao. The search for full-length clones allowed the identification of numerous clones that had the complete coding region, demonstrating this to be an efficient and useful tool in the process of biological data mining. The sequencing of the complete coding region of identified full-length clones corroborated the data from the in silico search, strengthening its efficiency and utility.

  12. Analysis of expressed sequence tags generated from full-length enriched cDNA libraries of melon

    PubMed Central

    2011-01-01

    Background Melon (Cucumis melo), an economically important vegetable crop, belongs to the Cucurbitaceae family which includes several other important crops such as watermelon, cucumber, and pumpkin. It has served as a model system for sex determination and vascular biology studies. However, genomic resources currently available for melon are limited. Result We constructed eleven full-length enriched and four standard cDNA libraries from fruits, flowers, leaves, roots, cotyledons, and calluses of four different melon genotypes, and generated 71,577 and 22,179 ESTs from full-length enriched and standard cDNA libraries, respectively. These ESTs, together with ~35,000 ESTs available in public domains, were assembled into 24,444 unigenes, which were extensively annotated by comparing their sequences to different protein and functional domain databases, assigning them Gene Ontology (GO) terms, and mapping them onto metabolic pathways. Comparative analysis of melon unigenes and other plant genomes revealed that 75% to 85% of melon unigenes had homologs in other dicot plants, while approximately 70% had homologs in monocot plants. The analysis also identified 6,972 gene families that were conserved across dicot and monocot plants, and 181, 1,192, and 220 gene families specific to fleshy fruit-bearing plants, the Cucurbitaceae family, and melon, respectively. Digital expression analysis identified a total of 175 tissue-specific genes, which provides a valuable gene sequence resource for future genomics and functional studies. Furthermore, we identified 4,068 simple sequence repeats (SSRs) and 3,073 single nucleotide polymorphisms (SNPs) in the melon EST collection. Finally, we obtained a total of 1,382 melon full-length transcripts through the analysis of full-length enriched cDNA clones that were sequenced from both ends. Analysis of these full-length transcripts indicated that sizes of melon 5' and 3' UTRs were similar to those of tomato, but longer than many other dicot

  13. Construction of cDNA library and preliminary analysis of expressed sequence tags from tea plant [Camellia sinensis (L) O. Kuntze].

    PubMed

    Phukon, Munmi; Namdev, Richa; Deka, Diganta; Modi, Mahendra K; Sen, Priyabrata

    2012-09-10

    Tea is the most popular non-alcoholic and healthy beverage across the world. The understanding of the genetic organization and molecular biology of the tea plant, which is very poorly understood at present, is required for a quantum increase in productivity and efficient use of germplasm for either cultivation or breeding programs. Single-pass sequencing of randomly selected cDNA clones is the most widely accepted technique for gene identification and cloning. In the present study, a good quality cDNA library was constructed and preliminary analysis of ESTs was carried out. The titers of unamplified and amplified libraries were 1.4 × 10^6 pfu/ml and 5.27 × 10^8 pfu/ml, respectively. A total of 210 cDNA clones from the constructed cDNA library were sequenced and analyzed. A total of 84 high quality Expressed Sequence Tags (ESTs) were generated, among which 71 ESTs had significant homology with sequences in the NCBI non-redundant protein database by BLASTX analysis. About 80% of ESTs had a poly (A) tail at the 3' end, indicating that the cDNAs were full length. The database-matched ESTs were classified into putative cellular roles, viz. energy-related category (corresponding to 20% of total BLASTX matched ESTs), transcription (14.2%), protein synthesis (14.2%), cell growth and division (8.6%), cell structure (5.7%), signal transduction (5.7%), transporters (2.9%), disease and defenses (2.9%), secondary metabolism (2.9%) and gene regulation (2.9%). This study provides an overview of the mRNA expression profile and first-hand information on gene sequences expressed in tender leaves and apical buds of the tea plant.

  14. A White Campion (Silene latifolia) floral expressed sequence tag (EST) library: annotation, EST-SSR characterization, transferability, and utility for comparative mapping

    PubMed Central

    Moccia, Maria Domenica; Oger-Desfeux, Christine; Marais, Gabriel AB; Widmer, Alex

    2009-01-01

    Background Expressed sequence tag (EST) databases represent a valuable resource for the identification of genes in organisms with uncharacterized genomes and for development of molecular markers. One class of markers derived from EST sequences are simple sequence repeat (SSR) markers, also known as EST-SSRs. These are useful in plant genetic and evolutionary studies because they are located in transcribed genes and a putative function can often be inferred from homology searches. Another important feature of EST-SSR markers is their expected high level of transferability to related species that makes them very promising for comparative mapping. In the present study we constructed a normalized EST library from floral tissue of Silene latifolia with the aim of identifying expressed genes and developing polymorphic molecular markers. Results We obtained a total of 3662 high quality sequences from a normalized Silene cDNA library. These represent 3105 unigenes, with 73% of unigenes matching genes in other species. We found 255 sequences containing one or more SSR motifs. More than 60% of these SSRs were trinucleotides. A total of 30 microsatellite loci were identified from 106 ESTs having sufficient flanking sequences for primer design. The inheritance of these loci was tested via segregation analyses and their usefulness for linkage mapping was assessed in an interspecific cross. Tests for cross-amplification of the EST-SSR loci in other Silene species established their applicability to related species. Conclusion The newly characterized genes and gene-derived markers from our Silene EST library represent a valuable genetic resource for future studies on Silene latifolia and related species. The polymorphism and transferability of EST-SSR markers facilitate comparative linkage mapping and analyses of genetic diversity in the genus Silene. PMID:19467153

  15. Development of expressed sequence tag-based microsatellite markers for the critically endangered Isoëtes sinensis (Isoetaceae) based on transcriptome analysis.

    PubMed

    Gichira, A W; Long, Z C; Wang, Q F; Chen, J M; Liao, K

    2016-01-01

    Isoëtes sinensis is a critically endangered quillwort. To facilitate studies on the conservation genetics of this species, we developed expressed sequence tag-simple sequence repeat (EST-SSR) markers. A total of 50,063 unigenes were predicted by transcriptome sequencing, 5294 (10.6%) of which significantly matched 3011 Gene Ontology annotations and 2363 were assigned to Kyoto Encyclopedia of Genes and Genomes metabolic pathways. Most of these (2297) were involved in metabolism. A total of 1982 SSR motifs were identified, with trinucleotides being the dominant repeat motif, and 1438 (72.6%) SSR primers were designed. Eighteen randomly selected primer pairs were used to genotype 24 I. sinensis accessions, which confirmed the suitability of these novel markers for molecular studies of I. sinensis. The heterozygosity index value ranged between 0.0799 and 0.9106, while the Shannon-Wiener diversity index value ranged between 0.1732 and 2.5589. The EST-SSRs reported in this study are linked to genic sequences, and are therefore ideal for investigating the evolutionary history of I. sinensis. These markers, together with the large EST dataset generated in this study, will greatly facilitate conservation genetic studies of I. sinensis. PMID:27525847

  16. Identification of Genes with Potential Roles in Apple Fruit Development and Biochemistry through Large-Scale Statistical Analysis of Expressed Sequence Tags

    PubMed Central

    Park, Sunchung; Sugimoto, Nobuko; Larson, Matthew D.; Beaudry, Randy; van Nocker, Steven

    2006-01-01

    Advanced studies of apple (Malus domestica Borkh) development, physiology, and biochemistry have been hampered by the lack of appropriate genomics tools. One exception is the recent acquisition of extensive expressed sequence tag (EST) data. The entire available EST dataset for apple resulted from the efforts of at least 20 contributors and was derived from more than 70 cDNA libraries representing diverse transcriptional profiles from a variety of organs, fruit parts, developmental stages, biotic and abiotic stresses, and from at least nine cultivars. We analyzed apple EST sequences available in public databanks using statistical algorithms to identify those apple genes that are likely to be highly expressed in fruit, expressed uniquely or preferentially in fruit, and/or temporally or spatially regulated during fruit growth and development. We applied these results to the analysis of biochemical pathways involved in biosynthesis of precursors for volatile esters and identified a subset of apple genes that may participate in generating flavor and aroma components found in mature fruit. PMID:16825339

  17. Analysis of expressed sequence tags from Maize mosaic rhabdovirus-infected gut tissues of Peregrinus maidis reveals the presence of key components of insect innate immunity.

    PubMed

    Whitfield, A E; Rotenberg, D; Aritua, V; Hogenhout, S A

    2011-04-01

    The corn planthopper, Peregrinus maidis, causes direct feeding damage to plants and transmits Maize mosaic rhabdovirus (MMV) in a persistent-propagative manner. MMV must cross several insect tissue layers for successful transmission to occur, and the gut serves as an important barrier for rhabdovirus transmission. In order to facilitate the identification of proteins that may interact with MMV either by facilitating acquisition or responding to virus infection, we generated and analysed the gut transcriptome of P. maidis. From two normalized cDNA libraries, we generated a P. maidis gut transcriptome composed of 20,771 expressed sequence tags (ESTs). Assembly of the sequences yielded 1860 contigs and 14,032 singletons, and biological roles were assigned to 5793 (36%). Comparison of P. maidis ESTs with other insect amino acid sequences revealed that P. maidis shares greatest sequence similarity with another hemipteran, the brown planthopper Nilaparvata lugens. We identified 202 P. maidis transcripts with putative homology to proteins associated with insect innate immunity, including those implicated in the Toll, Imd, JAK/STAT, Jnk and the small-interfering RNA-mediated pathways. Sequence comparisons between our P. maidis gut EST collection and the currently available National Center for Biotechnology Information EST database collection for Ni. lugens revealed that a pathogen recognition receptor in the Imd pathway, peptidoglycan recognition protein-long class (PGRP-LC), is present in these two members of the family Delphacidae; however, these recognition receptors are lacking in the model hemipteran Acyrthosiphon pisum. In addition, we identified sequences in the P. maidis gut transcriptome that share significant amino acid sequence similarities with the rhabdovirus receptor molecule, acetylcholine receptor (AChR), found in other hosts. This EST analysis sheds new light on immune response pathways in hemipteran guts that will be useful for further dissecting innate

  18. Expressed sequence tags from organ-specific cDNA libraries of tea (Camellia sinensis) and polymorphisms and transferability of EST-SSRs across Camellia species.

    PubMed

    Taniguchi, Fumiya; Fukuoka, Hiroyuki; Tanaka, Junichi

    2012-06-01

    Tea is one of the most popular beverages in the world and the tea plant, Camellia sinensis (L.) O. Kuntze, is an important crop in many countries. To increase the amount of genomic information available for C. sinensis, we constructed seven cDNA libraries from various organs and used these to generate expressed sequence tags (ESTs). A total of 17,458 ESTs were generated and assembled into 5,262 unigenes. About 50% of the unigenes were assigned annotations by Gene Ontology. Some were homologous to genes involved in important biological processes, such as nitrogen assimilation, aluminum response, and biosynthesis of caffeine and catechins. Digital northern analysis showed that 67 unigenes were expressed differentially among the seven organs. Simple sequence repeat (SSR) motif searches among the unigenes identified 1,835 unigenes (34.9%) harboring SSR motifs of more than six repeat units. A subset of 100 EST-SSR primer sets was tested for amplification and polymorphism in 16 tea accessions. Seventy-one primer sets successfully amplified EST-SSRs and 70 EST-SSR loci were polymorphic. Furthermore, these 70 EST-SSR markers were transferable to 14 other Camellia species. The ESTs and EST-SSR markers will enhance the study of important traits and the molecular genetics of tea plants and other Camellia species.
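
    The SSR motif search described in the record above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name find_ssrs, the restriction to perfect di- and trinucleotide repeats, and the exclusion of single-base runs are assumptions made for illustration. The default of seven repeat units follows the abstract's "more than six repeat units".

```python
import re

def find_ssrs(seq, min_repeats=7, motif_lengths=(2, 3)):
    """Return (start, motif, repeat_count) for each perfect tandem repeat.

    Hypothetical helper: only perfect repeats of the given motif
    lengths are reported, and single-base runs are skipped.
    """
    seq = seq.upper()
    hits = []
    for k in motif_lengths:
        # ([ACGT]{k}) captures one motif copy; \1{n,} demands at least
        # n further identical copies directly after it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_repeats - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if len(set(motif)) == 1:
                continue  # e.g. AAAAAA... is a homopolymer, not an SSR here
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return hits

if __name__ == "__main__":
    # Toy unigene: eight AG units flanked by non-repetitive sequence.
    print(find_ssrs("CCT" + "AG" * 8 + "TTC"))
```

    A real survey would also handle compound and imperfect repeats and would check that enough flanking sequence remains for primer design, which this sketch does not attempt.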

  19. Identification of stress-induced genes from the drought-tolerant plant Prosopis juliflora (Swartz) DC. through analysis of expressed sequence tags.

    PubMed

    George, Suja; Venkataraman, Gayatri; Parida, Ajay

    2007-05-01

    Abiotic stresses such as cold, salinity, drought, wounding, and heavy metal contamination adversely affect crop productivity throughout the world. Prosopis juliflora is a phreatophyte that can tolerate severe adverse environmental conditions such as drought, salinity, and heavy metal contamination. As a first step towards the characterization of genes that contribute to combating abiotic stress, construction and analysis of a cDNA library of P. juliflora genes is reported here. Random expressed sequence tag (EST) sequencing of 1750 clones produced 1467 high-quality reads. These clones were classified into functional categories, and BLAST comparisons revealed that 114 clones were homologous to genes implicated in stress response(s) and included heat shock proteins, metallothioneins, lipid transfer proteins, and late embryogenesis abundant proteins. Of the ESTs analyzed, 26% showed homology to previously uncharacterized genes in the databases. Fifty-two clones from this category were selected for reverse Northern analysis: 21 were shown to be upregulated and 16 downregulated. The results obtained by reverse Northern analysis were confirmed by Northern analysis. Clustering of the 1467 ESTs produced a total of 295 contigs encompassing 790 ESTs, resulting in a 54.2% redundancy. Two of the abundant genes coding for a nonspecific lipid transfer protein and late embryogenesis abundant protein were sequenced completely. Northern analysis (after polyethylene glycol stress) of the 2 genes was carried out. The implications of the analyzed genes in abiotic stress tolerance are also discussed.

  20. Generation and analysis of a 29,745 unique Expressed Sequence Tags from the Pacific oyster (Crassostrea gigas) assembled into a publicly accessible database: the GigasDatabase

    PubMed Central

    2009-01-01

    Background Although bivalves are among the most-studied marine organisms because of their ecological role and economic importance, very little information is available on the genome sequences of oyster species. This report documents three large-scale cDNA sequencing projects for the Pacific oyster Crassostrea gigas initiated to provide a large number of expressed sequence tags that were subsequently compiled in a publicly accessible database. This resource allowed for the identification of a large number of transcripts and provides valuable information for ongoing investigations of tissue-specific and stimulus-dependent gene expression patterns. These data are crucial for constructing comprehensive DNA microarrays, identifying single nucleotide polymorphisms and microsatellites in coding regions, and for identifying genes when the entire genome sequence of C. gigas becomes available. Description In the present paper, we report the production of 40,845 high-quality ESTs that identify 29,745 unique transcribed sequences consisting of 7,940 contigs and 21,805 singletons. All of these new sequences, together with existing public sequence data, have been compiled into a publicly-available Website http://public-contigbrowser.sigenae.org:9090/Crassostrea_gigas/index.html. Approximately 43% of the unique ESTs had significant matches against the SwissProt database and 27% were annotated using Gene Ontology terms. In addition, we identified a total of 208 in silico microsatellites from the ESTs, with 173 having sufficient flanking sequence for primer design. We also identified a total of 7,530 putative in silico, single-nucleotide polymorphisms using existing and newly-generated EST resources for the Pacific oyster. Conclusion A publicly-available database has been populated with 29,745 unique sequences for the Pacific oyster Crassostrea gigas. The database provides many tools to search cleaned and assembled ESTs. The user may input and submit several filters, such as

  1. Sorghum expressed sequence tags identify signature genes for drought, pathogenesis, and skotomorphogenesis from a milestone set of 16,801 unique transcripts.

    PubMed

    Pratt, Lee H; Liang, Chun; Shah, Manish; Sun, Feng; Wang, Haiming; Reid, St Patrick; Gingle, Alan R; Paterson, Andrew H; Wing, Rod; Dean, Ralph; Klein, Robert; Nguyen, Henry T; Ma, Hong-Mei; Zhao, Xin; Morishige, Daryl T; Mullet, John E; Cordonnier-Pratt, Marie-Michèle

    2005-10-01

    Improved knowledge of the sorghum transcriptome will enhance basic understanding of how plants respond to stresses and serve as a source of genes of value to agriculture. Toward this goal, Sorghum bicolor L. Moench cDNA libraries were prepared from light- and dark-grown seedlings, drought-stressed plants, Colletotrichum-infected seedlings and plants, ovaries, embryos, and immature panicles. Other libraries were prepared with meristems from Sorghum propinquum (Kunth) Hitchc. that had been photoperiodically induced to flower, and with rhizomes from S. propinquum and johnsongrass (Sorghum halepense L. Pers.). A total of 117,682 expressed sequence tags (ESTs) were obtained representing both 3' and 5' sequences from about half that number of cDNA clones. A total of 16,801 unique transcripts, representing tentative UniScripts (TUs), were identified from 55,783 3' ESTs. Of these TUs, 9,032 are represented by two or more ESTs. Collectively, these libraries were predicted to contain a total of approximately 31,000 TUs. Individual libraries, however, were predicted to contain no more than about 6,000 to 9,000, with the exception of light-grown seedlings, which yielded an estimate of close to 13,000. In addition, each library exhibits about the same level of complexity with respect to both the number of TUs preferentially expressed in that library and the frequency with which two or more ESTs are found in only that library. These results indicate that the sorghum genome is expressed in a highly selective fashion in the individual organs and in response to the environmental conditions surveyed here. Close to 2,000 differentially expressed TUs were identified among the cDNA libraries examined, of which 775 were differentially expressed at a confidence level of 98%. From these 775 TUs, signature genes were identified defining drought, Colletotrichum infection, skotomorphogenesis (etiolation), ovary, immature panicle, and embryo.

  2. Computational identification of conserved microRNAs and their targets from expression sequence tags of blueberry (Vaccinium corybosum)

    PubMed Central

    Li, Xuyan; Hou, Yanming; Zhang, Li; Zhang, Wenhao; Quan, Chen; Cui, Yuhai; Bian, Shaomin

    2014-01-01

    MicroRNAs (miRNAs) are a class of endogenous non-coding RNAs, approximately 21 nt in length, which mediate the expression of target genes primarily at the post-transcriptional level. miRNAs play critical roles in almost all plant cellular and metabolic processes. Although numerous miRNAs have been identified in the plant kingdom, the miRNAs in blueberry, which is an economically important small fruit crop, remain totally unknown. In this study, we report a computational identification of miRNAs and their targets in blueberry. Using an EST-based comparative genomics approach, 9 potential vco-miRNAs were discovered from 22,402 blueberry ESTs according to a series of filtering criteria, designated as vco-miR156-5p, vco-miR156-3p, vco-miR1436, vco-miR1522, vco-miR4495, vco-miR5120, vco-miR5658, vco-miR5783, and vco-miR5986. Based on sequence complementarity between miRNA and its target transcript, 34 target ESTs from blueberry and 70 targets from other species were identified for the vco-miRNAs. The targets were found to be involved in transcription, RNA splicing and binding, DNA duplication, signal transduction, transport and trafficking, stress response, as well as synthesis and metabolic process. These findings will greatly contribute to future research in regard to functions and regulatory mechanisms of blueberry miRNAs. PMID:25763692

  3. Computational identification of conserved microRNAs and their targets from expression sequence tags of blueberry (Vaccinium corybosum).

    PubMed

    Li, Xuyan; Hou, Yanming; Zhang, Li; Zhang, Wenhao; Quan, Chen; Cui, Yuhai; Bian, Shaomin

    2014-01-01

    MicroRNAs (miRNAs) are a class of endogenous non-coding RNAs, approximately 21 nt in length, which mediate the expression of target genes primarily at the post-transcriptional level. miRNAs play critical roles in almost all plant cellular and metabolic processes. Although numerous miRNAs have been identified in the plant kingdom, the miRNAs in blueberry, which is an economically important small fruit crop, remain totally unknown. In this study, we report a computational identification of miRNAs and their targets in blueberry. Using an EST-based comparative genomics approach, 9 potential vco-miRNAs were discovered from 22,402 blueberry ESTs according to a series of filtering criteria, designated as vco-miR156-5p, vco-miR156-3p, vco-miR1436, vco-miR1522, vco-miR4495, vco-miR5120, vco-miR5658, vco-miR5783, and vco-miR5986. Based on sequence complementarity between miRNA and its target transcript, 34 target ESTs from blueberry and 70 targets from other species were identified for the vco-miRNAs. The targets were found to be involved in transcription, RNA splicing and binding, DNA duplication, signal transduction, transport and trafficking, stress response, as well as synthesis and metabolic process. These findings will greatly contribute to future research in regard to functions and regulatory mechanisms of blueberry miRNAs.

  4. Expressed sequence tags from larval gut of the European corn borer (Ostrinia nubilalis): Exploring candidate genes potentially involved in Bacillus thuringiensis toxicity and resistance

    PubMed Central

    Khajuria, Chitvan; Zhu, Yu Cheng; Chen, Ming-Shun; Buschman, Lawrent L; Higgins, Randall A; Yao, Jianxiu; Crespo, Andre LB; Siegfried, Blair D; Muthukrishnan, Subbaratnam; Zhu, Kun Yan

    2009-01-01

    Background Lepidoptera represents more than 160,000 insect species which include some of the most devastating pests of crops, forests, and stored products. However, the genomic information on lepidopteran insects is very limited. Only a few studies have focused on developing expressed sequence tag (EST) libraries from the guts of lepidopteran larvae. Knowledge of the genes that are expressed in the insect gut is crucial for understanding basic physiology of food digestion, their interactions with Bacillus thuringiensis (Bt) toxins, and for discovering new targets for novel toxins for use in pest management. This study analyzed the ESTs generated from the larval gut of the European corn borer (ECB, Ostrinia nubilalis), one of the most destructive pests of corn in North America and the western world. Our goals were to establish an ECB larval gut-specific EST database as a genomic resource for future research and to explore candidate genes potentially involved in insect-Bt interactions and Bt resistance in ECB. Results We constructed two cDNA libraries from the guts of the fifth-instar larvae of ECB and sequenced a total of 15,000 ESTs from these libraries. A total of 12,519 ESTs (83.4%) appeared to be high quality with an average length of 656 bp. These ESTs represented 2,895 unique sequences, including 1,738 singletons and 1,157 contigs. Among the unique sequences, 62.7% encoded putative proteins that shared significant sequence similarities (E-value ≤ 10^-3) with the sequences available in GenBank. Our EST analysis revealed 52 candidate genes that potentially have roles in Bt toxicity and resistance. These genes encode 18 trypsin-like proteases, 18 chymotrypsin-like proteases, 13 aminopeptidases, 2 alkaline phosphatases and 1 cadherin-like protein. Comparisons of expression profiles of 41 selected candidate genes between Cry1Ab-susceptible and resistant strains of ECB by RT-PCR showed apparently decreased expressions in 2 trypsin-like and 2 chymotrypsin

  5. Computational identification and characterization of conserved miRNAs and their target genes in garlic (Allium sativum L.) expressed sequence tags.

    PubMed

    Panda, Debashis; Dehury, Budheswar; Sahu, Jagajjit; Barooah, Madhumita; Sen, Priyabrata; Modi, Mahendra K

    2014-03-10

    Endogenous small non-coding microRNAs (miRNAs), which range from ~21 to 24 nucleotides in length, play a pivotal role in gene expression in plants and animals by silencing genes, either by degrading homologous mRNA or by blocking its translation. Although various high-throughput but time-consuming and expensive techniques such as forward genetics and direct cloning are employed to detect miRNAs in plants, comparative genomics complemented with novel bioinformatic tools paves the way for efficient and cost-effective identification of miRNAs through homologous sequence searches against previously known miRNAs. In this study, an attempt was made to identify and characterize conserved miRNAs in garlic expressed sequence tags (ESTs) through computational means. To identify novel miRNAs in garlic, a total of 3227 known mature miRNAs from the plant kingdom (Viridiplantae) were searched for homology against 21,637 EST sequences, resulting in the identification of 6 potential miRNA candidates belonging to 6 different miRNA families. The psRNATarget server predicted 33 potential target genes and their probable functions for the six identified miRNA families in garlic. Most of the predicted garlic miRNA targets appear to be genes encoding transcription factors or genes involved in stress response, metabolism, and plant growth and development. The results from the present study will shed more light on the molecular mechanisms of miRNA in garlic, which may aid in the development of novel and precise techniques to understand post-transcriptional gene silencing mechanisms in response to stress.
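
    The first step of the homology search described in the record above, matching known mature miRNAs against EST sequences, can be sketched as follows. This is an illustrative assumption rather than the authors' method: the abstract does not name the search tool or the mismatch cutoff, so the gapless sliding-window comparison, the function name find_mirna_hits, and the default of three mismatches are all hypothetical. Published pipelines of this kind usually also require the candidate precursor to fold into a stable hairpin, which is not covered here.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def to_dna(seq):
    """Upper-case a sequence and write RNA bases (U) as DNA (T)."""
    return seq.upper().replace("U", "T")

def reverse_complement(dna):
    return dna.translate(COMPLEMENT)[::-1]

def hamming(a, b):
    """Number of mismatching positions between equal-length strings."""
    return sum(x != y for x, y in zip(a, b))

def find_mirna_hits(mirna, est, max_mismatches=3):
    """Return (position, strand, mismatches) for near matches in an EST.

    Hypothetical helper: gapless comparison on both strands; the
    mismatch cutoff is an assumed parameter, not taken from the source.
    """
    query = to_dna(mirna)
    est = to_dna(est)
    hits = []
    for strand, target in (("+", query), ("-", reverse_complement(query))):
        n = len(target)
        for i in range(len(est) - n + 1):
            d = hamming(target, est[i:i + n])
            if d <= max_mismatches:
                hits.append((i, strand, d))
    return hits

if __name__ == "__main__":
    mature = "UGACAGAAGAGAGUGAGCAC"  # example query, written as RNA
    toy_est = "GGG" + to_dna(mature) + "TTT"  # toy EST, not real data
    print(find_mirna_hits(mature, toy_est))
```

    Searching both strands matters because an EST may have been sequenced from either end of the cDNA clone.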

  6. Linking yeast genetics to mammalian genomes: identification and mapping of the human homolog of CDC27 via the expressed sequence tag (EST) data base.

    PubMed Central

    Tugendreich, S; Boguski, M S; Seldin, M S; Hieter, P

    1993-01-01

    We describe a strategy for quickly identifying and positionally mapping human homologs of yeast genes to cross-reference the biological and genetic information known about yeast genes to mammalian chromosomal maps. Optimized computer search methods have been developed to scan the rapidly expanding expressed sequence tag (EST) data base to find human open reading frames related to yeast protein sequence queries. These methods take advantage of the newly developed BLOSUM scoring matrices and the query masking function SEG. The corresponding human cDNA is then used to obtain a high-resolution map position on human and mouse chromosomes, providing the links between yeast genetic analysis and mapped mammalian loci. By using these methods, a human homolog of Saccharomyces cerevisiae CDC27 has been identified and mapped to human chromosome 17 and mouse chromosome 11 between the Pkca and Erbb-2 genes. Human CDC27 encodes an 823-aa protein with global similarity to its fungal homologs CDC27, nuc2+, and BimA. Comprehensive cross-referencing of genes and mutant phenotypes described in humans, mice, and yeast should accelerate the study of normal eukaryotic biology and human disease states. PMID:8234252

  7. Comparative characterization of sweetpotato antioxidant genes from expressed sequence tags of dehydration-treated fibrous roots under different abiotic stress conditions.

    PubMed

    Kim, Yun-Hee; Jeong, Jae Cheol; Lee, Haeng-Soon; Kwak, Sang-Soo

    2013-04-01

    Drought stress is one of the most adverse conditions for plant growth and productivity. The plant antioxidant system is an important defense mechanism and includes antioxidant enzymes and low-molecular weight antioxidants. Understanding the biochemical and molecular responses to drought is essential for improving plant resistance to water-limited conditions. Previously, we isolated and characterized expressed sequence tags (ESTs) from a full-length enriched cDNA library prepared from fibrous roots of sweetpotato subjected to dehydration stress (Kim et al., BMB Rep 42:271-276). In this study, we isolated 11 antioxidant genes from the sweetpotato EST library and characterized them under various abiotic stress conditions. They comprised six intracellular genes, encoding CuZn superoxide dismutase (CuZnSOD), ascorbate peroxidase, catalase, glutathione peroxidase (GPX), glutathione-S-transferase, and thioredoxin (TRX), and five extracellular peroxidase genes. Under dehydration treatment, almost all of the antioxidant genes were induced in leaves, with the exception of extracellular swPB6, whereas only some, such as intracellular GPX and TRX and extracellular swPA4 and swPB7, showed increased expression levels in the fibrous roots. During various abiotic stress treatments in leaves, such as exposure to NaCl, cold, and abscisic acid, several intracellular antioxidant genes were strongly expressed compared with the expression of extracellular antioxidant genes. These results indicated that some intracellular antioxidant genes, especially swAPX1 and CuZnSOD, might be specifically involved in important defense mechanisms against oxidative stress induced by various abiotic stresses including dehydration in sweetpotato plants.

  8. Functional categorization of unique expressed sequence tags obtained from the yeast-like growth phase of the elm pathogen Ophiostoma novo-ulmi

    PubMed Central

    2011-01-01

    Background The highly aggressive pathogenic fungus Ophiostoma novo-ulmi continues to be a serious threat to the American elm (Ulmus americana) in North America. Extensive studies have been conducted in North America to understand the mechanisms of virulence of this introduced pathogen and its evolving population structure, with a view to identifying potential strategies for the control of Dutch elm disease. As part of a larger study to examine the genomes of economically important Ophiostoma spp. and the genetic basis of virulence, we have constructed an expressed sequence tag (EST) library using total RNA extracted from the yeast-like growth phase of O. novo-ulmi (isolate H327). Results A total of 4,386 readable EST sequences were annotated by determining their closest matches to known or theoretical sequences in public databases by BLASTX analysis. Searches matched 2,093 sequences to entries found in Genbank, including 1,761 matches with known proteins and 332 matches with unknown (hypothetical/predicted) proteins. Known proteins included a collection of 880 unique transcripts which were categorized to obtain a functional profile of the transcriptome and to evaluate physiological function. These assignments yielded 20 primary functional categories (FunCat), the largest including Metabolism (FunCat 01, 20.28% of total), Sub-cellular localization (70, 10.23%), Protein synthesis (12, 10.14%), Transcription (11, 8.27%), Biogenesis of cellular components (42, 8.15%), Cellular transport, facilitation and routes (20, 6.08%), Classification unresolved (98, 5.80%), Cell rescue, defence and virulence (32, 5.31%) and the unclassified category, or known sequences of unknown metabolic function (99, 7.5%). A list of specific transcripts of interest was compiled to initiate an evaluation of their impact upon strain virulence in subsequent studies. Conclusions This is the first large-scale study of the O. novo-ulmi transcriptome. The expression profile obtained from the yeast

  9. A consensus linkage map for sugi (Cryptomeria japonica) from two pedigrees, based on microsatellites and expressed sequence tags.

    PubMed Central

    Tani, Naoki; Takahashi, Tomokazu; Iwata, Hiroyoshi; Mukai, Yuzuru; Ujino-Ihara, Tokuko; Matsumoto, Asako; Yoshimura, Kensuke; Yoshimaru, Hiroshi; Murai, Masafumi; Nagasaka, Kazutoshi; Tsumura, Yoshihiko

    2003-01-01

    A consensus map for sugi (Cryptomeria japonica) was constructed by integrating linkage data from two unrelated third-generation pedigrees, one derived from a full-sib cross and the other by self-pollination of F1 individuals. The progeny segregation data of the first pedigree were derived from cleaved amplified polymorphic sequences, microsatellites, restriction fragment length polymorphisms, and single nucleotide polymorphisms. The data of the second pedigree were derived from cleaved amplified polymorphic sequences, isozyme markers, morphological traits, random amplified polymorphic DNA markers, and restriction fragment length polymorphisms. Linkage analyses were done for the first pedigree with JoinMap 3.0, using its parameter set for progeny derived by cross-pollination, and for the second pedigree with the parameter set for progeny derived from selfing of F1 individuals. The 11 chromosomes of C. japonica are represented in the consensus map. A total of 438 markers were assigned to 11 large linkage groups, 1 small linkage group, and 1 nonintegrated linkage group from the second pedigree; their total length was 1372.2 cM. On average, the consensus map showed 1 marker every 3.0 cM. PCR-based codominant DNA markers such as cleaved amplified polymorphic sequences and microsatellite markers were distributed in all linkage groups and occupied about half of the mapped loci. These markers are very useful for integration of different linkage maps, QTL mapping, and comparative mapping for evolutionary studies, especially for species with a large genome size such as conifers. PMID:14668402

  10. Global Transcriptome Analysis of the Tentacle of the Jellyfish Cyanea capillata Using Deep Sequencing and Expressed Sequence Tags: Insight into the Toxin- and Degenerative Disease-Related Transcripts

    PubMed Central

    Liu, Dan; Wang, Qianqian; Ruan, Zengliang; He, Qian; Zhang, Liming

    2015-01-01

    Background Jellyfish contain diverse toxins and other bioactive components. However, large-scale identification of novel toxins and bioactive components from jellyfish has been hampered by the low efficiency of traditional isolation and purification methods. Results We performed de novo transcriptome sequencing of the tentacle tissue of the jellyfish Cyanea capillata. A total of 51,304,108 reads were obtained and assembled into 50,536 unigenes. Of these, 21,357 unigenes had homologues in public databases, but the remaining unigenes had no significant matches due to the limited sequence information available and species-specific novel sequences. Functional annotation of the unigenes also revealed general gene expression profile characteristics in the tentacle of C. capillata. A primary goal of this study was to identify putative toxin transcripts. As expected, we screened many transcripts encoding proteins similar to several well-known toxin families including phospholipases, metalloproteases, serine proteases and serine protease inhibitors. In addition, some transcripts also resembled molecules with potential toxic activities, including cnidarian CfTX-like toxins with hemolytic activity, plancitoxin-1, venom toxin-like peptide-6, histamine-releasing factor, neprilysin, dipeptidyl peptidase 4, vascular endothelial growth factor A, angiotensin-converting enzyme-like and endothelin-converting enzyme 1-like proteins. Most of these molecules have not been previously reported in jellyfish. Interestingly, we also characterized a number of transcripts with similarities to proteins relevant to several degenerative diseases, including Huntington’s, Alzheimer’s and Parkinson’s diseases. This is the first description of degenerative disease-associated genes in jellyfish. Conclusion We obtained a well-categorized and annotated transcriptome of C. capillata tentacle that will be an important and valuable resource for further understanding of jellyfish at the molecular

  11. EGENES: Transcriptome-Based Plant Database of Genes with Metabolic Pathway Information and Expressed Sequence Tag Indices in KEGG

    PubMed Central

    Masoudi-Nejad, Ali; Goto, Susumu; Jauregui, Ruy; Ito, Masumi; Kawashima, Shuichi; Moriya, Yuki; Endo, Takashi R.; Kanehisa, Minoru

    2007-01-01

    EGENES is a knowledge-based database for efficient analysis of plant expressed sequence tags (ESTs) that was recently added to the KEGG suite of databases. It links plant genomic information with higher order functional information in a single database. It also provides gene indices for each genome. The genomic information in EGENES is a collection of EST contigs constructed from assembly of ESTs. Due to the extremely large genomes of plant species, the bulk collection of data such as ESTs is a quick way to capture a complete repertoire of genes expressed in an organism. Using ESTs for reconstructing metabolic pathways is a new expansion in KEGG and provides researchers with a new resource for species in which only EST sequences are available. Functional annotation in EGENES is a process of linking a set of genes/transcripts in each genome with a network of interacting molecules in the cell. EGENES is a multispecies, integrated resource consisting of genomic, chemical, and network information containing a complete set of building blocks (genes and molecules) and wiring diagrams (biological pathways) to represent cellular functions. Using EGENES, genome-based pathway annotation and EST-based annotation can now be compared and mutually validated. The ultimate goals of EGENES will be to: bring new plant species into KEGG by clustering and annotating ESTs; abstract knowledge and principles from large-scale plant EST data; and improve computational prediction of systems of higher complexity. EGENES will be updated at least once a year. EGENES is publicly available and is accessible by the following link or by KEGG's navigation system (http://www.genome.jp/kegg-bin/create_kegg_menu?category=plants_egenes). PMID:17468225

  12. Expressed sequence tag analysis and development of gene associated markers in a near-isogenic plant system of Eragrostis curvula.

    PubMed

    Cervigni, Gerardo D L; Paniego, Norma; Díaz, Marina; Selva, Juan P; Zappacosta, Diego; Zanazzi, Darío; Landerreche, Iñaki; Martelotto, Luciano; Felitti, Silvina; Pessino, Silvina; Spangenberg, Germán; Echenique, Viviana

    2008-05-01

    Eragrostis curvula (Schrad.) Nees is a forage grass native to the semiarid regions of Southern Africa, which reproduces mainly by pseudogamous diplosporous apomixis. A collection of ESTs was generated from four cDNA libraries, three of them obtained from panicles of near-isogenic lines with different ploidy levels and reproductive modes, and one obtained from leaves of 12-day-old plants. A total of 12,295 high-quality ESTs were clustered and assembled, rendering 8,884 unigenes, including 1,490 contigs and 7,394 singletons, with a genome coverage of 22%. A total of 7,029 (79.11%) unigenes were functionally categorized by BLASTX analysis against sequences deposited in public databases, but only 37.80% could be classified according to Gene Ontology. Sequence comparison against the cereal gene indices (GI) revealed 50% significant hits. A total of 254 EST-SSRs were detected, 219 from singletons and 35 from contigs. Di- and trinucleotide motifs were similarly represented, with percentages of 38.95 and 40.16%, respectively. In addition, 190 SNPs and indels were detected in 18 contigs generated from 3 to 4 libraries. The ESTs and the molecular markers obtained in this study will provide valuable resources for a wide range of applications including gene identification, genetic mapping, cultivar identification, analysis of genetic diversity, phenotype mapping and marker-assisted selection.
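
    The EST-SSR detection step mentioned in this abstract can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `find_ssrs` and the minimum repeat thresholds (6 for dinucleotide, 5 for trinucleotide motifs) are assumptions chosen only for illustration, and only perfect repeats are considered.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs.

    min_repeats maps motif length -> minimum number of tandem copies.
    The default thresholds are hypothetical, not taken from the abstract.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5}
    seq = seq.upper()
    hits = []
    for unit_len, n in min_repeats.items():
        # One captured motif followed by at least n-1 further copies of it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if len(set(motif)) == 1:
                continue  # skip homopolymer runs such as AAAAAA
            hits.append((m.start(), motif, len(m.group(0)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    est = "GGG" + "AC" * 7 + "TTGCA" + "GAT" * 5 + "CC"
    for start, motif, copies in find_ssrs(est):
        print(start, motif, copies)
```

    Grouping the hits by motif length would give the kind of di- versus trinucleotide shares reported above; real SSR tools also handle imperfect and compound repeats, which this sketch ignores.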

  13. Unraveling new genes associated with seed development and metabolism in Bixa orellana L. by expressed sequence tag (EST) analysis.

    PubMed

    Soares, Virgínia L F; Rodrigues, Simone M; de Oliveira, Tahise M; de Queiroz, Talisson O; Lima, Lívia S; Hora-Júnior, Braz T; Gramacho, Karina P; Micheli, Fabienne; Cascardo, Júlio C M; Otoni, Wagner C; Gesteira, Abelmon S; Costa, Marcio G C

    2011-02-01

    The tropical tree Bixa orellana L. produces a range of secondary metabolites whose biochemical and molecular biosynthetic basis is not well understood. In this work we have characterized a set of ESTs from a non-normalized cDNA library of B. orellana seeds to obtain information about the main developmental and metabolic processes taking place in developing seeds and their associated genes. After sequencing a set of randomly selected clones, most of the sequences were assigned putative functions based on similarity, GO annotations and protein domains. The most abundant transcripts encoded proteins associated with cell wall (prolyl 4-hydroxylase), fatty acid (acyl carrier protein), and hormone/flavonoid (2OG-Fe oxygenase) synthesis, germination (MADS FLC-like protein) and embryo development (AP2/ERF transcription factor) regulation, photosynthesis (chlorophyll a-b binding protein), cell elongation (MAP65-1a), and stress responses (metallothionein- and thaumatin-like proteins). Enzymes were assigned to 16 different metabolic pathways related to both primary and secondary metabolism. Characterization of two candidate genes of the bixin biosynthetic pathway, BoCCD and BoOMT, showed that they belong, respectively, to the carotenoid-cleavage dioxygenase 4 (CCD4) and caffeic acid O-methyltransferase (COMT) families, and are up-regulated during seed development. This indicates their involvement in the synthesis of this commercially important carotenoid pigment in seeds of B. orellana. Most of the genes identified here are the first representatives of their gene families in B. orellana. PMID:20563648

  14. Analysis of expressed sequence tags (ESTs) and gene expression changes under different growth conditions for the ciliate Anophryoides haemophila, the causative agent of bumper car disease in the American lobster (Homarus americanus).

    PubMed

    Acorn, Adam R; Clark, K Fraser; Jones, Sarah; Després, Béatrice M; Munro, Sarah; Cawthorn, Richard J; Greenwood, Spencer J

    2011-06-01

    The scuticociliate Anophryoides haemophila causes bumper car disease in American lobster (Homarus americanus) in commercial holding facilities in Atlantic Canada. While the parasite has been recognized since the 1970s and much has been learned about its biology, minimal molecular characterization exists. With genome consortiums turning to model organisms like the ciliates Tetrahymena and Paramecium, the amount of relevant sequence data available has made sequence surveys more attractive for gene discovery in related ciliates. We sequenced 9984 expressed sequence tags (ESTs) from a non-normalized A. haemophila cDNA library to characterize gene expression patterns and functional gene distribution and to discover novel genes related to the parasitic life history. The A. haemophila ESTs were grouped into 843 clusters and singletons, with 658 EST clusters having identifiable homologs, while 159 ESTs were unique and had no similarity to any sequences in the public databases. Not unexpectedly, about 67% of the A. haemophila ESTs have similarity to annotated and hypothetical genes from the related oligohymenophorean ciliate, Tetrahymena. Numerous cysteine proteases, hypothetical proteins and novel sequences possess putative secretory signal peptides, suggesting that they may contribute to the pathogenesis of bumper car disease in lobster. Real-time RT-qPCR analysis of cathepsin L and two homologs of cathepsin B did not show any changes in gene expression under varying in vitro growth conditions or during a modified in vivo infection, which may be suggestive of the opportunistic life history strategy of this ciliate.

  15. Comparative analysis of secreted protein evolution using expressed sequence tags from four poplar leaf rusts (Melampsora spp.)

    PubMed Central

    2010-01-01

    Background Obligate biotrophs such as rust fungi are believed to establish long-term relationships by modulating plant defenses through a plethora of effector proteins, whose most recognizable feature is the presence of a signal peptide for secretion. Since the phenotypes of these effectors extend to host cells, their genes are expected to be under accelerated evolution stimulated by host-pathogen coevolutionary arms races. Recently, whole genome sequence data has allowed the prediction of secretomes, facilitating the identification of putative effectors. Results We generated cDNA libraries from four poplar leaf rust pathogens (Melampsora spp.) and used computational approaches to identify and annotate putative secreted proteins with the aim of uncovering new knowledge about the nature and evolution of the rust secretome. While more than half of the predicted secretome members encoded lineage-specific proteins, similarities with experimentally characterized fungal effectors were also identified. A SAGE analysis indicated a strong stage-specific regulation of transcripts encoding secreted proteins. The average sequence identity of putative secreted proteins to their closest orthologs in the wheat stem rust Puccinia graminis f. sp. tritici was dramatically reduced compared with non-secreted ones. A comparative genomics approach based on homologous gene groups unravelled positive selection in putative members of the secretome. Conclusion We uncovered robust evidence that different evolutionary constraints are acting on the rust secretome when compared to the rest of the genome. These results are consistent with the view that these genes are more likely to exhibit an effector activity and be involved in coevolutionary arms races with host factors. PMID:20615251

  16. Construction of a cDNA library and preliminary analysis of expressed sequence tags in Piper hainanense.

    PubMed

    Fan, R; Ling, P; Hao, C Y; Li, F P; Huang, L F; Wu, B D; Wu, H S

    2015-01-01

    Black pepper is a perennial climbing vine. It is widely cultivated because its berries can be utilized not only as a spice in food but also for medicinal use. This study aimed to construct a standardized, high-quality cDNA library to facilitate identification of new Piper hainanense transcripts. For this, 262 unigenes were used to generate raw reads. The average length of these 262 unigenes was 774.8 bp. Of these, 94 genes (35.9%) were newly identified, according to the NCBI protein database. Thus, identification of new genes may broaden the molecular knowledge of P. hainanense on the basis of Clusters of Orthologous Groups and Gene Ontology categories. In addition, certain basic genes linked to physiological processes were identified, which can contribute to disease resistance and thereby to the breeding of black pepper. A total of 26 unigenes were found to contain SSR markers. Dinucleotide SSRs were the main repeat motif, accounting for 61.54%, followed by trinucleotide SSRs (23.07%). Eight primer pairs successfully amplified DNA fragments and detected significant amounts of polymorphism among twenty-one Piper germplasm accessions. These results present novel sequence information for P. hainanense, which can serve as the foundation for further genetic research on this species. PMID:26505424

  17. In silico identification and characterization of conserved miRNAs and their target genes in sweet potato (Ipomoea batatas L.) expressed sequence tags (ESTs).

    PubMed

    Dehury, Budheswar; Panda, Debashis; Sahu, Jagajjit; Sahu, Mousumi; Sarma, Kishore; Barooah, Madhumita; Sen, Priyabrata; Modi, Mahendra

    2013-01-01

    The endogenous small non-coding microRNAs (miRNAs), which are typically ~21-24 nucleotides (nt) long, play a crucial role in regulating the intrinsic normal growth of cells and development of plants as well as in maintaining the integrity of genomes. These small non-coding RNAs function as the universal specificity factors in post-transcriptional gene silencing. Discovering miRNAs, identifying their targets, and further inferring miRNA functions is a routine process to understand the normal biological processes of miRNAs and their roles in the development of plants. Comparative genomics-based approaches using expressed sequence tags (ESTs) and genome survey sequences (GSS) offer a cost-effective platform for identification and characterization of miRNAs and their target genes in plants. Despite the fact that sweet potato (Ipomoea batatas L.) is an important staple food source for poor small farmers throughout the world, the role of miRNAs in its various developmental processes remains largely unknown. In this paper, we report the computational identification of miRNAs and their target genes in sweet potato from its ESTs. Using a comparative genomics-based approach, 8 potential miRNA candidates belonging to the miR168, miR2911, and miR156 families were identified from 23,406 ESTs in sweet potato. A total of 42 target genes were predicted and their probable functions were illustrated. Most of the newly identified miRNAs target transcription factors as well as genes involved in plant growth and development, signal transduction, metabolism, defense, and stress response. The identification of miRNAs and their targets is expected to accelerate the pace of miRNA discovery, leading to an improved understanding of the role of miRNAs in the development and physiology of sweet potato, as well as in stress response.

  18. Analysis of expressed sequence tags from Actinidia: applications of a cross species EST database for gene discovery in the areas of flavor, health, color and ripening

    PubMed Central

    Crowhurst, Ross N; Gleave, Andrew P; MacRae, Elspeth A; Ampomah-Dwamena, Charles; Atkinson, Ross G; Beuning, Lesley L; Bulley, Sean M; Chagne, David; Marsh, Ken B; Matich, Adam J; Montefiori, Mirco; Newcomb, Richard D; Schaffer, Robert J; Usadel, Björn; Allan, Andrew C; Boldingh, Helen L; Bowen, Judith H; Davy, Marcus W; Eckloff, Rheinhart; Ferguson, A Ross; Fraser, Lena G; Gera, Emma; Hellens, Roger P; Janssen, Bart J; Klages, Karin; Lo, Kim R; MacDiarmid, Robin M; Nain, Bhawana; McNeilage, Mark A; Rassam, Maysoon; Richardson, Annette C; Rikkerink, Erik HA; Ross, Gavin S; Schröder, Roswitha; Snowden, Kimberley C; Souleyre, Edwige JF; Templeton, Matt D; Walton, Eric F; Wang, Daisy; Wang, Mindy Y; Wang, Yanming Y; Wood, Marion; Wu, Rongmei; Yauk, Yar-Khing; Laing, William A

    2008-01-01

    Background Kiwifruit (Actinidia spp.) are a relatively new, but economically important crop grown in many different parts of the world. Commercial success is driven by the development of new cultivars with novel consumer traits including flavor, appearance, healthful components and convenience. To increase our understanding of the genetic diversity and gene-based control of these key traits in Actinidia, we have produced a collection of 132,577 expressed sequence tags (ESTs). Results The ESTs were derived mainly from four Actinidia species (A. chinensis, A. deliciosa, A. arguta and A. eriantha) and fell into 41,858 non-redundant clusters (18,070 tentative consensus sequences and 23,788 EST singletons). Analysis of flavor and fragrance-related gene families (acyltransferases and carboxylesterases) and pathways (terpenoid biosynthesis) is presented in comparison with a chemical analysis of the compounds present in Actinidia including esters, acids, alcohols and terpenes. ESTs are identified for most genes in color pathways controlling chlorophyll degradation and carotenoid biosynthesis. In the health area, data are presented on the ESTs involved in ascorbic acid and quinic acid biosynthesis showing not only that genes for many of the steps in these pathways are represented in the database, but that genes encoding some critical steps are absent. In the convenience area, genes related to different stages of fruit softening are identified. Conclusion This large EST resource will allow researchers to undertake the tremendous challenge of understanding the molecular basis of genetic diversity in the Actinidia genus as well as provide an EST resource for comparative fruit genomics. The various bioinformatics analyses we have undertaken demonstrate the extent of coverage of ESTs for genes encoding different biochemical pathways in Actinidia. PMID:18655731

  19. Ginger and turmeric expressed sequence tags identify signature genes for rhizome identity and development and the biosynthesis of curcuminoids, gingerols and terpenoids

    PubMed Central

    2013-01-01

    Background Ginger (Zingiber officinale) and turmeric (Curcuma longa) accumulate important pharmacologically active metabolites at high levels in their rhizomes. Despite their importance, relatively little is known regarding gene expression in the rhizomes of ginger and turmeric. Results In order to identify rhizome-enriched genes and genes encoding specialized metabolism enzymes and pathway regulators, we evaluated an assembled collection of expressed sequence tags (ESTs) from eight different ginger and turmeric tissues. Comparisons to publicly available sorghum rhizome ESTs revealed a total of 777 gene transcripts expressed in ginger/turmeric and sorghum rhizomes but apparently absent from other tissues. The list of rhizome-specific transcripts was enriched for genes associated with regulation of tissue growth, development, and transcription. In particular, transcripts for ethylene response factors and AUX/IAA proteins appeared to accumulate in patterns mirroring results from previous studies regarding rhizome growth responses to exogenous applications of auxin and ethylene. Thus, these genes may play important roles in defining rhizome growth and development. Additional associations were made for ginger and turmeric rhizome-enriched MADS box transcription factors, their putative rhizome-enriched homologs in sorghum, and rhizomatous QTLs in rice. Additionally, analysis of both primary and specialized metabolism genes indicates that ginger and turmeric rhizomes are primarily devoted to the utilization of leaf supplied sucrose for the production and/or storage of specialized metabolites associated with the phenylpropanoid pathway and putative type III polyketide synthase gene products. This finding reinforces earlier hypotheses predicting roles of this enzyme class in the production of curcuminoids and gingerols. Conclusion A significant set of genes were found to be exclusively or preferentially expressed in the rhizome of ginger and turmeric. Specific

  20. An expressed sequence tag (EST) library from developing fruits of an Hawaiian endemic mint (Stenogyne rugosa, Lamiaceae): characterization and microsatellite markers

    PubMed Central

    Lindqvist, Charlotte; Scheen, Anne-Cathrine; Yoo, Mi-Jeong; Grey, Paris; Oppenheimer, David G; Leebens-Mack, James H; Soltis, Douglas E; Soltis, Pamela S; Albert, Victor A

    2006-01-01

    Background The endemic Hawaiian mints represent a major island radiation that likely originated from hybridization between two North American polyploid lineages. In contrast with the extensive morphological and ecological diversity among taxa, ribosomal DNA sequence variation has been found to be remarkably low. In the past few years, expressed sequence tag (EST) projects on plant species have generated a vast amount of publicly available sequence data that can be mined for simple sequence repeats (SSRs). However, these EST projects have largely focused on crop or otherwise economically important plants, and so far only a few studies have been published on the use of intragenic SSRs in natural plant populations. We constructed an EST library from developing fleshy nutlets of Stenogyne rugosa principally to identify genetic markers for the Hawaiian endemic mints. Results The Stenogyne fruit EST library consisted of 628 unique transcripts derived from 942 high-quality ESTs, with 68% of unigenes matching Arabidopsis genes. Relative frequencies of Gene Ontology functional categories were broadly representative of the Arabidopsis proteome. Many unigenes were identified as putative homologs of genes that are active during plant reproductive development. A comparison between unigenes from Stenogyne and tomato (both asterid angiosperms) revealed many homologs that may be relevant for fruit development. Among the 628 unigenes, a total of 44 potentially useful microsatellite loci were predicted. Several of these were successfully tested for cross-transferability to other Hawaiian mint species, and at least five of these demonstrated interesting patterns of polymorphism across a large sample of Hawaiian mints as well as close North American relatives in the genus Stachys. Conclusion Analysis of this relatively small EST library illustrated a broad GO functional representation. Many unigenes could be annotated to involvement in reproductive development. Furthermore, first tests

  1. Sequencing Degraded RNA Addressed by 3' Tag Counting

    PubMed Central

    Sigurgeirsson, Benjamín; Emanuelsson, Olof; Lundeberg, Joakim

    2014-01-01

    RNA sequencing has become widely used in gene expression profiling experiments. Prior to any RNA sequencing experiment the quality of the RNA must be measured to assess whether or not it can be used for further downstream analysis. The RNA integrity number (RIN) is a scale used to measure the quality of RNA that runs from 1 (completely degraded) to 10 (intact). Ideally, samples with high RIN (>8) are used in RNA sequencing experiments. RNA, however, is a fragile molecule which is susceptible to degradation, and obtaining high-quality RNA is often hard, or even impossible, when extracting RNA from certain clinical tissues. Thus, occasionally, working with low-quality RNA is the only option the researcher has. Here we investigate the effects of RIN on RNA sequencing and suggest a computational method to handle data from samples with low-quality RNA which also enables reanalysis of published datasets. Using RNA from a human cell line we generated and sequenced samples with varying RINs and illustrate what effect the RIN has on the basic procedure of RNA sequencing, covering both quality aspects and differential expression. We show that the RIN has systematic effects on gene coverage, false positives in differential expression and the quantification of duplicate reads. We introduce 3' tag counting (3TC) as a computational approach to reliably estimate differential expression for samples with low RIN. We show that using the 3TC method in differential expression analysis significantly reduces false positives when comparing samples with different RIN, while retaining reasonable sensitivity. PMID:24632678

  2. Development and Validation of Single Nucleotide Polymorphism (SNP) Markers from an Expressed Sequence Tag (EST) Database in Olive Flounder (Paralichthys olivaceus).

    PubMed

    Kim, Jung Eun; Lee, Young Mee; Lee, Jeong-Ho; Noh, Jae Koo; Kim, Hyun Chul; Park, Choul-Ji; Park, Jong-Won; Kim, Kyung-Kil

    2014-12-01

    For successful molecular breeding, the identification and functional characterization of breeding-related genes and the development of molecular breeding techniques using DNA markers are essential. Although developing a useful marker is demanding in terms of time, cost and effort, many markers are being developed for use in molecular breeding, and the markers developed so far have been used in many fields. Single nucleotide polymorphism (SNP) markers have been widely used for genomic research and breeding, but have hardly been validated for screening functional genes in olive flounder. We identified SNPs from an expressed sequence tag (EST) database for olive flounder; from a total of 4,327 ESTs, 693 contigs and 514 SNPs were detected, and these substitutions included 297 transitions and 217 transversions. On the basis of the 514 SNPs, 144 SNP markers were developed to select useful gene regions and were then applied to eight wild and eight cultured olive flounder (16 samples in total). In our experiments, only 32 markers detected polymorphism in the samples, comprising 21 transitions and 11 transversions, whereas no indels were detected among the polymorphic SNPs. The heterozygosity of wild and cultured olive flounder based on the 32 SNP markers was 0.34 and 0.29, respectively. In conclusion, we identified SNPs and polymorphism in olive flounder using newly designed markers, which supports the view that the developed markers are suitable for SNP detection and diversity analysis in olive flounder. The outcome of this study can serve as basic data for research on immunity genes and SNP-associated traits.
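
    The transition/transversion breakdown reported in this abstract follows a standard classification that can be shown in a short sketch. The function names are illustrative, not from the study; the only figures used are the counts quoted above (297 transitions and 217 transversions among 514 SNPs).

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_snp(ref, alt):
    """Classify a single-base substitution as a transition or transversion.

    A transition swaps purine<->purine (A/G) or pyrimidine<->pyrimidine (C/T);
    every other substitution is a transversion.
    """
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError("expected two different bases from A, C, G, T")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def ts_tv_ratio(n_transitions, n_transversions):
    """Ratio of transitions to transversions."""
    return n_transitions / n_transversions


if __name__ == "__main__":
    print(classify_snp("A", "G"))          # transition
    print(classify_snp("A", "C"))          # transversion
    # Counts reported in the abstract for the 514 EST-derived SNPs.
    print(round(ts_tv_ratio(297, 217), 2))  # 1.37
```

    With the reported counts the ratio is about 1.37 for all 514 SNPs, and 21/11 (about 1.91) for the 32 polymorphic markers.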

  3. Expression of the Arabidopsis transposable element Tag1 is targeted to developing gametophytes.

    PubMed Central

    Galli, Mary; Theriault, Angie; Liu, Dong; Crawford, Nigel M

    2003-01-01

    The Arabidopsis transposon Tag1 undergoes late excision during vegetative and germinal development in plants containing 35S-Tag1-GUS constructs. To determine if transcriptional regulation can account for the developmental control of Tag1 excision, the transcriptional activity of Tag1 promoter-GUS fusion constructs of various lengths was examined in transgenic plants. All constructs showed expression in the reproductive organs of developing flowers but no expression in leaves. Expression was restricted to developing gametophytes in both male and female lineages. Quantitative RT-PCR analysis confirmed that Tag1 expression predominates in the reproductive organs of flower buds. These results are consistent with late germinal excision of Tag1, but they cannot explain the vegetative excision activity of Tag1 observed with 35S-Tag1-GUS constructs. To resolve this issue, Tag1 excision was reexamined using elements with no adjacent 35S promoter sequences. Tag1 excision in this context is restricted to germinal events with no detectable vegetative excision. If a 35S enhancer sequence is placed next to Tag1, vegetative excision is restored. These results indicate that the intrinsic activity of Tag1 is restricted to germinal excision due to targeted expression of the Tag1 transposase to developing gametophytes and that this activity is altered by the presence of adjacent enhancers or promoters. PMID:14704189

  4. Accumulation, functional annotation, and comparative analysis of expressed sequence tags in eggplant (Solanum melongena L.), the third pole of the genus Solanum species after tomato and potato.

    PubMed

    Fukuoka, Hiroyuki; Yamaguchi, Hirotaka; Nunome, Tsukasa; Negoro, Satomi; Miyatake, Koji; Ohyama, Akio

    2010-01-15

    Eggplant (Solanum melongena L.) is a widely grown vegetable crop that belongs to the genus Solanum, which comprises more than 1000 species of wide genetic and phenotypic variation. Unlike tomato and potato, Solanum crops that belong to subgenus Potatoe and have been targets for comprehensive genomic studies, eggplant is endemic to the Old World and belongs to a different subgenus, Leptostemonum, and therefore would be a unique member for comparative molecular biology in Solanum. In this study, more than 60,000 eggplant cDNA clones from various tissues and treatments were sequenced from both the 5'- and 3'-ends, and a unigene set consisting of 16,245 unique sequences was constructed. Functional annotations based on sequence similarity to known plant reference datasets revealed a distribution of functional categories very similar to that of tomato, while 1316 unigenes were suggested to be eggplant-specific. Sequence-based comparative analysis using putative orthologous gene groups set up by reciprocal sequence comparison among six solanaceous species suggested that eggplant and its wild ally Solanum torvum clustered separately from subgenus Potatoe species, and that all Solanum species in turn clustered separately from the genus Capsicum. Microsatellite motif distribution differed among species and is likely to coincide with the phylogenetic relationships. Furthermore, the eggplant unigene dataset exhibited its utility in transcriptome analysis by the SAGE strategy, where a considerable number of short tag sequences of interest were successfully assigned to unigenes and their functional annotations. The eggplant ESTs and 16k unigene set developed in this study would be a useful resource not only for molecular genetics and breeding in eggplant itself, but also for expanding the scope of comparative biology in Solanum species.

  5. Analysis of expressed sequence tags (ESTs) from avocado seed (Persea americana var. drymifolia) reveals abundant expression of the gene encoding the antimicrobial peptide snakin.

    PubMed

    Guzmán-Rodríguez, Jaquelina J; Ibarra-Laclette, Enrique; Herrera-Estrella, Luis; Ochoa-Zarzosa, Alejandra; Suárez-Rodríguez, Luis María; Rodríguez-Zapata, Luis C; Salgado-Garciglia, Rafael; Jimenez-Moraila, Beatriz; López-Meza, Joel E; López-Gómez, Rodolfo

    2013-09-01

    Avocado is one of the most important fruits in the world. Avocado "native mexicano" (Persea americana var. drymifolia) seeds are widely used in the propagation of this plant and are the primary source of rootstocks globally for a variety of avocado cultivars, such as the Hass avocado. Here, we report the isolation of 5005 ESTs from the 5' ends of P. americana var. drymifolia seed cDNA clones representing 1584 possible unigenes. These avocado seed ESTs were compared with the avocado flower EST library, and we detected several genes that are expressed either in both tissues or only in the seed. The snakin gene, which encodes an element of the innate immune response in plants, was one of those most frequently found among the seed ESTs, and this suggests that it is abundantly expressed in the avocado seed. We expressed the snakin gene in a heterologous system, namely the bovine endothelial cell line BVE-E6E7. Conditioned media from transfected BVE-E6E7 cells showed antimicrobial activity against strains of Escherichia coli and Staphylococcus aureus. This is the first study of the function of the snakin gene in plant seed tissue, and our observations suggest that this gene might play a protective role in the avocado seed. PMID:23811120

  7. A method to introduce an internal tag sequence into a Salmonella chromosomal gene.

    PubMed

    Zhao, Weidong; Méresse, Stéphane

    2015-01-01

    Epitope tags are short peptide sequences that are particularly useful for the characterization of proteins against which no antibody has been developed. The influenza hemagglutinin (HA) tag is one of the most widely used epitope tags, as several valuable monoclonal and polyclonal antibodies that can be used in various techniques are commercially available. Therefore, adding an HA tag to a protein of interest is quite helpful for obtaining rapid, low-cost information regarding its localization, its expression or its biological function. In this chapter, we describe a process, derived from the Datsenko and Wanner procedure, which allows the introduction of an internal 2HA tag sequence into a chromosomal gene of the bacterial pathogen Salmonella.

  8. Identification of the immune expressed sequence tags of pearl oyster (Pinctada martensii, Dunker 1850) responding to Vibrio alginolyticus challenge by suppression subtractive hybridization.

    PubMed

    Wang, Yanhong; Fu, Dingkun; Luo, Peng; He, Xiaocui

    2012-09-01

    One hemolymph subtracted cDNA library of pearl oyster (Pinctada martensii, Dunker 1837) was constructed using suppression subtractive hybridization (SSH) in response to Vibrio alginolyticus. A total of 1089 clones were sequenced. All the consensus sequences were analyzed by BLAST searches at NCBI, which revealed that 376 (58%) of them had no significant matches to reported sequences in the database. 267 ESTs had significant matches after homologous sequence searches. Hypothesized genes inferred from EST sequences were categorized into six groups according to their putative biological functions: replication, transcription and translation; cellular processes; response to stimuli; metabolism and biosynthesis; signal transduction; and an "other" category. The five genes, pearlin gene promoter PGPPm, serine/threonine kinase STKPm, limbic system-associated membrane protein LSAMPPm, nacrein gene intron 6 NGIPm6 and ferritin-like protein FLPPm, were analyzed using real-time PCR. All these genes were significantly expressed after V. alginolyticus challenge.

  9. Development of peanut expressed sequence tag-based genomic resources and tools

    Technology Transfer Automated Retrieval System (TEKTRAN)

    U.S. Peanut Genome Initiative (PGI) has widely recognized the need for peanut genome tools and resources development for mitigating peanut allergens and food safety. Genomics such as Expressed Sequence Tag (EST), microarray technologies, and whole genome sequencing provides robotic tools for profili...

  10. TagRecon: High-Throughput Mutation Identification through Sequence Tagging

    PubMed Central

    Dasari, Surendra; Chambers, Matthew C.; Slebos, Robbert J.; Zimmerman, Lisa J.; Ham, Amy-Joan L.; Tabb, David L.

    2010-01-01

    Shotgun proteomics produces collections of tandem mass spectra that contain all the data needed to identify mutated peptides from clinical samples. Identifying these sequence variations, however, has not been feasible with conventional database search strategies, which require exact matches between observed and expected sequences. Searching for mutations as mass shifts on specified residues through database search can incur significant performance penalties and generate substantial false positive rates. Here we describe TagRecon, an algorithm that leverages inferred sequence tags to identify unanticipated mutations in clinical proteomic data sets. TagRecon identifies unmodified peptides as sensitively as the related MyriMatch database search engine. In both LTQ and Orbitrap data sets, TagRecon outperformed state of the art software in recognizing sequence mismatches from data sets with known variants. We developed guidelines for filtering putative mutations from clinical samples, and we applied them in an analysis of cancer cell lines and an examination of colon tissue. Mutations were found in up to 6% of identified peptides, and only a small fraction corresponded to dbSNP entries. The RKO cell line, which is DNA mismatch repair deficient, yielded more mutant peptides than the mismatch repair proficient SW480 line. Analysis of colon cancer tumor and adjacent tissue revealed hydroxyproline modifications associated with extracellular matrix degradation. These results demonstrate the value of using sequence tagging algorithms to fully interrogate clinical proteomic data sets. PMID:20131910

  11. SPIDER: software for protein identification from sequence tags with de novo sequencing error.

    PubMed

    Han, Yonghua; Ma, Bin; Zhang, Kaizhong

    2005-06-01

    For the identification of novel proteins using MS/MS, de novo sequencing software computes one or several possible amino acid sequences (called sequence tags) for each MS/MS spectrum. Those tags are then matched, accounting for amino acid mutations, against the sequences in a protein database. If the de novo sequencing gives correct tags, the homologs of the proteins can be identified by this approach, and software such as MS-BLAST is available for the matching. However, de novo sequencing very often gives only partially correct tags. The most common error is that a segment of amino acids is replaced by another segment with approximately the same mass. We developed a new efficient algorithm to match sequence tags with errors to database sequences for the purpose of protein and peptide identification. A software package, SPIDER, was developed and made available on the Internet for free public use. This paper describes the algorithms and features of the SPIDER software. PMID:16108090
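
    The error type named in this abstract, one amino acid segment being replaced by another of approximately the same mass, can be illustrated with a minimal Python sketch. This is not the SPIDER algorithm; the function names and the 0.02 Da tolerance are assumptions chosen for illustration, and the table holds standard monoisotopic residue masses.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}


def segment_mass(segment):
    """Sum the residue masses of a short amino acid segment."""
    return sum(RESIDUE_MASS[aa] for aa in segment)


def nearly_isobaric(seg_a, seg_b, tol=0.02):
    """Return True if two segments differ in mass by at most `tol` Da.

    `tol` is a hypothetical tolerance, not a value taken from the paper.
    """
    return abs(segment_mass(seg_a) - segment_mass(seg_b)) <= tol


if __name__ == "__main__":
    # Two glycines weigh almost exactly the same as one asparagine,
    # so a de novo tag may report "GG" where the true sequence has "N".
    print(nearly_isobaric("GG", "N"))
    print(nearly_isobaric("K", "Q"))
```

    With a tolerance this tight, K and Q are told apart, while a lower-resolution instrument would be modelled with a larger `tol` and would confuse them as well.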

  12. SPIDER: software for protein identification from sequence tags with de novo sequencing error.

    PubMed

    Han, Yonghua; Ma, Bin; Zhang, Kaizhong

    2004-01-01

    For the identification of novel proteins using MS/MS, de novo sequencing software computes one or several possible amino acid sequences (called sequence tags) for each MS/MS spectrum. Those tags are then matched, accounting for amino acid mutations, against the sequences in a protein database. If the de novo sequencing gives correct tags, the homologs of the proteins can be identified by this approach, and software such as MS-BLAST is available for the matching. However, de novo sequencing very often gives only partially correct tags. The most common error is that a segment of amino acids is replaced by another segment with approximately the same mass. We developed a new efficient algorithm to match sequence tags with errors to database sequences for the purpose of protein and peptide identification. A software package, SPIDER, was developed and made available on the Internet for free public use. This paper describes the algorithms and features of the SPIDER software. PMID:16448014

  13. Construction of a full-length enriched cDNA library and preliminary analysis of expressed sequence tags from Bengal Tiger Panthera tigris tigris.

    PubMed

    Liu, Changqing; Liu, Dan; Guo, Yu; Lu, Taofeng; Li, Xiangchen; Zhang, Minghai; Ma, Jianzhang; Ma, Yuehui; Guan, Weijun

    2013-01-01

    In this study, a full-length enriched cDNA library was successfully constructed from Bengal tiger, Panthera tigris tigris, the most well-known wild animal. Total RNA was extracted from cultured Bengal tiger fibroblasts in vitro. The titers of primary and amplified libraries were 1.28 × 10^6 pfu/mL and 1.56 × 10^9 pfu/mL respectively. The percentage of recombinants from unamplified library was 90.2% and average length of exogenous inserts was 0.98 kb. A total of 212 individual ESTs with sizes ranging from 356 to 1108 bp were then analyzed. The BLASTX score revealed that 48.1% of the sequences were classified as a strong match, 45.3% as nominal and 6.6% as a weak match. Among the ESTs with known putative function, 26.4% ESTs were found to be related to all kinds of metabolisms, 19.3% ESTs to information storage and processing, 11.3% ESTs to posttranslational modification, protein turnover, chaperones, 11.3% ESTs to transport, 9.9% ESTs to signal transducer/cell communication, 9.0% ESTs to structure protein, 3.8% ESTs to cell cycle, and only 6.6% ESTs classified as novel genes. By EST sequencing, a full-length gene coding ferritin was identified and characterized. The recombinant plasmid pET32a-TAT-Ferritin was constructed, coding for the TAT-Ferritin fusion protein with two 6× His-tags at the N- and C-termini. After BCA assay, the concentration of soluble Trx-TAT-Ferritin recombinant protein was 2.32 ± 0.12 mg/mL. These results demonstrated that the reliability and representativeness of the cDNA library met the requirements of a standard cDNA library. This library provided a useful platform for the functional genome and transcriptome research of Bengal tigers. PMID:23708105

  14. Construction of a full-length enriched cDNA library and preliminary analysis of expressed sequence tags from Bengal Tiger Panthera tigris tigris.

    PubMed

    Liu, Changqing; Liu, Dan; Guo, Yu; Lu, Taofeng; Li, Xiangchen; Zhang, Minghai; Ma, Jianzhang; Ma, Yuehui; Guan, Weijun

    2013-05-24

    In this study, a full-length enriched cDNA library was successfully constructed from Bengal tiger, Panthera tigris tigris, the most well-known wild animal. Total RNA was extracted from cultured Bengal tiger fibroblasts in vitro. The titers of primary and amplified libraries were 1.28 × 10^6 pfu/mL and 1.56 × 10^9 pfu/mL respectively. The percentage of recombinants from unamplified library was 90.2% and average length of exogenous inserts was 0.98 kb. A total of 212 individual ESTs with sizes ranging from 356 to 1108 bp were then analyzed. The BLASTX score revealed that 48.1% of the sequences were classified as a strong match, 45.3% as nominal and 6.6% as a weak match. Among the ESTs with known putative function, 26.4% ESTs were found to be related to all kinds of metabolisms, 19.3% ESTs to information storage and processing, 11.3% ESTs to posttranslational modification, protein turnover, chaperones, 11.3% ESTs to transport, 9.9% ESTs to signal transducer/cell communication, 9.0% ESTs to structure protein, 3.8% ESTs to cell cycle, and only 6.6% ESTs classified as novel genes. By EST sequencing, a full-length gene coding ferritin was identified and characterized. The recombinant plasmid pET32a-TAT-Ferritin was constructed, coding for the TAT-Ferritin fusion protein with two 6× His-tags at the N- and C-termini. After BCA assay, the concentration of soluble Trx-TAT-Ferritin recombinant protein was 2.32 ± 0.12 mg/mL. These results demonstrated that the reliability and representativeness of the cDNA library met the requirements of a standard cDNA library. This library provided a useful platform for the functional genome and transcriptome research of Bengal tigers.

  15. Construction of a Full-Length Enriched cDNA Library and Preliminary Analysis of Expressed Sequence Tags from Bengal Tiger Panthera tigris tigris

    PubMed Central

    Liu, Changqing; Liu, Dan; Guo, Yu; Lu, Taofeng; Li, Xiangchen; Zhang, Minghai; Ma, Jianzhang; Ma, Yuehui; Guan, Weijun

    2013-01-01

    In this study, a full-length enriched cDNA library was successfully constructed from Bengal tiger, Panthera tigris tigris, the most well-known wild animal. Total RNA was extracted from cultured Bengal tiger fibroblasts in vitro. The titers of primary and amplified libraries were 1.28 × 10^6 pfu/mL and 1.56 × 10^9 pfu/mL respectively. The percentage of recombinants from unamplified library was 90.2% and average length of exogenous inserts was 0.98 kb. A total of 212 individual ESTs with sizes ranging from 356 to 1108 bp were then analyzed. The BLASTX score revealed that 48.1% of the sequences were classified as a strong match, 45.3% as nominal and 6.6% as a weak match. Among the ESTs with known putative function, 26.4% ESTs were found to be related to all kinds of metabolisms, 19.3% ESTs to information storage and processing, 11.3% ESTs to posttranslational modification, protein turnover, chaperones, 11.3% ESTs to transport, 9.9% ESTs to signal transducer/cell communication, 9.0% ESTs to structure protein, 3.8% ESTs to cell cycle, and only 6.6% ESTs classified as novel genes. By EST sequencing, a full-length gene coding ferritin was identified and characterized. The recombinant plasmid pET32a-TAT-Ferritin was constructed, coding for the TAT-Ferritin fusion protein with two 6× His-tags at the N- and C-termini. After BCA assay, the concentration of soluble Trx-TAT-Ferritin recombinant protein was 2.32 ± 0.12 mg/mL. These results demonstrated that the reliability and representativeness of the cDNA library met the requirements of a standard cDNA library. This library provided a useful platform for the functional genome and transcriptome research of Bengal tigers. PMID:23708105

  16. Multiple tag labeling method for DNA sequencing

    DOEpatents

    Mathies, R.A.; Huang, X.C.; Quesada, M.A.

    1995-07-25

    A DNA sequencing method is described which uses single lane or channel electrophoresis. Sequencing fragments are separated in the lane and detected using a laser-excited, confocal fluorescence scanner. Each set of DNA sequencing fragments is separated in the same lane and then distinguished using a binary coding scheme employing only two different fluorescent labels. Also described is a method of using radioisotope labels. 5 figs.
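
    The idea of distinguishing four fragment sets with only two fluorescent labels can be sketched in Python as a two-bit lookup. The code-to-base assignment below, the 0.5 threshold, and the function name are hypothetical illustrations; the abstract does not state which label combination marks which base. The sketch also assumes peak positions are already known, so that a position with neither label present can still be called.

```python
# Hypothetical assignment of (label_1_present, label_2_present) to bases.
# The patent abstract does not specify this mapping.
CODE_TO_BASE = {
    (1, 0): "A",
    (0, 1): "C",
    (1, 1): "G",
    (0, 0): "T",
}


def decode_peaks(peaks, threshold=0.5):
    """Call one base per peak from two normalized fluorescence channels.

    `peaks` is a sequence of (channel_1, channel_2) intensities in [0, 1].
    A channel counts as "present" when it reaches `threshold`, which is an
    assumed cutoff for this illustration.
    """
    bases = []
    for ch1, ch2 in peaks:
        code = (int(ch1 >= threshold), int(ch2 >= threshold))
        bases.append(CODE_TO_BASE[code])
    return "".join(bases)


if __name__ == "__main__":
    signal = [(0.9, 0.1), (0.1, 0.8), (0.7, 0.9), (0.0, 0.1)]
    print(decode_peaks(signal))
```

    Two binary channels give four distinct codes, which is why two labels are enough to separate four fragment sets in a single lane.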

  17. Multiple tag labeling method for DNA sequencing

    DOEpatents

    Mathies, Richard A.; Huang, Xiaohua C.; Quesada, Mark A.

    1995-01-01

    A DNA sequencing method is described which uses single lane or channel electrophoresis. Sequencing fragments are separated in said lane and detected using a laser-excited, confocal fluorescence scanner. Each set of DNA sequencing fragments is separated in the same lane and then distinguished using a binary coding scheme employing only two different fluorescent labels. Also described is a method of using radio-isotope labels.

  18. Assembly of 500,000 inter-specific catfish expressed sequence tags and large scale gene-associated marker development for whole genome association studies

    SciTech Connect

    Catfish Genome Consortium; Wang, Shaolin; Peatman, Eric; Abernathy, Jason; Waldbieser, Geoff; Lindquist, Erika; Richardson, Paul; Lucas, Susan; Wang, Mei; Li, Ping; Thimmapuram, Jyothi; Liu, Lei; Vullaganti, Deepika; Kucuktas, Huseyin; Murdock, Christopher; Small, Brian C; Wilson, Melanie; Liu, Hong; Jiang, Yanliang; Lee, Yoona; Chen, Fei; Lu, Jianguo; Wang, Wenqi; Xu, Peng; Somridhivej, Benjaporn; Baoprasertkul, Puttharat; Quilang, Jonas; Sha, Zhenxia; Bao, Baolong; Wang, Yaping; Wang, Qun; Takano, Tomokazu; Nandi, Samiran; Liu, Shikai; Wong, Lilian; Kaltenboeck, Ludmilla; Quiniou, Sylvie; Bengten, Eva; Miller, Norman; Trant, John; Rokhsar, Daniel; Liu, Zhanjiang

    2010-03-23

    Background: Through the Community Sequencing Program, a catfish EST sequencing project was carried out through a collaboration between the catfish research community and the Department of Energy's Joint Genome Institute. Prior to this project, only a limited EST resource from catfish was available for the purpose of SNP identification. Results: A total of 438,321 quality ESTs were generated from 8 channel catfish (Ictalurus punctatus) and 4 blue catfish (Ictalurus furcatus) libraries, bringing the number of catfish ESTs to nearly 500,000. Assembly of all catfish ESTs resulted in 45,306 contigs and 66,272 singletons. Over 35% of the unique sequences had significant similarities to known genes, allowing the identification of 14,776 unique genes in catfish. Over 300,000 putative SNPs have been identified, of which approximately 48,000 are high-quality SNPs identified from contigs with at least four sequences and the minor allele presence of at least two sequences in the contig. The EST resource should be valuable for identification of microsatellites, genome annotation, large-scale expression analysis, and comparative genome analysis. Conclusions: This project generated a large EST resource for catfish that captured the majority of the catfish transcriptome. The parallel analysis of ESTs from two closely related Ictalurid catfishes should also provide powerful means for the evaluation of ancient and recent gene duplications, and for the development of high-density microarrays in catfish. The inter- and intra-specific SNPs identified from all catfish EST dataset assembly will greatly benefit the catfish introgression breeding program and whole genome association studies.
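
    The high-quality SNP criterion stated in this abstract (contigs with at least four sequences, and the minor allele present in at least two of them) can be sketched in Python as a simple filter. This is an illustration under stated assumptions, not the consortium's pipeline: the function name is hypothetical, and the sketch assumes every sequence in the contig covers the candidate site, so the number of base calls at the site stands in for contig depth.

```python
from collections import Counter


def is_high_quality_snp(column_bases, min_depth=4, min_minor=2):
    """Apply the two thresholds from the abstract to one alignment column.

    `column_bases` holds the base each EST in the contig shows at the
    candidate site, e.g. "AAGG". Characters other than A, C, G, T are
    ignored (an assumption of this sketch).
    """
    counts = Counter(b for b in column_bases.upper() if b in "ACGT")
    if sum(counts.values()) < min_depth:
        return False  # fewer than four sequences support the site
    if len(counts) < 2:
        return False  # no variation at this position
    # The second most common base is treated as the minor allele.
    minor_count = counts.most_common()[1][1]
    return minor_count >= min_minor


if __name__ == "__main__":
    for column in ("AAGG", "AAAG", "AGG", "AAAAGG"):
        print(column, is_high_quality_snp(column))
```

    Requiring the minor allele in at least two sequences guards against calling a single sequencing error as a polymorphism.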

  19. Assembly of 500,000 inter-specific catfish expressed sequence tags and large scale gene-associated marker development for whole genome association studies

    PubMed Central

    2010-01-01

    Background: Through the Community Sequencing Program, a catfish EST sequencing project was carried out through a collaboration between the catfish research community and the Department of Energy's Joint Genome Institute. Prior to this project, only a limited EST resource from catfish was available for the purpose of SNP identification. Results: A total of 438,321 quality ESTs were generated from 8 channel catfish (Ictalurus punctatus) and 4 blue catfish (Ictalurus furcatus) libraries, bringing the number of catfish ESTs to nearly 500,000. Assembly of all catfish ESTs resulted in 45,306 contigs and 66,272 singletons. Over 35% of the unique sequences had significant similarities to known genes, allowing the identification of 14,776 unique genes in catfish. Over 300,000 putative SNPs have been identified, of which approximately 48,000 are high-quality SNPs identified from contigs with at least four sequences and the minor allele presence of at least two sequences in the contig. The EST resource should be valuable for identification of microsatellites, genome annotation, large-scale expression analysis, and comparative genome analysis. Conclusions: This project generated a large EST resource for catfish that captured the majority of the catfish transcriptome. The parallel analysis of ESTs from two closely related Ictalurid catfishes should also provide powerful means for the evaluation of ancient and recent gene duplications, and for the development of high-density microarrays in catfish. The inter- and intra-specific SNPs identified from all catfish EST dataset assembly will greatly benefit the catfish introgression breeding program and whole genome association studies. PMID:20096101

  20. Immunological responses of turbot (Psetta maxima) to nodavirus infection or polyriboinosinic polyribocytidylic acid (pIC) stimulation, using expressed sequence tags (ESTs) analysis and cDNA microarrays.

    PubMed

    Park, Kyoung C; Osborne, Jane A; Montes, Ariana; Dios, Sonia; Nerland, Audun H; Novoa, Beatriz; Figueras, Antonio; Brown, Laura L; Johnson, Stewart C

    2009-01-01

    To investigate the immunological responses of turbot to nodavirus infection or pIC stimulation, we constructed cDNA libraries from liver, kidney and gill tissues of nodavirus-infected fish and examined the differential gene expression within turbot kidney in response to nodavirus infection or pIC stimulation using a turbot cDNA microarray. Turbot were experimentally infected with nodavirus and samples of each tissue were collected at selected time points post-infection. Using equal amounts of total RNA at each sampling time, we made three tissue-specific cDNA libraries. After sequencing 3230 clones we obtained 3173 (98.2%) high-quality sequences from our liver, kidney and gill libraries. Of these, 2568 (80.9%) were identified as known genes and 605 (19.1%) as unknown genes. A total of 768 unique genes were identified. The two largest groups resulting from the classification of ESTs according to function were the cell/organism defense genes (71 uni-genes) and apoptosis-related process (23 uni-genes). Using these clones, a 1920 element cDNA microarray was constructed and used to investigate the differential gene expression within turbot in response to experimental nodavirus infection or pIC stimulation. Kidney tissue was collected at selected hours post-infection (HPI) or post-stimulation (HPS), and total RNA was isolated for microarray analysis. Of the 1920 genes studied on the microarray, we identified a total of 121 differentially expressed genes in the kidney: 94 genes from nodavirus-infected animals and 79 genes from those stimulated with pIC. Within the nodavirus-infected fish we observed the highest number of differentially expressed genes at 24 HPI. Our results indicate that certain genes in turbot have important roles in immune responses to nodavirus infection and dsRNA stimulation.

  1. Comparative analyses of genotype dependent expressed sequence tags and stress-responsive transcriptome of chickpea wilt illustrate predicted and unexpected genes and novel regulators of plant immunity

    PubMed Central

    Ashraf, Nasheeman; Ghai, Deepali; Barman, Pranjan; Basu, Swaraj; Gangisetty, Nagaraju; Mandal, Mihir K; Chakraborty, Niranjan; Datta, Asis; Chakraborty, Subhra

    2009-01-01

    Background: The ultimate phenome of any organism is modulated by regulated transcription of many genes. Characterization of genetic makeup is thus crucial for understanding the molecular basis of phenotypic diversity, evolution and response to intra- and extra-cellular stimuli. Chickpea is the world's third most important food legume grown in over 40 countries representing all the continents. Despite its importance in plant evolution, role in human nutrition and stress adaptation, very few ESTs and little differential transcriptome data are available, let alone genotype-specific gene signatures. The present study focuses on Fusarium wilt responsive gene expression in chickpea. Results: We report 6272 gene sequences of immune-response pathway that would provide genotype-dependent spatial information on the presence and relative abundance of each gene. The sequence assembly led to the identification of a CaUnigene set of 2013 transcripts comprising 973 contigs and 1040 singletons, two-thirds of which represent new chickpea genes hitherto undiscovered. We identified 209 gene families and 262 genotype-specific SNPs. Further, several novel transcription regulators were identified indicating their possible role in immune response. The transcriptomic analysis revealed 649 non-canonical genes besides many unexpected candidates with known biochemical functions, which have never been associated with pathostress-responsive transcriptome. Conclusion: Our study establishes a comprehensive catalogue of the immune-responsive root transcriptome with insight into their identity and function. The development, detailed analysis of CaEST datasets and global gene expression by microarray provide new insight into the commonality and diversity of organ-specific immune-responsive transcript signatures and their regulated expression shaping the species specificity at genotype level. This is the first report on differential transcriptome of an unsequenced genome during vascular wilt. PMID:19732460

  2. Expression of Epitope-Tagged Proteins in Mammalian Cells in Culture.

    PubMed

    Bhatt, Jay M; Styers, Melanie L; Sztul, Elizabeth

    2016-01-01

    Before the advent of molecular methods to tag proteins, visualization of proteins within cells required the use of antibodies directed against the protein of interest. Thus, only proteins for which antibodies were available could be visualized. Epitope tagging allows the detection of all proteins with existing sequence information, irrespective of the availability of antibodies directed against them. This technique involves the generation of DNA constructs that express the protein of interest tagged with an epitope that can be recognized by a commercially available antibody. Proteins can be tagged with a wide variety of epitopes using commercially available vectors that allow expression in mammalian cells. Epitope-tagged proteins are easily transfected into mammalian cell lines and, in most cases, tightly mimic the behavior of the endogenous protein. Tagged proteins exogenously expressed in cells provide different types of information depending on the subsequent detection approaches. Using immunofluorescence and immunoelectron microscopy with anti-tag antibodies, relative to known markers of cellular organelles, can provide information on the subcellular localization of the tagged protein and may provide clues regarding the protein's function. Immunofluorescence with anti-tag antibodies can also be utilized to assess the tagged protein's responses to cellular signals and pharmacological treatments. Immunoprecipitations with anti-tag antibodies can recover protein complexes containing the protein of interest, resulting in the identification of interacting proteins. Recovery of tagged proteins on affinity matrices allows their purification for use in biochemical assays. In addition, specialized fluorescent tags, such as the green fluorescent protein (GFP) allow the analysis of cellular dynamics in live cells in real time. PMID:27515071

  3. Characterization of expressed sequence tags (ESTs) of pigeonpea (Cajanus cajan L.) and functional validation of selected genes for abiotic stress tolerance in Arabidopsis thaliana.

    PubMed

    Priyanka, B; Sekhar, K; Sunita, T; Reddy, V D; Rao, Khareedu Venkateswara

    2010-03-01

    Pigeonpea, a major grain legume crop with remarkable drought tolerance traits, has been used for the isolation of stress-responsive genes. Herein, we report generation of ESTs, transcript profiles of selected genes and validation of candidate genes obtained from the subtracted cDNA libraries of pigeonpea plants subjected to PEG/water-deficit stress conditions. Cluster analysis of 124 selected ESTs yielded 75 high-quality ESTs. Homology searches disclosed that 55 ESTs share significant similarity with the known/putative proteins or ESTs available in the databases. These ESTs were characterized and genes relevant to the specific physiological processes were identified. Of the 75 ESTs obtained from the cDNA libraries of drought-stressed plants, 20 ESTs proved to be unique to the pigeonpea. These sequences are envisaged to serve as a potential source of stress-inducible genes of the drought stress-response transcriptome, and hence may be used for deciphering the mechanism of drought tolerance of the pigeonpea. Expression profiles of selected genes revealed increased levels of m-RNA transcripts in pigeonpea plants subjected to different abiotic stresses. Transgenic Arabidopsis lines, expressing Cajanus cajan hybrid-proline-rich protein (CcHyPRP), C. cajan cyclophilin (CcCYP) and C. cajan cold and drought regulatory (CcCDR) genes, exhibited marked tolerance, increased plant biomass and enhanced photosynthetic rates under PEG/NaCl/cold/heat stress conditions. This study represents the first report dealing with the isolation of drought-specific ESTs, transcriptome analysis and functional validation of drought-responsive genes of the pigeonpea. These genes, as such, hold promise for engineering crop plants bestowed with tolerance to major abiotic stresses. PMID:20131066

  4. Generation and analysis of a large-scale expressed sequence tags from a full-length enriched cDNA library of Siberian tiger (Panthera tigris altaica).

    PubMed

    Guo, Yu; Liu, Changqing; Lu, Taofeng; Liu, Dan; Bai, Chunyu; Li, Xiangchen; Ma, Yuehui; Guan, Weijun

    2014-05-15

    In this study, a full-length enriched cDNA library was successfully constructed from Siberian tiger, the world's most endangered species. The titers of primary and amplified libraries were 1.28 × 10^6 pfu/mL and 1.59 × 10^10 pfu/mL respectively. The proportion of recombinants from unamplified library was 91.3% and the average length of exogenous inserts was 1.06 kb. A total of 279 individual ESTs with sizes ranging from 316 to 1258 bp were then analyzed. Furthermore, 204 unigenes were successfully annotated and involved in 49 functions of the GO classification; cell (175, 85.5%), cellular process (165, 80.9%), and binding (152, 74.5%) were the dominant terms. 198 unigenes were assigned to 156 KEGG pathways, and the pathways with the most representation were metabolic pathways (18, 9.1%). The proportion pattern of each COG subcategory was similar among Panthera tigris altaica, P. tigris tigris and Homo sapiens; the general function prediction only cluster (44, 15.8%) represented the largest group, followed by translation, ribosomal structure and biogenesis (33, 11.8%), and replication, recombination and repair (24, 8.6%), and only 7.2% of ESTs were classified as novel genes. Moreover, the recombinant plasmid pET32a-TAT-COL6A2 was constructed, coding for the Trx-TAT-COL6A2 fusion protein with two 6× His-tags at the N- and C-termini. After BCA assay, the concentration of soluble Trx-TAT-COL6A2 recombinant protein was 2.64 ± 0.18 mg/mL. This library will provide a useful platform for the functional genome and transcriptome research of P. tigris and other felid animals in the future.

  5. Generation and analysis of a large-scale expressed sequence tags from a full-length enriched cDNA library of Siberian tiger (Panthera tigris altaica).

    PubMed

    Guo, Yu; Liu, Changqing; Lu, Taofeng; Liu, Dan; Bai, Chunyu; Li, Xiangchen; Ma, Yuehui; Guan, Weijun

    2014-05-15

    In this study, a full-length enriched cDNA library was successfully constructed from Siberian tiger, the world's most endangered species. The titers of primary and amplified libraries were 1.28 × 10^6 pfu/mL and 1.59 × 10^10 pfu/mL respectively. The proportion of recombinants from unamplified library was 91.3% and the average length of exogenous inserts was 1.06 kb. A total of 279 individual ESTs with sizes ranging from 316 to 1258 bp were then analyzed. Furthermore, 204 unigenes were successfully annotated and involved in 49 functions of the GO classification; cell (175, 85.5%), cellular process (165, 80.9%), and binding (152, 74.5%) were the dominant terms. 198 unigenes were assigned to 156 KEGG pathways, and the pathways with the most representation were metabolic pathways (18, 9.1%). The proportion pattern of each COG subcategory was similar among Panthera tigris altaica, P. tigris tigris and Homo sapiens; the general function prediction only cluster (44, 15.8%) represented the largest group, followed by translation, ribosomal structure and biogenesis (33, 11.8%), and replication, recombination and repair (24, 8.6%), and only 7.2% of ESTs were classified as novel genes. Moreover, the recombinant plasmid pET32a-TAT-COL6A2 was constructed, coding for the Trx-TAT-COL6A2 fusion protein with two 6× His-tags at the N- and C-termini. After BCA assay, the concentration of soluble Trx-TAT-COL6A2 recombinant protein was 2.64 ± 0.18 mg/mL. This library will provide a useful platform for the functional genome and transcriptome research of P. tigris and other felid animals in the future. PMID:24630959

  6. A sequence-tagged linkage map of Brassica rapa.

    PubMed

    Kim, Jung Sun; Chung, Tae Young; King, Graham J; Jin, Mina; Yang, Tae-Jin; Jin, Yong-Moon; Kim, Ho-Il; Park, Beom-Seok

    2006-09-01

    A detailed genetic linkage map of Brassica rapa has been constructed containing 545 sequence-tagged loci covering 1287 cM, with an average mapping interval of 2.4 cM. The loci were identified using a combination of 520 RFLP and 25 PCR-based markers. RFLP probes were derived from 359 B. rapa EST clones and amplification products of 11 B. rapa and 26 Arabidopsis. The inclusion of 21 SSR markers provided anchors to previously published linkage maps for B. rapa and B. napus, and the map follows the reference linkage-group designations R1-R10. The sequence-tagged markers allowed interpretation of the pattern of chromosome duplications within the B. rapa genome and comparison with Arabidopsis. A total of 62 EST markers showing a single RFLP band were mapped across the 10 linkage groups, indicating that these can be valuable anchoring markers for chromosome-based genome sequencing of B. rapa. Other RFLP probes gave rise to 2-5 loci, implying that B. rapa genome duplication is a general phenomenon across the 10 chromosomes. The map includes five loci of FLC paralogues, which represent the previously reported BrFLC-1, -2, -3, and -5 and additionally identified BrFLC3 paralogues derived from local segmental duplication on R3.

  7. Bioinformatic analyses of the publicly accessible crustacean expressed sequence tags (ESTs) reveal numerous novel neuropeptide-encoding precursor proteins, including ones from members of several little studied taxa.

    PubMed

    Christie, Andrew E; Durkin, Christopher S; Hartline, Niko; Ohno, Paul; Lenz, Petra H

    2010-05-15

    ESTs have been generated for many crustacean species, providing an invaluable resource for peptide discovery in members of this arthropod subphylum. Here, these data were mined for novel peptide-encoding transcripts, with the mature peptides encoded by them predicted using a combination of online peptide prediction programs and homology to known arthropod sequences. In total, 70 mature full-length/partial peptides representing members of 16 families/subfamilies were predicted, the vast majority being novel; the species from which the peptides were identified included members of the Branchiopoda (Daphnia carinata and Triops cancriformis), Maxillopoda (Caligus clemensi, Caligus rogercresseyi, Lepeophtheirus salmonis and Lernaeocera branchialis) and Malacostraca (Euphausia superba, Marsupenaeus japonicus, Penaeus monodon, Homarus americanus, Petrolisthes cinctipes, Callinectes sapidus and Portunus trituberculatus). Of particular note were the identifications of an intermediate between the insect adipokinetic hormones and crustacean red pigment concentrating hormone and a modified crustacean cardioactive peptide from the daphnid D. carinata; Arg(7)-corazonin was also deduced from this species, the first identification of a corazonin from a non-decapod crustacean. Our data also include the first reports of members of the calcitonin-like diuretic hormone, FMRFamide-related peptide (neuropeptide F subfamily) and orcokinin families from members of the Copepoda. Moreover, the prediction of a bursicon alpha from the euphausid E. superba represents the first peptide identified from any member of the basal eucaridean order Euphausiacea. In addition, large collections of insect eclosion hormone- and neuroparsin-like peptides were identified from a variety of species, greatly expanding the number of known members of these families in crustaceans.

  8. SV40 Tag DNA sequences, present in a small proportion of human hepatocellular carcinomas, are associated with reduced survival

    PubMed Central

    Wong, N A C S; Rae, F; Herriot, M M; Mayer, N J; Brewster, D H; Harrison, D J

    2003-01-01

    Aims: To study the association between simian virus 40 (SV40) and human hepatocarcinogenesis. Methods: Polymerase chain reaction (PCR) to detect SV40 large T antigen (Tag) DNA was performed on: 50 human hepatocellular carcinomas (HCCs) diagnosed between 1978 and 1989 (cohort A); 20 cases of alcoholic liver cirrhosis from the same period; and 20 HCCs diagnosed after 1997 (cohort B). PCR to detect SV40 regulatory sequence and SV40 Tag immunohistochemistry were performed on selected cases from cohorts A and B. Amplified products were directly sequenced. Immunohistochemistry for p53 and pRb and clinicopathological analyses were performed on selected cases from cohorts A and B. Complete survival data were collected for cohort A. Results: SV40 Tag DNA was found in five cohort A HCCs but not in alcoholic liver cirrhosis cases or cohort B HCCs. Neither SV40 regulatory sequence nor SV40 Tag protein was demonstrated in Tag DNA positive HCCs. No clinicopathological differences existed between Tag DNA positive and negative HCCs, but the presence of Tag DNA was associated with reduced disease specific survival. Relatively fewer Tag DNA positive than negative HCCs expressed p53, but loss of pRb expression was similar in the two groups. Patients with Tag DNA positive HCCs were unlikely to have received SV40 contaminated poliovirus vaccine. Conclusions: SV40 Tag DNA is present in a small proportion of historical HCCs and may contribute to their pathogenesis and influence their outcome. The source of the virus is uncertain and more recent HCCs show no evidence of SV40. PMID:14645347

  9. Sequence tagged microsatellite profiling (STMP): improved isolation of DNA sequence flanking target SSRs

    PubMed Central

    Hayden, M. J.; Good, G.; Sharp, P. J.

    2002-01-01

    Sequence tagged microsatellite profiling (STMP) enables the rapid development of large numbers of co-dominant DNA markers, known as sequence tagged microsatellites (STMs). Each STM is amplified by PCR using a single primer specific to the conserved DNA sequence flanking the microsatellite repeat in combination with a universal primer that anchors to the 5′-ends of the microsatellites. It is also possible to convert STMs into conventional microsatellite, or simple sequence repeat (SSR), markers that are amplified using a pair of primers flanking the repeat sequence. Here, we describe a modification of the STMP procedure to significantly improve the capacity to convert STMs into conventional SSRs and, therefore, facilitate the development of highly specific DNA markers for purposes such as marker-assisted breeding. The usefulness of this technique was demonstrated in bread wheat. PMID:12466561

  10. Sequence tagged microsatellite profiling (STMP): improved isolation of DNA sequence flanking target SSRs.

    PubMed

    Hayden, M J; Good, G; Sharp, P J

    2002-12-01

    Sequence tagged microsatellite profiling (STMP) enables the rapid development of large numbers of co-dominant DNA markers, known as sequence tagged microsatellites (STMs). Each STM is amplified by PCR using a single primer specific to the conserved DNA sequence flanking the microsatellite repeat in combination with a universal primer that anchors to the 5'-ends of the microsatellites. It is also possible to convert STMs into conventional microsatellite, or simple sequence repeat (SSR), markers that are amplified using a pair of primers flanking the repeat sequence. Here, we describe a modification of the STMP procedure to significantly improve the capacity to convert STMs into conventional SSRs and, therefore, facilitate the development of highly specific DNA markers for purposes such as marker-assisted breeding. The usefulness of this technique was demonstrated in bread wheat. PMID:12466561

  11. CREST--classification resources for environmental sequence tags.

    PubMed

    Lanzén, Anders; Jørgensen, Steffen L; Huson, Daniel H; Gorfer, Markus; Grindhaug, Svenn Helge; Jonassen, Inge; Øvreås, Lise; Urich, Tim

    2012-01-01

    Sequencing of taxonomic or phylogenetic markers is becoming a fast and efficient method for studying environmental microbial communities. This has resulted in a steadily growing collection of marker sequences, most notably of the small-subunit (SSU) ribosomal RNA gene, and an increased understanding of microbial phylogeny, diversity and community composition patterns. However, to utilize these large datasets together with new sequencing technologies, a reliable and flexible system for taxonomic classification is critical. We developed CREST (Classification Resources for Environmental Sequence Tags), a set of resources and tools for generating and utilizing custom taxonomies and reference datasets for classification of environmental sequences. CREST uses an alignment-based classification method with the lowest common ancestor algorithm. It also uses explicit rank similarity criteria to reduce false positives and identify novel taxa. We implemented this method in a web server, a command line tool and the graphical user interfaced program MEGAN. Further, we provide the SSU rRNA reference database and taxonomy SilvaMod, derived from the publicly available SILVA SSURef, for classification of sequences from bacteria, archaea and eukaryotes. Using cross-validation and environmental datasets, we compared the performance of CREST and SilvaMod to the RDP Classifier. We also utilized Greengenes as a reference database, both with CREST and the RDP Classifier. These analyses indicate that CREST performs better than alignment-free methods with higher recall rate (sensitivity) as well as precision, and with the ability to accurately identify most sequences from novel taxa. Classification using SilvaMod performed better than with Greengenes, particularly when applied to environmental sequences. CREST is freely available under a GNU General Public License (v3) from http://apps.cbu.uib.no/crest and http://lcaclassifier.googlecode.com. PMID:23145153
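    Note: the lowest common ancestor (LCA) step named in the abstract above can be illustrated with a short Python sketch. This is not CREST's code. The function names, the toy lineages, and the identity thresholds are all hypothetical, chosen only to show the general idea of assigning a read to the deepest taxon shared by its best alignment hits and then capping the assignment depth by similarity.

```python
from typing import List, Sequence, Tuple


def lowest_common_ancestor(lineages: Sequence[Sequence[str]]) -> List[str]:
    """Return the longest rank-by-rank prefix shared by all lineages."""
    common: List[str] = []
    for ranks in zip(*lineages):  # walk domain, phylum, class, ... in parallel
        if len(set(ranks)) == 1:
            common.append(ranks[0])
        else:
            break  # first disagreement ends the shared path
    return common


# Hypothetical rank similarity criteria: (minimum % identity, maximum depth).
# These numbers are illustrative and are NOT taken from CREST.
DEPTH_RULES: List[Tuple[float, int]] = [(95.0, 4), (90.0, 3), (85.0, 2)]


def truncate_by_identity(lineage: Sequence[str], identity: float) -> List[str]:
    """Cap how deep an assignment may go, given the best hit's % identity."""
    for min_identity, depth in DEPTH_RULES:
        if identity >= min_identity:
            return list(lineage[:depth])
    return list(lineage[:1])  # very divergent reads stay at the top rank


# Toy lineages (domain, phylum, class, order) for three alignment hits.
hits = [
    ["Bacteria", "Proteobacteria", "Gammaproteobacteria", "Enterobacterales"],
    ["Bacteria", "Proteobacteria", "Gammaproteobacteria", "Pseudomonadales"],
    ["Bacteria", "Proteobacteria", "Alphaproteobacteria", "Rhizobiales"],
]

assignment = lowest_common_ancestor(hits)
print(" > ".join(assignment))
```

    In this toy case the three hits disagree at class level, so the read is assigned to the phylum. A low-identity best hit would additionally be cut back by `truncate_by_identity`, which is one plausible way to keep sequences from novel taxa from being placed too deep in the tree.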

  12. CREST – Classification Resources for Environmental Sequence Tags

    PubMed Central

    Lanzén, Anders; Jørgensen, Steffen L.; Huson, Daniel H.; Gorfer, Markus; Grindhaug, Svenn Helge; Jonassen, Inge; Øvreås, Lise; Urich, Tim

    2012-01-01

    Sequencing of taxonomic or phylogenetic markers is becoming a fast and efficient method for studying environmental microbial communities. This has resulted in a steadily growing collection of marker sequences, most notably of the small-subunit (SSU) ribosomal RNA gene, and an increased understanding of microbial phylogeny, diversity and community composition patterns. However, to utilize these large datasets together with new sequencing technologies, a reliable and flexible system for taxonomic classification is critical. We developed CREST (Classification Resources for Environmental Sequence Tags), a set of resources and tools for generating and utilizing custom taxonomies and reference datasets for classification of environmental sequences. CREST uses an alignment-based classification method with the lowest common ancestor algorithm. It also uses explicit rank similarity criteria to reduce false positives and identify novel taxa. We implemented this method in a web server, a command line tool and the graphical user interfaced program MEGAN. Further, we provide the SSU rRNA reference database and taxonomy SilvaMod, derived from the publicly available SILVA SSURef, for classification of sequences from bacteria, archaea and eukaryotes. Using cross-validation and environmental datasets, we compared the performance of CREST and SilvaMod to the RDP Classifier. We also utilized Greengenes as a reference database, both with CREST and the RDP Classifier. These analyses indicate that CREST performs better than alignment-free methods with higher recall rate (sensitivity) as well as precision, and with the ability to accurately identify most sequences from novel taxa. Classification using SilvaMod performed better than with Greengenes, particularly when applied to environmental sequences. CREST is freely available under a GNU General Public License (v3) from http://apps.cbu.uib.no/crest and http://lcaclassifier.googlecode.com. PMID:23145153

  14. Multiplexed metagenome mining using short DNA sequence tags facilitates targeted discovery of epoxyketone proteasome inhibitors

    PubMed Central

    Owen, Jeremy G.; Charlop-Powers, Zachary; Smith, Alexandra G.; Ternei, Melinda A.; Calle, Paula Y.; Reddy, Boojala Vijay B.; Montiel, Daniel; Brady, Sean F.

    2015-01-01

    In molecular evolutionary analyses, short DNA sequences are used to infer phylogenetic relationships among species. Here we apply this principle to the study of bacterial biosynthesis, enabling the targeted isolation of previously unidentified natural products directly from complex metagenomes. Our approach uses short natural product sequence tags derived from conserved biosynthetic motifs to profile biosynthetic diversity in the environment and then guide the recovery of gene clusters from metagenomic libraries. The methodology is conceptually simple, requires only a small investment in sequencing, and is not computationally demanding. To demonstrate the power of this approach to natural product discovery we conducted a computational search for epoxyketone proteasome inhibitors within 185 globally distributed soil metagenomes. This led to the identification of 99 unique epoxyketone sequence tags, falling into 6 phylogenetically distinct clades. Complete gene clusters associated with nine unique tags were recovered from four saturating soil metagenomic libraries. Using heterologous expression methodologies, seven potent epoxyketone proteasome inhibitors (clarepoxcins A–E and landepoxcins A and B) were produced from these pathways, including compounds with different warhead structures and a naturally occurring halohydrin prodrug. This study provides a template for the targeted expansion of bacterially derived natural products using the global metagenome. PMID:25831524

  15. Multiplexed metagenome mining using short DNA sequence tags facilitates targeted discovery of epoxyketone proteasome inhibitors.

    PubMed

    Owen, Jeremy G; Charlop-Powers, Zachary; Smith, Alexandra G; Ternei, Melinda A; Calle, Paula Y; Reddy, Boojala Vijay B; Montiel, Daniel; Brady, Sean F

    2015-04-01

    In molecular evolutionary analyses, short DNA sequences are used to infer phylogenetic relationships among species. Here we apply this principle to the study of bacterial biosynthesis, enabling the targeted isolation of previously unidentified natural products directly from complex metagenomes. Our approach uses short natural product sequence tags derived from conserved biosynthetic motifs to profile biosynthetic diversity in the environment and then guide the recovery of gene clusters from metagenomic libraries. The methodology is conceptually simple, requires only a small investment in sequencing, and is not computationally demanding. To demonstrate the power of this approach to natural product discovery we conducted a computational search for epoxyketone proteasome inhibitors within 185 globally distributed soil metagenomes. This led to the identification of 99 unique epoxyketone sequence tags, falling into 6 phylogenetically distinct clades. Complete gene clusters associated with nine unique tags were recovered from four saturating soil metagenomic libraries. Using heterologous expression methodologies, seven potent epoxyketone proteasome inhibitors (clarepoxcins A-E and landepoxcins A and B) were produced from these pathways, including compounds with different warhead structures and a naturally occurring halohydrin prodrug. This study provides a template for the targeted expansion of bacterially derived natural products using the global metagenome.

  16. Data presenting a modified bacterial expression vector for expressing and purifying Nus solubility-tagged proteins.

    PubMed

    Gupta, Nidhi; Wu, Heng; Terman, Jonathan R

    2016-09-01

    Bacteria are the predominant source for producing recombinant proteins but while many exogenous proteins are expressed, only a fraction of those are soluble. We have found that a new actin regulatory enzyme Mical is poorly soluble when expressed in bacteria but the use of a Nus fusion protein tag greatly increases its solubility. However, available vectors containing a Nus tag have been engineered in a way that hinders the separation of target proteins from the Nus tag during protein purification. We have now used recombinant DNA approaches to overcome these issues and reengineer a Nus solubility tag-containing bacterial expression vector. The data herein present a modified bacterial expression vector useful for expressing proteins fused to the Nus solubility tag and separating such target proteins from the Nus tag during protein purification. PMID:27547802

  17. In silico mining and characterization of simple sequence repeats from gilthead sea bream (Sparus aurata) expressed sequence tags (EST-SSRs); PCR amplification, polymorphism evaluation and multiplexing and cross-species assays.

    PubMed

    Vogiatzi, Emmanouella; Lagnel, Jacques; Pakaki, Victoria; Louro, Bruno; Canario, Adelino V M; Reinhardt, Richard; Kotoulas, Georgios; Magoulas, Antonios; Tsigenopoulos, Costas S

    2011-06-01

    We screened for simple sequence repeats (SSRs) found in ESTs derived from an EST-database development project ('Marine Genomics Europe' Network of Excellence). Different motifs of di-, tri-, tetra-, penta- and hexanucleotide SSRs were evaluated for variation in length and position in the expressed sequences, relative abundance and distribution in gilthead sea bream (Sparus aurata). We found 899 ESTs that harbor 997 SSRs (4.94%). On average, one SSR was found per 2.95 kb of EST sequence and the dinucleotide SSRs are the most abundant accounting for 47.6% of the total number. EST-SSRs were used as template for primer design. 664 primer pairs could be successfully identified and a subset of 206 pairs of primers was synthesized, PCR-tested and visualized on ethidium bromide stained agarose gels. The main objective was to further assess the potential of EST-SSRs as informative markers and investigate their cross-species amplification in sixteen teleost fish species: seven sparid species and nine other species from different families. Approximately 78% of the primer pairs gave PCR products of expected size in gilthead sea bream, and as expected, the rate of successful amplification of sea bream EST-SSRs was higher in sparids, lower in other perciforms and even lower in species of the Clupeiform and Gadiform orders. We finally determined the polymorphism and the heterozygosity of 63 markers in a wild gilthead sea bream population; fifty-eight loci were found to be polymorphic with the expected heterozygosity and the number of alleles ranging from 0.089 to 0.946 and from 2 to 27, respectively. These tools and markers are expected to enhance the available genetic linkage map in gilthead sea bream, to assist comparative mapping and genome analyses for this species and further with other model fish species and finally to help advance genetic analysis for cultivated and wild populations and accelerate breeding programs.
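    Note: in silico SSR screening of the kind described above is commonly done by scanning each sequence for short motifs repeated in tandem. The Python sketch below is a minimal illustration, not the pipeline used by the authors. The abstract does not state the minimum repeat counts, so the thresholds in `MIN_REPEATS` are hypothetical, as are the function names and the example sequence.

```python
import re
from typing import List, Tuple

# Motif length -> minimum number of tandem copies to call an SSR.
# Hypothetical thresholds; the abstract does not give the values used.
MIN_REPEATS = {2: 6, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif: str) -> bool:
    """False if the motif is itself a repeat of a shorter unit (e.g. ACAC)."""
    n = len(motif)
    for k in range(1, n):
        if n % k == 0 and motif[:k] * (n // k) == motif:
            return False
    return True


def find_ssrs(seq: str) -> List[Tuple[int, str, int]]:
    """Return (start, motif, copies) for each di- to hexanucleotide SSR."""
    seq = seq.upper()
    hits = []
    for k, min_rep in sorted(MIN_REPEATS.items()):
        # One motif of length k, then at least min_rep - 1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, min_rep - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if is_primitive(motif):  # skip AA, ACAC, ... counted at smaller k
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)


# Made-up EST fragment with an (AC)7 and a (CAG)5 repeat.
est = "GGTT" + "AC" * 7 + "TTGA" + "CAG" * 5 + "TT"
for start, motif, copies in find_ssrs(est):
    print(f"pos {start}: ({motif}){copies}")
```

    Sequence flanking each reported position could then be passed to a primer design tool, which corresponds to the primer design step mentioned in the abstract. Compound and imperfect repeats are not handled by this sketch.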

  18. Plant Gene and Alternatively Spliced Variant Annotator. A plant genome annotation pipeline for rice gene and alternatively spliced variant identification with cross-species expressed sequence tag conservation from seven plant species.

    PubMed

    Chen, Feng-Chi; Wang, Sheng-Shun; Chaw, Shu-Miaw; Huang, Yao-Ting; Chuang, Trees-Juen

    2007-03-01

    The completion of the rice (Oryza sativa) genome draft has brought unprecedented opportunities for genomic studies of the world's most important food crop. Previous rice gene annotations have relied mainly on ab initio methods, which usually yield a high rate of false-positive predictions and give only limited information regarding alternative splicing in rice genes. Comparative approaches based on expressed sequence tags (ESTs) can compensate for the drawbacks of ab initio methods because they can simultaneously identify experimental data-supported genes and alternatively spliced transcripts. Furthermore, cross-species EST information can be used to not only offset the insufficiency of same-species ESTs but also derive evolutionary implications. In this study, we used ESTs from seven plant species, rice, wheat (Triticum aestivum), maize (Zea mays), barley (Hordeum vulgare), sorghum (Sorghum bicolor), soybean (Glycine max), and Arabidopsis (Arabidopsis thaliana), to annotate the rice genome. We developed a plant genome annotation pipeline, Plant Gene and Alternatively Spliced Variant Annotator (PGAA). Using this approach, we identified 852 genes (931 isoforms) not annotated in other widely used databases (i.e. the Institute for Genomic Research, National Center for Biotechnology Information, and Rice Annotation Project) and found 87% of them supported by both rice and nonrice EST evidence. PGAA also identified more than 44,000 alternatively spliced events, of which approximately 20% are not observed in the other three annotations. These novel annotations represent rich opportunities for rice genome research, because the functions of most of our annotated genes are currently unknown. Also, in the PGAA annotation, the isoforms with non-rice-EST-supported exons are significantly enriched in transporter activity but significantly underrepresented in transcription regulator activity. We have also identified potential lineage-specific and conserved isoforms, which are

  19. The contribution of 700,000 ORF sequence tags to the definition of the human transcriptome

    PubMed Central

    Camargo, Anamaria A.; Samaia, Helena P. B.; Dias-Neto, Emmanuel; Simão, Daniel F.; Migotto, Italo A.; Briones, Marcelo R. S.; Costa, Fernando F.; Aparecida Nagai, Maria; Verjovski-Almeida, Sergio; Zago, Marco A.; Andrade, Luis Eduardo C.; Carrer, Helaine; El-Dorry, Hamza F. A.; Espreafico, Enilza M.; Habr-Gama, Angelita; Giannella-Neto, Daniel; Goldman, Gustavo H.; Gruber, Arthur; Hackel, Christine; Kimura, Edna T.; Maciel, Rui M. B.; Marie, Suely K. N.; Martins, Elizabeth A. L.; Nóbrega, Marina P.; Paçó-Larson, Maria Luisa; Pardini, Maria Inês M. C.; Pereira, Gonçalo G.; Pesquero, João Bosco; Rodrigues, Vanderlei; Rogatto, Silvia R.; da Silva, Ismael D. C. G.; Sogayar, Mari C.; Sonati, Maria de Fátima; Tajara, Eloiza H.; Valentini, Sandro R.; Alberto, Fernando L.; Amaral, Maria Elisabete J.; Aneas, Ivy; Arnaldi, Liliane A. T.; de Assis, Angela M.; Bengtson, Mário Henrique; Bergamo, Nadia Aparecida; Bombonato, Vanessa; de Camargo, Maria E. R.; Canevari, Renata A.; Carraro, Dirce M.; Cerutti, Janete M.; Corrêa, Maria Lucia C.; Corrêa, Rosana F. R.; Costa, Maria Cristina R.; Curcio, Cyntia; Hokama, Paula O. M.; Ferreira, Ari J. S.; Furuzawa, Gilberto K.; Gushiken, Tsieko; Ho, Paulo L.; Kimura, Elza; Krieger, José E.; Leite, Luciana C. C.; Majumder, Paromita; Marins, Mozart; Marques, Everaldo R.; Melo, Analy S. A.; Melo, Monica; Mestriner, Carlos Alberto; Miracca, Elisabete C.; Miranda, Daniela C.; Nascimento, Ana Lucia T. O.; Nóbrega, Francisco G.; Ojopi, Élida P. B.; Pandolfi, José Rodrigo C.; Pessoa, Luciana G.; Prevedel, Aline C.; Rahal, Paula; Rainho, Claudia A.; Reis, Eduardo M. R.; Ribeiro, Marcelo L.; da Rós, Nancy; de Sá, Renata G.; Sales, Magaly M.; Sant'anna, Simone Cristina; dos Santos, Mariana L.; da Silva, Aline M.; da Silva, Neusa P.; Silva, Wilson A.; da Silveira, Rosana A.; Sousa, Josane F.; Stecconi, Daniella; Tsukumo, Fernando; Valente, Valéria; Soares, Fernando; Moreira, Eloisa S.; Nunes, Diana N.; Correa, Ricardo G.; Zalcberg, Heloisa; Carvalho, Alex F.; Reis, Luis F. L.; Brentani, Ricardo R.; Simpson, Andrew J. G.; de Souza, Sandro J.

    2001-01-01

    Open reading frame expressed sequences tags (ORESTES) differ from conventional ESTs by providing sequence data from the central protein coding portion of transcripts. We generated a total of 696,745 ORESTES sequences from 24 human tissues and used a subset of the data that correspond to a set of 15,095 full-length mRNAs as a means of assessing the efficiency of the strategy and its potential contribution to the definition of the human transcriptome. We estimate that ORESTES sampled over 80% of all highly and moderately expressed, and between 40% and 50% of rarely expressed, human genes. In our most thoroughly sequenced tissue, the breast, the 130,000 ORESTES generated are derived from transcripts from an estimated 70% of all genes expressed in that tissue, with an equally efficient representation of both highly and poorly expressed genes. In this respect, we find that the capacity of the ORESTES strategy both for gene discovery and shotgun transcript sequence generation significantly exceeds that of conventional ESTs. The distribution of ORESTES is such that many human transcripts are now represented by a scaffold of partial sequences distributed along the length of each gene product. The experimental joining of the scaffold components, by reverse transcription–PCR, represents a direct route to transcript finishing that may represent a useful alternative to full-length cDNA cloning. PMID:11593022

  20. Hierarchical molecular tagging to resolve long continuous sequences by massively parallel sequencing

    PubMed Central

    Lundin, Sverker; Gruselius, Joel; Nystedt, Björn; Lexow, Preben; Käller, Max; Lundeberg, Joakim

    2013-01-01

    Here we demonstrate the use of short-read massive sequencing systems to in effect achieve longer read lengths through hierarchical molecular tagging. We show how indexed and PCR-amplified targeted libraries are degraded, sub-sampled and arrested at timed intervals to achieve pools of differing average length, each of which is indexed with a new tag. By this process, indices of sample origin, molecular origin, and degree of degradation are incorporated in order to achieve a nested hierarchical structure, later to be utilized in the data processing to order the reads over a longer distance than the sequencing system originally allows. With this protocol we show how continuous regions beyond 3000 bp can be decoded by an Illumina sequencing system, and we illustrate the potential applications by calling variants of the lambda genome, analysing TP53 in cancer cell lines, and targeting a variable canine mitochondrial region. PMID:23470464

  1. Perceptual learning of contrast discrimination under roving: the role of semantic sequence in stimulus tagging.

    PubMed

    Cong, Lin-Juan; Zhang, Jun-Yun

    2014-11-03

    Perceptual learning may occur when multiple contrasts are practiced in a fixed, but not in a roving (random), temporal sequence. However, learning may escape roving disruption when each contrast is assigned a letter tag (i.e., A, B, C, D). Because these letter tags carry not only stimulus identity information, but also semantic sequence information, here we investigated whether the semantic sequence information is necessary for learning of tagged contrasts under the roving condition. We found that assigning number tags (i.e., 1, 2, 3, 4), which also contained both identity and semantic sequence information, to four roving contrasts enabled significant learning of discrimination of each contrast, confirming previous data. However, learning became insignificant when the contrast tags were replaced with Greek letters that were familiar to our Chinese observers except their sequence or Chinese characters that carried no sequence information. In addition, assigning orientation tags, which carried no sequence information either, to roving contrasts was ineffective as well because learning occurred only with sequenced but not roving contrasts. These results suggest that semantic sequence information is necessary for stimulus tagging to effectively enable perceptual learning of multiple contrast discrimination under roving.

  2. Grouping and identification of sequence tags (GRIST): bioinformatics tools for the NEIBank database.

    PubMed

    Wistow, Graeme; Bernstein, Steven L; Touchman, Jeffrey W; Bouffard, Gerald; Wyatt, M Keith; Peterson, Katherine; Behal, Amita; Gao, James; Buchoff, Patee; Smith, Don

    2002-06-15

    NEIBank is a project to develop and organize genomics and bioinformatics resources for the eye. As part of this effort, tools have been developed for bioinformatics analysis and web based display of data from expressed sequence tag (EST) analyses. EST sequences are identified and formed into groups or clusters representing related transcripts from the same gene. This is carried out by a rules-based procedure called GRIST (GRouping and Identification of Sequence Tags) that uses sequence match parameters derived from BLAST programs. Linked procedures are used to eliminate non-mRNA contaminants. All data are stored in a relational database and assembled for display as web pages with annotations and links to other informatics resources. Genome projects generate huge amounts of data that need to be classified and organized to become easily accessible to the research community. GRIST provides a useful tool for assembling and displaying the results of EST analyses. The NEIBank web site contains a growing set of pages cataloging the known transcriptional repertoire of eye tissues, derived from new NEIBank cDNA libraries and from eye-related data deposited in the dbEST section of GenBank. PMID:12107414

  3. Sequence-tagged microsatellite profiling (STMP): a rapid technique for developing SSR markers.

    PubMed

    Hayden, M J; Sharp, P J

    2001-04-15

    We describe a technique, sequence-tagged microsatellite profiling (STMP), to rapidly generate large numbers of simple sequence repeat (SSR) markers from genomic or cDNA. This technique eliminates the need for library screening to identify SSR-containing clones and provides an approximately 25-fold increase in sequencing throughput compared to traditional methods. STMP generates short but characteristic nucleotide sequence tags for fragments that are present within a pool of SSR amplicons. These tags are then ligated together to form concatemers for cloning and sequencing. The analysis of thousands of tags gives rise to a representational profile of the abundance and frequency of SSRs within the DNA pool, from which low copy sequences can be identified. As each tag contains sufficient nucleotide sequence for primer design, their conversion into PCR primers allows the amplification of corresponding full-length fragments from the pool of SSR amplicons. These fragments permit the full characterisation of a SSR locus and provide flanking sequence for the development of a microsatellite marker. Alternatively, sequence tag primers can be used to directly amplify corresponding SSR loci from genomic DNA, thereby reducing the cost of developing a microsatellite marker to the synthesis of just one sequence-specific primer. We demonstrate the utility of STMP by the development of SSR markers in bread wheat. PMID:11292857

  4. Sequence-tagged microsatellite profiling (STMP): a rapid technique for developing SSR markers

    PubMed Central

    Hayden, M. J.; Sharp, P. J.

    2001-01-01

    We describe a technique, sequence-tagged microsatellite profiling (STMP), to rapidly generate large numbers of simple sequence repeat (SSR) markers from genomic or cDNA. This technique eliminates the need for library screening to identify SSR-containing clones and provides an ∼25-fold increase in sequencing throughput compared to traditional methods. STMP generates short but characteristic nucleotide sequence tags for fragments that are present within a pool of SSR amplicons. These tags are then ligated together to form concatemers for cloning and sequencing. The analysis of thousands of tags gives rise to a representational profile of the abundance and frequency of SSRs within the DNA pool, from which low copy sequences can be identified. As each tag contains sufficient nucleotide sequence for primer design, their conversion into PCR primers allows the amplification of corresponding full-length fragments from the pool of SSR amplicons. These fragments permit the full characterisation of a SSR locus and provide flanking sequence for the development of a microsatellite marker. Alternatively, sequence tag primers can be used to directly amplify corresponding SSR loci from genomic DNA, thereby reducing the cost of developing a microsatellite marker to the synthesis of just one sequence-specific primer. We demonstrate the utility of STMP by the development of SSR markers in bread wheat. PMID:11292857

  5. pAUL: A Gateway-Based Vector System for Adaptive Expression and Flexible Tagging of Proteins in Arabidopsis

    PubMed Central

    Lyska, Dagmar; Engelmann, Kerstin; Meierhoff, Karin; Westhoff, Peter

    2013-01-01

    Determination of protein function requires tools that allow its detection and/or purification. As generation of specific antibodies often is laborious and insufficient, protein tagging using epitopes that are recognized by commercially available antibodies and matrices appears more promising. Also, proper spatial and temporal expression of tagged proteins is required to prevent falsification of results. We developed a new series of binary Gateway cloning vectors named pAUL1-20 for C- and N-terminal in-frame fusion of proteins to four different tags: a single (i) HA epitope and (ii) Strep-tagIII, (iii) both epitopes combined to a double tag, and (iv) a triple tag consisting of the double tag extended by a Protein A tag possessing a 3C protease cleavage site. Expression can be driven by either the 35S CaMV promoter or, for C-terminal fusions, promoters from genes encoding the chloroplast biogenesis factors HCF107, HCF136, or HCF173. Fusions of the four promoters to the GUS gene showed that endogenous promoter sequences are functional and drive expression more moderately and consistently throughout different transgenic lines when compared to the 35S CaMV promoter. By testing complementation of mutations affected in chloroplast biogenesis factors HCF107 and HCF208, we found that the effect of different promoters and tags on protein function strongly depends on the protein itself. Single-step and tandem affinity purification of HCF208 via different tags confirmed the integrity of the cloned tags. PMID:23326506

  6. Genomic Sequence or Signature Tags (GSTs) from the Genome Group at Brookhaven National Laboratory (BNL)

    DOE Data Explorer

    Dunn, John J.; McCorkle, Sean R.; Praissman, Laura A.; Hind, Geoffrey; Van der Lelie, Daniel; Bahou, Wadie F.; Gnatenko, Dmitri V.; Krause, Maureen K.

    Genomic Signature Tags (GSTs) are the products of a method we have developed for identifying and quantitatively analyzing genomic DNAs. The DNA is initially fragmented with a type II restriction enzyme. An oligonucleotide adaptor containing a recognition site for MmeI, a type IIS restriction enzyme, is then used to release 21-bp tags from fixed positions in the DNA relative to the sites recognized by the fragmenting enzyme. These tags are PCR-amplified, purified, concatenated and then cloned and sequenced. The tag sequences and abundances are used to create a high resolution GST sequence profile of the genomic DNA. [Quoted from Genomic Signature Tags (GSTs): A System for Profiling Genomic DNA, Dunn, John J.; McCorkle, Sean R.; Praissman, Laura A.; Hind, Geoffrey; Van der Lelie, Daniel; Bahou, Wadie F.; Gnatenko, Dmitri V.; Krause, Maureen K., Revised 9/13/2002]
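
    The idea of tags taken from fixed positions relative to restriction sites can be sketched in silico. The example below is a simplification under stated assumptions: the fragmenting-enzyme site is taken to be CATG purely for illustration, the tag is taken as the 21 bases immediately downstream of each site, and only one strand is scanned. The function names are hypothetical, and the real MmeI cut geometry and adaptor chemistry are not modelled.

```python
from collections import Counter


def extract_tags(sequence, site="CATG", tag_len=21):
    """Return the tag_len bases that follow each occurrence of `site`.

    `site` and the downstream-only tag position are illustrative
    assumptions, not the published protocol.
    """
    tags = []
    start = sequence.find(site)
    while start != -1:
        tag_start = start + len(site)
        tag = sequence[tag_start:tag_start + tag_len]
        if len(tag) == tag_len:  # skip sites too close to the sequence end
            tags.append(tag)
        start = sequence.find(site, start + 1)
    return tags


def tag_profile(sequences, site="CATG", tag_len=21):
    """Count how often each tag occurs across a collection of sequences."""
    counts = Counter()
    for seq in sequences:
        counts.update(extract_tags(seq, site=site, tag_len=tag_len))
    return counts


# Toy sequence: one site with a full 21-bp tag, one site too near the end.
toy = "AACATG" + "ACGTACGTACGTACGTACGTA" + "GG" + "CATG" + "TTT"
tags = extract_tags(toy)
profile = tag_profile([toy, toy])
```

    Counting identical tags across many sequences gives the abundance profile that the record describes; a fuller model would also scan the reverse complement.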

  7. Generation and Analysis of a Large-Scale Expressed Sequence Tag Database from a Full-Length Enriched cDNA Library of Developing Leaves of Gossypium hirsutum L

    PubMed Central

    Pang, Chaoyou; Fan, Shuli; Song, Meizhen; Yu, Shuxun

    2013-01-01

    Background Cotton (Gossypium hirsutum L.) is one of the world’s most economically-important crops. However, its entire genome has not been sequenced, and limited resources are available in GenBank for understanding the molecular mechanisms underlying leaf development and senescence. Methodology/Principal Findings In this study, 9,874 high-quality ESTs were generated from a normalized, full-length cDNA library derived from pooled RNA isolated from throughout leaf development during the plant blooming stage. After clustering and assembly of these ESTs, 5,191 unique sequences, representing 1,652 contigs and 3,539 singletons, were obtained. The average unique sequence length was 682 bp. Annotation of these unique sequences revealed that 84.4% showed significant homology to sequences in the NCBI non-redundant protein database, and 57.3% had significant hits to known proteins in the Swiss-Prot database. Comparative analysis indicated that our library added 2,400 ESTs and 991 unique sequences to those known for cotton. The unigenes were functionally characterized by gene ontology annotation. We identified 1,339 and 200 unigenes as potential leaf senescence-related genes and transcription factors, respectively. Moreover, nine genes related to leaf senescence and eleven MYB transcription factors were randomly selected for quantitative real-time PCR (qRT-PCR), which revealed that these genes were regulated differentially during senescence. The qRT-PCR for three GhYLSs revealed that these genes are expressed preferentially in senescent leaves. Conclusions/Significance These EST resources will provide valuable sequence information for gene expression profiling analyses and functional genomics studies to elucidate their roles, as well as for studying the mechanisms of leaf development and senescence in cotton and discovering candidate genes related to important agronomic traits of cotton. These data will also facilitate future whole-genome sequence assembly and annotation

  8. Phylogeny of Saccharina and Laminaria (Laminariaceae, Laminariales, Phaeophyta) in sequence-tagged-site markers

    NASA Astrophysics Data System (ADS)

    Qu, Jieqiong; Zhang, Jing; Wang, Xumin; Chi, Shan; Liu, Cui; Liu, Tao

    2014-01-01

    Laminaria and Saccharina have recently been recognized as two independent clades from the former genus Laminaria. Traditional morphological taxonomy is being challenged by molecular evidence from both nucleus and plastid. Intensive work is in great demand from the perspective of genome colinearity. In this study, 118 sequence-tagged site (STS) markers were screened for phylogenetic analyses, 29 based on genome sequences, while 89 were based on expressed sequence tag (EST) sequences. EST-based STS marker development (29.37%) had an efficiency twice as high as genome-sequence-based development (9.48%) as a result of high conservation of gene transcripts among the related species. S. ochotensis, S. religiosa, S. japonica, and L. hyperborea showed great homogeneity in all 118 STS markers. Our result supports the view that the diversification between the genera Saccharina and Laminaria was a more recent event and that Saccharina and Laminaria shared high phylogenetic affinity. However, when it came to the single nucleotide polymorphism (SNP) level among the 41 SNPs, L. hyperborea owned 29 unique SNPs against 12 within the other three Saccharina species and 12 of the 13 indels were supposedly unique for L. hyperborea, indicated by its high variability. Originating from homologous ancestors, species between the recently diverged genera Laminaria and Saccharina may have taken in enough mutations at the SNP level only, in spite of different evolutionary strategies for better adaptation to the environment. Our study lays a solid foundation from a new perspective, although more accurate phylogenetic analysis is still needed to clarify the evolutionary traces between the genera Saccharina and Laminaria.

  9. High-Throughput Tag-Sequencing Analysis of Early Events Induced by Ochratoxin A in HepG-2 Cells.

    PubMed

    Zhang, Yu; Qi, Xiaozhe; Zheng, Juanjuan; Luo, YunBo; Huang, Kunlun; Xu, Wentao

    2016-01-01

    Ochratoxin A (OTA) is produced by fungi of the genera Aspergillus and Penicillium. OTA has displayed hepatotoxicity in mammals. Although recent studies have indicated that OTA influences liver function, little is known regarding its impact on differential early liver toxicity. In this study, we report high-throughput tag-sequencing (Tag-seq) analysis of the transcriptome using the Solexa Analyzer platform after 4 h of OTA treatment on HepG-2 cells. The analyses of differentially expressed genes revealed substantial changes. A total of 21,449 genes were identified and quantified, with 2726 displaying significantly altered expression levels. Expression level data were then integrated with a network of gene-gene interactions, and biological pathways to obtain a systems-level view of changes in the transcriptome that occur with OTA resistance. Our data suggest that OTA exposure leads to an imbalance in zinc finger expression and shed light on splicing factor and mitochondrial-based mechanisms. PMID:26377828

  11. Snorkel: an epitope tagging system for measuring the surface expression of membrane proteins.

    PubMed

    Brown, Michael; Stafford, Lewis J; Onisk, Dale; Joaquim, Tony; Tobb, Alhagie; Goldman, Larissa; Fancy, David; Stave, James; Chambers, Ross

    2013-01-01

    Tags are widely used to monitor a protein's expression level, interactions, protein trafficking, and localization. Membrane proteins are often tagged in their extracellular domains to allow discrimination between protein in the plasma membrane from that in internal pools. Multipass membrane proteins offer special challenges for inserting a tag since the extracellular regions are often composed of small loops and thus inserting an epitope tag risks perturbing the structure, function, or location of the membrane protein. We have developed a novel tagging system called snorkel where a transmembrane domain followed by a tag is appended to the cytoplasmic C-terminus of the membrane protein. In this way the tag is displayed extracellularly, but structurally separate from the membrane protein. We have tested the snorkel tag system on a diverse panel of membrane proteins including GPCRs and ion channels and demonstrated that it reliably allows for monitoring of the surface expression.

  12. Computational methods for the analysis of tag sequences in metagenomics studies.

    PubMed

    Chang, Qin; Luan, Yihui; Chen, Ting; Fuhrman, Jed A; Sun, Fengzhu

    2012-06-01

    Metagenomics commonly refers to the study of genetic materials directly derived from environments without culturing. Several ongoing large-scale metagenomics projects related to human and marine life, as well as pedology studies, have generated enormous amounts of data, posing a key challenge for efficient analysis, as we try to 1) understand microbial organism assemblage under different conditions, 2) compare different communities, and 3) understand how microbial organisms associate with each other and the environment. To address such questions, investigators are using new sequencing technologies, including Sanger, Illumina Solexa, and Roche 454, to sequence either particular genes, called tag sequences, mostly 16S or 18S ribosomal RNA sequences or other conserved genes, or whole metagenome shotgun sequences of all the genetic materials in a given community. In this paper, we review computational methods used for the analysis of tag sequences.

  13. Simple amino acid tags improve both expression and secretion of Candida antarctica lipase B in recombinant Escherichia coli.

    PubMed

    Kim, Sun-Ki; Park, Yong-Cheol; Lee, Hyung Ho; Jeon, Seung Taeg; Min, Won-Ki; Seo, Jin-Ho

    2015-02-01

    Escherichia coli is the best-established microbial host strain for production of proteins and chemicals, but has the weakness of not secreting high amounts of active heterologous proteins, whether of prokaryotic or eukaryotic origin, to the extracellular culture medium. In this study, Candida antarctica lipase B (CalB), a popular eukaryotic enzyme that catalyzes a number of biochemical reactions and is barely secreted extracellularly, was expressed functionally at a gram scale in culture medium by using a simple amino acid-tag system of E. coli. New fusion tag systems consisting of a pelB signal sequence and various anion amino acid tags facilitated both intracellular expression and extracellular secretion of CalB. Among them, the N-terminal five-aspartate tag changed the quaternary structure of the dimeric CalB and allowed production of 1.9 g/L active CalB with 65 U/mL activity in culture medium, which exhibited the same enzymatic properties as the commercial CalB. This PelB-anion amino acid tag-based expression system for CalB can be extended to production of other industrial proteins that are poorly expressed and exported from E. coli, thereby increasing target protein concentrations and minimizing purification steps. PMID:25182473

  14. Assembly of a gene sequence tag microarray by reversible biotin-streptavidin capture for transcript analysis of Arabidopsis thaliana

    PubMed Central

    Wirta, Valtteri; Holmberg, Anders; Lukacs, Morten; Nilsson, Peter; Hilson, Pierre; Uhlén, Mathias; Bhalerao, Rishikesh P; Lundeberg, Joakim

    2005-01-01

    Background Transcriptional profiling using microarrays has developed into a key molecular tool for the elucidation of gene function and gene regulation. Microarray platforms based on either oligonucleotides or purified amplification products have been utilised in parallel to produce large amounts of data. Irrespective of platform examined, the availability of genome sequence or a large number of representative expressed sequence tags (ESTs) is, however, a pre-requisite for the design and selection of specific and high-quality microarray probes. This is of great importance for organisms, such as Arabidopsis thaliana, with a high number of duplicated genes, as cross-hybridisation signals between evolutionary related genes cannot be distinguished from true signals unless the probes are carefully designed to be specific. Results We present an alternative solid-phase purification strategy suitable for efficient preparation of short, biotinylated and highly specific probes suitable for large-scale expression profiling. Twenty-one thousand Arabidopsis thaliana gene sequence tags were amplified and subsequently purified using the described technology. The use of the arrays is exemplified by analysis of gene expression changes caused by a four-hour indole-3-acetic acid (auxin) treatment. A total of 270 genes were identified as differentially expressed (120 up-regulated and 150 down-regulated), including several previously known auxin-affected genes, but also several previously uncharacterised genes. Conclusions The described solid-phase procedure can be used to prepare gene sequence tag microarrays based on short and specific amplified probes, facilitating the analysis of more than 21 000 Arabidopsis transcripts. PMID:15689241

  15. Massively parallel tag sequencing reveals the complexity of anaerobic marine protistan communities

    PubMed Central

    Stoeck, Thorsten; Behnke, Anke; Christen, Richard; Amaral-Zettler, Linda; Rodriguez-Mora, Maria J; Chistoserdov, Andrei; Orsi, William; Edgcomb, Virginia P

    2009-01-01

    Background Recent advances in sequencing strategies make possible unprecedented depth and scale of sampling for molecular detection of microbial diversity. Two major paradigm-shifting discoveries include the detection of bacterial diversity that is one to two orders of magnitude greater than previous estimates, and the discovery of an exciting 'rare biosphere' of molecular signatures ('species') of poorly understood ecological significance. We applied a high-throughput parallel tag sequencing (454 sequencing) protocol adopted for eukaryotes to investigate protistan community complexity in two contrasting anoxic marine ecosystems (Framvaren Fjord, Norway; Cariaco deep-sea basin, Venezuela). Both sampling sites have previously been scrutinized for protistan diversity by traditional clone library construction and Sanger sequencing. By comparing these clone library data with 454 amplicon library data, we assess the efficiency of high-throughput tag sequencing strategies. We here present a novel, highly conservative bioinformatic analysis pipeline for the processing of large tag sequence data sets. Results The analyses of ca. 250,000 sequence reads revealed that the number of detected Operational Taxonomic Units (OTUs) far exceeded previous richness estimates from the same sites based on clone libraries and Sanger sequencing. More than 90% of this diversity was represented by OTUs with less than 10 sequence tags. We detected a substantial number of taxonomic groups like Apusozoa, Chrysomerophytes, Centroheliozoa, Eustigmatophytes, hyphochytriomycetes, Ichthyosporea, Oikomonads, Phaeothamniophytes, and rhodophytes which remained undetected by previous clone library-based diversity surveys of the sampling sites. 
The most important innovations in our newly developed bioinformatics pipeline employ (i) BLASTN with query parameters adjusted for highly variable domains and a complete database of public ribosomal RNA (rRNA) gene sequences for taxonomic assignments of tags; (ii

  16. ChIA-PET tool for comprehensive chromatin interaction analysis with paired-end tag sequencing.

    PubMed

    Li, Guoliang; Fullwood, Melissa J; Xu, Han; Mulawadi, Fabianus Hendriyan; Velkov, Stoyan; Vega, Vinsensius; Ariyaratne, Pramila Nuwantha; Mohamed, Yusoff Bin; Ooi, Hong-Sain; Tennakoon, Chandana; Wei, Chia-Lin; Ruan, Yijun; Sung, Wing-Kin

    2010-01-01

    Chromatin interaction analysis with paired-end tag sequencing (ChIA-PET) is a new technology to study genome-wide long-range chromatin interactions bound by protein factors. Here we present ChIA-PET Tool, a software package for automatic processing of ChIA-PET sequence data, including linker filtering, mapping tags to reference genomes, identifying protein binding sites and chromatin interactions, and displaying the results on a graphical genome browser. ChIA-PET Tool is fast, accurate, comprehensive, user-friendly, and open source (available at http://chiapet.gis.a-star.edu.sg). PMID:20181287

  17. Expression and purification of recombinant proteins in Escherichia coli tagged with the metal-binding protein CusF.

    PubMed

    Cantu-Bustos, J Enrique; Vargas-Cortez, Teresa; Morones-Ramirez, Jose Ruben; Balderas-Renteria, Isaias; Galbraith, David W; McEvoy, Megan M; Zarate, Xristo

    2016-05-01

    Production of recombinant proteins in Escherichia coli has been improved considerably through the use of fusion proteins, because they increase protein solubility and facilitate purification via affinity chromatography. In this article, we propose the use of CusF as a new fusion partner for expression and purification of recombinant proteins in E. coli. Using a cell-free protein expression system, based on the E. coli S30 extract, Green Fluorescent Protein (GFP) was expressed with a series of different N-terminal tags, immobilized on self-assembled protein microarrays, and its fluorescence quantified. GFP tagged with CusF showed the highest fluorescence intensity, and this was greater than the intensities from corresponding GFP constructs that contained MBP or GST tags. Analysis of protein production in vivo showed that CusF produces large amounts of soluble protein with low levels of inclusion bodies. Furthermore, fusion proteins can be exported to the cellular periplasm, if CusF contains the signal sequence. Taking advantage of its ability to bind copper ions, recombinant proteins can be purified with readily available IMAC resins charged with this metal ion, producing pure proteins after purification and tag removal. We therefore recommend the use of CusF as a viable alternative to MBP or GST as a fusion protein/affinity tag for the production of soluble recombinant proteins in E. coli. PMID:26805756

  19. Highly sensitive targeted methylome sequencing by post-bisulfite adaptor tagging

    PubMed Central

    Miura, Fumihito; Ito, Takashi

    2015-01-01

    The current gold standard method for methylome analysis is whole-genome bisulfite sequencing (WGBS), but its cost is substantial, especially for the purpose of multi-sample comparison of large methylomes. Shotgun bisulfite sequencing of target-enriched DNA, or targeted methylome sequencing (TMS), can be a flexible, cost-effective alternative to WGBS. However, the current TMS protocol requires a considerable amount of input DNA and hence is hardly applicable to samples of limited quantity. Here we report a method to overcome this limitation by using post-bisulfite adaptor tagging (PBAT), in which adaptor tagging is conducted after bisulfite treatment to circumvent bisulfite-induced loss of intact sequencing templates, thereby enabling TMS of a 100-fold smaller amount of input DNA with far fewer cycles of polymerase chain reaction than in the current protocol. We thus expect that the PBAT-mediated TMS will serve as an invaluable method in epigenomics. PMID:25324297

  20. An SNR improvement of passive SAW tags with 5-bit Barker code sequence

    NASA Astrophysics Data System (ADS)

    Bae, Hyunchul; Kim, Jaekwon; Burm, Jinwook

    2012-07-01

    Passive surface acoustic wave (SAW) tags require a large signal-to-noise ratio (SNR) in order to increase the interrogation range. For the purpose of achieving high SNR for radio frequency identification (RFID) communication systems, Barker codes, a binary phase shift keying (BPSK) modulation technique, have been adopted in this study. Passive SAW RFID tags were designed with 5-bit Barker code sequences to generate BPSK modulated signals. Through the SNR analysis, the improvements in SNR were about 11 dB using Barker codes along with a correlator, which can be further improved by optimisation in the correlator.
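
    Why a Barker code helps a correlator can be shown with a short calculation. The sketch below uses the standard length-5 Barker sequence (+1, +1, +1, -1, +1) and computes its aperiodic autocorrelation: a peak equal to the code length with sidelobes of magnitude at most 1. The function names are hypothetical, and the ideal white-noise processing gain of 10·log10(N) is a textbook figure included only for orientation. It is not a reproduction of the roughly 11 dB improvement reported in the record, which depends on the authors' tag and correlator design.

```python
import math

# Standard 5-bit Barker code expressed as BPSK phases (+1 / -1).
BARKER_5 = [1, 1, 1, -1, 1]


def autocorrelation(code):
    """Aperiodic autocorrelation of `code` for lags 0 .. len(code) - 1."""
    n = len(code)
    return [sum(code[i] * code[i + lag] for i in range(n - lag))
            for lag in range(n)]


def ideal_processing_gain_db(code):
    """Matched-filter gain in white noise for an N-chip code: 10*log10(N)."""
    return 10 * math.log10(len(code))


acf = autocorrelation(BARKER_5)
peak = acf[0]
max_sidelobe = max(abs(v) for v in acf[1:])
gain_db = ideal_processing_gain_db(BARKER_5)
```

    The low sidelobes mean that a correlator matched to the code produces one sharp peak per tag response, which is what makes the reflected signal easier to pick out of the noise.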

  1. Efficient protein production method for NMR using soluble protein tags with cold shock expression vector.

    PubMed

    Hayashi, Kokoro; Kojima, Chojiro

    2010-11-01

    The E. coli protein expression system is one of the most useful methods employed for NMR sample preparation. However, the production of some recombinant proteins in E. coli is often hampered by difficulties such as low expression level and low solubility. To address these problems, a modified cold-shock expression system containing a glutathione S-transferase (GST) tag, the pCold-GST system, was investigated. The pCold-GST system successfully expressed 9 out of 10 proteins that otherwise could not be expressed using a conventional E. coli expression system. Here, we applied the pCold-GST system to 84 proteins and 78 proteins were successfully expressed in the soluble fraction. Three other cold-shock expression systems containing a maltose binding protein tag (pCold-MBP), protein G B1 domain tag (pCold-GB1) or thioredoxin tag (pCold-Trx) were also developed to improve the yield. Additionally, we show that a C-terminal proline tag, which is invisible in ¹H-¹⁵N HSQC spectra, inhibits protein degradation and increases the final yield of unstable proteins. The purified proteins were amenable to NMR analyses. These data suggest that pCold expression systems combined with soluble protein tags can be utilized to improve the expression and purification of various proteins for NMR analysis.

  2. Molecular mining of GGAA tagged transcripts and their expression in water buffalo Bubalus bubalis.

    PubMed

    Rawal, Leena; Ali, Safdar; Ali, Sher

    2012-01-15

    Repeat sequences are involved in regulation of gene expression both at the transcriptional and translational level. In the mammalian genomes, tri- and tetranucleotide repeats like ATA, AATA, GGAA and GAAA have been associated with diseases. In silico analysis of (GGAA)5 distribution across the species showed maximum number of this repeat in the mouse transcriptome compared to that in other species. Following this, we conducted minisatellite associated sequence amplification (MASA) to explore the buffalo's transcriptome using cDNA from different tissues and an oligo based on (GGAA)5 repeats. MASA uncovered twenty six mRNA transcripts showing homology to known genes in the database. qPCR studies showed the highest expression of twelve transcripts in the spleen. A transcript, pLRC107 with its partial sequence of 203 nucleotides showed sequence variation at several positions in spleen as compared to other four tissues examined. Transcript pLRC100 was found to represent the partial coding sequence of Bos taurus HECT {(homologous to E6-associated protein (UBE3A) carboxyl-terminus domain) and RCC1 (CHC1)-like domain (RLD) 1}, mRNA. We ascertained full length coding sequence of HECT gene and localized the same on buffalo chromosome 10 employing FISH. This gene was found to be conserved across the species. Another gene LRP8 uncovered in the process showed copy number variation between buffalo males (4-9) and females (34-54). The MASA approach enabled us to identify several genes in Bubalus bubalis without screening an entire cDNA library. The highest expression of 12 mRNA transcripts in spleen suggests their likely involvement with immuno transaction. A comprehensive knowledge of the repeat tagged transcriptomes is envisaged to help in understanding their significance in genome organization and evolution forming rich basis of functional and comparative genomics.

  3. Velocity measurement of clay intrusion through a sudden contraction step using a tagging pulse sequence.

    PubMed

    Tsushima, Shohji; Hasegawa, Atsushi; Suekane, Tetsuya; Hirai, Shuichiro; Tanaka, Yoshihiro; Nakasuji, Yoshizumi

    2003-07-01

    Magnetic resonance imaging (MRI) with a spatial tagging sequence was used to measure the velocity distribution of clay that was forced past a sudden contraction. A spatial tagging sequence provided magnetic resonance images of clay that allowed measurement of the velocity distribution in the clay, which can provide profound insights on the deformation process of clay during the intrusion process. The experiments were conducted using a specially-designed vessel that could operate at up to 30 MPa. The vessel offers a rectangular test section with a sudden contraction step that had a ratio of contraction of 2:1. The vessel was installed into a commercial magnetic resonance imaging system and then the fluid motion of clay flowing into the narrow contracted channel was quantitatively investigated to examine the behavior of flowing clay as a non-Newtonian fluid. MRI results are compared with those obtained by computational fluid dynamics (CFD) calculation. Velocity distributions obtained from each tag displacement did not agree well with those predicted by CFD results near the contraction step where the fluid accelerated rapidly. However, post-processing of the calculation results, in which virtual tag displacement is calculated, gave better agreement with experiment and enabled us to compare MRI results with CFD results. PMID:12915199

  4. An Entry/Gateway® cloning system for general expression of genes with molecular tags in Drosophila melanogaster

    PubMed Central

    Akbari, Omar S; Oliver, Daniel; Eyer, Katie; Pai, Chi-Yun

    2009-01-01

    Background Tagged fusion proteins are priceless tools for monitoring the activities of biomolecules in living cells. However, over-expression of fusion proteins sometimes leads to unwanted lethality or developmental defects. Therefore, vectors that can express tagged proteins at physiological levels are desirable tools for studying dosage-sensitive proteins. We developed a set of Entry/Gateway® vectors for expressing fluorescent fusion proteins in Drosophila melanogaster. The vectors were used to generate fluorescent CP190 which is a component of the gypsy chromatin insulator. We used the fluorescent CP190 to study the dynamic movement of related chromatin insulators in living cells. Results The Entry/Gateway® system is a timesaving technique for quickly generating expression constructs of tagged fusion proteins. We described in this study an Entry/Gateway® based system, which includes six P-element destination vectors (P-DEST) for expressing tagged proteins (eGFP, mRFP, or myc) in Drosophila melanogaster and a TA-based cloning vector for generating entry clones from unstable DNA sequences. We used the P-DEST vectors to express fluorescent CP190 at tolerable levels. Expression of CP190 using the UAS/Gal4 system, instead, led to either lethality or underdeveloped tissues. The expressed eGFP- or mRFP-tagged CP190 proteins are fully functional and rescued the lethality of the homozygous CP190 mutation. We visualized a wide range of CP190 distribution patterns in living cell nuclei, from thousands of tiny particles to less than ten giant ones, which likely reflects diverse organization of higher-order chromatin structures. We also visualized the fusion of multiple smaller insulator bodies into larger aggregates in living cells, which is likely reflective of the dynamic activities of reorganization of chromatin in living nuclei. Conclusion We have developed an efficient cloning system for expressing dosage-sensitive proteins in Drosophila melanogaster. This system

  5. Maltose-Binding Protein (MBP), a Secretion-Enhancing Tag for Mammalian Protein Expression Systems.

    PubMed

    Reuten, Raphael; Nikodemus, Denise; Oliveira, Maria B; Patel, Trushar R; Brachvogel, Bent; Breloy, Isabelle; Stetefeld, Jörg; Koch, Manuel

    2016-01-01

    Recombinant proteins are commonly expressed in eukaryotic expression systems to ensure the formation of disulfide bridges and proper glycosylation. Although many proteins can be expressed easily, some proteins, sub-domains, and mutant protein versions can cause problems. Here, we investigated expression levels of recombinant extracellular, intracellular as well as transmembrane proteins tethered to different polypeptides in mammalian cell lines. Strikingly, fusion of proteins to the prokaryotic maltose-binding protein (MBP) generally enhanced protein production. MBP fusion proteins consistently exhibited the most robust increase in protein production in comparison to commonly used tags, e.g., the Fc, Glutathione S-transferase (GST), SlyD, and serum albumin (ser alb) tag. Moreover, proteins tethered to MBP revealed reduced numbers of dying cells upon transient transfection. In contrast to the Fc tag, MBP is a stable monomer and does not promote protein aggregation. Therefore, the MBP tag does not induce artificial dimerization of tethered proteins and provides a beneficial fusion tag for binding as well as cell adhesion studies. Using MBP we were able to secrete a disease-causing laminin β2 mutant protein (congenital nephrotic syndrome), which is normally retained in the endoplasmic reticulum. In summary, this study establishes MBP as a versatile expression tag for protein production in eukaryotic expression systems. PMID:27029048

  6. Satellite-tagged transcribing sequences in Bubalus bubalis genome undergo programmed modulation in meiocytes: possible implications for transcriptional inactivation.

    PubMed

    Chattopadhyay, M; Gangadharan, S; Kapur, V; Azfer, M A; Prakash, B; Ali, S

    2001-09-01

    We cloned and sequenced a 1378 bp BamHI satellite DNA fraction from the water buffalo Bubalus bubalis and have studied its expression in different tissues. The GC-rich sequences of the resultant contig pDS5 crosshybridize only with bovid DNA and are not conserved evolutionarily. Typing of buffalo genomic DNA using pDS5 with several restriction enzymes revealed multilocus monomorphic bands. Similar typing of cattle, buffalo, goat, sheep, and gaur genomic DNA revealed variations in copy number and allele length giving rise to species-specific band patterns. Expression study of pDS5 in bubaline samples by RNA slot-blot, Northern blot, and RT-PCR showed various levels of signal in all the somatic tissues and germline cells except heart. A GenBank database search revealed homology of pDS5 sequences in the 5' region from nt 1-1261 with collagen gene. An AluI typing analysis of DNA from bubaline semen samples showed consistent loss of two bands. The presence of corresponding bands in somatic tissues suggests a sequence modulation within the pDS5 array in meiocytes during spermatogenesis, which is restored in the somatic cells after fertilization. Modulation of the satellite-tagged transcribing sequence in the meiocytes may be a mechanism of its inactivation.

  7. RAD tag sequencing as a source of SNP markers in Cynara cardunculus L

    PubMed Central

    2012-01-01

    Background The globe artichoke (Cynara cardunculus L. var. scolymus) genome is relatively poorly explored, especially compared to those of the other major Asteraceae crops sunflower and lettuce. No SNP markers are in the public domain. We have combined the recently developed restriction-site associated DNA (RAD) approach with the Illumina DNA sequencing platform to effect the rapid and mass discovery of SNP markers for C. cardunculus. Results RAD tags were sequenced from the genomic DNA of three C. cardunculus mapping population parents, generating 9.7 million reads, corresponding to ~1 Gbp of sequence. An assembly based on paired ends produced ~6.0 Mbp of genomic sequence, separated into ~19,000 contigs (mean length 312 bp), of which ~21% were fragments of putative coding sequence. The shared sequences allowed for the discovery of ~34,000 SNPs and nearly 800 indels, equivalent to a SNP frequency of 5.6 per 1,000 nt, and an indel frequency of 0.2 per 1,000 nt. A sample of heterozygous SNP loci was mapped by CAPS assays and this exercise provided validation of our mining criteria. The repetitive fraction of the genome had a high representation of retrotransposon sequence, followed by simple repeats, AT-low complexity regions and mobile DNA elements. The genomic k-mer distribution and CpG rate of C. cardunculus, compared with data derived from three whole genome-sequenced dicot species, provided further evidence of the random representation of the C. cardunculus genome generated by RAD sampling. Conclusion The RAD tag sequencing approach is a cost-effective and rapid method to develop SNP markers in a highly heterozygous species. Our approach permitted the generation of a large and robust SNP dataset by the adoption of optimized filtering criteria. PMID:22214349

  8. A Chimeric Affinity Tag for Efficient Expression and Chromatographic Purification of Heterologous Proteins from Plants.

    PubMed

    Sainsbury, Frank; Jutras, Philippe V; Vorster, Juan; Goulet, Marie-Claire; Michaud, Dominique

    2016-01-01

    The use of plants as expression hosts for recombinant proteins is an increasingly attractive option for the production of complex and challenging biopharmaceuticals. Tools are needed at present to marry recent developments in high-yielding gene vectors for heterologous expression with routine protein purification techniques. In this study, we designed the Cysta-tag, a new purification tag for immobilized metal affinity chromatography (IMAC) of plant-made proteins based on the protein-stabilizing fusion partner SlCYS8. We show that the Cysta-tag may be used to readily purify proteins under native conditions, and then be removed enzymatically to isolate the protein of interest. We also show that commonly used protease recognition sites for linking purification tags are differentially stable in leaves of the commonly used expression host Nicotiana benthamiana, with those linkers susceptible to cysteine proteases being less stable than serine protease-cleavable linkers. As an example, we describe a Cysta-tag experimental scheme for the one-step purification of a clinically useful protein, human α1-antitrypsin, transiently expressed in N. benthamiana. With potential applicability to the variety of chromatography formats commercially available for IMAC-based protein purification, the Cysta-tag provides a convenient means for the efficient and cost-effective purification of recombinant proteins from plant tissues. PMID:26913045

  9. Modified PCR methods for 3' end amplification from serial analysis of gene expression (SAGE) tags.

    PubMed

    Xu, Wang-Jie; Wang, Zhao-Xia; Qiao, Zhong-Dong

    2009-05-01

    Serial analysis of gene expression (SAGE) is a powerful technique to study gene expression at the genome level. However, the shortness of SAGE tags prevents further study of SAGE library data, thus limiting extensive application of the SAGE method in gene expression studies. This problem can be solved by extension of the SAGE tags to 3' cDNAs. Therefore, several methods based on PCR have been developed to generate a longer 3' cDNA fragment corresponding to a SAGE tag. The list of modified methods is extensive, and includes rapid RT-PCR analysis of unknown SAGE tags (RAST-PCR), generation of longer cDNA fragments from SAGE tags for gene identification (GLGI), a high-throughput GLGI procedure, reverse SAGE (rSAGE), two-step analysis of unknown SAGE tags (TSAT-PCR), etc. These procedures are constantly being updated because they have characteristics and advantages that can be shared. Development of these methods has promoted the widespread use of the SAGE technique, and has accelerated the speed of studies of large-scale gene expression.

  10. A Chimeric Affinity Tag for Efficient Expression and Chromatographic Purification of Heterologous Proteins from Plants

    PubMed Central

    Sainsbury, Frank; Jutras, Philippe V.; Vorster, Juan; Goulet, Marie-Claire; Michaud, Dominique

    2016-01-01

    The use of plants as expression hosts for recombinant proteins is an increasingly attractive option for the production of complex and challenging biopharmaceuticals. Tools are needed at present to marry recent developments in high-yielding gene vectors for heterologous expression with routine protein purification techniques. In this study, we designed the Cysta-tag, a new purification tag for immobilized metal affinity chromatography (IMAC) of plant-made proteins based on the protein-stabilizing fusion partner SlCYS8. We show that the Cysta-tag may be used to readily purify proteins under native conditions, and then be removed enzymatically to isolate the protein of interest. We also show that commonly used protease recognition sites for linking purification tags are differentially stable in leaves of the commonly used expression host Nicotiana benthamiana, with those linkers susceptible to cysteine proteases being less stable than serine protease-cleavable linkers. As an example, we describe a Cysta-tag experimental scheme for the one-step purification of a clinically useful protein, human α1-antitrypsin, transiently expressed in N. benthamiana. With potential applicability to the variety of chromatography formats commercially available for IMAC-based protein purification, the Cysta-tag provides a convenient means for the efficient and cost-effective purification of recombinant proteins from plant tissues. PMID:26913045

  11. A physical map of the X chromosome of Drosophila melanogaster: Cosmid contigs and sequence tagged sites

    SciTech Connect

    Madueno, E.; Modolell, J.; Papagiannakis, G.

    1995-04-01

    A physical map of the euchromatic X chromosome of Drosophila melanogaster has been constructed by assembling contiguous arrays of cosmids that were selected by screening a library with DNA isolated from microamplified chromosomal divisions. This map, consisting of 893 cosmids, covers ~64% of the euchromatic part of the chromosome. In addition, 568 sequence tagged sites (STS), in aggregate representing 120 kb of sequenced DNA, were derived from selected cosmids. Most of these STSs, spaced at an average distance of ~35 kb along the euchromatic region of the chromosome, represent DNA tags that can be used as entry points to the fruitfly genome. Furthermore, 42 genes have been placed on the physical map, either through the hybridization of specific probes to the cosmids or through the fact that they were represented among the STSs. These provide a link between the physical and the genetic maps of D. melanogaster. Nine novel genes have been tentatively identified in Drosophila on the basis of matches between STS sequences and sequences from other species. 32 refs., 3 figs., 4 tabs.

  12. Differential gene expression in the siphonophore Nanomia bijuga (Cnidaria) assessed with multiple next-generation sequencing workflows.

    PubMed

    Siebert, Stefan; Robinson, Mark D; Tintori, Sophia C; Goetz, Freya; Helm, Rebecca R; Smith, Stephen A; Shaner, Nathan; Haddock, Steven H D; Dunn, Casey W

    2011-01-01

    We investigated differential gene expression between functionally specialized feeding polyps and swimming medusae in the siphonophore Nanomia bijuga (Cnidaria) with a hybrid long-read/short-read sequencing strategy. We assembled a set of partial gene reference sequences from long-read data (Roche 454), and generated short-read sequences from replicated tissue samples that were mapped to the references to quantify expression. We collected and compared expression data with three short-read expression workflows that differ in sample preparation, sequencing technology, and mapping tools. These workflows were Illumina mRNA-Seq, which generates sequence reads from random locations along each transcript, and two tag-based approaches, SOLiD SAGE and Helicos DGE, which generate reads from particular tag sites. Differences in expression results across workflows were mostly due to the differential impact of missing data in the partial reference sequences. When all 454-derived gene reference sequences were considered, Illumina mRNA-Seq detected more than twice as many differentially expressed (DE) reference sequences as the tag-based workflows. This discrepancy was largely due to missing tag sites in the partial reference that led to false negatives in the tag-based workflows. When only the subset of reference sequences that unambiguously have tag sites was considered, we found broad congruence across workflows, and they all identified a similar set of DE sequences. Our results are promising in several regards for gene expression studies in non-model organisms. First, we demonstrate that a hybrid long-read/short-read sequencing strategy is an effective way to collect gene expression data when an annotated genome sequence is not available. Second, our replicated sampling indicates that expression profiles are highly consistent across field-collected animals in this case. Third, the impacts of partial reference sequences on the ability to detect DE can be mitigated through

  13. De novo sequencing of unique sequence tags for discovery of post-translational modifications of proteins

    SciTech Connect

    Shen, Yufeng; Tolic, Nikola; Hixson, Kim K.; Purvine, Samuel O.; Anderson, Gordon A.; Smith, Richard D.

    2008-10-15

    De novo sequencing holds promise for discovering protein post-translational modifications; however, the approach is still in its infancy and is not widely applied in proteomics practice because of its limited reliability. In this work, we describe a de novo sequencing approach for discovery of protein modifications through identification of the UStags (Anal. Chem. 2008, 80, 1871-1882). The de novo information was obtained from Fourier-transform tandem mass spectrometry for peptides and polypeptides in a yeast lysate, and the de novo sequences obtained were filtered to define a more limited set of UStags. The DNA-predicted database protein sequences were then compared to the UStags, and the differences observed across or in the UStags (i.e., the UStags’ prefix and suffix sequences and the UStags themselves) were used to infer the possible sequence modifications. With this de novo-UStag approach, we uncovered some unexpected variances of yeast protein sequences due to amino acid mutations and/or multiple modifications to the predicted protein sequences. Random matching of the de novo sequences to the predicted sequences was examined with the use of two random (false) databases, and ~3% false discovery rates were estimated for the de novo-UStag approach. The factors affecting the reliability (e.g., existence of de novo sequencing noise residues and redundant sequences) and the sensitivity are described. The de novo-UStag approach complements the UStag method previously reported by enabling discovery of new protein modifications.

  14. Primer and platform effects on 16S rRNA tag sequencing

    DOE PAGESBeta

    Tremblay, Julien; Singh, Kanwar; Fern, Alison; Kirton, Edward S.; He, Shaomei; Woyke, Tanja; Lee, Janey; Chen, Feng; Dangl, Jeffery L.; Tringe, Susannah G.

    2015-08-04

    Sequencing of 16S rRNA gene tags is a popular method for profiling and comparing microbial communities. The protocols and methods used, however, vary considerably with regard to amplification primers, sequencing primers, sequencing technologies; as well as quality filtering and clustering. How results are affected by these choices, and whether data produced with different protocols can be meaningfully compared, is often unknown. Here we compare results obtained using three different amplification primer sets (targeting V4, V6–V8, and V7–V8) and two sequencing technologies (454 pyrosequencing and Illumina MiSeq) using DNA from a mock community containing a known number of species as well as complex environmental samples whose PCR-independent profiles were estimated using shotgun sequencing. We find that paired-end MiSeq reads produce higher quality data and enabled the use of more aggressive quality control parameters over 454, resulting in a higher retention rate of high quality reads for downstream data analysis. While primer choice considerably influences quantitative abundance estimations, sequencing platform has relatively minor effects when matched primers are used. In conclusion, beta diversity metrics are surprisingly robust to both primer and sequencing platform biases.

  15. Primer and platform effects on 16S rRNA tag sequencing

    SciTech Connect

    Tremblay, Julien; Singh, Kanwar; Fern, Alison; Kirton, Edward S.; He, Shaomei; Woyke, Tanja; Lee, Janey; Chen, Feng; Dangl, Jeffery L.; Tringe, Susannah G.

    2015-08-04

    Sequencing of 16S rRNA gene tags is a popular method for profiling and comparing microbial communities. The protocols and methods used, however, vary considerably with regard to amplification primers, sequencing primers, sequencing technologies; as well as quality filtering and clustering. How results are affected by these choices, and whether data produced with different protocols can be meaningfully compared, is often unknown. Here we compare results obtained using three different amplification primer sets (targeting V4, V6–V8, and V7–V8) and two sequencing technologies (454 pyrosequencing and Illumina MiSeq) using DNA from a mock community containing a known number of species as well as complex environmental samples whose PCR-independent profiles were estimated using shotgun sequencing. We find that paired-end MiSeq reads produce higher quality data and enabled the use of more aggressive quality control parameters over 454, resulting in a higher retention rate of high quality reads for downstream data analysis. While primer choice considerably influences quantitative abundance estimations, sequencing platform has relatively minor effects when matched primers are used. In conclusion, beta diversity metrics are surprisingly robust to both primer and sequencing platform biases.

  16. Myocardial tagging by cardiovascular magnetic resonance: evolution of techniques--pulse sequences, analysis algorithms, and applications.

    PubMed

    Ibrahim, El-Sayed H

    2011-01-01

    Cardiovascular magnetic resonance (CMR) tagging has been established as an essential technique for measuring regional myocardial function. It allows quantification of local intramyocardial motion measures, e.g. strain and strain rate. The invention of CMR tagging came in the late eighties, when the technique allowed, for the first time, visualization of transmural myocardial movement without having to implant physical markers. This new idea opened the door for a series of developments and improvements that continue up to the present time. Different tagging techniques are currently available that are more extensive, improved, and sophisticated than they were twenty years ago. Each of these techniques has different versions for improved resolution, signal-to-noise ratio (SNR), scan time, anatomical coverage, three-dimensional capability, and image quality. The tagging techniques covered in this article can be broadly divided into two main categories: 1) Basic techniques, which include magnetization saturation, spatial modulation of magnetization (SPAMM), delay alternating with nutations for tailored excitation (DANTE), and complementary SPAMM (CSPAMM); and 2) Advanced techniques, which include harmonic phase (HARP), displacement encoding with stimulated echoes (DENSE), and strain encoding (SENC). Although most of these techniques were developed by separate groups and evolved from different backgrounds, they are in fact closely related to each other, and they can be interpreted from more than one perspective. Some of these techniques even followed parallel paths of developments, as illustrated in the article. As each technique has its own advantages, some efforts have been made to combine different techniques together for improved image quality or composite information acquisition. In this review, different developments in pulse sequences and related image processing techniques are described along with the necessities that led to their invention, which makes this

  17. Myocardial tagging by Cardiovascular Magnetic Resonance: evolution of techniques--pulse sequences, analysis algorithms, and applications

    PubMed Central

    2011-01-01

    Cardiovascular magnetic resonance (CMR) tagging has been established as an essential technique for measuring regional myocardial function. It allows quantification of local intramyocardial motion measures, e.g. strain and strain rate. The invention of CMR tagging came in the late eighties, when the technique allowed, for the first time, visualization of transmural myocardial movement without having to implant physical markers. This new idea opened the door for a series of developments and improvements that continue up to the present time. Different tagging techniques are currently available that are more extensive, improved, and sophisticated than they were twenty years ago. Each of these techniques has different versions for improved resolution, signal-to-noise ratio (SNR), scan time, anatomical coverage, three-dimensional capability, and image quality. The tagging techniques covered in this article can be broadly divided into two main categories: 1) Basic techniques, which include magnetization saturation, spatial modulation of magnetization (SPAMM), delay alternating with nutations for tailored excitation (DANTE), and complementary SPAMM (CSPAMM); and 2) Advanced techniques, which include harmonic phase (HARP), displacement encoding with stimulated echoes (DENSE), and strain encoding (SENC). Although most of these techniques were developed by separate groups and evolved from different backgrounds, they are in fact closely related to each other, and they can be interpreted from more than one perspective. Some of these techniques even followed parallel paths of developments, as illustrated in the article. As each technique has its own advantages, some efforts have been made to combine different techniques together for improved image quality or composite information acquisition. In this review, different developments in pulse sequences and related image processing techniques are described along with the necessities that led to their invention, which makes this

  18. New sequence-tagged site molecular markers for identification of sex in Distichlis spicata.

    PubMed

    Eppley, Sarah M; O'Quinn, Robin; Brown, Anna L

    2009-09-01

    Sex-linked molecular markers have become valuable tools for understanding sex ratio evolution and sex-specific physiology in pre-reproductive plants. To develop new accurate methods for sexing Distichlis spicata juveniles and nonflowering individuals, we converted a random amplified polymorphic DNA-polymerase chain reaction marker that co-segregated with the female phenotype into a set of sequence-tagged site markers. We tested the marker pair on known males and females from populations in Oregon and California. A single band was obtained for all female samples but never for males.

  19. Real-time single-molecule electronic DNA sequencing by synthesis using polymer-tagged nucleotides on a nanopore array.

    PubMed

    Fuller, Carl W; Kumar, Shiv; Porel, Mintu; Chien, Minchen; Bibillo, Arek; Stranges, P Benjamin; Dorwart, Michael; Tao, Chuanjuan; Li, Zengmin; Guo, Wenjing; Shi, Shundi; Korenblum, Daniel; Trans, Andrew; Aguirre, Anne; Liu, Edward; Harada, Eric T; Pollard, James; Bhat, Ashwini; Cech, Cynthia; Yang, Alexander; Arnold, Cleoma; Palla, Mirkó; Hovis, Jennifer; Chen, Roger; Morozova, Irina; Kalachikov, Sergey; Russo, James J; Kasianowicz, John J; Davis, Randy; Roever, Stefan; Church, George M; Ju, Jingyue

    2016-05-10

    DNA sequencing by synthesis (SBS) offers a robust platform to decipher nucleic acid sequences. Recently, we reported a single-molecule nanopore-based SBS strategy that accurately distinguishes four bases by electronically detecting and differentiating four different polymer tags attached to the 5'-phosphate of the nucleotides during their incorporation into a growing DNA strand catalyzed by DNA polymerase. Further developing this approach, we report here the use of nucleotides tagged at the terminal phosphate with oligonucleotide-based polymers to perform nanopore SBS on an α-hemolysin nanopore array platform. We designed and synthesized several polymer-tagged nucleotides using tags that produce different electrical current blockade levels and verified they are active substrates for DNA polymerase. A highly processive DNA polymerase was conjugated to the nanopore, and the conjugates were complexed with primer/template DNA and inserted into lipid bilayers over individually addressable electrodes of the nanopore chip. When an incoming complementary-tagged nucleotide forms a tight ternary complex with the primer/template and polymerase, the tag enters the pore, and the current blockade level is measured. The levels displayed by the four nucleotides tagged with four different polymers captured in the nanopore in such ternary complexes were clearly distinguishable and sequence-specific, enabling continuous sequence determination during the polymerase reaction. Thus, real-time single-molecule electronic DNA sequencing data with single-base resolution were obtained. The use of these polymer-tagged nucleotides, combined with polymerase tethering to nanopores and multiplexed nanopore sensors, should lead to new high-throughput sequencing methods. PMID:27091962

  20. Real-time single-molecule electronic DNA sequencing by synthesis using polymer-tagged nucleotides on a nanopore array

    PubMed Central

    Fuller, Carl W.; Kumar, Shiv; Porel, Mintu; Chien, Minchen; Bibillo, Arek; Stranges, P. Benjamin; Dorwart, Michael; Tao, Chuanjuan; Li, Zengmin; Guo, Wenjing; Shi, Shundi; Korenblum, Daniel; Trans, Andrew; Aguirre, Anne; Liu, Edward; Harada, Eric T.; Pollard, James; Bhat, Ashwini; Cech, Cynthia; Yang, Alexander; Arnold, Cleoma; Palla, Mirkó; Hovis, Jennifer; Chen, Roger; Morozova, Irina; Kalachikov, Sergey; Russo, James J.; Kasianowicz, John J.; Davis, Randy; Roever, Stefan; Church, George M.; Ju, Jingyue

    2016-01-01

    DNA sequencing by synthesis (SBS) offers a robust platform to decipher nucleic acid sequences. Recently, we reported a single-molecule nanopore-based SBS strategy that accurately distinguishes four bases by electronically detecting and differentiating four different polymer tags attached to the 5′-phosphate of the nucleotides during their incorporation into a growing DNA strand catalyzed by DNA polymerase. Further developing this approach, we report here the use of nucleotides tagged at the terminal phosphate with oligonucleotide-based polymers to perform nanopore SBS on an α-hemolysin nanopore array platform. We designed and synthesized several polymer-tagged nucleotides using tags that produce different electrical current blockade levels and verified they are active substrates for DNA polymerase. A highly processive DNA polymerase was conjugated to the nanopore, and the conjugates were complexed with primer/template DNA and inserted into lipid bilayers over individually addressable electrodes of the nanopore chip. When an incoming complementary-tagged nucleotide forms a tight ternary complex with the primer/template and polymerase, the tag enters the pore, and the current blockade level is measured. The levels displayed by the four nucleotides tagged with four different polymers captured in the nanopore in such ternary complexes were clearly distinguishable and sequence-specific, enabling continuous sequence determination during the polymerase reaction. Thus, real-time single-molecule electronic DNA sequencing data with single-base resolution were obtained. The use of these polymer-tagged nucleotides, combined with polymerase tethering to nanopores and multiplexed nanopore sensors, should lead to new high-throughput sequencing methods. PMID:27091962

  1. Evaluation of Affinity-Tagged Protein Expression Strategies using Local and Global Isotope Ratio Measurements

    SciTech Connect

    Hervey, IV, William Judson; Khalsa-Moyers, Gurusahai K; Lankford, Patricia K; Owens, Elizabeth T; McKeown, Catherine K; Lu, Tse-Yuan S; Foote, Linda J; Morrell-Falvey, Jennifer L; McDonald, W Hayes; Pelletier, Dale A; Hurst, Gregory B

    2009-01-01

    Protein enrichments of engineered, affinity-tagged (or "bait") fusion proteins with interaction partners are often laden with background, non-specific proteins, due to interactions that occur in vitro as an artifact of the technique. Furthermore, the in vivo expression of the bait protein may itself affect physiology or metabolism. In this study, intrinsic affinity purification challenges were investigated in a model protein complex, DNA-dependent RNA polymerase (RNAP), encompassing chromosome- and plasmid-encoding strategies for bait proteins in two different microbial species: Escherichia coli and Rhodopseudomonas palustris. Isotope ratio measurements of bait protein expression strains relative to native, wild-type strains were performed by liquid chromatography tandem mass spectrometry (LC-MS-MS) to assess bait protein expression strategies in each species. Authentic interacting proteins of RNAP were successfully discerned from artifactual co-isolating proteins by the isotopic differentiation of interactions as random or targeted (I-DIRT) method (A. J. Tackett et al. J. Proteome Res. 2005, 4 (5), 1752-1756). To investigate broader effects of bait protein production in the bacteria, we compared proteomes from strains harboring a plasmid that encodes an affinity-tagged subunit (RpoA) of the RNAP complex with the corresponding wild-type strains using stable isotope metabolic labeling. The ratio of RpoA abundance in plasmid strains versus wild type was 0.8 for R. palustris and 1.7 for E. coli. While most other proteins showed no appreciable difference, proteins significantly increased in abundance in plasmid-encoded bait-expressing strains of both species included the plasmid-encoded antibiotic resistance protein, GenR, and proteins involved in amino acid biosynthesis. Together, these local, complex-specific and more global, whole proteome isotopic abundance ratio measurements provided a tool for evaluating both in vivo and in vitro effects of plasmid

  2. De novo sequencing of unique sequence tags for discovery of post-translational modifications of proteins.

    PubMed

    Shen, Yufeng; Tolić, Nikola; Hixson, Kim K; Purvine, Samuel O; Anderson, Gordon A; Smith, Richard D

    2008-10-15

    De novo sequencing is a spectrum analysis approach for mass spectrometry data to discover post-translational modifications in proteins; however, such an approach is still in its infancy and is still not widely applied to proteomic practices due to its limited reliability. In this work, we describe a de novo sequencing approach for the discovery of protein modifications based on identification of the proteome UStags (Shen, Y.; Tolić, N.; Hixson, K. K.; Purvine, S. O.; Pasa-Tolić, L.; Qian, W. J.; Adkins, J. N.; Moore, R. J.; Smith, R. D. Anal. Chem. 2008, 80, 1871-1882). The de novo information was obtained from Fourier-transform tandem mass spectrometry data for peptides and polypeptides from a yeast lysate, and the de novo sequences obtained were selected based on filter levels designed to provide a limited yet high quality subset of UStags. The DNA-predicted database protein sequences were then compared to the UStags, and the differences observed across or in the UStags (i.e., the UStags' prefix and suffix sequences and the UStags themselves) were used to infer possible sequence modifications. With this de novo-UStag approach, we uncovered some unexpected variances within several yeast protein sequences due to amino acid mutations and/or multiple modifications to the predicted protein sequences. To determine false discovery rates, two random (false) databases were independently used for sequence matching, and ~3% false discovery rates were estimated for the de novo-UStag approach. The factors affecting the reliability (e.g., existence of de novo sequencing noise residues and redundant sequences) and the sensitivity of the approach were investigated and described. The combined de novo-UStag approach complements the UStag method previously reported by enabling the discovery of new protein modifications. PMID:18783246
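
    As a rough illustration of the random-database idea mentioned in the abstract above, the sketch below estimates a false discovery rate as the mean number of matches against random (false) databases divided by the number of matches against the real database. The function name, the averaging over decoy databases, and the example counts are all assumptions made for illustration; they are not the authors' procedure or data.

```python
def estimate_fdr(target_matches, decoy_matches):
    """Estimate a false discovery rate from decoy-database matching.

    target_matches: number of sequence matches against the real database.
    decoy_matches:  list of match counts, one per random (false) database.
    Returns mean(decoy_matches) / target_matches as a fraction.
    """
    if target_matches <= 0:
        raise ValueError("target_matches must be positive")
    if not decoy_matches:
        raise ValueError("at least one decoy count is required")
    mean_decoy = sum(decoy_matches) / len(decoy_matches)
    return mean_decoy / target_matches


# Hypothetical counts chosen only to illustrate the calculation.
fdr = estimate_fdr(target_matches=1000, decoy_matches=[28, 32])
print(f"Estimated FDR: {fdr:.1%}")
```

    Averaging over two decoy databases is one simple design choice; reporting each decoy's rate separately would be an equally reasonable alternative.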

  3. A protein tagging system for signal amplification in gene expression and fluorescence imaging

    PubMed Central

    Tanenbaum, Marvin E.; Gilbert, Luke A.; Qi, Lei S.; Weissman, Jonathan S.; Vale, Ronald D.

    2014-01-01

    Signals in many biological processes can be amplified by recruiting multiple copies of regulatory proteins to a site of action. Harnessing this principle, we have developed a novel protein scaffold, a repeating peptide array termed SunTag, which can recruit multiple copies of an antibody-fusion protein. We show that the SunTag can recruit up to 24 copies of GFP, thereby enabling long-term imaging of single protein molecules in living cells. We also use the SunTag to create a potent synthetic transcription factor by recruiting multiple copies of a transcriptional activation domain to a nuclease-deficient CRISPR/Cas9 protein and demonstrate strong activation of endogenous gene expression and re-engineered cell behavior with this system. Thus, the SunTag provides a versatile platform for multimerizing proteins on a target protein scaffold and is likely to have many applications in imaging and in controlling biological outputs. PMID:25307933

  4. Identification of sequence tagged sites in the Asian and African elephant.

    PubMed

    Burk, N E; Messer, L A; Ernst, C W; Rothschild, M F

    1998-01-01

    To date, gene identification in elephants has essentially related to evolutionary studies. Further identification of genes in elephants could provide additional information for evolutionary studies and for evaluating genetic diversity in existing elephant populations. The objective of this project was to identify sequence tagged sites (STSs) in the Asian and the African elephant for the following genes: melatonin receptor 1a (MTNR1A), retinoic acid receptor beta (RARB), and leptin receptor (LEPR). These genes are highly conserved among mammals, and all may play a role in reproduction. Heterologous primers for PCR were designed from sequences available in other species. Fragments of size 141 base pairs (bp) for RARB and 327 bp for LEPR were obtained by amplifying genomic Asian and African elephant DNA. The LEPR fragment included an intron of 164 bp. Also, a 417 bp fragment for MTNR1A was obtained in the Asian elephant only. All PCR products were sequenced and comparison computations were made at the nucleotide and amino acid levels to sequences available in the GenBank database. Nucleotide sequence for RARB was identical for both Asian and African elephants and differed by only 3 bp for LEPR. Deduced amino acid sequence was identical for both STSs in both species. Elephants were relatively similar in comparison to other mammals and less similar to chickens.

  5. Moving Away from the Reference Genome: Evaluating a Peptide Sequencing Tagging Approach for Single Amino Acid Polymorphism Identifications in the Genus Populus

    SciTech Connect

    Abraham, Paul E; Adams, Rachel M; Tuskan, Gerald A; Hettich, Robert L

    2013-01-01

    The genetic diversity across natural populations of the model organism, Populus, is extensive, containing a single nucleotide polymorphism roughly every 200 base pairs. When deviations from the reference genome occur in coding regions, they can impact protein sequences. Rather than relying on a static reference database to profile protein expression, we employed a peptide sequence tagging (PST) approach capable of decoding the plasticity of the Populus proteome. Using shotgun proteomics data from two genotypes of P. trichocarpa, a tag-based approach enabled the detection of 6,653 unexpected sequence variants. Through manual validation, our study investigated how the most abundant chemical modification (methionine oxidation) could masquerade as a sequence variant (Ala→Ser) when few site-determining ions existed. In fact, precise localization of an oxidation site for peptides with more than one potential placement was indeterminate for 70% of the MS/MS spectra. We demonstrate that additional fragment ions made available by high energy collisional dissociation enhance the robustness of the peptide sequence tagging approach (81% of oxidation events could be exclusively localized to a methionine). We are confident that augmenting fragmentation processes for a PST approach will further improve the identification of single amino acid polymorphism in Populus and potentially other species as well.
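    The ambiguity between methionine oxidation and an alanine-to-serine substitution can be made concrete with a short mass calculation. This is a minimal sketch using standard monoisotopic residue masses; the two-residue fragments are hypothetical and serve only to show that both explanations give nearly the same total mass, so only fragment ions that separate the two positions can distinguish them.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {"A": 71.03711, "S": 87.03203, "M": 131.04049}
OXIDATION = 15.99491  # mass added by one oxygen atom


def residue_mass(seq: str, oxidized_met: int = 0) -> float:
    """Sum residue masses, adding one oxygen per oxidized methionine."""
    return sum(RESIDUE_MASS[aa] for aa in seq) + oxidized_met * OXIDATION


# Hypothetical fragment "AM" with oxidized Met versus the variant "SM".
mass_oxidized = residue_mass("AM", oxidized_met=1)
mass_variant = residue_mass("SM")

# Mass change caused by swapping Ala for Ser.
ala_to_ser_shift = RESIDUE_MASS["S"] - RESIDUE_MASS["A"]

print(f"Ala->Ser shift: {ala_to_ser_shift:.5f} Da")
print(f"Oxidation:      {OXIDATION:.5f} Da")
print(f"Difference between the two explanations: "
      f"{abs(mass_oxidized - mass_variant):.5f} Da")
```

    Because serine differs from alanine by exactly one oxygen atom, the two interpretations are isobaric at the precursor level, which is consistent with the need for site-determining fragment ions noted in the abstract.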

  6. Generation and Analysis of End Sequence Database for T-DNA Tagging Lines in Rice

    PubMed Central

    An, Suyoung; Park, Sunhee; Jeong, Dong-Hoon; Lee, Dong-Yeon; Kang, Hong-Gyu; Yu, Jung-Hwa; Hur, Junghe; Kim, Sung-Ryul; Kim, Young-Hea; Lee, Miok; Han, Soonki; Kim, Soo-Jin; Yang, Jungwon; Kim, Eunjoo; Wi, Soo Jin; Chung, Hoo Sun; Hong, Jong-Pil; Choe, Vitnary; Lee, Hak-Kyung; Choi, Jung-Hee; Nam, Jongmin; Kim, Seong-Ryong; Park, Phun-Bum; Park, Ky Young; Kim, Woo Taek; Choe, Sunghwa; Lee, Chin-Bum; An, Gynheung

    2003-01-01

    We analyzed 6,749 lines tagged by the gene trap vector pGA2707. This resulted in the isolation of 3,793 genomic sequences flanking the T-DNA. Among the insertions, 1,846 T-DNAs were integrated into genic regions, and 1,864 were located in intergenic regions. Frequencies were also higher at the beginning and end of the coding regions and upstream near the ATG start codon. The overall GC content at the insertion sites was close to that measured from the entire rice (Oryza sativa) genome. Functional classification of these 1,846 tagged genes showed a distribution similar to that observed for all the genes in the rice chromosomes. This indicates that T-DNA insertion is not biased toward a particular class of genes. There were 764, 327, and 346 T-DNA insertions in chromosomes 1, 4 and 10, respectively. Insertions were not evenly distributed; frequencies were higher at the ends of the chromosomes and lower near the centromere. At certain sites, the frequency was higher than in the surrounding regions. This sequence database will be valuable in identifying knockout mutants for elucidating gene function in rice. This resource is available to the scientific community at http://www.postech.ac.kr/life/pfg/risd. PMID:14630961

  7. Recombinant enterokinase light chain with affinity tag: expression from Saccharomyces cerevisiae and its utilities in fusion protein technology.

    PubMed

    Choi, S I; Song, H W; Moon, J W; Seong, B L

    2001-12-20

    Enterokinase and recombinant enterokinase light chain (rEK(L)) have been used widely to cleave fusion proteins with the target sequence of (Asp)(4)-Lys. In this work, we show that their utility as a site-specific cleavage agent is compromised by sporadic cleavage at other sites, albeit at low levels. Further degradation of the fusion protein in cleavage reaction is due to an intrinsic broad specificity of the enzyme rather than to the presence of contaminating proteases. To offer facilitated purification from fermentation broth and efficient removal of rEK(L) after cleavage reaction, thus minimizing unwanted cleavage of target protein, histidine affinity tag was introduced into rEK(L). Utilizing the secretion enhancer peptide derived from the human interleukin 1 beta, the recombinant EK(L) was expressed in Saccharomyces cerevisiae and efficiently secreted into culture medium. The C-terminal His-tagged EK(L) was purified in a single-step procedure on nickel affinity chromatography. It retained full enzymatic activity similar to that of EK(L), whereas the N-terminal His-tagged EK(L) was neither efficiently purified nor had any enzymatic activity. After cleavage reaction of fusion protein, the C-terminal His-tagged EK(L) was efficiently removed from the reaction mixture by a single passage through nickel-NTA spin column. The simple affinity tag renders rEK(L) extremely useful for purification, post-cleavage removal, recovery, and recycling and will broaden the utility and the versatility of the enterokinase for the production of recombinant proteins. PMID:11745150

  8. The Arabidopsis root transcriptome by serial analysis of gene expression. Gene identification using the genome sequence.

    PubMed

    Fizames, Cécile; Muños, Stéphane; Cazettes, Céline; Nacry, Philippe; Boucherez, Jossia; Gaymard, Frédéric; Piquemal, David; Delorme, Valérie; Commes, Thérèse; Doumas, Patrick; Cooke, Richard; Marti, Jacques; Sentenac, Hervé; Gojon, Alain

    2004-01-01

    Large-scale identification of genes expressed in roots of the model plant Arabidopsis was performed by serial analysis of gene expression (SAGE), on a total of 144,083 sequenced tags, representing at least 15,964 different mRNAs. For tag to gene assignment, we developed a computational approach based on 26,620 genes annotated from the complete sequence of the genome. The procedure selected warrants the identification of the genes corresponding to the majority of the tags found experimentally, with a high level of reliability, and provides a reference database for SAGE studies in Arabidopsis. This new resource allowed us to characterize the expression of more than 3,000 genes, for which there is no expressed sequence tag (EST) or cDNA in the databases. Moreover, 85% of the tags were specific for one gene. To illustrate this advantage of SAGE for functional genomics, we show that our data allow an unambiguous analysis of most of the individual genes belonging to 12 different ion transporter multigene families. These results indicate that, compared with EST-based tag to gene assignment, the use of the annotated genome sequence greatly improves gene identification in SAGE studies. However, more than 6,000 different tags remained with no gene match, suggesting that a significant proportion of transcripts present in the roots originate from yet unknown or wrongly annotated genes. The root transcriptome characterized in this study markedly differs from those obtained in other organs, and provides a unique resource for investigating the functional specificities of the root system. As an example of the use of SAGE for transcript profiling in Arabidopsis, we report here the identification of 270 genes differentially expressed between roots of plants grown either with NO3- or NH4NO3 as N source.

  9. Expressed Peptide Tags: An additional layer of data for genome annotation

    SciTech Connect

    Savidor, Alon; Donahoo, Ryan S; Hurtado-Gonzales, Oscar; Verberkmoes, Nathan C; Shah, Manesh B; Lamour, Kurt H; McDonald, W Hayes

    2006-01-01

    While genome sequencing is becoming ever more routine, genome annotation remains a challenging process. Identification of the coding sequences within the genomic milieu presents a tremendous challenge, especially for eukaryotes with their complex gene architectures. Here we present a method to assist the annotation process through the use of proteomic data and bioinformatics. Mass spectra of digested protein preparations of the organism of interest were acquired and searched against a protein database created by a six frame translation of the genome. The identified peptides were mapped back to the genome, compared to the current annotation, and then categorized as supporting or extending the current genome annotation. We named the classified peptides Expressed Peptide Tags (EPTs). The well annotated bacterium Rhodopseudomonas palustris was used as a control for the method and showed high degree of correlation between EPT mapping and the current annotation, with 86% of the EPTs confirming existing gene calls and less than 1% of the EPTs expanding on the current annotation. The eukaryotic plant pathogens Phytophthora ramorum and Phytophthora sojae, whose genomes have been recently sequenced and are much less well annotated, were also subjected to this method. A series of algorithmic steps were taken to increase the confidence of EPT identification for these organisms, including generation of smaller sub-databases to be searched against, and definition of EPT criteria that accommodates the more complex eukaryotic gene architecture. As expected, the analysis of the Phytophthora species showed less correlation between EPT mapping and their current annotation. While ~77% of Phytophthora EPTs supported the current annotation, a portion of them (7.2% and 12.6% for P. ramorum and P. sojae, respectively) suggested modification to current gene calls or identified novel genes that were missed by the current genome annotation of these organisms.
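    The six-frame translation step mentioned in this abstract can be sketched as follows. This is an illustrative implementation using the standard genetic code, not the authors' pipeline; the function names and the short example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons; unknown codons become 'X', stops '*'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq: str) -> dict:
    """Translate all three forward and three reverse reading frames."""
    seq = seq.upper()
    rev = reverse_complement(seq)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(seq[offset:])
        frames[f"-{offset + 1}"] = translate(rev[offset:])
    return frames


frames = six_frame_translation("ATGGCCTAA")  # hypothetical 9-nt example
for name, protein in sorted(frames.items()):
    print(name, protein)
```

    A search database built this way contains every possible open reading frame regardless of existing gene calls, which is what allows peptides to support or extend an annotation.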

  10. Expression and purification of recombinant proteins in Escherichia coli tagged with a small metal-binding protein from Nitrosomonas europaea.

    PubMed

    Vargas-Cortez, Teresa; Morones-Ramirez, Jose Ruben; Balderas-Renteria, Isaias; Zarate, Xristo

    2016-02-01

    Escherichia coli is still the preferred organism for large-scale production of recombinant proteins. The use of fusion proteins has helped considerably in enhancing the solubility of heterologous proteins and their purification with affinity chromatography. Here, the use of a small metal-binding protein (SmbP) from Nitrosomonas europaea is described as a new fusion protein for protein expression and purification in E. coli. Fluorescent proteins tagged at the N-terminus with SmbP showed high levels of solubility, compared with those of maltose-binding protein and glutathione S-transferase, and low formation of inclusion bodies. Using commercially available IMAC resins charged with Ni(II), highly pure recombinant proteins were obtained after just one chromatography step. Proteins may be purified from the periplasm of E. coli if SmbP contains the signal sequence at the N-terminus. After removal of the SmbP tag from the protein of interest, high yields are obtained since SmbP is a protein of just 9.9 kDa. The results obtained here suggest that SmbP is a good alternative as a fusion protein/affinity tag for the production of soluble recombinant proteins in E. coli.

  11. Gene CATCHR--gene cloning and tagging for Caenorhabditis elegans using yeast homologous recombination: a novel approach for the analysis of gene expression.

    PubMed

    Sassi, Holly E; Renihan, Stephanie; Spence, Andrew M; Cooperstock, Ramona L

    2005-01-01

    Expression patterns of gene products provide important insights into gene function. Reporter constructs are frequently used to analyze gene expression in Caenorhabditis elegans, but the sequence context of a given gene is inevitably altered in such constructs. As a result, these transgenes may lack regulatory elements required for proper gene expression. We developed Gene Catchr, a novel method of generating reporter constructs that exploits yeast homologous recombination (YHR) to subclone and tag worm genes while preserving their local sequence context. YHR facilitates the cloning of large genomic regions, allowing the isolation of regulatory sequences in promoters, introns, untranslated regions and flanking DNA. The endogenous regulatory context of a given gene is thus preserved, producing expression patterns that are as accurate as possible. Gene Catchr is flexible: any tag can be inserted at any position without introducing extra sequence. Each step is simple and can be adapted to process multiple genes in parallel. We show that expression patterns derived from Gene Catchr transgenes are consistent with previous reports and also describe novel expression data. Mutant rescue assays demonstrate that Gene Catchr-generated transgenes are functional. Our results validate the use of Gene Catchr as a valuable tool to study spatiotemporal gene expression. PMID:16254074

  12. The Use of Affinity Tags to Overcome Obstacles in Recombinant Protein Expression and Purification.

    PubMed

    Amarasinghe, Chinthaka; Jin, Jian-Ping

    2015-01-01

    Research and industrial demands for recombinant proteins continue to increase over time for their broad applications in structural and functional studies and as therapeutic agents. These applications often require large quantities of recombinant protein at desirable purity, which highlights the importance of developing and improving production approaches that provide high level expression and readily achievable purity of recombinant protein. E. coli is the most widely used host for the expression of a diverse range of proteins at low cost. However, there are common pitfalls that can severely limit the expression of exogenous proteins, such as stability, low solubility and toxicity to the host cell. To overcome these obstacles, one strategy that has been found to be promising is the use of affinity tags or carrier peptides to aid in the folding of the target protein, increase solubility, lower toxicity and increase the level of expression. In the meantime, the tags and fusion proteins can be designed to facilitate affinity purification. Since the fusion protein may not exhibit the native conformation of the target protein, various strategies have been developed to remove the tag during or after purification to avoid potential complications in structural and functional studies and to obtain native biological activities. Despite extensive research and rapid development along these lines, there are unsolved problems and imperfect applications. This focused review compares and contrasts various strategies that employ affinity tags to improve bacterial expression and to facilitate purification of recombinant proteins. The pros and cons of the approaches are discussed for more effective applications and new directions of future improvement. PMID:26216265

  13. Rapid and efficient purification of native histidine-tagged protein expressed by recombinant vaccinia virus.

    PubMed Central

    Janknecht, R; de Martynoff, G; Lou, J; Hipskind, R A; Nordheim, A; Stunnenberg, H G

    1991-01-01

    Vaccinia virus has been used as a vector to express foreign genes for the production of functional and posttranslationally modified proteins. A procedure is described here that allows the rapid native purification of vaccinia-expressed proteins fused to an amino-terminal tag of six histidines. Extracts from cells infected with recombinant vaccinia virus are loaded onto Ni2+.nitrilotriacetic acid (Ni2+.NTA)-agarose and histidine-tagged proteins are selectively eluted with imidazole-containing buffers. In the case of the human serum response factor (SRF), a transcription factor involved in the regulation of the c-fos protooncogene, the vaccinia-expressed histidine-tagged SRF (SRF-6His) could be purified solely by this step to greater than 95% purity. SRF-6His was shown to resemble authentic SRF by functional criteria: it was transported to the nucleus, bound specifically the c-fos serum response element, interacted with the p62TCF protein to form a ternary complex, and stimulated in vitro transcription from the serum response element. Thus, the combination of vaccinia virus expression and affinity purification by Ni2+.NTA chromatography promises to be useful for the production of proteins in a functional and posttranslationally modified form. PMID:1924358

  14. Repetitive genome elements in a European corn borer, Ostrinia nubilalis, bacterial artificial chromosome library were indicated by bacterial artificial chromosome end sequencing and development of sequence tag site markers: implications for lepidopteran genomic research.

    PubMed

    Coates, Brad S; Sumerford, Douglas V; Hellmich, Richard L; Lewis, Leslie C

    2009-01-01

    The European corn borer, Ostrinia nubilalis, is a serious pest of food, fiber, and biofuel crops in Europe, North America, and Asia and a model system for insect olfaction and speciation. A bacterial artificial chromosome library constructed for O. nubilalis contains 36 864 clones with an estimated average insert size of ≥120 kb and genome coverage of 8.8-fold. Screening OnB1 clones comprising approximately 2.76 genome equivalents determined the physical position of 24 sequence tag site markers, including markers linked to ecologically important and Bacillus thuringiensis toxin resistance traits. OnB1 bacterial artificial chromosome end sequence reads (GenBank dbGSS accessions ET217010 to ET217273) showed homology to annotated genes or expressed sequence tags and identified repetitive genome elements, O. nubilalis miniature subterminal inverted repeat transposable elements (OnMITE01 and OnMITE02), and ezi-like long interspersed nuclear elements. Mobility of OnMITE01 was demonstrated by the presence or absence in O. nubilalis of introns at two different loci. A (GTCT)n tetranucleotide repeat at the 5' ends of OnMITE01 and OnMITE02 is evidence for transposon-mediated movement of lepidopteran microsatellite loci. The number of repetitive elements in lepidopteran genomes will affect genome assembly and marker development. Single-locus sequence tag site markers described here have downstream application for integration within linkage maps and comparative genomic studies. PMID:19132072
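    The library figures quoted in this abstract are related by the usual coverage formula, coverage = (number of clones × average insert size) / genome size. The sketch below rearranges that formula to show the genome size the reported numbers imply. The implied genome size and the clone count for the screened subset are derived here for illustration and are not stated in the abstract; the insert size is treated as exactly 120 kb even though it is reported as a lower bound.

```python
# Values quoted in the abstract.
n_clones = 36_864        # clones in the BAC library
insert_kb = 120          # average insert size in kb (reported as a lower bound)
coverage = 8.8           # reported genome coverage (fold)
screened_equiv = 2.76    # genome equivalents screened for STS markers

# coverage = n_clones * insert / genome_size, rearranged for genome size.
genome_size_kb = n_clones * insert_kb / coverage
genome_size_mb = genome_size_kb / 1000

# Number of clones needed to cover the screened genome equivalents.
clones_screened = screened_equiv * genome_size_kb / insert_kb

print(f"Implied genome size: about {genome_size_mb:.0f} Mb")
print(f"Clones in the screened subset: about {clones_screened:.0f}")
```

    Because the insert size is given as a minimum, the implied genome size should be read as an approximate lower-end estimate rather than a measured value.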

  16. Distinct cis regulatory elements govern the expression of TAG1 in embryonic sensory ganglia and spinal cord.

    PubMed

    Hadas, Yoav; Nitzan, Noa; Furley, Andrew J W; Kozlov, Serguei V; Klar, Avihu

    2013-01-01

    Cell fate commitment of spinal progenitor neurons is initiated by long-range, midline-derived, morphogens that regulate an array of transcription factors that, in turn, act sequentially or in parallel to control neuronal differentiation. Included among these are transcription factors that regulate the expression of receptors for guidance cues, thereby determining axonal trajectories. The Ig/FNIII superfamily molecules TAG1/Axonin1/CNTN2 (TAG1) and Neurofascin (Nfasc) are co-expressed in numerous neuronal cell types in the CNS and PNS - for example motor, DRG and interneurons - both promote neurite outgrowth and both are required for the architecture and function of nodes of Ranvier. The genes encoding TAG1 and Nfasc are adjacent in the genome, an arrangement which is evolutionarily conserved. To study the transcriptional network that governs TAG1 and Nfasc expression in spinal motor and commissural neurons, we set out to identify cis elements that regulate their expression. Two evolutionarily conserved DNA modules, one located between the Nfasc and TAG1 genes and the second directly 5' to the first exon and encompassing the first intron of TAG1, were identified that direct complementary expression to the CNS and PNS, respectively, of the embryonic hindbrain and spinal cord. Sequential deletions and point mutations of the CNS enhancer element revealed a 130 bp element containing three conserved E-boxes required for motor neuron expression. In combination, these two elements appear to recapitulate a major part of the pattern of TAG1 expression in the embryonic nervous system.

  17. Population Genomics of Parallel Adaptation in Threespine Stickleback using Sequenced RAD Tags

    PubMed Central

    Etter, Paul D.; Stiffler, Nicholas; Johnson, Eric A.; Cresko, William A.

    2010-01-01

    Next-generation sequencing technology provides novel opportunities for gathering genome-scale sequence data in natural populations, laying the empirical foundation for the evolving field of population genomics. Here we conducted a genome scan of nucleotide diversity and differentiation in natural populations of threespine stickleback (Gasterosteus aculeatus). We used Illumina-sequenced RAD tags to identify and type over 45,000 single nucleotide polymorphisms (SNPs) in each of 100 individuals from two oceanic and three freshwater populations. Overall estimates of genetic diversity and differentiation among populations confirm the biogeographic hypothesis that large panmictic oceanic populations have repeatedly given rise to phenotypically divergent freshwater populations. Genomic regions exhibiting signatures of both balancing and divergent selection were remarkably consistent across multiple, independently derived populations, indicating that replicate parallel phenotypic evolution in stickleback may be occurring through extensive, parallel genetic evolution at a genome-wide scale. Some of these genomic regions co-localize with previously identified QTL for stickleback phenotypic variation identified using laboratory mapping crosses. In addition, we have identified several novel regions showing parallel differentiation across independent populations. Annotation of these regions revealed numerous genes that are candidates for stickleback phenotypic evolution and will form the basis of future genetic analyses in this and other organisms. This study represents the first high-density SNP–based genome scan of genetic diversity and differentiation for populations of threespine stickleback in the wild. These data illustrate the complementary nature of laboratory crosses and population genomic scans by confirming the adaptive significance of previously identified genomic regions, elucidating the particular evolutionary and demographic history of such regions in natural

  18. Heterologous expression of rat epitope-tagged histamine H2 receptors in insect Sf9 cells

    PubMed Central

    Beukers, M W; Klaassen, C H W; De Grip, W J; Verzijl, D; Timmerman, H; Leurs, R

    1997-01-01

    Rat histamine H2 receptors were epitope-tagged with six histidine residues at the C-terminus to allow immunological detection of the receptor. Recombinant baculoviruses containing the epitope-tagged H2 receptor were prepared and were used to infect insect Sf9 cells. The His-tagged H2 receptors expressed in insect Sf9 cells showed typical H2 receptor characteristics as determined with [125I]-aminopotentidine (APT) binding studies. In Sf9 cells expressing the His-tagged H2 receptor histamine was able to stimulate cyclic AMP production 9 fold (EC50=2.1±0.1 μM) by use of the endogenous signalling pathway. The classical antagonists cimetidine, ranitidine and tiotidine inhibited histamine induced cyclic AMP production with Ki values of 0.60±0.43 μM, 0.25±0.15 μM and 28±7 nM, respectively (mean±s.e.mean, n=3). The expression of the His-tagged H2 receptors in infected Sf9 cells reached functional levels of 6.6±0.6 pmol mg−1 protein (mean±s.e.mean, n=3) after 3 days of infection. This represents about 2×10^6 copies of receptor/cell. Preincubation of the cells with 0.03 mM cholesterol-β-cyclodextrin complex resulted in an increase of [125I]-APT binding up to 169±5% (mean±s.e.mean, n=3). The addition of 0.03 mM cholesterol-β-cyclodextrin complex did not affect histamine-induced cyclic AMP production. The EC50 value of histamine was 3.1±1.7 μM in the absence of cholesterol-β-cyclodextrin complex and 11.1±5.5 μM in the presence of cholesterol-β-cyclodextrin complex (mean±s.e.mean, n=3). Also, the amount of cyclic AMP produced in the presence of 100 μM histamine was identical, 85±18 pmol/10^6 cells in the absence and 81±11 pmol/10^6 cells in the presence of 0.03 mM cholesterol-β-cyclodextrin complex (mean±s.e.mean, n=3). Immunofluorescence studies with an antibody against the His-tag revealed that the majority of the His-tagged H2 receptors was localized inside the insect Sf9 cells, although plasma membrane labelling could be

  19. The non-coding RNA composition of the mitotic chromosome by 5′-tag sequencing

    PubMed Central

    Meng, Yicong; Yi, Xianfu; Li, Xinhui; Hu, Chuansheng; Wang, Ju; Bai, Ling; Czajkowsky, Daniel M.; Shao, Zhifeng

    2016-01-01

    Mitotic chromosomes are one of the most commonly recognized sub-cellular structures in eukaryotic cells. Yet basic information necessary to understand their structure and assembly, such as their composition, is still lacking. Recent proteomic studies have begun to fill this void, identifying hundreds of RNA-binding proteins bound to mitotic chromosomes. However, by contrast, there are only two RNA species (U3 snRNA and rRNA) that are known to be associated with the mitotic chromosome, suggesting that there are many mitotic chromosome-associated RNAs (mCARs) not yet identified. Here, using a targeted protocol based on 5′-tag sequencing to profile the mammalian mCAR population, we report the identification of 1279 mCARs, the majority of which are ncRNAs, including lncRNAs that exhibit greater conservation across 60 vertebrate species than the entire population of lncRNAs. There is also a significant enrichment of snoRNAs and specific SINE RNAs. Finally, ∼40% of the mCARs are presently unannotated, many of which are as abundant as the annotated mCARs, suggesting that there are also many novel ncRNAs in the mCARs. Overall, the mCARs identified here, together with the previous proteomic and genomic data, constitute the first comprehensive catalogue of the molecular composition of the eukaryotic mitotic chromosomes. PMID:27016738

  20. Broad host range vectors for expression of proteins with (Twin-) Strep-tag, His-tag and engineered, export optimized yellow fluorescent protein

    PubMed Central

    2013-01-01

    Background: In current protein research, a limitation still is the production of active recombinant proteins or native protein associations to assess their function. Especially the localization and analysis of protein-complexes or the identification of modifications and small molecule interaction partners by co-purification experiments requires a controllable expression of affinity- and/or fluorescence tagged variants of a protein of interest in its native cellular background. Advantages of periplasmic and/or homologous expressions can frequently not be realized due to a lack of suitable tools. Instead, experiments are often limited to the heterologous production in one of the few well established expression strains. Results: Here, we introduce a series of new RK2 based broad host range expression plasmids for inducible production of affinity- and fluorescence tagged proteins in the cytoplasm and periplasm of a wide range of Gram negative hosts which are designed to match the recently suggested modular Standard European Vector Architecture and database. The vectors are equipped with a yellow fluorescent protein variant which is engineered to fold and brightly fluoresce in the bacterial periplasm following Sec-mediated export, as shown from fractionation and imaging studies. Expression of Strep-tag®II and Twin-Strep-tag® fusion proteins in Pseudomonas putida KT2440 is demonstrated for various ORFs. Conclusion: The broad host range constructs we have produced enable good and controlled expression of affinity tagged protein variants for single-step purification and qualify for complex co-purification experiments. Periplasmic export variants enable production of affinity tagged proteins and generation of fusion proteins with a novel engineered Aequorea-based yellow fluorescent reporter protein variant with activity in the periplasm of the tested Gram-negative model bacteria Pseudomonas putida KT2440 and Escherichia coli K12 for production, localization or co

  1. HaloTag is an effective expression and solubilisation fusion partner for a range of fibroblast growth factors.

    PubMed

    Sun, Changye; Li, Yong; Taylor, Sarah E; Mao, Xianqing; Wilkinson, Mark C; Fernig, David G

    2015-01-01

    The production of recombinant proteins such as the fibroblast growth factors (FGFs) is the key to establishing their function in cell communication. The production of recombinant FGFs in E. coli is limited, however, due to expression and solubility problems. HaloTag has been used as a fusion protein to introduce a genetically-encoded means for chemical conjugation of probes. We have expressed 11 FGF proteins with an N-terminal HaloTag, followed by a tobacco etch virus (TEV) protease cleavage site to allow release of the FGF protein. These were purified by heparin-affinity chromatography, and in some instances by further ion-exchange chromatography. It was found that HaloTag did not adversely affect the expression of FGF1 and FGF10, both of which expressed well as soluble proteins. The N-terminal HaloTag fusion was found to enhance the expression and yield of FGF2, FGF3 and FGF7. Moreover, whereas FGF6, FGF8, FGF16, FGF17, FGF20 and FGF22 were only expressed as insoluble proteins, their N-terminal HaloTag fusion counterparts (Halo-FGFs) were soluble, and could be successfully purified. However, cleavage of Halo-FGF6, -FGF8 and -FGF22 with TEV resulted in aggregation of the FGF protein. Measurement of phosphorylation of p42/44 mitogen-activated protein kinase and of cell growth demonstrated that the HaloTag fusion proteins were biologically active. Thus, HaloTag provides a means to enhance the expression of soluble recombinant proteins, in addition to providing a chemical genetics route for covalent tagging of proteins.

  2. HaloTag is an effective expression and solubilisation fusion partner for a range of fibroblast growth factors

    PubMed Central

    Taylor, Sarah E.; Mao, Xianqing; Wilkinson, Mark C.

    2015-01-01

    The production of recombinant proteins such as the fibroblast growth factors (FGFs) is the key to establishing their function in cell communication. The production of recombinant FGFs in E. coli is limited, however, due to expression and solubility problems. HaloTag has been used as a fusion protein to introduce a genetically-encoded means for chemical conjugation of probes. We have expressed 11 FGF proteins with an N-terminal HaloTag, followed by a tobacco etch virus (TEV) protease cleavage site to allow release of the FGF protein. These were purified by heparin-affinity chromatography, and in some instances by further ion-exchange chromatography. It was found that HaloTag did not adversely affect the expression of FGF1 and FGF10, both of which expressed well as soluble proteins. The N-terminal HaloTag fusion was found to enhance the expression and yield of FGF2, FGF3 and FGF7. Moreover, whereas FGF6, FGF8, FGF16, FGF17, FGF20 and FGF22 were only expressed as insoluble proteins, their N-terminal HaloTag fusion counterparts (Halo-FGFs) were soluble, and could be successfully purified. However, cleavage of Halo-FGF6, -FGF8 and -FGF22 with TEV resulted in aggregation of the FGF protein. Measurement of phosphorylation of p42/44 mitogen-activated protein kinase and of cell growth demonstrated that the HaloTag fusion proteins were biologically active. Thus, HaloTag provides a means to enhance the expression of soluble recombinant proteins, in addition to providing a chemical genetics route for covalent tagging of proteins. PMID:26137434

  3. Functional dissection of the cis-acting sequences of the Arabidopsis transposable element Tag1 reveals dissimilar subterminal sequence and minimal spacing requirements for transposition.

    PubMed Central

    Liu, D; Mack, A; Wang, R; Galli, M; Belk, J; Ketpura, N I; Crawford, N M

    2001-01-01

    The Arabidopsis transposon Tag1 has an unusual subterminal structure containing four sets of dissimilar repeats: one set near the 5' end and three near the 3' end. To determine sequence requirements for efficient and regulated transposition, deletion derivatives of Tag1 were tested in Arabidopsis plants. These tests showed that a 98-bp 5' fragment containing the 22-bp inverted repeat and four copies of the AAACCX (X = C, A, G) 5' subterminal repeat is sufficient for transposition while a 52-bp 5' fragment containing only one copy of the subterminal repeat is not. At the 3' end, a 109-bp fragment containing four copies of the most 3' repeat TGACCC, but not a 55-bp fragment, which has no copies of the subterminal repeats, is sufficient for transposition. The 5' and 3' end fragments are not functionally interchangeable and require an internal spacer DNA of minimal length between 238 and 325 bp to be active. Elements with these minimal requirements show transposition rates and developmental control of excision that are comparable to the autonomous Tag1 element. Last, a DNA-binding activity that interacts with the 3' 109-bp fragment but not the 5' 98-bp fragment of Tag1 was found in nuclear extracts of Arabidopsis plants devoid of Tag1. PMID:11156999
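
    The 5' subterminal repeat described above, AAACCX (X = C, A, G), maps naturally onto a regular expression. The following is a minimal sketch of counting such repeats in a fragment; the function name and the toy fragment are hypothetical and are not the real Tag1 sequence.

```python
import re

# Subterminal repeat from the abstract: AAACCX, where X is C, A or G.
REPEAT = re.compile(r"AAACC[CAG]")


def count_repeats(fragment: str) -> int:
    """Count non-overlapping AAACCX (X = C, A, G) repeats in a DNA fragment."""
    return len(REPEAT.findall(fragment.upper()))


# Hypothetical toy fragment for illustration only (not the Tag1 5' end).
toy_fragment = "ttAAACCCgtAAACCAccAAACCGtaAAACCTgg"
print(count_repeats(toy_fragment))  # AAACCT is not counted because T is not allowed
```

    A count of this kind only reports how many repeat copies a fragment carries; whether a fragment supports transposition was determined experimentally in the study.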

  4. Detection of genes expressed in Bordetella bronchiseptica colonizing rat trachea by in vivo expressed-tag immunoprecipitation method.

    PubMed

    Abe, Hiroyuki; Kamitani, Shigeki; Fukui-Miyazaki, Aya; Shinzawa, Naoaki; Nakamura, Keiji; Horiguchi, Yasuhiko

    2015-05-01

    Analyses of bacterial genes expressed in response to the host environment provide clues to understanding the host-pathogen interactions that lead to the establishment of infection. In this study, a novel method named In Vivo Expressed-Tag ImmunoPrecipitation (IVET-IP) was developed for detecting genes expressed in bacteria that are recovered in small numbers from host tissues. IVET-IP was designed to overcome some drawbacks of previous similar methods. We applied IVET-IP to Bordetella bronchiseptica colonizing rat trachea and identified 173 genes that were expressed in the bacteria over the entire course of an infection. These gene products included two transcriptional factors that are involved in the expression of filamentous hemagglutinin, adenylate cyclase toxin, and major virulence factors for the bordetellae. We consider that this method might provide novel insight into the course of Bordetella infection. PMID:25683445

  6. Persistent Expression of FLAG-tagged Micro dystrophin in Nonhuman Primates Following Intramuscular and Vascular Delivery

    PubMed Central

    Rodino-Klapac, Louise R; Montgomery, Chrystal L; Bremer, William G; Shontz, Kimberly M; Malik, Vinod; Davis, Nancy; Sprinkle, Spencer; Campbell, Katherine J; Sahenk, Zarife; Clark, K Reed; Walker, Christopher M; Mendell, Jerry R; Chicoine, Louis G

    2009-01-01

    Animal models for Duchenne muscular dystrophy (DMD) have species limitations related to assessing function, immune response, and distribution of micro- or mini-dystrophins. Nonhuman primates (NHPs) provide the ideal model to optimize vector delivery across a vascular barrier and provide accurate dose estimates for widespread transduction. To address vascular delivery and dosing in rhesus macaques, we have generated a fusion construct that encodes an eight amino-acid FLAG epitope at the C-terminus of micro-dystrophin to facilitate translational studies targeting DMD. Intramuscular (IM) injection of AAV8.MCK.micro-dys.FLAG in the tibialis anterior (TA) of macaques demonstrated robust gene expression, with muscle transduction (50–79%) persisting for up to 5 months. Success by IM injection was followed by targeted vascular delivery studies using a fluoroscopy-guided catheter threaded through the femoral artery. Three months after gene transfer, >80% of muscle fibers showed gene expression in the targeted muscle. No cellular immune response to AAV8 capsid, micro-dystrophin, or the FLAG tag was detected by interferon-γ (IFN-γ) enzyme-linked immunosorbent spot (ELISpot) at any time point with either route. In summary, an epitope-tagged micro-dystrophin cassette enhances the ability to evaluate site-specific localization and distribution of gene expression in the NHP in preparation for vascular delivery clinical trials. PMID:19904237

  7. Expression of epitope-tagged SYN3 cohesin proteins can disrupt meiosis in Arabidopsis.

    PubMed

    Yuan, Li; Yang, Xiaohui; Auman, Dirk; Makaroff, Christopher A

    2014-03-20

    α-kleisins are core components of meiotic and mitotic cohesin complexes. Arabidopsis contains genes encoding four α-kleisins. SYN1, a REC8 ortholog, is essential for meiosis, while SYN2 and SYN4 appear to be SCC1 orthologs and function in mitosis. SYN3 is enriched in the nucleolus of meiotic and mitotic cells and is essential for megagametogenesis. It was recently shown that expression of SYN3-RNAi constructs in buds causes changes in meiotic gene expression that result in meiotic alterations. In this report we show that expression of SYN3 from the 35S promoter with either a C-terminal Myc or FAST tag causes a reduction in SYN1 mRNA levels that results in alterations in sister chromatid cohesion, homologous chromosome synapsis and synaptonemal complex formation during both male and female meiosis. PMID:24656235

  8. Down Syndrome: A search for expressed sequences

    SciTech Connect

    Pritchard, M.; Fuentes, J.J.; Bosch, A.

    1994-09-01

    Down Syndrome (DS) is a major cause of congenital heart disease and mental retardation. The most common anomaly is an extra copy of human chromosome 21 (HC21); however, chromosomal studies in rare patients with partial trisomy 21 have defined a minimal region for DS, including human chromosome 21 bands q22.2-q22.3. The study of genes in this chromosomal region will allow the elucidation of the biochemical and molecular bases for several of the distinct phenotypic traits of the syndrome. This information is the key to the design of therapeutic, pharmacological and genetic tools to counter the effects of three copies of chromosome 21 in the cells of DS patients. Towards this goal, we aim to build a transcriptional map of this region and then characterize any genes isolated. We are using two methods to isolate expressed sequences: (1) Alu-splice consensus PCR and (2) cDNA hybridization selection. As starting material, we use YACs (CEPH/Genethon) from the specified region and cosmid minilibraries constructed from these YACs. Products are subcloned, sequenced and analyzed in the sequence databases. Several homologies with reported expressed sequences have been found and will be discussed. The HC21 origin of these putative expressed sequences is determined and they are then used to initially screen a human fetal brain full-length cDNA library. We have isolated several cDNAs and these are now being analyzed.

  9. Identification of Disulfide Bonds in Protein Proteolytic Degradation Products Using de Novo-Protein Unique Sequence Tags Approach

    SciTech Connect

    Shen, Yufeng; Tolic, Nikola; Purvine, Samuel O.; Smith, Richard D.

    2010-08-01

    Disulfide bonds are a form of posttranslational modification that often determines protein structure(s) and function(s). In this work, we report a mass spectrometry method for identification of disulfides in degradation products of proteins, and specifically endogenous peptides in the human blood plasma peptidome. LC-Fourier transform tandem mass spectrometry (FT MS/MS) was used for acquiring mass spectra that were de novo sequenced and then searched against the IPI human protein database. Through the use of unique sequence tags (UStags) we unambiguously correlated the spectra to specific database proteins. Examination of the UStags’ prefix and/or suffix sequences that contain cysteine(s) in conjunction with sequences of the UStags-specified database proteins is shown to enable the unambiguous determination of disulfide bonds. Using this method, we identified the intermolecular and intramolecular disulfides in human blood plasma peptidome peptides that have molecular weights of up to ~10 kDa.

  10. Identification of disulfide bonds in protein proteolytic degradation products using de novo-protein unique sequence tags approach.

    PubMed

    Shen, Yufeng; Tolić, Nikola; Purvine, Samuel O; Smith, Richard D

    2010-08-01

    Disulfide bonds are a form of post-translational modification that often determines protein structure(s) and function(s). In this work, we report a mass spectrometry method for identification of disulfides in degradation products of proteins, specifically endogenous peptides in the human blood plasma peptidome. LC-Fourier transform tandem mass spectrometry (FT MS/MS) was used for acquiring mass spectra that were de novo sequenced and then searched against the IPI human protein database. Through the use of unique sequence tags (UStags), we unambiguously correlated the spectra to specific database proteins. Examination of the UStags' prefix and/or suffix sequences that contain cysteine(s) in conjunction with sequences of the UStags-specified database proteins is shown to enable the unambiguous determination of disulfide bonds. Using this method, we identified the intermolecular and intramolecular disulfides in human blood plasma peptidome peptides that have molecular weights of up to approximately 10 kDa. PMID:20590115
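
    The abstract does not spell out how a sequence tag is judged unique. As a rough illustration only, the sketch below assumes that a tag counts as unique when it occurs as a substring of exactly one protein in the database; the function names and the two toy protein sequences are hypothetical, not entries from the IPI database.

```python
def proteins_containing(tag: str, database: dict[str, str]) -> list[str]:
    """Return the names of all database proteins whose sequence contains the tag."""
    return [name for name, sequence in database.items() if tag in sequence]


def is_unique_tag(tag: str, database: dict[str, str]) -> bool:
    """Assumed criterion: a tag is unique if exactly one protein contains it."""
    return len(proteins_containing(tag, database)) == 1


# Hypothetical toy database for illustration only.
toy_db = {
    "protA": "MKTCAYLLGSCDE",
    "protB": "MKTWWYLLGPQRS",
}

print(is_unique_tag("YLLG", toy_db))    # shared by both toy proteins
print(is_unique_tag("CAYLLG", toy_db))  # found only in protA
```

    Under this assumption, a longer tag is more likely to be unique, which is why the cysteine-containing prefix and suffix around a tag can then be read off a single database protein.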

  11. Thermostable tag (TST) protein expression system: engineering thermotolerant recombinant proteins and vaccines.

    PubMed

    Luke, Jeremy M; Carnes, Aaron E; Sun, Ping; Hodgson, Clague P; Waugh, David S; Williams, James A

    2011-02-10

    Methods to increase temperature stability of vaccines and adjuvants are needed to reduce dependence on cold chain storage. We report herein creation and application of pVEX expression vectors to improve vaccine and adjuvant manufacture and thermostability. Defined media fermentation yields of 6 g/L thermostable toll-like receptor 5 agonist flagellin were obtained using an IPTG-inducible pVEX-flagellin expression vector. Alternative pVEX vectors encoding Pyrococcus furiosus maltodextrin-binding protein (pfMBP) as a fusion partner improved Influenza hemagglutinin antigen vaccine solubility and thermostability. A pfMBP hemagglutinin HA2 domain fusion protein was a potent immunogen. Manufacturing processes that combined up to 5 g/L defined media fermentation yields with rapid, selective, thermostable pfMBP fusion protein purification were developed. The pVEX pfMBP-based thermostable tag (TST) platform is a generic protein engineering approach to enable high yield manufacture of thermostable recombinant protein vaccine components.

  12. Automated expression and solubility screening of His-tagged proteins in 96-well format.

    PubMed

    Vincentelli, Renaud; Canaan, Stéphane; Offant, Julien; Cambillau, Christian; Bignon, Christophe

    2005-11-01

    A growing need for sensitive and high-throughput methods for screening the expression and solubility of recombinant proteins exists in structural genomics. Originally, the emergency solution was to use immediately available techniques such as manual lysis of expression cells followed by analysis of protein expression by gel electrophoresis. However, these handmade methods quickly proved to be unfit for the high-throughput demand of postgenomics, and it is now generally accepted that the long-term solution to this problem will be based on automation, on industrial standard-formatted experiments, and on downsizing samples and consumables. In agreement with this consensus, we have set up a fully automated method based on a dot-blot technology and using 96-well format consumables for assessing by immunodetection the amount of total and soluble recombinant histidine (His)-tagged proteins expressed in Escherichia coli. The method starts with the harvest of expression cells and ends with the display of solubility/expression results in milligrams of recombinant protein per liter of culture using a three-color code to assist analysis. The program autonomously processes 160 independent cultures at a time.

  13. Evaluation of affinity-tagged protein expression strategies using local and global isotope ratio measurements.

    PubMed

    Hervey, W Judson; Khalsa-Moyers, Gurusahai; Lankford, Patricia K; Owens, Elizabeth T; McKeown, Catherine K; Lu, Tse-Yuan; Foote, Linda J; Asano, Keiji G; Morrell-Falvey, Jennifer L; McDonald, W Hayes; Pelletier, Dale A; Hurst, Gregory B

    2009-07-01

    Elucidation of protein-protein interactions can provide new knowledge on protein function. Enrichments of affinity-tagged (or "bait") proteins with interaction partners generally include background, nonspecific protein artifacts. Furthermore, in vivo bait expression may introduce additional artifacts arising from altered physiology or metabolism. In this study, we compared these effects for chromosome and plasmid encoding strategies for bait proteins in two microbes: Escherichia coli and Rhodopseudomonas palustris. Differential metabolic labeling of strains expressing bait protein relative to the wild-type strain in each species allowed comparison by liquid chromatography tandem mass spectrometry (LC-MS-MS). At the local level of the protein complex, authentic interacting proteins of RNA polymerase (RNAP) were successfully discerned from artifactual proteins by the isotopic differentiation of interactions as random or targeted (I-DIRT, Tackett, A. J.; et al. J. Proteome Res. 2005, 4, 1752-1756). To investigate global effects of bait protein production, we compared proteomes from strains harboring a plasmid encoding an affinity-tagged subunit (RpoA) of RNAP with the corresponding wild-type strains. The RpoA abundance ratios of 0.8 for R. palustris and 1.7 for E. coli in plasmid strains versus wild-type indicated only slightly altered expression. While most other proteins also showed no appreciable difference in abundance, several that did show altered levels were involved in amino acid metabolism. Measurements at both local and global levels proved useful for evaluating in vitro and in vivo artifacts of plasmid-encoding strategies for bait protein expression.

  14. Construction of a plasmid coding for green fluorescent protein tagged cathepsin L and data on expression in colorectal carcinoma cells

    PubMed Central

    Tamhane, Tripti; Wolters, Brit K.; Illukkumbura, Rukshala; Maelandsmo, Gunhild M.; Haugen, Mads H.; Brix, Klaudia

    2015-01-01

    The endo-lysosomal cysteine cathepsin L has recently been shown to have moonlighting activities in that its unexpected nuclear localization in colorectal carcinoma cells is involved in cell cycle progression (Tamhane et al., 2015) [1]. Here, we show data on the construction and sequence of a plasmid coding for human cathepsin L tagged with an enhanced green fluorescent protein (phCL-EGFP) in which the fluorescent protein is covalently attached to the C-terminus of the protease. The plasmid was used for transfection of HCT116 colorectal carcinoma cells, while data from non-transfected and pEGFP-N1-transfected cells are also shown. Immunoblotting data of lysates from non-transfected controls and HCT116 cells transfected with pEGFP-N1 and phCL-EGFP showed stable expression of cathepsin L-enhanced green fluorescent protein chimeras, while endogenous cathepsin L protein amounts exceed those of hCL-EGFP chimeras. An effect of phCL-EGFP expression on proliferation and metabolic states of HCT116 cells at 24 h post-transfection was observed. PMID:26594658

  15. Construction of a plasmid coding for green fluorescent protein tagged cathepsin L and data on expression in colorectal carcinoma cells.

    PubMed

    Tamhane, Tripti; Wolters, Brit K; Illukkumbura, Rukshala; Maelandsmo, Gunhild M; Haugen, Mads H; Brix, Klaudia

    2015-12-01

    The endo-lysosomal cysteine cathepsin L has recently been shown to have moonlighting activities in that its unexpected nuclear localization in colorectal carcinoma cells is involved in cell cycle progression (Tamhane et al., 2015) [1]. Here, we show data on the construction and sequence of a plasmid coding for human cathepsin L tagged with an enhanced green fluorescent protein (phCL-EGFP) in which the fluorescent protein is covalently attached to the C-terminus of the protease. The plasmid was used for transfection of HCT116 colorectal carcinoma cells, while data from non-transfected and pEGFP-N1-transfected cells are also shown. Immunoblotting data of lysates from non-transfected controls and HCT116 cells transfected with pEGFP-N1 and phCL-EGFP showed stable expression of cathepsin L-enhanced green fluorescent protein chimeras, while endogenous cathepsin L protein amounts exceed those of hCL-EGFP chimeras. An effect of phCL-EGFP expression on proliferation and metabolic states of HCT116 cells at 24 h post-transfection was observed.

  16. Sequencing degraded DNA from non-destructively sampled museum specimens for RAD-tagging and low-coverage shotgun phylogenetics.

    PubMed

    Tin, Mandy Man-Ying; Economo, Evan Philip; Mikheyev, Alexander Sergeyevich

    2014-01-01

    Ancient and archival DNA samples are valuable resources for the study of diverse historical processes. In particular, museum specimens provide access to biotas distant in time and space, and can provide insights into ecological and evolutionary changes over time. However, archival specimens are difficult to handle; they are often fragile and irreplaceable, and typically contain only short segments of denatured DNA. Here we present a set of tools for processing such samples for state-of-the-art genetic analysis. First, we report a protocol for minimally destructive DNA extraction of insect museum specimens, which produced sequenceable DNA from all of the samples assayed. The 11 specimens analyzed had fragmented DNA, rarely exceeding 100 bp in length, and could not be amplified by conventional PCR targeting the mitochondrial cytochrome oxidase I gene. Our approach made these samples amenable to analysis with commonly used next-generation sequencing-based molecular analytic tools, including RAD-tagging and shotgun genome re-sequencing. First, we used museum ant specimens from three species, each with its own reference genome, for RAD-tag mapping. We were able to use the degraded DNA sequences, which were sequenced in full, to identify duplicate reads and filter them prior to base calling. Second, we re-sequenced six Hawaiian Drosophila species, with millions of years of divergence, but with only a single available reference genome. Despite a shallow coverage of 0.37 ± 0.42 per base, we could recover a sufficient number of overlapping SNPs to fully resolve the species tree, which was consistent with earlier karyotypic studies, and previous molecular studies, at least in the regions of the tree that these studies could resolve. Although developed for use with degraded DNA, all of these techniques are readily applicable to more recent tissue, and are suitable for liquid handling automation.

  17. Simian virus 40 sequences and expression of the viral large T antigen oncoprotein in human pleomorphic adenomas of parotid glands.

    PubMed

    Martinelli, Marcella; Martini, Fernanda; Rinaldi, Eliana; Caramanico, Laura; Magri, Eros; Grandi, Enrico; Carinci, Francesco; Pastore, Antonio; Tognon, Mauro

    2002-10-01

    Simian virus 40 (SV40) sequences of the early region coding for the large T antigen (Tag) oncoprotein were investigated in DNA samples from human pleomorphic adenoma (PA) of parotid glands. Specific SV40 sequences were detected, by PCR and filter hybridization with an internal oligoprobe, in 28 of 45 (62%) human PA specimens. None of the DNA samples from 11 normal salivary gland tissues was SV40-positive. DNA sequence analysis, carried out in all PCR amplified products from SV40-positive PA specimens, confirmed the SV40 specificity and indicated that PCR products had a sequence not distinguishable from SV40 DNA wild-type strain 776. SV40 Tag expression was revealed by immunohistochemistry with the specific monoclonal antibody Pab 101 in PA thin sections with a highly sensitive technical approach which retrieved the nuclear viral oncoprotein in 26 out of 28 (93%) samples previously found SV40-positive by PCR. Detection of SV40 sequences and Tag expression in human PA suggests that this oncogenic virus may play a role as a cofactor in the onset and/or progression of this benign neoplasm, or that SV40 DNA could replicate and express the Tag in PA cells.

  18. Single nucleotide polymorphisms associated with rat expressed sequences.

    PubMed

    Guryev, Victor; Berezikov, Eugene; Malik, Rainer; Plasterk, Ronald H A; Cuppen, Edwin

    2004-07-01

    Single nucleotide polymorphisms (SNPs) are the most common source of genetic variation in populations and are thus most likely to account for the majority of phenotypic and behavioral differences between individuals or strains. Although the rat is extensively studied for the latter, data on naturally occurring polymorphisms are mostly lacking. We have used publicly available sequences consisting of whole-genome shotgun (WGS), expressed sequence tag (EST), and mRNA data as a source for the in silico identification of SNPs in gene-coding regions and have identified a large collection of 33,305 high-quality candidate SNPs. Experimental verification of 471 candidate SNPs using a limited set of rat isolates revealed a confirmation rate of approximately 50%. Although the majority of SNPs were identified between Sprague-Dawley (EST data) and Brown Norway (WGS data) strains, we found that 66% of the verified variations are common among different rat strains. All SNPs were extensively annotated, including chromosomal and genetic map information, and nonsynonymous SNPs were analyzed by SIFT and PolyPhen prediction programs for their potential deleterious effect on protein function. Interestingly, we retrieved three SNPs from the database that result in the introduction of a premature stop codon and that could be confirmed experimentally. Two of these "in silico-identified knockouts" reside in interesting QTL regions. Data are publicly available via a Web interface (http://cascad.niob.knaw.nl), allowing simple and advanced search queries.

  19. Plasmids with E2 epitope tags: tagging modules for N- and C-terminal PCR-based gene targeting in both budding and fission yeast, and inducible expression vectors for fission yeast.

    PubMed

    Tamm, Tiina

    2009-01-01

    Single-step PCR-based epitope tagging enables fast and efficient gene targeting with various epitope tags. This report presents a series of plasmids for the E2 epitope tagging of proteins in Saccharomyces cerevisiae and Schizosaccharomyces pombe. E2Tags are 10-amino-acid (epitope E2a: SSTSSDFRDR) and 12-amino-acid (epitope E2b: GVSSTSSDFRDR) peptides derived from the E2 protein of bovine papillomavirus type 1. The modules for C-terminal tagging with E2a and E2b epitopes were constructed by modifying the pYM-series plasmids. The N-terminal E2a and E2b tagging modules were based on the pOM-series plasmids, which were selected for this study because of their use of the Cre-loxP recombination system. This system enables a marker cassette to be removed after integration into the loci of interest, after which the tagged protein is expressed under its endogenous promoter. Specifically for fission yeast, high-copy pREP plasmids containing the E2a epitope tag as an N-terminal or C-terminal tag were constructed. The properties of E2a and E2b epitopes and the sensitivity of two anti-E2 monoclonal antibodies (5E11 and 3F12) were tested using several S. cerevisiae and Sz. pombe E2-tagged strains. PMID:19180640

  20. Segmental isotope labeling of proteins for NMR structural study using a protein S tag for higher expression and solubility.

    PubMed

    Kobayashi, Hiroshi; Swapna, G V T; Wu, Kuen-Phon; Afinogenova, Yuliya; Conover, Kenith; Mao, Binchen; Montelione, Gaetano T; Inouye, Masayori

    2012-04-01

    A common obstacle to NMR studies of proteins is sample preparation. In many cases, proteins targeted for NMR studies are poorly expressed and/or expressed in insoluble forms. Here, we describe a novel approach to overcome these problems. In the protein S tag-intein (PSTI) technology, two tandem 92-residue N-terminal domains of protein S (PrS2) from Myxococcus xanthus are fused at the N-terminal end of a protein to enhance its expression and solubility. Using intein technology, the isotope-labeled PrS2-tag is replaced with non-isotope-labeled PrS2-tag, silencing the NMR signals from PrS2-tag in isotope-filtered ¹H-detected NMR experiments. This method was applied to the E. coli ribosome binding factor A (RbfA), which aggregates and precipitates in the absence of a solubilization tag unless the C-terminal 25-residue segment is deleted (RbfAΔ25). Using the PrS2-tag, full-length well-behaved RbfA samples could be successfully prepared for NMR studies. PrS2 (non-labeled)-tagged RbfA (isotope-labeled) was produced with the use of the intein approach. The well-resolved TROSY-HSQC spectrum of full-length PrS2-tagged RbfA superimposes with the TROSY-HSQC spectrum of RbfAΔ25, indicating that PrS2-tag does not affect the structure of the protein to which it is fused. Using a smaller PrS-tag, consisting of a single N-terminal domain of protein S, triple resonance experiments were performed, and most of the backbone ¹H, ¹⁵N and ¹³C resonance assignments for full-length E. coli RbfA were determined. Analysis of these chemical shift data with the Chemical Shift Index and heteronuclear ¹H-¹⁵N NOE measurements reveals the dynamic nature of the C-terminal segment of the full-length RbfA protein, which could not be inferred using the truncated RbfAΔ25 construct. CS-Rosetta calculations also demonstrate that the core structure of full-length RbfA is similar to that of the RbfAΔ25 construct.

  1. The generation of knock-in mice expressing fluorescently tagged galanin receptors 1 and 2

    PubMed Central

    Kerr, Niall; Holmes, Fiona E.; Hobson, Sally-Ann; Vanderplank, Penny; Leard, Alan; Balthasar, Nina; Wynick, David

    2015-01-01

    The neuropeptide galanin has diverse roles in the central and peripheral nervous systems, by activating the G protein-coupled receptors Gal1, Gal2 and the less studied Gal3 (GalR1–3 gene products). There is a wealth of data on expression of Gal1–3 at the mRNA level, but not at the protein level due to the lack of specificity of currently available antibodies. Here we report the generation of knock-in mice expressing Gal1 or Gal2 receptor fluorescently tagged at the C-terminus with, respectively, mCherry or hrGFP (humanized Renilla green fluorescent protein). In dorsal root ganglia (DRG) neurons expressing the highest levels of Gal1-mCherry, localization to the somatic cell membrane was detected by live-cell fluorescence and immunohistochemistry, and that fluorescence decreased upon addition of galanin. In spinal cord, abundant Gal1-mCherry immunoreactive processes were detected in the superficial layers of the dorsal horn, and highly expressing intrinsic neurons of the lamina III/IV border showed both somatic cell membrane localization and outward transport of receptor from the cell body, detected as puncta within cell processes. In brain, high levels of Gal1-mCherry immunofluorescence were detected within thalamus, hypothalamus and amygdala, with a high density of nerve endings in the external zone of the median eminence, and regions with lesser immunoreactivity included the dorsal raphe nucleus. Gal2-hrGFP mRNA was detected in DRG, but live-cell fluorescence was at the limits of detection, drawing attention to both the much lower mRNA expression than that of Gal1 in mice and the previously unrecognized potential for translational control by upstream open reading frames (uORFs). PMID:26292267

  2. SNP discovery using Paired-End RAD-tag sequencing on pooled genomic DNA of Sisymbrium austriacum (Brassicaceae).

    PubMed

    Vandepitte, K; Honnay, O; Mergeay, J; Breyne, P; Roldán-Ruiz, I; De Meyer, T

    2013-03-01

    Single nucleotide polymorphisms (SNPs) are rapidly replacing anonymous markers in population genomic studies, but their use in non-model organisms is hampered by the scarcity of cost-effective approaches to uncover genome-wide variation in a comprehensive subset of individuals. The screening of one or only a few individuals induces ascertainment bias. To discover SNPs for a population genomic study of the Pyrenean rocket (Sisymbrium austriacum subsp. chrysanthum), we undertook a pooled RAD-PE (Restriction site Associated DNA Paired-End sequencing) approach. RAD tags were generated from the PstI-digested pooled genomic DNA of 12 individuals sampled across the species distribution range and paired-end sequenced using Illumina technology to produce ~24.5 Mb of sequences, covering ~7% of the species' genome. Sequences were assembled into ~76 000 contigs with a mean length of 323 bp (N(50) = 357 bp, sequencing depth = 24x). In all, >15 000 SNPs were called, of which 47% were annotated in putative genic regions based on homology with the Arabidopsis thaliana genome. Gene ontology (GO) slim categorization demonstrated that the identified SNPs covered extant genic variation well. The validation of 300 SNPs on a larger set of individuals using a KASPar assay underpinned the utility of pooled RAD-PE as an inexpensive genome-wide SNP discovery technique (success rate: 87%). In addition to SNPs, we discovered >600 putative SSR markers.
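
    The abstract above reports an N(50) for the assembled contigs. As a minimal sketch of how that statistic is commonly computed, assuming the usual definition (the length of the contig at which the running total, taken from longest to shortest, first reaches half of the total assembly length), the function below works on a plain list of lengths. The function name and the toy lengths are illustrative only and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one contig length")
    # Walk the contigs from longest to shortest.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first contig that pushes the running sum to at least
        # half of the assembly defines the N50.
        if running >= half_total:
            return length


# Toy lengths for illustration only (not data from the study).
toy_contigs = [100, 200, 300, 400, 500]
print(n50(toy_contigs))
```

    Because N50 weights long contigs more heavily than the mean does, the two values can differ noticeably for the same assembly.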

  3. Genome-wide discovery of cis-elements in promoter sequences using gene expression.

    PubMed

    Troukhan, Maxim; Tatarinova, Tatiana; Bouck, John; Flavell, Richard B; Alexandrov, Nickolai N

    2009-04-01

    The availability of complete or nearly complete genome sequences, a large number of 5' expressed sequence tags, and significant public expression data allow for a more accurate identification of cis-elements regulating gene expression. We have implemented a global approach that takes advantage of available expression data, genomic sequences, and transcript information to predict cis-elements associated with specific expression patterns. The key components of our approach are: (1) precise identification of transcription start sites, (2) specific locations of cis-elements relative to the transcription start site, and (3) assessment of statistical significance for all sequence motifs. By applying our method to promoters of Arabidopsis thaliana and Mus musculus, we have identified motifs that affect gene expression under specific environmental conditions or in certain tissues. We also found that the presence of the TATA box is associated with increased variability of gene expression. Strong correlation between our results and experimentally determined motifs shows that the method is capable of predicting new functionally important cis-elements in promoter sequences. PMID:19231992

  4. Expression and purification of recombinant cytoplasmic domain of human erythrocyte band 3 with hexahistidine tag or chitin-binding tag in Escherichia coli.

    PubMed

    Ding, Yu; Jiang, Weihua; Su, Yang; Zhou, Hanqing; Zhang, Zhihong

    2004-04-01

    The cytoplasmic domain of erythrocyte band 3 (cdb3) serves as a center of membrane organization in the erythrocytes by its interaction with multiple proteins including ankyrin, protein 4.1, protein 4.2, hemoglobin, and several glycolytic enzymes. In this paper, human cdb3 was cloned into three different expression vectors controlled by T7 polymerase promoter and induced with isopropyl beta-D-thiogalactopyranoside in Escherichia coli. Two of the fusion proteins containing hexahistidine tag in the N-terminal or C-terminal were purified by immobilized metal affinity column chromatography. The third recombinant cdb3 containing the affinity chitin-binding tag was purified using chitin beads followed by specific self-cleavage, which released cdb3 according to the mechanism of protein splicing. The molecular weights of purified recombinant proteins were verified by mass spectrometry. The pH-dependent properties of the intrinsic tryptophan fluorescence of the three kinds of recombinant cdb3 were compared with that of the cdb3 extracted from the erythrocytes, showing that there were no significant differences between them. Using pull-down assay combined with Western blot analysis, the interaction between recombinant cdb3 and protein 4.2 C3 fragment was verified. These demonstrated that the recombinant proteins were both structurally and functionally active. The typical yields of cdb3 purified with hexahistidine tag in the N-terminal, C-terminal, and cleaved from chitin bead were 10.6, 9.6, and 1.5 mg from 1L culture medium, respectively. PMID:15003247

  5. Genome-wide search of the genes tagged with the consensus of 33.6 repeat loci in buffalo Bubalus bubalis employing minisatellite-associated sequence amplification.

    PubMed

    Pathak, Deepali; Srivastava, Jyoti; Samad, Rana; Parwez, Iqbal; Kumar, Sudhir; Ali, Sher

    2010-06-01

    Minisatellites have been implicated in chromatin organization and gene regulation, but mRNA transcripts tagged with these elements have not been systematically characterized. The aim of the present study was to gain an insight into the transcribing genes associated with consensus of 33.6 repeat loci across the tissues in water buffalo, Bubalus bubalis. Using cDNA from spermatozoa and eight different somatic tissues and an oligo primer based on two units of consensus of 33.6 repeat loci (5' CCTCCAGCCCTCCTCCAGCCCT 3'), we conducted minisatellite-associated sequence amplification (MASA) and identified 29 mRNA transcripts. These transcripts were cloned and sequenced. BLAST searches of the individual mRNA transcripts revealed sequence homologies with various transcribing genes and contigs in the database. Using real-time PCR, we detected the highest expression of nine mRNA transcripts in spermatozoa and one each in liver and lung. Further, 21 transcripts were found to be conserved across the species; seven were specific to bovid whereas one was exclusive to the buffalo genome. The present work demonstrates the innate potential of MASA in accessing several functional genes simultaneously without screening the cDNA library. This approach may be exploited for the development of tissue-specific mRNA fingerprints in the context of genome analysis and functional and comparative genomics.

  6. Generation of plasmid vectors expressing FLAG-tagged proteins under the regulation of human elongation factor-1α promoter using Gibson assembly.

    PubMed

    Grozdanov, Petar N; MacDonald, Clinton C

    2015-02-09

    Gibson assembly (GA) cloning offers a rapid, reliable, and flexible alternative to conventional DNA cloning methods. We used GA to create customized plasmids for expression of exogenous genes in mouse embryonic stem cells (mESCs). Expression of exogenous genes under the control of the SV40 or human cytomegalovirus promoters diminishes quickly after transfection into mESCs. A remedy for this diminished expression is to use the human elongation factor-1 alpha (hEF1α) promoter to drive gene expression. Plasmid vectors containing hEF1α are not as widely available as SV40- or CMV-containing plasmids, especially those also containing N-terminal 3xFLAG-tags. The protocol described here is a rapid method to create plasmids expressing FLAG-tagged CstF-64 and CstF-64 mutant under the expressional regulation of the hEF1α promoter. GA uses a blend of DNA exonuclease, DNA polymerase and DNA ligase to make cloning of overlapping ends of DNA fragments possible. Based on the template DNAs we had available, we designed our constructs to be assembled into a single sequence. Our design used four DNA fragments: pcDNA 3.1 vector backbone, hEF1α promoter part 1, hEF1α promoter part 2 (which contained 3xFLAG-tag purchased as a double-stranded synthetic DNA fragment), and either CstF-64 or specific CstF-64 mutant. The sequences of these fragments were uploaded to a primer generation tool to design appropriate PCR primers for generating the DNA fragments. After PCR, DNA fragments were mixed with the vector containing the selective marker and the GA cloning reaction was assembled. Plasmids from individual transformed bacterial colonies were isolated. Initial screen of the plasmids was done by restriction digestion, followed by sequencing. In conclusion, GA allowed us to create customized plasmids for gene expression in 5 days, including construct screens and verification.

  7. Generation of plasmid vectors expressing FLAG-tagged proteins under the regulation of human elongation factor-1α promoter using Gibson assembly.

    PubMed

    Grozdanov, Petar N; MacDonald, Clinton C

    2015-01-01

    Gibson assembly (GA) cloning offers a rapid, reliable, and flexible alternative to conventional DNA cloning methods. We used GA to create customized plasmids for expression of exogenous genes in mouse embryonic stem cells (mESCs). Expression of exogenous genes under the control of the SV40 or human cytomegalovirus promoters diminishes quickly after transfection into mESCs. A remedy for this diminished expression is to use the human elongation factor-1 alpha (hEF1α) promoter to drive gene expression. Plasmid vectors containing hEF1α are not as widely available as SV40- or CMV-containing plasmids, especially those also containing N-terminal 3xFLAG-tags. The protocol described here is a rapid method to create plasmids expressing FLAG-tagged CstF-64 and CstF-64 mutant under the expressional regulation of the hEF1α promoter. GA uses a blend of DNA exonuclease, DNA polymerase and DNA ligase to make cloning of overlapping ends of DNA fragments possible. Based on the template DNAs we had available, we designed our constructs to be assembled into a single sequence. Our design used four DNA fragments: pcDNA 3.1 vector backbone, hEF1α promoter part 1, hEF1α promoter part 2 (which contained 3xFLAG-tag purchased as a double-stranded synthetic DNA fragment), and either CstF-64 or specific CstF-64 mutant. The sequences of these fragments were uploaded to a primer generation tool to design appropriate PCR primers for generating the DNA fragments. After PCR, DNA fragments were mixed with the vector containing the selective marker and the GA cloning reaction was assembled. Plasmids from individual transformed bacterial colonies were isolated. Initial screen of the plasmids was done by restriction digestion, followed by sequencing. In conclusion, GA allowed us to create customized plasmids for gene expression in 5 days, including construct screens and verification. PMID:25742071

  8. Expression sequences of cell adhesion molecules.

    PubMed Central

    Crossin, K L; Chuong, C M; Edelman, G M

    1985-01-01

    A reexamination of the expression of cell adhesion molecules (CAMs) during the development of the chicken embryo was carried out using more sensitive immunocytochemical techniques than had been used previously. While the previously determined sequence of CAM expression was confirmed, neural CAM (N-CAM) was also detected on endodermal structures such as the lung epithelium, gut epithelium, and pancreas and on budding structures such as the pancreatic duct and gall bladder. It was also found on ectodermal derivatives of the skin. In most of these sites, N-CAM expression was transient, but in the chicken embryo lung, the epithelium remained positive for N-CAM and liver CAM (L-CAM) into adult life. Thus, at one time or another, both of these primary CAMs can be expressed on derivatives of all three germ layers. At sites of embryonic induction, epithelial cells expressing both L-CAM and N-CAM, or L-CAM only, were apposed to mesenchymal cells expressing N-CAM. Examples included epiblast (NL) and notochord (N); endodermal epithelium (NL) and lung mesenchyme (N); Wolffian duct (NL) and mesonephric mesenchyme (N); apical ectodermal ridge (NL) and limb mesenchyme (N); and feather placode (L) and dermal condensation (N). The cumulative observations indicate that cell surface modulation of the primary CAMs at induction sites can be classified into two modes. In mode I, expression of N-CAM (or both CAMs) in mesenchyme decreases to low amounts at the cell surface, and then N-CAM is reexpressed. In mode II, one or the other CAM disappears from epithelia expressing both CAMs. As a result of the primary processes of development, collectives of cells linked by N-CAM and undergoing modulation mode I are brought into the proximity of collectives of cells linked by L-CAM plus N-CAM or by L-CAM undergoing modulation mode II. Such adjoining cell collectives or CAM couples were found at all sites of embryonic induction examined. PMID:3863135

  9. Transcriptome sequencing and profiling of expressed genes in cambial zone and differentiating xylem of Japanese cedar (Cryptomeria japonica)

    PubMed Central

    2014-01-01

    Background: Forest trees have ecological and economic importance, and Japanese cedar has highly valued wood attributes. Thus, studies of molecular aspects of wood formation offer practical information that may be used for screening and forward genetics approaches to improving wood quality. Results: After identifying expressed sequence tags in Japanese cedar tissue undergoing xylogenesis, we designed a custom cDNA microarray to compare expression of highly regulated genes throughout a growing season. This led to identification of candidate genes involved both in wood formation and later cessation of growth and dormancy. Based on homology to orthologous protein groups, the genes were assigned to functional classes. A high proportion of sequences fell into functional classes related to posttranscriptional modification and signal transduction, while transcription factors and genes involved in the metabolism of sugars, cell-wall synthesis and lignification, and cold hardiness were among other classes of genes identified as having a potential role in xylem formation and seasonal wood formation. Conclusions: We obtained 55,051 unique sequences by next-generation sequencing of a cDNA library prepared from cambial meristem and derivative cells. Previous studies on conifers have identified unique sequences expressed in developing xylem, but this is the first comprehensive study utilizing a collection of expressed sequence tags for expression studies related to xylem formation in Japanese cedar, which belongs to a different lineage than the Pinaceae. Our characterization of these sequences should allow comparative studies of genome evolution and functional genetics of wood species. PMID:24649833

  10. Effect of His-Tag on Expression, Purification, and Structure of Zinc Finger Protein, ZNF191(243-368)

    PubMed Central

    Huang, Zhongxian

    2016-01-01

    Zinc finger proteins are associated with hereditary diseases and cancers. To obtain an adequate amount of zinc finger proteins for studying their properties, structure, and functions, many protein expression systems are used. ZNF191(243-368) is a zinc finger protein and can be fused with His-tag to generate fusion proteins such as His6-ZNF191(243-368) and ZNF191(243-368)-His8. The purification of His-tag protein using Ni-NTA resin can overcome the difficulty of ZNF191(243-368) separation caused by inclusion body formation. The influences of His-tag on ZNF191(243-368) properties and structure were investigated using spectrographic techniques and hydrolase experiment. Our findings suggest that insertion of a His-tag at the N-terminal or C-terminal end of ZNF191(243-368) has different effects on the protein. Therefore, an expression system should be considered based on the properties and structure of the protein. Furthermore, the hydrolase activity of ZNF191(243-368)-His8 has provided new insights into the design of biological functional molecules. PMID:27524954

  11. Expression, purification and kinetic characterization of His-tagged glyceraldehyde-3-phosphate dehydrogenase from Trypanosoma cruzi.

    PubMed

    Cheleski, Juliana; Freitas, Renato F; Wiggers, Helton José; Rocha, Josmar R; de Araújo, Ana Paula Ulian; Montanari, Carlos A

    2011-04-01

    Trypanosomes are flagellated protozoa responsible for serious parasitic diseases that have been classified by the World Health Organization as tropical sicknesses of major importance. One important drug target receiving considerable attention is the enzyme glyceraldehyde-3-phosphate dehydrogenase from the protozoan parasite Trypanosoma cruzi, the causative agent of Chagas disease (T. cruzi Glyceraldehyde-3-phosphate dehydrogenase (TcGAPDH); EC 1.2.1.12). TcGAPDH is a key enzyme in the glycolytic pathway of T. cruzi and catalyzes the oxidative phosphorylation of D-glyceraldehyde-3-phosphate (G3P) to 1,3-bisphosphoglycerate (1,3-BPG) coupled to the reduction of oxidized nicotinamide adenine dinucleotide (NAD(+)) to NADH, the reduced form. Herein, we describe the cloning of the T. cruzi gene for TcGAPDH into the pET-28a(+) vector, its expression as a tagged protein in Escherichia coli, purification and kinetic characterization. The His(6)-tagged TcGAPDH was purified by affinity chromatography. Enzyme activity assays for the recombinant His(6)-TcGAPDH were carried out spectrophotometrically to determine the kinetic parameters. The apparent Michaelis-Menten constants (K(M)(app)) determined for D-glyceraldehyde-3-phosphate and NAD(+) were 352±21 and 272±25 μM, respectively, which were consistent with the values for the untagged enzyme reported in the literature. We have demonstrated by the use of Isothermal Titration Calorimetry (ITC) that this vector modification resulted in activity being preserved for a longer period. We also report here the use of response surface methodology (RSM) to determine the region of optimal conditions for enzyme activity. A quadratic model was developed by RSM to describe the enzyme activity in terms of pH and temperature as independent variables. According to the RSM contour plots and variance analysis, the maximum enzyme activity was at 29.1°C and pH 8.6. Above 37°C, the enzyme activity starts to fall, which may be related to previous

  12. Toward a physical map of Drosophila buzzatii. Use of randomly amplified polymorphic DNA polymorphisms and sequence-tagged site landmarks.

    PubMed Central

    Laayouni, H; Santos, M; Fontdevila, A

    2000-01-01

    We present a physical map based on RAPD polymorphic fragments and sequence-tagged sites (STSs) for the repleta group species Drosophila buzzatii. One hundred forty-four RAPD markers have been used as probes for in situ hybridization to the polytene chromosomes, and positive results allowing the precise localization of 108 RAPDs were obtained. Of these, 73 behave as effectively unique markers for physical map construction, and in 9 additional cases the probes gave two hybridization signals, each on a different chromosome. Most markers (68%) are located on chromosomes 2 and 4, which partially agrees with previous estimates on the distribution of genetic variation over chromosomes. One RAPD maps close to the proximal breakpoint of inversion 2z(3) but is not included within the inverted fragment. However, it was possible to conclude from this RAPD that the distal breakpoint of 2z(3) had previously been wrongly assigned. A total of 39 cytologically mapped RAPDs were converted to STSs and yielded an aggregate sequence of 28,431 bp. Thirty-six RAPDs (25%) did not produce any detectable hybridization signal, and we obtained the DNA sequence from three of them. Further prospects toward obtaining a more developed genetic map than the one currently available for D. buzzatii are discussed. PMID:11102375

  13. Improved measurement of brain deformation during mild head acceleration using a novel tagged MRI sequence.

    PubMed

    Knutsen, Andrew K; Magrath, Elizabeth; McEntee, Julie E; Xing, Fangxu; Prince, Jerry L; Bayly, Philip V; Butman, John A; Pham, Dzung L

    2014-11-01

    In vivo measurements of human brain deformation during mild acceleration are needed to help validate computational models of traumatic brain injury and to understand the factors that govern the mechanical response of the brain. Tagged magnetic resonance imaging is a powerful, noninvasive technique to track tissue motion in vivo which has been used to quantify brain deformation in live human subjects. However, these prior studies required from 72 to 144 head rotations to generate deformation data for a single image slice, precluding its use to investigate the entire brain in a single subject. Here, a novel method is introduced that significantly reduces temporal variability in the acquisition and improves the accuracy of displacement estimates. Optimization of the acquisition parameters in a gelatin phantom and three human subjects reduces the number of rotations required for a single image slice from between 72 and 144 to as few as 8. The ability to estimate accurate, well-resolved fields of displacement and strain in far fewer repetitions will enable comprehensive studies of acceleration-induced deformation throughout the human brain in vivo.

  14. Random Tagging Genotyping by Sequencing (rtGBS), an Unbiased Approach to Locate Restriction Enzyme Sites across the Target Genome

    PubMed Central

    Hilario, Elena; Barron, Lorna; Deng, Cecilia H.; Datson, Paul M.; Davy, Marcus W.; Storey, Roy D.

    2015-01-01

    Genotyping by sequencing (GBS) is a restriction enzyme based targeted approach developed to reduce the genome complexity and discover genetic markers when a priori sequence information is unavailable. Sufficient coverage at each locus is essential to distinguish heterozygous from homozygous sites accurately. The number of GBS samples able to be pooled in one sequencing lane is limited by the number of restriction sites present in the genome and the read depth required at each site per sample for accurate calling of single-nucleotide polymorphisms. Loci bias was observed using a slight modification of the Elshire et al. method: some restriction enzyme sites were represented in higher proportions while others were poorly represented or absent. This bias could be due to the quality of genomic DNA, the endonuclease and ligase reaction efficiency, the distance between restriction sites, the preferential amplification of small library restriction fragments, or bias towards cluster formation of small amplicons during the sequencing process. To overcome these issues, we have developed a GBS method based on randomly tagging genomic DNA (rtGBS). By randomly landing on the genome, we can, with less bias, find restriction sites that are far apart, and undetected by the standard GBS (stdGBS) method. The study comprises two types of biological replicates: six different kiwifruit plants and two independent DNA extractions per plant; and three types of technical replicates: four samples of each DNA extraction, stdGBS vs. rtGBS methods, and two independent library amplifications, each sequenced in separate lanes. A statistically significant unbiased distribution of restriction fragment size by rtGBS showed that this method targeted 49% (39,145) of BamH I sites shared with the reference genome, compared to only 14% (11,513) by stdGBS. PMID:26633193
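
    The comparison above rests on counting how many reference restriction sites each method recovers. The sketch below illustrates that idea under simple assumptions: sites are located by exact string matching for the BamHI recognition sequence GGATCC, and recovery is the fraction of reference positions that also appear in an observed set. The function names and the toy sequence are hypothetical and do not reproduce the authors' pipeline.

```python
BAMHI_SITE = "GGATCC"  # BamHI recognition sequence (palindromic)


def find_sites(sequence, site=BAMHI_SITE):
    """Return 0-based start positions of every occurrence of `site`."""
    sequence = sequence.upper()
    positions = []
    start = sequence.find(site)
    while start != -1:
        positions.append(start)
        # Advance by one so overlapping matches are not skipped.
        start = sequence.find(site, start + 1)
    return positions


def fraction_recovered(reference_sites, observed_sites):
    """Fraction of reference site positions also present in the observed set."""
    reference = set(reference_sites)
    if not reference:
        return 0.0
    return len(reference & set(observed_sites)) / len(reference)


# Toy sequence for illustration only.
toy_reference = "AAGGATCCTTTGGATCCAA"
reference_positions = find_sites(toy_reference)
print(reference_positions)
print(fraction_recovered(reference_positions, [11]))
```

    Because GGATCC is its own reverse complement, scanning one strand is enough to find every site in this simplified setting.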

  15. Expression screen by enzyme-linked immunofiltration assay designed for high-throughput purification of affinity-tagged proteins.

    PubMed

    Kery, Vladimir; Savage, Justin R; Widjaja, Kartika; Blake, B Kelly; Conklin, David R; Ho, Yew-Seng J; Long, Xinghua; von Rechenberg, Moritz; Zarembinski, Thomas I; Boniface, J Jay

    2003-06-15

    High-throughput purification of affinity-tagged fusion proteins is currently one of the fastest developing areas of molecular proteomics. A prerequisite for success in protein purification is sufficient soluble protein expression of the target protein in a heterologous host. Hence, a fast and quantitative evaluation of the soluble-protein levels in an expression system is one of the key steps in the entire process. Here we describe a high-throughput expression screen for affinity-tagged fusion proteins based on an enzyme linked immunofiltration assay (ELIFA). An aliquot of a crude Escherichia coli extract containing the analyte, an affinity-tagged protein, is adsorbed onto the membrane. Subsequent binding of specific antibodies followed by binding of a secondary antibody horseradish peroxidase (HRP) complex then allows quantitative evaluation of the analyte using tetramethylbenzidine as the substrate for HRP. The method is accurate and quantitative, as shown by comparison with results from western blotting and an enzymatic glutathione S-transferase (GST) assay. Furthermore, it is a far more rapid assay and less cumbersome than western blotting, lending itself more readily to high-throughput analysis. It can be used at the expression level (cell lysates) or during the subsequent purification steps to monitor yield of specific protein.

  16. Characterization of expressed resistance gene analogs (RGAs) from peanut expressed sequence tags (ESTs)

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Cultivated peanut (Arachis hypogaea L.) is one of the most important food legume crops grown worldwide, and is a major source for edible oil and protein. However, due to low genetic variation, peanut is very vulnerable to a variety of pathogens, such as early leaf spot, late leaf spot, rust and Toma...

  17. Examination of Endogenous Rotund Expression and Function in Developing Drosophila Olfactory System Using CRISPR-Cas9-Mediated Protein Tagging.

    PubMed

    Li, Qingyun; Barish, Scott; Okuwa, Sumie; Volkan, Pelin C

    2015-12-01

    The zinc-finger protein Rotund (Rn) plays a critical role in controlling the development of the fly olfactory system. However, little is known about its molecular function in vivo. Here, we added protein tags to the rn locus using CRISPR-Cas9 technology in Drosophila to investigate its subcellular localization and the genes that it regulates. We previously used a reporter construct to show that rn is expressed in a subset of olfactory receptor neuron (ORN) precursors and it is required for the diversification of ORN fates. Here, we show that tagged endogenous Rn protein is functional based on the analysis of ORN phenotypes. Using this method, we also mapped the expression pattern of the endogenous isoform-specific tags in vivo with increased precision. Comparison of the Rn expression pattern from this study with previously published results using GAL4 reporters showed that Rn is mainly present in early steps in antennal disc patterning, but not in pupal stages when ORNs are born. Finally, using chromatin immunoprecipitation, we showed a direct binding of Rotund to a previously identified regulatory element upstream of the bric-a-brac gene locus in the developing antennal disc. PMID:26497147

  19. Development of high-density linkage map and tagging leaf spot resistance in pearl millet using genotyping-by-sequencing markers

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Pearl millet is an important forage and grain crop in many parts of the world. Genome mapping studies are a prerequisite for tagging agronomically important traits. Genotyping-by-Sequencing (GBS) markers can be used to build high density linkage maps even in species lacking a reference genome. A re...

  20. Plant genotyping using fluorescently tagged inter-simple sequence repeats (ISSRs): basic principles and methodology.

    PubMed

    Prince, Linda M

    2015-01-01

    Inter-simple sequence repeat PCR (ISSR-PCR) is a fast, inexpensive genotyping technique based on length variation in the regions between microsatellites. The method requires no species-specific prior knowledge of microsatellite location or composition. Very small amounts of DNA are required, making this method ideal for organisms of conservation concern, or where the quantity of DNA is extremely limited due to organism size. ISSR-PCR can be highly reproducible but requires careful attention to detail. Optimization of DNA extraction, fragment amplification, and normalization of fragment peak heights during fluorescent detection are critical steps to minimizing the downstream time spent verifying and scoring the data.

  1. Gene discovery within the planctomycete division of the domain Bacteria using sequence tags from genomic DNA libraries

    PubMed Central

    Jenkins, Cheryl; Kedar, Vishram; Fuerst, John A

    2002-01-01

    Background: The planctomycetes comprise a distinct group of the domain Bacteria, forming a separate division by phylogenetic analysis. The organization of their cells into membrane-defined compartments including membrane-bounded nucleoids, their budding reproduction and complete absence of peptidoglycan distinguish them from most other Bacteria. A random sequencing approach was applied to the genomes of two planctomycete species, Gemmata obscuriglobus and Pirellula marina, to discover genes relevant to their cell biology and physiology. Results: Genes with a wide variety of functions were identified in G. obscuriglobus and Pi. marina, including those of metabolism and biosynthesis, transport, regulation, translation and DNA replication, consistent with established phenotypic characters for these species. The genes sequenced were predominantly homologous to those in members of other divisions of the Bacteria, but there were also matches with nuclear genomic genes of the domain Eukarya, genes that may have appeared in the planctomycetes via horizontal gene transfer events. Significant among these matches are those with two genes atypical for Bacteria and with significant cell-biology implications - integrin alpha-V and inter-alpha-trypsin inhibitor protein - with homologs in G. obscuriglobus and Pi. marina respectively. Conclusions: The random-sequence-tag approach applied here to G. obscuriglobus and Pi. marina is the first report of gene recovery and analysis from members of the planctomycetes using genome-based methods. Gene homologs identified were predominantly similar to genes of Bacteria, but some significant best matches to genes from Eukarya suggest that lateral gene transfer events between domains may have involved this division at some time during its evolution. PMID:12093378

  2. Transcriptome analysis of subcutaneous adipose tissues in beef cattle using 3' digital gene expression-tag profiling.

    PubMed

    Jin, W; Olson, E N; Moore, S S; Basarab, J A; Basu, U; Guan, L L

    2012-01-01

    The molecular mechanisms that regulate fat deposition in bovine adipose tissue have not been well studied. To elucidate the genes and gene networks involved in bovine fat development, transcriptional profiles of backfat (BF) tissues from Hereford × Aberdeen Angus (HEAN, n = 6) and Charolais × Red Angus (CHRA, n = 6) steers with high or low BF thickness were characterized by digital gene expression-tag profiling. Approximately 9.8 to 21.9 million tags were obtained for each library, and a total of 18,034 genes were identified. In total, 650 genes were found to be differentially expressed, with a greater than 1.5-fold difference between the 2 crossbreds (Benjamini-Hochberg false discovery rate ≤ 0.05). The majority of differentially expressed genes that were more highly expressed in CHRA vs. HEAN were associated with development, whereas the differentially expressed genes with greater expression in HEAN vs. CHRA were overrepresented in biological processes such as metabolism and immune response. Thirty-six and 152 differentially expressed genes were detected between animals with high (n = 3) and low (n = 3) BF thickness in HEAN and CHRA, respectively (Benjamini-Hochberg false discovery rate ≤0.05). The differentially expressed genes between high and low groups in CHRA were related to cell proliferation and development processes. In addition, lipid metabolism was 1 of the top 5 molecular and cellular functions identified in both crossbreds. Ten and 17 differentially expressed genes were found to be involved in fat metabolism in HEAN and CHRA, respectively. Genes associated with obesity, such as PTX3 (pentraxin 3, long) and SERPINE1 (serpin peptidase inhibitor, clade E, member 1), were more highly expressed (P < 0.05) in the subset of CHRA animals with greater BF thickness. Our study revealed that the expression patterns of genes in BF tissues differed depending on the genetic background of the cattle.
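
The selection rule quoted in this abstract (Benjamini-Hochberg false discovery rate ≤ 0.05 combined with a greater than 1.5-fold difference) can be illustrated with a minimal Python sketch. This is not the authors' pipeline, which the abstract does not describe; the function names, the input layout and the example values are all hypothetical, and fold change is assumed to be a positive ratio.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that each adjusted value is the
    # minimum of p * m / rank over the current rank and all higher ranks.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def differentially_expressed(genes, fdr=0.05, min_fold=1.5):
    """genes: list of (name, fold_change, p_value) with fold_change > 0.

    Both thresholds default to the values quoted in the abstract; how the
    study actually computed fold change and p-values is not stated there.
    """
    q_values = benjamini_hochberg([p for _, _, p in genes])
    selected = []
    for (name, fold, _), q in zip(genes, q_values):
        # Treat up- and down-regulation symmetrically (2.0 and 0.5 both count).
        magnitude = max(fold, 1.0 / fold)
        if q <= fdr and magnitude > min_fold:
            selected.append(name)
    return selected


# Hypothetical example values, not data from the study.
example = [("geneA", 2.0, 0.005), ("geneB", 1.2, 0.01),
           ("geneC", 0.5, 0.03), ("geneD", 3.0, 0.5)]
print(differentially_expressed(example))
```

In this toy input, geneB fails the fold-change cutoff and geneD fails the false discovery rate cutoff, so only geneA and geneC are reported.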

  3. Deciphering Noncoding RNA and Chromatin Interactions: Multiplex Chromatin Interaction Analysis by Paired-End Tag Sequencing (mChIA-PET).

    PubMed

    Choy, Jocelyn; Fullwood, Melissa J

    2017-01-01

    Genomic DNA is dynamically associated with protein factors and folded to form chromatin fibers. The 3-dimensional (3D) configuration of the chromatin will enable the distal genetic elements to come into close proximity, allowing transcriptional regulation. Noncoding RNA can mediate the 3D structure of chromatin. Chromatin Interaction Analysis by Paired-End Tag Sequencing (ChIA-PET) is a valuable and powerful technique in molecular biology which allows the study of unbiased, genome-wide de novo chromatin interactions with paired-end tags. Here, we describe the standard version of ChIA-PET and a Multiplex ChIA-PET version. PMID:27662871

  4. Parallel tagged amplicon sequencing of transcriptome-based genetic markers for Triturus newts with the Ion Torrent next-generation sequencing platform

    PubMed Central

    Wielstra, B; Duijm, E; Lagler, P; Lammers, Y; Meilink, W R M; Ziermann, J M; Arntzen, J W

    2014-01-01

    Next-generation sequencing is a fast and cost-effective way to obtain sequence data for nonmodel organisms for many markers and for many individuals. We describe a protocol through which we obtain orthologous markers for the crested newts (Amphibia: Salamandridae: Triturus), suitable for analysis of interspecific hybridization. We use transcriptome data of a single Triturus species and design 96 primer pairs that amplify c. 180 bp fragments positioned in 3-prime untranslated regions. Next, these markers are tested with uniplex PCR for a set of species spanning the taxonomical width of the genus Triturus. The 52 markers that consistently show a single band of expected length at gel electrophoresis for all tested crested newt species are then amplified in five multiplex PCRs (with a plexity of ten or eleven) for 132 individual newts: a set of 84 representing the seven (candidate) species and a set of 48 from a presumed hybrid population. After pooling multiplexes per individual, unique tags are ligated to link amplicons to individuals. Subsequently, individuals are pooled in equimolar amounts and sequenced on the Ion Torrent next-generation sequencing platform. A bioinformatics pipeline identifies the alleles and recodes these to a genotypic format. Next, we test the utility of our markers. BAPS allocates the 84 crested newt individuals representing (candidate) species to their expected (candidate) species, confirming the markers are suitable for species delineation. NewHybrids, a hybrid index and HIest confirm the 48 individuals from the presumed hybrid population to be genetically admixed, illustrating the potential of the markers to identify interspecific hybridization. We expect the set of markers we designed to provide a high resolving power for analysis of hybridization in Triturus. PMID:24571307
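
The step in which unique tags link amplicons to individuals implies a demultiplexing stage in the bioinformatics pipeline. The Python sketch below shows one simple way such a stage could work; it is an illustration only, not the published pipeline. The tag sequences and individual names are hypothetical, and it assumes exact tag matches at the start of each read, with no tag being a prefix of another.

```python
def demultiplex(reads, tag_to_individual):
    """Assign each read to an individual by the tag at its start.

    Returns (reads grouped by individual with the tag stripped,
             reads whose start matches no known tag).
    """
    by_individual = {name: [] for name in tag_to_individual.values()}
    unassigned = []
    for read in reads:
        for tag, name in tag_to_individual.items():
            if read.startswith(tag):
                # Keep only the amplicon part of the read.
                by_individual[name].append(read[len(tag):])
                break
        else:
            # No tag matched: set the read aside rather than guess.
            unassigned.append(read)
    return by_individual, unassigned


# Hypothetical tags and reads, for illustration only.
tags = {"ACGT": "newt_01", "TGCA": "newt_02"}
reads = ["ACGTGGGA", "TGCACCCT", "ACGTTTTT", "GGGGAAAA"]
assigned, leftover = demultiplex(reads, tags)
print(assigned)
print(leftover)
```

A production pipeline would normally also tolerate sequencing errors in the tag, which this sketch deliberately omits.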

  5. Parallel tagged amplicon sequencing of transcriptome-based genetic markers for Triturus newts with the Ion Torrent next-generation sequencing platform.

    PubMed

    Wielstra, B; Duijm, E; Lagler, P; Lammers, Y; Meilink, W R M; Ziermann, J M; Arntzen, J W

    2014-09-01

    Next-generation sequencing is a fast and cost-effective way to obtain sequence data for nonmodel organisms for many markers and for many individuals. We describe a protocol through which we obtain orthologous markers for the crested newts (Amphibia: Salamandridae: Triturus), suitable for analysis of interspecific hybridization. We use transcriptome data of a single Triturus species and design 96 primer pairs that amplify c. 180 bp fragments positioned in 3-prime untranslated regions. Next, these markers are tested with uniplex PCR for a set of species spanning the taxonomical width of the genus Triturus. The 52 markers that consistently show a single band of expected length at gel electrophoresis for all tested crested newt species are then amplified in five multiplex PCRs (with a plexity of ten or eleven) for 132 individual newts: a set of 84 representing the seven (candidate) species and a set of 48 from a presumed hybrid population. After pooling multiplexes per individual, unique tags are ligated to link amplicons to individuals. Subsequently, individuals are pooled in equimolar amounts and sequenced on the Ion Torrent next-generation sequencing platform. A bioinformatics pipeline identifies the alleles and recodes these to a genotypic format. Next, we test the utility of our markers. BAPS allocates the 84 crested newt individuals representing (candidate) species to their expected (candidate) species, confirming the markers are suitable for species delineation. NewHybrids, a hybrid index and HIest confirm the 48 individuals from the presumed hybrid population to be genetically admixed, illustrating the potential of the markers to identify interspecific hybridization. We expect the set of markers we designed to provide a high resolving power for analysis of hybridization in Triturus.

  6. Novel Y-chromosomal microdeletions associated with non-obstructive azoospermia uncovered by high throughput sequencing of sequence-tagged sites (STSs)

    PubMed Central

    Liu, Xiao; Li, Zesong; Su, Zheng; Zhang, Junjie; Li, Honggang; Xie, Jun; Xu, Hanshi; Jiang, Tao; Luo, Liya; Zhang, Ruifang; Zeng, Xiaojing; Xu, Huaiqian; Huang, Yi; Mou, Lisha; Hu, Jingchu; Qian, Weiping; Zeng, Yong; Zhang, Xiuqing; Xiong, Chengliang; Yang, Huanming; Kristiansen, Karsten; Cai, Zhiming; Wang, Jun; Gui, Yaoting

    2016-01-01

    Y-chromosomal microdeletion (YCM) serves as an important genetic factor in non-obstructive azoospermia (NOA). Multiplex polymerase chain reaction (PCR) is routinely used to detect YCMs by tracing sequence-tagged sites (STSs) in the Y chromosome. Here we introduce a novel methodology in which we sequence 1,787 (post-filtering) STSs distributed across the entire male-specific Y chromosome (MSY) in parallel to uncover known and novel YCMs. We validated this approach with 766 Chinese men with NOA and 683 ethnically matched healthy individuals and detected 481 and 98 STSs that were deleted in the NOA and control groups, respectively, representing a substantial portion of novel YCMs which significantly influenced the functions of spermatogenic genes. The NOA patients tended to carry more and rarer deletions that were enriched in nearby intragenic regions. Haplogroup O2* was revealed to be a protective lineage for NOA, in which the enrichment of b1/b3 deletion in haplogroup C was also observed. In summary, our work provides a new high-resolution portrait of deletions in the Y chromosome. PMID:26907467

  7. Novel Y-chromosomal microdeletions associated with non-obstructive azoospermia uncovered by high throughput sequencing of sequence-tagged sites (STSs).

    PubMed

    Liu, Xiao; Li, Zesong; Su, Zheng; Zhang, Junjie; Li, Honggang; Xie, Jun; Xu, Hanshi; Jiang, Tao; Luo, Liya; Zhang, Ruifang; Zeng, Xiaojing; Xu, Huaiqian; Huang, Yi; Mou, Lisha; Hu, Jingchu; Qian, Weiping; Zeng, Yong; Zhang, Xiuqing; Xiong, Chengliang; Yang, Huanming; Kristiansen, Karsten; Cai, Zhiming; Wang, Jun; Gui, Yaoting

    2016-01-01

    Y-chromosomal microdeletion (YCM) serves as an important genetic factor in non-obstructive azoospermia (NOA). Multiplex polymerase chain reaction (PCR) is routinely used to detect YCMs by tracing sequence-tagged sites (STSs) in the Y chromosome. Here we introduce a novel methodology in which we sequence 1,787 (post-filtering) STSs distributed across the entire male-specific Y chromosome (MSY) in parallel to uncover known and novel YCMs. We validated this approach with 766 Chinese men with NOA and 683 ethnically matched healthy individuals and detected 481 and 98 STSs that were deleted in the NOA and control groups, respectively, representing a substantial portion of novel YCMs which significantly influenced the functions of spermatogenic genes. The NOA patients tended to carry more and rarer deletions that were enriched in nearby intragenic regions. Haplogroup O2* was revealed to be a protective lineage for NOA, in which the enrichment of b1/b3 deletion in haplogroup C was also observed. In summary, our work provides a new high-resolution portrait of deletions in the Y chromosome. PMID:26907467

  8. AB039. Novel Y-chromosomal microdeletions associated with non-obstructive azoospermia uncovered by high throughput sequencing of sequence-tagged sites (STSs)

    PubMed Central

    Li, Zesong

    2016-01-01

    Y-chromosomal microdeletion (YCM) serves as an important genetic factor in non-obstructive azoospermia (NOA). Multiplex polymerase chain reaction (PCR) is routinely used to detect YCMs by tracing sequence-tagged sites (STSs) in the Y chromosome. Here we introduce a novel methodology in which we sequence 1,787 (post-filtering) STSs distributed across the entire male-specific Y chromosome (MSY) in parallel to uncover known and novel YCMs. We validated this approach with 766 Chinese men with NOA and 683 ethnically matched healthy individuals and detected 481 and 98 STSs that were deleted in the NOA and control groups, respectively, representing a substantial portion of novel YCMs which significantly influenced the functions of spermatogenic genes. The NOA patients tended to carry more and rarer deletions that were enriched in nearby intragenic regions. Haplogroup O2* was revealed to be a protective lineage for NOA, in which the enrichment of b1/b3 deletion in haplogroup C was also observed. In summary, our work provides a new high-resolution portrait of deletions in the Y chromosome.

  9. EST sequencing and gene expression profiling of cultivated peanut (Arachis hypogaea L.).

    PubMed

    Bi, Yu-Ping; Liu, Wei; Xia, Han; Su, Lei; Zhao, Chuan-Zhi; Wan, Shu-Bo; Wang, Xing-Jun

    2010-10-01

    Peanut (Arachis hypogaea L.) is one of the most important oil crops in the world. However, biotechnology-based improvement of peanut is far behind that of many other crops. It is critical and urgent to establish a biotechnological platform for peanut germplasm innovation. In this study, a peanut seed cDNA library was constructed toward that goal. About 17,000 expressed sequence tags (ESTs) were sequenced and used for further investigation. Of these, 12.5% were annotated as metabolic related and 4.6% encoded transcription or post-transcription factors. ESTs encoding storage protein and enzymes related to protein degradation accounted for 28.8% and formed the largest group of the annotated ESTs. ESTs that encoded stress responsive proteins and pathogen-related proteins accounted for 5.6%. ESTs that encoded unknown proteins or showed no hit in the GenBank nr database accounted for 20.1% and 13.9%, respectively. A total of 5066 EST sequences were selected to make a cDNA microarray. Expression analysis revealed that these sequences showed diverse expression patterns in peanut seeds, leaves, stems, roots, flowers, and gynophores. We also analyzed the gene expression pattern during seed development. Genes that were upregulated (≥twofold) at 15, 25, 35, and 45 days after pegging (DAP) were found and compared with 70 DAP. The potential value of these genes and their promoters in the peanut gene engineering study is discussed.

  10. DNA sequence heterogeneity of Campylobacter jejuni CJIE4 prophages and expression of prophage genes.

    PubMed

    Clark, Clifford G; Chong, Patrick M; McCorrister, Stuart J; Mabon, Philip; Walker, Matthew; Westmacott, Garrett R

    2014-01-01

    Campylobacter jejuni carry temperate bacteriophages that can affect the biology or virulence of the host bacterium. Known effects include genomic rearrangements and resistance to DNA transformation. C. jejuni prophage CJIE1 shows sequence variability and variability in the content of morons. Homologs of the CJIE1 prophage enhance both adherence to and invasion of cells in culture and increase the expression of a specific subset of bacterial genes. Other C. jejuni temperate phages have so far not been well characterized. In this study we describe investigations into the DNA sequence variability and protein expression in a second prophage, CJIE4. CJIE4 sequences were obtained de novo from DNA sequencing of five C. jejuni isolates, as well as from whole genome sequences submitted to GenBank by other research groups. These CJIE4 DNA sequences were heterogeneous, with several different insertions/deletions (indels) in different parts of the prophage genome. Two variants of a 3-4 kb region inserted within CJIE4 had different gene content that distinguished two major conserved CJIE4 prophage families. Additional indels were detected throughout the prophage. Detection of proteins in the five isolates characterized in our laboratory in isobaric Tags for Relative and Absolute Quantitation (iTRAQ) experiments indicated that prophage proteins within each of the two large indel variants were expressed during growth of the bacteria on Mueller Hinton agar plates. These proteins included the extracellular DNase associated with resistance to DNA transformation and prophage repressor proteins. Other proteins associated with known or suspected roles in prophage biology were also expressed from CJIE4, including capsid protein, the phage integrase, and MazF, a type II toxin-antitoxin system protein. Together with the results previously obtained for the CJIE1 prophage these results demonstrate that sequence variability and expression of moron genes are both general properties of temperate

  11. Aberrant host immune response induced by highly virulent PRRSV identified by digital gene expression tag profiling

    PubMed Central

    2010-01-01

    Background: There was a large-scale outbreak of the highly pathogenic porcine reproductive and respiratory syndrome (PRRS) in China and Vietnam during 2006 and 2007 that resulted in unusually high morbidity and mortality among pigs of all ages. The mechanisms underlying the molecular pathogenesis of the highly virulent PRRS virus (H-PRRSV) remain unknown. Therefore, the relationship between pulmonary gene expression profiles after H-PRRSV infection and infection pathology was analyzed in this study using high-throughput deep sequencing and histopathology. Results: H-PRRSV infection resulted in severe lung pathology. The results indicate that aberrant host innate immune responses to H-PRRSV and induction of an anti-apoptotic state could be responsible for the aggressive replication and dissemination of H-PRRSV. Prolific rapid replication of H-PRRSV could have triggered aberrant sustained expression of pro-inflammatory cytokines and chemokines leading to a markedly robust inflammatory response compounded by significant cell death and increased oxidative damage. The end result was severe tissue damage and high pathogenicity. Conclusions: The systems analysis utilized in this study provides a comprehensive basis for better understanding the pathogenesis of H-PRRSV. Furthermore, it allows the genetic components involved in H-PRRSV resistance/susceptibility in swine populations to be identified. PMID:20929578

  12. High-throughput Gene Tagging in Trypanosoma brucei.

    PubMed

    Dyer, Philip; Dean, Samuel; Sunter, Jack

    2016-01-01

    Improvements in mass spectrometry, sequencing and bioinformatics have generated large datasets of potentially interesting genes. Tagging these proteins can give insights into their function by determining their localization within the cell and enabling interaction partner identification. We recently published a fast and scalable method to generate Trypanosoma brucei cell lines that express a tagged protein from the endogenous locus. The method was based on a plasmid we generated that, when coupled with long primer PCR, can be used to modify a gene to encode a protein tagged at either terminus. This allows the tagging of dozens of trypanosome proteins in parallel, facilitating the large-scale validation of candidate genes of interest. This system can be used to tag proteins for localization (using a fluorescent protein, epitope tag or electron microscopy tag) or biochemistry (using tags for purification, such as the TAP (tandem affinity purification) tag). Here, we describe a protocol to perform the long primer PCR and the electroporation in 96-well plates, with the recovery and selection of transgenic trypanosomes occurring in 24-well plates. With this workflow, hundreds of proteins can be tagged in parallel; this is an order of magnitude improvement to our previous protocol and genome scale tagging is now possible. PMID:27584862

  13. Transcriptional Profiling of Newly Generated Dentate Granule Cells Using TU Tagging Reveals Pattern Shifts in Gene Expression during Circuit Integration

    PubMed Central

    Chatzi, Christina; Shen, Rongkun; Goodman, Richard H.

    2016-01-01

    Despite representing only a small fraction of hippocampal granule cells, adult-generated newborn granule cells have been implicated in learning and memory (Aimone et al., 2011). Newborn granule cells undergo functional maturation and circuit integration over a period of weeks. However, it is difficult to assess the accompanying gene expression profiles in vivo with high spatial and temporal resolution using traditional methods. Here we used a novel method [“thiouracil (TU) tagging”] to map the profiles of nascent mRNAs in mouse immature newborn granule cells compared with mature granule cells. We targeted a nonmammalian uracil salvage enzyme, uracil phosphoribosyltransferase, to newborn neurons and mature granule cells using retroviral and lentiviral constructs, respectively. Subsequent injection of 4-TU tagged nascent RNAs for analysis by RNA sequencing. Several hundred genes were significantly enhanced in the retroviral dataset compared with the lentiviral dataset. We compared a selection of the enriched genes with steady-state levels of mRNAs using quantitative PCR. Ontology analysis revealed distinct patterns of nascent mRNA expression, with newly generated immature neurons showing enhanced expression for genes involved in synaptic function, and neural differentiation and development, as well as genes not previously associated with granule cell maturation. Surprisingly, the nascent mRNAs enriched in mature cells were related to energy homeostasis and metabolism, presumably indicative of the increased energy demands of synaptic transmission and their complex dendritic architecture. The high spatial and temporal resolution of our modified TU-tagging method provides a foundation for comparison with steady-state RNA analyses by traditional transcriptomic approaches in defining the functional roles of newborn neurons. PMID:27011954

  14. Purification and characterization of bioactive his6-tagged recombinant human tissue inhibitor of metalloproteinases-1 (TIMP-1) protein expressed at high yields in mammalian cells.

    PubMed

    Vinther, Lena; Lademann, Ulrik; Andersen, Elisabeth Veyhe; Højrup, Peter; Thaysen-Andersen, Morten; Krogh, Berit Olsen; Viuff, Birgitte; Brünner, Nils; Stenvang, Jan; Moreira, José M A

    2014-09-01

    Tissue inhibitor of metalloproteinases-1 (TIMP-1) is an endogenous inhibitor of matrix metalloproteinases (MMPs) with reported tumor promoting, as well as inhibitory, effects. These paradoxical properties are presumably mediated by different biological functions, MMP-dependent as well as -independent, and probably related to TIMP-1 levels of protein expression, post-translational modifications, and cellular localization. TIMP-1 is an N-glycosylated protein that folds into two functional domains, a C- and an N-terminal domain, with six disulfide bonds. Furthermore, TIMP-1 is processed in the N-terminal sequence. These three biochemical properties make TIMP-1 difficult to produce in conventional bacterial, insect, or yeast expression systems. We describe here a HEK293 cell-based strategy for production and purification of secreted and N-glycosylated recombinant his6-tagged human TIMP-1 (his6-rTIMP-1), which resulted in large amounts of highly purified and bioactive protein. Matrix-assisted laser desorption ionization mass spectrometry confirmed the N- and C-termini of his6-rTIMP-1, and N-glycosylation profiling showed a match to the N-glycosylation of human plasma TIMP-1. The his6-rTIMP-1 was bioactive as shown by its proper inhibitory effect on MMP-2 activity, and its stimulatory effect on cell growth when added to the growth medium of four different breast cancer cell lines. This study provides an easy set-up for large scale production and purification of bioactive, tagged recombinant human TIMP-1, which structurally and functionally is similar to endogenous human TIMP-1, while using an expression system that is adaptable to most biochemical and biomedical laboratories including those that do not perform protein purifications routinely. PMID:24998777

  15. Discovery of EST-SSRs in lung cancer: tagged ESTs with SSRs lead to differential amino acid and protein expression patterns in cancerous tissues.

    PubMed

    Bakhtiarizadeh, Mohammad Reza; Ebrahimi, Mansour; Ebrahimie, Esmaeil

    2011-01-01

    Tandem repeats are found in both coding and non-coding sequences of higher organisms. These sequences can be used in cancer genetics and diagnosis to unravel the genetic basis of tumor formation and progression. In this study, a possible relationship between SSR distributions and lung cancer was studied by comparative analysis of EST-SSRs in normal and lung cancerous tissues. While the EST-SSR distribution was similar between tumorous tissues, this distribution was different between normal and tumorous tissues. Trinucleotide tandem repeats differed most: the number of trinucleotide repeats in ESTs from lung cancer was 3 times higher than in normal tissue. Significant negative correlation between normal and cancerous tissue showed that cancerous tissue generates different types of trinucleotides. GGC and CGC were the most frequently expressed trinucleotides in cancerous tissue, but these SSRs were not expressed in normal tissue. Similar to the EST level, the expression pattern of EST-SSRs-derived amino acids was significantly different between normal and cancerous tissues. Arg, Pro, Ser, Gly, and Lys were the most abundant amino acids in cancerous tissues, and Leu, Cys, Phe, and His were significantly more abundant in normal tissues than in cancerous tissues. Next, the putative functions of triplet SSR-containing genes were analyzed. In cancerous tissue, EST-SSRs produce different types of proteins. Chromodomain helicase DNA binding proteins were one of the major protein products of EST-SSRs in the cancerous library, while these proteins were not produced from EST-SSRs in normal tissue. For the first time, the findings of this study confirmed that EST-SSRs in normal lung tissues are different from those in unhealthy tissues, and tagged ESTs with SSRs cause remarkable differences in amino acid and protein expression patterns in cancerous tissue. We suggest that EST-SSRs and EST-SSRs differentially expressed in cancerous tissue may be suitable candidate markers for lung cancer
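
Finding trinucleotide SSRs in EST sequences, as this abstract describes, can be sketched with a regular expression. The abstract does not name the tool or the repeat threshold the authors used, so the minimum of four repeats, the function names and the example sequences below are all assumptions made for illustration.

```python
import re
from collections import Counter


def find_trinucleotide_ssrs(sequence, min_repeats=4):
    """Return (motif, repeat_count, start) for each trinucleotide SSR.

    min_repeats is a hypothetical threshold, not one taken from the study.
    """
    # One 3-base motif followed by at least (min_repeats - 1) exact copies.
    pattern = re.compile(r"([ACGT]{3})\1{%d,}" % (min_repeats - 1))
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # a run such as AAAAAA is a homopolymer, not a triplet SSR
        hits.append((motif, len(match.group(0)) // 3, match.start()))
    return hits


def motif_counts(est_sequences, min_repeats=4):
    """Count how many SSRs of each motif occur across a set of ESTs."""
    counts = Counter()
    for seq in est_sequences:
        for motif, _, _ in find_trinucleotide_ssrs(seq, min_repeats):
            counts[motif] += 1
    return counts


# Hypothetical sequences, for illustration only.
print(find_trinucleotide_ssrs("TTAGGCGGCGGCGGCGGCATCA"))
print(motif_counts(["TTAGGCGGCGGCGGCGGCATCA", "AACGCCGCCGCCGCTT"]))
```

The sketch reports each motif in the reading frame where the repeat is first encountered, so GGC, GCG and CGG runs are counted separately; a fuller analysis would decide whether to merge such rotations.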

  16. Sequence determinants of prokaryotic gene expression level under heat stress.

    PubMed

    Xiong, Heng; Yang, Yi; Hu, Xiao-Pan; He, Yi-Ming; Ma, Bin-Guang

    2014-11-01

    Prokaryotic gene expression is environment-dependent and temperature plays an important role in shaping the gene expression profile. Revealing the regulation mechanisms of gene expression pertaining to temperature has attracted tremendous efforts in recent years, particularly owing to the transcriptome and proteome data generated by high-throughput techniques. However, most previous work concentrated on characterizing the gene expression profile of individual organisms, and little effort has been made to uncover commonalities among organisms, especially in gene sequence features. In this report, we collected the transcriptome and proteome data measured under heat stress condition from recently published literature and studied the sequence determinants for the expression level of heat-responsive genes on multiple layers. Our results showed that there are indeed common and consistent patterns of sequence features among organisms for the differentially expressed genes under heat stress condition. Some features are attributed to the requirement of thermostability while some are dominated by gene function. The revealed sequence determinants of bacterial gene expression level under heat stress complement the knowledge about the regulation factors of prokaryotic gene expression responding to the change of environmental conditions. Furthermore, comparisons to thermophilic adaptation have been performed to reveal the similarity and dissimilarity of the sequence determinants for the response to heat stress and for the adaptation to high habitat temperature, which elucidates the complex landscape of gene expression related to the same physical factor of temperature.

  17. Analytic signal phase-based myocardial motion estimation in tagged MRI sequences by a bilinear model and motion compensation.

    PubMed

    Wang, Liang; Basarab, Adrian; Girard, Patrick R; Croisille, Pierre; Clarysse, Patrick; Delachartre, Philippe

    2015-08-01

    Different mathematical tools, such as multidimensional analytic signals, allow for the calculation of 2D spatial phases of real-value images. The motion estimation method proposed in this paper is based on two spatial phases of the 2D analytic signal applied to cardiac sequences. By combining the information of these phases issued from analytic signals of two successive frames, we propose an analytical estimator for 2D local displacements. To improve the accuracy of the motion estimation, a local bilinear deformation model is used within an iterative estimation scheme. The main advantages of our method are: (1) The phase-based method allows the displacement to be estimated with subpixel accuracy and is robust to image intensity variation in time; (2) Preliminary filtering is not required due to the bilinear model. The proposed algorithm, integrating phase-based optical flow motion estimation and the combination of global motion compensation with local bilinear transform, allows spatio-temporal cardiac motion analysis, e.g. strain and dense trajectory estimation over the cardiac cycle. Results from 7 realistic simulated tagged magnetic resonance imaging (MRI) sequences show that our method is more accurate than a state-of-the-art method for cardiac motion analysis and another differential approach from the literature. The motion estimation errors (end point error) of the proposed method are reduced by about 33% compared with those of the two methods. In our work, the frame-to-frame displacements are further accumulated in time, to allow for the calculation of myocardial Lagrangian cardiac strains and point trajectories. Indeed, from the estimated trajectories in time on 11 in vivo data sets (9 patients and 2 healthy volunteers), the myocardial point trajectories belonging to pathological regions are clearly reduced in magnitude compared with the ones from normal regions. Myocardial point trajectories, estimated from our phase-based analytic

  19. Development and optimization of sequence-tagged microsatellite site markers to detect genetic diversity within Colletotrichum capsici, a causal agent of chilli pepper anthracnose disease.

    PubMed

    Ranathunge, N P; Ford, R; Taylor, P W J

    2009-07-01

    Genomic libraries enriched for microsatellites from Colletotrichum capsici, one of the major causal agents of anthracnose disease in chilli pepper (Capsicum spp.), were developed using a modified hybridization procedure. Twenty-seven robust primer pairs were designed from microsatellite flanking sequences and were characterized using 52 isolates from three countries: India, Sri Lanka and Thailand. The highest gene diversity, 0.857, was observed at the CCSSR1 locus, with up to 18 alleles among all the isolates, whereas the differentiation ranged from 0.05 to 0.45. The sequence-tagged microsatellite site markers developed in this study will be useful for genetic analyses of C. capsici populations. PMID:21564867
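
    The abstract reports a per-locus gene diversity but does not say which estimator was used. As a minimal sketch, assuming Nei's gene diversity (one minus the sum of squared allele frequencies) and one allele call per isolate, the calculation could look as follows. The function name and the allele calls are hypothetical and are not taken from the study.

```python
from collections import Counter


def gene_diversity(alleles, unbiased=True):
    """Nei's gene diversity for one locus, one allele call per isolate.

    Assumption: this is the estimator behind the reported value; the
    abstract does not confirm it.
    """
    n = len(alleles)
    if n < 2:
        raise ValueError("need at least two isolates")
    counts = Counter(alleles)
    # Sum of squared allele frequencies (expected homozygosity).
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    h = 1.0 - sum_sq
    # Optional small-sample correction n / (n - 1).
    return h * n / (n - 1) if unbiased else h


# Hypothetical allele calls for ten isolates at a single locus.
calls = ["a1"] * 5 + ["a2"] * 3 + ["a3"] * 2
print(gene_diversity(calls, unbiased=False))
print(gene_diversity(calls))
```

    A locus with many alleles at similar frequencies gives a value close to 1, while a locus fixed for one allele gives 0.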

  20. Affinity Purification of a Recombinant Protein Expressed as a Fusion with the Maltose-Binding Protein (MBP) Tag

    PubMed Central

    Duong-Ly, Krisna C.; Gabelli, Sandra B.

    2015-01-01

    Expression of fusion proteins such as MBP fusions can be used as a way to improve the solubility of the expressed protein in E. coli (Fox and Waugh, 2003; Nallamsetty et al., 2005; Nallamsetty and Waugh, 2006) and as a way to introduce an affinity purification tag. The protocol that follows was designed by the authors as a first step in the purification of a recombinant protein fused with MBP, using fast protein liquid chromatography (FPLC). Cells should have been thawed, resuspended in binding buffer, and lysed by sonication or microfluidization before mixing with the amylose resin or loading on the column. Slight modifications to this protocol may be made to accommodate both the protein of interest and the availability of equipment. PMID:26096500

  1. Expression and characterization of Flag-epitope- and hexahistidine-tagged derivatives of saxiphilin for use in detection and assay of saxitoxin.

    PubMed

    Krishnan, G; Morabito, M A; Moczydlowski, E

    2001-01-01

    Saxiphilin is a plasma protein from the bullfrog (Rana catesbeiana) that binds saxitoxin (STX), a causative agent of paralytic shellfish poisoning. Saxiphilin is homologous to transferrin and consists of two internally homologous domains called the N-lobe and the C-lobe. STX binds to a single site in the C-lobe of saxiphilin. In this study, cloned genes coding for recombinant saxiphilin and C-lobe saxiphilin were modified to contain two tandemly located affinity tags, Flag epitope (DYKDDDDK) and His(6) (HHHHHH), at the protein C-terminus and were expressed in cultured insect cells using baculovirus vectors. Both tagged proteins are readily detected on immunoblots by anti-Flag monoclonal antibody. Flag-His(6)-tagged saxiphilin was purified to homogeneity using Ni(2+)-chelate affinity chromatography and Heparin Sepharose chromatography. Equilibrium analysis of [3H]STX binding to tagged saxiphilin and tagged C-lobe saxiphilin gave K(D) values of 0.75 and 2.7 nM, respectively. Flag-His(6)-tagged saxiphilin was also utilized in a microtiter well solid-phase assay with Reacti-bind metal chelate plates to measure [3H]STX binding and binding competition by unlabeled STX. Such Flag-His(6)-tagged derivatives of saxiphilin have many possible applications in the assay of STX and related toxinological research. PMID:10978747
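
    To illustrate what the two reported K(D) values imply, the sketch below evaluates the standard single-site equilibrium binding isotherm, fraction bound = [L] / ([L] + K(D)). It assumes that [L] is the free STX concentration and that ligand depletion is negligible; the function name and the chosen concentrations are illustrative only and do not come from the study.

```python
def fraction_bound(ligand_nM, kd_nM):
    """Fractional occupancy of a single binding site at equilibrium.

    Assumes ligand_nM is the free ligand concentration (no depletion).
    """
    if ligand_nM < 0 or kd_nM <= 0:
        raise ValueError("concentration must be >= 0 and K(D) must be > 0")
    return ligand_nM / (ligand_nM + kd_nM)


# K(D) values reported in the abstract, in nM.
KD_TAGGED_SAXIPHILIN = 0.75
KD_TAGGED_C_LOBE = 2.7

# Hypothetical free STX concentrations, in nM.
for stx in (0.1, 0.75, 2.7, 10.0):
    full = fraction_bound(stx, KD_TAGGED_SAXIPHILIN)
    c_lobe = fraction_bound(stx, KD_TAGGED_C_LOBE)
    print(f"[STX] = {stx:5.2f} nM  full-length: {full:.2f}  C-lobe: {c_lobe:.2f}")
```

    By construction, each protein is half-saturated when the free STX concentration equals its K(D).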

  2. Birbeck granule-like "organized smooth endoplasmic reticulum" resulting from the expression of a cytoplasmic YFP-tagged langerin.

    PubMed

    Lenormand, Cédric; Spiegelhalter, Coralie; Cinquin, Bertrand; Bardin, Sabine; Bausinger, Huguette; Angénieux, Catherine; Eckly, Anita; Proamer, Fabienne; Wall, David; Lich, Ben; Tourne, Sylvie; Hanau, Daniel; Schwab, Yannick; Salamero, Jean; de la Salle, Henri

    2013-01-01

    Langerin is required for the biogenesis of Birbeck granules (BGs), the characteristic organelles of Langerhans cells. We previously used a Langerin-YFP fusion protein having a C-terminal luminal YFP tag to dynamically decipher the molecular and cellular processes which accompany the traffic of Langerin. In order to elucidate the interactions of Langerin with its trafficking effectors and their structural impact on the biogenesis of BGs, we generated a YFP-Langerin chimera with an N-terminal, cytosolic YFP tag. This latter fusion protein induced the formation of YFP-positive large puncta. Live cell imaging coupled to a fluorescence recovery after photobleaching approach showed that this coalescence of proteins in newly formed compartments was static. In contrast, the YFP-positive structures present in the pericentriolar region of cells expressing Langerin-YFP chimera, displayed fluorescent recovery characteristics compatible with active membrane exchanges. Using correlative light-electron microscopy we showed that the coalescent structures represented highly organized stacks of membranes with a pentalaminar architecture typical of BGs. Continuities between these organelles and the rough endoplasmic reticulum allowed us to identify the stacks of membranes as a form of "Organized Smooth Endoplasmic Reticulum" (OSER), with distinct molecular and physiological properties. The involvement of homotypic interactions between cytoplasmic YFP molecules was demonstrated using an A206K variant of YFP, which restored most of the Langerin traffic and BG characteristics observed in Langerhans cells. Mutation of the carbohydrate recognition domain also blocked the formation of OSER. Hence, a "double-lock" mechanism governs the behavior of YFP-Langerin, where asymmetric homodimerization of the YFP tag and homotypic interactions between the lectin domains of Langerin molecules participate in its retention and the subsequent formation of BG-like OSER. 
These observations confirm that

  3. Method to produce acetyldiacylglycerols (ac-TAGs) by expression of an acetyltransferase gene isolated from Euonymus alatus (burning bush)

    DOEpatents

    Durrett, Timothy; Ohlrogge, John; Pollard, Michael

    2016-05-03

    The present invention relates to novel diacylglycerol acyltransferase genes and proteins, and methods of their use. In particular, the invention describes genes encoding proteins having diacylglycerol acetyltransferase activity, specifically for transferring an acetyl group to a diacylglycerol substrate to form acetyl-triacylglycerols (ac-TAGs), for example, a 3-acetyl-1,2-diacyl-sn-glycerol. The present invention encompasses both native and recombinant wild-type forms of the transferase, as well as mutants and variant forms. The present invention also relates to methods of using novel diacylglycerol acyltransferase genes and proteins, including their expression in transgenic organisms at commercially viable levels, for increasing production of 3-acetyl-1,2-diacyl-sn-glycerols in plant oils and altering the composition of oils produced by microorganisms, such as yeast, by increasing ac-TAG production. Additionally, oils produced by methods of the present invention comprising genes and proteins are contemplated for use as biodiesel fuel, in polymer production and as naturally produced food oils with reduced calories.

  4. Epitope-Tagged Autotransporters as Single-Cell Reporters for Gene Expression by a Salmonella Typhimurium wbaP Mutant

    PubMed Central

    Curkić, Ismeta; Schütz, Monika; Oberhettinger, Philipp; Diard, Médéric; Claassen, Manfred; Linke, Dirk; Hardt, Wolf-Dietrich

    2016-01-01

    Phenotypic diversity is an important trait of bacterial populations and can enhance fitness of the existing genotype in a given environment. To characterize different subpopulations, several studies have analyzed differential gene expression using fluorescent reporters. These studies visualized either single or multiple genes within single cells using different fluorescent proteins. However, variable maturation and folding kinetics of different fluorophores complicate the study of dynamics of gene expression. Here, we present a proof-of-principle study for an alternative gene expression system in a wbaP mutant of Salmonella Typhimurium (S. Tm) lacking the O-sidechain of the lipopolysaccharide. We employed the hemagglutinin (HA)-tagged inverse autotransporter invasin (invAHA) as a transcriptional reporter for the expression of the type three secretion system 1 (T1) in S. Tm. Using a two-reporter approach with GFP and the InvAHA in single cells, we verify that this reporter system can be used for T1 gene expression analysis, at least in strains lacking the O-antigen (wbaP), which are permissive for detection of the surface-exposed HA-epitope. When we placed the two reporters gfp and invAHA under the control of either one or two different promoters of the T1 regulon, we were able to show correlative expression of both reporters. We conclude that the invAHA reporter system is a suitable tool to analyze T1 gene expression in S. Tm and propose its applicability as a molecular tool for gene expression studies within single cells. PMID:27149272

  5. Genome-wide analysis of single-nucleotide polymorphisms in human expressed sequences.

    PubMed

    Irizarry, K; Kustanovich, V; Li, C; Brown, N; Nelson, S; Wong, W; Lee, C J

    2000-10-01

    Single-nucleotide polymorphisms (SNPs) have been explored as a high-resolution marker set for accelerating the mapping of disease genes. Here we report 48,196 candidate SNPs detected by statistical analysis of human expressed sequence tags (ESTs), associated primarily with coding regions of genes. We used Bayesian inference to weigh evidence for true polymorphism versus sequencing error, misalignment or ambiguity, misclustering or chimaeric EST sequences, assessing data such as raw chromatogram height, sharpness, overlap and spacing, sequencing error rates, context-sensitivity and cDNA library origin. Three separate validations (comparison with 54 genes screened for SNPs independently, verification of HLA-A polymorphisms, and restriction fragment length polymorphism (RFLP) testing) verified 70%, 89% and 71% of our predicted SNPs, respectively. Our method detects tenfold more true HLA-A SNPs than previous analyses of the EST data. We found SNPs in a large fraction of known disease genes, including some disease-causing mutations (for example, the HbS sickle-cell mutation). Our comprehensive analysis of human coding region polymorphism provides a public resource for mapping of disease genes (available at http://www.bioinformatics.ucla.edu/snp).
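
    The general idea of weighing polymorphism against sequencing error with Bayes' rule can be shown with a toy model. This is not the authors' method, which uses chromatogram features, alignment quality and library origin. The sketch assumes a single alignment column, independent reads, a uniform per-base error rate, and hypothetical values for the prior and the minor allele frequency.

```python
from math import comb


def binom_pmf(k, n, p):
    """Binomial probability of k successes in n independent trials."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)


def posterior_snp(k_alt, n_reads, error_rate=0.01, allele_freq=0.2, prior=0.001):
    """Posterior probability that a column holds a true SNP (toy model).

    All default parameter values are illustrative assumptions.
    """
    # Chance that a sequencing error produces this particular alternate base.
    p_err = error_rate / 3
    # Chance of reading the alternate base if the site is truly polymorphic.
    p_snp = allele_freq * (1 - error_rate) + (1 - allele_freq) * p_err
    like_snp = binom_pmf(k_alt, n_reads, p_snp)
    like_err = binom_pmf(k_alt, n_reads, p_err)
    numerator = prior * like_snp
    return numerator / (numerator + (1 - prior) * like_err)


# Hypothetical column covered by 10 ESTs with 0 to 3 alternate-base reads.
for k in range(4):
    print(k, posterior_snp(k, 10))
```

    Under these assumptions a single discordant read leaves the posterior low, while several concordant discordant reads raise it sharply, which is the qualitative behaviour a polymorphism-versus-error model is expected to show.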

  6. Expression profiling and comparative sequence derived insights into lipid metabolism

    SciTech Connect

    Callow, Matthew J.; Rubin, Edward M.

    2001-12-19

    Expression profiling and genomic DNA sequence comparisons are increasingly being applied to the identification and analysis of the genes involved in lipid metabolism. Not only has genome-wide expression profiling aided in the identification of novel genes involved in important processes in lipid metabolism such as sterol efflux, but the utilization of information from these studies has added to our understanding of the regulation of pathways participating in the process. Coupled with these gene expression studies, cross species comparison, searching for sequences conserved through evolution, has proven to be a powerful tool to identify important non-coding regulatory sequences as well as the discovery of novel genes relevant to lipid biology. An example of the value of this approach was the recent chance discovery of a new apolipoprotein gene (apo AV) that has dramatic effects upon triglyceride metabolism in mice and humans.

  7. Amplified expression of the tag+ and alkA+ genes in Escherichia coli: identification of gene products and effects on alkylation resistance.

    PubMed Central

    Kaasen, I; Evensen, G; Seeberg, E

    1986-01-01

    We have constructed plasmids which overproduce the tag and alkA gene products of Escherichia coli, i.e., 3-methyladenine DNA glycosylases I and II. The tag and alkA gene products were identified radiochemically in maxi- or minicells as polypeptides of 21 and 30 kilodaltons, respectively, which are consistent with the gel filtration molecular weights of the enzyme activities, thus confirming the identity of the cloned genes. High expression of the tag+-coded glycosylase almost completely suppressed the alkylation sensitivity of alkA mutants, indicating that high levels of 3-methyladenine DNA glycosylase I will eliminate the need for 3-methyladenine DNA glycosylase II in repair of alkylated DNA. Furthermore, overproduction of the alkA+-coded glycosylase greatly sensitizes wild-type cells to alkylation, suggesting that only a limited expression of this enzyme will allow efficient DNA repair. PMID:3536857

  8. Haplotypes of the TaGS5-A1 Gene Are Associated with Thousand-Kernel Weight in Chinese Bread Wheat

    PubMed Central

    Wang, Shasha; Yan, Xuefang; Wang, Yongyan; Liu, Hongmei; Cui, Dangqun; Chen, Feng

    2016-01-01

    In previous work, we cloned the TaGS5 gene and found an association of TaGS5-A1 alleles with agronomic traits. In this study, the promoter sequence of the TaGS5-A1 gene was isolated from bread wheat. Sequencing results revealed a G insertion at position -1925 bp of the TaGS5-A1 gene (relative to the ATG), which occurred in the Sp1 domain of the promoter sequence. Combined with the previously reported single nucleotide polymorphism (SNP) in the TaGS5-A1 exon sequence, four genotypes were formed at the TaGS5-A1 locus and were designated as TaGS5-A1a-a, TaGS5-A1a-b, TaGS5-A1b-a, and TaGS5-A1b-b, respectively. Analysis of the association of TaGS5-A1 alleles with agronomic traits indicated that cultivars with the TaGS5-A1a-b allele possessed significantly higher thousand-kernel weight (TKW) and lower plant height than cultivars with the TaGS5-A1a-a allele, and cultivars with the TaGS5-A1b-b allele showed higher TKW than cultivars with the TaGS5-A1b-a allele. The differences in these traits between the TaGS5-A1a-a and TaGS5-A1a-b alleles were larger than those between the TaGS5-A1b-a and TaGS5-A1b-b alleles, suggesting that the -1925G insertion plays a more important role in TaGS5-A1a genotypes than in TaGS5-A1b genotypes. qRT-PCR indicated that TaGS5-A1b-b had the highest expression level among the four TaGS5-A1 haplotypes in mature seeds, a significant difference, and further showed a significantly higher expression level than TaGS5-A1b-a at five different developmental stages of the seeds, suggesting that high expression of TaGS5-A1 was positively associated with high TKW in bread wheat. This study could provide a relatively superior genotype in view of TKW in wheat breeding programs and could also provide important information for dissection of the regulatory mechanism of the yield-related traits. PMID:27375643

  9. Honey bee promoter sequences for targeted gene expression.

    PubMed

    Schulte, C; Leboulle, G; Otte, M; Grünewald, B; Gehne, N; Beye, M

    2013-08-01

    The honey bee, Apis mellifera, displays a rich behavioural repertoire, social organization and caste differentiation, and has an interesting mode of sex determination, but we still know little about its underlying genetic programs. We lack stable transgenic tools in honey bees that would allow genetic control of gene activity in stable transgenic lines. As an initial step towards a transgenic method, we identified promoter sequences in the honey bee that can drive constitutive, tissue-specific and cold shock-induced gene expression. We identified the promoter sequences of Am-actin5c, elp2l, Am-hsp83 and Am-hsp70 and showed that, except for the elp2l sequence, the identified sequences were able to drive reporter gene expression in Sf21 cells. We further demonstrated through electroporation experiments that the putative neuron-specific elp2l promoter sequence can direct gene expression in the honey bee brain. The identification of these promoter sequences is an important initial step in studying the function of genes with transgenic experiments in the honey bee, an organism with a rich set of interesting phenotypes. PMID:23668189

  10. Organ-specific and dosage-dependent expression of a leaf/stem specific gene from potato after tagging and transfer into potato and tobacco plants.

    PubMed Central

    Stockhaus, J; Eckes, P; Blau, A; Schell, J; Willmitzer, L

    1987-01-01

    ST-LS1, a single copy gene from potato displaying a leaf/stem specific gene expression, was tagged by an exon modification and introduced into both potato and tobacco cells using Agrobacterium vectors. After regeneration of whole plants, the expression of the tagged gene was analyzed with respect to its organ specificity and compared to the expression of the corresponding resident gene. The expression of the transferred gene in transgenic plants closely followed the expression of the resident gene. No marked influence of the plant species serving as host was observed. The level of expression of the introduced gene varied by a factor of at least 100 in independent transformants when normalized to the expression of the resident gene. Southern analysis performed on the transformed plants indicated a correlation between copy number of the introduced gene and its expression level. The activity of the tagged gene as well as of the resident gene was significantly inhibited by treatment of the transgenic plants with the herbicide norflurazon, indicating that this gene activity is dependent on the presence of functional chloroplasts in the leaves. PMID:3575098

  11. Expressed sequence tags from the black-winged sharpshooter: Application to biology and vector control

    Technology Transfer Automated Retrieval System (TEKTRAN)

    We identified 14 putative full-length transcripts of proteins important for the survival of the black-winged sharpshooter, BWSS, Oncometopia nigricans. The BWSS is considered a highly competent vector of several strains of the xylem-inhabiting bacterium Xylella fastidiosa, the causal agent of a numb...

  12. Understanding mechanisms underlying human gene expression variation with RNA sequencing

    PubMed Central

    Pickrell, Joseph K.; Marioni, John C.; Pai, Athma A.; Degner, Jacob F.; Engelhardt, Barbara E.; Nkadori, Everlyne; Veyrieras, Jean-Baptiste; Stephens, Matthew; Gilad, Yoav; Pritchard, Jonathan K.

    2011-01-01

    Understanding the genetic mechanisms underlying natural variation in gene expression is a central goal of both medical and evolutionary genetics, and studies of expression quantitative trait loci (eQTLs) have become an important tool for achieving this goal. Although all eQTL studies so far have assayed messenger RNA levels using expression microarrays, recent advances in RNA sequencing enable the analysis of transcript variation at unprecedented resolution. We sequenced RNA from 69 lymphoblastoid cell lines derived from unrelated Nigerian individuals that have been extensively genotyped by the International HapMap Project. By pooling data from all individuals, we generated a map of the transcriptional landscape of these cells, identifying extensive use of unannotated untranslated regions and more than 100 new putative protein-coding exons. Using the genotypes from the HapMap project, we identified more than a thousand genes at which genetic variation influences overall expression levels or splicing. We demonstrate that eQTLs near genes generally act by a mechanism involving allele-specific expression, and that variation that influences the inclusion of an exon is enriched within and near the consensus splice sites. Our results illustrate the power of high-throughput sequencing for the joint analysis of variation in transcription, splicing and allele-specific expression across individuals. PMID:20220758

  13. Overview of affinity tags for protein purification.

    PubMed

    Kimple, Michelle E; Brill, Allison L; Pasker, Renee L

    2013-01-01

    Addition of an affinity tag is a useful method for differentiating recombinant proteins expressed in bacterial and eukaryotic expression systems from the background of total cellular proteins, as well as for detecting protein-protein interactions. This overview describes the historical basis for the development of affinity tags, affinity tags that are commonly used today, how to choose an appropriate affinity tag for a particular purpose, and several recently developed affinity tag technologies that may prove useful in the near future. PMID:24510596

  14. Purification and Refolding to Amyloid Fibrils of (His)6-tagged Recombinant Shadoo Protein Expressed as Inclusion Bodies in E. coli.

    PubMed

    Li, Qiaojing; Richard, Charles-Adrien; Moudjou, Mohammed; Vidic, Jasmina

    2015-12-19

    The Escherichia coli expression system is a powerful tool for the production of recombinant eukaryotic proteins. We use it to produce Shadoo, a protein belonging to the prion family. A chromatographic method for the purification of (His)6-tagged recombinant Shadoo expressed as inclusion bodies is described. The inclusion bodies are solubilized in 8 M urea and bound to a Ni(2+)-charged column to perform ion affinity chromatography. Bound proteins are eluted by a gradient of imidazole. Fractions containing Shadoo protein are subjected to size exclusion chromatography to obtain a highly purified protein. In the final step purified Shadoo is desalted to remove salts, urea and imidazole. Recombinant Shadoo protein is an important reagent for biophysical and biochemical studies of protein conformation disorders occurring in prion diseases. Many reports demonstrated that prion neurodegenerative diseases originate from the deposition of stable, ordered amyloid fibrils. Sample protocols describing how to fibrillate Shadoo into amyloid fibrils at acidic and neutral/basic pHs are presented. The methods on how to produce and fibrillate Shadoo can facilitate research in laboratories working on prion diseases, since it allows for production of large amounts of protein in a rapid and low cost manner.

  15. Cloning, expression, purification and characterization of his-tagged human glucose-6-phosphate dehydrogenase: a simplified method for protein yield.

    PubMed

    Gómez-Manzo, Saúl; Terrón-Hernández, Jessica; de la Mora-de la Mora, Ignacio; García-Torres, Itzhel; López-Velázquez, Gabriel; Reyes-Vivas, Horacio; Oria-Hernández, Jesús

    2013-10-01

    Glucose-6-phosphate dehydrogenase (G6PD) catalyzes the first step of the pentose phosphate pathway. In erythrocytes, the functionality of the pathway is crucial to protect these cells against oxidative damage. G6PD deficiency is the most frequent enzymopathy in humans, with a global prevalence of 4.9%. The clinical picture is characterized by chronic or acute hemolysis in response to oxidative stress, which is related to the low cellular activity of G6PD in red blood cells. The disease is heterogeneous at the genetic level, with around 160 mutations described, mostly point mutations causing single amino acid substitutions. Biochemical studies that describe the detrimental effects of mutations on the functional and structural properties of human G6PD are indispensable for understanding the molecular physiopathology of this disease. Therefore, reliable systems for efficient expression and purification of the protein are highly desirable. In this work, human G6PD was heterologously expressed in Escherichia coli and purified by immobilized metal affinity chromatography in a single chromatographic step. The structural and functional characterization indicates that His-tagged G6PD resembles previous preparations of recombinant G6PD. In contrast with previous protein yield systems, our method is based on commonly available resources and fully accessible laboratory equipment; therefore, it can be readily implemented.

  16. Expression and purification of short hydrophobic elastin-like polypeptides with maltose-binding protein as a solubility tag.

    PubMed

    Bataille, Laure; Dieryck, Wilfrid; Hocquellet, Agnès; Cabanne, Charlotte; Bathany, Katell; Lecommandoux, Sébastien; Garbay, Bertrand; Garanger, Elisabeth

    2015-06-01

    Elastin-like polypeptides (ELPs) are biodegradable polymers with interesting physico-chemical properties for biomedical and biotechnological applications. The recombinant expression of hydrophobic elastin-like polypeptides is often difficult because they possess low transition temperatures, and therefore form aggregates at sub-ambient temperatures. To circumvent this difficulty, we expressed in Escherichia coli three hydrophobic ELPs (VPGIG)n with variable lengths (n=20, 40, and 60) in fusion with the maltose-binding protein (MBP). Fusion proteins were soluble and yields of purified MBP-ELP ranged between 66 and 127 mg/L culture. After digestion of the fusion proteins by enterokinase, the ELP moiety was purified by using inverse transition cycling. The purified fraction containing ELP40 was slightly contaminated by traces of undigested fusion protein. Purification of ELP60 was impaired because of co-purification of the MBP tag during inverse transition cycling. ELP20 was successfully purified to homogeneity, as assessed by gel electrophoresis and mass spectrometry analyses. The transition temperature of ELP20 was measured at 15.4°C in low salt buffer. In conclusion, this method can be used to produce hydrophobic ELP of low molecular mass.

  17. In silico identification of coffee genome expressed sequences potentially associated with resistance to diseases

    PubMed Central

    2010-01-01

    Sequences potentially associated with coffee resistance to diseases were identified by in silico analyses using the database of the Brazilian Coffee Genome Project (BCGP). Keywords corresponding to plant resistance mechanisms to pathogens identified in the literature were used as baits for data mining. Expressed sequence tags (ESTs) related to each of these keywords were identified with tools available in the BCGP bioinformatics platform. A total of 11,300 ESTs were mined. These ESTs were clustered and formed 979 EST-contigs with similarities to chitinases, kinases, cytochrome P450 and nucleotide binding site-leucine rich repeat (NBS-LRR) proteins, as well as with proteins related to disease resistance, pathogenesis, hypersensitivity response (HR) and plant defense responses to diseases. The 140 EST-contigs identified through the keyword NBS-LRR were classified according to function. This classification allowed association of the predicted products of EST-contigs with biological processes, including host defense and apoptosis, and with molecular functions such as nucleotide binding and signal transducer activity. Fisher's exact test was used to examine the significance of differences in contig expression between libraries representing the responses to biotic stress challenges and other libraries from the BCGP. This analysis revealed seven contigs highly similar to catalase, chitinase, protein with a BURP domain and unknown proteins. The involvement of these coffee proteins in plant responses to disease is discussed. PMID:21637594
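
    The abstract names Fisher's exact test for comparing contig expression between library groups but gives no counts. As a minimal sketch, the two-sided test on a 2x2 table can be computed from the hypergeometric distribution using only the standard library. The table layout, the function name and the EST counts below are assumptions made for illustration.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed one.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lowest, highest + 1))
               if p <= p_obs * (1 + 1e-9))


# Hypothetical counts: ESTs of one contig versus all other ESTs,
# in biotic-stress libraries (row 1) and in the remaining libraries (row 2).
p_value = fisher_exact_two_sided(12, 988, 5, 9995)
print(p_value)
```

    A small p-value for a contig would suggest that its EST count differs between the two library groups by more than chance allows, which is how such a test is typically read in digital expression comparisons.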

  18. P2A-Fluorophore Tagging of BRAF Tightly Links Expression to Fluorescence In Vivo

    PubMed Central

    McMahon, Martin

    2016-01-01

    The Braf proto-oncogene is a key component of the mitogen-activated protein kinase signaling cascade and is a critical regulator of both normal development and tumorigenesis in a variety of tissues. In order to elucidate BRAF’s differing roles in varying cell types, it is important to understand both the pattern and timing of BRAF expression. Here we report the production of a mouse model that links the expression of Braf with the bright red fluorescent protein, tdTomato. We have utilized a P2A knock-in strategy, ensuring that BRAF and the fluorophore are expressed from the same endogenous promoter and from the same bicistronic mRNA transcript. This mouse model (BrafTOM) shows bright red fluorescence in organs and cell types known to be sensitive to BRAF perturbation. We further show that on a cell-by-cell basis, fluorescence correlates with BRAF protein levels. Finally, we extend the utility of this mouse by demonstrating that the remnant P2A fragment attached to BRAF acts as a suitable epitope for immunoprecipitation and biochemical characterization of BRAF in vivo. PMID:27348307

  19. The pattern of gene expression in human CD34+ stem/progenitor cells

    PubMed Central

    Zhou, Guolin; Chen, Jianjun; Lee, Sanggyu; Clark, Terry; Rowley, Janet D.; Wang, San Ming

    2001-01-01

    We have analyzed the pattern of gene expression in human primary CD34+ stem/progenitor cells. We identified 42,399 unique serial analysis of gene expression (SAGE) tags among 106,021 SAGE tags collected from 2.5 × 10^6 CD34+ cells purified from bone marrow. Of these unique SAGE tags, 21,546 matched known expressed sequences, including 3,687 known genes, and 20,854 were novel without a match. The SAGE tags that matched known sequences tended to be at higher levels, whereas the novel SAGE tags tended to be at lower levels. By using the generation of longer sequences from SAGE tags for gene identification (GLGI) method, we identified the correct gene for 385 of 440 high-copy SAGE tags that matched multiple genes and we generated 198 novel 3′ expressed sequence tags from 138 high-copy novel SAGE tags. We observed that many different SAGE tags were derived from the same genes, reflecting the high heterogeneity of the 3′ untranslated region in the expressed genes. We compared the quantitative relationship for genes known to be important in hematopoiesis. The qualitative identification and quantitative measure for each known gene, expressed sequence tag, and novel SAGE tag provide a base for studying normal gene expression in hematopoietic stem/progenitor cells and for studying abnormal gene expression in hematopoietic diseases. PMID:11717454

  20. Murine candidate bleomycin induced pulmonary fibrosis susceptibility genes identified by gene expression and sequence analysis of linkage regions

    PubMed Central

    Haston, C; Tomko, T; Godin, N; Kerckhoff, L; Hallett, M

    2005-01-01

    Background: Pulmonary fibrosis is a complex disease for which the predisposing genetic variants remain unknown. In a prior study, susceptibility to bleomycin induced pulmonary fibrosis was mapped to loci Blmpf1 and Blmpf2 on chromosomes 17 and 11, respectively, in a C57BL/6J (B6, susceptible) and C3Hf/KAM (C3H, resistant) mouse cross. Methods: Herein, the genetic basis of bleomycin induced pulmonary fibrosis was investigated in an approach combining gene expression and sequencing data with previously mapped linkage intervals. Results: In this study, gene expression analysis with microarrays revealed 1892 genes or ESTs (expressed sequence tags) to be differentially expressed between bleomycin treated B6 and C3H mice and 67 of these genetic elements map to Blmpf1 or Blmpf2. This group included genes involved in an oxidative stress response, in apoptosis, and in immune regulation. A comparison of the B6 and C3H sequence, for Blmpf1 and Blmpf2, made using the NCBI database and available C3H sequence, revealed that approximately 35% of the genes in these regions contain non-synonymous coding sequence changes. An assessment of genotype/phenotype correlation among other inbred strains revealed that 36% of these B6/C3H sequence variations predict the known bleomycin induced fibrosis susceptibility of the DBA (susceptible) and A/J (resistant) mouse strains. Conclusions: Combining genomics approaches of differential gene expression and sequence variation potentially identifies approximately 5% of the linked genes as fibrosis susceptibility candidate genes in this mouse cross. PMID:15937080

  1. Molecular cloning, sequence characterization, and gene expression profiling of a novel water buffalo (Bubalus bubalis) gene, AGPAT6.

    PubMed

    Song, S; Huo, J L; Li, D L; Yuan, Y Y; Yuan, F; Miao, Y W

    2013-01-01

    Several 1-acylglycerol-3-phosphate-O-acyltransferases (AGPATs) can acylate lysophosphatidic acid to produce phosphatidic acid. Of the eight AGPAT isoforms, AGPAT6 is a crucial enzyme for glycerolipids and triacylglycerol biosynthesis in some mammalian tissues. We amplified and identified the complete coding sequence (CDS) of the water buffalo AGPAT6 gene by using the reverse transcription-polymerase chain reaction, based on the conserved sequence information of cattle or expressed sequence tags of other Bovidae species. This novel gene was deposited in the NCBI database (accession No. JX518941). Sequence analysis revealed that the CDS of this AGPAT6 encodes a 456-amino acid enzyme (molecular mass = 52 kDa; pI = 9.34). Water buffalo AGPAT6 contains three hydrophobic transmembrane regions and a 37-amino acid signal peptide, localized in the cytoplasm. The deduced amino acid sequences share 99, 98, 98, 97, 98, 98, 97 and 95% identity with their homologous sequences from cattle, horse, human, mouse, orangutan, pig, rat, and chicken, respectively. The phylogenetic tree analysis based on the AGPAT6 CDS showed that water buffalo has a closer genetic relationship with cattle than with other species. Tissue expression profile analysis shows that this gene is highly expressed in the mammary gland, moderately expressed in the heart, muscle, liver, and brain; weakly expressed in the pituitary gland, spleen, and lung; and almost silently expressed in the small intestine, skin, kidney, and adipose tissues. Four predicted microRNA target sites are found in the water buffalo AGPAT6 CDS. These results will establish a foundation for further insights into this novel water buffalo gene. PMID:24114207

  3. Unamplified cap analysis of gene expression on a single-molecule sequencer

    PubMed Central

    Kanamori-Katayama, Mutsumi; Itoh, Masayoshi; Kawaji, Hideya; Lassmann, Timo; Katayama, Shintaro; Kojima, Miki; Bertin, Nicolas; Kaiho, Ai; Ninomiya, Noriko; Daub, Carsten O.; Carninci, Piero; Forrest, Alistair R.R.; Hayashizaki, Yoshihide

    2011-01-01

    We report the development of a simplified cap analysis of gene expression (CAGE) protocol adapted for single-molecule sequencers that avoids second strand synthesis, ligation, digestion, and PCR. HeliScopeCAGE directly sequences the 3′ end of cap trapped first-strand cDNAs. As with previous versions of CAGE, we better define transcription start sites (TSS) than known models, identify novel regions of transcription and alternative promoters, and find two major classes of TSS signal, sharp peaks and broad regions. However, using this protocol, we observe reproducible evidence of regulation at the much finer level of individual TSS positions. The libraries are quantitative over 5 orders of magnitude and highly reproducible (Pearson's correlation coefficient of 0.987). We have also scaled down the sample requirement to 5 μg of total RNA for a standard HeliScopeCAGE library and 100 ng for a low-quantity version. When the same RNA was run as 5-μg and 100-ng versions, the 100 ng was still able to detect expression for ∼60% of the 13,468 loci detected by a 5-μg library using the same threshold, allowing comparative analysis of even rare cell populations. Testing the protocol for differential gene expression measurements on triplicate HeLa and THP-1 samples, we find that the log fold change compared to Illumina microarray measurements is highly correlated (0.871). In addition, HeliScopeCAGE finds differential expression for thousands more loci including those with probes on the array. Finally, although the majority of tags are 5′ associated, we also observe a low level of signal on exons that is useful for defining gene structures. PMID:21596820

  4. Sortase-tag expressed protein ligation: combining protein purification and site-specific bioconjugation into a single step.

    PubMed

    Warden-Rothman, Robert; Caturegli, Ilaria; Popik, Vladimir; Tsourkas, Andrew

    2013-11-19

    Efficient labeling of protein-based targeting ligands with various cargos (drugs, imaging agents, nanoparticles, etc.) is essential to the fields of molecular imaging and targeted therapeutics. Many common bioconjugation techniques, however, are inefficient, nonstoichiometric, not site-specific, and/or incompatible with certain classes of protein scaffolds. Additionally, these techniques can result in a mixture of conjugated and unconjugated products, which are often difficult to separate. In this study, a bacterial sortase enzyme was utilized to condense targeting ligand purification and site-specific conjugation at the C-terminus into a single step. A model was produced to determine optimal reaction conditions for high conjugate purity and efficient utilization of cargo. As proof-of-principle, the sortase-tag expressed protein ligation (STEPL) technique was used to generate tumor-specific affinity ligands with fluorescent labels and/or azide modifications at high purity (>95%) such that it was not necessary to remove unconjugated impurities. Click chemistry was then used for the highly efficient and site-specific attachment of the azide-modified targeting ligands onto nanoparticles. PMID:24111659

  5. Sortase-Tag Expressed Protein Ligation (STEPL): combining protein purification and site-specific bioconjugation into a single step

    PubMed Central

    Warden-Rothman, Robert; Caturegli, Ilaria; Popik, Vladimir; Tsourkas, Andrew

    2013-01-01

    Efficient labeling of protein-based targeting ligands with various cargos (drugs, imaging agents, nanoparticles, etc.) is essential to the fields of molecular imaging and targeted therapeutics. Many common bioconjugation techniques, however, are inefficient, non-stoichiometric, not site-specific, and/or incompatible with certain classes of protein scaffolds. Additionally, these techniques can result in a mixture of conjugated and unconjugated products, which are often difficult to separate. In this study, a bacterial sortase enzyme was utilized to condense targeting ligand purification and site-specific conjugation at the C-terminus into a single step. A model was produced to determine optimal reaction conditions for high conjugate purity and efficient utilization of cargo. As proof-of-principle, the sortase-tag expressed protein ligation (STEPL) technique was used to generate tumor-specific affinity ligands with fluorescent labels and/or azide modifications at high purity (>95%) such that it was not necessary to remove unconjugated impurities. Click chemistry was then used for the highly efficient and site-specific attachment of the azide-modified targeting ligands onto nanoparticles. PMID:24111659

  6. Expression and Subcellular Distribution of GFP-Tagged Human Tetraspanin Proteins in Saccharomyces cerevisiae.

    PubMed

    Skaar, Karin; Korza, Henryk J; Tarry, Michael; Sekyrova, Petra; Högbom, Martin

    2015-01-01

    Tetraspanins are integral membrane proteins that function as organizers of multimolecular complexes and modulate function of associated proteins. Mammalian genomes encode approximately 30 different members of this family and remotely related eukaryotic species also contain conserved tetraspanin homologs. Tetraspanins are involved in a number of fundamental processes such as regulation of cell migration, fusion, immunity and signaling. Moreover, they are implicated in numerous pathological states including mental disorders, infectious diseases or cancer. Despite the great interest in tetraspanins, the structural and biochemical basis of their activity is still largely unknown. A major bottleneck lies in the difficulty of obtaining stable and homogeneous protein samples in large quantities. Here we report expression screening of 15 members of the human tetraspanin superfamily and successful protocols for the production in S. cerevisiae of a subset of tetraspanins involved in human cancer development. We have demonstrated the subcellular localization of overexpressed tetraspanin-green fluorescent protein fusion proteins in S. cerevisiae and found that despite being mislocalized, the fusion proteins are not degraded. The recombinantly produced tetraspanins are dispersed within the endoplasmic reticulum membranes or localized in granule-like structures in yeast cells. The recombinantly produced tetraspanins can be extracted from the membrane fraction and purified with detergents or the poly (styrene-co-maleic acid) polymer technique for use in further biochemical or biophysical studies. PMID:26218426

  7. Application of the High Resolution Melting analysis for genetic mapping of Sequence Tagged Site markers in narrow-leafed lupin (Lupinus angustifolius L.).

    PubMed

    Kamel, Katarzyna A; Kroc, Magdalena; Święcicki, Wojciech

    2015-01-01

    Sequence tagged site (STS) markers are valuable tools for genetic and physical mapping that can be successfully used in comparative analyses among related species. Current challenges for molecular marker genotyping in plants include the lack of fast, sensitive and inexpensive methods suitable for sequence variant detection. In contrast, high resolution melting (HRM) is a simple and high-throughput assay, which has been widely applied in sequence polymorphism identification as well as in the studies of genetic variability and genotyping. The present study is the first attempt to use the HRM analysis to genotype STS markers in narrow-leafed lupin (Lupinus angustifolius L.). The sensitivity and utility of this method were confirmed by the sequence polymorphism detection based on melting curve profiles in the parental genotypes and progeny of the narrow-leafed lupin mapping population. Application of different approaches, including amplicon size and a simulated heterozygote analysis, has allowed for successful genetic mapping of 16 new STS markers in the narrow-leafed lupin genome.

  8. High-throughput sequencing-based genome-wide identification of microRNAs expressed in developing cotton seeds.

    PubMed

    Wang, YanMei; Ding, Yan; Yu, DingWei; Xue, Wei; Liu, JinYuan

    2015-08-01

    MicroRNAs (miRNAs) have been shown to play critical regulatory roles in gene expression in cotton. Although a large number of miRNAs have been identified in cotton fibers, the functions of miRNAs in seed development remain unexplored. In this study, a small RNA library was constructed from cotton seeds sampled at 15 days post-anthesis (DPA) and was subjected to high-throughput sequencing. A total of 95 known miRNAs were detected to be expressed in cotton seeds. The expression pattern of these identified miRNAs was profiled and 48 known miRNAs were differentially expressed between cotton seeds and fibers at 15 DPA. In addition, 23 novel miRNA candidates were identified in 15-DPA seeds. Putative targets for 21 novel and 87 known miRNAs were successfully predicted and 900 expressed sequence tag (EST) sequences were proposed to be candidate target genes, which are involved in various metabolic and biological processes, suggesting a complex regulatory network in developing cotton seeds. Furthermore, miRNA-mediated cleavage of three important transcripts in vivo was validated by RLM-5' RACE. This study is the first to show the regulatory network of miRNAs that are involved in developing cotton seeds and provides a foundation for future studies on the specific functions of these miRNAs in seed development.

  9. Expression analysis of a tyrosinase promoter sequence in zebrafish.

    PubMed

    Camp, Esther; Badhwar, Prerna; Mann, Graham J; Lardelli, Michael

    2003-04-01

    Sequence comparisons and functional analysis of the 5' upstream regions of tyrosinase genes have revealed the importance of cis-regulatory elements acting to control the spatiotemporal expression of tyrosinase in the melanocytes and retinal pigmented epithelium of developing embryos. To date there are no reports addressing the control of tyrosinase gene transcription in zebrafish, a vertebrate model organism of increasing importance. To exploit the tyrosinase gene as a marker in zebrafish we set out to clone its promoter and analyse its regulation during embryogenesis. Amplification of a zebrafish tyrosinase complementary DNA fragment by reverse transcriptase polymerase chain reaction allowed us to isolate and sequence a 1041 nt genomic DNA fragment that includes a transcription initiation site and 73 nt of the open reading frame. Bioinformatic analysis of this genomic sequence revealed five E-box motifs, including one CATGTG type E-box present in a putative initiation region. These are conserved positive regulatory elements in vertebrate tyrosinase promoters. We show that a region of 814 nt upstream from the translation start site of the zebrafish tyrosinase gene can drive expression in retinal pigmented epithelium in transiently transgenic zebrafish embryos but that its activity is not restricted to melanin-producing cells. This region is unable to drive transcription in human melanoma cell lines. Ectopic expression from this zebrafish tyrosinase promoter fragment is probably due to the absence of positive and negative cis-regulatory elements, such as a tyrosinase distal element, which is known to function as a pigment cell-specific enhancer.
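
    The E-box search mentioned in the abstract above can be illustrated with a minimal sketch, assuming the canonical E-box consensus CANNTG (of which CATGTG is one instance). The abstract does not name the bioinformatic tools the authors used, so the function name and the example sequence below are hypothetical.

```python
import re

# Canonical E-box consensus CANNTG; the lookahead lets overlapping motifs be reported.
EBOX_PATTERN = re.compile(r"(?=(CA[ACGT]{2}TG))")

def find_eboxes(sequence):
    """Return (0-based position, motif) pairs for every E-box in a DNA sequence."""
    sequence = sequence.upper()
    return [(m.start(), m.group(1)) for m in EBOX_PATTERN.finditer(sequence)]

# Hypothetical example sequence (not from the zebrafish tyrosinase promoter).
example = "AACATGTGTTCAGCTGAA"
for position, motif in find_eboxes(example):
    print(position, motif)
```

    Only the forward strand is scanned here; because CANNTG is its own reverse complement as a consensus, every E-box site is still found, although the motif reported is the forward-strand reading.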

  10. Genetic mapping of expressed sequences in onion and in silico comparisons with rice show scant colinearity.

    PubMed

    Martin, William J; McCallum, John; Shigyo, Masayoshi; Jakse, Jernej; Kuhl, Joseph C; Yamane, Naoko; Pither-Joyce, Meeghan; Gokce, Ali Fuat; Sink, Kenneth C; Town, Christopher D; Havey, Michael J

    2005-10-01

    The Poales (which include the grasses) and Asparagales [which include onion (Allium cepa L.) and other Allium species] are the two most economically important monocot orders. Enormous genomic resources have been developed for the grasses; however, their applicability to other major monocot groups, such as the Asparagales, is unclear. Expressed sequence tags (ESTs) from onion that showed significant similarities (80% similarity over at least 70% of the sequence) to single positions in the rice genome were selected. One hundred new genetic markers developed from these ESTs were added to the intraspecific map derived from the BYG15-23xAC43 segregating family, producing 14 linkage groups encompassing 1,907 cM at LOD 4. Onion linkage groups were assigned to chromosomes using alien addition lines of Allium fistulosum L. carrying single onion chromosomes. Visual comparisons of genetic linkage in onion with physical linkage in rice revealed scant colinearity; however, short regions of colinearity could be identified. Our results demonstrate that the grasses may not be appropriate genomic models for other major monocot groups such as the Asparagales; this will make it necessary to develop genomic resources for these important plants. PMID:16025250
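
    The EST selection criterion stated above (80% similarity over at least 70% of the sequence, matching a single position in the rice genome) can be sketched as a simple filter. This is an illustration only: the `Hit` record and its field names are hypothetical, and the abstract does not say which alignment software or output format was used.

```python
from dataclasses import dataclass

@dataclass
class Hit:
    """Hypothetical summary of one onion EST aligned against the rice genome."""
    est_id: str
    est_length: int      # length of the EST in bases
    aligned_length: int  # bases of the EST covered by the alignment
    similarity: float    # percent similarity over the aligned region
    n_locations: int     # number of distinct rice genome positions matched

def passes_filter(hit, min_similarity=80.0, min_coverage=0.70):
    """Apply the thresholds quoted in the abstract to one alignment summary."""
    coverage = hit.aligned_length / hit.est_length
    return (
        hit.similarity >= min_similarity
        and coverage >= min_coverage
        and hit.n_locations == 1
    )

# Hypothetical records for illustration.
hits = [
    Hit("est_a", 500, 400, 85.0, 1),  # meets all three conditions
    Hit("est_b", 500, 300, 90.0, 1),  # covers only 60% of the EST
    Hit("est_c", 500, 450, 95.0, 2),  # matches two genome positions
]
selected = [h.est_id for h in hits if passes_filter(h)]
print(selected)
```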

  11. CSTminer: a web tool for the identification of coding and noncoding conserved sequence tags through cross-species genome comparison.

    PubMed

    Castrignanò, Tiziana; Canali, Alessandro; Grillo, Giorgio; Liuni, Sabino; Mignone, Flavio; Pesole, Graziano

    2004-07-01

    The identification and characterization of genome tracts that are highly conserved across species during evolution may contribute significantly to the functional annotation of whole-genome sequences. Indeed, such sequences are likely to correspond to known or unknown coding exons or regulatory motifs. Here, we present a web server implementing a previously developed algorithm that, by comparing user-submitted genome sequences, is able to identify statistically significant conserved blocks and assess their coding or noncoding nature through the measure of a coding potential score. The web tool, available at http://www.caspur.it/CSTminer/, is dynamically interconnected with the Ensembl genome resources and produces a graphical output showing a map of detected conserved sequences and annotated gene features.

  12. Parallel tagged amplicon sequencing of relatively long PCR products using the Illumina HiSeq platform and transcriptome assembly.

    PubMed

    Feng, Yan-Jie; Liu, Qing-Feng; Chen, Meng-Yun; Liang, Dan; Zhang, Peng

    2016-01-01

    In phylogenetics and population genetics, a large number of loci are often needed to accurately resolve species relationships. Normally, loci are enriched by PCR and sequenced by Sanger sequencing, which is expensive when the number of amplicons is large. Next-generation sequencing (NGS) techniques are increasingly used for parallel amplicon sequencing, which reduces sequencing costs tremendously, but has not reduced preparation costs very much. Moreover, for most current NGS methods, amplicons need to be purified and quantified before sequencing and their lengths are also restricted (normally <700 bp). Here, we describe an approach to sequence pooled amplicons of any length using the Illumina platform. Using this method, amplicons are pooled at equal volume rather than at equal concentration, thus eliminating the laborious purification and quantification steps. We then shear the pooled amplicons, repair the ends, add sample identifying linkers and pool multiple samples prior to Illumina library preparation. Data are then assembled using the transcriptome assembly program Trinity, which is optimized to deal with templates of highly varying quantities. We demonstrated the utility of our approach by recovering 93.5% of the target amplicons (size up to 1650 bp) in full length for a 16 taxa × 101 loci project, using ~2.0 GB of Illumina HiSeq paired-end 90-bp data. Overall, we validate a rapid, cost-effective and scalable approach to sequence a large number of targeted loci from a large number of samples that is particularly suitable for both phylogenetics and population genetics studies that require a modest scale of data. PMID:25959587

  14. Investigating the genetics of Bti resistance using mRNA tag sequencing: application on laboratory strains and natural populations of the dengue vector Aedes aegypti

    PubMed Central

    Paris, Margot; Marcombe, Sebastien; Coissac, Eric; Corbel, Vincent; David, Jean-Philippe; Després, Laurence

    2013-01-01

    Mosquito control is often the main method used to reduce mosquito-transmitted diseases. In order to investigate the genetic basis of resistance to the bio-insecticide Bacillus thuringiensis subsp. israelensis (Bti), we used information on polymorphism obtained from cDNA tag sequences from pooled larvae of laboratory Bti-resistant and susceptible Aedes aegypti mosquito strains to identify and analyse 1520 single nucleotide polymorphisms (SNPs). Of the 372 SNPs tested, 99.2% were validated using DNA Illumina GoldenGate® array, with a strong correlation between the allelic frequencies inferred from the pooled and individual data (r = 0.85). A total of 11 genomic regions and five candidate genes were detected using a genome scan approach. One of these candidate genes showed significant departures from neutrality in the resistant strain at sequence level. Six natural populations from Martinique Island were sequenced for the 372 tested SNPs with a high transferability (87%), and association mapping analyses detected 14 loci associated with Bti resistance, including one located in a putative receptor for Cry11 toxins. Three of these loci were also significantly differentiated between the laboratory strains, suggesting that most of the genes associated with resistance might differ between the two environments. It also suggests that common selected regions might harbour key genes for Bti resistance. PMID:24187584

  15. Serial analysis of gene expression.

    PubMed

    Velculescu, V E; Zhang, L; Vogelstein, B; Kinzler, K W

    1995-10-20

    The characteristics of an organism are determined by the genes expressed within it. A method was developed, called serial analysis of gene expression (SAGE), that allows the quantitative and simultaneous analysis of a large number of transcripts. To demonstrate this strategy, short diagnostic sequence tags were isolated from pancreas, concatenated, and cloned. Manual sequencing of 1000 tags revealed a gene expression pattern characteristic of pancreatic function. New pancreatic transcripts corresponding to novel tags were identified. SAGE should provide a broadly applicable means for the quantitative cataloging and comparison of expressed genes in a variety of normal, developmental, and disease states. PMID:7570003
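
    The quantitative idea behind SAGE, that the number of times a tag is observed reflects the abundance of its transcript, can be sketched as follows. This is a simplified illustration that assumes the tags have already been extracted from the sequenced concatemers; the function name and the tag sequences are hypothetical.

```python
from collections import Counter

def tag_abundance(tags):
    """Return each tag's share of all observed tags, most frequent first."""
    counts = Counter(tags)
    total = sum(counts.values())
    return {tag: n / total for tag, n in counts.most_common()}

# Hypothetical tags for illustration (not data from the study).
observed = ["GACCTGGAG", "GACCTGGAG", "TTCATACAC", "GACCTGGAG"]
for tag, fraction in tag_abundance(observed).items():
    print(tag, fraction)
```

    Mapping each tag back to the transcript it came from requires a reference database and is not shown here.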

  16. Use of the myosin motor domain as large-affinity tag for the expression and purification of proteins in Dictyostelium discoideum.

    PubMed

    Kollmar, Martin

    2006-08-15

    The cellular slime mold Dictyostelium discoideum is increasingly being used for the overexpression of proteins. Dictyostelium is amenable to classical and molecular genetic approaches and can easily be grown in large quantities. It contains a variety of chaperones and folding enzymes, and is able to perform all kinds of post-translational protein modifications. Here, new expression vectors are presented that have been designed for the production of proteins in large quantities for biochemical and structural studies. The expression cassettes of the most successful vectors are based on a tandem affinity purification tag consisting of an octahistidine tag followed by the myosin motor domain tag. The myosin motor domain not only strongly enhances the production of fused proteins but is also used for a fast affinity purification step through its ATP-dependent binding to actin. The applicability of the new system has been demonstrated for the expression and purification of subunits of the dynein-dynactin motor protein complex from different species. PMID:16516959

  17. Bacterial diversity assessment of pristine mangrove microbial community from Dhulibhashani, Sundarbans using 16S rRNA gene tag sequencing.

    PubMed

    Basak, Pijush; Pramanik, Arnab; Sengupta, Sohan; Nag, Sudip; Bhattacharyya, Anish; Roy, Debojyoti; Pattanayak, Rudradip; Ghosh, Abhrajyoti; Chattopadhyay, Dhrubajyoti; Bhattacharyya, Maitree

    2016-03-01

    Knowledge of microbial diversity and function in the Sundarbans ecosystem is still scarce, despite global advances in understanding microbial diversity. In the present study, we have analyzed the diversity and distribution of bacteria in the tropical mangrove sediments of Sundarbans using 16S rRNA gene amplicon sequencing. The metagenome comprises 153,926 sequences with 108.8 Mbp of data and 55 ± 2% G + C content. Metagenome sequence data are available at NCBI under the Bioproject database with accession no. PRJNA245459. Bacterial community metagenome sequences were analyzed with MG-RAST software, which indicated the presence of 56,547 species belonging to 44 different phyla. The taxonomic analysis revealed the dominance of the phylum Proteobacteria within our dataset. Further taxonomic analysis revealed the abundance of the Bacteroidetes, Acidobacteria, Firmicutes, Actinobacteria, Nitrospirae, Cyanobacteria, Planctomycetes and Fusobacteria groups as the predominant bacterial assemblages in this largely pristine mangrove habitat. The distribution of the different community datasets, obtained from four sediment samples originating from one sampling station at two different depths, provides a better understanding of the sediment bacterial diversity and its relationship to the ecosystem dynamics of this pristine mangrove sediment of Dhulibhashani in the Sundarbans.

  18. Expression of the cell adhesion proteins BEN/SC1/DM-GRASP and TAG-1 defines early steps of axonogenesis in the human spinal cord.

    PubMed

    Karagogeos, D; Pourquié, C; Kyriakopoulou, K; Tavian, M; Stallcup, W; Péault, B; Pourquié, O

    1997-03-17

    We have studied the expression pattern of two cell adhesion proteins of the immunoglobulin (Ig) superfamily, BEN/SC1/DM-GRASP (BEN) and the transient axonal glycoprotein TAG-1, during the development of the human nervous system. This study was performed by immunocytochemistry on sections of human embryos ranging from 4 to 13 weeks postconception. The overall distribution of the two proteins during development is very similar to that reported in other vertebrate species, but several important differences have been observed. Both proteins exhibit a transient expression on selected neuronal populations, which include the motor and the sensory neurons. In addition, BEN was also detected on virtually all neurons derived from the neural crest as well as in nonneuronal tissues. A major difference of expression with the chick embryo is that, in the motor neurons, BEN expression was not observed at early stages of development, thus arguing against a role of this molecule in pathfinding and fasciculation. BEN was observed to be restricted to subsets of motor neurons, such as the medial column at the upper limb level. Expression was also detected in a laterodorsal population of the ventral horn cells, which are likely to correspond to migrating preganglionic neurons that originate from the motor pool at the thoracic level. TAG-1 was found on commissural neurons and weakly on the sympathetic neurons; it was also detected on restricted nonneuronal populations. In addition, we observed TAG-1 expression in fibers that could correspond either to subsets of dorsal root ganglia (DRGs) central afferences (including the Ia fibers) or to the axons of association interneurons and in scattered motoneurons likely to correspond either to preganglionic neurons, to gamma-motoneurons, or to late-born motoneurons. Therefore, our results indicate that the molecular strategies used to establish the axonal scaffolding of the nervous system in humans are extremely conserved among the different

  19. Properties of wild-type and fluorescent protein-tagged mouse tetrodotoxin-resistant sodium channel (NaV1.8) heterologously expressed in rat sympathetic neurons.

    PubMed

    Schofield, Geoffrey G; Puhl, Henry L; Ikeda, Stephen R

    2008-04-01

    The tetrodotoxin (TTX)-resistant Na(+) current arising from Na(V)1.8-containing channels participates in nociceptive pathways but is difficult to functionally express in traditional heterologous systems. Here, we show that injection of cDNA encoding mouse Na(V)1.8 into the nuclei of rat superior cervical ganglion (SCG) neurons results in TTX-resistant Na(+) currents with amplitudes equal to or exceeding the currents arising from natively expressing channels of mouse dorsal root ganglion (DRG) neurons. The activation and inactivation properties of the heterologously expressed Na(V)1.8 Na(+) channels were similar but not identical to native TTX-resistant channels. Most notably, the half-activation potential of the heterologously expressed Na(V)1.8 channels was shifted about 10 mV toward more depolarized potentials. Fusion of fluorescent proteins to the N- or C-termini of Na(V)1.8 did not substantially affect functional expression in SCG neurons. Unexpectedly, fluorescence was not concentrated at the plasma membrane but found throughout the interior of the neuron in a granular pattern. A similar expression pattern was observed in nodose ganglion neurons expressing the tagged channels. In contrast, expression of tagged Na(V)1.8 in HeLa cells revealed a fluorescence pattern consistent with sequestration in the endoplasmic reticulum, thus providing a basis for poor functional expression in clonal cell lines. Our results establish SCG neurons as a favorable surrogate for the expression and study of molecularly defined Na(V)1.8-containing channels. The data also indicate that unidentified factors may be required for the efficient functional expression of Na(V)1.8 with a biophysical phenotype identical to that found in sensory neurons. PMID:18272876

  20. Comparison of direct boiling method with commercial kits for extracting fecal microbiome DNA by Illumina sequencing of 16S rRNA tags.

    PubMed

    Peng, Xin; Yu, Ke-Qiang; Deng, Guan-Hua; Jiang, Yun-Xia; Wang, Yu; Zhang, Guo-Xia; Zhou, Hong-Wei

    2013-12-01

    Low cost and high throughput capacity are major advantages of using next generation sequencing (NGS) techniques to determine metagenomic 16S rRNA tag sequences. These methods have significantly changed our view of microorganisms in the fields of human health and environmental science. However, DNA extraction using commercial kits has the shortcomings of high cost and time constraints. In the present study, we evaluated the determination of fecal microbiomes using a direct boiling method compared with 5 different commercial extraction methods, e.g., Qiagen and MO BIO kits. Principal coordinate analysis (PCoA) using UniFrac distances and clustering showed that direct boiling of a wide range of feces concentrations gave a similar pattern of bacterial communities as those obtained from most of the commercial kits, with the exception of the MO BIO method. Fecal concentration in the boiling method affected the estimation of α-diversity indices; otherwise, results were generally comparable between boiling and commercial methods. The operational taxonomic units (OTUs) determined through direct boiling showed highly consistent frequencies with those determined through most of the commercial methods. Even those for the MO BIO kit were also obtained by the direct boiling method with high confidence. The present study suggests that direct boiling could be used to determine the fecal microbiome and that using this method would significantly reduce the cost and improve the efficiency of sample preparation for studying gut microbiome diversity.
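
    The abstract above refers to α-diversity indices without naming them. As one common example, the Shannon index can be computed from OTU counts as in the sketch below; the choice of this index, the function name and the counts are assumptions made for illustration, not details from the study.

```python
import math

def shannon_index(otu_counts):
    """Shannon diversity H = -sum(p_i * ln p_i) over OTUs with non-zero counts."""
    total = sum(otu_counts)
    if total == 0:
        raise ValueError("at least one OTU must have a non-zero count")
    proportions = [count / total for count in otu_counts if count > 0]
    return -sum(p * math.log(p) for p in proportions)

# Hypothetical OTU counts for two samples.
even_sample = [25, 25, 25, 25]   # four equally abundant OTUs
skewed_sample = [97, 1, 1, 1]    # one dominant OTU
print(shannon_index(even_sample), shannon_index(skewed_sample))
```

    A community with evenly distributed OTUs yields a higher value than one dominated by a single OTU, which is why changes in extraction efficiency can shift such indices.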

  1. Dynamic changes in the composition of photosynthetic picoeukaryotes in the northwestern Pacific Ocean revealed by high-throughput tag sequencing of plastid 16S rRNA genes.

    PubMed

    Choi, Dong H; An, Sung M; Chun, Sungjun; Yang, Eun C; Selph, Karen E; Lee, Charity M; Noh, Jae H

    2016-02-01

    Photosynthetic picoeukaryotes (PPEs) are major oceanic primary producers. However, the diversity of such communities remains poorly understood, especially in the northwestern (NW) Pacific. We investigated the abundance and diversity of PPEs, and recorded environmental variables, along a transect from the coast to the open Pacific Ocean. High-throughput tag sequencing (using the MiSeq system) revealed the diversity of plastid 16S rRNA genes. The dominant PPEs changed at the class level along the transect. Prymnesiophyceae were the only dominant PPEs in the warm pool of the NW Pacific, but Mamiellophyceae dominated in coastal waters of the East China Sea. Phylogenetically, most Prymnesiophyceae sequences could not be resolved at lower taxonomic levels because no close relatives have been cultured. Within the Mamiellophyceae, the genera Micromonas and Ostreococcus dominated in marginal coastal areas affected by open water, whereas Bathycoccus dominated in the lower euphotic depths of oligotrophic open waters. Cryptophyceae and Phaeocystis (of the Prymnesiophyceae) dominated in areas affected principally by coastal water. We also defined the biogeographical distributions of Chrysophyceae, prasinophytes, Bacillariophyceae and Pelagophyceae. These distributions were influenced by temperature, salinity and chlorophyll a and nutrient concentrations. PMID:26712350

  3. Automated Workflow for Preparation of cDNA for Cap Analysis of Gene Expression on a Single Molecule Sequencer

    PubMed Central

    Nagao-Sato, Sayaka; Saijo, Eri; Lassmann, Timo; Kanamori-Katayama, Mutsumi; Kaiho, Ai; Lizio, Marina; Kawaji, Hideya; Carninci, Piero; Forrest, Alistair R. R.; Hayashizaki, Yoshihide

    2012-01-01

    Background: Cap analysis of gene expression (CAGE) is a 5′ sequence tag technology to globally determine transcriptional starting sites in the genome and their expression levels and has most recently been adapted to the HeliScope single molecule sequencer. Despite significant simplifications in the CAGE protocol, it has until now been a labour intensive protocol. Methodology: In this study we set out to adapt the protocol to a robotic workflow, which would increase throughput and reduce handling. The automated CAGE cDNA preparation system we present here can prepare 96 ‘HeliScope ready’ CAGE cDNA libraries in 8 days, as opposed to 6 weeks by a manual operator. We compare the results obtained using the same RNA in manual libraries and across multiple automation batches to assess reproducibility. Conclusions: We show that the sequencing was highly reproducible and comparable to manual libraries with an 8-fold increase in productivity. The automated CAGE cDNA preparation system can prepare 96 CAGE sequencing samples simultaneously. Finally we discuss how the system could be used for CAGE on Illumina/SOLiD platforms, RNA-seq and full-length cDNA generation. PMID:22303458

  4. Expression Platforms for Producing Eukaryotic Proteins: A Comparison of E. coli Cell-Based and Wheat Germ Cell-Free Synthesis, Affinity and Solubility Tags, and Cloning Strategies

    PubMed Central

    Aceti, David J.; Bingman, Craig A.; Wrobel, Russell L.; Frederick, Ronnie O.; Makino, Shin-ichi; Nichols, Karl W.; Sahu, Sarata C.; Bergeman, Lai F.; Blommel, Paul G.; Cornilescu, Claudia C.; Gromek, Katarzyna A.; Seder, Kory D.; Hwang, Soyoon; Primm, John G.; Sabat, Grzegorz; Vojtik, Frank C.; Volkman, Brian F.; Zolnai, Zsolt; Phillips, George N.; Markley, John L.; Fox, Brian G.

    2015-01-01

    Vectors designed for protein production in Escherichia coli and by wheat germ cell-free translation were tested using 21 well-characterized eukaryotic proteins chosen to serve as controls within the context of a structural genomics pipeline. The controls were carried through cloning, small-scale expression trials, large-scale growth or synthesis, and purification. Successfully purified proteins were also subjected to either crystallization trials or 1H-15N HSQC NMR analyses. Experiments evaluated: (1) the relative efficacy of restriction/ligation and recombinational cloning systems; (2) the value of maltose-binding protein (MBP) as a solubility enhancement tag; (3) the consequences of in vivo proteolysis of the MBP fusion as an alternative to post-purification proteolysis; (4) the effect of the level of LacI repressor on the yields of protein obtained from E. coli using autoinduction; (5) the consequences of removing the His tag from proteins produced by the cell-free system; and (6) the comparative performance of E. coli cells or wheat germ cell-free translation. Optimal promoter/repressor and fusion tag configurations for each expression system are discussed. PMID:25854603

  5. Expression dynamics and ultrastructural localization of epitope-tagged Abutilon mosaic virus nuclear shuttle and movement proteins in Nicotiana benthamiana cells

    SciTech Connect

    Kleinow, Tatjana; Tanwir, Fariha; Kocher, Cornelia; Krenz, Bjoern; Wege, Christina; Jeske, Holger

    2009-09-01

    The geminivirus Abutilon mosaic virus (AbMV) encodes two proteins which are essential for viral spread within plants. The nuclear shuttle protein (NSP) transfers viral DNA between the nucleus and cytoplasm, whereas the movement protein (MP) facilitates transport between cells through plasmodesmata and long-distance via phloem. An inducible overexpression system for epitope-tagged NSP and MP in plants yielded unprecedented amounts of both proteins. Western blots revealed extensive posttranslational modification and truncation for MP, but not for NSP. Ultrastructural examination of Nicotiana benthamiana tissues showed characteristic nucleopathic alterations, including fibrillar rings, when epitope-tagged NSP and MP were simultaneously expressed in leaves locally infected with an AbMV DNA A in which the coat protein gene was replaced by a green fluorescent protein encoding gene. Immunogold labelling localized NSP in the nucleoplasm and in the fibrillar rings. MP appeared at the cell periphery, probably the plasma membrane, and plasmodesmata.

  6. Expression and phosphorylation state analysis of intracellular protein kinases using Multi-PK antibody and Phos-tag SDS-PAGE

    PubMed Central

    Sugiyama, Yasunori; Katayama, Syouichi; Kameshita, Isamu; Morisawa, Keiko; Higuchi, Takuma; Todaka, Hiroshi; Kinoshita, Eiji; Kinoshita-Kikuta, Emiko; Koike, Tohru; Taniguchi, Taketoshi; Sakamoto, Shuji

    2015-01-01

    Protein kinase expression and activity play important roles in diverse cellular functions through regulation of phosphorylation signaling. The most commonly used tools for detecting protein kinases are protein kinase-specific antibodies, and phosphorylation site-specific antibodies are used for detecting activated protein kinases. With these antibodies, only one kinase can be analyzed at a time; a method for analyzing the expression and activation of a panel of protein kinases in cells has not been established. Therefore, we developed a combined method using Multi-PK antibody and Phos-tag SDS-PAGE for profiling the expression and phosphorylation state of intracellular protein kinases. Using the new method, changes in the expression and phosphorylation state of various protein kinases were detected in cells treated with an anticancer agent that inhibits multiple tyrosine kinase activities. Therefore, the new method is a useful technique for analysis of intracellular protein kinases. • Multi-PK antibody recognizes a wide variety of protein kinases in various species. • Using Phos-tag SDS-PAGE, phosphorylated proteins are visualized as slower migration bands compared with corresponding non-phosphorylated proteins. • This combined method can be used for detecting changes in the expression and phosphorylation state of various intracellular protein kinases. PMID:26844212

  7. Laminin A chain: expression during Drosophila development and genomic sequence.

    PubMed Central

    Kusche-Gullberg, M; Garrison, K; MacKrell, A J; Fessler, L I; Fessler, J H

    1992-01-01

    A Drosophila laminin A chain gene was characterized as a 14 kb genomic nucleotide sequence which encodes an open reading frame of 3712 amino acids in 15 exons. Overall, this A chain is similar to its vertebrate counterparts, especially in its N- and C-terminal globular domains, but the sequence that forms the laminin A short arm is quite different and larger. Laminin messages appear in newly formed mesoderm and are later prominently expressed in hemocytes, which also synthesize basement membrane collagen IV. The composition of Drosophila basement membranes changes with development. A novel method of tandemly fused RNA probes showed that developmental increases of laminin mRNAs were primarily associated with periods of morphogenesis, and preceded those of collagen IV, a protein strongly expressed during growth. The ratio of A:B1:B2 mRNAs varied little during embryogenesis, with less mRNA for A than B chains. Staining of embryos with antibodies confirmed and extended the information provided by in situ hybridization. Homologs of the G-subdomains of this A chain, which occur in interacting regions of agrin, perlecan, laminin and sex steroid binding protein, may be involved in protein associations. Images PMID:1425586

  8. A simple and effective strategy for solving the problem of inclusion bodies in recombinant protein technology: His-tag deletions enhance soluble expression.

    PubMed

    Zhu, Shaozhou; Gong, Cuiyu; Ren, Lu; Li, Xingzhou; Song, Dawei; Zheng, Guojun

    2013-01-01

    The formation of inclusion bodies (IBs) in recombinant protein biotechnology has become one of the most frequent undesirable occurrences in both research and industrial applications. So far, the pET System is the most powerful system developed for the production of recombinant proteins when Escherichia coli is used as the microbial cell factory. Also, using fusion tags to facilitate detection and purification of the target protein is a commonly used tactic. However, there is still a large fraction of proteins that cannot be produced in E. coli in a soluble (and hence functional) form. Intensive research efforts have tried to address this issue, and numerous parameters have been modulated to avoid the formation of inclusion bodies. However, hardly anyone has noticed that adding fusion tags to the recombinant protein to facilitate purification is a key factor that affects the formation of inclusion bodies. To test this idea, the industrial biocatalysts uridine phosphorylase from Aeropyrum pernix K1 and (+)-γ-lactamase and (-)-γ-lactamase from Bradyrhizobium japonicum USDA 6 were expressed in E. coli by using the pET System and then examined. We found that using a histidine tag as a fusion partner for protein expression did affect the formation of inclusion bodies in these examples, suggesting that removing the fusion tag can promote the solubility of heterologous proteins. The production of soluble and highly active uridine phosphorylase, (+)-γ-lactamase, and (-)-γ-lactamase in our results shows that the traditional process needs to be reconsidered. Accordingly, a simple and efficient structure-based strategy for the production of valuable soluble recombinant proteins in E. coli is proposed.

  9. De novo transcriptome sequencing and comparative analysis of differentially expressed genes in Gossypium aridum under salt stress.

    PubMed

    Xu, Peng; Liu, Zhangwei; Fan, Xinqi; Gao, Jin; Zhang, Xia; Zhang, Xianggui; Shen, Xinlian

    2013-08-01

    Salinity stress is one of the most serious factors that impede the growth and development of various crops. Wild Gossypium species, which are remarkably tolerant to salt water immersion, are valuable resources for understanding salt tolerance mechanisms of Gossypium and improving salinity resistance in upland cotton. To generate a broad survey of genes with altered expression during various stages of salt stress, a mixed RNA sample was prepared from the roots and leaves of Gossypium aridum plants subjected to salt stress. The transcripts were sequenced using the Illumina sequencing platform. After cleaning and quality checks, approximately 41.5 million clean reads were obtained. These reads were assembled into 98,989 unigenes with a mean size of 452 bp. All unigenes were compared to known cluster of orthologous groups (COG) sequences to predict and classify the possible functions of these genes, which were classified into at least 25 molecular families. Variations in gene expression were then examined after exposing the plants to 200 mM NaCl for 3, 12, 72 or 144 h. Sequencing depths of approximately six million raw tags were achieved for each of the five stages of salt stress. There were 2634 (1513 up-regulated/1121 down-regulated), 2449 (1586 up-regulated/863 down-regulated), 2271 (946 up-regulated/1325 down-regulated) and 3352 (933 up-regulated/2419 down-regulated) genes that were differentially expressed after exposure to NaCl for 3, 12, 72 and 144 h, respectively. Digital gene expression analysis indicated that pathways involved in "transport", "response to hormone stimulus" and "signaling" play important roles during salt stress, while genes involved in "protein kinase activity" and "transporter activity" undergo major changes in expression during early and later stages of salt stress, respectively.
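
    The differentially expressed gene counts reported in the abstract above can be tabulated to make the trend over time easier to see. The sketch below uses only the up- and down-regulated counts quoted in the abstract. The dictionary layout and the `summarize` helper are illustrative assumptions, not part of the study's analysis pipeline.

```python
# Up-/down-regulated gene counts by hours of NaCl exposure, as quoted in the abstract.
deg_counts = {
    3: (1513, 1121),
    12: (1586, 863),
    72: (946, 1325),
    144: (933, 2419),
}

# Totals as stated in the abstract, used here only as a consistency check.
reported_totals = {3: 2634, 12: 2449, 72: 2271, 144: 3352}


def summarize(up_down):
    """Return the total and the up-regulated fraction for each time point."""
    rows = {}
    for hours, (up, down) in up_down.items():
        total = up + down
        rows[hours] = {"total": total, "up_fraction": up / total}
    return rows


summary = summarize(deg_counts)
for hours in sorted(summary):
    row = summary[hours]
    print(f"{hours:>3} h: total={row['total']}, up-regulated={row['up_fraction']:.1%}")
```

    The up-regulated fraction is above one half at 3 h and 12 h and below one half at 72 h and 144 h, so the balance moves from mostly up-regulated to mostly down-regulated genes as exposure lengthens.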

  10. The dynamics of the bacterial diversity in the redox transition and anoxic zones of the Cariaco Basin assessed by parallel tag sequencing.

    PubMed

    Rodriguez-Mora, Maria J; Scranton, Mary I; Taylor, Gordon T; Chistoserdov, Andrei Y

    2015-09-01

    Massively parallel tag sequencing was applied to describe the bacterial diversity in the redox transition and anoxic zones of the Cariaco Basin. In total, 14 samples from the Cariaco Basin were collected over a period of eight years from two stations. A total of 244 357 unique bacterial V6 amplicons were sequenced. The total number of operational taxonomic units (OTUs) found in this study was 4692, with a range of 511-1491 OTUs per sample. Approximately 95% of the OTUs found in the redox transition zone and anoxic layers of Cariaco are represented by less than 50 amplicons suggesting that only about 5% of the bacterial OTUs are responsible for the bulk of the microbial processes in the basin redox transition and anoxic zones. The same dominant OTUs were observed across all eight years of sampling although periodic fluctuations in their proportion were apparent. No distinctive differences were observed between the bacterial communities from the redox transition and anoxic layers of the Cariaco Basin water column. The largest proportion of amplicons belongs to Gammaproteobacteria represented mostly by sulfide oxidizers, followed by Marine Group A (originally described as SAR406; Gordon and Giovannoni 1996), a group of uncultured bacteria hypothesized to be involved in metal reduction, and sulfate-reducing Deltaproteobacteria. Gammaproteobacteria, Deltaproteobacteria and Marine Group A make up 67-90% of all V6 amplicons sequenced in this study. This strongly suggests that the basin's microbial communities are actively involved in the sulfur-related metabolism and coupling of the sulfur and carbon cycles. According to detrended canonical correspondence analysis, ecological factors such as chemoautotrophy, nitrate and oxidized and reduced sulfur compounds influence the structuring and distribution of the Cariaco microbial communities. PMID:26209697
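
    The abstract above states that about 95% of OTUs are represented by fewer than 50 amplicons. A minimal sketch of how such a figure could be derived from a per-OTU amplicon count table follows. The counts below are hypothetical, chosen only so that the rare fraction is 95%. They are not the Cariaco Basin data, and the function names are illustrative.

```python
def rare_otu_fraction(amplicons_per_otu, threshold=50):
    """Fraction of OTUs represented by fewer than `threshold` amplicons."""
    if not amplicons_per_otu:
        raise ValueError("at least one OTU is required")
    rare = sum(1 for n in amplicons_per_otu if n < threshold)
    return rare / len(amplicons_per_otu)


def abundant_read_share(amplicons_per_otu, threshold=50):
    """Share of all amplicons that belong to OTUs at or above `threshold`."""
    total = sum(amplicons_per_otu)
    abundant = sum(n for n in amplicons_per_otu if n >= threshold)
    return abundant / total


# Hypothetical community: 2 dominant OTUs and 38 rare OTUs (illustrative only).
counts = [5000, 3000] + [10] * 38

print(f"rare OTUs: {rare_otu_fraction(counts):.0%}")
print(f"reads in abundant OTUs: {abundant_read_share(counts):.1%}")
```

    In this toy community the 5% of OTUs above the threshold hold most of the reads, which is the kind of skewed rank-abundance pattern the abstract describes.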

  12. Active populations of rare microbes in oceanic environments as revealed by bromodeoxyuridine incorporation and 454 tag sequencing.

    PubMed

    Hamasaki, Koji; Taniguchi, Akito; Tada, Yuya; Kaneko, Ryo; Miki, Takeshi

    2016-02-01

    The "rare biosphere" consisting of thousands of low-abundance microbial taxa is important as a seed bank or a gene pool to maintain microbial functional redundancy and robustness of the ecosystem. Here we investigated contemporaneous growth of diverse microbial taxa including rare taxa and determined their variability in environmentally distinctive locations along a north-south transect in the Pacific Ocean in order to assess which taxa were actively growing and how environmental factors influenced bacterial community structures. A bromodeoxyuridine-labeling technique in combination with PCR amplicon pyrosequencing of 16S rRNA genes gave 215-793 OTUs from 1200 to 3500 unique sequences in the total communities and 175-299 OTUs from nearly 860 to 1800 sequences in the active communities. Unexpectedly, many of the active OTUs were not detected in the total fractions. Among these active but rare OTUs, some taxa (2-4% of rare OTUs) showed much higher abundance (>0.10% of total reads) in the active fraction than in the total fraction, suggesting that their contribution to bacterial community productivity or growth was much larger than that expected from their standing stocks at each location. An ordination plot from the principal component analysis showed that bacterial community compositions among the 4 sampling locations and between total and active fractions were distinct from each other. A redundancy analysis revealed that the variability of community compositions correlated significantly with seawater temperature and dissolved oxygen concentration. Also, a variation partitioning analysis showed that the environmental factors explained 49% of the variability of community compositions and the distance explained only 4.0% of its variability. These results implied very dynamic change of community structures due to environmental filtering. The active bacterial populations are more diverse and spread further in the rare biosphere than we have ever seen. 
This study implied that rare

  14. Insilico analysis of three different tag polypeptides with dual roles in scFv antibodies.

    PubMed

    Mohammadi, Mozafar; Nejatollahi, Foroogh; Sakhteman, Amirhossein; Zarei, Neda

    2016-08-01

    Single chain fragment variable (scFv) antibodies are composed of variable heavy (VH) and variable light (VL) domains that are joined by a polypeptide linker. Typically, a (Gly4Ser)n sequence is used as a linker to retain the integrity of the antigen-binding domain. Due to its low immunogenicity, this sequence cannot be used as a tag for scFv detection and purification. Several lines of evidence have shown that the addition of an N- or C-terminal tag for scFv detection and purification results in decreased expression and binding capacity of this antibody fragment. In this study, we substituted the traditional linker (GGGGS) with His-tag, C-myc or E-tag sequences through molecular modeling. Stability and integrity of all models were assessed by molecular dynamic (MD) simulation. Based on MD simulation analysis, the model containing the E-tag sequence as a linker showed greater stability than the other molecules. The results suggest that E-tag not only can be substituted for the traditional linker but also eliminates the need for an additional tag for scFv detection and purification. PMID:27113782

  15. Spatiotemporal analysis of bacterial diversity in sediments of Sundarbans using parallel 16S rRNA gene tag sequencing.

    PubMed

    Basak, Pijush; Majumder, Niladri Shekhar; Nag, Sudip; Bhattacharyya, Anish; Roy, Debojyoti; Chakraborty, Arpita; SenGupta, Sohan; Roy, Arunava; Mukherjee, Arghya; Pattanayak, Rudradip; Ghosh, Abhrajyoti; Chattopadhyay, Dhrubajyoti; Bhattacharyya, Maitree

    2015-04-01

    The influence of temporal and spatial variations on the microbial community composition was assessed in the unique coastal mangrove of Sundarbans using parallel 16S rRNA gene pyrosequencing. The total sediment DNA was extracted and subjected to the 16S rRNA gene pyrosequencing, which resulted in 117 Mbp of data from three experimental stations. The taxonomic analysis of the pyrosequencing data was grouped into 24 different phyla. In general, Proteobacteria were the most dominant phyla with predominance of Deltaproteobacteria, Alphaproteobacteria, and Gammaproteobacteria within the sediments. Besides Proteobacteria, there are a number of sequences affiliated to the following major phyla detected in all three stations in both the sampling seasons: Actinobacteria, Bacteroidetes, Planctomycetes, Acidobacteria, Chloroflexi, Cyanobacteria, Nitrospira, and Firmicutes. Further taxonomic analysis revealed abundance of micro-aerophilic and anaerobic microbial population in the surface layers, suggesting anaerobic nature of the sediments in Sundarbans. The results of this study add valuable information about the composition of microbial communities in Sundarbans mangrove and shed light on possible transformations promoted by bacterial communities in the sediments. PMID:25256302

  16. A high-density genetic recombination map of sequence-tagged sites for sorghum, as a framework for comparative structural and evolutionary genomics of tropical grains and grasses.

    PubMed Central

    Bowers, John E; Abbey, Colette; Anderson, Sharon; Chang, Charlene; Draye, Xavier; Hoppe, Alison H; Jessup, Russell; Lemke, Cornelia; Lennington, Jennifer; Li, Zhikang; Lin, Yann-Rong; Liu, Sin-Chieh; Luo, Lijun; Marler, Barry S; Ming, Reiguang; Mitchell, Sharon E; Qiang, Dou; Reischmann, Kim; Schulze, Stefan R; Skinner, D Neil; Wang, Yue-Wen; Kresovich, Stephen; Schertz, Keith F; Paterson, Andrew H

    2003-01-01

    We report a genetic recombination map for Sorghum of 2512 loci spaced at average 0.4 cM (approximately 300 kb) intervals based on 2050 RFLP probes, including 865 heterologous probes that foster comparative genomics of Saccharum (sugarcane), Zea (maize), Oryza (rice), Pennisetum (millet, buffelgrass), the Triticeae (wheat, barley, oat, rye), and Arabidopsis. Mapped loci identify 61.5% of the recombination events in this progeny set and reveal strong positive crossover interference acting across intervals of sequence-tagged sites will foster many structural, functional and evolutionary genomic studies in major food, feed, and biomass crops. PMID:14504243

  17. Sequence and expression of ferredoxin mRNA in barley

    SciTech Connect

    Zielinski, R.; Funder, P.M.; Ling, V.

    1990-05-01

    We have isolated and structurally characterized a full-length cDNA clone encoding ferredoxin from a λgt10 cDNA library prepared from barley leaf mRNA. The ferredoxin clone (pBFD-1) was fused head-to-head with a partial-length cDNA clone encoding calmodulin, and was fortuitously isolated by screening the library with a calmodulin-specific oligonucleotide probe. The mRNA sequence from which pBFD-1 was derived is expressed exclusively in the leaf tissues of 7-d-old barley seedlings. Barley pre-ferredoxin has a predicted size of 15.3 kDal, of which 4.6 kDal are accounted for by the transit peptide. The polypeptide encoded by pBFD-1 is identical to wheat ferredoxin, and shares slightly more amino acid sequence similarity with spinach ferredoxin I than with ferredoxin II. Ferredoxin mRNA levels are rapidly increased 10-fold by white light in etiolated barley leaves.

  18. A genetic toolkit for tagging intronic MiMIC containing genes.

    PubMed

    Nagarkar-Jaiswal, Sonal; DeLuca, Steven Z; Lee, Pei-Tseng; Lin, Wen-Wen; Pan, Hongling; Zuo, Zhongyuan; Lv, Jiangxing; Spradling, Allan C; Bellen, Hugo J

    2015-01-01

    Previously, we described a large collection of Minos-Mediated Integration Cassettes (MiMICs) that contain two phiC31 recombinase target sites and allow the generation of a new exon that encodes a protein tag when the MiMIC is inserted in a coding intron (Nagarkar-Jaiswal et al., 2015). These modified genes permit numerous applications including assessment of protein expression pattern, identification of protein interaction partners by immunoprecipitation followed by mass spec, and reversible removal of the tagged protein in any tissue. At present, these conversions remain time- and labor-intensive as they require embryos to be injected with plasmid DNA containing the exon tag. In this study, we describe a simple and reliable genetic strategy to tag genes/proteins that contain MiMIC insertions using an integrated exon encoding GFP flanked by FRT sequences. We document the efficiency and tag 60 mostly uncharacterized genes.

  19. Shift in prokaryotic diversity in Arctic sediment along a continuum Glacier - River - Fjord using massive 16S rRNA gene tag sequencing

    NASA Astrophysics Data System (ADS)

    Laghdass, M.; Deloffre, J.; Lafite, R.; Hänni, C.; Gillet, B.; Cecillon, S.; Simonet, P.; Petit, F.

    2012-04-01

    In the Arctic environment, one indirect consequence of global climate warming is a significant increase in the amount of inland water during the spring thaw, resulting from melting of the snow cover and permafrost. These freshwater transfers to the coast cause sedimentary transfers. The Arctic fjords, which are deep glacial valleys of the sea, are particularly vulnerable systems. Although previous studies have highlighted the potentially high bacterial diversity in the Arctic environment by pyrosequencing, this new-generation, high-throughput sequencing method does not escape the same biases as classical molecular biology techniques, which arise at different stages of the analysis. In this context, our objective was to characterize the prokaryotic diversity associated with sediment transfer along a gradient from the head of the glacier to mud patch sediment in the Goule river streaming in Kongsfjorden (Svalbard) during an active thaw. The prokaryotic diversity in sediment was characterized by combining massive 16S rRNA gene tag sequencing with a specific and original approach in order to overcome the bias associated with sampling and extraction. The sediment was extracted by three different methods. One method was done in duplicate. Negative controls performed at the extraction and PCR stages were also sequenced. The phylogenetic analysis of the environmental samples below phylum level revealed significant changes in the diversity and function of the prokaryotic community along the gradient. The subglacial Goule river sediment is characterized by bacteria with specific functions: methylotrophic bacteria and aerobic chemoautolithotrophic bacteria (Alphaproteobacteria with Methylobacteriaceae), whereas the mouth of the river Goule and the freshwater part of the Goule River were dominated by sulphate-reducing bacteria, anaerobic chemoorganotrophs (Deltaproteobacteria with the Desulfobulbaceae and Desulfuromonadaceae) and by

  20. Improved subtilisin YaB production in Bacillus subtilis using engineered synthetic expression control sequences.

    PubMed

    Wang, Jyh-Perng; Yeh, Chuan-Mei; Tsai, Ying-Chieh

    2006-12-13

    Alkaline elastase YaB, a favorable meat tenderizer, is an extracellular subtilisin-type protease produced by wild strain alkalophilic Bacillus YaB. The gene ale coding for subtilisin YaB with its own expression control sequence has been cloned and expressed in Bacillus subtilis, but at levels much lower than in the parental strain Bacillus YaB. This study investigates the influence of various expression control sequences including expression control sequences of cdd and veg from B. subtilis, a synthetic expression control sequence (SECS), and engineered synthetic expression control sequences (engineered SECSs) on the expression of subtilisin YaB in B. subtilis. The engineered SECSs were generated by using the polymerase chain reaction; their UP element, Shine-Dalgarno (SD) sequence, or both were different from those of the native SECS. The expression efficiencies of SECS and engineered SECSs were higher than those of expression control sequences of ale, cdd, and veg. Substitution of the SD sequence of SECS resulted in higher expression of subtilisin YaB than substitution of the UP element, whereas combined substitution of both gave the highest expression. These results demonstrate that engineering of SECSs is an approach for improving subtilisin YaB production in B. subtilis. Moreover, it is suggested that these engineered SECSs could potentially be used to express homologous and heterologous proteins in B. subtilis at high level. PMID:17147425

  1. Nucleotide sequence and temporal expression of a baculovirus regulatory gene.

    PubMed

    Guarino, L A; Summers, M D

    1987-07-01

    The nucleotide sequence of a trans-activating regulatory gene (IE-1) of the baculovirus Autographa californica nuclear polyhedrosis virus has been determined. This gene encodes a protein of 581 amino acids with a predicted molecular weight of 66,856. A DNA fragment containing the entire coding sequence of IE-1 was inserted downstream of an RNA promoter. Subsequent cell-free transcription and translation directed the synthesis of a single peptide with an apparent molecular weight of 70,000. Quantitative S1 nuclease analysis indicated that IE-1 was maximally synthesized during a 1-h virus adsorption period and that steady-state levels of IE-1 message were maintained during the first 24 h of infection. Northern blot hybridization indicated that several late transcripts which overlap the IE-1 gene were transcribed from both strands. The precise locations of the 5' and 3' ends of these overlapping transcripts were mapped using S1 nuclease. The overlapping transcripts were grouped in two transcriptional units. One unit was composed of IE-1 and overlapping gamma transcripts which initiated upstream of IE-1 and terminated downstream of IE-1. The other unit, transcribed from the opposite strand, consisted of gamma transcripts with coterminal 5' ends and extended 3' ends. The shorter, more abundant transcripts in this unit overlapped 30 to 40 bases of IE-1 at the 3' end, while the longer transcripts overlapped the entire IE-1 gene. Transcription of several early A. californica nuclear polyhedrosis virus genes, in addition to 39K, was shown to be trans-activated by IE-1, indicating that IE-1 may have a central role in the regulation of beta-gene expression. PMID:16789264

  2. Substantial prevalence of microdeletions of the Y-chromosome in infertile men with idiopathic azoospermia and oligozoospermia detected using a sequence-tagged site-based mapping strategy

    SciTech Connect

    Najmabadi, H.; Huang, V.; Bhasin, D.

    1996-04-01

    Genes on the long arm of Y (Yq), particularly within interval 6, are believed to play a critical role in human spermatogenesis. Cytogenetically detectable deletions of this region are associated with azoospermia in men, but are relatively uncommon. The objective of this study was to validate a sequence-tagged site (STS)-mapping strategy for the detection of Yq microdeletions and to use this method to determine the proportion of men with idiopathic azoospermia or severe oligozoospermia who carry microdeletions in Yq. STS mapping of a sufficiently large sample of infertile men should also help further localize the putative gene(s) involved in the pathogenesis of male infertility. Genomic DNA was extracted from peripheral leukocytes of 16 normal fertile men, 7 normal fertile women, 60 infertile men, and 15 patients with the X-linked disorder, ichthyosis. PCR primers were synthesized for 26 STSs that span Yq interval 6. None of the 16 normal men of known fertility had microdeletions. Seven normal fertile women failed to amplify any of the 26 STSs, providing evidence of their Y specificity. No microdeletions were detected in any of the 15 patients with ichthyosis. Of the 60 infertile men typed with 26 STSs, 11 (18%; 10 azoospermic and 1 oligozoospermic) failed to amplify 1 or more STS. Interestingly, 4 of the 11 patients had microdeletions in a region that is outside the Yq region from which the DAZ (deleted in azoospermia gene region) gene was cloned. In an additional 3 patients, microdeletions were present both inside and outside the DAZ region. The physical locations of these microdeletions provide further support for the concept that a gene(s) on Yq deletion interval 6 plays an important role in spermatogenesis. The presence of deletions that do not overlap with the DAZ region suggests that genes other than the DAZ gene may also be implicated in the pathogenesis of some subsets of male infertility. 48 refs., 2 figs., 2 tabs.
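
    The scoring rule described in this abstract (a sample carries a microdeletion if it fails to amplify one or more STSs) can be sketched as follows. This is an illustrative sketch only: the data layout, the function names, and the marker and sample labels are hypothetical and do not come from the study.

```python
def find_microdeletions(pcr_calls):
    """Map each sample ID to the sorted STS markers that failed to amplify.

    pcr_calls: dict of sample ID -> dict of STS marker -> bool (True = amplified).
    Samples in which every marker amplified are left out of the result.
    """
    return {
        sample: sorted(m for m, amplified in markers.items() if not amplified)
        for sample, markers in pcr_calls.items()
        if not all(markers.values())
    }


def carrier_fraction(pcr_calls):
    """Fraction of samples that failed to amplify at least one STS."""
    return len(find_microdeletions(pcr_calls)) / len(pcr_calls)


# Hypothetical toy input: two samples typed with two made-up markers.
calls = {
    "P1": {"STS_A": True, "STS_B": False},
    "P2": {"STS_A": True, "STS_B": True},
}
deleted = find_microdeletions(calls)
fraction = carrier_fraction(calls)

# The proportion reported in the abstract: 11 of 60 infertile men.
reported_percent = round(100 * 11 / 60)
```

    With the figures given in the abstract, 11 of 60 rounds to the reported 18%.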

  3. A Review of Recommendations for Sequencing Receptive and Expressive Language Instruction

    ERIC Educational Resources Information Center

    Petursdottir, Anna Ingeborg; Carr, James E.

    2011-01-01

    We review recommendations for sequencing instruction in receptive and expressive language objectives in early and intensive behavioral intervention (EIBI) programs. Several books recommend completing receptive protocols before introducing corresponding expressive protocols. However, this recommendation has little empirical support, and some…

  4. Fractional factorial approach combining 4 Escherichia coli strains, 3 culture media, 3 expression temperatures and 5 N-terminal fusion tags for screening the soluble expression of recombinant proteins.

    PubMed

    Noguère, Christophe; Larsson, Anna M; Guyot, Jean-Christophe; Bignon, Christophe

    2012-08-01

    Producing recombinant proteins in Escherichia coli (E. coli) is generally performed using a trial and error approach with the different expression variables being tested independently from each other. As a consequence, variable interactions are lost which makes the trial and error approach quite time-consuming. In this paper, we report how switching from a trial and error to a fractional factorial approach allows testing in less than 2 weeks four expression variables (E. coli strains, culture media, expression temperatures and N-terminal fusion tags) in a single experiment. The method, called "Fusion-InFFact", was validated using four test proteins. In all cases, Fusion-InFFact allowed finding conditions for expressing high yields of soluble proteins. The method was originally set-up for high throughput structural genomics programs, but can be used in any recombinant protein expression project. PMID:22705765

  5. A dual affinity-tag strategy for the expression and purification of human linker histone H1.4 in Escherichia coli.

    PubMed

    Ryan, Daniel P; Tremethick, David J

    2016-04-01

    Linker histones are an abundant and critical component of the eukaryotic chromatin landscape. They play key roles in regulating the higher order structure of chromatin and many genetic processes. Higher eukaryotes possess a number of different linker histone subtypes and new data are consistently emerging that indicate these subtypes are functionally distinct. We were interested in studying one of the most abundant human linker histone subtypes, H1.4. We have produced recombinant full-length H1.4 in Escherichia coli. An N-terminal Glutathione-S-Transferase tag was used to promote soluble expression and was combined with a C-terminal hexahistidine tag to facilitate a simple non-denaturing two-step affinity chromatography procedure that results in highly pure full-length H1.4. The purified H1.4 was shown to be functional via in vitro chromatin assembly experiments and remains active after extended storage at -80 °C. PMID:26739785

  6. Shark Tagging Activities.

    ERIC Educational Resources Information Center

    Current: The Journal of Marine Education, 1998

    1998-01-01

    In this group activity, children learn about the purpose of tagging and how scientists tag a shark. Using a cut-out of a shark, students identify, measure, record data, read coordinates, and tag a shark. Includes introductory information about the purpose of tagging and the procedure, a data sheet showing original tagging data from Tampa Bay, and…

  7. Targeted RNA Sequencing Assay to Characterize Gene Expression and Genomic Alterations.

    PubMed

    Martin, Dorrelyn P; Miya, Jharna; Reeser, Julie W; Roychowdhury, Sameek

    2016-01-01

    RNA sequencing (RNAseq) is a versatile method that can be utilized to detect and characterize gene expression, mutations, gene fusions, and noncoding RNAs. Standard RNAseq requires 30 - 100 million sequencing reads and can include multiple RNA products such as mRNA and noncoding RNAs. We demonstrate how targeted RNAseq (capture) permits a focused study on selected RNA products using a desktop sequencer. RNAseq capture can characterize unannotated, low, or transiently expressed transcripts that may otherwise be missed using traditional RNAseq methods. Here we describe the extraction of RNA from cell lines, ribosomal RNA depletion, cDNA synthesis, preparation of barcoded libraries, hybridization and capture of targeted transcripts and multiplex sequencing on a desktop sequencer. We also outline the computational analysis pipeline, which includes quality control assessment, alignment, fusion detection, gene expression quantification and identification of single nucleotide variants. This assay allows for targeted transcript sequencing to characterize gene expression, gene fusions, and mutations. PMID:27585245

  8. Identification of stromally expressed molecules in the prostate by tag-profiling of cancer-associated fibroblasts, normal fibroblasts and fetal prostate

    PubMed Central

    Orr, B; Riddick, A C P; Stewart, G D; Anderson, R A; Franco, O E; Hayward, S W; Thomson, A A

    2012-01-01

    The stromal microenvironment has key roles in prostate development and cancer, and cancer-associated fibroblasts (CAFs) stimulate tumourigenesis via several mechanisms including the expression of pro-tumourigenic factors. Mesenchyme (embryonic stroma) controls prostate organogenesis, and in some circumstances can re-differentiate prostate tumours. We have applied next-generation Tag profiling to fetal human prostate, normal human prostate fibroblasts (NPFs) and CAFs to identify molecules expressed in prostatic stroma. Comparison of gene expression profiles of a patient-matched pair of NPFs vs CAFs identified 671 transcripts that were enriched in CAFs and 356 transcripts whose levels were decreased, relative to NPFs. Gene ontology analysis revealed that CAF-enriched transcripts were associated with prostate morphogenesis and CAF-depleted transcripts were associated with cell cycle. We selected mRNAs to follow-up by comparison of our data sets with published prostate cancer fibroblast microarray profiles as well as by focusing on transcripts encoding secreted and peripheral membrane proteins, as well as mesenchymal transcripts identified in a previous study from our group. We confirmed differential transcript expression between CAFs and NPFs using QrtPCR, and defined protein localization using immunohistochemistry in fetal prostate, adult prostate and prostate cancer. We demonstrated that ASPN, CAV1, CFH, CTSK, DCN, FBLN1, FHL1, FN, NKTR, OGN, PARVA, S100A6, SPARC, STC1 and ZEB1 proteins showed specific and varied expression patterns in fetal human prostate and in prostate cancer. Colocalization studies suggested that some stromally expressed molecules were also expressed in subsets of tumour epithelia, indicating that they may be novel markers of EMT. Additionally, two molecules (ASPN and STC1) marked overlapping and distinct subregions of stroma associated with tumour epithelia and may represent new CAF markers. PMID:21804603

  9. Transferable green fluorescence-tagged pEI2 in Edwardsiella ictaluri

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The pEI2 plasmid of Edwardsiella ictaluri isolate, I49, was tagged using a Tn10-GFP-kan cassette to create the green fluorescence-expressing derivative I49-gfp. The Tn10-GFP-kan insertion site was mapped by plasmid sequencing to 663 bp upstream of orf2 and appeared to be at a neutral site in the pla...

  10. Quantitative Proteomics Analysis of Altered Protein Expression in the Placental Villous Tissue of Early Pregnancy Loss Using Isobaric Tandem Mass Tags

    PubMed Central

    Ni, Xiaobei; Li, Xin; Guo, Yueshuai; Zhou, Tao; Guo, Xuejiang; Zhao, Chun; Lin, Min; Zhou, Zuomin; Shen, Rong; Guo, Xirong; Ling, Xiufeng

    2014-01-01

    Many pregnant women suffer miscarriages during early gestation, but the description of these early pregnancy losses (EPL) can be somewhat confusing because of the complexities of early development. Thus, the identification of proteins with different expression profiles related to early pregnancy loss is essential for understanding the comprehensive pathophysiological mechanism. In this study, we report a gel-free tandem mass tags- (TMT-) labeling based proteomic analysis of five placental villous tissues from patients with early pregnancy loss and five from normal pregnant women. The application of this method resulted in the identification of 3423 proteins and 19647 peptides among the patient group and the matched normal control group. Qualitative and quantitative proteomic analysis revealed 51 proteins to be differentially abundant between the two groups (≥1.2-fold, Student's t-test, P < 0.05). To obtain an overview of the biological functions of the proteins whose expression levels altered significantly in EPL group, gene ontology analysis was performed. We also investigated the twelve proteins with a difference over 1.5-fold using pathways analysis. Our results demonstrate that the gel-free TMT-based proteomic approach allows the quantification of differences in protein expression levels, which is useful for obtaining molecular insights into early pregnancy loss. PMID:24738066
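
    The selection criterion quoted in this abstract (≥1.2-fold, Student's t-test, P < 0.05) can be sketched in a few lines. The sketch rests on assumptions the abstract does not state: an unpaired, equal-variance t-test on five versus five samples (8 degrees of freedom, for which the two-sided 5% critical value is 2.306), and a symmetric reading of the fold threshold (up or down). Function names and intensity values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

# Two-sided 5% critical value of Student's t for 8 degrees of freedom
# (5 + 5 - 2); assumes an unpaired, equal-variance test.
T_CRIT_DF8 = 2.306


def student_t(a, b):
    """Pooled-variance (equal-variance) two-sample t statistic."""
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / sqrt(pooled * (1 / na + 1 / nb))


def is_differential(case, control, min_fold=1.2, t_crit=T_CRIT_DF8):
    """True if the mean ratio passes the fold threshold and |t| exceeds t_crit."""
    fold = mean(case) / mean(control)
    changed = fold >= min_fold or fold <= 1 / min_fold
    return changed and abs(student_t(case, control)) > t_crit


# Hypothetical relative abundances for one protein (not data from the study).
epl = [1.5, 1.6, 1.4, 1.55, 1.45]
normal = [1.0, 1.1, 0.9, 1.05, 0.95]
flagged = is_differential(epl, normal)
unchanged = is_differential(normal, normal)
```

    Comparing against a fixed critical value avoids needing a t-distribution routine; with other group sizes the critical value would have to change accordingly.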

  11. Analysis of the Changes in Expression Levels of Sialic Acid on Influenza-Virus-Infected Cells Using Lectin-Tagged Polymeric Nanoparticles

    PubMed Central

    Cho, Jaebum; Miyake, Yukari; Honda, Ayae; Kushiro, Keiichiro; Takai, Madoka

    2016-01-01

    Viral infections affect millions around the world, sometimes leading to severe consequences or even epidemics. Understanding the molecular dynamics during viral infections would provide crucial information for preventing or stopping the progress of infections. However, the current methods often involve the disruption of the infected cells or expensive and time-consuming procedures. In this study, fluorescent polymeric nanoparticles were fabricated and used as bioimaging nanoprobes that can monitor the progression of influenza viral infection through the changes in the expression levels of sialic acids expressed on the cell membrane. The nanoparticles were composed of a biocompatible monomer to prevent non-specific interactions, a hydrophobic monomer to form the core, a fluorescent monomer, and a protein-binding monomer to conjugate lectin, which binds sialic acids. It was shown that these lectin-tagged nanoparticles that specifically target sialic acids could track the changes in the expression levels of sialic acids caused by influenza viral infections in human lung epithelial cells. There was a sudden drop in the levels of sialic acid at the initial onset of virus infection (t = 0~1 h) and at approximately 4~5 h post-infection. The latter drop correlated with the production of viral proteins that was confirmed using traditional techniques. Thus, the accuracy, the rapidity and the efficacy of the nanoprobes were demonstrated. Such molecular bioimaging tools, which allow easy-handling and in situ monitoring, would be useful to directly observe and decipher the viral infection mechanisms. PMID:27493646

  12. Analysis of the Changes in Expression Levels of Sialic Acid on Influenza-Virus-Infected Cells Using Lectin-Tagged Polymeric Nanoparticles.

    PubMed

    Cho, Jaebum; Miyake, Yukari; Honda, Ayae; Kushiro, Keiichiro; Takai, Madoka

    2016-01-01

    Viral infections affect millions around the world, sometimes leading to severe consequences or even epidemics. Understanding the molecular dynamics during viral infections would provide crucial information for preventing or stopping the progress of infections. However, the current methods often involve the disruption of the infected cells or expensive and time-consuming procedures. In this study, fluorescent polymeric nanoparticles were fabricated and used as bioimaging nanoprobes that can monitor the progression of influenza viral infection through the changes in the expression levels of sialic acids expressed on the cell membrane. The nanoparticles were composed of a biocompatible monomer to prevent non-specific interactions, a hydrophobic monomer to form the core, a fluorescent monomer, and a protein-binding monomer to conjugate lectin, which binds sialic acids. It was shown that these lectin-tagged nanoparticles that specifically target sialic acids could track the changes in the expression levels of sialic acids caused by influenza viral infections in human lung epithelial cells. There was a sudden drop in the levels of sialic acid at the initial onset of virus infection (t = 0~1 h) and at approximately 4~5 h post-infection. The latter drop correlated with the production of viral proteins that was confirmed using traditional techniques. Thus, the accuracy, the rapidity and the efficacy of the nanoprobes were demonstrated. Such molecular bioimaging tools, which allow easy-handling and in situ monitoring, would be useful to directly observe and decipher the viral infection mechanisms. PMID:27493646

  13. Molecular cloning and characterization of a plant alpha1,3/4-fucosidase based on sequence tags from almond fucosidase I.

    PubMed

    Zeleny, Reinhard; Leonard, Renaud; Dorfner, Georg; Dalik, Thomas; Kolarich, Daniel; Altmann, Friedrich

    2006-04-01

    Our work with almond peptide N-glycosidase A made us interested also in the alpha1,3/4-fucosidase which is used as a specific reagent for glycoconjugate analysis. The enzyme was purified to presumed homogeneity by a series of chromatographic steps including dye affinity and fast-performance anion exchange chromatography. The 63 kDa band was analyzed by tandem mass spectrometry which yielded several partial sequences. A homology search retrieved the hypothetical protein Q8GW72 from Arabidopsis thaliana. This protein has recently been described as being specific for alpha1,2-linkages. However, cDNA cloning and expression in Pichia pastoris of the A. thaliana fucosidase showed that it hydrolyzed fucose in 3- and 4-linkage to GlcNAc in Lewis determinants whereas neither 2-linked fucose nor fucose in 3-linkage to the innermost GlcNAc residue were attacked. This first cloning of a plant alpha1,3/4-fucosidase also confirmed the identity of the purified almond enzyme and thus settles the notorious uncertainty about its molecular mass. The alpha1,3/4-fucosidase from Arabidopsis exhibited striking sequence similarity with an enzyme of similar substrate specificity from Streptomyces sp. (Q9Z4I9) and with putative proteins from rice. PMID:16516937

  14. Sequence and Expression Analyses of Ethylene Response Factors Highly Expressed in Latex Cells from Hevea brasiliensis

    PubMed Central

    Piyatrakul, Piyanuch; Yang, Meng; Putranto, Riza-Arief; Pirrello, Julien; Dessailly, Florence; Hu, Songnian; Summo, Marilyne; Theeravatanasuk, Kannikar; Leclercq, Julie; Kuswanhadi; Montoro, Pascal

    2014-01-01

    The AP2/ERF superfamily encodes transcription factors that play a key role in plant development and responses to abiotic and biotic stress. In Hevea brasiliensis, ERF genes have been identified by RNA sequencing. This study set out to validate the number of HbERF genes, and identify ERF genes involved in the regulation of latex cell metabolism. A comprehensive Hevea transcriptome was improved using additional RNA reads from reproductive tissues. Newly assembled contigs were annotated in the Gene Ontology database and were assigned to 3 main categories. The AP2/ERF superfamily is the third most represented compared with other transcription factor families. A comparison with genomic scaffolds led to an estimation of 114 AP2/ERF genes and 1 soloist in Hevea brasiliensis. Based on a phylogenetic analysis, functions were predicted for 26 HbERF genes. A relative transcript abundance analysis was performed by real-time RT-PCR in various tissues. Transcripts of ERFs from group I and VIII were very abundant in all tissues while those of group VII were highly accumulated in latex cells. Seven of the thirty-five ERF expression marker genes were highly expressed in latex. Subcellular localization and transactivation analyses suggested that HbERF-VII candidate genes encoded functional transcription factors. PMID:24971876

  15. Widespread Differential Expression of Coding Region and 3' UTR Sequences in Neurons and Other Tissues.

    PubMed

    Kocabas, Arif; Duarte, Terence; Kumar, Saranya; Hynes, Mary A

    2015-12-16

    Mature messenger RNAs (mRNAs) consist of coding sequence (CDS) and 5' and 3' UTRs, typically expected to show similar abundance within a given neuron. Examining mRNA from defined neurons, we unexpectedly show extremely common unbalanced expression of cognate 3' UTR and CDS sequences; many genes show high 3' UTR relative to CDS, others show high CDS to 3' UTR. In situ hybridization (19 of 19 genes) shows a broad range of 3' UTR-to-CDS expression ratios across neurons and tissues. Ratios may be spatially graded or change with developmental age but are consistent across animals. Further, for two genes examined, a 3' UTR-to-CDS ratio above a particular threshold in any given neuron correlated with reduced or undetectable protein expression. Our findings raise questions about the role of isolated 3' UTR sequences in regulation of protein expression and highlight the importance of separately examining 3' UTR and CDS sequences in gene expression analyses.

  16. Molecular cloning, sequence analysis and expression of genome segment 7 (S7) of Antheraea mylitta cypovirus (AmCPV) that encodes a viral structural protein.

    PubMed

    Chavali, Venkata Ramana Murthy; Ghosh, Ananta K

    2007-10-01

    Genome segment 7 (S7) of the 11 double-stranded RNA genome segments of Antheraea mylitta cypovirus (AmCPV) was converted to cDNA, cloned and sequenced. The nucleotide sequence showed that segment 7 consisted of 1789 nucleotides with an ORF of 530 amino acids and could encode a protein of approximately 61 kDa, termed P61. The 5' terminal sequence, AGTAAT, and the 3' terminal sequence, AGAGC, of the plus strand were found to be the same as those of genome segment 10 of AmCPV, which encodes polyhedrin. No sequence similarity was found by searching nucleic acid and protein sequence databases using BLAST. The secondary structure prediction showed the presence of 17 alpha-helices and 18 extended beta-sheets along the entire length of P61. The ORF of segment 7 was expressed in E. coli as a His-tagged fusion protein and purified through Ni-NTA chromatography, and a polyclonal antibody was raised in rabbit, indicating that P61 is immunogenic. Immunoblot analysis using this antibody on virus-infected cells as well as purified polyhedra showed that P61 is a viral structural protein. A motif scan search showed some similarity of P61 with the inosine monophosphate dehydrogenase (IMPDH) cystathionine-beta-synthase (CBS) domain at the C-terminus, and it was hypothesized that by binding to single-stranded viral RNA through its CBS domain P61 may help in virus replication or transcription.

  18. Evaluation of codon biology in citrus and Poncirus trifoliata based on genomic features and frame corrected expressed sequence tags.

    PubMed

    Ahmad, Touqeer; Sablok, Gaurav; Tatarinova, Tatiana V; Xu, Qiang; Deng, Xiu-Xin; Guo, Wen-Wu

    2013-04-01

    Citrus, as one of the globally important fruit trees, has been an object of interest for understanding genetics and evolutionary process in fruit crops. Meta-analyses of 19 Citrus species, including 4 globally and economically important Citrus sinensis, Citrus clementina, Citrus reticulata, and 1 Citrus relative Poncirus trifoliata, were performed. We observed that codons ending with A- or T- at the wobble position were preferred in contrast to C- or G- ending codons, indicating a close association with AT richness of Citrus species and P. trifoliata. The present study postulates a large repertoire of a set of optimal codons for the Citrus genus and P. trifoliata and demonstrates that GCT and GGT are evolutionarily conserved optimal codons. Our observation suggested that mutational bias is the dominating force in shaping the codon usage bias (CUB) in Citrus and P. trifoliata. Correspondence analysis (COA) revealed that the principal axis [axis 1; COA/relative synonymous codon usage (RSCU)] contributes only a minor portion (∼10.96%) of the recorded variance. In all analysed species, except P. trifoliata, Gravy and aromaticity played minor roles in resolving CUB. Compositional constraints were found to be strongly associated with the amino acid signatures in Citrus species and P. trifoliata. Our present analysis postulates compositional constraints in Citrus species and P. trifoliata and plausible role of the stress with GC3 and coevolution pattern of amino acid. PMID:23315666
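
    Relative synonymous codon usage (RSCU), the measure named in this abstract, is conventionally the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. A minimal sketch under that standard definition follows; it assumes the standard genetic code and in-frame coding sequences, and the function name and toy sequence are hypothetical, not taken from the study.

```python
from collections import Counter, defaultdict

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def rscu(cds_sequences):
    """Return codon -> RSCU for every amino acid observed in the input.

    RSCU = codon count * (number of synonymous codons) / (family total).
    Stop codons and codons with ambiguous bases are ignored.
    """
    counts = Counter()
    for seq in cds_sequences:
        seq = seq.upper()
        for i in range(0, len(seq) - len(seq) % 3, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:
                counts[codon] += 1

    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid never observed; RSCU undefined
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values


# Toy input: four alanine codons, three GCT and one GCC.
vals = rscu(["GCTGCTGCTGCC"])
```

    An RSCU above 1 marks a codon used more often than expected under equal usage, which is how a preference for A- or T-ending codons would show up in such a table.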

  19. Evaluation of Codon Biology in Citrus and Poncirus trifoliata Based on Genomic Features and Frame Corrected Expressed Sequence Tags

    PubMed Central

    Ahmad, Touqeer; Sablok, Gaurav; Tatarinova, Tatiana V.; Xu, Qiang; Deng, Xiu-Xin; Guo, Wen-Wu

    2013-01-01

    Citrus, as one of the globally important fruit trees, has been an object of interest for understanding genetics and evolutionary process in fruit crops. Meta-analyses of 19 Citrus species, including 4 globally and economically important Citrus sinensis, Citrus clementina, Citrus reticulata, and 1 Citrus relative Poncirus trifoliata, were performed. We observed that codons ending with A- or T- at the wobble position were preferred in contrast to C- or G- ending codons, indicating a close association with AT richness of Citrus species and P. trifoliata. The present study postulates a large repertoire of a set of optimal codons for the Citrus genus and P. trifoliata and demonstrates that GCT and GGT are evolutionarily conserved optimal codons. Our observation suggested that mutational bias is the dominating force in shaping the codon usage bias (CUB) in Citrus and P. trifoliata. Correspondence analysis (COA) revealed that the principal axis [axis 1; COA/relative synonymous codon usage (RSCU)] contributes only a minor portion (∼10.96%) of the recorded variance. In all analysed species, except P. trifoliata, Gravy and aromaticity played minor roles in resolving CUB. Compositional constraints were found to be strongly associated with the amino acid signatures in Citrus species and P. trifoliata. Our present analysis postulates compositional constraints in Citrus species and P. trifoliata and plausible role of the stress with GC3 and coevolution pattern of amino acid. PMID:23315666

  20. Sequence and gene expression evolution of paralogous genes in willows.

    PubMed

    Harikrishnan, Srilakshmy L; Pucholt, Pascal; Berlin, Sofia

    2015-12-22

    Whole genome duplications (WGD) have had strong impacts on species diversification by triggering evolutionary novelties; however, relatively little is known about the balance between gene loss and forces involved in the retention of duplicated genes originating from a WGD. We analyzed putative Salicoid duplicates in willows, originating from the Salicoid WGD, which took place more than 45 Mya. Contigs were constructed by de novo assembly of RNA-seq data derived from leaves and roots from two genotypes. Among the 48,508 contigs, 3,778 pairs were, based on fourfold synonymous third-codon transversion rates and syntenic positions, predicted to be Salicoid duplicates. Both copies were in most cases expressed in both tissues and 74% were significantly differentially expressed. Mean Ka/Ks was 0.23, suggesting that the Salicoid duplicates are evolving by purifying selection. Gene Ontology enrichment analyses showed that functions related to DNA- and nucleic acid binding were over-represented among the non-differentially expressed Salicoid duplicates, while functions related to biosynthesis and metabolism were over-represented among the differentially expressed Salicoid duplicates. We propose that the differentially expressed Salicoid duplicates are regulatory neo- and/or subfunctionalized, while the non-differentially expressed are dose sensitive, hence, functionally conserved. Multiple evolutionary processes thus drive the retention of Salicoid duplicates in willows.

  1. Sequence and gene expression evolution of paralogous genes in willows

    PubMed Central

    Harikrishnan, Srilakshmy L.; Pucholt, Pascal; Berlin, Sofia

    2015-01-01

    Whole genome duplications (WGD) have had strong impacts on species diversification by triggering evolutionary novelties; however, relatively little is known about the balance between gene loss and forces involved in the retention of duplicated genes originating from a WGD. We analyzed putative Salicoid duplicates in willows, originating from the Salicoid WGD, which took place more than 45 Mya. Contigs were constructed by de novo assembly of RNA-seq data derived from leaves and roots from two genotypes. Among the 48,508 contigs, 3,778 pairs were, based on fourfold synonymous third-codon transversion rates and syntenic positions, predicted to be Salicoid duplicates. Both copies were in most cases expressed in both tissues and 74% were significantly differentially expressed. Mean Ka/Ks was 0.23, suggesting that the Salicoid duplicates are evolving by purifying selection. Gene Ontology enrichment analyses showed that functions related to DNA- and nucleic acid binding were over-represented among the non-differentially expressed Salicoid duplicates, while functions related to biosynthesis and metabolism were over-represented among the differentially expressed Salicoid duplicates. We propose that the differentially expressed Salicoid duplicates are regulatory neo- and/or subfunctionalized, while the non-differentially expressed are dose sensitive, hence, functionally conserved. Multiple evolutionary processes thus drive the retention of Salicoid duplicates in willows. PMID:26689951
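
    The "fourfold synonymous third-codon transversion rate" used in this abstract to date duplicate pairs can be illustrated with a simple uncorrected estimate: among aligned codons from fourfold-degenerate families that share their first two bases, count the fraction whose third bases differ by a transversion (purine versus pyrimidine). The sketch below assumes the standard genetic code and two gap-free, in-frame, equal-length coding sequences; the function name and toy sequences are hypothetical, and the authors' actual pipeline (including any multiple-substitution correction) is not described in the abstract.

```python
# First two bases of the eight fourfold-degenerate codon families in the
# standard genetic code (Leu CTN, Val GTN, Ser TCN, Pro CCN, Thr ACN,
# Ala GCN, Arg CGN, Gly GGN).
FOURFOLD_PREFIXES = {"CT", "GT", "TC", "CC", "AC", "GC", "CG", "GG"}
PURINES = {"A", "G"}


def fourfold_transversion_rate(cds_a, cds_b):
    """Uncorrected transversion rate at shared fourfold-degenerate sites.

    Returns None when the pair has no comparable fourfold site.
    """
    if len(cds_a) != len(cds_b):
        raise ValueError("sequences must be aligned to the same length")
    sites = transversions = 0
    for i in range(0, len(cds_a) - len(cds_a) % 3, 3):
        ca = cds_a[i:i + 3].upper()
        cb = cds_b[i:i + 3].upper()
        # Both codons must belong to the same fourfold family.
        if ca[:2] != cb[:2] or ca[:2] not in FOURFOLD_PREFIXES:
            continue
        if ca[2] not in "ACGT" or cb[2] not in "ACGT":
            continue  # skip ambiguous third bases
        sites += 1
        # Transversion: one third base is a purine, the other a pyrimidine.
        if (ca[2] in PURINES) != (cb[2] in PURINES):
            transversions += 1
    return transversions / sites if sites else None


# Toy pair: GCT/GCA (transversion), GGA/GGG (transition),
# CTT/CTC (transition), ATG/ATG (not a fourfold family).
rate = fourfold_transversion_rate("GCTGGACTTATG", "GCAGGGCTCATG")
```

    Duplicate pairs created by the same event are expected to cluster around a similar rate, which is why such a measure can help separate one duplication event from others.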

  3. Strep-Tagged Protein Purification.

    PubMed

    Maertens, Barbara; Spriestersbach, Anne; Kubicek, Jan; Schäfer, Frank

    2015-01-01

    The Strep-tag system can be used to purify recombinant proteins from any expression system. Here, protocols for lysis and affinity purification of Strep-tagged proteins from E. coli, baculovirus-infected insect cells, and transfected mammalian cells are given. Depending on the amount of Strep-tagged protein in the lysate, a protocol for batch binding and subsequent washing and eluting by gravity flow can be used. Agarose-based matrices with the coupled Strep-Tactin ligand are the resins of choice, with a binding capacity of up to 9 mg ml(-1). For purification of lower amounts of Strep-tagged proteins, the use of Strep-Tactin magnetic beads is suitable. In addition, Strep-tagged protein purification can also be automated using prepacked columns for FPLC or other liquid-handling chromatography instrumentation, but automated purification is not discussed in this protocol. The protocols described here can be regarded as an update of the Strep-Tag Protein Handbook (Qiagen, 2009). PMID:26096503

  4. [Construction and Expression of RNase-Resisting His-Tagged Virus-Like Particles Containing FluA/B mRNA].

    PubMed

    Zhang, Jin; Xue, Xiaoning; Xu, Hefei; Zhu, Ke; Chen, Xiaoguang; Zhang, Juan; Zhang, Qi; Lin, Yuan

    2015-11-01

    To prepare virus-like particles containing FluA/B mRNA as an RNA standard and control for influenza RNA detection, the genes encoding the coat protein and maturase of E. coli bacteriophage MS2 were amplified and cloned into the D-pET32a vector. Six histidines were then inserted into the MS2 coat protein using the QuikChange Site-Directed Mutagenesis Kit to construct the universal expression vector D-pET32a-CP-His. In addition, partial gene fragments of FluA and FluB were cloned downstream in the expression vector. The recombinant plasmid D-pET32a-CP-His-FluA/B was transformed into BL21 and expression was induced with IPTG. The virus-like particles were purified by Ni+ chromatography. The virus-like particles can be detected by RT-PCR, but not by PCR. They remain stable for at least 3 months at both 4 degrees C and -20 degrees C. His-tagged virus-like particles are more stable and easier to purify, and they can be used as an RNA standard and control in influenza virus RNA detection. PMID:26951007

  5. Nucleotide sequence and expression of a Drosophila metallothionein.

    PubMed

    Lastowski-Perry, D; Otto, E; Maroni, G

    1985-02-10

    A Drosophila melanogaster cDNA clone was isolated based on its more intense hybridization to RNA sequences from copper-fed larvae than from control larval RNA. This clone showed strong hybridization to mouse metallothionein I cDNA at reduced stringency. Its nucleotide sequence includes an open reading segment which codes for a 40-amino acid protein; this protein is identified as metallothionein based on its similarity to the amino-terminal portion of mammalian and crab metalloproteins. The 10 cysteine residues present occur in five pairs of near vicinal cysteines (Cys-X-Cys). This cDNA sequence hybridized to a 400-nucleotide polyadenylated RNA whose presence in the cells of the alimentary canal of larvae was stimulated by ingestion of cadmium or copper; in other tissues this RNA was present at much lower levels. Mercury, silver, and zinc induced metallothionein to a lesser extent. The level of metallothionein RNA increased very soon after the initiation of metal treatment and reached a maximum after approximately 36 h. PMID:2578462

  6. Cloning, nucleotide sequence, and expression of Achromobacter protease I gene.

    PubMed

    Ohara, T; Makino, K; Shinagawa, H; Nakata, A; Norioka, S; Sakiyama, F

    1989-12-01

    Achromobacter protease I (API) is a lysine-specific serine protease that specifically hydrolyzes the lysyl peptide bond. A gene coding for API was cloned from Achromobacter lyticus M497-1. The nucleotide sequence of the cloned DNA fragment revealed that the gene codes for a single polypeptide chain of 653 amino acids. The N-terminal 205 amino acids, including the signal peptide, and the threonine/serine-rich C-terminal 180 amino acids flank the 268-amino acid mature protein, which was identified by protein sequencing. Escherichia coli carrying a plasmid containing the cloned API gene overproduced and secreted a protein of Mr 50,000 (API') into the periplasm. This protein also exhibited a distinct endopeptidase activity specific for lysyl bonds. The N-terminal amino acid sequence of API' was the same as that of mature API, suggesting that the enzyme retained the C-terminal extended peptide chain. The present experiments indicate that API, an extracellular protease produced by gram-negative bacteria, is synthesized in vivo as a precursor protein bearing long extended peptide chains at both N and C termini. PMID:2684982
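
    The segment lengths quoted in this abstract can be checked against the stated precursor length with a short sketch. It assumes the three segments are contiguous and that the counts are exact; the segment labels and the function name are illustrative, not from the paper.

```python
# Segment lengths (amino acids) as quoted in the abstract
SEGMENTS = [
    ("N-terminal extension (incl. signal peptide)", 205),
    ("mature protease", 268),
    ("C-terminal extension", 180),
]


def segment_coordinates(segments):
    """Return (name, start, end) with 1-based inclusive residue numbers,
    assuming the segments follow one another without gaps."""
    coords = []
    start = 1
    for name, length in segments:
        end = start + length - 1
        coords.append((name, start, end))
        start = end + 1
    return coords


coords = segment_coordinates(SEGMENTS)
for name, start, end in coords:
    print(f"{name}: residues {start}-{end}")

# The last residue should match the 653-residue precursor in the abstract
assert coords[-1][2] == 653
```

    Under these assumptions the mature protein occupies residues 206 to 473 of the precursor.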

  7. Expression Atlas update—a database of gene and transcript expression from microarray- and sequencing-based functional genomics experiments

    PubMed Central

    Petryszak, Robert; Burdett, Tony; Fiorelli, Benedetto; Fonseca, Nuno A.; Gonzalez-Porta, Mar; Hastings, Emma; Huber, Wolfgang; Jupp, Simon; Keays, Maria; Kryvych, Nataliya; McMurry, Julie; Marioni, John C.; Malone, James; Megy, Karine; Rustici, Gabriella; Tang, Amy Y.; Taubert, Jan; Williams, Eleanor; Mannion, Oliver; Parkinson, Helen E.; Brazma, Alvis

    2014-01-01

    Expression Atlas (http://www.ebi.ac.uk/gxa) is a value-added database providing information about gene, protein and splice variant expression in different cell types, organism parts, developmental stages, diseases and other biological and experimental conditions. The database consists of selected high-quality microarray and RNA-sequencing experiments from ArrayExpress that have been manually curated, annotated with Experimental Factor Ontology terms and processed using standardized microarray and RNA-sequencing analysis methods. The new version of Expression Atlas introduces the concept of ‘baseline’ expression, i.e. gene and splice variant abundance levels in healthy or untreated conditions, such as tissues or cell types. Differential gene expression data benefit from an in-depth curation of experimental intent, resulting in biologically meaningful ‘contrasts’, i.e. instances of differential pairwise comparisons between two sets of biological replicates. Other novel aspects of Expression Atlas are its strict quality control of raw experimental data, up-to-date RNA-sequencing analysis methods, expression data at the level of gene sets, as well as genes and a more powerful search interface designed to maximize the biological value provided to the user. PMID:24304889

  8. Generation and annotation of lodgepole pine and oleoresin-induced expressed sequences from the blue-stain fungus Ophiostoma clavigerum, a Mountain Pine Beetle-associated pathogen.

    PubMed

    DiGuistini, Scott; Ralph, Steven G; Lim, Young W; Holt, Robert; Jones, Steven; Bohlmann, Jörg; Breuil, Colette

    2007-02-01

    Ophiostoma clavigerum is a destructive pathogen of lodgepole pine (Pinus contorta) forests in western North America. It is therefore a relevant system for a genomics analysis of fungi vectored by bark beetles. To begin characterizing molecular interactions between the pathogen and its conifer host, we created an expressed sequence tag (EST) collection for O. clavigerum. Lodgepole pine sawdust and oleoresin media were selected to stimulate gene expression that would be specific to this host interaction. Over 6500 cDNA clones, derived from four normalized cDNA libraries, were single-pass sequenced from the 3' end. After quality screening, we identified 5975 high-quality reads with an average PHRED 20 of greater than 750 bp. Clustering and assembly of this high-quality EST set resulted in the identification of 2620 unique putative transcripts. BLASTX analysis revealed that only 67% of these unique transcripts could be matched to known or predicted protein sequences in public databases. Functional classification of these sequences provided initial insights into the transcriptome of O. clavigerum. Of particular interest, our ESTs represent an extensive collection of cytochrome P450s, ATP-binding-cassette-type transporters and genes involved in 1,8-dihydroxynaphthalene-melanin biosynthesis. These results are discussed in the context of detoxification of conifer oleoresins and fungal pathogenesis.

  9. Biostratigraphic expression of Pleistocene sequence boundaries, Gulf of Mexico

    SciTech Connect

    Martin, R.E.; Neff, E.D.; Johnson, G.W.; Krantz, D.E.

    1993-04-01

    The Quaternary section west of the Mississippi River delta consists of thousands of meters of terrigenous sediments, but the stratigraphic and paleoclimatic history recorded in these sequences is often distorted as a result of salt and shale diapirism. Quaternary sequences of the western Gulf of Mexico often reflect highly variable sediment accumulation rates within and between isolated salt-withdrawal basins and missing section resulting from unconformities and extensive faulting. The sedimentary record of Ocean Drilling Program's Core 625B (northeast Gulf of Mexico) contains significant unconformities that represent a record of sea-level change during the Pleistocene. The core may thus serve as a standard for timing of sea-level changes of the western Gulf. Utilizing primarily relative abundances of the warm-water Globorotalia menardii complex and cool-water G. inflata, we have subdivided the pre-zone W Pleistocene of Core 625B into 17 subzones, resulting in an average duration of approximately 100,000 years per unit. Based on graphic correlation, subzonal boundaries are largely coeval between sites and can provide high-resolution biostratigraphic subdivision of the Pleistocene of industrial wells on an operational basis. Also, the subzonation delineates anomalous paleotops that are reworked, erosionally truncated at sequence boundaries or delta-depressed as a result of localized sediment influx. Graphic correlation of subzonal boundaries coupled with available biostratigraphic and magnetostratigraphic datums has demonstrated the near synchroneity of subzonal boundaries and their utility in the subdivision of the Pleistocene. Using graphic correlation, the paleontologist can build viable exploration models that can be used to predict the occurrence of hydrocarbon reservoir sands. 87 refs., 13 figs.

  10. Constitutive heterochromatin: a surprising variety of expressed sequences.

    PubMed

    Dimitri, Patrizio; Caizzi, Ruggiero; Giordano, Ennio; Carmela Accardo, Maria; Lattanzi, Giovanna; Biamonti, Giuseppe

    2009-08-01

    The organization of chromosomes into euchromatin and heterochromatin is amongst the most important and enigmatic aspects of genome evolution. Constitutive heterochromatin is a basic yet still poorly understood component of eukaryotic chromosomes, and its molecular characterization by means of standard genomic approaches is intrinsically difficult. Although recent evidence indicates that the presence of transcribed genes in constitutive heterochromatin is a conserved trait that accompanies the evolution of eukaryotic genomes, the term heterochromatin is still considered by many as synonymous with gene silencing. In this paper, we comprehensively review data that provide a clearer picture of transcribed sequences within constitutive heterochromatin, with a special emphasis on Drosophila and humans.

  11. Identification of the promoter sequences involved in the cell specific expression of the rat somatostatin gene.

    PubMed Central

    Andrisani, O M; Hayes, T E; Roos, B; Dixon, J E

    1987-01-01

    DNA sequences containing the 5' flanking region of the rat somatostatin gene were linked to the coding sequence of the bacterial chloramphenicol acetyl transferase gene. This recombinant plasmid is active in expressing CAT activity in the neuronally derived, somatostatin-producing CA-77 cell line. Deletion analyses of the somatostatin promoter show that the sequences proximal to position -60, relative to the cap site, are required for expression of this promoter. A 4 base pair deletion of residues -46 through -43 within the somatostatin promoter results in a down mutation in vivo, suggesting the existence of an element critical for the expression of the promoter in CA-77 cells. In addition, the somatostatin recombinant and its 5' deletion constructs preferentially express CAT activity in CA-77 cells, whereas only a basal level of expression is observed in HeLa, BSC40, and RIN-5F cell lines, pointing to the cell-specific nature of this promoter. PMID:2886975

  12. New data base-independent, sequence tag-based scoring of peptide MS/MS data validates Mowse scores, recovers below threshold data, singles out modified peptides, and assesses the quality of MS/MS techniques.

    PubMed

    Savitski, Mikhail M; Nielsen, Michael L; Zubarev, Roman A

    2005-08-01

    The Mascot score (M-score) is one of the conventional validity measures in data base identification of peptides and proteins by MS/MS data. Although tremendously useful, M-score has a number of limitations. For the same MS/MS data, M-score may change if the protein data base is expanded. A low M-value may not necessarily mean a poor match but rather poor MS/MS quality. In addition, M-score does not fully utilize the advantage of combined use of the complementary fragmentation techniques collisionally activated dissociation (CAD) and electron capture dissociation (ECD). To address these issues, a new data base-independent scoring method (S-score) was designed that is based on the maximum length of the peptide sequence tag provided by the combined CAD and ECD data. The quality of MS/MS spectra assessed by S-score allows poor data (39% of all MS/MS spectra) to be filtered out before the data base search, speeding up the data analysis and eliminating a major source of false positive identifications. Spectra with below threshold M-scores (poor matches) but high S-scores are validated. Spectra with zero M-score (no data base match) but high S-score are classified as belonging to modified sequences. As an extension of S-score, an extremely reliable sequence tag was developed based on complementary fragments simultaneously appearing in CAD and ECD spectra. Comparison of this tag with the data base-derived sequence gives the most reliable peptide identification validation to date. The combined use of M- and S-scoring provides positive sequence identification from >25% of all MS/MS data, a 40% improvement over traditional M-scoring performed on the same Fourier transform MS instrumentation. The number of proteins reliably identified from Escherichia coli cell lysate hereby increased by 29% compared with the traditional M-score approach. Finally, S-scoring provides a quantitative measure of the quality of fragmentation techniques such as the minimum abundance of the precursor ion

  13. Delineation of Cis-Acting Sequences Required for Expression of Drosophila Mojavensis Adh-1

    PubMed Central

    Bayer, C. A.; Curtiss, S. W.; Weaver, J. A.; Sullivan, D. T.

    1992-01-01

    The control of expression of the Adh-1 gene of Drosophila mojavensis has been analyzed by transforming ADH null Drosophila melanogaster hosts with P element constructs which contain D. mojavensis Adh-1 having deletions of different extent in the 5' and 3' ends. Adh-1 expression in the D. melanogaster hosts is qualitatively similar to expression in D. mojavensis, although expression is quantitatively lower in transformants. Deletions of the 5' end indicate that information required for normal temporal and tissue expression in larvae is contained within 70 bp of the transcription start site. However, deletion constructs to -70 are deficient in ovarian nurse cell expression, whereas the additional upstream sequences present in constructs containing deletions to -257 do support expression in the ovary. Comparison of the nucleotide sequence in the -257 to -70 region of Adh-1 of four species: D. mojavensis and Drosophila arizona, which express Adh-1 in the ovary, and Drosophila mulleri and Drosophila navojoa, which do not, has led to the identification of regions of sequence similarity that correlate with ovary expression. One of these bears a striking similarity to a conserved sequence located upstream of the three heat shock genes that have constitutive ovarian expression and may be an ovarian control element. We have identified an aberrant aspect of Adh-1 expression. In transformants which carry an Adh-1 gene without a functional upstream Adh-2 gene Adh-1 expression continues into the adult stage instead of ceasing at the onset of metamorphosis. In transformants with a functional Adh-2 gene, Adh-1 expression ceases in the third larval instar stage and aberrant expression in the adult stage does not occur. PMID:1317314

  14. Gene gun bombardment-mediated expression and translocation of EGFP-tagged GLUT4 in skeletal muscle fibres in vivo.

    PubMed

    Lauritzen, Hans P M M; Reynet, Christine; Schjerling, Peter; Ralston, Evelyn; Thomas, Stephen; Galbo, Henrik; Ploug, Thorkil

    2002-09-01

    Cellular protein trafficking has been studied to date only in vitro or with techniques that are invasive and have a low time resolution. To establish a gentle method for analysis of glucose transporter-4 (GLUT4) trafficking in vivo in fully differentiated rat skeletal muscle fibres we combined the enhanced green fluorescent protein (EGFP) labelling technique with physical transfection methods in vivo: intramuscular plasmid injection or gene gun bombardment. During optimisation experiments with plasmid coding for the EGFP reporter alone EGFP-positive muscle fibres were counted after collagenase treatment of in vivo transfected flexor digitorum brevis (FDB) muscles. In contrast to gene gun bombardment, intramuscular injection produced EGFP expression in only a few fibres. Regardless of the transfection technique, EGFP expression was higher in muscles from 2-week-old rats than in those from 6-week-old rats and peaked around 1 week after transfection. The gene gun was used subsequently with a plasmid coding for EGFP linked to the C-terminus of GLUT4 (GLUT4-EGFP). Rats were anaesthetised 5 days after transfection and insulin given i.v. with or without accompanying electrical hindleg muscle stimulation. After stimulation, the hindlegs were fixed by perfusion. GLUT4-EGFP-positive FDB fibres were isolated and analysed by confocal microscopy. The intracellular distribution of GLUT4-EGFP under basal conditions as well as after translocation to the plasma membrane in response to insulin, contractions, or both, was in accordance with previous studies of endogenous GLUT4. Finally, GLUT4-EGFP trafficking in quadriceps muscle in vivo was studied using time-lapse microscopy analysis in anaesthetised mice and the first detailed time-lapse recordings of GLUT4-EGFP translocation in fully differentiated skeletal muscle in vivo were obtained.

  15. Single-step affinity and cost-effective purification of recombinant proteins using the Sepharose-binding lectin-tag from the mushroom Laetiporus sulphureus as fusion partner.

    PubMed

    Li, Xiao-Jing; Liu, Jin-Ling; Gao, Dong-Sheng; Wan, Wen-Yan; Yang, Xia; Li, Yong-Tao; Chang, Hong-Tao; Chen, Lu; Wang, Chuan-Qing; Zhao, Jun

    2016-03-01

    Previous research showed that a lectin from the mushroom Laetiporus sulphureus, designated LSL, bound to Sepharose and could be eluted by lactose. In this study, by taking advantage of the strong affinity of the LSL-tag for Sepharose, we developed a single-step purification method for LSL-tagged fusion proteins. We utilized unmodified Sepharose-4B as a specific adsorbent and 0.2 M lactose solution as an elution buffer. A fusion protein of LSL-tag and porcine circovirus capsid protein, designated LSL-Cap, was recovered with a purity of 90 ± 4% and a yield of 87 ± 3% from crude extract of recombinant Escherichia coli. To enable removal of the LSL-tag, a tobacco etch virus (TEV) protease recognition sequence was placed downstream of the LSL-tag in the expression vector, and LSL-tagged TEV protease, designated LSL-TEV, was also expressed in E. coli and was recovered with a purity of 82 ± 5% and a yield of 85 ± 2% from crude extract of recombinant E. coli. After digestion of LSL-tagged recombinant proteins with LSL-TEV, the LSL tag and LSL-TEV can be easily removed