2012-01-01
Background The feline genome is valuable to the veterinary and model organism genomics communities because the cat is an obligate carnivore and a model for endangered felids. The initial public release of the Felis catus genome assembly provided a framework for investigating the genomic basis of feline biology. However, the entire set of protein coding genes has not been elucidated. Results We identified and characterized 1227 protein coding feline sequences, of which 913 map to public sequences and 314 are novel. These sequences have been deposited into NCBI's GenBank database and complement public genomic resources by providing additional protein coding sequences that fill in some of the gaps in the feline genome assembly. Through functional and comparative genomic analyses, we gained an understanding of the role of these sequences in feline development, nutrition and health. Specifically, we identified 104 orthologs of human genes associated with Mendelian disorders. We detected negative selection within sequences with gene ontology annotations associated with intracellular trafficking, cytoskeleton and muscle functions. We detected relatively less negative selection on protein sequences encoding extracellular networks, apoptotic pathways and mitochondrial gene ontology annotations. Additionally, we characterized feline cDNA sequences that have mouse orthologs associated with clinical, nutritional and developmental phenotypes. Together, this analysis provides an overview of the value of our cDNA sequences and enhances our understanding of how the feline genome is similar to, and different from, other mammalian genomes. Conclusions The cDNA sequences reported here expand existing feline genomic resources by providing high-quality sequences annotated with comparative genomic information providing functional, clinical, nutritional and orthologous gene information. PMID:22257742
Westhoff, Connie M.; Uy, Jon Michael; Aguad, Maria; Smeland‐Wagman, Robin; Kaufman, Richard M.; Rehm, Heidi L.; Green, Robert C.; Silberstein, Leslie E.
2015-01-01
BACKGROUND There are 346 serologically defined red blood cell (RBC) antigens and 33 serologically defined platelet (PLT) antigens, most of which have known genetic changes in 45 RBC or six PLT genes that correlate with antigen expression. Polymorphic sites associated with antigen expression in the primary literature and reference databases are annotated according to nucleotide positions in cDNA. This makes antigen prediction from next‐generation sequencing data challenging, since it uses genomic coordinates. STUDY DESIGN AND METHODS The conventional cDNA reference sequences for all known RBC and PLT genes that correlate with antigen expression were aligned to the human reference genome. The alignments allowed conversion of conventional cDNA nucleotide positions to the corresponding genomic coordinates. RBC and PLT antigen prediction was then performed using the human reference genome and whole genome sequencing (WGS) data with serologic confirmation. RESULTS Some major differences and alignment issues were found when attempting to convert the conventional cDNA to human reference genome sequences for the following genes: ABO, A4GALT, RHD, RHCE, FUT3, ACKR1 (previously DARC), ACHE, FUT2, CR1, GCNT2, and RHAG. However, it was possible to create usable alignments, which facilitated the prediction of all RBC and PLT antigens with a known molecular basis from WGS data. Traditional serologic typing for 18 RBC antigens was in agreement with the WGS‐based antigen predictions, providing proof of principle for this approach. CONCLUSION Detailed mapping of conventional cDNA annotated RBC and PLT alleles can enable accurate prediction of RBC and PLT antigens from whole genomic sequencing data. PMID:26634332
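The conversion step described in this abstract, from cDNA nucleotide positions to genomic coordinates, can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name `cdna_to_genomic`, the exon coordinates, and the convention that cDNA position 1 is the first base of the first exon supplied are all assumptions made for illustration.

```python
from typing import List, Tuple

Exon = Tuple[int, int]  # 1-based, inclusive genomic start and end (hypothetical input)


def cdna_to_genomic(pos: int, exons: List[Exon], strand: str = "+") -> int:
    """Map a 1-based cDNA position to a 1-based genomic coordinate.

    Assumption: cDNA position 1 is the first transcribed base of the
    exons supplied, and the exons do not overlap.
    """
    if pos < 1:
        raise ValueError("cDNA positions are 1-based")
    # Walk the exons in transcript order: ascending on '+', descending on '-'.
    ordered = sorted(exons, reverse=(strand == "-"))
    remaining = pos
    for start, end in ordered:
        length = end - start + 1
        if remaining <= length:
            if strand == "+":
                return start + remaining - 1
            return end - remaining + 1
        remaining -= length
    raise ValueError("position lies beyond the end of the transcript")


# Two made-up exons of 10 bases each.
exons = [(100, 109), (200, 209)]
print(cdna_to_genomic(11, exons))        # first base of the second exon
print(cdna_to_genomic(11, exons, "-"))   # same cDNA position on the minus strand
```

A real conversion would take the exon structure from an alignment of the cDNA reference to the genome, which is where the alignment issues listed for genes such as ABO and RHD would surface.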
Li, XiaoChing; Wang, Xiu-Jie; Tannenhauser, Jonathan; Podell, Sheila; Mukherjee, Piali; Hertel, Moritz; Biane, Jeremy; Masuda, Shoko; Nottebohm, Fernando; Gaasterland, Terry
2007-01-01
Vocal learning and neuronal replacement have been studied extensively in songbirds, but until recently, few molecular and genomic tools for songbird research existed. Here we describe new molecular/genomic resources developed in our laboratory. We made cDNA libraries from zebra finch (Taeniopygia guttata) brains at different developmental stages. A total of 11,000 cDNA clones from these libraries, representing 5,866 unique gene transcripts, were randomly picked and sequenced from the 3′ ends. A web-based database was established for clone tracking, sequence analysis, and functional annotations. Our cDNA libraries were not normalized. Sequencing ESTs without normalization produced many developmental stage-specific sequences, yielding insights into patterns of gene expression at different stages of brain development. In particular, the cDNA library made from brains at posthatching day 30–50, corresponding to the period of rapid song system development and song learning, has the most diverse and richest set of genes expressed. We also identified five microRNAs whose sequences are highly conserved between zebra finch and other species. We printed cDNA microarrays and profiled gene expression in the high vocal center of both adult male zebra finches and canaries (Serinus canaria). Genes differentially expressed in the high vocal center were identified from the microarray hybridization results. Selected genes were validated by in situ hybridization. Networks among the regulated genes were also identified. These resources provide songbird biologists with tools for genome annotation, comparative genomics, and microarray gene expression analysis. PMID:17426146
Characterization of full-length sequenced cDNA inserts (FLIcs) from Atlantic salmon (Salmo salar)
Andreassen, Rune; Lunner, Sigbjørn; Høyheim, Bjørn
2009-01-01
Background Sequencing of the Atlantic salmon genome is now being planned by an international research consortium. Full-length sequenced inserts from cDNAs (FLIcs) are an important tool for correct annotation and clustering of the genomic sequence in any species. The large number of highly similar duplicate sequences caused by the relatively recent genome duplication in the salmonid ancestor represents a particular challenge for the genome project. FLIcs will therefore be an extremely useful resource for the Atlantic salmon sequencing project. In addition to helping distinguish between duplicate genome regions and determine correct gene structures, FLIcs are an important resource for functional genomic studies and for investigation of regulatory elements controlling gene expression. In contrast to the large number of ESTs available, including the ESTs from 23 developmental and tissue specific cDNA libraries contributed by the Salmon Genome Project (SGP), the number of sequences where the full length of the cDNA insert has been determined has been small. Results High quality full-length insert sequences from 560 pre-smolt white muscle tissue specific cDNAs were generated, accession numbers [GenBank: BT043497 - BT044056]. Five hundred and ten (91%) of the transcripts were annotated using Gene Ontology (GO) terms and 440 of the FLIcs are likely to contain a complete coding sequence (cCDS). The sequence information was used to identify putative paralogs, to characterize salmon Kozak motifs and polyadenylation signal variation, and to identify motifs likely to be involved in the regulation of particular genes. Finally, conserved 7-mers in the 3'UTRs were identified, of which some were identical to miRNA target sequences. Conclusion This paper describes the first Atlantic salmon FLIcs from a tissue and developmental stage specific cDNA library. We have demonstrated that many FLIcs contained a complete coding sequence (cCDS). 
This suggests that the remaining cDNA libraries generated by SGP represent a valuable cCDS FLIc source. The conservation of 7-mers in 3'UTRs indicates that these motifs are functionally important. Identity between some of these 7-mers and miRNA target sequences suggests that they are miRNA targets in Salmo salar transcripts as well. PMID:19878547
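One simple way to look for 7-mers shared across a set of 3'UTRs is sketched below. The abstract does not say how conservation was scored, so the presence-in-a-fraction-of-sequences rule, the function name `conserved_kmers`, and the toy sequences are assumptions for illustration only.

```python
from collections import Counter
from typing import Dict, Iterable


def conserved_kmers(utrs: Iterable[str], k: int = 7,
                    min_fraction: float = 0.5) -> Dict[str, int]:
    """Return k-mers found in at least min_fraction of the input sequences.

    Each sequence contributes a k-mer at most once, so the count is the
    number of sequences containing it, not the number of occurrences.
    """
    sequences = [s.upper() for s in utrs]
    presence = Counter()
    for seq in sequences:
        kmers = {seq[i:i + k] for i in range(len(seq) - k + 1)}
        presence.update(kmers)
    threshold = min_fraction * len(sequences)
    return {kmer: n for kmer, n in presence.items() if n >= threshold}


# Toy 3'UTR fragments (made up); only ACCCGGG occurs in two of the three.
toy_utrs = ["AAACCCGGGTTT", "TTACCCGGGAA", "GGGGGGGGGG"]
print(conserved_kmers(toy_utrs, k=7, min_fraction=0.6))
```

The resulting 7-mers could then be compared against known miRNA seed-match sequences, which is the kind of identity the abstract reports.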
Attomole-level Genomics with Single-molecule Direct DNA, cDNA and RNA Sequencing Technologies.
Ozsolak, Fatih
2016-01-01
With the introduction of next-generation sequencing (NGS) technologies in 2005, the domination of microarrays in genomics quickly came to an end due to NGS's superior technical performance and cost advantages. By enabling genetic analysis capabilities that were not possible previously, NGS technologies have started to play an integral role in all areas of biomedical research. This chapter outlines the low-quantity DNA and cDNA sequencing capabilities and applications developed with the Helicos single molecule DNA sequencing technology.
RICD: a rice indica cDNA database resource for rice functional genomics.
Lu, Tingting; Huang, Xuehui; Zhu, Chuanrang; Huang, Tao; Zhao, Qiang; Xie, Kabing; Xiong, Lizhong; Zhang, Qifa; Han, Bin
2008-11-26
The Oryza sativa L. indica subspecies is the most widely cultivated rice. During the last few years, we have collected over 20,000 putative full-length cDNAs and over 40,000 ESTs isolated from various cDNA libraries of two indica varieties Guangluai 4 and Minghui 63. A database of the rice indica cDNAs was therefore built to provide a comprehensive web data source for searching and retrieving the indica cDNA clones. Rice Indica cDNA Database (RICD) is an online MySQL-PHP driven database with a user-friendly web interface. It allows investigators to query the cDNA clones by keyword, genome position, nucleotide or protein sequence, and putative function. It also provides a series of information, including sequences, protein domain annotations, similarity search results, SNPs and InDels information, and hyperlinks to gene annotation in both The Rice Annotation Project Database (RAP-DB) and The TIGR Rice Genome Annotation Resource, expression atlas in RiceGE and variation report in Gramene of each cDNA. The online rice indica cDNA database provides cDNA resource with comprehensive information to researchers for functional analysis of indica subspecies and for comparative genomics. The RICD database is available through our website http://www.ncgr.ac.cn/ricd.
In silico Analysis of 2085 Clones from a Normalized Rat Vestibular Periphery 3′ cDNA Library
Roche, Joseph P.; Cioffi, Joseph A.; Kwitek, Anne E.; Erbe, Christy B.; Popper, Paul
2005-01-01
The inserts from 2400 cDNA clones isolated from a normalized Rattus norvegicus vestibular periphery cDNA library were sequenced and characterized. The Wackym-Soares vestibular 3′ cDNA library was constructed from the saccular and utricular maculae, the ampullae of all three semicircular canals and Scarpa's ganglia containing the somata of the primary afferent neurons, microdissected from 104 male and female rats. The inserts from 2400 randomly selected clones were sequenced from the 5′ end. Each sequence was analyzed using the BLAST algorithm compared to the GenBank nonredundant, rat genome, mouse genome and human genome databases to search for high homology alignments. Of the initial 2400 clones, 315 (13%) were found to be of poor quality and did not yield useful information, and therefore were eliminated from the analysis. Of the remaining 2085 sequences, 918 (44%) were found to represent 758 unique genes having useful annotations that were identified in databases within the public domain or in the published literature; these sequences were designated as known characterized sequences. A further 1141 sequences (55%) aligned with 1011 unique sequences that had no useful annotations; these were designated as known but uncharacterized sequences. Of the remaining 26 sequences (1%), 24 aligned with rat genomic sequences, but none matched previously described rat expressed sequence tags or mRNAs. No significant alignment to the rat or human genomic sequences could be found for the remaining 2 sequences. Of the 2085 sequences analyzed, 86% were singletons. The known, characterized sequences were analyzed with the FatiGO online data-mining tool (http://fatigo.bioinfo.cnio.es/) to identify level 5 biological process gene ontology (GO) terms for each alignment and to group alignments with similar or identical GO terms. Numerous genes were identified that have not been previously shown to be expressed in the vestibular system. 
Further characterization of the novel cDNA sequences may lead to the identification of genes with vestibular-specific functions. Continued analysis of the rat vestibular periphery transcriptome should provide new insights into vestibular function and generate new hypotheses. Physiological studies are necessary to further elucidate the roles of the identified genes and novel sequences in vestibular function. PMID:16103642
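The clone counts in this abstract can be checked with a few lines of arithmetic. The sketch below uses only the numbers quoted in the abstract; the variable names and category labels are illustrative.

```python
# Counts quoted in the abstract.
total_clones = 2400
poor_quality = 315
analyzed = total_clones - poor_quality  # sequences kept for analysis

categories = {
    "known characterized": 918,
    "known uncharacterized": 1141,
    "no EST/mRNA match": 26,  # 24 aligned to rat genomic sequence, 2 to nothing
}

# The three categories should account for every analyzed sequence.
assert sum(categories.values()) == analyzed

percentages = {name: round(100 * n / analyzed) for name, n in categories.items()}
poor_quality_pct = round(100 * poor_quality / total_clones)

print(analyzed, percentages, poor_quality_pct)
```

The rounded values agree with the 13%, 44%, 55% and 1% figures given in the text.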
Illumina sequencing of green stink bug nymph and adult cDNA to identify potential RNAi gene targets
USDA-ARS's Scientific Manuscript database
Whole-body transcriptomes for nymphs and adults of the green stink bug, Acrosternum hilare (Say), were sequenced on an Illumina® Genome Analyzer IIx sequencer. The insects were collected from sites in North Carolina and Virginia, USA. The cDNA library for each sample was sequenced on one lane of an...
Baxter, Laura L; Hsu, Benjamin J; Umayam, Lowell; Wolfsberg, Tyra G; Larson, Denise M; Frith, Martin C; Kawai, Jun; Hayashizaki, Yoshihide; Carninci, Piero; Pavan, William J
2007-06-01
As part of the RIKEN mouse encyclopedia project, two cDNA libraries were prepared from melanocyte-derived cell lines, using techniques of full-length clone selection and subtraction/normalization to enrich for rare transcripts. End sequencing showed that these libraries display over 83% complete coding sequence at the 5' end and 96-97% complete coding sequence at the 3' end. Evaluation of the libraries, derived from B16F10Y tumor cells and melan-c cells, revealed that they contain clones for a majority of the genes previously demonstrated to function in melanocyte biology. Analysis of genomic locations for transcripts revealed that the distribution of melanocyte genes is non-random throughout the genome. Three genomic regions identified that showed significant clustering of melanocyte-expressed genes contain one or more genes previously shown to regulate melanocyte development or function. A catalog of genes expressed in these libraries is presented, providing a valuable resource of cDNA clones and sequence information that can be used for identification of new genes important for melanocyte development, function, and disease.
The Porcelain Crab Transcriptome and PCAD, the Porcelain Crab Microarray and Sequence Database
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tagmount, Abderrahmane; Wang, Mei; Lindquist, Erika
2010-01-27
Background: With the emergence of a completed genome sequence of the freshwater crustacean Daphnia pulex, construction of genomic-scale sequence databases for additional crustacean sequences is important for comparative genomics and annotation. Porcelain crabs, genus Petrolisthes, have been powerful crustacean models for environmental and evolutionary physiology with respect to thermal adaptation and understanding responses of marine organisms to climate change. Here, we present a large-scale EST sequencing and cDNA microarray database project for the porcelain crab Petrolisthes cinctipes. Methodology/Principal Findings: A set of ~30K unique sequences (UniSeqs) representing ~19K clusters were generated from ~98K high quality ESTs from a set of tissue specific non-normalized and mixed-tissue normalized cDNA libraries from the porcelain crab Petrolisthes cinctipes. Homology for each UniSeq was assessed using BLAST, InterProScan, GO and KEGG database searches. Approximately 66 percent of the UniSeqs had homology in at least one of the databases. All EST and UniSeq sequences along with annotation results and coordinated cDNA microarray datasets have been made publicly accessible at the Porcelain Crab Array Database (PCAD), a feature-enriched version of the Stanford and Longhorn Array Databases. Conclusions/Significance: The EST project presented here represents the third largest sequencing effort for any crustacean, and the largest effort for any crab species. Our assembly and clustering results suggest that our porcelain crab EST data set is equally diverse to the much larger EST set generated in the Daphnia pulex genome sequencing project, and thus will be an important resource to the Daphnia research community. Our homology results support the pancrustacea hypothesis and suggest that Malacostraca may be ancestral to Branchiopoda and Hexapoda. 
Our results also suggest that our cDNA microarrays cover as much of the transcriptome as can reasonably be captured in EST library sequencing approaches, and thus represent a rich resource for studies of environmental genomics.
2011-01-01
Background Common bean is an important legume crop with only a moderate number of short expressed sequence tags (ESTs) made with traditional methods. The goal of this research was to use full-length cDNA technology to develop ESTs that would overlap with the beginning of open reading frames and therefore be useful for gene annotation of genomic sequences. The library was also constructed to represent genes expressed under drought, low soil phosphorus and high soil aluminum toxicity. We also undertook comparisons of the full-length cDNA library to two previous non-full clone EST sets for common bean. Results Two full-length cDNA libraries were constructed: one for the drought tolerant Mesoamerican genotype BAT477 and the other one for the acid-soil tolerant Andean genotype G19833 which has been selected for genome sequencing. Plants were grown in three soil types using deep rooting cylinders subjected to drought and non-drought stress and tissues were collected from both roots and above ground parts. A total of 20,000 clones were selected robotically, half from each library. Then, nearly 10,000 clones from the G19833 library were sequenced with an average read length of 850 nucleotides. A total of 4,219 unigenes were identified consisting of 2,981 contigs and 1,238 singletons. These were functionally annotated with gene ontology terms and placed into KEGG pathways. Compared to other EST sequencing efforts in common bean, about half of the sequences were novel or represented the 5' ends of known genes. Conclusions The present full-length cDNA libraries add to the technological toolbox available for common bean and our sequencing of these clones substantially increases the number of unique EST sequences available for the common bean genome. All of this should be useful for both functional gene annotation, analysis of splice site variants and intron/exon boundary determination by comparison to soybean genes or with common bean whole-genome sequences. 
In addition the library has a large number of transcription factors and will be interesting for discovery and validation of drought or abiotic stress related genes in common bean. PMID:22118559
A comprehensive list of cloned human DNA sequences
Schmidtke, Jörg; Cooper, David N.
1987-01-01
A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:3575113
A comprehensive list of cloned human DNA sequences
Schmidtke, Jörg; Cooper, David N.
1990-01-01
A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:2333227
A comprehensive list of cloned human DNA sequences
Schmidtke, Jörg; Cooper, David N.
1988-01-01
A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:3368330
A comprehensive list of cloned human DNA sequences
Schmidtke, Jörg; Cooper, David N.
1989-01-01
A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:2654889
Molecular Targeting of Prostate Cancer During Androgen Ablation: Inhibition of CHES1/FOXN3
2013-05-01
the DNA sequences (~25^6 reads/sample) were mapped to the human genome reference sequence (hg19...tumor the AR has a genomic abnormality, placing the novel sequence 3’ of the transcriptional start site. However, it is unclear if a genomic alteration...exon/intron organization of the CHES1 gene was determined by BLAST analysis of the human genome using the 1,473-bp CHES1 cDNA sequence
Hurrelbrink, R J; Nestorowicz, A; McMinn, P C
1999-12-01
An infectious cDNA clone of Murray Valley encephalitis virus prototype strain 1-51 (MVE-1-51) was constructed by stably inserting genome-length cDNA into the low-copy-number plasmid vector pMC18. Designated pMVE-1-51, the clone consisted of genome-length cDNA of MVE-1-51 under the control of a T7 RNA polymerase promoter. The clone was constructed by using existing components of a cDNA library, in addition to cDNA of the 3' terminus derived by RT-PCR of poly(A)-tailed viral RNA. Upon comparison with other flavivirus sequences, the previously undetermined sequence of the 3' UTR was found to contain elements conserved throughout the genus Flavivirus. RNA transcribed from pMVE-1-51 and subsequently transfected into BHK-21 cells generated infectious virus. The plaque morphology, replication kinetics and antigenic profile of clone-derived virus (CDV-1-51) were similar to those of the parental virus in vitro. Furthermore, the virulence properties of CDV-1-51 and MVE-1-51 (LD(50) values and mortality profiles) were found to be identical in vivo in the mouse model. Through site-directed mutagenesis, the infectious clone should serve as a valuable tool for investigating the molecular determinants of virulence in MVE virus.
Saito, T; Ochiai, H
1999-10-01
cDNA fragments putatively encoding amino acid sequences characteristic of the fatty acid desaturase were obtained using expressed sequence tag (EST) information of the Dictyostelium cDNA project. Using this sequence, we have determined the cDNA sequence and genomic sequence of a desaturase. The cloned cDNA is 1489 nucleotides long and the deduced amino acid sequence comprised 464 amino acid residues containing an N-terminal cytochrome b5 domain. The whole sequence was 38.6% identical to the initially identified Delta5-desaturase of Mortierella alpina. We have confirmed its function as Delta5-desaturase by overexpression mutation in D. discoideum and also the gain of function mutation in the yeast Saccharomyces cerevisiae. Analysis of the lipids from transformed D. discoideum and yeast demonstrated the accumulation of Delta5-desaturated products. This is the first report concerning fatty acid desaturase in cellular slime molds.
Bhore, Subhash J; Kassim, Amelia; Loh, Chye Ying; Shah, Farida H
2010-01-01
It is well known that the nutritional quality of the American oil-palm (Elaeis oleifera) mesocarp oil is superior to that of African oil-palm (Elaeis guineensis Jacq. Tenera) mesocarp oil. Therefore, it is important to identify the genetic features underlying its superior value. This could be achieved through the genome sequencing of the oil-palm. However, the genome sequence is not available in the public domain due to commercial secrecy. Hence, we constructed a cDNA library and generated expressed sequence tags (3,205) from the mesocarp tissue of the American oil-palm. We continued to annotate each of these cDNAs after submitting to GenBank/DDBJ/EMBL. A rough analysis turned our attention to the beta-carotene hydroxylase (Chyb) enzyme encoding cDNA. Then, we completed the full sequencing of both strands of the cDNA clone using M13 forward and reverse primers. The full nucleotide and protein sequence was further analyzed and annotated using various bioinformatics tools. The analysis results showed the presence of a fatty acid hydroxylase superfamily domain in the protein sequence. The multiple sequence alignment of selected Chyb amino acid sequences from other plant species and algal members with E. oleifera Chyb using ClustalW and its phylogenetic analysis suggest that Chyb from the monocotyledonous plant species Lilium hybrid, Crocus sativus and Zea mays are the most closely related evolutionarily to E. oleifera Chyb. This study reports the annotation of E. oleifera Chyb. Abbreviations ESTs - expressed sequence tags, EoChyb - Elaeis oleifera beta-carotene hydroxylase, MC - main cluster PMID:21364789
USDA-ARS's Scientific Manuscript database
The complete genome sequence (6,423 nt) of an emerging Cucumber green mottle mosaic virus (CGMMV) isolate on cucumber in North America was determined through deep sequencing of sRNA and rapid amplification of cDNA ends. It shares 99% nucleotide sequence identity to the Asian genotype, but only 90% t...
Lu, L; Komada, M; Kitamura, N
1998-06-15
Hrs is a 115 kDa zinc finger protein which is rapidly tyrosine phosphorylated in cells stimulated with various growth factors. We previously purified the protein from a mouse cell line and cloned its cDNA. In the present study, we cloned a human Hrs cDNA from a human placenta cDNA library by cross-hybridization, using the mouse cDNA as a probe, and determined its nucleotide sequence. The human Hrs cDNA encoded a 777-amino-acid protein whose sequence was 93% identical to that of mouse Hrs. Northern blot analysis showed that the Hrs mRNA was about 3.0 kb long and was expressed in all the human adult and fetal tissues tested. In addition, we showed by genomic Southern blot analysis that the human Hrs gene was a single-copy gene with a size of about 20 kb. Furthermore, the human Hrs gene was mapped to chromosome 17 by Southern blotting of genomic DNAs from human/rodent somatic cell hybrids. Copyright 1998 Elsevier Science B.V. All rights reserved.
Molecular cloning of chitinase 33 (chit33) gene from Trichoderma atroviride
Matroudi, S.; Zamani, M.R.; Motallebi, M.
2008-01-01
In this study Trichoderma atroviride was selected as an overproducer of chitinase among 30 different isolates of Trichoderma sp. on the basis of chitinase specific activity. From this isolate the genomic and cDNA clones encoding chit33 have been isolated and sequenced. Comparison of the genomic and cDNA sequences to define the gene structure indicates that this gene contains three short introns and an open reading frame coding for a protein of 321 amino acids. The deduced amino acid sequence includes a 19 aa putative signal peptide. Homology between this sequence and other reported Trichoderma Chit33 proteins is discussed. The coding sequence of the chit33 gene was cloned in the pET26b(+) expression vector and expressed in E. coli. PMID:24031242
Hu, Lin-Yong; Cui, Chen-Chen; Song, Yu-Jie; Wang, Xiang-Guo; Jin, Ya-Ping; Wang, Ai-Hua; Zhang, Yong
2012-07-01
cDNA is widely used in gene function elucidation and/or transgenics research but often suitable tissues or cells from which to isolate mRNA for reverse transcription are unavailable. Here, an alternative method for cDNA cloning is described and tested by cloning the cDNA of human LALBA (human alpha-lactalbumin) from genomic DNA. First, genomic DNA containing all of the coding exons was cloned from human peripheral blood and inserted into a eukaryotic expression vector. Next, by delivering the plasmids into either 293T or fibroblast cells, surrogate cells were constructed. Finally, the total RNA was extracted from the surrogate cells and cDNA was obtained by RT-PCR. The human LALBA cDNA that was obtained was compared with the corresponding mRNA published in GenBank. The comparison showed that the two sequences were identical. The novel method for cDNA cloning from surrogate eukaryotic cells described here uses well-established techniques that are feasible and simple to use. We anticipate that this alternative method will have widespread applications.
NASA Astrophysics Data System (ADS)
Sun, S. M.; Slightom, J. L.; Hall, T. C.
1981-01-01
A plant gene coding for the major storage protein (phaseolin, G1-globulin) of the French bean was isolated from a genomic library constructed in the phage vector Charon 24A. Comparison of the nucleotide sequence of part of the gene with that of the cloned messenger RNA (cDNA) revealed the presence of three intervening sequences, all beginning with GT and ending with AG. The 5' and 3' boundaries of intervening sequences IVS-A (88 base pairs) and IVS-B (124 base pairs) are similar to those described for animal and viral genes, but the 3' boundary of IVS-C (129 base pairs) shows some differences. A sequence of 185 amino acids deduced from the cloned DNAs represents about 40% of a phaseolin polypeptide.
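The GT...AG boundary rule mentioned in this abstract can be illustrated with a short sketch that removes annotated intervening sequences from a genomic sequence and checks each one for the canonical dinucleotides. The function names, the 0-based half-open coordinate convention, and the toy sequence are assumptions; they are not taken from the phaseolin work.

```python
from typing import List, Tuple


def has_gt_ag(intron: str) -> bool:
    """True if the intron starts with GT and ends with AG (case-insensitive)."""
    s = intron.upper()
    return len(s) >= 4 and s.startswith("GT") and s.endswith("AG")


def splice(genomic: str, introns: List[Tuple[int, int]]) -> str:
    """Remove introns given as 0-based, half-open (start, end) intervals.

    Raises ValueError if an intron does not follow the GT...AG rule.
    """
    pieces, cursor = [], 0
    for start, end in sorted(introns):
        if not has_gt_ag(genomic[start:end]):
            raise ValueError(f"intron {start}-{end} lacks GT...AG boundaries")
        pieces.append(genomic[cursor:start])  # exon sequence before this intron
        cursor = end
    pieces.append(genomic[cursor:])  # final exon
    return "".join(pieces)


# Toy gene: two exon fragments around a 12-base intron (made-up sequence).
toy_gene = "ATGGCC" + "GTAAGTTTTCAG" + "AAATGA"
print(splice(toy_gene, [(6, 18)]))
```

In practice the intron intervals would come from aligning the cDNA to the genomic clone, as the abstract describes.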
Bozzoni, I; Beccari, E; Luo, Z X; Amaldi, F
1981-01-01
Poly-A+ mRNA from Xenopus laevis oocytes, partially enriched for r-protein coding capacity, has been used as starting material for preparing a cDNA bank in plasmid pBR322. The clones containing sequences specific for r-proteins have been selected by translation of the complementary mRNAs. Clones for six different r-proteins have been identified and utilized as probes for studying their genomic organization. Two gene copies per haploid genome were found for r-proteins L1, L14 and S19, and four to five for proteins S1, S8 and L32. Moreover, a population polymorphism has been observed for the genomic regions containing sequences for r-proteins S1, S8 and L14. PMID:6112733
Analysis of expressed sequence tags generated from full-length enriched cDNA libraries of melon
2011-01-01
Background Melon (Cucumis melo), an economically important vegetable crop, belongs to the Cucurbitaceae family which includes several other important crops such as watermelon, cucumber, and pumpkin. It has served as a model system for sex determination and vascular biology studies. However, genomic resources currently available for melon are limited. Result We constructed eleven full-length enriched and four standard cDNA libraries from fruits, flowers, leaves, roots, cotyledons, and calluses of four different melon genotypes, and generated 71,577 and 22,179 ESTs from full-length enriched and standard cDNA libraries, respectively. These ESTs, together with ~35,000 ESTs available in public domains, were assembled into 24,444 unigenes, which were extensively annotated by comparing their sequences to different protein and functional domain databases, assigning them Gene Ontology (GO) terms, and mapping them onto metabolic pathways. Comparative analysis of melon unigenes and other plant genomes revealed that 75% to 85% of melon unigenes had homologs in other dicot plants, while approximately 70% had homologs in monocot plants. The analysis also identified 6,972 gene families that were conserved across dicot and monocot plants, and 181, 1,192, and 220 gene families specific to fleshy fruit-bearing plants, the Cucurbitaceae family, and melon, respectively. Digital expression analysis identified a total of 175 tissue-specific genes, which provides a valuable gene sequence resource for future genomics and functional studies. Furthermore, we identified 4,068 simple sequence repeats (SSRs) and 3,073 single nucleotide polymorphisms (SNPs) in the melon EST collection. Finally, we obtained a total of 1,382 melon full-length transcripts through the analysis of full-length enriched cDNA clones that were sequenced from both ends. 
Analysis of these full-length transcripts indicated that sizes of melon 5' and 3' UTRs were similar to those of tomato, but longer than many other dicot plants. Codon usages of melon full-length transcripts were largely similar to those of Arabidopsis coding sequences. Conclusion The collection of melon ESTs generated from full-length enriched and standard cDNA libraries is expected to play significant roles in annotating the melon genome. The ESTs and associated analysis results will be useful resources for gene discovery, functional analysis, marker-assisted breeding of melon and closely related species, comparative genomic studies and for gaining insights into gene expression patterns. PMID:21599934
Rapid and efficient cDNA library screening by self-ligation of inverse PCR products (SLIP).
Hoskins, Roger A; Stapleton, Mark; George, Reed A; Yu, Charles; Wan, Kenneth H; Carlson, Joseph W; Celniker, Susan E
2005-12-02
cDNA cloning is a central technology in molecular biology. cDNA sequences are used to determine mRNA transcript structures, including splice junctions, open reading frames (ORFs) and 5'- and 3'-untranslated regions (UTRs). cDNA clones are valuable reagents for functional studies of genes and proteins. Expressed Sequence Tag (EST) sequencing is the method of choice for recovering cDNAs representing many of the transcripts encoded in a eukaryotic genome. However, EST sequencing samples a cDNA library at random, and it recovers transcripts with low expression levels inefficiently. We describe a PCR-based method for directed screening of plasmid cDNA libraries. We demonstrate its utility in a screen of libraries used in our Drosophila EST projects for 153 transcription factor genes that were not represented by full-length cDNA clones in our Drosophila Gene Collection. We recovered high-quality, full-length cDNAs for 72 genes and variously compromised clones for an additional 32 genes. The method can be used at any scale, from the isolation of cDNA clones for a particular gene of interest, to the improvement of large gene collections in model organisms and the human. Finally, we discuss the relative merits of directed cDNA library screening and RT-PCR approaches.
Tange, N; Jong-Young, L; Mikawa, N; Hirono, I; Aoki, T
1997-12-01
A cDNA clone of rainbow trout (Oncorhynchus mykiss) transferrin was obtained from a liver cDNA library. The 2537-bp cDNA sequence contained an open reading frame encoding 691 amino acids and the 5' and 3' noncoding regions. The amino acid sequences at the iron-binding sites and the two N-linked glycosylation sites, and the cysteine residues, were consistent with known, conserved vertebrate transferrin cDNA sequences. Single N-linked glycosylation sites existed on the N- and C-lobe. The deduced amino acid sequence of the rainbow trout transferrin cDNA had 92.9% identity with the transferrin of coho salmon (Oncorhynchus kisutch), 85% with Atlantic salmon (Salmo salar), 67.3% with medaka (Oryzias latipes), 61.3% with Atlantic cod (Gadus morhua), and 59.7% with Japanese flounder (Paralichthys olivaceus). Long and accurate polymerase chain reaction (LA-PCR) was used to amplify approximately 6.5 kb of the transferrin gene from rainbow trout genomic DNA. Restriction fragment length polymorphisms (RFLPs) of the LA-PCR products revealed three digestion patterns in 22 samples.
Genome-Wide Profiling of RNA–Protein Interactions Using CLIP-Seq
Stork, Cheryl; Zheng, Sika
2017-01-01
UV crosslinking immunoprecipitation (CLIP) is an increasingly popular technique to study protein–RNA interactions in tissues and cells. Whole cells or tissues are ultraviolet irradiated to generate a covalent bond between RNA and proteins that are in close contact. After partial RNase digestion, antibodies specific to an RNA binding protein (RBP) or to a protein–epitope tag are used to immunoprecipitate the protein–RNA complexes. After stringent washing and gel separation, the RBP–RNA complex is excised. The RBP is protease digested to allow purification of the bound RNA. Reverse transcription of the RNA followed by high-throughput sequencing of the cDNA library is now often used to identify protein-bound RNA on a genome-wide scale. UV irradiation can result in cDNA truncations and/or mutations at the crosslink sites, which complicates the alignment of the sequencing library to the reference genome and the identification of the crosslinking sites. Meanwhile, one or more amino acids of a crosslinked RBP can remain attached to its bound RNA due to incomplete digestion of the protein. As a result, reverse transcriptase may not read through the crosslink sites, and instead produces cDNA ending at the crosslinked nucleotide. This is harnessed by one variant of CLIP methods to identify crosslinking sites at nucleotide resolution. This method, individual nucleotide resolution CLIP (iCLIP), circularizes cDNA to capture the truncated cDNA and also increases the efficiency of ligating sequencing adapters to the library. Here, we describe the detailed procedure of iCLIP. PMID:26965263
Minimap2: pairwise alignment for nucleotide sequences.
Li, Heng
2018-05-10
Recent advances in sequencing technologies promise ultra-long reads of ∼100 kilobases (kb) on average, full-length mRNA or cDNA reads in high throughput and genomic contigs over 100 megabases (Mb) in length. Existing alignment programs are unable to process such data at scale, or do so inefficiently, which presses for the development of new alignment algorithms. Minimap2 is a general-purpose alignment program to map DNA or long mRNA sequences against a large reference database. It works with accurate short reads of ≥100 bp in length, ≥1 kb genomic reads at error rate ∼15%, full-length noisy Direct RNA or cDNA reads, and assembly contigs or closely related full chromosomes of hundreds of megabases in length. Minimap2 does split-read alignment, employs concave gap cost for long insertions and deletions (INDELs) and introduces new heuristics to reduce spurious alignments. It is 3-4 times as fast as mainstream short-read mappers at comparable accuracy, and is ≥30 times faster than long-read genomic or cDNA mappers at higher accuracy, surpassing most aligners specialized in one type of alignment. Availability: https://github.com/lh3/minimap2. Contact: hengli@broadinstitute.org.
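The idea of a concave gap cost for long INDELs can be illustrated with a two-piece affine function, which charges a gap of length k the cheaper of two affine penalties. This is only a minimal sketch: the function name is hypothetical and the parameter values (4, 2, 24, 1) are assumed for illustration, so they should be checked against the Minimap2 documentation rather than taken as the program's settings.

```python
def two_piece_gap_cost(k, q=4, e=2, q2=24, e2=1):
    """Cost of a gap of length k under a two-piece affine model.

    Short gaps follow the steep line q + k*e; long gaps switch to the
    flatter line q2 + k*e2, so extending an already long gap is cheaper
    per base. All parameter values are illustrative assumptions.
    """
    if k <= 0:
        return 0
    return min(q + k * e, q2 + k * e2)


if __name__ == "__main__":
    for k in (1, 10, 20, 100):
        print(k, two_piece_gap_cost(k))
```

With these assumed values the two lines cross at k = 20, so only gaps longer than 20 bases benefit from the flatter second piece.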
Brain cDNA clone for human cholinesterase
DOE Office of Scientific and Technical Information (OSTI.GOV)
McTiernan, C.; Adkins, S.; Chatonnet, A.
1987-10-01
A cDNA library from human basal ganglia was screened with oligonucleotide probes corresponding to portions of the amino acid sequence of human serum cholinesterase. Five overlapping clones, representing 2.4 kilobases, were isolated. The sequenced cDNA contained 207 base pairs of coding sequence 5' to the amino terminus of the mature protein in which there were four ATG translation start sites in the same reading frame as the protein. Only the ATG coding for Met-(-28) lay within a favorable consensus sequence for functional initiators. There were 1722 base pairs of coding sequence corresponding to the protein found circulating in human serum. The amino acid sequence deduced from the cDNA exactly matched the 574 amino acid sequence of human serum cholinesterase, as previously determined by Edman degradation. Therefore, our clones represented cholinesterase rather than acetylcholinesterase. It was concluded that the amino acid sequences of cholinesterase from two different tissues, human brain and human serum, were identical. Hybridization of genomic DNA blots suggested that a single gene, or very few genes, coded for cholinesterase.
Aramrak, Attawan; Kidwell, Kimberlee K; Steber, Camille M; Burke, Ian C
2015-10-23
5-Enolpyruvylshikimate-3-phosphate synthase (EPSPS) is the sixth and penultimate enzyme in the shikimate biosynthesis pathway, and is the target of the herbicide glyphosate. The EPSPS genes of allohexaploid wheat (Triticum aestivum, AABBDD) have not been well characterized. Herein, the three homoeologous copies of the allohexaploid wheat EPSPS gene were cloned and characterized. Genomic and coding DNA sequences of EPSPS from the three related genomes of allohexaploid wheat were isolated using PCR and inverse PCR approaches from soft white spring "Louise". Development of genome-specific primers allowed the mapping and expression analysis of TaEPSPS-7A1, TaEPSPS-7D1, and TaEPSPS-4A1 on chromosomes 7A, 7D, and 4A, respectively. Sequence alignments of cDNA sequences from wheat and wheat relatives served as a basis for phylogenetic analysis. The three genomic copies of wheat EPSPS differed by insertion/deletion and single nucleotide polymorphisms (SNPs), largely in intron sequences. RT-PCR analysis and cDNA cloning revealed that EPSPS is expressed from all three genomic copies. However, TaEPSPS-4A1 is expressed at much lower levels than TaEPSPS-7A1 and TaEPSPS-7D1 in wheat seedlings. Phylogenetic analysis of 1190-bp cDNA clones from wheat and wheat relatives revealed that: 1) TaEPSPS-7A1 is most similar to EPSPS from the tetraploid AB genome donor, T. turgidum (99.7% identity); 2) TaEPSPS-7D1 most resembles EPSPS from the diploid D genome donor, Aegilops tauschii (100% identity); and 3) TaEPSPS-4A1 resembles EPSPS from the diploid B genome relative, Ae. speltoides (97.7% identity). Thus, EPSPS sequences in allohexaploid wheat are preserved from the two most recent ancestors. The wheat EPSPS genes are more closely related to Lolium multiflorum and Brachypodium distachyon than to Oryza sativa (rice). 
The three related EPSPS homoeologues of wheat exhibited conservation of the exon/intron structure and of coding region sequence, but contained significant sequence variation within intron regions. The genome-specific primers developed will enable future characterization of natural and induced variation in EPSPS sequence and expression. This can be useful in investigating new causes of glyphosate herbicide resistance.
Cost-effective sequencing of full-length cDNA clones powered by a de novo-reference hybrid assembly.
Kuroshu, Reginaldo M; Watanabe, Junichi; Sugano, Sumio; Morishita, Shinichi; Suzuki, Yutaka; Kasahara, Masahiro
2010-05-07
Sequencing full-length cDNA clones is important to determine gene structures including alternative splice forms, and provides valuable resources for experimental analyses to reveal the biological functions of coded proteins. However, previous approaches for sequencing cDNA clones were expensive or time-consuming, and therefore, a fast and efficient sequencing approach was demanded. We developed a program, MuSICA 2, that assembles millions of short (36-nucleotide) reads collected from a single flow cell lane of Illumina Genome Analyzer to shotgun-sequence approximately 800 human full-length cDNA clones. MuSICA 2 performs a hybrid assembly in which an external de novo assembler is run first and the result is then improved by reference alignment of shotgun reads. We compared the MuSICA 2 assembly with 200 pooled full-length cDNA clones finished independently by the conventional primer-walking using Sanger sequencers. The exon-intron structure of the coding sequence was correct for more than 95% of the clones with coding sequence annotation when we excluded cDNA clones insufficiently represented in the shotgun library due to PCR failure (42 out of 200 clones excluded), and the nucleotide-level accuracy of coding sequences of those correct clones was over 99.99%. We also applied MuSICA 2 to full-length cDNA clones from Toxoplasma gondii, to confirm that its ability was competent even for non-human species. The entire sequencing and shotgun assembly takes less than 1 week and the consumables cost only approximately US$3 per clone, demonstrating a significant advantage over previous approaches.
Pairagon: a highly accurate, HMM-based cDNA-to-genome aligner.
Lu, David V; Brown, Randall H; Arumugam, Manimozhiyan; Brent, Michael R
2009-07-01
The most accurate way to determine the intron-exon structures in a genome is to align spliced cDNA sequences to the genome. Thus, cDNA-to-genome alignment programs are a key component of most annotation pipelines. The scoring system used to choose the best alignment is a primary determinant of alignment accuracy, while heuristics that prevent consideration of certain alignments are a primary determinant of runtime and memory usage. Both accuracy and speed are important considerations in choosing an alignment algorithm, but scoring systems have received much less attention than heuristics. We present Pairagon, a pair hidden Markov model based cDNA-to-genome alignment program, as the most accurate aligner for sequences with high- and low-identity levels. We conducted a series of experiments testing alignment accuracy with varying sequence identity. We first created 'perfect' simulated cDNA sequences by splicing the sequences of exons in the reference genome sequences of fly and human. The complete reference genome sequences were then mutated to various degrees using a realistic mutation simulator and the perfect cDNAs were aligned to them using Pairagon and 12 other aligners. To validate these results with natural sequences, we performed cross-species alignment using orthologous transcripts from human, mouse and rat. We found that aligner accuracy is heavily dependent on sequence identity. For sequences with 100% identity, Pairagon achieved accuracy levels of >99.6%, with one quarter of the errors of any other aligner. Furthermore, for human/mouse alignments, which are only 85% identical, Pairagon achieved 87% accuracy, higher than any other aligner. Pairagon source and executables are freely available at http://mblab.wustl.edu/software/pairagon/
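The construction of 'perfect' simulated cDNAs by splicing exon sequences out of a reference genome can be sketched as follows. This is a hedged illustration, not the Pairagon authors' pipeline: the function names, the 0-based end-exclusive coordinate convention, and the toy genome are all assumptions made for the example, and the realistic mutation simulator mentioned in the abstract is not reproduced.

```python
def splice_exons(genome, exons):
    """Concatenate exon intervals into a simulated cDNA.

    `exons` is a list of (start, end) pairs, 0-based and end-exclusive,
    sorted by position and non-overlapping (an assumed convention).
    """
    parts = []
    prev_end = 0
    for start, end in exons:
        if not (prev_end <= start < end <= len(genome)):
            raise ValueError(f"bad or overlapping exon: {(start, end)}")
        parts.append(genome[start:end])
        prev_end = end
    return "".join(parts)


def percent_identity(a, b):
    """Percentage of matching positions in two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    if not a:
        return 100.0
    return 100.0 * sum(x == y for x, y in zip(a, b)) / len(a)


if __name__ == "__main__":
    # Toy genome: exon (6 bp), intron with GT...AG ends (12 bp), exon (6 bp).
    toy_genome = "ATGGCC" + "GTAAGTTTTCAG" + "TTTTAA"
    cdna = splice_exons(toy_genome, [(0, 6), (18, 24)])
    print(cdna)
    print(percent_identity("ACGT", "ACGA"))
```

Keeping the exon coordinates alongside the spliced sequence gives a known intron-exon structure against which an aligner's output could later be scored.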
Ning, ZhongHua; Hincke, Maxwell T.; Yang, Ning; Hou, ZhuoCheng
2014-01-01
Efficiently obtaining full-length cDNA for a target gene is the key step for functional studies and probing genetic variations. However, almost all sequenced domestic animal genomes are not ‘finished’. Many functionally important genes are located in these gapped regions. It can be difficult to obtain full-length cDNA for which only partial amino acid/EST sequences exist. In this study we report a general pipeline to obtain full-length cDNA, and illustrate this approach for one important gene (Ovocleidin-17, OC-17) that is associated with chicken eggshell biomineralization. Chicken OC-17 is one of the best candidates to control and regulate the deposition of calcium carbonate in the calcified eggshell layer. OC-17 protein has been purified, sequenced, and has had its three-dimensional structure solved. However, researchers still cannot conduct OC-17 mRNA related studies because the mRNA sequence is unknown and the gene is absent from the current chicken genome. We used RNA-Seq to obtain the entire transcriptome of the adult hen uterus, and then conducted de novo transcriptome assembling with bioinformatics analysis to obtain candidate OC-17 transcripts. Based on this sequence, we used RACE and PCR cloning methods to successfully obtain the full-length OC-17 cDNA. Temporal and spatial OC-17 mRNA expression analyses were also performed to demonstrate that OC-17 is predominantly expressed in the adult hen uterus during the laying cycle and barely at immature developmental stages. Differential uterine expression of OC-17 was observed in hens laying eggs with weak versus strong eggshell, confirming its important role in the regulation of eggshell mineralization and providing a new tool for genetic selection for eggshell quality parameters. This study is the first one to report the full-length OC-17 cDNA sequence, and builds a foundation for OC-17 mRNA related studies. 
We provide a general method for biologists experiencing difficulty in obtaining candidate gene full-length cDNA sequences. PMID:24676480
Zhang, Quan; Liu, Long; Zhu, Feng; Ning, ZhongHua; Hincke, Maxwell T; Yang, Ning; Hou, ZhuoCheng
2014-01-01
Efficiently obtaining full-length cDNA for a target gene is the key step for functional studies and probing genetic variations. However, almost all sequenced domestic animal genomes are not 'finished'. Many functionally important genes are located in these gapped regions. It can be difficult to obtain full-length cDNA for which only partial amino acid/EST sequences exist. In this study we report a general pipeline to obtain full-length cDNA, and illustrate this approach for one important gene (Ovocleidin-17, OC-17) that is associated with chicken eggshell biomineralization. Chicken OC-17 is one of the best candidates to control and regulate the deposition of calcium carbonate in the calcified eggshell layer. OC-17 protein has been purified, sequenced, and has had its three-dimensional structure solved. However, researchers still cannot conduct OC-17 mRNA related studies because the mRNA sequence is unknown and the gene is absent from the current chicken genome. We used RNA-Seq to obtain the entire transcriptome of the adult hen uterus, and then conducted de novo transcriptome assembling with bioinformatics analysis to obtain candidate OC-17 transcripts. Based on this sequence, we used RACE and PCR cloning methods to successfully obtain the full-length OC-17 cDNA. Temporal and spatial OC-17 mRNA expression analyses were also performed to demonstrate that OC-17 is predominantly expressed in the adult hen uterus during the laying cycle and barely at immature developmental stages. Differential uterine expression of OC-17 was observed in hens laying eggs with weak versus strong eggshell, confirming its important role in the regulation of eggshell mineralization and providing a new tool for genetic selection for eggshell quality parameters. This study is the first one to report the full-length OC-17 cDNA sequence, and builds a foundation for OC-17 mRNA related studies. 
We provide a general method for biologists experiencing difficulty in obtaining candidate gene full-length cDNA sequences.
Marques, M Carmen; Alonso-Cantabrana, Hugo; Forment, Javier; Arribas, Raquel; Alamar, Santiago; Conejero, Vicente; Perez-Amador, Miguel A
2009-01-01
Background Interpretation of ever-increasing raw sequence information generated by modern genome sequencing technologies faces multiple challenges, such as gene function analysis and genome annotation. Indeed, nearly 40% of genes in plants encode proteins of unknown function. Functional characterization of these genes is one of the main challenges in modern biology. In this regard, the availability of full-length cDNA clones may fill in the gap created between sequence information and biological knowledge. Full-length cDNA clones facilitate functional analysis of the corresponding genes enabling manipulation of their expression in heterologous systems and the generation of a variety of tagged versions of the native protein. In addition, the development of full-length cDNA sequences has the power to improve the quality of genome annotation. Results We developed an integrated method to generate a new normalized EST collection enriched in full-length and rare transcripts of different citrus species from multiple tissues and developmental stages. We constructed a total of 15 cDNA libraries, from which we isolated 10,898 high-quality ESTs representing 6142 different genes. Percentages of redundancy and proportion of full-length clones range from 8 to 33, and 67 to 85, respectively, indicating good efficiency of the approach employed. The new EST collection adds 2113 new citrus ESTs, representing 1831 unigenes, to the collection of citrus genes available in the public databases. To facilitate functional analysis, cDNAs were introduced in a Gateway-based cloning vector for high-throughput functional analysis of genes in planta. Herein, we describe the technical methods used in the library construction, sequence analysis of clones and the overexpression of CitrSEP, a citrus homolog to the Arabidopsis SEP3 gene, in Arabidopsis as an example of a practical application of the engineered Gateway vector for functional analysis. 
Conclusion The new EST collection denotes an important step towards the identification of all genes in the citrus genome. Furthermore, public availability of the cDNA clones generated in this study, and not only their sequence, enables testing of the biological function of the genes represented in the collection. Expression of the citrus SEP3 homologue, CitrSEP, in Arabidopsis results in early flowering, along with other phenotypes resembling the over-expression of the Arabidopsis SEPALLATA genes. Our findings suggest that the members of the SEP gene family play similar roles in these quite distant plant species. PMID:19747386
Giardina, P; Cannio, R; Martirani, L; Marzullo, L; Palmieri, G; Sannia, G
1995-01-01
The gene (pox1) encoding a phenol oxidase from Pleurotus ostreatus, a lignin-degrading basidiomycete, was cloned and sequenced, and the corresponding pox1 cDNA was also synthesized and sequenced. The isolated gene consists of 2,592 bp, with the coding sequence being interrupted by 19 introns and flanked by an upstream region in which putative CAAT and TATA consensus sequences could be identified at positions -174 and -84, respectively. The isolation of a second cDNA (pox2 cDNA), showing 84% similarity, and of the corresponding truncated genomic clones demonstrated the existence of a multigene family coding for isoforms of laccase in P. ostreatus. PCR amplifications of specific regions on the DNA of isolated monokaryons proved that the two genes are not allelic forms. The POX1 amino acid sequence deduced was compared with those of other known laccases from different fungi. PMID:7793961
Complete cDNA sequence and amino acid analysis of a bovine ribonuclease K6 gene.
Pietrowski, D; Förster, M
2000-01-01
The complete cDNA sequence of a ribonuclease k6 gene of Bos taurus has been determined. It codes for a protein with 154 amino acids and contains the invariant cysteine, histidine and lysine residues as well as the characteristic motifs specific to ribonuclease active sites. The deduced protein sequence is 27 residues longer than other known ribonucleases k6 and shows amino acid exchanges which could reflect a strain specificity or polymorphism within the bovine genome. Based on sequence similarity we have termed the identified gene bovine ribonuclease k6 b (brk6b).
Duyk, G M; Kim, S W; Myers, R M; Cox, D R
1990-11-01
Identification and recovery of transcribed sequences from cloned mammalian genomic DNA remains an important problem in isolating genes on the basis of their chromosomal location. We have developed a strategy that facilitates the recovery of exons from random pieces of cloned genomic DNA. The basis of this "exon trapping" strategy is that, during a retroviral life cycle, genomic sequences of nonviral origin are correctly spliced and may be recovered as a cDNA copy of the introduced segment. By using this genetic assay for cis-acting sequences required for RNA splicing, we have screened approximately 20 kilobase pairs of cloned genomic DNA and have recovered all four predicted exons.
Duyk, G M; Kim, S W; Myers, R M; Cox, D R
1990-01-01
Identification and recovery of transcribed sequences from cloned mammalian genomic DNA remains an important problem in isolating genes on the basis of their chromosomal location. We have developed a strategy that facilitates the recovery of exons from random pieces of cloned genomic DNA. The basis of this "exon trapping" strategy is that, during a retroviral life cycle, genomic sequences of nonviral origin are correctly spliced and may be recovered as a cDNA copy of the introduced segment. By using this genetic assay for cis-acting sequences required for RNA splicing, we have screened approximately 20 kilobase pairs of cloned genomic DNA and have recovered all four predicted exons. PMID:2247475
Woods, D E; Edge, M D; Colten, H R
1984-01-01
Complementary DNA (cDNA) clones corresponding to the major histocompatibility complex (MHC) class III antigen, complement protein C2, have been isolated from human liver cDNA libraries with the use of a complex mixture of synthetic oligonucleotides (17-mer) that contains 576 different oligonucleotide sequences. The C2 cDNAs were used to identify a DNA restriction fragment length polymorphism that provides a genetic marker within the MHC that was not detectable at the protein level. An extensive search for genomic polymorphisms using a cDNA clone for another MHC class III gene, factor B, failed to reveal any DNA variants. The genomic variants detected with the C2 cDNA probe provide an additional genetic marker for analysis of MHC-linked diseases. PMID:6086718
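A mixed oligonucleotide pool such as the one described above arises because most amino acids are encoded by several codons, so a short peptide back-translates to many possible nucleotide sequences. The sketch below enumerates such a pool using the standard genetic code. The peptide used is purely hypothetical (the source does not give the C2 peptide behind the 576-member mixture), and the sequences are written as coding-strand DNA, which is an assumption about orientation.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODONS = {}
for codon, aa in zip(product(BASES, repeat=3), AMINO):
    CODONS.setdefault(aa, []).append("".join(codon))


def oligo_pool(peptide, length):
    """Return the set of distinct DNA sequences of `length` nt that
    could encode the start of `peptide` (hypothetical helper)."""
    if length > 3 * len(peptide):
        raise ValueError("peptide too short for requested oligo length")
    choices = [CODONS[aa] for aa in peptide]
    return {"".join(combo)[:length] for combo in product(*choices)}


if __name__ == "__main__":
    # Hypothetical peptide: M and W have one codon each, K, D and E two
    # each, and both F codons start with TT, so a 17-mer covering five
    # codons plus two bases of the sixth has 1*1*2*2*2*1 members.
    pool = oligo_pool("MWKDEF", 17)
    print(len(pool))
```

Probe designers typically pick peptide stretches rich in low-degeneracy residues to keep the pool small; a region containing leucine, serine or arginine (six codons each) enlarges it quickly.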
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
1993-01-01
The DOE Human Genome program has grown tremendously, as shown by the marked increase in the number of genome-funded projects since the last workshop held in 1991. The abstracts in this book describe the genome research of DOE-funded grantees and contractors and invited guests, and all projects are represented at the workshop by posters. The 3-day meeting includes plenary sessions on ethical, legal, and social issues pertaining to the availability of genetic data; sequencing techniques, informatics support; and chromosome and cDNA mapping and sequencing.
Cost-Effective Sequencing of Full-Length cDNA Clones Powered by a De Novo-Reference Hybrid Assembly
Sugano, Sumio; Morishita, Shinichi; Suzuki, Yutaka
2010-01-01
Background Sequencing full-length cDNA clones is important to determine gene structures including alternative splice forms, and provides valuable resources for experimental analyses to reveal the biological functions of coded proteins. However, previous approaches for sequencing cDNA clones were expensive or time-consuming, and therefore, a fast and efficient sequencing approach was demanded. Methodology We developed a program, MuSICA 2, that assembles millions of short (36-nucleotide) reads collected from a single flow cell lane of Illumina Genome Analyzer to shotgun-sequence ∼800 human full-length cDNA clones. MuSICA 2 performs a hybrid assembly in which an external de novo assembler is run first and the result is then improved by reference alignment of shotgun reads. We compared the MuSICA 2 assembly with 200 pooled full-length cDNA clones finished independently by the conventional primer-walking using Sanger sequencers. The exon-intron structure of the coding sequence was correct for more than 95% of the clones with coding sequence annotation when we excluded cDNA clones insufficiently represented in the shotgun library due to PCR failure (42 out of 200 clones excluded), and the nucleotide-level accuracy of coding sequences of those correct clones was over 99.99%. We also applied MuSICA 2 to full-length cDNA clones from Toxoplasma gondii, to confirm that its ability was competent even for non-human species. Conclusions The entire sequencing and shotgun assembly takes less than 1 week and the consumables cost only ∼US$3 per clone, demonstrating a significant advantage over previous approaches. PMID:20479877
With the advent of sequence information for entire eukaryotic genomes, it is now possible to analyze gene expression on a genomic scale. The primary tool for genomic analysis of gene expression is the gene microarray. We have used commercially available and custom cDNA microarray...
Primary structure of the Aequorea victoria green-fluorescent protein.
Prasher, D C; Eckenrode, V K; Ward, W W; Prendergast, F G; Cormier, M J
1992-02-15
Many cnidarians utilize green-fluorescent proteins (GFPs) as energy-transfer acceptors in bioluminescence. GFPs fluoresce in vivo upon receiving energy from either a luciferase-oxyluciferin excited-state complex or a Ca(2+)-activated phosphoprotein. These highly fluorescent proteins are unique due to the chemical nature of their chromophore, which is comprised of modified amino acid (aa) residues within the polypeptide. This report describes the cloning and sequencing of both cDNA and genomic clones of GFP from the cnidarian, Aequorea victoria. The gfp10 cDNA encodes a 238-aa-residue polypeptide with a calculated Mr of 26,888. Comparison of A. victoria GFP genomic clones shows three different restriction enzyme patterns which suggests that at least three different genes are present in the A. victoria population at Friday Harbor, Washington. The gfp gene encoded by the lambda GFP2 genomic clone is comprised of at least three exons spread over 2.6 kb. The nucleotide sequences of the cDNA and the gene will aid in the elucidation of structure-function relationships in this unique class of proteins.
de Bellocq, J Goüy; Leirs, H
2009-09-01
Sequences of the complete open reading frame (ORF) for rodent major histocompatibility complex (MHC) class II genes are rare. Multimammate rat (Mastomys natalensis) complementary DNA (cDNA) encoding the alpha and beta chains of the MHC class II DQ gene was cloned from a rapid amplification of cDNA ends (RACE) cDNA library. The ORFs consist of 801 and 771 bp encoding 266 and 256 amino acid residues for DQB and DQA, respectively. The genomic structure of Mana-DQ genes is globally analogous to that described for other rodents except for the insertion of a serine residue in the signal peptide of Mana-DQB, which is unique among known rodents.
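The relationship between the reported ORF lengths and residue counts can be checked with simple arithmetic, assuming the stated ORF lengths include the stop codon (an assumption, but one that is consistent with the numbers given). The helper name below is hypothetical.

```python
def residues_from_orf(orf_bp, includes_stop=True):
    """Number of amino acid residues encoded by an ORF of `orf_bp` bases.

    Assumes the ORF length is a multiple of three and, by default, that
    the final codon is a stop codon that adds no residue.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


if __name__ == "__main__":
    print(residues_from_orf(801))  # DQB ORF length from the abstract
    print(residues_from_orf(771))  # DQA ORF length from the abstract
```

Under that assumption, 801 bp is 267 codons and 771 bp is 257 codons, which matches the 266 and 256 residues reported for DQB and DQA.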
2004-01-01
The National Institutes of Health's Mammalian Gene Collection (MGC) project was designed to generate and sequence a publicly accessible cDNA resource containing a complete open reading frame (ORF) for every human and mouse gene. The project initially used a random strategy to select clones from a large number of cDNA libraries from diverse tissues. Candidate clones were chosen based on 5′-EST sequences, and then fully sequenced to high accuracy and analyzed by algorithms developed for this project. Currently, more than 11,000 human and 10,000 mouse genes are represented in MGC by at least one clone with a full ORF. The random selection approach is now reaching a saturation point, and a transition to protocols targeted at the missing transcripts is now required to complete the mouse and human collections. Comparison of the sequence of the MGC clones to reference genome sequences reveals that most cDNA clones are of very high sequence quality, although it is likely that some cDNAs may carry missense variants as a consequence of experimental artifact, such as PCR, cloning, or reverse transcriptase errors. Recently, a rat cDNA component was added to the project, and ongoing frog (Xenopus) and zebrafish (Danio) cDNA projects were expanded to take advantage of the high-throughput MGC pipeline. PMID:15489334
Picardi, Ernesto; Quagliariello, Carla
2008-03-26
In plant mitochondria, the post-transcriptional RNA editing process converts C to U at a number of specific sites of the mRNA sequence and usually restores phylogenetically conserved codons and the encoded amino acid residues. Sites undergoing RNA editing evolve at a higher rate than sites not modified by the process. As a result, editing sites strongly affect the evolution of plant mitochondrial genomes, representing an important source of sequence variability and potentially informative characters. To date, no clear and convincing evidence has established whether editing sites really affect the topology of reconstructed phylogenetic trees. For this reason, we investigated the effect of RNA editing on the tree-building process using twenty different plant mitochondrial gene sequences and computer simulations. Based on our simulation study, we suggest that the editing 'noise' in tree topology inference is mainly manifested at the cDNA level. In particular, editing sites tend to confuse tree topologies when artificial genomic and cDNA sequences are shorter than 500 bp and have an editing percentage higher than 5.0%. Similar results were also obtained with genuine plant mitochondrial genes. In this latter instance, the topology incongruence increases when the editing percentage goes up from about 3.0 to 14.0%. However, when the average gene length is higher than 1,000 bp (rps3, matR and atp1), no differences between inferred genomic and cDNA topologies could be detected. Our findings from the in silico and in vivo simulation systems reported here seem to strongly suggest that editing sites contribute to the generation of misleading phylogenetic trees if the analyzed mitochondrial gene sequence is highly edited (higher than 3.0%) and reduced in length (shorter than 500 bp). In the absence of direct experimental evidence, the results presented here thus encourage the use of genomic mitochondrial rather than cDNA sequences for reconstructing phylogenetic events in land plants.
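The length and editing thresholds quoted in the abstract above can be turned into a simple screening rule. The following is a minimal sketch, not the authors' method: it assumes that "editing percentage" means the share of aligned positions where a genomic C reads as T (U in RNA) in the cDNA, and the function names, default thresholds and example sequences are hypothetical.

```python
def editing_percentage(genomic: str, cdna: str) -> float:
    """Percentage of aligned positions where a genomic C appears as T in the cDNA.

    Assumption: the two sequences are already aligned without gaps.
    """
    if len(genomic) != len(cdna):
        raise ValueError("sequences must be pre-aligned and of equal length")
    if not genomic:
        raise ValueError("sequences must be non-empty")
    pairs = zip(genomic.upper(), cdna.upper())
    edited = sum(1 for g, c in pairs if g == "C" and c == "T")
    return 100.0 * edited / len(genomic)


def prefer_genomic(genomic: str, cdna: str,
                   min_length: int = 500, max_editing: float = 3.0) -> bool:
    """True for short, highly edited genes, the case the abstract flags as risky."""
    too_short = len(genomic) < min_length
    too_edited = editing_percentage(genomic, cdna) > max_editing
    return too_short and too_edited


# Hypothetical 120-bp example in which every genomic C is edited.
genomic = "ATGCCATCGTCA" * 10
cdna = "ATGTTATTGTTA" * 10

print(f"editing: {editing_percentage(genomic, cdna):.1f}%")
print("use genomic sequence:", prefer_genomic(genomic, cdna))
```

The defaults mirror the 500 bp and 3.0% figures given for genuine genes; the 5.0% figure reported for artificial sequences could be passed as `max_editing` instead.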
Characterization of Urtica dioica agglutinin isolectins and the encoding gene family.
Does, M P; Ng, D K; Dekker, H L; Peumans, W J; Houterman, P M; Van Damme, E J; Cornelissen, B J
1999-01-01
Urtica dioica agglutinin (UDA) has previously been found in roots and rhizomes of stinging nettles as a mixture of UDA-isolectins. Protein and cDNA sequencing have shown that mature UDA is composed of two hevein domains and is processed from a precursor protein. The precursor contains a signal peptide, two in-tandem hevein domains, a hinge region and a carboxyl-terminal chitinase domain. Genomic fragments encoding precursors for UDA-isolectins have been amplified by five independent polymerase chain reactions on genomic DNA from stinging nettle ecotype Weerselo. One amplified gene was completely sequenced. As compared to the published cDNA sequence, the genomic sequence contains, besides two basepair substitutions, two introns located at the same positions as in other plant chitinases. By partial sequence analysis of 40 amplified genes, 16 different genes were identified which encode seven putative UDA-isolectins. The deduced amino acid sequences share 78.9-98.9% identity. In extracts of roots and rhizomes of stinging nettle ecotype Weerselo six out of these seven isolectins were detected by mass spectrometry. One of them is an acidic form, which has not been identified before. Our results demonstrate that UDA is encoded by a large gene family.
NASA Astrophysics Data System (ADS)
Kikuchi, Shoshi
2009-02-01
Completion of the high-precision genome sequence analysis of rice led to the collection of about 35,000 full-length cDNA clones and the determination of their complete sequences. Mapping of these full-length cDNA sequences has given us information on (1) the number of genes expressed in the rice genome; (2) the start and end positions and exon-intron structures of rice genes; (3) alternative transcripts; (4) possible encoded proteins; (5) non-protein-coding (np) RNAs; (6) the density of gene localization on the chromosome; (7) setting the parameters of gene prediction programs; and (8) the construction of a microarray system that monitors global gene expression. Manual curation for rice gene annotation by using mapping information on full-length cDNA and EST assemblies has revealed about 32,000 expressed genes in the rice genome. Analysis of major gene families, such as those encoding membrane transport proteins (pumps, ion channels, and secondary transporters), along with the evolution from bacteria to higher animals and plants, reveals how gene numbers have increased through adaptation to circumstances. Family-based gene annotation also gives us a new way of comparing organisms. Massive amounts of data on gene expression under many kinds of physiological conditions are being accumulated in rice oligoarrays (22K and 44K) based on full-length cDNA sequences. Cluster analyses of genes that have the same promoter cis-elements, that have similar expression profiles, or that encode enzymes in the same metabolic pathways or signal transduction cascades give us clues to understanding the networks of gene expression in rice. As a tool for that purpose, we recently developed "RiCES", a tool for searching for cis-elements in the promoter regions of clustered genes.
A novel gene, RSD-3/HSD-3.1, encodes a meiotic-related protein expressed in rat and human testis.
Zhang, Xiaodong; Liu, Huixian; Zhang, Yan; Qiao, Yuan; Miao, Shiying; Wang, Linfang; Zhang, Jianchao; Zong, Shudong; Koide, S S
2003-06-01
The expression of stage-specific genes during spermatogenesis was determined by isolating two segments of rat seminiferous tubule at different stages of the germinal epithelium cycle delineated by transillumination-delineated microdissection, combined with differential display polymerase chain reaction to identify the differential transcripts formed. A total of 22 cDNAs were identified and accepted by GenBank as new expressed sequence tags. One of the expressed sequence tags was radiolabeled and used as a probe to screen a rat testis cDNA library. A novel full-length cDNA composed of 2228 bp, designated as RSD-3 (rat sperm DNA no.3, GenBank accession no. AF094609) was isolated and characterized. The reading frame encodes a polypeptide consisting of 526 amino acid residues, containing a number of DNA binding motifs and phosphorylation sites for PKC, CK-II, and p34cdc2. Northern blot of mRNA prepared from various tissues of adult rats showed that RSD-3 is expressed only in the testis. The initial expression of the RSD-3 gene was detected in the testis on the 30th postnatal day and attained adult level on the 60th postnatal day. Immunolocalization of RSD-3 in germ cells of rat testis showed that its expression is restricted to primary spermatocytes, undergoing meiosis division I. A human testis homologue of RSD-3 cDNA, designated as HSD-3.1 (GenBank accession no. AF144487) was isolated by screening the Human Testis Rapid-Screen arrayed cDNA library panels by RT-PCR. The exon-intron boundaries of HSD-3.1 gene were determined by aligning the cDNA sequence with the corresponding genome sequence. The cDNA consisted of 12 exons that span approximately 52.8 kb of the genome sequence and was mapped to chromosome 14q31.3.
[The primary structure of a vaccine strain of tobacco mosaic virus V-69].
Shiian, A N; Mil'shina, N V; Snegireva, P B; Pukhal'skiĭ, V A
1994-12-01
A random set of cDNA fragments were synthesized on genomic RNA of TMV vaccine strain V-69, using random primers and reverse transcriptase. Following synthesis of double-stranded cDNA, they were cloned into the pUC-19 plasmid; and 28 clones were sequenced (insert size 100-500 bp). High nucleotide sequence homology of V-69 (more than 95%) was shown only with tomato strain TMV-L [1]. Sequenced clones represent 54% of the genome (50% of the replicase gene, 98% of the transport protein gene, and 60% of the coat protein gene). In this genome region, 24 base substitutions were revealed, as compared to the wild-type TMV-L sequence. Six base substitutions resulted in changes in corresponding amino acid codons. No substitutions coincided with those discovered in the related TMV vaccine strain L11A [2], while two substitutions in the replicase gene were identical to those found in TMV strain Lta1 [3], which is capable of overcoming protection in tomatoes with the resistance gene Tm-1.
Genomic organization of the neurofibromatosis 1 gene (NF1)
DOE Office of Scientific and Technical Information (OSTI.GOV)
Li, Y.; O'Connell, P.; Huntsman Breidenbach, H.
Neurofibromatosis 1 maps to chromosome band 17q11.2, and the NF1 locus has been partially characterized. Even though the full-length NF1 cDNA has been sequenced, the complete genomic structure of the NF1 gene has not been elucidated. The 5′ end of NF1 is embedded in a CpG island containing a NotI restriction site, and the remainder of the gene lies in the adjacent 350-kb NotI fragment. In our efforts to develop a comprehensive screen for NF1 mutations, we have isolated genomic DNA clones that together harbor the entire NF1 cDNA sequence. We have identified all intron-exon boundaries of the coding region and established that it is composed of 59 exons. Furthermore, we have defined the 3′-untranslated region (3′-UTR) of the NF1 gene; it spans approximately 3.5 kb of genomic DNA sequence and is continuous with the stop codon. Oligonucleotide primer pairs synthesized from exon-flanking DNA sequences were used in the polymerase chain reaction with cloned, chromosome 17-specific genomic DNA as template to amplify NF1 exons 1 through 27b and the exon containing the 3′-UTR separately. This information should be useful for implementing a comprehensive NF1 mutation screen using genomic DNA as template. 41 refs., 3 figs., 2 tabs.
Xiao, Yongli; Sheng, Zong-Mei; Taubenberger, Jeffery K.
2015-01-01
The vast majority of surgical biopsy and post-mortem tissue samples are formalin-fixed and paraffin-embedded (FFPE), but this process leads to RNA degradation that limits gene expression analysis. As an example, the viral RNA genome of the 1918 pandemic influenza A virus was previously determined in a 9-year effort by overlapping RT-PCR from post-mortem samples. Using the protocols described here, the full genome of the 1918 virus at high coverage was determined in one high-throughput sequencing run of a cDNA library derived from total RNA of a 1918 FFPE sample after duplex-specific nuclease treatments. This basic methodological approach should assist in the analysis of FFPE tissue samples isolated over the past century from a variety of infectious diseases. PMID:26344216
A new approach for cloning hLIF cDNA from genomic DNA isolated from the oral mucous membrane.
Cui, Y H; Zhu, G Q; Chen, Q J; Wang, Y F; Yang, M M; Song, Y X; Wang, J G; Cao, B Y
2011-11-25
Complementary DNA (cDNA) is valuable for investigating protein structure and function in the life sciences, but it is difficult to obtain by traditional reverse transcription. We employed a novel strategy to clone human leukemia inhibitory factor (hLIF) gene cDNA from genomic DNA, which was directly isolated from the mucous membrane of the mouth. The hLIF sequence, which is 609 bp long and is composed of three exons, can be acquired within a few hours by amplifying each exon and splicing all of them using overlap-PCR. This new approach is simple and time- and cost-effective, requires no RNA preparation or cDNA synthesis, and is not limited by the tissue specificity or expression level of a particular gene.
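The exon-splicing idea in the abstract above can be mimicked in silico. This is a minimal sketch under stated assumptions: the exon sequences are hypothetical placeholders (not hLIF), the function names are illustrative, and primer design for the overlap-PCR itself is not modeled. It only joins exon sequences in order and checks that the product is a complete open reading frame.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def splice_exons(exons):
    """Concatenate exon sequences in genomic order, like the overlap-PCR product."""
    return "".join(exon.upper() for exon in exons)


def is_complete_orf(seq):
    """True if seq starts with ATG, has whole codons, and ends at its only stop codon."""
    if len(seq) % 3 != 0 or not seq.startswith("ATG"):
        return False
    codons = [seq[i:i + 3] for i in range(0, len(seq), 3)]
    has_terminal_stop = codons[-1] in STOP_CODONS
    has_internal_stop = any(codon in STOP_CODONS for codon in codons[:-1])
    return has_terminal_stop and not has_internal_stop


# Hypothetical three-exon gene; real exon sequences would come from genomic PCR.
exons = ["ATGAAG", "GTCCTG", "GCTTAA"]
coding = splice_exons(exons)

print(coding, len(coding), is_complete_orf(coding))
```

The 609 bp reported for hLIF is divisible by three (203 codons), which is consistent with a spliced product that passes this kind of check.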
DOE Office of Scientific and Technical Information (OSTI.GOV)
Stapleton, Mark; Liao, Guochun; Brokstein, Peter
2002-08-12
Collections of full-length nonredundant cDNA clones are critical reagents for functional genomics. The first step toward these resources is the generation and single-pass sequencing of cDNA libraries that contain a high proportion of full-length clones. The first release of the Drosophila Gene Collection (DGCr1) was produced from six libraries representing various tissues, developmental stages, and the cultured S2 cell line. Nearly 80,000 random 5′ expressed sequence tags (ESTs) from these libraries were collapsed into a nonredundant set of 5849 cDNAs, corresponding to ~40 percent of the 13,474 predicted genes in Drosophila. To obtain cDNA clones representing the remaining genes, we have generated an additional 157,835 5′ ESTs from two previously existing and three new libraries. One new library is derived from adult testis, a tissue we previously did not exploit for gene discovery; two new cap-trapped normalized libraries are derived from 0-22 hr embryos and adult heads. Taking advantage of the annotated D. melanogaster genome sequence, we clustered the ESTs by aligning them to the genome. Clusters that overlap genes not already represented by cDNA clones in the DGCr1 were analyzed further, and putative full-length clones were selected for inclusion in the new DGC. This second release of the DGC (DGCr2) contains 5061 additional clones, extending the collection to 10,910 cDNAs representing >70 percent of the predicted genes in Drosophila.
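The collection sizes quoted in the abstract above can be cross-checked with a few lines of arithmetic. This sketch uses only the figures stated there; the variable names are illustrative, and the upper-bound coverage assumes each cDNA represents a distinct predicted gene, which the abstract does not state.

```python
predicted_genes = 13_474   # predicted Drosophila genes, as quoted
dgcr1_cdnas = 5_849        # nonredundant cDNAs in release 1
dgcr2_added = 5_061        # additional clones in release 2

total_cdnas = dgcr1_cdnas + dgcr2_added

# Upper-bound coverage, assuming one distinct gene per cDNA.
dgcr1_coverage = 100 * dgcr1_cdnas / predicted_genes
total_coverage = 100 * total_cdnas / predicted_genes

print(f"total cDNAs: {total_cdnas}")
print(f"DGCr1 coverage: {dgcr1_coverage:.1f}%")
print(f"DGCr1 + DGCr2 coverage: {total_coverage:.1f}%")
```

The sum matches the 10,910 cDNAs reported, and both ratios sit above the rounded "~40 percent" and ">70 percent" figures, as expected for upper bounds.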
Gao, Ruimin; Niu, Shengniao; Dai, Weifang; Kitajima, Elliot; Wong, Sek-Man
2016-10-01
A Brazilian isolate of Hibiscus latent Fort Pierce virus (HLFPV-BR) was first found in a hibiscus plant in Limeira, SP, Brazil. RACE PCR was carried out to obtain the full-length sequence of HLFPV-BR, which is 6453 nucleotides long and shares more than 99.15% complete genomic RNA nucleotide sequence identity with the HLFPV Japanese isolate. The genomic structure of HLFPV-BR is similar to that of other tobamoviruses. It includes a 5' untranslated region (UTR), followed by open reading frames encoding a 128-kDa protein and a 188-kDa readthrough protein, a 38-kDa movement protein, an 18-kDa coat protein, and a 3' UTR. Interestingly, the unique feature of a poly(A) tract is also found within its 3'-UTR. Furthermore, from the total RNA extracted from the local lesions of HLFPV-BR-infected Chenopodium quinoa leaves, a biologically active, full-length cDNA clone encompassing the genome of HLFPV-BR was amplified and placed adjacent to a T7 RNA polymerase promoter. The capped in vitro transcripts from the cloned cDNA were infectious when mechanically inoculated into C. quinoa and Nicotiana benthamiana plants. This is the first report of the presence of an isolate of HLFPV in Brazil and the successful synthesis of a biologically active HLFPV-BR full-length cDNA clone.
Tappaz, M; Bitoun, M; Reymond, I; Sergeant, A
1999-09-01
Cysteine sulfinate decarboxylase (CSD) is considered as the rate-limiting enzyme in the biosynthesis of taurine, a possible osmoregulator in brain. Through cloning and sequencing of RT-PCR and RACE-PCR products of rat brain mRNAs, a 2,396-bp cDNA sequence was obtained encoding a protein of 493 amino acids (calculated molecular mass, 55.2 kDa). The corresponding fusion protein showed a substrate specificity similar to that of the endogenous enzyme. The sequence of the encoded protein is identical to that encoded by liver CSD cDNA. Among other characterized amino acid decarboxylases, CSD shows the highest homology (54%) with either isoform of glutamic acid decarboxylase (GAD65 and GAD67). A single mRNA band, approximately 2.5 kb, was detected by northern blot in RNA extracts of brain, liver, and kidney. However, brain and liver CSD cDNA sequences differed in the 5' untranslated region. This indicates two forms of CSD mRNA. Analysis of PCR-amplified products of genomic DNA suggests that the brain form results from the use of a 3' alternative internal splicing site within an exon specifically found in liver CSD mRNA. Through selective RT-PCR the brain form was detected in brain only, whereas the liver form was found in liver and kidney. These results indicate a tissue-specific regulation of CSD genomic expression.
Floral gene resources from basal angiosperms for comparative genomics research
Albert, Victor A; Soltis, Douglas E; Carlson, John E; Farmerie, William G; Wall, P Kerr; Ilut, Daniel C; Solow, Teri M; Mueller, Lukas A; Landherr, Lena L; Hu, Yi; Buzgo, Matyas; Kim, Sangtae; Yoo, Mi-Jeong; Frohlich, Michael W; Perl-Treves, Rafael; Schlarbaum, Scott E; Bliss, Barbara J; Zhang, Xiaohong; Tanksley, Steven D; Oppenheimer, David G; Soltis, Pamela S; Ma, Hong; dePamphilis, Claude W; Leebens-Mack, James H
2005-01-01
Background The Floral Genome Project was initiated to bridge the genomic gap between the most broadly studied plant model systems. Arabidopsis and rice, although now completely sequenced and under intensive comparative genomic investigation, are separated by at least 125 million years of evolutionary time, and cannot in isolation provide a comprehensive perspective on structural and functional aspects of flowering plant genome dynamics. Here we discuss new genomic resources available to the scientific community, comprising cDNA libraries and Expressed Sequence Tag (EST) sequences for a suite of phylogenetically basal angiosperms specifically selected to bridge the evolutionary gaps between model plants and provide insights into gene content and genome structure in the earliest flowering plants. Results Random sequencing of cDNAs from representatives of phylogenetically important eudicot, non-grass monocot, and gymnosperm lineages has so far (as of 12/1/04) generated 70,514 ESTs and 48,170 assembled unigenes. Efficient sorting of EST sequences into putative gene families based on whole Arabidopsis/rice proteome comparison has permitted ready identification of cDNA clones for finished sequencing. Preliminarily, (i) proportions of functional categories among sequenced floral genes seem representative of the entire Arabidopsis transcriptome, (ii) many known floral gene homologues have been captured, and (iii) phylogenetic analyses of ESTs are providing new insights into the process of gene family evolution in relation to the origin and diversification of the angiosperms. Conclusion Initial comparisons illustrate the utility of the EST data sets toward discovery of the basic floral transcriptome. 
These first findings also afford the opportunity to address a number of conspicuous evolutionary genomic questions, including reproductive organ transcriptome overlap between angiosperms and gymnosperms, genome-wide duplication history, lineage-specific gene duplication and functional divergence, and analyses of adaptive molecular evolution. Since not all genes in the floral transcriptome will be associated with flowering, these EST resources will also be of interest to plant scientists working on other functions, such as photosynthesis, signal transduction, and metabolic pathways. PMID:15799777
Genomic clones for human cholinesterase
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kott, M.; Venta, P.J.; Larsen, J.
1987-05-01
A human genomic library was prepared from peripheral white blood cells from a single donor by inserting an MboI partial digest into BamHI poly-linker sites of EMBL3. This library was screened using an oligolabeled human cholinesterase cDNA probe over 700 bp long. The latter probe was obtained from a human basal ganglia cDNA library. Of approximately 2 million clones screened with high stringency conditions, several positive clones were identified; two have been plaque purified. One of these clones has been partially mapped using restriction enzymes known to cut within the coded region of the cDNA for human serum cholinesterase. Hybridization of the fragments and their sizes are as expected if the genomic clone is cholinesterase. Sequencing of the DNA fragments in M13 is in progress to verify the identity of the clone and the location of introns.
Sequence evaluation of four specific cDNA libraries for developmental genomics of sunflower.
Tamborindeguy, C; Ben, C; Liboz, T; Gentzbittel, L
2004-04-01
Four different cDNA libraries were constructed from sunflower protoplasts growing under embryogenic and non-embryogenic conditions: one standard library from each condition and two subtractive libraries in opposite sense. A total of 22,876 cDNA clones were obtained and 4800 ESTs were sequenced, giving rise to 2479 high quality ESTs representing an unigene set of 1502 sequences. This set was compared with ESTs represented in public databases using the programs BLASTN and BLASTX, and its members were classified according to putative function using the catalog in the Kyoto Encyclopedia of Genes and Genomes (KEGG). Some 33% of sequences failed to align with existing plant ESTs and therefore represent putative novel genes. The libraries show a low level of redundancy and, on average, 50% of the present ESTs have not been previously reported for sunflower. Several potentially interesting genes were identified, based on their homology with genes involved in animal zygotic division or plant embryogenesis. We also identified two ESTs that show significantly different levels of expression under embryogenic and non-embryogenic conditions. The libraries described here represent an original and valuable resource for the discovery of yet unknown genes putatively involved in dicot embryogenesis and improving our knowledge of the mechanisms involved in polarity acquisition by plant embryos.
APPLICATION OF DNA MICROARRAYS TO REPRODUCTIVE TOXICOLOGY AND THE DEVELOPMENT OF A TESTIS ARRAY
With the advent of sequence information for entire mammalian genomes, it is now possible to analyze gene expression and gene polymorphisms on a genomic scale. The primary tool for analysis of gene expression is the DNA microarray. We have used commercially available cDNA micro...
Molecular cloning and characterization of SoxB2 gene from Zhikong scallop Chlamys farreri
NASA Astrophysics Data System (ADS)
He, Yan; Bao, Zhenmin; Guo, Huihui; Zhang, Yueyue; Zhang, Lingling; Wang, Shi; Hu, Jingjie; Hu, Xiaoli
2013-11-01
The Sox proteins play critical roles during the development of animals, including sex determination and central nervous system development. In this study, the SoxB2 gene was cloned from a mollusk, the Zhikong scallop (Chlamys farreri), and characterized with respect to phylogeny and tissue distribution. The full-length cDNA and genomic DNA sequences of C. farreri SoxB2 (CfSoxB2) were obtained by rapid amplification of cDNA ends and genome walking, respectively, using a partial cDNA fragment from the highly conserved DNA-binding domain, i.e., the High Mobility Group (HMG) box. The full-length cDNA sequence of CfSoxB2 was 2,048 bp and encoded a protein of 268 amino acids. The genomic sequence was 5,551 bp in length with only one exon. Several conserved elements, such as the TATA-box, GC-box, CAAT-box, GATA-box, and Sox/sry-sex/testis-determining and related HMG box factors, were found in the promoter region. Furthermore, real-time quantitative reverse transcription PCR assays were carried out to assess the mRNA expression of CfSoxB2 in different tissues. SoxB2 was highly expressed in the mantle, moderately in the digestive gland and gill, and weakly expressed in the gonad, kidney and adductor muscle. In male and female gonads at different developmental stages of reproduction, the expression levels of CfSoxB2 were similar. Considering the specific expression and roles of SoxB2 in other animals, in particular vertebrates, and the fact that there are many pallial nerves in the mantle, cerebral ganglia in the digestive gland and gill nerves in the gill, we propose a possible essential role in nervous tissue function for SoxB2 in C. farreri.
Vettore, André L.; da Silva, Felipe R.; Kemper, Edson L.; Souza, Glaucia M.; da Silva, Aline M.; Ferro, Maria Inês T.; Henrique-Silva, Flavio; Giglioti, Éder A.; Lemos, Manoel V.F.; Coutinho, Luiz L.; Nobrega, Marina P.; Carrer, Helaine; França, Suzelei C.; Bacci, Maurício; Goldman, Maria Helena S.; Gomes, Suely L.; Nunes, Luiz R.; Camargo, Luis E.A.; Siqueira, Walter J.; Van Sluys, Marie-Anne; Thiemann, Otavio H.; Kuramae, Eiko E.; Santelli, Roberto V.; Marino, Celso L.; Targon, Maria L.P.N.; Ferro, Jesus A.; Silveira, Henrique C.S.; Marini, Danyelle C.; Lemos, Eliana G.M.; Monteiro-Vitorello, Claudia B.; Tambor, José H.M.; Carraro, Dirce M.; Roberto, Patrícia G.; Martins, Vanderlei G.; Goldman, Gustavo H.; de Oliveira, Regina C.; Truffi, Daniela; Colombo, Carlos A.; Rossi, Magdalena; de Araujo, Paula G.; Sculaccio, Susana A.; Angella, Aline; Lima, Marleide M.A.; de Rosa, Vicente E.; Siviero, Fábio; Coscrato, Virginia E.; Machado, Marcos A.; Grivet, Laurent; Di Mauro, Sonia M.Z.; Nobrega, Francisco G.; Menck, Carlos F.M.; Braga, Marilia D.V.; Telles, Guilherme P.; Cara, Frank A.A.; Pedrosa, Guilherme; Meidanis, João; Arruda, Paulo
2003-01-01
To contribute to our understanding of the genome complexity of sugarcane, we undertook a large-scale expressed sequence tag (EST) program. More than 260,000 cDNA clones were partially sequenced from 26 standard cDNA libraries generated from different sugarcane tissues. After the processing of the sequences, 237,954 high-quality ESTs were identified. These ESTs were assembled into 43,141 putative transcripts. Of the assembled sequences, 35.6% presented no matches with existing sequences in public databases. A global analysis of the whole SUCEST data set indicated that 14,409 assembled sequences (33% of the total) contained at least one cDNA clone with a full-length insert. Annotation of the 43,141 assembled sequences associated almost 50% of the putative identified sugarcane genes with protein metabolism, cellular communication/signal transduction, bioenergetics, and stress responses. Inspection of the translated assembled sequences for conserved protein domains revealed 40,821 amino acid sequences with 1415 Pfam domains. Reassembling the consensus sequences of the 43,141 transcripts revealed a 22% redundancy in the first assembling. This indicated that possibly 33,620 unique genes had been identified and indicated that >90% of the sugarcane expressed genes were tagged. PMID:14613979
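The percentages in the abstract above follow directly from the counts it reports. The sketch below recomputes them; it uses only the quoted figures, the variable names are illustrative, and it assumes that "redundancy" means the share of assembled sequences removed when the consensus sequences were reassembled.

```python
assembled = 43_141      # putative transcripts after the first assembly
full_length = 14_409    # assembled sequences with at least one full-length clone
unique_genes = 33_620   # estimate after reassembling the consensus sequences

full_length_pct = 100 * full_length / assembled
redundancy_pct = 100 * (1 - unique_genes / assembled)

print(f"full-length share: {full_length_pct:.1f}%")
print(f"redundancy in first assembly: {redundancy_pct:.1f}%")
```

Both values round to the 33% and 22% given in the abstract.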
Cross-referencing yeast genetics and mammalian genomes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hieter, P.; Basset, D.; Boguski, M.
1994-09-01
We have initiated a project that will systematically transfer information about yeast genes onto the genetic maps of mice and human beings. Rapidly expanding human EST data will serve as a source of candidate human homologs that will be repeatedly searched using yeast protein sequence queries. Search results will be automatically reported to participating labs. Human cDNA sequences from which the ESTs are derived will be mapped at high resolution in the human and mouse genomes. The comparative mapping information cross-references the genomic position of novel human cDNAs with functional information known about the cognate yeast genes. This should facilitate the initial identification of genes responsible for mammalian mutant phenotypes, including human disease. In addition, the identification of mammalian homologs of yeast genes provides reagents for determining evolutionary conservation and for performing direct experiments in multicellular eukaryotes to enhance study of the yeast protein's function. For example, ESTs homologous to CDC27 and CDC16 were identified, and the corresponding cDNA clones were obtained from ATCC, completely sequenced, and mapped on human and mouse chromosomes. In addition, the CDC17hs cDNA has been used to raise antisera to the CDC27Hs protein and used in subcellular localization experiments and junctional studies in mammalian cells. We have received funding from the National Center for Human Genome Research to provide a community resource which will establish comprehensive cross-referencing among yeast, human, and mouse loci. The project is set up as a service and information on how to communicate with this effort will be provided.
Automated sample-preparation technologies in genome sequencing projects.
Hilbert, H; Lauber, J; Lubenow, H; Düsterhöft, A
2000-01-01
A robotic workstation system (BioRobot 9600, QIAGEN) and a 96-well UV spectrophotometer (Spectramax 250, Molecular Devices) were integrated into the process of high-throughput automated sequencing of double-stranded plasmid DNA templates. An automated 96-well miniprep kit protocol (QIAprep Turbo, QIAGEN) provided high-quality plasmid DNA from shotgun clones. The DNA prepared by this procedure was used to generate more than two megabases of final sequence data for two genomic projects (Arabidopsis thaliana and Schizosaccharomyces pombe), three thousand expressed sequence tags (ESTs) plus half a megabase of human full-length cDNA clones, and approximately 53,000 single reads for a whole genome shotgun project (Pseudomonas putida).
Isolation and characterization of the chicken trypsinogen gene family.
Wang, K; Gan, L; Lee, I; Hood, L
1995-01-01
Based on genomic Southern hybridizations and cDNA sequence analyses, the chicken trypsinogen gene family can be divided into two multi-member subfamilies, a six-member trypsinogen I subfamily which encodes the cationic trypsin isoenzymes and a three-member trypsinogen II subfamily which encodes the anionic trypsin isoenzymes. The chicken cDNA and genomic clones containing these two subfamilies were isolated and characterized by DNA sequence analysis. The results indicated that the chicken trypsinogen genes encoded a signal peptide of 15 to 16 amino acid residues, an activation peptide of 9 to 10 residues and a trypsin of 223 amino acid residues. The chicken trypsinogens contain all the common catalytic and structural features for trypsins, including the catalytic triad His, Asp and Ser and the six disulphide bonds. The trypsinogen I and II subfamilies share approximately 70% sequence identity at the nucleotide and amino acid level. The sequence comparison among chicken trypsinogen subfamily members and trypsin sequences from other species suggested that the chicken trypsinogen genes may have evolved in coincidental or concerted fashion. PMID:7733885
USDA-ARS's Scientific Manuscript database
Next generation sequencing (NGS) technology was used to analyze the occurrence of viruses in Sorghum almum plants in Florida exhibiting mosaic symptoms. Total RNA was extracted from symptomatic leaves and used as a template for cDNA library preparation. The resulting library was sequenced on an Illu...
Genomic resources for Myzus persicae: EST sequencing, SNP identification, and microarray design
Ramsey, John S; Wilson, Alex CC; de Vos, Martin; Sun, Qi; Tamborindeguy, Cecilia; Winfield, Agnese; Malloch, Gaynor; Smith, Dawn M; Fenton, Brian; Gray, Stewart M; Jander, Georg
2007-01-01
Background The green peach aphid, Myzus persicae (Sulzer), is a world-wide insect pest capable of infesting more than 40 plant families, including many crop species. However, despite the significant damage inflicted by M. persicae in agricultural systems through direct feeding damage and by its ability to transmit plant viruses, limited genomic information is available for this species. Results Sequencing of 16 M. persicae cDNA libraries generated 26,669 expressed sequence tags (ESTs). Aphids for library construction were raised on Arabidopsis thaliana, Nicotiana benthamiana, Brassica oleracea, B. napus, and Physalis floridana (with and without Potato leafroll virus infection). The M. persicae cDNA libraries include ones made from sexual and asexual whole aphids, guts, heads, and salivary glands. In silico comparison of cDNA libraries identified aphid genes with tissue-specific expression patterns, and gene expression that is induced by feeding on Nicotiana benthamiana. Furthermore, 2423 genes that are novel to science and potentially aphid-specific were identified. Comparison of cDNA data from three aphid lineages identified single nucleotide polymorphisms that can be used as genetic markers and, in some cases, may represent functional differences in the protein products. In particular, non-conservative amino acid substitutions in a highly expressed gut protease may be of adaptive significance for M. persicae feeding on different host plants. The Agilent eArray platform was used to design an M. persicae oligonucleotide microarray representing over 10,000 unique genes. Conclusion New genomic resources have been developed for M. persicae, an agriculturally important insect pest. These include previously unknown sequence data, a collection of expressed genes, molecular markers, and a DNA microarray that can be used to study aphid gene expression. These resources will help elucidate the adaptations that allow M. persicae to develop compatible interactions with its host plants, complementing ongoing work illuminating plant molecular responses to phloem-feeding insects. PMID:18021414
Whole genome sequence phylogenetic analysis of four Mexican rabies viruses isolated from cattle.
Bárcenas-Reyes, I; Loza-Rubio, E; Cantó-Alarcón, G J; Luna-Cozar, J; Enríquez-Vázquez, A; Barrón-Rodríguez, R J; Milián-Suazo, F
2017-08-01
Phylogenetic analysis of the rabies virus in molecular epidemiology has traditionally been performed on partial sequences of the genome, such as the N, G, and P genes; however, that approach raises concerns about its discriminatory power compared to whole genome sequencing. In this study we characterized four strains of the rabies virus isolated from cattle in Querétaro, Mexico by comparing the whole genome sequence to that of strains from the American, European and Asian continents. Four cattle brain samples positive for rabies and characterized as AgV11, genotype 1, were used in the study. A cDNA sequence was generated by reverse transcription PCR (RT-PCR) using oligo dT. cDNA samples were sequenced on an Illumina NextSeq 500 platform. The phylogenetic analysis was performed with MEGA 6.0. Minimum evolution phylogenetic trees were constructed with the Neighbor-Joining method and bootstrapped with 1000 replicates. Three large and seven small clusters were formed with the 26 sequences used. The largest cluster grouped strains from different species in South America: Brazil and French Guiana. The second cluster grouped five strains from Mexico. A Mexican strain reported in a different study was highly related to our four strains, suggesting a common source of infection. The phylogenetic analysis shows that the type of host is different for the different regions in the American Continent; rabies is more related to bats. It was concluded that the rabies virus in central Mexico is genetically stable and that it is transmitted by the vampire bat Desmodus rotundus.
Structure, organization and expression of common carp (Cyprinus carpio L.) SLP-76 gene.
Huang, Rong; Sun, Xiao-Feng; Hu, Wei; Wang, Ya-Ping; Guo, Qiong-Lin
2008-05-01
SLP-76 is an important member of the SLP-76 family of adapters, and it plays a key role in TCR signaling and T cell function. A partial cDNA sequence of SLP-76 of common carp (Cyprinus carpio L.) was isolated from a thymus cDNA library by suppression subtractive hybridization (SSH). Subsequently, the full-length cDNA of carp SLP-76 was obtained by 3' RACE and 5' RACE. The full-length cDNA of carp SLP-76 was 2007 bp, consisting of a 5'-terminal untranslated region (UTR) of 285 bp, a 3'-terminal UTR of 240 bp, and an open reading frame of 1482 bp. Sequence comparison showed that the deduced amino acid sequence of carp SLP-76 had an overall similarity of 34-73% to that of homologues in other species, and it was composed of an NH2-terminal domain, a central proline-rich domain, and a C-terminal SH2 domain. Amino acid sequence analysis indicated the existence of a Gads binding site R-X-X-K, a 10-aa-long sequence which binds to the SH3 domain of LCK in vitro, and three conserved tyrosine-containing sequences in the NH2-terminal domain. We then used PCR to obtain genomic DNA covering the entire coding region of carp SLP-76. In the 9.2-kb-long genomic sequence, twenty-one exons and twenty introns were identified. RT-PCR results showed that carp SLP-76 was expressed predominantly in hematopoietic tissues, and was upregulated in thymus tissue of four-month-old carp compared to one-year-old carp. RT-PCR and virtual northern hybridization results showed that carp SLP-76 was also upregulated in thymus tissue of GH transgenic carp at the age of four months. These results suggest that the expression level of SLP-76 gene may be related to thymocyte development in teleosts.
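The lengths quoted in the abstract above can be cross-checked with simple arithmetic. The following is a minimal Python sketch that uses only the figures reported there; the variable names are illustrative and not taken from the paper.

```python
# Lengths (bp) quoted in the abstract for the carp SLP-76 cDNA.
utr5_bp, orf_bp, utr3_bp = 285, 1482, 240

# A full-length cDNA is the 5' UTR, the ORF and the 3' UTR laid end to end.
full_length_bp = utr5_bp + orf_bp + utr3_bp

# A gene with n exons has n - 1 introns separating them.
exons = 21
introns = exons - 1

print(full_length_bp, introns)
```

The sum agrees with the reported 2007 bp, and twenty-one exons imply the twenty introns stated in the abstract.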
The Viral Evolution Core within the AIDS and Cancer Virus Program will extract viral RNA/DNA from cell-free or cell-associated samples. Complementary DNA (cDNA) will be generated as needed, and cDNA or DNA will be diluted to a single copy prior to nested
Subramaniam, R; Reinold, S; Molitor, E K; Douglas, C J
1993-01-01
A heterologous probe encoding phenylalanine ammonia-lyase (PAL) was used to identify PAL clones in cDNA libraries made with RNA from young leaf tissue of two Populus deltoides x P. trichocarpa F1 hybrid clones. Sequence analysis of a 2.4-kb cDNA confirmed its identity as a full-length PAL clone. The predicted amino acid sequence is conserved in comparison with that of PAL genes from several other plants. Southern blot analysis of poplar genomic DNA from parental and hybrid individuals, restriction site polymorphism in PAL cDNA clones, and sequence heterogeneity in the 3' ends of several cDNA clones suggested that PAL is encoded by at least two genes that can be distinguished by HindIII restriction site polymorphisms. Clones containing each type of PAL gene were isolated from a poplar genomic library. Analysis of the segregation of PAL-specific HindIII restriction fragment-length polymorphisms demonstrated the existence of two independently segregating PAL loci, one of which was mapped to a linkage group of the poplar genetic map. Developmentally regulated PAL expression in poplar was analyzed using RNA blots. Highest expression was observed in young stems, apical buds, and young leaves. Expression was lower in older stems and undetectable in mature leaves. Cellular localization of PAL expression by in situ hybridization showed very high levels of expression in subepidermal cells of leaves early during leaf development. In stems and petioles, expression was associated with subepidermal cells and vascular tissues. PMID:8108506
Du, Yu-Jie; Hou, Yi-Ling; Hou, Wan-Ru
2013-02-01
The Giant Panda is an endangered species and a valuable genetic resource; its functionally important gene POLR2H encodes subunit H, an essential peptide shared among the RNA polymerases. The genomic DNA and cDNA sequences of POLR2H were cloned for the first time from the Giant Panda (Ailuropoda melanoleuca) using touchdown-PCR and reverse transcription polymerase chain reaction (RT-PCR), respectively. The genomic sequence is 3,285 bp long and includes five exons and four introns. The cloned cDNA fragment is 509 bp long and contains an open reading frame of 453 bp encoding 150 amino acids. Alignment analysis indicated that both the cDNA and its deduced amino acid sequence were highly conserved. Protein structure prediction identified one protein kinase C phosphorylation site, four casein kinase II phosphorylation sites and one amidation site in the POLR2H protein, which may contribute to its higher-order structure. The cloned cDNA was expressed in Escherichia coli as an N-terminally His-tagged fusion, yielding the expected 20.5 kDa polypeptide, in line with the predicted protein. These results provide a basis for further in-depth research of theoretical and practical value.
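The relationship between the 453 bp open reading frame and the 150 encoded amino acids follows from the triplet code. This is a small Python sketch of that conversion; the function name is hypothetical, and it assumes the reported ORF length counts the stop codon.

```python
def residues_from_orf(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of orf_bp nucleotides.

    Assumes the ORF length is a whole number of codons; when the stop
    codon is counted in the length, it encodes no residue.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons

# Figure from the abstract: a 453 bp ORF.
panda_polr2h_residues = residues_from_orf(453)
print(panda_polr2h_residues)
```

With the stop codon included, 453 bp corresponds to 151 codons and 150 residues, matching the abstract.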
Mochida, Keiichi; Uehara-Yamaguchi, Yukiko; Takahashi, Fuminori; Yoshida, Takuhiro; Sakurai, Tetsuya; Shinozaki, Kazuo
2013-01-01
A comprehensive collection of full-length cDNAs is essential for correct structural gene annotation and functional analyses of genes. We constructed a mixed full-length cDNA library from 21 different tissues of Brachypodium distachyon Bd21, and obtained 78,163 high quality expressed sequence tags (ESTs) from both ends of ca. 40,000 clones (including 16,079 contigs). We updated gene structure annotations of Brachypodium genes based on full-length cDNA sequences in comparison with the latest publicly available annotations. About 10,000 non-redundant gene models were supported by full-length cDNAs; ca. 6,000 showed some transcription unit modifications. We also found ca. 580 novel gene models, including 362 newly identified in Bd21. Using the updated transcription start sites, we searched a total of 580 plant cis-motifs in the −3 kb promoter regions and determined a genome-wide Brachypodium promoter architecture. Furthermore, we integrated the Brachypodium full-length cDNAs and updated gene structures with available sequence resources in wheat and barley in a web-accessible database, the RIKEN Brachypodium FL cDNA database. The database represents a “one-stop” information resource for all genomic information in the Pooideae, facilitating functional analysis of genes in this model grass plant and seamless knowledge transfer to the Triticeae crops. PMID:24130698
Bai, W L; Yin, R H; Dou, Q L; Jiang, W Q; Zhao, S J; Ma, Z J; Luo, G B; Zhao, Z H
2011-04-01
κ-Casein is one of the major proteins in the milk of mammals. It plays an important role in determining the size and specific function of milk micelles. We have previously identified and characterized a genetic variant of yak κ-casein by evaluating genomic DNA. Here, we isolate and characterize a yak κ-casein cDNA harboring the full-length open reading frame (ORF) from lactating mammary gland. Total RNA was extracted from mammary tissue of lactating female yak, and the κ-casein cDNA was synthesized by RT-PCR, then cloned and sequenced. The obtained 660-bp cDNA contained an ORF sufficient to encode the entire amino acid sequence of the κ-casein precursor protein, consisting of 190 amino acids with a signal peptide of 21 amino acids. Yak κ-casein has a predicted molecular mass of 19,006.588 Da with a calculated isoelectric point of 7.245. Compared with the corresponding sequences in GenBank of cattle, buffalo, sheep, goat, Arabian camel, horse, and rabbit, the yak κ-casein sequence had identity of 64.76-98.78% in cDNA, and identity of 44.79-98.42% and similarity of 53.65-98.42% in deduced amino acids, revealing a high homology with the other livestock species. Based on κ-casein cDNA sequences, the phylogenetic analysis indicated that yak κ-casein had a close relationship with that of cattle. This work might be useful in genetic engineering research on yak κ-casein.
Beccari, T; Hoade, J; Orlacchio, A; Stirling, J L
1992-01-01
cDNAs encoding the mouse beta-N-acetylhexosaminidase alpha-subunit were isolated from a mouse testis library. The longest of these (1.7 kb) was sequenced and showed 83% similarity with the human alpha-subunit cDNA sequence. The 5' end of the coding sequence was obtained from a genomic DNA clone. Alignment of the human and mouse sequences showed that all three putative N-glycosylation sites are conserved, but that the mouse alpha-subunit has an additional site towards the C-terminus. All eight cysteines in the human sequence are conserved in the mouse. There are an additional two cysteines in the mouse alpha-subunit signal peptide. All amino acids affected in Tay-Sachs-disease mutations are conserved in the mouse. PMID:1379046
Salton, S R
1991-09-01
A nervous system-specific mRNA that is rapidly induced in PC12 cells to a greater extent by nerve growth factor (NGF) than by epidermal growth factor treatment has been cloned. The polypeptide deduced from the nucleic acid sequence of the NGF33.1 cDNA clone contains regions of amino acid sequence identity with that predicted by the cDNA clone VGF, and further analysis suggests that both NGF33.1 and VGF cDNA clones very likely correspond to the same mRNA (VGF). In this report both the nucleic acid sequence that corresponds to VGF mRNA and the polypeptide predicted by the NGF33.1 cDNA clone are presented. Genomic Southern analysis and database comparison did not detect additional sequences with high homology to the VGF gene. Induction of VGF mRNA by depolarization and phorbol 12-myristate 13-acetate treatment was greater than by serum stimulation or protein kinase A pathway activation. These studies suggest that VGF mRNA is induced to the greatest extent by NGF treatment and that VGF is one of the most rapidly regulated neuronal mRNAs identified in PC12 cells.
A database of annotated tentative orthologs from crop abiotic stress transcripts.
Balaji, Jayashree; Crouch, Jonathan H; Petite, Prasad V N S; Hoisington, David A
2006-10-07
A minimal requirement to initiate a comparative genomics study on plant responses to abiotic stresses is a dataset of orthologous sequences. The availability of a large amount of sequence information, including that derived from stress cDNA libraries, allows for the identification of stress related genes and orthologs associated with the stress response. Orthologous sequences serve as tools to explore genes and their relationships across species. For this purpose, ESTs from stress cDNA libraries across 16 crop species including 6 important cereal crops and 10 dicots were systematically collated and subjected to bioinformatics analysis such as clustering, grouping of tentative orthologous sets, identification of protein motifs/patterns in the predicted protein sequence, and annotation with stress conditions, tissue/library source and putative function. All data are available to the scientific community at http://intranet.icrisat.org/gt1/tog/homepage.htm. We believe that the availability of annotated plant abiotic stress ortholog sets will be a valuable resource for researchers studying the biology of environmental stresses in plant systems, molecular evolution and genomics.
USDA-ARS's Scientific Manuscript database
Several biosafety level (BSL)-3/4 pathogens are high consequence, single-stranded RNA viruses and their genomes, when introduced into permissive cells, are infectious. Moreover many of these viruses are Select Agents (SAs), and their genomes are also considered SAs. For this reason cDNAs and/or th...
NEIBank: Genomics and bioinformatics resources for vision research
Peterson, Katherine; Gao, James; Buchoff, Patee; Jaworski, Cynthia; Bowes-Rickman, Catherine; Ebright, Jessica N.; Hauser, Michael A.; Hoover, David
2008-01-01
NEIBank is an integrated resource for genomics and bioinformatics in vision research. It includes expressed sequence tag (EST) data and sequence-verified cDNA clones for multiple eye tissues of several species, web-based access to human eye-specific SAGE data through EyeSAGE, and comprehensive, annotated databases of known human eye disease genes and candidate disease gene loci. All expression- and disease-related data are integrated in EyeBrowse, an eye-centric genome browser. NEIBank provides a comprehensive overview of current knowledge of the transcriptional repertoires of eye tissues and their relation to pathology. PMID:18648525
McMeel, O M; Hoey, E M; Ferguson, A
2001-01-01
The cDNA nucleotide sequences of the lactate dehydrogenase alleles LDH-C1*90 and *100 of brown trout (Salmo trutta) were found to differ at position 308 where an A is present in the *100 allele but a G is present in the *90 allele. This base substitution results in an amino acid change from aspartic acid at position 82 in the LDH-C1 100 allozyme to a glycine in the 90 allozyme. Since aspartic acid has a net negative charge whilst glycine is uncharged, this is consistent with the electrophoretic observation that the LDH-C1 100 allozyme has a more anodal mobility relative to the LDH-C1 90 allozyme. Based on alignment of the cDNA sequence with the mouse genomic sequence, a local primer set was designed, incorporating the variable position, and was found to give very good amplification with brown trout genomic DNA. Sequencing of this fragment confirmed the difference in both homozygous and heterozygous individuals. Digestion of the polymerase chain reaction products with BslI, a restriction enzyme specific for the site difference, gave one, two and three fragments for the two homozygotes and the heterozygote, respectively, following electrophoretic separation. This provides a DNA-based means of routine screening of the highly informative LDH-C1* polymorphism in brown trout population genetic studies. Primer sets presented could be used to sequence cDNA of other LDH* genes of brown trout and other species.
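The genotyping logic in the abstract above (one, two or three fragments after digestion) can be illustrated with a short simulation. This is a minimal Python sketch under stated assumptions: the amplicon length and cut position are hypothetical, the enzyme is modelled as cutting at most once per amplicon, and the abstract does not say which allele carries the site, so the labels refer only to site presence.

```python
def digest(amplicon_len, cut_pos):
    """Fragment lengths of a linear amplicon cut once at cut_pos (None = no site)."""
    if cut_pos is None:
        return [amplicon_len]
    return [cut_pos, amplicon_len - cut_pos]

def gel_bands(allele_cut_sites, amplicon_len):
    """Distinct band sizes for a diploid individual, largest first.

    allele_cut_sites holds one entry per allele copy: a cut position,
    or None when that allele lacks the restriction site.
    """
    bands = set()
    for cut in allele_cut_sites:
        bands.update(digest(amplicon_len, cut))
    return sorted(bands, reverse=True)

def call_genotype(bands):
    """Map the number of bands to a genotype class."""
    calls = {
        1: "homozygous, site absent",
        2: "homozygous, site present",
        3: "heterozygous",
    }
    return calls[len(bands)]

# Hypothetical values for illustration only: 400 bp amplicon, site at 150 bp.
AMPLICON, CUT = 400, 150
for alleles in ([None, None], [CUT, CUT], [CUT, None]):
    bands = gel_bands(alleles, AMPLICON)
    print(bands, call_genotype(bands))
```

The band count is diagnostic only when the cut site is not at the exact midpoint of the amplicon, since two equal-sized fragments would co-migrate as a single band.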
DOE Office of Scientific and Technical Information (OSTI.GOV)
Goldmuntz, E.; Budarf, M.L.; Wang, Zhili
1996-04-15
DiGeorge syndrome (DGS) and velocardiofacial syndrome have been shown to be associated with microdeletions of chromosomal region 22q11. More recently, patients with conotruncal anomaly face syndrome and some nonsyndromic patients with isolated forms of conotruncal cardiac defects have been found to have 22q11 microdeletions as well. The commonly deleted region, called the DiGeorge chromosomal region (DGCR), spans approximately 1.2 Mb and is estimated to contain at least 30 genes. We report a computational approach for gene identification that makes use of large-scale sequencing of cosmids from a contig spanning the DGCR. Using this methodology, we have mapped the human homolog of a rodent citrate transport protein to the DGCR. We have isolated a partial cDNA containing the complete open reading frame and have determined the genomic structure by comparing the genomic sequence from the cosmid to the sequence of the cDNA clone. Whether the citrate transport protein can be implicated in the biological etiology of DGS or other 22q11 microdeletion syndromes remains to be defined. 36 refs., 3 figs., 1 tab.
Optimization of cDNA-AFLP experiments using genomic sequence data.
Kivioja, Teemu; Arvas, Mikko; Saloheimo, Markku; Penttilä, Merja; Ukkonen, Esko
2005-06-01
cDNA amplified fragment length polymorphism (cDNA-AFLP) is one of the few genome-wide level expression profiling methods capable of finding genes that have not yet been cloned or even predicted from sequence but have interesting expression patterns under the studied conditions. In cDNA-AFLP, a complex cDNA mixture is divided into small subsets using restriction enzymes and selective PCR. A large cDNA-AFLP experiment can require a substantial amount of resources, such as hundreds of PCR amplifications and gel electrophoresis runs, followed by manual cutting of a large number of bands from the gels. Our aim was to test whether this workload can be reduced by rational design of the experiment. We used the available genomic sequence information to optimize cDNA-AFLP experiments beforehand so that as many transcripts as possible could be profiled with a given amount of resources. Optimization of the selection of both restriction enzymes and selective primers for cDNA-AFLP experiments has not been performed previously. The in silico tests performed suggest that substantial amounts of resources can be saved by the optimization of cDNA-AFLP experiments.
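The idea of choosing enzymes in silico before running the experiment can be sketched in a few lines. The following Python example is a deliberately simplified illustration, not the authors' optimization method: it ignores selective primers, treats the distance between adjacent sites as the fragment length, and uses a toy transcript set. The function names, the size window and the toy sequences are all assumptions made for this sketch.

```python
import re
from itertools import combinations

# Recognition sequences of three widely used restriction enzymes.
SITES = {"EcoRI": "GAATTC", "MseI": "TTAA", "TaqI": "TCGA"}

def site_positions(seq, site):
    """Start indices of every (possibly overlapping) occurrence of site."""
    return [m.start() for m in re.finditer("(?=" + site + ")", seq)]

def profiled(seq, enz_a, enz_b, min_len=50, max_len=500):
    """True if seq has adjacent sites of the two enzymes at a resolvable distance."""
    hits = sorted(
        [(p, enz_a) for p in site_positions(seq, SITES[enz_a])]
        + [(p, enz_b) for p in site_positions(seq, SITES[enz_b])]
    )
    return any(
        e1 != e2 and min_len <= p2 - p1 <= max_len
        for (p1, e1), (p2, e2) in zip(hits, hits[1:])
    )

def best_pair(transcripts):
    """Enzyme pair that makes the largest number of transcripts visible."""
    pairs = combinations(sorted(SITES), 2)
    return max(pairs, key=lambda pair: sum(profiled(t, *pair) for t in transcripts))

# Toy transcripts (hypothetical sequences, for illustration only).
toy = [
    "GAATTC" + "G" * 60 + "TTAA",
    "TCGA" + "G" * 80 + "TTAA",
    "GAATTC" + "G" * 10 + "TTAA",   # sites too close together to resolve
    "TTAA" + "G" * 100 + "GAATTC",
]
print(best_pair(toy))
```

On a real transcriptome the same loop would be run over predicted transcripts, and the ranking could be extended to count how many PCR and gel runs each design needs.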
Unprecedented genomic diversity of AhR1 and AhR2 genes in Atlantic salmon (Salmo salar L.).
Hansson, Maria C; Wittzell, Håkan; Persson, Kerstin; von Schantz, Torbjörn
2004-06-24
Aryl hydrocarbon receptor (AhR) genes encode proteins involved in mediating the toxic responses induced by several environmental pollutants. Here, we describe the identification of the first two AhR1 (alpha and beta) genes and two additional AhR2 (alpha and beta) genes in the tetraploid species Atlantic salmon (Salmo salar L.) from a cosmid library screening. Cosmid clones containing genomic salmon AhR sequences were isolated using a cDNA clone containing the coding region of the Atlantic salmon AhR2gamma as a probe. Screening revealed 14 positive clones, from which four were chosen for further analyses. One of the cosmids contained genomic AhR sequences that were highly similar to the rainbow trout (Oncorhynchus mykiss) AhR2alpha and beta genes. SMART RACE amplified two complete, highly similar but not identical AhR type 2 sequences from salmon cDNA, which from phylogenetic analyses were determined as the rainbow trout AhR2alpha and beta orthologs. The salmon AhR2alpha and beta encode proteins of 1071 and 1058 residues, respectively, and encompass characteristic AhR sequence elements like a basic-helix-loop-helix (bHLH) and two PER-ARNT-SIM (PAS) domains. Both genes are transcribed in liver, spleen and muscle tissues of adult salmon. A second cosmid contained partial sequences, which were identical to the previously characterized AhR2gamma gene. The last two cosmids contained partial genomic AhR sequences, which were more similar to other AhR type 1 fish genes than the four characterized salmon AhR2 genes. However, attempts to amplify the corresponding complete cDNA sequences of the inserts proved very difficult, suggesting that these genes are non-functional or very weakly transcribed in the examined tissues. Phylogenetic analyses of the conserved regions did, however, clearly indicate that these two AhRs belong to the AhR type 1 clade and have been assigned as the Atlantic salmon AhR1alpha and AhR1beta genes. 
Taken together, these findings demonstrate that multiple AhR genes are present in the Atlantic salmon genome, which is likely a consequence of previous genome duplications in the evolutionary past of salmonids. Plausible explanations for the high incidence of AhR genes in fish and more specifically in salmonids, like rapid divergences in specialized functions, are discussed.
Identification of the genomic locus for the human Rieske Fe-S Protein gene on Chromosome 19q12
DOE Office of Scientific and Technical Information (OSTI.GOV)
Pennacchio, L.A.
1994-05-06
We have identified the chromosomal location of the human Rieske Iron-Sulfur Protein (UQCRFS1) gene. Mapping by hybridization to a panel of monochromosomal hybrid cell lines indicated that the gene was either on chromosome 19 or 22. By screening a human chromosome 19 specific genomic cosmid library with an oligonucleotide probe made from the published Rieske cDNA sequence, we identified a corresponding cosmid. Portions of this cosmid were sequenced directly. The exon, exon:intron junction, and flanking sequences verified that this cosmid contains the genomic locus. Fluorescent in situ hybridization (FISH) was performed to localize this cosmid to chromosome band 19q12.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Hadano, S.; Ishida, Y.; Tomiyasu, H.
1994-09-01
To complete a transcription map of the 1 Mb region in human chromosome 4p16.3 containing the Huntington disease (HD) gene, cDNA clones are being isolated throughout the region. Our method relies on a direct screening of the cDNA libraries probed with single copy microclones from 3 YAC clones spanning 1 Mbp of the HD gene region. YAC-DNAs were isolated by a preparative pulsed-field gel electrophoresis, amplified by both a single unique primer (SUP)-PCR and a linker ligation PCR, and 6 microclone-DNA libraries were generated. Then, 8,640 microclones from these libraries were independently amplified by PCR, and arrayed onto the membranes. 800-900 microclones that were not cross-hybridized with total human and yeast genomic DNA, YAC vector DNA, and ribosomal cDNA on a dot hybridization (putatively carrying single copy sequences) were pooled to make 9 probe pools. A total of approximately 1.8 × 10⁷ plaques from the human brain cDNA libraries was screened with 9 pool-probes, and then 672 positive cDNA clones were obtained. So far, 597 cDNA clones were defined and arrayed onto a map of the 1 Mbp of the HD gene region by hybridization with HD region-specific cosmid contigs and YAC clones. Further characterization including a DNA sequencing and Northern blot analysis is currently underway.
Hwang, Shin-Rong; Garza, Christina Z; Wegrzyn, Jill; Hook, Vivian Y H
2004-08-16
This study demonstrates utilization of the novel GTG initiation codon for translation of a human mRNA transcript that encodes the serpin endopin 2B, a protease inhibitor. Molecular cloning revealed the nucleotide sequence of the human endopin 2B cDNA. Its deduced primary sequence shows high homology to bovine endopin 2A that possesses cross-class protease inhibition of elastase and papain. Notably, the human endopin 2B cDNA sequence revealed GTG as the predicted translation initiation codon; the predicted translation product of 46 kDa endopin 2B was produced by in vitro translation of 35S-endopin 2B with mammalian (rabbit) protein translation components. Importantly, bioinformatic studies demonstrated the presence of the entire human endopin 2B cDNA sequence with GTG as initiation codon within the human genome on chromosome 14. Further evidence for GTG as a functional initiation codon was illustrated by GTG-mediated in vitro translation of the heterologous protein EGFP, and by GTG-mediated expression of EGFP in mammalian PC12 cells. Mutagenesis of GTG to GTC resulted in the absence of EGFP expression in PC12 cells, indicating the function of GTG as an initiation codon. In addition, it was apparent that the GTG initiation codon produces lower levels of translated protein compared to ATG as initiation codon. Significantly, GTG-mediated translation of endopin 2B demonstrates a functional human gene product not previously predicted from initial analyses of the human genome. Further analyses based on GTG as an alternative initiation codon may predict new candidate genes of the human genome.
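The closing suggestion of the abstract above, that allowing GTG as a start codon may reveal additional candidate genes, can be illustrated with a toy open-reading-frame scan. This is a minimal Python sketch: the function name and the example sequence are hypothetical, only the forward strand is scanned, and real gene prediction involves far more than start and stop codons.

```python
STOPS = {"TAA", "TAG", "TGA"}

def longest_orf(cdna, starts=("ATG",)):
    """Longest forward-strand ORF as (start_index, codons_before_stop), or None.

    An ORF here begins at any codon listed in starts and runs in frame
    to the first stop codon; ORFs without a stop are ignored.
    """
    best = None
    for i in range(len(cdna) - 2):
        if cdna[i:i + 3] not in starts:
            continue
        for j in range(i, len(cdna) - 2, 3):
            if cdna[j:j + 3] in STOPS:
                n_codons = (j - i) // 3
                if best is None or n_codons > best[1]:
                    best = (i, n_codons)
                break
    return best

# Hypothetical sequence with a GTG-initiated frame and no ATG anywhere.
seq = "CC" + "GTGAAACCCGGGTTTTAG" + "CC"

print(longest_orf(seq))                          # ATG only
print(longest_orf(seq, starts=("ATG", "GTG")))   # ATG or GTG
```

A scan restricted to ATG finds nothing in this sequence, whereas adding GTG to the allowed start codons recovers the frame, which is the kind of difference the abstract points to.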
Cho, Young Sun; Choi, Buyl Nim; Ha, En-Mi; Kim, Ki Hong; Kim, Sung Koo; Kim, Dong Soo; Nam, Yoon Kwon
2005-01-01
Novel metallothionein (MT) complementary DNA and genomic sequences were isolated from a cartilaginous shark species, Scyliorhinus torazame. The full-length open reading frame (ORF) of shark MT cDNA encoded 68 amino acids with a high cysteine content (29%). The genomic ORF sequence (932 bp) of shark MT isolated by polymerase chain reaction (PCR) comprised 3 exons with 2 intervening introns. Shark MT sequence shared many conserved features with other vertebrate MTs: overall amino acid identities of shark MT ranged from 47% to 57% with fish MTs, and 41% to 62% with mammalian MTs. However, in addition to these conserved characteristics, shark MT sequence exhibited some unique characteristics. It contained 4 extra amino acids (Lys-Ala-Gly-Arg) at the end of the beta-domain, which have not been reported in any other vertebrate MTs. The last amino acid residue at the C-terminus was Ser, which also has not been reported in fish and mammalian MTs. The MT messenger RNA levels in shark liver and kidney, assessed by semiquantitative reverse transcriptase PCR and RNA blot hybridization, were significantly affected by experimental exposures to heavy metals (cadmium, copper, and zinc). Generally, the transcriptional activation of shark MT gene was dependent on the dose (0-10 mg/kg body weight for injection and 0-20 microM for immersion) and duration (1-10 days); zinc was a more potent inducer than copper and cadmium.
Yoshimitsu, Makoto; Higuchi, Koji; Miyata, Masaaki; Devine, Sean; Mattman, Andre; Sirrs, Sandra; Medin, Jeffrey A; Tei, Chuwa; Takenaka, Toshihiro
2011-05-01
Fabry disease is an X-linked lysosomal storage disorder caused by mutations of the α-galactosidase A (GLA) gene, and the disease is a relatively prevalent cause of left ventricular hypertrophy followed by conduction abnormalities and arrhythmias. Mutation analysis of the GLA gene is a valuable tool for accurate diagnosis of affected families. In this study, we carried out molecular studies of 10 unrelated families diagnosed with Fabry disease. Genetic analysis of the GLA gene using conventional genomic sequencing was performed in 9 hemizygous males and 6 heterozygous females. In patients with no mutations in coding DNA sequence, multiplex ligation-dependent probe amplification (MLPA) and/or cDNA sequencing were performed. We identified a novel exon 2 deletion (IVS1_IVS2) in a heterozygous female by MLPA, which was undetectable by conventional sequencing methods. In addition, the g.9331G>A mutation that has previously been found only in patients with cardiac Fabry disease was found in 3 unrelated, newly-diagnosed, cardiac Fabry patients by sequencing GLA genomic DNA and cDNA. Two other novel mutations, g.8319A>G and 832delA were also found in addition to 4 previously reported mutations (R112C, C142Y, M296I, and G373D) in 6 other families. We could identify GLA gene mutations in all hemizygotes and heterozygotes from 10 families with Fabry disease. Mutations in 4 out of 10 families could not be identified by classical genomic analysis, which focuses on exons and the flanking region. Instead, these data suggest that MLPA analysis and cDNA sequence should be considered in genetic testing surveys of patients with Fabry disease. Copyright © 2011 Japanese College of Cardiology. Published by Elsevier Ltd. All rights reserved.
Organization of the murine Cd22 locus
DOE Office of Scientific and Technical Information (OSTI.GOV)
Law, Che-Leung; Torres, R.M.; Sundeberg, H.A.
1993-07-01
Murine CD22 (mCD22) is a B cell-associated adhesion protein with seven extracellular Ig-like domains that has 62% amino acid identity to its human homologue. Southern analysis on genomic DNA isolated from tissues and cell lines from several mouse strains using mCD22 cDNA demonstrated that the Cd22 locus encoding mCD22 is a single copy gene of ≤30 kb. Digestion of genomic DNA preparations with four restriction endonucleases revealed the presence of restriction fragment length polymorphisms (RFLP) in BALB/c, C57BL/6, and C3H strains vs DBA/2j, NZB, and NZC strains, suggesting the presence of two or more Cd22 alleles. Using a mCD22 cDNA clone derived from the BALB/c strain, the authors isolated genomic clones from a DBA/2 genomic library that contained all the exons necessary to encode the full length mCD22 cDNA. Fifteen exons, including exon 3 that encodes the translation start codon, were identified. Each extracellular Ig-like domain of mCD22 is encoded by a single exon. A comparison between the nucleotide sequences of the BALB/c CD22 cDNA and the exons of the DBA/2j CD22 genomic clones revealed an 18-nucleotide deletion in exon 4 (encoding the most distal Ig-like domain 1 of mCD22) of the DBA/2j genomic sequence in addition to a number of substitutions, insertions, and deletions in other exons. These nucleotide differences were also present in a cDNA clone isolated from total RNA of LPS-activated DBA/2j splenocytes [...] chromosome 7, a region syntenic to human chromosome 19q, close to the previously reported loci, Lyb-8 and Mag (a homologue of Cd22). An antibody (CY34) against the Lyb-8.2 B cell marker reacted with a BHK transfectant expressing the full length mCd22 cDNA, thus demonstrating that Lyb-8 and Cd22 loci are identical. Furthermore, a rat anti-mCD22 mAb, NIM-R6, bound to sIgM⁺ DBA/2j B cells, confirming the expression of a CD22 protein by the Cd22ᵃ/Lyb-8ᵃ allele. 63 refs., 7 figs., 1 tab.
[Polymorphic loci and polymorphism analysis of short tandem repeats within XNP gene].
Liu, Qi-Ji; Gong, Yao-Qin; Guo, Chen-Hong; Chen, Bing-Xi; Li, Jiang-Xia; Guo, Yi-Shou
2002-01-01
To select polymorphic short tandem repeat markers within X-linked nuclear protein (XNP) gene, genomic clones which contain XNP gene were recognized by homologous analysis with XNP cDNA. By comparing the cDNA with genomic DNA, non-exonic sequences were identified, and short tandem repeats were selected from non-exonic sequences by using BCM search Launcher. Polymorphisms of the short tandem repeats in Chinese population were evaluated by PCR amplification and PAGE. Five short tandem repeats were identified from XNP gene, two of which were polymorphic. Four and 11 alleles were observed in Chinese population for XNPSTR1 and XNPSTR4, respectively. Heterozygosities were 47% for XNPSTR1 and 70% for XNPSTR4. XNPSTR1 and XNPSTR4 localized within 3' end and intron 10, respectively. Two polymorphic short tandem repeats have been identified within XNP gene and will be useful for linkage analysis and gene diagnosis of XNP gene.
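Two steps in the abstract above lend themselves to a short illustration: locating short tandem repeats in non-exonic sequence and summarizing polymorphism as heterozygosity. The Python sketch below is an assumption-laden stand-in, not the BCM Search Launcher procedure the authors used. The regular expression, thresholds and function names are illustrative, and the abstract does not state whether the reported heterozygosities were observed or expected values.

```python
import re

def find_strs(seq, unit_min=2, unit_max=6, min_repeats=5):
    """Return (start, unit, repeat_count) for simple tandem repeats in seq.

    The shortest repeating unit is preferred; homopolymer runs can be
    reported as dinucleotide repeats and would need extra filtering.
    """
    pattern = re.compile(
        r"(([ACGT]{%d,%d}?)\2{%d,})" % (unit_min, unit_max, min_repeats - 1)
    )
    return [
        (m.start(), m.group(2), len(m.group(1)) // len(m.group(2)))
        for m in pattern.finditer(seq)
    ]

def expected_heterozygosity(allele_freqs):
    """Expected heterozygosity H = 1 - sum(p_i^2) for allele frequencies p_i."""
    return 1.0 - sum(p * p for p in allele_freqs)

# Hypothetical intronic sequence with a (CA)8 repeat.
intron = "GGG" + "CA" * 8 + "TTT"
print(find_strs(intron))

# With k equally frequent alleles, H reaches its maximum of 1 - 1/k.
print(expected_heterozygosity([0.25] * 4))
```

For a locus with 4 alleles the expected heterozygosity cannot exceed 0.75, and for 11 alleles it cannot exceed about 0.91, so the reported 47% and 70% both fall within the possible range.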
Rise, Matthew L.; von Schalburg, Kristian R.; Brown, Gordon D.; Mawer, Melanie A.; Devlin, Robert H.; Kuipers, Nathanael; Busby, Maura; Beetz-Sargent, Marianne; Alberto, Roberto; Gibbs, A. Ross; Hunt, Peter; Shukin, Robert; Zeznik, Jeffrey A.; Nelson, Colleen; Jones, Simon R.M.; Smailus, Duane E.; Jones, Steven J.M.; Schein, Jacqueline E.; Marra, Marco A.; Butterfield, Yaron S.N.; Stott, Jeff M.; Ng, Siemon H.S.; Davidson, William S.; Koop, Ben F.
2004-01-01
We report 80,388 ESTs from 23 Atlantic salmon (Salmo salar) cDNA libraries (61,819 ESTs), 6 rainbow trout (Oncorhynchus mykiss) cDNA libraries (14,544 ESTs), 2 chinook salmon (Oncorhynchus tshawytscha) cDNA libraries (1317 ESTs), 2 sockeye salmon (Oncorhynchus nerka) cDNA libraries (1243 ESTs), and 2 lake whitefish (Coregonus clupeaformis) cDNA libraries (1465 ESTs). The majority of these are 3′ sequences, allowing discrimination between paralogs arising from a recent genome duplication in the salmonid lineage. Sequence assembly reveals 28,710 different S. salar, 8981 O. mykiss, 1085 O. tshawytscha, 520 O. nerka, and 1176 C. clupeaformis putative transcripts. We annotate the submitted portion of our EST database by molecular function. Higher- and lower-molecular-weight fractions of libraries are shown to contain distinct gene sets, and higher rates of gene discovery are associated with higher-molecular weight libraries. Pyloric caecum library group annotations indicate this organ may function in redox control and as a barrier against systemic uptake of xenobiotics. A microarray is described, containing 7356 salmonid elements representing 3557 different cDNAs. Analyses of cross-species hybridizations to this cDNA microarray indicate that this resource may be used for studies involving all salmonids. PMID:14962987
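The per-species counts in the abstract above can be tallied to confirm the stated total. This is a minimal Python sketch that uses only the numbers quoted there; the dictionary layout is an illustrative choice.

```python
# (number of cDNA libraries, number of ESTs) per species, from the abstract.
est_data = {
    "Salmo salar": (23, 61819),
    "Oncorhynchus mykiss": (6, 14544),
    "Oncorhynchus tshawytscha": (2, 1317),
    "Oncorhynchus nerka": (2, 1243),
    "Coregonus clupeaformis": (2, 1465),
}

total_libraries = sum(libs for libs, _ in est_data.values())
total_ests = sum(ests for _, ests in est_data.values())

print(total_libraries, total_ests)
```

The species-level EST counts add up to the 80,388 reported in the opening sentence.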
HLA genotyping by next-generation sequencing of complementary DNA.
Segawa, Hidenobu; Kukita, Yoji; Kato, Kikuya
2017-11-28
Genotyping of the human leucocyte antigen (HLA) is indispensable for various medical treatments. However, unambiguous genotyping is technically challenging due to high polymorphism of the corresponding genomic region. Next-generation sequencing is changing the landscape of genotyping. In addition to high throughput of data, its additional advantage is that DNA templates are derived from single molecules, which is a strong merit for the phasing problem. Although most currently developed technologies use genomic DNA, use of cDNA could enable genotyping with reduced costs in data production and analysis. We thus developed an HLA genotyping system based on next-generation sequencing of cDNA. Each HLA gene was divided into 3 or 4 target regions subjected to PCR amplification and subsequent sequencing with Ion Torrent PGM. The sequence data were then subjected to an automated analysis. The principle of the analysis was to construct candidate sequences generated from all possible combinations of variable bases and arrange them in decreasing order of the number of reads. Upon collecting candidate sequences from all target regions, 2 haplotypes were usually assigned. Cases not assigned 2 haplotypes were forwarded to 4 additional processes: selection of candidate sequences applying more stringent criteria, removal of artificial haplotypes, selection of candidate sequences with a relaxed threshold for sequence matching, and countermeasure for incomplete sequences in the HLA database. The genotyping system was evaluated using 30 samples; the overall accuracy was 97.0% at the field 3 level and 98.3% at the G group level. With one sample, genotyping of DPB1 was not completed due to short read size. We then developed a method for complete sequencing of individual molecules of the DPB1 gene, using the molecular barcode technology. The performance of the automatic genotyping system was comparable to that of systems developed in previous studies. 
Thus, next-generation sequencing of cDNA is a viable option for HLA genotyping.
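The candidate-ranking principle described in this abstract (build candidate sequences from all combinations of variable bases, then order them by read count) can be sketched as follows. This is only a minimal illustration, not the authors' pipeline: the function name `rank_candidates`, the toy reads, and the assumption that reads are already aligned to one target region with known variable positions are all hypothetical.

```python
from collections import Counter
from itertools import product


def rank_candidates(reads, variable_positions):
    """Rank every combination of observed variable bases by supporting reads.

    reads: equal-length strings already aligned to one target region
           (an assumption made for this sketch).
    variable_positions: 0-based indices of the polymorphic bases.
    """
    # Bases actually observed at each variable position.
    alleles = [sorted({read[p] for read in reads}) for p in variable_positions]
    # Number of reads carrying each combination of bases.
    counts = Counter(tuple(read[p] for p in variable_positions) for read in reads)
    # Enumerate all possible combinations, including unobserved ones.
    candidates = [(combo, counts.get(combo, 0)) for combo in product(*alleles)]
    # Decreasing order of read count; ties broken alphabetically.
    candidates.sort(key=lambda item: (-item[1], item[0]))
    return candidates


toy_reads = ["ACGT", "ACGT", "ATGA", "ATGA", "ATGA", "ACGA"]
ranked = rank_candidates(toy_reads, variable_positions=[1, 3])
```

In this toy case the two best-supported combinations would be taken as the two haplotypes, while low-count combinations would be candidates for the artefact-removal steps the abstract mentions.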
Bricheux, G; Brugerolle, G
1997-08-01
The parasitic protozoan Trichomonas vaginalis is known to contain the ubiquitous and highly conserved protein actin. A genomic library and a cDNA library have been screened to identify and clone the actin gene(s) of T. vaginalis. The nucleotide sequence of one gene and its flanking regions have been determined. The open reading frame encodes a protein of 376 amino acids. The sequence is not interrupted by any introns and the promoter could be represented by a 10 bp motif close to a consensus motif also found upstream of most sequenced T. vaginalis genes. The five different clones isolated from the cDNA library have similar sequences and encode three actin proteins differing only by one or two amino acids. A phylogenetic analysis of 31 actin sequences by distance matrix and parsimony methods, using centractin as outgroup, gives congruent trees with Parabasala branching above Diplomonadida.
Huebner, K; Druck, T; Croce, C M; Thiesen, H J
1991-01-01
cDNA clones encoding zinc finger structures were isolated by screening Molt4 and Jurkat cDNA libraries with zinc finger consensus sequences. Candidate clones were partially sequenced to verify the presence of zinc finger-encoding regions; nonoverlapping cDNA clones were chosen on the basis of sequences and genomic hybridization pattern. Zinc finger structure-encoding clones, which were designated by the term "Kox" and a number from 1 to 32 and which were apparently unique (i.e., distinct from each other and distinct from those isolated by other laboratories), were chosen for mapping in the human genome. DNAs from rodent-human somatic cell hybrids retaining defined complements of human chromosomes were analyzed for the presence of each of the Kox genes. Correlation between the presence of specific human chromosome regions and specific Kox genes established the chromosomal locations. Multiple Kox loci were mapped to 7q (Kox 18 and 25 and a locus detected by both Kox 8 cDNA and Kox 27 cDNA), 8q24 5' to the myc locus (Kox 9 and 32), 10cen→q24 (Kox 2, 15, 19, 21, 30, and 31), 12q13-qter (Kox 1 and 20), 17p13 (Kox 11 and 26), and 19q (Kox 5, 6, 10, 22, 24, and 28). Single Kox loci were mapped to 7p22 (Kox 3), 18q12 (Kox 17), 19p (Kox 13), 22q11 between IG lambda and BCR-1 (locus detected by both Kox 8 cDNA and Kox 27 cDNA), and Xp (Kox 14). Several of the Kox loci map to regions in which other zinc finger structure-encoding loci have already been localized, indicating possible zinc finger gene clusters. In addition, Kox genes at 8q24, 17p13, and 22q11, and perhaps other Kox genes, are located near recurrent chromosomal translocation breakpoints. Others, such as those on 7p and 7q, may be near regions specifically active in T cells. PMID:2014798
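The mapping logic in this abstract (correlating the presence of a gene with the human chromosomes retained in each somatic cell hybrid) can be sketched as a simple concordance count. The panel below is invented toy data, and the function name `map_by_concordance` is hypothetical; real panels also resolve chromosome regions, which this sketch ignores.

```python
def map_by_concordance(panel, probe_signal):
    """Return the chromosome(s) with the fewest discordant hybrids.

    panel: {hybrid_name: set of human chromosomes retained}
    probe_signal: {hybrid_name: True if the probe detected the gene}
    """
    chromosomes = set().union(*panel.values())
    discordance = {}
    for chrom in chromosomes:
        # A hybrid is discordant if chromosome presence and probe signal disagree.
        discordance[chrom] = sum(
            (chrom in retained) != probe_signal[hybrid]
            for hybrid, retained in panel.items()
        )
    best = min(discordance.values())
    return sorted(c for c, d in discordance.items() if d == best), discordance


# Hypothetical four-hybrid panel, for illustration only.
toy_panel = {
    "h1": {"7", "19"},
    "h2": {"7", "X"},
    "h3": {"19"},
    "h4": {"X", "12"},
}
toy_signal = {"h1": True, "h2": False, "h3": True, "h4": False}
location, scores = map_by_concordance(toy_panel, toy_signal)
```

A chromosome with zero discordant hybrids is the natural assignment; nonzero minima would suggest hybrid panel errors or a multi-locus probe, such as the Kox 8/Kox 27 probes noted above.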
Gorodkin, Jan; Cirera, Susanna; Hedegaard, Jakob; Gilchrist, Michael J; Panitz, Frank; Jørgensen, Claus; Scheibye-Knudsen, Karsten; Arvin, Troels; Lumholdt, Steen; Sawera, Milena; Green, Trine; Nielsen, Bente J; Havgaard, Jakob H; Rosenkilde, Carina; Wang, Jun; Li, Heng; Li, Ruiqiang; Liu, Bin; Hu, Songnian; Dong, Wei; Li, Wei; Yu, Jun; Wang, Jian; Stærfeldt, Hans-Henrik; Wernersson, Rasmus; Madsen, Lone B; Thomsen, Bo; Hornshøj, Henrik; Bujie, Zhan; Wang, Xuegang; Wang, Xuefei; Bolund, Lars; Brunak, Søren; Yang, Huanming; Bendixen, Christian; Fredholm, Merete
2007-01-01
Background Knowledge of the structure of gene expression is essential for mammalian transcriptomics research. We analyzed a collection of more than one million porcine expressed sequence tags (ESTs), of which two-thirds were generated in the Sino-Danish Pig Genome Project and one-third are from public databases. The Sino-Danish ESTs were generated from one normalized and 97 non-normalized cDNA libraries representing 35 different tissues and three developmental stages. Results Using the Distiller package, the ESTs were assembled to roughly 48,000 contigs and 73,000 singletons, of which approximately 25% have a high confidence match to UniProt. Approximately 6,000 new porcine gene clusters were identified. Expression analysis based on the non-normalized libraries resulted in the following findings. The distribution of cluster sizes is scaling invariant. Brain and testes are among the tissues with the greatest number of different expressed genes, whereas tissues with more specialized function, such as developing liver, have fewer expressed genes. There are at least 65 high confidence housekeeping gene candidates and 876 cDNA library-specific gene candidates. We identified differential expression of genes between different tissues, in particular brain/spinal cord, and found patterns of correlation between genes that share expression in pairs of libraries. Finally, there was remarkable agreement in expression between specialized tissues according to Gene Ontology categories. Conclusion This EST collection, the largest to date in pig, represents an essential resource for annotation, comparative genomics, assembly of the pig genome sequence, and further porcine transcription studies. PMID:17407547
Cross-species transferability and mapping of genomic and cDNA SSRs in pines
D. Chagne; P. Chaumeil; A. Ramboer; C. Collada; A. Guevara; M. T. Cervera; G. G. Vendramin; V. Garcia; J-M. Frigerio; Craig Echt; T. Richardson; Christophe Plomion
2004-01-01
Two unigene datasets of Pinus taeda and Pinus pinaster were screened to detect di-, tri- and tetranucleotide repeated motifs using the SSRIT script. A total of 419 simple sequence repeats (SSRs) were identified, of which only 12.8% overlapped between the two sets. The positions of the SSRs within the coding sequence were predicted...
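Detection of di-, tri- and tetranucleotide repeats of the kind described here can be sketched with regular expressions. This is not the SSRIT script itself; the function names and the minimum repeat counts are assumptions chosen for illustration, since the abstract does not state the thresholds used.

```python
import re

# Hypothetical thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS = {2: 5, 3: 4, 4: 3}


def is_lower_order(motif):
    """True if the motif is itself a repeat of a shorter unit (e.g. GAGA = GA x 2)."""
    k = len(motif)
    return any(motif == motif[:d] * (k // d) for d in range(1, k) if k % d == 0)


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for each simple sequence repeat found."""
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # One motif of length k followed by at least n - 1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if is_lower_order(motif):
                continue  # already reported under the shorter motif
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


example = "TTGAGAGAGAGAGACCATGATGATGATGCC"
ssrs = find_ssrs(example)
```

Running the same function on two unigene sets and comparing the results would be one way to estimate how many SSRs the sets share.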
The nop gene from Phanerochaete chrysosporium encodes a peroxidase with novel structural features
Luis F. Larrondo; Angel Gonzalez; Tomas Perez-Acle; Dan Cullen; Rafael Vicuna
2005-01-01
Inspection of the genome of the ligninolytic basidiomycete Phanerochaete chrysosporium revealed an unusual peroxidase-like sequence. The corresponding full length cDNA was sequenced and an archetypal secretion signal predicted. The deduced mature protein (NoP, novel peroxidase) contains 295 aa residues and is therefore considerably shorter than other Class II (fungal)...
Wickramasinghe, Gammadde Hewa Ishan Maduka; Rathnayake, Pilimathalawe Panditharathna Attanayake Mudiyanselage Samith Indika; Chandrasekharan, Naduviladath Vishvanath; Weerasinghe, Mahindagoda Siril Samantha; Wijesundera, Ravindra Lakshman Chundananda; Wijesundera, Wijepurage Sandhya Sulochana
2017-06-21
Cellulose, a linear polymer of β-1,4-linked glucose, is the most abundant renewable fraction of plant biomass (lignocellulose). It is synergistically converted to glucose by endoglucanase (EG), cellobiohydrolase (CBH) and β-glucosidase (BGL) of the cellulase complex. BGL plays a major role in the conversion of randomly cleaved cellooligosaccharides into glucose. As is well known, Saccharomyces cerevisiae can efficiently convert glucose into ethanol under anaerobic conditions. Therefore, S. cerevisiae was genetically modified with the objective of heterologous extracellular expression of the BGLI gene of Trichoderma virens, making it capable of utilizing cellobiose to produce ethanol. The cDNA and a genomic sequence of the BGLI gene of Trichoderma virens were cloned in the yeast expression vector pGAPZα and separately transformed into Saccharomyces cerevisiae. The size of the BGLI cDNA clone was 1363 bp and the genomic DNA clone contained an additional 76 bp single intron following the first exon. The gene was 90% similar to the DNA sequence and 99% similar to the deduced amino acid sequence of 1,4-β-D-glucosidase of T. atroviride (AC237343.1). The BGLI activity expressed by the recombinant genomic clone was 3.4 times greater (1.7 × 10⁻³ IU ml⁻¹) than that observed for the cDNA clone (5 × 10⁻⁴ IU ml⁻¹). Furthermore, the activity was similar to the activity of locally isolated Trichoderma virens (1.5 × 10⁻³ IU ml⁻¹). The estimated size of the protein was 52 kDa. In fermentation studies, the maximum ethanol production by the genomic and the cDNA clones was 0.36 g and 0.06 g per g of cellobiose, respectively. Molecular docking results indicated that the bare protein and cellobiose-protein complex behave in a similar manner with considerable stability in aqueous medium. The deduced binding site and the binding affinity of the constructed homology model appeared to be reasonable. 
Moreover, the five hydrogen bonds formed between the amino acid residues of BGLI and cellobiose were identified as mainly responsible for the integrity of the enzyme-substrate association. The BGLI activity was remarkably higher in the genomic DNA clone compared to the cDNA clone. Cellobiose was successfully fermented into ethanol by the recombinant S. cerevisiae genomic DNA clone. It has the potential to be used in the industrial production of ethanol as it is capable of simultaneous saccharification and fermentation of cellobiose. Homology modeling, docking studies and molecular dynamics simulation studies will provide a realistic model for further studies in the modification of active site residues which could be followed by mutation studies to improve the catalytic action of BGLI.
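The fold differences quoted in this abstract can be checked with a few lines of arithmetic. The variable names are illustrative; the numbers are the activity and ethanol yield values reported above.

```python
# BGLI activity in IU per ml, as reported in the abstract.
genomic_activity = 1.7e-3
cdna_activity = 5e-4
fold_activity = genomic_activity / cdna_activity

# Maximum ethanol production in g per g of cellobiose, as reported.
genomic_yield = 0.36
cdna_yield = 0.06
fold_yield = genomic_yield / cdna_yield

print(f"activity ratio (genomic/cDNA): {fold_activity:.1f}")
print(f"ethanol yield ratio (genomic/cDNA): {fold_yield:.1f}")
```

The activity ratio reproduces the stated 3.4-fold figure, and the same comparison applied to ethanol production gives a roughly six-fold difference between the genomic and cDNA clones.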
Poly A- transcripts expressed in HeLa cells.
Wu, Qingfa; Kim, Yeong C; Lu, Jian; Xuan, Zhenyu; Chen, Jun; Zheng, Yonglan; Zhou, Tom; Zhang, Michael Q; Wu, Chung-I; Wang, San Ming
2008-07-30
Transcripts expressed in eukaryotes are classified as poly A+ transcripts or poly A- transcripts based on the presence or absence of the 3' poly A tail. Most transcripts identified so far are poly A+ transcripts, whereas the poly A- transcripts remain largely unknown. We developed the TRD (Total RNA Detection) system for transcript identification. The system detects the transcripts through the following steps: 1) depleting the abundant ribosomal and small-size transcripts; 2) synthesizing cDNA without regard to the status of the 3' poly A tail; 3) applying the 454 sequencing technology for massive 3' EST collection from the cDNA; and 4) determining the genome origins of the detected transcripts by mapping the sequences to the human genome reference sequences. Using this system, we characterized the cytoplasmic transcripts from HeLa cells. Of the 13,467 distinct 3' ESTs analyzed, 24% are poly A-, 36% are poly A+, and 40% are bimorphic with poly A+ features but without the 3' poly A tail. Most of the poly A- 3' ESTs do not match known transcript sequences; they have a similar distribution pattern in the genome as the poly A+ and bimorphic 3' ESTs, and their mapped intergenic regions are evolutionarily conserved. Experiments confirmed the authenticity of the detected poly A- transcripts. Our study provides the first large-scale sequence evidence for the presence of poly A- transcripts in eukaryotes. The abundance of the poly A- transcripts highlights the need for comprehensive identification of these transcripts for decoding the transcriptome, annotating the genome and studying biological relevance of the poly A- transcripts.
USDA-ARS's Scientific Manuscript database
This study reports generation of large-scale genomic resources for pigeonpea, a so-called ‘orphan crop species’ of the semi-arid tropic regions. Roche FLX/454 sequencing carried out on a normalized cDNA pool prepared from 31 tissues produced 494,353 short transcript reads (STRs). Cluster analysi...
Oduru, Sreedhar; Campbell, Janee L; Karri, SriTulasi; Hendry, William J; Khan, Shafiq A; Williams, Simon C
2003-01-01
Background Complete genome annotation will likely be achieved through a combination of computer-based analysis of available genome sequences combined with direct experimental characterization of expressed regions of individual genomes. We have utilized a comparative genomics approach involving the sequencing of randomly selected hamster testis cDNAs to begin to identify genes not previously annotated on the human, mouse, rat and Fugu (pufferfish) genomes. Results 735 distinct sequences were analyzed for their relatedness to known sequences in public databases. Eight of these sequences were derived from previously unidentified genes and expression of these genes in testis was confirmed by Northern blotting. The genomic locations of each sequence were mapped in human, mouse, rat and pufferfish, where applicable, and the structure of their cognate genes was derived using computer-based predictions, genomic comparisons and analysis of uncharacterized cDNA sequences from human and macaque. Conclusion The use of a comparative genomics approach resulted in the identification of eight cDNAs that correspond to previously uncharacterized genes in the human genome. The proteins encoded by these genes included a new member of the kinesin superfamily, a SET/MYND-domain protein, and six proteins for which no specific function could be predicted. Each gene was expressed primarily in testis, suggesting that they may play roles in the development and/or function of testicular cells. PMID:12783626
Comparison of the canine and human acid β-galactosidase gene
DOE Office of Scientific and Technical Information (OSTI.GOV)
Ahern-Rindell, A.J.; Kretz, K.A.; O'Brien, J.S.
Several canine cDNA libraries were screened with human β-galactosidase cDNA as probe. Seven positive clones were isolated and sequenced yielding a partial (2060 bp) canine β-galactosidase cDNA with 86% identity to the human β-galactosidase cDNA. Preliminary analysis of a canine genomic library indicated conservation of exon number and size. Analysis by Northern blotting disclosed a single mRNA of 2.4 kb in fibroblasts and liver from normal dogs and dogs affected with GM1 gangliosidosis. Although incomplete, these results indicate canine GM1 gangliosidosis is a suitable animal model of the human disease and should further efforts to devise a gene therapy strategy for its treatment. 20 refs., 2 figs., 1 tab.
Seo, H S; Kim, H Y; Jeong, J Y; Lee, S Y; Cho, M J; Bahk, J D
1995-03-01
A cDNA clone, RGA1, was isolated by using a GPA1 cDNA clone of Arabidopsis thaliana G protein alpha subunit as a probe from a rice (Oryza sativa L. IR-36) seedling cDNA library from roots and leaves. Sequence analysis of genomic clone reveals that the RGA1 gene has 14 exons and 13 introns, and encodes a polypeptide of 380 amino acid residues with a calculated molecular weight of 44.5 kDa. The encoded protein exhibits a considerable degree of amino acid sequence similarity to all the other known G protein alpha subunits. A putative TATA sequence (ATATGA), a potential CAAT box sequence (AGCAATAC), and a cis-acting element, CCACGTGG (ABRE), known to be involved in ABA induction are found in the promoter region. The RGA1 protein contains all the consensus regions of G protein alpha subunits except the cysteine residue near the C-terminus for ADP-ribosylation by pertussis toxin. The RGA1 polypeptide expressed in Escherichia coli was, however, ADP-ribosylated by 10 microM [adenylate-32P] NAD and activated cholera toxin. Southern analysis indicates that there are no other genes similar to the RGA1 gene in the rice genome. Northern analysis reveals that the RGA1 mRNA is 1.85 kb long and expressed in vegetative tissues, including leaves and roots, and that its expression is regulated by light.
Regulation of pathogenicity in hop stunt viroid-related group II citrus viroids.
Reanwarakorn, K; Semancik, J S
1998-12-01
Nucleotide sequences were determined for two hop stunt viroid-related Group II citrus viroids characterized as either a cachexia disease non-pathogenic variant (CVd-IIa) or a pathogenic variant (CVd-IIb). Sequence identity between the two variants of 95.6% indicated a conserved genome with the principal region of nucleotide difference clustered in the variable (V) domain. Full-length viroid RT-PCR cDNA products were cloned into plasmid SP72. Viroid cDNA clones as well as derived RNA transcripts were transmissible to citron (Citrus medica L.) and Luffa aegyptiaca Mill. To determine the locus of cachexia pathogenicity as well as symptom expression in Luffa, chimeric viroid cDNA clones were constructed from segments of either the left terminal, pathogenic and conserved (T1-P-C) domains or the conserved, variable and right terminal (C-V-T2) domains of CVd-IIa or CVd-IIb in reciprocal exchanges. Symptoms induced by the various chimeric constructs on the two bioassay hosts reflected the differential response observed with CVd-IIa and -IIb. Constructs with the C-V-T2 domains region from clone-IIa induced severe symptoms on Luffa typical of CVd-IIa, but were non-symptomatic on mandarin as a bioassay host for the cachexia disease. Constructs with the same region (C-V-T2) from the clone-IIb genome induced only mild symptoms on Luffa, but produced a severe reaction on mandarin, as observed for CVd-IIb. Specific site-directed mutations were introduced into the V domain of the CVd-IIa clone to construct viroid cDNA clones with either partial or complete conversions to the CVd-IIb sequence. With the introduction of six site-specific changes into the V domain of the clone-IIa genome, cachexia pathogenicity was acquired as well as a moderation of severe symptoms on Luffa.
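A sequence identity figure such as the 95.6% reported here is typically computed from a pairwise alignment. The sketch below assumes the two sequences are already aligned (gaps written as "-") and counts identical columns over all aligned columns; the abstract does not say which denominator convention was used, so that choice, like the function name and toy sequences, is an assumption.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity between two pre-aligned sequences of equal length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    # Ignore columns where both sequences carry a gap.
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == "-" and y == "-")]
    if not columns:
        raise ValueError("no comparable columns")
    # A gap opposite a base counts as a mismatch.
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


# Toy sequences, not viroid data.
identity_substitution = percent_identity("ACGUACGUAC", "ACGUACGUAU")
identity_with_gap = percent_identity("ACG-U", "ACGAU")
```

Restricting the comparison to one domain, for example the V domain columns only, would show how strongly the differences are clustered there.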
Bannasch, Detlev; Mehrle, Alexander; Glatting, Karl-Heinz; Pepperkok, Rainer; Poustka, Annemarie; Wiemann, Stefan
2004-01-01
We have implemented LIFEdb (http://www.dkfz.de/LIFEdb) to link information regarding novel human full-length cDNAs generated and sequenced by the German cDNA Consortium with functional information on the encoded proteins produced in functional genomics and proteomics approaches. The database also serves as a sample-tracking system to manage the process from cDNA to experimental read-out and data interpretation. A web interface enables the scientific community to explore and visualize features of the annotated cDNAs and ORFs combined with experimental results, and thus helps to unravel new features of proteins with as yet unknown functions. PMID:14681468
Grohmann, L; Brennicke, A; Schuster, W
1992-01-01
The Oenothera mitochondrial genome contains only a gene fragment for ribosomal protein S12 (rps12), while other plants encode a functional gene in the mitochondrion. The complete Oenothera rps12 gene is located in the nucleus. The transit sequence necessary to target this protein to the mitochondrion is encoded by a 5'-extension of the open reading frame. Comparison of the amino acid sequence encoded by the nuclear gene with the polypeptides encoded by edited mitochondrial cDNA and genomic sequences of other plants suggests that gene transfer between mitochondrion and nucleus started from edited mitochondrial RNA molecules. Mechanisms and requirements of gene transfer and activation are discussed. PMID:1454526
Mutations Affecting Expression of the rosy Locus in Drosophila melanogaster
Lee, Chong Sung; Curtis, Daniel; McCarron, Margaret; Love, Carol; Gray, Mark; Bender, Welcome; Chovnick, Arthur
1987-01-01
The rosy locus in Drosophila melanogaster codes for the enzyme xanthine dehydrogenase (XDH). Previous studies defined a "control element" near the 5' end of the gene, where variant sites affected the amount of rosy mRNA and protein produced. We have determined the DNA sequence of this region from both genomic and cDNA clones, and from the ry+10 underproducer strain. This variant strain had many sequence differences, so that the site of the regulatory change could not be fixed. A mutagenesis was also undertaken to isolate new regulatory mutations. We induced 376 new mutations with 1-ethyl-1-nitrosourea (ENU) and screened them to isolate those that reduced the amount of XDH protein produced, but did not change the properties of the enzyme. Genetic mapping was used to find mutations located near the 5' end of the gene. DNA from each of seven mutants was cloned and sequenced through the 5' region. Mutant base changes were identified in all seven; they appear to affect splicing and translation of the rosy mRNA. In a related study (T. P. Keith et al. 1987), the genomic and cDNA sequences are extended through the 3' end of the gene; the combined sequences define the processing pattern of the rosy transcript and predict the amino acid sequence of XDH. PMID:3036645
Towards decoding the conifer giga-genome.
Mackay, John; Dean, Jeffrey F D; Plomion, Christophe; Peterson, Daniel G; Cánovas, Francisco M; Pavy, Nathalie; Ingvarsson, Pär K; Savolainen, Outi; Guevara, M Ángeles; Fluch, Silvia; Vinceti, Barbara; Abarca, Dolores; Díaz-Sala, Carmen; Cervera, María-Teresa
2012-12-01
Several new initiatives have been launched recently to sequence conifer genomes including pines, spruces and Douglas-fir. Owing to the very large genome sizes ranging from 18 to 35 gigabases, sequencing even a single conifer genome had been considered unattainable until the recent throughput increases and cost reductions afforded by next generation sequencers. The purpose of this review is to describe the context for these new initiatives. A knowledge foundation has been acquired in several conifers of commercial and ecological interest through large-scale cDNA analyses, construction of genetic maps and gene mapping studies aiming to link phenotype and genotype. Exploratory sequencing in pines and spruces has pointed out some of the unique properties of these giga-genomes and suggested strategies that may be needed to extract value from their sequencing. The hope is that recent and pending developments in sequencing technology will contribute to rapidly filling the knowledge vacuum surrounding their structure, contents and evolution. Researchers are also making plans to use comparative analyses that will help to turn the data into a valuable resource for enhancing and protecting the world's conifer forests.
Kock, K; Ahlers, C; Schmale, H
1994-05-01
The rat von Ebner's gland protein 1 (VEGP 1) is a secretory protein, which is abundantly expressed in the small acinar von Ebner's salivary glands of the tongue. Based on the primary structure of this protein we have previously suggested that it is a member of the lipocalin superfamily of lipophilic-ligand carrier proteins. Although the physiological role of VEGP 1 is not clear, it might be involved in sensory or protective functions in the taste epithelium. Here, we report the purification of VEGP 1 and of a closely related secretory polypeptide, VEGP 2, the isolation of a cDNA clone encoding VEGP 2, and the isolation and structural characterization of the genes for both proteins. Protein purification by gel-filtration and anion-exchange chromatography using Mono Q revealed the presence of two different immunoreactive VEGP species. N-terminal sequence determination of peptide fragments isolated after protease Asp-N digestion allowed the identification of a new VEGP, named VEGP 2, in addition to the previously characterized VEGP 1. The complete VEGP 2 sequence was deduced from a cDNA clone isolated from a von Ebner's gland cDNA library. The VEGP 2 cDNA encodes a protein of 177 amino acids and is 94% identical to VEGP 1. DNA sequence analysis of the rat VEGP 1 and 2 genes isolated from rat genomic libraries revealed that both span about 4.5 kb and contain seven exons. The VEGP 1 and 2 genes are non-allelic distinct genes in the rat genome and probably arose by gene duplication. The high degree of nucleotide sequence identity in introns A-C (94-100%) points to a recent gene conversion event that included the 5' part of the genes. The genomic organization of the rat VEGP genes closely resembles that found in other lipocalins such as beta-lactoglobulin, mouse urinary proteins (MUPs) and prostaglandin D synthase, and therefore provides clear evidence that VEGPs belong to this superfamily of proteins.
Xu, Y L; Li, L; Wu, K; Peeters, A J; Gage, D A; Zeevaart, J A
1995-07-03
The biosynthesis of gibberellins (GAs) after GA12-aldehyde involves a series of oxidative steps that lead to the formation of bioactive GAs. Previously, a cDNA clone encoding a GA 20-oxidase [gibberellin, 2-oxoglutarate:oxygen oxidoreductase (20-hydroxylating, oxidizing), EC 1.14.11.-] was isolated by immunoscreening a cDNA library from liquid endosperm of pumpkin (Cucurbita maxima L.) with antibodies against partially purified GA 20-oxidase. Here, we report isolation of a genomic clone for GA 20-oxidase from a genomic library of the long-day species Arabidopsis thaliana Heynh., strain Columbia, by using the pumpkin cDNA clone as a heterologous probe. This genomic clone contains a GA 20-oxidase gene that consists of three exons and two introns. The three exons are 1131-bp long and encode 377 amino acid residues. A cDNA clone corresponding to the putative GA 20-oxidase genomic sequence was constructed with the reverse transcription-PCR method, and the identity of the cDNA clone was confirmed by analyzing the capability of the fusion protein expressed in Escherichia coli to convert GA53 to GA44 and GA19 to GA20. The Arabidopsis GA 20-oxidase shares 55% identity and > 80% similarity with the pumpkin GA 20-oxidase at the derived amino acid level. Both GA 20-oxidases share high homology with other 2-oxoglutarate-dependent dioxygenases (2-ODDs), but the highest homology was found between the two GA 20-oxidases. Mapping results indicated tight linkage between the cloned GA 20-oxidase and the GA5 locus of Arabidopsis. The ga5 semidwarf mutant contains a G→A point mutation that inserts a translational stop codon in the protein-coding sequence, thus confirming that the GA5 locus encodes GA 20-oxidase. Expression of the GA5 gene in Arabidopsis leaves was enhanced after plants were transferred from short to long days; it was reduced by GA4 treatment, suggesting end-product repression in the GA biosynthetic pathway.
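Two pieces of reasoning in this abstract can be checked with a short sketch: the exon length matches the protein length (1131 bp = 377 codons), and a single G→A substitution can turn a sense codon into a stop codon. The code enumerates which sense codons of the standard genetic code can do so; it assumes the change is read on the coding strand, and it does not claim to identify the actual ga5 mutation, which the abstract does not specify.

```python
from itertools import product

STOP_CODONS = {"TAA", "TAG", "TGA"}  # standard genetic code, DNA alphabet

# Coding length check using the figures reported in the abstract.
exon_bp = 1131
residues = 377
codons_match = exon_bp == residues * 3


def g_to_a_nonsense_changes():
    """List (sense_codon, stop_codon) pairs reachable by one G->A change."""
    changes = set()
    for codon in ("".join(c) for c in product("ACGT", repeat=3)):
        if codon in STOP_CODONS:
            continue  # only sense codons can acquire a new stop
        for i, base in enumerate(codon):
            if base == "G":
                mutant = codon[:i] + "A" + codon[i + 1:]
                if mutant in STOP_CODONS:
                    changes.add((codon, mutant))
    return sorted(changes)


nonsense_changes = g_to_a_nonsense_changes()
```

Under these assumptions the only sense codon that yields a stop through a coding-strand G→A change is TGG (tryptophan), which becomes TAG or TGA.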
DOE Office of Scientific and Technical Information (OSTI.GOV)
Xu, Yun-Ling; Li, Li; Wu, Keqiang
1995-07-03
The biosynthesis of gibberellins (GAs) after GA12-aldehyde involves a series of oxidative steps that lead to the formation of bioactive GAs. Previously, a cDNA clone encoding a GA 20-oxidase [gibberellin, 2-oxoglutarate:oxygen oxidoreductase (20-hydroxylating, oxidizing), EC 1.14.11.-] was isolated by immunoscreening a cDNA library from liquid endosperm of pumpkin (Cucurbita maxima L.) with antibodies against partially purified GA 20-oxidase. Here, we report isolation of a genomic clone for GA 20-oxidase from a genomic library of the long-day species Arabidopsis thaliana Heynh., strain Columbia, by using the pumpkin cDNA clone as a heterologous probe. This genomic clone contains a GA 20-oxidase gene that consists of three exons and two introns. The three exons are 1131-bp long and encode 377 amino acid residues. A cDNA clone corresponding to the putative GA 20-oxidase genomic sequence was constructed with the reverse transcription-PCR method, and the identity of the cDNA clone was confirmed by analyzing the capability of the fusion protein expressed in Escherichia coli to convert GA53 to GA44 and GA19 to GA20. The Arabidopsis GA 20-oxidase shares 55% identity and >80% similarity with the pumpkin GA 20-oxidase at the derived amino acid level. Both GA 20-oxidases share high homology with other 2-oxoglutarate-dependent dioxygenases (2-ODDs), but the highest homology was found between the two GA 20-oxidases. Mapping results indicated tight linkage between the cloned GA 20-oxidase and the GA5 locus of Arabidopsis. The ga5 semidwarf mutant contains a G→A point mutation that inserts a translational stop codon in the protein-coding sequence, thus confirming that the GA5 locus encodes GA 20-oxidase. Expression of the GA5 gene in Arabidopsis leaves was enhanced after plants were transferred from short to long days; it was reduced by GA4 treatment, suggesting end-product repression in the GA biosynthetic pathway.
28 refs., 6 figs.
Nolden, T; Pfaff, F; Nemitz, S; Freuling, C M; Höper, D; Müller, T; Finke, Stefan
2016-04-05
Reverse genetics approaches are indispensable tools for proof of concepts in virus replication and pathogenesis. For negative strand RNA viruses (NSVs) the limited number of infectious cDNA clones represents a bottleneck as clones are often generated from cell culture adapted or attenuated viruses, with limited potential for pathogenesis research. We developed a system in which cDNA copies of complete NSV genomes were directly cloned into reverse genetics vectors by linear-to-linear RedE/T recombination. Rapid cloning of multiple rabies virus (RABV) full length genomes and identification of clones identical to field virus consensus sequence confirmed the approach's reliability. Recombinant viruses were recovered from field virus cDNA clones. Similar growth kinetics of parental and recombinant viruses, preservation of field virus characters in cell type specific replication and virulence in the mouse model were confirmed. Reduced titers after reporter gene insertion indicated that the low level of field virus replication is affected by gene insertions. The flexibility of the strategy was demonstrated by cloning multiple copies of an orthobunyavirus L genome segment. This important step in reverse genetics technology development opens novel avenues for the analysis of virus variability combined with phenotypical characterization of recombinant viruses at a clonal level.
Sequence, molecular properties, and chromosomal mapping of mouse lumican
NASA Technical Reports Server (NTRS)
Funderburgh, J. L.; Funderburgh, M. L.; Hevelone, N. D.; Stech, M. E.; Justice, M. J.; Liu, C. Y.; Kao, W. W.; Conrad, G. W.; Spooner, B. S. (Principal Investigator)
1995-01-01
PURPOSE. Lumican is a major proteoglycan of vertebrate cornea. This study characterizes mouse lumican, its molecular form, cDNA sequence, and chromosomal localization. METHODS. Lumican sequence was determined from cDNA clones selected from a mouse corneal cDNA expression library using a bovine lumican cDNA probe. Tissue expression and size of lumican mRNA were determined using Northern hybridization. Glycosidase digestion followed by Western blot analysis provided characterization of molecular properties of purified mouse corneal lumican. Chromosomal mapping of the lumican gene (Lcn) used Southern hybridization of a panel of genomic DNAs from an interspecific murine backcross. RESULTS. Mouse lumican is a 338-amino acid protein with high-sequence identity to bovine and chicken lumican proteins. The N-terminus of the lumican protein contains consensus sequences for tyrosine sulfation. A 1.9-kb lumican mRNA is present in cornea and several other tissues. Antibody against bovine lumican reacted with recombinant mouse lumican expressed in Escherichia coli and also detected high molecular weight proteoglycans in extracts of mouse cornea. Keratanase digestion of corneal proteoglycans released lumican protein, demonstrating the presence of sulfated keratan sulfate chains on mouse corneal lumican in vivo. The lumican gene (Lcn) was mapped to the distal region of mouse chromosome 10. The Lcn map site is in the region of a previously identified developmental mutant, eye blebs, affecting corneal morphology. CONCLUSIONS. This study demonstrates sulfated keratan sulfate proteoglycan in mouse cornea and describes the tools (antibodies and cDNA) necessary to investigate the functional role of this important corneal molecule using naturally occurring and induced mutants of the murine lumican gene.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Khani, S.C.; Lin, D.; Magovcevic, I.
1994-09-01
Rhodopsin kinase (RK) is a cytosolic enzyme in rod photoreceptors that initiates the deactivation of the phototransduction cascade by phosphorylating photoactivated rhodopsin. Although the cDNA sequence of bovine RK has been determined previously, no human cDNA or genomic sequence has thus far been available for genetic studies. In order to investigate the possible role of this candidate gene in retinitis pigmentosa (RP) and allied diseases, we have isolated and characterized human cDNA and genomic clones derived from the RK locus. The coding sequence of the human gene is 1692 nucleotides in length and is split into seven exons. The human and the bovine sequence show 84% identity at the nucleotide level and 92% identity at the amino acid level. Thus far, the intronic sequences flanking each exon except for one have been determined. We have also mapped the human RK gene to chromosome 13q34 using fluorescence in situ hybridization. To our knowledge, no RP gene has as yet been linked to this region. However, since the substrate for RK (rhodopsin) and other members of the phototransduction cascade have been implicated in the pathogenesis of RP, it is conceivable that defects in RK can also cause some forms of this disease. We are evaluating this possibility by screening DNA from 173 patients with autosomal recessive RP and 190 patients with autosomal dominant RP. So far, we have found 11 patients with variant bands. In one patient with autosomal dominant RP we discovered the missense change Ser536Leu. Cosegregation studies and further sequencing of the variant bands are currently underway.
Feng, X; Happ, G M
1996-11-14
The cDNA for Sp23, a structural protein of the spermatophore of Tenebrio molitor, had been previously cloned and characterized (Paesen, G.C., Schwartz, M.B., Peferoen, M., Weyda, F. and Happ, G.M. (1992a) Amino acid sequence of Sp23, a structure protein of the spermatophore of the mealworm beetle, Tenebrio molitor. J. Biol. Chem. 257, 18852-18857). Using the labeled cDNA for Sp23 as a probe to screen a library of genomic DNA from Tenebrio molitor, we isolated a genomic clone for Sp23. A 5373-base pair (bp) restriction fragment containing the Sp23 gene was sequenced. The coding region is separated by a 55-bp intron which is located close to the translation start site. Three putative ecdysone response elements (EcRE) are identified in the 5' flanking region of the Sp23 gene. Comparison of the flanking regions of the Sp23 gene with those of the D-protein gene expressed in the accessory glands of Tenebrio reveals similar sequences present in the flanking regions of the two genes. The genomic organization of the coding region of the Sp23 gene shares similarities with that of the D-protein gene, three Drosophila accessory gland genes and two Drosophila 20-OH ecdysone-responsive genes.
Więsyk, Aneta; Candresse, Thierry; Zagórski, Włodzimierz; Góra-Sochacka, Anna
2011-02-01
In an effort to study sequence space allowing the recovery of viable potato spindle tuber viroid (PSTVd) variants we have developed an in vivo selection (Selex) method to produce and bulk-inoculate by agroinfiltration large PSTVd cDNA banks in which a short stretch of the genome is mutagenized to saturation. This technique was applied to two highly conserved 6 nt-long regions of the PSTVd genome, the left terminal loop (TL bank) and part of the polypurine stretch in the upper strand of pre-melting loop 1 (PM1 bank). In each case, PSTVd accumulation was observed in a large fraction of bank-inoculated tomato plants. Characterization of the progeny molecules showed the recovery of the parental PSTVd sequence in 89 % (TL bank) and 18 % (PM1 bank) of the analysed plants. In addition, viable and genetically stable PSTVd variants with mutations outside of the known natural variability of PSTVd were recovered in both cases, although at different rates. In the case of the TL region, mutations were recovered at five of the six mutagenized positions (357, 358, 359, 1 and 3 of the genome) while for the PM1 region mutations were recovered at all six targeted positions (50-55), providing significant new insight on the plasticity of the PSTVd genome.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tomkinson, B.; Jonsson, A-K
1991-01-01
Tripeptidyl peptidase II is a high molecular weight serine exopeptidase, which has been purified from rat liver and human erythrocytes. Four clones, representing 4453 bp, or 90% of the mRNA of the human enzyme, have been isolated from two different cDNA libraries. One clone, designated A2, was obtained after screening a human B-lymphocyte cDNA library with a degenerated oligonucleotide mixture. The B-lymphocyte cDNA library, obtained from human fibroblasts, were rescreened with a 147 bp fragment from the 5' part of the A2 clone, whereby three different overlapping cDNA clones could be isolated. The deduced amino acid sequence, 1196 amino acid residues, corresponding to the longest open reading frame of the assembled nucleotide sequence, was compared to sequences of current databases. This revealed a 56% similarity between the bacterial enzyme subtilisin and the N-terminal part of tripeptidyl peptidase II. The enzyme was found to be represented by two different mRNAs of 4.2 and 5.0 kilobases, respectively, which probably result from the utilization of two different polyadenylation sites. Furthermore, cDNA corresponding to both the N-terminal and C-terminal part of tripeptidyl peptidase II hybridized with genomic DNA from mouse, horse, calf, and hen, even under fairly high stringency conditions, indicating that tripeptidyl peptidase II is highly conserved.
Pang, Chaoyou; Fan, Shuli; Song, Meizhen; Yu, Shuxun
2013-01-01
Background Cotton (Gossypium hirsutum L.) is one of the world’s most economically important crops. However, its entire genome has not been sequenced, and limited resources are available in GenBank for understanding the molecular mechanisms underlying leaf development and senescence. Methodology/Principal Findings In this study, 9,874 high-quality ESTs were generated from a normalized, full-length cDNA library derived from pooled RNA isolated from throughout leaf development during the plant blooming stage. After clustering and assembly of these ESTs, 5,191 unique sequences, representing 1,652 contigs and 3,539 singletons, were obtained. The average unique sequence length was 682 bp. Annotation of these unique sequences revealed that 84.4% showed significant homology to sequences in the NCBI non-redundant protein database, and 57.3% had significant hits to known proteins in the Swiss-Prot database. Comparative analysis indicated that our library added 2,400 ESTs and 991 unique sequences to those known for cotton. The unigenes were functionally characterized by gene ontology annotation. We identified 1,339 and 200 unigenes as potential leaf senescence-related genes and transcription factors, respectively. Moreover, nine genes related to leaf senescence and eleven MYB transcription factors were randomly selected for quantitative real-time PCR (qRT-PCR), which revealed that these genes were regulated differentially during senescence. The qRT-PCR for three GhYLSs revealed that these genes are expressed preferentially in senescent leaves. Conclusions/Significance These EST resources will provide valuable sequence information for gene expression profiling analyses and functional genomics studies to elucidate their roles, as well as for studying the mechanisms of leaf development and senescence in cotton and discovering candidate genes related to important agronomic traits of cotton. These data will also facilitate future whole-genome sequence assembly and annotation in G. hirsutum and comparative genomics among Gossypium species. PMID:24146870
Unit-length line-1 transcripts in human teratocarcinoma cells.
Skowronski, J; Fanning, T G; Singer, M F
1988-01-01
We have characterized the approximately 6.5-kilobase cytoplasmic poly(A)+ Line-1 (L1) RNA present in a human teratocarcinoma cell line, NTera2D1, by primer extension and by analysis of cloned cDNAs. The bulk of the RNA begins (5' end) at the residue previously identified as the 5' terminus of the longest known primate genomic L1 elements, presumed to represent "unit" length. Several of the cDNA clones are close to 6 kilobase pairs, that is, close to full length. The partial sequences of 18 cDNA clones and full sequence of one (5,975 base pairs) indicate that many different genomic L1 elements contribute transcripts to the 6.5-kilobase cytoplasmic poly(A)+ RNA in NTera2D1 cells because no 2 of the 19 cDNAs analyzed had identical sequences. The transcribed elements appear to represent a subset of the total genomic L1s, a subset that has a characteristic consensus sequence in the 3' noncoding region and a high degree of sequence conservation throughout. Two open reading frames (ORFs) of 1,122 (ORF1) and 3,852 (ORF2) bases, flanked by about 800 and 200 bases of sequence at the 5' and 3' ends, respectively, can be identified in the cDNAs. Both ORFs are in the same frame, and they are separated by 33 bases bracketed by two conserved in-frame stop codons. ORF2 is interrupted by at least one randomly positioned stop codon in the majority of the cDNAs. The data support proposals suggesting that the human L1 family includes one or more functional genes as well as an extraordinarily large number of pseudogenes whose ORFs are broken by stop codons. The cDNA structures suggest that both genes and pseudogenes are transcribed. At least one of the cDNAs (cD11), which was sequenced in its entirety, could, in principle, represent an mRNA for production of the ORF1 polypeptide. The similarity of mammalian L1s to several recently described invertebrate movable elements defines a new widely distributed class of elements which we term class II retrotransposons. PMID:2454389
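The frame relationship and lengths quoted in the L1 abstract above can be checked with simple arithmetic. The following is a minimal sketch, assuming the stated ORF lengths are multiples of three as given and that the flank sizes are the approximate figures from the abstract; the function names `shares_frame` and `codon_count` are hypothetical helpers, not part of any published tool.

```python
# Values quoted in the abstract; the flank sizes are approximate ("about").
ORF1_BP = 1122
ORF2_BP = 3852
SPACER_BP = 33
FLANK_5_BP = 800
FLANK_3_BP = 200


def shares_frame(upstream_orf_bp: int, spacer_bp: int) -> bool:
    """Return True if an ORF that starts after the spacer lies in the
    same reading frame as the upstream ORF (offset divisible by 3)."""
    return (upstream_orf_bp + spacer_bp) % 3 == 0


def codon_count(orf_bp: int) -> int:
    """Number of codons in an ORF whose length is a multiple of three."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return orf_bp // 3


# Sum of the quoted parts, for comparison with the ~6 kb cDNA clones.
approx_transcript_bp = FLANK_5_BP + ORF1_BP + SPACER_BP + ORF2_BP + FLANK_3_BP

if __name__ == "__main__":
    print(shares_frame(ORF1_BP, SPACER_BP))
    print(codon_count(ORF1_BP), codon_count(ORF2_BP))
    print(approx_transcript_bp)
```

With the quoted figures, the parts sum to 6,007 bases, which is in line with the 5,975 bp fully sequenced clone given that the flank sizes are only approximate.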
Capsicum annuum dehydrin, an osmotic-stress gene in hot pepper plants.
Chung, Eunsook; Kim, Soo-Yong; Yi, So Young; Choi, Doil
2003-06-30
Osmotic stress-related genes were selected from an EST database constructed from 7 cDNA libraries from different tissues of the hot pepper. A full-length cDNA of Capsicum annuum dehydrin (Cadhn), a late embryogenesis abundant (lea) gene, was selected from the 5' single pass sequenced cDNA clones and sequenced. The deduced polypeptide has 87% identity with potato dehydrin C17, but very little identity with the dehydrin genes of other organisms. It contains a serine-tract (S-segment) and 3 conserved lysine-rich domains (K-segments). Southern blot analysis showed that 2 copies are present in the hot pepper genome. Cadhn was induced by osmotic stress in leaf tissues as well as by the application of abscisic acid. The RNA was most abundant in green fruit. The expression of several osmotic stress-related genes was examined and Cadhn proved to be the most abundantly expressed of these in response to osmotic stress.
RPG: the Ribosomal Protein Gene database
Nakao, Akihiro; Yoshihama, Maki; Kenmochi, Naoya
2004-01-01
RPG (http://ribosome.miyazaki-med.ac.jp/) is a new database that provides detailed information about ribosomal protein (RP) genes. It contains data from humans and other organisms, including Drosophila melanogaster, Caenorhabditis elegans, Saccharomyces cerevisiae, Methanococcus jannaschii and Escherichia coli. Users can search the database by gene name and organism. Each record includes sequences (genomic, cDNA and amino acid sequences), intron/exon structures, genomic locations and information about orthologs. In addition, users can view and compare the gene structures of the above organisms and make multiple amino acid sequence alignments. RPG also provides information on small nucleolar RNAs (snoRNAs) that are encoded in the introns of RP genes. PMID:14681386
Sumi, S; Tsuneyoshi, T; Furutani, H
1993-09-01
Rod-shaped flexuous viruses were partially purified from garlic plants (Allium sativum) showing typical mosaic symptoms. The genome was shown to be composed of RNA with a poly(A) tail of an estimated size of 10 kb as shown by denaturing agarose gel electrophoresis. We constructed cDNA libraries and screened four independent clones, which were designated GV-A, GV-B, GV-C and GV-D, using Northern and Southern blot hybridization. Nucleotide sequence determination of the cDNAs, two of which correspond to nearly one-third of the virus genomic RNA, shows that all of these viruses possess an identical genomic structure and that also at least four proteins are encoded in the viral cDNA, their M(r)s being estimated to be 15K, 27K, 40K and 11K. The 15K open reading frame (ORF) encodes the core-like sequence of a zinc finger protein preceded by a cluster of basic amino acid residues. The 27K ORF probably encodes the viral coat protein (CP), based on both the existence of some conserved sequences observed in many other rod-shaped or flexuous virus CPs and an overall amino acid sequence similarity to potexvirus and carlavirus CPs. The 11K ORF shows significant amino acid sequence similarities to the corresponding 12K proteins of the potexviruses and carlaviruses. On the other hand, the 40K ORF product does not resemble any other plant virus gene products reported so far. The genomic organization in the 3' region of the garlic viruses resembles, but clearly differs from, that of carlaviruses. Phylogenetic analysis based upon the amino acid sequence of the viral capsid protein also indicates that the garlic viruses have a unique and distinct domain different from those of the potexvirus and carlavirus groups. The results suggest that the garlic viruses described here belong to an unclassified and new virus group closely related to the carlaviruses.
Long-read sequencing of chicken transcripts and identification of new transcript isoforms.
Thomas, Sean; Underwood, Jason G; Tseng, Elizabeth; Holloway, Alisha K
2014-01-01
The chicken has long served as an important model organism in many fields, and continues to aid our understanding of animal development. Functional genomics studies aimed at probing the mechanisms that regulate development require high-quality genomes and transcript annotations. The quality of these resources has improved dramatically over the last several years, but many isoforms and genes have yet to be identified. We hope to contribute to the process of improving these resources with the data presented here: a set of long cDNA sequencing reads, and a curated set of new genes and transcript isoforms not represented in the most up-to-date genome annotation currently available to the community of researchers who rely on the chicken genome.
Holland, M J; Holland, J P; Thill, G P; Jackson, K A
1981-02-10
Segments of yeast genomic DNA containing two enolase structural genes have been isolated by subculture cloning procedures using a cDNA hybridization probe synthesized from purified yeast enolase mRNA. Based on restriction endonuclease and transcriptional maps of these two segments of yeast DNA, each hybrid plasmid contains a region of extensive nucleotide sequence homology which forms hybrids with the cDNA probe. The DNA sequences which flank this homologous region in the two hybrid plasmids are nonhomologous, indicating that these sequences are nontandemly repeated in the yeast genome. The complete nucleotide sequence of the coding as well as the flanking noncoding regions of these genes has been determined. The amino acid sequence predicted from one reading frame of both structural genes is extremely similar to that determined for yeast enolase (Chin, C. C. Q., Brewer, J. M., Eckard, E., and Wold, F. (1981) J. Biol. Chem. 256, 1370-1376), confirming that these isolated structural genes encode yeast enolase. The nucleotide sequences of the coding regions of the genes are approximately 95% homologous, and neither gene contains an intervening sequence. Codon utilization in the enolase genes follows the same biased pattern previously described for two yeast glyceraldehyde-3-phosphate dehydrogenase structural genes (Holland, J. P., and Holland, M. J. (1980) J. Biol. Chem. 255, 2596-2605). DNA blotting analysis confirmed that the isolated segments of yeast DNA are colinear with yeast genomic DNA and that there are two nontandemly repeated enolase genes per haploid yeast genome. The noncoding portions of the two enolase genes adjacent to the initiation and termination codons are approximately 70% homologous and contain sequences thought to be involved in the synthesis and processing of messenger RNA. Finally, there are regions of extensive homology between the two enolase structural genes and two yeast glyceraldehyde-3-phosphate dehydrogenase structural genes within the 5' noncoding portions of these glycolytic genes.
Barnes, D W
2012-04-01
Two of the most commonly used elasmobranch experimental model species are the spiny dogfish Squalus acanthias and the little skate Leucoraja erinacea. Comparative biology and genomics with these species have provided useful information in physiology, pharmacology, toxicology, immunology, evolutionary developmental biology and genetics. A wealth of information has been obtained using in vitro approaches to study isolated cells and tissues from these organisms under circumstances in which the extracellular environment can be controlled. In addition to classical work with primary cell cultures, continuously proliferating cell lines have been derived recently, representing the first cell lines from cartilaginous fishes. These lines have proved to be valuable tools with which to explore functional genomic and biological questions and to test hypotheses at the molecular level. In genomic experiments, complementary (c)DNA libraries have been constructed, and c. 8000 unique transcripts identified, with over 3000 representing previously unknown gene sequences. A sub-set of messenger (m)RNAs has been detected for which the 3' untranslated regions show elements that are remarkably well conserved evolutionarily, representing novel, potentially regulatory gene sequences. The cell culture systems provide physiologically valid tools to study functional roles of these sequences and other aspects of elasmobranch molecular cell biology and physiology. Information derived from the use of in vitro cell cultures is valuable in revealing gene diversity and information for genomic sequence assembly, as well as for identification of new genes and molecular markers, construction of gene-array probes and acquisition of full-length cDNA sequences. © 2012 The Author. Journal of Fish Biology © 2012 The Fisheries Society of the British Isles.
Hou, Wan-ru; Tang, Yun; Hou, Yi-ling; Song, Yan; Zhang, Tian; Wu, Guang-fu
2010-07-01
Eukaryotic initiation factor (eIF) EIF1 is a universally conserved translation factor that is involved in translation initiation site selection. The cDNA and the genomic sequences of EIF1 were cloned successfully from the giant panda (Ailuropoda melanoleuca) and the black bear (Ursus thibetanus mupinensis) using reverse transcription polymerase chain reaction (RT-PCR) technology and touchdown-polymerase chain reaction, respectively. The cDNAs of the EIF1 cloned from the giant panda and the black bear are 418 bp in size, containing an open reading frame (ORF) of 342 bp encoding 113 amino acids. The length of the genomic sequence of the giant panda is 1909 bp, which contains four exons and three introns. The length of the genomic sequence of the black bear is 1897 bp, which also contains four exons and three introns. Sequence alignment indicates a high degree of homology to those of Homo sapiens, Mus musculus, Rattus norvegicus, and Bos taurus at both amino acid and DNA levels. Topology prediction shows that there are one N-glycosylation site, two casein kinase II phosphorylation sites, and an amidation site in the EIF1 protein of the giant panda and black bear. In addition, there is a protein kinase C phosphorylation site in EIF1 of the giant panda. The giant panda and the black bear EIF1 genes were overexpressed in E. coli BL21. The results indicated that both EIF1 fusion proteins, in the N-terminally His-tagged form, accumulated as the expected 19 kDa polypeptides. The expression products obtained could be used to purify the proteins and study their function further.
Poly A- Transcripts Expressed in HeLa Cells
Lu, Jian; Xuan, Zhenyu; Chen, Jun; Zheng, Yonglan; Zhou, Tom; Zhang, Michael Q.; Wu, Chung-I; Wang, San Ming
2008-01-01
Background Transcripts expressed in eukaryotes are classified as poly A+ transcripts or poly A- transcripts based on the presence or absence of the 3′ poly A tail. Most transcripts identified so far are poly A+ transcripts, whereas the poly A- transcripts remain largely unknown. Methodology/Principal Findings We developed the TRD (Total RNA Detection) system for transcript identification. The system detects the transcripts through the following steps: 1) depleting the abundant ribosomal and small-size transcripts; 2) synthesizing cDNA without regard to the status of the 3′ poly A tail; 3) applying the 454 sequencing technology for massive 3′ EST collection from the cDNA; and 4) determining the genome origins of the detected transcripts by mapping the sequences to the human genome reference sequences. Using this system, we characterized the cytoplasmic transcripts from HeLa cells. Of the 13,467 distinct 3′ ESTs analyzed, 24% are poly A-, 36% are poly A+, and 40% are bimorphic with poly A+ features but without the 3′ poly A tail. Most of the poly A- 3′ ESTs do not match known transcript sequences; they have a similar distribution pattern in the genome as the poly A+ and bimorphic 3′ ESTs, and their mapped intergenic regions are evolutionarily conserved. Experiments confirmed the authenticity of the detected poly A- transcripts. Conclusion/Significance Our study provides the first large-scale sequence evidence for the presence of poly A- transcripts in eukaryotes. The abundance of the poly A- transcripts highlights the need for comprehensive identification of these transcripts for decoding the transcriptome, annotating the genome and studying biological relevance of the poly A- transcripts. PMID:18665230
DOE Office of Scientific and Technical Information (OSTI.GOV)
Prody, C.A.; Zevin-Sonkin, D.; Gnatt, A.
1987-06-01
To study the primary structure and regulation of human cholinesterases, oligodeoxynucleotide probes were prepared according to a consensus peptide sequence present in the active site of both human serum pseudocholinesterase and Torpedo electric organ true acetylcholinesterase. Using these probes, the authors isolated several cDNA clones from λgt10 libraries of fetal brain and liver origins. These include 2.4-kilobase cDNA clones that code for a polypeptide containing a putative signal peptide and the N-terminal, active site, and C-terminal peptides of human BtChoEase, suggesting that they code either for BtChoEase itself or for a very similar but distinct fetal form of cholinesterase. In RNA blots of poly(A)+ RNA from the cholinesterase-producing fetal brain and liver, these cDNAs hybridized with a single 2.5-kilobase band. Blot hybridization to human genomic DNA revealed that these fetal BtChoEase cDNA clones hybridize with DNA fragments with a total length of 17.5 kilobases, and signal intensities indicated that these sequences are not present in many copies. Both the cDNA-encoded protein and its nucleotide sequence display striking homology to parallel sequences published for Torpedo AcChoEase. These findings demonstrate extensive homologies between the fetal BtChoEase encoded by these clones and other cholinesterases of various forms and species.
USDA-ARS's Scientific Manuscript database
Plant β-1,3-glucanase is commonly found to be involved in disease resistance. A β-1,3-glucanase gene was isolated from both the genomic DNA and cDNA of peanut variety Huayu20 by PCR and RT-PCR, respectively (GenBank Accession No. JQ801335). The genomic DNA sequence was 1,471 bp including two ext...
Evaluation of Genomic Instability in the Abnormal Prostate
2006-12-01
array CGH maps copy number aberrations relative to the genome sequence by using arrays of BAC or cDNA clones as the hybridization target instead of...data produced from these analyses complicate the interpretation of results. For these reasons, and as outlined by Davies et al., 22 it is desirable...There have been numerous studies of these abnormalities and several techniques, including chromosome painting, array CGH and SNP arrays, have
DOE Office of Scientific and Technical Information (OSTI.GOV)
Garcia, C.K.; Li, X.; Luna, J.
1994-09-15
Lactate and pyruvate are transported across cell membranes by monocarboxylate transporters (MCTs). Here, the authors use the recently cloned cDNA for hamster MCT1 to isolate cDNA and genomic clones for human MCT1. Comparison of the human and hamster amino acid sequences revealed that the proteins are 86% identical. The gene for human MCT1 (gene symbol, SLC16A1) was localized to human chromosome bands 1p13.2-p12 by PCR analysis of panels of human X rodent cell hybrid lines and by fluorescence chromosomal in situ hybridization. 9 refs., 2 figs.
Chen, Tianbao; Gagliardo, Ron; Walker, Brian; Zhou, Mei; Shaw, Chris
2005-12-01
Phylloxin is a novel prototype antimicrobial peptide from the skin of Phyllomedusa bicolor. Here, we describe parallel identification and sequencing of phylloxin precursor transcript (mRNA) and partial gene structure (genomic DNA) from the same sample of lyophilized skin secretion using our recently-described cloning technique. The open-reading frame of the phylloxin precursor was identical in nucleotide sequence to that previously reported and alignment with the nucleotide sequence derived from genomic DNA indicated the presence of a 175 bp intron located in a near identical position to that found in the dermaseptins. The highly-conserved structural organization of skin secretion peptide genes in P. bicolor can thus be extended to include that encoding phylloxin (plx). These data further reinforce our assertion that application of the described methodology can provide robust genomic/transcriptomic/peptidomic data without the need for specimen sacrifice.
2012-01-01
Background The complete sequences of chloroplast genomes provide a wealth of information regarding the evolutionary history of species. With the advance of next-generation sequencing technology, the number of completely sequenced chloroplast genomes is expected to increase exponentially, and powerful computational tools for annotating the genome sequences are urgently needed. Results We have developed a web server, CPGAVAS. The server accepts a complete chloroplast genome sequence as input. First, it predicts protein-coding and rRNA genes based on the identification and mapping of the most similar, full-length protein, cDNA and rRNA sequences by integrating results from Blastx, Blastn, protein2genome and est2genome programs. Second, tRNA genes and inverted repeats (IR) are identified using tRNAscan, ARAGORN and vmatch, respectively. Third, it calculates the summary statistics for the annotated genome. Fourth, it generates a circular map ready for publication. Fifth, it can create a Sequin file for GenBank submission. Last, it allows the extraction of protein and mRNA sequences for a given list of genes and species. The annotation results in GFF3 format can be edited using any compatible annotation editing tool. The edited annotations can then be uploaded to CPGAVAS for repeated update and re-analysis. Using known chloroplast genome sequences as a test set, we show that CPGAVAS performs comparably to another application, DOGMA, while having several superior functionalities. Conclusions CPGAVAS allows the semi-automatic and complete annotation of a chloroplast genome sequence, and the visualization, editing and analysis of the annotation results. It will become an indispensable tool for researchers studying chloroplast genomes. The software is freely accessible from http://www.herbalgenomics.org/cpgavas. PMID:23256920
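The abstract above mentions summary statistics computed over annotations in GFF3 format. The sketch below illustrates one way such a summary could be derived from GFF3 text; it is not CPGAVAS code, and the function name `summarize_gff3` and the sample records are hypothetical. It relies only on the standard GFF3 layout of nine tab-separated columns, with feature type in column 3 and 1-based inclusive start and end in columns 4 and 5.

```python
from collections import Counter


def summarize_gff3(text: str):
    """Count features and total annotated bases per feature type.

    Comment lines (starting with '#'), blank lines, and lines that do
    not have nine tab-separated columns are skipped.
    """
    counts = Counter()
    total_bp = Counter()
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.split("\t")
        if len(cols) != 9:
            continue
        ftype, start, end = cols[2], int(cols[3]), int(cols[4])
        counts[ftype] += 1
        total_bp[ftype] += end - start + 1  # GFF3 coordinates are inclusive
    return counts, total_bp


# Hypothetical records, for illustration only.
_rows = [
    ("cp1", "demo", "gene", "1", "90", ".", "+", ".", "ID=g1"),
    ("cp1", "demo", "gene", "200", "260", ".", "-", ".", "ID=g2"),
    ("cp1", "demo", "tRNA", "300", "372", ".", "+", ".", "ID=t1"),
]
SAMPLE_GFF3 = "##gff-version 3\n" + "\n".join("\t".join(r) for r in _rows)

if __name__ == "__main__":
    feature_counts, feature_bp = summarize_gff3(SAMPLE_GFF3)
    for ftype in sorted(feature_counts):
        print(ftype, feature_counts[ftype], feature_bp[ftype])
```

Keeping the summary as plain counters makes it easy to recompute after the annotations have been edited, which fits the edit-and-reupload cycle the abstract describes.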
Application of industrial scale genomics to discovery of therapeutic targets in heart failure.
Mehraban, F; Tomlinson, J E
2001-12-01
In recent years intense activity in both academic and industrial sectors has provided a wealth of information on the human genome with an associated impressive increase in the number of novel gene sequences deposited in sequence data repositories and patent applications. This genomic industrial revolution has transformed the way in which drug target discovery is now approached. In this article we discuss how various differential gene expression (DGE) technologies are being utilized for cardiovascular disease (CVD) drug target discovery. Other approaches such as sequencing cDNA from cardiovascular derived tissues and cells coupled with bioinformatic sequence analysis are used with the aim of identifying novel gene sequences that may be exploited towards target discovery. Additional leverage from gene sequence information is obtained through identification of polymorphisms that may confer disease susceptibility and/or affect drug responsiveness. Pharmacogenomic studies are described wherein gene expression-based techniques are used to evaluate drug response and/or efficacy. Industrial-scale genomics supports and addresses not only novel target gene discovery but also the burgeoning issues in pharmaceutical and clinical cardiovascular medicine relative to polymorphic gene responses.
Asamizu, E; Nakamura, Y; Sato, S; Tabata, S
2000-06-30
For comprehensive analysis of genes expressed in the model dicotyledonous plant, Arabidopsis thaliana, expressed sequence tags (ESTs) were accumulated. Normalized and size-selected cDNA libraries were constructed from aboveground organs, flower buds, roots, green siliques and liquid-cultured seedlings, respectively, and a total of 14,026 5'-end ESTs and 39,207 3'-end ESTs were obtained. The 3'-end ESTs could be clustered into 12,028 non-redundant groups. Similarity search of the non-redundant ESTs against the public non-redundant protein database indicated that 4816 groups show similarity to genes of known function, 1864 to hypothetical genes, and the remaining 5348 are novel sequences. Gene coverage by the non-redundant ESTs was analyzed using the annotated genomic sequences of approximately 10 Mb on chromosomes 3 and 5. A total of 923 regions were hit by at least one EST, among which only 499 regions were hit by the ESTs deposited in the public database. The result indicates that the EST source generated in this project complements the EST data in the public database and facilitates new gene discovery.
Hao, Yan-Zhe; Hou, Wan-Ru; Hou, Yi-Ling; Du, Yu-Jie; Zhang, Tian; Peng, Zheng-Song
2009-11-01
RPS25 is a component of the 40S small ribosomal subunit encoded by the RPS25 gene, which is specific to eukaryotes. Few studies have examined the RPS25 gene in animals. The Giant Panda (Ailuropoda melanoleuca), known as a "living fossil", is of increasing concern to the world community. Studies on RPS25 of the Giant Panda could provide scientific data for investigating the hereditary traits of the gene and formulating a protective strategy for the Giant Panda. The cDNA of RPS25 cloned from the Giant Panda is 436 bp in size, containing an open reading frame of 378 bp encoding 125 amino acids. The genomic sequence is 1,992 bp long and contains four exons and three introns. Blast alignment indicated that the nucleotide sequence of the coding sequence shows high homology to those of Homo sapiens, Bos taurus, Mus musculus and Rattus norvegicus: 92.6, 94.4, 89.2 and 91.5%, respectively. Primary structure analysis revealed that the molecular weight of the putative RPS25 protein is 13.7421 kDa with a theoretical pI of 10.12. Topology prediction showed one N-glycosylation site, one cAMP- and cGMP-dependent protein kinase phosphorylation site, two protein kinase C phosphorylation sites and one tyrosine kinase phosphorylation site in the RPS25 protein of the Giant Panda. The RPS25 gene was overexpressed in E. coli BL21 and Western blotting of the RPS25 protein was also done. The results indicated that the RPS25 gene can be expressed in E. coli and that the RPS25 protein fused with an N-terminal His-tag accumulated as a polypeptide of the expected 17.4 kDa. The cDNA and the genomic sequence of RPS25 were cloned successfully for the first time from the Giant Panda using RT-PCR and touchdown PCR, respectively, and both were sequenced and analyzed preliminarily; the cDNA of the RPS25 gene was then overexpressed in E. coli BL21 and immunoblotted. This is the first report on the RPS25 gene from the Giant Panda. The data will enrich and supplement the information about RPS25 and will contribute to the protection of gene resources and the discussion of genetic polymorphism.
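The relationship between the 378 bp open reading frame and the 125 encoded amino acids in the abstract above can be checked with a few lines of arithmetic. This is a minimal sketch, assuming that the reported ORF length includes the stop codon; the function name `residues_from_orf` is hypothetical and not taken from the paper.

```python
def residues_from_orf(orf_bp: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by an ORF of orf_bp nucleotides.

    Assumption: when includes_stop is True, the final codon is a stop codon
    and does not contribute a residue.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# 378 bp -> 126 codons -> 125 residues once the stop codon is excluded
print(residues_from_orf(378))
```

Under that assumption, 378 bp gives 126 codons and 125 residues, which agrees with the figures reported in the abstract.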
Guinoiseau, Thibault; Moreau, Alain; Hohnadel, Guillaume; Ngo-Giang-Huong, Nicole; Brulard, Celine; Vourc’h, Patrick; Goudeau, Alain; Gaudy-Graffin, Catherine
2017-01-01
Hepatitis C virus (HCV) evolves rapidly in a single host and circulates as a quasispecies, which is a complex mixture of genetically distinct but closely related viruses, namely variants. To identify intra-individual diversity and investigate the functional properties of variants in vitro, it is necessary to define the quasispecies composition and isolate the HCV variants. This is possible using single genome amplification (SGA). This technique, based on serially diluted cDNA to amplify a single cDNA molecule (clonal amplicon), has already been used to determine individual HCV diversity. In these studies, positive PCR reactions from SGA were directly sequenced using Sanger technology. Non-clonal amplicons must be detected so that they can be excluded, which facilitates further functional analysis. Here, we compared Next Generation Sequencing (NGS) with de novo assembly and Sanger sequencing for their ability to distinguish clonal and non-clonal amplicons after SGA on one plasma specimen. All amplicons (n = 42) classified as clonal by NGS were also classified as clonal by Sanger sequencing. No double peaks were seen on electropherograms for non-clonal amplicons with position-specific nucleotide variation below 15% by NGS. Altogether, NGS circumvented many of the difficulties encountered when using Sanger sequencing after SGA and is an appropriate tool to reliably select clonal amplicons for further functional studies. PMID:28362878
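The idea of calling an amplicon clonal or non-clonal from position-specific nucleotide variation can be illustrated with a short sketch. The abstract does not state the decision rule or cutoff used for NGS data, so the rule below (flag an amplicon when any position shows a minor-base fraction at or above a caller-supplied threshold), the function names, and the example counts are all assumptions made for illustration.

```python
def minor_fraction(counts):
    """Fraction of reads at one position that do not carry the majority base."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("position has no read coverage")
    return 1.0 - max(counts.values()) / total


def classify_amplicon(position_counts, threshold):
    """Return (label, worst_minor_fraction) for one amplicon.

    position_counts: list of {base: read_count} dicts, one per position.
    threshold: hypothetical cutoff chosen by the caller, not a value from the paper.
    """
    worst = max(minor_fraction(c) for c in position_counts)
    label = "non-clonal" if worst >= threshold else "clonal"
    return label, worst


# Toy per-position read counts (invented for the example)
pure = [{"A": 200}, {"C": 199, "T": 1}]
mixed = [{"A": 200}, {"C": 180, "T": 20}]

print(classify_amplicon(pure, threshold=0.05))
print(classify_amplicon(mixed, threshold=0.05))
```

In the toy `mixed` amplicon the minor base makes up about 10% of reads at one position, which falls below the 15% level under which the abstract reports that no double peaks were seen on Sanger electropherograms.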
2010-01-01
Background Expressed Sequence Tag (EST) analysis has been a cost-effective tool in molecular biology and represents an abundant, valuable resource for genome annotation, gene expression, and comparative genomics in plants. Results In this study, we constructed a cDNA library of Prunus mume flower and fruit, sequenced 10,123 clones of the library, and obtained 8,656 high-quality expressed sequence tag (EST) sequences. The ESTs were assembled into 4,473 unigenes composed of 1,492 contigs and 2,981 singletons, which have been deposited in NCBI (accession IDs: GW868575 - GW873047); of these, 1,294 unique ESTs had known or putative functions. Furthermore, we found 1,233 putative simple sequence repeats (SSRs) in the P. mume unigene dataset. We randomly tested 42 pairs of PCR primers flanking potential SSRs, and 14 pairs were identified as true-to-type SSR loci and could amplify polymorphic bands from 20 individual plants of P. mume. We further used the 14 EST-SSR primer pairs to test the transferability on peach and plum. The result showed that nearly 89% of the primer pairs produced target PCR bands in the two species. A high level of marker polymorphism was observed in the plum species (65%) and a low level in the peach (46%), and the clustering analysis of the three species indicated that these SSR markers were useful in the evaluation of genetic relationships and diversity between and within the Prunus species. Conclusions We have constructed the first cDNA library of P. mume flower and fruit, and our data provide sets of molecular biology resources for P. mume and other Prunus species. These resources will be useful for further study such as genome annotation, new gene discovery, gene functional analysis, molecular breeding, evolution and comparative genomics between Prunus species. PMID:20626882
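Screening unigene sequences for putative simple sequence repeats, as described in the abstract above, is commonly done by searching for short motifs repeated in tandem. The sketch below uses a regular expression for that purpose. The abstract does not give the search criteria, so the motif length range (2 to 6 nt), the minimum of five repeats, and the function name `find_ssrs` are assumptions.

```python
import re


def find_ssrs(seq, min_unit=2, max_unit=6, min_repeats=5):
    """Return (start, unit, repeat_count) for each tandem repeat found in seq.

    Assumed criteria: a motif of min_unit..max_unit nucleotides repeated at
    least min_repeats times. Mononucleotide runs can be reported as
    dinucleotide repeats (e.g. 'AA' repeated); a real screen would filter these.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for m in pattern.finditer(seq.upper()):
        unit = m.group(1)
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits


# Toy sequence with an (AG)6 repeat starting at 0-based position 5
example = "TTGAC" + "AG" * 6 + "TTC"
print(find_ssrs(example))
```

Primers such as the 42 pairs tested in the study would then be designed in the sequence flanking each reported start position.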
Yi, S Y; Hwang, B K
1998-10-31
Differential display techniques were used to isolate cDNA clones corresponding to genes which were expressed in soybean hypocotyls by Phytophthora sojae f.sp. glycines infection. With a partial cDNA clone C20CI4 from the differential display PCR as a probe, a new basic peroxidase cDNA clone, designated GMIPER1, was isolated from a cDNA library of soybean hypocotyls infected with P. sojae f.sp. glycines. Sequence analysis revealed that the peroxidase clone encodes a mature protein of 35,813 Da with a putative signal peptide of 27 amino acids in its N-terminus. The amino acid sequence of the soybean peroxidase GMIPER1 is between 54-75% identical to other plant peroxidases including a soybean seed coat peroxidase. Southern blot analysis indicated that multiple copies of sequences related to GMIPER1 exist in the soybean genome. The mRNAs corresponding to the GMIPER1 cDNA accumulated predominantly in the soybean hypocotyls infected with the incompatible race of P. sojae f.sp. glycines, but were expressed at low levels in the compatible interaction. Soybean GMIPER1 mRNAs were not expressed in hypocotyls, leaves, stems, and roots of soybean seedlings. However, treatments with ethephon, salicylic acid or methyl jasmonate induced the accumulation of the GMIPER1 mRNAs in the different organs of soybean. These results suggest that the GMIPER1 gene encoding a putative pathogen-induced peroxidase may play an important role in induced resistance of soybean to P. sojae f.sp. glycines and in response to various external stresses.
Huh, T L; Ryu, J H; Huh, J W; Sung, H C; Oh, I U; Song, B J; Veech, R L
1993-01-01
Mitochondrial NADP(+)-specific isocitrate dehydrogenase (IDP) was co-purified with the pyruvate dehydrogenase complex from bovine kidney mitochondria. The determination of its N-terminal 16-amino-acid sequence revealed that it is highly similar to the IDP from yeast. A cDNA clone (1.8 kb long) encoding this protein was isolated from a bovine kidney lambda gt11 cDNA library using a synthetic oligodeoxynucleotide. The deduced protein sequence of this cDNA clone rendered a precursor protein of 452 amino-acid residues (50,830 Da) and a mature protein of 413 amino-acid residues (46,519 Da). It is 100% identical to the internal tryptic peptide sequences of the autologous form from pig heart and 62% similar to that from yeast. However, it shares little similarity with the mitochondrial NAD(+)-specific isoenzyme from yeast. Structural analyses of the deduced proteins of IDP isoenzymes from different species indicated that similarity exists in certain regions, which may represent the common domains for the active sites or coenzyme-binding sites. In Northern-blot analysis, one species of mRNA (about 2.2 kb for both bovine and human) was hybridized with a 32P-labelled cDNA probe. Southern-blot analysis of genomic DNAs verified simple patterns of hybridization with this cDNA. These results strongly indicate that the mitochondrial IDP may be derived from a single gene family which does not appear to be closely related to that of the NAD(+)-specific isoenzyme. PMID:8318002
NASA Technical Reports Server (NTRS)
Reddy, A. S.; Czernik, A. J.; An, G.; Poovaiah, B. W.
1992-01-01
We cloned and sequenced a plant cDNA that encodes U1 small nuclear ribonucleoprotein (snRNP) 70K protein. The plant U1 snRNP 70K protein cDNA is not full length and lacks the coding region for 68 amino acids in the amino-terminal region as compared to human U1 snRNP 70K protein. Comparison of the deduced amino acid sequence of the plant U1 snRNP 70K protein with the amino acid sequence of animal and yeast U1 snRNP 70K protein showed a high degree of homology. The plant U1 snRNP 70K protein is more closely related to the human counterpart than to the yeast 70K protein. The carboxy-terminal half is less well conserved but, like the vertebrate 70K proteins, is rich in charged amino acids. Northern analysis with the RNA isolated from different parts of the plant indicates that the snRNP 70K gene is expressed in all of the parts tested. Southern blotting of genomic DNA using the cDNA indicates that the U1 snRNP 70K protein is coded by a single gene.
Deep sampling of the Palomero maize transcriptome by a high throughput strategy of pyrosequencing.
Vega-Arreguín, Julio C; Ibarra-Laclette, Enrique; Jiménez-Moraila, Beatriz; Martínez, Octavio; Vielle-Calzada, Jean Philippe; Herrera-Estrella, Luis; Herrera-Estrella, Alfredo
2009-07-06
In-depth sequencing analysis has not been able to determine the overall complexity of transcriptional activity of a plant organ or tissue sample. In some cases, deep parallel sequencing of Expressed Sequence Tags (ESTs), although not yet optimized for the sequencing of cDNAs, has represented an efficient procedure for validating gene prediction and estimating overall gene coverage. This approach could be very valuable for complex plant genomes. In addition, little emphasis has been given to efforts aiming at an estimation of the overall transcriptional universe found in a multicellular organism at a specific developmental stage. To explore, in depth, the transcriptional diversity in an ancient maize landrace, we developed a protocol to optimize the sequencing of cDNAs and performed 4 consecutive GS20-454 pyrosequencing runs of a cDNA library obtained from 2 week-old Palomero Toluqueño maize plants. The protocol reported here allowed obtaining over 90% of informative sequences. These GS20-454 runs generated over 1.5 Million reads, representing the largest amount of sequences reported from a single plant cDNA library. A collection of 367,391 quality-filtered reads (30.09 Mb) from a single run was sufficient to identify transcripts corresponding to 34% of public maize ESTs databases; total sequences generated after 4 filtered runs increased this coverage to 50%. Comparisons of all 1.5 Million reads to the Maize Assembled Genomic Islands (MAGIs) provided evidence for the transcriptional activity of 11% of MAGIs. We estimate that 5.67% (86,069 sequences) do not align with public ESTs or annotated genes, potentially representing new maize transcripts. Following the assembly of 74.4% of the reads in 65,493 contigs, real-time PCR of selected genes confirmed a predicted correlation between the abundance of GS20-454 sequences and corresponding levels of gene expression. 
A protocol was developed that significantly increases the number, length and quality of cDNA reads using massive 454 parallel sequencing. We show that recurrent 454 pyrosequencing of a single cDNA sample is necessary to attain a thorough representation of the transcriptional universe present in maize, that can also be used to estimate transcript abundance of specific genes. This data suggests that the molecular and functional diversity contained in the vast native landraces remains to be explored, and that large-scale transcriptional sequencing of a presumed ancestor of the modern maize varieties represents a valuable approach to characterize the functional diversity of maize for future agricultural and evolutionary studies.
Shin, Dong-Ho; Webb, Barbara M.; Nakao, Miki; Smith, Sylvia L.
2009-01-01
Complement factor I is a crucial regulator of mammalian complement activity. Very little is known of complement regulators in non-mammalian species. We isolated and sequenced four highly similar complement factor I cDNAs from the liver of the nurse shark (Ginglymostoma cirratum), designated as GcIf-1, GcIf-2, GcIf-3 and GcIf-4 (previously referred to as nsFI-a, -b, -c and –d) which encode 689, 673, 673 and 657 amino acid residues, respectively. They share 95% (≤) amino acid identities with each other, 35.4 ~ 39.6% and 62.8 ~ 65.9% with factor I of mammals and banded houndshark (Triakis scyllium), respectively. The modular structure of the GcIf is similar to that of mammals with one notable exception, the presence of a novel shark-specific sequence between the leader peptide (LP) and the factor I membrane attack complex (FIMAC) domain. The cDNA sequences differ only in the size and composition of the shark-specific region (SSR). Sequence analysis of each SSR has identified within the region two novel short sequences (SS1 and SS2) and three repeat sequences (RS1, 2 and 3). Genomic analysis has revealed the existence of three introns between the leader peptide and the FIMAC domain, tentatively designated intron 1, intron 2, and intron 3 which span 4067, 2293 and 2082 bp, respectively. Southern blot analysis suggests the presence of a single gene copy for each cDNA type. Phylogenetic analysis suggests that complement factor I of cartilaginous fish diverged prior to the emergence of mammals. All four GcIf cDNA species are expressed in four different tissues and the liver is the main tissue in which expression level of all four is high. This suggests that the expression of GcIf isotypes is tissue-dependent. PMID:19423168
Edvardsen, Rolf B; Malde, Ketil; Mittelholzer, Christian; Taranger, Geir Lasse; Nilsen, Frank
2011-03-01
The Atlantic cod, Gadus morhua, is an important species both for traditional fishery and increasingly also in fish farming. The Atlantic cod is also under potential threat from various environmental changes such as pollution and climate change, but the biological impact of such changes is not well known, in particular when it comes to sublethal effects that can be difficult to assert. Modern molecular and genomic approaches have revolutionized biological research during the last decade, and offer new avenues to study biological functions and e.g. the impact of anthropogenic activities at different life-stages for a given organism. In order to develop genomic data and genomic tools for Atlantic cod we conducted a program where we constructed 20 cDNA libraries, and produced and analyzed 44006 expressed sequence tags (ESTs) from these. Several tissues are represented in the multiple cDNA libraries, which differ in either sexual maturation or immunological stimulation. This approach allowed us to identify genes that are expressed in particular tissues, life-stages or in response to specific stimuli, and also gives us information about potential functions of the transcripts. The ESTs were used to construct a 16k cDNA microarray to further investigate the cod transcriptome. Microarray analyses were performed on pylorus, pituitary gland, spleen and testis of sexually maturing male cod. The four different tissues displayed tissue-specific transcriptomes, demonstrating that the cDNA array is working as expected and will prove to be a powerful tool in further experiments.
Expressed sequence tag analysis of guinea pig (Cavia porcellus) eye tissues for NEIBank
Simpanya, Mukoma F.; Wistow, Graeme; Gao, James; David, Larry L.; Giblin, Frank J.
2008-01-01
Purpose To characterize gene expression patterns in guinea pig ocular tissues and identify orthologs of human genes from NEIBank expressed sequence tags. Methods RNA was extracted from dissected eye tissues of 2.5-month-old guinea pigs to make three unamplified and unnormalized cDNA libraries in the pCMVSport-6 vector for the lens, retina, and eye minus lens and retina. Over 4,000 clones were sequenced from each library and were analyzed using GRIST for clustering and gene identification. Lens crystallin EST data were validated using two-dimensional electrophoresis (2-DE), matrix assisted laser desorption (MALDI), and electrospray ionization mass spectrometry (ESIMS). Results Combined data from the three libraries generated a total of 6,694 distinctive gene clusters, with each library having between 1,000 and 3,000 clusters. Approximately 60% of the total gene clusters were novel cDNA sequences and had significant homologies to other mammalian sequences in GenBank. Complete cDNA sequences were obtained for many guinea pig lens proteins, including αA/αAinsert-, γN-, and γS-crystallins, lengsin and GRIFIN. The ratio of αA- to αB-crystallin on 2-DE gels was 8: 1 in the lens nucleus and 6.5: 1 in the cortex. Analysis of ESTs, genome sequence, and proteins (by MALDI), did not reveal any evidence for the presence of γD-, γE-, and γF-crystallin in the guinea pig. Predicted masses of many guinea pig lens crystallins were confirmed by ESIMS analysis. For the retina, orthologs of human phototransduction genes were found, such as Rhodopsin, S-antigen (Sag, Arrestin), and Transducin. The guinea-pig ortholog of NRL, a key rod photoreceptor-specific transcription factor, was also represented in EST data. In the ‘rest-of-eye’ library, the most abundant transcripts included decorin and keratin 12, representative of the cornea. Conclusions Genomic analysis of guinea pig eye tissues provides sequence-verified clones for future studies. 
Guinea pig orthologs of many human eye specific genes were identified. Guinea pig gene structures were similar to their human and rodent gene counterparts. Surprisingly, no orthologs of γD-, γE-, and γF-crystallin were found in EST, proteomic, or the current guinea pig genome data. PMID:19104676
Genes expressed during the development and ripening of watermelon fruit.
Levi, A; Davis, A; Hernandez, A; Wechter, P; Thimmapuram, J; Trebitsh, T; Tadmor, Y; Katzir, N; Portnoy, V; King, S
2006-11-01
A normalized cDNA library was constructed using watermelon flesh mRNA from three distinct developmental time-points and was subtracted by hybridization with leaf cDNA. Random cDNA clones of the watermelon flesh subtraction library were sequenced from the 5' end in order to identify potentially informative genes associated with fruit setting, development, and ripening. One-thousand and forty-six 5'-end sequences (expressed sequence tags; ESTs) were assembled into 832 non-redundant sequences, designated as "EST-unigenes". Of these 832 "EST-unigenes", 254 (approximately 30%) have no significant homology to sequences published so far for other plant species. Additionally, 168 "EST-unigenes" (approximately 20%) correspond to genes with unknown function, whereas 410 "EST-unigenes" (approximately 50%) correspond to genes with known function in other plant species. These "EST-unigenes" are mainly associated with metabolism, membrane transport, cytoskeleton synthesis and structure, cell wall formation and cell division, signal transduction, nucleic acid binding and transcription factors, defense and stress response, and secondary metabolism. This study provides the scientific community with novel genetic information for watermelon as well as an expanded pool of genes associated with fruit development in watermelon. These genes will be useful targets in future genetic and functional genomic studies of watermelon and its development.
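The approximate percentages quoted for the three EST-unigene categories can be reproduced from the counts given in the abstract above. The snippet below is only an arithmetic check; the dictionary keys are shorthand labels chosen here, not terms defined by the authors.

```python
# Counts as reported in the abstract; labels are shorthand for this example
unigene_counts = {
    "no significant homology": 254,
    "unknown function": 168,
    "known function": 410,
}

total = sum(unigene_counts.values())
shares = {
    label: round(100 * n / total, 1) for label, n in unigene_counts.items()
}

print(total)
for label, pct in shares.items():
    print(f"{label}: {pct}%")
```

The three categories sum to the 832 EST-unigenes reported, and the computed shares round to the approximately 30%, 20% and 50% stated in the text.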
DOE Office of Scientific and Technical Information (OSTI.GOV)
Clines, G.; Lovett, M.
1994-09-01
Diastrophic dysplasia (DTD) is an autosomal recessive disorder of unknown pathogenesis that is characterized by abnormal skeletal and cartilage growth. Phenotypic characteristics of the disorder include short stature, scoliosis, and deformation of the first metacarpal. The diastrophic dysplasia gene has been localized to chromosome 5q31-33, within approximately 60 kb of the colony stimulating factor 1 receptor gene (CSF1R). We have used direct cDNA selection to build a transcription map across approximately 250 kb surrounding and including the CSF1R locus. cDNA pools from human placenta, activated T cells, cerebellum, HeLa cells, fetal brain, chondrocytes, chondrosarcomas and osteosarcomas were multiplexed in these selections. After two rounds of selection, an analysis revealed that approximately 70% of the selected cDNAs were contained within the contig. DNA sequencing and cosmid mapping data from a collection of 310 clones revealed the presence of three new genes in this region that show no appreciable homologies on sequence database searches, as well as cDNA clones from the CSF1R and the PDGFRB loci (another of the known genes in the region). An additional cDNA was found with 100% homology to the gene encoding human ribosomal protein L7 (RPL7). This cDNA comprised approximately 25% of all selected clones. However, further analysis of the genomic contig revealed the presence of an RPL7 processed pseudogene in very close proximity to the CSF1R and PDGFRB genes. The selection of processed pseudogenes is one previously anticipated artifact of selection methodologies, but has not been previously observed. Mutational analysis of the three new genes is underway in diastrophic dysplasia families, as is derivation of full length cDNA clones and the expansion of this detailed transcription map into a larger genomic contig.
The Release 6 reference sequence of the Drosophila melanogaster genome
Hoskins, Roger A.; Carlson, Joseph W.; Wan, Kenneth H.; ...
2015-01-14
Drosophila melanogaster plays an important role in molecular, genetic, and genomic studies of heredity, development, metabolism, behavior, and human disease. The initial reference genome sequence reported more than a decade ago had a profound impact on progress in Drosophila research, and improving the accuracy and completeness of this sequence continues to be important to further progress. We previously described improvement of the 117-Mb sequence in the euchromatic portion of the genome and 21 Mb in the heterochromatic portion, using a whole-genome shotgun assembly, BAC physical mapping, and clone-based finishing. Here, we report an improved reference sequence of the single-copy and middle-repetitive regions of the genome, produced using cytogenetic mapping to mitotic and polytene chromosomes, clone-based finishing and BAC fingerprint verification, ordering of scaffolds by alignment to cDNA sequences, incorporation of other map and sequence data, and validation by whole-genome optical restriction mapping. These data substantially improve the accuracy and completeness of the reference sequence and the order and orientation of sequence scaffolds into chromosome arm assemblies. Representation of the Y chromosome and other heterochromatic regions is particularly improved. The new 143.9-Mb reference sequence, designated Release 6, effectively exhausts clone-based technologies for mapping and sequencing. Highly repeat-rich regions, including large satellite blocks and functional elements such as the ribosomal RNA genes and the centromeres, are largely inaccessible to current sequencing and assembly methods and remain poorly represented. In conclusion, further significant improvements will require sequencing technologies that do not depend on molecular cloning and that produce very long reads.
Screening and analyzing genes associated with Amur tiger placental development.
Li, Q; Lu, T F; Liu, D; Hu, P F; Sun, B; Ma, J Z; Wang, W J; Wang, K F; Zhang, W X; Chen, J; Guan, W J; Ma, Y H; Zhang, M H
2014-09-26
The Amur tiger is a unique endangered species in the world, and thus, protection of its genetic resources is extremely important. In this study, an Amur tiger placenta cDNA library was constructed using the SMART cDNA Library Construction kit. A total of 508 colonies were sequenced, in which 205 (76%) genes were annotated and mapped to 74 KEGG pathways, including 29 metabolism, 29 genetic information processing, 4 environmental information processing, 7 cell motility, and 5 organismal system pathways. Additionally, PLAC8, PEG10 and IGF-II were identified after screening genes from the expressed sequence tags, and they were associated with placental development. These findings could lay the foundation for future functional genomic studies of the Amur tiger.
Moser, Lindsey A.; Ramirez-Carvajal, Lisbeth; Puri, Vinita; Pauszek, Steven J.; Matthews, Krystal; Dilley, Kari A.; Mullan, Clancy; McGraw, Jennifer; Khayat, Michael; Beeri, Karen; Yee, Anthony; Dugan, Vivien; Heise, Mark T.; Frieman, Matthew B.; Rodriguez, Luis L.; Bernard, Kristen A.; Wentworth, David E.
2016-01-01
ABSTRACT Several biosafety level 3 and/or 4 (BSL-3/4) pathogens are high-consequence, single-stranded RNA viruses, and their genomes, when introduced into permissive cells, are infectious. Moreover, many of these viruses are select agents (SAs), and their genomes are also considered SAs. For this reason, cDNAs and/or their derivatives must be tested to ensure the absence of infectious virus and/or viral RNA before transfer out of the BSL-3/4 and/or SA laboratory. This tremendously limits the capacity to conduct viral genomic research, particularly the application of next-generation sequencing (NGS). Here, we present a sequence-independent method to rapidly amplify viral genomic RNA while simultaneously abolishing both viral and genomic RNA infectivity across multiple single-stranded positive-sense RNA (ssRNA+) virus families. The process generates barcoded DNA amplicons that range in length from 300 to 1,000 bp, which cannot be used to rescue a virus and are stable to transport at room temperature. Our barcoding approach allows for up to 288 barcoded samples to be pooled into a single library and run across various NGS platforms without potential reconstitution of the viral genome. Our data demonstrate that this approach provides full-length genomic sequence information not only from high-titer virion preparations but it can also recover specific viral sequence from samples with limited starting material in the background of cellular RNA, and it can be used to identify pathogens from unknown samples. In summary, we describe a rapid, universal standard operating procedure that generates high-quality NGS libraries free of infectious virus and infectious viral RNA. IMPORTANCE This report establishes and validates a standard operating procedure (SOP) for select agents (SAs) and other biosafety level 3 and/or 4 (BSL-3/4) RNA viruses to rapidly generate noninfectious, barcoded cDNA amenable for next-generation sequencing (NGS). 
This eliminates the burden of testing all processed samples derived from high-consequence pathogens prior to transfer from high-containment laboratories to lower-containment facilities for sequencing. Our established protocol can be scaled up for high-throughput sequencing of hundreds of samples simultaneously, which can dramatically reduce the cost and effort required for NGS library construction. NGS data from this SOP can provide complete genome coverage from viral stocks and can also detect virus-specific reads from limited starting material. Our data suggest that the procedure can be implemented and easily validated by institutional biosafety committees across research laboratories. PMID:27822536
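Pooling barcoded amplicons into a single library, as the abstract above describes, implies that reads are later assigned back to their samples by barcode. The sketch below shows one simple way this could be done. It assumes fixed-length barcodes at the 5' end of each read and exact matching; the barcode sequences, sample names, and the function `demultiplex` are invented for illustration and are not taken from the published procedure.

```python
from collections import defaultdict


def demultiplex(reads, barcode_to_sample):
    """Group reads by sample using an exact match on a 5' barcode.

    Assumptions: all barcodes share one length and sit at the start of the
    read. Reads with an unknown barcode are kept whole under the key None.
    """
    lengths = {len(b) for b in barcode_to_sample}
    if len(lengths) != 1:
        raise ValueError("all barcodes must have the same length")
    n = lengths.pop()

    bins = defaultdict(list)
    for read in reads:
        sample = barcode_to_sample.get(read[:n])
        # Trim the barcode only when the read was assigned to a sample
        bins[sample].append(read[n:] if sample is not None else read)
    return dict(bins)


# Hypothetical 4-nt barcodes and toy reads
barcodes = {"ACGT": "sample_1", "TGCA": "sample_2"}
reads = ["ACGTGGGA", "TGCATTTC", "NNNNAAAA", "ACGTCCCC"]
print(demultiplex(reads, barcodes))
```

A production pipeline would normally tolerate sequencing errors in the barcode and handle paired reads; exact matching is used here only to keep the example short.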
An atypical topoisomerase II sequence from the slime mold Physarum polycephalum.
Hugodot, Yannick; Dutertre, Murielle; Duguet, Michel
2004-01-21
We have determined the complete nucleotide sequence of the cDNA encoding DNA topoisomerase II from Physarum polycephalum. Using degenerate primers, based on the conserved amino acid sequences of other eukaryotic enzymes, a 250-bp fragment was polymerase chain reaction (PCR) amplified. This fragment was used as a probe to screen a Physarum cDNA library. A partial cDNA clone was isolated that was truncated at the 3' end. Rapid amplification of cDNA ends (RACE)-PCR was employed to isolate the remaining portion of the gene. The complete sequence of 4613 bp contains an open reading frame of 4494 bp that codes for 1498 amino acid residues with a theoretical molecular weight of 167 kDa. The predicted amino acid sequence shares similarity with those of other eukaryotes and shows the highest degree of identity with the enzyme of Dictyostelium discoideum. However, the enzyme of P. polycephalum contains an atypical amino-terminal domain very rich in serine and proline, whose function is unknown. Remarkably, both a mitochondrial targeting sequence and a nuclear localization signal were predicted respectively in the amino and carboxy-terminus of the protein, as in the case of human topoisomerase III alpha. At the Physarum genomic level, the topoisomerase II gene encompasses a region of about 16 kbp suggesting a large proportion of intronic sequences, an unusual situation for a gene of a lower eukaryote, often free of introns. Finally, expression of topoisomerase II mRNA does not appear significantly dependent on the plasmodium cycle stage, possibly due to the lack of G1 phase or (and) to a mitochondrial localization of the enzyme.
NASA Astrophysics Data System (ADS)
Hamid, Nur Athirah Abd; Ismail, Ismanizan
2013-11-01
Polygonum minus, known locally as Kesum, is an aromatic herb with a high secondary metabolite content. Alcohol dehydrogenase (ADH) is an important enzyme that catalyzes the reversible interconversion of alcohols and aldehydes in the presence of NAD(P)(H) as a cofactor. The main focus of this research was to identify the ADH gene. Total RNA was extracted from leaves of P. minus treated with 150 μM jasmonic acid. The full-length cDNA sequence of ADH was isolated via rapid amplification of cDNA ends (RACE). Subsequently, in silico analysis was conducted on the full-length cDNA sequence, and PCR was performed on genomic DNA to determine the exon and intron organization. Two ADH sequences, designated PmADH1 and PmADH2, were successfully isolated. Both sequences have an ORF of 801 bp encoding 266 amino acid residues. Nucleotide sequence comparison of PmADH1 and PmADH2 indicated that the two sequences are highly similar in the ORF region but divergent in the 3' untranslated regions (UTRs). The deduced amino acid sequences differ at residue 107: PmADH1 contains a Gly (G) residue, whereas PmADH2 contains a Cys (C) residue. The intron-exon organization of the two sequences is also the same, with 3 introns and 4 exons. Based on in silico analysis, both sequences contain the "classical" short-chain alcohol dehydrogenases/reductases ((c) SDRs) conserved domain. The results suggest that both sequences are members of the short-chain alcohol dehydrogenase family.
A Novel Locomotion-based Validation Assay for Candidate Drugs Using Drosophila DYT1 Disease Model
2013-11-01
the genome using the same parental fly line, minimizing the effect of surrounding sequences and genetic variations on the ...locomotion and GTP cyclohydrolase protein levels; (3) supplementation of dopamine can partially rescue the locomotion defects of Drosophila larvae...’-GCGAACAACCAAAAAATCATTGAGATAATAAACTCCTCCATTAG-3’) to make dtorsin cDNA that lacks GAC (D307) (Fig. 1), respectively. After confirming mutated sequences, the insert was again
Congenital hypothyroidism with goiter in Tenterfield terriers.
Dodgson, S E; Day, R; Fyfe, J C
2012-01-01
A cluster of cases of congenital hypothyroidism with goiter (CHG) in Tenterfield Terriers was identified and hypothesized to be dyshormonogenesis of genetic etiology with autosomal recessive inheritance. To describe the phenotype, thyroid histopathology, biochemistry, mode of inheritance, and causal mutation of CHG in Tenterfield Terriers. Thyroid tissue from 1 CHG-affected Tenterfield Terrier, 2 affected Toy Fox Terriers, and 7 normal control dogs. Genomic DNA from blood or buccal brushings of 114 additional Tenterfield Terriers. Biochemical and genetic segregation analysis of functional gene candidates in a Tenterfield Terrier kindred. Thyroid peroxidase (TPO) iodide oxidation activity was measured, and TPO protein and SDS-resistant thyroglobulin aggregation were assessed on western blots. TPO cDNA was amplified from thyroid RNA and sequenced. Exons and flanking splice sites were amplified from genomic DNA and sequenced. Variant TPO allele segregation was assessed by restriction enzyme digestion of PCR products. Thyroid from an affected pup had lesions consistent with dyshormonogenesis. TPO activity was absent, but normal-sized immunocrossreactive TPO protein was present. Affected dog cDNA and genomic sequences revealed a homozygous TPO missense mutation in exon 9 (R593W) that was heterozygous in all obligate carriers and in 31% of other clinically normal Tenterfield Terriers. The mutation underlying CHG in Tenterfield Terriers was identified, and a convenient carrier test made available for screening Tenterfield Terriers used for breeding. Copyright © 2012 by the American College of Veterinary Internal Medicine.
Expression of a synthetic rust fungal virus cDNA in yeast
USDA-ARS's Scientific Manuscript database
Mycoviruses are viruses that infect fungi. Recently, mycovirus-like RNAs were sequenced from the fungus Phakopsora pachyrhizi, the causal agent of soybean rust. One of the RNAs appeared to represent a novel mycovirus and was designated Phakopsora pachyrhizi virus 2383 (PpV2383). The genome of PpV...
Murray-Stewart, Tracy; Applegren, Nancy B; Devereux, Wendy; Hacker, Amy; Smith, Renee; Wang, Yanlin; Casero, Robert A
2003-07-15
Spermidine/spermine N(1)-acetyltransferase (SSAT) activity is typically highly inducible in non-small-cell lung carcinomas in response to treatment with anti-tumour polyamine analogues, and this induction is associated with subsequent cell death. In contrast, cells of the small-cell lung carcinoma (SCLC) phenotype generally do not respond to these compounds with an increase in SSAT activity, and usually are only moderately affected with respect to growth. The goal of the present study was to produce an SSAT-overexpressing SCLC cell line to further investigate the role of SSAT in response to these anti-tumour analogues. To accomplish this, NCI-H82 SCLC cells were stably transfected with plasmids containing either the SSAT genomic sequence or the corresponding cDNA sequence. Individual clones were selected based on their ability to show induced SSAT activity in response to exposure to a polyamine analogue, and an increase in the steady-state SSAT mRNA level. Cells transfected with the genomic sequence exhibited a significant increase in basal SSAT mRNA expression, as well as enhanced SSAT activity, intracellular polyamine pool depletion and growth inhibition following treatment with the analogue N(1),N(11)-bis(ethyl)norspermine. Cells containing the transfected cDNA also exhibited an increase in the basal SSAT mRNA level, but remained phenotypically similar to vector control cells with respect to their response to analogue exposure. These studies indicate that both the genomic SSAT sequence and polyamine analogue exposure play a role in the transcriptional and post-transcriptional regulation and subsequent induction of SSAT activity in these cells. Furthermore, this is the first production of a cell line capable of SSAT protein induction from a generally unresponsive parent line.
The use of enzymopathic human red cells in the study of malarial parasite glucose metabolism.
Roth, E; Joulin, V; Miwa, S; Yoshida, A; Akatsuka, J; Cohen-Solal, M; Rosa, R
1988-05-01
The in vitro growth of Plasmodium falciparum malaria parasites was assayed in mutant red cells deficient in either diphosphoglycerate mutase (DPGM) or phosphoglycerate kinase (PGK). In addition, cDNA probes developed for human DNA sequences coding for these enzymes were used to examine the parasite genome by means of restriction endonuclease digestion and Southern blot analysis of parasite DNA. In both types of enzymopathic red cells, parasite growth was normal. In infected DPGM deficient red cells, no DPGM activity could be detected, and in normal red cells, DPGM activity declined slightly in a manner suggestive of parasite catabolism of host protein. However, in infected PGK deficient red cells, there was a 100-fold increase in PGK activity, and in normal red cells, a threefold increase in PGK activity was observed. Parasite PGK could be recovered from isolated parasites, and a marked increase in heat instability of parasite PGK as compared with the host cell enzyme was noted. Neither cDNA probe was found to cross-react with DNA sequences in the parasite genome. It is concluded that the parasite has no requirement for DPGM, and probably has no gene for this enzyme. On the other hand, the parasite does require PGK, (an adenosine triphosphate [ATP] generating enzyme) and synthesizes its own enzyme, which must have been encoded in the parasite genome. The parasite PGK gene most likely lacks sufficient homology to be detected by a human cDNA probe. Enzymopathic red cells are useful tools for elucidating the glycolytic enzymology of parasites and their co-evolution with their human hosts.
Prody, C A; Zevin-Sonkin, D; Gnatt, A; Goldberg, O; Soreq, H
1987-01-01
To study the primary structure and regulation of human cholinesterases, oligodeoxynucleotide probes were prepared according to a consensus peptide sequence present in the active site of both human serum pseudocholinesterase (BtChoEase; EC 3.1.1.8) and Torpedo electric organ "true" acetylcholinesterase (AcChoEase; EC 3.1.1.7). Using these probes, we isolated several cDNA clones from lambda gt10 libraries of fetal brain and liver origins. These include 2.4-kilobase cDNA clones that code for a polypeptide containing a putative signal peptide and the N-terminal, active site, and C-terminal peptides of human BtChoEase, suggesting that they code either for BtChoEase itself or for a very similar but distinct fetal form of cholinesterase. In RNA blots of poly(A)+ RNA from the cholinesterase-producing fetal brain and liver, these cDNAs hybridized with a single 2.5-kilobase band. Blot hybridization to human genomic DNA revealed that these fetal BtChoEase cDNA clones hybridize with DNA fragments of the total length of 17.5 kilobases, and signal intensities indicated that these sequences are not present in many copies. Both the cDNA-encoded protein and its nucleotide sequence display striking homology to parallel sequences published for Torpedo AcChoEase. These findings demonstrate extensive homologies between the fetal BtChoEase encoded by these clones and other cholinesterases of various forms and species. PMID:3035536
Cloning a Chymotrypsin-Like 1 (CTRL-1) Protease cDNA from the Jellyfish Nemopilema nomurai
Heo, Yunwi; Kwon, Young Chul; Bae, Seong Kyeong; Hwang, Duhyeon; Yang, Hye Ryeon; Choudhary, Indu; Lee, Hyunkyoung; Yum, Seungshic; Shin, Kyoungsoon; Yoon, Won Duk; Kang, Changkeun; Kim, Euikyung
2016-01-01
An enzyme in a nematocyst extract of the Nemopilema nomurai jellyfish, caught off the coast of the Republic of Korea, catalyzed the cleavage of chymotrypsin substrate in an amidolytic kinetic assay, and this activity was inhibited by the serine protease inhibitor, phenylmethanesulfonyl fluoride. We isolated the full-length cDNA sequence of this enzyme, which contains 850 nucleotides, with an open reading frame of 801 encoding 266 amino acids. A blast analysis of the deduced amino acid sequence showed 41% identity with human chymotrypsin-like (CTRL) and the CTRL-1 precursor. Therefore, we designated this enzyme N. nomurai CTRL-1. The primary structure of N. nomurai CTRL-1 includes a leader peptide and a highly conserved catalytic triad of His69, Asp117, and Ser216. The disulfide bonds of chymotrypsin and the substrate-binding sites are highly conserved compared with the CTRLs of other species, including mammalian species. Nemopilema nomurai CTRL-1 is evolutionarily more closely related to Actinopterygii than to Scyphozoan (Aurelia aurita) or Hydrozoan (Hydra vulgaris). The N. nomurai CTRL1 was amplified from the genomic DNA with PCR using specific primers designed based on the full-length cDNA, and then sequenced. The N. nomurai CTRL1 gene contains 2434 nucleotides and four distinct exons. The 5′ donor splice (GT) and 3′ acceptor splice sequences (AG) are wholly conserved. This is the first report of the CTRL1 gene and cDNA structures in the jellyfish N. nomurai. PMID:27399771
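The ORF length and protein length reported above (801 nucleotides, 266 amino acids) agree only if the ORF count includes the stop codon. The abstract does not state this convention, so the sketch below treats it as an assumption; the function name `residues_from_orf` is hypothetical and used only for illustration.

```python
def residues_from_orf(orf_nt: int, includes_stop: bool = True) -> int:
    """Return the number of encoded amino acids for an ORF of orf_nt nucleotides.

    Assumption (not stated in the abstract): when includes_stop is True,
    the final codon is a stop codon and does not encode a residue.
    """
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_nt // 3
    return codons - 1 if includes_stop else codons


# 801 nt = 267 codons; one stop codon leaves 266 residues.
print(residues_from_orf(801))
```

Under the same assumption, an ORF whose reported length equals exactly three times the residue count would have been counted without its stop codon, which is why `includes_stop` is left as a parameter.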
Tian, Wenzhi; Chua, Kevin; Strober, Warren; Chu, Charles C.
2002-01-01
BACKGROUND: Identification of differentially expressed genes between normal and diseased states is an area of intense current medical research that can lead to the discovery of new therapeutic targets. However, isolation of differentially expressed genes by subtraction often suffers from unreported contamination of the resulting subtraction library with clones containing DNA sequences not from the original RNA samples. MATERIALS AND METHODS: Subtraction using cDNA representational difference analysis (RDA) was performed on human B cells from normal or common variable immunodeficiency patients. The material remaining after the subtraction was cloned and individual clones were sequenced. The sequence of one clone with similarity to integrases (ILG1, integrase-like gene-1) was used to obtain the full length cDNA sequence and as a probe for the presence of this sequence in RNA or genomic DNA samples. RESULTS: After five rounds of cDNA RDA, 23.3% of the clones from the resulting subtraction library contained Escherichia coli DNA. In addition, three clones contained the sequence of a new integrase, ILG1. The full length cDNA sequence of ILG1 exhibits prokaryotic, but not eukaryotic, features. At the DNA level, ILG1 is not similar to any known gene. At the protein level, ILG1 has 58% similarity to integrases from the cryptic P4 bacteriophage family (S clade). The catalytic domain of ILG1 contains the conserved features found in site-specific recombinases. The critical residues that form the catalytic active site pocket are conserved, including the highly conserved R-H-R-Y hallmark of these recombinases. Interestingly, ILG1 was not present in the original B cell populations. By probing genomic DNA, ILG1 could only be detected in the E. coli TOP10F' strain used in our laboratory for molecular cloning, but not in any of its precursor strains, including TOP10. 
Furthermore, bacteria cultured from the mouth of the laboratory worker who performed cDNA RDA were also positive for ILG1. CONCLUSIONS: In the course of our studies using cDNA RDA, we have isolated and identified ILG1, a likely active site-specific recombinase and new member of the bacteriophage P4 family of integrases. This family of integrases is implicated in the horizontal DNA transfer of pathogenic genes between bacterial species, such as those found in pathogenic strains of E. coli, Shigella, Yersinia, and Vibrio cholerae. Using ILG1 as a marker of our laboratory E. coli strain TOP10F', our evidence suggests that contaminating bacterial DNA in our subtraction experiment is due to this laboratory bacterial strain, which colonized exposed surfaces of the laboratory worker. Thus, identification of differentially expressed genes between normal and diseased states could be dramatically improved by using extra precaution to prevent bacterial contamination of samples. PMID:12393938
Babak, Tomas; Garrett-Engele, Philip; Armour, Christopher D; Raymond, Christopher K; Keller, Mark P; Chen, Ronghua; Rohl, Carol A; Johnson, Jason M; Attie, Alan D; Fraser, Hunter B; Schadt, Eric E
2010-08-13
Identifying associations between genotypes and gene expression levels using microarrays has enabled systematic interrogation of regulatory variation underlying complex phenotypes. This approach has vast potential for functional characterization of disease states, but its prohibitive cost, given that hundreds to thousands of individual samples from populations have to be genotyped and expression profiled, has limited its widespread application. Here we demonstrate that genomic regions with allele-specific expression (ASE) detected by sequencing cDNA are highly enriched for cis-acting expression quantitative trait loci (cis-eQTL) identified by profiling of 500 animals in parallel, with up to 90% agreement on the allele that is preferentially expressed. We also observed widespread noncoding and antisense ASE and identified several allele-specific alternative splicing variants. Monitoring ASE by sequencing cDNA from as little as one sample is a practical alternative to expression genetics for mapping cis-acting variation that regulates RNA transcription and processing.
Kim, Mi Ae; Rhee, Jae-Sung; Kim, Tae Ha; Lee, Jung Sick; Choi, Ah-Young; Choi, Beom-Soon; Choi, Ik-Young; Sohn, Young Chang
2017-03-09
In order to characterize the female or male transcriptome of the Pacific abalone and further increase genomic resources, we sequenced the mRNA of full-length complementary DNA (cDNA) libraries derived from pooled tissues of female and male Haliotis discus hannai by employing the Iso-Seq protocol of the PacBio RSII platform. We successfully assembled whole full-length cDNA sequences and constructed a transcriptome database that included isoform information. After clustering, a total of 15,110 and 12,145 genes that coded for proteins were identified in female and male abalones, respectively. A total of 13,057 putative orthologs were retained from each transcriptome in abalones. Overall Gene Ontology terms and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways analyzed in each database showed a similar composition between sexes. In addition, a total of 519 and 391 isoforms were genome-widely identified with at least two isoforms from female and male transcriptome databases. We found that the number of isoforms and their alternatively spliced patterns are variable and sex-dependent. This information represents the first significant contribution to sex-preferential genomic resources of the Pacific abalone. The availability of whole female and male transcriptome database and their isoform information will be useful to improve our understanding of molecular responses and also for the analysis of population dynamics in the Pacific abalone. PMID:28282934
Cloning of cDNA of major antigen of foot and mouth disease virus and expression in E. coli
NASA Astrophysics Data System (ADS)
Küpper, Hans; Keller, Walter; Kurz, Christina; Forss, Sonja; Schaller, Heinz
1981-02-01
Double-stranded DNA copies of the single-stranded genomic RNA of foot and mouth disease virus have been cloned into the Escherichia coli plasmid pBR322. A restriction map of the viral genome was established and aligned with the biochemical map of foot and mouth disease virus. The coding sequence for structural protein VP1, the major antigen of the virus, was identified and inserted into a plasmid vector where the expression of this sequence is under control of the phage λ PL promoter. In an appropriate host the synthesis of antigenic polypeptide can be demonstrated by radioimmunoassay.
Gambling on a shortcut to genome sequencing
DOE Office of Scientific and Technical Information (OSTI.GOV)
Roberts, L.
1991-06-21
Almost from the start of the Human Genome Project, a debate has been raging over whether to sequence the entire human genome, all 3 billion bases, or just the genes - a mere 2% or 3% of the genome, and by far the most interesting part. In England, Sydney Brenner convinced the Medical Research Council (MRC) to start with the expressed genes, or complementary DNAs. But the US stance has been that the entire sequence is essential if we are to understand the blueprint of man. Craig Venter of the National Institute of Neurological Disorders and Stroke says that focusing on the expressed genes may be even more useful than expected. His strategy involves randomly selecting clones from cDNA libraries which theoretically contain all the genes that are switched on at a particular time in a particular tissue. Then the researchers sequence just a short stretch of each clone, about 400 to 500 bases, to create an expressed sequence tag or EST. The sequences of these ESTs are then stored in a database. Using that information, other researchers can then recreate that EST by using polymerase chain reaction techniques.
Chromosomal mapping of the human M6 genes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Olinsky, S.; Loop, B.T.; DeKosky, A.
1996-05-01
M6 is a neuronal membrane glycoprotein that may have an important role in neural development. This molecule was initially defined by a monoclonal antibody that affected the survival of cultured cerebellar neurons and the outgrowth of neurites. The nature of the antigen was discovered by expression cDNA cloning using this monoclonal antibody. Two distinct murine M6 cDNAs (designated M6a and M6b) whose deduced amino acid sequences were remarkably similar to that of the myelin proteolipid protein human cDNA and genomic clones encoding M6a and M6b and have characterized them by restriction mapping, Southern hybridization with cDNA probes, and sequence analysis. We have localized these genes within the human genome by FISH (fluorescence in situ hybridization). The human M6a gene is located at 4q34, and the M6b gene is located at Xp22.2. A number of human neurological disorders have been mapped to the Xp22 region, including Aicardi syndrome (MIM 304050), Rett syndrome (MIM 312750), X-linked Charcot-Marie-Tooth neuropathy (MIM 302801), and X-linked mental retardation syndromes (MRX1, MIM 309530). This raises the possibility that a defect in the M6b gene is responsible for one of these neurological disorders. 8 refs., 3 figs.
González-Carranza, Zinnia Haydé; Whitelaw, Catherine Ann; Swarup, Ranjan; Roberts, Jeremy Alan
2002-01-01
During leaf abscission in oilseed rape (Brassica napus), cell wall degradation is brought about by the action of several hydrolytic enzymes. One of these is thought to be polygalacturonase (PG). Degenerate primers were used to isolate a PG cDNA fragment by reverse transcriptase-polymerase chain reaction from RNA extracted from ethylene-promoted leaf abscission zones (AZs), and in turn a full-length clone (CAW471) from an oilseed rape AZ cDNA library. The highest homology of this cDNA (82%) was to an Arabidopsis sequence that was predicted to encode a PG protein. Analysis of expression revealed that CAW471 mRNA accumulated in the AZ of leaves and reached a peak 24 h after ethylene treatment. Ethylene-promoted leaf abscission in oilseed rape was not apparent until 42 h after exposure to the gas, reaching 50% at 48 h and 100% by 56 h. In floral organ abscission, expression of CAW471 correlated with cell separation. Genomic libraries from oilseed rape and Arabidopsis were screened with CAW471 and the respective genomic clones PGAZBRAN and PGAZAT isolated. Characterization of these PG genes revealed that they had substantial homology within both the coding regions and in the 5′-upstream sequences. Fusion of a 1,476-bp 5′-upstream sequence of PGAZAT to β-glucuronidase or green fluorescent protein and transformation of Arabidopsis revealed that this fragment was sufficient to drive expression of these reporter genes in the AZs at the base of the anther filaments, petals, and sepals. PMID:11842157
Zhou, Wen-Zhao; Zhang, Yan-Mei; Lu, Jun-Ying; Li, Jun-Feng
2012-01-01
To provide a resource of sisal-specific expressed sequence data and facilitate this powerful approach in new gene research, the preparation of normalized cDNA libraries enriched with full-length sequences is necessary. Four libraries were produced with RNA pooled from Agave sisalana multiple tissues to increase efficiency of normalization and maximize the number of independent genes by SMART™ method and the duplex-specific nuclease (DSN). This procedure kept the proportion of full-length cDNAs in the subtracted/normalized libraries and dramatically enhanced the discovery of new genes. Sequencing of 3875 cDNA clones of libraries revealed 3320 unigenes with an average insert length about 1.2 kb, indicating that the non-redundancy of libraries was about 85.7%. These unigene functions were predicted by comparing their sequences to functional domain databases and extensively annotated with Gene Ontology (GO) terms. Comparative analysis of sisal unigenes and other plant genomes revealed that four putative MADS-box genes and knotted-like homeobox (knox) gene were obtained from a total of 1162 full-length transcripts. Furthermore, real-time PCR showed that the characteristics of their transcripts mainly depended on the tight expression regulation of a number of genes during the leaf and flower development. Analysis of individual library sequence data indicated that the pooled-tissue approach was highly effective in discovering new genes and preparing libraries for efficient deep sequencing. PMID:23202944
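The non-redundancy figure quoted above follows directly from the two counts in the abstract (3320 unigenes among 3875 sequenced clones). A minimal sketch of that arithmetic, assuming non-redundancy is defined as unigenes divided by sequenced clones (the abstract implies but does not spell out this definition; the function name is hypothetical):

```python
def non_redundancy_percent(unigenes: int, clones: int) -> float:
    """Percentage of sequenced clones that represent distinct unigenes.

    Assumed definition: 100 * unigenes / clones.
    """
    if clones <= 0:
        raise ValueError("clones must be positive")
    return 100.0 * unigenes / clones


value = non_redundancy_percent(3320, 3875)
print(f"{value:.1f}%")  # prints 85.7%
```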
USDA-ARS's Scientific Manuscript database
We used an expressed sequence tag and 454 pyrosequencing approach to initiate a study of the genome of the New World Screwworm, Cochliomyia hominivorax (Coquerel). Two normalized cDNA libraries were constructed from RNA isolated from embryos and 2nd instar larvae from the Panama 95 strain. Approxima...
Goetz, Frederick W; Norberg, Birgitta; McCauley, Linda A R; Iliev, Dimitar B
2004-03-01
The full-length cDNA for the cod (Gadus morhua) StAR was cloned by RT-PCR and library screening using ovarian RNA. From the library screening, 2 size classes of cDNA were obtained: a 1577 bp cDNA (cStAR1) and a 2851 bp cDNA (cStAR2). The cStAR1 cDNA presumably encodes a protein of 286 amino acids. The cStAR2 cDNA was composed of 6 separated sequences that contained all of the coding regions of cStAR1 when added together, but also contained 5 noncoding regions not observed in cStAR1. Polymerase chain reactions of cod genomic DNA produced products slightly larger than cStAR2. The sequences of these products were the same as cStAR2 but revealed one additional noncoding region (intron). Thus, the fish StAR gene contains the same number of exons (7) and introns (6) as observed in mammals, but is approximately half the size of the mammalian gene. Using Northern analysis and RT-PCR, cStAR1 expression was observed only in testes, ovaries and head kidneys. Polymerase chain reaction products were also observed using cDNA from steroidogenic tissues and primers designed to regions specific for cStAR2, indicating that cStAR2 is expressed in tissues and may account for the presence of larger transcripts observed on Northern blots.
Rodriguez, P L; Leube, M P; Grill, E
1998-11-01
We report the cloning of both the cDNA and the corresponding genomic sequence of a new PP2C from Arabidopsis thaliana, named AtP2C-HA (for homology to ABI1/ABI2). The AtP2C-HA cDNA contains an open reading frame of 1536 bp and encodes a putative protein of 511 amino acids with a predicted molecular mass of 55.7 kDa. The AtP2C-HA protein is composed of two domains, a C-terminal PP2C catalytic domain and a N-terminal extension of ca. 180 amino acid residues. The deduced amino acid sequence is 55% and 54% identical to ABI1 and ABI2, respectively. Comparison of the genomic structure of the ABI1, ABI2 and AtP2C-HA genes suggests that they belong to a multigene family. The expression of the AtP2C-HA gene is up-regulated by abscisic acid (ABA) treatment.
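As a rough plausibility check on the predicted mass reported above, a protein's molecular mass can be approximated from its residue count. The sketch below assumes an average residue mass of about 110 Da, a common rule of thumb that is not taken from the abstract; the authors' 55.7 kDa value was presumably computed from the actual sequence, so only approximate agreement is expected.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb assumption, not from the abstract


def approx_mass_kda(n_residues: int, avg_mass_da: float = AVG_RESIDUE_MASS_DA) -> float:
    """Approximate protein mass in kDa from the number of residues."""
    return n_residues * avg_mass_da / 1000.0


estimate = approx_mass_kda(511)   # 511 residues, as reported for AtP2C-HA
reported = 55.7                   # kDa, as reported in the abstract
print(f"estimate {estimate:.1f} kDa vs reported {reported} kDa")
```

The estimate lands within about 1 kDa of the reported figure, which is as close as a composition-independent average can be expected to get.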
Scaglione, Davide; Lanteri, Sergio; Acquadro, Alberto; Lai, Zhao; Knapp, Steven J; Rieseberg, Loren; Portis, Ezio
2012-10-01
Cynara cardunculus (2n = 2× = 34) is a member of the Asteraceae family that contributes significantly to the agricultural economy of the Mediterranean basin. The species includes two cultivated varieties, globe artichoke and cardoon, which are grown mainly for food. Cynara cardunculus is an orphan crop species whose genome/transcriptome has been relatively unexplored, especially in comparison to other Asteraceae crops. Hence, there is a significant need to improve its genomic resources through the identification of novel genes and sequence-based markers, to design new breeding schemes aimed at increasing quality and crop productivity. We report the outcome of cDNA sequencing and assembly for eleven accessions of C. cardunculus. Sequencing of three mapping parental genotypes using Roche 454-Titanium technology generated 1.7 × 10⁶ reads, which were assembled into 38,726 reference transcripts covering 32 Mbp. Putative enzyme-encoding genes were annotated using the KEGG-database. Transcription factors and candidate resistance genes were surveyed as well. Paired-end sequencing was done for cDNA libraries of eight other representative C. cardunculus accessions on an Illumina Genome Analyzer IIx, generating 46 × 10⁶ reads. Alignment of the IGA and 454 reads to reference transcripts led to the identification of 195,400 SNPs with a Bayesian probability exceeding 95%; a validation rate of 90% was obtained by Sanger-sequencing of a subset of contigs. These results demonstrate that the integration of data from different NGS platforms enables large-scale transcriptome characterization, along with massive SNP discovery. This information will contribute to the dissection of key agricultural traits in C. cardunculus and facilitate the implementation of marker-assisted selection programs. © 2012 The Authors. Plant Biotechnology Journal © 2012 Society for Experimental Biology, Association of Applied Biologists and Blackwell Publishing Ltd.
Sheveleva, Anna; Kudryavtseva, Anna; Speranskaya, Anna; Belenikin, Maxim; Melnikova, Natalia; Chirkov, Sergei
2013-10-01
The near-complete (99.7 %) genome sequence of a novel Russian Plum pox virus (PPV) isolate Pk, belonging to the strain Winona (W), has been determined by 454 pyrosequencing with the exception of the thirty-one 5'-terminal nucleotides. This region was amplified using 5'RACE kit and sequenced by the Sanger method. Genomic RNA released from immunocaptured PPV particles was employed for generation of cDNA library using TransPlex Whole transcriptome amplification kit (WTA2, Sigma-Aldrich). The entire Pk genome has identity level of 92.8-94.5 % when compared to the complete nucleotide sequences of other PPV-W isolates (W3174, LV-141pl, LV-145bt, and UKR 44189), confirming a high degree of variability within the PPV-W strain. The isolates Pk and LV-141pl are most closely related. The Pk has been found in a wild plum (Prunus domestica) in a new region of Russia indicating widespread dissemination of the PPV-W strain in the European part of the former USSR.
Loit, Evelin; Melnyk, Charles W; MacFarlane, Amanda J; Scott, Fraser W; Altosaar, Illimar
2009-01-01
Background Exposure to dietary wheat proteins in genetically susceptible individuals has been associated with increased risk for the development of Type 1 diabetes (T1D). Recently, a wheat protein encoded by cDNA WP5212 has been shown to be antigenic in mice, rats and humans with autoimmune T1D. To investigate the genomic origin of the identified wheat protein cDNA, a hexaploid wheat genomic library from Glenlea cultivar was screened. Results Three unique wheat globulin genes, Glo-3A, Glo-3B and Glo-3C, were identified. We describe the genomic structure of these genes and their expression pattern in wheat seeds. The Glo-3A gene shared 99% identity with the cDNA of WP5212 at the nucleotide and deduced amino acid level, indicating that we have identified the gene(s) encoding wheat protein WP5212. Southern analysis revealed the presence of multiple copies of Glo-3-like sequences in all wheat samples, including seed of hexaploid, tetraploid and diploid wheat species. Aleurone and embryo tissue specificity of WP5212 gene expression, suggested by promoter region analysis, which demonstrated an absence of endosperm specific cis elements, was confirmed by immunofluorescence microscopy using anti-WP5212 antibodies. Conclusion Taken together, the results indicate that a diverse group of globulins exists in wheat, some of which could be associated with the pathogenesis of T1D in some susceptible individuals. These data expand our knowledge of specific wheat globulins and will enable further elucidation of their role in wheat biology and human health. PMID:19615078
Circularization of the HIV-1 genome facilitates strand transfer during reverse transcription
Beerens, Nancy; Kjems, Jørgen
2010-01-01
Two obligatory DNA strand transfers take place during reverse transcription of a retroviral RNA genome. The first strand transfer involves a jump from the 5′ to the 3′ terminal repeat (R) region positioned at each end of the viral genome. The process depends on base pairing between the cDNA synthesized from the 5′ R region and the 3′ R RNA. The tertiary conformation of the viral RNA genome may facilitate strand transfer by juxtaposing the 5′ R and 3′ R sequences that are 9 kb apart in the linear sequence. In this study, RNA sequences involved in an interaction between the 5′ and 3′ ends of the HIV-1 genome were mapped by mutational analysis. This interaction appears to be mediated mainly by a sequence in the extreme 3′ end of the viral genome and in the gag open reading frame. Mutation of 3′ R sequences was found to inhibit the 5′–3′ interaction, which could be restored by a complementary mutation in the 5′ gag region. Furthermore, we find that circularization of the HIV-1 genome does not affect the initiation of reverse transcription, but stimulates the first strand transfer during reverse transcription in vitro, underscoring the functional importance of the interaction. PMID:20430859
Intra-isolate genome variation in arbuscular mycorrhizal fungi persists in the transcriptome.
Boon, E; Zimmerman, E; Lang, B F; Hijri, M
2010-07-01
Arbuscular mycorrhizal fungi (AMF) are heterokaryotes with an unusual genetic makeup. Substantial genetic variation occurs among nuclei within a single mycelium or isolate. AMF reproduce through spores that contain varying fractions of this heterogeneous population of nuclei. It is not clear whether this genetic variation on the genome level actually contributes to the AMF phenotype. To investigate the extent to which polymorphisms in nuclear genes are transcribed, we analysed the intra-isolate genomic and cDNA sequence variation of two genes, the large subunit ribosomal RNA (LSU rDNA) of Glomus sp. DAOM-197198 (previously known as G. intraradices) and the POL1-like sequence (PLS) of Glomus etunicatum. For both genes, we find high sequence variation at the genome and transcriptome level. Reconstruction of LSU rDNA secondary structure shows that all variants are functional. Patterns of PLS sequence polymorphism indicate that there is one functional gene copy, PLS2, which is preferentially transcribed, and one gene copy, PLS1, which is a pseudogene. This is the first study that investigates AMF intra-isolate variation at the transcriptome level. In conclusion, it is possible that, in AMF, multiple nuclear genomes contribute to a single phenotype.
McWilliams, D; Callahan, R C; Boime, I
1977-01-01
A complementary DNA (cDNA) strand was transcribed from human placental lactogen (hPL) mRNA. Based on alkaline sucrose gradient centrifugation, the size of the cDNA was about 8 S, which would represent at least 80% of the hPL mRNA. Previously we showed that four to five times more hPL was synthesized in cell-free extracts derived from term as compared to first trimester placentas. Hybridization of the cDNA with RNA derived from placental tissue revealed about four times more hPL mRNA sequences in total RNA from term placenta than in a comparable quantity of total first trimester RNA. Only background hybridization was observed when the cDNA was incubated with RNA prepared from human kidney. To test if this differential accumulation of hPL mRNA was the result of an amplification of hPL genes, we hybridized the labeled cDNA with cellular DNA from first trimester and term placentas and with DNA isolated from human brain. In all cases, the amount of hPL sequences was approximately two copies per haploid genome. Thus, the enhanced synthesis of hPL mRNA appears to result from a transcriptional activation rather than an amplification of the hPL gene. The increase likely reflects placental differentiation in which the proportion of syncytial trophoblast increases at term. PMID:66681
Gil-Serna, Jessica; Vázquez, Covadonga; González-Jaén, María Teresa; Patiño, Belén
2015-12-02
Aspergillus steynii is probably the most relevant species of section Circumdati producing ochratoxin A (OTA). This mycotoxin contaminates a wide range of commodities and is highly toxic to humans and animals. Little is known about the biosynthetic genes and their regulation in Aspergillus species. In this work, we identified and analysed three contiguous genes in A. steynii using 5'-RACE and genome walking approaches, which predicted a cytochrome P450 monooxygenase (p450ste), a non-ribosomal peptide synthetase (nrpsste) and a polyketide synthase (pksste). These three genes were contiguous within a 20742 bp long genomic DNA fragment. Their corresponding cDNAs were sequenced and their expression was analysed in three A. steynii strains using specific real-time RT-PCR assays under permissive conditions in in vitro cultures. OTA was also analysed in these cultures. Comparative analyses of predicted genomic, cDNA and amino acid sequences were performed with sequences of similar gene functions. All the results obtained in these analyses were consistent, pointed to the involvement of these three genes in OTA biosynthesis by A. steynii and showed a co-ordinated expression pattern. This is the first time that a clustered organization of OTA biosynthetic genes has been reported in the genus Aspergillus. The results also suggested that this situation might be common in Aspergillus OTA-producing species and distinct from the one described for Penicillium species. Copyright © 2015 Elsevier B.V. All rights reserved.
copia-like retrotransposons are ubiquitous among plants.
Voytas, D F; Cummings, M P; Koniczny, A; Ausubel, F M; Rodermel, S R
1992-01-01
Transposable genetic elements are assumed to be a feature of all eukaryotic genomes. Their identification, however, has largely been haphazard, limited principally to organisms subjected to molecular or genetic scrutiny. We assessed the phylogenetic distribution of copia-like retrotransposons, a class of transposable element that proliferates by reverse transcription, using a polymerase chain reaction assay designed to detect copia-like element reverse transcriptase sequences. copia-like retrotransposons were identified in 64 plant species as well as the photosynthetic protist Volvox carteri. The plant species included representatives from 9 of 10 plant divisions, including bryophytes, lycopods, ferns, gymnosperms, and angiosperms. DNA sequence analysis of 29 cloned PCR products and of a maize retrotransposon cDNA confirmed the identity of these sequences as copia-like reverse transcriptase sequences, thereby demonstrating that this class of retrotransposons is a ubiquitous component of plant genomes. PMID:1379734
Structure of the horseradish peroxidase isozyme C genes.
Fujiyama, K; Takemura, H; Shibayama, S; Kobayashi, K; Choi, J K; Shinmyo, A; Takano, M; Yamada, Y; Okada, H
1988-05-02
We have isolated, cloned and characterized three cDNAs and two genomic DNAs corresponding to the mRNAs and genes for the horseradish (Armoracia rusticana) peroxidase isoenzyme C (HRP C). The amino acid sequence of HRP C1, deduced from the nucleotide sequence of one of the cDNA clones, pSK1, contained the same primary sequence as that of the purified enzyme established by Welinder [FEBS Lett. 72, 19-23 (1976)] with additional sequences at the N and C termini. All three inserts in the cDNA clones, pSK1, pSK2 and pSK3, coded for peptides of the same size (308 amino acid residues) if these are processed in the same way, and the amino acid sequences were 91-94% homologous to each other. Functional amino acids, including His40, His170, Tyr185 and Arg183 and S-S-bond-forming Cys, were conserved in the three isozymes, but a few N-glycosylation sites were not the same. Two HRP C isoenzyme genes, prxC1 and prxC2, were arranged in tandem on the chromosomal DNA and each gene consisted of four exons and three introns. The positions in the exons interrupted by introns were the same in the two genes. We observed a putative promoter sequence 5' upstream and a poly(A) signal 3' downstream in both genes. The gene product of prxC1 might be processed with a signal sequence of 30 amino acid residues at the N terminus and a peptide consisting of 15 amino acid residues at the C terminus.
Using complementary DNA from MyoD-transduced fibroblasts to sequence large muscle genes.
Waddell, Leigh B; Monnier, Nicole; Cooper, Sandra T; North, Kathryn N; Clarke, Nigel F
2011-08-01
Large muscle genes are often sequenced using complementary DNA (cDNA) made from muscle messenger RNA (mRNA) to reduce the cost and workload associated with sequencing from genomic DNA. Two potential barriers are the availability of a frozen muscle biopsy, and difficulties in detecting nonsense mutations due to nonsense-mediated mRNA decay (NMD). We present patient examples showing that use of MyoD-transduced fibroblasts as a source of muscle-specific mRNA overcomes these potential difficulties in sequencing large muscle-related genes. Copyright © 2011 Wiley Periodicals, Inc.
1997-07-01
minimum region of allelic loss on chromosome 17p 13.3, between polymorphic markers D17S5 and D17S28, in genomic DNA from breast and ovarian tumors (Figure 1...encode proteins of 443 and 227 amino acids, with no known functional motifs. Comparison of genomic and cDNA sequences showed that the genes overlap...is tissue specific (Figure 4). When zoo blots comprised of EcoRI fragments of genomic DNA from various species were probed with the unique exon 1 of
Bushakra, Jill M; Lewers, Kim S; Staton, Margaret E; Zhebentyayeva, Tetyana; Saski, Christopher A
2015-10-26
Due to a relatively high level of codominant inheritance and transferability within and among taxonomic groups, simple sequence repeat (SSR) markers are important elements in comparative mapping and delineation of genomic regions associated with traits of economic importance. Expressed sequence tags (ESTs) are a source of SSRs that can be used to develop markers to facilitate plant breeding and for more basic research across genera and higher plant orders. Leaf and meristem tissue from 'Heritage' red raspberry (Rubus idaeus) and 'Bristol' black raspberry (R. occidentalis) were utilized for RNA extraction. After conversion to cDNA and library construction, ESTs were sequenced, quality verified, assembled and scanned for SSRs. Primers flanking the SSRs were designed and a subset tested for amplification, polymorphism and transferability across species. ESTs containing SSRs were functionally annotated using the GenBank non-redundant (nr) database and further classified using the gene ontology database. To accelerate development of EST-SSRs in the genus Rubus (Rosaceae), 1149 and 2358 cDNA sequences were generated from red raspberry and black raspberry, respectively. The cDNA sequences were screened using rigorous filtering criteria which resulted in the identification of 121 and 257 SSR loci for red and black raspberry, respectively. Primers were designed from the surrounding sequences resulting in 131 and 288 primer pairs, respectively, as some sequences contained more than one SSR locus. Sequence analysis revealed that the SSR-containing genes span a diversity of functions and share more sequence identity with strawberry genes than with other Rosaceous species. This resource of Rubus-specific, gene-derived markers will facilitate the construction of linkage maps composed of transferable markers for studying and manipulating important traits in this economically important genus.
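The SSR-scanning step described in this abstract can be illustrated with a minimal Python sketch. The abstract does not state the authors' filtering criteria or software, so the function name `find_ssrs` and the minimum repeat counts in `MIN_REPEATS` are hypothetical choices made only for illustration; the sketch detects perfect di- to hexanucleotide repeats with a regular expression.

```python
import re

# Hypothetical thresholds (not from the abstract): minimum number of
# tandem copies required for each motif length.
MIN_REPEATS = {2: 6, 3: 5, 4: 4, 5: 4, 6: 4}

def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for each perfect SSR found in seq."""
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # ([ACGT]{k}) captures one motif; \1{n,} demands n or more extra copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are repeats of a shorter unit (e.g. AGAG is AG x 2),
            # so the same locus is not reported twice.
            redundant = any(
                motif == motif[:k] * (motif_len // k)
                for k in range(1, motif_len)
                if motif_len % k == 0
            )
            if not redundant:
                hits.append((match.start(), motif, len(match.group(0)) // motif_len))
    return hits

# Toy EST-like sequence containing an (AG)8 and a (CTT)5 repeat.
example = "GGC" + "AG" * 8 + "TTC" + "CTT" * 5 + "GA"
print(find_ssrs(example))
```

Primers would then be designed in the sequence flanking each reported start position; that step is not sketched here.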
Isolation and Characterization of the PKAr Gene From a Plant Pathogen, Curvularia lunata.
Liu, T; Ma, B C; Hou, J M; Zuo, Y H
2014-09-01
By using an EST database from a full-length cDNA library of Curvularia lunata, we have isolated a 2.9 kb cDNA, termed PKAr. An ORF of 1,383 bp encoding a polypeptide of 460 amino acids with a molecular weight of 50.1 kDa (GenBank Acc. No. KF675744) was cloned. The deduced amino acid sequence of PKAr shows 90 and 88 % identity with the cAMP-dependent protein kinase A regulatory subunit from Alternaria alternata and Pyrenophora tritici-repentis Pt-1C-BFP, respectively. Database analysis revealed that the deduced amino acid sequence of PKAr shares considerable similarity with that of PKA regulatory subunits in other organisms, particularly in the conserved regions. No introns were identified within the 1,383 bp ORF when compared with the PKAr genomic DNA sequence. Southern blot indicated that PKAr exists as a single copy per genome. The mRNA expression level of PKAr at different developmental stages was determined using real-time quantitative PCR. The results showed that the level of PKAr expression was highest in vegetatively growing mycelium, which indicates that it might play an important role in the vegetative growth of C. lunata. These results provide a foundation for further research on the function of PKAr in the plant pathogen C. lunata.
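The relationship between the reported ORF length and protein length can be checked with a short Python sketch. It assumes the 1,383 bp figure includes the stop codon, which the abstract does not state explicitly but which is consistent with the reported 460 residues; the function name and the rule-of-thumb average residue mass of about 110 Da are illustrative assumptions.

```python
def orf_to_protein_length(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of orf_bp nucleotides."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # The stop codon occupies one codon but encodes no residue.
    return codons - 1 if includes_stop else codons

residues = orf_to_protein_length(1383)   # 461 codons, one of them a stop
approx_mass_kda = residues * 110 / 1000  # rough estimate, ~110 Da per residue

print(residues)          # 460, matching the abstract
print(approx_mass_kda)   # about 50.6, close to the reported 50.1 kDa
```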
DOE Office of Scientific and Technical Information (OSTI.GOV)
Korenberg, J.R.
The ultimate goal of this research is to generate and apply novel technologies to speed completion and integration of the human genome map and sequence with biomedical problems. To do this, techniques were developed and genome-wide resources generated. This includes a genome-wide Mapped and Integrated BAC/PAC Resource that has been used for gene finding, map completion and anchoring, breakpoint definition and sequencing. In the last period of the grant, the Human Mapped BAC/PAC Resource was also applied to determine regions of human variation and to develop a novel paradigm of primate evolution through to humans. Further, in order to more rapidly evaluate animal models of human disease, a BAC Map of the mouse was generated in collaboration with the MTI Genome Center, Dr. Bruce Birren.
Tran, Thi Kim Anh; MacFarlane, Geoff R; Kong, Richard Yuen Chong; O'Connor, Wayne A; Yu, Richard Man Kit
2016-05-01
Marine molluscs, such as oysters, respond to estrogenic compounds with the induction of the egg yolk protein precursor, vitellogenin (Vtg), availing a biomarker for estrogenic pollution. Despite this application, the precise molecular mechanism through which estrogens exert their action to induce molluscan vitellogenesis is unknown. As a first step to address this question, we cloned a gene encoding Vtg from the Sydney rock oyster Saccostrea glomerata (sgVtg). Using primers designed from a partial sgVtg cDNA sequence available in Genbank, a full-length sgVtg cDNA of 8498bp was obtained by 5'- and 3'-RACE. The open reading frame (ORF) of sgVtg was determined to be 7980bp, which is substantially longer than the orthologs of other oyster species. Its deduced protein sequence shares the highest homology at the N- and C-terminal regions with other molluscan Vtgs. The full-length genomic DNA sequence of sgVtg was obtained by genomic PCR and genome walking targeting the gene body and flanking regions, respectively. The genomic sequence spans 20kb and consists of 30 exons and 29 introns. Computer analysis identified three closely spaced half-estrogen responsive elements (EREs) in the promoter region and a 210-bp CpG island 62bp downstream of the transcription start site. Upregulation of sgVtg mRNA expression was observed in the ovaries following in vitro (explants) and in vivo (tank) exposure to 17β-estradiol (E2). Notably, treatment with an estrogen receptor (ER) antagonist in vitro abolished the upregulation, suggesting a requirement for an estrogen-dependent receptor for transcriptional activation. DNA methylation of the 5' CpG island was analysed using bisulfite genomic sequencing of the in vivo exposed ovaries. The CpG island was found to be hypomethylated (with 0-3% methylcytosines) in both control and E2-exposed oysters. However, no significant differential methylation or any correlation between methylation and sgVtg expression levels was observed. 
Overall, the results support the possible involvement of an ERE-containing promoter and an estrogen-activated receptor in estrogen signalling in marine molluscs. Copyright © 2016 Elsevier B.V. All rights reserved.
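The abstract reports a 210-bp CpG island identified by computer analysis but does not say which criteria were applied. As an illustration only, the following Python sketch uses widely cited thresholds (length of at least 200 bp, GC fraction of at least 0.5, observed/expected CpG ratio of at least 0.6); the function names and default values are assumptions, not the authors' method.

```python
def cpg_stats(seq):
    """Return (GC fraction, observed/expected CpG ratio) for a DNA sequence."""
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c = seq.count("C")
    g = seq.count("G")
    observed = seq.count("CG")   # "CG" cannot overlap itself, so count() is exact
    expected = c * g / n         # CpG count expected from base composition alone
    ratio = observed / expected if expected else 0.0
    return (c + g) / n, ratio

def looks_like_cpg_island(seq, min_len=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply the assumed length, GC-content and obs/exp thresholds."""
    gc, ratio = cpg_stats(seq)
    return len(seq) >= min_len and gc >= min_gc and ratio >= min_obs_exp

# Toy 210-bp sequences: one CpG-rich, one AT-only.
print(looks_like_cpg_island("CG" * 105))
print(looks_like_cpg_island("AT" * 105))
```

In practice such a test is applied in a sliding window along the promoter and gene body rather than to a single pre-cut fragment.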
Establishing gene models from the Pinus pinaster genome using gene capture and BAC sequencing.
Seoane-Zonjic, Pedro; Cañas, Rafael A; Bautista, Rocío; Gómez-Maldonado, Josefa; Arrillaga, Isabel; Fernández-Pozo, Noé; Claros, M Gonzalo; Cánovas, Francisco M; Ávila, Concepción
2016-02-27
In the era of DNA throughput sequencing, assembling and understanding gymnosperm mega-genomes remains a challenge. Although drafts of three conifer genomes have recently been published, this number is too low to understand the full complexity of conifer genomes. Using techniques focused on specific genes, gene models can be established that can aid in the assembly of gene-rich regions, and this information can be used to compare genomes and understand functional evolution. In this study, gene capture technology combined with BAC isolation and sequencing was used as an experimental approach to establish de novo gene structures without a reference genome. Probes were designed for 866 maritime pine transcripts to sequence genes captured from genomic DNA. The gene models were constructed using GeneAssembler, a new bioinformatic pipeline, which reconstructed over 82% of the gene structures, and a high proportion (85%) of the captured gene models contained sequences from the promoter regulatory region. In a parallel experiment, the P. pinaster BAC library was screened to isolate clones containing genes whose cDNA sequence were already available. BAC clones containing the asparagine synthetase, sucrose synthase and xyloglucan endotransglycosylase gene sequences were isolated and used in this study. The gene models derived from the gene capture approach were compared with the genomic sequences derived from the BAC clones. This combined approach is a particularly efficient way to capture the genomic structures of gene families with a small number of members. The experimental approach used in this study is a valuable combined technique to study genomic gene structures in species for which a reference genome is unavailable. It can be used to establish exon/intron boundaries in unknown gene structures, to reconstruct incomplete genes and to obtain promoter sequences that can be used for transcriptional studies. 
A bioinformatics algorithm (GeneAssembler) is also provided as a Ruby gem for this class of analyses.
Venuti, A; Di Russo, C; del Grosso, N; Patti, A M; Ruggeri, F; De Stasio, P R; Martiniello, M G; Pagnotti, P; Degener, A M; Midulla, M
1985-01-01
A fast-growing strain of human hepatitis A virus was selected and characterized. The virus has the unusual property of developing a strong cytopathic effect in tissue culture in 7 to 10 days. Sequences of the viral genome were cloned into recombinant plasmids with the double-stranded replicative form as a template for the reverse transcription of cDNA. Restriction analysis and direct sequencing indicate that this strain is different from that described by Ticehurst et al. (Proc. Natl. Acad. Sci. USA 80:5885-5889, 1983) in the region that presumptively codes for the major capsid protein VP1, but both isolates have conserved large areas of homology in the untranslated 5'-terminal sequences of the genome. PMID:2997478
USDA-ARS's Scientific Manuscript database
Myostatin, a member of TGF-beta superfamily, is a dominant inhibitor of skeletal muscle development and growth. Previously, skeletal muscle-specific over-expression of myostatin prodomain cDNA (5’-region 886 nucleotide) dramatically increased growth performance and muscle mass in transgenic mice. I...
USDA-ARS's Scientific Manuscript database
Genomic and cDNA sequences corresponding to a ferredoxin-sulfite reductase (SiR) have been cloned from bulb onion (Allium cepa L.) and the expression of the gene and activity of the enzyme characterised with respect to sulfur (S) supply. Cloning, mapping and expression studies revealed that onion ha...
A Rapid Method for Engineering Recombinant Polioviruses or Other Enteroviruses.
Bessaud, Maël; Pelletier, Isabelle; Blondel, Bruno; Delpeyroux, Francis
2016-01-01
The cloning of large enterovirus RNA sequences is labor-intensive because plasmid vectors containing the corresponding cDNAs are frequently unstable in bacteria. In order to circumvent this issue we have developed a PCR-based method that allows the generation of highly modified or chimeric full-length enterovirus genomes. This method relies on fusion PCR, which enables the concatenation of several overlapping cDNA amplicons produced separately. A T7 promoter sequence added upstream of the fusion PCR products allows their transcription into infectious genomic RNAs directly in transfected cells constitutively expressing the phage T7 RNA polymerase. This method permits the rapid recovery of modified viruses that can be subsequently amplified on adequate cell lines.
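The concatenation logic behind fusion PCR can be sketched in silico with Python. This is only a conceptual model of joining overlapping amplicons and prepending a promoter; the function names, the 15-nt minimum overlap and the toy sequences are hypothetical, and the promoter string is the commonly cited T7 consensus rather than the primer sequence used by the authors, which the abstract does not give.

```python
# Commonly cited T7 promoter consensus (assumption: not taken from the abstract).
T7_PROMOTER = "TAATACGACTCACTATAG"

def fuse(left, right, min_overlap=15):
    """Join two amplicons that share an identical end/start overlap."""
    longest = min(len(left), len(right))
    for k in range(longest, min_overlap - 1, -1):
        if left.endswith(right[:k]):
            return left + right[k:]   # keep the shared region only once
    raise ValueError("amplicons do not overlap by at least %d nt" % min_overlap)

def build_template(amplicons, promoter=T7_PROMOTER):
    """Fuse amplicons in order and place the promoter upstream of the product."""
    fused = amplicons[0]
    for amplicon in amplicons[1:]:
        fused = fuse(fused, amplicon)
    return promoter + fused

# Toy amplicons sharing an 18-nt overlap.
a = "AAAACCCC" + "GGGGTTTTACGTACGTAC"
b = "GGGGTTTTACGTACGTAC" + "TTTGGGCCC"
print(build_template([a, b]))
```

The overlap requirement mirrors the design constraint of the wet-lab method: adjacent amplicons must share enough identical sequence to prime on each other during the fusion step.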
Wang, Li Ke; Niu, Xiao Wei; Lv, Yan Hui; Zhang, Tian Zhen; Guo, Wang Zhen
2010-10-01
Annexins constitute a family of multifunction and structurally related proteins. These proteins are ubiquitous in the plant kingdom, and are important calcium-dependent membrane-binding proteins that participate in the polar development of different plant regions such as rhizoids, root caps, and pollen tube tips. In this study, a novel cotton annexin gene (designated as GhFAnnx) was isolated from a fiber cDNA library of cotton (Gossypium hirsutum). The full-length cDNA of GhFAnnx comprises an open reading frame of 945 bp that encodes a 314-amino acid protein with a calculated molecular mass of 35.7 kDa and an isoelectric point of 6.49. Genomic GhFAnnx sequences from different cotton species, TM-1, Hai7124 and two diploid progenitor cottons, G. herbaceum (A-genome) and G. raimondii (D-genome) showed that at least two copies of the GhFAnnx gene, each with six exons and five introns in the coding region, were identified in the allotetraploid cotton genome. The GhFAnnx gene cloned from the cDNA library in this study was mapped to the chromosome 10 of the A-subgenome of the tetraploid cotton. Sequence alignment revealed that GhFAnnx contained four repeats of 70 amino acids. Semi-quantitative reverse transcriptase-polymerase chain reaction revealed that GhFAnnx is preferentially expressed in different developmental fibers but its expression is low in roots, stems, and leaves. Subcellular localization of GhFAnnx in onion epidermal cells and cotton fibers suggests that this protein is ubiquitous in the epidermal cells of onion, but assembles at the edge and the inner side of the apex of the cotton fiber tips with brilliant spots. In summary, GhFAnnx influences fiber development and is associated with the polar expansion of the cotton fiber during elongation stages.
Chung, F Z; Lentes, K U; Gocayne, J; Fitzgerald, M; Robinson, D; Kerlavage, A R; Fraser, C M; Venter, J C
1987-01-26
Two cDNA clones, lambda-CLFV-108 and lambda-CLFV-119, encoding the beta-adrenergic receptor, have been isolated from a human brain stem cDNA library. One human genomic clone, LCV-517 (20 kb), was characterized by restriction mapping and partial sequencing. The human brain beta-receptor consists of 413 amino acids with a calculated Mr of 46480. The gene contains three potential glucocorticoid receptor-binding sites. The beta-receptor expressed in human brain was homologous with rodent (88%) and avian (52%) beta-receptors and with porcine muscarinic cholinergic receptors (31%), supporting our proposal [(1984) Proc. Natl. Acad. Sci. USA 81, 272-276] that adrenergic and muscarinic cholinergic receptors are structurally related. This represents the first cloning of a neurotransmitter receptor gene from human brain.
Dores, Robert M
2016-01-01
The evolution of the melanocortin receptors (MCRs) is closely associated with the evolution of the melanocortin-2 receptor accessory proteins (MRAPs). Recent annotation of the elephant shark genome project revealed the sequence of a putative MRAP1 ortholog. The presence of this sequence in the genome of a cartilaginous fish raises the possibility that the mrap1 and mrap2 genes in the genomes of gnathostome vertebrates were the result of the chordate 2R genome duplication event. The presence of a putative MRAP1 ortholog in a cartilaginous fish genome is perplexing. Recent studies on melanocortin-2 receptor (MC2R) in the genomes of the elephant shark and the Japanese stingray indicate that these MC2R orthologs can be functionally expressed in CHO cells without co-expression of an exogenous mrap1 cDNA. The novel ligand selectivity of these cartilaginous fish MC2R orthologs is discussed. Finally, the origin of the mc2r and mc5r genes is reevaluated. The distinctive primary sequence conservation of MC2R and MC5R is discussed in light of the physiological roles of these two MCR paralogs.
Display of a maize cDNA library on baculovirus infected insect cells.
Meller Harel, Helene Y; Fontaine, Veronique; Chen, Hongying; Jones, Ian M; Millner, Paul A
2008-08-12
Maize is a good model system for cereal crop genetics and development because of its rich genetic heritage and well-characterized morphology. The sequencing of its genome is well advanced, and new technologies for efficient proteomic analysis are needed. Baculovirus expression systems have been used for the last twenty years to express in insect cells a wide variety of eukaryotic proteins that require complex folding or extensive posttranslational modification. More recently, baculovirus display technologies based on the expression of foreign sequences on the surface of Autographa californica (AcMNPV) have been developed. We investigated the potential of a display methodology for a cDNA library of maize young seedlings. We constructed a full-length cDNA library of young maize etiolated seedlings in the transfer vector pAcTMVSVG. The library contained a total of 2.5 x 10(5) independent clones. Expression of two known maize proteins, calreticulin and auxin binding protein (ABP1), was shown by western blot analysis of protein extracts from insect cells infected with the cDNA library. Display of the two proteins in infected insect cells was shown by selective biopanning using magnetic cell sorting and demonstrated proof of concept that the baculovirus maize cDNA display library could be used to identify and isolate proteins. The maize cDNA library constructed in this study relies on the novel technology of baculovirus display and is unique in currently published cDNA libraries. Produced to demonstrate proof of principle, it opens the way for the development of a eukaryotic in vivo display tool which would be ideally suited for rapid screening of the maize proteome for binding partners, such as proteins involved in hormone regulation or defence.
2010-01-01
Background Salmonids are one of the most intensely studied fish, in part due to their economic and environmental importance, and in part due to a recent whole genome duplication in the common ancestor of salmonids. This duplication greatly impacts species diversification, functional specialization, and adaptation. Extensive new genomic resources have recently become available for Atlantic salmon (Salmo salar), but documentation of allelic versus duplicate reference genes remains a major uncertainty in the complete characterization of its genome and its evolution. Results From existing expressed sequence tag (EST) resources and three new full-length cDNA libraries, 9,057 reference quality full-length gene insert clones were identified for Atlantic salmon. A further 1,365 reference full-length clones were annotated from 29,221 northern pike (Esox lucius) ESTs. Pairwise dN/dS comparisons within each of 408 sets of duplicated salmon genes using northern pike as a diploid out-group show asymmetric relaxation of selection on salmon duplicates. Conclusions 9,057 full-length reference genes were characterized in S. salar and can be used to identify alleles and gene family members. Comparisons of duplicated genes show that while purifying selection is the predominant force acting on both duplicates, consistent with retention of functionality in both copies, some relaxation of pressure on gene duplicates can be identified. In addition, there is evidence that evolution has acted asymmetrically on paralogs, allowing one of the pair to diverge at a faster rate. PMID:20433749
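The interpretation of pairwise dN/dS values used in this abstract can be made concrete with a small Python sketch. It does not estimate dN or dS from alignments; it only classifies a given ratio and quantifies asymmetry between two duplicates. The function names, the tolerance around 1 and the example values are hypothetical and are not taken from the study.

```python
def classify_omega(dn, ds, tol=0.1):
    """Return (omega, label) where omega = dN/dS."""
    if ds <= 0:
        raise ValueError("dS must be positive to compute dN/dS")
    omega = dn / ds
    if omega < 1 - tol:
        label = "purifying selection"
    elif omega > 1 + tol:
        label = "positive selection"
    else:
        label = "near-neutral"
    return omega, label

def asymmetry(omega_a, omega_b):
    """Ratio of the faster-evolving duplicate's omega to the slower one's."""
    slow, fast = sorted((omega_a, omega_b))
    if slow <= 0:
        raise ValueError("omega values must be positive")
    return fast / slow

# Hypothetical duplicate pair, each compared against a diploid out-group.
omega_1, label_1 = classify_omega(0.02, 0.20)
omega_2, label_2 = classify_omega(0.08, 0.20)
print(label_1, label_2, asymmetry(omega_1, omega_2))
```

In this toy case both copies remain under purifying selection (omega below 1) while one copy evolves several times faster than the other, which is the pattern the abstract describes as asymmetric relaxation.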
DOE Office of Scientific and Technical Information (OSTI.GOV)
Geraghty, M.T.; Stetten, G.; Kearns, W.
1994-09-01
X-linked adrenoleukodystrophy (ALD) is a disorder of peroxisomal β-oxidation of very long chain fatty acids. It presents either as progressive dementia in childhood or as progressive paraparesis in later years. Adrenal insufficiency occurs in both phenotypes. The gene of the ALD protein has been mapped to Xq28 and has recently been cloned and characterized. The ALD protein has significant homology to the peroxisomal membrane protein, PMP70 and belongs to the ATP binding cassette superfamily of transporters. We screened a human genomic library with an ALDP cDNA and isolated 5 different but highly similar clones containing sequences corresponding to the 3′ end of the ALDP gene. Comparison of the sequences over the region corresponding to exon 9 through the 3′ end of the ALDP gene reveals ~96% nucleotide identity in both exonic and intronic regions. Splice sites and open reading frames are maintained. Using both FISH and human-rodent DNA mapping panels, we positively assign these ALDP-related sequences to chromosomes 2, 16 and 22, and provisionally to 1 and 20. Southern blot of primate DNA probed with a partial ALDP cDNA (exon 2-10) shows that expansion of ALDP-related sequences occurred in higher primates (chimp, gorilla and human). Although Northern blots show multiple ALDP-hybridizing transcripts in certain tissues, we have no evidence to date for expression of these ALDP-related sequences. In conclusion, our data show there has been an unusual and recent dispersal to multiple chromosomes of structural gene sequences related to the ALDP gene. The functional significance of these sequences remains to be determined but their existence complicates PCR and mutation analysis of the ALDP gene.
Yao, Q; Fischer, K P; Tyrrell, D L; Gutfreund, K S
2015-04-01
Programmed death ligand-1 (PD-L1) plays an important role in the attenuation of adaptive immune responses in higher vertebrates. Here, we describe the identification of the Pekin duck PD-L1 orthologue (duPD-L1) and its gene structure. The duPD-L1 cDNA encodes a 311-amino acid protein that has an amino acid identity of 78% and 42% with chicken and human PD-L1, respectively. Mapping of the duPD-L1 cDNA with duck genomic sequences revealed an exonic structure of its coding sequence similar to those of other vertebrates but lacked a noncoding exon 1. Homology modelling of the duPD-L1 extracellular domain was compatible with the tandem IgV-like and IgC-like IgSF domain structure of human PD-L1 (PDB ID: 3BIS). Residues known to be important for receptor binding of human PD-L1 were mostly conserved in duPD-L1 within the N-terminus and the G sheet, and partially conserved within the F sheet but not within sheets C and C'. DuPD-L1 mRNA was constitutively expressed in all tissues examined with highest expression levels in lung and spleen and very low levels of expression in muscle, kidney and brain. Mitogen stimulation of duck peripheral blood mononuclear cells transiently increased duPD-L1 mRNA expression. Our observations demonstrate evolutionary conservation of the exonic structure of its coding sequence, the extracellular domain structure and residues implicated in receptor binding, but the role of the longer cytoplasmic tail in avian PD-L1 proteins remains to be determined. © 2014 John Wiley & Sons Ltd.
1989-04-01
strain-specific identification of HAV in human fecal samples was a major aim of the original contract application, as clinical trials of live and...derived materials and human and primate fecal specimens. 4. We molecularly cloned and partially sequenced the genome of PA21 strain HAV, a virus...antibody. This approach revealed that 99% of the infectious virus particles present in disrupted cell lysates from the 23rd passage of persistently
Huang, Wei; Zhang, Jianshe; Liao, Zhi; Lv, Zhenming; Wu, Huifei; Zhu, Aiyi; Wu, Changwen
2016-01-15
Gonadotropin-releasing hormone III (GnRH3) is considered to be a key neurohormone in fish reproduction control. In the present study, the cDNA and genomic sequences of GnRH3 were cloned and characterized from large yellow croaker Larimichthys crocea. The cDNA encoded a protein of 99 amino acids with four functional motifs. The full-length genome sequence was composed of 3797 nucleotides, including four exons and three introns. Higher identities of amino acid sequences and conserved exon-intron organizations were found between LcGnRH3 and other GnRH3 genes. In addition, some special features of the sequences were detected in partial species. For example, two specific residues (V and A) were found in the family Sciaenidae, and the unique 75-72 bp type of the open reading frame 2 and 3 existed in the family Cyprinidae. Analysis of the 2576 bp promoter fragment of LcGnRH3 showed a number of transcription factor binding sites, such as AP1, CREB, GATA-1, HSF, FOXA2, and FOXL1. Promoter functional analysis using an EGFP reporter fusion in zebrafish larvae presented positive signals in the brain, including the olfactory region, the terminal nerve ganglion, the telencephalon, and the hypothalamus. The expression pattern was generally consistent with the endogenous GnRH3 GFP-expressing transgenic zebrafish lines, but the details were different. These results indicate that the structure and function of LcGnRH3 are generally similar to the other teleost GnRH3 genes, but there exist some distinctions among them. Copyright © 2015 Elsevier B.V. All rights reserved.
Booman, Marije; Borza, Tudor; Feng, Charles Y; Hori, Tiago S; Higgins, Brent; Culf, Adrian; Léger, Daniel; Chute, Ian C; Belkaid, Anissa; Rise, Marlies; Gamperl, A Kurt; Hubert, Sophie; Kimball, Jennifer; Ouellette, Rodney J; Johnson, Stewart C; Bowman, Sharen; Rise, Matthew L
2011-08-01
The collapse of Atlantic cod (Gadus morhua) wild populations strongly impacted the Atlantic cod fishery and led to the development of cod aquaculture. In order to improve aquaculture and broodstock quality, we need to gain knowledge of genes and pathways involved in Atlantic cod responses to pathogens and other stressors. The Atlantic Cod Genomics and Broodstock Development Project has generated over 150,000 expressed sequence tags from 42 cDNA libraries representing various tissues, developmental stages, and stimuli. We used this resource to develop an Atlantic cod oligonucleotide microarray containing 20,000 unique probes. Selection of sequences from the full range of cDNA libraries enables application of the microarray for a broad spectrum of Atlantic cod functional genomics studies. We included sequences that were highly abundant in suppression subtractive hybridization (SSH) libraries, which were enriched for transcripts responsive to pathogens or other stressors. These sequences represent genes that potentially play an important role in stress and/or immune responses, making the microarray particularly useful for studies of Atlantic cod gene expression responses to immune stimuli and other stressors. To demonstrate its value, we used the microarray to analyze the Atlantic cod spleen response to stimulation with formalin-killed, atypical Aeromonas salmonicida, resulting in a gene expression profile that indicates a strong innate immune response. These results were further validated by quantitative PCR analysis and comparison to results from previous analysis of an SSH library. This study shows that the Atlantic cod 20K oligonucleotide microarray is a valuable new tool for Atlantic cod functional genomics research.
Isolation of candidate genes of Friedreich's ataxia on chromosome 9q13
DOE Office of Scientific and Technical Information (OSTI.GOV)
Montermini, L.; Zara, F.; Pandolfo, M.
1994-09-01
Friedreich's ataxia (FRDA) is an autosomal recessive degenerative disease involving the central and peripheral nervous system and the heart. The mutated gene in FRDA has recently been localized within a 450 Kb interval on chromosome 9q13 between the markers D9S202/FR1/FR8. We have been able to confirm such localization for the disease gene by analysis of extended haplotype in consanguineous families. Cases of loss of marker homozygosity, which are likely to be due to ancient recombinations, have been found to involve D9S110, D9S15, and D9S111 on the telomeric side, and FR5 on the centromeric side, while homozygosity was always found for a core haplotype including D9S5, FD1, and D9S202. We constructed a YAC contig spanning the region between the telomeric markers and FR5, and cosmids have been obtained from the YACs. In order to isolate transcribed sequences from the FRDA candidate region we are utilizing a combination of approaches, including hybridization of YACs and cosmids to an arrayed human heart cDNA library, cDNA direct selection, and exon amplification. A transcribed sequence near the telomeric end of the region has been isolated by cDNA direct selection using pooled cosmids as genomic template and primary human heart, muscle, brain, liver and placenta cDNAs as cDNA source. We have shown this sequence to be the human equivalent of ZO-2, a tight junction protein previously described in the dog. No mutations of this gene have been found in FRDA subjects. Additional cDNAs have recently been isolated and they are currently being evaluated.
RNA circularization reveals terminal sequence heterogeneity in a double-stranded RNA virus.
Widmer, G
1993-03-01
Double-stranded RNA viruses (dsRNA), termed LRV1, have been found in several strains of the protozoan parasite Leishmania. With the aim of constructing a full-length cDNA copy of the viral genome, including its terminal sequences, a protocol based on PCR amplification across the 3'-5' junction of circularized RNA was developed. This method proved to be applicable to dsRNA. It provided a relatively simple alternative to one-sided PCR, without loss of specificity inherent in the use of generic primers. LRV1 terminal nucleotide sequences obtained by this method showed a considerable variation in length, particularly at the 5' end of the positive strand, as well as the potential for forming 3' overhangs. The opposite genomic end terminates in 0, 1, or 2 TCA trinucleotide repeats. These results are compared with terminal sequences derived from one-sided PCR experiments.
Pollier, Jacob; González-Guzmán, Miguel; Ardiles-Diaz, Wilson; Geelen, Danny; Goossens, Alain
2011-01-01
cDNA-Amplified Fragment Length Polymorphism (cDNA-AFLP) is a commonly used technique for genome-wide expression analysis that does not require prior sequence knowledge. Typically, quantitative expression data and sequence information are obtained for a large number of differentially expressed gene tags. However, most of the gene tags do not correspond to full-length (FL) coding sequences, which is a prerequisite for subsequent functional analysis. A medium-throughput screening strategy, based on integration of polymerase chain reaction (PCR) and colony hybridization, was developed that allows in parallel screening of a cDNA library for FL clones corresponding to incomplete cDNAs. The method was applied to screen for the FL open reading frames of a selection of 163 cDNA-AFLP tags from three different medicinal plants, leading to the identification of 109 (67%) FL clones. Furthermore, the protocol allows for the use of multiple probes in a single hybridization event, thus significantly increasing the throughput when screening for rare transcripts. The presented strategy offers an efficient method for the conversion of incomplete expressed sequence tags (ESTs), such as cDNA-AFLP tags, to FL-coding sequences.
Nagahashi, S; Endoh, H; Suzuki, Y; Okada, N
1991-11-20
A previous report from this laboratory showed that in vitro transcription of total genomic DNA of the newt Cynopus pyrrhogaster resulted in a discrete sized 8 S RNA, which represented highly repetitive and transcribable sequences with a glutamic acid tRNA-like structure in the newt genome. We isolated four independent clones from a newt genomic library and determined the complete sequences of three 2000 to 2400 base-pair PstI fragments spanning the 8 S RNA gene. The glutamic acid tRNA-related segment in the 8 S RNA gene contains the CCA sequence expected as the 3' terminus of a tRNA molecule. Further, the 11 nucleotides located 13 nucleotides upstream from one of the two transcription initiation sites of the 8 S RNA were found to be repeated in the region upstream from the termination site, suggesting that the original unit, which is shorter than the 8 S RNA, was retrotransposed via cDNA intermediates from the PolIII transcript. In the upstream region of the 8 S RNA gene, a 360 nucleotide unit containing the glutamic acid tRNA-related segment was found to be duplicated (clones NE1 and NE10) or triplicated (clone NE3). Except for the difference in the number of the 360 nucleotide unit, the three sequences of the 2000 to 2400 base-pair PstI fragment were essentially the same with only a few mutations and minor deletions. Inverse polymerase chain reaction and sequence determination of the products, together with a Southern hybridization experiment, demonstrated that the family consists of a tandemly repeated unit of 3300, 3700 or 4100 base-pairs. Thus during evolution, this family in the newt was created by retroposition via cDNA intermediates, followed by duplication or triplication of the 360 nucleotide unit and multiplication of the 3300 to 4100 base-pair region at the DNA level.
2010-01-01
Background Identifying associations between genotypes and gene expression levels using microarrays has enabled systematic interrogation of regulatory variation underlying complex phenotypes. This approach has vast potential for functional characterization of disease states, but its prohibitive cost, given hundreds to thousands of individual samples from populations have to be genotyped and expression profiled, has limited its widespread application. Results Here we demonstrate that genomic regions with allele-specific expression (ASE) detected by sequencing cDNA are highly enriched for cis-acting expression quantitative trait loci (cis-eQTL) identified by profiling of 500 animals in parallel, with up to 90% agreement on the allele that is preferentially expressed. We also observed widespread noncoding and antisense ASE and identified several allele-specific alternative splicing variants. Conclusion Monitoring ASE by sequencing cDNA from as little as one sample is a practical alternative to expression genetics for mapping cis-acting variation that regulates RNA transcription and processing. PMID:20707912
2011-01-01
Background Many plants have large and complex genomes with an abundance of repeated sequences. Many plants are also polyploid. Both of these attributes typify the genome architecture in the tribe Triticeae, whose members include economically important wheat, rye and barley. Large genome sizes, an abundance of repeated sequences, and polyploidy present challenges to genome-wide SNP discovery using next-generation sequencing (NGS) of total genomic DNA by making alignment and clustering of short reads generated by the NGS platforms difficult, particularly in the absence of a reference genome sequence. Results An annotation-based, genome-wide SNP discovery pipeline is reported using NGS data for large and complex genomes without a reference genome sequence. Roche 454 shotgun reads with low genome coverage of one genotype are annotated in order to distinguish single-copy sequences and repeat junctions from repetitive sequences and sequences shared by paralogous genes. Multiple genome equivalents of shotgun reads of another genotype generated with SOLiD or Solexa are then mapped to the annotated Roche 454 reads to identify putative SNPs. A pipeline program package, AGSNP, was developed and used for genome-wide SNP discovery in Aegilops tauschii-the diploid source of the wheat D genome, and with a genome size of 4.02 Gb, of which 90% is repetitive sequences. Genomic DNA of Ae. tauschii accession AL8/78 was sequenced with the Roche 454 NGS platform. Genomic DNA and cDNA of Ae. tauschii accession AS75 was sequenced primarily with SOLiD, although some Solexa and Roche 454 genomic sequences were also generated. A total of 195,631 putative SNPs were discovered in gene sequences, 155,580 putative SNPs were discovered in uncharacterized single-copy regions, and another 145,907 putative SNPs were discovered in repeat junctions. These SNPs were dispersed across the entire Ae. tauschii genome. 
To assess the false positive SNP discovery rate, DNA containing putative SNPs was amplified by PCR from AL8/78 and AS75 and resequenced with the ABI 3730 xl. In a sample of 302 randomly selected putative SNPs, 84.0% in gene regions, 88.0% in repeat junctions, and 81.3% in uncharacterized regions were validated. Conclusion An annotation-based genome-wide SNP discovery pipeline for NGS platforms was developed. The pipeline is suitable for SNP discovery in genomic libraries of complex genomes and does not require a reference genome sequence. The pipeline is applicable to all current NGS platforms, provided that at least one such platform generates relatively long reads. The pipeline package, AGSNP, and the discovered 497,118 Ae. tauschii SNPs can be accessed at (http://avena.pw.usda.gov/wheatD/agsnp.shtml). PMID:21266061
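The per-class counts and validation rates reported in this abstract can be cross-checked with a few lines of arithmetic. The following is a minimal Python sketch; the dictionary keys are labels introduced here for illustration, and treating the apparent false-positive rate as the simple complement of the validation rate is an assumption, not a method stated by the authors.

```python
# Putative SNP counts per annotation class, as reported in the abstract.
snp_counts = {
    "gene_sequences": 195_631,
    "uncharacterized_single_copy": 155_580,
    "repeat_junctions": 145_907,
}

# Validation rates (percent) from resequencing the sampled putative SNPs.
validation_rate = {
    "gene_sequences": 84.0,
    "repeat_junctions": 88.0,
    "uncharacterized_single_copy": 81.3,
}

# The three classes should add up to the total quoted in the Conclusion.
total_snps = sum(snp_counts.values())

# Assumption: apparent false-positive rate = 100% minus the validation rate.
false_positive_rate = {
    name: round(100.0 - rate, 1) for name, rate in validation_rate.items()
}
```

The sum reproduces the 497,118 SNPs quoted in the Conclusion, which suggests the three classes are non-overlapping.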
Isolation and characterization of a water stress-specific genomic gene, pwsi 18, from rice.
Joshee, N; Kisaka, H; Kitagawa, Y
1998-01-01
One of the water stress-specific cDNA clones of rice characterised previously, wsi18, was selected for further study. The wsi18 gene can be induced by water stress conditions such as mannitol, NaCl, and dryness, but not by ABA, cold, or heat. A genomic clone for wsi18, pwsi18, contained about 1.7 kbp of the 5' upstream sequence, two introns, and the full coding sequence. The 5'-upstream sequence of pwsi18 contained putative cis-acting elements, namely an ABA-responsive element (ABRE), three G-boxes, three E-boxes, a MEF-2 sequence, four direct and two inverted repeats, and four sequences similar to DRE, which is involved in the dehydration response of Arabidopsis genes. The gusA reporter gene under the control of the pwsi18 promoter showed transient expression in response to water stress. Deletion of the downstream DRE-like sequence between the distal G-boxes-2 and -3 resulted in rather low GUS expression.
Zhang, Xiao-Yan; Dong, Shu-Wei; Xiang, Hai-Ying; Chen, Xiang-Ru; Li, Da-Wei; Yu, Jia-Lin; Han, Cheng-Gui
2015-02-02
Brassica yellows virus is a newly identified species in the genus Polerovirus within the family Luteoviridae. Brassica yellows virus (BrYV), which is prevalently distributed throughout Mainland China and South Korea, is an important virus infecting cruciferous crops. Based on six BrYV genomic sequences of isolates from oilseed rape, rutabaga, radish, and cabbage, three genotypes, BrYV-A, BrYV-B, and BrYV-C, exist, which mainly differ in the 5' terminal half of the genome. BrYV is an aphid-transmitted and phloem-limited virus. The use of infectious cDNA clones is an alternative means of infecting plants that allows reverse genetic studies to be performed. In this study, full-length cDNA clones of BrYV-A, recombinant BrYV-5B3A, and BrYV-C were constructed under control of the cauliflower mosaic virus 35S promoter. An agrobacterium-mediated inoculation system of Nicotiana benthamiana was developed using these cDNA clones. Three days after infiltration with full-length BrYV cDNA clones, necrotic symptoms were observed in the inoculated leaves of N. benthamiana; however, no obvious symptoms appeared in the upper leaves. Reverse transcription-PCR (RT-PCR) and western blot detection of samples from the upper leaves showed that the maximum infection efficiency of BrYVs could reach 100%. The infectivity of the BrYV-A, BrYV-5B3A, and BrYV-C cDNA clones was further confirmed by northern hybridization. The system developed here will be useful for further studies of BrYV, such as host range, pathogenicity, viral gene functions, and plant-virus-vector interactions, and especially for discerning the differences among the three genotypes. Copyright © 2014 Elsevier B.V. All rights reserved.
A genome-wide 20 K citrus microarray for gene expression analysis
Martinez-Godoy, M Angeles; Mauri, Nuria; Juarez, Jose; Marques, M Carmen; Santiago, Julia; Forment, Javier; Gadea, Jose
2008-01-01
Background Understanding of genetic elements that contribute to key aspects of citrus biology will impact future improvements in this economically important crop. Global gene expression analysis demands microarray platforms with a high genome coverage. In recent years, genome-wide EST collections have been generated in citrus, opening the possibility to create new tools for functional genomics in this crop plant. Results We have designed and constructed a publicly available genome-wide cDNA microarray that includes 21,081 putative unigenes of citrus. As a functional companion to the microarray, a web-browsable database [1] was created and populated with information about the unigenes represented in the microarray, including cDNA libraries, isolated clones, raw and processed nucleotide and protein sequences, and results of all the structural and functional annotation of the unigenes, like general description, BLAST hits, putative Arabidopsis orthologs, microsatellites, putative SNPs, GO classification and PFAM domains. We have performed a Gene Ontology comparison with the full set of Arabidopsis proteins to estimate the genome coverage of the microarray. We have also performed microarray hybridizations to check its usability. Conclusion This new cDNA microarray replaces the first 7K microarray generated two years ago and allows gene expression analysis at a more global scale. We have followed a rational design to minimize cross-hybridization while maintaining its utility for different citrus species. Furthermore, we also provide access to a website with full structural and functional annotation of the unigenes represented in the microarray, along with the ability to use this site to directly perform gene expression analysis using standard tools at different publicly available servers. 
Furthermore, we show how this microarray offers a good representation of the citrus genome and present the usefulness of this genomic tool for global studies in citrus by using it to catalogue genes expressed in citrus globular embryos. PMID:18598343
Noh, Ju Young; Patnaik, Bharat Bhusan; Tindwa, Hamisi; Seo, Gi Won; Kim, Dong Hyun; Patnaik, Hongray Howrelia; Jo, Yong Hun; Lee, Yong Seok; Lee, Bok Luel; Kim, Nam Jung; Han, Yeon Soo
2014-01-25
Apolipophorin III (apoLp-III) is a well-known hemolymph protein having a functional role in lipid transport and immune response of insects. We cloned full-length cDNA encoding putative apoLp-III from larvae of the coleopteran beetle, Tenebrio molitor (TmapoLp-III), by identification of clones corresponding to the partial sequence of TmapoLp-III, subsequently followed with full length sequencing by a clone-by-clone primer walking method. The complete cDNA consists of 890 nucleotides, including an ORF encoding 196 amino acid residues. Excluding a putative signal peptide of the first 20 amino acid residues, the 176-residue mature apoLp-III has a calculated molecular mass of 19,146Da. Genomic sequence analysis with respect to its cDNA showed that TmapoLp-III was organized into four exons interrupted by three introns. Several immune-related transcription factor binding sites were discovered in the putative 5'-flanking region. BLAST and phylogenetic analyses reveal that TmapoLp-III has high sequence identity (88%) with Tribolium castaneum apoLp-III but shares little sequence homologies (<26%) with other apoLp-IIIs. Homology modeling of Tm apoLp-III shows a bundle of five amphipathic alpha helices, including a short helix 3'. The 'helix-short helix-helix' motif was predicted to be implicated in lipid binding interactions, through reversible conformational changes and accommodating the hydrophobic residues to the exterior for stability. Highest level of TmapoLp-III mRNA was detected at late pupal stages, albeit it is expressed in the larval and adult stages at lower levels. The tissue specific expression of the transcripts showed significantly higher numbers in larval fat body and adult integument. In addition, TmapoLp-III mRNA was found to be highly upregulated in late stages of L. monocytogenes or E. coli challenge. These results indicate that TmapoLp-III may play an important role in innate immune responses against bacterial pathogens in T. molitor. 
Copyright © 2013 Elsevier B.V. All rights reserved.
Automated multiplex genome-scale engineering in yeast
Si, Tong; Chao, Ran; Min, Yuhao; Wu, Yuying; Ren, Wen; Zhao, Huimin
2017-01-01
Genome-scale engineering is indispensable in understanding and engineering microorganisms, but the current tools are mainly limited to bacterial systems. Here we report an automated platform for multiplex genome-scale engineering in Saccharomyces cerevisiae, an important eukaryotic model and widely used microbial cell factory. Standardized genetic parts encoding overexpression and knockdown mutations of >90% yeast genes are created in a single step from a full-length cDNA library. With the aid of CRISPR-Cas, these genetic parts are iteratively integrated into the repetitive genomic sequences in a modular manner using robotic automation. This system allows functional mapping and multiplex optimization on a genome scale for diverse phenotypes including cellulase expression, isobutanol production, glycerol utilization and acetic acid tolerance, and may greatly accelerate future genome-scale engineering endeavours in yeast. PMID:28469255
Comparative 454 pyrosequencing of transcripts from two olive genotypes during fruit development
Alagna, Fiammetta; D'Agostino, Nunzio; Torchia, Laura; Servili, Maurizio; Rao, Rosa; Pietrella, Marco; Giuliano, Giovanni; Chiusano, Maria Luisa; Baldoni, Luciana; Perrotta, Gaetano
2009-01-01
Background Despite its primary economic importance, genomic information on olive tree is still lacking. 454 pyrosequencing was used to enrich the very few sequence data currently available for the Olea europaea species and to identify genes involved in expression of fruit quality traits. Results Fruits of Coratina, a widely cultivated variety characterized by a very high phenolic content, and Tendellone, an oleuropein-lacking natural variant, were used as starting material for monitoring the transcriptome. Four different cDNA libraries were sequenced, respectively at the beginning and at the end of drupe development. A total of 261,485 reads were obtained, for an output of about 58 Mb. Raw sequence data were processed using a four step pipeline procedure and data were stored in a relational database with a web interface. Conclusion Massively parallel sequencing of different fruit cDNA collections has provided large scale information about the structure and putative function of gene transcripts accumulated during fruit development. Comparative transcript profiling allowed the identification of differentially expressed genes with potential relevance in regulating the fruit metabolism and phenolic content during ripening. PMID:19709400
Chromosome-Encoded Broad-Spectrum Ambler Class A β-Lactamase RUB-1 from Serratia rubidaea
Didi, Jennifer; Ergani, Ayla; Lima, Sandra
2016-01-01
Whole-genome sequencing of Serratia rubidaea CIP 103234T revealed a chromosomally located Ambler class A β-lactamase gene. The gene was cloned, and the β-lactamase, RUB-1, was characterized. RUB-1 displayed 74% and 73% amino acid sequence identity with the GIL-1 and TEM-1 penicillinases, respectively, and its substrate profile was similar to that of the latter β-lactamases. Analysis by 5′ rapid amplification of cDNA ends revealed promoter sequences highly divergent from the Escherichia coli σ70 consensus sequence. This work further illustrates the heterogeneity of β-lactamases among Serratia spp. PMID:27956418
Chromosome-Encoded Broad-Spectrum Ambler Class A β-Lactamase RUB-1 from Serratia rubidaea.
Bonnin, Rémy A; Didi, Jennifer; Ergani, Ayla; Lima, Sandra; Naas, Thierry
2017-02-01
Whole-genome sequencing of Serratia rubidaea CIP 103234T revealed a chromosomally located Ambler class A β-lactamase gene. The gene was cloned, and the β-lactamase, RUB-1, was characterized. RUB-1 displayed 74% and 73% amino acid sequence identity with the GIL-1 and TEM-1 penicillinases, respectively, and its substrate profile was similar to that of the latter β-lactamases. Analysis by 5' rapid amplification of cDNA ends revealed promoter sequences highly divergent from the Escherichia coli σ70 consensus sequence. This work further illustrates the heterogeneity of β-lactamases among Serratia spp. Copyright © 2017 American Society for Microbiology.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Rorsman, F.; Bywater, M.; Knott, T.J.
The human platelet-derived growth factor (PDGF) A-chain locus was characterized by restriction endonuclease analysis, and the nucleotide sequence of its exons was determined. Seven exons were identified, spanning approximately 22 kilobase pairs of genomic DNA. Alternative exon usage, identified by cDNA cloning, occurs in a human glioblastoma cell line and may give rise to two types of A-chain precursors with different C termini. The exon-intron arrangement was similar to that of the PDGF B-chain/sis locus and seemed to divide the precursor proteins into functional domains. Southern blot analysis of genomic DNA showed that a single PDGF A-chain gene was present in the human genome.
2010-01-01
Background Little genomic or transcriptomic information on Ganoderma lucidum (Lingzhi) is known. This study aims to discover the transcripts involved in secondary metabolite biosynthesis and developmental regulation of G. lucidum using an expressed sequence tag (EST) library. Methods A cDNA library was constructed from the G. lucidum fruiting body. Its high-quality ESTs were assembled into unique sequences with contigs and singletons. The unique sequences were annotated according to sequence similarities to genes or proteins available in public databases. The detection of simple sequence repeats (SSRs) was performed by online analysis. Results A total of 1,023 clones were randomly selected from the G. lucidum library and sequenced, yielding 879 high-quality ESTs. These ESTs showed similarities to a diverse range of genes. The sequences encoding squalene epoxidase (SE) and farnesyl-diphosphate synthase (FPS) were identified in this EST collection. Several candidate genes, such as hydrophobin, MOB2, profilin and PHO84 were detected for the first time in G. lucidum. Thirteen (13) potential SSR-motif microsatellite loci were also identified. Conclusion The present study demonstrates a successful application of EST analysis in the discovery of transcripts involved in the secondary metabolite biosynthesis and the developmental regulation of G. lucidum. PMID:20230644
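The abstract does not name the online tool used for SSR detection. As an illustration of the general idea only, the following Python sketch scans a nucleotide sequence for perfect di-, tri- and tetranucleotide repeats with a regular expression. The function name `find_ssrs` and the minimum repeat thresholds are assumptions chosen for this example, not parameters from the study.

```python
import re


def is_primitive(unit):
    """True if the unit is not itself a repeat of a shorter motif (e.g. 'ATAT')."""
    return unit not in (unit + unit)[1:-1]


def find_ssrs(seq, min_repeats=None):
    """Return (start, unit, repeat_count) for perfect SSRs in seq.

    min_repeats maps unit length to the minimum number of tandem copies;
    the defaults below are illustrative assumptions.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5, 4: 4}
    seq = seq.upper()
    hits = []
    for unit_len, n in sorted(min_repeats.items()):
        # One captured unit followed by at least n - 1 further copies of it.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, n - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            if not is_primitive(unit):
                continue  # skip homopolymers and units built from shorter motifs
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return hits
```

For example, `find_ssrs("GGC" + "AT" * 7 + "CCG")` reports a single (AT)7 locus starting at position 3. Imperfect or compound repeats, which dedicated SSR tools usually handle, are outside the scope of this sketch.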
Sakurai, Tetsuya; Plata, Germán; Rodríguez-Zapata, Fausto; Seki, Motoaki; Salcedo, Andrés; Toyoda, Atsushi; Ishiwata, Atsushi; Tohme, Joe; Sakaki, Yoshiyuki; Shinozaki, Kazuo; Ishitani, Manabu
2007-01-01
Background Cassava, an allotetraploid known for its remarkable tolerance to abiotic stresses is an important source of energy for humans and animals and a raw material for many industrial processes. A full-length cDNA library of cassava plants under normal, heat, drought, aluminum and post harvest physiological deterioration conditions was built; 19968 clones were sequence-characterized using expressed sequence tags (ESTs). Results The ESTs were assembled into 6355 contigs and 9026 singletons that were further grouped into 10577 scaffolds; we found 4621 new cassava sequences and 1521 sequences with no significant similarity to plant protein databases. Transcripts of 7796 distinct genes were captured and we were able to assign a functional classification to 78% of them while finding more than half of the enzymes annotated in metabolic pathways in Arabidopsis. The annotation of sequences that were not paired to transcripts of other species included many stress-related functional categories showing that our library is enriched with stress-induced genes. Finally, we detected 230 putative gene duplications that include key enzymes in reactive oxygen species signaling pathways and could play a role in cassava stress response features. Conclusion The cassava full-length cDNA library here presented contains transcripts of genes involved in stress response as well as genes important for different areas of cassava research. This library will be an important resource for gene discovery, characterization and cloning; in the near future it will aid the annotation of the cassava genome. PMID:18096061
Sequencing, Analysis, and Annotation of Expressed Sequence Tags for Camelus dromedarius
Al-Swailem, Abdulaziz M.; Shehata, Maher M.; Abu-Duhier, Faisel M.; Al-Yamani, Essam J.; Al-Busadah, Khalid A.; Al-Arawi, Mohammed S.; Al-Khider, Ali Y.; Al-Muhaimeed, Abdullah N.; Al-Qahtani, Fahad H.; Manee, Manee M.; Al-Shomrani, Badr M.; Al-Qhtani, Saad M.; Al-Harthi, Amer S.; Akdemir, Kadir C.; Otu, Hasan H.
2010-01-01
Despite its economical, cultural, and biological importance, there has not been a large scale sequencing project to date for Camelus dromedarius. With the goal of sequencing complete DNA of the organism, we first established and sequenced camel EST libraries, generating 70,272 reads. Following trimming, chimera check, repeat masking, cluster and assembly, we obtained 23,602 putative gene sequences, out of which over 4,500 potentially novel or fast evolving gene sequences do not carry any homology to other available genomes. Functional annotation of sequences with similarities in nucleotide and protein databases has been obtained using Gene Ontology classification. Comparison to available full length cDNA sequences and Open Reading Frame (ORF) analysis of camel sequences that exhibit homology to known genes show more than 80% of the contigs with an ORF>300 bp and ∼40% hits extending to the start codons of full length cDNAs suggesting successful characterization of camel genes. Similarity analyses are done separately for different organisms including human, mouse, bovine, and rat. Accompanying web portal, CAGBASE (http://camel.kacst.edu.sa/), hosts a relational database containing annotated EST sequences and analysis tools with possibility to add sequences from public domain. We anticipate our results to provide a home base for genomic studies of camel and other comparative studies enabling a starting point for whole genome sequencing of the organism. PMID:20502665
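The ORF>300 bp criterion mentioned above can be illustrated with a simple six-frame scan. The following Python sketch is only an approximation under stated assumptions: it defines an ORF as running from an ATG to the first in-frame stop codon, which may differ from the definition used by the authors, and the function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def longest_orf_length(seq):
    """Length in bp (ATG through stop codon, inclusive) of the longest ORF
    found in any of the six reading frames; 0 if none is found."""
    seq = seq.upper()
    best = 0
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            start = None
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # open an ORF at the first in-frame ATG
                elif start is not None and codon in STOP_CODONS:
                    best = max(best, i + 3 - start)
                    start = None  # close the ORF and look for the next ATG
    return best


def has_long_orf(seq, min_bp=300):
    """True if the sequence contains an ORF longer than min_bp."""
    return longest_orf_length(seq) > min_bp
```

Because ESTs are often truncated, a production pipeline would likely also accept ORFs that run off either end of the contig; this sketch requires both a start and a stop codon.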
Laufer, Marlene; Mohammad, Hamza; Maiss, Edgar; Richert-Pöggeler, Katja; Dall'Ara, Mattia; Ratti, Claudio; Gilmer, David; Liebe, Sebastian; Varrelmann, Mark
2018-05-01
Two members of the Benyviridae family and genus Benyvirus, Beet soil-borne mosaic virus (BSBMV) and Beet necrotic yellow vein virus (BNYVV), possess identical genome organization, host range and high sequence similarity; they infect Beta vulgaris with variable symptom expression. In the US, mixed infections are described with limited information about viral interactions. Vectors suitable for agroinoculation of all genome components of both viruses were constructed by isothermal in vitro recombination. All 35S promoter-driven cDNA clones allowed production of recombinant viruses competent for Nicotiana benthamiana and Beta macrocarpa systemic infection and Polymyxa betae transmission and were compared to available BNYVV B-type clone. BNYVV and BSBMV RNA1 + 2 reassortants were viable and spread long-distance in N. benthamiana with symptoms dependent on the BNYVV type. Small genomic RNAs were exchangeable and systemically infected B. macrocarpa. These infectious clones represent a powerful tool for the identification of specific molecular host-pathogen determinants. Copyright © 2018 Elsevier Inc. All rights reserved.
Majira, Amel; Domin, Monique; Grandjean, Olivier; Gofron, Krystyna; Houba-Hérin, Nicole
2002-10-01
A seedling lethal mutant of Nicotiana plumbaginifolia (sdl-1) was isolated by transposon tagging using a maize Dissociation (Ds) element. The insertion mutation was produced by direct co-transformation of protoplasts with two plasmids: one containing Ds and a second with an Ac transposase gene. sdl-1 seedlings exhibit several phenotypes: swollen organs, short hypocotyls in light and dark conditions, and enlarged and multinucleated cells, that altogether suggest cell growth defects. Mutant cells are able to proliferate under in vitro culture conditions. Genomic DNA sequences bordering the transposon were used to recover cDNA from the normal allele. Complementation of the mutant phenotype with the cDNA confirmed that the transposon had caused the mutation. The Ds element was inserted into the first exon of the open reading frame and the homozygous mutant lacked detectable transcript. Phenocopies of the mutant were obtained by an antisense approach. SDL-1 encodes a novel protein found in several plant genomes but apparently missing from animal and fungal genomes; the protein is highly conserved and has a potential plastid targeting motif.
Li, Jitao; Li, Jian; Chen, Ping; Liu, Ping; He, Yuying
2015-01-01
The ridgetail white prawn Exopalaemon carinicauda is one of the major economic mariculture species in eastern China. The deficiency of genomic and transcriptomic data is becoming the bottleneck for further research on its good traits. In the present study, 454 pyrosequencing was undertaken to investigate the transcriptome profiles of E. carinicauda. A collection of 1,028,710 sequence reads (459.59 Mb) obtained from cDNA prepared from eyestalk and hemocytes was assembled into 162,056 expressed sequence tags (ESTs). Of these, 29.88 % of 48,428 contigs and 70.12 % of 113,628 singlets possessed high similarities to sequences in the GenBank non-redundant database, with the most significant (E value < 1e-10) unigene matches occurring with crustacean and insect sequences. KEGG analysis of unigenes identified putative members of biological pathways related to growth and immunity. In addition, we obtained a total of 125,112 putative SNPs and 13,467 microsatellites. These results will contribute to the understanding of the genome makeup and provide useful information for future functional genomic research in E. carinicauda.
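Microsatellite counts such as the one reported above are typically obtained by scanning sequences for short tandem repeats. The sketch below shows one simple way to do this with a regular expression; it is not the method used in the study, and the function name `find_ssrs`, the helper `_is_primitive`, and the minimum copy numbers (6 for di-, 5 for tri-, 4 for tetranucleotide motifs) are assumptions chosen for illustration.

```python
import re

# Assumed thresholds: motif length -> minimum number of tandem copies.
DEFAULT_MIN_REPEATS = {2: 6, 3: 5, 4: 4}


def _is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repeat of a shorter unit (e.g. not 'ACAC')."""
    k = len(motif)
    return not any(
        k % d == 0 and motif == motif[:d] * (k // d) for d in range(1, k)
    )


def find_ssrs(seq, min_repeats=None):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats."""
    if min_repeats is None:
        min_repeats = DEFAULT_MIN_REPEATS
    seq = seq.upper()
    hits = []
    for k, n in min_repeats.items():
        # One captured motif of length k followed by at least n-1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, n - 1))
        for m in pattern.finditer(seq):
            motif = m.group(1)
            if _is_primitive(motif):
                hits.append((m.start(), motif, len(m.group(0)) // k))
    return sorted(hits)
```

Because the scan is leftmost-first, the reported motif is whichever rotation of the repeat unit starts earliest in the sequence; dedicated SSR tools normalise motifs and also handle imperfect repeats.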
Dreyer, Christine; Hoffmann, Margarete; Lanz, Christa; Willing, Eva-Maria; Riester, Markus; Warthmann, Norman; Sprecher, Andrea; Tripathi, Namita; Henz, Stefan R; Weigel, Detlef
2007-01-01
Background The guppy, Poecilia reticulata, is a well-known model organism for studying inheritance and variation of male ornamental traits as well as adaptation to different river habitats. However, genomic resources for studying this important model were not previously widely available. Results With the aim of generating molecular markers for genetic mapping of the guppy, cDNA libraries were constructed from embryos and different adult organs to generate expressed sequence tags (ESTs). About 18,000 ESTs were annotated according to BLASTN and BLASTX results and the sequence information from the 3' UTRs was exploited to generate PCR primers for re-sequencing of genomic DNA from different wild type strains. By comparison of EST-linked genomic sequences from at least four different ecotypes, about 1,700 polymorphisms were identified, representing about 400 distinct genes. Two interconnected MySQL databases were built to organize the ESTs and markers, respectively. A robust phylogeny of the guppy was reconstructed, based on 10 different nuclear genes. Conclusion Our EST and marker databases provide useful tools for genetic mapping and phylogenetic studies of the guppy. PMID:17686157
SGP-1: Prediction and Validation of Homologous Genes Based on Sequence Alignments
Wiehe, Thomas; Gebauer-Jung, Steffi; Mitchell-Olds, Thomas; Guigó, Roderic
2001-01-01
Conventional methods of gene prediction rely on the recognition of DNA-sequence signals, the coding potential or the comparison of a genomic sequence with a cDNA, EST, or protein database. Reasons for limited accuracy in many circumstances are species-specific training and the incompleteness of reference databases. Lately, comparative genome analysis has attracted increasing attention. Several analysis tools that are based on human/mouse comparisons are already available. Here, we present a program for the prediction of protein-coding genes, termed SGP-1 (Syntenic Gene Prediction), which is based on the similarity of homologous genomic sequences. In contrast to most existing tools, the accuracy of SGP-1 depends little on species-specific properties such as codon usage or the nucleotide distribution. SGP-1 may therefore be applied to nonstandard model organisms in vertebrates as well as in plants, without the need for extensive parameter training. In addition to predicting genes in large-scale genomic sequences, the program may be useful to validate gene structure annotations from databases. To this end, SGP-1 output also contains comparisons between predicted and annotated gene structures in HTML format. The program can be accessed via a Web server at http://soft.ice.mpg.de/sgp-1. The source code, written in ANSI C, is available on request from the authors. PMID:11544202
Yao, Lin; Yang, Qian; Song, Jinzhu; Tan, Chong; Guo, Changhong; Wang, Li; Qu, Lianhai; Wang, Yun
2013-04-01
Trichoderma harzianum 88, a filamentous soil fungus, is an effective biocontrol agent against several plant pathogens. High-throughput sequencing was used here to study the mycoparasitism mechanisms of T. harzianum 88. Plate confrontation tests of T. harzianum 88 against plant pathogens were conducted, and a cDNA library was constructed from T. harzianum 88 mycelia in the presence of plant pathogen cell walls. Randomly selected transcripts from the cDNA library were compared with eukaryotic plant and fungal genomes. Of the 1,386 transcripts sequenced, the most abundant Gene Ontology (GO) classification group was "physiological process". Differential expression of 19 genes was confirmed by real-time RT-PCR at different mycoparasitism stages against plant pathogens. Gene expression analysis revealed the transcription of various genes involved in mycoparasitism of T. harzianum 88. Our study provides helpful insights into the mechanisms of T. harzianum 88-plant pathogen interactions.
Bag, Sudeep; Al Rwahnih, Maher; Li, Ashley; Gonzalez, Asaul; Rowhani, Adib; Uyemoto, Jerry K; Sudarshana, Mysore R
2015-06-01
In spring 2013, 5-year-old nectarine (Prunus persica) trees, grafted on peach rootstock Nemaguard, were found stunted in a propagation block in California. These trees had been propagated from budwood of three nectarine cultivars imported from France and cleared through the post-entry quarantine procedure. Examination of the canopy failed to reveal any obvious symptoms. However, examination of the trunks, after stripping the bark, revealed extensive pitting on the woody cylinder. To investigate the etiological agent, double-stranded RNA was extracted from bark scrapings from the scion and rootstock portions, and a cDNA library was prepared and sequenced using the Illumina platform. BLAST analysis of the contigs generated by the de novo assembly of sequence reads indicated the presence of a novel luteovirus. Complete sequence of the viral genome was determined by sequencing of three overlapping cDNA clones generated by reverse transcription-polymerase chain reaction (RT-PCR) and by rapid amplification of the 5'- and 3'-termini. The virus genome was comprised of 4,991 nucleotides with a gene organization similar to members of the genus Luteovirus (family Luteoviridae). The presence of the virus, tentatively named Nectarine stem pitting-associated virus, was confirmed in symptomatic trees by RT-PCR. Discovery of a new virus in nectarine trees after post-entry quarantine indicates the importance of including (i) metagenomic analysis by next-generation sequencing approach as an essential tool to assess the plant health status, and (ii) examination of the woody cylinders as part of the indexing process.
Deng, Youping; Dong, Yinghua; Thodima, Venkata; Clem, Rollie J; Passarelli, A Lorena
2006-01-01
Background Little is known about the genome sequences of lepidopteran insects, although this group of insects has been studied extensively in the fields of endocrinology, development, immunity, and pathogen-host interactions. In addition, cell lines derived from Spodoptera frugiperda and other lepidopteran insects are routinely used for baculovirus foreign gene expression. This study reports the results of an expressed sequence tag (EST) sequencing project in cells from the lepidopteran insect S. frugiperda, the fall armyworm. Results We have constructed an EST database using two cDNA libraries from the S. frugiperda-derived cell line, SF-21. The database consists of 2,367 ESTs which were assembled into 244 contigs and 951 singlets for a total of 1,195 unique sequences. Conclusion S. frugiperda is an agriculturally important pest insect and genomic information will be instrumental for establishing initial transcriptional profiling and gene function studies, and for obtaining information about genes manipulated during infections by insect pathogens such as baculoviruses. PMID:17052344
Li, Xinguo; Wu, Harry X; Dillon, Shannon K; Southerton, Simon G
2009-01-01
Background Wood is a major renewable natural resource for the timber, fibre and bioenergy industry. Pinus radiata D. Don is the most important commercial plantation tree species in Australia and several other countries; however, genomic resources for this species are very limited in public databases. Our primary objective was to sequence a large number of expressed sequence tags (ESTs) from genes involved in wood formation in radiata pine. Results Six developing xylem cDNA libraries were constructed from earlywood and latewood tissues sampled at juvenile (7 yrs), transition (11 yrs) and mature (30 yrs) ages, respectively. These xylem tissues represent six typical development stages in a rotation period of radiata pine. A total of 6,389 high quality ESTs were collected from 5,952 cDNA clones. Assembly of 5,952 ESTs from 5' end sequences generated 3,304 unigenes including 952 contigs and 2,352 singletons. About 97.0% of the 5,952 ESTs and 96.1% of the unigenes have matches in the UniProt and TIGR databases. Of the 3,174 unigenes with matches, 42.9% were not assigned GO (Gene Ontology) terms and their functions are unknown or unclassified. More than half (52.1%) of the 5,952 ESTs have matches in the Pfam database and represent 772 known protein families. About 18.0% of the 5,952 ESTs matched cell wall related genes in the MAIZEWALL database, representing all 18 categories, 91 of all 174 families and possibly 557 genes. Fifteen cell wall-related genes are ranked in the 30 most abundant genes, including CesA, tubulin, AGP, SAMS, actin, laccase, CCoAMT, MetE, phytocyanin, pectate lyase, cellulase, SuSy, expansin, chitinase and UDP-glucose dehydrogenase. Based on the PlantTFDB database 41 of the 64 transcription factor families in the poplar genome were identified as being involved in radiata pine wood formation. Comparative analysis of GO term abundance revealed a distinct transcriptome in juvenile earlywood formation compared to other stages of wood development. 
Conclusion The first large scale genomic resource in radiata pine was generated from six developing xylem cDNA libraries. Cell wall-related genes and transcription factors were identified. Juvenile earlywood has a distinct transcriptome, which is likely to contribute to the undesirable properties of juvenile wood in radiata pine. The publicly available resource of radiata pine will also be valuable for gene function studies and comparative genomics in forest trees. PMID:19159482
Reddy, M K; Nair, S; Singh, B N; Mudgil, Y; Tewari, K K; Sopory, S K
2001-01-24
We report the cloning and sequencing of both cDNA and genomic DNA of a 33 kDa chloroplast ribonucleoprotein (33RNP) from pea. The analysis of the predicted amino acid sequence of the cDNA clone revealed that the encoded protein contains two RNA binding domains, including the conserved consensus ribonucleoprotein sequences CS-RNP1 and CS-RNP2, on the C-terminus half and the presence of a putative transit peptide sequence in the N-terminus region. The phylogenetic and multiple sequence alignment analysis of pea chloroplast RNP along with RNPs reported from the other plant sources revealed that the pea 33RNP is very closely related to Nicotiana sylvestris 31RNP and 28RNP and also to 31RNP and 28RNP of Arabidopsis and spinach, respectively. The pea 33RNP was expressed in Escherichia coli and purified to homogeneity. The in vitro import of precursor protein into chloroplasts confirmed that the N-terminus putative transit peptide is a bona fide transit peptide and 33RNP is localized in the chloroplast. The nucleic acid-binding properties of the recombinant protein, as revealed by South-Western analysis, showed that 33RNP has higher binding affinity for poly (U) and oligo dT than for ssDNA and dsDNA. The steady state transcript level was higher in leaves than in roots and the expression of this gene is light stimulated. Sequence analysis of the genomic clone revealed that the gene contains four exons and three introns. We have also isolated and analyzed the 5' flanking region of the pea 33RNP gene.
2008-06-26
Homo sapiens decorin variant C mRNA, complete cds. 2.117 PKNOX2 HUM408A08B Human fetal brain (TFujiwara) Homo sapiens cDNA clone GEN -408A08 5’, mRNA...mRNA, complete cds. 2.117 PKNOX2 HUM408A08B Human fetal brain (TFujiwara) Homo sapiens cDNA clone GEN -408A08 5’, mRNA sequence. 2.076 SEC23B...RAS oncogene family ; RAB33B, member RAS oncogene family 205300_s_at 0.37 U1SNRNPBP U11/U12 snRNP 35K 220728_at 0.349 218689_at 0.342 FANCF Fanconi
Matthews, R J; Cahir, E D; Thomas, M L
1990-01-01
Protein-tyrosine-phosphatases (protein-tyrosine-phosphate phosphohydrolase, EC 3.1.3.48) have been implicated in the regulation of cell growth; however, to date few tyrosine phosphatases have been characterized. To identify additional family members, the cDNA for the human tyrosine phosphatase leukocyte common antigen (LCA; CD45) was used to screen, under low stringency, a mouse pre-B-cell cDNA library. Two cDNA clones were isolated and sequence analysis predicts a protein sequence of 793 amino acids. We have named the molecule LRP (LCA-related phosphatase). RNA transfer analysis indicates that the cDNAs were derived from a 3.2-kilobase mRNA. The LRP mRNA is transcribed in a wide variety of tissues. The predicted protein structure can be divided into the following structural features: a short 19-amino acid leader sequence, an exterior domain of 123 amino acids that is predicted to be highly glycosylated, a 24-amino acid membrane-spanning region, and a 627-amino acid cytoplasmic region. The cytoplasmic region contains two approximately 260-amino acid domains, each with homology to the tyrosine phosphatase family. One of the cDNA clones differed in that it had a 108-base-pair insertion that, while preserving the reading frame, would disrupt the first protein-tyrosine-phosphatase domain. Analysis of genomic DNA indicates that the insertion is due to an alternatively spliced exon. LRP appears to be evolutionarily conserved as a putative homologue has been identified in the invertebrate Styela plicata. PMID:2162042
Dores, Robert M.
2016-01-01
The evolution of the melanocortin receptors (MCRs) is closely associated with the evolution of the melanocortin-2 receptor accessory proteins (MRAPs). Recent annotation of the elephant shark genome project revealed the sequence of a putative MRAP1 ortholog. The presence of this sequence in the genome of a cartilaginous fish raises the possibility that the mrap1 and mrap2 genes in the genomes of gnathostome vertebrates were the result of the chordate 2R genome duplication event. The presence of a putative MRAP1 ortholog in a cartilaginous fish genome is perplexing. Recent studies on melanocortin-2 receptor (MC2R) in the genomes of the elephant shark and the Japanese stingray indicate that these MC2R orthologs can be functionally expressed in CHO cells without co-expression of an exogenous mrap1 cDNA. The novel ligand selectivity of these cartilaginous fish MC2R orthologs is discussed. Finally, the origin of the mc2r and mc5r genes is reevaluated. The distinctive primary sequence conservation of MC2R and MC5R is discussed in light of the physiological roles of these two MCR paralogs. PMID:27445982
Dubey, Anuja; Farmer, Andrew; Schlueter, Jessica; Cannon, Steven B; Abernathy, Brian; Tuteja, Reetu; Woodward, Jimmy; Shah, Trushar; Mulasmanovic, Benjamin; Kudapa, Himabindu; Raju, Nikku L; Gothalwal, Ragini; Pande, Suresh; Xiao, Yongli; Town, Chris D; Singh, Nagendra K; May, Gregory D; Jackson, Scott; Varshney, Rajeev K
2011-06-01
This study reports generation of large-scale genomic resources for pigeonpea, a so-called 'orphan crop species' of the semi-arid tropic regions. FLX/454 sequencing carried out on a normalized cDNA pool prepared from 31 tissues produced 494 353 short transcript reads (STRs). Cluster analysis of these STRs, together with 10 817 Sanger ESTs, resulted in a pigeonpea transcriptome assembly (CcTA) comprising 127 754 tentative unique sequences (TUSs). Functional analysis of these TUSs highlights several active pathways and processes in the sampled tissues. Comparison of the CcTA with the soybean genome showed similarity to 10 857 and 16 367 soybean gene models (depending on alignment methods). Additionally, Illumina 1G sequencing was performed on Fusarium wilt (FW)- and sterility mosaic disease (SMD)-challenged root tissues of 10 resistant and susceptible genotypes. More than 160 million sequence tags were used to identify FW- and SMD-responsive genes. Sequence analysis of CcTA and the Illumina tags identified a large new set of markers for use in genetics and breeding, including 8137 simple sequence repeats, 12 141 single-nucleotide polymorphisms and 5845 intron-spanning regions. Genomic resources developed in this study should be useful for basic and applied research, not only for pigeonpea improvement but also for other related, agronomically important legumes.
Isolation and expression of three gibberellin 20-oxidase cDNA clones from Arabidopsis.
Phillips, A L; Ward, D A; Uknes, S; Appleford, N E; Lange, T; Huttly, A K; Gaskin, P; Graebe, J E; Hedden, P
1995-07-01
Using degenerate oligonucleotide primers based on a pumpkin (Cucurbita maxima) gibberellin (GA) 20-oxidase sequence, six different fragments of dioxygenase genes were amplified by polymerase chain reaction from Arabidopsis thaliana genomic DNA. One of these was used to isolate two different full-length cDNA clones, At2301 and At2353, from shoots of the GA-deficient Arabidopsis mutant ga1-2. A third, related clone, YAP169, was identified in the Database of Expressed Sequence Tags. The cDNA clones were expressed in Escherichia coli as fusion proteins, each of which oxidized GA12 at C-20 to GA15, GA24, and the C19 compound GA9, a precursor of bioactive GAs; the C20 tricarboxylic acid compound GA25 was formed as a minor product. The expression products also oxidized the 13-hydroxylated substrate GA53, but less effectively than GA12. The three cDNAs hybridized to mRNA species with tissue-specific patterns of accumulation, with At2301 being expressed in stems and inflorescences, At2353 in inflorescences and developing siliques, and YAP169 in siliques only. In the floral shoots of the ga1-2 mutant, transcript levels corresponding to each cDNA decreased dramatically after GA3 application, suggesting that GA biosynthesis may be controlled, at least in part, through down-regulation of the expression of the 20-oxidase genes.
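A degenerate primer, as used in the abstract above, is a mixture of oligonucleotides written with IUPAC ambiguity codes. The following sketch shows how the degeneracy (number of distinct sequences in the mixture) can be computed and the mixture enumerated. The primer strings in the usage note are hypothetical and are not the primers used in the study; the function names `degeneracy` and `expand` are likewise illustrative.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT",
    "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG",
    "N": "ACGT",
}


def degeneracy(primer: str) -> int:
    """Number of distinct oligonucleotides represented by a degenerate primer."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])
    return total


def expand(primer: str) -> list:
    """Enumerate every concrete sequence encoded by a degenerate primer."""
    choices = (IUPAC[base] for base in primer.upper())
    return ["".join(combo) for combo in product(*choices)]
```

For a hypothetical primer such as `ATGGAYCCN`, `degeneracy` multiplies two choices at `Y` by four at `N`; `expand` is only practical for primers of modest degeneracy, since the list grows multiplicatively with each ambiguous position.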
Construction of cDNA library and preliminary analysis of expressed sequence tags from Siberian tiger
Liu, Chang-Qing; Lu, Tao-Feng; Feng, Bao-Gang; Liu, Dan; Guan, Wei-Jun; Ma, Yue-Hui
2010-01-01
In this study we successfully constructed a full-length cDNA library from the Siberian tiger, Panthera tigris altaica, the most well-known wild animal. Total RNA was extracted from Siberian tiger fibroblasts cultured in vitro. The titers of the primary and amplified libraries were 1.30×10^6 pfu/ml and 1.62×10^9 pfu/ml, respectively. The proportion of recombinants in the unamplified library was 90.5% and the average length of exogenous inserts was 1.13 kb. A total of 282 individual ESTs with sizes ranging from 328 to 1,142 bp were then analyzed. The BLASTX scores revealed that 53.9% of the sequences were classified as strong matches, 38.6% as nominal and 7.4% as weak matches. Of the ESTs, 28.0% were related to enzyme/catalytic protein, 20.9% to metabolism, 13.1% to transport, 12.1% to signal transducer/cell communication, 9.9% to structure protein, 3.9% to immunity protein/defense metabolism, 3.2% to cell cycle, and 8.9% were classified as novel genes. These results demonstrated that the reliability and representativeness of the cDNA library met the requirements of a standard cDNA library. This library provides a useful platform for functional genomic research on Siberian tigers. PMID:20941376
Aoki, Koh; Yano, Kentaro; Suzuki, Ayako; Kawamura, Shingo; Sakurai, Nozomu; Suda, Kunihiro; Kurabayashi, Atsushi; Suzuki, Tatsuya; Tsugane, Taneaki; Watanabe, Manabu; Ooga, Kazuhide; Torii, Maiko; Narita, Takanori; Shin-I, Tadasu; Kohara, Yuji; Yamamoto, Naoki; Takahashi, Hideki; Watanabe, Yuichiro; Egusa, Mayumi; Kodama, Motoichiro; Ichinose, Yuki; Kikuchi, Mari; Fukushima, Sumire; Okabe, Akiko; Arie, Tsutomu; Sato, Yuko; Yazawa, Katsumi; Satoh, Shinobu; Omura, Toshikazu; Ezura, Hiroshi; Shibata, Daisuke
2010-03-30
The Solanaceae family includes several economically important vegetable crops. The tomato (Solanum lycopersicum) is regarded as a model plant of the Solanaceae family. Recently, a number of tomato resources have been developed in parallel with the ongoing tomato genome sequencing project. In particular, a miniature cultivar, Micro-Tom, is regarded as a model system in tomato genomics, and a number of genomics resources in the Micro-Tom-background, such as ESTs and mutagenized lines, have been established by an international alliance. To accelerate the progress in tomato genomics, we developed a collection of fully-sequenced 13,227 Micro-Tom full-length cDNAs. By checking redundant sequences, coding sequences, and chimeric sequences, a set of 11,502 non-redundant full-length cDNAs (nrFLcDNAs) was generated. Analysis of untranslated regions demonstrated that tomato has longer 5'- and 3'-untranslated regions than most other plants but rice. Classification of functions of proteins predicted from the coding sequences demonstrated that nrFLcDNAs covered a broad range of functions. A comparison of nrFLcDNAs with genes of sixteen plants facilitated the identification of tomato genes that are not found in other plants, most of which did not have known protein domains. Mapping of the nrFLcDNAs onto currently available tomato genome sequences facilitated prediction of exon-intron structure. Introns of tomato genes were longer than those of Arabidopsis and rice. According to a comparison of exon sequences between the nrFLcDNAs and the tomato genome sequences, the frequency of nucleotide mismatch in exons between Micro-Tom and the genome-sequencing cultivar (Heinz 1706) was estimated to be 0.061%. The collection of Micro-Tom nrFLcDNAs generated in this study will serve as a valuable genomic tool for plant biologists to bridge the gap between basic and applied studies. 
The nrFLcDNA sequences will help annotation of the tomato whole-genome sequence and aid in tomato functional genomics and molecular breeding. Full-length cDNA sequences and their annotations are provided in the database KaFTom http://www.pgb.kazusa.or.jp/kaftom/ via the website of the National Bioresource Project Tomato http://tomato.nbrp.jp.
Structure of the coding region and mRNA variants of the apyrase gene from pea (Pisum sativum)
NASA Technical Reports Server (NTRS)
Shibata, K.; Abe, S.; Davies, E.
2001-01-01
Partial amino acid sequences of a 49 kDa apyrase (ATP diphosphohydrolase, EC 3.6.1.5) from the cytoskeletal fraction of etiolated pea stems were used to derive oligonucleotide DNA primers to generate a cDNA fragment of pea apyrase mRNA by RT-PCR and these primers were used to screen a pea stem cDNA library. Two almost identical cDNAs differing in just 6 nucleotides within the coding regions were found, and these cDNA sequences were used to clone genomic fragments by PCR. Two nearly identical gene fragments containing 8 exons and 7 introns were obtained. One of them (H-type) encoded the mRNA sequence described by Hsieh et al. (1996) (DDBJ/EMBL/GenBank Z32743), while the other (S-type) differed by the same 6 nucleotides as the mRNAs, suggesting that these genes may be alleles. The six nucleotide differences between these two alleles were found solely in the first exon, and these mutation sites had two types of consensus sequences. These mRNAs were found with varying lengths of 3' untranslated regions (3'-UTR). There are some similarities between the 3'-UTR of these mRNAs and those of actin and actin binding proteins in plants. The putative roles of the 3'-UTR and alternative polyadenylation sites are discussed in relation to their possible role in targeting the mRNAs to different subcellular compartments.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Codina, J.; Olate, J.; Abramowitz, J.
1988-05-15
cDNA cloning has identified the presence in the human genome of three genes encoding α subunits of pertussis toxin substrates, generically called Gi. They are named αi-1, αi-2 and αi-3. However, none of these genes has been functionally identified with any of the α subunits of several possible G proteins, including pertussis toxin-sensitive Gp's, stimulatory to phospholipase C or A2, Gi, inhibitory to adenylyl cyclase, or Gk, stimulatory to a type of K+ channels. The authors now report the nucleotide sequence and the complete predicted amino acid sequence of human liver αi-3 and the partial amino acid sequence of proteolytic fragments of the α subunit of human erythrocyte Gk. The amino acid sequence of the proteolytic fragment is uniquely encoded by the cDNA of αi-3, thus identifying it as αk. The probable identity of αi-1 with αp and possible roles for αi-2, as well as additional roles for αi-1 and αi-3 (αk) are discussed.
Ngo, J T; Bateman, J B; Cortessis, V; Sparkes, R S; Mohandas, T; Inana, G; Spence, M A
1989-05-01
Previous study has shown that the usual DNA marker for Norrie disease, the L1.28 probe which identifies the DXS7 locus, can recombine with the disease locus. In this study, we used a human ornithine aminotransferase (OAT) cDNA which detects OAT-related DNA sequences mapped to the same region on the X chromosome as that of the L1.28 probe to investigate the family with Norrie disease who exhibited the recombinational event. When genomic DNA from this family was digested with the PvuII restriction endonuclease, we found a restriction fragment length polymorphism (RFLP) of 4.2 kb in size. This fragment was absent in the affected males and cosegregated with the disease locus; we calculated a lod score of 0.602, at theta = 0.00. No deletion could be detected by chromosomal analysis or on Southern blots with other enzymes. These results suggest that one of the OAT-related sequences on the X chromosome may be in close proximity to the Norrie disease locus and represent the first report which indicates that the OAT cDNA may be useful for the identification of carrier status and/or prenatal diagnosis.
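The lod score quoted above can be related to the standard two-point formula, which compares the likelihood of the observed meioses at a recombination fraction theta with the likelihood under free recombination (theta = 0.5). The sketch below assumes phase-known, fully informative meioses; the abstract does not describe the pedigree structure, so the choice of two non-recombinant meioses in the usage note is only one hypothetical configuration that yields the reported value, and the function name `lod_score` is illustrative.

```python
import math


def lod_score(n_meioses: int, n_recombinants: int, theta: float) -> float:
    """Two-point lod score for phase-known, fully informative meioses.

    lod = log10( theta^k * (1 - theta)^(n - k) / 0.5^n )
    where n is the number of meioses and k the number of recombinants.
    """
    n, k = n_meioses, n_recombinants
    # Python evaluates 0.0 ** 0 as 1.0, so theta = 0 with k = 0 is handled.
    likelihood_linked = (theta ** k) * ((1.0 - theta) ** (n - k))
    likelihood_unlinked = 0.5 ** n
    if likelihood_linked == 0.0:
        return float("-inf")  # a recombinant is impossible at theta = 0
    return math.log10(likelihood_linked / likelihood_unlinked)
```

Under these assumptions, `lod_score(2, 0, 0.0)` evaluates to log10(4), which rounds to 0.602; each additional non-recombinant meiosis would add log10(2), about 0.301, at theta = 0.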
MytiBase: a knowledgebase of mussel (M. galloprovincialis) transcribed sequences
Venier, Paola; De Pittà, Cristiano; Bernante, Filippo; Varotto, Laura; De Nardi, Barbara; Bovo, Giuseppe; Roch, Philippe; Novoa, Beatriz; Figueras, Antonio; Pallavicini, Alberto; Lanfranchi, Gerolamo
2009-01-01
Background Although bivalves are among the most studied marine organisms due to their ecological role, economic importance and use in pollution biomonitoring, very little information is available on the genome sequences of mussels. This study reports the functional analysis of a large-scale Expressed Sequence Tag (EST) sequencing from different tissues of Mytilus galloprovincialis (the Mediterranean mussel) challenged with toxic pollutants, temperature and potentially pathogenic bacteria. Results We have constructed and sequenced seventeen cDNA libraries from different Mediterranean mussel tissues: gills, digestive gland, foot, anterior and posterior adductor muscle, mantle and haemocytes. A total of 24,939 clones were sequenced from these libraries generating 18,788 high-quality ESTs which were assembled into 2,446 overlapping clusters and 4,666 singletons resulting in a total of 7,112 non-redundant sequences. In particular, a high-quality normalized cDNA library (Nor01) was constructed as determined by the high rate of gene discovery (65.6%). Bioinformatic screening of the non-redundant M. galloprovincialis sequences identified 159 microsatellite-containing ESTs. Clusters, consensuses, related similarities and gene ontology searches have been organized in a dedicated, searchable database. Conclusion We defined the first species-specific catalogue of M. galloprovincialis ESTs including 7,112 unique transcribed sequences. Putative microsatellite markers were identified. This annotated catalogue represents a valuable platform for expression studies, marker validation and genetic linkage analysis for investigations in the biology of Mediterranean mussels. PMID:19203376
Zhu, Yu-Cheng; Specht, Charles A; Dittmer, Neal T; Muthukrishnan, Subbaratnam; Kanost, Michael R; Kramer, Karl J
2002-11-01
Glycosyltransferases are enzymes that synthesize oligosaccharides, polysaccharides and glycoconjugates. One type of glycosyltransferase is chitin synthase, a very important enzyme in biology, which is utilized by insects, fungi, and other invertebrates to produce chitin, a polysaccharide of beta-1,4-linked N-acetylglucosamine. Chitin is an important component of the insect's exoskeletal cuticle and gut lining. To identify and characterize a chitin synthase gene of the tobacco hornworm, Manduca sexta, degenerate primers were designed from two highly conserved regions in fungal and nematode chitin synthase protein sequences and then used to amplify a similar region from Manduca cDNA. A full-length cDNA of 5152 nucleotides was assembled for the putative Manduca chitin synthase gene, MsCHS1, and sequencing of genomic DNA verified the contiguity of the sequence. The MsCHS1 cDNA has an ORF of 4692 nucleotides that encodes a transmembrane protein of 1564 amino acid residues with a mass of approximately 179 kDa (GenBank no. AY062175). It is most similar, over its entire length of protein sequence, to putative chitin synthases from other insects and nematodes, with 68% identity to enzymes from both the blow fly, Lucilia cuprina, and the fruit fly, Drosophila melanogaster. The similarity with fungal chitin synthases is restricted to the putative catalytic domain, and the MsCHS1 protein has, at equivalent positions, several amino acids that are essential for activity as revealed by mutagenesis of the fungal enzymes. A 5.3-kb transcript of MsCHS1 was identified by northern blot hybridization of RNA from larval epidermis, suggesting that the enzyme functions to make chitin deposited in the cuticle. Further examination by RT-PCR showed that MsCHS1 expression is regulated in the epidermis, with the amount of transcript increasing during phases of cuticle deposition.
Kukekova, Anna V; Johnson, Jennifer L; Teiling, Clotilde; Li, Lewyn; Oskina, Irina N; Kharlamova, Anastasiya V; Gulevich, Rimma G; Padte, Ravee; Dubreuil, Michael M; Vladimirova, Anastasiya V; Shepeleva, Darya V; Shikhevich, Svetlana G; Sun, Qi; Ponnala, Lalit; Temnykh, Svetlana V; Trut, Lyudmila N; Acland, Gregory M
2011-10-03
Two strains of the silver fox (Vulpes vulpes), with markedly different behavioral phenotypes, have been developed by long-term selection for behavior. Foxes from the tame strain exhibit friendly behavior towards humans, paralleling the sociability of canine puppies, whereas foxes from the aggressive strain are defensive and exhibit aggression to humans. To understand the genetic differences underlying these behavioral phenotypes fox-specific genomic resources are needed. cDNA from mRNA from pre-frontal cortex of a tame and an aggressive fox was sequenced using the Roche 454 FLX Titanium platform (> 2.5 million reads & 0.9 Gbase of tame fox sequence; >3.3 million reads & 1.2 Gbase of aggressive fox sequence). Over 80% of the fox reads were assembled into contigs. Mapping fox reads against the fox transcriptome assembly and the dog genome identified over 30,000 high confidence fox-specific SNPs. Fox transcripts for approximately 14,000 genes were identified using SwissProt and the dog RefSeq databases. An at least 2-fold expression difference between the two samples (p < 0.05) was observed for 335 genes, fewer than 3% of the total number of genes identified in the fox transcriptome. Transcriptome sequencing significantly expanded genomic resources available for the fox, a species without a sequenced genome. In a very cost efficient manner this yielded a large number of fox-specific SNP markers for genetic studies and provided significant insights into the gene expression profile of the fox pre-frontal cortex; expression differences between the two fox samples; and a catalogue of potentially important gene-specific sequence variants. This result demonstrates the utility of this approach for developing genomic resources in species with limited genomic information.
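The selection criterion described above (at least a 2-fold difference with p < 0.05) can be sketched as a simple filter. The gene names, expression values, and p-values below are invented for illustration; the abstract does not state how expression was normalized or which statistical test was applied.

```python
from typing import NamedTuple


class GeneResult(NamedTuple):
    gene: str
    tame: float        # hypothetical normalized expression, tame sample
    aggressive: float  # hypothetical normalized expression, aggressive sample
    p_value: float


def is_differential(r: GeneResult, min_fold: float = 2.0, alpha: float = 0.05) -> bool:
    """True if the two samples differ by at least min_fold in either direction and p < alpha."""
    low, high = sorted((r.tame, r.aggressive))
    if low <= 0:
        return False  # fold change undefined; such genes are skipped in this sketch
    return high / low >= min_fold and r.p_value < alpha


results = [
    GeneResult("geneA", 10.0, 25.0, 0.01),   # 2.5-fold, significant
    GeneResult("geneB", 10.0, 15.0, 0.001),  # only 1.5-fold
    GeneResult("geneC", 40.0, 10.0, 0.20),   # 4-fold, not significant
    GeneResult("geneD", 30.0, 12.0, 0.04),   # 2.5-fold, significant
]
hits = [r.gene for r in results if is_differential(r)]
print(hits)

# Share of genes passing the filter, using the counts quoted in the abstract
# (335 genes out of approximately 14,000 identified).
print(round(335 / 14000 * 100, 1))
```

With the abstract's own counts, 335 of roughly 14,000 genes is about 2.4%, consistent with the stated "fewer than 3%".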
2011-01-01
Background Two strains of the silver fox (Vulpes vulpes), with markedly different behavioral phenotypes, have been developed by long-term selection for behavior. Foxes from the tame strain exhibit friendly behavior towards humans, paralleling the sociability of canine puppies, whereas foxes from the aggressive strain are defensive and exhibit aggression to humans. To understand the genetic differences underlying these behavioral phenotypes fox-specific genomic resources are needed. Results cDNA from mRNA from pre-frontal cortex of a tame and an aggressive fox was sequenced using the Roche 454 FLX Titanium platform (> 2.5 million reads & 0.9 Gbase of tame fox sequence; >3.3 million reads & 1.2 Gbase of aggressive fox sequence). Over 80% of the fox reads were assembled into contigs. Mapping fox reads against the fox transcriptome assembly and the dog genome identified over 30,000 high confidence fox-specific SNPs. Fox transcripts for approximately 14,000 genes were identified using SwissProt and the dog RefSeq databases. An at least 2-fold expression difference between the two samples (p < 0.05) was observed for 335 genes, fewer than 3% of the total number of genes identified in the fox transcriptome. Conclusions Transcriptome sequencing significantly expanded genomic resources available for the fox, a species without a sequenced genome. In a very cost efficient manner this yielded a large number of fox-specific SNP markers for genetic studies and provided significant insights into the gene expression profile of the fox pre-frontal cortex; expression differences between the two fox samples; and a catalogue of potentially important gene-specific sequence variants. This result demonstrates the utility of this approach for developing genomic resources in species with limited genomic information. PMID:21967120
Fujisaki, K; Hagihara, F; Kaido, M; Mise, K; Okuno, T
2003-01-01
Spring beauty latent virus (SBLV), a bromovirus, systemically and efficiently infected Arabidopsis thaliana, whereas the well-studied bromoviruses brome mosaic virus (BMV) and cowpea chlorotic mottle virus (CCMV) did not infect and poorly infected A. thaliana, respectively. We constructed biologically active cDNA clones of SBLV genomic RNAs and determined their complete nucleotide sequences. Interestingly, SBLV RNA3 contains both the box B motif in the intercistronic region, as does BMV, and the subgenomic promoter-like sequence in the 5' noncoding region, as does CCMV. Sequence comparisons of SBLV, BMV, CCMV, and broad bean mottle virus demonstrated that SBLV is closely related to BMV and CCMV.
MEETING: Chlamydomonas Annotation Jamboree - October 2003
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grossman, Arthur R
2007-04-13
Shotgun sequencing of the nuclear genome of Chlamydomonas reinhardtii (Chlamydomonas throughout) was performed at an approximate 10X coverage by JGI. Roughly half of the genome is now contained on 26 scaffolds, all of which are at least 1.6 Mb, and the coverage of the genome is ~95%. There are now over 200,000 cDNA sequence reads that we have generated as part of the Chlamydomonas genome project (Grossman, 2003; Shrager et al., 2003; Grossman et al. 2007; Merchant et al., 2007); other sequences have also been generated by the Kazusa sequence group (Asamizu et al., 1999; Asamizu et al., 2000) or individual laboratories that have focused on specific genes. Shrager et al. (2003) placed the reads into distinct contigs (an assemblage of reads with overlapping nucleotide sequences), and contigs that group together as part of the same genes have been designated ACEs (assembly of contigs generated from EST information). All of the reads have also been mapped to the Chlamydomonas nuclear genome and the cDNAs and their corresponding genomic sequences have been reassembled, and the resulting assemblage is called an ACEG (an Assembly of contiguous EST sequences supported by genomic sequence) (Jain et al., 2007). Most of the unique genes or ACEGs are also represented by gene models that have been generated by the Joint Genome Institute (JGI, Walnut Creek, CA). These gene models have been placed onto the DNA scaffolds and are presented as a track on the Chlamydomonas genome browser associated with the genome portal (http://genome.jgi-psf.org/Chlre3/Chlre3.home.html). Ultimately, the meeting grant awarded by DOE has helped enormously in the development of an annotation pipeline (a set of guidelines used in the annotation of genes) and resulted in high quality annotation of over 4,000 genes; the annotators were from both Europe and the USA.
Some of the people who led the annotation initiative were Arthur Grossman, Olivier Vallon, and Sabeeha Merchant (with many individual annotators from Europe and the USA). Olivier Vallon has been most active in continued input of annotation information.
A tick-borne segmented RNA virus contains genome segments derived from unsegmented viral ancestors
Qin, Xin-Cheng; Shi, Mang; Tian, Jun-Hua; Lin, Xian-Dan; Gao, Dong-Ya; He, Jin-Rong; Wang, Jian-Bo; Li, Ci-Xiu; Kang, Yan-Jun; Yu, Bin; Zhou, Dun-Jin; Xu, Jianguo; Plyusnin, Alexander; Holmes, Edward C.; Zhang, Yong-Zhen
2014-01-01
Although segmented and unsegmented RNA viruses are commonplace, the evolutionary links between these two very different forms of genome organization are unclear. We report the discovery and characterization of a tick-borne virus, Jingmen tick virus (JMTV), that reveals an unexpected connection between segmented and unsegmented RNA viruses. The JMTV genome comprises four segments, two of which are related to the nonstructural protein genes of the genus Flavivirus (family Flaviviridae), whereas the remaining segments are unique to this virus, have no known homologs, and contain a number of features indicative of structural protein genes. Remarkably, homology searching revealed that sequences related to JMTV were present in the cDNA library from Toxocara canis (dog roundworm; Nematoda), and that these sequences shared strong sequence and structural resemblances. Epidemiological studies showed that JMTV is distributed in tick populations across China, especially Rhipicephalus and Haemaphysalis spp., and experiences frequent host-switching and genomic reassortment. To our knowledge, JMTV is the first example of a segmented RNA virus with a genome derived in part from unsegmented viral ancestors. PMID:24753611
NASA Technical Reports Server (NTRS)
Wu, Liu-Lai; Song, Il; Karuppiah, Nadarajah; Kaufman, Peter B.
1993-01-01
An asymmetric (top vs. bottom halves of pulvini) induction of invertase mRNA by gravistimulation was analyzed in oat shoot pulvini. Total RNA and poly(A)(+) RNA, isolated from oat pulvini, and two oligonucleotide primers, corresponding to two conserved amino acid sequences (NDPNG and WECPD) found in invertase from other species, were used for the polymerase chain reaction (PCR). A partial length cDNA (550 bp) was obtained and characterized. A 62% nucleotide sequence homology and 58% deduced amino acid sequence homology, as compared to beta-fructosidase of carrot cell wall, was found. Northern blot analysis showed that there was an obviously transient induction of invertase mRNA by gravistimulation in the oat pulvinus system. The mRNA was rapidly induced to a maximum level at 1 hour after gravistimulation treatment and gradually decreased afterwards. The mRNA level in the bottom half of the oat pulvinus was significantly higher than that in the top half of the pulvinus tissue. The kinetic induction of invertase mRNA was consistent with the transient accumulation of invertase activity during the graviresponse of the pulvinus. This indicates that the expression of the invertase gene(s) could be regulated by gravistimulation at the transcriptional level. Southern blot analysis showed that there were two to three genomic DNA fragments which hybridized with the partial-length invertase cDNA.
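Primers designed from conserved amino acid motifs such as NDPNG and WECPD must allow for codon redundancy. As a minimal sketch, assuming the standard genetic code and full back-translation of each motif (the abstract does not give the actual primer sequences, so this may not match what the authors synthesized), the number of distinct nucleotide sequences encoding a motif can be counted as follows. The function name is hypothetical.

```python
# Number of codons per amino acid in the standard genetic code.
CODON_COUNTS = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2,
    "Q": 2, "E": 2, "G": 4, "H": 2, "I": 3,
    "L": 6, "K": 2, "M": 1, "F": 2, "P": 4,
    "S": 6, "T": 4, "W": 1, "Y": 2, "V": 4,
}


def backtranslation_degeneracy(peptide: str) -> int:
    """Count the distinct nucleotide sequences that encode the given peptide."""
    total = 1
    for residue in peptide.upper():
        total *= CODON_COUNTS[residue]  # KeyError for non-standard residues
    return total


for motif in ("NDPNG", "WECPD"):
    print(motif, backtranslation_degeneracy(motif))
# NDPNG 128
# WECPD 32
```

Motifs rich in residues with few codons (W, M, C, D, E, N) give lower degeneracy, which is one reason such regions are attractive primer targets.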
Morin, Ryan D.; Chang, Elbert; Petrescu, Anca; Liao, Nancy; Griffith, Malachi; Kirkpatrick, Robert; Butterfield, Yaron S.; Young, Alice C.; Stott, Jeffrey; Barber, Sarah; Babakaiff, Ryan; Dickson, Mark C.; Matsuo, Corey; Wong, David; Yang, George S.; Smailus, Duane E.; Wetherby, Keith D.; Kwong, Peggy N.; Grimwood, Jane; Brinkley, Charles P.; Brown-John, Mabel; Reddix-Dugue, Natalie D.; Mayo, Michael; Schmutz, Jeremy; Beland, Jaclyn; Park, Morgan; Gibson, Susan; Olson, Teika; Bouffard, Gerard G.; Tsai, Miranda; Featherstone, Ruth; Chand, Steve; Siddiqui, Asim S.; Jang, Wonhee; Lee, Ed; Klein, Steven L.; Blakesley, Robert W.; Zeeberg, Barry R.; Narasimhan, Sudarshan; Weinstein, John N.; Pennacchio, Christa Prange; Myers, Richard M.; Green, Eric D.; Wagner, Lukas; Gerhard, Daniela S.; Marra, Marco A.; Jones, Steven J.M.; Holt, Robert A.
2006-01-01
Sequencing of full-insert clones from full-length cDNA libraries from both Xenopus laevis and Xenopus tropicalis has been ongoing as part of the Xenopus Gene Collection Initiative. Here we present 10,967 full ORF verified cDNA clones (8049 from X. laevis and 2918 from X. tropicalis) as a community resource. Because the genome of X. laevis, but not X. tropicalis, has undergone allotetraploidization, comparison of coding sequences from these two clawed (pipid) frogs provides a unique angle for exploring the molecular evolution of duplicate genes. Within our clone set, we have identified 445 gene trios, each comprised of an allotetraploidization-derived X. laevis gene pair and their shared X. tropicalis ortholog. Pairwise dN/dS comparisons within trios show strong evidence for purifying selection acting on all three members. However, dN/dS ratios between X. laevis gene pairs are elevated relative to their X. tropicalis ortholog. This difference is highly significant and indicates an overall relaxation of selective pressures on duplicated gene pairs. We have found that the paralogs that have been lost since the tetraploidization event are enriched for several molecular functions, but have found no such enrichment in the extant paralogs. Approximately 14% of the paralogous pairs analyzed here also show differential expression indicative of subfunctionalization. PMID:16672307
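The dN/dS reasoning in this abstract can be illustrated with a small sketch. The dN and dS values below are invented, and estimating them from real alignments requires a dedicated method that is not shown here; the sketch only demonstrates how the ratio is read (below 1 suggests purifying selection, and a higher ratio for the duplicated pair than for the ortholog comparison suggests relaxed constraint). Function names are hypothetical.

```python
def omega(dn: float, ds: float) -> float:
    """Return the dN/dS ratio; ds must be positive."""
    if ds <= 0:
        raise ValueError("dS must be positive to form a ratio")
    return dn / ds


def interpret(w: float) -> str:
    """Conventional reading of a dN/dS ratio."""
    if w < 1:
        return "purifying"
    if w > 1:
        return "positive"
    return "neutral"


# Hypothetical values for one trio: the duplicated X. laevis pair compared
# with a comparison involving the X. tropicalis ortholog.
laevis_pair = omega(dn=0.020, ds=0.20)
to_ortholog = omega(dn=0.012, ds=0.20)

print(interpret(laevis_pair), interpret(to_ortholog))  # both below 1
print(laevis_pair > to_ortholog)  # elevated ratio within the duplicated pair
```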
Sikorav, J L; Duval, N; Anselmet, A; Bon, S; Krejci, E; Legay, C; Osterlund, M; Reimund, B; Massoulié, J
1988-01-01
In this paper, we show the existence of alternative splicing in the 3' region of the coding sequence of Torpedo acetylcholinesterase (AChE). We describe two cDNA structures which both diverge from the previously described coding sequence of the catalytic subunit of asymmetric (A) forms (Schumacher et al., 1986; Sikorav et al., 1987). They both contain a coding sequence followed by a non-coding sequence and a poly(A) stretch. Both of these structures were shown to exist in poly(A)+ RNAs, by S1 mapping experiments. The divergent region encoded by the first sequence corresponds to the precursor of the globular dimeric form (G2a), since it contains the expected C-terminal amino acids, Ala-Cys. These amino acids are followed by a 29 amino acid extension which contains a hydrophobic segment and must be replaced by a glycolipid in the mature protein. Analyses of intact G2a AChE showed that the common domain of the protein contains intersubunit disulphide bonds. The divergent region of the second type of cDNA consists of an adjacent genomic sequence, which is removed as an intron in A and Ga mRNAs, but may encode a distinct, less abundant catalytic subunit. The structures of the cDNA clones indicate that they are derived from minor mRNAs, shorter than the three major transcripts which have been described previously (14.5, 10.5 and 5.5 kb). Oligonucleotide probes specific for the asymmetric and globular terminal regions hybridize with the three major transcripts, indicating that their size is determined by 3'-untranslated regions which are not related to the differential splicing leading to A and Ga forms. PMID:3181125
A large-scale full-length cDNA analysis to explore the budding yeast transcriptome
Miura, Fumihito; Kawaguchi, Noriko; Sese, Jun; Toyoda, Atsushi; Hattori, Masahira; Morishita, Shinichi; Ito, Takashi
2006-01-01
We performed a large-scale cDNA analysis to explore the transcriptome of the budding yeast Saccharomyces cerevisiae. We sequenced two cDNA libraries, one from the cells exponentially growing in a minimal medium and the other from meiotic cells. Both libraries were generated by using a vector-capping method that allows the accurate mapping of transcription start sites (TSSs). Consequently, we identified 11,575 TSSs associated with 3,638 annotated genomic features, including 3,599 ORFs, to suggest that most yeast genes have two or more TSSs. In addition, we identified 45 previously undescribed introns, including those affecting current ORF annotations and those spliced alternatively. Furthermore, the analysis revealed 667 transcription units in the intergenic regions and transcripts derived from antisense strands of 367 known features. We also found that 348 ORFs carry TSSs in their 3′-halves to generate sense transcripts starting from inside the ORFs. These results indicate that the budding yeast transcriptome is considerably more complex than previously thought, and it shares many recently revealed characteristics with the transcriptomes of mammals and other higher eukaryotes. Thus, the genome-wide active transcription that generates novel classes of transcripts appears to be an intrinsic feature of the eukaryotic cells. The budding yeast will serve as a versatile model for the studies on these aspects of transcriptome, and the full-length cDNA clones can function as an invaluable resource in such studies. PMID:17101987
Dlugosch, Katrina M.; Lai, Zhao; Bonin, Aurélie; Hierro, José; Rieseberg, Loren H.
2013-01-01
Transcriptome sequences are becoming more broadly available for multiple individuals of the same species, providing opportunities to derive population genomic information from these datasets. Using the 454 Life Science Genome Sequencer FLX and FLX-Titanium next-generation platforms, we generated 11−430 Mbp of sequence for normalized cDNA for 40 wild genotypes of the invasive plant Centaurea solstitialis, yellow starthistle, from across its worldwide distribution. We examined the impact of sequencing effort on transcriptome recovery and overlap among individuals. To do this, we developed two novel publicly available software pipelines: SnoWhite for read cleaning before assembly, and AllelePipe for clustering of loci and allele identification in assembled datasets with or without a reference genome. AllelePipe is designed specifically for cases in which read depth information is not appropriate or available to assist with disentangling closely related paralogs from allelic variation, as in transcriptome or previously assembled libraries. We find that modest applications of sequencing effort recover most of the novel sequences present in the transcriptome of this species, including single-copy loci and a representative distribution of functional groups. In contrast, the coverage of variable sites, observation of heterozygosity, and overlap among different libraries are all highly dependent on sequencing effort. Nevertheless, the information gained from overlapping regions was informative regarding coarse population structure and variation across our small number of population samples, providing the first genetic evidence in support of hypothesized invasion scenarios. PMID:23390612
Bioinformatics and expressional analysis of cDNA clones from floral buds
NASA Astrophysics Data System (ADS)
Pawełkowicz, Magdalena Ewa; Skarzyńska, Agnieszka; Cebula, Justyna; Hincha, Dirck; Ziąbska, Karolina; Pląder, Wojciech; Przybecki, Zbigniew
2017-08-01
The application of genomic approaches may serve as an initial step in understanding the complexity of the biochemical networks and cellular processes responsible for the regulation and execution of many developmental tasks. The molecular mechanism of sex expression in cucumber is still not elucidated. A study of differential expression was conducted to identify genes involved in sex determination and floral organ morphogenesis. Herein, we present the generation of expressed sequence tags (ESTs) obtained by differential hybridization (DH) and a subtraction technique (cDNA-DSC), and their characteristic features such as molecular function, involvement in biological processes, expression, and mapping position on the genome.
Novel Cell Culture-Adapted Genotype 2a Hepatitis C Virus Infectious Clone
Date, Tomoko; Kato, Takanobu; Kato, Junko; Takahashi, Hitoshi; Morikawa, Kenichi; Akazawa, Daisuke; Murayama, Asako; Tanaka-Kaneko, Keiko; Sata, Tetsutaro; Tanaka, Yasuhito; Mizokami, Masashi
2012-01-01
Although the recently developed infectious hepatitis C virus system that uses the JFH-1 clone enables the study of whole HCV viral life cycles, limited particular HCV strains have been available with the system. In this study, we isolated another genotype 2a HCV cDNA, the JFH-2 strain, from a patient with fulminant hepatitis. JFH-2 subgenomic replicons were constructed. HuH-7 cells transfected with in vitro transcribed replicon RNAs were cultured with G418, and selected colonies were isolated and expanded. From sequencing analysis of the replicon genome, several mutations were found. Some of the mutations enhanced JFH-2 replication; the 2217AS mutation in the NS5A interferon sensitivity-determining region exhibited the strongest adaptive effect. Interestingly, a full-length chimeric or wild-type JFH-2 genome with the adaptive mutation could replicate in Huh-7.5.1 cells and produce infectious virus after extensive passages of the virus genome-replicating cells. Virus infection efficiency was sufficient for autonomous virus propagation in cultured cells. Additional mutations were identified in the infectious virus genome. Interestingly, full-length viral RNA synthesized from the cDNA clone with these adaptive mutations was infectious for cultured cells. This approach may be applicable for the establishment of new infectious HCV clones. PMID:22787209
Singh, B N; Mudgil, Yashwanti; Sopory, S K; Reddy, M K
2003-07-01
We have successfully expressed enzymatically active plant topoisomerase II in Escherichia coli for the first time, which has enabled its biochemical characterization. Using a PCR-based strategy, we obtained a full-length cDNA and the corresponding genomic clone of tobacco topoisomerase II. The genomic clone has 18 exons interrupted by 17 introns. Most of the 5' and 3' splice junctions follow the typical canonical consensus dinucleotide sequence GU-AG present in other plant introns. The position of introns and phasing with respect to primary amino acid sequence in tobacco TopII and Arabidopsis TopII are highly conserved, suggesting that the two genes are evolved from the common ancestral type II topoisomerase gene. The cDNA encodes a polypeptide of 1482 amino acids. The primary amino acid sequence shows a striking sequence similarity, preserving all the structural domains that are conserved among eukaryotic type II topoisomerases in an identical spatial order. We have expressed the full-length polypeptide in E. coli and purified the recombinant protein to homogeneity. The full-length polypeptide relaxed supercoiled DNA and decatenated the catenated DNA in a Mg(2+)- and ATP-dependent manner, and this activity was inhibited by 4'-(9-acridinylamino)-3'-methoxymethanesulfonanilide (m-AMSA). The immunofluorescence and confocal microscopic studies, with antibodies developed against the N-terminal region of tobacco recombinant topoisomerase II, established the nuclear localization of topoisomerase II in tobacco BY2 cells. The regulated expression of tobacco topoisomerase II gene under the GAL1 promoter functionally complemented a temperature-sensitive TopII(ts) yeast mutant.
Strand-specific transcriptome profiling with directly labeled RNA on genomic tiling microarrays
2011-01-01
Background With lower manufacturing cost, high spot density, and flexible probe design, genomic tiling microarrays are ideal for comprehensive transcriptome studies. Typically, transcriptome profiling using microarrays involves reverse transcription, which converts RNA to cDNA. The cDNA is then labeled and hybridized to the probes on the arrays, thus the RNA signals are detected indirectly. Reverse transcription is known to generate artifactual cDNA, in particular the synthesis of second-strand cDNA, leading to false discovery of antisense RNA. To address this issue, we have developed an effective method using RNA that is directly labeled, thus by-passing the cDNA generation. This paper describes this method and its application to the mapping of transcriptome profiles. Results RNA extracted from laboratory cultures of Porphyromonas gingivalis was fluorescently labeled with an alkylation reagent and hybridized directly to probes on genomic tiling microarrays specifically designed for this periodontal pathogen. The generated transcriptome profile was strand-specific and produced signals close to background level in most antisense regions of the genome. In contrast, high levels of signal were detected in the antisense regions when the hybridization was done with cDNA. Five antisense areas were tested with independent strand-specific RT-PCR and none to negligible amplification was detected, indicating that the strong antisense cDNA signals were experimental artifacts. Conclusions An efficient method was developed for mapping transcriptome profiles specific to both coding strands of a bacterial genome. This method chemically labels and uses extracted RNA directly in microarray hybridization. The generated transcriptome profile was free of cDNA artifactual signals. 
In addition, this method requires fewer processing steps and is potentially more sensitive in detecting small amounts of RNA than conventional end-labeling methods, owing to the incorporation of more fluorescent molecules per RNA fragment. PMID:21235785
Transcriptome Assembly, Gene Annotation and Tissue Gene Expression Atlas of the Rainbow Trout
Salem, Mohamed; Paneru, Bam; Al-Tobasei, Rafet; Abdouni, Fatima; Thorgaard, Gary H.; Rexroad, Caird E.; Yao, Jianbo
2015-01-01
Efforts to obtain a comprehensive genome sequence for rainbow trout are ongoing and will be complemented by transcriptome information that will enhance genome assembly and annotation. Previously, transcriptome reference sequences were reported using data from different sources. Although the previous work added a great wealth of sequences, a complete and well-annotated transcriptome is still needed. In addition, gene expression in different tissues was not completely addressed in the previous studies. In this study, non-normalized cDNA libraries were sequenced from 13 different tissues of a single doubled haploid rainbow trout from the same source used for the rainbow trout genome sequence. A total of ~1.167 billion paired-end reads were de novo assembled using the Trinity RNA-Seq assembler yielding 474,524 contigs > 500 base-pairs. Of them, 287,593 had homologies to the NCBI non-redundant protein database. The longest contig of each cluster was selected as a reference, yielding 44,990 representative contigs. A total of 4,146 contigs (9.2%), including 710 full-length sequences, did not match any mRNA sequences in the current rainbow trout genome reference. Mapping reads to the reference genome identified an additional 11,843 transcripts not annotated in the genome. A digital gene expression atlas revealed 7,678 housekeeping and 4,021 tissue-specific genes. Expression of about 16,000–32,000 genes (35–71% of the identified genes) accounted for basic and specialized functions of each tissue. White muscle and stomach had the least complex transcriptomes, with high percentages of their total mRNA contributed by a small number of genes. Brain, testis and intestine, in contrast, had complex transcriptomes, with large numbers of genes involved in their expression patterns. This study provides comprehensive de novo transcriptome information that is suitable for functional and comparative genomics studies in rainbow trout, including annotation of the genome. PMID:25793877
Candidate Cancer Allele cDNA Collection | Office of Cancer Genomics
CTD2 researchers at the Broad Institute/DFCI have developed a collection of plasmids including mutant alleles found in sequencing studies of cancer. It includes somatic variants found in lung adenocarcinoma and across other cancer types. The clones enable researchers to characterize the function of the cancer variants in high-throughput experiments. These plasmids are collectively called the “Broad Target Accelerator Plasmid Collections”.
Coral Reef Genomics: Developing tools for functional genomics of coral symbiosis
DOE Office of Scientific and Technical Information (OSTI.GOV)
Schwarz, Jodi; Brokstein, Peter; Manohar, Chitra
Symbioses between cnidarians and dinoflagellates in the genus Symbiodinium are widespread in the marine environment. The importance of this symbiosis to reef-building corals and reef nutrient and carbon cycles is well documented, but little is known about the mechanisms by which the partners establish and regulate the symbiosis. Because the dinoflagellate symbionts live inside the cells of their host coral, the interactions between the partners occur on cellular and molecular levels, as each partner alters the expression of genes and proteins to facilitate the partnership. These interactions can be examined using high-throughput techniques that allow thousands of genes to be examined simultaneously. We are developing the groundwork so that we can use DNA microarray profiling to identify genes involved in the Montastraea faveolata and Acropora palmata symbioses. Here we report results from the initial steps in this microarray initiative, that is, the construction of cDNA libraries from 4 of 16 target stages, sequencing of 3450 cDNA clones to generate Expressed Sequence Tags (ESTs), and annotation of the ESTs to identify candidate genes to include in the microarrays. An understanding of how the coral-dinoflagellate symbiosis is regulated will have implications for atmospheric and ocean sciences, conservation biology, the study and diagnosis of coral bleaching and disease, and comparative studies of animal-protist interactions.
Purification and characterization of an antifungal protein, C-FKBP, from Chinese cabbage.
Park, Seong-Cheol; Lee, Jung Ro; Shin, Sun-Oh; Jung, Ji Hyun; Lee, Young Mee; Son, Hyosuk; Park, Yoonkyung; Lee, Sang Yeol; Hahm, Kyung-Soo
2007-06-27
An antifungal protein was isolated from Chinese cabbage (Brassica campestris L. ssp. pekinensis) by buffer-soluble extraction and two chromatographic procedures. The results of matrix-assisted laser desorption/ionization time-of-flight (MALDI-TOF) mass spectrometry revealed that the isolated Chinese cabbage protein was identical to human FK506-binding protein (FKBP). A cDNA encoding FKBP was isolated from a Chinese cabbage leaf cDNA library and named C-FKBP. The open reading frame of the gene encoded a 154-amino acid polypeptide. The amino acid sequence of C-FKBP exhibits striking degrees of identity with the corresponding mouse (61%), human (60%), and yeast (56%) proteins. Genomic Southern blot analyses using the full-length C-FKBP cDNA probe revealed a multigene family in the Chinese cabbage genome. The C-FKBP mRNA was highly expressed in vegetative tissues. We also analyzed the antifungal and peptidyl-prolyl cis-trans isomerase activity of recombinant C-FKBP protein expressed in Escherichia coli. This protein inhibited pathogenic fungal strains, including Candida albicans, Botrytis cinerea, Rhizoctonia solani, and Trichoderma viride, whereas it exhibited no activity against E. coli and Staphylococcus aureus. These results suggest that recombinant C-FKBP is an excellent candidate as a lead compound for the development of antifungal agents.
Salem, Nida’ M.; Miller, W. Allen; Rowhani, Adib; Golino, Deborah A.; Moyne, Anne-Laure; Falk, Bryce W.
2015-01-01
We determined the complete nucleotide sequence of the Rose spring dwarf-associated virus (RSDaV) genomic RNA (GenBank accession no. EU024678) and compared its predicted RNA structural characteristics affecting gene expression. A cDNA library was derived from RSDaV double-stranded RNAs (dsRNAs) purified from infected tissue. Nucleotide sequence analysis of the cloned cDNAs, plus clones generated by 5′- and 3′-RACE, showed the RSDaV genomic RNA to be 5,808 nucleotides. The genomic RNA contains five major open reading frames (ORFs), and three small ORFs in the 3′-terminal 800 nucleotides, typical for viruses of genus Luteovirus in the family Luteoviridae. Northern blot hybridization analysis revealed the genomic RNA and two prominent subgenomic RNAs of approximately 3 kb and 1 kb. Putative 5′ ends of the sgRNAs were predicted by identification of conserved sequences and secondary structures which resembled the Barley yellow dwarf virus (BYDV) genomic RNA 5′ end and subgenomic RNA promoter sequences. Secondary structures of the BYDV-like ribosomal frameshift elements and cap-independent translation elements, including long-distance base pairing spanning four kb, were identified. These contain similarities but also informative differences with the BYDV structures, including a strikingly different structure predicted for the 3′ cap-independent translation element. These analyses of the RSDaV genomic RNA show more complexity for the RNA structural elements for members of the Luteoviridae. PMID:18329064
Salem, Nida' M; Miller, W Allen; Rowhani, Adib; Golino, Deborah A; Moyne, Anne-Laure; Falk, Bryce W
2008-06-05
We determined the complete nucleotide sequence of the Rose spring dwarf-associated virus (RSDaV) genomic RNA (GenBank accession no. EU024678) and compared its predicted RNA structural characteristics affecting gene expression. A cDNA library was derived from RSDaV double-stranded RNAs (dsRNAs) purified from infected tissue. Nucleotide sequence analysis of the cloned cDNAs, plus clones generated by 5'- and 3'-RACE, showed the RSDaV genomic RNA to be 5808 nucleotides. The genomic RNA contains five major open reading frames (ORFs), and three small ORFs in the 3'-terminal 800 nucleotides, typical for viruses of genus Luteovirus in the family Luteoviridae. Northern blot hybridization analysis revealed the genomic RNA and two prominent subgenomic RNAs of approximately 3 kb and 1 kb. Putative 5' ends of the sgRNAs were predicted by identification of conserved sequences and secondary structures which resembled the Barley yellow dwarf virus (BYDV) genomic RNA 5' end and subgenomic RNA promoter sequences. Secondary structures of the BYDV-like ribosomal frameshift elements and cap-independent translation elements, including long-distance base pairing spanning four kb, were identified. These contain similarities but also informative differences with the BYDV structures, including a strikingly different structure predicted for the 3' cap-independent translation element. These analyses of the RSDaV genomic RNA show more complexity for the RNA structural elements for members of the Luteoviridae.
Asamizu, Erika; Nakamura, Yasukazu; Sato, Shusei; Tabata, Satoshi
2004-02-01
To perform a comprehensive analysis of genes expressed in a model legume, Lotus japonicus, a total of 74472 3'-end expressed sequence tags (EST) were generated from cDNA libraries produced from six different organs. Clustering of sequences was performed with an identity criterion of 95% for 50 bases, and a total of 20457 non-redundant sequences, 8503 contigs and 11954 singletons were generated. EST sequence coverage was analyzed by using the annotated L. japonicus genomic sequence and 1093 of the 1889 predicted protein-encoding genes (57.9%) were hit by the EST sequence(s). Gene content was compared to several plant species. Among the 8503 contigs, 471 were identified as sequences conserved only in leguminous species and these included several disease resistance-related genes. This suggested that in legumes, these genes may have evolved specifically to resist pathogen attack. The rate of gene sequence divergence was assessed by comparing similarity level and functional category based on the Gene Ontology (GO) annotation of Arabidopsis genes. This revealed that genes encoding ribosomal proteins, as well as those related to translation, photosynthesis, and cellular structure were more abundantly represented in the highly conserved class, and that genes encoding transcription factors and receptor protein kinases were abundantly represented in the less conserved class. To make the sequence information and the cDNA clones available to the research community, a Web database with useful services was created at http://www.kazusa.or.jp/en/plant/lotus/EST/.
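The clustering totals and coverage figure quoted in the abstract above can be cross-checked with simple arithmetic. The following is a minimal Python sketch; the variable names are illustrative and do not come from the source.

```python
# Counts quoted in the Lotus japonicus EST abstract.
contigs = 8503
singletons = 11954

# Non-redundant sequences are the contigs plus the singletons.
non_redundant = contigs + singletons

# Share of predicted protein-encoding genes hit by at least one EST.
genes_predicted = 1889
genes_hit = 1093
coverage_pct = round(100 * genes_hit / genes_predicted, 1)

print(non_redundant)   # matches the 20457 reported
print(coverage_pct)    # matches the 57.9% reported
```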
Winkfein, R J; Nishikawa, S; Connor, W; Dixon, G H
1993-07-01
A synthetic oligonucleotide primer, designed from marsupial protamine protein-sequence data [Balhorn, R., Corzett, M., Matrimas, J. A., Cummins, J. & Faden, B. (1989) Analysis of protamines isolated from two marsupials, the ring-tailed wallaby and gray short-tailed opossum, J. Cell. Biol. 107] was used to amplify, via the polymerase chain reaction, protamine sequences from a North American opossum (Didelphis marsupialis) cDNA. Using the amplified sequences as probes, several protamine cDNA clones were isolated. The protein sequence, predicted from the cDNA sequences, consisted of 57 amino acids, contained a large number of arginine residues and exhibited the sequence ARYR at its amino terminus, which is conserved in avian and most eutherian mammal protamines. Like the true protamines of trout and chicken, the opossum protamine lacked cysteine residues, distinguishing it from placental mammalian protamine 1 (P1 or stable) protamines. Examination of the protamine gene, isolated by polymerase-chain-reaction amplification of genomic DNA, revealed the presence of an intron dividing the protamine-coding region, a common characteristic of all mammalian P1 genes. In addition, extensive sequence identity in the 5' and 3' flanking regions between mouse and opossum sequences classify the marsupial protamine as being closely related to placental mammal P1. Protamine transcripts, in both birds and mammals, are present in two size classes, differing by the length of their poly(A) tails (either short or long). Examination of opossum protamine transcripts by Northern hybridization revealed four distinct mRNA species in the total RNA fraction, two of which were enriched in the poly(A)-rich fraction. Northern-blot analysis, using an intron-specific probe, revealed the presence of intron sequences in two of the four protamine transcripts. 
If expressed, the corresponding protein from intron-containing transcripts would differ from spliced transcripts by length (49 versus 57 amino acids) and would contain a cysteine residue.
Marston, D A; McElhinney, L M; Johnson, N; Müller, T; Conzelmann, K K; Tordo, N; Fooks, A R
2007-04-01
We report the first full-length genomic sequences for European bat lyssavirus type-1 (EBLV-1) and type-2 (EBLV-2). The EBLV-1 genomic sequence was derived from a virus isolated from a serotine bat in Hamburg, Germany, in 1968 and the EBLV-2 sequence was derived from a virus isolate from a human case of rabies that occurred in Scotland in 2002. A long-distance PCR strategy was used to amplify the open reading frames (ORFs), followed by standard and modified RACE (rapid amplification of cDNA ends) techniques to amplify the 3' and 5' ends. The lengths of each complete viral genome for EBLV-1 and EBLV-2 were 11 966 and 11 930 base pairs, respectively, and follow the standard rhabdovirus genome organization of five viral proteins. Comparison with other lyssavirus sequences demonstrates variation in degrees of homology, with the genomic termini showing a high degree of complementarity. The nucleoprotein was the most conserved, both intra- and intergenotypically, followed by the polymerase (L), matrix and glycoproteins, with the phosphoprotein being the most variable. In addition, we have shown that the two EBLVs utilize a conserved transcription termination and polyadenylation (TTP) motif, approximately 50 nt upstream of the L gene start codon. All available lyssavirus sequences to date, with the exception of Pasteur virus (PV) and PV-derived isolates, use the second TTP site. This observation may explain differences in pathogenicity between lyssavirus strains, dependent on the length of the untranslated region, which might affect transcriptional activity and RNA stability.
Cloning and characterization of the human 5,10-methenyltetrahydrofolate synthetase-encoding cDNA.
Dayan, A; Bertrand, R; Beauchemin, M; Chahla, D; Mamo, A; Filion, M; Skup, D; Massie, B; Jolivet, J
1995-11-20
Methenyltetrahydrofolate synthetase (MTHFS) catalyses the obligatory initial metabolic step in the intracellular conversion of 5-formyltetrahydrofolate to other reduced folates. We have isolated and sequenced a human MTHFS cDNA which is 872-bp long and codes for a 203-amino-acid protein of 23,229 Da. Escherichia coli BL21(DE3), transfected with pET11c plasmids containing an open reading frame encoding MTHFS, showed a 100-fold increase in MTHFS activity in bacterial extracts after IPTG induction. Northern blot studies of human tissues determined that the MTHFS mRNA was expressed preferentially in the liver and Southern blot analysis of human genomic DNA suggested the presence of a single-copy gene.
2010-01-01
Background The Fagaceae family comprises about 1,000 woody species worldwide. About half belong to the genus Quercus. These oaks are often a source of raw material for biomass wood and fiber. Pedunculate and sessile oaks are among the most important deciduous forest tree species in Europe. Despite their ecological and economical importance, very few genomic resources have yet been generated for these species. Here, we describe the development of an EST catalogue that will support ecosystem genomics studies, where geneticists, ecophysiologists, molecular biologists and ecologists join their efforts for understanding, monitoring and predicting functional genetic diversity. Results We generated 145,827 sequence reads from 20 cDNA libraries using the Sanger method. Unexploitable chromatograms and quality checking led us to eliminate 19,941 sequences. Finally a total of 125,925 ESTs were retained from 111,361 cDNA clones. Pyrosequencing was also conducted for 14 libraries, generating 1,948,579 reads, from which 370,566 sequences (19.0%) were eliminated, resulting in 1,578,192 sequences. Following clustering and assembly using the TGICL pipeline, 1,704,117 EST sequences collapsed into 69,154 tentative contigs and 153,517 singletons, providing 222,671 non-redundant sequences (including alternative transcripts). We also assembled the sequences using MIRA and PartiGene software and compared the three unigene sets. Gene ontology annotation was then assigned to 29,303 unigene elements. A BLAST search against the SWISS-PROT database revealed putative homologs for 32,810 (14.7%) unigene elements, but a more extensive search with Pfam, Refseq_protein, Refseq_RNA and eight gene indices revealed homology for 67.4% of them. The EST catalogue was examined for putative homologs of candidate genes involved in bud phenology, cuticle formation, phenylpropanoids biosynthesis and cell wall formation. Our results suggest a good coverage of genes involved in these traits. 
Comparative orthologous sequences (COS) with other plant gene models were identified, allowing the oak paleo-history to be unraveled. Simple sequence repeats (SSRs) and single nucleotide polymorphisms (SNPs) were searched, resulting in 52,834 SSRs and 36,411 SNPs. All of these are available through the Oak Contig Browser at http://genotoul-contigbrowser.toulouse.inra.fr:9092/Quercus_robur/index.html. Conclusions This genomic resource provides a unique tool to discover genes of interest, study the oak transcriptome, and develop new markers to investigate functional diversity in natural populations. PMID:21092232
A global assembly of cotton ESTs
Udall, Joshua A.; Swanson, Jordan M.; Haller, Karl; Rapp, Ryan A.; Sparks, Michael E.; Hatfield, Jamie; Yu, Yeisoo; Wu, Yingru; Dowd, Caitriona; Arpat, Aladdin B.; Sickler, Brad A.; Wilkins, Thea A.; Guo, Jin Ying; Chen, Xiao Ya; Scheffler, Jodi; Taliercio, Earl; Turley, Ricky; McFadden, Helen; Payton, Paxton; Klueva, Natalya; Allen, Randell; Zhang, Deshui; Haigler, Candace; Wilkerson, Curtis; Suo, Jinfeng; Schulze, Stefan R.; Pierce, Margaret L.; Essenberg, Margaret; Kim, HyeRan; Llewellyn, Danny J.; Dennis, Elizabeth S.; Kudrna, David; Wing, Rod; Paterson, Andrew H.; Soderlund, Cari; Wendel, Jonathan F.
2006-01-01
Approximately 185,000 Gossypium EST sequences comprising >94,800,000 nucleotides were amassed from 30 cDNA libraries constructed from a variety of tissues and organs under a range of conditions, including drought stress and pathogen challenges. These libraries were derived from allopolyploid cotton (Gossypium hirsutum; AT and DT genomes) as well as its two diploid progenitors, Gossypium arboreum (A genome) and Gossypium raimondii (D genome). ESTs were assembled using the Program for Assembling and Viewing ESTs (PAVE), resulting in 22,030 contigs and 29,077 singletons (51,107 unigenes). Further comparisons among the singletons and contigs led to recognition of 33,665 exemplar sequences that represent a nonredundant set of putative Gossypium genes containing partial or full-length coding regions and usually one or two UTRs. The assembly, along with their UniProt BLASTX hits, GO annotation, and Pfam analysis results, are freely accessible as a public resource for cotton genomics. Because ESTs from diploid and allotetraploid Gossypium were combined in a single assembly, we were in many cases able to bioinformatically distinguish duplicated genes in allotetraploid cotton and assign them to either the A or D genome. The assembly and associated information provide a framework for future investigation of cotton functional and evolutionary genomics. PMID:16478941
Frentiu, Francesca D; Adamski, Marcin; McGraw, Elizabeth A; Blows, Mark W; Chenoweth, Stephen F
2009-01-21
The native Australian fly Drosophila serrata belongs to the highly speciose montium subgroup of the melanogaster species group. It has recently emerged as an excellent model system with which to address a number of important questions, including the evolution of traits under sexual selection and traits involved in climatic adaptation along latitudinal gradients. Understanding the molecular genetic basis of such traits has been limited by a lack of genomic resources for this species. Here, we present the first expressed sequence tag (EST) collection for D. serrata that will enable the identification of genes underlying sexually-selected phenotypes and physiological responses to environmental change and may help resolve controversial phylogenetic relationships within the montium subgroup. A normalized cDNA library was constructed from whole fly bodies at several developmental stages, including larvae and adults. Assembly of 11,616 clones sequenced from the 3' end allowed us to identify 6,607 unique contigs, of which at least 90% encoded peptides. Partial transcripts were discovered from a variety of genes of evolutionary interest by BLASTing contigs against the 12 Drosophila genomes currently sequenced. By incorporating into the cDNA library multiple individuals from populations spanning a large portion of the geographical range of D. serrata, we were able to identify 11,057 putative single nucleotide polymorphisms (SNPs), with 278 different contigs having at least one "double hit" SNP that is highly likely to be a real polymorphism. At least 394 EST-associated microsatellite markers, representing 355 different contigs, were also found, providing an additional set of genetic markers. The assembled EST library is available online at http://www.chenowethlab.org/serrata/index.cgi. We have provided the first gene collection and largest set of polymorphic genetic markers, to date, for the fly D. serrata. 
The EST collection will provide much needed genomic resources for this model species and facilitate comparative evolutionary studies within the montium subgroup of the D. melanogaster lineage.
The bglA Gene of Aspergillus kawachii Encodes Both Extracellular and Cell Wall-Bound β-Glucosidases
Iwashita, Kazuhiro; Nagahara, Tatsuya; Kimura, Hitoshi; Takano, Makoto; Shimoi, Hitoshi; Ito, Kiyoshi
1999-01-01
We cloned the genomic DNA and cDNA of bglA, which encodes β-glucosidase in Aspergillus kawachii, based on a partial amino acid sequence of purified cell wall-bound β-glucosidase CB-1. The nucleotide sequence of the cloned bglA gene revealed a 2,933-bp open reading frame with six introns that encodes an 860-amino-acid protein. Based on the deduced amino acid sequence, we concluded that the bglA gene encodes cell wall-bound β-glucosidase CB-1. The amino acid sequence exhibited high levels of homology with the amino acid sequences of fungal β-glucosidases classified in subfamily B. We expressed the bglA cDNA in Saccharomyces cerevisiae and detected the recombinant β-glucosidase in the periplasm fraction of the recombinant yeast. A. kawachii can produce two extracellular β-glucosidases (EX-1 and EX-2) in addition to the cell wall-bound β-glucosidase. A. kawachii in which the bglA gene was disrupted produced none of the three β-glucosidases, as determined by enzyme assays and a Western blot analysis. Thus, we concluded that the bglA gene encodes both extracellular and cell wall-bound β-glucosidases in A. kawachii. PMID:10584016
Yang, G; Liu, X G; Qiu, B S
2000-07-01
The complete nucleotide sequences of two Chinese tobacco mosaic virus (TMV) isolates, TMV-Cv (vulgare strain) and TMV-N14 (an attenuated virus originated from a tomato strain), were determined from their respective full-length infectious cDNA clones and compared with published TMV sequences. The genome of TMV-Cv contained 6395 nucleotides, in which four functional open reading frames (ORFs), coding for replicase (126 kD/183 kD), movement protein (MP, 30 kD) and coat protein (CP, 17.6 kD) respectively, could be recognized. TMV-N14 contained 6384 nucleotides in its genome. In contrast to TMV-Cv, five functional ORFs encoding the replicase 98.5 kD/126 kD/183 kD, MP (27 kD) and CP (17.6 kD), respectively, were detected in the TMV-N14 genome. TMV-Cv is 99% homologous to a Korean TMV isolate belonging to the vulgare strain at the nucleotide level. TMV-N14 is 99% homologous to a highly virulent Japanese isolate TMV-L (tomato strain) at the nucleotide level. In TMV-N14, one opal mutation (UGA) occurred in the replicase gene and one ochre mutation (UAA) in the MP gene. The former mutation created a potential, additional ORF within the replicase gene; the latter reduced the size of the MP to 27 kD. In addition, there were also 13 amino acid substitutions in the replicase gene of TMV-N14 when compared to that of TMV-L. Collectively, these changes may have significant implications in the attenuation of the virulence of TMV-N14.
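The terms "opal" and "ochre" in the abstract above are the conventional names for the stop codons UGA and UAA (the third, UAG, is called amber). A minimal Python sketch of that naming follows; the function name and the handling of DNA-style input are illustrative assumptions, not part of the source.

```python
from typing import Optional

# Conventional names of the three stop codons, written as RNA.
STOP_CODON_NAMES = {"UAG": "amber", "UAA": "ochre", "UGA": "opal"}


def stop_codon_name(codon: str) -> Optional[str]:
    """Return the conventional name of a stop codon, or None for a sense codon.

    Accepts RNA or DNA spelling (T is treated as U) in either case.
    """
    rna = codon.upper().replace("T", "U")
    return STOP_CODON_NAMES.get(rna)


print(stop_codon_name("UGA"))  # replicase gene mutation in TMV-N14
print(stop_codon_name("UAA"))  # MP gene mutation in TMV-N14
```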
Functional genomics of bio-energy plants and related patent activities.
Jiang, Shu-Ye; Ramachandran, Srinivasan
2013-04-01
With dwindling fossil oil resources and increased economic growth of many developing countries due to globalization, energy derived from an alternative source such as bio-energy in a sustainable fashion is the need of the hour. However, production of energy from biological sources is relatively expensive due to the low starch and sugar contents of bio-energy plants, leading to lower oil yield and reduced quality along with lower conversion efficiency of feedstock. In this context, genetic improvement of bio-energy plants offers a viable solution. In this manuscript, we review the current status of functional genomics studies and related patent activities in bio-energy plants. Currently, the genomes of a considerable number of bio-energy plants have been sequenced or are being sequenced, and a large number of expressed sequence tags (ESTs) or cDNA sequences are available from them. These studies provide fundamental data for more reliable genome annotation and, as a result, several genomes have been annotated at a genome-wide level. In addition to this effort, various mutagenesis tools have also been employed to develop mutant populations for characterization of genes that are involved in bio-energy quantitative traits. With the progress made on functional genomics of important bio-energy plants, more patents have been filed, with a significant number of them focusing on genes and DNA sequences which may be involved in the improvement of bio-energy traits, including higher yield and quality of starch, sugar and oil. We also believe that these studies will lead to the generation of genetically altered plants with improved tolerance to various abiotic and biotic stresses.
Blancher, C; Omri, B; Bidou, L; Pessac, B; Crisanti, P
1996-10-18
We report the isolation and characterization of a novel cDNA from quail neuroretina encoding a putative protein named nectinepsin. The nectinepsin cDNA identifies a major 2.2-kilobase mRNA that is detected from ED 5 in neuroretina and is increasingly abundant during embryonic development. A nectinepsin mRNA is also found in quail liver, brain, and intestine and in mouse retina. The deduced nectinepsin amino acid sequence contains the RGD cell binding motif of integrin ligands. Furthermore, nectinepsin shares substantial homologies with vitronectin and structural protein similarities with most of the matricial metalloproteases. However, the presence of a specific sequence and the lack of heparin and collagen binding domains of the vitronectin indicate that nectinepsin is a new extracellular matrix protein. Furthermore, genomic Southern blot studies suggest that nectinepsin and vitronectin are encoded by different genes. Western blot analysis with an anti-human vitronectin antiserum revealed, in addition to the 65- and 70-kDa vitronectin bands, an immunoreactive protein of about 54 kDa in all tissues containing nectinepsin mRNA. It seems likely that the form of vitronectin found in chick egg yolk plasma by Nagano et al. ((1992) J. Biol. Chem. 267, 24863-24870) is the protein that corresponds to the nectinepsin cDNA. This new protein could be an important molecule involved in the early steps of the development.
Identification and cloning of a gamma 3 subunit splice variant of the human GABA(A) receptor.
Poulsen, C F; Christjansen, K N; Hastrup, S; Hartvig, L
2000-05-31
cDNA sequences encoding two forms of the GABA(A) gamma 3 receptor subunit were cloned from human hippocampus. The nucleotide sequences differ by the absence (gamma 3S) or presence (gamma 3L) of 18 bp located in the presumed intracellular loop between transmembrane region (TM) III and IV. The extra 18 bp in the gamma 3L subunit generates a consensus site for phosphorylation by protein kinase C (PKC). Analysis of human genomic DNA encoding the gamma 3 subunit reveals that the 18 bp insert is contiguous with the upstream proximal exon.
Small, G J; Hemingway, J
2000-12-01
Widespread resistance to organophosphorus insecticides (OPs) in Nilaparvata lugens is associated with elevation of carboxylesterase activity. A cDNA encoding a carboxylesterase, Nl-EST1, has been isolated from an OP-resistant Sri Lankan strain of N. lugens. The full-length cDNA codes for a 547-amino acid protein with high homology to other esterases/lipases. Nl-EST1 has an N-terminal hydrophobic signal peptide sequence of 24 amino acids which suggests that the mature protein is secreted from cells expressing it. The nucleotide sequence of the homologue of Nl-EST1 in an OP-susceptible, low esterase Sri Lankan strain of N. lugens is identical to Nl-EST1. Southern analysis of genomic DNA from the Sri Lankan OP-resistant and susceptible strains suggests that Nl-EST1 is amplified in the resistant strain. Therefore, resistance to OPs in the Sri Lankan strain is through amplification of a gene identical to that found in the susceptible strain.
Bown, David P; Gatehouse, John A
2004-05-01
Carboxypeptidases were purified from guts of larvae of corn earworm (Helicoverpa armigera), a lepidopteran crop pest, by affinity chromatography on immobilized potato carboxypeptidase inhibitor, and characterized by N-terminal sequencing. A larval gut cDNA library was screened using probes based on these protein sequences. cDNA HaCA42 encoded a carboxypeptidase with sequence similarity to enzymes of clan MC [Barrett, A. J., Rawlings, N. D. & Woessner, J. F. (1998) Handbook of Proteolytic Enzymes. Academic Press, London.], but with a novel predicted specificity towards C-terminal acidic residues. This carboxypeptidase was expressed as a recombinant proprotein in the yeast Pichia pastoris. The expressed protein could be activated by treatment with bovine trypsin; degradation of bound pro-region, rather than cleavage of pro-region from mature protein, was the rate-limiting step in activation. Activated HaCA42 carboxypeptidase hydrolysed a synthetic substrate for glutamate carboxypeptidases (FAEE, C-terminal Glu), but did not hydrolyse substrates for carboxypeptidase A or B (FAPP or FAAK, C-terminal Phe or Lys) or methotrexate, cleaved by clan MH glutamate carboxypeptidases. The enzyme was highly specific for C-terminal glutamate in peptide substrates, with slow hydrolysis of C-terminal aspartate also observed. Glutamate carboxypeptidase activity was present in larval gut extract from H. armigera. The HaCA42 protein is the first glutamate-specific metallocarboxypeptidase from clan MC to be identified and characterized. The genome of Drosophila melanogaster contains genes encoding enzymes with similar sequences and predicted specificity, and a cDNA encoding a similar enzyme has been isolated from gut tissue in tsetse fly. We suggest that digestive carboxypeptidases with sequence similarity to the classical mammalian enzymes, but with specificity towards C-terminal glutamate, are widely distributed in insects.
[cDNA library construction from panicle meristem of finger millet].
Radchuk, V; Pirko, Ia V; Isaenkov, S V; Emets, A I; Blium, Ia B
2014-01-01
The protocol for production of full-length cDNA using the SuperScript Full-Length cDNA Library Construction Kit II (Invitrogen) was tested and a high-quality cDNA library from meristematic tissue of finger millet panicle (Eleusine coracana (L.) Gaertn) was created. The titer of the resulting cDNA library averaged 3.01 x 10(5) CFU/ml. The average cDNA insert length was about 1070 base pairs, and the efficiency of cDNA fragment insertion was 99.5%. Selective sequencing of cDNA clones from the library was performed, and the sequences of the cDNA clones were identified by BLAST search. The results of cDNA library analysis and selective sequencing demonstrate the good functionality and full-length character of the inserted cDNA clones. The cDNA library from meristematic tissue of finger millet panicle is a valuable source for the isolation and identification of key genes regulating metabolism and meristematic development, and for mining new molecular markers for high-quality genetic investigations and molecular breeding.
Coyne, Robert S; Thiagarajan, Mathangi; Jones, Kristie M; Wortman, Jennifer R; Tallon, Luke J; Haas, Brian J; Cassidy-Hanley, Donna M; Wiley, Emily A; Smith, Joshua J; Collins, Kathleen; Lee, Suzanne R; Couvillion, Mary T; Liu, Yifan; Garg, Jyoti; Pearlman, Ronald E; Hamilton, Eileen P; Orias, Eduardo; Eisen, Jonathan A; Methé, Barbara A
2008-01-01
Background Tetrahymena thermophila, a widely studied model for cellular and molecular biology, is a binucleated single-celled organism with a germline micronucleus (MIC) and somatic macronucleus (MAC). The recent draft MAC genome assembly revealed low sequence repetitiveness, a result of the epigenetic removal of invasive DNA elements found only in the MIC genome. Such low repetitiveness makes complete closure of the MAC genome a feasible goal, which to achieve would require standard closure methods as well as removal of minor MIC contamination of the MAC genome assembly. Highly accurate preliminary annotation of Tetrahymena's coding potential was hindered by the lack of both comparative genomic sequence information from close relatives and significant amounts of cDNA evidence, thus limiting the value of the genomic information and also leaving unanswered certain questions, such as the frequency of alternative splicing. Results We addressed the problem of MIC contamination using comparative genomic hybridization with purified MIC and MAC DNA probes against a whole genome oligonucleotide microarray, allowing the identification of 763 genome scaffolds likely to contain MIC-limited DNA sequences. We also employed standard genome closure methods to essentially finish over 60% of the MAC genome. For the improvement of annotation, we have sequenced and analyzed over 60,000 verified EST reads from a variety of cellular growth and development conditions. Using this EST evidence, a combination of automated and manual reannotation efforts led to updates that affect 16% of the current protein-coding gene models. By comparing EST abundance, many genes showing apparent differential expression between these conditions were identified. Rare instances of alternative splicing and uses of the non-standard amino acid selenocysteine were also identified. Conclusion We report here significant progress in genome closure and reannotation of Tetrahymena thermophila. 
Our experience to date suggests that complete closure of the MAC genome is attainable. Using the new EST evidence, automated and manual curation has resulted in substantial improvements to the over 24,000 gene models, which will be valuable to researchers studying this model organism as well as for comparative genomics purposes. PMID:19036158
Zhang, Li-Feng; Li, Wan-Feng; Han, Su-Ying; Yang, Wen-Hua; Qi, Li-Wang
2013-10-15
A full-length cDNA and genomic sequences of a translationally controlled tumor protein (TCTP) gene were isolated from Japanese larch (Larix leptolepis) and designated LaTCTP. The cDNA was 1,043 bp long and contained a 504 bp open reading frame that encodes a predicted protein of 167 amino acids, characterized by two signature sequences of the TCTP protein family. Analysis of the LaTCTP gene structure indicated four introns and five exons, and it is the largest of all currently known TCTP genes in plants. The 5'-flanking promoter region of LaTCTP was cloned using an improved TAIL-PCR technique. In this region we identified many important potential cis-acting elements, such as a Box-W1 (fungal elicitor responsive element), a CAT-box (cis-acting regulatory element related to meristem expression), a CGTCA-motif (cis-acting regulatory element involved in MeJA-responsiveness), a GT1-motif (light responsive element), a Skn-1-motif (cis-acting regulatory element required for endosperm expression) and a TGA-element (auxin-responsive element), suggesting that expression of LaTCTP is highly regulated. Expression analysis demonstrated ubiquitous localization of LaTCTP mRNA in the roots, stems and needles, high mRNA levels in the embryonal-suspensor mass (ESM), browning embryogenic cultures and mature somatic embryos, and low levels of mRNA at day five during somatic embryogenesis. We suggest that LaTCTP might participate in the regulation of somatic embryo development. These results provide a theoretical basis for understanding the molecular regulatory mechanism of LaTCTP and lay the foundation for artificial regulation of somatic embryogenesis. © 2013.
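The relationship between the 504 bp open reading frame and the 167-amino-acid protein reported above follows from the triplet code. The sketch below is a minimal Python illustration; it assumes that the quoted ORF length includes the terminal stop codon, which the abstract does not state explicitly, and the function name is hypothetical.

```python
def protein_length_from_orf(orf_bp: int) -> int:
    """Amino acids encoded by an ORF whose length includes the stop codon.

    Assumption: orf_bp counts every codon from the start codon through
    the stop codon, so one codon does not contribute an amino acid.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1  # the stop codon is not translated


# LaTCTP: 504 bp ORF -> 168 codons -> 167 amino acids.
print(protein_length_from_orf(504))
```

Under the same assumption, a 203-amino-acid protein such as the human MTHFS product listed earlier would need an ORF of (203 + 1) × 3 = 612 bp, which fits within the reported 872-bp cDNA.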
Generation and analysis of expressed sequence tags from the bone marrow of Chinese Sika deer.
Yao, Baojin; Zhao, Yu; Zhang, Mei; Li, Juan
2012-03-01
Sika deer is one of the best-known and highly valued animals of China. Despite its economic, cultural, and biological importance, there has not been a large-scale sequencing project for Sika deer to date. With the ultimate goal of sequencing the complete genome of this organism, we first established a bone marrow cDNA library for Sika deer and generated a total of 2,025 reads. After processing the sequences, 2,017 high-quality expressed sequence tags (ESTs) were obtained. These ESTs were assembled into 1,157 unigenes, including 238 contigs and 919 singletons. Comparative analyses indicated that 888 (76.75%) of the unigenes had significant matches to sequences in the non-redundant protein database. In addition to highly expressed genes, such as stearoyl-CoA desaturase, cytochrome c oxidase, adipocyte-type fatty acid-binding protein, adiponectin and thymosin beta-4, we also obtained vascular endothelial growth factor-A and heparin-binding growth-associated molecule, both of which are of great importance for angiogenesis research. There were 244 (21.09%) unigenes with no significant match to any sequence in current protein or nucleotide databases, and these sequences may represent genes with unknown function in Sika deer. Open reading frame analysis of the sequences was performed using the getorf program. In addition, the sequences were functionally classified using the gene ontology hierarchy, clusters of orthologous groups of proteins and Kyoto encyclopedia of genes and genomes databases. Analysis of ESTs described in this paper provides an important resource for the transcriptome exploration of Sika deer, and will also facilitate further studies on functional genomics, gene discovery and genome annotation of Sika deer.
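The unigene totals and percentages quoted for the Sika deer library can be reproduced from the raw counts. The following is a minimal Python sketch with illustrative variable names that are not taken from the source.

```python
# Counts quoted in the Sika deer EST abstract.
contigs = 238
singletons = 919
unigenes = contigs + singletons

matched_protein_db = 888   # significant match in the non-redundant protein database
no_match_any_db = 244      # no match in protein or nucleotide databases


def percent_of_unigenes(count: int) -> float:
    """Express a unigene count as a percentage of all unigenes, to 2 decimals."""
    return round(100 * count / unigenes, 2)


print(unigenes)
print(percent_of_unigenes(matched_protein_db))
print(percent_of_unigenes(no_match_any_db))

# Unigenes in neither quoted category; the abstract does not describe them.
print(unigenes - matched_protein_db - no_match_any_db)
```

The two quoted categories do not sum to the full set because they are defined against different databases, so the remainder is reported here without interpretation.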
Simon, J W; Slabas, A R
1998-09-18
The GenBank database was searched using the E. coli malonyl CoA:ACP transacylase (MCAT) sequence for plant protein/cDNA sequences corresponding to MCAT, a component of plant fatty acid synthetase (FAS) for which the plant cDNA has not been isolated. A 272-bp Zea mays EST sequence (GenBank accession number: AA030706) was identified which has strong homology to the E. coli MCAT. A PCR-derived cDNA probe from Zea mays was used to screen a Brassica napus (rape) cDNA library. This resulted in the isolation of a 1200-bp cDNA clone which encodes an open reading frame corresponding to a protein of 351 amino acids. The protein shows 47% homology to the E. coli MCAT amino acid sequence in the coding region for the mature protein. Expression of a plasmid (pMCATrap2) containing the plant cDNA sequence in Fab D89, an E. coli mutant in MCAT activity, restores growth, demonstrating functional complementation and direct function of the cloned cDNA. This is the first functional evidence supporting the identification of a plant cDNA for MCAT.
Tuo, Decai; Shen, Wentao; Yan, Pu; Li, Xiaoying; Zhou, Peng
2015-01-01
Papaya leaf distortion mosaic virus (PLDMV) is becoming a threat to papaya and transgenic papaya resistant to the related pathogen, papaya ringspot virus (PRSV). The generation of infectious viral clones is an essential step for reverse-genetics studies of viral gene function and cross-protection. In this study, a sequence- and ligation-independent cloning system, the In-Fusion® Cloning Kit (Clontech, Mountain View, CA, USA), was used to construct intron-less or intron-containing full-length cDNA clones of the isolate PLDMV-DF, with the simultaneous scarless assembly of multiple viral and intron fragments into a plasmid vector in a single reaction. The intron-containing full-length cDNA clone of PLDMV-DF was stably propagated in Escherichia coli. In vitro intron-containing transcripts were processed and spliced into biologically active intron-less transcripts following mechanical inoculation and then initiated systemic infections in Carica papaya L. seedlings, which developed similar symptoms to those caused by the wild-type virus. However, no infectivity was detected when the plants were inoculated with RNA transcripts from the intron-less construct because the instability of the viral cDNA clone in bacterial cells caused a non-sense or deletion mutation of the genomic sequence of PLDMV-DF. To our knowledge, this is the first report of the construction of an infectious full-length cDNA clone of PLDMV and the splicing of intron-containing transcripts following mechanical inoculation. In-Fusion cloning shortens the construction time from months to days. Therefore, it is a faster, more flexible, and more efficient method than the traditional multistep restriction enzyme-mediated subcloning procedure. PMID:26633465
Poirier, John T; Reddy, P Seshidhar; Idamakanti, Neeraja; Li, Shawn S; Stump, Kristine L; Burroughs, Kevin D; Hallenbeck, Paul L; Rudin, Charles M
2012-12-01
Seneca Valley virus (SVV-001) is an oncolytic picornavirus with selective tropism for a subset of human cancers with neuroendocrine differentiation. To characterize further the specificity of SVV-001 and its patterns and kinetics of intratumoral spread, bacterial plasmids encoding a cDNA clone of the full-length wild-type virus and a derivative virus expressing GFP were generated. The full-length cDNA of the SVV-001 RNA genome was cloned into a bacterial plasmid under the control of the T7 core promoter sequence to create an infectious cDNA clone, pNTX-09. A GFP reporter virus cDNA clone, pNTX-11, was then generated by cloning a fusion protein of GFP and the 2A protein from foot-and-mouth disease virus immediately following the native SVV-001 2A sequence. Recombinant GFP-expressing reporter virus, SVV-GFP, was rescued from cells transfected with in vitro RNA transcripts from pNTX-11 and propagated in cell culture. The proliferation kinetics of SVV-001 and SVV-GFP were indistinguishable. The SVV-GFP reporter virus was used to determine that a subpopulation of permissive cells is present in small-cell lung cancer cell lines previously thought to lack permissivity to SVV-001. Finally, it was shown that SVV-GFP administered to tumour-bearing animals homes in to and infects tumours whilst having no detectable tropism for normal mouse tissues at 1×10¹¹ viral particles kg⁻¹, a dose equivalent to that administered in ongoing clinical trials. These infectious clones will be of substantial value in further characterizing the biology of this virus and as a backbone for the generation of additional oncolytic derivatives.
Parallel gene analysis with allele-specific padlock probes and tag microarrays
Banér, Johan; Isaksson, Anders; Waldenström, Erik; Jarvius, Jonas; Landegren, Ulf; Nilsson, Mats
2003-01-01
Parallel, highly specific analysis methods are required to take advantage of the extensive information about DNA sequence variation and of expressed sequences. We present a scalable laboratory technique suitable to analyze numerous target sequences in multiplexed assays. Sets of padlock probes were applied to analyze single nucleotide variation directly in total genomic DNA or cDNA for parallel genotyping or gene expression analysis. All reacted probes were then co-amplified and identified by hybridization to a standard tag oligonucleotide array. The technique was illustrated by analyzing normal and pathogenic variation within the Wilson disease-related ATP7B gene, both at the level of DNA and RNA, using allele-specific padlock probes. PMID:12930977
Comparison of next generation sequencing technologies for transcriptome characterization
2009-01-01
Background We have developed a simulation approach to help determine the optimal mixture of sequencing methods for most complete and cost effective transcriptome sequencing. We compared simulation results for traditional capillary sequencing with "Next Generation" (NG) ultra high-throughput technologies. The simulation model was parameterized using mappings of 130,000 cDNA sequence reads to the Arabidopsis genome (NCBI Accession SRA008180.19). We also generated 454-GS20 sequences and de novo assemblies for the basal eudicot California poppy (Eschscholzia californica) and the magnoliid avocado (Persea americana) using a variety of methods for cDNA synthesis. Results The Arabidopsis reads tagged more than 15,000 genes, including new splice variants and extended UTR regions. Of the total 134,791 reads (13.8 MB), 119,518 (88.7%) mapped exactly to known exons, while 1,117 (0.8%) mapped to introns, 11,524 (8.6%) spanned annotated intron/exon boundaries, and 3,066 (2.3%) extended beyond the end of annotated UTRs. Sequence-based inference of relative gene expression levels correlated significantly with microarray data. As expected, NG sequencing of normalized libraries tagged more genes than non-normalized libraries, although non-normalized libraries yielded more full-length cDNA sequences. The Arabidopsis data were used to simulate additional rounds of NG and traditional EST sequencing, and various combinations of each. Our simulations suggest a combination of FLX and Solexa sequencing for optimal transcriptome coverage at modest cost. We have also developed ESTcalc http://fgp.huck.psu.edu/NG_Sims/ngsim.pl, an online webtool, which allows users to explore the results of this study by specifying individualized costs and sequencing characteristics. Conclusion NG sequencing technologies are a highly flexible set of platforms that can be scaled to suit different project goals. 
In terms of sequence coverage alone, the NG sequencing is a dramatic advance over capillary-based sequencing, but NG sequencing also presents significant challenges in assembly and sequence accuracy due to short read lengths, method-specific sequencing errors, and the absence of physical clones. These problems may be overcome by hybrid sequencing strategies using a mixture of sequencing methodologies, by new assemblers, and by sequencing more deeply. Sequencing and microarray outcomes from multiple experiments suggest that our simulator will be useful for guiding NG transcriptome sequencing projects in a wide range of organisms. PMID:19646272
Morozumi, Takeya; Toki, Daisuke; Eguchi-Ogawa, Tomoko; Uenishi, Hirohide
2011-09-01
Large-scale cDNA-sequencing projects require an efficient strategy for mass sequencing. Here we describe a method for sequencing pooled cDNA clones using a combination of transposon insertion and Gateway technology. Our method reduces the number of shotgun clones that are unsuitable for reconstruction of cDNA sequences, and has the advantage of reducing the total costs of the sequencing project.
Conditional poliovirus mutants made by random deletion mutagenesis of infectious cDNA.
Kirkegaard, K; Nelsen, B
1990-01-01
Small deletions were introduced into DNA plasmids bearing cDNA copies of Mahoney type 1 poliovirus RNA. The procedure used was similar to that of P. Hearing and T. Shenk (J. Mol. Biol. 167:809-822, 1983), with modifications designed to introduce only one lesion randomly into each DNA molecule. Methods to map small deletions in either large DNA or RNA molecules were employed. Two poliovirus mutants, VP1-101 and VP1-102, were selected from mutagenized populations on the basis of their host range phenotype, showing a large reduction in the relative numbers of plaques on CV1 and HeLa cells compared with wild-type virus. The deletions borne by the mutant genomes were mapped to the region encoding the amino terminus of VP1. That these lesions were responsible for the mutant phenotypes was substantiated by reintroduction of the sequenced lesions into a wild-type poliovirus cDNA by deoxyoligonucleotide-directed mutagenesis. The deletion of nucleotides encoding amino acids 8 and 9 of VP1 was responsible for the VP1-101 phenotype; the VP1-102 defect was caused by the deletion of the sequences encoding the first four amino acids of VP1. The peptide sequence at the VP1-VP3 proteolytic cleavage site was altered from glutamine-glycine to glutamine-methionine in VP1-102; this apparently did not alter the proteolytic cleavage pattern. The biochemical defects resulting from these mutations are discussed in the accompanying report. PMID:2152811
Oliveira-Neto, Osmundo B; Batista, João A N; Rigden, Daniel J; Fragoso, Rodrigo R; Silva, Rodrigo O; Gomes, Eliane A; Franco, Octávio L; Dias, Simoni C; Cordeiro, Célia M T; Monnerat, Rose G; Grossi-De-Sá, Maria F
2004-09-01
Fourteen different cDNA fragments encoding serine proteinases were isolated by reverse transcription-PCR from cotton boll weevil (Anthonomus grandis) larvae. A large diversity between the sequences was observed, with a mean pairwise identity of 22% in the amino acid sequence. The cDNAs encompassed 11 trypsin-like sequences classifiable into three families and three chymotrypsin-like sequences belonging to a single family. Using a combination of 5' and 3' RACE, the full-length sequence was obtained for five of the cDNAs, named Agser2, Agser5, Agser6, Agser10 and Agser21. The encoded proteins included amino acid sequence motifs of serine proteinase active sites, conserved cysteine residues, and both zymogen activation and signal peptides. Southern blotting analysis suggested that one or two copies of these serine proteinase genes exist in the A. grandis genome. Northern blotting analysis of Agser2 and Agser5 showed that for both genes, expression is induced upon feeding and is concentrated in the gut of larvae and adult insects. Reverse northern analysis of the 14 cDNA fragments showed that only two trypsin-like and two chymotrypsin-like were expressed at detectable levels. Under the effect of the serine proteinase inhibitors soybean Kunitz trypsin inhibitor and black-eyed pea trypsin/chymotrypsin inhibitor, expression of one of the trypsin-like sequences was upregulated while expression of the two chymotrypsin-like sequences was downregulated. Copyright 2004 Elsevier Ltd.
Large-Scale Concatenation cDNA Sequencing
Yu, Wei; Andersson, Björn; Worley, Kim C.; Muzny, Donna M.; Ding, Yan; Liu, Wen; Ricafrente, Jennifer Y.; Wentland, Meredith A.; Lennon, Greg; Gibbs, Richard A.
1997-01-01
A total of 100 kb of DNA derived from 69 individual human brain cDNA clones of 0.7–2.0 kb were sequenced by concatenated cDNA sequencing (CCS), whereby multiple individual DNA fragments are sequenced simultaneously in a single shotgun library. The method yielded accurate sequences and a similar efficiency compared with other shotgun libraries constructed from single DNA fragments (>20 kb). Computer analyses were carried out on 65 cDNA clone sequences and their corresponding end sequences to examine both nucleic acid and amino acid sequence similarities in the databases. Thirty-seven clones revealed no DNA database matches, 12 clones generated exact matches (≥98% identity), and 16 clones generated nonexact matches (57%–97% identity) to either known human or other species genes. Of those 28 matched clones, 8 had corresponding end sequences that failed to identify similarities. In a protein similarity search, 27 clone sequences displayed significant matches, whereas only 20 of the end sequences had matches to known protein sequences. Our data indicate that full-length cDNA insert sequences provide significantly more nucleic acid and protein sequence similarity matches than expressed sequence tags (ESTs) for database searching. [All 65 cDNA clone sequences described in this paper have been submitted to the GenBank data library under accession nos. U79240–U79304.] PMID:9110174
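The match categories in the abstract above can be sketched as a small classifier. This is only an illustration: the 98% cutoff comes from the abstract, but treating every hit below 98% as "nonexact" (rather than enforcing the reported 57%–97% range as a hard bound) is an assumption, and the function names are hypothetical.

```python
from collections import Counter

# Percent-identity cutoff for an "exact" match, as stated in the abstract.
EXACT_THRESHOLD = 98.0


def classify_match(identity_pct):
    """Bin a clone by the percent identity of its best database hit.

    None means no hit was found. Any hit below the exact threshold is
    called "nonexact" (an assumption; the abstract reports 57%-97%).
    """
    if identity_pct is None:
        return "no match"
    return "exact" if identity_pct >= EXACT_THRESHOLD else "nonexact"


def tally(identities):
    """Count clones per category for a list of best-hit identities."""
    return Counter(classify_match(p) for p in identities)


# Category counts reported in the abstract for the analysed clones.
reported = {"no match": 37, "exact": 12, "nonexact": 16}
total_clones = sum(reported.values())
matched_clones = reported["exact"] + reported["nonexact"]
```

The reported counts are internally consistent: the three categories add up to the 65 analysed clones, and the exact plus nonexact categories give the 28 matched clones mentioned in the text.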
Chromosome-specific physical localisation of expressed sequence tag loci in Corchorus olitorius L.
Joshi, A; Das, S K; Samanta, P; Paria, P; Sen, S K; Basu, A
2014-11-01
Jute (Corchorus spp.), as a natural fibre-producing species, ranks next only to cotton. Inadequate understanding of its genetic architecture is a major lacuna for genetic improvement of this crop in terms of yield and quality. Establishment of a physical map provides a genomic tool that helps in positional cloning of valuable genes. In this report, an attempt was initiated to study association and localisation of single copy expressed sequence tag (EST) loci in the genome of Corchorus olitorius. The chromosome-specific association of EST was determined based on the appearance of an extra signal for a single copy cDNA probe in mitotic interphase nuclei of specific trisomic(s) for fluorescence in situ hybridisation, and validated using a cDNA fragment of the 26S rRNA gene (600 bp) as molecular probe. The probe exhibited three signals in meiotic interphase nuclei of trisomic 5, instead of two as observed in diploids and other trisomics, indicating its association with chromosome 5. Subsequent hybridisation of the same probe on the pachytene chromosomes of diploids confirmed that 26S rRNA occupies the terminal end of the short arm of chromosome 5 in C. olitorius. Subsequently, chromosome-specific association of 63 single copy EST and their physical localisation were determined on chromosomes 2, 4, 5 and 7. The study describes chromosome-specific physical localisation of genes in jute. The approach used here could be a step towards construction of genome-wide physical maps for any recalcitrant plant species like jute. © 2014 German Botanical Society and The Royal Botanical Society of the Netherlands.
Zeng, Yichun; Hou, Yi-Ling; Ding, Xiang; Hou, Wan-Ru; Li, Jian
2014-01-01
Barrier to autointegration factor 1 (BANF1) is a DNA-binding protein found in the nucleus and cytoplasm of eukaryotic cells that functions to establish nuclear architecture during mitosis. The cDNA and the genomic sequence of BANF1 were cloned from the Giant Panda (Ailuropoda melanoleuca) and Black Bear (Ursus thibetanus mupinensis) using RT-PCR technology and Touchdown-PCR, respectively. The BANF1 cDNA cloned from both Giant Panda and Black Bear is 297 bp in size and contains an open reading frame of 270 bp encoding 89 amino acids. The genomic sequence is 521 bp in the Giant Panda and 536 bp in the Black Bear; both contain 2 exons. Alignment analysis indicated that the nucleotide sequence and the deduced amino acid sequence are highly conserved relative to those of some other mammalian species studied. Topology prediction showed that the BANF1 protein of the Giant Panda has one Protein kinase C phosphorylation site, one Casein kinase II phosphorylation site, one Tyrosine kinase phosphorylation site, one N-myristoylation site, and one Amidation site, whereas the BANF1 protein of the Black Bear has one Protein kinase C phosphorylation site, one Tyrosine kinase phosphorylation site, one N-myristoylation site, and one Amidation site. The BANF1 gene can be readily expressed in E. coli. Expression of BANF1 as an N-terminally His-tagged fusion gave rise to the accumulation of an expected 14 kD polypeptide that formed inclusion bodies. The expression products obtained could be used to purify the proteins and study their function further.
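The relation between the 270 bp open reading frame and the 89-residue protein in the abstract above can be checked with a short calculation. The sketch assumes that the reported ORF length includes the stop codon, which is what makes the two numbers agree; the function name is hypothetical.

```python
def protein_length_from_orf(orf_bp, includes_stop=True):
    """Return the number of encoded amino acids for an ORF of orf_bp bases.

    Assumes a continuous reading frame. If the ORF length counts the stop
    codon (the assumption used here), one codon is subtracted.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# Values from the abstract: 297 bp cDNA containing a 270 bp ORF.
banf1_residues = protein_length_from_orf(270)
non_coding_bp = 297 - 270  # bases of the cloned cDNA outside the ORF
```

With these inputs, 270 bp corresponds to 90 codons, of which 89 encode amino acids, leaving 27 bp of the cloned cDNA outside the ORF.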
Li, Chun; Haug, Tor; Moe, Morten K; Styrvold, Olaf B; Stensvåg, Klara
2010-09-01
As immune effector molecules, antimicrobial peptides (AMPs) play an important role in the invertebrate immune system. Here, we present two novel AMPs, named centrocins 1 (4.5kDa) and 2 (4.4kDa), purified from coelomocyte extracts of the green sea urchin, Strongylocentrotus droebachiensis. The native peptides are cationic and show potent activities against Gram-positive and Gram-negative bacteria. The centrocins have an intramolecular heterodimeric structure, containing a heavy chain (30 amino acids) and a light chain (12 amino acids). The cDNA encoding the peptides and genomic sequences were cloned and sequenced. One putative isoform (centrocin 1b) was identified and one intron was found in the genes coding for the centrocins. The full length protein sequence of centrocin 1 consists of 119 amino acids, whereas centrocin 2 consists of 118 amino acids which both include a preprosequence of 51 or 50 amino acids for centrocins 1 and 2, respectively, and an interchain of 24 amino acids between the heavy and light chain. The difference of molecular mass between the native centrocins and the deduced sequences from cDNA indicates that the native centrocins contain a post-translational brominated tryptophan. In addition, two amino acids at the C-terminal, Gly-Arg, were removed from the light chains during the post-translational processing. The separate peptide chains of centrocin 1 were synthesized and the heavy chain alone was shown to be sufficient for antimicrobial activity. The genome of the closely related species, the purple sea urchin (S. purpuratus), was shown to contain two putative proteins with high similarity to the centrocins. Copyright 2010 Elsevier Ltd. All rights reserved.
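The segment lengths given in the abstract above can be added up to see how they account for the full-length precursors. This is a rough consistency check, assuming the segments are contiguous and non-overlapping, that the heavy and light chain lengths apply to both centrocins, and that the removed C-terminal Gly-Arg dipeptide is counted separately from the 12-residue light chain; the dictionary layout and names are illustrative only.

```python
# Segment lengths in amino acids, taken from the abstract. Whether the
# Gly-Arg dipeptide lies outside the 12-residue light chain is an assumption.
CENTROCIN_SEGMENTS = {
    "centrocin 1": {
        "preprosequence": 51,
        "heavy chain": 30,
        "interchain": 24,
        "light chain": 12,
        "C-terminal Gly-Arg": 2,
    },
    "centrocin 2": {
        "preprosequence": 50,
        "heavy chain": 30,
        "interchain": 24,
        "light chain": 12,
        "C-terminal Gly-Arg": 2,
    },
}


def precursor_length(segments):
    """Sum the segment lengths of one precursor."""
    return sum(segments.values())


lengths = {name: precursor_length(seg) for name, seg in CENTROCIN_SEGMENTS.items()}
```

Under these assumptions the totals reproduce the reported precursor sizes of 119 and 118 amino acids.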
Evangelista, Danilo Elton; de Paula, Fernando Fonseca Pereira; Rodrigues, André; Henrique-Silva, Flávio
2015-01-01
The cell wall in plants offers protection against invading organisms and is mainly composed of the polysaccharides pectin, cellulose, and hemicellulose, which can be degraded by plant cell wall degrading enzymes (PCWDEs). Such enzymes are often synthesized by free living microorganisms or endosymbionts that live in the gut of some animals, including certain phytophagous insects. Thus, the ability of an insect to degrade the cell wall was once thought to be related to endosymbiont enzyme activity. However, recent studies have revealed that some phytophagous insects are able to synthesize their own PCWDEs by endogenous genes, although the origin of these genes remains unclear. This study describes two pectinases from the sugarcane weevil, Sphenophorus levis Vaurie, 1978 (Sl-pectinases), which is considered one of the most serious agricultural pests in Brazil. Two cDNA sequences identified in a cDNA library of the insect larvae coding for a pectin methylesterase (PME) and an endo-polygalacturonase (endo-PG), denominated Sl-PME and Sl-endoPG, respectively, were isolated and characterized. The quantitative real-time reverse transcriptase polymerase chain reaction expression profile for both Sl-pectinases showed mRNA production mainly in the insect feeding stages and exclusively in midgut tissue of the larvae. This analysis, together with Western blotting data, suggests that Sl-pectinases have a digestive role. Phylogenetic analyses indicate that Sl-PME and Sl-endoPG sequences are closely related to bacteria and fungi, respectively. Moreover, the partial genomic sequences of the pectinases were amplified from insect fat body DNA, which was certified to be free of endosymbiotic DNA. The analysis of genomic sequences revealed the existence of two small introns with 53 and 166 bp in Sl-endoPG, which is similar to the common pattern in fungal introns. In contrast, no intron was identified in the Sl-PME genomic sequence, as generally observed in bacteria. These data support the theory of horizontal gene transfer proposed for the origin of insect pectinases, reinforcing the acquisition of PME genes from bacteria and endo-PG genes from fungi. PMID:25673050
Gritsun, T S; Gould, E A
1998-12-01
In less than 1 month we have constructed an infectious clone of attenuated tick-borne encephalitis virus (strain Vasilchenko) from 100 µl of unpurified virus suspension using long high fidelity PCR and a modified bacterial cloning system. Optimization of the 3' antisense primer concentration was essential to achieve PCR synthesis of an 11 kb cDNA copy of RNA from infectious virus. A novel system utilising two antisense primers, a 14-mer for reverse transcription and a 35-mer for long PCR, produced high yields of genomic length cDNA. Use of low copy number Able K cells and an incubation temperature of 28 °C increased the genetic stability of cloned cDNA. Clones containing 11 kb cDNA inserts produced colonies of reduced size, thus providing a positive selection system for full length clones. Sequencing of the infectious clone emphasised the improved fidelity of the method compared with conventional PCR and cloning methods. A simple and rapid strategy for genetic manipulation of the infectious clone is also described. These developments represent a significant advance in recombinant technology and should be applicable to positive stranded RNA viruses which cannot easily be purified or genetically manipulated.
Lu, W; Wainwright, G; Olohan, L A; Webster, S G; Rees, H H; Turner, P C
2001-10-31
Synthesis of ecdysteroids (molting hormones) by crustacean Y-organs is regulated by a neuropeptide, molt-inhibiting hormone (MIH), produced in eyestalk neural ganglia. We report here the molecular cloning of a cDNA encoding MIH of the edible crab, Cancer pagurus. Full-length MIH cDNA was obtained by using reverse transcription-polymerase chain reaction (RT-PCR) with degenerate oligonucleotides based upon the amino acid sequence of MIH, in conjunction with 5'- and 3'-RACE. Full-length clones of MIH cDNA were obtained that encoded a 35 amino acid putative signal peptide and the mature 78 amino acid peptide. Of various tissues examined by Northern blot analysis, the X-organ was the sole major site of expression of the MIH gene. However, a nested-PCR approach using non-degenerate MIH-specific primers indicated the presence of MIH transcripts in other tissues. Southern blot analysis indicated a simple gene arrangement with at least two copies of the MIH gene in the genome of C. pagurus. Additional Southern blotting experiments detected MIH-hybridizing bands in another Cancer species, Cancer antennarius and another crab species, Carcinus maenas.
Kim, Jeongwoon; Matsuba, Yuki; Ning, Jing; Schilmiller, Anthony L.; Hammar, Dagan; Jones, A. Daniel; Pichersky, Eran; Last, Robert L.
2014-01-01
Flavonoids are ubiquitous plant aromatic specialized metabolites found in a variety of cell types and organs. Methylated flavonoids are detected in secreting glandular trichomes of various Solanum species, including the cultivated tomato (Solanum lycopersicum). Inspection of the sequenced S. lycopersicum Heinz 1706 reference genome revealed a close homolog of Solanum habrochaites MOMT1 3′/5′ myricetin O-methyltransferase gene, but this gene (Solyc06g083450) is missing the first exon, raising the question of whether cultivated tomato has a distinct 3′ or 3′/5′ O-methyltransferase. A combination of mining genome and cDNA sequences from wild tomato species and S. lycopersicum cultivar M82 led to the identification of Sl-MOMT4 as a 3′ O-methyltransferase. In parallel, three independent ethyl methanesulfonate mutants in the S. lycopersicum cultivar M82 background were identified as having reduced amounts of di- and trimethylated myricetins and increased monomethylated myricetin. Consistent with the hypothesis that Sl-MOMT4 is a 3′ O-methyltransferase gene, all three myricetin methylation defective mutants were found to have defects in MOMT4 sequence, transcript accumulation, or 3′-O-methyltransferase enzyme activity. Surprisingly, no MOMT4 sequence is found in the Heinz 1706 reference genome sequence, and this cultivar accumulates 3-methyl myricetin and is deficient in 3′-methyl myricetins, demonstrating variation in this gene among cultivated tomato varieties. PMID:25128240
A new polymorphic and multicopy MHC gene family related to nonmammalian class I
DOE Office of Scientific and Technical Information (OSTI.GOV)
Leelayuwat, C.; Degli-Esposti, M.A.; Abraham, L.J.
1994-12-31
The authors have used genomic analysis to characterize a region of the central major histocompatibility complex (MHC) spanning approximately 300 kilobases (kb) between TNF and HLA-B. This region has been suggested to carry genetic factors relevant to the development of autoimmune diseases such as myasthenia gravis (MG) and insulin dependent diabetes mellitus (IDDM). Genomic sequence was analyzed for coding potential, using two neural network programs, GRAIL and GeneParser. A genomic probe, JAB, containing putative coding sequences (PERB11) located 60 kb centromeric of HLA-B, was used for northern analysis of human tissues. Multiple transcripts were detected. Southern analysis of genomic DNA and overlapping YAC clones, covering the region from BAT1 to HLA-F, indicated that there are at least five copies of PERB11, four of which are located within this region of the MHC. The partial cDNA sequence of PERB11 was obtained from poly-A RNA derived from skeletal muscle. The putative amino acid sequence of PERB11 shares approximately 30% identity to MHC class I molecules from various species, including reptiles, chickens, and frogs, as well as to other MHC class I-like molecules, such as the IgG FcR of the mouse and rat and the human Zn-α2-glycoprotein. From direct comparison of amino acid sequences, it is concluded that PERB11 is a distinct molecule more closely related to nonmammalian than known mammalian MHC class I molecules. Genomic sequence analysis of PERB11 from five MHC ancestral haplotypes (AH) indicated that the gene is polymorphic at both DNA and protein level. The results suggest that the authors have identified a novel polymorphic gene family with multiple copies within the MHC. 48 refs., 10 figs., 2 tabs.
Liu, X; Gorovsky, M A
1996-01-01
A truncated cDNA clone encoding Tetrahymena thermophila histone H2A2 was isolated using synthetic degenerate oligonucleotide probes derived from H2A protein sequences of Tetrahymena pyriformis. The cDNA clone was used as a homologous probe to isolate a truncated genomic clone encoding H2A1. The remaining regions of the genes for H2A1 (HTA1) and H2A2 (HTA2) were then isolated using inverse PCR on circularized genomic DNA fragments. These partial clones were assembled into intact HTA1 and HTA2 clones. Nucleotide sequences of the two genes were highly homologous within the coding region but not in the noncoding regions. Comparison of the deduced amino acid sequences with protein sequences of T. pyriformis H2As showed only two and three differences respectively, in a total of 137 amino acids for H2A1, and 132 amino acids for H2A2, indicating the two genes arose before the divergence of these two species. The HTA2 gene contains a TAA triplet within the coding region, encoding a glutamine residue. In contrast with the T. thermophila HHO and HTA3 genes, no introns were identified within the two genes. The 5'- and 3'-ends of the histone H2A mRNAs were determined by RNase protection and by PCR mapping using RACE and RLM-RACE methods. Both genes encode polyadenylated mRNAs and are highly expressed in vegetatively growing cells but only weakly expressed in starved cultures. With the inclusion of these two genes, T. thermophila is the first organism whose entire complement of known core and linker histones, including replication-dependent and basal variants, has been cloned and sequenced. PMID:8760889
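The in-frame TAA codon read as glutamine in the abstract above reflects the ciliate nuclear genetic code (NCBI translation table 6), in which TAA and TAG encode glutamine and only TGA terminates translation. The sketch below contrasts it with the standard code; the short test sequence is a made-up toy, not part of HTA2, and the helper names are hypothetical.

```python
from itertools import product

BASES = "TCAG"
# Amino acids for the 64 codons in TTT, TTC, TTA, TTG, TCT, ... order.
STANDARD_AAS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Ciliate nuclear code (NCBI table 6): TAA and TAG encode Q, TGA is stop.
CILIATE_AAS = "FFLLSSSSYYQQCC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"


def build_table(amino_acids):
    """Map each codon to its one-letter amino acid ('*' marks a stop)."""
    codons = ("".join(c) for c in product(BASES, repeat=3))
    return dict(zip(codons, amino_acids))


def translate(dna, table):
    """Translate in frame 1 and stop at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - len(dna) % 3, 3):
        aa = table[dna[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


toy_orf = "ATGTAAGGCTGA"  # hypothetical sequence with an internal TAA
standard_protein = translate(toy_orf, build_table(STANDARD_AAS))
ciliate_protein = translate(toy_orf, build_table(CILIATE_AAS))
```

Translated with the standard table, the toy sequence terminates at the TAA codon; with the ciliate table the same codon is read through as glutamine and translation continues to the TGA stop.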
Wheat EST resources for functional genomics of abiotic stress
Houde, Mario; Belcaid, Mahdi; Ouellet, François; Danyluk, Jean; Monroy, Antonio F; Dryanova, Ani; Gulick, Patrick; Bergeron, Anne; Laroche, André; Links, Matthew G; MacCarthy, Luke; Crosby, William L; Sarhan, Fathey
2006-01-01
Background Wheat is an excellent species to study freezing tolerance and other abiotic stresses. However, the sequence of the wheat genome has not been completely characterized due to its complexity and large size. To circumvent this obstacle and identify genes involved in cold acclimation and associated stresses, a large scale EST sequencing approach was undertaken by the Functional Genomics of Abiotic Stress (FGAS) project. Results We generated 73,521 quality-filtered ESTs from eleven cDNA libraries constructed from wheat plants exposed to various abiotic stresses and at different developmental stages. In addition, 196,041 ESTs for which tracefiles were available from the National Science Foundation wheat EST sequencing program and DuPont were also quality-filtered and used in the analysis. Clustering of the combined ESTs with d2_cluster and TGICL yielded a few large clusters containing several thousand ESTs that were refractory to routine clustering techniques. To resolve this problem, the sequence proximity and "bridges" were identified by an e-value distance graph to manually break clusters into smaller groups. Assembly of the resolved ESTs generated a 75,488 unique sequence set (31,580 contigs and 43,908 singletons/singlets). Digital expression analyses indicated that the FGAS dataset is enriched in stress-regulated genes compared to the other public datasets. Over 43% of the unique sequence set was annotated and classified into functional categories according to Gene Ontology. Conclusion We have annotated 29,556 different sequences, an almost 5-fold increase in annotated sequences compared to the available wheat public databases. Digital expression analysis combined with gene annotation helped in the identification of several pathways associated with abiotic stress. The genomic resources and knowledge developed by this project will contribute to a better understanding of the different mechanisms that govern stress tolerance in wheat and other cereals. 
PMID:16772040
Mammalian cDNA Library from the NIH Mammalian Gene Collection (MGC) | Office of Cancer Genomics
The MGC provides the research community full-length clones for most of the defined (as of 2006) human and mouse genes, along with selected clones of cow and rat genes. Clones were designed to allow easy transfer of the ORF sequences into nearly any type of expression vector. MGC provides protein ‘expression-ready’ clones for each of the included human genes. MGC is part of the ORFeome Collaboration (OC).
Virtual Northern analysis of the human genome.
Hurowitz, Evan H; Drori, Iddo; Stodden, Victoria C; Donoho, David L; Brown, Patrick O
2007-05-23
We applied the Virtual Northern technique to human brain mRNA to systematically measure human mRNA transcript lengths on a genome-wide scale. We used separation by gel electrophoresis followed by hybridization to cDNA microarrays to measure 8,774 mRNA transcript lengths representing at least 6,238 genes at high (>90%) confidence. By comparing these transcript lengths to the Refseq and H-Invitational full-length cDNA databases, we found that nearly half of our measurements appeared to represent novel transcript variants. Comparison of length measurements determined by hybridization to different cDNAs derived from the same gene identified clones that potentially correspond to alternative transcript variants. We observed a close linear relationship between ORF and mRNA lengths in human mRNAs, identical in form to the relationship we had previously identified in yeast. Some functional classes of protein are encoded by mRNAs whose untranslated regions (UTRs) tend to be longer or shorter than average; these functional classes were similar in both human and yeast. Human transcript diversity is extensive and largely unannotated. Our length dataset can be used as a new criterion for judging the completeness of cDNAs and annotating mRNA sequences. Similar relationships between the lengths of the UTRs in human and yeast mRNAs and the functions of the proteins they encode suggest that UTR sequences serve an important regulatory role among eukaryotes.
Candidate gene database and transcript map for peach, a model species for fruit trees.
Horn, Renate; Lecouls, Anne-Claire; Callahan, Ann; Dandekar, Abhaya; Garay, Lilibeth; McCord, Per; Howad, Werner; Chan, Helen; Verde, Ignazio; Main, Doreen; Jung, Sook; Georgi, Laura; Forrest, Sam; Mook, Jennifer; Zhebentyayeva, Tatyana; Yu, Yeisoo; Kim, Hye Ran; Jesudurai, Christopher; Sosinski, Bryon; Arús, Pere; Baird, Vance; Parfitt, Dan; Reighard, Gregory; Scorza, Ralph; Tomkins, Jeffrey; Wing, Rod; Abbott, Albert Glenn
2005-05-01
Peach (Prunus persica) is a model species for the Rosaceae, which includes a number of economically important fruit tree species. To develop an extensive Prunus expressed sequence tag (EST) database for identifying and cloning the genes important to fruit and tree development, we generated 9,984 high-quality ESTs from a peach cDNA library of developing fruit mesocarp. After assembly and annotation, a putative peach unigene set consisting of 3,842 ESTs was defined. Gene ontology (GO) classification was assigned based on the annotation of the single "best hit" match against the Swiss-Prot database. No significant homology could be found in the GenBank nr databases for 24.3% of the sequences. Using core markers from the general Prunus genetic map, we anchored bacterial artificial chromosome (BAC) clones on the genetic map, thereby providing a framework for the construction of a physical and transcript map. A transcript map was developed by hybridizing 1,236 ESTs from the putative peach unigene set and an additional 68 peach cDNA clones against the peach BAC library. Hybridizing ESTs to genetically anchored BACs immediately localized 11.2% of the ESTs on the genetic map. ESTs showed a clustering of expressed genes in defined regions of the linkage groups. [The data were built into a regularly updated Genome Database for Rosaceae (GDR), available at (http://www.genome.clemson.edu/gdr/).].
Cloning and characterization of a Candida albicans maltase gene involved in sucrose utilization.
Geber, A; Williamson, P R; Rex, J H; Sweeney, E C; Bennett, J E
1992-01-01
In order to isolate the structural gene involved in sucrose utilization, we screened a sucrose-induced Candida albicans cDNA library for clones expressing alpha-glucosidase activity. The C. albicans maltase structural gene (CAMAL2) was isolated. No other clones expressing alpha-glucosidase activity were detected. A genomic CAMAL2 clone was obtained by screening a size-selected genomic library with the cDNA clone. DNA sequence analysis reveals that CAMAL2 encodes a 570-amino-acid protein which shares 50% identity with the maltase structural gene (MAL62) of Saccharomyces carlsbergensis. The substrate specificity of the recombinant protein purified from Escherichia coli identifies the enzyme as a maltase. Northern (RNA) analysis reveals that transcription of CAMAL2 is induced by maltose and sucrose and repressed by glucose. These results suggest that assimilation of sucrose in C. albicans relies on an inducible maltase enzyme. The family of genes controlling sucrose utilization in C. albicans shares similarities with the MAL gene family of Saccharomyces cerevisiae and provides a model system for studying gene regulation in this pathogenic yeast. PMID:1400249
Millard, T P; Ashton, G H S; Kondeatis, E; Vaughan, R W; Hughes, G R V; Khamashta, M A; Hawk, J L M; McGregor, J M; McGrath, J A
2002-02-01
The Ro 60 kDa protein (Ro60 or SSA2) is the major component of the Ro ribonucleoprotein (Ro RNP) complex, to which an immune response is a specific feature of several autoimmune diseases. The genomic organization and any sequence variation within the DNA encoding Ro60 are unknown. To characterize the Ro60 gene structure and to assess whether any sequence alterations might be associated with serum anti-Ro antibody in subacute cutaneous lupus erythematosus (SCLE), thus potentially providing new insight into disease pathogenesis. The cDNA sequence for Ro60 was obtained from the NCBI database and used for a BLAST search for a clone containing the entire genomic sequence. The intron-exon borders were confirmed by designing intronic primer pairs to flank each exon, which were then used to amplify genomic DNA for automated sequencing from 36 caucasian patients with SCLE (anti-Ro positive) and 49 with discoid LE (DLE, anti-Ro negative), in addition to 36 healthy caucasian controls. Heteroduplex analysis of polymerase chain reaction (PCR) products from patients and controls spanning all Ro60 exons (1-8) revealed a common bandshift in the PCR products spanning exon 7. Sequencing of the corresponding PCR products demonstrated an A > G substitution at nucleotide position 1318-7, within the consensus acceptor splice site of exon 7 (GenBank XM001901). The allele frequencies were major allele A (0.71) and minor allele G (0.29) in 72 control chromosomes, with no significant differences found between SCLE patients, DLE patients and controls. The genomic organization of the DNA encoding the Ro60 protein is described, including a common polymorphism within the consensus acceptor splice site of exon 7. Our delineation of a strategy for the genomic amplification of Ro60 forms a basis for further examination of the pathological functions of the Ro RNP in autoimmune disease.
Cloning, sequencing, and expression of cDNA for human β-glucuronidase
DOE Office of Scientific and Technical Information (OSTI.GOV)
Oshima, A.; Kyle, J.W.; Miller, R.D.
1987-02-01
The authors report here the cDNA sequence for human placental β-glucuronidase (β-D-glucuronoside glucuronosohydrolase, EC 3.2.1.31) and demonstrate expression of the human enzyme in transfected COS cells. They also sequenced a partial cDNA clone from human fibroblasts that contained a 153-base-pair deletion within the coding sequence and found a second type of cDNA clone from placenta that contained the same deletion. Nuclease S1 mapping studies demonstrated two types of mRNAs in human placenta that corresponded to the two types of cDNA clones isolated. The NH2-terminal amino acid sequence determined for human spleen β-glucuronidase agreed with that inferred from the DNA sequence of the two placental clones, beginning at amino acid 23, suggesting a cleaved signal sequence of 22 amino acids. When transfected into COS cells, plasmids containing either placental clone expressed an immunoprecipitable protein that contained N-linked oligosaccharides as evidenced by sensitivity to endoglycosidase F. However, only transfection with the clone containing the 153-base-pair segment led to expression of human β-glucuronidase activity. These studies provide the sequence for the full-length cDNA for human β-glucuronidase, demonstrate the existence of two populations of mRNA for β-glucuronidase in human placenta, only one of which specifies a catalytically active enzyme, and illustrate the importance of expression studies in verifying that a cDNA is functionally full-length.
Xu, Li; Ding, Zhi-Shan; Zhou, Yun-Kai; Tao, Xue-Fen
2009-06-01
To obtain the full-length cDNA sequence of the Secoisolariciresinol Dehydrogenase gene from Dysosma versipellis by RACE PCR, and then to investigate the characteristics of the gene. The full-length cDNA sequence of the Secoisolariciresinol Dehydrogenase gene was obtained from Dysosma versipellis by 3'-RACE and 5'-RACE. This is the first report of the full cDNA sequence of Secoisolariciresinol Dehydrogenase in Dysosma versipellis. The acquired gene was 991 bp in full length, including a 5' untranslated region of 42 bp and a 3' untranslated region of 112 bp with poly(A). The open reading frame (ORF) encodes 278 amino acids, with a molecular weight of 29253.3 Daltons and an isoelectric point of 6.328. The GenBank accession number of the nucleotide sequence is EU573789. Semi-quantitative RT-PCR analysis revealed that the Secoisolariciresinol Dehydrogenase gene was highly expressed in stem. Alignment of the amino acid sequence of Secoisolariciresinol Dehydrogenase indicated that there may be significant amino acid sequence differences among species. The full-length cDNA sequence of the Secoisolariciresinol Dehydrogenase gene from Dysosma versipellis was obtained.
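The lengths reported in this abstract can be cross-checked with simple arithmetic. The following is a minimal Python sketch, not part of the original study: the function name `orf_summary` is hypothetical, and it assumes that the ORF length counts the stop codon and that the UTR figures exclude it, which the abstract does not state.

```python
def orf_summary(total_bp, utr5_bp, utr3_bp):
    """Return (orf_bp, codons, residues) for a cDNA of the given layout.

    Assumption (not stated in the abstract): the ORF includes the stop
    codon, so the encoded protein has one residue fewer than the number
    of codons.
    """
    orf_bp = total_bp - utr5_bp - utr3_bp
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError(f"ORF of {orf_bp} bp is not a whole number of codons")
    codons = orf_bp // 3
    return orf_bp, codons, codons - 1  # the final codon is the stop


# Figures reported in the abstract: 991 bp cDNA, 42 bp 5' UTR, 112 bp 3' UTR
print(orf_summary(991, 42, 112))  # (837, 279, 278)
```

Under these assumptions the 837 bp ORF gives 279 codons, which agrees with the 278 amino acids reported.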
Danley, Patrick D; Mullen, Sean P; Liu, Fenglong; Nene, Vishvanath; Quackenbush, John; Shaw, Kerry L
2007-01-01
Background As the developmental costs of genomic tools decline, genomic approaches to non-model systems are becoming more feasible. Many of these systems may lack advanced genetic tools but are extremely valuable models in other biological fields. Here we report the development of expressed sequence tags (ESTs) in an orthopteroid insect, a model for the study of neurobiology, speciation, and evolution. Results We report the sequencing of 14,502 ESTs from clones derived from a nerve cord cDNA library, and the subsequent construction of a Gene Index from these sequences, from the Hawaiian trigonidiine cricket Laupala kohalensis. The Gene Index contains 8607 unique sequences comprised of 2575 tentative consensus (TC) sequences and 6032 singletons. For each of the unique sequences, an attempt was made to assign a provisional annotation and to categorize its function using a Gene Ontology-based classification through a sequence-based comparison to known proteins. In addition, a set of unique 70 base pair oligomers that can be used for DNA microarrays was developed. All Gene Index information is posted at the DFCI Gene Indices web page. Conclusion Orthopterans are models used to understand the neurophysiological basis of complex motor patterns such as flight and stridulation. The sequences presented in the cricket Gene Index will provide neurophysiologists with many genetic tools that have been largely absent in this field. The cricket Gene Index is one of only two gene indices to be developed in an evolutionary model system. Species within the genus Laupala have speciated recently, rapidly, and extensively. Therefore, the genes identified in the cricket Gene Index can be used to study the genomics of speciation. Furthermore, this gene index represents a significant EST resource for basal insects. As such, this resource is a valuable comparative tool for the understanding of invertebrate molecular evolution.
The sequences presented here will provide much needed genomic resources for three distinct but overlapping fields of inquiry: neurobiology, speciation, and molecular evolution. PMID:17459168
de Oliveira Ceita, Geruza; Vilas-Boas, Laurival Antônio; Castilho, Marcelo Santos; Carazzolle, Marcelo Falsarella; Pirovani, Carlos Priminho; Selbach-Schnadelbach, Alessandra; Gramacho, Karina Peres; Ramos, Pablo Ivan Pereira; Barbosa, Luciana Veiga; Pereira, Gonçalo Amarante Guimarães; Góes-Neto, Aristóteles
2014-10-01
The phytopathogenic fungus Moniliophthora perniciosa (Stahel) Aime & Philips-Mora, causal agent of witches' broom disease of cocoa, causes extensive damage to cocoa production in Brazil. Molecular studies have attempted to identify genes that play important roles in fungal survival and virulence. In this study, sequences deposited in the M. perniciosa Genome Sequencing Project database were analyzed to identify potential biological targets. For the first time, the ergosterol biosynthetic pathway in M. perniciosa was studied and the lanosterol 14α-demethylase gene (ERG11) that encodes the main enzyme of this pathway and is a target for fungicides was cloned, characterized molecularly and its phylogeny analyzed. ERG11 genomic DNA and cDNA were characterized and sequence analysis of the ERG11 protein identified highly conserved domains typical of this enzyme, such as SRS1, SRS4, EXXR and the heme-binding region (HBR). Comparison of the protein sequences and phylogenetic analysis revealed that the M. perniciosa enzyme was most closely related to that of Coprinopsis cinerea.
de Oliveira Ceita, Geruza; Vilas-Boas, Laurival Antônio; Castilho, Marcelo Santos; Carazzolle, Marcelo Falsarella; Pirovani, Carlos Priminho; Selbach-Schnadelbach, Alessandra; Gramacho, Karina Peres; Ramos, Pablo Ivan Pereira; Barbosa, Luciana Veiga; Pereira, Gonçalo Amarante Guimarães; Góes-Neto, Aristóteles
2014-01-01
The phytopathogenic fungus Moniliophthora perniciosa (Stahel) Aime & Philips-Mora, causal agent of witches’ broom disease of cocoa, causes extensive damage to cocoa production in Brazil. Molecular studies have attempted to identify genes that play important roles in fungal survival and virulence. In this study, sequences deposited in the M. perniciosa Genome Sequencing Project database were analyzed to identify potential biological targets. For the first time, the ergosterol biosynthetic pathway in M. perniciosa was studied and the lanosterol 14α-demethylase gene (ERG11) that encodes the main enzyme of this pathway and is a target for fungicides was cloned, characterized molecularly and its phylogeny analyzed. ERG11 genomic DNA and cDNA were characterized and sequence analysis of the ERG11 protein identified highly conserved domains typical of this enzyme, such as SRS1, SRS4, EXXR and the heme-binding region (HBR). Comparison of the protein sequences and phylogenetic analysis revealed that the M. perniciosa enzyme was most closely related to that of Coprinopsis cinerea. PMID:25505843
A new ALF from Litopenaeus vannamei and its SNPs related to WSSV resistance
NASA Astrophysics Data System (ADS)
Liu, Jingwen; Yu, Yang; Li, Fuhua; Zhang, Xiaojun; Xiang, Jianhai
2014-11-01
Anti-lipopolysaccharide factors (ALFs) are basic components of the crustacean immune system that defend against a range of pathogens. The cDNA sequence of a new ALF, designated nLvALF2, with an open reading frame encoding 132 amino acids was cloned. Its deduced amino acid sequence contained the conserved functional domain of ALFs, the LPS binding domain (LBD). Its genomic sequence consisted of three exons and four introns. nLvALF2 was mainly expressed in the Oka organ and gills of shrimps. The transcriptional level of nLvALF2 increased significantly after white spot syndrome virus (WSSV) infection, suggesting its important roles in protecting shrimps from WSSV. Single nucleotide polymorphisms (SNPs) were found in the genomic sequence of nLvALF2, of which 38 were analyzed for associations with the susceptibility/resistance of shrimps to WSSV. The loci g.2422 A>G, g.2466 T>C, and g.2529 G>A were significantly associated with the resistance to WSSV (P<0.05). These SNP loci could be developed as markers for selection of WSSV-resistant varieties of Litopenaeus vannamei.
Ivancic-Jelecki, Jelena; Slovic, Anamarija; Šantak, Maja; Tešović, Goran; Forcic, Dubravko
2016-07-29
The canonical genome organization of measles virus (MV) is characterized by total size of 15 894 nucleotides (nts) and defined length of every genomic region, both coding and non-coding. Only rarely have reports of strains possessing non-canonical genomic properties (possessing indels, with or without the change of total genome length) been published. The observed mutations are mutually compensatory in a sense that the total genome length remains polyhexameric. Although programmed and highly precise pseudo-templated nucleotide additions during transcription are inherent to polymerases of all viruses belonging to family Paramyxoviridae, a similar mechanism that would serve to non-randomly correct genome length, if an indel has occurred during replication, has so far not been described in the context of a complete virus genome. We compiled all complete MV genomic sequences (64 in total) available in open access sequence databases. Multiple sequence comparisons and phylogenetic analyses were performed with the aim of exploring whether non-recombinant and non-evolutionary linked measles strains that show deviations from canonical genome organization possess a common genetic characteristic. In 11 MV sequences we detected deviations from canonical genome organization due to short indels located within homopolymeric stretches or next to them. In nine out of 11 identified non-canonical MV sequences, a common feature was observed: one mutation, either an insertion or a deletion, was located in a 28 nts long region in F gene 5' untranslated region (positions 5051-5078 in genomic cDNA of canonical strains). This segment is composed of five tandemly linked homopolymeric stretches, its consensus sequence is G6-7C7-8A6-7G1-3C5-6. Although none of the mononucleotide repeats within this segment has fixed length, the total number of nts in canonical strains is always 28. 
These nine non-canonical strains, as well as the tenth (not mutated in 5051-5078 segment), can be grouped in three clusters, based on their passage histories/epidemiological data/genetic similarities. There are no indications that the 3 clusters are evolutionarily linked, other than the fact that they all belong to clade D. A common narrow genomic region was found to be mutated in different, non-related, wild-type strains, suggesting that this region might have a function in non-random genome length corrections occurring during MV replication.
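The polyhexameric constraint described in this abstract (total genome length a multiple of six nucleotides) can be illustrated with a short check. This is a minimal Python sketch under stated assumptions, not the authors' analysis: the helper names are hypothetical, and the indel lists are made-up examples rather than mutations reported in the study. Only the canonical length of 15 894 nucleotides comes from the abstract.

```python
CANONICAL_MV_LENGTH = 15894  # nucleotides, as stated in the abstract


def is_polyhexameric(genome_length):
    """True if the genome length is an exact multiple of six nucleotides."""
    return genome_length % 6 == 0


def length_after_indels(indels, base_length=CANONICAL_MV_LENGTH):
    """Apply signed indel sizes (insertion > 0, deletion < 0) to a length."""
    return base_length + sum(indels)


# Hypothetical indel sets, for illustration only
for indels in ([], [+1], [+1, -1], [+6]):
    length = length_after_indels(indels)
    print(indels, length, is_polyhexameric(length))
```

A single one-nucleotide insertion breaks the constraint, whereas a compensating deletion elsewhere, or a net change of six nucleotides, restores it, which is the sense in which the abstract calls the observed mutations mutually compensatory.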
Chang, Cheng; Shen, Wen-Kai; Wang, Tzu-Ting; Lin, Ying-Hsi; Hsu, Err-Lieh; Dai, Shu-Mei
2009-04-01
To identify pertinent mutations associated with knockdown resistance to permethrin, the entire coding sequence of the voltage-gated sodium channel gene Aa-para was sequenced and analyzed from a Per-R strain with 190-fold resistance to permethrin and two susceptible strains of Aedes aegypti. The longest transcript, a 6441bp open reading frame, encodes 2147 amino acid residues with an estimated molecular mass of 241kDa. A total of 33 exons were found in the Aa-para gene over 293kb of genomic DNA. Three previously unreported optional exons were identified. The first two exons, m and n, were located within the intracellular domain I/II, and the third, f', was found within the II/III linkers. The two mutually exclusive exons, d and l, were the only alternative exons in all the cDNA clones sequenced in this study. The most distinct finding was a novel amino acid substitution mutation, D1794Y, located within the extracellular linker between IVS5 and IVS6, which is concurrent with the known V1023G mutation in Aa-para of the Per-R strain. The high frequency and coexistence of the two mutations in the Per-R strain suggest that they might exert a synergistic effect to provide the knockdown resistance to permethrin. Furthermore, both cDNA and genomic DNA data from the same individual mosquitoes have demonstrated that RNA editing was not involved in amino acid substitutions of the Per-R strain.
[Investigation of RNA viral genome amplification by multiple displacement amplification technique].
Pang, Zheng; Li, Jian-Dong; Li, Chuan; Liang, Mi-Fang; Li, De-Xin
2013-06-01
To facilitate the detection of newly emerging or rare viral infectious diseases, a negative-strand RNA virus (severe fever with thrombocytopenia syndrome bunyavirus) and a positive-strand RNA virus (dengue virus) were used to investigate nonspecific amplification of RNA viral genomes from clinical samples by the multiple displacement amplification technique. Serial 10-fold dilutions of purified viral RNA were used as analog samples with different pathogen loads. After a series of sequential reactions, single-strand cDNA, double-strand cDNA, and double-strand cDNA treated with ligation, without or with supplemental RNA, were generated. A Phi29 DNA polymerase-dependent isothermal amplification was then employed, and finally the target gene copies were detected by real-time PCR assays to evaluate the amplification efficiencies of the various methods. The results showed that the multiple displacement amplification effects on single-strand or double-strand cDNA templates were limited, while the fold increases for double-strand cDNA templates treated with ligation could be up to 6 × 10^3, and even 2 × 10^5 when supplemental RNA was present, and better results were obtained when viral RNA loads were lower. An RNA viral genome amplification system using the multiple displacement amplification technique was established in this study, and effective amplification of RNA viral genomes at low load was achieved, which could provide a tool to synthesize adequate viral genome for multiplex pathogen detection.
Oikonomopoulos, Spyros; Wang, Yu Chang; Djambazian, Haig; Badescu, Dunarel; Ragoussis, Jiannis
2016-08-24
To assess the performance of the Oxford Nanopore Technologies MinION sequencing platform, cDNAs from the External RNA Controls Consortium (ERCC) RNA Spike-In mix were sequenced. This mix mimics mammalian mRNA species and consists of 92 polyadenylated transcripts with known concentration. cDNA libraries were generated using a template switching protocol to facilitate the direct comparison between different sequencing platforms. The MinION performance was assessed for its ability to sequence the cDNAs directly with good accuracy in terms of abundance and full length. The abundance of the ERCC cDNA molecules sequenced by MinION agreed with their expected concentration. No length or GC content bias was observed. The majority of cDNAs were sequenced as full length. Additionally, a complex cDNA population derived from a human HEK-293 cell line was sequenced on the Illumina HiSeq 2500, PacBio RS II and ONT MinION platforms. We observed that there was a good agreement in the measured cDNA abundance between PacBio RS II and ONT MinION (Pearson r = 0.82, isoforms with length more than 700 bp) and between Illumina HiSeq 2500 and ONT MinION (Pearson r = 0.75). This indicates that the ONT MinION can sequence quantitatively both long and short full length cDNA molecules.
Assignment of the human caltractin gene (CALT) to Xq28 by fluorescence in situ hybridization
DOE Office of Scientific and Technical Information (OSTI.GOV)
Tanaka, Tanaka; Okui, Keiko; Nakamura, Yusuke
1994-12-01
The centrosome is the major microtubule-organizing center of interphase eukaryotic cells, and its duplication is essential to eukaryotic cell division. Caltractin, a structural component of centrosomes, is highly homologous in amino acid sequence to the product of the CDC31 gene of Saccharomyces cerevisiae. In S. cerevisiae, an important role for CDC31 in duplication of the spindle pole body (SPB), a kind of microtubule-organizing center, has been demonstrated by an experiment in which mutant CDC31 prevented SPB duplication and led to formation of a monopolar spindle. In view of the localization of human caltractin in centrosomes and the sequence homology it bears to yeast CDC31, it is reasonable to assume that caltractin functions in humans as CDC31 does in yeast. As a part of the Human Genome Project, we have been determining nucleotide sequences of DNA clones randomly selected from a directionally cloned cDNA library constructed from fetal brain mRNA obtained from Clontech (La Jolla, CA). By comparing 5' partial DNA sequences of these cDNA clones with known DNA sequences in the database, we found one clone that was highly homologous to the caltractin gene of Chlamydomonas, which turned out to be the same as a human gene identified recently. 4 refs., 1 fig.
Cloning and sequence analysis of Hemonchus contortus HC58cDNA.
Muleke, Charles I; Ruofeng, Yan; Lixin, Xu; Xinwen, Bo; Xiangrui, Li
2007-06-01
The complete coding sequence of Hemonchus contortus HC58cDNA was generated by rapid amplification of cDNA ends and polymerase chain reaction using primers based on the 5' and 3' ends of the parasite mRNA, accession no. AF305964. The HC58cDNA gene was 851 bp long, with an open reading frame of 717 bp encoding a 239-amino-acid precursor of approximately 27 kDa. Analysis of the amino acid sequence revealed conserved residues of cysteine, histidine, asparagine, occluding loop pattern, hemoglobinase motif and glutamine of the oxyanion hole characteristic of cathepsin B like proteases (CBL). Comparison of the predicted amino acid sequences showed the protein shared 33.5-58.7% identity to cathepsin B homologues in the papain clan CA family (family C1). Phylogenetic analysis revealed close evolutionary proximity of the protein sequence to counterpart sequences in the CBL, suggesting that HC58cDNA was a member of the papain family.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Toye, P.G.; Metzelaar, M.J.; Wijngaard, P.L.J.
1995-08-01
Theileria parva, a tick-transmitted protozoan parasite related to Plasmodium spp., causes the disease East Coast fever, an acute and usually fatal lymphoproliferative disorder of cattle in Africa. Previous studies using sera from cattle that have survived infection identified a polymorphic immunodominant molecule (PIM) that is expressed by both the infective sporozoite stage of the parasite and the intracellular schizont. Here we show that mAb specific for the PIM Ag can inhibit sporozoite invasion of lymphocytes in vitro. A cDNA clone encoding the PIM Ag of the T. parva (Muguga) stock was obtained by using these mAb in a novel eukaryotic expression cloning system that allows isolation of cDNA encoding cytoplasmic or surface Ags. To establish the molecular basis of the polymorphism of PIM, the cDNA of the PIM Ag from a buffalo-derived T. parva stock was isolated and its sequence was compared with that of the cattle-derived Muguga PIM. The two cDNAs showed considerable identity in both the 5' and 3' regions, but there was substantial sequence divergence in the central regions. Several types of repeated sequences were identified in the variant regions. In the Muguga form of the molecule, there were five tandem repeats of the tetrapeptide, QPEP, that were shown, by transfection of a deleted version of the PIM gene, not to react with several anti-PIM mAbs. By isolating and sequencing the genomic version of the gene, we identified two small introns in the 3' region of the gene. Finally, we showed that polyclonal rat Abs against recombinant PIM neutralize sporozoite infectivity in vitro, suggesting that the PIM Ag should be evaluated for its capacity to immunize cattle against East Coast Fever.
NASA Astrophysics Data System (ADS)
Yu, Shuiyan; Liu, Shicheng; Li, Chunyang; Zhou, Zhigang
2011-01-01
Myrmecia incisa is a green coccoid freshwater microalga that is rich in arachidonic acid (ArA, C20:4ω-6, Δ5,8,11,14), a long chain polyunsaturated fatty acid (PUFA), especially under nitrogen starvation stress. A cDNA library of M. incisa was constructed with λ phage vectors and a 545 nt expressed sequence tag (EST) was screened from this library as a putative elongase gene due to its 56% and 49% identity to Marchantia polymorpha L. and Ostreococcus tauri Courties et Chrétiennot-Dinet, respectively. Based upon this EST sequence, an elongase gene designated MiFAE was isolated from M. incisa via 5'/3' rapid amplification of cDNA ends (RACE). The cDNA sequence was 1 331 bp long and included a 33 bp 5'-untranslated region (UTR) and a 431 bp 3'-UTR with a typical poly-A tail. The 867 bp ORF encoded a predicted protein of 288 amino acids. This protein was characterized by a conserved histidine-rich box and a MYxYY motif that was present in other members of the elongase family. The genomic DNA sequence of MiFAE was found to be interrupted by three introns with splicing sites of Introns I (81 bp), II (81 bp), and III (67 bp) that conformed to the GT-AG rule. Quantitative real-time PCR showed that the transcription level of MiFAE in this microalga under nitrogen starvation was higher than that under normal conditions. Prior to the ArA content accumulation, the transcription of MiFAE was enhanced, suggesting that it was possibly responsible for the ArA accumulation in this microalga cultured under nitrogen starvation conditions.
Dalla Valle, Luisa; Nardi, Alessia; Belvedere, Paola; Toni, Mattia; Alibardi, Lorenzo
2007-07-01
Beta-keratins of reptilian scales have been recently cloned and characterized in some lizards. Here we report for the first time the sequence of some beta-keratins from the snake Elaphe guttata. Five different cDNAs were obtained using 5'- and 3'-RACE analyses. Four sequences differ by only a few nucleotides in the coding region, whereas the last cDNA shows, in this region, only 84% identity. The gene corresponding to one of the cDNA sequences has a single intron present in the 5'-untranslated region. This genomic organization is similar to that of birds' beta-keratins. Cloning and Southern blotting analysis suggest that snake beta-keratins belong to a family of highly related genes, as for geckos. PCR analysis suggests a head-to-tail orientation of genes in the same chromosome. In situ hybridization detected beta-keratin transcripts almost exclusively in differentiating oberhautchen and beta-cells of the snake epidermis in renewal phase. This is confirmed by Northern blotting that showed, in this phase, a high expression of two different transcripts whereas only the longer transcript is expressed at a much lower level in resting skin. The cDNA coding sequences encoded putative glycine-proline-serine rich proteins containing 137-139 amino acids, with apparent isoelectric point at 7.5 and 8.2. A central region, rich in proline, shows over 50% homology with avian scale, claw, and feather keratins. The prediction of secondary structure shows mainly a random coil conformation and few beta-strand regions in the central region, likely involved in the formation of a fibrous framework of beta-keratins. This region was possibly present in the basal reptiles that gave rise to reptiles and birds. Copyright 2007 Wiley-Liss, Inc.
cDNA cloning and analysis of RNA 2 of a Prunus stem pitting isolate of tomato ringspot virus.
Hadidi, A; Powell, C A
1991-10-01
Recombinant plasmids containing sequences derived from the genome of a tomato ringspot virus (TomRSV) isolate associated with both stem pitting disease of stone fruits and apple union necrosis and decline were constructed. Selected inserts were subcloned into the polylinker region of the SP6 transcription vector pSP64. Using the SP6 promoter flanking this region, high specific activity 32P-labelled cRNA probes were generated by SP6 RNA polymerase. cRNA probes were specific for TomRSV RNA 2 present in purified virions or in extracts from woody and herbaceous hosts. No sequence relatedness was detected between TomRSV RNA 2 and genomic RNA from tobacco ringspot, arabis mosaic, strawberry latent ringspot, or cucumber mosaic virus in Northern blot analysis using TomRSV cRNA probes. These probes detected TomRSV infection in woody and herbaceous hosts in dot-blot hybridization assays.
Characterization of embryo-specific genes
DOE Office of Scientific and Technical Information (OSTI.GOV)
Not Available
1989-01-01
The objective of the proposed research is to characterize the structure and function of a set of genes whose expression is regulated in embryo development, and that is not expressed in mature tissues -- the embryonic genes. In the last two years, using cDNA clones, we have isolated 22 cDNA clones, and characterized the expression pattern of their corresponding RNA. At least 4 cDNA clones detect RNAs of embryonic genes. These cDNA clones detect RNAs expressed in somatic as well as zygotic embryos of carrot. Using the cDNA clones, we screened the genomic library of carrot embryo DNA, and isolated genomic clones for three genes. The structure and function of two genes DC 8 and DC 59 have been characterized and are reported in this paper.
Hagen, Ingerid J; Billing, Anna M; Rønning, Bernt; Pedersen, Sindre A; Pärn, Henrik; Slate, Jon; Jensen, Henrik
2013-05-01
With the advent of next generation sequencing, new avenues have opened to study genomics in wild populations of non-model species. Here, we describe a successful approach to a genome-wide medium density Single Nucleotide Polymorphism (SNP) panel in a non-model species, the house sparrow (Passer domesticus), through the development of a 10 K Illumina iSelect HD BeadChip. Genomic DNA and cDNA derived from six individuals were sequenced on a 454 GS FLX system and generated a total of 1.2 million sequences, in which SNPs were detected. As no reference genome exists for the house sparrow, we used the zebra finch (Taeniopygia guttata) reference genome to determine the most likely position of each SNP. The 10 000 SNPs on the SNP-chip were selected to be distributed evenly across 31 chromosomes, giving on average one SNP per 100 000 bp. The SNP-chip was screened across 1968 individual house sparrows from four island populations. Of the original 10 000 SNPs, 7413 were found to be variable, and 99% of these SNPs were successfully called in at least 93% of all individuals. We used the SNP-chip to demonstrate the ability of such genome-wide marker data to detect population sub-division, and compared these results to similar analyses using microsatellites. The SNP-chip will be used to map Quantitative Trait Loci (QTL) for fitness-related phenotypic traits in natural populations. © 2013 Blackwell Publishing Ltd.
A SNP resource for Douglas-fir: de novo transcriptome assembly and SNP detection and validation.
Howe, Glenn T; Yu, Jianbin; Knaus, Brian; Cronn, Richard; Kolpak, Scott; Dolan, Peter; Lorenz, W Walter; Dean, Jeffrey F D
2013-02-28
Douglas-fir (Pseudotsuga menziesii), one of the most economically and ecologically important tree species in the world, also has one of the largest tree breeding programs. Although the coastal and interior varieties of Douglas-fir (vars. menziesii and glauca) are native to North America, the coastal variety is also widely planted for timber production in Europe, New Zealand, Australia, and Chile. Our main goal was to develop a SNP resource large enough to facilitate genomic selection in Douglas-fir breeding programs. To accomplish this, we developed a 454-based reference transcriptome for coastal Douglas-fir, annotated and evaluated the quality of the reference, identified putative SNPs, and then validated a sample of those SNPs using the Illumina Infinium genotyping platform. We assembled a reference transcriptome consisting of 25,002 isogroups (unique gene models) and 102,623 singletons from 2.76 million 454 and Sanger cDNA sequences from coastal Douglas-fir. We identified 278,979 unique SNPs by mapping the 454 and Sanger sequences to the reference, and by mapping four datasets of Illumina cDNA sequences from multiple seed sources, genotypes, and tissues. The Illumina datasets represented coastal Douglas-fir (64.00 and 13.41 million reads), interior Douglas-fir (80.45 million reads), and a Yakima population similar to interior Douglas-fir (8.99 million reads). We assayed 8067 SNPs on 260 trees using an Illumina Infinium SNP genotyping array. Of these SNPs, 5847 (72.5%) were called successfully and were polymorphic. Based on our validation efficiency, our SNP database may contain as many as ~200,000 true SNPs, and as many as ~69,000 SNPs that could be genotyped at ~20,000 gene loci using an Infinium II array, more SNPs than are needed to use genomic selection in tree breeding programs.
Ultimately, these genomic resources will enhance Douglas-fir breeding and allow us to better understand landscape-scale patterns of genetic variation and potential responses to climate change.
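The estimate of ~200,000 true SNPs in the abstract above follows from scaling the candidate set by the observed validation rate. The short Python sketch below reproduces that arithmetic using only the figures quoted in the abstract; the variable names are illustrative, and the calculation assumes the assayed sample is representative of the full candidate set, which the abstract implies but does not state in these terms.

```python
# Figures quoted in the Douglas-fir abstract above.
snps_assayed = 8067        # SNPs placed on the Infinium array
snps_validated = 5847      # called successfully and polymorphic
candidate_snps = 278_979   # unique putative SNPs in the database

# Fraction of assayed SNPs that validated.
validation_rate = snps_validated / snps_assayed

# Extrapolate to the whole database (assumes the assayed
# sample is representative of all candidate SNPs).
estimated_true_snps = candidate_snps * validation_rate

print(f"validation rate: {validation_rate:.1%}")
print(f"estimated true SNPs: {estimated_true_snps:,.0f}")
```

The result lands slightly above 200,000, consistent with the rounded figure the authors report.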
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shipley, J.M.; Klinkenberg, M.; Wu, B.M.
1993-03-01
PCR of cDNA produced from patient fibroblasts allowed the authors to determine the paternal mutation in the first patient reported with β-glucuronidase-deficiency mucopolysaccharidosis type VII (MPS VII). The G→T transversion 1,881 bp downstream of the ATG translation initiation codon destroys an MboII restriction site and converts Trp627 to Cys (W627C). Digestion of genomic DNA PCR fragments with MboII indicated that the patient and the father were heterozygous for this missense mutation in exon 12. Failure to find cDNAs from patient RNA which did not contain this mutation suggested that the maternal mutation leads to greatly reduced synthesis or reduced stability of mRNA from the mutant allele. In order to identify the maternal mutation, it was necessary to analyze genomic sequences. This approach was complicated by the finding of multiple unprocessed pseudogenes and/or closely related genes. Using PCR with a panel of human/rodent hybrid cell lines, the authors found that these pseudogenes were present over chromosomes 5-7, 20, and 22 and the Y chromosome. Conditions were defined which allowed them to amplify and characterize genomic sequences for the true β-glucuronidase gene despite this background of related sequences. The patient proved to be heterozygous for a second mutation, in which a C→T transition introduces a termination codon (R356STOP) in exon 7. The mother was also heterozygous for this mutation. Expression of a cDNA containing the maternal mutation produced no enzyme activity, as expected. Expression of the paternal mutation in COS-7 cells produced a surprisingly high (65% of control) level of activity. However, activity was 13% of control in transiently transfected murine MPS VII cells. The level of activity of this mutant allele appears to correlate with the level of overexpression. 39 refs., 5 figs., 1 tab.
A SNP resource for Douglas-fir: de novo transcriptome assembly and SNP detection and validation
2013-01-01
Background Douglas-fir (Pseudotsuga menziesii), one of the most economically and ecologically important tree species in the world, also has one of the largest tree breeding programs. Although the coastal and interior varieties of Douglas-fir (vars. menziesii and glauca) are native to North America, the coastal variety is also widely planted for timber production in Europe, New Zealand, Australia, and Chile. Our main goal was to develop a SNP resource large enough to facilitate genomic selection in Douglas-fir breeding programs. To accomplish this, we developed a 454-based reference transcriptome for coastal Douglas-fir, annotated and evaluated the quality of the reference, identified putative SNPs, and then validated a sample of those SNPs using the Illumina Infinium genotyping platform. Results We assembled a reference transcriptome consisting of 25,002 isogroups (unique gene models) and 102,623 singletons from 2.76 million 454 and Sanger cDNA sequences from coastal Douglas-fir. We identified 278,979 unique SNPs by mapping the 454 and Sanger sequences to the reference, and by mapping four datasets of Illumina cDNA sequences from multiple seed sources, genotypes, and tissues. The Illumina datasets represented coastal Douglas-fir (64.00 and 13.41 million reads), interior Douglas-fir (80.45 million reads), and a Yakima population similar to interior Douglas-fir (8.99 million reads). We assayed 8067 SNPs on 260 trees using an Illumina Infinium SNP genotyping array. Of these SNPs, 5847 (72.5%) were called successfully and were polymorphic. Conclusions Based on our validation efficiency, our SNP database may contain as many as ~200,000 true SNPs, and as many as ~69,000 SNPs that could be genotyped at ~20,000 gene loci using an Infinium II array—more SNPs than are needed to use genomic selection in tree breeding programs. 
Ultimately, these genomic resources will enhance Douglas-fir breeding and allow us to better understand landscape-scale patterns of genetic variation and potential responses to climate change. PMID:23445355
Liu, Guo-Hua; Nakamura, Tatsuo; Amemiya, Takashi; Rajendran, Narasimmalu; Itoh, Kiminori
2011-01-01
Two-dimensional gel electrophoresis (2-DGE) mapping of genomic DNA and complementary DNA (cDNA) amplicons was attempted to analyze total and active bacterial populations within soil and activated sludge samples. Distinct differences in the number and species of bacterial populations and those that were metabolically active at the time of sampling were visually observed especially for the soil community. Statistical analyses and sequencing based on the 2-DGE data further revealed the relationships between total and active bacterial populations within each community. This high-resolution technique would be useful for obtaining a better understanding of bacterial population structures in the environment.
The Role(s) of Heparan Sulfate Proteoglycan(s) in the wnt-1 Signaling Pathway
1998-08-01
First, the sequence of the cDNA, when compared to the genomic site of insertion of the P-element, revealed that the P-element is inserted 686 bp...stages 8 to 13 (Yoffe et al. 1995). We first examined whether ectopic expression of Wgts effectively restores the naked cuticle as it does in wg and...by Kjellén and Lindahl, 1991). HS/heparin N-deacetylase/N-sulfotransferase catalyzes N-deacetylation and N-sulfation that is the first and key step
Mashoof, Sara; Criscitiello, Michael F.
2016-01-01
The B cell receptor and secreted antibody are at the nexus of humoral adaptive immunity. In this review, we summarize what is known of the immunoglobulin genes of jawed cartilaginous and bony fishes. We focus on what has been learned from genomic or cDNA sequence data, but where appropriate draw upon protein, immunization, affinity and structural studies. Work from major aquatic model organisms and less studied comparative species are both included to define what is the rule for an immunoglobulin isotype or taxonomic group and what exemplifies an exception. PMID:27879632
Chernicky, C L; Tan, H; Burfeind, P; Ilan, J; Ilan, J
1996-02-01
There are several cell types within the placenta that produce cytokines which can contribute to the regulatory mechanisms that ensure normal pregnancy. The immunological milieu at the maternofetal interface is considered to be crucial for survival of the fetus. Interleukin-2 (IL-2) is expressed by the syncytiotrophoblast, the cell layer between the mother and the fetus. IL-2 appears to be a key factor in maintenance of pregnancy. Therefore, it was important to determine the sequence of human placental interleukin-2. Direct sequencing of human placental IL-2 cDNA was determined for the coding region. Subclone sequencing was carried out for the 5'- and 3'-untranslated regions (5'-UTR and 3'-UTR). The 5'-UTR for human placental IL-2 cDNA is 294 bp, which is 247 nucleotides longer than that reported for cDNA IL-2 derived from T cells. The sequence of the coding region is identical to that reported for T cell IL-2, while sequence analysis of the polymerase chain reaction (PCR) product showed that the cDNA from the 3' end was the same as that reported for cDNA from T cells. Human placental IL-2 cDNA is 1,028 base pairs (excluding the poly A tail), which is 247 bp longer at the 5' end than that reported for IL-2 T cell cDNA. Therefore, the extended 5'-UTR of the placental IL-2 cDNA may be a consequence of alternative promoter utilization in the placenta.
Wolffe, E J; Gause, W C; Pelfrey, C M; Holland, S M; Steinberg, A D; August, J T
1990-01-05
We describe the isolation and sequencing of a cDNA encoding mouse Pgp-1. An oligonucleotide probe corresponding to the NH2-terminal sequence of the purified protein was synthesized by the polymerase chain reaction and used to screen a mouse macrophage lambda gt11 library. A cDNA clone with an insert of 1.2 kilobases was selected and sequenced. In Northern blot analysis, only cells expressing Pgp-1 contained mRNA species that hybridized with this Pgp-1 cDNA. The nucleotide sequence of the cDNA has a single open reading frame that yields a protein-coding sequence of 1076 base pairs followed by a 132-base pair 3'-untranslated sequence that includes a putative polyadenylation signal but no poly(A) tail. The translated sequence comprises a 13-amino acid signal peptide followed by a polypeptide core of 345 residues corresponding to an Mr of 37,800. Portions of the deduced amino acid sequence were identical to those obtained by amino acid sequence analysis from the purified glycoprotein, confirming that the cDNA encodes Pgp-1. The predicted structure of Pgp-1 includes an NH2-terminal extracellular domain (residues 14-265), a transmembrane domain (residues 266-286), and a cytoplasmic tail (residues 287-358). Portions of the mouse Pgp-1 sequence are highly similar to that of the human CD44 cell surface glycoprotein implicated in cell adhesion. The protein also shows sequence similarity to the proteoglycan tandem repeat sequences found in cartilage link protein and cartilage proteoglycan core protein which are thought to be involved in binding to hyaluronic acid.
Highly abundant and stage-specific mRNAs in the obligate pathogen Bremia lactucae.
Judelson, H S; Michelmore, R W
1990-01-01
Germinating spores of the obligate pathogen Bremia lactucae (lettuce downy mildew) contain several unusually abundant species of mRNA. Thirty-nine cDNA clones corresponding to prevalent transcripts were isolated from a library synthesized using poly(A)+ RNA from germinating spores; these clones represented only five distinct classes. Each corresponding mRNA accounted for from 0.4 to 9 percent by mass of poly(A)+ RNA from germinating spores and together represented greater than 20 percent of the mRNA. The expression of the corresponding genes, and a gene encoding Hsp70, was analyzed in spores during germination and during growth in planta. The Hsp70 mRNA and mRNA from one abundant cDNA clone (ham34) were expressed constitutively. Two clones (ham9 and ham12) hybridized only to mRNA from spores and germinating spores. Two clones (ham37 and ham27) showed hybridization specific to germinating spores. Quantification of the number of genes homologous to each cDNA clone indicated that four clones corresponded to one or two copies per haploid genome, and one hybridized to an approximately 11-member family of genes. A sequence of the gene corresponding to ham34 was obtained to investigate its function and to identify sequences conferring high levels of gene expression for use in constructing vectors for the transformation of B. lactucae.
Yang, Hui-Peng; Luo, Su-Juan; Li, Yi-Nü; Zhang, Yao-Zhou; Zhang, Zhi-Fang
2011-10-01
The ORC (origin recognition complex) binds to the DNA replication origin and recruits other replication factors to form the pre-replication complex. The cDNA and genomic sequences of all six subunits of ORC in Bombyx mori (BmORC1-6) were determined by RACE (rapid amplification of cDNA ends) and bioinformatic analysis. The conserved domains were identified in BmOrc1p-6p and the C-terminal of BmOrc6p features a short sequence that may be specific for Lepidoptera. As in other organisms, each of the six BmORC subunits had evolved individually from ancestral genes in early eukaryotes. During embryo development, the six genes were co-regulated, but different ratios of the abundance of mRNAs were observed in 13 tissues of the fifth instar day-6 larvae. Infection by BmNPV (B. mori nucleopolyhedrovirus) initially decreased and then increased the abundance of BmORC. We suggest that some of the BmOrc proteins may have additional functions and that BmOrc proteins participate in the replication of BmNPV.
Generation and reactivation of T-cell receptor A joining region pseudogenes in primates
DOE Office of Scientific and Technical Information (OSTI.GOV)
Thiel, C.; Lanchbury, J.S.; Otting, N.
1996-06-01
Tandemly duplicated T-cell receptor (Tcr) AJ (Jα) segments contribute significantly to TCRA chain junctional region diversity in mammals. Since only limited data exist on TCRA diversity in nonhuman primates, we examined the TCRAJ regions of 37 chimpanzee and 71 rhesus macaque TCRA cDNA clones derived from inverse polymerase chain reaction on peripheral blood mononuclear cell cDNA of healthy animals. Twenty-five different TCRAJ regions were characterized in the chimpanzee and 36 in the rhesus macaque. Each bears a close structural relationship to an equivalent human TCRAJ region. Conserved amino acid motifs are shared between all three species. There are indications that differences between nonhuman primates and humans exist in the generation of TCRAJ pseudogenes. The nucleotide and amino acid sequences of the various characterized TCRAJ of each species are reported and we compare our results to the available information on human genomic sequences. Although we provide evidence of dynamic processes modifying TCRAJ segments during primate evolution, their repertoire and primary structure appear to be relatively conserved. 21 refs., 2 figs.
Microarray slide hybridization using fluorescently labeled cDNA.
Ares, Manuel
2014-01-01
Microarray hybridization is used to determine the amount and genomic origins of RNA molecules in an experimental sample. Unlabeled probe sequences for each gene or gene region are printed in an array on the surface of a slide, and fluorescently labeled cDNA derived from the RNA target is hybridized to it. This protocol describes a blocking and hybridization protocol for microarray slides. The blocking step is particular to the chemistry of "CodeLink" slides, but it serves to remind us that almost every kind of microarray has a treatment step that occurs after printing but before hybridization. We recommend making sure of the precise treatment necessary for the particular chemistry used in the slides to be hybridized because the attachment chemistries differ significantly. Hybridization is similar to northern or Southern blots, but on a much smaller scale.
Maia, Rafaela M; Valente, Valeria; Cunha, Marco A V; Sousa, Josane F; Araujo, Daniela D; Silva, Wilson A; Zago, Marco A; Dias-Neto, Emmanuel; Souza, Sandro J; Simpson, Andrew J G; Monesi, Nadia; Ramos, Ricardo G P; Espreafico, Enilza M; Paçó-Larson, Maria L
2007-07-24
The sequencing of the D. melanogaster genome revealed an unexpectedly small number of genes (~14,000), indicating that mechanisms acting on generation of transcript diversity must have played a major role in the evolution of complex metazoans. Among the most extensively used mechanisms that account for this diversity is alternative splicing. It is estimated that over 40% of Drosophila protein-coding genes contain one or more alternative exons. A recent transcription map of Drosophila embryogenesis indicates that 30% of the transcribed regions are unannotated, and that 1/3 of this is estimated as missed or alternative exons of previously characterized protein-coding genes. Therefore, the identification of the variety of expressed transcripts depends on experimental data for its final validation and is continuously being performed using different approaches. We applied the Open Reading Frame Expressed Sequence Tags (ORESTES) methodology, which is capable of generating cDNA data from the central portion of rare transcripts, in order to investigate the presence of hitherto unannotated regions of the Drosophila transcriptome. Bioinformatic analysis of 1,303 Drosophila ORESTES clusters identified 68 sequences derived from unannotated regions in the current Drosophila genome version (4.3). Of these, a set of 38 was analysed by polyA+ northern blot hybridization, validating 17 (50%) new exons of low abundance transcripts. For one of these ESTs, we obtained the cDNA encompassing the complete coding sequence of a new serine protease, named SP212. The SP212 gene is part of a serine protease gene cluster located in the chromosome region 88A12-B1. This cluster includes the predicted genes CG9631, CG9649 and CG31326, which were previously identified as up-regulated after immune challenges in genomic-scale microarray analysis.
In agreement with the proposal that this locus is co-regulated in response to microbial infection, we show here that SP212 is also up-regulated upon injury. Using the ORESTES methodology we identified 17 novel exons from low abundance Drosophila transcripts, and through a PCR approach the complete CDS of one of these transcripts was defined. Our results show that computational identification and manual inspection are not sufficient to annotate a genome in the absence of experimentally derived data.
Maia, Rafaela M; Valente, Valeria; Cunha, Marco AV; Sousa, Josane F; Araujo, Daniela D; Silva, Wilson A; Zago, Marco A; Dias-Neto, Emmanuel; Souza, Sandro J; Simpson, Andrew JG; Monesi, Nadia; Ramos, Ricardo GP; Espreafico, Enilza M; Paçó-Larson, Maria L
2007-01-01
Background The sequencing of the D. melanogaster genome revealed an unexpectedly small number of genes (~14,000), indicating that mechanisms acting on generation of transcript diversity must have played a major role in the evolution of complex metazoans. Among the most extensively used mechanisms that account for this diversity is alternative splicing. It is estimated that over 40% of Drosophila protein-coding genes contain one or more alternative exons. A recent transcription map of Drosophila embryogenesis indicates that 30% of the transcribed regions are unannotated, and that 1/3 of this is estimated as missed or alternative exons of previously characterized protein-coding genes. Therefore, the identification of the variety of expressed transcripts depends on experimental data for its final validation and is continuously being performed using different approaches. We applied the Open Reading Frame Expressed Sequence Tags (ORESTES) methodology, which is capable of generating cDNA data from the central portion of rare transcripts, in order to investigate the presence of hitherto unannotated regions of the Drosophila transcriptome. Results Bioinformatic analysis of 1,303 Drosophila ORESTES clusters identified 68 sequences derived from unannotated regions in the current Drosophila genome version (4.3). Of these, a set of 38 was analysed by polyA+ northern blot hybridization, validating 17 (50%) new exons of low abundance transcripts. For one of these ESTs, we obtained the cDNA encompassing the complete coding sequence of a new serine protease, named SP212. The SP212 gene is part of a serine protease gene cluster located in the chromosome region 88A12-B1. This cluster includes the predicted genes CG9631, CG9649 and CG31326, which were previously identified as up-regulated after immune challenges in genomic-scale microarray analysis.
In agreement with the proposal that this locus is co-regulated in response to microbial infection, we show here that SP212 is also up-regulated upon injury. Conclusion Using the ORESTES methodology we identified 17 novel exons from low abundance Drosophila transcripts, and through a PCR approach the complete CDS of one of these transcripts was defined. Our results show that computational identification and manual inspection are not sufficient to annotate a genome in the absence of experimentally derived data. PMID:17650329
Tu, Z; Hagedorn, H H
1997-02-01
Pyruvate carboxylase (PC, pyruvate: carbon dioxide ligase [ADP-forming], EC 6.4.1.1) was purified from the yellow fever mosquito, Aedes aegypti. The purified PC showed two polypeptides of similar M(r) (133 and 128 k). The N-terminal sequences of both polypeptides were shown to be very similar, if not identical. A polyclonal antiserum against the 133 kDa polypeptide cross-reacted strongly with the 128 kDa polypeptide. PC was found in all tissues examined. Using a semi-quantitative Western blot assay, PC was shown to be concentrated in the indirect flight muscles and fat body preparations. The ratios of the 133 to 128 kDa polypeptides were shown to differ in various tissues and an Aedes albopictus cell line. The indirect flight muscle was the only tissue in which the 128 kDa polypeptide was more abundant, while both the midgut and the cell line showed almost exclusively the 133 kDa polypeptide. Both peptides were present in varying amounts in brain, malpighian tubule, ovary and fat body preparation. The two isoforms of PC could play different roles in the flight muscle and other tissues. Clones covering a complete cDNA of PC of A. aegypti were obtained using a directional approach. The 3952 bp nucleotide sequence, including a 3585 bp coding region, was determined from these cDNA clones. The deduced 1195 amino acid sequence has a calculated M(r) of 132,200. A putative mitochondrial targeting sequence was determined by comparing the deduced amino acid sequence to the N-terminal sequences of the mature protein. The presence of a mitochondrial targeting sequence indicates that the mosquito PC encoded by the cloned cDNA may be localized in the mitochondria. After the targeting sequence, three functional domains were identified in the following order; biotin carboxylase (BC), carboxyltransferase (CT) and biotin carboxyl carrier protein (BCCP). The mosquito PC showed very high similarity to PCs from other sources (55.1-75.2% identity). 
Genomic Southern analysis indicated that there could be two similar PC genes or a single PC gene with allelic polymorphism in the A. aegypti genome. The evolutionary relationship of PCs among different organisms was consistent with the accepted evolutionary relationship of their host organisms. The evolution of the domain structures of the biotin-dependent carboxylases including PC was also investigated. This analysis indicates that biotin-dependent carboxylases evolved from a common origin. The analysis also provides evidence for early gene duplication events that shaped the family of biotin-dependent carboxylases. Clear evidence for the coevolution of BC and BCCP domains is presented, although they are associated with very different CT domains and the relative position of the three functional domains varies between members of the biotin-dependent carboxylases.
Nonoguchi, K; Itoh, K; Xue, J H; Tokuchi, H; Nishiyama, H; Kaneko, Y; Tatsumi, K; Okuno, H; Tomiwa, K; Fujita, J
1999-09-03
In mice, the Hsp110/SSE family is composed of the heat shock protein (Hsp)110/105, Apg-1 and Apg-2. In humans, however, only the Hsp110/105 homolog has been identified as a member, and two cDNAs, Hsp70RY and HS24/p52, potentially encoding proteins structurally similar to, but smaller than, mouse Apg-2 have been reported. To clarify the membership of the Hsp110 family in humans, we isolated Apg-1 and Apg-2 cDNAs from a human testis cDNA library. The human Apg-1 was 100% and 91.8% identical in length and amino acid (aa) sequence, respectively, to mouse Apg-1. Human Apg-2 was one aa shorter than and 95.5% identical in sequence to mouse Apg-2. In ECV304 human endothelial cells, Apg-1 but not Apg-2 transcripts were induced in 2 h by a temperature shift from 32 degrees C to 39 degrees C. As found in mice, the response was stronger than that to a 37-42 degrees C shift. The human Apg-1 and Apg-2 genes were mapped to the chromosomal loci 4q28 and 5q23.3-q31.1, respectively, by fluorescence in-situ hybridization. We isolated cDNA and genomic clones encompassing the region critical for the difference between Apg-2 and HS24/p52. Although the primer sets used were derived from the sequences common to both cDNAs, all cDNA and genomic clones corresponded to Apg-2. Using a similar approach, the relationship between Apg-2 and Hsp70RY was assessed, and no clone corresponding to Hsp70RY was obtained. These results demonstrated that the Hsp110 family consists of at least three members, Apg-1, Apg-2 and Hsp110, in humans as well as in mice. The significance of the previously reported HS24/p52 and Hsp70RY cDNAs remains to be determined.
Cioffi, Anna Valentina; Ferrara, Diana; Cubellis, Maria Vittoria; Aniello, Francesco; Corrado, Marcella; Liguori, Francesca; Amoroso, Alessandro; Fucci, Laura; Branno, Margherita
2002-08-01
Analysis of the genome structure of the Paracentrotus lividus (sea urchin) DNA methyltransferase (DNA MTase) gene showed the presence of an open reading frame, named METEX, in intron 7 of the gene. METEX expression is developmentally regulated, showing no correlation with DNA MTase expression. In fact, DNA MTase transcripts are present at high concentrations in the early developmental stages, while METEX is expressed at late stages of development. Two METEX cDNA clones (Met1 and Met2) that are different in the 3' end have been isolated in a cDNA library screening. The putative translated protein from Met2 cDNA clone showed similarity with Escherichia coli endonuclease III on the basis of sequence and predictive three-dimensional structure. The protein, overexpressed in E. coli and purified, had functional properties similar to the endonuclease specific for apurinic/apyrimidinic (AP) sites on the basis of the lyase activity. Therefore the open reading frame, present in intron 7 of the P. lividus DNA MTase gene, codes for a functional AP endonuclease designated SuAP1.
Huang, P L; Do, Y Y; Huang, F C; Thay, T S; Chang, T W
1997-04-01
A cDNA encoding the banana 1-aminocyclopropane-1-carboxylate (ACC) oxidase has previously been isolated from a cDNA library that was constructed by extracting poly(A)+ RNA from peels of ripening banana. This cDNA, designated as pMAO2, has 1,199 bp and contains an open reading frame of 318 amino acids. In order to identify ripening-related promoters of the banana ACC oxidase gene, pMAO2 was used as a probe to screen a banana genomic library constructed in the lambda EMBL3 vector. The banana ACC oxidase MAO2 gene has four exons and three introns, with all of the boundaries between these introns and exons sharing a consensus dinucleotide sequence of GT-AG. The expression of the MAO2 gene in banana begins after the onset of ripening (stage 2) and continues into later stages of the ripening process. The accumulation of MAO2 mRNA can be induced by 1 microliter/l exogenous ethylene, and it reached a steady-state level when 100 microliters/l exogenous ethylene was present.
Xie, Bingkun; Yang, Wei; Ouyang, Yongchang; Chen, Lichan; Jiang, Hesheng; Liao, Yuying; Liao, D. Joshua
2016-01-01
Tens of thousands of chimeric RNAs have been reported. Most of them contain a short homologous sequence (SHS) at the joining site of the two partner genes but are not associated with a fusion gene. We hypothesize that many of these chimeras may be technical artifacts derived from SHS-caused mis-priming in reverse transcription (RT) or polymerase chain reactions (PCR). We cloned six chimeric complementary DNAs (cDNAs) formed by human mitochondrial (mt) 16S rRNA sequences at an SHS, which were similar to several expressed sequence tags (ESTs). These chimeras, which could not be detected with cDNA protection assay, were likely formed because some regions of the 16S rRNA are reverse-complementary to another region to form an SHS, which allows the downstream sequence to loop back and anneal at the SHS to prime the synthesis of its complementary strand, yielding a palindromic sequence that can form a hairpin-like structure. We found that a 16S rRNA ending at the 4th nucleotide (nt) of the mt-tRNA-leu was dominant and thus should be the wild type. We also cloned a mouse Bcl2-Nek9 chimeric cDNA that contained a 5-nt unmatchable sequence between the two partners, contained two copies of the reverse primer in the same direction but did not contain the forward primer, making it unclear how this Bcl2-Nek9 was formed and amplified. Moreover, a cDNA was amplified because one primer has 4 nts matched to the template, suggesting that there may be many more artificial cDNAs than we have realized, because the nuclear and mt genomes have many more 4-nt than 5-nt or longer homologues. Altogether, the chimeric cDNAs we cloned are good examples suggesting that many cDNAs may be artifacts due to SHS-caused mis-priming and thus greater caution should be taken when new sequence is obtained from a technique involving DNA polymerization. PMID:27148738
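The loop-back mechanism described in the abstract above depends on a short stretch of a molecule being the reverse complement of another stretch further downstream. As a rough illustration only, the Python sketch below scans a DNA-alphabet sequence for such pairs. The function names, the default length `k=5`, and the toy sequence are hypothetical; this is not the authors' analysis pipeline.

```python
# Translation table for complementing a DNA-alphabet sequence.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def self_priming_sites(seq: str, k: int = 5) -> list[tuple[int, int]]:
    """Find (i, j) pairs where seq[j:j+k] is the reverse complement of
    seq[i:i+k] and lies downstream without overlapping it. Such pairs
    could, in principle, let the molecule fold back and self-prime."""
    seq = seq.upper()
    # Index every k-mer by its start positions.
    positions: dict[str, list[int]] = {}
    for i in range(len(seq) - k + 1):
        positions.setdefault(seq[i:i + k], []).append(i)
    hits = []
    for i in range(len(seq) - k + 1):
        partner = reverse_complement(seq[i:i + k])
        for j in positions.get(partner, []):
            if j >= i + k:  # downstream and non-overlapping
                hits.append((i, j))
    return hits

# Toy sequence: "ACGGT" at position 2 pairs with "ACCGT" at position 13.
toy = "TTACGGTTTTTTTACCGTAA"
hits = self_priming_sites(toy, k=5)
print(hits)
```

Lowering `k` to 4 reports many more candidate sites in any realistic sequence, which matches the abstract's point that 4-nt matches are far more common than 5-nt or longer ones.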
Liu, Changqing; Liu, Dan; Guo, Yu; Lu, Taofeng; Li, Xiangchen; Zhang, Minghai; Ma, Jianzhang; Ma, Yuehui; Guan, Weijun
2013-01-01
In this study, a full-length enriched cDNA library was successfully constructed from the Bengal tiger, Panthera tigris tigris, the most well-known wild animal. Total RNA was extracted from cultured Bengal tiger fibroblasts in vitro. The titers of primary and amplified libraries were 1.28 × 10⁶ pfu/mL and 1.56 × 10⁹ pfu/mL, respectively. The percentage of recombinants from the unamplified library was 90.2% and the average length of exogenous inserts was 0.98 kb. A total of 212 individual ESTs with sizes ranging from 356 to 1108 bp were then analyzed. The BLASTX score revealed that 48.1% of the sequences were classified as a strong match, 45.3% as nominal and 6.6% as a weak match. Among the ESTs with known putative function, 26.4% of ESTs were found to be related to various kinds of metabolism, 19.3% to information storage and processing, 11.3% to posttranslational modification, protein turnover, chaperones, 11.3% to transport, 9.9% to signal transducer/cell communication, 9.0% to structure protein, 3.8% to cell cycle, and only 6.6% were classified as novel genes. By EST sequencing, a full-length gene coding ferritin was identified and characterized. The recombinant plasmid pET32a-TAT-Ferritin was constructed and coded for the TAT-Ferritin fusion protein with two 6× His-tags at the N- and C-termini. By BCA assay, the concentration of soluble Trx-TAT-Ferritin recombinant protein was 2.32 ± 0.12 mg/mL. These results demonstrated that the reliability and representativeness of the cDNA library met the requirements of a standard cDNA library. This library provided a useful platform for the functional genome and transcriptome research of Bengal tigers. PMID:23708105
Trumbić, Željka; Bekaert, Michaël; Taggart, John B; Bron, James E; Gharbi, Karim; Mladineo, Ivona
2015-11-25
The largest of the tuna species, Atlantic bluefin tuna (Thunnus thynnus), inhabits the North Atlantic Ocean and the Mediterranean Sea and is considered to be an endangered species, largely a consequence of overfishing. T. thynnus aquaculture, referred to as fattening or farming, is a capture based activity dependent on yearly renewal from the wild. Thus, the development of aquaculture practices independent of wild resources can provide an important contribution towards ensuring security and sustainability of this species in the longer-term. The development of such practices is today greatly assisted by large scale transcriptomic studies. We have used pyrosequencing technology to sequence a mixed-tissue normalised cDNA library, derived from adult T. thynnus. A total of 976,904 raw sequence reads were assembled into 33,105 unique transcripts having a mean length of 893 bases and an N50 of 870. Of these, 33.4% showed similarity to known proteins or gene transcripts and 86.6% of them were matched to the congeneric Pacific bluefin tuna (Thunnus orientalis) genome, compared to 70.3% for the more distantly related Nile tilapia (Oreochromis niloticus) genome. Transcript sequences were used to develop a novel 15 K Agilent oligonucleotide DNA microarray for T. thynnus and comparative tissue gene expression profiles were inferred for gill, heart, liver, ovaries and testes. Functional contrasts were strongest between gills and ovaries. Gills were particularly associated with immune system, signal transduction and cell communication, while ovaries displayed signatures of glycan biosynthesis, nucleotide metabolism, transcription, translation, replication and repair. Sequence data generated from a novel mixed-tissue T. thynnus cDNA library provide an important transcriptomic resource that can be further employed for study of various aspects of T. thynnus ecology and genomics, with strong applications in aquaculture. 
Tissue-specific gene expression profiles inferred through the use of novel oligo-microarray can serve in the design of new and more focused transcriptomic studies for future research of tuna physiology and assessment of the welfare in a production environment.
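The abstract above reports an N50 of 870 for the assembled transcripts. For readers unfamiliar with the statistic, here is a minimal Python sketch of the usual definition: the length L such that sequences of length at least L account for at least half of the total assembled bases. The function name and the toy lengths are illustrative assumptions; the actual transcript lengths are not given in the abstract.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Lengths are sorted from longest to shortest and accumulated; the N50
    is the length at which the running total first reaches half of the
    overall total.
    """
    if not lengths:
        raise ValueError("lengths must be non-empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Toy example with made-up contig lengths (total 54, half 27).
    print(n50([2, 3, 4, 5, 6, 7, 8, 9, 10]))
```

Because the N50 weights long sequences more heavily than a plain mean does, the two values can differ, as they do in the abstract (mean 893 bases, N50 870).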
Robinson, Lois; Panayiotakis, Alexandra; Papas, Takis S.; Kola, Ismail; Seth, Arun
1997-01-01
ETS transcription factors play important roles in hematopoiesis, angiogenesis, and organogenesis during murine development. The ETS genes also have a role in neoplasia, for example in Ewing’s sarcomas and retrovirally induced cancers. The ETS genes encode transcription factors that bind to specific DNA sequences and activate transcription of various cellular and viral genes. To isolate novel ETS target genes, we used two approaches. In the first approach, we isolated genes by the RNA differential display technique. Previously, we have shown that the overexpression of ETS1 and ETS2 genes effects transformation of NIH 3T3 cells and specific transformants produce high levels of the ETS proteins. To isolate ETS1 and ETS2 responsive genes in these transformed cells, we prepared RNA from ETS1, ETS2 transformants, and normal NIH 3T3 cell lines and converted it into cDNA. This cDNA was amplified by PCR and displayed on sequencing gels. The differentially displayed bands were subcloned into plasmid vectors. By Northern blot analysis, several clones showed differential patterns of mRNA expression in the NIH 3T3-, ETS1-, and ETS2-expressing cell lines. Sixteen clones were analyzed by DNA sequence analysis, and 13 of them appeared to be unique because their DNA sequences did not match with any of the known genes present in the gene bank. Three known genes were found to be identical to the CArG box binding factor, phospholipase A2-activating protein, and early growth response 1 (Egr1) genes. In the second approach, to isolate ETS target promoters directly, we performed ETS1 binding with MboI-cleaved genomic DNA in the presence of a specific mAb followed by whole genome PCR. The immune complex-bound ETS binding sites containing DNA fragments were amplified and subcloned into pBluescript and subjected to DNA sequence and computer analysis. We found that, of a large number of clones isolated, 43 represented unique sequences not previously identified. 
Three clones turned out to contain regulatory sequences derived from human serglycin, preproapolipoprotein C II, and Egr1 genes. The ETS binding sites derived from these three regulatory sequences showed specific binding with recombinant ETS proteins. Of interest, Egr1 was identified by both of these techniques, suggesting strongly that it is indeed an ETS target gene. PMID:9207063
LaPolla, R J; Mayne, K M; Davidson, N
1984-01-01
A mouse cDNA clone has been isolated that contains the complete coding region of a protein highly homologous to the delta subunit of the Torpedo acetylcholine receptor (AcChoR). The cDNA library was constructed in the vector lambda 10 from membrane-associated poly(A)+ RNA from BC3H-1 mouse cells. Surprisingly, the delta clone was selected by hybridization with cDNA encoding the gamma subunit of the Torpedo AcChoR. The nucleotide sequence of the mouse cDNA clone contains an open reading frame of 520 amino acids. This amino acid sequence exhibits 59% and 50% sequence homology to the Torpedo AcChoR delta and gamma subunits, respectively. However, the mouse nucleotide sequence has several stretches of high homology with the Torpedo gamma subunit cDNA, but not with delta. The mouse protein has the same general structural features as do the Torpedo subunits. It is encoded by a 3.3-kilobase mRNA. There is probably only one, but at most two, chromosomal genes coding for this or closely related sequences. PMID:6096870
Skelly, Daniel A.; Johansson, Marnie; Madeoy, Jennifer; Wakefield, Jon; Akey, Joshua M.
2011-01-01
Variation in gene expression is thought to make a significant contribution to phenotypic diversity among individuals within populations. Although high-throughput cDNA sequencing offers a unique opportunity to delineate the genome-wide architecture of regulatory variation, new statistical methods need to be developed to capitalize on the wealth of information contained in RNA-seq data sets. To this end, we developed a powerful and flexible hierarchical Bayesian model that combines information across loci to allow both global and locus-specific inferences about allele-specific expression (ASE). We applied our methodology to a large RNA-seq data set obtained in a diploid hybrid of two diverse Saccharomyces cerevisiae strains, as well as to RNA-seq data from an individual human genome. Our statistical framework accurately quantifies levels of ASE with specified false-discovery rates, achieving high reproducibility between independent sequencing platforms. We pinpoint loci that show unusual and biologically interesting patterns of ASE, including allele-specific alternative splicing and transcription termination sites. Our methodology provides a rigorous, quantitative, and high-resolution tool for profiling ASE across whole genomes. PMID:21873452
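To make the idea of allele-specific expression concrete, the sketch below tests a single locus for allelic imbalance with an exact two-sided binomial test on read counts. This is a deliberately simpler, swapped-in technique: it is not the hierarchical Bayesian model described in the abstract, it does not share information across loci, and it does not control a false-discovery rate. The function name and the example counts are assumptions for illustration only.

```python
from math import comb


def binomial_two_sided_p(ref_count, alt_count, p=0.5):
    """Exact two-sided binomial p-value for allelic imbalance at one locus.

    Under the null hypothesis, each read comes from the reference allele
    with probability p (0.5 means balanced expression). The p-value sums
    the probabilities of all outcomes no more likely than the observed one.
    """
    n = ref_count + alt_count
    if n == 0:
        raise ValueError("at least one read is required")
    pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]
    observed = pmf[ref_count]
    # Small tolerance so outcomes tied with the observed one are included.
    total = sum(x for x in pmf if x <= observed * (1 + 1e-9))
    return min(1.0, total)


if __name__ == "__main__":
    # Hypothetical counts: 10 reference reads, 0 alternate reads.
    print(binomial_two_sided_p(10, 0))
```

A per-locus test like this has little power at low coverage, which is one motivation for models that pool information across loci.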
NASA Astrophysics Data System (ADS)
Woon, J. S. K.; Murad, A. M. A.; Abu Bakar, F. D.
2015-09-01
A cellobiohydrolase B (CbhB) from Aspergillus niger ATCC 10574 was cloned and expressed in E. coli. CbhB has an open reading frame of 1611 bp encoding a putative polypeptide of 536 amino acids. Analysis of the encoded polypeptide predicted a molecular mass of 56.2 kDa, a cellulose binding module (CBM) and a catalytic module. In order to obtain the mRNA of cbhB, total RNA was extracted from A. niger cells induced by 1% Avicel. First strand cDNA was synthesized from total RNA via reverse transcription. The full length cDNA of cbhB was amplified by PCR and cloned into the cloning vector, pGEM-T Easy. A comparison between genomic DNA and cDNA sequences of cbhB revealed that the gene is intronless. Upon the removal of the signal peptide, the cDNA of cbhB was cloned into the expression vector pET-32b. However, the recombinant CbhB was expressed in Escherichia coli Origami DE3 as an insoluble protein. A homology model of CbhB predicted the presence of nine disulfide bonds in the protein structure which may have contributed to the improper folding of the protein and thus, resulting in inclusion bodies in E. coli.
van der Leij, F R; Visser, R G; Ponstein, A S; Jacobsen, E; Feenstra, W J
1991-08-01
The genomic sequence of the potato gene for starch granule-bound starch synthase (GBSS; "waxy protein") has been determined for the wild-type allele of a monoploid genotype from which an amylose-free (amf) mutant was derived, and for the mutant part of the amf allele. Comparison of the wild-type sequence with a cDNA sequence from the literature and a newly isolated cDNA revealed the presence of 13 introns, the first of which is located in the untranslated leader. The promoter contains a G-box-like sequence. The deduced amino acid sequence of the precursor of GBSS shows a high degree of identity with monocot waxy protein sequences in the region corresponding to the mature form of the enzyme. The transit peptide of 77 amino acids, required for routing of the precursor to the plastids, shows much less identity with the transit peptides of the other waxy preproteins, but resembles the hydropathic distributions of these peptides. Alignment of the amino acid sequences of the four mature starch synthases with the Escherichia coli glgA gene product revealed the presence of at least three conserved boxes; there is no homology with previously proposed starch-binding domains of other enzymes involved in starch metabolism. We report the use of chimeric constructs with wild-type and amf sequences to localize, via complementation experiments, the region of the amf allele in which the mutation resides. Direct sequencing of polymerase chain reaction products confirmed that the amf mutation is a deletion of a single AT basepair in the region coding for the transit peptide.(ABSTRACT TRUNCATED AT 250 WORDS)
Application of Genomic Technologies to the Breeding of Trees
Badenes, Maria L.; Fernández i Martí, Angel; Ríos, Gabino; Rubio-Cabetas, María J.
2016-01-01
The recent introduction of next generation sequencing (NGS) technologies represents a major revolution in providing new tools for identifying the genes and/or genomic intervals controlling important traits for selection in breeding programs. In perennial fruit trees with long generation times and large sizes of adult plants, the impact of these techniques is even more important. High-throughput DNA sequencing technologies have provided complete annotated sequences in many important tree species. Most of the high-throughput genotyping platforms described are being used for studies of genetic diversity and population structure. Dissection of complex traits became possible through the availability of genome sequences along with phenotypic variation data, which together make it possible to elucidate the causative genetic differences that give rise to observed phenotypic variation. Association mapping facilitates the association between genetic markers and phenotype in unstructured and complex populations, identifying molecular markers for assisted selection and breeding. Also, genomic data provide in silico identification and characterization of genes and gene families related to important traits, enabling new tools for molecular marker assisted selection in tree breeding. Deep sequencing of transcriptomes is also a powerful tool for the analysis of precise expression levels of each gene in a sample. It consists of quantifying short cDNA reads, obtained by NGS technologies, in order to compare the entire transcriptomes between genotypes and environmental conditions. The miRNAs are non-coding short RNAs involved in the regulation of different physiological processes, which can be identified by high-throughput sequencing of RNA libraries obtained by reverse transcription of purified short RNAs, and by in silico comparison with known miRNAs from other species. 
Altogether, NGS techniques and their applications have increased the resources for plant breeding in tree species, closing the former gap of genetic tools between trees and annual species. PMID:27895664
Wang, Dai; Parrish, Colin R.
1999-01-01
Phage display of cDNA clones prepared from feline cells was used to identify host cell proteins that bound to DNA-containing feline panleukopenia virus (FPV) capsids but not to empty capsids. One gene found in several clones encoded a heterogeneous nuclear ribonucleoprotein (hnRNP)-related protein (DBP40) that was very similar in sequence to the A/B-type hnRNP proteins. DBP40 bound specifically to oligonucleotides representing a sequence near the 5′ end of the genome which is exposed on the outside of the full capsid but did not bind most other terminal sequences. Adding purified DBP40 to an in vitro fill-in reaction using viral DNA as a template inhibited the production of the second strand after nucleotide (nt) 289 but prior to nt 469. DBP40 bound to various regions of the viral genome, including a region between nt 295 and 330 of the viral genome which has been associated with transcriptional attenuation of the parvovirus minute virus of mice, which is mediated by a stem-loop structure of the DNA and cellular proteins. Overexpression of the protein in feline cells from a plasmid vector made them largely resistant to FPV infection. Mutagenesis of the protein binding site within the 5′ end viral genome did not affect replication of the virus. PMID:10438866
Chao, Tianle; Wang, Guizhi; Wang, Jianmin; Liu, Zhaohua; Ji, Zhibin; Hou, Lei; Zhang, Chunlan
2016-01-01
High-throughput mRNA sequencing enables the discovery of new transcripts and additional parts of incompletely annotated transcripts. Compared with the human and cow genomes, the reference annotation level of the sheep genome is still low. An investigation of new transcripts in sheep skeletal muscle will improve our understanding of muscle development. Therefore, applying high-throughput sequencing, two cDNA libraries from the biceps brachii of small-tailed Han sheep and Dorper sheep were constructed, and whole-transcriptome analysis was performed to determine the unknown transcript catalogue of this tissue. In this study, 40,129 transcripts were finally mapped to the sheep genome. Among them, 3,467 transcripts were determined to be unannotated in the current reference sheep genome and were defined as new transcripts. Based on protein-coding capacity prediction and comparative analysis of sequence similarity, 246 transcripts were classified as portions of unannotated genes or incompletely annotated genes. Another 1,520 transcripts were predicted with high confidence to be long non-coding RNAs. Our analysis also revealed 334 new transcripts that displayed specific expression in ruminants and uncovered a number of new transcripts without intergenus homology but with specific expression in sheep skeletal muscle. The results confirmed a complex transcript pattern of coding and non-coding RNA in sheep skeletal muscle. This study provided important information concerning the sheep genome and transcriptome annotation, which could provide a basis for further study.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Zezza, D.J.; Stewart, S.E.; Steiner, L.A.
1992-12-15
Xenopus laevis Ig contain two distinct types of L chains, designated ρ or L1 and σ or L2. The authors have analyzed Xenopus genomic DNA by Southern blotting with cDNA probes specific for L1 V and C regions. Many fragments hybridized to the V probe, but only one or two fragments hybridized to the C probe. Corresponding C, J, and V gene segments were identified on clones isolated from a genomic library prepared from the same DNA. One clone contains a C gene segment separated from a J gene segment by an intron of 3.4 kb. The J and C gene segments are nearly identical in sequence to cDNA clones analyzed previously. The C segment is somewhat more similar and the J segment considerably more similar in sequence to the corresponding segments of mammalian κ chains than to those of mammalian λ chains. Upstream of the J segment is a typical recombination signal sequence with a spacer of 23 bp, as in Jκ. A second clone from the library contains four V gene segments, separated by 2.1 to 3.6 kb. Two of these, V1 and V3, have the expected structural and regulatory features of V genes, and are very similar in sequence to each other and to mammalian Vκ. A third gene segment, V2, resembles V1 and V3 in its coding region and nearby 5′-flanking region, but diverges in sequence 5′ to position −95 with loss of the octamer promoter element. The fourth V-like segment is similar to the others at the 3′-end, but upstream of codon 64 bears no resemblance in sequence to any Ig V region. All four V segments have typical recombination signal sequences with 12-bp spacers at their 3′-ends, as in Vκ. Taken together, the data suggest that Xenopus L1 L chain genes are members of the κ gene family. 80 refs., 9 figs.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Kuhn, R.J.; Tada, H.; Ypma-Wong, M.F.
1988-01-01
By following a strategy of genetic analysis of poliovirus, the authors have constructed a synthetic mutagenesis cartridge spanning the genome-linked viral protein coding region and flanking cleavage sites in an infectious cDNA clone of the type I (Mahoney) genome. The insertion of new restriction sites within the infectious clone has allowed them to replace the wild-type sequences with short complementary pairs of synthetic oligonucleotides containing various mutations. A set of mutations have been made that create methionine codons within the genome-linked viral protein region. The resulting viruses have growth characteristics similar to wild type. Experiments that led to an alteration of the tyrosine residue responsible for the linkage to RNA have resulted in nonviable virus. In one mutant, proteolytic processing assayed in vitro appeared unimpaired by the mutation. They suggest that the position of the tyrosine residue is important for genome-linked viral protein function(s).
MosaicSolver: a tool for determining recombinants of viral genomes from pileup data
Wood, Graham R.; Ryabov, Eugene V.; Fannon, Jessica M.; Moore, Jonathan D.; Evans, David J.; Burroughs, Nigel
2014-01-01
Viral recombination is a key evolutionary mechanism, aiding escape from host immunity, contributing to changes in tropism and possibly assisting transmission across species barriers. The ability to determine whether recombination has occurred and to locate associated specific recombination junctions is thus of major importance in understanding emerging diseases and pathogenesis. This paper describes a method for determining recombinant mosaics (and their proportions) originating from two parent genomes, using high-throughput sequence data. The method involves setting the problem geometrically and the use of appropriately constrained quadratic programming. Recombinants of the honeybee deformed wing virus and the Varroa destructor virus-1 are inferred to illustrate the method from both siRNAs and reads sampling the viral genome population (cDNA library); our results are confirmed experimentally. Matlab software (MosaicSolver) is available. PMID:25120266
Sequence verification as quality-control step for production of cDNA microarrays.
Taylor, E; Cogdell, D; Coombes, K; Hu, L; Ramdas, L; Tabor, A; Hamilton, S; Zhang, W
2001-07-01
To generate cDNA arrays in our core laboratory, we amplified about 2300 PCR products from a human, sequence-verified cDNA clone library. As a quality-control step, we sequenced the PCR products immediately before printing. The sequence information was used to search the GenBank database to confirm the identities. Although these clones were previously sequence verified by the company, we found that only 79% of the clones matched the original database after handling. Our experience strongly indicates the necessity to sequence verify the clones at the final stage before printing on microarray slides and to modify the gene list accordingly.
The Embryonic Transcriptome of the Red-Eared Slider Turtle (Trachemys scripta)
Kaplinsky, Nicholas J.; Gilbert, Scott F.; Cebra-Thomas, Judith; Lilleväli, Kersti; Saare, Merly; Chang, Eric Y.; Edelman, Hannah E.; Frick, Melissa A.; Guan, Yin; Hammond, Rebecca M.; Hampilos, Nicholas H.; Opoku, David S. B.; Sariahmed, Karim; Sherman, Eric A.; Watson, Ray
2013-01-01
The bony shell of the turtle is an evolutionary novelty not found in any other group of animals, however, research into its formation has suggested that it has evolved through modification of conserved developmental mechanisms. Although these mechanisms have been extensively characterized in model organisms, the tools for characterizing them in non-model organisms such as turtles have been limited by a lack of genomic resources. We have used a next generation sequencing approach to generate and assemble a transcriptome from stage 14 and 17 Trachemys scripta embryos, stages during which important events in shell development are known to take place. The transcriptome consists of 231,876 sequences with an N50 of 1,166 bp. GO terms and EC codes were assigned to the 61,643 unique predicted proteins identified in the transcriptome sequences. All major GO categories and metabolic pathways are represented in the transcriptome. Transcriptome sequences were used to amplify several cDNA fragments designed for use as RNA in situ probes. One of these, BMP5, was hybridized to a T. scripta embryo and exhibits both conserved and novel expression patterns. The transcriptome sequences should be of broad use for understanding the evolution and development of the turtle shell and for annotating any future T. scripta genome sequences. PMID:23840449
Van Damme, E J; Barre, A; Smeets, K; Torrekens, S; Van Leuven, F; Rougé, P; Peumans, W J
1995-01-01
Two lectins were isolated from the inner bark of Robinia pseudoacacia (black locust). The first (and major) lectin (called RPbAI) is composed of five isolectins that originate from the association of 31.5- and 29-kD polypeptides into tetramers. In contrast, the second (minor) lectin (called RPbAII) is a homotetramer composed of 26-kD subunits. The cDNA clones encoding the polypeptides of RPbAI and RPbAII were isolated and their sequences determined. Apparently all three polypeptides are translated from mRNAs of approximately 1.2 kb. Alignment of the deduced amino acid sequences of the different clones indicates that the 31.5- and 29-kD RPbAI polypeptides show approximately 80% sequence identity and are homologous to the previously reported legume seed lectins, whereas the 26-kD RPbAII polypeptide shows only 33% sequence identity to the previously described legume lectins. Modeling the 31.5-kD subunit of RPbAI predicts that its three-dimensional structure is strongly related to the three-dimensional models that have been determined thus far for a few legume lectins. Southern blot analysis of genomic DNA isolated from Robinia has revealed that the Robinia bark lectins are the result of the expression of a small family of lectin genes. PMID:7716244
Kobayashi, Masaaki; Nagasaki, Hideki; Garcia, Virginie; Just, Daniel; Bres, Cécile; Mauxion, Jean-Philippe; Le Paslier, Marie-Christine; Brunel, Dominique; Suda, Kunihiro; Minakuchi, Yohei; Toyoda, Atsushi; Fujiyama, Asao; Toyoshima, Hiromi; Suzuki, Takayuki; Igarashi, Kaori; Rothan, Christophe; Kaminuma, Eli; Nakamura, Yasukazu; Yano, Kentaro; Aoki, Koh
2014-02-01
Tomato (Solanum lycopersicum) is regarded as a model plant of the Solanaceae family. The genome sequencing of the tomato cultivar 'Heinz 1706' was recently completed. To accelerate the progress of tomato genomics studies, systematic bioresources, such as mutagenized lines and full-length cDNA libraries, have been established for the cultivar 'Micro-Tom'. However, these resources cannot be utilized to their full potential without the completion of the genome sequencing of 'Micro-Tom'. We undertook the genome sequencing of 'Micro-Tom' and here report the identification of single nucleotide polymorphisms (SNPs) and insertion/deletions (indels) between 'Micro-Tom' and 'Heinz 1706'. The analysis demonstrated the presence of 1.23 million SNPs and 0.19 million indels between the two cultivars. The density of SNPs and indels was high in chromosomes 2, 5 and 11, but was low in chromosomes 6, 8 and 10. Three known mutations of 'Micro-Tom' were localized on chromosomal regions where the density of SNPs and indels was low, which was consistent with the fact that these mutations were relatively new and introgressed into 'Micro-Tom' during the breeding of this cultivar. We also report SNP analysis for two 'Micro-Tom' varieties that have been maintained independently in Japan and France, both of which have served as standard lines for 'Micro-Tom' mutant collections. Approximately 28,000 SNPs were identified between these two 'Micro-Tom' lines. These results provide high-resolution DNA polymorphic information on 'Micro-Tom' and represent a valuable contribution to the 'Micro-Tom'-based genomics resources.
Small gene family encoding an eggshell (chorion) protein of the human parasite Schistosoma mansoni
DOE Office of Scientific and Technical Information (OSTI.GOV)
Bobek, L.A.; Rekosh, D.M.; Lo Verde, P.T.
1988-08-01
The authors isolated six independent genomic clones encoding schistosome chorion or eggshell proteins from a Schistosoma mansoni genomic library. A linkage map of five of the clones spanning 35 kilobase pairs (kbp) of the S. mansoni genome was constructed. The region contained two eggshell protein genes closely linked, separated by 7.5 kbp of intergenic DNA. The two genes of the cluster were arranged in the same orientation, that is, they were transcribed from the same strand. The sixth clone probably represents a third copy of the eggshell gene that is not contained within the 35-kbp region. The 5′ end of the mRNA transcribed from these genes was defined by primer extension directly off the RNA. The ATCAT cap site sequence was homologous to a silkmoth chorion PuTCATT cap site sequence, where Pu indicates any purine. DNA sequence analysis showed that there were no introns in these genes. The DNA sequences of the three genes were very homologous to each other and to a cDNA clone, pSMf61-46, differing only in three or four nucleotides. A multiple TATA box was located at positions -23 to -31, and a CAAAT sequence was located at -52 upstream of the eggshell transcription unit. Comparison of sequences in regions further upstream with silkmoth and Drosophila sequences revealed very short elements that were shared. One such element, TCACGT, recently shown to be an essential cis-regulatory element for silkmoth chorion gene promoter function, was found at a similar position in all three organisms.
Generation, annotation and analysis of ESTs from Trichoderma harzianum CECT 2413
Vizcaíno, Juan Antonio; González, Francisco Javier; Suárez, M Belén; Redondo, José; Heinrich, Julian; Delgado-Jarana, Jesús; Hermosa, Rosa; Gutiérrez, Santiago; Monte, Enrique; Llobell, Antonio; Rey, Manuel
2006-01-01
Background The filamentous fungus Trichoderma harzianum is used as biological control agent of several plant-pathogenic fungi. In order to study the genome of this fungus, a functional genomics project called "TrichoEST" was developed to give insights into genes involved in biological control activities using an approach based on the generation of expressed sequence tags (ESTs). Results Eight different cDNA libraries from T. harzianum strain CECT 2413 were constructed. Different growth conditions involving mainly different nutrient conditions and/or stresses were used. We here present the analysis of the 8,710 ESTs generated. A total of 3,478 unique sequences were identified of which 81.4% had sequence similarity with GenBank entries, using the BLASTX algorithm. Using the Gene Ontology hierarchy, we performed the annotation of 51.1% of the unique sequences and compared its distribution among the gene libraries. Additionally, the InterProScan algorithm was used in order to further characterize the sequences. The identification of the putatively secreted proteins was also carried out. Later, based on the EST abundance, we examined the highly expressed genes and a hydrophobin was identified as the gene expressed at the highest level. We compared our collection of ESTs with the previous collections obtained from Trichoderma species and we also compared our sequence set with different complete eukaryotic genomes from several animals, plants and fungi. Accordingly, the presence of similar sequences in different kingdoms was also studied. Conclusion This EST collection and its annotation provide a significant resource for basic and applied research on T. harzianum, a fungus with a high biotechnological interest. PMID:16872539
Virtual Northern Analysis of the Human Genome
Hurowitz, Evan H.; Drori, Iddo; Stodden, Victoria C.; Donoho, David L.; Brown, Patrick O.
2007-01-01
Background We applied the Virtual Northern technique to human brain mRNA to systematically measure human mRNA transcript lengths on a genome-wide scale. Methodology/Principal Findings We used separation by gel electrophoresis followed by hybridization to cDNA microarrays to measure 8,774 mRNA transcript lengths representing at least 6,238 genes at high (>90%) confidence. By comparing these transcript lengths to the Refseq and H-Invitational full-length cDNA databases, we found that nearly half of our measurements appeared to represent novel transcript variants. Comparison of length measurements determined by hybridization to different cDNAs derived from the same gene identified clones that potentially correspond to alternative transcript variants. We observed a close linear relationship between ORF and mRNA lengths in human mRNAs, identical in form to the relationship we had previously identified in yeast. Some functional classes of protein are encoded by mRNAs whose untranslated regions (UTRs) tend to be longer or shorter than average; these functional classes were similar in both human and yeast. Conclusions/Significance Human transcript diversity is extensive and largely unannotated. Our length dataset can be used as a new criterion for judging the completeness of cDNAs and annotating mRNA sequences. Similar relationships between the lengths of the UTRs in human and yeast mRNAs and the functions of the proteins they encode suggest that UTR sequences serve an important regulatory role among eukaryotes. PMID:17520019
Evidence for a Pneumocystis carinii Flo8-like transcription factor: insights into organism adhesion.
Kottom, Theodore J; Limper, Andrew H
2016-02-01
Pneumocystis carinii (Pc) adhesion to alveolar epithelial cells is well established and is thought to be a prerequisite for the initiation of Pneumocystis pneumonia. Pc binding events occur in part through the major Pc surface glycoprotein Msg, as well as an integrin-like molecule termed PcInt1. Recent data from the Pc sequencing project also demonstrate DNA sequences homologous to other genes important in Candida spp. binding to mammalian host cells, as well as organism binding to polystyrene surfaces and in biofilm formation. One of these genes, flo8, a transcription factor needed for downstream cAMP/PKA-pathway-mediated activation of the major adhesin/flocculin Flo11 in yeast, was cloned from a Pc cDNA library utilizing a partial sequence available in the Pc genome database. A CHEF blot of Pc genomic DNA yielded a single band, providing evidence that this gene is present in the organism. BLASTP analysis of the predicted protein demonstrated 41% homology to the Saccharomyces cerevisiae Flo8. Northern blotting demonstrated greatest expression at pH 6.0-8.0, pH comparable to reported fungal biofilm milieu. Western blot and immunoprecipitation assays of PcFlo8 protein in isolated cyst and tropic life forms confirmed the presence of the cognate protein in these Pc life forms. Heterologous expression of Pcflo8 cDNA in flo8Δ-deficient yeast strains demonstrated that the Pcflo8 was able to restore yeast binding to polystyrene and invasive growth of yeast flo8Δ cells. Furthermore, Pcflo8 promoted yeast binding to HEK293 human epithelial cells, strengthening its functional classification as a Flo8 transcription factor. Taken together, these data suggest that PcFlo8 is expressed by Pc and may exert activity in organism adhesion and biofilm formation.
Evidence for a Pneumocystis carinii Flo8-like Transcription Factor: Insights into Organism Adhesion
Kottom, Theodore J.; Limper, Andrew H.
2015-01-01
Pneumocystis carinii (Pc) adhesion to alveolar epithelial cells is well established and is thought to be a prerequisite for initiation of Pneumocystis pneumonia. Pc binding events occur in part through the major Pc surface glycoprotein Msg, as well as an integrin-like molecule termed PcInt1. Recent data from the Pc sequencing project also demonstrate DNA sequences homologous to other genes important in Candida spp. binding to mammalian host cells, as well as organism binding to polystyrene surfaces and in biofilm formation. One of these genes, flo8, a transcription factor needed for downstream cAMP/PKA-pathway-mediated activation of the major adhesin/flocculin Flo11 in yeast, was cloned from a Pc cDNA library utilizing a partial sequence available in the Pc genome database. A CHEF blot of Pc genomic DNA yielded a single band, providing evidence that this gene is present in the organism. BLASTP analysis of the predicted protein demonstrated 41% homology to the Saccharomyces cerevisiae Flo8. Northern blotting demonstrated greatest expression at pH 6.0–8.0, pH comparable to reported fungal biofilm milieu. Western blot and immunoprecipitation assays of PcFlo8 protein in isolated cyst and tropic life forms confirmed the presence of the cognate protein in these Pc life forms. Heterologous expression of Pcflo8 cDNA in flo8Δ (deficient) yeast strains demonstrated that Pcflo8 was able to restore yeast binding to polystyrene and invasive growth of yeast flo8Δ cells. Furthermore, Pcflo8 promoted yeast binding to HEK293 human epithelial cells, strengthening its functional classification as a Flo8 transcription factor. Taken together, these data suggest that PcFlo8 is expressed by Pc and may exert activity in organism adhesion and biofilm formation. PMID:26215665
Characterization of Toll-like receptor 3 gene in large yellow croaker, Pseudosciaena crocea.
Huang, Xue-Na; Wang, Zhi-Yong; Yao, Cui-Luan
2011-07-01
Toll-like receptor 3 (TLR3) plays an important role in innate immune responses. In this report, the full-length cDNA sequence and genomic structure of Pseudosciaena crocea TLR3 (PcTLR3) were identified and characterized. The full-length cDNA of PcTLR3 was 3384 bp, including a 5'-terminal untranslated region (UTR) of 65 bp, a 3'-terminal UTR of 589 bp and an open reading frame (ORF) of 2730 bp encoding a polypeptide of 909 amino acid residues. The full-length genome sequence of PcTLR3 was composed of 5721 nucleotides, including five exons and four introns. The putative PcTLR3 protein contained a signal peptide sequence, 16 leucine-rich repeat (LRR) motifs, a transmembrane region and a Toll/interleukin-1 receptor (TIR) domain. Quantitative real-time reverse transcription PCR analysis revealed a broad expression of PcTLR3 in most tissues, with the predominant expression in liver, then intestine, and the weakest expression in blood cells. The expression of PcTLR3 after injection with poly inosinic:cytidylic (I:C) and Vibrio parahemolyticus was tested in spleen, blood cells and liver. The results indicated that PcTLR3 transcripts could be induced in the three tissues by injection with poly I:C. The highest expression was in the blood cells with 43.5 times (at 6h) greater expression than in the control (p<0.05). In addition, after V. parahemolyticus challenge, a moderate up-regulation and down-regulation of PcTLR3 was found in blood cells and liver, respectively. Our results suggested that PcTLR3 might play an important role in fish's defense against both viral and bacterial infection.
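The lengths quoted in the PcTLR3 abstract above can be cross-checked with simple arithmetic. The sketch below is only an illustration: the function names are hypothetical, and it assumes (the abstract does not say so) that the reported 2730 bp ORF length includes the stop codon.

```python
# Figures quoted in the abstract, all in base pairs.
UTR5_BP = 65
ORF_BP = 2730
UTR3_BP = 589
REPORTED_CDNA_BP = 3384
REPORTED_RESIDUES = 909


def cdna_length(utr5_bp, orf_bp, utr3_bp):
    """Full-length cDNA = 5' UTR + ORF + 3' UTR."""
    return utr5_bp + orf_bp + utr3_bp


def residues_from_orf(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of the given length.

    Assumption: when includes_stop is True, the last codon is a stop
    codon and does not contribute a residue.
    """
    codons, remainder = divmod(orf_bp, 3)
    if remainder:
        raise ValueError("ORF length is not a multiple of 3")
    return codons - 1 if includes_stop else codons


print(cdna_length(UTR5_BP, ORF_BP, UTR3_BP) == REPORTED_CDNA_BP)
print(residues_from_orf(ORF_BP) == REPORTED_RESIDUES)
```

Under that assumption, the three segment lengths sum to the reported cDNA length, and 910 codons minus one stop codon give the reported 909 residues.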
Laassri, Majid; Dragunsky, Eugenia; Enterline, Joan; Eremeeva, Tatiana; Ivanova, Olga; Lottenbach, Kathleen; Belshe, Robert; Chumakov, Konstantin
2005-01-01
Sabin strains of poliovirus used in the manufacture of oral poliovirus vaccine (OPV) are prone to genetic variations that occur during growth in cell cultures and the organisms of vaccine recipients. Such derivative viruses often have increased neurovirulence and transmissibility, and in some cases they can reestablish chains of transmission in human populations. Monitoring for vaccine-derived polioviruses is an important part of the worldwide campaign to eradicate poliomyelitis. Analysis of vaccine-derived polioviruses requires, as a first step, their isolation in cell cultures, which takes significant time and may yield viral stocks that are not fully representative of the strains present in the original sample. Here we demonstrate that full-length viral cDNA can be PCR amplified directly from stool samples and immediately subjected to genomic analysis by oligonucleotide microarray hybridization and nucleotide sequencing. Most fecal samples from healthy children who received OPV were found to contain variants of Sabin vaccine viruses. Sequence changes in the 5′ untranslated region were common, as were changes in the VP1-coding region, including changes in a major antigenic site. Analysis of stool samples taken from cases of acute flaccid paralysis revealed the presence of mixtures of recombinant polioviruses, in addition to the emergence of new sequence variants. Avoiding the need for cell culture isolation dramatically shortened the time needed for identification and analysis of vaccine-derived polioviruses and could be useful for preliminary screening of clinical samples. The amplified full-length viral cDNA can be archived and used to recover live virus for further virological studies. PMID:15956413
Gonzalez, Luis Miguel; Bonay, Pedro; Benitez, Laura; Ferrer, Elizabeth; Harrison, Leslie J S; Parkhouse, R Michael E; Garate, Teresa
2007-02-01
Two clones from an activated Taenia saginata oncosphere cDNA library, Ts45W and Ts45S, were isolated and sequenced. Both of these genes belong to the Taenia ovis 45W gene family. The Ts45W and Ts45S cDNAs are 997- and 1,004-bp-long, each corresponding to 255 amino acids and with theoretical molecular masses of 27.8 and 27.7 kDa, respectively. Southern blot profiles obtained with Ts45W cDNA as a probe suggest that these two genes are members of a multigene family with tandem organization. The full genomic sequence was determined for the Ts45W gene and a new family member, the Ts45W/2 gene. The genomic sequences of the T. saginata Ts45W and Ts45W/2 genes were at least 2.2 kb in length with four exons separated by three introns. Exons 1 and 4 coded for hydrophobic domains, while, importantly, exons 2 and 3 coded for fibronectin homologous domains. These domains are presumably responsible for the demonstrated cell adhesion and, perhaps, the protective nature of this family of molecules and the acronym TAF (Taenia adhesion family) is proposed for this group of genes. We hypothesize that these TAF proteins and another T. saginata-protective antigen, HP6, have evolved the dual functions of facilitating tissue invasion and stimulating protective immunity to first ensure primary infection and subsequently to establish a concomitant protective immunity to protect the host from death or debilitation through superinfection by subsequent infections and thus help ensure parasite survival.
2011-01-01
Background Transcriptome sequencing data has become an integral component of modern genetics, genomics and evolutionary biology. However, despite advances in the technologies of DNA sequencing, such data are lacking for many groups of living organisms, in particular, many plant taxa. We present here the results of transcriptome sequencing for two closely related plant species. These species, Fagopyrum esculentum and F. tataricum, belong to the order Caryophyllales - a large group of flowering plants with uncertain evolutionary relationships. F. esculentum (common buckwheat) is also an important food crop. Despite these practical and evolutionary considerations Fagopyrum species have not been the subject of large-scale sequencing projects. Results Normalized cDNA corresponding to genes expressed in flowers and inflorescences of F. esculentum and F. tataricum was sequenced using the 454 pyrosequencing technology. This resulted in 267 thousand (F. esculentum) and 229 thousand (F. tataricum) reads with an average length of 341-349 nucleotides. De novo assembly of the reads produced about 25 thousand contigs for each species, with 7.5-8.2× coverage. Comparative analysis of the two transcriptomes demonstrated their overall similarity but also revealed genes that are presumably differentially expressed. Among them are retrotransposon genes and genes involved in sugar biosynthesis and metabolism. Thirteen single-copy genes were used for phylogenetic analysis; the resulting trees are largely consistent with those inferred from multigenic plastid datasets. The sister relationship of the Caryophyllales and asterids has now gained high support from nuclear gene sequences. Conclusions 454 transcriptome sequencing and de novo assembly were performed for two congeneric flowering plant species, F. esculentum and F. tataricum. As a result, a large set of cDNA sequences that represent orthologs of known plant genes as well as potential new genes was generated. PMID:21232141
Lamm, Ayelet T; Stadler, Michael R; Zhang, Huibin; Gent, Jonathan I; Fire, Andrew Z
2011-02-01
We have used a combination of three high-throughput RNA capture and sequencing methods to refine and augment the transcriptome map of a well-studied genetic model, Caenorhabditis elegans. The three methods include a standard (non-directional) library preparation protocol relying on cDNA priming and foldback that has been used in several previous studies for transcriptome characterization in this species, and two directional protocols, one involving direct capture of single-stranded RNA fragments and one involving circular-template PCR (CircLigase). We find that each RNA-seq approach shows specific limitations and biases, with the application of multiple methods providing a more complete map than was obtained from any single method. Of particular note in the analysis were substantial advantages of CircLigase-based and ssRNA-based capture for defining sequences and structures of the precise 5' ends (which were lost using the double-strand cDNA capture method). Of the three methods, ssRNA capture was most effective in defining sequences to the poly(A) junction. Using data sets from a spectrum of C. elegans strains and stages and the UCSC Genome Browser, we provide a series of tools, which facilitate rapid visualization and assignment of gene structures.
Grushetskaia, Z E; Lemesh, V A; Khotyleva, L V
2010-01-01
Cellulose synthase catalytic subunit genes, CesA, have been discovered in several higher plant species, and it has been shown that the CesA gene family has multiple members. The HVR2 fragment of these genes determines the class specificity of the CESA protein and its participation in primary or secondary cell wall synthesis. The aim of this study was to develop specific and degenerate primers to flax CesA gene fragments in order to obtain the class-specific HVR2 region of the gene. Two pairs of specific primers to particular fragments of the CesA-1 and CesA-6 genes and one pair of degenerate primers to the HVR2 region of all flax CesA genes were developed based on a comparison of six CesA EST sequences of flax and full cDNA sequences of Arabidopsis, poplar, maize and cotton plants, obtained from GenBank. After amplification of flax cDNA, bands of the expected size were detected (201 and 300 bp for CesA-1 and CesA-6, and 600 bp for the HVR2 region of CesA, respectively). The developed markers can be used for cloning and sequencing flax CesA genes and for identifying their number in the flax genome and their tissue and stage specificity.
Chen, Honglin; Wang, Lixia; Liu, Xiaoyan; Hu, Liangliang; Wang, Suhua; Cheng, Xuzhen
2017-07-11
Cowpea [Vigna unguiculata (L.) Walp.] is one of the most important legumes in tropical and semi-arid regions. However, there is relatively little genomic information available for genetic research on and breeding of cowpea. The objectives of this study were to analyse the cowpea transcriptome and develop genic molecular markers for future genetic studies of this genus. Approximately 54 million high-quality cDNA sequence reads were obtained from cowpea based on Illumina paired-end sequencing technology and were de novo assembled to generate 47,899 unigenes with an N50 length of 1534 bp. Sequence similarity analysis revealed 36,289 unigenes (75.8%) with significant similarity to known proteins in the non-redundant (Nr) protein database, 23,471 unigenes (49.0%) with BLAST hits in the Swiss-Prot database, and 20,654 unigenes (43.1%) with high similarity in the Kyoto Encyclopedia of Genes and Genomes (KEGG) database. Further analysis identified 5560 simple sequence repeats (SSRs) as potential genic molecular markers. Validating a random set of 500 SSR markers yielded 54 polymorphic markers among 32 cowpea accessions. This transcriptomic analysis of cowpea provided a valuable set of genomic data for characterizing genes with important agronomic traits in Vigna unguiculata and a new set of genic SSR markers for further genetic studies and breeding in cowpea and related Vigna species.
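The cowpea abstract above reports an N50 length of 1534 bp for its unigene assembly but does not say how the value was computed. As a minimal sketch, assuming the conventional definition (the length L such that sequences of length L or longer contain at least half of all assembled bases), the statistic can be computed as follows. The function name and the toy lengths are illustrative and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    Conventional definition (an assumption here): walk the lengths from
    longest to shortest and return the length at which the running total
    first reaches half of the total number of bases.
    """
    if not lengths:
        raise ValueError("no sequence lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Toy example with made-up contig lengths (not data from the abstract).
print(n50([100, 200, 300, 400]))
```

In the toy example the total is 1000 bases; the two longest sequences (400 and 300) together cover 700 bases, which first passes the halfway mark, so the function returns 300.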
De Pittà, Cristiano; Bertolucci, Cristiano; Mazzotta, Gabriella M; Bernante, Filippo; Rizzo, Giorgia; De Nardi, Barbara; Pallavicini, Alberto; Lanfranchi, Gerolamo; Costa, Rodolfo
2008-01-01
Background Little is known about the genome sequences of Euphausiacea (krill) although these crustaceans are abundant components of the pelagic ecosystems in all oceans and are used in aquaculture and the pharmaceutical industry. This study reports the results of an expressed sequence tag (EST) sequencing project from different tissues of Euphausia superba (the Antarctic krill). Results We have constructed and sequenced five cDNA libraries from different Antarctic krill tissues: head, abdomen, thoracopods and photophores. We have identified 1,770 high-quality ESTs which were assembled into 216 overlapping clusters and 801 singletons resulting in a total of 1,017 non-redundant sequences. Quantitative RT-PCR analysis was performed to quantify and validate the expression levels of ten genes presenting different EST counts in krill tissues. In addition, bioinformatic screening of the non-redundant E. superba sequences identified 69 microsatellite-containing ESTs. Clusters, consensuses and related similarity and gene ontology searches were organized in a dedicated E. superba database. Conclusion We defined the first tissue transcriptional signatures of E. superba based on functional categorization among the examined tissues. The analyses of annotated transcripts showed a higher similarity with genes from insects than with genes from Malacostraca, possibly as an effect of the limited number of Malacostraca sequences in the public databases. Our catalogue provides for the first time a genomic tool to investigate the biology of the Antarctic krill. PMID:18226200
Antalis, T M; Clark, M A; Barnes, T; Lehrbach, P R; Devine, P L; Schevzov, G; Goss, N H; Stephens, R W; Tolstoshev, P
1988-02-01
Human monocyte-derived plasminogen activator inhibitor (mPAI-2) was purified to homogeneity from the U937 cell line and partially sequenced. Oligonucleotide probes derived from this sequence were used to screen a cDNA library prepared from U937 cells. One positive clone was sequenced and contained most of the coding sequence as well as a long incomplete 3' untranslated region (1112 base pairs). This cDNA sequence was shown to encode mPAI-2 by hybrid-select translation. A cDNA clone encoding the remainder of the mPAI-2 mRNA was obtained by primer extension of U937 poly(A)+ RNA using a probe complementary to the mPAI-2 coding region. The coding sequence for mPAI-2 was placed under the control of the lambda PL promoter, and the protein expressed in Escherichia coli formed a complex with urokinase that could be detected immunologically. By nucleotide sequence analysis, mPAI-2 cDNA encodes a protein containing 415 amino acids with a predicted unglycosylated Mr of 46,543. The predicted amino acid sequence of mPAI-2 is very similar to placental PAI-2 (3 amino acid differences) and shows extensive homology with members of the serine protease inhibitor (serpin) superfamily. mPAI-2 was found to be more homologous to ovalbumin (37%) than the endothelial plasminogen activator inhibitor, PAI-1 (26%). Like ovalbumin, mPAI-2 appears to have no typical amino-terminal signal sequence. The 3' untranslated region of the mPAI-2 cDNA contains a putative regulatory sequence that has been associated with the inflammatory mediators.
Antalis, T M; Clark, M A; Barnes, T; Lehrbach, P R; Devine, P L; Schevzov, G; Goss, N H; Stephens, R W; Tolstoshev, P
1988-01-01
Human monocyte-derived plasminogen activator inhibitor (mPAI-2) was purified to homogeneity from the U937 cell line and partially sequenced. Oligonucleotide probes derived from this sequence were used to screen a cDNA library prepared from U937 cells. One positive clone was sequenced and contained most of the coding sequence as well as a long incomplete 3' untranslated region (1112 base pairs). This cDNA sequence was shown to encode mPAI-2 by hybrid-select translation. A cDNA clone encoding the remainder of the mPAI-2 mRNA was obtained by primer extension of U937 poly(A)+ RNA using a probe complementary to the mPAI-2 coding region. The coding sequence for mPAI-2 was placed under the control of the lambda PL promoter, and the protein expressed in Escherichia coli formed a complex with urokinase that could be detected immunologically. By nucleotide sequence analysis, mPAI-2 cDNA encodes a protein containing 415 amino acids with a predicted unglycosylated Mr of 46,543. The predicted amino acid sequence of mPAI-2 is very similar to placental PAI-2 (3 amino acid differences) and shows extensive homology with members of the serine protease inhibitor (serpin) superfamily. mPAI-2 was found to be more homologous to ovalbumin (37%) than the endothelial plasminogen activator inhibitor, PAI-1 (26%). Like ovalbumin, mPAI-2 appears to have no typical amino-terminal signal sequence. The 3' untranslated region of the mPAI-2 cDNA contains a putative regulatory sequence that has been associated with the inflammatory mediators. PMID:3257578
PanGEA: identification of allele specific gene expression using the 454 technology.
Kofler, Robert; Teixeira Torres, Tatiana; Lelley, Tamas; Schlötterer, Christian
2009-05-14
Next generation sequencing technologies hold great potential for many biological questions. While mainly used for genomic sequencing, they are also very promising for gene expression profiling. Sequencing of cDNA not only provides an estimate of the absolute expression level, it can also be used for the identification of allele specific gene expression. We developed PanGEA, a tool which enables a fast and user-friendly analysis of allele specific gene expression using the 454 technology. PanGEA allows mapping of 454-ESTs to genes or whole genomes, displaying gene expression profiles, identification of SNPs and the quantification of allele specific gene expression. The intuitive GUI of PanGEA facilitates a flexible and interactive analysis of the data. PanGEA additionally implements a modification of the Smith-Waterman algorithm which deals with incorrect estimates of homopolymer length as occurring in the 454 technology. To our knowledge, PanGEA is the first tool which facilitates the identification of allele specific gene expression. PanGEA is distributed under the Mozilla Public License and available at: http://www.kofler.or.at/bioinformatics/PanGEA
PanGEA: Identification of allele specific gene expression using the 454 technology
Kofler, Robert; Teixeira Torres, Tatiana; Lelley, Tamas; Schlötterer, Christian
2009-01-01
Background Next generation sequencing technologies hold great potential for many biological questions. While mainly used for genomic sequencing, they are also very promising for gene expression profiling. Sequencing of cDNA not only provides an estimate of the absolute expression level, it can also be used for the identification of allele specific gene expression. Results We developed PanGEA, a tool which enables a fast and user-friendly analysis of allele specific gene expression using the 454 technology. PanGEA allows mapping of 454-ESTs to genes or whole genomes, displaying gene expression profiles, identification of SNPs and the quantification of allele specific gene expression. The intuitive GUI of PanGEA facilitates a flexible and interactive analysis of the data. PanGEA additionally implements a modification of the Smith-Waterman algorithm which deals with incorrect estimates of homopolymer length as occurring in the 454 technology. Conclusion To our knowledge, PanGEA is the first tool which facilitates the identification of allele specific gene expression. PanGEA is distributed under the Mozilla Public License and available at: PMID:19442283
Gabus, C; Ficheux, D; Rau, M; Keith, G; Sandmeyer, S; Darlix, J L
1998-01-01
Retroviruses, including HIV-1 and the distantly related yeast retroelement Ty3, all encode a nucleoprotein required for virion structure and replication. During an in vitro comparison of HIV-1 and Ty3 nucleoprotein function in RNA dimerization and cDNA synthesis, we discovered a bipartite primer-binding site (PBS) for Ty3 composed of sequences located at opposite ends of the genome. Ty3 cDNA synthesis requires the 3' PBS for primer tRNAiMet annealing to the genomic RNA, and the 5' PBS, in cis or in trans, as the reverse transcription start site. Ty3 RNA alone is unable to dimerize, but formation of dimeric tRNAiMet bound to the PBS was found to direct dimerization of Ty3 RNA-tRNAiMet. Interestingly, HIV-1 nucleocapsid protein NCp7 and Ty3 NCp9 were interchangeable using HIV-1 and Ty3 RNA template-primer systems. Our findings impact on the understanding of non-canonical reverse transcription as well as on the use of Ty3 systems to screen for anti-NCp7 drugs. PMID:9707446
Gabus, C; Ficheux, D; Rau, M; Keith, G; Sandmeyer, S; Darlix, J L
1998-08-17
Retroviruses, including HIV-1 and the distantly related yeast retroelement Ty3, all encode a nucleoprotein required for virion structure and replication. During an in vitro comparison of HIV-1 and Ty3 nucleoprotein function in RNA dimerization and cDNA synthesis, we discovered a bipartite primer-binding site (PBS) for Ty3 composed of sequences located at opposite ends of the genome. Ty3 cDNA synthesis requires the 3' PBS for primer tRNAiMet annealing to the genomic RNA, and the 5' PBS, in cis or in trans, as the reverse transcription start site. Ty3 RNA alone is unable to dimerize, but formation of dimeric tRNAiMet bound to the PBS was found to direct dimerization of Ty3 RNA-tRNAiMet. Interestingly, HIV-1 nucleocapsid protein NCp7 and Ty3 NCp9 were interchangeable using HIV-1 and Ty3 RNA template-primer systems. Our findings impact on the understanding of non-canonical reverse transcription as well as on the use of Ty3 systems to screen for anti-NCp7 drugs.
Nucleotide sequences of two genomic DNAs encoding peroxidase of Arabidopsis thaliana.
Intapruk, C; Higashimura, N; Yamamoto, K; Okada, N; Shinmyo, A; Takano, M
1991-02-15
The peroxidase (EC 1.11.1.7)-encoding gene of Arabidopsis thaliana was screened from a genomic library using a cDNA encoding a neutral isozyme of horseradish, Armoracia rusticana, peroxidase (HRP) as a probe, and two positive clones were isolated. From the comparison with the sequences of the HRP-encoding genes, we concluded that two clones contained peroxidase-encoding genes, and they were named prxCa and prxEa. Both genes consisted of four exons and three introns; the introns had consensus nucleotides, GT and AG, at the 5' and 3' ends, respectively. The lengths of each putative exon of the prxEa gene were the same as those of the HRP-basic-isozyme-encoding gene, prxC3, and coded for 349 amino acids (aa) with a sequence homology of 89% to that encoded by prxC3. The prxCa gene was very close to the HRP-neutral-isozyme-encoding gene, prxC1b, and coded for 354 aa with 91% homology to that encoded by prxC1b. The aa sequence homology was 64% between the two peroxidases encoded by prxCa and prxEa.
REDIdb: the RNA editing database.
Picardi, Ernesto; Regina, Teresa Maria Rosaria; Brennicke, Axel; Quagliariello, Carla
2007-01-01
The RNA Editing Database (REDIdb) is an interactive, web-based database created and designed with the aim to allocate RNA editing events such as substitutions, insertions and deletions occurring in a wide range of organisms. The database contains both fully and partially sequenced DNA molecules for which editing information is available either by experimental inspection (in vitro) or by computational detection (in silico). Each record of REDIdb is organized in a specific flat-file containing a description of the main characteristics of the entry, a feature table with the editing events and related details and a sequence zone with both the genomic sequence and the corresponding edited transcript. REDIdb is a relational database in which the browsing and identification of editing sites has been simplified by means of two facilities to either graphically display genomic or cDNA sequences or to show the corresponding alignment. In both cases, all editing sites are highlighted in colour and their relative positions are detailed by mousing over. New editing positions can be directly submitted to REDIdb after a user-specific registration to obtain authorized secure access. This first version of REDIdb database stores 9964 editing events and can be freely queried at http://biologia.unical.it/py_script/search.html.
DeWitt, D L; Smith, W L
1988-01-01
Prostaglandin G/H synthase (8,11,14-icosatrienoate, hydrogen-donor:oxygen oxidoreductase, EC 1.14.99.1) catalyzes the first step in the formation of prostaglandins and thromboxanes, the conversion of arachidonic acid to prostaglandin endoperoxides G and H. This enzyme is the site of action of nonsteroidal anti-inflammatory drugs. We have isolated a 2.7-kilobase complementary DNA (cDNA) encompassing the entire coding region of prostaglandin G/H synthase from sheep vesicular glands. This cDNA, cloned from a lambda gt 10 library prepared from poly(A)+ RNA of vesicular glands, hybridizes with a single 2.75-kilobase mRNA species. The cDNA clone was selected using oligonucleotide probes modeled from amino acid sequences of tryptic peptides prepared from the purified enzyme. The full-length cDNA encodes a protein of 600 amino acids, including a signal sequence of 24 amino acids. Identification of the cDNA as coding for prostaglandin G/H synthase is based on comparison of amino acid sequences of seven peptides comprising 103 amino acids with the amino acid sequence deduced from the nucleotide sequence of the cDNA. The molecular weight of the unglycosylated enzyme lacking the signal peptide is 65,621. The synthase is a glycoprotein, and there are three potential sites for N-glycosylation, two of them in the amino-terminal half of the molecule. The serine reported to be acetylated by aspirin is at position 530, near the carboxyl terminus. There is no significant similarity between the sequence of the synthase and that of any other protein in amino acid or nucleotide sequence libraries, and a heme binding site(s) is not apparent from the amino acid sequence. The availability of a full-length cDNA clone coding for prostaglandin G/H synthase should facilitate studies of the regulation of expression of this enzyme and the structural features important for catalysis and for interaction with anti-inflammatory drugs. PMID:3125548
Tartar, Aurélien; Wheeler, Marsha M; Zhou, Xuguo; Coy, Monique R; Boucias, Drion G; Scharf, Michael E
2009-01-01
Background Termite lignocellulose digestion is achieved through a collaboration of host plus prokaryotic and eukaryotic symbionts. In the present work, we took a combined host and symbiont metatranscriptomic approach for investigating the digestive contributions of host and symbiont in the lower termite Reticulitermes flavipes. Our approach consisted of parallel high-throughput sequencing from (i) a host gut cDNA library and (ii) a hindgut symbiont cDNA library. Subsequently, we undertook functional analyses of newly identified phenoloxidases with potential importance as pretreatment enzymes in industrial lignocellulose processing. Results Over 10,000 expressed sequence tags (ESTs) were sequenced from the 2 libraries that aligned into 6,555 putative transcripts, including 171 putative lignocellulase genes. Sequence analyses provided insights in two areas. First, a non-overlapping complement of host and symbiont (prokaryotic plus protist) glycohydrolase gene families known to participate in cellulose, hemicellulose, alpha carbohydrate, and chitin degradation were identified. Of these, cellulases are contributed by host plus symbiont genomes, whereas hemicellulases are contributed exclusively by symbiont genomes. Second, a diverse complement of previously unknown genes that encode proteins with homology to lignase, antioxidant, and detoxification enzymes were identified exclusively from the host library (laccase, catalase, peroxidase, superoxide dismutase, carboxylesterase, cytochrome P450). Subsequently, functional analyses of phenoloxidase activity provided results that were strongly consistent with patterns of laccase gene expression. In particular, phenoloxidase activity and laccase gene expression are mostly restricted to symbiont-free foregut plus salivary gland tissues, and phenoloxidase activity is inducible by lignin feeding. 
Conclusion To our knowledge, this is the first time that a dual host-symbiont transcriptome sequencing effort has been conducted in a single termite species. This sequence database represents an important new genomic resource for use in further studies of collaborative host-symbiont termite digestion, as well as development of coevolved host and symbiont-derived biocatalysts for use in industrial biomass-to-bioethanol applications. Additionally, this study (i) demonstrates that phenoloxidase activities are prominent in the R. flavipes gut and are not symbiont derived, (ii) expands the known number of host and symbiont glycosyl hydrolase families in Reticulitermes, and (iii) supports previous models of lignin degradation and host-symbiont collaboration in cellulose/hemicellulose digestion in the termite gut. All sequences in this paper are available publicly with the accession numbers FL634956-FL640828 (Termite Gut library) and FL641015-FL645753 (Symbiont library). PMID:19832970
Characterization and chromosomal localization of the gene for human rhodopsin kinase
DOE Office of Scientific and Technical Information (OSTI.GOV)
Khani, S.C.; Yamamoto, S.; Dryja, T.P.
1996-08-01
G-protein-dependent receptor kinases (GRKs) play a key role in the adaptation of receptors to persistent stimuli. In rod photoreceptors rhodopsin kinase (RK) mediates rapid desensitization of rod photoreceptors to light by catalyzing phosphorylation of the visual pigment rhodopsin. To study the structure and mechanism of GRKs in human photoreceptors, we have isolated and characterized cDNA and genomic clones derived from the human RK locus using a bovine rhodopsin kinase cDNA fragment as a probe. The RK locus, assigned to chromosome 13 band q34, is composed of seven exons that encode a protein 92% identical in amino acid sequence to bovine rhodopsin kinase. The marked difference between the structure of this gene and that of another recently cloned human GRK gene suggests the existence of a wide evolutionary gap between members of the GRK gene family. 39 refs., 3 figs.
Cloning, structure, and chromosome localization of the mouse glutaryl-CoA dehydrogenase gene
DOE Office of Scientific and Technical Information (OSTI.GOV)
Koeller, D.M.; DiGiulio, A.; Frerman, F.E.
Glutaryl-CoA dehydrogenase (GCDH) is a nuclear-encoded, mitochondrial matrix enzyme. In humans, deficiency of GCDH leads to glutaric acidemia type I, an inherited disorder of amino acid metabolism characterized by a progressive neurodegenerative disease. In this report we describe the cloning and structure of the mouse GCDH (Gcdh) gene and cDNA and its chromosomal localization. The mouse Gcdh cDNA is 1.75 kb long and contains an open reading frame of 438 amino acids. The amino acid sequences of mouse, human, and pig GCDH are highly conserved. The mouse Gcdh gene contains 11 exons and spans 7 kb of genomic DNA. Gcdh was mapped by backcross analysis to mouse chromosome 8 within a region that is homologous to a region of human chromosome 19, where the human gene was previously mapped. 14 refs., 3 figs.
Regulation and Functional Expression of Cinnamate 4-Hydroxylase from Parsley
Koopmann, Edda; Logemann, Elke; Hahlbrock, Klaus
1999-01-01
A previously isolated parsley (Petroselinum crispum) cDNA with high sequence similarity to cinnamate 4-hydroxylase (C4H) cDNAs from several plant sources was expressed in yeast (Saccharomyces cerevisiae) containing a plant NADPH:cytochrome P450 oxidoreductase and verified as encoding a functional C4H (CYP73A10). Low genomic complexity and the occurrence of a single type of cDNA suggest the existence of only one C4H gene in parsley. The encoded mRNA and protein, in contrast to those of a functionally related NADPH:cytochrome P450 oxidoreductase, were strictly coregulated with phenylalanine ammonia-lyase mRNA and protein, respectively, as demonstrated by coinduction under various conditions and colocalization in situ in cross-sections from several different parsley tissues. These results support the hypothesis that the genes encoding the core reactions of phenylpropanoid metabolism form a tight regulatory unit. PMID:9880345
Sequencing, Annotation and Analysis of the Syrian Hamster (Mesocricetus auratus) Transcriptome
Tchitchek, Nicolas; Safronetz, David; Rasmussen, Angela L.; Martens, Craig; Virtaneva, Kimmo; Porcella, Stephen F.; Feldmann, Heinz
2014-01-01
Background The Syrian hamster (golden hamster, Mesocricetus auratus) is gaining importance as a new experimental animal model for multiple pathogens, including emerging zoonotic diseases such as Ebola. Nevertheless there are currently no publicly available transcriptome reference sequences or genome for this species. Results A cDNA library derived from mRNA and snRNA isolated and pooled from the brains, lungs, spleens, kidneys, livers, and hearts of three adult female Syrian hamsters was sequenced. Sequence reads were assembled into 62,482 contigs and 111,796 reads remained unassembled (singletons). This combined contig/singleton dataset, designated as the Syrian hamster transcriptome, represents a total of 60,117,204 nucleotides. Our Mesocricetus auratus Syrian hamster transcriptome mapped to 11,648 mouse transcripts representing 9,562 distinct genes, and mapped to a similar number of transcripts and genes in the rat. We identified 214 quasi-complete transcripts based on mouse annotations. Canonical pathways involved in a broad spectrum of fundamental biological processes were significantly represented in the library. The Syrian hamster transcriptome was aligned to the current release of the Chinese hamster ovary (CHO) cell transcriptome and genome to improve the genomic annotation of this species. Finally, our Syrian hamster transcriptome was aligned against 14 other rodents, primate and laurasiatheria species to gain insights about the genetic relatedness and placement of this species. Conclusions This Syrian hamster transcriptome dataset significantly improves our knowledge of the Syrian hamster's transcriptome, especially towards its future use in infectious disease research. Moreover, this library is an important resource for the wider scientific community to help improve genome annotation of the Syrian hamster and other closely related species. 
Furthermore, these data provide the basis for development of expression microarrays that can be used in functional genomics studies. PMID:25398096
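The dataset sizes quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated above; the variable names are illustrative, and the derived mean sequence length is a back-of-the-envelope figure, not a value reported by the authors.

```python
# Figures quoted in the abstract.
contigs = 62_482
singletons = 111_796
total_nt = 60_117_204
mouse_transcripts = 11_648
mouse_genes = 9_562

# Derived quantities (not reported in the abstract).
total_sequences = contigs + singletons
mean_length_nt = total_nt / total_sequences
transcripts_per_gene = mouse_transcripts / mouse_genes

print(f"sequences in dataset: {total_sequences}")
print(f"mean sequence length: {mean_length_nt:.1f} nt")
print(f"mapped mouse transcripts per gene: {transcripts_per_gene:.2f}")
```

A mean length of a few hundred nucleotides is what one would expect when unassembled single reads make up most of the combined contig/singleton set.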
Blair, Matthew W; Hurtado, Natalia; Chavarro, Carolina M; Muñoz-Torres, Monica C; Giraldo, Martha C; Pedraza, Fabio; Tomkins, Jeff; Wing, Rod
2011-03-22
Sequencing of cDNA libraries for the development of expressed sequence tags (ESTs) as well as for the discovery of simple sequence repeats (SSRs) has been a common method of developing microsatellites or SSR-based markers. In this research, our objective was to further sequence and develop common bean microsatellites from leaf and root cDNA libraries derived from the Andean gene pool accession G19833 and the Mesoamerican gene pool accession DOR364, mapping parents of a commonly used reference map. The root libraries were made from high and low phosphorus treated plants. A total of 3,123 EST sequences from leaf and root cDNA libraries were screened and used for direct simple sequence repeat discovery. From these EST sequences we found 184 microsatellites; the majority containing tri-nucleotide motifs, many of which were GC rich (ACC, AGC and AGG in particular). Di-nucleotide motif microsatellites were about half as common as the tri-nucleotide motif microsatellites but most of these were AGn microsatellites with a moderate number of ATn microsatellites in root ESTs followed by few ACn and no GCn microsatellites. Out of the 184 new SSR loci, 120 new microsatellite markers were developed in the BMc (Bean Microsatellites from cDNAs) series and these were evaluated for their capacity to distinguish bean diversity in a germplasm panel of 18 genotypes. We developed a database with images of the microsatellites and their polymorphism information content (PIC), which averaged 0.310 for polymorphic markers. The present study produced information about microsatellite frequency in root and leaf tissues of two important genotypes for common bean genomics: namely G19833, the Andean genotype selected for whole genome shotgun sequencing from race Peru, and DOR364 a race Mesoamerica subgroup 2 genotype that is a small-red seeded, released variety in Central America. 
Both race Peru and Mesoamerica subgroup 2 (small red beans) have been understudied in comparison to race Nueva Granada and Mesoamerica subgroup 1 (black beans) both with regards to gene expression and as sources of markers. However, we found few differences between SSR type and frequency between the G19833 leaf and DOR364 root tissue-derived ESTs. Overall, our work adds to the analysis of microsatellite frequency evaluation for common bean and provides a new set of 120 BMc markers which combined with the 248 previously developed BMc markers brings the total in this series to 368 markers. Once we include BMd markers, which are derived from GenBank sequences, the current total of gene-based markers from our laboratory surpasses 500 markers. These markers are basic for studies of the transcriptome of common bean and can form anchor points for genetic mapping studies in the future.
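The SSR discovery step described in this abstract amounts to scanning each EST for short tandem repeats. The following is a minimal sketch of that idea, not the authors' pipeline: the function name `find_ssrs` and the minimum repeat counts (6 for di-nucleotide, 4 for tri-nucleotide motifs) are assumptions chosen for illustration, and only perfect repeats are reported.

```python
import re

def find_ssrs(seq, min_repeats=None):
    """Return (start, motif, repeat_count) for perfect di-/tri-nucleotide SSRs.

    min_repeats maps motif length -> minimum number of tandem copies.
    The default thresholds are illustrative assumptions.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 4}
    seq = seq.upper()
    hits = []
    for motif_len, min_rep in sorted(min_repeats.items()):
        # Group 2 is the motif; \2{n,} requires at least n further copies.
        pattern = re.compile(
            r"(([ACGT]{%d})\2{%d,})" % (motif_len, min_rep - 1)
        )
        for m in pattern.finditer(seq):
            motif = m.group(2)
            if len(set(motif)) == 1:
                continue  # homopolymer runs such as AAAAAA are not SSR motifs here
            hits.append((m.start(), motif, len(m.group(1)) // motif_len))
    return sorted(hits)

# Hypothetical EST fragment with one AG repeat and one ACC repeat.
example = "TTT" + "AG" * 7 + "CCGT" + "ACC" * 5 + "G"
for start, motif, count in find_ssrs(example):
    print(f"pos {start}: ({motif}){count}")
```

Motifs are reported as they occur on the scanned strand, so AG and its reverse complement CT would need to be merged in a separate step before tabulating motif classes such as AGn.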
Tetteh, Kevin K. A.; Loukas, Alex; Tripp, Cindy; Maizels, Rick M.
1999-01-01
Larvae of Toxocara canis, a nematode parasite of dogs, infect humans, causing visceral and ocular larva migrans. In noncanid hosts, larvae neither grow nor differentiate but endure in a state of arrested development. Reasoning that parasite protein production is orientated to immune evasion, we undertook a random sequencing project from a larval cDNA library to characterize the most highly expressed transcripts. In all, 266 clones were sequenced, most from both 3′ and 5′ ends, and similarity searches against GenBank protein and dbEST nucleotide databases were conducted. Cluster analyses showed that 128 distinct gene products had been found, all but 3 of which represented newly identified genes. Ninety-five genes were represented by a single clone, but seven transcripts were present at high frequencies, each composing >2% of all clones sequenced. These high-abundance transcripts include a mucin and a C-type lectin, which are both major excretory-secretory antigens released by parasites. Four highly expressed novel gene transcripts, termed ant (abundant novel transcript) genes, were found. Together, these four genes comprised 18% of all cDNA clones isolated, but no similar sequences occur in the Caenorhabditis elegans genome. While the coding regions of the four genes are dissimilar, their 3′ untranslated tracts have significant homology in nucleotide sequence. The discovery of these abundant, parasite-specific genes of newly identified lectins and mucins, as well as a range of conserved and novel proteins, provides defined candidates for future analysis of the molecular basis of immune evasion by T. canis. PMID:10456930
Ashburner, M; Misra, S; Roote, J; Lewis, S E; Blazej, R; Davis, T; Doyle, C; Galle, R; George, R; Harris, N; Hartzell, G; Harvey, D; Hong, L; Houston, K; Hoskins, R; Johnson, G; Martin, C; Moshrefi, A; Palazzolo, M; Reese, M G; Spradling, A; Tsang, G; Wan, K; Whitelaw, K; Celniker, S
1999-01-01
A contiguous sequence of nearly 3 Mb from the genome of Drosophila melanogaster has been sequenced from a series of overlapping P1 and BAC clones. This region covers 69 chromosome polytene bands on chromosome arm 2L, including the genetically well-characterized "Adh region." A computational analysis of the sequence predicts 218 protein-coding genes, 11 tRNAs, and 17 transposable element sequences. At least 38 of the protein-coding genes are arranged in clusters of from 2 to 6 closely related genes, suggesting extensive tandem duplication. The gene density is one protein-coding gene every 13 kb; the transposable element density is one element every 171 kb. Of 73 genes in this region identified by genetic analysis, 49 have been located on the sequence; P-element insertions have been mapped to 43 genes. Ninety-five (44%) of the known and predicted genes match a Drosophila EST, and 144 (66%) have clear similarities to proteins in other organisms. Genes known to have mutant phenotypes are more likely to be represented in cDNA libraries, and far more likely to have products similar to proteins of other organisms, than are genes with no known mutant phenotype. Over 650 chromosome aberration breakpoints map to this chromosome region, and their nonrandom distribution on the genetic map reflects variation in gene spacing on the DNA. This is the first large-scale analysis of the genome of D. melanogaster at the sequence level. In addition to the direct results obtained, this analysis has allowed us to develop and test methods that will be needed to interpret the complete sequence of the genome of this species. "Before beginning a Hunt, it is wise to ask someone what you are looking for before you begin looking for it." (Milne 1926) PMID:10471707
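The densities and percentages quoted in this abstract follow from simple ratios. The sketch below reproduces them under one labeled assumption: the abstract says only "nearly 3 Mb", so a region length of 2,900 kb is assumed here because it is consistent with both quoted densities. The helper names are illustrative.

```python
def kb_per_feature(region_kb, n_features):
    """Average spacing, in kb, between features in a region."""
    return region_kb / n_features

def percent(part, whole):
    """Share of `part` in `whole`, as a percentage."""
    return 100.0 * part / whole

REGION_KB = 2900   # ASSUMPTION: the abstract states only "nearly 3 Mb"
GENES = 218        # predicted protein-coding genes
TES = 17           # transposable element sequences

print(f"one gene every {kb_per_feature(REGION_KB, GENES):.1f} kb")
print(f"one transposable element every {kb_per_feature(REGION_KB, TES):.1f} kb")
print(f"genes matching an EST: {percent(95, GENES):.1f}%")
print(f"genes with similarity to other organisms: {percent(144, GENES):.1f}%")
```

With a full 3,000 kb the transposable element spacing would come out near 176 kb rather than 171 kb, which is why a slightly shorter length is assumed.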
EuroPineDB: a high-coverage web database for maritime pine transcriptome
2011-01-01
Background Pinus pinaster is an economically and ecologically important species that is becoming a woody gymnosperm model. Its enormous genome size makes whole-genome sequencing approaches hard to apply. Therefore, the expressed portion of the genome has to be characterised and the results and annotations have to be stored in dedicated databases. Description EuroPineDB is the largest sequence collection available for a single pine species, Pinus pinaster (maritime pine), since it comprises 951 641 raw sequence reads obtained from non-normalised cDNA libraries and high-throughput sequencing from adult (xylem, phloem, roots, stem, needles, cones, strobili) and embryonic (germinated embryos, buds, callus) maritime pine tissues. Using open-source tools, sequences were optimally pre-processed, assembled, and extensively annotated (GO, EC and KEGG terms, descriptions, SNPs, SSRs, ORFs and InterPro codes). As a result, a 10.5× P. pinaster genome was covered and assembled in 55 322 UniGenes. A total of 32 919 (59.5%) of P. pinaster UniGenes were annotated with at least one description, revealing at least 18 466 different genes. The complete database, which is designed to be scalable, maintainable, and expandable, is freely available at: http://www.scbi.uma.es/pindb/. It can be retrieved by gene libraries, pine species, annotations, UniGenes and microarrays (i.e., the sequences are distributed in two-colour microarrays; this is the only conifer database that provides this information) and will be periodically updated. Small assemblies can be viewed using a dedicated visualisation tool that connects them with SNPs. Any sequence or annotation set shown on-screen can be downloaded. Retrieval mechanisms for sequences and gene annotations are provided. 
Conclusions The EuroPineDB with its integrated information can be used to reveal new knowledge, offers an easy-to-use collection of information to directly support experimental work (including microarray hybridisation), and provides deeper knowledge on the maritime pine transcriptome. PMID:21762488
Identification, validation and high-throughput genotyping of transcribed gene SNPs in cassava.
Ferguson, Morag E; Hearne, Sarah J; Close, Timothy J; Wanamaker, Steve; Moskal, William A; Town, Christopher D; de Young, Joe; Marri, Pradeep Reddy; Rabbi, Ismail Yusuf; de Villiers, Etienne P
2012-03-01
The availability of genomic resources can facilitate progress in plant breeding through the application of advanced molecular technologies for crop improvement. This is particularly important in the case of less researched crops such as cassava, a staple and food security crop for more than 800 million people. Here, expressed sequence tags (ESTs) were generated from five drought stressed and well-watered cassava varieties. Two cDNA libraries were developed: one from root tissue (CASR), the other from leaf, stem and stem meristem tissue (CASL). Sequencing generated 706 contigs and 3,430 singletons. These sequences were combined with those from two other EST sequencing initiatives and filtered based on the sequence quality. Quality sequences were aligned using CAP3 and embedded in a Windows browser called HarvEST:Cassava which is made available. HarvEST:Cassava consists of a Unigene set of 22,903 quality sequences. A total of 2,954 putative SNPs were identified. Of these 1,536 SNPs from 1,170 contigs and 53 cassava genotypes were selected for SNP validation using Illumina's GoldenGate assay. As a result 1,190 SNPs were validated technically and biologically. The location of validated SNPs on scaffolds of the cassava genome sequence (v.4.1) is provided. A diversity assessment of 53 cassava varieties reveals some sub-structure based on the geographical origin, greater diversity in the Americas as opposed to Africa, and similar levels of diversity in West Africa and southern, eastern and central Africa. The resources presented allow for improved genetic dissection of economically important traits and the application of modern genomics-based approaches to cassava breeding and conservation.
Prospecting for viral natural enemies of the fire ant Solenopsis invicta in Argentina.
Valles, Steven M; Porter, Sanford D; Calcaterra, Luis A
2018-01-01
Metagenomics and next generation sequencing were employed to discover new virus natural enemies of the fire ant, Solenopsis invicta Buren in its native range (i.e., Formosa, Argentina) with the ultimate goal of testing and releasing new viral pathogens into U.S. S. invicta populations to provide natural, sustainable control of this ant. RNA was purified from worker ants from 182 S. invicta colonies, which was pooled into 4 groups according to location. A library was created from each group and sequenced using Illumina Miseq technology. After a series of winnowing methods to remove S. invicta genes, known S. invicta virus genes, and all other non-virus gene sequences, 61,944 unique singletons were identified with virus identity. These were assembled de novo yielding 171 contiguous sequences with significant identity to non-plant virus genes. Fifteen contiguous sequences exhibited very high expression rates and were detected in all four gene libraries. One contig (Contig_29) exhibited the highest expression level overall and across all four gene libraries. Random amplification of cDNA ends analyses expanded this contiguous sequence yielding a complete virus genome, which we have provisionally named Solenopsis invicta virus 5 (SINV-5). SINV-5 is a positive-sense, single-stranded RNA virus with genome characteristics consistent with insect-infecting viruses from the family Dicistroviridae. Moreover, the replicative genome strand of SINV-5 was detected in worker ants indicating that S. invicta serves as host for the virus. Many additional sequences were identified that are likely of viral origin. These sequences await further investigation to determine their origins and relationship with S. invicta. This study expands knowledge of the RNA virome diversity found within S. invicta populations. PMID:29466388
Shen, K A; Meyers, B C; Islam-Faridi, M N; Chin, D B; Stelly, D M; Michelmore, R W
1998-08-01
The recent cloning of genes for resistance against diverse pathogens from a variety of plants has revealed that many share conserved sequence motifs. This provides the possibility of isolating numerous additional resistance genes by polymerase chain reaction (PCR) with degenerate oligonucleotide primers. We amplified resistance gene candidates (RGCs) from lettuce with multiple combinations of primers with low degeneracy designed from motifs in the nucleotide binding sites (NBSs) of RPS2 of Arabidopsis thaliana and N of tobacco. Genomic DNA, cDNA, and bacterial artificial chromosome (BAC) clones were successfully used as templates. Four families of sequences were identified that had the same similarity to each other as to resistance genes from other species. The relationship of the amplified products to resistance genes was evaluated by several sequence and genetic criteria. The amplified products contained open reading frames with additional sequences characteristic of NBSs. Hybridization of RGCs to genomic DNA and to BAC clones revealed large numbers of related sequences. Genetic analysis demonstrated the existence of clustered multigene families for each of the four RGC sequences. This parallels classical genetic data on clustering of disease resistance genes. Two of the four families mapped to known clusters of resistance genes; these two families were therefore studied in greater detail. Additional evidence that these RGCs could be resistance genes was gained by the identification of leucine-rich repeat (LRR) regions in sequences adjoining the NBS similar to those in RPM1 and RPS2 of A. thaliana. Fluorescent in situ hybridization confirmed the clustered genomic distribution of these sequences. The use of PCR with degenerate oligonucleotide primers is therefore an efficient method to identify numerous RGCs in plants.
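The phrase "primers with low degeneracy" has a concrete meaning: a degenerate primer written with IUPAC ambiguity codes stands for a pool of distinct oligonucleotides, and its degeneracy is the size of that pool. The sketch below computes it; the example primer is hypothetical and is not one of the primers used in the study.

```python
from itertools import product

# Standard IUPAC nucleotide ambiguity codes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

def degeneracy(primer):
    """Number of distinct sequences represented by a degenerate primer."""
    total = 1
    for base in primer.upper():
        total *= len(IUPAC[base])
    return total

def expand(primer):
    """List every concrete sequence in the degenerate primer pool."""
    choices = [IUPAC[base] for base in primer.upper()]
    return ["".join(combo) for combo in product(*choices)]

hypothetical_primer = "GGNGGNGTNGGNAARAC"  # illustrative only
print(degeneracy(hypothetical_primer))
print(expand("AYR"))
```

Lower degeneracy means each individual oligonucleotide is present at a higher effective concentration in the reaction, which is one reason to design several low-degeneracy primers rather than a single highly degenerate one.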
Dangoudoubiyam, Sriveny; Vemulapalli, Ramesh; Hancock, Kathy; Kazacos, Kevin R.
2010-01-01
Larva migrans caused by Baylisascaris procyonis is an important zoonotic disease. Current serological diagnostic assays for this disease depend on the use of the parasite's larval excretory-secretory (ES) antigens. In order to identify genes encoding ES antigens and to generate recombinant antigens for use in diagnostic assays, construction and immunoscreening of a B. procyonis third-stage larva cDNA expression library was performed and resulted in identification of a partial-length cDNA clone encoding an ES antigen, designated repeat antigen 1 (RAG1). The full-length rag1 cDNA contained a 753-bp open reading frame that encoded a protein of 250 amino acids with 12 tandem repeats of a 12-amino-acid long sequence. The rag1 genomic DNA revealed a single intron of 837 bp that separated the 753-bp coding sequence into two exons delimited by canonical splice sites. No nucleotide or amino acid sequences present in the GenBank databases had significant similarity with those of RAG1. We have cloned, expressed, and purified the recombinant RAG1 (rRAG1) and analyzed its diagnostic potential by enzyme-linked immunosorbent assay. Anti-Baylisascaris species-specific rabbit serum showed strong reactivity to rRAG1, while only minimal to no reactivity was observed with sera against the related ascarids Toxocara canis and Ascaris suum, strongly suggesting the specificity of rRAG1. On the basis of these results, the identified RAG1 appears to be a promising diagnostic antigen for the development of serological assays for specific detection of B. procyonis larva migrans. PMID:20926699
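The lengths reported for rag1 are mutually consistent, which can be verified with a short calculation. The sketch assumes that the 753-bp open reading frame includes the stop codon (an assumption, though it is the reading that gives exactly 250 amino acids); the function name is illustrative.

```python
def protein_length_from_orf(orf_bp, includes_stop=True):
    """Amino acid count encoded by an ORF of `orf_bp` base pairs."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons

ORF_BP = 753       # from the abstract
INTRON_BP = 837    # from the abstract
REPEATS = 12       # tandem repeats
REPEAT_AA = 12     # amino acids per repeat

protein_aa = protein_length_from_orf(ORF_BP)   # ASSUMES stop codon is counted
repeat_aa = REPEATS * REPEAT_AA
genomic_span_bp = ORF_BP + INTRON_BP           # start codon to stop codon

print(f"protein length: {protein_aa} aa")
print(f"repeat region: {repeat_aa} aa ({100 * repeat_aa / protein_aa:.1f}% of the protein)")
print(f"coding region plus intron: {genomic_span_bp} bp")
```

On these numbers the tandem repeats account for more than half of the protein, which fits the name "repeat antigen".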
Jouffe, Vincent; Rowe, Suzanne; Liaubet, Laurence; Buitenhuis, Bart; Hornshøj, Henrik; SanCristobal, Magali; Mormède, Pierre; de Koning, D J
2009-07-16
Microarray studies can supplement QTL studies by suggesting potential candidate genes in the QTL regions, which by themselves are too large to provide a limited selection of candidate genes. Here we provide a case study where we explore ways to integrate QTL data and microarray data for the pig, which has only a partial genome sequence. We outline various procedures to localize differentially expressed genes on the pig genome and link this with information on published QTL. The starting point is a set of 237 differentially expressed cDNA clones in adrenal tissue from two pig breeds, before and after treatment with adrenocorticotropic hormone (ACTH). Different approaches to localize the differentially expressed (DE) genes to the pig genome showed different levels of success and a clear lack of concordance for some genes between the various approaches. For a focused analysis on 12 genes, overlapping QTL from the public domain were presented. Also, differentially expressed genes underlying QTL for ACTH response were described. Using the latest version of the draft sequence, the differentially expressed genes were mapped to the pig genome. This enabled co-location of DE genes and previously studied QTL regions, but the draft genome sequence is still incomplete and will contain many errors. A further step to explore links between DE genes and QTL at the pathway level was largely unsuccessful due to the lack of annotation of the pig genome. This could be improved by further comparative mapping analyses but this would be time consuming. This paper provides a case study for the integration of QTL data and microarray data for a species with limited genome sequence information and annotation. The results illustrate the challenges that must be addressed but also provide a roadmap for future work that is applicable to other non-model species.
Glaberman, Scott; Du Pasquier, Louis; Caccone, Adalgisa
2008-01-01
Squamates are a diverse order of vertebrates, representing more than 7,000 species. Yet, descriptions of full-length major histocompatibility complex (MHC) genes in this group are nearly absent from the literature, while the number of MHC studies continues to rise in other vertebrate taxa. The lack of basic information about MHC organization in squamates inhibits investigation into the relationship between MHC polymorphism and disease, and leaves a large taxonomic gap in our understanding of amniote MHC evolution. Here, we use both cDNA and genomic sequence data to characterize a class I MHC gene (Amcr-UA) from the Galápagos marine iguana, a member of the squamate subfamily Iguaninae. Amcr-UA appears to be functional since it is expressed in the blood and contains many of the conserved peptide-binding residues that are found in classical class I genes of other vertebrates. In addition, comparison of Amcr-UA to homologous sequences from other iguanine species shows that the antigen-binding portion of this gene is under purifying selection, rather than balancing selection, and therefore may have a conserved function. A striking feature of Amcr-UA is that both the cDNA and genomic sequences lack the transmembrane and cytoplasmic domains that are necessary to anchor the class I receptor molecule into the cell membrane, suggesting that the product of this gene is secreted and consequently not involved in classical class I antigen-presentation. The truncated and conserved character of Amcr-UA leads us to define it as a nonclassical gene that is related to the few available squamate class I sequences. However, phylogenetic analysis placed Amcr-UA in a basal position relative to other published classical MHC genes from squamates, suggesting that this gene diverged near the beginning of squamate diversification. PMID:18682845
2013-01-01
Background Adenosine-to-inosine (A-to-I) RNA editing is recognized as a cellular mechanism for generating both RNA and protein diversity. Inosine base pairs with cytidine during reverse transcription and therefore appears as guanosine during sequencing of cDNA. Current approaches of RNA editing identification largely depend on the comparison between transcriptomes and genomic DNA (gDNA) sequencing datasets from the same individuals, and it has been challenging to identify editing candidates from transcriptomes in the absence of gDNA information. Results We have developed a new strategy to accurately predict constitutive RNA editing sites from publicly available human RNA-seq datasets in the absence of relevant genomic sequences. Our approach establishes new parameters to increase the ability to map mismatches and to minimize sequencing/mapping errors and unreported genome variations. We identified 695 novel constitutive A-to-I editing sites that appear in clusters (named “editing boxes”) in multiple samples and which exhibit spatial and dynamic regulation across human tissues. Some of these editing boxes are enriched in non-repetitive regions lacking inverted repeat structures and contain an extremely high conversion frequency of As to Is. We validated a number of editing boxes in multiple human cell lines and confirmed that ADAR1 is responsible for the observed promiscuous editing events in non-repetitive regions, further expanding our knowledge of the catalytic substrate of A-to-I RNA editing by ADAR enzymes. Conclusions The approach we present here provides a novel way of identifying A-to-I RNA editing events by analyzing only RNA-seq datasets. This method has allowed us to gain new insights into RNA editing and should also aid in the identification of more constitutive A-to-I editing sites from additional transcriptomes. PMID:23537002
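Two ideas in this abstract lend themselves to a small illustration: inosine reads as guanosine, so candidate sites appear as A-to-G mismatches against the reference (T-to-C for transcripts on the minus strand), and candidate sites are grouped into clusters. The sketch below is a simplified stand-in, not the authors' method; the function names and the `max_gap` and `min_sites` thresholds are hypothetical.

```python
def is_editing_candidate(ref_base, read_base, strand="+"):
    """True if a mismatch has the signature expected of A-to-I editing."""
    expected = ("A", "G") if strand == "+" else ("T", "C")
    return (ref_base.upper(), read_base.upper()) == expected

def cluster_sites(positions, max_gap=50, min_sites=3):
    """Group candidate positions on one chromosome into clusters.

    Consecutive sites no more than `max_gap` bases apart share a cluster;
    clusters with fewer than `min_sites` sites are dropped.
    Returns (start, end, n_sites) tuples. Thresholds are illustrative.
    """
    clusters, current = [], []
    for pos in sorted(positions):
        if current and pos - current[-1] > max_gap:
            if len(current) >= min_sites:
                clusters.append((current[0], current[-1], len(current)))
            current = []
        current.append(pos)
    if len(current) >= min_sites:
        clusters.append((current[0], current[-1], len(current)))
    return clusters

# Hypothetical candidate positions on a single chromosome.
sites = [100, 120, 130, 400, 1000, 1010, 1025, 1030]
print(cluster_sites(sites))
```

A real analysis would also need to filter out known genomic variants and mapping artifacts and to require that sites recur across samples, as the abstract describes; none of that is modeled here.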
Lee, Je Hyuk; Daugharthy, Evan R.; Scheiman, Jonathan; Kalhor, Reza; Ferrante, Thomas C.; Terry, Richard; Turczyk, Brian M.; Yang, Joyce L.; Lee, Ho Suk; Aach, John; Zhang, Kun; Church, George M.
2014-01-01
RNA sequencing measures the quantitative change in gene expression over the whole transcriptome, but it lacks spatial context. On the other hand, in situ hybridization provides the location of gene expression, but only for a small number of genes. Here we detail a protocol for genome-wide profiling of gene expression in situ in fixed cells and tissues, in which RNA is converted into cross-linked cDNA amplicons and sequenced manually on a confocal microscope. Unlike traditional RNA-seq our method enriches for context-specific transcripts over house-keeping and/or structural RNA, and it preserves the tissue architecture for RNA localization studies. Our protocol is written for researchers experienced in cell microscopy with minimal computing skills. Library construction and sequencing can be completed within 14 d, with image analysis requiring an additional 2 d. PMID:25675209
2009-09-01
binding ETS domain) and five type II (without ETS domain). Fusion-positive type I– and type II–containing phages were amplified with T3 and T7 primers...will be performed to identify the authentic 3’ UTRs from the mRNA pool from CaP patient specimens. Using phage excision strategy, we will use to... phage DNA sequences plasmids (cDNA) clones were generated by using phage excision strategy. Figure 1. ERG splice variants in prostate cancer
Evidence for two transferrin loci in the Salmo trutta genome.
Rozman, T; Dovc, P; Marić, S; Kokalj-Vokac, N; Erjavec-Skerget, A; Rab, P; Snoj, A
2008-12-01
To determine the organization of transferrin (TF) locus in the Salmo trutta genome, partial DNA and cDNA sequencing, fluorescent in situ hybridization (FISH) and Salmo salar BAC analysis were performed. TF expression levels and copy number prediction were assessed using real-time PCR. In addition to two previously reported DNA TF variant sequences of S. trutta and Salmo marmoratus (TF1), two novel variant sequences (TF2) were revealed in both species. Variant-specific sequence tags, characterizing two variants for each TF type (TF1 and TF2), were identified in genomic clones from each of the F1 hybrids between S. trutta and S. marmoratus. These clearly documented double heterozygote status at the TF loci. The real-time PCR data showed that each of the two TF types (TF1 and TF2) existed in one copy only and that the transcription of TF2 was considerably lower compared with TF1. Using FISH, hybridization signals were observed on two medium-sized acrocentric chromosomes of S. trutta karyotype. A TF type-specific PCR followed by a restriction analysis revealed the presence of two TF loci in the majority of analysed BAC clones. It was concluded that the TF gene is duplicated in the genome of S. trutta, and that the two TF loci are located adjacent to one another on the same chromosome. The differing transcription levels of TF1 and TF2 appear to depend on the corresponding promoter activity, which at least for TF2 seems to vary between different Salmo congeners.
Miao, Hong-Xia; Qin, Yong-Hua; Ye, Zi-Xing; Hu, Gui-Bing
2013-01-25
Ubiquitin-activating enzyme E1 (UBE1) catalyzes the first step in the ubiquitination reaction, which targets a protein for degradation via a proteasome pathway. UBE1 plays an important role in metabolic processes. In this study, full-length cDNA and DNA sequences of the UBE1 gene, designated CrUBE1, were obtained from 'Wuzishatangju' (self-incompatible, SI) and 'Shatangju' (self-compatible, SC) mandarins. Between 'Wuzishatangju' and 'Shatangju', the cDNA and DNA sequences of CrUBE1 differed by 5 amino acids and 8 bases, respectively. Southern blot analysis showed that only one copy of the CrUBE1 gene exists in the genomes of 'Wuzishatangju' and 'Shatangju'. The temporal and spatial expression characteristics of the CrUBE1 gene were investigated using semi-quantitative RT-PCR (SqPCR) and quantitative real-time PCR (qPCR). The expression level of the CrUBE1 gene in anthers of 'Shatangju' was approximately 10-fold higher than in anthers of 'Wuzishatangju'. The highest expression level of CrUBE1 was detected in pistils at 7 days after self-pollination of 'Wuzishatangju', which was approximately 5-fold higher than at 0 h. To obtain CrUBE1 protein, the full-length cDNAs of the CrUBE1 genes from 'Wuzishatangju' and 'Shatangju' were successfully expressed in Pichia pastoris. Pollen germination frequency of 'Wuzishatangju' was significantly inhibited by increasing concentrations of CrUBE1 protein from 'Wuzishatangju'.
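Fold differences such as the roughly 10-fold and 5-fold values above are often derived from qPCR threshold cycles with the 2^-ΔΔCt calculation. The abstract does not state which quantification method was used, so the following is only a sketch of that common approach; the function name and the Ct values in the usage line are made up for illustration.

```python
def fold_change(ct_target_sample, ct_ref_sample,
                ct_target_control, ct_ref_control):
    """Relative expression by the 2^-ddCt calculation.

    Each Ct of the target gene is first normalised to a reference gene
    measured in the same sample, then the sample is compared with the control.
    Assumes roughly 100% amplification efficiency for both genes.
    """
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    return 2.0 ** -(delta_sample - delta_control)


# Hypothetical Ct values: the target crosses threshold 3 cycles earlier
# in the sample than in the control, with identical reference Ct values.
example = fold_change(20.0, 15.0, 23.0, 15.0)
```

Under this calculation a 10-fold difference corresponds to a ΔΔCt of about 3.3 cycles, since log2(10) ≈ 3.32.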
Construction, database integration, and application of an Oenothera EST library.
Mrácek, Jaroslav; Greiner, Stephan; Cho, Won Kyong; Rauwolf, Uwe; Braun, Martha; Umate, Pavan; Altstätter, Johannes; Stoppel, Rhea; Mlcochová, Lada; Silber, Martina V; Volz, Stefanie M; White, Sarah; Selmeier, Renate; Rudd, Stephen; Herrmann, Reinhold G; Meurer, Jörg
2006-09-01
Coevolution of cellular genetic compartments is a fundamental aspect in eukaryotic genome evolution that becomes apparent in serious developmental disturbances after interspecific organelle exchanges. The genus Oenothera represents a unique, at present the only available, resource to study the role of the compartmentalized plant genome in diversification of populations and speciation processes. An integrated approach involving cDNA cloning, EST sequencing, and bioinformatic data mining was chosen using Oenothera elata with the genetic constitution nuclear genome AA with plastome type I. The Gene Ontology system grouped 1621 unique gene products into 17 different functional categories. Application of arrays generated from a selected fraction of ESTs revealed significantly differing expression profiles among closely related Oenothera species possessing the potential to generate fertile and incompatible plastid/nuclear hybrids (hybrid bleaching). Furthermore, the EST library provides a valuable source of PCR-based polymorphic molecular markers that are instrumental for genotyping and molecular mapping approaches.
Harper, J R; Prince, J T; Healy, P A; Stuart, J K; Nauman, S J; Stallcup, W B
1991-03-01
We have isolated cDNA clones coding for the human homologue of the neuronal cell adhesion molecule L1. The nucleotide sequence of the cDNA clones and the deduced primary amino acid sequence of the carboxy terminal portion of the human L1 are homologous to the corresponding sequences of mouse L1 and rat NILE glycoprotein, with an especially high sequence identity in the cytoplasmic regions of the proteins. There is also protein sequence homology with the cytoplasmic region of the Drosophila cell adhesion molecule, neuroglian. The conservation of the cytoplasmic domain argues for an important functional role for this portion of the molecule.
Ziegenhagen, Birgit; Liepelt, Sascha
2015-01-01
Increasing drought periods as a result of global climate change pose a threat to many tree species by possibly outpacing their adaptive capabilities. Revealing the genetic basis of drought stress response is therefore instrumental for future conservation strategies and risk assessment. Access to informative genomic regions is however challenging, especially for conifers, partially due to their large genomes, which puts constraints on the feasibility of whole genome scans. Candidate genes offer a valuable tool to reduce the complexity of the analysis and the amount of sequencing work and costs. For this study we combined an improved drought stress phenotyping of needles via a novel terahertz water monitoring technique with Massive Analysis of cDNA Ends to identify candidate genes for drought stress response in European silver fir (Abies alba Mill.). A pooled cDNA library was constructed from the cotyledons of six drought stressed and six well-watered silver fir seedlings, respectively. Differential expression analyses of these libraries revealed 296 candidate genes for drought stress response in silver fir (247 up- and 49 down-regulated) of which a subset was validated by RT-qPCR of the twelve individual cotyledons. A majority of these genes code for currently uncharacterized proteins and hint at new genomic resources to be explored in conifers. Furthermore, we could show that some traditional reference genes from model plant species (GAPDH and eIF4A2) are not suitable for differential analysis and we propose a new reference gene, TPC1, for drought stress expression profiling in needles of conifer seedlings. PMID:25924061
Lefèvre, Christophe M; Sharp, Julie A; Nicholas, Kevin R
2009-01-01
Using a milk-cell cDNA sequencing approach we characterised milk-protein sequences from two monotreme species, platypus (Ornithorhynchus anatinus) and echidna (Tachyglossus aculeatus) and found a full set of caseins and casein variants. The genomic organisation of the platypus casein locus is compared with other mammalian genomes, including the marsupial opossum and several eutherians. Physical linkage of casein genes has been seen in the casein loci of all mammalian genomes examined and we confirm that this is also observed in platypus. However, we show that a recent duplication of beta-casein occurred in the monotreme lineage, as opposed to more ancient duplications of alpha-casein in the eutherian lineage, while marsupials possess only single copies of alpha- and beta-caseins. Despite this variability, the close proximity of the main alpha- and beta-casein genes in an inverted tail-tail orientation and the relative orientation of the more distant kappa-casein genes are similar in all mammalian genome sequences so far available. Overall, the conservation of the genomic organisation of the caseins indicates the early, pre-monotreme development of the fundamental role of caseins during lactation. In contrast, the lineage-specific gene duplications that have occurred within the casein locus of monotremes and eutherians but not marsupials, which may have lost part of the ancestral casein locus, emphasise the independent selection on milk provision strategies to the young, most likely linked to different developmental strategies. The monotremes therefore provide insight into the ancestral drivers for lactation and how these have adapted in different lineages.
Urade, Y; Oberdick, J; Molinar-Rode, R; Morgan, J I
1991-01-01
The cerebellum contains a hexadecapeptide, termed cerebellin, that is conserved in sequence from human to chicken. Three independent, overlapping cDNA clones have been isolated from a human cerebellum cDNA library that encode the cerebellin sequence. The longest clone codes for a protein of 193 amino acids that we term precerebellin. This protein has a significant similarity (31.3% identity, 52.2% similarity) to the globular (non-collagen-like) region of the B chain of human complement component C1q. The region of relatedness extends over approximately 145 amino acids located in the carboxyl terminus of both proteins. Unlike C1q B chain, no collagen-like motifs are present in the amino-terminal regions of precerebellin. The amino terminus of precerebellin contains three possible N-linked glycosylation sites. Although hydrophobic amino acids are clustered at the amino terminus, they do not conform to the classical signal-peptide motif, and no other obvious membrane-spanning domains are predicted from the cDNA sequence. The cDNA predicts that the cerebellin peptide is flanked by Val-Arg and Glu-Pro residues. Therefore, cerebellin is not liberated from precerebellin by the classical dibasic amino acid proteolytic-cleavage mechanism seen in many neuropeptide precursors. In Northern (RNA) blots, precerebellin transcripts, with four distinct sizes (1.8, 2.3, 2.7, and 3.0 kilobases), are abundant in cerebellum. These transcripts are present at either very low or undetectable levels in other brain areas and extraneural structures. A similar pattern of cerebellin precursor transcripts is seen in rat, mouse, and human cerebellum. Furthermore, a partial genomic fragment from mouse shows the same bands in Northern blots as the human cDNA clone. During rat development, precerebellin transcripts mirror the level of cerebellin peptide. Low levels of precerebellin mRNA are seen at birth.
Levels increase modestly from postpartum day 1 to 8, then increase more dramatically between day 5 and 15, and eventually reach peak values between day 21 and 56. Because cerebellin-like immunoreactivity is associated with Purkinje cell postsynaptic structures, these data raise interesting possibilities concerning the function of the cerebellin precursor in synaptic physiology. PMID:1704129
Guo, Deyin; Spetz, Carl; Saarma, Mart; Valkonen, Jari P T
2003-05-01
Potyviral helper-component proteinase (HCpro) is a multifunctional protein exerting its cellular functions in interaction with putative host proteins. In this study, cellular protein partners of the HCpro encoded by Potato virus A (PVA) (genus Potyvirus) were screened in a potato leaf cDNA library using a yeast two-hybrid system. Two cellular proteins were obtained that interact specifically with PVA HCpro in yeast and in the two in vitro binding assays used. Both proteins are encoded by single-copy genes in the potato genome. Analysis of the deduced amino acid sequences revealed that one (HIP1) of the two HCpro interactors is a novel RING finger protein. The sequence of the other protein (HIP2) showed no resemblance to the protein sequences available from databanks and has known biological functions.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Yoshida, Michihiro C.; Wada, Makio; Satoh, Hitoshi
1988-07-01
The human HST1 gene, previously designated the hst gene, and now assigned the name HSTF1 for heparin-binding secretory transforming factor in human gene nomenclature, was originally identified as a transforming gene in DNAs from human stomach cancers by transfection assay with mouse NIH 3T3 cells. The amino acid sequence of the product deduced from DNA sequences of the HST1 cDNA and genomic clones had approximately 40% homology to human basic and acidic fibroblast growth factors and mouse Int-2-encoded protein. The authors have mapped the human HST1 gene to chromosome 11 at band q13.3 by Southern blot hybridization analysis of a panel of human and mouse somatic cell hybrids and in situ hybridization with an HST1 cDNA probe. The HST1 gene was found to be amplified in DNAs obtained from a stomach cancer and a vulvar carcinoma cell line, A431. In all of these samples of DNA, the INT2 gene, previously mapped to human chromosome 11q13, was also amplified to the same degree as the HST1 gene.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Woon, J. S. K., E-mail: jameswoon@siswa.ukm.edu.my; Murad, A. M. A., E-mail: munir@ukm.edu.my; Abu Bakar, F. D., E-mail: fabyff@ukm.edu.my
A cellobiohydrolase B (CbhB) from Aspergillus niger ATCC 10574 was cloned and expressed in E. coli. CbhB has an open reading frame of 1611 bp encoding a putative polypeptide of 536 amino acids. Analysis of the encoded polypeptide predicted a molecular mass of 56.2 kDa, a cellulose binding module (CBM) and a catalytic module. In order to obtain the mRNA of cbhB, total RNA was extracted from A. niger cells induced by 1% Avicel. First strand cDNA was synthesized from total RNA via reverse transcription. The full length cDNA of cbhB was amplified by PCR and cloned into the cloning vector, pGEM-T Easy. A comparison between genomic DNA and cDNA sequences of cbhB revealed that the gene is intronless. Upon the removal of the signal peptide, the cDNA of cbhB was cloned into the expression vector pET-32b. However, the recombinant CbhB was expressed in Escherichia coli Origami DE3 as an insoluble protein. A homology model of CbhB predicted the presence of nine disulfide bonds in the protein structure which may have contributed to the improper folding of the protein and thus, resulting in inclusion bodies in E. coli.
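The reported ORF length and polypeptide length can be cross-checked with simple arithmetic. The sketch below assumes that the 1611 bp figure includes the stop codon, which the abstract does not state explicitly; the function names are hypothetical.

```python
def orf_to_protein_length(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of `orf_bp` nucleotides."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # The stop codon occupies one codon but adds no residue.
    return codons - 1 if includes_stop else codons


def mean_residue_mass_da(mass_kda, n_residues):
    """Average mass per residue in daltons."""
    return mass_kda * 1000.0 / n_residues


residues = orf_to_protein_length(1611)          # 537 codons minus the stop
avg_mass = mean_residue_mass_da(56.2, residues)
```

With these inputs the ORF yields 536 residues, matching the abstract, and the average residue mass comes out near 105 Da, which is in the usual range for proteins.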
Integrative Annotation of 21,037 Human Genes Validated by Full-Length cDNA Clones
Imanishi, Tadashi; Itoh, Takeshi; Suzuki, Yutaka; O'Donovan, Claire; Fukuchi, Satoshi; Koyanagi, Kanako O; Barrero, Roberto A; Tamura, Takuro; Yamaguchi-Kabata, Yumi; Tanino, Motohiko; Yura, Kei; Miyazaki, Satoru; Ikeo, Kazuho; Homma, Keiichi; Kasprzyk, Arek; Nishikawa, Tetsuo; Hirakawa, Mika; Thierry-Mieg, Jean; Thierry-Mieg, Danielle; Ashurst, Jennifer; Jia, Libin; Nakao, Mitsuteru; Thomas, Michael A; Mulder, Nicola; Karavidopoulou, Youla; Jin, Lihua; Kim, Sangsoo; Yasuda, Tomohiro; Lenhard, Boris; Eveno, Eric; Suzuki, Yoshiyuki; Yamasaki, Chisato; Takeda, Jun-ichi; Gough, Craig; Hilton, Phillip; Fujii, Yasuyuki; Sakai, Hiroaki; Tanaka, Susumu; Amid, Clara; Bellgard, Matthew; Bonaldo, Maria de Fatima; Bono, Hidemasa; Bromberg, Susan K; Brookes, Anthony J; Bruford, Elspeth; Carninci, Piero; Chelala, Claude; Couillault, Christine; de Souza, Sandro J.; Debily, Marie-Anne; Devignes, Marie-Dominique; Dubchak, Inna; Endo, Toshinori; Estreicher, Anne; Eyras, Eduardo; Fukami-Kobayashi, Kaoru; R. 
Gopinath, Gopal; Graudens, Esther; Hahn, Yoonsoo; Han, Michael; Han, Ze-Guang; Hanada, Kousuke; Hanaoka, Hideki; Harada, Erimi; Hashimoto, Katsuyuki; Hinz, Ursula; Hirai, Momoki; Hishiki, Teruyoshi; Hopkinson, Ian; Imbeaud, Sandrine; Inoko, Hidetoshi; Kanapin, Alexander; Kaneko, Yayoi; Kasukawa, Takeya; Kelso, Janet; Kersey, Paul; Kikuno, Reiko; Kimura, Kouichi; Korn, Bernhard; Kuryshev, Vladimir; Makalowska, Izabela; Makino, Takashi; Mano, Shuhei; Mariage-Samson, Regine; Mashima, Jun; Matsuda, Hideo; Mewes, Hans-Werner; Minoshima, Shinsei; Nagai, Keiichi; Nagasaki, Hideki; Nagata, Naoki; Nigam, Rajni; Ogasawara, Osamu; Ohara, Osamu; Ohtsubo, Masafumi; Okada, Norihiro; Okido, Toshihisa; Oota, Satoshi; Ota, Motonori; Ota, Toshio; Otsuki, Tetsuji; Piatier-Tonneau, Dominique; Poustka, Annemarie; Ren, Shuang-Xi; Saitou, Naruya; Sakai, Katsunaga; Sakamoto, Shigetaka; Sakate, Ryuichi; Schupp, Ingo; Servant, Florence; Sherry, Stephen; Shiba, Rie; Shimizu, Nobuyoshi; Shimoyama, Mary; Simpson, Andrew J; Soares, Bento; Steward, Charles; Suwa, Makiko; Suzuki, Mami; Takahashi, Aiko; Tamiya, Gen; Tanaka, Hiroshi; Taylor, Todd; Terwilliger, Joseph D; Unneberg, Per; Veeramachaneni, Vamsi; Watanabe, Shinya; Wilming, Laurens; Yasuda, Norikazu; Yoo, Hyang-Sook; Stodolsky, Marvin; Makalowski, Wojciech; Go, Mitiko; Nakai, Kenta; Takagi, Toshihisa; Kanehisa, Minoru; Sakaki, Yoshiyuki; Quackenbush, John; Okazaki, Yasushi; Hayashizaki, Yoshihide; Hide, Winston; Chakraborty, Ranajit; Nishikawa, Ken; Sugawara, Hideaki; Tateno, Yoshio; Chen, Zhu; Oishi, Michio; Tonellato, Peter; Apweiler, Rolf; Okubo, Kousaku; Wagner, Lukas; Wiemann, Stefan; Strausberg, Robert L; Isogai, Takao; Auffray, Charles; Nomura, Nobuo; Sugano, Sumio
2004-01-01
The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this area is still limited. Several lines of investigation have been used to elucidate the structure and function of the genes in the human genome. Even so, gene prediction remains a difficult task, as the varieties of transcripts of a gene may vary to a great extent. We thus performed an exhaustive integrative characterization of 41,118 full-length cDNAs that capture the gene transcripts as complete functional cassettes, providing an unequivocal report of structural and functional diversity at the gene level. Our international collaboration has validated 21,037 human gene candidates by analysis of high-quality full-length cDNA clones through curation using unified criteria. This led to the identification of 5,155 new gene candidates. It also manifested the most reliable way to control the quality of the cDNA clones. We have developed a human gene database, called the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/). It provides the following: integrative annotation of human genes, description of gene structures, details of novel alternative splicing isoforms, non-protein-coding RNAs, functional domains, subcellular localizations, metabolic pathways, predictions of protein three-dimensional structure, mapping of known single nucleotide polymorphisms (SNPs), identification of polymorphic microsatellite repeats within human genes, and comparative results with mouse full-length cDNAs. The H-InvDB analysis has shown that up to 4% of the human genome sequence (National Center for Biotechnology Information build 34 assembly) may contain misassembled or missing regions. We found that 6.5% of the human gene candidates (1,377 loci) did not have a good protein-coding open reading frame, of which 296 loci are strong candidates for non-protein-coding RNA genes. 
In addition, among 72,027 uniquely mapped SNPs and insertions/deletions localized within human genes, 13,215 nonsynonymous SNPs, 315 nonsense SNPs, and 452 indels occurred in coding regions. Together with 25 polymorphic microsatellite repeats present in coding regions, they may alter protein structure, causing phenotypic effects or resulting in disease. The H-InvDB platform represents a substantial contribution to resources needed for the exploration of human biology and pathology. PMID:15103394
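The percentages quoted in this abstract can be checked directly against the counts it reports. A minimal sketch, using only numbers taken from the abstract; the helper name `percent` is hypothetical.

```python
def percent(part, whole):
    """Return `part` as a percentage of `whole`."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Loci without a good protein-coding ORF among all validated gene candidates.
no_orf_share = percent(1377, 21037)

# Coding-region variants (nonsynonymous + nonsense + indels) among mapped variants.
coding_variants = 13215 + 315 + 452
coding_share = percent(coding_variants, 72027)
```

The first value rounds to the 6.5% given in the abstract; the second, about 19%, is not stated in the abstract and follows only from summing the three reported categories.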
Yasuno, Rie; Wada, Hajime
1998-01-01
Lipoic acid is a coenzyme that is essential for the activity of enzyme complexes such as those of pyruvate dehydrogenase and glycine decarboxylase. We report here the isolation and characterization of LIP1 cDNA for lipoic acid synthase of Arabidopsis. The Arabidopsis LIP1 cDNA was isolated using an expressed sequence tag homologous to the lipoic acid synthase of Escherichia coli. This cDNA was shown to code for Arabidopsis lipoic acid synthase by its ability to complement a lipA mutant of E. coli defective in lipoic acid synthase. DNA-sequence analysis of the LIP1 cDNA revealed an open reading frame predicting a protein of 374 amino acids. Comparisons of the deduced amino acid sequence with those of E. coli and yeast lipoic acid synthase homologs showed a high degree of sequence similarity and the presence of a leader sequence presumably required for import into the mitochondria. Southern-hybridization analysis suggested that LIP1 is a single-copy gene in Arabidopsis. Western analysis with an antibody against lipoic acid synthase demonstrated that this enzyme is located in the mitochondrial compartment in Arabidopsis cells as a 43-kD polypeptide. PMID:9808738
Helm, Jared R.; Hertz-Fowler, Christiane; Aslett, Martin; Berriman, Matthew; Sanders, Mandy; Quail, Michael A.; Soares, Marcelo B.; Bonaldo, Maria F.; Sakurai, Tatsuya; Inoue, Noboru; Donelson, John E.
2009-01-01
Trypanosoma congolense is one of the most economically important pathogens of livestock in Africa. Culture-derived parasites of each of the three main insect stages of the T. congolense life cycle, i.e., the procyclic, epimastigote and metacyclic stages, and bloodstream stage parasites isolated from infected mice, were used to construct stage-specific cDNA libraries and expressed sequence tags (ESTs or cDNA clones) in each library were sequenced. Thirteen EST clusters encoding different variant surface glycoproteins (VSGs) were detected in the metacyclic library and twenty-six VSG EST clusters were found in the bloodstream library, six of which are shared by the metacyclic library. Rare VSG ESTs are present in the epimastigote library, and none were detected in the procyclic library. ESTs encoding enzymes that catalyze oxidative phosphorylation and amino acid metabolism are about twice as abundant in the procyclic and epimastigote stages as in the metacyclic and bloodstream stages. In contrast, ESTs encoding enzymes involved in glycolysis, the citric acid cycle and nucleotide metabolism are about the same in all four developmental stages. Cysteine proteases, kinases and phosphatases are the most abundant enzyme groups represented by the ESTs. All four libraries contain T. congolense-specific expressed sequences not present in the T. brucei and T. cruzi genomes. Normalized cDNA libraries were constructed from the metacyclic and bloodstream stages, and found to be further enriched for T. congolense-specific ESTs. Given that cultured T. congolense offers an experimental advantage over other African trypanosome species, these ESTs provide a basis for further investigation of the molecular properties of these four developmental stages, especially the epimastigote and metacyclic stages for which it is difficult to obtain large quantities of organisms. The T. congolense EST databases are available at: http://www.sanger.ac.uk/Projects/T_congolense/EST_index.shtml. 
PMID:19559733
Kaur, Sukhjiwan; Cogan, Noel O I; Pembleton, Luke W; Shinozuka, Maiko; Savin, Keith W; Materne, Michael; Forster, John W
2011-05-25
Lentil (Lens culinaris Medik.) is a cool-season grain legume which provides a rich source of protein for human consumption. In terms of genomic resources, lentil is relatively underdeveloped, in comparison to other Fabaceae species, with limited available data. There is hence a significant need to enhance such resources in order to identify novel genes and alleles for molecular breeding to increase crop productivity and quality. Tissue-specific cDNA samples from six distinct lentil genotypes were sequenced using Roche 454 GS-FLX Titanium technology, generating c. 1.38 × 10⁶ expressed sequence tags (ESTs). De novo assembly generated a total of 15,354 contigs and 68,715 singletons. The complete unigene set was sequence-analysed against genome drafts of the model legume species Medicago truncatula and Arabidopsis thaliana to identify 12,639, and 7,476 unique matches, respectively. When compared to the genome of Glycine max, a total of 20,419 unique hits were observed corresponding to c. 31% of the known gene space. A total of 25,592 lentil unigenes were subsequently annotated from GenBank. Simple sequence repeat (SSR)-containing ESTs were identified from consensus sequences and a total of 2,393 primer pairs were designed. A subset of 192 EST-SSR markers was screened for validation across a panel of 12 cultivated lentil genotypes and one wild relative species. A total of 166 primer pairs obtained successful amplification, of which 47.5% detected genetic polymorphism. A substantial collection of ESTs has been developed from sequence analysis of lentil genotypes using second-generation technology, permitting unigene definition across a broad range of functional categories. As well as providing resources for functional genomics studies, the unigene set has permitted significant enhancement of the number of publicly-available molecular genetic markers as tools for improvement of this species.
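Identifying SSR-containing ESTs amounts to scanning each consensus sequence for short motifs repeated in tandem. The abstract does not name the tool or thresholds used, so the following is only a sketch of one simple way to do it with a regular expression; `find_ssrs` and its default thresholds (motifs of 2 to 6 bases, at least 5 copies) are assumptions for illustration.

```python
import re


def find_ssrs(sequence, min_unit=2, max_unit=6, min_repeats=5):
    """Return (start, unit, copies) for perfect tandem repeats in `sequence`.

    `start` is 0-based. The lazy quantifier prefers the shortest repeat unit,
    so ACACACACAC is reported as (AC)5 rather than a longer compound unit.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(sequence.upper()):
        unit = match.group(1)
        if len(set(unit)) == 1:
            # Skip homopolymer runs such as AAAAAAAAAA reported as (AA)n.
            continue
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


example_hits = find_ssrs("TTC" + "AG" * 6 + "CCT")
```

Primer design around each hit would be a separate step and is not sketched here.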
Nakajima, K; Hashimoto, T; Yamada, Y
1993-01-01
In the biosynthetic pathway of tropane alkaloids, tropinone reductase (EC 1.1.1.236) (TR)-I and TR-II, respectively, reduce a common substrate, tropinone, stereospecifically to the stereoisomeric alkamines tropine and pseudotropine (psi-tropine). cDNA clones coding for TR-I and TR-II, as well as a structurally related cDNA clone with an unknown function, were isolated from the solanaceous plant Datura stramonium. The cDNA clones for TR-I and TR-II encode polypeptides containing 273 and 260 amino acids, respectively, and when these clones were expressed in Escherichia coli, the recombinant TRs showed the same strict stereospecificity as that observed for the native TRs that had been isolated from plants. The deduced amino acid sequences of the two clones showed an overall identity of 64% in 260-amino acid residues and also shared significant similarities with enzymes in the short-chain, nonmetal dehydrogenase family. Genomic DNA-blot analysis detected the TR-encoding genes in three tropane alkaloid-producing solanaceous species but did not detect them in tobacco. We discuss how the two TRs may have evolved to catalyze the opposite stereospecific reductions. PMID:8415746
Liszewska, Frantz; Gaganidze, Dali; Sirko, Agnieszka
2005-01-01
We applied the yeast two-hybrid system for screening of a cDNA library of Nicotiana plumbaginifolia for clones encoding plant proteins interacting with two proteins of Escherichia coli: serine acetyltransferase (SAT, the product of cysE gene) and O-acetylserine (thiol)lyase A, also termed cysteine synthase (OASTL-A, the product of cysK gene). Two plant cDNA clones were identified when using the cysE gene as a bait. These clones encode a probable cytosolic isoform of OASTL and an organellar isoform of SAT, respectively, as indicated by evolutionary trees. The second clone, encoding SAT, was identified independently also as a "prey" when using cysK as a bait. Our results reveal the possibility of applying the two-hybrid system for cloning of plant cDNAs encoding enzymes of the cysteine synthase complex. Additionally, using genome walking, sequences located upstream of the sat1 cDNA were identified. Subsequently, in silico analyses were performed aiming towards identification of the potential signal peptide and possible location of the deduced mature protein encoded by sat1.
Chitin synthase in the filarial parasite, Brugia malayi.
Harris, M T; Lai, K; Arnold, K; Martinez, H F; Specht, C A; Fuhrman, J A
2000-12-01
Fragments of putative chitin synthase (chs) genes from two filarial species (Brugia malayi and Dirofilaria immitis) were amplified by PCR using degenerate primers. The full genomic and cDNA sequences were obtained for the B. malayi chs gene (Bm-chs-1); the predicted amino acid sequence is highly similar, over a large region, to two CHS sequences of the nematode Caenorhabditis elegans and also to two insect CHS sequences. Bm-chs-1 is abundantly transcribed in B. malayi adult females, independent of their fertilization status, but is also expressed in males and microfilariae. Oocytes and early embryos contain large amounts of Bm-chs-1 transcript by in situ hybridization, but later stage embryos within the maternal uterus show little or no Bm-chs-1 transcript. No specific hybridization could be demonstrated in maternal somatic tissues. Polyclonal antibodies were raised against a peptide expressed from a recombinant cDNA fragment of Bm-chs-1; immunostaining detected CHS protein in oocytes and early to midstage embryos. These studies characterize a gene that is likely to be essential to oogenesis and embryonic development in a parasitic nematode. Because chitin synthesis and eggshell formation begin after fertilization, the presence of CHS protein in early oocytes suggests that the enzyme must be activated as a result of fertilization. These studies also demonstrate that chitin synthesis may not be restricted to eggshell formation in nematodes, as the Bm-chs-1 gene is transcribed in life cycle stages other than adult females.
Characterization of a Novel Polerovirus Infecting Maize in China
Chen, Sha; Jiang, Guangzhuang; Wu, Jianxiang; Liu, Yong; Qian, Yajuan; Zhou, Xueping
2016-01-01
A novel virus, tentatively named Maize Yellow Mosaic Virus (MaYMV), was identified from the field-grown maize plants showing yellow mosaic symptoms on the leaves collected from the Yunnan Province of China by the deep sequencing of small RNAs. The complete 5642 nucleotide (nt)-long genome of the MaYMV shared the highest nucleotide sequence identity (73%) to Maize Yellow Dwarf Virus-RMV. Sequence comparisons and phylogenetic analyses suggested that MaYMV represents a new member of the genus Polerovirus in the family Luteoviridae. Furthermore, the P0 protein encoded by MaYMV was demonstrated to inhibit both local and systemic RNA silencing by co-infiltration assays using transgenic Nicotiana benthamiana line 16c carrying the GFP reporter gene, which further supported the identification of a new polerovirus. The biologically-active cDNA clone of MaYMV was generated by inserting the full-length cDNA of MaYMV into the binary vector pCB301. RT-PCR and Northern blot analyses showed that this clone was systemically infectious upon agro-inoculation into N. benthamiana. Subsequently, 13 different isolates of MaYMV from field-grown maize plants in different geographical locations of Yunnan and Guizhou provinces of China were sequenced. Analyses of their molecular variation indicate that the 3′ half of P3–P5 read-through protein coding region was the most variable, whereas the coat protein- (CP-) and movement protein- (MP-)coding regions were the most conserved. PMID:27136578
Alkio, Merianne; Jonas, Uwe; Declercq, Myriam; Van Nocker, Steven; Knoche, Moritz
2014-01-01
The exocarp, or skin, of fleshy fruit is a specialized tissue that protects the fruit, attracts seed dispersing fruit eaters, and has considerable economic relevance for fruit quality. Development of the exocarp involves regulated activities of many genes. This research analyzed global gene expression in the exocarp of developing sweet cherry (Prunus avium L., ‘Regina’), a fruit crop species with few public genomic resources. A catalog of transcript models (contigs) representing expressed genes was constructed from de novo assembled short complementary DNA (cDNA) sequences generated from developing fruit between flowering and maturity at 14 time points. Expression levels in each sample were estimated for 34 695 contigs from numbers of reads mapping to each contig. Contigs were annotated functionally based on BLAST, gene ontology and InterProScan analyses. Coregulated genes were detected using partitional clustering of expression patterns. The results are discussed with emphasis on genes putatively involved in cuticle deposition, cell wall metabolism and sugar transport. The high temporal resolution of the expression patterns presented here reveals finely tuned developmental specialization of individual members of gene families. Moreover, the de novo assembled sweet cherry fruit transcriptome with 7760 full-length protein coding sequences and over 20 000 other, annotated cDNA sequences together with their developmental expression patterns is expected to accelerate molecular research on this important tree fruit crop. PMID:26504533
Kim, Hong-Il; Kwon, O-Chul; Kong, Won-Sik; Lee, Chang-Soo
2014-01-01
The aim of this study was to identify and characterize new Flammulina velutipes laccases from its whole-genome sequence. Of the 15 putative laccase genes detected in the F. velutipes genome, four new laccase genes (fvLac-1, fvLac-2, fvLac-3, and fvLac-4) were found to contain four complete copper-binding regions (ten histidine residues and one cysteine residue) and four cysteine residues involved in forming disulfide bridges. fvLac-1, fvLac-2, fvLac-3, and fvLac-4 encode proteins consisting of 516, 518, 515, and 533 amino acid residues, respectively. Potential N-glycosylation sites (Asn-Xaa-Ser/Thr) were identified in the cDNA sequence of fvLac-1 (Asn-454), fvLac-2 (Asn-437 and Asn-455), fvLac-3 (Asn-111 and Asn-237), and fvLac-4 (Asn-402 and Asn-457). In addition, the first 19 to 20 amino acid residues of these proteins were predicted to comprise signal peptides. Laccase activity assays and reverse transcription polymerase chain reaction analyses clearly reveal that CuSO4 affects the induction and the transcription level of these laccase genes. PMID:25606003
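The Asn-Xaa-Ser/Thr sequon mentioned above can be located with a simple pattern scan. The following is a minimal sketch, not the authors' method: the function name `find_sequons` and the toy sequence are hypothetical, and excluding proline at the Xaa position is a common convention assumed here rather than something the abstract states.

```python
import re

# Sequon pattern Asn-Xaa-Ser/Thr. The lookahead lets overlapping sequons match.
# Assumption: Xaa may be any residue except proline (common convention;
# the abstract itself only gives Asn-Xaa-Ser/Thr).
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str) -> list[int]:
    """Return 1-based positions of Asn residues that start a sequon."""
    return [m.start() + 1 for m in SEQUON.finditer(protein.upper())]


# Hypothetical toy peptide, not a real laccase sequence.
toy = "MNATKNPSGNGS"
print(find_sequons(toy))  # Asn-2 (N-A-T) and Asn-10 (N-G-S); N-P-S is skipped
```

Positions reported this way are only candidate sites; whether a given Asn is actually glycosylated has to be shown experimentally.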
Mapping Flagellar Genes in Chlamydomonas Using Restriction Fragment Length Polymorphisms
Ranum, LPW.; Thompson, M. D.; Schloss, J. A.; Lefebvre, P. A.; Silflow, C. D.
1988-01-01
To correlate cloned nuclear DNA sequences with previously characterized mutations in Chlamydomonas, and to gain insight into the organization of its nuclear genome, we have begun to map molecular markers using restriction fragment length polymorphisms (RFLPs). A Chlamydomonas reinhardtii strain (CC-29) containing phenotypic markers on nine of the 19 linkage groups was crossed to the interfertile species Chlamydomonas smithii. DNA from each member of 22 randomly selected tetrads was analyzed for the segregation of RFLPs associated with cloned genes detected by hybridization with radioactive DNA probes. The current set of markers allows the detection of linkage to new molecular markers over approximately 54% of the existing genetic map. This study focused on mapping cloned flagellar genes and genes whose transcripts accumulate after deflagellation. Twelve different molecular clones have been assigned to seven linkage groups. The α-1 tubulin gene maps to linkage group III and is linked to the genomic sequence homologous to pcf6-100, a cDNA clone whose corresponding transcript accumulates after deflagellation. The α-2 tubulin gene maps to linkage group IV. The two β-tubulin genes are linked, with the β-1 gene being approximately 12 cM more distal from the centromere than the β-2 gene. A clone corresponding to a 73-kD dynein protein maps to the opposite arm of the same linkage group. The gene corresponding to the cDNA clone pcf6-187, whose mRNA accumulates after deflagellation, maps very close to the tightly linked pf-26 and pf-1 mutations on linkage group V. PMID:2906025
Cloning and expression of cDNA coding for bouganin.
den Hartog, Marcel T; Lubelli, Chiara; Boon, Louis; Heerkens, Sijmie; Ortiz Buijsse, Antonio P; de Boer, Mark; Stirpe, Fiorenzo
2002-03-01
Bouganin is a ribosome-inactivating protein that was recently isolated from Bougainvillea spectabilis Willd. In this work, the cloning and expression of the cDNA encoding bouganin is described. From the cDNA, the amino-acid sequence was deduced, which correlated with the primary sequence data obtained by amino-acid sequencing of the native protein. Bouganin is synthesized as a pro-peptide consisting of 305 amino acids, the first 26 of which act as a leader signal while the 29 C-terminal amino acids are cleaved during processing of the molecule. The mature protein consists of 250 amino acids. Using the cDNA sequence encoding the mature protein of 250 amino acids, a recombinant protein was expressed, purified and characterized. The recombinant molecule had activity in a cell-free protein synthesis assay and toxicity on living cells comparable to those of the isolated native bouganin.
Shaw, D R; Richter, H; Giorda, R; Ohmachi, T; Ennis, H L
1989-09-01
A Dictyostelium discoideum repetitive element composed of long repeats of the codon (AAC) is found in developmentally regulated transcripts. The concentration of (AAC) sequences is low in mRNA from dormant spores and growing cells and increases markedly during spore germination and multicellular development. The sequence hybridizes to many differently sized Dictyostelium DNA restriction fragments, indicating that it is scattered throughout the genome. Four cDNA clones isolated contain (AAC) sequences in the deduced coding region. Interestingly, the (AAC)-rich sequences are present in all three reading frames in the deduced proteins, i.e., AAC (asparagine), ACA (threonine) and CAA (glutamine). Three of the clones contain only one of these in-frame so that the individual proteins carry either asparagine, threonine, or glutamine clusters, not mixtures. However, one clone is both glutamine- and asparagine-rich. The (AAC) portion of the transcripts is reiterated 300 times in the haploid genome while the other portions of the cDNAs represent single copy genes, whose sequences show no similarity other than the (AAC) repeats. The repeated sequence is similar to the opa or M sequence found in Drosophila melanogaster notch and homeo box genes and in fly developmentally regulated transcripts. The transcripts are present on polysomes, suggesting that they are translated. Although the function of these repeats is unknown, long amino acid repeats are a characteristic feature of extracellular proteins of lower eukaryotes.
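Why one trinucleotide repeat yields three different homopolymeric amino acid runs can be seen by reading the same (AAC)n stretch in each of the three forward frames. This is an illustrative sketch only; the helper `frame_codons` and the repeat length of six are assumptions chosen for the example.

```python
# Only the three codons that can occur in an (AAC)n repeat are needed here.
CODON_TO_AA = {"AAC": "Asn", "ACA": "Thr", "CAA": "Gln"}


def frame_codons(seq: str, frame: int) -> list[str]:
    """Split seq into complete codons starting at offset frame (0, 1 or 2)."""
    return [seq[i:i + 3] for i in range(frame, len(seq) - 2, 3)]


repeat = "AAC" * 6  # hypothetical short stretch of the repeat

for frame in range(3):
    codons = frame_codons(repeat, frame)
    residues = {CODON_TO_AA[c] for c in codons}
    # Each frame contains a single codon type, hence a single amino acid.
    print(f"frame {frame}: codon {codons[0]} -> {sorted(residues)}")
```

Each frame produces a pure run of one residue, which matches the observation that individual clones carry asparagine, threonine, or glutamine clusters depending on how the repeat sits relative to the open reading frame.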
Efficient gene-driven germ-line point mutagenesis of C57BL/6J mice
Michaud, Edward J; Culiat, Cymbeline T; Klebig, Mitchell L; Barker, Paul E; Cain, KT; Carpenter, Debra J; Easter, Lori L; Foster, Carmen M; Gardner, Alysyn W; Guo, ZY; Houser, Kay J; Hughes, Lori A; Kerley, Marilyn K; Liu, Zhaowei; Olszewski, Robert E; Pinn, Irina; Shaw, Ginger D; Shinpock, Sarah G; Wymore, Ann M; Rinchik, Eugene M; Johnson, Dabney K
2005-01-01
Background Analysis of an allelic series of point mutations in a gene, generated by N-ethyl-N-nitrosourea (ENU) mutagenesis, is a valuable method for discovering the full scope of its biological function. Here we present an efficient gene-driven approach for identifying ENU-induced point mutations in any gene in C57BL/6J mice. The advantage of such an approach is that it allows one to select any gene of interest in the mouse genome and to go directly from DNA sequence to mutant mice. Results We produced the Cryopreserved Mutant Mouse Bank (CMMB), which is an archive of DNA, cDNA, tissues, and sperm from 4,000 G1 male offspring of ENU-treated C57BL/6J males mated to untreated C57BL/6J females. Each mouse in the CMMB carries a large number of random heterozygous point mutations throughout the genome. High-throughput Temperature Gradient Capillary Electrophoresis (TGCE) was employed to perform a 32-Mbp sequence-driven screen for mutations in 38 PCR amplicons from 11 genes in DNA and/or cDNA from the CMMB mice. DNA sequence analysis of heteroduplex-forming amplicons identified by TGCE revealed 22 mutations in 10 genes for an overall mutation frequency of 1 in 1.45 Mbp. All 22 mutations are single base pair substitutions, and nine of them (41%) result in nonconservative amino acid substitutions. Intracytoplasmic sperm injection (ICSI) of cryopreserved spermatozoa into B6D2F1 or C57BL/6J ova was used to recover mutant mice for nine of the mutations to date. Conclusions The inbred C57BL/6J CMMB, together with TGCE mutation screening and ICSI for the recovery of mutant mice, represents a valuable gene-driven approach for the functional annotation of the mammalian genome and for the generation of mouse models of human genetic diseases. The ability of ENU to induce mutations that cause various types of changes in proteins will provide additional insights into the functions of mammalian proteins that may not be detectable by knockout mutations. PMID:16300676
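The reported frequencies follow directly from the screen size and mutation counts given in the Results. The short calculation below is a sketch that simply reproduces that arithmetic; the variable names are illustrative.

```python
# Figures taken from the abstract above.
screened_mbp = 32.0      # total sequence screened by TGCE, in Mbp
mutations = 22           # mutations confirmed by sequencing
nonconservative = 9      # mutations causing nonconservative substitutions

mbp_per_mutation = screened_mbp / mutations
nonconservative_pct = 100 * nonconservative / mutations

print(f"1 mutation per {mbp_per_mutation:.2f} Mbp")
print(f"{nonconservative_pct:.0f}% nonconservative")
```

Both values agree with the abstract (1 in 1.45 Mbp and 41%).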
Isolation of CYP3A5P cDNA from human liver: a reflection of a novel cytochrome P-450 pseudogene.
Schuetz, J D; Guzelian, P S
1995-03-14
We have isolated, from a human liver cDNA library, a 1627 bp CYP3A5 cDNA variant (CYP3A5P) that contains several large insertions, deletions, and in-frame termination codons. By comparison with the genomic structure of other CYP3A genes, the major insertions in CYP3A5P cDNA demarcate the inferred sites of several CYP3A5 exons. The segments inserted in CYP3A5P have no homology with splice donor or acceptor sites. It is unlikely that CYP3A5P cDNA represents an artifact of the cloning procedures since Southern blot analysis of human genomic DNA disclosed that CYP3A5P cDNA hybridized with a DNA fragment distinct from fragments that hybridized with either CYP3A5, CYP3A3 or CYP3A4. Moreover, analysis of adult human liver RNA on Northern blots hybridized with a CYP3A5P cDNA fragment revealed the presence of an mRNA with the predicted size of CYP3A5P. We conclude that CYP3A5P cDNA was derived from a separate gene, CYP3A5P, most likely a pseudogene evolved from CYP3A5.
NASA Technical Reports Server (NTRS)
Hsieh, H. L.; Tong, C. G.; Thomas, C.; Roux, S. J.
1996-01-01
A cDNA encoding a 47 kDa nucleoside triphosphatase (NTPase) that is associated with the chromatin of pea nuclei has been cloned and sequenced. The translated sequence of the cDNA includes several domains predicted by known biochemical properties of the enzyme, including five motifs characteristic of the ATP-binding domain of many proteins, several potential casein kinase II phosphorylation sites, a helix-turn-helix region characteristic of DNA-binding proteins, and a potential calmodulin-binding domain. The deduced primary structure also includes an N-terminal sequence that is a predicted signal peptide and an internal sequence that could serve as a bipartite-type nuclear localization signal. Both in situ immunocytochemistry of pea plumules and immunoblots of purified cell fractions indicate that most of the immunodetectable NTPase is within the nucleus, a compartment proteins typically reach through nuclear pores rather than through the endoplasmic reticulum pathway. The translated sequence has some similarity to that of human lamin C, but not high enough to account for the earlier observation that IgG against human lamin C binds to the NTPase in immunoblots. Northern blot analysis shows that the NTPase mRNA is strongly expressed in etiolated plumules, but only poorly or not at all in the leaf and stem tissues of light-grown plants. Accumulation of NTPase mRNA in etiolated seedlings is stimulated by brief treatments with both red and far-red light, as is characteristic of very low-fluence phytochrome responses. Southern blotting with pea genomic DNA indicates the NTPase is likely to be encoded by a single gene.
Searching for nuclear export elements in hepatitis D virus RNA.
Freitas, Natália; Cunha, Celso
2013-08-12
This study searched for cis elements in hepatitis D virus (HDV) genomic and antigenomic RNA capable of promoting nuclear export. We made use of a well characterized chloramphenicol acetyl-transferase reporter system based on plasmid pDM138. Twenty cDNA fragments corresponding to different HDV genomic and antigenomic RNA sequences were inserted in plasmid pDM138, and used in transfection experiments in Huh7 cells. The relative amounts of HDV RNA in nuclear and cytoplasmic fractions were then determined by real-time polymerase chain reaction and Northern blotting. The secondary structure of the RNA sequences that displayed nuclear export ability was further predicted using a web interface. Finally, the sensitivity to leptomycin B was assessed in order to investigate possible cellular pathways involved in HDV RNA nuclear export. Analysis of genomic RNA sequences did not identify an unequivocal nuclear export element. However, two regions were found to promote the export of reporter mRNAs with efficiency higher than the negative controls, albeit lower than the positive control. These regions correspond to nucleotides 266-489 and 584-920, respectively. In addition, when analyzing antigenomic RNA sequences, a nuclear export element was found in positions 214-417. Export mediated by the nuclear export element of HDV antigenomic RNA is sensitive to leptomycin B, suggesting a possible role of CRM1 in this transport pathway. A cis-acting nuclear export element is present in nucleotides 214-417 of HDV antigenomic RNA.
Isolation of a cDNA Encoding a Granule-Bound 152-Kilodalton Starch-Branching Enzyme in Wheat
Båga, Monica; Nair, Ramesh B.; Repellin, Anne; Scoles, Graham J.; Chibbar, Ravindra N.
2000-01-01
Screening of a wheat (Triticum aestivum) cDNA library for starch-branching enzyme I (SBEI) genes combined with 5′-rapid amplification of cDNA ends resulted in isolation of a 4,563-bp composite cDNA, Sbe1c. Based on sequence alignment to characterized SBEI cDNA clones isolated from plants, the SBEIc predicted from the cDNA sequence was produced with a transit peptide directing the polypeptide into plastids. Furthermore, the predicted mature form of SBEIc was much larger (152 kD) than previously characterized plant SBEI (80–100 kD) and contained a partial duplication of SBEI sequences. The first SBEI domain showed high amino acid similarity to a 74-kD wheat SBEI-like protein that is inactive as a branching enzyme when expressed in Escherichia coli. The second SBEI domain on SBEIc was identical in sequence to a functional 87-kD SBEI produced in the wheat endosperm. Immunoblot analysis of proteins produced in developing wheat kernels demonstrated that the 152-kD SBEIc was, in contrast to the 87- to 88-kD SBEI, preferentially associated with the starch granules. Proteins similar in size and recognized by wheat SBEI antibodies were also present in Triticum monococcum, Triticum tauschii, and Triticum turgidum subsp. durum. PMID:10982440
Huiet, L; Feldstein, P A; Tsai, J H; Falk, B W
1993-12-01
Primer extension analyses and a PCR-based cloning strategy were used to identify and characterize 5' nucleotide sequences on the maize stripe virus (MStV) RNA4 mRNA transcripts encoding the major noncapsid protein (NCP). Direct RNA sequence analysis by primer extension showed that the NCP mRNA transcripts had 10-15 nucleotides beyond the 5' terminus of the MStV RNA4 nucleotide sequence. MStV genomic RNAs isolated from ribonucleoprotein particles (RNPs) lacked the additional 5' nucleotides. cDNA clones representing the 5' region of the mRNA transcripts were constructed, and the nucleotide sequences of the 5' regions were determined for 16 clones. Each was found to have a distinct 10-15 nucleotide sequence immediately 5' of the MStV RNA4 sequence. Eleven of 16 clones had the correct MStV RNA4 5' nucleotide sequence, while five showed minor variations at or near the 5' most MStV RNA4 nucleotide. These characteristics show strong similarities to other viral mRNA transcripts which are synthesized by cap snatching.
Kerschner, Joseph E; Erdos, Geza; Hu, Fen Ze; Burrows, Amy; Cioffi, Joseph; Khampang, Pawjai; Dahlgren, Margaret; Hayes, Jay; Keefe, Randy; Janto, Benjamin; Post, J Christopher; Ehrlich, Garth D
2010-04-01
We sought to construct and partially characterize complementary DNA (cDNA) libraries prepared from the middle ear mucosa (MEM) of chinchillas to better understand pathogenic aspects of infection and inflammation, particularly with respect to leukotriene biogenesis and response. Chinchilla MEM was harvested from controls and after middle ear inoculation with nontypeable Haemophilus influenzae. RNA was extracted to generate cDNA libraries. Randomly selected clones were subjected to sequence analysis to characterize the libraries and to provide DNA sequence for phylogenetic analyses. Reverse transcription-polymerase chain reaction of the RNA pools was used to generate cDNA sequences corresponding to genes associated with leukotriene biosynthesis and metabolism. Sequence analysis of 921 randomly selected clones from the uninfected MEM cDNA library produced approximately 250,000 nucleotides of almost entirely novel sequence data. Searches of the GenBank database with the Basic Local Alignment Search Tool provided for identification of 515 unique genes expressed in the MEM and not previously described in chinchillas. In almost all cases, the chinchilla cDNA sequences displayed much greater homology to human or other primate genes than with rodent species. Genes associated with leukotriene metabolism were present in both normal and infected MEM. Based on both phylogenetic comparisons and gene expression similarities with humans, chinchilla MEM appears to be an excellent model for the study of middle ear inflammation and infection. The higher degree of sequence similarity between chinchillas and humans compared to chinchillas and rodents was unexpected. The cDNA libraries from normal and infected chinchilla MEM will serve as useful molecular tools in the study of otitis media and should yield important information with respect to middle ear pathogenesis.
Kerschner, Joseph E.; Erdos, Geza; Hu, Fen Ze; Burrows, Amy; Cioffi, Joseph; Khampang, Pawjai; Dahlgren, Margaret; Hayes, Jay; Keefe, Randy; Janto, Benjamin; Post, J. Christopher; Ehrlich, Garth D.
2010-01-01
Objectives We sought to construct and partially characterize complementary DNA (cDNA) libraries prepared from the middle ear mucosa (MEM) of chinchillas to better understand pathogenic aspects of infection and inflammation, particularly with respect to leukotriene biogenesis and response. Methods Chinchilla MEM was harvested from controls and after middle ear inoculation with nontypeable Haemophilus influenzae. RNA was extracted to generate cDNA libraries. Randomly selected clones were subjected to sequence analysis to characterize the libraries and to provide DNA sequence for phylogenetic analyses. Reverse transcription–polymerase chain reaction of the RNA pools was used to generate cDNA sequences corresponding to genes associated with leukotriene biosynthesis and metabolism. Results Sequence analysis of 921 randomly selected clones from the uninfected MEM cDNA library produced approximately 250,000 nucleotides of almost entirely novel sequence data. Searches of the GenBank database with the Basic Local Alignment Search Tool provided for identification of 515 unique genes expressed in the MEM and not previously described in chinchillas. In almost all cases, the chinchilla cDNA sequences displayed much greater homology to human or other primate genes than with rodent species. Genes associated with leukotriene metabolism were present in both normal and infected MEM. Conclusions Based on both phylogenetic comparisons and gene expression similarities with humans, chinchilla MEM appears to be an excellent model for the study of middle ear inflammation and infection. The higher degree of sequence similarity between chinchillas and humans compared to chinchillas and rodents was unexpected. The cDNA libraries from normal and infected chinchilla MEM will serve as useful molecular tools in the study of otitis media and should yield important information with respect to middle ear pathogenesis. PMID:20433028
Transcriptome characterisation of Pinus tabuliformis and evolution of genes in the Pinus phylogeny
2013-01-01
Background The Chinese pine (Pinus tabuliformis) is an indigenous conifer species in northern China but is relatively underdeveloped as a genomic resource, which limits gene discovery and breeding. Large-scale transcriptome data were obtained using a next-generation sequencing platform to compensate for the lack of P. tabuliformis genomic information. Results The increasing amount of transcriptome data on Pinus provides an excellent resource for multi-gene phylogenetic analysis and studies on how conserved genes and functions are maintained in the face of species divergence. The first P. tabuliformis transcriptome from a normalised cDNA library of multiple tissues and individuals was sequenced in a full 454 GS-FLX run, producing 911,302 sequencing reads. The high quality overlapping expressed sequence tags (ESTs) were assembled into 46,584 putative transcripts, and more than 700 SSRs and 92,000 SNPs/InDels were characterised. Comparative analysis of the transcriptome of six conifer species yielded 191 orthologues, from which we inferred a phylogenetic tree, evolutionary patterns and calculated rates of gene diversion. We also identified 938 fast evolving sequences that may be useful for identifying genes that perhaps evolved in response to positive selection and might be responsible for speciation in the Pinus lineage. Conclusions A large collection of high-quality ESTs was obtained, de novo assembled and characterised, which represents a dramatic expansion of the current transcript catalogues of P. tabuliformis and which will gradually be applied in breeding programs of P. tabuliformis. Furthermore, these data will facilitate future studies of the comparative genomics of P. tabuliformis and other related species. PMID:23597112
Sasaki, Katsutomo; Mitsuda, Nobutaka; Nashima, Kenji; Kishimoto, Kyutaro; Katayose, Yuichi; Kanamori, Hiroyuki; Ohmiya, Akemi
2017-09-04
Chrysanthemum morifolium is one of the most economically valuable ornamental plants worldwide. Chrysanthemum is an allohexaploid plant with a large genome that is commercially propagated by vegetative reproduction. New cultivars with different floral traits, such as color, morphology, and scent, have been generated mainly by classical cross-breeding and mutation breeding. However, only limited genetic resources and their genome information are available for the generation of new floral traits. To obtain useful information about molecular bases for floral traits of chrysanthemums, we read expressed sequence tags (ESTs) of chrysanthemums by high-throughput sequencing using the 454 pyrosequencing technology. We constructed normalized cDNA libraries, consisting of full-length, 3'-UTR, and 5'-UTR cDNAs derived from various tissues of chrysanthemums. These libraries produced a total number of 3,772,677 high-quality reads, which were assembled into 213,204 contigs. By comparing the data obtained with those of full genome-sequenced species, we confirmed that our chrysanthemum contig set contained the majority of all expressed genes, which was sufficient for further molecular analysis in chrysanthemums. We confirmed that our chrysanthemum EST set (contigs) contained a number of contigs that encoded transcription factors and enzymes involved in pigment and aroma compound metabolism that was comparable to that of other species. This information can serve as an informative resource for identifying genes involved in various biological processes in chrysanthemums. Moreover, the findings of our study will contribute to a better understanding of the floral characteristics of chrysanthemums including the myriad cultivars at the molecular level.
ESTree db: a Tool for Peach Functional Genomics
Lazzari, Barbara; Caprera, Andrea; Vecchietti, Alberto; Stella, Alessandra; Milanesi, Luciano; Pozzi, Carlo
2005-01-01
Background The ESTree db represents a collection of Prunus persica expressed sequence tags (ESTs) and is intended as a resource for peach functional genomics. A total of 6,155 successful EST sequences were obtained from four in-house prepared cDNA libraries from Prunus persica mesocarps at different developmental stages. Another 12,475 peach EST sequences were downloaded from public databases and added to the ESTree db. An automated pipeline was prepared to process EST sequences using public software integrated by in-house developed Perl scripts and data were collected in a MySQL database. A PHP-based web interface was developed to query the database. Results The ESTree db version as of April 2005 encompasses 18,630 sequences representing eight libraries. Contig assembly was performed with CAP3. Putative single nucleotide polymorphism (SNP) detection was performed with the AutoSNP program and a search engine was implemented to retrieve results. All the sequences and all the contig consensus sequences were annotated both with blastx against the GenBank nr db and with GOblet against the viridiplantae section of the Gene Ontology db. Links to NiceZyme (Expasy) and to the KEGG metabolic pathways were provided. A local BLAST utility is available. A text search utility allows querying and browsing the database. Statistics were provided on Gene Ontology occurrences to assign sequences to Gene Ontology categories. Conclusion The resulting database is a comprehensive resource of data and links related to peach EST sequences. The Sequence Report and Contig Report pages work as the web interface core structures, giving quick access to data related to each sequence/contig. PMID:16351742
ESTree db: a tool for peach functional genomics.
Lazzari, Barbara; Caprera, Andrea; Vecchietti, Alberto; Stella, Alessandra; Milanesi, Luciano; Pozzi, Carlo
2005-12-01
The ESTree db (http://www.itb.cnr.it/estree/) represents a collection of Prunus persica expressed sequence tags (ESTs) and is intended as a resource for peach functional genomics. A total of 6,155 successful EST sequences were obtained from four in-house prepared cDNA libraries from Prunus persica mesocarps at different developmental stages. Another 12,475 peach EST sequences were downloaded from public databases and added to the ESTree db. An automated pipeline was prepared to process EST sequences using public software integrated by in-house developed Perl scripts and data were collected in a MySQL database. A PHP-based web interface was developed to query the database. The ESTree db version as of April 2005 encompasses 18,630 sequences representing eight libraries. Contig assembly was performed with CAP3. Putative single nucleotide polymorphism (SNP) detection was performed with the AutoSNP program and a search engine was implemented to retrieve results. All the sequences and all the contig consensus sequences were annotated both with blastx against the GenBank nr db and with GOblet against the viridiplantae section of the Gene Ontology db. Links to NiceZyme (Expasy) and to the KEGG metabolic pathways were provided. A local BLAST utility is available. A text search utility allows querying and browsing the database. Statistics were provided on Gene Ontology occurrences to assign sequences to Gene Ontology categories. The resulting database is a comprehensive resource of data and links related to peach EST sequences. The Sequence Report and Contig Report pages work as the web interface core structures, giving quick access to data related to each sequence/contig.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Chen, M.; Auerbach, W.; Buchwald, M.
1994-09-01
Fanconi anemia (FA) is an autosomal recessive disease characterized by bone marrow failure, congenital malformations and predisposition to malignancies. The gene responsible for the defect in FA group C has been cloned and designated the Fanconi Anemia Complementation Group C gene (FACC). A murine cDNA for this gene (Facc) was also cloned. Here we report our progress in the establishment of a mouse model for FA. The mouse Facc cDNA was used as probe to screen a genomic library of mouse strain 129. More than twenty positive clones were isolated. Three of them were mapped and found to be overlapping clones, encompassing the genomic region from exon 8 to the end of the 3′ UTR of the mouse cDNA. A targeting vector was constructed using the most 5′ mouse genomic sequence available. The end result of the homologous recombination is that exon 8 is deleted and the neo gene is inserted. The last exon, exon 14, is essential for the complementing function of the FACC gene product; the disruption in the middle of the murine Facc gene should render this locus biologically inactive. This targeting vector was linearized and electroporated into R1 embryonic stem (ES) cells which were derived from the 129 mouse. Of 102 clones screened, 19 positive cell lines were identified. Four targeted cell lines have been used to produce chimeric mice. 129-derived ES cells were aggregated ex vivo into the morulas derived from CD1 mice and then implanted into foster mothers. 22 chimeras have been obtained. Moderately and strongly chimeric mice have been bred to test for germline transmission. Progeny with the expected coat color derived from 2 chimeras are currently being examined to confirm transmission of the targeted allele.
Schuster, W; Wissinger, B; Unseld, M; Brennicke, A
1990-01-01
A number of cytosines are altered to be recognized as uridines in transcripts of the nad3 locus in mitochondria of the higher plant Oenothera. Such nucleotide modifications can be found at 16 different sites within the nad3 coding region. Most of these alterations in the mRNA sequence change codon identities to specify amino acids better conserved in evolution. Individual cDNA clones differ in their degree of editing at five nucleotide positions, three of which are silent, while two lead to codon alterations specifying different amino acids. None of the cDNA clones analysed is maximally edited at all possible sites, suggesting slow processing or lowered stringency of editing at these nucleotides. Differentially edited transcripts could be editing intermediates or could code for differing polypeptides. Two edited nucleotides in an open reading frame located upstream of nad3 change two amino acids in the deduced polypeptide. Part of the well-conserved ribosomal protein gene rps12, also encoded downstream of nad3 in other plants, is lost in Oenothera mitochondria by recombination events. The functional rps12 protein must be imported from the cytoplasm since the deleted sequences of this gene are not found in the Oenothera mitochondrial genome. The pseudogene sequence is not edited at any nucleotide position. PMID:1688531
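Editing sites of this kind are typically found by comparing the genomic coding sequence with cDNA sequences, where an edited C reads as T. The sketch below illustrates that comparison and labels each site as silent or codon-altering. It is not the authors' procedure: the function `editing_sites` and the 9-nt toy sequences are hypothetical, and the standard genetic code is assumed for translation.

```python
from itertools import product

# Standard genetic code in the conventional TCAG ordering (assumed here).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def editing_sites(genomic: str, cdna: str) -> list[tuple[int, str, str, bool]]:
    """List C-to-T differences between in-frame genomic and cDNA sequences.

    Returns (1-based position, amino acid before, amino acid after, silent).
    The 'after' residue reflects every edit present in that codon of the cDNA.
    """
    if len(genomic) != len(cdna) or len(genomic) % 3:
        raise ValueError("sequences must be in frame and of equal length")
    sites = []
    for i, (g, c) in enumerate(zip(genomic, cdna)):
        if g == "C" and c == "T":
            start = i - i % 3  # first base of the codon containing position i
            before = CODON_TABLE[genomic[start:start + 3]]
            after = CODON_TABLE[cdna[start:start + 3]]
            sites.append((i + 1, before, after, before == after))
    return sites


# Hypothetical toy sequences, not the Oenothera nad3 sequence.
genomic = "TCATTCCCA"   # Ser Phe Pro
cdna = "TTATTTCTA"      # Leu Phe Leu after editing
for pos, before, after, silent in editing_sites(genomic, cdna):
    print(pos, before, after, "silent" if silent else "codon-altering")
```

Running the same comparison over several independent cDNA clones would show which positions are edited in only some transcripts, as described for the five partially edited sites.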
Cai, Q; Storey, K B
1997-08-01
The present study identifies a previously cloned cDNA, pBTaR914, as homologous to the mitochondrial WANCY (tryptophan, alanine, asparagine, cysteine, and tyrosine) tRNA gene cluster. This cDNA clone has a 304-bp sequence and its homologue, pBTaR09, has a 158-bp sequence with a long poly(A)+ tail (more than 60 adenosines). RNA blotting analysis using pBTaR914 probe against the total RNA from the tissues of adult and hatchling turtles revealed five bands: 540, 1800, 2200, 3200, and 3900 nucleotides (nt). The 540-nt transcript is considered to be an intact mtRNA unit from a novel mtDNA gene designated WANCYHP that overlaps the WANCY tRNA gene cluster region. This transcript was highly induced by both anoxic and freezing stresses in turtle heart. The other transcripts are considered to be the processed intermediates of mtRNA transcripts with WANCYHP sequence. All these transcripts were differentially regulated by anoxia and freezing in different organs. The data suggest that mtRNA processing is sensitive to regulation by external stresses, oxygen deprivation, and freezing. Furthermore, the fact that the WANCYHP transcript is highly induced during anoxic exposure suggests that it may play an important role in the regulation of mitochondrial activities to coordinate the physiological adaptation to anoxia.
Clinical characteristics of severe congenital neutropenia caused by novel ELANE gene mutations.
Shu, Zhou; Li, Xiao-Hui; Bai, Xiao-Ming; Zhang, Zhi-Yong; Jiang, Li-ping; Tang, Xue-Mei; Zhao, Xiao-dong
2015-02-01
Mutations within the ELANE gene, which encodes human neutrophil elastase, are the most common genetic causes of severe congenital neutropenia (SCN). No cases of SCN have been previously described from a Chinese population. Herein, we describe the clinical, hematologic and molecular characteristics of 7 Chinese SCN cases with novel ELANE mutations. Seven Chinese pediatric patients (4 males and 3 females) with suspected SCN were enrolled in this study. Clinical data, peripheral blood, bone marrow and immune function were evaluated for SCN. ELANE genomic DNA and cDNA sequences from patients and potential carriers were analyzed using polymerase chain reaction (PCR) and direct sequencing. All 7 patients experienced recurrent infection (soft tissue, lung, oral cavity) during a period of 120 days. Noninfectious conditions such as anemia and osteopenia were found in most patients, and absolute peripheral neutrophil counts varied. DNA and cDNA sequencing demonstrated that the patients harbored a range of heterozygous ELANE gene mutations, including substitution, deletion, insertion and frameshift alterations. None of the mutations had been reported previously; however, no mutation carriers were identified among the parents or siblings, even in a family with 2 affected offspring. SCN cases were identified for the first time in China, and all patients carried novel ELANE mutations. Granulocyte-colony stimulating factor (G-CSF) was an effective treatment for most of the SCN patients and prevented life-threatening bacterial infections.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Glasser, S.W.; Korfhagen, T.R.; Weaver, T.E.
1988-01-05
In hyaline membrane disease of premature infants, lack of surfactant leads to pulmonary atelectasis and respiratory distress. Hydrophobic surfactant proteins of M_r = 5000-14,000 have been isolated from mammalian surfactants which enhance the rate of spreading and the surface tension lowering properties of phospholipids during dynamic compression. The authors have characterized the amino-terminal amino acid sequence of pulmonary proteolipids from ether/ethanol extracts of bovine, canine, and human surfactant. Two distinct peptides were identified and termed SPL(pVal) and SPL(Phe). An oligonucleotide probe based on the valine-rich amino-terminal amino acid sequence of SPL(pVal) was utilized to isolate cDNA and genomic DNA encoding the human protein, termed surfactant proteolipid SPL(pVal) on the basis of its unique polyvaline domain. The primary structure of a precursor protein of 20,870 daltons, containing the SPL(pVal) peptide, was deduced from the nucleotide sequence of the cDNAs. Hybrid-arrested translation and immunoprecipitation of labeled translation products of human mRNA demonstrated a precursor protein, the active hydrophobic peptide being produced by proteolytic processing. Two classes of cDNAs encoding SPL(pVal) were identified. Human SPL(pVal) mRNA was more abundant in the adult than in fetal lung. The SPL(pVal) gene locus was assigned to chromosome 8.
Su, Xiu-Lan; Hou, Yi-Ling; Yan, Xiang-Hui; Ding, Xiang; Hou, Wan-Ru; Sun, Bing; Zhang, Si-Nan
2012-09-01
Ribosomal protein L31 (RPL31), encoded by the RPL31 gene, is a component of the 60S large ribosomal subunit and an important constituent of the peptidyltransferase center. In this study, the cDNA and genomic sequences of RPL31 were cloned from the giant panda (Ailuropoda melanoleuca) by RT-PCR, then sequenced and preliminarily analyzed. We constructed a recombinant expression vector containing the RPL31 cDNA and over-expressed it in Escherichia coli using the pET28a plasmid. The expression product was purified to obtain recombinant giant panda RPL31 protein. The recombinant protein was applied to human laryngeal carcinoma Hep-2 and human hepatoma HepG-2 cells, and its anti-cancer activity was assessed by the MTT [3-(4,5-dimethyl-2-thiazolyl)-2,5-diphenyl-2H-tetrazolium bromide] method by observing the growth-inhibitory effect on these cells. The results indicated that the RPL31 cDNA fragment cloned from the giant panda is 419 bp in size and contains an open reading frame of 378 bp; the deduced protein is composed of 125 amino acids, with an estimated molecular weight of 14.46 kDa and a pI of 11.21. The genomic sequence is 8,091 bp long and possesses four exons and three introns. The RPL31 gene can be readily expressed in E. coli, yielding the expected 18-kDa polypeptide, which formed inclusion bodies. Recombinant RPL31 from the giant panda consists of 157 amino acids, with an estimated molecular weight of 17.86 kDa and a pI of 10.77. The cell growth inhibition rate depended on both the exposure time and the dose of recombinant RPL31. On Hep-2 cells, low concentrations were more effective than high concentrations, and a concentration of 0.33 μg/mL gave the highest growth inhibition rate, 44%.
Our study thus aimed to reveal the anti-cancer function of recombinant RPL31 from the giant panda, providing a scientific basis and resources for the development of protein-based cancer drugs and for research into anti-cancer mechanisms. Further studies of the mechanism and the signal transduction pathways are in progress.
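The relation between the 378-bp open reading frame and the 125-residue deduced protein reported in this abstract can be checked with a short sketch. It assumes the ORF length includes the stop codon, which is the usual convention but is not stated in the abstract; the function name is hypothetical.

```python
def orf_to_residues(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of the given length in bp."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    # The stop codon occupies one codon but encodes no residue.
    return codons - 1 if includes_stop else codons


print(orf_to_residues(378))  # 125
```

Under that assumption the 378-bp ORF gives 126 codons, one of which is the stop codon, consistent with the 125 amino acids reported.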
Sakuradani, Eiji; Kobayashi, Michihiko; Shimizu, Sakayu
1999-01-01
Based on the sequence information for bovine and yeast NADH-cytochrome b5 reductases (CbRs), a DNA fragment was cloned from Mortierella alpina 1S-4 after PCR amplification. This fragment was used as a probe to isolate a cDNA clone with an open reading frame encoding 298 amino acid residues which show marked sequence similarity to CbRs from other sources, such as yeast (Saccharomyces cerevisiae), bovine, human, and rat CbRs. These results suggested that this cDNA is a CbR gene. The results of a structural comparison of the flavin-binding β-barrel domains of CbRs from various species and that of the M. alpina enzyme suggested that the overall barrel-folding patterns are similar to each other and that a specific arrangement of three highly conserved amino acid residues (i.e., arginine, tyrosine, and serine) plays a role in binding with the flavin (another prosthetic group) through hydrogen bonds. The corresponding genomic gene, which was also cloned from M. alpina 1S-4 by means of a hybridization method with the above probe, had four introns of different sizes. These introns had GT at the 5′ end and AG at the 3′ end, according to a general GT-AG rule. The expression of the full-length cDNA in a filamentous fungus, Aspergillus oryzae, resulted in an increase (4.7 times) in ferricyanide reduction activity involving the use of NADH as an electron donor in the microsomes. The M. alpina CbR was purified by solubilization of microsomes with cholic acid sodium salt, followed by DEAE-Sephacel, Mono-Q HR 5/5, and AMP-Sepharose 4B affinity column chromatographies; there was a 645-fold increase in the NADH-ferricyanide reductase specific activity. The purified CbR preferred NADH over NADPH as an electron donor. This is the first report of an analysis of this enzyme in filamentous fungi. PMID:10473389
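The GT-AG rule mentioned in this abstract can be expressed as a simple check on intron boundaries. This is a minimal sketch; the function name and the example introns are hypothetical and are not the M. alpina sequences.

```python
def follows_gt_ag_rule(intron: str) -> bool:
    """True if the intron starts with GT (5' end) and ends with AG (3' end)."""
    seq = intron.upper()
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


# Hypothetical intron sequences, for illustration only.
introns = ["GTAAGTCTTTCAG", "GTGAGTATTTTAG", "GCAAGTCTTTCAG"]
print([follows_gt_ag_rule(i) for i in introns])  # [True, True, False]
```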
Li, Chibo; Ding, Xi-Qin; O’Brien, John; Al-Ubaidi, Muayyad R.
2010-01-01
PURPOSE A great deal of information about functionally significant domains of a protein may be obtained by comparison of primary sequences of gene homologues over a broad phylogenetic base. This study was designed to identify evolutionarily conserved domains of the photoreceptor disc membrane protein peripherin/rds by analysis of the homologue in a primitive vertebrate, the skate. METHODS A skate retinal cDNA library was screened using a mouse peripherin/rds clone. The 5′ and 3′ untranslated regions of the skate peripherin/rds (srds) cDNA were isolated by the rapid amplification of cDNA ends (RACE) approach. The gene structure was characterized by PCR amplification and sequencing of genomic fragments. Northern and Western blot analyses were used to identify srds transcript and protein, respectively. RESULTS A new homologue of peripherin/rds was identified from the skate retinal cDNA library. SRDS is a glycoprotein with a predicted molecular mass of 40.2 kDa. The srds gene consists of two exons and one small intron and transcribes into a single 6-kb message. Phylogenetic analysis places SRDS at the base of peripherin/rds family and near the division of that group and the branch leading to rds-like and rom-1 genes. SRDS protein is 54.5% identical with peripherin/rds across species. Identity is significantly higher (73%) in the intradiscal domains. Sequence comparison revealed the conservation of all residues that have been shown, on mutation, to associate with retinitis pigmentosa and showed conservation of most residues associated with macular dystrophies. Comparison with ROM-1 and other rds-like proteins revealed the presence of a highly conserved domain in the large intradiscal loop. CONCLUSIONS Srds represents the skate orthologue of mammalian peripherin/rds genes. Conservation of most of the residues associated with human retinal diseases indicates that these residues serve important functional roles. 
The high degree of conservation of a short stretch within the large intradiscal loop also suggests an important function for this domain. PMID:12766040
A Genome-Wide Expression Profile of Salt-Responsive Genes in the Apple Rootstock Malus zumi
Li, Qingtian; Liu, Jia; Tan, Dunxian; Allan, Andrew C.; Jiang, Yuzhuang; Xu, Xuefeng; Han, Zhenhai; Kong, Jin
2013-01-01
In some areas of cultivation, a lack of salt tolerance severely affects plant productivity. Apple, Malus x domestica Borkh., is sensitive to salt, and, as a perennial woody plant, its mechanism of salt stress adaptation is likely to differ from that of annual herbaceous model plants, such as Arabidopsis. Malus zumi is a salt-tolerant apple rootstock, which survives high salinity (up to 0.6% NaCl). To examine the mechanism underlying this tolerance, a genome-wide expression analysis was performed, using a cDNA library constructed from salt-treated seedlings of Malus zumi. A total of 15,000 cDNA clones were selected for microarray analysis. In total, a group of 576 cDNAs whose expression changed more than four-fold were sequenced, and 18 genes were selected to verify their expression pattern under salt stress by semi-quantitative RT-PCR. Our genome-wide expression analysis resulted in the isolation of 50 novel Malus genes and the elucidation of a new apple-specific mechanism of salt tolerance, including the stabilization of photosynthesis under stress and the involvement of phenolic compounds and sorbitol in ROS scavenging and osmoprotection. The promoter regions of 111 genes were analyzed by PlantCARE, suggesting intensive cross-talk among abiotic stress responses in Malus zumi. An interaction network of salt-responsive genes was constructed and molecular regulatory pathways of apple were deduced. Our research will contribute to gene function analysis and further the understanding of salt-tolerance mechanisms in fruit trees. PMID:24145753
Shiraishi, H; Ishikura, S; Matsuura, K; Deyashiki, Y; Ninomiya, M; Sakai, S; Hara, A
1998-01-01
Human liver contains three isoforms (DD1, DD2 and DD4) of dihydrodiol dehydrogenase with 20alpha- or 3alpha-hydroxysteroid dehydrogenase activity; the dehydrogenases belong to the aldo-oxo reductase (AKR) superfamily. cDNA species encoding DD1 and DD4 have been identified. However, four cDNA species with more than 99% sequence identity have been cloned and are compatible with a partial amino acid sequence of DD2. In this study we have isolated a cDNA clone encoding DD2, which was confirmed by comparison of the properties of the recombinant and hepatic enzymes. This cDNA showed differences of one, two, four and five nucleotides from the previously reported four cDNA species for a dehydrogenase of human colon carcinoma HT29 cells, human prostatic 3alpha-hydroxysteroid dehydrogenase, a human liver 3alpha-hydroxysteroid dehydrogenase-like protein and chlordecone reductase-like protein respectively. Expression of mRNA species for the five similar cDNA species in 20 liver samples and 10 other different tissue samples was examined by reverse transcriptase-mediated PCR with specific primers followed by diagnostic restriction with endonucleases. All the tissues expressed only one mRNA species corresponding to the newly identified cDNA for DD2: mRNA transcripts corresponding to the other cDNA species were not detected. We suggest that the new cDNA is derived from the principal gene for DD2, which has been named AKR1C2 by a new nomenclature for the AKR superfamily. It is possible that some of the other cDNA species previously reported are rare allelic variants of this gene. PMID:9716498
Pauchet, Y; Wilkinson, P; Vogel, H; Nelson, D R; Reynolds, S E; Heckel, D G; ffrench-Constant, R H
2010-02-01
The tobacco hornworm Manduca sexta is an important model for insect physiology but genomic and transcriptomic data are currently lacking. Following a recent pyrosequencing study generating immune related expressed sequence tags (ESTs), here we use this new technology to define the M. sexta larval midgut transcriptome. We generated over 387,000 midgut ESTs, using a combination of Sanger and 454 sequencing, and classified predicted proteins into those involved in digestion, detoxification and immunity. In many cases the depth of 454 pyrosequencing coverage allowed us to define the entire cDNA sequence of a particular gene. Many new M. sexta genes are described including up to 36 new cytochrome P450s, some of which have been implicated in the metabolism of host plant-derived nicotine. New lepidopteran gene families such as the beta-fructofuranosidases, previously thought to be restricted to Bombyx mori, are also described. An unexpectedly high number of ESTs were involved in immunity, for example 39 contigs encoding serpins, and the increasingly appreciated role of the midgut in insect immunity is discussed. Similar studies of other tissues will allow for a tissue by tissue description of the M. sexta transcriptome and will form an essential complementary step on the road to genome sequencing and annotation.
Kimura, Tomohiro; Nakano, Toshiki; Yamaguchi, Toshiyasu; Sato, Minoru; Ogawa, Tomohisa; Muramoto, Koji; Yokoyama, Takehiko; Kan-No, Nobuhiro; Nagahisa, Eizou; Janssen, Frank; Grieshaber, Manfred K
2004-01-01
The complete complementary DNA sequences of genes presumably coding for opine dehydrogenases from Arabella iricolor (sandworm), Haliotis discus hannai (abalone), and Patinopecten yessoensis (scallop) were determined, and partial cDNA sequences were derived for Meretrix lusoria (Japanese hard clam) and Spisula sachalinensis (Sakhalin surf clam). The primers ODH-9F and ODH-11R proved useful for amplifying the sequences for opine dehydrogenases from the 4 mollusk species investigated in this study. The sequence of the sandworm was obtained using primers constructed from the amino acid sequence of tauropine dehydrogenase, the main opine dehydrogenase in A. iricolor. The complete cDNA sequences of A. iricolor, H. discus hannai, and P. yessoensis encode 397, 400, and 405 amino acids, respectively. All sequences were aligned and compared with published databank sequences of Loligo opalescens, Loligo vulgaris (squid), Sepia officinalis (cuttlefish), and Pecten maximus (scallop). As expected, a high level of homology was observed for the cDNA from closely related species, such as cephalopods or scallops, whereas cDNA from the other species showed lower levels of homology. A similar trend was observed when the deduced amino acid sequences were compared. Furthermore, alignment of these sequences revealed some structural motifs that are possibly related to the binding sites of the substrates. The phylogenetic trees derived from the nucleotide and amino acid sequences were consistent with the classification of species resulting from classical taxonomic analyses.
Ozawa, Tatsuhiko; Kondo, Masato; Isobe, Masaharu
2004-01-01
The 3' rapid amplification of cDNA ends (3' RACE) is widely used to isolate the cDNA of unknown 3' flanking sequences. However, the conventional 3' RACE often fails to amplify cDNA from a large transcript if there is a long distance between the 5' gene-specific primer and poly(A) stretch, since the conventional 3' RACE utilizes 3' oligo-dT-containing primer complementary to the poly(A) tail of mRNA at the first strand cDNA synthesis. To overcome this problem, we have developed an improved 3' RACE method suitable for the isolation of cDNA derived from very large transcripts. By using the oligonucleotide-containing random 9mer together with the GC-rich sequence for the suppression PCR technology at the first strand of cDNA synthesis, we have been able to amplify the cDNA from a very large transcript, such as the microtubule-actin crosslinking factor 1 (MACF1) gene, which codes a transcript of 20 kb in size. When there is no splicing variant, our highly specific amplification allows us to perform the direct sequencing of 3' RACE products without requiring cloning in bacterial hosts. Thus, this stepwise 3' RACE walking will help rapid characterization of the 3' structure of a gene, even when it encodes a very large transcript.
Sirakova, T D; Markaryan, A; Kolattukudy, P E
1994-01-01
An extracellular elastinolytic metalloproteinase, purified from Aspergillus fumigatus isolated from an aspergillosis patient, and an internal peptide derived from it were subjected to N-terminal sequencing. Oligonucleotide primers based on these sequences were used to PCR amplify a segment of the metalloproteinase cDNA, which was used as a probe to isolate the cDNA and gene for this enzyme. The gene sequence matched exactly with the cDNA sequence except for the four introns that interrupted the open reading frame. According to the deduced amino acid sequence, the metalloproteinase has a signal sequence and 227 additional amino acids preceding the sequence for the mature protein of 389 amino acids with a calculated molecular mass of 42 kDa, which is close to the size of the purified mature fungal proteinase. This sequence contains segments that matched both the N terminus of the mature protein and the internal peptide. A. fumigatus metalloproteinase contains some of the conserved zinc-binding and active-site motifs characteristic of metalloproteinases but shows no overall homology with known metalloproteinases. The cDNA of the mature protein when introduced into Escherichia coli directed the expression of a protein with a size, N-terminal sequence, and immunological cross-reactivity identical to those of the native fungal enzyme. Although the enzyme in the inclusion bodies could not be renatured, expression at 30 degrees C yielded soluble enzyme that showed chromatographic behavior identical to that of the native fungal enzyme and catalyzed hydrolysis of elastin. The metalloproteinase gene described here was not found in Aspergillus flavus. PMID:7927676
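As a rough plausibility check on the 42 kDa figure for a 389-residue mature protein, the following sketch uses the common rule of thumb of about 110 Da per amino acid residue. That average is an assumption introduced here, not the method the authors used (they report a mass calculated from the actual sequence), and the function name is hypothetical.

```python
AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average, an assumption of this sketch


def estimate_mass_kda(n_residues: int, avg_da: float = AVG_RESIDUE_MASS_DA) -> float:
    """Approximate protein mass in kDa from residue count alone."""
    return n_residues * avg_da / 1000.0


print(round(estimate_mass_kda(389), 2))  # 42.79
```

The estimate of roughly 43 kDa is in line with the calculated 42 kDa reported in the abstract; an exact value would require summing the masses of the individual residues.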
Gomulski, Ludvik M; Dimopoulos, George; Xi, Zhiyong; Soares, Marcelo B; Bonaldo, Maria F; Malacrida, Anna R; Gasperi, Giuliano
2008-01-01
Background The medfly, Ceratitis capitata, is a highly invasive agricultural pest that has become a model insect for the development of biological control programs. Despite research into the behavior and classical and population genetics of this organism, the quantity of sequence data available is limited. We have utilized an expressed sequence tag (EST) approach to obtain detailed information on transcriptome signatures that relate to a variety of physiological systems in the medfly; this information emphasizes reproduction, sex determination, and chemosensory perception, since the study was based on normalized cDNA libraries from embryos and adult heads. Results A total of 21,253 high-quality ESTs were obtained from the embryo and head libraries. Clustering analyses performed separately for each library resulted in 5201 embryo and 6684 head transcripts. Considering an estimated 19% overlap in the transcriptomes of the two libraries, they represent about 9614 unique transcripts involved in a wide range of biological processes and molecular functions. Of particular interest are the sequences that share homology with Drosophila genes involved in sex determination, olfaction, and reproductive behavior. The medfly transformer2 (tra2) homolog was identified among the embryonic sequences, and its genomic organization and expression were characterized. Conclusion The sequences obtained in this study represent the first major dataset of expressed genes in a tephritid species of agricultural importance. This resource provides essential information to support the investigation of numerous questions regarding the biology of the medfly and other related species and also constitutes an invaluable tool for the annotation of complete genome sequences. 
Our study has revealed intriguing findings regarding the transcript regulation of tra2 and other sex determination genes, as well as insights into the comparative genomics of genes implicated in chemosensory reception and reproduction. PMID:18500975
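The transcript counts in this abstract (5201 embryo, 6684 head, about 9614 unique) can be related to the quoted 19% overlap with a short calculation. The sketch below assumes the overlap is expressed as a fraction of the combined per-library total; the abstract does not state how the authors defined it, and the function name is hypothetical.

```python
def implied_overlap(n_a: int, n_b: int, n_unique: int) -> tuple[int, float]:
    """Return (redundant transcripts, fraction of combined total) implied by the counts."""
    combined = n_a + n_b
    redundant = combined - n_unique
    return redundant, redundant / combined


redundant, fraction = implied_overlap(5201, 6684, 9614)
print(redundant, round(fraction * 100, 1))  # 2271 19.1
```

Under this reading, the three reported numbers are mutually consistent with an overlap of about 19%.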
Syntenic conservation of HSP70 genes in cattle and humans
DOE Office of Scientific and Technical Information (OSTI.GOV)
Grosz, M.D.; Womack, J.E.; Skow, L.C.
1992-12-01
A phage library of bovine genomic DNA was screened for hybridization with a human HSP70 cDNA probe, and 21 positive plaques were identified and isolated. Restriction mapping and blot hybridization analysis of DNA from the recombinant plaques demonstrated that the cloned DNAs were derived from three different regions of the bovine genome. One region contains two tandemly arrayed HSP70 sequences, designated HSP70-1 and HSP70-2, separated by approximately 8 kb of DNA. Single HSP70 sequences, designated HSP70-3 and HSP70-4, were found in two other genomic regions. Locus-specific probes of unique flanking sequences from representative HSP70 clones were hybridized to restriction endonuclease-digested DNA from bovine-hamster and bovine-mouse somatic cell hybrid panels to determine the chromosomal location of the HSP70 sequences. The probe for the tandemly arrayed HSP70-1 and HSP70-2 sequences mapped to bovine chromosome 23, syntenic with glyoxalase 1, 21 steroid hydroxylase, and major histocompatibility class I loci. HSP70-3 sequences mapped to bovine chromosome 10, syntenic with nucleoside phosphorylase and murine osteosarcoma viral oncogene (v-fos), and HSP70-4 mapped to bovine syntenic group U6, syntenic with amylase 1 and phosphoglucomutase 1. On the basis of these data, the authors propose that bovine HSP70-1,2 are homologous to human HSPA1 and HSPA1L on chromosome 6p21.3, bovine HSP70-3 is the homolog of an unnamed human HSP70 gene on chromosome 14q22-q24, and bovine HSP70-4 is homologous to one of the human HSPA-6,-7 genes on chromosome 1. 34 refs., 2 figs., 1 tab.
DOE Office of Scientific and Technical Information (OSTI.GOV)
Wang, O.; Masters, C.; Lewis, M.B.
1994-09-01
In an 8-year-old girl and her father, both of whom have severe type III OI, we have previously used RNA/RNA hybrid analysis to demonstrate a mismatch in the region of α1(I) mRNA coding for aa 558-861. We used SSCP to further localize the abnormality to a subregion coding for aa 579-679. This region was subcloned and sequenced. Each patient's cDNA has a deletion of the sequences coding for the last residue of exon 34, and all of exons 35 and 36 (aa 604-639), followed by an insertion of 156 nt from the 3′-end of intron 36. PCR amplification of leukocyte DNA from the patients and the clinically normal paternal grandmother yielded two fragments: a 1007 bp fragment predicted from normal genomic sequences and a 445 bp fragment. Subcloning and sequencing of the shorter genomic PCR product confirmed the presence of a 565 bp genomic deletion from the end of exon 34 to the middle of intron 36. The abnormal protein is apparently synthesized and incorporated into helix. The inserted nucleotides are in frame with the collagenous sequence and contain no stop codons. They encode a 52 aa non-collagenous region. The fibroblast procollagen of the patients has both normal and electrophoretically delayed proα(I) bands. The electrophoretically delayed procollagen is very sensitive to pepsin or trypsin digestion, as predicted by its non-collagenous sequence, and cannot be visualized as collagen. This unique OI collagen mutation is an excellent candidate for molecular targeting to "turn off" a dominant mutant allele.
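The reading-frame arithmetic behind this abstract (a deletion spanning aa 604-639 and a 156-nt insertion encoding 52 aa) can be verified with a brief sketch. The helper names are hypothetical, and the net length change is a derived figure that is not stated in the abstract.

```python
def residues_in_range(first: int, last: int) -> int:
    """Number of residues in an inclusive amino acid range."""
    return last - first + 1


def inserted_residues(insert_nt: int) -> int:
    """Residues encoded by an insertion; raises if it would shift the frame."""
    if insert_nt % 3 != 0:
        raise ValueError("insertion is not a multiple of 3 and would shift the frame")
    return insert_nt // 3


deleted_aa = residues_in_range(604, 639)  # 36 residues removed
inserted_aa = inserted_residues(156)      # 52 residues added, frame preserved
print(deleted_aa, inserted_aa, inserted_aa - deleted_aa)  # 36 52 16
```

Because 156 is a multiple of 3, the insertion keeps the downstream sequence in frame, matching the 52 aa non-collagenous region described; the chain would be 16 residues longer than normal under these figures.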
Conservation and divergence of ADAM family proteins in the Xenopus genome
2010-01-01
Background Members of the disintegrin metalloproteinase (ADAM) family play important roles in cellular and developmental processes through their functions as proteases and/or binding partners for other proteins. The amphibian Xenopus has long been used as a model for early vertebrate development, but genome-wide analyses for large gene families were not possible until the recent completion of the X. tropicalis genome sequence and the availability of large scale expression sequence tag (EST) databases. In this study we carried out a systematic analysis of the X. tropicalis genome and uncovered several interesting features of ADAM genes in this species. Results Based on the X. tropicalis genome sequence and EST databases, we identified Xenopus orthologues of mammalian ADAMs and obtained full-length cDNA clones for these genes. The deduced protein sequences, synteny and exon-intron boundaries are conserved between most human and X. tropicalis orthologues. The alternative splicing patterns of certain Xenopus ADAM genes, such as adams 22 and 28, are similar to those of their mammalian orthologues. However, we were unable to identify an orthologue for ADAM7 or 8. The Xenopus orthologue of ADAM15, an active metalloproteinase in mammals, does not contain the conserved zinc-binding motif and is hence considered proteolytically inactive. We also found evidence for gain of ADAM genes in Xenopus as compared to other species. There is a homologue of ADAM10 in Xenopus that is missing in most mammals. Furthermore, a single scaffold of X. tropicalis genome contains four genes encoding ADAM28 homologues, suggesting genome duplication in this region. Conclusions Our genome-wide analysis of ADAM genes in X. tropicalis revealed both conservation and evolutionary divergence of these genes in this amphibian species. On the one hand, all ADAMs implicated in normal development and health in other species are conserved in X. tropicalis. 
On the other hand, some ADAM genes and ADAM protease activities are absent, while other novel ADAM proteins in this species are predicted by this study. The conservation and unique divergence of ADAM genes in Xenopus probably reflect the particular selective pressures these amphibian species faced during evolution. PMID:20630080
NASA Astrophysics Data System (ADS)
Shahrashoob, M.; Mohsenifar, A.; Tabatabaei, M.; Rahmani-Cherati, T.; Mobaraki, M.; Mota, A.; Shojaei, T. R.
2016-05-01
A novel optics-based nanobiosensor for sensitive determination of the Helicobacter pylori genome using a gold nanoparticles (AuNPs)-labeled probe is reported. Two specific thiol-modified capture and signal probes were designed based on a single-stranded complementary DNA (cDNA) region of the urease gene. The capture probe was immobilized on AuNPs, which were previously immobilized on an APTES-activated glass, and the signal probe was conjugated to different AuNPs as well. The presence of the cDNA in the reaction mixture led to the hybridization of the AuNPs-labeled capture probe and the signal probe with the cDNA, and consequently the optical density of the reaction mixture (AuNPs) was reduced proportionally to the cDNA concentration. The limit of detection was measured at 0.5 nM.
2013-01-01
identity to acetylcholinesterase mRNA sequences of Culex tritaeniorhynchus and Lutzomyia longipalpis, respectively. The P. papatasi cDNA ORF encoded a...tritaeniorhynchus and Lutzomyia longipalpis, respectively. The P. papatasi cDNA ORF encoded a 710-amino acid protein [GenBank: AFP20868] exhibiting 85...improve effectiveness of pesticide application for control of the new world sand fly Lutzomyia longipalpis in chicken sheds [13]. Attempts to control
Discovery of the "RNA continent" through a contrarian's research strategy.
Hayashizaki, Yoshihide
2011-01-01
The International Human Genome Sequencing Consortium completed the decoding of the human genome sequence in 2003. Readers will be aware of the paradigm shift which has occurred since then in the field of life science research. At last, mankind has been able to focus on a complete picture of the full extent of the genome, on which is recorded the basic information that controls all life. Meanwhile, another genome project, centered on Japan and known as the mouse genome encyclopedia project, was progressing with participation from around the world. Led by our research group at RIKEN, it was a full-length cDNA project which aimed to decode the whole RNA (transcriptome) using the mouse as a model. The basic information that controls all life is recorded on the genome, but in order to obtain a complete picture of this extensive information, the decoding of the genome alone is far from sufficient. These two genome projects established that the number of letters in the genome, which is the blueprint of life, is finite, that the number of RNA molecules derived from it is also finite, and that the number of protein molecules derived from the RNA is probably finite too. A massive number of combinations is still involved, but we are now able to understand one section of the network formed by these data. Once an object of study has been understood to be finite, establishing an image of the whole is certain to lead us to an understanding of the whole. Omics is an approach that views the information controlling life as finite and seeks to assemble and analyze it as a whole. Here, I would like to present our transcriptome research while making reference to our unique research strategy.
Dialynas, D P; Murre, C; Quertermous, T; Boss, J M; Leiden, J M; Seidman, J G; Strominger, J L
1986-01-01
Complementary DNA (cDNA) encoding a human T-cell gamma chain has been cloned and sequenced. At the junction of the variable and joining regions, there is an apparent deletion of two nucleotides in the human cDNA sequence relative to the murine gamma-chain cDNA sequence, resulting simultaneously in the generation of an in-frame stop codon and in a translational frameshift. For this reason, the sequence presented here encodes an aberrantly rearranged human T-cell gamma chain. There are several surprising differences between the deduced human and murine gamma-chain amino acid sequences. These include poor homology in the variable region, poor homology in a discrete segment of the constant region precisely bounded by the expected junctions of exon CII, and the presence in the human sequence of five potential sites for N-linked glycosylation. PMID:3458221
Xin, Haiping; Zhu, Wei; Wang, Lina; Xiang, Yue; Fang, Linchuan; Li, Jitao; Sun, Xiaoming; Wang, Nian; Londo, Jason P.; Li, Shaohua
2013-01-01
Grape is one of the most important fruit crops worldwide. The suitable geographical locations and productivity of grapes are largely limited by temperature. Vitis amurensis is a wild grapevine species with remarkable cold tolerance, exceeding that of Vitis vinifera, the dominant cultivated species of grapevine. However, the molecular mechanisms that contribute to the enhanced freezing tolerance of V. amurensis remain unknown. Here we used deep sequencing data from restriction endonuclease-generated cDNA fragments to evaluate genome-wide changes in the transcriptome of V. amurensis under cold treatment. Vitis vinifera cv. Muscat of Hamburg was used as a control to help investigate the distinctive features of V. amurensis in responding to cold stress. Approximately 9 million tags were sequenced from non-cold treatment (NCT) and cold treatment (CT) cDNA libraries in each species of grapevine sampled from shoot apices. Alignment of tags to the V. vinifera cv. Pinot noir (PN40024) annotated genome identified over 15,000 transcripts in each library in V. amurensis and more than 16,000 in Muscat of Hamburg. Comparative analysis between NCT and CT libraries indicates that V. amurensis has fewer differentially expressed genes (DEGs, 1314 transcripts) than Muscat of Hamburg (2307 transcripts) when exposed to cold stress. Common DEGs (408 transcripts) suggest that some genes play fundamental roles during cold stress in grapes. The most robust DEGs (more than 20-fold change) also demonstrated significant differences between the two kinds of grapevine, indicating that cold stress may trigger species-specific pathways in V. amurensis. Functional categories of DEGs indicated that up-regulated transcripts related to metabolism, transport, signal transduction and transcription were proportionally more abundant in V. amurensis. Several highly expressed transcripts that were found to accumulate uniquely in V. amurensis are discussed in detail. 
This subset of unique candidate transcripts may contribute to the excellent cold-hardiness of V. amurensis. PMID:23516547
Huang, Shengbing; Song, Wei; Lin, Qishui
2005-08-01
A membrane-bound protein was purified from rat liver mitochondria. After digestion with V8 protease, two peptides containing an identical 14-amino-acid sequence were obtained. Using a DNA sequence derived from this 14-amino-acid peptide as a gene-specific primer, the 5'-terminal and 3'-terminal regions of the corresponding cDNA were obtained by the RACE technique. The full-length cDNA, which encoded a protein of 616 amino acids that included the above-mentioned peptide sequence, was thus cloned. The full-length cDNA was highly homologous to that of human ETF-QO, indicating that it may be the cDNA of rat ETF-QO. ETF-QO is an iron-sulfur protein located in the mitochondrial inner membrane that contains two kinds of redox center: FAD and a [4Fe-4S] center. Comparison of the 616-amino-acid sequence deduced from the cDNA with that of the mature protein from rat liver mitochondria showed that the N-terminal 32 amino acid residues were absent from the mature protein, indicating that the cDNA was that of ETF-QOp. When the cDNA was expressed in Saccharomyces cerevisiae with inducible vectors, the protein product was enriched in the mitochondrial fraction and exhibited the electron transfer activity (NBT reductase activity) of ETF-QO. These results demonstrated that the 32-amino-acid peptide was a mitochondrial targeting peptide, and that both FAD and the iron-sulfur cluster were inserted properly into the expressed ETF-QO. ETF-QO was expressed at high levels in rat heart, liver and kidney. A GFP-ETF-QO fusion protein co-localized with mitochondria in COS-7 cells.
EdiPy: a resource to simulate the evolution of plant mitochondrial genes under the RNA editing.
Picardi, Ernesto; Quagliariello, Carla
2006-02-01
EdiPy is an online resource appropriately designed to simulate the evolution of plant mitochondrial genes in a biologically realistic fashion. EdiPy takes into account the presence of sites subjected to RNA editing and provides multiple artificial alignments corresponding to both genomic and cDNA sequences. Each artificial data set can successively be submitted to main and widespread evolutionary and phylogenetic software packages such as PAUP, Phyml, PAML and Phylip. As an online bioinformatic resource, EdiPy is available at the following web page: http://biologia.unical.it/py_script/index.html.
Tang, Qi; Ma, Xiaojun; Mo, Changming; Wilson, Iain W; Song, Cai; Zhao, Huan; Yang, Yanfang; Fu, Wei; Qiu, Deyou
2011-07-05
Siraitia grosvenorii (Luohanguo) is an herbaceous perennial plant native to southern China and most prevalent in Guilin city. Its fruit contains a sweet, fleshy, edible pulp that is widely used in traditional Chinese medicine. The major bioactive constituents in the fruit extract are the cucurbitane-type triterpene saponins known as mogrosides. Among them, mogroside V is nearly 300 times sweeter than sucrose. However, little is known about mogroside biosynthesis in S. grosvenorii, especially the late steps of the pathway. In this study, a cDNA library generated from equal amounts of RNA taken from S. grosvenorii fruit at 50 days after flowering (DAF) and 70 DAF was sequenced using the Illumina/Solexa platform. More than 48,755,516 high-quality reads were generated from the cDNA library. De novo assembly and gap-filling generated 43,891 unigenes with an average sequence length of 668 base pairs. A total of 26,308 (59.9%) unique sequences were annotated and 11,476 of the unique sequences were assigned to specific metabolic pathways by the Kyoto Encyclopedia of Genes and Genomes. cDNA sequences for all of the known enzymes involved in mogroside backbone synthesis were identified from our library. Additionally, a total of eighty-five cytochrome P450 (CYP450) and ninety UDP-glucosyltransferase (UDPG) unigenes were identified, some of which appear to encode enzymes responsible for the conversion of the mogroside backbone into the various mogrosides. Digital gene expression profile (DGE) analysis using Solexa sequencing was performed on three important stages of fruit development, and based on their expression pattern, seven CYP450s and five UDPGs were selected as the candidates most likely to be involved in mogroside biosynthesis. 
A combination of RNA-seq and DGE analysis based on the next generation sequencing technology was shown to be a powerful method for identifying candidate genes encoding enzymes responsible for the biosynthesis of novel secondary metabolites in a non-model plant. Seven CYP450s and five UDPGs were selected as potential candidates involved in mogrosides biosynthesis. The transcriptome data from this study provides an important resource for understanding the formation of major bioactive constituents in the fruit extract from S. grosvenorii.
Somatic hypermutation and junctional diversification at Ig heavy chain loci in the nurse shark.
Malecek, Karolina; Brandman, Julie; Brodsky, Jennie E; Ohta, Yuko; Flajnik, Martin F; Hsu, Ellen
2005-12-15
We estimate there are approximately 15 IgM H chain loci in the nurse shark genome and have characterized one locus. It consists of one V, two D, and one J germline gene segments, and the constant (C) region can be distinguished from all of the others by a unique combination of restriction endonuclease sites in Cmu2. On the basis of these Cmu2 markers, 22 cDNA clones were selected from an epigonal organ cDNA library from the same individual; their C region sequences proved to be the same up to the polyadenylation site. With the identification of the corresponding germline gene segments, CDR3 from shark H chain rearrangements could be analyzed precisely for the first time. Considerable diversity was generated by trimming and N addition at the three junctions and by varied recombination patterns of the two D gene segments. The cDNA sequences originated from independent rearrangement events, and most carried both single and contiguous substitutions. The 53 point mutations occurred with a bias for transition changes (53%), whereas the 78 tandem substitutions, mostly 2-4 bp long, did not (36%). The nature of the substitution patterns is the same as for mutants from six loci of two nurse shark L chain isotypes, showing that somatic hypermutation events are very similar at both H and L chain genes in this early vertebrate. The cis-regulatory elements targeting somatic hypermutation must have already existed in the ancestral Ig gene, before H and L chain divergence.
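The transition bias reported above can be illustrated with a minimal sketch. Of the 12 possible single-nucleotide changes, 4 are transitions (purine to purine or pyrimidine to pyrimidine), so a fraction near one third is what unbiased substitution would give. The function names and the toy substitution list below are hypothetical and are not taken from the study.

```python
# Illustrative sketch only: classify substitutions as transitions or
# transversions and report the transition fraction. Toy data, not study data.
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(ref: str, alt: str) -> bool:
    """Return True for A<->G or C<->T changes, False for transversions."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError(f"not a valid substitution: {ref}>{alt}")
    pair = {ref, alt}
    return pair <= PURINES or pair <= PYRIMIDINES


def transition_fraction(substitutions) -> float:
    """Fraction of (ref, alt) substitutions that are transitions."""
    subs = list(substitutions)
    if not subs:
        raise ValueError("no substitutions given")
    return sum(is_transition(r, a) for r, a in subs) / len(subs)


# Hypothetical example: two transitions and two transversions.
toy = [("A", "G"), ("C", "T"), ("A", "C"), ("G", "T")]
print(transition_fraction(toy))
```

Under this reading, the 53% reported for point mutations sits well above the unbiased expectation of about 33%, while the 36% for tandem substitutions is close to it.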
Evaluation of the arrestin gene in patients with retinitis pigmentosa or an allied disease
DOE Office of Scientific and Technical Information (OSTI.GOV)
DeStefano, D.J.; Berson, E.L.; Dryja, T.P.
1994-09-01
Arrestin, also called 48K protein or S-antigen, plays a role in deactivating rhodopsin, the photosensitive, seven-helix, G-protein receptor found in rod photoreceptors. In Drosophila, null mutations in arrestin genes cause a light-dependent photoreceptor degeneration. It is possible that a comparable photoreceptor degeneration in humans is caused by defects in the rod arrestin gene. In order to evaluate this possibility, we are characterizing the human arrestin locus on chromosome 2q. We screened a genomic library (5 million plaques) using an arrestin cDNA clone. Sixty-eight hybridizing clones were identified; portions of 7 clones were sequenced to determine the intron sequence flanking the exons. We are using SSCP analysis and direct genomic sequencing to screen the entire coding region, splice donor and acceptor sites, and the promoter region of the arrestin gene in 188 patients with autosomal dominant and 104 patients with autosomal recessive retinitis pigmentosa. We have already obtained flanking intron sequences necessary for SSCP analysis for 13 of 16 exons. So far, we have identified 4 silent base changes at codons 67 (TGC-to-TGT), 107 (CTG-to-CTC), 163 (GCC-to-GCT), and 288 (CTG-to-TGT), all with allele frequencies at 1% or less. Several other variant bands detected by SSCP analysis are currently being sequenced.
Zhang, Peipei; Liu, Yan; Liu, Wenwen; Cao, Mengji; Massart, Sebastien; Wang, Xifeng
2017-01-01
To identify the pathogens responsible for leaf yellowing symptoms on wheat samples collected from Jinan, China, we tested for the presence of three known barley/wheat yellow dwarf viruses (BYDV-GAV, -PAV, WYDV-GPV) (most likely pathogens) using RT-PCR. A sample that tested negative for the three viruses was selected for small RNA sequencing. Twenty-five million sequences were generated, among which 5% were of viral origin. A novel polerovirus was discovered and temporarily named wheat leaf yellowing-associated virus (WLYaV). The full genome of WLYaV corresponds to 5,772 nucleotides (nt), with six AUG-initiated open reading frames, one non-AUG-initiated open reading frame, and three untranslated regions, showing typical features of the family Luteoviridae. Sequence comparison and phylogenetic analyses suggested that WLYaV had the closest relationship with sugarcane yellow leaf virus (ScYLV), but the identities of full genomic nucleotides and deduced amino acid sequence of coat protein (CP) were 64.9 and 86.2%, respectively, below the species demarcation thresholds (90%) in the family Luteoviridae. Furthermore, agroinoculation of Nicotiana benthamiana leaves with a cDNA clone of WLYaV caused yellowing symptoms on the plant. Our study adds a new polerovirus that is associated with wheat leaf yellowing disease, which would help to identify and control pathogens of wheat. PMID:28932215
Ribosomal protein S14 transcripts are edited in Oenothera mitochondria.
Schuster, W; Unseld, M; Wissinger, B; Brennicke, A
1990-01-01
The gene encoding ribosomal protein S14 (rps14) in Oenothera mitochondria is located upstream of the cytochrome b gene (cob). Sequence analysis of independently derived cDNA clones covering the entire rps14 coding region shows two nucleotides edited from the genomic DNA to the mRNA derived sequences by C to U modifications. A third editing event occurs four nucleotides upstream of the AUG initiation codon and improves a potential ribosome binding site. A CGG codon specifying arginine in a position conserved in evolution between chloroplasts and E. coli as a UGG tryptophan codon is not edited in any of the cDNAs analysed. An inverted repeat 3' of an unidentified open reading frame is located upstream of the rps14 gene. The inverted repeat sequence is highly conserved at analogous regions in other Oenothera mitochondrial loci. PMID:2326162
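Detecting C to U editing by comparing genomic and cDNA sequences, as described above, can be sketched as follows. This is a simplified illustration that assumes the two sequences are already aligned without gaps and have equal length; the function name and the toy sequences are hypothetical and are not the rps14 sequences.

```python
# Illustrative sketch only: list positions where a genomic C is read as T
# (the cDNA copy of an edited U) or as U. Assumes a gap-free alignment.
def find_c_to_u_edits(genomic: str, cdna: str) -> list[int]:
    """Return 1-based positions where genomic C corresponds to T/U in cDNA."""
    if len(genomic) != len(cdna):
        raise ValueError("sequences must be aligned and of equal length")
    edits = []
    for pos, (g, c) in enumerate(zip(genomic.upper(), cdna.upper()), start=1):
        if g == "C" and c in ("T", "U"):
            edits.append(pos)
    return edits


# Hypothetical toy sequences: the genomic CGG codon (arginine) at
# positions 4-6 appears as TGG (tryptophan) in the cDNA.
genomic_toy = "ATGCGGTCA"
cdna_toy = "ATGTGGTTA"
print(find_c_to_u_edits(genomic_toy, cdna_toy))
```

Comparing several independently derived cDNA clones against the same genomic sequence, as the authors did, helps separate consistent editing sites from cloning or sequencing errors.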
Ammayappan, Arun; Vakharia, Vikram N
2009-01-01
Background Viral hemorrhagic septicemia virus (VHSV) causes a highly contagious disease of fresh and saltwater fish worldwide. VHSV has caused several large-scale fish kills in the Great Lakes area and has been found in 28 different host species. The emergence of VHS in the Great Lakes began with the isolation of VHSV from a diseased muskellunge (Esox masquinongy) caught from Lake St. Clair in 2003. VHSV is a member of the genus Novirhabdovirus, within the family Rhabdoviridae. It has a linear single-stranded, negative-sense RNA genome of approximately 11 kbp, with six genes. VHSV replicates in the cytoplasm and produces six monocistronic mRNAs. The gene order of VHSV is 3'-N-P-M-G-NV-L-5'. This study describes the molecular characterization of the Great Lakes VHSV strain (MI03GL) and its phylogenetic relationships with selected European and North American isolates. Results The complete genomic sequence of the VHSV-MI03GL strain was determined from cloned cDNA of six overlapping fragments, obtained by RT-PCR amplification of genomic RNA. The complete genome sequence of MI03GL comprises 11,184 nucleotides (GenBank GQ385941) with the gene order of 3'-N-P-M-G-NV-L-5'. These genes are separated by conserved gene junctions, with di-nucleotide gene spacers. The first 4 nucleotides at the termini of the VHSV genome are complementary and identical to the genomic termini of other novirhabdoviruses. Sequence homology and phylogenetic analysis show that the Great Lakes virus is closely related to the Japanese strains JF00Ehi1 (96%) and KRRV9822 (95%). Among other novirhabdoviruses, VHSV shares the highest sequence homology (62%) with snakehead rhabdovirus. Conclusion A phylogenetic tree obtained by comparing 48 glycoprotein gene sequences of different VHSV strains demonstrates that the Great Lakes VHSV is closely related to the North American and Japanese genotype IVa, but forms a distinct genotype IVb, which is clearly different from the three European genotypes. 
Molecular characterization of the Great Lakes isolate will be helpful in studying the pathogenesis of VHSV using a reverse genetics approach and developing efficient control strategies. PMID:19852863
Comparative analyses of two Geraniaceae transcriptomes using next-generation sequencing.
Zhang, Jin; Ruhlman, Tracey A; Mower, Jeffrey P; Jansen, Robert K
2013-12-29
Organelle genomes of Geraniaceae exhibit several unusual evolutionary phenomena compared to other angiosperm families including accelerated nucleotide substitution rates, widespread gene loss, reduced RNA editing, and extensive genomic rearrangements. Since most organelle-encoded proteins function in multi-subunit complexes that also contain nuclear-encoded proteins, it is likely that the atypical organellar phenomena affect the evolution of nuclear genes encoding organellar proteins. To begin to unravel the complex co-evolutionary interplay between organellar and nuclear genomes in this family, we sequenced nuclear transcriptomes of two species, Geranium maderense and Pelargonium x hortorum. Normalized cDNA libraries of G. maderense and P. x hortorum were used for transcriptome sequencing. Five assemblers (MIRA, Newbler, SOAPdenovo, SOAPdenovo-trans [SOAPtrans], Trinity) and two next-generation technologies (454 and Illumina) were compared to determine the optimal transcriptome sequencing approach. Trinity provided the highest quality assembly of Illumina data with the deepest transcriptome coverage. An analysis to determine the amount of sequencing needed for de novo assembly revealed diminishing returns of coverage and quality with data sets larger than sixty million Illumina paired end reads for both species. The G. maderense and P. x hortorum transcriptomes contained fewer transcripts encoding the PLS subclass of PPR proteins relative to other angiosperms, consistent with reduced mitochondrial RNA editing activity in Geraniaceae. In addition, transcripts for all six plastid targeted sigma factors were identified in both transcriptomes, suggesting that one of the highly divergent rpoA-like ORFs in the P. x hortorum plastid genome is functional. The findings support the use of the Illumina platform and assemblers optimized for transcriptome assembly, such as Trinity or SOAPtrans, to generate high-quality de novo transcriptomes with broad coverage. 
In addition, results indicated no major improvements in breadth of coverage with data sets larger than six billion nucleotides or when sampling RNA from four tissue types rather than from a single tissue. Finally, this work demonstrates the power of cross-compartmental genomic analyses to deepen our understanding of the correlated evolution of the nuclear, plastid, and mitochondrial genomes in plants. PMID:24373163
Hirotani, M; Kuroda, R; Suzuki, H; Yoshikawa, T
2000-05-01
A cDNA encoding UDP-glucose: baicalein 7-O-glucosyltransferase (UBGT) was isolated from a cDNA library from hairy root cultures of Scutellaria baicalensis Georgi probed with a partial-length cDNA clone of a UDP-glucose: flavonoid 3-O-glucosyltransferase (UFGT) from grape (Vitis vinifera L.). The heterologous probe contained a glucosyltransferase consensus amino acid sequence which was also present in the Scutellaria cDNA clones. The complete nucleotide sequence of the 1688-bp cDNA insert was determined and the deduced amino acid sequences are presented. The nucleotide sequence analysis of UBGT revealed an open reading frame encoding a polypeptide of 476 amino acids with a calculated molecular mass of 53,094 Da. The reaction product for baicalein and UDP-glucose catalyzed by recombinant UBGT in Escherichia coli was identified as authentic baicalein 7-O-glucoside using high-performance liquid chromatography and proton nuclear magnetic resonance spectroscopy. The enzyme activities of recombinant UBGT expressed in E. coli were also detected towards flavonoids such as baicalein, wogonin, apigenin, scutellarein, 7,4'-dihydroxyflavone and kaempferol, and phenolic compounds. The accumulation of UBGT mRNA in hairy roots was in response to wounding or salicylic acid treatments.
Elrobh, Mohamed S.; Alanazi, Mohammad S.; Khan, Wajahatullah; Abduljaleel, Zainularifeen; Al-Amri, Abdullah; Bazzi, Mohammad D.
2011-01-01
Heat shock proteins are ubiquitous, induced under a number of environmental and metabolic stresses, with highly conserved DNA sequences among mammalian species. Camelus dromedarius (the Arabian camel), domesticated under semi-desert environments, is well adapted to tolerate and survive severe drought and high temperatures for extended periods. This is the first report of molecular cloning and characterization of a full-length cDNA encoding a putative stress-induced heat shock HSPA6 protein (also called HSP70B′) from the Arabian camel. A full-length cDNA (2417 bp) was obtained by rapid amplification of cDNA ends (RACE) and cloned in the pET-b expression vector. Sequence analysis of the HSPA6 gene showed a 1932-bp open reading frame encoding 643 amino acids. The complete cDNA sequence of the Arabian camel HSPA6 gene was submitted to NCBI GenBank (accession number HQ214118.1). BLAST analysis indicated that the C. dromedarius HSPA6 nucleotide sequence shared high similarity (77–91%) with heat shock gene nucleotide sequences of other mammals. The deduced 643-amino-acid sequence (accession number ADO12067.1) showed that the predicted protein has an estimated molecular weight of 70.5 kDa with a predicted isoelectric point (pI) of 6.0. Comparative analyses of the camel HSPA6 protein sequence with other mammalian heat shock proteins (HSPs) showed high identity (80–94%). The camel HSPA6 protein structure predicted by protein 3D structural analysis showed high similarity to human and mouse HSPs. Taken together, this study indicates that the cDNA sequence of the HSPA6 gene and its amino acid sequence and protein structure from the Arabian camel are highly conserved and similar to those of other mammalian species. PMID:21845074
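The relationship between the reported open reading frame length and protein length can be checked with a short sketch. It assumes, as the numbers suggest but the abstract does not state, that the 1932 bp figure includes the stop codon; the function name is hypothetical.

```python
# Illustrative sketch only: convert an ORF length in base pairs to the
# number of encoded amino acids.
def orf_to_residues(orf_bp: int, includes_stop: bool = True) -> int:
    """Number of amino acids encoded by an ORF of orf_bp base pairs.

    Assumption: when includes_stop is True, the final codon is a stop
    codon and does not contribute a residue.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# 1932 bp = 644 codons; minus the stop codon gives 643 residues,
# matching the protein length stated in the abstract.
print(orf_to_residues(1932))
```

Dividing the estimated 70.5 kDa by 643 residues gives roughly 110 Da per residue, which is in the range usually expected for an average amino acid residue.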
Eberwine, James; Bartfai, Tamas
2011-01-01
We report on an ‘unbiased’ molecular characterization of individual adult neurons active in a central, anterior hypothalamic neuronal circuit, by establishing cDNA libraries from each individual, electrophysiologically identified warm-sensitive neuron (WSN). The cDNA libraries were analyzed by Affymetrix microarray. The presence and frequency of cDNAs were confirmed and enhanced with Illumina sequencing of each single-cell cDNA library. cDNAs encoding the GABA biosynthetic enzyme GAD1 and adrenomedullin, galanin, prodynorphin, somatostatin, and tachykinin were found in the WSNs. The functional cellular and in vivo studies on dozens of the more than 500 neurotransmitter receptors, hormone receptors and ion channels whose cDNA was identified and sequence-confirmed suggest little or no discrepancy between the transcriptional and functional data in WSNs; whenever agonists were available for a receptor whose cDNA was identified, a functional response was found. Sequencing single-neuron libraries permitted identification of rarely expressed receptors such as the insulin receptor and adiponectin receptor 2, and of receptor heterodimers; this information is lost when cells are pooled, which dilutes and mixes signals. Despite the common electrophysiological phenotype and uniform GAD1 expression, WSN transcriptomes show heterogeneity, suggesting strong epigenetic influence on the transcriptome. Our study suggests that it is well worth interrogating the cDNA libraries of single neurons by sequencing and chipping. PMID:20970451
Shiroguchi, Katsuyuki; Jia, Tony Z.; Sims, Peter A.; Xie, X. Sunney
2012-01-01
RNA sequencing (RNA-Seq) is a powerful tool for transcriptome profiling, but is hampered by sequence-dependent bias and inaccuracy at low copy numbers intrinsic to exponential PCR amplification. We developed a simple strategy for mitigating these complications, allowing truly digital RNA-Seq. Following reverse transcription, a large set of barcode sequences is added in excess, and nearly every cDNA molecule is uniquely labeled by random attachment of barcode sequences to both ends. After PCR, we applied paired-end deep sequencing to read the two barcodes and cDNA sequences. Rather than counting the number of reads, RNA abundance is measured based on the number of unique barcode sequences observed for a given cDNA sequence. We optimized the barcodes to be unambiguously identifiable, even in the presence of multiple sequencing errors. This method allows counting with single-copy resolution despite sequence-dependent bias and PCR-amplification noise, and is analogous to digital PCR but amenable to quantifying a whole transcriptome. We demonstrated transcriptome profiling of Escherichia coli with more accurate and reproducible quantification than conventional RNA-Seq. PMID:22232676
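The counting principle described above, in which abundance is the number of distinct barcode pairs rather than the number of reads, can be sketched as follows. This is a simplified illustration, not the authors' pipeline: it assumes each read pair has already been assigned to a cDNA and that both barcodes have already been decoded without error. The function names and toy reads are hypothetical.

```python
# Illustrative sketch only: digital counting by unique barcode pairs,
# contrasted with conventional read counting. Toy data, not study data.
from collections import defaultdict


def digital_counts(reads):
    """Count distinct (5' barcode, 3' barcode) pairs per cDNA.

    reads: iterable of (cdna_id, barcode_5p, barcode_3p) tuples,
    one tuple per sequenced read pair.
    """
    seen = defaultdict(set)
    for cdna_id, bc5, bc3 in reads:
        seen[cdna_id].add((bc5, bc3))
    return {cdna_id: len(pairs) for cdna_id, pairs in seen.items()}


def read_counts(reads):
    """Conventional counting: number of reads per cDNA."""
    counts = defaultdict(int)
    for cdna_id, _, _ in reads:
        counts[cdna_id] += 1
    return dict(counts)


# Hypothetical reads: geneA has two original molecules, one of which
# was amplified into three reads by PCR.
toy_reads = [
    ("geneA", "AAC", "GTT"),
    ("geneA", "AAC", "GTT"),
    ("geneA", "AAC", "GTT"),
    ("geneA", "CCA", "TGG"),
    ("geneB", "GGT", "ACA"),
]
print(read_counts(toy_reads))
print(digital_counts(toy_reads))
```

In the toy data, read counting reports four copies of geneA while barcode counting reports two, which shows how amplification duplicates drop out. The error-tolerant barcode design mentioned in the abstract would sit upstream of this step and is not modeled here.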
Cloning and High-Level Expression of α-Galactosidase cDNA from Penicillium purpurogenum
Shibuya, Hajime; Nagasaki, Hiroaki; Kaneko, Satoshi; Yoshida, Shigeki; Park, Gwi Gun; Kusakabe, Isao; Kobayashi, Hideyuki
1998-01-01
The cDNA coding for Penicillium purpurogenum α-galactosidase (αGal) was cloned and sequenced. The deduced amino acid sequence of the α-Gal cDNA showed that the mature enzyme consisted of 419 amino acid residues with a molecular mass of 46,334 Da. The derived amino acid sequence of the enzyme showed similarity to eukaryotic αGals from plants, animals, yeasts, and filamentous fungi. The highest similarity observed (57% identity) was to Trichoderma reesei AGLI. The cDNA was expressed in Saccharomyces cerevisiae under the control of the yeast GAL10 promoter. Almost all of the enzyme produced was secreted into the culture medium, and the expression level reached was approximately 0.2 g/liter. The recombinant enzyme purified to homogeneity was highly glycosylated, showed slightly higher specific activity, and exhibited properties almost identical to those of the native enzyme from P. purpurogenum in terms of the N-terminal amino acid sequence, thermoactivity, pH profile, and mode of action on galacto-oligosaccharides. PMID:9797312
Cloning and sequence analysis of a cDNA clone coding for the mouse GM2 activator protein.
Bellachioma, G; Stirling, J L; Orlacchio, A; Beccari, T
1993-01-01
A cDNA (1.1 kb) containing the complete coding sequence for the mouse GM2 activator protein was isolated from a mouse macrophage library using a cDNA for the human protein as a probe. There was a single ATG located 12 bp from the 5' end of the cDNA clone followed by an open reading frame of 579 bp. Northern blot analysis of mouse macrophage RNA showed that there was a single band with a mobility corresponding to a size of 2.3 kb. We deduce from this that the mouse mRNA, in common with the mRNA for the human GM2 activator protein, has a long 3' untranslated sequence of approx. 1.7 kb. Alignment of the mouse and human deduced amino acid sequences showed 68% identity overall and 75% identity for the sequence on the C-terminal side of the first 31 residues, which in the human GM2 activator protein contains the signal peptide. Hydropathicity plots showed great similarity between the mouse and human sequences even in regions of low sequence similarity. There is a single N-glycosylation site in the mouse GM2 activator protein sequence (Asn151-Phe-Thr) which differs in its location from the single site reported in the human GM2 activator protein sequence (Asn63-Val-Thr).
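The deduction of a 3' untranslated sequence of about 1.7 kb follows from simple subtraction, sketched below. The calculation assumes that the 12 bp upstream of the ATG represents the whole 5' untranslated region and ignores the poly(A) tail, so it is a rough estimate only; the function name is hypothetical.

```python
# Illustrative sketch only: estimate 3' UTR length from the Northern blot
# transcript size, the 5' leader and the ORF length.
def estimate_3utr_bp(mrna_bp: int, five_prime_bp: int, orf_bp: int) -> int:
    """Approximate 3' untranslated length as mRNA minus 5' leader minus ORF."""
    utr = mrna_bp - five_prime_bp - orf_bp
    if utr < 0:
        raise ValueError("5' leader plus ORF exceeds the transcript length")
    return utr


# 2.3 kb transcript, 12 bp before the ATG, 579 bp ORF.
print(estimate_3utr_bp(2300, 12, 579))
```

The result of 1709 bp agrees with the approximately 1.7 kb stated in the abstract. The same subtraction also suggests that the 1.1 kb clone does not cover the full 3' untranslated sequence.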
Primary analysis of repeat elements of the Asian seabass (Lates calcarifer) transcriptome and genome
Kuznetsova, Inna S.; Thevasagayam, Natascha M.; Sridatta, Prakki S. R.; Komissarov, Aleksey S.; Saju, Jolly M.; Ngoh, Si Y.; Jiang, Junhui; Shen, Xueyan; Orbán, László
2014-01-01
As part of our Asian seabass genome project, we are generating an inventory of repeat elements in the genome and transcriptome. The karyotype showed a diploid number of 2n = 24 chromosomes with a variable number of B-chromosomes. The transcriptome and genome of Asian seabass were searched for repetitive elements with experimental and bioinformatics tools. Six different types of repeats constituting 8–14% of the genome were characterized. Repetitive elements were clustered in the pericentromeric heterochromatin of all chromosomes, but some of them were preferentially accumulated in pretelomeric and pericentromeric regions of several chromosome pairs and have chromosome-specific arrangements. From the dispersed class of fish-specific non-LTR retrotransposon elements, Rex1 and MAUI-like repeats were analyzed. They were widespread in both the genome and the transcriptome and accumulated in the pericentromeric and peritelomeric areas of all chromosomes. Every analyzed repeat was represented in the Asian seabass transcriptome, and some showed differential expression between the gonads. The other group of repeats analyzed belongs to the rRNA multigene family. The FISH signal for 5S rDNA was located on a single pair of chromosomes, whereas that for 18S rDNA was found on two pairs. A BAC-derived contig containing rDNA was sequenced and assembled into a scaffold containing incomplete fragments of 18S rDNA. Their assembly and chromosomal position revealed that this part of the Asian seabass genome is extremely rich in repeats containing evolutionarily conserved and novel sequences. In summary, transcriptome assemblies and cDNA data are suitable for the identification of repetitive DNA from unknown genomes and for comparative investigation of conserved elements between teleosts and other vertebrates. PMID:25120555
Réfega, Susana; Girard-Misguich, Fabienne; Bourdieu, Christiane; Péry, Pierre; Labbé, Marie
2003-04-02
Specific antibodies were produced ex vivo from intestinal culture of Eimeria tenella infected chickens. The specificity of these intestinal antibodies was tested against different parasite stages. These antibodies were used to immunoscreen first generation schizont and sporozoite cDNA libraries permitting the identification of new E. tenella antigens. We obtained a total of 119 cDNA clones which were subjected to sequence analysis. The sequences coding for the proteins inducing local immune responses were compared with nucleotide or protein databases and with expressed sequence tags (ESTs) databases. We identified new Eimeria genes coding for heat shock proteins, a ribosomal protein, a pyruvate kinase and a pyridoxine kinase. Specific features of other sequences are discussed.
Full-Genome Sequencing as a Basis for Molecular Epidemiology Studies of Bluetongue Virus in India
Maan, Sushila; Maan, Narender S.; Belaganahalli, Manjunatha N.; Rao, Pavuluri Panduranga; Singh, Karam Pal; Hemadri, Divakar; Putty, Kalyani; Kumar, Aman; Batra, Kanisht; Krishnajyothi, Yadlapati; Chandel, Bharat S.; Reddy, G. Hanmanth; Nomikou, Kyriaki; Reddy, Yella Narasimha; Attoui, Houssam; Hegde, Nagendra R.; Mertens, Peter P. C.
2015-01-01
Since 1998 there have been significant changes in the global distribution of bluetongue virus (BTV). Ten previously exotic BTV serotypes have been detected in Europe, causing severe disease outbreaks in naïve ruminant populations. Previously exotic BTV serotypes were also identified in the USA, Israel, Australia and India. BTV is transmitted by biting midges (Culicoides spp.), and changes in the distribution of vector species, climate change, and increased international travel and trade are thought to have contributed to these events. Thirteen BTV serotypes have been isolated in India since first reports of the disease in the country during 1964. Efficient methods for preparation of viral dsRNA and cDNA synthesis have facilitated full-genome sequencing of BTV strains from the region. These studies introduce a new approach for BTV characterization, based on full-genome sequencing and phylogenetic analyses, facilitating the identification of BTV serotype, topotype and reassortant strains. Phylogenetic analyses show that most of the equivalent genome-segments of Indian BTV strains are closely related, clustering within a major eastern BTV ‘topotype’. However, genome-segment 5 (Seg-5) encoding NS1, from multiple post-1982 Indian isolates, originated from a western BTV topotype. All ten genome-segments of BTV-2 isolates (IND2003/01, IND2003/02 and IND2003/03) are closely related (>99% identity) to a South African BTV-2 vaccine-strain (western topotype). Similarly, BTV-10 isolates (IND2003/06; IND2005/04) show >99% identity in all genome segments to the prototype BTV-10 (CA-8) strain from the USA. These data suggest repeated introductions of western BTV field and/or vaccine-strains into India, potentially linked to animal or vector-insect movements, or unauthorised use of ‘live’ South African or American BTV-vaccines in the country. The data presented will help improve nucleic acid based diagnostics for Indian serotypes/topotypes, as part of control strategies. PMID:26121128
Gene Patents and Personalized Cancer Care: Impact of the Myriad Case on Clinical Oncology
Offit, Kenneth; Bradbury, Angela; Storm, Courtney; Merz, Jon F.; Noonan, Kevin E.; Spence, Rebecca
2013-01-01
Genomic discoveries have transformed the practice of oncology and cancer prevention. Diagnostic and therapeutic advances based on cancer genomics developed during a time when it was possible to patent genes. A case before the Supreme Court, Association for Molecular Pathology v Myriad Genetics, Inc seeks to overturn patents on isolated genes. Although the outcomes are uncertain, it is suggested here that the Supreme Court decision will have few immediate effects on oncology practice or research but may have more significant long-term impact. The Federal Circuit court has already rejected Myriad's broad diagnostic methods claims, and this is not affected by the Supreme Court decision. Isolated DNA patents were already becoming obsolete on scientific grounds, in an era when human DNA sequence is public knowledge and because modern methods of next-generation sequencing need not involve isolated DNA. The Association for Molecular Pathology v Myriad Supreme Court decision will have limited impact on new drug development, as new drug patents usually involve cellular methods. A nuanced Supreme Court decision acknowledging the scientific distinction between synthetic cDNA and genomic DNA will further mitigate any adverse impact. A Supreme Court decision to include or exclude all types of DNA from patent eligibility could impact future incentives for genomic discovery as well as the future delivery of medical care. Whatever the outcome of this important case, it is important that judicial and legislative actions in this area maximize genomic discovery while also ensuring patients' access to personalized cancer care. PMID:23766521
Identification of HIV Mutation as Diagnostic Biomarker through Next Generation Sequencing.
Shaw, Wen Hui; Lin, Qianqian; Muhammad, Zikry Zhiwei Bin Roslee; Lee, Jia Jun; Khong, Wei Xin; Ng, Oon Tek; Tan, Eng Lee; Li, Peng
2016-07-01
Current clinical detection of human immunodeficiency virus 1 (HIV-1) targets viral genes and proteins. However, conventional assays such as immunoassay, viral culture or polymerase chain reaction (PCR) lack diagnostic accuracy, because they rely on a stable genome and HIV-1 is a highly mutated virus. Next generation sequencing (NGS) promises to be transformative for the practice of infectious disease, and the rapidly falling cost and processing time mean that it will become a feasible technology in diagnostic and research laboratories in the near future. The technology offers superior sensitivity for detecting pathogenic viruses, including unknown and unexpected strains. The aim of this study was to leverage NGS technology to improve current HIV-1 diagnosis and genotyping methods. Ten blood samples were collected from HIV-1 infected patients who were diagnosed by RT-PCR at the Singapore Communicable Disease Centre, Tan Tock Seng Hospital, from October 2014 to March 2015. Viral RNAs were extracted from blood plasma and reverse transcribed into cDNA. The HIV-1 cDNA samples were cleaned up using a PCR purification kit, and the sequencing library was prepared and sequenced using MiSeq. Two common mutations were observed in all ten samples. They were identified at genome locations 1908 and 2104 as missense and silent mutations, respectively, corresponding to S37N and S3S in the aspartic protease and reverse transcriptase subunits. The common mutations identified in this study were not previously reported, suggesting that they could potentially be used for identification of viral infection, disease transmission and drug resistance. This was especially the case for the missense mutation S37N, which could cause an amino acid change in the viral protease and thus reduce the binding affinity of some protease inhibitors.
Thus, the unique common mutations identified in this study could be used as diagnostic biomarkers to indicate the origin of infection as being from Singapore.
A deletion mutation at the ep locus causes low seed coat peroxidase activity in soybean.
Gijzen, M
1997-11-01
The Ep locus severely affects the amount of peroxidase enzyme in soybean seed coats. Plants containing the dominant Ep allele accumulate large amounts of peroxidase in the hourglass cells of the sub-epidermis. Homozygous recessive epep genotypes do not accumulate peroxidase in the hourglass cells and are much reduced in total seed coat peroxidase activity. To isolate the gene encoding the seed coat peroxidase and to determine whether it corresponds to the Ep locus, a cDNA library was constructed from developing seed coats and an abundant 1.3 kb peroxidase transcript was cloned. The corresponding structural gene was also isolated from a genomic library. Sequence analysis shows that the seed coat peroxidase is translated as a 352 amino acid precursor protein of 38 kDa. Processing of a putative 26 amino acid signal sequence results in a mature protein of 326 residues with a calculated mass of 35 kDa and a pI of 4.4. Using probes derived from the cDNA, genomic DNA blot hybridization and polymerase chain reaction analysis detected polymorphisms that distinguished EpEp and epep genotypes. Co-segregation of the polymorphisms in an F2 population from a cross of EpEp and epep plants shows that the Ep locus encodes the seed coat peroxidase protein. Comparison of Ep and ep alleles indicates that the recessive gene lacks 87 bp of sequence encompassing the translation start codon. Analysis by RNA blot hybridization shows that epep plants have drastically reduced amounts of peroxidase transcript compared with EpEp plants. The peroxidase mRNA is abundant in seed coat tissues of EpEp plants during the late stages of seed maturation, and could also be detected in root tissues, but not in the flower, embryo, pod or leaf. The results indicate that the lack of peroxidase accumulation in seed coats of homozygous recessive epep plants is due to a mutation of the structural gene that reduces transcript abundance.
Itoh, Takeshi; Tanaka, Tsuyoshi; Barrero, Roberto A.; Yamasaki, Chisato; Fujii, Yasuyuki; Hilton, Phillip B.; Antonio, Baltazar A.; Aono, Hideo; Apweiler, Rolf; Bruskiewich, Richard; Bureau, Thomas; Burr, Frances; Costa de Oliveira, Antonio; Fuks, Galina; Habara, Takuya; Haberer, Georg; Han, Bin; Harada, Erimi; Hiraki, Aiko T.; Hirochika, Hirohiko; Hoen, Douglas; Hokari, Hiroki; Hosokawa, Satomi; Hsing, Yue; Ikawa, Hiroshi; Ikeo, Kazuho; Imanishi, Tadashi; Ito, Yukiyo; Jaiswal, Pankaj; Kanno, Masako; Kawahara, Yoshihiro; Kawamura, Toshiyuki; Kawashima, Hiroaki; Khurana, Jitendra P.; Kikuchi, Shoshi; Komatsu, Setsuko; Koyanagi, Kanako O.; Kubooka, Hiromi; Lieberherr, Damien; Lin, Yao-Cheng; Lonsdale, David; Matsumoto, Takashi; Matsuya, Akihiro; McCombie, W. Richard; Messing, Joachim; Miyao, Akio; Mulder, Nicola; Nagamura, Yoshiaki; Nam, Jongmin; Namiki, Nobukazu; Numa, Hisataka; Nurimoto, Shin; O’Donovan, Claire; Ohyanagi, Hajime; Okido, Toshihisa; OOta, Satoshi; Osato, Naoki; Palmer, Lance E.; Quetier, Francis; Raghuvanshi, Saurabh; Saichi, Naomi; Sakai, Hiroaki; Sakai, Yasumichi; Sakata, Katsumi; Sakurai, Tetsuya; Sato, Fumihiko; Sato, Yoshiharu; Schoof, Heiko; Seki, Motoaki; Shibata, Michie; Shimizu, Yuji; Shinozaki, Kazuo; Shinso, Yuji; Singh, Nagendra K.; Smith-White, Brian; Takeda, Jun-ichi; Tanino, Motohiko; Tatusova, Tatiana; Thongjuea, Supat; Todokoro, Fusano; Tsugane, Mika; Tyagi, Akhilesh K.; Vanavichit, Apichart; Wang, Aihui; Wing, Rod A.; Yamaguchi, Kaori; Yamamoto, Mayu; Yamamoto, Naoyuki; Yu, Yeisoo; Zhang, Hao; Zhao, Qiang; Higo, Kenichi; Burr, Benjamin; Gojobori, Takashi; Sasaki, Takuji
2007-01-01
We present here the annotation of the complete genome of rice Oryza sativa L. ssp. japonica cultivar Nipponbare. All functional annotations for proteins and non-protein-coding RNA (npRNA) candidates were manually curated. Functions were identified or inferred in 19,969 (70%) of the proteins, and 131 possible npRNAs (including 58 antisense transcripts) were found. Almost 5000 annotated protein-coding genes were found to be disrupted in insertional mutant lines, which will accelerate future experimental validation of the annotations. The rice loci were determined by using cDNA sequences obtained from rice and other representative cereals. Our conservative estimate based on these loci and an extrapolation suggested that the gene number of rice is ∼32,000, which is smaller than previous estimates. We conducted comparative analyses between rice and Arabidopsis thaliana and found that both genomes possessed several lineage-specific genes, which might account for the observed differences between these species, while they had similar sets of predicted functional domains among the protein sequences. A system to control translational efficiency seems to be conserved across large evolutionary distances. Moreover, the evolutionary process of protein-coding genes was examined. Our results suggest that natural selection may have played a role for duplicated genes in both species, so that duplication was suppressed or favored in a manner that depended on the function of a gene. PMID:17210932
A putative peroxidase cDNA from turnip and analysis of the encoded protein sequence.
Romero-Gómez, S; Duarte-Vázquez, M A; García-Almendárez, B E; Mayorga-Martínez, L; Cervantes-Avilés, O; Regalado, C
2008-12-01
A putative peroxidase cDNA was isolated from turnip roots (Brassica napus L. var. purple top white globe) by reverse transcriptase-polymerase chain reaction (RT-PCR) and rapid amplification of cDNA ends (RACE). Total RNA extracted from mature turnip roots was used as a template for RT-PCR, using a degenerate primer designed to amplify the highly conserved distal motif of plant peroxidases. The resulting partial sequence was used to design the remaining specific primers for 5' and 3' RACE. Two cDNA fragments were purified, sequenced, and aligned with the partial sequence from RT-PCR, and a complete overlapping sequence was obtained and labeled as BnPA (GenBank Accession No. AY423440, named as podC). The full-length cDNA is 1167 bp long and contains a 1077 bp open reading frame (ORF) encoding a deduced peroxidase polypeptide of 358 amino acids. The putative peroxidase (BnPA) showed a calculated Mr of 34 kDa and an isoelectric point (pI) of 4.5, with no significant identity with other reported turnip peroxidases. Sequence alignment showed that only three peroxidases have a significant identity with BnPA, namely AtP29a (84%) and AtPA2 (81%) from Arabidopsis thaliana, and HRPA2 (82%) from horseradish (Armoracia rusticana). Work is in progress to clone this gene into an adequate host to study the specific role and possible biotechnological applications of this alternative peroxidase source.
Characterization and chromosomal mapping of the human TFG gene involved in thyroid carcinoma
DOE Office of Scientific and Technical Information (OSTI.GOV)
Mencinger, M.; Panagopoulos, I.; Andreasson, P.
1997-05-01
Homology searches in the Expressed Sequence Tag Database were performed using SPYGQ-rich regions as query sequences to find genes encoding protein regions similar to the N-terminal parts of the sarcoma-associated EWS and FUS proteins. Clone 22911 (T74973), encoding a SPYGQ-rich region in its 5′ end, and several other clones that overlapped 22911 were selected. The combined data made it possible to assemble a full-length cDNA sequence. This cDNA sequence is 1677 bp, containing an initiation codon ATG, an open reading frame of 400 amino acids, a poly(A) signal, and a poly(A) tail. We found 100% identity between the 5′ part of the consensus sequence and the 598-bp-long sequence named TFG. The TFG sequence is fused to the 3′ end of NTRK1, generating the TRK-T3 fusion transcript found in papillary thyroid carcinoma. The cDNA therefore represents the full-length transcript of the TFG gene. TFG was localized to 3q11-q12 by fluorescence in situ hybridization. The 3′ and the 5′ ends of the TFG cDNA probe hybridized to a 2.2-kb band on Northern blot filters in all tissues examined. 28 refs., 5 figs., 1 tab.
Immune-Related Transcriptome of Coptotermes formosanus Shiraki Workers: The Defense Mechanism
Hussain, Abid; Li, Yi-Feng; Cheng, Yu; Liu, Yang; Chen, Chuan-Cheng; Wen, Shuo-Yang
2013-01-01
Formosan subterranean termites, Coptotermes formosanus Shiraki, live socially in microbial-rich habitats. To understand the molecular mechanism by which termites combat pathogenic microbes, a full-length normalized cDNA library and four Suppression Subtractive Hybridization (SSH) libraries were constructed from termite workers infected with entomopathogenic fungi (Metarhizium anisopliae and Beauveria bassiana), Gram-positive Bacillus thuringiensis and Gram-negative Escherichia coli, and the libraries were analyzed. From the high quality normalized cDNA library, 439 immune-related sequences were identified. These sequences were categorized as pattern recognition receptors (47 sequences), signal modulators (52 sequences), signal transducers (137 sequences), effectors (39 sequences) and others (164 sequences). From the SSH libraries, 27, 17, 22 and 15 immune-related genes were identified from the libraries treated with M. anisopliae, B. bassiana, B. thuringiensis and E. coli, respectively. When the normalized cDNA library was compared with the SSH libraries, 37 immune-related clusters were found in common; 56 clusters were identified in the SSH libraries, and 259 were identified in the normalized cDNA library. The immune-related gene expression pattern was further investigated using quantitative real time PCR (qPCR). Important immune-related genes were characterized, and their potential functions were discussed based on the integrated analysis of the results. We suggest that normalized cDNA and SSH libraries enable the discovery of functional genes in the transcriptome. The results remarkably expand our knowledge about immune-inducible genes in C. formosanus Shiraki and enable the future development of novel control strategies for the management of Formosan subterranean termites. PMID:23874972
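As a quick consistency check on the counts in the abstract above, the five functional categories can be tallied against the stated total of 439 immune-related sequences. This is only an illustrative sketch: the dictionary name and layout are assumptions, and only the numbers come from the abstract.

```python
# Category counts as reported in the abstract; variable names are illustrative.
immune_categories = {
    "pattern recognition receptors": 47,
    "signal modulators": 52,
    "signal transducers": 137,
    "effectors": 39,
    "others": 164,
}

# Sum of all categories, to compare with the reported total of 439.
total = sum(immune_categories.values())

# Share of each category in the normalized cDNA library, in percent.
shares = {name: round(100 * n / total, 1) for name, n in immune_categories.items()}

print(total)
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```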
Rice, Michael; Gladstone, William; Weir, Michael
2004-01-01
We discuss how relational databases constitute an ideal framework for representing and analyzing large-scale genomic data sets in biology. As a case study, we describe a Drosophila splice-site database that we recently developed at Wesleyan University for use in research and teaching. The database stores data about splice sites computed by a custom algorithm using Drosophila cDNA transcripts and genomic DNA and supports a set of procedures for analyzing splice-site sequence space. A generic Web interface permits the execution of the procedures with a variety of parameter settings and also supports custom structured query language queries. Moreover, new analytical procedures can be added by updating special metatables in the database without altering the Web interface. The database provides a powerful setting for students to develop informatic thinking skills. PMID:15592597
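The kind of structured query language query the abstract above mentions can be sketched with Python's built-in sqlite3 module. The table name, column names and rows below are hypothetical and chosen only for illustration; they do not reflect the actual schema or data of the Wesleyan database.

```python
import sqlite3

# Hypothetical schema: table and column names are assumptions for illustration,
# not the actual design of the splice-site database described above.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE splice_site (
        id INTEGER PRIMARY KEY,
        gene TEXT NOT NULL,
        site_type TEXT NOT NULL CHECK (site_type IN ('donor', 'acceptor')),
        sequence TEXT NOT NULL
    )
    """
)

# Made-up example rows; the sequences are placeholders, not real Drosophila data.
rows = [
    ("geneA", "donor", "GTAAGT"),
    ("geneA", "acceptor", "TTTCAG"),
    ("geneB", "donor", "GTGAGT"),
    ("geneB", "acceptor", "CTGCAG"),
    ("geneC", "donor", "GTAAGT"),
]
conn.executemany(
    "INSERT INTO splice_site (gene, site_type, sequence) VALUES (?, ?, ?)", rows
)


def count_sequences(connection, site_type):
    """Return (sequence, count) pairs for one site type, most frequent first."""
    query = (
        "SELECT sequence, COUNT(*) AS n FROM splice_site "
        "WHERE site_type = ? GROUP BY sequence ORDER BY n DESC, sequence"
    )
    return connection.execute(query, (site_type,)).fetchall()


donor_counts = count_sequences(conn, "donor")
print(donor_counts)
```

Passing the site type as a bound parameter rather than formatting it into the SQL string keeps the query reusable with different parameter settings, which is the pattern a generic Web front end would typically rely on.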
Bioinformatics analysis and detection of gelatinase encoded gene in Lysinibacillus sphaericus
NASA Astrophysics Data System (ADS)
Repin, Rul Aisyah Mat; Mutalib, Sahilah Abdul; Shahimi, Safiyyah; Khalid, Rozida Mohd.; Ayob, Mohd. Khan; Bakar, Mohd. Faizal Abu; Isa, Mohd Noor Mat
2016-11-01
In this study, we performed bioinformatics analysis of the genome sequence of Lysinibacillus sphaericus (L. sphaericus) to identify the gene encoding gelatinase. L. sphaericus was isolated from soil and produces gelatinase that is species-specific to porcine and bovine gelatin. This bacterium offers the possibility of producing enzymes that are specific to each of the two meat species. The main focus of this research is to identify the gelatinase-encoding gene of L. sphaericus using bioinformatics analysis of the partially sequenced genome. Three candidate genes were identified: gelatinase candidate gene 1 (P1), NODE_71_length_93919_cov_158.931839_21, which is 1563 base pairs (bp) in size with a 520 amino acid sequence; gelatinase candidate gene 2 (P2), NODE_23_length_52851_cov_190.061386_17, which is 1776 bp in size with a 591 amino acid sequence; and gelatinase candidate gene 3 (P3), NODE_106_length_32943_cov_169.147919_8, which is 1701 bp in size with a 566 amino acid sequence. Three pairs of oligonucleotide primers, named F1, R1, F2, R2, F3 and R3, were designed to target short sequences of cDNA by PCR. The amplicons were reliably 1563 bp in size for candidate gene P1 and 1701 bp in size for candidate gene P3. Therefore, bioinformatics analysis of L. sphaericus identified genes encoding gelatinase.
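The gene sizes and protein lengths reported above can be cross-checked with simple codon arithmetic. The sketch below assumes that each reported size is a complete open reading frame that includes the stop codon; that assumption is not stated in the abstract, and the function and variable names are illustrative.

```python
# Candidate ORF lengths (bp) and reported protein lengths (aa) from the abstract.
candidates = {
    "P1": (1563, 520),
    "P2": (1776, 591),
    "P3": (1701, 566),
}


def protein_length(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of orf_bp base pairs.

    Assumes the ORF is in frame; the stop codon (if counted in orf_bp)
    does not code for an amino acid.
    """
    if orf_bp % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# True where the reported protein length matches the codon arithmetic.
checks = {name: protein_length(bp) == aa for name, (bp, aa) in candidates.items()}
print(checks)
```

Under this assumption all three candidates are internally consistent: each size in bp is divisible by three and yields the reported number of amino acids once the stop codon is subtracted.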
Genomic structure of the human D-site binding protein (DBP) gene
DOE Office of Scientific and Technical Information (OSTI.GOV)
Shutler, G.; Glassco, T.; Kang, Xiaolin
1996-06-15
The human gene for the D-Site Binding Protein (DBP) has been sequenced and characterized. This gene is a member of the b/ZIP family of transcription factors and is one of three genes forming the PAR sub-family. DBP has been implicated in the diurnal regulation of a variety of liver-specific genes. Examination of the genomic structure of DBP reveals that the gene is divided into four exons and is contained within a relatively compact region of approximately 6 kb. These exons appear to correspond to functional divisions of the DBP protein. Exon 1 contains a long 5′ UTR, and conservation between the rat and the human genes of the presence of small open reading frames within this region suggests that it may play a role in translational control. Exon 2 contains a limited region of similarity to the other PAR domain genes, which may be part of a potential activation domain. Exon 3 contains the PAR domain and differs by only 1 of 71 amino acids between rat and human. Exon 4, containing both the basic and the leucine zipper domains, is likewise highly conserved. The overall degree of homology between the rat and the human cDNA sequences is 82% for the nucleic acid sequence and 92% for the protein sequence. Comparison of the rat and human proximal promoters reveals extensive sequence conservation, with two previously characterized DNA binding sites being conserved at the functional and sequence levels. 31 refs., 4 figs.