Science.gov

Sample records for highly conserved sequences

  1. Highly conserved repetitive DNA sequences are present at human centromeres.

    PubMed Central

    Grady, D L; Ratliff, R L; Robinson, D L; McCanlies, E C; Meyne, J; Moyzis, R K

    1992-01-01

    Highly conserved repetitive DNA sequence clones, largely consisting of (GGAAT)n repeats, have been isolated from a human recombinant repetitive DNA library by high-stringency hybridization with rodent repetitive DNA. This sequence, the predominant repetitive sequence in human satellites II and III, is similar to the essential core DNA of the Saccharomyces cerevisiae centromere, centromere DNA element (CDE) III. In situ hybridization to human telophase and Drosophila polytene chromosomes shows localization of the (GGAAT)n sequence to centromeric regions. Hyperchromicity studies indicate that the (GGAAT)n sequence exhibits unusual hydrogen bonding properties. The purine-rich strand alone has the same thermal stability as the duplex. Hyperchromicity studies of synthetic DNA variants indicate that all sequences with the composition (AATGN)n exhibit this unusual thermal stability. DNA-mobility-shift assays indicate that specific HeLa-cell nuclear proteins recognize this sequence with a relative affinity greater than 10^5. The extreme evolutionary conservation of this DNA sequence, its centromeric location, its unusual hydrogen bonding properties, its high affinity for specific nuclear proteins, and its similarity to functional centromeres isolated from yeast suggest that this sequence may be a component of the functional human centromere. PMID:1542662
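As an illustration of what a (GGAAT)n run looks like in sequence data, the following minimal sketch scans a DNA string for tandem copies of the unit on either strand. It is not the authors' method (they used hybridization); the function names, the `min_copies` threshold, and the toy sequence are all assumptions made for this example.

```python
import re

def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]

def find_tandem_repeats(seq: str, unit: str = "GGAAT", min_copies: int = 3):
    """Return sorted (start, end, copies, strand) tuples for tandem runs of `unit`.

    The '-' strand is searched by looking for the reverse complement of the unit.
    """
    hits = []
    for strand, motif in (("+", unit), ("-", revcomp(unit))):
        # e.g. "(?:GGAAT){3,}" matches three or more back-to-back copies
        pattern = re.compile(f"(?:{motif}){{{min_copies},}}")
        for m in pattern.finditer(seq.upper()):
            copies = (m.end() - m.start()) // len(unit)
            hits.append((m.start(), m.end(), copies, strand))
    return sorted(hits)

# Toy sequence (hypothetical): four GGAAT copies, then three copies on the other strand
example = "ACGT" + "GGAAT" * 4 + "TTT" + "ATTCC" * 3 + "G"
print(find_tandem_repeats(example))
```

Coordinates are 0-based and half-open, so `example[start:end]` returns the repeat run itself.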

  2. High sequence conservation among cucumber mosaic virus isolates from lily.

    PubMed

    Chen, Y K; Derks, A F; Langeveld, S; Goldbach, R; Prins, M

    2001-08-01

    For classification of Cucumber mosaic virus (CMV) isolates from ornamental crops of different geographical areas, these were characterized by comparing the nucleotide sequences of RNAs 4 and the encoded coat proteins. Within the ornamental-infecting CMV isolates both subgroups were represented. CMV isolates of Alstroemeria and crocus were classified as subgroup II isolates, whereas 8 other isolates, from lily, gladiolus, amaranthus, larkspur, and lisianthus, were identified as subgroup I members. In general, nucleotide sequence comparisons correlated well with geographic distribution, with one notable exception: the analyzed nucleotide sequences of 5 lily isolates showed remarkably high homology despite different origins.

  3. Highly conserved d-loop sequences in woolly mouse opossums Marmosa (Micoureus).

    PubMed

    Rocha, Rita Gomes; Leite, Yuri Luiz Reis; Ferreira, Eduardo; Justino, Juliana; Costa, Leonora Pires

    2012-04-01

    This study reports the occurrence of highly conserved d-loop sequences in the mitochondrial genome of the woolly mouse opossum genus Marmosa subgenus Micoureus (Mammalia, Didelphimorphia, Didelphidae). Sixty-six sequences of Marmosa (Micoureus) demerarae, Marmosa (Micoureus) constantiae, and Marmosa (Micoureus) paraguayanus were amplified using universal d-loop primers and virtually no genetic differences were detected within and among species. These sequences matched the control region of the mitochondrial marsupial genome. Analyses of qualitative aspects of these sequences revealed that their structural composition is very similar to the d-loop region of other didelphid species. However, the total lack of variability has not been reported from other closely related species. The data analyzed here support the occurrence of highly conserved d-loop sequences, and we found no support for the hypothesis that these sequences are d-loop-like nuclear pseudogenes. Furthermore, the control and flanking regions obtained with different primers corroborate the lack of variability of the d-loop sequences in the mitochondrial genome of Marmosa (Micoureus).

  4. Highly conserved D-loop-like nuclear mitochondrial sequences (Numts) in tiger (Panthera tigris).

    PubMed

    Zhang, Wenping; Zhang, Zhihe; Shen, Fujun; Hou, Rong; Lv, Xiaoping; Yue, Bisong

    2006-08-01

    Using oligonucleotide primers designed to match hypervariable segment I (HVS-1) of Panthera tigris mitochondrial DNA (mtDNA), we amplified two different PCR products (500 bp and 287 bp) in the tiger (Panthera tigris), but got only one PCR product (287 bp) in the leopard (Panthera pardus). Sequence analyses indicated that the sequence of 287 bp was a D-loop-like nuclear mitochondrial sequence (Numts), indicating a nuclear transfer that occurred approximately 4.8-17 million years ago in the tiger and 4.6-16 million years ago in the leopard. Although the mtDNA D-loop sequence has a rapid rate of evolution, the 287-bp Numts are highly conserved; they are nearly identical in tiger subspecies and only 1.742% different between tiger and leopard. Thus, such sequences represent molecular 'fossils' that can shed light on evolution of the mitochondrial genome and may be the most appropriate outgroup for phylogenetic analysis. This is also proved by comparing the phylogenetic trees reconstructed using the D-loop sequence of snow leopard and the 287-bp Numts as outgroup.
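The percent-difference figure can be read as an uncorrected pairwise distance over the 287-bp alignment. The sketch below shows that calculation. The abstract does not state how many sites differ, so the count of 5 mismatches is an assumption, chosen only because 5/287 rounds to the reported 1.742%. The sequences are placeholders, not real Numt data.

```python
def percent_difference(a: str, b: str) -> float:
    """Uncorrected percent difference between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(x != y for x, y in zip(a, b))
    return 100.0 * mismatches / len(a)

# Placeholder 287-bp sequences with an assumed 5 differing sites
tiger_like = "A" * 287
leopard_like = "C" * 5 + "A" * 282
print(round(percent_difference(tiger_like, leopard_like), 3))  # 1.742
```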

  5. A highly conserved repeated chromosomal sequence in the radioresistant bacterium Deinococcus radiodurans SARK

    SciTech Connect

    Lennon, E.; Gutman, P.D.; Yao, Hanlong; Minton, K.W.

    1991-03-01

    A DNA fragment containing a portion of a DNA damage-inducible gene from Deinococcus radiodurans SARK hybridized to numerous fragments of SARK genomic DNA because of a highly conserved repetitive chromosomal element. The element is of variable length, ranging from 150 to 192 bp, depending on the absence or presence of one or two 21-bp sequences located internally. A putative translational start site of the damage-inducible gene is within the reiterated element. The element contains dyad symmetries that suggest modes of transcriptional and/or translational control.

  6. High Throughput Sequencing of T Cell Antigen Receptors Reveals a Conserved TCR Repertoire.

    PubMed

    Hou, Xianliang; Lu, Chong; Chen, Sisi; Xie, Qian; Cui, Guangying; Chen, Jianing; Chen, Zhi; Wu, Zhongwen; Ding, Yulong; Ye, Ping; Dai, Yong; Diao, Hongyan

    2016-03-01

    The T-cell receptor (TCR) repertoire is a mirror of the human immune system that reflects processes caused by infections, cancer, autoimmunity, and aging. Next-generation sequencing has become a powerful tool for deep TCR profiling. Herein, we used this technology to study the repertoire features of TCR beta chain in the blood of healthy individuals. Peripheral blood samples were collected from 10 healthy donors. T cells were isolated with anti-human CD3 magnetic beads according to the manufacturer's protocol. We then combined multiplex-PCR, Illumina sequencing, and IMGT/High V-QUEST to analyze the characteristics and polymorphisms of the TCR. Most of the individual T cell clones were present at very low frequencies, suggesting that they had not undergone clonal expansion. The usage frequencies of the TCR beta variable, beta joining, and beta diversity gene segments were similar among T cells from different individuals. Notably, the usage frequency of individual nucleotides and amino acids within complementarity-determining region (CDR3) intervals was remarkably consistent between individuals. Moreover, our data show that terminal deoxynucleotidyl transferase activity was biased toward the insertion of G (31.92%) and C (27.14%) over A (21.82%) and T (19.12%) nucleotides. Some conserved features could be observed in the composition of CDR3, which may inform future studies of human TCR gene recombination.
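The four reported insertion percentages sum to 100.00, so G and C together account for 59.06% of inserted nucleotides. A minimal sketch of how such per-base usage could be tallied from a list of non-templated insertion strings follows. The function name and the toy insertions are assumptions for illustration; the study itself relied on IMGT/High V-QUEST output, whose format is not reproduced here.

```python
from collections import Counter

def base_usage(insertions):
    """Percent usage of A, C, G, T across a collection of inserted-nucleotide strings."""
    counts = Counter(
        base for ins in insertions for base in ins.upper() if base in "ACGT"
    )
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no A/C/G/T bases found")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}

# Hypothetical N-region insertions from five rearrangements
toy_insertions = ["GGC", "GCA", "T", "GGCAT", "CG"]
usage = base_usage(toy_insertions)
print({base: round(pct, 2) for base, pct in usage.items()})
```

Non-ACGT characters such as `N` are ignored rather than counted, which keeps the four percentages summing to 100.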

  7. High Throughput Sequencing of T Cell Antigen Receptors Reveals a Conserved TCR Repertoire

    PubMed Central

    Hou, Xianliang; Lu, Chong; Chen, Sisi; Xie, Qian; Cui, Guangying; Chen, Jianing; Chen, Zhi; Wu, Zhongwen; Ding, Yulong; Ye, Ping; Dai, Yong; Diao, Hongyan

    2016-01-01

    The T-cell receptor (TCR) repertoire is a mirror of the human immune system that reflects processes caused by infections, cancer, autoimmunity, and aging. Next-generation sequencing has become a powerful tool for deep TCR profiling. Herein, we used this technology to study the repertoire features of TCR beta chain in the blood of healthy individuals. Peripheral blood samples were collected from 10 healthy donors. T cells were isolated with anti-human CD3 magnetic beads according to the manufacturer's protocol. We then combined multiplex-PCR, Illumina sequencing, and IMGT/High V-QUEST to analyze the characteristics and polymorphisms of the TCR. Most of the individual T cell clones were present at very low frequencies, suggesting that they had not undergone clonal expansion. The usage frequencies of the TCR beta variable, beta joining, and beta diversity gene segments were similar among T cells from different individuals. Notably, the usage frequency of individual nucleotides and amino acids within complementarity-determining region (CDR3) intervals was remarkably consistent between individuals. Moreover, our data show that terminal deoxynucleotidyl transferase activity was biased toward the insertion of G (31.92%) and C (27.14%) over A (21.82%) and T (19.12%) nucleotides. Some conserved features could be observed in the composition of CDR3, which may inform future studies of human TCR gene recombination. PMID:26962778

  8. Characterization of an Unusually Conserved AluI Highly Reiterated DNA Sequence Family from the Honeybee, Apis mellifera

    PubMed Central

    Tares, S.; Cornuet, J. M.; Abad, P.

    1993-01-01

    An AluI family of highly reiterated nontranscribed sequences has been found in the genome of the honeybee Apis mellifera. This repeated sequence is shown to be present at approximately 23,000 copies per haploid genome constituting about 2% of the total genomic DNA. The nucleotide sequence of 10 monomers was determined. The consensus sequence is 176 nucleotides long and has an A + T content of 58%. There are clusters of both direct and inverted repeats. Internal subrepeating units ranging from 11 to 17 nucleotides are observed, suggesting that it could have evolved from a shorter sequence. DNA sequence data reveal that this repeat class is unusually homogeneous compared to the other class of invertebrate highly reiterated DNA sequences. The average pairwise sequence divergence between the repeats is 2.5%. In spite of this unusual homogeneity, divergence has been found in the repeated sequence hybridization ladder between four different honeybee subspecies. Therefore, the AluI highly reiterated sequences provide a new probe for fingerprinting in A. m. mellifera. PMID:8104160
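Two of the summary statistics in this abstract, A + T content and average pairwise divergence, are simple to compute from aligned monomers. The sketch below shows one way to do so, assuming uncorrected (p-distance) divergence, which the abstract does not specify. The three short sequences are invented placeholders, not honeybee data. With the 10 monomers reported, the average would run over 45 pairs.

```python
from itertools import combinations

def at_content(seq: str) -> float:
    """Percent of A and T bases in a sequence."""
    s = seq.upper()
    return 100.0 * (s.count("A") + s.count("T")) / len(s)

def p_distance(a: str, b: str) -> float:
    """Uncorrected percent divergence between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return 100.0 * sum(x != y for x, y in zip(a, b)) / len(a)

def mean_pairwise_divergence(aligned) -> float:
    """Average p-distance over all unordered pairs of aligned sequences."""
    pairs = list(combinations(aligned, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(p_distance(a, b) for a, b in pairs) / len(pairs)

# Placeholder monomers (hypothetical, 10 bp each)
monomers = ["ATGCATGCAT", "ATGCATGCAA", "ATGAATGCAT"]
print(at_content(monomers[0]))                        # 60.0
print(round(mean_pairwise_divergence(monomers), 2))   # 13.33
```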

  9. Low molecular weight serine protease inhibitors from insects are proteins with highly conserved sequences.

    PubMed

    Boigegrain, R A; Pugnière, M; Paroutaud, P; Castro, B; Brehélin, M

    2000-02-01

    A low molecular weight protease inhibitor peptide found in ovaries of the desert locust Schistocerca gregaria (SGPI-2) was purified from plasma of the same locust and sequenced. It was named SGCI. It was found active towards chymotrypsin and human leukocyte elastase. SGCI was synthesized using a solid-phase procedure and the sequence of its reactive site for chymotrypsin was determined. Compared with an inhibitor purified earlier from another locust species, the total sequence of SGCI showed 88% identity. In particular, the sequence of the reactive site of these inhibitors was identical. Our search for a closely related peptide in an insect species far removed from locusts, the lepidopteran Spodoptera littoralis, was unfruitful but a different chymotrypsin inhibitor, belonging to the Kazal family, was found whose mass is greater than that of SGCI (20 vs 3.6 kDa). Its N-terminal sequence shares 80% identity with that of a chymotrypsin inhibitor purified earlier from the haemolymph of another lepidopteran. Conservation of the amino acid sequence in the reactive site seems to be an exception among protease inhibitors.

  10. Evolutionarily conserved sequences on human chromosome 21

    SciTech Connect

    Frazer, Kelly A.; Sheehan, John B.; Stokowski, Renee P.; Chen, Xiyin; Hosseini, Roya; Cheng, Jan-Fang; Fodor, Stephen P.A.; Cox, David R.; Patil, Nila

    2001-09-01

    Comparison of human sequences with the DNA of other mammals is an excellent means of identifying functional elements in the human genome. Here we describe the utility of high-density oligonucleotide arrays as a rapid approach for comparing human sequences with the DNA of multiple species whose sequences are not presently available. High-density arrays representing approximately 22.5 Mb of nonrepetitive human chromosome 21 sequence were synthesized and then hybridized with mouse and dog DNA to identify sequences conserved between humans and mice (human-mouse elements) and between humans and dogs (human-dog elements). Our data show that sequence comparison of multiple species provides a powerful empiric method for identifying actively conserved elements in the human genome. A large fraction of these evolutionarily conserved elements are present in regions on chromosome 21 that do not encode known genes.

  11. Touchdown digital polymerase chain reaction for quantification of highly conserved sequences in the HIV-1 genome.

    PubMed

    De Spiegelaere, Ward; Malatinkova, Eva; Kiselinova, Maja; Bonczkowski, Pawel; Verhofstede, Chris; Vogelaers, Dirk; Vandekerckhove, Linos

    2013-08-15

    Digital polymerase chain reaction (PCR) is an emerging absolute quantification method based on the limiting dilution principle and end-point PCR. This methodology provides high flexibility in assay design without influencing quantitative accuracy. This article describes an assay to quantify HIV DNA that targets a highly conserved region of the HIV-1 genome that hampers optimal probe design. To maintain high specificity and allow probe binding and hydrolysis of a probe with low melting temperature, a two-stage touchdown PCR was designed with a first round of amplification at high temperature and a subsequent round at low temperature to allow accumulation of fluorescence.
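Absolute quantification by limiting dilution is commonly done with a Poisson correction: if a fraction p of partitions is positive at end point, the mean number of target copies per partition is estimated as -ln(1 - p). The abstract does not spell out its calculation, so the sketch below shows that standard correction rather than the authors' exact pipeline. The partition counts are hypothetical.

```python
import math

def copies_per_partition(positive: int, total: int) -> float:
    """Poisson-corrected mean target copies per partition from end-point calls."""
    if total <= 0 or not 0 <= positive < total:
        # positive == total would mean saturation; the estimate is undefined there
        raise ValueError("need 0 <= positive < total")
    return -math.log(1.0 - positive / total)

# Hypothetical run: 5,000 positive partitions out of 20,000
lam = copies_per_partition(5000, 20000)
print(round(lam, 4))          # mean copies per partition
print(round(lam * 20000, 1))  # estimated copies across all partitions
```

The correction matters because a positive partition may hold more than one template, so simply counting positives underestimates the copy number.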

  12. High-throughput genomic sequencing of cassava bacterial blight strains identifies conserved effectors to target for durable resistance.

    PubMed

    Bart, Rebecca; Cohn, Megan; Kassen, Andrew; McCallum, Emily J; Shybut, Mikel; Petriello, Annalise; Krasileva, Ksenia; Dahlbeck, Douglas; Medina, Cesar; Alicai, Titus; Kumar, Lava; Moreira, Leandro M; Rodrigues Neto, Júlio; Verdier, Valerie; Santana, María Angélica; Kositcharoenkul, Nuttima; Vanderschuren, Hervé; Gruissem, Wilhelm; Bernal, Adriana; Staskawicz, Brian J

    2012-07-10

    Cassava bacterial blight (CBB), incited by Xanthomonas axonopodis pv. manihotis (Xam), is the most important bacterial disease of cassava, a staple food source for millions of people in developing countries. Here we present a widely applicable strategy for elucidating the virulence components of a pathogen population. We report Illumina-based draft genomes for 65 Xam strains and deduce the phylogenetic relatedness of Xam across the areas where cassava is grown. Using an extensive database of effector proteins from animal and plant pathogens, we identify the effector repertoire for each sequenced strain and use a comparative sequence analysis to deduce the least polymorphic of the conserved effectors. These highly conserved effectors have been maintained over 11 countries, three continents, and 70 y of evolution and as such represent ideal targets for developing resistance strategies.

  13. Identification and characterization of novel and conserved microRNAs in radish (Raphanus sativus L.) using high-throughput sequencing.

    PubMed

    Xu, Liang; Wang, Yan; Xu, Yuanyuan; Wang, Liangju; Zhai, Lulu; Zhu, Xianwen; Gong, Yiqin; Ye, Shan; Liu, Liwang

    2013-03-01

    MicroRNAs (miRNAs) are endogenous, non-coding, small RNAs that play significant regulatory roles in plant growth, development, and biotic and abiotic stress responses. To date, a great number of conserved and species-specific miRNAs have been identified in many important plant species such as Arabidopsis, rice and poplar. However, little is known about identification of miRNAs and their target genes in radish (Raphanus sativus L.). In the present study, a small RNA library from radish root was constructed and sequenced using high-throughput Solexa sequencing. Through sequence alignment and secondary structure prediction, a total of 545 conserved miRNA families as well as 15 novel (with their miRNA* strand) and 64 potentially novel miRNAs were identified. Quantitative real-time PCR (qRT-PCR) analysis confirmed that both conserved and novel miRNAs were expressed in radish, and some of them were preferentially expressed in certain tissues. A total of 196 potential target genes were predicted for 42 novel radish miRNAs. Gene ontology (GO) analysis showed that most of the targets were involved in plant growth, development, metabolism and stress responses. This study represents a first large-scale identification and characterization of radish miRNAs and their potential target genes. These results could lead to the further identification of radish miRNAs and enhance our understanding of radish miRNA regulatory mechanisms in diverse biological and metabolic processes.

  14. Automatic identification of highly conserved family regions and relationships in genome wide datasets including remote protein sequences.

    PubMed

    Doğan, Tunca; Karaçalı, Bilge

    2013-01-01

    Identifying shared sequence segments along amino acid sequences generally requires a collection of closely related proteins, most often curated manually from the sequence datasets to suit the purpose at hand. Currently developed statistical methods are strained, however, when the collection contains remote sequences with poor alignment to the rest, or sequences containing multiple domains. In this paper, we propose a completely unsupervised and automated method to identify the shared sequence segments observed in a diverse collection of protein sequences including those present in a smaller fraction of the sequences in the collection, using a combination of sequence alignment, residue conservation scoring and graph-theoretical approaches. Since shared sequence fragments often imply conserved functional or structural attributes, the method produces a table of associations between the sequences and the identified conserved regions that can reveal previously unknown protein families as well as new members to existing ones. We evaluated the biological relevance of the method by clustering the proteins in gold standard datasets and assessing the clustering performance in comparison with previous methods from the literature. We have then applied the proposed method to a genome wide dataset of 17793 human proteins and generated a global association map to each of the 4753 identified conserved regions. Investigations on the major conserved regions revealed that they corresponded strongly to annotated structural domains. This suggests that the method can be useful in predicting novel domains on protein sequences.
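To make "residue conservation scoring" concrete, here is a minimal sketch that scores each column of a protein alignment as one minus its normalized Shannon entropy and then reports runs of well-conserved columns. This is a generic stand-in, not the scoring scheme or graph-theoretical step used in the paper, which the abstract does not detail. The threshold, minimum run length, and toy alignment are assumptions.

```python
import math
from collections import Counter

def column_conservation(alignment):
    """Per-column score in [0, 1]: 1 - entropy / log2(20); gap characters are ignored."""
    scores = []
    for column in zip(*alignment):
        residues = [r for r in column if r != "-"]
        if not residues:
            scores.append(0.0)  # all-gap column carries no evidence
            continue
        n = len(residues)
        entropy = -sum(
            (c / n) * math.log2(c / n) for c in Counter(residues).values()
        )
        scores.append(1.0 - entropy / math.log2(20))
    return scores

def conserved_runs(scores, threshold=0.8, min_len=3):
    """Half-open (start, end) column ranges where every score is >= threshold."""
    runs, start = [], None
    for i, s in enumerate(scores + [0.0]):  # trailing sentinel closes a final run
        if s >= threshold and start is None:
            start = i
        elif s < threshold and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs

# Toy alignment (hypothetical): identical at columns 0-2 and 5-6, variable at 3-4
toy_alignment = ["MKTAYLG", "MKTGWLG", "MKTCHLG"]
scores = column_conservation(toy_alignment)
print(conserved_runs(scores))  # [(0, 3)]
```

The identical columns 5-6 are not reported because the run is shorter than `min_len`; lowering that parameter to 2 would include them.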

  15. Sequence Fingerprints of MicroRNA Conservation

    PubMed Central

    Shi, Bing; Gao, Wei; Wang, Juan

    2012-01-01

    It is known that the conservation of protein-coding genes is associated with their sequences in various species, such as animals and plants. However, the association between microRNA (miRNA) conservation and their sequences in various species remains unexplored. Here we report the association of miRNA conservation with its sequence features, such as base content and cleavage sites, suggesting that miRNA sequences contain the fingerprints for miRNA conservation. More interestingly, different species show different and even opposite patterns between miRNA conservation and sequence features. For example, mammalian miRNAs show a positive/negative correlation between conservation and AU/GC content, whereas plant miRNAs show a negative/positive correlation between conservation and AU/GC content. Further analysis puts forward the hypothesis that the introns of protein-coding genes may be a main driving force for the origin and evolution of mammalian miRNAs. At the 5′ end, conserved miRNAs have a preference for base U, while less-conserved miRNAs have a preference for a non-U base in mammals. This difference does not exist in insects and plants, in which both conserved miRNAs and less-conserved miRNAs have a preference for base U at the 5′ end. We further revealed that the non-U preference at the 5′ end of less-conserved mammalian miRNAs is associated with miRNA function diversity, which may have evolved from the pressure of a highly sophisticated environmental stimulus the mammals encountered during evolution. These results indicated that miRNA sequences contain the fingerprints for conservation, and these fingerprints vary according to species. More importantly, the results suggest that although species share common mechanisms by which miRNAs originate and evolve, mammals may develop a novel mechanism for miRNA origin and evolution. In addition, the fingerprint found in this study can be a predictor of miRNA conservation, and the findings are helpful in achieving a

  16. High-throughput sequencing discovery of conserved and novel microRNAs in Chinese cabbage (Brassica rapa L. ssp. pekinensis).

    PubMed

    Wang, Fengde; Li, Libin; Liu, Lifeng; Li, Huayin; Zhang, Yihui; Yao, Yingyin; Ni, Zhongfu; Gao, Jianwei

    2012-07-01

    MicroRNAs (miRNAs) are a class of 21-24 nucleotide non-coding RNAs that down-regulate gene expression by cleaving or inhibiting the translation of target gene transcripts. miRNAs have been extensively analyzed in a few model plant species such as Arabidopsis, rice and Populus, and partially investigated in other non-model plant species. However, only a few conserved miRNAs have been identified in Chinese cabbage, a common and economically important crop in Asia. To identify novel and conserved miRNAs in Chinese cabbage (Brassica rapa L. ssp. pekinensis) we constructed a small RNA library. Using high-throughput Solexa sequencing to identify microRNAs we found 11,210 unique sequences belonging to 321 conserved miRNA families and 228 novel miRNAs. We ran a Blast search with these sequences against the Chinese cabbage mRNA database and found 2,308 and 736 potential target genes for 221 conserved and 125 novel miRNAs, respectively. The BlastX search against the Arabidopsis genome and GO analysis suggested most of the targets were involved in plant growth, metabolism, development and stress response. This study provides the first large-scale cloning and characterization of Chinese cabbage miRNAs and their potential targets. These miRNAs add to the growing database of new miRNAs, prompt further study on Chinese cabbage miRNA regulation mechanisms, and help toward a greater understanding of the important roles of miRNAs in Chinese cabbage.

  17. Characterization of the dead ringer gene identifies a novel, highly conserved family of sequence-specific DNA-binding proteins.

    PubMed Central

    Gregory, S L; Kortschak, R D; Kalionis, B; Saint, R

    1996-01-01

    We report the identification of a new family of DNA-binding proteins from our characterization of the dead ringer (dri) gene of Drosophila melanogaster. We show that dri encodes a nuclear protein that contains a sequence-specific DNA-binding domain that bears no similarity to known DNA-binding domains. A number of proteins were found to contain sequences homologous to this domain. Other proteins containing the conserved motif include yeast SWI1, two human retinoblastoma binding proteins, and other mammalian regulatory proteins. A mouse B-cell-specific regulator exhibits 75% identity with DRI over the 137-amino-acid DNA-binding domains of these proteins, indicating a high degree of conservation of this domain. Gel retardation and optimal binding site screens revealed that the in vitro sequence specificity of DRI is strikingly similar to that of many homeodomain proteins, although the sequence and predicted secondary structure do not resemble a homeodomain. The early general expression of dri and the similarity of DRI and homeodomain in vitro DNA-binding specificity compound the problem of understanding the in vivo specificity of action of these proteins. Maternally derived dri product is found throughout the embryo until germ band extension, when dri is expressed in a developmentally regulated set of tissues, including salivary gland ducts, parts of the gut, and a subset of neural cells. The discovery of this new, conserved DNA-binding domain offers an explanation for the regulatory activity of several important members of this class and predicts significant regulatory roles for the others. PMID:8622680

  18. Comparative Mitogenomics of the Genus Odontobutis (Perciformes: Gobioidei: Odontobutidae) Revealed Conserved Gene Rearrangement and High Sequence Variations

    PubMed Central

    Ma, Zhihong; Yang, Xuefen; Bercsenyi, Miklos; Wu, Junjie; Yu, Yongyao; Wei, Kaijian; Fan, Qixue; Yang, Ruibin

    2015-01-01

    To understand the molecular evolution of mitochondrial genomes (mitogenomes) in the genus Odontobutis, the mitogenome of Odontobutis yaluensis was sequenced and compared with those of another four Odontobutis species. Our results displayed similar mitogenome features among species in genome organization, base composition, codon usage, and gene rearrangement. The identical gene rearrangement of trnS-trnL-trnH tRNA cluster observed in mitogenomes of these five closely related freshwater sleepers suggests that this unique gene order is conserved within Odontobutis. Additionally, the present gene order and the positions of associated intergenic spacers of these Odontobutis mitogenomes indicate that this unusual gene rearrangement results from tandem duplication and random loss of large-scale gene regions. Moreover, these mitogenomes exhibit a high level of sequence variation, mainly due to the differences of corresponding intergenic sequences in gene rearrangement regions and the heterogeneity of tandem repeats in the control regions. Phylogenetic analyses support Odontobutis species with shared gene rearrangement forming a monophyletic group, and the interspecific phylogenetic relationships are associated with structural differences among their mitogenomes. The present study contributes to understanding the evolutionary patterns of Odontobutidae species. PMID:26492246

  19. Comparative Mitogenomics of the Genus Odontobutis (Perciformes: Gobioidei: Odontobutidae) Revealed Conserved Gene Rearrangement and High Sequence Variations.

    PubMed

    Ma, Zhihong; Yang, Xuefen; Bercsenyi, Miklos; Wu, Junjie; Yu, Yongyao; Wei, Kaijian; Fan, Qixue; Yang, Ruibin

    2015-10-20

    To understand the molecular evolution of mitochondrial genomes (mitogenomes) in the genus Odontobutis, the mitogenome of Odontobutis yaluensis was sequenced and compared with those of another four Odontobutis species. Our results displayed similar mitogenome features among species in genome organization, base composition, codon usage, and gene rearrangement. The identical gene rearrangement of trnS-trnL-trnH tRNA cluster observed in mitogenomes of these five closely related freshwater sleepers suggests that this unique gene order is conserved within Odontobutis. Additionally, the present gene order and the positions of associated intergenic spacers of these Odontobutis mitogenomes indicate that this unusual gene rearrangement results from tandem duplication and random loss of large-scale gene regions. Moreover, these mitogenomes exhibit a high level of sequence variation, mainly due to the differences of corresponding intergenic sequences in gene rearrangement regions and the heterogeneity of tandem repeats in the control regions. Phylogenetic analyses support Odontobutis species with shared gene rearrangement forming a monophyletic group, and the interspecific phylogenetic relationships are associated with structural differences among their mitogenomes. The present study contributes to understanding the evolutionary patterns of Odontobutidae species.

  20. The Chinese hamster Alu-equivalent sequence: a conserved highly repetitious, interspersed deoxyribonucleic acid sequence in mammals has a structure suggestive of a transposable element.

    PubMed Central

    Haynes, S R; Toomey, T P; Leinwand, L; Jelinek, W R

    1981-01-01

    A consensus sequence has been determined for a major interspersed deoxyribonucleic acid repeat in the genome of Chinese hamster ovary cells (CHO cells). This sequence is extensively homologous to (i) the human Alu sequence (P. L. Deininger et al., J. Mol. Biol., in press), (ii) the mouse B1 interspersed repetitious sequence (Krayev et al., Nucleic Acids Res. 8:1201-1215, 1980), (iii) an interspersed repetitious sequence from African green monkey deoxyribonucleic acid (Dhruva et al., Proc. Natl. Acad. Sci. U.S.A. 77:4514-4518, 1980), and (iv) the CHO and mouse 4.5S ribonucleic acid (this report; F. Harada and N. Kato, Nucleic Acids Res. 8:1273-1285, 1980). Because the CHO consensus sequence shows significant homology to the human Alu sequence it is termed the CHO Alu-equivalent sequence. A conserved structure surrounding CHO Alu-equivalent family members can be recognized. It is similar to that surrounding the human Alu and the mouse B1 sequences, and is represented as follows: direct repeat-CHO-Alu-A-rich sequence-direct repeat. A composite interspersed repetitious sequence has been identified. Its structure is represented as follows: direct repeat-residue 47 to 107 of CHO-Alu-non-Alu repetitious sequence-A-rich sequence-direct repeat. Because the Alu flanking sequences resemble those that flank known transposable elements, we think it likely that the Alu sequence dispersed throughout the mammalian genome by transposition. PMID:9279371

  1. Identification of conserved hepatic transcriptomic responses to 17β-estradiol using high-throughput sequencing in brown trout

    PubMed Central

    Uren Webster, Tamsyn M.; Shears, Janice A.; Moore, Karen

    2015-01-01

    Estrogenic chemicals are major contaminants of surface waters and can threaten the sustainability of natural fish populations. Characterization of the global molecular mechanisms of toxicity of environmental contaminants has been conducted primarily in model species rather than species with limited existing transcriptomic or genomic sequence information. We aimed to investigate the global mechanisms of toxicity of an endocrine disrupting chemical of environmental concern [17β-estradiol (E2)] using high-throughput RNA sequencing (RNA-Seq) in an environmentally relevant species, brown trout (Salmo trutta). We exposed mature males to measured concentrations of 1.94, 18.06, and 34.38 ng E2/L for 4 days and sequenced three individual liver samples per treatment using an Illumina HiSeq 2500 platform. Exposure to 34.4 ng E2/L resulted in 2,113 differentially regulated transcripts (FDR < 0.05). Functional analysis revealed upregulation of processes associated with vitellogenesis, including lipid metabolism, cellular proliferation, and ribosome biogenesis, together with a downregulation of carbohydrate metabolism. Using real-time quantitative PCR, we validated the expression of eight target genes and identified significant differences in the regulation of several known estrogen-responsive transcripts in fish exposed to the lower treatment concentrations (including esr1 and zp2.5). We successfully used RNA-Seq to identify highly conserved responses to estrogen and also identified some estrogen-responsive transcripts that have been less well characterized, including nots and tgm2l. These results demonstrate the potential application of RNA-Seq as a valuable tool for assessing mechanistic effects of pollutants in ecologically relevant species for which little genomic information is available. PMID:26082144

  2. The sequence organization of Yp/proximal Xq homologous regions of the human sex chromosomes is highly conserved

    SciTech Connect

    Sargent, C.A.; Briggs, H.; Chalmers, I.J.

    1996-03-01

    Detailed deletion analysis of patients with breakpoints in Yp has allowed the definition of two distinct intervals on the Y chromosome short arm outside the pseudoautosomal region that are homologous to Xq21.3. Detailed YAC contigs have been developed over these regions on both the X and Y chromosomes, and the relative order of markers has been compared to assess whether rearrangements on either sex chromosome have occurred since the transposition events creating these patterns of homology. On the X chromosome, the region forms almost one contiguous block of homology, whereas on the Y chromosome, there has been one major rearrangement leading to the two separate Yp-Xq21 blocks of homology. The rearrangement breakpoint has been mapped. Within these separate X-Y homologous blocks on Yp, the order of loci homologous to X has been conserved to a high degree between the sex chromosomes. With the exception of the amelogenin gene (proximal Yp block), all the X-Y homologous sequences in the two Yp blocks have homologues in Xq21.3, with the former having its X counterpart in Xp22.2. This suggests an independent evolutionary event leading to the formation of the amelogenin X-Y homology. 45 refs., 4 figs., 1 tab.

  3. Comprehensive Sequence Analysis of the Human IL23A Gene Defines New Variation Content and High Rate of Evolutionary Conservation

    PubMed Central

    Tindall, Elizabeth A.; Hayes, Vanessa M.

    2010-01-01

    A newly described heterodimeric cytokine, interleukin-23 (IL-23) is emerging as a key player in both the innate and the adaptive T helper (Th)17 driven immune response as well as an initiator of several autoimmune diseases. The rate-limiting element of IL-23 production is believed to be driven by expression of the unique p19 subunit encoded by IL23A. We set out to perform comprehensive DNA sequencing of this previously under-studied gene in 96 individuals from two evolutionarily distinct human population groups, Southern African Bantu and European. We observed a total of 33 different DNA variants within these two groups, 22 (67%) of which are currently not reported in any available database. We further demonstrate both inter-population and intra-species sequence conservation within the coding and known regulatory regions of IL23A, supporting a critical physiological role for IL-23. We conclude that IL23A may have undergone positive selection pressure directed towards conservation, suggesting that functional genetic variants within IL23A will have a significant impact on the host immune response. PMID:20154336

  4. Alignment of U3 region sequences of mammalian type C viruses: identification of highly conserved motifs and implications for enhancer design.

    PubMed Central

    Golemis, E A; Speck, N A; Hopkins, N

    1990-01-01

    We aligned published sequences for the U3 region of 35 type C mammalian retroviruses. The alignment reveals that certain sequence motifs within the U3 region are strikingly conserved. A number of these motifs correspond to previously identified sites. In particular, we found that the enhancer region of most of the viruses examined contains a binding site for leukemia virus factor b, a viral corelike element, the consensus motif for nuclear factor 1, and the glucocorticoid response element. Most viruses containing more than one copy of enhancer sequences include these binding sites in both copies of the repeat. We consider this set of binding sites to constitute a framework for the enhancers of this set of viruses. Other highly conserved motifs in the U3 region include the retrovirus inverted repeat sequence, a negative regulatory element, and the CCAAT and TATA boxes. In addition, we identified two novel motifs in the promoter region that were exceptionally highly conserved but have not been previously described. PMID:2153223

  5. High-Throughput Sequencing Reveals Diverse Sets of Conserved, Nonconserved, and Species-Specific miRNAs in Jute.

    PubMed

    Islam, Md Tariqul; Ferdous, Ahlan Sabah; Najnin, Rifat Ara; Sarker, Suprovath Kumar; Khan, Haseena

    2015-01-01

    MicroRNAs play a pivotal role in regulating a broad range of biological processes, acting by cleaving mRNAs or by translational repression. A group of plant microRNAs are evolutionarily conserved; however, others are expressed in a species-specific manner. Jute is an agroeconomically important fibre crop; nonetheless, no practical information is available for microRNAs in jute to date. In this study, Illumina sequencing revealed a total of 227 known microRNAs and 17 potential novel microRNA candidates in jute, of which 164 belong to 23 conserved families and the remaining 63 belong to 58 nonconserved families. Among a total of 81 identified microRNA families, 116 potential target genes were predicted for 39 families and 11 targets were predicted for 4 among the 17 identified novel microRNAs. For understanding better the functions of microRNAs, target genes were analyzed by Gene Ontology and their pathways illustrated by KEGG pathway analyses. The presence of microRNAs identified in jute was validated by stem-loop RT-PCR followed by end point PCR and qPCR for randomly selected 20 known and novel microRNAs. This study exhaustively identifies microRNAs and their target genes in jute which will ultimately pave the way for understanding their role in this crop and other crops.

  6. High-Throughput Sequencing Reveals Diverse Sets of Conserved, Nonconserved, and Species-Specific miRNAs in Jute

    PubMed Central

    Islam, Md. Tariqul; Ferdous, Ahlan Sabah; Najnin, Rifat Ara; Sarker, Suprovath Kumar; Khan, Haseena

    2015-01-01

    MicroRNAs play a pivotal role in regulating a broad range of biological processes, acting by cleaving mRNAs or by translational repression. A group of plant microRNAs are evolutionarily conserved; however, others are expressed in a species-specific manner. Jute is an agroeconomically important fibre crop; nonetheless, no practical information is available for microRNAs in jute to date. In this study, Illumina sequencing revealed a total of 227 known microRNAs and 17 potential novel microRNA candidates in jute, of which 164 belong to 23 conserved families and the remaining 63 belong to 58 nonconserved families. Among a total of 81 identified microRNA families, 116 potential target genes were predicted for 39 families and 11 targets were predicted for 4 among the 17 identified novel microRNAs. For understanding better the functions of microRNAs, target genes were analyzed by Gene Ontology and their pathways illustrated by KEGG pathway analyses. The presence of microRNAs identified in jute was validated by stem-loop RT-PCR followed by end point PCR and qPCR for randomly selected 20 known and novel microRNAs. This study exhaustively identifies microRNAs and their target genes in jute which will ultimately pave the way for understanding their role in this crop and other crops. PMID:25861616

  7. Conservation of sequence in recombination signal sequence spacers.

    PubMed Central

    Ramsden, D A; Baetz, K; Wu, G E

    1994-01-01

    The variable domains of immunoglobulins and T cell receptors are assembled through the somatic, site-specific recombination of multiple germline segments (V, D, and J segments) or V(D)J rearrangement. The recombination signal sequence (RSS) is necessary and sufficient for cell-type-specific targeting of the V(D)J rearrangement machinery to these germline segments. Previously, the RSS has been described as possessing both a conserved heptamer and a conserved nonamer motif. The heptamer and nonamer motifs are separated by a 'spacer' that was not thought to possess significant sequence conservation; however, the length of the spacer could be either 12 +/- 1 bp or 23 +/- 1 bp. In this report we have assembled and analyzed an extensive database of published RSS. We have derived, through extensive consensus comparison, a more detailed description of the RSS than has previously been reported. Our analysis indicates that RSS spacers possess significant conservation of sequence, and that the conserved sequence in 12 bp spacers is similar to the conserved sequence in the first half of 23 bp spacers. PMID:8208601

  8. Sequence conservation on the Y chromosome

    SciTech Connect

    Gibson, L.H.; Yang-Feng, L.; Lau, C.

    1994-09-01

    The Y chromosome is present in all mammals and is considered to be essential to sex determination. Despite intense genomic research, only a few genes have been identified and mapped to this chromosome in humans. Several of them, such as SRY and ZFY, have been demonstrated to be conserved and Y-located in other mammals. In order to address the issue of sequence conservation on the Y chromosome, we performed fluorescence in situ hybridization (FISH) with DNA from a human Y cosmid library as a probe to study the Y chromosomes from other mammalian species. Total DNA from 3,000-4,500 cosmid pools was labeled with biotinylated-dUTP and hybridized to metaphase chromosomes. For human and primate preparations, human cot1 DNA was included in the hybridization mixture to suppress the hybridization from repeat sequences. FISH signals were detected on the Y chromosomes of human, gorilla, orangutan and baboon (Old World monkey) and were absent on those of squirrel monkey (New World monkey), Indian muntjac, wood lemming, Chinese hamster, rat and mouse. Since sequence analysis suggested that specific genes, e.g. SRY and ZFY, are conserved between these two groups, the lack of detectable hybridization in the latter group implies either that conservation of the human Y sequences is limited to the Y chromosomes of the great apes and Old World monkeys, or that the size of the syntenic segment is too small to be detected under the resolution of FISH, or that homologous sequences have undergone considerable divergence. Further studies with reduced hybridization stringency are currently being conducted. Our results provide some clues as to Y-sequence conservation across species and demonstrate the limitations of FISH across species with total DNA sequences from a particular chromosome.

  9. Comparison of orthologous and paralogous DNA flanking the wheat high molecular weight glutenin genes: sequence conservation and divergence, transposon distribution, and matrix-attachment regions.

    PubMed

    Anderson, O D; Larka, L; Christoffers, M J; McCue, K F; Gustafson, J P

    2002-04-01

    Extended flanking DNA sequences were characterized for five members of the wheat high molecular weight (HMW) glutenin gene family to understand more of the structure, control, and evolution of these genes. Analysis revealed more sequence conservation among orthologous regions than between paralogous regions, with differences mainly owing to transposition events involving putative retrotransposons and several miniature inverted transposable elements (MITEs). Both gypsy-like long terminal repeat (LTR) and non-LTR retrotransposon sequences are represented in the flanking DNAs. One of the MITEs is a novel class, but another MITE is related to the maize Stowaway family and is widely represented in Triticeae expressed sequence tags (ESTs). Flanking DNA of the longest sequence, a 20 425-bp fragment including and surrounding the HMW-glutenin Bx7 gene, showed additional cereal gene-like sequences both immediately 5' and 3' to the HMW-glutenin coding region. The transcriptional activities of sequences related to these flanking putative genes and the retrotransposon-related regions were indicated by matches to wheat and other Triticeae ESTs. Predictive analysis of matrix-attachment regions (MARs) of the HMW glutenin and several alpha-, gamma-, and omega-gliadin flanking DNAs indicates potential MARs immediately flanking each of the genes. Matrix binding activity in the predicted regions was confirmed for two of the HMW-glutenin genes.

  10. A highly conserved sequence associated with the HIV gp41 loop region is an immunomodulator of antigen-specific T cells in mice.

    PubMed

    Ashkenazi, Avraham; Faingold, Omri; Kaushansky, Nathali; Ben-Nun, Avraham; Shai, Yechiel

    2013-03-21

    Modulation of T-cell responses by HIV occurs via distinct mechanisms, one of which involves inactivation of T cells already at the stage of virus-cell fusion. Hydrophobic portions of the gp41 protein of the viral envelope that contributes to membrane fusion may modulate T-cell responsiveness. Here we found a highly conserved sequence (termed "ISLAD") that is associated with the membranotropic gp41 loop region. We showed that ISLAD has the ability to bind the T-cell membrane and to interact with the T-cell receptor (TCR) complex. Furthermore, ISLAD inhibited T-cell proliferation and interferon-γ secretion that resulted from TCR engagement through antigen-presenting cells. Moreover, administering ISLAD (10 µg per mouse) to an experimental autoimmune encephalomyelitis (EAE) model of multiple sclerosis reduced the severity of the disease. This was related to the inhibition of pathogenic T-cell proliferation and to reduced pro-inflammatory cytokine secretion in the lymph nodes of ISLAD-treated EAE mice. The data suggest that T-cell inactivation by HIV during membrane fusion may lie in part in this conserved sequence associated with the gp41 loop region.

  11. Conserved Noncoding Sequences in the Grasses

    PubMed Central

    Inada, Dan Choffnes; Bashir, Ali; Lee, Chunghau; Thomas, Brian C.; Ko, Cynthia; Goff, Stephen A.; Freeling, Michael

    2003-01-01

    As orthologous genes from related species diverge over time, some sequences are conserved in noncoding regions. In mammals, large phylogenetic footprints, or conserved noncoding sequences (CNSs), are known to be common features of genes. Here we present the first large-scale analysis of plant genes for CNSs. We used maize and rice, maximally diverged members of the grass family of monocots. Using a local sequence alignment set to deliver only significant alignments, we found one or more CNSs in the noncoding regions of the majority of genes studied. Grass genes have dramatically fewer and much smaller CNSs than mammalian genes. Twenty-seven percent of grass gene comparisons revealed no CNSs. Genes functioning in upstream regulatory roles, such as transcription factors, are greatly enriched for CNSs relative to genes encoding enzymes or structural proteins. Further, we show that a CNS cluster in an intron of the knotted1 homeobox gene serves as a site of negative regulation. We show that CNSs in the adh1 gene do not correlate with known cis-acting sites. We discuss the potential meanings of CNSs and their value as analytical tools and evolutionary characters. We advance the idea that many CNSs function to lock-in gene regulatory decisions. PMID:12952874

  12. Human IgE-binding protein: A soluble lectin exhibiting a highly conserved interspecies sequence and differential recognition of IgE glycoforms

    SciTech Connect

    Robertson, M.W.; Albrandt, K.; Keller, D.; Liu, Fu-Tong

    1990-09-04

    IgE-binding protein (εBP) refers to a protein originally identified in rat basophilic leukemia cells by virtue of its affinity for IgE. It is now known to be a β-galactoside-binding lectin equivalent to carbohydrate-binding protein 35 (CBP 35). More recently, its identity to Mac-2, a macrophage cell-surface protein, has been established. cDNA coding for human εBP has been cloned from a human HeLa cell cDNA library and contains an open reading frame of 750 base pairs encoding a 250 amino acid protein. Like the rat and murine counterparts, the human εBP amino acid sequence can be divided into two domains with the amino-terminal domain consisting of a highly conserved repetitive sequence (YPGXXXPGA) and the carboxyl-terminal domain containing sequences shared by other S-type lectins. The human εBP sequence exhibits extensive homology to murine and rat εBP with 84% and 82% identity, respectively. The homology is particularly striking in the carboxyl-terminal domain where 95% identity is found between human and murine sequences in a stretch of over 70 amino acids. A survey of εBP mRNA expression from several lymphocyte cell lines revealed that the level of εBP transcription may reflect a relationship between cell differentiation and εBP expression. Finally, human εBP was purified from several human cell lines and shown to possess lactose-binding characteristics and cross-species reactivity to murine IgE. Surprisingly, three different human myeloma IgE proteins did not show reactivity to human εBP. However, after neuraminidase treatment of each human IgE, pronounced binding to εBP was observed, thereby indicating that only specific IgE glycoforms can be recognized by εBP.

  13. Genome-Wide Analysis of Stowaway-Like MITEs in Wheat Reveals High Sequence Conservation, Gene Association, and Genomic Diversification

    PubMed Central

    Yaakov, Beery; Ben-David, Smadar; Kashkush, Khalil

    2013-01-01

    The diversity and evolution of wheat (Triticum-Aegilops group) genomes is determined, in part, by the activity of transposable elements that constitute a large fraction of the genome (up to 90%). In this study, we retrieved sequences from publicly available wheat databases, including a 454-pyrosequencing database, and analyzed 18,217 insertions of 18 Stowaway-like miniature inverted-repeat transposable element (MITE) families previously characterized in wheat that together account for approximately 1.3 Mb of sequence. All 18 families showed high conservation in length, sequence, and target site preference. Furthermore, approximately 55% of the elements were inserted in transcribed regions, into or near known wheat genes. Notably, we observed significant correlation between the mean length of the MITEs and their copy number. In addition, the genomic composition of nine MITE families was studied by real-time quantitative polymerase chain reaction analysis in 40 accessions of Triticum spp. and Aegilops spp., including diploids, tetraploids, and hexaploids. The quantitative polymerase chain reaction data showed massive and significant intraspecific and interspecific variation as well as genome-specific proliferation and nonadditive quantities in the polyploids. We also observed significant differences in the methylation status of the insertion sites among MITE families. Our data thus suggest a possible role for MITEs in generating genome diversification and in the establishment of nascent polyploid species in wheat. PMID:23104862

  14. Sequence of cDNAs for mammalian H2A.Z, an evolutionarily diverged but highly conserved basal histone H2A isoprotein species.

    PubMed Central

    Hatch, C L; Bonner, W M

    1988-01-01

    The nucleotide sequences of cDNAs for the evolutionarily diverged but highly conserved basal H2A isoprotein, H2A.Z, have been determined for the rat, cow, and human. As a basal histone, H2A.Z is synthesized throughout the cell cycle at a constant rate, unlinked to DNA replication, and at a much lower rate in quiescent cells. Each of the cDNA isolates encodes the entire H2A.Z polypeptide. The human isolate is about 1.0 kilobases long. It contains a coding region of 387 nucleotides flanked by 106 nucleotides of 5'UTR and 376 nucleotides of 3'UTR, which contains a polyadenylation signal followed by a poly A tail. The bovine and rat cDNAs have 97 and 94% nucleotide positional identity to the human cDNA in the coding region and 98% in the proximal 376 nucleotides of the 3'UTR which includes the polyadenylation signal. A potential stem-forming sequence imbedded in a direct repeat is found centered at 261 nucleotides into the 3'UTR. Each of the cDNA clones could be transcribed and translated in vitro to yield H2A.Z protein. The mammalian H2A.Z cDNA coding sequences are approximately 80% similar to those in chicken and 75% to those in sea urchin. PMID:3344202

  15. Multigenome DNA sequence conservation identifies Hox cis-regulatory elements

    PubMed Central

    Kuntz, Steven G.; Schwarz, Erich M.; DeModena, John A.; De Buysscher, Tristan; Trout, Diane; Shizuya, Hiroaki; Sternberg, Paul W.; Wold, Barbara J.

    2008-01-01

    To learn how well ungapped sequence comparisons of multiple species can predict cis-regulatory elements in Caenorhabditis elegans, we made such predictions across the large, complex ceh-13/lin-39 locus and tested them transgenically. We also examined how prediction quality varied with different genomes and parameters in our comparisons. Specifically, we sequenced ∼0.5% of the C. brenneri and C. sp. 3 PS1010 genomes, and compared five Caenorhabditis genomes (C. elegans, C. briggsae, C. brenneri, C. remanei, and C. sp. 3 PS1010) to find regulatory elements in 22.8 kb of noncoding sequence from the ceh-13/lin-39 Hox subcluster. We developed the MUSSA program to find ungapped DNA sequences with N-way transitive conservation, applied it to the ceh-13/lin-39 locus, and transgenically assayed 21 regions with both high and low degrees of conservation. This identified 10 functional regulatory elements whose activities matched known ceh-13/lin-39 expression, with 100% specificity and a 77% recovery rate. One element was so well conserved that a similar mouse Hox cluster sequence recapitulated the native nematode expression pattern when tested in worms. Our findings suggest that ungapped sequence comparisons can predict regulatory elements genome-wide. PMID:18981268
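    The ungapped comparison described in this abstract can be illustrated with a minimal sketch. It is not the MUSSA program: it only compares two sequences pairwise, and the function name, window size, and threshold below are assumptions chosen for illustration. Every window of one sequence is compared position by position (no gaps) with every window of the other, and window pairs with enough identical bases are reported.

```python
def ungapped_window_matches(seq_a, seq_b, window=10, threshold=8):
    """Return (i, j, matches) for each pair of ungapped windows that share
    at least `threshold` identical bases out of `window` positions.

    Hypothetical helper for illustration only; not MUSSA's implementation.
    """
    hits = []
    for i in range(len(seq_a) - window + 1):
        win_a = seq_a[i:i + window]
        for j in range(len(seq_b) - window + 1):
            win_b = seq_b[j:j + window]
            # Count identical bases at corresponding positions (no gaps allowed).
            matches = sum(a == b for a, b in zip(win_a, win_b))
            if matches >= threshold:
                hits.append((i, j, matches))
    return hits


if __name__ == "__main__":
    # Toy sequences (made up) that share the 7-base block CGTACGT.
    print(ungapped_window_matches("AAAACGTACGTTTT", "GGCGTACGTCC",
                                  window=7, threshold=7))
```

    An N-way transitive version would additionally require that a window matched between the first and second sequences also matches a window in each further sequence; that extension is left out of this sketch.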

  16. Use of genotyping by sequencing data to develop a high-throughput and multifunctional SNP panel for conservation applications in Pacific lamprey.

    PubMed

    Hess, Jon E; Campbell, Nathan R; Docker, Margaret F; Baker, Cyndi; Jackson, Aaron; Lampman, Ralph; McIlraith, Brian; Moser, Mary L; Statler, David P; Young, William P; Wildbill, Andrew J; Narum, Shawn R

    2015-01-01

    Next-generation sequencing data can be mined for highly informative single nucleotide polymorphisms (SNPs) to develop high-throughput genomic assays for nonmodel organisms. However, choosing a set of SNPs to address a variety of objectives can be difficult because SNPs are often not equally informative. We developed an optimal combination of 96 high-throughput SNP assays from a total of 4439 SNPs identified in a previous study of Pacific lamprey (Entosphenus tridentatus) and used them to address four disparate objectives: parentage analysis, species identification and characterization of neutral and adaptive variation. Nine of these SNPs are FST outliers, and five of these outliers are localized within genes and significantly associated with geography, run-timing and dwarf life history. Two of the 96 SNPs were diagnostic for two other lamprey species that were morphologically indistinguishable at early larval stages and were sympatric in the Pacific Northwest. The majority (85) of SNPs in the panel were highly informative for parentage analysis, that is, putatively neutral with high minor allele frequency across the species' range. Results from three case studies are presented to demonstrate the broad utility of this panel of SNP markers in this species. As Pacific lamprey populations are undergoing rapid decline, these SNPs provide an important resource to address critical uncertainties associated with the conservation and recovery of this imperiled species.

  17. Properties of Sequence Conservation in Upstream Regulatory and Protein Coding Sequences among Paralogs in Arabidopsis thaliana

    NASA Astrophysics Data System (ADS)

    Richardson, Dale N.; Wiehe, Thomas

    Whole genome duplication (WGD) has catalyzed the formation of new species, genes with novel functions, altered expression patterns, complexified signaling pathways and has provided organisms a level of genetic robustness. We studied the long-term evolution and interrelationships of 5’ upstream regulatory sequences (URSs), protein coding sequences (CDSs) and expression correlations (EC) of duplicated gene pairs in Arabidopsis. Three distinct methods revealed significant evolutionary conservation between paralogous URSs and were highly correlated with microarray-based expression correlation of the respective gene pairs. Positional information on exact matches between sequences unveiled the contribution of micro-chromosomal rearrangements on expression divergence. A three-way rank analysis of URS similarity, CDS divergence and EC uncovered specific gene functional biases. Transcription factor activity was associated with gene pairs exhibiting conserved URSs and divergent CDSs, whereas a broad array of metabolic enzymes was found to be associated with gene pairs showing diverged URSs but conserved CDSs.

  18. HIV-1 conserved-element vaccines: relationship between sequence conservation and replicative capacity.

    PubMed

    Rolland, Morgane; Manocheewa, Siriphan; Swain, J Victor; Lanxon-Cookson, Erinn C; Kim, Moon; Westfall, Dylan H; Larsen, Brendan B; Gilbert, Peter B; Mullins, James I

    2013-05-01

    To overcome the problem of HIV-1 variability, candidate vaccine antigens have been designed to be composed of conserved elements of the HIV-1 proteome. Such candidate vaccines could be improved with a better understanding of both HIV-1 evolutionary constraints and the fitness cost of specific mutations. We evaluated the in vitro fitness cost of 23 mutations engineered in the HIV-1 subtype B Gag-p24 Center-of-Tree (COT) protein through fitness competition assays. While some mutations at conserved sites exacted a high fitness cost, as expected under the assumption that the most conserved residue confers the highest fitness, there was no overall strong relationship between sequence conservation and replicative capacity. By comparing sites that have evolved since the beginning of the epidemic to those that have remained unchanged, we found that sites that have evolved over time were more likely to correspond to HLA-associated sites and that their mutation had limited fitness costs. Our data showed no transcendent link between high conservation and high fitness cost, indicating that merely focusing on conserved segments of HIV-1 would not be sufficient for a successful vaccine strategy. Nonetheless, a subset of sites exacted a high fitness cost upon mutation; these sites have been under selective pressure to change since the beginning of the epidemic but have proved virtually nonmutable and could constitute preferred targets for vaccine design.

  19. A highly conserved N-terminal sequence for teleost vitellogenin with potential value to the biochemistry, molecular biology and pathology of vitellogenesis

    USGS Publications Warehouse

    Folmar, L.D.; Denslow, N.D.; Wallace, R.A.; LaFleur, G.; Gross, T.S.; Bonomelli, S.; Sullivan, C.V.

    1995-01-01

    N-terminal amino acid sequences for vitellogenin (Vtg) from six species of teleost fish (striped bass, mummichog, pinfish, brown bullhead, medaka, yellow perch and the sturgeon) are compared with published N-terminal Vtg sequences for the lamprey, clawed frog and domestic chicken. Striped bass and mummichog had 100% identical amino acids between positions 7 and 21, while pinfish, brown bullhead, sturgeon, lamprey, Xenopus and chicken were 87%, 93%, 60%, 47%, 47-60% (for four transcripts) and 40% identical, respectively, with striped bass for the same positions. Partial sequences obtained for medaka and yellow perch were 100% identical between positions 5 to 10. The potential utility of this conserved sequence for studies on the biochemistry, molecular biology and pathology of vitellogenesis is discussed.
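    As a worked illustration of the percent-identity figures in this abstract: positions 7 to 21 span 15 residues, and the reported values are consistent with 13, 14, 9, 7, and 6 identical residues out of 15 (86.7%, 93.3%, 60%, 46.7%, and 40%). The sketch below shows the calculation; the function name and the example peptide are hypothetical and are not taken from the paper.

```python
def percent_identity(seq_a, seq_b, start, end):
    """Percent of identical residues between 1-based, inclusive positions
    start..end of two sequences that are already aligned without gaps.

    Hypothetical helper for illustration only.
    """
    region_a = seq_a[start - 1:end]
    region_b = seq_b[start - 1:end]
    if not region_a or len(region_a) != len(region_b):
        raise ValueError("both sequences must cover the requested positions")
    matches = sum(a == b for a, b in zip(region_a, region_b))
    return 100.0 * matches / len(region_a)


if __name__ == "__main__":
    # Made-up 21-residue peptide and a variant with substitutions at
    # positions 10 and 15, i.e. 13 of 15 residues identical in 7..21.
    reference = "MKAVLLALTLALVASQSVNFA"
    variant = reference[:9] + "X" + reference[10:14] + "X" + reference[15:]
    print(round(percent_identity(reference, variant, 7, 21)))
```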

  20. High-Throughput Sequencing Identifies Novel and Conserved Cucumber (Cucumis sativus L.) microRNAs in Response to Cucumber Green Mottle Mosaic Virus Infection.

    PubMed

    Liu, H W; Luo, L X; Liang, C Q; Jiang, N; Liu, P F; Li, J Q

    2015-01-01

    Seedlings of Cucumis sativus L. (cv. 'Zhongnong 16') were artificially inoculated with Cucumber green mottle mosaic virus (CGMMV) at the three-true-leaf stage. Leaf and flower samples were collected at different time points post-inoculation (10, 30 and 50 d), and processed by high throughput sequencing analysis to identify candidate miRNA sequences. Bioinformatic analysis using screening criteria, and secondary structure prediction, indicated that 8 novel and 23 known miRNAs (including 15 miRNAs described for the first time in vivo) were produced by cucumber plants in response to CGMMV infection. Moreover, gene expression profiles (p-value <0.01) validated the expression of 3 of the novel miRNAs and 3 of the putative candidate miRNAs and identified a further 82 conserved miRNAs in CGMMV-infected cucumbers. Gene ontology (GO) analysis revealed that the predicted target genes of these 88 miRNAs, which were screened using the psRNATarget and miRanda algorithms, were involved in three functional categories: 2265 in molecular function, 1362 as cellular components and 276 in biological process. The subsequent Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis revealed that the predicted target genes were frequently involved in metabolic processes (166 pathways) and genetic information processes (40 pathways) and to a lesser degree the biosynthesis of secondary metabolites (12 pathways). These results could provide useful clues to help elucidate host-pathogen interactions in CGMMV and cucumber, as well as for the screening of resistance genes.

  1. Functionally conserved enhancers with divergent sequences in distant vertebrates

    DOE PAGES

    Yang, Song; Oksenberg, Nir; Takayama, Sachiko; ...

    2015-10-30

    To examine the contributions of sequence and function conservation in the evolution of enhancers, we systematically identified enhancers whose sequences are not conserved among distant groups of vertebrate species, but have homologous function and are likely to be derived from a common ancestral sequence. In conclusion, our approach combined comparative genomics and epigenomics to identify potential enhancer sequences in the genomes of three groups of distantly related vertebrate species.

  2. The complete sequence of a Spanish isolate of Broad bean wilt virus 1 (BBWV-1) reveals a high variability and conserved motifs in the genus Fabavirus.

    PubMed

    Ferrer, R M; Guerri, J; Luis-Arteaga, M S; Moreno, P; Rubio, L

    2005-10-01

    The genome of a Spanish isolate of Broad bean wilt virus-1 (BBWV-1) was completely sequenced and compared with available sequences of other isolates of the genus Fabavirus (BBWV-1 and BBWV-2). The genome consisted of two RNAs of 5814 and 3431 nucleotides, respectively, and their organization was similar to that of other members of the family Comoviridae. Its mean nucleotide identity with a BBWV-1 American isolate was 81.5%, and between 59.8 and 63.5% with seven BBWV-2 isolates. Our analysis showed sequence stretches in the 5' non-coding regions which are conserved in both genomic RNAs and in BBWV-1 and BBWV-2 isolates.

  3. Novel low abundance and transient RNAs in yeast revealed by tiling microarrays and ultra high-throughput sequencing are not conserved across closely related yeast species.

    PubMed

    Lee, Albert; Hansen, Kasper Daniel; Bullard, James; Dudoit, Sandrine; Sherlock, Gavin

    2008-12-01

    A complete description of the transcriptome of an organism is crucial for a comprehensive understanding of how it functions and how its transcriptional networks are controlled, and may provide insights into the organism's evolution. Despite the status of Saccharomyces cerevisiae as arguably the most well-studied model eukaryote, we still do not have a full catalog or understanding of all its genes. In order to interrogate the transcriptome of S. cerevisiae for low abundance or rapidly turned over transcripts, we deleted elements of the RNA degradation machinery with the goal of preferentially increasing the relative abundance of such transcripts. We then used high-resolution tiling microarrays and ultra high-throughput sequencing (UHTS) to identify, map, and validate unannotated transcripts that are more abundant in the RNA degradation mutants relative to wild-type cells. We identified 365 currently unannotated transcripts, the majority presumably representing low abundance or short-lived RNAs, of which 185 are previously unknown and unique to this study. It is likely that many of these are cryptic unstable transcripts (CUTs), which are rapidly degraded and whose function(s) within the cell are still unclear, while others may be novel functional transcripts. Of the 185 transcripts we identified as novel to our study, greater than 80 percent come from regions of the genome that have lower conservation scores amongst closely related yeast species than 85 percent of the verified ORFs in S. cerevisiae. Such regions of the genome have typically been less well-studied, and by definition transcripts from these regions will distinguish S. cerevisiae from these closely related species.

  4. Identification of novel and conserved miRNAs involved in pollen development in Brassica campestris ssp. chinensis by high-throughput sequencing and degradome analysis

    PubMed Central

    2014-01-01

    Background microRNAs (miRNAs) are endogenous, noncoding, small RNAs that have essential regulatory functions in plant growth, development, and stress response processes. However, limited information is available about their functions in sexual reproduction of flowering plants. Pollen development is an important process in the life cycle of a flowering plant and is a major factor that affects the yield and quality of crop seeds. Results This study aims to identify miRNAs involved in pollen development. Two independent small RNA libraries were constructed from the flower buds of the male sterile line (Bcajh97-01A) and male fertile line (Bcajh97-01B) of Brassica campestris ssp. chinensis. The libraries were subjected to high-throughput sequencing by using the Illumina Solexa system. Eight novel miRNAs on the other arm of known pre-miRNAs, 54 new conserved miRNAs, and 8 novel miRNA members were identified. Twenty-five pairs of novel miRNA/miRNA* were found. Among all the identified miRNAs, 18 differentially expressed miRNAs with over two-fold change between flower buds of male sterile line (Bcajh97-01A) and male fertile line (Bcajh97-01B) were identified. qRT-PCR analysis revealed that most of the differentially expressed miRNAs were preferentially expressed in flower buds of the male fertile line (Bcajh97-01B). Degradome analysis showed that a total of 15 genes were predicted to be the targets of seven miRNAs. Conclusions Our findings provide an overview of potential miRNAs involved in pollen development and interactions between miRNAs and their corresponding targets, which may provide important clues on the function of miRNAs in pollen development. PMID:24559317

  5. A Developmental Sequence of Skills Leading to Conservation

    ERIC Educational Resources Information Center

    Walker, Alice A.

    1978-01-01

    Examines the developmental sequence of skills involved in the understanding of relational concepts and in the development of conservation. Fifty kindergarten children participated in the study. (BD/BR)

  6. Coupling DNA-binding and ATP hydrolysis in Escherichia coli RecQ: role of a highly conserved aromatic-rich sequence.

    PubMed

    Zittel, Morgan C; Keck, James L

    2005-01-01

    RecQ enzymes are broadly conserved Superfamily-2 (SF-2) DNA helicases that play critical roles in DNA metabolism. RecQ proteins use the energy of ATP hydrolysis to drive DNA unwinding; however, the mechanisms by which RecQ links ATPase activity to DNA-binding/unwinding are unknown. In many Superfamily-1 (SF-1) DNA helicases, helicase sequence motif III links these activities by binding both single-stranded (ss) DNA and ATP. However, the ssDNA-binding aromatic-rich element in motif III present in these enzymes is missing from SF-2 helicases, raising the question of how these enzymes link ATP hydrolysis to DNA-binding/unwinding. We show that Escherichia coli RecQ contains a conserved aromatic-rich loop in its helicase domain between motifs II and III. Although placement of the RecQ aromatic-rich loop is topologically distinct relative to the SF-1 enzymes, both loops map to similar tertiary structural positions. We examined the functions of the E.coli RecQ aromatic-rich loop using RecQ variants with single amino acid substitutions within the segment. Our results indicate that the aromatic-rich loop in RecQ is critical for coupling ATPase and DNA-binding/unwinding activities. Our studies also suggest that RecQ's aromatic-rich loop might couple ATP hydrolysis to DNA-binding in a mechanistically distinct manner from SF-1 helicases.

  7. Nucleotide sequence conservation in paramyxoviruses; the concept of codon constellation.

    PubMed

    Rima, Bert K

    2015-05-01

    The stability and conservation of the sequences of RNA viruses in the field and the high error rates measured in vitro are paradoxical. The field stability indicates that there are very strong selective constraints on sequence diversity. The nature of these constraints is discussed. Apart from constraints on variation in cis-acting RNA and the amino acid sequences of viral proteins, there are other ones relating to the presence of specific dinucleotides such as CpG and UpA as well as the importance of RNA secondary structures and RNA degradation rates. Other constraints recently identified in other RNA viruses, such as effects of secondary RNA structure on protein folding or modification of cellular tRNA complements, are also discussed. Using the family Paramyxoviridae, I show that the codon usage pattern (CUP) is (i) specific for each virus species and (ii) markedly different from that of the host; it does not vary even in vaccine viruses that have been derived by passage in a number of inappropriate host cells. The CUP might thus be an additional constraint on variation, and I propose the concept of codon constellation to indicate the informational content of the sequences of RNA molecules relating not only to stability and structure but also to the efficiency of translation of a viral mRNA resulting from the CUP and the numbers and position of rare codons.
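
    The two quantities this abstract leans on, a codon usage pattern and counts of dinucleotides such as CpG and UpA, can be illustrated with a minimal sketch. It is not the author's method: the function names and the toy sequence are hypothetical, and the sequence is assumed to be an in-frame coding region.

```python
from collections import Counter


def codon_usage(cds: str) -> dict:
    """Relative frequency of each codon in an in-frame coding sequence.

    Assumption: `cds` starts at codon position 1; trailing bases that do
    not fill a whole codon are ignored. DNA input is converted to RNA.
    """
    seq = cds.upper().replace("T", "U")
    usable = len(seq) - len(seq) % 3
    codons = [seq[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    total = sum(counts.values())
    return {codon: n / total for codon, n in counts.items()} if total else {}


def dinucleotide_count(seq: str, dinuc: str) -> int:
    """Count overlapping occurrences of a dinucleotide (e.g. 'CG' or 'UA')."""
    seq = seq.upper().replace("T", "U")
    return sum(1 for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc)


# Hypothetical toy sequence, not taken from any paramyxovirus genome.
toy = "AUGGCGCGAUAA"
usage = codon_usage(toy)               # four distinct codons, 0.25 each
cpg = dinucleotide_count(toy, "CG")    # 2
upa = dinucleotide_count(toy, "UA")    # 1
```

    Comparing such frequency tables between a virus and its host would be one simple way to look at how different their usage patterns are; the abstract does not say which comparison was actually used.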

  8. Housekeeping genes tend to show reduced upstream sequence conservation

    PubMed Central

    Farré, Domènec; Bellora, Nicolás; Mularoni, Loris; Messeguer, Xavier; Albà, M Mar

    2007-01-01

    Background Understanding the constraints that operate in mammalian gene promoter sequences is of key importance to understand the evolution of gene regulatory networks. The level of promoter conservation varies greatly across orthologous genes, denoting differences in the strength of the evolutionary constraints. Here we test the hypothesis that the number of tissues in which a gene is expressed is related in a significant manner to the extent of promoter sequence conservation. Results We show that mammalian housekeeping genes, expressed in all or nearly all tissues, show significantly lower promoter sequence conservation, especially upstream of position -500 with respect to the transcription start site, than genes expressed in a subset of tissues. In addition, we evaluate the effect of gene function, CpG island content and protein evolutionary rate on promoter sequence conservation. Finally, we identify a subset of transcription factors that bind to motifs that are specifically over-represented in housekeeping gene promoters. Conclusion This is the first report that shows that the promoters of housekeeping genes show reduced sequence conservation with respect to genes expressed in a more tissue-restricted manner. This is likely to be related to simpler gene expression, requiring a smaller number of functional cis-regulatory motifs. PMID:17626644

  9. Localization of the labile disulfide bond between SU and TM of the murine leukemia virus envelope protein complex to a highly conserved CWLC motif in SU that resembles the active-site sequence of thiol-disulfide exchange enzymes.

    PubMed Central

    Pinter, A; Kopelman, R; Li, Z; Kayman, S C; Sanders, D A

    1997-01-01

    Previous studies have indicated that the surface (SU) and transmembrane (TM) subunits of the envelope protein (Env) of murine leukemia viruses (MuLVs) are joined by a labile disulfide bond that can be stabilized by treatment of virions with thiol-specific reagents. In the present study this observation was extended to the Envs of additional classes of MuLV, and the cysteines of SU involved in this linkage were mapped by proteolytic fragmentation analyses to the CWLC sequence present at the beginning of the C-terminal domain of SU. This sequence is highly conserved across a broad range of distantly related retroviruses and resembles the CXXC motif present at the active site of thiol-disulfide exchange enzymes. A model is proposed in which rearrangements of the SU-TM intersubunit disulfide linkage, mediated by the CWLC sequence, play roles in the assembly and function of the Env complex. PMID:9311907

  10. Conserved Sequence Preferences Contribute to Substrate Recognition by the Proteasome

    PubMed Central

    Yu, Houqing; Singh Gautam, Amit K.; Wilmington, Shameika R.; Wylie, Dennis; Martinez-Fonts, Kirby; Kago, Grace; Warburton, Marie; Chavali, Sreenivas; Inobe, Tomonao; Finkelstein, Ilya J.; Babu, M. Madan

    2016-01-01

    The proteasome has pronounced preferences for the amino acid sequence of its substrates at the site where it initiates degradation. Here, we report that modulating these sequences can tune the steady-state abundance of proteins over 2 orders of magnitude in cells. This is the same dynamic range as seen for inducing ubiquitination through a classic N-end rule degron. The stability and abundance of His3 constructs dictated by the initiation site affect survival of yeast cells and show that variation in proteasomal initiation can affect fitness. The proteasome's sequence preferences are linked directly to the affinity of the initiation sites to their receptor on the proteasome and are conserved between Saccharomyces cerevisiae, Schizosaccharomyces pombe, and human cells. These findings establish that the sequence composition of unstructured initiation sites influences protein abundance in vivo in an evolutionarily conserved manner and can affect phenotype and fitness. PMID:27226608

  11. The highly conserved amino acid sequence motif Tyr-Gly-Asp-Thr-Asp-Ser in alpha-like DNA polymerases is required by phage phi 29 DNA polymerase for protein-primed initiation and polymerization.

    PubMed Central

    Bernad, A; Lázaro, J M; Salas, M; Blanco, L

    1990-01-01

    The alpha-like DNA polymerases from bacteriophage phi 29 and other viruses, prokaryotes and eukaryotes contain an amino acid consensus sequence that has been proposed to form part of the dNTP binding site. We have used site-directed mutants to study five of the six highly conserved consecutive amino acids corresponding to the most conserved C-terminal segment (Tyr-Gly-Asp-Thr-Asp-Ser). Our results indicate that in phi 29 DNA polymerase this consensus sequence, although irrelevant for the 3'→5' exonuclease activity, is essential for initiation and elongation. Based on these results and on its homology with known or putative metal-binding amino acid sequences, we propose that in phi 29 DNA polymerase the Tyr-Gly-Asp-Thr-Asp-Ser consensus motif is part of the dNTP binding site, involved in the synthetic activities of the polymerase (i.e., initiation and polymerization), and that it is involved particularly in the metal binding associated with the dNTP site. Images PMID:2191296

  12. Conserved intergenic sequences revealed by CTAG-profiling in Salmonella: thermodynamic modeling for function prediction

    PubMed Central

    Tang, Le; Zhu, Songling; Mastriani, Emilio; Fang, Xin; Zhou, Yu-Jie; Li, Yong-Guo; Johnston, Randal N.; Guo, Zheng; Liu, Gui-Rong; Liu, Shu-Lin

    2017-01-01

    Highly conserved short sequences help identify functional genomic regions and facilitate genomic annotation. We used Salmonella as the model to search the genome for evolutionarily conserved regions and focused on the tetranucleotide sequence CTAG for its potentially important functions. In Salmonella, CTAG is highly conserved across the lineages and large numbers of CTAG-containing short sequences fall in intergenic regions, strongly indicating their biological importance. Computer modeling demonstrated stable stem-loop structures in some of the CTAG-containing intergenic regions, and substitution of a nucleotide of the CTAG sequence would radically rearrange the free energy and disrupt the structure. The postulated degeneration of CTAG takes distinct patterns among Salmonella lineages and provides novel information about genomic divergence and evolution of these bacterial pathogens. Comparison of the vertically and horizontally transmitted genomic segments showed different CTAG distribution landscapes, with the genome amelioration process to remove CTAG taking place inward from both terminals of the horizontally acquired segment. PMID:28262684
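
    As a rough illustration of what locating CTAG tetranucleotides and sorting them into genic and intergenic regions could look like, here is a minimal sketch. It does not reproduce the authors' CTAG-profiling pipeline: the function names, the toy genome string, and the gene coordinates are hypothetical. Because CTAG is its own reverse complement, scanning a single strand finds every site.

```python
def find_motif(genome: str, motif: str = "CTAG") -> list:
    """Return 0-based start positions of all (overlapping) motif occurrences."""
    genome = genome.upper()
    positions = []
    start = genome.find(motif)
    while start != -1:
        positions.append(start)
        start = genome.find(motif, start + 1)
    return positions


def split_by_region(positions, genes, motif_len: int = 4):
    """Split motif positions into genic and intergenic lists.

    Assumption: `genes` holds half-open (start, end) intervals, and a motif
    counts as genic if it overlaps any gene by at least one base.
    """
    genic, intergenic = [], []
    for p in positions:
        overlaps = any(p < end and p + motif_len > start for start, end in genes)
        (genic if overlaps else intergenic).append(p)
    return genic, intergenic


# Hypothetical toy data: one annotated gene covering bases 0-7.
toy_genome = "AACTAGGGGGCTAGTT"
toy_genes = [(0, 8)]
sites = find_motif(toy_genome)                         # [2, 10]
genic, intergenic = split_by_region(sites, toy_genes)  # [2], [10]
```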

  13. Conserved intergenic sequences revealed by CTAG-profiling in Salmonella: thermodynamic modeling for function prediction

    NASA Astrophysics Data System (ADS)

    Tang, Le; Zhu, Songling; Mastriani, Emilio; Fang, Xin; Zhou, Yu-Jie; Li, Yong-Guo; Johnston, Randal N.; Guo, Zheng; Liu, Gui-Rong; Liu, Shu-Lin

    2017-03-01

    Highly conserved short sequences help identify functional genomic regions and facilitate genomic annotation. We used Salmonella as the model to search the genome for evolutionarily conserved regions and focused on the tetranucleotide sequence CTAG for its potentially important functions. In Salmonella, CTAG is highly conserved across the lineages and large numbers of CTAG-containing short sequences fall in intergenic regions, strongly indicating their biological importance. Computer modeling demonstrated stable stem-loop structures in some of the CTAG-containing intergenic regions, and substitution of a nucleotide of the CTAG sequence would radically rearrange the free energy and disrupt the structure. The postulated degeneration of CTAG takes distinct patterns among Salmonella lineages and provides novel information about genomic divergence and evolution of these bacterial pathogens. Comparison of the vertically and horizontally transmitted genomic segments showed different CTAG distribution landscapes, with the genome amelioration process to remove CTAG taking place inward from both terminals of the horizontally acquired segment.

  14. Conserved intergenic sequences revealed by CTAG-profiling in Salmonella: thermodynamic modeling for function prediction.

    PubMed

    Tang, Le; Zhu, Songling; Mastriani, Emilio; Fang, Xin; Zhou, Yu-Jie; Li, Yong-Guo; Johnston, Randal N; Guo, Zheng; Liu, Gui-Rong; Liu, Shu-Lin

    2017-03-06

    Highly conserved short sequences help identify functional genomic regions and facilitate genomic annotation. We used Salmonella as the model to search the genome for evolutionarily conserved regions and focused on the tetranucleotide sequence CTAG for its potentially important functions. In Salmonella, CTAG is highly conserved across the lineages and large numbers of CTAG-containing short sequences fall in intergenic regions, strongly indicating their biological importance. Computer modeling demonstrated stable stem-loop structures in some of the CTAG-containing intergenic regions, and substitution of a nucleotide of the CTAG sequence would radically rearrange the free energy and disrupt the structure. The postulated degeneration of CTAG takes distinct patterns among Salmonella lineages and provides novel information about genomic divergence and evolution of these bacterial pathogens. Comparison of the vertically and horizontally transmitted genomic segments showed different CTAG distribution landscapes, with the genome amelioration process to remove CTAG taking place inward from both terminals of the horizontally acquired segment.

  15. Protein sequence conservation and stable molecular evolution reveals influenza virus nucleoprotein as a universal druggable target.

    PubMed

    Babar, Mustafeez Mujtaba; Zaidi, Najam-us-Sahar Sadaf

    2015-08-01

    The high mutation rate in influenza virus genome and appearance of drug resistance calls for a constant effort to identify alternate drug targets and develop new antiviral strategies. The internal proteins of the virus can be exploited as a potential target for therapeutic interventions. Among these, the nucleoprotein (NP) is the most abundant protein that provides structural and functional support to the viral replication machinery. The current study aims at analysis of protein sequence polymorphism patterns, degree of molecular evolution and sequence conservation as a function of potential druggability of nucleoprotein. We analyzed a universal set of amino acid sequences, (n=22,000) and, in order to identify and correlate the functionally conserved, druggable regions across different parameters, classified them on the basis of host organism, strain type and continental region of sample isolation. The results indicated that around 95% of the sequence length was conserved, with at least 7 regions conserved across the protein among various classes. Moreover, the highly variable regions, though very limited in number, were found to be positively selected indicating, thereby, the high degree of protein stability against various hosts and spatio-temporal references. Furthermore, on mapping the conserved regions on the protein, 7 drug binding pockets in the functionally important regions of the protein were revealed. The results, therefore, collectively indicate that nucleoprotein is a highly conserved and stable viral protein that can potentially be exploited for development of broadly effective antiviral strategies.

  16. Patterns of sequence conservation in the S-Layer proteins and related sequences in Clostridium difficile.

    PubMed

    Calabi, Emanuela; Fairweather, Neil

    2002-07-01

    Clostridium difficile is the etiological agent of antibiotic-associated diarrhea. Among the factors that may play a role in infection are S-layer proteins (SLPs). Previous work has shown these to consist mainly of two components, resulting from the cleavage of a precursor encoded by the slpA gene. The high-molecular-weight (MW) subunit is related both to amidases from B. subtilis and to at least another 28 gene products in C. difficile strain 630. To gain insight into the functions of the SLPs and related proteins, we have further investigated the pattern of variability both at the slpA locus and at six nearby paralogs. Sequencing of the slpA gene from an S-layer group II strain and a variant S-layer group strain confirms a high degree of divergence in the low-MW SLP, which may result from diversifying selection. A highly conserved motif, however, is found at the C terminus in all low-MW subunits and may be essential for SlpA precursor cleavage. In strain 167, a variant cleavage product is present, suggesting a secondary processing site. Southern blotting analysis shows slpA-like open reading frames (ORFs) 2 to 7 to be conserved in all nine strains tested, with one exception: ORF2, which encodes a 66-kDa polypeptide coextracted at low pH with the main SLPs in strain 630, may be partially deleted in strain 167. Polymorphism within the slpA-ORF7 cluster may be more pronounced in the region proximal to the slpA gene. Unexpectedly, a high-MW subunit probe cross hybridizes to sequences outside the slpA locus, which appear to vary in number in different strains.

  17. Local Function Conservation in Sequence and Structure Space

    PubMed Central

    Weinhold, Nils; Sander, Oliver; Domingues, Francisco S.; Lengauer, Thomas; Sommer, Ingolf

    2008-01-01

    We assess the variability of protein function in protein sequence and structure space. Various regions in this space exhibit considerable difference in the local conservation of molecular function. We analyze and capture local function conservation by means of logistic curves. Based on this analysis, we propose a method for predicting molecular function of a query protein with known structure but unknown function. The prediction method is rigorously assessed and compared with a previously published function predictor. Furthermore, we apply the method to 500 functionally unannotated PDB structures and discuss selected examples. The proposed approach provides a simple yet consistent statistical model for the complex relations between protein sequence, structure, and function. The GOdot method is available online (http://godot.bioinf.mpi-inf.mpg.de). PMID:18604264

  18. Local function conservation in sequence and structure space.

    PubMed

    Weinhold, Nils; Sander, Oliver; Domingues, Francisco S; Lengauer, Thomas; Sommer, Ingolf

    2008-07-04

    We assess the variability of protein function in protein sequence and structure space. Various regions in this space exhibit considerable difference in the local conservation of molecular function. We analyze and capture local function conservation by means of logistic curves. Based on this analysis, we propose a method for predicting molecular function of a query protein with known structure but unknown function. The prediction method is rigorously assessed and compared with a previously published function predictor. Furthermore, we apply the method to 500 functionally unannotated PDB structures and discuss selected examples. The proposed approach provides a simple yet consistent statistical model for the complex relations between protein sequence, structure, and function. The GOdot method is available online (http://godot.bioinf.mpi-inf.mpg.de).

  19. Internal epitope tagging informed by relative lack of sequence conservation

    PubMed Central

    Burg, Leonard; Zhang, Karen; Bonawitz, Tristan; Grajevskaja, Viktorija; Bellipanni, Gianfranco; Waring, Richard; Balciunas, Darius

    2016-01-01

    Many experimental techniques rely on specific recognition and stringent binding of proteins by antibodies. This can readily be achieved by introducing an epitope tag. We employed an approach that uses a relative lack of evolutionary conservation to inform epitope tag site selection, followed by integration of the tag-coding sequence into the endogenous locus in zebrafish. We demonstrate that an internal epitope tag is accessible for antibody binding, and that tagged proteins retain wild type function. PMID:27892520

  20. High-Throughput Sequencing Technologies

    PubMed Central

    Reuter, Jason A.; Spacek, Damek; Snyder, Michael P.

    2015-01-01

    Summary The human genome sequence has profoundly altered our understanding of biology, human diversity and disease. The path from the first draft sequence to our nascent era of personal genomes and genomic medicine has been made possible only because of the extraordinary advancements in DNA sequencing technologies over the past ten years. Here, we discuss commonly used high-throughput sequencing platforms, the growing array of sequencing assays developed around them as well as the challenges facing current sequencing platforms and their clinical application. PMID:26000844

  1. Conservation patterns in different functional sequence categories of divergent Drosophila species

    SciTech Connect

    Papatsenko, Dmitri; Kislyuk, Andrey; Levine, Michael; Dubchak, Inna

    2005-10-01

    We have explored the distributions of fully conserved ungapped blocks in genome-wide pairwise alignments of recently completed species of Drosophila: D. yakuba, D. ananassae, D. pseudoobscura, D. virilis and D. mojavensis. Based on these distributions we have found that nearly every functional sequence category possesses its own distinctive conservation pattern, sometimes independent of the overall sequence conservation level. In the coding and regulatory regions, the ungapped blocks were longer than in introns, UTRs and non-functional sequences. At the same time, the blocks in the coding regions carried 3N+2 signature characteristic to synonymic substitutions in the 3rd codon positions. Larger block sizes in transcription regulatory regions can be explained by the presence of conserved arrays of binding sites for transcription factors. We also have shown that the longest ungapped blocks, or 'ultraconserved' sequences, are associated with specific gene groups, including those encoding ion channels and components of the cytoskeleton. We discussed how restrained conservation patterns may help in mapping functional sequence categories and improving genome annotation.
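
    The block statistic described in this abstract can be sketched as follows, under stated assumptions: the two input strings are rows of a pairwise alignment of equal length, '-' marks a gap, and a fully conserved ungapped block is a maximal run of identical, non-gap columns. The function names and toy alignment are hypothetical and are not the authors' code. The second helper reflects the 3N+2 observation: if the only mismatches fall on third codon positions, the conserved run between two such mismatches spans 3N+2 columns.

```python
def conserved_block_lengths(aln_a: str, aln_b: str, gap: str = "-") -> list:
    """Lengths of maximal runs of identical, ungapped alignment columns."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    lengths = []
    run = 0
    for a, b in zip(aln_a.upper(), aln_b.upper()):
        if a == b and a != gap:
            run += 1
        else:
            if run:
                lengths.append(run)
            run = 0
    if run:
        lengths.append(run)
    return lengths


def fraction_3n_plus_2(lengths) -> float:
    """Share of blocks whose length leaves remainder 2 when divided by 3."""
    if not lengths:
        return 0.0
    return sum(1 for n in lengths if n % 3 == 2) / len(lengths)


# Hypothetical toy alignment: one mismatch column and one gap column.
row_a = "ATGGCAGT-ACC"
row_b = "ATGGCTGTTACC"
blocks = conserved_block_lengths(row_a, row_b)   # [5, 2, 3]
share = fraction_3n_plus_2(blocks)               # 2 of the 3 blocks
```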

  2. Conservation patterns in angiosperm rDNA ITS2 sequences.

    PubMed Central

    Hershkovitz, M A; Zimmer, E A

    1996-01-01

    The two internal transcribed spacers (ITS1 and ITS2) of nuclear ribosomal DNA have become commonly exploited sources of informative variation for interspecific-/intergeneric-level phylogenetic analyses among angiosperms and other eukaryotes. We present an alignment in which one-third to one-half of the ITS2 sequence is alignable above the family level in angiosperms and a phenetic analysis showing that ITS2 contains information sufficient to diagnose lineages at several hierarchical levels. Base compositional analysis shows that angiosperm ITS2 is inherently GC-rich, and that the proportion of T is much more variable than that for other bases. We propose a general model of angiosperm ITS2 secondary structure that shows common pairing relationships for most of the conserved sequence tracts. Variations in our secondary structure predictions for sequences from different taxa indicate that compensatory mutation is not limited to paired positions. PMID:8760866

  3. Conservative Patch Algorithm and Mesh Sequencing for PAB3D

    NASA Technical Reports Server (NTRS)

    Pao, S. P.; Abdol-Hamid, K. S.

    2005-01-01

    A mesh-sequencing algorithm and a conservative patched-grid-interface algorithm (hereafter Patch Algorithm) have been incorporated into the PAB3D code, which is a computer program that solves the Navier-Stokes equations for the simulation of subsonic, transonic, or supersonic flows surrounding an aircraft or other complex aerodynamic shapes. These algorithms are efficient, flexible, and have added tremendously to the capabilities of PAB3D. The mesh-sequencing algorithm makes it possible to perform preliminary computations using only a fraction of the grid cells (provided the original cell count is divisible by an integer) along any grid coordinate axis, independently of the other axes. The patch algorithm addresses another critical need in multi-block grid situations where the cell faces of adjacent grid blocks may not coincide, leading to errors in calculating fluxes of conserved physical quantities across interfaces between the blocks. The patch algorithm, based on the Stokes integral formulation of the applicable conservation laws, effectively matches each of the interfacial cells on one side of the block interface to the corresponding fractional cell area pieces on the other side. This approach is comprehensive and unified such that all interface topology is automatically processed without user intervention. This algorithm is implemented in a preprocessing code that creates a cell-by-cell database that will maintain flux conservation at any level of full or reduced grid density as the user may choose by way of the mesh-sequencing algorithm. These two algorithms have enhanced the numerical accuracy of the code, reduced the time and effort for grid preprocessing, and provided users with the flexibility of performing computations at any desired full or reduced grid resolution to suit their specific computational requirements.
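
    The divisibility condition mentioned for mesh sequencing can be illustrated with a small sketch. This is not PAB3D code: the function name and the toy grid are hypothetical. The sketch assumes that a coarse level along one axis is formed by keeping every k-th grid point, which only preserves the end points when the cell count along that axis is divisible by k.

```python
def coarsen_axis(points, factor: int):
    """Keep every `factor`-th grid point along one coordinate axis.

    Assumption: `points` lists the grid-point coordinates along the axis,
    so the axis has len(points) - 1 cells.
    """
    n_cells = len(points) - 1
    if factor < 1 or n_cells % factor != 0:
        raise ValueError("cell count must be divisible by the coarsening factor")
    return points[::factor]


# Hypothetical 4-cell axes; each axis is coarsened independently of the other.
x_points = [0.0, 0.25, 0.5, 0.75, 1.0]
y_points = [0.0, 0.25, 0.5, 0.75, 1.0]
x_coarse = coarsen_axis(x_points, 2)   # [0.0, 0.5, 1.0], 2 cells
y_coarse = coarsen_axis(y_points, 1)   # unchanged, 4 cells
```

    A factor that does not divide the cell count evenly (3 on these 4-cell axes, for example) raises an error here rather than silently dropping the last grid point.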

  4. Conservation of sequence and function in fertilization of the cortical granule serine protease in echinoderms.

    PubMed

    Oulhen, Nathalie; Xu, Dongdong; Wessel, Gary M

    2014-08-01

    Conservation of the cortical granule serine protease during fertilization in echinoderms was tested both functionally in sea stars, and computationally throughout the echinoderm phylum. We find that the inhibitor of serine protease (soybean trypsin inhibitor) effectively blocks proper transition of the sea star fertilization envelope into a protective sperm repellent, whereas inhibitors of the other main types of proteases had no effect. Scanning the transcriptomes of 15 different echinoderm ovaries revealed sequences of high conservation to the originally identified sea urchin cortical serine protease, CGSP1. These conserved sequences contained the catalytic triad necessary for enzymatic activity, and the tandemly repeated LDLr-like repeats. We conclude that the protease involved in the slow block to polyspermy is an essential and conserved element of fertilization in echinoderms, and may provide an important reagent for identification and testing of the cell surface proteins in eggs necessary for sperm binding.

  5. Conservation of sequence and function in fertilization of the cortical granule serine protease in echinoderms

    PubMed Central

    Oulhen, Nathalie; Xu, Dongdong; Wessel, Gary M.

    2014-01-01

    Conservation of the cortical granule serine protease during fertilization in echinoderms was tested both functionally in sea stars, and computationally throughout the echinoderm phylum. We find that the inhibitor of serine protease (soybean trypsin inhibitor) effectively blocks proper transition of the sea star fertilization envelope into a protective sperm repellent, whereas inhibitors of the other main types of proteases had no effect. Scanning the transcriptomes of 15 different echinoderm ovaries revealed sequences of high conservation to the originally identified sea urchin cortical serine protease, CGSP1. These conserved sequences contained the catalytic triad necessary for enzymatic activity, and the tandemly repeated LDLr-like repeats. We conclude that the protease involved in the slow block to polyspermy is an essential and conserved element of fertilization in echinoderms, and may provide an important reagent for identification and testing of the cell surface proteins in eggs necessary for sperm binding. PMID:24878526

  6. In Vivo Enhancer Analysis Chromosome 16 Conserved Noncoding Sequences

    SciTech Connect

    Pennacchio, Len A.; Ahituv, Nadav; Moses, Alan M.; Nobrega, Marcelo; Prabhakar, Shyam; Shoukry, Malak; Minovitsky, Simon; Visel, Axel; Dubchak, Inna; Holt, Amy; Lewis, Keith D.; Plajzer-Frick, Ingrid; Akiyama, Jennifer; De Val, Sarah; Afzal, Veena; Black, Brian L.; Couronne, Olivier; Eisen, Michael B.; Rubin, Edward M.

    2006-02-01

    The identification of enhancers with predicted specificities in vertebrate genomes remains a significant challenge that is hampered by a lack of experimentally validated training sets. In this study, we leveraged extreme evolutionary sequence conservation as a filter to identify putative gene regulatory elements and characterized the in vivo enhancer activity of human-fish conserved and ultraconserved noncoding elements on human chromosome 16 as well as such elements from elsewhere in the genome. We initially tested 165 of these extremely conserved sequences in a transgenic mouse enhancer assay and observed that 48 percent (79/165) functioned reproducibly as tissue-specific enhancers of gene expression at embryonic day 11.5. While driving expression in a broad range of anatomical structures in the embryo, the majority of the 79 enhancers drove expression in various regions of the developing nervous system. Studying a set of DNA elements that specifically drove forebrain expression, we identified DNA signatures specifically enriched in these elements and used these parameters to rank all ~3,400 human-fugu conserved noncoding elements in the human genome. The testing of the top predictions in transgenic mice resulted in a three-fold enrichment for sequences with forebrain enhancer activity. These data dramatically expand the catalogue of in vivo-characterized human gene enhancers and illustrate the future utility of such training sets for a variety of biological applications including decoding the regulatory vocabulary of the human genome.

  7. 77 FR 74167 - Information Collection Request: Highly Erodible Land Conservation and Wetland Conservation

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-12-13

    ... Farm Service Agency Information Collection Request: Highly Erodible Land Conservation and Wetland Conservation AGENCIES: Farm Service Agency, USDA. ACTION: Notice; request for comments. SUMMARY: In accordance... associated with Highly Erodible Land Conservation and Wetland Conservation certification requirements....

  8. Genetic mapping of legume orthologs reveals high conservation of synteny between lentil species and the sequenced genomes of Medicago and chickpea

    PubMed Central

    Gujaria-Verma, Neha; Vail, Sally L.; Carrasquilla-Garcia, Noelia; Penmetsa, R. Varma; Cook, Douglas R.; Farmer, Andrew D.; Vandenberg, Albert; Bett, Kirstin E.

    2014-01-01

    Lentil (Lens culinaris Medik.) is a global food crop with increasing importance for food security in south Asia and other regions. Lens ervoides, a wild relative of cultivated lentil, is an important source of agronomic trait variation. Lens is a member of the galegoid clade of the Papilionoideae family, which includes other important dietary legumes such as chickpea (Cicer arietinum) and pea (Pisum sativum), and the sequenced model legume Medicago truncatula. Understanding the genetic structure of Lens spp. in relation to more fully sequenced legumes would allow leveraging of genomic resources. A set of 1107 TOG-based amplicons were identified in L. ervoides and a subset thereof used to design SNP markers for mapping. A map of L. ervoides consisting of 377 SNP markers spread across seven linkage groups was developed using a GoldenGate genotyping array and single SNP marker assays. Comparison with maps of M. truncatula and L. culinaris documented considerable shared synteny and led to the identification of a few major translocations and a major inversion that distinguish Lens from M. truncatula, as well as a translocation that distinguishes L. culinaris from L. ervoides. The identification of chromosome-level differences among Lens spp. will aid in the understanding of introgression of genes from L. ervoides into cultivated L. culinaris, furthering genetic research and breeding applications in lentil. PMID:25538716

  9. Genome-Wide Identification and Comparative Analysis of Conserved and Novel MicroRNAs in Grafted Watermelon by High-Throughput Sequencing

    PubMed Central

    Liu, Na; Yang, Jinghua; Guo, Shaogui; Xu, Yong; Zhang, Mingfang

    2013-01-01

    MicroRNAs (miRNAs) are a class of endogenous small non-coding RNAs involved in post-transcriptional gene regulation and play a critical role in plant growth, development and stress response. However, less is known about miRNA involvement in grafting behaviors, especially in the watermelon (Citrullus lanatus L.) crop, which is one of the most important agricultural crops worldwide. Grafting is commonly used in watermelon production in attempts to improve its adaptation to abiotic and biotic stresses, in particular to the soil-borne fusarium wilt disease. In this study, Solexa sequencing has been used to discover small RNA populations and compare miRNAs on a genome-wide scale in a watermelon grafting system. A total of 11,458,476, 11,614,094 and 9,339,089 raw reads representing 2,957,751, 2,880,328 and 2,964,990 unique sequences were obtained from the scions of self-grafted watermelon and watermelon grafted onto bottle gourd and squash at the two true-leaf stage, respectively. 39 known miRNAs belonging to 30 miRNA families and 80 novel miRNAs were identified in our small RNA dataset. Compared with self-grafted watermelon, 20 (5 known miRNA families and 15 novel miRNAs) and 47 (17 known miRNA families and 30 novel miRNAs) miRNAs were expressed significantly differently in watermelon grafted onto bottle gourd and squash, respectively. MiRNAs were expressed differentially when watermelon was grafted onto different rootstocks, suggesting that miRNAs might play an important role in diverse biological and metabolic processes in watermelon and that grafting may act by changing miRNA expression to regulate plant growth and development as well as adaptation to stresses. The small RNA transcriptomes obtained in this study provided insights into molecular aspects of miRNA-mediated regulation in grafted watermelon. Obviously, this result would provide a basis for further unravelling the mechanism on how miRNAs information is exchanged between scion and rootstock in grafted

  10. Plasmodium vivax Cell Traversal Protein for Ookinetes and Sporozoites (PvCelTOS) gene sequence and potential epitopes are highly conserved among isolates from different regions of Brazilian Amazon

    PubMed Central

    Bitencourt Chaves, Lana; Perce-da-Silva, Daiana de Souza; Rodrigues-da-Silva, Rodrigo Nunes; Martins da Silva, João Hermínio; Cassiano, Gustavo Capatti; Machado, Ricardo Luiz Dantas; Pratt-Riccio, Lilian Rose; Banic, Dalma Maria

    2017-01-01

    The Plasmodium vivax Cell-traversal protein for ookinetes and sporozoites (PvCelTOS) plays an important role in the traversal of host cells. Although essential to PvCelTOS progress as a vaccine candidate, its genetic diversity remains uncharted. Therefore, we investigated the PvCelTOS genetic polymorphism in 119 field isolates from five different regions of Brazilian Amazon (Manaus, Novo Repartimento, Porto Velho, Plácido de Castro and Oiapoque). Moreover, we also evaluated the potential impact of non-synonymous mutations found in the predicted structure and epitopes of PvCelTOS. The field isolates showed high similarity (99.3% of bp) with the reference Sal-1 strain, presenting only four Single-Nucleotide Polymorphisms (SNP) at positions 24A, 28A, 109A and 352C. The frequency of synonymous C109A (82%) was higher than all others (p<0.0001). However, the non-synonymous G28A and G352C were observed in 9.2% and 11.7% isolates. The great majority of the isolates (79.8%) revealed complete amino acid sequence homology with Sal-1, 10.9% presented complete homology with Brazil I and two undescribed PvCelTOS sequences were observed in 9.2% field isolates. Concerning the prediction analysis, the N-terminal substitution (Gly10Ser) was predicted to be within a B-cell epitope (PvCelTOS Accession Nos. AB194053.1) and exposed at the protein surface, while the Val118Leu substitution was not a predicted epitope. Therefore, our data suggest that although G28A SNP might interfere in potential B-cell epitopes at PvCelTOS N-terminal region the gene sequence is highly conserved among the isolates from different geographic regions, which is an important feature to be taken into account when evaluating its potential as a vaccine candidate. PMID:28158176

  11. Significance of satellite DNA revealed by conservation of a widespread repeat DNA sequence among angiosperms.

    PubMed

    Mehrotra, Shweta; Goel, Shailendra; Raina, Soom Nath; Rajpal, Vijay Rani

    2014-08-01

    The analysis of plant genome structure and evolution requires comprehensive characterization of repetitive sequences that make up the majority of plant nuclear DNA. In the present study, we analyzed the nature of pCtKpnI-I and pCtKpnI-II tandem repeated sequences, reported earlier in Carthamus tinctorius. Interestingly, a homolog of the pCtKpnI-I repeat sequence was also found to be present in widely divergent families of angiosperms. pCtKpnI-I showed high sequence similarity but low copy number among the various taxa analyzed from different families of angiosperms. In comparison, pCtKpnI-II was specific to the genus Carthamus and was not present in any other taxa analyzed. The molecular structure of pCtKpnI-I was analyzed in various unrelated taxa of angiosperms to decipher the evolutionarily conserved nature of the sequence and its possible functional role.

  12. Studying RNA homology and conservation with Infernal: from single sequences to RNA families

    PubMed Central

    Barquist, Lars; Burge, Sarah W.; Gardner, Paul P.

    2016-01-01

    Emerging high-throughput technologies have led to a deluge of putative non-coding RNA (ncRNA) sequences identified in a wide variety of organisms. Systematic characterization of these transcripts will be a tremendous challenge. Homology detection is critical to making maximal use of functional information gathered about ncRNAs: identifying homologous sequence allows us to transfer information gathered in one organism to another quickly and with a high degree of confidence. ncRNA presents a challenge for homology detection, as the primary sequence is often poorly conserved and de novo secondary structure prediction and search remains difficult. This protocol introduces methods developed by the Rfam database for identifying “families” of homologous ncRNAs starting from single “seed” sequences using manually curated sequence alignments to build powerful statistical models of sequence and structure conservation known as covariance models (CMs), implemented in the Infernal software package. We provide a step-by-step iterative protocol for identifying ncRNA homologs, then constructing an alignment and corresponding CM. We also work through an example for the bacterial small RNA MicA, discovering a previously unreported family of divergent MicA homologs in genus Xenorhabdus in the process. PMID:27322404

  13. Studying RNA Homology and Conservation with Infernal: From Single Sequences to RNA Families.

    PubMed

    Barquist, Lars; Burge, Sarah W; Gardner, Paul P

    2016-06-20

    Emerging high-throughput technologies have led to a deluge of putative non-coding RNA (ncRNA) sequences identified in a wide variety of organisms. Systematic characterization of these transcripts will be a tremendous challenge. Homology detection is critical to making maximal use of functional information gathered about ncRNAs: identifying homologous sequence allows us to transfer information gathered in one organism to another quickly and with a high degree of confidence. ncRNA presents a challenge for homology detection, as the primary sequence is often poorly conserved and de novo secondary structure prediction and search remain difficult. This unit introduces methods developed by the Rfam database for identifying "families" of homologous ncRNAs starting from single "seed" sequences, using manually curated sequence alignments to build powerful statistical models of sequence and structure conservation known as covariance models (CMs), implemented in the Infernal software package. We provide a step-by-step iterative protocol for identifying ncRNA homologs and then constructing an alignment and corresponding CM. We also work through an example for the bacterial small RNA MicA, discovering a previously unreported family of divergent MicA homologs in genus Xenorhabdus in the process. © 2016 by John Wiley & Sons, Inc.

  14. Sense-antisense gene pairs: sequence, transcription, and structure are not conserved between human and mouse

    PubMed Central

    Wood, Emily J.; Chin-Inmanu, Kwanrutai; Jia, Hui; Lipovich, Leonard

    2013-01-01

    Previous efforts to characterize conservation between the human and mouse genomes focused largely on sequence comparisons. These studies are inherently limited because they don't account for gene structure differences, which may exist despite genomic sequence conservation. Recent high-throughput transcriptome studies have revealed widespread and extensive overlaps between genes, and transcripts, encoded on both strands of the genomic sequence. This overlapping gene organization, which produces sense-antisense (SAS) gene pairs, is capable of effecting regulatory cascades through established mechanisms. We present an evolutionary conservation assessment of SAS pairs, on three levels: genomic, transcriptomic, and structural. From a genome-wide dataset of human SAS pairs, we first identified orthologous loci in the mouse genome, then assessed their transcription in the mouse, and finally compared the genomic structures of SAS pairs expressed in both species. We found that approximately half of human SAS loci have single orthologous locations in the mouse genome; however, only half of those orthologous locations have SAS transcriptional activity in the mouse. This suggests that high human-mouse gene conservation overlooks widespread distinctions in SAS pair incidence and expression. We compared gene structures at orthologous SAS loci, finding frequent differences in gene structure between human and orthologous mouse SAS pair members. Our categorization of human SAS pairs with respect to mouse conservation of expression as well as structure points to limitations of mouse models. Gene structure differences, including at SAS loci, may account for some of the phenotypic distinctions between primates and rodents. Genes in non-conserved SAS pairs may contribute to evolutionary lineage-specific regulatory outcomes. PMID:24133500

  15. A highly conserved pericentromeric domain in human and gorilla chromosomes.

    PubMed

    Pita, M; Gosálvez, J; Gosálvez, A; Nieddu, M; López-Fernández, C; Mezzanotte, R

    2009-01-01

    Significant similarity between human and gorilla genomes has been found in all chromosome arms, but not in centromeres, using whole-comparative genomic hybridization (W-CGH). In human chromosomes, centromeric regions, generally containing highly repetitive DNAs, are characterized by the presence of specific human DNA sequences and an absence of homology with gorilla DNA sequences. The only exception is the pericentromeric area of human chromosome 9, which, in addition to a large block of human DNA, also contains a region of homology with gorilla DNA sequences; the localization of these sequences coincides with that of human satellite III. Since highly repetitive DNAs are known for their high mutation frequency, we hypothesized that the chromosome 9 pericentromeric DNA conserved in human chromosomes and deriving from the gorilla genome may thus play some important functional role.

  16. The Large Mitochondrial Genome of Symbiodinium minutum Reveals Conserved Noncoding Sequences between Dinoflagellates and Apicomplexans

    PubMed Central

    Shoguchi, Eiichi; Shinzato, Chuya; Hisata, Kanako; Satoh, Nori; Mungpakdee, Sutada

    2015-01-01

    Even though mitochondrial genomes, which characterize eukaryotic cells, were first discovered more than 50 years ago, mitochondrial genomics remains an important topic in molecular biology and genome sciences. The Phylum Alveolata comprises three major groups (ciliates, apicomplexans, and dinoflagellates), the mitochondrial genomes of which have diverged widely. Even though the gene content of dinoflagellate mitochondrial genomes is reportedly comparable to that of apicomplexans, the highly fragmented and rearranged genome structures of dinoflagellates have frustrated whole genomic analysis. Consequently, noncoding sequences and gene arrangements of dinoflagellate mitochondrial genomes have not been well characterized. Here we report that the continuous assembled genome (∼326 kb) of the dinoflagellate, Symbiodinium minutum, is AT-rich (∼64.3%) and that it contains three protein-coding genes. Based upon in silico analysis, the remaining 99% of the genome comprises transcriptomic noncoding sequences. RNA edited sites and unique, possible start and stop codons clarify conserved regions among dinoflagellates. Our massive transcriptome analysis shows that almost all regions of the genome are transcribed, including 27 possible fragmented ribosomal RNA genes and 12 uncharacterized small RNAs that are similar to mitochondrial RNA genes of the malarial parasite, Plasmodium falciparum. Gene map comparisons show that gene order is only slightly conserved between S. minutum and P. falciparum. However, small RNAs and intergenic sequences share sequence similarities with P. falciparum, suggesting that the function of noncoding sequences has been preserved despite development of very different genome structures. PMID:26199191

  17. Sequence-related human proteins cluster by degree of evolutionary conservation

    NASA Astrophysics Data System (ADS)

    Mrowka, Ralf; Patzak, Andreas; Herzel, Hanspeter; Holste, Dirk

    2004-11-01

    Gene duplication followed by adaptive evolution is thought to be a central mechanism for the emergence of novel genes. To illuminate the contribution of duplicated protein-coding sequences to the complexity of the human genome, we study the connectivity of pairwise sequence-related human proteins and construct a network (N) of linked protein sequences with shared similarities. We find that (i) the connectivity distribution P(k) for k sequence-related proteins decays as a power law $P(k) \sim k^{-\gamma}$ with $\gamma \approx 1.2$, (ii) the top rank of N consists of a single large cluster of proteins (≈70%), while bottom ranks consist of multiple isolated clusters, and (iii) structural characteristics of N show both a high degree of clustering and an intermediate connectivity (“small-world” features). We gain further insight into structural properties of N by studying the relationship between the connectivity distribution and the phylogenetic conservation of proteins in bacteria, plants, invertebrates, and vertebrates. We find that (iv) the proportion of sequence-related proteins increases with increasing extent of evolutionary conservation. Our results support that small-world network properties constitute a footprint of an evolutionary mechanism and extend the traditional interpretation of protein families.
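
    As an illustration of the power-law claim in this record, the sketch below builds a connectivity distribution P(k) from a list of pairwise links and estimates the exponent γ from the slope of log P(k) against log k. The function names and the least-squares fit are assumptions made for this example; the abstract does not say how the authors estimated γ, and a log-log regression is only a rough estimator.

```python
import math
from collections import Counter


def connectivity_distribution(edges):
    """Return {k: P(k)} for an undirected network given as (a, b) pairs."""
    degree = Counter()
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    counts = Counter(degree.values())  # how many nodes have each degree k
    n_nodes = len(degree)
    return {k: c / n_nodes for k, c in sorted(counts.items())}


def fit_power_law_exponent(pk):
    """Estimate gamma in P(k) ~ k^(-gamma) by least squares on log-log axes."""
    xs = [math.log(k) for k in pk]
    ys = [math.log(p) for p in pk.values()]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return -sxy / sxx  # gamma is the negated slope


if __name__ == "__main__":
    # Synthetic distribution that follows k^(-1.2) exactly (toy data).
    synthetic = {k: k ** -1.2 for k in range(1, 50)}
    print(round(fit_power_law_exponent(synthetic), 3))
```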

  18. Energy Conservation Featured in Illinois High School

    ERIC Educational Resources Information Center

    Modern Schools, 1976

    1976-01-01

    The William Fremd High School in Palatine, Illinois, scheduled to open in 1977, is being built with energy conservation uppermost in mind. In this system, 70 heat pumps will heat and cool 300,000 square feet of educational facilities. (Author/MLF)

  19. Discovering conserved insect microRNAs from expressed sequence tags.

    PubMed

    Jia, Qidong; Lin, Kejian; Liang, Jingdong; Yu, Lun; Li, Fei

    2010-12-01

    MicroRNAs (miRNA) participate in regulating diverse biological pathways by translational repression in animals. They have attracted increasing attention recently. However, little work has been done on the miRNA genes in agriculturally important pests. Because the transcripts of most miRNA genes are the products of type-II RNA polymerase, pri-miRNA has a poly(A) tail and appears in expressed sequence tags (EST). We developed a computational pipeline to identify miRNA genes from insect ESTs. First, 980,697 ESTs from 63 insects were collected and used to search the nr database. The ESTs which did not share significant similarities with any known protein-coding genes were treated as non-coding ESTs. Next, known mature miRNAs were used to align with non-coding ESTs. The ESTs which contain the sequence of mature miRNA were treated as candidate ESTs. Finally, putative precursors were extracted flanking the mature miRNA region in candidate ESTs and evaluated by the Triplet-SVM algorithm. As a result, 86 miRNAs from 30 insect species were found based on a strict criterion while 330 miRNAs from 51 species were found based on a loose criterion. Evolution analysis indicated that mir-467, mir-297 and mir-466 were the highest conserved miRNA families in insects. To confirm the reliability of putative insect miRNAs, the expression profile of nine predicted miRNAs in Locusta migratoria was investigated. Eight miRNAs were successfully detected by RT-PCR. Most miRNAs were expressed ubiquitously at all examined tissues and developmental stages whereas Lmi-mir-509 was specifically expressed in the thorax of the 2nd, 4th and 5th instars and adult locust. In all, our work reported an efficient computational strategy for predicting miRNA genes from insect ESTs and presented tens of miRNAs in diverse insect species which are expected to participate in many important physiological processes.
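
    The candidate-selection step of the pipeline described in this record can be sketched as follows: non-coding ESTs are searched for known mature miRNA sequences, and a flanking window is extracted as a putative precursor. This is a minimal sketch under stated assumptions: the function name, the exact-match search on the forward strand only, and the 70-nt flank are hypothetical choices for illustration, and the Triplet-SVM evaluation step is not reproduced.

```python
def find_candidate_precursors(ests, mature_mirnas, flank=70):
    """Return (est_id, mirna_id, start, window) for ESTs containing a mature miRNA.

    ests:          {est_id: DNA sequence}
    mature_mirnas: {mirna_id: RNA or DNA sequence}
    flank:         bases kept on each side of the match (hypothetical default)
    """
    candidates = []
    for est_id, est_seq in ests.items():
        est_seq = est_seq.upper()
        for mir_id, mir_seq in mature_mirnas.items():
            # ESTs are DNA, so convert the RNA alphabet before searching.
            query = mir_seq.upper().replace("U", "T")
            start = est_seq.find(query)  # first exact match only
            if start == -1:
                continue
            end = start + len(query)
            window = est_seq[max(0, start - flank):min(len(est_seq), end + flank)]
            candidates.append((est_id, mir_id, start, window))
    return candidates


if __name__ == "__main__":
    toy_ests = {"est1": "A" * 10 + "TGAGGTAGTAGGTTGTATAGTT" + "C" * 10}
    toy_mirnas = {"let-7": "UGAGGUAGUAGGUUGUAUAGUU"}
    for hit in find_candidate_precursors(toy_ests, toy_mirnas, flank=5):
        print(hit)
```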

  20. Detection of conserved segments in proteins: iterative scanning of sequence databases with alignment blocks.

    PubMed Central

    Tatusov, R L; Altschul, S F; Koonin, E V

    1994-01-01

    We describe an approach to analyzing protein sequence databases that, starting from a single uncharacterized sequence or group of related sequences, generates blocks of conserved segments. The procedure involves iterative database scans with an evolving position-dependent weight matrix constructed from a coevolving set of aligned conserved segments. For each iteration, the expected distribution of matrix scores under a random model is used to set a cutoff score for the inclusion of a segment in the next iteration. This cutoff may be calculated to allow the chance inclusion of either a fixed number or a fixed proportion of false positive segments. With sufficiently high cutoff scores, the procedure converged for all alignment blocks studied, with varying numbers of iterations required. Different methods for calculating weight matrices from alignment blocks were compared. The most effective of those tested was a logarithm-of-odds, Bayesian-based approach that used prior residue probabilities calculated from a mixture of Dirichlet distributions. The procedure described was used to detect novel conserved motifs of potential biological importance. Images PMID:7991589
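
    One pass of the scan described in this record can be sketched as: build a log-odds weight matrix from an alignment block, score every window of a sequence, and keep windows at or above a cutoff. This is an illustrative sketch, not the authors' implementation: it uses a flat pseudocount instead of the Dirichlet-mixture priors the abstract reports as most effective, a toy four-letter alphabet instead of the 20 amino acids, and a hand-picked cutoff instead of one derived from the expected score distribution. All names are hypothetical.

```python
import math


def log_odds_matrix(block, background, pseudocount=1.0):
    """Build a position-specific log-odds matrix from equal-length segments."""
    alphabet = sorted(background)
    matrix = []
    for pos in range(len(block[0])):
        column = [segment[pos] for segment in block]
        total = len(column) + pseudocount * len(alphabet)
        scores = {}
        for residue in alphabet:
            freq = (column.count(residue) + pseudocount) / total
            scores[residue] = math.log(freq / background[residue])
        matrix.append(scores)
    return matrix


def scan(sequence, matrix, cutoff):
    """Return (offset, score) for every window scoring at or above cutoff."""
    width = len(matrix)
    hits = []
    for i in range(len(sequence) - width + 1):
        score = sum(matrix[j][sequence[i + j]] for j in range(width))
        if score >= cutoff:
            hits.append((i, score))
    return hits


if __name__ == "__main__":
    background = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}  # toy alphabet
    matrix = log_odds_matrix(["ACG", "ACG", "ACT"], background)
    print(scan("TTACGTT", matrix, cutoff=0.0))
```

    In an iterative version, segments that pass the cutoff would be added to the block and the matrix rebuilt until no new segments are found.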

  1. High-bay Lighting Energy Conservation Measures

    SciTech Connect

    Ian Metzger, Jesse Dean

    2010-12-31

    This software requires inputs of simple high-bay lighting system inventory information and calculates the energy and cost benefits of various retrofit opportunities. This tool includes energy conservation measures for: 1000 Watt to 750 Watt High-pressure Sodium lighting retrofit, 400 Watt to 360 Watt High Pressure Sodium lighting retrofit, High Intensity Discharge to T5 lighting retrofit, High Intensity Discharge to T8 lighting retrofit, and Daylighting. This tool calculates energy savings, demand reduction, cost savings, building life cycle costs including: simple payback, discounted payback, net-present value, and savings to investment ratio. In addition this tool also displays the environmental benefits of a project.
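
    The kind of calculation this record describes can be illustrated with a simple energy-savings and payback estimate for a wattage retrofit. This is a sketch under assumptions, not the tool's actual method: the function name and all inputs are hypothetical, and ballast losses, demand charges and discounting are ignored.

```python
def retrofit_savings(fixtures, old_watts, new_watts, hours_per_year,
                     price_per_kwh, project_cost):
    """Return (kWh saved per year, cost saved per year, simple payback in years)."""
    kwh_saved = fixtures * (old_watts - new_watts) * hours_per_year / 1000.0
    cost_saved = kwh_saved * price_per_kwh
    # Simple payback ignores the time value of money.
    payback_years = project_cost / cost_saved if cost_saved > 0 else float("inf")
    return kwh_saved, cost_saved, payback_years


if __name__ == "__main__":
    # Hypothetical inventory: ten 1000 W fixtures replaced by 750 W fixtures.
    print(retrofit_savings(10, 1000, 750, 4000, 0.10, 2000.0))
```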

  2. High resolution schemes for hyperbolic conservation laws

    NASA Technical Reports Server (NTRS)

    Harten, A.

    1983-01-01

    A class of new explicit second order accurate finite difference schemes for the computation of weak solutions of hyperbolic conservation laws is presented. These highly nonlinear schemes are obtained by applying a nonoscillatory first order accurate scheme to an appropriately modified flux function. The so-derived second order accurate schemes achieve high resolution while preserving the robustness of the original nonoscillatory first order accurate scheme. Numerical experiments are presented to demonstrate the performance of these new schemes.

  3. Identification of conserved and novel microRNAs in Aquilaria sinensis based on small RNA sequencing and transcriptome sequence data.

    PubMed

    Gao, Zhi-Hui; Wei, Jian-He; Yang, Yun; Zhang, Zheng; Xiong, Huan-Ying; Zhao, Wen-Ting

    2012-08-15

    Agarwood is in great demand for its high value in medicine, incense, and perfume across Asia, Middle East, and Europe. As agarwood is formed only when the Aquilaria trees are wounded or infected by some microbes, overharvesting and habitat loss are threatening some populations of agarwood-producing species. Aquilaria sinensis is such a significant economic tree species. To promote the production efficiency and protect the resource of A. sinensis, it would be critical to reveal the regulation mechanisms of stress-induced agarwood formation. MicroRNAs (miRNAs), a key gene expression regulator involved in various plant stress response and metabolic processes, might function in agarwood formation, but no report concerning miRNAs in Aquilaria is available. In this study, the small RNA high-throughput sequencing and 454 transcriptome data were adopted to identify both conserved and novel miRNAs in A. sinensis. Deep sequencing showed that the small RNA (sRNA) population of A. sinensis was complex and the length of sRNAs varied. By in silico analysis of the small RNA deep sequencing data and transcriptome data, we discovered 27 novel miRNAs in A. sinensis. Based on the mature miRNA sequence conservation, we identified 74 putative conserved miRNAs from A. sinensis and 10 of them were confirmed with hairpin forming precursor. Interestingly, a novel miRNA sequence was determined to be the miRNA of asi-miR408, but with accumulation much higher than asi-miR408. The expression levels of ten stress-responsive miRNAs were examined during the time-course after wound treatment. Eight were shown to be wound-responsive. This not only shows the existence of miRNAs in this Asian economically significant tree species but also indicated its critical role in stress-induced agarwood formation. The highly accumulated miRNA of asi-miR408 implied miRNAs would be functional as well as miRNAs in plants.

  4. Deletion of conserved sequences in IG-DMR at Dlk1-Gtl2 locus suggests their involvement in expression of paternally expressed genes in mice

    PubMed Central

    SAITO, Takeshi; HARA, Satoshi; TAMANO, Moe; ASAHARA, Hiroshi; TAKADA, Shuji

    2016-01-01

    Expression regulation of the Dlk1-Dio3 imprinted domain by the intergenic differentially methylated region (IG-DMR) is essential for normal embryonic development in mammals. In this study, we investigated conserved IG-DMR genomic sequences in eutherians to elucidate their role in genomic imprinting of the Dlk1-Dio3 domain. Using a comparative genomics approach, we identified three highly conserved sequences in IG-DMR. To elucidate the functions of these sequences in vivo, we generated mutant mice lacking each of the identified highly conserved sequences using the CRISPR/Cas9 system. Although mutant mice did not exhibit the gross phenotype, deletions of the conserved sequences altered the expression levels of paternally expressed imprinted genes in the mutant embryos without skewing imprinting status. These results suggest that the conserved sequences in IG-DMR are involved in the expression regulation of some of the imprinted genes in the Dlk1-Dio3 domain. PMID:27904015

  5. Deletion of conserved sequences in IG-DMR at Dlk1-Gtl2 locus suggests their involvement in expression of paternally expressed genes in mice.

    PubMed

    Saito, Takeshi; Hara, Satoshi; Tamano, Moe; Asahara, Hiroshi; Takada, Shuji

    2017-02-16

    Expression regulation of the Dlk1-Dio3 imprinted domain by the intergenic differentially methylated region (IG-DMR) is essential for normal embryonic development in mammals. In this study, we investigated conserved IG-DMR genomic sequences in eutherians to elucidate their role in genomic imprinting of the Dlk1-Dio3 domain. Using a comparative genomics approach, we identified three highly conserved sequences in IG-DMR. To elucidate the functions of these sequences in vivo, we generated mutant mice lacking each of the identified highly conserved sequences using the CRISPR/Cas9 system. Although mutant mice did not exhibit the gross phenotype, deletions of the conserved sequences altered the expression levels of paternally expressed imprinted genes in the mutant embryos without skewing imprinting status. These results suggest that the conserved sequences in IG-DMR are involved in the expression regulation of some of the imprinted genes in the Dlk1-Dio3 domain.

  6. CSTminer: a web tool for the identification of coding and noncoding conserved sequence tags through cross-species genome comparison.

    PubMed

    Castrignanò, Tiziana; Canali, Alessandro; Grillo, Giorgio; Liuni, Sabino; Mignone, Flavio; Pesole, Graziano

    2004-07-01

    The identification and characterization of genome tracts that are highly conserved across species during evolution may contribute significantly to the functional annotation of whole-genome sequences. Indeed, such sequences are likely to correspond to known or unknown coding exons or regulatory motifs. Here, we present a web server implementing a previously developed algorithm that, by comparing user-submitted genome sequences, is able to identify statistically significant conserved blocks and assess their coding or noncoding nature through the measure of a coding potential score. The web tool, available at http://www.caspur.it/CSTminer/, is dynamically interconnected with the Ensembl genome resources and produces a graphical output showing a map of detected conserved sequences and annotated gene features.

  7. Variation in conserved non-coding sequences on chromosome 5q and susceptibility to asthma and atopy

    SciTech Connect

    Donfack, Joseph; Schneider, Daniel H.; Tan, Zheng; Kurz, Thorsten; Dubchak, Inna; Frazer, Kelly A.; Ober, Carole

    2005-09-10

    Background: Evolutionarily conserved sequences likely have biological function. Methods: To determine whether variation in conserved sequences in non-coding DNA contributes to risk for human disease, we studied six conserved non-coding elements in the Th2 cytokine cluster on human chromosome 5q31 in a large Hutterite pedigree and in samples of outbred European American and African American asthma cases and controls. Results: Among six conserved non-coding elements (>100 bp, >70 percent identity; human-mouse comparison), we identified one single nucleotide polymorphism (SNP) in each of two conserved elements and six SNPs in the flanking regions of three conserved elements. We genotyped our samples for four of these SNPs and an additional three SNPs each in the IL13 and IL4 genes. While there was only modest evidence for association with single SNPs in the Hutterite and European American samples (P < 0.05), there were highly significant associations in European Americans between asthma and haplotypes comprised of SNPs in the IL4 gene (P < 0.001), including a SNP in a conserved non-coding element. Furthermore, variation in the IL13 gene was strongly associated with total IgE (P = 0.00022) and allergic sensitization to mold allergens (P = 0.00076) in the Hutterites, and more modestly associated with sensitization to molds in the European Americans and African Americans (P < 0.01). Conclusion: These results indicate that there is overall little variation in the conserved non-coding elements on 5q31, but variation in IL4 and IL13, including possibly one SNP in a conserved element, influence asthma and atopic phenotypes in diverse populations.
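
    The conserved-element criterion quoted in this record (longer than 100 bp with more than 70 percent identity in a human-mouse comparison) can be expressed as a small check on a pairwise alignment. This is a sketch under assumptions: the function names are hypothetical, the sequences are taken to be already aligned with "-" for gaps, and identity is computed over the full alignment length, which may differ from the definition the authors used.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns where both sequences carry the same base."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-")
    return 100.0 * matches / len(aligned_a)


def is_conserved_element(aligned_a, aligned_b, min_length=100, min_identity=70.0):
    """Apply the >100 bp, >70 percent identity thresholds to an aligned pair."""
    return (len(aligned_a) > min_length
            and percent_identity(aligned_a, aligned_b) > min_identity)


if __name__ == "__main__":
    human = "ACGT" * 30              # toy 120-bp sequence
    mouse = "N" * 24 + human[24:]    # 24 mismatches, so 80 percent identity
    print(percent_identity(human, mouse), is_conserved_element(human, mouse))
```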

  8. Forest conservation delivers highly variable coral reef conservation outcomes.

    PubMed

    Klein, Carissa J; Jupiter, Stacy D; Selig, Elizabeth R; Watts, Matthew E; Halpern, Benjamin S; Kamal, Muhammad; Roelfsema, Chris; Possingham, Hugh P

    2012-06-01

    Coral reefs are threatened by human activities on both the land (e.g., deforestation) and the sea (e.g., overfishing). Most conservation planning for coral reefs focuses on removing threats in the sea, neglecting management actions on the land. A more integrated approach to coral reef conservation, inclusive of land-sea connections, requires an understanding of how and where terrestrial conservation actions influence reefs. We address this by developing a land-sea planning approach to inform fine-scale spatial management decisions and test it in Fiji. Our aim is to determine where the protection of forest can deliver the greatest return on investment for coral reef ecosystems. To assess the benefits of conservation to coral reefs, we estimate their relative condition as influenced by watershed-based pollution and fishing. We calculate the cost-effectiveness of protecting forest and find that investments deliver rapidly diminishing returns for improvements to relative reef condition. For example, protecting 2% of forest in one area is almost 500 times more beneficial than protecting 2% in another area, making prioritization essential. For the scenarios evaluated, relative coral reef condition could be improved by 8-58% if all remnant forest in Fiji were protected rather than deforested. Finally, we determine the priority of each coral reef for implementing a marine protected area when all remnant forest is protected for conservation. The general results will support decisions made by the Fiji Protected Area Committee as they establish a national protected area network that aims to protect 20% of the land and 30% of the inshore waters by 2020. Although challenges remain, we can inform conservation decisions around the globe by tackling the complex issues relevant to integrated land-sea planning.

  9. High Throughput Sequencing: An Overview of Sequencing Chemistry.

    PubMed

    Ambardar, Sheetal; Gupta, Rikita; Trakroo, Deepika; Lal, Rup; Vakhlu, Jyoti

    2016-12-01

    In the present century, sequencing is to DNA science what gel electrophoresis was to it in the last century. From 1977 to 2016, three generations of sequencing technologies of various types have been developed. Second and third generation sequencing technologies, commonly referred to as next generation sequencing technology, have evolved significantly since their inception in 2004, with increased sequencing speed and decreased sequencing cost. GS FLX by 454 Life Sciences/Roche diagnostics, Genome Analyzer, HiSeq, MiSeq and NextSeq by Illumina, Inc., SOLiD by ABI, and Ion Torrent by Life Technologies are the various types of sequencing platforms available for second generation sequencing. The platforms available for third generation sequencing are Helicos™ Genetic Analysis System by SeqLL, LLC, SMRT Sequencing by Pacific Biosciences, Nanopore sequencing by Oxford Nanopore, Complete Genomics by Beijing Genomics Institute and GnuBIO by BioRad, to name a few. The present article is an overview of the principle and the sequencing chemistry of these high throughput sequencing technologies, along with a brief comparison of the various types of sequencing platforms available.

  10. Sequence Conservation, Radial Distance and Packing Density in Spherical Viral Capsids

    PubMed Central

    Lee, Chi-Wen; Huang, Tsun-Tsao; Shih, Chung-Shiuan; Hwang, Jenn-Kang

    2015-01-01

    The conservation level of a residue is a useful measure of the importance of that residue in protein structure and function. Much information about sequence conservation comes from aligning homologous sequences. Profiles showing the variation of the conservation level along the sequence are usually interpreted in evolutionary terms and dictated by site similarities of a proper set of homologous sequences. Here, we report that, for viral icosahedral capsids, the sequence conservation profile can be determined by variations in the distances between residues and the centroid of the capsid – with a direct inverse proportionality between the conservation level and the centroid distance – as well as by the spatial variations in local packing density. Examining both the centroid and the packing density models against a dataset of 51 crystal structures of nonhomologous icosahedral capsids, we found that many global patterns and minor features derived from the viral structures are consistent with those present in the sequence conservation profiles. The quantitative link between the level of conservation and structural features like centroid-distance or packing density allows us to look at residue conservation from a structural viewpoint as well as from an evolutionary viewpoint. PMID:26132081
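
The centroid-distance quantity mentioned in this abstract can be illustrated with a minimal sketch. The coordinates below are made up, and the function names are hypothetical; how the authors define the capsid centroid and which atoms represent a residue are not stated in this record, so this only shows the geometric step, under the assumption that each residue is reduced to one 3D point.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def centroid(points: Sequence[Point]) -> Point:
    """Arithmetic mean of a set of 3D points."""
    if not points:
        raise ValueError("at least one point is required")
    n = len(points)
    return tuple(sum(p[k] for p in points) / n for k in range(3))


def centroid_distances(points: Sequence[Point]) -> List[float]:
    """Distance of every point from the centroid of the whole set."""
    c = centroid(points)
    return [math.dist(p, c) for p in points]


# Hypothetical residue positions (one point per residue), not real capsid data.
residues = [(3, 0, 0), (-3, 0, 0), (0, 4, 0), (0, -4, 0)]
print(centroid_distances(residues))  # [3.0, 3.0, 4.0, 4.0]
```

Under the inverse relationship described in the abstract, residues with smaller distances would be expected to show higher conservation; fitting that relationship to real data is outside the scope of this sketch.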

  11. Mitochondrial genome sequences illuminate maternal lineages of conservation concern in a rare carnivore

    PubMed Central

    2011-01-01

    Background Science-based wildlife management relies on genetic information to infer population connectivity and identify conservation units. The most commonly used genetic marker for characterizing animal biodiversity and identifying maternal lineages is the mitochondrial genome. Mitochondrial genotyping figures prominently in conservation and management plans, with much of the attention focused on the non-coding displacement ("D") loop. We used massively parallel multiplexed sequencing to sequence complete mitochondrial genomes from 40 fishers, a threatened carnivore that possesses low mitogenomic diversity. This allowed us to test a key assumption of conservation genetics, specifically, that the D-loop accurately reflects genealogical relationships and variation of the larger mitochondrial genome. Results Overall mitogenomic divergence in fishers is exceedingly low, with 66 segregating sites and an average pairwise distance between genomes of 0.00088 across their aligned length (16,290 bp). Estimates of variation and genealogical relationships from the displacement (D) loop region (299 bp) are contradicted by the complete mitochondrial genome, as well as the protein coding fraction of the mitochondrial genome. The sources of this contradiction trace primarily to the near-absence of mutations marking the D-loop region of one of the most divergent lineages, and secondarily to independent (recurrent) mutations at two nucleotide positions in the D-loop amplicon. Conclusions Our study has two important implications. First, inferred genealogical reconstructions based on the fisher D-loop region contradict inferences based on the entire mitogenome to the point that the populations of greatest conservation concern cannot be accurately resolved. Whole-genome analysis identifies Californian haplotypes from the northern-most populations as highly distinctive, with a significant excess of amino acid changes that may be indicative of molecular adaptation; D-loop sequences fail

  12. Inference of transcriptional networks in Arabidopsis through conserved noncoding sequence analysis.

    PubMed

    Van de Velde, Jan; Heyndrickx, Ken S; Vandepoele, Klaas

    2014-07-01

    Transcriptional regulation plays an important role in establishing gene expression profiles during development or in response to (a)biotic stimuli. Transcription factor binding sites (TFBSs) are the functional elements that determine transcriptional activity, and the identification of individual TFBS in genome sequences is a major goal to inferring regulatory networks. We have developed a phylogenetic footprinting approach for the identification of conserved noncoding sequences (CNSs) across 12 dicot plants. Whereas both alignment and non-alignment-based techniques were applied to identify functional motifs in a multispecies context, our method accounts for incomplete motif conservation as well as high sequence divergence between related species. We identified 69,361 footprints associated with 17,895 genes. Through the integration of known TFBS obtained from the literature and experimental studies, we used the CNSs to compile a gene regulatory network in Arabidopsis thaliana containing 40,758 interactions, of which two-thirds act through binding events located in DNase I hypersensitive sites. This network shows significant enrichment toward in vivo targets of known regulators, and its overall quality was confirmed using five different biological validation metrics. Finally, through the integration of detailed expression and function information, we demonstrate how static CNSs can be converted into condition-dependent regulatory networks, offering opportunities for regulatory gene annotation.

  13. New insights into SRY regulation through identification of 5' conserved sequences

    PubMed Central

    Ross, Diana GF; Bowles, Josephine; Koopman, Peter; Lehnert, Sigrid

    2008-01-01

    Background SRY is the pivotal gene initiating male sex determination in most mammals, but how its expression is regulated is still not understood. In this study we derived novel SRY 5' flanking genomic sequence data from bovine and caprine genomic BAC clones. Results We identified four intervals of high homology upstream of SRY by comparison of human, bovine, pig, goat and mouse genomic sequences. These conserved regions contain putative binding sites for a large number of known transcription factor families, including several that have been implicated previously in sex determination and early gonadal development. Conclusion Our results reveal potentially important SRY regulatory elements, mutations in which might underlie cases of idiopathic human XY sex reversal. PMID:18851760

  14. Global geno-proteomic analysis reveals cross-continental sequence conservation and druggable sites among influenza virus polymerases.

    PubMed

    Babar, Mustafeez Mujtaba; Zaidi, Najam-us-Sahar Sadaf; Tahir, Muhammad

    2014-12-01

    Influenza virus is one of the major causes of mortality and morbidity associated with respiratory diseases. The high rate of mutation in the viral proteome provides it with the ability to survive in a variety of host species. This property helps it in maintaining and developing its pathogenicity, transmission and drug resistance. Alternate drug targets, particularly the internal proteins, can potentially be exploited for addressing the resistance issues. In the current analysis, the degree of conservation of influenza virus polymerases has been studied as one of the essential elements for establishing its candidature as a potential target of antiviral therapy. We analyzed more than 130,000 nucleotide and amino acid sequences by classifying them on the basis of continental presence of host organisms. Computational analyses including genetic polymorphism study, mutation pattern determination, molecular evolution and geophylogenetic analysis were performed to establish the high degree of conservation among the sequences. These studies lead to establishing the polymerases, in particular PB1, as highly conserved proteins. Moreover, we mapped the conservation percentage on the tertiary structures of proteins to identify the conserved, druggable sites. The research study, hence, revealed that the influenza virus polymerases are highly conserved (95-99%) proteins with a very slow mutation rate. Potential drug binding sites on various polymerases have also been reported. A scheme for drug target candidate development that can be employed to rapidly mutating proteins has been presented. Moreover, the research output can help in designing new therapeutic molecules against the identified targets.

  15. Conserved Plasmid Hydrogen-Uptake (hup)-Specific Sequences within Hup+Rhizobium leguminosarum Strains

    PubMed Central

    Leyva, Antonio; Palacios, José M.; Ruiz-Argüeso, Tomás

    1987-01-01

    Thirteen Rhizobium leguminosarum strains previously reported as H2-uptake hydrogenase positive (Hup+) or negative (Hup−) were analyzed for the presence and conservation of DNA sequences homologous to cloned Bradyrhizobium japonicum hup-specific DNA from cosmid pHU1 (M. A. Cantrell, R. A. Haugland, and H. J. Evans, Proc. Natl. Acad. Sci. USA 80:181-185, 1983). The Hup phenotype of these strains was reexamined by determining hydrogenase activity induced in bacteroids from pea nodules. Five strains, including H2 oxidation-ATP synthesis-coupled and -uncoupled strains, induced significant rates of H2-uptake hydrogenase activity and contained DNA sequences homologous to three probe DNA fragments (5.9-kilobase [kb] HindIII, 2.9-kb EcoRI, and 5.0-kb EcoRI) from pHU1. The pattern of genomic DNA HindIII and EcoRI fragments with significant homology to each of the three probes was identical in all five strains regardless of the H2-dependent ATP generation trait. The restriction fragments containing the homology totalled about 22 kb of DNA common to the five strains. In all instances the putative hup sequences were located on a plasmid that also contained nif genes. The molecular sizes of the identified hup-sym plasmids ranged between 184 and 212 megadaltons. No common DNA sequences homologous to B. japonicum hup DNA were found in genomic DNA from any of the eight remaining strains showing no significant hydrogenase activity in pea bacteroids. These results suggest that the identified DNA region contains genes essential for hydrogenase activity in R. leguminosarum and that its organization is highly conserved within Hup+ strains in this symbiotic species. Images PMID:16347471

  16. WeederH: an algorithm for finding conserved regulatory motifs and regions in homologous sequences

    PubMed Central

    Pavesi, Giulio; Zambelli, Federico; Pesole, Graziano

    2007-01-01

    Background This work addresses the problem of detecting conserved transcription factor binding sites, and regulatory regions in general, through the analysis of sequences from homologous genes, an approach that is becoming more and more widely used given the ever-increasing amount of genomic data available. Results We present an algorithm that identifies conserved transcription factor binding sites in a given sequence by comparing it to one or more homologs, adapting a framework we previously introduced for the discovery of sites in sequences from co-regulated genes. Unlike the most commonly used methods, the approach we present neither needs nor computes an alignment of the sequences investigated, nor does it resort to descriptors of the binding specificity of known transcription factors. The main novel idea we introduce is a relative measure of conservation, assuming that true functional elements should present a higher level of conservation than the rest of the sequence surrounding them. We present tests where we applied the algorithm to the identification of conserved annotated sites in homologous promoters, as well as in distal regions like enhancers. Conclusion Results of the tests show how the algorithm can provide fast and reliable predictions of conserved transcription factor binding sites regulating the transcription of a gene, with better performance than other available methods for the same task. We also show examples of how the algorithm can be successfully employed when promoter annotations of the genes investigated are missing, or when regulatory sites and regions are located far away from the genes. PMID:17286865

  17. Multilign: an algorithm to predict secondary structures conserved in multiple RNA sequences

    PubMed Central

    Xu, Zhenjiang; Mathews, David H.

    2011-01-01

    Motivation: With recent advances in sequencing, structural and functional studies of RNA lag behind the discovery of sequences. Computational analysis of RNA is increasingly important to reveal structure–function relationships with low cost and speed. The purpose of this study is to use multiple homologous sequences to infer a conserved RNA structure. Results: A new algorithm, called Multilign, is presented to find the lowest free energy RNA secondary structure common to multiple sequences. Multilign is based on Dynalign, which is a program that simultaneously aligns and folds two sequences to find the lowest free energy conserved structure. For Multilign, Dynalign is used to progressively construct a conserved structure from multiple pairwise calculations, with one sequence used in all pairwise calculations. A base pair is predicted only if it is contained in the set of low free energy structures predicted by all Dynalign calculations. In this way, Multilign improves prediction accuracy by keeping the genuine base pairs and excluding competing false base pairs. Multilign has computational complexity that scales linearly in the number of sequences. Multilign was tested on extensive datasets of sequences with known structure and its prediction accuracy is among the best of available algorithms. Multilign can run on long sequences (> 1500 nt) and an arbitrarily large number of sequences. Availability: The algorithm is implemented in ANSI C++ and can be downloaded as part of the RNAstructure package at: http://rna.urmc.rochester.edu Contact: david_mathews@urmc.rochester.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21193521
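
The acceptance rule stated in this abstract (a base pair is predicted only if every pairwise calculation contains it) amounts to a set intersection. The sketch below illustrates only that rule. It is not the Multilign implementation, it does not call Dynalign or RNAstructure, and the function name and the example base pairs are hypothetical.

```python
from typing import Iterable, Set, Tuple

# (i, j) positions of a base pair in the shared index sequence, with i < j.
BasePair = Tuple[int, int]


def consensus_base_pairs(pairwise_predictions: Iterable[Set[BasePair]]) -> Set[BasePair]:
    """Keep only the base pairs that appear in every pairwise prediction."""
    predictions = [set(p) for p in pairwise_predictions]
    if not predictions:
        return set()
    return set.intersection(*predictions)


# Hypothetical low free energy base pairs for the index sequence, one set per
# pairwise calculation against a different homolog (made-up values).
runs = [
    {(1, 20), (2, 19), (3, 18), (7, 14)},
    {(1, 20), (2, 19), (3, 18), (8, 13)},
    {(1, 20), (2, 19), (3, 18), (7, 14)},
]
print(sorted(consensus_base_pairs(runs)))  # [(1, 20), (2, 19), (3, 18)]
```

The pair (7, 14) is dropped because one of the three calculations does not support it, which mirrors how competing false base pairs are excluded in the description above.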

  18. Conservation analysis predicts in vivo occupancy of glucocorticoid receptor-binding sequences at glucocorticoid-induced genes.

    PubMed

    So, Alex Yick-Lun; Cooper, Samantha B; Feldman, Brian J; Manuchehri, Mitra; Yamamoto, Keith R

    2008-04-15

    The glucocorticoid receptor (GR) interacts with specific GR-binding sequences (GBSs) at glucocorticoid response elements (GREs) to orchestrate transcriptional networks. Although the sequences of the GBSs are highly variable among different GREs, the precise sequence within an individual GRE is highly conserved. In this study, we examined whether sequence conservation of sites resembling GBSs is sufficient to predict GR occupancy of GREs at genes responsive to glucocorticoids. Indeed, we found that the level of conservation of these sites at genes up-regulated by glucocorticoids in mouse C3H10T1/2 mesenchymal stem-like cells correlated directly with the extent of occupancy by GR. In striking contrast, we failed to observe GR occupancy of GBSs at genes repressed by glucocorticoids, despite the occurrence of these sites at a frequency similar to that of the induced genes. Thus, GR occupancy of the GBS motif correlates with induction but not repression, and GBS conservation alone is sufficient to predict GR occupancy and GRE function at induced genes.

  19. High speed nucleic acid sequencing

    SciTech Connect

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2011-05-17

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid. Each type of labeled nucleotide comprises an acceptor fluorophore attached to a phosphate portion of the nucleotide such that the fluorophore is removed upon incorporation into a growing strand. Fluorescent signal is emitted via fluorescent resonance energy transfer between the donor fluorophore and the acceptor fluorophore as each nucleotide is incorporated into the growing strand. The sequence is deduced by identifying which base is being incorporated into the growing strand.

  20. A conserved C-terminal sequence of high-risk cutaneous beta-human papillomavirus E6 proteins alters localization and signalling of β1-integrin to promote cell migration.

    PubMed

    Holloway, Amy; Storey, Alan

    2014-01-01

    Beta-human papillomaviruses (β-HPV) infect cutaneous epithelia, and accumulating evidence suggests that the virus may act as a co-factor with UV-induced DNA damage in the development and progression of non-melanoma skin cancer, although the molecular mechanisms involved are poorly understood. The E6 protein of cutaneous β-HPV types encodes functions consistent with a role in tumorigenesis, and E6 expression can result in papilloma formation in transgenic animals. The E6 proteins of high-risk α-HPV types, which are associated with the development of anogenital cancers, have a conserved 4 aa motif at their extreme C terminus that binds to specific PDZ domain-containing proteins to promote cell invasion. Likewise, the high-risk β-HPVs HPV5 and HPV8 E6 proteins also share a conserved C-terminal motif, but this is markedly different from that of α-HPV types, implying functional differences. Using binding and functional studies, we have shown that β-HPV E6 proteins target β1-integrin using this C-terminal motif. E6 expression reduced membrane localization of β1-integrin, but increased overall levels of β1-integrin protein and its downstream effector focal adhesion kinase in human keratinocytes. Altered β1-integrin localization due to E6 expression was associated with actin cytoskeleton rearrangement and increased cell migration that was abolished by point mutations in the C-terminal motif of E6. We concluded that modulation of β1-integrin signalling by E6 proteins may contribute towards the pathogenicity of these β-HPV types.

  1. The complete nucleotide sequence of the Crossostoma lacustre mitochondrial genome: conservation and variations among vertebrates.

    PubMed Central

    Tzeng, C S; Hui, C F; Shen, S C; Huang, P C

    1992-01-01

    The complete mitochondrial (mt) genome of Crossostoma lacustre, a freshwater loach from mountain streams of Taiwan, has been cloned and sequenced. This fish mt genome, consisting of 16558 base-pairs, encodes genes for 13 proteins, two rRNAs, and 22 tRNAs, in addition to a regulatory sequence for replication and transcription (D-loop), and is similar to those of the other vertebrates in both the order and orientation of these genes. The protein-coding and ribosomal RNA genes are highly homologous, both in size and composition, to their counterparts in mammals, birds, amphibians, and invertebrates, and use essentially the same set of codons, including both the initiation and termination signals, and the tRNAs. Differences do exist, however, in the lengths and sequences of the D-loop regions, and in space between genes, which account for the variations in total lengths of the genomes. Our observations provide evidence for the first time for the conservation of genetic information in the fish mitochondrial genome, especially among the vertebrates. PMID:1408800

  2. Remarkable intron and exon sequence conservation in human and mouse homeobox Hox 1.3 genes

    SciTech Connect

    Tournier-Lasserve, E.; Odenwald, W.F.; Garbern, J.; Trojanowski, J.; Lazzarini, R.A.

    1989-05-01

    A high degree of conservation exists between the Hox 1.3 homeobox genes of mice and humans. The two genes occupy the same relative positions in their respective Hox 1 gene clusters, they show extensive sequence similarities in their coding and noncoding portions, and both are transcribed into multiple transcripts of similar sizes. The predicted human Hox 1.3 protein differs from its murine counterpart in only 7 of 270 amino acids. The sequence similarity in the 250 base pairs upstream of the initiation codon is 98%, the similarity between the two introns, both 960 base pairs long, is 72%, and the similarity in the 3' noncoding region from termination codon to polyadenylation signal is 90%. Both mouse and human Hox 1.3 introns contain a sequence with homology to a mating-type-controlled cis element of the yeast Ty1 transposon. DNA-binding studies with a recombinant mouse Hox 1.3 protein identified two binding sites in the intron, both of which were within the region of shared homology with this Ty1 cis element.
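
The similarity figures in this abstract are percent identities. As a minimal sketch, the protein-level figure implied by "7 of 270 amino acids" can be computed directly, and the same measure can be taken from two pre-aligned sequences. The function name and the toy sequences are hypothetical, and gap positions are simply treated as mismatches, which is an assumption rather than the authors' stated method.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of identical positions between two pre-aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Figure taken from the abstract: 7 differences in 270 amino acids.
protein_identity = 100.0 * (270 - 7) / 270
print(round(protein_identity, 1))  # 97.4

# Toy aligned sequences (made up) with one mismatch in ten positions.
print(percent_identity("ACGTACGTAC", "ACGTTCGTAC"))  # 90.0
```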

  3. Sequence conservation in avian CR1: an interspersed repetitive DNA family evolving under functional constraints.

    PubMed Central

    Chen, Z Q; Ritzel, R G; Lin, C C; Hodgetts, R B

    1991-01-01

    CR1 is a short interspersed repetitive DNA element originally identified in the domestic chicken (Gallus gallus). However, unlike virtually all other such sequences described to date, CR1 is not confined to one or a few closely related species. It is probably a ubiquitous component of the avian genome, having been detected in representatives of nine orders encompassing a wide spectrum of the class Aves. This identification was made possible by using the polymerase chain reaction (PCR), which revealed interspecific similarities not detected by conventional Southern analysis. DNA sequence comparisons between a CR1 element isolated from a sarus crane (Grus antigone) and those isolated from an emu (Dromaius novaehollandiae) showed that two short highly conserved regions are present. These are included within two regions previously characterized in the CR1 units of domestic fowl. One of these behaves as a transcriptional silencer and the other is a binding site for a nuclear protein. Our observations suggest that CR1 has evolved under functional constraints and that interspersed repetitive sequences as a class may constitute a more significant component of the eukaryotic genome than is generally acknowledged. Images PMID:1829530

  4. Detection of Weakly Conserved Ancestral Mammalian Regulatory Sequences by Primate Comparisons

    SciTech Connect

    Wang, Qian-fei; Prabhakar, Shyam; Chanan, Sumita; Cheng, Jan-Fang; Rubin, Edward M.; Boffelli, Dario

    2006-06-01

    Genomic comparisons between human and distant, non-primate mammals are commonly used to identify cis-regulatory elements based on constrained sequence evolution. However, these methods fail to detect cryptic functional elements, which are too weakly conserved among mammals to distinguish from nonfunctional DNA. To address this problem, we explored the potential of deep intra-primate sequence comparisons. We sequenced the orthologs of 558 kb of human genomic sequence, covering multiple loci involved in cholesterol homeostasis, in 6 nonhuman primates. Our analysis identified 6 noncoding DNA elements displaying significant conservation among primates, but undetectable in more distant comparisons. In vitro and in vivo tests revealed that at least three of these 6 elements have regulatory function. Notably, the mouse orthologs of these three functional human sequences had regulatory activity despite their lack of significant sequence conservation, indicating that they are cryptic ancestral cis-regulatory elements. These regulatory elements could still be detected in a smaller set of three primate species including human, rhesus and marmoset. Since the human and rhesus genome sequences are already available, and the marmoset genome is actively being sequenced, the primate-specific conservation analysis described here can be applied in the near future on a whole-genome scale, to complement the annotation provided by more distant species comparisons.

  5. Accelerated Evolution of Conserved Noncoding Sequences in the Human Genome

    SciTech Connect

    Prabhakar, Shyam; Noonan, James P.; Paabo, Svante; Rubin, Edward M.

    2006-07-06

    Genomic comparisons between human and distant, non-primate mammals are commonly used to identify cis-regulatory elements based on constrained sequence evolution. However, these methods fail to detect "cryptic" functional elements, which are too weakly conserved among mammals to distinguish from nonfunctional DNA. To address this problem, we explored the potential of deep intra-primate sequence comparisons. We sequenced the orthologs of 558 kb of human genomic sequence, covering multiple loci involved in cholesterol homeostasis, in 6 nonhuman primates. Our analysis identified 6 noncoding DNA elements displaying significant conservation among primates, but undetectable in more distant comparisons. In vitro and in vivo tests revealed that at least three of these 6 elements have regulatory function. Notably, the mouse orthologs of these three functional human sequences had regulatory activity despite their lack of significant sequence conservation, indicating that they are cryptic ancestral cis-regulatory elements. These regulatory elements could still be detected in a smaller set of three primate species including human, rhesus and marmoset. Since the human and rhesus genome sequences are already available, and the marmoset genome is actively being sequenced, the primate-specific conservation analysis described here can be applied in the near future on a whole-genome scale, to complement the annotation provided by more distant species comparisons.

  6. Readings in Wildlife and Fish Conservation, High School Conservation Curriculum Project.

    ERIC Educational Resources Information Center

    Ensminger, Jack

    This publication is a tentative edition of readings on Wildlife and Fish Conservation in Louisiana, and as such it forms part of one of the four units of study designed for an experimental high school course, the "High School Conservation Curriculum Project." The other three units are concerned with Forest Conservation, Soil and Water…

  7. Conservation of the human telomere sequence (TTAGGG)n among vertebrates.

    PubMed Central

    Meyne, J; Ratliff, R L; Moyzis, R K

    1989-01-01

    To determine the evolutionary origin of the human telomere sequence (TTAGGG)n, biotinylated oligodeoxynucleotides of this sequence were hybridized to metaphase spreads from 91 different species, including representative orders of bony fish, reptiles, amphibians, birds, and mammals. Under stringent hybridization conditions, fluorescent signals were detected at the telomeres of all chromosomes, in all 91 species. The conservation of the (TTAGGG)n sequence and its telomeric location, in species thought to share a common ancestor over 400 million years ago, strongly suggest that this sequence is the functional vertebrate telomere. Images PMID:2780561

  8. Complete nucleotide sequence of the Actinomyces viscosus T14V sialidase gene: presence of a conserved repeating sequence among strains of Actinomyces spp.

    PubMed Central

    Yeung, M K

    1993-01-01

    The nucleotide sequence of the Actinomyces viscosus T14V sialidase gene (nanH) and flanking regions was determined. An open reading frame of 2,703 nucleotides that encodes a predominately hydrophobic protein of 901 amino acids (M(r), 92,871) was identified. The amino acid sequence at the amino terminus of the predicted protein exhibited properties characteristic of a typical leader peptide. Five 12-amino-acid units that shared between 33 and 67% sequence identity were noted within the central domain of the protein. Each unit contained the sequence Ser-X-Asp-X-Gly-X-Thr-Trp, which is conserved among other bacterial and trypanosoma sp. sialidases. Thus, the A. viscosus T14V nanH gene and the other prokaryotic and eukaryotic sialidase genes evolved from a common ancestor. Southern hybridization analyses under conditions of high stringency revealed the existence of DNA sequences homologous to A. viscosus T14V nanH in the genomes of 18 strains of five Actinomyces species that expressed various levels of sialidase activity. The data demonstrate that the sialidase genes from divergent groups of Actinomyces spp. are highly conserved. Images PMID:8418033
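
Two details in this abstract lend themselves to a quick check: the reading frame length (2,703 nucleotides for 901 amino acids) and the conserved motif Ser-X-Asp-X-Gly-X-Thr-Trp. The sketch below writes the motif as a regular expression over one-letter amino acid codes, with X matching any residue. The test sequence is invented for illustration and is not the nanH protein, and the names used are hypothetical.

```python
import re
from typing import List, Tuple

# Ser-X-Asp-X-Gly-X-Thr-Trp in one-letter codes; "." stands for any residue.
# The lookahead lets overlapping occurrences be reported as well.
SIALIDASE_MOTIF = re.compile(r"(?=(S.D.G.TW))")


def find_motif(protein: str) -> List[Tuple[int, str]]:
    """Return (0-based start, matched text) for every motif occurrence."""
    return [(m.start(), m.group(1)) for m in SIALIDASE_MOTIF.finditer(protein)]


# Length check from the abstract: 2,703 nucleotides correspond to 901 codons.
assert 2703 % 3 == 0 and 2703 // 3 == 901

# Made-up sequence containing two copies of the motif.
toy_protein = "MKTSADNGKTWAAAPLSRDGGRTWQQ"
print(find_motif(toy_protein))  # [(3, 'SADNGKTW'), (16, 'SRDGGRTW')]
```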

  9. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence.

    PubMed

    Gordon, Kacy L; Arthur, Robert K; Ruvinsky, Ilya

    2015-05-01

    Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements.

  10. Phylum-Level Conservation of Regulatory Information in Nematodes despite Extensive Non-coding Sequence Divergence

    PubMed Central

    Gordon, Kacy L.; Arthur, Robert K.; Ruvinsky, Ilya

    2015-01-01

    Gene regulatory information guides development and shapes the course of evolution. To test conservation of gene regulation within the phylum Nematoda, we compared the functions of putative cis-regulatory sequences of four sets of orthologs (unc-47, unc-25, mec-3 and elt-2) from distantly-related nematode species. These species, Caenorhabditis elegans, its congeneric C. briggsae, and three parasitic species Meloidogyne hapla, Brugia malayi, and Trichinella spiralis, represent four of the five major clades in the phylum Nematoda. Despite the great phylogenetic distances sampled and the extensive sequence divergence of nematode genomes, all but one of the regulatory elements we tested are able to drive at least a subset of the expected gene expression patterns. We show that functionally conserved cis-regulatory elements have no more extended sequence similarity to their C. elegans orthologs than would be expected by chance, but they do harbor motifs that are important for proper expression of the C. elegans genes. These motifs are too short to be distinguished from the background level of sequence similarity, and while identical in sequence they are not conserved in orientation or position. Functional tests reveal that some of these motifs contribute to proper expression. Our results suggest that conserved regulatory circuitry can persist despite considerable turnover within cis elements. PMID:26020930

  11. Position-specific prediction of methylation sites from sequence conservation based on information theory.

    PubMed

    Shi, Yinan; Guo, Yanzhi; Hu, Yayun; Li, Menglong

    2015-07-23

    Protein methylation plays vital roles in many biological processes and has been implicated in various human diseases. To fully understand the mechanisms underlying methylation for use in drug design and work in methylation-related diseases, an initial but crucial step is to identify methylation sites. The use of high-throughput bioinformatics methods has become imperative to predict methylation sites. In this study, we developed a novel method that is based only on sequence conservation to predict protein methylation sites. Conservation difference profiles between methylated and non-methylated peptides were constructed by the information entropy (IE) in a wider neighbor interval around the methylation sites that fully incorporated all of the environmental information. Then, the distinctive neighbor residues were identified by the importance scores of information gain (IG). The most representative model was constructed by support vector machine (SVM) for Arginine and Lysine methylation, respectively. This model yielded a promising result on both the benchmark dataset and independent test set. The model was used to screen the entire human proteome, and many unknown substrates were identified. These results indicate that our method can serve as a useful supplement to elucidate the mechanism of protein methylation and facilitate hypothesis-driven experimental design and validation.
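
    The conservation difference profile described above can be illustrated with a minimal sketch. It assumes fixed-length peptide windows centred on the candidate residue and a plain Shannon entropy per window position; the abstract does not give the authors' exact formulation, so the function names and the toy peptides below are hypothetical.

```python
import math
from collections import Counter


def column_entropy(peptides, position):
    """Shannon entropy (in bits) of the residues observed at one window position."""
    counts = Counter(p[position] for p in peptides)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def entropy_profile(peptides):
    """Entropy at every position of equally long peptide windows."""
    return [column_entropy(peptides, i) for i in range(len(peptides[0]))]


def difference_profile(positive, negative):
    """Per-position entropy of the negative set minus that of the positive set.

    Positive values mark positions that are more conserved (lower entropy)
    around modified sites than around unmodified ones.
    """
    return [n - p for p, n in zip(entropy_profile(positive), entropy_profile(negative))]


# Hypothetical 5-residue windows centred on arginine (index 2).
methylated = ["GGRGG", "AGRGG", "GGRGA", "SGRGG"]
non_methylated = ["ALRVE", "TSRPE", "KDRLM", "NQRCH"]

profile = difference_profile(methylated, non_methylated)
```

    Positions with a large difference would be the natural candidates for a feature-ranking step such as information gain before a classifier is trained.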

  12. The nucleotide sequence of the human int-1 mammary oncogene; evolutionary conservation of coding and non-coding sequences.

    PubMed Central

    van Ooyen, A; Kwee, V; Nusse, R

    1985-01-01

    The mouse mammary tumor virus can induce mammary tumors in mice by proviral activation of an evolutionarily conserved cellular oncogene called int-1. Here we present the nucleotide sequence of the human homologue of int-1, and compare it with the mouse gene. Like the mouse gene, the human homologue contains a reading frame of 370 amino acids, with only four substitutions. The amino acid changes are all in the hydrophobic leader domain of the int-1 encoded protein, and do not significantly alter its hydropathic index. The conservation between the mouse and the human int-1 genes is not restricted to exons; extensive parts of the introns are also homologous. Thus, int-1 ranks among the most conserved genes known, a property shared with other oncogenes. PMID:2998762

  13. Two distinct nuclear factors bind the conserved regulatory sequences of a rabbit major histocompatibility complex class II gene.

    PubMed Central

    Sittisombut, N

    1988-01-01

    The constitutive coexpression of the major histocompatibility complex (MHC) class II genes in B lymphocytes requires positive, trans-acting transcriptional factors. The need for these trans-acting factors has been suggested by the reversion of the MHC class II-negative phenotype of rare B-lymphocyte mutants through somatic cell fusion with B cells or T-cell lines. The mechanism by which the trans-acting factors exert their effect on gene transcription is unknown. The possibility that two highly conserved DNA sequences, located 90 to 100 base pairs (bp) (the A sequence) and 60 to 70 bp (the B sequence) upstream of the transcription start site of the class II genes, are recognized by the trans-acting factors was investigated in this study. By using the gel electrophoresis retardation assay, a minimum of two proteins which specifically bound the conserved A or B sequence of a rabbit DP beta gene were identified in murine nuclear extracts of a B-lymphoma cell line, A20-2J. Fractionation of nuclear extract through a heparin-agarose column allowed the identification of one protein, designated NF-MHCIIB, which bound an oligonucleotide containing the B sequence and protected the entire B sequence in the DNase I protection analysis. Another protein, designated NF-MHCIIA, which bound an oligonucleotide containing the A sequence and partially protected the 3' half of this sequence, was also identified. NF-MHCIIB did not protect a CCAAT sequence located 17 bp downstream of the B sequence. The possible relationship between these DNA-binding factors and the trans-acting factors identified in the cell fusion experiments is discussed. Images PMID:3133552

  14. Inter-specific sequence conservation and intra-individual sequence variation in a spider silk gene.

    PubMed

    Tai, Pei-Ling; Hwang, Guang-Yuh; Tso, I-Min

    2004-10-01

    Currently, studies on major ampullate spidroin 1 (MaSp1) genes of non-orb weaving spiders are few, and it is not clear whether genes of these organisms exhibit the same characteristics as those of orb-weavers. In addition, many studies have proposed that MaSp1 might be a single gene with allelic variants, but supporting evidence is still lacking. In this study, we compared partial DNA and amino acid sequences of MaSp1 cloned from different spider guilds. We also cloned partial MaSp1 sequences from genomic DNA and cDNA of the same individuals of spiders using the same primer combination to see if different molecular forms existed. In the repetitive region of partial MaSp1 sequences obtained, GGX, GA and poly-A motifs were present in all Araneomorphae and Mygalomorphae species examined. An extreme similarity in MaSp1 non-repetitive portions was found in sequences of ecribellate, cribellate and Mygalomorphae web-builders and such a result suggested that this sequence might exhibit an important function. A comparison of sequences amplified from the same individual showed that substitutions in amino acids occurred in both repetitive and non-repetitive regions, with a much higher variation in the former. These results suggest that the MaSp1 of Araneomorphae spiders exhibits several forms in an individual spider and it might be either a multiple gene or a single gene with a multiple exon/intron organization.

  15. FRESCO: Referential compression of highly similar sequences.

    PubMed

    Wandelt, Sebastian; Leser, Ulf

    2013-01-01

    In many applications, sets of similar texts or sequences are of high importance. Prominent examples are revision histories of documents or genomic sequences. Modern high-throughput sequencing technologies are able to generate DNA sequences at an ever-increasing rate. In parallel to the decreasing experimental time and cost necessary to produce DNA sequences, computational requirements for analysis and storage of the sequences are steeply increasing. Compression is a key technology to deal with this challenge. Recently, referential compression schemes, storing only the differences between a to-be-compressed input and a known reference sequence, gained a lot of interest in this field. In this paper, we propose a general open-source framework to compress large amounts of biological sequence data called Framework for REferential Sequence COmpression (FRESCO). Our basic compression algorithm is shown to be one to two orders of magnitude faster than comparable related work, while achieving similar compression ratios. We also propose several techniques to further increase compression ratios, while still retaining the advantage in speed: 1) selecting a good reference sequence; and 2) rewriting a reference sequence to allow for better compression. In addition, we propose a new way of further boosting the compression ratios by applying referential compression to already referentially compressed files (second-order compression). This technique allows for compression ratios well beyond the state of the art, for instance, 4,000:1 and higher for human genomes. We evaluate our algorithms on a large data set from three different species (more than 1,000 genomes, more than 3 TB) and on a collection of versions of Wikipedia pages. Our results show that real-time compression of highly similar sequences at high compression ratios is possible on modern hardware.
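
    The idea of referential compression, storing only how an input differs from a known reference, can be sketched as follows. This is a naive illustration and not FRESCO's algorithm: it assumes a greedy longest-match search that scans the whole reference at every step (quadratic time), and the operation names and the `min_match` threshold are hypothetical.

```python
def compress(reference, target, min_match=4):
    """Encode target as copy operations against reference plus literal characters."""
    ops = []
    i = 0
    while i < len(target):
        best_pos, best_len = -1, 0
        # Naive search: try every start position in the reference.
        for start in range(len(reference)):
            length = 0
            while (start + length < len(reference)
                   and i + length < len(target)
                   and reference[start + length] == target[i + length]):
                length += 1
            if length > best_len:
                best_pos, best_len = start, length
        if best_len >= min_match:
            ops.append(("copy", best_pos, best_len))  # reuse a reference segment
            i += best_len
        else:
            ops.append(("literal", target[i]))  # store the raw character
            i += 1
    return ops


def decompress(reference, ops):
    """Rebuild the target from the reference and the recorded operations."""
    parts = []
    for op in ops:
        if op[0] == "copy":
            _, pos, length = op
            parts.append(reference[pos:pos + length])
        else:
            parts.append(op[1])
    return "".join(parts)


reference = "ACGTTAGCCGATAGGCTA"
target = "ACGTTAGCGGATAGGCTA"  # one substitution relative to the reference
ops = compress(reference, target)
```

    For two sequences that differ by a single substitution, the encoding reduces to two copy operations and one literal, which is why highly similar genomes compress so well against a shared reference.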

  16. AlignMiner: a Web-based tool for detection of divergent regions in multiple sequence alignments of conserved sequences

    PubMed Central

    2010-01-01

    Background Multiple sequence alignments are used to study gene or protein function, phylogenetic relations, genome evolution hypotheses and even gene polymorphisms. Virtually without exception, all available tools focus on conserved segments or residues. Small divergent regions, however, are biologically important for specific quantitative polymerase chain reaction, genotyping, molecular markers and preparation of specific antibodies, and yet have received little attention. As a consequence, they must be selected empirically by the researcher. AlignMiner has been developed to fill this gap in bioinformatic analyses. Results AlignMiner is a Web-based application for detection of conserved and divergent regions in alignments of conserved sequences, focusing particularly on divergence. It accepts alignments (protein or nucleic acid) obtained using any of a variety of algorithms, which does not appear to have a significant impact on the final results. AlignMiner uses different scoring methods for assessing conserved/divergent regions, Entropy being the method that provides the highest number of regions with the greatest length, and Weighted being the most restrictive. Conserved/divergent regions can be generated either with respect to the consensus sequence or to one master sequence. The resulting data are presented in a graphical interface developed in AJAX, which provides remarkable user interaction capabilities. Users do not need to wait until execution is complete and can even inspect their results on a different computer. Data can be downloaded onto a user disk, in standard formats. In silico and experimental proof-of-concept cases have shown that AlignMiner can be successfully used to design specific polymerase chain reaction primers as well as potential epitopes for antibodies. Primer design is assisted by a module that deploys several oligonucleotide parameters for designing primers "on the fly". Conclusions AlignMiner can be used to reliably detect

  17. Use of a Drosophila Genome-Wide Conserved Sequence Database to Identify Functionally Related cis-Regulatory Enhancers

    PubMed Central

    Brody, Thomas; Yavatkar, Amarendra S; Kuzin, Alexander; Kundu, Mukta; Tyson, Leonard J; Ross, Jermaine; Lin, Tzu-Yang; Lee, Chi-Hon; Awasaki, Takeshi; Lee, Tzumin; Odenwald, Ward F

    2012-01-01

    Background: Phylogenetic footprinting has revealed that cis-regulatory enhancers consist of conserved DNA sequence clusters (CSCs). Currently, there is no systematic approach for enhancer discovery and analysis that takes full advantage of the sequence information within enhancer CSCs. Results: We have generated a Drosophila genome-wide database of conserved DNA consisting of >100,000 CSCs derived from EvoPrints spanning over 90% of the genome. cis-Decoder database search and alignment algorithms enable the discovery of functionally related enhancers. The program first identifies conserved repeat elements within an input enhancer and then searches the database for CSCs that score highly against the input CSC. Scoring is based on shared repeats as well as uniquely shared matches, and includes measures of the balance of shared elements, a diagnostic that has proven to be useful in predicting cis-regulatory function. To demonstrate the utility of these tools, a temporally-restricted CNS neuroblast enhancer was used to identify other functionally related enhancers and analyze their structural organization. Conclusions: cis-Decoder reveals that co-regulating enhancers consist of combinations of overlapping shared sequence elements, providing insights into the mode of integration of multiple regulating transcription factors. The database and accompanying algorithms should prove useful in the discovery and analysis of enhancers involved in any developmental process. Developmental Dynamics 241:169–189, 2012. © 2011 Wiley Periodicals, Inc. Key findings A genome-wide catalog of Drosophila conserved DNA sequence clusters. cis-Decoder discovers functionally related enhancers. Functionally related enhancers share balanced sequence element copy numbers. Many enhancers function during multiple phases of development. PMID:22174086

  18. The complete mitochondrial genome sequence of the liverwort Pleurozia purpurea reveals extremely conservative mitochondrial genome evolution in liverworts.

    PubMed

    Wang, Bin; Xue, Jiayu; Li, Libo; Liu, Yang; Qiu, Yin-Long

    2009-12-01

    Plant mitochondrial genomes have been known to be highly unusual in their large sizes, frequent intra-genomic rearrangement, and generally conservative sequence evolution. Recent studies show that in early land plants the mitochondrial genomes exhibit a mixed mode of conservative yet dynamic evolution. Here, we report the completely sequenced mitochondrial genome from the liverwort Pleurozia purpurea. The circular genome has a size of 168,526 base pairs, containing 43 protein-coding genes, 3 rRNA genes, 25 tRNA genes, and 31 group I or II introns. It differs from the Marchantia polymorpha mitochondrial genome, the only other liverwort chondriome that has been sequenced, in lacking two genes (trnRucg and trnTggu) and one intron (rrn18i1065gII). The two genomes have identical gene orders and highly similar sequences in exons, introns, and intergenic spacers. Finally, a comparative analysis of duplicated trnRucu and other trnR genes from the two liverworts and several other organisms identified the recent lateral origin of trnRucg in Marchantia mtDNA through modification of a duplicated trnRucu. This study shows that the mitochondrial genomes evolve extremely slowly in liverworts, the earliest-diverging lineage of extant land plants, in stark contrast to what is known of highly dynamic evolution of mitochondrial genomes in seed plants.

  19. Nucleotide sequence conservation of novel and established cis-regulatory sites within the tyrosine hydroxylase gene promoter

    PubMed Central

    Wang, Meng; Banerjee, Kasturi; Baker, Harriet; Cave, John W.

    2015-01-01

    Tyrosine hydroxylase (TH) is the rate-limiting enzyme in catecholamine biosynthesis and its gene proximal promoter (<1 kb upstream from the transcription start site) is essential for regulating transcription in both the developing and adult nervous systems. Several putative regulatory elements within the TH proximal promoter have been reported, but evolutionary conservation of these elements has not been thoroughly investigated. Since many vertebrate species are used to model development, function and disorders of human catecholaminergic neurons, identifying evolutionarily conserved transcription regulatory mechanisms is a high priority. In this study, we align TH proximal promoter nucleotide sequences from several vertebrate species to identify evolutionarily conserved motifs. This analysis identified three elements (a TATA box, cyclic AMP response element (CRE) and a 5′-GGTGG-3′ site) that constitute the core of an ancient vertebrate TH promoter. Focusing on only eutherian mammals, two regions of high conservation within the proximal promoter were identified: a ∼250 bp region adjacent to the transcription start site and a ∼85 bp region located approximately 350 bp further upstream. Within both regions, conservation of previously reported cis-regulatory motifs and human single nucleotide variants was evaluated. Transcription reporter assays in a TH-expressing cell line demonstrated the functionality of highly conserved motifs in the proximal promoter regions and electromobility shift assays showed that brain-region specific complexes assemble on these motifs. These studies also identified a non-canonical CRE binding (CREB) protein recognition element in the proximal promoter. Together, these studies provide a detailed analysis of evolutionary conservation within the TH promoter and identify potential cis-regulatory motifs that underlie a core set of regulatory mechanisms in mammals. PMID:25774193

  20. Sequence and domain conservation of the coelacanth Hsp40 and Hsp90 chaperones suggests conservation of function.

    PubMed

    Bishop, Özlem Tastan; Edkins, Adrienne Lesley; Blatch, Gregory Lloyd

    2014-09-01

    Molecular chaperones and their associated co-chaperones play an important role in preserving and regulating the active conformational state of cellular proteins. The chaperone complement of the Indonesian Coelacanth, Latimeria menadoensis, was elucidated using transcriptomic sequences. Heat shock protein 90 (Hsp90) and heat shock protein 40 (Hsp40) chaperones, and associated co-chaperones were focused on, and homologous human sequences were used to search the sequence databases. Coelacanth homologs of the cytosolic, mitochondrial and endoplasmic reticulum (ER) homologs of human Hsp90 were identified, as well as all of the major co-chaperones of the cytosolic isoform. Most of the human Hsp40s were found to have coelacanth homologs, and the data suggested that all of the chaperone machinery for protein folding at the ribosome, protein translocation to cellular compartments such as the ER and protein degradation were conserved. Some interesting similarities and differences were identified when interrogating human, mouse, and zebrafish homologs. For example, DnaJB13 is predicted to be a non-functional Hsp40 in humans, mouse, and zebrafish due to a corrupted histidine-proline-aspartic acid (HPD) motif, while the coelacanth homolog has an intact HPD. These and other comparisons enabled important functional and evolutionary questions to be posed for future experimental studies.

  1. Rice pseudomolecule-anchored cross-species DNA sequence alignments indicate regional genomic variation in expressed sequence conservation

    PubMed Central

    Armstead, Ian; Huang, Lin; King, Julie; Ougham, Helen; Thomas, Howard; King, Ian

    2007-01-01

    Background Various methods have been developed to explore inter-genomic relationships among plant species. Here, we present a sequence similarity analysis based upon comparison of transcript-assembly and methylation-filtered databases from five plant species and physically anchored rice coding sequences. Results A comparison of the frequency of sequence alignments, determined by MegaBLAST, between rice coding sequences in TIGR pseudomolecules and annotations vs 4.0 and comprehensive transcript-assembly and methylation-filtered databases from Lolium perenne (ryegrass), Zea mays (maize), Hordeum vulgare (barley), Glycine max (soybean) and Arabidopsis thaliana (thale cress) was undertaken. Each rice pseudomolecule was divided into 10 segments, each containing 10% of the functionally annotated, expressed genes. This indicated a correlation between relative segment position in the rice genome and numbers of alignments with all the queried monocot and dicot plant databases. Colour-coded moving windows of 100 functionally annotated, expressed genes along each pseudomolecule were used to generate 'heat-maps'. These revealed consistent intra- and inter-pseudomolecule variation in the relative concentrations of significant alignments with the tested plant databases. Analysis of the annotations and derived putative expression patterns of rice genes from 'hot-spots' and 'cold-spots' within the heat maps indicated possible functional differences. A similar comparison relating to ancestral duplications of the rice genome indicated that duplications were often associated with 'hot-spots'. Conclusion Physical positions of expressed genes in the rice genome are correlated with the degree of conservation of similar sequences in the transcriptomes of other plant species. This relative conservation is associated with the distribution of different sized gene families and segmentally duplicated loci and may have functional and evolutionary implications. PMID:17708759
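
    The two binning schemes described above (ten segments each holding 10% of the genes, and moving windows of 100 genes) can be sketched as follows. This is an illustration under stated assumptions, not the authors' pipeline: `values` is a hypothetical list holding one alignment count per annotated gene, ordered by physical position along a pseudomolecule.

```python
def decile_segments(values, n_segments=10):
    """Split position-ordered per-gene values into segments of (nearly) equal gene count."""
    n = len(values)
    bounds = [round(k * n / n_segments) for k in range(n_segments + 1)]
    return [values[bounds[k]:bounds[k + 1]] for k in range(n_segments)]


def moving_window_sums(values, window=100):
    """Sum of values in every window of `window` consecutive genes, stepping one gene at a time."""
    if len(values) < window:
        return []
    total = sum(values[:window])
    sums = [total]
    for i in range(window, len(values)):
        total += values[i] - values[i - window]  # slide the window by one gene
        sums.append(total)
    return sums


# Hypothetical input: 1,000 genes, each with one significant alignment.
values = [1] * 1000
segment_totals = [sum(segment) for segment in decile_segments(values)]
window_totals = moving_window_sums(values)
```

    Colour-coding the window totals along each pseudomolecule would give a heat map of the kind the abstract describes, with unusually high or low totals marking 'hot-spots' and 'cold-spots'.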

  2. Regulation of ABCC6 trafficking and stability by a conserved C-terminal PDZ-like sequence.

    PubMed

    Xue, Peng; Crum, Chelsea M; Thibodeau, Patrick H

    2014-01-01

    Mutations in the ABCC6 ABC-transporter are causative of pseudoxanthoma elasticum (PXE). The loss of functional ABCC6 protein in the basolateral membrane of the kidney and liver is putatively associated with altered secretion of a circulatory factor. As a result, systemic changes in elastic tissues are caused by progressive mineralization and degradation of elastic fibers. Premature arteriosclerosis, loss of skin and vascular tone, and a progressive loss of vision result from this ectopic mineralization. However, the identity of the circulatory factor and the specific role of ABCC6 in disease pathophysiology are not known. Though recessive loss-of-function alleles are associated with alterations in ABCC6 expression and function, the molecular pathologies associated with the majority of PXE-causing mutations are also not known. Sequence analysis of orthologous ABCC6 proteins indicates the C-terminal sequences are highly conserved and share high similarity to the PDZ sequences found in other ABCC subfamily members. Genetic testing of PXE patients suggests that at least one disease-causing mutation is located in a PDZ-like sequence at the extreme C-terminus of the ABCC6 protein. To evaluate the role of this C-terminal sequence in the biosynthesis and trafficking of ABCC6, a series of mutations were utilized to probe changes in ABCC6 biosynthesis, membrane stability and turnover. Removal of this PDZ-like sequence resulted in decreased steady-state ABCC6 levels, decreased cell surface expression and stability, and mislocalization of the ABCC6 protein in polarized cells. These data suggest that the conserved, PDZ-like sequence promotes the proper biosynthesis and trafficking of the ABCC6 protein.

  3. Conservation of Tubulin-Binding Sequences in TRPV1 throughout Evolution

    PubMed Central

    Sardar, Puspendu; Kumar, Abhishek; Bhandari, Anita; Goswami, Chandan

    2012-01-01

    Background Transient Receptor Potential Vanilloid sub type 1 (TRPV1), commonly known as the capsaicin receptor, can detect multiple stimuli ranging from noxious compounds, low pH, temperature as well as electromagnetic wave at different ranges. In addition, this receptor is involved in multiple physiological and sensory processes. Therefore, functions of TRPV1 have direct influences on adaptation and further evolution also. Availability of various eukaryotic genomic sequences in public domain facilitates us in studying the molecular evolution of TRPV1 protein and the respective conservation of certain domains, motifs and interacting regions that are functionally important. Methodology and Principal Findings Using statistical and bioinformatics tools, our analysis reveals that TRPV1 evolved ∼420 million years ago (MYA). Our analysis reveals that specific regions, domains and motifs of TRPV1 have gone through different selection pressures and thus have different levels of conservation. We found that among all, the TRP box is the most conserved and thus has functional significance. Our results also indicate that the tubulin binding sequences (TBS) have evolutionary significance as these stretch sequences are more conserved than many other essential regions of TRPV1. The overall distribution of positively charged residues within the TBS motifs is conserved throughout evolution. In silico analysis reveals that the TBS-1 and TBS-2 of TRPV1 can form helical structures and may play an important role in TRPV1 function. Conclusions and Significance Our analysis identifies the regions of TRPV1, which are important for structure-function relationship. This analysis indicates that tubulin binding sequence-1 (TBS-1) near the TRP-box forms a potential helix and the tubulin interactions with TRPV1 via TBS-1 have evolutionary significance. This interaction may be required for the proper channel function and regulation and may also have significance in the context of Taxol

  4. Integrated genome analysis suggests that most conserved non-coding sequences are regulatory factor binding sites

    PubMed Central

    Hemberg, Martin; Gray, Jesse M.; Cloonan, Nicole; Kuersten, Scott; Grimmond, Sean; Greenberg, Michael E.; Kreiman, Gabriel

    2012-01-01

    More than 98% of a typical vertebrate genome does not code for proteins. Although non-coding regions are sprinkled with short (<200 bp) islands of evolutionarily conserved sequences, the function of most of these unannotated conserved islands remains unknown. One possibility is that unannotated conserved islands could encode non-coding RNAs (ncRNAs); alternatively, unannotated conserved islands could serve as promoter-distal regulatory factor binding sites (RFBSs) like enhancers. Here we assess these possibilities by comparing unannotated conserved islands in the human and mouse genomes to transcribed regions and to RFBSs, relying on a detailed case study of one human and one mouse cell type. We define transcribed regions by applying a novel transcript-calling algorithm to RNA-Seq data obtained from total cellular RNA, and we define RFBSs using ChIP-Seq and DNAse-hypersensitivity assays. We find that unannotated conserved islands are four times more likely to coincide with RFBSs than with unannotated ncRNAs. Thousands of conserved RFBSs can be categorized as insulators based on the presence of CTCF or as enhancers based on the presence of p300/CBP and H3K4me1. While many unannotated conserved RFBSs are transcriptionally active to some extent, the transcripts produced tend to be unspliced, non-polyadenylated and expressed at levels 10 to 100-fold lower than annotated coding or ncRNAs. Extending these findings across multiple cell types and tissues, we propose that most conserved non-coding genomic DNA in vertebrate genomes corresponds to promoter-distal regulatory elements. PMID:22684627

  5. Primary structure of the merozoite surface antigen 1 of Plasmodium vivax reveals sequences conserved between different Plasmodium species.

    PubMed Central

    del Portillo, H A; Longacre, S; Khouri, E; David, P H

    1991-01-01

    Merozoite surface antigen 1 (MSA1) of several species of plasmodia has been shown to be a promising candidate for a vaccine directed against the asexual blood stages of malaria. We report the cloning and characterization of the MSA1 gene of the human malaria parasite Plasmodium vivax. This gene, which we call Pv200, encodes a polypeptide of 1726 amino acids and displays features described for MSA1 genes of other species, such as signal peptide and anchoring sequences, conserved cysteine residues, number of potential N-glycosylation sites, and repeats consisting here of 23 glutamine residues in a row. When the nucleotide and deduced amino acid sequences of the MSA1 of P. vivax are compared to those of another human malaria parasite, Plasmodium falciparum, and to those of the rodent parasite Plasmodium yoelii, 10 regions of high amino acid similarity are observed despite the very different dG + dC contents of the corresponding genes. All of the interspecies conserved regions reside within the conserved or semiconserved blocks delimited by the sequences of different alleles of the MSA1 gene of P. falciparum. Images PMID:2023952

  6. Structure and sequence conservation of hao cluster genes of autotrophic ammonia-oxidizing bacteria: evidence for their evolutionary history.

    PubMed

    Bergmann, David J; Hooper, Alan B; Klotz, Martin G

    2005-09-01

    Comparison of the organization and sequence of the hao (hydroxylamine oxidoreductase) gene clusters from the gammaproteobacterial autotrophic ammonia-oxidizing bacterium (aAOB) Nitrosococcus oceani and the betaproteobacterial aAOB Nitrosospira multiformis and Nitrosomonas europaea revealed a highly conserved gene cluster encoding the following proteins: hao, hydroxylamine oxidoreductase; orf2, a putative protein; cycA, cytochrome c(554); and cycB, cytochrome c(m)(552). The deduced protein sequences of HAO, c(554), and c(m)(552) were highly similar in all aAOB despite their differences in species evolution and codon usage. Phylogenetic inference revealed a broad family of multi-c-heme proteins, including HAO, the pentaheme nitrite reductase, and tetrathionate reductase. The c-hemes of this group also have a nearly identical geometry of heme orientation, which has remained conserved during divergent evolution of function. High sequence similarity is also seen within a protein family, including cytochromes c(m)(552), NrfH/B, and NapC/NirT. It is proposed that the hydroxylamine oxidation pathway evolved from a nitrite reduction pathway involved in anaerobic respiration (denitrification) during the radiation of the Proteobacteria. Conservation of the hydroxylamine oxidation module was maintained by functional pressure, and the module expanded into two separate narrow taxa after a lateral gene transfer event between gamma- and betaproteobacterial ancestors of extant aAOB. HAO-encoding genes were also found in six non-aAOB, either singly or tandemly arranged with an orf2 gene, whereas a c(554) gene was lacking. The conservation of the hao gene cluster in general and the uniqueness of the c(554) gene in particular make it a suitable target for the design of primers and probes useful for molecular ecology approaches to detect aAOB.

  7. Structure and Sequence Conservation of hao Cluster Genes of Autotrophic Ammonia-Oxidizing Bacteria: Evidence for Their Evolutionary History

    PubMed Central

    Bergmann, David J.; Hooper, Alan B.; Klotz, Martin G.

    2005-01-01

    Comparison of the organization and sequence of the hao (hydroxylamine oxidoreductase) gene clusters from the gammaproteobacterial autotrophic ammonia-oxidizing bacterium (aAOB) Nitrosococcus oceani and the betaproteobacterial aAOB Nitrosospira multiformis and Nitrosomonas europaea revealed a highly conserved gene cluster encoding the following proteins: hao, hydroxylamine oxidoreductase; orf2, a putative protein; cycA, cytochrome c554; and cycB, cytochrome cm552. The deduced protein sequences of HAO, c554, and cm552 were highly similar in all aAOB despite their differences in species evolution and codon usage. Phylogenetic inference revealed a broad family of multi-c-heme proteins, including HAO, the pentaheme nitrite reductase, and tetrathionate reductase. The c-hemes of this group also have a nearly identical geometry of heme orientation, which has remained conserved during divergent evolution of function. High sequence similarity is also seen within a protein family, including cytochromes cm552, NrfH/B, and NapC/NirT. It is proposed that the hydroxylamine oxidation pathway evolved from a nitrite reduction pathway involved in anaerobic respiration (denitrification) during the radiation of the Proteobacteria. Conservation of the hydroxylamine oxidation module was maintained by functional pressure, and the module expanded into two separate narrow taxa after a lateral gene transfer event between gamma- and betaproteobacterial ancestors of extant aAOB. HAO-encoding genes were also found in six non-aAOB, either singly or tandemly arranged with an orf2 gene, whereas a c554 gene was lacking. The conservation of the hao gene cluster in general and the uniqueness of the c554 gene in particular make it a suitable target for the design of primers and probes useful for molecular ecology approaches to detect aAOB. PMID:16151127

  8. The penicillin gene cluster is amplified in tandem repeats linked by conserved hexanucleotide sequences.

    PubMed Central

    Fierro, F; Barredo, J L; Díez, B; Gutierrez, S; Fernández, F J; Martín, J F

    1995-01-01

    The penicillin biosynthetic genes (pcbAB, pcbC, penDE) of Penicillium chrysogenum AS-P-78 were located in a 106.5-kb DNA region that is amplified in tandem repeats (five or six copies) linked by conserved TTTACA sequences. The wild-type strains P. chrysogenum NRRL 1951 and Penicillium notatum ATCC 9478 (Fleming's isolate) contain a single copy of the 106.5-kb region. This region was bordered by the same TTTACA hexanucleotide found between tandem repeats in strain AS-P-78. A penicillin overproducer strain, P. chrysogenum E1, contains a large number of copies in tandem of a 57.9-kb DNA fragment, linked by the same hexanucleotide or its reverse complementary TGTAAA sequence. The deletion mutant P. chrysogenum npe10 showed a deletion of 57.9 kb that corresponds exactly to the DNA fragment that is amplified in E1. The conserved hexanucleotide sequence was reconstituted at the deletion site. The amplification has occurred within a single chromosome (chromosome I). The tandem reiteration and deletion appear to arise by mutation-induced site-specific recombination at the conserved hexanucleotide sequences. Images Fig. 3 PMID:7597101
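
    The relationship between the TTTACA hexanucleotide and its reverse complement TGTAAA can be checked with a short sketch that also scans a sequence for either orientation. The function names and the example sequence are hypothetical and serve only to illustrate the strand logic.

```python
# Translation table mapping each base to its Watson-Crick partner.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def find_sites(seq, motif="TTTACA"):
    """List (position, strand) for each occurrence of the motif or its reverse complement."""
    rc = reverse_complement(motif)
    hits = []
    for i in range(len(seq) - len(motif) + 1):
        window = seq[i:i + len(motif)]
        if window == motif:
            hits.append((i, "+"))
        elif window == rc:
            hits.append((i, "-"))
    return hits


# Hypothetical junction sequence containing the motif in both orientations.
example = "GGTTTACAGGTGTAAACC"
sites = find_sites(example)
```

    A site reported on the "-" strand is the same hexanucleotide read from the opposite strand, which is why both forms can serve as junctions between tandem repeats.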

  9. Genotyping by sequencing resolves shallow population structure to inform conservation of Chinook salmon (Oncorhynchus tshawytscha)

    PubMed Central

    Larson, Wesley A; Seeb, Lisa W; Everett, Meredith V; Waples, Ryan K; Templin, William D; Seeb, James E

    2014-01-01

    Recent advances in population genomics have made it possible to detect previously unidentified structure, obtain more accurate estimates of demographic parameters, and explore adaptive divergence, potentially revolutionizing the way genetic data are used to manage wild populations. Here, we identified 10 944 single-nucleotide polymorphisms using restriction-site-associated DNA (RAD) sequencing to explore population structure, demography, and adaptive divergence in five populations of Chinook salmon (Oncorhynchus tshawytscha) from western Alaska. Patterns of population structure were similar to those of past studies, but our ability to assign individuals back to their region of origin was greatly improved (>90% accuracy for all populations). We also calculated effective size with and without removing physically linked loci identified from a linkage map, a novel method for nonmodel organisms. Estimates of effective size were generally above 1000 and were biased downward when physically linked loci were not removed. Outlier tests based on genetic differentiation identified 733 loci and three genomic regions under putative selection. These markers and genomic regions are excellent candidates for future research and can be used to create high-resolution panels for genetic monitoring and population assignment. This work demonstrates the utility of genomic data to inform conservation in highly exploited species with shallow population structure. PMID:24665338

  10. Annotation of cis-regulatory elements by identification, subclassification, and functional assessment of multispecies conserved sequences

    PubMed Central

    Hughes, Jim R.; Cheng, Jan-Fang; Ventress, Nicki; Prabhakar, Shyam; Clark, Kevin; Anguita, Eduardo; De Gobbi, Marco; de Jong, Pieter; Rubin, Eddy; Higgs, Douglas R.

    2005-01-01

    An important step toward improving the annotation of the human genome is to identify cis-acting regulatory elements from primary DNA sequence. One approach is to compare sequences from multiple, divergent species. This approach distinguishes multispecies conserved sequences (MCS) in noncoding regions from more rapidly evolving neutral DNA. Here, we have analyzed a region of ≈238kb containing the human α globin cluster that was sequenced and/or annotated across the syntenic region in 22 species spanning 500 million years of evolution. Using a variety of bioinformatic approaches and correlating the results with many aspects of chromosome structure and function in this region, we were able to identify and evaluate the importance of 24 individual MCSs. This approach sensitively and accurately identified previously characterized regulatory elements but also discovered unidentified promoters, exons, splicing, and transcriptional regulatory elements. Together, these studies demonstrate an integrated approach by which to identify, subclassify, and predict the potential importance of MCSs. PMID:15998734

  11. Predicting RNA-binding residues from evolutionary information and sequence conservation

    PubMed Central

    2010-01-01

    Background: RNA-binding proteins (RBPs) play crucial roles in post-transcriptional control of RNA. RBPs are designed to efficiently recognize specific RNA sequences after the RNA is derived from the DNA sequence. To satisfy diverse functional requirements, RNA-binding proteins are composed of multiple blocks of RNA-binding domains (RBDs) presented in various structural arrangements to provide versatile functions. The ability to computationally predict RNA-binding residues in an RNA-binding protein can help biologists identify important sites for site-directed mutagenesis in wet-lab experiments. Results: The proposed prediction framework, named “ProteRNA”, combines an SVM-based classifier with conserved residue discovery by WildSpan to identify the residues that interact with RNA in an RNA-binding protein. Although these conserved residues can be either functionally conserved residues or structurally conserved residues, they provide clues on the important residues in a protein sequence. In the independent testing dataset, ProteRNA has been able to deliver overall accuracy of 89.78%, MCC of 0.2628, F-score of 0.3075, and F0.5-score of 0.3546. Conclusions: This article presents the design of a sequence-based predictor aiming to identify the RNA-binding residues in an RNA-binding protein by combining machine learning and pattern mining approaches. RNA-binding proteins have diverse functions while interacting with different categories of RNAs because these proteins are composed of multiple copies of RNA-binding domains presented in various structural arrangements to expand the functional repertoire of RNA-binding proteins. Furthermore, predicting RNA-binding residues in an RNA-binding protein can help biologists identify important sites for site-directed mutagenesis in wet-lab experiments. PMID:21143803
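
    The abstract reports accuracy, MCC, F-score, and F0.5-score together. A minimal Python sketch of the standard definitions shows how they can diverge when binding residues are rare: accuracy is dominated by the many true negatives, while MCC and the F-scores are not. The function name is an assumption for illustration, and the counts in the usage note are hypothetical, not the ProteRNA confusion matrix.

```python
import math


def binary_metrics(tp: int, fp: int, tn: int, fn: int, beta: float = 1.0) -> dict:
    """Standard binary classification metrics from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0

    # F-beta weights recall beta times as much as precision;
    # beta = 0.5 therefore favours precision.
    b2 = beta * beta
    denom = b2 * precision + recall
    f_beta = (1 + b2) * precision * recall / denom if denom else 0.0

    # Matthews correlation coefficient; defined as 0 when any margin is empty.
    mcc_denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / mcc_denom if mcc_denom else 0.0

    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f_beta": f_beta, "mcc": mcc}
```

    With hypothetical counts such as `binary_metrics(tp=30, fp=40, tn=900, fn=30)`, accuracy is 0.93 while the F-score and MCC are both below 0.5, which is the kind of gap the reported figures display.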

  12. Lariat sequencing in a unicellular yeast identifies regulated alternative splicing of exons that are evolutionarily conserved with humans.

    PubMed

    Awan, Ali R; Manfredo, Amanda; Pleiss, Jeffrey A

    2013-07-30

    Alternative splicing is a potent regulator of gene expression that vastly increases proteomic diversity in multicellular eukaryotes and is associated with organismal complexity. Although alternative splicing is widespread in vertebrates, little is known about the evolutionary origins of this process, in part because of the absence of phylogenetically conserved events that cross major eukaryotic clades. Here we describe a lariat-sequencing approach, which offers high sensitivity for detecting splicing events, and its application to the unicellular fungus, Schizosaccharomyces pombe, an organism that shares many of the hallmarks of alternative splicing in mammalian systems but for which no previous examples of exon-skipping had been demonstrated. Over 200 previously unannotated splicing events were identified, including examples of regulated alternative splicing. Remarkably, an evolutionary analysis of four of the exons identified here as subject to skipping in S. pombe reveals high sequence conservation and perfect length conservation with their homologs in scores of plants, animals, and fungi. Moreover, alternative splicing of two of these exons has been documented in multiple vertebrate organisms, making these the first demonstrations of identical alternative-splicing patterns in species that are separated by over 1 billion years of evolution.

  13. Detecting Remote Sequence Homology in Disordered Proteins: Discovery of Conserved Motifs in the N-Termini of Mononegavirales phosphoproteins

    PubMed Central

    Karlin, David; Belshaw, Robert

    2012-01-01

    Paramyxovirinae are a large group of viruses that includes measles virus and parainfluenza viruses. The viral Phosphoprotein (P) plays a central role in viral replication. It is composed of a highly variable, disordered N-terminus and a conserved C-terminus. A second viral protein alternatively expressed, the V protein, also contains the N-terminus of P, fused to a zinc finger. We suspected that, despite their high variability, the N-termini of P/V might all be homologous; however, using standard approaches, we could previously identify sequence conservation only in some Paramyxovirinae. We now compared the N-termini using sensitive sequence similarity search programs, able to detect residual similarities unnoticeable by conventional approaches. We discovered that all Paramyxovirinae share a short sequence motif in their first 40 amino acids, which we called soyuz1. Despite its short length (11–16aa), several arguments allow us to conclude that soyuz1 probably evolved by homologous descent, unlike linear motifs. Conservation across such evolutionary distances suggests that soyuz1 plays a crucial role and experimental data suggest that it binds the viral nucleoprotein to prevent its illegitimate self-assembly. In some Paramyxovirinae, the N-terminus of P/V contains a second motif, soyuz2, which might play a role in blocking interferon signaling. Finally, we discovered that the P of related Mononegavirales contain similarly overlooked motifs in their N-termini, and that their C-termini share a previously unnoticed structural similarity suggesting a common origin. Our results suggest several testable hypotheses regarding the replication of Mononegavirales and suggest that disordered regions with little overall sequence similarity, common in viral and eukaryotic proteins, might contain currently overlooked motifs (intermediate in length between linear motifs and disordered domains) that could be detected simply by comparing orthologous proteins. PMID:22403617

  14. Predicting protein ligand binding sites by combining evolutionary sequence conservation and 3D structure.

    PubMed

    Capra, John A; Laskowski, Roman A; Thornton, Janet M; Singh, Mona; Funkhouser, Thomas A

    2009-12-01

    Identifying a protein's functional sites is an important step towards characterizing its molecular function. Numerous structure- and sequence-based methods have been developed for this problem. Here we introduce ConCavity, a small molecule binding site prediction algorithm that integrates evolutionary sequence conservation estimates with structure-based methods for identifying protein surface cavities. In large-scale testing on a diverse set of single- and multi-chain protein structures, we show that ConCavity substantially outperforms existing methods for identifying both 3D ligand binding pockets and individual ligand binding residues. As part of our testing, we perform one of the first direct comparisons of conservation-based and structure-based methods. We find that the two approaches provide largely complementary information, which can be combined to improve upon either approach alone. We also demonstrate that ConCavity has state-of-the-art performance in predicting catalytic sites and drug binding pockets. Overall, the algorithms and analysis presented here significantly improve our ability to identify ligand binding sites and further advance our understanding of the relationship between evolutionary sequence conservation and structural and functional attributes of proteins. Data, source code, and prediction visualizations are available on the ConCavity web site (http://compbio.cs.princeton.edu/concavity/).

  15. A phylogenetically conserved sequence within viral 3' untranslated RNA pseudoknots regulates translation.

    PubMed Central

    Leathers, V; Tanguay, R; Kobayashi, M; Gallie, D R

    1993-01-01

    Both the 68-base 5' leader (omega) and the 205-base 3' untranslated region (UTR) of tobacco mosaic virus (TMV) promote efficient translation. A 35-base region within omega is necessary and sufficient for the regulation. Within the 3' UTR, a 52-base region, composed of two RNA pseudoknots, is required for regulation. These pseudoknots are phylogenetically conserved among seven viruses from two different viral groups and one satellite virus. The pseudoknots contained significant conservation at the secondary and tertiary levels and at several positions at the primary sequence level. Mutational analysis of the sequences determined that the primary sequence in several conserved positions, particularly within the third pseudoknot, was essential for function. The higher-order structure of the pseudoknots was also required. Both the leader and the pseudoknot region were specifically recognized by, and competed for, the same proteins in extracts made from carrot cell suspension cells and wheat germ. Binding of the proteins is much stronger to omega than the pseudoknot region. Synergism was observed between the TMV 3' UTR and the cap and to a lesser extent between omega and the 3' UTR. The functional synergism and the protein binding data suggest that the cap, TMV 5' leader, and 3' UTR interact to establish an efficient level of translation. PMID:8355685

  16. Conservation of uORF repressiveness and sequence features in mouse, human and zebrafish

    PubMed Central

    Chew, Guo-Liang; Pauli, Andrea; Schier, Alexander F.

    2016-01-01

    Upstream open reading frames (uORFs) are ubiquitous repressive genetic elements in vertebrate mRNAs. While much is known about the regulation of individual genes by their uORFs, the range of uORF-mediated translational repression in vertebrate genomes is largely unexplored. Moreover, it is unclear whether the repressive effects of uORFs are conserved across species. To address these questions, we analyse transcript sequences and ribosome profiling data from human, mouse and zebrafish. We find that uORFs are depleted near coding sequences (CDSes) and have initiation contexts that diminish their translation. Linear modelling reveals that sequence features at both uORFs and CDSes modulate the translation of CDSes. Moreover, the ratio of translation over 5′ leaders and CDSes is conserved between human and mouse, and correlates with the number of uORFs. These observations suggest that the prevalence of vertebrate uORFs may be explained by their conserved role in repressing CDS translation. PMID:27216465

  17. Assembly of transmembrane helices of simple polytopic membrane proteins from sequence conservation patterns.

    PubMed

    Park, Yungki; Helms, Volkhard

    2006-09-01

    The transmembrane (TM) domains of most membrane proteins consist of helix bundles. The seemingly simple task of TM helix bundle assembly has turned out to be extremely difficult. This is true even for simple TM helix bundle proteins, i.e., those that have the simple form of compact TM helix bundles. Herein, we present a computational method that is capable of generating native-like structural models for simple TM helix bundle proteins having modest numbers of TM helices based on sequence conservation patterns. Thus, the only requirement for our method is the presence of more than 30 homologous sequences for an accurate extraction of sequence conservation patterns. The prediction method first computes a number of representative well-packed conformations for each pair of contacting TM helices, and then a library of tertiary folds is generated by overlaying overlapping TM helices of the representative conformations. This library is scored using sequence conservation patterns, and a subsequent clustering analysis yields five final models. Assuming that neighboring TM helices in the sequence contact each other (but not that TM helices A and G contact each other), the method produced structural models of Cα atom root-mean-square deviation (CA RMSD) of 3-5 Å from corresponding crystal structures for bacteriorhodopsin, halorhodopsin, sensory rhodopsin II, and rhodopsin. In blind predictions, this type of contact knowledge is not available. Mimicking this, predictions were made for the rotor of the V-type Na⁺-adenosine triphosphatase without such knowledge. The CA RMSD between the best model and its crystal structure is only 3.4 Å, and its contact accuracy reaches 55%. Furthermore, the model correctly identifies the binding pocket for sodium ion. These results demonstrate that the method can be readily applied to ab initio structure prediction of simple TM helix bundle proteins having modest numbers of TM helices.
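
    The CA RMSD figures quoted above follow the usual root-mean-square deviation over matched Cα atoms. A minimal Python sketch of that formula is given here under one loudly stated assumption: the two coordinate sets are already superimposed and listed in the same residue order. The abstract does not say how superposition was done, so that step is left out, and the function name is illustrative.

```python
import math


def ca_rmsd(model, reference):
    """Root-mean-square deviation between two equal-length lists of
    (x, y, z) C-alpha coordinates, in the units of the input.

    Assumes both structures are already superimposed and that entry i
    of each list refers to the same residue.
    """
    if len(model) != len(reference) or not model:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (mx - rx) ** 2 + (my - ry) ** 2 + (mz - rz) ** 2
        for (mx, my, mz), (rx, ry, rz) in zip(model, reference)
    )
    return math.sqrt(squared / len(model))
```

    Without a prior optimal superposition the value returned is an upper bound on the best-fit RMSD, so it should not be compared directly with published figures.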

  18. Septal localization by membrane targeting sequences and a conserved sequence essential for activity at the COOH-terminus of Bacillus subtilis cardiolipin synthase.

    PubMed

    Kusaka, Jin; Shuto, Satoshi; Imai, Yukiko; Ishikawa, Kazuki; Saito, Tomo; Natori, Kohei; Matsuoka, Satoshi; Hara, Hiroshi; Matsumoto, Kouji

    2016-04-01

    The acidic phospholipid cardiolipin (CL) is localized on polar and septal membranes and plays an important physiological role in Bacillus subtilis cells. ClsA, the enzyme responsible for CL synthesis, is also localized on septal membranes. We found that GFP fusion proteins of the enzyme with NH2-terminal and internal deletions retained septal localization. However, derivatives with deletions starting from the COOH-terminus (Leu482) ceased to localize to the septum once the deletion passed the Ile residue at 448, indicating that the sequence responsible for septal localization is confined within a short distance from the COOH-terminus. Two sequences, Ile436-Leu450 and Leu466-Leu478, are predicted to individually form an amphipathic α-helix. This configuration is known as a membrane targeting sequence (MTS) and we therefore refer to them as MTS2 and MTS1, respectively. Either one has the ability to affect septal localization, and each of these sequences by itself localizes to the septum. Membrane association of the constructs of this enzyme containing the MTSs was verified by subcellular fractionation of the cells. CL synthesis, in contrast, was abolished after deleting just the last residue, Leu482, in the COOH-terminal four amino acid residue sequence, Ser-Pro-Ile-Leu, which is highly conserved among bacterial CL synthases.

  19. The complete mitochondrial genome sequence of the tubeworm Lamellibrachia satsuma and structural conservation in the mitochondrial genome control regions of Order Sabellida.

    PubMed

    Patra, Ajit Kumar; Kwon, Yong Min; Kang, Sung Gyun; Fujiwara, Yoshihiro; Kim, Sang-Jin

    2016-04-01

    The control region of the mitochondrial genomes shows high variation in conserved sequence organizations, which follow distinct evolutionary patterns in different species or taxa. In this study, we sequenced the complete mitochondrial genome of Lamellibrachia satsuma from the cold-seep region of Kagoshima Bay, as a part of whole genome study and extensively studied the structural features and patterns of the control region sequences. We obtained 15,037 bp of mitochondrial genome using Illumina sequencing and identified the non-coding AT-rich region or control region (354 bp, AT=83.9%) located between trnH and trnR. We found 7 conserved sequence blocks (CSB), scattered throughout the control region of L. satsuma and other taxa of Annelida. The poly-TA stretches, which commonly form the stem of multiple stem-loop structures, are most conserved in the CSB-I and CSB-II regions. The mitochondrial genome of L. satsuma encodes a unique repetitive sequence in the control region, which forms a unique secondary structure in comparison to Lamellibrachia luymesi. Phylogenetic analyses of all protein-coding genes indicate that L. satsuma forms a monophyletic clade with L. luymesi along with other tubeworms found in cold-seep regions (genera: Lamellibrachia, Escarpia, and Seepiophila). In general, the control region sequences of Annelida could be aligned with certainty within each genus, and to some extent within the family, but with a higher rate of variation in conserved regions.

  20. Computational analysis of conserved coil functional residues in the mitochondrial genomic sequences of dermatophytes

    PubMed Central

    Gupta, Bulbul; Kaur, Jaspreet

    2016-01-01

    Dermatophytes are a group of closely related fungi that have the capacity to invade keratinized tissue of humans and other animals. The infection known as dermatophytosis, caused by members of the genera Microsporum, Trichophyton, and Epidermophyton, includes infection of the groin (tinea cruris), beard (tinea barbae), scalp (tinea capitis), feet (tinea pedis), glabrous skin (tinea corporis), nail (tinea unguium), and hand (tinea manuum). Identifying the evolutionary relationship between these three genera of dermatophytes is epidemiologically important for understanding their pathogenicity. Mitochondrial DNA evolves more rapidly than nuclear DNA due to a higher rate of mutation but is much less affected by genetic recombination, making it an important tool for phylogenetic studies. Thus, here we present a novel scheme to identify the conserved coil functional residues of Trichophyton rubrum, Trichophyton mentagrophytes, Epidermophyton floccosum and Microsporum canis. Protein coding sequences of the mitochondrial genome were aligned for their similar sequences and homology modelling was performed for structure and pocket identification. The results obtained from comparative analysis of the protein sequences revealed the presence of functionally active sites in all the species of the genera Trichophyton and Microsporum. However, in Epidermophyton floccosum they were observed in three of the five protein sequences studied. The absence of these conserved coil functional residues in E. floccosum may be correlated with the lower infectivity of this organism. The functional residues identified in the present study could be responsible for the disease and thus can act as putative target sites for drug designing. PMID:28149055

  1. GC Content Heterogeneity Transition of Conserved Noncoding Sequences Occurred at the Emergence of Vertebrates

    PubMed Central

    Hettiarachchi, Nilmini; Saitou, Naruya

    2016-01-01

    Conserved non-coding sequences (CNSs) of Eukaryotes are known to be significantly enriched in regulatory sequences. CNSs of diverse lineages follow different patterns in abundance, sequence composition, and location. Here, we report a thorough analysis of CNSs in diverse groups of Eukaryotes with respect to GC content heterogeneity. We examined 24 fungi, 19 invertebrates, and 12 non-mammalian vertebrates so as to find lineage-specific features of CNSs. We found that fungi and invertebrate CNSs are predominantly GC rich, as we previously observed in plants, whereas vertebrate CNSs are GC poor. This result suggests that the CNS GC content transition occurred from the ancestral GC rich state of Eukaryotes to GC poor in the vertebrate lineage due to the enrollment of GC poor transcription factor binding sites that are lineage specific. CNS GC content is closely linked with the nucleosome occupancy that determines the location and structural architecture of DNAs. PMID:28040773
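
    GC content, the quantity compared across lineages above, is the fraction of G and C among the unambiguous bases of a sequence. A minimal Python sketch follows; the function names and the 0.5 cut-off used to label a sequence "GC-rich" are assumptions for illustration, since the abstract does not state how rich and poor were delimited.

```python
def gc_content(seq: str) -> float:
    """Fraction of G/C among unambiguous bases (A, C, G, T); other
    symbols such as N or gap characters are ignored."""
    counted = [base for base in seq.upper() if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    return sum(base in "GC" for base in counted) / len(counted)


def classify_gc(seq: str, threshold: float = 0.5) -> str:
    """Label a sequence relative to a threshold (0.5 is a placeholder,
    not a value taken from the study)."""
    return "GC-rich" if gc_content(seq) > threshold else "GC-poor"
```

    In practice a lineage-level comparison would contrast each CNS with a genome-wide or flanking-region baseline instead of a fixed cut-off.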

  2. Massive microRNA sequence conservation and prevalence in human and chimpanzee introns.

    PubMed

    Hill, Aubrey E; Sorscher, Eric J

    2013-06-01

    Human and chimpanzee introns contain numerous sequences strongly related to known microRNA hairpin structures. The relative frequency is precisely maintained across all chromosomes, suggesting the possible co-evolution of gene networks dependent upon microRNA regulation and with origins corresponding to the advent of primate transposable elements (TEs). While the motifs are known to be derived from transposable elements, the most common are far more numerous than expected from the number of TEs and their paralogous sequences, and exhibit striking conservation in comparison to the surrounding TE sequence context. Several of these motifs also exhibit structural complementarity to each other, suggesting a pairing function at the level of DNA or RNA. These "pseudomicroRNAs," in semblance to pseudogenes, include hundreds of thousands of vestigial paralogs of primate microRNAs, many of which may have functioned historically or remain active today.

  3. Sample sequencing of vascular plants demonstrates widespread conservation and divergence of microRNAs.

    PubMed

    Chávez Montes, Ricardo A; de Fátima Rosas-Cárdenas, Flor; De Paoli, Emanuele; Accerbi, Monica; Rymarquis, Linda A; Mahalingam, Gayathri; Marsch-Martínez, Nayelli; Meyers, Blake C; Green, Pamela J; de Folter, Stefan

    2014-04-23

    Small RNAs are pivotal regulators of gene expression that guide transcriptional and post-transcriptional silencing mechanisms in eukaryotes, including plants. Here we report a comprehensive atlas of sRNA and miRNA from 3 species of algae and 31 representative species across vascular plants, including non-model plants. We sequence and quantify sRNAs from 99 different tissues or treatments across species, resulting in a data set of over 132 million distinct sequences. Using miRBase mature sequences as a reference, we identify the miRNA sequences present in these libraries. We apply diverse profiling methods to examine critical sRNA and miRNA features, such as size distribution, tissue-specific regulation and sequence conservation between species, as well as to predict putative new miRNA sequences. We also develop database resources, computational analysis tools and a dedicated website, http://smallrna.udel.edu/. This study provides new insights on plant sRNAs and miRNAs, and a foundation for future studies.

  4. Reptiles and mammals have differentially retained long conserved noncoding sequences from the amniote ancestor.

    PubMed

    Janes, D E; Chapus, C; Gondo, Y; Clayton, D F; Sinha, S; Blatti, C A; Organ, C L; Fujita, M K; Balakrishnan, C N; Edwards, S V

    2011-01-01

    Many noncoding regions of genomes appear to be essential to genome function. Conservation of large numbers of noncoding sequences has been reported repeatedly among mammals but not thus far among birds and reptiles. By searching genomes of chicken (Gallus gallus), zebra finch (Taeniopygia guttata), and green anole (Anolis carolinensis), we quantified the conservation among birds and reptiles and across amniotes of long, conserved noncoding sequences (LCNS), which we define as sequences ≥500 bp in length and exhibiting ≥95% similarity between species. We found 4,294 LCNS shared between chicken and zebra finch and 574 LCNS shared by the two birds and Anolis. The percent of genomes comprised by LCNS in the two birds (0.0024%) is notably higher than the percent in mammals (<0.0003% to <0.001%), differences that we show may be explained in part by differences in genome-wide substitution rates. We reconstruct a large number of LCNS for the amniote ancestor (ca. 8,630) and hypothesize differential loss and substantial turnover of these sites in descendent lineages. By contrast, we estimated a small role for recruitment of LCNS via acquisition of novel functions over time. Across amniotes, LCNS are significantly enriched with transcription factor binding sites for many developmental genes, and 2.9% of LCNS shared between the two birds show evidence of expression in brain expressed sequence tag databases. These results show that the rate of retention of LCNS from the amniote ancestor differs between mammals and Reptilia (including birds) and that this may reflect differing roles and constraints in gene regulation.

  5. Reptiles and Mammals Have Differentially Retained Long Conserved Noncoding Sequences from the Amniote Ancestor

    PubMed Central

    Janes, D.E.; Chapus, C.; Gondo, Y.; Clayton, D.F.; Sinha, S.; Blatti, C.A.; Organ, C.L.; Fujita, M.K.; Balakrishnan, C.N.; Edwards, S.V.

    2011-01-01

    Many noncoding regions of genomes appear to be essential to genome function. Conservation of large numbers of noncoding sequences has been reported repeatedly among mammals but not thus far among birds and reptiles. By searching genomes of chicken (Gallus gallus), zebra finch (Taeniopygia guttata), and green anole (Anolis carolinensis), we quantified the conservation among birds and reptiles and across amniotes of long, conserved noncoding sequences (LCNS), which we define as sequences ≥500 bp in length and exhibiting ≥95% similarity between species. We found 4,294 LCNS shared between chicken and zebra finch and 574 LCNS shared by the two birds and Anolis. The percent of genomes comprised by LCNS in the two birds (0.0024%) is notably higher than the percent in mammals (<0.0003% to <0.001%), differences that we show may be explained in part by differences in genome-wide substitution rates. We reconstruct a large number of LCNS for the amniote ancestor (ca. 8,630) and hypothesize differential loss and substantial turnover of these sites in descendent lineages. By contrast, we estimated a small role for recruitment of LCNS via acquisition of novel functions over time. Across amniotes, LCNS are significantly enriched with transcription factor binding sites for many developmental genes, and 2.9% of LCNS shared between the two birds show evidence of expression in brain expressed sequence tag databases. These results show that the rate of retention of LCNS from the amniote ancestor differs between mammals and Reptilia (including birds) and that this may reflect differing roles and constraints in gene regulation. PMID:21183607
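
    The LCNS definition above (at least 500 bp long and at least 95% similar between species) can be written as a simple filter over a pairwise alignment. This Python sketch assumes similarity means percent identity over alignment columns and that gaps count as mismatches; the abstract does not specify either choice, and the function names are illustrative.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns with identical bases. Columns where
    both sequences have a gap ('-') are skipped; a gap opposite a base
    counts as a mismatch (an assumption of this sketch)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        raise ValueError("alignment has no informative columns")
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


def is_lcns(aligned_a: str, aligned_b: str,
            min_len: int = 500, min_identity: float = 95.0) -> bool:
    """Apply the length and similarity thresholds quoted in the abstract."""
    shorter = min(len(aligned_a.replace("-", "")), len(aligned_b.replace("-", "")))
    return shorter >= min_len and percent_identity(aligned_a, aligned_b) >= min_identity
```

    A 500-column alignment with 20 mismatches (96% identity) passes the filter, while one with 30 mismatches (94%) or one only 499 bp long does not.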

  6. Protein engineering of selected residues from conserved sequence regions of a novel Anoxybacillus α-amylase

    PubMed Central

    Ranjani, Velayudhan; Janeček, Štefan; Chai, Kian Piaw; Shahir, Shafinaz; Rahman, Raja Noor Zaliha Raja Abdul; Chan, Kok-Gan; Goh, Kian Mau

    2014-01-01

    The α-amylases from Anoxybacillus species (ASKA and ADTA), Bacillus aquimaris (BaqA) and Geobacillus thermoleovorans (GTA, Pizzo and GtamyII) were proposed as a novel group of the α-amylase family GH13. An ASKA yielding a high percentage of maltose upon its reaction on starch was chosen as a model to study the residues responsible for the biochemical properties. Four residues from conserved sequence regions (CSRs) were thus selected, and the mutants F113V (CSR-I), Y187F and L189I (CSR-II) and A161D (CSR-V) were characterised. Few changes in the optimum reaction temperature and pH were observed for all mutants. Whereas the Y187F (t1/2 43 h) and L189I (t1/2 36 h) mutants had a lower thermostability at 65°C than the native ASKA (t1/2 48 h), the mutants F113V and A161D exhibited an improved t1/2 of 51 h and 53 h, respectively. Among the mutants, only the A161D had a specific activity, kcat and kcat/Km higher (1.23-, 1.17- and 2.88-times, respectively) than the values determined for the ASKA. The replacement of the Ala-161 in the CSR-V with an aspartic acid also caused a significant reduction in the ratio of maltose formed. This finding suggests the Ala-161 may contribute to the high maltose production of the ASKA. PMID:25069018

  7. Conservation of the sizes of 53 introns and over 100 intronic sequences for the binding of common transcription factors in the human and mouse genes for type II procollagen (COL2A1).

    PubMed Central

    Ala-Kokko, L; Kvist, A P; Metsäranta, M; Kivirikko, K I; de Crombrugghe, B; Prockop, D J; Vuorio, E

    1995-01-01

    Over 11,000 bp of previously undefined sequences of the human COL2A1 gene were defined. The results made it possible to compare the intron structures of a highly complex gene from man and mouse. Surprisingly, the sizes of the 53 introns of the two genes were highly conserved with a mean difference of 13%. After alignment of the sequences, 69% of the intron sequences were identical. The introns contained consensus sequences for the binding of over 100 different transcription factors that were conserved in the introns of the two genes. The first intron of the gene contained 80 conserved consensus sequences and the remaining 52 introns of the gene contained 106 conserved sequences for the binding of transcription factors. The 5'-end of intron 2 in both genes had a potential for forming a stem loop in RNA transcripts. PMID:8948452

  8. 76 FR 82075 - Highly Erodible Land and Wetland Conservation

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-12-30

    ... Secretary 7 CFR Part 12 RIN 0560-AH97 Highly Erodible Land and Wetland Conservation AGENCY: Office of the... agricultural commodities are planted on highly erodible land or a converted wetland, or the production of... ``good faith'' provisions in the USDA regulations allow violators of highly erodible land...

  9. Identification of evolutionarily conserved non-AUG-initiated N-terminal extensions in human coding sequences

    PubMed Central

    Ivanov, Ivaylo P.; Firth, Andrew E.; Michel, Audrey M.; Atkins, John F.; Baranov, Pavel V.

    2011-01-01

    In eukaryotes, it is generally assumed that translation initiation occurs at the AUG codon closest to the messenger RNA 5′ cap. However, in certain cases, initiation can occur at codons differing from AUG by a single nucleotide, especially the codons CUG, UUG, GUG, ACG, AUA and AUU. While non-AUG initiation has been experimentally verified for a handful of human genes, the full extent to which this phenomenon is utilized—both for increased coding capacity and potentially also for novel regulatory mechanisms—remains unclear. To address this issue, and hence to improve the quality of existing coding sequence annotations, we developed a methodology based on phylogenetic analysis of predicted 5′ untranslated regions from orthologous genes. We use evolutionary signatures of protein-coding sequences as an indicator of translation initiation upstream of annotated coding sequences. Our search identified novel conserved potential non-AUG-initiated N-terminal extensions in 42 human genes including VANGL2, FGFR1, KCNN4, TRPV6, HDGF, CITED2, EIF4G3 and NTF3, and also affirmed the conservation of known non-AUG-initiated extensions in 17 other genes. In several instances, we have been able to obtain independent experimental evidence of the expression of non-AUG-initiated products from the previously published literature and ribosome profiling data. PMID:21266472
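
    The phrase "codons differing from AUG by a single nucleotide" defines a small, enumerable set. A short Python sketch lists all nine such codons; the six named in the abstract (CUG, UUG, GUG, ACG, AUA and AUU) are a subset of them. The function name is an illustrative assumption.

```python
def single_nucleotide_variants(codon: str = "AUG", alphabet: str = "ACGU"):
    """Return every codon that differs from the given codon at exactly
    one position, in position order."""
    variants = []
    for pos, original in enumerate(codon):
        for base in alphabet:
            if base != original:
                variants.append(codon[:pos] + base + codon[pos + 1:])
    return variants
```

    Calling `single_nucleotide_variants()` yields CUG, GUG, UUG, AAG, ACG, AGG, AUA, AUC and AUU; which of these actually support initiation is an empirical matter that this enumeration does not address.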

  10. Lack of evidence of conserved lentiviral sequences in pigs with post weaning multisystemic wasting syndrome.

    PubMed Central

    Bratanich, A; Lairmore, M; Heneine, W; Konoby, C; Harding, J; West, K; Vasquez, G; Allan, G; Ellis, J

    1999-01-01

    In order to investigate the role of retroviruses in the recently described porcine postweaning multisystemic wasting syndrome (PMWS), serum and leukocytes were screened for reverse transcriptase (RT) activity, and tissues were examined for the presence of conserved lentiviral sequences using degenerate primers in a polymerase chain reaction (PCR). Serum and stimulated leukocytes from the blood and lymph nodes from pigs with PMWS, as well as from control pigs, had RT activity that was detected by the sensitive Amp-RT assay. A 257-bp fragment was amplified from DNA from the blood and bone marrow of pigs with PMWS. This fragment was identical in size to conserved lentiviral sequences that were amplified from plasmids containing DNA from several lentiviruses. Cloning and sequencing of the fragment from affected pigs, however, did not reveal homology with the recognized lentiviruses. Together the results of these analyses suggest that the RT activity present in tissues from control and affected pigs is the result of endogenous retrovirus expression, and that a lentivirus is not a primary pathogen in PMWS. Images PMID:10480463

  11. Nullomers and High Order Nullomers in Genomic Sequences

    PubMed Central

    Vergni, Davide; Santoni, Daniele

    2016-01-01

    A nullomer is an oligomer that does not occur as a subsequence in a given DNA sequence, i.e. it is an absent word of that sequence. The importance of nullomers in several applications, from drug discovery to forensic practice, is now debated in the literature. Here, we investigated the nature of nullomers, whether their absence in genomes has just a statistical explanation or it is a peculiar feature of genomic sequences. We introduced an extension of the notion of nullomer, namely high order nullomers, which are nullomers whose mutated sequences are still nullomers. We studied different aspects of them: comparison with nullomers of random sequences, CpG distribution and mean helical rise. In agreement with previous results we found that the number of nullomers in the human genome is much larger than expected by chance. Nevertheless antithetical results were found when considering a random DNA sequence preserving dinucleotide frequencies. The analysis of CpG frequencies in nullomers and high order nullomers revealed, as expected, a high CpG content but it also highlighted a strong dependence of CpG frequencies on the dinucleotide position, suggesting that nullomers have their own peculiar structure and are not simply sequences whose CpG frequency is biased. Furthermore, phylogenetic trees were built on eleven species based on both the similarities between the dinucleotide frequencies and the number of nullomers two species share, showing that nullomers are fairly conserved among close species. Finally the study of mean helical rise of nullomers sequences revealed significantly high mean rise values, reinforcing the hypothesis that those sequences have some peculiar structural features. The obtained results show that nullomers are the consequence of the peculiar structure of DNA (also including biased CpG frequency and CpGs islands), so that the hypermutability model, also taking into account CpG islands, seems to be not sufficient to explain nullomer phenomenon
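
    The definition of a nullomer given above can be made concrete with a short sketch. It assumes a single-strand scan over the alphabet ACGT and reads "mutated sequences" as all single-base substitutions; both are simplifying assumptions for illustration, and the function names are hypothetical rather than taken from the paper.

```python
from itertools import product

def nullomers(sequence, k):
    """Return the sorted list of DNA k-mers that never occur in `sequence`."""
    seq = sequence.upper()
    present = {seq[i:i + k] for i in range(len(seq) - k + 1)}
    all_kmers = ("".join(p) for p in product("ACGT", repeat=k))
    return sorted(kmer for kmer in all_kmers if kmer not in present)

def is_high_order(kmer, absent):
    """True if every single-base substitution of `kmer` is also in `absent`."""
    for pos in range(len(kmer)):
        for base in "ACGT":
            if base != kmer[pos]:
                mutant = kmer[:pos] + base + kmer[pos + 1:]
                if mutant not in absent:
                    return False
    return True

absent_2mers = nullomers("ACGTACGT", 2)
print(len(absent_2mers))
```

    For realistic genomes, k is typically large enough that enumerating all 4^k words becomes the dominant cost, so this brute-force form is only suitable for small k.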

  12. A Collection of Conserved Noncoding Sequences to Study Gene Regulation in Flowering Plants

    PubMed Central

    2016-01-01

    Transcription factors (TFs) regulate gene expression by binding cis-regulatory elements, of which the identification remains an ongoing challenge owing to the prevalence of large numbers of nonfunctional TF binding sites. Powerful comparative genomics methods, such as phylogenetic footprinting, can be used for the detection of conserved noncoding sequences (CNSs), which are functionally constrained and can greatly help in reducing the number of false-positive elements. In this study, we applied a phylogenetic footprinting approach for the identification of CNSs in 10 dicot plants, yielding 1,032,291 CNSs associated with 243,187 genes. To annotate CNSs with TF binding sites, we made use of binding site information for 642 TFs originating from 35 TF families in Arabidopsis (Arabidopsis thaliana). In three species, the identified CNSs were evaluated using TF chromatin immunoprecipitation sequencing data, resulting in significant overlap for the majority of data sets. To identify ultraconserved CNSs, we included genomes of additional plant families and identified 715 binding sites for 501 genes conserved in dicots, monocots, mosses, and green algae. Additionally, we found that genes that are part of conserved mini-regulons have a higher coherence in their expression profile than other divergent gene pairs. All identified CNSs were integrated in the PLAZA 3.0 Dicots comparative genomics platform (http://bioinformatics.psb.ugent.be/plaza/versions/plaza_v3_dicots/) together with new functionalities facilitating the exploration of conserved cis-regulatory elements and their associated genes. The availability of this data set in a user-friendly platform enables the exploration of functional noncoding DNA to study gene regulation in a variety of plant species, including crops. PMID:27261064

  13. Nucleotide sequence of the capsid protein gene of two serotypes of San Miguel sea lion virus: identification of conserved and non-conserved amino acid sequences among calicivirus capsid proteins.

    PubMed

    Neill, J D

    1992-07-01

    The San Miguel sea lion viruses, members of the calicivirus family, are closely related to the vesicular disease of swine viruses which can cause severe disease in swine. In order to begin the molecular characterization of these viruses, the nucleotide sequence of the capsid protein gene of two San Miguel sea lion viruses (SMSV), serotypes 1 and 4, was determined. The coding sequences for the capsid precursor protein were located within the 3' terminal 2620 bases of the genomic RNAs of both viruses. The encoded capsid precursor proteins were 79,500 and 77,634 Da for SMSV 1 and SMSV 4, respectively. The SMSV 1 protein was 47.7% and SMSV 4 was 48.6% homologous to the feline calicivirus (FCV) capsid precursor protein while the two SMSV capsid precursors were 73% homologous to each other. Six distinct regions within the capsid precursors (denoted as regions A-F) were identified based on amino acid sequence alignment analysis of the two SMSV serotypes with FCV and the rabbit hemorrhagic disease virus (RHDV) capsid protein. Three regions showed similarity among all four viruses (regions B, D and F) and one region showed a very high degree of homology between the SMSV serotypes but only limited similarity with FCV (region A). RHDV contained only a truncated region A. A fifth region, consisting of approximately 100 residues, was not conserved among any of the viruses (region E) and, in SMSV, may contain the serotype-specific determinants. Another small region (region C) contained between 15 and 27 amino acids and showed little sequence conservation. Region B showed the highest degree of conservation among the four viruses and contained the residues which had homology to the picornavirus VP3 structural protein. An open reading frame, found in the 3' terminal 514 bases of the SMSV genomes, encoded small proteins (12,575 and 12,522 Da, respectively for SMSV 1 and SMSV 4) of which 32% of the conserved amino acids were basic residues, implying a possible nucleic acid

  14. Conservation.

    ERIC Educational Resources Information Center

    National Audubon Society, New York, NY.

    This set of teaching aids consists of seven Audubon Nature Bulletins, providing the teacher and student with informational reading on various topics in conservation. The bulletins have these titles: Plants as Makers of Soil, Water Pollution Control, The Ground Water Table, Conservation--To Keep This Earth Habitable, Our Threatened Air Supply,…

  15. Structural analysis of the regulatory elements of the type-II procollagen gene. Conservation of promoter and first intron sequences between human and mouse.

    PubMed Central

    Vikkula, M; Metsäranta, M; Syvänen, A C; Ala-Kokko, L; Vuorio, E; Peltonen, L

    1992-01-01

    Transcription of the type-II procollagen gene (COL2A1) is very specifically restricted to a limited number of tissues, particularly cartilages. In order to identify transcription-control motifs we have sequenced the promoter region and the first intron of the human and mouse COL2A1 genes. With the assumption that these motifs should be well conserved during evolution, we have searched for potential elements important for the tissue-specific transcription of the COL2A1 gene by aligning the two sequences with each other and with the available rat type-II procollagen sequence for the promoter. With this approach we could identify specific evolutionarily well-conserved motifs in the promoter area. On the other hand, several suggested regulatory elements in the promoter region did not show evolutionary conservation. In the middle of the first intron we found a cluster of well-conserved transcription-control elements and we conclude that these conserved motifs most probably possess a significant function in the control of the tissue-specific transcription of the COL2A1 gene. We also describe locations of additional, highly conserved nucleotide stretches, which are good candidate regions in the search for binding sites of yet-uncharacterized cartilage-specific transcription regulators of the COL2A1 gene. PMID:1637314

  16. Protein E of Haemophilus influenzae is a ubiquitous highly conserved adhesin.

    PubMed

    Singh, Birendra; Brant, Marta; Kilian, Mogens; Hallström, Björn; Riesbeck, Kristian

    2010-02-01

    Protein E (PE) of nontypeable Haemophilus influenzae (NTHi) is involved in adhesion and activation of epithelial cells. A total of 186 clinical NTHi isolates, encapsulated H. influenzae, and culture collection strains were analyzed. PE was highly conserved in both NTHi and encapsulated H. influenzae (96.9%-100% identity without the signal peptide). PE also existed in other members of the family Pasteurellaceae. The epithelial cell binding region (amino acids 84-108) was completely conserved. Phylogenetic analysis of the pe sequence separated Haemophilus species into 2 separate clusters. Importantly, PE was expressed in 98.4% of all NTHi (126 isolates) independently of the growth phase.

  17. Yeast general transcription factor GFI: sequence requirements for binding to DNA and evolutionary conservation.

    PubMed Central

    Dorsman, J C; van Heeswijk, W C; Grivell, L A

    1990-01-01

    GFI is an abundant DNA binding protein in the yeast S. cerevisiae. The protein binds to specific sequences in both ARS elements and the upstream regions of a large number of genes and is likely to play an important role in yeast cell growth. To get insight into the relative strength of the various GFI-DNA binding sites within the yeast genome, we have determined dissociation rates for several GFI-DNA complexes and found them to vary over a 70-fold range. Strong binding sites for GFI are present in the upstream activating sequences of the gene encoding the 40 kDa subunit II of the QH2:cytochrome c reductase, the gene encoding ribosomal protein S33 and in the intron of the actin gene. The binding site in the ARS1-TRP1 region is of intermediate strength. All strong binding sites conform to the sequence 5'-RTCRYYYNNNACG-3'. Modification interference experiments and studies with mutant binding sites indicate that critical bases for GFI recognition are within the two elements of the consensus DNA recognition sequence. Proteins with the DNA binding specificities of GFI and GFII can also be detected in the yeast K. lactis, suggesting evolutionary conservation of at least the respective DNA-binding domains in both yeasts. Images PMID:2187179
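
    The consensus RTCRYYYNNNACG is written in IUPAC nucleotide codes (R = A or G, Y = C or T, N = any base). A minimal sketch of scanning a sequence for it follows; it searches the forward strand only and reports non-overlapping hits, and the helper names and the test sequence are made up for illustration.

```python
import re

# Only the IUPAC codes needed for this consensus are listed.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "N": "[ACGT]",  # any base
}

def consensus_to_regex(consensus):
    """Translate an IUPAC consensus string into a compiled regular expression."""
    return re.compile("".join(IUPAC[ch] for ch in consensus))

GFI_SITE = consensus_to_regex("RTCRYYYNNNACG")

def find_sites(dna):
    """Return (start, matched_text) for each non-overlapping forward-strand hit."""
    return [(m.start(), m.group()) for m in GFI_SITE.finditer(dna.upper())]

hits = find_sites("TTATCACTTGGAACGTT")
print(hits)
```

    A fuller scan would also test the reverse complement, since a binding site can lie on either strand.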

  18. Sequence of Radiotherapy and Chemotherapy in Breast Cancer After Breast-Conserving Surgery

    SciTech Connect

    Jobsen, Jan J.; Palen, Job van der; Brinkhuis, Marieel; Ong, Francisca; Struikmans, Henk

    2012-04-01

    Purpose: The optimal sequence of radiotherapy and chemotherapy in breast-conserving therapy is unknown. Methods and Materials: From 1983 through 2007, a total of 641 patients with 653 instances of breast-conserving therapy (BCT) received both chemotherapy and radiotherapy and form the basis of this analysis. Patients were divided into three groups. Groups A and B comprised patients treated before 2005, with Group A receiving radiotherapy first and Group B chemotherapy first. Group C consisted of patients treated from 2005 onward, when we had a fixed sequence of radiotherapy first, followed by chemotherapy. Results: Local control did not show any differences among the three groups. For distant metastasis, no difference was shown between Groups A and B. Group C, when compared with Group A, showed, on univariate and multivariate analyses, a significantly better distant metastasis-free survival. The same was noted for disease-free survival. With respect to disease-specific survival, no differences were shown on multivariate analysis among the three groups. Conclusion: Radiotherapy, as an integral part of the primary treatment of BCT, should be administered first, followed by adjuvant chemotherapy.

  19. Conserved sequence motifs among bacterial, eukaryotic, and archaeal phosphatases that define a new phosphohydrolase superfamily.

    PubMed Central

    Thaller, M. C.; Schippa, S.; Rossolini, G. M.

    1998-01-01

    Members of a new molecular family of bacterial nonspecific acid phosphatases (NSAPs), indicated as class C, were found to share significant sequence similarities to bacterial class B NSAPs and to some plant acid phosphatases, representing the first example of a family of bacterial NSAPs that has a relatively close eukaryotic counterpart. Despite the lack of an overall similarity, conserved sequence motifs were also identified among the above enzyme families (class B and class C bacterial NSAPs, and related plant phosphatases) and several other families of phosphohydrolases, including bacterial phosphoglycolate phosphatases, histidinol-phosphatase domains of the bacterial bifunctional enzymes imidazole-glycerolphosphate dehydratases, and bacterial, eukaryotic, and archaeal phosphoserine phosphatases and trehalose-6-phosphatases. These conserved motifs are clustered within two domains, separated by a variable spacer region, according to the pattern [FILMAVT]-D-[ILFRMVY]-D-[GSNDE]-[TV]-[ILVAM]-[ATSVILMC]-X-{YFWHKR}-X-{YFWHNQ}-X(102,191)-{KRHNQ}-G-D-{FYWHILVMC}-{QNH}-{FWYGP}-D-{PSNQYW}. The dephosphorylating activity common to all these proteins supports the definition of this phosphatase motif and the inclusion of these enzymes into a superfamily of phosphohydrolases that we propose to indicate as "DDDD" after the presence of the four invariant aspartate residues. Database searches retrieved various hypothetical proteins of unknown function containing this or similar motifs, for which a phosphohydrolase activity could be hypothesized. PMID:9684901
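
    The motif above can be turned into a searchable regular expression. The sketch assumes the pattern follows PROSITE syntax, in which square brackets list allowed residues, brace groups list excluded residues, X is any residue and X(m,n) is a spacer of m to n residues; reading the garbled groups of the record as exclusion braces is itself an assumption. The converter handles only these constructs, and the test protein is synthetic.

```python
import re

DDDD_PATTERN = (
    "[FILMAVT]-D-[ILFRMVY]-D-[GSNDE]-[TV]-[ILVAM]-[ATSVILMC]-X-"
    "{YFWHKR}-X-{YFWHNQ}-X(102,191)-{KRHNQ}-G-D-{FYWHILVMC}-"
    "{QNH}-{FWYGP}-D-{PSNQYW}"
)

def prosite_to_regex(pattern):
    """Convert a PROSITE-style pattern (subset of the syntax) to a compiled regex."""
    parts = []
    for element in pattern.split("-"):
        repeat = ""
        if "(" in element:                      # e.g. X(102,191)
            element, counts = element.rstrip(")").split("(")
            repeat = "{" + counts + "}"
        if element.upper() == "X":              # any residue
            core = "."
        elif element.startswith("{"):           # excluded residues
            core = "[^" + element[1:-1] + "]"
        else:                                   # literal residue or [allowed set]
            core = element
        parts.append(core + repeat)
    return re.compile("".join(parts))

DDDD_REGEX = prosite_to_regex(DDDD_PATTERN)

# Synthetic sequence built to satisfy the motif with the minimum spacer length.
toy_protein = "ADIDGTIAAAAA" + "A" * 102 + "AGDAAADA"
print(bool(DDDD_REGEX.search(toy_protein)))
```

    Shortening the spacer below 102 residues makes the same sequence fail to match, which is the role of the variable region between the two domains.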

  20. Robust high-order space-time conservative schemes for solving conservation laws on hybrid meshes

    NASA Astrophysics Data System (ADS)

    Shen, Hua; Wen, Chih-Yung; Liu, Kaixin; Zhang, Deliang

    2015-01-01

    In this paper, the second-order space-time conservation element and solution element (CE/SE) method proposed by Chang (1995) [3] is implemented on hybrid meshes for solving conservation laws. In addition, the present scheme has been extended to high-order versions including third and fourth order. Most features of the proposed schemes are consistent with those of the original CE/SE method, including: (i) a unified treatment of space and time (thereby ensuring good conservation in both space and time); (ii) a highly compact node stencil (the solution node is calculated using only the neighboring mesh nodes) regardless of the order of accuracy, at the cost of storing all derivatives. A staggered time marching strategy is adopted and the solutions are updated alternately between cell centers and vertexes. To construct explicit high-order schemes, second- and third-order derivatives are calculated by a modified finite-difference/weighted-average procedure which is different from that used to calculate the first-order derivatives. The present schemes can be implemented on a wide variety of meshes, including triangular, quadrilateral and hybrid (consisting of both triangular and quadrilateral elements). Beyond that, they can be easily extended to arbitrary-order schemes and arbitrarily shaped polygonal elements by using the present methodologies. A series of common benchmark examples are used to confirm the accuracy and robustness of the proposed schemes.

  1. Heterochromatin protein 1, a known suppressor of position-effect variegation, is highly conserved in Drosophila.

    PubMed Central

    Clark, R F; Elgin, S C

    1992-01-01

    The Su(var)205 gene of Drosophila melanogaster encodes heterochromatin protein 1 (HP1), a protein located preferentially within beta-heterochromatin. Mutation of this gene has been associated with dominant suppression of position-effect variegation. We have cloned and sequenced the gene encoding HP1 from Drosophila virilis, a distantly related species. Comparison of the predicted amino acid sequence with Drosophila melanogaster HP1 shows two regions of strong homology, one near the N-terminus (57/61 amino acids identical) and the other near the C-terminus (62/68 amino acids identical) of the protein. Little homology is seen in the 5' and 3' untranslated portions of the gene, as well as in the intronic sequences, although intron/exon boundaries are generally conserved. A comparison of the deduced amino acid sequences of HP1-like proteins from other species shows that the cores of the N-terminal and C-terminal domains have been conserved from insects to mammals. The high degree of conservation suggests that these N- and C-terminal domains could interact with other macromolecules in the formation of the condensed structure of heterochromatin. Images PMID:1461737

  2. High compression image and image sequence coding

    NASA Technical Reports Server (NTRS)

    Kunt, Murat

    1989-01-01

    The digital representation of an image requires a very large number of bits. This number is even larger for an image sequence. The goal of image coding is to reduce this number, as much as possible, and reconstruct a faithful duplicate of the original picture or image sequence. Early efforts in image coding, solely guided by information theory, led to a plethora of methods. The compression ratio reached a plateau around 10:1 a couple of years ago. Recent progress in the study of the brain mechanism of vision and scene analysis has opened new vistas in picture coding. Directional sensitivity of the neurones in the visual pathway combined with the separate processing of contours and textures has led to a new class of coding methods capable of achieving compression ratios as high as 100:1 for images and around 300:1 for image sequences. Recent progress on some of the main avenues of object-based methods is presented. These second generation techniques make use of contour-texture modeling, new results in neurophysiology and psychophysics and scene analysis.

  3. Genome-wide analyses of Epstein-Barr virus reveal conserved RNA structures and a novel stable intronic sequence RNA

    PubMed Central

    2013-01-01

    Background: Epstein-Barr virus (EBV) is a human herpesvirus implicated in cancer and autoimmune disorders. Little is known concerning the roles of RNA structure in this important human pathogen. This study provides the first comprehensive genome-wide survey of RNA and RNA structure in EBV. Results: Novel EBV RNAs and RNA structures were identified by computational modeling and RNA-Seq analyses of EBV. Scans of the genomic sequences of four EBV strains (EBV-1, EBV-2, GD1, and GD2) and of the closely related Macacine herpesvirus 4 using the RNAz program discovered 265 regions with high probability of forming conserved RNA structures. Secondary structure models are proposed for these regions based on a combination of free energy minimization and comparative sequence analysis. The analysis of RNA-Seq data uncovered the first observation of a stable intronic sequence RNA (sisRNA) in EBV. The abundance of this sisRNA rivals that of the well-known and highly expressed EBV-encoded non-coding RNAs (EBERs). Conclusion: This work identifies regions of the EBV genome likely to generate functional RNAs and RNA structures, provides structural models for these regions, and discusses potential functions suggested by the modeled structures. Enhanced understanding of the EBV transcriptome will guide future experimental analyses of the discovered RNAs and RNA structures. PMID:23937650

  4. Comparative genomic analysis of equilibrative nucleoside transporters suggests conserved protein structure despite limited sequence identity.

    PubMed

    Sankar, Narendra; Machado, Jerry; Abdulla, Parween; Hilliker, Arthur J; Coe, Imogen R

    2002-10-15

    Equilibrative nucleoside transporters (ENTs) are a recently characterized and poorly understood group of membrane proteins that are important in the uptake of endogenous nucleosides required for nucleic acid and nucleoside triphosphate synthesis. Despite their central importance in cellular metabolism and nucleoside analog chemotherapy, no human ENT gene has been described and nothing is known about gene structure and function. To gain insight into the ENT gene family, we used experimental and in silico comparative genomic approaches to identify ENT genes in three evolutionarily diverse organisms with completely (or almost completely) sequenced genomes, Homo sapiens, Caenorhabditis elegans and Drosophila melanogaster. We describe the chromosomal location, the predicted ENT gene structure and putative structural topologies of predicted ENT proteins derived from the open reading frames. Despite variations in genomic layout and limited ortholog protein sequence identity (≤27.45%), predicted topologies of ENT proteins are strikingly similar, suggesting an evolutionary conservation of a prototypic structure. In addition, a similar distribution of protein domains on exons is apparent in all three taxa. These data demonstrate that comparative sequence analyses should be combined with other approaches (such as genomic and proteomic analyses) to fully understand structure, function and evolution of protein families.

  5. Evolutionary conservation of sequence and secondary structures in CRISPR repeats

    SciTech Connect

    Kunin, Victor; Sorek, Rotem; Hugenholtz, Philip

    2006-09-01

    Clustered Regularly Interspaced Short Palindromic Repeats (CRISPRs) are a novel class of direct repeats, separated by unique spacer sequences of similar length, that are present in approximately 40% of bacterial and all archaeal genomes analyzed to date. More than 40 gene families, called CRISPR-associated sequences (CAS), appear in conjunction with these repeats and are thought to be involved in the propagation and functioning of CRISPRs. It has been proposed that the CRISPR/CAS system samples, maintains a record of, and inactivates invasive DNA that the cell has encountered, and therefore constitutes a prokaryotic analog of an immune system. Here we analyze CRISPR repeats identified in 195 microbial genomes and show that they can be organized into multiple clusters based on sequence similarity. All individual repeats in any given cluster were inferred to form characteristic RNA secondary structure, ranging from non-existent to pronounced. Stable secondary structures included G:U base pairs and exhibited multiple compensatory base changes in the stem region, indicating evolutionary conservation and functional importance. We also show that the repeat-based classification corresponds to, and expands upon, a previously reported CAS gene-based classification including specific relationships between CRISPR and CAS subtypes.

  6. The nucleotide sequence of the nitrogen-regulation gene ntrA of Klebsiella pneumoniae and comparison with conserved features in bacterial RNA polymerase sigma factors.

    PubMed Central

    Merrick, M J; Gibbins, J R

    1985-01-01

    The nucleotide sequence of the Klebsiella pneumoniae ntrA gene has been determined. NtrA encodes a 53,926 Dalton acidic polypeptide; a calculated molecular weight which is significantly lower than that determined by SDS polyacrylamide gel analysis. NtrA is followed by another open-reading frame (orf) of at least 75 amino acids. In the spacer region between ntrA and orf there are no apparent transcription termination or promoter sequences and therefore orf may be co-transcribed with ntrA. Previous authors have proposed that NtrA could act as an RNA polymerase sigma factor but the NtrA amino acid sequence does not show a high level of homology to any known sigma factor. However analysis of sequences of five sigma factors from E. coli and B. subtilis has identified two conserved sequences at the C-terminal end of all these polypeptides. These sequences resemble those found in known site-specific DNA-binding domains and may be involved in recognition of conserved -35 and -10 promoter sequences. A similar pair of sequences is present at the C-terminus of NtrA and could play a role in recognition of ntr-activatable promoters. Images PMID:2999700

  7. Dinoflagellate tandem array gene transcripts are highly conserved and not polycistronic

    PubMed Central

    Beauchemin, Mathieu; Roy, Sougata; Daoust, Philippe; Dagenais-Bellefeuille, Steve; Bertomeu, Thierry; Letourneau, Louis; Lang, B. Franz; Morse, David

    2012-01-01

    Dinoflagellates are an important component of the marine biota, but a large genome with high–copy number (up to 5,000) tandem gene arrays has made genomic sequencing problematic. More importantly, little is known about the expression and conservation of these unusual gene arrays. We assembled de novo a gene catalog of 74,655 contigs for the dinoflagellate Lingulodinium polyedrum from RNA-Seq (Illumina) reads. The catalog contains 93% of a Lingulodinium EST dataset deposited in GenBank and 94% of the enzymes in 16 primary metabolic KEGG (Kyoto Encyclopedia of Genes and Genomes) pathways, indicating it is a good representation of the transcriptome. Analysis of the catalog shows a marked underrepresentation of DNA-binding proteins and DNA-binding domains compared with other algae. Despite this, we found no evidence to support the proposal of polycistronic transcription, including a marked underrepresentation of sequences corresponding to the intergenic spacers of two tandem array genes. We also have used RNA-Seq to assess the degree of sequence conservation in tandem array genes and found their transcripts to be highly conserved. Interestingly, some of the sequences in the catalog have only bacterial homologs and are potential candidates for horizontal gene transfer. These presumably were transferred as single-copy genes, and because they are now all GC-rich, any derived from AT-rich contexts must have experienced extensive mutation. Our study not only has provided the most complete dinoflagellate gene catalog known to date, it has also exploited RNA-Seq to address fundamental issues in basic transcription mechanisms and sequence conservation in these algae. PMID:23019363

  8. Conservation of Shannon's redundancy for proteins. [information theory applied to amino acid sequences]

    NASA Technical Reports Server (NTRS)

    Gatlin, L. L.

    1974-01-01

    Concepts of information theory are applied to examine various proteins in terms of their redundancy in natural originators such as animals and plants. The Monte Carlo method is used to derive information parameters for random protein sequences. Real protein sequence parameters are compared with the standard parameters of protein sequences having a specific length. The tendency of a chain to contain some amino acids more frequently than others and the tendency of a chain to contain certain amino acid pairs more frequently than other pairs are used as randomness measures of individual protein sequences. Non-periodic proteins are generally found to have random Shannon redundancies except in cases of constraints due to short chain length and genetic codes. Redundant characteristics of highly periodic proteins are discussed. A degree of periodicity parameter is derived.
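
    The composition-based part of such a redundancy measure can be sketched as R = 1 - H/H_max, where H is the Shannon entropy of the observed amino acid frequencies and H_max = log2(20) for the standard alphabet. This covers only the "some amino acids more frequently than others" component; the pair-frequency component mentioned in the abstract is not modelled, and the function names are illustrative.

```python
import math
from collections import Counter

def shannon_entropy(sequence):
    """Entropy in bits per residue of the residue frequencies in `sequence`."""
    counts = Counter(sequence)
    total = len(sequence)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())

def redundancy(sequence, alphabet_size=20):
    """Shannon redundancy: 0 for a uniform composition, 1 for a single repeated residue."""
    return 1.0 - shannon_entropy(sequence) / math.log2(alphabet_size)

uniform = "ACDEFGHIKLMNPQRSTVWY"   # each of the 20 amino acids once
print(round(redundancy(uniform), 6))
print(round(redundancy("AAAA"), 6))
```

    Short chains cannot sample all 20 residues evenly, which is one reason the abstract notes constraints due to short chain length.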

  9. High-resolution schemes for hyperbolic conservation laws

    NASA Technical Reports Server (NTRS)

    Harten, A.

    1982-01-01

    A class of new explicit second order accurate finite difference schemes for the computation of weak solutions of hyperbolic conservation laws is presented. These highly nonlinear schemes are obtained by applying a nonoscillatory first order accurate scheme to an appropriately modified flux function. The resulting second order accurate schemes achieve high resolution while preserving the robustness of the original nonoscillatory first order accurate scheme.
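
    The idea of adding a limited second-order correction to a nonoscillatory first-order flux can be illustrated on linear advection, u_t + a u_x = 0 with a > 0, on a periodic grid. The sketch below is not Harten's modified-flux scheme itself: it swaps in the closely related minmod flux-limited Lax-Wendroff update, chosen because it is short. The grid size, Courant number and square-pulse initial data are arbitrary.

```python
def minmod(a, b):
    """Return the argument of smaller magnitude if signs agree, else zero."""
    if a * b <= 0.0:
        return 0.0
    return a if abs(a) < abs(b) else b

def step(u, c):
    """Advance one time step; c = a*dt/dx is the Courant number, 0 <= c <= 1."""
    n = len(u)
    flux = []  # flux[i] is the interface flux at i+1/2, divided by a
    for i in range(n):
        left = u[i] - u[i - 1]             # u[-1] wraps around (periodic)
        right = u[(i + 1) % n] - u[i]
        # first-order upwind flux plus a limited second-order correction
        flux.append(u[i] + 0.5 * (1.0 - c) * minmod(left, right))
    return [u[i] - c * (flux[i] - flux[i - 1]) for i in range(n)]

def total_variation(u):
    """Sum of absolute jumps between neighbouring cells, including the wrap-around."""
    return sum(abs(u[i] - u[i - 1]) for i in range(len(u)))

u0 = [1.0 if 10 <= i < 20 else 0.0 for i in range(50)]
u = u0
for _ in range(40):
    u = step(u, 0.5)
print(total_variation(u0), total_variation(u))
```

    Near the pulse edges the limiter returns zero and the update falls back to the first-order upwind flux, which is what keeps the total variation from growing.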

  10. Sequence conservation, HLA-E-Restricted peptide, and best-defined CTL/CD8+ epitopes in gag P24 (capsid) of HIV-1 subtype B

    NASA Astrophysics Data System (ADS)

    Prasetyo, Afiono Agung; Dharmawan, Ruben; Sari, Yulia; Sariyatun, Ratna

    2017-02-01

    Human immunodeficiency virus type 1 (HIV-1) remains a global health problem. Continuous studies of HIV-1 genetic and immunological profiles are important to find strategies against the virus. This study aimed to analyse sequence conservation, the HLA-E-restricted peptide, and best-defined CTL/CD8+ epitopes in p24 (capsid) of HIV-1 subtype B worldwide. The p24-coding sequences from 3,557 HIV subtype B isolates were aligned using MUSCLE and analysed. Some highly conserved regions (sequence conservation ≥95%) were observed. Two considerably long stretches with 100% conservation were observed at bases 349-356 and 550-557 of p24 (HXB2 numbering). The consensus from all aligned isolates was precisely the same as consensus B in the Los Alamos HIV Database. The HLA-E-restricted peptide in amino acids (aa) 14-22 of HIV-1 p24 (AISPRTLNA) was found in 55.9% (1,987/3,557) of HIV-1 subtype B isolates worldwide. Forty-four best-defined CTL/CD8+ epitopes were observed, of which the VKNWMTETL epitope (aa 181-189 of p24), restricted by B*4801, was the most frequent, found in 94.9% of isolates. The results of this study contribute information about HIV-1 subtype B and may benefit further work on diagnostic and therapeutic strategies against the virus.
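
    A per-column conservation score of the kind quoted above (≥95%, 100%) can be sketched as the fraction of aligned sequences that carry the most common character at each position. The study's exact scoring is not given in the abstract, so this definition, the treatment of gaps as ordinary characters, and the four-sequence toy alignment are all assumptions for illustration.

```python
from collections import Counter

def column_conservation(alignment):
    """Fraction of sequences sharing the most common character in each column."""
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("aligned sequences must all have the same length")
    scores = []
    for column in zip(*alignment):
        top_count = Counter(column).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores

def conserved_runs(scores, threshold=0.95):
    """Return 0-based inclusive (start, end) runs of columns scoring >= threshold."""
    runs, start = [], None
    for i, score in enumerate(scores):
        if score >= threshold and start is None:
            start = i
        elif score < threshold and start is not None:
            runs.append((start, i - 1))
            start = None
    if start is not None:
        runs.append((start, len(scores) - 1))
    return runs

toy_alignment = ["ACGT", "ACGA", "ACCT", "ACGT"]
scores = column_conservation(toy_alignment)
print(scores, conserved_runs(scores))
```

    Reported coordinates such as HXB2 numbering are 1-based and tied to a reference sequence, so the 0-based column indices returned here would need to be mapped before comparison.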

  11. In Silico Structure and Sequence Analysis of Bacterial Porins and Specific Diffusion Channels for Hydrophilic Molecules: Conservation, Multimericity and Multifunctionality

    PubMed Central

    Vollan, Hilde S.; Tannæs, Tone; Vriend, Gert; Bukholm, Geir

    2016-01-01

    Diffusion channels are involved in the selective uptake of nutrients and form the largest outer membrane protein (OMP) family in Gram-negative bacteria. Differences in pore size and amino acid composition contribute to the specificity. Structure-based multiple sequence alignments shed light on the structure-function relations for all eight subclasses. Entropy-variability analysis results are correlated to known structural and functional aspects, such as structural integrity, multimericity, specificity and biological niche adaptation. The high mutation rate in their surface-exposed loops is likely an important mechanism for host immune system evasion. Multiple sequence alignments for each subclass revealed conserved residue positions that are involved in substrate recognition and specificity. An analysis of monomeric protein channels revealed particular sequence patterns of amino acids that were observed in other classes at multimeric interfaces. This adds to the emerging evidence that all members of the family exist in a multimeric state. Our findings are important for understanding the role of members of this family in a wide range of bacterial processes, including bacterial food uptake, survival and adaptation mechanisms. PMID:27110766

  12. The human archain gene, ARCN1, has highly conserved homologs in rice and drosophila

    SciTech Connect

    Radice, P.; Jones, C.; Perry, H.

    1995-03-01

    A novel human gene, ARCN1, has been identified in chromosome band 11q23.3. It maps approximately 50 kb telomeric to MLL, a gene that is disrupted in a number of leukemia-associated translocation chromosomes. cDNA clones representing ARCN1 hybridize to 4-kb mRNA species present in all tissues tested. Sequencing of cDNAs suggests that at least two forms of mRNA with alternative 5′ ends are present within the cell. The mRNA with the longest open reading frame gives rise to a protein of 57 kDa. Although the sequence reported is novel, remarkable similarity is observed with two predicted protein sequences from partial DNA sequences generated by rice (Oryza sativa) and fruit fly (Drosophila melanogaster) genome projects. The degree of sequence conservation is comparable to that observed for highly conserved structural proteins, such as heat shock protein HSP70, and is greater than that of γ-tubulin and heat shock protein HSP60. A more distant relationship to the group of clathrin-associated proteins suggests a possible role in vesicle structure or trafficking. In view of its ancient pedigree and a potential involvement in cellular architecture, the authors propose that the ARCN1 protein be named archain. 20 refs., 5 figs.

  13. Comparative sequence and structure analysis reveals the conservation and diversity of nucleotide positions and their associated tertiary interactions in the riboswitches.

    PubMed

    Appasamy, Sri D; Ramlan, Effirul Ikhwan; Firdaus-Raih, Mohd

    2013-01-01

    The tertiary motifs in complex RNA molecules play vital roles either in stabilizing the RNA 3D structure or in providing important biological functionality to the molecule. In order to better understand the roles of these tertiary motifs in riboswitches, we examined 11 representative riboswitch PDB structures for potential agreement between motif occurrence and conservation. A total of 61 unique tertiary interactions were found in the reference structures. In addition to the expected common A-minor motifs and base-triples mainly involved in linking distant regions of the riboswitch structures, three highly conserved variants of A-minor interactions called G-minors were found in the SAM-I and FMN riboswitches, where they appear to be involved in the recognition of the respective ligand's functional groups. From our structural survey as well as corresponding structure and sequence alignments, the agreement between motif occurrence and conservation is very prominent across the representative riboswitches. Our analysis provides evidence that some of these tertiary interactions are essential components of the structure, with their sequence positions conserved despite a high degree of diversity in other parts of the respective riboswitch sequences. This is indicative of a vital role for these tertiary interactions in determining the specific biological function of the riboswitch.

  14. Relationship between sequence conservation and three-dimensional structure in a large family of esterases, lipases, and related proteins.

    PubMed Central

    Cygler, M.; Schrag, J. D.; Sussman, J. L.; Harel, M.; Silman, I.; Gentry, M. K.; Doctor, B. P.

    1993-01-01

    Based on the recently determined X-ray structures of Torpedo californica acetylcholinesterase and Geotrichum candidum lipase and on their three-dimensional superposition, an improved alignment of a collection of 32 related amino acid sequences of other esterases, lipases, and related proteins was obtained. On the basis of this alignment, 24 residues are found to be invariant in 29 sequences of hydrolytic enzymes, and an additional 49 are well conserved. The conservation in the three remaining sequences is somewhat lower. The conserved residues include the active site, disulfide bridges, salt bridges, and residues in the core of the proteins. Most invariant residues are located at the edges of secondary structural elements. A clear structural basis for the preservation of many of these residues can be determined from comparison of the two X-ray structures. PMID:8453375

  15. Origin and spread of photosynthesis based upon conserved sequence features in key bacteriochlorophyll biosynthesis proteins.

    PubMed

    Gupta, Radhey S

    2012-11-01

    The origin of photosynthesis and how this capability has spread to other bacterial phyla remain important unresolved questions. I describe here a number of conserved signature indels (CSIs) in key proteins involved in bacteriochlorophyll (Bchl) biosynthesis that provide important insights in these regards. The proteins BchL and BchX, which are essential for Bchl biosynthesis, are derived by gene duplication in a common ancestor of all phototrophs. More ancient gene duplication gave rise to the BchX-BchL proteins and the NifH protein of the nitrogenase complex. The sequence alignment of NifH-BchX-BchL proteins contains two CSIs that are uniquely shared by all NifH and BchX homologs, but not by any BchL homologs. These CSIs and phylogenetic analysis of NifH-BchX-BchL protein sequences strongly suggest that the BchX homologs are ancestral to BchL and that the Bchl-based anoxygenic photosynthesis originated prior to the chlorophyll (Chl)-based photosynthesis in cyanobacteria. Another CSI in the BchX-BchL sequence alignment that is uniquely shared by all BchX homologs and the BchL sequences from Heliobacteriaceae, but absent in all other BchL homologs, suggests that the BchL homologs from Heliobacteriaceae are primitive in comparison to all other photosynthetic lineages. Several other identified CSIs in the BchN homologs are commonly shared by all proteobacterial homologs and a clade consisting of the marine unicellular Cyanobacteria (Clade C). These CSIs, in conjunction with the results of phylogenetic analyses and pair-wise sequence similarity on the BchL, BchN, and BchB proteins, where the homologs from Clade C Cyanobacteria and Proteobacteria exhibited a close relationship, provide strong evidence that these two groups have incurred lateral gene transfers. Additionally, phylogenetic analyses and several CSIs in the BchL-N-B proteins that are uniquely shared by all Chlorobi and Chloroflexi homologs provide evidence that the genes for these proteins have also been

  16. Characterization of a highly repeated DNA sequence family in five species of the genus Eulemur.

    PubMed

    Ventura, M; Boniotto, M; Cardone, M F; Fulizio, L; Archidiacono, N; Rocchi, M; Crovella, S

    2001-09-19

    The karyotypes of Eulemur species exhibit a high degree of variation, as a consequence of the Robertsonian fusion and/or centromere fission. Centromeric and pericentromeric heterochromatin of eulemurs is constituted by highly repeated DNA sequences (including some telomeric TTAGGG repeats) which have so far been investigated and used for the study of the systematic relationships of the different species of the genus Eulemur. In our study, we have cloned a set of repetitive pericentromeric sequences of five Eulemur species: E. fulvus fulvus (EFU), E. mongoz (EMO), E. macaco (EMA), E. rubriventer (ERU), and E. coronatus (ECO). We have characterized these clones by sequence comparison and by comparative fluorescence in situ hybridization analysis in EMA and EFU. Our results showed a high degree of sequence similarity among Eulemur species, indicating a strong conservation, within the five species, of these pericentromeric highly repeated DNA sequences.

  17. The expressed TCRβ CDR3 repertoire is dominated by conserved DNA sequences in channel catfish.

    PubMed

    Findly, R Craig; Niagro, Frank D; Dickerson, Harry W

    2017-03-01

    We analyzed by high-throughput sequencing T cell receptor beta CDR3 repertoires expressed by αβ T cells in outbred channel catfish before and after an immunizing infection with the parasitic protozoan Ichthyophthirius multifiliis. We compared CDR3 repertoires in caudal fin before infection and at three weeks after infection, and in skin, PBL, spleen and head kidney at seven and twenty-one weeks after infection. Public clonotypes with the same CDR3 amino acid sequence were expressed by αβ T cells that underwent clonal expansion following development of immunity. These clonally expanded αβ T cells were primarily located in spleen and skin, which is a site of infection. Although multiple DNA sequences were expected to code for each public clonotype, each public clonotype was predominately coded by an identical CDR3 DNA sequence in combination with the same J gene in all fish. The processes underlying this shared use of CDR3 DNA sequences are not clear.

  18. Discovery and profiling of novel and conserved microRNAs during flower development in Carya cathayensis via deep sequencing.

    PubMed

    Wang, Zheng Jia; Huang, Jian Qin; Huang, You Jun; Li, Zheng; Zheng, Bing Song

    2012-08-01

    Hickory (Carya cathayensis Sarg.) is an economically important woody plant in China, but its long juvenile phase delays yield. MicroRNAs (miRNAs) are critical regulators of genes and important for normal plant development and physiology, including flower development. We used Solexa technology to sequence two small RNA libraries from two floral differentiation stages in hickory to identify miRNAs related to flower development. We identified 39 conserved miRNA sequences from 114 loci belonging to 23 families as well as two novel and ten potential novel miRNAs belonging to nine families. Moreover, 35 conserved miRNA*s and two novel miRNA*s were detected. Twenty miRNA sequences from 49 loci belonging to 11 families were differentially expressed; all were up-regulated at the later stage of flower development in hickory. Quantitative real-time PCR of 12 conserved miRNA sequences, five novel miRNA families, and two novel miRNA*s validated that all were expressed during hickory flower development, and the expression patterns were similar to those detected with Solexa sequencing. Finally, a total of 146 targets of the novel and conserved miRNAs were predicted. This study identified a diverse set of miRNAs that were closely related to hickory flower development and that could help in plant floral induction.

  19. High sequence turnover in the regulatory regions of the developmental gene hunchback in insects.

    PubMed

    Hancock, J M; Shaw, P J; Bonneton, F; Dover, G A

    1999-02-01

    Extensive sequence analysis of the developmental gene hunchback and its 5' and 3' regulatory regions in Drosophila melanogaster, Drosophila virilis, Musca domestica, and Tribolium castaneum, using a variety of computer algorithms, reveals regions of high sequence simplicity probably generated by slippage-like mechanisms of turnover. No regions are entirely refractory to the action of slippage, although the density and composition of simple sequence motifs vary from region to region. Interestingly, the 5' and 3' flanking regions share short repetitive motifs despite their separation by the gene itself, and the motifs are different in composition from those in the exons and introns. Furthermore, there are high levels of conservation of motifs in equivalent orthologous regions. Detailed sequence analysis of the P2 promoter and DNA footprinting assays reveal that the number, orientation, sequence, spacing, and protein-binding affinities of the BICOID-binding sites vary between species and that the 'P2' promoter, the nanos response element in the 3' untranslated region, and several conserved boxes of sequence in the gene (e.g., the two zinc-finger regions) are surrounded by cryptically-simple-sequence DNA. We argue that high sequence turnover and genetic redundancy permit both the general maintenance of promoter functions through the establishment of coevolutionary (compensatory) changes in cis- and trans-acting genetic elements and, at the same time, the possibility of subtle changes in the regulation of hunchback in the different species.

  20. High Throughput Sequencing of Extracellular RNA from Human Plasma

    PubMed Central

    Danielson, Kirsty M.; Rubio, Renee; Abderazzaq, Fieda; Das, Saumya; Wang, Yaoyu E.

    2017-01-01

    The presence and relative stability of extracellular RNAs (exRNAs) in biofluids has led to an emerging recognition of their promise as ‘liquid biopsies’ for diseases. Most prior studies on discovery of exRNAs as disease-specific biomarkers have focused on microRNAs (miRNAs) using technologies such as qRT-PCR and microarrays. The recent application of next-generation sequencing to discovery of exRNA biomarkers has revealed the presence of potential novel miRNAs as well as other RNA species such as tRNAs, snoRNAs, piRNAs and lncRNAs in biofluids. At the same time, the use of RNA sequencing for biofluids poses unique challenges, including low amounts of input RNAs, the presence of exRNAs in different compartments with varying degrees of vulnerability to isolation techniques, and the high abundance of specific RNA species (thereby limiting the sensitivity of detection of less abundant species). Moreover, discovery in human diseases often relies on archival biospecimens of varying age and limiting amounts of samples. In this study, we have tested RNA isolation methods to optimize profiling exRNAs by RNA sequencing in individuals without any known diseases. Our findings are consistent with other recent studies that detect microRNAs and ribosomal RNAs as the major exRNA species in plasma. Similar to other recent studies, we found that the landscape of biofluid microRNA transcriptome is dominated by several abundant microRNAs that appear to comprise conserved extracellular miRNAs. There is reasonable correlation of sets of conserved miRNAs across biological replicates, and even across other data sets obtained at different investigative sites. Conversely, the detection of less abundant miRNAs is far more dependent on the exact methodology of RNA isolation and profiling. This study highlights the challenges in detecting and quantifying less abundant plasma miRNAs in health and disease using RNA sequencing platforms. PMID:28060806

  1. Structural Relationships between Highly Conserved Elements and Genes in Vertebrate Genomes

    PubMed Central

    Sun, Hong; Skogerbø, Geir; Wang, Zhen; Liu, Wei; Li, Yixue

    2008-01-01

    Large numbers of sequence elements have been identified as highly conserved among vertebrate genomes. These highly conserved elements (HCEs) are often located in or around genes that are involved in transcription regulation and early development. They have been shown to be involved in cis-regulatory activities through both in vivo and additional computational studies. We have investigated the structural relationships between such elements and genes in six vertebrate genomes (human, mouse, rat, chicken, zebrafish and tetraodon) and detected several thousand cases of conserved HCE-gene associations, as well as cases of HCEs with no common target genes. A few examples underscore the potential significance of our findings for several individual genes. We found that conserved associations between HCEs and genes are not restricted by absolute distance on the genome. Notably, long-range associations were identified, and the molecular functions of the associated genes do not show any particular overrepresentation of the functional categories previously reported. HCEs in close proximity are found to be linked with different sets of genes. The results reflect the highly complex correlation between HCEs and their putative target genes. PMID:19008958

  2. A conserved intronic U1 snRNP-binding sequence promotes trans-splicing in Drosophila.

    PubMed

    Gao, Jun-Li; Fan, Yu-Jie; Wang, Xiu-Ye; Zhang, Yu; Pu, Jia; Li, Liang; Shao, Wei; Zhan, Shuai; Hao, Jianjiang; Xu, Yong-Zhen

    2015-04-01

    Unlike typical cis-splicing, trans-splicing joins exons from two separate transcripts to produce chimeric mRNA and has been detected in most eukaryotes. Trans-splicing in trypanosomes and nematodes has been characterized as a spliced leader RNA-facilitated reaction; in contrast, its mechanism in higher eukaryotes remains unclear. Here we investigate mod(mdg4), a classic trans-spliced gene in Drosophila, and report that two critical RNA sequences in the middle of the last 5' intron, TSA and TSB, promote trans-splicing of mod(mdg4). In TSA, a 13-nucleotide (nt) core motif is conserved across Drosophila species and is essential and sufficient for trans-splicing, which binds U1 small nuclear RNP (snRNP) through strong base-pairing with U1 snRNA. In TSB, a conserved secondary structure acts as an enhancer. Deletions of TSA and TSB using the CRISPR/Cas9 system result in developmental defects in flies. Although it is not clear how the 5' intron finds the 3' introns, compensatory changes in U1 snRNA rescue trans-splicing of TSA mutants, demonstrating that U1 recruitment is critical to promote trans-splicing in vivo. Furthermore, TSA core-like motifs are found in many other trans-spliced Drosophila genes, including lola. These findings represent a novel mechanism of trans-splicing, in which RNA motifs in the 5' intron are sufficient to bring separate transcripts into close proximity to promote trans-splicing.

  3. Optimal assembly for high throughput shotgun sequencing

    PubMed Central

    2013-01-01

    We present a framework for the design of optimal assembly algorithms for shotgun sequencing under the criterion of complete reconstruction. We derive a lower bound on the read length and the coverage depth required for reconstruction in terms of the repeat statistics of the genome. Building on earlier works, we design a de Bruijn graph-based assembly algorithm which can achieve very close to the lower bound for repeat statistics of a wide range of sequenced genomes, including the GAGE datasets. The results are based on a set of necessary and sufficient conditions on the DNA sequence and the reads for reconstruction. The conditions can be viewed as the shotgun sequencing analogue of Ukkonen-Pevzner's necessary and sufficient conditions for Sequencing by Hybridization. PMID:23902516
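
    For readers unfamiliar with de Bruijn graph assembly, the following minimal sketch shows the general idea only; it is a textbook-style illustration, not the algorithm of this paper. It assumes error-free reads, treats each distinct k-mer as a single edge between its (k-1)-mer prefix and suffix, and assumes the resulting graph has an Eulerian path, which is then walked with Hierholzer's method. All names and the toy reads are hypothetical.

```python
from collections import defaultdict

def de_bruijn_graph(reads, k):
    """Map each (k-1)-mer prefix to the (k-1)-mer suffixes of the distinct k-mers in the reads."""
    kmers = {read[i:i + k] for read in reads for i in range(len(read) - k + 1)}
    graph = defaultdict(list)
    for kmer in sorted(kmers):
        graph[kmer[:-1]].append(kmer[1:])
    return graph

def eulerian_path(graph):
    """Hierholzer's method; assumes an Eulerian path exists in the graph."""
    in_deg = defaultdict(int)
    for targets in graph.values():
        for target in targets:
            in_deg[target] += 1
    nodes = set(graph) | set(in_deg)
    # Start at the node with one more outgoing than incoming edge, if there is one.
    start = next(
        (n for n in sorted(nodes) if len(graph.get(n, [])) - in_deg.get(n, 0) == 1),
        min(nodes),
    )
    remaining = {n: list(targets) for n, targets in graph.items()}
    stack, path = [start], []
    while stack:
        node = stack[-1]
        if remaining.get(node):
            stack.append(remaining[node].pop())
        else:
            path.append(stack.pop())
    return path[::-1]

def assemble(reads, k):
    """Spell the sequence along the Eulerian path of the de Bruijn graph."""
    path = eulerian_path(de_bruijn_graph(reads, k))
    return path[0] + "".join(node[-1] for node in path[1:])

# Toy reads (hypothetical) covering a 10-base sequence with no repeated 3-mers.
toy_reads = ["ATGGCGT", "GGCGTGC", "CGTGCA"]
assembled = assemble(toy_reads, 4)
```

    When the genome contains repeats longer than k-1, several Eulerian paths can exist and the reconstruction is no longer unique, which is why the read length and coverage requirements depend on repeat statistics.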

  4. The Putative Leishmania Telomerase RNA (LeishTER) Undergoes Trans-Splicing and Contains a Conserved Template Sequence

    PubMed Central

    da Silva, Marcelo S.; Segatto, Marcela; Myler, Peter J.; Cano, Maria Isabel N.

    2014-01-01

    Telomerase RNAs (TERs) are highly divergent between species, varying in size and sequence composition. Here, we identify a candidate for the telomerase RNA component of Leishmania genus, which includes species that cause leishmaniasis, a neglected tropical disease. Merging a thorough computational screening combined with RNA-seq evidence, we mapped a non-coding RNA gene localized in a syntenic locus on chromosome 25 of five Leishmania species that shares partial synteny with both Trypanosoma brucei TER locus and a putative TER candidate-containing locus of Crithidia fasciculata. Using target-driven molecular biology approaches, we detected a ∼2,100 nt transcript (LeishTER) that contains a 5′ spliced leader (SL) cap, a putative 3′ polyA tail and a predicted C/D box snoRNA domain. LeishTER is expressed at similar levels in the logarithmic and stationary growth phases of promastigote forms. A 5′SL capped LeishTER co-immunoprecipitated and co-localized with the telomerase protein component (TERT) in a cell cycle-dependent manner. Prediction of its secondary structure strongly suggests the existence of a bona fide single-stranded template sequence and a conserved C[U/C]GUCA motif-containing helix II, representing the template boundary element. This study paves the way for further investigations on the biogenesis of parasite TERT ribonucleoproteins (RNPs) and its role in parasite telomere biology. PMID:25391020

  5. The putative Leishmania telomerase RNA (LeishTER) undergoes trans-splicing and contains a conserved template sequence.

    PubMed

    Vasconcelos, Elton J R; Nunes, Vinícius S; da Silva, Marcelo S; Segatto, Marcela; Myler, Peter J; Cano, Maria Isabel N

    2014-01-01

    Telomerase RNAs (TERs) are highly divergent between species, varying in size and sequence composition. Here, we identify a candidate for the telomerase RNA component of Leishmania genus, which includes species that cause leishmaniasis, a neglected tropical disease. Merging a thorough computational screening combined with RNA-seq evidence, we mapped a non-coding RNA gene localized in a syntenic locus on chromosome 25 of five Leishmania species that shares partial synteny with both Trypanosoma brucei TER locus and a putative TER candidate-containing locus of Crithidia fasciculata. Using target-driven molecular biology approaches, we detected a ∼2,100 nt transcript (LeishTER) that contains a 5' spliced leader (SL) cap, a putative 3' polyA tail and a predicted C/D box snoRNA domain. LeishTER is expressed at similar levels in the logarithmic and stationary growth phases of promastigote forms. A 5'SL capped LeishTER co-immunoprecipitated and co-localized with the telomerase protein component (TERT) in a cell cycle-dependent manner. Prediction of its secondary structure strongly suggests the existence of a bona fide single-stranded template sequence and a conserved C[U/C]GUCA motif-containing helix II, representing the template boundary element. This study paves the way for further investigations on the biogenesis of parasite TERT ribonucleoproteins (RNPs) and its role in parasite telomere biology.

  6. The first complete plastid genomes of Melastomataceae are highly structurally conserved

    PubMed Central

    Neubig, Kurt M.; Majure, Lucas C.

    2016-01-01

    Background: In the past three decades, several studies have predominantly relied on a small sample of the plastome to infer deep phylogenetic relationships in the species-rich Melastomataceae. Here, we report the first full plastid sequences of this family, compare general features of the sampled plastomes to other sequenced Myrtales, and survey the plastomes for highly informative regions for phylogenetics. Methods: Genome skimming was performed for 16 species spread across the Melastomataceae. Plastomes were assembled, annotated and compared to eight sequenced plastids in the Myrtales. Phylogenetic inference was performed using Maximum Likelihood on six different data sets, where putative biases were taken into account. Summary statistics were generated for all introns and intergenic spacers with suitable size for polymerase chain reaction (PCR) amplification and used to rank the markers by phylogenetic information. Results: The majority of the plastomes sampled are conserved in gene content and order, as well as in sequence length and GC content within plastid regions and sequence classes. Departures include the putative presence of rps16 and rpl2 pseudogenes in some plastomes. Phylogenetic analyses of the majority of the schemes analyzed resulted in the same topology with high values of bootstrap support. Although there is still uncertainty in some relationships, in the highest supported topologies only two nodes received bootstrap values lower than 95%. Discussion: Melastomataceae plastomes are no exception for the general patterns observed in the genomic structure of land plant chloroplasts, being highly conserved and structurally similar to most other Myrtales. Despite the fact that the full plastome phylogeny shares most of the clades with the previously widely used and reduced data set, some changes are still observed and bootstrap support is higher. The plastome data set presented here is a step towards phylogenomic analyses in the Melastomataceae and will be

  7. THE GRK4 SUBFAMILY OF G PROTEIN-COUPLED RECEPTOR KINASES: ALTERNATIVE SPLICING, GENE ORGANIZATION, AND SEQUENCE CONSERVATION

    EPA Science Inventory

    The GRK4 subfamily of G protein-coupled receptor kinases. Alternative splicing, gene organization, and sequence conservation.

    Premont RT, Macrae AD, Aparicio SA, Kendall HE, Welch JE, Lefkowitz RJ.

    Department of Medicine, Howard Hughes Medical Institute, Duke Univer...

  8. Computational Prediction of Phylogenetically Conserved Sequence Motifs for Five Different Candidate Genes in Type II Diabetic Nephropathy

    PubMed Central

    Sindhu, T; Rajamanikandan, S; Srinivasan, P

    2012-01-01

    Background: Computational identification of phylogenetic motifs helps in understanding known functional features, including catalytic sites, substrate binding epitopes, and protein-protein interfaces. Furthermore, they are strongly conserved among orthologs, indicating their evolutionary importance. The study aimed to analyze five candidate genes involved in type II diabetic nephropathy and to predict phylogenetic motifs from their corresponding orthologous protein sequences. Methods: AKR1B1, APOE, ENPP1, ELMO1 and IGFBP1 are genes that have been identified through experimental studies as important targets for type II diabetic nephropathy. Their corresponding protein sequences, structures, and orthologous sequences were retrieved from the UniprotKB, PDB, and PHOG databases, respectively. Multiple sequence alignments were constructed using ClustalW and phylogenetic motifs were identified using MINER. The occurrence of amino acids in the obtained phylogenetic motifs was generated using WebLogo and false positive expectations were calculated against phylogenetic similarity. Results: In total, 17 phylogenetic motifs (PMs) were identified from the five proteins; residues such as glycine, leucine, tryptophan, and aspartic acid were found at appreciable frequency, whereas arginine was identified in all the predicted PMs. The results imply that these residues can be important to the functional and structural roles of the proteins, and the calculated false positive expectations imply that they were generally conserved in the traditional sense. Conclusion: The prediction of phylogenetic motifs is an accurate method for detecting functionally important conserved residues. The conserved motifs can be used as potential drug targets for type II diabetic nephropathy. PMID:23113206

  9. Conserved noncoding genomic sequences associated with a flowering-time quantitative trait locus in maize

    PubMed Central

    Salvi, Silvio; Sponza, Giorgio; Morgante, Michele; Tomes, Dwight; Niu, Xiaomu; Fengler, Kevin A.; Meeley, Robert; Ananiev, Evgueni V.; Svitashev, Sergei; Bruggemann, Edward; Li, Bailin; Hainey, Christine F.; Radovic, Slobodanka; Zaina, Giusi; Rafalski, J.-Antoni; Tingey, Scott V.; Miao, Guo-Hua; Phillips, Ronald L.; Tuberosa, Roberto

    2007-01-01

    Flowering time is a fundamental trait of maize adaptation to different agricultural environments. Although a large body of information is available on the map position of quantitative trait loci for flowering time, little is known about the molecular basis of quantitative trait loci. Through positional cloning and association mapping, we resolved the major flowering-time quantitative trait locus, Vegetative to generative transition 1 (Vgt1), to an ≈2-kb noncoding region positioned 70 kb upstream of an Ap2-like transcription factor that we have shown to be involved in flowering-time control. Vgt1 functions as a cis-acting regulatory element as indicated by the correlation of the Vgt1 alleles with the transcript expression levels of the downstream gene. Additionally, within Vgt1, we identified evolutionarily conserved noncoding sequences across the maize–sorghum–rice lineages. Our results support the notion that changes in distant cis-acting regulatory regions are a key component of plant genetic adaptation throughout breeding and evolution. PMID:17595297

  10. Nucleotide sequence of a cluster of early and late genes in a conserved segment of the vaccinia virus genome.

    PubMed Central

    Plucienniczak, A; Schroeder, E; Zettlmeissl, G; Streeck, R E

    1985-01-01

    The nucleotide sequence of a 7.6 kb vaccinia DNA segment from a genomic region conserved among different orthopoxviruses has been determined. This segment contains a tight cluster of 12 partly overlapping open reading frames, most of which can be correlated with previously identified early and late proteins and mRNAs. Regulatory signals used by vaccinia virus have been studied. Presumptive promoter regions are rich in A and T and carry the consensus sequences TATA and AATAA spaced at 20-24 base pairs. Tandem repeats of a CTATTC consensus sequence are proposed to be involved in the termination of early transcription. PMID:2987815

  11. Advances in high throughput DNA sequence data compression.

    PubMed

    Sardaraz, Muhammad; Tahir, Muhammad; Ikram, Ataul Aziz

    2016-06-01

    Advances in high throughput sequencing technologies and reduction in cost of sequencing have led to exponential growth in high throughput DNA sequence data. This growth has posed challenges such as storage, retrieval, and transmission of sequencing data. Data compression is used to cope with these challenges. Various methods have been developed to compress genomic and sequencing data. In this article, we present a comprehensive review of compression methods for genome and reads compression. Algorithms are categorized as referential or reference free. Experimental results and comparative analysis of various methods for data compression are presented. Finally, key challenges and research directions in DNA sequence data compression are highlighted.
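
    The distinction between reference-free and referential compression drawn in this abstract can be illustrated with two deliberately simple sketches. The first packs an A/C/G/T sequence into 2 bits per base without any reference; the second stores only the substitutions relative to a reference of equal length. Neither corresponds to a specific tool covered by the review; the function names are hypothetical, and insertions, deletions, ambiguous bases and quality scores are ignored.

```python
BASES = "ACGT"

def pack_2bit(seq):
    """Reference-free: encode an A/C/G/T string as (length, bytes) using 2 bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | BASES.index(base)
    return len(seq), value.to_bytes((2 * len(seq) + 7) // 8, "big")

def unpack_2bit(length, data):
    """Inverse of pack_2bit."""
    value = int.from_bytes(data, "big")
    return "".join(BASES[(value >> (2 * (length - 1 - i))) & 3] for i in range(length))

def diff_against_reference(reference, seq):
    """Referential: keep only (position, base) pairs where seq differs from the reference."""
    if len(reference) != len(seq):
        raise ValueError("this sketch handles substitutions only")
    return [(i, b) for i, (a, b) in enumerate(zip(reference, seq)) if a != b]

def apply_diff(reference, diff):
    """Rebuild the sequence from the reference and the stored substitutions."""
    bases = list(reference)
    for position, base in diff:
        bases[position] = base
    return "".join(bases)

# Toy data (hypothetical).
packed = pack_2bit("ACGT")
diff = diff_against_reference("ACGTACGT", "ACGTTCGT")
```

    The referential form is compact only when the sequence is close to the reference, which is the trade-off between the two categories of algorithm.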

  12. CDvist: A webserver for identification and visualization of conserved domains in protein sequences

    SciTech Connect

    Adebali, Ogun; Ortega, Davi R.; Zhulin, Igor B.

    2014-12-18

    Identification of domains in protein sequences allows their assigning to biological functions. Several webservers exist for identification of protein domains using similarity searches against various databases of protein domain models. However, none of them provides comprehensive domain coverage while allowing bulk querying and their visualization schemes can be improved. To address these issues, we developed CDvist (a comprehensive domain visualization tool), which combines the best available search algorithms and databases into a user-friendly framework. First, a given protein sequence is matched to domain models using high-specificity tools and only then unmatched segments are subjected to more sensitive algorithms resulting in a best possible comprehensive coverage. In conclusion, bulk querying and rich visualization and download options provide improved functionality to domain architecture analysis.

  13. CDvist: A webserver for identification and visualization of conserved domains in protein sequences

    DOE PAGES

    Adebali, Ogun; Ortega, Davi R.; Zhulin, Igor B.

    2014-12-18

    Identification of domains in protein sequences allows their assigning to biological functions. Several webservers exist for identification of protein domains using similarity searches against various databases of protein domain models. However, none of them provides comprehensive domain coverage while allowing bulk querying and their visualization schemes can be improved. To address these issues, we developed CDvist (a comprehensive domain visualization tool), which combines the best available search algorithms and databases into a user-friendly framework. First, a given protein sequence is matched to domain models using high-specificity tools and only then unmatched segments are subjected to more sensitive algorithms resulting in a best possible comprehensive coverage. In conclusion, bulk querying and rich visualization and download options provide improved functionality to domain architecture analysis.

  14. QColors: an algorithm for conservative viral quasispecies reconstruction from short and non-contiguous next generation sequencing reads.

    PubMed

    Huang, Austin; Kantor, Rami; DeLong, Allison; Schreier, Leeann; Istrail, Sorin

    Next generation sequencing technologies have recently been applied to characterize mutational spectra of the heterogeneous population of viral genotypes (known as a quasispecies) within HIV-infected patients. Such information is clinically relevant because minority genetic subpopulations of HIV within patients enable viral escape from selection pressures such as the immune response and antiretroviral therapy. However, methods for quasispecies sequence reconstruction from next generation sequencing reads are not yet widely used, and this remains an emerging area of research. Furthermore, the majority of research methodology in HIV has focused on 454 sequencing, while many next-generation sequencing platforms used in practice are limited to shorter read lengths relative to 454 sequencing. Little work has been done in determining how best to address the read length limitations of other platforms. The approach described here incorporates graph representations of both read differences and read overlap to conservatively determine the regions of the sequence with sufficient variability to separate quasispecies sequences. Within these tractable regions of quasispecies inference, we use constraint programming to solve for an optimal quasispecies subsequence determination via vertex coloring of the conflict graph, a representation which also lends itself to data with non-contiguous reads such as paired-end sequencing. We demonstrate the utility of the method by applying it to simulations based on actual intra-patient clonal HIV-1 sequencing data.
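
    The idea of coloring a conflict graph can be illustrated with a small sketch. Two assumptions are made that are not in the abstract: reads are modeled as (start, bases) pairs, and a simple greedy coloring stands in for the constraint-programming solver the authors describe, so the result is a valid but not necessarily optimal coloring. All function names are hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Set, Tuple

Read = Tuple[int, str]  # (start position on the reference, bases)


def conflicts(a: Read, b: Read) -> bool:
    """Two reads conflict if they overlap and disagree at an overlapping position."""
    (sa, qa), (sb, qb) = a, b
    lo = max(sa, sb)
    hi = min(sa + len(qa), sb + len(qb))
    return any(qa[i - sa] != qb[i - sb] for i in range(lo, hi))


def build_conflict_graph(reads: List[Read]) -> Dict[int, Set[int]]:
    """Vertices are read indices; an edge joins every pair of conflicting reads."""
    graph: Dict[int, Set[int]] = {i: set() for i in range(len(reads))}
    for i, j in combinations(range(len(reads)), 2):
        if conflicts(reads[i], reads[j]):
            graph[i].add(j)
            graph[j].add(i)
    return graph


def greedy_coloring(graph: Dict[int, Set[int]]) -> Dict[int, int]:
    """Assign each vertex the smallest color unused by its colored neighbors.

    Greedy stand-in for an exact solver: conflicting reads always get
    different colors, but the number of colors may exceed the minimum.
    """
    colors: Dict[int, int] = {}
    for v in sorted(graph, key=lambda v: -len(graph[v])):
        used = {colors[n] for n in graph[v] if n in colors}
        colors[v] = next(c for c in range(len(graph)) if c not in used)
    return colors


reads = [(0, "ACGT"), (2, "GTAA"), (2, "GAAA"), (0, "ACGA")]
graph = build_conflict_graph(reads)
colors = greedy_coloring(graph)
```

    Each color class is a set of mutually compatible reads, so it can be read as one candidate quasispecies sequence. Reads that do not overlap never conflict, which is why the same representation extends to non-contiguous reads.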

  15. Cdc14: a highly conserved family of phosphatases with non-conserved functions?

    PubMed

    Mocciaro, Annamaria; Schiebel, Elmar

    2010-09-01

    CDC14 was originally identified by L. Hartwell in his famous screen for genes that regulate the budding yeast cell cycle. Subsequent work showed that Cdc14 belongs to a family of highly conserved dual-specificity phosphatases that are present in a wide range of organisms from yeast to human. Human CDC14B is even able to fulfill the essential functions of budding yeast Cdc14. In budding yeast, Cdc14 counteracts the activity of cyclin dependent kinase (Cdk1) at the end of mitosis and thus has important roles in the regulation of anaphase, mitotic exit and cytokinesis. On the basis of the functional conservation of other cell-cycle genes it seemed obvious to assume that Cdc14 phosphatases also have roles in late mitosis in mammalian cells and regulate similar targets to those found in yeast. However, analysis of the human Cdc14 proteins (CDC14A, CDC14B and CDC14C) by overexpression or by depletion using small interfering RNA (siRNA) has suggested functions that are quite different from those of ScCdc14. Recent studies in avian and human somatic cell lines in which the gene encoding either Cdc14A or Cdc14B had been deleted, have shown - surprisingly - that neither of the two phosphatases on its own is essential for viability, cell-cycle progression and checkpoint control. In this Commentary, we critically review the available data on the functions of yeast and vertebrate Cdc14 phosphatases, and discuss whether they indeed share common functions as generally assumed.

  16. Sequences of conserved region in the A subunit of DNA gyrase from nine species of the genus Mycobacterium: phylogenetic analysis and implication for intrinsic susceptibility to quinolones.

    PubMed

    Guillemin, I; Cambau, E; Jarlier, V

    1995-09-01

    The sequences of a conserved region in the A subunit of DNA gyrase corresponding to the quinolone resistance-determining region were determined for nine mycobacterial species and were compared. Although the nucleotide sequences were highly conserved, they clearly differentiated one species from another. The results of the phylogenetic analysis based on the sequences of the quinolone resistance-determining regions were compared with those provided by the 16S rRNA sequences. Deduced amino acid sequences were identical within the nine species except for amino acid 83, which was frequently involved in acquired resistance to quinolones in many genera, including mycobacteria. The presence at position 83 of an alanine for seven mycobacterial species (M. tuberculosis, M. bovis BCG, M. leprae, M. avium, M. kansasii, M. chelonae, and M. smegmatis) and of a serine for the two remaining mycobacterial species (M. fortuitum and M. aurum) correlated well with the MICs of ofloxacin for both groups of species, suggesting the role of this residue in intrinsic susceptibility to quinolones in mycobacteria.

  17. Taking high conservation value from forests to freshwaters.

    PubMed

    Abell, Robin; Morgan, Siân K; Morgan, Alexis J

    2015-07-01

    The high conservation value (HCV) concept, originally developed by the Forest Stewardship Council, has been widely incorporated outside the forestry sector into companies' supply chain assessments and responsible purchasing policies, financial institutions' investment policies, and numerous voluntary commodity standards. Many, if not most, of these newer applications relate to production practices that are likely to affect freshwater systems directly or indirectly, yet there is little guidance as to whether or how HCV can be applied to water bodies. We focus this paper on commodity standards and begin by exploring how prominent standards currently address both HCVs and freshwaters. We then highlight freshwater features of high conservation importance and examine how well those features are captured by the existing HCV framework. We propose a new set of freshwater 'elements' for each of the six values and suggest an approach for identifying HCV Areas that takes out-of-fence line impacts into account, thereby spatially extending the scope of existing methods to define HCVs. We argue that virtually any non-marine HCV assessment, regardless of the production sector, should be expanded to include freshwater values, and we suggest how to put those recommendations into practice.

  18. Taking High Conservation Value from Forests to Freshwaters

    NASA Astrophysics Data System (ADS)

    Abell, Robin; Morgan, Siân K.; Morgan, Alexis J.

    2015-07-01

    The high conservation value (HCV) concept, originally developed by the Forest Stewardship Council, has been widely incorporated outside the forestry sector into companies' supply chain assessments and responsible purchasing policies, financial institutions' investment policies, and numerous voluntary commodity standards. Many, if not most, of these newer applications relate to production practices that are likely to affect freshwater systems directly or indirectly, yet there is little guidance as to whether or how HCV can be applied to water bodies. We focus this paper on commodity standards and begin by exploring how prominent standards currently address both HCVs and freshwaters. We then highlight freshwater features of high conservation importance and examine how well those features are captured by the existing HCV framework. We propose a new set of freshwater 'elements' for each of the six values and suggest an approach for identifying HCV Areas that takes out-of-fence line impacts into account, thereby spatially extending the scope of existing methods to define HCVs. We argue that virtually any non-marine HCV assessment, regardless of the production sector, should be expanded to include freshwater values, and we suggest how to put those recommendations into practice.

  19. Antibody Recognition of a Highly Conserved Influenza Virus Epitope

    SciTech Connect

    Ekiert, Damian C.; Bhabha, Gira; Elsliger, Marc-André; Friesen, Robert H.E.; Jongeneelen, Mandy; Throsby, Mark; Goudsmit, Jaap; Wilson, Ian A.; Scripps; Crucell

    2009-05-21

    Influenza virus presents an important and persistent threat to public health worldwide, and current vaccines provide immunity to viral isolates similar to the vaccine strain. High-affinity antibodies against a conserved epitope could provide immunity to the diverse influenza subtypes and protection against future pandemic viruses. Cocrystal structures were determined at 2.2 and 2.7 angstrom resolutions for broadly neutralizing human antibody CR6261 Fab in complexes with the major surface antigen (hemagglutinin, HA) from viruses responsible for the 1918 H1N1 influenza pandemic and a recent lethal case of H5N1 avian influenza. In contrast to other structurally characterized influenza antibodies, CR6261 recognizes a highly conserved helical region in the membrane-proximal stem of HA1 and HA2. The antibody neutralizes the virus by blocking conformational rearrangements associated with membrane fusion. The CR6261 epitope identified here should accelerate the design and implementation of improved vaccines that can elicit CR6261-like antibodies, as well as antibody-based therapies for the treatment of influenza.

  20. High performance computing with a conservative spectral Boltzmann solver

    NASA Astrophysics Data System (ADS)

    Haack, Jeffrey R.; Gamba, Irene M.

    2012-11-01

    We present new results building on the conservative deterministic spectral method for the space inhomogeneous Boltzmann equation developed by Gamba and Tharkabhushanam. This approach is a two-step process that acts on the weak form of the Boltzmann equation, and uses the machinery of the Fourier transform to reformulate the collisional integral into a weighted convolution in Fourier space. A constrained optimization problem is solved to preserve the mass, momentum, and energy of the resulting distribution. We extend this method to second order accuracy in space and time, and explore how to leverage the structure of the collisional formulation for high performance computing environments. The locality in space of the collisional term provides a straightforward memory decomposition, and we perform some initial scaling tests on high performance computing resources. We also use the improved computational power of this method to investigate a boundary-layer generated shock problem that cannot be described by classical hydrodynamics.
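
    The abstract does not spell out the constrained optimization problem. One common way to pose such a conservation correction, assumed here for illustration, is to take the computed collision term Q on N velocity nodes and find the closest vector Q_c whose discrete mass, momentum, and energy moments vanish. The symbols Q, Q_c, C, and lambda are notation introduced for this sketch.

```latex
% Q   : computed collision term on N velocity nodes (not conservative in general)
% C   : (d+2) x N matrix whose rows apply quadrature weights to 1, v_1..v_d, |v|^2
% Q_c : corrected (conservative) collision term
\begin{aligned}
Q_c &= \arg\min_{X}\ \tfrac{1}{2}\,\lVert X - Q \rVert_2^2
      \quad \text{subject to } C X = 0, \\
0 &= Q_c - Q + C^{\mathsf T}\lambda
      \ \Longrightarrow\ \lambda = \bigl(C C^{\mathsf T}\bigr)^{-1} C\, Q, \\
Q_c &= \Bigl(I - C^{\mathsf T}\bigl(C C^{\mathsf T}\bigr)^{-1} C\Bigr)\, Q .
\end{aligned}
```

    Under this formulation the correction is an orthogonal projection onto the null space of C, so it changes Q as little as possible in the discrete L2 sense while forcing the collision step to leave mass, momentum, and energy unchanged.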

  1. Deep small RNA sequencing from the nematode Ascaris reveals conservation, functional diversification, and novel developmental profiles.

    PubMed

    Wang, Jianbin; Czech, Benjamin; Crunk, Amanda; Wallace, Adam; Mitreva, Makedonka; Hannon, Gregory J; Davis, Richard E

    2011-09-01

    Eukaryotic cells express several classes of small RNAs that regulate gene expression and ensure genome maintenance. Endogenous siRNAs (endo-siRNAs) and Piwi-interacting RNAs (piRNAs) mainly control gene and transposon expression in the germline, while microRNAs (miRNAs) generally function in post-transcriptional gene silencing in both somatic and germline cells. To provide an evolutionary and developmental perspective on small RNA pathways in nematodes, we identified and characterized known and novel small RNA classes through gametogenesis and embryo development in the parasitic nematode Ascaris suum and compared them with known small RNAs of Caenorhabditis elegans. piRNAs, Piwi-clade Argonautes, and other proteins associated with the piRNA pathway have been lost in Ascaris. miRNAs are synthesized immediately after fertilization in utero, before pronuclear fusion, and before the first cleavage of the zygote. This is the earliest expression of small RNAs ever described at a developmental stage long thought to be transcriptionally quiescent. A comparison of the two classes of Ascaris endo-siRNAs, 22G-RNAs and 26G-RNAs, to those in C. elegans, suggests great diversification and plasticity in the use of small RNA pathways during spermatogenesis in different nematodes. Our data reveal conserved characteristics of nematode small RNAs as well as features unique to Ascaris that illustrate significant flexibility in the use of small RNA pathways, some of which are likely an adaptation to Ascaris' life cycle and parasitism. The transcriptome assembly has been submitted to NCBI Transcriptome Shotgun Assembly Sequence Database (http://www.ncbi.nlm.nih.gov/genbank/TSA.html) under accession numbers JI163767–JI182837 and JI210738–JI257410.

  2. A conserved 11 nucleotide sequence contains an essential promoter element of the maize mitochondrial atp1 gene.

    PubMed Central

    Rapp, W D; Stern, D B

    1992-01-01

    To determine the structure of a functional plant mitochondrial promoter, we have partially purified an RNA polymerase activity that correctly initiates transcription at the maize mitochondrial atp1 promoter in vitro. Using a series of 5' deletion constructs, we found that essential sequences are located within -19 nucleotides (nt) of the transcription initiation site. The region surrounding the initiation site includes conserved sequence motifs previously proposed to be maize mitochondrial promoter elements. Deletion of a conserved 11 nt sequence showed that it is critical for promoter function, but deletion or alteration of conserved upstream G(A/T)3-4 repeats had no effect. When the atp1 11 nt sequence was inserted into different plasmids lacking mitochondrial promoter activity, transcription was only observed for one of these constructs. We infer from these data that the functional promoter extends beyond this motif, most likely in the 5' direction. The maize mitochondrial cox3 and atp6 promoters also direct transcription initiation in this in vitro system, suggesting that it may be widely applicable for studies of mitochondrial transcription in this species. PMID:1372246

  3. Generating barcoded libraries for multiplex high-throughput sequencing.

    PubMed

    Knapp, Michael; Stiller, Mathias; Meyer, Matthias

    2012-01-01

    Molecular barcoding is an essential tool to use the high throughput of next generation sequencing platforms optimally in studies involving more than one sample. Various barcoding strategies allow for the incorporation of short recognition sequences (barcodes) into sequencing libraries, either by ligation or polymerase chain reaction (PCR). Here, we present two approaches optimized for generating barcoded sequencing libraries from low copy number extracts and amplification products typical of ancient DNA studies.
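
    On the analysis side, barcoded libraries are typically sorted back into samples by reading the barcode at the start of each read. The following minimal sketch shows that step only; it is not the protocol from this paper. The barcode sequences, sample names, and the demultiplex function are hypothetical, and the mismatch tolerance is an assumption.

```python
from typing import Dict, List


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def demultiplex(reads: List[str], barcodes: Dict[str, str],
                max_mismatches: int = 0) -> Dict[str, List[str]]:
    """Assign each read to a sample by its leading barcode and trim the barcode.

    A read is assigned only if exactly one barcode matches within the allowed
    number of mismatches; ambiguous or unmatched reads go to "unassigned".
    """
    bins: Dict[str, List[str]] = {sample: [] for sample in barcodes.values()}
    bins["unassigned"] = []
    for read in reads:
        matches = [
            (bc, sample) for bc, sample in barcodes.items()
            if len(read) >= len(bc)
            and hamming(read[:len(bc)], bc) <= max_mismatches
        ]
        if len(matches) == 1:
            bc, sample = matches[0]
            bins[sample].append(read[len(bc):])
        else:
            bins["unassigned"].append(read)
    return bins


# Hypothetical 4-nt barcodes and reads, for illustration only.
barcodes = {"ACGT": "sample1", "TGCA": "sample2"}
reads = ["ACGTGGGG", "TGCATTTT", "ACGAGGCC", "AAAACCCC"]
strict = demultiplex(reads, barcodes)
tolerant = demultiplex(reads, barcodes, max_mismatches=1)
```

    Tolerating mismatches is only safe when the barcodes differ from each other by more than twice the tolerance, which is why the sketch rejects reads that match more than one barcode.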

  4. High-throughput sequencing and vaccine design.

    PubMed

    Luciani, F

    2016-04-01

    Next-generation sequencing (NGS) technologies have reshaped genome research. The resulting increase in sequencing depth and resolution has led to an unprecedented level of genomic detail and thus an increasing awareness of the complexity of animal, human and pathogen genomes. This has resulted in new approaches to vaccine research. On the one hand, the increase in genome complexity challenges our ability to study and understand pathogen biology and pathogen-host interactions. On the other hand, the increase in genomic data also provides key information for developing and designing improved vaccines against pathogens that were previously extremely difficult to deal with, such as rapidly mutating RNA viruses or bacteria that have complex interactions with the host immune system. This review describes how the broad application of NGS technologies to genome research is affecting vaccine research. It focuses on implications for the field of viral genomics, and includes recent animal and human studies.

  5. Dominant sequences of human major histocompatibility complex conserved extended haplotypes from HLA-DQA2 to DAXX.

    PubMed

    Larsen, Charles E; Alford, Dennis R; Trautwein, Michael R; Jalloh, Yanoh K; Tarnacki, Jennifer L; Kunnenkeri, Sushruta K; Fici, Dolores A; Yunis, Edmond J; Awdeh, Zuheir L; Alper, Chester A

    2014-10-01

    We resequenced and phased 27 kb of DNA within 580 kb of the MHC class II region in 158 population chromosomes, most of which were conserved extended haplotypes (CEHs) of European descent or contained their centromeric fragments. We determined the single nucleotide polymorphism and deletion-insertion polymorphism alleles of the dominant sequences from HLA-DQA2 to DAXX for these CEHs. Nine of 13 CEHs remained sufficiently intact to possess a dominant sequence extending at least to DAXX, 230 kb centromeric to HLA-DPB1. We identified the regions centromeric to HLA-DQB1 within which single instances of eight "common" European MHC haplotypes previously sequenced by the MHC Haplotype Project (MHP) were representative of those dominant CEH sequences. Only two MHP haplotypes had a dominant CEH sequence throughout the centromeric and extended class II region and one MHP haplotype did not represent a known European CEH anywhere in the region. We identified the centromeric recombination transition points of other MHP sequences from CEH representation to non-representation. Several CEH pairs or groups shared sequence identity in small blocks but had significantly different (although still conserved for each separate CEH) sequences in surrounding regions. These patterns partly explain strong calculated linkage disequilibrium over only short (tens to hundreds of kilobases) distances in the context of a finite number of observed megabase-length CEHs comprising half a population's haplotypes. Our results provide a clearer picture of European CEH class II allelic structure and population haplotype architecture, improved regional CEH markers, and raise questions concerning regional recombination hotspots.

  6. Identification of conserved genomic regions and variation therein amongst Cetartiodactyla species using next generation sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background Next Generation Sequencing has created an opportunity to genetically characterize an individual both inexpensively and comprehensively. In earlier work produced in our collaboration [1], it was demonstrated that, for animals without a reference genome, their Next Generation Sequence data ...

  7. Sequencing strategy for the whole mitochondrial genome resulting in high quality sequences

    PubMed Central

    Fendt, Liane; Zimmermann, Bettina; Daniaux, Martin; Parson, Walther

    2009-01-01

    Background: It has been demonstrated that a reliable and fail-safe sequencing strategy is mandatory for high-quality analysis of mitochondrial (mt) DNA, as the sequencing and base-calling process is prone to error. Here, we present a high-quality, reliable and easy-to-handle manual procedure for the sequencing of full mt genomes that is also appropriate for laboratories where fully automated processes are not available. Results: We amplified whole mitochondrial genomes as two overlapping PCR fragments, each about 8500 bases in length. We developed a set of 96 primers that can be applied to a (manual) 96 well-based technology, which resulted in at least double strand sequence coverage of the entire coding region (codR). Conclusion: This elaborated sequencing strategy is straightforward and allows for an unambiguous sequence analysis and interpretation including sometimes challenging phenomena such as point and length heteroplasmy that are relevant for the investigation of forensic and clinical samples. PMID:19331681

  8. Genome-wide discovery and differential regulation of conserved and novel microRNAs in chickpea via deep sequencing.

    PubMed

    Jain, Mukesh; Chevala, V V S Narayana; Garg, Rohini

    2014-11-01

    MicroRNAs (miRNAs) are essential components of complex gene regulatory networks that orchestrate plant development. Although several genomic resources have been developed for the legume crop chickpea, miRNAs have not been discovered until now. For genome-wide discovery of miRNAs in chickpea (Cicer arietinum), we sequenced the small RNA content from seven major tissues/organs employing Illumina technology. About 154 million reads were generated, which represented more than 20 million distinct small RNA sequences. We identified a total of 440 conserved miRNAs in chickpea based on sequence similarity with known miRNAs in other plants. In addition, 178 novel miRNAs were identified using a miRDeep pipeline with plant-specific scoring. Some of the conserved and novel miRNAs with significant sequence similarity were grouped into families. The chickpea miRNAs targeted a wide range of mRNAs involved in diverse cellular processes, including transcriptional regulation (transcription factors), protein modification and turnover, signal transduction, and metabolism. Our analysis revealed several miRNAs with differential spatial expression. Many of the chickpea miRNAs were expressed in a tissue-specific manner. The conserved and differential expression of members of the same miRNA family in different tissues was also observed. Some of the same family members were predicted to target different chickpea mRNAs, which suggested the specificity and complexity of miRNA-mediated developmental regulation. This study, for the first time, reveals a comprehensive set of conserved and novel miRNAs along with their expression patterns and putative targets in chickpea, and provides a framework for understanding regulation of developmental processes in legumes.

  9. CodaChrome: a tool for the visualization of proteome conservation across all fully sequenced bacterial genomes

    PubMed Central

    2014-01-01

    Background: The relationships between bacterial genomes are complicated by rampant horizontal gene transfer, varied selection pressures, acquisition of new genes, loss of genes, and divergence of genes, even in closely related lineages. As more and more bacterial genomes are sequenced, organizing and interpreting the incredible amount of relational information that connects them becomes increasingly difficult. Results: We have developed CodaChrome (http://www.sourceforge.com/p/codachrome), a one-versus-all proteome comparison tool that allows the user to visually investigate the relationship between a bacterial proteome of interest and the proteomes encoded by every other bacterial genome recorded in GenBank in a massive interactive heat map. This tool has allowed us to rapidly identify the most highly conserved proteins encoded in the bacterial pan-genome, fast-clock genes useful for subtyping of bacterial species, the evolutionary history of an indel in the Sphingobium lineage, and an example of horizontal gene transfer from a member of the genus Enterococcus to a recent ancestor of Helicobacter pylori. Conclusion: CodaChrome is a user-friendly and powerful tool for simultaneously visualizing relationships between thousands of proteomes. PMID:24460813

  10. cDNA sequence, genomic organization, and evolutionary conservation of a novel gene from the WAGR region

    SciTech Connect

    Schwartz, F.; Eisenman, R.; Knoll, J.; Bruns, G.

    1995-09-20

    A new gene (239FB) with predominant and differential expression in fetal brain has recently been isolated from a chromosome 11p13-p14 boundary area near FSHB. The corresponding mRNA has an open reading frame of 294 amino acids, a 3' untranslated region of 1247 nucleotides, and a highly GC-rich 5' untranslated region. The coding and 3' UT sequence is specified by 6 exons within nearly 87 kb of isolated genomic locus. The 5' end region of the transcript maps adjacent to the only genomically defined CpG island in a chromosomal subregion that may be associated with part of the mental retardation of some WAGR (Wilms tumor, aniridia, genitourinary anomalies, and mental retardation) syndrome patients. In addition to nucleotide and amino acid similarity to an EST from a normalized infant brain cDNA library, the predicted protein has extensive similarity to Caenorhabditis elegans polypeptides of, as yet, unknown function. The 239FB locus is, therefore, likely part of a family of genes with two members expressed in human brain. The extensive conservation of the predicted protein suggests a fundamental function of the gene product and will enable evaluation of the role of the 239FB gene in neurogenesis in model organisms. 48 refs., 4 figs., 1 tab.

  11. Conserved sequence-specific lincRNA-steroid receptor interactions drive transcriptional repression and direct cell fate

    SciTech Connect

    Hudson, William H.; Pickard, Mark R.; de Vera, Ian Mitchelle S.; Kuiper, Emily G.; Mourtada-Maarabouni, Mirna; Conn, Graeme L.; Kojetin, Douglas J.; Williams, Gwyn T.; Ortlund, Eric A.

    2014-12-23

    The majority of the eukaryotic genome is transcribed, generating a significant number of long intergenic noncoding RNAs (lincRNAs). Although lincRNAs represent the most poorly understood product of transcription, recent work has shown lincRNAs fulfill important cellular functions. In addition to low sequence conservation, poor understanding of structural mechanisms driving lincRNA biology hinders systematic prediction of their function. Here we report the molecular requirements for the recognition of steroid receptors (SRs) by the lincRNA growth arrest-specific 5 (Gas5), which regulates steroid-mediated transcriptional regulation, growth arrest and apoptosis. We identify the functional Gas5-SR interface and generate point mutations that ablate the SR-Gas5 lincRNA interaction, altering Gas5-driven apoptosis in cancer cell lines. Further, we find that the Gas5 SR-recognition sequence is conserved among haplorhines, with its evolutionary origin as a splice acceptor site. This study demonstrates that lincRNAs can recognize protein targets in a conserved, sequence-specific manner in order to affect critical cell functions.

  12. Highly conserved small subunit residues influence rubisco large subunit catalysis.

    PubMed

    Genkov, Todor; Spreitzer, Robert J

    2009-10-30

    The chloroplast enzyme ribulose 1,5-bisphosphate carboxylase/oxygenase (Rubisco) catalyzes the rate-limiting step of photosynthetic CO(2) fixation. With a deeper understanding of its structure-function relationships and competitive inhibition by O(2), it may be possible to engineer an increase in agricultural productivity and renewable energy. The chloroplast-encoded large subunits form the active site, but the nuclear-encoded small subunits can also influence catalytic efficiency and CO(2)/O(2) specificity. To further define the role of the small subunit in Rubisco function, the 10 most conserved residues in all small subunits were substituted with alanine by transformation of a Chlamydomonas reinhardtii mutant that lacks the small subunit gene family. All the mutant strains were able to grow photosynthetically, indicating that none of the residues is essential for function. Three of the substitutions have little or no effect (S16A, P19A, and E92A), one primarily affects holoenzyme stability (L18A), and the remainder affect catalysis with or without some level of associated structural instability (Y32A, E43A, W73A, L78A, P79A, and F81A). Y32A and E43A cause decreases in CO(2)/O(2) specificity. Based on the x-ray crystal structure of Chlamydomonas Rubisco, all but one (Glu-92) of the conserved residues are in contact with large subunits and cluster near the amino- or carboxyl-terminal ends of large subunit alpha-helix 8, which is a structural element of the alpha/beta-barrel active site. Small subunit residues Glu-43 and Trp-73 identify a possible structural connection between active site alpha-helix 8 and the highly variable small subunit loop between beta-strands A and B, which can also influence Rubisco CO(2)/O(2) specificity.

  13. MLST analysis reveals a highly conserved core genome among poultry isolates of Clostridium septicum.

    PubMed

    Neumann, Anthony P; Rehberger, Thomas G

    2009-06-01

    Clostridium septicum is a highly virulent, anaerobic bacterium capable of establishing necrotizing tissue infections and forming heat resistant endospores. Disease is primarily facilitated by secretion of numerous toxic products including a lethal pore-forming cytolysin. Spontaneously occurring clostridial myonecrosis involving C. septicum has recently reemerged as a concern for many poultry producers. However, despite its increasing prevalence, the epidemiology of infection and population structure of C. septicum remains largely unknown. In this study a multilocus sequence typing (MLST) approach was utilized to examine evolutionary relationships within a diverse collection of C. septicum isolates recovered from poultry flocks experiencing episodes of gangrenous dermatitis. The 109 isolates examined represented 42 turkey flocks and 24 different flocks of broiler chickens as well as C. septicum type strain, ATCC 12464. Isolates were recovered predominantly from gangrenous lesions although isolates from livers, gastrointestinal tracts, spleens and blood were included. The loci analyzed were csa, the major lethal toxin produced by C. septicum, and the housekeeping genes gyrA, groEL, dnaK, recA, tpi, ddl, colA and glpK. These loci were included in part because of their previous use in MLST analysis of Clostridium perfringens and Clostridium difficile. Results indicated a high level of conservation present within these housekeeping gene fragments when compared to what has been previously reported for the aforementioned clostridia. Of the 5352 bp of sequence data examined for each isolate, 99.7% (5335/5352) was absolutely conserved among the 109 isolates. Only one of the ten unique sequence types, or allelic profiles, identified among the isolates was recovered from both turkeys and broiler chickens suggesting some host species preference. Phylogenetic analyses identified two unique clusters, or clonal complexes, among these poultry isolates which may have important

  14. Sequence of a cDNA encoding nitrite reductase from the tree Betula pendula and identification of conserved protein regions.

    PubMed

    Friemann, A; Brinkmann, K; Hachtel, W

    1992-02-01

    The sequence of an mRNA encoding nitrite reductase (NiR, EC 1.7.7.1.) from the tree Betula pendula was determined. A cDNA library constructed from leaf poly(A)+ mRNA was screened with an oligonucleotide probe deduced from NiR sequences from spinach and maize. A 2.5 kb cDNA was isolated that hybridized to an mRNA, the steady-state level of which increased markedly upon induction with nitrate. The nucleotide sequence of the cDNA contains a reading frame encoding a protein of 583 amino acids that reveals 79% identity with NiR from spinach. The transit peptide of the NiR precursor from birch was determined to be 22 amino acids in size by sequence comparison with NiR from spinach and maize and is the shortest transit peptide reported so far. A graphical evaluation of identities found in the NiR sequence alignment revealed nine well conserved sections each exceeding ten amino acids in size. Sequence comparisons with related redox proteins identified essential residues involved in cofactor binding. A putative binding site for ferredoxin was found in the N-terminal half of the protein.

  15. Large distribution and high sequence identity of a Copia-type retrotransposon in angiosperm families.

    PubMed

    Dias, Elaine Silva; Hatt, Clémence; Hamon, Serge; Hamon, Perla; Rigoreau, Michel; Crouzillat, Dominique; Carareto, Claudia Marcia Aparecida; de Kochko, Alexandre; Guyot, Romain

    2015-09-01

    Retrotransposons are the main component of plant genomes. Recent studies have revealed the complexity of their evolutionary dynamics. Here, we have identified Copia25 in Coffea canephora, a new plant retrotransposon belonging to the Ty1-Copia superfamily. In the Coffea genomes analyzed, Copia25 is present in relatively low copy numbers and transcribed. Similarity sequence searches and PCR analyses show that this retrotransposon with LTRs (Long Terminal Repeats) is widely distributed among the Rubiaceae family and that it is also present in other distantly related species belonging to Asterids, Rosids and monocots. A particular situation is the high sequence identity found between the Copia25 sequences of Musa, a monocot, and Ixora, a dicot species (Rubiaceae). Our results reveal the complexity of the evolutionary dynamics of the ancient element Copia25 in angiosperm, involving several processes including sequence conservation, rapid turnover, stochastic losses and horizontal transfer.

  16. Cloning and characterization of a highly repetitive fish nucleotide sequence.

    PubMed

    Datta, U; Dutta, P; Mandal, R K

    1988-01-01

    We have cloned and sequenced a highly repetitive HindIII fragment of DNA from the common carp Cyprinus carpio. It represents a tandemly repeated sequence with a monomeric unit of 245 bp and comprises 8% of the fish genome. Higher units of this monomer appear as a ladder in Southern blots. The monomeric unit has been sequenced; it is A + T-rich with some direct and some inverse-repeat nucleotide clusters.

  17. Denoising DNA deep sequencing data: high-throughput sequencing errors and their correction.

    PubMed

    Laehnemann, David; Borkhardt, Arndt; McHardy, Alice Carolyn

    2016-01-01

    Characterizing the errors generated by common high-throughput sequencing platforms and telling true genetic variation from technical artefacts are two interdependent steps, essential to many analyses such as single nucleotide variant calling, haplotype inference, sequence assembly and evolutionary studies. Both random and systematic errors can show a specific occurrence profile for each of the six prominent sequencing platforms surveyed here: 454 pyrosequencing, Complete Genomics DNA nanoball sequencing, Illumina sequencing by synthesis, Ion Torrent semiconductor sequencing, Pacific Biosciences single-molecule real-time sequencing and Oxford Nanopore sequencing. There is a large variety of programs available for error removal in sequencing read data, which differ in the error models and statistical techniques they use, the features of the data they analyse, the parameters they determine from them and the data structures and algorithms they use. We highlight the assumptions they make and for which data types these hold, providing guidance on which tools to consider for benchmarking with regard to the data properties. While no benchmarking results are included here, such specific benchmarks would greatly inform tool choices and future software development. The development of stand-alone error correctors, as well as single nucleotide variant and haplotype callers, could also benefit from using more of the knowledge about error profiles and from (re)combining ideas from the existing approaches presented here.

  18. Domains in microbial beta-1,4-glycanases: sequence conservation, function, and enzyme families.

    PubMed Central

    Gilkes, N R; Henrissat, B; Kilburn, D G; Miller, R C; Warren, R A

    1991-01-01

    Several types of domain occur in beta-1,4-glycanases. The best characterized of these are the catalytic domains and the cellulose-binding domains. The domains may be joined by linker sequences rich in proline or hydroxyamino acids or both. Some of the enzymes contain repeated sequences up to 150 amino acids in length. The enzymes can be grouped into families on the basis of sequence similarities between the catalytic domains. There are sequence similarities between the cellulose-binding domains, of which two types have been identified, and also between some domains of unknown function. The beta-1,4-glycanases appear to have arisen by the shuffling of a relatively small number of progenitor sequences. PMID:1886523

  19. Role of Escherichia coli YbeY, a highly conserved protein, in rRNA processing

    PubMed Central

    Davies, Bryan W.; Köhrer, Caroline; Jacob, Asha I.; Simmons, Lyle A.; Zhu, Jianyu; Aleman, Lourdes M.; RajBhandary, Uttam L.; Walker, Graham C.

    2010-01-01

    The UPF0054 protein family is highly conserved with homologs present in nearly every sequenced bacterium. In some bacteria, the respective gene is essential, while in others its loss results in a highly pleiotropic phenotype. Despite detailed structural studies, a cellular role for this protein family has remained unknown. We report here that deletion of the Escherichia coli homolog, YbeY, causes striking defects that affect ribosome activity, translational fidelity and ribosome assembly. Mapping of 16S, 23S and 5S rRNA termini reveals that YbeY influences the maturation of all three rRNAs, with a particularly strong effect on maturation at both the 5′- and 3′-ends of 16S rRNA as well as maturation of the 5′-termini of 23S and 5S rRNAs. Furthermore, we demonstrate strong genetic interactions between ybeY and rnc (encoding RNase III), ybeY and rnr (encoding RNase R), and ybeY and pnp (encoding PNPase), further suggesting a role for YbeY in rRNA maturation. Mutation of highly conserved amino acids in YbeY allowed the identification of two residues (H114, R59) that were found to have a significant effect in vivo. We discuss the implications of these findings for rRNA maturation and ribosome assembly in bacteria. PMID:20807199

  20. High-Order Space-Time Methods for Conservation Laws

    NASA Technical Reports Server (NTRS)

    Huynh, H. T.

    2013-01-01

    Current high-order methods such as discontinuous Galerkin and/or flux reconstruction can provide effective discretization for the spatial derivatives. Together with a time discretization, such methods result in either too small a time step size in the case of an explicit scheme or a very large system in the case of an implicit one. To tackle these problems, two new high-order space-time schemes for conservation laws are introduced: the first is explicit and the second, implicit. The explicit method here, also called the moment scheme, achieves a Courant-Friedrichs-Lewy (CFL) condition of 1 for the case of one spatial dimension regardless of the degree of the polynomial approximation. (For standard explicit methods, if the spatial approximation is of degree p, then the time step sizes are typically proportional to 1/p^2.) Fourier analyses for the one- and two-dimensional cases are carried out. The property of super accuracy (or super convergence) is discussed. The implicit method is a simplified but optimal version of the discontinuous Galerkin scheme applied to time. It reduces to a collocation implicit Runge-Kutta (RK) method for ordinary differential equations (ODE) called Radau IIA. The explicit and implicit schemes are closely related since they employ the same intermediate time levels, and the former can serve as a key building block in an iterative procedure for the latter. A limiting technique for the piecewise linear scheme is also discussed. The technique can suppress oscillations near a discontinuity while preserving accuracy near extrema. Preliminary numerical results are shown.
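
    To make the time-step comparison concrete, here is a small sketch for one-dimensional linear advection, where the step is dt = CFL * dx / |a|. It contrasts the CFL limit of 1 reported for the moment scheme with a standard explicit limit that scales as 1/p^2. The helper names and the constant `c = 1.0` in the 1/p^2 model are assumptions for illustration; the abstract gives only the proportionality.

```python
def time_step(dx, wave_speed, cfl):
    """Time step dt = cfl * dx / |a| for the advection equation u_t + a u_x = 0."""
    if wave_speed == 0:
        raise ValueError("wave speed must be nonzero")
    return cfl * dx / abs(wave_speed)


def standard_explicit_cfl(p, c=1.0):
    """Illustrative CFL limit c / p**2 for polynomial degree p.
    The constant c is hypothetical; only the 1/p**2 scaling is from the text."""
    if p < 1:
        raise ValueError("polynomial degree must be at least 1")
    return c / p**2


dx, a = 0.1, 2.0  # hypothetical cell size and wave speed
for p in (1, 2, 4):
    dt_standard = time_step(dx, a, standard_explicit_cfl(p))
    dt_moment = time_step(dx, a, 1.0)  # CFL limit of 1, independent of p
    print(p, dt_standard, dt_moment, dt_moment / dt_standard)
```

    Under this model the gap between the two step sizes grows as p^2, which is the motivation the abstract gives for the explicit space-time scheme.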

  1. Structure-sequence based analysis for identification of conserved regions in proteins

    DOEpatents

    Zemla, Adam T; Zhou, Carol E; Lam, Marisa W; Smith, Jason R; Pardes, Elizabeth

    2013-05-28

    Disclosed are computational methods, and associated hardware and software products, for scoring conservation in a protein structure based on a computationally identified family or cluster of protein structures. A method of computationally identifying a family or cluster of protein structures is also disclosed herein.

  2. The role of evolutionary conserved germline DH sequence in B-1 cell development and natural antibody production

    PubMed Central

    Vale, Andre M.; Nobrega, Alberto; Schroeder, Harry W.

    2015-01-01

    Due to N addition and variation in the site of V–D–J joining, the third complementarity-determining region of the heavy chain (CDR-H3) is the most diverse component of the initial immunoglobulin antigen-binding site repertoire. A large component of the peritoneal cavity B-1 cell component is the product of fetal and perinatal B cell production. The CDR-H3 repertoire is thus depleted of N addition, which increases dependency on germ-line sequence. Cross-species comparisons have shown that DH gene sequence demonstrates conservation of amino acid preferences by reading frame. Preference for reading frame 1, which is enriched for tyrosine and glycine, is created both by rearrangement patterns and by pre-BCR and BCR selection. In previous studies, we have assessed the role of conserved DH sequence by examining peritoneal cavity B-1 cell numbers and antibody production in BALB/c mice with altered DH loci. Here, we review our finding that changes in the constraints normally imposed by germ line–encoded amino acids within the CDR-H3 repertoire profoundly affect B-1 cell development, especially B-1a cells, and thus natural antibody immunity. Our studies suggest that both natural and somatic selection operate to create a restricted B-1 cell CDR-H3 repertoire. PMID:26104486

  3. The role of evolutionarily conserved germ-line DH sequence in B-1 cell development and natural antibody production.

    PubMed

    Vale, Andre M; Nobrega, Alberto; Schroeder, Harry W

    2015-12-01

    Because of N addition and variation in the site of VDJ joining, the third complementarity-determining region of the heavy chain (CDR-H3) is the most diverse component of the initial immunoglobulin antigen-binding site repertoire. A large component of the peritoneal cavity B-1 cell component is the product of fetal and perinatal B cell production. The CDR-H3 repertoire is thus depleted of N addition, which increases dependency on germ-line sequence. Cross-species comparisons have shown that DH gene sequence demonstrates conservation of amino acid preferences by reading frame. Preference for reading frame 1, which is enriched for tyrosine and glycine, is created both by rearrangement patterns and by pre-BCR and BCR selection. In previous studies, we have assessed the role of conserved DH sequence by examining peritoneal cavity B-1 cell numbers and antibody production in BALB/c mice with altered DH loci. Here, we review our finding that changes in the constraints normally imposed by germ-line-encoded amino acids within the CDR-H3 repertoire profoundly affect B-1 cell development, especially B-1a cells, and thus natural antibody immunity. Our studies suggest that both natural and somatic selection operate to create a restricted B-1 cell CDR-H3 repertoire.

  4. Characterization and complete genome sequence of a panicovirus from Bermuda grass by high-throughput sequencing.

    PubMed

    Tahir, Muhammad N; Lockhart, Ben; Grinstead, Samuel; Mollov, Dimitre

    2017-04-01

    Bermuda grass samples were examined by transmission electron microscopy and 28-30 nm spherical virus particles were observed. Total RNA from these plants was subjected to high-throughput sequencing (HTS). The nearly full genome sequence of a panicovirus was identified from one HTS scaffold. Sanger sequencing was used to confirm the HTS results and complete the genome sequence of 4404 nt. This virus was provisionally named Bermuda grass latent virus (BGLV). Its predicted open reading frames follow the typical arrangement of the genus Panicovirus. Based on sequence comparisons and phylogenetic analyses BGLV differs from other viruses and therefore taxonomically it is a new member of the genus Panicovirus, family Tombusviridae.

  5. Highly effective sequencing whole chloroplast genomes of angiosperms by nine novel universal primer pairs.

    PubMed

    Yang, Jun-Bo; Li, De-Zhu; Li, Hong-Tao

    2014-09-01

    Chloroplast genomes supply indispensable information that helps improve phylogenetic resolution and can even serve as organelle-scale barcodes. Next-generation sequencing technologies have helped promote sequencing of complete chloroplast genomes, but compared with the number of angiosperms, relatively few chloroplast genomes have been sequenced. There are two major reasons for the paucity of completely sequenced chloroplast genomes: (i) massive amounts of fresh leaves are needed for chloroplast sequencing and (ii) there are considerable gaps in the sequenced chloroplast genomes of many plants because of the difficulty of isolating high-quality chloroplast DNA, preventing complete chloroplast genomes from being assembled. To overcome these obstacles, we analysed all known angiosperm chloroplast genomes available to date and then designed nine universal primer pairs corresponding to the highly conserved regions. Using these primers, angiosperm whole chloroplast genomes can be amplified using long-range PCR and sequenced using next-generation sequencing methods. The primers showed high universality, which was tested using 24 species representing major clades of angiosperms. To validate the functionality of the primers, the complete chloroplast genomes of eight species representing major groups of angiosperms, that is, early-diverging angiosperms, magnoliids, monocots, Saxifragales, fabids, malvids and asterids, were sequenced and assembled. In our trials, only 100 mg of fresh leaves was used. The results show that the universal primer set provided an easy, effective and feasible approach for sequencing whole chloroplast genomes in angiosperms. The designed universal primer pairs offer a way to accelerate genome-scale data acquisition and should therefore improve phylogenetic resolution and species identification in angiosperms.

  6. Conserved hypothetical protein Rv1977 in Mycobacterium tuberculosis strains contains sequence polymorphisms and might be involved in ongoing immune evasion.

    PubMed

    Jiang, Yi; Liu, Haican; Wang, Xuezhi; Li, Guilian; Qiu, Yan; Dou, Xiangfeng; Wan, Kanglin

    2015-01-01

    Host immune pressure and associated parasite immune evasion are key features of host-pathogen co-evolution. A previous study showed that human T cell epitopes of Mycobacterium tuberculosis are evolutionarily hyperconserved and thus it was deduced that M. tuberculosis lacks antigenic variation and immune evasion. Here, we selected 151 clinical Mycobacterium tuberculosis isolates from China, amplified the gene encoding Rv1977 and compared the sequences. The results showed that Rv1977, a conserved hypothetical protein, is not conserved in M. tuberculosis strains and that polymorphisms exist in the protein. Some mutations, especially one frameshift mutation, occurred in the antigen Rv1977, which is uncommon in M. tuberculosis strains and may alter the protein's function. The mutations and the deletion in the gene all affect one of three T cell epitopes, and the changed T cell epitope contained more than one variable position, which may suggest ongoing immune evasion.

  7. Use of ancient sedimentary DNA as a novel conservation tool for high-altitude tropical biodiversity.

    PubMed

    Boessenkool, Sanne; McGlynn, Gayle; Epp, Laura S; Taylor, David; Pimentel, Manuel; Gizaw, Abel; Nemomissa, Sileshi; Brochmann, Christian; Popp, Magnus

    2014-04-01

    Conservation of biodiversity may in the future increasingly depend upon the availability of scientific information to set suitable restoration targets. In traditional paleoecology, sediment-based pollen provides a means to define preanthropogenic impact conditions, but problems in establishing the exact provenance and ecologically meaningful levels of taxonomic resolution of the evidence are limiting. We explored the extent to which the use of sedimentary ancient DNA (sedaDNA) may complement pollen data in reconstructing past alpine environments in the tropics. We constructed a record of afro-alpine plants retrieved from DNA preserved in sediment cores from 2 volcanic crater sites in the Albertine Rift, eastern Africa. The record extended well beyond the onset of substantial anthropogenic effects on tropical mountains. To ensure high-quality taxonomic inference from the sedaDNA sequences, we built an extensive DNA reference library covering the majority of the afro-alpine flora, by sequencing DNA from taxonomically verified specimens. Comparisons with pollen records from the same sediment cores showed that plant diversity recovered with sedaDNA improved vegetation reconstructions based on pollen records by revealing both additional taxa and providing increased taxonomic resolution. Furthermore, combining the 2 measures assisted in distinguishing vegetation change at different geographic scales; sedaDNA almost exclusively reflects local vegetation, whereas pollen can potentially originate from a wide area that in highlands in particular can span several ecozones. Our results suggest that sedaDNA may provide information on restoration targets and the nature and magnitude of human-induced environmental changes, including in high conservation priority, biodiversity hotspots, where understanding of preanthropogenic impact (or reference) conditions is highly limited.

  8. Mammalian ets-1 and ets-2 genes encode highly conserved proteins.

    PubMed Central

    Watson, D K; McWilliams, M J; Lapis, P; Lautenberger, J A; Schweinfest, C W; Papas, T S

    1988-01-01

    Cellular ets sequences homologous to v-ets of the avian leukemia virus E26 are highly conserved. In mammals the ets sequences are dispersed on two separate chromosomal loci, called ets-1 and ets-2. To determine the structure of these two genes and identify the open reading frames that code for the putative proteins, we have sequenced human ets-1 cDNAs and ets-2 cDNA clones obtained from both human and mouse. The human ETS1 gene is capable of encoding a protein of 441 amino acids. This protein is greater than 95% identical to the chicken c-ets-1 gene product. Thus, the human ETS1 gene is homologous to the chicken c-ets-1 gene, the protooncogene that the E26 virus transduced. Human and mouse ets-2 cDNA clones are closely related and contain open reading frames capable of encoding proteins of 469 and 468 residues, respectively. Direct comparison of these data with previously published findings indicates that ets is a family of genes whose members share distinct domains. PMID:2847145

  9. The human HNRPD locus maps to 4q21 and encodes a highly conserved protein.

    PubMed

    Dempsey, L A; Li, M J; DePace, A; Bray-Ward, P; Maizels, N

    1998-05-01

    The hnRNP D protein interacts with nucleic acids both in vivo and in vitro. Like many other proteins that interact with RNA, it contains RBD (or "RRM") domains and arg-gly-gly (RGG) motifs. We have examined the organization and localization of the human and murine genes that encode the hnRNP D protein. Comparison of the predicted sequences of the hnRNP D proteins in human and mouse shows that they are 96.9% identical (98.9% similar). This very high level of conservation suggests a critical function for hnRNP D. Sequence analysis of the human HNRPD gene shows that the protein is encoded by eight exons and that two additional exons specify sequences in the 3' UTR. Use of two of the coding exons is determined by alternative splicing of the HNRPD mRNA. The human HNRPD gene maps to 4q21. The mouse Hnrpd gene maps to the F region of chromosome 3, which is syntenic with the human 4q21 region.

  10. Specific binding of eukaryotic ORC to DNA replication origins depends on highly conserved basic residues.

    PubMed

    Kawakami, Hironori; Ohashi, Eiji; Kanamoto, Shota; Tsurimoto, Toshiki; Katayama, Tsutomu

    2015-10-12

    In eukaryotes, the origin recognition complex (ORC) heterohexamer preferentially binds replication origins to trigger initiation of DNA replication. Crystallographic studies using eubacterial and archaeal ORC orthologs suggested that eukaryotic ORC may bind to origin DNA via putative winged-helix DNA-binding domains and AAA+ ATPase domains. However, the mechanisms by which eukaryotic ORC recognizes origin DNA remain elusive. Here, we show in budding yeast that Lys-362 and Arg-367 residues of the largest subunit (Orc1), both outside the aforementioned domains, are crucial for specific binding of ORC to origin DNA. These basic residues, which reside in a putative disordered domain, were dispensable for interaction with ATP and non-specific DNA sequences, suggesting a specific role in recognition. Consistent with this, both residues were required for origin binding of Orc1 in vivo. A truncated Orc1 polypeptide containing these residues solely recognizes the ARS sequence with low affinity, and the Arg-367 residue stimulates the sequence-specific binding mode of the polypeptide. Lys-362 and Arg-367 residues of Orc1 are highly conserved among eukaryotic ORCs, but not in eubacterial and archaeal orthologs, suggesting a eukaryote-specific mechanism underlying recognition of replication origins by ORC.

  11. Species identification using genetic tools: the value of nuclear and mitochondrial gene sequences in whale conservation.

    PubMed

    Palumbi, S R; Cipriano, F

    1998-01-01

    DNA sequence analysis is a powerful tool for identifying the source of samples thought to be derived from threatened or endangered species. Analysis of mitochondrial DNA (mtDNA) from retail whale meat markets has shown consistently that the expected baleen whale in these markets, the minke whale, makes up only about half the products analyzed. The other products are either unregulated small toothed whales like dolphins or are protected baleen whales such as humpback, Bryde's, fin, or blue whales. Independent verification of such mtDNA identifications requires analysis of nuclear genetic loci, but this is technically more difficult than standard mtDNA sequencing. In addition, evolution of species-specific sequences (i.e., fixation of sequence differences to produce reciprocally monophyletic gene trees) is slower in nuclear than in mitochondrial genes primarily because genetic drift is slower at nuclear loci. When will use of nuclear sequences allow forensic DNA identification? Comparison of neutral theories of coalescence of mitochondrial and nuclear loci suggests a simple rule of thumb. The "three-times rule" suggests that phylogenetic sorting at nuclear loci is likely to produce species-specific sequences when mitochondrial alleles are reciprocally monophyletic and the branches leading to the mtDNA sequences of a species are three times longer than the average difference observed within species. A preliminary test of the three-times rule, which depends on many assumptions about the species and genes involved, suggests that blue and fin whales should have species-specific sequences at most neutral nuclear loci, whereas humpback and fin whales should show species-specific sequences at fewer nuclear loci. Partial sequences of actin introns from these species confirm the predictions of the three-times rule and show that blue and fin whales are reciprocally monophyletic at this locus. These intron sequences are thus good tools for the identification of these species
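
    The "three-times rule" described above reduces to a single inequality, sketched below. This is an illustrative reading of the abstract only: the function name, the use of `>=` for "three times longer", and the numeric inputs are assumptions, and both quantities are taken to be in the same units (for example, substitutions per site).

```python
def passes_three_times_rule(branch_length, mean_within_species_diff,
                            reciprocally_monophyletic=True, factor=3.0):
    """Return True when the mtDNA alleles are reciprocally monophyletic and
    the branch leading to a species' mtDNA sequences is at least `factor`
    times the average within-species difference."""
    if branch_length < 0 or mean_within_species_diff < 0:
        raise ValueError("distances must be non-negative")
    return (reciprocally_monophyletic
            and branch_length >= factor * mean_within_species_diff)


# Hypothetical distances, not values from the study.
print(passes_three_times_rule(0.06, 0.01))
print(passes_three_times_rule(0.02, 0.01))
```

    As the abstract stresses, the rule rests on many assumptions about the species and genes involved, so a check like this is a screening heuristic rather than a test.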

  12. A New DNA Binding Protein Highly Conserved in Diverse Crenarchaeal Viruses

    SciTech Connect

    Larson, E.T.; Eilers, B.J.; Reiter, D.; Ortmann, A.C.; Young, M.J.; Lawrence, C.M.; /Montana State U. /Tubingen U.

    2007-07-09

    Sulfolobus turreted icosahedral virus (STIV) infects Sulfolobus species found in the hot springs of Yellowstone National Park. Its 37 open reading frames (ORFs) generally lack sequence similarity to other genes. One exception, however, is ORF B116. While its function is unknown, orthologs are found in three additional crenarchaeal viral families. Due to the central importance of this protein family to crenarchaeal viruses, we have undertaken structural and biochemical studies of B116. The structure reveals a previously unobserved fold consisting of a five-stranded beta-sheet flanked on one side by three alpha helices. Two subunits come together to form a homodimer with a 10-stranded mixed beta-sheet, where the topology of the central strands resembles an unclosed beta-barrel. Highly conserved loops rise above the surface of the saddle-shaped protein and suggest an interaction with the major groove of DNA. The predicted B116-DNA interaction is confirmed by electrophoretic mobility shift assays.

  13. 7 CFR 760.821 - Compliance with highly erodible land and wetland conservation.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... 7 Agriculture 7 2014-01-01 2014-01-01 false Compliance with highly erodible land and wetland... Disaster Program § 760.821 Compliance with highly erodible land and wetland conservation. (a) The highly erodible land and wetland conservation provisions of part 12 of this title apply to the receipt of...

  14. 7 CFR 760.821 - Compliance with highly erodible land and wetland conservation.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 7 Agriculture 7 2011-01-01 2011-01-01 false Compliance with highly erodible land and wetland... Disaster Program § 760.821 Compliance with highly erodible land and wetland conservation. (a) The highly erodible land and wetland conservation provisions of part 12 of this title apply to the receipt of...

  15. 7 CFR 760.821 - Compliance with highly erodible land and wetland conservation.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... 7 Agriculture 7 2012-01-01 2012-01-01 false Compliance with highly erodible land and wetland... Disaster Program § 760.821 Compliance with highly erodible land and wetland conservation. (a) The highly erodible land and wetland conservation provisions of part 12 of this title apply to the receipt of...

  16. 7 CFR 760.821 - Compliance with highly erodible land and wetland conservation.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 7 Agriculture 7 2010-01-01 2010-01-01 false Compliance with highly erodible land and wetland... Disaster Program § 760.821 Compliance with highly erodible land and wetland conservation. (a) The highly erodible land and wetland conservation provisions of part 12 of this title apply to the receipt of...

  17. 7 CFR 760.821 - Compliance with highly erodible land and wetland conservation.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... 7 Agriculture 7 2013-01-01 2013-01-01 false Compliance with highly erodible land and wetland... Disaster Program § 760.821 Compliance with highly erodible land and wetland conservation. (a) The highly erodible land and wetland conservation provisions of part 12 of this title apply to the receipt of...

  18. Conservation of nucleotide sequences for molecular diagnosis of Middle East respiratory syndrome coronavirus, 2015.

    PubMed

    Furuse, Yuki; Okamoto, Michiko; Oshitani, Hitoshi

    2015-11-01

    Infection due to the Middle East respiratory syndrome coronavirus (MERS-CoV) is widespread. The present study was performed to assess the protocols used for the molecular diagnosis of MERS-CoV by analyzing the nucleotide sequences of viruses detected between 2012 and 2015, including sequences from the large outbreak in eastern Asia in 2015. Although the diagnostic protocols were established only 2 years ago, mismatches between the sequences of primers/probes and viruses were found for several of the assays. Such mismatches could lead to a lower sensitivity of the assay, thereby leading to false-negative diagnosis. A slight modification in the primer design is suggested. Protocols for the molecular diagnosis of viral infections should be reviewed regularly after they are established, particularly for viruses that pose a great threat to public health such as MERS-CoV.
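
    A primer-to-virus mismatch check of the kind described above can be sketched as follows. This is a minimal, ungapped comparison written for illustration: the function name and the toy sequences are hypothetical, degenerate bases are not handled, and a reverse primer would first need to be reverse-complemented. It is not the procedure used in the study.

```python
def primer_mismatches(primer, target):
    """Slide the primer along the target without gaps and return
    (fewest_mismatches, offset, mismatch_positions_within_primer)
    for the best-matching site (the leftmost one on ties)."""
    primer, target = primer.upper(), target.upper()
    if not primer or len(primer) > len(target):
        raise ValueError("primer must be non-empty and no longer than target")

    best = None
    for offset in range(len(target) - len(primer) + 1):
        window = target[offset:offset + len(primer)]
        positions = [i for i, (p, t) in enumerate(zip(primer, window)) if p != t]
        if best is None or len(positions) < best[0]:
            best = (len(positions), offset, positions)
    return best


# Toy sequences (hypothetical, not MERS-CoV primers or genomes).
print(primer_mismatches("ACGTAC", "TTACGTTCGG"))
```

    Mismatches near the 3' end of a primer generally matter most for amplification, so the returned positions are often more informative than the count alone.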

  19. Length heterogeneity at conserved sequence block 2 in human mitochondrial DNA acts as a rheostat for RNA polymerase POLRMT activity

    PubMed Central

    Tan, Benedict G.; Wellesley, Frederick C.; Savery, Nigel J.; Szczelkun, Mark D.

    2016-01-01

    The guanine (G)-tract of conserved sequence block 2 (CSB 2) in human mitochondrial DNA can result in transcription termination due to formation of a hybrid G-quadruplex between the nascent RNA and the nontemplate DNA strand. This structure can then influence genome replication, stability and localization. Here we surveyed the frequency of variation in sequence identity and length at CSB 2 amongst human mitochondrial genomes and used in vitro transcription to assess the effects of this length heterogeneity on the activity of the mitochondrial RNA polymerase, POLRMT. In general, increased G-tract length correlated with increased termination levels. However, variation in the population favoured CSB 2 sequences which produced efficient termination while particularly weak or strong signals were avoided. For all variants examined, the 3′ end of the transcripts mapped to the same downstream sequences and were prevented from terminating by addition of the transcription factor TEFM. We propose that CSB 2 length heterogeneity allows variation in the efficiency of transcription termination without affecting the position of the products or the capacity for regulation by TEFM. PMID:27436287

  20. Complete sequence of the mitochondrial DNA in the sea urchin Arbacia lixula: conserved features of the echinoid mitochondrial genome.

    PubMed

    De Giorgi, C; Martiradonna, A; Lanave, C; Saccone, C

    1996-04-01

    The complete nucleotide sequence (15,719 nucleotides) of the mitochondrial DNA (mtDNA) from the sea urchin Arbacia lixula is presented. The comparison of gene arrangement between different echinoderm orders of the same class provides evidence that the gene organization is conserved within the same echinoderm class. The peculiarities of sea urchin mtDNA features, already described, are confirmed by the A. lixula mtDNA sequence. The comparison of the entire sequences of mtDNA among A. lixula, Paracentrotus lividus, and Strongylocentrotus purpuratus allowed us to detect peculiar features, common to the three sea urchin species, that can represent the molecular signature of the mt genome in the sea urchin group. Analysis of the nucleotide composition indicates that A. lixula mtDNA, in contrast with the mtDNA of other sea urchins, shows a bias in the use of T and tends to avoid the use of C, most evident in the neutral part of the molecule, such as the third codon positions. This observation indicates that the three sea urchin mtDNAs evolve under different mutation pressure. Analysis of the sequence evolution allowed us to confirm the phylogenetic tree. However, the absolute divergence time, calculated on the basis of paleontological estimates, largely diverged from the expected one.
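
    The composition analysis at third codon positions mentioned above can be illustrated with a short sketch. It assumes an in-frame coding sequence given on its sense strand; the function name and the toy sequence are hypothetical, and ambiguous bases count toward the total but are not reported.

```python
from collections import Counter


def third_position_composition(cds):
    """Return the fraction of A, C, G and T at third codon positions
    of an in-frame coding sequence."""
    cds = cds.upper()
    if not cds or len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a positive multiple of 3")
    thirds = cds[2::3]  # every third base, starting at codon position 3
    counts = Counter(thirds)
    return {base: counts.get(base, 0) / len(thirds) for base in "ACGT"}


# Toy sequence with codons ATT GCT TTT AAA (not from A. lixula).
print(third_position_composition("ATTGCTTTTAAA"))
```

    A high T fraction and a low C fraction in such a table would correspond to the kind of bias the abstract reports for A. lixula.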

  1. Evolution of ITS1 rDNA in the Digenea (Platyhelminthes: Trematoda): 3' end sequence conservation and its phylogenetic utility.

    PubMed

    vd Schulenburg, J H; Englisch, U; Wägele, J W

    1999-01-01

    A comparison of ribosomal internal transcribed spacer 1 (ITS1) elements of digenetic trematodes (Platyhelminthes) including unidentified digeneans isolated from Cyathura carinata (Crustacea: Isopoda) revealed DNA sequence similarities at more than half of the spacer at its 3' end. Primary sequence similarity was shown to be associated with secondary structure conservation, which suggested that similarity is due to identity by descent and not chance. Using an analysis of apomorphies, the sequence data were shown to produce a distinct phylogenetic signal. This was confirmed by the consistency of results of different tree reconstruction methods such as distance approaches, maximum parsimony, and maximum likelihood. Morphological evidence additionally supported the phylogenetic tree based on ITS1 data and the inferred phylogenetic position of the unidentified digeneans of C. carinata met the expectations from known trematode life-cycle patterns. Although ribosomal ITS1 elements are generally believed to be too variable for phylogenetic analysis above the species or genus level, the overall consistency of the results of this study strongly suggests that this is not the case in digenetic trematodes. Here, 3' end ITS1 sequence data seem to provide a valuable tool for elucidating phylogenetic relationships of a broad range of phylogenetically distinct taxa.

  2. Conservation of the C-type lectin fold for massive sequence variation in a Treponema diversity-generating retroelement

    SciTech Connect

    Le Coq, Johanne; Ghosh, Partho

    2012-06-19

    Anticipatory ligand binding through massive protein sequence variation is rare in biological systems, having been observed only in the vertebrate adaptive immune response and in a phage diversity-generating retroelement (DGR). Earlier work has demonstrated that the prototypical DGR variable protein, major tropism determinant (Mtd), meets the demands of anticipatory ligand binding by novel means through the C-type lectin (CLec) fold. However, because of the low sequence identity among DGR variable proteins, it has remained unclear whether the CLec fold is a general solution for DGRs. We have addressed this problem by determining the structure of a second DGR variable protein, TvpA, from the pathogenic oral spirochete Treponema denticola. Despite its weak sequence identity to Mtd (~16%), TvpA was found to also have a CLec fold, with predicted variable residues exposed in a ligand-binding site. However, this site in TvpA was markedly more variable than the one in Mtd, reflecting the unprecedented potential variability of approximately 10^20 in TvpA. In addition, similarity between TvpA and Mtd with formylglycine-generating enzymes was detected. These results provide strong evidence for the conservation of the formylglycine-generating enzyme-type CLec fold among DGRs as a means of accommodating massive sequence variation.

  3. A conserved unusual posttranscriptional processing mediated by short, direct repeated (SDR) sequences in plants.

    PubMed

    Niu, Xiangli; Luo, Di; Gao, Shaopei; Ren, Guangjun; Chang, Lijuan; Zhou, Yuke; Luo, Xiaoli; Li, Yuxiang; Hou, Pei; Tang, Wei; Lu, Bao-Rong; Liu, Yongsheng

    2010-01-01

    In several stress responsive gene loci of monocot cereal crops, we have previously identified an unusual posttranscriptional processing mediated by the paired presence of short direct repeated (SDR) sequences at 5' and 3' splicing junctions that are distinct from conventional (U2/U12-type) splicing boundaries. By using the known SDR-containing sequences as probes, 24 plant candidate genes involved in diverse functional pathways from both monocots and dicots that potentially possess SDR-mediated posttranscriptional processing were predicted in the GenBank database. The SDR-mediated posttranscriptional processing events, including cis- and trans-actions, were experimentally detected in the majority of the predicted candidates. Extensive sequence analysis demonstrates several types of SDR-associated splicing peculiarities, including partial exon deletion, exon fragment repetition, exon fragment scrambling and trans-splicing, that result in either loss of partial exon or unusual exonic sequence rearrangements within or between RNA molecules. In addition, we show that the paired presence of SDR is necessary but not sufficient in SDR-mediated splicing in transient expression and stable transformation systems. We also show that prokaryotes are incapable of SDR-mediated pre-mRNA splicing.

  4. Storage and retrieval of highly repetitive sequence collections.

    PubMed

    Mäkinen, Veli; Navarro, Gonzalo; Sirén, Jouni; Välimäki, Niko

    2010-03-01

    A repetitive sequence collection is a set of sequences which are small variations of each other. A prominent example is the genome sequences of individuals of the same or close species, where the differences can be expressed by short lists of basic edit operations. Flexible and efficient data analysis on such a typically huge collection is plausible using suffix trees. However, the suffix tree occupies much space, which very soon inhibits in-memory analyses. Recent advances in full-text indexing reduce the space of the suffix tree to, essentially, that of the compressed sequences, while retaining its functionality with only a polylogarithmic slowdown. However, the underlying compression model considers only the predictability of the next sequence symbol given the k previous ones, where k is a small integer. This is unable to capture longer-term repetitiveness. For example, r identical copies of an incompressible sequence will be incompressible under this model. We develop new static and dynamic full-text indexes that are capable of capturing the fact that a collection is highly repetitive, and require space basically proportional to the length of one typical sequence plus the total number of edit operations. The new indexes can be plugged into a recent dynamic fully-compressed suffix tree, achieving full functionality for sequence analysis, while retaining the reduced space and the polylogarithmic slowdown. Our experimental results confirm the practicality of our proposal.
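
    The point about space growing with one sequence plus the number of edits, rather than with the total collection length, can be illustrated with a toy experiment. The sketch below is not the authors' index; it assumes that the number of equal-letter runs in the Burrows-Wheeler transform (BWT) is a reasonable stand-in for the size of a run-length compressed index, and all names (bwt, count_runs, make_collection) are hypothetical helpers written for this illustration.

```python
import random


def bwt(text: str) -> str:
    """Burrows-Wheeler transform via sorted rotations (quadratic; toy inputs only)."""
    s = text + "\0"  # unique terminator that sorts before every other character
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    return "".join(rot[-1] for rot in rotations)


def count_runs(s: str) -> int:
    """Number of maximal runs of equal characters in s."""
    return sum(1 for i, c in enumerate(s) if i == 0 or c != s[i - 1])


def make_collection(base: str, copies: int, edits_per_copy: int, rng: random.Random) -> str:
    """Concatenate `copies` variants of `base`, each with at most
    `edits_per_copy` random single-letter substitutions."""
    variants = []
    for _ in range(copies):
        seq = list(base)
        for _ in range(edits_per_copy):
            seq[rng.randrange(len(seq))] = rng.choice("ACGT")
        variants.append("".join(seq))
    return "".join(variants)


rng = random.Random(0)  # fixed seed so the toy experiment is repeatable
base = "".join(rng.choice("ACGT") for _ in range(200))
collection = make_collection(base, copies=10, edits_per_copy=1, rng=rng)

single_runs = count_runs(bwt(base))
collection_runs = count_runs(bwt(collection))
print("one sequence:", len(base), "symbols,", single_runs, "BWT runs")
print("collection:  ", len(collection), "symbols,", collection_runs, "BWT runs")
```

    The collection is ten times longer than a single sequence, yet its BWT run count stays much closer to that of one sequence, which is the kind of repetitiveness a k-th order statistical model cannot exploit.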

  5. High-Throughput Next-Generation Sequencing of Polioviruses.

    PubMed

    Montmayeur, Anna M; Ng, Terry Fei Fan; Schmidt, Alexander; Zhao, Kun; Magaña, Laura; Iber, Jane; Castro, Christina J; Chen, Qi; Henderson, Elizabeth; Ramos, Edward; Shaw, Jing; Tatusov, Roman L; Dybdahl-Sissoko, Naomi; Endegue-Zanga, Marie Claire; Adeniji, Johnson A; Oberste, M Steven; Burns, Cara C

    2017-02-01

    The poliovirus (PV) is currently targeted for worldwide eradication and containment. Sanger-based sequencing of the viral protein 1 (VP1) capsid region is currently the standard method for PV surveillance. However, the whole-genome sequence is sometimes needed for higher resolution global surveillance. In this study, we optimized whole-genome sequencing protocols for poliovirus isolates and FTA cards using next-generation sequencing (NGS), aiming for high sequence coverage, efficiency, and throughput. We found that DNase treatment of poliovirus RNA followed by random reverse transcription (RT), amplification, and the use of the Nextera XT DNA library preparation kit produced significantly better results than other preparations. The average viral reads per total reads, a measurement of efficiency, was as high as 84.2% ± 15.6%. PV genomes covering >99 to 100% of the reference length were obtained and validated with Sanger sequencing. A total of 52 PV genomes were generated, multiplexing as many as 64 samples in a single Illumina MiSeq run. This high-throughput, sequence-independent NGS approach facilitated the detection of a diverse range of PVs, especially for those in vaccine-derived polioviruses (VDPV), circulating VDPV, or immunodeficiency-related VDPV. In contrast to results from previous studies on other viruses, our results showed that filtration and nuclease treatment did not discernibly increase the sequencing efficiency of PV isolates. However, DNase treatment after nucleic acid extraction to remove host DNA significantly improved the sequencing results. This NGS method has been successfully implemented to generate PV genomes for molecular epidemiology of the most recent PV isolates. Additionally, the ability to obtain full PV genomes from FTA cards will aid in facilitating global poliovirus surveillance.

  6. Structure-function studies of nerve growth factor: functional importance of highly conserved amino acid residues.

    PubMed Central

    Ibáñez, C F; Hallböök, F; Ebendal, T; Persson, H

    1990-01-01

    Selected amino acid residues in chicken nerve growth factor (NGF) were replaced by site-directed mutagenesis. Mutated NGF sequences were transiently expressed in COS cells and the yield of NGF protein in conditioned medium was quantified by Western blotting. Binding of each mutant to NGF receptors on PC12 cells was evaluated in a competition assay. The biological activity was determined by measuring stimulation of neurite outgrowth from chick sympathetic ganglia. The residues homologous to the proposed receptor binding site of insulin (Ser18, Met19, Val21, Asp23) were substituted by Ala. Replacement of Ser18, Met19 and Asp23 did not affect NGF activity. Modification of Val21 notably reduced both receptor binding and biological activity, suggesting that this residue is important to retain a fully active NGF. The highly conserved Tyr51 and Arg99 were converted into Phe and Lys respectively, without changing the biological properties of the molecule. However, binding and biological activity were greatly impaired after the simultaneous replacement of both Arg99 and Arg102 by Gly. The three conserved Trp residues at positions 20, 75 and 98 were substituted by Phe. The Trp mutated proteins retained 15-60% of receptor binding and 40-80% of biological activity, indicating that the Trp residues are not essential for NGF activity. However, replacement of Trp20 significantly reduced the amount of NGF in the medium, suggesting that this residue may be important for protein stability. PMID:2328722

  7. A highly conserved SOX6 double binding site mediates SOX6 gene downregulation in erythroid cells

    PubMed Central

    Cantù, Claudio; Grande, Vito; Alborelli, Ilaria; Cassinelli, Letizia; Cantù, Ileana; Colzani, Maria Teresa; Ierardi, Rossella; Ronzoni, Luisa; Cappellini, Maria Domenica; Ferrari, Giuliana; Ottolenghi, Sergio; Ronchi, Antonella

    2011-01-01

    The Sox6 transcription factor plays critical roles in various cell types, including erythroid cells. Sox6-deficient mice are anemic due to impaired red cell maturation and show inappropriate globin gene expression in definitive erythrocytes. To identify new Sox6 target genes in erythroid cells, we used the known repressive double Sox6 consensus within the εy-globin promoter to perform a bioinformatic genome-wide search for similar, evolutionarily conserved motifs located within genes whose expression changes during erythropoiesis. We found a highly conserved Sox6 consensus within the Sox6 human gene promoter itself. This sequence is bound by Sox6 in vitro and in vivo, and mediates transcriptional repression in transient transfections in human erythroleukemic K562 cells and in primary erythroblasts. The binding of a lentiviral transduced Sox6FLAG protein to the endogenous Sox6 promoter is accompanied, in erythroid cells, by strong downregulation of the endogenous Sox6 transcript and by decreased in vivo chromatin accessibility of this region to the PstI restriction enzyme. These observations suggest that the negative Sox6 autoregulation, mediated by the double Sox6 binding site within its own promoter, may be relevant to control the Sox6 transcriptional downregulation that we observe in human erythroid cultures and in mouse bone marrow cells in late erythroid maturation. PMID:20852263

  8. Library preparation for highly accurate population sequencing of RNA viruses

    PubMed Central

    Acevedo, Ashley; Andino, Raul

    2015-01-01

    Circular resequencing (CirSeq) is a novel technique for efficient and highly accurate next-generation sequencing (NGS) of RNA virus populations. The foundation of this approach is the circularization of fragmented viral RNAs, which are then redundantly encoded into tandem repeats by ‘rolling-circle’ reverse transcription. When sequenced, the redundant copies within each read are aligned to derive a consensus sequence of their initial RNA template. This process yields sequencing data with error rates far below the variant frequencies observed for RNA viruses, facilitating ultra-rare variant detection and accurate measurement of low-frequency variants. Although library preparation takes ~5 d, the high-quality data generated by CirSeq simplifies downstream data analysis, making this approach substantially more tractable for experimentalists. PMID:24967624
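
    The consensus step described above can be sketched in a few lines. This is a simplified illustration, not the CirSeq pipeline: it assumes the repeat length is already known, that the copies sit in phase from the first base of the read, and that errors are substitutions only (no insertions or deletions). The function name repeat_consensus and the example sequences are hypothetical.

```python
from collections import Counter


def repeat_consensus(read: str, repeat_len: int, min_copies: int = 3) -> str:
    """Majority-vote consensus over the tandem copies of length `repeat_len` in `read`."""
    # Cut the read into consecutive, non-overlapping full-length copies.
    copies = [read[i:i + repeat_len]
              for i in range(0, len(read) - repeat_len + 1, repeat_len)]
    if len(copies) < min_copies:
        raise ValueError("read does not contain enough full repeats")
    consensus = []
    for column in zip(*copies):  # one column per template position
        base, _count = Counter(column).most_common(1)[0]
        consensus.append(base)
    return "".join(consensus)


template = "ACGTACGGAT"  # made-up template, written in the cDNA alphabet
# Three tandem copies; the middle copy carries one simulated sequencing error (G -> C).
read = template + "ACGTACGCAT" + template
print(repeat_consensus(read, len(template)))
```

    With three copies, a substitution error in any single copy is outvoted by the other two, which is why the consensus error rate can fall far below the raw per-base error rate.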

  9. A Next-Generation Sequencing Method for Genotyping-by-Sequencing of Highly Heterozygous Autotetraploid Potato

    PubMed Central

    Uitdewilligen, Jan G. A. M. L.; Wolters, Anne-Marie A.; D’hoop, Bjorn B.; Borm, Theo J. A.; Visser, Richard G. F.; van Eck, Herman J.

    2013-01-01

    Assessment of genomic DNA sequence variation and genotype calling in autotetraploids implies the ability to distinguish among five possible alternative allele copy number states. This study demonstrates the accuracy of genotyping-by-sequencing (GBS) of a large collection of autotetraploid potato cultivars using next-generation sequencing. It is still costly to reach sufficient read depths on a genome-wide scale, across the cultivated gene pool. Therefore, we enriched cultivar-specific DNA sequencing libraries using an in-solution hybridisation method (SureSelect). This complexity reduction allowed us to confine our study to 807 target genes distributed across the genomes of 83 tetraploid cultivars and one reference (DM 1–3 511). Indexed sequencing libraries were paired-end sequenced in 7 pools of 12 samples using Illumina HiSeq2000. After filtering and processing the raw sequence data, 12.4 Gigabases of high-quality sequence data was obtained, which mapped to 2.1 Mb of the potato reference genome, with a median average read depth of 63× per cultivar. We detected 129,156 sequence variants and genotyped the allele copy number of each variant for every cultivar. In this cultivar panel a variant density of 1 SNP/24 bp in exons and 1 SNP/15 bp in introns was obtained. The average minor allele frequency (MAF) of a variant was 0.14. Potato germplasm displayed a large number of relatively rare variants and/or haplotypes, with 61% of the variants having a MAF below 0.05. A very high average nucleotide diversity (π = 0.0107) was observed. Nucleotide diversity varied among potato chromosomes. Several genes under selection were identified. Genotyping-by-sequencing results, with allele copy number estimates, were validated with a KASP genotyping assay. This validation showed that read depths of ∼60–80× can be used as a lower boundary for reliable assessment of allele copy number of sequence variants in autotetraploids. Genotypic data were associated with traits, and
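
    The five copy number states (0 to 4 copies of the alternative allele) correspond to expected alternative-read fractions of 0, 0.25, 0.5, 0.75 and 1, which is why read depth matters so much in tetraploids. The sketch below is not the genotype caller used in the study; it assumes a plain binomial read-sampling model with a symmetric 1% error rate and no allelic bias, and the names dosage_log_likelihoods and call_dosage are hypothetical.

```python
from math import comb, log

PLOIDY = 4  # autotetraploid: dosage of the alternative allele is 0, 1, 2, 3 or 4


def dosage_log_likelihoods(alt_reads: int, total_reads: int, error: float = 0.01) -> list:
    """Binomial log-likelihood of the observed read counts for each dosage 0..4."""
    log_likelihoods = []
    for dosage in range(PLOIDY + 1):
        p = dosage / PLOIDY
        # Assumed symmetric error: a read is miscalled as the other allele with rate `error`.
        p = p * (1 - error) + (1 - p) * error
        ll = (log(comb(total_reads, alt_reads))
              + alt_reads * log(p)
              + (total_reads - alt_reads) * log(1 - p))
        log_likelihoods.append(ll)
    return log_likelihoods


def call_dosage(alt_reads: int, total_reads: int, error: float = 0.01) -> int:
    """Return the dosage (0..4) with the highest likelihood."""
    lls = dosage_log_likelihoods(alt_reads, total_reads, error)
    return max(range(PLOIDY + 1), key=lambda d: lls[d])


# Hypothetical read counts at a depth close to the median reported above.
for alt, total in [(0, 60), (16, 64), (32, 64), (47, 63), (60, 60)]:
    print(alt, total, "-> dosage", call_dosage(alt, total))
```

    At low depth the binomial distributions for neighbouring dosages overlap heavily, so calls such as 1 versus 2 copies become unreliable; this is consistent with the need for a minimum read depth stated in the abstract.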

  10. High-Resolution Genuinely Multidimensional Solution of Conservation Laws by the Space-Time Conservation Element and Solution Element Method

    NASA Technical Reports Server (NTRS)

    Himansu, Ananda; Chang, Sin-Chung; Yu, Sheng-Tao; Wang, Xiao-Yen; Loh, Ching-Yuen; Jorgenson, Philip C. E.

    1999-01-01

    In this overview paper, we review the basic principles of the method of space-time conservation element and solution element for solving the conservation laws in one and two spatial dimensions. The present method is developed on the basis of local and global flux conservation in a space-time domain, in which space and time are treated in a unified manner. In contrast to the modern upwind schemes, the approach here does not use the Riemann solver and the reconstruction procedure as the building blocks. The drawbacks of the upwind approach, such as the difficulty of rationally extending the 1D scalar approach to systems of equations and particularly to multiple dimensions, are here contrasted with the uniformity and ease of generalization of the Conservation Element and Solution Element (CE/SE) 1D scalar schemes to systems of equations and to multiple spatial dimensions. The assured compatibility with the simplest type of unstructured meshes, and the uniquely simple nonreflecting boundary conditions of the present method are also discussed. The present approach has yielded high-resolution shocks, rarefaction waves, acoustic waves, vortices, ZND detonation waves, and shock/acoustic waves/vortices interactions. Moreover, since no directional splitting is employed, numerical resolution of two-dimensional calculations is comparable to that of the one-dimensional calculations. Some sample applications displaying the strengths and broad applicability of the CE/SE method are reviewed.

  11. The First Myriapod Genome Sequence Reveals Conservative Arthropod Gene Content and Genome Organisation in the Centipede Strigamia maritima

    PubMed Central

    Chipman, Ariel D.; Ferrier, David E. K.; Brena, Carlo; Qu, Jiaxin; Hughes, Daniel S. T.; Schröder, Reinhard; Torres-Oliva, Montserrat; Znassi, Nadia; Jiang, Huaiyang; Almeida, Francisca C.; Alonso, Claudio R.; Apostolou, Zivkos; Aqrawi, Peshtewani; Arthur, Wallace; Barna, Jennifer C. J.; Blankenburg, Kerstin P.; Brites, Daniela; Capella-Gutiérrez, Salvador; Coyle, Marcus; Dearden, Peter K.; Du Pasquier, Louis; Duncan, Elizabeth J.; Ebert, Dieter; Eibner, Cornelius; Erikson, Galina; Evans, Peter D.; Extavour, Cassandra G.; Francisco, Liezl; Gabaldón, Toni; Gillis, William J.; Goodwin-Horn, Elizabeth A.; Green, Jack E.; Griffiths-Jones, Sam; Grimmelikhuijzen, Cornelis J. P.; Gubbala, Sai; Guigó, Roderic; Han, Yi; Hauser, Frank; Havlak, Paul; Hayden, Luke; Helbing, Sophie; Holder, Michael; Hui, Jerome H. L.; Hunn, Julia P.; Hunnekuhl, Vera S.; Jackson, LaRonda; Javaid, Mehwish; Jhangiani, Shalini N.; Jiggins, Francis M.; Jones, Tamsin E.; Kaiser, Tobias S.; Kalra, Divya; Kenny, Nathan J.; Korchina, Viktoriya; Kovar, Christie L.; Kraus, F. Bernhard; Lapraz, François; Lee, Sandra L.; Lv, Jie; Mandapat, Christigale; Manning, Gerard; Mariotti, Marco; Mata, Robert; Mathew, Tittu; Neumann, Tobias; Newsham, Irene; Ngo, Dinh N.; Ninova, Maria; Okwuonu, Geoffrey; Ongeri, Fiona; Palmer, William J.; Patil, Shobha; Patraquim, Pedro; Pham, Christopher; Pu, Ling-Ling; Putman, Nicholas H.; Rabouille, Catherine; Ramos, Olivia Mendivil; Rhodes, Adelaide C.; Robertson, Helen E.; Robertson, Hugh M.; Ronshaugen, Matthew; Rozas, Julio; Saada, Nehad; Sánchez-Gracia, Alejandro; Scherer, Steven E.; Schurko, Andrew M.; Siggens, Kenneth W.; Simmons, DeNard; Stief, Anna; Stolle, Eckart; Telford, Maximilian J.; Tessmar-Raible, Kristin; Thornton, Rebecca; van der Zee, Maurijn; von Haeseler, Arndt; Williams, James M.; Willis, Judith H.; Wu, Yuanqing; Zou, Xiaoyan; Lawson, Daniel; Muzny, Donna M.; Worley, Kim C.; Gibbs, Richard A.; Akam, Michael; Richards, Stephen

    2014-01-01

    Myriapods (e.g., centipedes and millipedes) display a simple homonomous body plan relative to other arthropods. All members of the class are terrestrial, but they attained terrestriality independently of insects. Myriapoda is the only arthropod class not represented by a sequenced genome. We present an analysis of the genome of the centipede Strigamia maritima. It retains a compact genome that has undergone less gene loss and shuffling than previously sequenced arthropods, and many orthologues of genes conserved from the bilaterian ancestor that have been lost in insects. Our analysis locates many genes in conserved macro-synteny contexts, and many small-scale examples of gene clustering. We describe several examples where S. maritima shows different solutions from insects to similar problems. The insect olfactory receptor gene family is absent from S. maritima, and olfaction in air is likely effected by expansion of other receptor gene families. For some genes S. maritima has evolved paralogues to generate coding sequence diversity, where insects use alternate splicing. This is most striking for the Dscam gene, which in Drosophila generates more than 100,000 alternate splice forms, but in S. maritima is encoded by over 100 paralogues. We see an intriguing linkage between the absence of any known photosensory proteins in a blind organism and the additional absence of canonical circadian clock genes. The phylogenetic position of myriapods allows us to identify where in arthropod phylogeny several particular molecular mechanisms and traits emerged. For example, we conclude that juvenile hormone signalling evolved with the emergence of the exoskeleton in the arthropods and that RR-1 containing cuticle proteins evolved in the lineage leading to Mandibulata. We also identify when various gene expansions and losses occurred. The genome of S. maritima offers us a unique glimpse into the ancestral arthropod genome, while also displaying many adaptations to its specific

  12. Sorting out relationships among the grouse and ptarmigan using intron, mitochondrial, and ultra-conserved element sequences.

    PubMed

    Persons, Nicholas W; Hosner, Peter A; Meiklejohn, Kelly A; Braun, Edward L; Kimball, Rebecca T

    2016-05-01

    The Holarctic phasianid clade of the grouse and ptarmigan has received substantial attention in areas such as evolution of mating systems, display behavior, and population ecology related to their conservation and management as wild game species. There are multiple molecular phylogenetic studies that focus on grouse and ptarmigan. In spite of this, there is little consensus regarding historical relationships, particularly among genera, which has led to unstable and partial taxonomic revisions. We estimated the phylogeny of all currently recognized species using a combination of novel data from seven nuclear loci (largely intron sequences) and published data from one additional autosomal locus, two W-linked loci, and four mitochondrial regions. To explore relationships among genera and assess paraphyly of one genus more rigorously, we then added over 3000 ultra-conserved element (UCE) loci (over 1.7 million bp) gathered using Illumina sequencing. The UCE topology agreed with that of the combined nuclear intron and previously published sequence data with 100% bootstrap support for all relationships. These data strongly support previous studies separating Bonasa from Tetrastes and Dendragapus from Falcipennis. However, the placement of Lagopus differed from previous studies, and we found no support for Falcipennis monophyly. Biogeographic analysis suggests that the ancestors of grouse and ptarmigan were distributed in the New World and subsequently underwent at least four dispersal events between the Old and New Worlds. Divergence time estimates from maternally-inherited and autosomal markers show stark differences across this clade, with divergence time estimates from maternally-inherited markers being nearly half that of the autosomal markers at some nodes, and nearly twice that at other nodes.

  13. The first myriapod genome sequence reveals conservative arthropod gene content and genome organisation in the centipede Strigamia maritima.

    PubMed

    Chipman, Ariel D; Ferrier, David E K; Brena, Carlo; Qu, Jiaxin; Hughes, Daniel S T; Schröder, Reinhard; Torres-Oliva, Montserrat; Znassi, Nadia; Jiang, Huaiyang; Almeida, Francisca C; Alonso, Claudio R; Apostolou, Zivkos; Aqrawi, Peshtewani; Arthur, Wallace; Barna, Jennifer C J; Blankenburg, Kerstin P; Brites, Daniela; Capella-Gutiérrez, Salvador; Coyle, Marcus; Dearden, Peter K; Du Pasquier, Louis; Duncan, Elizabeth J; Ebert, Dieter; Eibner, Cornelius; Erikson, Galina; Evans, Peter D; Extavour, Cassandra G; Francisco, Liezl; Gabaldón, Toni; Gillis, William J; Goodwin-Horn, Elizabeth A; Green, Jack E; Griffiths-Jones, Sam; Grimmelikhuijzen, Cornelis J P; Gubbala, Sai; Guigó, Roderic; Han, Yi; Hauser, Frank; Havlak, Paul; Hayden, Luke; Helbing, Sophie; Holder, Michael; Hui, Jerome H L; Hunn, Julia P; Hunnekuhl, Vera S; Jackson, LaRonda; Javaid, Mehwish; Jhangiani, Shalini N; Jiggins, Francis M; Jones, Tamsin E; Kaiser, Tobias S; Kalra, Divya; Kenny, Nathan J; Korchina, Viktoriya; Kovar, Christie L; Kraus, F Bernhard; Lapraz, François; Lee, Sandra L; Lv, Jie; Mandapat, Christigale; Manning, Gerard; Mariotti, Marco; Mata, Robert; Mathew, Tittu; Neumann, Tobias; Newsham, Irene; Ngo, Dinh N; Ninova, Maria; Okwuonu, Geoffrey; Ongeri, Fiona; Palmer, William J; Patil, Shobha; Patraquim, Pedro; Pham, Christopher; Pu, Ling-Ling; Putman, Nicholas H; Rabouille, Catherine; Ramos, Olivia Mendivil; Rhodes, Adelaide C; Robertson, Helen E; Robertson, Hugh M; Ronshaugen, Matthew; Rozas, Julio; Saada, Nehad; Sánchez-Gracia, Alejandro; Scherer, Steven E; Schurko, Andrew M; Siggens, Kenneth W; Simmons, DeNard; Stief, Anna; Stolle, Eckart; Telford, Maximilian J; Tessmar-Raible, Kristin; Thornton, Rebecca; van der Zee, Maurijn; von Haeseler, Arndt; Williams, James M; Willis, Judith H; Wu, Yuanqing; Zou, Xiaoyan; Lawson, Daniel; Muzny, Donna M; Worley, Kim C; Gibbs, Richard A; Akam, Michael; Richards, Stephen

    2014-11-01

    Myriapods (e.g., centipedes and millipedes) display a simple homonomous body plan relative to other arthropods. All members of the class are terrestrial, but they attained terrestriality independently of insects. Myriapoda is the only arthropod class not represented by a sequenced genome. We present an analysis of the genome of the centipede Strigamia maritima. It retains a compact genome that has undergone less gene loss and shuffling than previously sequenced arthropods, and many orthologues of genes conserved from the bilaterian ancestor that have been lost in insects. Our analysis locates many genes in conserved macro-synteny contexts, and many small-scale examples of gene clustering. We describe several examples where S. maritima shows different solutions from insects to similar problems. The insect olfactory receptor gene family is absent from S. maritima, and olfaction in air is likely effected by expansion of other receptor gene families. For some genes S. maritima has evolved paralogues to generate coding sequence diversity, where insects use alternate splicing. This is most striking for the Dscam gene, which in Drosophila generates more than 100,000 alternate splice forms, but in S. maritima is encoded by over 100 paralogues. We see an intriguing linkage between the absence of any known photosensory proteins in a blind organism and the additional absence of canonical circadian clock genes. The phylogenetic position of myriapods allows us to identify where in arthropod phylogeny several particular molecular mechanisms and traits emerged. For example, we conclude that juvenile hormone signalling evolved with the emergence of the exoskeleton in the arthropods and that RR-1 containing cuticle proteins evolved in the lineage leading to Mandibulata. We also identify when various gene expansions and losses occurred. The genome of S. maritima offers us a unique glimpse into the ancestral arthropod genome, while also displaying many adaptations to its specific

  14. Co-conservation of rRNA tetraloop sequences and helix length suggests involvement of the tetraloops in higher-order interactions

    NASA Technical Reports Server (NTRS)

    Hedenstierna, K. O.; Siefert, J. L.; Fox, G. E.; Murgola, E. J.

    2000-01-01

    Terminal loops containing four nucleotides (tetraloops) are common in structural RNAs, and they frequently conform to one of three sequence motifs, GNRA, UNCG, or CUUG. Here we compare available sequences and secondary structures for rRNAs from bacteria, and we show that helices capped by phylogenetically conserved GNRA loops display a strong tendency to be of conserved length. The simplest interpretation of this correlation is that the conserved GNRA loops are involved in higher-order interactions, intramolecular or intermolecular, resulting in a selective pressure for maintaining the lengths of these helices. A small number of conserved UNCG loops were also found to be associated with conserved length helices, consistent with the possibility that this type of tetraloop also takes part in higher-order interactions.
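
    The three tetraloop families named above are written with IUPAC ambiguity codes (N = any nucleotide, R = purine, i.e. A or G). As a small illustration of how such motifs can be checked programmatically, the sketch below expands them into regular expressions; it is not taken from the study, and the names IUPAC, motif_to_regex and classify_tetraloop are hypothetical.

```python
import re

# IUPAC ambiguity codes used by the three motifs (RNA alphabet).
IUPAC = {"N": "[ACGU]", "R": "[AG]"}


def motif_to_regex(motif: str):
    """Expand an IUPAC motif such as 'GNRA' into a compiled regular expression."""
    return re.compile("".join(IUPAC.get(ch, ch) for ch in motif))


TETRALOOP_FAMILIES = {name: motif_to_regex(name) for name in ("GNRA", "UNCG", "CUUG")}


def classify_tetraloop(loop: str):
    """Return the motif family of a four-nucleotide loop, or None if it fits none."""
    loop = loop.upper().replace("T", "U")  # accept DNA-style input as well
    if len(loop) != 4:
        raise ValueError("a tetraloop has exactly four nucleotides")
    for name, pattern in TETRALOOP_FAMILIES.items():
        if pattern.fullmatch(loop):
            return name
    return None


for loop in ("GAAA", "GCGA", "UUCG", "CUUG", "GAUA"):
    print(loop, classify_tetraloop(loop))
```

    A real analysis would first extract hairpin loops from a secondary-structure annotation; the classification step itself is as simple as shown.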

  15. Evolutionarily conserved sequences of striated muscle myosin heavy chain isoforms. Epitope mapping by cDNA expression.

    PubMed

    Miller, J B; Teal, S B; Stockdale, F E

    1989-08-05

    A cDNA expression strategy was used to localize amino acid sequences which were specific for fast, as opposed to slow, isoforms of the chicken skeletal muscle myosin heavy chain (MHC) and which were conserved in vertebrate evolution. Five monoclonal antibodies (mAbs), termed F18, F27, F30, F47, and F59, were prepared that reacted with all of the known chicken fast MHC isoforms but did not react with any of the known chicken slow nor with smooth muscle MHC isoforms. The epitopes recognized by mAbs F18, F30, F47, and F59 were on the globular head fragment of the MHC, whereas the epitope recognized by mAb F27 was on the helical tail or rod fragment. Reactivity of all five mAbs also was confined to fast MHCs in the rat, with the exception of mAb F59, which also reacted with the beta-cardiac MHC, the single slow MHC isoform common to both the rat heart and skeletal muscle. None of the five epitopes was expressed on amphioxus, nematode, or Dictyostelium MHC. The F27 and F59 epitopes were found on shark, electric ray, goldfish, newt, frog, turtle, chicken, quail, rabbit, and rat MHCs. The epitopes recognized by these mAbs were conserved, therefore, to varying degrees through vertebrate evolution and differed in sequence from homologous regions of a number of invertebrate MHCs and myosin-like proteins. The sequences of those epitopes on the head were mapped using a two-part cDNA expression strategy. First, Bal31 exonuclease digestion was used to rapidly generate fragments of a chicken embryonic fast MHC cDNA that were progressively deleted from the 3' end. These cDNA fragments were expressed as beta-galactosidase/MHC fusion proteins using the pUR290 vector; the fusion proteins were tested by immunoblotting for reactivity with the mAbs; and the approximate locations of the epitopes were determined from the sizes of the cDNA fragments that encoded a particular epitope. The epitopes were then precisely mapped by expression of overlapping cDNA fragments of known sequence that

  16. Molecular polymorphisms associated with host range in the highly conserved genomes of burrowing nematodes, Radopholus spp.

    PubMed

    Kaplan, D T; Vanderspool, M C; Garrett, C; Chang, S; Opperman, C H

    1996-01-01

    Six polymorphic bands of DNA were amplified from purified Radopholus citrophilus genomic DNA from one strain of each of the sibling species R. citrophilus and R. similis in random amplified polymorphic DNA analyses involving 380 single 10-base primers. Four of these polymorphic DNA fragments were successfully cloned and amplified through subsequent use of primers designed to complement the terminal sequences of the polymorphic DNA. Results of ensuing studies using mini-prepped DNA from 14 burrowing nematode strains collected from Florida, Hawaii, and Central America, characterized for their ability to parasitize citrus, indicated that a 2.4-kb fragment appeared to be associated with citrus parasitism in burrowing nematode populations from Florida. However, a fragment of comparable size was also detected in R. citrophilus from Hawaii and from burrowing nematode populations collected from Belize and Puerto Rico. Overall, findings suggest that the genome organization of the burrowing nematode sibling species R. citrophilus and R. similis is highly conserved. This remarkable genetic similarity should facilitate identification of genetic sequence related to important phenotypes such as citrus parasitism. Detection of R. citrophilus-specific DNA fragments in burrowing nematodes collected from Belize and Puerto Rico suggests that R. citrophilus is resident in some Central American countries.

  17. Horse domestication and conservation genetics of Przewalski's horse inferred from sex chromosomal and autosomal sequences.

    PubMed

    Lau, Allison N; Peng, Lei; Goto, Hiroki; Chemnick, Leona; Ryder, Oliver A; Makova, Kateryna D

    2009-01-01

    Despite their ability to interbreed and produce fertile offspring, there is continued disagreement about the genetic relationship of the domestic horse (Equus caballus) to its endangered wild relative, Przewalski's horse (Equus przewalskii). Analyses have differed as to whether or not Przewalski's horse is placed phylogenetically as a separate sister group to domestic horses. Because Przewalski's horse and domestic horse are so closely related, genetic data can also be used to infer domestication-specific differences between the two. To investigate the genetic relationship of Przewalski's horse to the domestic horse and to address whether evolution of the domestic horse is driven by males or females, five homologous introns (a total of approximately 3 kb) were sequenced on the X and Y chromosomes in two Przewalski's horses and three breeds of domestic horses: Arabian horse, Mongolian domestic horse, and Dartmoor pony. Five autosomal introns (a total of approximately 6 kb) were sequenced for these horses as well. The sequences of sex chromosomal and autosomal introns were used to determine nucleotide diversity and the forces driving evolution in these species. As a result, X chromosomal and autosomal data do not place Przewalski's horses in a separate clade within phylogenetic trees for horses, suggesting a close relationship between domestic and Przewalski's horses. It was also found that there was a lack of nucleotide diversity on the Y chromosome and higher nucleotide diversity than expected on the X chromosome in domestic horses as compared with the Y chromosome and autosomes. This supports the hypothesis that very few male horses along with numerous female horses founded the various domestic horse breeds. Patterns of nucleotide diversity among different types of chromosomes were distinct for Przewalski's in contrast to domestic horses, supporting unique evolutionary histories of the two species.
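
    Nucleotide diversity, the statistic compared across chromosome types above, is commonly defined as the average number of differences per site between all pairs of sequences in a sample. The sketch below computes that quantity for a toy alignment; it assumes equal-length, gap-free aligned sequences, uses made-up data, and is not the software used in the study. The function name nucleotide_diversity is hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(aligned_seqs: list) -> float:
    """Average pairwise differences per site over all pairs of aligned sequences."""
    n = len(aligned_seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned_seqs[0])
    if length == 0 or any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    total_differences = sum(
        sum(a != b for a, b in zip(s1, s2))
        for s1, s2 in combinations(aligned_seqs, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_differences / (n_pairs * length)


# Toy alignment (hypothetical): the third sequence differs at one site out of ten.
toy = ["ACGTACGTAC", "ACGTACGTAC", "ACGTTCGTAC"]
print(nucleotide_diversity(toy))
```

    In the toy alignment two of the three pairs differ at one of ten sites, so the value is 2 / (3 × 10); a sample with no variable sites, as reported for the Y chromosome, gives zero.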

  18. High-throughput sequencing in veterinary infection biology and diagnostics.

    PubMed

    Belák, S; Karlsson, O E; Leijon, M; Granberg, F

    2013-12-01

    Sequencing methods have improved rapidly since the first versions of the Sanger techniques, facilitating the development of very powerful tools for detecting and identifying various pathogens, such as viruses, bacteria and other microbes. The ongoing development of high-throughput sequencing (HTS; also known as next-generation sequencing) technologies has resulted in a dramatic reduction in DNA sequencing costs, making the technology more accessible to the average laboratory. In this White Paper of the World Organisation for Animal Health (OIE) Collaborating Centre for the Biotechnology-based Diagnosis of Infectious Diseases in Veterinary Medicine (Uppsala, Sweden), several approaches and examples of HTS are summarised, and their diagnostic applicability is briefly discussed. Selected future aspects of HTS are outlined, including the need for bioinformatic resources, with a focus on improving the diagnosis and control of infectious diseases in veterinary medicine.

  19. High conservation of a 5' element required for RNA editing of a C target in chloroplast psbE transcripts.

    PubMed

    Hayes, Michael L; Hanson, Maureen R

    2008-09-01

    C-to-U editing modifies 30-40 distinct nucleotides within higher-plant chloroplast transcripts. Many C targets are located at the same position in homologous genes from different plants; these either could have emerged independently or could share a common origin. The 5' sequence GCCGUU, required for editing of C214 in tobacco psbE in vitro, is one of the few identified editing cis-elements. We investigated psbE sequences from many plant species to determine in what lineage(s) editing of psbE C214 emerged and whether the cis-element identified in tobacco is conserved in plants with a C214. The GCCGUU sequence is present at a high frequency in plants that carry a C214 in psbE. However, Sciadopitys verticillata (Pinophyta) edits C214 despite the presence of nucleotide differences compared to the conserved cis-element. The C214 site in psbE genes is represented in members of four branches of spermatophytes but not in gnetophytes, resulting in the parsimonious prediction that editing of psbE C214 was present in the ancestor of spermatophytes. Extracts from chloroplasts from a species that has a difference in the motif and lacks the C target are incapable of editing tobacco psbE C214 substrates, implying that the critical trans-acting protein factors were not retained without a C target. Because noncoding sequences are less constrained than coding regions, we analyzed sequences 5' to two C editing targets located within coding regions to search for possible editing-related conserved elements. Putative editing cis-elements were uncovered in the 5' UTRs near editing sites psbL C2 and ndhD C2.

  20. Canine Polydactyl Mutations With Heterogeneous Origin in the Conserved Intronic Sequence of LMBR1

    PubMed Central

    Park, Kiyun; Kang, Joohyun; Subedi, Krishna Pd.; Ha, Ji-Hong; Park, Chankyu

    2008-01-01

    Canine preaxial polydactyly (PPD) in the hind limb is a developmental trait that restores the first digit lost during canine evolution. Using a linkage analysis, we previously demonstrated that the affected gene in a Korean breed is located on canine chromosome 16. The candidate locus was further limited to a linkage disequilibrium (LD) block of <213 kb composing the single gene, LMBR1, by LD mapping with single nucleotide polymorphisms (SNPs) for affected individuals from both Korean and Western breeds. The ZPA regulatory sequence (ZRS) in intron 5 of LMBR1 was implicated in mammalian polydactyly. An analysis of the LD haplotypes around the ZRS for various dog breeds revealed that only a subset is assigned to Western breeds. Furthermore, two distinct affected haplotypes for Asian and Western breeds were found, each containing different single-base changes in the upstream sequence (pZRS) of the ZRS. Unlike the previously characterized cases of PPD identified in the mouse and human ZRS regions, the canine mutations in pZRS lacked the ectopic expression of sonic hedgehog in the anterior limb bud, distinguishing its role in limb development from that of the ZRS. PMID:18689889

  1. Mitochondrial genome sequences of Artemia tibetiana and Artemia urmiana: assessing molecular changes for high plateau adaptation.

    PubMed

    Zhang, Hangxiao; Luo, Qibin; Sun, Jing; Liu, Fei; Wu, Gang; Yu, Jun; Wang, Weiwei

    2013-05-01

    Brine shrimps, Artemia (Crustacea, Anostraca), inhabit hypersaline environments and have a broad geographical distribution from sea level to high plateaus. Artemia therefore possess significant genetic diversity, which gives them their outstanding adaptability. To understand this remarkable plasticity, we sequenced the mitochondrial genomes of two Artemia tibetiana isolates from the Tibetan Plateau in China and one Artemia urmiana isolate from Lake Urmia in Iran and compared them with the genome of a low-altitude Artemia, A. franciscana. We compared the ratio of the rate of nonsynonymous (Ka) and synonymous (Ks) substitutions (Ka/Ks ratio) in the mitochondrial protein-coding gene sequences and found that atp8 had the highest Ka/Ks ratios in comparisons of A. franciscana with either A. tibetiana or A. urmiana and that atp6 had the highest Ka/Ks ratio between A. tibetiana and A. urmiana. Atp6 may have experienced strong selective pressure for high-altitude adaptation because although A. tibetiana and A. urmiana are closely related they live at different altitudes. We identified two extended termination-associated sequences and three conserved sequence blocks in the D-loop region of the mitochondrial genomes. We propose that sequence variations in the D-loop region and in the subunits of the respiratory chain complexes independently or collectively contribute to the adaptation of Artemia to different altitudes.
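
    The Ka/Ks comparison above amounts to dividing each gene's nonsynonymous rate by its synonymous rate and ranking the genes by that ratio. The sketch below assumes Ka and Ks have already been estimated by some other tool; the gene labels and rate values are hypothetical placeholders, not results from the paper.

```python
def ka_ks_ratio(ka, ks):
    """Return Ka/Ks, or None when Ks is zero and the ratio is undefined."""
    if ks == 0:
        return None
    return ka / ks


def rank_by_ka_ks(rates):
    """Sort genes from highest to lowest Ka/Ks, skipping undefined ratios.

    `rates` maps a gene label to a (Ka, Ks) tuple.
    """
    ratios = {gene: ka_ks_ratio(ka, ks) for gene, (ka, ks) in rates.items()}
    defined = [(gene, r) for gene, r in ratios.items() if r is not None]
    return sorted(defined, key=lambda item: item[1], reverse=True)


# Hypothetical (Ka, Ks) values for illustration only.
toy_rates = {
    "gene_a": (0.12, 0.40),
    "gene_b": (0.05, 0.50),
    "gene_c": (0.01, 0.00),  # Ks of zero: ratio undefined, so it is skipped
}
ranking = rank_by_ka_ks(toy_rates)
```

    A ratio below 1 is usually read as purifying selection and a ratio above 1 as a possible sign of positive selection, so the top-ranked genes are the ones to inspect first.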

  2. Exome Sequence Analysis of 14 Families With High Myopia

    PubMed Central

    Kloss, Bethany A.; Tompson, Stuart W.; Whisenhunt, Kristina N.; Quow, Krystina L.; Huang, Samuel J.; Pavelec, Derek M.; Rosenberg, Thomas; Young, Terri L.

    2017-01-01

    Purpose: To identify causal gene mutations in 14 families with autosomal dominant (AD) high myopia using exome sequencing. Methods: Select individuals from 14 large Caucasian families with high myopia were exome sequenced. Gene variants were filtered to identify potential pathogenic changes. Sanger sequencing was used to confirm variants in original DNA, and to test for disease cosegregation in additional family members. Candidate genes and chromosomal loci previously associated with myopic refractive error and its endophenotypes were comprehensively screened. Results: In 14 high myopia families, we identified 73 rare and 31 novel gene variants as candidates for pathogenicity. In seven of these families, two of the novel and eight of the rare variants were within known myopia loci. A total of 104 heterozygous nonsynonymous rare variants in 104 genes were identified in 10 out of 14 probands. Each variant cosegregated with affection status. No rare variants were identified in genes known to cause myopia or in genes closest to published genome-wide association study association signals for refractive error or its endophenotypes. Conclusions: Whole exome sequencing was performed to determine gene variants implicated in the pathogenesis of AD high myopia. This study provides new genes for consideration in the pathogenesis of high myopia, and may aid in the development of genetic profiling of those at greatest risk for attendant ocular morbidities of this disorder. PMID:28384719

  3. Assessment of selected conservation measures for high-temperature process industries

    SciTech Connect

    Kusik, C L; Parameswaran, K; Nadkarni, R; O'Neill, J K; Malhotra, S; Hyde, R; Kinneberg, D; Fox, L; Rossetti, M

    1981-01-01

    Energy conservation projects involving high-temperature processes in various stages of development are assessed to quantify their energy conservation potential; to determine their present status of development; to identify their research and development needs and estimate the associated costs; and to determine the most effective role for the Federal government in developing these technologies. The program analyzed 25 energy conserving processes in the iron and steel, aluminium, copper, magnesium, cement, and glassmaking industries. A preliminary list of other potential energy conservation projects in these industries is also presented in the appendix. (MCW)

  4. Structural sequences are conserved in the genes coding for the alpha, alpha' and beta-subunits of the soybean 7S seed storage protein.

    PubMed Central

    Schuler, M A; Ladin, B F; Pollaco, J C; Freyer, G; Beachy, R N

    1982-01-01

    Cloned DNAs encoding four different proteins have been isolated from recombinant cDNA libraries constructed with Glycine max seed mRNAs. Two cloned DNAs code for the alpha and alpha'-subunits of the 7S seed storage protein (conglycinin). The other cloned cDNAs code for proteins which are synthesized in vitro as 68,000 d., 60,000 d. or 53,000 d. polypeptides. Hybrid selection experiments indicate that, under low stringency hybridization conditions, all four cDNAs hybridize with mRNAs for the alpha and alpha'-subunits and the 68,000 d., 60,000 d. and 53,000 d. in vitro translation products. Within three of the mRNAs, there is a conserved sequence of 155 nucleotides which is responsible for this hybridization. The conserved nucleotides in the alpha and alpha'-subunit cDNAs and the 68,000 d. polypeptide cDNAs span both coding and noncoding sequences. The differences in the coding nucleotides outside the conserved region are extensive. This suggests that selective pressure to maintain the 155 conserved nucleotides has been influenced by the structure of the seed mRNA. RNA blot hybridizations demonstrate that mRNA encoding the other major subunit (beta) of the 7S seed storage protein also shares sequence homology with the conserved 155 nucleotide sequence of the alpha and alpha'-subunit mRNAs, but not with other coding sequences. PMID:6897678

  5. The Number, Organization, and Size of Polymorphic Membrane Protein Coding Sequences as well as the Most Conserved Pmp Protein Differ within and across Chlamydia Species.

    PubMed

    Van Lent, Sarah; Creasy, Heather Huot; Myers, Garry S A; Vanrompay, Daisy

    2016-01-01

    Variation is a central trait of the polymorphic membrane protein (Pmp) family. The number of pmp coding sequences differs between Chlamydia species, but it is unknown whether the number of pmp coding sequences is constant within a Chlamydia species. The level of conservation of the Pmp proteins has previously only been determined for Chlamydia trachomatis. As different Pmp proteins might be indispensable for the pathogenesis of different Chlamydia species, this study investigated the conservation of Pmp proteins both within and across C. trachomatis, C. pneumoniae, C. abortus, and C. psittaci. The pmp coding sequences were annotated in 16 C. trachomatis, 6 C. pneumoniae, 2 C. abortus, and 16 C. psittaci genomes. The number and organization of polymorphic membrane coding sequences differed within and across the analyzed Chlamydia species. The length of coding sequences of pmpA, pmpB, and pmpH was conserved among all analyzed genomes, while the length of pmpE/F and pmpG, and remarkably also of the subtype pmpD, differed among the analyzed genomes. PmpD, PmpA, PmpH, and PmpA were the most conserved Pmp in C. trachomatis, C. pneumoniae, C. abortus, and C. psittaci, respectively. PmpB was the most conserved Pmp across the 4 analyzed Chlamydia species.

  6. 7 CFR 1430.225 - Violations of highly erodible land and wetland conservation provisions.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 7 Agriculture 10 2011-01-01 2011-01-01 false Violations of highly erodible land and wetland conservation provisions. 1430.225 Section 1430.225 Agriculture Regulations of the Department of Agriculture... wetland conservation provisions. The provisions of part 12 of this title apply to this part....

  7. 7 CFR 1412.68 - Compliance with highly erodible land and wetland conservation provisions.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... 7 Agriculture 10 2013-01-01 2013-01-01 false Compliance with highly erodible land and wetland conservation provisions. 1412.68 Section 1412.68 Agriculture Regulations of the Department of Agriculture... and wetland conservation provisions. The provisions of part 12 of this title apply to this part....

  8. 7 CFR 1412.68 - Compliance with highly erodible land and wetland conservation provisions.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... 7 Agriculture 10 2014-01-01 2014-01-01 false Compliance with highly erodible land and wetland conservation provisions. 1412.68 Section 1412.68 Agriculture Regulations of the Department of Agriculture... and wetland conservation provisions. The provisions of part 12 of this title apply to this part....

  9. 7 CFR 1430.225 - Violations of highly erodible land and wetland conservation provisions.

    Code of Federal Regulations, 2014 CFR

    2014-01-01

    ... 7 Agriculture 10 2014-01-01 2014-01-01 false Violations of highly erodible land and wetland conservation provisions. 1430.225 Section 1430.225 Agriculture Regulations of the Department of Agriculture... wetland conservation provisions. The provisions of part 12 of this title apply to this part....

  10. 7 CFR 1412.68 - Compliance with highly erodible land and wetland conservation provisions.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 7 Agriculture 10 2010-01-01 2010-01-01 false Compliance with highly erodible land and wetland conservation provisions. 1412.68 Section 1412.68 Agriculture Regulations of the Department of Agriculture... and wetland conservation provisions. The provisions of part 12 of this title apply to this part....

  11. 7 CFR 1412.68 - Compliance with highly erodible land and wetland conservation provisions.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... 7 Agriculture 10 2011-01-01 2011-01-01 false Compliance with highly erodible land and wetland conservation provisions. 1412.68 Section 1412.68 Agriculture Regulations of the Department of Agriculture... and wetland conservation provisions. The provisions of part 12 of this title apply to this part....

  12. 7 CFR 1412.68 - Compliance with highly erodible land and wetland conservation provisions.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... 7 Agriculture 10 2012-01-01 2012-01-01 false Compliance with highly erodible land and wetland conservation provisions. 1412.68 Section 1412.68 Agriculture Regulations of the Department of Agriculture... and wetland conservation provisions. The provisions of part 12 of this title apply to this part....

  13. 7 CFR 1430.225 - Violations of highly erodible land and wetland conservation provisions.

    Code of Federal Regulations, 2010 CFR

    2010-01-01

    ... 7 Agriculture 10 2010-01-01 2010-01-01 false Violations of highly erodible land and wetland conservation provisions. 1430.225 Section 1430.225 Agriculture Regulations of the Department of Agriculture... wetland conservation provisions. The provisions of part 12 of this title apply to this part....

  14. 7 CFR 1430.225 - Violations of highly erodible land and wetland conservation provisions.

    Code of Federal Regulations, 2012 CFR

    2012-01-01

    ... 7 Agriculture 10 2012-01-01 2012-01-01 false Violations of highly erodible land and wetland conservation provisions. 1430.225 Section 1430.225 Agriculture Regulations of the Department of Agriculture... wetland conservation provisions. The provisions of part 12 of this title apply to this part....

  15. 7 CFR 1430.225 - Violations of highly erodible land and wetland conservation provisions.

    Code of Federal Regulations, 2013 CFR

    2013-01-01

    ... 7 Agriculture 10 2013-01-01 2013-01-01 false Violations of highly erodible land and wetland conservation provisions. 1430.225 Section 1430.225 Agriculture Regulations of the Department of Agriculture... wetland conservation provisions. The provisions of part 12 of this title apply to this part....

  16. 78 FR 20503 - Energy Conservation Program: Availability of the Interim Technical Support Document for High...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2013-04-05

    ... CFR Part 431 RIN 1904-AC36 Energy Conservation Program: Availability of the Interim Technical Support... interim technical support document (TSD) for high-intensity discharge (HID) lamps energy conservation... submitting comments on the interim TSD or any other aspect of the rulemaking for HID lamps. The...

  17. CpG methylation differences between neurons and glia are highly conserved from mouse to human.

    PubMed

    Kessler, Noah J; Van Baak, Timothy E; Baker, Maria S; Laritsky, Eleonora; Coarfa, Cristian; Waterland, Robert A

    2016-01-15

    Understanding epigenetic differences that distinguish neurons and glia is of fundamental importance to the nascent field of neuroepigenetics. A recent study used genome-wide bisulfite sequencing to survey differences in DNA methylation between these two cell types, in both humans and mice. That study minimized the importance of cell type-specific differences in CpG methylation, claiming these are restricted to localized genomic regions, and instead emphasized that widespread and highly conserved differences in non-CpG methylation distinguish neurons and glia. We reanalyzed the data from that study and came to markedly different conclusions. In particular, we found widespread cell type-specific differences in CpG methylation, with a genome-wide tendency for neuronal CpG-hypermethylation punctuated by regions of glia-specific hypermethylation. Alarmingly, our analysis indicated that the majority of genes identified by the primary study as exhibiting cell type-specific CpG methylation differences were misclassified. To verify the accuracy of our analysis, we isolated neuronal and glial DNA from mouse cortex and performed quantitative bisulfite pyrosequencing at nine loci. The pyrosequencing results corroborated our analysis, without exception. Most interestingly, we found that gene-associated neuron vs. glia CpG methylation differences are highly conserved across human and mouse, and are very likely to be functional. In addition to underscoring the importance of independent verification to confirm the conclusions of genome-wide epigenetic analyses, our data indicate that CpG methylation plays a major role in neuroepigenetics, and that the mouse is likely an excellent model in which to study the role of DNA methylation in human neurodevelopment and disease.

  18. Complete mitochondrial DNA sequence of the endangered giant sable antelope (Hippotragus niger variani): insights into conservation and taxonomy.

    PubMed

    Espregueira Themudo, Gonçalo; Rufino, Ana C; Campos, Paula F

    2015-02-01

    The giant sable antelope is one of the most endangered African bovids. Populations of this iconic animal, the national symbol of Angola, were recently rediscovered, after many decades of presumed extinction. Even so, their numbers are scarce and hence conservation plans are essential. However, fundamental information such as its taxonomic position, time of divergence and degree of genetic variation is still lacking. Here, we used a museum-preserved horn as a source of DNA to describe, for the first time, the complete mitochondrial genome of the giant sable antelope, and provide insights into its evolutionary history. Reads generated by shotgun sequencing were mapped against the mitochondrial genome of common sable antelope and the nuclear genomes of cow and sheep. Phylogenetic reconstruction and divergence time estimate give support to the monophyly of the giant sable and a maximum divergence time of 170 thousand years to the closest subspecies. About 7% of the nuclear genome was mapped against the reference. The genetic resources reported here are now available for future work in the field of conservation genetics and phylogeny, in this and related species.

  19. An improved high throughput sequencing method for studying oomycete communities.

    PubMed

    Sapkota, Rumakanta; Nicolaisen, Mogens

    2015-03-01

    Culture-independent studies using next generation sequencing have revolutionized microbial ecology; however, oomycete ecology in soils is severely lagging behind. The aim of this study was to improve and validate standard techniques for using high throughput sequencing as a tool for studying oomycete communities. The well-known primer sets ITS4, ITS6 and ITS7 were used in the study in a semi-nested PCR approach to target the internal transcribed spacer (ITS) 1 of ribosomal DNA in a next generation sequencing protocol. These primers have been used in similar studies before, but with limited success. We were able to increase the proportion of retrieved oomycete sequences dramatically, mainly by increasing the annealing temperature during PCR. The optimized protocol was validated using three mock communities and the method was further evaluated using total DNA from 26 soil samples collected from different agricultural fields in Denmark, and 11 samples from carrot tissue with symptoms of Pythium infection. Sequence data from the Pythium and Phytophthora mock communities showed that our strategy successfully detected all included species. Taxonomic assignments of OTUs from 26 soil samples showed that 95% of the sequences could be assigned to oomycetes including Pythium, Aphanomyces, Peronospora, Saprolegnia and Phytophthora. A high proportion of oomycete reads was consistently present in all 26 soil samples, showing the versatility of the strategy. A large diversity of Pythium species, including pathogenic and saprophytic species, dominated in cultivated soil. Finally, we analyzed amplicons from carrots with symptoms of cavity spot. This resulted in 94% of the reads belonging to oomycetes with a dominance of species of Pythium that are known to be involved in causing cavity spot, thus demonstrating the usefulness of the method not only in soil DNA but also in a plant DNA background. In conclusion, we demonstrate a successful approach for pyrosequencing of oomycete

  20. Binary interactions with high accretion rates onto main sequence stars

    NASA Astrophysics Data System (ADS)

    Shiber, Sagiv; Schreier, Ron; Soker, Noam

    2016-07-01

    Energetic outflows from main sequence stars accreting mass at very high rates might account for the powering of some eruptive objects, such as merging main sequence stars, major eruptions of luminous blue variables, e.g., the Great Eruption of Eta Carinae, and other intermediate luminosity optical transients (ILOTs; red novae; red transients). These powerful outflows could potentially also supply the extra energy required in the common envelope process and in the grazing envelope evolution of binary systems. We propose that a massive outflow/jets mediated by magnetic fields might remove energy and angular momentum from the accretion disk to allow such high accretion rate flows. By examining the possible activity of the magnetic fields of accretion disks, we conclude that indeed main sequence stars might accrete mass at very high rates, up to ≈ 10^-2 M⊙ yr^-1 for solar type stars, and up to ≈ 1 M⊙ yr^-1 for very massive stars. We speculate that magnetic fields amplified in such extreme conditions might lead to the formation of massive bipolar outflows that can remove most of the disk's energy and angular momentum. It is this energy and angular momentum removal that allows the very high mass accretion rate onto main sequence stars.

  1. High-utility conserved avian microsatellite markers enable parentage and population studies across a wide range of species

    PubMed Central

    2013-01-01

    Background: Microsatellites are widely used for many genetic studies. In contrast to single nucleotide polymorphism (SNP) and genotyping-by-sequencing methods, they are readily typed in samples of low DNA quality/concentration (e.g. museum/non-invasive samples), and enable the quick, cheap identification of species, hybrids, clones and ploidy. Microsatellites also have the highest cross-species utility of all types of markers used for genotyping, but, despite this, when isolated from a single species, only a relatively small proportion will be of utility. Marker development of any type requires skill and time. The availability of sufficient “off-the-shelf” markers that are suitable for genotyping a wide range of species would not only save resources but also uniquely enable new comparisons of diversity among taxa at the same set of loci. No other marker types are capable of enabling this. We therefore developed a set of avian microsatellite markers with enhanced cross-species utility. Results: We selected highly-conserved sequences with a high number of repeat units in both of two genetically distant species. Twenty-four primer sets were designed from homologous sequences that possessed at least eight repeat units in both the zebra finch (Taeniopygia guttata) and chicken (Gallus gallus). Each primer sequence was a complete match to zebra finch and, after accounting for degenerate bases, at least 86% similar to chicken. We assessed primer-set utility by genotyping individuals belonging to eight passerine and four non-passerine species. The majority of the new Conserved Avian Microsatellite (CAM) markers amplified in all 12 species tested (on average, 94% in passerines and 95% in non-passerines). This new marker set is of especially high utility in passerines, with a mean 68% of loci polymorphic per species, compared with 42% in non-passerine species. Conclusions: When combined with previously described conserved loci, this new set of conserved markers will not only
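
    The selection criterion described above (loci with at least eight repeat units) can be illustrated with a simple tandem-repeat scan. This is a rough sketch, not the authors' pipeline: it assumes perfect, uninterrupted repeats of a fixed motif length in an uppercase DNA string, and the function name and test sequence are hypothetical.

```python
import re


def find_tandem_repeats(seq, motif_len=2, min_units=8):
    """Return (start, motif, unit_count) for perfect tandem repeats.

    Assumptions: `seq` is uppercase A/C/G/T only, repeats are uninterrupted,
    and homopolymer runs (e.g. AAAA...) are not of interest.
    """
    # One captured motif followed by at least (min_units - 1) further copies.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_units - 1))
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs
        units = len(match.group(0)) // motif_len
        hits.append((match.start(), motif, units))
    return hits


# Hypothetical sequence: nine CA units flanked by unrelated bases.
toy_locus = "GGT" + "CA" * 9 + "TTG"
repeats = find_tandem_repeats(toy_locus)
```

    Running the same scan on the homologous region of a second species and keeping only loci that pass in both would mirror the two-species requirement in the abstract.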

  2. Identification and characterization of flowering genes in kiwifruit: sequence conservation and role in kiwifruit flower development

    PubMed Central

    2011-01-01

    Background: Flower development in kiwifruit (Actinidia spp.) is initiated in the first growing season, when undifferentiated primordia are established in latent shoot buds. These primordia can differentiate into flowers in the second growing season, after the winter dormancy period and upon accumulation of adequate winter chilling. Kiwifruit is an important horticultural crop, yet little is known about the molecular regulation of flower development. Results: To study kiwifruit flower development, nine MADS-box genes were identified and functionally characterized. Protein sequence alignment, phenotypes obtained upon overexpression in Arabidopsis and expression patterns suggest that the identified genes are required for floral meristem and floral organ specification. Their role during budbreak and flower development was studied. A spontaneous kiwifruit mutant was utilized to correlate the extended expression domains of these flowering genes with abnormal floral development. Conclusions: This study provides a description of flower development in kiwifruit at the molecular level. It has identified markers for flower development, and candidates for manipulation of kiwifruit growth, phase change and time of flowering. The expression in normal and aberrant flowers provided a model for kiwifruit flower development. PMID:21521532

  3. Sequence-Based Screening for Rare Enzymes: New Insights into the World of AMDases Reveal a Conserved Motif and 58 Novel Enzymes Clustering in Eight Distinct Families

    PubMed Central

    Maimanakos, Janine; Chow, Jennifer; Gaßmeyer, Sarah K.; Güllert, Simon; Busch, Florian; Kourist, Robert; Streit, Wolfgang R.

    2016-01-01

    Arylmalonate Decarboxylases (AMDases, EC 4.1.1.76) are very rare and mostly underexplored enzymes. Currently only four known and biochemically characterized representatives exist. However, their ability to decarboxylate α-disubstituted malonic acid derivatives to optically pure products without cofactors makes them attractive and promising candidates for the use as biocatalysts in industrial processes. Until now, AMDases could not be separated from other members of the aspartate/glutamate racemase superfamily based on their gene sequences. Within this work, a search algorithm was developed that enables a reliable prediction of AMDase activity for potential candidates. Based on specific sequence patterns and screening methods 58 novel AMDase candidate genes could be identified in this work. Thereby, AMDases with the conserved sequence pattern of Bordetella bronchiseptica’s prototype appeared to be limited to the classes of Alpha-, Beta-, and Gamma-proteobacteria. Amino acid homologies and comparison of gene surrounding sequences enabled the classification of eight enzyme clusters. Particularly striking is the accumulation of genes coding for different transporters of the tripartite tricarboxylate transporters family, TRAP transporters and ABC transporters as well as genes coding for mandelate racemases/muconate lactonizing enzymes that might be involved in substrate uptake or degradation of AMDase products. Further, three novel AMDases were characterized which showed a high enantiomeric excess (>99%) of the (R)-enantiomer of flurbiprofen. These are the recombinant AmdA and AmdV from Variovorax sp. strains HH01 and HH02, originated from soil, and AmdP from Polymorphum gilvum found by a data base search. Altogether our findings give new insights into the class of AMDases and reveal many previously unknown enzyme candidates with high potential for bioindustrial processes. PMID:27610105

  4. High Throughput Plasmid Sequencing with Illumina and CLC Bio (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)

    ScienceCinema

    Athavale, Ajay [Monsanto]

    2016-07-12

    Ajay Athavale (Monsanto) presents "High Throughput Plasmid Sequencing with Illumina and CLC Bio" at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June, 2012 in Santa Fe, NM.

  5. High Throughput Plasmid Sequencing with Illumina and CLC Bio (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)

    SciTech Connect

    Athavale, Ajay

    2012-06-01

    Ajay Athavale (Monsanto) presents "High Throughput Plasmid Sequencing with Illumina and CLC Bio" at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June, 2012 in Santa Fe, NM.

  6. Molecular signatures (conserved indels) in protein sequences that are specific for the order Pasteurellales and distinguish two of its main clades.

    PubMed

    Naushad, Hafiz Sohail; Gupta, Radhey S

    2012-01-01

    The members of the order Pasteurellales are currently distinguished primarily on the basis of their branching in the rRNA trees and no convincing biochemical or molecular markers are known that distinguish them from all other bacteria. The genome sequences for 20 Pasteurellaceae species/strains are now publicly available. We report here detailed analyses of protein sequences from these genomes to identify conserved signature indels (CSIs) that are specific for either all Pasteurellales or its major clades. We describe more than 23 CSIs in widely distributed genes/proteins that are uniquely shared by all sequenced Pasteurellaceae species/strains but are not found in any other bacteria. Twenty-one additional CSIs are also specific for the Pasteurellales, except that in some of these cases homologues were not detected in a few species or the CSI was also present in an isolated non-Pasteurellaceae species. The sequenced Pasteurellaceae species formed two distinct clades in a phylogenetic tree based upon concatenated sequences for 10 conserved proteins. The first of these clades, consisting of Aggregatibacter, Pasteurella, Actinobacillus succinogenes, Mannheimia succiniciproducens, Haemophilus influenzae and Haemophilus somnus, was also independently supported by 13 uniquely shared CSIs that are not present in other Pasteurellaceae species or other bacteria. Another clade consisting of the remaining Pasteurellaceae species (viz. Actinobacillus pleuropneumoniae, Actinobacillus minor, Haemophilus ducreyi, Mannheimia haemolytica and Haemophilus parasuis) was also strongly and independently supported by nine CSIs that are uniquely present in these bacteria. The order Pasteurellales is presently made up of a single family, Pasteurellaceae, that encompasses all of its genera. In this context, our identification of two distinct clades within the Pasteurellales, which are supported by both phylogenetic analyses and by multiple highly specific molecular markers, strongly argues for and

  7. Genome sequence of ground tit Pseudopodoces humilis and its adaptation to high altitude

    PubMed Central

    2013-01-01

    Background The mechanism of high-altitude adaptation has been studied in certain mammals. However, in avian species like the ground tit Pseudopodoces humilis, the adaptation mechanism remains unclear. The phylogeny of the ground tit is also controversial. Results Using next generation sequencing technology, we generated and assembled a draft genome sequence of the ground tit. The assembly contained 1.04 Gb of sequence that covered 95.4% of the whole genome and had higher N50 values, at the level of both scaffolds and contigs, than other sequenced avian genomes. About 1.7 million SNPs were detected, 16,998 protein-coding genes were predicted and 7% of the genome was identified as repeat sequences. Comparisons between the ground tit genome and other avian genomes revealed a conserved genome structure and confirmed the phylogeny of ground tit as not belonging to the Corvidae family. Gene family expansion and positively selected gene analysis revealed genes that were related to cardiac function. Our findings contribute to our understanding of the adaptation of this species to extreme environmental living conditions. Conclusions Our data and analysis contribute to the study of avian evolutionary history and provide new insights into the adaptation mechanisms to extreme conditions in animals. PMID:23537097

  8. cDNA cloning and sequencing of human fibrillarin, a conserved nucleolar protein recognized by autoimmune antisera

    SciTech Connect

    Aris, J.P.; Blobel, G.

    1991-02-01

    The authors have isolated a 1.1-kilobase cDNA clone that encodes human fibrillarin by screening a hepatoma library in parallel with DNA probes derived from the fibrillarin genes of Saccharomyces cerevisiae (NOP1) and Xenopus laevis. RNA blot analysis indicates that the corresponding mRNA is approximately 1,300 nucleotides in length. Human fibrillarin expressed in vitro migrates on SDS gels as a 36-kDa protein that is specifically immunoprecipitated by antisera from humans with scleroderma autoimmune disease. Human fibrillarin contains an amino-terminal repetitive domain approximately 75-80 amino acids in length that is rich in glycine and arginine residues and is similar to amino-terminal domains in the yeast and Xenopus fibrillarins. The occurrence of a putative RNA-binding domain and an RNP consensus sequence within the protein is consistent with the association of fibrillarin with small nucleolar RNAs. Protein sequence alignments show that 67% of amino acids from human fibrillarin are identical to those in yeast fibrillarin and that 81% are identical to those in Xenopus fibrillarin. This identity suggests the evolutionary conservation of an important function early in the pathway for ribosome biosynthesis.

  9. Genome-wide analyses reveal a highly conserved Dengue virus envelope peptide which is critical for virus viability and antigenic in humans

    PubMed Central

    Fleith, Renata C.; Lobo, Francisco P.; dos Santos, Paula F.; Rocha, Mariana M.; Bordignon, Juliano; Strottmann, Daisy M.; Patricio, Daniel O.; Pavanelli, Wander R.; Lo Sarzi, Maria; Santos, Claudia N. D.; Ferguson, Brian J.; Mansur, Daniel S.

    2016-01-01

    Targeting regions of proteins that show a high degree of structural conservation has been proposed as a method of developing immunotherapies and vaccines that may bypass the wide genetic variability of RNA viruses. Despite several attempts, a vaccine that protects evenly against the four circulating Dengue virus (DV) serotypes remains elusive. To find critical conserved amino acids in dengue viruses, 120 complete genomes of each serotype were selected at random and used to calculate conservation scores for nucleotide and amino acid sequences. The identified peptide sequences were analysed for their structural conservation and localisation using crystallographic data. The longest, surface exposed, highly conserved peptide of Envelope protein was found to correspond to amino acid residues 250 to 270. Mutation of this peptide in DV1 was lethal, since no replication of the mutant virus was detected in human cells. Antibodies against this peptide were detected in DV naturally infected patients indicating its potential antigenicity. Hence, this study has identified a highly conserved, critical peptide in DV that is a target of antibodies in infected humans. PMID:27805018
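
    The abstract above does not say which conservation score was used. As a rough illustration only, the sketch below scores each column of a protein alignment as one minus its normalized Shannon entropy and then reports runs of highly conserved positions. The function names, the entropy-based score, the threshold and the minimum run length are all assumptions made for this example, not the authors' method.

```python
import math
from collections import Counter


def column_conservation(column, alphabet_size=20):
    """Score one alignment column as 1 - normalized Shannon entropy.

    Gap characters ('-') are ignored. A fully conserved column scores 1.0.
    The entropy-based score is an assumption made for this sketch.
    """
    residues = [r for r in column if r != "-"]
    if not residues:
        return 0.0
    total = len(residues)
    counts = Counter(residues)
    entropy = -sum((n / total) * math.log2(n / total) for n in counts.values())
    return 1.0 - entropy / math.log2(alphabet_size)


def conservation_profile(aligned_seqs, alphabet_size=20):
    """Return one conservation score per column of an alignment."""
    if len({len(s) for s in aligned_seqs}) != 1:
        raise ValueError("aligned sequences must all have the same length")
    return [column_conservation(col, alphabet_size) for col in zip(*aligned_seqs)]


def conserved_runs(profile, threshold=0.9, min_len=3):
    """Return half-open (start, end) intervals of consecutive high-scoring columns."""
    runs, start = [], None
    for i, score in enumerate(profile):
        if score >= threshold:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_len:
                runs.append((start, i))
            start = None
    if start is not None and len(profile) - start >= min_len:
        runs.append((start, len(profile)))
    return runs


if __name__ == "__main__":
    # Toy alignment (hypothetical data): the first three columns are invariant.
    toy = ["MKTAY", "MKTAF", "MKTGY"]
    profile = conservation_profile(toy)
    print([round(s, 3) for s in profile])
    print(conserved_runs(profile, threshold=0.9, min_len=3))
```

    In a real analysis the alignment would come from a multiple-sequence aligner, and the threshold would need to be chosen with the number of sequences in mind.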

  10. Streamlining and core genome conservation among highly divergent members of the SAR11 clade.

    PubMed

    Grote, Jana; Thrash, J Cameron; Huggett, Megan J; Landry, Zachary C; Carini, Paul; Giovannoni, Stephen J; Rappé, Michael S

    2012-01-01

    SAR11 is an ancient and diverse clade of heterotrophic bacteria that are abundant throughout the world's oceans, where they play a major role in the ocean carbon cycle. Correlations between the phylogenetic branching order and spatiotemporal patterns in cell distributions from planktonic ocean environments indicate that SAR11 has evolved into perhaps a dozen or more specialized ecotypes that span evolutionary distances equivalent to a bacterial order. We isolated and sequenced genomes from diverse SAR11 cultures that represent three major lineages and encompass the full breadth of the clade. The new data expand observations about genome evolution and gene content that previously had been restricted to the SAR11 Ia subclade, providing a much broader perspective on the clade's origins, evolution, and ecology. We found small genomes throughout the clade and a very high proportion of core genome genes (48 to 56%), indicating that small genome size is probably an ancestral characteristic. In their level of core genome conservation, the members of SAR11 are outliers, the most conserved free-living bacteria known. Shared features of the clade include low GC content, high gene synteny, a large hypervariable region bounded by rRNA genes, and low numbers of paralogs. Variation among the genomes included genes for phosphorus metabolism, glycolysis, and C1 metabolism, suggesting that adaptive specialization in nutrient resource utilization is important to niche partitioning and ecotype divergence within the clade. These data provide support for the conclusion that streamlining selection for efficient cell replication in the planktonic habitat has occurred throughout the evolution and diversification of this clade. IMPORTANCE The SAR11 clade is the most abundant group of marine microorganisms worldwide, making them key players in the global carbon cycle. Growing knowledge about their biochemistry and metabolism is leading to a more mechanistic understanding of organic carbon

  11. The highly conserved MraZ protein is a transcriptional regulator in Escherichia coli

    SciTech Connect

    Eraso, Jesus M.; Markillie, Lye Meng; Mitchell, Hugh D.; Taylor, Ronald C.; Orr, Galya; Margolin, William

    2014-05-05

    The mraZ and mraW genes are highly conserved in bacteria, both in sequence and location at the head of the division and cell wall (dcw) gene cluster. Although MraZ has structural similarity to the AbrB transition state regulator and the MazE antitoxin, and MraW is known to methylate ribosomal RNA, mraZ and mraW null mutants have no detectable growth phenotype in any species tested to date, hampering progress in understanding their physiological role. Here we show that overproduction of Escherichia coli MraZ perturbs cell division and the cell envelope, is more lethal at high levels or in minimal growth medium, and that MraW antagonizes these effects. MraZ-GFP localizes to the nucleoid, suggesting that it binds DNA. Indeed, purified MraZ directly binds a region upstream from its own promoter containing three direct repeats to regulate its own expression and that of downstream cell division and cell wall genes. MraZ-LacZ fusions are repressed by excess MraZ but not when DNA binding by MraZ is inhibited. RNAseq analysis indicates that MraZ is a global transcriptional regulator with numerous targets in addition to dcw genes. One of these targets, mioC, is directly bound by MraZ in a region with three direct repeats.

  12. Next-generation sequencing: big data meets high performance computing.

    PubMed

    Schmidt, Bertil; Hildebrandt, Andreas

    2017-02-02

    The progress of next-generation sequencing has a major impact on medical and genomic research. This high-throughput technology can now produce billions of short DNA or RNA fragments in excess of a few terabytes of data in a single run. This leads to massive datasets used by a wide range of applications including personalized cancer treatment and precision medicine. In addition to the hugely increased throughput, the cost of using high-throughput technologies has been dramatically decreasing. A low sequencing cost of around US$1000 per genome has now rendered large population-scale projects feasible. However, to make effective use of the produced data, the design of big data algorithms and their efficient implementation on modern high performance computing systems is required.

  13. De novo sequencing of highly modified therapeutic oligonucleotides by hydrophobic tag sequencing coupled with LC-MS.

    PubMed

    Goto, R; Miyakawa, S; Inomata, E; Takami, T; Yamaura, J; Nakamura, Y

    2017-02-01

    Correct sequences are prerequisite for quality control of therapeutic oligonucleotides. However, there is no definitive method available for determining sequences of highly modified therapeutic RNAs, and thereby, most of the oligonucleotides have been used clinically without direct sequence determination. In this study, we developed a novel sequencing method called 'hydrophobic tag sequencing'. Highly modified oligonucleotides are sequenced by partially digesting oligonucleotides conjugated with a 5'-hydrophobic tag, followed by liquid chromatography-mass spectrometry analysis. 5'-Hydrophobic tag-printed fragments (5'-tag degradates) can be separated in order of their molecular masses from tag-free oligonucleotides by reversed-phase liquid chromatography. As models for the sequencing, the anti-VEGF aptamer (Macugen) and the highly modified 38-mer RNA sequences were analyzed under blind conditions. Most nucleotides were identified from the molecular weight of hydrophobic 5'-tag degradates calculated from monoisotopic mass in simple full mass data. When monoisotopic mass could not be assigned, the nucleotide was estimated using the molecular weight of the most abundant mass. The sequences of Macugen and 38-mer RNA perfectly matched the theoretical sequences. The hydrophobic tag sequencing worked well to obtain simple full mass data, resulting in accurate and clear sequencing. The present study provides for the first time a de novo sequencing technology for highly modified RNAs and contributes to quality control of therapeutic oligonucleotides. Copyright © 2016 John Wiley & Sons, Ltd.

  14. Compression of Structured High-Throughput Sequencing Data

    PubMed Central

    Campagne, Fabien; Dorff, Kevin C.; Chambwe, Nyasha; Robinson, James T.; Mesirov, Jill P.

    2013-01-01

    Large biological datasets are being produced at a rapid pace and create substantial storage challenges, particularly in the domain of high-throughput sequencing (HTS). Most approaches currently used to store HTS data are either unable to quickly adapt to the requirements of new sequencing or analysis methods (because they do not support schema evolution), or fail to provide state of the art compression of the datasets. We have devised new approaches to store HTS data that support seamless data schema evolution and compress datasets substantially better than existing approaches. Building on these new approaches, we discuss and demonstrate how a multi-tier data organization can dramatically reduce the storage, computational and network burden of collecting, analyzing, and archiving large sequencing datasets. For instance, we show that spliced RNA-Seq alignments can be stored in less than 4% the size of a BAM file with perfect data fidelity. Compared to the previous compression state of the art, these methods reduce dataset size more than 40% when storing exome, gene expression or DNA methylation datasets. The approaches have been integrated in a comprehensive suite of software tools (http://goby.campagnelab.org) that support common analyses for a range of high-throughput sequencing assays. PMID:24260313

  15. Compressive biological sequence analysis and archival in the era of high-throughput sequencing technologies.

    PubMed

    Giancarlo, Raffaele; Rombo, Simona E; Utro, Filippo

    2014-05-01

    High-throughput sequencing technologies produce large collections of data, mainly DNA sequences with additional information, requiring the design of efficient and effective methodologies for both their compression and storage. In this context, we first provide a classification of the main techniques that have been proposed, according to three specific research directions that have emerged from the literature and, for each, we provide an overview of the current techniques. Finally, to make this review useful to researchers and technicians applying the existing software and tools, we include a synopsis of the main characteristics of the described approaches, including details on their implementation and availability. Performance of the various methods is also highlighted, although the state of the art does not lend itself to a consistent and coherent comparison among all the methods presented here.

  16. MEGARes: an antimicrobial resistance database for high throughput sequencing

    PubMed Central

    Lakin, Steven M.; Dean, Chris; Noyes, Noelle R.; Dettenwanger, Adam; Ross, Anne Spencer; Doster, Enrique; Rovira, Pablo; Abdo, Zaid; Jones, Kenneth L.; Ruiz, Jaime; Belk, Keith E.; Morley, Paul S.; Boucher, Christina

    2017-01-01

    Antimicrobial resistance has become an imminent concern for public health. As methods for detection and characterization of antimicrobial resistance move from targeted culture and polymerase chain reaction to high throughput metagenomics, appropriate resources for the analysis of large-scale data are required. Currently, antimicrobial resistance databases are tailored to smaller-scale, functional profiling of genes using highly descriptive annotations. Such characteristics do not facilitate the analysis of large-scale, ecological sequence datasets such as those produced with the use of metagenomics for surveillance. In order to overcome these limitations, we present MEGARes (https://megares.meglab.org), a hand-curated antimicrobial resistance database and annotation structure that provides a foundation for the development of high throughput acyclical classifiers and hierarchical statistical analysis of big data. MEGARes can be browsed as a stand-alone resource through the website or can be easily integrated into sequence analysis pipelines through download. Also via the website, we provide documentation for AmrPlusPlus, a user-friendly Galaxy pipeline for the analysis of high throughput sequencing data that is pre-packaged for use with the MEGARes database. PMID:27899569

  17. MEGARes: an antimicrobial resistance database for high throughput sequencing.

    PubMed

    Lakin, Steven M; Dean, Chris; Noyes, Noelle R; Dettenwanger, Adam; Ross, Anne Spencer; Doster, Enrique; Rovira, Pablo; Abdo, Zaid; Jones, Kenneth L; Ruiz, Jaime; Belk, Keith E; Morley, Paul S; Boucher, Christina

    2017-01-04

    Antimicrobial resistance has become an imminent concern for public health. As methods for detection and characterization of antimicrobial resistance move from targeted culture and polymerase chain reaction to high throughput metagenomics, appropriate resources for the analysis of large-scale data are required. Currently, antimicrobial resistance databases are tailored to smaller-scale, functional profiling of genes using highly descriptive annotations. Such characteristics do not facilitate the analysis of large-scale, ecological sequence datasets such as those produced with the use of metagenomics for surveillance. In order to overcome these limitations, we present MEGARes (https://megares.meglab.org), a hand-curated antimicrobial resistance database and annotation structure that provides a foundation for the development of high throughput acyclical classifiers and hierarchical statistical analysis of big data. MEGARes can be browsed as a stand-alone resource through the website or can be easily integrated into sequence analysis pipelines through download. Also via the website, we provide documentation for AmrPlusPlus, a user-friendly Galaxy pipeline for the analysis of high throughput sequencing data that is pre-packaged for use with the MEGARes database.

  18. High-throughput sequencing of small RNAs and anatomical characteristics associated with leaf development in celery.

    PubMed

    Jia, Xiao-Ling; Li, Meng-Yao; Jiang, Qian; Xu, Zhi-Sheng; Wang, Feng; Xiong, Ai-Sheng

    2015-06-09

    MicroRNAs (miRNAs) exhibit diverse and important roles in plant growth, development, and stress responses and regulate gene expression at the post-transcriptional level. The diversity of miRNAs and their roles in leaf development in celery remain unknown. To elucidate the roles of miRNAs in celery leaf development, we identified leaf development-related miRNAs through high-throughput sequencing. Small RNA libraries were constructed using leaves from three stages (10, 20, and 30 cm) of celery cv. 'Ventura' and then subjected to high-throughput sequencing and bioinformatics analysis. At Stage 1, Stage 2, and Stage 3 of 'Ventura', a total of 333, 329, and 344 conserved miRNAs (belonging to 35, 35, and 32 families, respectively) were identified. A total of 131 miRNAs were identified as novel in 'Ventura'. Potential miRNA target genes were predicted and annotated using the eggNOG, GO, and KEGG databases to explore gene functions. The abundance of five conserved miRNAs and their corresponding potential target genes were validated. Expression profiles of novel potential miRNAs were also detected. Anatomical characteristics of the leaf blades and petioles at three leaf stages were further analyzed. This study contributes to our understanding of the functions and molecular regulatory mechanisms of miRNAs in celery leaf development.

  19. Human metapneumovirus G protein is highly conserved within but not between genetic lineages.

    PubMed

    Yang, Chin-Fen; Wang, Chiaoyin K; Tollefson, Sharon J; Lintao, Linda D; Liem, Alexis; Chu, Marla; Williams, John V

    2013-06-01

    Human metapneumovirus (HMPV) is an important cause of acute respiratory illnesses in children. HMPV encodes two major surface glycoproteins, fusion (F) and glycoprotein (G). The function of G has not been fully established, though it is dispensable for in vitro and in vivo replication. We analyzed 87 full-length HMPV G sequences from isolates collected over 20 years. The G sequences fell into four subgroups with a mean 63 % amino acid identity (minimum 29 %). The length of G varied from 217 to 241 residues. Structural features such as proline content and N- and O-glycosylation sites were present in all strains but quite variable between subgroups. There was minimal drift within the subgroups over 20 years. The estimated time to the most recent common ancestor was 215 years. HMPV G was conserved within lineages over 20 years, suggesting functional constraints on diversity. However, G was poorly conserved between subgroups, pointing to potentially distinct roles for G among different viral lineages.

  20. Human T-cell recognition of synthetic peptides representing conserved and variant sequences from the merozoite surface protein 2 of Plasmodium falciparum.

    PubMed

    Theander, T G; Hviid, L; Dodoo, D; Afari, E A; Jensen, J B; Rzepczyk, C M

    1997-06-01

    Merozoite surface protein 2 (MSP2) is a malaria vaccine candidate currently undergoing clinical trials. We analyzed the peripheral blood mononuclear cell (PBMC) response to synthetic peptides corresponding to conserved and variant regions of the FCQ-27 allelic form of MSP2 in Ghanaian individuals from an area of hyperendemic malaria transmission and in Danes without exposure to malaria. PBMC from 20-39% of Ghanaians responded to each of the peptides by proliferation and 29-36% had PBMC which produced interferon-gamma (IFN-gamma) in response to peptide stimulation. In Danes, there was no proliferation to two of the peptides and only PBMC from 5% of the individuals proliferated to the other three peptides. IFN-gamma production was not detected to any peptide. In both Danes and Ghanaians in only a few instances was IL-4 detected in the PBMC cultures. Overall PBMC from 79% of the Ghanaians responded by proliferation and/or cytokine secretion to at least one of three peptides tested, whereas responses were only observed in 14% of Danes (P = 0.002). These data suggest that the Ghanaians had expanded peripheral blood T-cell populations recognizing the peptides as a result of natural infection. The findings are encouraging for the development of a vaccine based on these T-epitope containing regions of MSP2, as the peptides were broadly recognized suggesting that they can bind to diverse HLA alleles and also because they include conserved MSP2 sequences. Immunisation with a vaccine construct incorporating the sequences present in these peptides could thus be expected to be immunogenic in a high percentage of individuals and lead to the establishment of memory T-cells, which can be boosted through natural infection.

  1. Genotype-Frequency Estimation from High-Throughput Sequencing Data.

    PubMed

    Maruki, Takahiro; Lynch, Michael

    2015-10-01

    Rapidly improving high-throughput sequencing technologies provide unprecedented opportunities for carrying out population-genomic studies with various organisms. To take full advantage of these methods, it is essential to correctly estimate allele and genotype frequencies, and here we present a maximum-likelihood method that accomplishes these tasks. The proposed method fully accounts for uncertainties resulting from sequencing errors and biparental chromosome sampling and yields essentially unbiased estimates with minimal sampling variances with moderately high depths of coverage regardless of a mating system and structure of the population. Moreover, we have developed statistical tests for examining the significance of polymorphisms and their genotypic deviations from Hardy-Weinberg equilibrium. We examine the performance of the proposed method by computer simulations and apply it to low-coverage human data generated by high-throughput sequencing. The results show that the proposed method improves our ability to carry out population-genomic analyses in important ways. The software package of the proposed method is freely available from https://github.com/Takahiro-Maruki/Package-GFE.

  2. Population Genomic Analysis Reveals Highly Conserved Mitochondrial Genomes in the Yeast Species Lachancea thermotolerans

    PubMed Central

    Freel, Kelle C.; Friedrich, Anne; Hou, Jing; Schacherer, Joseph

    2014-01-01

    The increasing availability of mitochondrial (mt) sequence data from various yeasts provides a tool to study genomic evolution within and between different species. While the genomes from a range of lineages are available, there is a lack of information concerning intraspecific mtDNA diversity. Here, we analyzed the mt genomes of 50 strains from Lachancea thermotolerans, a protoploid yeast species that has been isolated from several locations (Europe, Asia, Australia, South Africa, and North / South America) and ecological sources (fruit, tree exudate, plant material, and grape and agave fermentations). Protein-coding genes from the mtDNA were used to construct a phylogeny, which reflected a similar, yet less resolved topology than the phylogenetic tree of 50 nuclear genes. In comparison to its sister species Lachancea kluyveri, L. thermotolerans has a smaller mt genome. This is due to shorter intergenic regions and fewer introns, of which the latter are only found in COX1. We revealed that L. kluyveri and L. thermotolerans share similar levels of intraspecific divergence concerning the nuclear genomes. However, L. thermotolerans has a more highly conserved mt genome with the coding regions characterized by low rates of nonsynonymous substitution. Thus, in the mt genomes of L. thermotolerans, stronger purifying selection and lower mutation rates potentially shape genome diversity in contrast to what was found for L. kluyveri, demonstrating that the factors driving mt genome evolution are different even between closely related species. PMID:25212859

  3. Fulcrum: condensing redundant reads from high-throughput sequencing studies

    PubMed Central

    Burriesci, Matthew S.; Lehnert, Erik M.; Pringle, John R.

    2012-01-01

    Motivation: Ultra-high-throughput sequencing produces duplicate and near-duplicate reads, which can consume computational resources in downstream applications. A tool that collapses such reads should reduce storage and assembly complications and costs. Results: We developed Fulcrum to collapse identical and near-identical Illumina and 454 reads (such as those from PCR clones) into single error-corrected sequences; it can process paired-end as well as single-end reads. Fulcrum is customizable and can be deployed on a single machine, a local network or a commercially available MapReduce cluster, and it has been optimized to maximize ease-of-use, cross-platform compatibility and future scalability. Sequence datasets have been collapsed by up to 71%, and the reduced number and improved quality of the resulting sequences allow assemblers to produce longer contigs while using less memory. Availability and implementation: Source code and a tutorial are available at http://pringlelab.stanford.edu/protocols.html under a BSD-like license. Fulcrum was written and tested in Python 2.6, and the single-machine and local-network modes depend on a modified version of the Parallel Python library (provided). Contact: erik.m.lehnert@gmail.com Supplementary information: Supplementary information is available at Bioinformatics online. PMID:22419786
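
    To make the idea of collapsing redundant reads concrete, the sketch below handles only the simplest case: exactly identical reads are merged into one record with a count, and the resulting size reduction is reported. It is not Fulcrum's algorithm, which also collapses near-identical reads and produces error-corrected sequences. The function names and toy reads are hypothetical.

```python
from collections import Counter


def collapse_identical_reads(reads):
    """Collapse exactly identical reads into (sequence, count) pairs.

    Pairs are returned in first-seen order. Near-identical reads are NOT
    merged here; that would need a similarity or error model.
    """
    counts = Counter(reads)  # insertion-ordered in Python 3.7+
    return list(counts.items())


def reduction_fraction(reads):
    """Fraction by which the read set shrinks after exact-duplicate collapsing."""
    if not reads:
        return 0.0
    return 1.0 - len(set(reads)) / len(reads)


if __name__ == "__main__":
    # Hypothetical toy reads; real input would be parsed from FASTQ files.
    reads = ["ACGT", "ACGT", "TTGA", "ACGT", "GGCC"]
    for seq, n in collapse_identical_reads(reads):
        print(seq, n)
    print(f"reduced by {reduction_fraction(reads):.0%}")
```

    Keeping the count alongside each collapsed sequence preserves the coverage information that downstream assemblers or quantification steps may rely on.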

  4. Detecting Alu insertions from high-throughput sequencing data

    PubMed Central

    David, Matei; Mustafa, Harun; Brudno, Michael

    2013-01-01

    High-throughput sequencing technologies have allowed for the cataloguing of variation in personal human genomes. In this manuscript, we present alu-detect, a tool that combines read-pair and split-read information to detect novel Alus and their precise breakpoints directly from either whole-genome or whole-exome sequencing data while also identifying insertions directly in the vicinity of existing Alus. To set the parameters of our method, we use simulation of a faux reference, which allows us to compute the precision and recall of various parameter settings using real sequencing data. Applying our method to 100 bp paired Illumina data from seven individuals, including two trios, we detected on average 1519 novel Alus per sample. Based on the faux-reference simulation, we estimate that our method has 97% precision and 85% recall. We identify 808 novel Alus not previously described in other studies. We also demonstrate the use of alu-detect to study the local sequence and global location preferences for novel Alu insertions. PMID:23921633

  5. Discovery of highly conserved unique peanut and tree nut peptides by LC-MS/MS for multi-allergen detection.

    PubMed

    Sealey-Voyksner, Jennifer; Zweigenbaum, Jerry; Voyksner, Robert

    2016-03-01

    Proteins unique to peanuts and various tree nuts have been extracted, subjected to trypsin digestion and analysis by liquid chromatography/quadrupole time-of-flight mass spectrometry, in order to find highly conserved peptides that can be used as markers to detect peanuts and tree nuts in food. The marker peptide sequences chosen were those found to be present in both native (unroasted) and thermally processed (roasted) forms of peanuts and tree nuts. Each peptide was selected by assuring its presence in food that was processed or unprocessed, its abundance for sensitivity, sequence size, and uniqueness for peanut and each specific variety of tree nut. At least two peptides were selected to represent peanut, almond, pecan, cashew, walnut, hazelnut, pine nut, Brazil nut, macadamia nut, pistachio nut, chestnut and coconut; to determine the presence of trace levels of peanut and tree nuts in food by a novel multiplexed LC-MS method.

  6. Marker genes that are less conserved in their sequences are useful for predicting genome-wide similarity levels between closely related prokaryotic strains

    SciTech Connect

    Lan, Yemin; Rosen, Gail; Hershberg, Ruth

    2016-05-03

    The 16s rRNA gene is so far the most widely used marker for taxonomical classification and separation of prokaryotes. Since it is universally conserved among prokaryotes, it is possible to use this gene to classify a broad range of prokaryotic organisms. At the same time, it has often been noted that the 16s rRNA gene is too conserved to separate between prokaryotes at finer taxonomic levels. In this paper, we examine how well levels of similarity of 16s rRNA and 73 additional universal or nearly universal marker genes correlate with genome-wide levels of gene sequence similarity. We demonstrate that the percent identity of 16s rRNA predicts genome-wide levels of similarity very well for distantly related prokaryotes, but not for closely related ones. In closely related prokaryotes, we find that there are many other marker genes for which levels of similarity are much more predictive of genome-wide levels of gene sequence similarity. Finally, we show that the identities of the markers that are most useful for predicting genome-wide levels of similarity within closely related prokaryotic lineages vary greatly between lineages. However, the most useful markers are always those that are least conserved in their sequences within each lineage. In conclusion, our results show that by choosing markers that are less conserved in their sequences within a lineage of interest, it is possible to better predict genome-wide gene sequence similarity between closely related prokaryotes than is possible using the 16s rRNA gene. We point readers towards a database we have created (POGO-DB) that can be used to easily establish which markers show lowest levels of sequence conservation within different prokaryotic lineages.
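
    The comparison above rests on percent identity between marker-gene sequences. As a minimal sketch, assuming the two sequences are already aligned to equal length and that positions with a gap in either sequence are skipped (one of several conventions in use), percent identity can be computed as follows. The function name and the gap convention are assumptions for illustration and are not taken from the paper or from POGO-DB.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical positions between two pre-aligned sequences.

    Positions where either sequence has a gap are excluded from both the
    numerator and the denominator (an assumed convention for this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped positions to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Hypothetical toy alignment: 3 of 4 compared positions match.
    print(percent_identity("AAAA", "AAAT"))
```

    Because the choice of denominator changes the result, any comparison across markers should use the same gap convention throughout.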

  7. Marker genes that are less conserved in their sequences are useful for predicting genome-wide similarity levels between closely related prokaryotic strains

    DOE PAGES

    Lan, Yemin; Rosen, Gail; Hershberg, Ruth

    2016-05-03

    The 16s rRNA gene is so far the most widely used marker for taxonomical classification and separation of prokaryotes. Since it is universally conserved among prokaryotes, it is possible to use this gene to classify a broad range of prokaryotic organisms. At the same time, it has often been noted that the 16s rRNA gene is too conserved to separate between prokaryotes at finer taxonomic levels. In this paper, we examine how well levels of similarity of 16s rRNA and 73 additional universal or nearly universal marker genes correlate with genome-wide levels of gene sequence similarity. We demonstrate that the percent identity of 16s rRNA predicts genome-wide levels of similarity very well for distantly related prokaryotes, but not for closely related ones. In closely related prokaryotes, we find that there are many other marker genes for which levels of similarity are much more predictive of genome-wide levels of gene sequence similarity. Finally, we show that the identities of the markers that are most useful for predicting genome-wide levels of similarity within closely related prokaryotic lineages vary greatly between lineages. However, the most useful markers are always those that are least conserved in their sequences within each lineage. In conclusion, our results show that by choosing markers that are less conserved in their sequences within a lineage of interest, it is possible to better predict genome-wide gene sequence similarity between closely related prokaryotes than is possible using the 16s rRNA gene. We point readers towards a database we have created (POGO-DB) that can be used to easily establish which markers show lowest levels of sequence conservation within different prokaryotic lineages.

  8. Epidermal surface antigen (MS17S1) is highly conserved between mouse and human

    SciTech Connect

    Cho, Y.J.; Chema, D.; Cho, M.

    1995-05-20

    A mouse monoclonal antibody ECS-1 raised to human keratinocytes detects a 35-kDa epidermal surface antigen (ESA) and causes keratinocyte dissociation in vitro. ECS-1 stains skin of 16-day mouse embryo and 8- to 9-week human fetus. Mouse Esa cDNA encodes a 379-amino-acid protein that is 99.2% identical to the human, differing at only 3 amino acids. The gene (M17S1) was mapped to mouse chromosome 11, highlighting the conserved linkage synteny existing between human chromosome 17 and mouse chromosome 11. Although the nude locus has been mapped to the same region of chromosome 11, no abnormalities in protein, mRNA, or cDNA or genomic sequences were detected in nude mice. However, both nude and control mice were found to have a second Esa mRNA transcript that conserves amino acid sequence and molecular weight. The mouse and human 5' and 3' untranslated sequences are conserved. Similar RNA folding patterns of the 5' untranslated region are predicted despite a 91-bp insertion in the mouse. These data suggest that both the function and the regulation of ESA protein are of importance and that Esa (M17S1) is not the nude locus gene. 42 refs., 7 figs., 3 tabs.
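    The 99.2% figure in this abstract follows directly from 3 differences in 379 aligned residues (376/379). A minimal sketch of that percent-identity calculation is shown below; the function name is an assumption for illustration, and the two sequences are placeholder strings of the right length, not the real ESA protein sequences.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of identical positions between two aligned, equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Placeholder sequences (assumption): 379 residues differing at 3 positions.
human_like = "A" * 379
mouse_like = "G" * 3 + "A" * 376

print(round(percent_identity(human_like, mouse_like), 1))  # 99.2
```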

  9. Localization of a highly conserved human potassium channel gene (NGK2-KV4-KCNC1) to chromosome 11p15

    SciTech Connect

    Ried, T.; Ward, D.C.; Rudy, B.; Miera, V.S. de; Lau, D.; Sen, K.

    1993-02-01

    Several genes (the Shaker or Sh gene family) encoding components of voltage-gated K+ channels have been identified in various species. Based on sequence similarities Sh genes are classified into four groups or subfamilies. Mammalian genes of each one of these subfamilies also show high levels of sequence similarity to one of four related Drosophila genes: Shaker, Shab, Shaw, and Shal. Here we report the isolation of human cDNAs for a Shaw-related product (NGK2,KV2.1a) previously identified in rat and mice. A comparison of the nucleotide and deduced amino acid sequence of NGK2 in rodents and humans shows that this product is highly conserved in mammals; the human NGK2 protein shows over 99% amino acid sequence identity to its rodent homologue. The gene (NGK2-KV4; KCNC1) encoding NGK2 was mapped to human chromosome 11p15 by fluorescence in situ hybridization with the human NGK2 cDNAs. 65 refs., 2 figs., 1 tab.

  10. The use of museum specimens with high-throughput DNA sequencers

    PubMed Central

    Burrell, Andrew S.; Disotell, Todd R.; Bergey, Christina M.

    2015-01-01

    Natural history collections have long been used by morphologists, anatomists, and taxonomists to probe the evolutionary process and describe biological diversity. These biological archives also offer great opportunities for genetic research in taxonomy, conservation, systematics, and population biology. They allow assays of past populations, including those of extinct species, giving context to present patterns of genetic variation and direct measures of evolutionary processes. Despite this potential, museum specimens are difficult to work with because natural postmortem processes and preservation methods fragment and damage DNA. These problems have restricted geneticists’ ability to use natural history collections primarily by limiting how much of the genome can be surveyed. Recent advances in DNA sequencing technology, however, have radically changed this, making truly genomic studies from museum specimens possible. We review the opportunities and drawbacks of the use of museum specimens, and suggest how to best execute projects when incorporating such samples. Several high-throughput (HT) sequencing methodologies, including whole genome shotgun sequencing, sequence capture, and restriction digests (demonstrated here), can be used with archived biomaterials. PMID:25532801

  11. The use of museum specimens with high-throughput DNA sequencers.

    PubMed

    Burrell, Andrew S; Disotell, Todd R; Bergey, Christina M

    2015-02-01

    Natural history collections have long been used by morphologists, anatomists, and taxonomists to probe the evolutionary process and describe biological diversity. These biological archives also offer great opportunities for genetic research in taxonomy, conservation, systematics, and population biology. They allow assays of past populations, including those of extinct species, giving context to present patterns of genetic variation and direct measures of evolutionary processes. Despite this potential, museum specimens are difficult to work with because natural postmortem processes and preservation methods fragment and damage DNA. These problems have restricted geneticists' ability to use natural history collections primarily by limiting how much of the genome can be surveyed. Recent advances in DNA sequencing technology, however, have radically changed this, making truly genomic studies from museum specimens possible. We review the opportunities and drawbacks of the use of museum specimens, and suggest how to best execute projects when incorporating such samples. Several high-throughput (HT) sequencing methodologies, including whole genome shotgun sequencing, sequence capture, and restriction digests (demonstrated here), can be used with archived biomaterials.

  12. Conserved sequences in the current strains of HIV-1 subtype A in Russia are effectively targeted by artificial RNAi in vitro.

    PubMed

    Tchurikov, Nickolai A; Fedoseeva, Daria M; Gashnikova, Natalya M; Sosin, Dmitri V; Gorbacheva, Maria A; Alembekov, Ildar R; Chechetkin, Vladimir R; Kravatsky, Yuri V; Kretova, Olga V

    2016-05-25

    Highly active antiretroviral therapy has greatly reduced the morbidity and mortality of AIDS. However, many of the antiretroviral drugs are toxic with long-term use, and all currently used anti-HIV agents generate drug-resistant mutants. Therefore, there is a great need for new approaches to AIDS therapy. RNAi is a powerful means of inhibiting HIV-1 production in human cells. We propose to use RNAi for gene therapy of HIV/AIDS. Previously we identified a number of new biologically active siRNAs targeting several moderately conserved regions in HIV-1 transcripts. Here we analyze the heterogeneity of nucleotide sequences in three RNAi targets in sequences encoding the reverse transcriptase and integrase domains of current isolates of HIV-1 subtype A in Russia. These data were used to generate genetic constructs expressing short hairpin RNAs 28-30-bp in length that could be processed in cells into siRNAs. After transfection of the constructs we observed siRNAs that efficiently attacked the selected targets. We expect that targeting several viral genes important for HIV-1 reproduction will help overcome the problem of viral adaptation and will prevent the appearance of RNAi escape mutants in current virus strains, an important feature of gene therapy of HIV/AIDS.

  13. A HIGH COVERAGE GENOME SEQUENCE FROM AN ARCHAIC DENISOVAN INDIVIDUAL

    PubMed Central

    Meyer, Matthias; Kircher, Martin; Gansauge, Marie-Theres; Li, Heng; Racimo, Fernando; Mallick, Swapan; Schraiber, Joshua G.; Jay, Flora; Prüfer, Kay; de Filippo, Cesare; Sudmant, Peter H.; Alkan, Can; Fu, Qiaomei; Do, Ron; Rohland, Nadin; Tandon, Arti; Siebauer, Michael; Green, Richard E.; Bryc, Katarzyna; Briggs, Adrian W.; Stenzel, Udo; Dabney, Jesse; Shendure, Jay; Kitzman, Jacob; Hammer, Michael F.; Shunkov, Michael V.; Derevianko, Anatoli P.; Patterson, Nick; Andrés, Aida M.; Eichler, Evan E.; Slatkin, Montgomery; Reich, David; Kelso, Janet; Pääbo, Svante

    2013-01-01

    We present a DNA library preparation method that has allowed us to reconstruct a high coverage (30X) genome sequence of a Denisovan, an extinct relative of Neandertals. The quality of this genome allows a direct estimation of Denisovan heterozygosity indicating that genetic diversity in these archaic hominins was extremely low. It also allows tentative dating of the specimen on the basis of “missing evolution” in its genome, detailed measurements of Denisovan and Neandertal admixture into present-day human populations, and the generation of a near-complete catalog of genetic changes that swept to high frequency in modern humans since their divergence from Denisovans. PMID:22936568

  14. High-Throughput Sequencing of a South American Amerindian

    PubMed Central

    Almeida, Renan; Alencar, Dayse O.; Barbosa, Maria Silvanira; Gusmão, Leonor; Silva, Wilson A.; de Souza, Sandro J.; Silva, Artur; Ribeiro-dos-Santos, Ândrea; Darnet, Sylvain; Santos, Sidney

    2013-01-01

    The emergence of next-generation sequencing technologies allowed access to the vast amounts of information that are contained in the human genome. This information has contributed to the understanding of individual and population-based variability and improved the understanding of the evolutionary history of different human groups. However, the genome of a representative of the Amerindian populations had not been previously sequenced. Thus, the genome of an individual from a South American tribe was completely sequenced to further the understanding of the genetic variability of Amerindians. A total of 36.8 giga base pairs (Gbp) were sequenced and aligned with the human genome. These Gbp corresponded to 95.92% of the human genome with an estimated miscall rate of 0.0035 per sequenced bp. The data obtained from the alignment were used for SNP (single-nucleotide polymorphism) and INDEL (insertion-deletion) calling, which resulted in the identification of 502,017 polymorphisms, of which 32,275 were potentially new high-confidence SNPs and 33,795 new INDELs, specific to South Native American populations. The authenticity of the sample as a member of the South Native American populations was confirmed through the analysis of the uniparental (maternal and paternal) lineages. The autosomal comparison distinguished the investigated sample from other continental populations and revealed a close relation to Eastern Asian and Aboriginal Australian populations. Although the findings did not rule out the classical model of the settlement of the Americas, they brought new insights into the understanding of human population history. The present study indicates a remarkable genetic variability in human populations that must still be identified and contributes to the understanding of the genetic variability of South Native American populations and of human population history. PMID:24386182
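    The figures quoted in this abstract can be combined with simple arithmetic. The sketch below derives the expected number of miscalled bases, a rough mean sequencing depth, and the share of reported polymorphisms that were flagged as new. The genome size of 3.1 Gbp is an assumption that does not come from the abstract, and the function names are illustrative only.

```python
SEQUENCED_BP = 36.8e9          # total bases sequenced (from the abstract)
MISCALL_RATE = 0.0035          # estimated miscalls per sequenced bp (from the abstract)
TOTAL_POLYMORPHISMS = 502_017  # SNPs and INDELs identified (from the abstract)
NEW_SNPS = 32_275
NEW_INDELS = 33_795
ASSUMED_GENOME_BP = 3.1e9      # ASSUMPTION: approximate human genome size


def expected_miscalls(sequenced_bp: float, rate: float) -> float:
    """Expected number of miscalled bases across all sequenced bases."""
    return sequenced_bp * rate


def mean_depth(sequenced_bp: float, genome_bp: float) -> float:
    """Average number of times each genome position is covered."""
    return sequenced_bp / genome_bp


def novel_fraction(new_snps: int, new_indels: int, total: int) -> float:
    """Share of all reported polymorphisms that were flagged as new."""
    return (new_snps + new_indels) / total


print(f"expected miscalls: {expected_miscalls(SEQUENCED_BP, MISCALL_RATE):.3g}")
print(f"mean depth:        {mean_depth(SEQUENCED_BP, ASSUMED_GENOME_BP):.1f}x")
print(f"novel fraction:    {novel_fraction(NEW_SNPS, NEW_INDELS, TOTAL_POLYMORPHISMS):.1%}")
```

    The depth estimate changes with the assumed genome size and ignores unaligned reads, so it should be read as an order-of-magnitude check rather than a reported result.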

  15. Validation of high throughput sequencing and microbial forensics applications

    PubMed Central

    2014-01-01

    High throughput sequencing (HTS) generates large amounts of high quality sequence data for microbial genomics. The value of HTS for microbial forensics is the speed at which evidence can be collected and the power to characterize microbial-related evidence to solve biocrimes and bioterrorist events. As HTS technologies continue to improve, they provide increasingly powerful sets of tools to support the entire field of microbial forensics. Accurate, credible results allow analysis and interpretation, significantly influencing the course and/or focus of an investigation, and can impact the response of the government to an attack having individual, political, economic or military consequences. Interpretation of the results of microbial forensic analyses relies on understanding the performance and limitations of HTS methods, including analytical processes, assays and data interpretation. The utility of HTS must be defined carefully within established operating conditions and tolerances. Validation is essential in the development and implementation of microbial forensics methods used for formulating investigative leads attribution. HTS strategies vary, requiring guiding principles for HTS system validation. Three initial aspects of HTS, irrespective of chemistry, instrumentation or software are: 1) sample preparation, 2) sequencing, and 3) data analysis. Criteria that should be considered for HTS validation for microbial forensics are presented here. Validation should be defined in terms of specific application and the criteria described here comprise a foundation for investigators to establish, validate and implement HTS as a tool in microbial forensics, enhancing public safety and national security. PMID:25101166

  16. Validation of high throughput sequencing and microbial forensics applications.

    PubMed

    Budowle, Bruce; Connell, Nancy D; Bielecka-Oder, Anna; Colwell, Rita R; Corbett, Cindi R; Fletcher, Jacqueline; Forsman, Mats; Kadavy, Dana R; Markotic, Alemka; Morse, Stephen A; Murch, Randall S; Sajantila, Antti; Schmedes, Sarah E; Ternus, Krista L; Turner, Stephen D; Minot, Samuel

    2014-01-01

    High throughput sequencing (HTS) generates large amounts of high quality sequence data for microbial genomics. The value of HTS for microbial forensics is the speed at which evidence can be collected and the power to characterize microbial-related evidence to solve biocrimes and bioterrorist events. As HTS technologies continue to improve, they provide increasingly powerful sets of tools to support the entire field of microbial forensics. Accurate, credible results allow analysis and interpretation, significantly influencing the course and/or focus of an investigation, and can impact the response of the government to an attack having individual, political, economic or military consequences. Interpretation of the results of microbial forensic analyses relies on understanding the performance and limitations of HTS methods, including analytical processes, assays and data interpretation. The utility of HTS must be defined carefully within established operating conditions and tolerances. Validation is essential in the development and implementation of microbial forensics methods used for formulating investigative leads attribution. HTS strategies vary, requiring guiding principles for HTS system validation. Three initial aspects of HTS, irrespective of chemistry, instrumentation or software are: 1) sample preparation, 2) sequencing, and 3) data analysis. Criteria that should be considered for HTS validation for microbial forensics are presented here. Validation should be defined in terms of specific application and the criteria described here comprise a foundation for investigators to establish, validate and implement HTS as a tool in microbial forensics, enhancing public safety and national security.

  17. A novel, evolutionarily conserved gene family with putative sequence-specific single-stranded DNA-binding activity.

    PubMed

    Castro, Patricia; Liang, Hong; Liang, Jan C; Nagarajan, Lalitha

    2002-07-01

    Complete and partial deletions of chromosome 5q are recurrent cytogenetic anomalies associated with aggressive myeloid malignancies. Earlier, we identified an approximately 1.5-Mb region of loss at 5q13.3 between the loci D5S672 and D5S620 in primary leukemic blasts. A leukemic cell line, ML3, is diploid for all of chromosome 5, except for an inversion-coupled translocation within the D5S672-D5S620 interval. Here, we report the development of a bacterial artificial chromosome (BAC) contig to define the breakpoint and the identification of a novel gene SSBP2, the target of disruption in ML3 cells. A preliminary evaluation of SSBP2 as a tumor suppressor gene in primary leukemic blasts and cell lines suggests that the remaining allele does not undergo intragenic mutations. SSBP2 is one of three members of a closely related, evolutionarily conserved, and ubiquitously expressed gene family. SSBP3 is the human ortholog of a chicken gene, CSDP, that encodes a sequence-specific single-stranded DNA-binding protein. SSBP3 localizes to chromosome 1p31.3, and the third member, SSBP4, maps to chromosome 19p13.1. Chromosomal localization and the putative single-stranded DNA-binding activity suggest that all three members of this family are capable of potential tumor suppressor activity by gene dosage or other epigenetic mechanisms.

  18. Comparative Genome Sequence Analysis Reveals the Extent of Diversity and Conservation for Glycan-Associated Proteins in Burkholderia spp.

    PubMed Central

    Ong, Hui San; Mohamed, Rahmah; Firdaus-Raih, Mohd

    2012-01-01

    Members of the Burkholderia family occupy diverse ecological niches. In pathogenic family members, glycan-associated proteins are often linked to functions that include virulence, protein conformation maintenance, surface recognition, cell adhesion, and immune system evasion. Comparative analysis of available Burkholderia genomes has revealed a core set of 178 glycan-associated proteins shared by all Burkholderia of which 68 are homologous to known essential genes. The genome sequence comparisons revealed insights into species-specific gene acquisitions through gene transfers, identified an S-layer protein, and proposed that significantly reactive surface proteins are associated to sugar moieties as a potential means to circumvent host defense mechanisms. The comparative analysis using a curated database of search queries enabled us to gain insights into the extent of conservation and diversity, as well as the possible virulence-associated roles of glycan-associated proteins in members of the Burkholderia spp. The curated list of glycan-associated proteins used can also be directed to screen other genomes for glycan-associated homologs. PMID:22991502

  19. Structure of Trypanosoma brucei glutathione synthetase: Domain and loop alterations in the catalytic cycle of a highly conserved enzyme

    PubMed Central

    Fyfe, Paul K.; Alphey, Magnus S.; Hunter, William N.

    2010-01-01

    Glutathione synthetase catalyses the synthesis of the low molecular mass thiol glutathione from l-γ-glutamyl-l-cysteine and glycine. We report the crystal structure of the dimeric enzyme from Trypanosoma brucei in complex with the product glutathione. The enzyme belongs to the ATP-grasp family, a group of enzymes known to undergo conformational changes upon ligand binding. The T. brucei enzyme crystal structure presents two dimers in the asymmetric unit. The structure reveals variability in the order and position of a small domain, which forms a lid for the active site and serves to capture conformations likely to exist during the catalytic cycle. Comparisons with orthologous enzymes, in particular from Homo sapiens and Saccharomyces cerevisiae, indicate a high degree of sequence and structure conservation in part of the active site. Structural differences that are observed between the orthologous enzymes are assigned to different ligand binding states since key residues are conserved. This suggests that the molecular determinants of ligand recognition and reactivity are highly conserved across species. We conclude that it would be difficult to target the parasite enzyme in preference to the host enzyme and therefore glutathione synthetase may not be a suitable target for antiparasitic drug discovery. PMID:20045436

  20. Capturing neutral and adaptive genetic diversity for conservation in a highly structured tree species.

    PubMed

    Rodríguez-Quilón, Isabel; Santos-Del-Blanco, Luis; Serra-Varela, María Jesús; Koskela, Jarkko; González-Martínez, Santiago C; Alía, Ricardo

    2016-10-01

    Preserving intraspecific genetic diversity is essential for long-term forest sustainability in a climate change scenario. Despite that, genetic information is largely neglected in conservation planning, and how conservation units should be defined is still heatedly debated. Here, we use maritime pine (Pinus pinaster Ait.), an outcrossing long-lived tree with a highly fragmented distribution in the Mediterranean biodiversity hotspot, to prove the importance of accounting for genetic variation, of both neutral molecular markers and quantitative traits, to define useful conservation units. Six gene pools associated to distinct evolutionary histories were identified within the species using 12 microsatellites and 266 single nucleotide polymorphisms (SNPs). In addition, height and survival standing variation, their genetic control, and plasticity were assessed in a multisite clonal common garden experiment (16 544 trees). We found high levels of quantitative genetic differentiation within previously defined neutral gene pools. Subsequent cluster analysis and post hoc trait distribution comparisons allowed us to define 10 genetically homogeneous population groups with high evolutionary potential. They constitute the minimum number of units to be represented in a maritime pine dynamic conservation program. Our results uphold that the identification of conservation units below the species level should account for key neutral and adaptive components of genetic diversity, especially in species with strong population structure and complex evolutionary histories. The environmental zonation approach currently used by the pan-European genetic conservation strategy for forest trees would be largely improved by gradually integrating molecular and quantitative trait information, as data become available.

  1. NemaFootPrinter: a web based software for the identification of conserved non-coding genome sequence regions between C. elegans and C. briggsae

    PubMed Central

    Rambaldi, Davide; Guffanti, Alessandro; Morandi, Paolo; Cassata, Giuseppe

    2005-01-01

    Background: NemaFootPrinter (Nematode Transcription Factor Scan Through Philogenetic Footprinting) is a web-based software for interactive identification of conserved, non-exonic DNA segments in the genomes of C. elegans and C. briggsae. It has been implemented according to the following project specifications: a) Automated identification of orthologous gene pairs. b) Interactive selection of the boundaries of the genes to be compared. c) Pairwise sequence comparison with a range of different methods. d) Identification of putative transcription factor binding sites on conserved, non-exonic DNA segments. Results: Starting from a C. elegans or C. briggsae gene name or identifier, the software identifies the putative ortholog (if any), based on information derived from public nematode genome annotation databases. The investigator can then retrieve the genome DNA sequences of the two orthologous genes; visualize graphically the genes' intron/exon structure and the surrounding DNA regions; select, through an interactive graphical user interface, subsequences of the two gene regions. Using a bioinformatics toolbox (Blast2seq, Dotmatcher, Ssearch and connection to the rVista database) the investigator is able at the end of the procedure to identify and analyze significant sequence similarities, detecting the presence of transcription factor binding sites corresponding to the conserved segments. The software automatically masks exons. Discussion: This software is intended as a practical and intuitive tool for the researchers interested in the identification of non-exonic conserved sequence segments between C. elegans and C. briggsae. These sequences may contain regulatory transcriptional elements since they are conserved between two related, but rapidly evolving genomes. This software also highlights the power of genome annotation databases when they are conceived as an open resource and the possibilities offered by seamless integration of different web services via the http

  2. Comparative Analysis of Genome and Epigenome in Closely Related Medaka Species Identifies Conserved Sequence Preferences for DNA Hypomethylated Domains.

    PubMed

    Uno, Ayako; Nakamura, Ryohei; Tsukahara, Tatsuya; Qu, Wei; Sugano, Sumio; Suzuki, Yutaka; Morishita, Shinichi; Takeda, Hiroyuki

    2016-08-01

    The genomes of vertebrates are globally methylated, but a small portion of genomic regions are known to be hypomethylated. Although hypomethylated domains (HMDs) have been implicated in transcriptional regulation in various ways, how a HMD is determined in a particular genomic region remains elusive. To search for DNA motifs essential for the formation of HMDs, we performed the genome-wide comparative analysis of genome and DNA methylation patterns of the two medaka inbred lines, Hd-rRII1 and HNI-II, which are derived from northern and southern subpopulations of Japan and exhibit high levels of genetic variations (SNP, ∼ 3%). We successfully mapped > 70% of HMDs in both genomes and found that the majority of those mapped HMDs are conserved between the two lines (common HMDs). Unexpectedly, the average genetic variations are similar in the common HMD and other genome regions. However, we identified short well-conserved motifs that are specifically enriched in HMDs, suggesting that they may play roles in the establishment of HMDs in the medaka genome.

  3. Computational identification and characterization of conserved miRNAs and their target genes in garlic (Allium sativum L.) expressed sequence tags.

    PubMed

    Panda, Debashis; Dehury, Budheswar; Sahu, Jagajjit; Barooah, Madhumita; Sen, Priyabrata; Modi, Mahendra K

    2014-03-10

    Endogenous small non-coding functional microRNAs (miRNAs), ~21 to 24 nucleotides in length, play a pivotal role in gene expression in plants and animals by silencing genes, either by degrading homologous mRNA or by blocking its translation. Although various high-throughput, time-consuming and expensive techniques such as forward genetics and direct cloning are employed to detect miRNAs in plants, comparative genomics complemented with novel bioinformatic tools paves the way for efficient and cost-effective identification of miRNAs through homologous sequence searches against previously known miRNAs. In this study, an attempt was made to identify and characterize conserved miRNAs in garlic expressed sequence tags (ESTs) through computational means. For identification of novel miRNAs in garlic, a total of 3227 known mature miRNAs of plant kingdom Viridiplantae were searched for homology against 21,637 EST sequences, resulting in identification of 6 potential miRNA candidates belonging to 6 different miRNA families. The psRNATarget server predicted 33 potential target genes and their probable functions for the six identified miRNA families in garlic. Most of the garlic miRNA target genes seem to encode transcription factors as well as genes involved in stress response, metabolism, plant growth and development. The results from the present study will shed more light on the molecular mechanisms of miRNA in garlic, which may aid in the development of novel and precise techniques to understand post-transcriptional gene silencing mechanisms involved in stress tolerance.
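    The homology search outlined in this abstract, scanning EST sequences for near-matches to known mature miRNAs, can be sketched as a sliding-window comparison that tolerates a few mismatches. This is an illustration only: the abstract does not state the search tool or mismatch threshold used, so the function name, the threshold, and the toy sequences are all assumptions. Real pipelines typically use an alignment tool and then check precursor secondary structure.

```python
def find_near_matches(query: str, target: str, max_mismatches: int):
    """Return (start, mismatches) for each window of target within the mismatch limit.

    Both sequences are upper-cased and U is converted to T, so an RNA query
    can be compared against a DNA (EST) target.
    """
    q = query.upper().replace("U", "T")
    t = target.upper().replace("U", "T")
    hits = []
    for start in range(len(t) - len(q) + 1):
        window = t[start:start + len(q)]
        mismatches = sum(a != b for a, b in zip(q, window))
        if mismatches <= max_mismatches:
            hits.append((start, mismatches))
    return hits


# Toy inputs (assumptions, not sequences from the study).
toy_mirna = "UGACAGAAGAGAGUGAGCAC"               # 20-nt query in RNA alphabet
toy_est_exact = "CCCCTGACAGAAGAGAGTGAGCACGGGG"   # contains an exact DNA copy
toy_est_variant = "AATGACATAAGAGAGTGAGCAC"       # one substitution (G -> T)

print(find_near_matches(toy_mirna, toy_est_exact, 0))
print(find_near_matches(toy_mirna, toy_est_variant, 1))
```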

  4. High-Resolution Satellite Imagery Is an Important yet Underutilized Resource in Conservation Biology

    PubMed Central

    Boyle, Sarah A.; Kennedy, Christina M.; Torres, Julio; Colman, Karen; Pérez-Estigarribia, Pastor E.; de la Sancha, Noé U.

    2014-01-01

    Technological advances and increasing availability of high-resolution satellite imagery offer the potential for more accurate land cover classifications and pattern analyses, which could greatly improve the detection and quantification of land cover change for conservation. Such remotely-sensed products, however, are often expensive and difficult to acquire, which prohibits or reduces their use. We tested whether imagery of high spatial resolution (≤5 m) differs from lower-resolution imagery (≥30 m) in performance and extent of use for conservation applications. To assess performance, we classified land cover in a heterogeneous region of Interior Atlantic Forest in Paraguay, which has undergone recent and dramatic human-induced habitat loss and fragmentation. We used 4 m multispectral IKONOS and 30 m multispectral Landsat imagery and determined the extent to which resolution influenced the delineation of land cover classes and patch-level metrics. Higher-resolution imagery more accurately delineated cover classes, identified smaller patches, retained patch shape, and detected narrower, linear patches. To assess extent of use, we surveyed three conservation journals (Biological Conservation, Biotropica, Conservation Biology) and found limited application of high-resolution imagery in research, with only 26.8% of land cover studies analyzing satellite imagery, and of these studies only 10.4% used imagery ≤5 m resolution. Our results suggest that high-resolution imagery is warranted yet under-utilized in conservation research, but is needed to adequately monitor and evaluate forest loss and conversion, and to delineate potentially important stepping-stone fragments that may serve as corridors in a human-modified landscape. Greater access to low-cost, multiband, high-resolution satellite imagery would therefore greatly facilitate conservation management and decision-making. PMID:24466287

  5. High-resolution satellite imagery is an important yet underutilized resource in conservation biology.

    PubMed

    Boyle, Sarah A; Kennedy, Christina M; Torres, Julio; Colman, Karen; Pérez-Estigarribia, Pastor E; de la Sancha, Noé U

    2014-01-01

    Technological advances and increasing availability of high-resolution satellite imagery offer the potential for more accurate land cover classifications and pattern analyses, which could greatly improve the detection and quantification of land cover change for conservation. Such remotely-sensed products, however, are often expensive and difficult to acquire, which prohibits or reduces their use. We tested whether imagery of high spatial resolution (≤5 m) differs from lower-resolution imagery (≥30 m) in performance and extent of use for conservation applications. To assess performance, we classified land cover in a heterogeneous region of Interior Atlantic Forest in Paraguay, which has undergone recent and dramatic human-induced habitat loss and fragmentation. We used 4 m multispectral IKONOS and 30 m multispectral Landsat imagery and determined the extent to which resolution influenced the delineation of land cover classes and patch-level metrics. Higher-resolution imagery more accurately delineated cover classes, identified smaller patches, retained patch shape, and detected narrower, linear patches. To assess extent of use, we surveyed three conservation journals (Biological Conservation, Biotropica, Conservation Biology) and found limited application of high-resolution imagery in research, with only 26.8% of land cover studies analyzing satellite imagery, and of these studies only 10.4% used imagery ≤5 m resolution. Our results suggest that high-resolution imagery is warranted yet under-utilized in conservation research, but is needed to adequately monitor and evaluate forest loss and conversion, and to delineate potentially important stepping-stone fragments that may serve as corridors in a human-modified landscape. Greater access to low-cost, multiband, high-resolution satellite imagery would therefore greatly facilitate conservation management and decision-making.

  6. Structure Analysis Uncovers a Highly Diverse but Structurally Conserved Effector Family in Phytopathogenic Fungi

    PubMed Central

    Gracy, Jérome; Fournier, Elisabeth; Kroj, Thomas; Padilla, André

    2015-01-01

    Phytopathogenic ascomycete fungi possess huge effector repertoires that are dominated by hundreds of sequence-unrelated small secreted proteins. The molecular function of these effectors and the evolutionary mechanisms that generate this tremendous number of singleton genes are largely unknown. To get a deeper understanding of fungal effectors, we determined by NMR spectroscopy the 3-dimensional structures of the Magnaporthe oryzae effectors AVR1-CO39 and AVR-Pia. Despite a lack of sequence similarity, both proteins have very similar 6 β-sandwich structures that are stabilized in both cases by a disulfide bridge between 2 conserved cysteines located in similar positions of the proteins. Structural similarity searches revealed that AvrPiz-t, another effector from M. oryzae, and ToxB, an effector of the wheat tan spot pathogen Pyrenophora tritici-repentis, have the same structures, suggesting the existence of a family of sequence-unrelated but structurally conserved fungal effectors that we named MAX-effectors (Magnaporthe Avrs and ToxB like). Structure-informed pattern searches strengthened this hypothesis by identifying MAX-effector candidates in a broad range of ascomycete phytopathogens. Strong expansion of the MAX-effector family was detected in M. oryzae and M. grisea where they seem to be particularly important since they account for 5–10% of the effector repertoire and 50% of the cloned avirulence effectors. Expression analysis indicated that the majority of M. oryzae MAX-effectors are expressed specifically during early infection suggesting important functions during biotrophic host colonization. We hypothesize that the scenario observed for MAX-effectors can serve as a paradigm for ascomycete effector diversity and that the enormous number of sequence-unrelated ascomycete effectors may in fact belong to a restricted set of structurally conserved effector families. PMID:26506000

  7. A root chicory MADS box sequence and the Arabidopsis flowering repressor FLC share common features that suggest conserved function in vernalization and de-vernalization responses.

    PubMed

    Périlleux, Claire; Pieltain, Alexandra; Jacquemin, Guillaume; Bouché, Frédéric; Detry, Nathalie; D'Aloia, Maria; Thiry, Laura; Aljochim, Pierre; Delansnay, Martin; Mathieu, Anne-Sophie; Lutts, Stanley; Tocquin, Pierre

    2013-08-01

    Root chicory (Cichorium intybus var. sativum) is a biennial crop, but is harvested to obtain root inulin at the end of the first growing season before flowering. However, cold temperatures may vernalize seeds or plantlets, leading to incidental early flowering, and hence understanding the molecular basis of vernalization is important. A MADS box sequence was isolated by RT-PCR and named FLC-LIKE1 (CiFL1) because of its phylogenetic positioning within the same clade as the floral repressor Arabidopsis FLOWERING LOCUS C (AtFLC). Moreover, over-expression of CiFL1 in Arabidopsis caused late flowering and prevented up-regulation of the AtFLC target FLOWERING LOCUS T by photoperiod, suggesting functional conservation between root chicory and Arabidopsis. Like AtFLC in Arabidopsis, CiFL1 was repressed during vernalization of seeds or plantlets of chicory, but repression of CiFL1 was unstable when the post-vernalization temperature was favorable to flowering and when it de-vernalized the plants. This instability of CiFL1 repression may be linked to the bienniality of root chicory compared with the annual lifecycle of Arabidopsis. However, re-activation of AtFLC was also observed in Arabidopsis when a high temperature treatment was used straight after seed vernalization, eliminating the promotive effect of cold on flowering. Cold-induced down-regulation of a MADS box floral repressor and its re-activation by high temperature thus appear to be conserved features of the vernalization and de-vernalization responses in distant species.

  8. High-Throughput Sequencing: A Roadmap Toward Community Ecology

    PubMed Central

    Poisot, Timothée; Péquin, Bérangère; Gravel, Dominique

    2013-01-01

    High-throughput sequencing is becoming increasingly important in microbial ecology, yet it is surprisingly under-used to generate or test biogeographic hypotheses. In this contribution, we highlight how adding these methods to the ecologist toolbox will allow the detection of new patterns, and will help our understanding of the structure and dynamics of diversity. Starting with a review of ecological questions that can be addressed, we move on to the technical and analytical issues that will benefit from an increased collaboration between different disciplines. PMID:23610649

  9. Influence of FGD gypsum on the properties of a highly erodible soil under conservation tillage

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The performance of conservation tillage practices imposed on highly erodible soils may be improved by the use of amendments with a high solubility rate, and whose dissolution products are translocated at depth in the soil profile faster than normally used agricultural lime and fertilizer products. T...

  10. Low-Cost, High-Throughput Sequencing of DNA Assemblies Using a Highly Multiplexed Nextera Process.

    PubMed

    Shapland, Elaine B; Holmes, Victor; Reeves, Christopher D; Sorokin, Elena; Durot, Maxime; Platt, Darren; Allen, Christopher; Dean, Jed; Serber, Zach; Newman, Jack; Chandran, Sunil

    2015-07-17

    In recent years, next-generation sequencing (NGS) technology has greatly reduced the cost of sequencing whole genomes, whereas the cost of sequence verification of plasmids via Sanger sequencing has remained high. Consequently, industrial-scale strain engineers either limit the number of designs or take short cuts in quality control. Here, we show that over 4000 plasmids can be completely sequenced in one Illumina MiSeq run for less than $3 each (15× coverage), which is a 20-fold reduction over using Sanger sequencing (2× coverage). We reduced the volume of the Nextera tagmentation reaction by 100-fold and developed an automated workflow to prepare thousands of samples for sequencing. We also developed software to track the samples and associated sequence data and to rapidly identify correctly assembled constructs having the fewest defects. As DNA synthesis and assembly become a centralized commodity, this NGS quality control (QC) process will be essential to groups operating high-throughput pipelines for DNA construction.
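
    The cost figures quoted in this abstract imply a rough Sanger baseline. The following is a back-of-the-envelope sketch in Python, treating "less than $3" as exactly $3 and "20-fold" as exact; both are simplifying assumptions, and the variable names are illustrative only.

```python
# Back-of-the-envelope check of the figures quoted in the abstract.
# Assumptions: $3 is used as the per-plasmid NGS cost (the abstract says
# "less than $3"), and the 20-fold reduction is taken at face value.
plasmids_per_run = 4000        # "over 4000 plasmids" in one MiSeq run
ngs_cost_per_plasmid = 3.00    # USD, upper bound from the abstract
fold_reduction = 20            # relative to Sanger sequencing

# Per-plasmid Sanger cost implied by the stated fold reduction.
implied_sanger_cost = ngs_cost_per_plasmid * fold_reduction

# Cost of sequencing exactly 4000 plasmids at $3 each.
run_cost = plasmids_per_run * ngs_cost_per_plasmid

print(f"Implied Sanger cost per plasmid: about ${implied_sanger_cost:.0f}")
print(f"Cost of 4000 plasmids at $3 each: ${run_cost:,.0f}")
```

    Under these assumptions, the implied Sanger figure is about $60 per plasmid, at lower coverage (2× versus 15×).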

  11. Zooming in on the human-mouse comparative map: genome conservation re-examined on a high-resolution scale.

    PubMed

    Carver, E A; Stubbs, L

    1997-12-01

    Over the past decade, conservation of genetic linkage groups has been shown in mammals and used to great advantage, fueling significant exchanges of gene mapping and functional information, especially between the genomes of humans and mice. As human physical maps increase in resolution from chromosome bands to nucleotide sequence, comparative alignments of mouse and human regions have revealed striking similarities and surprising differences between the genomes of these two best-mapped mammalian species. Although, at present, very few mouse and human regions have been compared on the physical level, existing studies provide intriguing insights into genome evolution, including the observation of recent duplications and deletions of genes that may play significant roles in defining some of the biological differences between the two species. Although high-resolution conserved marker-based maps are currently available only for human and mouse, a variety of new methods and resources are speeding the development of comparative maps of additional organisms. These advances mark the first step toward establishment of the human genome as a reference map for vertebrate species, providing evolutionary and functional annotation to human sequence and vast new resources for genetic analysis of a variety of commercially, medically, and ecologically important animal models.

  12. A monoclonal antibody targeting a highly conserved epitope in influenza B neuraminidase provides protection against drug resistant strains.

    PubMed

    Doyle, Tracey M; Li, Changgui; Bucher, Doris J; Hashem, Anwar M; Van Domselaar, Gary; Wang, Junzhi; Farnsworth, Aaron; She, Yi-Min; Cyr, Terry; He, Runtao; Brown, Earl G; Hurt, Aeron C; Li, Xuguang

    2013-11-08

    All influenza viral neuraminidases (NA) of both type A and B viruses have only one universally conserved sequence located between amino acids 222-230. A monoclonal antibody against this region has been previously reported to provide broad inhibition against all nine subtypes of influenza A NA; yet its inhibitory effect against influenza B viral NA remained unknown. Here, we report that the monoclonal antibody provides a broad inhibition against various strains of influenza B viruses of both Victoria and Yamagata genetic lineage. Moreover, the growth and NA enzymatic activity of two drug resistant influenza B strains (E117D and D197E) are also inhibited by the antibody even though these two mutations are conformationally proximal to the universal epitope. Collectively, these data suggest that this unique, highly-conserved linear sequence in viral NA is exposed sufficiently to allow access by inhibitory antibody during the course of infection; it could represent a potential target for antiviral agents and vaccine-induced immune responses against diverse strains of type B influenza virus.

  13. Universal antibodies against the highly conserved influenza fusion peptide cross-neutralize several subtypes of influenza A virus

    SciTech Connect

    Hashem, Anwar M.; Van Domselaar, Gary; Li, Changgui; Wang, Junzhi; She, Yi-Min; Cyr, Terry D.; Sui, Jianhua; He, Runtao; Marasco, Wayne A.; Li, Xuguang

    2010-12-10

    Research highlights: The fusion peptide is the only universally conserved epitope in all influenza viral hemagglutinins. Anti-fusion peptide antibodies are universal antibodies that cross-react with all influenza HA subtypes. The universal antibodies cross-neutralize different influenza A subtypes. The universal antibodies inhibit the fusion process between the viruses and the target cells. Abstract: The fusion peptide of influenza viral hemagglutinin plays a critical role in virus entry by facilitating membrane fusion between the virus and target cells. As the fusion peptide is the only universally conserved epitope in all influenza A and B viruses, it could be an attractive target for vaccine-induced immune responses. We previously reported that antibodies targeting the first 14 amino acids of the N-terminus of the fusion peptide could bind to virtually all influenza virus strains and quantify hemagglutinins in vaccines produced in embryonated eggs. Here we demonstrate that these universal antibodies bind to the viral hemagglutinins in native conformation presented in infected mammalian cell cultures and neutralize multiple subtypes of virus by inhibiting the pH-dependent fusion of viral and cellular membranes. These results suggest that this unique, highly-conserved linear sequence in viral hemagglutinin is exposed sufficiently to be attacked by the antibodies during the course of infection and merits further investigation because of potential importance in the protection against diverse strains of influenza viruses.

  14. Potential of high residue conservation tillage to enhance water conservation and water use efficiency in corn production in the Southeast

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Following the adoption of its first Comprehensive State-wide Water Management Plan in early February 2008, Georgia is on course to drafting regionally-based water development and conservation plans. Conservation tillage-based crop production can be one of the management tools available for achieving...

  15. Identification of Novel and Conserved miRNAs in Leaves of In vitro Grown Citrus reticulata “Lugan” Plantlets by Solexa Sequencing

    PubMed Central

    Guo, Rongfang; Chen, Xiaodong; Lin, Yuling; Xu, Xuhan; Thu, Min Kyaw; Lai, Zhongxiong

    2016-01-01

    MicroRNAs (miRNAs) play essential roles in plant development, but their roles in in vitro plant development are unknown. Leaves of ponkan plantlets derived from mature embryos under in vitro culture conditions were used to sequence the small RNA fraction via Solexa sequencing, and miRNA expression was analyzed. The results showed that there were 3,065,625 unique sequences in ponkan, of which 0.79% were miRNAs. The RNA sequences with lengths of 18–25 nt derived from the library were analyzed, leading to the identification of 224 known miRNAs, of which the most abundant were miR157, miR156, and miR166. Three hundred and fifty-eight novel miRNA candidates were also identified, and the number of reads of ponkan novel miRNAs varied from 5 to 168,273. Most of the known miRNAs obtained were expressed at low levels, with values ranging from 5 to 4,946,356. To better understand the role of miRNAs during the preservation of ponkan in vitro plantlets, the expression patterns of cre-miR156a/159b/160a/166a/167a/168a/171/398b were validated by quantitative real-time PCR (qPCR). The results showed that not only the development-associated miRNAs, e.g., cre-miR156/159/166/396, but also the stress-related miRNAs, e.g., cre-miR171 and cre-miR398b, were highly expressed at the early preservation period in the in vitro ponkan plantlet leaves. The expression levels of most tested miRNAs were found to decrease after 6 months, and the amounts of these miRNAs remained low at 18 months. In an analysis of the expression levels of their targets during the preservation of the ponkan in vitro plantlets, the development-associated cre-ARF6 and stress-related cre-CSD modules exhibited negative correlation with miR167 and miR398, respectively, indicating an involvement of the miRNAs in the in vitro development of ponkan and a function in the conservation of ponkan germplasm. PMID:26779240

  16. Identification of conserved and novel microRNAs in the Pacific oyster Crassostrea gigas by deep sequencing.

    PubMed

    Xu, Fei; Wang, Xiaotong; Feng, Yue; Huang, Wen; Wang, Wei; Li, Li; Fang, Xiaodong; Que, Huayong; Zhang, Guofan

    2014-01-01

    MicroRNAs (miRNAs) play important roles in regulatory processes in various organisms. To date many studies have been performed in the investigation of miRNAs of numerous bilaterians, but limited numbers of miRNAs have been identified in the few species belonging to the clade Lophotrochozoa. In the current study, deep sequencing was conducted to identify the miRNAs of Crassostrea gigas (Lophotrochozoa) at a genomic scale, using 21 libraries that included different developmental stages and adult organs. A total of 100 hairpin precursor loci were predicted to encode miRNAs. Of these, 19 precursors (pre-miRNA) were novel in the oyster. As many as 53 (53%) miRNAs were distributed in clusters and 49 (49%) precursors were intragenic, which suggests two important biogenetic sources of miRNAs. Different developmental stages were characterized with specific miRNA expression patterns that highlighted regulatory variation along a temporal axis. Conserved miRNAs were expressed universally throughout different stages and organs, whereas novel miRNAs tended to be more specific and may be related to the determination of the novel body plan. Furthermore, we developed an index named the miRNA profile age index (miRPAI) to integrate the evolutionary age and expression levels of miRNAs during a particular developmental stage. We found that the swimming stages were characterized by the youngest miRPAIs. Indeed, the large-scale expression of novel miRNAs indicated the importance of these stages during development, particularly from organogenetic and evolutionary perspectives. Some potentially important miRNAs were identified for further study through significant changes between expression patterns in different developmental events, such as metamorphosis. This study broadened the knowledge of miRNAs in animals and indicated the presence of sophisticated miRNA regulatory networks related to the biological processes in lophotrochozoans.

  17. Identification of Conserved and Novel MicroRNAs in the Pacific Oyster Crassostrea gigas by Deep Sequencing

    PubMed Central

    Xu, Fei; Wang, Xiaotong; Feng, Yue; Huang, Wen; Wang, Wei; Li, Li; Fang, Xiaodong; Que, Huayong; Zhang, Guofan

    2014-01-01

    MicroRNAs (miRNAs) play important roles in regulatory processes in various organisms. To date many studies have been performed in the investigation of miRNAs of numerous bilaterians, but limited numbers of miRNAs have been identified in the few species belonging to the clade Lophotrochozoa. In the current study, deep sequencing was conducted to identify the miRNAs of Crassostrea gigas (Lophotrochozoa) at a genomic scale, using 21 libraries that included different developmental stages and adult organs. A total of 100 hairpin precursor loci were predicted to encode miRNAs. Of these, 19 precursors (pre-miRNA) were novel in the oyster. As many as 53 (53%) miRNAs were distributed in clusters and 49 (49%) precursors were intragenic, which suggests two important biogenetic sources of miRNAs. Different developmental stages were characterized with specific miRNA expression patterns that highlighted regulatory variation along a temporal axis. Conserved miRNAs were expressed universally throughout different stages and organs, whereas novel miRNAs tended to be more specific and may be related to the determination of the novel body plan. Furthermore, we developed an index named the miRNA profile age index (miRPAI) to integrate the evolutionary age and expression levels of miRNAs during a particular developmental stage. We found that the swimming stages were characterized by the youngest miRPAIs. Indeed, the large-scale expression of novel miRNAs indicated the importance of these stages during development, particularly from organogenetic and evolutionary perspectives. Some potentially important miRNAs were identified for further study through significant changes between expression patterns in different developmental events, such as metamorphosis. This study broadened the knowledge of miRNAs in animals and indicated the presence of sophisticated miRNA regulatory networks related to the biological processes in lophotrochozoans. PMID:25137038

  18. The Runt domain of AML1 (RUNX1) binds a sequence-conserved RNA motif that mimics a DNA element

    PubMed Central

    Fukunaga, Junichi; Nomura, Yusuke; Tanaka, Yoichiro; Amano, Ryo; Tanaka, Taku; Nakamura, Yoshikazu; Kawai, Gota; Sakamoto, Taiichi; Kozu, Tomoko

    2013-01-01

    AML1 (RUNX1) is a key transcription factor for hematopoiesis that binds to the Runt-binding double-stranded DNA element (RDE) of target genes through its N-terminal Runt domain. Aberrations in the AML1 gene are frequently found in human leukemia. To better understand AML1 and its potential utility for diagnosis and therapy, we obtained RNA aptamers that bind specifically to the AML1 Runt domain. Enzymatic probing and NMR analyses revealed that Apt1-S, which is a truncated variant of one of the aptamers, has a CACG tetraloop and two stem regions separated by an internal loop. All the isolated aptamers were found to contain the conserved sequence motif 5′-NNCCAC-3′ and 5′-GCGMGN′N′-3′ (M:A or C; N and N′ form Watson–Crick base pairs). The motif contains one AC mismatch and one base bulged out. Mutational analysis of Apt1-S showed that three guanines of the motif are important for Runt binding as are the three guanines of RDE, which are directly recognized by three arginine residues of the Runt domain. Mutational analyses of the Runt domain revealed that the amino acid residues used for Apt1-S binding were similar to those used for RDE binding. Furthermore, the aptamer competed with RDE for binding to the Runt domain in vitro. These results demonstrated that the Runt domain of the AML1 protein binds to the motif of the aptamer that mimics DNA. Our findings should provide new insights into RNA function and utility in both basic and applied sciences. PMID:23709277
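
    The conserved motif described in this abstract (5′-NNCCAC-3′ paired with 5′-GCGMGN′N′-3′, where M is A or C) can be turned into a simple sequence scan. The following Python sketch is illustrative only: the function name, the toy sequences, and the assumption that the two N′ bases pair with the two N bases in antiparallel order are not taken from the abstract.

```python
import re

# Canonical Watson-Crick pairs in RNA.
WC_PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}

# Lookaheads so that overlapping candidate sites are all reported.
FIVE_PRIME = re.compile(r"(?=([ACGU]{2})CCAC)")       # 5'-NNCCAC-3'
THREE_PRIME = re.compile(r"(?=GCG[AC]G([ACGU]{2}))")  # 5'-GCGMGN'N'-3'

def find_motif_pairs(rna):
    """Return (start_5p, start_3p) index pairs where both halves occur
    and the N / N' bases are complementary (assumed antiparallel)."""
    rna = rna.upper().replace("T", "U")
    hits = []
    for m5 in FIVE_PRIME.finditer(rna):
        n1, n2 = m5.group(1)
        for m3 in THREE_PRIME.finditer(rna):
            if m3.start() < m5.start() + 6:
                continue  # the 3' half must lie downstream of the 5' half
            n2_partner, n1_partner = m3.group(1)
            if (n1, n1_partner) in WC_PAIRS and (n2, n2_partner) in WC_PAIRS:
                hits.append((m5.start(), m3.start()))
    return hits

# Toy sequences (hypothetical, not an aptamer from the study).
print(find_motif_pairs("GGCCACAAAAGCGAGCC"))  # N = GG, N' = CC
print(find_motif_pairs("AACCACAAAAGCGAGCC"))  # N = AA cannot pair with CC
```

    A scan of this kind checks only the primary sequence; it does not model the AC mismatch, the bulged base, or the overall fold mentioned in the abstract.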

  19. High-speed lossless compression for angiography image sequences

    NASA Astrophysics Data System (ADS)

    Kennedy, Jonathon M.; Simms, Michael; Kearney, Emma; Dowling, Anita; Fagan, Andrew; O'Hare, Neil J.

    2001-05-01

    High speed processing of large amounts of data is a requirement for many diagnostic quality medical imaging applications. A demanding example is the acquisition, storage and display of image sequences in angiography. The functional performance requirements for handling angiography data were identified. A new lossless image compression algorithm was developed, implemented in C++ for the Intel Pentium/MS-Windows environment and optimized for speed of operation. Speeds of up to 6M pixels per second for compression and 12M pixels per second for decompression were measured. This represents an improvement of up to 400% over the next best high-performance algorithm (LOCO-I) without significant reduction in compression ratio. Performance tests were carried out at St. James's Hospital using actual angiography data. Results were compared with the lossless JPEG standard and other leading methods such as JPEG-LS (LOCO-I) and the lossless wavelet approach proposed for JPEG 2000. Our new algorithm represents a significant improvement in the performance of lossless image compression technology without using specialized hardware. It has been applied successfully to image sequence decompression at video rate for angiography, one of the most challenging application areas in medical imaging.

  20. Sequence-conserved and antibody-accessible sites in the V1V2 domain of HIV-1 gp120 envelope protein.

    PubMed

    Shmelkov, Evgeny; Grigoryan, Arsen; Krachmarov, Chavdar; Abagyan, Ruben; Cardozo, Timothy

    2014-09-01

    The immune-correlates analysis of the RV144 trial suggested that epitopes targeted by protective antibodies (Abs) reside in the V1V2 domain of gp120. We mapped V1V2 positional sequence variation onto the conserved V1V2 structural fold and showed that while most of the solvent-accessible V1V2 amino acids vary between strains, there are two accessible molecular surface regions that are conserved and also naturally antigenic. These sites may contain epitopes targeted by broadly cross-reactive anti-V1V2 antibodies.
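
    The abstract does not say how positional sequence variation was measured. One widely used measure is the Shannon entropy of each alignment column; the Python sketch below assumes that measure and uses a toy alignment, so it should be read as an illustration rather than the authors' method.

```python
import math
from collections import Counter

def column_entropy(column):
    """Shannon entropy (bits) of the residues in one alignment column."""
    counts = Counter(column)
    total = len(column)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

def positional_entropy(aligned_seqs):
    """Entropy per column for equal-length aligned sequences."""
    return [column_entropy(col) for col in zip(*aligned_seqs)]

# Toy alignment (hypothetical residues, not V1V2 data).
alignment = ["ACD", "ACE", "ACD", "ACE"]
entropies = positional_entropy(alignment)
print(entropies)  # low values indicate conserved positions
```

    Columns with entropy near zero are conserved across the aligned strains; such values could then be mapped onto a structure to look for conserved, solvent-accessible surface patches.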

  1. Conservation of the S10-spc-α Locus within Otherwise Highly Plastic Genomes Provides Phylogenetic Insight into the Genus Leptospira

    PubMed Central

    Zuerner, Richard L.; Ahmed, Niyaz; Bulach, Dieter M.; Quinteiro, Javier; Hartskeerl, Rudy A.

    2008-01-01

    S10-spc-α is a 17.5 kb cluster of 32 genes encoding ribosomal proteins. This locus has an unusual composition and organization in Leptospira interrogans. We demonstrate the highly conserved nature of this region among diverse Leptospira and show its utility as a phylogenetically informative region. Comparative analyses were performed by PCR using primer sets covering the whole locus. Correctly sized fragments were obtained by PCR from all L. interrogans strains tested for each primer set indicating that this locus is well conserved in this species. Few differences were detected in amplification profiles between different pathogenic species, indicating that the S10-spc-α locus is conserved among pathogenic Leptospira. In contrast, PCR analysis of this locus using DNA from saprophytic Leptospira species and species with an intermediate pathogenic capacity generated varied results. Sequence alignment of the S10-spc-α locus from two pathogenic species, L. interrogans and L. borgpetersenii, with the corresponding locus from the saprophyte L. biflexa serovar Patoc showed that genetic organization of this locus is well conserved within Leptospira. Multilocus sequence typing (MLST) of four conserved regions resulted in the construction of well-defined phylogenetic trees that help resolve questions about the interrelationships of pathogenic Leptospira. Based on the results of secY sequence analysis, we found that reliable species identification of pathogenic Leptospira is possible by comparative analysis of a 245 bp region commonly used as a target for diagnostic PCR for leptospirosis. Comparative analysis of Leptospira strains revealed that strain H6 previously classified as L. inadai actually belongs to the pathogenic species L. interrogans and that L. meyeri strain ICF phylogenetically co-localized with the pathogenic clusters. These findings demonstrate that the S10-spc-α locus is highly conserved throughout the genus and may be more useful in comparing evolution of

  2. High-Order Entropy Stable Finite Difference Schemes for Nonlinear Conservation Laws: Finite Domains

    NASA Technical Reports Server (NTRS)

    Fisher, Travis C.; Carpenter, Mark H.

    2013-01-01

    Developing stable and robust high-order finite difference schemes requires mathematical formalism and appropriate methods of analysis. In this work, nonlinear entropy stability is used to derive provably stable high-order finite difference methods with formal boundary closures for conservation laws. Particular emphasis is placed on the entropy stability of the compressible Navier-Stokes equations. A newly derived entropy stable weighted essentially non-oscillatory finite difference method is used to simulate problems with shocks and a conservative, entropy stable, narrow-stencil finite difference approach is used to approximate viscous terms.

  3. Complete genome sequences of two highly divergent Japanese isolates of Plantago asiatica mosaic virus.

    PubMed

    Komatsu, Ken; Yamashita, Kazuo; Sugawara, Kota; Verbeek, Martin; Fujita, Naoko; Hanada, Kaoru; Uehara-Ichiki, Tamaki; Fuji, Shin-Ichi

    2017-02-01

    Plantago asiatica mosaic virus (PlAMV) is a member of the genus Potexvirus and has an exceptionally wide host range. It causes severe damage to lilies. Here we report on the complete nucleotide sequences of two new Japanese PlAMV isolates, one from the eudicot weed Viola grypoceras (PlAMV-Vi), and the other from the eudicot shrub Nandina domestica Thunb. (PlAMV-NJ). Their genomes contain five open reading frames (ORFs), which is characteristic of potexviruses. Surprisingly, the isolates showed only 76.0-78.0 % sequence identity with each other and with other PlAMV isolates, including isolates from Japanese lily and American nandina. Amino acid alignments of the replicase coding region encoded by ORF1 showed that the regions between the methyltransferase and helicase domains were less conserved than other regions, with several insertions and/or deletions. Phylogenetic analyses of the full-length nucleotide sequences revealed a moderate correlation between phylogenetic clustering and the original host plants of the PlAMV isolates. This study revealed the presence of two highly divergent PlAMV isolates in Japan.

  4. The Highly Conserved Proline at Position 438 in Pseudorabies Virus gH Is Important for Regulation of Membrane Fusion

    PubMed Central

    Schröter, Christina; Klupp, Barbara G.; Fuchs, Walter; Gerhard, Marika; Backovic, Marija; Rey, Felix A.

    2014-01-01

    ABSTRACT Membrane fusion in herpesviruses requires viral glycoproteins (g) gB and gH/gL. While gB is considered the actual fusion protein but is nonfusogenic per se, the function of gH/gL remains enigmatic. Crystal structures for different gH homologs are strikingly similar despite only moderate amino acid sequence conservation. A highly conserved sequence motif comprises the residues serine-proline-cysteine corresponding to positions 437 to 439 in pseudorabies virus (PrV) gH. The PrV-gH structure shows that proline438 induces bending at the end of an alpha-helix, thereby placing cysteine404 and cysteine439 in juxtaposition to allow formation of a strictly conserved disulfide bond. However, PrV vaccine strain Bartha unexpectedly carries a serine at this conserved position. To test the influence of this substitution, we constructed different gH chimeras carrying proline or serine at position 438 in gH derived from either PrV strain Kaplan or strain Bartha. Mutants expressing gH with serine438 showed reduced fusion activity in transient-fusion assays and during infection, with delayed penetration kinetics and a small-plaque phenotype which indicates that proline438 is important for efficient fusion. A more drastic effect was observed when disulfide bond formation was completely blocked by mutation of cysteine404 to serine. Although PrV expressing gHC404S was viable, plaque size and penetration kinetics were drastically reduced. Alteration of serine438 to proline in gH of strain Bartha enhanced cell-to-cell spread and penetration kinetics, but restoration of full activity required additional alteration of aspartic acid to valine at position 59. IMPORTANCE The role of the gH/gL complex in herpesvirus membrane fusion is still unclear. Structural studies predicted a critical role for proline438 in PrV gH to allow the formation of a conserved disulfide bond and correct protein folding. Functional analyses within this study corroborated these structural predictions

  5. Single nucleotide polymorphisms (SNPs) are highly conserved in rhesus (Macaca mulatta) and cynomolgus (Macaca fascicularis) macaques

    PubMed Central

    Street, Summer L; Kyes, Randall C; Grant, Richard; Ferguson, Betsy

    2007-01-01

    Background Macaca fascicularis (cynomolgus or longtail macaques) is the most commonly used non-human primate in biomedical research. Little is known about the genomic variation in cynomolgus macaques or how the sequence variants compare to those of the well-studied related species, Macaca mulatta (rhesus macaque). Previously we identified single nucleotide polymorphisms (SNPs) in portions of 94 rhesus macaque genes and reported that Indian and Chinese rhesus had largely different SNPs. Here we identify SNPs from some of the same genomic regions of cynomolgus macaques (from Indochina, Indonesia, Mauritius and the Philippines) and compare them to the SNPs found in rhesus. Results We sequenced a portion of 10 genes in 20 cynomolgus macaques. We identified 69 SNPs in these regions, compared with 71 SNPs found in the same genomic regions of 20 Indian and Chinese rhesus macaques. Thirty six (52%) of the M. fascicularis SNPs were overlapping in both species. The majority (70%) of the SNPs found in both Chinese and Indian rhesus macaque populations were also present in M. fascicularis. Of the SNPs previously found in a single rhesus population, 38% (Indian) and 44% (Chinese) were also identified in cynomolgus macaques. In an alternative approach, we genotyped 100 cynomolgus DNAs using a rhesus macaque SNP array representing 53 genes and found that 51% (29/57) of the rhesus SNPs were present in M. fascicularis. Comparisons of SNP profiles from cynomolgus macaques imported from breeding centers in China (where M. fascicularis are not native) showed they were similar to those from Indochina. Conclusion This study demonstrates a surprisingly high conservation of SNPs between M. fascicularis and M. mulatta, suggesting that the relationship of these two species is closer than that suggested by morphological and mitochondrial DNA analysis alone. These findings indicate that SNP discovery efforts in either species will generate useful resources for both macaque species
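
    The overlap percentages reported in this abstract follow directly from the stated counts. A minimal Python check, using only numbers given in the abstract (the helper name is illustrative):

```python
def percent(part, whole):
    """Percentage of `whole` represented by `part`, rounded to an integer."""
    return round(100 * part / whole)

# 36 of the 69 M. fascicularis SNPs were shared with rhesus macaques.
shared_sequenced = percent(36, 69)

# 29 of 57 rhesus array SNPs were also present in M. fascicularis.
shared_array = percent(29, 57)

print(shared_sequenced, shared_array)
```

    Both approaches, resequencing and array genotyping, therefore give a shared fraction of roughly one half.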

  6. Spatial overlap between environmental policy instruments and areas of high conservation value in forest.

    PubMed

    Sverdrup-Thygeson, Anne; Søgaard, Gunnhild; Rusch, Graciela M; Barton, David N

    2014-01-01

    In order to safeguard biodiversity in forest we need to know how forest policy instruments work. Here we use a nationwide network of 9400 plots in productive forest to analyze to what extent large-scale policy instruments, individually and together, target forest of high conservation value in Norway. We studied both instruments working through direct regulation; Strict Protection and Landscape Protection, and instruments working through management planning and voluntary schemes of forest certification; Wilderness Area and Mountain Forest. As forest of high conservation value (HCV-forest) we considered the extent of 12 Biodiversity Habitats and the extent of Old-Age Forest. We found that 22% of productive forest area contained Biodiversity Habitats. More than 70% of this area was not covered by any large-scale instruments. Mountain Forest covered 23%, while Strict Protection and Wilderness both covered 5% of the Biodiversity Habitat area. A total of 9% of productive forest area contained Old-Age Forest, and the relative coverage of the four instruments was similar as for Biodiversity Habitats. For all instruments, except Landscape Protection, the targeted areas contained significantly higher proportions of HCV-forest than areas not targeted by these instruments. Areas targeted by Strict Protection had higher proportions of HCV-forest than areas targeted by other instruments, except for areas targeted by Wilderness Area which showed similar proportions of Biodiversity Habitats. There was a substantial amount of spatial overlap between the policy tools, but no incremental conservation effect of overlapping instruments in terms of contributing to higher percentages of targeted HCV-forest. Our results reveal that although the current policy mix has an above average representation of forest of high conservation value, the targeting efficiency in terms of area overlap is limited. There is a need to improve forest conservation and a potential to cover this need by better

  7. Sequence of a cDNA encoding the bi-specific NAD(P)H-nitrate reductase from the tree Betula pendula and identification of conserved protein regions.

    PubMed

    Friemann, A; Brinkmann, K; Hachtel, W

    1991-05-01

    Nitrate reductase (NR) assays revealed a bispecific NAD(P)H-NR (EC 1.6.6.2.) to be the only nitrate-reducing enzyme in leaves of hydroponically grown birches. To obtain the primary structure of the NAD(P)H-NR, leaf poly(A)+ mRNA was used to construct a cDNA library in the lambda gt11 phage. Recombinant clones were screened with heterologous gene probes encoding NADH-NR from tobacco and squash. A 3.0 kb cDNA was isolated which hybridized to a 3.2 kb mRNA whose level was significantly higher in plants grown on nitrate than in those grown on ammonia. The nucleotide sequence of the cDNA comprises a reading frame encoding a protein of 898 amino acids which reveals 67%-77% identity with NADH-nitrate reductase sequences from higher plants. To identify conserved and variable regions of the multicentre electron-transfer protein a graphical evaluation of identities found in NR sequence alignments was carried out. Thirteen well-conserved sections exceeding a size of 10 amino acids were found in higher plant nitrate reductases. Sequence comparisons with related redox proteins indicate that about half of the conserved NR regions are involved in cofactor binding. The most striking difference in the birch NAD(P)H-NR sequence in comparison to NADH-NR sequences was found at the putative pyridine nucleotide binding site. Southern analysis indicates that the bi-specific NR is encoded by a single copy gene in birch.

  8. A highly conserved novel family of mammalian developmental transcription factors related to Drosophila grainyhead.

    PubMed

    Wilanowski, Tomasz; Tuckfield, Annabel; Cerruti, Loretta; O'Connell, Sinead; Saint, Robert; Parekh, Vishwas; Tao, Jianning; Cunningham, John M; Jane, Stephen M

    2002-06-01

    The Drosophila transcription factor Grainyhead regulates several key developmental processes. Three mammalian genes, CP2, LBP-1a and LBP-9 have been previously identified as homologues of grainyhead. We now report the cloning of two new mammalian genes (Mammalian grainyhead (MGR) and Brother-of-MGR (BOM)) and one new Drosophila gene (dCP2) that rewrite the phylogeny of this family. We demonstrate that MGR and BOM are more closely related to grh, whereas CP2, LBP-1a and LBP-9 are descendants of the dCP2 gene. MGR shares the greatest sequence homology with grh, is expressed in tissue-restricted patterns more comparable to grh and binds to and transactivates the promoter of the human Engrailed-1 gene, the mammalian homologue of the key grainyhead target gene, engrailed. This sequence and functional conservation indicates that the new mammalian members of this family play important developmental roles.

  9. A stationary-phase gene in Saccharomyces cerevisiae is a member of a novel, highly conserved gene family.

    PubMed Central

    Braun, E L; Fuge, E K; Padilla, P A; Werner-Washburne, M

    1996-01-01

    The regulation of cellular growth and proliferation in response to environmental cues is critical for development and the maintenance of viability in all organisms. In unicellular organisms, such as the budding yeast Saccharomyces cerevisiae, growth and proliferation are regulated by nutrient availability. We have described changes in the pattern of protein synthesis during the growth of S. cerevisiae cells to stationary phase (E. K. Fuge, E. L. Braun, and M. Werner-Washburne, J. Bacteriol. 176:5802-5813, 1994) and noted a protein, which we designated Snz1p (p35), that shows increased synthesis after entry into stationary phase. We report here the identification of the SNZ1 gene, which encodes this protein. We detected increased SNZ1 mRNA accumulation almost 2 days after glucose exhaustion, significantly later than that of mRNAs encoded by other postexponential genes. SNZ1-related sequences were detected in phylogenetically diverse organisms by sequence comparisons and low-stringency hybridization. Multiple SNZ1-related sequences were detected in some organisms, including S. cerevisiae. Snz1p was found to be among the most evolutionarily conserved proteins currently identified, indicating that we have identified a novel, highly conserved protein involved in growth arrest in S. cerevisiae. The broad phylogenetic distribution, the regulation of the SNZ1 mRNA and protein in S. cerevisiae, and identification of a Snz protein modified during sporulation in the gram-positive bacterium Bacillus subtilis support the hypothesis that Snz proteins are part of an ancient response that occurs during nutrient limitation and growth arrest. PMID:8955308

  10. Phylogeography of western Pacific Leucetta 'chagosensis' (Porifera: Calcarea) from ribosomal DNA sequences: implications for population history and conservation of the Great Barrier Reef World Heritage Area (Australia).

    PubMed

    Wörheide, Gert; Hooper, John N A; Degnan, Bernard M

    2002-09-01

    Leucetta 'chagosensis' is a widespread calcareous sponge, occurring in shaded habitats of Indo-Pacific coral reefs. In this study we explore relationships among 19 ribosomal DNA sequence types (the ITS1-5.8S-ITS2 region plus flanking gene sequences) found among 54 individuals from 28 locations throughout the western Pacific, with focus on the Great Barrier Reef (GBR). Maximum parsimony analysis revealed phylogeographical structuring into four major clades (although not highly supported by bootstrap analysis) corresponding to the northern/central GBR with Guam and Taiwan, the southern GBR and subtropical regions south to Brisbane, Vanuatu and Indonesia. Subsequent nested clade analysis (NCA) confirmed this structure with a probability of > 95%. After NCA of geographical distances, a pattern of range expansion from the internal Indonesian clade was inferred at the total cladogram level, as the Indonesian clade was found to be the internal and therefore oldest clade. Two distinct clades were found on the GBR, which narrowly overlap geographically in a line approximately from the Whitsunday Islands to the northern Swain Reefs. At various clade levels, NCA inferred that the northern GBR clade was influenced by past fragmentation and contiguous range expansion events, presumably during/after sea level low stands in the Pleistocene, after which the northern GBR might have been recolonized from the Queensland Plateau in the Coral Sea. The southern GBR clade is most closely related to subtropical L. 'chagosensis', and we infer that the southern GBR probably was recolonized from there after sea level low stands, based on our NCA results and supported by oceanographic data. Our results have important implications for conservation and management of the GBR, as they highlight the importance of marginal transition zones in the generation and maintenance of species rich zones, such as the Great Barrier Reef World Heritage Area.

  11. Resolving postglacial phylogeography using high-throughput sequencing

    PubMed Central

    Emerson, Kevin J.; Merz, Clayton R.; Catchen, Julian M.; Hohenlohe, Paul A.; Cresko, William A.; Bradshaw, William E.; Holzapfel, Christina M.

    2010-01-01

    The distinction between model and nonmodel organisms is becoming increasingly blurred. High-throughput, second-generation sequencing approaches are being applied to organisms based on their interesting ecological, physiological, developmental, or evolutionary properties and not on the depth of genetic information available for them. Here, we illustrate this point using a low-cost, efficient technique to determine the fine-scale phylogenetic relationships among recently diverged populations in a species. This application of restriction site-associated DNA tags (RAD tags) reveals previously unresolved genetic structure and direction of evolution in the pitcher plant mosquito, Wyeomyia smithii, from a southern Appalachian Mountain refugium following recession of the Laurentide Ice Sheet at 22,000–19,000 B.P. The RAD tag method can be used to identify detailed patterns of phylogeography in any organism regardless of existing genomic data, and, more broadly, to identify incipient speciation and genome-wide variation in natural populations in general. PMID:20798348

  12. High throughput sequencing reveals a novel fabavirus infecting sweet cherry.

    PubMed

    Villamor, D E V; Pillai, S S; Eastwell, K C

    2017-03-01

    The genus Fabavirus currently consists of five species represented by viruses that infect a wide range of hosts but none reported from temperate climate fruit trees. A virus with genomic features resembling fabaviruses (tentatively named Prunus virus F, PrVF) was revealed by high throughput sequencing of extracts from a sweet cherry tree (Prunus avium). PrVF was subsequently shown to be graft transmissible and further identified in three other non-symptomatic Prunus spp. from different geographical locations. Two genetic variants of RNA1 and RNA2 coexisted in the same samples. RNA1 consisted of 6,165 and 6,163 nucleotides, and RNA2 consisted of 3,622 and 3,468 nucleotides.

  13. Octreotide for conservative management of intractable high output post operative chylous fistula: a case report.

    PubMed

    Prabhu, Sundararaman; Thomas, Shaji

    2015-03-01

    A case of high output post neck dissection chylous fistula is presented, which was successfully managed conservatively with octreotide; a long acting somatostatin analogue. Routine measures had failed, and secondary complications precluded thoracoscopic ligation. We discuss the spectrum of problems associated with chylous fistula and review the rationale behind the use of octreotide.

  14. The transcription factor Spn1 regulates gene expression via a highly conserved novel structural motif

    PubMed Central

    Pujari, Venugopal; Radebaugh, Catherine A.; Chodaparambil, Jayanth V.; Muthurajan, Uma M.; Almeida, Adam R.; Fischbeck, Julie A.; Luger, Karolin; Stargell, Laurie A.

    2010-01-01

    Spn1 plays essential roles in the regulation of gene expression by RNA Polymerase II (RNAPII), and it is highly conserved in organisms ranging from yeast to humans. Spn1 physically and/or genetically interacts with RNAPII, TBP, TFIIS and a number of chromatin remodeling factors (Swi/Snf and Spt6). The central domain of Spn1 (residues 141-305 out of 410) is necessary and sufficient for performing the essential functions of SPN1 in yeast cells. Here we report the high-resolution (1.85Å) crystal structure of the conserved central domain of Saccharomyces cerevisiae Spn1. The central domain is comprised of eight alpha-helices in a right handed super helical arrangement, and exhibits structural similarity to domain I of TFIIS. A unique structural feature of Spn1 is a highly conserved loop, which defines one side of a pronounced cavity. The loop and the other residues forming the cavity are highly conserved at the amino acid level among all Spn1 family members, suggesting that this is a signature motif for Spn1 orthologs. The locations and the molecular characterization of temperature-sensitive mutations in Spn1 indicate that the cavity is a key attribute of Spn1 that is critical for its regulatory functions during RNAPII-mediated transcriptional activity. PMID:20875428

  15. The transcription factor Spn1 regulates gene expression via a highly conserved novel structural motif.

    PubMed

    Pujari, Venugopal; Radebaugh, Catherine A; Chodaparambil, Jayanth V; Muthurajan, Uma M; Almeida, Adam R; Fischbeck, Julie A; Luger, Karolin; Stargell, Laurie A

    2010-11-19

    Spn1/Iws1 plays essential roles in the regulation of gene expression by RNA polymerase II (RNAPII), and it is highly conserved in organisms ranging from yeast to humans. Spn1 physically and/or genetically interacts with RNAPII, TBP (TATA-binding protein), TFIIS (transcription factor IIS), and a number of chromatin remodeling factors (Swi/Snf and Spt6). The central domain of Spn1 (residues 141-305 out of 410) is necessary and sufficient for performing the essential functions of SPN1 in yeast cells. Here, we report the high-resolution (1.85 Å) crystal structure of the conserved central domain of Saccharomyces cerevisiae Spn1. The central domain is composed of eight α-helices in a right-handed superhelical arrangement and exhibits structural similarity to domain I of TFIIS. A unique structural feature of Spn1 is a highly conserved loop, which defines one side of a pronounced cavity. The loop and the other residues forming the cavity are highly conserved at the amino acid level among all Spn1 family members, suggesting that this is a signature motif for Spn1 orthologs. The locations and the molecular characterization of temperature-sensitive mutations in Spn1 indicate that the cavity is a key attribute of Spn1 that is critical for its regulatory functions during RNAPII-mediated transcriptional activity.

  16. Complex evolution of a highly conserved microsatellite locus in several fish species.

    PubMed

    Liu, J-X; Ely, B

    2009-08-01

    The evolutionary dynamics of a highly conserved microsatellite locus (Dla 11) were studied in several fish species. The data indicated that multiple types of compound microsatellites arose through point mutations that were sometimes followed by expansion of the derived motif. Furthermore, extensive length variation was detected among species in the regions immediately flanking the repeat region.

  17. Evaluation of sequencing approaches for high-throughput ...

    EPA Pesticide Factsheets

    Whole-genome in vitro transcriptomics has shown the capability to identify mechanisms of action and estimates of potency for chemical-mediated effects in a toxicological framework, but with limited throughput and high cost. We present the evaluation of three toxicogenomics platforms for potential application to high-throughput screening: 1. TempO-Seq utilizing custom designed paired probes per gene; 2. Targeted sequencing (TSQ) utilizing Illumina’s TruSeq RNA Access Library Prep Kit containing tiled exon-specific probe sets; 3. Low coverage whole transcriptome sequencing (LSQ) using Illumina’s TruSeq Stranded mRNA Kit. Each platform was required to cover the ~20,000 genes of the full transcriptome, operate directly with cell lysates, and be automatable with 384-well plates. Technical reproducibility was assessed using MAQC control RNA samples A and B, while functional utility for chemical screening was evaluated using six treatments at a single concentration after 6 hr in MCF7 breast cancer cells: 10 µM chlorpromazine, 10 µM ciclopirox, 10 µM genistein, 100 nM sirolimus, 1 µM tanespimycin, and 1 µM trichostatin A. All RNA samples and chemical treatments were run with 5 technical replicates. The three platforms achieved different read depths, with the TempO-Seq having ~34M mapped reads per sample, while TSQ and LSQ averaged 20M and 11M aligned reads per sample, respectively. Inter-replicate correlation averaged ≥0.95 for raw log2 expression values i

  18. Sequence conservation in the Ancylostoma secreted protein-2 of Necator americanus (Na-ASP-2) from hookworm infected individuals in Thailand.

    PubMed

    Ungcharoensuk, Charoenchai; Putaporntip, Chaturong; Pattanawong, Urassaya; Jongwutiwes, Somchai

    2012-12-01

    The Ancylostoma secreted protein-2 of Necator americanus (Na-ASP-2) was one of the promising vaccine candidates against the most prevalent human hookworm species, but adverse vaccine reactions have compromised further human vaccine trials. To elucidate the gene structure and the extent of sequence diversity, we determined the complete nucleotide sequence of the Na-asp-2 gene of individual larvae from 32 infected subjects living in 3 different endemic areas of Thailand. Sequence analysis revealed that the gene encoding Na-ASP-2 comprised 8 exons. Of 3 nucleotide substitutions in these exons, only one causes an amino acid change from leucine to methionine. Conserved consensus GT and AG dinucleotides were observed at the 5' and 3' boundaries of each intron, akin to those found in other eukaryotic genes. Introns of Na-asp-2 contained 23 nucleotide substitutions and 0-18 indels. The mean number of nucleotide substitutions per site (d) in introns was not significantly different from the mean number of synonymous substitutions per synonymous site (d(S)) in exons, whereas d in introns significantly exceeded d(N) (the mean number of nonsynonymous substitutions per nonsynonymous site) in exons (p<0.05), suggesting that introns and synonymous sites in exons may evolve at a similar rate whereas functional constraints at the amino acid level could limit amino acid substitutions in Na-ASP-2. A recombination site was identified in an intron near the 3' portion of the gene. The positions of introns and the intron phases in the Na-asp-2 gene, compared with those in other pathogenesis-related-1 proteins of Loa loa, Onchocerca volvulus, Heterodera glycines, Caenorhabditis elegans and human, were relatively conserved, suggesting evolutionary conservation of these genes. Sequence conservation in Na-ASP-2 may not compromise further vaccine design if adverse vaccine effects could be resolved, whereas microheterogeneity in introns of this locus may be useful for population genetics analysis of N. americanus.

  19. A Highly Conserved Residue in HIV-1 Nef Alpha Helix 2 Modulates Protein Expression

    PubMed Central

    Johnson, Aaron L.; Dirk, Brennan S.; Coutu, Mathieu; Haeryfar, S. M. Mansour; Arts, Eric J.; Finzi, Andrés

    2016-01-01

    ABSTRACT Extensive genetic diversity is a defining characteristic of human immunodeficiency virus type 1 (HIV-1) and poses a significant barrier to the development of an effective vaccine. To better understand the impact of this genetic diversity on the HIV-1 pathogenic factor Nef, we compiled a panel of reference strains from the NIH Los Alamos HIV Database. Initial sequence analysis identified point mutations at Nef residues 13, 84, and 92 in subtype C reference strain C.BR92025 from Brazil. Functional analysis revealed impaired major histocompatibility complex class I and CD4 downregulation of strain C.BR92025 Nef, which corresponded to decreased protein expression. Metabolic labeling demonstrated that strain C.BR92025 Nef has a greater rate of protein turnover than subtype B reference strain B.JRFL that, on the basis of mutational analysis, is related to Nef residue A84. An alanine-to-valine substitution at position 84, located in alpha helix 2 of Nef, was sufficient to alter the rate of turnover of an otherwise highly expressed Nef protein. In conclusion, these findings highlight HIV-1 Nef residue A84 as a major determinant of protein expression that may offer an additional avenue to disrupt or mediate the effects of this key HIV-1 pathogenic factor. IMPORTANCE The HIV-1 Nef protein has been established as a key pathogenic determinant of HIV/AIDS, but there is little knowledge of how the extensive genetic diversity of HIV-1 affects Nef function. Upon compiling a set of subtype-specific reference strains, we identified a subtype C reference strain, C.BR92025, that contained natural polymorphisms at otherwise highly conserved residues 13, 84, and 92. Interestingly, strain C.BR92025 Nef displayed impaired Nef function and had decreased protein expression. 
We have demonstrated that strain C.BR92025 Nef has a higher rate of protein turnover than highly expressed Nef proteins and that this higher rate of protein turnover is due to an alanine-to-valine substitution

  20. Sequence-Dependent T:G Base Pair Opening in DNA Double Helix Bound by Cren7, a Chromatin Protein Conserved among Crenarchaea

    PubMed Central

    Tian, Lei; Zhang, Zhenfeng; Wang, Hanqian; Zhao, Mohan; Dong, Yuhui; Gong, Yong

    2016-01-01

    T:G base pair arising from spontaneous deamination of 5mC or polymerase errors is a great challenge for DNA repair of hyperthermophilic archaea, especially Crenarchaea. Most strains in this phylum lack the protein homologues responsible for the recognition of the mismatch in the DNA repair pathways. To investigate whether Cren7, a highly conserved chromatin protein in Crenarchaea, serves a role in the repair of T:G mispairs, the crystal structures of Cren7-GTAATTGC and Cren7-GTGATCGC complexes were solved at 2.0 Å and 2.1 Å. In our structures, binding of Cren7 to the AT-rich DNA duplex (GTAATTGC) induces opening of T2:G15 but not T10:G7 base pair. By contrast, both T:G mispairs in the GC-rich DNA duplex (GTGATCGC) retain the classic wobble type. Structural analysis also showed DNA helical changes of GTAATTGC, especially in the steps around the open T:G base pair, as compared to GTGATCGC or the matched DNAs. Surface plasmon resonance assays revealed a 4-fold lower binding affinity of Cren7 for GTAATTGC than that for GTGATCGC, which was dominantly contributed by the decrease of association rate. These results suggested that binding of Cren7 to DNA leads to T:G mispair opening in a sequence dependent manner, and therefore propose the potential roles of Cren7 in DNA repair. PMID:27685992

  1. Unprecedented high-resolution view of bacterial operon architecture revealed by RNA sequencing.

    PubMed

    Conway, Tyrrell; Creecy, James P; Maddox, Scott M; Grissom, Joe E; Conkle, Trevor L; Shadid, Tyler M; Teramoto, Jun; San Miguel, Phillip; Shimada, Tomohiro; Ishihama, Akira; Mori, Hirotada; Wanner, Barry L

    2014-07-08

    We analyzed the transcriptome of Escherichia coli K-12 by strand-specific RNA sequencing at single-nucleotide resolution during steady-state (logarithmic-phase) growth and upon entry into stationary phase in glucose minimal medium. To generate high-resolution transcriptome maps, we developed an organizational schema which showed that in practice only three features are required to define operon architecture: the promoter, terminator, and deep RNA sequence read coverage. We precisely annotated 2,122 promoters and 1,774 terminators, defining 1,510 operons with an average of 1.98 genes per operon. Our analyses revealed an unprecedented view of E. coli operon architecture. A large proportion (36%) of operons are complex with internal promoters or terminators that generate multiple transcription units. For 43% of operons, we observed differential expression of polycistronic genes, despite being in the same operons, indicating that E. coli operon architecture allows fine-tuning of gene expression. We found that 276 of 370 convergent operons terminate inefficiently, generating complementary 3' transcript ends which overlap on average by 286 nucleotides, and 136 of 388 divergent operons have promoters arranged such that their 5' ends overlap on average by 168 nucleotides. We found 89 antisense transcripts of 397-nucleotide average length, 7 unannotated transcripts within intergenic regions, and 18 sense transcripts that completely overlap operons on the opposite strand. Of 519 overlapping transcripts, 75% correspond to sequences that are highly conserved in E. coli (>50 genomes). Our data extend recent studies showing unexpected transcriptome complexity in several bacteria and suggest that antisense RNA regulation is widespread. Importance: We precisely mapped the 5' and 3' ends of RNA transcripts across the E. coli K-12 genome by using a single-nucleotide analytical approach. Our resulting high-resolution transcriptome maps show that ca. one-third of E. coli operons are
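
As a quick check on the proportions quoted in this record, the sketch below recomputes them from the counts given in the abstract. The variable and function names are illustrative only, and the estimate of genes inside operons is simply the product of the two reported figures, not a number stated by the authors.

```python
# Counts quoted in the abstract above (E. coli K-12 operon map).
OPERONS = 1510
MEAN_GENES_PER_OPERON = 1.98
CONVERGENT_INEFFICIENT, CONVERGENT_TOTAL = 276, 370
DIVERGENT_OVERLAPPING, DIVERGENT_TOTAL = 136, 388


def percent(part, whole):
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Derived quantities (assumption: simple products and ratios of the reported counts).
genes_in_operons = OPERONS * MEAN_GENES_PER_OPERON
convergent_pct = percent(CONVERGENT_INEFFICIENT, CONVERGENT_TOTAL)
divergent_pct = percent(DIVERGENT_OVERLAPPING, DIVERGENT_TOTAL)

print(f"genes implied by the operon count: {genes_in_operons:.0f}")
print(f"convergent operons terminating inefficiently: {convergent_pct:.1f}%")
print(f"divergent operons with overlapping 5' ends: {divergent_pct:.1f}%")
```

On these figures, roughly three quarters of convergent operons terminate inefficiently and about a third of divergent operons have overlapping 5' ends.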

  2. Balancing forest-regeneration probabilities and maintenance costs in dry grasslands of high conservation priority

    USGS Publications Warehouse

    Bolliger, Janine; Edwards, Thomas C.; Eggenberg, Stefan; Ismail, Sascha; Seidl, Irmi; Kienast, Felix

    2011-01-01

    Abandonment of agricultural land has resulted in forest regeneration in species-rich dry grasslands across European mountain regions and threatens conservation efforts in this vegetation type. To support national conservation strategies, we used a site-selection algorithm (MARXAN) to find optimum sets of floristic regions (reporting units) that contain grasslands of high conservation priority. We sought optimum sets that would accommodate 136 important dry-grassland species and that would minimize forest regeneration and costs of management needed to forestall predicted forest regeneration. We did not consider other conservation elements of dry grasslands, such as animal species richness, cultural heritage, and changes due to climate change. Optimal sets that included 95–100% of the dry grassland species encompassed an average of 56–59 floristic regions (standard deviation, SD 5). This is about 15% of approximately 400 floristic regions that contain dry-grassland sites and translates to 4800–5300 ha of dry grassland out of a total of approximately 23,000 ha for the entire study area. Projected costs to manage the grasslands in these optimum sets ranged from CHF (Swiss francs) 5.2 to 6.0 million/year. This is only 15–20% of the current total estimated cost of approximately CHF30–45 million/year required if all dry grasslands were to be protected. The grasslands of the optimal sets may be viewed as core sites in a national conservation strategy.
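
The site-selection problem described in this record (cover all target species at the lowest management cost) can be illustrated with a toy example. The sketch below is not MARXAN: it swaps in a plain greedy cost-per-new-species heuristic, and the region names, species labels, and costs are hypothetical.

```python
def greedy_site_selection(regions, costs, target_species):
    """Pick regions until every target species is covered.

    At each step the region with the lowest cost per newly covered
    species is chosen (a greedy set-cover heuristic, used here as a
    stand-in for a real reserve-selection optimizer).
    """
    uncovered = set(target_species)
    chosen = []
    while uncovered:
        best, best_ratio = None, float("inf")
        for name, species in regions.items():
            if name in chosen:
                continue
            gain = len(species & uncovered)
            if gain == 0:
                continue
            ratio = costs[name] / gain
            if ratio < best_ratio:
                best, best_ratio = name, ratio
        if best is None:
            raise ValueError("some target species occur in no region")
        chosen.append(best)
        uncovered -= regions[best]
    return chosen


# Hypothetical toy data: four floristic regions, five species, yearly costs.
regions = {
    "R1": {"a", "b", "c"},
    "R2": {"c", "d"},
    "R3": {"d", "e"},
    "R4": {"a", "e"},
}
costs = {"R1": 3.0, "R2": 1.0, "R3": 2.0, "R4": 4.0}
targets = {"a", "b", "c", "d", "e"}

selection = greedy_site_selection(regions, costs, targets)
total_cost = sum(costs[name] for name in selection)
print(selection, total_cost)
```

A greedy pass like this gives a feasible set quickly but carries no guarantee of optimality, which is one reason dedicated tools search over many candidate sets.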

  3. On the freestream preservation of high-order conservative flux-reconstruction schemes

    NASA Astrophysics Data System (ADS)

    Abe, Yoshiaki; Haga, Takanori; Nonomura, Taku; Fujii, Kozo

    2015-01-01

    The appropriate procedure for constructing the symmetric conservative metric is presented with which both the freestream preservation and global conservation properties are satisfied in the high-order conservative flux-reconstruction scheme on a three-dimensional stationary-curvilinear grid. A freestream preservation test is conducted, and the symmetric conservative metric constructed by the appropriate procedure preserves the freestream regardless of the order of shape functions, while other metrics cannot always preserve the freestream. Also, a convecting vortex is computed on three-dimensional wavy grids, and the formal order of accuracy is achieved when the symmetric conservative metric is appropriately constructed, while it is not when the metric is inappropriately constructed. In addition, although the sufficient condition for the freestream preservation with the nonconservative (cross product form) metric was reported in the previous study to be that the order of the solution polynomial has to be greater than or equal to twice the order of a shape function, a special case is newly found in the present study: when the Radau polynomial is used for the correction function, the freestream is preserved even if the solution order is lower than the known condition. Using the properties of Legendre polynomials, the mechanism for this special case is analytically explained, considering the cancellation of aliasing errors.
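
For orientation, freestream preservation on curvilinear grids is commonly stated through the metric identities below. The notation is a standard textbook form and an assumption here, not taken from the abstract: J is the Jacobian determinant of the mapping from computational coordinates (xi, eta, zeta) to physical coordinates (x_1, x_2, x_3), P_sol is the solution polynomial order, and P_shape is the shape-function order.

```latex
% Metric identities that a uniform flow must satisfy discretely (i = 1, 2, 3)
\frac{\partial}{\partial \xi}\left(J\,\frac{\partial \xi}{\partial x_i}\right)
+ \frac{\partial}{\partial \eta}\left(J\,\frac{\partial \eta}{\partial x_i}\right)
+ \frac{\partial}{\partial \zeta}\left(J\,\frac{\partial \zeta}{\partial x_i}\right) = 0

% Previously reported sufficient condition for the nonconservative
% (cross product form) metric, as paraphrased in the abstract
P_{\mathrm{sol}} \ge 2\,P_{\mathrm{shape}}
```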

  4. Identification of individual barley chromosomes based on repetitive sequences: conservative distribution of Afa-family repetitive sequences on the chromosomes of barley and wheat.

    PubMed

    Tsujimoto, H; Mukai, Y; Akagawa, K; Nagaki, K; Fujigaki, J; Yamamoto, M; Sasakuma, T

    1997-10-01

    The Afa-family repetitive sequences were isolated from barley (Hordeum vulgare, 2n = 14) and cloned as pHvA14. This sequence distinguished each barley chromosome by in situ hybridization. Double color fluorescence in situ hybridization using pHvA14 and 5S rDNA or HvRT-family sequence (subtelomeric sequence of barley) allocated individual barley chromosomes showing a specific pattern of pHvA14 to chromosomes 1H to 7H. As in the case of the D genome chromosomes of Aegilops squarrosa and common wheat (Triticum aestivum) hybridized by its Afa-family sequences, the signals of pHvA14 in barley chromosomes tended to appear in the distal regions that do not carry many chromosome band markers. In the telomeric regions these signals were always located in more proximal portions than those of HvRT-family. Based on the distribution patterns of Afa-family sequences in the chromosomes of barley and D genome chromosomes of wheat, we discuss a possible mechanism of amplification of the repetitive sequences during the evolution of Triticeae. In addition, we show here that HvRT-family also could be used to distinguish individual barley chromosomes from the patterns of in situ hybridization.

  5. A method for high-performance sequence analysis using polyvinylidene difluoride membranes with a biphasic reaction column sequencer.

    PubMed

    Reim, D F; Speicher, D W

    1994-01-01

    Methods have been developed for high-sensitivity sequence analysis of proteins electroblotted onto polyvinylidene difluoride (PVDF) membranes using a Hewlett-Packard G1005A protein sequencer. This sequencer normally uses a biphasic (hydrophobic/hydrophilic) reaction column which was designed to accommodate loading and cleanup of samples from diverse solutions. However, the standard column, programs, and chemistry were not designed to accommodate PVDF, which has become a common sequencing support. In this study, a systematic evaluation of the suitability of this sequencer for analysis using PVDF bound samples was performed and included evaluation of: different wash and extraction solvents, multiple programming changes, two alternative formulations of coupling reagents, and the effect of direction for solvent and reagent deliveries. High-performance analysis of PVDF bound samples was achieved by: using a modified reaction column with an empty hydrophobic (top) half of the column module, program modifications for the reaction column and converter, substitution of ethyl acetate for the standard S2/3 extraction solvent and using prototype Version 2.0 formulations of the coupling reagents, R1 and R2. High-performance sequence analyses of experimental samples electroblotted from either 1D or 2D gels onto high-retention PVDF membranes were obtained with a 41-min cycle time, including experimental samples with initial coupling yields < 2 pmol. Routine sequencer performance was comparable to, or slightly better than, a conventional gas-phase sequencer which had been previously optimized by us for high-performance sequence analysis of electroblotted samples in the low pmol range.

  6. An Analysis of Stimuli that Influence Compliance during the High-Probability Instruction Sequence

    ERIC Educational Resources Information Center

    Normand, Matthew P.; Kestner, Kathryn; Jessel, Joshua

    2010-01-01

    When we evaluated variables that influence the effectiveness of the high-probability (high-p) instruction sequence, the sequence was associated with a precipitous decrease in compliance with high-p instructions for 1 participant, thereby precluding continued use of the sequence. We investigated the reasons for this decrease. Stimuli associated…

  7. Evolutionary conservation analysis increases the colocalization of predicted exonic splicing enhancers in the BRCA1 gene with missense sequence changes and in-frame deletions, but not polymorphisms

    PubMed Central

    Pettigrew, Christopher; Wayte, Nicola; Lovelock, Paul K; Tavtigian, Sean V; Chenevix-Trench, Georgia; Spurdle, Amanda B; Brown, Melissa A

    2005-01-01

    Introduction Aberrant pre-mRNA splicing can be more detrimental to the function of a gene than changes in the length or nature of the encoded amino acid sequence. Although predicting the effects of changes in consensus 5' and 3' splice sites near intron:exon boundaries is relatively straightforward, predicting the possible effects of changes in exonic splicing enhancers (ESEs) remains a challenge. Methods As an initial step toward determining which ESEs predicted by the web-based tool ESEfinder in the breast cancer susceptibility gene BRCA1 are likely to be functional, we have determined their evolutionary conservation and compared their location with known BRCA1 sequence variants. Results Using the default settings of ESEfinder, we initially detected 669 potential ESEs in the coding region of the BRCA1 gene. Increasing the threshold score reduced the total number to 464, while taking into consideration the proximity to splice donor and acceptor sites reduced the number to 211. Approximately 11% of these ESEs (23/211) either are identical at the nucleotide level in human, primates, mouse, cow, dog and opossum Brca1 (conserved) or are detectable by ESEfinder in the same position in the Brca1 sequence (shared). The frequency of conserved and shared predicted ESEs between human and mouse is higher in BRCA1 exons (2.8 per 100 nucleotides) than in introns (0.6 per 100 nucleotides). Of conserved or shared putative ESEs, 61% (14/23) were predicted to be affected by sequence variants reported in the Breast Cancer Information Core database. Applying the filters described above increased the colocalization of predicted ESEs with missense changes, in-frame deletions and unclassified variants predicted to be deleterious to protein function, whereas they decreased the colocalization with known polymorphisms or unclassified variants predicted to be neutral. 
Conclusion In this report we show that evolutionary conservation analysis may be used to improve the specificity of an ESE

  8. Protein Sequence Annotation Tool (PSAT): A centralized web-based meta-server for high-throughput sequence annotations

    SciTech Connect

    Leung, Elo; Huang, Amy; Cadag, Eithon; Montana, Aldrin; Soliman, Jan Lorenz; Zhou, Carol L. Ecale

    2016-01-20

    In this study, we introduce the Protein Sequence Annotation Tool (PSAT), a web-based, sequence annotation meta-server for performing integrated, high-throughput, genome-wide sequence analyses. Our goals in building PSAT were to (1) create an extensible platform for integration of multiple sequence-based bioinformatics tools, (2) enable functional annotations and enzyme predictions over large input protein FASTA data sets, and (3) provide a web interface for convenient execution of the tools. In this paper, we demonstrate the utility of PSAT by annotating the predicted peptide gene products of Herbaspirillum sp. strain RV1423, importing the results of PSAT into EC2KEGG, and using the resulting functional comparisons to identify a putative catabolic pathway, thereby distinguishing RV1423 from a well-annotated Herbaspirillum species. This analysis demonstrates that high-throughput enzyme predictions, provided by PSAT processing, can be used to identify metabolic potential in an otherwise poorly annotated genome. Lastly, PSAT is a meta-server that combines the results from several sequence-based annotation and function prediction codes, and is available at http://psat.llnl.gov/psat/. PSAT stands apart from other sequence-based genome annotation systems in providing a high-throughput platform for rapid de novo enzyme predictions and sequence annotations over large input protein sequence data sets in FASTA. PSAT is most appropriately applied in annotation of large protein FASTA sets that may or may not be associated with a single genome.

  9. Protein Sequence Annotation Tool (PSAT): A centralized web-based meta-server for high-throughput sequence annotations

    DOE PAGES

    Leung, Elo; Huang, Amy; Cadag, Eithon; ...

    2016-01-20

    In this study, we introduce the Protein Sequence Annotation Tool (PSAT), a web-based, sequence annotation meta-server for performing integrated, high-throughput, genome-wide sequence analyses. Our goals in building PSAT were to (1) create an extensible platform for integration of multiple sequence-based bioinformatics tools, (2) enable functional annotations and enzyme predictions over large input protein FASTA data sets, and (3) provide a web interface for convenient execution of the tools. In this paper, we demonstrate the utility of PSAT by annotating the predicted peptide gene products of Herbaspirillum sp. strain RV1423, importing the results of PSAT into EC2KEGG, and using the resulting functional comparisons to identify a putative catabolic pathway, thereby distinguishing RV1423 from a well-annotated Herbaspirillum species. This analysis demonstrates that high-throughput enzyme predictions, provided by PSAT processing, can be used to identify metabolic potential in an otherwise poorly annotated genome. Lastly, PSAT is a meta-server that combines the results from several sequence-based annotation and function prediction codes, and is available at http://psat.llnl.gov/psat/. PSAT stands apart from other sequence-based genome annotation systems in providing a high-throughput platform for rapid de novo enzyme predictions and sequence annotations over large input protein sequence data sets in FASTA. PSAT is most appropriately applied in annotation of large protein FASTA sets that may or may not be associated with a single genome.

  10. DUC-Curve, a highly compact 2D graphical representation of DNA sequences and its application in sequence alignment

    NASA Astrophysics Data System (ADS)

    Li, Yushuang; Liu, Qian; Zheng, Xiaoqi

    2016-08-01

    A highly compact and simple 2D graphical representation of DNA sequences, named DUC-Curve, is constructed through mapping four nucleotides to a unit circle with a cyclic order. DUC-Curve could directly detect nucleotide, di-nucleotide compositions and microsatellite structure from DNA sequences. Moreover, it also could be used for DNA sequence alignment. Taking geometric center vectors of DUC-Curves as sequence descriptor, we perform similarity analysis on the first exons of β-globin genes of 11 species, oncogene TP53 of 27 species and twenty-four Influenza A viruses, respectively. The obtained reasonable results illustrate that the proposed method is very effective in sequence comparison problems, and will at least play a complementary role in classification and clustering problems.
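
    The idea of placing the four nucleotides on a unit circle and summarizing a curve by its geometric center can be illustrated with a minimal sketch. The abstract does not give the cyclic order, the angles, or the exact curve construction, so the assignment A, C, G, T at 0°, 90°, 180°, 270° and the cumulative-walk construction below are assumptions chosen for illustration, not the published DUC-Curve definition. All function names are hypothetical.

```python
import math

# ASSUMPTION: cyclic order A -> C -> G -> T at 0, 90, 180, 270 degrees on the
# unit circle. The abstract does not state the actual order or angles.
UNIT_VECTORS = {
    "A": (1.0, 0.0),
    "C": (0.0, 1.0),
    "G": (-1.0, 0.0),
    "T": (0.0, -1.0),
}


def curve_points(seq):
    """Return a 2D walk: point i is the sum of the unit vectors of bases 0..i.

    ASSUMPTION: a cumulative walk is one common way to turn a per-base
    mapping into a curve; the published construction may differ.
    """
    x = y = 0.0
    points = []
    for base in seq.upper():
        dx, dy = UNIT_VECTORS[base]  # raises KeyError on non-ACGT symbols
        x += dx
        y += dy
        points.append((x, y))
    return points


def geometric_center(seq):
    """Mean of all curve points, used here as a two-number sequence descriptor."""
    points = curve_points(seq)
    if not points:
        raise ValueError("sequence must not be empty")
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def descriptor_distance(seq_a, seq_b):
    """Euclidean distance between the geometric centers of two sequences."""
    ax, ay = geometric_center(seq_a)
    bx, by = geometric_center(seq_b)
    return math.hypot(ax - bx, ay - by)
```

    Under these assumptions, a pairwise matrix of descriptor_distance values over a set of sequences could then feed a standard clustering step, which is the kind of similarity analysis the abstract describes.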

  11. Phylogenetic and Functional Analysis of Metagenome Sequence from High-Temperature Archaeal Habitats Demonstrate Linkages between Metabolic Potential and Geochemistry

    PubMed Central

    Inskeep, William P.; Jay, Zackary J.; Herrgard, Markus J.; Kozubal, Mark A.; Rusch, Douglas B.; Tringe, Susannah G.; Macur, Richard E.; Jennings, Ryan deM.; Boyd, Eric S.; Spear, John R.; Roberto, Francisco F.

    2013-01-01

    Geothermal habitats in Yellowstone National Park (YNP) provide an unparalleled opportunity to understand the environmental factors that control the distribution of archaea in thermal habitats. Here we describe, analyze, and synthesize metagenomic and geochemical data collected from seven high-temperature sites that contain microbial communities dominated by archaea relative to bacteria. The specific objectives of the study were to use metagenome sequencing to determine the structure and functional capacity of thermophilic archaeal-dominated microbial communities across a pH range from 2.5 to 6.4 and to discuss specific examples where the metabolic potential correlated with measured environmental parameters and geochemical processes occurring in situ. Random shotgun metagenome sequence (∼40–45 Mb Sanger sequencing per site) was obtained from environmental DNA extracted from high-temperature sediments and/or microbial mats and subjected to numerous phylogenetic and functional analyses. Analysis of individual sequences (e.g., MEGAN and G + C content) and assemblies from each habitat type revealed the presence of dominant archaeal populations in all environments, 10 of whose genomes were largely reconstructed from the sequence data. Analysis of protein family occurrence, particularly of those involved in energy conservation, electron transport, and autotrophic metabolism, revealed significant differences in metabolic strategies across sites consistent with differences in major geochemical attributes (e.g., sulfide, oxygen, pH). These observations provide an ecological basis for understanding the distribution of indigenous archaeal lineages across high-temperature systems of YNP. PMID:23720654

  12. Unprecedented High-Resolution View of Bacterial Operon Architecture Revealed by RNA Sequencing

    PubMed Central

    Creecy, James P.; Maddox, Scott M.; Grissom, Joe E.; Conkle, Trevor L.; Shadid, Tyler M.; Teramoto, Jun; San Miguel, Phillip; Shimada, Tomohiro; Ishihama, Akira; Mori, Hirotada

    2014-01-01

    ABSTRACT We analyzed the transcriptome of Escherichia coli K-12 by strand-specific RNA sequencing at single-nucleotide resolution during steady-state (logarithmic-phase) growth and upon entry into stationary phase in glucose minimal medium. To generate high-resolution transcriptome maps, we developed an organizational schema which showed that in practice only three features are required to define operon architecture: the promoter, terminator, and deep RNA sequence read coverage. We precisely annotated 2,122 promoters and 1,774 terminators, defining 1,510 operons with an average of 1.98 genes per operon. Our analyses revealed an unprecedented view of E. coli operon architecture. A large proportion (36%) of operons are complex with internal promoters or terminators that generate multiple transcription units. For 43% of operons, we observed differential expression of polycistronic genes, despite being in the same operons, indicating that E. coli operon architecture allows fine-tuning of gene expression. We found that 276 of 370 convergent operons terminate inefficiently, generating complementary 3′ transcript ends which overlap on average by 286 nucleotides, and 136 of 388 divergent operons have promoters arranged such that their 5′ ends overlap on average by 168 nucleotides. We found 89 antisense transcripts of 397-nucleotide average length, 7 unannotated transcripts within intergenic regions, and 18 sense transcripts that completely overlap operons on the opposite strand. Of 519 overlapping transcripts, 75% correspond to sequences that are highly conserved in E. coli (>50 genomes). Our data extend recent studies showing unexpected transcriptome complexity in several bacteria and suggest that antisense RNA regulation is widespread. PMID:25006232

  13. Whole Genome Mapping with Feature Sets from High-Throughput Sequencing Data

    PubMed Central

    Pan, Yonglong; Wang, Xiaoming; Liu, Lin; Wang, Hao; Luo, Meizhong

    2016-01-01

    A good physical map is essential to guide sequence assembly in de novo whole genome sequencing, especially when sequences are produced by high-throughput sequencing such as next-generation-sequencing (NGS) technology. We here present a novel method, Feature sets-based Genome Mapping (FGM). With FGM, physical map and draft whole genome sequences can be generated, anchored and integrated using the same data set of NGS sequences, independent of restriction digestion. Method model was created and parameters were inspected by simulations using the Arabidopsis genome sequence. In the simulations, when ~4.8X genome BAC library including 4,096 clones was used to sequence the whole genome, ~90% of clones were successfully connected to physical contigs, and 91.58% of genome sequences were mapped and connected to chromosomes. This method was experimentally verified using the existing physical map and genome sequence of rice. Of 4,064 clones covering 115 Mb sequence selected from ~3 tiles of 3 chromosomes of a rice draft physical map, 3,364 clones were reconstructed into physical contigs and 98 Mb sequences were integrated into the 3 chromosomes. The physical map-integrated draft genome sequences can provide permanent frameworks for eventually obtaining high-quality reference sequences by targeted sequencing, gap filling and combining other sequences. PMID:27611682

  14. Gene expression profile of human bone marrow stromal cells: high-throughput expressed sequence tag sequencing analysis.

    PubMed

    Jia, Libin; Young, Marian F; Powell, John; Yang, Liming; Ho, Nicola C; Hotchkiss, Robert; Robey, Pamela Gehron; Francomano, Clair A

    2002-01-01

    Human bone marrow stromal cells (HBMSC) are pluripotent cells with the potential to differentiate into osteoblasts, chondrocytes, myelosupportive stroma, and marrow adipocytes. We used high-throughput DNA sequencing analysis to generate 4258 single-pass sequencing reactions (known as expressed sequence tags, or ESTs) obtained from the 5' (97) and 3' (4161) ends of human cDNA clones from a HBMSC cDNA library. Our goal was to obtain tag sequences from the maximum number of possible genes and to deposit them in the publicly accessible database for ESTs (dbEST of the National Center for Biotechnology Information). Comparisons of our EST sequencing data with nonredundant human mRNA and protein databases showed that the ESTs represent 1860 gene clusters. The EST sequencing data analysis showed 60 novel genes found only in this cDNA library after BLAST analysis against 3.0 million ESTs in NCBI's dbEST database. The BLAST search also showed the identified ESTs that have close homology to known genes, which suggests that these may be newly recognized members of known gene families. The gene expression profile of this cell type is revealed by analyzing both the frequency with which a message is encountered and the functional categorization of expressed sequences. Comparing an EST sequence with the human genomic sequence database enables assignment of an EST to a specific chromosomal region (a process called digital gene localization) and often enables immediate partial determination of intron/exon boundaries within the genomic structure. It is expected that high-throughput EST sequencing and data mining analysis will greatly promote our understanding of gene expression in these cells and of growth and development of the skeleton.

  15. Experimental design-based functional mining and characterization of high-throughput sequencing data in the sequence read archive.

    PubMed

    Nakazato, Takeru; Ohta, Tazro; Bono, Hidemasa

    2013-01-01

    High-throughput sequencing technology, also called next-generation sequencing (NGS), has the potential to revolutionize the whole process of genome sequencing, transcriptomics, and epigenetics. Sequencing data is captured in a public primary data archive, the Sequence Read Archive (SRA). As of January 2013, data from more than 14,000 projects have been submitted to SRA, which is double that of the previous year. Researchers can download raw sequence data from the SRA website to perform further analyses and to compare with their own data. However, it is extremely difficult to search entries and download raw sequences of interest with SRA because the data structure is complicated, and experimental conditions along with raw sequences are partly described in natural language. Additionally, some sequences are of inconsistent quality because anyone can submit sequencing data to SRA with no quality check. Therefore, as a criterion of data quality, we focused on SRA entries that were cited in journal articles. We extracted SRA IDs and PubMed IDs (PMIDs) from SRA and full-text versions of journal articles and retrieved 2748 SRA ID-PMID pairs. We constructed a publication list referring to SRA entries. Since one of the main themes of -omics analyses is clarification of disease mechanisms, we also characterized SRA entries by disease keywords, according to the Medical Subject Headings (MeSH) extracted from articles assigned to each SRA entry. We obtained 989 SRA ID-MeSH disease term pairs, and constructed a disease list referring to SRA data. We previously developed feature profiles of diseases in a system called "Gendoo". We generated hyperlinks between diseases extracted from SRA and their feature profiles. The developed project, publication and disease lists resulting from this study are available at our web service, called "DBCLS SRA" (http://sra.dbcls.jp/). This service will improve accessibility to high-quality data from SRA.

  16. Interview-based sighting histories can inform regional conservation prioritization for highly threatened cryptic species

    PubMed Central

    Turvey, Samuel T; Trung, Cao Tien; Quyet, Vo Dai; Nhu, Hoang Van; Thoai, Do Van; Tuan, Vo Cong Anh; Hoa, Dang Thi; Kacha, Kouvang; Sysomphone, Thongsay; Wallate, Sousakhone; Hai, Chau Thi Thanh; Thanh, Nguyen Van; Wilkinson, Nicholas M

    2015-01-01

    The use of robust ecological data to make evidence-based management decisions is frequently prevented by limited data quantity or quality, and local ecological knowledge (LEK) is increasingly seen as an important source of information for conservation. However, there has been little assessment of LEK's usefulness for informing prioritization and management of landscapes for threatened species, or assessing comparative species status across landscapes. A large-scale interview survey in the Annamite Mountains (Vietnam and Lao PDR) compiled the first systematic LEK data set for saola Pseudoryx nghetinhensis, one of the world's rarest mammals, and eight other ungulates. Saola conservation is hindered by uncertainty over continued presence across much of its proposed distribution. We analysed comparative LEK-based last-sighting data across three landscapes to determine whether regional sighting histories support previous suggestions of landscape importance for saola conservation (Hue-Quang Nam: top-priority Vietnamese landscape; Pu Mat: lower priority Vietnamese landscape; Viengthong: high-priority Lao landscape) and whether they constitute an effective spatial prioritization tool for cryptic species management. Wild pig and red muntjac may be the only Annamite ungulates with stable populations; the regional status of all other species appears to be worse. Saola have declined more severely and/or are significantly rarer than most other ungulates and have been seen by relatively few respondents. Saola were also frequently considered locally rarest or declining, and never as species that had not declined. In contrast to other species, there are no regional differences in saola sighting histories, with continued persistence in all landscapes challenging suggestions that regional status differs greatly. Remnant populations persist in Vietnam despite heavy hunting, but even remote landscapes in Lao may be under intense pressure. Synthesis and applications. Our local

  17. Comparative genomic analysis of a neurotoxigenic Clostridium species using partial genome sequence: Phylogenetic analysis of a few conserved proteins involved in cellular processes and metabolism.

    PubMed

    Alam, Syed Imteyaz; Dixit, Aparna; Tomar, Arvind; Singh, Lokendra

    2010-04-01

    Clostridial organisms produce neurotoxins, which are generally regarded as the most potent toxic substances of biological origin and potential biological warfare agents. Clostridium tetani produces tetanus neurotoxin and is responsible for the fatal tetanus disease. In spite of the extensive immunization regimen, the disease is an important cause of death, especially among neonates. Strains of C. tetani have not been genetically characterized except for the complete genome sequencing of strain E88. The present study reports the genetic makeup and phylogenetic affiliations of an environmental strain of this bacterium with respect to C. tetani E88 and other clostridia. A shotgun library was constructed from the genomic DNA of C. tetani drde, isolated from a decaying fish sample. Unique clones were sequenced and the sequences compared with those of its closest relative, C. tetani E88. A total of 275 clones were obtained and 32,457 bases of non-redundant sequence were generated. A total of 150 base changes were observed over the entire length of sequence obtained, including additions, deletions and base substitutions. Of the total 120 ORFs detected, 48 exhibited closest similarity to E88 proteins, of which three are hypothetical proteins. Eight of the ORFs exhibited similarity with hypothetical proteins from other organisms and 10 aligned with other proteins from unrelated organisms. There is an overall conservation of protein sequences among the two strains of C. tetani. Selected ORFs involved in cellular processes and metabolism were subjected to phylogenetic analysis.

  18. Identification and classification of structural soil conservation measures based on very high resolution stereo satellite data.

    PubMed

    Eckert, Sandra; Tesfay Ghebremicael, Selamawit; Hurni, Hans; Kohler, Thomas

    2017-05-15

    Land degradation affects large areas of land around the globe, with grave consequences for those living off the land. Major efforts are being made to implement soil and water conservation measures that counteract soil erosion and help secure vital ecosystem services. However, where and to what extent such measures have been implemented is often not well documented. Knowledge about this could help to identify areas where soil and water conservation measures are successfully supporting sustainable land management, as well as areas requiring urgent rehabilitation of conservation structures such as terraces and bunds. This study explores the potential of the latest satellite-based remote sensing technology for use in assessing and monitoring the extent of existing soil and water conservation structures. We used a set of very high resolution stereo Geoeye-1 satellite data, from which we derived a detailed digital surface model as well as a set of other spectral, terrain, texture, and filtered information layers. We developed and applied an object-based classification approach, working on two segmentation levels. On the coarser level, the aim was to delimit certain landscape zones. Information about these landscape zones is useful in distinguishing different types of soil and water conservation structures, as each zone contains certain specific types of structures. On the finer level, the goal was to extract and identify different types of linear soil and water conservation structures. The classification rules were based mainly on spectral, textural, shape, and topographic properties, and included object relationships. This approach enabled us to identify and separate from other classes the majority (78.5%) of terraces and bunds, as well as most hillside terraces (81.25%). Omission and commission errors are similar to those obtained by the few existing studies focusing on the same research objective but using different types of remotely sensed data. 
Based on our results

  19. Choristoneura fumiferana Granulovirus p74 protein, a highly conserved baculoviral envelope protein.

    PubMed

    Rashidan, Kianoush Khajeh; Nassoury, Nasha; Tazi, Samia; Giannopoulos, Paresa N; Guertin, Claude

    2003-09-30

    A gene that encodes a homologue to baculoviral p74, an envelope-associated viral structural protein, has been identified and sequenced on the genome of Choristoneura fumiferana granulovirus (ChfuGV). A part of the ChfuGV p74 gene was located on an 8.9 kb BamHI subgenomic fragment using different sets of degenerate primers. These were designed using the results of the protein sequencing of a major 74 kDa structural protein that is associated with the occlusion-derived virus (ODV). The gene has a 1992 nucleotide (nt) open-reading frame (ORF) that encodes a protein with 663 amino acids with a predicted molecular mass of 74,812 Da. Comparative studies revealed the presence of two major conserved regions in the ChfuGV p74 protein. This study also shows that all of the p74 proteins contain two putative transmembrane domains at their C-terminal segments. At the nucleotide sequence level, two late promoter motifs (TAAG and GTAAG) were located upstream of the first ATG of the p74 gene. The gene contained a canonical poly(A) signal, AATAAA, at its 3′ non-translated region. A phylogenetic tree for baculoviral p74 was constructed using a maximum parsimony analysis. The phylogenetic estimation demonstrated that ChfuGV p74 is most closely related to those of Cydia pomonella granulovirus (CpGV) and Phthorimaea operculella granulovirus (PhopGV).
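
    The reported ORF and protein lengths are consistent with each other if the 1992 nt figure includes the stop codon: 1992 / 3 = 664 codons, and 664 - 1 = 663 amino acids. The sketch below checks that arithmetic and shows a simple exact-match scan for the motifs named in the abstract (TAAG, GTAAG, AATAAA). The function names and the toy sequence are hypothetical; the toy sequence is not ChfuGV data.

```python
def protein_length_from_orf(orf_nt):
    """Amino acid count for an ORF whose length includes the stop codon.

    ASSUMPTION: the quoted ORF length counts the terminal stop codon,
    which is the convention under which 1992 nt gives 663 residues.
    """
    if orf_nt % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    return orf_nt // 3 - 1  # the stop codon encodes no residue


def find_motif(seq, motif):
    """Return 0-based start positions of every (possibly overlapping) match."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        start = seq.find(motif, start + 1)
    return positions


# Hypothetical toy sequence for illustration only (not viral sequence data):
# a late promoter motif, then an ATG, then a poly(A) signal.
TOY_SEQUENCE = "CCGTAAGTTTATGAAACCCAATAAAGG"

if __name__ == "__main__":
    print(protein_length_from_orf(1992))       # 663
    print(find_motif(TOY_SEQUENCE, "GTAAG"))    # [2]
    print(find_motif(TOY_SEQUENCE, "AATAAA"))   # [19]
```

    An exact-match scan like this only locates candidate motifs; whether a given TAAG site acts as a late promoter is a separate experimental question.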

  20. Development of a protein-ligand-binding site prediction method based on interaction energy and sequence conservation.

    PubMed

    Tsujikawa, Hiroto; Sato, Kenta; Wei, Cao; Saad, Gul; Sumikoshi, Kazuya; Nakamura, Shugo; Terada, Tohru; Shimizu, Kentaro

    2016-09-01

    We present a new method for predicting protein-ligand-binding sites based on protein three-dimensional structure and amino acid conservation. This method involves calculation of the van der Waals interaction energy between a protein and many probes placed on the protein surface and subsequent clustering of the probes with low interaction energies to identify the most energetically favorable locus. In addition, it uses amino acid conservation among homologous proteins. Ligand-binding sites were predicted by combining the interaction energy and the amino acid conservation score. The performance of our prediction method was evaluated using a non-redundant dataset of 348 ligand-bound and ligand-unbound protein structure pairs, constructed by filtering entries in a ligand-binding site structure database, LigASite. Ligand-bound structure prediction (bound prediction) indicated that 74.0 % of predicted ligand-binding sites overlapped with real ligand-binding sites by over 25 % of their volume. Ligand-unbound structure prediction (unbound prediction) indicated that 73.9 % of predicted ligand-binding residues overlapped with real ligand-binding residues. The amino acid conservation score improved the average prediction accuracy by 17.0 and 17.6 points for the bound and unbound predictions, respectively. These results demonstrate the effectiveness of the combined use of the interaction energy and amino acid conservation in the ligand-binding site prediction.

  1. Ice-binding site of snow mold fungus antifreeze protein deviates from structural regularity and high conservation.

    PubMed

    Kondo, Hidemasa; Hanada, Yuichi; Sugimoto, Hiroshi; Hoshino, Tamotsu; Garnham, Christopher P; Davies, Peter L; Tsuda, Sakae

    2012-06-12

    Antifreeze proteins (AFPs) are found in organisms ranging from fish to bacteria, where they serve different functions to facilitate survival of their host. AFPs that protect freeze-intolerant fish and insects from internal ice growth bind to ice using a regular array of well-conserved residues/motifs. Less is known about the role of AFPs in freeze-tolerant species, which might be to beneficially alter the structure of ice in or around the host. Here we report the 0.95-Å high-resolution crystal structure of a 223-residue secreted AFP from the snow mold fungus Typhula ishikariensis. Its main structural element is an irregular β-helix with six loops of 18 or more residues that lies alongside an α-helix. β-Helices have independently evolved as AFPs on several occasions and seem ideally structured to bind to several planes of ice, including the basal plane. A novelty of the β-helical fold is the nonsequential arrangement of loops that places the N- and C termini inside the solenoid of β-helical coils. The ice-binding site (IBS), which could not be predicted from sequence or structure, was located by site-directed mutagenesis to the flattest surface of the protein. It is remarkable for its lack of regularity and its poor conservation in homologs from psychrophilic diatoms and bacteria and other fungi.

  2. Camera Trapping: A Contemporary Approach to Monitoring Invasive Rodents in High Conservation Priority Ecosystems

    PubMed Central

    Rendall, Anthony R.; Sutherland, Duncan R.; Cooke, Raylene; White, John

    2014-01-01

    Invasive rodent species have established on 80% of the world's islands causing significant damage to island environments. Insular ecosystems support proportionally more biodiversity than comparative mainland areas, highlighting them as critical for global biodiversity conservation. Few techniques currently exist to adequately detect, with high confidence, species that are trap-adverse such as the black rat, Rattus rattus, in high conservation priority areas where multiple non-target species persist. This study investigates the effectiveness of camera trapping for monitoring invasive rodents in high conservation areas, and the influence of habitat features and density of colonial-nesting seabirds on rodent relative activity levels to provide insights into their potential impacts. A total of 276 camera sites were established and left in situ for 8 days. Identified species were recorded in discrete 15 min intervals, referred to as ‘events’. In total, 19 804 events were recorded. From these, 31 species were identified comprising 25 native species and six introduced. Two introduced rodent species were detected: the black rat (90% of sites), and house mouse Mus musculus (56% of sites). Rodent activity of both black rats and house mice was positively associated with the structural density of habitats. Density of seabird burrows was not strongly associated with relative activity levels of rodents, yet rodents were still present in these areas. Camera trapping enabled a large number of rodents to be detected with confidence in site-specific absences and high resolution to quantify relative activity levels. This method enables detection of multiple species simultaneously with low impact (for both target and non-target individuals); an ideal strategy for monitoring trap-adverse invasive rodents in high conservation areas. PMID:24599307
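
    The event definition above (species recorded in discrete 15 min intervals) can be sketched as a binning step. The abstract does not say how intervals were aligned or whether repeat detections within an interval were merged, so the sketch below assumes intervals aligned to the clock (00:00, 00:15, ...) and at most one event per site, species and interval. The function name and the input layout are hypothetical.

```python
from datetime import datetime


def count_events(detections, interval_minutes=15):
    """Count distinct (site, species, interval) combinations.

    detections: iterable of (site_id, species, datetime) tuples.
    ASSUMPTION: intervals are aligned to midnight, and several images of the
    same species at the same site within one interval count as one event.
    """
    events = set()
    for site_id, species, timestamp in detections:
        minutes_since_midnight = timestamp.hour * 60 + timestamp.minute
        interval_index = minutes_since_midnight // interval_minutes
        events.add((site_id, species, timestamp.date(), interval_index))
    return len(events)


if __name__ == "__main__":
    # Hypothetical detections for illustration only.
    sample = [
        ("site1", "Rattus rattus", datetime(2013, 1, 5, 22, 0)),
        ("site1", "Rattus rattus", datetime(2013, 1, 5, 22, 7)),   # same interval
        ("site1", "Rattus rattus", datetime(2013, 1, 5, 22, 16)),  # next interval
        ("site1", "Mus musculus", datetime(2013, 1, 5, 22, 5)),
        ("site2", "Rattus rattus", datetime(2013, 1, 5, 22, 5)),
    ]
    print(count_events(sample))  # 4
```

    Counting events per site and species in this way gives a relative activity index rather than an abundance estimate, which matches how the abstract uses the term.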

  3. Assessment of the effects of farming and conservation programs on pesticide deposition in high plains wetlands.

    PubMed

    Belden, Jason B; Hanson, Brittany Rae; McMurry, Scott T; Smith, Loren M; Haukos, David A

    2012-03-20

    We examined pesticide contamination in sediments from depressional playa wetlands embedded in the three dominant land-use types in the western High Plains and Rainwater Basin of the United States including cropland, perennial grassland enrolled in conservation programs (e.g., Conservation Reserve Program [CRP]), and native grassland or reference condition. Two hundred and sixty four playas, selected from the three land-use types, were sampled from Nebraska and Colorado in the north to Texas and New Mexico in the south. Sediments were examined for most of the commonly used agricultural pesticides. Atrazine, acetochlor, metolachlor, and trifluralin were the most commonly detected pesticides in the northern High Plains and Rainwater Basin. Atrazine, metolachlor, trifluralin, and pendimethalin were the most commonly detected pesticides in the southern High Plains. The top 5-10% of playas contained herbicide concentrations that are high enough to pose a hazard for plants. However, insecticides and fungicides were rarely detected. Pesticide occurrence and concentrations were higher in wetlands surrounded by cropland as compared to native grassland and CRP perennial grasses. The CRP, which is the largest conservation program in the U.S., was protective and had lower pesticide concentrations compared to cropland.

  4. A complete Neandertal mitochondrial genome sequence determined by high-throughput sequencing

    PubMed Central

    Green, Richard E.; Malaspinas, Anna-Sapfo; Krause, Johannes; Briggs, Adrian W.; Johnson, Philip L. F.; Uhler, Caroline; Meyer, Matthias; Good, Jeffrey M.; Maricic, Tomislav; Stenzel, Udo; Prüfer, Kay; Siebauer, Michael; Burbano, Hernán A.; Ronan, Michael; Rothberg, Jonathan M.; Egholm, Michael; Rudan, Pavao; Brajković, Dejana; Kućan, Željko; Gušić, Ivan; Wikström, Mårten; Laakkonen, Liisa; Kelso, Janet; Slatkin, Montgomery; Pääbo, Svante

    2008-01-01

    Summary A complete mitochondrial (mt) genome sequence was reconstructed from a 38,000-year-old Neandertal individual using 8,341 mtDNA sequences identified among 4.8 Gb of DNA generated from ~0.3 grams of bone. Analysis of the assembled sequence unequivocally establishes that the Neandertal mtDNA falls outside the variation of extant human mtDNAs and allows an estimate of the divergence date between the two mtDNA lineages of 660,000±140,000 years. Of the 13 proteins encoded in the mtDNA, subunit 2 of cytochrome c oxidase of the mitochondrial electron transport chain has experienced the largest number of amino acid substitutions in human ancestors since the separation from Neandertals. There is evidence that purifying selection in the Neandertal mtDNA was reduced compared to other primate lineages suggesting that the effective population size of Neandertals was small. PMID:18692465

  5. Evaluation of a Pooled Strategy for High-Throughput Sequencing of Cosmid Clones from Metagenomic Libraries

    PubMed Central

    Lam, Kathy N.; Hall, Michael W.; Engel, Katja; Vey, Gregory; Cheng, Jiujun; Neufeld, Josh D.; Charles, Trevor C.

    2014-01-01

    High-throughput sequencing methods have been instrumental in the growing field of metagenomics, with technological improvements enabling greater throughput at decreased costs. Nonetheless, the economy of high-throughput sequencing cannot be fully leveraged in the subdiscipline of functional metagenomics. In this area of research, environmental DNA is typically cloned to generate large-insert libraries from which individual clones are isolated, based on specific activities of interest. Sequence data are required for complete characterization of such clones, but the sequencing of a large set of clones requires individual barcode-based sample preparation; this can become costly, as the cost of clone barcoding scales linearly with the number of clones processed, and thus sequencing a large number of metagenomic clones often remains cost-prohibitive. We investigated a hybrid Sanger/Illumina pooled sequencing strategy that omits barcoding altogether, and we evaluated this strategy by comparing the pooled sequencing results to reference sequence data obtained from traditional barcode-based sequencing of the same set of clones. Using identity and coverage metrics in our evaluation, we show that pooled sequencing can generate high-quality sequence data, without producing problematic chimeras. Though caveats of a pooled strategy exist and further optimization of the method is required to improve recovery of complete clone sequences and to avoid circumstances that generate unrecoverable clone sequences, our results demonstrate that pooled sequencing represents an effective and low-cost alternative for sequencing large sets of metagenomic clones. PMID:24911009

  6. Evaluation of a pooled strategy for high-throughput sequencing of cosmid clones from metagenomic libraries.

    PubMed

    Lam, Kathy N; Hall, Michael W; Engel, Katja; Vey, Gregory; Cheng, Jiujun; Neufeld, Josh D; Charles, Trevor C

    2014-01-01

    High-throughput sequencing methods have been instrumental in the growing field of metagenomics, with technological improvements enabling greater throughput at decreased costs. Nonetheless, the economy of high-throughput sequencing cannot be fully leveraged in the subdiscipline of functional metagenomics. In this area of research, environmental DNA is typically cloned to generate large-insert libraries from which individual clones are isolated, based on specific activities of interest. Sequence data are required for complete characterization of such clones, but the sequencing of a large set of clones requires individual barcode-based sample preparation; this can become costly, as the cost of clone barcoding scales linearly with the number of clones processed, and thus sequencing a large number of metagenomic clones often remains cost-prohibitive. We investigated a hybrid Sanger/Illumina pooled sequencing strategy that omits barcoding altogether, and we evaluated this strategy by comparing the pooled sequencing results to reference sequence data obtained from traditional barcode-based sequencing of the same set of clones. Using identity and coverage metrics in our evaluation, we show that pooled sequencing can generate high-quality sequence data, without producing problematic chimeras. Though caveats of a pooled strategy exist and further optimization of the method is required to improve recovery of complete clone sequences and to avoid circumstances that generate unrecoverable clone sequences, our results demonstrate that pooled sequencing represents an effective and low-cost alternative for sequencing large sets of metagenomic clones.

  7. Communicating the Benefits of a Full Sequence of High School Science Courses

    ERIC Educational Resources Information Center

    Nicholas, Catherine Marie

    2014-01-01

    High school students are generally uninformed about the benefits of enrolling in a full sequence of science courses, therefore only about a third of our nation's high school graduates have completed the science sequence of Biology, Chemistry and Physics. The lack of students completing a full sequence of science courses contributes to the deficit…

  8. Targeted carbon conservation at national scales with high-resolution monitoring

    PubMed Central

    Asner, Gregory P.; Knapp, David E.; Martin, Roberta E.; Tupayachi, Raul; Anderson, Christopher B.; Mascaro, Joseph; Sinca, Felipe; Chadwick, K. Dana; Higgins, Mark; Farfan, William; Llactayo, William; Silman, Miles R.

    2014-01-01

    Terrestrial carbon conservation can provide critical environmental, social, and climate benefits. Yet, the geographically complex mosaic of threats to, and opportunities for, conserving carbon in landscapes remain largely unresolved at national scales. Using a new high-resolution carbon mapping approach applied to Perú, a megadiverse country undergoing rapid land use change, we found that at least 0.8 Pg of aboveground carbon stocks are at imminent risk of emission from land use activities. Map-based information on the natural controls over carbon density, as well as current ecosystem threats and protections, revealed three biogeographically explicit strategies that fully offset forthcoming land-use emissions. High-resolution carbon mapping affords targeted interventions to reduce greenhouse gas emissions in rapidly developing tropical nations. PMID:25385593

  9. Targeted carbon conservation at national scales with high-resolution monitoring.

    PubMed

    Asner, Gregory P; Knapp, David E; Martin, Roberta E; Tupayachi, Raul; Anderson, Christopher B; Mascaro, Joseph; Sinca, Felipe; Chadwick, K Dana; Higgins, Mark; Farfan, William; Llactayo, William; Silman, Miles R

    2014-11-25

    Terrestrial carbon conservation can provide critical environmental, social, and climate benefits. Yet, the geographically complex mosaic of threats to, and opportunities for, conserving carbon in landscapes remain largely unresolved at national scales. Using a new high-resolution carbon mapping approach applied to Perú, a megadiverse country undergoing rapid land use change, we found that at least 0.8 Pg of aboveground carbon stocks are at imminent risk of emission from land use activities. Map-based information on the natural controls over carbon density, as well as current ecosystem threats and protections, revealed three biogeographically explicit strategies that fully offset forthcoming land-use emissions. High-resolution carbon mapping affords targeted interventions to reduce greenhouse gas emissions in rapidly developing tropical nations.

  10. Taxonomic distinctness and conservation of a new high biodiversity subterranean area in Brazil.

    PubMed

    Gallão, Jonas E; Bichuette, Maria Elina

    2015-03-01

    Subterranean environments, even though they do not possess a primary production (photosynthesis), may present high biodiversity, faunistic originality, endemism, phylogenetic isolations and unique ecological and/or evolution events, in addition to rare taxa. Studies investigating the biological diversity in Neotropical caves are relatively rare and recent, and most of them have been conducted in Brazil. We sampled caves from the state of Bahia, northeastern Brazil, and through sampling sufficiency tests and richness estimators, we demonstrate that the regulation under the Brazilian cave laws is not adequate for their conservation and that the α diversity index alone is not enough to verify faunistic patterns. We suggest that a phylogenetic diversity index would be more robust and accurate for conservation purposes, particularly the Taxonomic Distinctness index. Moreover, we propose that the sandstone complex caves from Chapada Diamantina National Park need to be classified as being of high subterranean biodiversity in a global scope.
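
    The abstract names the Taxonomic Distinctness index but gives no formula. A minimal sketch of the average form, commonly written as Δ+ = [Σ_{i<j} ω_ij] / [s(s − 1)/2] for s species, is given below. The unit step weights, the three-rank lineages and the function name are assumptions for illustration and may differ from the weighting used in the study.

```python
from itertools import combinations


def average_taxonomic_distinctness(lineages):
    """Mean pairwise taxonomic path weight over a list of species.

    Each lineage is a tuple ordered from highest to lowest rank,
    e.g. (family, genus, species). The weight between two species is the
    number of ranks, counted from the bottom, at which they differ
    (unit step lengths; an assumption of this sketch).
    """
    lineages = list(lineages)
    if len(lineages) < 2:
        raise ValueError("need at least two species")

    def weight(a, b):
        for depth, (x, y) in enumerate(zip(a, b)):
            if x != y:
                return len(a) - depth
        return 0  # identical lineages

    pairs = list(combinations(lineages, 2))
    return sum(weight(a, b) for a, b in pairs) / len(pairs)
```

    With this weighting, two congeneric species contribute 1, two confamilial species in different genera contribute 2, and species in different families contribute 3.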

  11. High resolution MICA genotyping by sequence-based typing (SBT).

    PubMed

    Zou, Yizhou; Stastny, Peter

    2012-01-01

    We have developed a MICA typing method based on polymerase chain reaction (PCR) sequence-based typing and a computer program that determines the polymorphisms and distinguishes the GCT repeats in exon 5. One PCR amplification was performed to obtain templates of 2.2 kb, including exons 2, 3, 4, and 5 of MICA to be sequenced with two forward and two reverse primers. Overlay of nucleotide sequencing signals resulting from presence of different GCT repeats in exon 5 from two different MICA alleles can be identified by a computer program that analyses the combined signal string containing the 35 bases.
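
    As a rough illustration of the simplest part of this task, the sketch below counts the longest uninterrupted run of GCT triplets in a single, already base-called sequence. It does not attempt the overlaid two-allele signal analysis described in the abstract, and the function name and example sequence are hypothetical.

```python
import re


def longest_gct_run(seq: str) -> int:
    """Return the number of GCT units in the longest uninterrupted (GCT)n run."""
    runs = re.findall(r"(?:GCT)+", seq.upper())
    return max((len(run) // 3 for run in runs), default=0)
```

    Called on a made-up sequence such as `"AA" + "GCT" * 4 + "TT" + "GCT" * 2 + "AA"`, the function reports the longer of the two runs.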

  12. Conservative high-order-accurate finite-difference methods for curvilinear grids

    NASA Technical Reports Server (NTRS)

    Rai, Man M.; Chakravarthy, Sukumar

    1993-01-01

    Two fourth-order-accurate finite-difference methods for numerically solving hyperbolic systems of conservation equations on smooth curvilinear grids are presented. The first method uses the differential form of the conservation equations; the second method uses the integral form of the conservation equations. Modifications to these schemes, which are required near boundaries to maintain overall high-order accuracy, are discussed. An analysis that demonstrates the stability of the modified schemes is also provided. Modifications to one of the schemes to make it total variation diminishing (TVD) are also discussed. Results that demonstrate the high-order accuracy of both schemes are included in the paper. In particular, a Ringleb-flow computation demonstrates the high-order accuracy and the stability of the boundary and near-boundary procedures. A second computation of supersonic flow over a cylinder demonstrates the shock-capturing capability of the TVD methodology. An important contribution of this paper is the clear demonstration that higher order accuracy leads to increased computational efficiency.
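
    To make "fourth-order-accurate" concrete, the sketch below uses the standard five-point central-difference stencil for a first derivative on a uniform 1-D grid and checks that halving the spacing cuts the error by roughly 2^4 = 16. This is a generic textbook stencil chosen for illustration; it is not the conservative curvilinear-grid scheme of the paper, and the helper names are assumptions.

```python
import math


def d1_fourth_order(f, x, h):
    """Fourth-order central-difference approximation of f'(x) with spacing h."""
    return (f(x - 2 * h) - 8 * f(x - h) + 8 * f(x + h) - f(x + 2 * h)) / (12 * h)


def error_ratio(f, df, x, h):
    """Ratio of the absolute errors at spacing h and h/2 (about 16 for a fourth-order stencil)."""
    e_coarse = abs(d1_fourth_order(f, x, h) - df(x))
    e_fine = abs(d1_fourth_order(f, x, h / 2) - df(x))
    return e_coarse / e_fine
```

    A call such as `error_ratio(math.sin, math.cos, 1.0, 0.1)` gives a value close to 16, whereas a second-order stencil would give a value close to 4 for the same refinement.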

  13. MicroRNA expression analysis of rosette and folding leaves in Chinese cabbage using high-throughput Solexa sequencing.

    PubMed

    Wang, Fengde; Li, Huayin; Zhang, Yihui; Li, Jingjuan; Li, Libin; Liu, Lifeng; Wang, Lihua; Wang, Cuihua; Gao, Jianwei

    2013-12-15

    In this study, a global analysis of miRNA expression from rosette leaves (RLs) and folding leaves (FLs) of Chinese cabbage (Brassica rapa L. ssp. pekinensis) was conducted using high-throughput Solexa sequencing. In total, over 12 million clean reads were obtained from each library. Sequence analysis identified 64 conserved miRNA families in each leaf type and 104 and 95 novel miRNAs from RLs and FLs, respectively. Among these, 61 conserved miRNAs and 61 novel miRNAs were detected in both types of leaves. Furthermore, six conserved and 21 novel miRNAs were differentially expressed between the two libraries. Target gene annotation suggested that these differentially expressed miRNAs targeted transcription factors, F-box proteins, auxin and Ca(2+) signaling pathway proteins, protein kinases and other proteins that may function in governing leafy head formation. This study advanced our understanding of the important roles of miRNAs in regulating leafy head development in Chinese cabbage.

  14. Highly Informative Simple Sequence Repeat (SSR) Markers for Fingerprinting Hazelnut

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Simple sequence repeat (SSR) or microsatellite markers have many applications in breeding and genetic studies of plants, including fingerprinting of cultivars and investigations of genetic diversity, and therefore provide information for better management of germplasm collections. They are repeatab...

  15. Identification and analysis of a highly conserved chemotaxis gene cluster in Shewanella species.

    SciTech Connect

    Li, J.; Romine, Margaret F.; Ward, M.

    2007-08-01

    A conserved cluster of chemotaxis genes was identified from the genome sequences of fifteen Shewanella species. An in-frame deletion of the cheA-3 gene, which is located in this cluster, was created in S. oneidensis MR-1 and the gene shown to be essential for chemotactic responses to anaerobic electron acceptors. The CheA-3 protein showed strong similarity to Vibrio cholerae CheA-2 and P. aeruginosa CheA-1, two proteins that are also essential for chemotaxis. The genes encoding these proteins were shown to be located in chemotaxis gene clusters closely related to the cheA-3-containing cluster in Shewanella species. The results of this study suggest that a combination of gene neighborhood and homology analyses may be used to predict which cheA genes are essential for chemotaxis in groups of closely related microorganisms.

  16. Nmf9 Encodes a Highly Conserved Protein Important to Neurological Function in Mice and Flies

    PubMed Central

    Zhang, Shuxiao; Ross, Kevin D.; Seidner, Glen A.; Gorman, Michael R.; Poon, Tiffany H.; Wang, Xiaobo; Keithley, Elizabeth M.; Lee, Patricia N.; Martindale, Mark Q.; Joiner, William J.; Hamilton, Bruce A.

    2015-01-01

    Many protein-coding genes identified by genome sequencing remain without functional annotation or biological context. Here we define a novel protein-coding gene, Nmf9, based on a forward genetic screen for neurological function. ENU-induced and genome-edited null mutations in mice produce deficits in vestibular function, fear learning and circadian behavior, which correlated with Nmf9 expression in inner ear, amygdala, and suprachiasmatic nuclei. Homologous genes from unicellular organisms and invertebrate animals predict interactions with small GTPases, but the corresponding domains are absent in mammalian Nmf9. Intriguingly, homozygotes for null mutations in the Drosophila homolog, CG45058, show profound locomotor defects and premature death, while heterozygotes show striking effects on sleep and activity phenotypes. These results link a novel gene orthology group to discrete neurological functions, and show conserved requirement across wide phylogenetic distance and domain level structural changes. PMID:26131556

  17. HybPiper: Extracting coding sequence and introns for phylogenetics from high-throughput sequencing reads using target enrichment

    PubMed Central

    Johnson, Matthew G.; Gardner, Elliot M.; Liu, Yang; Medina, Rafael; Goffinet, Bernard; Shaw, A. Jonathan; Zerega, Nyree J. C.; Wickett, Norman J.

    2016-01-01

    Premise of the study: Using sequence data generated via target enrichment for phylogenetics requires reassembly of high-throughput sequence reads into loci, presenting a number of bioinformatics challenges. We developed HybPiper as a user-friendly platform for assembly of gene regions, extraction of exon and intron sequences, and identification of paralogous gene copies. We test HybPiper using baits designed to target 333 phylogenetic markers and 125 genes of functional significance in Artocarpus (Moraceae). Methods and Results: HybPiper implements parallel execution of sequence assembly in three phases: read mapping, contig assembly, and target sequence extraction. The pipeline was able to recover nearly complete gene sequences for all genes in 22 species of Artocarpus. HybPiper also recovered more than 500 bp of nontargeted intron sequence in over half of the phylogenetic markers and identified paralogous gene copies in Artocarpus. Conclusions: HybPiper was designed for Linux and Mac OS X and is freely available at https://github.com/mossmatters/HybPiper. PMID:27437175

  18. Regulation of DNA replication at the end of the mitochondrial D-loop involves the helicase TWINKLE and a conserved sequence element

    PubMed Central

    Jemt, Elisabeth; Persson, Örjan; Shi, Yonghong; Mehmedovic, Majda; Uhler, Jay P.; Dávila López, Marcela; Freyer, Christoph; Gustafsson, Claes M.; Samuelsson, Tore; Falkenberg, Maria

    2015-01-01

    The majority of mitochondrial DNA replication events are terminated prematurely. The nascent DNA remains stably associated with the template, forming a triple-stranded displacement loop (D-loop) structure. However, the function of the D-loop region of the mitochondrial genome remains poorly understood. Using a comparative genomics approach we here identify two closely related 15 nt sequence motifs of the D-loop, strongly conserved among vertebrates. One motif is at the D-loop 5′-end and is part of the conserved sequence block 1 (CSB1). The other motif, here denoted coreTAS, is at the D-loop 3′-end. Both these sequences may prevent transcription across the D-loop region, since light and heavy strand transcription is terminated at CSB1 and coreTAS, respectively. Interestingly, the replication of the nascent D-loop strand, occurring in a direction opposite to that of heavy strand transcription, is also terminated at coreTAS, suggesting that coreTAS is involved in termination of both transcription and replication. Finally, we demonstrate that the loading of the helicase TWINKLE at coreTAS is reversible, implying that this site is a crucial component of a switch between D-loop formation and full-length mitochondrial DNA replication. PMID:26253742

  19. Combining Natural Sequence Variation with High Throughput Mutational Data to Reveal Protein Interaction Sites

    PubMed Central

    Melamed, Daniel; Young, David L.; Miller, Christina R.; Fields, Stanley

    2015-01-01

    Many protein interactions are conserved among organisms despite changes in the amino acid sequences that comprise their contact sites, a property that has been used to infer the location of these sites from protein homology. In an inter-species complementation experiment, a sequence present in a homologue is substituted into a protein and tested for its ability to support function. Therefore, substitutions that inhibit function can identify interaction sites that changed over evolution. However, most of the sequence differences within a protein family remain unexplored because of the small-scale nature of these complementation approaches. Here we use existing high throughput mutational data on the in vivo function of the RRM2 domain of the Saccharomyces cerevisiae poly(A)-binding protein, Pab1, to analyze its sites of interaction. Of 197 single amino acid differences in 52 Pab1 homologues, 17 reduce the function of Pab1 when substituted into the yeast protein. The majority of these deleterious mutations interfere with the binding of the RRM2 domain to eIF4G1 and eIF4G2, isoforms of a translation initiation factor. A large-scale mutational analysis of the RRM2 domain in a two-hybrid assay for eIF4G1 binding supports these findings and identifies peripheral residues that make a smaller contribution to eIF4G1 binding. Three single amino acid substitutions in yeast Pab1 corresponding to residues from the human orthologue are deleterious and eliminate binding to the yeast eIF4G isoforms. We create a triple mutant that carries these substitutions and other humanizing substitutions that collectively support a switch in binding specificity of RRM2 from the yeast eIF4G1 to its human orthologue. Finally, we map other deleterious substitutions in Pab1 to inter-domain (RRM2–RRM1) or protein-RNA (RRM2–poly(A)) interaction sites. Thus, the combined approach of large-scale mutational data and evolutionary conservation can be used to characterize interaction sites at single

  20. Construction of a high-density genetic map for grape using next generation restriction-site associated DNA sequencing

    PubMed Central

    2012-01-01

    Background Genetic mapping and QTL detection are powerful methodologies in plant improvement and breeding. Construction of a high-density and high-quality genetic map would be of great benefit in the production of superior grapes to meet human demand. High throughput and low cost of the recently developed next generation sequencing (NGS) technology have resulted in its wide application in genome research. Sequencing restriction-site associated DNA (RAD) might be an efficient strategy to simplify genotyping. Combining NGS with RAD has proven to be powerful for single nucleotide polymorphism (SNP) marker development. Results An F1 population of 100 individual plants was developed. In-silico digestion-site prediction was used to select an appropriate restriction enzyme for construction of a RAD sequencing library. Next generation RAD sequencing was applied to genotype the F1 population and its parents. Applying a cluster strategy for SNP modulation, a total of 1,814 high-quality SNP markers were developed: 1,121 of these were mapped to the female genetic map, 759 to the male map, and 1,646 to the integrated map. A comparison of the genetic maps to the published Vitis vinifera genome revealed both conservation and variations. Conclusions The applicability of next generation RAD sequencing for genotyping a grape F1 population was demonstrated, leading to the successful development of a genetic map with high density and quality using our designed SNP markers. Detailed analysis revealed that this newly developed genetic map can be used for a variety of genome investigations, such as QTL detection, sequence assembly and genome comparison. PMID:22908993
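
    The in-silico digestion-site prediction step can be sketched as counting recognition sites for candidate enzymes in a reference sequence. The abstract does not say which enzymes were compared or how sites were scored, so the enzyme panel (with standard recognition sequences) and the function name below are assumptions for illustration only.

```python
import re

# Standard recognition sequences for a few common restriction enzymes.
# All four are palindromic, so scanning one strand is sufficient.
ENZYMES = {
    "EcoRI": "GAATTC",
    "HindIII": "AAGCTT",
    "BamHI": "GGATCC",
    "MseI": "TTAA",
}


def count_sites(seq, enzymes=ENZYMES):
    """Return {enzyme name: number of recognition sites found in seq}."""
    seq = seq.upper()
    # A zero-width lookahead also counts overlapping occurrences.
    return {
        name: len(re.findall(f"(?={site})", seq))
        for name, site in enzymes.items()
    }
```

    In practice the per-enzyme counts (or the implied fragment-size distribution) over the whole genome would guide the choice of enzyme; the toy function only reports raw site counts.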

  1. Proteome-wide mapping of the Drosophila acetylome demonstrates a high degree of conservation of lysine acetylation.

    PubMed

    Weinert, Brian T; Wagner, Sebastian A; Horn, Heiko; Henriksen, Peter; Liu, Wenshe R; Olsen, Jesper V; Jensen, Lars J; Choudhary, Chunaram

    2011-07-26

    Posttranslational modification of proteins by acetylation and phosphorylation regulates most cellular processes in living organisms. Surprisingly, the evolutionary conservation of phosphorylated serine and threonine residues is only marginally higher than that of unmodified serines and threonines. With high-resolution mass spectrometry, we identified 1981 lysine acetylation sites in the proteome of Drosophila melanogaster. We used data sets of experimentally identified acetylation and phosphorylation sites in Drosophila and humans to analyze the evolutionary conservation of these modification sites between flies and humans. Site-level conservation analysis revealed that acetylation sites are highly conserved, significantly more so than phosphorylation sites. Furthermore, comparison of lysine conservation in Drosophila and humans with that in nematodes and zebrafish revealed that acetylated lysines were significantly more conserved than were nonacetylated lysines. Bioinformatics analysis using Gene Ontology terms suggested that the proteins with conserved acetylation control cellular processes such as protein translation, protein folding, DNA packaging, and mitochondrial metabolism. We found that acetylation of ubiquitin-conjugating E2 enzymes was evolutionarily conserved, and mutation of a conserved acetylation site impaired the function of the human E2 enzyme UBE2D3. This systems-level analysis of comparative posttranslational modification showed that acetylation is an anciently conserved modification and suggests that phosphorylation sites may have evolved faster than acetylation sites.

  2. The conserved lymphokine element-0 in the IL5 promoter binds to a high mobility group-1 protein.

    PubMed

    Marrugo, J; Marsh, D G; Ghosh, B

    1996-10-01

    The conserved lymphokine element-0 (CLE0) in the IL5 promoter is essential for the expression of IL-5. Here, we report the cloning and expression of a cDNA encoding a novel CLE0-binding protein, CLEBP-1, from a mouse Th2 clone, D10.G4.1. Interestingly, it was found that the CLEBP-1 cDNA sequence was almost identical to the sequences of known high mobility group-1 (HMG1) cDNAs. When expressed as a recombinant fusion protein in Escherichia coli, CLEBP-1 was shown to bind to the IL5-CLE0 element in electrophoretic mobility-shift assays (EMSA) and southwestern blot analysis. The CLEBP-1 fusion protein cross-reacts with anti-HMG-1/2 in Western blot analysis. It also binds to the CLE0 elements of IL4, GMCSF and GCSF genes. CLEBP-1 and closely related HMG-1 and HMG-2 proteins may play key roles in facilitating the expression of the lymphokine genes that contain CLE0 elements.

  3. Genetic Determinants of Sindbis Virus Mosquito Infection Are Associated with a Highly Conserved Alphavirus and Flavivirus Envelope Sequence

    PubMed Central

    Pierro, Dennis J.; Powers, Erik L.; Olson, Ken E.

    2008-01-01

    Wild-type Sindbis virus (SINV) strain MRE16 efficiently infects Aedes aegypti midgut epithelial cells (MEC), but laboratory-derived neurovirulent SINV strain TE/5′2J infects MEC poorly. SINV determinants for MEC infection have been localized to the E2 glycoprotein. The E2 amino acid sequences of MRE16 and TE/5′2J differ at 60 residue sites. To identify the genetic determinants of MEC infection of MRE16, the TE/5′2J virus genome was altered to contain either domain chimeras or more focused nucleotide substitutions of MRE16. The growth patterns of derived viruses in cell culture were determined, as were the midgut infection rates (MIR) in A. aegypti mosquitoes. The results showed that substitutions of MRE16 E2 aa 95 to 96 and 116 to 119 into the TE/5′2J virus increased MIR both independently and in combination with each other. In addition, a unique PPF/.GDS amino acid motif was located between these two sites that was found to be a highly conserved sequence among alphaviruses and flaviviruses but not other arboviruses. PMID:18160430

  4. The highly conserved human cytomegalovirus UL136 ORF generates multiple Golgi-localizing protein isoforms through differential translation initiation.

    PubMed

    Liao, Huanan; Lee, Jung-Hyun; Kondo, Rikita; Katata, Marei; Imadome, Ken-Ichi; Miyado, Kenji; Inoue, Naoki; Fujiwara, Shigeyoshi; Nakamura, Hiroyuki

    2014-01-22

    The UL133-UL138 locus in the unique long b' (ULb') region of the human cytomegalovirus (HCMV) genome is considered to play certain roles in viral replication, dissemination and latency in a host cell type-dependent manner. Here we characterized the proteins encoded by UL136, one of the open reading frames (ORFs) in the locus. Comparative sequence analysis of UL136 among clinical isolates and laboratory strains indicates that its predicted amino-acid sequence is highly conserved. A polyclonal antibody against UL136 proteins (pUL136s) was raised against its carboxy-terminal region and this antibody specifically recognized at least five UL136-encoded protein isoforms of 29-17 kDa both in HCMV-infected cells and in cells transfected with a construct expressing pUL136. Immunofluorescence analysis with this antibody revealed localization of pUL136 in the Golgi apparatus. Analysis of several pUL136 mutants indicated that the putative transmembrane domain of pUL136 is required for its Golgi localization. Mutational analysis of multiple AUG codons in UL136 demonstrated that translation initiation from these AUG codons contributes to the generation of pUL136 isoforms.

  5. Exceptional conservation of horse-human gene order on X chromosome revealed by high-resolution radiation hybrid mapping.

    PubMed

    Raudsepp, Terje; Lee, Eun-Joon; Kata, Srinivas R; Brinkmeyer, Candice; Mickelson, James R; Skow, Loren C; Womack, James E; Chowdhary, Bhanu P

    2004-02-24

    Development of a dense map of the horse genome is key to efforts aimed at identifying genes controlling health, reproduction, and performance. We herein report a high-resolution gene map of the horse (Equus caballus) X chromosome (ECAX) generated by developing and typing 116 gene-specific and 12 short tandem repeat markers on the 5,000-rad horse × hamster whole-genome radiation hybrid panel and mapping 29 gene loci by fluorescence in situ hybridization. The human X chromosome sequence was used as a template to select genes at 1-Mb intervals to develop equine orthologs. Coupled with our previous data, the new map comprises a total of 175 markers (139 genes and 36 short tandem repeats, of which 53 are fluorescence in situ hybridization mapped) distributed on average at approximately 880-kb intervals along the chromosome. This is the densest and most uniformly distributed chromosomal map presently available in any mammalian species other than humans and rodents. Comparison of the horse and human X chromosome maps shows remarkable conservation of gene order along the entire span of the chromosomes, including the location of the centromere. An overview of the status of the horse map in relation to mouse, livestock, and companion animal species is also provided. The map will be instrumental for analysis of X-linked health and fertility traits in horses by facilitating identification of targeted chromosomal regions for isolation of polymorphic markers, building bacterial artificial chromosome contigs, or sequencing.

  6. Contamination-controlled high-throughput whole genome sequencing for influenza A viruses using the MiSeq sequencer

    PubMed Central

    Lee, Hong Kai; Lee, Chun Kiat; Tang, Julian Wei-Tze; Loh, Tze Ping; Koay, Evelyn Siew-Chuan

    2016-01-01

    Accurate full-length genomic sequences are important for viral phylogenetic studies. We developed a targeted high-throughput whole genome sequencing (HT-WGS) method for influenza A viruses, which utilized an enzymatic cleavage-based approach, the Nextera XT DNA library preparation kit, for library preparation. The entire library preparation workflow was adapted for the Sentosa SX101, a liquid handling platform, to automate this labor-intensive step. As the enzymatic cleavage-based approach generates low coverage reads at both ends of the cleaved products, we corrected this loss of sequencing coverage at the termini by introducing modified primers during the targeted amplification step to generate full-length influenza A sequences with even coverage across the whole genome. Another challenge of targeted HTS is the risk of specimen-to-specimen cross-contamination during the library preparation step that results in the calling of false-positive minority variants. We included an in-run, negative system control to capture contamination reads that may be generated during the liquid handling procedures. The upper limits of 99.99% prediction intervals of the contamination rate were adopted as cut-off values of contamination reads. Here, 148 influenza A/H3N2 samples were sequenced using the HTS protocol and were compared against a Sanger-based sequencing method. Our data showed that the rate of specimen-to-specimen cross-contamination was highly significant in HTS. PMID:27624998

  7. ICAP-1, a Novel β1 Integrin Cytoplasmic Domain–associated Protein, Binds to a Conserved and Functionally Important NPXY Sequence Motif of β1 Integrin

    PubMed Central

    Chang, David D.; Wong, Carol; Smith, Healy; Liu, Jenny

    1997-01-01

    The cytoplasmic domains of integrins are essential for cell adhesion. We report identification of a novel protein, ICAP-1 (integrin cytoplasmic domain–associated protein-1), which binds to the β1 integrin cytoplasmic domain. The interaction between ICAP-1 and β1 integrins is highly specific, as demonstrated by the lack of interaction between ICAP-1 and the cytoplasmic domains of other β integrins, and requires a conserved and functionally important NPXY sequence motif found in the COOH-terminal region of the β1 integrin cytoplasmic domain. Mutational studies reveal that Asn and Tyr of the NPXY motif and a Val residue located NH2-terminal to this motif are critical for the ICAP-1 binding. Two isoforms of ICAP-1, a 200–amino acid protein (ICAP-1α) and a shorter 150–amino acid protein (ICAP-1β), derived from alternatively spliced mRNA, are expressed in most cells. ICAP-1α is a phosphoprotein and the extent of its phosphorylation is regulated by the cell–matrix interaction. First, an enhancement of ICAP-1α phosphorylation is observed when cells are plated on fibronectin-coated but not on nonspecific poly-l-lysine–coated surfaces. Second, the expression of a constitutively activated RhoA protein that disrupts the cell–matrix interaction results in dephosphorylation of ICAP-1α. The regulation of ICAP-1α phosphorylation by the cell–matrix interaction suggests an important role of ICAP-1 during integrin-dependent cell adhesion. PMID:9281591

  8. Bacillus thuringiensis insecticidal Cry1Aa toxin binds to a highly conserved region of aminopeptidase N in the host insect leading to its evolutionary success.

    PubMed

    Nakanishi, K; Yaoi, K; Shimada, N; Kadotani, T; Sato, R

    1999-06-15

    Bacillus thuringiensis insecticidal protein, Cry1Aa toxin, binds to a specific receptor in insect midguts and has insecticidal activity. Therefore, the structure of the receptor molecule is probably a key factor in determining the binding affinity of the toxin and insect susceptibility. The cDNA fragment (PX frg1) encoding the Cry1Aa toxin-binding region of an aminopeptidase N (APN) or an APN family protein from diamondback moth, Plutella xylostella midgut was cloned and sequenced. A comparison between the deduced amino acid sequence of PX frg1 and other insect APN sequences shows that Cry1Aa toxin binds to a highly conserved region of APN family protein. In this paper, we propose a model to explain the mechanism that causes B. thuringiensis evolutionary success and differing insect susceptibility to Cry1Aa toxin.

  9. Comparative cytogenetics of tree frogs of the Dendropsophus marmoratus (Laurenti, 1768) group: conserved karyotypes and interstitial telomeric sequences.

    PubMed

    Teixeira, Lívia S R; Seger, Karin Regina; Targueta, Cíntia Pelegrineti; Orrico, Victor G Dill; Lourenço, Luciana Bolsoni

    2016-01-01

    The diploid number 2n = 30 is a presumed synapomorphy of Dendropsophus Fitzinger, 1843, although a noticeable variation in the number of biarmed/telocentric chromosomes is observed in this genus. Such a variation suggests that several chromosomal rearrangements took place after the evolutionary origin of the hypothetical ancestral 30-chromosome karyotype; however, the inferred rearrangements remain unknown. Distinct numbers of telocentric chromosomes are found in the two most cytogenetically studied species groups of Dendropsophus. In contrast, all three species of the Dendropsophus marmoratus (Laurenti, 1768) group that are already karyotyped presented five pairs of telocentric chromosomes. In this study, we analyzed cytogenetically three additional species of this group to investigate if the number of telocentric chromosomes in this group is not as variable as in other Dendropsophus groups. We described the karyotypes of Dendropsophus seniculus (Cope, 1868), Dendropsophus soaresi (Caramaschi & Jim, 1983) and Dendropsophus novaisi (Bokermann, 1968) based on Giemsa staining, C-banding, silver impregnation and in situ hybridization with telomeric probes. Dendropsophus seniculus, Dendropsophus soaresi and Dendropsophus novaisi presented five pairs of telocentric chromosomes, as did the remaining species of the group previously karyotyped. Though the species of this group show a high degree of karyotypic similarity, Dendropsophus soaresi was unique in presenting large blocks of het-ITSs (heterochromatic internal telomeric sequences) in the majority of the centromeres. Although the ITSs have been interpreted as evidence of ancestral chromosomal fusions and inversions, the het-ITSs detected in the karyotype of Dendropsophus soaresi could not be explained as direct remnants of ancestral chromosomal rearrangements because no evidence of chromosomal changes emerged from the comparison of the karyotypes of all of the species of the Dendropsophus marmoratus group.

  10. Comparative cytogenetics of tree frogs of the Dendropsophus marmoratus (Laurenti, 1768) group: conserved karyotypes and interstitial telomeric sequences

    PubMed Central

    Teixeira, Lívia S. R.; Seger, Karin Regina; Targueta, Cíntia Pelegrineti; Orrico, Victor G. Dill; Lourenço, Luciana Bolsoni

    2016-01-01

    The diploid number 2n = 30 is a presumed synapomorphy of Dendropsophus Fitzinger, 1843, although a noticeable variation in the number of biarmed/telocentric chromosomes is observed in this genus. Such a variation suggests that several chromosomal rearrangements took place after the evolutionary origin of the hypothetical ancestral 30-chromosome karyotype; however, the inferred rearrangements remain unknown. Distinct numbers of telocentric chromosomes are found in the two most cytogenetically studied species groups of Dendropsophus. In contrast, all three species of the Dendropsophus marmoratus (Laurenti, 1768) group that are already karyotyped presented five pairs of telocentric chromosomes. In this study, we analyzed cytogenetically three additional species of this group to investigate if the number of telocentric chromosomes in this group is not as variable as in other Dendropsophus groups. We described the karyotypes of Dendropsophus seniculus (Cope, 1868), Dendropsophus soaresi (Caramaschi & Jim, 1983) and Dendropsophus novaisi (Bokermann, 1968) based on Giemsa staining, C-banding, silver impregnation and in situ hybridization with telomeric probes. Dendropsophus seniculus, Dendropsophus soaresi and Dendropsophus novaisi presented five pairs of telocentric chromosomes, as did the remaining species of the group previously karyotyped. Though the species of this group show a high degree of karyotypic similarity, Dendropsophus soaresi was unique in presenting large blocks of het-ITSs (heterochromatic internal telomeric sequences) in the majority of the centromeres. Although the ITSs have been interpreted as evidence of ancestral chromosomal fusions and inversions, the het-ITSs detected in the karyotype of Dendropsophus soaresi could not be explained as direct remnants of ancestral chromosomal rearrangements because no evidence of chromosomal changes emerged from the comparison of the karyotypes of all of the species of the Dendropsophus marmoratus group.

  11. A High-Order Finite Spectral Volume Method for Conservation Laws on Unstructured Grids

    NASA Technical Reports Server (NTRS)

    Wang, Z. J.; Liu, Yen; Kwak, Dochan (Technical Monitor)

    2001-01-01

    A time accurate, high-order, conservative, yet efficient method named Finite Spectral Volume (FSV) is developed for conservation laws on unstructured grids. The concept of a 'spectral volume' is introduced to achieve high-order accuracy in an efficient manner similar to spectral element and multi-domain spectral methods. In addition, each spectral volume is further sub-divided into control volumes (CVs), and cell-averaged data from these control volumes is used to reconstruct a high-order approximation in the spectral volume. Riemann solvers are used to compute the fluxes at spectral volume boundaries. Then cell-averaged state variables in the control volumes are updated independently. Furthermore, TVD (Total Variation Diminishing) and TVB (Total Variation Bounded) limiters are introduced in the FSV method to remove/reduce spurious oscillations near discontinuities. A very desirable feature of the FSV method is that the reconstruction is carried out only once, and analytically, and is the same for all cells of the same type, and that the reconstruction stencil is always non-singular, in contrast to the memory and CPU-intensive reconstruction in a high-order finite volume (FV) method. Discussions are made concerning why the FSV method is significantly more efficient than high-order finite volume and the Discontinuous Galerkin (DG) methods. Fundamental properties of the FSV method are studied and high-order accuracy is demonstrated for several model problems with and without discontinuities.
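
    The flux-differencing idea that makes such schemes conservative can be sketched in a few lines. What follows is only a first-order finite-volume update for the inviscid Burgers equation with an exact (Godunov) Riemann flux on a periodic grid; it is not the FSV method of the abstract, and the function names, grid size and CFL number are illustrative assumptions.

```python
import math


def godunov_flux(ul, ur):
    """Exact Riemann (Godunov) flux for Burgers' equation, f(u) = u**2 / 2."""
    if ul <= ur:
        # Rarefaction: the flux is the minimum of f over [ul, ur].
        if ul > 0.0:
            return 0.5 * ul * ul
        if ur < 0.0:
            return 0.5 * ur * ur
        return 0.0
    # Shock: the flux is the maximum of f over [ur, ul].
    return max(0.5 * ul * ul, 0.5 * ur * ur)


def step(u, dx, dt):
    """Advance the cell averages by one time step on a periodic grid."""
    n = len(u)
    # flux[i] is the numerical flux at the interface between cell i and cell i + 1.
    flux = [godunov_flux(u[i], u[(i + 1) % n]) for i in range(n)]
    # flux[i - 1] wraps around to the last interface when i == 0 (periodic boundary).
    return [u[i] - dt / dx * (flux[i] - flux[i - 1]) for i in range(n)]


def run(n=100, steps=50, cfl=0.4):
    """Evolve a smooth initial profile; n, steps and cfl are illustrative choices."""
    dx = 1.0 / n
    u0 = [1.0 + 0.5 * math.sin(2.0 * math.pi * (i + 0.5) * dx) for i in range(n)]
    dt = cfl * dx / max(abs(v) for v in u0)
    u = list(u0)
    for _ in range(steps):
        u = step(u, dx, dt)
    return u0, u


if __name__ == "__main__":
    before, after = run()
    print("mean before:", sum(before) / len(before))
    print("mean after: ", sum(after) / len(after))
```

    Because every interface flux is added to one cell and subtracted from its neighbour, the sum of the cell averages changes only by floating-point round-off, which is the discrete counterpart of the conservation property.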

  12. High order filtering methods for approximating hyperbolic systems of conservation laws

    NASA Technical Reports Server (NTRS)

    Lafon, F.; Osher, S.

    1991-01-01

    The essentially nonoscillatory (ENO) schemes, while potentially useful in the computation of discontinuous solutions of hyperbolic conservation-law systems, are computationally costly relative to simple central-difference methods. A filtering technique is presented which employs central differencing of arbitrarily high-order accuracy except where a local test detects the presence of spurious oscillations and calls upon the full ENO apparatus to remove them. A factor-of-three speedup is thus obtained over the full-ENO method for a wide range of problems, with high-order accuracy in regions of smooth flow.

  13. PhyPA: Phylogenetic method with pairwise sequence alignment outperforms likelihood methods in phylogenetics involving highly diverged sequences.

    PubMed

    Xia, Xuhua

    2016-09-01

    While pairwise sequence alignment (PSA) by dynamic programming is guaranteed to generate one of the optimal alignments, multiple sequence alignment (MSA) of highly divergent sequences often results in poorly aligned sequences, plaguing all subsequent phylogenetic analysis. One way to avoid this problem is to use only PSA to reconstruct phylogenetic trees, which can only be done with distance-based methods. I compared the accuracy of this new computational approach (named PhyPA for phylogenetics by pairwise alignment) against the maximum likelihood method using MSA (the ML+MSA approach), based on nucleotide, amino acid and codon sequences simulated with different topologies and tree lengths. I present a surprising discovery that the fast PhyPA method consistently outperforms the slow ML+MSA approach for highly diverged sequences even when all optimization options were turned on for the ML+MSA approach. Only when sequences are not highly diverged (i.e., when a reliable MSA can be obtained) does the ML+MSA approach outperform PhyPA. The true topologies are always recovered by ML with the true alignment from the simulation. However, with MSA derived from alignment programs such as MAFFT or MUSCLE, the recovered topology consistently has higher likelihood than that for the true topology. Thus, the failure to recover the true topology by the ML+MSA is not because of insufficient search of tree space, but because of the distortion of phylogenetic signal by MSA methods. I have implemented in DAMBE PhyPA and two approaches making use of multi-gene data sets to derive phylogenetic support for subtrees equivalent to resampling techniques such as bootstrapping and jackknifing.
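
    The pairwise-only idea can be illustrated with a small sketch: align every pair of sequences independently by dynamic programming and fill a distance matrix that a distance-based tree method could then consume. This is not the DAMBE implementation; the scoring values, the use of a plain p-distance and all function names are assumptions made for illustration.

```python
def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global pairwise alignment by dynamic programming (linear gap penalty)."""
    n, m = len(a), len(b)
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)
    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i -= 1
            j -= 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return "".join(reversed(out_a)), "".join(reversed(out_b))


def p_distance(aligned_a, aligned_b):
    """Proportion of mismatches over the columns where neither sequence has a gap."""
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b) if x != "-" and y != "-"]
    if not columns:
        raise ValueError("no gap-free columns to compare")
    return sum(1 for x, y in columns if x != y) / len(columns)


def pairwise_distance_matrix(seqs):
    """Distance matrix built from independent pairwise alignments only (no MSA)."""
    k = len(seqs)
    dist = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(i + 1, k):
            d = p_distance(*needleman_wunsch(seqs[i], seqs[j]))
            dist[i][j] = dist[j][i] = d
    return dist


if __name__ == "__main__":
    print(needleman_wunsch("ACGT", "AGT"))
    print(pairwise_distance_matrix(["ACGT", "ACGA", "AGT"]))
```

    The resulting matrix could be passed to any distance-based method such as neighbor joining; a corrected distance would normally replace the raw p-distance for diverged sequences.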

  14. Conservation of inner nuclear membrane targeting sequences in mammalian Pom121 and yeast Heh2 membrane proteins

    PubMed Central

    Kralt, Annemarie; Jagalur, Noorjahan B.; van den Boom, Vincent; Lokareddy, Ravi K.; Steen, Anton; Cingolani, Gino; Fornerod, Maarten; Veenhoff, Liesbeth M.

    2015-01-01

    Endoplasmic reticulum–synthesized membrane proteins traffic through the nuclear pore complex (NPC) en route to the inner nuclear membrane (INM). Although many membrane proteins pass the NPC by simple diffusion, two yeast proteins, ScSrc1/ScHeh1 and ScHeh2, are actively imported. In these proteins, a nuclear localization signal (NLS) and an intrinsically disordered linker encode the sorting signal for recruiting the transport factors for FG-Nup and RanGTP-dependent transport through the NPC. Here we address whether a similar import mechanism applies in metazoans. We show that the (putative) NLSs of metazoan HsSun2, MmLem2, HsLBR, and HsLap2β are not sufficient to drive nuclear accumulation of a membrane protein in yeast, but the NLS from RnPom121 is. This NLS of Pom121 adopts a similar fold as the NLS of Heh2 when transport factor bound and rescues the subcellular localization and synthetic sickness of Heh2ΔNLS mutants. Consistent with the conservation of these NLSs, the NLS and linker of Heh2 support INM localization in HEK293T cells. The conserved features of the NLSs of ScHeh1, ScHeh2, and RnPom121 and the effective sorting of Heh2-derived reporters in human cells suggest that active import is conserved but confined to a small subset of INM proteins. PMID:26179916

  15. [Current applications of high-throughput DNA sequencing technology in antibody drug research].

    PubMed

    Yu, Xin; Liu, Qi-Gang; Wang, Ming-Rong

    2012-03-01

    Since the publication in 2005 of a high-throughput DNA sequencing technology based on PCR carried out in oil emulsions, high-throughput DNA sequencing platforms have evolved into a robust technology for sequencing genomes and diverse DNA libraries. Antibody libraries with vast numbers of members currently serve as a foundation for discovering novel antibody drugs, and high-throughput DNA sequencing technology makes it possible to rapidly identify functional antibody variants with desired properties. Herein we present a review of current applications of high-throughput DNA sequencing technology in the analysis of antibody library diversity, sequencing of CDR3 regions, identification of potent antibodies based on sequence frequency, discovery of functional genes, and combination with various display technologies, so as to provide an alternative approach to the discovery and development of antibody drugs.

  16. Conserved sequence motifs upstream from the co-ordinately expressed vitellogenin and apoVLDLII genes of chicken.

    PubMed

    van het Schip, F; Strijker, R; Samallo, J; Gruber, M; Geert, A B

    1986-11-11

    The vitellogenin and apoVLDLII yolk protein genes of chicken are transcribed in the liver upon estrogenization. To get information on putative regulatory elements, we compared more than 2 kb of their 5' flanking DNA sequences. Common sequence motifs were found in regions exhibiting estrogen-induced changes in chromatin structure. Stretches of alternating pyrimidines and purines of about 30-nucleotides long are present at roughly similar positions. A distinct box of sequence homology in the chicken genes also appears to be present at a similar position in front of the vitellogenin genes of Xenopus laevis, but is absent from the estrogen-responsive egg-white protein genes expressed in the oviduct. In front of the vitellogenin (position -595) and the VLDLII gene (position -548), a DNA element of about 300 base-pairs was found, which possesses structural characteristics of a mobile genetic element and bears homology to the transposon-like Vi element of Xenopus laevis.

  17. Structural features of the murine dihydrofolate reductase transcription termination region: identification of a conserved DNA sequence element.

    PubMed Central

    Frayne, E G; Kellems, R E

    1986-01-01

    Structural features of the transcription termination region for the mouse dihydrofolate reductase gene have been determined and compared with those of several other known termination regions for protein coding genes. A common feature identified among these termination regions was the presence of a 20 bp consensus DNA sequence element (ATCAGAATATAGGAAAGTAGCAAT). The results imply that the 20 bp consensus DNA sequence element is important for signaling RNA polymerase II transcription termination at least in the several vertebrate species investigated. Furthermore, the results suggest that for the dhfr gene and possibly for other genes in mice as well, the potential termination consensus sequence can exist as part of a long interspersed repetitive DNA element. PMID:3714472

  18. Putting Physics First: Three Case Studies of High School Science Department and Course Sequence Reorganization

    ERIC Educational Resources Information Center

    Larkin, Douglas B.

    2016-01-01

    This article examines the process of shifting to a "Physics First" sequence in science course offerings in three school districts in the United States. This curricular sequence reverses the more common U.S. high school sequence of biology/chemistry/physics, and has gained substantial support in the physics education community over the…

  19. How to Go Green: Creating a Conservation Culture in a Public High School through Education, Modeling, and Communication

    ERIC Educational Resources Information Center

    Schelly, Chelsea; Cross, Jennifer E.; Franzen, William; Hall, Pete; Reeve, Stu

    2012-01-01

    This case study examines how energy conservation efforts in one public high school contributed to both sustainability education and the adoption of sustainable behavior within educational and organizational practice. Individual role models, school facilities, school governance and school culture together support both conservation and environmental…

  20. Water Wisdom: 23 Stand-Alone Activities on Water Supply and Water Conservation for High School Students. 2nd Edition.

    ERIC Educational Resources Information Center

    Massachusetts State Water Resources Authority, Boston.

    This water conservation education program for high schools consists of both stand-alone activities and teacher support materials. Lessons are divided into six broad categories: (1) The Water Cycle; (2) Water and Society; (3) Keeping Water Pure; (4) Visualizing Volumes; (5) The Economics of Water Use; and (6) Domestic Water Conservation. The…

  1. Patterns in the bony skull development of marsupials: high variation in onset of ossification and conserved regions of bone contact

    PubMed Central

    Spiekman, Stephan N. F.; Werneburg, Ingmar

    2017-01-01

    Development in marsupials is specialized towards an extremely short gestation and highly altricial newborns. As a result, marsupial neonates display morphological adaptations at birth related to functional constraints. However, little is known about the variability of marsupial skull development and its relation to morphological diversity. We studied bony skull development in five marsupial species. The relative timing of the onset of ossification was compared to literature data and the ossification sequence of the marsupial ancestor was reconstructed using squared-change parsimony. The high range of variation in the onset of ossification meant that no patterns could be observed that differentiate species. This finding challenges traditional studies concentrating on the onset of ossification as a marker for phylogeny or as a functional proxy. Our study presents observations on the developmental timing of cranial bone-to-bone contacts and their evolutionary implications. Although certain bone contacts display high levels of variation, connections of early and late development are quite conserved and informative. Bones that surround the oral cavity are generally the first to connect and the bones of the occipital region are among the last. We conclude that bone contact is preferable over onset of ossification for studying cranial bone development. PMID:28233826

  2. A remote and highly conserved enhancer supports amygdala specific expression of the gene encoding the anxiogenic neuropeptide substance-P.

    PubMed

    Davidson, S; Miller, K A; Dowell, A; Gildea, A; Mackenzie, A

    2006-04-01

    The neuropeptide substance P (SP), encoded by the preprotachykinin-A (PPTA) gene, is expressed in the central and medial amygdaloid nucleus, where it plays a critical role in modulating fear and anxiety related behaviour. Determining the regulatory systems that support PPTA expression in the amygdala may provide important insights into the causes of depression and anxiety related disorders and will provide avenues for the development of novel therapies. In order to identify the tissue specific regulatory element responsible for supporting expression of the PPTA gene in the amygdala, we used long-range comparative genomics in combination with transgenic analysis and immunohistochemistry. By comparing human and chicken genomes, it was possible to detect and characterise a highly conserved long-range enhancer that supported tissue specific expression in SP expressing cells of the medial and central amygdaloid bodies (ECR1; 158.5 kb 5' of human PPTA ORF). Further bioinformatic analysis using the TRANSFAC database indicated that the ECR1 element contained multiple and highly conserved consensus binding sequences of transcription factors (TFs) such as MEIS1. The results of immunohistochemical analysis of transgenic lines were consistent with the hypothesis that the MEIS1 TF interacts with and maintains ECR1 activity in the central amygdala in vivo. The discovery of ECR1 and the in vivo functional relationship with MEIS1 inferred by our studies suggests a mechanism to the regulatory systems that control PPTA expression in the amygdala. Uncovering these mechanisms may play an important role in the future development of tissue specific therapies for the treatment of anxiety and depression.

  3. [Comparative chromosome painting shows the red panda (Ailurus fulgens) has a highly conserved karyotype].

    PubMed

    Tian, Ying; Nie, Wen-Hui; Wang, Jin-Huan; Yang, Yun-Fei; Yang, Feng-Tang

    2002-02-01

    We have established a comparative chromosome map between red panda (Ailurus fulgens, 2n = 36) and dog by chromosome painting with biotin-labelled chromosome-specific probes of the dog. Dog probes specific for the 38 autosomes delineated 71 homologous segments in the metaphase chromosomes of red panda. Of the 38 autosomal paints, 18 probes each delineated one homologous segment in the red panda genome, while the other 20 each detected two to five homologous segments. The dog X chromosome-specific paint delineated the whole X chromosome of the red panda. The results indicate that at least 28 fissions (breaks), 49 fusions and 4 inversions were needed to "convert" the dog karyotype to that of the red panda, suggesting that extensive chromosome rearrangements differentiate the karyotypes of red panda and dog. Based on the established comparative chromosome homologies of dog and domestic cat, we could infer that there were 26 segments of conserved synteny between red panda and domestic cat. Comparative analysis of the distribution patterns of conserved segments defined by dog paints in red panda and domestic cat genomes revealed at least 2 cryptic inversions in two large chromosomal regions of conserved synteny between red panda and domestic cat. The karyotype of red panda shows a high degree of homology with that of domestic cat.

  4. A comparative analysis of distribution and conservation of microsatellites in the transcripts of sequenced Fusarium species and development of genic-SSR markers for polymorphism analysis.

    PubMed

    Mahfooz, Sahil; Srivastava, Arpita; Srivastava, Alok K; Arora, Dilip K

    2015-09-01

    We used an in silico approach to survey and compare microsatellites in transcript sequences of four sequenced members of genus Fusarium. G + C content of transcripts was found to be positively correlated with the frequency of SSRs. Our analysis revealed that, in all the four transcript sequences studied, the occurrence, relative abundance and density of microsatellites varied and were not influenced by transcript sizes. No correlation between relative abundance and transcript sizes was observed. The relative abundance and density of microsatellites were highest in the transcripts of Fusarium solani when compared with F. graminearum, F. verticillioides and F. oxysporum. The maximum frequency of SSRs among all four sequence sets was of trinucleotide repeats (67.8%), whereas dinucleotide repeats represented <1%. Among all classes of repeats, 36.5% of motifs were found to be conserved within Fusarium species. In order to study polymorphism within Fusarium isolates, 11 polymorphic genic-SSR markers were developed. Of the 11 markers, 5 were from F. oxysporum and the remaining 6 belong to F. solani. SSR markers from F. oxysporum were found to be more polymorphic (38%) as compared to F. solani (26%). Eleven polymorphic markers obtained in this study clearly demonstrate the utility of newly developed SSR markers in establishing genetic relationships among different isolates of Fusarium.
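
    An in silico microsatellite survey of this kind can be sketched with a regular-expression scan. The code below is a minimal illustration, not the pipeline used in the study: the minimum repeat thresholds, the function names and the example sequence are all assumptions.

```python
import re


def find_ssrs(seq, motif_len, min_repeats):
    """Return (start, motif, repeat_count) for perfect tandem repeats of one motif length."""
    # A motif of motif_len bases followed by at least (min_repeats - 1) more copies of itself.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        # Skip motifs that are repeats of a shorter unit (e.g. "AA" or "ATAT"),
        # so that each repeat is counted only in its shortest motif class.
        if any(motif == motif[:d] * (motif_len // d)
               for d in range(1, motif_len) if motif_len % d == 0):
            continue
        hits.append((m.start(), motif, len(m.group(0)) // motif_len))
    return hits


def count_by_class(seq, thresholds):
    """Count SSRs per motif length; thresholds maps motif length -> minimum repeats."""
    return {k: len(find_ssrs(seq, k, r)) for k, r in thresholds.items()}


if __name__ == "__main__":
    # Hypothetical transcript fragment with one trinucleotide and one dinucleotide SSR.
    transcript = "TT" + "CAG" * 5 + "GG" + "AT" * 6 + "GG"
    # Hypothetical thresholds: at least 6 dinucleotide or 5 trinucleotide repeats.
    print(find_ssrs(transcript, 3, 5))
    print(count_by_class(transcript, {2: 6, 3: 5}))
```

    Dividing the counts, or the total repeat length, by the length of sequence scanned gives per-length summaries of the kind the abstract calls relative abundance and density; the exact definitions used in the study are not stated in the abstract.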

  5. Fisheries conservation on the high seas: linking conservation physiology and fisheries ecology for the management of large pelagic fishes

    PubMed Central

    Horodysky, Andrij Z.; Cooke, Steven J.; Graves, John E.; Brill, Richard W.

    2016-01-01

    Populations of tunas, billfishes and pelagic sharks are fished at or over capacity in many regions of the world. They are captured by directed commercial and recreational fisheries (the latter of which often promote catch and release) or as incidental catch or bycatch in commercial fisheries. Population assessments of pelagic fishes typically incorporate catch-per-unit-effort time-series data from commercial and recreational fisheries; however, there have been notable changes in target species, areas fished and depth-specific gear deployments over the years that may have affected catchability. Some regional fisheries management organizations take into account the effects of time- and area-specific changes in the behaviours of fish and fishers, as well as fishing gear, to standardize catch-per-unit-effort indices and refine population estimates. However, estimates of changes in stock size over time may be very sensitive to underlying assumptions of the effects of oceanographic conditions and prey distribution on the horizontal and vertical movement patterns and distribution of pelagic fishes. Effective management and successful conservation of pelagic fishes requires a mechanistic understanding of their physiological and behavioural responses to environmental variability, potential for interaction with commercial and recreational fishing gear, and the capture process. The interdisciplinary field of conservation physiology can provide insights into pelagic fish demography and ecology (including environmental relationships and interspecific interactions) by uniting the complementary expertise and skills of fish physiologists and fisheries scientists. The iterative testing by one discipline of hypotheses generated by the other can span the fundamental–applied science continuum, leading to the development of robust insights supporting informed management. The resulting species-specific understanding of physiological abilities and tolerances can help to improve stock

  6. Fisheries conservation on the high seas: linking conservation physiology and fisheries ecology for the management of large pelagic fishes.

    PubMed

    Horodysky, Andrij Z; Cooke, Steven J; Graves, John E; Brill, Richard W

    2016-01-01

    Populations of tunas, billfishes and pelagic sharks are fished at or over capacity in many regions of the world. They are captured by directed commercial and recreational fisheries (the latter of which often promote catch and release) or as incidental catch or bycatch in commercial fisheries. Population assessments of pelagic fishes typically incorporate catch-per-unit-effort time-series data from commercial and recreational fisheries; however, there have been notable changes in target species, areas fished and depth-specific gear deployments over the years that may have affected catchability. Some regional fisheries management organizations take into account the effects of time- and area-specific changes in the behaviours of fish and fishers, as well as fishing gear, to standardize catch-per-unit-effort indices and refine population estimates. However, estimates of changes in stock size over time may be very sensitive to underlying assumptions of the effects of oceanographic conditions and prey distribution on the horizontal and vertical movement patterns and distribution of pelagic fishes. Effective management and successful conservation of pelagic fishes requires a mechanistic understanding of their physiological and behavioural responses to environmental variability, potential for interaction with commercial and recreational fishing gear, and the capture process. The interdisciplinary field of conservation physiology can provide insights into pelagic fish demography and ecology (including environmental relationships and interspecific interactions) by uniting the complementary expertise and skills of fish physiologists and fisheries scientists. The iterative testing by one discipline of hypotheses generated by the other can span the fundamental-applied science continuum, leading to the development of robust insights supporting informed management. The resulting species-specific understanding of physiological abilities and tolerances can help to improve stock

  7. Forecasting Ecological Genomics: High-Tech Animal Instrumentation Meets High-Throughput Sequencing

    PubMed Central

    Shafer, Aaron B. A.; Northrup, Joseph M.; Wikelski, Martin; Wittemyer, George; Wolf, Jochen B. W.

    2016-01-01

    Recent advancements in animal tracking technology and high-throughput sequencing are rapidly changing the questions and scope of research in the biological sciences. The integration of genomic data with high-tech animal instrumentation comes as a natural progression of traditional work in ecological genetics, and we provide a framework for linking the separate data streams from these technologies. Such a merger will elucidate the genetic basis of adaptive behaviors like migration and hibernation and advance our understanding of fundamental ecological and evolutionary processes such as pathogen transmission, population responses to environmental change, and communication in natural populations. PMID:26745372

  8. Forecasting Ecological Genomics: High-Tech Animal Instrumentation Meets High-Throughput Sequencing.

    PubMed

    Shafer, Aaron B A; Northrup, Joseph M; Wikelski, Martin; Wittemyer, George; Wolf, Jochen B W

    2016-01-01

    Recent advancements in animal tracking technology and high-throughput sequencing are rapidly changing the questions and scope of research in the biological sciences. The integration of genomic data with high-tech animal instrumentation comes as a natural progression of traditional work in ecological genetics, and we provide a framework for linking the separate data streams from these technologies. Such a merger will elucidate the genetic basis of adaptive behaviors like migration and hibernation and advance our understanding of fundamental ecological and evolutionary processes such as pathogen transmission, population responses to environmental change, and communication in natural populations.

  9. Highly Differentiated ZW Sex Microchromosomes in the Australian Varanus Species Evolved through Rapid Amplification of Repetitive Sequences

    PubMed Central

    Matsubara, Kazumi; Sarre, Stephen D.; Georges, Arthur; Matsuda, Yoichi; Marshall Graves, Jennifer A.; Ezaz, Tariq

    2014-01-01

    Transitions between sex determination systems have occurred in many lineages of squamates and it follows that novel sex chromosomes will also have arisen multiple times. The formation of sex chromosomes may be reinforced by inhibition of recombination and the accumulation of repetitive DNA sequences. The karyotypes of monitor lizards are known to be highly conserved yet the sex chromosomes in this family have not been fully investigated. Here, we compare male and female karyotypes of three Australian monitor lizards, Varanus acanthurus, V. gouldii and V. rosenbergi, from two different clades. V. acanthurus belongs to the acanthurus clade and the other two belong to the gouldii clade. We applied C-banding and comparative genomic hybridization to reveal that these species have ZZ/ZW sex micro-chromosomes in which the W chromosome is highly differentiated from the Z chromosome. In combination with previous reports, all six Varanus species in which sex chromosomes have been identified have ZZ/ZW sex chromosomes, spanning several clades on the varanid phylogeny, making it likely that the ZZ/ZW sex chromosome is ancestral for this family. However, repetitive sequences of these ZW chromosome pairs differed among species. In particular, an (AAT)n microsatellite repeat motif mapped by fluorescence in situ hybridization on part of W chromosome in V. acanthurus only, whereas a (CGG)n motif mapped onto the W chromosomes of V. gouldii and V. rosenbergi. Furthermore, the W chromosome probe for V. acanthurus produced hybridization signals only on the centromeric regions of W chromosomes of the other two species. These results suggest that the W chromosome sequences were not conserved between gouldii and acanthurus clades and that these repetitive sequences have been amplified rapidly and independently on the W chromosome of the two clades after their divergence. PMID:24743344

  10. Assessment of genetic diversity among four orchids based on ddRAD sequencing data for conservation purposes.

    PubMed

    Roy, Subhas Chandra; Moitra, Kaushik; De Sarker, Dilip

    2017-01-01

    Genetic diversity was assessed in four orchid species using NGS-based ddRAD sequencing data. The assembled nucleotide sequences (fastq) were deposited in the SRA archive of the NCBI database under accession numbers SRP063543 for Dendrobium, SRP065790 for Geodorum, SRP072201 for Cymbidium and SRP072378 for Rhynchostylis. Total base pairs read were 1.1 Mbp for Dendrobium sp., 553.3 Kbp for Geodorum sp., 1.6 Gbp for Cymbidium, and 1.4 Gbp for Rhynchostylis. Average GC content was 43.9% in Geodorum, 43.7% in Dendrobium, 41.2% in Cymbidium and 42.3% in Rhynchostylis. Four partial gene sequences were used in the DnaSP5 program to determine nucleotide diversity and phylogenetic relationships (Ycf2 gene of Dendrobium, matK gene of Geodorum, psbD gene of Cymbidium and Ycf2 gene of Rhynchostylis). Nucleotide diversity per site (π) was 0.10560 in Dendrobium, 0.03586 in Geodorum, 0.01364 in Cymbidium and 0.011344 in Rhynchostylis. Neutrality test statistics were negative in all four orchid species (Tajima's D of -2.17959 in Dendrobium, -2.01655 in Geodorum, -2.12362 in Rhynchostylis and -1.54222 in Cymbidium), indicating purifying selection. The results for these gene sequences (matK, Ycf2 and psbD) indicate that they did not evolve neutrally and that selection might have played a role in the evolution of these genes in these four groups of orchids. Phylogenetic relationships were analyzed by reconstructing a dendrogram based on the matK, psbD and Ycf2 gene sequences using the maximum likelihood method in the MEGA6 program.
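
    For readers unfamiliar with the two statistics reported here, the sketch below computes nucleotide diversity per site and Tajima's D from a small gap-free alignment using the standard textbook formulas. It is not the DnaSP implementation, it ignores gaps and ambiguous bases, and the function names and the toy alignment are assumptions.

```python
import math
from itertools import combinations


def mean_pairwise_differences(seqs):
    """Average number of differing sites over all pairs of aligned sequences."""
    pairs = list(combinations(seqs, 2))
    total = sum(sum(1 for x, y in zip(a, b) if x != y) for a, b in pairs)
    return total / len(pairs)


def segregating_sites(seqs):
    """Number of alignment columns that contain more than one base."""
    return sum(1 for column in zip(*seqs) if len(set(column)) > 1)


def nucleotide_diversity(seqs):
    """Nucleotide diversity per site: mean pairwise differences / alignment length."""
    return mean_pairwise_differences(seqs) / len(seqs[0])


def tajimas_d(seqs):
    """Tajima's D for equal-length, gap-free aligned sequences."""
    n = len(seqs)
    s = segregating_sites(seqs)
    if n < 4 or s == 0:
        raise ValueError("need at least 4 sequences and 1 segregating site")
    a1 = sum(1.0 / i for i in range(1, n))
    a2 = sum(1.0 / (i * i) for i in range(1, n))
    b1 = (n + 1) / (3.0 * (n - 1))
    b2 = 2.0 * (n * n + n + 3) / (9.0 * n * (n - 1))
    c1 = b1 - 1.0 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / (a1 * a1)
    e1 = c1 / a1
    e2 = c2 / (a1 * a1 + a2)
    k = mean_pairwise_differences(seqs)
    return (k - s / a1) / math.sqrt(e1 * s + e2 * s * (s - 1))


if __name__ == "__main__":
    # Toy alignment (hypothetical): three singleton variants in four sequences.
    alignment = ["AAAAAAAAAA", "TAAAAAAAAA", "ATAAAAAAAA", "AATAAAAAAA"]
    print("pi per site:", nucleotide_diversity(alignment))
    print("Tajima's D: ", tajimas_d(alignment))
```

    An excess of rare variants, as in the toy alignment, pushes D below zero; negative values are commonly read as consistent with purifying selection or population expansion.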

  11. The N14 anti-afamin antibody Fab: a rare VL1 CDR glycosylation, crystallographic re-sequencing, molecular plasticity and conservative versus enthusiastic modelling

    PubMed Central

    Naschberger, Andreas; Fürnrohr, Barbara G.; Lenac Rovis, Tihana; Malic, Suzana; Scheffzek, Klaus; Dieplinger, Hans; Rupp, Bernhard

    2016-01-01

    The monoclonal antibody N14 is used as a detection antibody in ELISA kits for the human glycoprotein afamin, a member of the albumin family, which has recently gained interest in the capture and stabilization of Wnt signalling proteins, and for its role in metabolic syndrome and papillary thyroid carcinoma. As a rare occurrence, the N14 Fab is N-glycosylated at Asn26L at the onset of the VL1 antigen-binding loop, with the α-1–6 core fucosylated complex glycan facing out of the L1 complementarity-determining region. The crystal structures of two non-apparent (pseudo) isomorphous crystals of the N14 Fab were analyzed, which differ significantly in the elbow angles, thereby cautioning against the overinterpretation of domain movements upon antigen binding. In addition, the map quality at 1.9 Å resolution was sufficient to crystallographically re-sequence the variable VL and VH domains and to detect discrepancies in the hybridoma-derived sequence. Finally, a conservatively refined parsimonious model is presented and its statistics are compared with those from a less conservatively built model that has been modelled more enthusiastically. Improvements to the PDB validation reports affecting ligands, clashscore and buried surface calculations are suggested. PMID:27917827

  12. High order numerical methods for networks of hyperbolic conservation laws coupled with ODEs and lumped parameter models

    NASA Astrophysics Data System (ADS)

    Borsche, Raul; Kall, Jochen

    2016-12-01

    In this paper we construct high order finite volume schemes on networks of hyperbolic conservation laws with coupling conditions involving ODEs. We consider two generalized Riemann solvers at the junction, one of Toro-Castro type and a solver of Harten, Engquist, Osher, Chakravarthy type. The ODE is treated with a Taylor method or an explicit Runge-Kutta scheme, respectively. Both resulting high order methods conserve quantities exactly if the conservation is part of the coupling conditions. Furthermore we present a technique to incorporate lumped parameter models, which arise from simplifying parts of a network. The high order convergence and the robust capturing of shocks are investigated numerically in several test cases.

  13. [Role of high-throughput sequencing in oncology].

    PubMed

    Rodrigues, Manuel Jorge; Gomez-Roca, Carlos

    2013-03-01

    New sequencing technologies are among the most important technical advances in biology of the last 10 years. These technologies allow millions of DNA fragments to be sequenced in parallel, covering billions of bases in a short period of time. They have led to the discovery of millions of variants whose functional and clinical value remains to be confirmed. They also make it possible to search for new constitutional and somatic mutations in various samples in a short time. The complexity of data interpretation, the size of the data, and the substantial investment needed for implementation mean that these technologies are available only in large institutions. The objective of this article is to present the different techniques and their associated technologies and to discuss their current applications.

  14. Symmetrization of conservation laws with entropy for high-temperature hypersonic computations

    NASA Technical Reports Server (NTRS)

    Chalot, F.; Hughes, T. J. R.; Shakib, F.

    1990-01-01

    Results of Hughes, Franca, and Mallet are generalized to conservation law systems taking into account high-temperature effects. Symmetric forms of different equation sets are derived in terms of entropy variables. First, the case of a general divariant gas is studied; it can be specialized to the usual Navier-Stokes equations, as well as to situations where the gas is vibrationally excited and undergoes equilibrium chemical reactions. The case of gas in thermochemical nonequilibrium is considered next. Transport phenomena, and in particular mass diffusion, are examined in the framework of symmetric advective-diffusive systems.

  15. High resolution numerical simulation of the linearized Euler equations in conservation law form

    NASA Technical Reports Server (NTRS)

    Sreenivas, Kidambi; Whitfield, David L.; Huff, Dennis L.

    1993-01-01

    A linearized Euler solver based on a high resolution numerical scheme is presented. The approach is to linearize the flux vector as opposed to carrying through the complete linearization analysis with the dependent variable vector written as a sum of the mean and the perturbed flow. This allows the linearized equations to be maintained in conservation law form. The linearized equations are used to compute unsteady flows in turbomachinery blade rows arising due to blade vibrations. Numerical solutions are compared to theoretical results (where available) and to numerical solutions of the nonlinear Euler equations.

  16. Targeted high throughput sequencing in hereditary ataxia and spastic paraplegia

    PubMed Central

    Koht, Jeanette; Pihlstrøm, Lasse; Rengmark, Aina H.; Henriksen, Sandra P.; Tallaksen, Chantal M. E.; Toft, Mathias

    2017-01-01

    Hereditary ataxia and spastic paraplegia are heterogeneous monogenic neurodegenerative disorders. To date, a large number of individuals with such disorders remain undiagnosed. Here, we have assessed molecular diagnosis by gene panel sequencing in 105 early and late-onset hereditary ataxia and spastic paraplegia probands, in whom extensive previous investigations had failed to identify the genetic cause of disease. Pathogenic and likely-pathogenic variants were identified in 20 probands (19%) and variants of uncertain significance in ten probands (10%). Together these accounted for 30 probands (29%) and involved 18 different genes. Among several interesting findings, dominantly inherited KIF1A variants, p.(Val8Met) and p.(Ile27Thr) segregated in two independent families, both presenting with a pure spastic paraplegia phenotype. Two homozygous missense variants, p.(Gly4230Ser) and p.(Leu4221Val) were found in SACS in one consanguineous family, presenting with spastic ataxia and isolated cerebellar atrophy. The average disease duration in probands with pathogenic and likely-pathogenic variants was 31 years, ranging from 4 to 51 years. In conclusion, this study confirmed and expanded the clinical phenotypes associated with known disease genes. The results demonstrate that gene panel sequencing and similar sequencing approaches can serve as efficient diagnostic tools for different heterogeneous disorders. Early use of such strategies may help to reduce both costs and time of the diagnostic process. PMID:28362824
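
    The diagnostic yields quoted above follow directly from the proband counts. The short check below is only an arithmetic sketch; the helper name `percent` is hypothetical, and rounding to the nearest whole percent is an assumption about how the abstract's figures were derived.

```python
def percent(part, total):
    """Share of `part` in `total`, rounded to the nearest whole percent."""
    return round(100 * part / total)

probands = 105
pathogenic_or_likely = 20   # pathogenic and likely-pathogenic variants
uncertain = 10              # variants of uncertain significance

yield_pathogenic = percent(pathogenic_or_likely, probands)
yield_uncertain = percent(uncertain, probands)
yield_combined = percent(pathogenic_or_likely + uncertain, probands)
```

    Under that rounding assumption the three values come out as 19, 10 and 29, matching the percentages in the abstract.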

  17. High-Throughput Sequencing of RNA Silencing-Associated Small RNAs in Olive (Olea europaea L.)

    PubMed Central

    Donaire, Livia; Pedrola, Laia; de la Rosa, Raúl; Llave, César

    2011-01-01

    Small RNAs (sRNAs) of 20 to 25 nucleotides (nt) in length maintain genome integrity and control gene expression in a multitude of developmental and physiological processes. Although RNA silencing has been primarily studied in model plants, the advent of high-throughput sequencing technologies has enabled profiling of the sRNA component of more than 40 plant species. Here, we used deep sequencing and molecular methods to report the first inventory of sRNAs in olive (Olea europaea L.). sRNA libraries prepared from juvenile and adult shoots revealed that the 24-nt class dominates the sRNA transcriptome and atypically accumulates to levels never seen in other plant species, suggesting an active role of heterochromatin silencing in the maintenance and integrity of its large genome. A total of 18 known miRNA families were identified in the libraries. Also, 5 other sRNAs derived from potential hairpin-like precursors remain as plausible miRNA candidates. RNA blots confirmed miRNA expression and suggested tissue- and/or developmental-specific expression patterns. Target mRNAs of conserved miRNAs were computationally predicted among the olive cDNA collection and experimentally validated through endonucleolytic cleavage assays. Finally, we use expression data to uncover genetic components of the miR156, miR172 and miR390/TAS3-derived trans-acting small interfering RNA (tasiRNA) regulatory nodes, suggesting that these interactive networks controlling developmental transitions are fully operational in olive. PMID:22140484

  18. Whole Genome Sequencing of Enterovirus species C Isolates by High-Throughput Sequencing: Development of Generic Primers

    PubMed Central

    Bessaud, Maël; Sadeuh-Mba, Serge A.; Joffret, Marie-Line; Razafindratsimandresy, Richter; Polston, Patsy; Volle, Romain; Rakoto-Andrianarivelo, Mala; Blondel, Bruno; Njouom, Richard; Delpeyroux, Francis

    2016-01-01

    Enteroviruses are among the most common viruses infecting humans and can cause diverse clinical syndromes ranging from minor febrile illness to severe and potentially fatal diseases. Enterovirus species C (EV-C) consists of more than 20 types, including the three serotypes of polioviruses, the etiological agents of poliomyelitis. Biodiversity and evolution of EV-C genomes are shaped by frequent recombination events. Therefore, identification and characterization of circulating EV-C strains require the sequencing of different genomic regions. A simple method was developed to quickly sequence the entire genome of EV-C isolates. Four overlapping fragments were produced separately by RT-PCR performed with generic primers. The four amplicons were then pooled and purified prior to being sequenced by a high-throughput technique. The method was assessed on a panel of EV-Cs belonging to a wide range of types. It can be used to determine full-length genome sequences through de novo assembly of thousands of reads. It was also able to discriminate reads from closely related viruses in mixtures. By decreasing the workload compared to classical Sanger-based techniques, this method will serve as a valuable tool for sequencing large panels of EV-Cs isolated in cell cultures during environmental surveillance or from patients, including vaccine-derived polioviruses. PMID:27617004

  19. Conserved cis-regulatory modules in promoters of genes encoding wheat high-molecular-weight glutenin subunits

    PubMed Central

    Ravel, Catherine; Fiquet, Samuel; Boudet, Julie; Dardevet, Mireille; Vincent, Jonathan; Merlino, Marielle; Michard, Robin; Martre, Pierre

    2014-01-01

    The concentration and composition of the gliadin and glutenin seed storage proteins (SSPs) in wheat flour are the most important determinants of its end-use value. In cereals, the synthesis of SSPs is predominantly regulated at the transcriptional level by a complex network involving at least five cis-elements in gene promoters. The high-molecular-weight glutenin subunits (HMW-GS) are encoded by two tightly linked genes located on the long arms of group 1 chromosomes. Here, we sequenced and annotated the HMW-GS gene promoters of 22 electrophoretic wheat alleles to identify putative cis-regulatory motifs. We focused on 24 motifs known to be involved in SSP gene regulation. Most of them were identified in at least one HMW-GS gene promoter sequence. A common regulatory framework was observed in all the HMW-GS gene promoters, as they shared conserved cis-regulatory modules (CCRMs) including all the five motifs known to regulate the transcription of SSP genes. This common regulatory framework comprises a composite box made of the GATA motifs and GCN4-like Motifs (GLMs) and was shown to be functional as the GLMs are able to bind a bZIP transcriptional factor SPA (Storage Protein Activator). In addition to this regulatory framework, each HMW-GS gene promoter had additional motifs organized differently. The promoters of most highly expressed x-type HMW-GS genes contain an additional box predicted to bind R2R3-MYB transcriptional factors. However, the differences in annotation between promoter alleles could not be related to their level of expression. In summary, we identified a common modular organization of HMW-GS gene promoters but the lack of correlation between the cis-motifs of each HMW-GS gene promoter and their level of expression suggests that other cis-elements or other mechanisms regulate HMW-GS gene expression. PMID:25429295

  20. Analysis of Cytochrome P450 Conserved Sequence Motifs between Helices E and H: Prediction of Critical Motifs and Residues in Enzyme Functions

    PubMed Central

    Oezguen, Numan; Kumar, Santosh

    2014-01-01

    Rational approaches have been extensively used to investigate the role of active site residues in cytochrome P450 (CYP) functions. However, recent studies using random mutagenesis suggest an important role for non-active site residues in CYP functions. Meta-analysis of the random mutants showed that 75% of the functionally important non-active site residues are present in 20% of the entire protein, between helices E and H (E-H) and within conserved sequence motifs (CSM) 7 to 11. The CSM approach was developed recently to investigate the functional role of non-active site residues in CYP2B4. Furthermore, we identified and analyzed the CSM in multiple CYP families and subfamilies in the E-H region. Results from CSM analysis showed that CSM 7, 8, 10, and 11 are conserved in CYP1, CYP2, and CYP3 families, while CSM 9 is conserved only in CYP2 family. Analysis of different CYP2 subfamilies showed that CYP2B and CYP2C have similar characteristics in the CSM, while the characteristics of CYP2A and CYP2D subfamilies are different. Finally, we analyzed CSM 7, 8, 10, and 11, which are common in all the CYP families/subfamilies analyzed, in fifteen important drug-metabolizing CYPs. The results showed that while CSM 8 is most conserved among these CYPs, CSM 7, 9, and 10 have significant variations. We suggest that CSM 8 has a common role in all the CYPs that have been analyzed, while CSM 7, 10, and 11 may have a relatively specific role within the subfamily. We further suggest that these CSM play an important role in opening and closing of the substrate access/egress channel by modulating the flexible/plastic region of the protein. Thus, site-directed mutagenesis of these CSM can be used to study structure-function and dynamic/plasticity-function relationships and to design CYP biocatalysts. PMID:25426333

  1. Distinct activation phenotype of a highly conserved novel HLA-B57-restricted epitope during dengue virus infection

    PubMed Central

    Townsley, Elizabeth; Woda, Marcia; Thomas, Stephen J; Kalayanarooj, Siripen; Gibbons, Robert V; Nisalak, Ananda; Srikiatkhachorn, Anon; Green, Sharone; Stephens, Henry AF; Rothman, Alan L; Mathew, Anuja

    2014-01-01

    Variation in the sequence of T-cell epitopes between dengue virus (DENV) serotypes is believed to alter memory T-cell responses during second heterologous infections. We identified a highly conserved, novel, HLA-B57-restricted epitope on the DENV NS1 protein. We predicted higher frequencies of B57-NS1(26–34)-specific CD8+ T cells in peripheral blood mononuclear cells from individuals undergoing secondary rather than primary DENV infection. However, high tetramer-positive T-cell frequencies during acute infection were seen in only one of nine subjects with secondary infection. B57-NS1(26–34)-specific and other DENV epitope-specific CD8+ T cells, as well as total CD8+ T cells, expressed an activated phenotype (CD69+ and/or CD38+) during acute infection. In contrast, expression of CD71 was largely limited to DENV epitope-specific CD8+ T cells. In vitro stimulation of cell lines indicated that CD71 expression was differentially sensitive to stimulation by homologous and heterologous variant peptides. CD71 may represent a useful marker of antigen-specific T-cell activation. PMID:23941420

  2. Distinct activation phenotype of a highly conserved novel HLA-B57-restricted epitope during dengue virus infection.

    PubMed

    Townsley, Elizabeth; Woda, Marcia; Thomas, Stephen J; Kalayanarooj, Siripen; Gibbons, Robert V; Nisalak, Ananda; Srikiatkhachorn, Anon; Green, Sharone; Stephens, Henry A F; Rothman, Alan L; Mathew, Anuja

    2014-01-01

    Variation in the sequence of T-cell epitopes between dengue virus (DENV) serotypes is believed to alter memory T-cell responses during second heterologous infections. We identified a highly conserved, novel, HLA-B57-restricted epitope on the DENV NS1 protein. We predicted higher frequencies of B57-NS1(26-34)-specific CD8(+) T cells in peripheral blood mononuclear cells from individuals undergoing secondary rather than primary DENV infection. However, high tetramer-positive T-cell frequencies during acute infection were seen in only one of nine subjects with secondary infection. B57-NS1(26-34)-specific and other DENV epitope-specific CD8(+) T cells, as well as total CD8(+) T cells, expressed an activated phenotype (CD69(+) and/or CD38(+)) during acute infection. In contrast, expression of CD71 was largely limited to DENV epitope-specific CD8(+) T cells. In vitro stimulation of cell lines indicated that CD71 expression was differentially sensitive to stimulation by homologous and heterologous variant peptides. CD71 may represent a useful marker of antigen-specific T-cell activation.

  3. In silico identification of conserved microRNAs and their target transcripts from expressed sequence tags of three earthworm species.

    PubMed

    Gong, Ping; Xie, Fuliang; Zhang, Baohong; Perkins, Edward J

    2010-12-01

    MicroRNAs are a recently identified class of small regulatory RNAs that target more than 30% of protein-coding genes. Accumulating evidence shows that miRNAs play a critical role in many biological processes, including developmental timing, tissue differentiation, and response to chemical exposure. In this study, we applied a computational approach to analyze expressed sequence tags, and identified 32 miRNAs belonging to 22 miRNA families in three earthworm species: Eisenia fetida, Eisenia andrei, and Lumbricus rubellus. These newly identified earthworm miRNAs differ by 2-4 nucleotides from their homologous counterparts in Caenorhabditis elegans. They also share similar features with other known animal miRNAs; for instance, the nucleotide U is dominant in both mature and pre-miRNA sequences, particularly in the first position of mature miRNA sequences at the 5' end. The newly identified earthworm miRNAs putatively regulate mRNA genes that are involved in many important biological processes and pathways related to development, growth, locomotion, and reproduction as well as response to stresses, particularly oxidative stress. Future efforts will focus on experimental validation of their presence and target mRNA genes to further elucidate their biological functions in earthworms.

  4. Molecular cloning, nucleotide sequence, and abscisic acid induction of a suberization-associated highly anionic peroxidase.

    PubMed

    Roberts, E; Kolattukudy, P E

    1989-06-01

    A highly anionic peroxidase induced in suberizing cells was suggested to be the key enzyme involved in polymerization of phenolic monomers to generate the aromatic matrix of suberin. The enzyme encoded by a potato cDNA was found to be highly homologous to the anionic peroxidase induced in suberizing tomato fruit. A tomato genomic library was screened using the potato anionic peroxidase cDNA and one genomic clone was isolated that contained two tandemly oriented anionic peroxidase genes. These genes were sequenced and were 96% and 87% identical to the mRNA for potato anionic peroxidase. Both genes consist of three exons with the relative positions of their two introns being conserved between the two genes. Primer extension analysis showed that only one of the genes is expressed in the periderm of 3 day wound-healed tomato fruits. Southern blot analyses suggested that there are two copies each of the two highly homologous genes per haploid genome in both potato and tomato. Abscisic acid (ABA) induced the accumulation of the anionic peroxidase transcripts in potato and tomato callus tissues. Northern blots showed that peroxidase mRNA was detectable at 2 days and was maximal at 8 days after transfer of potato callus to solid agar media containing 10(-4) M ABA. The transcripts induced by ABA in both potato and tomato callus were identical in size to those induced in wound-healing potato tuber and tomato fruit. The anionic peroxidase peptide was detected in extracts of potato callus grown on the ABA-containing media by western blot analysis. The results support the suggestion that stimulation of suberization by ABA involves the induction of the highly anionic peroxidase.

  5. High order filtering methods for approximating hyperbolic systems of conservation laws

    NASA Technical Reports Server (NTRS)

    Lafon, F.; Osher, S.

    1990-01-01

    In the computation of discontinuous solutions of hyperbolic systems of conservation laws, the recently developed essentially non-oscillatory (ENO) schemes appear to be very useful. However, they are computationally costly compared to simple central difference methods. A filtering method is developed that uses simple central differencing of arbitrarily high order accuracy, except where a novel local test indicates the development of spurious oscillations. At these points, the full ENO apparatus is used, maintaining the high order of accuracy but removing spurious oscillations. Numerical results indicate the success of the method. High order of accuracy was obtained in regions of smooth flow without spurious oscillations for a wide range of problems, with a significant speedup, generally a factor of almost three, over the full ENO method.

  6. Low-intensity agricultural landscapes in Transylvania support high butterfly diversity: implications for conservation.

    PubMed

    Loos, Jacqueline; Dorresteijn, Ine; Hanspach, Jan; Fust, Pascal; Rakosy, László; Fischer, Joern

    2014-01-01

    European farmland biodiversity is declining due to land use changes towards agricultural intensification or abandonment. Some Eastern European farming systems have sustained traditional forms of use, resulting in high levels of biodiversity. However, global markets and international policies now imply rapid and major changes to these systems. To effectively protect farmland biodiversity, understanding landscape features which underpin species diversity is crucial. Focusing on butterflies, we addressed this question for a cultural-historic landscape in Southern Transylvania, Romania. Following a natural experiment, we randomly selected 120 survey sites in farmland, 60 each in grassland and arable land. We surveyed butterfly species richness and abundance by walking transects with four repeats in summer 2012. We analysed species composition using Detrended Correspondence Analysis. We modelled species richness, richness of functional groups, and abundance of selected species in response to topography, woody vegetation cover and heterogeneity at three spatial scales, using generalised linear mixed effects models. Species composition widely overlapped in grassland and arable land. Composition changed along gradients of heterogeneity at local and context scales, and of woody vegetation cover at context and landscape scales. The effect of local heterogeneity on species richness was positive in arable land, but negative in grassland. Plant species richness, and structural and topographic conditions at multiple scales explained species richness, richness of functional groups and species abundances. Our study revealed high conservation value of both grassland and arable land in low-intensity Eastern European farmland. Besides grassland, also heterogeneous arable land provides important habitat for butterflies. While butterfly diversity in arable land benefits from heterogeneity by small-scale structures, grasslands should be protected from fragmentation to provide

  7. Low-Intensity Agricultural Landscapes in Transylvania Support High Butterfly Diversity: Implications for Conservation

    PubMed Central

    Loos, Jacqueline; Dorresteijn, Ine; Hanspach, Jan; Fust, Pascal; Rakosy, László; Fischer, Joern

    2014-01-01

    European farmland biodiversity is declining due to land use changes towards agricultural intensification or abandonment. Some Eastern European farming systems have sustained traditional forms of use, resulting in high levels of biodiversity. However, global markets and international policies now imply rapid and major changes to these systems. To effectively protect farmland biodiversity, understanding landscape features which underpin species diversity is crucial. Focusing on butterflies, we addressed this question for a cultural-historic landscape in Southern Transylvania, Romania. Following a natural experiment, we randomly selected 120 survey sites in farmland, 60 each in grassland and arable land. We surveyed butterfly species richness and abundance by walking transects with four repeats in summer 2012. We analysed species composition using Detrended Correspondence Analysis. We modelled species richness, richness of functional groups, and abundance of selected species in response to topography, woody vegetation cover and heterogeneity at three spatial scales, using generalised linear mixed effects models. Species composition widely overlapped in grassland and arable land. Composition changed along gradients of heterogeneity at local and context scales, and of woody vegetation cover at context and landscape scales. The effect of local heterogeneity on species richness was positive in arable land, but negative in grassland. Plant species richness, and structural and topographic conditions at multiple scales explained species richness, richness of functional groups and species abundances. Our study revealed high conservation value of both grassland and arable land in low-intensity Eastern European farmland. Besides grassland, also heterogeneous arable land provides important habitat for butterflies. While butterfly diversity in arable land benefits from heterogeneity by small-scale structures, grasslands should be protected from fragmentation to provide

  8. Gla-rich protein (GRP), a new vitamin K-dependent protein identified from sturgeon cartilage and highly conserved in vertebrates.

    PubMed

    Viegas, Carla S B; Simes, Dina C; Laizé, Vincent; Williamson, Matthew K; Price, Paul A; Cancela, M Leonor

    2008-12-26

    We report the isolation of a novel vitamin K-dependent protein from the calcified cartilage of Adriatic sturgeon (Acipenser nacarii). This 10.2-kDa secreted protein contains 16 gamma-carboxyglutamic acid (Gla) residues in its 74-residue sequence, the highest Gla percent of any known protein, and we have therefore termed it Gla-rich protein (GRP). GRP has a high charge density (36 negative+16 positive=20 net negative) yet is insoluble at neutral pH. GRP has orthologs in all taxonomic groups of vertebrates, and a paralog (GRP2) in bony fish; no GRP homolog was found in invertebrates. There is no significant sequence homology between GRP and the Gla-containing region of any presently known vitamin K-dependent protein. Forty-seven GRP sequences were obtained by a combination of cDNA cloning and comparative genomics: all 47 have a propeptide that contains a gamma-carboxylase recognition site and a mature protein with 14 highly conserved Glu residues, each of them being gamma-carboxylated in sturgeon. The protein sequence of GRP is also highly conserved, with 78% identity between sturgeon and human GRP. Analysis of the corresponding gene structures suggests a highly constrained organization, particularly for exon 4, which encodes the core Gla domain. GRP mRNA is found in virtually all rat and sturgeon tissues examined, with the highest expression in cartilage. Cells expressing GRP include chondrocytes, chondroblasts, osteoblasts, and osteocytes. Because of its potential to bind calcium through Gla residues, we suggest that GRP may regulate calcium in the extracellular environment.

  9. Gla-rich Protein (GRP), A New Vitamin K-dependent Protein Identified from Sturgeon Cartilage and Highly Conserved in Vertebrates

    PubMed Central

    Viegas, Carla S. B.; Simes, Dina C.; Laizé, Vincent; Williamson, Matthew K.; Price, Paul A.; Cancela, M. Leonor

    2008-01-01

    We report the isolation of a novel vitamin K-dependent protein from the calcified cartilage of Adriatic sturgeon (Acipenser nacarii). This 10.2-kDa secreted protein contains 16 γ-carboxyglutamic acid (Gla) residues in its 74-residue sequence, the highest Gla percent of any known protein, and we have therefore termed it Gla-rich protein (GRP). GRP has a high charge density (36 negative + 16 positive = 20 net negative) yet is insoluble at neutral pH. GRP has orthologs in all taxonomic groups of vertebrates, and a paralog (GRP2) in bony fish; no GRP homolog was found in invertebrates. There is no significant sequence homology between GRP and the Gla-containing region of any presently known vitamin K-dependent protein. Forty-seven GRP sequences were obtained by a combination of cDNA cloning and comparative genomics: all 47 have a propeptide that contains a γ-carboxylase recognition site and a mature protein with 14 highly conserved Glu residues, each of them being γ-carboxylated in sturgeon. The protein sequence of GRP is also highly conserved, with 78% identity between sturgeon and human GRP. Analysis of the corresponding gene structures suggests a highly constrained organization, particularly for exon 4, which encodes the core Gla domain. GRP mRNA is found in virtually all rat and sturgeon tissues examined, with the highest expression in cartilage. Cells expressing GRP include chondrocytes, chondroblasts, osteoblasts, and osteocytes. Because of its potential to bind calcium through Gla residues, we suggest that GRP may regulate calcium in the extracellular environment. PMID:18836183

  10. Deletions involving long-range conserved nongenic sequences upstream and downstream of FOXL2 as a novel disease-causing mechanism in blepharophimosis syndrome.

    PubMed

    Beysen, D; Raes, J; Leroy, B P; Lucassen, A; Yates, J R W; Clayton-Smith, J; Ilyina, H; Brooks, S Sklower; Christin-Maitre, S; Fellous, M; Fryns, J P; Kim, J R; Lapunzina, P; Lemyre, E; Meire, F; Messiaen, L M; Oley, C; Splitt, M; Thomson, J; Van de Peer, Y; Veitia, R A; De Paepe, A; De Baere, E

    2005-08-01

    The expression of a gene requires not only a normal coding sequence but also intact regulatory regions, which can be located at large distances from the target genes, as demonstrated for an increasing number of developmental genes. In previous mutation studies of the role of FOXL2 in blepharophimosis syndrome (BPES), we identified intragenic mutations in 70% of our patients. Three translocation breakpoints upstream of FOXL2 in patients with BPES suggested a position effect. Here, we identified novel microdeletions outside of FOXL2 in cases of sporadic and familial BPES. Specifically, four rearrangements, with an overlap of 126 kb, are located 230 kb upstream of FOXL2, telomeric to the reported translocation breakpoints. Moreover, the shortest region of deletion overlap (SRO) contains several conserved nongenic sequences (CNGs) harboring putative transcription-factor binding sites and representing potential long-range cis-regulatory elements. Interestingly, the human region orthologous to the 12-kb sequence deleted in the polled intersex syndrome in goat, which is an animal model for BPES, is contained in this SRO, providing evidence of human-goat conservation of FOXL2 expression and of the mutational mechanism. Surprisingly, in a fifth family with BPES, one rearrangement was found downstream of FOXL2. In addition, we report nine novel rearrangements encompassing FOXL2 that range from partial gene deletions to submicroscopic deletions. Overall, genomic rearrangements encompassing or outside of FOXL2 account for 16% of all molecular defects found in our families with BPES. In summary, this is the first report of extragenic deletions in BPES, providing further evidence of potential long-range cis-regulatory elements regulating FOXL2 expression. It contributes to the enlarging group of developmental diseases caused by defective distant regulation of gene expression. Finally, we demonstrate that CNGs are candidate regions for genomic rearrangements in developmental

  11. Deletions Involving Long-Range Conserved Nongenic Sequences Upstream and Downstream of FOXL2 as a Novel Disease-Causing Mechanism in Blepharophimosis Syndrome

    PubMed Central

    Beysen, D.; Raes, J.; Leroy, B. P.; Lucassen, A.; Yates, J. R. W.; Clayton-Smith, J.; Ilyina, H.; Brooks, S. Sklower; Christin-Maitre, S.; Fellous, M.; Fryns, J. P.; Kim, J. R.; Lapunzina, P.; Lemyre, E.; Meire, F.; Messiaen, L. M.; Oley, C.; Splitt, M.; Thomson, J.; Peer, Y. Van de; Veitia, R. A.; De Paepe, A.; De Baere, E.

    2005-01-01

    The expression of a gene requires not only a normal coding sequence but also intact regulatory regions, which can be located at large distances from the target genes, as demonstrated for an increasing number of developmental genes. In previous mutation studies of the role of FOXL2 in blepharophimosis syndrome (BPES), we identified intragenic mutations in 70% of our patients. Three translocation breakpoints upstream of FOXL2 in patients with BPES suggested a position effect. Here, we identified novel microdeletions outside of FOXL2 in cases of sporadic and familial BPES. Specifically, four rearrangements, with an overlap of 126 kb, are located 230 kb upstream of FOXL2, telomeric to the reported translocation breakpoints. Moreover, the shortest region of deletion overlap (SRO) contains several conserved nongenic sequences (CNGs) harboring putative transcription-factor binding sites and representing potential long-range cis-regulatory elements. Interestingly, the human region orthologous to the 12-kb sequence deleted in the polled intersex syndrome in goat, which is an animal model for BPES, is contained in this SRO, providing evidence of human-goat conservation of FOXL2 expression and of the mutational mechanism. Surprisingly, in a fifth family with BPES, one rearrangement was found downstream of FOXL2. In addition, we report nine novel rearrangements encompassing FOXL2 that range from partial gene deletions to submicroscopic deletions. Overall, genomic rearrangements encompassing or outside of FOXL2 account for 16% of all molecular defects found in our families with BPES. In summary, this is the first report of extragenic deletions in BPES, providing further evidence of potential long-range cis-regulatory elements regulating FOXL2 expression. It contributes to the enlarging group of developmental diseases caused by defective distant regulation of gene expression. Finally, we demonstrate that CNGs are candidate regions for genomic rearrangements in developmental

  12. Fetal akinesia deformation sequence in a highly developed acardius twin.

    PubMed

    Konstantinidou, A E; Agapitos, E V; Pavlopoulos, P M; Davaris, P S

    1997-10-01

    We report a case of a holoacardius twin with extremely advanced development of the head, face, upper and lower limbs in the absence of all thoracic and upper abdominal viscera and associated with intestinal and anal atresia. The malformed fetus also had craniofacial abnormalities, hydrops, cystic hygroma of the neck, arthrogryposis and pterygia. The monozygous co-twin was found to be normal. The association of acardia with the typical characteristics of the fetal akinesia deformation sequence has not been previously described in the literature.

  13. Efficient screening of long terminal repeat retrotransposons that show high insertion polymorphism via high-throughput sequencing of the primer binding site.

    PubMed

    Monden, Yuki; Fujii, Nobuyuki; Yamaguchi, Kentaro; Ikeo, Kazuho; Nakazawa, Yoshiko; Waki, Takamitsu; Hirashima, Keita; Uchimura, Yosuke; Tahara, Makoto

    2014-05-01

    Retrotransposons have been used frequently for the development of molecular markers by using their insertion polymorphisms among cultivars, because multiple copies of these elements are dispersed throughout the genome and inserted copies are inherited genetically. Although a large number of long terminal repeat (LTR) retrotransposon families exist in higher eukaryotic genomes, the identification of families that show high insertion polymorphism has been challenging. Here, we performed an efficient screening of these retrotransposon families using an Illumina HiSeq2000 sequencing platform with comprehensive LTR library construction based on the primer binding site (PBS), which is located adjacent to the 5' LTR and has a motif that is universal and conserved among LTR retrotransposon families. The paired-end sequencing library of the fragments containing a large number of LTR sequences and their insertion sites was sequenced for seven strawberry (Fragaria × ananassa Duchesne) cultivars and one diploid wild species (Fragaria vesca L.). Among them, we screened 24 families with a "unique" insertion site that appeared only in one cultivar and not in any others, assuming that this type of insertion should have occurred quite recently. Finally, we confirmed experimentally that the selected LTR families showed high insertion polymorphisms among closely related cultivars.

  14. Effects of the Conservation Reserve Program on Hydrologic Processes in the Southern High Plains

    NASA Astrophysics Data System (ADS)

    Haacker, E. M.; Smidt, S. J.; Kendall, A. D.; Basso, B.; Hyndman, D. W.

    2015-12-01

    The Southern High Plains Aquifer is a rapidly depleting resource that supports agriculture in parts of New Mexico and the Texas Panhandle. The development of the aquifer has changed the landscape and the water cycle of the region. This study illustrates the evolving patterns of land use and the effects of cultivation, from irrigated to dryland farming to the countermanding influence of the Conservation Reserve Program (CRP). Previous research indicates that greater recharge rates occur under cultivated land in the Southern High Plains than under unbroken soil: the transition to cultivation causes increased recharge, under both dryland and irrigated management, though most recharge still occurs through playa lakes. The Conservation Reserve Program takes land out of crop production, replacing the land cover with something more like the natural ecosystem. This may decrease recharge below fields, and reduce runoff that feeds playa lakes; or, CRP may help stabilize playa lakes, increasing recharge. Changes to the water cycle are investigated at the field scale using the System Approach to Land Use Sustainability (SALUS) crop model, and at the regional scale with the Landscape Hydrology Model (LHM), and compared with historical data and water table elevations.

  15. Nosema ceranae alters a highly conserved hormonal stress pathway in honeybees.

    PubMed

    Mayack, C; Natsopoulou, M E; McMahon, D P

    2015-12-01

    Nosema ceranae, an emerging pathogen of the western honeybee (Apis mellifera), is implicated in recent pollinator losses and causes severe energetic stress. However, whether precocious foraging and accelerated behavioural maturation in infected bees are caused by the infection itself or via indirect energetic stress remains unknown. Using a combination of nutritional and infection treatments, we investigated how starvation and infection alter the regulation of adipokinetic hormone (AKH) and octopamine, two highly conserved physiological pathways that respond to energetic stress by mobilizing fat stores and increasing search activity for food. Although there was no response from AKH when bees were experimentally infected with N. ceranae or starved, supporting the notion that honeybees have lost this pathway, there were significant regulatory changes in the octopamine pathway. Significantly, we found no evidence of acute energetic stress being the only cause of symptoms associated with N. ceranae infection. Therefore, the parasite itself appears to alter regulatory components along a highly conserved physiological pathway in an infection-specific manner. This indicates that pathogen-induced behavioural alteration of chronically infected bees should not just be viewed as a coincidental short-term by-product of pathogenesis (acute energetic stress) and may be a result of a generalist manipulation strategy to obtain energy for reproduction.

  16. Highly conserved low-copy nuclear genes as effective markers for phylogenetic analyses in angiosperms.

    PubMed

    Zhang, Ning; Zeng, Liping; Shan, Hongyan; Ma, Hong

    2012-09-01

    Organismal phylogeny provides a crucial evolutionary framework for many studies and the angiosperm phylogeny has been greatly improved recently, largely using organellar and rDNA genes. However, low-copy protein-coding nuclear genes have not been widely used on a large scale in spite of the advantages of their biparental inheritance and vast number of choices. Here, we identified 1083 highly conserved low-copy nu