Science.gov

Sample records for acid sequence variations

  1. Amino acid repeats cause extraordinary coding sequence variation in the social amoeba Dictyostelium discoideum.

    PubMed

    Scala, Clea; Tian, Xiangjun; Mehdiabadi, Natasha J; Smith, Margaret H; Saxer, Gerda; Stephens, Katie; Buzombo, Prince; Strassmann, Joan E; Queller, David C

    2012-01-01

    Protein sequences are normally the most conserved elements of genomes owing to purifying selection to maintain their functions. We document an extraordinary amount of within-species protein sequence variation in the model eukaryote Dictyostelium discoideum stemming from triplet DNA repeats coding for long strings of single amino acids. D. discoideum has a very large number of such strings, many of which are polyglutamine repeats, the same sequence that causes various neurological disorders in humans, such as Huntington's disease. We show here that D. discoideum coding repeat loci are highly variable among individuals, making D. discoideum a candidate for the most variable proteome. The coding repeat loci are not significantly less variable than similar non-coding triplet repeats. This pattern is consistent with these amino-acid repeats being largely non-functional sequences evolving primarily by mutation and drift. PMID:23029418
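As a rough illustration of the "long strings of single amino acids" described in this record, the sketch below scans a protein sequence for homopolymer runs. The function name, the `min_len` threshold, and the example sequence are assumptions made for illustration; they are not taken from the study.

```python
import re

def homopolymer_runs(protein, min_len=5):
    """Return (residue, start, length) for each run of one repeated amino acid.

    min_len is an illustrative threshold, not a value from the paper.
    """
    runs = []
    # (.)\1+ matches any residue followed by one or more copies of itself
    for match in re.finditer(r"(.)\1+", protein):
        length = match.end() - match.start()
        if length >= min_len:
            runs.append((match.group(1), match.start(), length))
    return runs

# Hypothetical sequence with a polyglutamine (Q) and a polyasparagine (N) run
example = "MK" + "Q" * 8 + "LS" + "N" * 6 + "AT"
print(homopolymer_runs(example))
```

Comparing the run lengths reported at the same locus across individuals would be one simple way to quantify the within-species variation the abstract refers to.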

  2. DNA Sequence and Expression Variation of Hop (Humulus lupulus) Valerophenone Synthase (VPS), a Key Gene in Bitter Acid Biosynthesis

    PubMed Central

    Castro, Consuelo B.; Whittock, Lucy D.; Whittock, Simon P.; Leggett, Grey; Koutoulis, Anthony

    2008-01-01

    Background The hop plant (Humulus lupulus) is a source of many secondary metabolites, with bitter acids essential in the beer brewing industry and others having potential applications for human health. This study investigated variation in DNA sequence and gene expression of valerophenone synthase (VPS), a key gene in the bitter acid biosynthesis pathway of hop. Methods Sequence variation was studied in 12 varieties, and expression was analysed in four of the 12 varieties in a series across the development of the hop cone. Results Nine single nucleotide polymorphisms (SNPs) were detected in VPS, seven of which were synonymous. The two non-synonymous polymorphisms did not appear to be related to typical bitter acid profiles of the varieties studied. However, real-time quantitative reverse-transcription polymerase chain reaction (qRT-PCR) analysis of VPS expression during hop cone development showed a clear link with the bitter acid content. The highest levels of VPS expression were observed in two triploid varieties, ‘Symphony’ and ‘Ember’, which typically have high bitter acid levels. Conclusions In all hop varieties studied, VPS expression was lowest in the leaves and an increase in expression was consistently observed during the early stages of cone development. PMID:18519445
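The distinction between synonymous and non-synonymous SNPs drawn in this record can be sketched as follows, assuming the standard genetic code and an in-frame coding sequence. The function name `classify_snp` and the short example sequence are hypothetical; the actual VPS sequence is not reproduced here.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (first base varies slowest)
BASES = "TCAG"
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def classify_snp(cds, pos, alt):
    """Classify a single-base change at 0-based position pos of an in-frame CDS."""
    start = pos - pos % 3                      # first base of the affected codon
    ref_codon = cds[start:start + 3]
    offset = pos % 3
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    same = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
    return "synonymous" if same else "non-synonymous"

cds = "ATGGCTGAA"                   # hypothetical CDS encoding Met-Ala-Glu
print(classify_snp(cds, 5, "C"))    # GCT -> GCC, both alanine
print(classify_snp(cds, 6, "A"))    # GAA -> AAA, glutamate to lysine
```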

  3. Fad7 gene identification and fatty acids phenotypic variation in an olive collection by EcoTILLING and sequencing approaches.

    PubMed

    Sabetta, Wilma; Blanco, Antonio; Zelasco, Samanta; Lombardo, Luca; Perri, Enzo; Mangini, Giacomo; Montemurro, Cinzia

    2013-08-01

    The ω-3 fatty acid desaturases (FADs) are enzymes, localized in the plastid or in the endoplasmic reticulum, that catalyze the conversion of linoleic acid to α-linolenic acid. In this research we report the genotypic and phenotypic variation of Italian Olea europaea L. germplasm for fatty acid composition. The phenotypic oil characterization was followed by the molecular analysis of the plastidial-type ω-3 FAD gene (fad7) (EC 1.14.19), whose full-length sequence is identified here in cultivar Leccino. The gene consisted of 2635 bp with 8 exons and 5'- and 3'-UTRs of 336 and 282 bp respectively, and showed a high level of heterozygosity (1/110 bp). The natural allelic variation was investigated both by a LiCOR EcoTILLING assay and by direct sequencing of the PCR product. Only three haplotypes were identified among the 96 analysed cultivars, highlighting the strong degree of conservation of this gene. PMID:23685785

  4. Sequence variation in transcription factor IIIA.

    PubMed Central

    Gaskins, C J; Hanas, J S

    1990-01-01

    Previous studies characterized macromolecular differences between Xenopus and Rana transcription factor IIIA (TFIIIA) (Gaskins et al., 1989, Nucl. Acids Res. 17, 781-794). In the present study, cDNAs for TFIIIA from Xenopus borealis and Rana catesbeiana (American bullfrog) were cloned and sequenced in order to gain molecular insight into the structure, function, and species variation of TFIIIA and the TFIIIA-type zinc finger. X. borealis and R. catesbeiana TFIIIAs have 339 and 335 amino acids respectively, 5 and 9 fewer than X. laevis TFIIIA. X. borealis TFIIIA exhibited 84% sequence homology (55 amino acid differences) with X. laevis TFIIIA and R. catesbeiana TFIIIA exhibited 63% homology (128 amino acid changes) with X. laevis TFIIIA. This sequence variation is not random; the C-terminal halves of these TFIIIAs contain substantially more non-conservative changes than the N-terminal halves. In particular, the N-terminal region of TFIIIA (that region forming strong DNA contacts) is the most conserved and the C-terminal tail (that region involved in transcription promotion) the most divergent. Hydropathy analyses of these sequences revealed zinc finger periodicity in the N-terminal halves, extreme hydrophilicity in the C-terminal halves, and a different C-terminal tail hydropathy for R. catesbeiana TFIIIA. Although considerable sequence variation exists in these TFIIIA zinc fingers, the Cys/His, Tyr/Phe and Leu residues are strictly conserved between X. laevis and X. borealis. Strict conservation of only the Cys/His motif is observed between X. laevis and R. catesbeiana TFIIIA. Overall, Cys/His zinc fingers in TFIIIA are much less conserved than Cys/Cys fingers in erythroid transcription factor (Eryf 1) and also less conserved than homeo box domains in segmentation genes. The collective evidence indicates that TFIIIA evolved from a common precursor containing up to 12 finger domains which subsequently evolved at different rates. PMID:2110661

  5. Detection of nucleic acid sequences by invader-directed cleavage

    DOEpatents

    Brow, Mary Ann D.; Hall, Jeff Steven Grotelueschen; Lyamichev, Victor; Olive, David Michael; Prudent, James Robert

    1999-01-01

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The 5' nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge.

  6. Composition for nucleic acid sequencing

    SciTech Connect

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2008-08-26

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  7. Sequence variations in the FAD2 gene in seeded pumpkins.

    PubMed

    Ge, Y; Chang, Y; Xu, W L; Cui, C S; Qu, S P

    2015-12-21

    Seeded pumpkins are important economic crops; the seeds contain various unsaturated fatty acids, such as oleic acid and linoleic acid, which are crucial for human and animal nutrition. The fatty acid desaturase-2 (FAD2) gene encodes delta-12 desaturase, which converts oleic acid to linoleic acid. However, little is known about sequence variations in FAD2 in seeded pumpkins. Twenty-seven FAD2 clones from 27 accessions of Cucurbita moschata, Cucurbita maxima, Cucurbita pepo, and Cucurbita ficifolia were obtained (1152 bp in total; a single gene without introns). More than 90% nucleotide identity was detected among the 27 FAD2 clones. Nucleotide substitution, rather than nucleotide insertion and deletion, led to sequence polymorphism in the 27 FAD2 clones. Furthermore, the 27 selected FAD2 clones all encoded the FAD2 enzyme (delta-12 desaturase) with amino acid sequence identities from 91.7 to 100% for 384 amino acids. The same main-function domain between 47 and 329 amino acids was identified. The four species clustered separately based on differences in the sequences that were identified using the unweighted pair group method with arithmetic mean. Geographic origin and species were found to be closely related to sequence variation in FAD2.
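The percent-identity figures quoted in this record can be illustrated with a minimal sketch, assuming two pre-aligned sequences of equal length with no gaps (the abstract attributes the polymorphism to substitutions rather than insertions or deletions). The function name and the toy sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two equal-length aligned sequences match."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(x == y for x, y in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)

# Toy example: 7 of 8 positions agree
print(percent_identity("ATGCATGC", "ATGCATGA"))
```

Under this definition, an identity of 91.7% over 384 amino acids would correspond to roughly 32 differing residues (352 of 384 matching). Pairwise distances derived this way are the usual input to a UPGMA clustering such as the one mentioned above.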

  8. Sequence variations in the FAD2 gene in seeded pumpkins.

    PubMed

    Ge, Y; Chang, Y; Xu, W L; Cui, C S; Qu, S P

    2015-01-01

    Seeded pumpkins are important economic crops; the seeds contain various unsaturated fatty acids, such as oleic acid and linoleic acid, which are crucial for human and animal nutrition. The fatty acid desaturase-2 (FAD2) gene encodes delta-12 desaturase, which converts oleic acid to linoleic acid. However, little is known about sequence variations in FAD2 in seeded pumpkins. Twenty-seven FAD2 clones from 27 accessions of Cucurbita moschata, Cucurbita maxima, Cucurbita pepo, and Cucurbita ficifolia were obtained (1152 bp in total; a single gene without introns). More than 90% nucleotide identity was detected among the 27 FAD2 clones. Nucleotide substitution, rather than nucleotide insertion and deletion, led to sequence polymorphism in the 27 FAD2 clones. Furthermore, the 27 selected FAD2 clones all encoded the FAD2 enzyme (delta-12 desaturase) with amino acid sequence identities from 91.7 to 100% for 384 amino acids. The same main-function domain between 47 and 329 amino acids was identified. The four species clustered separately based on differences in the sequences that were identified using the unweighted pair group method with arithmetic mean. Geographic origin and species were found to be closely related to sequence variation in FAD2. PMID:26782391

  9. High speed nucleic acid sequencing

    SciTech Connect

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2011-05-17

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid. Each type of labeled nucleotide comprises an acceptor fluorophore attached to a phosphate portion of the nucleotide such that the fluorophore is removed upon incorporation into a growing strand. Fluorescent signal is emitted via fluorescent resonance energy transfer between the donor fluorophore and the acceptor fluorophore as each nucleotide is incorporated into the growing strand. The sequence is deduced by identifying which base is being incorporated into the growing strand.

  10. Variation in seed fatty acid composition and sequence divergence in the FAD2 gene coding region between wild and cultivated sesame.

    PubMed

    Chen, Zhenbang; Tonnis, Brandon; Morris, Brad; Wang, Richard B; Zhang, Amy L; Pinnow, David; Wang, Ming Li

    2014-12-01

    Sesame germplasm harbors genetic diversity which can be useful for sesame improvement in breeding programs. Seven accessions with different levels of oleic acid were selected from the entire USDA sesame germplasm collection (1232 accessions) and planted for morphological observation and re-examination of fatty acid composition. The coding region of the FAD2 gene for fatty acid desaturase (FAD) in these accessions was also sequenced. Cultivated sesame accessions flowered and matured earlier than the wild species. The cultivated sesame seeds contained a significantly higher percentage of oleic acid (40.4%) than the seeds of the wild species (26.1%). Nucleotide polymorphisms were identified in the FAD2 gene coding region between wild and cultivated species. Some nucleotide polymorphisms led to amino acid changes, one of which was located in the enzyme active site and may contribute to the altered fatty acid composition. Based on the morphology observation, chemical analysis, and sequence analysis, it was determined that two accessions were misnamed and need to be reclassified. The results obtained from this study are useful for sesame improvement in molecular breeding programs.

  11. The regions of sequence variation in caulimovirus gene VI.

    PubMed

    Sanger, M; Daubert, S; Goodman, R M

    1991-06-01

    The sequence of gene VI from figwort mosaic virus (FMV) clone x4 was determined and compared with that previously published for FMV clone DxS. Both clones originated from the same virus isolation, but the virus used to clone DxS was propagated extensively in a host of a different family prior to cloning whereas that used to clone x4 was not. Differences in the amino acid sequence inferred from the DNA sequences occurred in two clusters. An N-terminal conserved region preceded two regions of variation separated by a central conserved region. Variation in cauliflower mosaic virus (CaMV) gene VI sequences, all of which were derived from virus isolates from hosts from one host family, was similar to that seen in the FMV comparison, though the extent of variation was less. Alignment of gene VI domains from FMV and CaMV revealed regions of amino acid sequence identical in both viruses within the conserved regions. The similarity in the pattern of conserved and variable domains of these two viruses suggests common host-interactive functions in caulimovirus gene VI homologues, and possibly an analogy between caulimoviruses and certain animal viruses in the influence of the host on sequence variability of viral genes.

  12. A case study on the genetic origin of the high oleic acid trait through FAD2-1 DNA sequence variation in safflower (Carthamus tinctorius L.).

    PubMed

    Rapson, Sara; Wu, Man; Okada, Shoko; Das, Alpana; Shrestha, Pushkar; Zhou, Xue-Rong; Wood, Craig; Green, Allan; Singh, Surinder; Liu, Qing

    2015-01-01

    The safflower (Carthamus tinctorius L.) is considered a strongly domesticated species with a long history of cultivation. The hybridization of safflower with its wild relatives has played an important role in the evolution of cultivars and is of particular interest with regards to their production of high quality edible oils. Original safflower varieties were all rich in linoleic acid, while varieties rich in oleic acid have risen to prominence in recent decades. The high oleic acid trait is controlled by a partially recessive allele ol at a single locus OL. The ol allele was found to be a defective microsomal oleate desaturase FAD2-1. Here we present DNA sequence data and Southern blot analysis suggesting that there has been an ancient hybridization and introgression of the FAD2-1 gene into C. tinctorius from its wild relative C. palaestinus. It is from this gene that FAD2-1Δ was derived more recently. Identification and characterization of the genetic origin and diversity of FAD2-1 could aid safflower breeders in reducing population size and generations required for the development of new high oleic acid varieties by using perfect molecular marker-assisted selection.

  13. A case study on the genetic origin of the high oleic acid trait through FAD2-1 DNA sequence variation in safflower (Carthamus tinctorius L.)

    PubMed Central

    Rapson, Sara; Wu, Man; Okada, Shoko; Das, Alpana; Shrestha, Pushkar; Zhou, Xue-Rong; Wood, Craig; Green, Allan; Singh, Surinder; Liu, Qing

    2015-01-01

    The safflower (Carthamus tinctorius L.) is considered a strongly domesticated species with a long history of cultivation. The hybridization of safflower with its wild relatives has played an important role in the evolution of cultivars and is of particular interest with regards to their production of high quality edible oils. Original safflower varieties were all rich in linoleic acid, while varieties rich in oleic acid have risen to prominence in recent decades. The high oleic acid trait is controlled by a partially recessive allele ol at a single locus OL. The ol allele was found to be a defective microsomal oleate desaturase FAD2-1. Here we present DNA sequence data and Southern blot analysis suggesting that there has been an ancient hybridization and introgression of the FAD2-1 gene into C. tinctorius from its wild relative C. palaestinus. It is from this gene that FAD2-1Δ was derived more recently. Identification and characterization of the genetic origin and diversity of FAD2-1 could aid safflower breeders in reducing population size and generations required for the development of new high oleic acid varieties by using perfect molecular marker-assisted selection. PMID:26442008

  14. A case study on the genetic origin of the high oleic acid trait through FAD2-1 DNA sequence variation in safflower (Carthamus tinctorius L.).

    PubMed

    Rapson, Sara; Wu, Man; Okada, Shoko; Das, Alpana; Shrestha, Pushkar; Zhou, Xue-Rong; Wood, Craig; Green, Allan; Singh, Surinder; Liu, Qing

    2015-01-01

    The safflower (Carthamus tinctorius L.) is considered a strongly domesticated species with a long history of cultivation. The hybridization of safflower with its wild relatives has played an important role in the evolution of cultivars and is of particular interest with regards to their production of high quality edible oils. Original safflower varieties were all rich in linoleic acid, while varieties rich in oleic acid have risen to prominence in recent decades. The high oleic acid trait is controlled by a partially recessive allele ol at a single locus OL. The ol allele was found to be a defective microsomal oleate desaturase FAD2-1. Here we present DNA sequence data and Southern blot analysis suggesting that there has been an ancient hybridization and introgression of the FAD2-1 gene into C. tinctorius from its wild relative C. palaestinus. It is from this gene that FAD2-1Δ was derived more recently. Identification and characterization of the genetic origin and diversity of FAD2-1 could aid safflower breeders in reducing population size and generations required for the development of new high oleic acid varieties by using perfect molecular marker-assisted selection. PMID:26442008

  15. Indole acetic acid production by fluorescent Pseudomonas spp. from the rhizosphere of Plectranthus amboinicus (Lour.) Spreng. and their variation in extragenic repetitive DNA sequences.

    PubMed

    Sethia, Bedhya; Mustafa, Mariam; Manohar, Sneha; Patil, Savita V; Jayamohan, Nellickal Subramanian; Kumudini, Belur Satyan

    2015-06-01

    Fluorescent Pseudomonas (FP) is a heterogeneous group of growth promoting rhizobacteria that regulate plant growth by releasing secondary metabolic compounds viz., indole acetic acid (IAA), siderophores, ammonia and hydrogen cyanide. In the present study, IAA producing FPs from the rhizosphere of Plectranthus amboinicus were characterized morphologically, biochemically and at the molecular level. Molecular identification of the isolates was carried out using Pseudomonas specific primers. The effect of varying time (24, 48, 72 and 96 h), Trp concentration (100, 200, 300, 400 and 500 μg/ml), temperature (10, 26, 37 and 50 ± 2 °C) and pH (6, 7 and 8) on IAA production by the 10 best isolates was studied. Results showed higher IAA production at 72 h incubation, at 300 μg/ml Trp concentration, temperature 26 ± 2 °C and pH 7. TLC with acidified ethyl acetate extract showed that the IAA produced has a similar Rf value to that of the standard IAA. Results of TLC were confirmed by HPLC analysis. Genetic diversity of the isolates was also studied using 40 RAPD and 4 Rep primers. Genetic diversity parameters such as dominance, Shannon index and Simpson index were calculated. Out of 40 RAPD primers tested, 9 (2 OP-D series and 7 OP-E series) were shortlisted for further analysis. Studies using RAPD, ERIC, BOX, REP and GTG5 primers revealed that isolates exhibit significant diversity in repetitive DNA sequences irrespective of the rhizosphere. PMID:26155673
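The diversity parameters named in this record (dominance, Shannon index, Simpson index) can be sketched with their textbook definitions. The abstract does not state which variants the authors used, so the formulas below are assumptions: Shannon as the negative sum of p ln p, dominance as the sum of squared proportions, and Simpson as one minus dominance. The labels in the example are hypothetical banding-pattern groups.

```python
import math
from collections import Counter

def diversity_indices(labels):
    """Shannon index, dominance and Simpson index (1 - dominance) for a list of group labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    proportions = [c / total for c in counts.values()]
    shannon = -sum(p * math.log(p) for p in proportions)   # natural log
    dominance = sum(p * p for p in proportions)
    return {"shannon": shannon, "dominance": dominance, "simpson": 1.0 - dominance}

# Hypothetical grouping of ten isolates into three fingerprint patterns
groups = ["P1"] * 5 + ["P2"] * 3 + ["P3"] * 2
print(diversity_indices(groups))
```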

  16. Chip-based sequencing nucleic acids

    SciTech Connect

    Beer, Neil Reginald

    2014-08-26

    A system for fast DNA sequencing by amplification of genetic material within microreactors, denaturing, demulsifying, and then sequencing the material, while retaining it in a PCR/sequencing zone by a magnetic field. One embodiment includes sequencing nucleic acids on a microchip that includes a microchannel flow channel in the microchip. The nucleic acids are isolated and hybridized to magnetic nanoparticles or to magnetic polystyrene-coated beads. Microreactor droplets are formed in the microchannel flow channel. The microreactor droplets containing the nucleic acids and the magnetic nanoparticles are retained in a magnetic trap in the microchannel flow channel and sequenced.

  17. Sequence Variation Within the Fragile X Locus

    PubMed Central

    Mathews, Debra J.; Kashuk, Carl; Brightwell, Gale; Eichler, Evan E.; Chakravarti, Aravinda

    2001-01-01

    The human genome provides a reference sequence, which is a template for resequencing studies that aim to discover and interpret the record of common ancestry that exists in extant genomes. To understand the nature and pattern of variation and linkage disequilibrium comprising this history, we present a study of ∼31 kb spanning an ∼70 kb region of FMR1, sequenced in a sample of 20 humans (worldwide sample) and four great apes (chimp, bonobo, and gorilla). Twenty-five polymorphic sites and two insertion/deletions, distributed in 11 unique haplotypes, were identified among humans. Africans are the only geographic group that do not share any haplotypes with other groups. Parsimony analysis reveals two main clades and suggests that the four major human geographic groups are distributed throughout the phylogenetic tree and within each major clade. An African sample appears to be most closely related to the common ancestor shared with the three other geographic groups. Nucleotide diversity, π, for this sample is 2.63 ± 6.28 × 10⁻⁴. The mutation rate, μ, is 6.48 × 10⁻¹⁰ per base pair per year, giving an ancestral population size of ∼6200 and a time to the most recent common ancestor of ∼320,000 ± 72,000 years. Linkage disequilibrium (LD) at the FMR1 locus, evaluated by conventional LD analysis and by the length of segment shared between any two chromosomes, is extensive across the region. PMID:11483579
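Nucleotide diversity, π, as reported in this record, is conventionally the average number of pairwise differences per site across all pairs of sampled sequences. The sketch below assumes equal-length, gap-free aligned sequences; the function name and the toy alignment are hypothetical, and the estimator the authors actually used may differ in detail.

```python
from itertools import combinations

def nucleotide_diversity(seqs):
    """Average pairwise differences per site over all pairs of aligned sequences."""
    n = len(seqs)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    total_diffs = sum(
        sum(a != b for a, b in zip(s, t))
        for s, t in combinations(seqs, 2)
    )
    n_pairs = n * (n - 1) // 2
    return total_diffs / (n_pairs * length)

# Toy alignment of three 4-bp sequences: pairwise differences are 1, 2 and 1
print(nucleotide_diversity(["AAAA", "AAAT", "AATT"]))
```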

  18. Distinguishing proteins from arbitrary amino acid sequences.

    PubMed

    Yau, Stephen S-T; Mao, Wei-Guang; Benson, Max; He, Rong Lucy

    2015-01-01

    What kinds of amino acid sequences could possibly be protein sequences? From all existing databases that we can find, known proteins are only a small fraction of all possible combinations of amino acids. Beginning with Sanger's first detailed determination of a protein sequence in 1952, previous studies have focused on describing the structure of existing protein sequences in order to construct the protein universe. No one, however, has developed a criterion for determining whether an arbitrary amino acid sequence can be a protein. Here we show that when the collection of arbitrary amino acid sequences is viewed in an appropriate geometric context, the protein sequences cluster together. This leads to a new computational test, described here, that has proved to be remarkably accurate at determining whether an arbitrary amino acid sequence can be a protein. Even more, if the results of this test indicate that the sequence can be a protein, and it is indeed a protein sequence, then its identity as a protein sequence is uniquely defined. We anticipate our computational test will be useful for those who are attempting to complete the job of discovering all proteins, or constructing the protein universe. PMID:25609314

  19. Method for sequencing nucleic acid molecules

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2006-06-06

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  20. Method for sequencing nucleic acid molecules

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2006-05-30

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  1. Variation in Seed Fatty Acid Composition, and Sequence Divergence in the FAD2 Gene Coding Region between Wild and Cultivated Sesame

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Sesame germplasm harbors genetic diversity which can be useful for sesame improvement in breeding programs. Seven accessions with different levels of oleic acid were selected from the entire USDA sesame germplasm collection (1232 accessions) and planted for morphological observation and re-examinati...

  2. Mitochondrial DNA sequence variation in multiple sclerosis

    PubMed Central

    Santaniello, Adam; Caillier, Stacy J.; D'Alfonso, Sandra; Martinelli Boneschi, Filippo; Hauser, Stephen L.; Oksenberg, Jorge R.

    2015-01-01

    Objective: To assess the influence of common mitochondrial DNA (mtDNA) sequence variation on multiple sclerosis (MS) risk in cases and controls part of an international consortium. Methods: We analyzed 115 high-quality mtDNA variants and common haplogroups from a previously published genome-wide association study among 7,391 cases from the International Multiple Sclerosis Genetics Consortium and 14,568 controls from the Wellcome Trust Case Control Consortium 2 project from 7 countries. Significant single nucleotide polymorphism and haplogroup associations were replicated in 3,720 cases and 879 controls from the University of California, San Francisco. Results: An elevated risk of MS was detected among haplogroup JT carriers from 7 pooled clinic sites (odds ratio [OR] = 1.15, 95% confidence interval [CI] = 1.07–1.24, p = 0.0002) included in the discovery study. The increased risk of MS was observed for both haplogroup T (OR = 1.17, 95% CI = 1.06–1.29, p = 0.002) and haplogroup J carriers (OR = 1.11, 95% CI = 1.01–1.22, p = 0.03). These haplogroup associations with MS were not replicated in the independent sample set. An elevated risk of primary progressive (PP) MS was detected for haplogroup J participants from 3 European discovery populations (OR = 1.49, 95% CI = 1.10–2.01, p = 0.009). This elevated risk was borderline significant in the US replication population (OR = 1.43, 95% CI = 0.99–2.08, p = 0.058) and remained significant in pooled analysis of discovery and replication studies (OR = 1.43, 95% CI = 1.14–1.81, p = 0.002). No common individual mtDNA variants were associated with MS risk. Conclusions: Identification and validation of mitochondrial genetic variants associated with MS and PPMS may lead to new targets for treatment and diagnostic tests for identifying potential responders to interventions that target mitochondria. PMID:26136518
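The odds ratios and 95% confidence intervals reported in this record follow a standard form that can be sketched from a 2×2 table of carriers and non-carriers among cases and controls. The sketch uses the Wald interval on the log odds ratio, which is an assumption (the study may have used adjusted or model-based estimates), and the counts are made up for illustration.

```python
import math

def odds_ratio_ci(case_exposed, case_unexposed, ctrl_exposed, ctrl_unexposed, z=1.96):
    """Odds ratio with a Wald confidence interval (z=1.96 gives about 95%)."""
    a, b, c, d = case_exposed, case_unexposed, ctrl_exposed, ctrl_unexposed
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)   # SE of ln(OR)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

# Hypothetical counts: 20 of 100 cases and 10 of 100 controls carry the haplogroup
or_, lo, hi = odds_ratio_ci(20, 80, 10, 90)
print(f"OR = {or_:.2f}, 95% CI = {lo:.2f}-{hi:.2f}")
```

An interval that includes 1, as in the replication result quoted above (0.99 to 2.08), is what makes an association borderline rather than significant at the 5% level.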

  3. Genome nucleotide composition shapes variation in simple sequence repeats.

    PubMed

    Tian, Xiangjun; Strassmann, Joan E; Queller, David C

    2011-02-01

    Simple sequence repeats (SSRs) or microsatellites are a common component of genomes but vary greatly across species in their abundance. We tested the hypothesis that this variation is due in part to AT/GC content of genomes, with genomes biased toward either high AT or high GC generating more short random repeats that are long enough to enhance expansion through slippage during replication. To test this hypothesis, we identified repeats with perfect tandem iterations of 1-6 bp from 25 protists with complete or near-complete genome sequences. As expected, the density and the frequency are highly related to genome AT content, with excellent fits to quadratic regressions with minima near a 50% AT content and rising toward both extremes. Within species, the same trends hold, except the limited variation in AT content within each species places each mainly on the descending (GC rich), middle, or ascending (AT rich) part of the curve. The base usages of repeat motifs are also significantly correlated with genome nucleotide compositions: Percentages of AT-rich motifs rise with the increase of genome AT content but vice versa for GC-rich subgroups. Amino acid homopolymer repeats also show the expected quadratic relationship, with higher abundance in species with AT content biased in either direction. Our results show that genome nucleotide composition explains up to half of the variance in the abundance and motif constitution of SSRs.
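
    As an illustration of what "perfect tandem iterations of 1-6 bp" means in practice, the following sketch scans a DNA string for such repeats with a regular expression. The function name and the minimum copy number (`min_copies=3`) are assumptions made for this example; the abstract does not state the thresholds or the software the authors used.

```python
import re


def find_perfect_ssrs(seq, min_unit=1, max_unit=6, min_copies=3):
    """Return sorted (start, motif, copies) tuples for perfect tandem repeats."""
    seq = seq.upper()
    hits = []
    for unit in range(min_unit, max_unit + 1):
        # A motif of `unit` bases followed by at least (min_copies - 1) exact copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_copies - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip motifs that are themselves repeats of a shorter unit
            # (e.g. "TT" is already reported as a run of "T").
            if any(unit % k == 0 and motif == motif[:k] * (unit // k)
                   for k in range(1, unit)):
                continue
            hits.append((match.start(), motif, len(match.group(0)) // unit))
    return sorted(hits)


for start, motif, copies in find_perfect_ssrs("GGCAGCAGCAGTTTTTTC"):
    print(start, motif, copies)
```

    Dividing the number of hits by the sequence length would give a repeat density that could then be compared against AT content, in the spirit of the regression described above.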

  4. Bovine Parathyroid Hormone: Amino Acid Sequence

    PubMed Central

    Brewer, H. Bryan; Ronan, Rosemary

    1970-01-01

    Bovine parathyroid hormone has been isolated in homogeneous form, and its complete amino acid sequence determined. The bovine hormone is a single chain, 84 amino acids long. It contains amino-terminal alanine, and carboxyl-terminal glutamine. The bovine parathyroid hormone is approximately three times the length of the newly discovered hormone, thyrocalcitonin, whose action is reciprocal to parathyroid hormone. PMID:5275384

  5. Activation of c-jun N-Terminal Kinase upon Influenza A Virus (IAV) Infection Is Independent of Pathogen-Related Receptors but Dependent on Amino Acid Sequence Variations of IAV NS1

    PubMed Central

    Nacken, Wolfgang; Anhlan, Darisuren; Hrincius, Eike R.; Mostafa, Ahmed; Wolff, Thorsten; Sadewasser, Anne; Pleschka, Stephan; Ehrhardt, Christina

    2014-01-01

    ABSTRACT A hallmark cell response to influenza A virus (IAV) infections is the phosphorylation and activation of c-jun N-terminal kinase (JNK). However, so far it is not fully clear which molecules are involved in the activation of JNK upon IAV infection. Here, we report that the transfection of influenza viral-RNA induces JNK in a retinoic acid-inducible gene I (RIG-I)-dependent manner. However, neither RIG-I-like receptors nor MyD88-dependent Toll-like receptors were found to be involved in the activation of JNK upon IAV infection. Viral JNK activation may be blocked by addition of cycloheximide and heat shock protein inhibitors during infection, suggesting that the expression of an IAV-encoded protein is responsible for JNK activation. Indeed, the overexpression of nonstructural protein 1 (NS1) of certain IAV subtypes activated JNK, whereas those of some other subtypes failed to activate JNK. Site-directed mutagenesis experiments using NS1 of the IAV H7N7, H5N1, and H3N2 subtypes identified the amino acid residue phenylalanine (F) at position 103 to be decisive for JNK activation. Cleavage- and polyadenylation-specific factor 30 (CPSF30), whose binding to NS1 is stabilized by the amino acids F103 and M106, is not involved in JNK activation. Conclusively, subtype-specific sequence variations in the IAV NS1 protein result in subtype-specific differences in JNK signaling upon IAV infection. IMPORTANCE Influenza A virus (IAV) infection leads to the activation or modulation of multiple signaling pathways. Here, we demonstrate for the first time that the c-jun N-terminal kinase (JNK), a long-known stress-activated mitogen-activated protein (MAP) kinase, is activated by RIG-I when cells are treated with IAV RNA. However, at the same time, nonstructural protein 1 (NS1) of IAV has an intrinsic JNK-activating property that is dependent on IAV subtype-specific amino acid variations around position 103. Our findings identify two different and independent pathways that

  6. Variations on strongly lacunary quasi Cauchy sequences

    NASA Astrophysics Data System (ADS)

    Kaplan, Huseyin; Cakalli, Huseyin

    2016-08-01

    We introduce a new function space, namely the space of $N_\theta(p)$-ward continuous functions, which turns out to be a closed subspace of the space of continuous functions for each positive integer $p$. $N_\theta^\alpha(p)$-ward continuity is also introduced and investigated for any fixed $0 < \alpha \le 1$ and for any fixed positive integer $p$. A real-valued function $f$ defined on a subset $A$ of $\mathbb{R}$, the set of real numbers, is $N_\theta^\alpha(p)$-ward continuous if it preserves $N_\theta^\alpha(p)$-quasi-Cauchy sequences, i.e. $(f(x_n))$ is an $N_\theta^\alpha(p)$-quasi-Cauchy sequence whenever $(x_n)$ is an $N_\theta^\alpha(p)$-quasi-Cauchy sequence of points in $A$, where a sequence $(x_k)$ of points in $\mathbb{R}$ is called $N_\theta^\alpha(p)$-quasi-Cauchy if $\lim_{r \to \infty} \frac{1}{h_r^\alpha} \sum_{k \in I_r} |\Delta x_k|^p = 0$, where $\Delta x_k = x_{k+1} - x_k$ for each positive integer $k$, $p$ is a fixed positive integer, $\alpha$ is fixed in $]0, 1]$, $I_r = (k_{r-1}, k_r]$, and $\theta = (k_r)$ is a lacunary sequence, i.e. an increasing sequence of positive integers such that $k_0 \neq 0$ and $h_r := k_r - k_{r-1} \to \infty$.

  7. A variation on lacunary quasi Cauchy sequences

    NASA Astrophysics Data System (ADS)

    Cakalli, Huseyin; Et, Mikail; Sengul, Hacer

    2016-08-01

    In the present paper, we introduce a concept of ideal lacunary statistical quasi-Cauchy sequence of order $\alpha$ of real numbers in the sense that a sequence $(x_k)$ of points in $\mathbb{R}$ is called $I$-lacunary statistically quasi-Cauchy of order $\alpha$ if $\{ r \in \mathbb{N} : \frac{1}{h_r^\alpha} |\{ k \in I_r : |\Delta x_k| \ge \varepsilon \}| \ge \delta \} \in I$ for each $\varepsilon > 0$ and for each $\delta > 0$, where an ideal $I$ is a family of subsets of positive integers $\mathbb{N}$ which is closed under taking finite unions and subsets of its elements. The main purpose of this paper is to investigate ideal lacunary statistical ward continuity of order $\alpha$, where a function $f$ is called $I$-lacunary statistically ward continuous of order $\alpha$ if it preserves $I$-lacunary statistically quasi-Cauchy sequences of order $\alpha$, i.e. $(f(x_n))$ is an $S_\theta^\alpha(I)$-quasi-Cauchy sequence whenever $(x_n)$ is.

  8. Algorithm of detecting structural variations in DNA sequences

    NASA Astrophysics Data System (ADS)

    Nałecz-Charkiewicz, Katarzyna; Nowak, Robert

    2014-11-01

    Whole genome sequencing makes it possible to use the longest common subsequence algorithm to detect genetic structural variations. We propose searching for the positions of short unique fragments (genetic markers) to achieve acceptable time and space complexity. The markers are generated by algorithms that search the genetic sequence or its Fourier transform. The presented methods were tested on structural variations generated in silico in bacterial genomes and give results comparable to or better than other solutions.
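
    The idea of locating short unique fragments can be sketched with k-mers: fragments of length k that occur exactly once in a sequence serve as markers, and comparing their positions in a reference and a sample hints at rearrangements. This is only an assumed, simplified illustration. The function names are hypothetical, and the abstract does not describe the authors' actual marker-generation algorithms (including the Fourier-transform variant), which are not reproduced here.

```python
from collections import Counter


def unique_markers(genome, k):
    """Map each k-mer that occurs exactly once in `genome` to its position."""
    kmers = [genome[i:i + k] for i in range(len(genome) - k + 1)]
    counts = Counter(kmers)
    return {kmer: i for i, kmer in enumerate(kmers) if counts[kmer] == 1}


def marker_positions(reference, sample, k):
    """List (reference_pos, sample_pos) for markers unique in both sequences."""
    ref = unique_markers(reference, k)
    smp = unique_markers(sample, k)
    # Sorting by reference position makes order changes in the sample visible.
    return sorted((ref[m], smp[m]) for m in ref.keys() & smp.keys())


# Toy example: the leading "ACG" of the reference has moved to the end.
pairs = marker_positions("ACGTTGCA", "TTGCAACG", k=3)
print(pairs)
```

    In the sorted output, sample positions that do not increase along with the reference positions indicate that a segment has moved, which is the kind of signal a structural-variation detector would then examine more closely.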

  9. Identification of cis-regulatory sequence variations in individual genome sequences.

    PubMed

    Worsley-Hunt, Rebecca; Bernard, Virginie; Wasserman, Wyeth W

    2011-01-01

    Functional contributions of cis-regulatory sequence variations to human genetic disease are numerous. For instance, disrupting variations in a HNF4A transcription factor binding site upstream of the Factor IX gene contributes causally to hemophilia B Leyden. Although clinical genome sequence analysis currently focuses on the identification of protein-altering variation, the impact of cis-regulatory mutations can be similarly strong. New technologies are now enabling genome sequencing beyond exomes, revealing variation across the non-coding 98% of the genome responsible for developmental and physiological patterns of gene activity. The capacity to identify causal regulatory mutations is improving, but predicting functional changes in regulatory DNA sequences remains a great challenge. Here we explore the existing methods and software for prediction of functional variation situated in the cis-regulatory sequences governing gene transcription and RNA processing.

  10. Identification of cis-regulatory sequence variations in individual genome sequences

    PubMed Central

    2011-01-01

    Functional contributions of cis-regulatory sequence variations to human genetic disease are numerous. For instance, disrupting variations in a HNF4A transcription factor binding site upstream of the Factor IX gene contributes causally to hemophilia B Leyden. Although clinical genome sequence analysis currently focuses on the identification of protein-altering variation, the impact of cis-regulatory mutations can be similarly strong. New technologies are now enabling genome sequencing beyond exomes, revealing variation across the non-coding 98% of the genome responsible for developmental and physiological patterns of gene activity. The capacity to identify causal regulatory mutations is improving, but predicting functional changes in regulatory DNA sequences remains a great challenge. Here we explore the existing methods and software for prediction of functional variation situated in the cis-regulatory sequences governing gene transcription and RNA processing. PMID:21989199

  11. Phenolic acid esterases, coding sequences and methods

    DOEpatents

    Blum, David L.; Kataeva, Irina; Li, Xin-Liang; Ljungdahl, Lars G.

    2002-01-01

    Described herein are four phenolic acid esterases, three of which correspond to domains of previously unknown function within bacterial xylanases, from XynY and XynZ of Clostridium thermocellum and from a xylanase of Ruminococcus. The fourth specifically exemplified xylanase is a protein encoded within the genome of Orpinomyces PC-2. The amino acids of these polypeptides and nucleotide sequences encoding them are provided. Recombinant host cells, expression vectors and methods for the recombinant production of phenolic acid esterases are also provided.

  12. Mitochondrial DNA sequence variation in Greeks.

    PubMed

    Kouvatsi, A; Karaiskou, N; Apostolidis, A; Kirmizidis, G

    2001-12-01

    Mitochondrial DNA (mtDNA) control region sequences were determined in 54 unrelated Greeks, coming from different regions in Greece, for both segments HVR-I and HVR-II. Fifty-two different mtDNA haplotypes were revealed, one of which was shared by three individuals. A very low heterogeneity was found among Greek regions. No one cluster of lineages was specific to individuals coming from a certain region. The average pairwise difference distribution showed a value of 7.599. The data were compared with that for other European or neighbor populations (British, French, Germans, Tuscans, Bulgarians, and Turks). The genetic trees that were constructed revealed homogeneity between Europeans. Median networks revealed that most of the Greek mtDNA haplotypes are clustered to the five known haplogroups and that a number of haplotypes are shared among Greeks and other European and Near Eastern populations.

  13. Method for identifying and quantifying nucleic acid sequence aberrations

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1998-07-21

    A method is disclosed for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe. 11 figs.

  14. Method for identifying and quantifying nucleic acid sequence aberrations

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1998-01-01

    A method for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe.

  15. Using chaos to generate variations on movement sequences

    NASA Astrophysics Data System (ADS)

    Bradley, Elizabeth; Stuart, Joshua

    1998-12-01

    We describe a method for introducing variations into predefined motion sequences using a chaotic symbol-sequence reordering technique. A progression of symbols representing the body positions in a dance piece, martial arts form, or other motion sequence is mapped onto a chaotic trajectory, establishing a symbolic dynamics that links the movement sequence and the attractor structure. A variation on the original piece is created by generating a trajectory with slightly different initial conditions, inverting the mapping, and using special corpus-based graph-theoretic interpolation schemes to smooth any abrupt transitions. Sensitive dependence guarantees that the variation is different from the original; the attractor structure and the symbolic dynamics guarantee that the two resemble one another in both aesthetic and mathematical senses.

  16. Optimization of short amino acid sequences classifier

    NASA Astrophysics Data System (ADS)

    Barcz, Aleksy; Szymański, Zbigniew

    This article describes processing methods used for short amino acid sequences classification. The data processed are 9-symbol string representations of amino acid sequences, divided into 49 data sets, each one containing samples labeled as reacting or not with a given enzyme. The goal of the classification is to determine for a single enzyme, whether an amino acid sequence would react with it or not. Each data set is processed separately. Feature selection is performed to reduce the number of dimensions for each data set. The method used for feature selection consists of two phases. During the first phase, significant positions are selected using Classification and Regression Trees. Afterwards, symbols appearing at the selected positions are substituted with numeric values of amino acid properties taken from the AAindex database. In the second phase the new set of features is reduced using a correlation-based ranking formula and Gram-Schmidt orthogonalization. Finally, the preprocessed data is used for training LS-SVM classifiers. SPDE, an evolutionary algorithm, is used to obtain optimal hyperparameters for the LS-SVM classifier, such as error penalty parameter C and kernel-specific hyperparameters. A simple score penalty is used to adapt the SPDE algorithm to the task of selecting classifiers with the best performance measure values.

  17. Methods for analyzing nucleic acid sequences

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2011-05-17

    The present invention is directed to a method of sequencing a target nucleic acid. The method provides a complex comprising a polymerase enzyme, a target nucleic acid molecule, and a primer, wherein the complex is immobilized on a support. A fluorescent label is attached to a terminal phosphate group of the nucleotide or nucleotide analog. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The time duration of the signal from labeled nucleotides or nucleotide analogs that become incorporated is distinguished from freely diffusing labels by a longer retention in the observation volume for the nucleotides or nucleotide analogs that become incorporated than for the freely diffusing labels.

  18. Nucleotide sequence variation of chitin synthase genes among ectomycorrhizal fungi and its potential use in taxonomy.

    PubMed Central

    Mehmann, B; Brunner, I; Braus, G H

    1994-01-01

    DNA sequences of single-copy genes coding for chitin synthases (UDP-N-acetyl-D-glucosamine:chitin 4-beta-N-acetylglucosaminyltransferase; EC 2.4.1.16) were used to characterize ectomycorrhizal fungi. Degenerate primers deduced from short, completely conserved amino acid stretches flanking a region of about 200 amino acids of zymogenic chitin synthases allowed the amplification of DNA fragments of several members of this gene family. Different DNA band patterns were obtained from basidiomycetes because of variation in the number and length of amplified fragments. Cloning and sequencing of the most prominent DNA fragments revealed that these differences were due to various introns at conserved positions. The presence of introns in basidiomycetous fungi therefore has a potential use in identification of genera by analyzing PCR-generated DNA fragment patterns. Analyses of the nucleotide sequences of cloned fragments revealed variations in nucleotide sequences from 4 to 45%. By comparison of the deduced amino acid sequences, the majority of the DNA fragments were identified as members of genes for chitin synthase class II. The deduced amino acid sequences from species of the same genus differed only in one amino acid residue, whereas identity between the amino acid sequences of ascomycetous and basidiomycetous fungi within the same taxonomic class was found to be approximately 43 to 66%. Phylogenetic analysis of the amino acid sequence of class II chitin synthase-encoding gene fragments by using parsimony confirmed the current taxonomic groupings. In addition, our data revealed a fourth class of putative zymogenic chitin synthases. PMID:7944356

  19. In Silico Detection of Sequence Variations Modifying Transcriptional Regulation

    PubMed Central

    Andersen, Malin C; Engström, Pär G; Lithwick, Stuart; Arenillas, David; Eriksson, Per; Lenhard, Boris; Wasserman, Wyeth W; Odeberg, Jacob

    2008-01-01

    Identification of functional genetic variation associated with increased susceptibility to complex diseases can elucidate genes and underlying biochemical mechanisms linked to disease onset and progression. For genes linked to genetic diseases, most identified causal mutations alter an encoded protein sequence. Technological advances for measuring RNA abundance suggest that a significant number of undiscovered causal mutations may alter the regulation of gene transcription. However, it remains a challenge to separate causal genetic variations from linked neutral variations. Here we present an in silico driven approach to identify possible genetic variation in regulatory sequences. The approach combines phylogenetic footprinting and transcription factor binding site prediction to identify variation in candidate cis-regulatory elements. The bioinformatics approach has been tested on a set of SNPs that are reported to have a regulatory function, as well as background SNPs. In the absence of additional information about an analyzed gene, the poor specificity of binding site prediction is prohibitive to its application. However, when additional data is available that can give guidance on which transcription factor is involved in the regulation of the gene, the in silico binding site prediction improves the selection of candidate regulatory polymorphisms for further analyses. The bioinformatics software generated for the analysis has been implemented as a Web-based application system entitled RAVEN (regulatory analysis of variation in enhancers). The RAVEN system is available at http://www.cisreg.ca for all researchers interested in the detection and characterization of regulatory sequence variation. PMID:18208319

  20. Massively parallel sequencing approaches for characterization of structural variation.

    PubMed

    Koboldt, Daniel C; Larson, David E; Chen, Ken; Ding, Li; Wilson, Richard K

    2012-01-01

    The emergence of next-generation sequencing (NGS) technologies offers an incredible opportunity to comprehensively study DNA sequence variation in human genomes. Commercially available platforms from Roche (454), Illumina (Genome Analyzer and Hiseq 2000), and Applied Biosystems (SOLiD) have the capability to completely sequence individual genomes to high levels of coverage. NGS data is particularly advantageous for the study of structural variation (SV) because it offers the sensitivity to detect variants of various sizes and types, as well as the precision to characterize their breakpoints at base pair resolution. In this chapter, we present methods and software algorithms that have been developed to detect SVs and copy number changes using massively parallel sequencing data. We describe visualization and de novo assembly strategies for characterizing SV breakpoints and removing false positives.

  1. Transcriptome and genome sequencing uncovers functional variation in humans

    PubMed Central

    Lappalainen, Tuuli; Sammeth, Michael; Friedländer, Marc R; ‘t Hoen, Peter AC; Monlong, Jean; Rivas, Manuel A; Gonzàlez-Porta, Mar; Kurbatova, Natalja; Griebel, Thasso; Ferreira, Pedro G; Barann, Matthias; Wieland, Thomas; Greger, Liliana; van Iterson, Maarten; Almlöf, Jonas; Ribeca, Paolo; Pulyakhina, Irina; Esser, Daniela; Giger, Thomas; Tikhonov, Andrew; Sultan, Marc; Bertier, Gabrielle; MacArthur, Daniel G; Lek, Monkol; Lizano, Esther; Buermans, Henk PJ; Padioleau, Ismael; Schwarzmayr, Thomas; Karlberg, Olof; Ongen, Halit; Kilpinen, Helena; Beltran, Sergi; Gut, Marta; Kahlem, Katja; Amstislavskiy, Vyacheslav; Stegle, Oliver; Pirinen, Matti; Montgomery, Stephen B; Donnelly, Peter; McCarthy, Mark I; Flicek, Paul; Strom, Tim M; Lehrach, Hans; Schreiber, Stefan; Sudbrak, Ralf; Carracedo, Ángel; Antonarakis, Stylianos E; Häsler, Robert; Syvänen, Ann-Christine; van Ommen, Gert-Jan; Brazma, Alvis; Meitinger, Thomas; Rosenstiel, Philip; Guigó, Roderic; Gut, Ivo G; Estivill, Xavier; Dermitzakis, Emmanouil T

    2013-01-01

    Summary Genome sequencing projects are discovering millions of genetic variants in humans, and interpretation of their functional effects is essential for understanding the genetic basis of variation in human traits. Here we report sequencing and deep analysis of mRNA and miRNA from lymphoblastoid cell lines of 462 individuals from the 1000 Genomes Project – the first uniformly processed RNA-seq data from multiple human populations with high-quality genome sequences. We discovered extremely widespread genetic variation affecting regulation of the majority of genes, with transcript structure and expression level variation being equally common but genetically largely independent. Our characterization of causal regulatory variation sheds light on cellular mechanisms of regulatory and loss-of-function variation, and allowed us to infer putative causal variants for dozens of disease-associated loci. Altogether, this study provides a deep understanding of the cellular mechanisms of transcriptome variation and of the landscape of functional variants in the human genome. PMID:24037378

  2. A sequence-based variation map of zebrafish.

    PubMed

    Patowary, Ashok; Purkanti, Ramya; Singh, Meghna; Chauhan, Rajendra; Singh, Angom Ramcharan; Swarnkar, Mohit; Singh, Naresh; Pandey, Vikas; Torroja, Carlos; Clark, Matthew D; Kocher, Jean-Pierre; Clark, Karl J; Stemple, Derek L; Klee, Eric W; Ekker, Stephen C; Scaria, Vinod; Sivasubbu, Sridhar

    2013-03-01

    Zebrafish (Danio rerio) is a popular vertebrate model organism largely deployed using outbred laboratory animals. The nonisogenic nature of the zebrafish as a model system offers the opportunity to understand natural variations and their effect in modulating phenotype. In an effort to better characterize the range of natural variation in this model system and to complement the zebrafish reference genome project, the whole genome sequence of a wild zebrafish at 39-fold genome coverage was determined. Comparative analysis with the zebrafish reference genome revealed approximately 5.2 million single nucleotide variations and over 1.6 million insertion-deletion variations. This dataset thus represents a new catalog of genetic variations in the zebrafish genome. Further analysis revealed selective enrichment for variations in genes involved in immune function and response to the environment, suggesting genome-level adaptations to environmental niches. We also show that human disease gene orthologs in the sequenced wild zebrafish genome show a lower ratio of nonsynonymous to synonymous single nucleotide variations.

  3. Understanding mechanisms underlying human gene expression variation with RNA sequencing

    PubMed Central

    Pickrell, Joseph K.; Marioni, John C.; Pai, Athma A.; Degner, Jacob F.; Engelhardt, Barbara E.; Nkadori, Everlyne; Veyrieras, Jean-Baptiste; Stephens, Matthew; Gilad, Yoav; Pritchard, Jonathan K.

    2011-01-01

    Understanding the genetic mechanisms underlying natural variation in gene expression is a central goal of both medical and evolutionary genetics, and studies of expression quantitative trait loci (eQTLs) have become an important tool for achieving this goal. Although all eQTL studies so far have assayed messenger RNA levels using expression microarrays, recent advances in RNA sequencing enable the analysis of transcript variation at unprecedented resolution. We sequenced RNA from 69 lymphoblastoid cell lines derived from unrelated Nigerian individuals that have been extensively genotyped by the International HapMap Project. By pooling data from all individuals, we generated a map of the transcriptional landscape of these cells, identifying extensive use of unannotated untranslated regions and more than 100 new putative protein-coding exons. Using the genotypes from the HapMap project, we identified more than a thousand genes at which genetic variation influences overall expression levels or splicing. We demonstrate that eQTLs near genes generally act by a mechanism involving allele-specific expression, and that variation that influences the inclusion of an exon is enriched within and near the consensus splice sites. Our results illustrate the power of high-throughput sequencing for the joint analysis of variation in transcription, splicing and allele-specific expression across individuals. PMID:20220758

  4. 77 FR 65537 - Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-10-29

    ... Amino Acid Sequence Disclosures ACTION: Proposed collection; comment request. SUMMARY: The United States....'' SUPPLEMENTARY INFORMATION: I. Abstract Patent applications that contain nucleotide and/or amino acid...

  5. Dissecting the relationship between protein structure and sequence variation

    NASA Astrophysics Data System (ADS)

    Shahmoradi, Amir; Wilke, Claus; Wilke Lab Team

    2015-03-01

    Over the past decade several independent works have shown that some structural properties of proteins are capable of predicting protein evolution. The strength and significance of these structure-sequence relations, however, appear to vary widely among different proteins, with absolute correlation strengths ranging from 0.1 to 0.8. Here we present the results from a comprehensive search for the potential biophysical and structural determinants of protein evolution by studying more than 200 structural and evolutionary properties in a dataset of 209 monomeric enzymes. We discuss the main protein characteristics responsible for the general patterns of protein evolution, and identify sequence divergence as the main determinant of the strengths of virtually all structure-evolution relationships, explaining ~10-30% of observed variation in sequence-structure relations. In addition to sequence divergence, we identify several protein structural properties that are moderately but significantly coupled with the strength of sequence-structure relations. In particular, proteins with more homogeneous backbone hydrogen bond energies, large fractions of helical secondary structures and low fraction of beta sheets tend to have the strongest sequence-structure relation. BEACON-NSF center for the study of evolution in action.

  6. Hybridization and sequencing of nucleic acids using base pair mismatches

    DOEpatents

    Fodor, Stephen P. A.; Lipshutz, Robert J.; Huang, Xiaohua

    2001-01-01

    Devices and techniques for hybridization of nucleic acids and for determining the sequence of nucleic acids. Arrays of nucleic acids are formed by techniques, preferably high resolution, light-directed techniques. Positions of hybridization of a target nucleic acid are determined by, e.g., epifluorescence microscopy. Devices and techniques are proposed to determine the sequence of a target nucleic acid more efficiently and more quickly through such synthesis and detection techniques.

  7. The amino acid sequence of elephant (Elephas maximus) myoglobin and the phylogeny of Proboscidea.

    PubMed

    Dene, H; Goodman, M; Romero-Herrera, A E

    1980-02-13

    The complete amino acid sequence of skeletal myoglobin from the Asian elephant (Elephas maximus) is reported. The functional significance of variations seen when this sequence is compared with that of sperm whale myoglobin is explored in the light of the crystallographic model available for the latter molecule. The phylogenetic implications of the elephant myoglobin amino acid sequence are evaluated by using the maximum parsimony technique. A similar analysis is also presented which incorporates all of the proteins sequenced from the elephant. These results are discussed with respect to current views on proboscidean phylogeny.

  8. GeneSV - an Approach to Help Characterize Possible Variations in Genomic and Protein Sequences.

    PubMed

    Zemla, Adam; Kostova, Tanya; Gorchakov, Rodion; Volkova, Evgeniya; Beasley, David W C; Cardosa, Jane; Weaver, Scott C; Vasilakis, Nikos; Naraghi-Arani, Pejman

    2014-01-01

    A computational approach for identification and assessment of genomic sequence variability (GeneSV) is described. For a given nucleotide sequence, GeneSV collects information about the permissible nucleotide variability (changes that potentially preserve function) observed in corresponding regions in genomic sequences, and combines it with conservation/variability results from protein sequence and structure-based analyses of evaluated protein coding regions. GeneSV was used to predict effects (functional vs. non-functional) of 37 amino acid substitutions on the NS5 polymerase (RdRp) of dengue virus type 2 (DENV-2), 36 of which are not observed in any publicly available DENV-2 sequence. 32 novel mutants with single amino acid substitutions in the RdRp were generated using a DENV-2 reverse genetics system. In 81% (26 of 32) of predictions tested, GeneSV correctly predicted viability of introduced mutations. In 4 of 5 (80%) mutants with double amino acid substitutions proximal in structure to one another GeneSV was also correct in its predictions. Predictive capabilities of the developed system were illustrated on dengue RNA virus, but the general approach described in the manuscript for characterizing real or theoretically possible variations in genomic and protein sequences can be applied to any organism. PMID:24453480

  9. Sequence variation in the androgen receptor gene is not a common determinant of male sexual orientation

    SciTech Connect

    Macke, J.P.; Nathans, J.; King, V.L.; Hu, N.; Hu, S.; Hamer, D.; Bailey, M.; Brown, T.

    1993-10-01

    To test the hypothesis that DNA sequence variation in the androgen receptor gene plays a causal role in the development of male sexual orientation, the authors have (1) measured the degree of concordance of androgen receptor alleles in 36 pairs of homosexual brothers, (2) compared the lengths of polyglutamine and polyglycine tracts in the amino-terminal domain of the androgen receptor in a sample of 197 homosexual males and 213 unselected subjects, and (3) screened the entire androgen receptor coding region for sequence variation by PCR and denaturing gradient-gel electrophoresis (DGGE) and/or single-strand conformation polymorphism analysis in 20 homosexual males with homosexual or bisexual brothers and one homosexual male with no homosexual brothers, and screened the amino-terminal domain of the receptor for sequence variation in an additional 44 homosexual males, 37 of whom had one or more first- or second-degree male relatives who were either homosexual or bisexual. These analyses show that (1) homosexual brothers are as likely to be discordant as concordant for androgen receptor alleles; (2) there are no large-scale differences between the distributions of polyglycine or polyglutamine tract lengths in the homosexual and control groups; and (3) coding region sequence variation is not commonly found within the androgen receptor gene of homosexual men. The DGGE screen identified two rare amino acid substitutions, ser²⁰⁵-to-arg and glu⁷⁹³-to-asp, the biological significance of which is unknown. 32 refs., 2 figs., 2 tabs.

  10. STR allele sequence variation: Current knowledge and future issues.

    PubMed

    Gettings, Katherine Butler; Aponte, Rachel A; Vallone, Peter M; Butler, John M

    2015-09-01

    This article reviews what is currently known about short tandem repeat (STR) allelic sequence variation in and around the twenty-four loci most commonly used throughout the world to perform forensic DNA investigations. These STR loci include D1S1656, TPOX, D2S441, D2S1338, D3S1358, FGA, CSF1PO, D5S818, SE33, D6S1043, D7S820, D8S1179, D10S1248, TH01, vWA, D12S391, D13S317, Penta E, D16S539, D18S51, D19S433, D21S11, Penta D, and D22S1045. All known reported variant alleles are compiled along with genomic information available from GenBank, dbSNP, and the 1000 Genomes Project. Supplementary files are included which provide annotated reference sequences for each STR locus, characterize genomic variation around the STR repeat region, and compare alleles present in currently available STR kit allelic ladders. Looking to the future, STR allele nomenclature options are discussed as they relate to next generation sequencing efforts underway. PMID:26197946

  11. DYZ1 arrays show sequence variation between the monozygotic males

    PubMed Central

    2014-01-01

    Background Monozygotic twins (MZT) are an important resource for genetic studies in the context of normal and diseased genomes. In the present study we used DYZ1, a satellite fraction present in the form of tandem arrays on the long arm of the human Y chromosome, as a tool to uncover sequence variations between the monozygotic males. Results We detected copy number variation and frequent insertions and deletions within the sequences of DYZ1 arrays amongst all three sets of twins used in the present study. MZT1b showed loss of 35 bp compared to that in 1a, whereas 2a showed loss of 31 bp compared to that in 2b. Similarly, 3b showed a 10 bp insertion compared to that in 3a. MZT1a germline DNA showed loss of 5 bp and 1b blood DNA showed loss of 26 bp compared to that of 1a blood and 1b germline DNA, respectively. Of the 69 restriction sites detected in DYZ1 arrays, MboII, BsrI, TspEI and TaqI enzymes showed frequent loss and/or gain amongst all 3 pairs studied. The MZT1 pair showed loss/gain of VspI, BsrDI, AgsI, PleI, TspDTI, TspEI, TfiI and TaqI restriction sites in both blood and germline DNA. All three sets of MZT showed differences in the number of DYZ1 copies. FISH signals reflected somatic mosaicism of the DYZ1 copies across the cells. Conclusions DYZ1 showed both sequence and copy number variation between the MZT males. Sequence variation was also noticed between germline and blood DNA samples of the same individual, as we observed in at least one set of samples. The result suggests that DYZ1 faithfully records all the genetic changes occurring after twinning, which may be ascribed to environmental factors. PMID:24495361

  12. Comparative RNA sequencing reveals substantial genetic variation in endangered primates

    PubMed Central

    Perry, George H.; Melsted, Páll; Marioni, John C.; Wang, Ying; Bainer, Russell; Pickrell, Joseph K.; Michelini, Katelyn; Zehr, Sarah; Yoder, Anne D.; Stephens, Matthew; Pritchard, Jonathan K.; Gilad, Yoav

    2012-01-01

    Comparative genomic studies in primates have yielded important insights into the evolutionary forces that shape genetic diversity and revealed the likely genetic basis for certain species-specific adaptations. To date, however, these studies have focused on only a small number of species. For the majority of nonhuman primates, including some of the most critically endangered, genome-level data are not yet available. In this study, we have taken the first steps toward addressing this gap by sequencing RNA from the livers of multiple individuals from each of 16 mammalian species, including humans and 11 nonhuman primates. Of the nonhuman primate species, five are lemurs and two are lorisoids, for which little or no genomic data were previously available. To analyze these data, we developed a method for de novo assembly and alignment of orthologous gene sequences across species. We assembled an average of 5721 gene sequences per species and characterized diversity and divergence of both gene sequences and gene expression levels. We identified patterns of variation that are consistent with the action of positive or directional selection, including an 18-fold enrichment of peroxisomal genes among genes whose regulation likely evolved under directional selection in the ancestral primate lineage. Importantly, we found no relationship between genetic diversity and endangered status, with the two most endangered species in our study, the black and white ruffed lemur and the Coquerel's sifaka, having the highest genetic diversity among all primates. Our observations imply that many endangered lemur populations still harbor considerable genetic variation. Timely efforts to conserve these species alongside their habitats have, therefore, strong potential to achieve long-term success. PMID:22207615

  13. Predicting intrinsic disorder from amino acid sequence.

    PubMed

    Obradovic, Zoran; Peng, Kang; Vucetic, Slobodan; Radivojac, Predrag; Brown, Celeste J; Dunker, A Keith

    2003-01-01

    Blind predictions of intrinsic order and disorder were made on 42 proteins subsequently revealed to contain 9,044 ordered residues, 284 disordered residues in 26 segments of length 30 residues or less, and 281 disordered residues in 2 disordered segments of length greater than 30 residues. The accuracies of the six predictors used in this experiment ranged from 77% to 91% for the ordered regions and from 56% to 78% for the disordered segments. The average of the order and disorder predictions ranged from 73% to 77%. The prediction of disorder in the shorter segments was poor, from 25% to 66% correct, while the prediction of disorder in the longer segments was better, from 75% to 95% correct. Four of the predictors were composed of ensembles of neural networks. This enabled them to deal more efficiently with the large asymmetry in the training data through diversified sampling from the significantly larger ordered set and achieve better accuracy on ordered and long disordered regions. The exclusive use of long disordered regions for predictor training likely contributed to the disparity of the predictions on long versus short disordered regions, while averaging the output values over 61-residue windows to eliminate short predictions of order or disorder probably contributed to the even greater disparity for three of the predictors. This experiment supports the predictability of intrinsic disorder from amino acid sequence. PMID:14579347
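
    The window-averaging step mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' predictor code: the function names, the 0.5 decision threshold, and the truncated-window treatment of sequence ends are assumptions made for illustration. It shows why averaging raw per-residue scores over a 61-residue window tends to erase short predicted segments.

```python
def smooth_scores(scores, window=61):
    """Average per-residue scores over a centered sliding window.

    Near the sequence ends the window is truncated to the residues that
    exist (an assumption; other edge treatments are possible).
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd integer")
    half = window // 2
    n = len(scores)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half)          # first residue inside the window
        hi = min(n, i + half + 1)      # one past the last residue inside the window
        smoothed.append(sum(scores[lo:hi]) / (hi - lo))
    return smoothed


def call_disorder(scores, threshold=0.5, window=61):
    """Label a residue disordered when its smoothed score exceeds the threshold.

    The 0.5 threshold is a hypothetical choice, not taken from the abstract.
    """
    return [s > threshold for s in smooth_scores(scores, window)]


if __name__ == "__main__":
    # A 5-residue burst of high raw scores inside a 100-residue sequence.
    raw = [0.0] * 100
    raw[50:55] = [1.0] * 5
    print(sum(call_disorder(raw, window=1)))   # residues called without smoothing
    print(sum(call_disorder(raw, window=61)))  # residues called after smoothing
```

    In this toy input the short burst is called disordered when no smoothing is applied and disappears once the 61-residue average is taken, which matches the direction of the effect the abstract describes for short disordered segments.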

  14. Associations between DNA Sequence Variation and Variation in Expression of the Adh Gene in Natural Populations of Drosophila Melanogaster

    PubMed Central

    Laurie, C. C.; Bridgham, J. T.; Choudhary, M.

    1991-01-01

    A large part of the genetic variation in alcohol dehydrogenase (ADH) activity level in natural populations of Drosophila melanogaster is associated with segregation of an amino acid replacement polymorphism at nucleotide 1490, which generates a difference in electrophoretic mobility. Part of the allozymic difference in activity level is due to a catalytic efficiency difference, which is also caused by the amino acid replacement, and part is due to a difference in the concentration of ADH protein. A previous site-directed in vitro mutagenesis experiment clearly demonstrated that the amino acid replacement has no effect on the concentration of ADH protein, nor does a strongly associated silent polymorphism at nucleotide 1443. Here we analyze associations between polymorphisms within the Adh gene and variation in ADH protein level for a number of chromosomes derived from natural populations. A sequence length polymorphism within the first intron of the distal (adult) transcript, &1, is in strong linkage disequilibrium with the amino acid replacement. Among a sample of 46 isochromosomal lines analyzed, all but one of the 14 Fast lines have &1 and all but one of the 32 Slow lines lack &1. The exceptional Fast line has an unusually low level of ADH protein (typical of Slow lines) and the exceptional Slow line has an unusually high level (typical of Fast lines). These results suggest that the &1 polymorphism may be responsible for the average difference in ADH protein between the allozymic classes. A previous experiment localized the effect on ADH protein to a 2.3-kb restriction fragment. DNA sequences of this fragment from several alleles of each allozymic type indicate that no other polymorphisms within this region are as closely associated with the ADH protein level difference as the &1 polymorphism. PMID:1683848

  15. Unraveling genomic variation from next generation sequencing data.

    PubMed

    Pavlopoulos, Georgios A; Oulas, Anastasis; Iacucci, Ernesto; Sifrim, Alejandro; Moreau, Yves; Schneider, Reinhard; Aerts, Jan; Iliopoulos, Ioannis

    2013-01-01

    Elucidating the content of a DNA sequence is critical to understanding and decoding the genetic information of any biological system. As next generation sequencing (NGS) techniques have become cheaper and more advanced in throughput over time, great innovations and breakthrough conclusions have been generated in various biological areas. A few of these areas, which are shaped by the new technological advances, are the evolution of species, microbial mapping, population genetics, genome-wide association studies (GWAS), comparative genomics, variant analysis, gene expression, gene regulation, epigenetics and personalized medicine. While NGS techniques stand as key players in modern biological research, the analysis and interpretation of the vast amount of data produced is not an easy or trivial task and remains a great challenge in the field of bioinformatics. Therefore, efficient tools that cope with information overload, tackle the high complexity and provide meaningful visualizations to ease knowledge extraction are essential. In this article, we briefly refer to the sequencing methodologies and the available equipment that serve these analyses, and we describe the data formats of the files they produce. We conclude with a thorough review of tools developed to efficiently store, analyze and visualize such data, with emphasis on structural variation analysis and comparative genomics. We finally comment on their functionality, strengths and weaknesses and discuss how future applications could further develop in this field.

  16. Methods and compositions for efficient nucleic acid sequencing

    DOEpatents

    Drmanac, Radoje

    2002-01-01

    Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.

  17. Methods and compositions for efficient nucleic acid sequencing

    DOEpatents

    Drmanac, Radoje

    2006-07-04

    Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.

  18. tax and rex Sequences of bovine leukaemia virus from globally diverse isolates: rex amino acid sequence more variable than tax.

    PubMed

    McGirr, K M; Buehring, G C

    2005-02-01

    Bovine leukaemia virus (BLV) is an important agricultural problem with high costs to the dairy industry. Here, we examine the variation of the tax and rex genes of BLV. The tax and rex genes share 420 bases and have overlapping reading frames. The tax gene encodes a protein that functions as a transactivator of the BLV promoter, is required for viral replication, acts on cellular promoters, and is responsible for oncogenesis. The rex gene product facilitates the export of viral mRNAs from the nucleus and regulates transcription. We have sequenced five new isolates of the tax/rex gene. We examined the five new and three previously published tax/rex DNA and predicted amino acid sequences of BLV isolates from cattle in representative regions worldwide. The highest variation among nucleic acid sequences for tax and rex was 7% and 5%, respectively; among predicted amino acid sequences for Tax and Rex, 9% and 11%, respectively. Significantly more nucleotide changes resulted in predicted amino acid changes in the rex gene than in the tax gene (P ≤ 0.0006). This variability is higher than previously reported for any region of the viral genome. This research may also have implications for the development of Tax-based vaccines. PMID:15702995

  19. Kit for detecting nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    2001-01-01

    A kit is provided for detecting a target nucleic acid sequence in a sample, the kit comprising: a first hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the first hybridization probe including a first complexing agent for forming a binding pair with a second complexing agent; and a second hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the first hybridization probe does not selectively hybridize, the second hybridization probe including a detectable marker; a third hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the third hybridization probe including the same detectable marker as the second hybridization probe; and a fourth hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the third hybridization probe does not selectively hybridize, the fourth hybridization probe including the first complexing agent for forming a binding pair with the second complexing agent; wherein the first and second hybridization probes are capable of simultaneously hybridizing to the target sequence and the third and fourth hybridization probes are capable of simultaneously hybridizing to the target sequence, the detectable marker is not present on the first or fourth hybridization probes and the first, second, third, and fourth hybridization probes each include a competitive nucleic acid sequence which is sufficiently complementary to a third portion of the target sequence that the competitive sequences of the first, second, third, and fourth hybridization probes compete with each other to hybridize to the third portion of the

  20. Using evolutionary sequence variation to make inferences about protein structure and function

    NASA Astrophysics Data System (ADS)

    Colwell, Lucy

    2015-03-01

    The evolutionary trajectory of a protein through sequence space is constrained by its function. Collections of sequence homologs record the outcomes of millions of evolutionary experiments in which the protein evolves according to these constraints. The explosive growth in the number of available protein sequences raises the possibility of using the natural variation present in homologous protein sequences to infer these constraints and thus identify residues that control different protein phenotypes. Because in many cases phenotypic changes are controlled by more than one amino acid, the mutations that separate one phenotype from another may not be independent, requiring us to understand the correlation structure of the data. To address this we build a maximum entropy probability model for the protein sequence. The parameters of the inferred model are constrained by the statistics of a large sequence alignment. Pairs of sequence positions with the strongest interactions accurately predict contacts in protein tertiary structure, enabling all atom structural models to be constructed. We describe development of a theoretical inference framework that enables the relationship between the amount of available input data and the reliability of structural predictions to be better understood.
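
    The idea that correlated positions in an alignment carry structural information can be illustrated with a much simpler statistic than the maximum entropy model described in this abstract. The sketch below computes the mutual information between two alignment columns; it is a swapped-in, simpler covariation measure and not the author's inference framework, and the function name and toy alignment are hypothetical.

```python
from collections import Counter
from math import log2


def mutual_information(alignment, i, j):
    """Mutual information (in bits) between columns i and j of an alignment.

    `alignment` is a list of equal-length sequences. Frequencies are plain
    counts with no pseudocounts or sequence reweighting (a simplification).
    """
    n = len(alignment)
    if n == 0:
        raise ValueError("alignment is empty")
    count_i = Counter(seq[i] for seq in alignment)
    count_j = Counter(seq[j] for seq in alignment)
    count_ij = Counter((seq[i], seq[j]) for seq in alignment)
    mi = 0.0
    for (a, b), n_ab in count_ij.items():
        # p(a,b) * log2( p(a,b) / (p(a) * p(b)) ), written with raw counts
        mi += (n_ab / n) * log2(n_ab * n / (count_i[a] * count_j[b]))
    return mi


if __name__ == "__main__":
    # Toy alignment: columns 0 and 1 change together, column 2 is invariant.
    toy = ["AKL", "AKL", "GEL", "GEL"]
    print(mutual_information(toy, 0, 1))  # covarying pair
    print(mutual_information(toy, 0, 2))  # pair with an invariant column
```

    Pairwise statistics of this kind cannot separate direct from indirect correlations, which is one motivation for fitting a global probability model to the alignment instead.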

  1. Analysis and Annotation of Nucleic Acid Sequence

    SciTech Connect

    States, David J.

    2004-07-28

    The aims of this project were to develop improved methods for computational genome annotation and to apply these methods to improve the annotation of genomic sequence data with a specific focus on human genome sequencing. The project resulted in a substantial body of published work. Notable contributions of this project were the identification of basecalling and lane tracking as error processes in genome sequencing and contributions to improved methods for these steps in genome sequencing. This technology improved the accuracy and throughput of genome sequence analysis. Probabilistic methods for physical map construction were developed. Improved methods for sequence alignment, alternative splicing analysis, promoter identification and NF kappa B response gene prediction were also developed.

  2. Solid phase sequencing of double-stranded nucleic acids

    DOEpatents

    Fu, Dong-Jing; Cantor, Charles R.; Koster, Hubert; Smith, Cassandra L.

    2002-01-01

    This invention relates to methods for detecting and sequencing of target double-stranded nucleic acid sequences, to nucleic acid probes and arrays of probes useful in these methods, and to kits and systems which contain these probes. Useful methods involve hybridizing the nucleic acids or nucleic acids which represent complementary or homologous sequences of the target to an array of nucleic acid probes. These probes comprise a single-stranded portion, an optional double-stranded portion and a variable sequence within the single-stranded portion. The molecular weights of the hybridized nucleic acids of the set can be determined by mass spectroscopy, and the sequence of the target determined from the molecular weights of the fragments. Nucleic acids whose sequences can be determined include nucleic acids in biological samples such as patient biopsies and environmental samples. Probes may be fixed to a solid support such as a hybridization chip to facilitate automated determination of molecular weights and identification of the target sequence.

  3. Mapping copy number variation by population-scale genome sequencing.

    PubMed

    Mills, Ryan E; Walter, Klaudia; Stewart, Chip; Handsaker, Robert E; Chen, Ken; Alkan, Can; Abyzov, Alexej; Yoon, Seungtai Chris; Ye, Kai; Cheetham, R Keira; Chinwalla, Asif; Conrad, Donald F; Fu, Yutao; Grubert, Fabian; Hajirasouliha, Iman; Hormozdiari, Fereydoun; Iakoucheva, Lilia M; Iqbal, Zamin; Kang, Shuli; Kidd, Jeffrey M; Konkel, Miriam K; Korn, Joshua; Khurana, Ekta; Kural, Deniz; Lam, Hugo Y K; Leng, Jing; Li, Ruiqiang; Li, Yingrui; Lin, Chang-Yun; Luo, Ruibang; Mu, Xinmeng Jasmine; Nemesh, James; Peckham, Heather E; Rausch, Tobias; Scally, Aylwyn; Shi, Xinghua; Stromberg, Michael P; Stütz, Adrian M; Urban, Alexander Eckehart; Walker, Jerilyn A; Wu, Jiantao; Zhang, Yujun; Zhang, Zhengdong D; Batzer, Mark A; Ding, Li; Marth, Gabor T; McVean, Gil; Sebat, Jonathan; Snyder, Michael; Wang, Jun; Ye, Kenny; Eichler, Evan E; Gerstein, Mark B; Hurles, Matthew E; Lee, Charles; McCarroll, Steven A; Korbel, Jan O

    2011-02-01

    Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is, copy number variants) based on whole genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications. Most SVs (53%) were mapped to nucleotide resolution, which facilitated analysing their origin and functional impact. We examined numerous whole and partial gene deletions with a genotyping approach and observed a depletion of gene disruptions amongst high frequency deletions. Furthermore, we observed differences in the size spectra of SVs originating from distinct formation mechanisms, and constructed a map of SV hotspots formed by common mechanisms. Our analytical framework and SV map serve as a resource for sequencing-based association studies.

  4. Conversion of amino-acid sequence in proteins to classical music: search for auditory patterns

    PubMed Central

    2007-01-01

    We have converted genome-encoded protein sequences into musical notes to reveal auditory patterns without compromising musicality. We derived a reduced range of 13 base notes by pairing similar amino acids and distinguishing them using variations of three-note chords and codon distribution to dictate rhythm. The conversion will help make genomic coding sequences more approachable for the general public, young children, and vision-impaired scientists. PMID:17477882

  5. Heterogeneity of amino acid sequence in hippopotamus cytochrome c.

    PubMed

    Thompson, R B; Borden, D; Tarr, G E; Margoliash, E

    1978-12-25

    The amino acid sequences of chymotryptic and tryptic peptides of Hippopotamus amphibius cytochrome c were determined by a recent modification of the manual Edman sequential degradation procedure. They were ordered by comparison with the structure of the hog protein. The hippopotamus protein differs in three positions: serine, alanine, and glutamine replace alanine, glutamic acid, and lysine in positions 43, 92, and 100, respectively. Since the artiodactyl suborders diverged in the mid-Eocene some 50 million years ago, the fact that representatives of some of them show no differences in their cytochromes c (cow, sheep, and hog), while another exhibits as many as three such differences, verifies that even in relatively closely related lines of descent the rate at which cytochrome c changes in the course of evolution is not constant. Furthermore, 10.6% of the hippopotamus cytochrome c preparation was shown to contain isoleucine instead of valine at position 3, indicating that one of the four animals from which the protein was obtained was heterozygous in the cytochrome c gene. Such heterogeneity is a necessary condition of evolutionary variation and has not been previously observed in the cytochrome c of a wild mammalian population.

  6. From Artificial Amino Acids to Sequence-Defined Targeted Oligoaminoamides.

    PubMed

    Morys, Stephan; Wagner, Ernst; Lächelt, Ulrich

    2016-01-01

    Artificial oligoamino acids with appropriate protecting groups can be used for the sequential assembly of oligoaminoamides on solid-phase. With the help of these oligoamino acids multifunctional nucleic acid (NA) carriers can be designed and produced in highly defined topologies. Here we describe the synthesis of the artificial oligoamino acid Fmoc-Stp(Boc3)-OH, the subsequent assembly into sequence-defined oligomers and the formulation of tumor-targeted plasmid DNA (pDNA) polyplexes. PMID:27436323

  7. Studies on monotreme proteins. VII. Amino acid sequence of myoglobin from the platypus, Ornithorhynchus anatinus.

    PubMed

    Fisher, W K; Thompson, E O

    1976-03-01

    Myoglobin isolated from skeletal muscle of the platypus contains 153 amino acid residues. The complete amino acid sequence has been determined following cleavage with cyanogen bromide and further digestion of the four fragments with trypsin, chymotrypsin, pepsin and thermolysin. Sequences of the purified peptides were determined by the dansyl-Edman procedure. The amino acid sequence showed 25 differences from human myoglobin and 24 from kangaroo myoglobin. Amino acid sequences in myoglobins are more conserved than sequences in the alpha- and beta-globin chains, and platypus myoglobin shows a similar number of variations in sequence to kangaroo myoglobin when compared with myoglobin of other species. The date of divergence of the platypus from other mammals was estimated at 102 +/- 31 million years, based on the number of amino acid differences between species and allowing for mutations during the evolutionary period. This estimate differs widely from the estimate given by similar treatment of the alpha- and beta-chain sequences and a constant rate of mutation of globin chains is not supported. PMID:962722

  8. Geographical distribution and temporal variation of rain acidity over China

    SciTech Connect

    Wen-Xing Wang; Yan-Bo Pang; Guo-An Ding

    1996-12-31

    In the last decade, large areas of acid rain have appeared in China. As emissions of SO₂ and NOₓ have increased year by year, the acidity of precipitation has increased and the acid rain area is expanding. China's acid rain area is now the third largest in the world, after those of Europe and North America. The Chinese government took action against acid rain and planned a five-year National Acid Deposition Research Project. The space-time distribution and variation of rain acidity described in this paper is part of this project. China is a large country, with an area almost equal to that of Europe. Its climate varies greatly and spans the tropical, subtropical, temperate and frigid zones. Its topography is varied, including mountains, hilly country, desert and plains, and the distribution of anthropogenic sources is uneven. These human and natural factors lead to different chemical compositions of precipitation in different parts of China, and the acidity of precipitation varies as well. The acidity of precipitation is the most important parameter in acid rain research. To obtain a regionally representative distribution of rain acidity, the National Acidic Deposition Research Monitoring Network, with 261 monitoring sites, was established in 1992. This paper summarizes the rain acidity of 21355 precipitation samples and gives the annual, seasonal, and monthly pH contours. Results show that the acid rain area has expanded from the south during winter. Regional differences in monthly acid precipitation exist: generally, the rain acidity level is higher during summer and fall and lower during winter and spring in the northern provinces. The opposite is the case in the southern provinces. The central areas are in a transitional situation. The geographical distribution and temporal variation of rain acidity are quite different from those in North America and Europe.

  9. Intrapatient sequence variation of the gag gene of human immunodeficiency virus type 1 plasma virions.

    PubMed Central

    Yoshimura, F K; Diem, K; Learn, G H; Riddell, S; Corey, L

    1996-01-01

    Because certain regions of the gag gene, such as p24, are highly conserved among human immunodeficiency virus (HIV) isolates, many therapeutic strategies have been directed at gag gene targets. Although intrapatient variation of segments of gag have been determined, little is known about the variability of the full-length gag gene for HIV isolated from a single individual. To evaluate intrapatient full-length gag variability, we derived the nucleotide sequences of at least 10 cDNA gag clones of virion RNA isolated from plasma for each of four asymptomatic HIV type 1-infected patients with relatively high CD4+ T-cell counts (300 to 450 cells per mm3). Mean values of intrapatient gag nucleotide variation obtained by pairwise comparisons ranged from 0.55 to 2.86%. For three subjects, this value was equivalent to that reported for intrapatient full-length env variation. The greatest range of intrapatient mean nucleotide variation for individual protein-coding regions was observed for p7. We did not detect any G-to-A hypermutation, as A-to-G and G-to-A transitions occurred at similar frequencies, accounting for 29 and 25%, respectively, of the changes. Mean variation values and phylogenetic analysis suggested that the extent of nucleotide variation correlated with the length of viral infection. Furthermore, no distinct subpopulations of quasispecies were detectable within an individual. The predicted amino acid sequences indicated that there were no regions within a gag protein that were comprised of clustered changes. PMID:8971017
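
    The "mean values of intrapatient nucleotide variation obtained by pairwise comparisons" reported above can be sketched as follows. This is an illustrative calculation only, assuming the clones are already aligned to equal length and that alignment gaps ("-") are skipped; the function names and toy sequences are hypothetical and the authors' exact procedure is not given in the abstract.

```python
from itertools import combinations


def percent_difference(a, b):
    """Percent of compared positions at which two aligned sequences differ.

    Positions where either sequence has a gap ('-') are ignored (an assumption).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    mismatches = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            mismatches += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * mismatches / compared


def mean_pairwise_variation(clones):
    """Mean percent difference over all unordered pairs of clones."""
    pairs = list(combinations(clones, 2))
    if not pairs:
        raise ValueError("need at least two sequences")
    return sum(percent_difference(a, b) for a, b in pairs) / len(pairs)


if __name__ == "__main__":
    # Three toy 10-nt clones; the third differs from the others at one site.
    clones = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT"]
    print(round(mean_pairwise_variation(clones), 2))
```

    With at least 10 clones per patient, as in the study, this averages over 45 or more pairs, so a single divergent clone shifts the mean less than it does in the three-sequence toy case.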

  10. Simple sequence repeat variations expedite phage divergence: Mechanisms of indels and gene mutations.

    PubMed

    Lin, Tiao-Yin

    2016-07-01

    Phages are the most abundant biological entities and influence prokaryotic communities on Earth. Comparing closely related genomes sheds light on molecular events shaping phage evolution. Simple sequence repeat (SSR) variations impart over half of the genomic changes between T7M and T3, indicating an important role of SSRs in accelerating phage genetic divergence. Differences in coding and noncoding regions of phages infecting different hosts, coliphages T7M and T3, Yersinia phage ϕYeO3-12, and Salmonella phage ϕSG-JL2, frequently arise from SSR variations. Such variations modify noncoding and coding regions; the latter efficiently changes multiple amino acids, thereby hastening protein evolution. Four classes of events are found to drive SSR variations: insertion/deletion of SSR units, expansion/contraction of SSRs without alteration of genome length, changes of repeat motifs, and generation/loss of repeats. The categorization demonstrates the ways SSRs mutate in genomes during phage evolution. Indels are common constituents of genome variations and human diseases, yet, how they occur without preexisting repeat sequence is less understood. Non-repeat-unit-based misalignment-elongation (NRUBME) is proposed to be one mechanism for indels without adjacent repeats. NRUBME or consecutive NRUBME may also change repeat motifs or generate new repeats. NRUBME invoking a non-Watson-Crick base pair explains insertions that initiate mononucleotide repeats. Furthermore, NRUBME successfully interprets many inexplicable human di- to tetranucleotide repeat generations. This study provides the first evidence of SSR variations expediting phage divergence, and enables insights into the events and mechanisms of genome evolution. NRUBME allows us to emulate natural evolution to design indels for various applications.
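
    Locating the simple sequence repeats that such comparisons start from can be sketched with a regular expression. This is a minimal illustration, not the method used in the study: the function name, the motif-length range of 1 to 6 nt, and the minimum of three tandem copies are assumptions, and only perfect repeats are reported.

```python
import re


def find_ssrs(seq, min_unit=1, max_unit=6, min_repeats=3):
    """Return sorted (start, unit, copies) tuples for perfect tandem repeats.

    A repeat is reported when a motif of min_unit..max_unit nucleotides occurs
    at least min_repeats times in a row. Motifs that are themselves built from
    a shorter motif (for example "AA" or "ATAT") are skipped so that each
    repeat is reported once, under its shortest unit.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # ([ACGT]{k}) captures the motif; \1{m,} requires m further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            reducible = any(
                unit == unit[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            )
            if reducible:
                continue
            copies = len(match.group(0)) // unit_len
            hits.append((match.start(), unit, copies))
    return sorted(hits)


if __name__ == "__main__":
    # Toy sequence: (AT)4 followed later by a run of five A's.
    toy = "GGC" + "AT" * 4 + "CC" + "A" * 5 + "G"
    for start, unit, copies in find_ssrs(toy):
        print(start, unit, copies)
```

    Running the same scan on two related genomes and comparing copy numbers at corresponding positions is one simple way to surface expansion or contraction events of the kind categorized in the abstract.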

  11. Segments of amino acid sequence similarity in beta-amylases.

    PubMed

    Friedberg, F; Rhodes, C

    1988-01-01

    In alpha-amylases from animals, plants and bacteria and in beta-amylases from plants and bacteria a number of segments exhibit amino acid sequence similarity specific to the alpha or to the beta type, respectively. In the case of the beta-amylases the similar sequence regions are extensive and they are disrupted only by short interspersed dissimilar regions. Close to the C terminus, however, no such sequence similarity exists. PMID:2464171

  12. Genotypic variation in fatty acid content of blackcurrant seeds.

    PubMed

    Ruiz del Castillo, M L; Dobson, G; Brennan, R; Gordon, S

    2002-01-16

    The fatty acid composition and total fatty acid content of seeds from 36 blackcurrant genotypes developed at the Scottish Crop Research Institute were examined. A rapid small-scale procedure, involving homogenization of seeds in toluene followed by sodium methoxide transesterification and gas chromatography, was used. There was considerable variation between genotypes. The gamma-linolenic acid content generally varied from 11 to 19% of the total fatty acids, but three genotypes had higher values of 22-24%, levels previously not reported for blackcurrant seed and similar to those for borage seed. Other nutritionally important fatty acids, stearidonic acid and alpha-linolenic acid, varied from 2 to 4% and 10-19%, respectively. The mean total fatty acid contents ranged from 14 to 23% of the seed, but repeatability was poor. The results are discussed. Blackcurrant seeds are mainly byproducts from juice production, and the study shows the potential for developing blackcurrant genotypes with optimal added value. PMID:11782203

  13. Haplotypes and Sequence Variation in the Ovine Adiponectin Gene (ADIPOQ).

    PubMed

    An, Qing-Ming; Zhou, Hui-Tong; Hu, Jiang; Luo, Yu-Zhu; Hickford, Jon G H

    2015-01-01

    The adiponectin gene (ADIPOQ) plays an important role in energy homeostasis. In this study five separate regions (regions 1 to 5) of ovine ADIPOQ were analysed using PCR-SSCP. Four different PCR-SSCP patterns (A₁-D₁, A₂-D₂) were detected in region-1 and region-2, respectively, with seven and six SNPs being revealed. In region-3, three different patterns (A₃-C₃) and three SNPs were observed. Two patterns (A₄-B₄, A₅-B₅) and two and one SNPs were observed in region-4 and region-5, respectively. In total, nineteen SNPs were detected, with five of them in the coding region and two (c.46T/C and c.515G/A) putatively resulting in amino acid changes (p.Tyr16His and p.Lys172Arg). In region-1, -2 and -3 of 316 sheep from eight New Zealand breeds, variants A₁, A₂ and A₃ were the most common, although variant frequencies differed in the eight breeds. Across region-1 and region-3, nine haplotypes were identified and haplotypes A₁-A₃, A₁-C₃, B₁-A₃ and B₁-C₃ were most common. These results indicate that the ADIPOQ gene is polymorphic and suggest that further analysis is required to see if the variation in the gene is associated with animal production traits. PMID:26610572

  14. Geochemical variations during the 2012 Emilia seismic sequence

    NASA Astrophysics Data System (ADS)

    Sciarra, Alessandra; Cantucci, Barbara; Galli, Gianfranco; Cinti, Daniele; Pizzino, Luca

    2015-04-01

    Several geochemical surveys (soil gas and shallow water) were performed in the Modena province (Massa Finalese, Finale Emilia, Medolla and S. Felice sul Panaro) during the 2006-2014 period. In May-June 2012, a seismic sequence (main shocks of ML 5.9 and 5.8) occurred close to the investigated area. In this area, 300 CO2 and CH4 flux measurements, 150 soil gas concentration measurements (He, H2, CO2, CH4 and C2H6), and 30 shallow water samplings with their isotopic analyses (δ13C-CH4, δD-CH4 and δ13C-CO2) were performed in April-May 2006 and in October and December 2008, and were repeated in May and September 2012, June 2013 and July 2014, after the 2012 Emilia seismic sequence. The chemical composition of the soil gas is dominated by CH4 in the southern part and by CO2 in the northern part. Very anomalous fluxes and concentrations were recorded in spot areas; elsewhere CO2 and CH4 values are very low, within the typical range of vegetative and organic exhalation of cultivated soil. After the seismic sequence, the CH4 and CO2 fluxes increased by one order of magnitude in the spot areas, whereas in the surrounding area the values remained within the background. In contrast, the CH4 concentration decreased (40% v/v in the 2012 surveys) and the CO2 concentration increased up to 12.7% v/v (2013 survey). Isotopic gas analyses were carried out only on samples with anomalous values. Pre-seismic data hint at a thermogenic origin of CH4, probably linked to leakage from a deep source in the Medolla area. Conversely, 2012/2013 isotopic data indicate a typical biogenic origin (i.e. microbial hydrocarbon production) of the CH4, as recognized elsewhere in the Po Plain and surroundings. The δ13C-CO2 value suggests a prevalent shallow origin of CO2 (i.e. organic and/or soil-derived) probably related to anaerobic oxidation of heavy hydrocarbons. Water samples, collected from domestic, industrial and hydrocarbon exploration wells, allowed us to recognize different families of waters. Waters are meteoric in origin and

  15. Amino acid sequences of proteins from Leptospira serovar pomona.

    PubMed

    Alves, S F; Lefebvre, R B; Probert, W

    2000-01-01

    This report describes partial amino acid sequences from three putative outer envelope proteins from Leptospira serovar pomona. In order to obtain internal fragments for protein sequencing, enzymatic and chemical digestion was performed. The enzyme clostripain was used to digest the 32 and 45 kDa proteins. In situ digestion of the 40 kDa protein was accomplished using cyanogen bromide. The 32 kDa protein generated two fragments, one of 21 kDa and another of 10 kDa that yielded five residues. A fragment of 24 kDa that yielded nineteen amino acid residues was obtained from the 45 kDa protein. A fragment with a molecular weight of 20 kDa, yielding a sequence of twenty amino acids, was obtained from the 40 kDa protein.

  16. The amino acid sequence of Staphylococcus aureus penicillinase.

    PubMed Central

    Ambler, R P

    1975-01-01

    The amino acid sequence of the penicillinase (penicillin amido-beta-lactamhydrolase, EC 3.5.2.6) from Staphylococcus aureus strain PC1 was determined. The protein consists of a single polypeptide chain of 257 residues, and the sequence was determined by characterization of tryptic, chymotryptic, peptic and CNBr peptides, with some additional evidence from thermolysin and S. aureus proteinase peptides. A mistake in the preliminary report of the sequence is corrected; residues 113-116 are now thought to be -Lys-Lys-Val-Lys- rather than -Lys-Val-Lys-Lys-. Detailed evidence for the amino acid sequence has been deposited as Supplementary Publication SUP 50056 (91 pages) at the British Library (Lending Division), Boston Spa, Wetherby, West Yorkshire LS23 7BQ, U.K., from whom copies may be obtained on the terms given in Biochem. J. (1975) 145, 5. PMID:1218078

  17. Flagellin gene sequence variation in the genus Pseudomonas.

    PubMed

    Bellingham, N F; Morgan, J A; Saunders, J R; Winstanley, C

    2001-07-01

    Flagellin gene (fliC) sequences from 18 strains of Pseudomonas sensu stricto representing 8 different species, and 9 representative fliC sequences from other members of the gamma sub-division of proteobacteria, were compared. Analysis was performed on N-terminal, C-terminal and whole fliC sequences. The fliC analyses confirmed the inferred relationship between P. mendocina, P. oleovorans and P. aeruginosa based on 16S rRNA sequence comparisons. In addition, the analyses indicated that P. putida PRS2000 was closely related to P. fluorescens SBW25 and P. fluorescens NCIMB 9046T, but suggested that P. putida PaW8 and P. putida PRS2000 were more closely related to other Pseudomonas spp. than they were to each other. There were a number of inconsistencies in inferred evolutionary relationships between strains, depending on the analysis performed. In particular, whole flagellin gene comparisons often differed from those obtained using N- and C-terminal sequences. However, there were also inconsistencies between the terminal region analyses, suggesting that phylogenetic relationships inferred on the basis of fliC sequence should be treated with caution. Although the central domain of fliC is highly variable between Pseudomonas strains, there was evidence of sequence similarities between the central domains of different Pseudomonas fliC sequences. This indicates the possibility of recombination in the central domain of fliC genes within Pseudomonas species, and between these genes and those from other bacteria. PMID:11518318

  18. The amino-acid sequence of kangaroo pancreatic ribonuclease.

    PubMed

    Gaastra, W; Welling, G W; Beintema, J J

    1978-05-01

    Red kangaroo (Macropus rufus) ribonuclease was isolated from pancreatic tissue by affinity chromatography. The amino acid sequence was determined by automatic sequencing of overlapping large fragments and by analysis of shorter peptides obtained by digestion with a number of proteolytic enzymes. The polypeptide chain consists of 122 amino acid residues. Compared to other ribonucleases, the N-terminal residue and residue 114 are deleted. In other pancreatic ribonucleases position 114 is occupied by a cis proline residue in an external loop at the surface of the molecule. Other remarkable substitutions are the presence of a tyrosine residue at position 123 instead of a serine which forms a hydrogen bond with the pyrimidine ring of a nucleotide substrate, and a number of hydrophobic-hydrophilic interchanges in the sequence 51-55, which forms part of an alpha-helix in bovine ribonuclease and exhibits few substitutions in the placental mammals. Kangaroo ribonuclease contains no carbohydrate, although the enzyme possesses a recognition site for carbohydrate attachment in the sequence Asn-Val-Thr (62-64). The enzyme differs at about 35-40% of the positions from all other mammalian pancreatic ribonucleases sequenced to date, which is in agreement with the early divergence between the marsupials and the placental mammals. From fragmentary data a tentative sequence of red-necked wallaby (Macropus rufogriseus) pancreatic ribonuclease has been derived. Eight differences with the kangaroo sequence were found.

  19. Prebiotically plausible mechanisms increase compositional diversity of nucleic acid sequences

    PubMed Central

    Derr, Julien; Manapat, Michael L.; Rajamani, Sudha; Leu, Kevin; Xulvi-Brunet, Ramon; Joseph, Isaac; Nowak, Martin A.; Chen, Irene A.

    2012-01-01

    During the origin of life, the biological information of nucleic acid polymers must have increased to encode functional molecules (the RNA world). Ribozymes tend to be compositionally unbiased, as is the vast majority of possible sequence space. However, ribonucleotides vary greatly in synthetic yield, reactivity and degradation rate, and their non-enzymatic polymerization results in compositionally biased sequences. While natural selection could lead to complex sequences, molecules with some activity are required to begin this process. Was the emergence of compositionally diverse sequences a matter of chance, or could prebiotically plausible reactions counter chemical biases to increase the probability of finding a ribozyme? Our in silico simulations using a two-letter alphabet show that template-directed ligation and high concatenation rates counter compositional bias and shift the pool toward longer sequences, permitting greater exploration of sequence space and stable folding. We verified experimentally that unbiased DNA sequences are more efficient templates for ligation, thus increasing the compositional diversity of the pool. Our work suggests that prebiotically plausible chemical mechanisms of nucleic acid polymerization and ligation could predispose toward a diverse pool of longer, potentially structured molecules. Such mechanisms could have set the stage for the appearance of functional activity very early in the emergence of life. PMID:22319215

  20. The amino-acid sequence of kangaroo pancreatic ribonuclease.

    PubMed

    Gaastra, W; Welling, G W; Beintema, J J

    1978-05-01

    Red kangaroo (Macropus rufus) ribonuclease was isolated from pancreatic tissue by affinity chromatography. The amino acid sequence was determined by automatic sequencing of overlapping large fragments and by analysis of shorter peptides obtained by digestion with a number of proteolytic enzymes. The polypeptide chain consists of 122 amino acid residues. Compared to other ribonucleases, the N-terminal residue and residue 114 are deleted. In other pancreatic ribonucleases position 114 is occupied by a cis proline residue in an external loop at the surface of the molecule. Other remarkable substitutions are the presence of a tyrosine residue at position 123 instead of a serine which forms a hydrogen bond with the pyrimidine ring of a nucleotide substrate, and a number of hydrophobic-hydrophilic interchanges in the sequence 51-55, which forms part of an alpha-helix in bovine ribonuclease and exhibits few substitutions in the placental mammals. Kangaroo ribonuclease contains no carbohydrate, although the enzyme possesses a recognition site for carbohydrate attachment in the sequence Asn-Val-Thr (62-64). The enzyme differs at about 35-40% of the positions from all other mammalian pancreatic ribonucleases sequenced to date, which is in agreement with the early divergence between the marsupials and the placental mammals. From fragmentary data a tentative sequence of red-necked wallaby (Macropus rufogriseus) pancreatic ribonuclease has been derived. Eight differences with the kangaroo sequence were found. PMID:658039

  1. Glycoprotein Gene Sequence Variation in Rhesus Monkey Rhadinovirus

    PubMed Central

    Shin, Young C.; Jones, Leandro R.; Manrique, Julieta; Lauer, William; Carville, Angela; Mansfield, Keith G.; Desrosiers, Ronald C.

    2010-01-01

    Gene sequences for seven glycoproteins from 20 independent isolates of rhesus monkey rhadinovirus (RRV) and of the corresponding seven glycoprotein genes from nine strains of the Kaposi’s sarcoma-associated herpesvirus (KSHV) were obtained and analyzed. Phylogenetic analysis revealed two discrete groupings of RRV gH sequences, two discrete groupings of RRV gL sequences and two discrete groupings of RRV gB sequences. We called these phylogenetic groupings gHa, gHb, gLa, gLb, gBa and gBb. gHa was always paired with gLa and gHb was always paired with gLb for any individual RRV isolate. Since gH and gL are known to be interacting partners, these results suggest the need of matching sequence types for function of these cooperating proteins. gB phylogenetic grouping was not associated with gH/gL phylogenetic grouping. Our results demonstrate two distinct, distantly-related phylogenetic groupings of gH and gL of RRV despite a remarkable degree of sequence conservation within each individual phylogenetic group. PMID:20172576

  2. Clinal variation for amino acid polymorphisms at the Pgm locus in Drosophila melanogaster.

    PubMed Central

    Verrelli, B C; Eanes, W F

    2001-01-01

    Clinal variation is common for enzymes in the glycolytic pathway for Drosophila melanogaster and is generally accepted as an adaptive response to different climates. Although the enzyme phosphoglucomutase (PGM) possesses several allozyme polymorphisms, it is unique in that it had been reported to show no clinal variation. Our recent DNA sequence investigation of Pgm found extensive cryptic amino acid polymorphism segregating with the allozyme alleles. In this study, we characterize the geographic variation of Pgm amino acid polymorphisms at the nucleotide level along a latitudinal cline in the eastern United States. A survey of 15 SNPs across the Pgm gene finds significant clinal differentiation for the allozyme polymorphisms as well as for many of the cryptic amino acid polymorphisms. A test of independence shows that pervasive linkage disequilibrium across this gene region can explain many of the amino acid clines. A single Pgm haplotype defined by two amino acid polymorphisms shows the strongest correlation with latitude and the steepest change in allele frequency across the cline. We propose that clinal selection at Pgm may in part explain the extensive amino acid polymorphism at this locus and is consistent with a multilocus response to selection in the glycolytic pathway. PMID:11290720

  3. Limited Variation in BK Virus T-Cell Epitopes Revealed by Next-Generation Sequencing.

    PubMed

    Sahoo, Malaya K; Tan, Susanna K; Chen, Sharon F; Kapusinszky, Beatrix; Concepcion, Katherine R; Kjelson, Lynn; Mallempati, Kalyan; Farina, Heidi M; Fernández-Viña, Marcelo; Tyan, Dolly; Grimm, Paul C; Anderson, Matthew W; Concepcion, Waldo; Pinsky, Benjamin A

    2015-10-01

    BK virus (BKV) infection causing end-organ disease remains a formidable challenge to the hematopoietic cell transplant (HCT) and kidney transplant fields. As BKV-specific treatments are limited, immunologic-based therapies may be a promising and novel therapeutic option for transplant recipients with persistent BKV infection. Here, we describe a whole-genome, deep-sequencing methodology and bioinformatics pipeline that identify BKV variants across the genome and at BKV-specific HLA-A2-, HLA-B0702-, and HLA-B08-restricted CD8 T-cell epitopes. BKV whole genomes were amplified using long-range PCR with four inverse primer sets, and fragmentation libraries were sequenced on the Ion Torrent Personal Genome Machine (PGM). An error model and variant-calling algorithm were developed to accurately identify rare variants. A total of 65 samples from 18 pediatric HCT and kidney recipients with quantifiable BKV DNAemia underwent whole-genome sequencing. Limited genetic variation was observed. The median number of amino acid variants identified per sample was 8 (range, 2 to 37; interquartile range, 10), with the majority of variants (77%) detected at a frequency of <5%. When normalized for length, there was no statistical difference in the median number of variants across all genes. Similarly, the predominant virus population within samples harbored T-cell epitopes similar to the reference BKV strain that was matched for the BKV genotype. Despite the conservation of epitopes, low-level variants in T-cell epitopes were detected in 77.7% (14/18) of patients. Understanding epitope variation across the whole genome provides insight into the virus-immune interface and may help guide the development of protocols for novel immunologic-based therapies.

  4. Limited Variation in BK Virus T-Cell Epitopes Revealed by Next-Generation Sequencing

    PubMed Central

    Sahoo, Malaya K.; Tan, Susanna K.; Chen, Sharon F.; Kapusinszky, Beatrix; Concepcion, Katherine R.; Kjelson, Lynn; Mallempati, Kalyan; Farina, Heidi M.; Fernández-Viña, Marcelo; Tyan, Dolly; Grimm, Paul C.; Anderson, Matthew W.; Concepcion, Waldo

    2015-01-01

    BK virus (BKV) infection causing end-organ disease remains a formidable challenge to the hematopoietic cell transplant (HCT) and kidney transplant fields. As BKV-specific treatments are limited, immunologic-based therapies may be a promising and novel therapeutic option for transplant recipients with persistent BKV infection. Here, we describe a whole-genome, deep-sequencing methodology and bioinformatics pipeline that identify BKV variants across the genome and at BKV-specific HLA-A2-, HLA-B0702-, and HLA-B08-restricted CD8 T-cell epitopes. BKV whole genomes were amplified using long-range PCR with four inverse primer sets, and fragmentation libraries were sequenced on the Ion Torrent Personal Genome Machine (PGM). An error model and variant-calling algorithm were developed to accurately identify rare variants. A total of 65 samples from 18 pediatric HCT and kidney recipients with quantifiable BKV DNAemia underwent whole-genome sequencing. Limited genetic variation was observed. The median number of amino acid variants identified per sample was 8 (range, 2 to 37; interquartile range, 10), with the majority of variants (77%) detected at a frequency of <5%. When normalized for length, there was no statistical difference in the median number of variants across all genes. Similarly, the predominant virus population within samples harbored T-cell epitopes similar to the reference BKV strain that was matched for the BKV genotype. Despite the conservation of epitopes, low-level variants in T-cell epitopes were detected in 77.7% (14/18) of patients. Understanding epitope variation across the whole genome provides insight into the virus-immune interface and may help guide the development of protocols for novel immunologic-based therapies. PMID:26202116

  5. Development of an expert system for amino acid sequence identification.

    PubMed

    Hu, L; Saulinskas, E F; Johnson, P; Harrington, P B

    1996-08-01

    An expert system for amino acid sequence identification has been developed. The algorithm uses heuristic rules developed by human experts in protein sequencing. The system is applied to the chromatographic data of phenylthiohydantoin-amino acids acquired from an automated sequencer. The peak intensities in the current cycle are compared with those in the previous cycle, while the calibration and succeeding cycles are used as ancillary identification criteria when necessary. The retention time for each chromatographic peak in each cycle is corrected by the corresponding peak in the calibration cycle at the same run. The main improvement of our system compared with the onboard software used by the Applied Biosystems 477A Protein/Peptide Sequencer is that each peak in each cycle is assigned an identification name according to the corrected retention time to be used for the comparison with different cycles. The system was developed from analyses of ribonuclease A and evaluated by runs of four other protein samples that were not used in rule development. This paper demonstrates that rules developed by human experts can be automatically applied to sequence assignment. The expert system performed more accurately than the onboard software of the protein sequencer, in that the misidentification rates for the expert system were around 7%, whereas those for the onboard software were between 13 and 21%.

  6. Cytochrome b nucleotide sequence variation among the Atlantic Alcidae.

    PubMed

    Friesen, V L; Montevecchi, W A; Davidson, W S

    1993-01-01

    Analysis of cytochrome b nucleotide sequences of the six extant species of Atlantic alcids and a gull revealed an excess of adenines and cytosines and a deficit of guanines at silent sites on the coding strand. Phylogenetic analyses grouped the sequences of the common (Uria aalge) and Brünnich's (U. lomvia) guillemots, followed by the razorbill (Alca torda) and little auk (Alle alle). The black guillemot (Cepphus grylle) sequence formed a sister taxon, and the puffin (Fratercula arctica) fell outside the other alcids. Phylogenetic comparisons of substitutions indicated that mutabilities of bases did not differ, but that C was much more likely to be incorporated than was G. Imbalances in base composition appear to result from a strand bias in replication errors, which may result from selection on secondary RNA structure and/or the energetics of codon-anticodon interactions. PMID:7916741
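
    A rough way to look at the kind of base-composition imbalance described above is to tabulate the bases at third codon positions of the coding strand. This is only a proxy and a loud assumption: true silent sites depend on the genetic code and on codon degeneracy, which this sketch ignores. The function name and the toy sequence are hypothetical.

```python
from collections import Counter


def third_position_composition(coding_seq: str) -> dict[str, float]:
    """Fraction of A, C, G and T at third codon positions of an in-frame sequence."""
    usable = len(coding_seq) - len(coding_seq) % 3  # drop any trailing partial codon
    thirds = [coding_seq[i + 2] for i in range(0, usable, 3)]
    if not thirds:
        raise ValueError("sequence shorter than one codon")
    counts = Counter(thirds)
    return {base: counts[base] / len(thirds) for base in "ACGT"}


# Hypothetical in-frame fragment: codons ATG GCC AAA CCA.
composition = third_position_composition("ATGGCCAAACCA")
print(composition)
```

    An excess of A and C and a deficit of G, as reported for the alcid sequences, would show up as G having a much lower fraction than A or C.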

  7. Metadata-driven comparative analysis tool for sequences (meta-CATS): an automated process for identifying significant sequence variations that correlate with virus attributes.

    PubMed

    Pickett, B E; Liu, M; Sadat, E L; Squires, R B; Noronha, J M; He, S; Jen, W; Zaremba, S; Gu, Z; Zhou, L; Larsen, C N; Bosch, I; Gehrke, L; McGee, M; Klem, E B; Scheuermann, R H

    2013-12-01

    The Virus Pathogen Resource (ViPR; www.viprbrc.org) and Influenza Research Database (IRD; www.fludb.org) have developed a metadata-driven Comparative Analysis Tool for Sequences (meta-CATS), which performs statistical comparative analyses of nucleotide and amino acid sequence data to identify correlations between sequence variations and virus attributes (metadata). Meta-CATS guides users through: selecting a set of nucleotide or protein sequences; dividing them into multiple groups based on any associated metadata attribute (e.g. isolation location, host species); performing a statistical test at each aligned position; and identifying all residues that significantly differ between the groups. As proofs of concept, we have used meta-CATS to identify sequence biomarkers associated with dengue viruses isolated from different hemispheres, and to identify variations in the NS1 protein that are unique to each of the 4 dengue serotypes. Meta-CATS is made freely available to virology researchers to identify genotype-phenotype correlations for development of improved vaccines, diagnostics, and therapeutics.
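
    The workflow described above (group sequences by a metadata attribute, then test each aligned position) can be sketched as follows. This is not the meta-CATS implementation: the abstract does not name the statistical test, so a chi-square test of independence is assumed here, and the function names, group labels and sequences are hypothetical. Only the statistic and degrees of freedom are computed; converting them to a p-value is left to a statistics library.

```python
from collections import Counter


def chi_square_at_position(groups: dict[str, list[str]], pos: int) -> tuple[float, int]:
    """Chi-square statistic and degrees of freedom for residue-by-group counts at one column."""
    counts = {name: Counter(seq[pos] for seq in seqs) for name, seqs in groups.items()}
    residues = sorted(set().union(*counts.values()))
    if len(residues) < 2 or len(groups) < 2:
        return 0.0, 0  # invariant column: nothing to test
    total = sum(sum(c.values()) for c in counts.values())
    column_totals = {r: sum(c[r] for c in counts.values()) for r in residues}
    chi2 = 0.0
    for c in counts.values():
        row_total = sum(c.values())
        for r in residues:
            expected = row_total * column_totals[r] / total
            chi2 += (c[r] - expected) ** 2 / expected
    dof = (len(groups) - 1) * (len(residues) - 1)
    return chi2, dof


# Hypothetical aligned protein fragments grouped by a metadata attribute.
groups = {
    "north": ["MKT", "MKT", "MKT", "MKS"],
    "south": ["MRT", "MRT", "MRT", "MRS"],
}
results = [chi_square_at_position(groups, pos) for pos in range(3)]
print(results)
```

    In this toy alignment only the second column separates the two groups; the third column varies, but identically in both groups, so its statistic is zero.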

  8. Metadata-driven Comparative Analysis Tool for Sequences (meta-CATS): an Automated Process for Identifying Significant Sequence Variations that Correlate with Virus Attributes

    PubMed Central

    Pickett, BE; Liu, M; Sadat, EL; Squires, RB; Noronha, JM; He, S; Jen, W; Zaremba, S; Gu, Z; Zhou, L; Larsen, CN; Bosch, I; Gehrke, L; McGee, M; Klem, EB; Scheuermann, RH

    2016-01-01

    The Virus Pathogen Resource (ViPR; www.viprbrc.org) and Influenza Research Database (IRD; www.fludb.org) have developed a metadata-driven Comparative Analysis Tool for Sequences (meta-CATS), which performs statistical comparative analyses of nucleotide and amino acid sequence data to identify correlations between sequence variations and virus attributes (metadata). Meta-CATS guides users through: selecting a set of nucleotide or protein sequences; dividing them into multiple groups based on any associated metadata attribute (e.g. isolation location, host species); performing a statistical test at each aligned position; and identifying all residues that significantly differ between the groups. As proofs of concept, we have used meta-CATS to identify sequence biomarkers associated with dengue viruses isolated from different hemispheres, and to identify variations in the NS1 protein that are unique to each of the 4 dengue serotypes. Meta-CATS is made freely available to virology researchers to identify genotype-phenotype correlations for development of improved vaccines, diagnostics, and therapeutics. PMID:24210098

  9. Fatty acid metabolism: Implications for diet, genetic variation, and disease

    PubMed Central

    Suburu, Janel; Gu, Zhennan; Chen, Haiqin; Chen, Wei; Zhang, Hao; Chen, Yong Q.

    2014-01-01

    Cultures across the globe, especially Western societies, are burdened by chronic diseases such as obesity, metabolic syndrome, cardiovascular disease, and cancer. Several factors, including diet, genetics, and sedentary lifestyle, are suspected culprits in the development and progression of these health maladies. Fatty acids are primary constituents of cellular physiology. Humans can acquire fatty acids by de novo synthesis from carbohydrate or protein sources or by dietary consumption. Importantly, regulation of their metabolism is critical to sustain balanced homeostasis, and perturbations of such can lead to the development of disease. Here, we review de novo and dietary fatty acid metabolism and highlight recent advances in our understanding of the relationship between dietary influences and genetic variation in fatty acid metabolism and their role in chronic diseases. PMID:24511462

  10. Variation in Symbiodinium ITS2 sequence assemblages among coral colonies.

    PubMed

    Stat, Michael; Bird, Christopher E; Pochon, Xavier; Chasqui, Luis; Chauka, Leonard J; Concepcion, Gregory T; Logan, Dan; Takabayashi, Misaki; Toonen, Robert J; Gates, Ruth D

    2011-01-01

    Endosymbiotic dinoflagellates in the genus Symbiodinium are fundamentally important to the biology of scleractinian corals, as well as to a variety of other marine organisms. The genus Symbiodinium is genetically and functionally diverse and the taxonomic nature of the union between Symbiodinium and corals is implicated as a key trait determining the environmental tolerance of the symbiosis. Surprisingly, the question of how Symbiodinium diversity partitions within a species across spatial scales of meters to kilometers has received little attention, but is important to understanding the intrinsic biological scope of a given coral population and adaptations to the local environment. Here we address this gap by describing the Symbiodinium ITS2 sequence assemblages recovered from colonies of the reef building coral Montipora capitata sampled across Kāne'ohe Bay, Hawai'i. A total of 52 corals were sampled in a nested design of Coral Colony(Site(Region)) reflecting spatial scales of meters to kilometers. A diversity of Symbiodinium ITS2 sequences was recovered with the majority of variance partitioning at the level of the Coral Colony. To confirm this result, the Symbiodinium ITS2 sequence diversity in six M. capitata colonies was analyzed in much greater depth with 35 to 55 clones per colony. The ITS2 sequences and quantitative composition recovered from these colonies varied significantly, indicating that each coral hosted a different assemblage of Symbiodinium. The diversity of Symbiodinium ITS2 sequence assemblages retrieved from individual colonies of M. capitata here highlights the problems inherent in interpreting multi-copy and intra-genomically variable molecular markers, and serves as a context for discussing the utility and biological relevance of assigning species names based on Symbiodinium ITS2 genotyping. PMID:21246044

  11. Extensive sequence variation in rice blast resistance gene Pi54 makes it broad spectrum in nature

    PubMed Central

    Thakur, Shallu; Singh, Pankaj K.; Das, Alok; Rathour, R.; Variar, M.; Prashanthi, S. K.; Singh, A. K.; Singh, U. D.; Chand, Duni; Singh, N. K.; Sharma, Tilak R.

    2015-01-01

    The rice blast resistance gene Pi54, cloned from the rice line Tetep, is effective against diverse isolates of Magnaporthe oryzae. In this study, we prospected the allelic variants of the dominant blast resistance gene from a set of 92 rice lines to determine the nucleotide diversity, pattern of its molecular evolution, phylogenetic relationships and evolutionary dynamics, and to develop allele specific markers. High quality sequences were generated for homologs of the Pi54 gene. Using comparative sequence analysis, InDels of variable sizes were observed in all the alleles. Profiling of the selected sites of SNP (Single Nucleotide Polymorphism) and amino acids (N sites ≥ 10) exhibited constant frequency distribution of mutational and substitutional sites between the resistance and susceptible rice lines, respectively. A total of 50 new haplotypes based on the nucleotide polymorphism were also identified. A unique haplotype (H_3) was found to be linked to all the resistant alleles isolated from indica rice lines. Unique leucine zipper and tyrosine sulfation sites were identified in the predicted Pi54 proteins. Selection signals were observed in the entire coding sequence of resistance alleles, as compared to LRR domains for susceptible alleles. This is the first report of extensive variability of Pi54 alleles in different landraces and cultivated varieties, possibly contributing to broad-spectrum resistance to Magnaporthe oryzae. The sequence variation in two consensus regions (163 and 144 bp) was used for the development of allele specific DNA markers. Validated markers can be used for the selection and identification of better allele(s) and their introgression in commercial rice cultivars employing marker assisted selection. PMID:26052332

  12. The genome of RNA tumor viruses contains polyadenylic acid sequences.

    PubMed

    Green, M; Cartas, M

    1972-04-01

    The 70S genome of two RNA tumor viruses, murine sarcoma virus and avian myeloblastosis virus, binds to Millipore filters in buffer with high salt concentration and to glass fiber filters containing poly(U). These observations suggest that 70S RNA contains adenylic acid-rich sequences. When digested by pancreatic RNase, 70S RNA of murine sarcoma virus yielded poly(A) sequences that contain 91% adenylic acid. These poly(A) sequences sedimented as a relatively homogenous peak in sucrose gradients with a sedimentation coefficient of 4-5 S, but had a mobility during polyacrylamide gel electrophoresis that corresponds to molecules that sediment at 6-7 S. If we estimate a molecular weight for each sequence of 30,000-60,000 (100-200 nucleotides) and a molecular weight for viral 70S RNA of 3-12 million, each viral genome could contain 1-8 poly(A) sequences. Possible functions of poly(A) in the infecting viral RNA may include a role in the initiation of viral DNA or RNA synthesis, in protein maturation, or in the assembly of the viral genome.

  13. [Mitochondrial DNA sequence variations of Keriyan in the Taklamakan desert].

    PubMed

    Duan, Ran-Hui; Cui, Yin-Qiu; Zhou, Hui; Zhu, Hong

    2003-05-01

    The Keriyans live in the center of the Taklamakan desert of Xinjiang Province and have never intermarried with outsiders. It is not clearly known how they came to settle there or from whom they descend. The mtDNA hypervariable segment I (HVS I) was sequenced in 75 Keriyans. Seventy-one unique HVS I types were identified, varying at 68 nucleotide positions. Nucleotide diversity and the mean pairwise differences of the Keriyans are intermediate between those reported for Eastern and Western populations. The Keriyans' low Tajima's D statistic and bell-shaped pairwise-difference distribution can be interpreted as the hallmark of an ancient population expansion. Phylogenetic analysis shows that Central Asian populations occupy a position intermediate between the Eastern and Western populations; moreover, the Keriyans show shorter genetic distances to the Xinjiang Uighur and to Uighur from other places than to other populations.

  14. Sequence variation of alcohol dehydrogenase (Adh) paralogs in cactophilic Drosophila.

    PubMed Central

    Matzkin, Luciano M; Eanes, Walter F

    2003-01-01

    This study focuses on the population genetics of alcohol dehydrogenase (Adh) in cactophilic Drosophila. Drosophila mojavensis and D. arizonae utilize cactus hosts, and each host contains a characteristic mixture of alcohol compounds. In these Drosophila species there are two functional Adh loci, an adult form (Adh-2) and a larval and ovarian form (Adh-1). Overall, the greater level of variation segregating in D. arizonae than in D. mojavensis suggests a larger population size for D. arizonae. There are markedly different patterns of variation between the paralogs across both species. A 16-bp intron haplotype segregates in both species at Adh-2, apparently the product of an ancient gene conversion event between the paralogs, which suggests that there is selection for the maintenance of the intron structure, possibly for the maintenance of pre-mRNA structure. We observe a pattern of variation consistent with adaptive protein evolution in the D. mojavensis lineage at Adh-1, suggesting that the cactus host shift that occurred in the divergence of D. mojavensis from D. arizonae had an effect on the evolution of the larval expressed paralog. Contrary to previous work, we estimate a recent time for both the divergence of D. mojavensis and D. arizonae (2.4 ± 0.7 MY) and the age of the gene duplication (3.95 ± 0.45 MY). PMID:12586706

  15. Sequences Of Amino Acids For Human Serum Albumin

    NASA Technical Reports Server (NTRS)

    Carter, Daniel C.

    1992-01-01

    Sequences of amino acids defined for use in making polypeptides one-third to one-sixth as large as parent human serum albumin molecule. Smaller, chemically stable peptides have diverse applications including service as artificial human serum and as active components of biosensors and chromatographic matrices. In applications involving production of artificial sera from new sequences, little or no concern about viral contaminants. Smaller genetically engineered polypeptides more easily expressed and produced in large quantities, making commercial isolation and production more feasible and profitable.

  16. Nucleic acid sequence design via efficient ensemble defect optimization.

    PubMed

    Zadeh, Joseph N; Wolfe, Brian R; Pierce, Niles A

    2011-02-01

    We describe an algorithm for designing the sequence of one or more interacting nucleic acid strands intended to adopt a target secondary structure at equilibrium. Sequence design is formulated as an optimization problem with the goal of reducing the ensemble defect below a user-specified stop condition. For a candidate sequence and a given target secondary structure, the ensemble defect is the average number of incorrectly paired nucleotides at equilibrium evaluated over the ensemble of unpseudoknotted secondary structures. To reduce the computational cost of accepting or rejecting mutations to a random initial sequence, candidate mutations are evaluated on the leaf nodes of a tree-decomposition of the target structure. During leaf optimization, defect-weighted mutation sampling is used to select each candidate mutation position with probability proportional to its contribution to the ensemble defect of the leaf. As subsequences are merged moving up the tree, emergent structural defects resulting from crosstalk between sibling sequences are eliminated via reoptimization within the defective subtree starting from new random subsequences. Using a Θ(N^3) dynamic program to evaluate the ensemble defect of a target structure with N nucleotides, this hierarchical approach implies an asymptotic optimality bound on design time: for sufficiently large N, the cost of sequence design is bounded below by 4/3 the cost of a single evaluation of the ensemble defect for the full sequence. Hence, the design algorithm has time complexity Ω(N^3). For target structures containing N ∈ {100, 200, 400, 800, 1600, 3200} nucleotides and duplex stems ranging from 1 to 30 base pairs, RNA sequence designs at 37°C typically succeed in satisfying a stop condition with ensemble defect less than N/100. Empirically, the sequence design algorithm exhibits asymptotic optimality and the exponent in the time complexity bound is sharp.
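
    The two quantities named in this abstract, the ensemble defect and defect-weighted mutation sampling, can be illustrated with a small sketch. It is not the authors' implementation: it assumes the equilibrium pairing probabilities have already been computed elsewhere (the Θ(N^3) dynamic program is not reproduced), and the function names and the toy four-nucleotide input are hypothetical.

```python
import random


def ensemble_defect(pair_probs, unpaired_probs, target_partner):
    """Return (total defect, per-nucleotide defects).

    pair_probs[i][j]  : assumed equilibrium probability that i pairs with j
    unpaired_probs[i] : assumed equilibrium probability that i is unpaired
    target_partner[i] : index of i's partner in the target structure,
                        or None if i is unpaired in the target
    """
    per_nt = []
    for i, partner in enumerate(target_partner):
        if partner is None:
            p_correct = unpaired_probs[i]
        else:
            p_correct = pair_probs[i][partner]
        # Probability that nucleotide i is NOT in its target state.
        per_nt.append(1.0 - p_correct)
    return sum(per_nt), per_nt


def sample_mutation_position(per_nt, rng):
    """Pick a position with probability proportional to its defect.

    Raises ValueError if every defect is zero (nothing left to fix).
    """
    return rng.choices(range(len(per_nt)), weights=per_nt, k=1)[0]


# Toy input (hypothetical numbers): target pairs 0 with 3; 1 and 2 unpaired.
target = [3, None, None, 0]
unpaired = [0.1, 0.8, 0.8, 0.1]
pairs = [
    [0.0, 0.0, 0.0, 0.9],
    [0.0, 0.0, 0.2, 0.0],
    [0.0, 0.2, 0.0, 0.0],
    [0.9, 0.0, 0.0, 0.0],
]

defect, per_nt = ensemble_defect(pairs, unpaired, target)
stop_condition = len(target) / 100  # the N/100 threshold from the abstract
position = sample_mutation_position(per_nt, random.Random(0))
print(f"defect={defect:.2f}, satisfied={defect < stop_condition}, mutate={position}")
```

    In this toy case the two unpaired positions carry twice the defect of the paired ones, so they are twice as likely to be chosen for mutation.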

  17. On combining protein sequences and nucleic acid sequences in phylogenetic analysis: the homeobox protein case.

    PubMed

    Agosti, D; Jacobs, D; DeSalle, R

    1996-01-01

    Amino acid encoding genes contain character state information that may be useful for phylogenetic analysis on at least two levels. The nucleotide sequence and the translated amino acid sequences have both been employed separately as character states for cladistic studies of various taxa, including studies of the genealogy of genes in multigene families. In essence, amino acid sequences and nucleic acid sequences are two different ways of character coding the information in a gene. Silent positions in the nucleotide sequence (first or third positions in codons that can accrue change without changing the identity of the amino acid that the triplet codes for) may accrue change relatively rapidly and become saturated, losing the pattern of historical divergence. On the other hand, non-silent nucleotide alterations and their accompanying amino acid changes may evolve too slowly to reveal relationships among closely related taxa. In general, the dynamics of sequence change in silent and non-silent positions in protein coding genes result in homoplasy and lack of resolution, respectively. We suggest that the combination of nucleic acid and the translated amino acid coded character states into the same data matrix for phylogenetic analysis addresses some of the problems caused by the rapid change of silent nucleotide positions and overall slow rate of change of non-silent nucleotide positions and slowly changing amino acid positions. One major theoretical problem with this approach is the apparent non-independence of the two sources of characters. However, there are at least three possible outcomes when comparing protein coding nucleic acid sequences with their translated amino acids in a phylogenetic context on a codon by codon basis. First, the two character sets for a codon may be entirely congruent with respect to the information they convey about the relationships of a certain set of taxa. 
Second, one character set may display no information concerning a phylogenetic

  18. Nanopores and nucleic acids: prospects for ultrarapid sequencing

    NASA Technical Reports Server (NTRS)

    Deamer, D. W.; Akeson, M.

    2000-01-01

    DNA and RNA molecules can be detected as they are driven through a nanopore by an applied electric field at rates ranging from several hundred microseconds to a few milliseconds per molecule. The nanopore can rapidly discriminate between pyrimidine and purine segments along a single-stranded nucleic acid molecule. Nanopore detection and characterization of single molecules represents a new method for directly reading information encoded in linear polymers. If single-nucleotide resolution can be achieved, it is possible that nucleic acid sequences can be determined at rates exceeding a thousand bases per second.

  19. Allelic sequence variation of the HLA-DQ loci: relationship to serology and to insulin-dependent diabetes susceptibility.

    PubMed Central

    Horn, G T; Bugawan, T L; Long, C M; Erlich, H A

    1988-01-01

    Analysis of sequence variation in the polymorphic second exon of the major histocompatibility complex genes HLA-DQ alpha and -DQ beta has revealed 8 allelic variants at the alpha locus and 13 variants at the beta locus. Correlation of sequence variation with serologic typing suggests that the DQw2, DQw3, and DQ(blank) types are determined by the DQ beta subunit, while the DQw1 specificity is determined by DQ alpha. The nature of the amino acid at position 57 in the DQ beta subunit is correlated with susceptibility to insulin-dependent diabetes mellitus. This region of the DQ beta chain contains shared peptides with Epstein-Barr virus and rubella virus. PMID:2842756

  20. Sequence variation and haplotype structure at the human HFE locus.

    PubMed Central

    Toomajian, Christopher; Kreitman, Martin

    2002-01-01

    The HFE locus encodes an HLA class-I-type protein important in iron regulation and segregates replacement mutations that give rise to the most common form of genetic hemochromatosis. The high frequency of one disease-associated mutation, C282Y, and the nature of this disease have led some to suggest a selective advantage for this mutation. To investigate the context in which this mutation arose and gain a better understanding of HFE genetic variation, we surveyed nucleotide variability in 11.2 kb encompassing the HFE locus and experimentally determined haplotypes. We fully resequenced 60 chromosomes of African, Asian, or European ancestry as well as one chimpanzee, revealing 41 variable sites and a nucleotide diversity of 0.08%. This indicates that linkage to the HLA region has not substantially increased the level of HFE variation. Although several haplotypes are shared between populations, one haplotype predominates in Asia but is nearly absent elsewhere, causing higher than average genetic differentiation among the three major populations. Our samples show evidence of intragenic recombination, so the scarcity of recombination events within the C282Y allele class is consistent with selection increasing the frequency of a young allele. Otherwise, the pattern of variability in this region does not clearly indicate the action of positive selection at this or linked loci. PMID:12196404
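
    The nucleotide diversity reported in this abstract is conventionally the average number of pairwise differences per site among the sampled sequences. A minimal sketch of that calculation follows, assuming equal-length, gap-free aligned sequences; the function names and the three toy sequences are hypothetical and are not data from the study.

```python
from itertools import combinations


def nucleotide_diversity(aligned_seqs):
    """Average pairwise differences per site over all sequence pairs."""
    if len(aligned_seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned_seqs[0])
    if any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be aligned to equal length")
    pairs = list(combinations(aligned_seqs, 2))
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / len(pairs) / length


def segregating_sites(aligned_seqs):
    """Count alignment columns that carry more than one nucleotide."""
    return sum(len(set(column)) > 1 for column in zip(*aligned_seqs))


toy = ["AAAA", "AAAT", "AATT"]  # hypothetical alignment
pi = nucleotide_diversity(toy)
s = segregating_sites(toy)
print(f"pi={pi:.4f}, variable sites={s}")
```

    On real data the same functions would be applied to the resequenced chromosomes; a value of 0.0008 would correspond to the 0.08% quoted above.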

  1. HIV-1 sequence variation between isolates from mother-infant transmission pairs

    SciTech Connect

    Wike, C.M.; Daniels, M.R.; Furtado, M.; Wolinsky, M.; Korber, B.; Hutto, C.; Munoz, J.; Parks, W.; Saah, A.

    1991-01-01

    To examine the sequence diversity of human immunodeficiency virus type 1 (HIV-1) between known transmission sets, sequences from the V3 and V4-V5 region of the env gene from 4 mother-infant pairs were analyzed. The mean interpatient sequence variation between isolates from linked mother-infant pairs was comparable to the sequence diversity found between isolates from other close contacts. The mean intrapatient variation was significantly less in the infants' isolates than in the isolates from both their mothers and other characterized intrapatient sequence sets. In addition, a distinct and characteristic difference in the glycosylation pattern preceding the V3 loop was found between each linked transmission pair. These findings indicate that selection of specific genotypic variants, which may play a role in some direct transmission sets, and the duration of infection are important factors in the degree of diversity seen between the sequence sets.

  2. HIV-1 sequence variation between isolates from mother-infant transmission pairs

    SciTech Connect

    Wike, C.M.; Daniels, M.R.; Furtado, M.; Wolinsky, M.; Korber, B.; Hutto, C.; Munoz, J.; Parks, W.; Saah, A.

    1991-12-31

    To examine the sequence diversity of human immunodeficiency virus type 1 (HIV-1) between known transmission sets, sequences from the V3 and V4-V5 region of the env gene from 4 mother-infant pairs were analyzed. The mean interpatient sequence variation between isolates from linked mother-infant pairs was comparable to the sequence diversity found between isolates from other close contacts. The mean intrapatient variation was significantly less in the infants' isolates than in the isolates from both their mothers and other characterized intrapatient sequence sets. In addition, a distinct and characteristic difference in the glycosylation pattern preceding the V3 loop was found between each linked transmission pair. These findings indicate that selection of specific genotypic variants, which may play a role in some direct transmission sets, and the duration of infection are important factors in the degree of diversity seen between the sequence sets.

  3. Atomic force microscopy of crystalline insulins: the influence of sequence variation on crystallization and interfacial structure.

    PubMed Central

    Yip, C M; Brader, M L; DeFelippis, M R; Ward, M D

    1998-01-01

    The self-association of proteins is influenced by amino acid sequence, molecular conformation, and the presence of molecular additives. In the presence of phenolic additives, LysB28ProB29 insulin, in which the C-terminal prolyl and lysyl residues of wild-type human insulin have been inverted, can be crystallized into forms resembling those of wild-type insulins in which the protein exists as zinc-complexed hexamers organized into well-defined layers. We describe herein tapping-mode atomic force microscopy (TMAFM) studies of single crystals of rhombohedral (R3) LysB28ProB29 that reveal the influence of sequence variation on hexamer-hexamer association at the surface of actively growing crystals. Molecular scale lattice images of these crystals were acquired in situ under growth conditions, enabling simultaneous identification of the rhombohedral LysB28ProB29 crystal form, its orientation, and its dynamic growth characteristics. The ability to obtain crystallographic parameters on multiple crystal faces with TMAFM confirmed that bovine and porcine insulins grown under these conditions crystallized into the same space group as LysB28ProB29 (R3), enabling direct comparison of crystal growth behavior and the influence of sequence variation. Real-time TMAFM revealed hexamer vacancies on the (001) terraces of LysB28ProB29, and more rounded dislocation noses and larger terrace widths for actively growing screw dislocations compared to wild-type bovine and porcine insulin crystals under identical conditions. This behavior is consistent with weaker interhexamer attachment energies for LysB28ProB29 at active growth sites. Comparison of the single crystal x-ray structures of wild-type insulins and LysB28ProB29 suggests that differences in protein conformation at the hexamer-hexamer interface and accompanying changes in interhexamer bonding are responsible for this behavior. These studies demonstrate that subtle changes in molecular conformation due to a single sequence

  4. The amino acid sequence of Escherichia coli cyanase.

    PubMed

    Chin, C C; Anderson, P M; Wold, F

    1983-01-10

    The amino acid sequence of the enzyme cyanase (cyanate hydrolase) from Escherichia coli has been determined by automatic Edman degradation of the intact protein and of its component peptides. The primary peptides used in the sequencing were produced by cyanogen bromide cleavage at the methionine residues, yielding 4 peptides plus free homoserine from the NH2-terminal methionine, and by trypsin cleavage at the 7 arginine residues after acetylation of the lysines. Secondary peptides required for overlaps and COOH-terminal sequences were produced by chymotrypsin or clostripain cleavage of some of the larger peptides. The complete sequence of the cyanase subunit consists of 156 amino acid residues (Mr 16,350). Based on the observation that the cysteine-containing peptide is obtained as a disulfide-linked dimer, it is proposed that the covalent structure of cyanase is made up of two subunits linked by a disulfide bond between the single cystine residue in each subunit. The native enzyme (Mr 150,000) then appears to be a complex of four or five such subunit dimers.

  5. Sequence variation at the major histocompatibility complex locus DQ beta in beluga whales (Delphinapterus leucas)

    PubMed

    Murray, B W; Malik, S; White, B N

    1995-07-01

    Genetic variation at the Major Histocompatibility Complex locus DQ beta was analyzed in 233 beluga whales (Delphinapterus leucas) from seven populations: St. Lawrence Estuary, eastern Beaufort Sea, eastern Chukchi Sea, western Hudson Bay, eastern Hudson Bay, southeastern Baffin Island, and High Arctic and in 12 narwhals (Monodon monoceros) sympatric with the High Arctic beluga population. Variation was assessed by amplification of the exon coding for the peptide binding region via the polymerase chain reaction, followed by either cloning and DNA sequencing or single-stranded conformation polymorphism analysis. Five alleles were found across the beluga populations and one in the narwhal. Pairwise comparisons of these alleles showed a 5:1 ratio of nonsynonymous to synonymous substitutions per site leading to eight amino acid differences, five of which were nonconservative substitutions, centered around positions previously shown to be important for peptide binding. Although the amount of allelic variation is low when compared with terrestrial mammals, the nature of the substitutions in the peptide binding sites indicates an important role for the DQ beta locus in the cellular immune response of beluga whales. Comparisons of allele frequencies among populations show the High Arctic population to be different (P ≤ .005) from the other beluga populations surveyed. In these other populations an allele, Dele-DQ beta*0101-2, was found in 98% of the animals, while in the High Arctic it was found in only 52% of the animals. Two other alleles were found at high frequencies in the High Arctic population, one being very similar to the single allele found in narwhal. PMID:7659014

  6. Variation of unsaturated fatty acids in soybean sprout of high oleic acid accessions.

    PubMed

    Dhakal, Krishna Hari; Jung, Ki-Hwal; Chae, Jong-Hyun; Shannon, J Grover; Lee, Jeong-Dong

    2014-12-01

    Oleic acid and oleic acid rich foods may have beneficial health effects in humans. Soybeans with high oleic acid (around 80% in seed oil) have been developed. Soybean sprouts are an important vegetable in Korea, Japan and China. The objective of this study was to investigate the variation of unsaturated fatty acids, oleic, linoleic and α-linolenic acids, in sprouts from soybeans with normal and high oleic acid concentration. Twelve soybean accessions with six high oleic acid lines, three parents of high oleic acid lines, and three checks with normal and high oleic acid concentration were used in this study. The unsaturated fatty acid concentration in sprouts from each genotype was similar to the concentration in the ungerminated seed. The oleic acid concentration in the sprouts of high oleic acid lines (up to 80%) was still high (>70%) compared to the ungerminated seed. Thus, high oleic soybean varieties developed for sprout production could add valuable health benefits to sprouts and the individuals who consume this vegetable.

  7. Quantum-Sequencing: Biophysics of quantum tunneling through nucleic acids

    NASA Astrophysics Data System (ADS)

    Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant

    2014-03-01

    Tunneling microscopy and spectroscopy have been used extensively in physical surface sciences to study quantum tunneling, to measure the electronic local density of states of nanomaterials, and to characterize adsorbed species. Quantum-Sequencing (Q-Seq) is a new method based on tunneling microscopy for electronic sequencing of single molecules of nucleic acids. A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free single-molecule sequencing method. Here, we present the unique "electronic fingerprints" for all nucleotides on DNA and RNA using Q-Seq, along with their intrinsic biophysical parameters. We have analyzed tunneling spectra for the nucleotides at different pH conditions and determined the HOMO, LUMO and energy gap for all of them. In addition, we show a number of biophysical parameters to further characterize all nucleobases (electron and hole transition voltage and energy barriers). These results highlight the robustness of Q-Seq as a technique for next-generation sequencing.

  8. Proteogenomics: Integrating Next-Generation Sequencing and Mass Spectrometry to Characterize Human Proteomic Variation

    PubMed Central

    Sheynkman, Gloria M.; Shortreed, Michael R.; Cesnik, Anthony J.; Smith, Lloyd M.

    2016-01-01

    Mass spectrometry–based proteomics has emerged as the leading method for detection, quantification, and characterization of proteins. Nearly all proteomic workflows rely on proteomic databases to identify peptides and proteins, but these databases typically contain a generic set of proteins that lack variations unique to a given sample, precluding their detection. Fortunately, proteogenomics enables the detection of such proteomic variations and can be defined, broadly, as the use of nucleotide sequences to generate candidate protein sequences for mass spectrometry database searching. Proteogenomics is experiencing heightened significance due to two developments: (a) advances in DNA sequencing technologies that have made complete sequencing of human genomes and transcriptomes routine, and (b) the unveiling of the tremendous complexity of the human proteome as expressed at the levels of genes, cells, tissues, individuals, and populations. We review here the field of human proteogenomics, with an emphasis on its history, current implementations, the types of proteomic variations it reveals, and several important applications. PMID:27049631
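
    The definition given in this abstract, using nucleotide sequences to generate candidate protein sequences for database searching, can be sketched in a few lines. This is only an illustration under simplifying assumptions: a single forward reading frame, the standard genetic code, and one single-nucleotide variant. The function names and the 12-base toy sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate frame 0 of a DNA string, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(dna) - 2, 3):
        aa = CODON_TABLE[dna[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def apply_snv(dna, position, alt_base):
    """Return a copy of dna with the base at a 0-based position replaced."""
    return dna[:position] + alt_base + dna[position + 1:]


reference = "ATGGCCTGCTAA"                  # hypothetical transcript fragment
variant = apply_snv(reference, 7, "A")      # TGC (Cys) becomes TAC (Tyr)

# Candidate entries for a sample-specific search database.
candidates = {translate(reference), translate(variant)}
print(sorted(candidates))
```

    A real workflow would also handle multiple frames, splice junctions and indels; the sketch only shows why a variant peptide is undetectable unless its sequence is added to the search database.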

  9. Proteogenomics: Integrating Next-Generation Sequencing and Mass Spectrometry to Characterize Human Proteomic Variation

    NASA Astrophysics Data System (ADS)

    Sheynkman, Gloria M.; Shortreed, Michael R.; Cesnik, Anthony J.; Smith, Lloyd M.

    2016-06-01

    Mass spectrometry-based proteomics has emerged as the leading method for detection, quantification, and characterization of proteins. Nearly all proteomic workflows rely on proteomic databases to identify peptides and proteins, but these databases typically contain a generic set of proteins that lack variations unique to a given sample, precluding their detection. Fortunately, proteogenomics enables the detection of such proteomic variations and can be defined, broadly, as the use of nucleotide sequences to generate candidate protein sequences for mass spectrometry database searching. Proteogenomics is experiencing heightened significance due to two developments: (a) advances in DNA sequencing technologies that have made complete sequencing of human genomes and transcriptomes routine, and (b) the unveiling of the tremendous complexity of the human proteome as expressed at the levels of genes, cells, tissues, individuals, and populations. We review here the field of human proteogenomics, with an emphasis on its history, current implementations, the types of proteomic variations it reveals, and several important applications.

  10. Proteogenomics: Integrating Next-Generation Sequencing and Mass Spectrometry to Characterize Human Proteomic Variation

    NASA Astrophysics Data System (ADS)

    Sheynkman, Gloria M.; Shortreed, Michael R.; Cesnik, Anthony J.; Smith, Lloyd M.

    2016-06-01

    Mass spectrometry–based proteomics has emerged as the leading method for detection, quantification, and characterization of proteins. Nearly all proteomic workflows rely on proteomic databases to identify peptides and proteins, but these databases typically contain a generic set of proteins that lack variations unique to a given sample, precluding their detection. Fortunately, proteogenomics enables the detection of such proteomic variations and can be defined, broadly, as the use of nucleotide sequences to generate candidate protein sequences for mass spectrometry database searching. Proteogenomics is experiencing heightened significance due to two developments: (a) advances in DNA sequencing technologies that have made complete sequencing of human genomes and transcriptomes routine, and (b) the unveiling of the tremendous complexity of the human proteome as expressed at the levels of genes, cells, tissues, individuals, and populations. We review here the field of human proteogenomics, with an emphasis on its history, current implementations, the types of proteomic variations it reveals, and several important applications.

  11. Mitochondrial control-region sequence variation in aboriginal Australians.

    PubMed Central

    van Holst Pellekaan, S; Frommer, M; Sved, J; Boettcher, B

    1998-01-01

    The mitochondrial D-loop hypervariable segment 1 (mt HVS1) between nucleotides 15997 and 16377 has been examined in aboriginal Australian people from the Darling River region of New South Wales (riverine) and from Yuendumu in central Australia (desert). Forty-seven unique HVS1 types were identified, varying at 49 nucleotide positions. Pairwise analysis by calculation of BEPPI (between population proportion index) reveals statistically significant structure in the populations, although some identical HVS1 types are seen in the two contrasting regions. mt HVS1 types may reflect more-ancient distributions than do linguistic diversity and other culturally distinguishing attributes. Comparison with sequences from five published global studies reveals that these Australians demonstrate greatest divergence from some Africans, least from Papua New Guinea highlanders, and only slightly more from some Pacific groups (Indonesian, Asian, Samoan, and coastal Papua New Guinea), although the HVS1 types vary at different nucleotide sites. Construction of a median network, displaying three main groups, suggests that several hypervariable nucleotide sites within the HVS1 are likely to have undergone mutation independently, making phylogenetic comparison with global samples by conventional methods difficult. Specific nucleotide-site variants are major separators in median networks constructed from Australian HVS1 types alone and for one global selection. The distribution of these, requiring extended study, suggests that they may be signatures of different groups of prehistoric colonizers into Australia, for which the time of colonization remains elusive. PMID:9463317

  12. Mitochondrial control-region sequence variation in aboriginal Australians.

    PubMed

    van Holst Pellekaan, S; Frommer, M; Sved, J; Boettcher, B

    1998-02-01

    The mitochondrial D-loop hypervariable segment 1 (mt HVS1) between nucleotides 15997 and 16377 has been examined in aboriginal Australian people from the Darling River region of New South Wales (riverine) and from Yuendumu in central Australia (desert). Forty-seven unique HVS1 types were identified, varying at 49 nucleotide positions. Pairwise analysis by calculation of BEPPI (between population proportion index) reveals statistically significant structure in the populations, although some identical HVS1 types are seen in the two contrasting regions. mt HVS1 types may reflect more-ancient distributions than do linguistic diversity and other culturally distinguishing attributes. Comparison with sequences from five published global studies reveals that these Australians demonstrate greatest divergence from some Africans, least from Papua New Guinea highlanders, and only slightly more from some Pacific groups (Indonesian, Asian, Samoan, and coastal Papua New Guinea), although the HVS1 types vary at different nucleotide sites. Construction of a median network, displaying three main groups, suggests that several hypervariable nucleotide sites within the HVS1 are likely to have undergone mutation independently, making phylogenetic comparison with global samples by conventional methods difficult. Specific nucleotide-site variants are major separators in median networks constructed from Australian HVS1 types alone and for one global selection. The distribution of these, requiring extended study, suggests that they may be signatures of different groups of prehistoric colonizers into Australia, for which the time of colonization remains elusive. PMID:9463317

  13. Amino acids in dew - origin and seasonal variation

    NASA Astrophysics Data System (ADS)

    Scheller, Edwin

    At two sites in the Armenhof district, 10 km east of Fulda, Germany, dew samples were collected from June 1996 to June 1997 and investigated for free and protein-bound amino acids. On account of the high pollen content, at the beginning of June 1996 and in May 1997 total amino acid concentrations were 53-390 μmol l⁻¹, in one sample 922 μmol l⁻¹. At other times the concentration in dew was 8-164 μmol l⁻¹. On 4 and 5 June 1996 the diluted free amino acid fraction (DFAA) of the total hydrolysed amino acids (THAA) at both sites amounted to 35-44% and was predominantly arginine, proline and glutamine/glutamate. Likewise on 11 March 1997 the fraction of DFAA was found to be 39.5% with extremely high arginine and proline fractions. At other times the DFAA-fraction was in the range 14-26%. From July 1996 to June 1997 the amino acid concentrations in the vapours rising from a meadow were also measured and ranged from 8 to 51 μmol l⁻¹. From July to October 1996 the amino acid composition in the hydrolysates of dew samples and meadow vapours collected overnight were almost identical. The DFAA fraction in the condensation water collected overnight from the meadow varied from 18 to 40%. From 4 to 6 June 1996, on 11 and 13 March 1997 and in the period 16-20 May 1997, the amino acid distribution in dew showed much variation. The percentage fraction of arginine and proline in the hydrolysate increased greatly, whereas that of glycine and serine decreased. The large increase in proline and arginine in hydrolysate is attributable solely to the large amounts of free arginine and proline. This effect occurred in both 1996 and 1997 over several days at both sites at any one time and therefore appears confirmed.

  14. Sequence variation at the major histocompatibility complex DRB loci in beluga (Delphinapterus leucas) and narwhal (Monodon monoceros).

    PubMed

    Murray, B W; White, B N

    1998-09-01

    The variation at loci with similarity to DRB class II major histocompatibility complex loci was assessed in 313 beluga collected from 13 sampling locations across North America, and 11 narwhal collected in the Canadian high Arctic. Variation was assessed by amplification of exon 2, which codes for the peptide binding region, via the polymerase chain reaction, followed by either cloning and DNA sequencing or single-stranded conformation polymorphism analysis. Two DRB loci were identified in beluga: DRB1, a polymorphic locus, and DRB2, a monomorphic locus. Eight alleles representing five distinct lineages (based on sequence similarity) were found at the beluga DRB1 locus. Although the relative number of alleles is low when compared with terrestrial mammals, the amino acid variation found among the lineages is moderate. At the DRB1 locus, the average number of nonsynonymous substitutions per site is greater than the average number of synonymous substitutions per site (0.0806 : 0.0207, respectively; P<0.01). Most of the 31 amino acid substitutions do not conserve the physicochemical properties of the residue, and 21 of these are located at positions implicated as forming pockets responsible for the selective binding of foreign peptide side chains. Only DRB1 variation was examined in 11 narwhal, revealing a low amount of variation. These data are consistent with an important role for the DRB1 locus in the cellular immune response of beluga. In addition, the ratio of nonsynonymous to synonymous substitutions is similar to that among primate alleles, arguing against a reduction in the balancing selection pressure in the marine environment. Two hypotheses may explain the modest amount of Mhc variation when compared with terrestrial mammals: small population sizes at speciation or a reduced neutral substitution rate in cetaceans. PMID:9716643

  15. Nucleic acid sequence detection using multiplexed oligonucleotide PCR

    SciTech Connect

    Nolan, John P.; White, P. Scott

    2006-12-26

    Methods for rapidly detecting single or multiple sequence alleles in a sample nucleic acid are described. Provided are all of the oligonucleotide pairs capable of annealing specifically to a target allele and discriminating among possible sequences thereof, and ligating to each other to form an oligonucleotide complex when a particular sequence feature is present (or, alternatively, absent) in the sample nucleic acid. The design of each oligonucleotide pair permits the subsequent high-level PCR amplification of a specific amplicon when the oligonucleotide complex is formed, but not when the oligonucleotide complex is not formed. The presence or absence of the specific amplicon is used to detect the allele. Detection of the specific amplicon may be achieved using a variety of methods well known in the art, including without limitation, oligonucleotide capture onto DNA chips or microarrays, oligonucleotide capture onto beads or microspheres, electrophoresis, and mass spectrometry. Various labels and address-capture tags may be employed in the amplicon detection step of multiplexed assays, as further described herein.

  16. Molecular cloning and amino acid sequence of human 5-lipoxygenase

    SciTech Connect

    Matsumoto, T.; Funk, C.D.; Radmark, O.; Hoeoeg, J.O.; Joernvall, H.; Samuelsson, B.

    1988-01-01

    5-Lipoxygenase (EC 1.13.11.34), a Ca²⁺- and ATP-requiring enzyme, catalyzes the first two steps in the biosynthesis of the peptidoleukotrienes and the chemotactic factor leukotriene B₄. A cDNA clone corresponding to 5-lipoxygenase was isolated from a human lung lambda gt11 expression library by immunoscreening with a polyclonal antibody. Additional clones from a human placenta lambda gt11 cDNA library were obtained by plaque hybridization with the ³²P-labeled lung cDNA clone. Sequence data obtained from several overlapping clones indicate that the composite DNAs contain the complete coding region for the enzyme. From the deduced primary structure, 5-lipoxygenase encodes a 673 amino acid protein with a calculated molecular weight of 77,839. Direct analysis of the native protein and its proteolytic fragments confirmed the deduced composition, the amino-terminal amino acid sequence, and the structure of many internal segments. 5-Lipoxygenase has no apparent sequence homology with leukotriene A₄ hydrolase or Ca²⁺-binding proteins. RNA blot analysis indicated substantial amounts of an mRNA species of approximately 2700 nucleotides in leukocytes, lung, and placenta.

  17. Characterization and amino acid sequence of a fatty acid-binding protein from human heart.

    PubMed Central

    Offner, G D; Brecher, P; Sawlivich, W B; Costello, C E; Troxler, R F

    1988-01-01

    The complete amino acid sequence of a fatty acid-binding protein from human heart was determined by automated Edman degradation of CNBr, BNPS-skatole [3'-bromo-3-methyl-2-(2-nitrobenzenesulphenyl)indolenine], hydroxylamine, Staphylococcus aureus V8 proteinase, tryptic and chymotryptic peptides, and by digestion of the protein with carboxypeptidase A. The sequence of the blocked N-terminal tryptic peptide from citraconylated protein was determined by collisionally induced decomposition mass spectrometry. The protein contains 132 amino acid residues, is enriched with respect to threonine and lysine, lacks cysteine, has an acetylated valine residue at the N-terminus, and has an Mr of 14768 and an isoelectric point of 5.25. This protein contains two short internal repeated sequences from residues 48-54 and from residues 114-119 located within regions of predicted beta-structure and decreasing hydrophobicity. These short repeats are contained within two longer repeated regions from residues 48-60 and residues 114-125, which display 62% sequence similarity. These regions could accommodate the charged and uncharged moieties of long-chain fatty acids and may represent fatty acid-binding domains consistent with the finding that human heart fatty acid-binding protein binds 2 mol of oleate or palmitate/mol of protein. Detailed evidence for the amino acid sequences of the peptides has been deposited as Supplementary Publication SUP 50143 (23 pages) at the British Library Lending Division, Boston Spa, Yorkshire LS23 7BQ, U.K., from whom copies may be obtained as indicated in Biochem. J. (1988) 249, 5. PMID:3421901

  18. The Quantification of Representative Sequences pipeline for amplicon sequencing: case study on within-population ITS1 sequence variation in a microparasite infecting Daphnia.

    PubMed

    González-Tortuero, E; Rusek, J; Petrusek, A; Gießler, S; Lyras, D; Grath, S; Castro-Monzón, F; Wolinska, J

    2015-11-01

    Next generation sequencing (NGS) platforms are replacing traditional molecular biology protocols like cloning and Sanger sequencing. However, accuracy of NGS platforms has rarely been measured when quantifying relative frequencies of genotypes or taxa within populations. Here we developed a new bioinformatic pipeline (QRS) that pools similar sequence variants and estimates their frequencies in NGS data sets from populations or communities. We tested whether the estimated frequency of representative sequences, generated by 454 amplicon sequencing, differs significantly from that obtained by Sanger sequencing of cloned PCR products. This was performed by analysing sequence variation of the highly variable first internal transcribed spacer (ITS1) of the ichthyosporean Caullerya mesnili, a microparasite of cladocerans of the genus Daphnia. This analysis also serves as a case example of the usage of this pipeline to study within-population variation. Additionally, a public Illumina data set was used to validate the pipeline on community-level data. Overall, there was a good correspondence in absolute frequencies of C. mesnili ITS1 sequences obtained from Sanger and 454 platforms. Furthermore, analyses of molecular variance (AMOVA) revealed that population structure of C. mesnili differs across lakes and years independently of the sequencing platform. Our results support not only the usefulness of amplicon sequencing data for studies of within-population structure but also the successful application of the QRS pipeline on Illumina-generated data. The QRS pipeline is freely available together with its documentation under GNU Public Licence version 3 at http://code.google.com/p/quantification-representative-sequences.

  19. Targeted Exome Sequencing Outcome Variations of Colorectal Tumors within and across Two Sequencing Platforms

    PubMed Central

    Ashktorab, Hassan; Azimi, Hamed; Nickerson, Michael L.; Bass, Sara; Varma, Sudhir; Brim, Hassan

    2016-01-01

    Background and Aim Next generation sequencing (NGS) has quickly become the tool of choice for genome and exome data generation. The multitude of sequencing platforms as well as the variabilities within each platform need to be assessed. In this paper we used two platforms (Ion Torrent and Illumina) to assess single nucleotide variants (SNVs) in colorectal cancer (CRC) specimens. Methods CRC specimens (n = 13) collected from 6 CRC (cancer and matched normal) patients were used to establish the mutational profile using the Ion Torrent and Illumina sequencing platforms. We analyzed a set of Formalin Fixed Paraffin Embedded (FFPE) and fresh frozen (FF) samples on both platforms to assess the effect of sample nature (FFPE vs. FF) on sequencing outcome and to evaluate the similarity/differences of SNVs across the two platforms. In addition, duplicates of FF samples were sequenced on each platform to assess variability within platform. Results The comparison of FF replicates to each other gave a concordance of 77% (± 15.3%) in Ion Torrent and 70% (± 3.7%) in Illumina. FFPE vs. FF replicates gave a concordance of 40% (± 32%) in Ion Torrent and 49% (± 19%) in Illumina. Cross-platform concordance averaged 75% (± 9.8%) for FFPE samples and 67% (± 32%) for FF samples, with an overall average of 70% (± 26.8%). Conclusion Our data show a significant variability within and across platforms. The number of detected variants also depends on the nature of the specimen (FF vs. FFPE). Validation of NGS-discovered mutations is a must to rule out false positive mutants. This validation might be performed either through a second NGS platform or through Sanger sequencing. PMID:27547838
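
    The concordance figures in the record above compare sets of variant calls. The abstract does not say how concordance was computed, so the following is only a minimal sketch under one common assumption: concordance is the number of variants called in both sets divided by the number called in either set. The function name and the example calls are hypothetical and are not data from the study.

```python
from typing import Iterable, Tuple

# A variant is identified here by (chromosome, position, ref allele, alt allele).
Variant = Tuple[str, int, str, str]


def concordance(calls_a: Iterable[Variant], calls_b: Iterable[Variant]) -> float:
    """Shared variants divided by the union of both call sets (an assumed definition)."""
    set_a, set_b = set(calls_a), set(calls_b)
    union = set_a | set_b
    if not union:
        return 1.0  # two empty call sets agree trivially
    return len(set_a & set_b) / len(union)


# Hypothetical SNV calls for one specimen on two platforms.
platform_1 = [("chr5", 1000, "G", "A"), ("chr12", 2000, "C", "T"), ("chr17", 3000, "C", "T")]
platform_2 = [("chr5", 1000, "G", "A"), ("chr12", 2000, "C", "T"), ("chr3", 4000, "G", "A")]

# Two shared calls out of four distinct calls.
print(f"concordance = {concordance(platform_1, platform_2):.2f}")
```

    Other definitions (for example, dividing by the size of one call set only) would give different percentages, which is one reason a stated definition matters when comparing platforms.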

  20. Variation of trans fatty acids in milk fats.

    PubMed

    Precht, D

    1995-03-01

    Trans fatty acids are discussed in connection with an increased risk of atherosclerosis. Therefore, the development of a rapid and exact measuring method for the determination of trans fatty acids in milk fat is of great interest. Using gas chromatographic analysis of the trans-octadecenoic fatty acids as well as of the triglycerides of 100 different milk fat samples, a formula consisting of different triglycerides for the quick determination of trans contents was developed by means of statistical methods (standard deviation = 0.293%, r = 0.9977). Subsequently, the seasonal variations of the trans contents in milk fat samples from a large milk collection area were determined using rapid triglyceride analyses. The trans fatty acid contents of the 100 milk fat samples and of the samples from the milk collection area ranged over 1.91-6.34 wt% and 1.97-4.37 wt%, respectively; the mean contents were 3.83 and 3.18 wt%, and the median values 3.67 and 3.30 wt%, respectively.

  1. Analysis of sequence variations in several human genes using phosphoramidite bond DNA fragmentation and chip-based MALDI-TOF.

    PubMed

    Smylie, Kevin J; Cantor, Charles R; Denissenko, Mikhail F

    2004-01-01

    The challenge in the postgenome era is to measure sequence variations over large genomic regions in numerous patient samples. This massive amount of work can only be completed if more accurate, cost-effective, and high-throughput solutions become available. Here we describe a novel DNA fragmentation approach for single nucleotide polymorphism (SNP) discovery and sequence validation. The base-specific cleavage is achieved by creating primer extension products, in which acid-labile phosphoramidite (P-N) bonds replace the 5' phosphodiester bonds of newly incorporated pyrimidine nucleotides. Sequence variations are detected by hydrolysis of this acid-labile bond and MALDI-TOF analysis of the resulting fragments. In this study, we developed a robust protocol for P-N-bond fragmentation and investigated additional ways to improve its sensitivity and reproducibility. We also present the analysis of several human genomic targets ranging from 100-450 bp in length. By using a semiautomated sample processing protocol, we investigated an array of SNPs within a 240-bp segment of the NFKBIA gene in 48 human DNA samples. We identified and measured frequencies for the two common SNPs in the 3'UTR of NFKBIA (separated by 123 bp) and then confirmed these values in an independent genotyping experiment. The calculated allele frequencies in white and African American groups differed significantly, yet both fit Hardy-Weinberg expectations. This demonstrates the utility and effectiveness of PN-bond DNA fragmentation and subsequent MALDI-TOF MS analysis for the high-throughput discovery and measurement of sequence variations in fragments up to 0.5 kb in length in multiple human blood DNA samples.
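
    The check against Hardy-Weinberg expectations mentioned in the record above can be illustrated with a standard chi-square goodness-of-fit test on genotype counts. This is a generic textbook sketch, not the authors' analysis; the function name and the genotype counts are hypothetical.

```python
import math


def hardy_weinberg_test(n_aa: int, n_ab: int, n_bb: int):
    """Chi-square test (1 degree of freedom) of genotype counts against
    Hardy-Weinberg proportions. Returns (freq of allele A, chi-square, p-value)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return p, 0.0, 1.0  # monomorphic locus: nothing to test
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return p, chi2, p_value


# Hypothetical genotype counts for one SNP in 48 samples.
freq, chi2, p_value = hardy_weinberg_test(30, 15, 3)
print(f"allele frequency = {freq:.3f}, chi-square = {chi2:.3f}, p = {p_value:.3f}")
```

    With samples this small, an exact test is often preferred over the chi-square approximation; the sketch only shows the logic of comparing observed with expected genotype counts.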

  2. Targeted capture enrichment and sequencing identifies extensive nucleotide variation in the turkey MHC-B.

    PubMed

    Reed, Kent M; Mendoza, Kristelle M; Settlage, Robert E

    2016-03-01

    Variation in the major histocompatibility complex (MHC) is increasingly associated with disease susceptibility and resistance in avian species of agricultural importance. This variation includes sequence polymorphisms but also structural differences (gene rearrangement) and copy number variation (CNV). The MHC has now been described for multiple galliform species including the best defined assemblies of the chicken (Gallus gallus) and domestic turkey (Meleagris gallopavo). Using this sequence resource, this study applied high-throughput sequencing to investigate MHC variation in turkeys of North America (NA turkeys). An MHC-specific SureSelect (Agilent) capture array was developed, and libraries were created for 14 turkeys representing domestic (commercial bred), heritage breed, and wild turkeys. In addition, a representative of the Ocellated turkey (M. ocellata) and chicken (G. gallus) was included to test cross-species applicability of the capture array allowing for identification of new species-specific polymorphisms. Libraries were hybridized to ∼12 K cRNA baits and the resulting pools were sequenced. On average, 98% of processed reads mapped to the turkey whole genome sequence and 53% to the MHC target. In addition to the MHC, capture hybridization recovered sequences corresponding to other MHC regions. Sequence alignment and de novo assembly indicated the presence of several additional BG genes in the turkey with evidence for CNV. Variant detection identified an average of 2245 polymorphisms per individual for the NA turkeys, 3012 for the Ocellated turkey, and 462 variants in the chicken (RJF-256). This study provides an extensive sequence resource for examining MHC variation and its relation to health of this agriculturally important group of birds.

  3. On human disease-causing amino acid variants: statistical study of sequence and structural patterns

    PubMed Central

    Alexov, Emil

    2015-01-01

    Statistical analysis was carried out on a large set of naturally occurring human amino acid variations and it was demonstrated that there is a preference for some amino acid substitutions to be associated with diseases. At an amino acid sequence level, it was shown that the disease-causing variants frequently involve drastic changes of amino acid physico-chemical properties of proteins such as charge, hydrophobicity and geometry. Structural analysis of variants involved in diseases and of variants frequently observed in the human population showed similar trends: disease-causing variants tend to cause more changes of hydrogen bond network and salt bridges as compared with harmless amino acid mutations. Analysis of thermodynamics data reported in literature, both experimental and computational, indicated that disease-causing variants tend to destabilize proteins and their interactions, which prompted us to investigate the effects of amino acid mutations on large databases of experimentally measured energy changes in unrelated proteins. Although the experimental datasets were neither linked to diseases nor exclusive to human proteins, the observed trends were the same: amino acid mutations tend to destabilize proteins and their interactions. Having in mind that structural and thermodynamics properties are interrelated, it is pointed out that any large change of any of them is anticipated to cause a disease. PMID:25689729

  4. New approaches for computer analysis of nucleic acid sequences.

    PubMed

    Karlin, S; Ghandour, G; Ost, F; Tavare, S; Korn, L J

    1983-09-01

    A new high-speed computer algorithm is outlined that ascertains within and between nucleic acid and protein sequences all direct repeats, dyad symmetries, and other structural relationships. Large repeats, repeats of high frequency, dyad symmetries of specified stem length and loop distance, and their distributions are determined. Significance of homologies is assessed by a hierarchy of permutation procedures. Applications are made to papovaviruses, the human papillomavirus HPV, lambda phage, the human and mouse mitochondrial genomes, and the human and mouse immunoglobulin kappa-chain genes. PMID:6577449
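
    The abstract above does not describe the algorithm itself. As an illustration of the kind of search it refers to, the following minimal sketch finds exact direct repeats and dyad symmetries (a stem followed, after a short loop, by its reverse complement) using a simple k-mer index. This is a swapped-in, generic technique, not the authors' method; all function names and the example sequence are hypothetical.

```python
from collections import defaultdict

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.translate(COMPLEMENT)[::-1]


def kmer_positions(seq: str, k: int) -> dict:
    """Map every k-mer in seq to the list of its start positions."""
    index = defaultdict(list)
    for i in range(len(seq) - k + 1):
        index[seq[i:i + k]].append(i)
    return index


def direct_repeats(seq: str, k: int) -> dict:
    """Exact k-mers that occur at more than one position."""
    return {kmer: pos for kmer, pos in kmer_positions(seq, k).items() if len(pos) > 1}


def dyad_symmetries(seq: str, stem: int, max_loop: int) -> list:
    """(left_start, right_start) pairs where the right arm is the reverse
    complement of the left arm and the loop between them is at most max_loop."""
    index = kmer_positions(seq, stem)
    hits = []
    for i in range(len(seq) - stem + 1):
        target = reverse_complement(seq[i:i + stem])
        for j in index.get(target, []):
            loop = j - (i + stem)
            if 0 <= loop <= max_loop:
                hits.append((i, j))
    return hits


example = "GGATCAAAGATCC"  # GGATC, loop AAA, then GATCC (reverse complement of GGATC)
print(direct_repeats(example, 4))
print(dyad_symmetries(example, stem=5, max_loop=3))
```

    A k-mer index finds exact matches only; assessing the statistical significance of the matches, as the abstract mentions, would need an additional permutation step that is not sketched here.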

  5. Comprehensive characterization of complex structural variations in cancer by directly comparing genome sequence reads.

    PubMed

    Moncunill, Valentí; Gonzalez, Santi; Beà, Sílvia; Andrieux, Lise O; Salaverria, Itziar; Royo, Cristina; Martinez, Laura; Puiggròs, Montserrat; Segura-Wang, Maia; Stütz, Adrian M; Navarro, Alba; Royo, Romina; Gelpí, Josep L; Gut, Ivo G; López-Otín, Carlos; Orozco, Modesto; Korbel, Jan O; Campo, Elias; Puente, Xose S; Torrents, David

    2014-11-01

    The development of high-throughput sequencing technologies has advanced our understanding of cancer. However, characterizing somatic structural variants in tumor genomes is still challenging because current strategies depend on the initial alignment of reads to a reference genome. Here, we describe SMUFIN (somatic mutation finder), a single program that directly compares sequence reads from normal and tumor genomes to accurately identify and characterize a range of somatic sequence variation, from single-nucleotide variants (SNV) to large structural variants at base pair resolution. Performance tests on modeled tumor genomes showed average sensitivity of 92% and 74% for SNVs and structural variants, with specificities of 95% and 91%, respectively. Analyses of aggressive forms of solid and hematological tumors revealed that SMUFIN identifies breakpoints associated with chromothripsis and chromoplexy with high specificity. SMUFIN provides an integrated solution for the accurate, fast and comprehensive characterization of somatic sequence variation in cancer. PMID:25344728

  6. Otopalatodigital syndrome type 2 in a male infant: A case report with a novel sequence variation

    PubMed Central

    Sankararaman, Senthilkumar; Kurepa, Dalibor; Shen, Yiping; Kakkilaya, Venkatakrishna; Ursin, Sussone; Chen, Harold

    2013-01-01

    We report a male infant with typical clinical, pathological and radiological features of otopalatodigital syndrome type 2 (OPD 2) with a novel sequence variation in the FLNA gene. His clinical manifestations include typical craniofacial features, cleft palate, hearing impairment, omphalocele, bowing of the long bones, absent fibulae and digital abnormalities consistent with OPD 2. Two hemizygous sequence variations in the FLNA gene were identified. The variation c.5290G>A/p.Ala1764Thr has been previously reported in a patient with periventricular nodular heterotopia, but subsequently it has been reported as a polymorphism. The other variation c.613T>C/p.Cys205Arg detected in the proband has not been previously reported and our analysis indicates that this is a novel disease-causing mutation for OPD2. PMID:27625837

  7. Otopalatodigital syndrome type 2 in a male infant: A case report with a novel sequence variation.

    PubMed

    Sankararaman, Senthilkumar; Kurepa, Dalibor; Shen, Yiping; Kakkilaya, Venkatakrishna; Ursin, Sussone; Chen, Harold

    2013-03-01

    We report a male infant with typical clinical, pathological and radiological features of otopalatodigital syndrome type 2 (OPD 2) with a novel sequence variation in the FLNA gene. His clinical manifestations include typical craniofacial features, cleft palate, hearing impairment, omphalocele, bowing of the long bones, absent fibulae and digital abnormalities consistent with OPD 2. Two hemizygous sequence variations in the FLNA gene were identified. The variation c.5290G>A/p.Ala1764Thr has been previously reported in a patient with periventricular nodular heterotopia, but subsequently it has been reported as a polymorphism. The other variation c.613T>C/p.Cys205Arg detected in the proband has not been previously reported and our analysis indicates that this is a novel disease-causing mutation for OPD2. PMID:27625837

  9. Sequence variation of ribosomal internal transcribed spacers (ITS) in commercially important Phytoseiidae mites.

    PubMed

    Navajas, M; Lagnel, J; Fauvel, G; de Moraes, G

    1999-11-01

    Preliminary work is needed to assess the usefulness of different markers at different taxonomic scales when a new group is analyzed, such as the commercially important Phytoseiidae mites. We investigate here the level of sequence variation of the nuclear ribosomal spacers ITS 1 and 2 and the 5.8S gene in six species of Phytoseiidae: Neoseiulus californicus, N. fallacis, Euseius concordis, Metaseiulus occidentalis, Typhlodromus pyri and Phytoseiulus persimilis. As expected, the 5.8S gene (148 base pairs) is markedly conserved and displays little variation in between-genera comparisons. ITS1 and ITS2 show contrasting patterns: while the ITS2 is short (80-89 bp) and shows little variation, the ITS1 is longer (303-404 bp) and is very variable in sequence. This fact compromises reliable nucleotide homologies when comparing the genera. The comparison of ITS1 sequence similarity at the species level might be useful for species identification; however, the value of ITS in taxonomic studies does not extend to the level of the family. The intraspecific variations of ITS were investigated in three species: N. californicus, N. fallacis and E. concordis. The first species has identical ITS1 sequences and the last two display low polymorphism (2 nucleotide substitutions). The ITS2 and 5.8S sequences were identical in all three subspecies comparisons. PMID:10668860

  10. Nucleotide sequence variation of GLABRA1 contributing to phenotypic variation of leaf hairiness in Brassicaceae vegetables.

    PubMed

    Li, Feng; Zou, Zhongwei; Yong, Hui-Yee; Kitashiba, Hiroyasu; Nishio, Takeshi

    2013-05-01

    GLABRA1 (GL1) belongs to the group of R2R3-MYB transcription factors and is known to be essential for trichome initiation in Arabidopsis. In our previous study, we identified a GL1 ortholog in Brassica rapa as a candidate for the gene controlling leaf hairiness by QTL analysis and suggested that a 5-bp deletion (B-allele) and a 2-bp deletion (D-allele) in exon 3 of BrGL1 and a non-synonymous SNP (C-allele) in the second nucleotide of exon 3 possibly cause leaf hairlessness. In this study, we transformed a B. rapa line having the B-allele with the A-allele (wild type) or the C-allele of BrGL1 under the control of the CaMV 35S promoter. The transgenic plants with the A-allele showed dense coverage of seedling tissues including stems, young leaves and hypocotyls with trichomes, whereas the phenotypes of those with the C-allele were unchanged. In order to obtain more information about allelic variation of GL1 in different plant lineages and its correlation with leaf hairiness, two GL1 homologs, i.e., RsGL1a and RsGL1b, in Raphanus sativus were analyzed. Allelic variation of RsGL1a between a hairless line and a hairy line was completely associated with hairiness in their BC1F1 population. Comparison of full-length RsGL1a in the hairless and hairy lines showed great variation of nucleotides in the 3' end, which might be essential for its function and expression.

  11. Major Breeding Plumage Color Differences of Male Ruffs (Philomachus pugnax) Are Not Associated With Coding Sequence Variation in the MC1R Gene

    PubMed Central

    Küpper, Clemens; Burke, Terry; Lank, David B.

    2015-01-01

    Sequence variation in the melanocortin-1 receptor (MC1R) gene explains color morph variation in several species of birds and mammals. Ruffs (Philomachus pugnax) exhibit major dark/light color differences in melanin-based male breeding plumage, which is closely associated with alternative reproductive behavior. A previous study identified a microsatellite marker (Ppu020) near the MC1R locus associated with the presence/absence of ornamental plumage. We investigated whether coding sequence variation in the MC1R gene explains major dark/light plumage color variation and/or the presence/absence of ornamental plumage in ruffs. Among 821 bp of the MC1R coding region from 44 male ruffs we found 3 single nucleotide polymorphisms, representing 1 nonsynonymous and 2 synonymous amino acid substitutions. None were associated with major dark/light color differences or the presence/absence of ornamental plumage. At all amino acid sites known to be functionally important in other avian species with dark/light plumage color variation, ruffs were either monomorphic or the shared polymorphism did not coincide with color morph. Neither ornamental plumage color differences nor the presence/absence of ornamental plumage in ruffs are likely to be caused entirely by amino acid variation within the coding regions of the MC1R locus. Regulatory elements and structural variation at other loci may be involved in melanin expression and contribute to the extreme plumage polymorphism observed in this species. PMID:25534935

  12. Exome sequencing and arrayCGH detection of gene sequence and copy number variation between ILS and ISS mouse strains.

    PubMed

    Dumas, Laura; Dickens, C Michael; Anderson, Nathan; Davis, Jonathan; Bennett, Beth; Radcliffe, Richard A; Sikela, James M

    2014-06-01

    It has been well documented that genetic factors can influence predisposition to develop alcoholism. While the underlying genomic changes may be of several types, two of the most common and disease associated are copy number variations (CNVs) and sequence alterations of protein coding regions. The goal of this study was to identify CNVs and single-nucleotide polymorphisms that occur in gene coding regions that may play a role in influencing the risk of an individual developing alcoholism. Toward this end, two mouse strains were used that have been selectively bred based on their differential sensitivity to alcohol: the Inbred long sleep (ILS) and Inbred short sleep (ISS) mouse strains. Differences in initial response to alcohol have been linked to risk for alcoholism, and the ILS/ISS strains are used to investigate the genetics of initial sensitivity to alcohol. Array comparative genomic hybridization (arrayCGH) and exome sequencing were conducted to identify CNVs and gene coding sequence differences, respectively, between ILS and ISS mice. Mouse arrayCGH was performed using catalog Agilent 1 × 244 k mouse arrays. Subsequently, exome sequencing was carried out using an Illumina HiSeq 2000 instrument. ArrayCGH detected 74 CNVs that were strain-specific (38 ILS/36 ISS), including several ISS-specific deletions that contained genes implicated in brain function and neurotransmitter release. Among several interesting coding variations detected by exome sequencing was the gain of a premature stop codon in the alpha-amylase 2B (AMY2B) gene specifically in the ILS strain. In total, exome sequencing detected 2,597 and 1,768 strain-specific exonic gene variants in the ILS and ISS mice, respectively. This study represents the most comprehensive and detailed genomic comparison of ILS and ISS mouse strains to date. The two complementary genome-wide approaches identified strain-specific CNVs and gene coding sequence variations that should provide strong candidates to

  13. Nucleotide sequence variation of the VP7 gene of two G3-type rotaviruses isolated from dogs.

    PubMed

    Martella, V; Pratelli, A; Greco, G; Gentile, M; Fiorente, P; Tempesta, M; Buonavoglia, C

    2001-04-01

    The sequence of the VP7 gene of two rotaviruses isolated from dogs in southern Italy was determined and the inferred amino acid sequence was compared with that of other rotavirus strains. There was very high nucleotide and amino acid identity between canine strain RV198/95 and other canine strains, and to the human strain HCR3A. Strain RV52/96, however, was found to have about 95% identity to the G3 serotype canine strains K9, A79-10 and CU-1 and 96% identity to strain RV198/95 and to the simian strain RRV. Therefore both of the canine strains belong to the G3 serotype. Nevertheless, detailed analysis of the VP7 variable regions revealed that RV52/96 possesses amino acid substitutions uncommon to the other canine isolates. In addition, strain RV52/96 exhibited a nucleotide divergence greater than 16% from all the other canine strains studied; however, it revealed the closest identity (90.4%) to the simian strain RRV. With only a few exceptions, phylogenetic analysis allowed clear differentiation of the G3 rotaviruses on the basis of the species of origin. The nucleotide and amino acid variations observed in strain RV52/96 could account for the existence of a canine rotavirus G3 sub-type. PMID:11226570

  14. Individual variation and intraclass correlation in arachidonic acid and eicosapentaenoic acid in chicken muscle

    PubMed Central

    2010-01-01

    Chicken meat with reduced concentration of arachidonic acid (AA) and reduced ratio between omega-6 and omega-3 fatty acids has potential health benefits because a reduction in AA intake dampens prostanoid signaling, and the proportion between omega-6 and omega-3 fatty acids is too high in our diet. Analyses for fatty acid determination are expensive, and finding the optimal number of analyses to give reliable results is a challenge. The objectives of the present study were i) to analyse the intraclass correlation of different fatty acids in five meat samples, of one gram each, within the same chicken thigh, and ii) to study individual variations in the concentrations of a range of fatty acids and the ratio between omega-6 and omega-3 fatty acid concentrations among fifteen chickens. Fifteen newly hatched broilers were fed a wheat-based diet containing 4% rapeseed oil and 1% linseed oil for three weeks. Five muscle samples from the mid location of the thigh of each chicken were analysed for fatty acid composition. The intraclass correlation (sample correlation within the same animal) was 0.85-0.98 for the ratios of total omega-6 to total omega-3 fatty acids and of AA to eicosapentaenoic acid (EPA). This indicates that when studying these fatty acid ratios, one sample of one gram per animal is sufficient. However, due to the high individual variation among chickens for these ratios, a relatively high number of animals (minimum 15) is required to obtain a sufficiently high power to reveal significant effects of experimental factors (e.g. feeding regimes). The present experiment resulted in meat with a favorable concentration ratio between omega-6 and omega-3 fatty acids. The AA concentration varied from 1.5 to 2.8 g/100 g total fatty acids in thigh muscle in the fifteen broilers, and the ratio between AA and EPA concentrations ranged from 2.3 to 3.9. These differences among the birds may be due to genetic variance that can be exploited by breeding for lower AA
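
The intraclass correlation reported in this record can be illustrated with a minimal sketch. It assumes a one-way random-effects design (each animal is a group, each muscle sample a replicate) and the classical ICC(1) estimator; the abstract does not state which estimator was used, and the function name `icc_oneway` is hypothetical.

```python
from statistics import mean


def icc_oneway(groups):
    """One-way random-effects ICC(1) for equally sized groups.

    groups: list of lists, one inner list of replicate measurements per animal.
    This is an illustrative estimator, not necessarily the one used in the study.
    """
    n = len(groups)            # number of animals
    k = len(groups[0])         # replicate samples per animal
    if n < 2 or k < 2 or any(len(g) != k for g in groups):
        raise ValueError("need at least 2 groups of equal size >= 2")

    grand_mean = mean(x for g in groups for x in g)
    group_means = [mean(g) for g in groups]

    # Mean squares between and within animals (one-way ANOVA).
    ms_between = k * sum((m - grand_mean) ** 2 for m in group_means) / (n - 1)
    ms_within = sum(
        (x - m) ** 2 for g, m in zip(groups, group_means) for x in g
    ) / (n * (k - 1))

    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)
```

An ICC close to 1 means that replicate samples from one animal are nearly interchangeable, which is the basis for the conclusion that one sample per animal is sufficient.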

  15. Oxytocin receptor gene sequences in owl monkeys and other primates show remarkable interspecific regulatory and protein coding variation.

    PubMed

    Babb, Paul L; Fernandez-Duque, Eduardo; Schurr, Theodore G

    2015-10-01

    The oxytocin (OT) hormone pathway is involved in numerous physiological processes, and one of its receptor genes (OXTR) has been implicated in pair bonding behavior in mammalian lineages. This observation is important for understanding social monogamy in primates, which occurs in only a small subset of taxa, including Azara's owl monkey (Aotus azarae). To examine the potential relationship between social monogamy and OXTR variation, we sequenced its 5' regulatory (4936bp) and coding (1167bp) regions in 25 owl monkeys from the Argentinean Gran Chaco, and examined OXTR sequences from 1092 humans from the 1000 Genomes Project. We also assessed interspecific variation of OXTR in 25 primate and rodent species that represent a set of phylogenetically and behaviorally disparate taxa. Our analysis revealed substantial variation in the putative 5' regulatory region of OXTR, with marked structural differences across primate taxa, particularly for humans and chimpanzees, which exhibited unique patterns of large motifs of dinucleotide A+T repeats upstream of the OXTR 5' UTR. In addition, we observed a large number of amino acid substitutions in the OXTR CDS region among New World primate taxa that distinguish them from Old World primates. Furthermore, primate taxa traditionally defined as socially monogamous (e.g., gibbons, owl monkeys, titi monkeys, and saki monkeys) all exhibited different amino acid motifs for their respective OXTR protein coding sequences. These findings support the notion that monogamy has evolved independently in Old World and New World primates, and that it has done so through different molecular mechanisms, not exclusively through the oxytocin pathway. PMID:26025428

  17. Genetic variation of six desaturase genes in flax and their impact on fatty acid composition.

    PubMed

    Thambugala, Dinushika; Duguid, Scott; Loewen, Evelyn; Rowland, Gordon; Booker, Helen; You, Frank M; Cloutier, Sylvie

    2013-10-01

    Flax (Linum usitatissimum L.) is one of the richest plant sources of omega-3 fatty acids praised for their health benefits. In this study, the extent of the genetic variability of genes encoding stearoyl-ACP desaturase (SAD), and fatty acid desaturase 2 (FAD2) and 3 (FAD3) was determined by sequencing the six paralogous genes from 120 flax accessions representing a broad range of germplasm including some EMS mutant lines. A total of 6 alleles for sad1 and sad2, 21 for fad2a, 5 for fad2b, 15 for fad3a and 18 for fad3b were identified. Deduced amino acid sequences of the alleles predicted 4, 2, 3, 4, 6 and 7 isoforms, respectively. Allele frequencies varied greatly across genes. Fad3a, with 110 SNPs and 19 indels, and fad3b, with 50 SNPs and 5 indels, showed the highest levels of genetic variations. While most of the SNPs and all the indels were silent mutations, both genes carried nonsense SNP mutations resulting in premature stop codons, a feature not observed in sad and fad2 genes. Some alleles and isoforms discovered in induced mutant lines were absent in the natural germplasm. Correlation of these genotypic data with fatty acid composition data of 120 flax accessions phenotyped in six field experiments revealed statistically significant effects of some of the SAD and FAD isoforms on fatty acid composition, oil content and iodine value. The novel allelic variants and isoforms identified for the six desaturases will be a resource for the development of oilseed flax with unique and useful fatty acid profiles.

  18. Genetic variation assessment of acid lime accessions collected from south of Iran using SSR and ISSR molecular markers.

    PubMed

    Sharafi, Ata Allah; Abkenar, Asad Asadi; Sharafi, Ali; Masaeli, Mohammad

    2016-01-01

    Iran has a long history of acid lime cultivation and propagation. In this study, genetic variation in 28 acid lime accessions from five regions of southern Iran, and their relatedness to 19 other citrus cultivars, were analyzed using Simple Sequence Repeat (SSR) and Inter-Simple Sequence Repeat (ISSR) molecular markers. Nine SSR primers and nine ISSR primers were used for allele scoring. In total, 49 SSR and 131 ISSR polymorphic alleles were detected. Cluster analysis of SSR and ISSR data showed that most of the acid lime accessions (19 genotypes) have a hybrid origin and are genetically distant from nucellar seedlings of Mexican lime (9 genotypes). As nucellar seedlings of Mexican lime are susceptible to phytoplasma, these acid lime genotypes can be used to evaluate their tolerance against biotic constraints such as lime "witches' broom disease".

  19. Whole-genome sequencing reveals the diversity of cattle copy number variations and multicopy genes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Structural and functional impacts of copy number variations (CNVs) on livestock genomes are not yet well understood. We identified 1853 CNV regions using population-scale sequencing data generated from 75 cattle representing 8 breeds (Angus, Brahman, Gir, Holstein, Jersey, Limousin, Nelore, Romagnol...

  20. Intraspecific genetic variation in Paramecium revealed by mitochondrial cytochrome C oxidase I sequences.

    PubMed

    Barth, Dana; Krenek, Sascha; Fokin, Sergei I; Berendonk, Thomas U

    2006-01-01

    Studies of intraspecific genetic diversity of ciliates, such as population genetics and biogeography, are particularly hampered by the lack of suitable DNA markers. For example, sequences of the non-coding ribosomal internal transcribed spacer (ITS) regions are often too conserved for intraspecific analyses. We have therefore identified primers for the mitochondrial cytochrome c oxidase I (COI) gene and applied them for intraspecific investigations in Paramecium caudatum and Paramecium multimicronucleatum. Furthermore, we obtained sequences of the ITS regions from the same strains and carried out comparative sequence analyses of both data sets. The mitochondrial sequences revealed substantially higher variation in both Paramecium species, with intraspecific divergences up to 7% in P. caudatum and 9.5% in P. multimicronucleatum. Moreover, an initial survey of the population structure discovered different mitochondrial haplotypes of P. caudatum in one pond, thereby demonstrating the potential of this genetic marker for population genetic analyses. Our primers successfully amplified the COI gene of other Paramecium. This is the first report of intraspecific variation in free-living protozoans based on mitochondrial sequence data. Our results show that the high variation in mitochondrial DNA makes it a suitable marker for intraspecific and population genetic studies.

  1. Mitochondrial COI sequences in mites: evidence for variations in base composition.

    PubMed

    Navajas, M; Fournier, D; Lagnel, J; Gutierrez, J; Boursot, P

    1996-11-01

    Studies of mitochondrial DNA sequences in a variety of animals have shown important differences between phyla, including differences in the genetic codes used, and varying constraints on base composition. In that respect, little is known of mites, an important and diversified group. We sequenced a portion (340 nt) of the cytochrome oxidase subunit I (COI) encoding gene in twenty species of phytophagous mites belonging to nine genera of the two families Tetranychidae and Tenuipalpidae. The mitochondrial genetic code used in mites appeared to be the same as in insects. As is generally also the case in insects, the mite sequences were very rich in A + T (75% on average), especially at the third codon position (94%). However, important variations of base composition were observed among mite species, one of them showing as little as 69% A + T. Variations of base composition occur mostly through synonymous transitions, and do not have detectable effects on polypeptide evolution in this group. PMID:8933179
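
The base-composition figures in this record (A + T overall and at the third codon position) can be computed along the following lines. This is a minimal sketch that assumes the sequence is already in frame, starting at a first codon position; the function name `at_content_by_codon_position` is hypothetical.

```python
def at_content_by_codon_position(seq):
    """Return the A+T fraction at codon positions 1, 2 and 3.

    Assumes `seq` is an in-frame coding sequence; trailing bases that do not
    fill a complete codon are ignored.
    """
    seq = seq.upper()
    usable = len(seq) - len(seq) % 3   # drop an incomplete final codon
    fractions = []
    for pos in range(3):
        bases = seq[pos:usable:3]      # every third base from this offset
        at_count = sum(base in "AT" for base in bases)
        fractions.append(at_count / len(bases) if bases else 0.0)
    return fractions
```

For example, `at_content_by_codon_position("ATGAAATTT")` returns fractions of 1.0, 1.0 and about 0.667 for the first, second and third positions.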

  2. Predicting protein disorder by analyzing amino acid sequence

    PubMed Central

    Yang, Jack Y; Yang, Mary Qu

    2008-01-01

    Background Many protein regions and some entire proteins have no definite tertiary structure, presenting instead as dynamic, disordered ensembles under different physicochemical circumstances. These proteins and regions are known as Intrinsically Unstructured Proteins (IUP). IUP have been associated with a wide range of protein functions, along with roles in diseases characterized by protein misfolding and aggregation. Results Identifying IUP is an important task in structural and functional genomics. We extract useful features from sequences and develop machine learning algorithms for the above task. We compare our IUP predictor with PONDRs (mainly neural-network-based predictors), disEMBL (also based on neural networks) and Globplot (based on disorder propensity). Conclusion We find that augmenting features derived from physicochemical properties of amino acids (such as hydrophobicity, complexity etc.) and using an ensemble method proved beneficial. The IUP predictor is a viable alternative software tool for identifying IUP protein regions and proteins. PMID:18831799

  3. LOESS correction for length variation in gene set-based genomic sequence analysis

    PubMed Central

    Aboukhalil, Anton; Bulyk, Martha L.

    2012-01-01

    Motivation: Sequence analysis algorithms are often applied to sets of DNA, RNA or protein sequences to identify common or distinguishing features. Controlling for sequence length variation is critical to properly score sequence features and identify true biological signals rather than length-dependent artifacts. Results: Several cis-regulatory module discovery algorithms exhibit a substantial dependence between DNA sequence score and sequence length. Our newly developed LOESS method is flexible in capturing diverse score-length relationships and is more effective in correcting DNA sequence scores for length-dependent artifacts, compared with four other approaches. Application of this method to genes co-expressed during Drosophila melanogaster embryonic mesoderm development or neural development scored by the Lever motif analysis algorithm resulted in successful recovery of their biologically validated cis-regulatory codes. The LOESS length-correction method is broadly applicable, and may be useful not only for more accurate inference of cis-regulatory codes, but also for detection of other types of patterns in biological sequences. Availability: Source code and compiled code are available from http://thebrain.bwh.harvard.edu/LM_LOESS/ Contact: mlbulyk@receptor.med.harvard.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22492312
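
The idea of correcting sequence scores for length dependence can be sketched generically: fit a smooth score-versus-length trend and keep the residuals. The sketch below is not the authors' LM_LOESS code; it assumes a plain local linear regression with tricube weights, and the names `loess_fit`, `length_corrected_scores` and the `frac` parameter are hypothetical.

```python
def loess_fit(xs, ys, x0, frac=0.5):
    """Locally weighted linear fit of ys on xs, evaluated at x0."""
    n = len(xs)
    k = max(2, int(round(frac * n)))            # neighbours in the local window
    h = sorted(abs(x - x0) for x in xs)[k - 1]  # bandwidth: k-th nearest distance
    if h == 0:
        # At least k points share x0 exactly; return their mean score.
        same = [y for x, y in zip(xs, ys) if x == x0]
        return sum(same) / len(same)

    # Tricube weights: 1 at x0, falling to 0 at distance h and beyond.
    w = [(1 - min(abs(x - x0) / h, 1.0) ** 3) ** 3 for x in xs]
    sw = sum(w)
    if sw == 0:
        return sum(ys) / n                      # degenerate window: global mean

    mx = sum(wi * x for wi, x in zip(w, xs)) / sw
    my = sum(wi * y for wi, y in zip(w, ys)) / sw
    sxx = sum(wi * (x - mx) ** 2 for wi, x in zip(w, xs))
    sxy = sum(wi * (x - mx) * (y - my) for wi, x, y in zip(w, xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return my + slope * (x0 - mx)


def length_corrected_scores(lengths, scores, frac=0.5):
    """Residual of each score after removing the fitted length trend."""
    return [s - loess_fit(lengths, scores, length, frac)
            for length, s in zip(lengths, scores)]
```

With this residual form, a sequence keeps a high corrected score only if it scores above what is typical for sequences of similar length.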

  4. Effect of sequence variation on the mechanical response of amyloid fibrils probed by steered molecular dynamics simulation.

    PubMed

    Ndlovu, Hlengisizwe; Ashcroft, Alison E; Radford, Sheena E; Harris, Sarah A

    2012-02-01

    The mechanical failure of mature amyloid fibers produces fragments that act as seeds for the growth of new fibrils. Fragmentation may also be correlated with cytotoxicity. We have used steered atomistic molecular dynamics simulations to study the mechanical failure of fibrils formed by the amyloidogenic fragment of human amylin hIAPP20-29 subjected to force applied in a variety of directions. By introducing systematic variations to this peptide sequence in silico, we have also investigated the role of the amino-acid sequence in determining the mechanical stability of amyloid fibrils. Our calculations show that the force required to induce mechanical failure depends on the direction of the applied stress and upon the degree of structural order present in the β-sheet assemblies, which in turn depends on the peptide sequence. The results have implications for the importance of sequence-dependent mechanical properties on seeding the growth of new fibrils and the role of breakage events in cytotoxicity.

  5. A quantitative trait locus for variation in dopamine metabolism mapped in a primate model using reference sequences from related species

    PubMed Central

    Freimer, Nelson B.; Service, Susan K.; Ophoff, Roel A.; Jasinska, Anna J.; McKee, Kevin; Villeneuve, Amelie; Belisle, Alexandre; Bailey, Julia N.; Breidenthal, Sherry E.; Jorgensen, Matthew J.; Mann, J. John; Cantor, Rita M.; Dewar, Ken; Fairbanks, Lynn A.

    2007-01-01

    Non-human primates (NHP) provide crucial research models. Their strong similarities to humans make them particularly valuable for understanding complex behavioral traits and brain structure and function. We report here the genetic mapping of an NHP nervous system biologic trait, the cerebrospinal fluid (CSF) concentration of the dopamine metabolite homovanillic acid (HVA), in an extended inbred vervet monkey (Chlorocebus aethiops sabaeus) pedigree. CSF HVA is an index of CNS dopamine activity, which is hypothesized to contribute substantially to behavioral variations in NHP and humans. For quantitative trait locus (QTL) mapping, we carried out a two-stage procedure. We first scanned the genome using a first-generation genetic map of short tandem repeat markers. Subsequently, using >100 SNPs within the most promising region identified by the genome scan, we mapped a QTL for CSF HVA at a genome-wide level of significance (peak logarithm of odds score >4) to a narrow well delineated interval (<10 Mb). The SNP discovery exploited conserved segments between human and rhesus macaque reference genome sequences. Our findings demonstrate the potential of using existing primate reference genome sequences for designing high-resolution genetic analyses applicable across a wide range of NHP species, including the many for which full genome sequences are not yet available. Leveraging genomic information from sequenced to nonsequenced species should enable the utilization of the full range of NHP diversity in behavior and disease susceptibility to determine the genetic basis of specific biological and behavioral traits. PMID:17884980

  6. Identification of staphylococcal species based on variations in protein sequences (mass spectrometry) and DNA sequence (sodA microarray).

    PubMed

    Kooken, Jennifer; Fox, Karen; Fox, Alvin; Altomare, Diego; Creek, Kim; Wunschel, David; Pajares-Merino, Sara; Martínez-Ballesteros, Ilargi; Garaizar, Javier; Oyarzabal, Omar; Samadpour, Mansour

    2014-02-01

    This report is among the first to use sequence variation in newly discovered protein markers for staphylococcal (or indeed any other bacterial) speciation. Variation, at the DNA sequence level, in the sodA gene (commonly used for staphylococcal speciation) provided excellent correlation. Relatedness among strains was also assessed by protein profiling using microcapillary electrophoresis and pulsed field electrophoresis. A total of 64 strains were analyzed, including reference strains representing the 11 staphylococcal species most commonly isolated from man (Staphylococcus aureus and 10 coagulase negative species [CoNS]). Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI TOF MS) and liquid chromatography-electrospray ionization tandem mass spectrometry (LC ESI MS/MS) were used for peptide analysis of proteins isolated from gel bands. Comparison of experimental spectra of unknowns versus spectra of peptides derived from reference strains allowed bacterial identification after MALDI TOF MS analysis. After LC-MS/MS analysis of gel bands, bacterial speciation was performed by comparing experimental spectra versus virtual spectra using the software X!Tandem. Finally, LC-MS/MS was performed on whole proteomes, with data analysis also employing X!Tandem. Aconitate hydratase and oxoglutarate dehydrogenase served as marker proteins on focused analysis after gel separation. Alternatively, on full proteomics analysis, elongation factor Tu generally provided the highest confidence in staphylococcal speciation.

  7. High intraindividual variation in internal transcribed spacer sequences in Aeschynanthus (Gesneriaceae): implications for phylogenetics.

    PubMed Central

    Denduangboripant, J; Cronk, Q C

    2000-01-01

    Aeschynanthus (Gesneriaceae) is a large genus of tropical epiphytes that is widely distributed from the Himalayas and China throughout South-East Asia to New Guinea and the Solomon Islands. Polymerase chain reaction (PCR) consensus sequences of the internal transcribed spacers (ITS) of Aeschynanthus nuclear ribosomal DNA showed sequence polymorphism that was difficult to interpret. Cloning individual sequences from the PCR product generated a phylogenetic tree of 23 Aeschynanthus species (two clones per species). The intraindividual clone pairs varied from 0 to 5.01%. We suggest that the high intraindividual sequence variation results from low molecular drive in the ITS of Aeschynanthus. However, this study shows that, despite the variation found within some individuals, it is still possible to use these data to reconstruct phylogenetic relationships of the species, suggesting that clone variation, although persistent, does not pre-date the divergence of Aeschynanthus species. The Aeschynanthus analysis revealed two major clades with different but overlapping geographic distributions and reflected classification based on morphology (particularly seed hair type). PMID:10983824

  8. Simulated seasonal variations in wet acid depositions over East Asia.

    PubMed

    Ge, Cui; Zhang, Meigen; Zhu, Lingyun; Han, Xiao; Wang, Jun

    2011-11-01

    The air quality modeling system Regional Atmospheric Modeling System-Community Multi-scale Air Quality (RAMS-CMAQ) was applied to analyze temporospatial variations in wet acid deposition over East Asia in 2005, and model results obtained on a monthly basis were evaluated against extensive observations, including precipitation amounts at 704 stations and SO4(2-), NO3-, and NH4+ concentrations in the atmosphere and rainwater at 18 EANET (the Acid Deposition Monitoring Network in East Asia) stations. The comparison shows that the modeling system can reasonably reproduce seasonal precipitation patterns, especially the extensive area of dry conditions in northeast China and north China and the major precipitation zones. For ambient concentrations and wet depositions, the simulated results are in reasonable agreement (within a factor of 2) with observations in most cases, and the major observed features are mostly well reproduced. The analysis of modeled wet deposition distributions indicates that East Asia experiences noticeable variations in its wet deposition patterns throughout the year. In winter, southern China and the coastal areas of the Japan Sea report higher SO4(2-) and NO3- wet depositions. In spring, elevated SO4(2-) and NO3- wet depositions are found in northeastern China, southern China, and around the Yangtze River. In summer, a remarkable rise in precipitation in northeastern China, the valleys of the Huaihe and Yangtze rivers, Korea, and Japan leads to a noticeable increase in SO4(2-) and NO3- wet depositions, whereas in autumn, higher SO4(2-) and NO3- wet depositions are found around Sichuan Province. Meanwhile, due to the high emission of SO2, high wet depositions of SO4(2-) are found throughout the entire year in the area surrounding Sichuan Province. There is a tendency toward decreasing NO3- concentrations in rainwater from China through Korea to Japan in both observed and simulated results, which is a consequence of the influence of the continental

  10. Worldwide DNA sequence variation in a 10-kilobase noncoding region on human chromosome 22.

    PubMed

    Zhao, Z; Jin, L; Fu, Y X; Ramsay, M; Jenkins, T; Leskinen, E; Pamilo, P; Trexler, M; Patthy, L; Jorde, L B; Ramos-Onsins, S; Yu, N; Li, W H

    2000-10-10

    Human DNA sequence variation data are useful for studying the origin, evolution, and demographic history of modern humans and the mechanisms of maintenance of genetic variability in human populations, and for detecting linkage association of disease. Here, we report worldwide variation data from an approximately 10-kilobase noncoding autosomal region. We identified 75 variant sites in 64 humans (128 sequences) and 463 variant sites among the human, chimpanzee, and orangutan sequences. Statistical tests suggested that the region is selectively neutral. The average nucleotide diversity (pi) across the region was 0.088% among all of the human sequences obtained, 0.085% among African sequences, and 0.082% among non-African sequences, supporting the view of a low nucleotide diversity (approximately 0.1%) in humans. The comparable pi value in non-Africans to that in Africans indicates no severe bottleneck during the evolution of modern non-Africans; however, the possibility of a mild bottleneck cannot be excluded because non-Africans showed considerably fewer variants than Africans. The present and two previous large data sets all show a strong excess of low frequency variants in comparison to that expected from an equilibrium population, indicating a relatively recent population expansion. The mutation rate was estimated to be 1.15 x 10(-9) per nucleotide per year. Estimates of the long-term effective population size N(e) by various statistical methods were similar to those in other studies. The age of the most recent common ancestor was estimated to be approximately 1.29 million years ago among all of the sequences obtained and approximately 634,000 years ago among the non-African sequences, providing the first evidence from a noncoding autosomal region for ancient human histories, even among non-Africans.
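
Nucleotide diversity (pi), the statistic reported in this record, is the average number of differences per site between all pairs of sequences in a sample. A minimal sketch follows, assuming equal-length, pre-aligned sequences with no gaps or missing data; the function name `nucleotide_diversity` is hypothetical and the study's own handling of gaps is not described in the abstract.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site among aligned sequences."""
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(seqs[0])
    if length == 0 or any(len(s) != length for s in seqs):
        raise ValueError("sequences must be non-empty and equally long")

    pairs = list(combinations(seqs, 2))
    # Count mismatched sites for every pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)
```

On this scale, the reported value of 0.088% corresponds to a return value of 0.00088, or roughly one difference per 1,100 sites between two randomly chosen sequences.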

  11. Human liver apolipoprotein B-100 cDNA: complete nucleic acid and derived amino acid sequence.

    PubMed Central

    Law, S W; Grant, S M; Higuchi, K; Hospattankar, A; Lackner, K; Lee, N; Brewer, H B

    1986-01-01

    Human apolipoprotein B-100 (apoB-100), the ligand on low density lipoproteins that interacts with the low density lipoprotein receptor and initiates receptor-mediated endocytosis and low density lipoprotein catabolism, has been cloned, and the complete nucleic acid and derived amino acid sequences have been determined. ApoB-100 cDNAs were isolated from normal human liver cDNA libraries utilizing immunoscreening as well as filter hybridization with radiolabeled apoB-100 oligodeoxynucleotides. The apoB-100 mRNA is 14.1 kilobases long encoding a mature apoB-100 protein of 4536 amino acids with a calculated amino acid molecular weight of 512,723. ApoB-100 contains 20 potential glycosylation sites, and 12 of a total of 25 cysteine residues are located in the amino-terminal region of the apolipoprotein providing a potential globular structure of the amino terminus of the protein. ApoB-100 contains relatively few regions of amphipathic helices, but compared to other human apolipoproteins it is enriched in beta-structure. The delineation of the entire human apoB-100 sequence will now permit a detailed analysis of the conformation of the protein, the low density lipoprotein receptor binding domain(s), and the structural relationship between apoB-100 and apoB-48 and will provide the basis for the study of genetic defects in apoB-100 in patients with dyslipoproteinemias. PMID:3464946

  12. BayesPI-BAR: a new biophysical model for characterization of regulatory sequence variations.

    PubMed

    Wang, Junbai; Batmanov, Kirill

    2015-12-01

    Sequence variations in regulatory DNA regions are known to cause functionally important consequences for gene expression. DNA sequence variations may have an essential role in determining phenotypes and may be linked to disease; however, their identification through analysis of massive genome-wide sequencing data is a great challenge. In this work, a new computational pipeline, a Bayesian method for protein-DNA interaction with binding affinity ranking (BayesPI-BAR), is proposed for quantifying the effect of sequence variations on protein binding. BayesPI-BAR uses biophysical modeling of protein-DNA interactions to predict single nucleotide polymorphisms (SNPs) that cause significant changes in the binding affinity of a regulatory region for transcription factors (TFs). The method includes two new parameters (TF chemical potentials or protein concentrations and direct TF binding targets) that are neglected by previous methods. The new method is verified on 67 known human regulatory SNPs, of which 47 (70%) have predicted true TFs ranked in the top 10. Importantly, the performance of BayesPI-BAR, which uses principal component analysis to integrate multiple predictions from various TF chemical potentials, is found to be better than that of existing programs, such as sTRAP and is-rSNP, when evaluated on the same SNPs. BayesPI-BAR is a publicly available tool and is able to carry out parallelized computation, which helps to investigate a large number of TFs or SNPs and to detect disease-associated regulatory sequence variations in the sea of genome-wide noncoding regions. PMID:26202972

  13. Oleic Acid: Natural variation and potential enhancement in oilseed crops.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Oleic acid is a monounsaturated omega 9 fatty acid (MUFA, C18:1) which can be found in various plant lipids and animal fats. Unlike omega 3 (a-linolenic acid, C18:3) and omega 6 (linoleic acid, C18:2) fatty acids which are essential because they cannot be synthesized by humans and must be obtained f...

  14. Genome-wide Mycobacterium tuberculosis variation (GMTV) database: a new tool for integrating sequence variations and epidemiology

    PubMed Central

    2014-01-01

    Background Tuberculosis (TB) poses a worldwide threat due to advancing multidrug-resistant strains and deadly co-infections with Human immunodeficiency virus. Today large amounts of Mycobacterium tuberculosis whole genome sequencing data are being assessed broadly and yet there exists no comprehensive online resource that connects M. tuberculosis genome variants with geographic origin, with drug resistance or with clinical outcome. Description Here we describe a broadly inclusive unifying Genome-wide Mycobacterium tuberculosis Variation (GMTV) database, (http://mtb.dobzhanskycenter.org) that catalogues genome variations of M. tuberculosis strains collected across Russia. GMTV contains a broad spectrum of data derived from different sources and related to M. tuberculosis molecular biology, epidemiology, TB clinical outcome, year and place of isolation, drug resistance profiles and displays the variants across the genome using a dedicated genome browser. GMTV database, which includes 1084 genomes and over 69,000 SNP or Indel variants, can be queried about M. tuberculosis genome variation and putative associations with drug resistance, geographical origin, and clinical stages and outcomes. Conclusions Implementation of GMTV tracks the pattern of changes of M. tuberculosis strains in different geographical areas, facilitates disease gene discoveries associated with drug resistance or different clinical sequelae, and automates comparative genomic analyses among M. tuberculosis strains. PMID:24767249

  15. SoftSearch: Integration of Multiple Sequence Features to Identify Breakpoints of Structural Variations

    PubMed Central

    Hart, Steven N.; Sarangi, Vivekananda; Moore, Raymond; Baheti, Saurabh; Bhavsar, Jaysheel D.; Couch, Fergus J.; Kocher, Jean-Pierre A.

    2013-01-01

    Background Structural variation (SV) represents a significant, yet poorly understood contribution to an individual’s genetic makeup. Advanced next-generation sequencing technologies are widely used to discover such variations, but there is no single detection tool that is considered a community standard. In an attempt to fulfil this need, we developed an algorithm, SoftSearch, for discovering structural variant breakpoints in Illumina paired-end next-generation sequencing data. SoftSearch combines multiple strategies for detecting SV, including split reads, discordant read pairs, and unmated pairs. Co-localized split reads and discordant read pairs are used to refine the breakpoints. Results We developed and validated SoftSearch using real and synthetic datasets. SoftSearch’s key features are 1) no requirement for secondary (or exhaustive primary) alignment, 2) portability into established sequencing workflows, and 3) applicability to any DNA-sequencing experiment (e.g. whole genome, exome, custom capture, etc.). SoftSearch identifies breakpoints from a small number of soft-clipped bases from split reads and a few discordant read pairs which on their own would not be sufficient to make an SV call. Conclusions We show that SoftSearch can identify more true SVs by combining multiple sequence features. SoftSearch was able to call clinically relevant SVs in the BRCA2 gene not reported by other tools while offering significantly improved overall performance. PMID:24358278
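
    Soft-clipped bases are recorded in the CIGAR string of each aligned read, so the first step of any split-read approach can be sketched as below. This is an illustrative example rather than SoftSearch's own code; the function names and the `min_clip` threshold are hypothetical.

```python
import re

# One CIGAR element: a length followed by an operation code from the SAM format.
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def soft_clip_lengths(cigar):
    """Return (left, right) soft-clipped base counts for a CIGAR string."""
    ops = [(int(n), op) for n, op in CIGAR_OP.findall(cigar)]
    # Hard clips (H) may sit outside the soft clips, so drop them first.
    core = [o for o in ops if o[1] != "H"]
    left = core[0][0] if core and core[0][1] == "S" else 0
    right = core[-1][0] if len(core) > 1 and core[-1][1] == "S" else 0
    return left, right

def is_candidate_split_read(cigar, min_clip=5):
    """Flag reads whose longer soft clip reaches min_clip (hypothetical cutoff)."""
    return max(soft_clip_lengths(cigar)) >= min_clip
```

    Reads flagged this way would then be combined with discordant read pairs near the same position, which is the co-localization step the abstract describes.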

  16. Sequence-Based Pronunciation Variation Modeling for Spontaneous ASR Using a Noisy Channel Approach

    NASA Astrophysics Data System (ADS)

    Hofmann, Hansjörg; Sakti, Sakriani; Hori, Chiori; Kashioka, Hideki; Nakamura, Satoshi; Minker, Wolfgang

    The performance of English automatic speech recognition systems decreases when recognizing spontaneous speech, mainly due to multiple pronunciation variants in the utterances. Previous approaches address this problem by modeling the alteration of the pronunciation on a phoneme-to-phoneme level. However, the phonetic transformation effects induced by the pronunciation of the whole sentence have not yet been considered. In this article, the sequence-based pronunciation variation is modeled using a noisy channel approach where the spontaneous phoneme sequence is considered as a “noisy” string and the goal is to recover the “clean” string of the word sequence. In this way, the whole word sequence and its effect on the alteration of the phonemes are taken into consideration. Moreover, the system learns not only the phoneme transformation but also the mapping from the phoneme to the word directly. In this study, the phonemes are first recognized with the present recognition system, and afterwards the pronunciation variation model based on the noisy channel approach maps from the phoneme to the word level. Two well-known natural language processing approaches are adopted and derived from the noisy channel model theory: joint-sequence models and statistical machine translation. Both of them are applied and various experiments are conducted using microphone and telephone recordings of spontaneous speech.

  17. Sequence analysis and identification of new variations in the coding sequence of melatonin receptor gene (MTNR1A) of Indian Chokla sheep breed

    PubMed Central

    Saxena, Vijay Kumar; Jha, Bipul Kumar; Meena, Amar Singh; Naqvi, S.M.K.

    2014-01-01

    The melatonin receptor 1A gene encodes the prime receptor mediating the effect of melatonin at the neuroendocrine level for control of seasonal reproduction in sheep. The aims of this study were to examine the polymorphism pattern of the coding sequence of the MTNR1A gene in Chokla sheep, a breed of the Indian arid tract, and to identify new variations in relation to its aseasonal status. Genomic DNAs of 101 Chokla sheep were collected and an 824 bp coding sequence of Exon II was amplified. RFLP was performed with the enzymes RsaI and MnlI to assess the presence of polymorphism at positions C606T and G612A, respectively. Genotyping revealed a significantly higher frequency of M and R alleles than m and r alleles. RR and MM were found to be dominantly present in the studied population. Cloning and sequencing of Exon II followed by mutation/polymorphism analysis revealed ten mutations, of which three were non-synonymous (G706A, C893A, G931C). G706A leads to substitution of valine by isoleucine Val125I (U14109) in the fifth transmembrane domain. C893A leads to substitution of alanine by aspartic acid in the third extracellular loop. The G931C mutation substitutes proline for alanine in the seventh transmembrane helix, which can affect the conformational stability of the molecule. Polyphen-2 analysis revealed that the polymorphism at position 931 is potentially damaging while the mutations at positions 706 and 893 were benign. It is concluded that the G931C mutation of the MTNR1A gene may explain, in part, the importance of melatonin structure integrity in influencing seasonality in sheep. PMID:25606429
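
    Whether a coding substitution is synonymous or non-synonymous follows directly from the standard genetic code, as the sketch below shows. The codons used in the example are hypothetical choices that merely reproduce the amino acid changes named in the abstract (Val to Ile, Ala to Asp, Ala to Pro); the actual MTNR1A codons are not given in the record.

```python
BASES = "TCAG"
# Standard genetic code, codons enumerated in TCAG order; '*' marks a stop codon.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

def classify_substitution(codon, position, alt_base):
    """Classify a single-base change at a 0-based position within a codon."""
    mutated = codon[:position] + alt_base + codon[position + 1:]
    ref_aa, alt_aa = CODON_TABLE[codon], CODON_TABLE[mutated]
    kind = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return kind, ref_aa, alt_aa

# Hypothetical codons consistent with the reported amino acid changes.
print(classify_substitution("GTC", 0, "A"))  # G>A at the first base: Val to Ile
print(classify_substitution("GCC", 1, "A"))  # C>A at the second base: Ala to Asp
print(classify_substitution("GCC", 0, "C"))  # G>C at the first base: Ala to Pro
```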

  18. Spatial variation in acidic deposition in an appalachian forest

    SciTech Connect

    Tajchman, S.J.; Kosuri, S.R.; Zeleznik, J.D. (Division of Forestry)

    1993-03-01

    Precipitation is the main source of water for the forest. Gross precipitation reaching the forest canopy is partitioned among interception, stemflow, and throughfall. While the intercepted fraction of precipitation evaporates into the atmosphere, stemflow and throughfall reach the forest floor, but their chemistry is as a rule different from that of gross precipitation. This modification is due to the enrichment of stemflow and throughfall with dry-deposited substances washed off the vegetation surface. In addition, there is a mass exchange between the chemical solutions on the vegetation surface and the plant tissue; the balance of this exchange can be positive or negative. There is spatial variability in the amount and chemistry of throughfall. This variability is mainly related to the distribution of leaf area and to the species composition. In the West Virginia University Forest, the coefficient of variation of the growing-season throughfall means ranged from 0.09 to 0.14. The major components in acid rain are sulfates and nitrates, which affect plant growth and forest decline. The objective of this study was to obtain characteristic properties of throughfall and stemflow of selected trees in the West Virginia University Forest.
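
    The coefficient of variation quoted above is the standard deviation divided by the mean. A minimal sketch follows, assuming the sample standard deviation is used and using made-up collector values; the record does not state which estimator the authors applied.

```python
import statistics

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (dimensionless)."""
    return statistics.stdev(values) / statistics.mean(values)

# Hypothetical growing-season throughfall means (mm) from three collectors.
toy_throughfall = [90.0, 100.0, 110.0]
cv = coefficient_of_variation(toy_throughfall)
```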

  19. ENTPRISE: An Algorithm for Predicting Human Disease-Associated Amino Acid Substitutions from Sequence Entropy and Predicted Protein Structures

    PubMed Central

    Zhou, Hongyi; Gao, Mu; Skolnick, Jeffrey

    2016-01-01

    The advance of next-generation sequencing technologies has made exome sequencing rapid and relatively inexpensive. A major application of exome sequencing is the identification of genetic variations likely to cause Mendelian diseases. This requires processing large amounts of sequence information, and therefore computational approaches that can accurately and efficiently identify the subset of disease-associated variations are needed. The accuracy and high false positive rates of existing computational tools leave much room for improvement. Here, we develop a boosted tree regression machine-learning approach to predict human disease-associated amino acid variations by utilizing a comprehensive combination of protein sequence and structure features. On comparing our method, ENTPRISE, to the state-of-the-art methods SIFT, PolyPhen-2, MUTATIONASSESSOR, MUTATIONTASTER, and FATHMM, ENTPRISE exhibits significant improvement. In particular, on a testing dataset consisting of only proteins with balanced disease-associated and neutral variations, defined as having a ratio of neutral/disease-associated variations between 0.3 and 3, the Matthews correlation coefficient by ENTPRISE is 0.493 as compared to 0.432 by PPH2-HumVar, 0.406 by SIFT, 0.403 by MUTATIONASSESSOR, 0.402 by PPH2-HumDiv, 0.305 by MUTATIONTASTER, and 0.181 by FATHMM. ENTPRISE is then applied to nucleic acid binding proteins in the human proteome. Disease-associated predictions are shown to be highly correlated with the number of protein-protein interactions. Both these predictions and the ENTPRISE server are freely available for academic users as a web service at http://cssb.biology.gatech.edu/entprise/. PMID:26982818
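
    The Matthews correlation coefficient used for the comparison above is computed from the four cells of a binary confusion matrix. The sketch below shows the standard formula; the function name and the convention of returning 0.0 when the denominator vanishes are choices made here for illustration.

```python
import math

def matthews_corrcoef(tp, tn, fp, fn):
    """MCC from confusion-matrix counts; ranges from -1 to +1."""
    numerator = tp * tn - fp * fn
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # Convention assumed here: report 0.0 when any marginal total is zero.
    return numerator / denominator if denominator else 0.0
```

    A value of +1 means every variant is classified correctly, 0 is no better than chance, and -1 means every call is wrong, which is why the measure is often preferred over accuracy when disease-associated and neutral variants are unevenly represented.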

  20. ENTPRISE: An Algorithm for Predicting Human Disease-Associated Amino Acid Substitutions from Sequence Entropy and Predicted Protein Structures.

    PubMed

    Zhou, Hongyi; Gao, Mu; Skolnick, Jeffrey

    2016-01-01

    The advance of next-generation sequencing technologies has made exome sequencing rapid and relatively inexpensive. A major application of exome sequencing is the identification of genetic variations likely to cause Mendelian diseases. This requires processing large amounts of sequence information, and therefore computational approaches that can accurately and efficiently identify the subset of disease-associated variations are needed. The accuracy and high false positive rates of existing computational tools leave much room for improvement. Here, we develop a boosted tree regression machine-learning approach to predict human disease-associated amino acid variations by utilizing a comprehensive combination of protein sequence and structure features. On comparing our method, ENTPRISE, to the state-of-the-art methods SIFT, PolyPhen-2, MUTATIONASSESSOR, MUTATIONTASTER, and FATHMM, ENTPRISE exhibits significant improvement. In particular, on a testing dataset consisting of only proteins with balanced disease-associated and neutral variations, defined as having a ratio of neutral/disease-associated variations between 0.3 and 3, the Matthews correlation coefficient by ENTPRISE is 0.493 as compared to 0.432 by PPH2-HumVar, 0.406 by SIFT, 0.403 by MUTATIONASSESSOR, 0.402 by PPH2-HumDiv, 0.305 by MUTATIONTASTER, and 0.181 by FATHMM. ENTPRISE is then applied to nucleic acid binding proteins in the human proteome. Disease-associated predictions are shown to be highly correlated with the number of protein-protein interactions. Both these predictions and the ENTPRISE server are freely available for academic users as a web service at http://cssb.biology.gatech.edu/entprise/.

  1. Temporal Variations of Organic Acids in Sumac Fruit

    SciTech Connect

    Robbins, C.; Mulcahy, F.; Somayajula, K.; Edenborn, H.M.

    2006-10-01

    Extracts from staghorn sumac (Rhus typhina) fruits were obtained from fresh fruits obtained from June to October in two successive years. Total acidity, pH, and concentrations of malic and succinic acids determined using liquid chromatography were measured for each extract. Acidity and acid concentrations reached their maxima in late July, and declined slowly thereafter. Malic and succinic acid concentrations in the extracts reached maxima of about 4 and 0.2% (expressed per unit weight of fruit), respectively. Malic and succinic acids were the only organic acids observed in the extracts, and mass balance determinations indicate that these acids are most likely the only ones present in appreciable amounts.

  2. Cloning and sequencing of the Bet v 1-homologous allergen Fra a 1 in strawberry (Fragaria ananassa) shows the presence of an intron and little variability in amino acid sequence.

    PubMed

    Musidlowska-Persson, Anna; Alm, Rikard; Emanuelsson, Cecilia

    2007-02-01

    The Fra a 1 allergen in strawberry (Fragaria ananassa) is homologous to the major birch pollen allergen Bet v 1, which has numerous isoforms differing in terms of amino acid sequence and immunological impact. To map the extent of sequence differences in the Fra a 1 allergen, PCR cloning and sequencing was applied. Several genomic sequences of Fra a 1, with a length of either 584, 591 or 594 nucleotides, were obtained from three different strawberry varieties. All contained one intron, with the length of either 101 or 110 nucleotides. By sequencing 30 different clones, eight different DNA sequences were obtained, giving in total five potential Fra a 1 protein isoforms, with high sequence similarity (>97% sequence identity) and only seven positions of amino acid variability, which were largely confirmed by mass spectrometry of expressed proteins. We conclude that the sequence variability in the strawberry allergen Fra a 1 is small, within and between strawberry varieties, and that multiple spots, previously detected in 2DE, are presumably due to differences in post-translational modification rather than differences in amino acid sequence. The most abundant Fra a 1 isoform sequence, recombinantly expressed in Escherichia coli after removal of the intron, was recognized by IgE from strawberry allergic patients. It cross-reacted with antibodies to Bet v 1 and the homologous apple allergen Mal d 1 (61 and 78% sequence identity, respectively), and will be used in further analyses of variation in Fra a 1-expression.

  3. Genome-wide copy number variation in the bovine genome detected using low coverage sequence of popular beef breeds

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Genomic structural variations are an important source of genetic diversity. Copy number variations (CNVs), gains and losses of large regions of genomic sequence between individuals of a species, are known to be associated with both diseases and phenotypic traits. Deeply sequenced genomes are often u...

  4. Sequence variation in the ribosomal DNA internal transcribed spacer of Tridacna crocea.

    PubMed

    Yu, E T; Juinio-Meñez, M A; Monje, V D

    2000-11-01

    DNA-based genetic markers are needed to augment existing allozyme markers in the assessment of genetic diversity of wild giant clam populations. The dearth of polymorphic mitochondrial DNA regions amplified from known universal polymerase chain reaction (PCR) primers has led us to search other regions of the genome for viable sources of DNA polymorphism. We have designed tridacnid-specific PCR primers for the amplification of internal transcribed spacer regions. Sequences of the first internal transcribed spacer segment (ITS-1) revealed very high polymorphism, showing 29% variation arising from base substitutions alone. Preliminary restriction analysis of the ITS regions using 8 restriction enzymes revealed cryptic changes in the DNA sequence. These mutations are promising as marker tools for differentiating geographically separated populations. Such variation in the ITS region can possibly be used for population genetic analysis.

  5. Human retroviruses and AIDS 1996. A compilation and analysis of nucleic acid and amino acid sequences

    SciTech Connect

    Myers, G.; Foley, B.; Korber, B.; Mellors, J.W.; Jeang, K.T.; Wain-Hobson, S.

    1997-04-01

    This compendium and the accompanying floppy diskettes are the result of an effort to compile and rapidly publish all relevant molecular data concerning the human immunodeficiency viruses (HIV) and related retroviruses. The scope of the compendium and database is best summarized by the five parts that it comprises: (1) Nucleic Acid Alignments and Sequences; (2) Amino Acid Alignments; (3) Analysis; (4) Related Sequences; and (5) Database Communications. Information within all the parts is updated throughout the year on the Web site, http://hiv-web.lanl.gov. While this publication could take the form of a review or sequence monograph, it is not so conceived. Instead, the literature from which the database is derived has simply been summarized and some elementary computational analyses have been performed upon the data. Interpretation and commentary have been avoided insofar as possible so that the reader can form his or her own judgments concerning the complex information. In addition to the general descriptions of the parts of the compendium, the user should read the individual introductions for each part.

  6. A map of human genome variation from population-scale sequencing.

    PubMed

    Abecasis, Gonçalo R; Altshuler, David; Auton, Adam; Brooks, Lisa D; Durbin, Richard M; Gibbs, Richard A; Hurles, Matt E; McVean, Gil A

    2010-10-28

    The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. Here we present results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms. We undertook three projects: low-coverage whole-genome sequencing of 179 individuals from four populations; high-coverage sequencing of two mother-father-child trios; and exon-targeted sequencing of 697 individuals from seven populations. We describe the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants, most of which were previously undescribed. We show that, because we have catalogued the vast majority of common variation, over 95% of the currently accessible variants found in any individual are present in this data set. On average, each person is found to carry approximately 250 to 300 loss-of-function variants in annotated genes and 50 to 100 variants previously implicated in inherited disorders. We demonstrate how these results can be used to inform association and functional studies. From the two trios, we directly estimate the rate of de novo germline base substitution mutations to be approximately 10^-8 per base pair per generation. We explore the data with regard to signatures of natural selection, and identify a marked reduction of genetic variation in the neighbourhood of genes, due to selection at linked sites. These methods and public data will support the next phase of human genetic research.
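
    To put the per-base mutation rate in perspective, it can be scaled to a whole genome. The sketch below is a back-of-the-envelope calculation, assuming a haploid genome of about 3 billion base pairs and two copies per child; neither figure comes from the abstract.

```python
def expected_de_novo_mutations(rate_per_bp, haploid_genome_bp, ploidy=2):
    """Expected number of new base substitutions per child per generation."""
    return rate_per_bp * haploid_genome_bp * ploidy

# Assumed inputs: rate from the abstract, genome size as a round approximation.
estimate = expected_de_novo_mutations(1e-8, 3e9)
```

    Under these assumptions the estimate comes to roughly 60 new substitutions per child.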

  7. Identification of genetic variations of a Chinese family with paramyotonia congenita via whole exome sequencing.

    PubMed

    Li, Jinxin; Huang, Qinghai; Ge, Liang; Xu, Jing; Shi, Xingjuan; Xie, Wei; Liu, Xiang; Liu, Xiangdong

    2015-06-01

    Paramyotonia congenita (PC) is a rare autosomal dominant neuromuscular disorder characterized by juvenile onset and development of cold-induced myotonia after repeated activities. The disease is mostly caused by genetic mutations of the sodium channel, voltage-gated, type IV, alpha subunit (SCN4A) gene. This study intended to systematically identify the causative genetic variations of a Chinese Han PC family. Seven members of this PC family, including four patients and three healthy controls, were selected for whole exome sequencing (WES) using the Illumina HiSeq platform. Sequence variations were identified using the SoftGenetics program. The mutation R1448C of SCN4A was found to be the only causative mutation. This study applied WES technology to sequence multiple members of a large PC family and was the first to systematically confirm that the genetic change in SCN4A is the only causative variation in this PC family and the SCN4A mutation is sufficient to lead to PC.

  8. Sequence variation of koala retrovirus transmembrane protein p15E among koalas from different geographic regions.

    PubMed

    Ishida, Yasuko; McCallister, Chelsea; Nikolaidis, Nikolas; Tsangaras, Kyriakos; Helgen, Kristofer M; Greenwood, Alex D; Roca, Alfred L

    2015-01-15

    The koala retrovirus (KoRV), which is transitioning from an exogenous to an endogenous form, has been associated with high mortality in koalas. For other retroviruses, the envelope protein p15E has been considered a candidate for vaccine development. We therefore examined proviral sequence variation of KoRV p15E in a captive Queensland and three wild southern Australian koalas. We generated 163 sequences with intact open reading frames, which grouped into 39 distinct haplotypes. Sixteen distinct haplotypes comprising 139 of the sequences (85%) coded for the same polypeptide. Among the remaining 23 haplotypes, 22 were detected only once among the sequences, and each had 1 or 2 non-synonymous differences from the majority sequence. Several analyses suggested that p15E was under purifying selection. Important epitopes and domains were highly conserved across the p15E sequences and in previously reported exogenous KoRVs. Overall, these results support the potential use of p15E for KoRV vaccine development.
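
    Collapsing a set of sequences into distinct haplotypes, as reported above, amounts to counting identical strings. The following sketch uses toy sequences and hypothetical names; it is not the authors' pipeline.

```python
from collections import Counter

def summarize_haplotypes(sequences):
    """Count distinct haplotypes, singletons, and the share of the most common one."""
    counts = Counter(sequences)
    return {
        "n_sequences": len(sequences),
        "n_haplotypes": len(counts),
        "n_singletons": sum(1 for n in counts.values() if n == 1),
        "majority_share": counts.most_common(1)[0][1] / len(sequences),
    }

# Toy data: five sequences falling into three haplotypes.
toy = ["ACGT", "ACGT", "ACGT", "ACGA", "ACGC"]
summary = summarize_haplotypes(toy)

# The 85% figure in the abstract is 139 of 163 sequences.
reported_share = round(100 * 139 / 163)
```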

  9. Sequence variation between 462 human individuals fine-tunes functional sites of RNA processing.

    PubMed

    Ferreira, Pedro G; Oti, Martin; Barann, Matthias; Wieland, Thomas; Ezquina, Suzana; Friedländer, Marc R; Rivas, Manuel A; Esteve-Codina, Anna; Rosenstiel, Philip; Strom, Tim M; Lappalainen, Tuuli; Guigó, Roderic; Sammeth, Michael

    2016-01-01

    Recent advances in the cost-efficiency of sequencing technologies enabled the combined DNA- and RNA-sequencing of human individuals at the population-scale, making genome-wide investigations of the inter-individual genetic impact on gene expression viable. Employing mRNA-sequencing data from the Geuvadis Project and genome sequencing data from the 1000 Genomes Project we show that the computational analysis of DNA sequences around splice sites and poly-A signals is able to explain several observations in the phenotype data. In contrast to widespread assessments of statistically significant associations between DNA polymorphisms and quantitative traits, we developed a computational tool to pinpoint the molecular mechanisms by which genetic markers drive variation in RNA-processing, cataloguing and classifying alleles that change the affinity of core RNA elements to their recognizing factors. The in silico models we employ further suggest RNA editing can moonlight as a splicing-modulator, albeit less frequently than genomic sequence diversity. Beyond existing annotations, we demonstrate that the ultra-high resolution of RNA-Seq combined from 462 individuals also provides evidence for thousands of bona fide novel elements of RNA processing (alternative splice sites, introns, and cleavage sites), which are often rare and lowly expressed but in other characteristics similar to their annotated counterparts. PMID:27617755

  10. Sequence variation between 462 human individuals fine-tunes functional sites of RNA processing

    NASA Astrophysics Data System (ADS)

    2016-09-01

    Recent advances in the cost-efficiency of sequencing technologies enabled the combined DNA- and RNA-sequencing of human individuals at the population-scale, making genome-wide investigations of the inter-individual genetic impact on gene expression viable. Employing mRNA-sequencing data from the Geuvadis Project and genome sequencing data from the 1000 Genomes Project we show that the computational analysis of DNA sequences around splice sites and poly-A signals is able to explain several observations in the phenotype data. In contrast to widespread assessments of statistically significant associations between DNA polymorphisms and quantitative traits, we developed a computational tool to pinpoint the molecular mechanisms by which genetic markers drive variation in RNA-processing, cataloguing and classifying alleles that change the affinity of core RNA elements to their recognizing factors. The in silico models we employ further suggest RNA editing can moonlight as a splicing-modulator, albeit less frequently than genomic sequence diversity. Beyond existing annotations, we demonstrate that the ultra-high resolution of RNA-Seq combined from 462 individuals also provides evidence for thousands of bona fide novel elements of RNA processing—alternative splice sites, introns, and cleavage sites—which are often rare and lowly expressed but in other characteristics similar to their annotated counterparts.

  11. Sequence variation between 462 human individuals fine-tunes functional sites of RNA processing

    PubMed Central

    Ferreira, Pedro G.; Oti, Martin; Barann, Matthias; Wieland, Thomas; Ezquina, Suzana; Friedländer, Marc R.; Rivas, Manuel A.; Esteve-Codina, Anna; Estivill, Xavier; Guigó, Roderic; Dermitzakis, Emmanouil; Antonarakis, Stylianos; Meitinger, Thomas; Strom, Tim M; Palotie, Aarno; François Deleuze, Jean; Sudbrak, Ralf; Lerach, Hans; Gut, Ivo; Syvänen, Ann-Christine; Gyllensten, Ulf; Schreiber, Stefan; Rosenstiel, Philip; Brunner, Han; Veltman, Joris; Hoen, Peter A.C.T; Jan van Ommen, Gert; Carracedo, Angel; Brazma, Alvis; Flicek, Paul; Cambon-Thomsen, Anne; Mangion, Jonathan; Bentley, David; Hamosh, Ada; Rosenstiel, Philip; Strom, Tim M; Lappalainen, Tuuli; Guigó, Roderic; Sammeth, Michael

    2016-01-01

    Recent advances in the cost-efficiency of sequencing technologies enabled the combined DNA- and RNA-sequencing of human individuals at the population-scale, making genome-wide investigations of the inter-individual genetic impact on gene expression viable. Employing mRNA-sequencing data from the Geuvadis Project and genome sequencing data from the 1000 Genomes Project we show that the computational analysis of DNA sequences around splice sites and poly-A signals is able to explain several observations in the phenotype data. In contrast to widespread assessments of statistically significant associations between DNA polymorphisms and quantitative traits, we developed a computational tool to pinpoint the molecular mechanisms by which genetic markers drive variation in RNA-processing, cataloguing and classifying alleles that change the affinity of core RNA elements to their recognizing factors. The in silico models we employ further suggest RNA editing can moonlight as a splicing-modulator, albeit less frequently than genomic sequence diversity. Beyond existing annotations, we demonstrate that the ultra-high resolution of RNA-Seq combined from 462 individuals also provides evidence for thousands of bona fide novel elements of RNA processing—alternative splice sites, introns, and cleavage sites—which are often rare and lowly expressed but in other characteristics similar to their annotated counterparts. PMID:27617755

  12. Sequence variation of koala retrovirus transmembrane protein p15E among koalas from different geographic regions.

    PubMed

    Ishida, Yasuko; McCallister, Chelsea; Nikolaidis, Nikolas; Tsangaras, Kyriakos; Helgen, Kristofer M; Greenwood, Alex D; Roca, Alfred L

    2015-01-15

    The koala retrovirus (KoRV), which is transitioning from an exogenous to an endogenous form, has been associated with high mortality in koalas. For other retroviruses, the envelope protein p15E has been considered a candidate for vaccine development. We therefore examined proviral sequence variation of KoRV p15E in a captive Queensland and three wild southern Australian koalas. We generated 163 sequences with intact open reading frames, which grouped into 39 distinct haplotypes. Sixteen distinct haplotypes comprising 139 of the sequences (85%) coded for the same polypeptide. Among the remaining 23 haplotypes, 22 were detected only once among the sequences, and each had 1 or 2 non-synonymous differences from the majority sequence. Several analyses suggested that p15E was under purifying selection. Important epitopes and domains were highly conserved across the p15E sequences and in previously reported exogenous KoRVs. Overall, these results support the potential use of p15E for KoRV vaccine development. PMID:25462343

  13. Transcriptome Sequencing in Response to Salicylic Acid in Salvia miltiorrhiza.

    PubMed

    Zhang, Xiaoru; Dong, Juane; Liu, Hailong; Wang, Jiao; Qi, Yuexin; Liang, Zongsuo

    2016-01-01

    Salvia miltiorrhiza is a traditional Chinese herbal medicine, whose quality and yield are often affected by diseases and environmental stresses during its growing season. Salicylic acid (SA) plays a significant role in plants responding to biotic and abiotic stresses, but the involved regulatory factors and their signaling mechanisms are largely unknown. In order to identify the genes involved in SA signaling, the RNA sequencing (RNA-seq) strategy was employed to evaluate the transcriptional profiles in S. miltiorrhiza cell cultures. A total of 50,778 unigenes were assembled, in which 5,316 unigenes were differentially expressed among 0-, 2-, and 8-h SA induction. The up-regulated genes were mainly involved in stimulus response and multi-organism process. A core set of candidate novel genes coding SA signaling component proteins was identified. Many transcription factors (e.g., WRKY, bHLH and GRAS) and genes involved in hormone signal transduction were differentially expressed in response to SA induction. Detailed analysis revealed that genes associated with defense signaling, such as antioxidant system genes, cytochrome P450s and ATP-binding cassette transporters, were significantly overexpressed, which can be used as genetic tools to investigate disease resistance. Our transcriptome analysis will help understand SA signaling and its mechanism of defense systems in S. miltiorrhiza. PMID:26808150

  16. Natural vs. random protein sequences: Discovering combinatorics properties on amino acid words.

    PubMed

    Santoni, Daniele; Felici, Giovanni; Vergni, Davide

    2016-02-21

    Random mutations and natural selection have driven the evolution of the protein amino acid sequences that we observe at present in nature. The question of which is the dominant force in protein evolution still lacks an unambiguous answer. Random mutations tend to randomize protein sequences while, in order to have the correct functionality, one expects that selection mechanisms impose rigid constraints on amino acid sequences. Moreover, one also has to consider that the space of all possible amino acid sequences is so astonishingly large that it could be reasonable to have a well-tuned amino acid sequence indistinguishable from a random one. In order to study the possibility of discriminating between random and natural amino acid sequences, we introduce different measures of association between pairs of amino acids in a sequence, and apply them to a dataset of 1047 natural protein sequences and 10,470 random sequences, carefully generated in order to preserve the relative length and amino acid distribution of the natural proteins. We analyze the multidimensional measures with machine learning techniques and show that, to a reasonable extent, natural protein sequences can be differentiated from random ones. PMID:26656109

  18. Population genetic structure of Indian shad, Tenualosa ilisha inferred from variation in mitochondrial DNA sequences.

    PubMed

    Behera, B K; Singh, N S; Paria, P; Sahoo, A K; Panda, D; Meena, D K; Das, P; Pakrashi, S; Biswas, D K; Sharma, A P

    2015-09-01

    Indian shad, Tenualosa ilisha, is a commercially important anadromous fish representing a major catch in the Indo-Pacific region. The present study evaluated the partial Cytochrome b (Cyt b) gene sequence of mtDNA in T. ilisha for determining genetic variation between Bay of Bengal and Arabian Sea origins. The genomic DNA extracted from T. ilisha samples representing two distant rivers in the Indian subcontinent, the Bhagirathi (lower stretch of the Ganges) and the Tapi, was analyzed. Sequencing of a 307 bp mtDNA Cytochrome b gene fragment revealed the presence of 5 haplotypes, with high haplotype diversity (Hd) of 0.9048 with variance 0.103 and low nucleotide diversity (π) of 0.14301. Three population-specific haplotypes were observed in the river Ganga and two haplotypes in the river Tapi. A neighbour-joining tree based on Cytochrome b gene sequences of T. ilisha showed that populations from Bay of Bengal and Arabian Sea origins belonged to two distinct clusters. PMID:26521565
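
    The abstract does not say how the haplotype diversity (Hd) was computed. A common estimator is Nei's unbiased form, Hd = n/(n - 1) * (1 - sum of squared haplotype frequencies). The sketch below assumes that estimator; the function name and the sample labels are hypothetical, and the counts are made up for illustration rather than taken from the paper.

```python
from collections import Counter


def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a list of haplotype labels.

    Assumption: this is the standard textbook estimator, not necessarily
    the exact procedure used in the study.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two sampled sequences")
    counts = Counter(haplotypes)
    # Sum of squared sample frequencies of each distinct haplotype.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Hypothetical sample: seven sequences carrying five distinct haplotypes.
sample = ["H1", "H1", "H2", "H2", "H3", "H4", "H5"]
print(round(haplotype_diversity(sample), 4))  # 0.9048
```

    With few sequences and many distinct haplotypes the estimator approaches 1, which is why a small sample can show high Hd alongside low nucleotide diversity.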

  19. Spatio-temporal Variations of Characteristic Repeating Earthquake Sequences along the Middle America Trench in Mexico

    NASA Astrophysics Data System (ADS)

    Dominguez, L. A.; Taira, T.; Hjorleifsdottir, V.; Santoyo, M. A.

    2015-12-01

    Repeating earthquake sequences are sets of events that are thought to rupture the same area on the plate interface and thus provide nearly identical waveforms. We systematically analyzed seismic records from 2001 through 2014 to identify repeating earthquakes with highly correlated waveforms occurring along the subduction zone of the Cocos plate. Using the correlation coefficient (cc) and spectral coherency (coh) of the vertical components as selection criteria, we found a set of 214 sequences whose waveforms satisfy cc ≥ 95% and coh ≥ 95%. Spatial clustering along the trench shows large variations in repeating-earthquake activity. In particular, the rupture zone of the M8.1 1985 earthquake shows an almost complete absence of characteristic repeating earthquakes, whereas the Guerrero Gap zone and the segment of the trench close to the Guerrero-Oaxaca border show a significantly larger number of repeating earthquake sequences. Furthermore, temporal variations associated with stress changes due to major earthquakes show episodes of unlocking and healing of the interface. Understanding the different components that control the location and recurrence time of characteristic repeating sequences is a key factor in pinpointing areas where large megathrust earthquakes may nucleate and, consequently, in improving seismic hazard assessment.
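
    The cc criterion can be illustrated with a zero-lag normalized correlation between two equal-length, already aligned waveforms. This is only a minimal sketch: the function names are hypothetical, the synthetic traces are made up, and the abstract's alignment, filtering, and spectral coherency steps are not reproduced.

```python
import math


def correlation_coefficient(a, b):
    """Zero-lag normalized (Pearson) correlation of two aligned waveforms."""
    if len(a) != len(b) or len(a) < 2:
        raise ValueError("waveforms must have equal length >= 2")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        raise ValueError("a constant waveform has no defined correlation")
    return sum(x * y for x, y in zip(da, db)) / denom


def is_candidate_repeater(a, b, threshold=0.95):
    """True when the pair meets the cc threshold (0.95 mirrors the abstract)."""
    return correlation_coefficient(a, b) >= threshold


# Synthetic damped oscillation and an amplitude-scaled copy of it.
trace = [math.sin(0.3 * i) * math.exp(-0.02 * i) for i in range(200)]
scaled = [2.5 * x for x in trace]
print(is_candidate_repeater(trace, scaled))
```

    Because the coefficient is normalized, two events of different magnitude but identical waveform shape still correlate at essentially 1.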

  20. Population subdivision in Europe's great bustard inferred from mitochondrial and nuclear DNA sequence variation.

    PubMed

    Pitra, C; Lieckfeldt, D; Alonso, J C

    2000-08-01

    A continent-wide survey of sequence variation in mitochondrial (mt) and nuclear (n) DNA of the endangered great bustard (Otis tarda) was conducted to assess the extent of phylogeographic structure in a morphologically monotypic bird. DNA sequence variation in a combined 809 bp segment of the mtDNA genome from 66 individuals from the last six breeding regions showed relatively low levels of intraspecific sequence diversity (π = 0.32%) but significant differences in the regional distribution of 11 haplotypes (ΦST = 0.49). Despite their exceptional potential for dispersal, a complete and long-term historical separation between the populations from the Iberian Peninsula (Spain) and mainland Europe (Hungary, Slovakia, Germany, and Russia) was demonstrated. Divergence between populations based on a 3-bp insertion-deletion polymorphism within the intron region of the nuclear CHD-Z gene was geographically concordant with the primary subdivision identified within the mtDNA sequences. Inferred aspects of phylogeography were used to formulate conservation recommendations for this endangered species.

  1. Magnetic susceptibility variations in Loess sequences and their relationship to astronomical forcing

    NASA Technical Reports Server (NTRS)

    Verosub, Kenneth L.; Singer, Michael J.

    1992-01-01

    The long, well-exposed and often continuous sequences of loess found throughout the world are generally thought to provide an excellent opportunity for studying long-term, large-scale environmental change during the last few million years. In recent years, the most fruitful loess studies have been those involving the deposits of the loess in China. One of the most intriguing results of that work has been the discovery of an apparent correlation between variations in the magnetic susceptibility of the loess sequence and the oxygen isotope record of the deep sea. This correlation implies that magnetic susceptibility variations are being driven by astronomical parameters. However, the basic data have been interpreted in various ways by different authors, most of whom assumed that the magnetic minerals in the loess have not been affected by post-depositional processes. Using a chemical extraction procedure that allows us to separate the contribution of secondary pedogenic magnetic minerals from primary inherited magnetic minerals, we have found that the magnetic susceptibility of the Chinese paleosols is largely due to a pedogenic component which is present to a lesser degree in the loess. We have also found that the smaller inherited component of the magnetic susceptibility is about the same in the paleosols and the loess. These results demonstrate the need for additional study of the processes that create magnetic susceptibility variations in order to interpret properly the role of astronomical forcing in producing these variations.

  2. Haplotype structure and population genetic inferences from nucleotide-sequence variation in human lipoprotein lipase.

    PubMed Central

    Clark, A G; Weiss, K M; Nickerson, D A; Taylor, S L; Buchanan, A; Stengård, J; Salomaa, V; Vartiainen, E; Perola, M; Boerwinkle, E; Sing, C F

    1998-01-01

    Allelic variation in 9.7 kb of genomic DNA sequence from the human lipoprotein lipase gene (LPL) was scored in 71 healthy individuals (142 chromosomes) from three populations: African Americans (24) from Jackson, MS; Finns (24) from North Karelia, Finland; and non-Hispanic Whites (23) from Rochester, MN. The sequences had a total of 88 variable sites, with a nucleotide diversity (site-specific heterozygosity) of 0.002 ± 0.001 across this 9.7-kb region. The frequency spectrum of nucleotide variation exhibited a slight excess of heterozygosity, but, in general, the data fit expectations of the infinite-sites model of mutation and genetic drift. Allele-specific PCR helped resolve linkage phases, and a total of 88 distinct haplotypes were identified. For 1,410 (64%) of the 2,211 site pairs, all four possible gametes were present in these haplotypes, reflecting a rich history of past recombination. Despite the strong evidence for recombination, extensive linkage disequilibrium was observed. The number of haplotypes generally is much greater than the number expected under the infinite-sites model, but there was sufficient multisite linkage disequilibrium to reveal two major clades, which appear to be very old. Variation in this region of LPL may depart from the variation expected under a simple, neutral model, owing to complex historical patterns of population founding, drift, selection, and recombination. These data suggest that the design and interpretation of disease-association studies may not be as straightforward as often is assumed. PMID:9683608
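
    The site-pair count in this abstract corresponds to what is usually called the four-gamete test: for two biallelic sites, seeing all four allele combinations among the haplotypes is taken as evidence of recombination (or recurrent mutation) between them. The sketch below assumes phased haplotypes encoded as strings of '0' and '1'; the function name and the toy data are hypothetical.

```python
from itertools import combinations


def four_gamete_pairs(haplotypes):
    """Count site pairs at which all four gametes are observed.

    haplotypes: equal-length strings over {'0', '1'}, one per chromosome
    (an assumed encoding for biallelic sites).
    Returns (pairs_with_all_four_gametes, total_pairs).
    """
    if not haplotypes:
        raise ValueError("no haplotypes given")
    n_sites = len(haplotypes[0])
    if any(len(h) != n_sites for h in haplotypes):
        raise ValueError("haplotypes must have equal length")
    with_all_four = 0
    total = 0
    for i, j in combinations(range(n_sites), 2):
        # Distinct two-site allele combinations seen across chromosomes.
        gametes = {(h[i], h[j]) for h in haplotypes}
        total += 1
        if len(gametes) == 4:
            with_all_four += 1
    return with_all_four, total


# Toy data: four chromosomes, three sites; sites 1 and 2 are fully linked.
toy = ["000", "011", "100", "111"]
print(four_gamete_pairs(toy))  # (2, 3)
```

    Applied to the figures quoted above, 1,410 of 2,211 pairs is about 63.8%, which matches the rounded 64%.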

  3. Ultra Deep Sequencing of a Baculovirus Population Reveals Widespread Genomic Variations

    PubMed Central

    Chateigner, Aurélien; Bézier, Annie; Labrousse, Carole; Jiolle, Davy; Barbe, Valérie; Herniou, Elisabeth A.

    2015-01-01

    Viruses rely on widespread genetic variation and large population size for adaptation. Large DNA virus populations are thought to harbor little variation though natural populations may be polymorphic. To measure the genetic variation present in a dsDNA virus population, we deep sequenced a natural strain of the baculovirus Autographa californica multiple nucleopolyhedrovirus. With 124,221X average genome coverage of our 133,926 bp long consensus, we could detect low frequency mutations (0.025%). K-means clustering was used to classify the mutations in four categories according to their frequency in the population. We found 60 high frequency non-synonymous mutations under balancing selection distributed in all functional classes. These mutants could alter viral adaptation dynamics, either through competitive or synergistic processes. Lastly, we developed a technique for the delimitation of large deletions in next generation sequencing data. We found that large deletions occur along the entire viral genome, with hotspots located in homologous repeat regions (hrs). Present in 25.4% of the genomes, these deletion mutants presumably require functional complementation to complete their infection cycle. They might thus have a large impact on the fitness of the baculovirus population. Altogether, we found a wide breadth of genomic variation in the baculovirus population, suggesting it has high adaptive potential. PMID:26198241

  4. The Intolerance of Regulatory Sequence to Genetic Variation Predicts Gene Dosage Sensitivity

    PubMed Central

    Wang, Quanli; Halvorsen, Matt; Han, Yujun; Weir, William H.; Allen, Andrew S.; Goldstein, David B.

    2015-01-01

    Noncoding sequence contains pathogenic mutations. Yet, compared with mutations in protein-coding sequence, pathogenic regulatory mutations are notoriously difficult to recognize. Most fundamentally, we are not yet adept at recognizing the sequence stretches in the human genome that are most important in regulating the expression of genes. For this reason, it is difficult to apply to the regulatory regions the same kinds of analytical paradigms that are being successfully applied to identify mutations among protein-coding regions that influence risk. To determine whether dosage sensitive genes have distinct patterns among their noncoding sequence, we present two primary approaches that focus solely on a gene’s proximal noncoding regulatory sequence. The first approach is a regulatory sequence analogue of the recently introduced residual variation intolerance score (RVIS), termed noncoding RVIS, or ncRVIS. The ncRVIS compares observed and predicted levels of standing variation in the regulatory sequence of human genes. The second approach, termed ncGERP, reflects the phylogenetic conservation of a gene’s regulatory sequence using GERP++. We assess how well these two approaches correlate with four gene lists that use different ways to identify genes known or likely to cause disease through changes in expression: 1) genes that are known to cause disease through haploinsufficiency, 2) genes curated as dosage sensitive in ClinGen’s Genome Dosage Map, 3) genes judged likely to be under purifying selection for mutations that change expression levels because they are statistically depleted of loss-of-function variants in the general population, and 4) genes judged unlikely to cause disease based on the presence of copy number variants in the general population. We find that both noncoding scores are highly predictive of dosage sensitivity using any of these criteria. In a similar way to ncGERP, we assess two ensemble-based predictors of regional noncoding importance

  5. Sequence variation in the coding region of the melanocortin-1 receptor gene (MC1R) is not associated with plumage variation in the blue-crowned manakin (Lepidothrix coronata).

    PubMed

    Cheviron, Z A; Hackett, Shannon J; Brumfield, Robb T

    2006-07-01

    Avian plumage traits are the targets of both natural and sexual selection. Consequently, genetic changes resulting in plumage variation among closely related taxa might represent important evolutionary events. The molecular basis of such differences, however, is unknown in most cases. Sequence variation in the melanocortin-1 receptor gene (MC1R) is associated with melanistic phenotypes in many vertebrate taxa, including several avian species. The blue-crowned manakin (Lepidothrix coronata), a widespread, sexually dichromatic passerine, exhibits striking geographic variation in male plumage colour across its range in southern Central America and western Amazonia. Northern males are black with brilliant blue crowns whereas southern males are green with lighter blue crowns. We sequenced 810 bp of the MC1R coding region in 23 individuals spanning the range of male plumage variation. The only variable sites we detected among L. coronata sequences were four synonymous substitutions, none of which were strictly associated with either plumage type. Similarly, comparative analyses showed that L. coronata sequences were monomorphic at the three amino acid sites hypothesized to be functionally important in other birds. These results demonstrate that genes other than MC1R underlie melanic plumage polymorphism in blue-crowned manakins. PMID:16769631

  6. Relation between mRNA expression and sequence information in Desulfovibrio vulgaris: Combinatorial contributions of upstream regulatory motifs and coding sequence features to variations in mRNA abundance

    SciTech Connect

    Wu, Gang; Nie, Lei; Zhang, Weiwen

    2006-05-26

    The context-dependent expression of genes is at the core of biological activities, and significant attention has been given to identifying the various factors that contribute to gene expression at the genomic scale. However, so far this type of analysis has focused either on the relation between mRNA expression and non-coding sequence features such as upstream regulatory motifs or on the correlation between mRNA abundance and non-random features in coding sequences (e.g. codon usage and amino acid usage). In this study, multiple regression analyses of mRNA abundance and all sequence information in Desulfovibrio vulgaris were performed, with the goal of investigating how much coding and non-coding sequence features contribute to the variations in mRNA expression, and in what manner they act together...

  7. Detection and isolation of nucleic acid sequences using a bifunctional hybridization probe

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    2000-01-01

    A method for detecting and isolating a target sequence in a sample of nucleic acids is provided using a bifunctional hybridization probe capable of hybridizing to the target sequence that includes a detectable marker and a first complexing agent capable of forming a binding pair with a second complexing agent. A kit is also provided for detecting a target sequence in a sample of nucleic acids using a bifunctional hybridization probe according to this method.

  8. Amino acid sequence of horseshoe crab, Tachypleus tridentatus, striated muscle troponin C.

    PubMed

    Kobayashi, T; Kagami, O; Takagi, T; Konishi, K

    1989-05-01

    The amino acid sequence of troponin C obtained from horseshoe crab, Tachypleus tridentatus, striated muscle was determined by sequence analysis and alignments of chemically and enzymatically cleaved peptides. Troponin C is composed of 153 amino acid residues with a blocked N-terminus and contains no tryptophan or cysteine residue. The site I, one of the four Ca2+-binding sites, is considered to have lost its ability to bind Ca2+ owing to the replacements of certain amino acid residues.

  9. Capturing genomic signatures of DNA sequence variation using a standard anonymous microarray platform

    PubMed Central

    Cannon, C. H.; Kua, C. S.; Lobenhofer, E. K.; Hurban, P.

    2006-01-01

    Comparative genomics, using the model organism approach, has provided powerful insights into the structure and evolution of whole genomes. Unfortunately, only a small fraction of Earth's biodiversity will have its genome sequenced in the foreseeable future. Most wild organisms have radically different life histories and evolutionary genomics than current model systems. A novel technique is needed to expand comparative genomics to a wider range of organisms. Here, we describe a novel approach using an anonymous DNA microarray platform that gathers genomic samples of sequence variation from any organism. Oligonucleotide probe sequences placed on a custom 44 K array were 25 bp long and designed using a simple set of criteria to maximize their complexity and dispersion in sequence probability space. Using whole genomic samples from three known genomes (mouse, rat and human) and one unknown (Gonystylus bancanus), we demonstrate and validate its power, reliability, transitivity and sensitivity. Using two separate statistical analyses, a large number of genomic ‘indicator’ probes were discovered. The construction of a genomic signature database based upon this technique would allow virtual comparisons, and simple queries could generate optimal subsets of markers to be used in large-scale assays, using simple downstream techniques. Biologists from a wide range of fields, studying almost any organism, could efficiently perform genomic comparisons, at potentially any phylogenetic level, after performing a small number of standardized DNA microarray hybridizations. Possibilities for refining and expanding the approach are discussed. PMID:17000641

  10. Discovery and statistical genotyping of copy-number variation from whole-exome sequencing depth.

    PubMed

    Fromer, Menachem; Moran, Jennifer L; Chambert, Kimberly; Banks, Eric; Bergen, Sarah E; Ruderfer, Douglas M; Handsaker, Robert E; McCarroll, Steven A; O'Donovan, Michael C; Owen, Michael J; Kirov, George; Sullivan, Patrick F; Hultman, Christina M; Sklar, Pamela; Purcell, Shaun M

    2012-10-01

    Sequencing of gene-coding regions (the exome) is increasingly used for studying human disease, for which copy-number variants (CNVs) are a critical genetic component. However, detecting copy number from exome sequencing is challenging because of the noncontiguous nature of the captured exons. This is compounded by the complex relationship between read depth and copy number; this results from biases in targeted genomic hybridization, sequence factors such as GC content, and batching of samples during collection and sequencing. We present a statistical tool (exome hidden Markov model [XHMM]) that uses principal-component analysis (PCA) to normalize exome read depth and a hidden Markov model (HMM) to discover exon-resolution CNV and genotype variation across samples. We evaluate performance on 90 schizophrenia trios and 1,017 case-control samples. XHMM detects a median of two rare (<1%) CNVs per individual (one deletion and one duplication) and has 79% sensitivity to similarly rare CNVs overlapping three or more exons discovered with microarrays. With sensitivity similar to state-of-the-art methods, XHMM achieves higher specificity by assigning quality metrics to the CNV calls to filter out bad ones, as well as to statistically genotype the discovered CNV in all individuals, yielding a trio call set with Mendelian-inheritance properties highly consistent with expectation. We also show that XHMM breakpoint quality scores enable researchers to explicitly search for novel classes of structural variation. For example, we apply XHMM to extract those CNVs that are highly likely to disrupt (delete or duplicate) only a portion of a gene. PMID:23040492
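
    The HMM step described in this abstract can be pictured as Viterbi decoding over per-exon normalized depth scores with three hidden states (deletion, diploid, duplication). The sketch below is a simplified stand-in, not XHMM itself: the PCA normalization is omitted, and the emission means, transition probabilities, and function names are assumptions chosen for illustration.

```python
import math

STATES = ("DEL", "DIP", "DUP")
# Assumed Gaussian emission means for normalized depth z-scores.
MEANS = {"DEL": -3.0, "DIP": 0.0, "DUP": 3.0}
NEG_INF = float("-inf")


def log_gauss(x, mean, sd=1.0):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sd * sd) - (x - mean) ** 2 / (2 * sd * sd)


def viterbi_cnv(zscores, p_start=0.01, p_stay=0.8):
    """Most likely state path; p_start and p_stay are illustrative values."""
    if not zscores:
        return []
    trans = {
        "DIP": {"DEL": math.log(p_start), "DIP": math.log(1 - 2 * p_start),
                "DUP": math.log(p_start)},
        "DEL": {"DEL": math.log(p_stay), "DIP": math.log(1 - p_stay),
                "DUP": NEG_INF},
        "DUP": {"DEL": NEG_INF, "DIP": math.log(1 - p_stay),
                "DUP": math.log(p_stay)},
    }
    # Start as if the exon before the first one were diploid.
    score = {s: trans["DIP"][s] + log_gauss(zscores[0], MEANS[s]) for s in STATES}
    back = []
    for z in zscores[1:]:
        new_score, pointers = {}, {}
        for s in STATES:
            prev = max(STATES, key=lambda r: score[r] + trans[r][s])
            pointers[s] = prev
            new_score[s] = score[prev] + trans[prev][s] + log_gauss(z, MEANS[s])
        back.append(pointers)
        score = new_score
    # Trace back from the best final state.
    state = max(STATES, key=lambda s: score[s])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return path


# Hypothetical z-scores for seven consecutive exons in one sample.
print(viterbi_cnv([0.1, -0.2, -3.1, -2.8, -3.3, 0.2, 0.0]))
```

    In this toy run the three strongly negative exons should decode as one deletion segment flanked by diploid exons; the small entry probability keeps isolated noisy exons from being called.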

  11. Functional consequences of sequence variation in the pheromone biosynthetic gene pgFAR for Ostrinia moths.

    PubMed

    Lassance, Jean-Marc; Liénard, Marjorie A; Antony, Binu; Qian, Shuguang; Fujii, Takeshi; Tabata, Jun; Ishikawa, Yukio; Löfstedt, Christer

    2013-03-01

    Pheromones are central to the mating systems of a wide range of organisms, and reproductive isolation between closely related species is often achieved by subtle differences in pheromone composition. In insects and moths in particular, the use of structurally similar components in different blend ratios is usually sufficient to impede gene flow between taxa. To date, the genetic changes associated with variation and divergence in pheromone signals remain largely unknown. Using the emerging model system Ostrinia, we show the functional consequences of mutations in the protein-coding region of the pheromone biosynthetic fatty-acyl reductase gene pgFAR. Heterologous expression confirmed that pgFAR orthologs encode enzymes exhibiting different substrate specificities that are the direct consequences of extensive nonsynonymous substitutions. When taking natural ratios of pheromone precursors into account, our data reveal that pgFAR substrate preference provides a good explanation of how species-specific ratios of pheromone components are obtained among Ostrinia species. Moreover, our data indicate that positive selection may have promoted the observed accumulation of nonsynonymous amino acid substitutions. Site-directed mutagenesis experiments substantiate the idea that amino acid polymorphisms underlie subtle or drastic changes in pgFAR substrate preference. Altogether, this study identifies the reduction step as a potential source of variation in pheromone signals in the moth genus Ostrinia and suggests that selection acting on particular mutations provides a mechanism allowing pheromone reductases to evolve new functional properties that may contribute to variation in the composition of pheromone signals.

  12. Clinical Interpretation of Variants from Next-Generation Sequencing: The 2016 Scientific Meeting of the Human Genome Variation Society.

    PubMed

    Oetting, William S; Brookes, Anthony J; Béroud, Christophe; Taschner, Peter E

    2016-10-01

    The 2016 scientific meeting of the Human Genome Variation Society (HGVS; http://www.hgvs.org) was held on the 20th of May in Barcelona, Spain, with the theme of "Clinical Interpretation of Variants from Next-Generation Sequencing."

  13. Evaluation of a novel food composition database that includes glutamine and other amino acids derived from gene sequencing data

    PubMed Central

    Lenders, CM; Liu, S; Wilmore, DW; Sampson, L; Dougherty, LW; Spiegelman, D; Willett, WC

    2011-01-01

    Objectives To determine the content of glutamine in major food proteins. Subjects/Methods We used a validated 131-food item food frequency questionnaire (FFQ) to identify the foods that contributed the most to protein intake among 70 356 women in the Nurses’ Health Study (NHS, 1984). The content of glutamine and other amino acids in foods was calculated based on protein fractions generated from gene sequencing methods (Swiss Institute of Bioinformatics) and compared with data from conventional (USDA) and modified biochemical (Kuhn) methods. Pearson correlation coefficients were used to compare the participants’ dietary intakes of amino acids by sequencing and USDA methods. Results The glutamine content varied from 0.01 to 9.49 g/100 g of food and contributed from 1 to 33% of total protein for all FFQ foods with protein. When comparing the sequencing and Kuhn’s methods, the proportion of glutamine in meat was 4.8 vs 4.4%. Among NHS participants, mean glutamine intake was 6.84 (s.d.=2.19) g/day and correlation coefficients for amino acid intakes assessed by sequencing and USDA methods ranged from 0.94 to 0.99 for absolute intake, −0.08 to 0.90 after adjusting for 100 g of protein, and 0.88 to 0.99 after adjusting for 1000 kcal. The between-person coefficient of variation of energy-adjusted intake of glutamine was 16%. Conclusions These data suggest that (1) glutamine content can be estimated from gene sequencing methods and (2) there is a reasonably wide variation in energy-adjusted glutamine intake, allowing for exploration of glutamine consumption and disease. PMID:19756030

  14. Trichomonas vaginalis acidic phospholipase A2: isolation and partial amino acid sequence.

    PubMed

    Escobedo-Guajardo, Brenda L; González-Salazar, Francisco; Palacios-Corona, Rebeca; Torres de la Cruz, Víctor M; Morales-Vallarta, Mario; Mata-Cárdenas, Benito D; Garza-González, Jesús N; Rivera-Silva, Gerardo; Vargas-Villarreal, Javier

    2013-12-01

    Sexually transmitted diseases are a major cause of acute disease worldwide, and trichomoniasis is the most common and curable disease, generating more than 170 million cases annually worldwide. Trichomonas vaginalis is the causal agent of trichomoniasis and has the ability to destroy in vitro cell monolayers of the vaginal mucosa, where the phospholipases A2 (PLA2) have been reported as potential virulence factors. These enzymes have been partially characterized from the subcellular fraction S30 of pathogenic T. vaginalis strains. The main objective of this study was to purify a phospholipase A2 from T. vaginalis, make a partial characterization, obtain a partial amino acid sequence, and determine its enzymatic participation as a hemolytic factor causing lysis of erythrocytes. Trichomonas S30, RF30 and UFF30 sub-fractions from the GT-15 strain have the capacity to hydrolyze [2-(14)C-PA]-PC at pH 6.0. Proteins from the UFF30 sub-fraction were separated by affinity chromatography into two eluted fractions with detectable PLA2 activity. The EDTA-eluted fraction was analyzed using on-line HPLC-tandem mass spectrometry, and two protein peaks were observed at 8.2 and 13 kDa. Peptide sequences were identified from the proteins present in the EDTA-eluted UFF30 fraction; bioinformatic analysis using Protein Link Global Server loaded with the T. vaginalis protein database suggests that the eluted peptides correspond to a putative ubiquitin protein in the 8.2 kDa fraction and a conserved phospholipase in the 13 kDa fraction. The EDTA-eluted fraction hydrolyzed [2-(14)C-PA]-PC and lysed erythrocytes from Sprague-Dawley rats in a time- and dose-dependent manner. The acidic hemolytic activity decreased by 84% with the addition of 100 μM of Rosenthal's inhibitor. PMID:24338313

  16. A nucleic acid sequence-based amplification system for detection of Listeria monocytogenes hlyA sequences.

    PubMed Central

    Blais, B W; Turner, G; Sooknanan, R; Malek, L T

    1997-01-01

    A nucleic acid sequence-based amplification system primarily targeting mRNA from the Listeria monocytogenes hlyA gene was developed. This system enabled the detection of low numbers (< 10 CFU/g) of L. monocytogenes cells inoculated into a variety of dairy and egg products after 48 h of enrichment in modified listeria enrichment broth. PMID:8979357

  17. Identification of random nucleic acid sequence aberrations using dual capture probes which hybridize to different chromosome regions

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1998-01-01

    A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids is then detected, its presence indicating the presence of a nucleic acid sequence aberration.

  18. Identification of random nucleic acid sequence aberrations using dual capture probes which hybridize to different chromosome regions

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1998-03-24

    A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids is then detected, its presence indicating the presence of a nucleic acid sequence aberration. 14 figs.

  19. Population-genomic variation within RNA viruses of the Western honey bee, Apis mellifera, inferred from deep sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Deep sequencing of viruses isolated from infected hosts is an efficient way to measure population-genetic variation and can reveal patterns of dispersal and natural selection. In this study, we mined existing Illumina sequence reads to investigate single-nucleotide polymorphisms (SNPs) within two RN...

  20. Interspecific variations in adhesive protein sequences of Mytilus edulis, M. galloprovincialis, and M. trossulus.

    PubMed

    Inoue, K; Waite, J H; Matsuoka, M; Odo, S; Harayama, S

    1995-12-01

    Variation in the adhesive protein gene sequences of Mytilus edulis, M. galloprovincialis, and M. trossulus collected in Delaware, Kamaishi (Japan), and Alaska, respectively, was analyzed by the polymerase chain reaction (PCR) using two sets of oligonucleotide primers. The first set, Me 13 and Me 14, was designed to amplify the repetitive region. The length of the amplified fragments was highly variable, even among samples of the same species. Another set, Me 15 and Me 16, was designed to amplify a part of the nonrepetitive region. The length of the amplified fragments was uniform in each species and differed interspecifically; 180, 168, and 126 bp for M. edulis, M. trossulus, and M. galloprovincialis, respectively. The amplified sequence of M. trossulus resembled that of M. edulis. Mussels from other sites were also examined by PCR using Me 15 and Me 16. Wild mussels from Tromsö (Norway) and cultured mussels from Brittany (France) were identified as M. edulis. Cultured mussels from the Mediterranean coast of France and wild mussels from Shimizu (Japan) were identified as M. galloprovincialis. Some wild mussels from Hiura (Japan) were identified as a hybrid between M. galloprovincialis and M. trossulus. Thus, the length of this part (variable region) of the sequence is proposed as a diagnostic marker for these three morphologically similar species and their hybrids.
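
    The length-based diagnostic described in this abstract can be sketched as a simple lookup. The following is only an illustrative sketch, not the authors' procedure: the reference lengths (180, 168, and 126 bp) come from the abstract, while the function name `classify_mussel`, the `tolerance` parameter (a hypothetical allowance for gel sizing error), and the rule that two distinct matching bands indicate a hybrid are assumptions made for the example.

```python
# Fragment lengths (bp) reported in the abstract for the Me 15 / Me 16 amplicon.
REFERENCE_LENGTHS = {
    "M. edulis": 180,
    "M. trossulus": 168,
    "M. galloprovincialis": 126,
}


def classify_mussel(band_lengths, tolerance=3):
    """Match observed band lengths (bp) against the reference lengths.

    `tolerance` is a hypothetical sizing error in bp, not a value from the
    abstract. Two different matching species are reported as a hybrid.
    """
    matched = []
    for length in band_lengths:
        for species, reference in REFERENCE_LENGTHS.items():
            if abs(length - reference) <= tolerance and species not in matched:
                matched.append(species)
    if not matched:
        return "unidentified"
    if len(matched) == 1:
        return matched[0]
    return "hybrid: " + " x ".join(sorted(matched))


if __name__ == "__main__":
    print(classify_mussel([180]))       # single band near 180 bp
    print(classify_mussel([126, 168]))  # two bands, as expected for a hybrid
```

    A tolerance of a few base pairs is safe here because the closest pair of reference lengths (168 and 180 bp) differs by 12 bp.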

  1. Whole-Genome Sequencing Reveals Genetic Variation in the Asian House Rat

    PubMed Central

    Teng, Huajing; Zhang, Yaohua; Shi, Chengmin; Mao, Fengbiao; Hou, Lingling; Guo, Hongling; Sun, Zhongsheng; Zhang, Jianxu

    2016-01-01

    Whole-genome sequencing of wild-derived rat species can provide novel genomic resources, which may help decipher the genetics underlying complex phenotypes. As a notorious pest, reservoir of human pathogens, and colonizer, the Asian house rat, Rattus tanezumi, is successfully adapted to its habitat. However, little is known regarding genetic variation in this species. In this study, we identified over 41,000,000 single-nucleotide polymorphisms, plus insertions and deletions, through whole-genome sequencing and bioinformatics analyses. Moreover, we identified over 12,000 structural variants, including 143 chromosomal inversions. Further functional analyses revealed several fixed nonsense mutations associated with infection and immunity-related adaptations, and a number of fixed missense mutations that may be related to anticoagulant resistance. A genome-wide scan for loci under selection identified various genes related to neural activity. Our whole-genome sequencing data provide a genomic resource for future genetic studies of the Asian house rat species and have the potential to facilitate understanding of the molecular adaptations of rats to their ecological niches. PMID:27172215

  2. Deleted copy number variation of Hanwoo and Holstein using next generation sequencing at the population level

    PubMed Central

    2014-01-01

    Background Copy number variation (CNV), a source of genetic diversity in mammals, has been shown to underlie biological functions related to production traits. Notwithstanding, there have been few studies conducted on CNVs using next generation sequencing at the population level. Results Illumina NGS data were obtained for ten Holsteins, a dairy cattle breed, and 22 Hanwoo, a beef cattle breed. The sequence data for each of the 32 animals varied from 13.58-fold to almost 20-fold coverage. We detected a total of 6,811 deleted CNVs across the analyzed individuals (average length = 2732.2 bp) corresponding to 0.74% of the cattle genome (18.6 Mbp of variable sequence). By examining the overlap between CNV deletion regions and genes, we selected 30 genes with the highest deletion scores. These genes were found to be related to the nervous system, more specifically with nervous transmission, neuron motion, and neurogenesis. We regarded these genes as having been affected by the domestication process. Further analysis of the CNV genotyping information revealed 94 putative selected CNVs and 954 breed-specific CNVs. Conclusions This study provides useful information for assessing the impact of CNVs on cattle traits using NGS at the population level. PMID:24673797
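
    The summary figures in this abstract are mutually consistent, which a short calculation can confirm. This is a sanity-check sketch only: the three input values are taken from the abstract, whereas the implied genome size is a derived quantity that the abstract does not state, and the variable names are illustrative.

```python
# Values quoted in the abstract.
n_deleted_cnvs = 6811        # number of deleted CNVs
mean_length_bp = 2732.2      # average CNV length in bp
genome_fraction = 0.0074     # 0.74% of the cattle genome

# Total variable sequence implied by count x mean length.
total_bp = n_deleted_cnvs * mean_length_bp
total_mbp = total_bp / 1e6

# Genome size implied by the quoted fraction (derived, not from the abstract).
implied_genome_gbp = total_bp / genome_fraction / 1e9

if __name__ == "__main__":
    print(f"Total deleted sequence: {total_mbp:.1f} Mbp")
    print(f"Implied genome size:    {implied_genome_gbp:.2f} Gbp")
```

    The product of count and mean length reproduces the quoted 18.6 Mbp, and the implied genome size is roughly 2.5 Gbp.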

  3. Cytochrome Oxidase I (COI) sequence conservation and variation patterns in the yellowfin and longtail tunas.

    PubMed

    Kunal, Swaraj Priyaranjan; Kumar, Girish

    2013-01-01

    Tunas are a commercially important fishery worldwide. There are at least 13 species of tuna belonging to three genera, of which the genus Thunnus has the most, with eight species. On the basis of their availability, they can be characterised as oceanic, such as Thunnus albacares (yellowfin tuna), or coastal, such as Thunnus tonggol (longtail tuna). Although these two are different species, morphological differentiation can only be seen in mature individuals; hence, misidentification may result in an erroneous data set, which ultimately affects conservation strategies. The mitochondrial DNA cytochrome c oxidase subunit 1 (COI) gene is one of the most popular markers for population genetic and phylogeographic studies across the animal kingdom. The present study aims to examine the sequence conservation and variation in mitochondrial Cytochrome Oxidase I (COI) between these two species of tuna. COI sequence analysis of yellowfin and longtail tuna revealed the close relationship between them within the genus Thunnus. The present study is the first direct comparison of mitochondrial COI sequences of these two tuna species. PMID:23649742

  4. Whole-Genome Sequencing Reveals Genetic Variation in the Asian House Rat.

    PubMed

    Teng, Huajing; Zhang, Yaohua; Shi, Chengmin; Mao, Fengbiao; Hou, Lingling; Guo, Hongling; Sun, Zhongsheng; Zhang, Jianxu

    2016-07-07

    Whole-genome sequencing of wild-derived rat species can provide novel genomic resources, which may help decipher the genetics underlying complex phenotypes. As a notorious pest, reservoir of human pathogens, and colonizer, the Asian house rat, Rattus tanezumi, is successfully adapted to its habitat. However, little is known regarding genetic variation in this species. In this study, we identified over 41,000,000 single-nucleotide polymorphisms, plus insertions and deletions, through whole-genome sequencing and bioinformatics analyses. Moreover, we identified over 12,000 structural variants, including 143 chromosomal inversions. Further functional analyses revealed several fixed nonsense mutations associated with infection and immunity-related adaptations, and a number of fixed missense mutations that may be related to anticoagulant resistance. A genome-wide scan for loci under selection identified various genes related to neural activity. Our whole-genome sequencing data provide a genomic resource for future genetic studies of the Asian house rat species and have the potential to facilitate understanding of the molecular adaptations of rats to their ecological niches.

  5. Genome organization and variation in the 3'-partial sequence of garlic latent virus in China.

    PubMed

    Chen, Jiong; Zheng, Hongying; Chen, Jianping; Yang, Chongliang

    2002-08-01

    Ten different isolates of a carlavirus were detected by degenerate PCR from 12 garlic samples collected from 6 provinces in China, and the complete genome sequence of the Zhejiang isolate ZJ1 and 3'-terminal sequences of 9 other isolates were determined. The RNA genome of isolate ZJ1 consisted of 8363 nt excluding the 3'-poly(A) tail, and the genome organization was similar to that of other carlaviruses, with 6 open reading frames encoding a replicase, TGB1, TGB2, TGB3, CP and NABP, respectively. Sequence comparisons showed that all 10 isolates were Garlic latent virus (GarLV). The variations in the TGB2, TGB3 and NABP were more significant than those in the CP. High homology was also detected between those isolates and Shallot latent virus (ShLV). Phylogenetic analysis suggested that GarLV isolates from garlic can be divided into 4 main groups and Chinese isolates belonged to each group. This is the first reported molecular analysis of members of the genus Carlavirus in China. PMID:18759032

  6. Whole Genome Sequencing demonstrates that Geographic Variation of Escherichia coli O157 Genotypes Dominates Host Association

    PubMed Central

    Strachan, Norval J. C.; Rotariu, Ovidiu; Lopes, Bruno; MacRae, Marion; Fairley, Susan; Laing, Chad; Gannon, Victor; Allison, Lesley J.; Hanson, Mary F.; Dallman, Tim; Ashton, Philip; Franz, Eelco; van Hoek, Angela H. A. M.; French, Nigel P.; George, Tessy; Biggs, Patrick J.; Forbes, Ken J.

    2015-01-01

    Genetic variation in an infectious disease pathogen can be driven by ecological niche dissimilarities arising from different host species and different geographical locations. Whole genome sequencing was used to compare E. coli O157 isolates from host reservoirs (cattle and sheep) from Scotland and to compare genetic variation of isolates (human, animal, environmental/food) obtained from Scotland, New Zealand, Netherlands, Canada and the USA. Nei’s genetic distance calculated from core genome single nucleotide polymorphisms (SNPs) demonstrated that the animal isolates were from the same population. Investigation of the Shiga toxin bacteriophage and their insertion sites (SBI typing) revealed that cattle and sheep isolates had statistically indistinguishable rarefaction profiles, diversity and genotypes. In contrast, isolates from different countries exhibited significant differences in Nei’s genetic distance and SBI typing. Hence, after successful international transmission, which has occurred on multiple occasions, local genetic variation occurs, resulting in a global patchwork of continental and trans-continental phylogeographic clades. These findings are important for three reasons: first, understanding transmission and evolution of infectious diseases associated with multiple host reservoirs and multi-geographic locations; second, highlighting the relevance of the sheep reservoir when considering farm based interventions; and third, improving our understanding of why human disease incidence varies across the world. PMID:26442781

  7. Whole Genome Sequencing demonstrates that Geographic Variation of Escherichia coli O157 Genotypes Dominates Host Association.

    PubMed

    Strachan, Norval J C; Rotariu, Ovidiu; Lopes, Bruno; MacRae, Marion; Fairley, Susan; Laing, Chad; Gannon, Victor; Allison, Lesley J; Hanson, Mary F; Dallman, Tim; Ashton, Philip; Franz, Eelco; van Hoek, Angela H A M; French, Nigel P; George, Tessy; Biggs, Patrick J; Forbes, Ken J

    2015-10-07

    Genetic variation in an infectious disease pathogen can be driven by ecological niche dissimilarities arising from different host species and different geographical locations. Whole genome sequencing was used to compare E. coli O157 isolates from host reservoirs (cattle and sheep) from Scotland and to compare genetic variation of isolates (human, animal, environmental/food) obtained from Scotland, New Zealand, Netherlands, Canada and the USA. Nei's genetic distance calculated from core genome single nucleotide polymorphisms (SNPs) demonstrated that the animal isolates were from the same population. Investigation of the Shiga toxin bacteriophage and their insertion sites (SBI typing) revealed that cattle and sheep isolates had statistically indistinguishable rarefaction profiles, diversity and genotypes. In contrast, isolates from different countries exhibited significant differences in Nei's genetic distance and SBI typing. Hence, after successful international transmission, which has occurred on multiple occasions, local genetic variation occurs, resulting in a global patchwork of continental and trans-continental phylogeographic clades. These findings are important for three reasons: first, understanding transmission and evolution of infectious diseases associated with multiple host reservoirs and multi-geographic locations; second, highlighting the relevance of the sheep reservoir when considering farm based interventions; and third, improving our understanding of why human disease incidence varies across the world.

  8. Facile Analysis and Sequencing of Linear and Branched Peptide Boronic Acids by MALDI Mass Spectrometry

    PubMed Central

    Crumpton, Jason; Zhang, Wenyu; Santos, Webster

    2011-01-01

    Interest in peptides incorporating boronic acid moieties is increasing due to their potential as therapeutics/diagnostics for a variety of diseases such as cancer. The utility of peptide boronic acids may be expanded with access to vast libraries that can be deconvoluted rapidly and economically. Unfortunately, current detection protocols using mass spectrometry are laborious and confounded by boronic acid trimerization, which requires time-consuming analysis of dehydration products. These issues are exacerbated when the peptide sequence is unknown, as with de novo sequencing, and especially when multiple boronic acid moieties are present. Thus, a rapid, reliable and simple method for peptide identification is of utmost importance. Herein, we report the identification and sequencing of linear and branched peptide boronic acids containing up to five boronic acid groups by matrix-assisted laser desorption/ionization mass spectrometry (MALDI-MS). Protocols for preparation of pinacol boronic esters were adapted for efficient MALDI analysis of peptides. Additionally, a novel peptide boronic acid detection strategy was developed in which 2,5-dihydroxybenzoic acid (DHB) served as both matrix and derivatizing agent in a convenient, in situ, on-plate esterification. Finally, we demonstrate that DHB-modified peptide boronic acids from a single bead can be analyzed by MALDI-MS/MS, validating our approach for the identification and sequencing of branched peptide boronic acid libraries. PMID:21449540

  9. Evolution of an Enzyme from a Noncatalytic Nucleic Acid Sequence.

    PubMed

    Gysbers, Rachel; Tram, Kha; Gu, Jimmy; Li, Yingfu

    2015-01-01

    The mechanism by which enzymes arose from both abiotic and biological worlds remains an unsolved natural mystery. We postulate that an enzyme can emerge from any sequence of any functional polymer under permissive evolutionary conditions. To support this premise, we have arbitrarily chosen a 50-nucleotide DNA fragment encoding for the Bos taurus (cattle) albumin mRNA and subjected it to test-tube evolution to derive a catalytic DNA (DNAzyme) with RNA-cleavage activity. After only a few weeks, a DNAzyme with significant catalytic activity has surfaced. Sequence comparison reveals that seven nucleotides are responsible for the conversion of the noncatalytic sequence into the enzyme. Deep sequencing analysis of DNA pools along the evolution trajectory has identified individual mutations as the progressive drivers of the molecular evolution. Our findings demonstrate that an enzyme can indeed arise from a sequence of a functional polymer via permissive molecular evolution, a mechanism that may have been exploited by nature for the creation of the enormous repertoire of enzymes in the biological world today. PMID:26091540

  10. Computer Simulation of the Determination of Amino Acid Sequences in Polypeptides

    ERIC Educational Resources Information Center

    Daubert, Stephen D.; Sontum, Stephen F.

    1977-01-01

    Describes a computer program that generates a random string of amino acids and guides the student in determining the correct sequence of a given protein by using experimental analytic data for that protein. (MLH)
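
    The first step of such a simulation, generating a random amino acid string, might look like the sketch below. This is not the 1977 program described in the record: the function names `random_peptide` and `composition`, the use of one-letter residue codes, and the choice of total amino acid composition as the "analytic data" handed to the student are all assumptions made for illustration.

```python
import random
from collections import Counter

# The 20 standard amino acids as one-letter codes.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def random_peptide(length, seed=None):
    """Return a random peptide of the given length (uniform over 20 residues)."""
    rng = random.Random(seed)  # a seed makes the exercise reproducible
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))


def composition(peptide):
    """Return residue counts, mimicking a total amino acid analysis."""
    return Counter(peptide)


if __name__ == "__main__":
    hidden = random_peptide(12, seed=1977)
    # The student would see only the composition, not the hidden order.
    print(sorted(composition(hidden).items()))
```

    Composition alone does not fix the residue order, which is why a sequencing exercise needs further data such as fragment overlaps.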

  11. Recurrent Coding Sequence Variation Explains Only A Small Fraction of the Genetic Architecture of Colorectal Cancer.

    PubMed

    Timofeeva, Maria N; Kinnersley, Ben; Farrington, Susan M; Whiffin, Nicola; Palles, Claire; Svinti, Victoria; Lloyd, Amy; Gorman, Maggie; Ooi, Li-Yin; Hosking, Fay; Barclay, Ella; Zgaga, Lina; Dobbins, Sara; Martin, Lynn; Theodoratou, Evropi; Broderick, Peter; Tenesa, Albert; Smillie, Claire; Grimes, Graeme; Hayward, Caroline; Campbell, Archie; Porteous, David; Deary, Ian J; Harris, Sarah E; Northwood, Emma L; Barrett, Jennifer H; Smith, Gillian; Wolf, Roland; Forman, David; Morreau, Hans; Ruano, Dina; Tops, Carli; Wijnen, Juul; Schrumpf, Melanie; Boot, Arnoud; Vasen, Hans F A; Hes, Frederik J; van Wezel, Tom; Franke, Andre; Lieb, Wolgang; Schafmayer, Clemens; Hampe, Jochen; Buch, Stephan; Propping, Peter; Hemminki, Kari; Försti, Asta; Westers, Helga; Hofstra, Robert; Pinheiro, Manuela; Pinto, Carla; Teixeira, Manuel; Ruiz-Ponte, Clara; Fernández-Rozadilla, Ceres; Carracedo, Angel; Castells, Antoni; Castellví-Bel, Sergi; Campbell, Harry; Bishop, D Timothy; Tomlinson, Ian P M; Dunlop, Malcolm G; Houlston, Richard S

    2015-01-01

    Whilst common genetic variation in many non-coding genomic regulatory regions is known to impart risk of colorectal cancer (CRC), much of the heritability of CRC remains unexplained. To examine the role of recurrent coding sequence variation in CRC aetiology, we genotyped 12,638 CRC cases and 29,045 controls from six European populations. Single-variant analysis identified a coding variant (rs3184504) in SH2B3 (12q24) associated with CRC risk (OR = 1.08, P = 3.9 × 10(-7)), and novel damaging coding variants in 3 genes previously tagged by GWAS efforts; rs16888728 (8q24) in UTP23 (OR = 1.15, P = 1.4 × 10(-7)); rs6580742 and rs12303082 (12q13) in FAM186A (OR = 1.11, P = 1.2 × 10(-7) and OR = 1.09, P = 7.4 × 10(-8)); rs1129406 (12q13) in ATF1 (OR = 1.11, P = 8.3 × 10(-9)), all reaching exome-wide significance levels. Gene based tests identified associations between CRC and PCDHGA genes (P < 2.90 × 10(-6)). We found an excess of rare, damaging variants in base-excision (P = 2.4 × 10(-4)) and DNA mismatch repair genes (P = 6.1 × 10(-4)) consistent with a recessive mode of inheritance. This study comprehensively explores the contribution of coding sequence variation to CRC risk, identifying associations with coding variation in 4 genes and the PCDHG gene cluster and several candidate recessive alleles. However, these findings suggest that recurrent, low-frequency coding variants account for a minority of the unexplained heritability of CRC. PMID:26553438

  12. Recurrent Coding Sequence Variation Explains Only A Small Fraction of the Genetic Architecture of Colorectal Cancer

    PubMed Central

    Timofeeva, Maria N.; Kinnersley, Ben; Farrington, Susan M.; Whiffin, Nicola; Palles, Claire; Svinti, Victoria; Lloyd, Amy; Gorman, Maggie; Ooi, Li-Yin; Hosking, Fay; Barclay, Ella; Zgaga, Lina; Dobbins, Sara; Martin, Lynn; Theodoratou, Evropi; Broderick, Peter; Tenesa, Albert; Smillie, Claire; Grimes, Graeme; Hayward, Caroline; Campbell, Archie; Porteous, David; Deary, Ian J.; Harris, Sarah E.; Northwood, Emma L.; Barrett, Jennifer H.; Smith, Gillian; Wolf, Roland; Forman, David; Morreau, Hans; Ruano, Dina; Tops, Carli; Wijnen, Juul; Schrumpf, Melanie; Boot, Arnoud; Vasen, Hans F A; Hes, Frederik J.; van Wezel, Tom; Franke, Andre; Lieb, Wolgang; Schafmayer, Clemens; Hampe, Jochen; Buch, Stephan; Propping, Peter; Hemminki, Kari; Försti, Asta; Westers, Helga; Hofstra, Robert; Pinheiro, Manuela; Pinto, Carla; Teixeira, Manuel; Ruiz-Ponte, Clara; Fernández-Rozadilla, Ceres; Carracedo, Angel; Castells, Antoni; Castellví-Bel, Sergi; Campbell, Harry; Bishop, D. Timothy; Tomlinson, Ian P M; Dunlop, Malcolm G.; Houlston, Richard S.

    2015-01-01

    Whilst common genetic variation in many non-coding genomic regulatory regions is known to impart risk of colorectal cancer (CRC), much of the heritability of CRC remains unexplained. To examine the role of recurrent coding sequence variation in CRC aetiology, we genotyped 12,638 CRC cases and 29,045 controls from six European populations. Single-variant analysis identified a coding variant (rs3184504) in SH2B3 (12q24) associated with CRC risk (OR = 1.08, P = 3.9 × 10(-7)), and novel damaging coding variants in 3 genes previously tagged by GWAS efforts; rs16888728 (8q24) in UTP23 (OR = 1.15, P = 1.4 × 10(-7)); rs6580742 and rs12303082 (12q13) in FAM186A (OR = 1.11, P = 1.2 × 10(-7) and OR = 1.09, P = 7.4 × 10(-8)); rs1129406 (12q13) in ATF1 (OR = 1.11, P = 8.3 × 10(-9)), all reaching exome-wide significance levels. Gene based tests identified associations between CRC and PCDHGA genes (P < 2.90 × 10(-6)). We found an excess of rare, damaging variants in base-excision (P = 2.4 × 10(-4)) and DNA mismatch repair genes (P = 6.1 × 10(-4)) consistent with a recessive mode of inheritance. This study comprehensively explores the contribution of coding sequence variation to CRC risk, identifying associations with coding variation in 4 genes and the PCDHG gene cluster and several candidate recessive alleles. However, these findings suggest that recurrent, low-frequency coding variants account for a minority of the unexplained heritability of CRC. PMID:26553438

  13. The amino acid sequence of monal pheasant lysozyme and its activity.

    PubMed

    Araki, T; Matsumoto, T; Torikata, T

    1998-10-01

    The amino acid sequence of monal pheasant lysozyme and its activity were analyzed. Carboxymethylated lysozyme was digested with trypsin and the resulting peptides were sequenced. The established amino acid sequence had one amino acid substitution at position 102 (Arg to Gly) compared with Indian peafowl lysozyme and four amino acid substitutions at positions 3 (Phe to Tyr), 15 (His to Leu), 41 (Gln to His), and 121 (Gln to His) compared with chicken lysozyme. Analysis of the time-courses of reaction using N-acetylglucosamine pentamer as a substrate showed a difference in binding free energy change (-0.4 kcal/mol) at subsite A between monal pheasant and Indian peafowl lysozyme. This was assumed to be caused by the amino acid substitution at subsite A with loss of a positive charge at position 102 (Arg102 to Gly).

  14. The amino acid sequence of monal pheasant lysozyme and its activity.

    PubMed

    Araki, T; Matsumoto, T; Torikata, T

    1998-10-01

    The amino acid sequence of monal pheasant lysozyme and its activity were analyzed. Carboxymethylated lysozyme was digested with trypsin and the resulting peptides were sequenced. The established amino acid sequence had one amino acid substitution at position 102 (Arg to Gly) compared with Indian peafowl lysozyme and four amino acid substitutions at positions 3 (Phe to Tyr), 15 (His to Leu), 41 (Gln to His), and 121 (Gln to His) compared with chicken lysozyme. Analysis of the time-courses of reaction using N-acetylglucosamine pentamer as a substrate showed a difference in binding free energy change (-0.4 kcal/mol) at subsite A between monal pheasant and Indian peafowl lysozyme. This was assumed to be caused by the amino acid substitution at subsite A with loss of a positive charge at position 102 (Arg102 to Gly). PMID:9836434

  15. cDNA-derived amino acid sequences of myoglobins from nine species of whales and dolphins.

    PubMed

    Iwanami, Kentaro; Mita, Hajime; Yamamoto, Yasuhiko; Fujise, Yoshihiro; Yamada, Tadasu; Suzuki, Tomohiko

    2006-10-01

    We determined the myoglobin (Mb) cDNA sequences of nine cetaceans, of which six are the first reports of Mb sequences: sei whale (Balaenoptera borealis), Bryde's whale (Balaenoptera edeni), pygmy sperm whale (Kogia breviceps), Stejneger's beaked whale (Mesoplodon stejnegeri), Longman's beaked whale (Indopacetus pacificus), and melon-headed whale (Peponocephala electra), and three confirm the previously determined chemical amino acid sequences: sperm whale (Physeter macrocephalus), common minke whale (Balaenoptera acutorostrata) and pantropical spotted dolphin (Stenella attenuata). We found two types of Mb in the skeletal muscle of pantropical spotted dolphin: Mb I with the same amino acid sequence as that deposited in the protein database, and Mb II, which differs at two amino acid residues compared with Mb I. Using an alignment of the amino acid or cDNA sequences of cetacean Mb, we constructed a phylogenetic tree by the NJ method. Clustering of cetacean Mb amino acid and cDNA sequences essentially follows the classical taxonomy of cetaceans, suggesting that Mb sequence data is valid for classification of cetaceans at least to the family level. PMID:16962803

  16. Multiple Genome Sequences of Important Beer-Spoiling Lactic Acid Bacteria

    PubMed Central

    Geissler, Andreas J.; Vogel, Rudi F.

    2016-01-01

    Seven strains of important beer-spoiling lactic acid bacteria were sequenced using single-molecule real-time sequencing. Complete genomes were obtained for strains of Lactobacillus paracollinoides, Lactobacillus lindneri, and Pediococcus claussenii. The analysis of these genomes emphasizes the role of plasmids as the genomic foundation of beer-spoiling ability. PMID:27795248

  17. Association Between Absolute Neutrophil Count and Variation at TCIRG1: The NHLBI Exome Sequencing Project.

    PubMed

    Rosenthal, Elisabeth A; Makaryan, Vahagn; Burt, Amber A; Crosslin, David R; Kim, Daniel Seung; Smith, Joshua D; Nickerson, Deborah A; Reiner, Alex P; Rich, Stephen S; Jackson, Rebecca D; Ganesh, Santhi K; Polfus, Linda M; Qi, Lihong; Dale, David C; Jarvik, Gail P

    2016-09-01

    Neutrophils are a key component of innate immunity. Individuals with low neutrophil count are susceptible to frequent infections. Linkage and association between congenital neutropenia and a single rare missense variant in TCIRG1 have been reported in a single family. Here, we report on nine rare missense variants at evolutionarily conserved sites in TCIRG1 that are associated with lower absolute neutrophil count (ANC; p = 0.005) in 1,058 participants from three cohorts: Atherosclerosis Risk in Communities (ARIC), Coronary Artery Risk Development in Young Adults (CARDIA), and Jackson Heart Study (JHS) of the NHLBI Grand Opportunity Exome Sequencing Project (GO ESP). These results validate the effects of TCIRG1 coding variation on ANC and suggest that this gene may be associated with a spectrum of mild to severe effects on ANC. PMID:27229898

  18. Plant simple sequence repeats: distribution, variation, and effects on gene expression.

    PubMed

    Sharopova, Natalya

    2008-02-01

    Genome-wide simple sequence repeat (SSR) information was analyzed together with functional annotations of Arabidopsis genes and public gene expression data for Arabidopsis and rice. Analysis of more than 15,000 Arabidopsis and more than 16,000 rice SSRs indicated that SSRs may affect the expression of hundreds of genes. Data from experiments on DNA methylation, histone acetylation, and transcript turnover suggest that SSRs may affect gene expression at transcriptional and posttranscriptional levels. Members of some functional groups were shown to be enriched with SSRs and often contained similar but non-homologous repeats within the same gene regions. In addition, the distribution of perfect and imperfect SSRs in some Arabidopsis, maize, and rice genes was used to demonstrate how two-level control of SSR variation may contribute to protein evolution.

  19. Color differences among feral pigeons (Columba livia) are not attributable to sequence variation in the coding region of the melanocortin-1 receptor gene (MC1R)

    PubMed Central

    2013-01-01

    Background Genetic variation at the melanocortin-1 receptor (MC1R) gene is correlated with melanin color variation in many birds. Feral pigeons (Columba livia) show two major melanin-based colorations: a red coloration due to pheomelanic pigment and a black coloration due to eumelanic pigment. Furthermore, within each color type, feral pigeons display continuous variation in the amount of melanin pigment present in the feathers, with individuals varying from pure white to a full dark melanic color. Coloration is highly heritable and it has been suggested that it is under natural or sexual selection, or both. Our objective was to investigate whether MC1R allelic variants are associated with plumage color in feral pigeons. Findings We sequenced 888 bp of the coding sequence of MC1R among pigeons varying both in the type, eumelanin or pheomelanin, and the amount of melanin in their feathers. We detected 10 non-synonymous substitutions and 2 synonymous substitutions, but none of them were associated with a plumage type. It remains possible that non-synonymous substitutions that influence coloration are present in the short MC1R fragment that we did not sequence, but this seems unlikely because we analyzed the entire functionally important region of the gene. Conclusions Our results show that color differences among feral pigeons are probably not attributable to amino acid variation at the MC1R locus. Therefore, variation in regulatory regions of MC1R or variation in other genes may be responsible for the color polymorphism of feral pigeons. PMID:23915680

  20. A cladistic model of ACE sequence variation with implications for myocardial infarction, Alzheimer disease and obesity.

    PubMed

    Katzov, Hagit; Bennet, Anna M; Kehoe, Patrick; Wiman, Björn; Gatz, Margaret; Blennow, Kaj; Lenhard, Boris; Pedersen, Nancy L; de Faire, Ulf; Prince, Jonathan A

    2004-11-01

    Sequence variation in ACE, which encodes angiotensin I converting enzyme, contributes to a large proportion of variability in plasma ACE levels, but the extent to which this impacts upon human disease is unresolved. Most efforts to associate ACE with other heritable traits have involved a single Alu insertion/deletion polymorphism, despite the probable existence of other functional sequence variants with effects that may not be consistently detectable by solely typing the Alu indel. Here, utilizing single nucleotide polymorphisms (SNPs) that differentiate major ACE clades in European populations, we demonstrate a number of significant phenotype associations across more than 4000 Swedish individuals. In a systematic analysis of metabolic phenotypes, effects were detected upon several traits, including fasting plasma glucose levels, insulin levels and measures of obesity (P-values ranging from 0.046 to 8.4 × 10⁻⁶). Extending cladistic models to the study of myocardial infarction and Alzheimer disease, significant associations were observed with greater effect sizes than those typically obtained in large-scale meta-analyses based on the Alu indel. Population frequencies of ACE genotypes were also found to change with age, congruent with previous data suggesting effects upon longevity. Clade models consistently outperformed those based upon single markers, reinforcing the importance of taking into consideration the possible confounding effects of allelic heterogeneity in this genomic region. Utilizing computational tools, potential functional variants are highlighted that may underlie phenotypic variability, which is discussed along with the broader implications these results may have for studies attempting to link variation in ACE to human disease.

  1. Patchwork sequencing of tomato San Marzano and Vesuviano varieties highlights genome-wide variations

    PubMed Central

    2014-01-01

    Background Investigating tomato genetic resources is crucial for evolutionary and genetic studies as well as for tomato breeding strategies. The traditional Vesuviano and San Marzano varieties grown in the Campania region (Southern Italy) are famous for their remarkable fruit quality. Owing to their economic and social importance, it is crucial to understand the genetic basis of their unique traits. Results Here, we present the draft genome sequences of the tomato varieties Vesuviano and San Marzano. A 40x genome coverage was obtained from a hybrid assembly of Illumina paired-end reads that combines de novo assembly with iterative mapping to the reference S. lycopersicum genome (SL2.40). Insertions, deletions and SNP variants were carefully measured. When assessed on the basis of the reference annotation, 30% of protein-coding genes are predicted to have variants in both varieties. Gene copy number and gene location were assessed by mapping mRNA transcripts, showing a closer relationship of San Marzano with the reference genome. Distinctive variations in key genes and transcription/regulation factors related to fruit quality have been revealed for both cultivars. Conclusions This work highlighted the relationships between the varieties and important variants in key fruit processes, which are useful for dissecting the path from sequence variant to phenotype. PMID:24548308

  2. Serine Hydroxymethyltransferase 1 and 2: Gene Sequence Variation and Functional Genomic Characterization

    PubMed Central

    Hebbring, Scott J.; Chai, Yubo; Ji, Yuan; Abo, Ryan P.; Jenkins, Gregory D.; Fridley, Brooke; Zhang, Jianping; Eckloff, Bruce W.; Wieben, Eric D.; Weinshilboum, Richard M.

    2012-01-01

    Serine hydroxymethyltransferase (SHMT) catalyzes the transfer of a beta carbon from serine to tetrahydrofolate (THF) to form glycine and 5,10-methylene-THF. This reaction plays an important role in neurotransmitter synthesis and metabolism. We set out to resequence SHMT1 and SHMT2, followed by functional genomic studies. We identified 87 and 60 polymorphisms in SHMT1 and SHMT2, respectively. We observed no significant functional effect of the 13 nonsynonymous SNPs in these genes, either on catalytic activity or protein quantity. We imputed additional variants across the two genes using “1000 Genomes” data, and identified 14 variants that were significantly associated (p-value < 1.0E-10) with SHMT1 mRNA expression in lymphoblastoid cell lines. Many of these SNPs were also significantly correlated with basal SHMT1 protein expression in 268 human liver biopsy samples. Reporter gene assays suggested that the SHMT1 promoter SNP, rs669340, contributed to this variation. Finally, SHMT1 and SHMT2 expression were significantly correlated with those of other Folate and Methionine Cycle genes at both the mRNA and protein levels. These experiments represent a comprehensive study of SHMT1 and SHMT2 gene sequence variation and its functional implications. In addition, we obtained preliminary indications that these genes may be co-regulated with other Folate and Methionine Cycle genes. PMID:22220685

  3. Serine hydroxymethyltransferase 1 and 2: gene sequence variation and functional genomic characterization.

    PubMed

    Hebbring, Scott J; Chai, Yubo; Ji, Yuan; Abo, Ryan P; Jenkins, Gregory D; Fridley, Brooke; Zhang, Jianping; Eckloff, Bruce W; Wieben, Eric D; Weinshilboum, Richard M

    2012-03-01

    Serine hydroxymethyltransferase (SHMT) catalyzes the transfer of a β-carbon from serine to tetrahydrofolate to form glycine and 5,10-methylene-tetrahydrofolate. This reaction plays an important role in neurotransmitter synthesis and metabolism. We set out to resequence SHMT1 and SHMT2, followed by functional genomic studies. We identified 87 and 60 polymorphisms in SHMT1 and SHMT2, respectively. We observed no significant functional effect of the 13 non-synonymous single-nucleotide polymorphisms (SNPs) in these genes, either on catalytic activity or protein quantity. We imputed additional variants across the two genes using '1000 Genomes' data, and identified 14 variants that were significantly associated (p<1.0E-10) with SHMT1 messenger RNA expression in lymphoblastoid cell lines. Many of these SNPs were also significantly correlated with basal SHMT1 protein expression in 268 human liver biopsy samples. Reporter gene assays suggested that the SHMT1 promoter SNP, rs669340, contributed to this variation. Finally, SHMT1 and SHMT2 expression were significantly correlated with those of other Folate and Methionine Cycle genes at both the messenger RNA and protein levels. These experiments represent a comprehensive study of SHMT1 and SHMT2 gene sequence variation and its functional implications. In addition, we obtained preliminary indications that these genes may be co-regulated with other Folate and Methionine Cycle genes.

  4. Impact of genomic structural variation in Drosophila melanogaster based on population-scale sequencing

    PubMed Central

    Zichner, Thomas; Garfield, David A.; Rausch, Tobias; Stütz, Adrian M.; Cannavó, Enrico; Braun, Martina; Furlong, Eileen E.M.; Korbel, Jan O.

    2013-01-01

    Genomic structural variation (SV) is a major determinant for phenotypic variation. Although it has been extensively studied in humans, the nucleotide resolution structure of SVs within the widely used model organism Drosophila remains unknown. We report a highly accurate, densely validated map of unbalanced SVs comprising 8962 deletions and 916 tandem duplications in 39 lines derived from short-read DNA sequencing in a natural population (the “Drosophila melanogaster Genetic Reference Panel,” DGRP). Most SVs (>90%) were inferred at nucleotide resolution, and a large fraction was genotyped across all samples. Comprehensive analyses of SV formation mechanisms using the short-read data revealed an abundance of SVs formed by mobile element and nonhomologous end-joining-mediated rearrangements, and clustering of variants into SV hotspots. We further observed a strong depletion of SVs overlapping genes, which, along with population genetics analyses, suggests that these SVs are often deleterious. We inferred several gene fusion events also highlighting the potential role of SVs in the generation of novel protein products. Expression quantitative trait locus (eQTL) mapping revealed the functional impact of our high-resolution SV map, with quantifiable effects at >100 genic loci. Our map represents a resource for population-level studies of SVs in an important model organism. PMID:23222910

  5. Draft Genome Sequences of Two Novel Acidimicrobiaceae Members from an Acid Mine Drainage Biofilm Metagenome

    PubMed Central

    Pinto, Ameet J.; Sharp, Jonathan O.; Yoder, Michael J.

    2016-01-01

    Bacteria belonging to the family Acidimicrobiaceae are frequently encountered in heavy metal-contaminated acidic environments. However, their phylogenetic and metabolic diversity is poorly resolved. We present draft genome sequences of two novel and phylogenetically distinct Acidimicrobiaceae members assembled from an acid mine drainage biofilm metagenome. PMID:26769942

  6. Complete Genome Sequence of Streptomyces clavuligerus F613-1, an Industrial Producer of Clavulanic Acid.

    PubMed

    Cao, Guangxiang; Zhong, Chuanqing; Zong, Gongli; Fu, Jiafang; Liu, Zhong; Zhang, Guimin; Qin, Ronghuo

    2016-01-01

    Streptomyces clavuligerus strain F613-1 is an industrial strain with high-yield clavulanic acid production. In this study, the complete genome sequence of S. clavuligerus strain F613-1 was determined, including one linear chromosome and one linear plasmid, carrying numerous sets of genes involved in the biosynthesis of clavulanic acid.

  8. Complete Genome Sequence of Streptomyces clavuligerus F613-1, an Industrial Producer of Clavulanic Acid

    PubMed Central

    Zhong, Chuanqing; Zong, Gongli; Fu, Jiafang; Liu, Zhong; Zhang, Guimin; Qin, Ronghuo

    2016-01-01

    Streptomyces clavuligerus strain F613-1 is an industrial strain with high-yield clavulanic acid production. In this study, the complete genome sequence of S. clavuligerus strain F613-1 was determined, including one linear chromosome and one linear plasmid, carrying numerous sets of genes involved in the biosynthesis of clavulanic acid. PMID:27660792

  9. Parvalbumins from coelacanth muscle. III. Amino acid sequence of the major component.

    PubMed

    Jauregui-Adell, J; Pechere, J F

    1978-09-26

    The primary structure of the major parvalbumin (pI = 4.52) from coelacanth muscle (Latimeria chalumnae) has been determined. Sequence analysis of the tryptic peptides, in some cases obtained with beta-trypsin, accounts for the total amino acid content of the protein. Chymotryptic peptides provide appropriate sequence overlaps, to complete the localization of the tryptic peptides. Examination of the amino acid sequence of this protein shows the typical structure of a beta-parvalbumin. Its position in the dendrogram of related calcium-binding proteins corresponds to that usually accepted for crossopterygians.

  10. Copy number variation of individual cattle genomes using next-generation sequencing

    PubMed Central

    Bickhart, Derek M.; Hou, Yali; Schroeder, Steven G.; Alkan, Can; Cardone, Maria Francesca; Matukumalli, Lakshmi K.; Song, Jiuzhou; Schnabel, Robert D.; Ventura, Mario; Taylor, Jeremy F.; Garcia, Jose Fernando; Van Tassell, Curtis P.; Sonstegard, Tad S.; Eichler, Evan E.; Liu, George E.

    2012-01-01

    Copy number variations (CNVs) affect a wide range of phenotypic traits; however, CNVs in or near segmental duplication regions are often intractable. Using a read depth approach based on next-generation sequencing, we examined genome-wide copy number differences among five taurine (three Angus, one Holstein, and one Hereford) and one indicine (Nelore) cattle. Within mapped chromosomal sequence, we identified 1265 CNV regions comprising ∼55.6-Mbp sequence—476 of which (∼38%) have not previously been reported. We validated this sequence-based CNV call set with array comparative genomic hybridization (aCGH), quantitative PCR (qPCR), and fluorescent in situ hybridization (FISH), achieving a validation rate of 82% and a false positive rate of 8%. We further estimated absolute copy numbers for genomic segments and annotated genes in each individual. Surveys of the top 25 most variable genes revealed that the Nelore individual had the lowest copy numbers in 13 cases (∼52%, χ2 test; P-value <0.05). In contrast, genes related to pathogen- and parasite-resistance, such as CATHL4 and ULBP17, were highly duplicated in the Nelore individual relative to the taurine cattle, while genes involved in lipid transport and metabolism, including APOL3 and FABP2, were highly duplicated in the beef breeds. These CNV regions also harbor genes like BPIFA2A (BSP30A) and WC1, suggesting that some CNVs may be associated with breed-specific differences in adaptation, health, and production traits. By providing the first individualized cattle CNV and segmental duplication maps and genome-wide gene copy number estimates, we enable future CNV studies into highly duplicated regions in the cattle genome. PMID:22300768
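    The read depth idea mentioned in this record can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name, the use of a median over presumed-diploid control windows, and the sample depths are all assumptions made for illustration. The sketch only shows the basic scaling, in which a window's copy number is estimated as ploidy times its depth divided by the baseline depth.

```python
from statistics import median


def estimate_copy_number(window_depths, control_depths, ploidy=2):
    """Scale per-window read depth to an absolute copy-number estimate.

    window_depths  -- mean read depth of each genomic window of interest
    control_depths -- depths of windows assumed (hypothetically) to be diploid
    ploidy         -- copy number assigned to the control baseline
    """
    baseline = median(control_depths)
    if baseline <= 0:
        raise ValueError("control windows must have positive read depth")
    # A window with twice the baseline depth is estimated at twice the ploidy.
    return [ploidy * depth / baseline for depth in window_depths]


if __name__ == "__main__":
    # Illustrative depths only; not data from the study.
    print(estimate_copy_number([30, 60, 15], [29, 30, 31]))
```

    Real read depth callers also correct for GC content and mappability and segment adjacent windows, none of which is attempted here.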

  11. Detection and implication of significant temporal b-value variation during earthquake sequences

    NASA Astrophysics Data System (ADS)

    Gulia, Laura; Tormann, Thessa; Schorlemmer, Danijel; Wiemer, Stefan

    2016-04-01

    Earthquakes tend to cluster in space and time and periods of increased seismic activity are also periods of increased seismic hazard. Forecasting models currently used in statistical seismology and in Operational Earthquake Forecasting (e.g. ETAS) consider the spatial and temporal changes in the activity rates whilst the spatio-temporal changes in the earthquake size distribution, the b-value, are not included. Laboratory experiments on rock samples show an increasing relative proportion of larger events as the system approaches failure, and a sudden reversal of this trend after the main event. The increasing fraction of larger events during the stress increase period can be mathematically represented by a systematic b-value decrease, while the b-value increases immediately following the stress release. We investigate whether these lab-scale observations also apply to natural earthquake sequences and can help to improve our understanding of the physical processes generating damaging earthquakes. A number of large events nucleated in low b-value regions and spatial b-value variations have been extensively documented in the past. Detecting temporal b-value evolution with confidence is more difficult, one reason being the very different scales that have been suggested for a precursory drop in b-value, from a few days to decadal scale gradients. We demonstrate with the results of detailed case studies of the 2009 M6.3 L'Aquila and 2011 M9 Tohoku earthquakes that significant and meaningful temporal b-value variability can be detected throughout the sequences, which e.g. suggests that foreshock probabilities are not generic but subject to significant spatio-temporal variability. Such potential conclusions require and motivate the systematic study of many sequences to investigate whether general patterns exist that might eventually be useful for time-dependent or even real-time seismic hazard assessment.
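    For readers unfamiliar with the b-value, the following sketch shows one common way to estimate it from a magnitude catalogue: the Aki maximum-likelihood estimator with Utsu's correction for magnitude binning. The abstract does not state which estimator the authors used, so this is an assumption, and the function name, bin width, and sample magnitudes are hypothetical.

```python
import math


def aki_b_value(magnitudes, completeness_mag, bin_width=0.1):
    """Maximum-likelihood b-value (Aki estimator with Utsu's binning correction).

    Returns (b, sigma), where sigma is Aki's first-order standard error b / sqrt(N).
    Only events at or above the completeness magnitude are used.
    """
    usable = [m for m in magnitudes if m >= completeness_mag]
    if len(usable) < 2:
        raise ValueError("need at least two events above the completeness magnitude")
    mean_mag = sum(usable) / len(usable)
    denom = mean_mag - (completeness_mag - bin_width / 2.0)
    if denom <= 0:
        raise ValueError("mean magnitude must exceed the corrected completeness magnitude")
    b = math.log10(math.e) / denom
    return b, b / math.sqrt(len(usable))


if __name__ == "__main__":
    # Illustrative catalogues only: a larger share of big events gives a lower b-value.
    print(aki_b_value([2.0, 2.5, 3.0, 3.5], completeness_mag=2.0))
    print(aki_b_value([2.0, 3.0, 4.0, 5.0], completeness_mag=2.0))
```

    Applying such an estimator in a sliding time window over a sequence is one way temporal b-value changes could be tracked, with the caveat that the completeness magnitude itself changes after large events.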

  12. Effect of laying sequence on egg mercury in captive zebra finches: an interpretation considering individual variation.

    PubMed

    Ou, Langbo; Varian-Ramos, Claire W; Cristol, Daniel A

    2015-08-01

    Bird eggs are used widely as noninvasive bioindicators for environmental mercury availability. Previous studies, however, have found varying relationships between laying sequence and egg mercury concentrations. Some studies have reported that the mercury concentration was higher in first-laid eggs or declined across the laying sequence, whereas in other studies mercury concentration was not related to egg order. Approximately 300 eggs (61 clutches) were collected from captive zebra finches dosed throughout their reproductive lives with methylmercury (0.3 μg/g, 0.6 μg/g, 1.2 μg/g, or 2.4 μg/g wet wt in diet); the total mercury concentration (mean ± standard deviation [SD] dry wt basis) of their eggs was 7.03 ± 1.38 μg/g, 14.15 ± 2.52 μg/g, 26.85 ± 5.85 μg/g, and 49.76 ± 10.37 μg/g, respectively (equivalent to fresh wt egg mercury concentrations of 1.24 μg/g, 2.50 μg/g, 4.74 μg/g, and 8.79 μg/g). The authors observed a significant decrease in the mercury concentration of successive eggs when compared with the first egg and notable variation between clutches within treatments. The mercury level of individual females within and among treatments did not alter this relationship. Based on the results, sampling of a single egg in each clutch from any position in the laying sequence is sufficient for purposes of population risk assessment, but it is not recommended as a proxy for individual female exposure or as an estimate of average mercury level within the clutch.

  13. Variation in conserved non-coding sequences on chromosome 5q and susceptibility to asthma and atopy

    SciTech Connect

    Donfack, Joseph; Schneider, Daniel H.; Tan, Zheng; Kurz, Thorsten; Dubchak, Inna; Frazer, Kelly A.; Ober, Carole

    2005-09-10

    Background: Evolutionarily conserved sequences likely have biological function. Methods: To determine whether variation in conserved sequences in non-coding DNA contributes to risk for human disease, we studied six conserved non-coding elements in the Th2 cytokine cluster on human chromosome 5q31 in a large Hutterite pedigree and in samples of outbred European American and African American asthma cases and controls. Results: Among six conserved non-coding elements (>100 bp, >70 percent identity; human-mouse comparison), we identified one single nucleotide polymorphism (SNP) in each of two conserved elements and six SNPs in the flanking regions of three conserved elements. We genotyped our samples for four of these SNPs and an additional three SNPs each in the IL13 and IL4 genes. While there was only modest evidence for association with single SNPs in the Hutterite and European American samples (P<0.05), there were highly significant associations in European Americans between asthma and haplotypes comprised of SNPs in the IL4 gene (P<0.001), including a SNP in a conserved non-coding element. Furthermore, variation in the IL13 gene was strongly associated with total IgE (P = 0.00022) and allergic sensitization to mold allergens (P = 0.00076) in the Hutterites, and more modestly associated with sensitization to molds in the European Americans and African Americans (P<0.01). Conclusion: These results indicate that there is overall little variation in the conserved non-coding elements on 5q31, but variation in IL4 and IL13, including possibly one SNP in a conserved element, influence asthma and atopic phenotypes in diverse populations.

  14. Spatial and Temporal Stress Drop Variations of the 2011 Tohoku Earthquake Sequence

    NASA Astrophysics Data System (ADS)

    Miyake, H.

    2013-12-01

    The 2011 Tohoku earthquake sequence consists of foreshocks, mainshock, aftershocks, and repeating earthquakes. Quantifying spatial and temporal stress drop variations is important for understanding M9-class megathrust earthquakes. The variability and the spatial and temporal pattern of stress drop are basic information for rupture dynamics and are also useful for source modeling. As pointed out in the ground motion prediction equations by Campbell and Bozorgnia [2008, Earthquake Spectra], mainshock-aftershock pairs often show a significant decrease in stress drop. Here we focus on strong motion records before and after the Tohoku earthquake, and analyze source spectral ratios considering azimuth and distance dependency [Miyake et al., 2001, GRL]. Because station locations are limited to land, spatial and temporal stress drop variations are estimated by adjusting shifts from the omega-squared source spectral model. The adjustment is based on stochastic Green's function simulations of source spectra considering azimuth and distance dependency. We assumed the same Green's functions for the event pairs at each station, so that both the propagation path and site amplification effects cancel out. Although precise studies of spatial and temporal stress drop variations have been performed [e.g., Allmann and Shearer, 2007, JGR], this study targets the relations between stress drop and both the progression of slow slip prior to the Tohoku earthquake reported by Kato et al. [2012, Science] and plate structures. Acknowledgement: This study is partly supported by ERI Joint Research (2013-B-05). We used the JMA unified earthquake catalogue and K-NET, KiK-net, and F-net data provided by NIED.
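    The omega-squared model and the spectral ratio named in this record can be sketched as follows. This is only the textbook form of the model, in which the source spectrum is flat at the seismic moment below the corner frequency and falls off as the inverse square of frequency above it. The function names and numerical values are hypothetical, and the azimuth and distance adjustments described in the abstract are not reproduced.

```python
def omega_squared_spectrum(freq_hz, moment, corner_hz):
    """Omega-squared source displacement spectrum: M0 / (1 + (f / fc)^2)."""
    return moment / (1.0 + (freq_hz / corner_hz) ** 2)


def spectral_ratio(freq_hz, moment_a, corner_a_hz, moment_b, corner_b_hz):
    """Ratio of two omega-squared source spectra at one frequency.

    If both events are recorded at the same station and share path and site
    terms, those terms divide out and only the source ratio remains.
    """
    return (omega_squared_spectrum(freq_hz, moment_a, corner_a_hz)
            / omega_squared_spectrum(freq_hz, moment_b, corner_b_hz))


if __name__ == "__main__":
    # Illustrative values only: a large event (low corner frequency) over a small one.
    for f in (0.01, 0.1, 1.0, 10.0):
        print(f, spectral_ratio(f, 100.0, 0.5, 1.0, 2.0))
```

    At low frequency the ratio tends to the moment ratio, and at high frequency to the moment ratio times the squared ratio of corner frequencies; fitting these two plateaus is what constrains the corner frequencies from which stress drop is usually derived.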

  15. Sequencing and computational analysis of complete genome sequences of Citrus yellow mosaic badna virus from acid lime and pummelo.

    PubMed

    Borah, Basanta K; Johnson, A M Anthony; Sai Gopal, D V R; Dasgupta, Indranil

    2009-08-01

    Citrus yellow mosaic badna virus (CMBV), a member of the Family Caulimoviridae, Genus Badnavirus, is the causative agent of Citrus mosaic disease in India. Although the virus has been detected in several citrus species, only two full-length genomes, one each from Sweet orange and Rangpur lime, are available in publicly accessible databases. In order to obtain a better understanding of the genetic variability of the virus in other citrus mosaic-affected citrus species, we performed the cloning and sequence analysis of complete genomes of CMBV from two additional citrus species, Acid lime and Pummelo. We show that CMBV genomes from the two hosts share high homology with previously reported CMBV sequences and hence conclude that the new isolates represent variants of the virus present in these species. Based on in silico sequence analysis, we predict the possible function of the protein encoded by one of the five ORFs.

  16. Amino acid sequence of anionic peroxidase from the windmill palm tree Trachycarpus fortunei.

    PubMed

    Baker, Margaret R; Zhao, Hongwei; Sakharov, Ivan Yu; Li, Qing X

    2014-12-10

    Palm peroxidases are extremely stable and have uncommon substrate specificity. This study was designed to fill in the knowledge gap about the structures of a peroxidase from the windmill palm tree Trachycarpus fortunei. The complete amino acid sequence and partial glycosylation were determined by MALDI-top-down sequencing of native windmill palm tree peroxidase (WPTP), MALDI-TOF/TOF MS/MS of WPTP tryptic peptides, and cDNA sequencing. The propeptide of WPTP contained N- and C-terminal signal sequences which contained 21 and 17 amino acid residues, respectively. Mature WPTP was 306 amino acids in length, and its carbohydrate content ranged from 21% to 29%. Comparison to closely related royal palm tree peroxidase revealed structural features that may explain differences in their substrate specificity. The results can be used to guide engineering of WPTP and its novel applications.

  17. Amino acid sequence of a new mitochondrially synthesized proteolipid of the ATP synthase of Saccharomyces cerevisiae.

    PubMed Central

    Velours, J; Esparza, M; Hoppe, J; Sebald, W; Guerin, B

    1984-01-01

    The purification and the amino acid sequence of a proteolipid translated on ribosomes in yeast mitochondria is reported. This protein, which is a subunit of the ATP synthase, was purified by extraction with chloroform/methanol (2/1) and subsequent chromatography on phosphocellulose and reverse phase h.p.l.c. A mol. wt. of 5500 was estimated by chromatography on Bio-Gel P-30 in 80% formic acid. The complete amino acid sequence of this protein was determined by automated solid phase Edman degradation of the whole protein and of fragments obtained after cleavage with cyanogen bromide. The sequence analysis indicates a length of 48 amino acid residues. The calculated mol. wt. of 5870 corresponds to the value found by gel chromatography. This polypeptide contains three basic residues and no negatively charged side chain. The three basic residues are clustered at the C terminus. The primary structure of this protein is in full agreement with the predicted amino acid sequence of the putative polypeptide encoded by the mitochondrial aap1 gene recently discovered in Saccharomyces cerevisiae. Moreover, this protein shows 50% homology with the amino acid sequence of a putative polypeptide encoded by an unidentified reading frame also discovered near the mitochondrial ATPase subunit 6 gene in Aspergillus nidulans. PMID:6323165

  18. TranslatorX: multiple alignment of nucleotide sequences guided by amino acid translations.

    PubMed

    Abascal, Federico; Zardoya, Rafael; Telford, Maximilian J

    2010-07-01

    We present TranslatorX, a web server designed to align protein-coding nucleotide sequences based on their corresponding amino acid translations. Many comparisons between biological sequences (nucleic acids and proteins) involve the construction of multiple alignments. Alignments represent a statement regarding the homology between individual nucleotides or amino acids within homologous genes. As protein-coding DNA sequences evolve as triplets of nucleotides (codons) and it is known that sequence similarity degrades more rapidly at the DNA than at the amino acid level, alignments are generally more accurate when based on amino acids than on their corresponding nucleotides. TranslatorX novelties include: (i) use of all documented genetic codes and the possibility of assigning different genetic codes for each sequence; (ii) a battery of different multiple alignment programs; (iii) translation of ambiguous codons when possible; (iv) an innovative criterion to clean nucleotide alignments with GBlocks based on protein information; and (v) a rich output, including Jalview-powered graphical visualization of the alignments, codon-based alignments coloured according to the corresponding amino acids, measures of compositional bias and first, second and third codon position specific alignments. The TranslatorX server is freely available at http://translatorx.co.uk.
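    The core idea of aligning nucleotides under the guidance of an amino acid alignment can be sketched in a few lines. This is not TranslatorX's code: the function names are hypothetical, only the standard genetic code is handled, and the aligned protein string is assumed to come from an external protein aligner. Each gap in the protein alignment becomes a three-base gap, and each residue is replaced by its original codon.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (NCBI translation table 1).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(nucleotides):
    """Translate a coding sequence in frame 1; unknown codons become 'X'."""
    seq = nucleotides.upper()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, usable, 3))


def back_translate(aligned_protein, nucleotides):
    """Thread the original codons onto a gapped protein alignment row."""
    if len(nucleotides) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of three")
    codons = [nucleotides[i:i + 3] for i in range(0, len(nucleotides), 3)]
    out, used = [], 0
    for symbol in aligned_protein:
        if symbol == "-":
            out.append("---")  # one amino acid gap spans a whole codon
        else:
            if used >= len(codons):
                raise ValueError("protein row has more residues than codons")
            out.append(codons[used])
            used += 1
    if used != len(codons):
        raise ValueError("protein row has fewer residues than codons")
    return "".join(out)


if __name__ == "__main__":
    cds = "ATGGCCAAA"
    print(translate(cds))               # protein to hand to an aligner
    print(back_translate("M-AK", cds))  # codon-aware nucleotide alignment row
```

    Because gaps are only ever inserted in multiples of three, the reading frame is preserved, which is the property that makes protein-guided nucleotide alignments suitable for codon-based analyses.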

  19. Complete amino acid sequence and structure characterization of the taste-modifying protein, miraculin.

    PubMed

    Theerasilp, S; Hitotsuya, H; Nakajo, S; Nakaya, K; Nakamura, Y; Kurihara, Y

    1989-04-25

    The taste-modifying protein, miraculin, has the unusual property of modifying sour taste into sweet taste. The complete amino acid sequence of miraculin purified from miracle fruits by a newly developed method (Theerasilp, S., and Kurihara, Y. (1988) J. Biol. Chem. 263, 11536-11539) was determined by an automatic Edman degradation method. Miraculin was a single polypeptide with 191 amino acid residues. The calculated molecular weight based on the amino acid sequence and the carbohydrate content (13.9%) was 24,600. Asn-42 and Asn-186 were linked N-glycosidically to carbohydrate chains. High homology was found between the amino acid sequences of miraculin and soybean trypsin inhibitor. PMID:2708331

  20. Homology of amino acid sequences of rat liver cathepsins B and H with that of papain.

    PubMed Central

    Takio, K; Towatari, T; Katunuma, N; Teller, D C; Titani, K

    1983-01-01

    The amino acid sequences of rat liver lysosomal thiol endopeptidases, cathepsins B and H, are presented and compared with that of the plant thiol protease papain. The 252-residue sequence of cathepsin B and the 220-residue sequence of cathepsin H were determined largely by automated Edman degradation of their intact polypeptide chains and of the two chains of each enzyme generated by limited proteolysis. Subfragments of the chains were produced by enzymatic digestion and by chemical cleavage of methionyl and tryptophanyl bonds. Comparison of the amino acid sequences of cathepsins B and H with each other and with that of papain demonstrates a striking homology among their primary structures. Sequence identity is extremely high in regions which, according to the three-dimensional structure of papain, constitute the catalytic site. The results not only reveal the first structural features of mammalian thiol endopeptidases but also provide insight into the evolutionary relationships among plant and mammalian thiol proteases. PMID:6574504

  1. Detection and quantitation of single nucleotide polymorphisms, DNA sequence variations, DNA mutations, DNA damage and DNA mismatches

    DOEpatents

    McCutchen-Maloney, Sandra L.

    2002-01-01

    DNA mutation binding proteins alone and as chimeric proteins with nucleases are used with solid supports to detect DNA sequence variations, DNA mutations and single nucleotide polymorphisms. The solid supports may be flow cytometry beads, DNA chips, glass slides or DNA dipsticks. DNA molecules are coupled to solid supports to form DNA-support complexes. Labeled DNA is used with unlabeled DNA mutation binding proteins such as TthMutS to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by binding, which gives an increase in signal. Unlabeled DNA is utilized with labeled chimeras to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by nuclease activity of the chimera, which gives a decrease in signal.

  2. Complete cDNA and derived amino acid sequence of human factor V.

    PubMed Central

    Jenny, R J; Pittman, D D; Toole, J J; Kriz, R W; Aldape, R A; Hewick, R M; Kaufman, R J; Mann, K G

    1987-01-01

    cDNA clones encoding human factor V have been isolated from an oligo(dT)-primed human fetal liver cDNA library prepared with vector Charon 21A. The cDNA sequence of factor V from three overlapping clones includes a 6672-base-pair (bp) coding region, a 90-bp 5' untranslated region, and a 163-bp 3' untranslated region within which is a poly(A) tail. The deduced amino acid sequence consists of 2224 amino acids inclusive of a 28-amino acid leader peptide. Direct comparison with human factor VIII reveals considerable homology between proteins in amino acid sequence and domain structure: a triplicated A domain and duplicated C domain show approximately 40% identity with the corresponding domains in factor VIII. As in factor VIII, the A domains of factor V share approximately 40% amino acid-sequence homology with the three highly conserved domains in ceruloplasmin. The B domain of factor V contains 35 tandem and approximately 9 additional semiconserved repeats of nine amino acids of the form Asp-Leu-Ser-Gln-Thr-Thr/Asn-Leu-Ser-Pro and 2 additional semiconserved repeats of 17 amino acids. Factor V contains 37 potential N-linked glycosylation sites, 25 of which are in the B domain, and a total of 19 cysteine residues. PMID:3110773

  3. Complete cDNA and derived amino acid sequence of human factor V

    SciTech Connect

    Jenny, R.J.; Pittman, D.D.; Toole, J.J.; Kriz, R.W.; Aldape, R.A.; Hewick, R.M.; Kaufman, R.J.; Mann, K.G.

    1987-07-01

    cDNA clones encoding human factor V have been isolated from an oligo(dT)-primed human fetal liver cDNA library prepared with vector Charon 21A. The cDNA sequence of factor V from three overlapping clones includes a 6672-base-pair (bp) coding region, a 90-bp 5' untranslated region, and a 163-bp 3' untranslated region within which is a poly(A) tail. The deduced amino acid sequence consists of 2224 amino acids inclusive of a 28-amino acid leader peptide. Direct comparison with human factor VIII reveals considerable homology between proteins in amino acid sequence and domain structure: a triplicated A domain and duplicated C domain show approx. 40% identity with the corresponding domains in factor VIII. As in factor VIII, the A domains of factor V share approx. 40% amino acid-sequence homology with the three highly conserved domains in ceruloplasmin. The B domain of factor V contains 35 tandem and approx. 9 additional semiconserved repeats of nine amino acids of the form Asp-Leu-Ser-Gln-Thr-Thr/Asn-Leu-Ser-Pro and 2 additional semiconserved repeats of 17 amino acids. Factor V contains 37 potential N-linked glycosylation sites, 25 of which are in the B domain, and a total of 19 cysteine residues.

  4. Simultaneous Detection of Both Single Nucleotide Variations and Copy Number Alterations by Next-Generation Sequencing in Gorlin Syndrome.

    PubMed

    Morita, Kei-ichi; Naruto, Takuya; Tanimoto, Kousuke; Yasukawa, Chisato; Oikawa, Yu; Masuda, Kiyoshi; Imoto, Issei; Inazawa, Johji; Omura, Ken; Harada, Hiroyuki

    2015-01-01

    Gorlin syndrome (GS) is an autosomal dominant disorder that predisposes affected individuals to developmental defects and tumorigenesis, and is caused mainly by heterozygous germline PTCH1 mutations. Despite exhaustive analysis, PTCH1 mutations are unidentifiable in some patients; the failure to detect mutations is presumably due to mutations in other causative genes or outside the analyzed regions of PTCH1, or to copy number alterations (CNAs). In this study, we subjected a cohort of GS-affected individuals from six unrelated families to next-generation sequencing (NGS) analysis for the combined screening of causative alterations in Hedgehog signaling pathway-related genes. Specific single nucleotide variations (SNVs) of PTCH1 causing inferred amino acid changes were identified in four families (seven affected individuals), whereas CNAs within or around PTCH1 were found in two families in whom possible causative SNVs were not detected. Through a targeted resequencing of all coding exons, as well as simultaneous evaluation of copy number status using the alignment map files obtained via NGS, we found that GS phenotypes could be explained by PTCH1 mutations or deletions in all affected patients. Because it is advisable to evaluate CNAs of candidate causative genes in point mutation-negative cases, NGS methodology appears to be useful for improving molecular diagnosis through the simultaneous detection of both SNVs and CNAs in the targeted genes/regions. PMID:26544948

  5. Seasonal variations in acid-neutralizing capacity in 13 northeast United States headwater streams

    NASA Astrophysics Data System (ADS)

    Dewalle, David R.; Davies, Trevor D.

    1997-04-01

    Variations in acid-neutralizing capacity (ANC) in 13 streams in the Adirondack, Catskill, and Northern Appalachian Plateau regions of the northeast United States were related to discharge, time of year, and seasonal variations in cation and anion concentrations using periodic regression analysis. ANC varied significantly with both discharge and time of year in 12 streams. Generation of ANC seasonal variations, being dependent upon the precise timing and magnitude of seasonal variations in cation and anion concentrations, was unique to each stream. Greatest seasonal ANC variations occurred in streams where seasonal variations in major anion and cation concentrations were completely out of phase. Maximum errors that could occur because of extrapolation of ANC data from one time of year to another were equal to or greater than maximum errors due to extrapolation of ANC from one discharge to another.

  6. Sequence variation in the Trichuris trichiura beta-tubulin locus: implications for the development of benzimidazole resistance.

    PubMed

    Bennett, A B; Anderson, T J C; Barker, G C; Michael, E; Bundy, D A P

    2002-11-01

    Benzimidazole resistance has evolved in a variety of organisms and typically results from mutations in the beta-tubulin locus at specific amino acid sites. Despite widespread treatment of human intestinal nematodes with benzimidazole drugs, there have been no unambiguous reports of resistance. However, since beta-tubulin mutations conferring resistance are generally recessive, frequencies of resistance alleles less than 30% would be difficult to detect on the basis of drug treatment failures. Here we investigate sequence variation in a 1079 bp segment of the beta-tubulin locus in the human whipworm Trichuris trichiura from 72 individual nematodes from seven countries. We did not observe any alleles with amino acid mutations indicative of resistance, and of 40 point mutations there were only four non-synonymous mutations all of which were singletons. Estimated effective population sizes are an order of magnitude lower than those from another nematode species in which benzimidazole resistance has developed (Haemonchus contortus). Both the lower diversity and reduced population sizes suggest that benzimidazole resistance is likely to evolve less rapidly in Trichuris than in trichostrongyle parasites of livestock. We observed moderate levels of population subdivision (Phi(ST)=0.26) comparable with that previously observed in Ascaris lumbricoides, and identical alleles were frequently found in parasites from different continents, suggestive of recent admixture. A particularly interesting feature of the data is the high nucleotide diversities observed in nematodes from the Caribbean. This genetic complexity may be a direct result of extensive admixture and complex history of human populations in this region of the world. These data should encourage (but not make complacent) those involved in large-scale benzimidazole treatment of human intestinal nematodes.
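
    The remark that recessive resistance alleles below 30% frequency would be hard to detect from treatment failures can be illustrated numerically. The sketch below assumes Hardy-Weinberg proportions (random mating, no selection), an assumption not stated in the abstract; the function name is illustrative.

```python
def recessive_phenotype_fraction(q: float) -> float:
    """Expected fraction of homozygous-recessive individuals.

    q is the frequency of the recessive allele; Hardy-Weinberg
    proportions are assumed, so the fraction is q squared.
    """
    if not 0.0 <= q <= 1.0:
        raise ValueError("allele frequency must lie between 0 and 1")
    return q * q


# Only homozygotes show a recessive resistance phenotype, so the
# phenotype stays rare even at substantial allele frequencies.
for q in (0.05, 0.10, 0.30):
    print(f"allele frequency {q:.2f} -> resistant fraction {recessive_phenotype_fraction(q):.4f}")
```

    Under that assumption, an allele frequency of 30% corresponds to fewer than one in ten worms carrying two copies of the resistance allele.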

  7. Santorini mutation detection meeting 2011: rapid advance in sequencing technology poses challenges for interpretation of genetic variations.

    PubMed

    Stavrou, Eleana F; Goriely, Anne

    2012-10-01

    The 11th International Symposium on Mutations in the Genome was held on 6-10 June, 2011, in Santorini, Greece. Meeting participants described novel detection technologies, rapid advances in whole genome and whole-exome sequencing, but also highlighted the urgent need for the development of sequence variation databases and the clinical interpretation of the genomic data. This report summarizes some of the major themes presented during the meeting.

  8. Concentration variations of amino acids in mammalian fossils: effects of diagenesis and the implications for amino acid racemization analysis

    SciTech Connect

    Blackwell, B.; Rutter, N.W.

    1985-01-01

    Detailed amino acid analyses of bones, teeth, and antler from several mammal species have shown that concentrations of several amino acids can be related to three factors: type of material analyzed, diagenetic alteration of the material, and relative age of the fossil. Concentrations of several amino acids are significantly different in enamel compared to those of dentine or cement. This can be used to check that no contamination of one material by another has occurred, which is critical for using the data for amino acid dating, since all three materials have different racemization rates for some acids. With increased growth of secondary minerals, generally reduced amino acid concentrations are observed. Interacid ratios and concentrations vary significantly from the norms expected for the type of material with increasing degrees of alteration. These effects can be linked to abnormal racemization ratios observed in the same samples. Therefore, abnormal concentrations and/or interacid ratios can be used to detect samples in which the D/L amino acid ratios otherwise appear normal, thereby ensuring better accuracy of amino acid racemization analysis. For unaltered fossils, with increasing sample age regardless of the type of material, some amino acids steadily degrade, while others actually increase in concentration initially due to their generation as by-products of decay. Preliminary studies indicate that this progressive alteration can be used to complement racemization data for determining relative stratigraphic sequences.

  9. Amino acid sequence heterogeneity of the chromosomal encoded Borrelia burgdorferi sensu lato major antigen P100.

    PubMed

    Fellinger, W; Farencena, A; Redl, B; Sambri, V; Cevenini, R; Stöffler, G

    1995-04-01

    The entire nucleotide sequence of the chromosomally encoded major antigen p100 of the European Borrelia garinii isolate B29 was determined and the deduced amino acid sequence was compared to the homologous antigen p83 of the North American Borrelia burgdorferi sensu stricto strain B31 and the p100 of the European Borrelia afzelii (group VS461) strain PKo. p100 of strain B29 shows 87% amino acid sequence identity to strain B31 and 79.2% to strain PKo; p100 of strains B31 and PKo show 62.5% identity to each other. In addition, partial nucleotide sequences of the most heterogeneous region of the p100 gene of two other Borrelia garinii isolates (PBi and VS286) have been determined and the deduced amino acid sequences were compared with all p100 sequences of Borrelia garinii published so far. We found an amino acid sequence identity between 88.6 and 100% within the same genospecies. The N-terminal part of the p100 proteins is highly conserved whereas a strikingly heterogeneous region within the C-terminal part of the proteins was observed.
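
    Percent identity figures such as those above are computed from a pairwise alignment. The following is a minimal sketch, assuming the two sequences are already aligned and that gapped columns are excluded from the denominator; the abstract does not say which convention the authors used, and the toy sequences are invented for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over the ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned fragments, not taken from the p100 sequences.
print(percent_identity("MKT-LV", "MRTALV"))
```

    Other conventions (for example, dividing by the full alignment length) give lower values for the same alignment, so identities from different studies are only comparable when the denominator is known.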

  10. Detecting rare structural variation in evolving microbial populations from new sequence junctions using breseq

    PubMed Central

    Deatherage, Daniel E.; Traverse, Charles C.; Wolf, Lindsey N.; Barrick, Jeffrey E.

    2014-01-01

    New mutations leading to structural variation (SV) in genomes (in the form of mobile element insertions, large deletions, gene duplications, and other chromosomal rearrangements) can play a key role in microbial evolution. Yet, SV is considerably more difficult to predict from short-read genome resequencing data than single-nucleotide substitutions and indels (SN), so it is not yet routinely identified in studies that profile population-level genetic diversity over time in evolution experiments. We implemented an algorithm for detecting polymorphic SV as part of the breseq computational pipeline. This procedure examines split-read alignments, in which the two ends of a single sequencing read match disjoint locations in the reference genome, in order to detect structural variants and estimate their frequencies within a sample. We tested our algorithm using simulated Escherichia coli data and then applied it to 500- and 1000-generation population samples from the Lenski E. coli long-term evolution experiment (LTEE). Knowledge of genes that are targets of selection in the LTEE and mutations present in previously analyzed clonal isolates allowed us to evaluate the accuracy of our procedure. Overall, SV accounted for ~25% of the genetic diversity found in these samples. By profiling rare SV, we were able to identify many cases where alternative mutations in key genes transiently competed within a single population. We also found, unexpectedly, that mutations in two genes that rose to prominence at these early time points always went extinct in the long term. Because it is not limited by the base-calling error rate of the sequencing technology, our approach for identifying rare SV in whole-population samples may have a lower detection limit than similar predictions of SNs in these data sets. We anticipate that this functionality of breseq will be useful for providing a more complete picture of genome dynamics during evolution experiments with haploid microorganisms.

  11. Structural variation detection using next-generation sequencing data: A comparative technical review.

    PubMed

    Guan, Peiyong; Sung, Wing-Kin

    2016-06-01

    Structural variations (SVs) are mutations in the genome of at least fifty nucleotides in size. They contribute to the phenotypic differences among healthy individuals, and cause severe diseases and even cancers by breaking or linking genes. Thus, it is crucial to systematically profile SVs in the genome. In the past decade, many next-generation sequencing (NGS)-based SV detection methods have been proposed due to the significant cost reduction of NGS experiments and their ability to unbiasedly detect SVs to the base-pair resolution. These SV detection methods vary in both sensitivity and specificity, since they use different SV-property-dependent and library-property-dependent features. As a result, predictions from different SV callers are often inconsistent. In addition, noise in the data (both platform-specific sequencing error and artificial chimeric reads) impedes the specificity of SV detection. Poorly characterized regions in the human genome (e.g., repeat regions) greatly impact read mapping and in turn affect the SV calling accuracy. Calling of complex SVs requires specialized SV callers. Apart from accuracy, the processing speed of an SV caller is another factor deciding its usability. Knowing the pros and cons of different SV calling techniques and the objectives of the biological study is essential for biologists and bioinformaticians to make informed decisions. This paper describes different components in the SV calling pipeline and reviews the techniques used by existing SV callers. Through a simulation study, we also demonstrate that library properties, especially insert size, greatly impact the sensitivity of different SV callers. We hope the community can benefit from this work both in designing new SV calling methods and in selecting the appropriate SV caller for specific biological studies. PMID:26845461

  12. Sequence variation and differential splicing of the midgut cadherin gene in Trichoplusia ni.

    PubMed

    Zhang, Xin; Kain, Wendy; Wang, Ping

    2013-08-01

    The insect midgut cadherin serves as an important receptor for the Cry toxins from Bacillus thuringiensis (Bt). Variation of the cadherin in insect populations provides a genetic potential for development of cadherin-based Bt resistance in insect populations. Sequence analysis of the cadherin from the cabbage looper, Trichoplusia ni, together with cadherins from 18 other lepidopterans showed a similar phylogenetic relationship of the cadherins to the phylogeny of Lepidoptera. The midgut cadherin in three laboratory populations of T. ni exhibited high variability, although the resistance to Bt toxin Cry1Ac in the T. ni strain is not genetically associated with cadherin gene mutations. A total of 142 single nucleotide polymorphisms (SNPs) were identified in the cadherin cDNAs from the T. ni strains, including 20 missense mutations. In addition, insertion and deletion polymorphisms (indels) were also identified in the cadherin alleles in T. ni. More interestingly, the results from this study reveal that differential splicing of mRNA also occurs in the cadherin gene expression. Therefore, variation of the midgut cadherin in insects may not only be caused by cadherin gene mutations, but could also result from alternative splicing of its mRNA regulated by factors acting in trans. Analysis of cadherin gene alleles in F2, F3 and F4 progenies from the cross between the Cry1Ac resistant and the susceptible strain after consecutive selections with Cry1Ac for three generations showed that selection with Cry1Ac did not result in an increase of frequencies of the cadherin alleles originated from the resistant strain. PMID:23743444

  13. Detection and isolation of nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1997-01-01

    A method for detecting a target nucleic acid sequence in a sample is provided using hybridization probes which competitively hybridize to a target nucleic acid. According to the method, a target nucleic acid sequence is hybridized to first and second hybridization probes which are complementary to overlapping portions of the target nucleic acid sequence, the first hybridization probe including a first complexing agent capable of forming a binding pair with a second complexing agent and the second hybridization probe including a detectable marker. The first complexing agent attached to the first hybridization probe is contacted with a second complexing agent, the second complexing agent being attached to a solid support such that when the first and second complexing agents are attached, target nucleic acid sequences hybridized to the first hybridization probe become immobilized on to the solid support. The immobilized target nucleic acids are then separated and detected by detecting the detectable marker attached to the second hybridization probe. A kit for performing the method is also provided.

  14. Detection and isolation of nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1997-04-01

    A method for detecting a target nucleic acid sequence in a sample is provided using hybridization probes which competitively hybridize to a target nucleic acid. According to the method, a target nucleic acid sequence is hybridized to first and second hybridization probes which are complementary to overlapping portions of the target nucleic acid sequence, the first hybridization probe including a first complexing agent capable of forming a binding pair with a second complexing agent and the second hybridization probe including a detectable marker. The first complexing agent attached to the first hybridization probe is contacted with a second complexing agent, the second complexing agent being attached to a solid support such that when the first and second complexing agents are attached, target nucleic acid sequences hybridized to the first hybridization probe become immobilized on to the solid support. The immobilized target nucleic acids are then separated and detected by detecting the detectable marker attached to the second hybridization probe. A kit for performing the method is also provided. 7 figs.

  15. Ligation with nucleic acid sequence-based amplification.

    PubMed

    Ong, Carmichael; Tai, Warren; Sarma, Aartik; Opal, Steven M; Artenstein, Andrew W; Tripathi, Anubhav

    2012-01-01

    This work presents a novel method for detecting nucleic acid targets using a ligation step along with an isothermal, exponential amplification step. We use an engineered ssDNA with two variable regions on the ends, allowing us to design the probe for optimal reaction kinetics and primer binding. This two-part probe is ligated by T4 DNA Ligase only when both parts bind adjacently to the target. The assay demonstrates that the expected 72-nt RNA product appears only when the synthetic target, T4 ligase, and both probe fragments are present during the ligation step. An extraneous 38-nt RNA product also appears due to linear amplification of unligated probe (P3), but its presence does not cause a false-positive result. In addition, 40 mmol/L KCl in the final amplification mix was found to be optimal. It was also found that increasing P5 in excess of P3 helped with ligation and reduced the extraneous 38-nt RNA product. The assay was also tested with a single nucleotide polymorphism target, changing one base at the ligation site. The assay was able to yield a negative signal despite only a single-base change. Finally, using P3 and P5 with longer binding sites results in increased overall sensitivity of the reaction, showing that increasing ligation efficiency can improve the assay overall. We believe that this method can be used effectively for a number of diagnostic assays. PMID:22449695

  16. Diversity and Variation of Bacterial Community Revealed by MiSeq Sequencing in Chinese Dark Teas

    PubMed Central

    Fu, Jianyu; Lv, Haipeng; Chen, Feng

    2016-01-01

    Chinese dark teas (CDTs) are now among the popular tea beverages worldwide due to their unique health benefits. Because the production of CDTs involves fermentation that is characterized by the effect of microbes, microorganisms are believed to play critical roles in the determination of the chemical characteristics of CDTs. Some dominant fungi have been identified from CDTs. In contrast, little, if anything, is known about the composition of the bacterial community in CDTs. This study set out to investigate the diversity and variation of the bacterial community in four major types of CDTs from China. First, the composition of the bacterial community of CDTs was determined using MiSeq sequencing. From the four typical CDTs, a total of 238 genera that belong to 128 families of bacteria were detected, including most of the families of beneficial bacteria known to be associated with fermented food. While different types of CDTs had generally distinct bacterial structures, the two types of brick teas produced from adjacent regions displayed strong similarity in bacterial composition, suggesting that the producing environment and processing condition perhaps together influence bacterial succession in CDTs. The global characterization of bacterial communities in CDTs is an essential first step for us to understand their function in fermentation and their potential impact on human health. Such knowledge will be important guidance for improving the production of CDTs with higher quality and elevated health benefits. PMID:27690376

  17. Genetic variation of Sargassum horneri populations detected by inter-simple sequence repeats.

    PubMed

    Ren, J R; Yang, R; He, Y Y; Sun, Q H

    2015-01-30

    The seaweed Sargassum horneri is an important brown alga in the marine environment, and it is an important raw material in the alginate industry. Unfortunately, the fixed resource that was originally reported has now declined or disappeared, and increased floating populations have been reported in recent years. We sampled a floating population and 4 fixed cultivated populations of S. horneri along the coast of Zhejiang, China. Inter-simple sequence repeat (ISSR) markers were applied in this research to analyze the genetic variation between floating populations and fixed cultivated populations of S. horneri. In total, 220 loci were amplified with 23 ISSR primers. The percentage of polymorphic loci within each population ranged from 53.64 to 95.45%. The highest diversity was observed in population 3, which was the local species that was suspension cultured in the lab and then fixed cultivated in the Nanji Islands before sampling. The lowest diversity was obtained in the floating population 4. The genetic distances among the 5 S. horneri populations ranged from 0.0819 to 0.2889, and the distance tendency confirmed the genetic diversity. The results suggest that the floating population had the lowest genetic diversity and could not be joined into the cluster branch of the fixed cultivated populations.
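
    The percentage of polymorphic loci is typically derived from a band presence/absence matrix. The sketch below is a generic illustration, assuming bands are scored 0 or 1 per individual and a locus counts as polymorphic when both scores occur in the population; the toy matrix is invented, and the locus counts of 118 and 210 are inferred from the reported percentages rather than stated in the abstract.

```python
def percent_polymorphic(band_matrix: list[list[int]]) -> float:
    """Percentage of loci at which individuals differ in band presence.

    band_matrix holds one row per individual and one 0/1 score per locus.
    """
    if not band_matrix or not band_matrix[0]:
        raise ValueError("band matrix must not be empty")
    n_loci = len(band_matrix[0])
    polymorphic = 0
    for locus in range(n_loci):
        scores = {individual[locus] for individual in band_matrix}
        if len(scores) > 1:  # both presence and absence observed
            polymorphic += 1
    return 100.0 * polymorphic / n_loci


# Hypothetical population: 3 individuals scored at 4 loci.
toy = [
    [1, 1, 0, 1],
    [1, 0, 0, 1],
    [1, 1, 0, 0],
]
print(percent_polymorphic(toy))

# Inferred, not reported: with 220 loci, the published range is
# consistent with 118 and 210 polymorphic loci respectively.
print(round(100 * 118 / 220, 2), round(100 * 210 / 220, 2))
```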

  18. Interspecific and intrapopulation variation in mitochondrial ribosomal DNA sequences of Mytilus spp. (Bivalvia: Mollusca).

    PubMed

    Geller, J B; Carlton, J T; Powers, D A

    1993-02-01

    A 560-base pair portion of the mitochondrial 16S ribosomal DNA (16S rDNA) from three morphologically similar mussels, Mytilus edulis, M. galloprovincialis, and M. trossulus, was amplified with the polymerase chain reaction, and 349 base pairs were sequenced. These data showed that this gene in M. edulis and M. galloprovincialis has not diverged; however, the north Pacific mussel, M. trossulus, showed fixed differences from M. edulis and M. galloprovincialis at 5 nucleotide positions. Furthermore, the population of M. trossulus at Tillamook Bay, Oregon, was found to contain two very divergent 16S rDNA genotypes that differ at 37 nucleotide positions. Thus, intraspecific variation in this gene in M. trossulus is greater than that seen interspecifically in M. edulis and M. galloprovincialis. Despite this large difference, in the absence of evidence of genetic isolation between these groups of M. trossulus, no taxonomic changes are proposed. These data are consistent with a north Pacific origin of the genus with subsequent dispersal to the Atlantic Ocean across the Arctic Sea, giving rise to M. edulis in northern Europe and subsequently M. galloprovincialis in southern Europe and the Mediterranean Sea.

  19. Role of promoter DNA sequence variations on the binding of EGR1 transcription factor.

    PubMed

    Mikles, David C; Schuchardt, Brett J; Bhat, Vikas; McDonald, Caleb B; Farooq, Amjad

    2014-05-01

    In response to a wide variety of stimuli such as growth factors and hormones, EGR1 transcription factor is rapidly induced and immediately exerts downstream effects central to the maintenance of cellular homeostasis. Herein, our biophysical analysis reveals that DNA sequence variations within the target gene promoters tightly modulate the energetics of binding of EGR1 and that nucleotide substitutions at certain positions are much more detrimental to EGR1-DNA interaction than others. Importantly, the reduction in binding affinity poorly correlates with the loss of enthalpy and gain of entropy, a trend indicative of a complex interplay between underlying thermodynamic factors due to the differential role of water solvent upon nucleotide substitution. We also provide a rationale for the physical basis of the effect of nucleotide substitutions on the EGR1-DNA interaction at atomic level. Taken together, our study bears important implications on understanding the molecular determinants of a key protein-DNA interaction at the cross-roads of human health and disease.

  20. Structural variation discovery in the cancer genome using next generation sequencing: Computational solutions and perspectives

    PubMed Central

    Liu, Biao; Conroy, Jeffrey M.; Morrison, Carl D.; Odunsi, Adekunle O.; Qin, Maochun; Wei, Lei; Trump, Donald L.; Johnson, Candace S.; Liu, Song; Wang, Jianmin

    2015-01-01

    Somatic Structural Variations (SVs) are a complex collection of chromosomal mutations that could directly contribute to carcinogenesis. Next Generation Sequencing (NGS) technology has emerged as the primary means of interrogating the SVs of the cancer genome in recent investigations. Sophisticated computational methods are required to accurately identify the SV events and delineate their breakpoints from the massive amounts of reads generated by a NGS experiment. In this review, we provide an overview of current analytic tools used for SV detection in NGS-based cancer studies. We summarize the features of common SV groups and the primary types of NGS signatures that can be used in SV detection methods. We discuss the principles and key similarities and differences of existing computational programs and comment on unresolved issues related to this research field. The aim of this article is to provide a practical guide of relevant concepts, computational methods, software tools and important factors for analyzing and interpreting NGS data for the detection of SVs in the cancer genome. PMID:25849937

  1. The amino acid sequence of mitogenic lectin-B from the roots of pokeweed (Phytolacca americana).

    PubMed

    Yamaguchi, K; Yurino, N; Kino, M; Ishiguro, M; Funatsu, G

    1997-04-01

    The complete amino acid sequence of pokeweed lectin-B (PL-B) has been analyzed by first sequencing seven lysylendopeptidase peptides derived from the reduced and S-pyridylethylated PL-B and then connecting them by analyzing the arginylendopeptidase peptides from the reduced and S-carboxymethylated PL-B. PL-B consists of 295 amino acid residues and two oligosaccharides linked to Asn96 and Asn139, and has a molecular mass of 34,493 Da. PL-B is composed of seven repetitive chitin-binding domains having 48-79% sequence homology with each other. Twelve amino acid residues including eight cysteine residues in these domains are absolutely conserved in all other chitin-binding domains of plant lectins and class I chitinases. Also, it was strongly suggested that the extremely high hemagglutinating and mitogenic activities of PL-B may be ascribed to its seven-domain structure.

  2. ConSurf 2010: calculating evolutionary conservation in sequence and structure of proteins and nucleic acids.

    PubMed

    Ashkenazy, Haim; Erez, Elana; Martz, Eric; Pupko, Tal; Ben-Tal, Nir

    2010-07-01

    It is informative to detect highly conserved positions in proteins and nucleic acid sequence/structure since they are often indicative of structural and/or functional importance. ConSurf (http://consurf.tau.ac.il) and ConSeq (http://conseq.tau.ac.il) are two well-established web servers for calculating the evolutionary conservation of amino acid positions in proteins using an empirical Bayesian inference, starting from protein structure and sequence, respectively. Here, we present the new version of the ConSurf web server that combines the two independent servers, providing an easier and more intuitive step-by-step interface, while offering the user more flexibility during the process. In addition, the new version of ConSurf calculates the evolutionary rates for nucleic acid sequences. The new version is freely available at: http://consurf.tau.ac.il/.

  3. Genetic Variation of Fatty Acid Oxidation and Obesity, A Literature Review

    PubMed Central

    Freitag Luglio, Harry

    2016-01-01

    Modulation of fat metabolism is an important component of the etiology of obesity as well as of individual response to weight loss programs. The influence of the lipolysis process has received much attention in recent decades. In comparison, fatty acid oxidation, which occurs after lipolysis, has received less attention. There are limited publications on how fatty acid oxidation influences predisposition to obesity, especially on the importance of genetic variations of fatty acid oxidation proteins in the development of obesity. The aim of this review is to provide recent knowledge on polymorphisms of genes related to fatty acid oxidation. Studies in humans as well as animal models showed that disturbance of genes related to the fatty acid oxidation process affected body weight and risk of obesity. Several polymorphisms in CD36, CPT, ACS and FABP have been shown to be related to obesity, either by regulating enzymatic activity or by directly influencing the fatty acid oxidation process. PMID:27127449

  4. Conservation of Shannon's redundancy for proteins. [information theory applied to amino acid sequences]

    NASA Technical Reports Server (NTRS)

    Gatlin, L. L.

    1974-01-01

    Concepts of information theory are applied to examine various proteins in terms of their redundancy in natural originators such as animals and plants. The Monte Carlo method is used to derive information parameters for random protein sequences. Real protein sequence parameters are compared with the standard parameters of protein sequences having a specific length. The tendency of a chain to contain some amino acids more frequently than others and the tendency of a chain to contain certain amino acid pairs more frequently than other pairs are used as randomness measures of individual protein sequences. Non-periodic proteins are generally found to have random Shannon redundancies except in cases of constraints due to short chain length and genetic codes. Redundant characteristics of highly periodic proteins are discussed. A degree of periodicity parameter is derived.
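
    The two randomness measures described above (unequal residue usage and unequal adjacent-pair usage) can be sketched with standard information-theory definitions. The code below is an illustration under assumptions: it uses the textbook composition entropy, the first-order conditional entropy of adjacent residues, and redundancy defined as one minus conditional entropy over log2(20). Whether these match the exact parameters used in the report is not confirmed by the abstract, and all function names are illustrative.

```python
import math
from collections import Counter

ALPHABET_SIZE = 20  # standard amino acids


def composition_entropy(seq: str) -> float:
    """Shannon entropy (bits per residue) of the residue composition."""
    n = len(seq)
    return -sum(c / n * math.log2(c / n) for c in Counter(seq).values())


def conditional_entropy(seq: str) -> float:
    """Entropy (bits) of a residue given the residue before it."""
    pairs = Counter(zip(seq, seq[1:]))   # adjacent-pair counts
    firsts = Counter(seq[:-1])           # counts of each pair's first residue
    n = len(seq) - 1
    h = 0.0
    for (first, _second), count in pairs.items():
        h -= (count / n) * math.log2(count / firsts[first])
    return h


def redundancy(seq: str) -> float:
    """Redundancy relative to a uniform, independent 20-letter source."""
    return 1.0 - conditional_entropy(seq) / math.log2(ALPHABET_SIZE)


# Toy sequence, invented for illustration.
example = "MKVLAAGIVALLLAAGCSSA"
print(composition_entropy(example), redundancy(example))
```

    For short chains the pair counts are sparse, so the estimated redundancy is biased upward; this is in line with the abstract's caveat about constraints due to short chain length.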

  5. Shark myoglobins. II. Isolation, characterization and amino acid sequence of myoglobin from Galeorhinus japonicus.

    PubMed

    Suzuki, T; Suzuki, T; Yata, T

    1985-01-01

    Native oxymyoglobin (MbO2) was isolated from red muscle of G. japonicus by chromatographic separation from metmyoglobin (metMb) on DEAE-cellulose and the amino acid sequence of the major chain was determined with the aid of sequence homology with that of G. australis. It was shown to differ in amino acid sequence from that of G. australis by 10 replacements, to be acetylated at the amino terminus and to contain glutamine at the distal (E7) residue. It was also shown to have a spectrum very similar to that of mammalian MbO2. However, the pH-dependence for the autoxidation of MbO2 was seen to be quite different from that of sperm whale (Physeter catodon) MbO2. Although the sequence homology between sperm whale and G. japonicus myoglobins is about 40%, their hydropathy profiles were very similar, indicating that they have a similar geometry in their globin folding.
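
    A hydropathy profile of the kind compared in this record can be sketched as a sliding-window average. The Kyte-Doolittle scale and the window length of 9 are assumptions made for illustration; the abstract does not state which scale or window the authors used.

```python
# Kyte-Doolittle hydropathy index per residue (assumed scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence: str, window: int = 9) -> list[float]:
    """Mean hydropathy of every full window along the sequence."""
    if window < 1 or window > len(sequence):
        raise ValueError("window must be between 1 and the sequence length")
    values = [KYTE_DOOLITTLE[aa] for aa in sequence]
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]
```

    Two proteins with low sequence identity can then be compared by plotting or correlating their profiles position by position.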

  6. BIOLOGICAL RESPONSE TO VARIATION OF ACID-VOLATILE SULFIDES AND METALS IN FIELD-EXPOSED SPIKED SEDIMENTS

    EPA Science Inventory

    Vertical and temporal variations of acid-volatile sulfides (AVS) and simultaneously extracted metals (SEM) in sediment can control biological impacts of metals. To assess the significance of these variations in field sediments, sediments spiked with cadmium, copper, lead, nickel ...

  7. Visible sensing of nucleic acid sequences using a genetically encodable unmodified mRNA probe.

    PubMed

    Narita, Atsushi; Ogawa, Kazumasa; Sando, Shinsuke; Aoyama, Yasuhiro

    2006-01-01

    We previously reported a molecular beacon-mRNA (MB-mRNA) strategy for nucleic acid detection/sensing in a cell-free translation system using unmodified RNA as a probe. Here in this presentation, we report that a combination with RNase H activity, which induces an additional process of irreversible cleavage of MB-domain, achieves an improved sequence selectivity (one nucleotide selectivity) and an enhanced sensitivity. This improved system finally enabled visible sensing of target nucleic acid sequence at a single nucleotide resolution under isothermal conditions.

  8. Variation-Tolerant Capture and Multiplex Detection of Nucleic Acids: Application to Detection of Microbes

    PubMed Central

    Öhrmalm, Christina; Eriksson, Ronnie; Jobs, Magnus; Simonson, Magnus; Strømme, Maria; Bondeson, Kåre; Herrmann, Björn; Melhus, Åsa

    2012-01-01

    In contrast to ordinary PCRs, which have a limited multiplex capacity and often return false-negative results due to target variation or inhibition, our new detection strategy, VOCMA (variation-tolerant capture multiplex assay), allows variation-tolerant, target-specific capture and detection of many nucleic acids in one test. Here we demonstrate the use of a single-tube, dual-step amplification strategy that overcomes the usual limitations of PCR multiplexing, allowing at least a 22-plex format with retained sensitivity. Variation tolerance was achieved using long primers and probes designed to withstand variation at known sites and a judicious mix of degeneration and universal bases. We tested VOCMA in situations where enrichment from a large sample volume with high sensitivity and multiplexity is important (sepsis; streptococci, enterococci, and staphylococci, several enterobacteria, candida, and the most important antibiotic resistance genes) and where variation tolerance and high multiplexity is important (gastroenteritis; astrovirus, adenovirus, rotavirus, norovirus genogroups I and II, and sapovirus, as well as enteroviruses, which are not associated with gastroenteritis). Detection sensitivities of 10 to 1,000 copies per reaction were achieved for many targets. VOCMA is a highly multiplex, variation-tolerant, general purpose nucleic acid detection concept. It is a specific and sensitive method for simultaneous detection of nucleic acids from viruses, bacteria, fungi, and protozoa, as well as host nucleic acid, in the same test. It can be run on an ordinary PCR and a Luminex machine and is suitable for both clinical diagnoses and microbial surveillance. PMID:22814465

  9. Amino acid and cDNA sequences of lysozyme from Hyalophora cecropia

    PubMed Central

    Engström, Å.; Xanthopoulos, K. G.; Boman, H. G.; Bennich, H.

    1985-01-01

    The amino acid and cDNA sequences of lysozyme from the giant silk moth Hyalophora cecropia have been determined. This enzyme is one of several immune proteins produced by the diapausing pupae after injection of bacteria. Cecropia lysozyme is composed of 120 amino acids, has a mol. wt. of 13.8 kd and shows great similarity with vertebrate lysozymes of the chicken type. The amino acid residues responsible for the catalytic activity and for the binding of substrate are essentially conserved. Three allelic variants of the Cecropia enzyme are identified. A comparison of the chicken and the Cecropia lysozymes shows that there is a 40% identity at both the amino acid and the nucleotide level. Some evolutionary aspects of the sequence data are discussed. PMID:16453632

  10. Microsatellites and 16S sequences corroborate phenotypic evidence of trans-Andean variation in the parasitoid Microctonus hyperodae (Hymenoptera: Braconidae).

    PubMed

    Winder, L M; Phillips, C B; Lenney-Williams, C; Cane, R P; Paterson, K; Vink, C J; Goldson, S L

    2005-08-01

    Eight South American geographical populations of the parasitoid Microctonus hyperodae Loan were collected in South America (Argentina, Brazil, Chile and Uruguay) and released in New Zealand for biological control of the weevil Listronotus bonariensis (Kuschel), a pest of pasture grasses and cereals. DNA sequencing (16S, COI, 28S, ITS1, beta-tubulin), RAPD, AFLP, microsatellite, SSCP and RFLP analyses were used to seek markers for discriminating between the South American populations. All of the South American populations were more homogeneous than expected. However, variation in microsatellites and 16S gene sequences corroborated morphological, allozyme and other phenotypic evidence of trans-Andes variation between the populations. The Chilean populations were the most genetically variable, while the variation present on the eastern side of the Andes mountains was a subset of that observed in Chile.

  11. Draft genome sequence of the docosahexaenoic acid producing thraustochytrid Aurantiochytrium sp. T66.

    PubMed

    Liu, Bin; Ertesvåg, Helga; Aasen, Inga Marie; Vadstein, Olav; Brautaset, Trygve; Heggeset, Tonje Marita Bjerkan

    2016-06-01

    Thraustochytrids are unicellular, marine protists, and there is a growing industrial interest in these organisms, particularly because some species, including strains belonging to the genus Aurantiochytrium, accumulate high levels of docosahexaenoic acid (DHA). Here, we report the draft genome sequence of Aurantiochytrium sp. T66 (ATCC PRA-276), with a size of 43 Mbp, and 11,683 predicted protein-coding sequences. The data has been deposited at DDBJ/EMBL/Genbank under the accession LNGJ00000000. The genome sequence will contribute new insight into DHA biosynthesis and regulation, providing a basis for metabolic engineering of thraustochytrids. PMID:27222814

  12. Draft genome sequence of the docosahexaenoic acid producing thraustochytrid Aurantiochytrium sp. T66.

    PubMed

    Liu, Bin; Ertesvåg, Helga; Aasen, Inga Marie; Vadstein, Olav; Brautaset, Trygve; Heggeset, Tonje Marita Bjerkan

    2016-06-01

    Thraustochytrids are unicellular, marine protists, and there is a growing industrial interest in these organisms, particularly because some species, including strains belonging to the genus Aurantiochytrium, accumulate high levels of docosahexaenoic acid (DHA). Here, we report the draft genome sequence of Aurantiochytrium sp. T66 (ATCC PRA-276), with a size of 43 Mbp, and 11,683 predicted protein-coding sequences. The data has been deposited at DDBJ/EMBL/Genbank under the accession LNGJ00000000. The genome sequence will contribute new insight into DHA biosynthesis and regulation, providing a basis for metabolic engineering of thraustochytrids.

  13. In silico comparative analysis of DNA and amino acid sequences for prion protein gene.

    PubMed

    Kim, Y; Lee, J; Lee, C

    2008-01-01

    Genetic variability might contribute to species specificity of prion diseases in various organisms. In this study, structures of the prion protein gene (PRNP) and its amino acids were compared among species of which sequence data were available. Comparisons of PRNP DNA sequences among 12 species including human, chimpanzee, monkey, bovine, ovine, dog, mouse, rat, wallaby, opossum, chicken and zebrafish allowed us to identify candidate regulatory regions in intron 1 and 3'-untranslated region (UTR) in addition to the coding region. Highly conserved putative binding sites for transcription factors, such as heat shock factor 2 (HSF2) and myocyte enhancer factor 2 (MEF2), were discovered in the intron 1. In 3'-UTR, the functional sequence (ATTAAA) for nucleus-specific polyadenylation was found in all the analysed species. The functional sequence (TTTTTAT) for maturation-specific polyadenylation was observed identically only in ovine, with one or two nucleotide mismatches in the other species. A comparison of the amino acid sequences in 53 species revealed a large sequence identity. Especially the octapeptide repeat region was observed in all the species but frog and zebrafish. Functional changes and susceptibility to prion diseases with various isoforms of prion protein could be caused by numeric variability and conformational changes discovered in the repeat sequences.
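
    The search for polyadenylation signals with a small number of mismatches can be sketched as a simple scan. The function name and the example sequences are hypothetical, and this is not the pipeline used in the study.

```python
def find_motif(sequence: str, motif: str, max_mismatches: int = 0):
    """Return (start, matched_window, mismatches) for every window of
    the sequence that differs from the motif in at most max_mismatches
    positions (substitutions only, no gaps)."""
    hits = []
    for start in range(len(sequence) - len(motif) + 1):
        window = sequence[start:start + len(motif)]
        mismatches = sum(a != b for a, b in zip(window, motif))
        if mismatches <= max_mismatches:
            hits.append((start, window, mismatches))
    return hits


# Exact hit for ATTAAA in a made-up 3'-UTR fragment.
print(find_motif("GGATTAAAGG", "ATTAAA"))
# One-mismatch hit for TTTTTAT in another made-up fragment.
print(find_motif("CCTTTTAATCC", "TTTTTAT", max_mismatches=1))
```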

  14. AcalPred: a sequence-based tool for discriminating between acidic and alkaline enzymes.

    PubMed

    Lin, Hao; Chen, Wei; Ding, Hui

    2013-01-01

    The structure and activity of enzymes are influenced by pH value of their surroundings. Although many enzymes work well in the pH range from 6 to 8, some specific enzymes have good efficiencies only in acidic (pH<5) or alkaline (pH>9) solution. Studies have demonstrated that the activities of enzymes correlate with their primary sequences. It is crucial to judge enzyme adaptation to acidic or alkaline environment from its amino acid sequence in molecular mechanism clarification and the design of high efficient enzymes. In this study, we developed a sequence-based method to discriminate acidic enzymes from alkaline enzymes. The analysis of variance was used to choose the optimized discriminating features derived from g-gap dipeptide compositions. And support vector machine was utilized to establish the prediction model. In the rigorous jackknife cross-validation, the overall accuracy of 96.7% was achieved. The method can correctly predict 96.3% acidic and 97.1% alkaline enzymes. Through the comparison between the proposed method and previous methods, it is demonstrated that the proposed method is more accurate. On the basis of this proposed method, we have built an online web-server called AcalPred which can be freely accessed from the website (http://lin.uestc.edu.cn/server/AcalPred). We believe that the AcalPred will become a powerful tool to study enzyme adaptation to acidic or alkaline environment.
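
    The g-gap dipeptide composition used as the feature set in this record can be sketched as below. The definition assumed here counts residue pairs separated by g intervening positions and divides by the number of such pairs (L - g - 1). The feature selection by analysis of variance and the support vector machine are not shown, and the function name is hypothetical.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def g_gap_dipeptide_composition(sequence: str, gap: int = 0) -> dict[str, float]:
    """Frequency of each of the 400 residue pairs (a, b) where b lies
    gap + 1 positions after a. gap = 0 gives adjacent dipeptides."""
    n_pairs = len(sequence) - gap - 1
    if n_pairs <= 0:
        raise ValueError("sequence too short for this gap")
    counts = {"".join(p): 0 for p in product(AMINO_ACIDS, repeat=2)}
    for i in range(n_pairs):
        pair = sequence[i] + sequence[i + gap + 1]
        if pair in counts:  # pairs with non-standard residues are skipped
            counts[pair] += 1
    return {pair: c / n_pairs for pair, c in counts.items()}
```

    Each sequence becomes a 400-dimensional vector per gap value, which is the kind of input a classifier could then be trained on.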

  15. AcalPred: A Sequence-Based Tool for Discriminating between Acidic and Alkaline Enzymes

    PubMed Central

    Lin, Hao; Chen, Wei; Ding, Hui

    2013-01-01

    The structure and activity of enzymes are influenced by pH value of their surroundings. Although many enzymes work well in the pH range from 6 to 8, some specific enzymes have good efficiencies only in acidic (pH<5) or alkaline (pH>9) solution. Studies have demonstrated that the activities of enzymes correlate with their primary sequences. It is crucial to judge enzyme adaptation to acidic or alkaline environment from its amino acid sequence in molecular mechanism clarification and the design of high efficient enzymes. In this study, we developed a sequence-based method to discriminate acidic enzymes from alkaline enzymes. The analysis of variance was used to choose the optimized discriminating features derived from g-gap dipeptide compositions. And support vector machine was utilized to establish the prediction model. In the rigorous jackknife cross-validation, the overall accuracy of 96.7% was achieved. The method can correctly predict 96.3% acidic and 97.1% alkaline enzymes. Through the comparison between the proposed method and previous methods, it is demonstrated that the proposed method is more accurate. On the basis of this proposed method, we have built an online web-server called AcalPred which can be freely accessed from the website (http://lin.uestc.edu.cn/server/AcalPred). We believe that the AcalPred will become a powerful tool to study enzyme adaptation to acidic or alkaline environment. PMID:24130738

  16. Antibody-specific model of amino acid substitution for immunological inferences from alignments of antibody sequences.

    PubMed

    Mirsky, Alexander; Kazandjian, Linda; Anisimova, Maria

    2015-03-01

    Antibodies are glycoproteins produced by the immune system as a dynamically adaptive line of defense against invading pathogens. Very elegant and specific mutational mechanisms allow B lymphocytes to produce a large and diversified repertoire of antibodies, which is modified and enhanced throughout all adulthood. One of these mechanisms is somatic hypermutation, which stochastically mutates nucleotides in the antibody genes, forming new sequences with different properties and, eventually, higher affinity and selectivity to the pathogenic target. As somatic hypermutation involves fast mutation of antibody sequences, this process can be described using a Markov substitution model of molecular evolution. Here, using large sets of antibody sequences from mice and humans, we infer an empirical amino acid substitution model AB, which is specific to antibody sequences. Compared with existing general amino acid models, we show that the AB model provides significantly better description for the somatic evolution of mice and human antibody sequences, as demonstrated on large next generation sequencing (NGS) antibody data. General amino acid models are reflective of conservation at the protein level due to functional constraints, with most frequent amino acids exchanges taking place between residues with the same or similar physicochemical properties. In contrast, within the variable part of antibody sequences we observed an elevated frequency of exchanges between amino acids with distinct physicochemical properties. This is indicative of a sui generis mutational mechanism, specific to antibody somatic hypermutation. We illustrate this property of antibody sequences by a comparative analysis of the network modularity implied by the AB model and general amino acid substitution models. We recommend using the new model for computational studies of antibody sequence maturation, including inference of alignments and phylogenetic trees describing antibody somatic hypermutation in ...

  17. The value of short amino acid sequence matches for prediction of protein allergenicity.

    PubMed

    Silvanovich, Andre; Nemeth, Margaret A; Song, Ping; Herman, Rod; Tagliani, Laura; Bannon, Gary A

    2006-03-01

    Typically, genetically engineered crops contain traits encoded by one or a few newly expressed proteins. The allergenicity assessment of newly expressed proteins is an important component in the safety evaluation of genetically engineered plants. One aspect of this assessment involves sequence searches that compare the amino acid sequence of the protein to all known allergens. Analyses are performed to determine the potential for immunologically based cross-reactivity where IgE directed against a known allergen could bind to the protein and elicit a clinical reaction in sensitized individuals. Bioinformatic searches are designed to detect global sequence similarity and short contiguous amino acid sequence identity. It has been suggested that potential allergen cross-reactivity may be predicted by identifying matches as short as six to eight contiguous amino acids between the protein of interest and a known allergen. A series of analyses were performed, and match probabilities were calculated for different size peptides to determine if there was a scientifically justified search window size that identified allergen sequence characteristics. Four probability modeling methods were tested: (1) a mock protein and a mock allergen database, (2) a mock protein and genuine allergen database, (3) a genuine allergen and genuine protein database, and (4) a genuine allergen and genuine protein database combined with a correction for repeating peptides. These analyses indicated that searches for short amino acid sequence matches of eight amino acids or fewer to identify proteins as potential cross-reactive allergens is a product of chance and adds little value to allergy assessments for newly expressed proteins.
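
    The short contiguous match search discussed in this record can be sketched as an exact k-mer lookup. The window sizes of six and eight residues follow the abstract; the function name and the example sequences are hypothetical, and the probability modeling methods are not reproduced.

```python
def shared_kmers(query: str, allergen: str, k: int = 8):
    """Return (position_in_query, peptide) for every contiguous
    k-residue window of the query that also occurs in the allergen."""
    allergen_kmers = {allergen[i:i + k] for i in range(len(allergen) - k + 1)}
    return [
        (i, query[i:i + k])
        for i in range(len(query) - k + 1)
        if query[i:i + k] in allergen_kmers
    ]


# Made-up sequences: one shared 6-mer, no shared 8-mer.
print(shared_kmers("MKTAYIAKQR", "GGTAYIAKGG", k=6))
print(shared_kmers("MKTAYIAKQR", "GGTAYIAKGG", k=8))
```

    Shorter windows produce more hits between unrelated sequences, which is the effect the authors quantify.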

  18. Antibody responses to hepatitis C virus hypervariable region 1: evidence for cross-reactivity and immune-mediated sequence variation.

    PubMed

    Mondelli, M U; Cerino, A; Lisa, A; Brambilla, S; Segagni, L; Cividini, A; Bissolati, M; Missale, G; Bellati, G; Meola, A; Bruniercole, B; Nicosia, A; Galfrè, G; Silini, E

    1999-08-01

    Sequence heterogeneity of hepatitis C virus (HCV) is unevenly distributed along the genome, and maximal variation is confined to a short sequence of the HCV second envelope glycoprotein (E2), designated hypervariable region 1 (HVR1), whose biological function is still undefined. We prospectively studied serological responses to synthetic oligopeptides derived from HVR1 sequences of patients with acute and chronic HCV infection obtained at baseline and after a defined follow-up period. Extensive serological cross-reactivity for unrelated HVR1 peptides was observed in the majority of the patients. Antibody response was restricted to the IgG1 isotype and was focused on the carboxyterminal end of the HVR1 region. Cross-reactive antibodies could be readily elicited following immunization of mice with multiple antigenic peptides carrying HVR1 sequences derived from our patients. The vigor and heterogeneity of cross-reactive antibody responses were significantly higher in patients with chronic hepatitis compared with those with acute hepatitis and in patients infected with HCV type 2 compared with patients infected with other viral genotypes (predominantly type 1), which suggest that higher time-related HVR1 sequence diversification previously described for type 2 may result from immune selection. The finding of a statistically significant correlation between HVR1 sequence variation, and intensity, and cross-reactivity of humoral immune responses provided stronger evidence in support of the contention that HCV variant selection is driven by the host's immune pressure.

  19. Comparison of the amino acid sequence of the major immunogen from three serotypes of foot and mouth disease virus.

    PubMed Central

    Makoff, A J; Paynter, C A; Rowlands, D J; Boothroyd, J C

    1982-01-01

    Cloned cDNA molecules from three serotypes of FMDV have been sequenced around the VP1-coding region. The predicted amino acid sequences for VP1 were compared with the published sequences and variable regions identified. The amino acid sequences were also analysed for hydrophilic regions. Two of the variable regions, numbered 129-160 and 193-204 overlapped hydrophilic regions, and were therefore identified as potentially immunogenic. These regions overlap regions shown by others to be immunogenic. PMID:6298715

  20. Nucleic acid detection compositions

    DOEpatents

    Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann; Dahlberg, James E.

    2008-08-05

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.

  1. Cleavage of nucleic acids

    DOEpatents

    Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.

    2000-01-01

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.

  2. Nucleic acid detection assays

    DOEpatents

    Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann; Dahlberg, James E.

    2005-04-05

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.

  3. Cleavage of nucleic acids

    DOEpatents

    Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.

    2007-12-11

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.

  4. Cleavage of nucleic acids

    SciTech Connect

    Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.

    2010-11-09

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.

  5. Identification of conserved genomic regions and variation therein amongst Cetartiodactyla species using next generation sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background Next Generation Sequencing has created an opportunity to genetically characterize an individual both inexpensively and comprehensively. In earlier work produced in our collaboration [1], it was demonstrated that, for animals without a reference genome, their Next Generation Sequence data ...

  6. Sequence Variation in Superoxide Dismutase Gene of Toxoplasma gondii among Various Isolates from Different Hosts and Geographical Regions.

    PubMed

    Wang, Shuai; Cao, Aiping; Li, Xun; Zhao, Qunli; Liu, Yuan; Cong, Hua; He, Shenyi; Zhou, Huaiyu

    2015-06-01

    Toxoplasma gondii, an obligate intracellular protozoan parasite of the phylum Apicomplexa, can infect all warm-blooded vertebrates, including humans, livestock, and marine mammals. The aim of this study was to investigate whether superoxide dismutase (SOD) of T. gondii can be used as a new marker for genetic study or a potential vaccine candidate. The partial genome region of the SOD gene was amplified and sequenced from 10 different T. gondii isolates from different parts of the world, and all the sequences were examined by PCR-RFLP, sequence analysis, and phylogenetic reconstruction. The results showed that partial SOD gene sequences ranged from 1,702 bp to 1,712 bp and A + T contents varied from 50.1% to 51.1% among all examined isolates. Sequence alignment analysis identified a total of 43 variable nucleotide positions, and these results showed 97.5% sequence similarity of the SOD gene among all examined isolates. Phylogenetic analysis revealed that these SOD sequences were not an effective molecular marker for differential identification of T. gondii strains. The research demonstrated the existence of low sequence variation in the SOD gene among T. gondii strains of different genotypes from different hosts and geographical regions. PMID:26174817
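
    The two summary statistics reported in this record, A + T content and the number of variable alignment positions, can be sketched as follows. The function names and the toy alignment are hypothetical, and the input is assumed to be already aligned to equal length.

```python
def at_content(sequence: str) -> float:
    """Percentage of A and T bases in a nucleotide sequence."""
    sequence = sequence.upper()
    return 100.0 * (sequence.count("A") + sequence.count("T")) / len(sequence)


def variable_positions(alignment: list[str]) -> list[int]:
    """0-based column indices where the aligned sequences are not all
    identical. Gap characters count as a distinct state here, which is
    an assumption rather than the authors' stated convention."""
    if len({len(s) for s in alignment}) != 1:
        raise ValueError("aligned sequences must have equal length")
    return [i for i, col in enumerate(zip(*alignment)) if len(set(col)) > 1]


toy_alignment = ["ATGC", "ATGA", "TTGC"]
print(at_content(toy_alignment[0]))
print(variable_positions(toy_alignment))
```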

  7. The use of high-throughput DNA sequencing in the investigation of antigenic variation: application to Neisseria species.

    PubMed

    Davies, John K; Harrison, Paul F; Lin, Ya-Hsun; Bartley, Stephanie; Khoo, Chen Ai; Seemann, Torsten; Ryan, Catherine S; Kahler, Charlene M; Hill, Stuart A

    2014-01-01

    Antigenic variation occurs in a broad range of species. This process resembles gene conversion in that variant DNA is unidirectionally transferred from partial gene copies (or silent loci) into an expression locus. Previous studies of antigenic variation have involved the amplification and sequencing of individual genes from hundreds of colonies. Using the pilE gene from Neisseria gonorrhoeae we have demonstrated that it is possible to use PCR amplification, followed by high-throughput DNA sequencing and a novel assembly process, to detect individual antigenic variation events. The ability to detect these events was much greater than has previously been possible. In N. gonorrhoeae most silent loci contain multiple partial gene copies. Here we show that there is a bias towards using the copy at the 3' end of the silent loci (copy 1) as the donor sequence. The pilE gene of N. gonorrhoeae and some strains of Neisseria meningitidis encode class I pilin, but strains of N. meningitidis from clonal complexes 8 and 11 encode a class II pilin. We have confirmed that the class II pili of meningococcal strain FAM18 (clonal complex 11) are non-variable, and this is also true for the class II pili of strain NMB from clonal complex 8. In addition when a gene encoding class I pilin was moved into the meningococcal strain NMB background there was no evidence of antigenic variation. Finally we investigated several members of the opa gene family of N. gonorrhoeae, where it has been suggested that limited variation occurs. Variation was detected in the opaK gene that is located close to pilE, but not at the opaJ gene located elsewhere on the genome. The approach described here promises to dramatically improve studies of the extent and nature of antigenic variation systems in a variety of species.

  8. Fin whale MDH-1 and MPI allozyme variation is not reflected in the corresponding DNA sequences

    PubMed Central

    Olsen, Morten Tange; Pampoulie, Christophe; Daníelsdóttir, Anna K; Lidh, Emmelie; Bérubé, Martine; Víkingsson, Gísli A; Palsbøll, Per J

    2014-01-01

    The appeal of genetic inference methods to assess population genetic structure and guide management efforts is grounded in the correlation between the genetic similarity and gene flow among populations. Effects of such gene flow are typically genomewide; however, some loci may appear as outliers, displaying above or below average genetic divergence relative to the genomewide level. Above average population genetic divergence may be due to divergent selection as a result of local adaptation. Consequently, substantial efforts have been directed toward such outlying loci in order to identify traits subject to local adaptation. Here, we report the results of an investigation into the molecular basis of the substantial degree of genetic divergence previously reported at allozyme loci among North Atlantic fin whale (Balaenoptera physalus) populations. We sequenced the exons encoding for the two most divergent allozyme loci (MDH-1 and MPI) and failed to detect any nonsynonymous substitutions. Following extensive error checking and analysis of additional bioinformatic and morphological data, we hypothesize that the observed allozyme polymorphisms may reflect phenotypic plasticity at the cellular level, perhaps as a response to nutritional stress. While such plasticity is intriguing in itself, and of fundamental evolutionary interest, our key finding is that the observed allozyme variation does not appear to be a result of genetic drift, migration, or selection on the MDH-1 and MPI exons themselves, stressing the importance of interpreting allozyme data with caution. As for North Atlantic fin whale population structure, our findings support the low levels of differentiation found in previous analyses of DNA nucleotide loci. PMID:24963377

  9. Whole-Genome Sequencing Reveals Diverse Models of Structural Variations in Esophageal Squamous Cell Carcinoma.

    PubMed

    Cheng, Caixia; Zhou, Yong; Li, Hongyi; Xiong, Teng; Li, Shuaicheng; Bi, Yanghui; Kong, Pengzhou; Wang, Fang; Cui, Heyang; Li, Yaoping; Fang, Xiaodong; Yan, Ting; Li, Yike; Wang, Juan; Yang, Bin; Zhang, Ling; Jia, Zhiwu; Song, Bin; Hu, Xiaoling; Yang, Jie; Qiu, Haile; Zhang, Gehong; Liu, Jing; Xu, Enwei; Shi, Ruyi; Zhang, Yanyan; Liu, Haiyan; He, Chanting; Zhao, Zhenxiang; Qian, Yu; Rong, Ruizhou; Han, Zhiwei; Zhang, Yanlin; Luo, Wen; Wang, Jiaqian; Peng, Shaoliang; Yang, Xukui; Li, Xiangchun; Li, Lin; Fang, Hu; Liu, Xingmin; Ma, Li; Chen, Yunqing; Guo, Shiping; Chen, Xing; Xi, Yanfeng; Li, Guodong; Liang, Jianfang; Yang, Xiaofeng; Guo, Jiansheng; Jia, JunMei; Li, Qingshan; Cheng, Xiaolong; Zhan, Qimin; Cui, Yongping

    2016-02-01

    Comprehensive identification of somatic structural variations (SVs) and understanding their mutational mechanisms in cancer might contribute to understanding biological differences and help to identify new therapeutic targets. Unfortunately, characterization of complex SVs across the whole genome and the mutational mechanisms underlying esophageal squamous cell carcinoma (ESCC) is largely unclear. To define a comprehensive catalog of somatic SVs, affected target genes, and their underlying mechanisms in ESCC, we re-analyzed whole-genome sequencing (WGS) data from 31 ESCCs using Meerkat algorithm to predict somatic SVs and Patchwork to determine copy-number changes. We found deletions and translocations with NHEJ and alt-EJ signature as the dominant SV types, and 16% of deletions were complex deletions. SVs frequently led to disruption of cancer-associated genes (e.g., CDKN2A and NOTCH1) with different mutational mechanisms. Moreover, chromothripsis, kataegis, and breakage-fusion-bridge (BFB) were identified as contributing to locally mis-arranged chromosomes that occurred in 55% of ESCCs. These genomic catastrophes led to amplification of oncogene through chromothripsis-derived double-minute chromosome formation (e.g., FGFR1 and LETM2) or BFB-affected chromosomes (e.g., CCND1, EGFR, ERBB2, MMPs, and MYC), with approximately 30% of ESCCs harboring BFB-derived CCND1 amplification. Furthermore, analyses of copy-number alterations reveal high frequency of whole-genome duplication (WGD) and recurrent focal amplification of CDCA7 that might act as a potential oncogene in ESCC. Our findings reveal molecular defects such as chromothripsis and BFB in malignant transformation of ESCCs and demonstrate diverse models of SVs-derived target genes in ESCCs. These genome-wide SV profiles and their underlying mechanisms provide preventive, diagnostic, and therapeutic implications for ESCCs. PMID:26833333

  10. Population clustering based on copy number variations detected from next generation sequencing data

    PubMed Central

    Duan, Junbo; Zhang, Ji-Gang; Wan, Mingxi; Deng, Hong-Wen; Wang, Yu-Ping

    2015-01-01

    Copy number variations (CNVs) can be used as significant biomarkers, and next generation sequencing (NGS) provides high-resolution detection of these CNVs. But how to extract features from CNVs and further apply them to genomic studies such as population clustering has become a big challenge. In this paper, we propose a novel method for population clustering based on CNVs from NGS. First, CNVs are extracted from each sample to form a feature matrix. Then, this feature matrix is decomposed into the source matrix and weight matrix with non-negative matrix factorization (NMF). The source matrix consists of common CNVs that are shared by all the samples from the same group, and the weight matrix indicates the corresponding level of CNVs from each sample. Therefore, using NMF of CNVs one can differentiate samples from different ethnic groups, i.e. population clustering. To validate the approach, we applied it to the analysis of both simulation data and two real data sets from the 1000 Genomes Project. The results on simulation data demonstrate that the proposed method can recover the true common CNVs with high quality. The results on the first real data analysis show that the proposed method can cluster two family trios with different ancestries into two ethnic groups, and the results on the second real data analysis show that the proposed method can be applied to the whole genome with a large sample size consisting of multiple groups. Both results demonstrate the potential of the proposed method for population clustering. PMID:25152046
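
    The decomposition described in this abstract can be sketched as follows. This is only an illustration, not the authors' implementation: the multiplicative-update rule, the function names `nmf` and `cluster_samples`, and the choice to assign each sample to its largest weight are all assumptions made for the example.

```python
import numpy as np

def nmf(V, rank, n_iter=500, seed=0, eps=1e-9):
    """Factor a non-negative matrix V (CNV features x samples) as W @ H.

    W (features x rank) plays the role of the "source" matrix of common
    CNV patterns; H (rank x samples) plays the role of the "weight" matrix.
    Uses Lee-Seung multiplicative updates (an assumed choice of solver).
    """
    rng = np.random.default_rng(seed)
    n_features, n_samples = V.shape
    W = rng.random((n_features, rank)) + eps
    H = rng.random((rank, n_samples)) + eps
    for _ in range(n_iter):
        # Each update keeps the factors non-negative by construction.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def cluster_samples(H):
    """Hypothetical clustering rule: label each sample by its largest weight."""
    return H.argmax(axis=0)

# Toy feature matrix: samples 0-1 share CNVs at features 0-2,
# samples 2-3 share CNVs at features 3-5.
V = np.array([[1, 1, 0, 0]] * 3 + [[0, 0, 1, 1]] * 3, dtype=float)
W, H = nmf(V, rank=2)
labels = cluster_samples(H)
```

    On the toy matrix, samples that share the same CNV pattern receive the same label; real CNV calls would need normalisation and a principled choice of `rank`.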

  11. Whole-Genome Sequencing Reveals Diverse Models of Structural Variations in Esophageal Squamous Cell Carcinoma

    PubMed Central

    Cheng, Caixia; Zhou, Yong; Li, Hongyi; Xiong, Teng; Li, Shuaicheng; Bi, Yanghui; Kong, Pengzhou; Wang, Fang; Cui, Heyang; Li, Yaoping; Fang, Xiaodong; Yan, Ting; Li, Yike; Wang, Juan; Yang, Bin; Zhang, Ling; Jia, Zhiwu; Song, Bin; Hu, Xiaoling; Yang, Jie; Qiu, Haile; Zhang, Gehong; Liu, Jing; Xu, Enwei; Shi, Ruyi; Zhang, Yanyan; Liu, Haiyan; He, Chanting; Zhao, Zhenxiang; Qian, Yu; Rong, Ruizhou; Han, Zhiwei; Zhang, Yanlin; Luo, Wen; Wang, Jiaqian; Peng, Shaoliang; Yang, Xukui; Li, Xiangchun; Li, Lin; Fang, Hu; Liu, Xingmin; Ma, Li; Chen, Yunqing; Guo, Shiping; Chen, Xing; Xi, Yanfeng; Li, Guodong; Liang, Jianfang; Yang, Xiaofeng; Guo, Jiansheng; Jia, JunMei; Li, Qingshan; Cheng, Xiaolong; Zhan, Qimin; Cui, Yongping

    2016-01-01

    Comprehensive identification of somatic structural variations (SVs) and understanding their mutational mechanisms in cancer might contribute to understanding biological differences and help to identify new therapeutic targets. Unfortunately, complex SVs across the whole genome and the mutational mechanisms underlying esophageal squamous cell carcinoma (ESCC) remain largely uncharacterized. To define a comprehensive catalog of somatic SVs, affected target genes, and their underlying mechanisms in ESCC, we re-analyzed whole-genome sequencing (WGS) data from 31 ESCCs using the Meerkat algorithm to predict somatic SVs and Patchwork to determine copy-number changes. We found deletions and translocations with NHEJ and alt-EJ signatures as the dominant SV types, and 16% of deletions were complex deletions. SVs frequently led to disruption of cancer-associated genes (e.g., CDKN2A and NOTCH1) with different mutational mechanisms. Moreover, chromothripsis, kataegis, and breakage-fusion-bridge (BFB) were identified as contributing to locally mis-arranged chromosomes that occurred in 55% of ESCCs. These genomic catastrophes led to amplification of oncogenes through chromothripsis-derived double-minute chromosome formation (e.g., FGFR1 and LETM2) or BFB-affected chromosomes (e.g., CCND1, EGFR, ERBB2, MMPs, and MYC), with approximately 30% of ESCCs harboring BFB-derived CCND1 amplification. Furthermore, analyses of copy-number alterations reveal a high frequency of whole-genome duplication (WGD) and recurrent focal amplification of CDCA7 that might act as a potential oncogene in ESCC. Our findings reveal molecular defects such as chromothripsis and BFB in malignant transformation of ESCCs and demonstrate diverse models of SV-derived target genes in ESCCs. These genome-wide SV profiles and their underlying mechanisms provide preventive, diagnostic, and therapeutic implications for ESCCs. PMID:26833333

  12. Maternal effects and maternal selection arising from variation in allocation of free amino acid to eggs.

    PubMed

    Newcombe, Devi; Hunt, John; Mitchell, Christopher; Moore, Allen J

    2015-06-01

    Maternal provisioning can have profound effects on offspring phenotypes, or maternal effects, especially early in life. One ubiquitous form of provisioning is in the makeup of eggs. However, only a few studies examine the role of specific egg constituents in maternal effects, especially as they relate to maternal selection (a standardized selection gradient reflecting the covariance between maternal traits and offspring fitness). Here, we report on the evolutionary consequences of differences in maternal acquisition and allocation of amino acids to eggs. We manipulated acquisition by varying maternal diet (milkweed or sunflower) in the large milkweed bug, Oncopeltus fasciatus. Variation in allocation was detected by examining two source populations with different evolutionary histories and life-history response to sunflower as food. We measured amino acid composition in eggs in this 2 × 2 design and found significant effects of source population and maternal diet on egg and nymph mass and of source population, maternal diet, and their interaction on amino acid composition of eggs. We measured significant linear and quadratic maternal selection on offspring mass associated with variation in amino acid allocation. Visualizing the performance surface along the major axes of nonlinear selection and plotting the mean amino acid profile of eggs from each treatment onto the surface revealed a saddle-shaped fitness surface. While maternal selection appears to have influenced how females allocate amino acids, this maternal effect did not evolve equally in the two populations. Furthermore, none of the population means coincided with peak performance. Thus, we found that the composition of free amino acids in eggs was due to variation in both acquisition and allocation, which had significant fitness effects and created selection. However, although there can be an evolutionary response to novel food resources, females may be constrained from reaching phenotypic optima with

  13. Maternal effects and maternal selection arising from variation in allocation of free amino acid to eggs

    PubMed Central

    Newcombe, Devi; Hunt, John; Mitchell, Christopher; Moore, Allen J

    2015-01-01

    Maternal provisioning can have profound effects on offspring phenotypes, or maternal effects, especially early in life. One ubiquitous form of provisioning is in the makeup of eggs. However, only a few studies examine the role of specific egg constituents in maternal effects, especially as they relate to maternal selection (a standardized selection gradient reflecting the covariance between maternal traits and offspring fitness). Here, we report on the evolutionary consequences of differences in maternal acquisition and allocation of amino acids to eggs. We manipulated acquisition by varying maternal diet (milkweed or sunflower) in the large milkweed bug, Oncopeltus fasciatus. Variation in allocation was detected by examining two source populations with different evolutionary histories and life-history response to sunflower as food. We measured amino acid composition in eggs in this 2 × 2 design and found significant effects of source population and maternal diet on egg and nymph mass and of source population, maternal diet, and their interaction on amino acid composition of eggs. We measured significant linear and quadratic maternal selection on offspring mass associated with variation in amino acid allocation. Visualizing the performance surface along the major axes of nonlinear selection and plotting the mean amino acid profile of eggs from each treatment onto the surface revealed a saddle-shaped fitness surface. While maternal selection appears to have influenced how females allocate amino acids, this maternal effect did not evolve equally in the two populations. Furthermore, none of the population means coincided with peak performance. Thus, we found that the composition of free amino acids in eggs was due to variation in both acquisition and allocation, which had significant fitness effects and created selection. However, although there can be an evolutionary response to novel food resources, females may be constrained from reaching phenotypic optima with

  14. Quantitative detection of Aspergillus spp. by real-time nucleic acid sequence-based amplification.

    PubMed

    Zhao, Yanan; Perlin, David S

    2013-01-01

    Rapid and quantitative detection of Aspergillus from clinical samples may facilitate an early diagnosis of invasive pulmonary aspergillosis (IPA). As nucleic acid-based detection is a viable option, we demonstrate that Aspergillus burdens can be rapidly and accurately detected by a novel real-time nucleic acid assay other than qPCR by using the combination of nucleic acid sequence-based amplification (NASBA) and the molecular beacon (MB) technology. Here, we detail a real-time NASBA assay to determine quantitative Aspergillus burdens in lungs and bronchoalveolar lavage (BAL) fluids of rats with experimental IPA.

  15. Draft Genome Sequence of the Butyric Acid Producer Clostridium tyrobutyricum Strain CIP I-776 (IFP923)

    PubMed Central

    Clément, Benjamin; Lopes Ferreira, Nicolas

    2016-01-01

    Here, we report the draft genome sequence of Clostridium tyrobutyricum CIP I-776 (IFP923), an efficient producer of butyric acid. The genome consists of a single chromosome of 3.19 Mb and provides useful data concerning the metabolic capacities of the strain. PMID:26941139

  16. Amino acid sequence of the encephalitogenic basic protein from human myelin

    PubMed Central

    Carnegie, P. R.

    1971-01-01

    Myelin from the central nervous system contains an unusual basic protein, which can induce experimental autoimmune encephalomyelitis. The basic protein from human brain was digested with trypsin and other enzymes and the sequence of the 170 amino acids was determined. The localization of the encephalitogenic determinants was described. Possible roles for the protein in the structure and function of myelin are discussed. PMID:4108501

  17. Sequence-specific formation of d-amino acids in a monoclonal antibody during light exposure.

    PubMed

    Mozziconacci, Olivier; Schöneich, Christian

    2014-11-01

    The photoirradiation of a monoclonal antibody 1 (mAb1) at λ = 254 nm and λmax = 305 nm resulted in the sequence-specific generation of d-Val, d-Tyr, and potentially d-Ala and d-Arg, in the heavy chain sequence [95-101] YCARVVY. d-Amino acid formation is most likely the product of reversible intermediary carbon-centered radical formation at the (α)C-positions of the respective amino acids ((α)C(•) radicals) through the action of Cys thiyl radicals (CysS(•)). The latter can be generated photochemically either through direct homolysis of cystine or through photoinduced electron transfer from Trp and/or Tyr residues. The potential of mAb1 sequences to undergo epimerization was first evaluated through covalent H/D exchange during photoirradiation in D2O, and proteolytic peptides exhibiting deuterium incorporation were monitored by HPLC-MS/MS analysis. Subsequently, mAb1 was photoirradiated in H2O, and peptides, for which deuterium incorporation in D2O had been documented, were purified by HPLC and subjected to hydrolysis and amino acid analysis. Importantly, not all peptide sequences which incorporated deuterium during photoirradiation in D2O also exhibited photoinduced d-amino acid formation. For example, the heavy chain sequence [12-18] VQPGGSL showed significant deuterium incorporation during photoirradiation in D2O, but no photoinduced formation of d-amino acids was detected. Instead this sequence contained ca. 22% d-Val in both a photoirradiated and a control sample. This observation could indicate that d-Val may have been generated either during production and/or storage or during sample preparation. While sample preparation did not lead to the formation of d-Val or other d-amino acids in the control sample for the heavy chain sequence [95-101] YCARVVY, we may have to consider that during hydrolysis N-terminal residues (such as in VQPGGSL) may be more prone to epimerization. We conclude that the photoinduced, radical-dependent formation of d-amino acids

  18. Mitochondrial DNA variation and phylogenetic relationships among five tuna species based on sequencing of D-loop region.

    PubMed

    Kumar, Girish; Kocour, Martin; Kunal, Swaraj Priyaranjan

    2016-05-01

    In order to assess the DNA sequence variation and phylogenetic relationship among five tuna species (Auxis thazard, Euthynnus affinis, Katsuwonus pelamis, Thunnus tonggol, and T. albacares) out of all four tuna genera, partial sequences of the mitochondrial DNA (mtDNA) D-loop region were analyzed. The estimate of intra-specific sequence variation in the studied species was low, ranging from 0.027 to 0.080 [Kimura's two-parameter distance (K2P)], whereas values of inter-specific variation ranged from 0.049 to 0.491. The longtail tuna (T. tonggol) and yellowfin tuna (T. albacares) were found to share a close relationship (K2P = 0.049), while skipjack tuna (K. pelamis) was the most divergent species studied. Phylogenetic analysis using Maximum-Likelihood (ML) and Neighbor-Joining (NJ) methods supported the monophyletic origin of Thunnus species. Similarly, the phylogeny of Auxis and Euthynnus species substantiates their monophyly. However, results showed a distinct origin of K. pelamis from genus Thunnus as well as Auxis and Euthynnus. Thus, the mtDNA D-loop region sequence data support the polyphyletic origin of tuna species. PMID:25329285
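
    For readers unfamiliar with the K2P values quoted above, the standard Kimura two-parameter distance between two aligned sequences is d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitional and transversional differences. A minimal sketch follows; the function name and the decision to skip gapped or ambiguous columns are assumptions of this example, not details taken from the study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # assumed policy: ignore gaps and ambiguity codes
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent (saturated).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

    Identical sequences give a distance of zero, and the distance grows faster than the raw proportion of differences because the formula corrects for multiple substitutions at the same site.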

  19. The complete amino acid sequence of chitinase-B from the leaves of pokeweed (Phytolacca americana).

    PubMed

    Tanigawa, M; Yamagami, T; Funatsu, G

    1995-05-01

    The complete amino acid sequence of pokeweed leaf chitinase-B (PLC-B) has been determined by first sequencing all 19 tryptic peptides derived from the reduced and S-carboxymethylated (RCm-) PLC-B and then connecting them by analyzing the chymotryptic peptides from three fragments produced by cyanogen bromide cleavage of RCm-PLC-B. PLC-B consists of 274 amino acid residues and has a molecular mass of 29,473 Da. Six cysteine residues are linked by disulfide bonds between Cys20 and Cys67, Cys50 and Cys57, and Cys159 and Cys188. From 58-68% sequence homology of PLC-B with five class III chitinases, it was concluded that PLC-B is a basic class III chitinase.

  20. Pyruvate decarboxylase from Pisum sativum. Properties, nucleotide and amino acid sequences.

    PubMed

    Mücke, U; Wohlfarth, T; Fiedler, U; Bäumlein, H; Rücknagel, K P; König, S

    1996-04-15

    To study the molecular structure and function of pyruvate decarboxylase (PDC) from plants the protein was isolated from pea seeds and partially characterised. The active enzyme which occurs in the form of higher oligomers consists of two different subunits appearing in SDS/PAGE and mass spectroscopy experiments. For further experiments, like X-ray crystallography, it was necessary to elucidate the protein sequence. Partial cDNA clones encoding pyruvate decarboxylase from seeds of Pisum sativum cv. Miko have been obtained by means of polymerase chain reaction techniques. The first sequences were found using degenerate oligonucleotide primers designed according to conserved amino acid sequences of known pyruvate decarboxylases. The missing parts of one cDNA were amplified applying the 3'- and 5'-rapid amplification of cDNA ends systems. The amino acid sequence deduced from the entire cDNA sequence displays strong similarity to pyruvate decarboxylases from other organisms, especially from plants. A molecular mass of 64 kDa was calculated for this protein correlating with estimations for the smaller subunit of the oligomeric enzyme. The PCR experiments led to at least three different clones representing the middle part of the PDC cDNA indicating the existence of three isozymes. Two of these isoforms could be confirmed on the protein level by sequencing tryptic peptides. Only anaerobically treated roots showed a positive signal for PDC mRNA in Northern analysis although the cDNA from imbibed seeds was successfully used for PCR.

  1. Allelic polymorphism in arabian camel ribonuclease and the amino acid sequence of bactrian camel ribonuclease.

    PubMed

    Welling, G W; Mulder, H; Beintema, J J

    1976-04-01

    Pancreatic ribonucleases from several species (whitetail deer, roe deer, guinea pig, and arabian camel) exhibit more than one amino acid at particular positions in their amino acid sequences. Since these enzymes were isolated from pooled pancreas, the origin of this heterogeneity is not clear. The pancreatic ribonucleases from 11 individual arabian camels (Camelus dromedarius) have been investigated with respect to the lysine-glutamine heterogeneity at position 103 (Welling et al., 1975). Six ribonucleases showed only one basic band and five showed two bands after polyacrylamide gel electrophoresis, suggesting a gene frequency of about 0.75 for the Lys gene and about 0.25 for the Gln gene. The amino acid sequence of bactrian camel (Camelus bactrianus) ribonuclease isolated from individual pancreatic tissue was determined and compared with that of arabian camel ribonuclease. The only difference was observed at position 103. In the ribonucleases from two unrelated bactrian camels, only glutamine was observed at that position. PMID:962846
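
    The gene frequencies quoted in this abstract follow from simple allele counting. The sketch below assumes (the abstract does not say so explicitly) that the six single-band animals are Lys/Lys homozygotes, the five two-band animals are Lys/Gln heterozygotes, and no Gln/Gln homozygotes were sampled; the function name is hypothetical.

```python
def allele_frequencies(n_one_band, n_two_bands):
    """Estimate Lys and Gln allele frequencies at position 103.

    Assumes one-band animals are Lys/Lys and two-band animals are Lys/Gln.
    """
    total_alleles = 2 * (n_one_band + n_two_bands)
    lys_alleles = 2 * n_one_band + n_two_bands  # two per homozygote, one per heterozygote
    gln_alleles = n_two_bands                   # one per heterozygote
    return lys_alleles / total_alleles, gln_alleles / total_alleles

# Six single-band and five two-band camels, as reported above.
lys_freq, gln_freq = allele_frequencies(6, 5)
```

    With these counts the estimate is 17/22 for Lys and 5/22 for Gln, in line with the "about 0.75" and "about 0.25" given in the abstract.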

  2. Phylogenetic Relationships and Genetic Variation in Longidorus and Xiphinema Species (Nematoda: Longidoridae) Using ITS1 Sequences of Nuclear Ribosomal DNA.

    PubMed

    Ye, Weimin; Szalanski, Allen L; Robbins, R T

    2004-03-01

    Genetic analyses using DNA sequences of nuclear ribosomal DNA ITS1 were conducted to determine the extent of genetic variation within and among Longidorus and Xiphinema species. DNA sequences were obtained from samples collected from Arkansas, California and Australia as well as 4 Xiphinema DNA sequences from GenBank. The sequences of the ITS1 region including the 3' end of the 18S rDNA gene and the 5' end of the 5.8S rDNA gene ranged from 1020 bp to 1244 bp for the 9 Longidorus species, and from 870 bp to 1354 bp for the 7 Xiphinema species. Nucleotide frequencies were: A = 25.5%, C = 21.0%, G = 26.4%, and T = 27.1%. Genetic variation between the two genera had a maximum divergence of 38.6% between X. chambersi and L. crassus. Genetic variation among Xiphinema species ranged from 3.8% between X. diversicaudatum and X. bakeri to 29.9% between X. chambersi and X. italiae. Within Longidorus, genetic variation ranged from 8.9% between L. crassus and L. grandis to 32.4% between L. fragilis and L. diadecturus. Intraspecific genetic variation in X. americanum sensu lato ranged from 0.3% to 1.9%, while it was 0.8% in L. diadecturus and ranged from 0.6% to 10.9% in L. biformis. Identical sequences were obtained between the two populations of L. grandis, and between the two populations of X. bakeri. Phylogenetic analyses based on the ITS1 DNA sequence data were conducted on each genus separately using both maximum parsimony and maximum likelihood analysis. Among the Longidorus taxa, 4 subgroups are supported: L. grandis, L. crassus, and L. elongatus are in one cluster; L. biformis and L. paralongicaudatus are in a second cluster; L. fragilis and L. breviannulatus are in a third cluster; and L. diadecturus is in a fourth cluster. Among the Xiphinema taxa, 3 subgroups are supported: X. americanum with X. chambersi, X. bakeri with X. diversicaudatum, and X. italiae and X. vuittenezi forming a sister group with X. index. The relationships observed in this study

  3. Hybridization properties of long nucleic acid probes for detection of variable target sequences, and development of a hybridization prediction algorithm

    PubMed Central

    Öhrmalm, Christina; Jobs, Magnus; Eriksson, Ronnie; Golbob, Sultan; Elfaitouri, Amal; Benachenhou, Farid; Strømme, Maria; Blomberg, Jonas

    2010-01-01

    One of the main problems in nucleic acid-based techniques for detection of infectious agents, such as influenza viruses, is that of nucleic acid sequence variation. DNA probes, 70-nt long, some including the nucleotide analog deoxyribose-Inosine (dInosine), were analyzed for hybridization tolerance to different amounts and distributions of mismatching bases, e.g. synonymous mutations, in target DNA. Microsphere-linked 70-mer probes were hybridized in 3M TMAC buffer to biotinylated single-stranded (ss) DNA for subsequent analysis in a Luminex® system. When mismatches interrupted contiguous matching stretches of 6 nt or longer, it had a strong impact on hybridization. Contiguous matching stretches are more important than the same number of matching nucleotides separated by mismatches into several regions. dInosine, but not 5-nitroindole, substitutions at mismatching positions stabilized hybridization remarkably well, comparable to N (4-fold) wobbles in the same positions. In contrast to shorter probes, 70-nt probes with judiciously placed dInosine substitutions and/or wobble positions were remarkably mismatch tolerant, with preserved specificity. An algorithm, NucZip, was constructed to model the nucleation and zipping phases of hybridization, integrating both local and distant binding contributions. It predicted hybridization more exactly than previous algorithms, and has the potential to guide the design of variation-tolerant yet specific probes. PMID:20864443
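
    The observation that contiguous matching stretches matter more than the raw match count can be made concrete with a small helper. This is not the NucZip algorithm, whose details are not given here; it is only an assumed, simplified way to list the contiguous matching runs in an ungapped alignment. For simplicity the probe is compared with the same-sense sequence it was designed from rather than with its complement.

```python
def matching_stretches(probe, target):
    """Return the lengths of contiguous runs of matching positions.

    Both strings are assumed to be pre-aligned, ungapped, and of equal sense.
    """
    runs = []
    current = 0
    for p, t in zip(probe, target):
        if p == t:
            current += 1
        else:
            if current:
                runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs

def longest_stretch(probe, target):
    """Length of the longest contiguous matching run (0 if none)."""
    return max(matching_stretches(probe, target), default=0)
```

    Two probe-target pairs with the same number of mismatches can then be compared by their run lengths: a single central mismatch leaves two long runs, whereas evenly spaced mismatches leave only short ones.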

  4. Characterization and Sequence Variation in the rDNA Region of Six Nematode Species of the Genus Longidorus (Nematoda)

    PubMed Central

    De Luca, F.; Reyes, A.; Grunder, J.; Kunz, P.; Agostinelli, A.; De Giorgi, C.; Lamberti, F.

    2004-01-01

    Total DNA was isolated from individual nematodes of the species Longidorus helveticus, L. macrosoma, L. arthensis, L. profundorum, L. elongatus, and L. raskii collected in Switzerland. The ITS region and D1-D2 expansion segments of the 26S rDNA were amplified and cloned. The sequences obtained were aligned in order to investigate sequence diversity and to infer the phylogenetic relationships among the six Longidorus species. D1-D2 sequences were more conserved than the ITS sequences that varied widely in primary structure and length, and no consensus was observed. Phylogenetic analyses using the neighbor-joining, maximum parsimony and maximum likelihood methods were performed with three different sequence data sets: ITS1-ITS2, 5.8S-D1-D2, and combining ITS1-ITS2+5.8S-D1-D2 sequences. All multiple alignments yielded similar basic trees supporting the existence of the six species established using morphological characters. These sequence data also provided evidence that the different regions of the rDNA are characterized by different evolution rates and by different factors associated with the generation of extreme size variation. PMID:19262800

  5. Nucleotide sequence of Crithidia fasciculata cytosol 5S ribosomal ribonucleic acid.

    PubMed

    MacKay, R M; Gray, M W; Doolittle, W F

    1980-11-11

    The complete nucleotide sequence of the cytosol 5S ribosomal ribonucleic acid of the trypanosomatid protozoan Crithidia fasciculata has been determined by a combination of T1-oligonucleotide catalog and gel sequencing techniques. The sequence is: GAGUACGACCAUACUUGAGUGAAAACACCAUAUCCCGUCCGAUUUGUGAAGUUAAGCACC CACAGGCUUAGUUAGUACUGAGGUCAGUGAUGACUCGGGAACCCUGAGUGCCGUACUCCCOH. This 5S ribosomal RNA is unique in having GAUU in place of the GAAC or GAUC found in all other prokaryotic and eukaryotic 5S RNAs, and thought to be involved in interactions with tRNAs. Comparisons to other eukaryotic cytosol 5S ribosomal RNA sequences indicate that the four major eukaryotic kingdoms (animals, plants, fungi, and protists) are about equally remote from each other, and that the latter kingdom may be the most internally diverse.

  6. Pattern recognition in nucleic acid sequences. II. An efficient method for finding locally stable secondary structures.

    PubMed Central

    Kanehisa, M I; Goad, W B

    1982-01-01

    We present a method for calculating all possible single hairpin loop secondary structures in a nucleic acid sequence by the order of N^2 operations where N is the total number of bases. Each structure may contain any number of bulges and internal loops. Most natural sequences are found to be indistinguishable from random sequences in the potential of forming secondary structures, which is defined by the frequency of possible secondary structures calculated by the method. There is a strong correlation between the higher G+C content and the higher structure forming potential. Interestingly, the removal of intervening sequences in mRNAs is almost always accompanied by an increase in the G+C content, which may suggest an involvement of structural stabilization in the mRNA maturation. PMID:6174936
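
    To illustrate how hairpin stems can be enumerated in a number of steps quadratic in sequence length, the sketch below fills a table over all base pairs (i, j). It is a deliberately reduced version of the problem: it handles only perfect stems, with no bulges, internal loops, or energy terms, so it should not be read as the published method. The pairing rules (including G-U pairs), the minimum loop size, and the function name are assumptions of the example.

```python
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}

def longest_hairpin_stem(seq, min_loop=3):
    """Find the longest perfect hairpin stem in an RNA or DNA sequence.

    Returns (stem_length, i, j), where (i, j) is the innermost base pair
    closing the loop. Time and memory are both O(N^2).
    """
    seq = seq.upper().replace("T", "U")
    n = len(seq)
    # stem[i][j] = number of stacked pairs (i, j), (i-1, j+1), ... going outward
    stem = [[0] * n for _ in range(n)]
    best = (0, -1, -1)
    for i in range(n):
        for j in range(i + min_loop + 1, n):
            if (seq[i], seq[j]) in PAIRS:
                outer = stem[i - 1][j + 1] if i > 0 and j + 1 < n else 0
                stem[i][j] = outer + 1
                if stem[i][j] > best[0]:
                    best = (stem[i][j], i, j)
    return best
```

    Because each table entry depends only on the entry one step outward, every pair (i, j) is visited once, which is where the quadratic cost comes from.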

  7. Prevalence of Marek's disease virus in different chicken populations in Iraq and indicative virulence based on sequence variation in the ecoRI-q (meq) gene.

    PubMed

    Wajid, Salih J; Katz, Margaret E; Renz, Katrin G; Walkden-Brown, Stephen W

    2013-06-01

    A cross-sectional survey was conducted in six provinces in southern Iraq to determine the point prevalence of Marek's disease virus (MDV) in different chicken populations followed by sequencing the meq gene for phylogenetic analysis and virulence-associated polymorphisms. A total of 109 samples from unvaccinated flocks were analyzed comprising 52 dust and 30 spleen samples from commercial broiler farms and 27 spleens from local layer chickens purchased in the town markets. The overall prevalence of MDV was 49.5% with no significant differences between provinces (P = 0.08) or sample types (P = 0.89). Prevalence ranged from 36.8% in Karbala and Nasiriyah to 65% in Amarah. The percentages of positive samples were 59.1%, 46.7%, and 48.1% in broiler dust, broiler spleen, and layer spleen, respectively. The overall mean (+/- SEM) Log10 MDV viral copy number per milligram of dust or spleen as determined by quantitative PCR was 1.78 +/- 0.19, with no significant differences between provinces (P = 0.10) or sample types (P = 0.38). In positive samples only, the overall mean was 3.43 +/- 0.18. Sequencing of the meq gene from samples that showed high levels of MDV target in qPCR testing was attempted. Nine samples were sequenced. These sequences were compared with meq sequences of MDVs of different pathotype. All the Iraqi MDVs had a short meq gene of 897 base pairs because of the deletion of 123 bp relative to the reference strain Md5. The Iraqi meq sequences also contained single-nucleotide polymorphisms, resulting in differences in the amino acid sequence. All of the nine Iraqi meq genes encoded two repeats of four-proline sequences. The published negative association between four-proline repeat number and MDV virulence suggests that the Iraqi MDVs are likely to be highly virulent, but this needs to be confirmed by in vivo testing. Taken together, these results indicate that MDV is common in unvaccinated commercial and village chickens in southern Iraq, that there is

  8. Seasonal variation in soil nitrogen availability across a fertilization chronosequence in moist acidic tundra

    NASA Astrophysics Data System (ADS)

    McLaren, J. R.; Gough, L.; Weintraub, M. N.

    2012-12-01

    Changes in global climate may result in altered timing of seasonal events including the timing of the spring-thaw and fall freeze-up. In addition to this changing seasonality, arctic environments are experiencing overall increases in nutrient availability caused by climate warming resulting in alterations of plant species composition, such as the observed increases in the abundance of deciduous shrubs. Changing species composition may have large effects on nutrient dynamics in the surrounding ecosystem because of documented differences in how particular plant species influence soil nutrient availability. Although we have some idea of how plant identity influences soil nutrients, soil biogeochemical processes are strongly seasonal, and we have a poor understanding of how plant identity, or nutrient levels, may influence these seasonal patterns. We examined the responses of moist acidic tundra to experimentally increased soil nutrient availability and the accompanying increase in shrub abundance at the Arctic Long Term Ecological Research (LTER) site at Toolik Lake, Alaska. We examined a chrono-sequence of long-term fertilization experiments, composed of experiments fertilized for 5, 15 and 22 years, which has resulted in increasing shrub density with time since fertilization. The fertilized plots receive both nitrogen (N, 10 g/m2/yr) and phosphorus (5 g/m2/yr) annually following snowmelt. In the 2011 growing season we measured variation in soil available N weekly, including measures of ammonium (NH4), nitrate (NO3) and total free amino acids (TFAA). We found that differences between fertilized and control plots depended strongly on both the seasonal timing of measurements, as well as the duration of the fertilization treatment. Early in the growing season fertilization resulted in large increases in available soil N (both NH4 and NO3) across the entire chronosequence. As the season progressed, however, older fertilized plots show evidence of N saturation, where

  9. Efficient Nucleic Acid Extraction and 16S rRNA Gene Sequencing for Bacterial Community Characterization.

    PubMed

    Anahtar, Melis N; Bowman, Brittany A; Kwon, Douglas S

    2016-01-01

    There is a growing appreciation for the role of microbial communities as critical modulators of human health and disease. High throughput sequencing technologies have allowed for the rapid and efficient characterization of bacterial communities using 16S rRNA gene sequencing from a variety of sources. Although readily available tools for 16S rRNA sequence analysis have standardized computational workflows, sample processing for DNA extraction remains a continued source of variability across studies. Here we describe an efficient, robust, and cost effective method for extracting nucleic acid from swabs. We also delineate downstream methods for 16S rRNA gene sequencing, including generation of sequencing libraries, data quality control, and sequence analysis. The workflow can accommodate multiple samples types, including stool and swabs collected from a variety of anatomical locations and host species. Additionally, recovered DNA and RNA can be separated and used for other applications, including whole genome sequencing or RNA-seq. The method described allows for a common processing approach for multiple sample types and accommodates downstream analysis of genomic, metagenomic and transcriptional information. PMID:27168460

  10. Efficient Nucleic Acid Extraction and 16S rRNA Gene Sequencing for Bacterial Community Characterization

    PubMed Central

    Anahtar, Melis N.; Bowman, Brittany A.; Kwon, Douglas S.

    2016-01-01

    There is a growing appreciation for the role of microbial communities as critical modulators of human health and disease. High throughput sequencing technologies have allowed for the rapid and efficient characterization of bacterial communities using 16S rRNA gene sequencing from a variety of sources. Although readily available tools for 16S rRNA sequence analysis have standardized computational workflows, sample processing for DNA extraction remains a continued source of variability across studies. Here we describe an efficient, robust, and cost effective method for extracting nucleic acid from swabs. We also delineate downstream methods for 16S rRNA gene sequencing, including generation of sequencing libraries, data quality control, and sequence analysis. The workflow can accommodate multiple samples types, including stool and swabs collected from a variety of anatomical locations and host species. Additionally, recovered DNA and RNA can be separated and used for other applications, including whole genome sequencing or RNA-seq. The method described allows for a common processing approach for multiple sample types and accommodates downstream analysis of genomic, metagenomic and transcriptional information. PMID:27168460

  11. Design of nucleic acid sequences for DNA computing based on a thermodynamic approach.

    PubMed

    Tanaka, Fumiaki; Kameda, Atsushi; Yamamoto, Masahito; Ohuchi, Azuma

    2005-01-01

    We have developed an algorithm for designing multiple sequences of nucleic acids that have a uniform melting temperature between the sequence and its complement and that do not hybridize non-specifically with each other based on the minimum free energy (ΔG(min)). Sequences that satisfy these constraints can be utilized in computations, various engineering applications such as microarrays, and nano-fabrications. Our algorithm is a random generate-and-test algorithm: it generates a candidate sequence randomly and tests whether the sequence satisfies the constraints. The novelty of our algorithm is that the filtering method uses a greedy search to calculate ΔG(min). This effectively excludes inappropriate sequences before ΔG(min) is calculated, thereby reducing computation time drastically when compared with an algorithm without the filtering. Experimental results in silico showed the superiority of the greedy search over the traditional approach based on the Hamming distance. In addition, experimental results in vitro demonstrated that the experimental free energy (ΔG(exp)) of 126 sequences correlated better with ΔG(min) (|R| = 0.90) than with the Hamming distance (|R| = 0.80). These results validate the rationality of a thermodynamic approach. We implemented our algorithm in a graphic user interface-based program written in Java.
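
    The random generate-and-test loop described in this abstract can be sketched in a few lines of Python. This is a minimal illustration under loud assumptions, not the authors' program: the melting temperature is approximated with the simple Wallace rule, and cross-hybridization is screened with a plain Hamming-distance threshold, which is the traditional criterion the abstract reports as inferior to the free-energy approach. Reverse complements are not checked. All function names, parameters, and thresholds are hypothetical.

```python
import random

BASES = "ACGT"


def wallace_tm(seq):
    """Rough melting temperature (deg C) via the Wallace rule: 2*(A+T) + 4*(G+C)."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def design_sequences(n, length, tm_target, tm_tol, min_dist,
                     seed=0, max_tries=100_000):
    """Generate random candidates and keep those that pass both tests.

    Test 1: melting temperature within tm_tol of tm_target.
    Test 2: Hamming distance >= min_dist to every accepted sequence
            (a stand-in for the thermodynamic check in the abstract).
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(max_tries):
        if len(accepted) == n:
            break
        candidate = "".join(rng.choice(BASES) for _ in range(length))
        if abs(wallace_tm(candidate) - tm_target) > tm_tol:
            continue  # fails the uniform-Tm constraint
        if all(hamming(candidate, s) >= min_dist for s in accepted):
            accepted.append(candidate)
    return accepted


if __name__ == "__main__":
    for s in design_sequences(n=5, length=20, tm_target=60, tm_tol=2, min_dist=8):
        print(s, wallace_tm(s))
```

    Replacing the Hamming test with a free-energy calculation would bring the sketch closer to the thermodynamic approach the abstract describes; the surrounding loop would stay the same.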

  12. Molecular characterization and mitochondrial sequence variation in two sympatric species of Proechimys (Rodentia: Echimyidae) in French Guiana.

    PubMed

    Steiner; Sourrouille; Catzeflis

    2000-12-01

    Variations in mitochondrial DNA characters were used to characterize two morphologically similar and sympatric species of Neotropical terrestrial rodents of the genus Proechimys (Mammalia: Echimyidae). We sequenced both cytochrome b (1140 bp) and part of the control region (445 bp) from four individuals of P. cuvieri and five of P. cayennensis from French Guiana, which allowed us to depict intra- and inter-specific patterns of variation. The phylogenetic relationships between the nine sequences support the monophyly of each species, and illustrate that more polymorphism might exist within P. cuvieri than within P. cayennensis. By developing species-specific primers to amplify a fragment of the cytochrome b gene, we were able to identify 50 individuals of Proechimys spp. caught in two localities of French Guiana. In both sites of primary rainforests, we showed that the two species live in syntopy, and this observation emphasizes the need to document ecological differences which should exist in order to diminish inter-specific competition.

  13. Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)

    ScienceCinema

    Patel, Kamlesh D [Ken; SNL,

    2016-07-12

    Kamlesh (Ken) Patel from Sandia National Laboratories (Livermore, California) presents "Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology" at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June 2012 in Santa Fe, NM.

  14. Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)

    SciTech Connect

    Patel, Kamlesh D; SNL,

    2012-06-01

    Kamlesh (Ken) Patel from Sandia National Laboratories (Livermore, California) presents "Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology" at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June 2012 in Santa Fe, NM.

  15. Studies on adenosine triphosphate transphosphorylases. Amino acid sequence of rabbit muscle ATP-AMP transphosphorylase.

    PubMed

    Kuby, S A; Palmieri, R H; Frischat, A; Fischer, A H; Wu, L H; Maland, L; Manship, M

    1984-05-22

    The total amino acid sequence of rabbit muscle adenylate kinase has been determined, and the single polypeptide chain of 194 amino acid residues starts with N-acetylmethionine and ends with leucyllysine at its carboxyl terminus, in agreement with the earlier data on its amino acid composition [Mahowald, T. A., Noltmann, E. A., & Kuby, S. A. (1962) J. Biol. Chem. 237, 1138-1145] and its carboxyl-terminus sequence [Olson, O. E., & Kuby, S. A. (1964) J. Biol. Chem. 239, 460-467]. Elucidation of the primary structure was based on tryptic and chymotryptic cleavages of the performic acid oxidized protein, cyanogen bromide cleavages of the 14C-labeled S-carboxymethylated protein at its five methionine sites (followed by maleylation of peptide fragments), and tryptic cleavages at its 12 arginine sites of the maleylated 14C-labeled S-carboxymethylated protein. Calf muscle myokinase, whose sequence has also been established, differs primarily from the rabbit muscle myokinase's sequence in the following: His-30 is replaced by Gln-30; Lys-56 is replaced by Met-56; Ala-84 and Asp-85 are replaced by Val-84 and Asn-85. A comparison of the four muscle-type adenylate kinases, whose covalent structures have now been determined, viz., rabbit, calf, porcine, and human [for the latter two sequences see Heil, A., Müller, G., Noda, L., Pinder, T., Schirmer, H., Schirmer, I., & Von Zabern, I. (1974) Eur. J. Biochem. 43, 131-144, and Von Zabern, I., Wittmann-Liebold, B., Untucht-Grau, R., Schirmer, R. H., & Pai, E. F. (1976) Eur. J. Biochem. 68, 281-290], demonstrates an extraordinary degree of homology. (ABSTRACT TRUNCATED AT 250 WORDS)

  16. Phylogenetic and functional analysis of sequence variation of human papillomavirus type 31 E6 and E7 oncoproteins.

    PubMed

    Ferenczi, Annamária; Gyöngyösi, Eszter; Szalmás, Anita; László, Brigitta; Kónya, József; Veress, György

    2016-09-01

    High-risk human papillomaviruses (HPV) are the causative agents of cervical and other anogenital cancers as well as a subset of head and neck cancers. The E6 and E7 oncoproteins of HPV contribute to oncogenesis by associating with the tumour suppressor protein p53 and pRb, respectively. For HPV types 16 and 18, intratypic sequence variation was shown to have biological and clinical significance. The functional significance of sequence variation among HPV 31 variants was studied less intensively. HPV 31 variants belonging to different variant lineages were found to have differences in persistence and in the ability to cause high grade cervical intraepithelial neoplasia. In the present study, we started to explore the functional effects of natural sequence variation of HPV 31 E6 and E7 oncoproteins. The E6 variants were tested for their effects on p53 protein stability and transcriptional activity, while the E7 variants were tested for their effects on pRb protein level and also on the transcriptional activity of E2F transcription factors. HPV 31 E7 variants displayed uniform effects on pRb stability and also on the activity of E2F transcription factors. HPV 31 E6 variants had remarkable differences in the ability to inhibit the trans-activation function of p53 but not in the ability to induce the in vivo degradation of p53. Our results indicate that natural sequence variation of the HPV 31 E6 protein may be involved in the observed differences in the oncogenic potential between HPV 31 variants. PMID:27197052

  17. Deduced amino acid sequence of human pulmonary surfactant proteolipid: SPL(pVal)

    SciTech Connect

    Whitsett, J.A.; Glasser, S.W.; Korfhagen, T.R.; Weaver, T.E.; Clark, J.; Pilot-Matias, T.; Meuth, J.; Fox, J.L.

    1987-05-01

    Hydrophobic, proteolipid-like protein of Mr 6500 was isolated from ether/ethanol extracts of human, canine and bovine pulmonary surfactant. Amino acid composition of the protein demonstrated a remarkable abundance of hydrophobic residues, particularly valine and leucine. The N-terminal amino acid sequence of the human protein was determined: N-Leu-Ile-Pro-Cys-Cys-Pro-Val-Asn-Leu-Lys-Arg-Leu-Leu-Ile-Val4... An oligonucleotide probe was used to screen an adult human lung cDNA library and resulted in detection of cDNA clones with predicted amino acid sequence with close identity to the N-terminal amino acid sequence of the human peptide. SPL(pVal) was found within the reading frame of a larger peptide. SPL(pVal) results from proteolytic processing of a larger preprotein. Northern blot analysis detected a single 1.0-kilobase SPL(pVal) RNA which was less abundant in fetal than in adult lung. Mixtures of purified canine and bovine SPL(pVal) and synthetic phospholipids display properties of rapid adsorption and surface tension lowering activity characteristic of surfactant. Human SPL(pVal) is a pulmonary surfactant proteolipid which may therefore be useful in combination with phospholipids and/or other surfactant proteins for the treatment of surfactant deficiency such as hyaline membrane disease in newborn infants.

  18. Complete amino acid sequence of a human monocyte chemoattractant, a putative mediator of cellular immune reactions.

    PubMed Central

    Robinson, E A; Yoshimura, T; Leonard, E J; Tanaka, S; Griffin, P R; Shabanowitz, J; Hunt, D F; Appella, E

    1989-01-01

    In a study of the structural basis for leukocyte specificity of chemoattractants, we determined the complete amino acid sequence of human glioma-derived monocyte chemotactic factor (GDCF-2), a peptide that attracts human monocytes but not neutrophils. The choice of a tumor cell product for analysis was dictated by its relative abundance and an amino acid composition indistinguishable from that of lymphocyte-derived chemotactic factor (LDCF), the agonist thought to account for monocyte accumulation in cellular immune reactions. By a combination of Edman degradation and mass spectrometry, it was established that GDCF-2 comprises 76 amino acid residues, commencing at the N terminus with pyroglutamic acid. The peptide contains four half-cystines, at positions 11, 12, 36, and 52, which create a pair of loops, clustered at the disulfide bridges. The relative positions of the half-cystines are almost identical to those of monocyte-derived neutrophil chemotactic factor (MDNCF), a peptide of similar mass but with only 24% sequence identity to GDCF. Thus, GDCF and MDNCF have a similar gross secondary structure because of the loops formed by the clustered disulfides, and their different leukocyte specificities are most likely determined by the large differences in primary sequence. PMID:2648385

  19. Amino acid sequences of lower vertebrate parvalbumins and their evolution: parvalbumins of boa, turtle, and salamander.

    PubMed

    Maeda, N; Zhu, D X; Fitch, W M

    1984-11-01

    One major parvalbumin each was isolated from the skeletal muscle of two reptiles, a boa snake, Boa constrictor, and a map turtle, Graptemys geographica, while two parvalbumins were isolated from an amphibian, the salamander Amphiuma means. The amino acid sequences of all four parvalbumins were determined from the sequences of their tryptic peptides, which were ordered partially by homology to other parvalbumins. Phylogenetic study of these and 16 other parvalbumin sequences revealed that the turtle parvalbumin belongs to the beta lineage, while the salamander sequences belong, one each, to the alpha and beta lineages defined by Goodman and Pechère (1977). Boa parvalbumin, however, while belonging to the beta lineage, clusters within the fish in all reasonably parsimonious trees. The most parsimonious trees show many parallel or back mutations in the evolution of many parvalbumin residues, although the residues responsible for Ca2+ binding are very well conserved. These most parsimonious trees show an actinopterygian rather than a crossopterygian origin of the tetrapods in both the alpha and beta groups. One of two electric eel parvalbumins is evolving more than 10 times faster than its paralogous partner, suggesting it may be on its way to becoming a pseudogene. It is concluded that varying rates of amino acid replacement, much homoplasy, considerable gene duplication, plus complicated lineages make the set of parvalbumin sequences unsuitable for systematic study of the origin of the tetrapods and other higher-taxa divergence, although it may be suitable within a genus or family.

  20. Gene-pool variation in Caledonian and European Scots pine (Pinus sylvestris L.) revealed by chloroplast simple-sequence repeats.

    PubMed

    Provan, J; Soranzo, N; Wilson, N J; McNicol, J W; Forrest, G I; Cottrell, J; Powell, W

    1998-09-22

    We have used polymorphic chloroplast simple-sequence repeats to analyse levels of genetic variation within and between seven native Scottish and eight mainland European populations of Scots pine (Pinus sylvestris L.). Diversity levels for the Scottish populations based on haplotype frequency were far in excess of those previously obtained using monoterpenes and isozymes and confirmed lower levels of genetic variation within the derelict population at Glen Falloch. The diversity levels were higher than those reported in similar studies in other Pinus species. An analysis of molecular variance (AMOVA) showed that small (3.24-8.81%) but significant (p ≤ 0.001) portions of the variation existed between the populations and that there was no significant difference between the Scottish and the mainland European populations. Evidence of population substructure was found in the Rannoch population, which exhibited two subgroups. Finally, one of the loci studied exhibited an allele distribution uncharacteristic of the stepwise mutation model of evolution of simple-sequence repeats, and sequencing of the PCR products revealed that this was due to a duplication rather than slippage in the repeat region. An examination of the distribution of this mutation suggests that it may have occurred fairly recently in the Wester Ross region or that it may be evidence of a refugial population.

  1. Variation in Lake Michigan alewife (Alosa pseudoharengus) thiaminase and fatty acids composition

    USGS Publications Warehouse

    Honeyfield, D.C.; Tillitt, D.E.; Fitzsimons, J.D.; Brown, S.B.

    2010-01-01

    Thiaminase activity of alewife (Alosa pseudoharengus) is variable across Lake Michigan, yet factors that contribute to the variability in alewife thiaminase activity are unknown. The fatty acid content of Lake Michigan alewife has not been previously reported. Analysis of 53 Lake Michigan alewives found a positive correlation between thiaminase activity and the following fatty acids: C22:1n9, sum of omega-6 fatty acids (Σω6), and sum of the polyunsaturated fatty acids. Thiaminase activity was negatively correlated with C15:0, C16:0, C17:0, C18:0, C20:0, C22:0, C24:0, C18:1n9t, C20:3n3, C22:2, and the sum of all saturated fatty acids (SAFA). Multi-variant regression analysis resulted in three variables (C18:1n9t, Σω6, SAFA) that explained 71% (R² = 0.71, P < 0.0001) of the variation in thiaminase activity. Because the fatty acid content of an organism is related to its food source, diet may be an important factor modulating alewife thiaminase activity. These data suggest there is an association between fatty acids and thiaminase activity in Lake Michigan alewife.

  2. DNA Cloning of Plasmodium falciparum Circumsporozoite Gene: Amino Acid Sequence of Repetitive Epitope

    NASA Astrophysics Data System (ADS)

    Enea, Vincenzo; Ellis, Joan; Zavala, Fidel; Arnot, David E.; Asavanich, Achara; Masuda, Aoi; Quakyi, Isabella; Nussenzweig, Ruth S.

    1984-08-01

    A clone of complementary DNA encoding the circumsporozoite (CS) protein of the human malaria parasite Plasmodium falciparum has been isolated by screening an Escherichia coli complementary DNA library with a monoclonal antibody to the CS protein. The DNA sequence of the complementary DNA insert encodes a four-amino-acid sequence: proline-asparagine-alanine-asparagine, tandemly repeated 23 times. The CS β-lactamase fusion protein specifically binds monoclonal antibodies to the CS protein and inhibits the binding of these antibodies to native Plasmodium falciparum CS protein. These findings provide a basis for the development of a vaccine against Plasmodium falciparum malaria.
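
    A tandem repeat of the kind reported here (proline-asparagine-alanine-asparagine, written PNAN in one-letter code) can be located in a protein string with a simple scan. The Python sketch below is only an illustration: the function name is hypothetical, and the toy sequence is invented for the example and is not the CS protein sequence.

```python
def longest_tandem_run(seq, motif):
    """Return the largest number of back-to-back copies of motif found in seq."""
    if not motif:
        raise ValueError("motif must be non-empty")
    best = 0
    for start in range(len(seq)):
        run = 0
        pos = start
        # Count consecutive copies of the motif beginning at this position.
        while seq.startswith(motif, pos):
            run += 1
            pos += len(motif)
        best = max(best, run)
    return best


if __name__ == "__main__":
    # Invented toy sequence: three PNAN copies, an interruption, then two more.
    toy = "MK" + "PNAN" * 3 + "PNVD" + "PNAN" * 2 + "GG"
    print(longest_tandem_run(toy, "PNAN"))
```

    Every start position is checked, so motifs that overlap themselves are also handled correctly, at the cost of some repeated work on long sequences.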

  3. Nucleotide and amino acid sequences of human intestinal alkaline phosphatase: close homology to placental alkaline phosphatase

    SciTech Connect

    Henthorn, P.S.; Raducha, M.; Edwards, Y.H.; Weiss, M.J.; Slaughter, C.; Lafferty, M.A.; Harris, H.

    1987-03-01

    A cDNA clone for human adult intestinal alkaline phosphatase (ALP) (orthophosphoric-monoester phosphohydrolase (alkaline optimum); EC 3.1.3.1) was isolated from a λgt11 expression library. The cDNA insert of this clone is 2513 base pairs in length and contains an open reading frame that encodes a 528-amino acid polypeptide. This deduced polypeptide contains the first 40 amino acids of human intestinal ALP, as determined by direct protein sequencing. Intestinal ALP shows 86.5% amino acid identity to placental (type 1) ALP and 56.6% amino acid identity to liver/bone/kidney ALP. In the 3'-untranslated regions, intestinal and placental ALP cDNAs are 73.5% identical (excluding gaps). The evolution of this multigene enzyme family is discussed.
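
    The identity figures quoted in this abstract are percentages over aligned positions. As a minimal sketch of that kind of calculation, and not the authors' procedure, the Python function below counts matching residues over the columns of a pairwise alignment in which neither sequence has a gap. The function name, the gap character, and the toy alignment are assumptions made for illustration.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over ungapped columns of a pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a, aligned_b):
        if x == gap or y == gap:
            continue  # columns containing a gap are excluded
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment: 5 ungapped columns, 4 of them identical.
    print(percent_identity("ACGT-A", "ACCTTA"))
```

    Whether gapped columns count toward the denominator changes the result, which is presumably why the abstract states "excluding gaps" for the 3'-untranslated comparison.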

  4. Gene-related strain variation of Staphylococcus aureus for homologous resistance response to acid stress.

    PubMed

    Lee, Soomin; Ahn, Sooyeon; Lee, Heeyoung; Kim, Won-Il; Kim, Hwang-Yong; Ryu, Jae-Gee; Kim, Se-Ri; Choi, Kyoung-Hee; Yoon, Yohan

    2014-10-01

    This study investigated the effect of adaptation of Staphylococcus aureus strains to the acidic condition of tomato in response to environmental stresses, such as heat and acid. S. aureus ATCC 13565, ATCC 14458, ATCC 23235, ATCC 27664, and NCCP10826 habituated in tomato extract at 35°C for 24 h were inoculated in tryptic soy broth. The culture suspensions were then subjected to heat challenge or acid challenge at 60°C and pH 3.0, respectively, for 60 min. In addition, transcriptional analysis using quantitative real-time PCR was performed to evaluate the expression level of acid-shock genes, such as clpB, zwf, nuoF, and gnd, from five S. aureus strains after the acid habituation of strains in tomato at 35°C for 15 min and 60 min in comparison with that of the nonhabituated strains. In comparison with the nonhabituated strains, the five tomato-habituated S. aureus strains did not show cross protection to heat, but tomato-habituated S. aureus ATCC 23235 showed acid resistance. In quantitative real-time-PCR analysis, the relative expression levels of acid-shock genes (clpB, zwf, nuoF, and gnd) were increased the most in S. aureus ATCC 23235 after 60 min of tomato habituation, but there was little difference in the expression levels among the five S. aureus strains after 15 min of tomato habituation. These results indicate that the variation of acid resistance of S. aureus is related to the expression of acid-shock genes during acid habituation. PMID:25285500

  5. Conservation of the C-type lectin fold for massive sequence variation in a Treponema diversity-generating retroelement.

    PubMed

    Le Coq, Johanne; Ghosh, Partho

    2011-08-30

    Anticipatory ligand binding through massive protein sequence variation is rare in biological systems, having been observed only in the vertebrate adaptive immune response and in a phage diversity-generating retroelement (DGR). Earlier work has demonstrated that the prototypical DGR variable protein, major tropism determinant (Mtd), meets the demands of anticipatory ligand binding by novel means through the C-type lectin (CLec) fold. However, because of the low sequence identity among DGR variable proteins, it has remained unclear whether the CLec fold is a general solution for DGRs. We have addressed this problem by determining the structure of a second DGR variable protein, TvpA, from the pathogenic oral spirochete Treponema denticola. Despite its weak sequence identity to Mtd (∼16%), TvpA was found to also have a CLec fold, with predicted variable residues exposed in a ligand-binding site. However, this site in TvpA was markedly more variable than the one in Mtd, reflecting the unprecedented approximately 10^20 potential variability of TvpA. In addition, similarity between TvpA and Mtd with formylglycine-generating enzymes was detected. These results provide strong evidence for the conservation of the formylglycine-generating enzyme-type CLec fold among DGRs as a means of accommodating massive sequence variation.

  6. Conservation of the C-type lectin fold for massive sequence variation in a Treponema diversity-generating retroelement

    SciTech Connect

    Le Coq, Johanne; Ghosh, Partho

    2012-06-19

    Anticipatory ligand binding through massive protein sequence variation is rare in biological systems, having been observed only in the vertebrate adaptive immune response and in a phage diversity-generating retroelement (DGR). Earlier work has demonstrated that the prototypical DGR variable protein, major tropism determinant (Mtd), meets the demands of anticipatory ligand binding by novel means through the C-type lectin (CLec) fold. However, because of the low sequence identity among DGR variable proteins, it has remained unclear whether the CLec fold is a general solution for DGRs. We have addressed this problem by determining the structure of a second DGR variable protein, TvpA, from the pathogenic oral spirochete Treponema denticola. Despite its weak sequence identity to Mtd (∼16%), TvpA was found to also have a CLec fold, with predicted variable residues exposed in a ligand-binding site. However, this site in TvpA was markedly more variable than the one in Mtd, reflecting the unprecedented approximately 10^20 potential variability of TvpA. In addition, similarity between TvpA and Mtd with formylglycine-generating enzymes was detected. These results provide strong evidence for the conservation of the formylglycine-generating enzyme-type CLec fold among DGRs as a means of accommodating massive sequence variation.

  7. Conservation of the C-type lectin fold for massive sequence variation in a Treponema diversity-generating retroelement

    PubMed Central

    Le Coq, Johanne; Ghosh, Partho

    2011-01-01

    Anticipatory ligand binding through massive protein sequence variation is rare in biological systems, having been observed only in the vertebrate adaptive immune response and in a phage diversity-generating retroelement (DGR). Earlier work has demonstrated that the prototypical DGR variable protein, major tropism determinant (Mtd), meets the demands of anticipatory ligand binding by novel means through the C-type lectin (CLec) fold. However, because of the low sequence identity among DGR variable proteins, it has remained unclear whether the CLec fold is a general solution for DGRs. We have addressed this problem by determining the structure of a second DGR variable protein, TvpA, from the pathogenic oral spirochete Treponema denticola. Despite its weak sequence identity to Mtd (∼16%), TvpA was found to also have a CLec fold, with predicted variable residues exposed in a ligand-binding site. However, this site in TvpA was markedly more variable than the one in Mtd, reflecting the unprecedented approximately 10^20 potential variability of TvpA. In addition, similarity between TvpA and Mtd with formylglycine-generating enzymes was detected. These results provide strong evidence for the conservation of the formylglycine-generating enzyme-type CLec fold among DGRs as a means of accommodating massive sequence variation. PMID:21873231

  8. The Effects of Sequence Variation on Genome-wide NRF2 Binding—New Target Genes and Regulatory SNPs

    PubMed Central

    Kuosmanen, Suvi M.; Viitala, Sari; Laitinen, Tuomo; Peräkylä, Mikael; Pölönen, Petri; Kansanen, Emilia; Leinonen, Hanna; Raju, Suresh; Wienecke-Baldacchino, Anke; Närvänen, Ale; Poso, Antti; Heinäniemi, Merja; Heikkinen, Sami; Levonen, Anna-Liisa

    2016-01-01

    Transcription factor binding specificity is crucial for proper target gene regulation. Motif discovery algorithms identify the main features of the binding patterns, but the accuracy on the lower affinity sites is often poor. Nuclear factor E2-related factor 2 (NRF2) is a ubiquitous redox-activated transcription factor having a key protective role against endogenous and exogenous oxidant and electrophile stress. Herein, we decipher the effects of sequence variation on the DNA binding sequence of NRF2, in order to identify both genome-wide binding sites for NRF2 and disease-associated regulatory SNPs (rSNPs) with drastic effects on NRF2 binding. Interactions between NRF2 and DNA were studied using molecular modelling, and NRF2 chromatin immunoprecipitation-sequence datasets together with protein binding microarray measurements were utilized to study binding sequence variation in detail. The binding model thus generated was used to identify genome-wide binding sites for NRF2, and genomic binding sites with rSNPs that have strong effects on NRF2 binding and reside on active regulatory elements in human cells. As a proof of concept, miR-126–3p and -5p were identified as NRF2 target microRNAs, and a rSNP (rs113067944) residing on NRF2 target gene (Ferritin, light polypeptide, FTL) promoter was experimentally verified to decrease NRF2 binding and result in decreased transcriptional activity. PMID:26826707

  9. Low-frequency normal modes that describe allosteric transitions in biological nanomachines are robust to sequence variations.

    PubMed

    Zheng, Wenjun; Brooks, Bernard R; Thirumalai, D

    2006-05-16

    By representing the high-resolution crystal structures of a number of enzymes using the elastic network model, it has been shown that only a few low-frequency normal modes are needed to describe the large-scale domain movements that are triggered by ligand binding. Here we explore a link between the nearly invariant nature of the modes that describe functional dynamics at the mesoscopic level and the large evolutionary sequence variations at the residue level. By using a structural perturbation method (SPM), which probes the residue-specific response to perturbations (or mutations), we identify a sparse network of strongly conserved residues that transmit allosteric signals in three structurally unrelated biological nanomachines, namely, DNA polymerase, myosin motor, and the Escherichia coli chaperonin. Based on the response of every mode to perturbations, which are generated by interchanging specific sequence pairs in a multiple sequence alignment, we show that the functionally relevant low-frequency modes are most robust to sequence variations. Our work shows that robustness of dynamical modes at the mesoscopic level is encoded in the structure through a sparse network of residues that transmit allosteric signals.

  10. AFLP and DNA sequence variation in an Andean domesticate, pepino (Solanum muricatum, Solanaceae): implications for evolution and domestication.

    PubMed

    Blanca, José M; Prohens, Jaime; Anderson, Gregory J; Zuriaga, Elena; Cañizares, Joaquín; Nuez, Fernando

    2007-07-01

    The pepino (Solanum muricatum) is a vegetatively propagated, domesticated native of the Andes, where it grows with wild relatives. We used AFLPs and a 1-kb sequence of the 3-methylcrotonyl-CoA carboxylase gene to study variation of 27 accessions of S. muricatum and 35 collections of 10 species of wild relatives (Solanum section Basarthrum). A total of 298 AFLP fragments and 29 DNA sequence haplotypes were detected. Cluster and principal coordinate analyses and other genetic parameters estimated from both types of markers, show that S. muricatum is closely related to the species from one of the series (Caripensia) of section Basarthrum and that >90% of the variation of the cultigen is also represented in that series. Pepino is highly diverse, either because it is not monophyletic or it has been subjected to regular introgression with wild species, or both. Although a continuous distribution of the genetic variation occurred within the cultivated species, three genetic clusters were recognized. Cluster 1 is mostly centered in Ecuador, cluster 2 in Ecuador and Peru, and cluster 3 in Colombia and Ecuador. Cluster 3 also includes all modern cultivars studied. These results and other evidence suggest that northern Ecuador/southern Colombia is the main center of pepino diversity and the center of origin. The high genetic variation of this cultigen indicates that domestication does not always produce a genetic bottleneck.

  11. A worldwide survey of genome sequence variation provides insight into the evolutionary history of the honeybee Apis mellifera.

    PubMed

    Wallberg, Andreas; Han, Fan; Wellhagen, Gustaf; Dahle, Bjørn; Kawata, Masakado; Haddad, Nizar; Simões, Zilá Luz Paulino; Allsopp, Mike H; Kandemir, Irfan; De la Rúa, Pilar; Pirk, Christian W; Webster, Matthew T

    2014-10-01

    The honeybee Apis mellifera has major ecological and economic importance. We analyze patterns of genetic variation at 8.3 million SNPs, identified by sequencing 140 honeybee genomes from a worldwide sample of 14 populations at a combined total depth of 634×. These data provide insight into the evolutionary history and genetic basis of local adaptation in this species. We find evidence that population sizes have fluctuated greatly, mirroring historical fluctuations in climate, although contemporary populations have high genetic diversity, indicating the absence of domestication bottlenecks. Levels of genetic variation are strongly shaped by natural selection and are highly correlated with patterns of gene expression and DNA methylation. We identify genomic signatures of local adaptation, which are enriched in genes expressed in workers and in immune system- and sperm motility-related genes that might underlie geographic variation in reproduction, dispersal and disease resistance. This study provides a framework for future investigations into responses to pathogens and climate change in honeybees.

  12. Method for high-volume sequencing of nucleic acids: random and directed priming with libraries of oligonucleotides

    DOEpatents

    Studier, F.W.

    1995-04-18

    Random and directed priming methods for determining nucleotide sequences by enzymatic sequencing techniques, using libraries of primers of lengths 8, 9 or 10 bases, are disclosed. These methods permit direct sequencing of nucleic acids as large as 45,000 base pairs or larger without the necessity for subcloning. Individual primers are used repeatedly to prime sequence reactions in many different nucleic acid molecules. Libraries containing as few as 10,000 octamers, 14,200 nonamers, or 44,000 decamers would have the capacity to determine the sequence of almost any cosmid DNA. Random priming with a fixed set of primers from a smaller library can also be used to initiate the sequencing of individual nucleic acid molecules, with the sequence being completed by directed priming with primers from the library. In contrast to random cloning techniques, a combined random and directed priming strategy is far more efficient. 2 figs.

  13. Method for high-volume sequencing of nucleic acids: random and directed priming with libraries of oligonucleotides

    DOEpatents

    Studier, F. William

    1995-04-18

    Random and directed priming methods for determining nucleotide sequences by enzymatic sequencing techniques, using libraries of primers of lengths 8, 9 or 10 bases, are disclosed. These methods permit direct sequencing of nucleic acids as large as 45,000 base pairs or larger without the necessity for subcloning. Individual primers are used repeatedly to prime sequence reactions in many different nucleic acid molecules. Libraries containing as few as 10,000 octamers, 14,200 nonamers, or 44,000 decamers would have the capacity to determine the sequence of almost any cosmid DNA. Random priming with a fixed set of primers from a smaller library can also be used to initiate the sequencing of individual nucleic acid molecules, with the sequence being completed by directed priming with primers from the library. In contrast to random cloning techniques, a combined random and directed priming strategy is far more efficient.
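
    To put the library sizes in this abstract in context, each can be compared with the number of possible primers of that length, which is 4^k for k bases. The short Python sketch below does only that arithmetic; the library sizes are taken from the abstract, while the function and variable names are hypothetical.

```python
# Library sizes quoted in the abstract, keyed by primer length in bases.
LIBRARY_SIZES = {8: 10_000, 9: 14_200, 10: 44_000}


def library_fraction(k, library_size):
    """Fraction of all 4**k possible k-base primers contained in the library."""
    return library_size / 4 ** k


if __name__ == "__main__":
    for k, size in LIBRARY_SIZES.items():
        total = 4 ** k
        print(f"k={k}: {size} of {total} possible primers "
              f"({library_fraction(k, size):.1%})")
```

    Under this arithmetic the libraries hold roughly 15.3% of all octamers, 5.4% of all nonamers, and 4.2% of all decamers.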

  14. Individual Variation in Lipidomic Profiles of Healthy Subjects in Response to Omega-3 Fatty Acids

    PubMed Central

    Nording, Malin L.; Yang, Jun; Georgi, Katrin; Hegedus Karbowski, Christine; German, J. Bruce; Weiss, Robert H.; Hogg, Ronald J.; Trygg, Johan; Hammock, Bruce D.; Zivkovic, Angela M.

    2013-01-01

    Introduction Conflicting findings in both interventional and observational studies have resulted in a lack of consensus on the benefits of ω3 fatty acids in reducing disease risk. This may be due to individual variability in response. We used a multi-platform lipidomic approach to investigate both the consistent and inconsistent responses of individuals comprehensively to a defined ω3 intervention. Methods The lipidomic profile including fatty acids, lipid classes, lipoprotein distribution, and oxylipins was examined multi- and uni-variately in 12 healthy subjects pre vs. post six weeks of ω3 fatty acids (1.9 g/d eicosapentaenoic acid [EPA] and 1.5 g/d docosahexaenoic acid [DHA]). Results Total lipidomic and oxylipin profiles were significantly different pre vs. post treatment across all subjects (p=0.00007 and p=0.00002 respectively). There was a strong correlation between oxylipin profiles and EPA and DHA incorporated into different lipid classes (r2=0.93). However, strikingly divergent responses among individuals were also observed. Both ω3 and ω6 fatty acid metabolites displayed a large degree of variation among the subjects. For example, in half of the subjects, two arachidonic acid cyclooxygenase products, prostaglandin E2 (PGE2) and thromboxane B2 (TXB2), and a lipoxygenase product, 12-hydroxyeicosatetraenoic acid (12-HETE) significantly decreased post intervention, whereas in the other half they either did not change or increased. The EPA lipoxygenase metabolite 12-hydroxyeicosapentaenoic acid (12-HEPE) varied among subjects from an 82% decrease to a 5,000% increase. Conclusions Our results show that certain defined responses to ω3 fatty acid intervention were consistent across all subjects. However, there was also a high degree of inter-individual variability in certain aspects of lipid metabolism. This lipidomic based phenotyping approach demonstrated that individual responsiveness to ω3 fatty acids is highly variable and measurable, and could be

  15. Reaction sequences in simulated neutralized current acid waste slurry during processing with formic acid

    SciTech Connect

    Smith, H.D.; Wiemers, K.D.; Langowski, M.H.; Powell, M.R.; Larson, D.E.

    1993-11-01

    The Hanford Waste Vitrification Plant (HWVP) is being designed for the Department of Energy to immobilize high-level and transuranic wastes as glass for permanent disposal. Pacific Northwest Laboratory is supporting the HWVP design activities by conducting laboratory-scale studies using a HWVP simulated waste slurry. Conditions which affect the slurry processing chemistry were evaluated in terms of offgas composition and peak generation rate and changes in slurry composition. A standard offgas profile defined in terms of three reaction phases, decomposition of H₂CO₃, destruction of NO₂⁻, and production of H₂ and NH₃ was used as a baseline against which changes were evaluated. The test variables include nitrite concentration, acid neutralization capacity, temperature, and formic acid addition rate. Results to date indicate that pH is an important parameter influencing the N₂O/NOₓ generation ratio; nitrite can both inhibit and activate rhodium as a catalyst for formic acid decomposition to CO₂ and H₂; and a separate reduced metal phase forms in the reducing environment. These data are being compiled to provide a basis for predicting the HWVP feed processing chemistry as a function of feed composition and operation variables, recommending criteria for chemical adjustments, and providing guidelines with respect to important control parameters to consider during routine and upset plant operation.

  16. BLAZAR ANTI-SEQUENCE OF SPECTRAL VARIATION WITHIN INDIVIDUAL BLAZARS: CASES FOR MRK 501 AND 3C 279

    SciTech Connect

    Zhang, Jin; Zhang, Shuang-Nan; Liang, En-Wei

    2013-04-10

    The jet properties of Mrk 501 and 3C 279 are derived by fitting broadband spectral energy distributions (SEDs) with lepton models. The derived γ_b (the break Lorentz factor of the electron distribution) is 10^4-10^6 for Mrk 501 and 200 ~ 600 for 3C 279. The magnetic field strength (B) of Mrk 501 is usually one order of magnitude lower than that of 3C 279, but their Doppler factors (δ) are comparable. A spectral variation feature where the peak luminosity is correlated with the peak frequency, which is opposite from the blazar sequence, is observed in the two sources. We find that (1) the peak luminosities of the two bumps in the SEDs for Mrk 501 depend on γ_b in both the observer and co-moving frames, but they are not correlated with B and δ and (2) the luminosity variation of 3C 279 is dominated by the external Compton (EC) peak and its peak luminosity is correlated with γ_b and δ, but anti-correlated with B. These results suggest that γ_b may govern the spectral variation of Mrk 501 and δ and B would be responsible for the spectral variation of 3C 279. The narrow distribution of γ_b and the correlation of γ_b and B in 3C 279 would be due to the cooling from the EC process and the strong magnetic field. Based on our brief discussion, we propose that this spectral variation feature may originate from the instability of the corona but not from the variation of the accretion rate as does the blazar sequence.

  17. The complete amino acid sequence of lectin-C from the roots of pokeweed (Phytolacca americana).

    PubMed

    Yamaguchi, K; Mori, A; Funatsu, G

    1995-07-01

    The complete amino acid sequence of pokeweed lectin-C (PL-C) consisting of 126 residues has been determined. PL-C is an acidic simple protein with molecular mass of 13,747 Da and consists of three cysteine-rich domains with 51-63% homology. PL-C shows homology to chitin-binding proteins such as wheat germ agglutinin, and all eight cysteine residues in the three domains of PL-C are completely conserved in all other chitin-binding domains.

  18. Amino-acid sequence of a cooperative, dimeric myoglobin from the gastropod mollusc, Buccinum undatum L.

    PubMed

    Wen, D; Laursen, R A

    1994-10-19

    The complete amino-acid sequence of a dimeric myoglobin from the radular muscle of the gastropod mollusc, Buccinum undatum L. has been determined. The globin, which shows cooperative binding of oxygen, contains 146 amino acids, is N-terminal aminoacetylated, and has histidine residues at positions 65 and 97, corresponding to the heme-binding histidines seen in mammalian myoglobins. It shows about 75% and 50% homology, respectively, with the dimeric molluscan myoglobins from Busycon canaliculatum and Cerithidea rhizophorarum, the former of which also shows weak cooperativity, but much less similarity to other species of myoglobin and hemoglobin.

  19. The Complete Genome Sequence of the Lactic Acid Bacterium Lactococcus lactis ssp. lactis IL1403

    PubMed Central

    Bolotin, Alexander; Wincker, Patrick; Mauger, Stéphane; Jaillon, Olivier; Malarme, Karine; Weissenbach, Jean; Ehrlich, S. Dusko; Sorokin, Alexei

    2001-01-01

    Lactococcus lactis is a nonpathogenic AT-rich gram-positive bacterium closely related to the genus Streptococcus and is the most commonly used cheese starter. It is also the best-characterized lactic acid bacterium. We sequenced the genome of the laboratory strain IL1403, using a novel two-step strategy that comprises diagnostic sequencing of the entire genome and a shotgun polishing step. The genome contains 2,365,589 base pairs and encodes 2310 proteins, including 293 protein-coding genes belonging to six prophages and 43 insertion sequence (IS) elements. Nonrandom distribution of IS elements indicates that the chromosome of the sequenced strain may be a product of recent recombination between two closely related genomes. A complete set of late competence genes is present, indicating the ability of L. lactis to undergo DNA transformation. Genomic sequence revealed new possibilities for fermentation pathways and for aerobic respiration. It also indicated a horizontal transfer of genetic information from Lactococcus to gram-negative enteric bacteria of Salmonella-Escherichia group. [The sequence data described in this paper has been submitted to the GenBank data library under accession no. AE005176.] PMID:11337471

  20. Amino acid sequence of human cholinesterase. Annual report, 30 September 1984-30 September 1985

    SciTech Connect

    Lockridge, O.

    1985-10-01

    The active-site serine residue is located 198 amino acids from the N-terminal. The active-site peptide was isolated from three different genetic types of human serum cholinesterase: from usual, atypical, and atypical-silent genotypes. It was found that the amino acid sequence of the active-site peptide was identical in all three genotypes. Comparison of the complete sequences of cholinesterase from human serum and acetylcholinesterase from the electric organ of Torpedo californica shows an identity of 53%. Cholinesterase is of interest to the Department of Defense because cholinesterase protects against organophosphate poisons of the type used in chemical warfare. The structural results presented here will serve as the basis for cloning the gene for cholinesterase. The potential uses of large amounts of cholinesterase would be for cleaning up spills of organophosphates and possibly for detoxifying exposed personnel.

  1. Amino acid sequence differences in pancreatic ribonucleases from water buffalo breeds from Indonesia and Italy.

    PubMed

    Sidik, A; Martena, B; Beintema, J J

    1979-12-01

    The amino acid sequences of the pancreatic ribonucleases from river-breed water buffaloes from Italy and swamp-breed water buffaloes from Indonesia differ at three positions. One of the differences involves a replacement of asparagine-34, with covalently attached carbohydrate on all molecules, in the river-breed enzyme by serine in the swamp-breed enzyme. The ribonuclease content of the pancreas differs considerably between breeds and is lower in river buffaloes. A ribonuclease preparation from two swamp buffaloes contained a minor glycosylated component. Preliminary evidence was obtained that the amino acid sequence of this component has factors in common with the main component of the swamp-breed ribonuclease and with the river-breed enzyme.

  2. Stereochemical Sequence Ion Selectivity: Proline versus Pipecolic-acid-containing Protonated Peptides

    NASA Astrophysics Data System (ADS)

    Abutokaikah, Maha T.; Guan, Shanshan; Bythell, Benjamin J.

    2016-10-01

    Substitution of proline by pipecolic acid, the six-membered ring congener of proline, results in vastly different tandem mass spectra. The well-known proline effect is eliminated and amide bond cleavage C-terminal to pipecolic acid dominates instead. Why do these two ostensibly similar residues produce dramatically differing spectra? Recent evidence indicates that the proton affinities of these residues are similar, so are unlikely to explain the result [Raulfs et al., J. Am. Soc. Mass Spectrom. 25, 1705-1715 (2014)]. An additional hypothesis based on increased flexibility was also advocated. Here, we provide a computational investigation of the "pipecolic acid effect," to test this and other hypotheses to determine if theory can shed additional light on this fascinating result. Our calculations provide evidence for both the increased flexibility of pipecolic-acid-containing peptides, and structural changes in the transition structures necessary to produce the sequence ions. The most striking computational finding is inversion of the stereochemistry of the transition structures leading to "proline effect"-type amide bond fragmentation between the proline/pipecolic acid-congeners: R (proline) to S (pipecolic acid). Additionally, our calculations predict substantial stabilization of the amide bond cleavage barriers for the pipecolic acid congeners by reduction in deleterious steric interactions and provide evidence for the importance of experimental energy regime in rationalizing the spectra.

  3. Complete plastid genome sequence of Primula sinensis (Primulaceae): structure comparison, sequence variation and evidence for accD transfer to nucleus.

    PubMed

    Liu, Tong-Jian; Zhang, Cai-Yun; Yan, Hai-Fei; Zhang, Lu; Ge, Xue-Jun; Hao, Gang

    2016-01-01

    The species-rich genus Primula L. is a typical plant group with which to study genetic variation between species at different levels of relatedness. Chloroplast genome sequences serve as an information resource for quantifying this difference and reconstructing evolutionary history. In this study, we reported the complete chloroplast genome sequence of Primula sinensis and compared it with other related species. The chloroplast genome showed a typical circular quadripartite structure, 150,859 bp in length, with a GC content of 37.2%. Two inverted repeat regions (25,535 bp each) were separated by a large single-copy region (82,064 bp) and a small single-copy region (17,725 bp). The genome consists of 112 genes, including 78 protein-coding genes, 30 tRNA genes and four rRNA genes. Among them, seven coding genes, seven tRNA genes and four rRNA genes have two copies due to their locations in the IR regions. The accD and infA genes lacking intact open reading frames (ORF) were identified as pseudogenes. SSR and sequence variation analyses were also performed on the plastome of Primula sinensis, compared with another available plastome, that of P. poissonii. The four most variable regions, rpl36-rps8, rps16-trnQ, trnH-psbA and ndhC-trnV, were identified. Phylogenetic relationship estimates using three sub-datasets extracted from a matrix of 57 protein-coding gene sequences gave identical results that were consistent with previous studies. A transcript found in the P. sinensis transcriptome showed high similarity to the plastid accD functional region, and a putative plastid transit peptide was identified at its N-terminal region. The result strongly suggested that plastid accD has been functionally transferred to the nucleus in P. sinensis.

  4. Complete plastid genome sequence of Primula sinensis (Primulaceae): structure comparison, sequence variation and evidence for accD transfer to nucleus

    PubMed Central

    Liu, Tong-Jian; Zhang, Cai-Yun; Yan, Hai-Fei; Zhang, Lu

    2016-01-01

    The species-rich genus Primula L. is a typical plant group with which to study genetic variation between species at different levels of relatedness. Chloroplast genome sequences serve as an information resource for quantifying this difference and reconstructing evolutionary history. In this study, we reported the complete chloroplast genome sequence of Primula sinensis and compared it with other related species. The chloroplast genome showed a typical circular quadripartite structure, 150,859 bp in length, with a GC content of 37.2%. Two inverted repeat regions (25,535 bp each) were separated by a large single-copy region (82,064 bp) and a small single-copy region (17,725 bp). The genome consists of 112 genes, including 78 protein-coding genes, 30 tRNA genes and four rRNA genes. Among them, seven coding genes, seven tRNA genes and four rRNA genes have two copies due to their locations in the IR regions. The accD and infA genes lacking intact open reading frames (ORF) were identified as pseudogenes. SSR and sequence variation analyses were also performed on the plastome of Primula sinensis, compared with another available plastome, that of P. poissonii. The four most variable regions, rpl36–rps8, rps16–trnQ, trnH–psbA and ndhC–trnV, were identified. Phylogenetic relationship estimates using three sub-datasets extracted from a matrix of 57 protein-coding gene sequences gave identical results that were consistent with previous studies. A transcript found in the P. sinensis transcriptome showed high similarity to the plastid accD functional region, and a putative plastid transit peptide was identified at its N-terminal region. The result strongly suggested that plastid accD has been functionally transferred to the nucleus in P. sinensis. PMID:27375965
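
    The region lengths and gene counts quoted in this abstract can be cross-checked with simple arithmetic: a quadripartite plastome contains two inverted repeat copies plus the large and small single-copy regions. The sketch below does only that; the constant and function names are illustrative assumptions.

```python
# Region lengths in base pairs, as quoted in the abstract.
IR_LENGTH = 25_535      # one inverted repeat copy
LSC_LENGTH = 82_064     # large single-copy region
SSC_LENGTH = 17_725     # small single-copy region
REPORTED_TOTAL = 150_859

def quadripartite_length(ir: int, lsc: int, ssc: int) -> int:
    """Total length of a circular plastome with two IR copies, one LSC, one SSC."""
    return 2 * ir + lsc + ssc

def unique_gene_total(protein_coding: int, trna: int, rrna: int) -> int:
    """Sum of the gene classes listed in the abstract."""
    return protein_coding + trna + rrna

if __name__ == "__main__":
    total = quadripartite_length(IR_LENGTH, LSC_LENGTH, SSC_LENGTH)
    print("Computed length:", total, "| matches reported:", total == REPORTED_TOTAL)
    print("Gene total:", unique_gene_total(78, 30, 4))
```

    Both sums agree with the reported figures (150,859 bp and 112 genes), which also confirms that the 25,535 bp value refers to each inverted repeat copy.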

  5. Reprint of "Identification of staphylococcal species based on variations in protein sequences (mass spectrometry) and DNA sequence (sodA microarray)".

    PubMed

    Kooken, Jennifer; Fox, Karen; Fox, Alvin; Altomare, Diego; Creek, Kim; Wunschel, David; Pajares-Merino, Sara; Martínez-Ballesteros, Ilargi; Garaizar, Javier; Oyarzabal, Omar; Samadpour, Mansour

    2014-01-01

    This report is among the first using sequence variation in newly discovered protein markers for staphylococcal (or indeed any other bacterial) speciation. Variation, at the DNA sequence level, in the sodA gene (commonly used for staphylococcal speciation) provided excellent correlation. Relatedness among strains was also assessed using protein profiling using microcapillary electrophoresis and pulsed field electrophoresis. A total of 64 strains were analyzed including reference strains representing the 11 staphylococcal species most commonly isolated from man (Staphylococcus aureus and 10 coagulase negative species [CoNS]). Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI TOF MS) and liquid chromatography-electrospray ionization tandem mass spectrometry (LC ESI MS/MS) were used for peptide analysis of proteins isolated from gel bands. Comparison of experimental spectra of unknowns versus spectra of peptides derived from reference strains allowed bacterial identification after MALDI TOF MS analysis. After LC-MS/MS analysis of gel bands, bacterial speciation was performed by comparing experimental spectra versus virtual spectra using the software X!Tandem. Finally, LC-MS/MS was performed on whole proteomes, with data analysis also employing X!Tandem. Aconitate hydratase and oxoglutarate dehydrogenase served as marker proteins on focused analysis after gel separation. Alternatively, on full proteomics analysis, elongation factor Tu generally provided the highest confidence in staphylococcal speciation.

  6. Optimization of shRNA inhibitors by variation of the terminal loop sequence.

    PubMed

    Schopman, Nick C T; Liu, Ying Poi; Konstantinova, Pavlina; ter Brake, Olivier; Berkhout, Ben

    2010-05-01

    Gene silencing by RNA interference (RNAi) can be achieved by intracellular expression of a short hairpin RNA (shRNA) that is processed into the effective small interfering RNA (siRNA) inhibitor by the RNAi machinery. Previous studies indicate that shRNA molecules do not always reflect the activity of corresponding synthetic siRNAs that attack the same target sequence. One obvious difference between these two effector molecules is the hairpin loop of the shRNA. Most studies use the original shRNA design of the pSuper system, but no extensive study regarding optimization of the shRNA loop sequence has been performed. We tested the impact of different hairpin loop sequences, varying in size and structure, on the activity of a set of shRNAs targeting HIV-1. We were able to transform weak inhibitors into intermediate or even strong shRNA inhibitors by replacing the loop sequence. We demonstrate that the efficacy of these optimized shRNA inhibitors is improved significantly in different cell types due to increased siRNA production. These results indicate that the loop sequence is an essential part of the shRNA design. The optimized shRNA loop sequence is generally applicable for RNAi knockdown studies, and will allow us to develop a more potent gene therapy against HIV-1. PMID:20188764

  7. Characterizing novel endogenous retroviruses from genetic variation inferred from short sequence reads

    PubMed Central

    Mourier, Tobias; Mollerup, Sarah; Vinner, Lasse; Hansen, Thomas Arn; Kjartansdóttir, Kristín Rós; Guldberg Frøslev, Tobias; Snogdal Boutrup, Torsten; Nielsen, Lars Peter; Willerslev, Eske; Hansen, Anders J.

    2015-01-01

    From Illumina sequencing of DNA from brain and liver tissue from the lion, Panthera leo, and tumor samples from the pike-perch, Sander lucioperca, we obtained two assembled sequence contigs with similarity to known retroviruses. Phylogenetic analyses suggest that the pike-perch retrovirus belongs to the epsilonretroviruses, and the lion retrovirus to the gammaretroviruses. To determine if these novel retroviral sequences originate from an endogenous retrovirus or from a recently integrated exogenous retrovirus, we assessed the genetic diversity of the parental sequences from which the short Illumina reads are derived. First, we showed by simulations that we can robustly infer the level of genetic diversity from short sequence reads. Second, we find that the measures of nucleotide diversity inferred from our retroviral sequences significantly exceed the level observed from Human Immunodeficiency Virus infections, prompting us to conclude that the novel retroviruses are both of endogenous origin. Through further simulations, we rule out the possibility that the observed elevated levels of nucleotide diversity are the result of co-infection with two closely related exogenous retroviruses. PMID:26493184

  8. Self-sequencing of amino acids and origins of polyfunctional protocells

    NASA Technical Reports Server (NTRS)

    Fox, S. W.

    1984-01-01

    The role of proteins in the origin of living things is discussed. It has been experimentally established that amino acids can sequence themselves under simulated geological conditions with highly nonrandom products which accordingly contain diverse information. Multiple copies of each type of macromolecule are formed, resulting in greater power for any protoenzymic molecule than would accrue from a single copy of each type. Thermal proteins are readily incorporated into laboratory protocells. The experimental evidence for original polyfunctional protocells is discussed.

  9. Population-genomic variation within RNA viruses of the Western honey bee, Apis mellifera, inferred from deep sequencing

    PubMed Central

    2013-01-01

    Background Deep sequencing of viruses isolated from infected hosts is an efficient way to measure population-genetic variation and can reveal patterns of dispersal and natural selection. In this study, we mined existing Illumina sequence reads to investigate single-nucleotide polymorphisms (SNPs) within two RNA viruses of the Western honey bee (Apis mellifera), deformed wing virus (DWV) and Israel acute paralysis virus (IAPV). All viral RNA was extracted from North American samples of honey bees or, in one case, the ectoparasitic mite Varroa destructor. Results Coverage depth was generally lower for IAPV than DWV, and marked gaps in coverage occurred in several narrow regions (< 50 bp) of IAPV. These coverage gaps occurred across sequencing runs and were virtually unchanged when reads were re-mapped with greater permissiveness (up to 8% divergence), suggesting a recurrent sequencing artifact rather than strain divergence. Consensus sequences of DWV for each sample showed little phylogenetic divergence, low nucleotide diversity, and strongly negative values of Fu and Li’s D statistic, suggesting a recent population bottleneck and/or purifying selection. The Kakugo strain of DWV fell outside of all other DWV sequences at 100% bootstrap support. IAPV consensus sequences supported the existence of multiple clades as had been previously reported, and Fu and Li’s D was closer to neutral expectation overall, although a sliding-window analysis identified a significantly positive D within the protease region, suggesting selection maintains diversity in that region. Within-sample mean diversity was comparable between the two viruses on average, although for both viruses there was substantial variation among samples in mean diversity at third codon positions and in the number of high-diversity sites. FST values were bimodal for DWV, likely reflecting neutral divergence in two low-diversity populations, whereas IAPV had several sites that were strong outliers with very low

  10. Comparisons of the Distribution of Nucleotides and Common Sequences in Deoxyribonucleic Acid from Selected Bacteriophages

    PubMed Central

    Skalka, A.; Hanson, P.

    1972-01-01

    Results from comparisons of deoxyribonucleic acid (DNA) from several classes of bacteriophages suggest that most phage chromosomes contain either a homogeneous distribution of nucleotides or are made up of a few, rather large segments of different guanine plus cytosine (G + C) contents which are internally homogeneous. Among those temperate phages tested, most contained segmented DNA. Comparisons of sequence similarities among segments from lambdoid phage DNA species revealed the following order in relatedness to λ: 82 (and 434) > 21 > 424 > φ80. Most common sequences are found in the highest G + C segments, which in λ contain head and tail genes. Hybridization tests with λ and 186 or P2 DNA species verified that the lambdoids and 186 and P2 belong to two distinct groups. There are fewer homologous sequences between the DNA species of coliphages λ and P2 or 186 than there are between the DNA species of coliphage λ and salmonella phage P22. PMID:4553679

  11. Structure of the fully modified left-handed cyclohexene nucleic acid sequence GTGTACAC.

    PubMed

    Robeyns, Koen; Herdewijn, Piet; Van Meervelt, Luc

    2008-02-13

    CeNA oligonucleotides consist of a phosphorylated backbone where the deoxyribose sugars are replaced by cyclohexene moieties. The X-ray structure determination and analysis of a fully modified octamer sequence GTGTACAC, which is the first crystal structure of a carbocyclic-based nucleic acid, is presented. This particular sequence was built with left-handed building blocks and crystallizes as a left-handed double helix. The helix can be characterized as belonging to the (mirrored) A-type family. Crystallographic data were processed up to 1.53 Å, and the octamer sequence crystallizes in the space group R32. The sugar puckering is found to adopt the 3H2 half-chair conformation which mimics the C3'-endo conformation of the ribose sugar. The double helices stack on top of each other to form continuous helices, and static disorder is observed due to this end-to-end stacking.
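
    One property of the octamer GTGTACAC that can be checked directly is that it equals its own reverse complement, so two copies of the same strand can pair with each other. The sketch below tests this, assuming standard Watson-Crick base pairing (A with T, G with C); the abstract does not spell out the pairing scheme, and the function names are illustrative.

```python
# Watson-Crick complements for the four DNA bases (assumed pairing scheme).
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def is_self_complementary(seq: str) -> bool:
    """True if the sequence is identical to its own reverse complement."""
    return reverse_complement(seq) == seq

if __name__ == "__main__":
    octamer = "GTGTACAC"
    print(octamer, "->", reverse_complement(octamer))
    print("Self-complementary:", is_self_complementary(octamer))
```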

  12. Amino acid sequence of a protease inhibitor isolated from Sarcophaga bullata determined by mass spectrometry.

    PubMed

    Papayannopoulos, I A; Biemann, K

    1992-02-01

    The amino acid sequence of a protease inhibitor isolated from the hemolymph of Sarcophaga bullata larvae was determined by tandem mass spectrometry. Homology considerations with respect to other protease inhibitors with known primary structures assisted in the choice of the procedure followed in the sequence determination and in the alignment of the various peptides obtained from specific chemical cleavage at cysteines and enzyme digests of the S. bullata protease inhibitor. The resulting sequence of 57 residues is as follows: Val Asp Lys Ser Ala Cys Leu Gln Pro Lys Glu Val Gly Pro Cys Arg Lys Ser Asp Phe Val Phe Phe Tyr Asn Ala Asp Thr Lys Ala Cys Glu Glu Phe Leu Tyr Gly Gly Cys Arg Gly Asn Asp Asn Arg Phe Asn Thr Lys Glu Glu Cys Glu Lys Leu Cys Leu.
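
    The three-letter sequence above is easier to check and reuse in one-letter form. The sketch below converts it with the standard amino acid codes and counts residues and cysteines; the variable and function names are illustrative assumptions, and the sequence is copied from the abstract.

```python
# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Sequence exactly as listed in the abstract.
INHIBITOR_SEQUENCE = (
    "Val Asp Lys Ser Ala Cys Leu Gln Pro Lys Glu Val Gly Pro Cys Arg Lys "
    "Ser Asp Phe Val Phe Phe Tyr Asn Ala Asp Thr Lys Ala Cys Glu Glu Phe "
    "Leu Tyr Gly Gly Cys Arg Gly Asn Asp Asn Arg Phe Asn Thr Lys Glu Glu "
    "Cys Glu Lys Leu Cys Leu"
)

def to_one_letter(three_letter_sequence: str) -> str:
    """Convert a whitespace-separated three-letter sequence to one-letter code."""
    return "".join(THREE_TO_ONE[res] for res in three_letter_sequence.split())

if __name__ == "__main__":
    one_letter = to_one_letter(INHIBITOR_SEQUENCE)
    print(one_letter)
    print("Residues:", len(one_letter))
    print("Cysteines:", one_letter.count("C"))
```

    The residue count from this conversion is 57, in agreement with the length stated in the abstract.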

  13. Fatty Acid Profile and Unigene-Derived Simple Sequence Repeat Markers in Tung Tree (Vernicia fordii)

    PubMed Central

    Zhang, Lin; Jia, Baoguang; Tan, Xiaofeng; Thammina, Chandra S.; Long, Hongxu; Liu, Min; Wen, Shanna; Song, Xianliang; Cao, Heping

    2014-01-01

    Tung tree (Vernicia fordii) provides the sole source of tung oil widely used in industry. Lack of fatty acid composition data and molecular markers hinders biochemical, genetic and breeding research. The objectives of this study were to determine fatty acid profiles and develop unigene-derived simple sequence repeat (SSR) markers in tung tree. Fatty acid profiles of 41 accessions showed that the ratio of α-eleostearic acid increased continuously, in parallel with tung oil accumulation, while the ratios of other fatty acids decreased across the stages of seed development, and that α-eleostearic acid (18∶3) constituted 77% of the total fatty acids in tung oil. Transcriptome sequencing identified 81,805 unigenes from a tung cDNA library constructed using seed mRNA and discovered 6,366 SSRs in 5,404 unigenes. The di- and tri-nucleotide microsatellites accounted for 92% of the SSRs, with AG/CT and AAG/CTT being the most abundant SSR motifs. Fifteen polymorphic genic-SSR markers were developed from 98 unigene loci tested in 41 cultivated tung accessions by agarose gel and capillary electrophoresis. A GenBank database search identified 10 of them as putatively coding for functional proteins. Quantitative PCR demonstrated that all 15 polymorphic SSR-associated unigenes were expressed in tung seeds and some of them were highly correlated with oil composition in the seeds. A dendrogram revealed that most of the 41 accessions were clustered according to geographic region. These new polymorphic genic-SSR markers will facilitate future studies on genetic diversity, molecular fingerprinting, comparative genomics and genetic mapping in tung tree. The lipid profiles in the seeds of 41 tung accessions will be valuable for biochemical and breeding studies. PMID:25167054
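
    As an illustration of what finding di- and tri-nucleotide SSRs in a unigene involves, the sketch below scans a sequence for tandem repeats with a regular expression. It is not the tool used in the study: the function name, the minimum repeat thresholds, and the example sequences are all assumptions chosen for the demonstration.

```python
import re

def find_ssrs(seq: str, motif_len: int, min_repeats: int):
    """Find tandem repeats of a motif_len-base motif repeated >= min_repeats times.

    Returns a list of (start_index, motif, repeat_count) tuples.
    Runs of a single base (e.g. AAAAAA) are skipped, since they are
    mononucleotide repeats rather than di- or tri-nucleotide SSRs.
    """
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # homopolymer run, not a true di/tri-nucleotide motif
        repeats = len(match.group(0)) // motif_len
        hits.append((match.start(), motif, repeats))
    return hits

if __name__ == "__main__":
    # Hypothetical sequences; thresholds of 6 (di) and 5 (tri) are assumed.
    print(find_ssrs("TTAGAGAGAGAGAGCC", motif_len=2, min_repeats=6))
    print(find_ssrs("CCAAGAAGAAGAAGAAGTT", motif_len=3, min_repeats=5))
```

    A motif and its reverse complement (for example AG and CT) describe the same repeat on opposite strands, which is why the abstract reports them as AG/CT and AAG/CTT.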

  14. Cis-regulatory sequence variation and association with Mycoplasma load in natural populations of the house finch (Carpodacus mexicanus)

    PubMed Central

    Backström, Niclas; Shipilina, Daria; Blom, Mozes P K; Edwards, Scott V

    2013-01-01

    Characterization of the genetic basis of fitness traits in natural populations is important for understanding how organisms adapt to the changing environment and to novel events, such as epizootics. However, candidate fitness-influencing loci, such as regulatory regions, are usually unavailable in nonmodel species. Here, we analyze sequence data from targeted resequencing of the cis-regulatory regions of three candidate genes for disease resistance (CD74, HSP90α, and LCP1) in populations of the house finch (Carpodacus mexicanus) historically exposed (Alabama) and naïve (Arizona) to Mycoplasma gallisepticum. Our study, the first to quantify variation in regulatory regions in wild birds, reveals that the upstream regions of CD74 and HSP90α are GC-rich, with the former exhibiting unusually low sequence variation for this species. We identified two SNPs, located in a GC-rich region immediately upstream of an inferred promoter site in the gene HSP90α, that were significantly associated with Mycoplasma pathogen load in the two populations. The SNPs are closely linked and situated in potential regulatory sequences: one in a binding site for the transcription factor nuclear NFYα and the other in a dinucleotide microsatellite ((GC)6). The genotype associated with pathogen load in the putative NFYα binding site was significantly overrepresented in the Alabama birds. However, we did not see strong effects of selection at this SNP, perhaps because selection has acted on standing genetic variation over an extremely short time in a highly recombining region. Our study is a useful starting point to explore functional relationships between sequence polymorphisms, gene expression, and phenotypic traits, such as pathogen resistance that affect fitness in the wild. PMID:23532859

  15. Sequence variation within the KIV-2 copy number polymorphism of the human LPA gene in African, Asian, and European populations.

    PubMed

    Noureen, Asma; Fresser, Friedrich; Utermann, Gerd; Schmidt, Konrad

    2015-01-01

    Amazingly little sequence variation is reported for the kringle IV 2 copy number variation (KIV 2 CNV) in the human LPA gene. Apart from whole genome sequencing projects, this region has only been analyzed in some detail in samples of European populations. We have performed a systematic resequencing study of the exonic and flanking intron regions within the KIV 2 CNV in 90 alleles from Asian, European, and four different African populations. Alleles have been separated according to their CNV length by pulsed field gel electrophoresis prior to unbiased specific PCR amplification of the target regions. These amplicons covered all KIV 2 copies of an individual allele simultaneously. In addition, cloned amplicons from genomic DNA of an African individual were sequenced. Our data suggest that sequence variation in this genomic region may be higher than previously appreciated. Detection probability of variants appeared to depend on the KIV 2 copy number of the analyzed DNA and on the proportion of copies carrying the variant. Asians had a high frequency of so-called KIV 2 type B and type C (together 70% of alleles), which differ by three or two synonymous substitutions respectively from the reference type A. This is most likely explained by the strong bottleneck suggested to have occurred when modern humans migrated to East Asia. A higher frequency of variable sites was detected in the Africans. In particular, two previously unreported splice site variants were found. One was associated with non-detectable Lp(a). The other was observed at high population frequencies (10% to 40%). Like the KIV 2 type B and C variants, this latter variant was also found in a high proportion of KIV 2 repeats in the affected alleles and in alleles differing in copy numbers. Our findings may have implications for the interpretation of SNP analyses in other repetitive loci of the human genome.

  16. Whole Genome Re-Sequencing and Characterization of Powdery Mildew Disease-Associated Allelic Variation in Melon.

    PubMed

    Natarajan, Sathishkumar; Kim, Hoy-Taek; Thamilarasan, Senthil Kumar; Veerappan, Karpagam; Park, Jong-In; Nou, Ill-Sup

    2016-01-01

    Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L.) and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, 'SCNU1154', 'Edisto47', 'MR-1', and 'PMR5'. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs), 1.9 million InDels, and 182,398 putative structural variations (SVs). Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon.

  17. Whole Genome Re-Sequencing and Characterization of Powdery Mildew Disease-Associated Allelic Variation in Melon

    PubMed Central

    Natarajan, Sathishkumar; Kim, Hoy-Taek; Thamilarasan, Senthil Kumar; Veerappan, Karpagam; Park, Jong-In; Nou, Ill-Sup

    2016-01-01

    Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L.) and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, ‘SCNU1154’, ‘Edisto47’, ‘MR-1’, and ‘PMR5’. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs), 1.9 million InDels, and 182,398 putative structural variations (SVs). Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon. PMID:27311063

  19. Comparative Analysis of Mycobacterium tuberculosis pe and ppe Genes Reveals High Sequence Variation and an Apparent Absence of Selective Constraints

    PubMed Central

    McEvoy, Christopher R. E.; Cloete, Ruben; Müller, Borna; Schürch, Anita C.; van Helden, Paul D.; Gagneux, Sebastien; Warren, Robin M.; Gey van Pittius, Nicolaas C.

    2012-01-01

    Mycobacterium tuberculosis complex (MTBC) genomes contain 2 large gene families termed pe and ppe. The function of pe/ppe proteins remains enigmatic but studies suggest that they are secreted or cell surface associated and are involved in bacterial virulence. Previous studies have also shown that some pe/ppe genes are polymorphic, a finding that suggests involvement in antigenic variation. Using comparative sequence analysis of 18 publicly available MTBC whole genome sequences, we have performed alignments of 33 pe (excluding pe_pgrs) and 66 ppe genes in order to detect the frequency and nature of genetic variation. This work has been supplemented by whole gene sequencing of 14 pe/ppe (including 5 pe_pgrs) genes in a cohort of 40 diverse and well defined clinical isolates covering all the main lineages of the M. tuberculosis phylogenetic tree. We show that nsSNPs in pe (excluding pgrs) and ppe genes are 3.0 and 3.3 times higher than in non-pe/ppe genes respectively and that numerous other mutation types are also present at a high frequency. It has previously been shown that non-pe/ppe M. tuberculosis genes display a remarkably low level of purifying selection. Here, we also show that compared to these genes those of the pe/ppe families show a further reduction of selection pressure that suggests neutral evolution. This is inconsistent with the positive selection pressure of “classical” antigenic variation. Finally, by analyzing such a large number of genes we were able to detect large differences in mutation type and frequency between both individual genes and gene sub-families. The high variation rates and absence of selective constraints provide valuable insights into potential pe/ppe function. Since pe/ppe proteins are highly antigenic and have been studied as potential vaccine components these results should also prove informative for aspects of M. tuberculosis vaccine design. PMID:22496726

  20. Some properties and amino acid sequence of plastocyanin from a green alga, Ulva arasakii.

    PubMed

    Yoshizaki, F; Fukazawa, T; Mishina, Y; Sugimura, Y

    1989-08-01

    Plastocyanin was purified from a multicellular, marine green alga, Ulva arasakii, by conventional methods to homogeneity. The oxidized plastocyanin showed absorption maxima at 252, 276.8, 460, 595.3, and 775 nm, and shoulders at 259, 265, 269, and 282.5 nm; the ratio A276.8/A595.3 was 1.5. The midpoint redox potential was determined to be 0.356 V at pH 7.0 with a ferri- and ferrocyanide system. The molecular weight was estimated to be 10,200 and 11,000 by SDS-PAGE and by gel filtration, respectively. U. arasakii also has a small amount of cytochrome c6, like Enteromorpha prolifera. The amino acid sequence of U. arasakii plastocyanin was determined by Edman degradation and by carboxypeptidase digestion of the plastocyanin, six tryptic peptides, and five staphylococcal protease peptides. The plastocyanin contained 98 amino acid residues, giving a molecular weight of 10,236 including one copper atom. The complete sequence is as follows: AQIVKLGGDDGALAFVPSKISVAAGEAIEFVNNAGFPHNIVFDEDAVPAGVDADAISYDDYLNSKGETVVRKLSTPGVYGVYCEPHAGAGMKMTITVQ. The sequence of U. arasakii plastocyanin is closest to that of the E. prolifera protein (85% homology). A phylogenetic tree of five algal and two higher plant plastocyanins was constructed by comparing the amino acid differences. The branching order is considered to be as follows: a blue-green alga, unicellular green algae, multicellular green algae, and higher plants. PMID:2509442
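
    The residue count stated in this record can be checked directly against the printed sequence. The following is a minimal sketch, not part of the original record; the variable names are illustrative, and any whitespace inside the printed sequence is assumed to be layout residue rather than a gap.

```python
from collections import Counter

# Sequence as printed in the record; embedded spaces are assumed to be
# line-break residue from the original listing, not alignment gaps.
raw_sequence = (
    "AQIVKLGGDDGALAFVPSKISVAAGEAIEFVNNAGFPHNIVFDEDAVPAGVDADAISYDDYLNSKGETV "
    "VRKLSTPGVY G VYCEPHAGAGMKMTITVQ"
)

# Remove all whitespace to obtain the contiguous one-letter sequence.
sequence = "".join(raw_sequence.split())

# Number of residues and per-residue composition (one-letter codes).
residue_count = len(sequence)
composition = Counter(sequence)

print(residue_count)
print(composition.most_common(5))
```

    With the whitespace removed, the length agrees with the 98 residues reported in the abstract.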

  1. Characterization of the microbial acid mine drainage microbial community using culturing and direct sequencing techniques.

    PubMed

    Auld, Ryan R; Myre, Maxine; Mykytczuk, Nadia C S; Leduc, Leo G; Merritt, Thomas J S

    2013-05-01

    We characterized the bacterial community from an AMD tailings pond using both classical culturing and modern direct sequencing techniques and compared the two methods. Acid mine drainage (AMD) is produced by the environmental and microbial oxidation of minerals dissolved from mining waste. Surprisingly, we know little about the microbial communities associated with AMD, despite the fundamental ecological roles of these organisms and large-scale economic impact of these waste sites. AMD microbial communities have classically been characterized by laboratory culturing-based techniques and more recently by direct sequencing of marker gene sequences, primarily the 16S rRNA gene. In our comparison of the techniques, we find that their results are complementary, overall indicating very similar community structure with similar dominant species, but with each method identifying some species that were missed by the other. We were able to culture the majority of species that our direct sequencing results indicated were present, primarily species within the Acidithiobacillus and Acidiphilium genera, although estimates of relative species abundance were only obtained from direct sequencing. Interestingly, our culture-based methods recovered four species that had been overlooked from our sequencing results because of the rarity of the marker gene sequences, likely members of the rare biosphere. Further, direct sequencing indicated that a single genus, completely missed in our culture-based study, Legionella, was a dominant member of the microbial community. Our results suggest that while either method does a reasonable job of identifying the dominant members of the AMD microbial community, together the methods combine to give a more complete picture of the true diversity of this environment. PMID:23485423

  2. Complete amino acid sequence of chitinase-A from leaves of pokeweed (Phytolacca americana).

    PubMed

    Yamagami, T; Tanigawa, M; Ishiguro, M; Funatsu, G

    1998-04-01

    The complete amino acid sequence of pokeweed leaf chitinase-A was determined. First all 11 tryptic peptides from the reduced and S-carboxymethylated form of the enzyme were sequenced. Then the same form of the enzyme was cleaved with cyanogen bromide, giving three fragments. The fragments were digested with chymotrypsin or Staphylococcus aureus V8 protease. Last, the 11 tryptic peptides were put in order. Of seven cysteine residues, six were linked by disulfide bonds (between Cys25 and Cys74, Cys89 and Cys98, and Cys195 and Cys208); Cys176 was free. The enzyme consisted of 208 amino acid residues and had a molecular weight of 22,391. It consisted of only one polypeptide chain without a chitin-binding domain. The length of the chain was almost the same as that of the catalytic domains of class IL chitinases. These findings suggested that this enzyme is a new kind of class IIL chitinase, although its sequence resembles that of catalytic domains of class IL chitinases more than that of the class IIL chitinases reported so far. The involvement of a specific tryptophan residue in the active site of PLC-A is also discussed on the basis of sequence similarity with rye seed chitinase-c.

  3. Metazoan remaining genes for essential amino acid biosynthesis: sequence conservation and evolutionary analyses.

    PubMed

    Costa, Igor R; Thompson, Julie D; Ortega, José Miguel; Prosdocimi, Francisco

    2014-12-24

    Essential amino acids (EAA) consist of a group of nine amino acids that animals are unable to synthesize via de novo pathways. Recently, it has been found that most metazoans lack the same set of enzymes responsible for the de novo EAA biosynthesis. Here we investigate the sequence conservation and evolution of all the metazoan remaining genes for EAA pathways. Initially, the set of all 49 enzymes responsible for the EAA de novo biosynthesis in yeast was retrieved. These enzymes were used as BLAST queries to search for similar sequences in a database containing 10 complete metazoan genomes. Eight enzymes typically attributed to EAA pathways were found to be ubiquitous in metazoan genomes, suggesting a conserved functional role. In this study, we address the question of how these genes evolved after losing their pathway partners. To do this, we compared metazoan genes with their fungal and plant orthologs. Using phylogenetic analysis with maximum likelihood, we found that acetolactate synthase (ALS) and betaine-homocysteine S-methyltransferase (BHMT) diverged from the expected Tree of Life (ToL) relationships. High sequence conservation in the paraphyletic group Plant-Fungi was identified for these two genes using a newly developed Python algorithm. Selective pressure analysis of ALS and BHMT protein sequences showed higher non-synonymous mutation ratios in comparisons between metazoans/fungi and metazoans/plants, supporting the hypothesis that these two genes have undergone non-ToL evolution in animals.
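
    Sequence conservation of the kind discussed in this record is often summarized per alignment column. The sketch below is a generic illustration and is not the Python algorithm developed by the authors, which the abstract does not describe; the function name and the toy alignment are hypothetical.

```python
from collections import Counter


def column_conservation(aligned_seqs):
    """Return, for each alignment column, the fraction of sequences that
    carry the most common residue. Gaps ('-') never count as a residue."""
    length = len(aligned_seqs[0])
    if any(len(seq) != length for seq in aligned_seqs):
        raise ValueError("all aligned sequences must have the same length")

    scores = []
    for column in zip(*aligned_seqs):
        residues = [res for res in column if res != "-"]
        if not residues:
            # Column made only of gaps: no conservation signal.
            scores.append(0.0)
            continue
        top_count = Counter(residues).most_common(1)[0][1]
        scores.append(top_count / len(column))
    return scores


# Hypothetical toy alignment, for illustration only (not real data).
toy_alignment = ["MKTA", "MKSA", "MR-A"]
toy_scores = column_conservation(toy_alignment)
print(toy_scores)
```

    Columns where every sequence shares one residue score 1.0; mixed or gapped columns score lower, which gives a simple way to compare conservation between groups of orthologs.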

  4. The amino acid sequence of the aspartate aminotransferase from baker's yeast (Saccharomyces cerevisiae).

    PubMed Central

    Cronin, V B; Maras, B; Barra, D; Doonan, S

    1991-01-01

    1. The single (cytosolic) aspartate aminotransferase was purified in high yield from baker's yeast (Saccharomyces cerevisiae). 2. Amino-acid-sequence analysis was carried out by digestion of the protein with trypsin and with CNBr; some of the peptides produced were further subdigested with Staphylococcus aureus V8 proteinase or with pepsin. Peptides were sequenced by the dansyl-Edman method and/or by automated gas-phase methods. The amino acid sequence obtained was complete except for a probable gap of two residues as indicated by comparison with the structures of counterpart proteins in other species. 3. The N-terminus of the enzyme is blocked. Fast-atom-bombardment m.s. was used to identify the blocking group as an acetyl one. 4. Alignment of the sequence of the enzyme with those of vertebrate cytosolic and mitochondrial aspartate aminotransferases and with the enzyme from Escherichia coli showed that about 25% of residues are conserved between these distantly related forms. 5. Experimental details and confirmatory data for the results presented here are given in a Supplementary Publication (SUP 50164, 25 pages) that has been deposited at the British Library Document Supply Centre, Boston Spa, Wetherby, West Yorkshire LS23 7BQ, U.K., from whom copies can be obtained on the terms indicated in Biochem. J. (1991) 273, 5. PMID:1859361

  5. [MOLECULAR EVOLUTION OF ION CHANNELS: AMINO ACID SEQUENCES AND 3D STRUCTURES].

    PubMed

    Korkosh, V S; Zhorov, B S; Tikhonov, D B

    2016-01-01

    An integral part of modern evolutionary biology is comparative analysis of structure and function of macromolecules such as proteins. The first and critical step to understand evolution of homologous proteins is their amino acid sequence alignment. However, standard algorithms do not provide unambiguous sequence alignments for proteins of poor homology. More reliable results can be obtained by comparing experimental 3D structures obtained at atomic resolution, for instance, with the aid of X-ray structural analysis. If such structures are lacking, homology modeling is used, which may take into account indirect experimental data on functional roles of individual amino-acid residues. An important problem is that the sequence alignment, which reflects genetic modifications, does not necessarily correspond to the functional homology. The latter depends on three-dimensional structures which are critical for natural selection. Since alignment techniques relying only on the analysis of primary structures carry no information on the functional properties of proteins, including 3D structures into consideration is very important. Here we consider several examples involving ion channels and demonstrate that alignment of their three-dimensional structures can significantly improve sequence alignments obtained by traditional methods.

  6. Source quality variations tied to sequence development: Integration of physical and chemical aspects, Lower to Middle Triassic, western Barents Sea

    SciTech Connect

    Bohacs, K.M.; Isaksen, G.H.

    1991-03-01

    Triassic mudrocks from the Barents Sea area demonstrate the covariance of physical and chemical properties of mudrocks deposited in shelfal environments and the aspect of depositional sequences in distal settings. The tie of physical parameters to chemical character within a detailed sequence-stratigraphic framework enables the construction of depositional-facies models to predict organic-matter content and quality. This allows the explorer to more closely constrain and predict the nature of potential source rocks using seismic and well-log data. Changes in lithology, bedding geometry, sedimentary structures, body and trace-fossil assemblages, and inorganic, bulk-organic, and molecular geochemistry revealed the detailed depositional environments. The depositional environments stack predictably, according to their position in the depositional sequence: from aerobic lower-shoreface to offshore-transition environments in lowstand systems tracts to dysaerobic-anaerobic distal open-marine-shelf environments in transgressive and early highstand systems tracts. Quantitative molecular geochemistry also revealed variations within this distal setting and strong covariance with sequence position. Input of organic matter from terrigenous higher plants dominates the lowstands whereas marine-algal organic matter is most prevalent within transgressive and highstand systems tracts. Specifically, the abundance of C30 steranes, total steranes, and moretane reflected development of the sequences.

  7. Distinct intraspecific variations of garlic (Allium sativum L.) revealed by the exon-intron sequences of the alliinase gene.

    PubMed

    Endo, Aki; Imai, Yukiko; Nakamura, Mizuho; Yanagisawa, Eri; Taguchi, Takaaki; Torii, Kosuke; Okumura, Hidenobu; Ichinose, Koji

    2014-04-01

    Garlic (Allium sativum L.) has been used worldwide as a food and for medicinal purposes since early times. Garlic cultivars exhibit considerable morphological diversity despite the fact that they are mostly sterile and are grown only by vegetative propagation of cloves. Considerable recombination occurs in garlic genomes, including the genes involved in secondary metabolites. We examined the genomic DNAs (gDNAs) from garlic, encoding alliinase, a key enzyme involved in organosulfur metabolism in Allium plants. The 1.7-kb gDNA fragments, covering three exons (2, 3, and 4) and all four introns, were amplified from total DNAs prepared from garlic samples produced in Asia and Europe, leading to 73 sequences in total: Japan (JPN), China (CHN), India (IND), Spain (ESP), and France (FRA). The exon sequences were highly conserved among all the sequences, probably reflecting the fully functional alliinase associated with the flavor quality. Distinct intraspecific variations were detected for all four intron sequences, leading to the haplotype classifications. A close relationship between JPN and CHN was observed for all four introns, whereas IND showed a more divergent distribution. ESP and FRA afforded clearly different variants compared with those from Asian sequences. The present study provides information that could be useful in the development of an additional molecular marker for garlic authentication and quality control.

  8. Effects of temporal variations in the acidity of rain on crop response to acidic deposition (Task Force Project)

    SciTech Connect

    Miller, J.E.; Irving, P.M.

    1983-01-01

    The variation of rainfall pH from one event to another may have important biological consequences in several respects. A few rainfall events, or even only one, at low pH values may be more damaging than the same total hydrogen ion deposition spread out over a larger number of events with higher pH values. The variation of rainfall pH within events may also affect biological response. For example, during the early stages of an event, acid precipitation may be less effective in damaging vegetation since the water falling on the plant surfaces is continually being washed off by the subsequent rain. On the other hand, the rainwater that falls during the very last stages of an event remains on the vegetation until it evaporates. Since the concentration of contaminants remaining in solution on the leaves in these circumstances steadily increases during evaporation, this particular rain may have a significantly greater effect. Of course, there is also a third hypothesis to the effect that it is primarily the total hydrogen ion deposition received by plant surfaces over a long period of time (conceivably, over an entire growing season) which is the most important parameter due to the cumulative stress imposed on the plant tissue over the entire period of exposure. The proposed experiments will determine which of these hypotheses best represents the response of a particular, economically-important crop species to the impact of more realistically programmed, acid rain simulant treatments. Design of the experiments is discussed.
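
    The comparison between a few strongly acidic events and many milder ones rests on the logarithmic pH scale, where [H+] = 10^(-pH) mol/L. The sketch below works through that arithmetic; the pH values and rainfall depths are hypothetical illustrations, not figures from the report, and the function name is an assumption.

```python
import math


def hydrogen_ion_deposition(ph, rainfall_litres_per_m2):
    """Moles of H+ deposited per square metre by one rain event.

    Uses [H+] = 10**(-pH) mol/L; 1 mm of rain equals 1 litre per m2.
    """
    return 10.0 ** (-ph) * rainfall_litres_per_m2


# Hypothetical scenario: one 10 mm event at pH 3.0 ...
one_acidic_event = hydrogen_ion_deposition(3.0, 10.0)

# ... versus ten 10 mm events at pH 4.0.
ten_milder_events = sum(hydrogen_ion_deposition(4.0, 10.0) for _ in range(10))

# Both scenarios deliver the same total H+ (about 0.01 mol per m2),
# which is the situation the competing hypotheses try to distinguish.
print(one_acidic_event, ten_milder_events)
print(math.isclose(one_acidic_event, ten_milder_events))
```

    Because each pH unit is a tenfold change in concentration, equal total deposition can hide very different peak acidities at the leaf surface.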

  9. Space and time variations of crustal anisotropy during the 1997 Umbria-Marche, central Italy, seismic sequence

    NASA Astrophysics Data System (ADS)

    Piccinini, D.; Margheriti, L.; Chiaraluce, L.; Cocco, M.

    2006-12-01

    We measure crustal anisotropy parameters from several hundreds of aftershocks (ML > 2.5) of the 1997 Umbria-Marche seismic sequence which occurred in a carbonatic fold and thrust belt in the shallow crust of central Apennines (Italy). The analysis of shear wave polarization shows clear S-wave splitting with prevalent fast direction ~140°N and average delay times of 0.06 s. The observed fast direction is parallel to the strike of the activated normal-fault system and to the maximum horizontal stress (σ2) active in the region. This is explained by the presence of stress-aligned microcracks or stress-opened fluid-filled cracks and fractures within the sedimentary coverage, even if the role of structural anisotropy cannot be completely ruled out since the maximum horizontal stress is subparallel to the major structural features of the area (main thrusts and normal faults). The peculiar spatio-temporal evolution of the seismic sequence gives us also the opportunity to investigate temporal variations of anisotropic parameters. We analyse those seismograms whose ray paths sample the crustal volume containing two of the major fault zones, before and after the occurrence of normal faulting mainshocks (Mw > 5). We observe variations of the anisotropic parameters during the days before and after the occurrence of mainshocks and we interpret them in terms of temporal variations of anisotropic parameters. This interpretation is consistent with temporal variations of the local stress condition and of the fluid pressure in the studied crustal volume proposed in the literature. However, since the spatial sampling of the selected ray paths varies with time, we cannot exclude the contribution of spatial variations of anisotropic parameters.

  10. Complete Genome Sequence of a thermotolerant sporogenic lactic acid bacterium, Bacillus coagulans strain 36D1

    PubMed Central

    Rhee, Mun Su; Moritz, Brélan E.; Xie, Gary; Glavina del Rio, T.; Dalin, E.; Tice, H.; Bruce, D.; Goodwin, L.; Chertkov, O.; Brettin, T.; Han, C.; Detter, C.; Pitluck, S.; Land, Miriam L.; Patel, Milind; Ou, Mark; Harbrucker, Roberta; Ingram, Lonnie O.; Shanmugam, K. T.

    2011-01-01

    Bacillus coagulans is a ubiquitous soil bacterium that grows at 50-55 °C and pH 5.0 and ferments various sugars that constitute plant biomass to L(+)-lactic acid. The ability of this sporogenic lactic acid bacterium to grow at 50-55 °C and pH 5.0 makes this organism an attractive microbial biocatalyst for production of optically pure lactic acid at industrial scale not only from glucose derived from cellulose but also from xylose, a major constituent of hemicellulose. This bacterium is also considered a potential probiotic. The complete genome sequence of a representative strain, B. coagulans strain 36D1, is presented and discussed. PMID:22675583

  11. BeadCons: detection of nucleic acid sequences by flow cytometry.

    PubMed

    Horejsh, Douglas; Martini, Federico; Capobianchi, Maria Rosaria

    2005-11-01

    Molecular beacons are single-stranded nucleic acid structures with a terminal fluorophore and a distal, terminal quencher. These molecules are typically used in real-time PCR assays, but have also been conjugated with solid matrices. This unit describes protocols related to molecular beacon-conjugated beads (BeadCons), whose specific hybridization with complementary target sequences can be resolved by cytometry. Assay sensitivity is achieved through the concentration of fluorescence signal on discrete particles. By using molecular beacons with different fluorophores and microspheres of different sizes, it is possible to construct a fluid array system with each bead corresponding to a specific target nucleic acid. Methods are presented for the design, construction, and use of BeadCons for the specific, multiplexed detection of unlabeled nucleic acids in solution. The use of bead-based detection methods will likely lead to the design of new multiplex molecular diagnostic tools.

  12. Measuring nanometer distances in nucleic acids using a sequence-independent nitroxide probe

    PubMed Central

    Qin, Peter Z; Haworth, Ian S; Cai, Qi; Kusnetzow, Ana K; Grant, Gian Paola G; Price, Eric A; Sowa, Glenna Z; Popova, Anna; Herreros, Bruno; He, Honghang

    2008-01-01

    This protocol describes the procedures for measuring nanometer distances in nucleic acids using a nitroxide probe that can be attached to any nucleotide within a given sequence. Two nitroxides are attached to phosphorothioates that are chemically substituted at specific sites of DNA or RNA. Inter-nitroxide distances are measured using a four-pulse double electron–electron resonance technique, and the measured distances are correlated to the parent structures using a Web-accessible computer program. Four to five days are needed for sample labeling, purification and distance measurement. The procedures described herein provide a method for probing global structures and studying conformational changes of nucleic acids and protein/nucleic acid complexes. PMID:17947978

  13. [Partial sequence homology of FtsZ in phylogenetics analysis of lactic acid bacteria].

    PubMed

    Zhang, Bin; Dong, Xiu-zhu

    2005-10-01

    FtsZ is a structurally conserved protein, which is universal among the prokaryotes. It plays a key role in prokaryote cell division. A partial fragment of the ftsZ gene about 800 bp in length was amplified and sequenced and a partial FtsZ protein phylogenetic tree for the lactic acid bacteria was constructed. By comparing the FtsZ phylogenetic tree with the 16S rDNA tree, it was shown that the two trees were similar in topology. Both trees revealed that Pediococcus spp. were closely related to the L. casei group of Lactobacillus spp., but less related to other lactic acid cocci such as Enterococcus and Streptococcus. The results also showed that the discriminative power of FtsZ was higher than that of 16S rDNA at both the inter-species and inter-genus levels and that FtsZ could be a very useful tool in species identification of lactic acid bacteria. PMID:16342751

  14. Investigation of the population structure of Legionella pneumophila by analysis of tandem repeat copy number and internal sequence variation.

    PubMed

    Visca, Paolo; D'Arezzo, Silvia; Ramisse, Françoise; Gelfand, Yevgeniy; Benson, Gary; Vergnaud, Gilles; Fry, Norman K; Pourcel, Christine

    2011-09-01

    The population structure of the species Legionella pneumophila was investigated by multilocus variable number of tandem repeats (VNTR) analysis (MLVA) and sequencing of three VNTRs (Lpms01, Lpms04 and Lpms13) in selected strains. Of 150 isolates of diverse origins, 136 (86 %) were distributed into eight large MLVA clonal complexes (VACCs) and the rest were either unique or formed small clusters of up to two MLVA genotypes. In spite of the lower degree of genome-wide linkage disequilibrium of the MLVA loci compared with sequence-based typing, the clustering achieved by the two methods was highly congruent. The detailed analysis of VNTR Lpms04 alleles showed a very complex organization, with five different repeat unit lengths and a high level of internal variation. Within each MLVA-defined VACC, Lpms04 was endowed with a common recognizable pattern with some interesting exceptions. Evidence of recombination events was suggested by analysis of internal repeat variations at the two additional VNTR loci, Lpms01 and Lpms13. Sequence analysis of L. pneumophila VNTR locus Lpms04 alone provides a first-line assay for allocation of a new isolate within the L. pneumophila population structure and for epidemiological studies.

  15. mit-o-matic: a comprehensive computational pipeline for clinical evaluation of mitochondrial variations from next-generation sequencing datasets.

    PubMed

    Vellarikkal, Shamsudheen Karuthedath; Dhiman, Heena; Joshi, Kandarp; Hasija, Yasha; Sivasubbu, Sridhar; Scaria, Vinod

    2015-04-01

    The human mitochondrial genome has been reported to have a very high mutation rate compared with the nuclear genome. A large number of mitochondrial mutations show significant phenotypic association and are involved in a broad spectrum of diseases. In recent years, there has been remarkable progress in the understanding of mitochondrial genetics. The availability of next-generation sequencing (NGS) technologies has not only reduced sequencing cost by orders of magnitude but has also provided good-quality mitochondrial genome sequences with high coverage, thereby enabling the decoding of a number of human mitochondrial diseases. In this study, we report a computational and experimental pipeline to decipher human mitochondrial DNA variations and examine them for their clinical correlation. As a proof of principle, we also present a clinical study of a patient with Leigh disease and confirmed maternal inheritance of the causative allele. The pipeline is made available as a user-friendly online tool to annotate variants and find haplogroup, disease association, and heteroplasmic sites. The "mit-o-matic" computational pipeline represents a comprehensive cloud-based tool for clinical evaluation of mitochondrial genomic variations from NGS datasets. The tool is freely available at http://genome.igib.res.in/mitomatic/.

  16. Analysis of genetic variation within clonal lineages of grape phylloxera (Daktulosphaira vitifoliae Fitch) using AFLP fingerprinting and DNA sequencing.

    PubMed

    Vorwerk, S; Forneck, A

    2007-07-01

    Two AFLP fingerprinting methods were employed to estimate the potential of AFLP fingerprints for the detection of genetic diversity within single founder lineages of grape phylloxera (Daktulosphaira vitifoliae Fitch). Eight clonal lineages, reared under controlled conditions in a greenhouse and reproducing asexually throughout a minimum of 15 generations, were monitored and mutations were scored as polymorphisms between the founder individual and individuals of succeeding generations. Genetic variation was detected within all lineages, from early generations on. Six to 15 polymorphic loci (from a total of 141 loci) were detected within the lineages, making up 4.3% of the total amount of genetic variation. The presence of contaminating extra-genomic sequences (e.g., viral material, bacteria, or ingested chloroplast DNA) was excluded as a source of intraclonal variation. Sequencing of 37 selected polymorphic bands confirmed their origin in mostly noncoding regions of the grape phylloxera genome. AFLP techniques were revealed to be powerful for the identification of reproducible banding patterns within clonal lineages.

  18. The amino acid sequence of Lady Amherst's pheasant (Chrysolophus amherstiae) and golden pheasant (Chrysolophus pictus) egg-white lysozymes.

    PubMed

    Araki, T; Kuramoto, M; Torikata, T

    1990-09-01

    The amino acids of Lady Amherst's pheasant and golden pheasant egg-white lysozymes have been sequenced. The carboxymethylated lysozymes were digested with trypsin, followed by sequencing of the tryptic peptides. Lady Amherst's pheasant lysozyme proved to consist of 129 amino acid residues, and a relative molecular mass of 14,423 Da was calculated. This lysozyme had 6 amino acid substitutions when compared with hen egg-white lysozyme: Phe3 to Tyr, His15 to Leu, Gln41 to His, Asn77 to His, Gln121 to Asn, and a newly found substitution of Ile124 to Thr. The amino acid sequence of golden pheasant lysozyme was identical to that of Lady Amherst's pheasant lysozyme. The phylogenetic tree constructed by comparing the amino acid sequences of phasianoid bird lysozymes revealed a minimum genetic distance between these pheasants and the turkey-peafowl group. PMID:1368578
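
    The six substitutions listed in this abstract can be written in the compact one-letter notation often used for protein variants. The following is a minimal sketch, not part of the original record: the function names are illustrative, and only the substitution list itself is taken from the abstract.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Substitutions relative to hen egg-white lysozyme, as listed in the abstract.
SUBSTITUTIONS = [
    "Phe3 to Tyr", "His15 to Leu", "Gln41 to His",
    "Asn77 to His", "Gln121 to Asn", "Ile124 to Thr",
]

# Residue, position, "to", residue; optional space before the position.
PATTERN = re.compile(r"([A-Z][a-z]{2})\s*(\d+)\s+to\s+([A-Z][a-z]{2})")


def parse_substitution(text):
    """Return (position, reference residue, variant residue) in one-letter codes."""
    match = PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"unrecognised substitution: {text!r}")
    ref, pos, alt = match.groups()
    return int(pos), THREE_TO_ONE[ref], THREE_TO_ONE[alt]


def short_notation(text):
    """Format a substitution compactly, e.g. 'Phe3 to Tyr' becomes 'F3Y'."""
    pos, ref, alt = parse_substitution(text)
    return f"{ref}{pos}{alt}"


compact = [short_notation(s) for s in SUBSTITUTIONS]
print(compact)
```

    The parser tolerates a space between residue and position, so a spelling such as "Gln 121 to Asn" is handled the same way as "Gln121 to Asn".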

  19. Estimation of Response Functions Based on Variational Bayes Algorithm in Dynamic Images Sequences

    PubMed Central

    2016-01-01

    We proposed a nonparametric Bayesian model based on variational Bayes algorithm to estimate the response functions in dynamic medical imaging. In dynamic renal scintigraphy, the impulse response or retention functions are rather complicated and finding a suitable parametric form is problematic. In this paper, we estimated the response functions using nonparametric Bayesian priors. These priors were designed to favor desirable properties of the functions, such as sparsity or smoothness. These assumptions were used within hierarchical priors of the variational Bayes algorithm. We performed our algorithm on the real online dataset of dynamic renal scintigraphy. The results demonstrated that this algorithm improved the estimation of response functions with nonparametric priors. PMID:27631007

  1. Understanding Gene Sequence Variation in the Context of Transcription Regulation in Yeast

    NASA Astrophysics Data System (ADS)

    Gat-Viks, Irit; Meller, Renana; Kupiec, Martin; Shamir, Ron

    The availability of expression quantitative trait loci (eQTL) data can help in understanding the genetic basis of variation in gene expression. However, it has proven difficult to accurately predict functional genetic changes due to low statistical power. To address this challenge, we developed a novel computational approach that combines eQTL data with a complementary regulatory network to identify modules of genes, their underlying genetic polymorphisms, and the activity of their shared regulatory proteins. The resulting eQTL model implicates novel central protein complexes that share not only a regulatory protein but also an underlying genetic variation. Our method shows higher sensitivity than prior computational efforts.

  2. A pedigree-based study of mitochondrial D-loop DNA sequence variation among Arabian horses.

    PubMed

    Bowling, A T; Del Valle, A; Bowling, M

    2000-02-01

    Through DNA sequence comparisons of a mitochondrial D-loop hypervariable region, we investigated matrilineal diversity for Arabian horses in the United States. Sixty-two horses were tested. From published pedigrees they traced in the maternal line to 34 mares acquired primarily in the mid to late 19th century from nomadic Bedouin tribes. Compared with the reference sequence (GenBank X79547), these samples showed 27 haplotypes with altogether 31 base substitution sites within 397 bp of sequence. Based on examination of pedigrees from a random sampling of 200 horses in current studbooks of the Arabian Horse Registry of America, we estimated that this study defined the expected mtDNA haplotypes for at least 89% of Arabian horses registered in the US. The reliability of the studbook recorded maternal lineages of Arabian pedigrees was demonstrated by haplotype concordance among multiple samplings in 14 lines. Single base differences observed within two maternal lines were interpreted as representing alternative fixations of past heteroplasmy. The study also demonstrated the utility of mtDNA sequence studies to resolve historical maternity questions without access to biological material from the horses whose relationship was in question, provided that representatives of the relevant female lines were available for comparison. The data call into question the traditional assumption that Arabian horses of the same strain necessarily share a common maternal ancestry.

  3. Comparative Effectiveness of Variations in the Demonstration Method of Teaching a Complex Manipulative Sequence.

    ERIC Educational Resources Information Center

    Blankenbaker, E. Keith

    There are so many methods and approaches to teaching that it is sometimes difficult to choose the approach best suited to the needs of the students. This study sought to ascertain the relative effectiveness and efficiency of selected approaches to the demonstration of complex manipulative sequences, and to test the theory that students of high…

  4. Mitogenome sequence variation in migratory and stationary ecotypes of North-east Atlantic cod.

    PubMed

    Karlsen, Bård O; Emblem, Åse; Jørgensen, Tor E; Klingan, Kevin A; Nordeide, Jarle T; Moum, Truls; Johansen, Steinar D

    2014-06-01

    Sequencing of mitochondrial gene fragments from specimens representing a wide range of geographical locations has indicated limited population structuring in Atlantic cod (Gadus morhua). We recently performed whole genome analysis based on next-generation sequencing of two pooled ecotype samples representing offshore migratory and inshore stationary cod from the North-east Atlantic Ocean. Here we report molecular features and variability of the 16.7kb mitogenome component that was collected from the datasets. These sequences represented more than 25 times coverage of each individual and more than 1100 times coverage of each ecotype sample. We estimated the mitogenome to have evolved 14 times more rapidly than the nuclear genome. Among the 365 single nucleotide polymorphism (SNP) sites identified, 121 were shared between ecotypes, and 151 and 93 were private within the migratory and stationary cod, respectively. We found 323 SNPs to be located in protein coding genes, of which 29 were non-synonymous. One synonymous site in ND2 was likely to be under positive selection. FST measurements indicated weak differentiation in ND1 and ND2 between ecotypes. We conclude that the Atlantic cod mitogenome and the nuclear genome apparently evolved by distinct evolutionary constraints, and that the reproductive isolation observed from whole genome analysis was not visible in the mtDNA sequences.
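
    The SNP counts in this abstract can be cross-checked with simple bookkeeping. The sketch below uses only the numbers reported above; the derived quantities are computed here rather than stated in the abstract, and they assume that "private" means a site that is variable in one ecotype only and that all coding SNPs not reported as non-synonymous are synonymous.

```python
# SNP counts reported in the abstract.
total_snps = 365
shared = 121
private_migratory = 151
private_stationary = 93
in_protein_coding = 323
nonsynonymous = 29

# Shared plus ecotype-private sites should account for every SNP.
partition_ok = shared + private_migratory + private_stationary == total_snps

# Derived quantities (assumptions noted in the text above).
variable_in_migratory = shared + private_migratory
variable_in_stationary = shared + private_stationary
synonymous_coding = in_protein_coding - nonsynonymous
outside_protein_coding = total_snps - in_protein_coding

print(partition_ok, variable_in_migratory, variable_in_stationary)
print(synonymous_coding, outside_protein_coding)
```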

  5. Bovine herpesvirus-1: comparison and differentiation of vaccine and field strains based on genomic sequence variation.

    PubMed

    Fulton, R W; d'Offay, J M; Eberle, R

    2013-03-01

    Bovine herpesvirus-1 (BoHV-1) causes significant disease in cattle including respiratory, fetal diseases, and reproductive tract infections. Control programs usually include vaccination with a modified live viral (MLV) vaccine. On occasion BoHV-1 strains are isolated from diseased animals or fetuses postvaccination. Currently there are no markers for differentiating MLV strains from field strains of BoHV-1. In this study several BoHV-1 strains were sequenced using whole-genome sequencing technologies and the data analyzed to identify single nucleotide polymorphisms (SNPs). Strains sequenced included the reference BoHV-1 Cooper strain (GenBank Accession JX898220), eight commercial MLV vaccine strains, and 14 field strains from cases presented for diagnosis. Based on SNP analyses, the viruses could be classified into groups having similar SNP patterns. The eight MLV strains could be differentiated from one another although some were closely related to each other. A number of field strains isolated from animals with a history of prior vaccination had SNP patterns similar to specific MLV viruses, while other field isolates were very distinct from all vaccine strains. The results indicate that some BoHV-1 isolates from clinically ill cattle/fetuses can be associated with a prior MLV vaccination history, but more information is needed on the rate of BoHV-1 genome sequence change before irrefutable associations can be drawn. PMID:23333211

  6. Gene sequencing reveals heterokaryotic variations and evolutionary mechanisms in Puccinia striiformis

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Puccinia striiformis (Ps), the causal agent of stripe rust, is an obligate biotrophic fungus. The objective of this study was to identify polymorphic genes for determining the mechanisms of the pathogen variation. Primers were designed for seven important putative genes including beta-tubulin (BT), ...

  7. N-terminal amino acid sequences and some characteristics of fibrinolytic/hemorrhagic metalloproteinases purified from Bothrops jararaca venom.

    PubMed

    Maruyama, Masugi; Sugiki, Masahiko; Anai, Keita; Yoshida, Etsuo

    2002-08-01

    We determined the N-terminal amino acid sequences of the fibrinolytic/hemorrhagic metalloproteinases (jararafibrases I, III and IV) purified from Bothrops jararaca venom. The N-terminal amino acid sequences of jararafibrase I and its degradation products were identical to those of jararhagin, another hemorrhagic metalloproteinase purified from the same snake venom. Together with enzymatic and immunological properties, we concluded that those two enzymes are identical. The N-terminal amino acid sequence of jararafibrase III was quite similar to C-type lectin isolated from Crotalus atrox, and the protein had a hemagglutinating activity on intact rat red blood cells. PMID:12165326

  8. Invasive cleavage of nucleic acids

    DOEpatents

    Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.

    2002-01-01

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.

  9. Invasive cleavage of nucleic acids

    DOEpatents

    Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.

    1999-01-01

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.

  10. Multiple Cis-Acting Sequences Contribute to Evolved Regulatory Variation for Drosophila Adh Genes

    PubMed Central

    Fang, X. M.; Brennan, M. D.

    1992-01-01

    Drosophila affinidisjuncta and Drosophila hawaiiensis are closely related species that display distinct tissue-specific expression patterns for their homologous alcohol dehydrogenase genes (Adh genes). In Drosophila melanogaster transformants, both genes are expressed at high levels in the larval and adult fat bodies, but the D. affinidisjuncta gene is expressed 10-50-fold more strongly in the larval and adult midguts and Malpighian tubules. The present study reports the mapping of cis-acting sequences contributing to the regulatory differences between these two genes in transformants. Chimeric genes were constructed and introduced into the germ line of D. melanogaster. Stage- and tissue-specific expression patterns were determined by measuring steady-state RNA levels in larvae and adults. Three portions of the promoter region make distinct contributions to the tissue-specific regulatory differences between the native genes. Sequences immediately upstream of the distal promoter have a strong effect in the adult Malpighian tubules, while sequences between the two promoters are relatively important in the larval Malpighian tubules. A third gene segment, immediately upstream of the proximal promoter, influences levels of the proximal Adh transcript in all tissues and developmental stages examined, and largely accounts for the regulatory difference in the larval and adult midguts. However, these as well as other sequences make smaller contributions to various aspects of the tissue-specific regulatory differences. In addition, some chimeric genes display aberrant RNA levels for the whole organism, suggesting close physical association between sequences involved in tissue-specific regulatory differences and those important for Adh expression in the larval and adult fat bodies. PMID:1644276

  11. Protein sequence analysis by incorporating modified chaos game and physicochemical properties into Chou's general pseudo amino acid composition.

    PubMed

    Xu, Chunrui; Sun, Dandan; Liu, Shenghui; Zhang, Yusen

    2016-10-01

    In this contribution we introduced a novel graphical method to compare protein sequences. By mapping a protein sequence into 3D space based on codons and physicochemical properties of 20 amino acids, we are able to get a unique P-vector from the 3D curve. This approach is consistent with wobble theory of amino acids. We compute the distance between sequences by their P-vectors to measure similarities/dissimilarities among protein sequences. Finally, we use our method to analyze four datasets and get better results compared with previous approaches. PMID:27375218

  12. Sequence variation and gene duplication at MHC DQB loci of baiji (Lipotes vexillifer), a Chinese river dolphin.

    PubMed

    Yang, G; Yan, J; Zhou, K; Wei, F

    2005-01-01

    The major histocompatibility complex (MHC) is a fundamental part of the vertebrate immune system, and the high variability in many MHC genes is thought to play an important role in the recognition of parasites. Baiji (Lipotes vexillifer) is one of the most endangered species in the world. Its wild population has declined to fewer than 100 individuals and has a very high risk of becoming extinct in the near future. In this study we present a first step in the molecular characterization of a DQB-like locus of baiji by nucleotide sequence analysis of the polymorphic exon 2 segments. In the examined 172 bp sequences from a group of 18 incidentally captured or stranded individuals, 48 variable sites were determined and 43 alleles were identified, many of which were represented by only one clone. Three to seven alleles were found in each individual, suggesting gene duplications. No deletion, insertion, or exceptional stop codon was detected, suggesting these alleles function in vivo. Phylogenetic reconstruction using neighbor joining grouped the 43 alleles into two distinct lineages, differing by seven nucleotides and four amino acids. Substitutions of amino acids tend to be clustered around sites postulated to be responsible for selective peptide recognition. In the peptide-binding region (PBR) of the DQB locus, the average number of nonsynonymous substitutions per site is greater than that of synonymous substitutions per site (0.1962 versus 0.0256, respectively). Nucleotide and amino acid sequences both showed a relatively high level of similarity (nucleotides 90.6%; amino acids 80.6%) to those of beluga whale (Delphinapterus leucas) and narwhal (Monodon monoceros). The high level of baiji MHC polymorphism revealed in the present study has not been reported in other cetaceans and could be a consequence of the small baiji population adapting to freshwater with a relatively high level of pathogens. PMID:15843636
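
    The two substitution rates reported for the peptide-binding region are commonly summarized as a single ratio of nonsynonymous to synonymous substitutions per site. A minimal sketch of that arithmetic follows; the ratio itself is not stated in the abstract, and the variable names are illustrative.

```python
# Per-site substitution rates in the peptide-binding region, from the abstract.
d_n = 0.1962  # nonsynonymous substitutions per nonsynonymous site
d_s = 0.0256  # synonymous substitutions per synonymous site

# Ratio of the two rates; values above 1 are conventionally read as
# consistent with positive (diversifying) selection.
omega = d_n / d_s
excess_of_nonsynonymous = omega > 1.0

print(f"dN/dS = {omega:.2f}")
```

    A ratio this far above 1 fits the abstract's observation that amino acid substitutions cluster around sites thought to be involved in peptide recognition.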

  13. Comprehensive characterization of human genome variation by high coverage whole-genome sequencing of forty four Caucasians.

    PubMed

    Shen, Hui; Li, Jian; Zhang, Jigang; Xu, Chao; Jiang, Yan; Wu, Zikai; Zhao, Fuping; Liao, Li; Chen, Jun; Lin, Yong; Tian, Qing; Papasian, Christopher J; Deng, Hong-Wen

    2013-01-01

    Whole genome sequencing studies are essential to obtain a comprehensive understanding of the vast pattern of human genomic variations. Here we report the results of a high-coverage whole genome sequencing study for 44 unrelated healthy Caucasian adults, each sequenced to over 50-fold coverage (averaging 65.8×). We identified approximately 11 million single nucleotide polymorphisms (SNPs), 2.8 million short insertions and deletions, and over 500,000 block substitutions. We showed that, although previous studies, including the 1000 Genomes Project Phase 1 study, have catalogued the vast majority of common SNPs, many of the low-frequency and rare variants remain undiscovered. For instance, approximately 1.4 million SNPs and 1.3 million short indels that we found were novel to both the dbSNP and the 1000 Genomes Project Phase 1 data sets, the majority of which (∼96%) have a minor allele frequency less than 5%. On average, each individual genome carried ∼3.3 million SNPs and ∼492,000 indels/block substitutions, including approximately 179 variants that were predicted to cause loss of function of the gene products. Moreover, each individual genome carried an average of 44 such loss-of-function variants in a homozygous state, which would completely "knock out" the corresponding genes. Across all the 44 genomes, a total of 182 genes were "knocked out" in at least one individual genome, among which 46 genes were "knocked out" in over 30% of our samples, suggesting that a number of genes are commonly "knocked out" in general populations. Gene ontology analysis suggested that these commonly "knocked out" genes are enriched in biological processes related to antigen processing and immune response. Our results contribute towards a comprehensive characterization of human genomic variation, especially for less-common and rare variants, and provide an invaluable resource for future genetic studies of human variation and diseases.

  14. Striking structural dynamism and nucleotide sequence variation of the transposon Galileo in the genome of Drosophila mojavensis

    PubMed Central

    2013-01-01

    Background Galileo is a transposable element responsible for the generation of three chromosomal inversions in natural populations of Drosophila buzzatii. Although the most characteristic feature of Galileo is the long internally-repetitive terminal inverted repeats (TIRs), which resemble the Drosophila Foldback element, its transposase-coding sequence has led to its classification as a member of the P-element superfamily (Class II, subclass 1, TIR order). Furthermore, Galileo has a wide distribution in the genus Drosophila, since it has been found in 6 of the 12 Drosophila sequenced genomes. Among these species, D. mojavensis, the one closest to D. buzzatii, presented the highest diversity in sequence and structure of Galileo elements. Results In the present work, we carried out a thorough search and annotation of all the Galileo copies present in the D. mojavensis sequenced genome. In our set of 170 Galileo copies we have detected 5 Galileo subfamilies (C, D, E, F, and X) with different structures ranging from nearly complete, to only 2 TIR or solo TIR copies. Finally, we have explored the structural and length variation of the Galileo copies that point out the relatively frequent rearrangements within and between Galileo elements. Different mechanisms responsible for these rearrangements are discussed. Conclusions Although Galileo is a transposable element with an ancient history in the D. mojavensis genome, our data indicate a recent transpositional activity. Furthermore, the dynamism in sequence and structure, mainly affecting the TIRs, suggests an active exchange of sequences among the copies. This exchange could lead to new subfamilies of the transposon, which could be crucial for the long-term survival of the element in the genome. PMID:23374229

  15. Identification of eight mutations and three sequence variations in the cystic fibrosis transmembrane conductance regulator (CFTR) gene

    SciTech Connect

    Ghanem, N.; Costes, B.; Girodon, E.; Martin, J.; Fanen, P.; Goossens, M.

    1994-05-15

    To determine cystic fibrosis (CF) defects in a sample of 224 non-ΔF508 CF chromosomes, the authors used denaturing gradient gel multiplex analysis of CF transmembrane conductance regulator gene segments, a strategy based on blind exhaustive analysis rather than a search for known mutations. This process allowed detection of 11 novel variations comprising two nonsense mutations (Q890X and W1204X), a splice defect (405+4 A→G), a frameshift (3293delA), four presumed missense mutations (S912L, H949Y, L1065P, Q1071P), and three sequence polymorphisms (R31C or 223 C/T, 3471 T/C, and T1220I or 3791 C/T). The authors describe these variations, together with the associated phenotype when defects on both CF chromosomes were identified. 8 refs., 1 fig., 1 tab.

  17. Genetic variation in and spatial structure of natural populations of Dipterocarpus alatus (Dipterocarpaceae) determined using single sequence repeat markers.

    PubMed

    Tam, N M; Duy, V D; Duc, N M; Giap, V D; Xuan, B T T

    2014-01-01

    Dipterocarpus alatus (Dipterocarpaceae) is widely distributed in lowland forests in central and southern Vietnam, Cambodia, Laos, Myanmar, Philippines, Thailand, and India. Due to over-exploitation and habitat destruction, the species is now threatened. The genetic variation within and among populations of D. alatus was investigated on the basis of 9 microsatellite (single sequence repeat, SSR) loci. In all, 268 sampled trees from 10 populations in central and southern Vietnam were analyzed in this study. The SSR data showed a high genetic variability within populations with an average of HO = 0.209 and HE = 0.239. Genetic differentiation among populations was high (FST = 0.266), indicating limited gene flow (Nm = 0.69). Analysis of molecular variance showed that most genetic variation was within populations (74.96%). This study highlights the importance of conserving the genetic resources of D. alatus species. PMID:25078594
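
    The gene flow figure in this abstract is consistent with the widely used island-model approximation Nm ≈ (1/FST − 1)/4. The sketch below assumes that this approximation was the one applied, which the abstract does not state; the function name is illustrative.

```python
def migrants_per_generation(fst):
    """Estimate Nm from FST with the island-model approximation (1/FST - 1) / 4."""
    if not 0.0 < fst < 1.0:
        raise ValueError("FST must lie strictly between 0 and 1")
    return (1.0 / fst - 1.0) / 4.0


# FST value reported in the abstract.
nm = migrants_per_generation(0.266)
print(round(nm, 2))
```

    Under this approximation an FST of 0.266 gives an Nm of about 0.69, matching the reported value. Estimates below one migrant per generation are usually taken to indicate limited gene flow.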

  18. Fast T2-weighted MR imaging: impact of variation in pulse sequence parameters on image quality and artifacts.

    PubMed

    Li, Tao; Mirowitz, Scott A

    2003-09-01

    The purpose of this study was to quantitatively evaluate in a phantom model the practical impact of alteration of key imaging parameters on image quality and artifacts for the most commonly used fast T2-weighted MR sequences. These include fast spin-echo (FSE), single shot fast spin-echo (SSFSE), and spin-echo echo-planar imaging (EPI) pulse sequences. We developed a composite phantom with different T1 and T2 values, which was evaluated while stationary as well as during periodic motion. Experiments involved controlled variations in key parameters including effective TE, TR, echo spacing (ESP), receive bandwidth (BW), echo train length (ETL), and shot number (SN). Quantitative analysis consisted of signal-to-noise ratio (SNR), image nonuniformity, full-width-at-half-maximum (i.e., blurring or geometric distortion) and ghosting ratio. Among the fast T2-weighted sequences, EPI was most sensitive to alterations in imaging parameters. Among imaging parameters that we tested, effective TE, ETL, and shot number most prominently affected image quality and artifacts. Short T2 objects were more sensitive to alterations in imaging parameters in terms of image quality and artifacts. Optimal clinical application of these fast T2-weighted imaging pulse sequences requires careful attention to selection of imaging parameters.

  19. Naturally occurring variations in sequence length creates microRNA isoforms that differ in argonaute effector complex specificity

    PubMed Central

    2010-01-01

    Background Micro(mi)RNAs are short RNA sequences, ranging from 16 to 35 nucleotides (miRBase; http://www.mirbase.org). The majority of the identified sequences are 21 or 22 nucleotides in length. Despite the range of sequence lengths for different miRNAs, individual miRNAs were thought to have a specific sequence of a particular length. A recent report describing a longer variant of a previously identified miRNA in Arabidopsis thaliana prompted this investigation for variations in the length of other miRNAs. Results In this paper, we demonstrate that a fifth of annotated A. thaliana miRNAs recorded in miRBase V.14 have stable miRNA isoforms that are one or two nucleotides longer than their respective recorded miRNA. Further, we demonstrate that miRNA isoforms are co-expressed and often show differential argonaute complex association. We postulate that these extensions are caused by differential cleavage of the parent precursor miRNA. Conclusions Our systematic analysis of A. thaliana miRNAs reveals that miRNA length isoforms are relatively common. This finding not only has implications for miRBase and miRNA annotation, but also extends to miRNA validation experiments and miRNA localization studies. Further, we predict that miRNA isoforms are present in other plant species also. PMID:20534119
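
    The notion of a length isoform described here, a sequence that contains the annotated miRNA and is one or two nucleotides longer, can be expressed as a simple check. This is an illustrative sketch only and not the authors' method: the function name is hypothetical and the example sequence is made up rather than taken from miRBase.

```python
def extension_length(annotated, candidate, max_extension=2):
    """Return how many nucleotides longer `candidate` is than `annotated`
    if it contains the annotated sequence and is 1..max_extension longer;
    otherwise return None."""
    extra = len(candidate) - len(annotated)
    if 1 <= extra <= max_extension and annotated in candidate:
        return extra
    return None


# Made-up 21-nucleotide sequence standing in for an annotated miRNA.
annotated = "UGGAAUCCUAGGCAUUGACCA"

print(extension_length(annotated, annotated + "U"))    # one nucleotide added at the 3' end
print(extension_length(annotated, "GA" + annotated))   # two nucleotides added at the 5' end
print(extension_length(annotated, annotated))          # identical length, not an isoform
```

    A substring test of this kind ignores mismatches and read abundance, so it only illustrates the length criterion and says nothing about stability or argonaute association.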

  20. Genetic Control of Environmental Variation of Two Quantitative Traits of Drosophila melanogaster Revealed by Whole-Genome Sequencing

    PubMed Central

    Sørensen, Peter; de los Campos, Gustavo; Morgante, Fabio; Mackay, Trudy F. C.; Sorensen, Daniel

    2015-01-01

    Genetic studies usually focus on quantifying and understanding the existence of genetic control on expected phenotypic outcomes. However, there is compelling evidence suggesting the existence of genetic control at the level of environmental variability, with some genotypes exhibiting more stable and others more volatile performance. Understanding the mechanisms responsible for environmental variability not only informs medical questions but is relevant in evolution and in agricultural science. In this work fully sequenced inbred lines of Drosophila melanogaster were analyzed to study the nature of genetic control of environmental variance for two quantitative traits: starvation resistance (SR) and startle response (SL). The evidence for genetic control of environmental variance is compelling for both traits. Sequence information is incorporated in random regression models to study the underlying genetic signals, which are shown to be different in the two traits. Genomic variance in sexual dimorphism was found for SR but not for SL. Indeed, the proportion of variance captured by sequence information and the contribution to this variance from four chromosome segments differ between sexes in SR but not in SL. The number of studies of environmental variation, particularly in humans, is limited. The availability of full sequence information and modern computationally intensive statistical methods provides opportunities for rigorous analyses of environmental variability. PMID:26269504

  1. Chimeric proteins for detection and quantitation of DNA mutations, DNA sequence variations, DNA damage and DNA mismatches

    DOEpatents

    McCutchen-Maloney, Sandra L.

    2002-01-01

    Chimeric proteins having both DNA mutation binding activity and nuclease activity are synthesized by recombinant technology. The proteins are of the general formula A-L-B and B-L-A where A is a peptide having DNA mutation binding activity, L is a linker and B is a peptide having nuclease activity. The chimeric proteins are useful for detection and identification of DNA sequence variations including DNA mutations (including DNA damage and mismatches) by binding to the DNA mutation and cutting the DNA once the DNA mutation is detected.

  2. Lack of sequence variation in sporadic bovine leucosis in regions of tumour suppressor genes p53 and p16.

    PubMed

    Mayr, B; Grüneis, C; Brem, G; Reifinger, M; Schaffner, G; Hochsteiner, W

    2001-08-01

    Regions of the promoter and exons 5-8 of the tumour suppressor gene p53 were analysed in 25 cases of sporadic bovine leucosis. The study included 17 cases of juvenile leucosis, five cases of adult leucosis and three cases of skin leucosis. Exon 2 of tumour suppressor gene p16 was also investigated in the same samples. No sequence variations were present in the analysed areas of the genes. In p53, this fact represents a clear difference in comparison with enzootic bovine leucosis. In p16, no comparative data are available. PMID:11554494

  3. The non-obese diabetic mouse sequence, annotation and variation resource: an aid for investigating type 1 diabetes

    PubMed Central

    Steward, Charles A.; Gonzalez, Jose M.; Trevanion, Steve; Sheppard, Dan; Kerry, Giselle; Gilbert, James G. R.; Wicker, Linda S.; Rogers, Jane; Harrow, Jennifer L.

    2013-01-01

    Model organisms are becoming increasingly important for the study of complex diseases such as type 1 diabetes (T1D). The non-obese diabetic (NOD) mouse is an experimental model for T1D having been bred to develop the disease spontaneously in a process that is similar to humans. Genetic analysis of the NOD mouse has identified around 50 disease loci, which have the nomenclature Idd for insulin-dependent diabetes, distributed across at least 11 different chromosomes. In total, 21 Idd regions across 6 chromosomes, that are major contributors to T1D susceptibility or resistance, were selected for finished sequencing and annotation at the Wellcome Trust Sanger Institute. Here we describe the generation of 40.4 mega base-pairs of finished sequence from 289 bacterial artificial chromosomes for the NOD mouse. Manual annotation has identified 738 genes in the diabetes sensitive NOD mouse and 765 genes in homologous regions of the diabetes resistant C57BL/6J reference mouse across 19 candidate Idd regions. This has allowed us to call variation consequences between homologous exonic sequences for all annotated regions in the two mouse strains. We demonstrate the importance of this resource further by illustrating the technical difficulties that regions of inter-strain structural variation between the NOD mouse and the C57BL/6J reference mouse can cause for current next generation sequencing and assembly techniques. Furthermore, we have established that the variation rate in the Idd regions is 2.3 times higher than the mean found for the whole genome assembly for the NOD/ShiLtJ genome, which we suggest reflects the fact that positive selection for functional variation in immune genes is beneficial in regard to host defence. In summary, we provide an important resource, which aids the analysis of potential causative genes involved in T1D susceptibility. Database URLs: http://www.sanger.ac.uk/resources/mouse/nod/; http://vega

  4. Patterns of structural and sequence variation within isotype lineages of the Neisseria meningitidis transferrin receptor system

    PubMed Central

    Adamiak, Paul; Calmettes, Charles; Moraes, Trevor F; Schryvers, Anthony B

    2015-01-01

    Neisseria meningitidis inhabits the human upper respiratory tract and is an important cause of sepsis and meningitis. A surface receptor composed of transferrin-binding proteins A and B (TbpA and TbpB) is responsible for acquiring iron from host transferrin. Sequence and immunological diversity divides TbpBs into two distinct lineages: isotype I and isotype II. Two representative isotype I and II strains, B16B6 and M982, differ in their dependence on TbpB for in vitro growth on exogenous transferrin. The crystal structure of TbpB and a structural model for TbpA from the representative isotype I N. meningitidis strain B16B6 were obtained. The structures were integrated with a comprehensive analysis of the sequence diversity of these proteins to probe for potential functional differences. A distinct isotype I TbpA was identified that co-varied with TbpB and lacked sequence in the region for the loop 3 α-helix that is proposed to be involved in iron removal from transferrin. The tightly associated isotype I TbpBs had a distinct anchor peptide region, a distinct, smaller linker region between the lobes and lacked the large loops in the isotype II C-lobe. Sequences of the intact TbpB, the TbpB N-lobe, the TbpB C-lobe, and TbpA were subjected to phylogenetic analyses. The phylogenetic clustering of TbpA and the TbpB C-lobe was similar, with two main branches comprising the isotype I and isotype II TbpBs, possibly suggesting an association between TbpA and the TbpB C-lobe. The intact TbpB and TbpB N-lobe had 4 main branches, one consisting of the isotype I TbpBs. One isotype II TbpB cluster appeared to consist of isotype I N-lobe sequences and isotype II C-lobe sequences, indicating the swapping of N-lobes and C-lobes. Our findings should inform future studies on the interaction between TbpB and TbpA and the process of iron acquisition. PMID:25800619

  5. Purification to homogeneity and amino acid sequence analysis of two anionic species of human interleukin 1

    PubMed Central

    1986-01-01

    Two anionic species of human IL-1 have been purified to homogeneity. These molecules were characterized as having pI of 5.4 and 5.2 and molecular weights identical to IL-1/6.8 (17,500). The specific activities of IL-1/5.4 and IL-1/5.2, as measured in the mouse thymocyte co-mitogenic assay, were identical to that of IL-1/6.8, namely 1.2 × 10⁷ U/mg, with half-maximal stimulation observed at 2 × 10⁻¹¹ M. IL-1/5.4 and IL-1/5.2 were found to be antigenically distinct from IL-1/6.8 in an ELISA. IL-1/5.4 was structurally distinct from IL-1/6.8 based on reverse-phase HPLC or CNBr peptides. Intact IL-1/5.2 and three intact CNBr peptides of IL-1/5.4 were sequenced, with the identification of 74 amino acid residues. These sequences were found to correspond exactly with the amino acid sequence deduced from the IL-1-alpha cDNA reported by March et al. PMID:3487613
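
    The quantities reported above can be related by simple unit conversion. The sketch below converts the half-maximal molar concentration to a mass concentration using the stated molecular weight, then to activity units using the stated specific activity. The abstract does not define the assay unit, so the last figure is only arithmetic on the reported numbers; the function names are illustrative.

```python
# Unit-conversion sketch using only the figures given in the abstract:
# molecular weight 17,500, specific activity 1.2e7 U/mg, half-maximal
# stimulation at 2e-11 M. Function names are hypothetical.

MOLECULAR_WEIGHT = 17_500          # g/mol
SPECIFIC_ACTIVITY = 1.2e7          # U per mg of protein
HALF_MAX_MOLAR = 2e-11             # mol/L


def molar_to_ng_per_ml(molar, molecular_weight):
    """mol/L * g/mol = g/L; 1 g/L equals 1e6 ng/mL."""
    return molar * molecular_weight * 1e6


def units_per_ml(ng_per_ml, specific_activity_u_per_mg):
    """Convert ng/mL to mg/mL (1e-6), then multiply by U/mg."""
    return ng_per_ml * 1e-6 * specific_activity_u_per_mg


if __name__ == "__main__":
    mass_conc = molar_to_ng_per_ml(HALF_MAX_MOLAR, MOLECULAR_WEIGHT)
    print(f"half-maximal concentration: {mass_conc:.2f} ng/mL")
    print(f"equivalent activity: {units_per_ml(mass_conc, SPECIFIC_ACTIVITY):.1f} U/mL")
```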

  6. Protein meta-functional signatures from combining sequence, structure, evolution, and amino acid property information.

    PubMed

    Wang, Kai; Horst, Jeremy A; Cheng, Gong; Nickle, David C; Samudrala, Ram

    2008-09-26

    Protein function is mediated by different amino acid residues, both their positions and types, in a protein sequence. Some amino acids are responsible for the stability or overall shape of the protein, playing an indirect role in protein function. Others play a functionally important role as part of active or binding sites of the protein. For a given protein sequence, the residues and their degree of functional importance can be thought of as a signature representing the function of the protein. We have developed a combination of knowledge- and biophysics-based function prediction approaches to elucidate the relationships between the structural and the functional roles of individual residues and positions. Such a meta-functional signature (MFS), which is a collection of continuous values representing the functional significance of each residue in a protein, may be used to study proteins of known function in greater detail and to aid in experimental characterization of proteins of unknown function. We demonstrate the superior performance of MFS in predicting protein functional sites and also present four real-world examples to apply MFS in a wide range of settings to elucidate protein sequence-structure-function relationships. Our results indicate that the MFS approach, which can combine multiple sources of information and also give biological interpretation to each component, greatly facilitates the understanding and characterization of protein function.

  7. Posttranslational modification and sequence variation of redox-active proteins correlate with biofilm life cycle in natural microbial communities

    SciTech Connect

    Singer, Steven; Erickson, Brian K; Verberkmoes, Nathan C; Hwang, Mona; Shah, Manesh B; Hettich, Robert L; Banfield, Jillian F.; Thelen, Michael P.

    2010-01-01

    Characterizing proteins recovered from natural microbial communities affords the opportunity to correlate protein expression and modification with environmental factors, including species composition and successional stage. Proteogenomic and biochemical studies of pellicle biofilms from subsurface acid mine drainage streams have shown abundant cytochromes from the dominant organism, Leptospirillum Group II. These cytochromes are proposed to be key proteins in aerobic Fe(II) oxidation, the dominant mode of cellular energy generation by the biofilms. In this study, we determined that posttranslational modification and expression of amino-acid sequence variants change as a function of biofilm maturation. For Cytochrome579 (Cyt579), the most abundant cytochrome in the biofilms, late developmental-stage biofilms differed from early-stage biofilms in N-terminal truncations and decreased redox potentials. Expression of sequence variants of two monoheme c-type cytochromes also depended on biofilm development. For Cyt572, an abundant membrane-bound cytochrome, the expression of multiple sequence variants was observed in both early and late developmental-stage biofilms; however, redox potentials of Cyt572 from these different sources did not vary significantly. These cytochrome analyses show a complex response of the Leptospirillum Group II electron transport chain to growth within a microbial community and illustrate the power of multiple proteomics techniques to define biochemistry in natural systems.

  8. Bacteria obtained from a sequencing batch reactor that are capable of growth on dehydroabietic acid.

    PubMed

    Mohn, W W

    1995-06-01

    Eleven isolates capable of growth on the resin acid dehydroabietic acid (DhA) were obtained from a sequencing batch reactor designed to treat a high-strength process stream from a paper mill. The isolates belonged to two groups, represented by strains DhA-33 and DhA-35, which were characterized. In the bioreactor, bacteria like DhA-35 were more abundant than those like DhA-33. The population in the bioreactor of organisms capable of growth on DhA was estimated to be 1.1 × 10⁶ propagules per ml, based on a most-probable-number determination. Analysis of small-subunit rRNA partial sequences indicated that DhA-33 was most closely related to Sphingomonas yanoikuyae (Sab = 0.875) and that DhA-35 was most closely related to Zoogloea ramigera (Sab = 0.849). Both isolates additionally grew on other abietanes, i.e., abietic and palustric acids, but not on the pimaranes, pimaric and isopimaric acids. For DhA-33 and DhA-35 with DhA as the sole organic substrate, doubling times were 2.7 and 2.2 h, respectively, and growth yields were 0.30 and 0.25 g of protein per g of DhA, respectively. Glucose as a cosubstrate stimulated growth of DhA-33 on DhA and stimulated DhA degradation by the culture. Pyruvate as a cosubstrate did not stimulate growth of DhA-35 on DhA and reduced the specific rate of DhA degradation of the culture. DhA induced DhA and abietic acid degradation activities in both strains, and these activities were heat labile. Cell suspensions of both strains consumed DhA at a rate of 6 μmol mg of protein⁻¹ h⁻¹. (ABSTRACT TRUNCATED AT 250 WORDS)
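
    The doubling times reported above can be converted to specific growth rates with the standard relation mu = ln(2) / t_d. The sketch below assumes balanced exponential growth, which the abstract does not state explicitly; only the two doubling times come from the record.

```python
import math

# Doubling times (hours) as reported in the abstract, with DhA as the sole
# organic substrate.
DOUBLING_TIMES_H = {"DhA-33": 2.7, "DhA-35": 2.2}


def specific_growth_rate(doubling_time_h):
    """Specific growth rate mu (per hour), assuming exponential growth."""
    return math.log(2) / doubling_time_h


if __name__ == "__main__":
    for strain, t_d in DOUBLING_TIMES_H.items():
        print(f"{strain}: mu = {specific_growth_rate(t_d):.3f} per hour")
```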

  9. Amplification and thrifty single-molecule sequencing of recurrent somatic structural variations.

    PubMed

    Patel, Anand; Schwab, Richard; Liu, Yu-Tsueng; Bafna, Vineet

    2014-02-01

    Deletion of tumor-suppressor genes as well as other genomic rearrangements pervade cancer genomes across numerous types of solid tumor and hematologic malignancies. However, even for a specific rearrangement, the breakpoints may vary between individuals, such as the recurrent CDKN2A deletion. Characterizing the exact breakpoints for structural variants (SVs) is useful for designating patient-specific tumor biomarkers. We propose AmBre (Amplification of Breakpoints), a method to target SV breakpoints occurring in samples composed of heterogeneous tumor and germline DNA. Additionally, AmBre validates SVs called by whole-exome/genome sequencing and hybridization arrays. AmBre involves a PCR-based approach to amplify the DNA segment containing an SV's breakpoint and then confirms breakpoints using sequencing by Pacific Biosciences RS. To amplify breakpoints with PCR, primers tiling specified target regions are carefully selected with a simulated annealing algorithm to minimize off-target amplification and maximize efficiency at capturing all possible breakpoints within the target regions. To confirm correct amplification and obtain breakpoints, PCR amplicons are combined without barcoding and simultaneously long-read sequenced using a single SMRT cell. Our algorithm efficiently separates reads based on breakpoints. Each read group supporting the same breakpoint corresponds with an amplicon and a consensus amplicon sequence is called. AmBre was used to discover CDKN2A deletion breakpoints in cancer cell lines: A549, CEM, Detroit562, MOLT4, MCF7, and T98G. Also, we successfully assayed RUNX1-RUNX1T1 reciprocal translocations by finding both breakpoints in the Kasumi-1 cell line. AmBre successfully targets SVs where DNA harboring the breakpoints are present in 1:1000 mixtures.

  10. Sequence variation of cytotoxic T cell epitopes in different isolates of Epstein-Barr virus.

    PubMed

    Apolloni, A; Moss, D; Stumm, R; Burrows, S; Suhrbier, A; Misko, I; Schmidt, C; Sculley, T

    1992-01-01

    Previous results have identified two distinct cytotoxic T lymphocyte (CTL) epitopes encoded by Epstein-Barr virus (EBV), TETA (ORF BLRF3/BERF1 residues 329-353) and EENL (ORF BERF3/BERF4 residues 290-309). Measurement of the specificities of CTL clones (TETA-specific clone 13 and EENL-specific clone 7) directed against these epitopes indicated that the EENL epitope is conserved in all strains of EBV tested while the TETA epitope varied between individual virus strains. Sequencing of the DNA regions encoding these two CTL epitopes in different EBV isolates confirmed these interpretations and demonstrated that different TETA epitope sequences were encoded by B-type EBV strains and by the B95-8 isolate of EBV compared to the other A-type EBV strains. Titration of synthetic variants of the TETA epitope revealed that the epitope encoded by B95-8 was 15-fold less efficient as a T cell epitope than the sequence encoded by other A-type viral strains while the TETA variant encoded by the B-type strains displayed essentially no activity as a T cell epitope.

  11. Optimally recovering rate variation information from genomes and sequences: pattern filtering.

    PubMed

    Lake, J A

    1998-09-01

    Nucleotide substitution rates vary at different positions within genes and genomes, but rates are difficult to estimate, because they are masked by the stochastic nature of substitutions. In this paper, a linear method, pattern filtering, is described which can optimally separate the signals (related to substitution rates or to other measures of sequence change) from stochastic noise. Pattern filtering promises to be useful in both genomic and molecular evolution studies. In an example using mitochondrial genomes, it is shown that pattern filtering can reveal coding and non-coding regions without the need for prior identification of reading frames or other knowledge of the sequence and promises to be an important tool for genomic analysis. In a second example, it is shown that pattern filtering allows one to classify sites on the basis of an estimator of substitution rates. Using elongation factor EF-1 alpha sequences, it is shown that the fastest sites favor archaea as the sister taxon of eukaryotes, whereas the slower sites support the eocyte prokaryotes as the sister taxon of eukaryotes, suggesting that the former result is an artifact of "long branch attraction." PMID:9729887

  12. Development of a SCAR (sequence-characterised amplified region) marker for acid resistance-related gene in Lactobacillus plantarum.

    PubMed

    Liu, Shu-Wen; Li, Kai; Yang, Shi-Ling; Tian, Shu-Fen; He, Ling

    2015-03-01

    A sequence characterised amplified region marker was developed to determine an acid resistance-related gene in Lactobacillus plantarum. A random amplified polymorphic DNA marker named S116-680 was reported to be closely related to the acid resistance of the strains. The DNA band corresponding to this marker was cloned and sequenced with the induction of specific designed PCR primers. The results of PCR test helped to amplify a clear specific band of 680 bp in the tested acid-resistant strains. S116-680 marker would be useful to explore the acid-resistant mechanism of L. plantarum and to screen desirable malolactic fermentation strains.

  13. Nucleic and amino acid sequences relating to a novel transketolase, and methods for the expression thereof

    DOEpatents

    Croteau, Rodney Bruce; Wildung, Mark Raymond; Lange, Bernd Markus; McCaskill, David G.

    2001-01-01

    cDNAs encoding 1-deoxyxylulose-5-phosphate synthase from peppermint (Mentha piperita) have been isolated and sequenced, and the corresponding amino acid sequences have been determined. Accordingly, isolated DNA sequences (SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7) are provided which code for the expression of 1-deoxyxylulose-5-phosphate synthase from plants. In another aspect the present invention provides for isolated, recombinant DXPS proteins, such as the proteins having the sequences set forth in SEQ ID NO:4, SEQ ID NO:6 and SEQ ID NO:8. In other aspects, replicable recombinant cloning vehicles are provided which code for plant 1-deoxyxylulose-5-phosphate synthases, or for a base sequence sufficiently complementary to at least a portion of 1-deoxyxylulose-5-phosphate synthase DNA or RNA to enable hybridization therewith. In yet other aspects, modified host cells are provided that have been transformed, transfected, infected and/or injected with a recombinant cloning vehicle and/or DNA sequence encoding a plant 1-deoxyxylulose-5-phosphate synthase. Thus, systems and methods are provided for the recombinant expression of the aforementioned recombinant 1-deoxyxylulose-5-phosphate synthase that may be used to facilitate its production, isolation and purification in significant amounts. Recombinant 1-deoxyxylulose-5-phosphate synthase may be used to obtain expression or enhanced expression of 1-deoxyxylulose-5-phosphate synthase in plants in order to enhance the production of 1-deoxyxylulose-5-phosphate, or its derivatives such as isopentenyl diphosphate (IPP), or may be otherwise employed for the regulation or expression of 1-deoxyxylulose-5-phosphate synthase, or the production of its products.

  14. Natural variation in Brachypodium distachyon: Deep Sequencing of Highly Diverse Natural Accessions (2013 DOE JGI Genomics of Energy and Environment 8th Annual User Meeting)

    SciTech Connect

    Gordon, Sean

    2013-03-01

    Sean Gordon of the USDA on "Natural variation in Brachypodium distachyon: Deep Sequencing of Highly Diverse Natural Accessions" at the 8th Annual Genomics of Energy & Environment Meeting on March 27, 2013 in Walnut Creek, Calif.

  15. Genome Sequence Analysis of the Naphthenic Acid Degrading and Metal Resistant Bacterium Cupriavidus gilardii CR3

    PubMed Central

    Xiao, Jingfa; Hao, Lirui; Crowley, David E.; Zhang, Zhewen; Yu, Jun; Huang, Ning; Huo, Mingxin; Wu, Jiayan

    2015-01-01

    Cupriavidus sp. are generally heavy metal tolerant bacteria with the ability to degrade a variety of aromatic hydrocarbon compounds, although the degradation pathways and substrate versatilities remain largely unknown. Here we studied the bacterium Cupriavidus gilardii strain CR3, which was isolated from a natural asphalt deposit, and which was shown to utilize naphthenic acids as a sole carbon source. Genome sequencing of C. gilardii CR3 was carried out to elucidate possible mechanisms for the naphthenic acid biodegradation. The genome of C. gilardii CR3 was composed of two circular chromosomes chr1 and chr2 of respectively 3,539,530 bp and 2,039,213 bp in size. The genome for strain CR3 encoded 4,502 putative protein-coding genes, 59 tRNA genes, and many other non-coding genes. Many genes were associated with xenobiotic biodegradation and metal resistance functions. Pathway prediction for degradation of cyclohexanecarboxylic acid, a representative naphthenic acid, suggested that naphthenic acid undergoes initial ring-cleavage, after which the ring fission products can be degraded via several plausible degradation pathways including a mechanism similar to that used for fatty acid oxidation. The final metabolic products of these pathways are unstable or volatile compounds that were not toxic to CR3. Strain CR3 was also shown to have tolerance to at least 10 heavy metals, which was mainly achieved by self-detoxification through ion efflux, metal-complexation and metal-reduction, and a powerful DNA self-repair mechanism. Our genomic analysis suggests that CR3 is well adapted to survive the harsh environment in natural asphalts containing naphthenic acids and high concentrations of heavy metals. PMID:26301592

  16. Genome Sequence Analysis of the Naphthenic Acid Degrading and Metal Resistant Bacterium Cupriavidus gilardii CR3.

    PubMed

    Wang, Xiaoyu; Chen, Meili; Xiao, Jingfa; Hao, Lirui; Crowley, David E; Zhang, Zhewen; Yu, Jun; Huang, Ning; Huo, Mingxin; Wu, Jiayan

    2015-01-01

    Cupriavidus sp. are generally heavy metal tolerant bacteria with the ability to degrade a variety of aromatic hydrocarbon compounds, although the degradation pathways and substrate versatilities remain largely unknown. Here we studied the bacterium Cupriavidus gilardii strain CR3, which was isolated from a natural asphalt deposit, and which was shown to utilize naphthenic acids as a sole carbon source. Genome sequencing of C. gilardii CR3 was carried out to elucidate possible mechanisms for the naphthenic acid biodegradation. The genome of C. gilardii CR3 was composed of two circular chromosomes chr1 and chr2 of respectively 3,539,530 bp and 2,039,213 bp in size. The genome for strain CR3 encoded 4,502 putative protein-coding genes, 59 tRNA genes, and many other non-coding genes. Many genes were associated with xenobiotic biodegradation and metal resistance functions. Pathway prediction for degradation of cyclohexanecarboxylic acid, a representative naphthenic acid, suggested that naphthenic acid undergoes initial ring-cleavage, after which the ring fission products can be degraded via several plausible degradation pathways including a mechanism similar to that used for fatty acid oxidation. The final metabolic products of these pathways are unstable or volatile compounds that were not toxic to CR3. Strain CR3 was also shown to have tolerance to at least 10 heavy metals, which was mainly achieved by self-detoxification through ion efflux, metal-complexation and metal-reduction, and a powerful DNA self-repair mechanism. Our genomic analysis suggests that CR3 is well adapted to survive the harsh environment in natural asphalts containing naphthenic acids and high concentrations of heavy metals. PMID:26301592

  17. [Variation of the sialic acid-protein ratio in the human cervical mucus (author's transl)].

    PubMed

    Kesserü, E; Westphal, N

    1975-01-01

    Serial determinations of total protein and sialic acid concentrations were carried out in individual cervical mucus samples of normal women, throughout the menstrual cycle as well as under the influence of estrogen and progestagen steroids. Total protein was titrated by a modified micro-biuret method and sialic acid was determined using the "Direct Ehrlich" method. Both parameters diminished gradually during the follicular phase of the cycle with lowest values around the time of ovulation, and showed a strong increase during the luteal phase. Administration of ethinylestradiol yielded values significantly lower than those of normal ovulatory phase and under d-Norgestrel treatment the values were significantly higher than in normal luteal phase. However, the proportion of sialic acid as related to total protein showed an inverse behavior, i.e. increased under estrogenic and decreased under progestogenic influence. The cyclic and hormone-induced variations of this ratio were more specific than those of both separate components. Thus, estrogens produce an increase of the sialic acid containing fraction of cervical mucus proteins, while progestagens have an opposite effect. The role of these actions in reproductive physiology is discussed.

  18. Discovery of a novel amino acid racemase through exploration of natural variation in Arabidopsis thaliana

    PubMed Central

    Strauch, Renee C.; Svedin, Elisabeth; Dilkes, Brian; Chapple, Clint; Li, Xu

    2015-01-01

    Plants produce diverse low-molecular-weight compounds via specialized metabolism. Discovery of the pathways underlying production of these metabolites is an important challenge for harnessing the huge chemical diversity and catalytic potential in the plant kingdom for human uses, but this effort is often encumbered by the necessity to initially identify compounds of interest or purify a catalyst involved in their synthesis. As an alternative approach, we have performed untargeted metabolite profiling and genome-wide association analysis on 440 natural accessions of Arabidopsis thaliana. This approach allowed us to establish genetic linkages between metabolites and genes. Investigation of one of the metabolite–gene associations led to the identification of N-malonyl-d-allo-isoleucine, and the discovery of a novel amino acid racemase involved in its biosynthesis. This finding provides, to our knowledge, the first functional characterization of a eukaryotic member of a large and widely conserved phenazine biosynthesis protein PhzF-like protein family. Unlike most of known eukaryotic amino acid racemases, the newly discovered enzyme does not require pyridoxal 5′-phosphate for its activity. This study thus identifies a new d-amino acid racemase gene family and advances our knowledge of plant d-amino acid metabolism that is currently largely unexplored. It also demonstrates that exploitation of natural metabolic variation by integrating metabolomics with genome-wide association is a powerful approach for functional genomics study of specialized metabolism. PMID:26324904

  19. Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using

    DOEpatents

    Weier, Heinz-Ulrich G.; Gray, Joe W.

    1995-01-01

    A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity.

  20. Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using

    DOEpatents

    Weier, H.U.G.; Gray, J.W.

    1995-06-27

    A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity. 18 figs.

  1. Unconventional amino acid sequence of the sun anemone (Stoichactis helianthus) polypeptide neurotoxin

    SciTech Connect

    Kem, W.; Dunn, B.; Parten, B.; Pennington, M.; Price, D.

    1986-05-01

    A 5000 dalton polypeptide neurotoxin (Sh-NI) purified by G50 Sephadex, P-cellulose, and SP-Sephadex chromatography was homogeneous by isoelectric focusing. Sh-NI was highly toxic to crayfish (LD50 0.6 µg/kg) but without effect upon mice at 15,000 µg/kg (i.p. injection). The reduced, ³H-carboxymethylated toxin and its fragments were subjected to automatic Edman degradation and the resulting PTH-amino acids were identified by HPLC, back hydrolysis, and scintillation counting. Peptides resulting from proteolytic (clostripain, staphylococcal protease) and chemical (tryptophan) cleavage were sequenced. The sequence is: AACKCDDEGPDIRTAPLTGTVDLGSCNAGWEKCASYYTIIADCCRKKK. This sequence differs considerably from the homologous Anemonia and Anthopleura toxins; many of the identical residues (6 half-cystines, G9, P10, R13, G19, G29, W30) are probably critical for folding rather than receptor recognition. However, the Sh-NI sequence closely resembles Radioanthus macrodactylus neurotoxin III and R. paumotensis II. The authors propose that Sh-NI and related Radioanthus toxins act upon a different site on the sodium channel.
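    The residue numbering quoted in the abstract can be checked directly against the printed sequence. The short Python sketch below is illustrative only: the helper names (`residue_positions`, `check_conserved`) are hypothetical, and the only data used are the sequence string and the residue labels given in the abstract. It counts the half-cystines and confirms that each named residue sits at the stated 1-based position.

```python
# Sequence copied verbatim from the abstract (one-letter amino acid codes).
SH_NI = "AACKCDDEGPDIRTAPLTGTVDLGSCNAGWEKCASYYTIIADCCRKKK"

# Non-cysteine residues the abstract lists as identical to the homologous
# toxins, written as (residue, 1-based position).
CONSERVED = [("G", 9), ("P", 10), ("R", 13), ("G", 19), ("G", 29), ("W", 30)]


def residue_positions(seq, residue):
    """Return the 1-based positions at which `residue` occurs in `seq`."""
    return [i for i, aa in enumerate(seq, start=1) if aa == residue]


def check_conserved(seq, expected):
    """Map labels such as 'G9' to True when seq has that residue there."""
    return {f"{aa}{pos}": seq[pos - 1] == aa for aa, pos in expected}


if __name__ == "__main__":
    print("length:", len(SH_NI))
    print("cysteine positions:", residue_positions(SH_NI, "C"))
    print("conserved residues:", check_conserved(SH_NI, CONSERVED))
```

    Run as a script, this reports a 48-residue chain with six cysteines, in agreement with the six half-cystines mentioned in the abstract.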

  2. Sequence-defined bioactive macrocycles via an acid-catalysed cascade reaction

    NASA Astrophysics Data System (ADS)

    Porel, Mintu; Thornlow, Dana N.; Phan, Ngoc N.; Alabi, Christopher A.

    2016-06-01

    Synthetic macrocycles derived from sequence-defined oligomers are a unique structural class whose ring size, sequence and structure can be tuned via precise organization of the primary sequence. Similar to peptides and other peptidomimetics, these well-defined synthetic macromolecules become pharmacologically relevant when bioactive side chains are incorporated into their primary sequence. In this article, we report the synthesis of oligothioetheramide (oligoTEA) macrocycles via a one-pot acid-catalysed cascade reaction. The versatility of the cyclization chemistry and modularity of the assembly process was demonstrated via the synthesis of >20 diverse oligoTEA macrocycles. Structural characterization via NMR spectroscopy revealed the presence of conformational isomers, which enabled the determination of local chain dynamics within the macromolecular structure. Finally, we demonstrate the biological activity of oligoTEA macrocycles designed to mimic facially amphiphilic antimicrobial peptides. The preliminary results indicate that macrocyclic oligoTEAs with just two-to-three cationic charge centres can elicit potent antibacterial activity against Gram-positive and Gram-negative bacteria.

  3. A new antifungal peptide from the seeds of Phytolacca americana: characterization, amino acid sequence and cDNA cloning.

    PubMed

    Shao, F; Hu, Z; Xiong, Y M; Huang, Q Z; Wang, C G; Zhu, R H; Wang, D C

    1999-03-19

    An antifungal peptide from seeds of Phytolacca americana, designated PAFP-s, has been isolated. The peptide is highly basic and consists of 38 residues with three disulfide bridges. Its molecular mass of 3929.0 was determined by mass spectrometry. The complete amino acid sequence was obtained from automated Edman degradation, and cDNA cloning was successfully performed by 3'-RACE. The deduced amino acid sequence of a partial cDNA corresponded to the amino acid sequence from chemical sequencing. PAFP-s exhibited a broad spectrum of antifungal activity, and its activities differed among various fungi. PAFP-s displayed no inhibitory activity towards Escherichia coli. PAFP-s shows significant sequence similarities and the same cysteine motif with Mj-AMPs, antimicrobial peptides from seeds of Mirabilis jalapa belonging to the knottin-type antimicrobial peptide.

  4. Amino acid sequence and variant forms of favin, a lectin from Vicia faba.

    PubMed

    Hopp, T P; Hemperly, J J; Cunningham, B A

    1982-04-25

    We have determined the complete amino acid sequence (182 residues) of the beta chain of favin, the glucose-binding lectin from fava beans (Vicia faba), and have established that the carbohydrate moiety is attached to Asn 168. Together with the sequence of the alpha chain previously reported (Hemperly, J. J., Hopp, T. P., Becker, J. W., and Cunningham, B. A. (1979) J. Biol. Chem. 254, 6803-6810), these data complete the analysis of the primary structure of the lectin. We have also examined minor polypeptides that appear in all preparations of favin. Two lower molecular weight species (Mr = 9,500-11,600) appear to be fragments of the beta chain resulting from cleavage following Asn 76, whereas six high molecular weight forms (Mr = 25,000 or greater) appear to include aggregates of the beta chain and possibly some alternative products of chain processing. PMID:7068646

  5. Pyrosequencing on templates generated by asymmetric nucleic acid sequence-based amplification (asymmetric-NASBA).

    PubMed

    Jia, Huning; Chen, Zhiyao; Wu, Haiping; Ye, Hui; Yan, Zhengyu; Zhou, Guohua

    2011-12-21

    Pyrosequencing is an ideal tool for verifying the sequence of amplicons. To enable pyrosequencing on amplicons from nucleic acid sequence-based amplification (NASBA), asymmetric NASBA with unequal concentrations of T7 promoter primer and reverse transcription primer was proposed. After optimizing the ratio of the two primers and the concentrations of dNTPs and NTPs, the amount of single-stranded cDNA in the amplicons from asymmetric NASBA was found to be 12 times higher than in conventional NASBA, as measured by real-time detection with a molecular beacon specific to the cDNA of interest. More than 20 bases were successfully detected by pyrosequencing on amplicons from asymmetric NASBA using Human parainfluenza virus (HPIV) as an amplification template. The preliminary results indicate that the combination of NASBA with a pyrosequencing system is practical, and should open a new field in clinical diagnosis.

  6. FimH family of type 1 fimbrial adhesins: functional heterogeneity due to minor sequence variations among fimH genes.

    PubMed Central

    Sokurenko, E V; Courtney, H S; Ohman, D E; Klemm, P; Hasty, D L

    1994-01-01

    We recently reported that the type 1-fimbriated Escherichia coli strains CSH-50 and HB101(pPKL4), both K-12 derivatives, have different patterns of adhesion to yeast mannan, human plasma fibronectin, and fibronectin derivatives, suggesting functional heterogeneity of type 1 fimbriae. In this report, we provide evidence that this functional heterogeneity is due to variations in the fimH genes. We also investigated functional heterogeneity among clinical isolates and whether variation in fimH genes accounts for differences in receptor specificity. Twelve isolates obtained from human urine were tested for their ability to adhere to mannan, fibronectin, periodate-treated fibronectin, and a synthetic peptide copying the 30 amino-terminal residues of fibronectin. CSH-50 and HB101(pPKL4) were tested for comparison. Selected isolates were also tested for adhesion to purified fragments spanning the entire fibronectin molecule. Three distinct functional classes, designated M, MF, and MFP, were observed. The fimH genes were amplified by PCR from chromosomal DNA obtained from representative strains and expressed in a delta fim strain (AAEC191A) transformed with a recombinant plasmid containing the entire fim gene cluster but with a translational stop-linker inserted into the fimH gene (pPKL114). Cloned fimH genes conferred on AAEC191A(pPKL114) receptor specificities mimicking those of the parent strains from which the fimH genes were obtained, demonstrating that the FimH subunits are responsible for the functional heterogeneity. Representative fimH genes were sequenced, and the deduced amino acid sequences were compared with the previously published FimH sequence. Allelic variants exhibiting >98% homology and encoding proteins differing by as little as a single amino acid substitution confer distinct adhesive phenotypes. 
This unexpected adhesive diversity within the FimH family broadens the scope of potential receptors for enterobacterial adhesion and may lead to a fundamental

  7. Morphological transformation of calcite crystal growth by prismatic "acidic" polypeptide sequences.

    SciTech Connect

    Kim, I; Giocondi, J L; Orme, C A; Collino, J; Evans, J S

    2007-02-13

    Many of the interesting mechanical and materials properties of the mollusk shell are thought to stem from the prismatic calcite crystal assemblies within this composite structure. It is now evident that proteins play a major role in the formation of these assemblies. Recently, a superfamily of 7 conserved prismatic layer-specific mollusk shell proteins, Asprich, were sequenced, and the 42 AA C-terminal sequence region of this protein superfamily was found to introduce surface voids or porosities on calcite crystals in vitro. Using AFM imaging techniques, we further investigate the effect that this 42 AA domain (Fragment-2) and its constituent subdomains, DEAD-17 and Acidic-2, have on the morphology and growth kinetics of calcite dislocation hillocks. We find that Fragment-2 adsorbs on terrace surfaces and pins acute steps, accelerates then decelerates the growth of obtuse steps, forms clusters and voids on terrace surfaces, and transforms calcite hillock morphology from a rhombohedral form to a rounded one. These results mirror yet are distinct from some of the earlier findings obtained for nacreous polypeptides. The subdomains Acidic-2 and DEAD-17 were found to accelerate then decelerate obtuse steps and induce oval rather than rounded hillock morphologies. Unlike DEAD-17, Acidic-2 does form clusters on terrace surfaces and exhibits stronger obtuse velocity inhibition effects than either DEAD-17 or Fragment-2. Interestingly, a 1:1 mixture of both subdomains induces an irregular polygonal morphology to hillocks, and exhibits the highest degree of acute step pinning and obtuse step velocity inhibition. This suggests that there is some interplay between subdomains within an intra (Fragment-2) or intermolecular (1:1 mixture) context, and sequence interplay phenomena may be employed by biomineralization proteins to exert net effects on crystal growth and morphology.

  8. Variations of mitochondrial DNA sequence in three breeds of rabbit (Oryctolagus cuniculus).

    PubMed

    Terrance, Diamond G C; Thangamani, A; Srivastava, Varsha

    2007-07-01

    Sequencing of the 300 bp region of the mitochondrial cytochrome b gene was done. Genetic analysis was carried out for the first time in three exotic breeds (Giant White, Soviet Chinchilla, and German Angora) of European rabbit (Oryctolagus cuniculus) to determine intra- and interspecific variability and to measure the genetic distance. The frequencies of types of mutations (transition, transversion, deletion, and insertion) were also determined. This study throws light on matrilineage of breeds that arise due to interbreed crosses and the genetic management of a stocked rabbit breeding population. PMID:17630855
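    As an illustration of how the mutation types named in the abstract (transition, transversion, deletion, insertion) can be tallied from a pairwise alignment, here is a minimal Python sketch. It is not the authors' method: the function names are hypothetical, and the two aligned strings in the usage example are invented toy data, not cytochrome b sequences from the study. Gaps are assumed to be written as "-".

```python
from collections import Counter

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def classify_site(ref_base, sample_base):
    """Classify one aligned column relative to the reference sequence."""
    a, b = ref_base.upper(), sample_base.upper()
    if a == b:
        return "match"
    if a == "-":
        return "insertion"      # base present only in the sample
    if b == "-":
        return "deletion"       # base present only in the reference
    if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
        return "transition"     # purine<->purine or pyrimidine<->pyrimidine
    if a in BASES and b in BASES:
        return "transversion"   # purine<->pyrimidine
    return "ambiguous"          # e.g. N or another IUPAC code


def mutation_counts(ref, sample):
    """Count mutation types between two equal-length aligned sequences."""
    if len(ref) != len(sample):
        raise ValueError("aligned sequences must have equal length")
    counts = Counter(classify_site(a, b) for a, b in zip(ref, sample))
    counts.pop("match", None)
    return dict(counts)


if __name__ == "__main__":
    # Toy alignment (hypothetical data).
    print(mutation_counts("ACGTA-C", "GCTT-GC"))
```

    Dividing each count by the alignment length would give per-site frequencies of the kind the abstract refers to.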

  9. Whole genome sequencing of emerging multidrug resistant Candida auris isolates in India demonstrates low genetic variation.

    PubMed

    Sharma, C; Kumar, N; Pandey, R; Meis, J F; Chowdhary, A

    2016-09-01

    Candida auris is an emerging multidrug resistant yeast that causes nosocomial fungaemia and deep-seated infections. Notably, the emergence of this yeast is alarming as it exhibits resistance to azoles, amphotericin B and caspofungin, which may lead to clinical failure in patients. The multigene phylogeny and amplified fragment length polymorphism typing methods report the C. auris population as clonal. Here, using whole genome sequencing analysis, we decipher for the first time that C. auris strains from four Indian hospitals were highly related, suggesting clonal transmission. Further, all C. auris isolates originated from cases of fungaemia and were resistant to fluconazole (MIC >64 mg/L).

  10. Polarimetric Variations of Binary Stars. IV. Pre-Main-Sequence Spectroscopic Binaries Located in Taurus, Auriga, and Orion

    NASA Astrophysics Data System (ADS)

    Manset, N.; Bastien, P.

    2002-08-01

    We present polarimetric observations of 14 pre-main-sequence (PMS) binaries located in the Taurus, Auriga, and Orion star-forming regions. The majority of the average observed polarizations are below 0.5%, and none are above 0.9%. After removal of estimates of the interstellar polarization, about half the binaries have an intrinsic polarization above 0.5%, even though most of them do not present other evidences for the presence of circumstellar dust. Various tests reveal that 77% of the PMS binaries have or possibly have a variable polarization. LkCa 3, Par 1540, and Par 2494 present detectable periodic and phase-locked variations. The periodic polarimetric variations are noisier and of a lesser amplitude (~0.1%) than for other types of binaries, such as hot stars. This could be due to stochastic events that produce deviations in the average polarization, a nonfavorable geometry (circumbinary envelope), or the nature of the scatterers (dust grains are less efficient polarizers than electrons). Par 1540 is a weak-line T Tauri star but nonetheless has enough dust in its environment to produce detectable levels of polarization and variations. A fourth interesting case is W134, which displays rapid changes in polarization that could be due to eclipses. We compare the observations with some of our numerical simulations and also show that an analysis of the periodic polarimetric variations with the Brown, McLean, & Emslie (BME) formalism to find the orbital inclination is for the moment premature: nonperiodic events introduce stochastic noise that partially masks the periodic low-amplitude variations and prevents the BME formalism from finding a reasonable estimate of the orbital inclination.

  11. The amino-acid sequences of sculpin islet somatostatin-28 and peptide YY.

    PubMed

    Cutfield, S M; Carne, A; Cutfield, J F

    1987-04-01

    Two pancreatic peptides, somatostatin-28 and peptide YY, have been isolated from the Brockmann bodies of the teleost fish Cottus scorpius (daddy sculpin). Following purification by reverse-phase HPLC, each peptide was sequenced completely through to the carboxyl-terminus by gas-phase Edman degradation. Somatostatin-28 was the major form of somatostatin detected and is similar to the gene II product from anglerfish. Peptide YY (36 amino acids) more closely resembles porcine neuropeptide YY and intestinal peptide YY than it does the pancreatic polypeptides. PMID:2883025

  12. Sequence selective recognition of double-stranded RNA using triple helix-forming peptide nucleic acids.

    PubMed

    Zengeya, Thomas; Gupta, Pankaj; Rozners, Eriks

    2014-01-01

    Noncoding RNAs are attractive targets for molecular recognition because of the central role they play in gene expression. Since most noncoding RNAs are in a double-helical conformation, recognition of such structures is a formidable problem. Herein, we describe a method for sequence-selective recognition of biologically relevant double-helical RNA (illustrated on ribosomal A-site RNA) using peptide nucleic acids (PNA) that form a triple helix in the major groove of RNA under physiologically relevant conditions. Protocols for PNA preparation and binding studies using isothermal titration calorimetry are described in detail.

  13. Sequence selective double strand DNA cleavage by peptide nucleic acid (PNA) targeting using nuclease S1.

    PubMed Central

    Demidov, V; Frank-Kamenetskii, M D; Egholm, M; Buchardt, O; Nielsen, P E

    1993-01-01

    A novel method for sequence specific double strand DNA cleavage using PNA (peptide nucleic acid) targeting is described. Nuclease S1 digestion of double stranded DNA gives rise to double strand cleavage at an occupied PNA strand displacement binding site, and under optimized conditions complete cleavage can be obtained. The efficiency of this cleavage is more than 10 fold enhanced when a tandem PNA site is targeted, and additionally enhanced if this site is in trans rather than in cis orientation. Thus in effect, the PNA targeting makes the single strand specific nuclease S1 behave like a pseudo restriction endonuclease. Images PMID:8502550

  14. Fast computational methods for predicting protein structure from primary amino acid sequence

    DOEpatents

    Agarwal, Pratul Kumar

    2011-07-19

    The present invention provides a method utilizing primary amino acid sequence of a protein, energy minimization, molecular dynamics and protein vibrational modes to predict three-dimensional structure of a protein. The present invention also determines possible intermediates in the protein folding pathway. The present invention has important applications to the design of novel drugs as well as protein engineering. The present invention predicts the three-dimensional structure of a protein independent of size of the protein, overcoming a significant limitation in the prior art.

  15. WinGene/WinPep: user-friendly software for the analysis of amino acid sequences.

    PubMed

    Hennig, L

    1999-06-01

    WinGene1.0/WinPep1.2 is a pair of Microsoft Windows programs designed to read nucleotide or amino acid sequence data. These versatile programs have the following capabilities: (i) searches for open reading frames and their translation, (ii) assisting the design of primers for PCR and (iii) calculation of molecular weight, isoelectric point and molar absorption coefficients of polypeptides. Furthermore, hydropathic plots and helical wheel displays are easily produced. The programs run with an intuitive Windows interface, contain a comprehensive help file and enable data exchange with other applications by means of the Copy&Paste command. The software is free for academic and noncommercial users.
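    Capability (i), searching for open reading frames and translating them, can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not WinGene's actual algorithm: it scans only the three forward frames, uses the standard genetic code, treats ATG as the sole start codon, and the names `translate` and `find_orfs` are hypothetical.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
_BASES = "TCAG"
_AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
          "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(_BASES, repeat=3), _AMINO)}


def translate(dna):
    """Translate a DNA string codon by codon; unknown codons become 'X'."""
    dna = dna.upper()
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def find_orfs(dna, min_codons=1):
    """Return (start, end, protein) for ATG...stop ORFs in the forward frames.

    start and end are 0-based and half-open; end includes the stop codon.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and CODON_TABLE.get(codon) == "*":
                protein = translate(dna[start:i])
                if len(protein) >= min_codons:
                    orfs.append((start, i + 3, protein))
                start = None
    return orfs


if __name__ == "__main__":
    # Toy input (hypothetical): one short ORF in the third forward frame.
    print(find_orfs("CCATGGCTTAAGG"))
```

    A fuller tool would also scan the reverse complement and allow alternative genetic codes; both are omitted here to keep the sketch short.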

  16. Complete genome sequence of Lactococcus lactis IO-1, a lactic acid bacterium that utilizes xylose and produces high levels of L-lactic acid.

    PubMed

    Kato, Hiroaki; Shiwa, Yuh; Oshima, Kenshiro; Machii, Miki; Araya-Kojima, Tomoko; Zendo, Takeshi; Shimizu-Kadota, Mariko; Hattori, Masahira; Sonomoto, Kenji; Yoshikawa, Hirofumi

    2012-04-01

    We report the complete genome sequence of Lactococcus lactis IO-1 (= JCM7638). It is a nondairy lactic acid bacterium that produces nisin Z, ferments xylose, and produces predominantly L-lactic acid at high xylose concentrations. From ortholog analysis with five other L. lactis strains, IO-1 was identified as L. lactis subsp. lactis.

  17. Purification and amino acid sequence of aminopeptidase P from pig kidney.

    PubMed

    Vergas Romero, C; Neudorfer, I; Mann, K; Schäfer, W

    1995-04-01

    Aminopeptidase P from kidney cortex was purified in high yield (recovery greater than or equal to 20%) by a series of column chromatographic steps after solubilization of the membrane-bound glycoprotein with n-butanol. A coupled enzymic assay, using Gly-Pro-Pro-NH-Nap as substrate and dipeptidyl-peptidase IV as auxiliary enzyme, was used to monitor the purification. The purification procedure yielded two forms of aminopeptidase P differing in their carbohydrate composition (glycoforms). Both enzyme preparations were homogeneous as assessed by SDS/PAGE, silver staining, and isoelectric focusing. Both forms possessed the same substrate specificity, catalysed the same reaction, and consisted of identical protein chains. The amino acid sequence determined by Edman degradation and mass spectrometry consisted of 623 amino acids. Six N-glycosylation sites, all contained in the N-terminal half of the protein, were characterized. PMID:7744038

  18. Mass spectrometric detection of the amino acid sequence polymorphism of the hepatitis C virus antigen.

    PubMed

    Kaysheva, A L; Ivanov, Yu D; Frantsuzov, P A; Krohin, N V; Pavlova, T I; Uchaikin, V F; Konev, V A; Kovalev, O B; Ziborov, V S; Archakov, A I

    2016-03-01

    A method for detection and identification of the hepatitis C virus antigen (HCVcoreAg) in human serum that accounts for possible amino acid substitutions is proposed. The method combines biospecific capture and concentration of the target protein on the surface of an atomic force microscope chip (AFM chip) with subsequent protein identification by tandem mass spectrometric (MS/MS) analysis. Biospecific AFM capture of viral particles containing HCVcoreAg from serum samples was performed using AFM chips with monoclonal antibodies (anti-HCVcore) covalently immobilized on the surface. Biospecific complexes were registered and counted by AFM. Subsequent MS/MS analysis allowed reliable identification of HCVcoreAg in the complexes formed on the AFM chip surface. Analysis of the MS/MS spectra that took into account possible polymorphisms in the amino acid sequence of HCVcoreAg enabled us to increase the number of identified peptides.

  19. Modeling the plant-soil interaction in presence of heavy metal pollution and acidity variations.

    PubMed

    Guala, Sebastián; Vega, Flora A; Covelo, Emma F

    2013-01-01

    Building on a mathematical interaction model developed to describe metal uptake by plants and its effects on their growth, we introduce a modification that also accounts for variations in soil acidity. The model relates the dynamics of metal uptake from soil to plants to the acidity level. Two types of relationships are considered: total and available metal content. We adopt simple mathematical assumptions to obtain expressions that are as simple as possible, so that they can be easily tested in experimental problems. This work modifies two versions of the model: one expresses the relationship between the metal in soil and the concentration of the metal in plants, and the other the relationship between the metal in soil and the total amount of the metal in plants. The subtle difference between the two versions is fundamental when considering the tolerance of plants and the capacity of their biomass to accumulate pollutants from the soil.

  20. Spatio-Temporal Variations of High and Low Nucleic Acid Content Bacteria in an Exorheic River.

    PubMed

    Liu, Jie; Hao, Zhenyu; Ma, Lili; Ji, Yurui; Bartlam, Mark; Wang, Yingying

    2016-01-01

    Bacteria with high nucleic acid (HNA) and low nucleic acid (LNA) content are commonly observed in aquatic environments. To date, limited knowledge is available on their temporal and spatial variations in freshwater environments. Here an investigation of HNA and LNA bacterial abundance and their flow cytometric characteristics was conducted in an exorheic river (Haihe River, Northern China) over a one year period covering September (autumn) 2011, December (winter) 2011, April (spring) 2012, and July (summer) 2012. The results showed that LNA and HNA bacteria contributed similarly to the total bacterial abundance on both the spatial and temporal scale. The variability of HNA on abundance, fluorescence intensity (FL1) and side scatter (SSC) were more sensitive to environmental factors than that of LNA bacteria. Meanwhile, the relative distance of SSC between HNA and LNA was more variable than that of FL1. Multivariate analysis further demonstrated that the influence of geographical distance (reflected by the salinity gradient along river to ocean) and temporal changes (as temperature variation due to seasonal succession) on the patterns of LNA and HNA were stronger than the effects of nutrient conditions. Furthermore, the results demonstrated that the distribution of LNA and HNA bacteria, including the abundance, FL1 and SSC, was controlled by different variables. The results suggested that LNA and HNA bacteria might play different ecological roles in the exorheic river. PMID:27082986

  1. Spatio-Temporal Variations of High and Low Nucleic Acid Content Bacteria in an Exorheic River

    PubMed Central

    Ma, Lili; Ji, Yurui; Bartlam, Mark; Wang, Yingying

    2016-01-01

    Bacteria with high nucleic acid (HNA) and low nucleic acid (LNA) content are commonly observed in aquatic environments. To date, limited knowledge is available on their temporal and spatial variations in freshwater environments. Here an investigation of HNA and LNA bacterial abundance and their flow cytometric characteristics was conducted in an exorheic river (Haihe River, Northern China) over a one year period covering September (autumn) 2011, December (winter) 2011, April (spring) 2012, and July (summer) 2012. The results showed that LNA and HNA bacteria contributed similarly to the total bacterial abundance on both the spatial and temporal scale. The variability of HNA on abundance, fluorescence intensity (FL1) and side scatter (SSC) were more sensitive to environmental factors than that of LNA bacteria. Meanwhile, the relative distance of SSC between HNA and LNA was more variable than that of FL1. Multivariate analysis further demonstrated that the influence of geographical distance (reflected by the salinity gradient along river to ocean) and temporal changes (as temperature variation due to seasonal succession) on the patterns of LNA and HNA were stronger than the effects of nutrient conditions. Furthermore, the results demonstrated that the distribution of LNA and HNA bacteria, including the abundance, FL1 and SSC, was controlled by different variables. The results suggested that LNA and HNA bacteria might play different ecological roles in the exorheic river. PMID:27082986

  2. Spatio-Temporal Variations of High and Low Nucleic Acid Content Bacteria in an Exorheic River.

    PubMed

    Liu, Jie; Hao, Zhenyu; Ma, Lili; Ji, Yurui; Bartlam, Mark; Wang, Yingying

    2016-01-01

    Bacteria with high nucleic acid (HNA) and low nucleic acid (LNA) content are commonly observed in aquatic environments. To date, limited knowledge is available on their temporal and spatial variations in freshwater environments. Here an investigation of HNA and LNA bacterial abundance and their flow cytometric characteristics was conducted in an exorheic river (Haihe River, Northern China) over a one year period covering September (autumn) 2011, December (winter) 2011, April (spring) 2012, and July (summer) 2012. The results showed that LNA and HNA bacteria contributed similarly to the total bacterial abundance on both the spatial and temporal scale. The variability of HNA on abundance, fluorescence intensity (FL1) and side scatter (SSC) were more sensitive to environmental factors than that of LNA bacteria. Meanwhile, the relative distance of SSC between HNA and LNA was more variable than that of FL1. Multivariate analysis further demonstrated that the influence of geographical distance (reflected by the salinity gradient along river to ocean) and temporal changes (as temperature variation due to seasonal succession) on the patterns of LNA and HNA were stronger than the effects of nutrient conditions. Furthermore, the results demonstrated that the distribution of LNA and HNA bacteria, including the abundance, FL1 and SSC, was controlled by different variables. The results suggested that LNA and HNA bacteria might play different ecological roles in the exorheic river.

  3. Haplogroup Classification of Korean Cattle Breeds Based on Sequence Variations of mtDNA Control Region.

    PubMed

    Kim, Jae-Hwan; Lee, Seong-Su; Kim, Seung Chang; Choi, Seong-Bok; Kim, Su-Hyun; Lee, Chang Woo; Jung, Kyoung-Sub; Kim, Eun Sung; Choi, Young-Sun; Kim, Sung-Bok; Kim, Woo Hyun; Cho, Chang-Yeon

    2016-05-01

    Many studies have reported the frequency and distribution of haplogroups among various cattle breeds for verification of their origins and genetic diversity. In this study, 318 complete sequences of the mtDNA control region from four Korean cattle breeds were used for haplogroup classification. In these sequences, 71 polymorphic sites and 66 haplotypes were found. Consistent with the genetic patterns in previous reports, four haplogroups (T1, T2, T3, and T4) were identified in Korean cattle breeds. In addition, the T1a, T3a, and T3b sub-haplogroups were classified. In the phylogenetic tree, each haplogroup formed an independent cluster. The frequencies of T3, T4, T1 (containing T1a), and T2 were 66%, 16%, 10%, and 8%, respectively. Notably, the T1 haplogroup contained only one haplotype and one sample. All four haplogroups were found in Chikso, Jeju black and Hanwoo. However, only the T3 and T4 haplogroups appeared in Heugu, and most Chikso populations showed only a subset of the four haplogroups. These results will be useful for stable conservation and efficient management of Korean cattle breeds. PMID:26954229

  4. Opsin gene sequence variation across phylogenetic and population histories in Mysis (Crustacea: Mysida) does not match current light environments or visual-pigment absorbance spectra.

    PubMed

    Audzijonyte, Asta; Pahlberg, Johan; Viljanen, Martta; Donner, Kristian; Väinölä, Risto

    2012-05-01

    The hypothesis that selection on the opsin gene is efficient in tuning vision to the ambient light environment of an organism was assessed in 49 populations of 12 Mysis crustacean species, inhabiting arctic marine waters, coastal littoral habitats, freshwater lakes ('glacial relicts') and the deep Caspian Sea. Extensive sequence variation was found within and among taxa, but its patterns did not match expectations based on light environments, spectral sensitivity of the visual pigment measured by microspectrophotometry or the history of species and populations. The main split in the opsin gene tree was between lineages I and II, differing in six amino acids. Lineage I was present in marine and Caspian Sea species and in the North American freshwater Mysis diluviana, whereas lineage II was found in the European and circumarctic fresh- and brackish-water Mysis relicta, Mysis salemaai and Mysis segerstralei. Both lineages were present in some populations of M. salemaai and M. segerstralei. Absorbance spectra of the visual pigment in nine populations of the latter three species showed a dichotomy between lake (λ(max) =554-562 nm) and brackish-water (Baltic Sea) populations (λ(max) = 521-535 nm). Judged by the shape of spectra, this difference was not because of different chromophores (A2 vs. A1), but neither did it coincide with the split in the opsin tree (lineages I/II), species identity or current light environments. In all, adaptive evolution of the opsin gene in Mysis could not be demonstrated, but its sequence variation did not conform to a neutral expectation either, suggesting evolutionary constraints and/or unidentified mechanisms of spectral tuning. PMID:22429275

  5. Sequence Polymorphisms at the REDUCED DORMANCY5 Pseudophosphatase Underlie Natural Variation in Arabidopsis Dormancy

    PubMed Central

    Xiang, Yong; Song, Baoxing; Née, Guillaume; Kramer, Katharina; Soppe, Wim J.J.

    2016-01-01

    Seed dormancy controls the timing of germination, which regulates the adaptation of plants to their environment and influences agricultural production. The time of germination is under strong natural selection and shows variation within species due to local adaptation. The identification of genes underlying dormancy quantitative trait loci is a major scientific challenge, which is relevant for agricultural and ecological goals. In this study, we describe the identification of the DELAY OF GERMINATION18 (DOG18) quantitative trait locus, which was identified as a factor in natural variation for seed dormancy in Arabidopsis (Arabidopsis thaliana). DOG18 encodes a member of the clade A of the type 2C protein phosphatases family, which we previously identified as the REDUCED DORMANCY5 (RDO5) gene. DOG18/RDO5 shows a relatively high frequency of loss-of-function alleles in natural accessions restricted to northwestern Europe. The loss of dormancy in these loss-of-function alleles can be compensated for by genetic factors like DOG1 and DOG6, and by environmental factors such as low temperature. RDO5 does not have detectable phosphatase activity. Analysis of the phosphoproteome in dry and imbibed seeds revealed a general decrease in protein phosphorylation during seed imbibition that is enhanced in the rdo5 mutant. We conclude that RDO5 acts as a pseudophosphatase that inhibits dephosphorylation during seed imbibition. PMID:27288362

  6. Draft Genome Sequence of Bacillus subtilis subsp. natto Strain CGMCC 2108, a High Producer of Poly-γ-Glutamic Acid

    PubMed Central

    Tan, Siyuan; Su, Anping; Zhang, Chen; Ren, Yuanyuan

    2016-01-01

    Here, we report the 4.1-Mb draft genome sequence of Bacillus subtilis subsp. natto strain CGMCC 2108, a high producer of poly-γ-glutamic acid (γ-PGA). This sequence will provide further help for the biosynthesis of γ-PGA and will greatly facilitate research efforts in metabolic engineering of B. subtilis subsp. natto strain CGMCC 2108. PMID:27231363

  7. WAViS server for handling, visualization and presentation of multiple alignments of nucleotide or amino acids sequences.

    PubMed

    Zika, Radek; Paces, Jan; Pavlícek, Adam; Paces, Václav

    2004-07-01

    The Web Alignment Visualization Server (WAViS) contains a set of web tools designed for quick generation of publication-quality color figures of multiple alignments of nucleotide or amino acid sequences. It can be used for identification of conserved regions and gaps within many sequences using only common web browsers. The server is accessible at http://wavis.img.cas.cz.

  8. Serum uric acid concentrations and SLC2A9 genetic variation in Hispanic children: The Viva La Familia Study

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Elevated concentrations of serum uric acid are associated with increased risk of gout and renal and cardiovascular diseases. Genetic studies in adults have consistently identified associations of solute carrier family 2, member 9 (SLC2A9), polymorphisms with variation in serum uric acid. However, it...

  9. ANTICALIgN: visualizing, editing and analyzing combined nucleotide and amino acid sequence alignments for combinatorial protein engineering.

    PubMed

    Jarasch, Alexander; Kopp, Melanie; Eggenstein, Evelyn; Richter, Antonia; Gebauer, Michaela; Skerra, Arne

    2016-07-01

    ANTICALIgN is an interactive software developed to simultaneously visualize, analyze and modify alignments of DNA and/or protein sequences that arise during combinatorial protein engineering, design and selection. ANTICALIgN combines powerful functions known from currently available sequence analysis tools with unique features for protein engineering, in particular the possibility to display and manipulate nucleotide sequences and their translated amino acid sequences at the same time. ANTICALIgN offers both template-based multiple sequence alignment (MSA), using the unmutated protein as reference, and conventional global alignment, to compare sequences that share an evolutionary relationship. The application of similarity-based clustering algorithms facilitates the identification of duplicates or of conserved sequence features among a set of selected clones. Imported nucleotide sequences from DNA sequence analysis are automatically translated into the corresponding amino acid sequences and displayed, offering numerous options for selecting reading frames, highlighting of sequence features and graphical layout of the MSA. The MSA complexity can be reduced by hiding the conserved nucleotide and/or amino acid residues, thus putting emphasis on the relevant mutated positions. ANTICALIgN is also able to handle suppressed stop codons or even to incorporate non-natural amino acids into a coding sequence. We demonstrate crucial functions of ANTICALIgN in an example of Anticalins selected from a lipocalin random library against the fibronectin extradomain B (ED-B), an established marker of tumor vasculature. Apart from engineered protein scaffolds, ANTICALIgN provides a powerful tool in the area of antibody engineering and for directed enzyme evolution.
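
    The translate-then-mask workflow described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not ANTICALIgN's own code: the function names `translate` and `mask_conserved` are hypothetical, the standard genetic code is assumed, and suppressed stop codons and non-natural amino acids are not handled.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate a DNA string in the given reading frame (0, 1 or 2).

    Codons containing unknown characters are rendered as 'X'.
    """
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def mask_conserved(reference, sequences, mask="."):
    """Hide residues identical to the reference so only mutations stand out."""
    return [
        "".join(mask if res == ref else res for res, ref in zip(seq, reference))
        for seq in sequences
    ]


if __name__ == "__main__":
    template = translate("ATGAAAGTTCTG")          # hypothetical unmutated clone
    clones = [translate("ATGAAAGTTCTG"), translate("ATGCGTGTTCTG")]
    for row in mask_conserved(template, clones):
        print(row)
```

    Masking against a template mirrors the template-based view, in which the unmutated protein serves as the reference; a tool of this kind would presumably offer the same masking at the nucleotide level.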

  10. 3-d structure-based amino acid sequence alignment of esterases, lipases and related proteins

    SciTech Connect

    Gentry, M.K.; Doctor, B.P.; Cygler, M.; Schrag, J.D.; Sussman, J.L.

    1993-05-13

    Acetylcholinesterase and butyrylcholinesterase, enzymes with potential as pretreatment drugs for organophosphate toxicity, are members of a larger family of homologous proteins that includes carboxylesterases, cholesterol esterases, lipases, and several nonhydrolytic proteins. A computer-generated alignment of 18 of the proteins, the acetylcholinesterases, butyrylcholinesterases, carboxylesterases, some esterases, and the nonenzymatic proteins has been previously presented. More recently, the three-dimensional structures of two enzymes in this group, acetylcholinesterase from Torpedo californica and lipase from Geotrichum candidum, have been determined. Based on the x-ray structures and the superposition of these two enzymes, it was possible to obtain an improved amino acid sequence alignment of 32 members of this family of proteins. Examination of this alignment reveals that 24 amino acids are invariant in all of the hydrolytic proteins, and an additional 49 are well conserved. Conserved amino acids include those of the active site, the disulfide bridges, the salt bridges, in the core of the proteins, and at the edges of secondary structural elements. Comparison of the three-dimensional structures makes it possible to find a well-defined structural basis for the conservation of many of these amino acids.

  11. Chemical profile and seasonal variation of phenolic acid content in bastard balm (Melittis melissophyllum L., Lamiaceae).

    PubMed

    Skrzypczak-Pietraszek, Ewa; Pietraszek, Jacek

    2012-07-01

    Melittis melissophyllum L. is an old medicinal plant. Nowadays it is used only in folk medicine, but formerly it was applied in official medicine as a natural product described in the French Pharmacopoeia. M. melissophyllum herbs used in our studies were collected from two localities in Poland in May and September. Methanolic plant extracts were purified by means of solid-phase extraction and then analysed by HPLC-DAD for their phenolic acid profile. Eleven compounds were identified in all plant samples and quantitatively analysed as: protocatechuic, chlorogenic, p-hydroxybenzoic, vanillic, caffeic, syringic, p-coumaric, ferulic, sinapic, o-coumaric and cinnamic acid. Plant materials contained free and bound phenolic acids. The main compounds were: p-hydroxybenzoic acid (30.21-54.16 mg/100 g dw and 37.04-56.75 mg/100 g dw, free and bound, respectively) and p-coumaric acid (40.48-80.55 mg/100 g dw and 28.09-40.85 mg/100 g dw, free and bound, respectively). The highest amounts of the investigated compounds were found in all samples collected in September, e.g. p-hydroxybenzoic acid (September 51.72-54.16 mg/100 g dw vs. May 30.21-34.07 mg/100 g dw), p-coumaric acid (September 77.14-80.55 mg/100 g dw vs. May 40.48-43.25 mg/100 g dw). Multivariate statistical and data mining techniques, such as cluster analysis (CA) and principal component analysis (PCA), were used to characterize the sample populations according to the geographical localities, vegetation period and compound form (free or bound). To the best of our knowledge, we report for the first time the results of quantitative analysis of M. melissophyllum phenolic acids and seasonal variation of their content. Herbs are usually collected at flowering for plant-derived medical preparations. Our results show that this is not always the optimal time for the highest contents of active compounds. PMID:22513117

  12. Sequencing and Transcriptional Analysis of the Biosynthesis Gene Cluster of Abscisic Acid-Producing Botrytis cinerea

    PubMed Central

    Gong, Tao; Shu, Dan; Yang, Jie; Ding, Zhong-Tao; Tan, Hong

    2014-01-01

    Botrytis cinerea is a model species with great importance as a pathogen of plants and has come to be used for the biotechnological production of abscisic acid (ABA). The ABA cluster of B. cinerea is composed of an open reading frame without significant similarities (bcaba3), followed by the genes (bcaba1 and bcaba2) encoding P450 monooxygenases and a gene probably coding for a short-chain dehydrogenase/reductase (bcaba4). In B. cinerea ATCC58025, targeted inactivation of the genes in the cluster suggested at least three genes responsible for the hydroxylation at carbon atom C-1' and C-4' or oxidation at C-4' of ABA. Our group has identified an ABA-overproducing strain, B. cinerea TB-3-H8. To differentiate TB-3-H8 from other B. cinerea strains with the functional ABA cluster, the DNA sequence of the 12.11-kb region containing the cluster of B. cinerea TB-3-H8 was determined. Full-length cDNAs were also isolated for bcaba1, bcaba2, bcaba3 and bcaba4 from B. cinerea TB-3-H8. Sequence comparison of the four genes and their flanking regions from B. cinerea TB-3-H8, B05.10 and T4 revealed that major variations were located in intergenic sequences. In B. cinerea TB-3-H8, the expression profiles of the four functional genes under ABA high-yield conditions were also analyzed by real-time PCR. PMID:25268614

  13. Multiple Amino Acid Sequence Alignment Nitrogenase Component 1: Insights into Phylogenetics and Structure-Function Relationships

    PubMed Central

    Howard, James B.; Kechris, Katerina J.; Rees, Douglas C.; Glazer, Alexander N.

    2013-01-01

    Amino acid residues critical for a protein's structure-function are retained by natural selection and these residues are identified by the level of variance in co-aligned homologous protein sequences. The relevant residues in the nitrogen fixation Component 1 α- and β-subunits were identified by the alignment of 95 protein sequences. Proteins were included from species encompassing multiple microbial phyla and diverse ecological niches as well as the nitrogen fixation genotypes, anf, nif, and vnf, which encode proteins associated with cofactors differing at one metal site. After adjusting for differences in sequence length, insertions, and deletions, the remaining >85% of the sequence co-aligned the subunits from the three genotypes. Six Groups, designated Anf, Vnf, and Nif I-IV, were assigned based upon genetic origin, sequence adjustments, and conserved residues. Both subunits subdivided into the same groups. Invariant and single variant residues were identified and were defined as “core” for nitrogenase function. Three species in Group Nif-III, Candidatus Desulforudis audaxviator, Desulfotomaculum kuznetsovii, and Thermodesulfatator indicus, were found to have a seleno-cysteine that replaces one cysteinyl ligand of the 8Fe:7S, P-cluster. Subsets of invariant residues, limited to individual groups, were identified; these unique residues help identify the gene of origin (anf, nif, or vnf) yet should not be considered diagnostic of the metal content of associated cofactors. Fourteen of the 19 residues that compose the cofactor pocket are invariant or single variant; the other five residues are highly variable but do not correlate with the putative metal content of the cofactor. The variable residues are clustered on one side of the cofactor, away from other functional centers in the three dimensional structure. Many of the invariant and single variant residues were not previously recognized as potentially critical and their identification provides the bases
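
    As an illustration of how alignment columns might be sorted into invariant, single variant and variable positions, a minimal sketch follows. The function name `classify_columns` is hypothetical, and the reading of "single variant" as "exactly two distinct residues in the column" is an assumption made for this example; the abstract does not spell out the authors' exact criterion.

```python
def classify_columns(alignment, gap="-"):
    """Label each column of an equal-length multiple sequence alignment.

    Assumed definitions (illustrative only):
      invariant       one distinct residue in the column
      single variant  exactly two distinct residues
      variable        three or more distinct residues
      gap only        no residues at all
    Gap characters are ignored when counting residues.
    """
    labels = []
    for column in zip(*alignment):
        residues = {r for r in column if r != gap}
        if not residues:
            labels.append("gap only")
        elif len(residues) == 1:
            labels.append("invariant")
        elif len(residues) == 2:
            labels.append("single variant")
        else:
            labels.append("variable")
    return labels


if __name__ == "__main__":
    toy_alignment = ["ACDE", "ACDF", "AGDW"]   # made-up sequences
    for position, label in enumerate(classify_columns(toy_alignment), start=1):
        print(position, label)
```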

  14. Next-Gen phylogeography of rainforest trees: exploring landscape-level cpDNA variation from whole-genome sequencing.

    PubMed

    van der Merwe, M; McPherson, H; Siow, J; Rossetto, M

    2014-01-01

    Standardized phylogeographic studies across codistributed taxa can identify important refugia and biogeographic barriers, and potentially uncover how changes in adaptive constraints through space and time impact on the distribution of genetic diversity. The combination of next-generation sequencing and methodologies that enable uncomplicated analysis of the full chloroplast genome may provide an invaluable resource for such studies. Here, we assess the potential of a shotgun-based method across twelve nonmodel rainforest trees sampled from two evolutionarily distinct regions. Whole genomic shotgun sequencing libraries consisting of pooled individuals were used to assemble species-specific chloroplast references (in silico). For each species, the pooled libraries allowed for the detection of variation within and between data sets (each representing a geographic region). The potential use of nuclear rDNA as an additional marker from the NGS libraries was investigated by mapping reads against available references. We successfully obtained phylogeographically informative sequence data from a range of previously unstudied rainforest trees. Greater levels of diversity were found in northern refugial rainforests than in southern expansion areas. The genetic signatures of varying evolutionary histories were detected, and interesting associative patterns between functional characteristics and genetic diversity were identified. This approach can suit a wide range of landscape-level studies. As the key laboratory-based steps do not require prior species-specific knowledge and can be easily outsourced, the techniques described here are even suitable for researchers without access to wet-laboratory facilities, making evolutionary ecology questions increasingly accessible to the research community.

  15. DNA sequence and haplotype variation in two candidate genes for dilated cardiomyopathy in the turkey Meleagris gallopavo.

    PubMed

    Lin, Kuan-chin; Xu, Jun; Kamara, Davida; Geng, Tuoyu; Gyenai, Kwaku; Reed, Kent M; Smith, Edward J

    2007-05-01

    Determining variation in genes is fundamental to understanding their function in the disease state. Cardiac troponin T (cTnT) and phospholamban (PLN) genes have been implicated in dilated cardiomyopathy (DCM) in human and model species. To investigate the role of these 2 candidate genes in DCM in the turkey Meleagris gallopavo, understanding sequence variants and map position distribution is necessary. To this end, a total of 1854 and 1771 bp of cTnT and PLN gene sequences, respectively, were scanned for single nucleotide polymorphisms (SNPs) in a randomly bred population. A total of 15 SNPs was identified in the cTnT and PLN genomic sequences. Nine haplotypes, 5 in cTnT and 4 in PLN, were identified. Observed heterozygosities (0.02-0.39) in the turkey population were low for both genes. Within each gene, 1 SNP corresponding to a restriction enzyme site was identified and used to develop a PCR-restriction fragment length polymorphism (RFLP) genotyping assay. The PLN gene was genetically mapped to turkey chromosome 2, equivalent to Gallus gallus chromosome 3, and cTnT mapped to a turkey microchromosome. Although limited because of the relatively small sample size of 55 birds, the data from this SNP analysis of PLN and cTnT provide a foundation from which to evaluate the function of cTnT and PLN in the turkey. Information about the distribution of the SNPs and haplotypes will facilitate future association and linkage studies.
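
    For readers unfamiliar with the statistic, observed heterozygosity at a SNP is simply the fraction of genotyped individuals that carry two different alleles. A minimal sketch, with a hypothetical function name and made-up genotypes rather than data from the study:

```python
def observed_heterozygosity(genotypes):
    """Fraction of called diploid genotypes that are heterozygous.

    `genotypes` is a list of (allele, allele) tuples; None marks a missing call
    and is excluded from the denominator.
    """
    called = [g for g in genotypes if g is not None]
    if not called:
        raise ValueError("no called genotypes")
    heterozygous = sum(1 for first, second in called if first != second)
    return heterozygous / len(called)


if __name__ == "__main__":
    # Made-up genotypes at one SNP for five birds, one of them uncalled.
    snp = [("A", "G"), ("A", "A"), ("G", "G"), ("A", "G"), None]
    print(observed_heterozygosity(snp))
```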

  16. The thermostable direct hemolysin-related hemolysin (trh) gene of Vibrio parahaemolyticus: Sequence variation and implications for detection and function.

    PubMed

    Nilsson, William B; Turner, Jeffrey W

    2016-07-01

    Vibrio parahaemolyticus is a leading cause of bacterial food-related illness associated with the consumption of undercooked seafood. Only a small subset of strains is pathogenic. Most clinical strains encode the thermostable direct hemolysin (TDH) and/or the TDH-related hemolysin (TRH). In this work, we amplify and sequence the trh gene from over 80 trh+ strains of this bacterium and identify thirteen genetically distinct alleles, most of which have not been deposited in GenBank previously. Sequence data were used to design new primers for more reliable detection of trh by endpoint PCR. We also designed a new quantitative PCR assay to target a more conserved gene that is genetically linked to trh. This gene, ureR, encodes the transcriptional regulator for the urease gene cluster immediately upstream of trh. We propose that this ureR assay can be a useful screening tool as a surrogate for direct detection of trh that circumvents challenges associated with trh sequence variation.

  17. Comparative Mitogenomics of the Genus Odontobutis (Perciformes: Gobioidei: Odontobutidae) Revealed Conserved Gene Rearrangement and High Sequence Variations

    PubMed Central

    Ma, Zhihong; Yang, Xuefen; Bercsenyi, Miklos; Wu, Junjie; Yu, Yongyao; Wei, Kaijian; Fan, Qixue; Yang, Ruibin

    2015-01-01

    To understand the molecular evolution of mitochondrial genomes (mitogenomes) in the genus Odontobutis, the mitogenome of Odontobutis yaluensis was sequenced and compared with those of another four Odontobutis species. Our results displayed similar mitogenome features among species in genome organization, base composition, codon usage, and gene rearrangement. The identical gene rearrangement of trnS-trnL-trnH tRNA cluster observed in mitogenomes of these five closely related freshwater sleepers suggests that this unique gene order is conserved within Odontobutis. Additionally, the present gene order and the positions of associated intergenic spacers of these Odontobutis mitogenomes indicate that this unusual gene rearrangement results from tandem duplication and random loss of large-scale gene regions. Moreover, these mitogenomes exhibit a high level of sequence variation, mainly due to the differences of corresponding intergenic sequences in gene rearrangement regions and the heterogeneity of tandem repeats in the control regions. Phylogenetic analyses support Odontobutis species with shared gene rearrangement forming a monophyletic group, and the interspecific phylogenetic relationships are associated with structural differences among their mitogenomes. The present study contributes to understanding the evolutionary patterns of Odontobutidae species. PMID:26492246

  18. A worldwide survey of genome sequence variation provides insight into the evolutionary history of the honeybee Apis mellifera.

    PubMed

    Wallberg, Andreas; Han, Fan; Wellhagen, Gustaf; Dahle, Bjørn; Kawata, Masakado; Haddad, Nizar; Simões, Zilá Luz Paulino; Allsopp, Mike H; Kandemir, Irfan; De la Rúa, Pilar; Pirk, Christian W; Webster, Matthew T

    2014-10-01

    The honeybee Apis mellifera has major ecological and economic importance. We analyze patterns of genetic variation at 8.3 million SNPs, identified by sequencing 140 honeybee genomes from a worldwide sample of 14 populations at a combined total depth of 634×. These data provide insight into the evolutionary history and genetic basis of local adaptation in this species. We find evidence that population sizes have fluctuated greatly, mirroring historical fluctuations in climate, although contemporary populations have high genetic diversity, indicating the absence of domestication bottlenecks. Levels of genetic variation are strongly shaped by natural selection and are highly correlated with patterns of gene expression and DNA methylation. We identify genomic signatures of local adaptation, which are enriched in genes expressed in workers and in immune system- and sperm motility-related genes that might underlie geographic variation in reproduction, dispersal and disease resistance. This study provides a framework for future investigations into responses to pathogens and climate change in honeybees. PMID:25151355

  19. Population structure of African buffalo inferred from mtDNA sequences and microsatellite loci: high variation but low differentiation.

    PubMed

    Simonsen, B T; Siegismund, H R; Arctander, P

    1998-02-01

    The African buffalo (Syncerus caffer) is widespread throughout sub-Saharan Africa and is found in most major vegetation types, wherever permanent sources of water are available, making it physically able to disperse through a wide range of habitats. Despite this, the buffalo has been assumed to be strongly philopatric and to form large aggregations that remain within separate home ranges with little interchange between units, but the level of differentiation within the species is unknown. Genetic differences between populations were assessed using mitochondrial DNA (control region) sequence data and analysis of variation at six microsatellite loci among 11 localities in eastern and southern Africa. High levels of genetic variability were found, suggesting that reported severe population bottlenecks due to outbreak of rinderpest during the last century did not strongly reduce the genetic variability within the species. The high level of genetic variation within the species was found to be evenly distributed among populations and only at the continental level were we able to consistently detect significant differentiation, contrasting with the assumed philopatric behaviour of the buffalo. Results of mtDNA and microsatellite data were found to be congruent, disagreeing with the alleged male-biased dispersal. We propose that the observed pattern of the distribution of genetic variation between buffalo populations at the regional level can be caused by fragmentation of a previous panmictic population due to human activity, and at the continental level, reflects an effect of geographical distance between populations.

  20. Polarimetric Variations of Binary Stars. V. Pre-Main-Sequence Spectroscopic Binaries Located in Ophiuchus and Scorpius

    NASA Astrophysics Data System (ADS)

    Manset, N.; Bastien, P.

    2003-06-01

    We present polarimetric observations of seven pre-main-sequence (PMS) spectroscopic binaries located in the ρ Ophiuchus and Upper Scorpius star-forming regions (SFRs). The average observed polarizations at 7660 Å are between 0.5% and 3.5%. After estimates of the interstellar polarization are removed, all binaries have an intrinsic polarization above 0.4%, even though most of them do not present other evidence for circumstellar dust. Two binaries, NTTS 162814-2427 and NTTS 162819-2423S, present high levels of intrinsic polarization between 1.5% and 2.1%, in agreement with the fact that other observations (photometry, spectroscopy) indicate the presence of circumstellar dust. Tests reveal that all seven PMS binaries have a statistically variable or possibly variable polarization. Combining these results with our previous sample of binaries located in the Taurus, Auriga, and Orion SFRs, 68% of the binaries have an intrinsic polarization above 0.5%, and 90% of the binaries are polarimetrically variable or possibly variable. NTTS 160814-1857, 162814-2427, and 162819-2423S are clearly polarimetrically variable. The first two also exhibit phase-locked variations over ~10 and ~40 orbits, respectively. Statistically, NTTS 160905-1859 is possibly variable, but it shows periodic variations not detected by the statistical tests; those variations are not phase-locked and are present only for short intervals of time. The amplitudes of the variations reach a few tenths of a percent, greater than for the previously studied PMS binaries located in the Taurus, Orion, and Auriga SFRs. The high-eccentricity system NTTS 162814-2427 shows single-periodic variations, in agreement with our previous numerical simulations. We compare the observations with some of our numerical simulations and also show that an analysis of the periodic polarimetric variations with the Brown, McLean, & Emslie (BME) formalism to find the orbital inclination is for the moment premature: nonperiodic events

  1. Whole genome sequencing of emerging multidrug resistant Candida auris isolates in India demonstrates low genetic variation.

    PubMed

    Sharma, C; Kumar, N; Pandey, R; Meis, J F; Chowdhary, A

    2016-09-01

    Candida auris is an emerging multidrug resistant yeast that causes nosocomial fungaemia and deep-seated infections. Notably, the emergence of this yeast is alarming as it exhibits resistance to azoles, amphotericin B and caspofungin, which may lead to clinical failure in patients. The multigene phylogeny and amplified fragment length polymorphism typing methods report the C. auris population as clonal. Here, using whole genome sequencing analysis, we decipher for the first time that C. auris strains from four Indian hospitals were highly related, suggesting clonal transmission. Further, all C. auris isolates originated from cases of fungaemia and were resistant to fluconazole (MIC >64 mg/L). PMID:27617098

  2. Evolution and Functional Impact of Rare Coding Variation from Deep Sequencing of Human Exomes

    PubMed Central

    Tennessen, Jacob A.; Bigham, Abigail W.; O'Connor, Timothy D.; Fu, Wenqing; Kenny, Eimear E.; Gravel, Simon; McGee, Sean; Do, Ron; Liu, Xiaoming; Jun, Goo; Kang, Hyun Min; Jordan, Daniel; Leal, Suzanne M.; Gabriel, Stacey; Rieder, Mark J.; Abecasis, Goncalo; Altshuler, David; Nickerson, Deborah A.; Boerwinkle, Eric; Sunyaev, Shamil; Bustamante, Carlos D.; Bamshad, Michael J.; Akey, Joshua M.

    2013-01-01

    As a first step toward understanding how rare variants contribute to risk for complex diseases, we sequenced 15,585 human protein-coding genes to an average median depth of 111× in 2440 individuals of European (n = 1351) and African (n = 1088) ancestry. We identified over 500,000 single-nucleotide variants (SNVs), the majority of which were rare (86% with a minor allele frequency less than 0.5%), previously unknown (82%), and population-specific (82%). On average, 2.3% of the 13,595 SNVs each person carried were predicted to affect protein function of ∼313 genes per genome, and ∼95.7% of SNVs predicted to be functionally important were rare. This excess of rare functional variants is due to the combined effects of explosive, recent accelerated population growth and weak purifying selection. Furthermore, we show that large sample sizes will be required to associate rare variants with complex traits. PMID:22604720
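
    The rarity threshold quoted above (minor allele frequency below 0.5%) can be made concrete with a short sketch. The function names and the encoding of genotypes as counts of the alternate allele (0, 1 or 2 per diploid individual) are assumptions for illustration and do not reproduce the study's pipeline.

```python
def minor_allele_frequency(alt_counts):
    """Minor allele frequency from per-individual alternate-allele counts.

    Each entry is 0, 1 or 2 (copies of the alternate allele in a diploid
    individual); None marks a missing genotype and is skipped.
    """
    called = [c for c in alt_counts if c is not None]
    if not called:
        raise ValueError("no called genotypes")
    alt_frequency = sum(called) / (2 * len(called))
    return min(alt_frequency, 1.0 - alt_frequency)


def is_rare(alt_counts, threshold=0.005):
    """True when the variant falls below the 0.5% cutoff used in the abstract."""
    return minor_allele_frequency(alt_counts) < threshold


if __name__ == "__main__":
    # Made-up cohort: one heterozygous carrier among 500 individuals.
    cohort = [1] + [0] * 499
    print(minor_allele_frequency(cohort), is_rare(cohort))
```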

  3. Genetic variation analysis of Mugil cephalus in China sea based on mitochondrial COI gene sequences.

    PubMed

    Sun, Peng; Shi, Zhao-hong; Yin, Fei; Peng, Shi-ming

    2012-04-01

    In this study, genetic diversity and population genetic structure of flathead grey mullet, Mugil cephalus, among four China Sea populations were investigated by COI sequences. All the populations studied had high values of haplotype and nucleotide diversity, except for the Yellow Sea population. In the phylogenetic tree, these haplotypes clustered in two groups, one for the populations from the Bohai and East China seas, and the other from the Yellow and South China seas. Analysis of molecular variance indicated that the northern populations (Bohai and East China) had lower genetic divergence (0.0725, P > 0.05) than that of the southern population (South China) (0.4530-0.6827, P < 0.001), suggesting that two distinct genetic groups exist in Chinese waters. Tests of neutral evolution and mismatch distribution indicated that no historical demographic expansion occurred in these populations. The results provide new information for genetic assessment, fishery management, and conservation of this species.
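
    Haplotype diversity and nucleotide diversity, the two summary statistics mentioned above, are commonly computed with Nei's estimators. The sketch below assumes those standard formulas and aligned, equal-length sequences; the function names and the toy sequences are hypothetical and unrelated to the COI data set.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(sequences):
    """Nei's haplotype diversity: n / (n - 1) * (1 - sum of squared frequencies)."""
    n = len(sequences)
    if n < 2:
        raise ValueError("need at least two sequences")
    frequencies = [count / n for count in Counter(sequences).values()]
    return n / (n - 1) * (1.0 - sum(f * f for f in frequencies))


def nucleotide_diversity(sequences):
    """Mean number of pairwise differences per site across all sequence pairs."""
    if len(sequences) < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    pairs = list(combinations(sequences, 2))
    differences = sum(
        sum(a != b for a, b in zip(first, second)) for first, second in pairs
    )
    return differences / (len(pairs) * length)


if __name__ == "__main__":
    toy = ["AAAA", "AAAA", "AAAT", "AATT"]   # made-up aligned haplotypes
    print(haplotype_diversity(toy), nucleotide_diversity(toy))
```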

  4. Identification of Genes Responsible for Natural Variation in Volatile Content Using Next-Generation Sequencing Technology.

    PubMed

    Amaya, Iraida; Pillet, Jeremy; Folta, Kevin M

    2016-01-01

    Identification of the genes controlling the variation of key traits remains a challenge for plant researchers and represents a goal for the development of functional markers and their implementation in marker-assisted crop breeding. As an example, we describe the identification of volatile organic compounds (VOCs) that segregate as a single locus or as major quantitative trait loci (QTL) in strawberry F1 segregating populations. Next, we describe a fast and efficient method for RNA extraction in strawberry that yields high-quality RNA for downstream RNA-seq analysis. Finally, two alternative methods for analysis of global transcript expression in contrasting lines are described in order to identify the candidate gene and genes with differential expression using RNA-seq.

  5. Correlations Between Amino Acids at Different Sites in Local Sequences of Protein Fragments with Given Structural Patterns

    NASA Astrophysics Data System (ADS)

    Lu, Wen; Liu, Hai-yan

    2007-02-01

    Ample evidence suggests that the local structures of peptide fragments in native proteins are to some extent encoded by their local sequences. Detecting such local correlations is important, but it is still an open question what would be the most appropriate method. This is partly because conventional sequence analyses treat amino acid preferences at each site of a protein sequence independently, while it is often the inter-site interactions that bring about local sequence-structure correlations. Here a new scheme is introduced to capture the correlation between amino acid preferences at different sites for different local structure types. A library of nine-residue fragments is constructed, and the fragments are divided into clusters based on their local structures. For each local structure cluster or type, chi-square tests are used to identify correlated preferences of amino acid combinations at pairs of sites. A score function is constructed including both the single-site amino acid preferences and the dual-site amino acid combination preferences, which can be used to identify whether a sequence fragment would have a strong tendency to form a particular local structure in native proteins. The results show that, given a local structure pattern, dual-site amino acid combinations contain different information from single-site amino acid preferences. Representative examples show that many of the statistically identified correlations agree with previously proposed heuristic rules about local sequence-structure correlations, or are consistent with physical-chemical interactions required to stabilize particular local structures. Results also show that including such dual-site correlations in the score function significantly improves the Z-score for matching a sequence fragment to its native local structure relative to non-native local structures, and certain local structure types are highly predictable from the local sequence alone if inter-site correlations are considered.
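
    A chi-square test for correlated residue preferences at a pair of sites can be sketched as below. This is a generic Pearson chi-square on the contingency table of residues observed at two positions within one structural cluster; the function name `chi_square_pair` and the toy fragments are hypothetical, and the authors' actual test setup and score function may differ.

```python
from collections import Counter


def chi_square_pair(fragments, i, j):
    """Pearson chi-square statistic for independence of residues at sites i and j.

    `fragments` is a list of equal-length sequence fragments assigned to the
    same local-structure cluster. Returns (statistic, degrees_of_freedom).
    """
    n = len(fragments)
    pair_counts = Counter((f[i], f[j]) for f in fragments)
    site_i = Counter(f[i] for f in fragments)
    site_j = Counter(f[j] for f in fragments)

    statistic = 0.0
    for a, count_a in site_i.items():
        for b, count_b in site_j.items():
            expected = count_a * count_b / n      # count expected if sites were independent
            observed = pair_counts.get((a, b), 0)
            statistic += (observed - expected) ** 2 / expected
    dof = (len(site_i) - 1) * (len(site_j) - 1)
    return statistic, dof


if __name__ == "__main__":
    # Made-up nine-residue fragments in which sites 0 and 8 co-vary perfectly.
    cluster = ["AGGGGGGGK", "AGGGGGGGK", "LGGGGGGGE", "LGGGGGGGE"]
    print(chi_square_pair(cluster, 0, 8))
```

    With real fragment libraries many cells of the table would have small expected counts, so some pooling of rare residue pairs or a significance correction would probably be needed before trusting the statistic.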

  6. Variations of the ISM Compactness Across the Main Sequence of Star Forming Galaxies: Observations and Simulations

    NASA Astrophysics Data System (ADS)

    Martínez-Galarza, J. R.; Smith, H. A.; Lanz, L.; Hayward, Christopher C.; Zezas, A.; Rosenthal, L.; Weiner, A.; Hung, C.; Ashby, M. L. N.; Groves, B.

    2016-01-01

    The majority of star-forming galaxies follow a simple empirical correlation in the star formation rate (SFR) versus stellar mass (M*) plane, of the form $\mathrm{SFR} \propto M_*^{\alpha}$, usually referred to as the star formation main sequence (MS). The physics that sets the properties of the MS is currently a subject of debate, and no consensus has been reached regarding the fundamental difference between members of the sequence and its outliers. Here we combine a set of hydro-dynamical simulations of interacting galactic disks with state-of-the-art radiative transfer codes to analyze how the evolution of mergers is reflected upon the properties of the MS. We present Chiburst, a Markov Chain Monte Carlo spectral energy distribution (SED) code that fits the multi-wavelength, broad-band photometry of galaxies and derives stellar masses, SFRs, and geometrical properties of the dust distribution. We apply this tool to the SEDs of simulated mergers and compare the derived results with the reference output from the simulations. Our results indicate that changes in the SEDs of mergers as they approach coalescence and depart from the MS are related to an evolution of dust geometry in scales larger than a few hundred parsecs. This is reflected in a correlation between the specific star formation rate, and the compactness parameter $\mathcal{C}$, that parametrizes this geometry and hence the evolution of dust temperature ($T_{\mathrm{dust}}$) with time. As mergers approach coalescence, they depart from the MS and increase their compactness, which implies that moderate outliers of the MS are consistent with late-type mergers. By further applying our method to real observations of luminous infrared galaxies (LIRGs), we show that the merger scenario is unable to explain these extreme outliers of the MS. Only by significantly increasing the gas fraction in the simulations are we able to reproduce the SEDs of LIRGs.
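
    A power-law main sequence of the form SFR proportional to M* raised to the power α becomes a straight line in log-log space, so the slope α can be estimated with an ordinary least-squares fit of log SFR against log M*. The sketch below assumes that simple approach; the function name and the synthetic data are hypothetical, and this is not the Chiburst method, which fits full SEDs with Markov Chain Monte Carlo.

```python
import math


def fit_power_law(masses, sfrs):
    """Least-squares fit of log10(SFR) = alpha * log10(M*) + log10(norm).

    Returns (alpha, log10_norm). Inputs must be positive and of equal length.
    """
    x = [math.log10(m) for m in masses]
    y = [math.log10(s) for s in sfrs]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    alpha = sxy / sxx
    return alpha, mean_y - alpha * mean_x


if __name__ == "__main__":
    # Synthetic galaxies lying exactly on SFR = 1e-8 * M*^0.8 (made-up numbers).
    masses = [1e9, 1e10, 1e11]
    sfrs = [1e-8 * m ** 0.8 for m in masses]
    print(fit_power_law(masses, sfrs))
```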

  7. Conservation and variation of nucleotide sequences within related bacterial genomes: enterobacteria.

    PubMed Central

    Riley, M; Anilionis, A

    1980-01-01

    We have assessed the degree of relatedness of several portions of the Escherichia coli genome to the corresponding portions of the genomes of representative enteric bacteria, using the Southern transfer and hybridization technique (E. Southern, J. Mol. Biol. 98:503-517, 1975). The degree of relatedness varied among the regions examined. Judging both by the relative amounts of deoxyribonucleic acid in the various enteric genomes that are highly homologous and by the conservation of positions of restriction enzyme cleavage sites in these regions, the enteric genomes have diverged to greater extents in some parts of the genomes than in others. Portions of the genomes (including the tnaA and thyA genes, the trp operon, and one other unassigned segment) appear to have evolved in concert with the genome as a whole. By contrast, the lacZ gene and portions of the genome that are homologous to phage lambda vary more widely, perhaps reflecting a separate evolutionary origin for these segments of deoxyribonucleic acid. PMID:6447143

  8. Draft Genome Sequences of Gluconobacter cerinus CECT 9110 and Gluconobacter japonicus CECT 8443, Acetic Acid Bacteria Isolated from Grape Must

    PubMed Central

    Sainz, Florencia

    2016-01-01

    We report here the draft genome sequences of Gluconobacter cerinus strain CECT 9110 and Gluconobacter japonicus CECT 8443, acetic acid bacteria isolated from grape must. Gluconobacter species are well known for their ability to oxidize sugar alcohols into the corresponding acids. Our objective was to select strains that oxidize D-glucose effectively. PMID:27365351

  9. Molecular cloning, encoding sequence, and expression of vaccinia virus nucleic acid-dependent nucleoside triphosphatase gene.

    PubMed Central

    Rodriguez, J F; Kahn, J S; Esteban, M

    1986-01-01

    A rabbit poxvirus genomic library contained within the expression vector lambda gt11 was screened with polyclonal antiserum prepared against vaccinia virus nucleic acid-dependent nucleoside triphosphatase (NTPase)-I enzyme. Five positive phage clones containing from 0.72- to 2.5-kilobase-pair (kbp) inserts expressed a beta-galactosidase fusion protein that was reactive by immunoblotting with the NTPase-I antibody. Hybridization analysis allowed the location of this gene within the vaccinia HindIIID restriction fragment. From the known nucleotide sequence of the 16-kbp vaccinia HindIIID fragment, we identified a region that contains a 1896-base open reading frame coding for a 631-amino acid protein. Analysis of the complete sequence revealed a highly basic protein, with hydrophilic COOH and NH2 termini, various hydrophobic domains, and no significant homology to other known proteins. Translational studies demonstrate that NTPase-I belongs to a late class of viral genes. This protein is highly conserved among Orthopoxviruses. PMID:3025846
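
    The relationship between the 1896-base open reading frame and the 631-residue protein in this abstract can be checked with simple codon arithmetic. The sketch below assumes that the quoted base count includes the terminal stop codon; the abstract does not say so explicitly, but the numbers are consistent with that reading. The function name `orf_protein_length` is illustrative.

```python
def orf_protein_length(orf_bases, includes_stop=True):
    """Number of amino acids encoded by an ORF of `orf_bases` nucleotides.

    Assumption: when `includes_stop` is True, the final codon is a stop
    codon and contributes no residue.
    """
    if orf_bases % 3 != 0:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bases // 3
    return codons - 1 if includes_stop else codons

# 1896 bases = 632 codons = 631 coding codons + 1 stop codon.
print(orf_protein_length(1896))  # 631
```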

  10. Partial amino acid sequences around sulfhydryl groups of soybean beta-amylase.

    PubMed

    Nomura, K; Mikami, B; Morita, Y

    1987-08-01

    Sulfhydryl (SH) groups of soybean beta-amylase were modified with 5-(iodoaceto-amidoethyl)aminonaphthalene-1-sulfonate (IAEDANS) and the SH-containing peptides exhibiting fluorescence were purified after chymotryptic digestion of the modified enzyme. The sequence analysis of the peptides derived from the modification of all SH groups in the denatured enzyme revealed the existence of six SH groups, in contrast to five reported previously. One of them was found to have extremely low reactivity toward SH-reagents without reduction. In the native state, IAEDANS reacted with 2 mol of SH groups per mol of the enzyme (SH1 and SH2) accompanied with inactivation of the enzyme owing to the modification of SH2 located near the active site of this enzyme. The selective modification of SH2 with IAEDANS was attained after the blocking of SH1 with 5,5'-dithiobis-(2-nitrobenzoic acid). The amino acid sequences of the peptides containing SH1 and SH2 were determined to be Cys-Ala-Asn-Pro-Gln and His-Gln-Cys-Gly-Gly-Asn-Val-Gly-Asp-Ile-Val-Asn-Ile-Pro-Ile-Pro-Gln-Trp, respectively.

  11. Detection of piscine nodaviruses by real-time nucleic acid sequence based amplification (NASBA).

    PubMed

    Starkey, William G; Millar, Rose Mary; Jenkins, Mary E; Ireland, Jacqueline H; Muir, K Fiona; Richards, Randolph H

    2004-05-01

    Nucleic acid sequence based amplification (NASBA) is an isothermal nucleic acid amplification procedure based on target-specific primers and probes, and the co-ordinated activity of 3 enzymes: AMV reverse transcriptase, RNase H, and T7 RNA polymerase. We have developed a real-time NASBA procedure for detection of piscine nodaviruses, which have emerged as major pathogens of marine fish. Viral RNA was isolated by guanidine thiocyanate lysis followed by purification on silica particles. Primers were designed to target sequences in the nodavirus capsid protein gene, yielding an amplification product of 120 nucleotides. Amplification products were detected in real-time with a molecular beacon (FAM labelled/methyl-red quenched) that recognised an internal region of the target amplicon. Amplification and detection were performed at 41 degrees C for 90 min in a Corbett Research Rotorgene. Based on the detection of cell culture-derived nodavirus, and a synthetic RNA target, the real-time NASBA procedure was approximately 100-fold more sensitive than single-tube RT-PCR. When used to test a panel of 37 clinical samples (negative, n = 18; positive, n = 19), the real-time NASBA assay correctly identified all 18 negative and 19 positive samples. In comparison, the RT-PCR procedure identified all 18 negative samples, but only 16 of the positive samples. These results suggest that real-time NASBA may represent a sensitive and specific diagnostic procedure for piscine nodaviruses.
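
    The panel results reported in this abstract translate directly into diagnostic sensitivity and specificity. The sketch below treats the stated status of the 37 clinical samples (19 positive, 18 negative) as the reference; the function and variable names are illustrative, and the counts are the only values taken from the abstract.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of truly positive samples that the assay detects."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Fraction of truly negative samples that the assay calls negative."""
    return true_neg / (true_neg + false_pos)

# Counts from the abstract: 19 positive and 18 negative clinical samples.
assays = {
    "real-time NASBA": {"tp": 19, "fn": 0, "tn": 18, "fp": 0},
    "RT-PCR": {"tp": 16, "fn": 3, "tn": 18, "fp": 0},
}

for name, c in assays.items():
    sens = sensitivity(c["tp"], c["fn"])
    spec = specificity(c["tn"], c["fp"])
    print(f"{name}: sensitivity = {sens:.3f}, specificity = {spec:.3f}")
```

    On this panel both assays are fully specific, and the difference lies in the three positive samples that RT-PCR missed.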

  12. From amino acid sequence to bioactivity: The biomedical potential of antitumor peptides.

    PubMed

    Blanco-Míguez, Aitor; Gutiérrez-Jácome, Alberto; Pérez-Pérez, Martín; Pérez-Rodríguez, Gael; Catalán-García, Sandra; Fdez-Riverola, Florentino; Lourenço, Anália; Sánchez, Borja

    2016-06-01

    Chemoprevention is the use of natural and/or synthetic substances to block, reverse, or retard the process of carcinogenesis. In this field, the use of antitumor peptides is of interest because (i) these molecules are small in size, (ii) they show good cell diffusion and permeability, (iii) they affect one or more specific molecular pathways involved in carcinogenesis, and (iv) they are not usually genotoxic. We searched the Web of Science database (23/11/2015) to collect papers reporting on bioactive peptides (1691 records), and further filtered the results with search terms such as "antiproliferative," "antitumoral," or "apoptosis," among others. Works reporting the amino acid sequence of an antiproliferative peptide were kept (60 records), and this set was complemented with the peptides included in CancerPPD, an extensive resource for antiproliferative peptides and proteins. Peptides were grouped according to one of the following mechanisms of action: inhibition of cell migration, inhibition of tumor angiogenesis, antioxidative mechanisms, inhibition of gene transcription/cell proliferation, induction of apoptosis, disorganization of tubulin structure, cytotoxicity, or unknown mechanisms. The main mechanisms of action of those antiproliferative peptides with known amino acid sequences are presented and, finally, their potential clinical usefulness and the future challenges in their application are discussed.

  13. Complete amino acid sequence of a Lolium perenne (perennial rye grass) pollen allergen, Lol p II.

    PubMed

    Ansari, A A; Shenbagamurthi, P; Marsh, D G

    1989-07-01

    The complete amino acid sequence of a Lolium perenne (rye grass) pollen allergen, Lol p II was determined by automated Edman degradation of the protein and selected fragments. Cleavage of the protein by enzymatic and chemical techniques established an unambiguous sequence for the protein. Lol p II contains 97 amino acid residues, with a calculated molecular weight of 10,882. The protein lacks cysteine and glutamine and shows no evidence of glycosylation. Theoretical predictions by Fraga's (Fraga, S. (1982) Can. J. Chem. 60, 2606-2610) and Hopp and Woods' (Hopp, T. P., and Woods, K. R. (1981) Proc. Natl. Acad. Sci. U.S.A. 78, 3824-3828) methods indicate the presence of four hydrophilic regions, which may contribute to sequential or parts of conformational B-cell epitopes. Analysis of amphipathic regions by Berzofsky's method indicates the presence of a highly amphipathic region, which may contain, or contribute to, an Ia/T-cell epitope. This latter segment of Lol p II was found to be highly homologous with an antibody-binding segment of the major rye allergen Lol p I and may explain why immune responsiveness to both the allergens is associated with HLA-DR3.

  14. Molecular cloning, encoding sequence, and expression of vaccinia virus nucleic acid-dependent nucleoside triphosphatase gene.

    PubMed

    Rodriguez, J F; Kahn, J S; Esteban, M

    1986-12-01

    A rabbit poxvirus genomic library contained within the expression vector lambda gt11 was screened with polyclonal antiserum prepared against vaccinia virus nucleic acid-dependent nucleoside triphosphatase (NTPase)-I enzyme. Five positive phage clones containing from 0.72- to 2.5-kilobase-pair (kbp) inserts expressed a beta-galactosidase fusion protein that was reactive by immunoblotting with the NTPase-I antibody. Hybridization analysis allowed the location of this gene within the vaccinia HindIIID restriction fragment. From the known nucleotide sequence of the 16-kbp vaccinia HindIIID fragment, we identified a region that contains a 1896-base open reading frame coding for a 631-amino acid protein. Analysis of the complete sequence revealed a highly basic protein, with hydrophilic COOH and NH2 termini, various hydrophobic domains, and no significant homology to other known proteins. Translational studies demonstrate that NTPase-I belongs to a late class of viral genes. This protein is highly conserved among Orthopoxviruses.

  15. Seasonal variations of fatty acid profile in different tissues of farmed bighead carp (Aristichthys nobilis).

    PubMed

    Hong, Hui; Fan, Hongbing; Wang, Hang; Lu, Han; Luo, Yongkang; Shen, Huixing

    2015-02-01

    Bighead carp (Aristichthys nobilis) is one of the major farmed species of freshwater fish in China. Byproduct volume of bighead carp is significant at up to 60 % of whole fish weight. A better understanding of the nutritional composition is needed to optimize the use of these raw materials. The objective of this research was to characterize seasonal variations of fatty acid profile in different tissues (heads, bones, skin, scales, viscera, muscle and fins) of farmed bighead carp. The fatty acid composition of farmed bighead carp varied significantly with seasons and tissues. The highest lipid content was determined in viscera while the highest EPA and DHA composition were observed in muscle compared to the other tissues. Significantly higher ΣEPA+DHA (%) was recorded in all tissues in summer (June) when compared with those of the other three seasons (p < 0.05). The n-3/n-6 fatty acid ratios in summer ranged from 3.38 to 3.69, nearly three times the ratios of the other three seasons. The results indicated that farmed bighead carp caught in summer could better balance the n-3 PUFA needs of consumers. The byproducts of bighead carp can be utilized for the production of fish oil. PMID:25694699

  16. Isolation and amino acid sequences of squirrel monkey (Saimiri sciurea) insulin and glucagon.

    PubMed Central

    Yu, J H; Eng, J; Yalow, R S

    1990-01-01

    It was reported two decades ago that insulin was not detectable in the glucose-stimulated state in Saimiri sciurea, the New World squirrel monkey, by a radioimmunoassay system developed with guinea pig anti-pork insulin antibody and labeled pork insulin. With the same system, reasonable levels were observed in rhesus monkeys and chimpanzees. This suggested that New World monkeys, like the New World hystricomorph rodents such as the guinea pig and the coypu, might have insulins whose sequences differ markedly from those of Old World mammals. In this report we describe the purification and amino acid sequences of squirrel monkey insulin and glucagon. We demonstrate that the substitutions at B29, B27, A2, A4, and A17 of squirrel monkey insulin are identical with those previously found in another New World primate, the owl monkey (Aotus trivirgatus). The immunologic cross-reactivity of this insulin in our immunoassay system is only a few percent of that of human insulin. Squirrel monkey glucagon is identical with the usual glucagon found in Old World mammals, which predicts that the glucagons of other New World monkeys would not differ from the usual Old World mammalian glucagon. It appears that the peptides of the New World monkeys have diverged less from those of the Old World mammals than have those of the New World hystricomorph rodents. The striking improvements in peptide purification and sequencing have the potential for adding new information concerning the evolutionary divergence of species. PMID:2263627

  17. Complete amino acid sequence of the myoglobin from the Pacific spotted dolphin, Stenella attenuata graffmani.

    PubMed

    Jones, B N; Wang, C C; Dwulet, F E; Lehman, L D; Meuth, J L; Bogardt, R A; Gurd, F R

    1979-04-25

    The complete amino acid sequence of the major component myoglobin from the Pacific spotted dolphin, Stenella attenuata graffmani, was determined by the automated Edman degradation of several large peptides obtained by specific cleavage of the protein. The acetimidated apomyoglobin was selectively cleaved at its two methionyl residues with cyanogen bromide and at its three arginyl residues by trypsin. By subjecting four of these peptides and the apomyoglobin to automated Edman degradation, over 80% of the primary structure of the protein was obtained. The remainder of the covalent structure was determined by the sequence analysis of peptides that resulted from further digestion of the central cyanogen bromide fragment. This fragment was cleaved at its glutamyl residues with staphylococcal protease and its lysyl residues with trypsin. The action of trypsin was restricted to the lysyl residues by chemical modification of the single arginyl residue of the fragment with 1,2-cyclohexanedione. The primary structure of this myoglobin proved to be identical with that from the Atlantic bottlenosed dolphin and Pacific common dolphin but differs from the myoglobins of the killer whale and pilot whale at two positions. The above sequence identities and differences reflect the close taxonomic relationship of these five species of Cetacea. PMID:454657

  18. Isolation and amino acid sequences of squirrel monkey (Saimiri sciurea) insulin and glucagon

    SciTech Connect

    Yu, Jinghua; Eng, J.; Yalow, R.S. (City Univ. of New York, NY)

    1990-12-01

    It was reported two decades ago that insulin was not detectable in the glucose-stimulated state in Saimiri sciurea, the New World squirrel monkey, by a radioimmunoassay system developed with guinea pig anti-pork insulin antibody and labeled pork insulin. With the same system, reasonable levels were observed in rhesus monkeys and chimpanzees. This suggested that New World monkeys, like the New World hystricomorph rodents such as the guinea pig and the coypu, might have insulins whose sequences differ markedly from those of Old World mammals. In this report the authors describe the purification and amino acid sequences of squirrel monkey insulin and glucagon. They demonstrate that the substitutions at B29, B27, A2, A4, and A17 of squirrel monkey insulin are identical with those previously found in another New World primate, the owl monkey (Aotus trivirgatus). The immunologic cross-reactivity of this insulin in their immunoassay system is only a few percent of that of human insulin. It appears that the peptides of the New World monkeys have diverged less from those of the Old World mammals than have those of the New World hystricomorph rodents. The striking improvements in peptide purification and sequencing have the potential for adding new information concerning the evolutionary divergence of species.

  19. Purification, amino acid sequence and characterisation of kangaroo IGF-I.

    PubMed

    Yandell, C A; Francis, G L; Wheldrake, J F; Upton, Z

    1998-01-01

    Insulin-like growth factor-I (IGF-I) and IGF-II have been purified to homogeneity from kangaroo (Macropus fuliginosus) serum; this represents the first report of the purification, sequencing and characterisation of marsupial IGFs. N-Terminal protein sequencing reveals that there are six amino acid differences between kangaroo and human IGF-I. Kangaroo IGF-II has been partially sequenced and no differences were found between human and kangaroo IGF-II in the 53 residues identified. Thus the IGFs appear to be remarkably structurally conserved during mammalian radiation. In addition, in vitro characterisation of kangaroo IGF-I demonstrated that the functional properties of human, kangaroo and chicken IGF-I are very similar. In an assay measuring the ability of the proteins to stimulate protein synthesis in rat L6 myoblasts, all IGF-I proteins were found to be equally potent. The ability of all three proteins to compete for binding with radiolabelled human IGF-I to type-1 IGF receptors in L6 myoblasts and in Sminthopsis crassicaudata transformed lung fibroblasts, a marsupial cell line, was comparable. Furthermore, kangaroo and human IGF-I react equally in a human IGF-I RIA using a human reference standard, radiolabelled human IGF-I and a polyclonal antibody raised against recombinant human IGF-I. This study indicates that not only is the primary structure of eutherian and metatherian IGF-I conserved, but also the proteins appear to be functionally similar.

  20. Next-generation sequencing analysis of off-ladder alleles due to migration shift caused by sequence variation at D12S391 locus.

    PubMed

    Fujii, Koji; Watahiki, Haruhiko; Mita, Yusuke; Iwashima, Yasuki; Miyaguchi, Hajime; Kitayama, Tetsushi; Nakahara, Hiroaki; Mizuno, Natsuko; Sekiguchi, Kazumasa

    2016-09-01

    In short tandem repeat (STR) analysis, length polymorphisms are detected by capillary electrophoresis (CE). At most STR loci, mobility shift due to sequence variation in the repeat region was thought not to affect the typing results. In our recent population studies of 1501 Japanese individuals, off-ladder calls were observed at the D12S391 locus using PowerPlex Fusion in nine samples for allele 22, one sample for allele 25, and one sample for allele 26. However, these samples were typed as ordinary alleles within the bins using GlobalFiler. In this study, next-generation sequencing analysis using MiSeq was performed for the D12S391 locus from the 11 off-ladder samples and 33 other samples, as well as the allelic ladders of PowerPlex Fusion and GlobalFiler. All off-ladder allele 22 in the nine samples had [AGAT]11[AGAC]11 as a repeat structure, while the corresponding allele was [AGAT]15[AGAC]6[AGAT] for the PowerPlex Fusion ladder, and [AGAT]13[AGAC]9 for the GlobalFiler ladder. Overall, as the number of [AGAT] in the repeat structure decreased at the D12S391 locus, the peak migrated more slowly using PowerPlex Fusion, the reverse strand of which was labeled, and it migrated more rapidly using GlobalFiler, the forward strand of which was labeled. The allelic ladders of both STR kits were reamplified with our small amplicon D12S391 primers and their mobility was also examined. In conclusion, off-ladder observations of allele 22 at the D12S391 locus using PowerPlex Fusion were mainly attributed to a relatively large difference of the repeat structure between its allelic ladder and off-ladder allele 22. PMID:27591542
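
    The bracketed repeat structures quoted in this abstract can be parsed mechanically to confirm that all three correspond to allele 22 while differing in composition. The sketch below is a minimal parser for that notation; the function names are illustrative, and a block with no trailing number (as in the final `[AGAT]`) is assumed to mean a single repeat.

```python
import re

# One block of the notation: a bracketed motif with an optional repeat count.
BLOCK = re.compile(r"\[([ACGT]+)\](\d*)")

def parse_repeat_structure(structure):
    """Return a list of (motif, count) pairs; a missing count means 1."""
    return [(motif, int(count) if count else 1)
            for motif, count in BLOCK.findall(structure)]

def total_repeats(structure):
    """Total number of repeat units, i.e. the nominal allele number."""
    return sum(count for _, count in parse_repeat_structure(structure))

def motif_count(structure, motif):
    """Number of repeat units of one specific motif."""
    return sum(count for m, count in parse_repeat_structure(structure)
               if m == motif)

# Repeat structures for allele 22 as given in the abstract.
alleles = {
    "off-ladder samples": "[AGAT]11[AGAC]11",
    "PowerPlex Fusion ladder": "[AGAT]15[AGAC]6[AGAT]",
    "GlobalFiler ladder": "[AGAT]13[AGAC]9",
}

for name, structure in alleles.items():
    print(name, total_repeats(structure), motif_count(structure, "AGAT"))
```

    All three structures total 22 repeats, but the AGAT counts are 11, 16, and 13 respectively, so the off-ladder allele differs more from the PowerPlex Fusion ladder than from the GlobalFiler ladder, in line with the abstract's conclusion.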

  1. Phylogeny, Floral Evolution, and Inter-Island Dispersal in Hawaiian Clermontia (Campanulaceae) Based on ISSR Variation and Plastid Spacer Sequences

    PubMed Central

    Givnish, Thomas J.; Bean, Gregory J.; Ames, Mercedes; Lyon, Stephanie P.; Sytsma, Kenneth J.

    2013-01-01

    Previous studies based on DNA restriction-site and sequence variation have shown that the Hawaiian lobeliads are monophyletic and that the two largest genera, Cyanea and Clermontia, diverged from each other ca. 9.7 Mya. Sequence divergence among species of Clermontia is quite limited, however, and extensive hybridization is suspected, which has interfered with production of a well-resolved molecular phylogeny for the genus. Clermontia is of considerable interest because several species possess petal-like sepals, raising the question of whether such a homeotic mutation has arisen once or several times. In addition, morphological and molecular studies have implied different patterns of inter-island dispersal within the genus. Here we use nuclear ISSRs (inter-simple sequence repeat polymorphisms) and five plastid non-coding sequences to derive biparental and maternal phylogenies for Clermontia. Our findings imply that (1) Clermontia is not monophyletic, with Cl. pyrularia nested within Cyanea and apparently an intergeneric hybrid; (2) the earliest divergent clades within Clermontia are native to Kauaʻi, then Oʻahu, then Maui, supporting the progression rule of dispersal down the chain toward progressively younger islands, although that rule is violated in later-evolving taxa in the ISSR tree; (3) almost no sequence divergence among several Clermontia species in 4.5 kb of rapidly evolving plastid DNA; (4) several apparent cases of hybridization/introgression or incomplete lineage sorting (i.e., Cl. oblongifolia, peleana, persicifolia, pyrularia, samuelii, tuberculata), based on extensive conflict between the ISSR and plastid phylogenies; and (5) two origins and two losses of petaloid sepals, or—perhaps more plausibly—a single origin and two losses of this homeotic mutation, with its introgression into Cl. persicifolia. Our phylogenies are better resolved and geographically more informative than others based on ITS and 5S-NTS sequences and nuclear SNPs, but agree

  2. Single Cell RNA-Sequencing of Pluripotent States Unlocks Modular Transcriptional Variation

    PubMed Central

    Kolodziejczyk, Aleksandra A.; Kim, Jong Kyoung; Tsang, Jason C.H.; Ilicic, Tomislav; Henriksson, Johan; Natarajan, Kedar N.; Tuck, Alex C.; Gao, Xuefei; Bühler, Marc; Liu, Pentao; Marioni, John C.; Teichmann, Sarah A.

    2015-01-01

    Summary Embryonic stem cell (ESC) culture conditions are important for maintaining long-term self-renewal, and they influence cellular pluripotency state. Here, we report single cell RNA-sequencing of mESCs cultured in three different conditions: serum, 2i, and the alternative ground state a2i. We find that the cellular transcriptomes of cells grown in these conditions are distinct, with 2i being the most similar to blastocyst cells and including a subpopulation resembling the two-cell embryo state. Overall levels of intercellular gene expression heterogeneity are comparable across the three conditions. However, this masks variable expression of pluripotency genes in serum cells and homogeneous expression in 2i and a2i cells. Additionally, genes related to the cell cycle are more variably expressed in the 2i and a2i conditions. Mining of our dataset for correlations in gene expression allowed us to identify additional components of the pluripotency network, including Ptma and Zfp640, illustrating its value as a resource for future discovery. PMID:26431182

  3. The evolution of proteins from random amino acid sequences: II. Evidence from the statistical distributions of the lengths of modern protein sequences.

    PubMed

    White, S H

    1994-04-01

    This paper continues an examination of the hypothesis that modern proteins evolved from random heteropeptide sequences. In support of the hypothesis, White and Jacobs (1993, J Mol Evol 36:79-95) have shown that any sequence chosen randomly from a large collection of nonhomologous proteins has a 90% or better chance of having a lengthwise distribution of amino acids that is indistinguishable from the random expectation regardless of amino acid type. The goal of the present study was to investigate the possibility that the random-origin hypothesis could explain the lengths of modern protein sequences without invoking specific mechanisms such as gene duplication or exon splicing. The sets of sequences examined were taken from the 1989 PIR database and consisted of 1,792 "super-family" proteins selected to have little sequence identity, 623 E. coli sequences, and 398 human sequences. The length distributions of the proteins could be described with high significance by either of two closely related probability density functions: The gamma distribution with parameter 2 or the distribution for the sum of two exponential random independent variables. A simple theory for the distributions was developed which assumes that (1) protoprotein sequences had exponentially distributed random independent lengths, (2) the length dependence of protein stability determined which of these protoproteins could fold into compact primitive proteins and thereby attain the potential for biochemical activity, (3) the useful protein sequences were preserved by the primitive genome, and (4) the resulting distribution of sequence lengths is reflected by modern proteins. The theory successfully predicts the two observed distributions which can be distinguished by the functional form of the dependence of protein stability on length. The theory leads to three interesting conclusions. First, it predicts that a tetra-nucleotide was the signal for primitive translation termination. 
This prediction is

  4. Sequence Design for a Test Tube of Interacting Nucleic Acid Strands.

    PubMed

    Wolfe, Brian R; Pierce, Niles A

    2015-10-16

    We describe an algorithm for designing the equilibrium base-pairing properties of a test tube of interacting nucleic acid strands. A target test tube is specified as a set of desired "on-target" complexes, each with a target secondary structure and target concentration, and a set of undesired "off-target" complexes, each with vanishing target concentration. Sequence design is performed by optimizing the test tube ensemble defect, corresponding to the concentration of incorrectly paired nucleotides at equilibrium evaluated over the ensemble of the test tube. To reduce the computational cost of accepting or rejecting mutations to a random initial sequence, the structural ensemble of each on-target complex is hierarchically decomposed into a tree of conditional subensembles, yielding a forest of decomposition trees. Candidate sequences are evaluated efficiently at the leaf level of the decomposition forest by estimating the test tube ensemble defect from conditional physical properties calculated over the leaf subensembles. As optimized subsequences are merged toward the root level of the forest, any emergent defects are eliminated via ensemble redecomposition and sequence reoptimization. After successfully merging subsequences to the root level, the exact test tube ensemble defect is calculated for the first time, explicitly checking for the effect of the previously neglected off-target complexes. Any off-target complexes that form at appreciable concentration are hierarchically decomposed, added to the decomposition forest, and actively destabilized during subsequent forest reoptimization. For target test tubes representative of design challenges in the molecular programming and synthetic biology communities, our test tube design algorithm typically succeeds in achieving a normalized test tube ensemble defect ≤1% at a design cost within an order of magnitude of the cost of test tube analysis.

  5. Sequence-Specific Electrical Purification of Nucleic Acids with Nanoporous Gold Electrodes.

    PubMed

    Daggumati, Pallavi; Appelt, Sandra; Matharu, Zimple; Marco, Maria L; Seker, Erkin

    2016-06-22

    Nucleic-acid-based biosensors have enabled rapid and sensitive detection of pathogenic targets; however, these devices often require purified nucleic acids for analysis since the constituents of complex biological fluids adversely affect sensor performance. This purification step is typically performed outside the device, thereby increasing sample-to-answer time and introducing contaminants. We report a novel approach using a multifunctional matrix, nanoporous gold (np-Au), which enables both detection of specific target sequences in a complex biological sample and their subsequent purification. The np-Au electrodes modified with 26-mer DNA probes (via thiol-gold chemistry) enabled sensitive detection and capture of complementary DNA targets in the presence of complex media (fetal bovine serum) and other interfering DNA fragments in the range of 50-1500 base pairs. Upon capture, the noncomplementary DNA fragments and serum constituents of varying sizes were washed away. Finally, the surface-bound DNA-DNA hybrids were released by electrochemically cleaving the thiol-gold linkage, and the hybrids were iontophoretically eluted from the nanoporous matrix. The optical and electrophoretic characterization of the analytes before and after the detection-purification process revealed that low target DNA concentrations (80 pg/μL) can be successfully detected in complex biological fluids and subsequently released to yield pure hybrids free of polydisperse digested DNA fragments and serum biomolecules. Taken together, this multifunctional platform is expected to enable seamless integration of detection and purification of nucleic acid biomarkers of pathogens and diseases in miniaturized diagnostic devices.

  6. Amino acid sequence analysis and characterization of a ribonuclease from starfish Asterias amurensis.

    PubMed

    Motoyoshi, Naomi; Kobayashi, Hiroko; Itagaki, Tadashi; Inokuchi, Norio

    2016-09-01

    The aim of this study was to phylogenetically characterize the location of the RNase T2 enzyme in the starfish (Asterias amurensis). We isolated an RNase T2 ribonuclease (RNase Aa) from the ovaries of starfish and determined its amino acid sequence by protein chemistry and cloning cDNA encoding RNase Aa. The isolated protein had 231 amino acid residues, a predicted molecular mass of 25,906 Da, and an optimal pH of 5.0. RNase Aa preferentially released guanylic acid from the RNA. The catalytic sites of the RNase T2 family are conserved in RNase Aa; furthermore, the distribution of the cysteine residues in RNase Aa is similar to that in other animal and plant T2 RNases. RNase Aa is cleaved at two points: 21 residues from the N-terminus and 29 residues from the C-terminus; however, both fragments may remain attached to the protein via disulfide bridges, leading to the maintenance of its conformation, as suggested by circular dichroism spectrum analysis. The phylogenetic analysis revealed that starfish RNase Aa is evolutionarily an intermediate between protozoan and oyster RNases. PMID:26920046

  7. [Mitochondrial DNA sequence variation, demographic history, and population structure of Amur sturgeon Acipenser schrenckii Brandt, 1869].

    PubMed

    Shedko, S V; Miroshnichenko, I L; Nemkova, G A; Koshelev, V N; Shedko, M B

    2015-02-01

    The variability of the mtDNA control region (D-loop) was examined in Amur sturgeon endemic to the Amur River. This species is also classified as critically endangered by the IUCN Red List of Threatened Species. Sequencing of 796- to 812-bp fragments of the D-loop in 112 sturgeon collected in the Lower Amur revealed 73 different genotypes. The sample was characterized by a high level of haplotypic (0.976) and nucleotide (0.0194) diversity. The identified haplotypes split into two well-defined monophyletic groups, BG (n = 39) and SM (n = 34), differing (HKY distance) on average by 3.41% of nucleotide positions upon an average level of intragroup differences of 0.54 and 1.23%, respectively. Moreover, the haplotypes of the SM group differed by the presence of a 13-14 bp deletion. Most of the samples (66 out of 112) carried BG haplotypes. Overall, the pattern of pairwise nucleotide differences and