Science.gov

Sample records for acid sequence variations

  1. Analysis of amino acid sequence variations and immunoglobulin E-binding epitopes of German cockroach tropomyosin.

    PubMed

    Jeong, Kyoung Yong; Lee, Jongweon; Lee, In-Yong; Ree, Han-Il; Hong, Chein-Soo; Yong, Tai-Soon

    2004-09-01

    The allergenicities of tropomyosins from different organisms have been reported to vary. The cDNA encoding German cockroach tropomyosin (Bla g 7) was isolated, expressed, and characterized previously. In the present study, the amino acid sequence variations in German cockroach tropomyosin were analyzed in order to investigate their influence on allergenicity. We also undertook the identification of immunodominant peptides containing immunoglobulin E (IgE) epitopes which may facilitate the development of diagnostic and immunotherapeutic strategies based on the recombinant proteins. Two-dimensional gel electrophoresis and immunoblot analysis with mouse anti-recombinant German cockroach tropomyosin serum were performed to investigate the isoforms at the protein level. Reverse transcriptase PCR (RT-PCR) was applied to examine the sequence diversity. Eleven different variants of the deduced amino acid sequences were identified by RT-PCR. German cockroach tropomyosin has only minor sequence variations that did not seem to affect its allergenicity significantly. These results support the molecular basis underlying the cross-reactivities of arthropod tropomyosins. Recombinant fragments were also generated by PCR, and IgE-binding epitopes were assessed by enzyme-linked immunosorbent assay. Sera from seven patients revealed heterogeneous IgE-binding responses. This study demonstrates multiple IgE-binding epitope regions in a single molecule, suggesting that full-length tropomyosin should be used for the development of diagnostic and therapeutic reagents.

  2. Formation Sequences of Iron Minerals in the Acidic Alteration Products and Variation of Hydrothermal Fluid Conditions

    NASA Astrophysics Data System (ADS)

    Isobe, H.; Yoshizawa, M.

    2008-12-01

    Iron minerals play an important role in environmental issues not only on Earth but also on other terrestrial planets. Iron mineral species related to alteration products of primary minerals with surface or subsurface fluids are characterized by temperature, acidity and redox conditions of the fluids. Various iron-bearing alteration products occur around fumaroles in geothermal/volcanic areas. In this study, zonal structures of iron minerals in alteration products of the geothermal area are observed to elucidate temporal and spatial variation of hydrothermal fluids. Alteration of the pyroxene-amphibole andesite of Garan-dake volcano, Oita, Japan occurs by the acidic hydrothermal fluid to form cristobalite, leaching out elements other than Si. Hand specimens with unaltered or weakly altered core and cristobalite crust show various sequences of layers. XRD analysis revealed that the alteration degree is represented by the abundance of cristobalite. Intermediately altered layers are characterized by the occurrence of minerals including alunite, pyrite, kaolinite, goethite and hematite. A specimen with a reddish brown core surrounded by a cristobalite-rich white crust has brown colored layers at the boundary of the core and the crust. The reddish core is characterized by the occurrence of crystalline hematite, as shown by XRD. Another hand specimen has a light gray core, which represents reduced conditions, and a white cristobalite crust with light brown and reddish brown layers of ferric iron minerals between the core and the crust. On the other hand, hornblende crystals, a typical ferrous iron-bearing mineral of the host rock, are well preserved in some samples with strongly decolorized cristobalite-rich groundmass. Hydrothermal alteration experiments on iron-rich basaltic material show that iron mineral species depend on the acidity and temperature of the fluid. Oxidation states of the iron-bearing mineral species are strongly influenced by the acidity and redox conditions. Variations of alteration...

  3. Alignment of 700 globin sequences: extent of amino acid substitution and its correlation with variation in volume.

    PubMed Central

    Kapp, O. H.; Moens, L.; Vanfleteren, J.; Trotman, C. N.; Suzuki, T.; Vinogradov, S. N.

    1995-01-01

    Seven hundred globin sequences, including 146 nonvertebrate sequences, were aligned on the basis of conservation of secondary structure and the avoidance of gap penalties. Of the 182 positions needed to accommodate all the globin sequences, only 84 are common to all, including the absolutely conserved PheCD1 and HisF8. The mean number of amino acid substitutions per position ranges from 8 to 13 for all globins and 5 to 9 for internal positions. Although the total sequence volumes vary by approximately 2-3%, the variation in volume per position ranges from approximately 13% for the internal to approximately 21% for the surface positions. Plausible correlations exist between amino acid substitution and the variation in volume per position for the 84 common and the internal but not the surface positions. The amino acid substitution matrix derived from the 84 common positions was used to evaluate sequence similarity within the globins and between the globins and phycocyanins C and colicins A, via calculation of pairwise similarity scores. The scores for globin-globin comparisons over the 84 common positions overlap the globin-phycocyanin and globin-colicin scores, with the former being intermediate. For the subset of internal positions, overlap is minimal between the three groups of scores. These results imply a continuum of amino acid sequences able to assume the common three-on-three alpha-helical structure and suggest that the determinants of the latter include sites other than those inaccessible to solvent. PMID:8535255
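
    The per-position counts described in this abstract can be illustrated with a minimal sketch. It assumes a gapped alignment given as equal-length strings and uses the number of distinct residues in a column as a simple proxy for "substitutions per position"; the abstract does not define its exact metric. The function names and the toy alignment are hypothetical and are not globin data.

```python
from collections import Counter


def residues_per_position(alignment, gap="-"):
    """Return the number of distinct residues in each column (gaps ignored)."""
    lengths = {len(seq) for seq in alignment}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    counts = []
    for column in zip(*alignment):
        residues = Counter(r for r in column if r != gap)
        counts.append(len(residues))
    return counts


def common_positions(alignment, gap="-"):
    """Return indices of columns in which no sequence has a gap."""
    return [i for i, column in enumerate(zip(*alignment)) if gap not in column]


# Hypothetical toy alignment, for illustration only.
toy_alignment = [
    "VLSPADKTNV",
    "VLSAADKGNV",
    "V-SEGEWQLV",
]

print(residues_per_position(toy_alignment))
print(common_positions(toy_alignment))
```

    Restricting later statistics to the indices returned by `common_positions` mirrors the abstract's distinction between all alignment positions and those shared by every sequence.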

  4. DNA Sequence and Expression Variation of Hop (Humulus lupulus) Valerophenone Synthase (VPS), a Key Gene in Bitter Acid Biosynthesis

    PubMed Central

    Castro, Consuelo B.; Whittock, Lucy D.; Whittock, Simon P.; Leggett, Grey; Koutoulis, Anthony

    2008-01-01

    Background: The hop plant (Humulus lupulus) is a source of many secondary metabolites, with bitter acids essential in the beer brewing industry and others having potential applications for human health. This study investigated variation in DNA sequence and gene expression of valerophenone synthase (VPS), a key gene in the bitter acid biosynthesis pathway of hop. Methods: Sequence variation was studied in 12 varieties, and expression was analysed in four of the 12 varieties in a series across the development of the hop cone. Results: Nine single nucleotide polymorphisms (SNPs) were detected in VPS, seven of which were synonymous. The two non-synonymous polymorphisms did not appear to be related to typical bitter acid profiles of the varieties studied. However, real-time quantitative reverse-transcription polymerase chain reaction (qRT-PCR) analysis of VPS expression during hop cone development showed a clear link with the bitter acid content. The highest levels of VPS expression were observed in two triploid varieties, ‘Symphony’ and ‘Ember’, which typically have high bitter acid levels. Conclusions: In all hop varieties studied, VPS expression was lowest in the leaves and an increase in expression was consistently observed during the early stages of cone development. PMID:18519445
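
    The distinction between synonymous and non-synonymous SNPs drawn in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions: the coding sequence is in frame, positions are 0-based, and the standard genetic code applies. The function `classify_snp` and the toy sequence are hypothetical; they are not the VPS sequence or the authors' pipeline.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(coding_seq, position, alt_base):
    """Classify a single-base change at a 0-based position of an in-frame CDS."""
    coding_seq = coding_seq.upper()
    alt_base = alt_base.upper()
    codon_start = position - position % 3
    ref_codon = coding_seq[codon_start:codon_start + 3]
    if len(ref_codon) != 3:
        raise ValueError("position falls in an incomplete codon")
    offset = position - codon_start
    alt_codon = ref_codon[:offset] + alt_base + ref_codon[offset + 1:]
    # A SNP is synonymous when both codons encode the same amino acid.
    if CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]:
        return "synonymous"
    return "non-synonymous"


# Hypothetical toy CDS: ATG GCT GAA (Met-Ala-Glu).
print(classify_snp("ATGGCTGAA", 5, "C"))  # GCT -> GCC, both alanine
print(classify_snp("ATGGCTGAA", 6, "A"))  # GAA -> AAA, Glu -> Lys
```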

  5. Lactobacillus kefiri shows inter-strain variations in the amino acid sequence of the S-layer proteins.

    PubMed

    Malamud, Mariano; Carasi, Paula; Bronsoms, Sílvia; Trejo, Sebastián A; Serradell, María de Los Angeles

    2017-04-01

    The S-layer is a proteinaceous envelope constituted by subunits that self-assemble to form a two-dimensional lattice that covers the surface of different species of Bacteria and Archaea, and it could be involved in cell recognition of microbes, among several other distinct functions. In this work, both proteomic and genomic approaches were used to gain knowledge about the sequences of the S-layer protein (SLP)-encoding genes expressed by six aggregative and sixteen non-aggregative strains of potentially probiotic Lactobacillus kefiri. Peptide mass fingerprint (PMF) analysis confirmed the identity of SLPs extracted from L. kefiri, and based on the homology with phylogenetically related species, primers located outside and inside the SLP genes were employed to amplify genomic DNA. The O-glycosylation site SASSAS was found in all L. kefiri SLPs. Ten strains were selected for sequencing of the complete genes. The total length of the mature proteins varies from 492 to 576 amino acids, and all SLPs have a calculated pI between 9.37 and 9.60. The N-terminal region is relatively conserved and shows a high percentage of positively charged amino acids. Major differences among strains are found in the C-terminal region. Different groups could be distinguished regarding the mature SLPs and the similarities observed in the PMF spectra. Interestingly, SLPs of the aggregative strains are 100% homologous, although these strains were isolated from different kefir grains. This knowledge provides relevant data for better understanding of the mechanisms involved in SLP functionality and could contribute to the development of products of biotechnological interest from potentially probiotic bacteria.

  6. Detection of nucleic acid sequences by invader-directed cleavage

    DOEpatents

    Brow, Mary Ann D.; Hall, Jeff Steven Grotelueschen; Lyamichev, Victor; Olive, David Michael; Prudent, James Robert

    1999-01-01

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The 5' nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge.

  7. Composition for nucleic acid sequencing

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2008-08-26

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  8. Developmental variation and amino acid sequences of cytochromes c of the fruit fly Drosophila melanogaster and the flesh fly Boettcherisca peregrina.

    PubMed

    Inoue, S; Inoue, H; Hiroyoshi, T; Matsubara, H; Yamanaka, T

    1986-10-01

    The amino acid sequences of cytochromes c purified from the fruit fly Drosophila melanogaster and the flesh fly Boettcherisca peregrina were determined. In contrast with the case of the housefly, isocytochromes c were not detected in these flies at any developmental stage. The sequence of fruit fly cytochrome c differed from that reported previously but was identical with that predicted from the nucleotide sequence of the fruit fly cytochrome c gene (DC4) (Limbach, K.J. & Wu, R. (1985) Nucl. Acids Res. 13, 631-644). Isocytochrome c of the fruit fly, reported to be encoded by the DC3 gene, was not detected as a functional cytochrome c molecule.

  9. High speed nucleic acid sequencing

    SciTech Connect

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2011-05-17

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid. Each type of labeled nucleotide comprises an acceptor fluorophore attached to a phosphate portion of the nucleotide such that the fluorophore is removed upon incorporation into a growing strand. Fluorescent signal is emitted via fluorescent resonance energy transfer between the donor fluorophore and the acceptor fluorophore as each nucleotide is incorporated into the growing strand. The sequence is deduced by identifying which base is being incorporated into the growing strand.

  10. Variation in seed fatty acid composition and sequence divergence in the FAD2 gene coding region between wild and cultivated sesame.

    PubMed

    Chen, Zhenbang; Tonnis, Brandon; Morris, Brad; Wang, Richard B; Zhang, Amy L; Pinnow, David; Wang, Ming Li

    2014-12-03

    Sesame germplasm harbors genetic diversity which can be useful for sesame improvement in breeding programs. Seven accessions with different levels of oleic acid were selected from the entire USDA sesame germplasm collection (1232 accessions) and planted for morphological observation and re-examination of fatty acid composition. The coding region of the FAD2 gene for fatty acid desaturase (FAD) in these accessions was also sequenced. Cultivated sesame accessions flowered and matured earlier than the wild species. The cultivated sesame seeds contained a significantly higher percentage of oleic acid (40.4%) than the seeds of the wild species (26.1%). Nucleotide polymorphisms were identified in the FAD2 gene coding region between wild and cultivated species. Some nucleotide polymorphisms led to amino acid changes, one of which was located in the enzyme active site and may contribute to the altered fatty acid composition. Based on the morphology observation, chemical analysis, and sequence analysis, it was determined that two accessions were misnamed and need to be reclassified. The results obtained from this study are useful for sesame improvement in molecular breeding programs.

  11. The regions of sequence variation in caulimovirus gene VI.

    PubMed

    Sanger, M; Daubert, S; Goodman, R M

    1991-06-01

    The sequence of gene VI from figwort mosaic virus (FMV) clone x4 was determined and compared with that previously published for FMV clone DxS. Both clones originated from the same virus isolation, but the virus used to clone DxS was propagated extensively in a host of a different family prior to cloning whereas that used to clone x4 was not. Differences in the amino acid sequence inferred from the DNA sequences occurred in two clusters. An N-terminal conserved region preceded two regions of variation separated by a central conserved region. Variation in cauliflower mosaic virus (CaMV) gene VI sequences, all of which were derived from virus isolates from hosts from one host family, was similar to that seen in the FMV comparison, though the extent of variation was less. Alignment of gene VI domains from FMV and CaMV revealed regions of amino acid sequence identical in both viruses within the conserved regions. The similarity in the pattern of conserved and variable domains of these two viruses suggests common host-interactive functions in caulimovirus gene VI homologues, and possibly an analogy between caulimoviruses and certain animal viruses in the influence of the host on sequence variability of viral genes.

  12. A case study on the genetic origin of the high oleic acid trait through FAD2-1 DNA sequence variation in safflower (Carthamus tinctorius L.).

    PubMed

    Rapson, Sara; Wu, Man; Okada, Shoko; Das, Alpana; Shrestha, Pushkar; Zhou, Xue-Rong; Wood, Craig; Green, Allan; Singh, Surinder; Liu, Qing

    2015-01-01

    The safflower (Carthamus tinctorius L.) is considered a strongly domesticated species with a long history of cultivation. The hybridization of safflower with its wild relatives has played an important role in the evolution of cultivars and is of particular interest with regards to their production of high quality edible oils. Original safflower varieties were all rich in linoleic acid, while varieties rich in oleic acid have risen to prominence in recent decades. The high oleic acid trait is controlled by a partially recessive allele ol at a single locus OL. The ol allele was found to be a defective microsomal oleate desaturase FAD2-1. Here we present DNA sequence data and Southern blot analysis suggesting that there has been an ancient hybridization and introgression of the FAD2-1 gene into C. tinctorius from its wild relative C. palaestinus. It is from this gene that FAD2-1Δ was derived more recently. Identification and characterization of the genetic origin and diversity of FAD2-1 could aid safflower breeders in reducing population size and generations required for the development of new high oleic acid varieties by using perfect molecular marker-assisted selection.

  13. Protein structure prediction from sequence variation

    PubMed Central

    Marks, Debora S; Hopf, Thomas A; Sander, Chris

    2015-01-01

    Genomic sequences contain rich evolutionary information about functional constraints on macromolecules such as proteins. This information can be efficiently mined to detect evolutionary couplings between residues in proteins and address the long-standing challenge to compute protein three-dimensional structures from amino acid sequences. Substantial progress has recently been made on this problem owing to the explosive growth in available sequences and the application of global statistical methods. In addition to three-dimensional structure, the improved understanding of covariation may help identify functional residues involved in ligand binding, protein-complex formation and conformational changes. We expect computation of covariation patterns to complement experimental structural biology in elucidating the full spectrum of protein structures, their functional interactions and evolutionary dynamics. PMID:23138306
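
    The idea of covariation between alignment columns can be illustrated with a deliberately simple measure. The abstract refers to global statistical methods; the sketch below instead computes plain mutual information between two columns, a local pairwise measure swapped in here only because it fits in a few lines. The function name and the toy alignment are hypothetical.

```python
import math
from collections import Counter


def mutual_information(alignment, i, j):
    """Mutual information, in bits, between alignment columns i and j."""
    n = len(alignment)
    col_i = [seq[i] for seq in alignment]
    col_j = [seq[j] for seq in alignment]
    freq_i = Counter(col_i)
    freq_j = Counter(col_j)
    freq_ij = Counter(zip(col_i, col_j))
    mi = 0.0
    for (a, b), count in freq_ij.items():
        p_ab = count / n
        p_a = freq_i[a] / n
        p_b = freq_j[b] / n
        mi += p_ab * math.log2(p_ab / (p_a * p_b))
    return mi


# Hypothetical toy alignment: columns 0 and 1 change together (A with K,
# G with R), while column 3 varies independently of column 0.
toy_alignment = ["AKLE", "AKLD", "GRLE", "GRLD"]

print(mutual_information(toy_alignment, 0, 1))
print(mutual_information(toy_alignment, 0, 3))
```

    Pairwise measures of this kind cannot separate direct couplings from indirect, transitive ones, which is one motivation for the global models the abstract mentions.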

  14. Chromospheric variations in main-sequence stars

    NASA Technical Reports Server (NTRS)

    Baliunas, S. L.; Donahue, R. A.; Soon, J. H.; Horne, J. H.; Frazer, J.; Woodard-Eklund, L.; Bradford, M.; Rao, L. M.; Wilson, O. C.; Zhang, Q.

    1995-01-01

    The fluxes in passbands 0.1 nm wide and centered on the Ca II H and K emission cores have been monitored in 111 stars of spectral type F2-M2 on or near the main sequence in a continuation of an observing program started by O. C. Wilson. Most of the measurements began in 1966, with observations scheduled monthly until 1980, when observations were scheduled several times per week. The records, with a long-term precision of about 1.5%, display fluctuations that can be identified with variations on timescales similar to the 11 yr cycle of solar activity as well as axial rotation, and the growth and decay of emitting regions. We present the records of chromospheric emission and general conclusions about variations in surface magnetic activity on timescales greater than 1 yr but less than a few decades. The results for stars of spectral type G0-K5 V indicate a pattern of change in rotation and chromospheric activity on an evolutionary timescale, in which (1) young stars exhibit high average levels of activity, rapid rotation rates, no Maunder minimum phase and rarely display a smooth, cyclic variation; (2) stars of intermediate age (approximately 1-2 Gyr for 1 solar mass) have moderate levels of activity and rotation rates, and occasional smooth cycles; and (3) stars as old as the Sun and older have slower rotation rates, lower activity levels and smooth cycles with occasional Maunder minimum phases.

  15. Indole acetic acid production by fluorescent Pseudomonas spp. from the rhizosphere of Plectranthus amboinicus (Lour.) Spreng. and their variation in extragenic repetitive DNA sequences.

    PubMed

    Sethia, Bedhya; Mustafa, Mariam; Manohar, Sneha; Patil, Savita V; Jayamohan, Nellickal Subramanian; Kumudini, Belur Satyan

    2015-06-01

    Fluorescent Pseudomonas (FP) is a heterogeneous group of growth-promoting rhizobacteria that regulate plant growth by releasing secondary metabolic compounds, viz. indole acetic acid (IAA), siderophores, ammonia and hydrogen cyanide. In the present study, IAA-producing FPs from the rhizosphere of Plectranthus amboinicus were characterized morphologically, biochemically and at the molecular level. Molecular identification of the isolates was carried out using Pseudomonas-specific primers. The effects of varying time (24, 48, 72 and 96 h), Trp concentration (100, 200, 300, 400 and 500 μg/ml), temperature (10, 26, 37 and 50 ± 2 °C) and pH (6, 7 and 8) on IAA production by the 10 best isolates were studied. Results showed higher IAA production at 72 h incubation, 300 μg/ml Trp, 26 ± 2 °C and pH 7. TLC with acidified ethyl acetate extract showed that the IAA produced has a similar Rf value to that of the standard IAA. Results of TLC were confirmed by HPLC analysis. Genetic diversity of the isolates was also studied using 40 RAPD and 4 Rep primers. Genetic diversity parameters such as dominance, Shannon index and Simpson index were calculated. Out of 40 RAPD primers tested, 9 (2 OP-D series and 7 OP-E series) were shortlisted for further analysis. Studies using RAPD, ERIC, BOX, REP and GTG5 primers revealed that isolates exhibit significant diversity in repetitive DNA sequences irrespective of the rhizosphere.
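
    The diversity parameters named in this abstract can be sketched from a list of category counts. The definitions below are assumptions, since the abstract does not say which conventions were used: dominance is taken as the sum of squared proportions, the Simpson index as one minus dominance, and the Shannon index uses natural logarithms. The function name and the example counts are hypothetical.

```python
import math


def diversity_indices(counts):
    """Compute Shannon, dominance and Simpson indices from category counts."""
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    # Empty categories contribute nothing and are skipped to avoid log(0).
    proportions = [c / total for c in counts if c > 0]
    shannon = -sum(p * math.log(p) for p in proportions)
    dominance = sum(p * p for p in proportions)
    return {
        "shannon": shannon,
        "dominance": dominance,
        "simpson": 1.0 - dominance,
    }


# Hypothetical example: four equally frequent banding patterns.
indices = diversity_indices([10, 10, 10, 10])
print(indices)
```

    With equal frequencies the Shannon index reaches its maximum for that number of categories (the natural log of the category count), while dominance is at its minimum.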

  16. Variation in the nucleotide sequence of a prolamin gene family in wild rice.

    PubMed

    Barbier, P; Ishihama, A

    1990-07-01

    Variation in the DNA sequence of the 10 kDa prolamin gene family within the wild rice species Oryza rufipogon was probed using the direct sequencing of PCR-amplified genes. A comparison of the nucleotide and deduced amino-acid sequences of eight Asian strains of O. rufipogon and one strain of the related African species O. longistaminata is presented.

  17. Chip-based sequencing nucleic acids

    DOEpatents

    Beer, Neil Reginald

    2014-08-26

    A system for fast DNA sequencing by amplification of genetic material within microreactors, denaturing, demulsifying, and then sequencing the material, while retaining it in a PCR/sequencing zone by a magnetic field. One embodiment includes sequencing nucleic acids on a microchip that includes a microchannel flow channel in the microchip. The nucleic acids are isolated and hybridized to magnetic nanoparticles or to magnetic polystyrene-coated beads. Microreactor droplets are formed in the microchannel flow channel. The microreactor droplets containing the nucleic acids and the magnetic nanoparticles are retained in a magnetic trap in the microchannel flow channel and sequenced.

  18. Inter-specific sequence conservation and intra-individual sequence variation in a spider silk gene.

    PubMed

    Tai, Pei-Ling; Hwang, Guang-Yuh; Tso, I-Min

    2004-10-01

    Currently, studies on major ampullate spidroin 1 (MaSp1) genes of non-orb weaving spiders are few, and it is not clear whether genes of these organisms exhibit the same characteristics as those of orb-weavers. In addition, many studies have proposed that MaSp1 might be a single gene with allelic variants, but supporting evidence is still lacking. In this study, we compared partial DNA and amino acid sequences of MaSp1 cloned from different spider guilds. We also cloned partial MaSp1 sequences from genomic DNA and cDNA of the same individuals of spiders using the same primer combination to see if different molecular forms existed. In the repetitive region of partial MaSp1 sequences obtained, GGX, GA and poly-A motifs were present in all Araneomorphae and Mygalomorphae species examined. An extreme similarity in MaSp1 non-repetitive portions was found in sequences of ecribellate, cribellate and Mygalomorphae web-builders, and such a result suggested that this sequence might exhibit an important function. A comparison of sequences amplified from the same individual showed that substitutions in amino acids occurred in both repetitive and non-repetitive regions, with a much higher variation in the former. These results suggest that the MaSp1 of Araneomorphae spiders exhibits several forms in an individual spider and it might be either a multiple gene or a single gene with a multiple exon/intron organization.

  19. Distinguishing Proteins From Arbitrary Amino Acid Sequences

    PubMed Central

    Yau, Stephen S.-T.; Mao, Wei-Guang; Benson, Max; He, Rong Lucy

    2015-01-01

    What kinds of amino acid sequences could possibly be protein sequences? From all existing databases that we can find, known proteins are only a small fraction of all possible combinations of amino acids. Beginning with Sanger's first detailed determination of a protein sequence in 1952, previous studies have focused on describing the structure of existing protein sequences in order to construct the protein universe. No one, however, has developed a criterion for determining whether an arbitrary amino acid sequence can be a protein. Here we show that when the collection of arbitrary amino acid sequences is viewed in an appropriate geometric context, the protein sequences cluster together. This leads to a new computational test, described here, that has proved to be remarkably accurate at determining whether an arbitrary amino acid sequence can be a protein. Even more, if the results of this test indicate that the sequence can be a protein, and it is indeed a protein sequence, then its identity as a protein sequence is uniquely defined. We anticipate our computational test will be useful for those who are attempting to complete the job of discovering all proteins, or constructing the protein universe. PMID:25609314

  20. The complete amino acid sequence of prochymosin.

    PubMed Central

    Foltmann, B; Pedersen, V B; Jacobsen, H; Kauffman, D; Wybrandt, G

    1977-01-01

    The total sequence of 365 amino acid residues in bovine prochymosin is presented. Alignment with the amino acid sequence of porcine pepsinogen shows that 204 amino acid residues are common to the two zymogens. Further comparison and alignment with the amino acid sequence of penicillopepsin shows that 66 residues are located at identical positions in all three proteases. The three enzymes belong to a large group of proteases with two aspartate residues in the active center. This group forms a family derived from one common ancestor. PMID:329280

  1. Method for sequencing nucleic acid molecules

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2006-05-30

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  2. Method for sequencing nucleic acid molecules

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2006-06-06

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  3. Variation in Seed Fatty Acid Composition, and Sequence Divergence in the FAD2 Gene Coding Region between Wild and Cultivated Sesame

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Sesame germplasm harbors genetic diversity which can be useful for sesame improvement in breeding programs. Seven accessions with different levels of oleic acid were selected from the entire USDA sesame germplasm collection (1232 accessions) and planted for morphological observation and re-examinati...

  4. Effect of amino acid sequence variations at position 149 on the fusogenic activity of the subtype B avian metapneumovirus fusion protein.

    PubMed

    Yun, Bingling; Gao, Yanni; Liu, Yongzhen; Guan, Xiaolu; Wang, Yongqiang; Qi, Xiaole; Gao, Honglei; Liu, Changjun; Cui, Hongyu; Zhang, Yanping; Gao, Yulong; Wang, Xiaomei

    2015-10-01

    The entry of enveloped viruses into host cells requires the fusion of viral and cell membranes. These membrane fusion reactions are mediated by virus-encoded glycoproteins. In the case of avian metapneumovirus (aMPV), the fusion (F) protein alone can mediate virus entry and induce syncytium formation in vitro. To investigate the fusogenic activity of the aMPV F protein, we compared the fusogenic activities of three subtypes of aMPV F proteins using a TCSD50 assay developed in this study. Interestingly, we found that the F protein of aMPV subtype B (aMPV/B) strain VCO3/60616 (aMPV/vB) was hyperfusogenic when compared with F proteins of aMPV/B strain aMPV/f (aMPV/fB), aMPV subtype A (aMPV/A), and aMPV subtype C (aMPV/C). We then further demonstrated that the amino acid (aa) residue 149F contributed to the hyperfusogenic activity of the aMPV/vB F protein. Moreover, we revealed that residue 149F had no effect on the fusogenic activities of aMPV/A, aMPV/C, and human metapneumovirus (hMPV) F proteins. Collectively, we provide the first evidence that the amino acid at position 149 affects the fusogenic activity of the aMPV/B F protein, and our findings will provide new insights into the fusogenic mechanism of this protein.

  5. Phenolic acid esterases, coding sequences and methods

    DOEpatents

    Blum, David L.; Kataeva, Irina; Li, Xin-Liang; Ljungdahl, Lars G.

    2002-01-01

    Described herein are four phenolic acid esterases, three of which correspond to domains of previously unknown function within bacterial xylanases, from XynY and XynZ of Clostridium thermocellum and from a xylanase of Ruminococcus. The fourth specifically exemplified xylanase is a protein encoded within the genome of Orpinomyces PC-2. The amino acids of these polypeptides and nucleotide sequences encoding them are provided. Recombinant host cells, expression vectors and methods for the recombinant production of phenolic acid esterases are also provided.

  6. Using chaos to generate variations on movement sequences

    NASA Astrophysics Data System (ADS)

    Bradley, Elizabeth; Stuart, Joshua

    1998-12-01

    We describe a method for introducing variations into predefined motion sequences using a chaotic symbol-sequence reordering technique. A progression of symbols representing the body positions in a dance piece, martial arts form, or other motion sequence is mapped onto a chaotic trajectory, establishing a symbolic dynamics that links the movement sequence and the attractor structure. A variation on the original piece is created by generating a trajectory with slightly different initial conditions, inverting the mapping, and using special corpus-based graph-theoretic interpolation schemes to smooth any abrupt transitions. Sensitive dependence guarantees that the variation is different from the original; the attractor structure and the symbolic dynamics guarantee that the two resemble one another in both aesthetic and mathematical senses.
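
    The core idea of the abstract above (attach symbols to points of a chaotic trajectory, then read symbols back along a trajectory started from a slightly different initial condition) can be sketched as follows. This is a simplified illustration under stated assumptions, not the authors' implementation: it uses a one-dimensional logistic map as the chaotic system, a nearest-point rule to invert the mapping, and it omits the interpolation step that smooths abrupt transitions. All function names and parameter values are hypothetical.

```python
def logistic_trajectory(x0, n, r=3.99):
    """Return n successive points of the logistic map x -> r * x * (1 - x)."""
    points = []
    x = x0
    for _ in range(n):
        points.append(x)
        x = r * x * (1.0 - x)
    return points


def chaotic_variation(symbols, x0=0.3, x0_variant=0.3001):
    """Reorder `symbols` by following a trajectory with a perturbed start.

    Symbol i is attached to point i of the reference trajectory. Each point of
    the variant trajectory then emits the symbol of its nearest reference point.
    """
    reference = logistic_trajectory(x0, len(symbols))
    variant = logistic_trajectory(x0_variant, len(symbols))
    result = []
    for y in variant:
        nearest = min(range(len(reference)), key=lambda i: abs(reference[i] - y))
        result.append(symbols[nearest])
    return result


moves = list("ABCDEFGH")            # stand-ins for body positions
variation = chaotic_variation(moves)
```

    Starting the variant from the same initial condition reproduces the original sequence, which is a convenient sanity check before perturbing it.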

  7. Method for identifying and quantifying nucleic acid sequence aberrations

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1998-07-21

    A method is disclosed for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe. 11 figs.

  8. Method for identifying and quantifying nucleic acid sequence aberrations

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1998-01-01

    A method for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe.

  9. Methods for analyzing nucleic acid sequences

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2011-05-17

    The present invention is directed to a method of sequencing a target nucleic acid. The method provides a complex comprising a polymerase enzyme, a target nucleic acid molecule, and a primer, wherein the complex is immobilized on a support. A fluorescent label is attached to a terminal phosphate group of the nucleotide or nucleotide analog. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. Signals from labeled nucleotides or nucleotide analogs that become incorporated are distinguished from those of freely diffusing labels by their longer retention in the observation volume.
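
    The discrimination step described above, in which incorporated nucleotides stay in the observation volume longer than freely diffusing ones, amounts to a duration threshold on fluorescence pulses. A minimal sketch follows. The threshold value, the pulse representation, and the function name are assumptions for illustration and do not come from the patent.

```python
def call_incorporations(pulses, min_duration_ms=5.0):
    """Keep only pulses long enough to be treated as incorporation events.

    pulses: list of (base_label, duration_ms) tuples in temporal order.
    min_duration_ms: hypothetical cut-off separating brief diffusion events
    from the longer retention of a nucleotide held at the active site.
    """
    return [label for label, duration in pulses if duration >= min_duration_ms]


# Hypothetical trace: two brief diffusion events and two long-lived pulses.
trace = [("A", 0.2), ("C", 12.0), ("G", 0.5), ("T", 30.0)]
called = call_incorporations(trace)
```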

  10. Amino acid sequence of bovine heart coupling factor 6.

    PubMed Central

    Fang, J K; Jacobs, J W; Kanner, B I; Racker, E; Bradshaw, R A

    1984-01-01

    The amino acid sequence of bovine heart mitochondrial coupling factor 6 (F6) has been determined by automated Edman degradation of the whole protein and derived peptides. Preparations based on heat precipitation and ethanol extraction showed allotypic variation at three positions while material further purified by HPLC yielded only one sequence that also differed by a Phe-Thr replacement at residue 62. The mature protein contains 76 amino acids with a calculated molecular weight of 9006 and a pI of approximately 5, in good agreement with experimentally measured values. The charged amino acids are mainly clustered at the termini and in one section in the middle; these three polar segments are separated by two segments relatively rich in nonpolar residues. Chou-Fasman analysis suggests three stretches of alpha-helix coinciding with (or within) the high-charge-density sequences with a single beta-turn at the first polar-nonpolar junction. Comparison of the F6 sequence with those of other proteins did not reveal any homologous structures. PMID:6149548

  11. 77 FR 65537 - Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-10-29

    ... Amino Acid Sequence Disclosures ACTION: Proposed collection; comment request. SUMMARY: The United States....'' SUPPLEMENTARY INFORMATION: I. Abstract Patent applications that contain nucleotide and/or amino acid...

  12. Mitochondrial sequence variation suggests an African influence in Portuguese cattle.

    PubMed Central

    Cymbron, T; Loftus, R T; Malheiro, M I; Bradley, D G

    1999-01-01

    A total of 49 samples from indigenous Portuguese cattle breeds were analysed for sequence variation in the hypervariable region of the mitochondrial DNA D-loop. Sequence comparison and phylogenetic analyses revealed that haplotypes fell into two distinct groups. These corresponded with two separate haplotype clusters into which, respectively, all African, or alternatively all sequences of European origin, have previously been shown to fall. Here, the majority of sequences of African type were encountered in three southern, as compared to three northern breeds. This pattern of African influence may reflect an intercontinental admixture in the initial origins of Iberian breeds, or it is perhaps an introgression dating from the long and influential Moorish occupation of the south of the Iberian peninsula. PMID:10212450

  13. Analyzing Neisseria gonorrhoeae Pilin Antigenic Variation Using 454 Sequencing Technology

    PubMed Central

    Rotman, Ella; Webber, David M.

    2016-01-01

    ABSTRACT Many pathogens use homologous recombination to vary surface antigens in order to avoid immune surveillance. Neisseria gonorrhoeae, the bacterium responsible for the sexually transmitted infection gonorrhea, achieves this in part by changing the sequence of the major subunit of the type IV pilus in a process termed pilin antigenic variation (Av). The N. gonorrhoeae chromosome contains one expression locus (pilE) and many promoterless, partial-coding silent copies (pilS) that act as reservoirs for variant pilin information. Pilin Av occurs by high-frequency gene conversion reactions, which transfer pilS sequences into the pilE locus. We have developed a 454 sequencing-based assay to analyze the frequency and characteristics of pilin Av that allows a more robust analysis of pilin Av than previous assays. We used this assay to analyze mutations and conditions previously shown to affect pilin Av, confirming many but not all of the previously reported phenotypes. We show that mutations or conditions that cause growth defects can result in Av phenotypes when analyzed by phase variation-based assays. Adapting the 454 sequencing to analyze pilin Av demonstrates the utility of this technology to analyze any diversity generation system that uses recombination to develop biological diversity. IMPORTANCE Measuring and analyzing complex recombination-based systems constitute a major barrier to understanding the mechanisms used to generate diversity. We have analyzed the contributions of many gonococcal mutations or conditions to the process of pilin antigenic variation. PMID:27381912

  14. DNA Sequence Analysis of SLC26A5, Encoding Prestin, in a Patient-Control Cohort: Identification of Fourteen Novel DNA Sequence Variations

    PubMed Central

    Minor, Jacob S.; Tang, Hsiao-Yuan; Pereira, Fred A.; Alford, Raye Lynn

    2009-01-01

    Background: Prestin, encoded by the gene SLC26A5, is a transmembrane protein of the cochlear outer hair cell (OHC). Prestin is required for the somatic electromotile activity of OHCs; in mice lacking prestin, this activity is absent and severe hearing impairment results. In humans, the role of sequence variations in SLC26A5 in hearing loss is less clear. Although prestin is expected to be required for functional human OHCs, the clinical significance of reported putative mutant alleles in humans is uncertain. Methodology/Principal Findings: To explore the hypothesis that SLC26A5 may act as a modifier gene, affecting the severity of hearing loss caused by an independent etiology, a patient-control cohort was screened for DNA sequence variations in SLC26A5 using sequencing and allele specific methods. Patients in this study carried known pathogenic or controversial sequence variations in GJB2, encoding Connexin 26, or confirmed or suspected sequence variations in SLC26A5; controls included four ethnic populations. Twenty-three different DNA sequence variations in SLC26A5, 14 of which are novel, were observed: 4 novel sequence variations were found exclusively among patients; 7 novel sequence variations were found exclusively among controls; and 12 sequence variations, 3 of which are novel, were found in both patients and controls. Twenty-one of the 23 DNA sequence variations were located in non-coding regions of SLC26A5. Two coding sequence variations, both novel, were observed only in patients and predict a silent change, p.S434S, and an amino acid substitution, p.I663V. In silico analysis of the p.I663V amino acid variation suggested this variant might be benign. Using Fisher's exact test, no statistically significant difference was observed between patients and controls in the frequency of the identified DNA sequence variations. Haplotype analysis using HaploView 4.0 software revealed the same predominant haplotype in patients and controls and derived haplotype blocks
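
    The patient-versus-control comparison above relies on Fisher's exact test. The sketch below shows how a two-sided test on a 2x2 table of carrier counts can be computed from the hypergeometric distribution using only the standard library. The counts in the example are hypothetical and are not taken from the study, and the function name is an assumption.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the table [[a, b], [c, d]].

    Rows could be patients and controls; columns could be carriers and
    non-carriers of a given sequence variation.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Hypergeometric probability of x carriers in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    low, high = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    return sum(
        prob(x) for x in range(low, high + 1) if prob(x) <= p_observed * (1 + 1e-9)
    )


# Hypothetical counts: 3 of 4 patients and 1 of 4 controls carry a variant.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```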

  15. Notes on individual sequence variation in humans: Immunoglobulin kappa light chain

    SciTech Connect

    Kurth, J.H.; Cavalli-Sforza, L.L.

    1994-06-01

    Little is known concerning the magnitude of variability in the nucleic acid sequence of DNA at the individual level. The authors have collected a large set of sequence data from the human immunoglobulin kappa light-chain-locus constant region (10,444 bp) and subgroup IV variable region (18,580 bp). For the constant region, absolute conservation of sequence was observed, even in intron and coding-region silent sites, with the exception of one previously defined polymorphic site. For the variable region, 12 heterozygous positions were identified, giving a heterozygosity of 6 × 10^-4 per nucleotide site. The amount of nucleic acid sequence variation differs significantly (χ² = 4.88) between these two regions, and the observed variation is two orders of magnitude lower than that reported for two Drosophila melanogaster loci. These data suggest that, for at least some regions of the human genome, nucleic acid sequence may be less variable than previously estimated. 13 refs., 2 figs.
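
    The heterozygosity figure quoted above follows directly from the two numbers in the abstract: 12 heterozygous positions across 18,580 bp of variable-region sequence. A short sketch of the arithmetic, with a hypothetical function name:

```python
def heterozygosity_per_site(heterozygous_positions, sequence_length_bp):
    """Heterozygous positions divided by the number of nucleotide sites surveyed."""
    return heterozygous_positions / sequence_length_bp


# Values from the abstract: 12 heterozygous positions in 18,580 bp.
variable_region = heterozygosity_per_site(12, 18580)
print(f"{variable_region:.1e} per nucleotide site")
```

    The result is about 6.5 × 10^-4, which the abstract reports to one significant figure.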

  16. Lysoplex: An efficient toolkit to detect DNA sequence variations in the autophagy-lysosomal pathway

    PubMed Central

    Di Fruscio, Giuseppina; Schulz, Angela; De Cegli, Rossella; Savarese, Marco; Mutarelli, Margherita; Parenti, Giancarlo; Banfi, Sandro; Braulke, Thomas; Nigro, Vincenzo; Ballabio, Andrea

    2015-01-01

    The autophagy-lysosomal pathway (ALP) regulates cell homeostasis and plays a crucial role in human diseases, such as lysosomal storage disorders (LSDs) and common neurodegenerative diseases. Therefore, the identification of DNA sequence variations in genes involved in this pathway and their association with human diseases would have a significant impact on health. To this aim, we developed Lysoplex, a targeted next-generation sequencing (NGS) approach, which allowed us to obtain a uniform and accurate coding sequence coverage of a comprehensive set of 891 genes involved in lysosomal, endocytic, and autophagic pathways. Lysoplex was successfully validated on 14 different types of LSDs and then used to analyze 48 mutation-unknown patients with a clinical phenotype of neuronal ceroid lipofuscinosis (NCL), a genetically heterogeneous subtype of LSD. Lysoplex allowed us to identify pathogenic mutations in 67% of patients, most of whom had been unsuccessfully analyzed by several sequencing approaches. In addition, in 3 patients, we found potential disease-causing variants in novel NCL candidate genes. We then compared the variant detection power of Lysoplex with data derived from public whole exome sequencing (WES) efforts. On average, a 50% higher number of validated amino acid changes and truncating variations per gene were identified. Overall, we identified 61 truncating sequence variations and 488 missense variations with a high probability to cause loss of function in a total of 316 genes. Interestingly, some loss-of-function variations of genes involved in the ALP pathway were found in homozygosity in the normal population, suggesting that their role is not essential. Thus, Lysoplex provided a comprehensive catalog of sequence variants in ALP genes and allows the assessment of their relevance in cell biology as well as their contribution to human disease. PMID:26075876

  17. Los Alamos sequence analysis package for nucleic acids and proteins.

    PubMed Central

    Kanehisa, M I

    1982-01-01

    An interactive system for computer analysis of nucleic acid and protein sequences has been developed for the Los Alamos DNA Sequence Database. It provides a convenient way to search or verify various sequence features, e.g., restriction enzyme sites, protein coding frames, and properties of coded proteins. Further, the comprehensive analysis package on a large-scale database can be used for comparative studies on sequence and structural homologies in order to find unnoted information stored in nucleic acid sequences. PMID:6174934
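
    One of the sequence features mentioned above, restriction enzyme sites, is simple to search for. The sketch below is a generic illustration and not code from the Los Alamos package: the function name and the example sequence are hypothetical, while GAATTC is the well-known EcoRI recognition site.

```python
def find_sites(sequence, recognition_site):
    """Return 0-based start positions of every occurrence, overlaps included."""
    positions = []
    start = sequence.find(recognition_site)
    while start != -1:
        positions.append(start)
        start = sequence.find(recognition_site, start + 1)
    return positions


ECORI = "GAATTC"
example = "AAGAATTCGGGAATTCTT"   # hypothetical sequence with two EcoRI sites
sites = find_sites(example, ECORI)
```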

  18. Forward Genetics by Sequencing EMS Variation-Induced Inbred Lines

    PubMed Central

    Addo-Quaye, Charles; Buescher, Elizabeth; Best, Norman; Chaikam, Vijay; Baxter, Ivan; Dilkes, Brian P.

    2016-01-01

    In order to leverage novel sequencing techniques for cloning genes in eukaryotic organisms with complex genomes, the false positive rate of variant discovery must be controlled for by experimental design and informatics. We sequenced five lines from three pedigrees of ethyl methanesulfonate (EMS)-mutagenized Sorghum bicolor, including a pedigree segregating a recessive dwarf mutant. Comparing the sequences of the lines, we were able to identify and eliminate error-prone positions. One genomic region contained EMS mutant alleles in dwarfs that were homozygous reference sequences in wild-type siblings and heterozygous in segregating families. This region contained a single nonsynonymous change that cosegregated with dwarfism in a validation population and caused a premature stop codon in the Sorghum ortholog encoding the gibberellic acid (GA) biosynthetic enzyme ent-kaurene oxidase. Application of exogenous GA rescued the mutant phenotype. Our method for mapping did not require outcrossing and introduced no segregation variance. This enables work when line crossing is complicated by life history, permitting gene discovery outside of genetic models. This inverts the historical approach of first using recombination to define a locus and then sequencing genes. Our formally identical approach first sequences all the genes and then seeks cosegregation with the trait. Mutagenized lines lacking obvious phenotypic alterations are available for an extension of this approach: mapping with a known marker set in a line that is phenotypically identical to starting material for EMS mutant generation. PMID:28040779
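
    The cosegregation logic above (mutant allele homozygous in dwarfs, homozygous reference in wild-type siblings) can be expressed as a filter over genotype calls. This is a minimal sketch, not the authors' pipeline. The record layout, sample names, and function name are hypothetical. The optional G-to-A / C-to-T filter reflects the mutation type that EMS predominantly induces and is an added assumption rather than a step described in the abstract.

```python
# EMS predominantly induces G:C to A:T transitions (assumed filter, see lead-in).
EMS_TRANSITIONS = {("G", "A"), ("C", "T")}


def candidate_variants(variants, dwarf_samples, wild_type_samples, require_ems_type=True):
    """Return positions homozygous-alternate in all dwarfs and homozygous-reference
    in all wild-type siblings. Genotypes use VCF-style strings such as '0/0'."""
    hits = []
    for variant in variants:
        genotypes = variant["genotypes"]
        if not all(genotypes[s] == "1/1" for s in dwarf_samples):
            continue
        if not all(genotypes[s] == "0/0" for s in wild_type_samples):
            continue
        if require_ems_type and (variant["ref"], variant["alt"]) not in EMS_TRANSITIONS:
            continue
        hits.append(variant["pos"])
    return hits


# Hypothetical genotype calls for two dwarf plants and one wild-type sibling.
calls = [
    {"pos": 101, "ref": "G", "alt": "A",
     "genotypes": {"dwarf1": "1/1", "dwarf2": "1/1", "wt1": "0/0"}},
    {"pos": 202, "ref": "C", "alt": "T",
     "genotypes": {"dwarf1": "1/1", "dwarf2": "0/1", "wt1": "0/0"}},
    {"pos": 303, "ref": "A", "alt": "C",
     "genotypes": {"dwarf1": "1/1", "dwarf2": "1/1", "wt1": "0/0"}},
]
candidates = candidate_variants(calls, ["dwarf1", "dwarf2"], ["wt1"])
```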

  19. Sequence variation of 22 autosomal STR loci detected by next generation sequencing.

    PubMed

    Gettings, Katherine Butler; Kiesler, Kevin M; Faith, Seth A; Montano, Elizabeth; Baker, Christine H; Young, Brian A; Guerrieri, Richard A; Vallone, Peter M

    2016-03-01

    Sequencing short tandem repeat (STR) loci allows for determination of repeat motif variations within the STR (or entire PCR amplicon) which cannot be ascertained by size-based PCR fragment analysis. Sanger sequencing has been used in research laboratories to further characterize STR loci, but is impractical for routine forensic use due to the laborious nature of the procedure in general and additional steps required to separate heterozygous alleles. Recent advances in library preparation methods enable high-throughput next generation sequencing (NGS) and technological improvements in sequencing chemistries now offer sufficient read lengths to encompass STR alleles. Herein, we present sequencing results from 183 DNA samples, including African American, Caucasian, and Hispanic individuals, at 22 autosomal forensic STR loci using an assay designed for NGS. The resulting dataset has been used to perform population genetic analyses of allelic diversity by length compared to sequence, and exemplifies which loci are likely to achieve the greatest gains in discrimination via sequencing. Within this data set, six loci demonstrate greater than double the number of alleles obtained by sequence compared to the number of alleles obtained by length: D12S391, D2S1338, D21S11, D8S1179, vWA, and D3S1358. As expected, repeat region sequences which had not previously been reported in forensic literature were identified.
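
    The gain in discrimination described above comes from alleles that share a length but differ in sequence. The sketch below counts distinct alleles both ways. The repeat sequences are invented for illustration and do not represent any real locus, and the function name is hypothetical.

```python
def allele_counts(alleles):
    """Return (distinct alleles by length, distinct alleles by sequence)."""
    by_length = {len(allele) for allele in alleles}
    by_sequence = set(alleles)
    return len(by_length), len(by_sequence)


# Hypothetical alleles: the first two have equal length but different sequence,
# so size-based fragment analysis would not separate them.
observed = [
    "TCTA" * 10,
    "TCTA" * 9 + "TCTG",
    "TCTA" * 11,
]
n_by_length, n_by_sequence = allele_counts(observed)
```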

  20. Sequence variation of 22 autosomal STR loci detected by next generation sequencing

    PubMed Central

    Gettings, Katherine Butler; Kiesler, Kevin M.; Faith, Seth A.; Montano, Elizabeth; Baker, Christine H.; Young, Brian A.; Guerrieri, Richard A.; Vallone, Peter M.

    2016-01-01

    Sequencing short tandem repeat (STR) loci allows for determination of repeat motif variations within the STR (or entire PCR amplicon) which cannot be ascertained by size-based PCR fragment analysis. Sanger sequencing has been used in research laboratories to further characterize STR loci, but is impractical for routine forensic use due to the laborious nature of the procedure in general and additional steps required to separate heterozygous alleles. Recent advances in library preparation methods enable high-throughput next generation sequencing (NGS) and technological improvements in sequencing chemistries now offer sufficient read lengths to encompass STR alleles. Herein, we present sequencing results from 183 DNA samples, including African American, Caucasian, and Hispanic individuals, at 22 autosomal forensic STR loci using an assay designed for NGS. The resulting dataset has been used to perform population genetic analyses of allelic diversity by length compared to sequence, and exemplifies which loci are likely to achieve the greatest gains in discrimination via sequencing. Within this data set, six loci demonstrate greater than double the number of alleles obtained by sequence compared to the number of alleles obtained by length: D12S391, D2S1338, D21S11, D8S1179, vWA, and D3S1358. As expected, repeat region sequences which had not previously been reported in forensic literature were identified. PMID:26701720

  1. Hybridization and sequencing of nucleic acids using base pair mismatches

    DOEpatents

    Fodor, Stephen P. A.; Lipshutz, Robert J.; Huang, Xiaohua

    2001-01-01

    Devices and techniques for hybridization of nucleic acids and for determining the sequence of nucleic acids. Arrays of nucleic acids are formed by techniques, preferably high resolution, light-directed techniques. Positions of hybridization of a target nucleic acid are determined by, e.g., epifluorescence microscopy. Devices and techniques are proposed to determine the sequence of a target nucleic acid more efficiently and more quickly through such synthesis and detection techniques.

  2. STR allele sequence variation: Current knowledge and future issues.

    PubMed

    Gettings, Katherine Butler; Aponte, Rachel A; Vallone, Peter M; Butler, John M

    2015-09-01

    This article reviews what is currently known about short tandem repeat (STR) allelic sequence variation in and around the twenty-four loci most commonly used throughout the world to perform forensic DNA investigations. These STR loci include D1S1656, TPOX, D2S441, D2S1338, D3S1358, FGA, CSF1PO, D5S818, SE33, D6S1043, D7S820, D8S1179, D10S1248, TH01, vWA, D12S391, D13S317, Penta E, D16S539, D18S51, D19S433, D21S11, Penta D, and D22S1045. All known reported variant alleles are compiled along with genomic information available from GenBank, dbSNP, and the 1000 Genomes Project. Supplementary files are included which provide annotated reference sequences for each STR locus, characterize genomic variation around the STR repeat region, and compare alleles present in currently available STR kit allelic ladders. Looking to the future, STR allele nomenclature options are discussed as they relate to next generation sequencing efforts underway.

  3. Phase variable DNA repeats in Neisseria gonorrhoeae influence transcription, translation, and protein sequence variation

    PubMed Central

    Zelewska, Marta A.; Pulijala, Madhuri; Spencer-Smith, Russell; Mahmood, Hiba-Tun-Noor A.; Norman, Billie; Churchward, Colin P.; Calder, Alan

    2016-01-01

    There are many types of repeated DNA sequences in the genomes of the species of the genus Neisseria, from homopolymeric tracts to tandem repeats of hundreds of bases. Some of these have roles in the phase-variable expression of genes. When a repeat mediates phase variation, reversible switching between tract lengths occurs, which in the species of the genus Neisseria most often causes the gene to switch between on and off states through frame shifting of the open reading frame. Changes in repeat tract lengths may also influence the strength of transcription from a promoter. For phenotypes that can be readily observed, such as expression of the surface-expressed Opa proteins or pili, verification that repeats are mediating phase variation is relatively straightforward. For other genes, particularly those where the function has not been identified, gathering evidence of repeat tract changes can be more difficult. Here we present analysis of the repetitive sequences that could mediate phase variation in the Neisseria gonorrhoeae strain NCCP11945 genome sequence and compare these results with other gonococcal genome sequences. Evidence is presented for an updated phase-variable gene repertoire in this species, including a class of phase variation that causes amino acid changes at the C-terminus of the protein, not previously described in N. gonorrhoeae. PMID:28348872
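
    The on/off switching described above follows from reading-frame arithmetic: a change in tract length that is not a multiple of three shifts the open reading frame. A minimal sketch for a homopolymeric tract follows. The function name and the tract lengths are hypothetical.

```python
def is_in_frame(tract_length, in_frame_length):
    """True if the tract length keeps the downstream codons in frame.

    in_frame_length is a tract length known (or assumed) to give the 'on' state.
    Any length differing from it by a multiple of 3 preserves the frame.
    """
    return (tract_length - in_frame_length) % 3 == 0


# Hypothetical gene that is 'on' with a 9-base tract.
states = {length: ("on" if is_in_frame(length, 9) else "off") for length in range(7, 13)}
```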

  4. Sequence variation in the androgen receptor gene is not a common determinant of male sexual orientation

    SciTech Connect

    Macke, J.P.; Nathans, J.; King, V.L.; Hu, N.; Hu, S.; Hamer, D.; Bailey, M.; Brown, T.

    1993-10-01

    To test the hypothesis that DNA sequence variation in the androgen receptor gene plays a causal role in the development of male sexual orientation, the authors have (1) measured the degree of concordance of androgen receptor alleles in 36 pairs of homosexual brothers, (2) compared the lengths of polyglutamine and polyglycine tracts in the amino-terminal domain of the androgen receptor in a sample of 197 homosexual males and 213 unselected subjects, and (3) screened the entire androgen receptor coding region for sequence variation by PCR and denaturing gradient-gel electrophoresis (DGGE) and/or single-strand conformation polymorphism analysis in 20 homosexual males with homosexual or bisexual brothers and one homosexual male with no homosexual brothers, and screened the amino-terminal domain of the receptor for sequence variation in an additional 44 homosexual males, 37 of whom had one or more first- or second-degree male relatives who were either homosexual or bisexual. These analyses show that (1) homosexual brothers are as likely to be discordant as concordant for androgen receptor alleles; (2) there are no large-scale differences between the distributions of polyglycine or polyglutamine tract lengths in the homosexual and control groups; and (3) coding region sequence variation is not commonly found within the androgen receptor gene of homosexual men. The DGGE screen identified two rare amino acid substitutions, Ser205-to-Arg and Glu793-to-Asp, the biological significance of which is unknown. 32 refs., 2 figs., 2 tabs.

  5. Comparative RNA sequencing reveals substantial genetic variation in endangered primates.

    PubMed

    Perry, George H; Melsted, Páll; Marioni, John C; Wang, Ying; Bainer, Russell; Pickrell, Joseph K; Michelini, Katelyn; Zehr, Sarah; Yoder, Anne D; Stephens, Matthew; Pritchard, Jonathan K; Gilad, Yoav

    2012-04-01

    Comparative genomic studies in primates have yielded important insights into the evolutionary forces that shape genetic diversity and revealed the likely genetic basis for certain species-specific adaptations. To date, however, these studies have focused on only a small number of species. For the majority of nonhuman primates, including some of the most critically endangered, genome-level data are not yet available. In this study, we have taken the first steps toward addressing this gap by sequencing RNA from the livers of multiple individuals from each of 16 mammalian species, including humans and 11 nonhuman primates. Of the nonhuman primate species, five are lemurs and two are lorisoids, for which little or no genomic data were previously available. To analyze these data, we developed a method for de novo assembly and alignment of orthologous gene sequences across species. We assembled an average of 5721 gene sequences per species and characterized diversity and divergence of both gene sequences and gene expression levels. We identified patterns of variation that are consistent with the action of positive or directional selection, including an 18-fold enrichment of peroxisomal genes among genes whose regulation likely evolved under directional selection in the ancestral primate lineage. Importantly, we found no relationship between genetic diversity and endangered status, with the two most endangered species in our study, the black and white ruffed lemur and the Coquerel's sifaka, having the highest genetic diversity among all primates. Our observations imply that many endangered lemur populations still harbor considerable genetic variation. Timely efforts to conserve these species alongside their habitats have, therefore, strong potential to achieve long-term success.

  6. Sequence variation in the Tbx4 gene in marine mammals.

    PubMed

    Onbe, Kaori; Nishida, Shin; Sone, Emi; Kanda, Naohisa; Goto, Mutsuo; Pastene, Luis A; Tanabe, Shinsuke; Koike, Hiroko

    2007-05-01

    The amino-acid sequences of the T-domain region of the Tbx4 gene, which is required for hindlimb development, are 100% identical in humans and mice. Cetaceans have lost most of their hindlimb structure, although hindlimb buds are present in very early cetacean embryos. To examine whether the Tbx4 gene has the same function in cetaceans as in other mammals, we analyzed Tbx4 sequences from cetaceans, dugong, artiodactyls and marine carnivores. A total of 39 primers were designed using human and dog Tbx4 nucleotide sequences. Exons 3, 4, 5, 6, 7, and 8 of the Tbx4 genes from cetaceans, artiodactyls, and marine carnivores were sequenced. Non-synonymous substitution sites were detected in the T-domain regions from some cetacean species, but were not detected in those from artiodactyls, the dugong, or the carnivores. The C-terminal regions contained a number of non-synonymous substitutions. Although some indels were present, they were in groups of three nucleotides and therefore did not cause frame shifts. The dN/dS values for the T-domain and C-terminal regions of the cetacean and artiodactylous Tbx4 genes were much lower than 1, indicating that the Tbx4 gene maintains its function in cetaceans, although full expression leading to hindlimb development is suppressed.
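
    The inference above rests on the conventional reading of the dN/dS ratio: values below 1 point to purifying selection, values above 1 to positive selection. The sketch below applies that convention. The function name and the rates in the example are hypothetical, and estimating dN and dS from an alignment is a separate step that is not shown.

```python
def classify_selection(dn, ds):
    """Return (dN/dS, label) using the conventional thresholds around 1."""
    if ds <= 0:
        raise ValueError("dS must be positive to form the ratio")
    omega = dn / ds
    if omega < 1:
        return omega, "purifying"
    if omega > 1:
        return omega, "positive"
    return omega, "neutral"


# Hypothetical rates: far fewer non-synonymous than synonymous substitutions per site.
omega, label = classify_selection(dn=0.02, ds=0.2)
```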

  7. Unraveling genomic variation from next generation sequencing data

    PubMed Central

    2013-01-01

    Elucidating the content of a DNA sequence is critical to understanding and decoding the genetic information of any biological system. As next generation sequencing (NGS) techniques have become cheaper and more advanced in throughput over time, great innovations and breakthrough conclusions have been generated in various biological areas. A few of the areas being shaped by these technological advances are the evolution of species, microbial mapping, population genetics, genome-wide association studies (GWAs), comparative genomics, variant analysis, gene expression, gene regulation, epigenetics and personalized medicine. While NGS techniques stand as key players in modern biological research, the analysis and interpretation of the vast amount of data produced is not an easy or trivial task and remains a great challenge in the field of bioinformatics. Therefore, efficient tools to cope with information overload, tackle the high complexity and provide meaningful visualizations to make knowledge extraction easier are essential. In this article, we briefly refer to the sequencing methodologies and the available equipment to serve these analyses and we describe the data formats of the files they produce. We conclude with a thorough review of tools developed to efficiently store, analyze and visualize such data, with emphasis on structural variation analysis and comparative genomics. We finally comment on their functionality, strengths and weaknesses and we discuss how future applications could further develop in this field. PMID:23885890

  8. Sequence variation in the Mc1r gene for a group of polymorphic snakes.

    PubMed

    Cox, Christian L; Rabosky, Alison R Davis; Chippindale, Paul T

    2013-01-25

    Studying the genetic factors underlying phenotypic traits can provide insight into dynamics of selection and molecular basis of adaptation, but this goal can be difficult for non-model organisms without extensive genomic resources. However, sequencing candidate genes for the trait of interest can facilitate the study of evolutionary genetics in natural populations. We sequenced the melanocortin-1 receptor (Mc1r) to study the genetic basis of color polymorphism in a group of snake species with variable black banding, the genera Sonora, Chilomeniscus, and Chionactis. Mc1r is an important gene in the melanin synthesis pathway and is associated with ecologically important variation in color pattern in birds, mammals, and other squamate reptiles. We found that the Mc1r nucleotide sequence was variable and that, within our focal Sonora species, there are both fixed and heterozygous nucleotide substitutions that result in an amino acid change. Selection analyses indicated that the Mc1r sequence was likely under purifying selection. However, we did not detect any statistical association with the presence or absence of black bands. Our results agree with other studies that have found no role for sequence variation in Mc1r and highlight the importance of comparative data for studying the phenotypic associations of candidate genes.

  9. Methods and compositions for efficient nucleic acid sequencing

    DOEpatents

    Drmanac, Radoje

    2006-07-04

    Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.

  10. Methods and compositions for efficient nucleic acid sequencing

    DOEpatents

    Drmanac, Radoje

    2002-01-01

    Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.

  11. Kit for detecting nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    2001-01-01

    A kit is provided for detecting a target nucleic acid sequence in a sample, the kit comprising: a first hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the first hybridization probe including a first complexing agent for forming a binding pair with a second complexing agent; and a second hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the first hybridization probe does not selectively hybridize, the second hybridization probe including a detectable marker; a third hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the third hybridization probe including the same detectable marker as the second hybridization probe; and a fourth hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the third hybridization probe does not selectively hybridize, the fourth hybridization probe including the first complexing agent for forming a binding pair with the second complexing agent; wherein the first and second hybridization probes are capable of simultaneously hybridizing to the target sequence and the third and fourth hybridization probes are capable of simultaneously hybridizing to the target sequence, the detectable marker is not present on the first or fourth hybridization probes and the first, second, third, and fourth hybridization probes each include a competitive nucleic acid sequence which is sufficiently complementary to a third portion of the target sequence that the competitive sequences of the first, second, third, and fourth hybridization probes compete with each other to hybridize to the third portion of 
the

  12. Analysis and Annotation of Nucleic Acid Sequence

    SciTech Connect

    States, David J.

    2004-07-28

    The aims of this project were to develop improved methods for computational genome annotation and to apply these methods to improve the annotation of genomic sequence data with a specific focus on human genome sequencing. The project resulted in a substantial body of published work. Notable contributions of this project were the identification of basecalling and lane tracking as error processes in genome sequencing and contributions to improved methods for these steps in genome sequencing. This technology improved the accuracy and throughput of genome sequence analysis. Probabilistic methods for physical map construction were developed. Improved methods for sequence alignment, alternative splicing analysis, promoter identification and NF kappa B response gene prediction were also developed.

  13. Mapping copy number variation by population-scale genome sequencing.

    PubMed

    Mills, Ryan E; Walter, Klaudia; Stewart, Chip; Handsaker, Robert E; Chen, Ken; Alkan, Can; Abyzov, Alexej; Yoon, Seungtai Chris; Ye, Kai; Cheetham, R Keira; Chinwalla, Asif; Conrad, Donald F; Fu, Yutao; Grubert, Fabian; Hajirasouliha, Iman; Hormozdiari, Fereydoun; Iakoucheva, Lilia M; Iqbal, Zamin; Kang, Shuli; Kidd, Jeffrey M; Konkel, Miriam K; Korn, Joshua; Khurana, Ekta; Kural, Deniz; Lam, Hugo Y K; Leng, Jing; Li, Ruiqiang; Li, Yingrui; Lin, Chang-Yun; Luo, Ruibang; Mu, Xinmeng Jasmine; Nemesh, James; Peckham, Heather E; Rausch, Tobias; Scally, Aylwyn; Shi, Xinghua; Stromberg, Michael P; Stütz, Adrian M; Urban, Alexander Eckehart; Walker, Jerilyn A; Wu, Jiantao; Zhang, Yujun; Zhang, Zhengdong D; Batzer, Mark A; Ding, Li; Marth, Gabor T; McVean, Gil; Sebat, Jonathan; Snyder, Michael; Wang, Jun; Ye, Kenny; Eichler, Evan E; Gerstein, Mark B; Hurles, Matthew E; Lee, Charles; McCarroll, Steven A; Korbel, Jan O

    2011-02-03

    Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is, copy number variants) based on whole genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications. Most SVs (53%) were mapped to nucleotide resolution, which facilitated analysing their origin and functional impact. We examined numerous whole and partial gene deletions with a genotyping approach and observed a depletion of gene disruptions amongst high frequency deletions. Furthermore, we observed differences in the size spectra of SVs originating from distinct formation mechanisms, and constructed a map of SV hotspots formed by common mechanisms. Our analytical framework and SV map serve as a resource for sequencing-based association studies.

  14. Solid phase sequencing of double-stranded nucleic acids

    DOEpatents

    Fu, Dong-Jing; Cantor, Charles R.; Koster, Hubert; Smith, Cassandra L.

    2002-01-01

    This invention relates to methods for detecting and sequencing of target double-stranded nucleic acid sequences, to nucleic acid probes and arrays of probes useful in these methods, and to kits and systems which contain these probes. Useful methods involve hybridizing the nucleic acids or nucleic acids which represent complementary or homologous sequences of the target to an array of nucleic acid probes. These probes comprise a single-stranded portion, an optional double-stranded portion and a variable sequence within the single-stranded portion. The molecular weights of the hybridized nucleic acids of the set can be determined by mass spectroscopy, and the sequence of the target determined from the molecular weights of the fragments. Nucleic acids whose sequences can be determined include nucleic acids in biological samples such as patient biopsies and environmental samples. Probes may be fixed to a solid support such as a hybridization chip to facilitate automated determination of molecular weights and identification of the target sequence.

  15. Dipeptide Sequence Determination: Analyzing Phenylthiohydantoin Amino Acids by HPLC

    NASA Astrophysics Data System (ADS)

    Barton, Janice S.; Tang, Chung-Fei; Reed, Steven S.

    2000-02-01

    Amino acid composition and sequence determination, important techniques for characterizing peptides and proteins, are essential for predicting conformation and studying sequence alignment. This experiment presents improved, fundamental methods of sequence analysis for an upper-division biochemistry laboratory. Working in pairs, students use the Edman reagent to prepare phenylthiohydantoin derivatives of amino acids for determination of the sequence of an unknown dipeptide. With a single HPLC technique, students identify both the N-terminal amino acid and the composition of the dipeptide. This method yields good precision of retention times and allows use of a broad range of amino acids as components of the dipeptide. Students learn fundamental principles and techniques of sequence analysis and HPLC.

  16. Simple sequence repeat variations expedite phage divergence: Mechanisms of indels and gene mutations.

    PubMed

    Lin, Tiao-Yin

    2016-07-01

    Phages are the most abundant biological entities and influence prokaryotic communities on Earth. Comparing closely related genomes sheds light on molecular events shaping phage evolution. Simple sequence repeat (SSR) variations impart over half of the genomic changes between T7M and T3, indicating an important role of SSRs in accelerating phage genetic divergence. Differences in coding and noncoding regions of phages infecting different hosts, coliphages T7M and T3, Yersinia phage ϕYeO3-12, and Salmonella phage ϕSG-JL2, frequently arise from SSR variations. Such variations modify noncoding and coding regions; the latter efficiently changes multiple amino acids, thereby hastening protein evolution. Four classes of events are found to drive SSR variations: insertion/deletion of SSR units, expansion/contraction of SSRs without alteration of genome length, changes of repeat motifs, and generation/loss of repeats. The categorization demonstrates the ways SSRs mutate in genomes during phage evolution. Indels are common constituents of genome variations and human diseases, yet, how they occur without preexisting repeat sequence is less understood. Non-repeat-unit-based misalignment-elongation (NRUBME) is proposed to be one mechanism for indels without adjacent repeats. NRUBME or consecutive NRUBME may also change repeat motifs or generate new repeats. NRUBME invoking a non-Watson-Crick base pair explains insertions that initiate mononucleotide repeats. Furthermore, NRUBME successfully interprets many inexplicable human di- to tetranucleotide repeat generations. This study provides the first evidence of SSR variations expediting phage divergence, and enables insights into the events and mechanisms of genome evolution. NRUBME allows us to emulate natural evolution to design indels for various applications.
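
    To make the notion of a simple sequence repeat concrete, the sketch below locates tandem repeats of short motifs in a DNA string with a regular expression. It is an illustration only, not the method used in the study: the function name, the thresholds (motifs of 1 to 4 nucleotides, at least 3 tandem copies) and the test sequence are all assumptions chosen for the example.

```python
import re


def find_ssrs(seq, max_unit=4, min_repeats=3):
    """Return (start, unit, repeat_count) for each tandem repeat found in seq.

    A lazy quantifier makes the regex prefer the shortest repeat unit, so
    'AAAAAA' is reported as six copies of 'A' rather than three of 'AA'.
    """
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_repeats - 1))
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        # The whole match is made of back-to-back copies of the unit.
        repeat_count = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, repeat_count))
    return hits


# Hypothetical sequence: four copies of the dinucleotide 'AT' starting at index 3.
print(find_ssrs("GGCATATATATCC"))
```

    Comparing such hit lists between two aligned genomes is one simple way to spot the unit insertions, deletions and motif changes that the abstract classifies.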

  17. Amino acid sequence of mouse submaxillary gland renin.

    PubMed Central

    Misono, K S; Chang, J J; Inagami, T

    1982-01-01

    The complete amino acid sequences of the heavy chain and light chain of mouse submaxillary gland renin have been determined. The heavy chain consists of 288 amino acid residues having a Mr of 31,036 calculated from the sequence. The light chain contains 48 amino acid residues with a Mr of 5,458. The sequence of the heavy chain was determined by automated Edman degradations of the cyanogen bromide peptides and tryptic peptides generated after citraconylation, as well as other peptides generated therefrom. The sequence of the light chain was derived from sequence analyses of the peptides generated by cyanogen bromide cleavage or by digestion with Staphylococcus aureus protease. The sequences in the active site regions in renin containing two catalytically essential aspartyl residues 32 and 215 were found identical with those in pepsin, chymosin, and penicillopepsin. Comparison of the amino acid sequence of renin with that of porcine pepsin indicated a 42% sequence identity of the heavy chain with the amino-terminal and middle regions and a 46% identity of the light chain with the carboxyl-terminal region of the porcine pepsin sequence. Residues identical in renin and pepsin are distributed throughout the length of the molecules, suggesting a similarity in their overall structures. PMID:6812055

  18. Geographical distribution and temporal variation of rain acidity over China

    SciTech Connect

    Wen-Xing Wang; Yan-Bo Pang; Guo-An Ding

    1996-12-31

    In the recent decade, large areas of acid rain have appeared in China. With the emission of SO2 and NOx increasing year by year, the acidity of precipitation has increased, and the acid rain area is expanding. Presently, the acid rain area in China has become the third largest in the world, after Europe and North America. The Chinese government took action against acid rain and planned a five-year National Acid Deposition Research Project. The space-time distribution and variation of rain acidity described in this paper is a part of this project. China is a large country, with an area almost equal to that of Europe. Its climate varies greatly and spans the tropical, subtropical, temperate and frigid zones. The topography is varied, including mountains, hilly country, desert and plain, and the distribution of anthropogenic sources is uneven. These human and natural factors cause the chemical composition, and therefore the acidity, of precipitation to differ among parts of China. The acidity of precipitation is the most important parameter in acid rain research. In order to obtain a regionally representative distribution of rain acidity, the National Acidic Deposition Research Monitoring Network, with 261 monitoring sites, was established in 1992. This paper summarizes the rain acidity of 21355 precipitation samples and gives the annual, seasonal, and monthly pH contours. Results show that the acid rain area has expanded from the south during winter. Regional differences in monthly acid precipitation exist: generally, the rain acidity level is higher during summer and fall and lower during winter and spring in the northern provinces. The opposite is the case in the southern provinces. The central areas are in a transitional situation. The geographical distribution and temporal variation of rain acidity are quite different from those of North America and Europe.

  19. Variation in amino acid and lipid composition of latent fingerprints.

    PubMed

    Croxton, Ruth S; Baron, Mark G; Butler, David; Kent, Terry; Sears, Vaughn G

    2010-06-15

    The enhancement of latent fingerprints, both at the crime scene and in the laboratory using an array of chemical, physical and optical techniques, permits their use for identification. Despite the plethora of techniques available, there are occasions when latent fingerprints are not successfully enhanced. An understanding of latent fingerprint chemistry and behaviour will aid the improvement of current techniques and the development of novel ones. In this study the amino acid and fatty acid content of 'real' latent fingerprints collected on a non-porous surface was analysed by gas chromatography-mass spectrometry. Squalene was also quantified. Hexadecanoic acid, octadecanoic acid and cis-9-octadecenoic acid were the most abundant fatty acids in all samples. There was, however, wide variation in the relative amounts of each fatty acid in each sample. It was clearly demonstrated that touching sebum-rich areas of the face immediately prior to fingerprint deposition resulted in a significant increase in the amount of fatty acids and squalene deposited in the resulting 'groomed' fingerprints. Serine was the most abundant amino acid identified followed by glycine, alanine and aspartic acid. The significant quantitative differences between the 'natural' and 'groomed' fingerprint samples seen for fatty acids were not observed in the case of the amino acids. This study demonstrates the variation in latent fingerprint composition between individuals and the impact of the sampling protocol on the quantitative analysis of fingerprints.

  20. Analysis of sequence variation in Gnathostoma spinigerum mitochondrial DNA by single-strand conformation polymorphism analysis and DNA sequence.

    PubMed

    Ngarmamonpirat, Charinthon; Waikagul, Jitra; Petmitr, Songsak; Dekumyoy, Paron; Rojekittikhun, Wichit; Anantapruti, Malinee T

    2005-03-01

    Morphological variations were observed in the advanced third-stage larvae of Gnathostoma spinigerum collected from swamp eel (Fluta alba), the second intermediate host. Larvae of the typical type and three atypical types were chosen for partial cytochrome c oxidase subunit I (COI) gene sequence analysis. A 450 bp polymerase chain reaction product of the COI gene was amplified from mitochondrial DNA. The variations were analyzed by single-strand conformation polymorphism and DNA sequencing. The nucleotide variations of the COI gene in the four types of larvae indicated the presence of an intra-specific variation of mitochondrial DNA in the G. spinigerum population.

  1. Amino Acid Sequence of Human Cholinesterase

    DTIC Science & Technology

    1985-10-01

    liquid chromatography (HPLC). Activity testing of the aged, DFP-labeled cholinesterase showed that 99.8% of the active sites had been labeled, since...acids were quantitated by ninhydrin at the AAA Labs, or by derivatization with phenylisothiocyanate at the University of Michigan. The latter method

  2. Nucleotide sequence variation of the envelope protein gene identifies two distinct genotypes of yellow fever virus.

    PubMed Central

    Chang, G J; Cropp, B C; Kinney, R M; Trent, D W; Gubler, D J

    1995-01-01

    The evolution of yellow fever virus over 67 years was investigated by comparing the nucleotide sequences of the envelope (E) protein genes of 20 viruses isolated in Africa, the Caribbean, and South America. Uniformly weighted parsimony algorithm analysis defined two major evolutionary yellow fever virus lineages designated E genotypes I and II. E genotype I contained viruses isolated from East and Central Africa. E genotype II viruses were divided into two sublineages: IIA viruses from West Africa and IIB viruses from America, except for a 1979 virus isolated from Trinidad (TRINID79A). Unique signature patterns were identified at 111 nucleotide and 12 amino acid positions within the yellow fever virus E gene by signature pattern analysis. Yellow fever viruses from East and Central Africa contained unique signatures at 60 nucleotide and five amino acid positions, those from West Africa contained unique signatures at 25 nucleotide and two amino acid positions, and viruses from America contained such signatures at 30 nucleotide and five amino acid positions in the E gene. The dissemination of yellow fever viruses from Africa to the Americas is supported by the close genetic relatedness of genotype IIA and IIB viruses and genetic evidence of a possible second introduction of yellow fever virus from West Africa, as illustrated by the TRINID79A virus isolate. The E protein genes of American IIB yellow fever viruses had higher frequencies of amino acid substitutions than did genes of yellow fever viruses of genotypes I and IIA on the basis of comparisons with a consensus amino acid sequence for the yellow fever E gene. The great variation in the E proteins of American yellow fever virus probably results from positive selection imposed by virus interaction with different species of mosquitoes or nonhuman primates in the Americas. PMID:7637022

  3. Amino acid sequences of two nonspecific lipid-transfer proteins from germinated castor bean.

    PubMed

    Takishima, K; Watanabe, S; Yamada, M; Suga, T; Mamiya, G

    1988-11-01

    The amino acid sequences of two nonspecific lipid-transfer proteins (nsLTP), B and C, from germinated castor bean seeds have been determined. Both proteins consist of 92 residues, as for the nsLTP previously reported, and their calculated Mr values are 9847 and 9593 for nsLTP-B and nsLTP-C, respectively. The sequences of nsLTP-B and nsLTP-C, compared to the known sequence of nsLTP-A from the same source, are 68% and 35% similar, respectively. No variation was found at the positions of the cysteine residues, indicating that they might be involved in disulfide bridges.

  4. Cystatin. Amino acid sequence and possible secondary structure.

    PubMed Central

    Schwabe, C; Anastasi, A; Crow, H; McDonald, J K; Barrett, A J

    1984-01-01

    The amino acid sequence of cystatin, the protein from chicken egg-white that is a tight-binding inhibitor of many cysteine proteinases, is reported. Cystatin is composed of 116 amino acid residues, and the Mr is calculated to be 13 143. No striking similarity to any other known sequence has been detected. The results of computer analysis of the sequence and c.d. spectrometry indicate that the secondary structure includes relatively little alpha-helix (about 20%) and that the remainder is mainly beta-structure. PMID:6712597

  5. Geochemical variations during the 2012 Emilia seismic sequence

    NASA Astrophysics Data System (ADS)

    Sciarra, Alessandra; Cantucci, Barbara; Galli, Gianfranco; Cinti, Daniele; Pizzino, Luca

    2015-04-01

    Several geochemical surveys (soil gas and shallow water) were performed in the Modena province (Massa Finalese, Finale Emilia, Medolla and S. Felice sul Panaro) during the 2006-2014 period. In May-June 2012, a seismic sequence (main shocks of ML 5.9 and 5.8) occurred close to the investigated area. In this area, 300 CO2 and CH4 flux measurements, 150 soil gas concentration measurements (He, H2, CO2, CH4 and C2H6), and 30 shallow water samples with their isotopic analyses (δ13C-CH4, δD-CH4 and δ13C-CO2) were performed in April-May 2006 and in October and December 2008, and repeated in May and September 2012, June 2013 and July 2014, after the 2012 Emilia seismic sequence. The chemical composition of the soil gas is dominated by CH4 in the southern part and by CO2 in the northern part. Very anomalous fluxes and concentrations are recorded in spot areas; elsewhere CO2 and CH4 values are very low, within the typical range of vegetative and organic exhalation of cultivated soil. After the seismic sequence, the CH4 and CO2 fluxes increased by one order of magnitude in the spot areas, whereas in the surrounding area the values remained within the background. In contrast, CH4 concentration decreased (40% v/v in the 2012 surveys) and CO2 concentration increased up to 12.7% v/v (2013 survey). Isotopic gas analyses were carried out only on samples with anomalous values. Pre-seismic data hint at a thermogenic origin of CH4, probably linked to leakage from a deep source in the Medolla area. Conversely, 2012/2013 isotopic data indicate a typical biogenic origin (i.e. microbial hydrocarbon production) of the CH4, as recognized elsewhere in the Po Plain and surroundings. The δ13C-CO2 value suggests a prevalent shallow origin of CO2 (i.e. organic and/or soil-derived), probably related to anaerobic oxidation of heavy hydrocarbons. Water samples, collected from domestic, industrial and hydrocarbon exploration wells, allowed us to recognize different families of waters. Waters are meteoric in origin and

  6. Mouse Vk gene classification by nucleic acid sequence similarity.

    PubMed

    Strohal, R; Helmberg, A; Kroemer, G; Kofler, R

    1989-01-01

    Analyses of immunoglobulin (Ig) variable (V) region gene usage in the immune response, estimates of V gene germline complexity, and other nucleic acid hybridization-based studies depend on the extent to which such genes are related (i.e., sequence similarity) and their organization in gene families. While mouse Igh heavy chain V region (VH) gene families are relatively well-established, a corresponding systematic classification of Igk light chain V region (Vk) genes has not been reported. The present analysis, in the course of which we reviewed the known extent of the Vk germline gene repertoire and Vk gene usage in a variety of responses to foreign and self antigens, provides a classification of mouse Vk genes in gene families composed of members with greater than 80% overall nucleic acid sequence similarity. This classification differed in several aspects from that of VH genes: only some Vk gene families were as clearly separated (by greater than 25% sequence dissimilarity) as typical VH gene families; most Vk gene families were closely related and, in several instances, members from different families were very similar (greater than 80%) over large sequence portions; frequently, classification by nucleic acid sequence similarity diverged from existing classifications based on amino-terminal protein sequence similarity. Our data have implications for Vk gene analyses by nucleic acid hybridization and describe potentially important differences in sequence organization between VH and Vk genes.

  7. Thin-film technology for direct visual detection of nucleic acid sequences: applications in clinical research.

    PubMed

    Jenison, Robert D; Bucala, Richard; Maul, Diana; Ward, David C

    2006-01-01

    Certain optical conditions permit the unaided eye to detect thickness changes on surfaces on the order of 20 A, which are of similar dimensions to monomolecular interactions between proteins or hybridization of complementary nucleic acid sequences. Such detection exploits specific interference of reflected white light, wherein thickness changes are perceived as surface color changes. This technology, termed thin-film detection, allows for the visualization of subattomole amounts of nucleic acid targets, even in complex clinical samples. Thin-film technology has been applied to a broad range of clinically relevant indications, including the detection of pathogenic bacterial and viral nucleic acid sequences and the discrimination of sequence variations in human genes causally related to susceptibility or severity of disease.

  8. Effect of variations in peptide sequence on anti-human milk fat globule membrane antibody reactions.

    PubMed

    Xing, P X; Reynolds, K; Pietersz, G A; McKenzie, I F

    1991-02-01

    Monoclonal anti-mucin antibodies BC1, BC2 and BC3 produced using human milk fat globule membrane react with a synthetic peptide p1-24 (PDTRPAPGSTAPPAHGVTSAPDTR) representing the repeating amino acid sequence of the mucin core protein. The minimum epitope recognized by these three monoclonal antibodies (mAb) in p1-24 was contained in the five amino acids APDTR. To analyse the variation of position of the epitope, various modifications of the APDTR sequence were made by synthesizing peptides and testing by direct binding and inhibition enzyme-linked immunosorbent assays. Firstly, peptides p13-32 and C-p13-32, in which the epitope APDTR was placed in the middle instead of at the C-terminus as in p1-24, were examined. These peptides had a greater reaction with mAb BC1, BC2 and BC3 compared with the reaction with p1-24. Secondly, A-p1-24 and TSA-p1-24 were made wherein two APDTR epitopes were present; these peptides were shown to bind two IgG antibody molecules. Finally, the contribution of each amino acid in the APDTR epitope was studied using the pepscan polyethylene rods, making all 20 of the amino acid substitutions in each position for SAPDTR (the minimum epitope APDTR with an adjacent amino acid S). In the 120 peptides examined there were some 'permissible' substitutions in A, D and T but not in P or R for BC1 and BC2; there were more 'permissible' substitutions for BC3; different substitution patterns were found with each antibody and some substitutions gave an increased reaction compared with the native peptide SAPDTR. The studies are of value in analysing the reaction of antibodies with epitopes expressed in breast cancer and in determining the antigenicity of synthetic peptides.
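
    The figure of 120 peptides is consistent with placing each of the 20 standard amino acids at each of the 6 positions of SAPDTR (6 x 20 = 120). That reading is an inference from the abstract, not a statement by the authors. A minimal sketch of such a single-substitution scan, with a hypothetical function name, is:

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes
NATIVE = "SAPDTR"                     # native peptide named in the abstract


def single_substitution_scan(native):
    """Return (position, residue, peptide) for every residue at every position."""
    peptides = []
    for pos in range(len(native)):
        for residue in AMINO_ACIDS:
            variant = native[:pos] + residue + native[pos + 1:]
            peptides.append((pos, residue, variant))
    return peptides


scan = single_substitution_scan(NATIVE)
print(len(scan))                                # 6 positions x 20 residues
print(len({peptide for _, _, peptide in scan}))  # distinct peptide sequences
```

    Under this scheme the native sequence is regenerated once per position, so the 120 entries contain 115 distinct sequences (6 x 19 substituted variants plus the native peptide).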

  9. Amino acid sequence repertoire of the bacterial proteome and the occurrence of untranslatable sequences

    PubMed Central

    Navon, Sharon Penias; Kornberg, Guy; Chen, Jin; Schwartzman, Tali; Tsai, Albert; Puglisi, Elisabetta Viani; Puglisi, Joseph D.; Adir, Noam

    2016-01-01

    Bioinformatic analysis of Escherichia coli proteomes revealed that all possible amino acid triplet sequences occur at their expected frequencies, with four exceptions. Two of the four underrepresented sequences (URSs) were shown to interfere with translation in vivo and in vitro. Enlarging the URS by a single amino acid resulted in increased translational inhibition. Single-molecule methods revealed stalling of translation at the entrance of the peptide exit tunnel of the ribosome, adjacent to ribosomal nucleotides A2062 and U2585. Interaction with these same ribosomal residues is involved in regulation of translation by longer, naturally occurring protein sequences. The E. coli exit tunnel has evidently evolved to minimize interaction with the exit tunnel and maximize the sequence diversity of the proteome, although allowing some interactions for regulatory purposes. Bioinformatic analysis of the human proteome revealed no underrepresented triplet sequences, possibly reflecting an absence of regulation by interaction with the exit tunnel. PMID:27307442
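
    One simple way to compare observed and expected triplet counts is sketched below. The null model here (residues drawn independently at their overall frequencies) is an assumption made for illustration, since the abstract does not state how expected frequencies were computed. The function name and the toy sequences are likewise hypothetical.

```python
from collections import Counter


def observed_and_expected(proteins, triplet):
    """Return (observed, expected) counts of one amino acid triplet.

    Expected counts assume each residue occurs independently at its overall
    frequency in the input sequences (an illustrative null model only).
    """
    residue_counts = Counter()
    triplet_counts = Counter()
    for protein in proteins:
        residue_counts.update(protein)
        # Every window of three consecutive residues is one triplet observation.
        triplet_counts.update(protein[i:i + 3] for i in range(len(protein) - 2))
    total_residues = sum(residue_counts.values())
    total_triplets = sum(triplet_counts.values())
    probability = 1.0
    for residue in triplet:
        probability *= residue_counts[residue] / total_residues
    return triplet_counts[triplet], total_triplets * probability


# Toy sequences for illustration only; a real analysis would use a full proteome.
toy_proteome = ["MKKLAA", "MKAAKK"]
print(observed_and_expected(toy_proteome, "MKK"))
```

    A triplet whose observed count falls far below its expected count across a whole proteome would be a candidate underrepresented sequence in the sense used by the abstract.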

  10. Amino acid sequence repertoire of the bacterial proteome and the occurrence of untranslatable sequences.

    PubMed

    Navon, Sharon Penias; Kornberg, Guy; Chen, Jin; Schwartzman, Tali; Tsai, Albert; Puglisi, Elisabetta Viani; Puglisi, Joseph D; Adir, Noam

    2016-06-28

    Bioinformatic analysis of Escherichia coli proteomes revealed that all possible amino acid triplet sequences occur at their expected frequencies, with four exceptions. Two of the four underrepresented sequences (URSs) were shown to interfere with translation in vivo and in vitro. Enlarging the URS by a single amino acid resulted in increased translational inhibition. Single-molecule methods revealed stalling of translation at the entrance of the peptide exit tunnel of the ribosome, adjacent to ribosomal nucleotides A2062 and U2585. Interaction with these same ribosomal residues is involved in regulation of translation by longer, naturally occurring protein sequences. The E. coli exit tunnel has evidently evolved to minimize interaction with the exit tunnel and maximize the sequence diversity of the proteome, although allowing some interactions for regulatory purposes. Bioinformatic analysis of the human proteome revealed no underrepresented triplet sequences, possibly reflecting an absence of regulation by interaction with the exit tunnel.

  11. Amino acid sequences of proteins from Leptospira serovar pomona.

    PubMed

    Alves, S F; Lefebvre, R B; Probert, W

    2000-01-01

    This report describes partial amino acid sequences from three putative outer envelope proteins from Leptospira serovar pomona. To obtain internal fragments for protein sequencing, enzymatic and chemical digestions were performed. The enzyme clostripain was used to digest the 32 and 45 kDa proteins. In situ digestion of the 40 kDa protein was accomplished using cyanogen bromide. The 32 kDa protein generated two fragments, one of 21 kDa and another of 10 kDa that yielded five residues. A 24 kDa fragment that yielded nineteen amino acid residues was obtained from the 45 kDa protein. A 20 kDa fragment, yielding a twenty-amino-acid sequence, was obtained from the 40 kDa protein.

  12. A molecular footprint of limb loss: sequence variation of the autopodial identity gene Hoxa-13.

    PubMed

    Kohlsdorf, Tiana; Cummings, Michael P; Lynch, Vincent J; Stopper, Geffrey F; Takahashi, Kazuhiko; Wagner, Günter P

    2008-12-01

    The homeobox gene Hoxa-13 codes for a transcription factor involved in multiple functions, including body axis and hand/foot development in tetrapods. In this study we investigate whether the loss of one function (e.g., limb loss in snakes) left a molecular footprint in exon 1 of Hoxa-13 that could be associated with the release of functional constraints caused by limb loss. Fragments of the Hoxa-13 exon 1 were sequenced from 13 species and analyzed, with additional published sequences of the same region, using relative rates and likelihood-ratio tests. Five amino acid sites in exon 1 of Hoxa-13 were detected as evolving under positive selection in the stem lineage of snakes. To further investigate whether there is an association between limb loss and sequence variation in Hoxa-13, we used the random forest method on an alignment that included shark, basal fish lineages, and "eu-tetrapods" such as mammals, turtle, alligator, and birds. The random forest method approaches the problem as one of classification, where we seek to predict the presence or absence of autopodium based on amino acid variation in Hoxa-13 sequences. Different alignments tested were associated with similar error rates (18.42%). The random forest method suggested that phenotypic states (autopodium present and absent) can often be correctly predicted based on Hoxa-13 sequences. Basal, nontetrapod gnathostomes that never had an autopodium were consistently classified as limbless together with the snakes, while eu-tetrapods without any history of limb loss in their phylogeny were also consistently classified as having a limb. Misclassifications affected mostly lizards, which, as a group, have a history of limb loss and limb re-evolution, and the urodele and caecilian in our sample. We conclude that a molecular footprint can be detected in Hoxa-13 that is associated with the lack of an autopodium; groups with classification ambiguity (lizards) are characterized by a history of repeated limb loss

  13. Extensive amino acid sequence homologies between animal lectins

    SciTech Connect

    Paroutaud, P.; Levi, G.; Teichberg, V.I.; Strosberg, A.D.

    1987-09-01

    The authors have established the amino acid sequence of the β-D-galactoside binding lectin from the electric eel and the sequences of several peptides from a similar lectin isolated from human placenta. These sequences were compared with the published sequences of peptides derived from the β-D-galactoside binding lectin from human lung and with sequences deduced from cDNAs assigned to the β-D-galactoside binding lectins from chicken embryo skin and human hepatomas. Significant homologies were observed. One of the highly conserved regions that contains a tryptophan residue and two glutamic acid residues is probably part of the β-D-galactoside binding site, which, on the basis of spectroscopic studies of the electric eel lectin, is expected to contain such residues. The similarity of the hydropathy profiles and the predicted secondary structure of the lectins from chicken skin and electric eel, in spite of differences in their amino acid sequences, strongly suggests that these proteins have maintained structural homologies during evolution and together with the other β-D-galactoside binding lectins were derived from a common ancestor gene.

  14. Amino acid sequence of porcine spleen cathepsin D.

    PubMed Central

    Shewale, J G; Tang, J

    1984-01-01

    The amino acid sequence of porcine spleen cathepsin D heavy chain has been determined and, hence, the complete structure of this enzyme is now known. The sequence of the heavy chain was constructed by aligning the structures of peptides generated by cyanogen bromide, trypsin, and endo-proteinase Lys C cleavages. The structure of the light chain has been published previously. The cathepsin D molecule contains 339 amino acid residues in two polypeptide chains: a 97-residue light chain and a 242-residue heavy chain, with a combined Mr of 36,779 (without carbohydrate). There are two carbohydrate units linked to asparagine residues 70 and 192. The disulfide bond arrangement in cathepsin D is probably similar to that of pepsin, because the positions of six half-cystine residues are conserved. The active site aspartyl residues, corresponding to aspartic acid-32 and -215 of pepsin, are located at residues 33 and 224 in the cathepsin D molecule. The amino acid sequence around these aspartyl residues is strongly conserved. Cathepsin D shows a strong homology with other acid proteases. When the sequences of cathepsin D, renin, and pepsin are aligned, 32.7% of the residues are identical. The homology is observed throughout the length of the molecules, indicating that the three-dimensional structures of all three molecules are similar. PMID:6587385
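
    A figure such as the 32.7% identity quoted above can be computed from a multiple alignment along the following lines. This is a minimal sketch. The abstract does not state how gaps were handled or what denominator was used, so the sketch assumes that gapped columns count as non-identical and that the denominator is the full alignment length. The function name and the toy sequences are hypothetical.

```python
def percent_identity(aligned):
    """Percentage of alignment columns in which all sequences share one residue.

    Assumptions (not from the abstract): columns containing a gap ('-')
    count as non-identical, and the denominator is the alignment length.
    """
    if not aligned or not aligned[0] or len({len(s) for s in aligned}) != 1:
        raise ValueError("need non-empty sequences of equal aligned length")
    columns = list(zip(*aligned))
    identical = sum(
        1 for col in columns if "-" not in col and len(set(col)) == 1
    )
    return 100.0 * identical / len(columns)


# Toy three-way alignment, not the real cathepsin D / renin / pepsin data.
toy = ["ACDE", "ACDF", "AC-E"]
identity = percent_identity(toy)  # two of four columns are fully identical
```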

  15. Protein 3D structure computed from evolutionary sequence variation.

    PubMed

    Marks, Debora S; Colwell, Lucy J; Sheridan, Robert; Hopf, Thomas A; Pagnani, Andrea; Zecchina, Riccardo; Sander, Chris

    2011-01-01

    The evolutionary trajectory of a protein through sequence space is constrained by its function. Collections of sequence homologs record the outcomes of millions of evolutionary experiments in which the protein evolves according to these constraints. Deciphering the evolutionary record held in these sequences and exploiting it for predictive and engineering purposes presents a formidable challenge. The potential benefit of solving this challenge is amplified by the advent of inexpensive high-throughput genomic sequencing. In this paper we ask whether we can infer evolutionary constraints from a set of sequence homologs of a protein. The challenge is to distinguish true co-evolution couplings from the noisy set of observed correlations. We address this challenge using a maximum entropy model of the protein sequence, constrained by the statistics of the multiple sequence alignment, to infer residue pair couplings. Surprisingly, we find that the strength of these inferred couplings is an excellent predictor of residue-residue proximity in folded structures. Indeed, the top-scoring residue couplings are sufficiently accurate and well-distributed to define the 3D protein fold with remarkable accuracy. We quantify this observation by computing, from sequence alone, all-atom 3D structures of fifteen test proteins from different fold classes, ranging in size from 50 to 260 residues, including a G-protein coupled receptor. These blinded inferences are de novo, i.e., they do not use homology modeling or sequence-similar fragments from known structures. The co-evolution signals provide sufficient information to determine accurate 3D protein structure to 2.7-4.8 Å C(α)-RMSD error relative to the observed structure, over at least two-thirds of the protein (method called EVfold, details at http://EVfold.org). This discovery provides insight into essential interactions constraining protein evolution and will facilitate a comprehensive survey of the universe of protein structures
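
    The "observed correlations" that the abstract treats as a noisy starting point can be illustrated with mutual information between two alignment columns. This is not the maximum entropy model used by EVfold. It is a simpler, swapped-in statistic that shows only the raw pairwise signal such a model is meant to disentangle. The function name and the toy alignment are hypothetical.

```python
from collections import Counter
from math import log2


def column_mutual_information(alignment, i, j):
    """Mutual information (in bits) between alignment columns i and j.

    High values mean the residues at the two positions covary across the
    homologs; this does not separate direct from indirect couplings.
    """
    n = len(alignment)
    pair_counts = Counter((seq[i], seq[j]) for seq in alignment)
    counts_i = Counter(seq[i] for seq in alignment)
    counts_j = Counter(seq[j] for seq in alignment)

    mi = 0.0
    for (a, b), count in pair_counts.items():
        p_ab = count / n
        p_a = counts_i[a] / n
        p_b = counts_j[b] / n
        mi += p_ab * log2(p_ab / (p_a * p_b))
    return mi


# Toy alignments, not real homolog data.
covarying = ["AK", "AK", "GE", "GE"]    # column 1 always tracks column 0
independent = ["AK", "AE", "GK", "GE"]  # every combination equally likely
mi_high = column_mutual_information(covarying, 0, 1)
mi_zero = column_mutual_information(independent, 0, 1)
```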

  16. From phenotypes to causal sequences: using genome wide association studies to dissect the sequence basis for variation of plant development.

    PubMed

    Ogura, Takehiko; Busch, Wolfgang

    2015-02-01

    Tremendous natural variation of growth and development exists within species. Uncovering the molecular mechanisms that tune growth and development promises to shed light on a broad set of biological issues including genotype to phenotype relations, regulatory mechanisms of biological processes and evolutionary questions. Recent progress in sequencing and data processing capabilities has enabled Genome Wide Association Studies (GWASs) to identify DNA sequence polymorphisms that underlie the variation of biological traits. In recent years, GWASs have proven powerful in revealing the complex genetic bases of many phenotypes in various plant species. Here we highlight successful recent GWASs that uncovered mechanistic and sequence bases of trait variation related to plant growth and development and discuss important considerations for conducting successful GWASs.

  17. Clinal variation for amino acid polymorphisms at the Pgm locus in Drosophila melanogaster.

    PubMed Central

    Verrelli, B C; Eanes, W F

    2001-01-01

    Clinal variation is common for enzymes in the glycolytic pathway for Drosophila melanogaster and is generally accepted as an adaptive response to different climates. Although the enzyme phosphoglucomutase (PGM) possesses several allozyme polymorphisms, it is unique in that it had been reported to show no clinal variation. Our recent DNA sequence investigation of Pgm found extensive cryptic amino acid polymorphism segregating with the allozyme alleles. In this study, we characterize the geographic variation of Pgm amino acid polymorphisms at the nucleotide level along a latitudinal cline in the eastern United States. A survey of 15 SNPs across the Pgm gene finds significant clinal differentiation for the allozyme polymorphisms as well as for many of the cryptic amino acid polymorphisms. A test of independence shows that pervasive linkage disequilibrium across this gene region can explain many of the amino acid clines. A single Pgm haplotype defined by two amino acid polymorphisms shows the strongest correlation with latitude and the steepest change in allele frequency across the cline. We propose that clinal selection at Pgm may in part explain the extensive amino acid polymorphism at this locus and is consistent with a multilocus response to selection in the glycolytic pathway. PMID:11290720

  18. Metadata-driven comparative analysis tool for sequences (meta-CATS): an automated process for identifying significant sequence variations that correlate with virus attributes.

    PubMed

    Pickett, B E; Liu, M; Sadat, E L; Squires, R B; Noronha, J M; He, S; Jen, W; Zaremba, S; Gu, Z; Zhou, L; Larsen, C N; Bosch, I; Gehrke, L; McGee, M; Klem, E B; Scheuermann, R H

    2013-12-01

    The Virus Pathogen Resource (ViPR; www.viprbrc.org) and Influenza Research Database (IRD; www.fludb.org) have developed a metadata-driven Comparative Analysis Tool for Sequences (meta-CATS), which performs statistical comparative analyses of nucleotide and amino acid sequence data to identify correlations between sequence variations and virus attributes (metadata). Meta-CATS guides users through: selecting a set of nucleotide or protein sequences; dividing them into multiple groups based on any associated metadata attribute (e.g. isolation location, host species); performing a statistical test at each aligned position; and identifying all residues that significantly differ between the groups. As proofs of concept, we have used meta-CATS to identify sequence biomarkers associated with dengue viruses isolated from different hemispheres, and to identify variations in the NS1 protein that are unique to each of the 4 dengue serotypes. Meta-CATS is made freely available to virology researchers to identify genotype-phenotype correlations for development of improved vaccines, diagnostics, and therapeutics.
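
    The workflow listed above (group the sequences by a metadata attribute, test each aligned position, report positions that differ) can be sketched as follows. The abstract does not name the statistical test, so a Pearson chi-square statistic on residue counts is assumed here purely for illustration. The function name and the toy groups are hypothetical and are not part of the meta-CATS interface.

```python
from collections import Counter


def chi_square_by_position(group_a, group_b):
    """Return [(position, chi2, degrees_of_freedom)] for each variable column.

    group_a and group_b are lists of aligned sequences of equal length.
    Assumption (not from the abstract): a Pearson chi-square statistic on
    the 2 x k table of residue counts per group.
    """
    n_a, n_b = len(group_a), len(group_b)
    total = n_a + n_b
    results = []
    for pos in range(len(group_a[0])):
        counts_a = Counter(seq[pos] for seq in group_a)
        counts_b = Counter(seq[pos] for seq in group_b)
        residues = sorted(set(counts_a) | set(counts_b))
        if len(residues) < 2:
            continue  # invariant column: nothing to test
        chi2 = 0.0
        for residue in residues:
            column_total = counts_a[residue] + counts_b[residue]
            for observed, n in ((counts_a[residue], n_a), (counts_b[residue], n_b)):
                expected = column_total * n / total
                chi2 += (observed - expected) ** 2 / expected
        results.append((pos, chi2, len(residues) - 1))
    return results


# Toy groups, e.g. isolates split by a hypothetical metadata attribute.
group_a = ["AK", "AK"]
group_b = ["AR", "AR"]
hits = chi_square_by_position(group_a, group_b)
```

    Converting each statistic to a p-value needs the chi-square distribution (for example `scipy.stats.chi2.sf(chi2, dof)`), and testing many positions would normally call for a multiple-testing correction.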

  19. Variation in Symbiodinium ITS2 Sequence Assemblages among Coral Colonies

    PubMed Central

    Stat, Michael; Bird, Christopher E.; Pochon, Xavier; Chasqui, Luis; Chauka, Leonard J.; Concepcion, Gregory T.; Logan, Dan; Takabayashi, Misaki; Toonen, Robert J.; Gates, Ruth D.

    2011-01-01

    Endosymbiotic dinoflagellates in the genus Symbiodinium are fundamentally important to the biology of scleractinian corals, as well as to a variety of other marine organisms. The genus Symbiodinium is genetically and functionally diverse and the taxonomic nature of the union between Symbiodinium and corals is implicated as a key trait determining the environmental tolerance of the symbiosis. Surprisingly, the question of how Symbiodinium diversity partitions within a species across spatial scales of meters to kilometers has received little attention, but is important to understanding the intrinsic biological scope of a given coral population and adaptations to the local environment. Here we address this gap by describing the Symbiodinium ITS2 sequence assemblages recovered from colonies of the reef building coral Montipora capitata sampled across Kāne'ohe Bay, Hawai'i. A total of 52 corals were sampled in a nested design of Coral Colony(Site(Region)) reflecting spatial scales of meters to kilometers. A diversity of Symbiodinium ITS2 sequences was recovered with the majority of variance partitioning at the level of the Coral Colony. To confirm this result, the Symbiodinium ITS2 sequence diversity in six M. capitata colonies was analyzed in much greater depth with 35 to 55 clones per colony. The ITS2 sequences and quantitative composition recovered from these colonies varied significantly, indicating that each coral hosted a different assemblage of Symbiodinium. The diversity of Symbiodinium ITS2 sequence assemblages retrieved from individual colonies of M. capitata here highlights the problems inherent in interpreting multi-copy and intra-genomically variable molecular markers, and serves as a context for discussing the utility and biological relevance of assigning species names based on Symbiodinium ITS2 genotyping. PMID:21246044

  20. Variation in Symbiodinium ITS2 sequence assemblages among coral colonies.

    PubMed

    Stat, Michael; Bird, Christopher E; Pochon, Xavier; Chasqui, Luis; Chauka, Leonard J; Concepcion, Gregory T; Logan, Dan; Takabayashi, Misaki; Toonen, Robert J; Gates, Ruth D

    2011-01-05

    Endosymbiotic dinoflagellates in the genus Symbiodinium are fundamentally important to the biology of scleractinian corals, as well as to a variety of other marine organisms. The genus Symbiodinium is genetically and functionally diverse and the taxonomic nature of the union between Symbiodinium and corals is implicated as a key trait determining the environmental tolerance of the symbiosis. Surprisingly, the question of how Symbiodinium diversity partitions within a species across spatial scales of meters to kilometers has received little attention, but is important to understanding the intrinsic biological scope of a given coral population and adaptations to the local environment. Here we address this gap by describing the Symbiodinium ITS2 sequence assemblages recovered from colonies of the reef building coral Montipora capitata sampled across Kāne'ohe Bay, Hawai'i. A total of 52 corals were sampled in a nested design of Coral Colony(Site(Region)) reflecting spatial scales of meters to kilometers. A diversity of Symbiodinium ITS2 sequences was recovered with the majority of variance partitioning at the level of the Coral Colony. To confirm this result, the Symbiodinium ITS2 sequence diversity in six M. capitata colonies was analyzed in much greater depth with 35 to 55 clones per colony. The ITS2 sequences and quantitative composition recovered from these colonies varied significantly, indicating that each coral hosted a different assemblage of Symbiodinium. The diversity of Symbiodinium ITS2 sequence assemblages retrieved from individual colonies of M. capitata here highlights the problems inherent in interpreting multi-copy and intra-genomically variable molecular markers, and serves as a context for discussing the utility and biological relevance of assigning species names based on Symbiodinium ITS2 genotyping.

  1. Active site amino acid sequence of human factor D.

    PubMed

    Davis, A E

    1980-08-01

    Factor D was isolated from human plasma by chromatography on CM-Sephadex C50, Sephadex G-75, and hydroxylapatite. Digestion of reduced, S-carboxymethylated factor D with cyanogen bromide resulted in three peptides which were isolated by chromatography on Sephadex G-75 (superfine) equilibrated in 20% formic acid. NH2-Terminal sequences were determined by automated Edman degradation with a Beckman 890C sequencer using a 0.1 M Quadrol program. The smallest peptide (CNBr III) consisted of the NH2-terminal 14 amino acids. The other two peptides had molecular weights of 17,000 (CNBr I) and 7000 (CNBr II). Overlap of the NH2-terminal sequence of factor D with the NH2-terminal sequence of CNBr I established the order of the peptides. The NH2-terminal 53 residues of factor D are somewhat more homologous with the group-specific protease of rat intestine than with other serine proteases. The NH2-terminal sequence of CNBr II revealed the active site serine of factor D. The typical serine protease active site sequence (Gly-Asp-Ser-Gly-Gly-Pro) was found at residues 12-17. The region surrounding the active site serine does not appear to be more highly homologous with any one of the other serine proteases. The structural data obtained point out the similarities between factor D and the other proteases. However, complete definition of the degree of relationship between factor D and other proteases will require determination of the remainder of the primary structure.

  2. The amino acid sequence of iguana (Iguana iguana) pancreatic ribonuclease.

    PubMed

    Zhao, W; Beintema, J J; Hofsteenge, J

    1994-01-15

    The pyrimidine-specific ribonuclease superfamily constitutes a group of homologous proteins so far found only in higher vertebrates. Four separate families are found in mammals, which have resulted from gene duplications in mammalian ancestors. To learn more about the evolutionary history of this superfamily, the primary structure and other characteristics of the pancreatic enzyme from iguana (Iguana iguana), a herbivorous lizard species belonging to the reptiles, have been determined. The polypeptide chain consists of 119 amino acid residues. The positions of insertions and deletions in the sequence are identical to those in the enzyme from snapping turtle. However, the two enzymes differ at 54% of the amino acid positions. Iguana ribonuclease contains no carbohydrate, although the enzyme possesses three recognition sites for carbohydrate attachment, and has a high number of acidic residues in a localized part of the sequence.

  3. Sequence variation of alcohol dehydrogenase (Adh) paralogs in cactophilic Drosophila.

    PubMed Central

    Matzkin, Luciano M; Eanes, Walter F

    2003-01-01

    This study focuses on the population genetics of alcohol dehydrogenase (Adh) in cactophilic Drosophila. Drosophila mojavensis and D. arizonae utilize cactus hosts, and each host contains a characteristic mixture of alcohol compounds. In these Drosophila species there are two functional Adh loci, an adult form (Adh-2) and a larval and ovarian form (Adh-1). Overall, the greater level of variation segregating in D. arizonae than in D. mojavensis suggests a larger population size for D. arizonae. There are markedly different patterns of variation between the paralogs across both species. A 16-bp intron haplotype segregates in both species at Adh-2, apparently the product of an ancient gene conversion event between the paralogs, which suggests that there is selection for the maintenance of the intron structure possibly for the maintenance of pre-mRNA structure. We observe a pattern of variation consistent with adaptive protein evolution in the D. mojavensis lineage at Adh-1, suggesting that the cactus host shift that occurred in the divergence of D. mojavensis from D. arizonae had an effect on the evolution of the larval expressed paralog. Contrary to previous work we estimate a recent time for both the divergence of D. mojavensis and D. arizonae (2.4 +/- 0.7 MY) and the age of the gene duplication (3.95 +/- 0.45 MY). PMID:12586706

  4. Amino acid sequence and comparative antigenicity of chicken metallothionein.

    PubMed Central

    McCormick, C C; Fullmer, C S; Garvey, J S

    1988-01-01

    The complete amino acid sequence of metallothionein (MT) from chicken liver is reported. The primary structure was determined by automated sequence analysis of peptides produced by limited acid hydrolysis and by trypsin digestion. The comparative antigenicity of chicken MT was determined by radioimmunoassay using rabbit anti-rat MT polyclonal antibody. Chicken MT consists of 63 amino acids as compared to 61 found in MTs from mammals. One insertion (and two substitutions) occurs in the amino-terminal region, a region considered invariant among mammalian MTs. Eighteen of the 20 cysteines in chicken MT were aligned with cysteines from other mammalian sequences. Two cysteines near the carboxyl terminus are shifted by one residue due to the insertion of proline in that region. Overall, the chicken protein showed approximately 68% sequence identity in a comparison with various mammalian MTs. The affinity of the polyclonal antibody for chicken MT was decreased by 2 orders of magnitude in comparison to that of a mammalian MT (rat MT isoforms). This reduced affinity is attributed to major substitutions in chicken MT in the regions of the principal determinants of mammalian MTs. Theoretical analysis of the primary structure predicted the secondary structure to consist of reverse turns and random coils with no stable beta or helix conformations. There is no evidence that chicken MT differs functionally from mammalian MTs. PMID:2448773

  5. Analysis of microbial community variation during the mixed culture fermentation of agricultural peel wastes to produce lactic acid.

    PubMed

    Liang, Shaobo; Gliniewicz, Karol; Gerritsen, Alida T; McDonald, Armando G

    2016-05-01

    Mixed culture fermentation can be used to convert organic wastes into various chemicals and fuels. This study examined the fermentation performance of four batch reactors fed with different agricultural (orange, banana, and potato (mechanical and steam)) peel wastes using mixed cultures, and monitored the variation of reactor microbial communities at intervals by Illumina sequencing of 16S rRNA genes. All four reactors produced similar chemical profiles with lactic acid (LA) as the dominant compound. Acetic acid and ethanol were also observed in small fractions. The Illumina sequencing results revealed that the diversity of the microbial community decreased during fermentation and that a community of largely lactic acid producing bacteria, dominated by species of Lactobacillus, developed.

  6. Sequences Of Amino Acids For Human Serum Albumin

    NASA Technical Reports Server (NTRS)

    Carter, Daniel C.

    1992-01-01

    Sequences of amino acids defined for use in making polypeptides one-third to one-sixth as large as parent human serum albumin molecule. Smaller, chemically stable peptides have diverse applications including service as artificial human serum and as active components of biosensors and chromatographic matrices. In applications involving production of artificial sera from new sequences, little or no concern about viral contaminants. Smaller genetically engineered polypeptides more easily expressed and produced in large quantities, making commercial isolation and production more feasible and profitable.

  7. Allelic sequence variation of the HLA-DQ loci: relationship to serology and to insulin-dependent diabetes susceptibility.

    PubMed Central

    Horn, G T; Bugawan, T L; Long, C M; Erlich, H A

    1988-01-01

    Analysis of sequence variation in the polymorphic second exon of the major histocompatibility complex genes HLA-DQ alpha and -DQ beta has revealed 8 allelic variants at the alpha locus and 13 variants at the beta locus. Correlation of sequence variation with serologic typing suggests that the DQw2, DQw3, and DQ(blank) types are determined by the DQ beta subunit, while the DQw1 specificity is determined by DQ alpha. The nature of the amino acid at position 57 in the DQ beta subunit is correlated with susceptibility to insulin-dependent diabetes mellitus. This region of the DQ beta chain contains shared peptides with Epstein-Barr virus and rubella virus. PMID:2842756

  8. HIV-1 sequence variation between isolates from mother-infant transmission pairs

    SciTech Connect

    Wike, C.M.; Daniels, M.R.; Furtado, M.; Wolinsky, M.; Korber, B.; Hutto, C.; Munoz, J.; Parks, W.; Saah, A.

    1991-01-01

    To examine the sequence diversity of human immunodeficiency virus type 1 (HIV-1) between known transmission sets, sequences from the V3 and V4-V5 region of the env gene from 4 mother-infant pairs were analyzed. The mean interpatient sequence variation between isolates from linked mother-infant pairs was comparable to the sequence diversity found between isolates from other close contacts. The mean intrapatient variation was significantly less in the infants' isolates than in the isolates from both their mothers and other characterized intrapatient sequence sets. In addition, a distinct and characteristic difference in the glycosylation pattern preceding the V3 loop was found between each linked transmission pair. These findings indicate that selection of specific genotypic variants, which may play a role in some direct transmission sets, and the duration of infection are important factors in the degree of diversity seen between the sequence sets.

  9. HIV-1 sequence variation between isolates from mother-infant transmission pairs

    SciTech Connect

    Wike, C.M.; Daniels, M.R.; Furtado, M.; Wolinsky, M.; Korber, B.; Hutto, C.; Munoz, J.; Parks, W.; Saah, A.

    1991-12-31

    To examine the sequence diversity of human immunodeficiency virus type 1 (HIV-1) between known transmission sets, sequences from the V3 and V4-V5 region of the env gene from 4 mother-infant pairs were analyzed. The mean interpatient sequence variation between isolates from linked mother-infant pairs was comparable to the sequence diversity found between isolates from other close contacts. The mean intrapatient variation was significantly less in the infants' isolates than in the isolates from both their mothers and other characterized intrapatient sequence sets. In addition, a distinct and characteristic difference in the glycosylation pattern preceding the V3 loop was found between each linked transmission pair. These findings indicate that selection of specific genotypic variants, which may play a role in some direct transmission sets, and the duration of infection are important factors in the degree of diversity seen between the sequence sets.

  10. Atomic force microscopy of crystalline insulins: the influence of sequence variation on crystallization and interfacial structure.

    PubMed Central

    Yip, C M; Brader, M L; DeFelippis, M R; Ward, M D

    1998-01-01

    The self-association of proteins is influenced by amino acid sequence, molecular conformation, and the presence of molecular additives. In the presence of phenolic additives, LysB28ProB29 insulin, in which the C-terminal prolyl and lysyl residues of wild-type human insulin have been inverted, can be crystallized into forms resembling those of wild-type insulins in which the protein exists as zinc-complexed hexamers organized into well-defined layers. We describe herein tapping-mode atomic force microscopy (TMAFM) studies of single crystals of rhombohedral (R3) LysB28ProB29 that reveal the influence of sequence variation on hexamer-hexamer association at the surface of actively growing crystals. Molecular scale lattice images of these crystals were acquired in situ under growth conditions, enabling simultaneous identification of the rhombohedral LysB28ProB29 crystal form, its orientation, and its dynamic growth characteristics. The ability to obtain crystallographic parameters on multiple crystal faces with TMAFM confirmed that bovine and porcine insulins grown under these conditions crystallized into the same space group as LysB28ProB29 (R3), enabling direct comparison of crystal growth behavior and the influence of sequence variation. Real-time TMAFM revealed hexamer vacancies on the (001) terraces of LysB28ProB29, and more rounded dislocation noses and larger terrace widths for actively growing screw dislocations compared to wild-type bovine and porcine insulin crystals under identical conditions. This behavior is consistent with weaker interhexamer attachment energies for LysB28ProB29 at active growth sites. Comparison of the single crystal x-ray structures of wild-type insulins and LysB28ProB29 suggests that differences in protein conformation at the hexamer-hexamer interface and accompanying changes in interhexamer bonding are responsible for this behavior. These studies demonstrate that subtle changes in molecular conformation due to a single sequence

  11. A genome-to-genome analysis of associations between human genetic variation, HIV-1 sequence diversity, and viral control.

    PubMed

    Bartha, István; Carlson, Jonathan M; Brumme, Chanson J; McLaren, Paul J; Brumme, Zabrina L; John, Mina; Haas, David W; Martinez-Picado, Javier; Dalmau, Judith; López-Galíndez, Cecilio; Casado, Concepción; Rauch, Andri; Günthard, Huldrych F; Bernasconi, Enos; Vernazza, Pietro; Klimkait, Thomas; Yerly, Sabine; O'Brien, Stephen J; Listgarten, Jennifer; Pfeifer, Nico; Lippert, Christoph; Fusi, Nicolo; Kutalik, Zoltán; Allen, Todd M; Müller, Viktor; Harrigan, P Richard; Heckerman, David; Telenti, Amalio; Fellay, Jacques

    2013-10-29

    HIV-1 sequence diversity is affected by selection pressures arising from host genomic factors. Using paired human and viral data from 1071 individuals, we ran >3000 genome-wide scans, testing for associations between host DNA polymorphisms, HIV-1 sequence variation and plasma viral load (VL), while considering human and viral population structure. We observed significant human SNP associations to a total of 48 HIV-1 amino acid variants (p < 2.4 × 10^-12). All associated SNPs mapped to the HLA class I region. Clinical relevance of host and pathogen variation was assessed using VL results. We identified two critical advantages to the use of viral variation for identifying host factors: (1) association signals are much stronger for HIV-1 sequence variants than VL, reflecting the 'intermediate phenotype' nature of viral variation; (2) association testing can be run without any clinical data. The proposed genome-to-genome approach highlights sites of genomic conflict and is a strategy generally applicable to studies of host-pathogen interaction. DOI:http://dx.doi.org/10.7554/eLife.01123.001.

  12. Nanopores and nucleic acids: prospects for ultrarapid sequencing

    NASA Technical Reports Server (NTRS)

    Deamer, D. W.; Akeson, M.

    2000-01-01

    DNA and RNA molecules can be detected as they are driven through a nanopore by an applied electric field at rates ranging from several hundred microseconds to a few milliseconds per molecule. The nanopore can rapidly discriminate between pyrimidine and purine segments along a single-stranded nucleic acid molecule. Nanopore detection and characterization of single molecules represents a new method for directly reading information encoded in linear polymers. If single-nucleotide resolution can be achieved, it is possible that nucleic acid sequences can be determined at rates exceeding a thousand bases per second.

  13. Engineering the Dynamic Properties of Protein Networks through Sequence Variation

    PubMed Central

    2016-01-01

    The dynamic behavior of macromolecular networks dominates the mechanical properties of soft materials and influences biological processes at multiple length scales. In hydrogels prepared from self-assembling artificial proteins, stress relaxation and energy dissipation arise from the transient character of physical network junctions. Here we show that subtle changes in sequence can be used to program the relaxation behavior of end-linked networks of engineered coiled-coil proteins. Single-site substitutions in the coiled-coil domains caused shifts in relaxation time over 5 orders of magnitude as demonstrated by dynamic oscillatory shear rheometry and stress relaxation measurements. Networks with multiple relaxation time scales were also engineered. This work demonstrates how time-dependent mechanical responses of macromolecular materials can be encoded in genetic information. PMID:27924309

  14. Analysis of the sequence variations in the Mhc DRB1-like gene of the endangered Humboldt penguin (Spheniscus humboldti).

    PubMed

    Kikkawa, Eri F; Tsuda, Tomi T; Naruse, Taeko K; Sumiyama, Daisuke; Fukuda, Michio; Kurita, Masanori; Murata, Koichi; Wilson, Rory P; LeMaho, Yvon; Tsuda, Michio; Kulski, Jerzy K; Inoko, Hidetoshi

    2005-04-01

    The Major Histocompatibility Complex (Mhc) genomic region of many vertebrates is known to contain at least one highly polymorphic class II gene that is homologous in sequence to one or other of the human Mhc DRB1 class II genes. The diversity of the avian Mhc class II gene sequences has been extensively studied in chickens, quails, and some songbirds, but has been largely ignored in the oceanic birds, including the flightless penguins. We have previously reported that several penguin species have a high degree of polymorphism on exon 2 of the Mhc class II DRB1-like gene. In this study, we present for the first time the complete nucleotide sequences of exon 2, intron 2, and exon 3 of the DRB1-like gene of 20 Humboldt penguins, a species that is presently vulnerable to the dangers of extinction. The Humboldt DRB1-like nucleotide and amino acid sequences reveal at least eight unique alleles. Phylogenetic analysis of all the available avian DRB-like sequences showed that, of five penguin species and nine other bird species, the sequences of the Humboldt penguins grouped most closely to the Little penguin and the mallard, respectively. The present analysis confirms that the sequence variations of the Mhc class II gene, DRB1, are useful for discriminating among individuals within the same penguin population as well as those within different penguin population groups and species.

  15. Proteogenomics: Integrating Next-Generation Sequencing and Mass Spectrometry to Characterize Human Proteomic Variation

    NASA Astrophysics Data System (ADS)

    Sheynkman, Gloria M.; Shortreed, Michael R.; Cesnik, Anthony J.; Smith, Lloyd M.

    2016-06-01

    Mass spectrometry-based proteomics has emerged as the leading method for detection, quantification, and characterization of proteins. Nearly all proteomic workflows rely on proteomic databases to identify peptides and proteins, but these databases typically contain a generic set of proteins that lack variations unique to a given sample, precluding their detection. Fortunately, proteogenomics enables the detection of such proteomic variations and can be defined, broadly, as the use of nucleotide sequences to generate candidate protein sequences for mass spectrometry database searching. Proteogenomics is experiencing heightened significance due to two developments: (a) advances in DNA sequencing technologies that have made complete sequencing of human genomes and transcriptomes routine, and (b) the unveiling of the tremendous complexity of the human proteome as expressed at the levels of genes, cells, tissues, individuals, and populations. We review here the field of human proteogenomics, with an emphasis on its history, current implementations, the types of proteomic variations it reveals, and several important applications.

  16. Proteogenomics: Integrating Next-Generation Sequencing and Mass Spectrometry to Characterize Human Proteomic Variation.

    PubMed

    Sheynkman, Gloria M; Shortreed, Michael R; Cesnik, Anthony J; Smith, Lloyd M

    2016-06-12

    Mass spectrometry-based proteomics has emerged as the leading method for detection, quantification, and characterization of proteins. Nearly all proteomic workflows rely on proteomic databases to identify peptides and proteins, but these databases typically contain a generic set of proteins that lack variations unique to a given sample, precluding their detection. Fortunately, proteogenomics enables the detection of such proteomic variations and can be defined, broadly, as the use of nucleotide sequences to generate candidate protein sequences for mass spectrometry database searching. Proteogenomics is experiencing heightened significance due to two developments: (a) advances in DNA sequencing technologies that have made complete sequencing of human genomes and transcriptomes routine, and (b) the unveiling of the tremendous complexity of the human proteome as expressed at the levels of genes, cells, tissues, individuals, and populations. We review here the field of human proteogenomics, with an emphasis on its history, current implementations, the types of proteomic variations it reveals, and several important applications.

  17. Proteogenomics: Integrating Next-Generation Sequencing and Mass Spectrometry to Characterize Human Proteomic Variation

    PubMed Central

    Sheynkman, Gloria M.; Shortreed, Michael R.; Cesnik, Anthony J.; Smith, Lloyd M.

    2016-01-01

    Mass spectrometry–based proteomics has emerged as the leading method for detection, quantification, and characterization of proteins. Nearly all proteomic workflows rely on proteomic databases to identify peptides and proteins, but these databases typically contain a generic set of proteins that lack variations unique to a given sample, precluding their detection. Fortunately, proteogenomics enables the detection of such proteomic variations and can be defined, broadly, as the use of nucleotide sequences to generate candidate protein sequences for mass spectrometry database searching. Proteogenomics is experiencing heightened significance due to two developments: (a) advances in DNA sequencing technologies that have made complete sequencing of human genomes and transcriptomes routine, and (b) the unveiling of the tremendous complexity of the human proteome as expressed at the levels of genes, cells, tissues, individuals, and populations. We review here the field of human proteogenomics, with an emphasis on its history, current implementations, the types of proteomic variations it reveals, and several important applications. PMID:27049631

  18. Mathematical Characterization of Protein Sequences Using Patterns as Chemical Group Combinations of Amino Acids.

    PubMed

    Das, Jayanta Kumar; Das, Provas; Ray, Korak Kumar; Choudhury, Pabitra Pal; Jana, Siddhartha Sankar

    2016-01-01

    Comparison of amino acid sequence similarity is the fundamental concept behind the protein phylogenetic tree formation. By virtue of this method, we can explain the evolutionary relationships, but further explanations are not possible unless sequences are studied through the chemical nature of individual amino acids. Here we develop a new methodology to characterize the protein sequences on the basis of the chemical nature of the amino acids. We design various algorithms for studying the variation of chemical group transitions and various chemical group combinations as patterns in the protein sequences. The amino acid sequence of conventional myosin II head domain of 14 family members are taken to illustrate this new approach. We find two blocks of maximum length 6 aa as 'FPKATD' and 'Y/FTNEKL' without repeating the same chemical nature and one block of maximum length 20 aa with the repetition of chemical nature which are common among all 14 members. We also check commonality with another motor protein sub-family kinesin, KIF1A. Based on our analysis we find a common block of length 8 aa both in myosin II and KIF1A. This motif is located in the neck linker region which could be responsible for the generation of mechanical force, enabling us to find the unique blocks which remain chemically conserved across the family. We also validate our methodology with different protein families such as MYOI, Myosin light chain kinase (MLCK) and Rho-associated protein kinase (ROCK), Na+/K+-ATPase and Ca2+-ATPase. Altogether, our studies provide a new methodology for investigating the conserved amino acids' pattern in different proteins.
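
    The idea of a block "without repeating the same chemical nature" can be illustrated with a minimal sketch. The eight-way grouping of amino acids below is a hypothetical assumption chosen for illustration and is not taken from the paper, and the function name and toy sequence are likewise invented. The sketch scans a sequence with a sliding window and returns the longest stretch in which no chemical group occurs twice.

```python
# Hypothetical grouping of the 20 standard amino acids by side-chain chemistry.
# This is an illustrative assumption, not the classification used in the paper.
GROUPS = {
    "aliphatic": "GAVLI",
    "aromatic": "FWY",
    "hydroxyl": "ST",
    "sulfur": "CM",
    "acidic": "DE",
    "amide": "NQ",
    "basic": "KRH",
    "imino": "P",
}
AA_TO_GROUP = {aa: group for group, aas in GROUPS.items() for aa in aas}


def longest_nonrepeating_block(seq):
    """Return the longest substring in which no chemical group repeats."""
    best = ""
    start = 0          # left edge of the current window
    last_seen = {}     # chemical group -> index of its latest occurrence
    for i, aa in enumerate(seq):
        group = AA_TO_GROUP[aa]
        # If this group already occurs inside the window, move the left
        # edge just past its previous occurrence.
        if group in last_seen and last_seen[group] >= start:
            start = last_seen[group] + 1
        last_seen[group] = i
        if i - start + 1 > len(best):
            best = seq[start:i + 1]
    return best


# Toy sequence (invented) that embeds the 6-aa block quoted in the abstract.
print(longest_nonrepeating_block("AAFPKATDDL"))
```

    Under this assumed grouping, the six residues of 'FPKATD' fall into six different groups, which is consistent with the block length reported in the abstract; a different grouping could give different blocks.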

  19. Wide variation in microsatellite sequences within each Pfcrt mutant haplotype.

    PubMed

    Vinayak, Sumiti; Mittra, Pooja; Sharma, Yagya D

    2006-05-01

    Flanking microsatellites for each of the Pfcrt mutant haplotypes of Plasmodium falciparum remain conserved among geographical isolates. We describe here heterogeneity in the intragenic microsatellites within each of the Pfcrt haplotypes. There were fourteen different alleles of AT repeats of intron 2 and eight alleles of TA repeats of intron 4 of the pfcrt gene among Indian isolates. This resulted in 33 different two-locus (intron 2 plus intron 4) microsatellite genotypes among 224 isolates. There were 15 different two-locus microsatellite genotypes within the South American Pfcrt haplotype (S72V73M74N75T76S220) and 11 genotypes in the southeast Asian haplotype (C72V73I74E75T76S220) in these isolates. Indian isolates with Pfcrt haplotype C72V73I74E75T76S220 shared one of their two-locus microsatellite genotypes with southeast Asian P. falciparum parasite lines from Thailand (K1) and Indochina (Dd2 and W2). Conversely, Indian isolates containing the S72V73M74N75T76S220 Pfcrt haplotype did not share any of their two-locus microsatellite genotypes with South American parasite line 7G8 from Brazil. A significantly large number of newer two-locus microsatellite genotypes were detected in a 2-year time period (P<0.05). Microsatellite variation was more prominent in the areas of high malaria transmission. It is concluded that the genetic recombination in the intragenic microsatellites continues in the parasite population even after microsatellites flanking the pfcrt gene had already been fixed. Presence of various Pfcrt haplotypes and a variety of intragenic microsatellites indicates that there is a wide spectrum of chloroquine resistant parasite populations in India. This information should be useful for malaria control programs of the country.

  20. The complementary deoxyribonucleic acid sequence of guinea pig endometrial prorelaxin.

    PubMed

    Lee, Y A; Bryant-Greenwood, G D; Mandel, M; Greenwood, F C

    1992-03-01

    The nucleotide sequence of the relaxin gene transcript in the endometrium of the late pregnant guinea pig has been determined. The strategy used was a combination of polymerase chain reaction (PCR) with primers designed from the mRNA sequence of porcine preprorelaxin, rapid amplification of cDNA ends-PCR, and blunt end cloning in M13 mp18. With heterologous primers, a 226-basepair (bp) segment of the guinea pig relaxin gene sequence was obtained and was used to design a guinea pig-specific primer for use with the rapid amplification of cDNA ends-PCR method. The latter allowed completion of the sequence of 336 bp, with a 96-bp overlap. The sequence obtained shows greater homology at both the nucleotide and amino acid levels with porcine and human relaxins H1 and H2 than with rat relaxin, supporting the thesis that the guinea pig is not a rodent. The transcription of the guinea pig endometrial relaxin gene during pregnancy was confirmed by Northern analysis of guinea pig endometrial tissues with a species-specific cDNA probe. The endometrial relaxin gene is transcribed during pregnancy, but not in lactation, consistent with the observed immunostaining for relaxin.

  1. Quantum-Sequencing: Biophysics of quantum tunneling through nucleic acids

    NASA Astrophysics Data System (ADS)

    Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant

    2014-03-01

    Tunneling microscopy and spectroscopy have been used extensively in physical surface sciences to study quantum tunneling, to measure the electronic local density of states of nanomaterials, and to characterize adsorbed species. Quantum-Sequencing (Q-Seq) is a new method based on tunneling microscopy for electronic sequencing of single molecules of nucleic acids. A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free single-molecule sequencing method. Here, we present the unique "electronic fingerprints" for all nucleotides on DNA and RNA using Q-Seq, along with their intrinsic biophysical parameters. We have analyzed tunneling spectra for the nucleotides at different pH conditions and analyzed the HOMO, LUMO and energy gap for all of them. In addition we show a number of biophysical parameters to further characterize all nucleobases (electron and hole transition voltage and energy barriers). These results highlight the robustness of Q-Seq as a technique for next-generation sequencing.

  2. An improved protocol for sequencing of repetitive genomic regions and structural variations using mutagenesis and next generation sequencing.

    PubMed

    Sipos, Botond; Massingham, Tim; Stütz, Adrian M; Goldman, Nick

    2012-01-01

    The rise of Next Generation Sequencing (NGS) technologies has transformed de novo genome sequencing into an accessible research tool, but obtaining high quality eukaryotic genome assemblies remains a challenge, mostly due to the abundance of repetitive elements. These also make it difficult to study nucleotide polymorphism in repetitive regions, including certain types of structural variations. One solution proposed for resolving such regions is Sequence Assembly aided by Mutagenesis (SAM), which relies on the fact that introducing enough random mutations breaks the repetitive structure, making assembly possible. Sequencing many different mutated copies permits the sequence of the repetitive region to be inferred by consensus methods. However, this approach relies on molecular cloning in order to isolate and amplify individual mutant copies, making it hard to scale-up the approach for use in conjunction with high-throughput sequencing technologies. To address this problem, we propose NG-SAM, a modified version of the SAM protocol that relies on PCR and dilution steps only, coupled to a NGS workflow. NG-SAM therefore has the potential to be scaled-up, e.g. using emerging microfluidics technologies. We built a realistic simulation pipeline to study the feasibility of NG-SAM, and our results suggest that under appropriate experimental conditions the approach might be successfully put into practice. Moreover, our simulations suggest that NG-SAM is capable of reconstructing robustly a wide range of potential target sequences of varying lengths and repetitive structures.
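
    The consensus step mentioned above can be sketched in a few lines. This is a minimal illustration, assuming the mutated copies have already been assembled and aligned to equal length; it is not the NG-SAM pipeline itself, and the function name and toy sequences are hypothetical.

```python
from collections import Counter


def consensus(copies):
    """Majority-vote consensus of equal-length, pre-aligned sequences."""
    if not copies:
        raise ValueError("need at least one sequence")
    length = len(copies[0])
    if any(len(c) != length for c in copies):
        raise ValueError("sequences must be aligned to the same length")
    # For each alignment column, keep the most common base across copies.
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*copies))


# Hypothetical toy data: three copies of ACGTACGT, each carrying one
# random substitution at a different position.
copies = ["ACGTACGA", "ACCTACGT", "TCGTACGT"]
print(consensus(copies))
```

    Because random mutations land at different positions in different copies, each column is dominated by the original base, so a simple per-column vote recovers the unmutated sequence when enough copies are available.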

  3. Variation of unsaturated fatty acids in soybean sprout of high oleic acid accessions.

    PubMed

    Dhakal, Krishna Hari; Jung, Ki-Hwal; Chae, Jong-Hyun; Shannon, J Grover; Lee, Jeong-Dong

    2014-12-01

    Oleic acid and oleic acid rich foods may have beneficial health effects in humans. Soybeans with high oleic acid (around 80% in seed oil) have been developed. Soybean sprouts are an important vegetable in Korea, Japan and China. The objective of this study was to investigate the variation of unsaturated fatty acids, oleic, linoleic and α-linolenic acids, in sprouts from soybeans with normal and high oleic acid concentration. Twelve soybean accessions with six high oleic acid lines, three parents of high oleic acid lines, and three checks with normal and high oleic acid concentration were used in this study. The unsaturated fatty acid concentration in sprouts from each genotype was similar to the concentration in the ungerminated seed. The oleic acid concentration in the sprouts of high oleic acid lines (up to 80%) was still high (>70%) compared to the ungerminated seed. Thus, high oleic soybean varieties developed for sprout production could add valuable health benefits to sprouts and the individuals who consume this vegetable.

  4. SUBGROUPS OF AMINO ACID SEQUENCES IN THE VARIABLE REGIONS OF IMMUNOGLOBULIN HEAVY CHAINS*

    PubMed Central

    Cunningham, Bruce A.; Pflumm, Mollie N.; Rutishauser, Urs; Edelman, Gerald M.

    1969-01-01

    The amino acid sequence of the first 133 residues of the heavy (γ) chain from a human γG immunoglobulin (He) has been determined. This γ-chain is identical in Gm type to that of protein Eu, the complete sequence of which has been reported. Comparison of the two sequences substantiates the previous suggestion that there are subgroups of variable regions of heavy chains. The variable region of Eu has been assigned to subgroup I and that of He to subgroup II; on the other hand, the constant regions of the two proteins appear to be identical. Comparison of the sequence of the heavy chain of He with the heavy chain sequences determined in other laboratories suggests that the variable region of subgroup II is at least 118 residues long. The nature and distribution of amino acid variations in this heavy chain subgroup resemble those observed in light chain subgroups. These studies provide evidence that the translocation hypothesis applies to heavy as well as to light chains, viz., genes for variable regions (V) are somatically translocated to genes for constant regions (C) to form complete VC structural genes. PMID:5264153

  5. The Quantification of Representative Sequences pipeline for amplicon sequencing: case study on within-population ITS1 sequence variation in a microparasite infecting Daphnia.

    PubMed

    González-Tortuero, E; Rusek, J; Petrusek, A; Gießler, S; Lyras, D; Grath, S; Castro-Monzón, F; Wolinska, J

    2015-11-01

    Next generation sequencing (NGS) platforms are replacing traditional molecular biology protocols like cloning and Sanger sequencing. However, accuracy of NGS platforms has rarely been measured when quantifying relative frequencies of genotypes or taxa within populations. Here we developed a new bioinformatic pipeline (QRS) that pools similar sequence variants and estimates their frequencies in NGS data sets from populations or communities. We tested whether the estimated frequency of representative sequences, generated by 454 amplicon sequencing, differs significantly from that obtained by Sanger sequencing of cloned PCR products. This was performed by analysing sequence variation of the highly variable first internal transcribed spacer (ITS1) of the ichthyosporean Caullerya mesnili, a microparasite of cladocerans of the genus Daphnia. This analysis also serves as a case example of the usage of this pipeline to study within-population variation. Additionally, a public Illumina data set was used to validate the pipeline on community-level data. Overall, there was a good correspondence in absolute frequencies of C. mesnili ITS1 sequences obtained from Sanger and 454 platforms. Furthermore, analyses of molecular variance (amova) revealed that population structure of C. mesnili differs across lakes and years independently of the sequencing platform. Our results support not only the usefulness of amplicon sequencing data for studies of within-population structure but also the successful application of the QRS pipeline on Illumina-generated data. The QRS pipeline is freely available together with its documentation under GNU Public Licence version 3 at http://code.google.com/p/quantification-representative-sequences.
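
    The general idea of pooling similar sequence variants and estimating their frequencies can be sketched as follows. This is a simplified illustration under stated assumptions, not the QRS algorithm: reads are assumed to be equal-length and pre-aligned, similarity is a plain Hamming-distance threshold, and the function names, threshold and toy reads are hypothetical.

```python
from collections import Counter


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def pool_variants(reads, max_dist=1):
    """Greedily pool reads into representatives and return their frequencies.

    Unique sequences are visited from most to least abundant; each one is
    merged into the first representative within max_dist mismatches, or
    becomes a new representative otherwise.
    """
    pooled = {}  # representative sequence -> pooled read count
    for seq, n in Counter(reads).most_common():
        for rep in pooled:
            if len(rep) == len(seq) and hamming(rep, seq) <= max_dist:
                pooled[rep] += n
                break
        else:
            pooled[seq] = n
    total = sum(pooled.values())
    return {rep: n / total for rep, n in pooled.items()}


# Hypothetical toy reads: a dominant variant, one likely sequencing error
# of it, and a second genuine variant.
reads = ["ACGT"] * 6 + ["ACGA"] + ["TTGT"] * 3
print(pool_variants(reads))
```

    Visiting abundant sequences first means rare near-identical reads, which are more likely to be sequencing errors, are absorbed into the common variant rather than the other way round.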

  6. Targeted Exome Sequencing Outcome Variations of Colorectal Tumors within and across Two Sequencing Platforms

    PubMed Central

    Ashktorab, Hassan; Azimi, Hamed; Nickerson, Michael L.; Bass, Sara; Varma, Sudhir; Brim, Hassan

    2016-01-01

    Background and Aim: Next generation sequencing (NGS) has quickly become the tool of choice for genome and exome data generation. The multitude of sequencing platforms as well as the variabilities within each platform need to be assessed. In this paper we used two platforms (Ion Torrent and Illumina) to assess single nucleotide variants (SNVs) in colorectal cancer (CRC) specimens. Methods: CRC specimens (n = 13) collected from 6 CRC (cancer and matched normal) patients were used to establish the mutational profile using Ion Torrent and Illumina sequencing platforms. We analyzed a set of samples from Formalin Fixed Paraffin Embedded (FFPE) and Fresh Frozen (FF) samples on both platforms to assess the effect of sample nature (FFPE vs. FF) on sequencing outcome and to evaluate the similarity/differences of SNVs across the two platforms. In addition, duplicates of FF samples were sequenced on each platform to assess variability within platform. Results: The comparison of FF replicates to each other gave a concordance of 77% (± 15.3%) in Ion Torrent and 70% (± 3.7%) in Illumina. FFPE vs. FF replicates gave a concordance of 40% (± 32%) in Ion Torrent and 49% (± 19%) in Illumina. Cross-platform concordance averaged 75% (± 9.8%) for FFPE samples and 67% (± 32%) for FF samples, with an overall average of 70% (± 26.8%). Conclusion: Our data show a significant variability within and across platforms. Also, the number of detected variants depends on the nature of the specimen (FF vs. FFPE). Validation of NGS-discovered mutations is a must to rule out false positive mutants. This validation might be performed either through a second NGS platform or through Sanger sequencing. PMID:27547838
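
    The abstract does not state how concordance was computed. As a minimal sketch, one common choice is the fraction of variants shared between two call sets out of all variants seen in either set; that definition, the function name and the toy variant calls below are assumptions for illustration only.

```python
def concordance(calls_a, calls_b):
    """Shared variants divided by all variants seen in either call set.

    Assumed definition (Jaccard index); the paper may use a different one.
    Each variant is a hashable key such as (chrom, pos, ref, alt).
    """
    a, b = set(calls_a), set(calls_b)
    union = a | b
    if not union:
        return 1.0  # two empty call sets agree trivially
    return len(a & b) / len(union)


# Hypothetical SNV calls for one sample on two platforms.
platform_1 = {("chr1", 100, "A", "G"), ("chr2", 200, "C", "T"), ("chr3", 300, "G", "A")}
platform_2 = {("chr1", 100, "A", "G"), ("chr2", 200, "C", "T"), ("chr4", 400, "T", "C")}
print(f"{concordance(platform_1, platform_2):.0%}")
```

    Averaging this value over sample pairs and reporting its standard deviation would give summary figures of the same form as those in the Results, although the numbers themselves depend on the definition used.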

  7. Sequence variation and methylation of the flax 5S RNA genes.

    PubMed Central

    Goldsbrough, P B; Ellis, T H; Lomonossoff, G P

    1982-01-01

    The complete sequence of the flax 5S DNA repeat is presented. Length heterogeneity is the consequence of the presence or absence of a single direct repeat and the majority of single base changes are transition mutations. No sequence variation has been found in the coding sequence. The extent of methylation of cytosines has been measured at one location in the gene and one in the spacer. The relationship between the observed sequence heterogeneity and the level of methylation is discussed in the context of the operation of a correction mechanism. PMID:6290983

  8. Molecular cloning and amino acid sequence of human 5-lipoxygenase

    SciTech Connect

    Matsumoto, T.; Funk, C.D.; Radmark, O.; Hoeoeg, J.O.; Joernvall, H.; Samuelsson, B.

    1988-01-01

    5-Lipoxygenase (EC 1.13.11.34), a Ca2+- and ATP-requiring enzyme, catalyzes the first two steps in the biosynthesis of the peptidoleukotrienes and the chemotactic factor leukotriene B4. A cDNA clone corresponding to 5-lipoxygenase was isolated from a human lung lambda gt11 expression library by immunoscreening with a polyclonal antibody. Additional clones from a human placenta lambda gt11 cDNA library were obtained by plaque hybridization with the 32P-labeled lung cDNA clone. Sequence data obtained from several overlapping clones indicate that the composite DNAs contain the complete coding region for the enzyme. From the deduced primary structure, 5-lipoxygenase encodes a 673 amino acid protein with a calculated molecular weight of 77,839. Direct analysis of the native protein and its proteolytic fragments confirmed the deduced composition, the amino-terminal amino acid sequence, and the structure of many internal segments. 5-Lipoxygenase has no apparent sequence homology with leukotriene A4 hydrolase or Ca2+-binding proteins. RNA blot analysis indicated substantial amounts of an mRNA species of approximately 2700 nucleotides in leukocytes, lung, and placenta.

  9. Nucleic acid sequence detection using multiplexed oligonucleotide PCR

    DOEpatents

    Nolan, John P.; White, P. Scott

    2006-12-26

    Methods for rapidly detecting single or multiple sequence alleles in a sample nucleic acid are described. Provided are all of the oligonucleotide pairs capable of annealing specifically to a target allele and discriminating among possible sequences thereof, and ligating to each other to form an oligonucleotide complex when a particular sequence feature is present (or, alternatively, absent) in the sample nucleic acid. The design of each oligonucleotide pair permits the subsequent high-level PCR amplification of a specific amplicon when the oligonucleotide complex is formed, but not when the oligonucleotide complex is not formed. The presence or absence of the specific amplicon is used to detect the allele. Detection of the specific amplicon may be achieved using a variety of methods well known in the art, including without limitation, oligonucleotide capture onto DNA chips or microarrays, oligonucleotide capture onto beads or microspheres, electrophoresis, and mass spectrometry. Various labels and address-capture tags may be employed in the amplicon detection step of multiplexed assays, as further described herein.

  10. The amino acid sequence of chymopapain from Carica papaya.

    PubMed Central

    Watson, D C; Yaguchi, M; Lynn, K R

    1990-01-01

    Chymopapain is a polypeptide of 218 amino acid residues. It has considerable structural similarity with papain and papaya proteinase omega, including conservation of the catalytic site and of the disulphide bonding. Chymopapain is like papaya proteinase omega in carrying four extra residues between papain positions 168 and 169, but differs from both papaya proteinases in the composition of its S2 subsite, as well as in having a second thiol group, Cys-117. Some evidence for the amino acid sequence of chymopapain has been deposited as Supplementary Publication SUP 50153 (12 pages) at the British Library Document Supply Centre, Boston Spa., Wetherby, West Yorkshire LS23 7BQ, U.K., from whom copies may be obtained on the terms indicated in Biochem. J. (1990) 265, 5. The information comprises Supplement Tables 1-4, which contain, in order, amino acid compositions of peptides from tryptic, peptic, CNBr and mild acid cleavages, Supplement Fig. 1, showing re-fractionation of selected peaks from Fig. 2 of the main paper. Supplement Fig. 2, showing cation-exchange chromatography of the earliest-eluted peak of Fig. 3 of the main paper, Supplement Fig. 3, showing reverse-phase h.p.l.c. of the later-eluted peak from Fig. 3 of the main paper, and Supplement Fig. 4, showing the separation of peptides after mild acid hydrolysis of CNBr-cleavage fragment CB3. PMID:2106878

  11. The amino acid sequence of rabbit cardiac troponin I.

    PubMed Central

    Grand, R J; Wilkinson, J M

    1976-01-01

    The complete amino acid sequence of troponin I from rabbit cardiac muscle was determined by the isolation of four unique CNBr fragments, together with overlapping tryptic peptides containing radioactive methionine residues. Overlap data for residues 35-36, 93-94 and 140-145 are incomplete, the sequence at these positions being based on homology with the sequence of the fast-skeletal-muscle protein. Cardiac troponin I is a single polypeptide chain of 206 residues with mol.wt. 23550 and an extinction coefficient, E(1%, 1 cm) at 280 nm, of 4.37. The protein has a net positive charge of 14 and is thus somewhat more basic than troponin I from fast-skeletal muscle. Comparison of the sequences of troponin I from cardiac and fast skeletal muscle shows that the cardiac protein has 26 extra residues at the N-terminus which account for the larger size of the protein. In the remainder of the sequence there is a considerable degree of homology, this being greater in the C-terminal two-thirds of the molecule. The region in the cardiac protein corresponding to the peptide with inhibitory activity from the fast-skeletal-muscle protein is very similar and it seems unlikely that this is the cause of the difference in inhibitory activity between the two proteins. The region responsible for binding troponin C, however, possesses a lower degree of homology. Detailed evidence on which the sequence is based has been deposited as Supplementary Publication SUP 50072 (20 pages), at the British Library Lending Division, Boston Spa, Wetherby, West Yorkshire LS23 7QB, U.K., from whom copies may be obtained on the terms given in Biochem. J. (1976) 153, 5. PMID:1008822

  12. Mathematical Characterization of Protein Sequences Using Patterns as Chemical Group Combinations of Amino Acids

    PubMed Central

    Choudhury, Pabitra Pal; Jana, Siddhartha Sankar

    2016-01-01

    Comparison of amino acid sequence similarity is the fundamental concept behind the protein phylogenetic tree formation. By virtue of this method, we can explain the evolutionary relationships, but further explanations are not possible unless sequences are studied through the chemical nature of individual amino acids. Here we develop a new methodology to characterize the protein sequences on the basis of the chemical nature of the amino acids. We design various algorithms for studying the variation of chemical group transitions and various chemical group combinations as patterns in the protein sequences. The amino acid sequence of conventional myosin II head domain of 14 family members are taken to illustrate this new approach. We find two blocks of maximum length 6 aa as ‘FPKATD’ and ‘Y/FTNEKL’ without repeating the same chemical nature and one block of maximum length 20 aa with the repetition of chemical nature which are common among all 14 members. We also check commonality with another motor protein sub-family kinesin, KIF1A. Based on our analysis we find a common block of length 8 aa both in myosin II and KIF1A. This motif is located in the neck linker region which could be responsible for the generation of mechanical force, enabling us to find the unique blocks which remain chemically conserved across the family. We also validate our methodology with different protein families such as MYOI, Myosin light chain kinase (MLCK) and Rho-associated protein kinase (ROCK), Na+/K+-ATPase and Ca2+-ATPase. Altogether, our studies provide a new methodology for investigating the conserved amino acids’ pattern in different proteins. PMID:27930687

  13. Amino acid sequence of a mouse immunoglobulin mu chain.

    PubMed Central

    Kehry, M; Sibley, C; Fuhrman, J; Schilling, J; Hood, L E

    1979-01-01

    The complete amino acid sequence of the mouse mu chain from the BALB/c myeloma tumor MOPC 104E is reported. The C mu region contains four consecutive homology regions of approximately 110 residues and a COOH-terminal region of 19 residues. A comparison of this mu chain from mouse with a complete mu sequence from human (Ou) and a partial mu chain sequence from dog (Moo) reveals a striking gradient of increasing homology from the NH2-terminal to the COOH-terminal portion of these mu chains, with the former being the least and the latter the most highly conserved. Four of the five sites of carbohydrate attachment appear to be at identical residue positions when the constant regions of the mouse and human mu chains are compared. The mu chain of MOPC 104E has a carbohydrate moiety attached in the second hypervariable region. This is particularly interesting in view of the fact that MOPC 104E binds alpha-(1→3)-dextran, a simple carbohydrate. The structural and functional constraints imposed by these comparative sequence analyses are discussed. PMID:111247

  14. Ultrasensitive nucleic acid sequence detection by single-molecule electrophoresis

    SciTech Connect

    Castro, A; Shera, E.B.

    1996-09-01

    This is the final report of a one-year laboratory-directed research and development project at Los Alamos National Laboratory. There has been considerable interest in the development of very sensitive clinical diagnostic techniques over the last few years. Many pathogenic agents are often present in extremely small concentrations in clinical samples, especially at the initial stages of infection, making their detection very difficult. This project sought to develop a new technique for the detection and accurate quantification of specific bacterial and viral nucleic acid sequences in clinical samples. The scheme involved the use of novel hybridization probes for the detection of nucleic acids combined with our recently developed technique of single-molecule electrophoresis. This project is directly relevant to the DOE's Defense Programs strategic directions in the area of biological warfare counter-proliferation.

  15. Homologous recombination drives both sequence diversity and gene content variation in Neisseria meningitidis.

    PubMed

    Kong, Ying; Ma, Jennifer H; Warren, Keisha; Tsang, Raymond S W; Low, Donald E; Jamieson, Frances B; Alexander, David C; Hao, Weilong

    2013-01-01

    The study of genetic and phenotypic variation is fundamental for understanding the dynamics of bacterial genome evolution and untangling the evolution and epidemiology of bacterial pathogens. Neisseria meningitidis (Nm) is among the most intriguing bacterial pathogens in genomic studies due to its dynamic population structure and complex forms of pathogenicity. Extensive genomic variation within identical clonal complexes (CCs) in Nm has been recently reported and suggested to be the result of homologous recombination, but the extent to which recombination contributes to genomic variation within identical CCs has remained unclear. In this study, we sequenced two Nm strains of identical serogroup (C) and multi-locus sequence type (ST60), and conducted a systematic analysis with an additional 34 Nm genomes. Our results revealed that all gene content variation between the two ST60 genomes was introduced by homologous recombination at the conserved flanking genes, and 94.25% or more of sequence divergence was caused by homologous recombination. Recombination was found in genes associated with virulence factors, antigenic outer membrane proteins, and vaccine targets, suggesting an important role of homologous recombination in rapidly altering the pathogenicity and antigenicity of Nm. Recombination was also evident in genes of the restriction and modification systems, which may undermine barriers to DNA exchange. In conclusion, homologous recombination can drive both gene content variation and sequence divergence in Nm. These findings shed new light on the understanding of the rapid pathoadaptive evolution of Nm and other recombinogenic bacterial pathogens.

  16. Sequence variation of ribosomal internal transcribed spacers (ITS) in commercially important Phytoseiidae mites.

    PubMed

    Navajas, M; Lagnel, J; Fauvel, G; de Moraes, G

    1999-11-01

    Preliminary work is needed to assess the usefulness of different markers at different taxonomic scales when a new group is analyzed, such as the commercially important Phytoseiidae mites. We investigate here the level of sequence variation of the nuclear ribosomal spacers ITS 1 and 2 and the 5.8S gene in six species of Phytoseiidae: Neoseiulus californicus, N. fallacis, Euseius concordis, Metaseiulus occidentalis, Typhlodromus pyri and Phytoseiulus persimilis. As expected, the 5.8S gene (148 base pairs) is markedly conserved and displays little variation in between-genera comparisons. ITS1 and ITS2 show contrasting patterns: while the ITS2 is short (80-89 bp) and shows little variation, the ITS1 is longer (303-404 bp) and is very variable in sequence. This fact compromises reliable nucleotide homologies when comparing the genera. The comparison of ITS1 sequence similarity at the species level might be useful for species identification; however, the value of ITS in taxonomic studies does not extend to the level of the family. The intraspecific variations of ITS were investigated in three species: N. californicus, N. fallacis and E. concordis. The first species has identical ITS1 sequences and the last two display low polymorphism (2 nucleotide substitutions). The ITS2 and 5.8S sequences were identical in all three subspecies comparisons.

  17. Characterization of genetic sequence variation of 58 STR loci in four major population groups.

    PubMed

    Novroski, Nicole M M; King, Jonathan L; Churchill, Jennifer D; Seah, Lay Hong; Budowle, Bruce

    2016-11-01

    Massively parallel sequencing (MPS) can identify sequence variation within short tandem repeat (STR) alleles as well as their nominal allele lengths that traditionally have been obtained by capillary electrophoresis. Using the MiSeq FGx Forensic Genomics System (Illumina), STRait Razor, and in-house Excel workbooks, genetic variation was characterized within STR repeat and flanking regions of 27 autosomal, 7 X-chromosome and 24 Y-chromosome STR markers in 777 unrelated individuals from four population groups. Seven hundred and forty-six autosomal, 227 X-chromosome, and 324 Y-chromosome STR alleles were identified by sequence compared with 357 autosomal, 107 X-chromosome, and 189 Y-chromosome STR alleles that were identified by length. Within the observed sequence variation, 227 autosomal, 156 X-chromosome, and 112 Y-chromosome novel alleles were identified and described. One hundred and seventy-six autosomal, 123 X-chromosome, and 93 Y-chromosome sequence variants resided within STR repeat regions, and 86 autosomal, 39 X-chromosome, and 20 Y-chromosome variants were located in STR flanking regions. Three markers, D18S51, DXS10135, and DYS385a-b, had 1, 4, and 1 alleles, respectively, which contained both a novel repeat region variant and a flanking sequence variant in the same nucleotide sequence. There were 50 markers that demonstrated a relative increase in diversity with the variant sequence alleles compared with those of traditional nominal length alleles. These population data illustrate the genetic variation that exists in the commonly used STR markers in the selected population samples and provide allele frequencies for statistical calculations related to STR profiling with MPS data.
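
    The difference between alleles "identified by sequence" and alleles "identified by length" can be illustrated with a small sketch. The repeat strings and the function name below are hypothetical and are not data or software from the study; the point is only that two alleles of equal length can still differ in sequence.

```python
from collections import defaultdict

def alleles_by_length(seqs):
    """Group sequences by length, as a length-based (sizing) assay would see them."""
    groups = defaultdict(set)
    for s in seqs:
        groups[len(s)].add(s)
    return dict(groups)

# Hypothetical repeat regions (illustrative only, not real STR alleles).
observed = [
    "TCTA" * 10,
    "TCTA" * 4 + "TCTG" + "TCTA" * 5,  # same length as above, different sequence
    "TCTA" * 11,
]

by_length = alleles_by_length(observed)
n_length_alleles = len(by_length)         # alleles distinguishable by length
n_sequence_alleles = len(set(observed))   # alleles distinguishable by sequence
```

    In this toy input, length-based grouping merges the first two strings into one allele, while sequence-based counting keeps them apart.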

  18. Major breeding plumage color differences of male ruffs (Philomachus pugnax) are not associated with coding sequence variation in the MC1R gene.

    PubMed

    Farrell, Lindsay L; Küpper, Clemens; Burke, Terry; Lank, David B

    2015-01-01

    Sequence variation in the melanocortin-1 receptor (MC1R) gene explains color morph variation in several species of birds and mammals. Ruffs (Philomachus pugnax) exhibit major dark/light color differences in melanin-based male breeding plumage, which is closely associated with alternative reproductive behavior. A previous study identified a microsatellite marker (Ppu020) near the MC1R locus associated with the presence/absence of ornamental plumage. We investigated whether coding sequence variation in the MC1R gene explains major dark/light plumage color variation and/or the presence/absence of ornamental plumage in ruffs. Among 821 bp of the MC1R coding region from 44 male ruffs we found 3 single nucleotide polymorphisms, representing 1 nonsynonymous and 2 synonymous amino acid substitutions. None were associated with major dark/light color differences or the presence/absence of ornamental plumage. At all amino acid sites known to be functionally important in other avian species with dark/light plumage color variation, ruffs were either monomorphic or the shared polymorphism did not coincide with color morph. Neither ornamental plumage color differences nor the presence/absence of ornamental plumage in ruffs is likely to be caused entirely by amino acid variation within the coding regions of the MC1R locus. Regulatory elements and structural variation at other loci may be involved in melanin expression and contribute to the extreme plumage polymorphism observed in this species.

  19. Using Next-Generation Sequencing for DNA Barcoding: Capturing Allelic Variation in ITS2

    PubMed Central

    Batovska, Jana; Cogan, Noel O. I.; Lynch, Stacey E.; Blacket, Mark J.

    2016-01-01

    Internal Transcribed Spacer 2 (ITS2) is a popular DNA barcoding marker; however, in some animal species it is hypervariable and therefore difficult to sequence with traditional methods. With next-generation sequencing (NGS) it is possible to sequence all gene variants despite the presence of single nucleotide polymorphisms (SNPs), insertions/deletions (indels), homopolymeric regions, and microsatellites. Our aim was to compare the performance of Sanger sequencing and NGS amplicon sequencing in characterizing ITS2 in 26 mosquito species represented by 88 samples. The suitability of ITS2 as a DNA barcoding marker for mosquitoes, and its allelic diversity in individuals and species, was also assessed. Compared to Sanger sequencing, NGS was able to characterize the ITS2 region to a greater extent, with resolution within and between individuals and species that was previously not possible. A total of 382 unique sequences (alleles) were generated from the 88 mosquito specimens, demonstrating the diversity present that has been overlooked by traditional sequencing methods. Multiple indels and microsatellites were present in the ITS2 alleles, which were often specific to species or genera, causing variation in sequence length. As a barcoding marker, ITS2 was able to separate all of the species, apart from members of the Culex pipiens complex, providing the same resolution as the commonly used Cytochrome Oxidase I (COI). The ability to cost-effectively sequence hypervariable markers makes NGS an invaluable tool with many applications in the DNA barcoding field, and provides insights into the limitations of previous studies and techniques. PMID:27799340

  20. A sparse model based detection of copy number variations from exome sequencing data

    PubMed Central

    Duan, Junbo; Wan, Mingxi; Deng, Hong-Wen; Wang, Yu-Ping

    2016-01-01

    Goal: Whole-exome sequencing provides a more cost-effective way than whole-genome sequencing for detecting genetic variants such as copy number variations (CNVs). Although a number of approaches have been proposed to detect CNVs from whole-genome sequencing, a direct adoption of these approaches to whole-exome sequencing will often fail because exons are separately located along a genome. Therefore, an appropriate method is needed to target the specific features of exome sequencing data. Methods: In this paper, a novel sparse-model-based method is proposed to discover CNVs from multiple exome sequencing data. First, exome sequencing data are represented with a penalized matrix approximation, and technical variability and random sequencing errors are assumed to follow a generalized Gaussian distribution. Second, an iteratively re-weighted least squares algorithm is used to estimate the solution. Results: The method is tested and validated on both synthetic and real data, and compared with other approaches including CoNIFER, XHMM and cn.MOPS. The test demonstrates that the proposed method outperforms other approaches. Conclusion: The proposed sparse model can detect CNVs from exome sequencing data with high power and precision. Significance: The sparse model can target the specific features of exome sequencing data. The software codes are freely available at http://www.tulane.edu/wyp/software/ExonCNV.m PMID:26258935

  1. Nucleic acid (cDNA) and amino acid sequences of alpha-type gliadins from wheat (Triticum aestivum).

    PubMed Central

    Kasarda, D D; Okita, T W; Bernardin, J E; Baecker, P A; Nimmo, C C; Lew, E J; Dietler, M D; Greene, F C

    1984-01-01

    The complete amino acid sequence for an alpha-type gliadin protein of wheat (Triticum aestivum Linnaeus) endosperm has been derived from a cloned cDNA sequence. An additional cDNA clone that corresponds to about 75% of a similar alpha-type gliadin has been sequenced and shows some important differences. About 97% of the composite sequence of A-gliadin (an alpha-type gliadin fraction) has also been obtained by direct amino acid sequencing. This sequence shows a high degree of similarity with amino acid sequences derived from both cDNA clones and is virtually identical to one of them. On the basis of sequence information, after loss of the signal sequence, the mature alpha-type gliadins may be divided into five different domains, two of which may have evolved from an ancestral gliadin gene, whereas the remaining three contain repeating sequences that may have developed independently. PMID:6589619

  2. Oxytocin receptor gene sequences in owl monkeys and other primates show remarkable interspecific regulatory and protein coding variation.

    PubMed

    Babb, Paul L; Fernandez-Duque, Eduardo; Schurr, Theodore G

    2015-10-01

    The oxytocin (OT) hormone pathway is involved in numerous physiological processes, and one of its receptor genes (OXTR) has been implicated in pair bonding behavior in mammalian lineages. This observation is important for understanding social monogamy in primates, which occurs in only a small subset of taxa, including Azara's owl monkey (Aotus azarae). To examine the potential relationship between social monogamy and OXTR variation, we sequenced its 5' regulatory (4936bp) and coding (1167bp) regions in 25 owl monkeys from the Argentinean Gran Chaco, and examined OXTR sequences from 1092 humans from the 1000 Genomes Project. We also assessed interspecific variation of OXTR in 25 primate and rodent species that represent a set of phylogenetically and behaviorally disparate taxa. Our analysis revealed substantial variation in the putative 5' regulatory region of OXTR, with marked structural differences across primate taxa, particularly for humans and chimpanzees, which exhibited unique patterns of large motifs of dinucleotide A+T repeats upstream of the OXTR 5' UTR. In addition, we observed a large number of amino acid substitutions in the OXTR CDS region among New World primate taxa that distinguish them from Old World primates. Furthermore, primate taxa traditionally defined as socially monogamous (e.g., gibbons, owl monkeys, titi monkeys, and saki monkeys) all exhibited different amino acid motifs for their respective OXTR protein coding sequences. These findings support the notion that monogamy has evolved independently in Old World and New World primates, and that it has done so through different molecular mechanisms, not exclusively through the oxytocin pathway.

  3. DNA-protein recognition and sequence-dependent variations of DNA conformational properties

    NASA Astrophysics Data System (ADS)

    Vologodskii, Alexander

    2015-03-01

    Parameters of B-DNA, the major form of the double helix, depend on its sequence. This dependence can contribute to the recognition of specific DNA sequences by proteins. Here we try to analyze this contribution quantitatively. In the first approach to this goal we used experimental data on the sequence dependence of DNA bending rigidity and its helical repeat. The solution data on these parameters of B-DNA were derived from the experiments on cyclization of short DNA fragments with specially designed sequences. The data allowed calculating the sequence variations of DNA bending energy, as well as the variations of the energy of torsional deformation of the double helix associated with a protein binding. The results show that DNA conformational parameters can have very limited influence on the sequence specificity of protein binding. In the second approach we analyzed the experimental data on the binding affinity of the nucleosome core with DNA fragments of different sequences. The conclusions derived in these two approaches are in a good agreement with one another.

  4. Structural gene and complete amino acid sequence of Vibrio alginolyticus collagenase.

    PubMed Central

    Takeuchi, H; Shibano, Y; Morihara, K; Fukushima, J; Inami, S; Keil, B; Gilles, A M; Kawamoto, S; Okuda, K

    1992-01-01

    The DNA encoding the collagenase of Vibrio alginolyticus was cloned, and its complete nucleotide sequence was determined. When the cloned gene was ligated to pUC18, the Escherichia coli expression vector, bacteria carrying the gene exhibited both collagenase antigen and collagenase activity. The open reading frame from the ATG initiation codon was 2442 bp in length for the collagenase structural gene. The amino acid sequence, deduced from the nucleotide sequence, revealed that the mature collagenase consists of 739 amino acids with an Mr of 81875. The amino acid sequences of 20 polypeptide fragments were completely identical with the deduced amino acid sequences of the collagenase gene. The amino acid composition predicted from the DNA sequence was similar to the chemically determined composition of purified collagenase reported previously. The analyses of both the DNA and amino acid sequences of the collagenase gene were rigorously performed, but we could not detect any significant sequence similarity to other collagenases. PMID:1311172

  5. Whole-genome sequencing reveals the diversity of cattle copy number variations and multicopy genes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Structural and functional impacts of copy number variations (CNVs) on livestock genomes are not yet well understood. We identified 1853 CNV regions using population-scale sequencing data generated from 75 cattle representing 8 breeds (Angus, Brahman, Gir, Holstein, Jersey, Limousin, Nelore, Romagnol...

  6. Sequence analysis of pooled bacterial samples enables identification of strain variation in group A streptococcus

    PubMed Central

    Weldatsadik, Rigbe G.; Wang, Jingwen; Puhakainen, Kai; Jiao, Hong; Jalava, Jari; Räisänen, Kati; Datta, Neeta; Skoog, Tiina; Vuopio, Jaana; Jokiranta, T. Sakari; Kere, Juha

    2017-01-01

    Knowledge of the genomic variation among different strains of a pathogenic microbial species can help in selecting optimal candidates for diagnostic assays and vaccine development. Pooled sequencing (Pool-seq) is a cost-effective approach for population-level genetic studies that require large numbers of samples, such as various strains of a microbe. To test the use of Pool-seq in identifying variation, we pooled DNA of 100 Streptococcus pyogenes strains of different emm types in two pools, each containing 50 strains. We used four variant calling tools (Freebayes, UnifiedGenotyper, SNVer, and SAMtools) and one emm1 strain, SF370, as a reference genome. In total, 63719 SNPs and 164 INDELs were identified in the two pools concordantly by at least two of the tools. The majority of the variants (93.4%) from six individually sequenced strains used in the pools could be identified from the two pools, and 72.3% and 97.4% of the variants in the pools could be mined from the analysis of the 44 complete S. pyogenes genomes and 3407 sequence runs deposited in the European Nucleotide Archive, respectively. We conclude that DNA sequencing of pooled samples of large numbers of bacterial strains is a robust, rapid and cost-efficient way to discover sequence variation. PMID:28361960
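
    The "concordantly by at least two of the tools" criterion can be sketched as a simple vote over call sets. This is a minimal illustration, not the authors' pipeline: the variant tuples are invented, the function name is hypothetical, and real call sets would first need normalizing to a common representation.

```python
from collections import Counter

def concordant_variants(calls_by_tool, min_tools=2):
    """Return variants reported by at least `min_tools` of the callers."""
    counts = Counter()
    for variants in calls_by_tool.values():
        counts.update(set(variants))  # each tool votes once per variant
    return {v for v, n in counts.items() if n >= min_tools}

# Invented calls: (contig, position, ref, alt). Not data from the study.
calls = {
    "freebayes": {("ref", 101, "A", "G"), ("ref", 250, "C", "T")},
    "unifiedgenotyper": {("ref", 101, "A", "G")},
    "snver": {("ref", 250, "C", "T"), ("ref", 300, "G", "A")},
    "samtools": set(),
}

kept = concordant_variants(calls)
```

    Here the variant at position 300 is dropped because only one caller reports it.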

  7. Individual variation and intraclass correlation in arachidonic acid and eicosapentaenoic acid in chicken muscle

    PubMed Central

    2010-01-01

    Chicken meat with reduced concentration of arachidonic acid (AA) and reduced ratio between omega-6 and omega-3 fatty acids has potential health benefits because a reduction in AA intake dampens prostanoid signaling, and the proportion between omega-6 and omega-3 fatty acids is too high in our diet. Analyses for fatty acid determination are expensive, and finding the optimal number of analyses to give reliable results is a challenge. The objective of the present study was i) to analyse the intraclass correlation of different fatty acids in five meat samples, of one gram each, within the same chicken thigh, and ii) to study individual variations in the concentrations of a range of fatty acids and the ratio between omega-6 and omega-3 fatty acid concentrations among fifteen chickens. Fifteen newly hatched broilers were fed a wheat-based diet containing 4% rapeseed oil and 1% linseed oil for three weeks. Five muscle samples from the mid location of the thigh of each chicken were analysed for fatty acid composition. The intraclass correlation (sample correlation within the same animal) was 0.85-0.98 for the ratios of total omega-6 to total omega-3 fatty acids and of AA to eicosapentaenoic acid (EPA). This indicates that when studying these fatty acid ratios, one sample of one gram per animal is sufficient. However, due to the high individual variation between chickens for these ratios, a relatively high number of animals (minimum 15) are required to obtain a sufficiently high power to reveal significant effects of experimental factors (e.g. feeding regimes). The present experiment resulted in meat with a favorable concentration ratio between omega-6 and omega-3 fatty acids. The AA concentration varied from 1.5 to 2.8 g/100 g total fatty acids in thigh muscle in the fifteen broilers, and the ratio between AA and EPA concentrations ranged from 2.3 to 3.9. These differences among the birds may be due to genetic variance that can be exploited by breeding for lower AA
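
    An intraclass correlation of this kind can be computed from a one-way analysis of variance. The abstract does not say which estimator was used, so the sketch below assumes the common one-way random-effects form, ICC = (MSB - MSW) / (MSB + (k - 1) MSW), for a balanced design with k samples per animal. The numbers are invented and the function name is hypothetical.

```python
def icc_oneway(groups):
    """One-way random-effects ICC for a balanced design.

    `groups` is a list of equal-length lists, one list of repeated
    measurements per animal.
    """
    n = len(groups)       # number of animals
    k = len(groups[0])    # samples per animal
    grand = sum(sum(g) for g in groups) / (n * k)
    means = [sum(g) / k for g in groups]
    # Between-animal and within-animal mean squares.
    msb = k * sum((m - grand) ** 2 for m in means) / (n - 1)
    msw = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g) / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# Invented AA:EPA ratios, three samples from each of three birds.
ratios = [[2.3, 2.4, 2.3], [3.1, 3.0, 3.2], [3.9, 3.8, 3.9]]
icc = icc_oneway(ratios)
```

    When samples from the same animal agree closely relative to the spread between animals, the value approaches 1, which is the situation in which a single sample per animal is informative.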

  8. Multi-Sample Pooling and Illumina Genome Analyzer Sequencing Methods to Determine Gene Sequence Variation for Database Development

    PubMed Central

    Margraf, Rebecca L.; Durtschi, Jacob D.; Dames, Shale; Pattison, David C.; Stephens, Jack E.; Mao, Rong; Voelkerding, Karl V.

    2010-01-01

    Determination of sequence variation within a genetic locus to develop clinically relevant databases is critical for molecular assay design and clinical test interpretation, so multisample pooling for Illumina genome analyzer (GA) sequencing was investigated using the RET proto-oncogene as a model. Samples were Sanger-sequenced for RET exons 10, 11, and 13–16. Ten samples with 13 known unique variants (“singleton variants” within the pool) and seven common changes were amplified and then equimolar-pooled before sequencing on a single flow cell lane, generating 36 base reads. For comparison, a single “control” sample was run in a different lane. After alignment, a 24-base quality score-screening threshold and 3' read end trimming of three bases yielded low background error rates with a 27% decrease in aligned read coverage. Sequencing data were evaluated using an established variant detection method (percent variant reads), by the presented subtractive correction method, and with SNPSeeker software. In total, 41 variants (of which 23 were singleton variants) were detected in the 10 pool data, which included all Sanger-identified variants. The 23 singleton variants were detected near the expected 5% allele frequency (average 5.17%±0.90% variant reads), well above the highest background error (1.25%). Based on background error rates, read coverage, simulated 30, 40, and 50 sample pool data, expected singleton allele frequencies within pools, and variant detection methods; ≥30 samples (which demonstrated a minimum 1% variant reads for singletons) could be pooled to reliably detect singleton variants by GA sequencing. PMID:20808642
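
    The expected 5% figure for a 10-sample pool follows from simple arithmetic if each singleton is assumed to be heterozygous in exactly one diploid sample and the pooling is perfectly equimolar: one variant allele among 2 x 10 alleles. The sketch below applies the same assumption to the pool sizes named in the abstract; the function name is hypothetical.

```python
def expected_singleton_fraction(n_samples, ploidy=2):
    """Expected variant-read fraction for a variant carried on one allele
    of one sample in an ideal equimolar pool (an idealizing assumption)."""
    return 1.0 / (ploidy * n_samples)

# Pool sizes mentioned in the abstract: 10 sequenced, 30/40/50 simulated.
expected = {n: expected_singleton_fraction(n) for n in (10, 30, 40, 50)}
for n, frac in expected.items():
    print(f"{n:>2} samples: {100 * frac:.2f}% expected variant reads")
```

    Under these assumptions the expected fraction shrinks toward the reported background error as the pool grows, which is why pool size has to be weighed against error rate and read coverage.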

  9. Genetic variation assessment of acid lime accessions collected from south of Iran using SSR and ISSR molecular markers.

    PubMed

    Sharafi, Ata Allah; Abkenar, Asad Asadi; Sharafi, Ali; Masaeli, Mohammad

    2016-01-01

    Iran has a long history of acid lime cultivation and propagation. In this study, genetic variation in 28 acid lime accessions from five regions of the south of Iran, and their relatedness to 19 other citrus cultivars, were analyzed using Simple Sequence Repeat (SSR) and Inter-Simple Sequence Repeat (ISSR) molecular markers. Nine SSR primers and nine ISSR primers were used for allele scoring. In total, 49 SSR and 131 ISSR polymorphic alleles were detected. Cluster analysis of SSR and ISSR data showed that most of the acid lime accessions (19 genotypes) have a hybrid origin and are genetically distant from nucellar Mexican lime (9 genotypes). As nucellar Mexican limes are susceptible to phytoplasma, these acid lime genotypes can be used to evaluate their tolerance against biotic constraints such as lime "witches' broom disease".

  10. LOESS correction for length variation in gene set-based genomic sequence analysis

    PubMed Central

    Aboukhalil, Anton; Bulyk, Martha L.

    2012-01-01

    Motivation: Sequence analysis algorithms are often applied to sets of DNA, RNA or protein sequences to identify common or distinguishing features. Controlling for sequence length variation is critical to properly score sequence features and identify true biological signals rather than length-dependent artifacts. Results: Several cis-regulatory module discovery algorithms exhibit a substantial dependence between DNA sequence score and sequence length. Our newly developed LOESS method is flexible in capturing diverse score-length relationships and is more effective in correcting DNA sequence scores for length-dependent artifacts, compared with four other approaches. Application of this method to genes co-expressed during Drosophila melanogaster embryonic mesoderm development or neural development scored by the Lever motif analysis algorithm resulted in successful recovery of their biologically validated cis-regulatory codes. The LOESS length-correction method is broadly applicable, and may be useful not only for more accurate inference of cis-regulatory codes, but also for detection of other types of patterns in biological sequences. Availability: Source code and compiled code are available from http://thebrain.bwh.harvard.edu/LM_LOESS/ Contact: mlbulyk@receptor.med.harvard.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:22492312
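
    The general idea of a LOESS length correction can be sketched as fitting a smooth score-versus-length trend and keeping the residual. The code below is a simplified stand-in, not the authors' LM_LOESS implementation: it uses a plain local linear fit with tricube weights, subtracts the trend (whether the published method subtracts or rescales is not stated in the abstract), and all names and data are hypothetical.

```python
def loess_fit(x, y, frac=0.5):
    """Local linear regression with tricube weights, evaluated at each x[i]."""
    n = len(x)
    k = max(2, int(round(frac * n)))  # number of neighbours in each window
    fitted = []
    for x0 in x:
        dists = sorted(abs(xi - x0) for xi in x)
        h = dists[k - 1] or 1e-12     # window half-width (guard against zero)
        w = [(1 - min(abs(xi - x0) / h, 1.0) ** 3) ** 3 for xi in x]
        sw = sum(w)
        mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
        my = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(my + slope * (x0 - mx))
    return fitted

def length_corrected(lengths, scores, frac=0.5):
    """Subtract the fitted score-length trend from each score."""
    trend = loess_fit(lengths, scores, frac)
    return [s - t for s, t in zip(scores, trend)]

# Invented example: scores that depend only on sequence length.
lengths = [100 * i for i in range(1, 11)]
scores = [0.01 * L + 2.0 for L in lengths]
residuals = length_corrected(lengths, scores)
```

    For scores that are purely a function of length, as in this toy input, the corrected values are essentially zero, so any remaining signal in real data would not be attributable to the fitted length trend.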

  11. Rice pseudomolecule-anchored cross-species DNA sequence alignments indicate regional genomic variation in expressed sequence conservation

    PubMed Central

    Armstead, Ian; Huang, Lin; King, Julie; Ougham, Helen; Thomas, Howard; King, Ian

    2007-01-01

    Background: Various methods have been developed to explore inter-genomic relationships among plant species. Here, we present a sequence similarity analysis based upon comparison of transcript-assembly and methylation-filtered databases from five plant species and physically anchored rice coding sequences. Results: A comparison of the frequency of sequence alignments, determined by MegaBLAST, between rice coding sequences in TIGR pseudomolecules and annotations vs 4.0 and comprehensive transcript-assembly and methylation-filtered databases from Lolium perenne (ryegrass), Zea mays (maize), Hordeum vulgare (barley), Glycine max (soybean) and Arabidopsis thaliana (thale cress) was undertaken. Each rice pseudomolecule was divided into 10 segments, each containing 10% of the functionally annotated, expressed genes. This indicated a correlation between relative segment position in the rice genome and numbers of alignments with all the queried monocot and dicot plant databases. Colour-coded moving windows of 100 functionally annotated, expressed genes along each pseudomolecule were used to generate 'heat-maps'. These revealed consistent intra- and inter-pseudomolecule variation in the relative concentrations of significant alignments with the tested plant databases. Analysis of the annotations and derived putative expression patterns of rice genes from 'hot-spots' and 'cold-spots' within the heat maps indicated possible functional differences. A similar comparison relating to ancestral duplications of the rice genome indicated that duplications were often associated with 'hot-spots'. Conclusion: Physical positions of expressed genes in the rice genome are correlated with the degree of conservation of similar sequences in the transcriptomes of other plant species. This relative conservation is associated with the distribution of different sized gene families and segmentally duplicated loci and may have functional and evolutionary implications. PMID:17708759

  12. Identification of staphylococcal species based on variations in protein sequences (mass spectrometry) and DNA sequence (sodA microarray).

    PubMed

    Kooken, Jennifer; Fox, Karen; Fox, Alvin; Altomare, Diego; Creek, Kim; Wunschel, David; Pajares-Merino, Sara; Martínez-Ballesteros, Ilargi; Garaizar, Javier; Oyarzabal, Omar; Samadpour, Mansour

    2014-02-01

    This report is among the first using sequence variation in newly discovered protein markers for staphylococcal (or indeed any other bacterial) speciation. Variation, at the DNA sequence level, in the sodA gene (commonly used for staphylococcal speciation) provided excellent correlation. Relatedness among strains was also assessed by protein profiling using microcapillary electrophoresis and pulsed field electrophoresis. A total of 64 strains were analyzed, including reference strains representing the 11 staphylococcal species most commonly isolated from man (Staphylococcus aureus and 10 coagulase negative species [CoNS]). Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI TOF MS) and liquid chromatography-electrospray ionization tandem mass spectrometry (LC ESI MS/MS) were used for peptide analysis of proteins isolated from gel bands. Comparison of experimental spectra of unknowns versus spectra of peptides derived from reference strains allowed bacterial identification after MALDI TOF MS analysis. After LC-MS/MS analysis of gel bands, bacterial speciation was performed by comparing experimental spectra versus virtual spectra using the software X!Tandem. Finally, LC-MS/MS was performed on whole proteomes, with data analysis also employing X!Tandem. Aconitate hydratase and oxoglutarate dehydrogenase served as marker proteins on focused analysis after gel separation. Alternatively, on full proteomics analysis, elongation factor Tu generally provided the highest confidence in staphylococcal speciation.

  13. Nucleic acid (cDNA) and amino acid sequences of the maize endosperm protein glutelin-2.

    PubMed Central

    Prat, S; Cortadas, J; Puigdomènech, P; Palau, J

    1985-01-01

    The cDNA coding for a glutelin-2 protein from maize endosperm has been cloned and the complete amino acid sequence of the protein derived for the first time. An immature maize endosperm cDNA bank was screened for the expression of a beta-lactamase:glutelin-2 (G2) fusion polypeptide by using antibodies against the purified 28 kd G2 protein. A clone corresponding to the 28 kd G2 protein was sequenced and the primary structure of this protein was derived. Five regions can be defined in the protein sequence: an 11 residue N-terminal part, a repeated region formed by eight units of the sequence Pro-Pro-Pro-Val-His-Leu, an alternating Pro-X stretch 21 residues long, a Cys rich domain and a C-terminal part rich in Gln. The protein sequence is preceded by 19 residues which have the characteristics of the signal peptide found in secreted proteins. Unlike zeins, the main maize storage proteins, 28 kd glutelin-2 has several homologous sequences in common with other cereal storage proteins. PMID:3839076

  14. CODEX: a normalization and copy number variation detection method for whole exome sequencing.

    PubMed

    Jiang, Yuchao; Oldridge, Derek A; Diskin, Sharon J; Zhang, Nancy R

    2015-03-31

    High-throughput sequencing of DNA coding regions has become a common way of assaying genomic variation in the study of human diseases. Copy number variation (CNV) is an important type of genomic variation, but detecting and characterizing CNV from exome sequencing is challenging due to the high level of biases and artifacts. We propose CODEX, a normalization and CNV calling procedure for whole exome sequencing data. The Poisson latent factor model in CODEX includes terms that specifically remove biases due to GC content, exon capture and amplification efficiency, and latent systemic artifacts. CODEX also includes a Poisson likelihood-based recursive segmentation procedure that explicitly models the count-based exome sequencing data. CODEX is compared to existing methods on a population analysis of HapMap samples from the 1000 Genomes Project, and shown to be more accurate on three microarray-based validation data sets. We further evaluate performance on 222 neuroblastoma samples with matched normals and focus on a well-studied rare somatic CNV within the ATRX gene. We show that the cross-sample normalization procedure of CODEX removes more noise than normalizing the tumor against the matched normal and that the segmentation procedure performs well in detecting CNVs with nested structures.

  15. Limited HLA sequence variation outside of antigen recognition domain exons of 360 10 of 10 matched unrelated hematopoietic stem cell transplant donor-recipient pairs.

    PubMed

    Hou, L; Vierra-Green, C; Lazaro, A; Brady, C; Haagenson, M; Spellman, S; Hurley, C K

    2017-01-01

    Traditional DNA-based typing focuses primarily on interrogating the exons of human leukocyte antigen (HLA) genes that form the antigen recognition domain (ARD). The relevance of mismatching donor and recipient for HLA variation outside the ARD on hematopoietic stem cell transplantation (HSCT) outcomes is unknown. This study was designed to evaluate the frequency of variation outside the ARD in 10 of 10 (HLA-A, -B, -C, -DRB1, -DQB1) matched unrelated donor transplant pairs (n = 360). Next-generation DNA sequencing was used to characterize both HLA exons and introns for HLA-A, -B, -C alleles; exons 2, 3 and the intervening intron for HLA-DRB1 and exons only for HLA-DQA1 and -DQB1. Over 97% of alleles at each locus were matched for their nucleotide sequence outside of the ARD exons. Of the 4320 allele comparisons overall, only 17 allele pairs were mismatched for non-ARD exons, 41 for noncoding regions and 9 for ARD exons. The observed variation between donor and recipient usually involved a single nucleotide difference (88% of mismatches); 88% of the non-ARD exon variants impacted the amino acid sequence. The impact of amino acid sequence variation caused by substitutions in exons outside ARD regions in D-R pairs will be difficult to assess in HSCT outcome studies because these mismatches do not occur very frequently.

  16. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2010-07-01 2010-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide...

  17. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2012-07-01 2012-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide...

  18. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2014-07-01 2014-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide...

  19. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2011-07-01 2011-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide...

  20. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 37 Patents, Trademarks, and Copyrights 1 2013-07-01 2013-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide...

  1. Sequence-Based Pronunciation Variation Modeling for Spontaneous ASR Using a Noisy Channel Approach

    NASA Astrophysics Data System (ADS)

    Hofmann, Hansjörg; Sakti, Sakriani; Hori, Chiori; Kashioka, Hideki; Nakamura, Satoshi; Minker, Wolfgang

    The performance of English automatic speech recognition systems decreases when recognizing spontaneous speech mainly due to multiple pronunciation variants in the utterances. Previous approaches address this problem by modeling the alteration of the pronunciation on a phoneme to phoneme level. However, the phonetic transformation effects induced by the pronunciation of the whole sentence have not yet been considered. In this article, the sequence-based pronunciation variation is modeled using a noisy channel approach where the spontaneous phoneme sequence is considered as a “noisy” string and the goal is to recover the “clean” string of the word sequence. Hereby, the whole word sequence and its effect on the alteration of the phonemes will be taken into consideration. Moreover, the system not only learns the phoneme transformation but also the mapping from the phoneme to the word directly. In this study, first the phonemes will be recognized with the present recognition system and afterwards the pronunciation variation model based on the noisy channel approach will map from the phoneme to the word level. Two well-known natural language processing approaches are adopted and derived from the noisy channel model theory: joint-sequence models and statistical machine translation. Both of them are applied and various experiments are conducted using microphone and telephone recordings of spontaneous speech.
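
    The noisy channel formulation can be made concrete with a toy decoder. This is only a sketch of the general principle (choose the clean string that maximizes prior times channel likelihood), not the joint-sequence or machine translation models used in the article. The function `decode`, the probability tables and all numbers are hypothetical.

```python
import math


def decode(noisy, language_model, channel_model):
    """Return the clean candidate maximizing P(clean) * P(noisy | clean).

    language_model: dict mapping clean string -> prior probability
    channel_model:  dict mapping (noisy, clean) -> P(noisy | clean)
    Both tables are illustrative stand-ins for trained models.
    """
    best, best_score = None, -math.inf
    for clean, prior in language_model.items():
        likelihood = channel_model.get((noisy, clean), 0.0)
        if prior <= 0.0 or likelihood <= 0.0:
            continue  # candidate cannot explain the observation
        score = math.log(prior) + math.log(likelihood)
        if score > best_score:
            best, best_score = clean, score
    return best


# Hypothetical priors over word sequences and channel probabilities
# for one observed spontaneous phoneme string.
lm = {"going to": 0.6, "gonna": 0.1, "gone": 0.3}
channel = {
    ("g ah n ah", "going to"): 0.3,
    ("g ah n ah", "gonna"): 0.9,
    ("g ah n ah", "gone"): 0.05,
}
print(decode("g ah n ah", lm, channel))  # prints "going to"
```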

  2. FEATnotator: A tool for integrated annotation of sequence features and variation, facilitating interpretation in genomics experiments.

    PubMed

    Podicheti, Ram; Mockaitis, Keithanne

    2015-06-01

    As approaches are sought for more efficient and democratized uses of non-model and expanded model genomics references, ease of integration of genomic feature datasets is especially desirable in multidisciplinary research communities. Valuable conclusions are often missed or slowed when researchers refer experimental results to a single reference sequence that lacks integrated pan-genomic and multi-experiment data in accessible formats. Association of genomic positional information, such as results from an expansive variety of next-generation sequencing experiments, with annotated reference features such as genes or predicted protein binding sites, provides the context essential for conclusions and ongoing research. When the experimental system includes polymorphic genomic inputs, rapid calculation of gene structural and protein translational effects of sequence variation from the reference can be invaluable. Here we present FEATnotator, a lightweight, fast and easy to use open source software program that integrates and reports overlap and proximity in genomic information from any user-defined datasets including those from next generation sequencing applications. We illustrate use of the tool by summarizing whole genome sequence variation of a widely used natural isolate of Arabidopsis thaliana in the context of gene models of the reference accession. Previous discovery of a protein coding deletion influencing root development is replicated rapidly. Appropriate even in investigations of a single gene or genic regions such as QTL, comprehensive reports provided by FEATnotator better prepare researchers for interpretation of their experimental results. The tool is available for download at http://featnotator.sourceforge.net.
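
    The kind of overlap and proximity report described above can be sketched in a few lines. This is not FEATnotator's code or file format: the function `annotate`, the tuple layouts and the 1000 bp default window are assumptions made for illustration.

```python
def annotate(positions, features, window=1000):
    """Report features that each position overlaps or lies near.

    positions: list of (chrom, pos), 1-based coordinates
    features:  list of (chrom, start, end, name), 1-based inclusive
    window:    largest distance in bp still reported as "near"
    Returns a list of (chrom, pos, name, relation, distance) tuples.
    """
    report = []
    for chrom, pos in positions:
        for f_chrom, start, end, name in features:
            if f_chrom != chrom:
                continue
            if start <= pos <= end:
                report.append((chrom, pos, name, "overlap", 0))
            else:
                # Distance to the nearer feature boundary.
                dist = start - pos if pos < start else pos - end
                if dist <= window:
                    report.append((chrom, pos, name, "near", dist))
    return report


genes = [("chr1", 100, 200, "geneA"), ("chr1", 5000, 6000, "geneB")]
variants = [("chr1", 150), ("chr1", 250), ("chr2", 150)]
for row in annotate(variants, genes):
    print(row)
```

    The nested loop is quadratic and only suits small inputs; sorted intervals or an interval tree would be the usual choice for genome-scale data.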

  3. Simulated seasonal variations in wet acid depositions over East Asia.

    PubMed

    Ge, Cui; Zhang, Meigen; Zhu, Lingyun; Han, Xiao; Wang, Jun

    2011-11-01

    The air quality modeling system Regional Atmospheric Modeling System-Community Multi-scale Air Quality (RAMS-CMAQ) was applied to analyze temporospatial variations in wet acid deposition over East Asia in 2005, and model results obtained on a monthly basis were evaluated against extensive observations, including precipitation amounts at 704 stations and SO4(2-), NO3-, and NH4+ concentrations in the atmosphere and rainwater at 18 EANET (the Acid Deposition Monitoring Network in East Asia) stations. The comparison shows that the modeling system can reasonably reproduce seasonal precipitation patterns, especially the extensive area of dry conditions in northeast China and north China and the major precipitation zones. For ambient concentrations and wet depositions, the simulated results are in reasonable agreement (within a factor of 2) with observations in most cases, and the major observed features are mostly well reproduced. The analysis of modeled wet deposition distributions indicates that East Asia experiences noticeable variations in its wet deposition patterns throughout the year. In winter, southern China and the coastal areas of the Japan Sea report higher SO4(2-) and NO3- wet depositions. In spring, elevated SO4(2-) and NO3- wet depositions are found in northeastern China, southern China, and around the Yangtze River. In summer, a remarkable rise in precipitation in northeastern China, the valleys of the Huaihe and Yangtze rivers, Korea, and Japan leads to a noticeable increase in SO4(2-) and NO3- wet depositions, whereas in autumn, higher SO4(2-) and NO3- wet depositions are found around Sichuan Province. Meanwhile, due to the high emission of SO2, high wet depositions of SO4(2-) are found throughout the entire year in the area surrounding Sichuan Province. There is a tendency toward decreasing NO3- concentrations in rainwater from China through Korea to Japan in both observed and simulated results, which is a consequence of the influence of the continental

  4. Genome-wide copy number variation in the bovine genome detected using low coverage sequence of popular beef breeds

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Genomic structural variations are an important source of genetic diversity. Copy number variations (CNVs), gains and losses of large regions of genomic sequence between individuals of a species, are known to be associated with both diseases and phenotypic traits. Deeply sequenced genomes are often u...

  5. Human liver apolipoprotein B-100 cDNA: complete nucleic acid and derived amino acid sequence.

    PubMed Central

    Law, S W; Grant, S M; Higuchi, K; Hospattankar, A; Lackner, K; Lee, N; Brewer, H B

    1986-01-01

    Human apolipoprotein B-100 (apoB-100), the ligand on low density lipoproteins that interacts with the low density lipoprotein receptor and initiates receptor-mediated endocytosis and low density lipoprotein catabolism, has been cloned, and the complete nucleic acid and derived amino acid sequences have been determined. ApoB-100 cDNAs were isolated from normal human liver cDNA libraries utilizing immunoscreening as well as filter hybridization with radiolabeled apoB-100 oligodeoxynucleotides. The apoB-100 mRNA is 14.1 kilobases long encoding a mature apoB-100 protein of 4536 amino acids with a calculated amino acid molecular weight of 512,723. ApoB-100 contains 20 potential glycosylation sites, and 12 of a total of 25 cysteine residues are located in the amino-terminal region of the apolipoprotein providing a potential globular structure of the amino terminus of the protein. ApoB-100 contains relatively few regions of amphipathic helices, but compared to other human apolipoproteins it is enriched in beta-structure. The delineation of the entire human apoB-100 sequence will now permit a detailed analysis of the conformation of the protein, the low density lipoprotein receptor binding domain(s), and the structural relationship between apoB-100 and apoB-48 and will provide the basis for the study of genetic defects in apoB-100 in patients with dyslipoproteinemias. PMID:3464946

  6. Oleic Acid: Natural variation and potential enhancement in oilseed crops.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Oleic acid is a monounsaturated omega 9 fatty acid (MUFA, C18:1) which can be found in various plant lipids and animal fats. Unlike omega 3 (a-linolenic acid, C18:3) and omega 6 (linoleic acid, C18:2) fatty acids which are essential because they cannot be synthesized by humans and must be obtained f...

  7. Computer selection of oligonucleotide probes from amino acid sequences for use in gene library screening.

    PubMed

    Yang, J H; Ye, J H; Wallace, D C

    1984-01-11

    We present a computer program, FINPROBE, which utilizes known amino acid sequence data to deduce minimum redundancy oligonucleotide probes for use in screening cDNA or genomic libraries or in primer extension. The user enters the amino acid sequence of interest, the desired probe length, the number of probes sought, and the constraints on oligonucleotide synthesis. The computer generates a table of possible probes listed in increasing order of redundancy and provides the location of each probe in the protein and mRNA coding sequence. Activation of a next function provides the amino acid and mRNA sequences of each probe of interest as well as the complementary sequence and the minimum dissociation temperature of the probe. A final routine prints out the amino acid sequence of the protein in parallel with the mRNA sequence listing all possible codons for each amino acid.
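
    The ranking of candidate probes by redundancy can be sketched as follows. This is not FINPROBE itself: the function `rank_probes` is hypothetical, probes are assumed to cover whole codons only, and synthesis constraints and dissociation temperatures are ignored. Redundancy is taken as the product of the number of codons that encode each residue under the standard genetic code.

```python
# Number of codons per amino acid in the standard genetic code.
CODON_COUNT = {
    "A": 4, "R": 6, "N": 2, "D": 2, "C": 2, "Q": 2, "E": 2, "G": 4,
    "H": 2, "I": 3, "L": 6, "K": 2, "M": 1, "F": 2, "P": 4, "S": 6,
    "T": 4, "W": 1, "Y": 2, "V": 4,
}


def rank_probes(protein, probe_nt, top=3):
    """List the least redundant probe windows in a protein sequence.

    protein:  one-letter amino acid string
    probe_nt: probe length in nucleotides (assumed a multiple of 3)
    top:      number of probes to return
    Returns (redundancy, 1-based start residue, peptide) tuples in
    increasing order of redundancy.
    """
    if probe_nt % 3 != 0 or probe_nt <= 0:
        raise ValueError("this sketch only handles whole-codon probes")
    aa_len = probe_nt // 3
    candidates = []
    for i in range(len(protein) - aa_len + 1):
        window = protein[i:i + aa_len]
        redundancy = 1
        for aa in window:
            redundancy *= CODON_COUNT[aa]
        candidates.append((redundancy, i + 1, window))
    candidates.sort()
    return candidates[:top]


print(rank_probes("MWLSRMKW", 6))
# [(1, 1, 'MW'), (2, 6, 'MK'), (2, 7, 'KW')]
```

    Windows rich in Met and Trp rank first because each has a single codon, while Leu, Ser and Arg (six codons each) push redundancy up quickly.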

  8. Characterization of ADME gene variation in 21 populations by exome sequencing

    PubMed Central

    Hovelson, Daniel H.; Xue, Zhengyu; Zawistowski, Matthew; Ehm, Margaret G.; Harris, Elizabeth C.; Stocker, Sophie L.; Gross, Annette S.; Jang, In-Jin; Ieiri, Ichiro; Lee, Jong-Eun; Cardon, Lon R.; Chissoe, Stephanie L.; Abecasis, Gonçalo

    2017-01-01

    Objective: Proteins involved in absorption, distribution, metabolism, and excretion (ADME) play a critical role in drug pharmacokinetics. The type and frequency of genetic variation in the ADME genes differ among populations. The aim of this study was to systematically investigate common and rare ADME coding variation in diverse ethnic populations by exome sequencing. Materials and methods: Data derived from commercial exome capture arrays and next-generation sequencing were used to characterize coding variation in 298 ADME genes in 251 Northeast Asians and 1181 individuals from the 1000 Genomes Project. Results: Approximately 75% of the ADME coding sequence was captured at high quality across the joint samples harboring more than 8000 variants, with 49% of individuals carrying at least one ‘knockout’ allele. ADME genes carried 50% more nonsynonymous variation than non-ADME genes (P=8.2×10^-13) and showed significantly greater levels of population differentiation (P=7.6×10^-11). Out of the 2135 variants identified that were predicted to be deleterious, 633 were not on commercially available ADME or general-purpose genotyping arrays. Forty deleterious variants within important ADME genes, with frequencies of at least 2% in at least one population, were identified as candidates for future pharmacogenetic studies. Conclusion: Exome sequencing was effective in accurately genotyping most ADME variants important for pharmacogenetic research, in addition to identifying rare or potentially de novo coding variants that may be clinically meaningful. Furthermore, as a class, ADME genes are more variable and less sensitive to purifying selection than non-ADME genes. PMID:27984508

  9. Blind Prediction of Deleterious Amino Acid Variations with SNPs&GO.

    PubMed

    Capriotti, Emidio; Martelli, Pier Luigi; Fariselli, Piero; Casadio, Rita

    2017-01-19

    SNPs&GO is a machine learning method for predicting the association of single amino acid variations (SAVs) to disease, considering protein functional annotation. The method is a binary classifier that implements a Support Vector Machine algorithm to discriminate between disease-related and neutral SAVs. SNPs&GO combines information from protein sequence with functional annotation encoded by Gene Ontology terms. Tested in sequence mode on more than 38,000 SAVs from the SwissVar dataset, our method reached 81% overall accuracy and an area under the receiver operating characteristic curve (AUC) of 0.88 with low false positive rate. In almost all the editions of the Critical Assessment of Genome Interpretation (CAGI) experiments, SNPs&GO ranked among the most accurate algorithms for predicting the effect of SAVs. In this paper we summarize the best results obtained by SNPs&GO on disease related variations of four CAGI challenges relative to the following genes: CHEK2 (CAGI 2010), RAD50 (CAGI 2011), p16-INK (CAGI 2013) and NAGLU (CAGI 2016). Result evaluation provides insights about the accuracy of our algorithm and the relevance of GO terms in annotating the effect of the variants. It also helps to define good practices for the detection of deleterious SAVs.
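
    The two evaluation measures quoted above, overall accuracy and AUC, can be computed from classifier scores as in the sketch below. It does not reproduce SNPs&GO or its data: the helper names, the 0.5 decision threshold, and the labels and scores in the example are all hypothetical. AUC is computed here as the fraction of (disease, neutral) pairs in which the disease variant receives the higher score, with ties counted as one half.

```python
def accuracy(labels, scores, threshold=0.5):
    """Fraction of variants classified correctly at a score threshold.

    labels: 1 for disease-related, 0 for neutral
    scores: classifier outputs, higher meaning more likely disease
    """
    correct = sum((s >= threshold) == (y == 1) for y, s in zip(labels, scores))
    return correct / len(labels)


def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.8, 0.2]
print(accuracy(labels, scores))  # 5 of 6 correct
print(roc_auc(labels, scores))   # 8 of 9 pairs ranked correctly
```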

  10. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  11. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  12. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  13. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  14. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data...

  15. A map of human genome variation from population-scale sequencing.

    PubMed

    Abecasis, Gonçalo R; Altshuler, David; Auton, Adam; Brooks, Lisa D; Durbin, Richard M; Gibbs, Richard A; Hurles, Matt E; McVean, Gil A

    2010-10-28

    The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. Here we present results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms. We undertook three projects: low-coverage whole-genome sequencing of 179 individuals from four populations; high-coverage sequencing of two mother-father-child trios; and exon-targeted sequencing of 697 individuals from seven populations. We describe the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants, most of which were previously undescribed. We show that, because we have catalogued the vast majority of common variation, over 95% of the currently accessible variants found in any individual are present in this data set. On average, each person is found to carry approximately 250 to 300 loss-of-function variants in annotated genes and 50 to 100 variants previously implicated in inherited disorders. We demonstrate how these results can be used to inform association and functional studies. From the two trios, we directly estimate the rate of de novo germline base substitution mutations to be approximately 10(-8) per base pair per generation. We explore the data with regard to signatures of natural selection, and identify a marked reduction of genetic variation in the neighbourhood of genes, due to selection at linked sites. These methods and public data will support the next phase of human genetic research.

  16. Cloning and sequencing of the Bet v 1-homologous allergen Fra a 1 in strawberry (Fragaria ananassa) shows the presence of an intron and little variability in amino acid sequence.

    PubMed

    Musidlowska-Persson, Anna; Alm, Rikard; Emanuelsson, Cecilia

    2007-02-01

    The Fra a 1 allergen in strawberry (Fragaria ananassa) is homologous to the major birch pollen allergen Bet v 1, which has numerous isoforms differing in terms of amino acid sequence and immunological impact. To map the extent of sequence differences in the Fra a 1 allergen, PCR cloning and sequencing was applied. Several genomic sequences of Fra a 1, with a length of either 584, 591 or 594 nucleotides, were obtained from three different strawberry varieties. All contained one intron, with the length of either 101 or 110 nucleotides. By sequencing 30 different clones, eight different DNA sequences were obtained, giving in total five potential Fra a 1 protein isoforms, with high sequence similarity (>97% sequence identity) and only seven positions of amino acid variability, which were largely confirmed by mass spectrometry of expressed proteins. We conclude that the sequence variability in the strawberry allergen Fra a 1 is small, within and between strawberry varieties, and that multiple spots, previously detected in 2DE, are presumably due to differences in post-translational modification rather than differences in amino acid sequence. The most abundant Fra a 1 isoform sequence, recombinantly expressed in Escherichia coli after removal of the intron, was recognized by IgE from strawberry allergic patients. It cross-reacted with antibodies to Bet v 1 and the homologous apple allergen Mal d 1 (61 and 78% sequence identity, respectively), and will be used in further analyses of variation in Fra a 1-expression.

  17. Temporal Variations of Organic Acids in Sumac Fruit

    SciTech Connect

    Robbins, C.; Mulcahy, F.; Somayajula, K.; Edenborn, H.M.

    2006-10-01

    Extracts from staghorn sumac (Rhus typhina) fruits were obtained from fresh fruits obtained from June to October in two successive years. Total acidity, pH, and concentrations of malic and succinic acids determined using liquid chromatography were measured for each extract. Acidity and acid concentrations reached their maxima in late July, and declined slowly thereafter. Malic and succinic acid concentrations in the extracts reached maxima of about 4 and 0.2% (expressed per unit weight of fruit), respectively. Malic and succinic acids were the only organic acids observed in the extracts, and mass balance determinations indicate that these acids are most likely the only ones present in appreciable amounts.

  18. Sequence variation between 462 human individuals fine-tunes functional sites of RNA processing

    PubMed Central

    Ferreira, Pedro G.; Oti, Martin; Barann, Matthias; Wieland, Thomas; Ezquina, Suzana; Friedländer, Marc R.; Rivas, Manuel A.; Esteve-Codina, Anna; Estivill, Xavier; Guigó, Roderic; Dermitzakis, Emmanouil; Antonarakis, Stylianos; Meitinger, Thomas; Strom, Tim M; Palotie, Aarno; François Deleuze, Jean; Sudbrak, Ralf; Lerach, Hans; Gut, Ivo; Syvänen, Ann-Christine; Gyllensten, Ulf; Schreiber, Stefan; Rosenstiel, Philip; Brunner, Han; Veltman, Joris; Hoen, Peter A.C.T; Jan van Ommen, Gert; Carracedo, Angel; Brazma, Alvis; Flicek, Paul; Cambon-Thomsen, Anne; Mangion, Jonathan; Bentley, David; Hamosh, Ada; Rosenstiel, Philip; Strom, Tim M; Lappalainen, Tuuli; Guigó, Roderic; Sammeth, Michael

    2016-01-01

    Recent advances in the cost-efficiency of sequencing technologies enabled the combined DNA- and RNA-sequencing of human individuals at the population-scale, making genome-wide investigations of the inter-individual genetic impact on gene expression viable. Employing mRNA-sequencing data from the Geuvadis Project and genome sequencing data from the 1000 Genomes Project we show that the computational analysis of DNA sequences around splice sites and poly-A signals is able to explain several observations in the phenotype data. In contrast to widespread assessments of statistically significant associations between DNA polymorphisms and quantitative traits, we developed a computational tool to pinpoint the molecular mechanisms by which genetic markers drive variation in RNA-processing, cataloguing and classifying alleles that change the affinity of core RNA elements to their recognizing factors. The in silico models we employ further suggest RNA editing can moonlight as a splicing-modulator, albeit less frequently than genomic sequence diversity. Beyond existing annotations, we demonstrate that the ultra-high resolution of RNA-Seq combined from 462 individuals also provides evidence for thousands of bona fide novel elements of RNA processing—alternative splice sites, introns, and cleavage sites—which are often rare and lowly expressed but in other characteristics similar to their annotated counterparts. PMID:27617755

  19. Sequence variation of koala retrovirus transmembrane protein p15E among koalas from different geographic regions.

    PubMed

    Ishida, Yasuko; McCallister, Chelsea; Nikolaidis, Nikolas; Tsangaras, Kyriakos; Helgen, Kristofer M; Greenwood, Alex D; Roca, Alfred L

    2015-01-15

    The koala retrovirus (KoRV), which is transitioning from an exogenous to an endogenous form, has been associated with high mortality in koalas. For other retroviruses, the envelope protein p15E has been considered a candidate for vaccine development. We therefore examined proviral sequence variation of KoRV p15E in a captive Queensland and three wild southern Australian koalas. We generated 163 sequences with intact open reading frames, which grouped into 39 distinct haplotypes. Sixteen distinct haplotypes comprising 139 of the sequences (85%) coded for the same polypeptide. Among the remaining 23 haplotypes, 22 were detected only once among the sequences, and each had 1 or 2 non-synonymous differences from the majority sequence. Several analyses suggested that p15E was under purifying selection. Important epitopes and domains were highly conserved across the p15E sequences and in previously reported exogenous KoRVs. Overall, these results support the potential use of p15E for KoRV vaccine development.

  20. Sequence variation between 462 human individuals fine-tunes functional sites of RNA processing

    NASA Astrophysics Data System (ADS)

    Ferreira, Pedro G.; Oti, Martin; Barann, Matthias; Wieland, Thomas; Ezquina, Suzana; Friedländer, Marc R.; Rivas, Manuel A.; Esteve-Codina, Anna; Estivill, Xavier; Guigó, Roderic; Dermitzakis, Emmanouil; Antonarakis, Stylianos; Meitinger, Thomas; Strom, Tim M.; Palotie, Aarno; François Deleuze, Jean; Sudbrak, Ralf; Lerach, Hans; Gut, Ivo; Syvänen, Ann-Christine; Gyllensten, Ulf; Schreiber, Stefan; Rosenstiel, Philip; Brunner, Han; Veltman, Joris; Hoen, Peter A. C. T.; Jan van Ommen, Gert; Carracedo, Angel; Brazma, Alvis; Flicek, Paul; Cambon-Thomsen, Anne; Mangion, Jonathan; Bentley, David; Hamosh, Ada; Rosenstiel, Philip; Strom, Tim M.; Lappalainen, Tuuli; Guigó, Roderic; Sammeth, Michael

    2016-09-01

    Recent advances in the cost-efficiency of sequencing technologies enabled the combined DNA- and RNA-sequencing of human individuals at the population-scale, making genome-wide investigations of the inter-individual genetic impact on gene expression viable. Employing mRNA-sequencing data from the Geuvadis Project and genome sequencing data from the 1000 Genomes Project we show that the computational analysis of DNA sequences around splice sites and poly-A signals is able to explain several observations in the phenotype data. In contrast to widespread assessments of statistically significant associations between DNA polymorphisms and quantitative traits, we developed a computational tool to pinpoint the molecular mechanisms by which genetic markers drive variation in RNA-processing, cataloguing and classifying alleles that change the affinity of core RNA elements to their recognizing factors. The in silico models we employ further suggest RNA editing can moonlight as a splicing-modulator, albeit less frequently than genomic sequence diversity. Beyond existing annotations, we demonstrate that the ultra-high resolution of RNA-Seq combined from 462 individuals also provides evidence for thousands of bona fide novel elements of RNA processing—alternative splice sites, introns, and cleavage sites—which are often rare and lowly expressed but in other characteristics similar to their annotated counterparts.

  1. Sequence variation of koala retrovirus transmembrane protein p15E among koalas from different geographic regions

    PubMed Central

    Ishida, Yasuko; McCallister, Chelsea; Nikolaidis, Nikolas; Tsangaras, Kyriakos; Helgen, Kristofer M.; Greenwood, Alex D.; Roca, Alfred L.

    2014-01-01

    The koala retrovirus (KoRV), which is transitioning from an exogenous to an endogenous form, has been associated with high mortality in koalas. For other retroviruses, the envelope protein p15E has been considered a candidate for vaccine development. We therefore examined proviral sequence variation of KoRV p15E in a captive Queensland and three wild southern Australian koalas. We generated 163 sequences with intact open reading frames, which grouped into 39 distinct haplotypes. Sixteen distinct haplotypes comprising 139 of the sequences (85%) coded for the same polypeptide. Among the remaining 23 haplotypes, 22 were detected only once among the sequences, and each had 1 or 2 non-synonymous differences from the majority sequence. Several analyses suggested that p15E was under purifying selection. Important epitopes and domains were highly conserved across the p15E sequences and in previously reported exogenous KoRVs. Overall, these results support the potential use of p15E for KoRV vaccine development. PMID:25462343

  2. Magnetic susceptibility variations in Loess sequences and their relationship to astronomical forcing

    NASA Technical Reports Server (NTRS)

    Verosub, Kenneth L.; Singer, Michael J.

    1992-01-01

    The long, well-exposed and often continuous sequences of loess found throughout the world are generally thought to provide an excellent opportunity for studying long-term, large-scale environmental change during the last few million years. In recent years, the most fruitful loess studies have been those involving the deposits of the loess in China. One of the most intriguing results of that work has been the discovery of an apparent correlation between variations in the magnetic susceptibility of the loess sequence and the oxygen isotope record of the deep sea. This correlation implies that magnetic susceptibility variations are being driven by astronomical parameters. However, the basic data have been interpreted in various ways by different authors, most of whom assumed that the magnetic minerals in the loess have not been affected by post-depositional processes. Using a chemical extraction procedure that allows us to separate the contribution of secondary pedogenic magnetic minerals from primary inherited magnetic minerals, we have found that the magnetic susceptibility of the Chinese paleosols is largely due to a pedogenic component which is present to a lesser degree in the loess. We have also found that the smaller inherited component of the magnetic susceptibility is about the same in the paleosols and the loess. These results demonstrate the need for additional study of the processes that create magnetic susceptibility variations in order to interpret properly the role of astronomical forcing in producing these variations.

  3. Ultra Deep Sequencing of a Baculovirus Population Reveals Widespread Genomic Variations

    PubMed Central

    Chateigner, Aurélien; Bézier, Annie; Labrousse, Carole; Jiolle, Davy; Barbe, Valérie; Herniou, Elisabeth A.

    2015-01-01

    Viruses rely on widespread genetic variation and large population size for adaptation. Large DNA virus populations are thought to harbor little variation though natural populations may be polymorphic. To measure the genetic variation present in a dsDNA virus population, we deep sequenced a natural strain of the baculovirus Autographa californica multiple nucleopolyhedrovirus. With 124,221X average genome coverage of our 133,926 bp long consensus, we could detect low frequency mutations (0.025%). K-means clustering was used to classify the mutations in four categories according to their frequency in the population. We found 60 high frequency non-synonymous mutations under balancing selection distributed in all functional classes. These mutants could alter viral adaptation dynamics, either through competitive or synergistic processes. Lastly, we developed a technique for the delimitation of large deletions in next generation sequencing data. We found that large deletions occur along the entire viral genome, with hotspots located in homologous repeat regions (hrs). Present in 25.4% of the genomes, these deletion mutants presumably require functional complementation to complete their infection cycle. They might thus have a large impact on the fitness of the baculovirus population. Altogether, we found a wide breadth of genomic variation in the baculovirus population, suggesting it has high adaptive potential. PMID:26198241

  4. Haplotype structure and population genetic inferences from nucleotide-sequence variation in human lipoprotein lipase.

    PubMed Central

    Clark, A G; Weiss, K M; Nickerson, D A; Taylor, S L; Buchanan, A; Stengård, J; Salomaa, V; Vartiainen, E; Perola, M; Boerwinkle, E; Sing, C F

    1998-01-01

    Allelic variation in 9.7 kb of genomic DNA sequence from the human lipoprotein lipase gene (LPL) was scored in 71 healthy individuals (142 chromosomes) from three populations: African Americans (24) from Jackson, MS; Finns (24) from North Karelia, Finland; and non-Hispanic Whites (23) from Rochester, MN. The sequences had a total of 88 variable sites, with a nucleotide diversity (site-specific heterozygosity) of 0.002 ± 0.001 across this 9.7-kb region. The frequency spectrum of nucleotide variation exhibited a slight excess of heterozygosity, but, in general, the data fit expectations of the infinite-sites model of mutation and genetic drift. Allele-specific PCR helped resolve linkage phases, and a total of 88 distinct haplotypes were identified. For 1,410 (64%) of the 2,211 site pairs, all four possible gametes were present in these haplotypes, reflecting a rich history of past recombination. Despite the strong evidence for recombination, extensive linkage disequilibrium was observed. The number of haplotypes generally is much greater than the number expected under the infinite-sites model, but there was sufficient multisite linkage disequilibrium to reveal two major clades, which appear to be very old. Variation in this region of LPL may depart from the variation expected under a simple, neutral model, owing to complex historical patterns of population founding, drift, selection, and recombination. These data suggest that the design and interpretation of disease-association studies may not be as straightforward as often is assumed. PMID:9683608
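
    The site-pair count reported above (pairs for which all four possible gametes are present) can be illustrated with a minimal sketch of a four-gamete check. The function name, the "0"/"1" allele coding, and the toy haplotypes are assumptions made for illustration; they are not taken from the study, and the sketch assumes every site is biallelic.

```python
from itertools import combinations


def four_gamete_pairs(haplotypes):
    """Count site pairs at which all four two-site gametes are observed.

    haplotypes: list of equal-length strings, one character per biallelic
    site (hypothetical coding: "0" and "1").
    Returns (pairs_with_all_four_gametes, total_site_pairs).
    """
    n_sites = len(haplotypes[0])
    hits = 0
    total = 0
    for i, j in combinations(range(n_sites), 2):
        total += 1
        # Collect the distinct allele combinations seen at sites i and j.
        gametes = {(h[i], h[j]) for h in haplotypes}
        if len(gametes) == 4:
            hits += 1
    return hits, total


# Toy data, invented for illustration only.
toy = ["000", "010", "100", "110", "001"]
hits, total = four_gamete_pairs(toy)
print(hits, total)
```

    Under the infinite-sites model without recombination, at most three of the four gametes can arise for any pair of sites, which is why the abstract reads a high proportion of four-gamete pairs as evidence of past recombination.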

  5. Impact of next generation sequencing: the 2009 Human Genome Variation Society Scientific Meeting.

    PubMed

    Oetting, William S

    2010-04-01

    The annual scientific meeting of the Human Genome Variation Society (HGVS) was held on the 20th of October, 2009, in Honolulu, Hawaii. The theme of this meeting was the "Impact of Next Generation Sequencing." Presenters spoke on issues ranging from advances in the technology of large-scale genome sequencing to how this information can be analyzed to uncover genetic variants associated with disease. Many of the challenges resulting from the implementation of these new technologies were presented, but possible solutions, or at least paths to the solutions, were also given. With the combined efforts of investigators using next-generation sequencing to help understand the impact of genetic variants on disease, the use of the personal genome in medicine will soon become a reality.

  6. Variation in the sequence and modification state of the human insulin gene flanking regions.

    PubMed

    Ullrich, A; Dull, T J; Gray, A; Philips, J A; Peter, S

    1982-04-10

    The nucleotide sequence of a highly repetitive sequence region upstream from the human insulin gene is reported. The length of this region varies between alleles in the population, and appears to be stably transmitted to the next generation in a Mendelian fashion. There is no significant correlation between the length of this sequence and two types of diabetes mellitus. We observe variation in the cleavability of a BglI recognition site downstream from the human insulin gene, which is probably due to variable nucleotide modification. This presumed modification state appears not to be inherited, and varies between tissues within an individual and between individuals for a given tissue. Both alleles in a given tissue DNA sample are modified to the same extent.

  7. Spatio-temporal Variations of Characteristic Repeating Earthquake Sequences along the Middle America Trench in Mexico

    NASA Astrophysics Data System (ADS)

    Dominguez, L. A.; Taira, T.; Hjorleifsdottir, V.; Santoyo, M. A.

    2015-12-01

    Repeating earthquake sequences are sets of events that are thought to rupture the same area on the plate interface and thus provide nearly identical waveforms. We systematically analyzed seismic records from 2001 through 2014 to identify repeating earthquakes with highly correlated waveforms occurring along the subduction zone of the Cocos plate. Using the correlation coefficient (cc) and spectral coherency (coh) of the vertical components as selection criteria, we found a set of 214 sequences whose waveforms satisfy cc ≥ 95% and coh ≥ 95%. Spatial clustering along the trench shows large variations in repeating earthquake activity. In particular, the rupture zone of the M8.1 1985 earthquake shows an almost complete absence of characteristic repeating earthquakes, whereas the Guerrero Gap zone and the segment of the trench close to the Guerrero-Oaxaca border show a significantly larger number of repeating earthquake sequences. Furthermore, temporal variations associated with stress changes due to major earthquakes show episodes of unlocking and healing of the interface. Understanding the different components that control the location and recurrence time of characteristic repeating sequences is key to pinpointing areas where large megathrust earthquakes may nucleate and, consequently, to improving seismic hazard assessment.
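
    The correlation-coefficient criterion named in this abstract can be sketched as follows. This is a minimal illustration that assumes a zero-lag Pearson correlation between two equal-length, already aligned waveforms; the function names and the toy samples are hypothetical, and the spectral-coherency criterion and any time-shift search used in the actual study are not reproduced here.

```python
import math


def correlation_coefficient(a, b):
    """Zero-lag Pearson correlation of two equal-length waveforms.

    Assumes both waveforms have non-zero variance.
    """
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    num = sum(x * y for x, y in zip(da, db))
    den = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    return num / den


def passes_cc_threshold(a, b, threshold=0.95):
    """Return True when the pair meets the cc criterion (0.95 = 95%)."""
    return correlation_coefficient(a, b) >= threshold


# Toy waveforms, invented for illustration only.
w1 = [0.0, 1.0, 0.0, -1.0, 0.0]
w2 = [2.0 * x for x in w1]  # same shape, different amplitude
print(passes_cc_threshold(w1, w2))
```

    Because the coefficient is normalized, two events with the same waveform shape but different amplitudes still correlate strongly, which is the property that makes it useful for grouping events that rupture the same patch.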

  8. Population subdivision in Europe's great bustard inferred from mitochondrial and nuclear DNA sequence variation.

    PubMed

    Pitra, C; Lieckfeldt, D; Alonso, J C

    2000-08-01

    A continent-wide survey of sequence variation in mitochondrial (mt) and nuclear (n) DNA of the endangered great bustard (Otis tarda) was conducted to assess the extent of phylogeographic structure in a morphologically monotypic bird. DNA sequence variation in a combined 809 bp segment of the mtDNA genome from 66 individuals from the last six breeding regions showed relatively low levels of intraspecific sequence diversity (π = 0.32%) but significant differences in the regional distribution of 11 haplotypes (ΦST = 0.49). Despite their exceptional potential for dispersal, a complete and long-term historical separation between the populations from the Iberian Peninsula (Spain) and mainland Europe (Hungary, Slovakia, Germany, and Russia) was demonstrated. Divergence between populations based on a 3-bp insertion-deletion polymorphism within the intron region of the nuclear CHD-Z gene was geographically concordant with the primary subdivision identified within the mtDNA sequences. Inferred aspects of phylogeography were used to formulate conservation recommendations for this endangered species.

  9. Human retroviruses and AIDS 1996. A compilation and analysis of nucleic acid and amino acid sequences

    SciTech Connect

    Myers, G.; Foley, B.; Korber, B.; Mellors, J.W.; Jeang, K.T.; Wain-Hobson, S.

    1997-04-01

    This compendium and the accompanying floppy diskettes are the result of an effort to compile and rapidly publish all relevant molecular data concerning the human immunodeficiency viruses (HIV) and related retroviruses. The scope of the compendium and database is best summarized by the five parts that it comprises: (1) Nucleic Acid Alignments and Sequences; (2) Amino Acid Alignments; (3) Analysis; (4) Related Sequences; and (5) Database Communications. Information within all the parts is updated throughout the year on the Web site, http://hiv-web.lanl.gov. While this publication could take the form of a review or sequence monograph, it is not so conceived. Instead, the literature from which the database is derived has simply been summarized and some elementary computational analyses have been performed upon the data. Interpretation and commentary have been avoided insofar as possible so that the reader can form his or her own judgments concerning the complex information. In addition to the general descriptions of the parts of the compendium, the user should read the individual introductions for each part.

  10. Transcriptome Sequencing in Response to Salicylic Acid in Salvia miltiorrhiza

    PubMed Central

    Zhang, Xiaoru; Dong, Juane; Liu, Hailong; Wang, Jiao; Qi, Yuexin; Liang, Zongsuo

    2016-01-01

    Salvia miltiorrhiza is a traditional Chinese herbal medicine, whose quality and yield are often affected by diseases and environmental stresses during its growing season. Salicylic acid (SA) plays a significant role in plants responding to biotic and abiotic stresses, but the involved regulatory factors and their signaling mechanisms are largely unknown. In order to identify the genes involved in SA signaling, the RNA sequencing (RNA-seq) strategy was employed to evaluate the transcriptional profiles in S. miltiorrhiza cell cultures. A total of 50,778 unigenes were assembled, in which 5,316 unigenes were differentially expressed among 0-, 2-, and 8-h SA induction. The up-regulated genes were mainly involved in stimulus response and multi-organism process. A core set of candidate novel genes coding SA signaling component proteins was identified. Many transcription factors (e.g., WRKY, bHLH and GRAS) and genes involved in hormone signal transduction were differentially expressed in response to SA induction. Detailed analysis revealed that genes associated with defense signaling, such as antioxidant system genes, cytochrome P450s and ATP-binding cassette transporters, were significantly overexpressed, which can be used as genetic tools to investigate disease resistance. Our transcriptome analysis will help understand SA signaling and its mechanism of defense systems in S. miltiorrhiza. PMID:26808150

  11. Human retroviruses and aids, 1992. A compilation and analysis of nucleic acid and amino acid sequences

    SciTech Connect

    Myers, G.; Korber, B.; Berzofsky, J.A.; Pavlakis, G.N.; Smith, R.F.

    1992-10-01

    This compendium and the accompanying floppy diskettes are the result of an effort to compile and rapidly publish all relevant molecular data concerning the human immunodeficiency viruses (HIV) and related retroviruses. The scope of the compendium and database is best summarized by the five parts that it comprises: (I) HIV and SIV Nucleotide Sequences; (II) Amino Acid Sequences; (III) Analyses; (IV) Related Sequences; and (V) Database Communications. Information within all the parts is updated at least twice in each year, which accounts for the modes of binding and pagination in the compendium. While this publication could take the form of a review or sequence monograph, it is not so conceived. Instead, the literature from which the database is derived has simply been summarized and some elementary computational analyses have been performed upon the data. Interpretation and commentary have been avoided insofar as possible so that the reader can form his or her own judgments concerning the complex information. In addition to the general descriptions below of the parts of the compendium, the user should read the individual introductions for each part.

  12. Intragenomic and interspecific 5S rDNA sequence variation in five Asian pines.

    PubMed

    Liu, Zhan-Lin; Zhang, Daming; Wang, Xiao-Quan; Ma, Xiao-Fei; Wang, Xiao-Ru

    2003-01-01

    Patterns of intragenomic and interspecific variation of 5S rDNA in Pinus (Pinaceae) were studied by cloning and sequencing multiple 5S rDNA repeats from individual trees. Five pines, from both subgenera, Pinus and Strobus, were selected. The 5S rDNA repeat in pines has a conserved 120-base pair (bp) transcribed region and an intergenic spacer region of variable length (382-608 bp). The evolutionary rate in the spacer region is three- to sevenfold higher than in the genic region. We found substantial sequence divergence between the two subgenera. Intragenomic sequence heterogeneity was high for all species, and more than 86% of the clones within each individual were unique. The 5S gene tree revealed that different 5S repeats within individuals are polyphyletic, indicating that their ancestral divergence preceded the speciation events. The degrees of interspecific and intragenomic divergence among diploxylon pines are similar. The observed sequence patterns suggest that concerted evolution has been acting since the diversification of the two subgenera but has been very weak since the speciation of the four diploxylon pines. Sequence patterns in P. densata are consistent with hybrid origin. It had higher intragenomic diversity and maintained polymorphic copies of the parental types in addition to new and recombinant types unique to the hybrid.

  13. BreaKmer: detection of structural variation in targeted massively parallel sequencing data using kmers

    PubMed Central

    Abo, Ryan P.; Ducar, Matthew; Garcia, Elizabeth P.; Thorner, Aaron R.; Rojas-Rudilla, Vanesa; Lin, Ling; Sholl, Lynette M.; Hahn, William C.; Meyerson, Matthew; Lindeman, Neal I.; Van Hummelen, Paul; MacConaill, Laura E.

    2015-01-01

    Genomic structural variation (SV), a common hallmark of cancer, has important predictive and therapeutic implications. However, accurately detecting SV using high-throughput sequencing data remains challenging, especially for ‘targeted’ resequencing efforts. This is critically important in the clinical setting where targeted resequencing is frequently being applied to rapidly assess clinically actionable mutations in tumor biopsies in a cost-effective manner. We present BreaKmer, a novel approach that uses a ‘kmer’ strategy to assemble misaligned sequence reads for predicting insertions, deletions, inversions, tandem duplications and translocations at base-pair resolution in targeted resequencing data. Variants are predicted by realigning an assembled consensus sequence created from sequence reads that were abnormally aligned to the reference genome. Using targeted resequencing data from tumor specimens with orthogonally validated SV, non-tumor samples and whole-genome sequencing data, BreaKmer had a 97.4% overall sensitivity for known events and predicted 17 positively validated, novel variants. Relative to four publicly available algorithms, BreaKmer detected SV with increased sensitivity and limited calls in non-tumor samples, key features for variant analysis of tumor specimens in both the clinical and research settings. PMID:25428359
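
    The abstract does not spell out the ‘kmer’ strategy, so the following is only a generic sketch of the underlying idea: break sequences into overlapping k-length substrings and keep the ones found in a read but not in the reference. The function names, the choice of k, and the toy sequences are assumptions for illustration and should not be read as BreaKmer's actual algorithm or interface.

```python
def kmers(seq, k):
    """Return the set of all overlapping substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def novel_kmers(read, reference, k):
    """Return k-mers that occur in the read but not in the reference."""
    return kmers(read, k) - kmers(reference, k)


# Toy sequences, invented for illustration only.
reference = "ACGTACGTGA"
read = "ACGTTTGTGA"  # differs from the reference in the middle
print(sorted(novel_kmers(read, reference, 4)))
```

    K-mers absent from the reference cluster around the position where a read departs from it, which is one plausible way to seed the local assembly of abnormally aligned reads that the abstract describes.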

  14. Sequence variation in the coding region of the melanocortin-1 receptor gene (MC1R) is not associated with plumage variation in the blue-crowned manakin (Lepidothrix coronata).

    PubMed

    Cheviron, Z A; Hackett, Shannon J; Brumfield, Robb T

    2006-07-07

    Avian plumage traits are the targets of both natural and sexual selection. Consequently, genetic changes resulting in plumage variation among closely related taxa might represent important evolutionary events. The molecular basis of such differences, however, is unknown in most cases. Sequence variation in the melanocortin-1 receptor gene (MC1R) is associated with melanistic phenotypes in many vertebrate taxa, including several avian species. The blue-crowned manakin (Lepidothrix coronata), a widespread, sexually dichromatic passerine, exhibits striking geographic variation in male plumage colour across its range in southern Central America and western Amazonia. Northern males are black with brilliant blue crowns whereas southern males are green with lighter blue crowns. We sequenced 810 bp of the MC1R coding region in 23 individuals spanning the range of male plumage variation. The only variable sites we detected among L. coronata sequences were four synonymous substitutions, none of which were strictly associated with either plumage type. Similarly, comparative analyses showed that L. coronata sequences were monomorphic at the three amino acid sites hypothesized to be functionally important in other birds. These results demonstrate that genes other than MC1R underlie melanic plumage polymorphism in blue-crowned manakins.

  15. Completion of the amino acid sequence of the alpha 1 chain from type I calf skin collagen. Amino acid sequence of alpha 1(I)B8.

    PubMed Central

    Glanville, R W; Breitkreutz, D; Meitinger, M; Fietzek, P P

    1983-01-01

    The complete amino acid sequence of the 279-residue CNBr peptide CB8 from the alpha 1 chain of type I calf skin collagen is presented. It was determined by sequencing overlapping fragments of CB8 produced by Staphylococcus aureus V8 proteinase, trypsin, Endoproteinase Arg-C and hydroxylamine. Tryptic cleavages were also made specific for lysine by blocking arginine residues with cyclohexane-1,2-dione. This completes the amino acid sequence analysis of the 1054-residue-long alpha 1(I) chain of calf skin collagen. PMID:6354180

  16. CNV-TV: A robust method to discover copy number variation from short sequencing reads

    PubMed Central

    2013-01-01

    Background: Copy number variation (CNV) is an important structural variation (SV) in the human genome. Various studies have shown that CNVs are associated with complex diseases. Traditional CNV detection methods such as fluorescence in situ hybridization (FISH) and array comparative genomic hybridization (aCGH) suffer from low resolution. The next generation sequencing (NGS) technique promises a higher resolution detection of CNVs and several methods were recently proposed for realizing such a promise. However, the performance of these methods is not robust under some conditions, e.g., some of them may fail to detect CNVs of short sizes. There has been a strong demand for reliable detection of CNVs from high resolution NGS data. Results: A novel and robust method to detect CNV from short sequencing reads is proposed in this study. The detection of CNV is modeled as a change-point detection from the read depth (RD) signal derived from the NGS, which is fitted with a total variation (TV) penalized least squares model. The performance (e.g., sensitivity and specificity) of the proposed approach is evaluated by comparison with several recently published methods on both simulated and real data from the 1000 Genomes Project. Conclusion: The experimental results showed that both the true positive rate and false positive rate of the proposed detection method do not change significantly for CNVs with different copy numbers and lengths, when compared with several existing methods. Therefore, our proposed approach results in a more reliable detection of CNVs than the existing methods. PMID:23634703
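
    The abstract names a total variation (TV) penalized least squares model without writing it out. For orientation, the generic one-dimensional form of such a model is given below; the symbols are assumptions introduced here (y_i for the observed read depth in bin i, x_i for the fitted piecewise-constant signal, n for the number of bins, and λ for the penalty weight), and the paper's exact formulation may differ.

```latex
\hat{x} \;=\; \arg\min_{x \in \mathbb{R}^{n}}
\; \frac{1}{2} \sum_{i=1}^{n} \left( y_i - x_i \right)^{2}
\;+\; \lambda \sum_{i=1}^{n-1} \left| x_{i+1} - x_i \right|
```

    The first term keeps the fit close to the observed read depth, while the second penalizes jumps between neighboring bins, so the solution is piecewise constant and its jump locations serve as candidate change points (CNV boundaries). A larger λ yields fewer, longer segments.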

  17. A survey of chromosomal and nucleotide sequence variation in Drosophila miranda.

    PubMed Central

    Yi, Soojin; Bachtrog, Doris; Charlesworth, Brian

    2003-01-01

    There have recently been several studies of the evolution of Y chromosome degeneration and dosage compensation using the neo-sex chromosomes of Drosophila miranda as a model system. To understand these evolutionary processes more fully, it is necessary to document the general pattern of genetic variation in this species. Here we report a survey of chromosomal variation, as well as polymorphism and divergence data, for 12 nuclear genes of D. miranda. These genes exhibit varying levels of DNA sequence polymorphism. Compared to its well-studied sibling species D. pseudoobscura, D. miranda has much less nucleotide sequence variation, and the effective population size of this species is inferred to be several-fold lower. Nevertheless, it harbors a few inversion polymorphisms, one of which involves the neo-X chromosome. There is no convincing evidence for a recent population expansion in D. miranda, in contrast to D. pseudoobscura. The pattern of population subdivision previously observed for the X-linked gene period is not seen for the other loci, suggesting that there is no general population subdivision in D. miranda. However, data on an additional region of period confirm population subdivision for this gene, suggesting that local selection is operating at or near period to promote differentiation between populations. PMID:12930746

  18. Large scale DNA sequencing: new challenges emerge--the 2007 Human Genome Variation Society scientific meeting.

    PubMed

    Oetting, William S

    2008-05-01

    The annual scientific meeting of the Human Genome Variation Society (HGVS) was held on 23 October 2007, in San Diego, CA. The major theme of this meeting was "New DNA Sequencing Technologies & Human Genome Variation." A series of speakers provided information on several new technologies that produce DNA sequence data on a scale far beyond what was possible even a few years ago. These new technologies produce up to gigabases of nucleotides on a single run. Already, two individuals have had their entire genome sequenced, resulting in the identification of many novel DNA variants. Several new questions now need to be answered. What impact do these novel variants have on the phenotypes? How are we to associate private variants in a single individual with disease, especially when current association studies require genotyping thousands of individuals? Further work will be required to create methodologies to analyze these variants to determine if they are potentially disease-producing or are phenotypically silent. For the technology to be useful in a medical setting it will be crucial to answer these questions.

  19. Relation between mRNA expression and sequence information in Desulfovibrio vulgaris: Combinatorial contributions of upstream regulatory motifs and coding sequence features to variations in mRNA abundance

    SciTech Connect

    Wu, Gang; Nie, Lei; Zhang, Weiwen

    2006-05-26

    The context-dependent expression of genes is at the core of biological activities, and significant attention has been given to identifying the various factors that contribute to gene expression at the genomic scale. However, so far this type of analysis has focused either on the relation between mRNA expression and non-coding sequence features such as upstream regulatory motifs or on the correlation between mRNA abundance and non-random features in coding sequences (e.g. codon usage and amino acid usage). In this study, multiple regression analyses of the mRNA abundance and all sequence information in Desulfovibrio vulgaris were performed, with the goal of investigating how much coding and non-coding sequence features contribute to the variations in mRNA expression, and in what manner they act together...

  20. An Integrated Sequence-Structure Database incorporating matching mRNA sequence, amino acid sequence and protein three-dimensional structure data.

    PubMed Central

    Adzhubei, I A; Adzhubei, A A; Neidle, S

    1998-01-01

    We have constructed a non-homologous database, termed the Integrated Sequence-Structure Database (ISSD) which comprises the coding sequences of genes, amino acid sequences of the corresponding proteins, their secondary structure and straight phi,psi angles assignments, and polypeptide backbone coordinates. Each protein entry in the database holds the alignment of nucleotide sequence, amino acid sequence and the PDB three-dimensional structure data. The nucleotide and amino acid sequences for each entry are selected on the basis of exact matches of the source organism and cell environment. The current version 1.0 of ISSD is available on the WWW at http://www.protein.bio.msu.su/issd/ and includes 107 non-homologous mammalian proteins, of which 80 are human proteins. The database has been used by us for the analysis of synonymous codon usage patterns in mRNA sequences showing their correlation with the three-dimensional structure features in the encoded proteins. Possible ISSD applications include optimisation of protein expression, improvement of the protein structure prediction accuracy, and analysis of evolutionary aspects of the nucleotide sequence-protein structure relationship. PMID:9399866

  1. Complete amino acid sequence and structure characterization of the taste-modifying protein, miraculin.

    PubMed

    Theerasilp, S; Hitotsuya, H; Nakajo, S; Nakaya, K; Nakamura, Y; Kurihara, Y

    1989-04-25

    The taste-modifying protein, miraculin, has the unusual property of modifying sour taste into sweet taste. The complete amino acid sequence of miraculin purified from miracle fruits by a newly developed method (Theerasilp, S., and Kurihara, Y. (1988) J. Biol. Chem. 263, 11536-11539) was determined by an automatic Edman degradation method. Miraculin was a single polypeptide with 191 amino acid residues. The calculated molecular weight based on the amino acid sequence and the carbohydrate content (13.9%) was 24,600. Asn-42 and Asn-186 were linked N-glycosidically to carbohydrate chains. High homology was found between the amino acid sequences of miraculin and soybean trypsin inhibitor.

  2. Sequence variation within botulinum neurotoxin serotypes impacts antibody binding and neutralization.

    PubMed

    Smith, T J; Lou, J; Geren, I N; Forsyth, C M; Tsai, R; Laporte, S L; Tepp, W H; Bradshaw, M; Johnson, E A; Smith, L A; Marks, J D

    2005-09-01

    The botulinum neurotoxins (BoNTs) are category A biothreat agents which have been the focus of intensive efforts to develop vaccines and antibody-based prophylaxis and treatment. Such approaches must take into account the extensive BoNT sequence variability; the seven BoNT serotypes differ by up to 70% at the amino acid level. Here, we have analyzed 49 complete published sequences of BoNTs and show that all toxins also exhibit variability within serotypes ranging between 2.6 and 31.6%. To determine the impact of such sequence differences on immune recognition, we studied the binding and neutralization capacity of six BoNT serotype A (BoNT/A) monoclonal antibodies (MAbs) to BoNT/A1 and BoNT/A2, which differ by 10% at the amino acid level. While all six MAbs bound BoNT/A1 with high affinity, three of the six MAbs showed a marked reduction in binding affinity of 500- to more than 1,000-fold to BoNT/A2 toxin. Binding results predicted in vivo toxin neutralization; MAbs or MAb combinations that potently neutralized A1 toxin but did not bind A2 toxin had minimal neutralizing capacity for A2 toxin. This was most striking for a combination of three binding domain MAbs which together neutralized >40,000 mouse 50% lethal doses (LD(50)s) of A1 toxin but less than 500 LD(50)s of A2 toxin. Combining three MAbs which bound both A1 and A2 toxins potently neutralized both toxins. We conclude that sequence variability exists within all toxin serotypes, and this impacts monoclonal antibody binding and neutralization. Such subtype sequence variability must be accounted for when generating and evaluating diagnostic and therapeutic antibodies.

  3. Sequence Variation within Botulinum Neurotoxin Serotypes Impacts Antibody Binding and Neutralization

    PubMed Central

    Smith, T. J.; Lou, J.; Geren, I. N.; Forsyth, C. M.; Tsai, R.; LaPorte, S. L.; Tepp, W. H.; Bradshaw, M.; Johnson, E. A.; Smith, L. A.; Marks, J. D.

    2005-01-01

    The botulinum neurotoxins (BoNTs) are category A biothreat agents which have been the focus of intensive efforts to develop vaccines and antibody-based prophylaxis and treatment. Such approaches must take into account the extensive BoNT sequence variability; the seven BoNT serotypes differ by up to 70% at the amino acid level. Here, we have analyzed 49 complete published sequences of BoNTs and show that all toxins also exhibit variability within serotypes ranging between 2.6 and 31.6%. To determine the impact of such sequence differences on immune recognition, we studied the binding and neutralization capacity of six BoNT serotype A (BoNT/A) monoclonal antibodies (MAbs) to BoNT/A1 and BoNT/A2, which differ by 10% at the amino acid level. While all six MAbs bound BoNT/A1 with high affinity, three of the six MAbs showed a marked reduction in binding affinity of 500- to more than 1,000-fold to BoNT/A2 toxin. Binding results predicted in vivo toxin neutralization; MAbs or MAb combinations that potently neutralized A1 toxin but did not bind A2 toxin had minimal neutralizing capacity for A2 toxin. This was most striking for a combination of three binding domain MAbs which together neutralized >40,000 mouse 50% lethal doses (LD50s) of A1 toxin but less than 500 LD50s of A2 toxin. Combining three MAbs which bound both A1 and A2 toxins potently neutralized both toxins. We conclude that sequence variability exists within all toxin serotypes, and this impacts monoclonal antibody binding and neutralization. Such subtype sequence variability must be accounted for when generating and evaluating diagnostic and therapeutic antibodies. PMID:16113261

  4. Individual and population variation in invertebrates revealed by Inter-simple Sequence Repeats (ISSRs)

    PubMed Central

    Abbot, Patrick

    2001-01-01

    PCR-based molecular markers are well suited for questions requiring large scale surveys of plant and animal populations. Inter-simple Sequence Repeats or ISSRs are analyzed by a recently developed technique based on the amplification of the regions between inverse-oriented microsatellite loci with oligonucleotides anchored in microsatellites themselves. ISSRs have shown much promise for the study of the population biology of plants, but have not yet been explored for similar studies of animals. The value of ISSRs is demonstrated for the study of animal species with low levels of within-population variation. Sets of primers are identified which reveal variation in two aphid species, Acyrthosiphon pisum and Pemphigus obesinymphae, in the yellow fever mosquito Aedes aegypti, and in a rotifer in the genus Philodina. PMID:15455068

  5. Detection and isolation of nucleic acid sequences using a bifunctional hybridization probe

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    2000-01-01

    A method for detecting and isolating a target sequence in a sample of nucleic acids is provided using a bifunctional hybridization probe capable of hybridizing to the target sequence that includes a detectable marker and a first complexing agent capable of forming a binding pair with a second complexing agent. A kit is also provided for detecting a target sequence in a sample of nucleic acids using a bifunctional hybridization probe according to this method.

  6. MRI assessment of internal acoustic canal variations using 3D-FIESTA sequences.

    PubMed

    Erdogan, Nezahat; Altay, Canan; Akay, Emrah; Karakas, Levent; Uluc, Engin; Mete, Berna; Oygen, Aysegul; Oyar, Orhan; Gelal, Fazıl; Songu, Murat; Katilmis, Huseyin; Calli, Cağlar

    2013-02-01

    Magnetic resonance imaging (MRI) of the internal acoustic canal is the standard diagnostic tool for a wide range of indications in patients. This study aims to investigate the vascular variations and compression of the cranial nerves (CNs) VII and VIII at the cerebellopontine angle in patients with neuro-otologic symptoms using 3D-fast imaging employing steady-state acquisition (FIESTA) MR imaging. One hundred and eighty-seven patients (374 temporal bones) were examined on a 1.5-T MRI. In addition to conventional MR sequences, a 3D-FIESTA MR imaging was acquired. Magnetic resonance images thus obtained were evaluated with special regard to the presence of vascular contact to the CNs VII and VIII, as well as the presence of the vascular variations of the anterior inferior cerebellar artery (AICA) causing the compression of CNs. The Chi-squared test was used for statistical analysis. No statistically significant differences were found between the presence and absence of the AICA loop and/or vascular contact for the clinical symptoms of patients (P > 0.05). The cisternal and canalicular segments of CNs VII and VIII and adjacent vascular variations are well identified using 3D-FIESTA, especially by determining the relationship of the AICA variations between CNs.

  7. Recognition of 5'-YpG-3' sequences by coupled stacking/hydrogen bonding interactions with amino acid residues.

    PubMed

    Lamoureux, Jason S; Maynes, Jason T; Glover, J N Mark

    2004-01-09

    The combined biochemical and structural study of hundreds of protein-DNA complexes has indicated that sequence-specific interactions are mediated by two mechanisms termed direct and indirect readout. Direct readout involves direct interactions between the protein and base-specific atoms exposed in the major and minor grooves of DNA. For indirect readout, the protein recognizes DNA by sensing conformational variations in the structure dependent on nucleotide sequence, typically through interactions with the phosphodiester backbone. Based on our recent structure of Ndt80 bound to DNA in conjunction with a search of the existing PDB database, we propose a new method of sequence-specific recognition that utilizes both direct and indirect readout. In this mode, a single amino acid side-chain recognizes two consecutive base-pairs. The 3'-base is recognized by canonical direct readout, while the 5'-base is recognized through a variation of indirect readout, whereby the conformational flexibility of the particular dinucleotide step, namely a 5'-pyrimidine-purine-3' step, facilitates its recognition by the amino acid via cation-pi interactions. In most cases, this mode of DNA recognition helps explain the sequence specificity of the protein for its target DNA.

  8. Frequent sequence variation in the human myostatin (GDF8) gene as a marker for analysis of muscle-related phenotypes.

    PubMed

    Ferrell, R E; Conte, V; Lawrence, E C; Roth, S M; Hagberg, J M; Hurley, B F

    1999-12-01

    Myostatin is a recently identified member of the transforming growth factor-beta family of regulatory factors, also known as growth and differentiation factor 8 (GDF8). The nucleotide sequence of human myostatin was determined in 40 individuals. The invariant promoter contains a consensus MyoD binding site, and the coding sequence contains five missense substitutions in conserved amino acid residues (A55T, K153R, E164K, P198A, and I225T). Two of these, A55T in exon 1 and K153R in exon 2, are polymorphic in the general population with significantly different allele frequencies in Caucasians and African Americans (P < 0.001). Neither of the common polymorphisms had a significant impact on muscle mass response to strength training in either Caucasians or African Americans, although skewed allele frequencies preclude detection of small effects. These allelic variants provide markers for examining association between the myostatin gene and interindividual variation in muscle mass and differences in loss of muscle mass with aging.

  9. Targeted deep sequencing of flowering regulators in Brassica napus reveals extensive copy number variation

    PubMed Central

    Schiessl, Sarah; Huettel, Bruno; Kuehn, Diana; Reinhardt, Richard; Snowdon, Rod J.

    2017-01-01

    Gene copy number variation (CNV) is increasingly implicated in control of complex trait networks, particularly in polyploid plants like rapeseed (Brassica napus L.) with an evolutionary history of genome restructuring. Here we performed sequence capture to assay nucleotide variation and CNV in a panel of central flowering time regulatory genes across a species-wide diversity set of 280 B. napus accessions. The genes were chosen based on prior knowledge from Arabidopsis thaliana and related Brassica species. Target enrichment was performed using the Agilent SureSelect technology, followed by Illumina sequencing. A bait (probe) pool was developed based on results of a preliminary experiment with representatives from different B. napus morphotypes. A very high mean target coverage of ~670x allowed reliable calling of CNV, single nucleotide polymorphisms (SNPs) and insertion-deletion (InDel) polymorphisms. Every accession exhibited CNV, and at least one homolog of every gene we investigated showed CNV in some accessions. Some CNV appear more often in specific morphotypes, indicating a role in diversification. PMID:28291231

  10. Investigating peptide sequence variations for 'double-click' stapled p53 peptides.

    PubMed

    Lau, Yu Heng; de Andrade, Peterson; Sköld, Niklas; McKenzie, Grahame J; Venkitaraman, Ashok R; Verma, Chandra; Lane, David P; Spring, David R

    2014-06-28

    Stapling peptides for inhibiting the p53/MDM2 interaction is a promising strategy for developing anti-cancer therapeutic leads. We evaluate double-click stapled peptides formed from p53-based diazidopeptides with different staple positions and azido amino acid side-chain lengths, determining the impact of these variations on MDM2 binding and cellular activity. We also demonstrate a K24R mutation, necessary for cellular activity in hydrocarbon-stapled p53 peptides, is not required for analogous 'double-click' peptides.

  11. Population-genomic variation within RNA viruses of the Western honey bee, Apis mellifera, inferred from deep sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Deep sequencing of viruses isolated from infected hosts is an efficient way to measure population-genetic variation and can reveal patterns of dispersal and natural selection. In this study, we mined existing Illumina sequence reads to investigate single-nucleotide polymorphisms (SNPs) within two RN...

  12. Whole Genome Sequencing demonstrates that Geographic Variation of Escherichia coli O157 Genotypes Dominates Host Association

    PubMed Central

    Strachan, Norval J. C.; Rotariu, Ovidiu; Lopes, Bruno; MacRae, Marion; Fairley, Susan; Laing, Chad; Gannon, Victor; Allison, Lesley J.; Hanson, Mary F.; Dallman, Tim; Ashton, Philip; Franz, Eelco; van Hoek, Angela H. A. M.; French, Nigel P.; George, Tessy; Biggs, Patrick J.; Forbes, Ken J.

    2015-01-01

    Genetic variation in an infectious disease pathogen can be driven by ecological niche dissimilarities arising from different host species and different geographical locations. Whole genome sequencing was used to compare E. coli O157 isolates from host reservoirs (cattle and sheep) from Scotland and to compare genetic variation of isolates (human, animal, environmental/food) obtained from Scotland, New Zealand, Netherlands, Canada and the USA. Nei’s genetic distance calculated from core genome single nucleotide polymorphisms (SNPs) demonstrated that the animal isolates were from the same population. Investigation of the Shiga toxin bacteriophage and their insertion sites (SBI typing) revealed that cattle and sheep isolates had statistically indistinguishable rarefaction profiles, diversity and genotypes. In contrast, isolates from different countries exhibited significant differences in Nei’s genetic distance and SBI typing. Hence, after successful international transmission, which has occurred on multiple occasions, local genetic variation occurs, resulting in a global patchwork of continental and trans-continental phylogeographic clades. These findings are important for three reasons: first, understanding transmission and evolution of infectious diseases associated with multiple host reservoirs and multi-geographic locations; second, highlighting the relevance of the sheep reservoir when considering farm based interventions; and third, improving our understanding of why human disease incidence varies across the world. PMID:26442781

  13. Whole Genome Sequencing demonstrates that Geographic Variation of Escherichia coli O157 Genotypes Dominates Host Association.

    PubMed

    Strachan, Norval J C; Rotariu, Ovidiu; Lopes, Bruno; MacRae, Marion; Fairley, Susan; Laing, Chad; Gannon, Victor; Allison, Lesley J; Hanson, Mary F; Dallman, Tim; Ashton, Philip; Franz, Eelco; van Hoek, Angela H A M; French, Nigel P; George, Tessy; Biggs, Patrick J; Forbes, Ken J

    2015-10-07

    Genetic variation in an infectious disease pathogen can be driven by ecological niche dissimilarities arising from different host species and different geographical locations. Whole genome sequencing was used to compare E. coli O157 isolates from host reservoirs (cattle and sheep) from Scotland and to compare genetic variation of isolates (human, animal, environmental/food) obtained from Scotland, New Zealand, Netherlands, Canada and the USA. Nei's genetic distance calculated from core genome single nucleotide polymorphisms (SNPs) demonstrated that the animal isolates were from the same population. Investigation of the Shiga toxin bacteriophage and their insertion sites (SBI typing) revealed that cattle and sheep isolates had statistically indistinguishable rarefaction profiles, diversity and genotypes. In contrast, isolates from different countries exhibited significant differences in Nei's genetic distance and SBI typing. Hence, after successful international transmission, which has occurred on multiple occasions, local genetic variation occurs, resulting in a global patchwork of continental and trans-continental phylogeographic clades. These findings are important for three reasons: first, understanding transmission and evolution of infectious diseases associated with multiple host reservoirs and multi-geographic locations; second, highlighting the relevance of the sheep reservoir when considering farm based interventions; and third, improving our understanding of why human disease incidence varies across the world.

  14. The complete nucleotide sequence of the Crossostoma lacustre mitochondrial genome: conservation and variations among vertebrates.

    PubMed Central

    Tzeng, C S; Hui, C F; Shen, S C; Huang, P C

    1992-01-01

    The complete mitochondrial (mt) genome of Crossostoma lacustre, a freshwater loach from the mountain streams of Taiwan, has been cloned and sequenced. This fish mt genome, consisting of 16558 base-pairs, encodes genes for 13 proteins, two rRNAs, and 22 tRNAs, in addition to a regulatory sequence for replication and transcription (D-loop), and is similar to those of the other vertebrates in both the order and orientation of these genes. The protein-coding and ribosomal RNA genes are highly homologous, both in size and composition, to their counterparts in mammals, birds, amphibians, and invertebrates, and use essentially the same set of codons, including both the initiation and termination signals, and the tRNAs. Differences do exist, however, in the lengths and sequences of the D-loop regions, and in space between genes, which account for the variations in total lengths of the genomes. Our observations provide evidence for the first time for the conservation of genetic information in the fish mitochondrial genome, especially among the vertebrates. PMID:1408800

  15. DNA sequence variation in the mitochondrial control region of red-backed voles (Clethrionomys).

    PubMed

    Matson, C W; Baker, R J

    2001-08-01

    The complete mitochondrial DNA (mtDNA) control region was sequenced for 71 individuals from five species of the rodent genus Clethrionomys both to understand patterns of variation and to explore the existence of previously described domains and other elements. Among species, the control region ranged from 942 to 971 bp in length. Our data were compatible with the proposal of three domains (extended terminal associated sequences [ETAS], central, conserved sequence blocks [CSB]) within the control region. The most conserved region in the control region was the central domain (12% of nucleotide positions variable), whereas in the ETAS and CSB domains, 22% and 40% of nucleotide positions were variable, respectively. Tandem repeats were encountered only in the ETAS domain of Clethrionomys rufocanus. This tandem repeat found in C. rufocanus was 24 bp in length and was located at the 5' end of the control region. Only two of the proposed CSB and ETAS elements appeared to be supported by our data; however, a "CSB1-like" element was also documented in the ETAS domain.

  16. Analysis of Sequence Variation and Risk Association of Human Papillomavirus 52 Variants Circulating in Korea

    PubMed Central

    Choi, Youn Jin; Ki, Eun Young; Zhang, Chuqing; Ho, Wendy C. S.; Lee, Sung-Jong; Jeong, Min Jin

    2016-01-01

    Introduction: Human papillomavirus (HPV) 52 is a carcinogenic, high-risk genotype frequently detected in cervical cancer cases from East Asia, including Korea. Materials and Methods: Sequences of HPV52 detected in 91 cervical samples collected from women attending Seoul St. Mary’s Hospital were analyzed. HPV52 genomic sequences were obtained by polymerase chain reaction (PCR)-based sequencing and analyzed using Seq-Scape software, and phylogenetic trees were constructed using MEGA6 software. Results: Of the 91 cervical samples, 40 were normal, 22 were low-grade lesions, 21 were high-grade lesions and 7 were squamous cell carcinomas. Four HPV52 variant lineages (A, B, C and D) were identified. Lineage B was the most frequently detected lineage, followed by lineage C. By analyzing the two most frequently detected lineages (B and C), we found that distinct variations existed in each lineage. We also found that a lineage B-specific mutation K93R (A379G) was associated with an increased risk of cervical neoplasia. Conclusions: To our knowledge, we are the first to reveal the predominance of the HPV52 lineages, B and C, in Korea. We also found these lineages harbored distinct genetic alterations that may affect oncogenicity. Our findings increase our understanding of the heterogeneity of HPV52 variants, and may be useful for the development of new diagnostic assays and therapeutic vaccines. PMID:27977741

  17. Application of high-throughput sequencing for studying genomic variations in congenital heart disease.

    PubMed

    Dorn, Cornelia; Grunert, Marcel; Sperling, Silke R

    2014-01-01

    Congenital heart diseases (CHD) represent the most common birth defect in humans. The majority of cases are caused by a combination of complex genetic alterations and environmental influences. In the past, many disease-causing mutations have been identified; however, there is still a large proportion of cardiac malformations with unknown precise origin. High-throughput sequencing technologies established during the last years offer novel opportunities to further study the genetic background underlying the disease. In this review, we provide a roadmap for designing and analyzing high-throughput sequencing studies focused on CHD, but also with general applicability to other complex diseases. The three main next-generation sequencing (NGS) platforms including their particular advantages and disadvantages are presented. To identify potentially disease-related genomic variations and genes, different filtering steps and gene prioritization strategies are discussed. In addition, available control datasets based on NGS are summarized. Finally, we provide an overview of current studies already using NGS technologies and showing that these techniques will help to further unravel the complex genetics underlying CHD.

  18. Whole-Genome Sequencing Reveals Genetic Variation in the Asian House Rat

    PubMed Central

    Teng, Huajing; Zhang, Yaohua; Shi, Chengmin; Mao, Fengbiao; Hou, Lingling; Guo, Hongling; Sun, Zhongsheng; Zhang, Jianxu

    2016-01-01

    Whole-genome sequencing of wild-derived rat species can provide novel genomic resources, which may help decipher the genetics underlying complex phenotypes. As a notorious pest, reservoir of human pathogens, and colonizer, the Asian house rat, Rattus tanezumi, is successfully adapted to its habitat. However, little is known regarding genetic variation in this species. In this study, we identified over 41,000,000 single-nucleotide polymorphisms, plus insertions and deletions, through whole-genome sequencing and bioinformatics analyses. Moreover, we identified over 12,000 structural variants, including 143 chromosomal inversions. Further functional analyses revealed several fixed nonsense mutations associated with infection and immunity-related adaptations, and a number of fixed missense mutations that may be related to anticoagulant resistance. A genome-wide scan for loci under selection identified various genes related to neural activity. Our whole-genome sequencing data provide a genomic resource for future genetic studies of the Asian house rat species and have the potential to facilitate understanding of the molecular adaptations of rats to their ecological niches. PMID:27172215

  19. DNA sequence variation in BpMADS2 gene in two populations of Betula pendula.

    PubMed

    Järvinen, Pia; Lemmetyinen, Juha; Savolainen, Outi; Sopanen, Tuomas

    2003-02-01

    The PISTILLATA (PI) homologue, BpMADS2, was isolated from silver birch (Betula pendula Roth) and used to study nucleotide polymorphism. Two regions (together about 2450 bp) comprising mainly untranslated sequences were sequenced from 10 individuals from each of two populations in Finland. The nucleotide polymorphism was low in the BpMADS2 locus, especially in the coding region. The synonymous site overall nucleotide diversity (pis) was 0.0043 and the nonsynonymous nucleotide diversity (pia) was only 0.000052. For the whole region, the pi values for the two populations were 0.0039 and 0.0045, and for the coding regions, the pi values were only 0 and 0.00066 (for the corresponding coding regions of Arabidopsis thaliana PI world-wide pi was 0.0021). Estimates of pi or theta did not differ significantly between the two populations, and the two populations were not diverged from each other. Two classes of BpMADS2 alleles were present in both populations, suggesting that this gene exhibits allelic dimorphism. In addition to the nucleotide site variation, two microsatellites were also associated within the haplotypes. This allelic dimorphism might be the result of postglacial re-colonization partly from northwestern, partly from southeastern/eastern refugia. The sequence comparison detected five recombination events in the regions studied. The large number of microsatellites in all of the three introns studied suggests that BpMADS2 is a hotspot for microsatellite formation.
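    The pi values reported in this abstract are nucleotide diversities, conventionally defined as the average number of pairwise nucleotide differences per site among sampled sequences. A minimal sketch of that definition follows, assuming an ungapped alignment of equal-length sequences; the function name and the toy alignment are hypothetical, and the study's own estimates (for example, synonymous versus nonsynonymous sites) involve site classification that this sketch does not attempt.

```python
from itertools import combinations
from typing import Sequence


def nucleotide_diversity(alignment: Sequence[str]) -> float:
    """Average pairwise differences per site across all sequence pairs.

    Assumption: sequences are aligned, ungapped, and of equal length.
    """
    if len(alignment) < 2:
        raise ValueError("need at least two sequences")
    length = len(alignment[0])
    if length == 0 or any(len(s) != length for s in alignment):
        raise ValueError("sequences must be non-empty and of equal length")
    total_diffs = 0
    pairs = 0
    for s1, s2 in combinations(alignment, 2):
        # Count sites at which this pair of sequences differs.
        total_diffs += sum(1 for a, b in zip(s1, s2) if a != b)
        pairs += 1
    return total_diffs / (pairs * length)


# Toy alignment of four 8-bp sequences (hypothetical, not BpMADS2 data).
toy = ["ACGTACGT", "ACGTACGA", "ACGTACGT", "ACGAACGT"]
print(nucleotide_diversity(toy))  # 0.125
```

    In the toy alignment the six pairs differ at six sites in total, giving 6 / (6 x 8) = 0.125.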

  20. Trichomonas vaginalis acidic phospholipase A2: isolation and partial amino acid sequence.

    PubMed

    Escobedo-Guajardo, Brenda L; González-Salazar, Francisco; Palacios-Corona, Rebeca; Torres de la Cruz, Víctor M; Morales-Vallarta, Mario; Mata-Cárdenas, Benito D; Garza-González, Jesús N; Rivera-Silva, Gerardo; Vargas-Villarreal, Javier

    2013-12-01

    Sexually transmitted diseases are a major cause of acute disease worldwide, and trichomoniasis is the most common and curable disease, generating more than 170 million cases annually worldwide. Trichomonas vaginalis is the causal agent of trichomoniasis and has the ability to destroy in vitro cell monolayers of the vaginal mucosa, where the phospholipases A2 (PLA2) have been reported as potential virulence factors. These enzymes have been partially characterized from the subcellular fraction S30 of pathogenic T. vaginalis strains. The main objective of this study was to purify a phospholipase A2 from T. vaginalis, make a partial characterization, obtain a partial amino acid sequence, and determine its enzymatic participation as hemolytic factor causing lysis of erythrocytes. Trichomonas S30, RF30 and UFF30 sub-fractions from GT-15 strain have the capacity to hydrolyze [2-(14)C-PA]-PC at pH 6.0. Proteins from the UFF30 sub-fraction were separated by affinity chromatography into two eluted fractions with detectable PLA2 activity. The EDTA-eluted fraction was analyzed by HPLC using on-line HPLC-tandem mass spectrometry and two protein peaks were observed at 8.2 and 13 kDa. Peptide sequences were identified from the proteins present in the eluted EDTA UFF30 fraction; bioinformatic analysis using Protein Link Global Server charged with T. vaginalis protein database suggests that eluted peptides correspond to a putative ubiquitin protein in the 8.2 kDa fraction and a phospholipase preserved in the 13 kDa fraction. The EDTA-eluted fraction hydrolyzed [2-(14)C-PA]-PC and lysed erythrocytes from Sprague-Dawley rats in a time- and dose-dependent manner. The acidic hemolytic activity decreased by 84% with the addition of 100 μM of Rosenthal's inhibitor.

  1. K-Pax2: Bayesian identification of cluster-defining amino acid positions in large sequence datasets

    PubMed Central

    Grad, Yonatan; Cobey, Sarah; Puranen, Juha Santeri; Corander, Jukka

    2015-01-01

    The recent growth in publicly available sequence data has introduced new opportunities for studying microbial evolution and spread. Because the pace of sequence accumulation tends to exceed the pace of experimental studies of protein function and the roles of individual amino acids, statistical tools to identify meaningful patterns in protein diversity are essential. Large sequence alignments from fast-evolving micro-organisms are particularly challenging to dissect using standard tools from phylogenetics and multivariate statistics because biologically relevant functional signals are easily masked by neutral variation and noise. To meet this need, a novel computational method is introduced that is easily executed in parallel using a cluster environment and can handle thousands of sequences with minimal subjective input from the user. The usefulness of this kind of machine learning is demonstrated by applying it to nearly 5000 haemagglutinin sequences of influenza A/H3N2. Antigenic and 3D structural mapping of the results show that the method can recover the major jumps in antigenic phenotype that occurred between 1968 and 2013 and identify specific amino acids associated with these changes. The method is expected to provide a useful tool to uncover patterns of protein evolution. PMID:28348810

  2. HPV-16 E2 gene disruption and sequence variation in CIN 3 lesions and invasive squamous cell carcinomas of the cervix: relation to numerical chromosome abnormalities

    PubMed Central

    Graham, D A; Herrington, C S

    2000-01-01

    Aim—To test the hypothesis that, because the human papillomavirus (HPV) E2 protein represses viral early gene transcription, E2 gene sequence variation or disruption could play a part in the induction of the numerical chromosome abnormalities that have been described in squamous cervical lesions. Methods—The integrity and sequence of the E2 gene from 11 cervical intraepithelial neoplasia (CIN) grade 3 lesions and 14 invasive squamous cell carcinomas, all of which contained HPV-16, were analysed by the polymerase chain reaction (PCR). The E2 gene was amplified in three overlapping fragments and PCR products sequenced directly. Chromosome abnormalities were identified by interphase cytogenetics using chromosome specific probes for chromosomes 1, 3, 11, 17, 18, and X. Results—E2 gene disruption was present in significantly more invasive carcinomas (eight of 14) than CIN 3 lesions (one of 11) (p = 0.03). No association was found between E2 disruption and the presence of a numerical chromosome abnormality. The E2 gene from the non-disrupted isolates was sequenced and wild-type (n = 5) and variant (n = 11) sequences identified. Variant sequences belonged to European and African classes and contained from one to 15 amino acid substitutions. Although numerical chromosome abnormalities were significantly more frequent in invasive squamous cell carcinoma than CIN 3 (p = 0.04), there was no significant relation between the presence of sequence variation and either histological diagnosis or chromosome abnormality. Conclusions—These data do not support the hypothesis that E2 gene disruption or variation is important in the induction of chromosome imbalance in these lesions. However, there is a relation between E2 gene disruption and the presence of invasive disease. PMID:11040943

  3. Hydrophobicity and Aromaticity Are Primary Factors Shaping Variation in Amino Acid Usage of Chicken Proteome

    PubMed Central

    Chai, Xuewen; Nie, Qinghua; Zhang, Xiquan

    2014-01-01

    Amino acids are utilized with different frequencies both among species and among genes within the same genome. To date, no study on the amino acid usage pattern of chicken has been performed. In the present study, we carried out a systematic examination of the amino acid usage in the chicken proteome. Our data indicated that the relative amino acid usage is positively correlated with the tRNA gene copy number. GC contents, including GC1, GC2, GC3, GC content of CDS and GC content of the introns, were correlated with most of the amino acid usage, especially for GC rich and GC poor amino acids; however, multiple linear regression analyses indicated that only approximately 10–40% variation of amino acid usage can be explained by GC content for GC rich and GC poor amino acids. For other intermediate GC content amino acids, only approximately 10% variation can be explained. Correspondence analyses demonstrated that the main factors responsible for the variation of amino acid usage in chicken are hydrophobicity, aromaticity and genomic GC content. Gene expression level also influenced the amino acid usage significantly. We argued that the amino acid usage of chicken proteome likely reflects a balance or near balance between the action of selection, mutation, and genetic drift. PMID:25329059
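    GC1, GC2 and GC3 in this abstract conventionally denote the GC content at the first, second and third codon positions of a coding sequence. A minimal sketch of that calculation follows, assuming an in-frame coding sequence whose length is a multiple of three; the function name and the toy sequence are hypothetical and are not drawn from the chicken dataset.

```python
from typing import Tuple


def gc_by_codon_position(cds: str) -> Tuple[float, float, float]:
    """Return (GC1, GC2, GC3): the fraction of G or C at each codon position.

    Assumption: `cds` is in frame and its length is a non-zero multiple of 3.
    """
    cds = cds.upper()
    if len(cds) == 0 or len(cds) % 3 != 0:
        raise ValueError("CDS length must be a non-zero multiple of 3")
    fractions = []
    for position in range(3):
        bases = cds[position::3]  # every third base, starting at this codon position
        gc_count = sum(1 for base in bases if base in "GC")
        fractions.append(gc_count / len(bases))
    return fractions[0], fractions[1], fractions[2]


# Toy coding sequence: ATG GCC AAA TAA (hypothetical).
print(gc_by_codon_position("ATGGCCAAATAA"))  # (0.25, 0.25, 0.5)
```

    Whether stop codons are included in such counts varies between studies; this sketch counts every codon it is given.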

  4. Identification of random nucleic acid sequence aberrations using dual capture probes which hybridize to different chromosome regions

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1998-03-24

    A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration. 14 figs.

  5. Identification of random nucleic acid sequence aberrations using dual capture probes which hybridize to different chromosome regions

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1998-01-01

    A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration.

  6. The amino acid sequence of protein CM-3 from Dendroaspis polylepis polylepis (black mamba) venom.

    PubMed

    Joubert, F J

    1985-01-01

    Protein CM-3 from Dendroaspis polylepis polylepis venom was purified by gel filtration and ion exchange chromatography. It comprises 65 amino acids including eight half-cystines. The complete amino acid sequence of protein CM-3 has been elucidated. The sequence (residues 1-50) resembles that of the N-terminal sequence of the subunits of a synergistic type protein and residues 51-65 that of the C-terminal sequence of an angusticeps type protein. Mixtures of protein CM-3 and angusticeps type proteins showed no apparent synergistic effect, in that their toxicity in combination was no greater than the sum of their individual toxicities.

  7. The amino acid sequences of the Fd fragments of two human γ heavy chains

    PubMed Central

    Press, E. M.; Hogg, N. M.

    1970-01-01

    The amino acid sequences of the Fd fragments of two human pathological immunoglobulins of the immunoglobulin G1 class are reported. Comparison of the two sequences shows that the heavy-chain variable regions are similar in length to those of the light chains. The existence of heavy chain variable region subgroups is also deduced, from a comparison of these two sequences with those of another γ 1 chain, Eu, a μ chain, Ou, and the partial sequence of a fourth γ 1 chain, Ste. Carbohydrate has been found to be linked to an aspartic acid residue in the variable region of one of the γ 1 chains, Cor. PMID:5449120

  8. Variation of partial transferrin sequences and phylogenetic relationships among hares (Lepus capensis, Lagomorpha) from Tunisia.

    PubMed

    Awadi, Asma; Suchentrunk, Franz; Makni, Mohamed; Ben Slimen, Hichem

    2016-10-01

    North African hares are currently included in cape hares, Lepus capensis sensu lato, a taxon that may be considered a superspecies or a complex of closely related species. The existing molecular data, however, are not unequivocal, with mtDNA control region sequences suggesting a separate species status and nuclear loci (allozymes, microsatellites) revealing conspecificity of L. capensis and L. europaeus. Here, we study sequence variation in the intron 6 (468 bp) of the transferrin nuclear gene, of 105 hares with different coat colour from different regions in Tunisia with respect to genetic diversity and differentiation, as well as their phylogenetic status. Forty-six haplotypes (alleles) were revealed and compared phylogenetically to all available TF haplotypes of various Lepus species retrieved from GenBank. Maximum Likelihood, neighbor joining and median joining network analyses concordantly grouped all currently obtained haplotypes together with haplotypes belonging to six different Chinese hare species and the African scrub hare L. saxatilis. Moreover, two Tunisian haplotypes were shared with L. capensis, L. timidus, L. sinensis, L. yarkandensis, and L. hainanus from China. These results indicated the evolutionary complexity of the genus Lepus with the mixing of nuclear gene haplotypes resulting from introgressive hybridization or/and shared ancestral polymorphism. We report the presence of shared ancestral polymorphism between North African and Chinese hares. This has not been detected earlier in the mtDNA sequences of the same individuals. Genetic diversity of the TF sequences from the Tunisian populations was relatively high compared to other hare populations. However, genetic differentiation and gene flow analyses (AMOVA, FST, Nm) indicated little divergence with the absence of geographically meaningful phylogroups and lack of clustering with coat colour types. These results confirm the presence of a single hare species in Tunisia, but a sound inference on

  9. Combining Natural Sequence Variation with High Throughput Mutational Data to Reveal Protein Interaction Sites

    PubMed Central

    Melamed, Daniel; Young, David L.; Miller, Christina R.; Fields, Stanley

    2015-01-01

    Many protein interactions are conserved among organisms despite changes in the amino acid sequences that comprise their contact sites, a property that has been used to infer the location of these sites from protein homology. In an inter-species complementation experiment, a sequence present in a homologue is substituted into a protein and tested for its ability to support function. Therefore, substitutions that inhibit function can identify interaction sites that changed over evolution. However, most of the sequence differences within a protein family remain unexplored because of the small-scale nature of these complementation approaches. Here we use existing high throughput mutational data on the in vivo function of the RRM2 domain of the Saccharomyces cerevisiae poly(A)-binding protein, Pab1, to analyze its sites of interaction. Of 197 single amino acid differences in 52 Pab1 homologues, 17 reduce the function of Pab1 when substituted into the yeast protein. The majority of these deleterious mutations interfere with the binding of the RRM2 domain to eIF4G1 and eIF4G2, isoforms of a translation initiation factor. A large-scale mutational analysis of the RRM2 domain in a two-hybrid assay for eIF4G1 binding supports these findings and identifies peripheral residues that make a smaller contribution to eIF4G1 binding. Three single amino acid substitutions in yeast Pab1 corresponding to residues from the human orthologue are deleterious and eliminate binding to the yeast eIF4G isoforms. We create a triple mutant that carries these substitutions and other humanizing substitutions that collectively support a switch in binding specificity of RRM2 from the yeast eIF4G1 to its human orthologue. Finally, we map other deleterious substitutions in Pab1 to inter-domain (RRM2–RRM1) or protein-RNA (RRM2–poly(A)) interaction sites. Thus, the combined approach of large-scale mutational data and evolutionary conservation can be used to characterize interaction sites at single

  10. PCR/SSCP detects reliably and efficiently DNA sequence variations in large scale screening projects.

    PubMed

    Miterski, B; Krüger, R; Wintermeyer, P; Epplen, J T

    2000-06-01

    A simple and fast method with high reliability is necessary for the identification of mutations, polymorphisms and sequence variants (MPSV) within many genes and many samples, e.g. for clarifying the genetic background of individuals with multifactorial diseases. Here we review our experience with the polymerase chain reaction/single-strand conformation polymorphism (PCR/SSCP) analysis to identify MPSV in a number of genes thought to be involved in the pathogenesis of multifactorial neurological disorders, including autoimmune diseases like multiple sclerosis (MS) and neurodegenerative disorders like Parkinson's disease (PD). The method is based on the property that the electrophoretic mobility of single-stranded nucleic acids depends not only on their size but also on their sequence. The target sequences were amplified, digested into fragments ranging from 50 to 240 base pairs (bp), heat-denatured and analysed on native polyacrylamide (PAA) gels of different composition. The analysis of a great number of different PCR products demonstrates that the detection rate of MPSV depends on the fragment lengths, the temperature during electrophoresis and the composition of the gel. In general, the detection of MPSV is influenced neither by their location within the DNA fragment nor by the type of substitution, i.e., transitions or transversions. The standard PCR/SSCP system described here provides high reliability and detection rates. It allows the efficient analysis of a large number of DNA samples and many different genes.

  11. The Chinese hamster Alu-equivalent sequence: a conserved highly repetitious, interspersed deoxyribonucleic acid sequence in mammals has a structure suggestive of a transposable element.

    PubMed Central

    Haynes, S R; Toomey, T P; Leinwand, L; Jelinek, W R

    1981-01-01

    A consensus sequence has been determined for a major interspersed deoxyribonucleic acid repeat in the genome of Chinese hamster ovary cells (CHO cells). This sequence is extensively homologous to (i) the human Alu sequence (P. L. Deininger et al., J. Mol. Biol., in press), (ii) the mouse B1 interspersed repetitious sequence (Krayev et al., Nucleic Acids Res. 8:1201-1215, 1980), (iii) an interspersed repetitious sequence from African green monkey deoxyribonucleic acid (Dhruva et al., Proc. Natl. Acad. Sci. U.S.A. 77:4514-4518, 1980) and (iv) the CHO and mouse 4.5S ribonucleic acid (this report; F. Harada and N. Kato, Nucleic Acids Res. 8:1273-1285, 1980). Because the CHO consensus sequence shows significant homology to the human Alu sequence it is termed the CHO Alu-equivalent sequence. A conserved structure surrounding CHO Alu-equivalent family members can be recognized. It is similar to that surrounding the human Alu and the mouse B1 sequences, and is represented as follows: direct repeat-CHO-Alu-A-rich sequence-direct repeat. A composite interspersed repetitious sequence has been identified. Its structure is represented as follows: direct repeat-residue 47 to 107 of CHO-Alu-non-Alu repetitious sequence-A-rich sequence-direct repeat. Because the Alu flanking sequences resemble those that flank known transposable elements, we think it likely that the Alu sequence dispersed throughout the mammalian genome by transposition. PMID:9279371

  12. Phylogenetic Sequence Variations in Bacterial rRNA Affect Species-Specific Susceptibility to Drugs Targeting Protein Synthesis

    PubMed Central

    Akshay, Subramanian; Bertea, Mihai; Hobbie, Sven N.; Oettinghaus, Björn; Shcherbakov, Dimitri; Böttger, Erik C.; Akbergenov, Rashid

    2011-01-01

    Antibiotics targeting the bacterial ribosome typically bind to highly conserved rRNA regions with only minor phylogenetic sequence variations. It is unclear whether these sequence variations affect antibiotic susceptibility or resistance development. To address this question, we have investigated the drug binding pockets of aminoglycosides and macrolides/ketolides. The binding site of aminoglycosides is located within helix 44 of the 16S rRNA (A site); macrolides/ketolides bind to domain V of the 23S rRNA (peptidyltransferase center). We have used mutagenesis of rRNA sequences in Mycobacterium smegmatis ribosomes to reconstruct the different bacterial drug binding sites and to study the effects of rRNA sequence variations on drug activity. Our results provide a rationale for differences in species-specific drug susceptibility patterns and species-specific resistance phenotypes associated with mutational alterations in the drug binding pocket. PMID:21730122

  13. Recurrent Coding Sequence Variation Explains Only A Small Fraction of the Genetic Architecture of Colorectal Cancer

    PubMed Central

    Timofeeva, Maria N.; Kinnersley, Ben; Farrington, Susan M.; Whiffin, Nicola; Palles, Claire; Svinti, Victoria; Lloyd, Amy; Gorman, Maggie; Ooi, Li-Yin; Hosking, Fay; Barclay, Ella; Zgaga, Lina; Dobbins, Sara; Martin, Lynn; Theodoratou, Evropi; Broderick, Peter; Tenesa, Albert; Smillie, Claire; Grimes, Graeme; Hayward, Caroline; Campbell, Archie; Porteous, David; Deary, Ian J.; Harris, Sarah E.; Northwood, Emma L.; Barrett, Jennifer H.; Smith, Gillian; Wolf, Roland; Forman, David; Morreau, Hans; Ruano, Dina; Tops, Carli; Wijnen, Juul; Schrumpf, Melanie; Boot, Arnoud; Vasen, Hans F A; Hes, Frederik J.; van Wezel, Tom; Franke, Andre; Lieb, Wolgang; Schafmayer, Clemens; Hampe, Jochen; Buch, Stephan; Propping, Peter; Hemminki, Kari; Försti, Asta; Westers, Helga; Hofstra, Robert; Pinheiro, Manuela; Pinto, Carla; Teixeira, Manuel; Ruiz-Ponte, Clara; Fernández-Rozadilla, Ceres; Carracedo, Angel; Castells, Antoni; Castellví-Bel, Sergi; Campbell, Harry; Bishop, D. Timothy; Tomlinson, Ian P M; Dunlop, Malcolm G.; Houlston, Richard S.

    2015-01-01

    Whilst common genetic variation in many non-coding genomic regulatory regions is known to impart risk of colorectal cancer (CRC), much of the heritability of CRC remains unexplained. To examine the role of recurrent coding sequence variation in CRC aetiology, we genotyped 12,638 CRC cases and 29,045 controls from six European populations. Single-variant analysis identified a coding variant (rs3184504) in SH2B3 (12q24) associated with CRC risk (OR = 1.08, P = 3.9 × 10^-7), and novel damaging coding variants in 3 genes previously tagged by GWAS efforts: rs16888728 (8q24) in UTP23 (OR = 1.15, P = 1.4 × 10^-7); rs6580742 and rs12303082 (12q13) in FAM186A (OR = 1.11, P = 1.2 × 10^-7 and OR = 1.09, P = 7.4 × 10^-8); rs1129406 (12q13) in ATF1 (OR = 1.11, P = 8.3 × 10^-9), all reaching exome-wide significance levels. Gene-based tests identified associations between CRC and PCDHGA genes (P < 2.90 × 10^-6). We found an excess of rare, damaging variants in base-excision (P = 2.4 × 10^-4) and DNA mismatch repair genes (P = 6.1 × 10^-4) consistent with a recessive mode of inheritance. This study comprehensively explores the contribution of coding sequence variation to CRC risk, identifying associations with coding variation in 4 genes and the PCDHG gene cluster and several candidate recessive alleles. However, these findings suggest that recurrent, low-frequency coding variants account for a minority of the unexplained heritability of CRC. PMID:26553438

  14. Recurrent Coding Sequence Variation Explains Only A Small Fraction of the Genetic Architecture of Colorectal Cancer.

    PubMed

    Timofeeva, Maria N; Kinnersley, Ben; Farrington, Susan M; Whiffin, Nicola; Palles, Claire; Svinti, Victoria; Lloyd, Amy; Gorman, Maggie; Ooi, Li-Yin; Hosking, Fay; Barclay, Ella; Zgaga, Lina; Dobbins, Sara; Martin, Lynn; Theodoratou, Evropi; Broderick, Peter; Tenesa, Albert; Smillie, Claire; Grimes, Graeme; Hayward, Caroline; Campbell, Archie; Porteous, David; Deary, Ian J; Harris, Sarah E; Northwood, Emma L; Barrett, Jennifer H; Smith, Gillian; Wolf, Roland; Forman, David; Morreau, Hans; Ruano, Dina; Tops, Carli; Wijnen, Juul; Schrumpf, Melanie; Boot, Arnoud; Vasen, Hans F A; Hes, Frederik J; van Wezel, Tom; Franke, Andre; Lieb, Wolgang; Schafmayer, Clemens; Hampe, Jochen; Buch, Stephan; Propping, Peter; Hemminki, Kari; Försti, Asta; Westers, Helga; Hofstra, Robert; Pinheiro, Manuela; Pinto, Carla; Teixeira, Manuel; Ruiz-Ponte, Clara; Fernández-Rozadilla, Ceres; Carracedo, Angel; Castells, Antoni; Castellví-Bel, Sergi; Campbell, Harry; Bishop, D Timothy; Tomlinson, Ian P M; Dunlop, Malcolm G; Houlston, Richard S

    2015-11-10

    Whilst common genetic variation in many non-coding genomic regulatory regions is known to impart risk of colorectal cancer (CRC), much of the heritability of CRC remains unexplained. To examine the role of recurrent coding sequence variation in CRC aetiology, we genotyped 12,638 CRC cases and 29,045 controls from six European populations. Single-variant analysis identified a coding variant (rs3184504) in SH2B3 (12q24) associated with CRC risk (OR = 1.08, P = 3.9 × 10^-7), and novel damaging coding variants in 3 genes previously tagged by GWAS efforts: rs16888728 (8q24) in UTP23 (OR = 1.15, P = 1.4 × 10^-7); rs6580742 and rs12303082 (12q13) in FAM186A (OR = 1.11, P = 1.2 × 10^-7 and OR = 1.09, P = 7.4 × 10^-8); rs1129406 (12q13) in ATF1 (OR = 1.11, P = 8.3 × 10^-9), all reaching exome-wide significance levels. Gene-based tests identified associations between CRC and PCDHGA genes (P < 2.90 × 10^-6). We found an excess of rare, damaging variants in base-excision (P = 2.4 × 10^-4) and DNA mismatch repair genes (P = 6.1 × 10^-4) consistent with a recessive mode of inheritance. This study comprehensively explores the contribution of coding sequence variation to CRC risk, identifying associations with coding variation in 4 genes and the PCDHG gene cluster and several candidate recessive alleles. However, these findings suggest that recurrent, low-frequency coding variants account for a minority of the unexplained heritability of CRC.

  15. The amino acid sequence of goat beta-lactoglobulin.

    PubMed

    Préaux, G; Braunitzer, G; Schrank, B; Stangl, A

    1979-11-01

    The isolation of beta-lactoglobulin from milk of the goat is described. The purified protein was checked for purity and has been characterized by its gross composition and end groups. The native or the modified protein was then degraded by tryptic and cyanogen bromide cleavage. The cleavage products were isolated and sequenced in the sequenator using a Quadrol and propyne program. These data provide the complete sequence of beta-lactoglobulin of the goat. The results are discussed and compared particularly with bovine beta-lactoglobulin components AB. Some biological aspects are described.

  16. Layered materials with coexisting acidic and basic sites for catalytic one-pot reaction sequences.

    PubMed

    Motokura, Ken; Tada, Mizuki; Iwasawa, Yasuhiro

    2009-06-17

    Acidic montmorillonite-immobilized primary amines (H-mont-NH(2)) were found to be excellent acid-base bifunctional catalysts for one-pot reaction sequences; they are the first materials with coexisting acid and base sites active for acid-base tandem reactions. For example, tandem deacetalization-Knoevenagel condensation proceeded successfully with the H-mont-NH(2), affording the corresponding condensation product in a quantitative yield. The acidity of the H-mont-NH(2) was strongly influenced by the preparation solvent, and the base-catalyzed reactions were enhanced by interlayer acid sites.

  17. Synthesis of gamma,delta-unsaturated glycolic acids via sequenced Brook and Ireland-Claisen rearrangements.

    PubMed

    Schmitt, Daniel C; Johnson, Jeffrey S

    2010-03-05

    Organozinc, -magnesium, and -lithium nucleophiles initiate a Brook/Ireland-Claisen rearrangement sequence of allylic silyl glyoxylates resulting in the formation of gamma,delta-unsaturated alpha-silyloxy acids.

  18. Computer Simulation of the Determination of Amino Acid Sequences in Polypeptides

    ERIC Educational Resources Information Center

    Daubert, Stephen D.; Sontum, Stephen F.

    1977-01-01

    Describes a computer program that generates a random string of amino acids and guides the student in determining the correct sequence of a given protein by using experimental analytic data for that protein. (MLH)
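    The record above gives no detail on how the program works. As a rough illustration only, the idea of generating a random amino acid string and then producing analytic data a student could reason from might be sketched as follows. The function names, the use of Python, and the choice of a simplified trypsin cleavage rule (cut after K or R unless the next residue is P) are all assumptions made for this sketch and are not taken from the original program.

```python
import random

# One-letter codes for the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def random_peptide(length, seed=None):
    """Return a random peptide of the given length (hypothetical helper)."""
    rng = random.Random(seed)
    return "".join(rng.choice(AMINO_ACIDS) for _ in range(length))


def tryptic_fragments(sequence):
    """Split a peptide after K or R, except when the next residue is P.

    This is a simplified textbook rule for trypsin, used here only to
    produce the kind of fragment data a sequencing exercise relies on.
    """
    fragments = []
    start = 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        fragments.append(sequence[start:])
    return fragments


if __name__ == "__main__":
    peptide = random_peptide(30, seed=7)
    print(peptide)
    print(tryptic_fragments(peptide))
```

    In a teaching setting the fragments would be shown shuffled, and the student would combine them with data from a second cleavage method to recover the original order.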

  19. Color differences among feral pigeons (Columba livia) are not attributable to sequence variation in the coding region of the melanocortin-1 receptor gene (MC1R)

    PubMed Central

    2013-01-01

    Background: Genetic variation at the melanocortin-1 receptor (MC1R) gene is correlated with melanin color variation in many birds. Feral pigeons (Columba livia) show two major melanin-based colorations: a red coloration due to pheomelanic pigment and a black coloration due to eumelanic pigment. Furthermore, within each color type, feral pigeons display continuous variation in the amount of melanin pigment present in the feathers, with individuals varying from pure white to a full dark melanic color. Coloration is highly heritable and it has been suggested that it is under natural or sexual selection, or both. Our objective was to investigate whether MC1R allelic variants are associated with plumage color in feral pigeons. Findings: We sequenced 888 bp of the coding sequence of MC1R among pigeons varying both in the type, eumelanin or pheomelanin, and the amount of melanin in their feathers. We detected 10 non-synonymous substitutions and 2 synonymous substitutions, but none of them was associated with a plumage type. It remains possible that non-synonymous substitutions that influence coloration are present in the short MC1R fragment that we did not sequence, but this seems unlikely because we analyzed the entire functionally important region of the gene. Conclusions: Our results show that color differences among feral pigeons are probably not attributable to amino acid variation at the MC1R locus. Therefore, variation in regulatory regions of MC1R or variation in other genes may be responsible for the color polymorphism of feral pigeons. PMID:23915680

  20. Genome sequence of the acid-tolerant strain Rhizobium sp. LPU83.

    PubMed

    Wibberg, Daniel; Tejerizo, Gonzalo Torres; Del Papa, María Florencia; Martini, Carla; Pühler, Alfred; Lagares, Antonio; Schlüter, Andreas; Pistorio, Mariano

    2014-04-20

    Rhizobia are important members of the soil microbiome since they enter into nitrogen-fixing symbiosis with different legume host plants. Rhizobium sp. LPU83 is an acid-tolerant Rhizobium strain with a broad host range. However, it is ineffective in nitrogen fixation. Here, the improved draft genome sequence of this strain is reported. Genome sequence information provides the basis for analysis of its acid tolerance, symbiotic properties and taxonomic classification.

  1. The amino acid sequence of monal pheasant lysozyme and its activity.

    PubMed

    Araki, T; Matsumoto, T; Torikata, T

    1998-10-01

    The amino acid sequence of monal pheasant lysozyme and its activity were analyzed. Carboxymethylated lysozyme was digested with trypsin and the resulting peptides were sequenced. The established amino acid sequence had one amino acid substitution at position 102 (Arg to Gly) compared with Indian peafowl lysozyme and four amino acid substitutions at positions 3 (Phe to Tyr), 15 (His to Leu), 41 (Gln to His), and 121 (Gln to His) compared with chicken lysozyme. Analysis of the time-courses of reaction using N-acetylglucosamine pentamer as a substrate showed a difference in binding free energy change (-0.4 kcal/mol) at subsite A between monal pheasant and Indian peafowl lysozyme. This was assumed to be caused by the amino acid substitution at subsite A with loss of a positive charge at position 102 (Arg102 to Gly).

  2. Copy number variations in Hanwoo and Yanbian cattle genomes using the massively parallel sequencing data.

    PubMed

    Choi, Jung-Woo; Chung, Won-Hyong; Lim, Kyu-Sang; Lim, Won-Jun; Choi, Bong-Hwan; Lee, Seung-Hwan; Kim, Hyeong-Cheol; Lee, Seung-Soo; Cho, Eun-Seok; Lee, Kyung-Tai; Kim, Namshin; Kim, Jeong-Dae; Kim, Jong-Bok; Chai, Han-Ha; Cho, Yong-Min; Kim, Tae-Hun; Lim, Dajeong

    2016-09-01

    Hanwoo is an indigenous Korean beef cattle breed that, until the last century, shared an ancestor with Yanbian cattle, which are found in the northeastern provinces of China. During recent decades, these cattle breeds have experienced different selection pressures. Here, we present genome-wide copy number variations (CNVs) by comparing Hanwoo and Yanbian cattle sequencing data. We used ~3.12 and ~3.07 billion sequence reads from Hanwoo and Yanbian cattle, respectively. A total of 901 putative CNV regions (CNVRs) were identified throughout the genome, representing 5,513,340 bp. This is a smaller number than has been reported in previous studies, indicating that Hanwoo are genetically close to Yanbian cattle. Of the CNVRs, 53.2% and 46.8% were found to be gains and losses in Hanwoo, respectively. Potential functional roles of each CNVR were assessed by annotating all CNVRs and by gene ontology (GO) enrichment analysis. We found that 278 CNVRs overlapped with cattle gene-sets (genic-CNVRs) that could be promising candidates to account for economically important traits in cattle. The enrichment analysis indicated that genes were significantly over-represented in GO terms, including developmental process, multicellular organismal process, reproduction, and response to stimulus. These results provide a valuable genomic resource for determining how CNVs are associated with cattle traits.

  3. Patchwork sequencing of tomato San Marzano and Vesuviano varieties highlights genome-wide variations

    PubMed Central

    2014-01-01

    Background: Investigation of tomato genetic resources is crucial for evolutionary and genetic studies as well as for tomato breeding strategies. Traditional Vesuviano and San Marzano varieties grown in the Campania region (Southern Italy) are famous for their remarkable fruit quality. Owing to their economic and social importance, it is crucial to understand the genetic basis of their unique traits. Results: Here, we present the draft genome sequences of the tomato Vesuviano and San Marzano varieties. A 40x genome coverage was obtained from a hybrid assembly of Illumina paired-end reads that combines de novo assembly with iterative mapping to the reference S. lycopersicum genome (SL2.40). Insertions, deletions and SNP variants were carefully measured. When assessed on the basis of the reference annotation, 30% of protein-coding genes are predicted to have variants in both varieties. Gene copy number and gene location were assessed by mapping mRNA transcripts, showing a closer relationship of San Marzano with the reference genome. Distinctive variations in key genes and transcription/regulation factors related to fruit quality have been revealed for both cultivars. Conclusions: This effort highlighted relationships between the varieties and important variants in key fruit processes, useful for dissecting the path from sequence variant to phenotype. PMID:24548308

  4. Single-chain structure of human ceruloplasmin: the complete amino acid sequence of the whole molecule.

    PubMed Central

    Takahashi, N; Ortel, T L; Putnam, F W

    1984-01-01

    We have determined the amino acid sequence of the amino-terminal 67,000-dalton (67-kDa) fragment of human ceruloplasmin and have established overlapping sequences between the 67-kDa and 50-kDa fragments and between the 50-kDa and 19-kDa fragments. The 67-kDa fragment contains 480 amino acid residues and three glucosamine oligosaccharides. These results together with our previous sequence data for the 50-kDa and 19-kDa fragments complete the amino acid sequence of human ceruloplasmin. The polypeptide chain has a total of 1,046 amino acid residues (Mr 120,085) and has attachment sites for four glucosamine oligosaccharides; together these account for the total molecular mass of human ceruloplasmin (132 kDa). The sequence analysis of the peptides overlapping the fragments showed that one additional amino acid, arginine, is present between the 67-kDa and 50-kDa fragments, and another, lysine, is between the 50-kDa and 19-kDa fragments. Only two apparent sites of amino acid interchange have been identified in the polypeptide chain. Both involve a single-point interchange of glycine and lysine that would result in a difference in charge. The results of the complete sequence analysis verified that human ceruloplasmin is composed of a single polypeptide chain and that the subunit-like fragments are produced by proteolytic cleavage during purification (and possibly also in vivo). PMID:6582496

  5. Mitochondrial DNA sequence variation among populations and host races of Lambdina fiscellaria (Gn.) (Lepidoptera: Geometridae).

    PubMed

    Sperling, F A; Raske, A G; Otvos, I S

    1999-02-01

    The hemlock looper, Lambdina fiscellaria (Gn.), is a recurring major forest pest that is widely distributed in North America. Three subspecies (L. f. fiscellaria, L. f. lugubrosa (Hulst) and L. f. somniaria (Hulst)) have been recognized based on larval host or adult pheromone differences, but no consistent morphological differences have been reported. To clarify their taxonomic status, we surveyed mitochondrial DNA (mtDNA) sequence and restriction site variation in two protein coding genes, cytochrome oxidase I and II (COI and COII), in populations across the range of L. fiscellaria. In addition to variation in COI and COII, we found an intergenic spacer region of 20-23 bp located between the tRNA tyrosine gene and the start of COI. Of the 141 specimens of L. fiscellaria assayed, 137 were grouped into two distinct mtDNA lineages, one of which was disproportionately associated with eastern populations and one with western populations. However, single specimens and two populations in eastern Canada had mtDNA resembling that of western populations. Three divergent and rare haplotypes had basal affinities to the two common lineages. The two major lineages of L. fiscellaria were diverged by approximately 2% from each other, as well as from the mtDNA of two outgroup species, L. athasaria (Walker) and L. pellucidaria (G. & R.). The two outgroup species had essentially the same mtDNA and may be conspecific. We interpret the pattern of mtDNA variation within L. fiscellaria as indicating genetic polymorphism within a single species without clear subspecific divisions, rather than evidence of multiple cryptic species.

  6. Variation in conserved non-coding sequences on chromosome 5q and susceptibility to asthma and atopy

    SciTech Connect

    Donfack, Joseph; Schneider, Daniel H.; Tan, Zheng; Kurz, Thorsten; Dubchak, Inna; Frazer, Kelly A.; Ober, Carole

    2005-09-10

    Background: Evolutionarily conserved sequences likely have biological function. Methods: To determine whether variation in conserved sequences in non-coding DNA contributes to risk for human disease, we studied six conserved non-coding elements in the Th2 cytokine cluster on human chromosome 5q31 in a large Hutterite pedigree and in samples of outbred European American and African American asthma cases and controls. Results: Among six conserved non-coding elements (>100 bp, >70 percent identity; human-mouse comparison), we identified one single nucleotide polymorphism (SNP) in each of two conserved elements and six SNPs in the flanking regions of three conserved elements. We genotyped our samples for four of these SNPs and an additional three SNPs each in the IL13 and IL4 genes. While there was only modest evidence for association with single SNPs in the Hutterite and European American samples (P < 0.05), there were highly significant associations in European Americans between asthma and haplotypes comprised of SNPs in the IL4 gene (P < 0.001), including a SNP in a conserved non-coding element. Furthermore, variation in the IL13 gene was strongly associated with total IgE (P = 0.00022) and allergic sensitization to mold allergens (P = 0.00076) in the Hutterites, and more modestly associated with sensitization to molds in the European Americans and African Americans (P < 0.01). Conclusion: These results indicate that there is overall little variation in the conserved non-coding elements on 5q31, but variation in IL4 and IL13, including possibly one SNP in a conserved element, influence asthma and atopic phenotypes in diverse populations.

  7. Multiple Genome Sequences of Important Beer-Spoiling Lactic Acid Bacteria

    PubMed Central

    Geissler, Andreas J.; Vogel, Rudi F.

    2016-01-01

    Seven strains of important beer-spoiling lactic acid bacteria were sequenced using single-molecule real-time sequencing. Complete genomes were obtained for strains of Lactobacillus paracollinoides, Lactobacillus lindneri, and Pediococcus claussenii. The analysis of these genomes emphasizes the role of plasmids as the genomic foundation of beer-spoiling ability. PMID:27795248

  8. Copy number variation of individual cattle genomes using next-generation sequencing

    PubMed Central

    Bickhart, Derek M.; Hou, Yali; Schroeder, Steven G.; Alkan, Can; Cardone, Maria Francesca; Matukumalli, Lakshmi K.; Song, Jiuzhou; Schnabel, Robert D.; Ventura, Mario; Taylor, Jeremy F.; Garcia, Jose Fernando; Van Tassell, Curtis P.; Sonstegard, Tad S.; Eichler, Evan E.; Liu, George E.

    2012-01-01

    Copy number variations (CNVs) affect a wide range of phenotypic traits; however, CNVs in or near segmental duplication regions are often intractable. Using a read depth approach based on next-generation sequencing, we examined genome-wide copy number differences among five taurine (three Angus, one Holstein, and one Hereford) and one indicine (Nelore) cattle. Within mapped chromosomal sequence, we identified 1265 CNV regions comprising ∼55.6-Mbp sequence—476 of which (∼38%) have not previously been reported. We validated this sequence-based CNV call set with array comparative genomic hybridization (aCGH), quantitative PCR (qPCR), and fluorescent in situ hybridization (FISH), achieving a validation rate of 82% and a false positive rate of 8%. We further estimated absolute copy numbers for genomic segments and annotated genes in each individual. Surveys of the top 25 most variable genes revealed that the Nelore individual had the lowest copy numbers in 13 cases (∼52%, χ2 test; P-value <0.05). In contrast, genes related to pathogen- and parasite-resistance, such as CATHL4 and ULBP17, were highly duplicated in the Nelore individual relative to the taurine cattle, while genes involved in lipid transport and metabolism, including APOL3 and FABP2, were highly duplicated in the beef breeds. These CNV regions also harbor genes like BPIFA2A (BSP30A) and WC1, suggesting that some CNVs may be associated with breed-specific differences in adaptation, health, and production traits. By providing the first individualized cattle CNV and segmental duplication maps and genome-wide gene copy number estimates, we enable future CNV studies into highly duplicated regions in the cattle genome. PMID:22300768

  9. Detection and implication of significant temporal b-value variation during earthquake sequences

    NASA Astrophysics Data System (ADS)

    Gulia, Laura; Tormann, Thessa; Schorlemmer, Danijel; Wiemer, Stefan

    2016-04-01

    Earthquakes tend to cluster in space and time and periods of increased seismic activity are also periods of increased seismic hazard. Forecasting models currently used in statistical seismology and in Operational Earthquake Forecasting (e.g. ETAS) consider the spatial and temporal changes in the activity rates whilst the spatio-temporal changes in the earthquake size distribution, the b-value, are not included. Laboratory experiments on rock samples show an increasing relative proportion of larger events as the system approaches failure, and a sudden reversal of this trend after the main event. The increasing fraction of larger events during the stress increase period can be mathematically represented by a systematic b-value decrease, while the b-value increases immediately following the stress release. We investigate whether these lab-scale observations also apply to natural earthquake sequences and can help to improve our understanding of the physical processes generating damaging earthquakes. A number of large events nucleated in low b-value regions and spatial b-value variations have been extensively documented in the past. Detecting temporal b-value evolution with confidence is more difficult, one reason being the very different scales that have been suggested for a precursory drop in b-value, from a few days to decadal scale gradients. We demonstrate with the results of detailed case studies of the 2009 M6.3 L'Aquila and 2011 M9 Tohoku earthquakes that significant and meaningful temporal b-value variability can be detected throughout the sequences, which e.g. suggests that foreshock probabilities are not generic but subject to significant spatio-temporal variability. Such potential conclusions require and motivate the systematic study of many sequences to investigate whether general patterns exist that might eventually be useful for time-dependent or even real-time seismic hazard assessment.
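    For readers unfamiliar with the quantity discussed above: the b-value is the slope of the Gutenberg-Richter frequency-magnitude relation, log10 N = a - b M, so a lower b-value means a larger share of big events. The abstract does not state which estimator the authors used. The sketch below assumes the widely used Aki-Utsu maximum-likelihood estimator, b = log10(e) / (mean(M) - (Mc - dM/2)), where Mc is the magnitude of completeness and dM the magnitude bin width; the function name and the example catalogue are hypothetical.

```python
import math


def b_value_aki(magnitudes, completeness_mag, bin_width=0.1):
    """Aki-Utsu maximum-likelihood b-value estimate (illustrative sketch).

    Only events at or above the completeness magnitude are used.
    Returns the estimate and Aki's first-order uncertainty b / sqrt(n).
    """
    sample = [m for m in magnitudes if m >= completeness_mag]
    if len(sample) < 2:
        raise ValueError("need at least two events at or above Mc")
    mean_mag = sum(sample) / len(sample)
    # Half a bin is subtracted from Mc to correct for magnitude binning.
    denom = mean_mag - (completeness_mag - bin_width / 2.0)
    if denom <= 0:
        raise ValueError("mean magnitude must exceed the corrected Mc")
    b = math.log10(math.e) / denom
    sigma = b / math.sqrt(len(sample))
    return b, sigma


if __name__ == "__main__":
    # Hypothetical mini-catalogue, for demonstration only.
    catalogue = [2.1, 2.3, 2.0, 2.8, 3.4, 2.2, 2.5, 4.1, 2.0, 2.6]
    b, sigma = b_value_aki(catalogue, completeness_mag=2.0)
    print(f"b = {b:.2f} +/- {sigma:.2f}")
```

    Tracking b over time, as in the study, would amount to applying such an estimator to successive windows of events; window length and Mc selection strongly affect the result and are not addressed in this sketch.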

  10. Effect of laying sequence on egg mercury in captive zebra finches: an interpretation considering individual variation.

    PubMed

    Ou, Langbo; Varian-Ramos, Claire W; Cristol, Daniel A

    2015-08-01

    Bird eggs are used widely as noninvasive bioindicators for environmental mercury availability. Previous studies, however, have found varying relationships between laying sequence and egg mercury concentrations. Some studies have reported that the mercury concentration was higher in first-laid eggs or declined across the laying sequence, whereas in other studies mercury concentration was not related to egg order. Approximately 300 eggs (61 clutches) were collected from captive zebra finches dosed throughout their reproductive lives with methylmercury (0.3 μg/g, 0.6 μg/g, 1.2 μg/g, or 2.4 μg/g wet wt in diet); the total mercury concentration (mean ± standard deviation [SD] dry wt basis) of their eggs was 7.03 ± 1.38 μg/g, 14.15 ± 2.52 μg/g, 26.85 ± 5.85 μg/g, and 49.76 ± 10.37 μg/g, respectively (equivalent to fresh wt egg mercury concentrations of 1.24 μg/g, 2.50 μg/g, 4.74 μg/g, and 8.79 μg/g). The authors observed a significant decrease in the mercury concentration of successive eggs when compared with the first egg and notable variation between clutches within treatments. The mercury level of individual females within and among treatments did not alter this relationship. Based on the results, sampling of a single egg in each clutch from any position in the laying sequence is sufficient for purposes of population risk assessment, but it is not recommended as a proxy for individual female exposure or as an estimate of average mercury level within the clutch.

  11. Exploring genetic variation in the tomato (Solanum section Lycopersicon) clade by whole-genome sequencing.

    PubMed

    Aflitos, Saulo; Schijlen, Elio; de Jong, Hans; de Ridder, Dick; Smit, Sandra; Finkers, Richard; Wang, Jun; Zhang, Gengyun; Li, Ning; Mao, Likai; Bakker, Freek; Dirks, Rob; Breit, Timo; Gravendeel, Barbara; Huits, Henk; Struss, Darush; Swanson-Wagner, Ruth; van Leeuwen, Hans; van Ham, Roeland C H J; Fito, Laia; Guignier, Laëtitia; Sevilla, Myrna; Ellul, Philippe; Ganko, Eric; Kapur, Arvind; Reclus, Emannuel; de Geus, Bernard; van de Geest, Henri; Te Lintel Hekkert, Bas; van Haarst, Jan; Smits, Lars; Koops, Andries; Sanchez-Perez, Gabino; van Heusden, Adriaan W; Visser, Richard; Quan, Zhiwu; Min, Jiumeng; Liao, Li; Wang, Xiaoli; Wang, Guangbiao; Yue, Zhen; Yang, Xinhua; Xu, Na; Schranz, Eric; Smets, Erik; Vos, Rutger; Rauwerda, Johan; Ursem, Remco; Schuit, Cees; Kerns, Mike; van den Berg, Jan; Vriezen, Wim; Janssen, Antoine; Datema, Erwin; Jahrman, Torben; Moquet, Frederic; Bonnet, Julien; Peters, Sander

    2014-10-01

    We explored genetic variation by sequencing a selection of 84 tomato accessions and related wild species representative of the Lycopersicon, Arcanum, Eriopersicon and Neolycopersicon groups, which has yielded a huge amount of precious data on sequence diversity in the tomato clade. Three new reference genomes were reconstructed to support our comparative genome analyses. Comparative sequence alignment revealed group-, species- and accession-specific polymorphisms, explaining characteristic fruit traits and growth habits in the various cultivars. Using gene models from the annotated Heinz 1706 reference genome, we observed differences in the ratio between non-synonymous and synonymous SNPs (dN/dS) in fruit diversification and plant growth genes compared to a random set of genes, indicating positive selection and differences in selection pressure between crop accessions and wild species. In wild species, the number of single-nucleotide polymorphisms (SNPs) exceeds 10 million, i.e. 20-fold higher than found in most of the crop accessions, indicating dramatic genetic erosion of crop and heirloom tomatoes. In addition, the highest levels of heterozygosity were found for allogamous self-incompatible wild species, while facultative and autogamous self-compatible species display a lower heterozygosity level. Using whole-genome SNP information for maximum-likelihood analysis, we achieved complete tree resolution, whereas maximum-likelihood trees based on SNPs from ten fruit and growth genes show incomplete resolution for the crop accessions, partly due to the effect of heterozygous SNPs. Finally, results suggest that phylogenetic relationships are correlated with habitat, indicating the occurrence of geographical races within these groups, which is of practical importance for Solanum genome evolution studies.

  12. High sensitivity of the single-strand conformation polymorphism method for detecting sequence variations in the low-density lipoprotein receptor gene validated by DNA sequencing.

    PubMed

    Jensen, H K; Jensen, L G; Hansen, P S; Faergeman, O; Gregersen, N

    1996-08-01

    We designed oligonucleotide primer pairs to amplify the promoter region, the translated exon sequences, and the flanking intron sequences of all 18 exons of the LDL receptor gene to compare the ability of the PCR single-strand conformation polymorphism (PCR-SSCP) method with semiautomated solid-phase genomic DNA sequencing to detect sequence variations. In 20 apparently unrelated Danish patients with a clinical diagnosis of heterozygous familial hypercholesterolemia (FH), we identified 13 different mutations in the LDL receptor gene: two silent (C331C, N494N); five missense (W66G, E119K, T383P, W556S, T705I); one nonsense (W23X); three splice-site (313 + 1G-->A, 1061-8T-->C, 1846-1G-->A); and two frameshift (335del10, 1650delG) mutations. Four of these mutations, N494N, T383P, 1061-8T-->C, and W556S, have not been reported earlier. The pathogenicity of the T383P, 1061-8T-->C, and W556S mutations remains to be established by in vitro mutagenesis and transfection studies. One patient had three mutations (335del10, 1061-8T-->C, and T705I) on the same allele. Further, nine well-known polymorphisms were detectable with this methodological setup. Direct DNA sequencing of the PCR products used for the SSCP analysis did not reveal any sequence variations not detected by the PCR-SSCP method. In two patients we did not detect any mutation by either method. We conclude that the PCR-SSCP analysis, performed as described here, is as sensitive and efficient as DNA sequencing in the ability to identify the sequence variations in the LDL receptor gene of the patients with heterozygous FH of this study.

  13. Variation-tolerant capture and multiplex detection of nucleic acids: application to detection of microbes.

    PubMed

    Ohrmalm, Christina; Eriksson, Ronnie; Jobs, Magnus; Simonson, Magnus; Strømme, Maria; Bondeson, Kåre; Herrmann, Björn; Melhus, Asa; Blomberg, Jonas

    2012-10-01

    In contrast to ordinary PCRs, which have a limited multiplex capacity and often return false-negative results due to target variation or inhibition, our new detection strategy, VOCMA (variation-tolerant capture multiplex assay), allows variation-tolerant, target-specific capture and detection of many nucleic acids in one test. Here we demonstrate the use of a single-tube, dual-step amplification strategy that overcomes the usual limitations of PCR multiplexing, allowing at least a 22-plex format with retained sensitivity. Variation tolerance was achieved using long primers and probes designed to withstand variation at known sites and a judicious mix of degeneration and universal bases. We tested VOCMA in situations where enrichment from a large sample volume with high sensitivity and multiplexity is important (sepsis; streptococci, enterococci, and staphylococci, several enterobacteria, candida, and the most important antibiotic resistance genes) and where variation tolerance and high multiplexity is important (gastroenteritis; astrovirus, adenovirus, rotavirus, norovirus genogroups I and II, and sapovirus, as well as enteroviruses, which are not associated with gastroenteritis). Detection sensitivities of 10 to 1,000 copies per reaction were achieved for many targets. VOCMA is a highly multiplex, variation-tolerant, general purpose nucleic acid detection concept. It is a specific and sensitive method for simultaneous detection of nucleic acids from viruses, bacteria, fungi, and protozoa, as well as host nucleic acid, in the same test. It can be run on an ordinary PCR and a Luminex machine and is suitable for both clinical diagnoses and microbial surveillance.

  14. New BZLF1 sequence variations in EBV-associated undifferentiated nasopharyngeal carcinoma in southern China.

    PubMed

    Ji, Kun-Mei; Li, Chun-Lin; Meng, Guang; Han, Ai-Dong; Wu, Xu-Li

    2008-01-01

    The viral lytic gene BZLF1 triggers replication of the Epstein-Barr virus (EBV), which is commonly found in nasopharyngeal carcinoma (NPC). Here, RT-PCR revealed five new BZLF1 variants in 8 of 12 NPC and 4 of 12 non-NPC nasopharyngeal biopsies from an NPC-endemic area in southern China. The deduced peptide sequence of the dominant BZLF1 variant differed by 11 amino acids from that of the prototypical strain B95.8 (V01555). Anti-ZEBRA antibody levels were higher in NPC than in non-NPC patients (P < 0.001). These findings demonstrated a dominant BZLF1 variant in southern Chinese EBV-associated NPC and non-NPC patients.

  15. PASTA: Ultra-Large Multiple Sequence Alignment for Nucleotide and Amino-Acid Sequences.

    PubMed

    Mirarab, Siavash; Nguyen, Nam; Guo, Sheng; Wang, Li-San; Kim, Junhyong; Warnow, Tandy

    2015-05-01

    We introduce PASTA, a new multiple sequence alignment algorithm. PASTA uses a new technique to produce an alignment given a guide tree that enables it to be both highly scalable and very accurate. We present a study on biological and simulated data with up to 200,000 sequences, showing that PASTA produces highly accurate alignments, improving on the accuracy and scalability of the leading alignment methods (including SATé). We also show that trees estimated on PASTA alignments are highly accurate--slightly better than SATé trees, but with substantial improvements relative to other methods. Finally, PASTA is faster than SATé, highly parallelizable, and requires relatively little memory.

  16. SETG: Nucleic Acid Extraction and Sequencing for In Situ Life Detection on Mars

    NASA Astrophysics Data System (ADS)

    Mojarro, A.; Hachey, J.; Tani, J.; Smith, A.; Bhattaru, S. A.; Pontefract, A.; Doebler, R.; Brown, M.; Ruvkun, G.; Zuber, M. T.; Carr, C. E.

    2016-10-01

    We are developing an integrated nucleic acid extraction and sequencing instrument: the Search for Extra-Terrestrial Genomes (SETG) for in situ life detection on Mars. Our goals are to identify related or unrelated nucleic acid-based life on Mars.

  17. Draft Genome Sequence of Cyanobacterium sp. Strain IPPAS B-1200 with a Unique Fatty Acid Composition

    PubMed Central

    Starikov, Alexander Y.; Usserbaeva, Aizhan A.; Sinetova, Maria A.; Sarsekeyeva, Fariza K.; Zayadan, Bolatkhan K.; Ustinova, Vera V.; Kupriyanova, Elena V.; Los, Dmitry A.

    2016-01-01

    Here, we report the draft genome of Cyanobacterium sp. IPPAS strain B-1200, isolated from Lake Balkhash, Kazakhstan, and characterized by the unique fatty acid composition of its membrane lipids, which are enriched with myristic and myristoleic acids. The approximate genome size is 3.4 Mb, and the predicted number of coding sequences is 3,119. PMID:27856596

  18. De Novo Structure Prediction of Globular Proteins Aided by Sequence Variation-Derived Contacts

    PubMed Central

    Kosciolek, Tomasz; Jones, David T.

    2014-01-01

    The advent of high accuracy residue-residue intra-protein contact prediction methods enabled a significant boost in the quality of de novo structure predictions. Here, we investigate the potential benefits of combining a well-established fragment-based folding algorithm – FRAGFOLD, with PSICOV, a contact prediction method which uses sparse inverse covariance estimation to identify co-varying sites in multiple sequence alignments. Using a comprehensive set of 150 diverse globular target proteins, up to 266 amino acids in length, we are able to address the effectiveness and some limitations of such approaches to globular proteins in practice. Overall we find that using fragment assembly with both statistical potentials and predicted contacts is significantly better than either statistical potentials or contacts alone. Results show up to nearly 80% of correct predictions (TM-score ≥0.5) within analysed dataset and a mean TM-score of 0.54. Unsuccessful modelling cases emerged either from conformational sampling problems, or insufficient contact prediction accuracy. Nevertheless, a strong dependency of the quality of final models on the fraction of satisfied predicted long-range contacts was observed. This not only highlights the importance of these contacts on determining the protein fold, but also (combined with other ensemble-derived qualities) provides a powerful guide as to the choice of correct models and the global quality of the selected model. A proposed quality assessment scoring function achieves 0.93 precision and 0.77 recall for the discrimination of correct folds on our dataset of decoys. These findings suggest the approach is well-suited for blind predictions on a variety of globular proteins of unknown 3D structure, provided that enough homologous sequences are available to construct a large and accurate multiple sequence alignment for the initial contact prediction step. PMID:24637808

  19. De novo structure prediction of globular proteins aided by sequence variation-derived contacts.

    PubMed

    Kosciolek, Tomasz; Jones, David T

    2014-01-01

    The advent of high accuracy residue-residue intra-protein contact prediction methods enabled a significant boost in the quality of de novo structure predictions. Here, we investigate the potential benefits of combining a well-established fragment-based folding algorithm--FRAGFOLD, with PSICOV, a contact prediction method which uses sparse inverse covariance estimation to identify co-varying sites in multiple sequence alignments. Using a comprehensive set of 150 diverse globular target proteins, up to 266 amino acids in length, we are able to address the effectiveness and some limitations of such approaches to globular proteins in practice. Overall we find that using fragment assembly with both statistical potentials and predicted contacts is significantly better than either statistical potentials or contacts alone. Results show up to nearly 80% of correct predictions (TM-score ≥0.5) within analysed dataset and a mean TM-score of 0.54. Unsuccessful modelling cases emerged either from conformational sampling problems, or insufficient contact prediction accuracy. Nevertheless, a strong dependency of the quality of final models on the fraction of satisfied predicted long-range contacts was observed. This not only highlights the importance of these contacts on determining the protein fold, but also (combined with other ensemble-derived qualities) provides a powerful guide as to the choice of correct models and the global quality of the selected model. A proposed quality assessment scoring function achieves 0.93 precision and 0.77 recall for the discrimination of correct folds on our dataset of decoys. These findings suggest the approach is well-suited for blind predictions on a variety of globular proteins of unknown 3D structure, provided that enough homologous sequences are available to construct a large and accurate multiple sequence alignment for the initial contact prediction step.

  20. Sequencing and computational analysis of complete genome sequences of Citrus yellow mosaic badna virus from acid lime and pummelo.

    PubMed

    Borah, Basanta K; Johnson, A M Anthony; Sai Gopal, D V R; Dasgupta, Indranil

    2009-08-01

    Citrus yellow mosaic badna virus (CMBV), a member of the Family Caulimoviridae, Genus Badnavirus, is the causative agent of Citrus mosaic disease in India. Although the virus has been detected in several citrus species, only two full-length genomes, one each from Sweet orange and Rangpur lime, are available in publicly accessible databases. In order to obtain a better understanding of the genetic variability of the virus in other citrus mosaic-affected citrus species, we performed the cloning and sequence analysis of complete genomes of CMBV from two additional citrus species, Acid lime and Pummelo. We show that CMBV genomes from the two hosts share high homology with previously reported CMBV sequences and hence conclude that the new isolates represent variants of the virus present in these species. Based on in silico sequence analysis, we predict the possible function of the protein encoded by one of the five ORFs.

  1. Parvalbumins from coelacanth muscle. III. Amino acid sequence of the major component.

    PubMed

    Jauregui-Adell, J; Pechere, J F

    1978-09-26

    The primary structure of the major parvalbumin (pI = 4.52) from coelacanth muscle (Latimeria chalumnae) has been determined. Sequence analysis of the tryptic peptides, in some cases obtained with beta-trypsin, accounts for the total amino acid content of the protein. Chymotryptic peptides provide appropriate sequence overlaps, to complete the localization of the tryptic peptides. Examination of the amino acid sequence of this protein shows the typical structure of a beta-parvalbumin. Its position in the dendrogram of related calcium-binding proteins corresponds to that usually accepted for crossopterygians.

  2. Analysis of cloned cDNA and genomic sequences for phytochrome: complete amino acid sequences for two gene products expressed in etiolated Avena.

    PubMed Central

    Hershey, H P; Barker, R F; Idler, K B; Lissemore, J L; Quail, P H

    1985-01-01

    Cloned cDNA and genomic sequences have been analyzed to deduce the amino acid sequence of phytochrome from etiolated Avena. Restriction endonuclease site polymorphism between clones indicates that at least four phytochrome genes are expressed in this tissue. Sequence analysis of two complete and one partial coding region shows approximately 98% homology at both the nucleotide and amino acid levels, with the majority of amino acid changes being conservative. High sequence homology is also found in the 5'-untranslated region but significant divergence occurs in the 3'-untranslated region. The phytochrome polypeptides are 1128 amino acid residues long corresponding to a molecular mass of 125 kdaltons. The known protein sequence at the chromophore attachment site occurs only once in the polypeptide, establishing that phytochrome has a single chromophore per monomer covalently linked to Cys-321. Computer analyses of the amino acid sequences have provided predictions regarding a number of structural features of the phytochrome molecule. PMID:3001642

  3. Detection and quantitation of single nucleotide polymorphisms, DNA sequence variations, DNA mutations, DNA damage and DNA mismatches

    DOEpatents

    McCutchen-Maloney, Sandra L.

    2002-01-01

    DNA mutation binding proteins alone and as chimeric proteins with nucleases are used with solid supports to detect DNA sequence variations, DNA mutations and single nucleotide polymorphisms. The solid supports may be flow cytometry beads, DNA chips, glass slides or DNA dipsticks. DNA molecules are coupled to solid supports to form DNA-support complexes. Labeled DNA is used with unlabeled DNA mutation binding proteins such as TthMutS to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by binding which gives an increase in signal. Unlabeled DNA is utilized with labeled chimeras to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by nuclease activity of the chimera which gives a decrease in signal.

  4. Sequence variation in couch potato and its effects on life-history traits in a northern malt fly, Drosophila montana.

    PubMed

    Kankare, Maaria; Salminen, Tiina S; Lampinen, Hanna; Hoikkala, Anneli

    2012-02-01

    Couch potato (cpo) has previously been connected to reproductive diapause in several insect species including Drosophila melanogaster, where it has been suggested to provide a link between the insulin signalling pathway and the hormonal control of diapause. In the first part of the study we sequenced nearly 3.6 kb of this gene in a northern Drosophila species (Drosophila montana) with a robust photoperiodically determined diapause and found several types of polymorphisms along the sequenced area. We also found variation among five Drosophila virilis group species in the length of the 5th exon of cpo and in the site of the stop codon at the end of this exon. The second part of the study was targeted on a deletion of six amino acids located in the last section of exon 5, which in D. melanogaster, is translated only in one short transcript lacking the following exons. The studied deletion appeared to be extremely rare in the wild D. montana population where it was found, but its frequency rapidly increased during laboratory culture. qPCR analyses showed the expression level of the deletion allele to be significantly downregulated in both the diapausing and non-diapausing females compared to the wild type allele. At the phenotypic level, the deletion and the decreased expression of cpo transcript involving it did not have direct effect on the incidence of female reproductive diapause, but it was associated with a reduction in development time under diapause-inducing conditions. This suggests that while the cpo transcript containing the prolonged version of the 5th exon with a stop codon is clearly associated with fly development time, the exons with RNA domains included in other transcripts of the gene may be more directly related to diapause regulation.

  5. Simultaneous Detection of Both Single Nucleotide Variations and Copy Number Alterations by Next-Generation Sequencing in Gorlin Syndrome

    PubMed Central

    Morita, Kei-ichi; Naruto, Takuya; Tanimoto, Kousuke; Yasukawa, Chisato; Oikawa, Yu; Masuda, Kiyoshi; Imoto, Issei; Inazawa, Johji; Omura, Ken; Harada, Hiroyuki

    2015-01-01

    Gorlin syndrome (GS) is an autosomal dominant disorder that predisposes affected individuals to developmental defects and tumorigenesis, and is caused mainly by heterozygous germline PTCH1 mutations. Despite exhaustive analysis, PTCH1 mutations are unidentifiable in some patients; the failure to detect mutations is presumably because the mutations occur in other causative genes or outside the analyzed regions of PTCH1, or because of copy number alterations (CNAs). In this study, we subjected a cohort of GS-affected individuals from six unrelated families to next-generation sequencing (NGS) analysis for the combined screening of causative alterations in Hedgehog signaling pathway-related genes. Specific single nucleotide variations (SNVs) of PTCH1 causing inferred amino acid changes were identified in four families (seven affected individuals), whereas CNAs within or around PTCH1 were found in two families in whom possible causative SNVs were not detected. Through a targeted resequencing of all coding exons, as well as simultaneous evaluation of copy number status using the alignment map files obtained via NGS, we found that GS phenotypes could be explained by PTCH1 mutations or deletions in all affected patients. Because it is advisable to evaluate CNAs of candidate causative genes in point mutation-negative cases, NGS methodology appears to be useful for improving molecular diagnosis through the simultaneous detection of both SNVs and CNAs in the targeted genes/regions. PMID:26544948

  6. Simultaneous Detection of Both Single Nucleotide Variations and Copy Number Alterations by Next-Generation Sequencing in Gorlin Syndrome.

    PubMed

    Morita, Kei-ichi; Naruto, Takuya; Tanimoto, Kousuke; Yasukawa, Chisato; Oikawa, Yu; Masuda, Kiyoshi; Imoto, Issei; Inazawa, Johji; Omura, Ken; Harada, Hiroyuki

    2015-01-01

    Gorlin syndrome (GS) is an autosomal dominant disorder that predisposes affected individuals to developmental defects and tumorigenesis, and is caused mainly by heterozygous germline PTCH1 mutations. Despite exhaustive analysis, PTCH1 mutations are unidentifiable in some patients; the failure to detect mutations is presumably because the mutations occur in other causative genes or outside the analyzed regions of PTCH1, or because of copy number alterations (CNAs). In this study, we subjected a cohort of GS-affected individuals from six unrelated families to next-generation sequencing (NGS) analysis for the combined screening of causative alterations in Hedgehog signaling pathway-related genes. Specific single nucleotide variations (SNVs) of PTCH1 causing inferred amino acid changes were identified in four families (seven affected individuals), whereas CNAs within or around PTCH1 were found in two families in whom possible causative SNVs were not detected. Through a targeted resequencing of all coding exons, as well as simultaneous evaluation of copy number status using the alignment map files obtained via NGS, we found that GS phenotypes could be explained by PTCH1 mutations or deletions in all affected patients. Because it is advisable to evaluate CNAs of candidate causative genes in point mutation-negative cases, NGS methodology appears to be useful for improving molecular diagnosis through the simultaneous detection of both SNVs and CNAs in the targeted genes/regions.

  7. Purification, characterization and partial amino acid sequence of glycogen synthase from Saccharomyces cerevisiae.

    PubMed Central

    Carabaza, A; Arino, J; Fox, J W; Villar-Palasi, C; Guinovart, J J

    1990-01-01

    Glycogen synthase from Saccharomyces cerevisiae was purified to homogeneity. The enzyme showed a subunit molecular mass of 80 kDa. The holoenzyme appears to be a tetramer. Antibodies developed against purified yeast glycogen synthase inactivated the enzyme in yeast extracts and allowed the detection of the protein in Western blots. Amino acid analysis showed that the enzyme is very rich in glutamate and/or glutamine residues. The N-terminal sequence (11 amino acid residues) was determined. In addition, selected tryptic-digest peptides were purified by reverse-phase h.p.l.c. and submitted to gas-phase sequencing. Up to eight sequences (79 amino acid residues) could be aligned with the human muscle enzyme sequence. Levels of identity range between 37 and 100%, indicating that, although human and yeast glycogen synthases probably share some conserved regions, significant differences in their primary structure should be expected. PMID:2114092

  8. Amino acid sequence of anionic peroxidase from the windmill palm tree Trachycarpus fortunei.

    PubMed

    Baker, Margaret R; Zhao, Hongwei; Sakharov, Ivan Yu; Li, Qing X

    2014-12-10

    Palm peroxidases are extremely stable and have uncommon substrate specificity. This study was designed to fill in the knowledge gap about the structures of a peroxidase from the windmill palm tree Trachycarpus fortunei. The complete amino acid sequence and partial glycosylation were determined by MALDI-top-down sequencing of native windmill palm tree peroxidase (WPTP), MALDI-TOF/TOF MS/MS of WPTP tryptic peptides, and cDNA sequencing. The propeptide of WPTP contained N- and C-terminal signal sequences which contained 21 and 17 amino acid residues, respectively. Mature WPTP was 306 amino acids in length, and its carbohydrate content ranged from 21% to 29%. Comparison to closely related royal palm tree peroxidase revealed structural features that may explain differences in their substrate specificity. The results can be used to guide engineering of WPTP and its novel applications.

  9. Physicochemical consequences of amino acid variations that contribute to fibril formation by immunoglobulin light chains.

    PubMed Central

    Raffen, R.; Dieckman, L. J.; Szpunar, M.; Wunschl, C.; Pokkuluri, P. R.; Dave, P.; Wilkins Stevens, P.; Cai, X.; Schiffer, M.; Stevens, F. J.

    1999-01-01

    The most common form of systemic amyloidosis originates from antibody light chains. The large number of amino acid variations that distinguish amyloidogenic from nonamyloidogenic light chain proteins has impeded our understanding of the structural basis of light-chain fibril formation. Moreover, even among the subset of human light chains that are amyloidogenic, many primary structure differences are found. We compared the thermodynamic stabilities of two recombinant kappa4 light-chain variable domains (V(L)s) derived from amyloidogenic light chains with a V(L) from a benign light chain. The amyloidogenic V(L)s were significantly less stable than the benign V(L). Furthermore, only the amyloidogenic V(L)s formed fibrils under native conditions in an in vitro fibril formation assay. We used site-directed mutagenesis to examine the consequences of individual amino acid substitutions found in the amyloidogenic V(L)s on stability and fibril formation capability. Both stabilizing and destabilizing mutations were found; however, only destabilizing mutations induced fibril formation in vitro. We found that fibril formation by the benign V(L) could be induced by low concentrations of a denaturant. This indicates that there are no structural or sequence-specific features of the benign V(L) that are incompatible with fibril formation, other than its greater stability. These studies demonstrate that the V(L) beta-domain structure is vulnerable to destabilizing mutations at a number of sites, including complementarity determining regions (CDRs), and that loss of variable domain stability is a major driving force in fibril formation. PMID:10091653

  10. TranslatorX: multiple alignment of nucleotide sequences guided by amino acid translations.

    PubMed

    Abascal, Federico; Zardoya, Rafael; Telford, Maximilian J

    2010-07-01

    We present TranslatorX, a web server designed to align protein-coding nucleotide sequences based on their corresponding amino acid translations. Many comparisons between biological sequences (nucleic acids and proteins) involve the construction of multiple alignments. Alignments represent a statement regarding the homology between individual nucleotides or amino acids within homologous genes. As protein-coding DNA sequences evolve as triplets of nucleotides (codons) and it is known that sequence similarity degrades more rapidly at the DNA than at the amino acid level, alignments are generally more accurate when based on amino acids than on their corresponding nucleotides. TranslatorX novelties include: (i) use of all documented genetic codes and the possibility of assigning different genetic codes for each sequence; (ii) a battery of different multiple alignment programs; (iii) translation of ambiguous codons when possible; (iv) an innovative criterion to clean nucleotide alignments with GBlocks based on protein information; and (v) a rich output, including Jalview-powered graphical visualization of the alignments, codon-based alignments coloured according to the corresponding amino acids, measures of compositional bias and first, second and third codon position specific alignments. The TranslatorX server is freely available at http://translatorx.co.uk.
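
    The core idea described above, aligning nucleotides by following an amino acid alignment, can be sketched in a few lines. This is a minimal illustration and not TranslatorX's own code: it assumes the translation and the protein alignment have already been done elsewhere, that each coding sequence has had its stop codon removed, and that gaps are written as "-". The function name thread_codons is hypothetical.

```python
def thread_codons(aligned_protein, nucleotides, gap="-"):
    """Map an unaligned coding sequence onto its aligned protein.

    Illustrative sketch: each residue in the aligned protein consumes
    the next codon (three nucleotides); each protein gap becomes a
    three-character nucleotide gap, so codons are never split.
    """
    residue_count = sum(1 for symbol in aligned_protein if symbol != gap)
    if len(nucleotides) != 3 * residue_count:
        raise ValueError("nucleotide length must be three times the residue count")

    pieces = []
    position = 0
    for symbol in aligned_protein:
        if symbol == gap:
            # A gap in the protein alignment spans one whole codon.
            pieces.append(gap * 3)
        else:
            pieces.append(nucleotides[position:position + 3])
            position += 3
    return "".join(pieces)


# Two sequences whose protein alignment places a gap in different columns.
codon_alignment = [
    thread_codons("M-K", "ATGAAA"),
    thread_codons("MK-", "ATGAAG"),
]
```

    Because gaps are inserted only in multiples of three, the reading frame is preserved across the whole nucleotide alignment, which is what makes codon-position-specific output possible.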

  11. Variations in prebiotic oligosaccharide fermentation by intestinal lactic acid bacteria.

    PubMed

    Endo, Akihito; Nakamura, Saki; Konishi, Kenta; Nakagawa, Junichi; Tochio, Takumi

    2016-01-01

    Prebiotic oligosaccharides confer health benefits on the host by modulating the gut microbiota. Intestinal lactic acid bacteria (LAB) are potential targets of prebiotics; however, the metabolism of oligosaccharides by LAB has not been fully characterized. Here, we studied the metabolism of eight oligosaccharides by 19 strains of intestinal LAB. Among the eight oligosaccharides used, 1-kestose, lactosucrose and galactooligosaccharides (GOSs) led to the greatest increases in the numbers of the strains tested. However, mono- and disaccharides accounted for more than half of the GOSs used, and several strains only metabolized the mono- and disaccharides in GOSs. End product profiles indicated that the amounts of lactate produced were generally consistent with the bacterial growth recorded. Oligosaccharide profiling revealed a distinctive metabolic pattern in Lactobacillus paracasei strains, which metabolized all oligosaccharides, but left sucrose when cultured with fructooligosaccharides. The present study clearly indicated that the prebiotic potential of each oligosaccharide differs.

  12. Amino acid sequence of homologous rat atrial peptides: natriuretic activity of native and synthetic forms.

    PubMed Central

    Seidah, N G; Lazure, C; Chrétien, M; Thibault, G; Garcia, R; Cantin, M; Genest, J; Nutt, R F; Brady, S F; Lyle, T A

    1984-01-01

    A substance called atrial natriuretic factor (ANF), localized in secretory granules of atrial cardiocytes, was isolated as four homologous natriuretic peptides from homogenates of rat atria. The complete sequence of the longest form showed that it is composed of 33 amino acids. The three other shorter forms (2-33, 3-33, and 8-33) represent amino-terminally truncated versions of the 33 amino acid parent molecule as shown by analysis of sequence, amino acid composition, or both. The proposed primary structure agrees entirely with the amino acid composition and reveals no significant sequence homology with any known protein or segment of protein. The short form ANF-(8-33) was synthesized by a multi-fragment condensation approach and the synthetic product was shown to exhibit specific activity comparable to that of the natural ANF-(3-33). PMID:6232612

  13. Nucleotide and deduced amino acid sequences of a new subtilisin from an alkaliphilic Bacillus isolate.

    PubMed

    Saeki, Katsuhisa; Magallones, Marietta V; Takimura, Yasushi; Hatada, Yuji; Kobayashi, Tohru; Kawai, Shuji; Ito, Susumu

    2003-10-01

    The gene for a new subtilisin from the alkaliphilic Bacillus sp. KSM-LD1 was cloned and sequenced. The open reading frame of the gene encoded a 97 amino-acid prepro-peptide plus a 307 amino-acid mature enzyme that contained a possible catalytic triad of residues, Asp32, His66, and Ser224. The deduced amino acid sequence of the mature enzyme (LD1) showed approximately 65% identity to those of subtilisins SprC and SprD from alkaliphilic Bacillus sp. LG12. The amino acid sequence identities of LD1 to those of previously reported true subtilisins and high-alkaline proteases were below 60%. LD1 was characteristically stable during incubation with surfactants and chemical oxidants. Interestingly, an oxidizable Met residue is located next to the catalytic Ser224 of the enzyme as in the cases of the oxidation-susceptible subtilisins reported to date.

  14. Shark myelin basic protein: amino acid sequence, secondary structure, and self-association.

    PubMed

    Milne, T J; Atkins, A R; Warren, J A; Auton, W P; Smith, R

    1990-09-01

    Myelin basic protein (MBP) from the Whaler shark (Carcharhinus obscurus) has been purified from acid extracts of a chloroform/methanol pellet from whole brains. The amino acid sequence of the majority of the protein has been determined and compared with the sequences of other MBPs. The shark protein has only 44% homology with the bovine protein, but, in common with other MBPs, it has basic residues distributed throughout the sequence and no extensive segments that are predicted to have an ordered secondary structure in solution. Shark MBP lacks the triproline sequence previously postulated to form a hairpin bend in the molecule. The region containing the putative consensus sequence for encephalitogenicity in the guinea pig contains several substitutions, thus accounting for the lack of activity of the shark protein. Studies of the secondary structure and self-association have shown that shark MBP possesses solution properties similar to those of the bovine protein, despite the extensive differences in primary structure.

  15. The rules of variation: Amino acid exchange according to the rotating circular genetic code

    PubMed Central

    Castro-Chavez, Fernando

    2011-01-01

    General guidelines for the molecular basis of functional variation are presented while focused on the rotating circular genetic code and allowable exchanges that make it resistant to genetic diseases under normal conditions. The rules of variation, bioinformatics aids for preventive medicine, are: (1) same position in the four quadrants for hydrophobic codons, (2) same or contiguous position in two quadrants for synonymous or related codons, and (3) same quadrant for equivalent codons. To preserve protein function, amino acid exchange according to the first rule takes into account the positional homology of essential hydrophobic amino acids with every codon with a central uracil in the four quadrants, the second rule includes codons for identical, acidic, or their amidic amino acids present in two quadrants, and the third rule, the smaller, aromatic, stop codons, and basic amino acids, each in proximity within a 90 degree angle. I also define codifying genes and palindromati, CTCGTGCCGAATTCGGCACGAG. PMID:20371250
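
    The 22-nucleotide sequence quoted at the end of this abstract can be checked for one specific property: whether it equals its own reverse complement. The sketch below assumes that this is what the truncated term in the abstract refers to; the helper names are hypothetical and only the sequence itself comes from the record.

```python
# Minimal sketch: test whether a DNA sequence equals its own reverse complement.
# Helper names are illustrative; only SEQ is taken from the abstract.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an unambiguous DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_rc_palindrome(seq: str) -> bool:
    """True when the sequence reads the same on both strands (5' to 3')."""
    return seq.upper() == reverse_complement(seq)


SEQ = "CTCGTGCCGAATTCGGCACGAG"
print(is_rc_palindrome(SEQ))  # True
```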

  16. Complete cDNA and derived amino acid sequence of human factor V

    SciTech Connect

    Jenny, R.J.; Pittman, D.D.; Toole, J.J.; Kriz, R.W.; Aldape, R.A.; Hewick, R.M.; Kaufman, R.J.; Mann, K.G.

    1987-07-01

    cDNA clones encoding human factor V have been isolated from an oligo(dT)-primed human fetal liver cDNA library prepared with vector Charon 21A. The cDNA sequence of factor V from three overlapping clones includes a 6672-base-pair (bp) coding region, a 90-bp 5' untranslated region, and a 163-bp 3' untranslated region within which is a poly(A) tail. The deduced amino acid sequence consists of 2224 amino acids inclusive of a 28-amino acid leader peptide. Direct comparison with human factor VIII reveals considerable homology between proteins in amino acid sequence and domain structure: a triplicated A domain and duplicated C domain show approx. 40% identity with the corresponding domains in factor VIII. As in factor VIII, the A domains of factor V share approx. 40% amino acid-sequence homology with the three highly conserved domains in ceruloplasmin. The B domain of factor V contains 35 tandem and approx. 9 additional semiconserved repeats of nine amino acids of the form Asp-Leu-Ser-Gln-Thr-Thr/Asn-Leu-Ser-Pro and 2 additional semiconserved repeats of 17 amino acids. Factor V contains 37 potential N-linked glycosylation sites, 25 of which are in the B domain, and a total of 19 cysteine residues.
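
    The lengths reported in this abstract are consistent with three nucleotides per codon. The short sketch below reproduces that arithmetic; it assumes the 6672-bp figure excludes the stop codon (it divides exactly into the 2224 residues stated), and the function and constant names are hypothetical.

```python
# Minimal sketch of the length arithmetic behind the reported figures.
# Assumption: the 6672-bp coding region does not count the stop codon.
CODING_BP = 6672      # coding region, from the abstract
FIVE_UTR_BP = 90      # 5' untranslated region, from the abstract
THREE_UTR_BP = 163    # 3' untranslated region, from the abstract
LEADER_AA = 28        # leader peptide, from the abstract


def protein_length(coding_bp: int) -> int:
    """Number of codons in a coding region whose length is a multiple of 3."""
    if coding_bp % 3 != 0:
        raise ValueError("coding region length is not a multiple of 3")
    return coding_bp // 3


precursor_aa = protein_length(CODING_BP)           # 2224
mature_aa = precursor_aa - LEADER_AA               # 2196
cdna_bp = FIVE_UTR_BP + CODING_BP + THREE_UTR_BP   # 6925
print(precursor_aa, mature_aa, cdna_bp)
```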

  17. Enterovirus D68 in Hospitalized Children: Sequence Variation, Viral Loads and Clinical Outcomes

    PubMed Central

    Salamon, Douglas; Leber, Amy; Mejias, Asuncion

    2016-01-01

    Background: An outbreak of enterovirus D68 (EV-D68) caused severe respiratory illness in 2014. The disease spectrum of EV-D68 infections in children with underlying medical conditions other than asthma, the role of EV-D68 loads on clinical illness, and the variation of EV-D68 strains within the same institution over time have not been described. We sought to define the association between EV-D68 loads and sequence variation, and the clinical characteristics in hospitalized children at our institution from 2011 to 2014. Methods: May through November 2014, and August to September 2011 to 2013, a convenience sample of nasopharyngeal specimens from children with rhinovirus (RV)/EV respiratory infections were tested for EV-D68 by RT-PCR. Clinical data were compared between children with RV/EV-non-EV-D68 and EV-D68 infections, and among children with EV-D68 infections categorized as healthy, asthmatics, and chronic medical conditions. EV-D68 loads were analyzed in relation to disease severity parameters and sequence variability characterized over time. Results: In 2014, 44% (192/438) of samples tested positive for EV-D68 vs. 10% (13/130) in 2011–13 (p<0.0001). PICU admissions (p<0.0001) and non-invasive ventilation (p<0.0001) were more common in children with EV-D68 vs. RV/EV-non-EV-D68 infections. Asthmatic EV-D68+ children required supplemental oxygen administration (p = 0.03) and PICU admissions (p<0.001) more frequently than healthy children or those with chronic medical conditions; however, oxygen duration (p<0.0001), and both PICU and total hospital stay (p<0.01) were greater in children with underlying medical conditions, irrespective of viral burden. By phylogenetic analysis, the 2014 EV-D68 strains clustered into a new sublineage within clade B. Conclusions: This is one of the largest pediatric cohorts described from the EV-D68 outbreak. Irrespective of viral loads, EV-D68 was associated with high morbidity in children with asthma and co-morbidities. While EV-D68
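
    The positivity rates in the Results (192/438 vs. 13/130) can be recomputed from the raw counts. The sketch below uses a pooled two-proportion z-test purely as an illustration; the abstract does not say which test the authors used, so the choice of test and the function name are assumptions.

```python
import math

# Minimal sketch: recompute the two positivity rates and a two-sided p-value.
# Assumption: a pooled two-proportion z-test stands in for the unstated test.


def two_proportion_z(x1: int, n1: int, x2: int, n2: int):
    """Return (p1, p2, z, two-sided p-value) for two independent proportions."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    # Two-sided tail probability of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p1, p2, z, p_value


p1, p2, z, p_value = two_proportion_z(192, 438, 13, 130)
print(f"{p1:.0%} vs. {p2:.0%}")  # 44% vs. 10%
print(p_value < 0.0001)          # True
```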

  18. Mitochondrial DNA sequence variation is associated with free-living activity energy expenditure in the elderly.

    PubMed

    Tranah, Gregory J; Lam, Ernest T; Katzman, Shana M; Nalls, Michael A; Zhao, Yiqiang; Evans, Daniel S; Yokoyama, Jennifer S; Pawlikowska, Ludmila; Kwok, Pui-Yan; Mooney, Sean; Kritchevsky, Stephen; Goodpaster, Bret H; Newman, Anne B; Harris, Tamara B; Manini, Todd M; Cummings, Steven R

    2012-09-01

    The decline in activity energy expenditure underlies a range of age-associated pathological conditions, neuromuscular and neurological impairments, disability, and mortality. The majority (90%) of the energy needs of the human body are met by mitochondrial oxidative phosphorylation (OXPHOS). OXPHOS is dependent on the coordinated expression and interaction of genes encoded in the nuclear and mitochondrial genomes. We examined the role of mitochondrial genomic variation in free-living activity energy expenditure (AEE) and physical activity levels (PAL) by sequencing the entire (~16.5 kilobases) mtDNA from 138 Health, Aging, and Body Composition Study participants. Among the common mtDNA variants, the hypervariable region 2 m.185G>A variant was significantly associated with AEE (p=0.001) and PAL (p=0.0005) after adjustment for multiple comparisons. Several unique nonsynonymous variants were identified in the extremes of AEE with some occurring at highly conserved sites predicted to affect protein structure and function. Of interest is the p.T194M, CytB substitution in the lower extreme of AEE occurring at a residue in the Qi site of complex III. Among participants with low activity levels, the burden of singleton variants was 30% higher across the entire mtDNA and OXPHOS complex I when compared to those having moderate to high activity levels. A significant pooled variant association across the hypervariable 2 region was observed for AEE and PAL. These results suggest that mtDNA variation is associated with free-living AEE in older persons and may generate new hypotheses by which specific mtDNA complexes, genes, and variants may contribute to the maintenance of activity levels in late life.

  19. An analysis of amino acid sequences surrounding archaeal glycoprotein sequons.

    PubMed

    Abu-Qarn, Mehtap; Eichler, Jerry

    2007-05-01

    Despite having provided the first example of a prokaryal glycoprotein, little is known of the rules governing the N-glycosylation process in Archaea. As in Eukarya and Bacteria, archaeal N-glycosylation takes place at the Asn residues of Asn-X-Ser/Thr sequons. Since not all sequons are utilized, it is clear that other factors, including the context in which a sequon exists, affect glycosylation efficiency. As yet, the contribution to N-glycosylation made by sequon-bordering residues and other related factors in Archaea remains unaddressed. In the following, the surroundings of Asn residues confirmed by experiment as modified were analyzed in an attempt to define sequence rules and requirements for archaeal N-glycosylation.
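
    Locating candidate sequons of the Asn-X-Ser/Thr form described above is a simple pattern search. The sketch below is illustrative only: the function name and the toy peptide are hypothetical, and the optional `exclude_proline` flag reflects a restriction commonly described for eukaryotic sequons, not something stated in this abstract.

```python
import re

# Minimal sketch: find Asn-X-Ser/Thr sequons in a one-letter protein sequence.
# A lookahead is used so that overlapping sequons are all reported.


def find_sequons(protein: str, exclude_proline: bool = False):
    """Return (1-based Asn position, tripeptide) for every N-X-S/T match."""
    x = "[^P]" if exclude_proline else "."
    pattern = re.compile(rf"(?=(N{x}[ST]))")
    return [(m.start() + 1, m.group(1)) for m in pattern.finditer(protein.upper())]


toy = "MNASNPTQNNSTK"  # made-up peptide, not from any real protein
print(find_sequons(toy))
# [(2, 'NAS'), (5, 'NPT'), (9, 'NNS'), (10, 'NST')]
print(find_sequons(toy, exclude_proline=True))
# [(2, 'NAS'), (9, 'NNS'), (10, 'NST')]
```

    The surrounding residues analyzed in the study would then be the windows on either side of each reported position.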

  20. Draft genome sequence of an elite Dura palm and whole-genome patterns of DNA variation in oil palm

    PubMed Central

    Jin, Jingjing; Lee, May; Bai, Bin; Sun, Yanwei; Qu, Jing; Rahmadsyah; Alfiko, Yuzer; Lim, Chin Huat; Suwanto, Antonius; Sugiharti, Maria; Wong, Limsoon; Ye, Jian; Chua, Nam-Hai; Yue, Gen Hua

    2016-01-01

    Oil palm is the world’s leading source of vegetable oil and fat. Dura, Pisifera and Tenera are three forms of oil palm. The genome sequence of Pisifera is available whereas the Dura form has not been sequenced yet. We sequenced the genome of one elite Dura palm, and re-sequenced 17 palm genomes. The assembled genome sequence of the elite Dura tree contained 10,971 scaffolds and was 1.701 Gb in length, covering 94.49% of the oil palm genome. 36,105 genes were predicted. Re-sequencing of 17 additional palm trees identified 18.1 million SNPs. We found high genetic variation among palms from different geographical regions, but lower variation among Southeast Asian Dura and Pisifera palms. We mapped 10,000 SNPs on the linkage map of oil palm. In addition, high linkage disequilibrium (LD) was detected in the oil palms used in breeding populations of Southeast Asia, suggesting that LD mapping is likely to be practical in this important oil crop. Our data provide a valuable resource for accelerating genetic improvement and studying the mechanism underlying phenotypic variations of important oil palm traits. PMID:27426468

  1. Sequence variation of the 16S to 23S rRNA spacer region in Salmonella enterica.

    PubMed

    Christensen, H; Møller, P L; Vogensen, F K; Olsen, J E

    2000-01-01

    The possibility for identification of Salmonella enterica serotypes by sequence analysis of the 16S to 23S rRNA internal transcribed spacer was investigated by direct sequencing of polymerase chain reaction-amplified DNA from all operons simultaneously in a collection of 25 strains of 18 different serotypes of S. enterica, and by sequencing individual cloned operons from a single strain. It was only possible to determine the first 117 bases upstream from the 23S rRNA gene by direct sequencing because of variation between the rrn operons. Comparison of sequences from this region allowed separation of only 15 out of the 18 serotypes investigated and was not specific even at the subspecies level of S. enterica. To determine the differences between internal transcribed spacers in more detail, the individual rrn operons of strain JEO 197, serotype IV 43:z4,z23:-, were cloned and sequenced. The strain contained four short internal transcribed spacer fragments of 382-384 bases in length, which were 98.4-99.7% similar to each other and three long fragments of 505 bases with 98.0-99.8% similarity. The study demonstrated a higher degree of interbacterial variation than intrabacterial variation between operons for serotypes of S. enterica.

  2. Sequence variation in mitochondrial cox1 and nad1 genes of ascaridoid nematodes in cats and dogs from Iran.

    PubMed

    Mikaeili, F; Mirhendi, H; Mohebali, M; Hosseini, M; Sharbatkhori, M; Zarei, Z; Kia, E B

    2015-07-01

    The study was conducted to determine the sequence variation in two mitochondrial genes, namely cytochrome c oxidase 1 (pcox1) and NADH dehydrogenase 1 (pnad1) within and among isolates of Toxocara cati, Toxocara canis and Toxascaris leonina. Genomic DNA was extracted from 32 isolates of T. cati, 9 isolates of T. canis and 19 isolates of T. leonina collected from cats and dogs in different geographical areas of Iran. Mitochondrial genes were amplified by polymerase chain reaction (PCR) and sequenced. Sequence data were aligned using the BioEdit software and compared with published sequences in GenBank. Phylogenetic analysis was performed using Bayesian inference and maximum likelihood methods. Based on pairwise comparison, intra-species genetic diversity within Iranian isolates of T. cati, T. canis and T. leonina amounted to 0-2.3%, 0-1.3% and 0-1.0% for pcox1 and 0-2.0%, 0-1.7% and 0-2.6% for pnad1, respectively. Inter-species sequence variation among the three ascaridoid nematodes was significantly higher, being 9.5-16.6% for pcox1 and 11.9-26.7% for pnad1. Sequence and phylogenetic analysis of the pcox1 and pnad1 genes indicated that there is significant genetic diversity within and among isolates of T. cati, T. canis and T. leonina from different areas of Iran, and these genes can be used for studying genetic variation of ascaridoid nematodes.
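
    The percentage ranges quoted above come from pairwise comparison of aligned sequences. The sketch below shows one simple way to compute such a value, the proportion of differing sites (p-distance) with gapped columns skipped. The abstract does not specify the exact distance measure, so this choice, the function names and the toy sequences are assumptions.

```python
from itertools import combinations

# Minimal sketch: pairwise percent difference between aligned sequences.
# Assumption: a plain p-distance that ignores columns containing a gap ('-').


def percent_difference(a: str, b: str) -> float:
    """Percent of compared (ungapped) alignment columns that differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        diffs += x != y
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * diffs / compared


def difference_range(seqs):
    """Return (min, max) percent difference over all pairs of sequences."""
    values = [percent_difference(a, b) for a, b in combinations(seqs, 2)]
    return min(values), max(values)


toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGTACGTAC"]  # made-up aligned fragments
print(difference_range(toy))  # (0.0, 10.0)
```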

  3. Draft genome sequence of an elite Dura palm and whole-genome patterns of DNA variation in oil palm.

    PubMed

    Jin, Jingjing; Lee, May; Bai, Bin; Sun, Yanwei; Qu, Jing; Rahmadsyah; Alfiko, Yuzer; Lim, Chin Huat; Suwanto, Antonius; Sugiharti, Maria; Wong, Limsoon; Ye, Jian; Chua, Nam-Hai; Yue, Gen Hua

    2016-12-01

    Oil palm is the world's leading source of vegetable oil and fat. Dura, Pisifera and Tenera are three forms of oil palm. The genome sequence of Pisifera is available whereas the Dura form has not been sequenced yet. We sequenced the genome of one elite Dura palm, and re-sequenced 17 palm genomes. The assembled genome sequence of the elite Dura tree contained 10,971 scaffolds and was 1.701 Gb in length, covering 94.49% of the oil palm genome. 36,105 genes were predicted. Re-sequencing of 17 additional palm trees identified 18.1 million SNPs. We found high genetic variation among palms from different geographical regions, but lower variation among Southeast Asian Dura and Pisifera palms. We mapped 10,000 SNPs on the linkage map of oil palm. In addition, high linkage disequilibrium (LD) was detected in the oil palms used in breeding populations of Southeast Asia, suggesting that LD mapping is likely to be practical in this important oil crop. Our data provide a valuable resource for accelerating genetic improvement and studying the mechanism underlying phenotypic variations of important oil palm traits.

  4. Analysis of seasonal variation of stratospheric nitric acid

    NASA Astrophysics Data System (ADS)

    Gruzdev, A. N.

    1998-11-01

    Data from the draft COSPAR reference model for stratospheric nitric acid (HNO3) are analysed. Eight months of LIMS HNO3 measurements allow the analysis of dynamics of regimes associated with the annual HNO3 maximum followed by the HNO3 decrease in the Northern Hemisphere and the annual HNO3 minimum followed by the HNO3 increase in the Southern Hemisphere. The HNO3 minimum is noted earlier (in November) in the Southern Hemisphere subtropical upper stratosphere, from where the regime of minimum HNO3 values propagates to the southern high-latitude middle stratosphere, and then (in Austral summer) the equatorward propagation of the regime is observed, with a persistent downward component. The regime of the HNO3 annual maximum in the Northern Hemisphere propagates from the Arctic lower stratosphere (in autumn) and from the tropical middle stratosphere (in late summer), so that in the mid-latitude middle stratosphere the downward propagation of the regime is observed. Evolution of areas with HNO3 increase and decrease by 1 ppbv against the January HNO3 distribution quantifies intensity of the HNO3 decrease in winter-spring in the Northern Hemisphere and the HNO3 increase in Austral summer-autumn in the Southern Hemisphere.

  5. Sequence polymorphism of GroEL gene in natural population of Bacillus and Brevibacillus spp. that showed variation in thermal tolerance capacity and mRNA expression.

    PubMed

    Sen, R; Tripathy, S; Padhi, S K; Mohanty, S; Maiti, N K

    2014-10-01

    GroEL, a class I chaperonin, plays an important role in the thermal adaptation of the cell and helps to maintain the viability of the cell under heat shock conditions. Function of groEL in vivo depends on the maintenance of proper structure of the protein which in turn depends on the nucleotide and amino acid sequence of the gene. In this study, we investigated the changes in nucleotide and amino acid sequences of the partial groEL gene that may affect the thermotolerance capacity as well as mRNA expression of bacterial isolates. Sequences among the same species having differences at the amino acid level were identified as different alleles. The effect of allelic variation on the groEL gene expression was analyzed by comparison and relative quantification in each allele under thermal shock conditions by RT-PCR. Evaluation of the Ka/Ks ratio among the strains of the same species showed that the groEL gene of all the species had undergone similar functional constraint during evolution. The strains showing similar thermotolerance capacity were found to carry the same allele of the groEL gene. The isolates carrying an allele having an amino acid substitution inside the highly ATP/ADP or Mg(2+)-binding region could not tolerate thermal stress and showed lower expression of the groEL gene. Our results indicate that during evolution of these bacterial species the groEL gene has undergone the process of natural selection, and the isolates have evolved with the groEL allelic sequences that help them to withstand the thermal stress during their interaction with the environment.

  6. Nucleic and Amino Acid Sequences Support Structure-Based Viral Classification

    PubMed Central

    Sinclair, Robert M.; Ravantti, Janne J.

    2017-01-01

    ABSTRACT Viral capsids ensure viral genome integrity by protecting the enclosed nucleic acids. Interactions between the genome and capsid and between individual capsid proteins (i.e., capsid architecture) are intimate and are expected to be characterized by strong evolutionary conservation. For this reason, a capsid structure-based viral classification has been proposed as a way to bring order to the viral universe. The seeming lack of sufficient sequence similarity to reproduce this classification has made it difficult to reject structural convergence as the basis for the classification. We reinvestigate whether the structure-based classification for viral coat proteins making icosahedral virus capsids is in fact supported by previously undetected sequence similarity. Since codon choices can influence nascent protein folding cotranslationally, we searched for both amino acid and nucleotide sequence similarity. To demonstrate the sensitivity of the approach, we identify a candidate gene for the pandoravirus capsid protein. We show that the structure-based classification is strongly supported by amino acid and also nucleotide sequence similarities, suggesting that the similarities are due to common descent. The correspondence between structure-based and sequence-based analyses of the same proteins shown here allows them to be used in future analyses of the relationship between linear sequence information and macromolecular function, as well as between linear sequence and protein folds. IMPORTANCE Viral capsids protect nucleic acid genomes, which in turn encode capsid proteins. This tight coupling of protein shell and nucleic acids, together with strong functional constraints on capsid protein folding and architecture, leads to the hypothesis that capsid protein-coding nucleotide sequences may retain signatures of ancient viral evolution. We have been able to show that this is indeed the case, using the major capsid proteins of viruses forming icosahedral capsids

  7. HIV-1 Tat and Viral Latency: What We Can Learn from Naturally Occurring Sequence Variations.

    PubMed

    Kamori, Doreen; Ueno, Takamasa

    2017-01-01

    Despite the effective use of antiretroviral therapy, the persistence of a latently HIV-1-infected reservoir, mainly in the resting memory CD4+ T lymphocyte subset, has been a great setback to viral eradication. While host transcriptional silencing machinery is thought to play a dominant role in HIV-1 latency, HIV-1 proteins such as Tat may affect both the establishment and the reversal of latency. Indeed, mutational studies have demonstrated that insufficient Tat transactivation activity can result in impaired transcription of viral genes and the establishment of latency in cell culture experiments. Because Tat is one of the highly variable proteins within the HIV-1 proteome, it is conceivable that naturally occurring Tat mutations may differentially modulate Tat functions, thereby influencing the establishment and/or the reversal of viral latency in vivo. In this mini review, we summarize the recent findings on naturally occurring Tat polymorphisms associated with host immune responses and we highlight the implications of Tat sequence variations for HIV latency.

  8. HIV-1 Tat and Viral Latency: What We Can Learn from Naturally Occurring Sequence Variations

    PubMed Central

    Kamori, Doreen; Ueno, Takamasa

    2017-01-01

    Despite the effective use of antiretroviral therapy, the persistence of a latently HIV-1-infected reservoir, mainly in the resting memory CD4+ T lymphocyte subset, has been a great setback to viral eradication. While host transcriptional silencing machinery is thought to play a dominant role in HIV-1 latency, HIV-1 proteins such as Tat may affect both the establishment and the reversal of latency. Indeed, mutational studies have demonstrated that insufficient Tat transactivation activity can result in impaired transcription of viral genes and the establishment of latency in cell culture experiments. Because Tat is one of the highly variable proteins within the HIV-1 proteome, it is conceivable that naturally occurring Tat mutations may differentially modulate Tat functions, thereby influencing the establishment and/or the reversal of viral latency in vivo. In this mini review, we summarize the recent findings on naturally occurring Tat polymorphisms associated with host immune responses and we highlight the implications of Tat sequence variations for HIV latency. PMID:28194140

  9. Diversity and Variation of Bacterial Community Revealed by MiSeq Sequencing in Chinese Dark Teas

    PubMed Central

    Fu, Jianyu; Lv, Haipeng; Chen, Feng

    2016-01-01

    Chinese dark teas (CDTs) are now among the popular tea beverages worldwide due to their unique health benefits. Because the production of CDTs involves fermentation that is characterized by the effect of microbes, microorganisms are believed to play critical roles in the determination of the chemical characteristics of CDTs. Some dominant fungi have been identified from CDTs. In contrast, little, if anything, is known about the composition of the bacterial community in CDTs. This study set out to investigate the diversity and variation of the bacterial community in four major types of CDTs from China. First, the composition of the bacterial community of CDTs was determined using MiSeq sequencing. From the four typical CDTs, a total of 238 genera that belong to 128 families of bacteria were detected, including most of the families of beneficial bacteria known to be associated with fermented food. While different types of CDTs had generally distinct bacterial structures, the two types of brick teas produced from adjacent regions displayed strong similarity in bacterial composition, suggesting that the producing environment and processing condition perhaps together influence bacterial succession in CDTs. The global characterization of bacterial communities in CDTs is an essential first step for us to understand their function in fermentation and their potential impact on human health. Such knowledge will be important guidance for improving the production of CDTs with higher quality and elevated health benefits. PMID:27690376

  10. Unique Features of Germline Variation in Five Egyptian Familial Breast Cancer Families Revealed by Exome Sequencing

    PubMed Central

    Kim, Yeong C.; Soliman, Amr S.; Cui, Jian; Ramadan, Mohamed; Hablas, Ahmed; Abouelhoda, Mohamed; Hussien, Nehal; Ahmed, Ola; Zekri, Abdel-Rahman Nabawy; Seifeldin, Ibrahim A.

    2017-01-01

    Genetic predisposition increases the risk of familial breast cancer. Recent studies indicate that genetic predisposition for familial breast cancer can be ethnic-specific. However, current knowledge of genetic predisposition for the disease is predominantly derived from Western populations. Using this existing information as the sole reference to judge the predisposition in non-Western populations is not adequate and can potentially lead to misdiagnosis. Efforts are required to collect genetic predisposition data from non-Western populations. The Egyptian population has high genetic variation reflecting its divergent ethnic origins, and the incidence rate of familial breast cancer in Egypt is also higher than the rate in many other populations. Using whole exome sequencing, we investigated genetic predisposition in five Egyptian familial breast cancer families. No pathogenic variants in BRCA1, BRCA2 and other classical breast cancer-predisposition genes were present in these five families. Comparison of the genetic variants with those in Caucasian familial breast cancer showed that variants in the Egyptian families were more variable and heterogeneous than the variants in Caucasian families. Multiple damaging variants in genes of different functional categories were identified either in a single family or shared between families. Our study demonstrates that genetic predisposition in Egyptian breast cancer families may differ from that in other disease populations, and supports a comprehensive screening of local disease families to determine the genetic predisposition in Egyptian familial breast cancer. PMID:28076423

  11. Structural variation discovery in the cancer genome using next generation sequencing: Computational solutions and perspectives

    PubMed Central

    Liu, Biao; Conroy, Jeffrey M.; Morrison, Carl D.; Odunsi, Adekunle O.; Qin, Maochun; Wei, Lei; Trump, Donald L.; Johnson, Candace S.; Liu, Song; Wang, Jianmin

    2015-01-01

    Somatic Structural Variations (SVs) are a complex collection of chromosomal mutations that could directly contribute to carcinogenesis. Next Generation Sequencing (NGS) technology has emerged as the primary means of interrogating the SVs of the cancer genome in recent investigations. Sophisticated computational methods are required to accurately identify the SV events and delineate their breakpoints from the massive amounts of reads generated by an NGS experiment. In this review, we provide an overview of current analytic tools used for SV detection in NGS-based cancer studies. We summarize the features of common SV groups and the primary types of NGS signatures that can be used in SV detection methods. We discuss the principles and key similarities and differences of existing computational programs and comment on unresolved issues related to this research field. The aim of this article is to provide a practical guide of relevant concepts, computational methods, software tools and important factors for analyzing and interpreting NGS data for the detection of SVs in the cancer genome. PMID:25849937

  12. Classification of mouse VK groups based on the partial amino acid sequence to the first invariant tryptophan: impact of 14 new sequences from IgG myeloma proteins.

    PubMed

    Potter, M; Newell, J B; Rudikoff, S; Haber, E

    1982-12-01

    Fourteen new VK sequences derived from BALB/c IgG myeloma proteins were determined to the first invariant tryptophan (Trp 35). These partial sequences were compared with 65 other published VK sequences using a computer program. The 79 sequences were organized according to the length of the sequence from the amino terminus to the first invariant tryptophan (Trp 35), into seven groups (33, 34, 35, 36, 39, 40 and 41aa). A distance matrix of all 79 sequences was then computed, i.e. the number of amino acid substitutions necessary to convert one sequence to another was determined. From these data a dendrogram was constructed. Most of the VK sequences fell into clusters or closely related groups. The definition of a sequence group is arbitrary but facilitates the classification of VK proteins. We used 12 substitutions as the basis for defining a sequence group based on the known number of substitutions that are found in the VK21 proteins. By this criterion there were 18 groups in the Trp 35 dendrogram. Twelve of the 14 new sequences fell into one of these sequence groups; two formed new sequence groups. Collective amino acid sequencing is still encountering new VK structures indicating more sequences will be required to attain an accurate estimate of the total number of VK groups. Updated dendrograms can be quickly generated to include newly generated sequences.
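
    The procedure described above (count substitutions between every pair of sequences, then define groups by a substitution threshold) can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' program: it handles only equal-length sequences, it links sequences by single linkage, and it treats "at most `max_subs` substitutions" as the grouping rule. None of these details, nor the function names, are given in the abstract.

```python
from itertools import combinations

# Minimal sketch: substitution-count distance matrix and threshold grouping.
# Assumptions: equal-length sequences, single-linkage grouping, inclusive cutoff.


def substitutions(a: str, b: str) -> int:
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return sum(x != y for x, y in zip(a, b))


def distance_matrix(seqs):
    """Symmetric matrix of pairwise substitution counts."""
    n = len(seqs)
    d = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = substitutions(seqs[i], seqs[j])
    return d


def group_by_threshold(seqs, max_subs=12):
    """Group sequence indices linked by chains of pairs within max_subs."""
    d = distance_matrix(seqs)
    parent = list(range(len(seqs)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(seqs)), 2):
        if d[i][j] <= max_subs:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(len(seqs)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


toy = ["AAAA", "AAAT", "TTTT", "TTTA"]  # made-up sequences, tiny cutoff for clarity
print(group_by_threshold(toy, max_subs=1))  # [[0, 1], [2, 3]]
```

    A dendrogram like the one in the study would need a full hierarchical clustering of the same matrix; the threshold grouping here corresponds to cutting such a tree at a single height.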

  13. Detection and isolation of nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1997-04-01

    A method for detecting a target nucleic acid sequence in a sample is provided using hybridization probes which competitively hybridize to a target nucleic acid. According to the method, a target nucleic acid sequence is hybridized to first and second hybridization probes which are complementary to overlapping portions of the target nucleic acid sequence, the first hybridization probe including a first complexing agent capable of forming a binding pair with a second complexing agent and the second hybridization probe including a detectable marker. The first complexing agent attached to the first hybridization probe is contacted with a second complexing agent, the second complexing agent being attached to a solid support such that when the first and second complexing agents are attached, target nucleic acid sequences hybridized to the first hybridization probe become immobilized on to the solid support. The immobilized target nucleic acids are then separated and detected by detecting the detectable marker attached to the second hybridization probe. A kit for performing the method is also provided. 7 figs.

  14. Detection and isolation of nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1997-01-01

    A method for detecting a target nucleic acid sequence in a sample is provided using hybridization probes which competitively hybridize to a target nucleic acid. According to the method, a target nucleic acid sequence is hybridized to first and second hybridization probes which are complementary to overlapping portions of the target nucleic acid sequence, the first hybridization probe including a first complexing agent capable of forming a binding pair with a second complexing agent and the second hybridization probe including a detectable marker. The first complexing agent attached to the first hybridization probe is contacted with a second complexing agent, the second complexing agent being attached to a solid support such that when the first and second complexing agents are attached, target nucleic acid sequences hybridized to the first hybridization probe become immobilized on to the solid support. The immobilized target nucleic acids are then separated and detected by detecting the detectable marker attached to the second hybridization probe. A kit for performing the method is also provided.

  15. Sequence Diversity and Antigenic Variation at the rag Locus of Porphyromonas gingivalis

    PubMed Central

    Hall, Lucinda M. C.; Fawell, Stuart C.; Shi, Xiaoju; Faray-Kele, Marie-Claire; Aduse-Opoku, Joseph; Whiley, Robert A.; Curtis, Michael A.

    2005-01-01

    The rag locus of Porphyromonas gingivalis W50 encodes RagA, a predicted tonB-dependent receptor protein, and RagB, a lipoprotein that constitutes an immunodominant outer membrane antigen. The low G+C content of the locus, an association with mobility elements, and an apparent restricted distribution in the species suggested that the locus had arisen by horizontal gene transfer. In the present study, we have demonstrated that there are four divergent alleles of the rag locus. The original rag allele found in W50 was renamed rag-1, while three novel alleles, rag-2 to rag-4, were found in isolates lacking rag-1. The three novel alleles encoded variants of RagA with 63 to 71% amino acid identity to RagA1 and each other and variants of RagB with 43 to 56% amino acid identity. The RagA/B proteins have homology to numerous Bacteroides proteins, including SusC/D, implicated in polysaccharide uptake. Monoclonal and polyclonal antibodies raised against RagB1 of P. gingivalis W50 did not cross-react with proteins from isolates carrying different alleles. In a laboratory collection of 168 isolates, 26% carried rag-1, 36% carried rag-2, 25% carried rag-3, and 14% carried rag-4 (including the type strain, ATCC 33277). Restriction profiles of the locus in different isolates demonstrated polymorphism within each allele, some of which is accounted for by the presence or absence of insertion sequence elements. By reference to a previously published study on virulence in a mouse model (M. L. Laine and A. J. van Winkelhoff, Oral Microbiol. Immunol. 13:322-325, 1998), isolates that caused serious disease in mice were significantly more likely to carry rag-1 than other rag alleles. PMID:15972517

  16. Amino acid sequence around the active-site serine residue in the acyltransferase domain of goat mammary fatty acid synthetase.

    PubMed Central

    Mikkelsen, J; Højrup, P; Rasmussen, M M; Roepstorff, P; Knudsen, J

    1985-01-01

    Goat mammary fatty acid synthetase was labelled in the acyltransferase domain by formation of O-ester intermediates by incubation with [1-14C]acetyl-CoA and [2-14C]malonyl-CoA. Tryptic-digest and CNBr-cleavage peptides were isolated and purified by high-performance reverse-phase and ion-exchange liquid chromatography. The sequences of the malonyl- and acetyl-labelled peptides were shown to be identical. The results confirm the hypothesis that both acetyl and malonyl groups are transferred to the mammalian fatty acid synthetase complex by the same transferase. The sequence is compared with those of other fatty acid synthetase transferases. PMID:3922356

  17. Ligation with nucleic acid sequence-based amplification.

    PubMed

    Ong, Carmichael; Tai, Warren; Sarma, Aartik; Opal, Steven M; Artenstein, Andrew W; Tripathi, Anubhav

    2012-01-01

    This work presents a novel method for detecting nucleic acid targets using a ligation step along with an isothermal, exponential amplification step. We use an engineered ssDNA with two variable regions on the ends, allowing us to design the probe for optimal reaction kinetics and primer binding. This two-part probe is ligated by T4 DNA Ligase only when both parts bind adjacently to the target. The assay demonstrates that the expected 72-nt RNA product appears only when the synthetic target, T4 ligase, and both probe fragments are present during the ligation step. An extraneous 38-nt RNA product also appears due to linear amplification of unligated probe (P3), but its presence does not cause a false-positive result. In addition, 40 mmol/L KCl in the final amplification mix was found to be optimal. It was also found that increasing P5 in excess of P3 helped with ligation and reduced the extraneous 38-nt RNA product. The assay was also tested with a single nucleotide polymorphism target, changing one base at the ligation site. The assay was able to yield a negative signal despite only a single-base change. Finally, using P3 and P5 with longer binding sites results in increased overall sensitivity of the reaction, showing that increasing ligation efficiency can improve the assay overall. We believe that this method can be used effectively for a number of diagnostic assays.

  18. Genetic Variation of Fatty Acid Oxidation and Obesity, A Literature Review

    PubMed Central

    Freitag Luglio, Harry

    2016-01-01

    Modulation of fat metabolism is an important component of the etiology of obesity as well as of individual response to weight loss programs. The influence of lipolysis has received much attention in recent decades. By comparison, fatty acid oxidation, which occurs after lipolysis, has been less studied. There are limited publications on how fatty acid oxidation influences predisposition to obesity, especially on the importance of genetic variations in fatty acid oxidation proteins for the development of obesity. The aim of this review is to summarize recent knowledge on polymorphisms of genes related to fatty acid oxidation. Studies in humans as well as in animal models showed that disturbance of genes related to the fatty acid oxidation process affected body weight and the risk of obesity. Several polymorphisms in CD36, CPT, ACS and FABP have been shown to be related to obesity, either by regulating enzymatic activity or by directly influencing the fatty acid oxidation process. PMID:27127449

  19. Conservation of Shannon's redundancy for proteins. [information theory applied to amino acid sequences]

    NASA Technical Reports Server (NTRS)

    Gatlin, L. L.

    1974-01-01

    Concepts of information theory are applied to examine various proteins in terms of their redundancy in natural originators such as animals and plants. The Monte Carlo method is used to derive information parameters for random protein sequences. Real protein sequence parameters are compared with the standard parameters of protein sequences having a specific length. The tendency of a chain to contain some amino acids more frequently than others and the tendency of a chain to contain certain amino acid pairs more frequently than other pairs are used as randomness measures of individual protein sequences. Non-periodic proteins are generally found to have random Shannon redundancies except in cases of constraints due to short chain length and genetic codes. Redundant characteristics of highly periodic proteins are discussed. A degree of periodicity parameter is derived.
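
    The comparison of real sequences against random ones can be illustrated with a simplified, first-order calculation. The sketch below is an assumption-laden illustration and not the parameters derived in the report: it defines redundancy as 1 - H/log2(20), where H is the Shannon entropy of the amino acid composition, and it uses a uniform random model for the Monte Carlo baseline. Pair-frequency measures are omitted, and all function names are hypothetical.

```python
import math
import random
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def shannon_entropy(seq: str) -> float:
    """First-order entropy (bits per residue) of the residue composition."""
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in Counter(seq).values())


def redundancy(seq: str, alphabet_size: int = 20) -> float:
    """1 - H / H_max; 0 for a perfectly even composition, 1 for a single residue."""
    return 1.0 - shannon_entropy(seq) / math.log2(alphabet_size)


def random_baseline(length: int, trials: int = 200, seed: int = 0) -> float:
    """Monte Carlo mean redundancy of uniform random sequences of a given length."""
    rng = random.Random(seed)
    values = [
        redundancy("".join(rng.choices(AMINO_ACIDS, k=length)))
        for _ in range(trials)
    ]
    return sum(values) / len(values)


short_chain = random_baseline(30)
long_chain = random_baseline(300)
```

    Even under a uniform random model, short chains show a higher apparent redundancy than long ones, which is why a length-matched random baseline is needed before a real sequence's redundancy is interpreted.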

  20. RNA internal standard synthesis by nucleic acid sequence-based amplification for competitive quantitative amplification reactions.

    PubMed

    Lo, Wan-Yu; Baeumner, Antje J

    2007-02-15

    Nucleic acid sequence-based amplification (NASBA) reactions have been demonstrated to successfully synthesize new sequences based on deletion and insertion reactions. Two RNA internal standards were synthesized for use in competitive amplification reactions in which quantitative analysis can be achieved by coamplifying the internal standard with the wild type sample. The sequences were created in two consecutive NASBA reactions using the E. coli clpB mRNA sequence as model analyte. The primer sequences of the wild type sequence were maintained, and a 20-nt-long segment inside the amplicon region was exchanged for a new segment of similar GC content and melting temperature. The new RNA sequence was thus amplifiable using the wild type primers and detectable via a new inserted sequence. In the first reaction, the forward primer and an additional 20-nt-long sequence were deleted and replaced by a new 20-nt-long sequence. In the second reaction, a forward primer containing the wild type primer sequence as a 5' overhang was used. The presence of pure internal standard was verified using electrochemiluminescence and RNA lateral-flow biosensor analysis. Additional sequence deletion in order to shorten the internal standard amplicons and thus generate higher detection signals was found not to be required. Finally, a competitive NASBA reaction between one internal standard and the wild type sequence was carried out, proving its functionality. This new rapid construction method via NASBA provides advantages over the traditional techniques since it requires no traditional cloning procedures and no thermocyclers, and can be completed in less than 4 h.

  1. Sequence variation of Epstein-Barr virus (EBV) BZLF1 gene in EBV-associated gastric carcinomas and nasopharyngeal carcinomas in Northern China.

    PubMed

    Luo, Bing; Tang, Xiuming; Jia, Yuping; Wang, Yun; Chao, Yan; Zhao, Chengquan

    2011-08-01

    The Epstein-Barr virus (EBV) BZLF1 gene can trigger the switch of EBV from latent infection to the lytic replicative phase. The functions of BZLF1 are well known, while little is known about its gene polymorphism. In order to elucidate the sequence variations of BZLF1 and their association with malignancies, we analyzed the BZLF1 gene in 24 EBV-associated gastric carcinomas, 41 nasopharyngeal carcinomas and 24 throat washing samples from healthy donors in Northern China using a PCR-direct sequencing method. Three types and 8 subtypes of BZLF1 were identified. A dominant type, BZLF1-A, was found in 67 of 89 (75.3%) isolates. Type BZLF1-B was characterized by a common Ala deletion at residue 127, which was detected in 21 of 89 isolates (23.6%). Type BZLF1-C contained only one isolate (GC103), which had the same sequence as the prototype B95-8. Among the 3 functional domains of the BZLF1 protein, the transactivation domain had the most mutations, followed by the bZIP domains (the DNA binding domain and dimerization domain). No prevalence of any subtypes or mutations in the functional domains among the three specimen groups was found (P > 0.05). Our study indicates that BZLF1 subtypes and amino acid changes in functional domains are not preferentially associated with EBV-associated gastric carcinomas or nasopharyngeal carcinomas in Northern China. BZLF1 gene variations are geographically restricted rather than tumor-specific polymorphisms.

  2. Epilepsy-causing sequence variations in SIK1 disrupt synaptic activity response gene expression and affect neuronal morphology.

    PubMed

    Pröschel, Christoph; Hansen, Jeanne N; Ali, Adil; Tuttle, Emily; Lacagnina, Michelle; Buscaglia, Georgia; Halterman, Marc W; Paciorkowski, Alex R

    2017-02-01

    SIK1 syndrome is a newly described developmental epilepsy disorder caused by heterozygous mutations in the salt-inducible kinase SIK1. To better understand the pathophysiology of SIK1 syndrome, we studied the effects of SIK1 pathogenic sequence variations in human neurons. Primary human fetal cortical neurons were transfected with a lentiviral vector to overexpress wild-type and mutant SIK1 protein. We evaluated the transcriptional activity of known downstream gene targets in neurons expressing mutant SIK1 compared with wild type. We then assayed neuronal morphology by measuring neurite length, number and branching. Truncating SIK1 sequence variations were associated with abnormal MEF2C transcriptional activity and decreased MEF2C protein levels. Epilepsy-causing SIK1 sequence variations were associated with significantly decreased expression of ARC (activity-regulated cytoskeletal-associated) and other synaptic activity response element genes. Assay of mRNA levels for other MEF2C target genes, NR4A1 (Nur77) and NRG1, found significantly decreased expression of these genes as well. The missense p.(Pro287Thr) SIK1 sequence variation was associated with abnormal neuronal morphology, with significant decreases in mean neurite length and mean number of neurites and a significant increase in proximal branches compared with wild type. Epilepsy-causing SIK1 sequence variations resulted in abnormalities in the MEF2C-ARC pathway of neuronal development and synapse activity response. This work provides the first insights into the mechanisms of pathogenesis in SIK1 syndrome, and extends the ARX-MEF2C pathway in the pathogenesis of developmental epilepsy.

  3. Sequence variation of Bemisia tabaci Chemosensory Protein 2 in cryptic species B and Q: New DNA markers for whitefly recognition.

    PubMed

    Liu, Guo-Xia; Ma, Hong-Mei; Xie, Hong-Yan; Xuan, Ning; Picimbon, Jean-François

    2016-01-15

    Bemisia tabaci Gennadius biotypes B and Q are two of the most important agricultural insect pests worldwide. Genomic sequences of Type-2 B. tabaci chemosensory protein (BtabCSP2) were cloned and sequenced in B and Q biotypes, revealing key biotype-specific variations in the intron sequence. A Q260 sequence was found specifically in Q-BtabCSP2 and Cucumis melo LN692399, suggesting ancestral horizontal gene transfer between the insect and the plant through bacteria. A cleaved amplified polymorphic sequences (CAPS) method was then developed to differentiate B and Q based on the exon sequence variation of the BtabCSP2 gene. The performance of CSP2-based CAPS for whitefly recognition was assessed using B. tabaci field collections from Shandong Province (P.R. China). Our SacII-based CAPS method gave the same result as the mitochondrial cytochrome oxidase-based CAPS method in the field collections. We therefore propose an explanation for CSP origin and a new, rapid and simple molecular method based on genomic DNA and a chemosensory gene to accurately differentiate the B and Q whiteflies of the Bemisia complex around the world.

  4. Sequence variation in the melanocortin-1 receptor (MC1R) pigmentation gene and its role in the cryptic coloration of two South American sand lizards

    PubMed Central

    Corso, Josmael; Gonçalves, Gislene L.; de Freitas, Thales R.O.

    2012-01-01

    In reptiles, dorsal body darkness often varies with substrate color or temperature environment, and is generally presumed to be an adaptation for crypsis or thermoregulation. However, the genetic basis of pigmentation is poorly known in this group. In this study we analyzed the coding region of the melanocortin-1-receptor (MC1R) gene, and therefore its role underlying the dorsal color variation in two sympatric species of sand lizards (Liolaemus) that inhabit the southeastern coast of South America: L. occipitalis and L. arambarensis. The first is light-colored and occupies aeolic pale sand dunes, while the second is brownish and lives in a darker sandy habitat. We sequenced 630 base pairs of MC1R in both species. In total, 12 nucleotide polymorphisms were observed, and four amino acid replacement sites, but none of them could be associated with a color pattern. Comparative analysis indicated that these taxa are monomorphic for amino acid sites that were previously identified as functionally important in other reptiles. Thus, our results indicate that MC1R is not involved in the pigmentation pattern observed in Liolaemus lizards. Therefore, structural differences in other genes, such as ASIP, or variation in regulatory regions of MC1R may be responsible for this variation. Alternatively, the phenotypic differences observed might be a consequence of non-genetic factors, such as thermoregulatory mechanisms. PMID:22481878

  5. Sequence variation in the melanocortin-1 receptor (MC1R) pigmentation gene and its role in the cryptic coloration of two South American sand lizards.

    PubMed

    Corso, Josmael; Gonçalves, Gislene L; de Freitas, Thales R O

    2012-01-01

    In reptiles, dorsal body darkness often varies with substrate color or temperature environment, and is generally presumed to be an adaptation for crypsis or thermoregulation. However, the genetic basis of pigmentation is poorly known in this group. In this study we analyzed the coding region of the melanocortin-1-receptor (MC1R) gene, and therefore its role underlying the dorsal color variation in two sympatric species of sand lizards (Liolaemus) that inhabit the southeastern coast of South America: L. occipitalis and L. arambarensis. The first is light-colored and occupies aeolic pale sand dunes, while the second is brownish and lives in a darker sandy habitat. We sequenced 630 base pairs of MC1R in both species. In total, 12 nucleotide polymorphisms were observed, and four amino acid replacement sites, but none of them could be associated with a color pattern. Comparative analysis indicated that these taxa are monomorphic for amino acid sites that were previously identified as functionally important in other reptiles. Thus, our results indicate that MC1R is not involved in the pigmentation pattern observed in Liolaemus lizards. Therefore, structural differences in other genes, such as ASIP, or variation in regulatory regions of MC1R may be responsible for this variation. Alternatively, the phenotypic differences observed might be a consequence of non-genetic factors, such as thermoregulatory mechanisms.

  6. Identification of conserved genomic regions and variation therein amongst Cetartiodactyla species using next generation sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background Next Generation Sequencing has created an opportunity to genetically characterize an individual both inexpensively and comprehensively. In earlier work produced in our collaboration [1], it was demonstrated that, for animals without a reference genome, their Next Generation Sequence data ...

  7. Population clustering based on copy number variations detected from next generation sequencing data

    PubMed Central

    Duan, Junbo; Zhang, Ji-Gang; Wan, Mingxi; Deng, Hong-Wen; Wang, Yu-Ping

    2015-01-01

    Copy number variations (CNVs) can be used as significant bio-markers, and next generation sequencing (NGS) provides high-resolution detection of these CNVs. However, how to extract features from CNVs and apply them to genomic studies such as population clustering has become a big challenge. In this paper, we propose a novel method for population clustering based on CNVs from NGS. First, CNVs are extracted from each sample to form a feature matrix. Then, this feature matrix is decomposed into the source matrix and weight matrix with non-negative matrix factorization (NMF). The source matrix consists of common CNVs that are shared by all the samples from the same group, and the weight matrix indicates the corresponding level of CNVs from each sample. Therefore, using NMF of CNVs one can differentiate samples from different ethnic groups, i.e. population clustering. To validate the approach, we applied it to the analysis of both simulation data and two real data sets from the 1000 Genomes Project. The results on simulation data demonstrate that the proposed method can recover the true common CNVs with high quality. The results on the first real data analysis show that the proposed method can cluster two family trios with different ancestries into two ethnic groups, and the results on the second real data analysis show that the proposed method can be applied to whole-genome data with a large sample size consisting of multiple groups. Both results demonstrate the potential of the proposed method for population clustering. PMID:25152046
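
    The decomposition step can be sketched with a small pure-Python NMF. This is an illustration under stated assumptions, not the authors' implementation: it uses the standard multiplicative update rules for the squared-error objective, treats the feature matrix as CNV regions by samples, and assigns each sample to the group with the largest weight (an assumed clustering rule). The toy matrix and all names are hypothetical.

```python
import random


def matmul(a, b):
    """Plain matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    return [list(r) for r in zip(*m)]


def frobenius_error(v, w, h):
    """Frobenius norm of V - W H."""
    wh = matmul(w, h)
    return sum(
        (v[i][j] - wh[i][j]) ** 2 for i in range(len(v)) for j in range(len(v[0]))
    ) ** 0.5


def nmf(v, rank, iters=1000, seed=0, eps=1e-9):
    """Factor non-negative V (features x samples) into
    W (features x rank, 'source') and H (rank x samples, 'weight')."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + eps for _ in range(rank)] for _ in range(n)]
    h = [[rng.random() + eps for _ in range(m)] for _ in range(rank)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[k][j] * num[k][j] / (den[k][j] + eps) for j in range(m)]
             for k in range(rank)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][k] * num[i][k] / (den[i][k] + eps) for k in range(rank)]
             for i in range(n)]
    return w, h


# Toy feature matrix: 4 CNV regions (rows) x 4 samples (columns), two groups.
V = [
    [2, 2, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 3, 3],
    [0, 0, 1, 1],
]
W, H = nmf(V, rank=2)
# Assign each sample to the component with the largest weight.
labels = [max(range(2), key=lambda k: H[k][j]) for j in range(4)]
```

    On real data the rank would be chosen to match the expected number of groups, and a library implementation would normally be preferred over this hand-written loop.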

  8. Sequence variations in the collagen IX and XI genes are associated with degenerative lumbar spinal stenosis

    PubMed Central

    Noponen-Hietala, N; Kyllonen, E; Mannikko, M; Ilkko, E; Karppinen, J; Ott, J; Ala-Kokko, L

    2003-01-01

    Background: Degenerative lumbar spinal stenosis (LSS) is usually caused by disc herniation or degeneration. Several genetic factors have been implicated in disc disease. Tryptophan alleles in COL9A2 and COL9A3 have been shown to be associated with lumbar disc disease in the Finnish population, and polymorphisms in the vitamin D receptor gene (VDR) (FokI and TaqI), the matrix metalloproteinase-3 gene (MMP-3) and an aggrecan gene (AGC1) VNTR have been reported to be associated with disc degeneration. In addition, an IVS6-4 a>t polymorphism in COL11A2 has been found in connection with stenosis caused by ossification of the posterior longitudinal ligament in the Japanese population. Objective: To study the role of genetic factors in LSS. Methods: 29 Finnish probands were analysed for mutations in the genes coding for intervertebral disc matrix proteins, COL1A1, COL1A2, COL2A1, COL9A1, COL9A2, COL9A3, COL11A1, COL11A2, and AGC1. VDR and MMP-3 polymorphisms were also analysed. Sequence variations were tested in 56 Finnish controls. Results: Several disease associated alleles were identified. A splice site mutation in COL9A2 leading to a premature translation termination codon and the generation of a truncated protein was identified in one proband, another had the Trp2 allele, and four others the Trp3 allele. The frequency of the COL11A2 IVS6-4 t allele was 93.1% in the probands and 72.3% in controls (p = 0.0016). The differences in genotype frequencies for this site were less significant (p = 0.0043). Conclusions: Genetic factors have an important role in the pathogenesis of LSS. PMID:14644861
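
    The allele-frequency comparison can be illustrated with a worked calculation. The allele counts below are not given in the abstract: they are back-calculated from the reported percentages under the assumption of two alleles per individual and complete genotyping (54 of 58 proband alleles is 93.1%, and 81 of 112 control alleles is 72.3%). The abstract does not state which test was used, so the Pearson chi-square test here is also an assumption.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
    [[a, b], [c, d]], with the p-value for 1 degree of freedom."""
    n = a + b + c + d
    stat = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # For 1 degree of freedom, P(X > stat) = erfc(sqrt(stat / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


# Assumed counts, back-calculated from the reported frequencies.
probands_t, probands_other = 54, 4    # 54 / 58  = 93.1%
controls_t, controls_other = 81, 31   # 81 / 112 = 72.3%

stat, p_value = chi_square_2x2(probands_t, probands_other,
                               controls_t, controls_other)
```

    Under these assumptions the statistic is about 10.1 and the p-value falls between 0.001 and 0.002, the same order as the reported p = 0.0016.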

  9. Whole-Genome Sequencing Reveals Diverse Models of Structural Variations in Esophageal Squamous Cell Carcinoma.

    PubMed

    Cheng, Caixia; Zhou, Yong; Li, Hongyi; Xiong, Teng; Li, Shuaicheng; Bi, Yanghui; Kong, Pengzhou; Wang, Fang; Cui, Heyang; Li, Yaoping; Fang, Xiaodong; Yan, Ting; Li, Yike; Wang, Juan; Yang, Bin; Zhang, Ling; Jia, Zhiwu; Song, Bin; Hu, Xiaoling; Yang, Jie; Qiu, Haile; Zhang, Gehong; Liu, Jing; Xu, Enwei; Shi, Ruyi; Zhang, Yanyan; Liu, Haiyan; He, Chanting; Zhao, Zhenxiang; Qian, Yu; Rong, Ruizhou; Han, Zhiwei; Zhang, Yanlin; Luo, Wen; Wang, Jiaqian; Peng, Shaoliang; Yang, Xukui; Li, Xiangchun; Li, Lin; Fang, Hu; Liu, Xingmin; Ma, Li; Chen, Yunqing; Guo, Shiping; Chen, Xing; Xi, Yanfeng; Li, Guodong; Liang, Jianfang; Yang, Xiaofeng; Guo, Jiansheng; Jia, JunMei; Li, Qingshan; Cheng, Xiaolong; Zhan, Qimin; Cui, Yongping

    2016-02-04

    Comprehensive identification of somatic structural variations (SVs) and understanding their mutational mechanisms in cancer might contribute to understanding biological differences and help to identify new therapeutic targets. Unfortunately, characterization of complex SVs across the whole genome and the mutational mechanisms underlying esophageal squamous cell carcinoma (ESCC) is largely unclear. To define a comprehensive catalog of somatic SVs, affected target genes, and their underlying mechanisms in ESCC, we re-analyzed whole-genome sequencing (WGS) data from 31 ESCCs using Meerkat algorithm to predict somatic SVs and Patchwork to determine copy-number changes. We found deletions and translocations with NHEJ and alt-EJ signature as the dominant SV types, and 16% of deletions were complex deletions. SVs frequently led to disruption of cancer-associated genes (e.g., CDKN2A and NOTCH1) with different mutational mechanisms. Moreover, chromothripsis, kataegis, and breakage-fusion-bridge (BFB) were identified as contributing to locally mis-arranged chromosomes that occurred in 55% of ESCCs. These genomic catastrophes led to amplification of oncogene through chromothripsis-derived double-minute chromosome formation (e.g., FGFR1 and LETM2) or BFB-affected chromosomes (e.g., CCND1, EGFR, ERBB2, MMPs, and MYC), with approximately 30% of ESCCs harboring BFB-derived CCND1 amplification. Furthermore, analyses of copy-number alterations reveal high frequency of whole-genome duplication (WGD) and recurrent focal amplification of CDCA7 that might act as a potential oncogene in ESCC. Our findings reveal molecular defects such as chromothripsis and BFB in malignant transformation of ESCCs and demonstrate diverse models of SVs-derived target genes in ESCCs. These genome-wide SV profiles and their underlying mechanisms provide preventive, diagnostic, and therapeutic implications for ESCCs.

  10. Whole-Genome Sequencing Reveals Diverse Models of Structural Variations in Esophageal Squamous Cell Carcinoma

    PubMed Central

    Cheng, Caixia; Zhou, Yong; Li, Hongyi; Xiong, Teng; Li, Shuaicheng; Bi, Yanghui; Kong, Pengzhou; Wang, Fang; Cui, Heyang; Li, Yaoping; Fang, Xiaodong; Yan, Ting; Li, Yike; Wang, Juan; Yang, Bin; Zhang, Ling; Jia, Zhiwu; Song, Bin; Hu, Xiaoling; Yang, Jie; Qiu, Haile; Zhang, Gehong; Liu, Jing; Xu, Enwei; Shi, Ruyi; Zhang, Yanyan; Liu, Haiyan; He, Chanting; Zhao, Zhenxiang; Qian, Yu; Rong, Ruizhou; Han, Zhiwei; Zhang, Yanlin; Luo, Wen; Wang, Jiaqian; Peng, Shaoliang; Yang, Xukui; Li, Xiangchun; Li, Lin; Fang, Hu; Liu, Xingmin; Ma, Li; Chen, Yunqing; Guo, Shiping; Chen, Xing; Xi, Yanfeng; Li, Guodong; Liang, Jianfang; Yang, Xiaofeng; Guo, Jiansheng; Jia, JunMei; Li, Qingshan; Cheng, Xiaolong; Zhan, Qimin; Cui, Yongping

    2016-01-01

    Comprehensive identification of somatic structural variations (SVs) and understanding their mutational mechanisms in cancer might contribute to understanding biological differences and help to identify new therapeutic targets. Unfortunately, characterization of complex SVs across the whole genome and the mutational mechanisms underlying esophageal squamous cell carcinoma (ESCC) is largely unclear. To define a comprehensive catalog of somatic SVs, affected target genes, and their underlying mechanisms in ESCC, we re-analyzed whole-genome sequencing (WGS) data from 31 ESCCs using Meerkat algorithm to predict somatic SVs and Patchwork to determine copy-number changes. We found deletions and translocations with NHEJ and alt-EJ signature as the dominant SV types, and 16% of deletions were complex deletions. SVs frequently led to disruption of cancer-associated genes (e.g., CDKN2A and NOTCH1) with different mutational mechanisms. Moreover, chromothripsis, kataegis, and breakage-fusion-bridge (BFB) were identified as contributing to locally mis-arranged chromosomes that occurred in 55% of ESCCs. These genomic catastrophes led to amplification of oncogene through chromothripsis-derived double-minute chromosome formation (e.g., FGFR1 and LETM2) or BFB-affected chromosomes (e.g., CCND1, EGFR, ERBB2, MMPs, and MYC), with approximately 30% of ESCCs harboring BFB-derived CCND1 amplification. Furthermore, analyses of copy-number alterations reveal high frequency of whole-genome duplication (WGD) and recurrent focal amplification of CDCA7 that might act as a potential oncogene in ESCC. Our findings reveal molecular defects such as chromothripsis and BFB in malignant transformation of ESCCs and demonstrate diverse models of SVs-derived target genes in ESCCs. These genome-wide SV profiles and their underlying mechanisms provide preventive, diagnostic, and therapeutic implications for ESCCs. PMID:26833333

  11. Phylogeography of the endangered Cathaya argyrophylla (Pinaceae) inferred from sequence variation of mitochondrial and nuclear DNA.

    PubMed

    Wang, Hong-Wei; Ge, Song

    2006-11-01

    Cathaya argyrophylla is an endangered conifer restricted to subtropical mountains of China. To study the phylogeographical pattern and demographic history of C. argyrophylla, species-wide genetic variation was investigated using sequences of maternally inherited mtDNA and biparentally inherited nuclear DNA. Of 15 populations sampled from all four distinct regions, only three mitotypes were detected at two loci, with no single region having a mixed composition (G(ST) = 1). Average nucleotide diversity (theta(ws) = 0.0024; pi(s) = 0.0029) across eight nuclear loci is significantly lower than those found for other conifers (theta(ws) = 0.003 to 0.015; pi(s) = 0.002 to 0.012) based on estimates of multiple loci. Because it had the highest diversity among the eight nuclear loci and evolved neutrally, one locus (2009) was further used for phylogeographical studies, and eight haplotypes resulting from 12 polymorphic sites were obtained from 98 individuals. All four distinct regions had at least four haplotypes, with the Dalou region (DL) having the highest diversity and the Bamian region (BM) the lowest, paralleling the result of the eight nuclear loci. An AMOVA revealed a significant proportion of diversity attributable to differences among regions (13.4%) and among populations within regions (8.9%). F(ST) analysis also indicated significantly high differentiation among populations (F(ST) = 0.22) and between regions (F(ST) = 0.12-0.38). Non-overlapping distribution of mitotypes and high genetic differentiation among the distinct geographical groups suggest the existence of at least four separate glacial refugia. Based on network and mismatch distribution analyses, we do not find evidence of long distance dispersal and population expansion in C. argyrophylla. Ex situ conservation and artificial crossing are recommended for the management of this endangered species.
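
    The two diversity statistics quoted in this abstract can be illustrated with their textbook definitions. The sketch below is a simplification: it computes per-site nucleotide diversity (pi) and Watterson's theta over all aligned sites, whereas the abstract's subscripts suggest the reported values were restricted to a subset of sites, and it assumes an ungapped alignment. The function names and toy sequences are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """pi: mean number of pairwise differences, per site."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return total_diffs / len(pairs) / length


def watterson_theta(seqs):
    """theta_W: segregating sites divided by the harmonic number a_n, per site."""
    n, length = len(seqs), len(seqs[0])
    segregating = sum(len(set(column)) > 1 for column in zip(*seqs))
    a_n = sum(1.0 / i for i in range(1, n))
    return segregating / a_n / length


# Toy alignment (not from the study): 4 sequences, 10 sites, 2 segregating sites.
toy = [
    "ACGTACGTAC",
    "ACGTACGTAT",
    "ACGAACGTAC",
    "ACGTACGTAC",
]
pi = nucleotide_diversity(toy)
theta = watterson_theta(toy)
```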

  12. Fin whale MDH-1 and MPI allozyme variation is not reflected in the corresponding DNA sequences

    PubMed Central

    Olsen, Morten Tange; Pampoulie, Christophe; Daníelsdóttir, Anna K; Lidh, Emmelie; Bérubé, Martine; Víkingsson, Gísli A; Palsbøll, Per J

    2014-01-01

    The appeal of genetic inference methods to assess population genetic structure and guide management efforts is grounded in the correlation between the genetic similarity and gene flow among populations. Effects of such gene flow are typically genomewide; however, some loci may appear as outliers, displaying above or below average genetic divergence relative to the genomewide level. Above-average population genetic divergence may be due to divergent selection as a result of local adaptation. Consequently, substantial efforts have been directed toward such outlying loci in order to identify traits subject to local adaptation. Here, we report the results of an investigation into the molecular basis of the substantial degree of genetic divergence previously reported at allozyme loci among North Atlantic fin whale (Balaenoptera physalus) populations. We sequenced the exons encoding the two most divergent allozyme loci (MDH-1 and MPI) and failed to detect any nonsynonymous substitutions. Following extensive error checking and analysis of additional bioinformatic and morphological data, we hypothesize that the observed allozyme polymorphisms may reflect phenotypic plasticity at the cellular level, perhaps as a response to nutritional stress. While such plasticity is intriguing in itself, and of fundamental evolutionary interest, our key finding is that the observed allozyme variation does not appear to be a result of genetic drift, migration, or selection on the MDH-1 and MPI exons themselves, stressing the importance of interpreting allozyme data with caution. As for North Atlantic fin whale population structure, our findings support the low levels of differentiation found in previous analyses of DNA nucleotide loci. PMID:24963377

  13. Sequence Variation in Superoxide Dismutase Gene of Toxoplasma gondii among Various Isolates from Different Hosts and Geographical Regions.

    PubMed

    Wang, Shuai; Cao, Aiping; Li, Xun; Zhao, Qunli; Liu, Yuan; Cong, Hua; He, Shenyi; Zhou, Huaiyu

    2015-06-01

    Toxoplasma gondii, an obligate intracellular protozoan parasite of the phylum Apicomplexa, can infect all warm-blooded vertebrates, including humans, livestock, and marine mammals. The aim of this study was to investigate whether superoxide dismutase (SOD) of T. gondii can be used as a new marker for genetic study or a potential vaccine candidate. The partial genome region of the SOD gene was amplified and sequenced from 10 different T. gondii isolates from different parts of the world, and all the sequences were examined by PCR-RFLP, sequence analysis, and phylogenetic reconstruction. The results showed that partial SOD gene sequences ranged from 1,702 bp to 1,712 bp and A + T contents varied from 50.1% to 51.1% among all examined isolates. Sequence alignment analysis identified a total of 43 variable nucleotide positions and showed 97.5% sequence similarity of the SOD gene among all examined isolates. Phylogenetic analysis revealed that these SOD sequences were not an effective molecular marker for differential identification of T. gondii strains. The research demonstrated the existence of low sequence variation in the SOD gene among T. gondii strains of different genotypes from different hosts and geographical regions.
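
    The relation between the 43 variable positions and the reported 97.5% similarity can be checked with a small calculation. This is a minimal sketch that assumes similarity is the share of invariant positions over the aligned length; the abstract does not state how the figure was derived, and the function name is hypothetical.

```python
def percent_similarity(variable_positions: int, alignment_length: int) -> float:
    """Share of invariant alignment positions, as a percentage (assumed definition)."""
    if alignment_length <= 0:
        raise ValueError("alignment_length must be positive")
    return 100.0 * (1 - variable_positions / alignment_length)


# The abstract reports partial sequences of 1,702 to 1,712 bp and 43 variable positions.
for length in (1702, 1712):
    print(length, round(percent_similarity(43, length), 1))
```

    Under this assumption both ends of the reported length range round to 97.5%.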

  14. Genetic variation in safflower (Carthamus tinctorious L.) for seed quality-related traits and inter-simple sequence repeat (ISSR) markers.

    PubMed

    Golkar, Pooran; Arzani, Ahmad; Rezaei, Abdolmajid M

    2011-01-01

    Safflower (Carthamus tinctorious L.) is an oilseed crop that is valued as a source of high quality vegetable oil. The genetic diversity of 16 safflower genotypes originating from different geographical regions of Iran, along with some of exotic origin, was evaluated. Eight different seed quality-related traits including fatty acid composition of seed oil (stearic acid, palmitic acid, oleic acid and linoleic acid), the contents of oil, protein, fiber and ash in the seeds, as well as 20 inter-simple sequence repeat (ISSR) polymorphic primers were used in this study. Analysis of variance showed significant variation in genotypes for the seed quality-related traits. Based on ISSR markers, a total of 204 bands were amplified and 149 bands (about 70%) of these were polymorphic. Cluster analysis based on either biochemical or molecular markers classified the genotypes into four groups, showing some similarities between molecular and biochemical markers for evaluated genotypes. A logical similarity between the genotype clusters based on molecular data with their geographical origins was observed.

  15. A classification of glycosyl hydrolases based on amino acid sequence similarities.

    PubMed Central

    Henrissat, B

    1991-01-01

    The amino acid sequences of 301 glycosyl hydrolases and related enzymes have been compared. A total of 291 sequences corresponding to 39 EC entries could be classified into 35 families. Only ten sequences (less than 5% of the sample) could not be assigned to any family. With the sequences available for this analysis, 18 families were found to be monospecific (containing only one EC number) and 17 were found to be polyspecific (containing at least two EC numbers). Implications on the folding characteristics and mechanism of action of these enzymes and on the evolution of carbohydrate metabolism are discussed. With the steady increase in sequence and structural data, it is suggested that the enzyme classification system should perhaps be revised. PMID:1747104

  16. Destruxin analogs: variations of the alpha-hydroxy acid side chain.

    PubMed

    Cavelier, F; Jacquier, R; Mercadier, J L; Verducci, J; Traris, M; Vey, A

    1997-08-01

    This work describes the synthesis of three destruxin E cyclodepsipeptidic analogs. These compounds have an identical amino acid sequence but differ by the nature of the hydroxy acid residue, which is 2-hydroxy-3-phenylpropionic (Hpp), 2-hydroxy-5-trimethylsilyl-4-pentynoic (Hpy-TMS) and 2-hydroxy-4-pentynoic (Hpy) acid, respectively. The insecticidal properties on the Galleria mellonella larvae (paralysis and lethal effect) of these analogs are presented in comparison with the natural destruxin E. All these compounds have toxic effects, the most potent being Hpy that induces the same effect as destruxin E.

  17. In silico comparative analysis of DNA and amino acid sequences for prion protein gene.

    PubMed

    Kim, Y; Lee, J; Lee, C

    2008-01-01

    Genetic variability might contribute to species specificity of prion diseases in various organisms. In this study, structures of the prion protein gene (PRNP) and its amino acids were compared among species for which sequence data were available. Comparisons of PRNP DNA sequences among 12 species including human, chimpanzee, monkey, bovine, ovine, dog, mouse, rat, wallaby, opossum, chicken and zebrafish allowed us to identify candidate regulatory regions in intron 1 and 3'-untranslated region (UTR) in addition to the coding region. Highly conserved putative binding sites for transcription factors, such as heat shock factor 2 (HSF2) and myocyte enhancer factor 2 (MEF2), were discovered in intron 1. In the 3'-UTR, the functional sequence (ATTAAA) for nucleus-specific polyadenylation was found in all the analysed species. The functional sequence (TTTTTAT) for maturation-specific polyadenylation was observed in identical form only in ovine, with one or two nucleotide mismatches in the other species. A comparison of the amino acid sequences in 53 species revealed high sequence identity. In particular, the octapeptide repeat region was observed in all the species except frog and zebrafish. Functional changes and susceptibility to prion diseases with various isoforms of prion protein could be caused by numeric variability and conformational changes discovered in the repeat sequences.
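
    The motif comparison described above (an exact ATTAAA hit versus TTTTTAT with one or two mismatches) can be illustrated with a short scan. This is only a sketch: the two sequences are invented toy strings, not PRNP 3'-UTR data, and the helper names are hypothetical. The abstract does not describe the search tool the authors used.

```python
def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def best_motif_match(sequence: str, motif: str) -> tuple[int, int]:
    """Return (0-based start, mismatches) of the window closest to the motif."""
    seq = sequence.upper()
    k = len(motif)
    if len(seq) < k:
        raise ValueError("sequence is shorter than the motif")
    start = min(range(len(seq) - k + 1), key=lambda i: hamming(seq[i:i + k], motif))
    return start, hamming(seq[start:start + k], motif)


# Invented toy sequences for illustration only.
toy_utrs = {
    "exact": "GCATTTTTATGGCATTAAAGC",
    "one_mismatch": "GCATTTTAATGGCATTAAAGC",
}

for name, utr in toy_utrs.items():
    print(name, best_motif_match(utr, "ATTAAA"), best_motif_match(utr, "TTTTTAT"))
```

    Reporting the mismatch count rather than a yes/no hit makes it possible to separate an identical motif from the near matches the abstract describes in the other species.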

  18. Complete amino acid sequence of the N-terminal extension of calf skin type III procollagen.

    PubMed Central

    Brandt, A; Glanville, R W; Hörlein, D; Bruckner, P; Timpl, R; Fietzek, P P; Kühn, K

    1984-01-01

    The N-terminal extension peptide of type III procollagen, isolated from foetal-calf skin, contains 130 amino acid residues. To determine its amino acid sequence, the peptide was reduced and carboxymethylated or aminoethylated and fragmented with trypsin, Staphylococcus aureus V8 proteinase and bacterial collagenase. Pyroglutamate aminopeptidase was used to deblock the N-terminal collagenase fragment to enable amino acid sequencing. The type III collagen extension peptide is homologous to that of the alpha 1 chain of type I procollagen with respect to a three-domain structure. The N-terminal 79 amino acids, which contain ten of the 12 cysteine residues, form a compact globular domain. The next 39 amino acids are in a collagenous triplet sequence (Gly-Xaa-Yaa)n with a high hydroxyproline content. Finally, another short non-collagenous domain of 12 amino acids ends at the cleavage site for procollagen aminopeptidase, which cleaves a proline-glutamine bond. In contrast with type I procollagen, the type III procollagen extension peptides contain interchain disulphide bridges located at the C-terminus of the triple-helical domain. PMID:6331392

  19. Tandem repeat sequence variation as causative cis-eQTLs for protein-coding gene expression variation: the case of CSTB.

    PubMed

    Borel, Christelle; Migliavacca, Eugenia; Letourneau, Audrey; Gagnebin, Maryline; Béna, Frédérique; Sailani, M Reza; Dermitzakis, Emmanouil T; Sharp, Andrew J; Antonarakis, Stylianos E

    2012-08-01

    Association studies have revealed expression quantitative trait loci (eQTLs) for a large number of genes. However, the causative variants that regulate gene expression levels are generally unknown. We hypothesized that copy-number variation of sequence repeats contributes to the expression variation of some genes. Our laboratory has previously shown that the rare expansion of a repeat c.-174CGGGGCGGGGCG in the promoter region of the CSTB gene causes a silencing of the gene, resulting in progressive myoclonus epilepsy. Here, we genotyped the repeat length and quantified CSTB expression by quantitative real-time polymerase chain reaction in 173 lymphoblastoid cell lines (LCLs) and fibroblast samples from the GenCord collection. The majority of alleles contain either two or three copies of this repeat. Independent analysis revealed that the c.-174CGGGGCGGGGCG repeat length is strongly associated with CSTB expression (P = 3.14 × 10⁻¹¹) in LCLs only. Examination of both genotyped and imputed single-nucleotide polymorphisms (SNPs) within 2 Mb of CSTB revealed that the dodecamer repeat represents the strongest cis-eQTL for CSTB in LCLs. We conclude that the common two or three copy variation is likely the causative cis-eQTL for CSTB expression variation. More broadly, we propose that polymorphic tandem repeats may represent the causative variation of a fraction of cis-eQTLs in the genome.

  20. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824 Patents... And/or Amino Acid Sequences § 1.824 Form and format for nucleotide and/or amino acid sequence... readable form may be created by any means, such as word processors, nucleotide/amino acid sequence...

  1. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824 Patents... And/or Amino Acid Sequences § 1.824 Form and format for nucleotide and/or amino acid sequence... readable form may be created by any means, such as word processors, nucleotide/amino acid sequence...

  2. 37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... nucleotide and/or amino acid sequence submissions in computer readable form. 1.824 Section 1.824 Patents... And/or Amino Acid Sequences § 1.824 Form and format for nucleotide and/or amino acid sequence... readable form may be created by any means, such as word processors, nucleotide/amino acid sequence...

  3. Complete amino acid sequence of branched-chain amino acid aminotransferase (transaminase B) of Salmonella typhimurium, identification of the coenzyme-binding site and sequence comparison analysis

    SciTech Connect

    Feild, M.J.

    1988-01-01

    The complete amino acid sequence of the subunit of branched-chain amino acid aminotransferase of Salmonella typhimurium was determined by automated Edman degradation of peptide fragments generated by chemical and enzymatic digestion of S-carboxymethylated and S-pyridylethylated transaminase B. Peptide fragments of transaminase B were generated by treatment of the enzyme with trypsin, Staphylococcus aureus V8 protease, endoproteinase Lys-C, and cyanogen bromide. Protocols were developed for separation of the peptide fragments by reverse-phase high performance liquid chromatography (HPLC), ion-exchange HPLC, and SDS-urea gel electrophoresis. The enzyme subunit contains 308 amino acid residues and has a molecular weight of 33,920 daltons. The coenzyme-binding site was determined by treatment of the enzyme, containing bound pyridoxal 5'-phosphate, with tritiated sodium borohydride prior to trypsin digestion. Monitoring radioactivity incorporation and comparing peptide maps with an apoenzyme tryptic digest allowed identification of the pyridoxylated peptide, which was isolated by reverse-phase HPLC and sequenced. The coenzyme-binding site is a lysyl residue at position 159. Some peptides were further characterized by fast atom bombardment mass spectrometry.

  4. Maternal effects and maternal selection arising from variation in allocation of free amino acid to eggs

    PubMed Central

    Newcombe, Devi; Hunt, John; Mitchell, Christopher; Moore, Allen J

    2015-01-01

    Maternal provisioning can have profound effects on offspring phenotypes, or maternal effects, especially early in life. One ubiquitous form of provisioning is in the makeup of the egg. However, only a few studies examine the role of specific egg constituents in maternal effects, especially as they relate to maternal selection (a standardized selection gradient reflecting the covariance between maternal traits and offspring fitness). Here, we report on the evolutionary consequences of differences in maternal acquisition and allocation of amino acids to eggs. We manipulated acquisition by varying maternal diet (milkweed or sunflower) in the large milkweed bug, Oncopeltus fasciatus. Variation in allocation was detected by examining two source populations with different evolutionary histories and life-history response to sunflower as food. We measured amino acid composition in eggs in this 2 × 2 design and found significant effects of source population and maternal diet on egg and nymph mass and of source population, maternal diet, and their interaction on amino acid composition of eggs. We measured significant linear and quadratic maternal selection on offspring mass associated with variation in amino acid allocation. Visualizing the performance surface along the major axes of nonlinear selection and plotting the mean amino acid profile of eggs from each treatment onto the surface revealed a saddle-shaped fitness surface. While maternal selection appears to have influenced how females allocate amino acids, this maternal effect did not evolve equally in the two populations. Furthermore, none of the population means coincided with peak performance. Thus, we found that the composition of free amino acids in eggs was due to variation in both acquisition and allocation, which had significant fitness effects and created selection. However, although there can be an evolutionary response to novel food resources, females may be constrained from reaching phenotypic optima with

  5. Repetitive sequence variation and dynamics in the ribosomal DNA array of Saccharomyces cerevisiae as revealed by whole-genome resequencing

    PubMed Central

    James, Stephen A.; O'Kelly, Michael J.T.; Carter, David M.; Davey, Robert P.; van Oudenaarden, Alexander; Roberts, Ian N.

    2009-01-01

    Ribosomal DNA (rDNA) plays a key role in ribosome biogenesis, encoding genes for the structural RNA components of this important cellular organelle. These genes are vital for efficient functioning of the cellular protein synthesis machinery and as such are highly conserved and normally present in high copy numbers. In the baker's yeast Saccharomyces cerevisiae, there are more than 100 rDNA repeats located at a single locus on chromosome XII. Stability and sequence homogeneity of the rDNA array are essential for function, and this is achieved primarily by the mechanism of gene conversion. Detecting variation within these arrays is extremely problematic due to their large size and repetitive structure. In an attempt to address this, we have analyzed over 35 Mbp of rDNA sequence obtained from whole-genome shotgun sequencing (WGSS) of 34 strains of S. cerevisiae. Contrary to expectation, we find significant rDNA sequence variation exists within individual genomes. Many of the detected polymorphisms are not fully resolved. For this type of sequence variation, we introduce the term partial single nucleotide polymorphism, or pSNP. Comparative analysis of the complete data set reveals that different S. cerevisiae genomes possess different patterns of rDNA polymorphism, with much of the variation located within the rapidly evolving nontranscribed intergenic spacer (IGS) region. Furthermore, we find that strains known to have either structured or mosaic/hybrid genomes can be distinguished from one another based on rDNA pSNP number, indicating that pSNP dynamics may provide a reliable new measure of genome origin and stability. PMID:19141593
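
    The idea of a partially resolved polymorphism can be sketched as a classification of one rDNA position by the fraction of shotgun reads that differ from the reference base. This is a hedged illustration only: the function name and both thresholds (`min_frac`, `fixed_frac`) are assumptions chosen for the example, not the criteria used in the study.

```python
from collections import Counter


def classify_position(ref_base: str, read_bases: str,
                      min_frac: float = 0.05, fixed_frac: float = 0.95):
    """Label one position from the bases that reads show there.

    min_frac and fixed_frac are illustrative cut-offs (assumptions).
    Returns (label, fraction of reads differing from the reference).
    """
    counts = Counter(b for b in read_bases.upper() if b in "ACGT")
    depth = sum(counts.values())
    if depth == 0:
        return "no coverage", 0.0
    variant_frac = 1 - counts[ref_base.upper()] / depth
    if variant_frac >= fixed_frac:
        label = "fully resolved SNP"   # variant present in essentially all repeats
    elif variant_frac >= min_frac:
        label = "pSNP"                 # variant present in only part of the array
    else:
        label = "reference"
    return label, variant_frac


print(classify_position("A", "A" * 70 + "G" * 30)[0])
print(classify_position("A", "G" * 100)[0])
print(classify_position("A", "A" * 99 + "G")[0])
```

    Because reads from all repeats in the array pile up on a single reference unit, an intermediate variant fraction is what separates a pSNP from a variant shared by every repeat.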

  6. Combined examination of sequence and copy number variations in human deafness genes improves diagnosis for cases of genetic deafness

    PubMed Central

    2014-01-01

    Background: Copy number variations (CNVs) are the major type of structural variation in the human genome, and are more common than DNA sequence variations in populations. CNVs are important factors for human genetic and phenotypic diversity. Many CNVs have been either associated with resistance to diseases or identified as the cause of diseases. Currently little is known about the role of CNVs in causing deafness. CNVs are currently not analyzed by conventional genetic analysis methods to study deafness. Here we detected both DNA sequence variations and CNVs affecting 80 genes known to be required for normal hearing. Methods: Coding regions of the deafness genes were captured by a hybridization-based method and processed through the standard next-generation sequencing (NGS) protocol using the Illumina platform. Samples hybridized together in the same reaction were analyzed to obtain CNVs. A read depth based method was used to measure CNVs at the resolution of a single exon. Results were validated by the quantitative PCR (qPCR) based method. Results: Among 79 sporadic cases clinically diagnosed with sensorineural hearing loss, we identified previously-reported disease-causing sequence mutations in 16 cases. In addition, we identified a total of 97 CNVs (72 CNV gains and 25 CNV losses) in 27 deafness genes. The CNVs included homozygous deletions which may directly give rise to deleterious effects on protein functions known to be essential for hearing, as well as heterozygous deletions and CNV gains compounded with sequence mutations in deafness genes that could potentially harm gene functions. Conclusions: We studied how CNVs in known deafness genes may result in deafness. Data provided here served as a basis to explain how CNVs disrupt normal functions of deafness genes. These results may significantly expand our understanding about how various types of genetic mutations cause deafness in humans. PMID:25342930
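
    A read-depth comparison among co-hybridized samples, as outlined in the Methods, might look like the sketch below. It is an assumption-laden illustration, not the authors' pipeline: the per-sample median normalization, the gain and loss ratio cut-offs, the function name and the toy depths are all invented for the example.

```python
from statistics import median


def call_exon_cnvs(depths, gain=1.3, loss=0.7):
    """Call per-exon copy-number state for samples captured in one reaction.

    depths: {sample name: [read depth per exon]}, same exon order for all.
    gain/loss are illustrative ratio cut-offs (assumptions).
    """
    samples = list(depths)
    n_exons = len(depths[samples[0]])
    normalized = {}
    for sample in samples:
        if len(depths[sample]) != n_exons:
            raise ValueError("all samples need the same number of exons")
        scale = median(depths[sample])  # removes differences in sequencing yield
        if scale <= 0:
            raise ValueError(f"sample {sample!r} has no usable coverage")
        normalized[sample] = [d / scale for d in depths[sample]]

    calls = {sample: [] for sample in samples}
    for exon in range(n_exons):
        # Median across samples serves as the expected two-copy depth.
        reference = median(normalized[s][exon] for s in samples)
        for sample in samples:
            if reference <= 0:
                calls[sample].append("undetermined")
                continue
            ratio = normalized[sample][exon] / reference
            if ratio >= gain:
                calls[sample].append("gain")
            elif ratio <= loss:
                calls[sample].append("loss")
            else:
                calls[sample].append("normal")
    return calls


# Invented depths: s4 has half the depth at exon 2, s5 has 1.5x depth at exon 3.
toy_depths = {
    "s1": [100, 100, 100, 100],
    "s2": [110, 105, 95, 100],
    "s3": [90, 100, 100, 95],
    "s4": [100, 50, 100, 100],
    "s5": [100, 100, 150, 100],
}
for sample, sample_calls in call_exon_cnvs(toy_depths).items():
    print(sample, sample_calls)
```

    Using the other samples in the same capture reaction as the reference is what lets capture-efficiency differences between exons cancel out; calls from any such sketch would still need orthogonal confirmation, which the study obtained by qPCR.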

  7. Phylogenetic Relationships and Genetic Variation in Longidorus and Xiphinema Species (Nematoda: Longidoridae) Using ITS1 Sequences of Nuclear Ribosomal DNA

    PubMed Central

    Ye, Weimin; Szalanski, Allen L.; Robbins, R. T.

    2004-01-01

    Genetic analyses using DNA sequences of nuclear ribosomal DNA ITS1 were conducted to determine the extent of genetic variation within and among Longidorus and Xiphinema species. DNA sequences were obtained from samples collected from Arkansas, California and Australia as well as 4 Xiphinema DNA sequences from GenBank. The sequences of the ITS1 region including the 3' end of the 18S rDNA gene and the 5' end of the 5.8S rDNA gene ranged from 1020 bp to 1244 bp for the 9 Longidorus species, and from 870 bp to 1354 bp for the 7 Xiphinema species. Nucleotide frequencies were: A = 25.5%, C = 21.0%, G = 26.4%, and T = 27.1%. Genetic variation between the two genera had a maximum divergence of 38.6% between X. chambersi and L. crassus. Genetic variation among Xiphinema species ranged from 3.8% between X. diversicaudatum and X. bakeri to 29.9% between X. chambersi and X. italiae. Within Longidorus, genetic variation ranged from 8.9% between L. crassus and L. grandis to 32.4% between L. fragilis and L. diadecturus. Intraspecific genetic variation in X. americanum sensu lato ranged from 0.3% to 1.9%, while genetic variation in L. diadecturus was 0.8% and in L. biformis ranged from 0.6% to 10.9%. Identical sequences were obtained between the two populations of L. grandis, and between the two populations of X. bakeri. Phylogenetic analyses based on the ITS1 DNA sequence data were conducted on each genus separately using both maximum parsimony and maximum likelihood analysis. Among the Longidorus taxa, 4 subgroups are supported: L. grandis, L. crassus, and L. elongatus are in one cluster; L. biformis and L. paralongicaudatus are in a second cluster; L. fragilis and L. breviannulatus are in a third cluster; and L. diadecturus is in a fourth cluster. Among the Xiphinema taxa, 3 subgroups are supported: X. americanum with X. chambersi, X. bakeri with X. diversicaudatum, and X. italiae and X. vuittenezi forming a sister group with X. index. The relationships observed in this study
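
    Percent divergence values such as those quoted for this record are often computed as an uncorrected pairwise distance over aligned sites. The sketch below assumes that measure; the abstract does not say which distance was used, and the function name and toy sequences are hypothetical.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected percent divergence between two aligned sequences.

    Columns with a gap ('-') or ambiguous base ('N') in either sequence
    are skipped (an assumption about gap handling).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must come from the same alignment")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        differences += a != b
    if compared == 0:
        raise ValueError("no comparable sites")
    return 100.0 * differences / compared


# Invented aligned fragments, not ITS1 data.
print(p_distance("ACGTACGTAC", "ACGTTCGTAC"))
print(round(p_distance("ACG-ACGTAC", "ACGTACGAAC"), 2))
```

    With ITS1 lengths that vary as widely as reported here, how gapped columns are treated can change the resulting percentages noticeably, so that choice is worth stating alongside any divergence figure.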

  8. Cleavage of nucleic acids

    DOEpatents

    Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.

    2010-11-09

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.

  9. Nucleic acid detection compositions

    DOEpatents

    Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann; Dahlberg, James E.

    2008-08-05

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.

  10. Nucleic acid detection assays

    DOEpatents

    Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann; Dahlberg, James E.

    2005-04-05

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.

  11. Cleavage of nucleic acids

    DOEpatents

    Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.

    2007-12-11

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.

  12. The amino acid sequence of cytochromes c-551 from three species of Pseudomonas

    PubMed Central

    Ambler, R. P.; Wynn, Margaret

    1973-01-01

    The amino acid sequences of the cytochromes c-551 from three species of Pseudomonas have been determined. Each resembles the protein from Pseudomonas strain P6009 (now known to be Pseudomonas aeruginosa, not Pseudomonas fluorescens) in containing 82 amino acids in a single peptide chain, with a haem group covalently attached to cysteine residues 12 and 15. In all four sequences 43 residues are identical. Although by bacteriological criteria the organisms are closely related, the differences between pairs of sequences range from 22% to 39%. These values should be compared with the differences in the sequence of mitochondrial cytochrome c between mammals and amphibians (about 18%) or between mammals and insects (about 33%). Detailed evidence for the amino acid sequences of the proteins has been deposited as Supplementary Publication SUP 50015 at the National Lending Library for Science and Technology, Boston Spa, Yorks. LS23 7BQ, U.K., from whom copies can be obtained on the terms indicated in Biochem. J. (1973), 131, 5. PMID:4352718

  13. Draft Genome Sequence of Sorghum Grain Mold Fungus Epicoccum sorghinum, a Producer of Tenuazonic Acid

    PubMed Central

    Oliveira, Rodrigo C.; Davenport, Karen W.; Hovde, Blake; Silva, Danielle; Chain, Patrick S. G.; Correa, Benedito

    2017-01-01

    The facultative plant pathogen Epicoccum sorghinum is associated with grain mold of sorghum and produces the mycotoxin tenuazonic acid. This fungus can have serious economic impact on sorghum production. Here, we report the draft genome sequence of E. sorghinum (USPMTOX48). PMID:28126937

  14. Snake venom. The amino acid sequence of protein A from Dendroaspis polylepis polylepis (black mamba) venom.

    PubMed

    Joubert, F J; Strydom, D J

    1980-12-01

    Protein A from Dendroaspis polylepis polylepis venom comprises 81 amino acids, including ten half-cystine residues. The complete primary structures of protein A and its variant A' were elucidated. The sequences of proteins A and A', which differ in a single position, show no homology with various neurotoxins and non-neurotoxic proteins and represent a new type of elapid venom protein.

  15. Draft Genome Sequence of Bacillus coagulans NL01, a Wonderful L-Lactic Acid Producer

    PubMed Central

    Zheng, Zhaojuan; Jiang, Ting; Lin, Xi; Zhou, Jie

    2015-01-01

    Here, we report the draft genome sequence of Bacillus coagulans NL01, which can produce L-lactic acid of high optical purity using xylose as a sole carbon source. The draft genome is 3,505,081 bp, with 144 contigs. About 3,903 protein-coding genes and 92 rRNAs are predicted from this assembly. PMID:26089419

  16. Characterization and Sequence Variation in the rDNA Region of Six Nematode Species of the Genus Longidorus (Nematoda)

    PubMed Central

    De Luca, F.; Reyes, A.; Grunder, J.; Kunz, P.; Agostinelli, A.; De Giorgi, C.; Lamberti, F.

    2004-01-01

    Total DNA was isolated from individual nematodes of the species Longidorus helveticus, L. macrosoma, L. arthensis, L. profundorum, L. elongatus, and L. raskii collected in Switzerland. The ITS region and D1-D2 expansion segments of the 26S rDNA were amplified and cloned. The sequences obtained were aligned in order to investigate sequence diversity and to infer the phylogenetic relationships among the six Longidorus species. D1-D2 sequences were more conserved than the ITS sequences, which varied widely in primary structure and length, and no consensus was observed. Phylogenetic analyses using the neighbor-joining, maximum parsimony and maximum likelihood methods were performed with three different sequence data sets: ITS1-ITS2, 5.8S-D1-D2, and the combined ITS1-ITS2+5.8S-D1-D2 sequences. All multiple alignments yielded similar basic trees supporting the existence of the six species established using morphological characters. These sequence data also provided evidence that the different regions of the rDNA are characterized by different evolutionary rates and by different factors associated with the generation of extreme size variation. PMID:19262800

  17. Summer and winter variations of dicarboxylic acids, fatty acids and benzoic acid in PM2.5 in Pearl Delta River Region, China

    NASA Astrophysics Data System (ADS)

    Ho, K. F.; Ho, S. S. H.; Lee, S. C.; Kawamura, K.; Zou, S. C.; Cao, J. J.; Xu, H. M.

    2010-11-01

    Ground-based PM2.5 samples collected in the Pearl River Delta (PRD) region during winter and summer (from 14 December 2006 to 28 January 2007 in winter and from 4 July 2007 to 9 August 2007 in summer) were analyzed for 30 water-soluble organic species, including dicarboxylic acids, ketocarboxylic acids and dicarbonyls, nine fatty acids, and benzoic acid. Molecular distributions of dicarboxylic acids demonstrated that oxalic acid (C2) was the most abundant species followed by phthalic acid (Ph) in the PRD region. The concentrations of total dicarboxylic acids ranged from 99 to 1340 ng m⁻³, with an average of 438 ± 267 ng m⁻³ in PRD. The concentrations of total ketocarboxylic acids ranged from 0.6 to 207 ng m⁻³ (43 ± 48 ng m⁻³ on average) while the concentrations of total α-dicarbonyls, including glyoxal and methylglyoxal, ranged from 0.2 to 89 ng m⁻³, with an average of 11 ± 18 ng m⁻³ in PRD. The total quantified water-soluble organic carbon (TQWOC) accounted for 3.4 ± 2.2% of OC and 14.3 ± 10.3% of water-soluble OC (WSOC). Hexadecanoic acid (C16:0), octadecanoic acid (C18:0) and oleic acid (C18:1) are the three most abundant fatty acids in PRD. The distributions of fatty acids are characterized by a strong even carbon number predominance with a maximum (Cmax) at hexadecanoic acid (C16:0). The ratio of C18:1 to C18:0 acts as an indicator for aerosol aging. In PRD, the average C18:1/C18:0 ratio was 0.53 ± 0.39, suggesting an enhanced photochemical degradation of unsaturated fatty acid. Seasonal variations of the pollutant concentrations were found in the four sampling cities. Higher concentrations of TQWOC were observed in winter (544 ng m⁻³) than in summer (318 ng m⁻³). However, the abundances of TQWOC in OC mass were higher in summer (1.8-12.4%, 5.4% on average) than in winter (1.1-5.7%, 2.6% on average), being consistent with enhanced secondary production of dicarboxylic acids in warmer weather. Spatial variations of water-soluble dicarboxylic acids were characterized

  18. Variation.

    ERIC Educational Resources Information Center

    Hamilton City Board of Education (Ontario).

    Suggestions for studying the topic of variation of individuals and objects (balls) to help develop elementary school students' measurement, comparison, classification, evaluation, and data collection and recording skills are made. General suggestions of variables that can be investigated are made for the study of human variation. Twelve specific…

  19. Amino acid sequences of heterotrophic and photosynthetic ferredoxins from the tomato plant (Lycopersicon esculentum Mill.).

    PubMed

    Kamide, K; Sakai, H; Aoki, K; Sanada, Y; Wada, K; Green, L S; Yee, B C; Buchanan, B B

    1995-11-01

    Several forms (isoproteins) of ferredoxin in roots, leaves, and green and red pericarps in tomato plants (Lycopersicon esculentum Mill.) were earlier identified on the basis of N-terminal amino acid sequence and chromatographic behavior (Green et al. 1991). In the present study, a large scale preparation made possible determination of the full-length amino acid sequence of the two ferredoxins from leaves. The ferredoxins characteristic of fruit and root were sequenced from the amino terminus to the 30th residue or beyond. The leaf ferredoxins were confirmed to be expressed in pericarp of both green and red fruit. The ferredoxins characteristic of fruit and root appeared to be restricted to those tissues. The results extend earlier findings in demonstrating that ferredoxin occurs in the major organs of the tomato plant where it appears to function irrespective of photosynthetic competence.

  20. Amino acid sequence of myoglobin from white-tailed deer (Odocoileus virginianus).

    PubMed

    Joseph, Poulson; Suman, Surendranath P; Li, Shuting; Fontaine, Michele; Steinke, Laurey

    2012-10-01

    Our objective was to determine the primary structure of white-tailed deer myoglobin (Mb). White-tailed deer Mb was isolated from cardiac muscles employing ammonium sulfate precipitation and gel-filtration chromatography. The amino acid sequence was determined by Edman degradation. Sequence analyses of intact Mb as well as tryptic- and cyanogen bromide-peptides yielded the complete primary structure of white-tailed deer Mb, which shared 100% similarity with red deer Mb. White-tailed deer Mb consists of 153 amino acid residues and shares more than 96% sequence similarity with myoglobins from meat-producing ruminants, such as cattle, buffalo, sheep, and goat. Similar to sheep and goat myoglobins, white-tailed deer Mb contains 12 histidine residues. Proximal (position 93) and distal (position 64) histidine residues responsible for maintaining the stability of heme are conserved in white-tailed deer Mb.

  1. Nucleotide sequence and the encoded amino acids of human apolipoprotein A-I mRNA.

    PubMed Central

    Law, S W; Brewer, H B

    1984-01-01

    The cDNA clones encoding the precursor form of human liver apolipoprotein A-I (apoA-I), preproapoA-I, have been isolated from a cDNA library. A 17-base synthetic oligonucleotide based on residues 108-113 of apoA-I and a 26-base primer-extended, dideoxynucleotide-terminated cDNA were used as hybridization probes to select for recombinant plasmids bearing the apoA-I sequence. The complete nucleic acid sequence of human liver preproapoA-I has been determined by analysis of the cloned cDNA. The sequence is composed of 801 nucleotides encoding 267 amino acid residues. PreproapoA-I contains an 18-amino-acid prepeptide and a 6-amino-acid propeptide connected to the amino terminus of the 243-amino acid mature apoA-I. Southern blotting analysis of chromosomal DNA obtained from peripheral blood indicated the apoA-I gene is contained in a 2.1-kilobase-pair Pst I fragment and there is no gross difference in structural organization between the normal apoA-I gene and the Tangier disease apoA-I gene. PMID:6198645
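
    The figures quoted in this abstract can be cross-checked with simple codon arithmetic. The following is a minimal sketch in Python, not part of the original record; the constant and function names are hypothetical, and it assumes the 801 nucleotides are coding sequence only, with no stop codon counted.

```python
# Illustrative check of the numbers quoted in the abstract above.
# All names are hypothetical; the values are taken from the abstract.
CODING_NUCLEOTIDES = 801   # length of the coding sequence
PREPEPTIDE_RESIDUES = 18   # signal (pre) peptide
PROPEPTIDE_RESIDUES = 6    # propeptide
MATURE_RESIDUES = 243      # mature apoA-I


def residues_encoded(nucleotides: int) -> int:
    """Return the number of codons in a coding sequence of the given length."""
    if nucleotides % 3 != 0:
        raise ValueError("coding length is not a multiple of three")
    return nucleotides // 3


# Precursor length obtained by summing its three parts.
precursor_residues = PREPEPTIDE_RESIDUES + PROPEPTIDE_RESIDUES + MATURE_RESIDUES

if __name__ == "__main__":
    print(residues_encoded(CODING_NUCLEOTIDES), precursor_residues)
```

    Both routes give 267 residues, so the nucleotide count and the three-part description of the precursor agree.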

  2. Single Amino Acid Variation Underlies Species-Specific Sensitivity to Amphibian Skin-Derived Opioid-like Peptides.

    PubMed

    Vardy, Eyal; Sassano, Maria F; Rennekamp, Andrew J; Kroeze, Wesley K; Mosier, Philip D; Westkaemper, Richard B; Stevens, Craig W; Katritch, Vsevolod; Stevens, Raymond C; Peterson, Randall T; Roth, Bryan L

    2015-06-18

    It has been suggested that the evolution of vertebrate opioid receptors (ORs) follows a vector of increased functionality. Here, we test this idea by comparing human and frog ORs. Interestingly, some of the most potent opioid peptides known have been isolated from amphibian skin secretions. Here we show that such peptides (dermorphin and deltorphin) are highly potent in the human receptors and inactive in frog ORs. The molecular basis for the insensitivity of the frog ORs to these peptides was studied using chimeras and molecular modeling. The insensitivity of the delta OR (DOR) to deltorphin was due to variation of a single amino acid, Trp7.35, which is a leucine in mammalian DORs. Notably, Trp7.35 is completely conserved in all known DOR sequences from lamprey, fish, and amphibians. The deltorphin-insensitive phenotype was verified in fish. Our results provide a molecular explanation for the species selectivity of skin-derived opioid peptides.

  3. Hybridization properties of long nucleic acid probes for detection of variable target sequences, and development of a hybridization prediction algorithm.

    PubMed

    Ohrmalm, Christina; Jobs, Magnus; Eriksson, Ronnie; Golbob, Sultan; Elfaitouri, Amal; Benachenhou, Farid; Strømme, Maria; Blomberg, Jonas

    2010-11-01

    One of the main problems in nucleic acid-based techniques for detection of infectious agents, such as influenza viruses, is that of nucleic acid sequence variation. DNA probes, 70-nt long, some including the nucleotide analog deoxyribose-Inosine (dInosine), were analyzed for hybridization tolerance to different amounts and distributions of mismatching bases, e.g. synonymous mutations, in target DNA. Microsphere-linked 70-mer probes were hybridized in 3M TMAC buffer to biotinylated single-stranded (ss) DNA for subsequent analysis in a Luminex® system. When mismatches interrupted contiguous matching stretches of 6 nt or longer, it had a strong impact on hybridization. Contiguous matching stretches are more important than the same number of matching nucleotides separated by mismatches into several regions. dInosine, but not 5-nitroindole, substitutions at mismatching positions stabilized hybridization remarkably well, comparable to N (4-fold) wobbles in the same positions. In contrast to shorter probes, 70-nt probes with judiciously placed dInosine substitutions and/or wobble positions were remarkably mismatch tolerant, with preserved specificity. An algorithm, NucZip, was constructed to model the nucleation and zipping phases of hybridization, integrating both local and distant binding contributions. It predicted hybridization more exactly than previous algorithms, and has the potential to guide the design of variation-tolerant yet specific probes.
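
    The distinction drawn in this abstract between contiguous matching stretches and the same number of matches broken up by mismatches can be illustrated with a short sketch. This is not the NucZip algorithm, whose details are not given here; it is a hypothetical Python helper that only lists the lengths of uninterrupted matching runs. It assumes the probe and target are equal-length strings written in the same sense, so a "match" is simply an identical character.

```python
from typing import List


def matching_stretches(probe: str, target: str) -> List[int]:
    """Return the lengths of consecutive matching runs, in order of occurrence.

    Hypothetical helper for illustration only: both sequences must be
    pre-aligned, equal in length, and written in the same sense.
    """
    if len(probe) != len(target):
        raise ValueError("sequences must have the same length")
    runs: List[int] = []
    current = 0
    for p, t in zip(probe, target):
        if p == t:
            current += 1
        else:
            if current:
                runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


if __name__ == "__main__":
    probe = "ACGTACGTACGT"
    clustered = "ACGTACGTACAA"   # two mismatches grouped at the end
    scattered = "ACGAACGTCCGT"   # two mismatches spread through the sequence
    print(matching_stretches(probe, clustered))
    print(matching_stretches(probe, scattered))
```

    Both toy targets match the probe at 10 of 12 positions, but only the first keeps a single long uninterrupted stretch, which is the property the abstract reports as more important for hybridization.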

  4. Software scripts for quality checking of high-throughput nucleic acid sequencers.

    PubMed

    Lazo, G R; Tong, J; Miller, R; Hsia, C; Rausch, C; Kang, Y; Anderson, O D

    2001-06-01

    We have developed a graphical interface to allow the researcher to view and assess the quality of sequencing results using a series of program scripts developed to process data generated by automated sequencers. The scripts are written in the Perl programming language and are executable under the cgi-bin directory of a Web server environment. The scripts direct nucleic acid sequencing trace file data output from automated sequencers to be analyzed by the phred molecular biology program and are displayed as graphical hypertext mark-up language (HTML) pages. The scripts are mainly designed to handle 96-well microtiter dish samples, but the scripts are also able to read data from 384-well microtiter dishes 96 samples at a time. The scripts may be customized for different laboratory environments and computer configurations. Web links to the sources and discussion page are provided.
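
    Quality checking of this kind usually rests on the phred quality score, which relates a score Q to a base-calling error probability P by Q = -10 log10(P). The sketch below is a minimal Python illustration of that relationship only; it is not the authors' Perl scripts, and the function names are hypothetical.

```python
import math
from typing import Iterable


def phred_to_error_probability(q: float) -> float:
    """Convert a phred quality score to the estimated base-call error probability."""
    return 10 ** (-q / 10.0)


def error_probability_to_phred(p: float) -> float:
    """Convert an error probability (0 < p <= 1) to a phred quality score."""
    if not 0.0 < p <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -10.0 * math.log10(p)


def mean_error_probability(scores: Iterable[float]) -> float:
    """Average the error probabilities (not the scores) of a read."""
    probabilities = [phred_to_error_probability(q) for q in scores]
    if not probabilities:
        raise ValueError("no quality scores given")
    return sum(probabilities) / len(probabilities)


if __name__ == "__main__":
    print(phred_to_error_probability(20))
    print(mean_error_probability([10, 20, 30]))
```

    Averaging probabilities rather than raw scores is a deliberate choice in this sketch: a single low-quality base then weighs on the read summary in proportion to its actual error risk.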

  5. Phylogenetic and functional analysis of sequence variation of human papillomavirus type 31 E6 and E7 oncoproteins.

    PubMed

    Ferenczi, Annamária; Gyöngyösi, Eszter; Szalmás, Anita; László, Brigitta; Kónya, József; Veress, György

    2016-09-01

    High-risk human papillomaviruses (HPV) are the causative agents of cervical and other anogenital cancers as well as a subset of head and neck cancers. The E6 and E7 oncoproteins of HPV contribute to oncogenesis by associating with the tumour suppressor protein p53 and pRb, respectively. For HPV types 16 and 18, intratypic sequence variation was shown to have biological and clinical significance. The functional significance of sequence variation among HPV 31 variants was studied less intensively. HPV 31 variants belonging to different variant lineages were found to have differences in persistence and in the ability to cause high grade cervical intraepithelial neoplasia. In the present study, we started to explore the functional effects of natural sequence variation of HPV 31 E6 and E7 oncoproteins. The E6 variants were tested for their effects on p53 protein stability and transcriptional activity, while the E7 variants were tested for their effects on pRb protein level and also on the transcriptional activity of E2F transcription factors. HPV 31 E7 variants displayed uniform effects on pRb stability and also on the activity of E2F transcription factors. HPV 31 E6 variants had remarkable differences in the ability to inhibit the trans-activation function of p53 but not in the ability to induce the in vivo degradation of p53. Our results indicate that natural sequence variation of the HPV 31 E6 protein may be involved in the observed differences in the oncogenic potential between HPV 31 variants.

  6. ITS2-rDNA Sequence Variation of Phlebotomus sergenti s.l. (Dip: Psychodidae) Populations in Iran

    PubMed Central

    Moin-Vaziri, Vahideh; Oshaghi, Mohammad Ali; Yaghoobi-Ershadi, Mohammad Reza; Derakhshandeh-Peykar, Pupak; Abaei, Mohammad Reza; Mohtarami, Fatemeh; Zahraei-Ramezani, Ali Reza; Nadim, Aboulhassan

    2016-01-01

    Background: Phlebotomus sergenti s.l. is considered the most likely vector of Leishmania tropica in Iran. Although two morphotypes, P. sergenti sergenti (A) and P. sergenti similis (B), have been formally described, further morphological analysis and a molecular analysis of the mitochondrial cytochrome oxidase I (mtDNA-COI) gene revealed inconsistencies and suggested that the variation between the morphotypes is intraspecific and that the morphotypes might be the same species. Methods: We examined the sequence of the ITS2-rDNA of Iranian specimens of P. sergenti s.l., comprising P. cf sergenti, P. cf similis, and intermediate morphotypes, together with available data in GenBank. Results: Sequence analysis showed 5.2% variation among P. sergenti s.l. morphotypes. Almost half of the variation was due to the number of AT microsatellite repeats in the center of the spacer. Nine haplotypes were found in the species constructing three main lineages corresponding to the origin of the colonies located in southwest (SW), northeast (NE), and northwest-center-southeast (NCS). Lineages NCS and NE included both typical P. cf sergenti and P. cf similis and intermediate morphotypes. Conclusion: Phylogenetic sequence analysis revealed that, except for one Iranian sample, which was close to the European samples, other Iranian haplotypes were associated with the northeastern Mediterranean populations including Turkey, Cyprus, Syria, and Pakistan. Similar to the sequences of mtDNA COI gene, ITS2 sequences could not resolve P. sergenti from P. similis and did not support the possible existence of sibling species or subspecies within P. sergenti s.l. PMID:28032098

  7. Seasonal variation in soil nitrogen availability across a fertilization chronosequence in moist acidic tundra

    NASA Astrophysics Data System (ADS)

    McLaren, J. R.; Gough, L.; Weintraub, M. N.

    2012-12-01

    Changes in global climate may result in altered timing of seasonal events including the timing of the spring-thaw and fall freeze-up. In addition to this changing seasonality, arctic environments are experiencing overall increases in nutrient availability caused by climate warming resulting in alterations of plant species composition, such as the observed increases in the abundance of deciduous shrubs. Changing species composition may have large effects on nutrient dynamics in the surrounding ecosystem because of documented differences in how particular plant species influence soil nutrient availability. Although we have some idea of how plant identity influences soil nutrients, soil biogeochemical processes are strongly seasonal, and we have a poor understanding of how plant identity, or nutrient levels, may influence these seasonal patterns. We examined the responses of moist acidic tundra to experimentally increased soil nutrient availability and the accompanying increase in shrub abundance at the Arctic Long Term Ecological Research (LTER) site at Toolik Lake, Alaska. We examined a chrono-sequence of long-term fertilization experiments, composed of experiments fertilized for 5, 15 and 22 years, which has resulted in increasing shrub density with time since fertilization. The fertilized plots receive both nitrogen (N, 10 g/m2/yr) and phosphorus (5 g/m2/yr) annually following snowmelt. In the 2011 growing season we measured variation in soil available N weekly, including measures of ammonium (NH4), nitrate (NO3) and total free amino acids (TFAA). We found that differences between fertilized and control plots depended strongly on both the seasonal timing of measurements, as well as the duration of the fertilization treatment. Early in the growing season fertilization resulted in large increases in available soil N (both NH4 and NO3) across the entire chronosequence. As the season progressed, however, older fertilized plots show evidence of N saturation, where

  8. Variation of the internal transcribed spacer 1 sequence within individual strains and among different strains of Neospora caninum.

    PubMed

    Gondim, Luis F P; Laski, Paul; Gao, Liying; McAllister, Milton M

    2004-02-01

    Small differences have been reported in the internal transcribed spacer 1 (ITS1) region among strains of Neospora caninum. We compared ITS1 sequences among 6 N. caninum strains analyzed in our laboratory, including 2 strains that have not been examined previously (NC-Illinois and NC-Bahia). Five sequences showed 100% similarity and also were identical to 7 of 11 sequences that were previously reported by others. In contrast, initial attempts to sequence the ITS1 of NC-Bahia generated 12 nucleotide differences compared with the other 5 strains, and several ambiguous bases. However, the single band containing the ITS1 region, as observed after electrophoresis on a 2% agarose gel, became divided into 2 distinct bands when reanalyzed using 5 or 10% polyacrylamide gel electrophoresis (PAGE), and the ITS1 within these separate bands were sequenced without ambiguity. The other 5 N. caninum strains were also reexamined using PAGE, and in each strain 2 distinct bands were discovered. In comparison, 2 strains of Toxoplasma gondii continued to show only 1 band when examined using PAGE. The ITS1 sequence of NC-Bahia, from Brazil, differs in several base pairs from those of North American and European strains of N. caninum. Intrastrain variation of the ITS1 region appears to be common in N. caninum, in contrast to T. gondii.

  9. Amino acid sequence of band-3 protein from rainbow trout erythrocytes derived from cDNA.

    PubMed Central

    Hübner, S; Michel, F; Rudloff, V; Appelhans, H

    1992-01-01

    In this report we present the first complete band-3 cDNA sequence of a poikilothermic lower vertebrate. The primary structure of the anion-exchange protein band 3 (AE1) from rainbow trout erythrocytes was determined by nucleotide sequencing of cDNA clones. The overlapping clones have a total length of 3827 bp with a 5'-terminal untranslated region of 150 bp, a 2754 bp open reading frame and a 3'-untranslated region of 924 bp. Band-3 protein from trout erythrocytes consists of 918 amino acid residues with a calculated molecular mass of 101 827 Da. Comparison of its amino acid sequence revealed a 60-65% identity within the transmembrane spanning sequence of band-3 proteins published so far. An additional insertion of 24 amino acid residues within the membrane-associated domain of trout band-3 protein was identified, which until now was thought to be a general feature only of mammalian band-3-related proteins. PMID:1637296

  10. Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)

    ScienceCinema

    Patel, Kamlesh D [Ken; SNL,

    2016-07-12

    Kamlesh (Ken) Patel from Sandia National Laboratories (Livermore, California) presents "Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology " at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June, 2012 in Santa Fe, NM.

  11. AFLP and DNA sequence variation in an Andean domesticate, pepino (Solanum muricatum, Solanaceae): implications for evolution and domestication.

    PubMed

    Blanca, José M; Prohens, Jaime; Anderson, Gregory J; Zuriaga, Elena; Cañizares, Joaquín; Nuez, Fernando

    2007-07-01

    The pepino (Solanum muricatum) is a vegetatively propagated, domesticated native of the Andes, where it grows with wild relatives. We used AFLPs and a 1-kb sequence of the 3-methylcrotonyl-CoA carboxylase gene to study variation of 27 accessions of S. muricatum and 35 collections of 10 species of wild relatives (Solanum section Basarthrum). A total of 298 AFLP fragments and 29 DNA sequence haplotypes were detected. Cluster and principal coordinate analyses and other genetic parameters estimated from both types of markers, show that S. muricatum is closely related to the species from one of the series (Caripensia) of section Basarthrum and that >90% of the variation of the cultigen is also represented in that series. Pepino is highly diverse, either because it is not monophyletic or it has been subjected to regular introgression with wild species, or both. Although a continuous distribution of the genetic variation occurred within the cultivated species, three genetic clusters were recognized. Cluster 1 is mostly centered in Ecuador, cluster 2 in Ecuador and Peru, and cluster 3 in Colombia and Ecuador. Cluster 3 also includes all modern cultivars studied. These results and other evidence suggest that northern Ecuador/southern Colombia is the main center of pepino diversity and the center of origin. The high genetic variation of this cultigen indicates that domestication does not always produce a genetic bottleneck.

  12. Phylogenetic lineage of Tobacco leaf curl virus in Korea and estimation of recombination events implicated in their sequence variation.

    PubMed

    Park, Jungan; Lee, Hyejung; Kim, Mi-Kyung; Kwak, Hae-Ryun; Auh, Chung-Kyoon; Lee, Kyeong-Yeoll; Kim, Sunghan; Choi, Hong-Soo; Lee, Sukchan

    2011-08-01

    New strains of Tobacco leaf curl virus (TbLCV) were isolated from tomato plants in four different local communities of Korea, and hence were designated TbLCV-Kr. Phylogenetic analysis of the sequences of the whole genome and of individual ORFs of these viruses indicated that they are closely related to the Tobacco leaf curl Japan virus (TbLCJV) cluster, which includes Honeysuckle yellow vein virus (HYVV), Honeysuckle yellow vein mosaic virus (HYVMV), and TbLCJV isolates. Four putative recombination events were recognized within these virus sequences, suggesting that the sequence variations observed in these viruses may be attributable to intraspecific and interspecific recombination events involving some TbLCV-Kr isolates, Papaya leaf curl virus (PaLCV), and a local isolate of Tomato yellow leaf curl virus (TYLCV).

  13. The Effects of Sequence Variation on Genome-wide NRF2 Binding—New Target Genes and Regulatory SNPs

    PubMed Central

    Kuosmanen, Suvi M.; Viitala, Sari; Laitinen, Tuomo; Peräkylä, Mikael; Pölönen, Petri; Kansanen, Emilia; Leinonen, Hanna; Raju, Suresh; Wienecke-Baldacchino, Anke; Närvänen, Ale; Poso, Antti; Heinäniemi, Merja; Heikkinen, Sami; Levonen, Anna-Liisa

    2016-01-01

    Transcription factor binding specificity is crucial for proper target gene regulation. Motif discovery algorithms identify the main features of the binding patterns, but the accuracy on the lower affinity sites is often poor. Nuclear factor E2-related factor 2 (NRF2) is a ubiquitous redox-activated transcription factor having a key protective role against endogenous and exogenous oxidant and electrophile stress. Herein, we decipher the effects of sequence variation on the DNA binding sequence of NRF2, in order to identify both genome-wide binding sites for NRF2 and disease-associated regulatory SNPs (rSNPs) with drastic effects on NRF2 binding. Interactions between NRF2 and DNA were studied using molecular modelling, and NRF2 chromatin immunoprecipitation-sequence datasets together with protein binding microarray measurements were utilized to study binding sequence variation in detail. The binding model thus generated was used to identify genome-wide binding sites for NRF2, and genomic binding sites with rSNPs that have strong effects on NRF2 binding and reside on active regulatory elements in human cells. As a proof of concept, miR-126–3p and -5p were identified as NRF2 target microRNAs, and a rSNP (rs113067944) residing on NRF2 target gene (Ferritin, light polypeptide, FTL) promoter was experimentally verified to decrease NRF2 binding and result in decreased transcriptional activity. PMID:26826707

  14. Conservation of the C-type lectin fold for massive sequence variation in a Treponema diversity-generating retroelement

    SciTech Connect

    Le Coq, Johanne; Ghosh, Partho

    2012-06-19

    Anticipatory ligand binding through massive protein sequence variation is rare in biological systems, having been observed only in the vertebrate adaptive immune response and in a phage diversity-generating retroelement (DGR). Earlier work has demonstrated that the prototypical DGR variable protein, major tropism determinant (Mtd), meets the demands of anticipatory ligand binding by novel means through the C-type lectin (CLec) fold. However, because of the low sequence identity among DGR variable proteins, it has remained unclear whether the CLec fold is a general solution for DGRs. We have addressed this problem by determining the structure of a second DGR variable protein, TvpA, from the pathogenic oral spirochete Treponema denticola. Despite its weak sequence identity to Mtd (~16%), TvpA was found to also have a CLec fold, with predicted variable residues exposed in a ligand-binding site. However, this site in TvpA was markedly more variable than the one in Mtd, reflecting the unprecedented approximate 10^20 potential variability of TvpA. In addition, similarity between TvpA and Mtd with formylglycine-generating enzymes was detected. These results provide strong evidence for the conservation of the formylglycine-generating enzyme-type CLec fold among DGRs as a means of accommodating massive sequence variation.

  15. Role of the two-component leader sequence and mature amino acid sequences in extracellular export of endoglucanase EGL from Pseudomonas solanacearum.

    PubMed Central

    Huang, J Z; Schell, M A

    1992-01-01

    The egl gene of Pseudomonas solanacearum encodes a 43-kDa extracellular endoglucanase (mEGL) involved in wilt disease caused by this phytopathogen. Egl is initially translated with a 45-residue, two-part leader sequence. The first 19 residues are apparently removed by signal peptidase II during export of Egl across the inner membrane (IM); the remaining residues of the leader sequence (modified with palmitate) are removed during export across the outer membrane (OM). Localization of Egl-PhoA fusion proteins showed that the first 26 residues of the Egl leader sequence are required and sufficient to direct lipid modification, processing, and export of Egl or PhoA across the IM but not the OM. Fusions of the complete 45-residue leader sequence or of the leader and increasing portions of mEgl sequences to PhoA did not cause its export across the OM. In-frame deletion of portions of mEGL-coding sequences blocked export of the truncated polypeptides across the OM without affecting export across the IM. These results indicate that the first part of the leader sequence functions independently to direct export of Egl across the IM while the second part and sequences and structures in mEGL are involved in export across the OM. Computer analysis of the mEgl amino acid sequence obtained from its nucleotide sequence identified a region of mEGL similar in amino acid sequence to regions in other prokaryotic endoglucanases. Images PMID:1735723

  16. Studies on adenosine triphosphate transphosphorylases. Amino acid sequence of rabbit muscle ATP-AMP transphosphorylase.

    PubMed

    Kuby, S A; Palmieri, R H; Frischat, A; Fischer, A H; Wu, L H; Maland, L; Manship, M

    1984-05-22

    The total amino acid sequence of rabbit muscle adenylate kinase has been determined, and the single polypeptide chain of 194 amino acid residues starts with N-acetylmethionine and ends with leucyllysine at its carboxyl terminus, in agreement with the earlier data on its amino acid composition [Mahowald, T. A., Noltmann, E. A., & Kuby, S. A. (1962) J. Biol. Chem. 237, 1138-1145] and its carboxyl-terminus sequence [Olson, O. E., & Kuby, S. A. (1964) J. Biol. Chem. 239, 460-467]. Elucidation of the primary structure was based on tryptic and chymotryptic cleavages of the performic acid oxidized protein, cyanogen bromide cleavages of the 14C-labeled S-carboxymethylated protein at its five methionine sites (followed by maleylation of peptide fragments), and tryptic cleavages at its 12 arginine sites of the maleylated 14C-labeled S-carboxymethylated protein. Calf muscle myokinase, whose sequence has also been established, differs primarily from the rabbit muscle myokinase's sequence in the following: His-30 is replaced by Gln-30; Lys-56 is replaced by Met-56; Ala-84 and Asp-85 are replaced by Val-84 and Asn-85. A comparison of the four muscle-type adenylate kinases, whose covalent structures have now been determined, viz., rabbit, calf, porcine, and human [for the latter two sequences see Heil, A., Müller, G., Noda, L., Pinder, T., Schirmer, H., Schirmer, I., & Von Zabern, I. (1974) Eur. J. Biochem. 43, 131-144, and Von Zabern, I., Wittmann-Liebold, B., Untucht-Grau, R., Schirmer, R. H., & Pai, E. F. (1976) Eur. J. Biochem. 68, 281-290], demonstrates an extraordinary degree of homology. (ABSTRACT TRUNCATED AT 250 WORDS)

  17. Sequence variation determining stereochemistry of a Δ11 desaturase active in moth sex pheromone biosynthesis.

    PubMed

    Ding, Bao-Jian; Carraher, Colm; Löfstedt, Christer

    2016-07-01

    A Δ11 desaturase from the oblique banded leaf roller moth Choristoneura rosaceana takes the saturated myristic acid and produces a mixture of (E)-11-tetradecenoate and (Z)-11-tetradecenoate with an excess of the Z isomer (35:65). A desaturase from the spotted fireworm moth Choristoneura parallela also operates on myristic acid substrate but produces almost pure (E)-11-tetradecenoate. The two desaturases share 92% amino acid identity and 97% amino acid similarity. There are 24 amino acids differing between these two desaturases. We constructed mutations at all of these positions to pinpoint the sites that determine the product stereochemistry. We demonstrated with a yeast functional assay that one amino acid at the cytosolic carboxyl terminus of the protein (258E) is critical for the Z activity of the C. rosaceana desaturase. Mutating the glutamic acid (E) into aspartic acid (D) transforms the C. rosaceana enzyme into a desaturase with C. parallela-like activity, whereas the reciprocal mutation of the C. parallela desaturase transformed it into an enzyme producing an intermediate 64:36 E/Z product ratio. We discuss the causal link between this amino acid change and the stereochemical properties of the desaturase and the role of desaturase mutations in pheromone evolution.
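
    Percent identity figures such as the 92% quoted in this abstract come from counting identical positions in an alignment of the two proteins. The sketch below shows that calculation in Python on toy data. The function names and the two short sequences are hypothetical and are not the desaturase sequences; the code assumes the inputs are already aligned, gap-free, and equal in length.

```python
from typing import List, Tuple


def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of aligned positions carrying the same residue."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and equal in length")
    identical = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * identical / len(seq_a)


def differing_positions(seq_a: str, seq_b: str) -> List[Tuple[int, str, str]]:
    """List (1-based position, residue in A, residue in B) for each difference."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be equal in length")
    return [
        (i, a, b)
        for i, (a, b) in enumerate(zip(seq_a, seq_b), start=1)
        if a != b
    ]


if __name__ == "__main__":
    toy_a = "MKTAYIAE"  # toy sequence, not real desaturase data
    toy_b = "MKTAYIAD"  # differs only at the last position (E versus D)
    print(percent_identity(toy_a, toy_b))
    print(differing_positions(toy_a, toy_b))
```

    A list of differing positions like the one returned here is the natural starting point for the site-by-site mutation strategy the abstract describes.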

  18. The complete amino acid sequence of a trypsin inhibitor from Bauhinia variegata var. candida seeds.

    PubMed

    Di Ciero, L; Oliva, M L; Torquato, R; Köhler, P; Weder, J K; Camillo Novello, J; Sampaio, C A; Oliveira, B; Marangoni, S

    1998-11-01

    Trypsin inhibitors of two varieties of Bauhinia variegata seeds have been isolated and characterized. Bauhinia variegata candida trypsin inhibitor (BvcTI) and B. variegata lilac trypsin inhibitor (BvlTI) are proteins with Mr of about 20,000 without free sulfhydryl groups. Amino acid analysis shows a high content of aspartic acid, glutamic acid, serine, and glycine, and a low content of histidine, tyrosine, methionine, and lysine in both inhibitors. Isoelectric focusing for both varieties detected three isoforms (pI 4.85, 5.00, and 5.15), which were resolved by HPLC procedure. The trypsin inhibitors show Ki values of 6.9 and 1.2 nM for BvcTI and BvlTI, respectively. The N-terminal sequences of the three trypsin inhibitor isoforms from both varieties of Bauhinia variegata and the complete amino acid sequence of B. variegata var. candida L. trypsin inhibitor isoform 3 (BvcTI-3) are presented. The sequences have been determined by automated Edman degradation of the reduced and carboxymethylated proteins of the peptides resulting from Staphylococcus aureus protease and trypsin digestion. BvcTI-3 is composed of 167 residues and has a calculated molecular mass of 18,529. Homology studies with other trypsin inhibitors show that BvcTI-3 belongs to the Kunitz family. The putative active site encompasses Arg (63)-Ile (64).

  19. Multiple site-selective insertions of non-canonical amino acids into sequence-repetitive polypeptides

    PubMed Central

    Wu, I-Lin; Patterson, Melissa A.; Carpenter Desai, Holly E.; Mehl, Ryan A.; Giorgi, Gianluca

    2013-01-01

    A simple and efficient method is described for introduction of non-canonical amino acids at multiple, structurally defined sites within recombinant polypeptide sequences. E. coli MRA30, a bacterial host strain with attenuated activity for release factor 1 (RF1), is assessed for its ability to support the incorporation of a diverse range of non-canonical amino acids in response to multiple encoded amber (TAG) codons within genetic templates derived from superfolder GFP and an elastin-mimetic protein polymer. Suppression efficiency and isolated protein yield were observed to depend on the identity of the orthogonal aminoacyl-tRNA synthetase/tRNACUA pair and the non-canonical amino acid substrate. This approach afforded elastin-mimetic protein polymers containing non-canonical amino acid derivatives at up to twenty-two positions within the repeat sequence with high levels of substitution. The identity and position of the variant residues was confirmed by mass spectrometric analysis of the full-length polypeptides and proteolytic cleavage fragments resulting from thermolysin digestion. The accumulated data suggest that this multi-site suppression approach permits the preparation of protein-based materials in which novel chemical functionality can be introduced at precisely defined positions within the polypeptide sequence. PMID:23625817

  20. Deduced amino acid sequence of human pulmonary surfactant proteolipid: SPL(pVal)

    SciTech Connect

    Whitsett, J.A.; Glasser, S.W.; Korfhagen, T.R.; Weaver, T.E.; Clark, J.; Pilot-Matias, T.; Meuth, J.; Fox, J.L.

    1987-05-01

    A hydrophobic, proteolipid-like protein of Mr 6500 was isolated from ether/ethanol extracts of human, canine and bovine pulmonary surfactant. Amino acid composition of the protein demonstrated a remarkable abundance of hydrophobic residues, particularly valine and leucine. The N-terminal amino acid sequence of the human protein was determined: N-Leu-Ile-Pro-Cys-Cys-Pro-Val-Asn-Leu-Lys-Arg-Leu-Leu-Ile-Val4... An oligonucleotide probe was used to screen an adult human lung cDNA library and resulted in detection of cDNA clones with predicted amino acid sequence with close identity to the N-terminal amino acid sequence of the human peptide. SPL(pVal) was found within the reading frame of a larger peptide. SPL(pVal) results from proteolytic processing of a larger preprotein. Northern blot analysis detected a single 1.0-kilobase SPL(pVal) RNA which was less abundant in fetal than in adult lung. Mixtures of purified canine and bovine SPL(pVal) and synthetic phospholipids display properties of rapid adsorption and surface tension lowering activity characteristic of surfactant. Human SPL(pVal) is a pulmonary surfactant proteolipid which may therefore be useful in combination with phospholipids and/or other surfactant proteins for the treatment of surfactant deficiency such as hyaline membrane disease in newborn infants.

  1. Complete nucleic acid sequence of Penaeus stylirostris densovirus (PstDNV) from India.

    PubMed

    Rai, Praveen; Safeena, Muhammed P; Karunasagar, Iddya; Karunasagar, Indrani

    2011-06-01

    Infectious hypodermal and hematopoietic necrosis virus (IHHNV) of shrimp has recently been classified as Penaeus stylirostris densovirus (PstDNV). The complete nucleic acid sequence of PstDNV from India was obtained by cloning and sequencing of different DNA fragments of the virus. The genome organisation of PstDNV revealed that there were three major coding domains: a left ORF (NS1) of 2001 bp, a mid ORF (NS2) of 1092 bp and a right ORF (VP) of 990 bp. The complete genome and amino acid sequences of three proteins viz., NS1, NS2 and VP were compared with the genomes of the virus reported from Hawaii, China and Mexico and with partial sequence available from isolates from different regions. The phylogenetic analysis of shrimp, insect and vertebrate parvovirus sequences showed that the Indian PstDNV isolate is phylogenetically more closely related to one of the three isolates from Taiwan (AY355307), and two isolates (AY362547 and AY102034) from Thailand.
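
    The three ORF lengths quoted above are all multiples of three, so they can be converted to codon counts directly. The sketch below does that arithmetic in Python. Whether the reported lengths include a terminal stop codon is not stated in the abstract, so the residue counts are computed under that labeled assumption.

```python
# ORF lengths in base pairs, as reported in the abstract above.
orf_lengths_bp = {"NS1": 2001, "NS2": 1092, "VP": 990}


def codon_count(length_bp: int) -> int:
    """Number of complete codons in an ORF; rejects lengths that are not in frame."""
    if length_bp % 3 != 0:
        raise ValueError(f"{length_bp} bp is not a multiple of 3")
    return length_bp // 3


codons = {name: codon_count(bp) for name, bp in orf_lengths_bp.items()}

# Assumption (not stated in the abstract): each length includes one stop codon.
residues_if_stop_included = {name: n - 1 for name, n in codons.items()}

print(codons)
print(residues_if_stop_included)
```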

  2. Comparative venom gland transcriptomics of Naja kaouthia (monocled cobra) from Malaysia and Thailand: elucidating geographical venom variation and insights into sequence novelty

    PubMed Central

    Chanhome, Lawan; Tan, Nget Hong

    2017-01-01

    exclusively and abundantly expressed cytotoxin CTX-3 in NK-T. The findings suggested correlation with the geographical variation in proteome and toxicity of the venom, and support the call for optimising antivenom production and use in the region. Besides, the current study uncovered full and partial sequences of numerous toxin genes from N. kaouthia which have not been reported hitherto; these include N. kaouthia-specific l-amino acid oxidase (LAAO), snake venom serine protease (SVSP), cystatin, acetylcholinesterase (AChE), hyaluronidase (HYA), waprin, phospholipase B (PLB), aminopeptidase (AP), neprilysin, etc. Taken together, the findings further enrich the snake toxin database and provide deeper insights into the genetic diversity of cobra venom toxins. PMID:28392982

  3. Variation in Lake Michigan alewife (Alosa pseudoharengus) thiaminase and fatty acids composition

    USGS Publications Warehouse

    Honeyfield, D.C.; Tillitt, D.E.; Fitzsimons, J.D.; Brown, S.B.

    2010-01-01

    Thiaminase activity of alewife (Alosa pseudoharengus) is variable across Lake Michigan, yet factors that contribute to the variability in alewife thiaminase activity are unknown. The fatty acid content of Lake Michigan alewife has not been previously reported. Analysis of 53 Lake Michigan alewives found a positive correlation between thiaminase activity and the following fatty acids: C22:1n9, sum of omega-6 fatty acids (Σω6), and sum of the polyunsaturated fatty acids. Thiaminase activity was negatively correlated with C15:0, C16:0, C17:0, C18:0, C20:0, C22:0, C24:0, C18:1n9t, C20:3n3, C22:2, and the sum of all saturated fatty acids (SAFA). Multi-variant regression analysis resulted in three variables (C18:1n9t, Σω6, SAFA) that explained 71% (R² = 0.71, P < 0.0001) of the variation in thiaminase activity. Because the fatty acid content of an organism is related to its food source, diet may be an important factor modulating alewife thiaminase activity. These data suggest there is an association between fatty acids and thiaminase activity in Lake Michigan alewife.

  4. DNA Cloning of Plasmodium falciparum Circumsporozoite Gene: Amino Acid Sequence of Repetitive Epitope

    NASA Astrophysics Data System (ADS)

    Enea, Vincenzo; Ellis, Joan; Zavala, Fidel; Arnot, David E.; Asavanich, Achara; Masuda, Aoi; Quakyi, Isabella; Nussenzweig, Ruth S.

    1984-08-01

    A clone of complementary DNA encoding the circumsporozoite (CS) protein of the human malaria parasite Plasmodium falciparum has been isolated by screening an Escherichia coli complementary DNA library with a monoclonal antibody to the CS protein. The DNA sequence of the complementary DNA insert encodes a four-amino acid sequence: proline-asparagine-alanine-asparagine, tandemly repeated 23 times. The CS β-lactamase fusion protein specifically binds monoclonal antibodies to the CS protein and inhibits the binding of these antibodies to native Plasmodium falciparum CS protein. These findings provide a basis for the development of a vaccine against Plasmodium falciparum malaria.
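
    The repeat described above can be written out in one-letter amino acid code to see its size. The following Python sketch builds the tandem repeat from the four residues and the copy number given in the abstract. The variable names are illustrative, and the nucleotide length assumes three bases per residue with nothing between repeat units.

```python
# Standard one-letter codes for the three residue types named in the abstract.
ONE_LETTER = {"proline": "P", "asparagine": "N", "alanine": "A"}

repeat_unit_names = ["proline", "asparagine", "alanine", "asparagine"]
repeat_unit = "".join(ONE_LETTER[name] for name in repeat_unit_names)

copies = 23  # tandem repeat count reported in the abstract
repeat_region = repeat_unit * copies

residue_count = len(repeat_region)
coding_length_nt = 3 * residue_count  # three nucleotides per codon

print(repeat_unit, residue_count, coding_length_nt)
```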

  5. Amino-Acid Sequence of NADP-Specific Glutamate Dehydrogenase of Neurospora crassa

    PubMed Central

    Wootton, John C.; Chambers, Geoffrey K.; Holder, Anthony A.; Baron, Andrew J.; Taylor, John G.; Fincham, John R. S.; Blumenthal, Kenneth M.; Moon, Kenneth; Smith, Emil L.

    1974-01-01

    A tentative primary structure of the NADP-specific glutamate dehydrogenase [L-glutamate: NADP oxidoreductase (deaminating), EC 1.4.1.4] from Neurospora crassa has been determined. The proposed sequence contains 452 amino-acid residues in each of the identical subunits of the hexameric enzyme. Comparison of the sequence with that of the bovine liver enzyme reveals considerable homology in the amino-terminal portion of the chain, including the vicinity of the reactive lysine, with only shorter stretches of homology within the carboxyl-terminal regions. The significance of this distribution of homologous regions is discussed. PMID:4155068

  6. Seq2Logo: a method for construction and visualization of amino acid binding motifs and sequence profiles including sequence weighting, pseudo counts and two-sided representation of amino acid enrichment and depletion

    PubMed Central

    Thomsen, Martin Christen Frølund; Nielsen, Morten

    2012-01-01

    Seq2Logo is a web-based sequence logo generator. Sequence logos are a graphical representation of the information content stored in a multiple sequence alignment (MSA) and provide a compact and highly intuitive representation of the position-specific amino acid composition of binding motifs, active sites, etc. in biological sequences. Accurate generation of sequence logos is often compromised by sequence redundancy and low number of observations. Moreover, most methods available for sequence logo generation focus on displaying the position-specific enrichment of amino acids, discarding the equally valuable information related to amino acid depletion. Seq2logo aims at resolving these issues allowing the user to include sequence weighting to correct for data redundancy, pseudo counts to correct for low number of observations and different logotype representations each capturing different aspects related to amino acid enrichment and depletion. Besides allowing input in the format of peptides and MSA, Seq2Logo accepts input as Blast sequence profiles, providing easy access for non-expert end-users to characterize and identify functionally conserved/variable amino acids in any given protein of interest. The output from the server is a sequence logo and a PSSM. Seq2Logo is available at http://www.cbs.dtu.dk/biotools/Seq2Logo (14 May 2012, date last accessed). PMID:22638583
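
    To illustrate what "information content" and "pseudo counts" mean for a sequence logo, here is a minimal Python sketch that computes the per-position Shannon information of a set of equal-length peptides, with an optional uniform pseudocount. This is a generic textbook calculation under simplifying assumptions (uniform background, uniform pseudocounts, no sequence weighting) and is not Seq2Logo's own algorithm. All function names are hypothetical.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_information(column: str, pseudocount: float = 0.0) -> float:
    """Shannon information (bits) of one alignment column: log2(20) - entropy."""
    counts = Counter(column)
    # Only the 20 standard residues are counted; gaps and other symbols are ignored.
    observed = sum(counts.get(aa, 0) for aa in AMINO_ACIDS)
    total = observed + pseudocount * len(AMINO_ACIDS)
    if total == 0:
        return 0.0
    entropy = 0.0
    for aa in AMINO_ACIDS:
        p = (counts.get(aa, 0) + pseudocount) / total
        if p > 0:
            entropy -= p * math.log2(p)
    return math.log2(len(AMINO_ACIDS)) - entropy


def logo_information(peptides: list[str], pseudocount: float = 0.0) -> list[float]:
    """Per-position information content for a list of equal-length peptides."""
    length = len(peptides[0])
    if any(len(p) != length for p in peptides):
        raise ValueError("all peptides must have the same length")
    return [
        column_information("".join(p[i] for p in peptides), pseudocount)
        for i in range(length)
    ]


# Hypothetical toy peptides: position 0 is fully conserved, position 2 is variable.
toy_peptides = ["ACD", "ACE", "ACF", "ACG"]
raw = logo_information(toy_peptides)
smoothed = logo_information(toy_peptides, pseudocount=1.0)
```

    With so few observations, the pseudocount pulls the conserved position well below the maximum of log2(20) bits, which is the kind of correction for low numbers of observations that the abstract describes.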

  7. Seq2Logo: a method for construction and visualization of amino acid binding motifs and sequence profiles including sequence weighting, pseudo counts and two-sided representation of amino acid enrichment and depletion.

    PubMed

    Thomsen, Martin Christen Frølund; Nielsen, Morten

    2012-07-01

    Seq2Logo is a web-based sequence logo generator. Sequence logos are a graphical representation of the information content stored in a multiple sequence alignment (MSA) and provide a compact and highly intuitive representation of the position-specific amino acid composition of binding motifs, active sites, etc. in biological sequences. Accurate generation of sequence logos is often compromised by sequence redundancy and low number of observations. Moreover, most methods available for sequence logo generation focus on displaying the position-specific enrichment of amino acids, discarding the equally valuable information related to amino acid depletion. Seq2logo aims at resolving these issues allowing the user to include sequence weighting to correct for data redundancy, pseudo counts to correct for low number of observations and different logotype representations each capturing different aspects related to amino acid enrichment and depletion. Besides allowing input in the format of peptides and MSA, Seq2Logo accepts input as Blast sequence profiles, providing easy access for non-expert end-users to characterize and identify functionally conserved/variable amino acids in any given protein of interest. The output from the server is a sequence logo and a PSSM. Seq2Logo is available at http://www.cbs.dtu.dk/biotools/Seq2Logo (14 May 2012, date last accessed).

  8. Complete plastid genome sequence of Primula sinensis (Primulaceae): structure comparison, sequence variation and evidence for accD transfer to nucleus

    PubMed Central

    Liu, Tong-Jian; Zhang, Cai-Yun; Yan, Hai-Fei; Zhang, Lu

    2016-01-01

    The species-rich genus Primula L. is a typical plant group with which to understand genetic variance between species at different levels of relationship. Chloroplast genome sequences serve as an information resource for quantifying this difference and reconstructing evolutionary history. In this study, we report the complete chloroplast genome sequence of Primula sinensis and compare it with those of related species. The chloroplast genome showed a typical circular quadripartite structure, 150,859 bp in length with a GC content of 37.2%. Two inverted repeat regions (25,535 bp each) were separated by a large single-copy region (82,064 bp) and a small single-copy region (17,725 bp). The genome consists of 112 genes, including 78 protein-coding genes, 30 tRNA genes and four rRNA genes. Among them, seven coding genes, seven tRNA genes and four rRNA genes have two copies due to their locations in the IR regions. The accD and infA genes lacking intact open reading frames (ORF) were identified as pseudogenes. SSR and sequence variation analyses were also performed on the plastome of Primula sinensis, comparing it with another available plastome, that of P. poissonii. The four most variable regions, rpl36–rps8, rps16–trnQ, trnH–psbA and ndhC–trnV, were identified. Phylogenetic relationship estimates using three sub-datasets extracted from a matrix of 57 protein-coding gene sequences gave identical results that were consistent with previous studies. A transcript found in the P. sinensis transcriptome showed high similarity to the plastid accD functional region, and a putative plastid transit peptide was identified at its N-terminal region. The result strongly suggested that plastid accD has been functionally transferred to the nucleus in P. sinensis. PMID:27375965
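
    The region sizes and gene counts quoted above can be checked against the reported totals. The short Python sketch below does so, taking the 25,535 bp figure as the length of each of the two inverted repeats. The total number of gene copies is a derived figure that the abstract does not state, so it should be read as an inference.

```python
# Region lengths in base pairs, as given in the abstract.
inverted_repeat_bp = 25_535  # taken as the length of each of the two IR copies
large_single_copy_bp = 82_064
small_single_copy_bp = 17_725

genome_bp = 2 * inverted_repeat_bp + large_single_copy_bp + small_single_copy_bp

# Unique genes: protein-coding + tRNA + rRNA.
unique_genes = 78 + 30 + 4

# Genes present in two copies because they lie in the IR regions.
duplicated_genes = 7 + 7 + 4

# Derived here, not stated in the abstract: total gene copies in the plastome.
total_gene_copies = unique_genes + duplicated_genes

print(genome_bp, unique_genes, total_gene_copies)
```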

  9. Reprint of "Identification of staphylococcal species based on variations in protein sequences (mass spectrometry) and DNA sequence (sodA microarray)".

    PubMed

    Kooken, Jennifer; Fox, Karen; Fox, Alvin; Altomare, Diego; Creek, Kim; Wunschel, David; Pajares-Merino, Sara; Martínez-Ballesteros, Ilargi; Garaizar, Javier; Oyarzabal, Omar; Samadpour, Mansour

    2014-01-01

    This report is among the first using sequence variation in newly discovered protein markers for staphylococcal (or indeed any other bacterial) speciation. Variation, at the DNA sequence level, in the sodA gene (commonly used for staphylococcal speciation) provided excellent correlation. Relatedness among strains was also assessed by protein profiling using microcapillary electrophoresis and pulsed field electrophoresis. A total of 64 strains were analyzed including reference strains representing the 11 staphylococcal species most commonly isolated from man (Staphylococcus aureus and 10 coagulase negative species [CoNS]). Matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI TOF MS) and liquid chromatography-electrospray ionization tandem mass spectrometry (LC ESI MS/MS) were used for peptide analysis of proteins isolated from gel bands. Comparison of experimental spectra of unknowns versus spectra of peptides derived from reference strains allowed bacterial identification after MALDI TOF MS analysis. After LC-MS/MS analysis of gel bands, bacterial speciation was performed by comparing experimental spectra versus virtual spectra using the software X!Tandem. Finally, LC-MS/MS was performed on whole proteomes, with data analysis also employing X!Tandem. Aconitate hydratase and oxoglutarate dehydrogenase served as marker proteins on focused analysis after gel separation. Alternatively, on full proteomics analysis, elongation factor Tu generally provided the highest confidence in staphylococcal speciation.

  10. Characterizing novel endogenous retroviruses from genetic variation inferred from short sequence reads

    PubMed Central

    Mourier, Tobias; Mollerup, Sarah; Vinner, Lasse; Hansen, Thomas Arn; Kjartansdóttir, Kristín Rós; Guldberg Frøslev, Tobias; Snogdal Boutrup, Torsten; Nielsen, Lars Peter; Willerslev, Eske; Hansen, Anders J.

    2015-01-01

    From Illumina sequencing of DNA from brain and liver tissue from the lion, Panthera leo, and tumor samples from the pike-perch, Sander lucioperca, we obtained two assembled sequence contigs with similarity to known retroviruses. Phylogenetic analyses suggest that the pike-perch retrovirus belongs to the epsilonretroviruses, and the lion retrovirus to the gammaretroviruses. To determine if these novel retroviral sequences originate from an endogenous retrovirus or from a recently integrated exogenous retrovirus, we assessed the genetic diversity of the parental sequences from which the short Illumina reads are derived. First, we showed by simulations that we can robustly infer the level of genetic diversity from short sequence reads. Second, we find that the measures of nucleotide diversity inferred from our retroviral sequences significantly exceed the level observed from Human Immunodeficiency Virus infections, prompting us to conclude that the novel retroviruses are both of endogenous origin. Through further simulations, we rule out the possibility that the observed elevated levels of nucleotide diversity are the result of co-infection with two closely related exogenous retroviruses. PMID:26493184
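
    Nucleotide diversity, the quantity compared above, is commonly defined as the average number of pairwise differences per site among a set of sequences. The Python sketch below computes it for already aligned, equal-length sequences. It is a simplified illustration only: the study inferred diversity from short reads, which this sketch does not attempt, and the function name and toy data are hypothetical.

```python
from itertools import combinations


def nucleotide_diversity(aligned_seqs: list[str]) -> float:
    """Average pairwise differences per site for aligned, equal-length sequences."""
    if len(aligned_seqs) < 2:
        raise ValueError("need at least two sequences")
    length = len(aligned_seqs[0])
    if length == 0 or any(len(s) != length for s in aligned_seqs):
        raise ValueError("sequences must be non-empty and of equal length")
    pairs = list(combinations(aligned_seqs, 2))
    # Count mismatching positions for every unordered pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


# Hypothetical toy alignment: one variable site in four positions.
toy_alignment = ["ACGT", "ACGA", "ACGT"]
pi = nucleotide_diversity(toy_alignment)
print(round(pi, 4))
```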

  11. Optimization of shRNA inhibitors by variation of the terminal loop sequence.

    PubMed

    Schopman, Nick C T; Liu, Ying Poi; Konstantinova, Pavlina; ter Brake, Olivier; Berkhout, Ben

    2010-05-01

    Gene silencing by RNA interference (RNAi) can be achieved by intracellular expression of a short hairpin RNA (shRNA) that is processed into the effective small interfering RNA (siRNA) inhibitor by the RNAi machinery. Previous studies indicate that shRNA molecules do not always reflect the activity of corresponding synthetic siRNAs that attack the same target sequence. One obvious difference between these two effector molecules is the hairpin loop of the shRNA. Most studies use the original shRNA design of the pSuper system, but no extensive study regarding optimization of the shRNA loop sequence has been performed. We tested the impact of different hairpin loop sequences, varying in size and structure, on the activity of a set of shRNAs targeting HIV-1. We were able to transform weak inhibitors into intermediate or even strong shRNA inhibitors by replacing the loop sequence. We demonstrate that the efficacy of these optimized shRNA inhibitors is improved significantly in different cell types due to increased siRNA production. These results indicate that the loop sequence is an essential part of the shRNA design. The optimized shRNA loop sequence is generally applicable for RNAi knockdown studies, and will allow us to develop a more potent gene therapy against HIV-1.

  12. Community Genomic Analysis of Strain Variation of a Novel Archaeon in an Acid Mine Drainage Environment

    NASA Astrophysics Data System (ADS)

    Yelton, P.; Banfield, J.; Wilmes, P.

    2006-12-01

    Microorganisms play a significant role in acid mine drainage (AMD) generation within the Richmond Mine, Iron Mountain, California. To better understand the contributions of individual microbial species to this process, the assemblies of community genomic data from AMD biofilms were manually curated. Not reported previously is detailed analysis of genomic sequence from G-plasma, an archaeal population from a sample collected from the 5-way location in 2002. The G-plasma population exhibits a small number of differing nucleotide sequences at most genomic locations and comprises multiple genome types. Linkage between these sequence types indicates frequent homologous recombination. As the near complete genome is still in many fragments, the current investigation focused on the 25% of the genome in large, confidently linked pieces. Many predicted proteins from this organism were detected via proteomic analysis. In combination, information about genome heterogeneity and protein expression is providing clues to the role of this population in the biofilm community.

  13. Method for high-volume sequencing of nucleic acids: random and directed priming with libraries of oligonucleotides

    DOEpatents

    Studier, F.W.

    1995-04-18

    Random and directed priming methods for determining nucleotide sequences by enzymatic sequencing techniques, using libraries of primers of lengths 8, 9 or 10 bases, are disclosed. These methods permit direct sequencing of nucleic acids as large as 45,000 base pairs or larger without the necessity for subcloning. Individual primers are used repeatedly to prime sequence reactions in many different nucleic acid molecules. Libraries containing as few as 10,000 octamers, 14,200 nonamers, or 44,000 decamers would have the capacity to determine the sequence of almost any cosmid DNA. Random priming with a fixed set of primers from a smaller library can also be used to initiate the sequencing of individual nucleic acid molecules, with the sequence being completed by directed priming with primers from the library. In contrast to random cloning techniques, a combined random and directed priming strategy is far more efficient. 2 figs.
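
    For scale, the library sizes quoted above can be compared with the number of all possible oligonucleotides of each length (4 to the power of the length). The Python sketch below performs that comparison. It is plain arithmetic on the figures in the abstract and says nothing about how the library members are actually chosen.

```python
# Library sizes from the abstract, keyed by primer length in bases.
library_sizes = {8: 10_000, 9: 14_200, 10: 44_000}


def possible_oligos(length: int) -> int:
    """Number of distinct DNA sequences of a given length (4 bases per position)."""
    return 4 ** length


# Fraction of all possible sequences of each length that the library would contain.
fraction_of_all = {
    length: size / possible_oligos(length)
    for length, size in library_sizes.items()
}

for length, fraction in fraction_of_all.items():
    print(f"{length}-mers: {possible_oligos(length):,} possible, "
          f"library covers {fraction:.1%}")
```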

  14. Method for high-volume sequencing of nucleic acids: random and directed priming with libraries of oligonucleotides

    DOEpatents

    Studier, F. William

    1995-04-18

    Random and directed priming methods for determining nucleotide sequences by enzymatic sequencing techniques, using libraries of primers of lengths 8, 9 or 10 bases, are disclosed. These methods permit direct sequencing of nucleic acids as large as 45,000 base pairs or larger without the necessity for subcloning. Individual primers are used repeatedly to prime sequence reactions in many different nucleic acid molecules. Libraries containing as few as 10,000 octamers, 14,200 nonamers, or 44,000 decamers would have the capacity to determine the sequence of almost any cosmid DNA. Random priming with a fixed set of primers from a smaller library can also be used to initiate the sequencing of individual nucleic acid molecules, with the sequence being completed by directed priming with primers from the library. In contrast to random cloning techniques, a combined random and directed priming strategy is far more efficient.

  15. Comparative analysis of Mycobacterium tuberculosis pe and ppe genes reveals high sequence variation and an apparent absence of selective constraints.

    PubMed

    McEvoy, Christopher R E; Cloete, Ruben; Müller, Borna; Schürch, Anita C; van Helden, Paul D; Gagneux, Sebastien; Warren, Robin M; Gey van Pittius, Nicolaas C

    2012-01-01

    Mycobacterium tuberculosis complex (MTBC) genomes contain 2 large gene families termed pe and ppe. The function of pe/ppe proteins remains enigmatic but studies suggest that they are secreted or cell surface associated and are involved in bacterial virulence. Previous studies have also shown that some pe/ppe genes are polymorphic, a finding that suggests involvement in antigenic variation. Using comparative sequence analysis of 18 publicly available MTBC whole genome sequences, we have performed alignments of 33 pe (excluding pe_pgrs) and 66 ppe genes in order to detect the frequency and nature of genetic variation. This work has been supplemented by whole gene sequencing of 14 pe/ppe (including 5 pe_pgrs) genes in a cohort of 40 diverse and well defined clinical isolates covering all the main lineages of the M. tuberculosis phylogenetic tree. We show that nsSNPs in pe (excluding pgrs) and ppe genes are 3.0 and 3.3 times higher than in non-pe/ppe genes respectively and that numerous other mutation types are also present at a high frequency. It has previously been shown that non-pe/ppe M. tuberculosis genes display a remarkably low level of purifying selection. Here, we also show that compared to these genes those of the pe/ppe families show a further reduction of selection pressure that suggests neutral evolution. This is inconsistent with the positive selection pressure of "classical" antigenic variation. Finally, by analyzing such a large number of genes we were able to detect large differences in mutation type and frequency between both individual genes and gene sub-families. The high variation rates and absence of selective constraints provide valuable insights into potential pe/ppe function. Since pe/ppe proteins are highly antigenic and have been studied as potential vaccine components these results should also prove informative for aspects of M. tuberculosis vaccine design.

  16. Whole Genome Re-Sequencing and Characterization of Powdery Mildew Disease-Associated Allelic Variation in Melon

    PubMed Central

    Natarajan, Sathishkumar; Kim, Hoy-Taek; Thamilarasan, Senthil Kumar; Veerappan, Karpagam; Park, Jong-In; Nou, Ill-Sup

    2016-01-01

    Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L.) and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, ‘SCNU1154’, ‘Edisto47’, ‘MR-1’, and ‘PMR5’. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs), 1.9 million InDels, and 182,398 putative structural variations (SVs). Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon. PMID:27311063

  17. Whole Genome Re-Sequencing and Characterization of Powdery Mildew Disease-Associated Allelic Variation in Melon.

    PubMed

    Natarajan, Sathishkumar; Kim, Hoy-Taek; Thamilarasan, Senthil Kumar; Veerappan, Karpagam; Park, Jong-In; Nou, Ill-Sup

    2016-01-01

    Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L.) and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, 'SCNU1154', 'Edisto47', 'MR-1', and 'PMR5'. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs), 1.9 million InDels, and 182,398 putative structural variations (SVs). Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon.

  18. Cis-regulatory sequence variation and association with Mycoplasma load in natural populations of the house finch (Carpodacus mexicanus)

    PubMed Central

    Backström, Niclas; Shipilina, Daria; Blom, Mozes P K; Edwards, Scott V

    2013-01-01

    Characterization of the genetic basis of fitness traits in natural populations is important for understanding how organisms adapt to the changing environment and to novel events, such as epizootics. However, candidate fitness-influencing loci, such as regulatory regions, are usually unavailable in nonmodel species. Here, we analyze sequence data from targeted resequencing of the cis-regulatory regions of three candidate genes for disease resistance (CD74, HSP90α, and LCP1) in populations of the house finch (Carpodacus mexicanus) historically exposed (Alabama) and naïve (Arizona) to Mycoplasma gallisepticum. Our study, the first to quantify variation in regulatory regions in wild birds, reveals that the upstream regions of CD74 and HSP90α are GC-rich, with the former exhibiting unusually low sequence variation for this species. We identified two SNPs, located in a GC-rich region immediately upstream of an inferred promoter site in the gene HSP90α, that were significantly associated with Mycoplasma pathogen load in the two populations. The SNPs are closely linked and situated in potential regulatory sequences: one in a binding site for the transcription factor nuclear NFYα and the other in a dinucleotide microsatellite ((GC)6). The genotype associated with pathogen load in the putative NFYα binding site was significantly overrepresented in the Alabama birds. However, we did not see strong effects of selection at this SNP, perhaps because selection has acted on standing genetic variation over an extremely short time in a highly recombining region. Our study is a useful starting point to explore functional relationships between sequence polymorphisms, gene expression, and phenotypic traits, such as pathogen resistance that affect fitness in the wild. PMID:23532859

  19. Sequence-specific thermodynamic properties of nucleic acids influence both transcriptional pausing and backtracking in yeast

    PubMed Central

    2017-01-01

    RNA Polymerase II pauses and backtracks during transcription, with many consequences for gene expression and cellular physiology. Here, we show that the energy required to melt double-stranded nucleic acids in the transcription bubble predicts pausing in Saccharomyces cerevisiae far more accurately than nucleosome roadblocks do. In addition, the same energy difference also determines when the RNA polymerase backtracks instead of continuing to move forward. This data-driven model corroborates—in a genome wide and quantitative manner—previous evidence that sequence-dependent thermodynamic features of nucleic acids influence both transcriptional pausing and backtracking. PMID:28301878

  20. Comparative analysis of methods used to define eustatic variations in outcrop: Late Cambrian interbasinal sequence development

    SciTech Connect

    Osleger, D.; Read, J.F.

    1993-03-01

    Interbasinal correlation of Late Cambrian cyclic carbonates from the Appalachian and Cordilleran passive margins, the Texas craton, and the southern Oklahoma aulacogen defines six major third-order depositional sequences. Graphic correlation of biostratigraphically-constrained strata was used to establish equivalency of stratigraphic sequences between the individual sections. Relatively isochronous biomere boundaries were used as time datums for lithostratigraphic correlation. Although the individual sections are composed of different types of meter-scale cycles and component lithofacies that reflect the various environmental settings of the localities, the overall upward-shallowing character of individual sequences is evident. The sequences are: late Cedaria, mid-Crepicephalus, late Crepicephalus, Aphelaspis to earliest Elvinia, Elvinia to early Saukia, and Saukia to the Cambrian-Ordovician boundary. Interbasinal correlation of stratigraphic sequences permits an evaluation of quantitative techniques for determining accommodation history. Correlation of Fischer plots of cyclic successions from separate basins supports a eustatic control of Late Cambrian sequence development. R2/R3 curves derived from subsidence analysis of the Late Cambrian sections provide good resolution of the second- and third-order scales of accommodation change, and interbasinal correlations of R2/R3 curves also support eustatic control on sequence development. Comparing the accommodation curves and subsidence analysis with paleobathymetric trends of Late Cambrian cyclic strata suggests that the curves may approximate the form of the eustatic sea-level signal. A composite eustatic sea-level curve for Late Cambrian time in North America was created by qualitatively combining the accommodation curves defined by the different techniques for each of the four localities. 129 refs., 16 figs., 3 tabs.

  1. Respiratory syncytial virus fusion glycoprotein: nucleotide sequence of mRNA, identification of cleavage activation site and amino acid sequence of N-terminus of F1 subunit.

    PubMed Central

    Elango, N; Satake, M; Coligan, J E; Norrby, E; Camargo, E; Venkatesan, S

    1985-01-01

    The amino acid sequence of respiratory syncytial virus fusion protein (Fo) was deduced from the sequence of a partial cDNA clone of mRNA and from the 5' mRNA sequence obtained by primer extension and dideoxysequencing. The encoded protein of 574 amino acids is extremely hydrophobic and has a molecular weight of 63371 daltons. The site of proteolytic cleavage within this protein was accurately mapped by determining a partial amino acid sequence of the N-terminus of the larger subunit (F1) purified by radioimmunoprecipitation using monoclonal antibodies. Alignment of the N-terminus of the F1 subunit within the deduced amino acid sequence of Fo permitted us to identify a sequence of lys-lys-arg-lys-arg-arg at the C-terminus of the smaller N-terminal F2 subunit that appears to represent the cleavage/activation domain. Five potential sites of glycosylation, four within the F2 subunit, were also identified. Three extremely hydrophobic domains are present in the protein; a) the N-terminal signal sequence, b) the N-terminus of the F1 subunit that is analogous to the N-terminus of the paramyxovirus F1 subunit and the HA2 subunit of influenza virus hemagglutinin, and c) the putative membrane anchorage domain near the C-terminus of F1. PMID:2987829

  2. Analysis of protein function and its prediction from amino acid sequence.

    PubMed

    Clark, Wyatt T; Radivojac, Predrag

    2011-07-01

    Understanding protein function is one of the keys to understanding life at the molecular level. It is also important in the context of human disease because many conditions arise as a consequence of alterations of protein function. The recent availability of relatively inexpensive sequencing technology has resulted in thousands of complete or partially sequenced genomes with millions of functionally uncharacterized proteins. Such a large volume of data, combined with the lack of high-throughput experimental assays to functionally annotate proteins, contributes to the growing importance of automated function prediction. Here, we study proteins annotated by Gene Ontology (GO) terms and estimate the accuracy of functional transfer from protein sequence only. We find that the transfer of GO terms by pairwise sequence alignments is only moderately accurate, showing a surprisingly small influence of sequence identity (SID) in a broad range (30-100%). We developed and evaluated a new predictor of protein function, functional annotator (FANN), from amino acid sequence. The predictor exploits a multioutput neural network framework which is well suited to simultaneously modeling dependencies between functional terms. Experiments provide evidence that FANN-GO (predictor of GO terms; available from http://www.informatics.indiana.edu/predrag) outperforms standard methods such as transfer by global or local SID as well as GOtcha, a method that incorporates the structure of GO.
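
    Sequence identity (SID), the quantity discussed above, is typically computed from a pairwise alignment. The Python sketch below computes percent identity for two sequences that are already aligned, skipping gapped columns. The choice of denominator varies between tools, and the abstract does not say which definition the authors used, so this is one common convention rather than their method. The function name and example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # gapped columns are excluded from the denominator
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Hypothetical pre-aligned pair: 4 matches over 5 ungapped columns.
sid = percent_identity("MKT-AY", "MRTLAY")
print(sid)
```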

  3. The Complete Genome Sequence of the Lactic Acid Bacterium Lactococcus lactis ssp. lactis IL1403

    PubMed Central

    Bolotin, Alexander; Wincker, Patrick; Mauger, Stéphane; Jaillon, Olivier; Malarme, Karine; Weissenbach, Jean; Ehrlich, S. Dusko; Sorokin, Alexei

    2001-01-01

    Lactococcus lactis is a nonpathogenic AT-rich gram-positive bacterium closely related to the genus Streptococcus and is the most commonly used cheese starter. It is also the best-characterized lactic acid bacterium. We sequenced the genome of the laboratory strain IL1403, using a novel two-step strategy that comprises diagnostic sequencing of the entire genome and a shotgun polishing step. The genome contains 2,365,589 base pairs and encodes 2310 proteins, including 293 protein-coding genes belonging to six prophages and 43 insertion sequence (IS) elements. Nonrandom distribution of IS elements indicates that the chromosome of the sequenced strain may be a product of recent recombination between two closely related genomes. A complete set of late competence genes is present, indicating the ability of L. lactis to undergo DNA transformation. Genomic sequence revealed new possibilities for fermentation pathways and for aerobic respiration. It also indicated a horizontal transfer of genetic information from Lactococcus to gram-negative enteric bacteria of Salmonella-Escherichia group. [The sequence data described in this paper has been submitted to the GenBank data library under accession no. AE005176.] PMID:11337471

  4. Source quality variations tied to sequence development: Integration of physical and chemical aspects, Lower to Middle Triassic, western Barents Sea

    SciTech Connect

    Bohacs, K.M.; Isaksen, G.H.

    1991-03-01

    Triassic mudrocks from the Barents Sea area demonstrate the covariance of physical and chemical properties of mudrocks deposited in shelfal environments and the aspect of depositional sequences in distal settings. The tie of physical parameters to chemical character within a detailed sequence-stratigraphic framework enables the construction of depositional-facies models to predict organic-matter content and quality. This allows the explorer to more closely constrain and predict the nature of potential source rocks using seismic and well-log data. Changes in lithology, bedding geometry, sedimentary structures, body and trace-fossil assemblages, and inorganic, bulk-organic, and molecular geochemistry revealed the detailed depositional environments. The depositional environments stack predictably, according to their position in the depositional sequence: from aerobic lower-shoreface–offshore transition environments in lowstand systems tracts to dysaerobic-anaerobic distal open-marine-shelf environments in transgressive and early highstand systems tracts. Quantitative molecular geochemistry also revealed variations within this distal setting and strong covariance with sequence position. Input of organic matter from terrigenous higher plants dominates the lowstands whereas marine-algal organic matter is most prevalent within transgressive and highstand systems tracts. Specifically, the abundance of C30 steranes, total steranes, and moretane reflected development of the sequences.

  5. Distinct intraspecific variations of garlic (Allium sativum L.) revealed by the exon-intron sequences of the alliinase gene.

    PubMed

    Endo, Aki; Imai, Yukiko; Nakamura, Mizuho; Yanagisawa, Eri; Taguchi, Takaaki; Torii, Kosuke; Okumura, Hidenobu; Ichinose, Koji

    2014-04-01

    Garlic (Allium sativum L.) has been used worldwide as a food and for medicinal purposes since early times. Garlic cultivars exhibit considerable morphological diversity despite the fact that they are mostly sterile and are grown only by vegetative propagation of cloves. Considerable recombination occurs in garlic genomes, including the genes involved in secondary metabolites. We examined the genomic DNAs (gDNAs) from garlic, encoding alliinase, a key enzyme involved in organosulfur metabolism in Allium plants. The 1.7-kb gDNA fragments, covering three exons (2, 3, and 4) and all four introns, were amplified from total DNAs prepared from garlic samples produced in Asia and Europe, leading to 73 sequences in total: Japan (JPN), China (CHN), India (IND), Spain (ESP), and France (FRA). The exon sequences were highly conserved among all the sequences, probably reflecting the fully functional alliinase associated with the flavor quality. Distinct intraspecific variations were detected for all four intron sequences, leading to the haplotype classifications. A close relationship between JPN and CHN was observed for all four introns, whereas IND showed a more divergent distribution. ESP and FRA afforded clearly different variants compared with those from Asian sequences. The present study provides information that could be useful in the development of an additional molecular marker for garlic authentication and quality control.

  6. Amino acid sequence of myoglobin from emu (Dromaius novaehollandiae) skeletal muscle.

    PubMed

    Suman, S P; Joseph, P; Li, S; Beach, C M; Fontaine, M; Steinke, L

    2010-11-01

    The objective of the present study was to characterize the primary structure of emu myoglobin (Mb). Emu Mb was isolated from Iliofibularis muscle employing gel-filtration chromatography. Matrix Assisted Laser Desorption Ionization-Time of Flight Mass Spectrometry was employed to determine the exact molecular mass of emu Mb in comparison with horse Mb, and Edman degradation was utilized to characterize the amino acid sequence. The molecular mass of emu Mb was 17,380 Da and was close to those reported for ratite and poultry myoglobins. Similar to myoglobins from meat-producing livestock and birds, emu Mb has 153 amino acids. Emu Mb contains 9 histidines. Proximal and distal histidines, responsible for coordinating the oxygen-binding property of Mb, are conserved in emu. Emu Mb shared more than 90% homology with ratite and chicken myoglobins, whereas it demonstrated less than 70% sequence similarity with ruminant myoglobins.
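
    The percentages quoted above compare pairs of aligned myoglobin sequences. The abstract does not say how they were computed, so the following is only a minimal sketch of one common convention: percent identity over the alignment columns where neither sequence has a gap. The function name and the toy sequences are hypothetical and are not emu or ruminant myoglobin data.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity over columns where neither aligned sequence has a gap.

    Both inputs must already be aligned (same length, '-' marks a gap).
    This is one of several conventions; others divide by the full
    alignment length or by the shorter sequence length.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    pairs = [(x, y) for x, y in zip(aligned_a, aligned_b)
             if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(1 for x, y in pairs if x == y)
    return 100.0 * matches / len(pairs)


# Hypothetical toy alignment, for illustration only.
toy_a = "MGLSDGEW"
toy_b = "MGL-DAEW"
print(round(percent_identity(toy_a, toy_b), 1))
```

    Reported "homology" values depend on the alignment and on the denominator used, so figures from different studies are comparable only when the convention is the same.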

  7. Stereochemical Sequence Ion Selectivity: Proline versus Pipecolic-acid-containing Protonated Peptides

    NASA Astrophysics Data System (ADS)

    Abutokaikah, Maha T.; Guan, Shanshan; Bythell, Benjamin J.

    2017-01-01

    Substitution of proline by pipecolic acid, the six-membered ring congener of proline, results in vastly different tandem mass spectra. The well-known proline effect is eliminated and amide bond cleavage C-terminal to pipecolic acid dominates instead. Why do these two ostensibly similar residues produce dramatically differing spectra? Recent evidence indicates that the proton affinities of these residues are similar, so are unlikely to explain the result [Raulfs et al., J. Am. Soc. Mass Spectrom. 25, 1705-1715 (2014)]. An additional hypothesis based on increased flexibility was also advocated. Here, we provide a computational investigation of the "pipecolic acid effect," to test this and other hypotheses to determine if theory can shed additional light on this fascinating result. Our calculations provide evidence for both the increased flexibility of pipecolic-acid-containing peptides, and structural changes in the transition structures necessary to produce the sequence ions. The most striking computational finding is inversion of the stereochemistry of the transition structures leading to "proline effect"-type amide bond fragmentation between the proline/pipecolic acid-congeners: R (proline) to S (pipecolic acid). Additionally, our calculations predict substantial stabilization of the amide bond cleavage barriers for the pipecolic acid congeners by reduction in deleterious steric interactions and provide evidence for the importance of experimental energy regime in rationalizing the spectra.

  8. Variation in sequence and organization of splicing regulatory elements in vertebrate genes

    PubMed Central

    Yeo, Gene; Hoon, Shawn; Venkatesh, Byrappa; Burge, Christopher B.

    2004-01-01

    Although core mechanisms and machinery of pre-mRNA splicing are conserved from yeast to human, the details of intron recognition often differ, even between closely related organisms. For example, genes from the pufferfish Fugu rubripes generally contain one or more introns that are not properly spliced in mouse cells. Exploiting available genome sequence data, a battery of sequence analysis techniques was used to reach several conclusions about the organization and evolution of splicing regulatory elements in vertebrate genes. The classical splice site and putative branch site signals are completely conserved across the vertebrates studied (human, mouse, pufferfish, and zebrafish), and exonic splicing enhancers also appear broadly conserved in vertebrates. However, another class of splicing regulatory elements, the intronic splicing enhancers, appears to differ substantially between mammals and fish, with G triples (GGG) very abundant in mammalian introns but comparatively rare in fish. Conversely, short repeats of AC and GT are predicted to function as intronic splicing enhancers in fish but are not enriched in mammalian introns. Consistent with this pattern, exonic splicing enhancer-binding SR proteins are highly conserved across all vertebrates, whereas heterogeneous nuclear ribonucleoproteins, which bind many intronic sequences, vary in domain structure and even presence/absence between mammals and fish. Exploiting differences in intronic sequence composition, a statistical model was developed to predict the splicing phenotype of Fugu introns in mammalian systems and was used to engineer the spliceability of a Fugu intron in human cells by insertion of specific sequences, thereby rescuing splicing in human cells. PMID:15505203
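
    The contrast in G-triple abundance described above can be illustrated with a simple motif-density count. This is a minimal sketch, not the statistical model from the paper: the function names are hypothetical, and the two short "introns" are invented toy strings rather than real mammalian or Fugu sequences.

```python
import re


def count_overlapping(seq: str, motif: str) -> int:
    """Count occurrences of motif in seq, allowing overlaps (GGGG holds two GGG)."""
    return len(re.findall(f"(?={re.escape(motif)})", seq.upper()))


def motif_per_kb(seq: str, motif: str = "GGG") -> float:
    """Motif occurrences per 1,000 nucleotides of sequence."""
    if not seq:
        return 0.0
    return 1000.0 * count_overlapping(seq, motif) / len(seq)


# Hypothetical toy introns, invented for illustration only.
mammal_like = "GTAAGTGGGCTGGGAGGGTCTGGGGCATTTCAG"
fish_like = "GTAAGTACACACTGTGTGTCACACATTTCAG"

for name, intron in (("mammal-like", mammal_like), ("fish-like", fish_like)):
    print(name, count_overlapping(intron, "GGG"), round(motif_per_kb(intron), 1))
```

    A real comparison would normalise for base composition and intron length across many introns; raw density alone does not establish enrichment.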

  9. PSCC: Sensitive and Reliable Population-Scale Copy Number Variation Detection Method Based on Low Coverage Sequencing

    PubMed Central

    Vogel, Ida; Choy, Kwong Wai; Chen, Fang; Christensen, Rikke; Zhang, Chunlei; Ge, Huijuan; Jiang, Haojun; Yu, Chang; Huang, Fang; Wang, Wei; Jiang, Hui; Zhang, Xiuqing

    2014-01-01

    Background: Copy number variations (CNVs) represent an important type of genetic variation that deeply impact phenotypic polymorphisms and human diseases. The advent of high-throughput sequencing technologies provides an opportunity to revolutionize the discovery of CNVs and to explore their relationship with diseases. However, most of the existing methods depend on sequencing depth and show instability with low sequence coverage. In this study, using low coverage whole-genome sequencing (LCS) we have developed an effective population-scale CNV calling (PSCC) method. Methodology/Principal Findings: In our novel method, two-step correction was used to remove biases caused by local GC content and complex genomic characteristics. We chose a binary segmentation method to locate CNV segments and designed combined statistics tests to ensure the stable performance of the false positive control. The simulation data showed that our PSCC method could achieve 99.7%/100% and 98.6%/100% sensitivity and specificity for over 300 kb CNV calling in the condition of LCS (∼2×) and ultra LCS (∼0.2×), respectively. Finally, we applied this novel method to analyze 34 clinical samples with an average of 2× LCS. In the final results, all the 31 pathogenic CNVs identified by aCGH were successfully detected. In addition, the performance comparison revealed that our method had significant advantages over existing methods using ultra LCS. Conclusions/Significance: Our study showed that PSCC can sensitively and reliably detect CNVs using low coverage or even ultra-low coverage data through population-scale sequencing. PMID:24465483
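
    The abstract names binary segmentation as the step that locates CNV segments. The sketch below shows the generic idea only and is not the PSCC implementation: it recursively picks the split that most reduces the within-segment squared error of a per-bin signal, and stops when the reduction falls below a threshold. The function names, the `min_gain` threshold and the toy log2-ratio profile are all assumptions; GC correction and the combined statistical tests described above are omitted.

```python
from typing import List, Tuple


def best_split(x: List[float]) -> Tuple[int, float]:
    """Return (index, gain) of the single split that most reduces squared error.

    The split at index i separates x[:i] from x[i:]. Index 0 means no
    split improved on treating x as one segment.
    """
    n = len(x)
    total = sum(x)
    total_sq = sum(v * v for v in x)
    sse_all = total_sq - total * total / n
    best_i, best_gain = 0, 0.0
    left = left_sq = 0.0
    for i in range(1, n):
        left += x[i - 1]
        left_sq += x[i - 1] ** 2
        right, right_sq = total - left, total_sq - left_sq
        sse = (left_sq - left * left / i) + (right_sq - right * right / (n - i))
        gain = sse_all - sse
        if gain > best_gain:
            best_i, best_gain = i, gain
    return best_i, best_gain


def binary_segmentation(x: List[float], min_gain: float, offset: int = 0) -> List[int]:
    """Recursively split x; return sorted breakpoint indices in the original signal."""
    if len(x) < 2:
        return []
    i, gain = best_split(x)
    if i == 0 or gain < min_gain:
        return []
    return (binary_segmentation(x[:i], min_gain, offset)
            + [offset + i]
            + binary_segmentation(x[i:], min_gain, offset + i))


# Hypothetical noise-free log2 read-depth ratios: a five-bin gain inside 25 bins.
profile = [0.0] * 10 + [0.58] * 5 + [0.0] * 10
print(binary_segmentation(profile, min_gain=0.1))
```

    Plain binary segmentation can miss short segments buried in long ones, which is one reason practical CNV callers add statistical tests or use variants such as circular binary segmentation.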

  10. Self-sequencing of amino acids and origins of polyfunctional protocells

    NASA Technical Reports Server (NTRS)

    Fox, S. W.

    1984-01-01

    The role of proteins in the origin of living things is discussed. It has been experimentally established that amino acids can sequence themselves under simulated geological conditions with highly nonrandom products which accordingly contain diverse information. Multiple copies of each type of macromolecule are formed, resulting in greater power for any protoenzymic molecule than would accrue from a single copy of each type. Thermal proteins are readily incorporated into laboratory protocells. The experimental evidence for original polyfunctional protocells is discussed.

  11. Amino acid sequence of atrial natriuretic peptides in human coronary sinus plasma.

    PubMed

    Yandle, T; Crozier, I; Nicholls, G; Espiner, E; Carne, A; Brennan, S

    1987-07-31

    Two atrial natriuretic peptides were purified from pooled human coronary sinus plasma by Sep-Pak extraction, immunoaffinity chromatography and reverse phase HPLC. The amino acid sequences of the two peptides were homologous with 99-126 human atrial natriuretic peptide (hANP) and 106-126 hANP, the latter being most probably linked to 99-105 ANP by the disulphide bond. The molar ratio of the peptides in plasma, as assessed by radioimmunoassay was 10:3.

  12. Amino Acid Sequences Mediating Vascular Cell Adhesion Molecule 1 Binding to Integrin Alpha 4: Homologous DSP Sequence Found for JC Polyoma VP1 Coat Protein

    PubMed Central

    Meyer, Michael Andrew

    2013-01-01

    The JC polyoma viral coat protein VP1 was analyzed for amino acid sequence homologies to the IDSP sequence which mediates binding of VLA-4 (integrin alpha 4) to vascular cell adhesion molecule 1. Although the full sequence was not found, a DSP sequence was located near the critical arginine residue linked to infectivity of the virus and binding to sialic acid-containing molecules such as integrins (3). For the JC polyoma virus, a DSP sequence was found at residues 70, 71 and 72 with homology also noted for the mouse polyoma virus and SV40 virus. Three-dimensional modeling of the VP1 molecule suggests that the DSP loop has an accessible site for interaction from the external side of the assembled viral capsid pentamer. PMID:24147211
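
    Searching a protein sequence for a short motif such as IDSP or DSP, and reporting 1-based residue positions as in the abstract above, can be sketched as follows. The function name is hypothetical and the sequence is an artificial placeholder (a run of alanines with a DSP tripeptide), not the real VP1 sequence.

```python
from typing import List


def find_motif(seq: str, motif: str) -> List[int]:
    """Return the 1-based start position of every occurrence, overlaps included."""
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to residue number
        start = seq.find(motif, start + 1)
    return positions


# Artificial placeholder sequence, NOT VP1: 69 alanines, then DSP, then GK.
placeholder = "A" * 69 + "DSP" + "GK"
print(find_motif(placeholder, "IDSP"))  # full motif absent in the placeholder
print(find_motif(placeholder, "DSP"))   # tripeptide start position
```

    An exact-match search like this says nothing about structural accessibility, which the abstract addresses separately through three-dimensional modeling.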

  13. Amino Acid Sequences Mediating Vascular Cell Adhesion Molecule 1 Binding to Integrin Alpha 4: Homologous DSP Sequence Found for JC Polyoma VP1 Coat Protein.

    PubMed

    Meyer, Michael Andrew

    2013-01-01

    The JC polyoma viral coat protein VP1 was analyzed for amino acid sequence homologies to the IDSP sequence which mediates binding of VLA-4 (integrin alpha 4) to vascular cell adhesion molecule 1. Although the full sequence was not found, a DSP sequence was located near the critical arginine residue linked to infectivity of the virus and binding to sialic acid-containing molecules such as integrins (3). For the JC polyoma virus, a DSP sequence was found at residues 70, 71 and 72 with homology also noted for the mouse polyoma virus and SV40 virus. Three-dimensional modeling of the VP1 molecule suggests that the DSP loop has an accessible site for interaction from the external side of the assembled viral capsid pentamer.

  14. Associations between sequence variations in the mitochondrial DNA D-loop region and outcome of hepatocellular carcinoma

    PubMed Central

    LI, SHILAI; WAN, PEIQI; PENG, TAO; XIAO, KAIYIN; SU, MING; SHANG, LIMING; XU, BANGHAO; SU, ZHIXIONG; YE, XINPING; PENG, NING; QIN, QUANLIN; LI, LEQUN

    2016-01-01

    The association between mitochondrial DNA (mtDNA) polymorphisms or mutations and the prognoses of cancer has been investigated previously, but the results have been ambiguous. In the present study, the associations between sequence variations in the mtDNA D-loop region and the outcomes of patients with hepatocellular carcinoma (HCC) were analysed. A total of 140 patients with HCC (123 males and 17 females), who were hospitalised to undergo radical resection, were studied. Polymerase chain reaction and direct sequencing were performed to detect the sequence variations in the mtDNA D-loop region. Multivariate and univariate analyses were conducted to determine important factors in the prognosis of HCC. A total of 150 point sequence variations were observed in the 140 cases (13 point mutations, 8 insertions, 20 deletions and 116 polymorphisms). The variation rate was 13.4% (150/1,122). mtDNA nucleotide 150 (C/T) was an independent factor in the logistic regression for early/late recurrence of HCC. Patients with 150T appeared to have later recurrences. In a Cox proportional hazards regression model, hepatitis B virus DNA, Child-Pugh class, differentiation degree, tumour-node-metastasis (TNM) stage, nucleotide 16263 (T/C) and nucleotide 315 (N/insertion C) were independent factors for tumour-free survival time. Patients with the 16263T allele had a greater tumour-free survival time than patients with the 16263C allele. Similarly, patients with 315 insertion C had a superior tumour-free survival time when compared with patients with 315 N (normal). In the Cox proportional hazards regression model, recurrence type (early/late), Child-Pugh class, TNM stage and adjuvant treatment after tumour recurrence (none or one/more than one treatment) were independent factors for overall survival. None of the mtDNA variations served as independent factors. Patients with late recurrence, Child-Pugh class A, and low TNM stages and/or those who received more than one adjuvant treatment

  15. Application of high-throughput genome sequencing to intrapathovar variation in Pseudomonas syringae.

    PubMed

    Studholme, David J

    2011-10-01

    One reason for the success of Pseudomonas syringae as a model pathogen has been the availability of three complete genome sequences since 2005. Now, at the beginning of 2011, more than 25 strains of P. syringae have been sequenced and many more will soon be released. To date, published analyses of P. syringae have been largely descriptive, focusing on catalogues of genetic differences among strains and between species. Numerous powerful statistical tools are now available that have yet to be applied to P. syringae genomic data for robust and quantitative reconstruction of evolutionary events. The aim of this review is to provide a snapshot of the current status of P. syringae genome sequence data resources, including very recent and unpublished studies, and thereby demonstrate the richness of resources available for this species. Furthermore, certain specific opportunities and challenges in making the best use of these data resources are highlighted.

  16. Amino acid sequence similarity between rabies virus glycoprotein and snake venom curaremimetic neurotoxins.

    PubMed

    Lentz, T L; Wilson, P T; Hawrot, E; Speicher, D W

    1984-11-16

    Evidence was presented earlier that a host-cell receptor for the highly neurotropic rabies virus might be the acetylcholine receptor. The amino acid sequence of the glycoprotein of rabies virus was compared by computer analysis with that of snake venom curaremimetic neurotoxins, potent ligands of the acetylcholine receptor. A statistically significant sequence relation was found between a segment of the rabies glycoprotein and the entire sequence of long neurotoxins. The greatest identity occurs with residues considered most important in neurotoxicity, including those interacting with the acetylcholine binding site of the acetylcholine receptor. Because of the similarity between the glycoprotein and the receptor-binding region of the neurotoxins, this region of the viral glycoprotein may function as a recognition site for the acetylcholine receptor. Direct binding of the rabies virus glycoprotein to the acetylcholine receptor could contribute to the neurotropism of this virus.

  17. Partial amino acid sequence of human pancreatic stone protein, a novel pancreatic secretory protein.

    PubMed Central

    Montalto, G; Bonicel, J; Multigner, L; Rovery, M; Sarles, H; De Caro, A

    1986-01-01

    Pancreatic stone protein (PSP) is the major organic component of human pancreatic stones. With the use of monoclonal antibody immunoadsorbents, five immunoreactive forms (PSP-S) with close Mr values (14,000-19,000) were isolated from normal pancreatic juice. By CM-Trisacryl M chromatography the lowest-Mr form (PSP-S1) was separated from the others and some of its molecular characteristics were investigated. The Mr of the PSP-S1 polypeptide chain calculated from the amino acid composition was about 16,100. The N-terminal sequences (40 residues) of PSP and PSP-S1 are identical, which suggests that the peptide backbone is the same for both of these polypeptides. The PSP-S1 sequence was determined up to residue 65 and was found to be different from all other known protein sequences. PMID:3541906

  18. Estimation of Response Functions Based on Variational Bayes Algorithm in Dynamic Images Sequences

    PubMed Central

    2016-01-01

    We proposed a nonparametric Bayesian model based on variational Bayes algorithm to estimate the response functions in dynamic medical imaging. In dynamic renal scintigraphy, the impulse response or retention functions are rather complicated and finding a suitable parametric form is problematic. In this paper, we estimated the response functions using nonparametric Bayesian priors. These priors were designed to favor desirable properties of the functions, such as sparsity or smoothness. These assumptions were used within hierarchical priors of the variational Bayes algorithm. We performed our algorithm on the real online dataset of dynamic renal scintigraphy. The results demonstrated that this algorithm improved the estimation of response functions with nonparametric priors. PMID:27631007

  19. Characterization of the microbial acid mine drainage microbial community using culturing and direct sequencing techniques.

    PubMed

    Auld, Ryan R; Myre, Maxine; Mykytczuk, Nadia C S; Leduc, Leo G; Merritt, Thomas J S

    2013-05-01

    We characterized the bacterial community from an AMD tailings pond using both classical culturing and modern direct sequencing techniques and compared the two methods. Acid mine drainage (AMD) is produced by the environmental and microbial oxidation of minerals dissolved from mining waste. Surprisingly, we know little about the microbial communities associated with AMD, despite the fundamental ecological roles of these organisms and large-scale economic impact of these waste sites. AMD microbial communities have classically been characterized by laboratory culturing-based techniques and more recently by direct sequencing of marker gene sequences, primarily the 16S rRNA gene. In our comparison of the techniques, we find that their results are complementary, overall indicating very similar community structure with similar dominant species, but with each method identifying some species that were missed by the other. We were able to culture the majority of species that our direct sequencing results indicated were present, primarily species within the Acidithiobacillus and Acidiphilium genera, although estimates of relative species abundance were only obtained from direct sequencing. Interestingly, our culture-based methods recovered four species that had been overlooked from our sequencing results because of the rarity of the marker gene sequences, likely members of the rare biosphere. Further, direct sequencing indicated that a single genus, completely missed in our culture-based study, Legionella, was a dominant member of the microbial community. Our results suggest that while either method does a reasonable job of identifying the dominant members of the AMD microbial community, together the methods combine to give a more complete picture of the true diversity of this environment.

  20. Analysis of genetic variation within clonal lineages of grape phylloxera (Daktulosphaira vitifoliae Fitch) using AFLP fingerprinting and DNA sequencing.

    PubMed

    Vorwerk, S; Forneck, A

    2007-07-01

    Two AFLP fingerprinting methods were employed to estimate the potential of AFLP fingerprints for the detection of genetic diversity within single founder lineages of grape phylloxera (Daktulosphaira vitifoliae Fitch). Eight clonal lineages, reared under controlled conditions in a greenhouse and reproducing asexually throughout a minimum of 15 generations, were monitored and mutations were scored as polymorphisms between the founder individual and individuals of succeeding generations. Genetic variation was detected within all lineages, from early generations on. Six to 15 polymorphic loci (from a total of 141 loci) were detected within the lineages, making up 4.3% of the total amount of genetic variation. The presence of contaminating extra-genomic sequences (e.g., viral material, bacteria, or ingested chloroplast DNA) was excluded as a source of intraclonal variation. Sequencing of 37 selected polymorphic bands confirmed their origin in mostly noncoding regions of the grape phylloxera genome. AFLP techniques were revealed to be powerful for the identification of reproducible banding patterns within clonal lineages.

  1. Signatures of DNA flexibility, interactions and sequence-related structural variations in classical X-ray diffraction patterns

    PubMed Central

    Kornyshev, A. A.; Lee, D. J.; Wynveen, A.; Leikin, S.

    2011-01-01

    The theory of X-ray diffraction from ideal, rigid helices allowed Watson and Crick to unravel the DNA structure, thereby elucidating functions encoded in it. Yet, as we know now, the DNA double helix is neither ideal nor rigid. Its structure varies with the base pair sequence. Its flexibility leads to thermal fluctuations and allows molecules to adapt their structure to optimize their intermolecular interactions. In addition to the double helix symmetry revealed by Watson and Crick, classical X-ray diffraction patterns of DNA contain information about the flexibility, interactions and sequence-related variations encoded within the helical structure. To extract this information, we have developed a new diffraction theory that accounts for these effects. We show how double helix non-ideality and fluctuations broaden the diffraction peaks. Meridional intensity profiles of the peaks at the first three helical layer lines reveal information about structural adaptation and intermolecular interactions. The meridional width of the fifth layer line peaks is inversely proportional to the helical coherence length that characterizes sequence-related and thermal variations in the double helix structure. Analysis of measured fiber diffraction patterns based on this theory yields important parameters that control DNA structure, packing and function. PMID:21593127

  2. Signatures of DNA flexibility, interactions and sequence-related structural variations in classical X-ray diffraction patterns.

    PubMed

    Kornyshev, A A; Lee, D J; Wynveen, A; Leikin, S

    2011-09-01

    The theory of X-ray diffraction from ideal, rigid helices allowed Watson and Crick to unravel the DNA structure, thereby elucidating functions encoded in it. Yet, as we know now, the DNA double helix is neither ideal nor rigid. Its structure varies with the base pair sequence. Its flexibility leads to thermal fluctuations and allows molecules to adapt their structure to optimize their intermolecular interactions. In addition to the double helix symmetry revealed by Watson and Crick, classical X-ray diffraction patterns of DNA contain information about the flexibility, interactions and sequence-related variations encoded within the helical structure. To extract this information, we have developed a new diffraction theory that accounts for these effects. We show how double helix non-ideality and fluctuations broaden the diffraction peaks. Meridional intensity profiles of the peaks at the first three helical layer lines reveal information about structural adaptation and intermolecular interactions. The meridional width of the fifth layer line peaks is inversely proportional to the helical coherence length that characterizes sequence-related and thermal variations in the double helix structure. Analysis of measured fiber diffraction patterns based on this theory yields important parameters that control DNA structure, packing and function.

  3. mit-o-matic: a comprehensive computational pipeline for clinical evaluation of mitochondrial variations from next-generation sequencing datasets.

    PubMed

    Vellarikkal, Shamsudheen Karuthedath; Dhiman, Heena; Joshi, Kandarp; Hasija, Yasha; Sivasubbu, Sridhar; Scaria, Vinod

    2015-04-01

    The human mitochondrial genome has been reported to have a very high mutation rate as compared with the nuclear genome. A large number of mitochondrial mutations show significant phenotypic association and are involved in a broad spectrum of diseases. In recent years, there has been remarkable progress in the understanding of mitochondrial genetics. The availability of next-generation sequencing (NGS) technologies has not only reduced sequencing cost by orders of magnitude but has also provided us good quality mitochondrial genome sequences with high coverage, thereby enabling decoding of a number of human mitochondrial diseases. In this study, we report a computational and experimental pipeline to decipher the human mitochondrial DNA variations and examine them for their clinical correlation. As a proof of principle, we also present a clinical study of a patient with Leigh disease and confirmed maternal inheritance of the causative allele. The pipeline is made available as a user-friendly online tool to annotate variants and find haplogroup, disease association, and heteroplasmic sites. The "mit-o-matic" computational pipeline represents a comprehensive cloud-based tool for clinical evaluation of mitochondrial genomic variations from NGS datasets. The tool is freely available at http://genome.igib.res.in/mitomatic/.

  4. [MOLECULAR EVOLUTION OF ION CHANNELS: AMINO ACID SEQUENCES AND 3D STRUCTURES].

    PubMed

    Korkosh, V S; Zhorov, B S; Tikhonov, D B

    2016-01-01

    An integral part of modern evolutionary biology is comparative analysis of structure and function of macromolecules such as proteins. The first and critical step to understand evolution of homologous proteins is their amino acid sequence alignment. However, standard algorithms do not provide unambiguous sequence alignments for proteins of poor homology. More reliable results can be obtained by comparing experimental 3D structures obtained at atomic resolution, for instance, with the aid of X-ray structural analysis. If such structures are lacking, homology modeling is used, which may take into account indirect experimental data on functional roles of individual amino-acid residues. An important problem is that the sequence alignment, which reflects genetic modifications, does not necessarily correspond to the functional homology. The latter depends on three-dimensional structures which are critical for natural selection. Since alignment techniques relying only on the analysis of primary structures carry no information on the functional properties of proteins, including 3D structures into consideration is very important. Here we consider several examples involving ion channels and demonstrate that alignment of their three-dimensional structures can significantly improve sequence alignments obtained by traditional methods.

  5. The amino acid sequence of the aspartate aminotransferase from baker's yeast (Saccharomyces cerevisiae).

    PubMed Central

    Cronin, V B; Maras, B; Barra, D; Doonan, S

    1991-01-01

    1. The single (cytosolic) aspartate aminotransferase was purified in high yield from baker's yeast (Saccharomyces cerevisiae). 2. Amino-acid-sequence analysis was carried out by digestion of the protein with trypsin and with CNBr; some of the peptides produced were further subdigested with Staphylococcus aureus V8 proteinase or with pepsin. Peptides were sequenced by the dansyl-Edman method and/or by automated gas-phase methods. The amino acid sequence obtained was complete except for a probable gap of two residues as indicated by comparison with the structures of counterpart proteins in other species. 3. The N-terminus of the enzyme is blocked. Fast-atom-bombardment m.s. was used to identify the blocking group as an acetyl one. 4. Alignment of the sequence of the enzyme with those of vertebrate cytosolic and mitochondrial aspartate aminotransferases and with the enzyme from Escherichia coli showed that about 25% of residues are conserved between these distantly related forms. 5. Experimental details and confirmatory data for the results presented here are given in a Supplementary Publication (SUP 50164, 25 pages) that has been deposited at the British Library Document Supply Centre, Boston Spa, Wetherby, West Yorkshire LS23 7BQ, U.K., from whom copies can be obtained on the terms indicated in Biochem. J. (1991) 273, 5. PMID:1859361

  6. Significant seasonal variations of microbial community in an acid mine drainage lake in Anhui Province, China.

    PubMed

    Hao, Chunbo; Wei, Pengfei; Pei, Lixin; Du, Zerui; Zhang, Yi; Lu, Yanchun; Dong, Hailiang

    2017-04-01

    Acid mine drainage (AMD), characterized by strong acidity and high metal concentrations, is generated by the oxidative dissolution of metal sulfides, and acidophiles can accelerate the process significantly. Despite extensive research in microbial diversity and community composition, little is known about seasonal variations of microbial community structure (especially micro eukaryotes) in response to environmental conditions in AMD ecosystems. To this end, AMD samples were collected from Nanshan AMD lake, Anhui Province, China, over a full seasonal cycle from 2013 to 2014, and water chemistry and microbial composition were studied. pH of lake water was stable (∼3.0) across the sampling period, while the concentrations of ions varied dramatically. The highest metal concentrations in the lake were found for Mg and Al, not the commonly found Fe. Unexpectedly, ultrahigh concentration of chlorophyll a was measured in the extremely acidic lake, reaching 226.43-280.95 μg/L in winter, even higher than those in most eutrophic freshwater lakes. Both prokaryotic and eukaryotic communities showed a strong seasonal variation. Among the prokaryotes, "Ferrovum", a chemolithotrophic iron-oxidizing bacterium, was predominant in most sampling seasons, although it was a minor member prior to September, 2012. Fe(2+) was the initial geochemical factor that drove the variation of the prokaryotic community. The eukaryotic community was simple but varied more drastically than the prokaryotic community. Photoautotrophic algae (primary producers) formed a food web with protozoa or flagellates (top consumers) across all four seasons, and temperature appeared to be responsible for the observed seasonal variation. Ochromonas and Chlamydomonas (responsible for high algal bloom in winter) occurred in autumn/summer and winter/spring seasons, respectively, because of their distinct growth temperatures. The closest phylogenetic relationship between Chlamydomonas species in the lake and those in Arctic

  7. The complete mitochondrial genome of a purebred Tibetan Mastiff (Canis lupus familiaris breed Tibetan Mastiff) from Lijiang, China, and comparison of genome-wide sequence variations.

    PubMed

    Deng, Li Xin; He, Cong

    2016-01-01

    In this study, the complete mitochondrial genome sequence of the Tibetan Mastiff was reported. The total length of the mitogenome is 16,729 bp. It has the typical structure, comprising 13 protein-coding genes, 22 transfer RNA genes, 2 ribosomal RNA genes and 1 control region, in line with other canine animals. We further identified genome-wide variations among different canine mitochondrial genomes and found that the D-loop region harbors the most sequence variation, which will provide sequence variation information for the protection and utilization of the Tibetan Mastiff germplasm resource.

  8. Organosulfates and organic acids in Arctic aerosols: speciation, annual variation and concentration levels

    NASA Astrophysics Data System (ADS)

    Hansen, A. M. K.; Kristensen, K.; Nguyen, Q. T.; Zare, A.; Cozzi, F.; Nøjgaard, J. K.; Skov, H.; Brandt, J.; Christensen, J. H.; Ström, J.; Tunved, P.; Krejci, R.; Glasius, M.

    2014-02-01

    Sources, composition and occurrence of secondary organic aerosols (SOA) in the Arctic were investigated at Zeppelin Mountain, Svalbard, and Station Nord, northeast Greenland, during the full annual cycle of 2008 and 2010 respectively. We focused on the speciation of three types of SOA tracers: organic acids, organosulfates and nitrooxy organosulfates from both anthropogenic and biogenic precursors, here presenting organosulfate concentrations and compositions during a full annual cycle and chemical speciation of organosulfates in Arctic aerosols for the first time. Aerosol samples were analysed using High Performance Liquid Chromatography coupled to a quadrupole Time-of-Flight mass spectrometer (HPLC-q-TOF-MS). A total of 11 organic acids (terpenylic acid, benzoic acid, phthalic acid, pinic acid, suberic acid, azelaic acid, adipic acid, pimelic acid, pinonic acid, diaterpenylic acid acetate (DTAA) and 3-methyl-1,2,3-butanetricarboxylic acid (MBTCA)), 12 organosulfates and one nitrooxy organosulfate were identified at the two sites. Six out of the 12 organosulfates are reported for the first time. Concentrations of organosulfates follow a distinct annual pattern at Station Nord, where high concentrations were observed in late winter and early spring, with a mean total concentration of 47 (±14) ng m-3, accounting for 7 (±2)% of total organic matter, in contrast to a considerably lower organosulfate mean concentration of 2 (±3) ng m-3 (accounting for 1 (±1)% of total organic matter) observed during the rest of the year. The organic acids followed the same temporal trend as the organosulfates at Station Nord; however, the variations in organic acid concentrations were less pronounced, with a total mean organic acid concentration of 11.5 (±4) ng m-3 (accounting for 1.7 (±0.6)% of total organic matter) in late winter and early spring, and 2.2 (±1) ng m-3 (accounting for 0.9 (±0.4)% of total organic matter) during the rest of the year. At Zeppelin Mountain

  9. Barcoding lichen-forming fungi using 454 pyrosequencing is challenged by artifactual and biological sequence variation.

    PubMed

    Mark, Kristiina; Cornejo, Carolina; Keller, Christine; Flück, Daniela; Scheidegger, Christoph

    2016-09-01

    Although lichens (lichen-forming fungi) play an important role in the ecological integrity of many vulnerable landscapes, only a minority of lichen-forming fungi have been barcoded out of the currently accepted ∼18 000 species. Regular Sanger sequencing can be problematic when analyzing lichens since saprophytic, endophytic, and parasitic fungi live intimately admixed, resulting in low-quality sequencing reads. Here, high-throughput, long-read 454 pyrosequencing in a GS FLX+ System was tested to barcode the fungal partner of 100 epiphytic lichen species from Switzerland using fungal-specific primers when amplifying the full internal transcribed spacer region (ITS). The present study shows the potential of DNA barcoding using pyrosequencing, in that the expected lichen fungus was successfully sequenced for all samples except one. Alignment solutions such as BLAST were found to be largely adequate for the generated long reads. In addition, the NCBI nucleotide database (currently the most complete database for lichen-forming fungi) can be used as a reference database when identifying common species, since the majority of analyzed lichens were identified correctly to the species or at least to the genus level. However, several issues were encountered, including a high sequencing error rate, multiple ITS versions in a genome (incomplete concerted evolution), and in some samples the presence of mixed lichen-forming fungi (possible lichen chimeras).

  10. A pedigree-based study of mitochondrial D-loop DNA sequence variation among Arabian horses.

    PubMed

    Bowling, A T; Del Valle, A; Bowling, M

    2000-02-01

    Through DNA sequence comparisons of a mitochondrial D-loop hypervariable region, we investigated matrilineal diversity for Arabian horses in the United States. Sixty-two horses were tested. From published pedigrees they traced in the maternal line to 34 mares acquired primarily in the mid to late 19th century from nomadic Bedouin tribes. Compared with the reference sequence (GenBank X79547), these samples showed 27 haplotypes with altogether 31 base substitution sites within 397 bp of sequence. Based on examination of pedigrees from a random sampling of 200 horses in current studbooks of the Arabian Horse Registry of America, we estimated that this study defined the expected mtDNA haplotypes for at least 89% of Arabian horses registered in the US. The reliability of the studbook recorded maternal lineages of Arabian pedigrees was demonstrated by haplotype concordance among multiple samplings in 14 lines. Single base differences observed within two maternal lines were interpreted as representing alternative fixations of past heteroplasmy. The study also demonstrated the utility of mtDNA sequence studies to resolve historical maternity questions without access to biological material from the horses whose relationship was in question, provided that representatives of the relevant female lines were available for comparison. The data call into question the traditional assumption that Arabian horses of the same strain necessarily share a common maternal ancestry.
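
    The comparison against a reference sequence described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: it assumes the sample sequences are already aligned to the reference and of equal length, and the function names, sample names and short toy sequences are hypothetical, not data from the study. A haplotype is represented as the tuple of substitutions relative to the reference, and samples sharing the same tuple are grouped together.

```python
from collections import defaultdict


def substitutions(reference, sample):
    """Return the base substitutions of `sample` relative to `reference`.

    Positions are 1-based. Both sequences are assumed to be pre-aligned
    and of equal length (no insertions or deletions are handled).
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to the same length")
    return tuple(
        (pos, ref, alt)
        for pos, (ref, alt) in enumerate(zip(reference, sample), start=1)
        if ref != alt
    )


def group_haplotypes(reference, samples):
    """Group sample names by their substitution pattern (haplotype)."""
    groups = defaultdict(list)
    for name, seq in samples.items():
        groups[substitutions(reference, seq)].append(name)
    return dict(groups)


def variable_sites(groups):
    """Sorted list of positions that differ from the reference in any haplotype."""
    return sorted({pos for haplotype in groups for pos, _, _ in haplotype})


if __name__ == "__main__":
    # Toy data for illustration only.
    reference = "ACGTACGTAC"
    samples = {
        "line_A": "ACGTACGTAC",
        "line_B": "ACATACGTAC",
        "line_C": "ACATACGTAT",
        "line_D": "ACATACGTAC",
    }
    groups = group_haplotypes(reference, samples)
    print(len(groups), "haplotypes;", "variable sites:", variable_sites(groups))
```

    Counting the distinct keys gives the number of haplotypes, and the union of their positions gives the number of substitution sites, which are the two quantities the abstract reports (27 haplotypes, 31 sites within 397 bp).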

  11. Complete amino acid sequence of a histidine-rich proteolytic fragment of human ceruloplasmin.

    PubMed

    Kingston, I B; Kingston, B L; Putnam, F W

    1979-04-01

    The complete amino acid sequence has been determined for a fragment of human ceruloplasmin [ferroxidase; iron(II):oxygen oxidoreductase, EC 1.16.3.1]. The fragment (designated Cp F5) contains 159 amino acid residues and has a molecular weight of 18,650; it lacks carbohydrate, is rich in histidine, and contains one free cysteine that may be part of a copper-binding site. This fragment is present in most commercial preparations of ceruloplasmin, probably owing to proteolytic degradation, but can also be obtained by limited cleavage of single-chain ceruloplasmin with plasmin. Cp F5 probably is an intact domain attached to the COOH-terminal end of single-chain ceruloplasmin via a labile interdomain peptide bond. A model of the secondary structure predicted by empirical methods suggests that almost one-third of the amino acid residues are distributed in alpha helices, about a third in beta-sheet structure, and the remainder in beta turns and unidentified structures. Computer analysis of the amino acid sequence has not demonstrated a statistically significant relationship between this ceruloplasmin fragment and any other protein, but there is some evidence for an internal duplication.

  12. Natural variation among Arabidopsis accessions reveals malic acid as a key mediator of Nickel (Ni) tolerance.

    PubMed

    Agrawal, Bhavana; Lakshmanan, Venkatachalam; Kaushik, Shail; Bais, Harsh P

    2012-08-01

    Plants have evolved various mechanisms for detoxification that are specific to the plant species as well as the metal ion chemical properties. Malic acid, which is commonly found in plants, participates in a number of physiological processes including metal chelation. Using natural variation among Arabidopsis accessions, we investigated the function of malic acid in Nickel (Ni) tolerance and detoxification. The Ni-induced production of reactive oxygen species was found to be modulated by intracellular malic acid, indicating its crucial role in Ni detoxification. Ni tolerance in Arabidopsis may actively involve malic acid and/or complexes of Ni and malic acid. Investigation of malic acid content in roots among tolerant ecotypes suggested that a complex of Ni and malic acid may be involved in translocation of Ni from roots to leaves. The exudation of malic acid from roots in response to Ni treatment in either susceptible or tolerant plant species was found to be partially dependent on AtALMT1 expression. A lower concentration of Ni (10 µM) treatment induced AtALMT1 expression in the Ni-tolerant Arabidopsis ecotypes. We found that the ecotype Santa Clara (S.C.) not only tolerated Ni but also accumulated more Ni in leaves compared to other ecotypes. Thus, the ecotype S.C. can be used as a model system to delineate the biochemical and genetic basis of Ni tolerance, accumulation, and detoxification in plants. The evolution of Ni hyperaccumulators, which are found in serpentine soils, is an interesting corollary to the fact that S.C. is also native to serpentine soils.

  13. Summer and winter variations of dicarboxylic acids, fatty acids and benzoic acid in PM2.5 in Pearl Delta River Region, China

    NASA Astrophysics Data System (ADS)

    Ho, K. F.; Ho, S. S. H.; Lee, S. C.; Kawamura, K.; Zou, S. C.; Cao, J. J.; Xu, H. M.

    2011-03-01

    Ground-based PM2.5 samples collected at four different sites in Pearl River Delta region (PRD) during winter and summer (from 14 December 2006 to 28 January 2007 in winter and from 4 July to 9 August 2007 in summer) were analyzed for 30 water-soluble organic species, including dicarboxylic acids, ketocarboxylic acids and dicarbonyls, nine fatty acids, and benzoic acid. Molecular distributions of dicarboxylic acids demonstrated that oxalic acid (C2) was the most abundant species followed by phthalic acid (Ph) in PRD region. The concentrations of total dicarboxylic acids ranged from 99 to 1340 ng m-3, with an average of 438 ± 267 ng m-3 in PRD. The concentrations of total ketocarboxylic acids ranged from 0.6 to 207 ng m-3 (43 ± 48 ng m-3 on average) while the concentrations of total α-dicarbonyls, including glyoxal and methylglyoxal, ranged from 0.2 to 89 ng m-3, with an average of 11 ± 18 ng m-3 in PRD. The total quantified water-soluble compounds (TQWOC) (organic carbon) accounted for 3.4 ± 2.2% of OC and 14.3 ± 10.3% of water-soluble OC (WSOC). Hexadecanoic acid (C16:0), octadecanoic acid (C18:0) and oleic acid (C18:1) were the three most abundant fatty acids in PRD. The distributions of fatty acids were characterized by a strong even carbon number predominance with a maximum (Cmax) at hexadecanoic acid (C16:0). The ratio of C18:1 to C18:0 acts as an indicator for aerosol aging. In PRD, the average C18:1/C18:0 ratio was 0.53 ± 0.39, suggesting an enhanced photochemical degradation of unsaturated fatty acid. Moreover, the concentrations of benzoic acid ranged from 84 to 306 ng m-3 (165 ± 48 ng m-3 on average); benzoic acid can be emitted as a primary pollutant from motor vehicle exhaust, or formed from photochemical degradation of aromatic hydrocarbons. Seasonal variations of the organic species concentrations were found in the four sampling cities. Higher concentrations of TQWOC were observed in winter (598 ± 321 ng m-3) than in summer (372 ± 215 ng m-3). However

  14. Processing and amino acid sequence analysis of the mouse mammary tumor virus env gene product.

    PubMed Central

    Arthur, L O; Copeland, T D; Oroszlan, S; Schochetman, G

    1982-01-01

    The envelope proteins of mouse mammary tumor virus (MMTV) are synthesized from a subgenomic 24S mRNA as a 75,000-dalton glycosylated precursor polyprotein which is eventually processed to the mature glycoproteins gp52 and gp36. In vivo synthesis of this env precursor in the presence of the core glycosylation inhibitor tunicamycin yielded a precursor of approximately 61,000 daltons (P61env). However, a 67,000-dalton protein (P67env) was obtained from cell-free translation with the MMTV 24S mRNA as the template. To determine whether the portion of the protein cleaved from P67env to give P61env was removed from the NH2-terminal end of P67env and as such would represent a leader sequence, the NH2-terminal amino acid sequence of the terminal peptide gp52 was determined. Glutamic acid, and not methionine, was found to be the amino-terminal residue of gp52, indicating that the cleaved portion was derived from the NH2-terminal end of P67env. The NH2-terminal amino acid sequences of gp52's from endogenous and exogenous C3H MMTVs were determined through 46 residues and found to be identical. However, amino acid composition and type-specific gp52 radioimmunoassays from MMTVs grown in heterologous cells indicated primary structure differences between gp52's of the two viruses. The nucleic acid sequence of cloned MMTV DNA fragments (J. Majors and H. E. Varmus, personal communication) in conjunction with the NH2-terminal sequence of gp52 allowed localization of the env gene in the MMTV genome. Nucleotides coding for the NH2 terminus of gp52 begin approximately 0.8 kilobase to the 3' side of the single EcoRI cleavage site. Localization of the env gene at that point agrees with the proposed gene order -gag-pol-env- and also allows sufficient coding potential for the glycoprotein precursor without extending into the long terminal repeat. PMID:6281457

  15. Complete Genome Sequence of a thermotolerant sporogenic lactic acid bacterium, Bacillus coagulans strain 36D1

    PubMed Central

    Rhee, Mun Su; Moritz, Brélan E.; Xie, Gary; Glavina del Rio, T.; Dalin, E.; Tice, H.; Bruce, D.; Goodwin, L.; Chertkov, O.; Brettin, T.; Han, C.; Detter, C.; Pitluck, S.; Land, Miriam L.; Patel, Milind; Ou, Mark; Harbrucker, Roberta; Ingram, Lonnie O.; Shanmugam, K. T.

    2011-01-01

    Bacillus coagulans is a ubiquitous soil bacterium that grows at 50-55 °C and pH 5.0 and ferments various sugars that constitute plant biomass to L (+)-lactic acid. The ability of this sporogenic lactic acid bacterium to grow at 50-55 °C and pH 5.0 makes this organism an attractive microbial biocatalyst for production of optically pure lactic acid at industrial scale not only from glucose derived from cellulose but also from xylose, a major constituent of hemicellulose. This bacterium is also considered as a potential probiotic. Complete genome sequence of a representative strain, B. coagulans strain 36D1, is presented and discussed. PMID:22675583

  16. BeadCons: detection of nucleic acid sequences by flow cytometry.

    PubMed

    Horejsh, Douglas; Martini, Federico; Capobianchi, Maria Rosaria

    2005-11-01

    Molecular beacons are single-stranded nucleic acid structures with a terminal fluorophore and a distal, terminal quencher. These molecules are typically used in real-time PCR assays, but have also been conjugated with solid matrices. This unit describes protocols related to molecular beacon-conjugated beads (BeadCons), whose specific hybridization with complementary target sequences can be resolved by cytometry. Assay sensitivity is achieved through the concentration of fluorescence signal on discrete particles. By using molecular beacons with different fluorophores and microspheres of different sizes, it is possible to construct a fluid array system with each bead corresponding to a specific target nucleic acid. Methods are presented for the design, construction, and use of BeadCons for the specific, multiplexed detection of unlabeled nucleic acids in solution. The use of bead-based detection methods will likely lead to the design of new multiplex molecular diagnostic tools.

  17. Measuring nanometer distances in nucleic acids using a sequence-independent nitroxide probe

    PubMed Central

    Qin, Peter Z; Haworth, Ian S; Cai, Qi; Kusnetzow, Ana K; Grant, Gian Paola G; Price, Eric A; Sowa, Glenna Z; Popova, Anna; Herreros, Bruno; He, Honghang

    2008-01-01

    This protocol describes the procedures for measuring nanometer distances in nucleic acids using a nitroxide probe that can be attached to any nucleotide within a given sequence. Two nitroxides are attached to phosphorothioates that are chemically substituted at specific sites of DNA or RNA. Inter-nitroxide distances are measured using a four-pulse double electron–electron resonance technique, and the measured distances are correlated to the parent structures using a Web-accessible computer program. Four to five days are needed for sample labeling, purification and distance measurement. The procedures described herein provide a method for probing global structures and studying conformational changes of nucleic acids and protein/nucleic acid complexes. PMID:17947978

  18. Complete Genome Sequence of a thermotolerant sporogenic lactic acid bacterium, Bacillus coagulans strain 36D1.

    PubMed

    Rhee, Mun Su; Moritz, Brélan E; Xie, Gary; Glavina Del Rio, T; Dalin, E; Tice, H; Bruce, D; Goodwin, L; Chertkov, O; Brettin, T; Han, C; Detter, C; Pitluck, S; Land, Miriam L; Patel, Milind; Ou, Mark; Harbrucker, Roberta; Ingram, Lonnie O; Shanmugam, K T

    2011-12-31

    Bacillus coagulans is a ubiquitous soil bacterium that grows at 50-55 °C and pH 5.0 and ferments various sugars that constitute plant biomass to L (+)-lactic acid. The ability of this sporogenic lactic acid bacterium to grow at 50-55 °C and pH 5.0 makes this organism an attractive microbial biocatalyst for production of optically pure lactic acid at industrial scale not only from glucose derived from cellulose but also from xylose, a major constituent of hemicellulose. This bacterium is also considered as a potential probiotic. Complete genome sequence of a representative strain, B. coagulans strain 36D1, is presented and discussed.

  19. Comprehensive characterization of human genome variation by high coverage whole-genome sequencing of forty four Caucasians.

    PubMed

    Shen, Hui; Li, Jian; Zhang, Jigang; Xu, Chao; Jiang, Yan; Wu, Zikai; Zhao, Fuping; Liao, Li; Chen, Jun; Lin, Yong; Tian, Qing; Papasian, Christopher J; Deng, Hong-Wen

    2013-01-01

    Whole genome sequencing studies are essential to obtain a comprehensive understanding of the vast pattern of human genomic variations. Here we report the results of a high-coverage whole genome sequencing study for 44 unrelated healthy Caucasian adults, each sequenced to over 50-fold coverage (averaging 65.8×). We identified approximately 11 million single nucleotide polymorphisms (SNPs), 2.8 million short insertions and deletions, and over 500,000 block substitutions. We showed that, although previous studies, including the 1000 Genomes Project Phase 1 study, have catalogued the vast majority of common SNPs, many of the low-frequency and rare variants remain undiscovered. For instance, approximately 1.4 million SNPs and 1.3 million short indels that we found were novel to both the dbSNP and the 1000 Genomes Project Phase 1 data sets, the majority of which (∼96%) have a minor allele frequency less than 5%. On average, each individual genome carried ∼3.3 million SNPs and ∼492,000 indels/block substitutions, including approximately 179 variants that were predicted to cause loss of function of the gene products. Moreover, each individual genome carried an average of 44 such loss-of-function variants in a homozygous state, which would completely "knock out" the corresponding genes. Across all the 44 genomes, a total of 182 genes were "knocked out" in at least one individual genome, among which 46 genes were "knocked out" in over 30% of our samples, suggesting that a number of genes are commonly "knocked out" in general populations. Gene ontology analysis suggested that these commonly "knocked-out" genes are enriched in biological processes related to antigen processing and immune response. Our results contribute towards a comprehensive characterization of human genomic variation, especially for less-common and rare variants, and provide an invaluable resource for future genetic studies of human variation and diseases.

  20. Dissecting a bacterial collagen domain from Streptococcus pyogenes: sequence and length-dependent variations in triple helix stability and folding.

    PubMed

    Yu, Zhuoxin; Brodsky, Barbara; Inouye, Masayori

    2011-05-27

    To better investigate the relationship between sequence, stability, and folding, the Streptococcus pyogenes collagenous domain CL (Gly-Xaa-Yaa)(79) was divided to create three recombinant triple helix subdomains A, B, and C of almost equal size with distinctive amino acid features: an A domain high in polar residues, a B domain containing the highest concentration of Pro residues, and a very highly charged C domain. Each segment was expressed as a monomer, a linear dimer, and a linear trimer fused with the trimerization domain (V domain) in Escherichia coli. All recombinant proteins studied formed stable triple helical structures, but the stability varied depending on the amino acid sequence in the A, B, and C segments and increased as the triple helix got longer. V-AAA was found to melt at a much lower temperature (31.0 °C) than V-ABC (V-CL), whereas V-BBB melted at almost the same temperature (∼36-37 °C). When heat-denatured, the V domain enhanced refolding for all of the constructs; however, the folding rate was affected by their amino acid sequences and became reduced for longer constructs. The folding rates of all the other constructs were lower than that of the natural V-ABC protein. Amino acid substitution mutations at all Pro residues in the C fragment dramatically decreased stability but increased the folding rate. These results indicate that the thermostability of the bacterial collagen is dominated by the most stable domain in the same manner as found with eukaryotic collagens.

  1. The amino acid sequence of Lady Amherst's pheasant (Chrysolophus amherstiae) and golden pheasant (Chrysolophus pictus) egg-white lysozymes.

    PubMed

    Araki, T; Kuramoto, M; Torikata, T

    1990-09-01

    The amino acid sequences of Lady Amherst's pheasant and golden pheasant egg-white lysozymes have been determined. The carboxymethylated lysozymes were digested with trypsin followed by sequencing of the tryptic peptides. Lady Amherst's pheasant lysozyme proved to consist of 129 amino acid residues, and a relative molecular mass of 14,423 Da was calculated. This lysozyme had 6 amino acid substitutions when compared with hen egg-white lysozyme: Phe3 to Tyr, His15 to Leu, Gln41 to His, Asn77 to His, Gln121 to Asn, and a newly found substitution of Ile124 to Thr. The amino acid sequence of golden pheasant lysozyme was identical to that of Lady Amherst's pheasant lysozyme. The phylogenetic tree constructed by the comparison of amino acid sequences of phasianoid bird lysozymes revealed a minimum genetic distance between these pheasants and the turkey-peafowl group.

  2. Sequence variations at the HLA-linked olfactory receptor cluster do not influence female preferences for male odors

    PubMed Central

    Thompson, Emma E; Haller, Gabe; Pinto, Jayant M; Sun, Ying; Zelano, Bethanne; Jacob, Suma; McClintock, Martha K.; Nicolae, Dan L.; Ober, Carole

    2013-01-01

    We previously reported that paternally-inherited human leukocyte antigen (HLA) alleles are a template for women's preference for male odors (P = 0.0007). However, it has been suggested that sequence variation in a nearby olfactory receptor (OR) cluster on chromosome 6p influences smell preference. To determine if the HLA-linked OR genes contribute to previously observed HLA-mediated behaviors, we use the odor preference data from our earlier study in combination with a new resequencing study of four functional HLA-linked OR genes in the same subjects. Our results indicate that OR alleles in the genes surveyed are not in linkage disequilibrium (LD) with HLA variation and do not explain the previous findings of HLA-associated odor preference. PMID:19833159

  3. Genetic variation in and spatial structure of natural populations of Dipterocarpus alatus (Dipterocarpaceae) determined using single sequence repeat markers.

    PubMed

    Tam, N M; Duy, V D; Duc, N M; Giap, V D; Xuan, B T T

    2014-07-24

    Dipterocarpus alatus (Dipterocarpaceae) is widely distributed in lowland forests in central and southern Vietnam, Cambodia, Laos, Myanmar, Philippines, Thailand, and India. Due to over-exploitation and habitat destruction, the species is now threatened. The genetic variation within and among populations of D. alatus was investigated on the basis of 9 microsatellite (single sequence repeat, SSR) loci. In all, 268 sampled trees from 10 populations in central and southern Vietnam were analyzed in this study. The SSR data showed a high genetic variability within populations with an average of HO = 0.209 and HE = 0.239. Genetic differentiation among populations was high (FST = 0.266), indicating limited gene flow (Nm = 0.69). Analysis of molecular variance showed that most genetic variation was within populations (74.96%). This study highlights the importance of conserving the genetic resources of D. alatus species.
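
    The reported gene-flow estimate is consistent with Wright's island-model approximation, Nm = (1 - FST) / (4 FST). The abstract does not state which formula was used, so the sketch below is an assumption; the function name is hypothetical. With FST = 0.266 the approximation gives a value that rounds to the reported 0.69.

```python
def gene_flow_nm(fst):
    """Estimate Nm (migrants per generation) from FST.

    Uses Wright's island-model approximation Nm = (1 - FST) / (4 * FST).
    Assumed here for illustration; the abstract does not name its method.
    """
    if not 0.0 < fst <= 1.0:
        raise ValueError("FST must be in the interval (0, 1]")
    return (1.0 - fst) / (4.0 * fst)


if __name__ == "__main__":
    print(round(gene_flow_nm(0.266), 2))  # 0.69
```

    Under this approximation, Nm below 1 is conventionally read as gene flow too weak to counteract differentiation by drift, which matches the abstract's interpretation of limited gene flow.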

  4. Identification of eight mutations and three sequence variations in the cystic fibrosis transmembrane conductance regulator (CFTR) gene

    SciTech Connect

    Ghanem, N.; Costes, B.; Girodon, E.; Martin, J.; Fanen, P.; Goossens, M.

    1994-05-15

    To determine cystic fibrosis (CF) defects in a sample of 224 non-ΔF508 CF chromosomes, the authors used denaturing gradient gel multiplex analysis of CF transmembrane conductance regulator gene segments, a strategy based on blind exhaustive analysis rather than a search for known mutations. This process allowed detection of 11 novel variations comprising two nonsense mutations (Q890X and W1204X), a splice defect (405 + 4 A → G), a frameshift (3293delA), four presumed missense mutations (S912L, H949Y, L1065P, Q1071P), and three sequence polymorphisms (R31C or 223 C/T, 3471 T/C, and T1220I or 3791 C/T). The authors describe these variations, together with the associated phenotype when defects on both CF chromosomes were identified. 8 refs., 1 fig., 1 tab.

  5. Variation of Select Flavonols and Chlorogenic Acid Content of Elderberry Collected Throughout the Eastern United States

    PubMed Central

    Mudge, Elizabeth; Applequist, Wendy L.; Finley, Jamie; Lister, Patience; Townesmith, Andrew K.; Walker, Karen M.; Brown, Paula N.

    2016-01-01

    American elderberries are commonly collected from wild plants for use as food and medicinal products. The degree of phytochemical variation amongst wild populations has not been established and might affect the overall quality of elderberry dietary supplements. The three major flavonols identified in elderberries are rutin, quercetin and isoquercetin. Variation in the flavonols and chlorogenic acid was determined for 107 collections of elderberries from throughout the eastern United States using an optimized high performance liquid chromatography with ultraviolet detection method. The mean content was 71.9 mg per 100 g fresh weight with variation ranging from 7.0 to 209.7 mg per 100 g fresh weight within the collected population. Elderberries collected from southeastern regions had significantly higher contents in comparison with those in more northern regions. The variability of the individual flavonol and chlorogenic acid profiles of the berries was complex and likely influenced by multiple factors. Several outliers were identified based on unique phytochemical profiles in comparison with average populations. This is the first study to determine the inherent variability of American elderberries from wild collections and can be used to identify potential new cultivars that may produce fruits of unique or high-quality phytochemical content for the food and dietary supplement industries. PMID:26877585

  6. Patterns of structural and sequence variation within isotype lineages of the Neisseria meningitidis transferrin receptor system

    PubMed Central

    Adamiak, Paul; Calmettes, Charles; Moraes, Trevor F; Schryvers, Anthony B

    2015-01-01

    Neisseria meningitidis inhabits the human upper respiratory tract and is an important cause of sepsis and meningitis. A surface receptor composed of transferrin-binding proteins A and B (TbpA and TbpB) is responsible for acquiring iron from host transferrin. Sequence and immunological diversity divides TbpBs into two distinct lineages: isotype I and isotype II. Two representative isotype I and II strains, B16B6 and M982, differ in their dependence on TbpB for in vitro growth on exogenous transferrin. The crystal structure of TbpB and a structural model for TbpA from the representative isotype I N. meningitidis strain B16B6 were obtained. The structures were integrated with a comprehensive analysis of the sequence diversity of these proteins to probe for potential functional differences. A distinct isotype I TbpA was identified that co-varied with TbpB and lacked sequence in the region for the loop 3 α-helix that is proposed to be involved in iron removal from transferrin. The tightly associated isotype I TbpBs had a distinct anchor peptide region and a distinct, smaller linker region between the lobes, and lacked the large loops in the isotype II C-lobe. Sequences of the intact TbpB, the TbpB N-lobe, the TbpB C-lobe, and TbpA were subjected to phylogenetic analyses. The phylogenetic clustering of TbpA and the TbpB C-lobe were similar, with two main branches comprising the isotype I and isotype II TbpBs, possibly suggesting an association between TbpA and the TbpB C-lobe. The intact TbpB and TbpB N-lobe had 4 main branches, one consisting of the isotype I TbpBs. One isotype II TbpB cluster appeared to consist of isotype I N-lobe sequences and isotype II C-lobe sequences, indicating the swapping of N-lobes and C-lobes. Our findings should inform future studies on the interaction between TbpB and TbpA and the process of iron acquisition. PMID:25800619

  7. Chimeric proteins for detection and quantitation of DNA mutations, DNA sequence variations, DNA damage and DNA mismatches

    DOEpatents

    McCutchen-Maloney, Sandra L.

    2002-01-01

    Chimeric proteins having both DNA mutation binding activity and nuclease activity are synthesized by recombinant technology. The proteins are of the general formula A-L-B and B-L-A where A is a peptide having DNA mutation binding activity, L is a linker and B is a peptide having nuclease activity. The chimeric proteins are useful for detection and identification of DNA sequence variations including DNA mutations (including DNA damage and mismatches) by binding to the DNA mutation and cutting the DNA once the DNA mutation is detected.

  8. Intrashell variations in amino acid concentrations and isoleucine epimerization ratios in fossil Hiatella arctica

    NASA Astrophysics Data System (ADS)

    Brigham, Julie K.

    1983-09-01

    Twenty-four valves of fossil Hiatella arctica were analyzed to determine if amino acid ratios varied from one region of a shell to another. The ratio of D-alloisoleucine/L-isoleucine, routinely used as a stratigraphic correlation tool and an indicator of relative age, did not vary significantly between five anatomically different shell parts in Hiatella arctica. Sampling only the hinge or central part of all valves, however, resulted in less variation about the average value. Analyses of only this part of the shell should improve the resolution of stratigraphic units by amino acid geochronology. The absolute concentrations of aspartic acid, threonine, serine, glutamic acid, glycine, alanine, valine, alloisoleucine, isoleucine, and leucine (in picomoles/milligram of shell) are significantly higher in the hinge and central part of the shell, whereas the outer growth edge appears to have lower levels of amino acids. This is true in both the FREE and TOTAL hydrolysate fractions. The reasons are not clear; however, the high value may be caused by a thin, protein-rich inner layer lining the valve out to the pallial line and/or differences in the proportion of inorganic carbonate to protein produced in different areas during shell growth. Alternatively, it may suggest leaching of the thinner, more vulnerable part of the shell growth edge.

  9. The impact of monomer sequence and stereochemistry on the swelling and erosion of biodegradable poly(lactic-co-glycolic acid) matrices.

    PubMed

    Washington, Michael A; Swiner, Devin J; Bell, Kerri R; Fedorchak, Morgan V; Little, Steven R; Meyer, Tara Y

    2017-02-01

    Monomer sequence is demonstrated to be a primary factor in determining the hydrolytic degradation profile of poly(lactic-co-glycolic acid)s (PLGAs). Although many approaches have been used to tune the degradation of PLGAs, little effort has been expended in exploring the sequence-control strategy exploited by nature in biopolymers. Cylindrical matrices and films prepared from a series of sequenced and random PLGAs were subjected to hydrolysis in a pH 7.4 buffer at 37 °C. Swelling ranged from 107% for the random racemic PLGA with a 50:50 ratio of lactic (L) to glycolic (G) units to 6% for the sequenced alternating copolymer poly LG. Erosion followed an inverse trend with the random 50:50 PLGA showing an erosion half-life of 3-4 weeks while poly LG required ca. >10 weeks. Stereosequence was found to play a large role in determining swelling and erosion; stereopure analogs swelled less and were slower to lose mass. Molecular weight loss followed similar trends and increases in dispersity correlated with the onset of significant swelling. The proportion of rapidly cleavable G-G linkages relative to G-L/L-G (moderate) and L-L (slow) linkages correlates strongly with the degree of swelling observed and the rate of erosion. The dramatic sequence-dependent variation in swelling, in the absence of a parallel hydrophilicity trend, suggests that osmotic pressure, driven by the differential accumulation of degradation products, plays an important role.
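
    The linkage argument in this abstract can be illustrated with a small sketch. It assumes a chain is written as a plain string of L and G units and simply tallies adjacent pairs; the function name and the example chains are hypothetical and are not taken from the study.

```python
from collections import Counter

def linkage_fractions(chain: str) -> dict:
    """Fraction of G-G, mixed (G-L or L-G), and L-L adjacent pairs in a chain.

    Toy model (assumption): the chain is a string over {"L", "G"} and every
    adjacent pair of units counts as one linkage.
    """
    if len(chain) < 2 or set(chain) - {"L", "G"}:
        raise ValueError("chain must contain at least two units, all 'L' or 'G'")
    pairs = Counter()
    for a, b in zip(chain, chain[1:]):
        if a == b == "G":
            pairs["G-G"] += 1      # described in the abstract as rapidly cleavable
        elif a == b == "L":
            pairs["L-L"] += 1      # described as slow
        else:
            pairs["G-L/L-G"] += 1  # described as moderate
    total = sum(pairs.values())
    return {kind: pairs[kind] / total for kind in ("G-G", "G-L/L-G", "L-L")}

# An alternating chain has no G-G linkages at all.
print(linkage_fractions("LGLGLGLG"))
# A hypothetical non-alternating chain with the same 50:50 composition does.
print(linkage_fractions("GGLLGLGG"))
```

    In this toy model the alternating chain contains only mixed linkages, which is in line with the abstract's report that poly LG swells least and erodes most slowly.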

  10. Correlates of substitution rate variation in mammalian protein-coding sequences

    PubMed Central

    2008-01-01

    Background: Rates of molecular evolution in different lineages can vary widely, and some of this variation might be predictable from aspects of species' biology. Investigating such predictable rate variation can help us to understand the causes of molecular evolution, and could also help to improve molecular dating methods. Here we present a comprehensive study of the life history correlates of substitution rate variation across the mammals, comparing results for mitochondrial and nuclear loci, and for synonymous and non-synonymous sites. We use phylogenetic comparative methods, refined to take into account the special nature of substitution rate data. Particular attention is paid to the widespread correlations between the components of mammalian life history, which can complicate the interpretation of results. Results: We find that mitochondrial synonymous substitution rates, estimated from the 9 longest mitochondrial genes, show strong negative correlations with body mass and with maximum recorded lifespan. But lifespan is the sole variable to remain after multiple regression and model simplification. Nuclear synonymous substitution rates, estimated from 6 genes, show strong negative correlations with body mass and generation time, and a strong positive correlation with fecundity. In contrast to the mitochondrial results, the same trends are evident in rates of nonsynonymous substitution. Conclusion: A substantial proportion of variation in mammalian substitution rates can be explained by aspects of their life history, implying that molecular and life history evolution are closely interlinked in this group. The strength and consistency of the nuclear body mass effect suggests that molecular dating studies may have been systematically misled, but also that methods could be improved by incorporating the finding as a priori information. Mitochondrial synonymous rates also show the body mass effect, but for apparently quite different reasons, and the strength of the

  11. Mitochondrial DNA Sequence Variation in North Atlantic Long-Finned Pilot Whales, Globicephala melas

    DTIC Science & Technology

    1994-06-01

    Strongylocentrotus purpuratus and S. droebachiensis. Evolution 44: 403-415. Rosel, P.E. (1992). Genetic population structure and systematic relationships of...reproduce and distribute copies of this thesis document in whole or in part. Signature of Author, Joint Program in Oceanography, Massachusetts Institute...variation used in the studies described in this chapter include: 1) Genetic distance (d, p, S, or D) is a measure of the number of nucleotide

  12. Variations of dopamine, serotonin, and amino acid concentrations in Noda epileptic rat (NER) retina.

    PubMed

    Chanut, Evelyne; Labarthe, Benoît; Lacroix, Brigitte; Noda, Atsuhi; Gasdeblay, Sylvie; Bondier, Jean-Robert; Versaux-Botteri, Claudine

    2006-01-27

    Noda epileptic rats (NER) exhibit frequent spontaneous tonic-clonic convulsions, which represent a valuable model of human epilepsy. Although the involvement of brain neurotransmitters has been widely reported, little is known about the retina. However, it has been reported that human epilepsy syndrome varies not only with the location of seizure foci but also according to rhythmic patterns, for which the retina has a major role in transmitting external light-dark cycle information. The purpose of this work was to evaluate variations in the levels of dopamine (DA), DA metabolites, serotonin (5-HT), and amino acids [glutamate, aspartate, glycine, gamma-aminobutyric acid (GABA), and taurine] in retina from NER, at two different nycthemeral periods (11 a.m. and 11 p.m.) and at different ages (2, 6, and 12 months). In NER, retinal dopaminergic function was decreased as early as 2 months, whereas GABA levels were increased, although no differences among the different ages could be distinguished. These variations were associated with a slight increase in 5-HT. Other amino acids tested were not affected by epilepsy, whereas taurine decreased with aging in NER as well as in control rats. Retinal 5-HT occurs principally as a precursor of melatonin (MEL). A triangular interaction may be hypothesized: MEL could decrease DA synthesis or release by enhancing GABA activity. Taken together, these results suggest that retinal physiology is affected by the epileptic status and that information transmitted from the retina to the brain should be affected by epilepsy in NER.

  13. Amplification and thrifty single-molecule sequencing of recurrent somatic structural variations

    PubMed Central

    Patel, Anand; Schwab, Richard; Liu, Yu-Tsueng; Bafna, Vineet

    2014-01-01

    Deletion of tumor-suppressor genes as well as other genomic rearrangements pervade cancer genomes across numerous types of solid tumor and hematologic malignancies. However, even for a specific rearrangement, the breakpoints may vary between individuals, such as the recurrent CDKN2A deletion. Characterizing the exact breakpoints for structural variants (SVs) is useful for designating patient-specific tumor biomarkers. We propose AmBre (Amplification of Breakpoints), a method to target SV breakpoints occurring in samples composed of heterogeneous tumor and germline DNA. Additionally, AmBre validates SVs called by whole-exome/genome sequencing and hybridization arrays. AmBre involves a PCR-based approach to amplify the DNA segment containing an SV's breakpoint and then confirms breakpoints using sequencing by Pacific Biosciences RS. To amplify breakpoints with PCR, primers tiling specified target regions are carefully selected with a simulated annealing algorithm to minimize off-target amplification and maximize efficiency at capturing all possible breakpoints within the target regions. To confirm correct amplification and obtain breakpoints, PCR amplicons are combined without barcoding and simultaneously long-read sequenced using a single SMRT cell. Our algorithm efficiently separates reads based on breakpoints. Each read group supporting the same breakpoint corresponds to an amplicon, and a consensus amplicon sequence is called. AmBre was used to discover CDKN2A deletion breakpoints in cancer cell lines: A549, CEM, Detroit562, MOLT4, MCF7, and T98G. Also, we successfully assayed RUNX1–RUNX1T1 reciprocal translocations by finding both breakpoints in the Kasumi-1 cell line. AmBre successfully targets SVs where DNA harboring the breakpoints is present in 1:1000 mixtures. PMID:24307551

  14. A 25-Amino Acid Sequence of the Arabidopsis TGD2 Protein Is Sufficient for Specific Binding of Phosphatidic Acid*

    PubMed Central

    Lu, Binbin; Benning, Christoph

    2009-01-01

    Genetic analysis suggests that the TGD2 protein of Arabidopsis is required for the biosynthesis of endoplasmic reticulum derived thylakoid lipids. TGD2 is proposed to be the substrate-binding protein of a presumed lipid transporter consisting of the TGD1 (permease) and TGD3 (ATPase) proteins. The TGD1, -2, and -3 proteins are localized in the inner chloroplast envelope membrane. TGD2 appears to be anchored with an N-terminal membrane-spanning domain into the inner envelope membrane, whereas the C-terminal domain faces the intermembrane space. It was previously shown that the C-terminal domain of TGD2 binds phosphatidic acid (PtdOH). To investigate the PtdOH binding site of TGD2 in detail, the C-terminal domain of the TGD2 sequence lacking the transit peptide and transmembrane sequences was fused to the C terminus of the Discosoma sp. red fluorescent protein (DR). This greatly improved the solubility of the resulting DR-TGD2C fusion protein following production in Escherichia coli. The DR-TGD2C protein bound PtdOH with high specificity, as demonstrated by membrane lipid-protein overlay and liposome association assays. Internal deletion and truncation mutagenesis identified a previously undescribed minimal 25-amino acid fragment in the C-terminal domain of TGD2 that is sufficient for PtdOH binding. Binding characteristics of this 25-mer were distinctly different from those of TGD2C, suggesting that additional sequences of TGD2 providing the proper context for this 25-mer are needed for wild type-like PtdOH binding. PMID:19416982

  15. Posttranslational modification and sequence variation of redox-active proteins correlate with biofilm life cycle in natural microbial communities

    SciTech Connect

    Singer, Steven; Erickson, Brian K; Verberkmoes, Nathan C; Hwang, Mona; Shah, Manesh B; Hettich, Robert L; Banfield, Jillian F; Thelen, Michael P

    2010-01-01

    Characterizing proteins recovered from natural microbial communities affords the opportunity to correlate protein expression and modification with environmental factors, including species composition and successional stage. Proteogenomic and biochemical studies of pellicle biofilms from subsurface acid mine drainage streams have shown abundant cytochromes from the dominant organism, Leptospirillum Group II. These cytochromes are proposed to be key proteins in aerobic Fe(II) oxidation, the dominant mode of cellular energy generation by the biofilms. In this study, we determined that posttranslational modification and expression of amino-acid sequence variants change as a function of biofilm maturation. For Cytochrome579 (Cyt579), the most abundant cytochrome in the biofilms, late developmental-stage biofilms differed from early-stage biofilms in N-terminal truncations and decreased redox potentials. Expression of sequence variants of two monoheme c-type cytochromes also depended on biofilm development. For Cyt572, an abundant membrane-bound cytochrome, the expression of multiple sequence variants was observed in both early and late developmental-stage biofilms; however, redox potentials of Cyt572 from these different sources did not vary significantly. These cytochrome analyses show a complex response of the Leptospirillum Group II electron transport chain to growth within a microbial community and illustrate the power of multiple proteomics techniques to define biochemistry in natural systems.

  16. Dissection of genomic features and variations of three pathotypes of Puccinia striiformis through whole genome sequencing

    PubMed Central

    Kiran, Kanti; Rawal, Hukam C.; Dubey, Himanshu; Jaswal, R.; Bhardwaj, Subhash C.; Prasad, P.; Pal, Dharam; Devanna, B. N.; Sharma, Tilak R.

    2017-01-01

    Stripe rust of wheat, caused by Puccinia striiformis f. sp. tritici, is one of the important diseases of wheat. We used NGS technologies to generate draft genome sequences of two highly virulent pathotypes (46S 119 and 31) and one least virulent pathotype (K) of P. striiformis from the Indian subcontinent. We generated ~24,000–32,000 sequence contigs (N50: 7.4–9.2 kb), which accounted for ~86X–105X sequence depth coverage, with an estimated genome size of these pathotypes ranging from 66.2 to 70.2 Mb. A genome-wide analysis revealed that pathotype 46S 119 might be highly evolved among the three pathotypes in terms of year of detection and prevalence. SNP analysis revealed that ~47% of the gene sets are affected by nonsynonymous mutations. The extracellular secreted (ES) proteins presumably are well conserved among the three pathotypes, and perhaps purifying selection has an important role in differentiating pathotype 46S 119 from pathotypes K and 31. In the present study, we decoded the genomes of three pathotypes, with 81% of the total annotated genes being successfully assigned functional roles. Besides the identification of secretory genes, the genes essential for pathogen-host interactions should make this study a valuable genomic resource for the management of this disease using host resistance. PMID:28211474
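
    The N50 figure quoted for the contigs can be computed as in the sketch below. It uses the common definition of N50 (the length of the contig at which the running total of lengths, taken from longest to shortest, first reaches half of the assembly); the abstract does not state which tool or definition the authors used, and the contig lengths shown are made up for illustration.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Common definition (assumed here): sort contigs from longest to shortest
    and return the length at which the cumulative sum first reaches half of
    the total assembly length.
    """
    ordered = sorted(lengths, reverse=True)
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length
    raise ValueError("no contigs given")

# Hypothetical contig lengths in kb (not data from the study).
print(n50([2, 2, 2, 3, 3, 4, 8, 8]))
```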

  17. Dissection of genomic features and variations of three pathotypes of Puccinia striiformis through whole genome sequencing.

    PubMed

    Kiran, Kanti; Rawal, Hukam C; Dubey, Himanshu; Jaswal, R; Bhardwaj, Subhash C; Prasad, P; Pal, Dharam; Devanna, B N; Sharma, Tilak R

    2017-02-17

    Stripe rust of wheat, caused by Puccinia striiformis f. sp. tritici, is one of the important diseases of wheat. We used NGS technologies to generate draft genome sequences of two highly virulent pathotypes (46S 119 and 31) and one least virulent pathotype (K) of P. striiformis from the Indian subcontinent. We generated ~24,000-32,000 sequence contigs (N50: 7.4-9.2 kb), which accounted for ~86X-105X sequence depth coverage, with an estimated genome size of these pathotypes ranging from 66.2 to 70.2 Mb. A genome-wide analysis revealed that pathotype 46S 119 might be highly evolved among the three pathotypes in terms of year of detection and prevalence. SNP analysis revealed that ~47% of the gene sets are affected by nonsynonymous mutations. The extracellular secreted (ES) proteins presumably are well conserved among the three pathotypes, and perhaps purifying selection has an important role in differentiating pathotype 46S 119 from pathotypes K and 31. In the present study, we decoded the genomes of three pathotypes, with 81% of the total annotated genes being successfully assigned functional roles. Besides the identification of secretory genes, the genes essential for pathogen-host interactions should make this study a valuable genomic resource for the management of this disease using host resistance.

  18. Nucleotide sequence of the luxC gene encoding fatty acid reductase of the lux operon from Photobacterium leiognathi.

    PubMed

    Lin, J W; Chao, Y F; Weng, S F

    1993-02-26

    The nucleotide sequence of the luxC gene (EMBL Accession No. 65156) encoding fatty acid reductase (FAR) of the lux operon from Photobacterium leiognathi PL741 was determined and the encoded amino acid sequence deduced. The fatty acid reductase is a component of the fatty acid reductase complex. The complex is responsible for converting fatty acid to aldehyde, which serves as the substrate in the luciferase-catalyzed bioluminescent reaction. The protein comprises 478 amino acid residues and has a calculated M(r) of 53,858. Alignment and comparison of the fatty acid reductase of P. leiognathi with those of Vibrio harveyi B392 and Vibrio fischeri ATCC 7744 show 70% and 59% amino acid residue identity, respectively.
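
    Percent identity figures such as the 70% and 59% above are derived from an alignment. The sketch below shows one way to compute them from two already-aligned sequences. The choice of denominator (columns where neither sequence has a gap) is an assumption, since the abstract does not say which convention was used, and the short sequences are invented for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of identical residues over columns where neither sequence has a gap.

    Assumption: both inputs come from the same pairwise alignment, so they
    have equal length and use `gap` as the gap character.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    columns = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != gap and b != gap]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(a == b for a, b in columns)
    return 100.0 * matches / len(columns)

# Hypothetical aligned fragments (not the luxC sequences).
print(percent_identity("MKT-AL", "MRTQAL"))
```

    Other conventions divide by the full alignment length or by the length of the shorter sequence, so reported identities are only comparable when the convention is the same.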

  19. Natural variation in Brachypodium distachyon: Deep Sequencing of Highly Diverse Natural Accessions (2013 DOE JGI Genomics of Energy and Environment 8th Annual User Meeting)

    SciTech Connect

    Gordon, Sean

    2013-03-01

    Sean Gordon of the USDA on "Natural variation in Brachypodium distachyon: Deep Sequencing of Highly Diverse Natural Accessions" at the 8th Annual Genomics of Energy & Environment Meeting on March 27, 2013 in Walnut Creek, Calif.

  20. Nucleotide sequence of the Klebsiella pneumoniae nifD gene and predicted amino acid sequence of the alpha-subunit of nitrogenase MoFe protein.

    PubMed Central

    Ioannidis, I; Buck, M

    1987-01-01

    The nucleotide sequence of the Klebsiella pneumoniae nifD gene is presented and together with the accompanying paper [Holland, Zilberstein, Zamir & Sussman (1987) Biochem. J. 247, 277-285] completes the sequence of the nifHDK genes encoding the nitrogenase polypeptides. The K. pneumoniae nifD gene encodes the 483-amino acid-residue nitrogenase alpha-subunit polypeptide of Mr 54156. The alpha-subunit has five strongly conserved cysteine residues at positions 63, 89, 155, 184 and 275, some occurring in a region showing both primary sequence and potential structural homology to the K. pneumoniae nitrogenase beta-subunit. A comparison with six other alpha-subunit amino acid sequences has been made, which indicates a number of potentially important domains within alpha-subunits. PMID:3322262

  1. Complete amino acid sequence of the A chain of human complement-classical-pathway enzyme C1r.

    PubMed Central

    Arlaud, G J; Willis, A C; Gagnon, J

    1987-01-01

    The amino acid sequence of human C1r A chain was determined, from sequence analysis performed on fragments obtained from C1r autolytic cleavage, cleavage of methionyl bonds, tryptic cleavages at arginine and lysine residues, and cleavages by staphylococcal proteinase. The polypeptide chain has an N-terminal serine residue and contains 446 amino acid residues (Mr 51,200). The sequence data allow chemical characterization of fragments alpha (positions 1-211), beta (positions 212-279) and gamma (positions 280-446) yielded from C1r autolytic cleavage, and identification of the two major cleavage sites generating these fragments. Position 150 of C1r A chain is occupied by a modified amino acid residue that, upon acid hydrolysis, yields erythro-beta-hydroxyaspartic acid, and that is located in a sequence homologous to the beta-hydroxyaspartic acid-containing regions of Factor IX, Factor X, protein C and protein Z. Sequence comparison reveals internal homology between two segments (positions 10-78 and 186-257). Two carbohydrate moieties are attached to the polypeptide chain, both via asparagine residues at positions 108 and 204. Combined with the previously determined sequence of C1r B chain [Arlaud & Gagnon (1983) Biochemistry 22, 1758-1764], these data give the complete sequence of human C1r. PMID:3036070

  2. Characterization of mitochondrial ribosomal RNA genes in gadiformes: sequence variations, secondary structural features, and phylogenetic implications.

    PubMed

    Bakke, Ingrid; Johansen, Steinar

    2002-10-01

    Secondary structure features of mitochondrial ribosomal RNAs (mt-rRNAs) of bony fishes were investigated by a DNA sequence alignment approach. The small subunit (SSU) and large subunit (LSU) mt-rRNA genes were found to contain several additional variable regions compared to their mammalian counterparts. Fish mt-LSU rRNA genes were found to be longer than the mammalian genes due to the increased length of some of the variable regions. The 5' and 3' ends of Atlantic cod mt-rRNAs were precisely mapped. The 3' ends of mt-SSU rRNAs were found to be homogeneous and mono-adenylated, whereas those of the mt-LSU rRNAs were heterogeneous and oligo-adenylated. The 5' ends of mt-SSU rRNAs appeared to be heterogeneous, corresponding to the presumed first and second positions of the gene. Sequences of the central domain and the D-domain of the mt-SSU and mt-LSU rRNA genes, respectively, were determined and characterized for 11 gadiform species (representing the families Gadidae, Lotidae, Ranicipitidae, Merlucciidae, Phycidae, and Macrouridae) and one Lophiidae species. Detailed secondary structure models of the RNA regions are presented for the Atlantic cod (Gadus morhua) and Roundnose grenadier (Coryphaenoides rupestris). Saturation plots revealed that DNA nucleotide positions corresponding to unpaired RNA regions become saturated with transitions at sequence divergence levels of about 0.15. Phylogenetic analyses revealed some aspects of gadiform relationships. Gadidae was identified as the most derived of the gadiform families. Lotidae was found to be the family most closely related to Gadidae, and Ranicipitidae was also recognized as a derived gadiform taxon.
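
    Saturation plots of the kind mentioned above are built from pairwise counts of transitions and transversions. The sketch below shows only that counting step for two aligned DNA sequences; it is not the authors' procedure, and the function name and example sequences are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

def count_ts_tv(seq_a: str, seq_b: str) -> tuple:
    """Count (transitions, transversions) between two aligned DNA sequences.

    Columns containing gaps or ambiguous bases are skipped (an assumption
    made for this sketch).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1    # purine<->purine or pyrimidine<->pyrimidine
        else:
            transversions += 1  # purine<->pyrimidine
    return transitions, transversions

# Hypothetical aligned fragments.
print(count_ts_tv("AACGT", "GATGA"))
```

    Plotting these counts against pairwise sequence divergence for many sequence pairs gives a saturation plot; transitions that level off while divergence keeps rising indicate saturation.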

  3. Nucleotide sequences of the Pseudomonas savastanoi indoleacetic acid genes show homology with Agrobacterium tumefaciens T-DNA

    PubMed Central

    Yamada, Tetsuji; Palm, Curtis J.; Brooks, Bob; Kosuge, Tsune

    1985-01-01

    We report the nucleotide sequences of iaaM and iaaH, the genetic determinants for, respectively, tryptophan 2-monooxygenase and indoleacetamide hydrolase, the enzymes that catalyze the conversion of L-tryptophan to indoleacetic acid in the tumor-forming bacterium Pseudomonas syringae pv. savastanoi. The sequence analysis indicates that the iaaM locus contains an open reading frame encoding 557 amino acids that would comprise a protein with a molecular weight of 61,783; the iaaH locus contains an open reading frame of 455 amino acids that would comprise a protein with a molecular weight of 48,515. Significant amino acid sequence homology was found between the predicted sequence of the tryptophan monooxygenase of P. savastanoi and the deduced product of the T-DNA tms-1 gene of the octopine-type plasmid pTiA6NC from Agrobacterium tumefaciens. Strong homology was found in the 25 amino acid sequence in the putative FAD-binding region of tryptophan monooxygenase. Homology was also found in the amino acid sequences representing the central regions of the putative products of iaaH and tms-2 T-DNA. The results suggest a strong similarity in the pathways for indoleacetic acid synthesis encoded by genes in P. savastanoi and in A. tumefaciens T-DNA. PMID:16593610

  4. DNA sequence-dependent variation in nucleosome structure, stability, and dynamics detected by a FRET-based analysis.

    PubMed

    Kelbauskas, L; Woodbury, N; Lohr, D

    2009-02-01

    Förster resonance energy transfer (FRET) techniques provide powerful and sensitive methods for the study of conformational features in biomolecules. Here, we review FRET-based studies of nucleosomes, focusing particularly on our work comparing the widely used nucleosome standard, 5S rDNA, and 2 promoter-derived regulatory element-containing nucleosomes, mouse mammary tumor virus (MMTV)-B and GAL10. Using several FRET approaches, we detected significant DNA sequence-dependent structure, stability, and dynamics differences among the three. In particular, 5S nucleosomes and 5S H2A/H2B-depleted nucleosomal particles have enhanced stability and diminished DNA dynamics, compared with MMTV-B and GAL10 nucleosomes and particles. H2A/H2B-depleted nucleosomes are of interest because they are produced by the activities of many transcription-associated complexes. Significant location-dependent (intranucleosomal) stability and dynamics variations were also observed. These also vary among nucleosome types. Nucleosomes restrict regulatory factor access to DNA, thereby impeding genetic processes. Eukaryotic cells possess mechanisms to alter nucleosome structure, to generate DNA access, but alterations often must be targeted to specific nucleosomes on critical regulatory DNA elements. By endowing specific nucleosomes with intrinsically higher DNA accessibility and (or) enhanced facility for conformational transitions, DNA sequence-dependent nucleosome dynamics and stability variations have the potential to facilitate nucleosome recognition and, thus, aid in the crucial targeting process. This and other nucleosome structure and function conclusions from FRET analyses are discussed.

  5. Autozygome Sequencing Expands the Horizon of Human Knockout Research and Provides Novel Insights into Human Phenotypic Variation

    PubMed Central

    Anazi, Shamsa; Alshamekh, Shomoukh; Alkuraya, Fowzan S.

    2013-01-01

    The use of autozygosity as a mapping tool in the search for autosomal recessive disease genes is well established. We hypothesized that autozygosity not only unmasks the recessiveness of disease causing variants, but can also reveal natural knockouts of genes with less obvious phenotypic consequences. To test this hypothesis, we exome sequenced 77 well phenotyped individuals born to first cousin parents in search of genes that are biallelically inactivated. Using a very conservative estimate, we show that each of these individuals carries biallelic inactivation of 22.8 genes on average. For many of the 169 genes that appear to be biallelically inactivated, available data support involvement in modulating metabolism, immunity, perception, external appearance and other phenotypic aspects, and appear therefore to contribute to human phenotypic variation. Other genes with biallelic inactivation may contribute in yet unknown mechanisms or may be on their way to conversion into pseudogenes due to true recent dispensability. We conclude that sequencing the autozygome is an efficient way to map the contribution of genes to human phenotypic variation that goes beyond the classical definition of disease. PMID:24367280

  6. Prediction of flexible/rigid regions from protein sequences using k-spaced amino acid pairs

    PubMed Central

    Chen, Ke; Kurgan, Lukasz A; Ruan, Jishou

    2007-01-01

    Background: Traditionally, it is believed that the native structure of a protein corresponds to a global minimum of its free energy. However, with the growing number of known tertiary (3D) protein structures, researchers have discovered that some proteins can alter their structures in response to a change in their surroundings or with the help of other proteins or ligands. Such structural shifts play a crucial role with respect to the protein function. To this end, we propose a machine learning method for the prediction of the flexible/rigid regions of proteins (referred to as FlexRP); the method is based on a novel sequence representation and feature selection. Knowledge of the flexible/rigid regions may provide insights into the protein folding process and the 3D structure prediction. Results: The flexible/rigid regions were defined based on a dataset, which includes protein sequences that have multiple experimental structures, and which was previously used to study the structural conservation of proteins. Sequences drawn from this dataset were represented based on feature sets that were proposed in prior research, such as PSI-BLAST profiles, composition vector and binary sequence encoding, and a newly proposed representation based on frequencies of k-spaced amino acid pairs. These representations were processed by feature selection to reduce the dimensionality. Several machine learning methods for the prediction of flexible/rigid regions and two recently proposed methods for the prediction of conformational changes and unstructured regions were compared with the proposed method. The FlexRP method, which applies Logistic Regression and collocation-based representation with 95 features, obtained 79.5% accuracy. The two runner-up methods, which apply the same sequence representation and Support Vector Machines (SVM) and Naïve Bayes classifiers, obtained 79.2% and 78.4% accuracy, respectively. The remaining considered methods are characterized by accuracies below 70
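
    The k-spaced amino acid pair representation named in this abstract can be sketched as follows. The abstract does not define the representation precisely, so this sketch assumes that a "k-spaced pair" is two residues separated by exactly k intervening positions and that counts are normalized by the number of such pairs in the sequence. The feature selection step that reduces the representation to 95 features is not reproduced, and all names are hypothetical.

```python
from collections import Counter
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues

def k_spaced_pair_frequencies(sequence: str, k: int) -> dict:
    """Frequencies of residue pairs separated by exactly k intervening residues.

    Returns a dict with one entry for each of the 400 ordered pairs of
    standard residues. Pairs involving non-standard letters are counted in
    the denominator but get no entry (an assumption of this sketch).
    """
    seq = sequence.upper()
    n_pairs = len(seq) - k - 1  # number of positions i with i + k + 1 in range
    if k < 0 or n_pairs <= 0:
        raise ValueError("sequence too short for this k")
    counts = Counter(seq[i] + seq[i + k + 1] for i in range(n_pairs))
    return {
        a + b: counts[a + b] / n_pairs
        for a, b in product(AMINO_ACIDS, repeat=2)
    }

# Hypothetical fragment: with k = 1, "ACAC" yields the pairs A.A and C.C.
features = k_spaced_pair_frequencies("ACAC", k=1)
print(features["AA"], features["CC"], len(features))
```

    Concatenating the dictionaries for several values of k would give a longer feature vector, which is the kind of input a feature selection step could then prune.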

  7. Nucleic and amino acid sequences relating to a novel transketolase, and methods for the expression thereof

    DOEpatents

    Croteau, Rodney Bruce; Wildung, Mark Raymond; Lange, Bernd Markus; McCaskill, David G.

    2001-01-01

    cDNAs encoding 1-deoxyxylulose-5-phosphate synthase from peppermint (Mentha piperita) have been isolated and sequenced, and the corresponding amino acid sequences have been determined. Accordingly, isolated DNA sequences (SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7) are provided which code for the expression of 1-deoxyxylulose-5-phosphate synthase from plants. In another aspect the present invention provides for isolated, recombinant DXPS proteins, such as the proteins having the sequences set forth in SEQ ID NO:4, SEQ ID NO:6 and SEQ ID NO:8. In other aspects, replicable recombinant cloning vehicles are provided which code for plant 1-deoxyxylulose-5-phosphate synthases, or for a base sequence sufficiently complementary to at least a portion of 1-deoxyxylulose-5-phosphate synthase DNA or RNA to enable hybridization therewith. In yet other aspects, modified host cells are provided that have been transformed, transfected, infected and/or injected with a recombinant cloning vehicle and/or DNA sequence encoding a plant 1-deoxyxylulose-5-phosphate synthase. Thus, systems and methods are provided for the recombinant expression of the aforementioned recombinant 1-deoxyxylulose-5-phosphate synthase that may be used to facilitate its production, isolation and purification in significant amounts. Recombinant 1-deoxyxylulose-5-phosphate synthase may be used to obtain expression or enhanced expression of 1-deoxyxylulose-5-phosphate synthase in plants in order to enhance the production of 1-deoxyxylulose-5-phosphate, or its derivatives such as isopentenyl diphosphate (IPP), or may be otherwise employed for the regulation or expression of 1-deoxyxylulose-5-phosphate synthase, or the production of its products.

  8. Gene sequence and predicted amino acid sequence of the motA protein, a membrane-associated protein required for flagellar rotation in Escherichia coli.

    PubMed Central

    Dean, G E; Macnab, R M; Stader, J; Matsumura, P; Burks, C

    1984-01-01

    The motA and motB gene products of Escherichia coli are integral membrane proteins necessary for flagellar rotation. We determined the DNA sequence of the region containing the motA gene and its promoter. Within this sequence, there is an open reading frame of 885 nucleotides, which with high probability (98% confidence level) meets criteria for a coding sequence. The 295-residue amino acid translation product had a molecular weight of 31,974, in good agreement with the value determined experimentally by gel electrophoresis. The amino acid sequence, which was quite hydrophobic, was subjected to a theoretical analysis designed to predict membrane-spanning alpha-helical segments of integral membrane proteins; four such hydrophobic helices were predicted by this treatment. Additional amphipathic helices may also be present. A remarkable feature of the sequence is the existence of two segments of high uncompensated charge density, one positive and the other negative. Possible organization of the protein in the membrane is discussed. Asymmetry in the amino acid composition of translated DNA sequences was used to distinguish between two possible initiation codons. The use of this method as a criterion for authentication of coding regions is described briefly in an Appendix. PMID:6090403

  9. Genome Sequence Analysis of the Naphthenic Acid Degrading and Metal Resistant Bacterium Cupriavidus gilardii CR3

    PubMed Central

    Xiao, Jingfa; Hao, Lirui; Crowley, David E.; Zhang, Zhewen; Yu, Jun; Huang, Ning; Huo, Mingxin; Wu, Jiayan

    2015-01-01

    Cupriavidus sp. are generally heavy metal tolerant bacteria with the ability to degrade a variety of aromatic hydrocarbon compounds, although the degradation pathways and substrate versatilities remain largely unknown. Here we studied the bacterium Cupriavidus gilardii strain CR3, which was isolated from a natural asphalt deposit, and which was shown to utilize naphthenic acids as a sole carbon source. Genome sequencing of C. gilardii CR3 was carried out to elucidate possible mechanisms for the naphthenic acid biodegradation. The genome of C. gilardii CR3 was composed of two circular chromosomes chr1 and chr2 of respectively 3,539,530 bp and 2,039,213 bp in size. The genome for strain CR3 encoded 4,502 putative protein-coding genes, 59 tRNA genes, and many other non-coding genes. Many genes were associated with xenobiotic biodegradation and metal resistance functions. Pathway prediction for degradation of cyclohexanecarboxylic acid, a representative naphthenic acid, suggested that naphthenic acid undergoes initial ring-cleavage, after which the ring fission products can be degraded via several plausible degradation pathways including a mechanism similar to that used for fatty acid oxidation. The final metabolic products of these pathways are unstable or volatile compounds that were not toxic to CR3. Strain CR3 was also shown to have tolerance to at least 10 heavy metals, which was mainly achieved by self-detoxification through ion efflux, metal-complexation and metal-reduction, and a powerful DNA self-repair mechanism. Our genomic analysis suggests that CR3 is well adapted to survive the harsh environment in natural asphalts containing naphthenic acids and high concentrations of heavy metals. PMID:26301592

  10. Genome Sequence Analysis of the Naphthenic Acid Degrading and Metal Resistant Bacterium Cupriavidus gilardii CR3.

    PubMed

    Wang, Xiaoyu; Chen, Meili; Xiao, Jingfa; Hao, Lirui; Crowley, David E; Zhang, Zhewen; Yu, Jun; Huang, Ning; Huo, Mingxin; Wu, Jiayan

    2015-01-01

    Cupriavidus sp. are generally heavy metal tolerant bacteria with the ability to degrade a variety of aromatic hydrocarbon compounds, although the degradation pathways and substrate versatilities remain largely unknown. Here we studied the bacterium Cupriavidus gilardii strain CR3, which was isolated from a natural asphalt deposit, and which was shown to utilize naphthenic acids as a sole carbon source. Genome sequencing of C. gilardii CR3 was carried out to elucidate possible mechanisms for the naphthenic acid biodegradation. The genome of C. gilardii CR3 was composed of two circular chromosomes chr1 and chr2 of respectively 3,539,530 bp and 2,039,213 bp in size. The genome for strain CR3 encoded 4,502 putative protein-coding genes, 59 tRNA genes, and many other non-coding genes. Many genes were associated with xenobiotic biodegradation and metal resistance functions. Pathway prediction for degradation of cyclohexanecarboxylic acid, a representative naphthenic acid, suggested that naphthenic acid undergoes initial ring-cleavage, after which the ring fission products can be degraded via several plausible degradation pathways including a mechanism similar to that used for fatty acid oxidation. The final metabolic products of these pathways are unstable or volatile compounds that were not toxic to CR3. Strain CR3 was also shown to have tolerance to at least 10 heavy metals, which was mainly achieved by self-detoxification through ion efflux, metal-complexation and metal-reduction, and a powerful DNA self-repair mechanism. Our genomic analysis suggests that CR3 is well adapted to survive the harsh environment in natural asphalts containing naphthenic acids and high concentrations of heavy metals.

  11. The variation of nitric acid vapor and nitrate aerosol concentrations near the island of Hawaii

    SciTech Connect

    Lee, G.

    1992-01-01

    Anthropogenic emissions of nitrogen oxides (NO + NO2) are estimated to be half of the global emissions to the atmosphere. To understand the effect of increasing anthropogenic reactive nitrogen inputs to the global atmosphere, one needs to monitor their long-term variations. This dissertation examines the variations of total nitrate (nitric acid vapor and nitrate aerosol) at the Mauna Loa Observatory (MLO), Hawaii. During the Mauna Loa Observatory Photochemistry Experiment (MLOPEX) in May 1988, six different air types were identified at MLO with statistical analysis. They were: (1) volcano-influenced air, (2) stratosphere-like air, (3) boundary-layer air with recent anthropogenic influence, (4) photochemical haze, (5) marine boundary-layer air, (6) well-aged and modified marine air. Samples that might be influenced by marine air or human activity from local islands were eliminated with three meteorological criteria (wind direction, condensation nuclei, and dew point). To examine the negative sampling artifacts of nitric acid vapor due to ground loss, mixing ratio gradients with height were measured during August of 1991. The observed gradients of nitric acid vapor indicated that the long-term samplers at 8 m at MLO may underestimate the free tropospheric nitric acid vapor mixing ratio by about 20%. The three-year mean and median of free tropospheric total nitrate during long-term measurements were 113 pptv and 93 pptv, respectively. Each year, the total nitrate mixing ratios at MLO during the spring and summer were more than a factor of two higher than in fall and winter. NOy from remote continents (Asia and North America) is a likely source of the increased total nitrate at MLO during these seasons. However, other processes also govern the total nitrate mixing ratios, e.g., degree of mixing between free tropospheric air and boundary air at source regions, stratospheric injection, and wet removal of total nitrate.

  12. Variations in a hotspot region of chloroplast DNAs among common wheat and Aegilops revealed by nucleotide sequence analysis.

    PubMed

    Guo, Chang-Hong; Terachi, Toru

    2005-08-01

    The second largest BamHI fragment (B2) of the chloroplast DNA in Triticum (wheat) and Aegilops contains a highly variable region (a hotspot), resulting in four types of B2 of different size, i.e. B2l (10.5 kb), B2m (10.2 kb), B2 (9.6 kb) and B2s (9.4 kb). In order to gain a better understanding of the molecular nature of the variations in length and explain unexpected identity among B2 of Ae. ovata, Ae. speltoides and common wheat (T. aestivum), the nucleotide sequence between a stop codon of rbcL and a HindIII site in cemA in the hotspot was determined for Ae. ovata, Ae. speltoides, Ae. caudata and Ae. mutica. The total number of nucleotides in the region was 2808, 2810, 3302, and 3594 bp, for Ae. speltoides, Ae. ovata, Ae. caudata and Ae. mutica, respectively, and the sequences were compared with the corresponding ones of Ae. crassa 4x, T. aestivum and Ae. squarrosa. Compared with the largest B2l fragment of Ae. mutica, a 791 bp and a 793 bp deletion were found in Ae. speltoides and Ae. ovata, respectively, and the possible site of deletion in the two species is the same as that of T. aestivum. However, the deleted segment in Ae. ovata is 2 bp longer than that of Ae. speltoides (and T. aestivum), demonstrating that recurrent deletions had occurred in the chloroplast genomes of both species. Comparison of the sequences from Ae. caudata and Ae. crassa 4x with that of Ae. mutica revealed a 289 bp and a 61 bp deletion at the same site in Ae. caudata and Ae. crassa 4x, respectively. Sequence comparison using wild Aegilops plants showed that the large length variations in the hotspot are fixed in each species. A considerable number of polymorphisms are observed in a loop in the 3' of rbcL. The study reveals the relative importance of the large and small indels and minute inversions to account for variations in the chloroplast genomes among closely related species.

  13. Functional and genetic analysis of haplotypic sequence variation at the nicastrin genomic locus.

    PubMed

    Hamilton, Gillian; Killick, Richard; Lambert, Jean-Charles; Amouyel, Philippe; Carrasquillo, Minerva M; Pankratz, V Shane; Graff-Radford, Neill R; Dickson, Dennis W; Petersen, Ronald C; Younkin, Steven G; Powell, John F; Wade-Martins, Richard

    2012-08-01

    Nicastrin (NCSTN) is a component of the γ-secretase complex and therefore potentially a candidate risk gene for Alzheimer's disease. Here, we have developed a novel functional genomics methodology to express common locus haplotypes to assess functional differences. DNA recombination was used to engineer 5 bacterial artificial chromosomes (BACs) to each express a different haplotype of the NCSTN locus. Each NCSTN-BAC was delivered to knockout nicastrin (Ncstn(-/-)) cells and clonal NCSTN-BAC(+)/Ncstn(-/-) cell lines were created for functional analyses. We showed that all NCSTN-BAC haplotypes expressed nicastrin protein and rescued γ-secretase activity and amyloid beta (Aβ) production in NCSTN-BAC(+)/Ncstn(-/-) lines. We then showed that genetic variation at the NCSTN locus affected alternative splicing in human postmortem brain tissue. However, there was no robust functional difference between clonal cell lines rescued by each of the 5 different haplotypes. Finally, there was no statistically significant association of NCSTN with disease risk in the 4 cohorts. We therefore conclude that it is unlikely that common variation at the NCSTN locus is a risk factor for Alzheimer's disease.

  14. Cross-amplification and sequence variation of microsatellite loci in Eurasian hard pines.

    PubMed

    González-Martínez, S C; Robledo-Arnuncio, J J; Collada, C; Díaz, A; Williams, C G; Alía, R; Cervera, M T

    2004-06-01

    Microsatellite transfer across coniferous species is a valued methodology because de novo development for each species is costly and there are many species with only a limited commodity value. Cross-species amplification of orthologous microsatellite regions provides valuable information on mutational and evolutionary processes affecting these loci. We tested 19 nuclear microsatellite markers from Pinus taeda L. (subsection Australes) and three from P. sylvestris L. (subsection Pinus) on seven Eurasian hard pine species (P. uncinata Ram., P. sylvestris L., P. nigra Arn., P. pinaster Ait., P. halepensis Mill., P. pinea L. and P. canariensis Sm.). Transfer rates to species in subsection Pinus (36-59%) were slightly higher than those to subsections Pineae and Pinaster (32-45%). Half of the trans-specific microsatellites were found to be polymorphic over evolutionary times of approximately 100 million years (ten million generations). Sequencing of three trans-specific microsatellites showed conserved repeat and flanking regions. Both a decrease in the number of perfect repeats in the non-focal species and a polarity for mutation, the latter defined as a higher substitution rate in the flanking sequence regions close to the repeat motifs, were observed in the trans-specific microsatellites. The transfer of microsatellites among hard pine species proved to be useful for obtaining highly polymorphic markers in a wide range of species, thereby providing new tools for population and quantitative genetic studies.

  15. Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using

    DOEpatents

    Weier, Heinz-Ulrich G.; Gray, Joe W.

    1995-01-01

    A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity.

  16. Unconventional amino acid sequence of the sun anemone (Stoichactis helianthus) polypeptide neurotoxin

    SciTech Connect

    Kem, W.; Dunn, B.; Parten, B.; Pennington, M.; Price, D.

    1986-05-01

    A 5000 dalton polypeptide neurotoxin (Sh-NI) purified by G50 Sephadex, P-cellulose, and SP-Sephadex chromatography was homogeneous by isoelectric focusing. Sh-NI was highly toxic to crayfish (LD50 0.6 μg/kg) but without effect upon mice at 15,000 μg/kg (i.p. injection). The reduced, ³H-carboxymethylated toxin and its fragments were subjected to automatic Edman degradation and the resulting PTH-amino acids were identified by HPLC, back hydrolysis, and scintillation counting. Peptides resulting from proteolytic (clostripain, staphylococcal protease) and chemical (tryptophan) cleavage were sequenced. The sequence is: AACKCDDEGPDIRTAPLTGTVDLGSCNAGWEKCASYYTIIADCCRKKK. This sequence differs considerably from the homologous Anemonia and Anthopleura toxins; many of the identical residues (6 half-cystines, G9, P10, R13, G19, G29, W30) are probably critical for folding rather than receptor recognition. However, the Sh-NI sequence closely resembles Radioanthus macrodactylus neurotoxin III and R. paumotensis II. The authors propose that Sh-NI and related Radioanthus toxins act upon a different site on the sodium channel.
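
    The residue positions quoted in this abstract can be checked directly against the printed sequence. The following is a minimal sketch, assuming the positions are 1-based and counted from the first residue as printed; the helper names are illustrative and do not come from the record.

```python
# Sh-NI sequence exactly as printed in the abstract (one-letter codes).
SH_NI = "AACKCDDEGPDIRTAPLTGTVDLGSCNAGWEKCASYYTIIADCCRKKK"

# Residues the abstract lists as identical to the Anemonia/Anthopleura toxins,
# written as (expected letter, 1-based position). The 1-based convention is
# an assumption.
CONSERVED = [("G", 9), ("P", 10), ("R", 13), ("G", 19), ("G", 29), ("W", 30)]


def residue_at(seq, pos):
    """Return the residue at a 1-based position."""
    return seq[pos - 1]


def check_conserved(seq, expected):
    """Map labels such as 'G9' to True when the sequence has that residue there."""
    return {f"{aa}{pos}": residue_at(seq, pos) == aa for aa, pos in expected}


length = len(SH_NI)               # number of residues in the printed sequence
half_cystines = SH_NI.count("C")  # the abstract reports 6 half-cystines
checks = check_conserved(SH_NI, CONSERVED)

print(length, half_cystines, checks)
```

    Under that convention the printed sequence has 48 residues and 6 cysteines, and every listed position carries the stated residue.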

  17. Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using

    DOEpatents

    Weier, H.U.G.; Gray, J.W.

    1995-06-27

    A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity. 18 figs.

  18. Sequence-defined bioactive macrocycles via an acid-catalysed cascade reaction

    NASA Astrophysics Data System (ADS)

    Porel, Mintu; Thornlow, Dana N.; Phan, Ngoc N.; Alabi, Christopher A.

    2016-06-01

    Synthetic macrocycles derived from sequence-defined oligomers are a unique structural class whose ring size, sequence and structure can be tuned via precise organization of the primary sequence. Similar to peptides and other peptidomimetics, these well-defined synthetic macromolecules become pharmacologically relevant when bioactive side chains are incorporated into their primary sequence. In this article, we report the synthesis of oligothioetheramide (oligoTEA) macrocycles via a one-pot acid-catalysed cascade reaction. The versatility of the cyclization chemistry and modularity of the assembly process was demonstrated via the synthesis of >20 diverse oligoTEA macrocycles. Structural characterization via NMR spectroscopy revealed the presence of conformational isomers, which enabled the determination of local chain dynamics within the macromolecular structure. Finally, we demonstrate the biological activity of oligoTEA macrocycles designed to mimic facially amphiphilic antimicrobial peptides. The preliminary results indicate that macrocyclic oligoTEAs with just two-to-three cationic charge centres can elicit potent antibacterial activity against Gram-positive and Gram-negative bacteria.

  19. Vertical and Seasonal Variations of Bacterioplankton Subgroups with Different Nucleic Acid Contents: Possible Regulation by Phosphorus

    PubMed Central

    Nishimura, Yoko; Kim, Chulgoo; Nagata, Toshi

    2005-01-01

    We used flow cytometry to examine seasonal variations in basin-scale distributions of bacterioplankton in Lake Biwa, Japan, a large mesotrophic freshwater lake with an oxygenated hypolimnion. The bacterial communities were divided into three subgroups: bacteria with very high nucleic acid contents (VHNA bacteria), bacteria with high nucleic acid contents (HNA bacteria), and bacteria with low nucleic acid contents (LNA bacteria). During the thermal stratification period, the relative abundance of VHNA bacteria (%VHNA) increased with depth, while the reverse trend was evident for LNA bacteria. Seasonally, the %VHNA was strongly positively correlated (r = 0.87; P < 0.001) with the concentration of dissolved inorganic phosphorus, but not with the concentration of chlorophyll a. The growth of VHNA bacteria was significantly enhanced by addition of phosphate or phosphate plus glucose but not by addition of glucose alone. Although the growth of VHNA and HNA bacteria generally exceeded that of LNA bacteria, our data also revealed that LNA bacteria grew faster than and were grazed as fast as VHNA bacteria in late August, when nutrient limitation was presumably severe. Based on these results, we hypothesize that in severely P-limited environments such as Lake Biwa, P limitation exerts more severe constraints on the growth of bacterial groups with higher nucleic acid contents, which allows LNA bacteria to be competitive and become an important component of the microbial loop. PMID:16204494

  20. Discovery of a novel amino acid racemase through exploration of natural variation in Arabidopsis thaliana

    DOE PAGES

    Strauch, Renee C.; Svedin, Elisabeth; Dilkes, Brian; ...

    2015-08-31

    Plants produce diverse low-molecular-weight compounds via specialized metabolism. Discovery of the pathways underlying production of these metabolites is an important challenge for harnessing the huge chemical diversity and catalytic potential in the plant kingdom for human uses, but this effort is often encumbered by the necessity to initially identify compounds of interest or purify a catalyst involved in their synthesis. Here, as an alternative approach, we have performed untargeted metabolite profiling and genome-wide association analysis on 440 natural accessions of Arabidopsis thaliana. This approach allowed us to establish genetic linkages between metabolites and genes. Investigation of one of the metabolite-gene associations led to the identification of N-malonyl-D-allo-isoleucine, and the discovery of a novel amino acid racemase involved in its biosynthesis. This finding provides, to our knowledge, the first functional characterization of a eukaryotic member of a large and widely conserved phenazine biosynthesis protein PhzF-like protein family. Unlike most of known eukaryotic amino acid racemases, the newly discovered enzyme does not require pyridoxal 5'-phosphate for its activity. In conclusion, this study thus identifies a new d-amino acid racemase gene family and advances our knowledge of plant d-amino acid metabolism that is currently largely unexplored. As a result, it also demonstrates that exploitation of natural metabolic variation by integrating metabolomics with genome-wide association is a powerful approach for functional genomics study of specialized metabolism.

  1. Complete amino acid sequence of ananain and a comparison with stem bromelain and other plant cysteine proteases.

    PubMed Central

    Lee, K L; Albee, K L; Bernasconi, R J; Edmunds, T

    1997-01-01

    The amino acid sequences of ananain (EC 3.4.22.31) and stem bromelain (EC 3.4.22.32), two cysteine proteases from pineapple stem, are similar, yet ananain and stem bromelain possess distinct specificities towards synthetic peptide substrates and different reactivities towards the cysteine protease inhibitors E-64 and chicken egg white cystatin. We present here the complete amino acid sequence of ananain and compare it with the reported sequences of pineapple stem bromelain, papain and chymopapain from papaya and actinidin from kiwifruit. Ananain comprises 216 residues with a theoretical mass of 23464 Da. This primary structure includes a sequence insert between residues 170 and 174 not present in stem bromelain or papain and a hydrophobic series of amino acids adjacent to His-157. It is possible that these sequence differences contribute to the different substrate and inhibitor specificities exhibited by ananain and stem bromelain. PMID:9355753

  2. Microbial community dynamics in bioaugmented sequencing batch reactors for bromoamine acid removal.

    PubMed

    Qu, Yuanyuan; Zhou, Jiti; Wang, Jing; Fu, Xiang; Xing, Linlin

    2005-05-01

    Sphingomonas xenophaga QYY with the ability to degrade bromoamine acid (BAA) was previously isolated from sludge samples. The enhancement of BAA removal by strain QYY in sequencing batch reactors (SBRs) was investigated in this study. The results showed that augmented SBRs exhibited stronger abilities to degrade BAA than the non-augmented control. In order to estimate the relationship between community dynamics and function of augmented SBRs, a combined method based on fingerprints (ribosomal intergenic spacer analysis, RISA) and 16S rRNA gene sequencing was used. The results indicated that the microbial community dynamics were substantially changed, and the introduced strain QYY was persistent in the augmented systems. This study suggests that it is feasible and potentially useful to enhance BAA removal using BAA-degrading bacteria, such as S. xenophaga QYY.

  3. [Measurement of the amino acid sequence for the fusion protein FP3 with LC-MS/MS].

    PubMed

    Li, Xiang; Gao, Xiang-Dong; Tao, Lei; Pei, De-Ning; Guo, Ying; Rao, Chun-Ming; Wang, Jun-Zhi

    2012-02-01

    The amino acid sequence of the fusion protein FP3 was measured by two types of LC-MS/MS and its primary structure was confirmed. After reduction and alkylation, the protein was digested with trypsin and glycosyl groups in the glycopeptides were removed by PNGase F. The mixed peptides were separated by LC, then Q-TOF and ion trap tandem mass spectrometry were used to measure the b and y fragment ions of each peptide to analyze the amino acid sequence of fusion protein FP3. Seventy-six percent of the full amino acid sequence of the fusion protein FP3 was measured by LC-ESI-Q-TOF, with the remaining 24% completed by LC-ESI-Trap. As LC-MS and tandem mass spectrometry are rapid, sensitive, and accurate for measuring protein amino acid sequences, they are an important approach to the structural analysis and identification of recombinant proteins.

  4. NullSeq: A Tool for Generating Random Coding Sequences with Desired Amino Acid and GC Contents

    PubMed Central

    Liu, Sophia S.; Hockenberry, Adam J.; Lancichinetti, Andrea; Jewett, Michael C.

    2016-01-01

    The existence of over- and under-represented sequence motifs in genomes provides evidence of selective evolutionary pressures on biological mechanisms such as transcription, translation, ligand-substrate binding, and host immunity. In order to accurately identify motifs and other genome-scale patterns of interest, it is essential to be able to generate accurate null models that are appropriate for the sequences under study. While many tools have been developed to create random nucleotide sequences, protein coding sequences are subject to a unique set of constraints that complicates the process of generating appropriate null models. There are currently no tools available that allow users to create random coding sequences with specified amino acid composition and GC content for the purpose of hypothesis testing. Using the principle of maximum entropy, we developed a method that generates unbiased random sequences with pre-specified amino acid and GC content, which we have developed into a python package. Our method is the simplest way to obtain maximally unbiased random sequences that are subject to GC usage and primary amino acid sequence constraints. Furthermore, this approach can easily be expanded to create unbiased random sequences that incorporate more complicated constraints such as individual nucleotide usage or even di-nucleotide frequencies. The ability to generate correctly specified null models will allow researchers to accurately identify sequence motifs which will lead to a better understanding of biological processes as well as more effective engineering of biological systems. PMID:27835644
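
    The idea of sampling coding sequences that keep a fixed protein sequence while steering GC content can be sketched as follows. This is a minimal illustration and not the NullSeq package or its API: the function name `sample_cds` and the tilt parameter `lam` are hypothetical. It assumes the standard genetic code and uses the exponential-family weighting p(codon) proportional to exp(lam * GC count), which is the maximum-entropy form under a constraint on expected GC count.

```python
import itertools
import math
import random

# Standard genetic code in NCBI table 1 order (bases ordered T, C, A, G).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TO_AA = {
    "".join(codon): aa
    for codon, aa in zip(itertools.product(BASES, repeat=3), AMINO)
}

# Group synonymous codons by the amino acid they encode.
SYNONYMS = {}
for codon, aa in CODON_TO_AA.items():
    SYNONYMS.setdefault(aa, []).append(codon)


def gc_count(seq):
    """Number of G or C bases in a nucleotide string."""
    return sum(base in "GC" for base in seq)


def sample_cds(protein, lam, rng):
    """Sample one coding sequence for `protein` (hypothetical helper).

    lam = 0 picks synonymous codons uniformly; lam > 0 favours GC-rich
    codons and lam < 0 favours AT-rich codons.
    """
    codons = []
    for aa in protein:
        options = SYNONYMS[aa]
        weights = [math.exp(lam * gc_count(c)) for c in options]
        codons.append(rng.choices(options, weights=weights, k=1)[0])
    return "".join(codons)


def translate(cds):
    """Translate a coding sequence back to one-letter amino acid codes."""
    return "".join(CODON_TO_AA[cds[i:i + 3]] for i in range(0, len(cds), 3))


rng = random.Random(0)
protein = "MKTAYIAKQR"  # arbitrary illustrative peptide, not from the paper
low_gc = sample_cds(protein, -50.0, rng)
high_gc = sample_cds(protein, 50.0, rng)
print(low_gc, gc_count(low_gc))
print(high_gc, gc_count(high_gc))
```

    In a sketch like this, a target GC fraction would be reached by tuning `lam` (for example by bisection) until the expected GC count matches the target; how the published package solves for its parameters is not stated in the abstract.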

  5. Geographical distribution and oncogenic risk association of human papillomavirus type 58 E6 and E7 sequence variations

    PubMed Central

    Chan, Paul K.S.; Zhang, Chuqing; Park, Jong-Sup; Smith-McCune, Karen K.; Palefsky, Joel M.; Giovannelli, Lucia; Coutlée, Francois; Hibbitts, Samantha; Konno, Ryo; Settheetham-Ishida, Wannapa; Chu, Tang-Yuan; Ferrera, Annabelle; Picconi, María Alejandra; De Marco, Federico; Woo, Yin-Ling; Raiol, Tainá; Piña-Sánchez, Patricia; Bae, Jeong-Hoon; Wong, Martin C.S.; Chirenje, Mike Z.; Magure, Tsitsi; Moscicki, Anna-Barbara; Fiander, Alison N.; Capra, Giuseppina; Ki, Eun Young; Tan, Yi; Chen, Zigui; Burk, Robert D.; Chan, Martin C.W.; Cheung, Tak-Hong; Pim, David; Banks, Lawrence

    2014-01-01

    Human papillomavirus (HPV) 58 accounts for a notable proportion of cervical cancers in East Asia and parts of Latin America, but it is uncommon elsewhere. The reason for such ethnogeographical predilection is unknown. In our study, nucleotide sequences of E6 and E7 genes of 401 HPV58 isolates collected from 15 countries/cities across four continents were examined. Phylogenetic relationship, geographical distribution and risk association of nucleotide sequence variations were analyzed. We found that the E6 genes of HPV58 variants were more conserved than E7. Thus, E6 is a more appropriate target for type-specific detection, whereas E7 is more appropriate for strain differentiation. The frequency of sequence variation varied geographically. Africa had significantly more isolates with E6-367A (D86E) but significantly fewer isolates with E6-203G, -245G, -367C (prototype-like) than other regions (p ≤ 0.003). E7-632T, -760A (T20I, G63S) was more frequently found in Asia, and E7-793G (T74A) was more frequent in Africa (p < 0.001). Variants with T20I and G63S substitutions at E7 conferred a significantly higher risk for cervical intraepithelial neoplasia grade III and invasive cervical cancer compared to other HPV58 variants (odds ratio = 4.44, p = 0.007). In conclusion, T20I and/or G63S substitution(s) at E7 of HPV58 is/are associated with a higher risk for cervical neoplasia. These substitutions are more commonly found in Asia and the Americas, which may account for the higher disease attribution of HPV58 in these areas. PMID:23136059

  6. CoVaMa: Co-Variation Mapper for disequilibrium analysis of mutant loci in viral populations using next-generation sequence data

    PubMed Central

    Routh, Andrew; Chang, Max W.; Okulicz, Jason F.; Johnson, John E.; Torbett, Bruce E.

    2015-01-01

    Next-generation sequencing (NGS) has transformed our understanding of the dynamics and diversity of virus populations for human pathogens and model systems alike. Due to the sensitivity and depth of coverage in NGS, it is possible to measure the frequency of mutations that may be present even at vanishingly low frequencies within the viral population. Here, we describe a simple bioinformatic pipeline called CoVaMa (Co-Variation Mapper) scripted in Python that detects correlated patterns of mutations in a viral sample. Our algorithm takes NGS alignment data and populates large matrices of contingency tables that correspond to every possible pairwise interaction of nucleotides in the viral genome or amino acids in the chosen open reading frame. These tables are then analysed using classical linkage disequilibrium to detect and report evidence of epistasis. We test our analysis with simulated data and then apply the approach to find epistatically linked loci in Flock House Virus genomic RNA grown under controlled cell culture conditions. We also reanalyze NGS data from a large cohort of HIV infected patients and find correlated amino acid substitution events in the protease gene that have arisen in response to anti-viral therapy. This both confirms previous findings and suggests new pairs of interactions within HIV protease. The script is publicly available at http://sourceforge.net/projects/covama PMID:26408523
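
    The contingency-table step described above can be illustrated with the textbook linkage disequilibrium statistics for a single pair of sites. This is a hedged sketch, not CoVaMa's code: the function name and argument layout are hypothetical, and the abstract does not say which statistic the tool reports, so the coefficient D and the squared correlation r^2 are assumed here.

```python
def linkage_disequilibrium(n_AB, n_Ab, n_aB, n_ab):
    """Return (D, r2) from a 2x2 table of read counts for two sites.

    n_AB: reads carrying variant A at site 1 and variant B at site 2
    n_Ab: variant A at site 1 only
    n_aB: variant B at site 2 only
    n_ab: neither variant
    """
    total = n_AB + n_Ab + n_aB + n_ab
    if total == 0:
        raise ValueError("contingency table is empty")

    p_AB = n_AB / total           # observed joint frequency
    p_A = (n_AB + n_Ab) / total   # marginal frequency of A
    p_B = (n_AB + n_aB) / total   # marginal frequency of B

    # D is the excess of the joint frequency over the independence expectation.
    d = p_AB - p_A * p_B

    # r2 normalises D squared by the variances at the two sites.
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r2 = d * d / denom if denom > 0 else 0.0
    return d, r2


# Perfectly linked variants versus statistically independent variants.
linked = linkage_disequilibrium(50, 0, 0, 50)
independent = linkage_disequilibrium(25, 25, 25, 25)
print(linked, independent)
```

    Applied over every pair of positions covered by the same reads, values of D far from zero would flag candidate co-varying sites; any significance threshold is left open here because the abstract does not give one.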

  7. Morphological transformation of calcite crystal growth by prismatic "acidic" polypeptide sequences.

    SciTech Connect

    Kim, I; Giocondi, J L; Orme, C A; Collino, J; Evans, J S

    2007-02-13

    Many of the interesting mechanical and materials properties of the mollusk shell are thought to stem from the prismatic calcite crystal assemblies within this composite structure. It is now evident that proteins play a major role in the formation of these assemblies. Recently, a superfamily of 7 conserved prismatic layer-specific mollusk shell proteins, Asprich, were sequenced, and the 42 AA C-terminal sequence region of this protein superfamily was found to introduce surface voids or porosities on calcite crystals in vitro. Using AFM imaging techniques, we further investigate the effect that this 42 AA domain (Fragment-2) and its constituent subdomains, DEAD-17 and Acidic-2, have on the morphology and growth kinetics of calcite dislocation hillocks. We find that Fragment-2 adsorbs on terrace surfaces and pins acute steps, accelerates then decelerates the growth of obtuse steps, forms clusters and voids on terrace surfaces, and transforms calcite hillock morphology from a rhombohedral form to a rounded one. These results mirror yet are distinct from some of the earlier findings obtained for nacreous polypeptides. The subdomains Acidic-2 and DEAD-17 were found to accelerate then decelerate obtuse steps and induce oval rather than rounded hillock morphologies. Unlike DEAD-17, Acidic-2 does form clusters on terrace surfaces and exhibits stronger obtuse velocity inhibition effects than either DEAD-17 or Fragment-2. Interestingly, a 1:1 mixture of both subdomains induces an irregular polygonal morphology to hillocks, and exhibits the highest degree of acute step pinning and obtuse step velocity inhibition. This suggests that there is some interplay between subdomains within an intra (Fragment-2) or intermolecular (1:1 mixture) context, and sequence interplay phenomena may be employed by biomineralization proteins to exert net effects on crystal growth and morphology.

  8. Polarimetric Variations of Binary Stars. V. Pre-Main-Sequence Spectroscopic Binaries Located in Ophiuchus and Scorpius

    NASA Astrophysics Data System (ADS)

    Manset, N.; Bastien, P.

    2003-06-01

    We present polarimetric observations of seven pre-main-sequence (PMS) spectroscopic binaries located in the ρ Ophiuchus and Upper Scorpius star-forming regions (SFRs). The average observed polarizations at 7660 Å are between 0.5% and 3.5%. After estimates of the interstellar polarization are removed, all binaries have an intrinsic polarization above 0.4%, even though most of them do not present other evidence of circumstellar dust. Two binaries, NTTS 162814-2427 and NTTS 162819-2423S, present high levels of intrinsic polarization between 1.5% and 2.1%, in agreement with the fact that other observations (photometry, spectroscopy) indicate the presence of circumstellar dust. Tests reveal that all seven PMS binaries have a statistically variable or possibly variable polarization. Combining these results with our previous sample of binaries located in the Taurus, Auriga, and Orion SFRs, 68% of the binaries have an intrinsic polarization above 0.5%, and 90% of the binaries are polarimetrically variable or possibly variable. NTTS 160814-1857, 162814-2427, and 162819-2423S are clearly polarimetrically variable. The first two also exhibit phase-locked variations over ~10 and ~40 orbits, respectively. Statistically, NTTS 160905-1859 is possibly variable, but it shows periodic variations not detected by the statistical tests; those variations are not phase-locked and are present only for short intervals of time. The amplitudes of the variations reach a few tenths of a percent, greater than for the previously studied PMS binaries located in the Taurus, Orion, and Auriga SFRs. The high-eccentricity system NTTS 162814-2427 shows single-periodic variations, in agreement with our previous numerical simulations. We compare the observations with some of our numerical simulations and also show that an analysis of the periodic polarimetric variations with the Brown, McLean, & Emslie (BME) formalism to find the orbital inclination is for the moment premature: nonperiodic events

  9. Transcriptome-wide comparison of sequence variation in divergent ecotypes of kokanee salmon

    PubMed Central

    2013-01-01

    Background: High-throughput next-generation sequencing technology has enabled the collection of genome-wide sequence data and revolutionized single nucleotide polymorphism (SNP) discovery in a broad range of species. When analyzed within a population genomics framework, SNP-based genotypic data may be used to investigate questions of evolutionary, ecological, and conservation significance in natural populations of non-model organisms. Kokanee salmon are recently diverged freshwater populations of sockeye salmon (Oncorhynchus nerka) that exhibit reproductive ecotypes (stream-spawning and shore-spawning) in lakes throughout western North America and northeast Asia. Current conservation and management strategies may treat these ecotypes as discrete stocks; however, their recent divergence and low levels of gene flow make in-season genetic stock identification a challenge. The development of genome-wide SNP markers is an essential step towards fine-scale stock identification, and may enable a direct investigation of the genetic basis of ecotype divergence. Results: We used pooled cDNA samples from both ecotypes of kokanee to generate 750 million base pairs of transcriptome sequence data. These raw data were assembled into 11,074 high coverage contigs from which we identified 32,699 novel single nucleotide polymorphisms. A subset of these putative SNPs was validated using high-resolution melt analysis and Sanger resequencing to genotype independent samples of kokanee and anadromous sockeye salmon. We also identified a number of contigs that were composed entirely of reads from a single ecotype, which may indicate regions of differential gene expression between the two reproductive ecotypes. In addition, we found some evidence for greater pathogen load among the kokanee sampled in stream-spawning habitats, suggesting a possible evolutionary advantage to shore-spawning that warrants further study. Conclusions: This study provides novel genomic resources to support population

  10. Serial Gene Losses and Foreign DNA Underlie Size and Sequence Variation in the Plastid Genomes of Diatoms

    PubMed Central

    Ruck, Elizabeth C.; Nakov, Teofil; Jansen, Robert K.; Theriot, Edward C.; Alverson, Andrew J.

    2014-01-01

    Photosynthesis by diatoms accounts for roughly one-fifth of global primary production, but despite this, relatively little is known about their plastid genomes. We report the completely sequenced plastid genomes for eight phylogenetically diverse diatoms and show them to be variable in size, gene and foreign sequence content, and gene order. The genomes contain a core set of 122 protein-coding genes, with 15 additional genes exhibiting complex patterns of 1) gene losses at varying phylogenetic scales, 2) functional transfers to the nucleus, 3) gene duplication, divergence, and differential retention of paralogs, and 4) acquisitions of putatively functional recombinase genes from resident plasmids. The newly sequenced genomes also contain several previously unreported genes, highlighting how poorly characterized diatom plastid genomes are overall. Genome size variation reflects major expansions of the inverted repeat region in some cases but, more commonly, large-scale expansions of intergenic regions, many of which contain unique open reading frames of likely foreign origin. Although many gene clusters are conserved across species, rearrangements appear to be frequent in most lineages. PMID:24567305

  11. Comparative Mitogenomics of the Genus Odontobutis (Perciformes: Gobioidei: Odontobutidae) Revealed Conserved Gene Rearrangement and High Sequence Variations

    PubMed Central

    Ma, Zhihong; Yang, Xuefen; Bercsenyi, Miklos; Wu, Junjie; Yu, Yongyao; Wei, Kaijian; Fan, Qixue; Yang, Ruibin

    2015-01-01

    To understand the molecular evolution of mitochondrial genomes (mitogenomes) in the genus Odontobutis, the mitogenome of Odontobutis yaluensis was sequenced and compared with those of another four Odontobutis species. Our results displayed similar mitogenome features among species in genome organization, base composition, codon usage, and gene rearrangement. The identical gene rearrangement of trnS-trnL-trnH tRNA cluster observed in mitogenomes of these five closely related freshwater sleepers suggests that this unique gene order is conserved within Odontobutis. Additionally, the present gene order and the positions of associated intergenic spacers of these Odontobutis mitogenomes indicate that this unusual gene rearrangement results from tandem duplication and random loss of large-scale gene regions. Moreover, these mitogenomes exhibit a high level of sequence variation, mainly due to the differences of corresponding intergenic sequences in gene rearrangement regions and the heterogeneity of tandem repeats in the control regions. Phylogenetic analyses support Odontobutis species with shared gene rearrangement forming a monophyletic group, and the interspecific phylogenetic relationships are associated with structural differences among their mitogenomes. The present study contributes to understanding the evolutionary patterns of Odontobutidae species. PMID:26492246

  12. Comparative Mitogenomics of the Genus Odontobutis (Perciformes: Gobioidei: Odontobutidae) Revealed Conserved Gene Rearrangement and High Sequence Variations.

    PubMed

    Ma, Zhihong; Yang, Xuefen; Bercsenyi, Miklos; Wu, Junjie; Yu, Yongyao; Wei, Kaijian; Fan, Qixue; Yang, Ruibin

    2015-10-20

    To understand the molecular evolution of mitochondrial genomes (mitogenomes) in the genus Odontobutis, the mitogenome of Odontobutis yaluensis was sequenced and compared with those of another four Odontobutis species. Our results displayed similar mitogenome features among species in genome organization, base composition, codon usage, and gene rearrangement. The identical gene rearrangement of trnS-trnL-trnH tRNA cluster observed in mitogenomes of these five closely related freshwater sleepers suggests that this unique gene order is conserved within Odontobutis. Additionally, the present gene order and the positions of associated intergenic spacers of these Odontobutis mitogenomes indicate that this unusual gene rearrangement results from tandem duplication and random loss of large-scale gene regions. Moreover, these mitogenomes exhibit a high level of sequence variation, mainly due to the differences of corresponding intergenic sequences in gene rearrangement regions and the heterogeneity of tandem repeats in the control regions. Phylogenetic analyses support Odontobutis species with shared gene rearrangement forming a monophyletic group, and the interspecific phylogenetic relationships are associated with structural differences among their mitogenomes. The present study contributes to understanding the evolutionary patterns of Odontobutidae species.

  13. Sequence selective recognition of double-stranded RNA using triple helix-forming peptide nucleic acids.

    PubMed

    Zengeya, Thomas; Gupta, Pankaj; Rozners, Eriks

    2014-01-01

    Noncoding RNAs are attractive targets for molecular recognition because of the central role they play in gene expression. Since most noncoding RNAs are in a double-helical conformation, recognition of such structures is a formidable problem. Herein, we describe a method for sequence-selective recognition of biologically relevant double-helical RNA (illustrated on ribosomal A-site RNA) using peptide nucleic acids (PNA) that form a triple helix in the major groove of RNA under physiologically relevant conditions. Protocols for PNA preparation and binding studies using isothermal titration calorimetry are described in detail.

  14. Fast computational methods for predicting protein structure from primary amino acid sequence

    DOEpatents

    Agarwal, Pratul Kumar

    2011-07-19

    The present invention provides a method utilizing primary amino acid sequence of a protein, energy minimization, molecular dynamics and protein vibrational modes to predict three-dimensional structure of a protein. The present invention also determines possible intermediates in the protein folding pathway. The present invention has important applications to the design of novel drugs as well as protein engineering. The present invention predicts the three-dimensional structure of a protein independent of size of the protein, overcoming a significant limitation in the prior art.

  15. Analysis of DNA sequence variation within marine species using Beta-coalescents

    PubMed Central

    Steinrücken, Matthias; Birkner, Matthias; Blath, Jochen

    2013-01-01

    We apply recently developed inference methods based on general coalescent processes to DNA sequence data obtained from various marine species. Several of these species are believed to exhibit so-called shallow gene genealogies, potentially due to extreme reproductive behaviour, e.g. via Hedgecock’s “reproduction sweepstakes”. Besides the data analysis, in particular the inference of mutation rates and the estimation of the (real) time to the most recent common ancestor, we briefly address the question whether the genealogies might be adequately described by so-called Beta coalescents (as opposed to Kingman’s coalescent), allowing multiple mergers of genealogies. The choice of the underlying coalescent model for the genealogy has drastic implications for the estimation of the above quantities, in particular the real-time embedding of the genealogy. PMID:23376155

  16. Assessment of megabase-scale somatic copy number variation using single-cell sequencing

    PubMed Central

    Knouse, Kristin A.; Wu, Jie; Amon, Angelika

    2016-01-01

    Megabase-scale copy number variants (CNVs) can have profound phenotypic consequences. Germline CNVs of this magnitude are associated with disease and experience negative selection. However, it is unknown whether organismal function requires that every cell maintain a balanced genome. It is possible that large somatic CNVs are tolerated or even positively selected. Single-cell sequencing is a useful tool for assessing somatic genomic heterogeneity, but its performance in CNV detection has not been rigorously tested. Here, we develop an approach that allows for reliable detection of megabase-scale CNVs in single somatic cells. We discover large CNVs in 8%–9% of cells across tissues and identify two recurrent CNVs. We conclude that large CNVs can be tolerated in subpopulations of cells, and particular CNVs are relatively prevalent within and across individuals. PMID:26772196

  17. Assessment of megabase-scale somatic copy number variation using single-cell sequencing.

    PubMed

    Knouse, Kristin A; Wu, Jie; Amon, Angelika

    2016-03-01

    Megabase-scale copy number variants (CNVs) can have profound phenotypic consequences. Germline CNVs of this magnitude are associated with disease and experience negative selection. However, it is unknown whether organismal function requires that every cell maintain a balanced genome. It is possible that large somatic CNVs are tolerated or even positively selected. Single-cell sequencing is a useful tool for assessing somatic genomic heterogeneity, but its performance in CNV detection has not been rigorously tested. Here, we develop an approach that allows for reliable detection of megabase-scale CNVs in single somatic cells. We discover large CNVs in 8%-9% of cells across tissues and identify two recurrent CNVs. We conclude that large CNVs can be tolerated in subpopulations of cells, and particular CNVs are relatively prevalent within and across individuals.

  18. Fluorescence energy transfer as a probe for nucleic acid structures and sequences.

    PubMed Central

    Mergny, J L; Boutorine, A S; Garestier, T; Belloc, F; Rougée, M; Bulychev, N V; Koshkin, A A; Bourson, J; Lebedev, A V; Valeur, B

    1994-01-01

    The primary or secondary structure of single-stranded nucleic acids has been investigated with fluorescent oligonucleotides, i.e., oligonucleotides covalently linked to a fluorescent dye. Five different chromophores were used: 2-methoxy-6-chloro-9-amino-acridine, coumarin 500, fluorescein, rhodamine and ethidium. The chemical synthesis of derivatized oligonucleotides is described. Hybridization of two fluorescent oligonucleotides to adjacent nucleic acid sequences led to fluorescence excitation energy transfer between the donor and the acceptor dyes. This phenomenon was used to probe primary and secondary structures of DNA fragments and the orientation of oligodeoxynucleotides synthesized with the alpha-anomers of nucleoside units. Fluorescence energy transfer can be used to reveal the formation of hairpin structures and the translocation of genes between two chromosomes. PMID:8152922

  19. Amino acid sequence of two neurotoxins from the venom of the Egyptian black snake (Walterinnesia aegyptia).

    PubMed

    Samejima, Y; Aoki-Tomomatsu, Y; Yanagisawa, M; Mebs, D

    1997-02-01

    The venom of the Egyptian black snake Walterinnesia aegyptia contains at least three toxins, which act postsynaptically to block the neuromuscular transmission of isolated rat phrenic nerve-diaphragm and chicken biventer cervicis muscle. The complete amino acid sequence of the two toxins, W-III and W-IV, consisting of 62 amino acid residues, was elucidated by Edman degradation of fragments obtained after Staphylococcus aureus protease and prolylpeptidase digestion. Although the toxins exhibit close structural homology to other short-chain postsynaptic neurotoxins from Elapidae venoms, toxin IV is unique by having a free SH-group (cysteine) at position 16. In position 35 of W-III, which is located at the tip of the central loop, threonine is replaced by lysine, which may alter the interaction of the toxin with the acetylcholine receptor, since the toxin is seven times less lethal than toxin W-IV.

  20. Spatio-Temporal Variations of High and Low Nucleic Acid Content Bacteria in an Exorheic River

    PubMed Central

    Ma, Lili; Ji, Yurui; Bartlam, Mark; Wang, Yingying

    2016-01-01

    Bacteria with high nucleic acid (HNA) and low nucleic acid (LNA) content are commonly observed in aquatic environments. To date, limited knowledge is available on their temporal and spatial variations in freshwater environments. Here an investigation of HNA and LNA bacterial abundance and their flow cytometric characteristics was conducted in an exorheic river (Haihe River, Northern China) over a one year period covering September (autumn) 2011, December (winter) 2011, April (spring) 2012, and July (summer) 2012. The results showed that LNA and HNA bacteria contributed similarly to the total bacterial abundance on both the spatial and temporal scale. The variability of HNA on abundance, fluorescence intensity (FL1) and side scatter (SSC) were more sensitive to environmental factors than that of LNA bacteria. Meanwhile, the relative distance of SSC between HNA and LNA was more variable than that of FL1. Multivariate analysis further demonstrated that the influence of geographical distance (reflected by the salinity gradient along river to ocean) and temporal changes (as temperature variation due to seasonal succession) on the patterns of LNA and HNA were stronger than the effects of nutrient conditions. Furthermore, the results demonstrated that the distribution of LNA and HNA bacteria, including the abundance, FL1 and SSC, was controlled by different variables. The results suggested that LNA and HNA bacteria might play different ecological roles in the exorheic river. PMID:27082986

  1. Spatio-Temporal Variations of High and Low Nucleic Acid Content Bacteria in an Exorheic River.

    PubMed

    Liu, Jie; Hao, Zhenyu; Ma, Lili; Ji, Yurui; Bartlam, Mark; Wang, Yingying

    2016-01-01

    Bacteria with high nucleic acid (HNA) and low nucleic acid (LNA) content are commonly observed in aquatic environments. To date, limited knowledge is available on their temporal and spatial variations in freshwater environments. Here an investigation of HNA and LNA bacterial abundance and their flow cytometric characteristics was conducted in an exorheic river (Haihe River, Northern China) over a one year period covering September (autumn) 2011, December (winter) 2011, April (spring) 2012, and July (summer) 2012. The results showed that LNA and HNA bacteria contributed similarly to the total bacterial abundance on both the spatial and temporal scale. The variability of HNA on abundance, fluorescence intensity (FL1) and side scatter (SSC) were more sensitive to environmental factors than that of LNA bacteria. Meanwhile, the relative distance of SSC between HNA and LNA was more variable than that of FL1. Multivariate analysis further demonstrated that the influence of geographical distance (reflected by the salinity gradient along river to ocean) and temporal changes (as temperature variation due to seasonal succession) on the patterns of LNA and HNA were stronger than the effects of nutrient conditions. Furthermore, the results demonstrated that the distribution of LNA and HNA bacteria, including the abundance, FL1 and SSC, was controlled by different variables. The results suggested that LNA and HNA bacteria might play different ecological roles in the exorheic river.

  2. Modeling the plant-soil interaction in presence of heavy metal pollution and acidity variations.

    PubMed

    Guala, Sebastián; Vega, Flora A; Covelo, Emma F

    2013-01-01

    We modify a mathematical interaction model, originally developed to describe metal uptake by plants and its effects on their growth, so that it also accounts for variations in soil acidity. The model describes the dynamics of metal uptake from soil by plants and how that uptake varies with the acidity level. Two types of relationships are considered: total and available metal content. We make simple mathematical assumptions in order to obtain expressions that are as simple as possible and can be easily tested experimentally. This work introduces modifications to two versions of the model: on the one hand, the relationship between the metal in the soil and the concentration of the metal in plants and, on the other hand, the relationship between the metal in the soil and the total amount of the metal in plants. The fine difference between the two versions is fundamental when considering the tolerance and the capacity for accumulation of pollutants from the soil in the biomass.

  3. Complete genome sequence of Lactococcus lactis IO-1, a lactic acid bacterium that utilizes xylose and produces high levels of L-lactic acid.

    PubMed

    Kato, Hiroaki; Shiwa, Yuh; Oshima, Kenshiro; Machii, Miki; Araya-Kojima, Tomoko; Zendo, Takeshi; Shimizu-Kadota, Mariko; Hattori, Masahira; Sonomoto, Kenji; Yoshikawa, Hirofumi

    2012-04-01

    We report the complete genome sequence of Lactococcus lactis IO-1 (= JCM7638). It is a nondairy lactic acid bacterium, produces nisin Z, ferments xylose, and produces predominantly L-lactic acid at high xylose concentrations. From ortholog analysis with five other L. lactis strains, IO-1 was identified as L. lactis subsp. lactis.

  4. Complete genome sequence of Bacillus amyloliquefaciens LL3, which exhibits glutamic acid-independent production of poly-γ-glutamic acid.

    PubMed

    Geng, Weitao; Cao, Mingfeng; Song, Cunjiang; Xie, Hui; Liu, Li; Yang, Chao; Feng, Jun; Zhang, Wei; Jin, Yinghong; Du, Yang; Wang, Shufang

    2011-07-01

    Bacillus amyloliquefaciens is one of the most prevalent Gram-positive aerobic spore-forming bacteria with the ability to synthesize polysaccharides and polypeptides. Here, we report the complete genome sequence of B. amyloliquefaciens LL3, which was isolated from fermented food and exhibits glutamic acid-independent production of poly-γ-glutamic acid.

  5. Sequence variation of ookinete surface proteins Pvs25 and Pvs28 of Plasmodium vivax isolates from Southern Mexico and their association to local anophelines infectivity.

    PubMed

    González-Cerón, Lilia; Alvarado-Delgado, Alejandro; Martínez-Barnetche, Jesus; Rodríguez, Mario H; Ovilla-Muñoz, Marbella; Pérez, Fabián; Hernandez-Avila, Juan E; Sandoval, Marco A; Rodríguez, Maria Del Carmen; Villarreal-Treviño, Cuauhtémoc

    2010-07-01

    The polymorphism of Pvs25 and Pvs28 ookinete surface proteins, their association with circumsporozoite protein repeat (CSPr) genotypes (Vk210 and Vk247) and their infectivity to local Anopheles albimanus and Anopheles pseudopunctipennis were investigated in Plasmodium vivax-infected blood samples obtained from patients in Southern Mexico. The pvs25 and pvs28 complete genes were amplified, cloned and sequenced; and the CSPr genotype was determined by PCR amplification and hybridization. The amino acid Pvs25 and Pvs28 polymorphisms were mapped to their corresponding protein structure. Infected blood samples were simultaneously provided through artificial feeders to both mosquito species; the ratio of infected mosquitoes and oocyst numbers were recorded. The polymorphism of pvs25 and pvs28 was limited to a few nucleotide positions, and produced three haplotypes: type A/A parasites presented Pvs25 and Pvs28 amino acid sequences identical to that of Sal I reference strain; parasites type B1 presented a mutation 130 Ile→Thr in Pvs25, while type B2 presented 87 Gln→Lys/130 Ile→Thr in the same molecule. Both types B1 and B2 parasites presented changes in Pvs28 at 87 Asn→Asp, 110 Tyr→Asn and five GSGGE/D repeat sequences between the fourth EGF-like domain and the GPI. Most P. vivax parasites from the coastal plains and the overlapping region were Pvs25/28 A/A, CSPrVk210 and were infective only to An. albimanus (p ≤ 0.0001). Parasites originating in foothills were Pvs25/28 type B1/B or B2/B and CSPrVk210 or Vk247, and were more infective to An. pseudopunctipennis than to An. albimanus (p ≤ 0.001). These results and the analysis of Pvs25/28 from other parts of the world indicated that non-synonymous variations in these proteins occur in amino acid residues exposed on the surface of the proteins, and are likely to interact with midgut mosquito ligands. We hypothesize that these molecules have been shaped by co-evolutionary adaptations of parasites to their

  6. Design, synthesis, and characterization of a protein sequencing reagent yielding amino acid derivatives with enhanced detectability by mass spectrometry.

    PubMed Central

    Aebersold, R.; Bures, E. J.; Namchuk, M.; Goghari, M. H.; Shushan, B.; Covey, T. C.

    1992-01-01

    We report the design, chemical synthesis, and structural and functional characterization of a novel reagent for protein sequence analysis by the Edman degradation, yielding amino acid derivatives rapidly detectable at high sensitivity by ion-evaporation mass spectrometry. We demonstrate that the reagent 3-[4'(ethylene-N,N,N-trimethylamino)phenyl]-2-isothiocyanate is chemically stable and shows coupling and cyclization/cleavage yields comparable to phenylisothiocyanate, the standard reagent in chemical sequence analysis, under conditions typically encountered in manual or automated sequence analysis. Amino acid derivatives generated with this reagent were detectable by ion-evaporation mass spectrometry at the subfemtomole sensitivity level at a pace of one sample per minute. Furthermore, derivatives were identified by their mass, thus permitting the rapid and highly sensitive determination of the molecular nature of modified amino acids. Derivatives of amino acids with acidic, basic, polar, or hydrophobic side chains were reproducibly detectable at comparable sensitivities. The polar nature of the reagent required covalent immobilization of polypeptides prior to automated sequence analysis. This reagent, used in automated sequence analysis, has the potential for overcoming the limitations in sensitivity, speed, and the ability to characterize modified amino acid residues inherent in the chemical sequencing methods that are currently used. PMID:1304351

  7. Serum uric acid concentrations and SLC2A9 genetic variation in Hispanic children: The Viva La Familia Study

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Elevated concentrations of serum uric acid are associated with increased risk of gout and renal and cardiovascular diseases. Genetic studies in adults have consistently identified associations of solute carrier family 2, member 9 (SLC2A9), polymorphisms with variation in serum uric acid. However, it...

  8. Complete Genome Sequence of Enterobacter cloacae UW5, a Rhizobacterium Capable of High Levels of Indole-3-Acetic Acid Production.

    PubMed

    Coulson, Thomas J D; Patten, Cheryl L

    2015-08-06

    We report the complete genome sequence of Enterobacter cloacae UW5, an indole-3-acetic acid-producing rhizobacterium originally isolated from the rhizosphere of grass. The 4.9-Mbp genome has a G+C content of 54% and contains 4,496 protein-coding sequences.

  9. Complete Genome Sequence of Enterobacter cloacae UW5, a Rhizobacteri