Science.gov

Sample records for acid sequence variations

  1. Formation Sequences of Iron Minerals in the Acidic Alteration Products and Variation of Hydrothermal Fluid Conditions

    NASA Astrophysics Data System (ADS)

    Isobe, H.; Yoshizawa, M.

    2008-12-01

    Iron minerals play an important role in environmental issues not only on the Earth but also on other terrestrial planets. Iron mineral species formed by alteration of primary minerals by surface or subsurface fluids are characterized by the temperature, acidity and redox conditions of the fluids. Various iron-bearing alteration products occur around fumaroles in geothermal/volcanic areas. In this study, zonal structures of iron minerals in alteration products of a geothermal area are observed to elucidate temporal and spatial variation of hydrothermal fluids. The pyroxene-amphibole andesite of Garan-dake volcano, Oita, Japan is altered by acidic hydrothermal fluid, which leaches out elements other than Si to form cristobalite. Hand specimens with an unaltered or weakly altered core and a cristobalite crust show various sequences of layers. XRD analysis revealed that the degree of alteration is represented by the abundance of cristobalite. Intermediately altered layers are characterized by the occurrence of minerals including alunite, pyrite, kaolinite, goethite and hematite. A specimen with a reddish brown core surrounded by a cristobalite-rich white crust has brown layers at the boundary between the core and the crust. XRD shows that the reddish core is characterized by crystalline hematite. Another hand specimen has a light gray core, which represents reduced conditions, and a white cristobalite crust, with light brown and reddish brown layers of ferric iron minerals between the core and the crust. On the other hand, hornblende crystals, a typical ferrous iron-bearing mineral of the host rock, are well preserved in some samples with strongly decolorized cristobalite-rich groundmass. Hydrothermal alteration experiments on iron-rich basaltic material show that iron mineral species depend on the acidity and temperature of the fluid. Oxidation states of the iron-bearing mineral species are strongly influenced by the acidity and redox conditions.
Variations of alteration

  2. DNA Sequence and Expression Variation of Hop (Humulus lupulus) Valerophenone Synthase (VPS), a Key Gene in Bitter Acid Biosynthesis

    PubMed Central

    Castro, Consuelo B.; Whittock, Lucy D.; Whittock, Simon P.; Leggett, Grey; Koutoulis, Anthony

    2008-01-01

    Background The hop plant (Humulus lupulus) is a source of many secondary metabolites, with bitter acids essential in the beer brewing industry and others having potential applications for human health. This study investigated variation in DNA sequence and gene expression of valerophenone synthase (VPS), a key gene in the bitter acid biosynthesis pathway of hop. Methods Sequence variation was studied in 12 varieties, and expression was analysed in four of the 12 varieties in a series across the development of the hop cone. Results Nine single nucleotide polymorphisms (SNPs) were detected in VPS, seven of which were synonymous. The two non-synonymous polymorphisms did not appear to be related to typical bitter acid profiles of the varieties studied. However, real-time quantitative reverse-transcription polymerase chain reaction (qRT-PCR) analysis of VPS expression during hop cone development showed a clear link with the bitter acid content. The highest levels of VPS expression were observed in two triploid varieties, ‘Symphony’ and ‘Ember’, which typically have high bitter acid levels. Conclusions In all hop varieties studied, VPS expression was lowest in the leaves and an increase in expression was consistently observed during the early stages of cone development. PMID:18519445
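    The distinction between synonymous and non-synonymous SNPs drawn in this abstract can be sketched in a few lines. The following is only an illustration, assuming the standard genetic code and a hypothetical helper name `classify_snp`; the codons used are made up and are not taken from the VPS gene.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def classify_snp(codon, position, new_base):
    """Classify a single-base change within one codon (hypothetical helper).

    codon     -- reference codon, e.g. "GGT"
    position  -- 0-based index of the changed base within the codon
    new_base  -- the alternative base at that position
    """
    mutated = codon[:position] + new_base + codon[position + 1:]
    if CODON_TABLE[codon] == CODON_TABLE[mutated]:
        return "synonymous"
    return "non-synonymous"


# Illustrative codons only (not from the study):
print(classify_snp("GGT", 2, "C"))  # GGT and GGC both encode glycine
print(classify_snp("GAT", 2, "A"))  # GAT encodes Asp, GAA encodes Glu
```

    A synonymous change leaves the encoded amino acid unchanged, which is why such SNPs are not expected to alter the enzyme itself.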

  3. Fad7 gene identification and fatty acids phenotypic variation in an olive collection by EcoTILLING and sequencing approaches.

    PubMed

    Sabetta, Wilma; Blanco, Antonio; Zelasco, Samanta; Lombardo, Luca; Perri, Enzo; Mangini, Giacomo; Montemurro, Cinzia

    2013-08-01

    The ω-3 fatty acid desaturases (FADs) are enzymes responsible for catalyzing the conversion of linoleic acid to α-linolenic acid localized in the plastid or in the endoplasmic reticulum. In this research we report the genotypic and phenotypic variation of Italian Olea europaea L. germplasm for fatty acid composition. The phenotypic oil characterization was followed by the molecular analysis of the plastidial-type ω-3 FAD gene (fad7) (EC 1.14.19), whose full-length sequence was identified here in cultivar Leccino. The gene consisted of 2635 bp with 8 exons and 5'- and 3'-UTRs of 336 and 282 bp respectively, and showed a high level of heterozygosity (1/110 bp). The natural allelic variation was investigated both by a LiCOR EcoTILLING assay and by direct sequencing of the PCR product. Only three haplotypes were identified among the 96 analysed cultivars, highlighting the strong degree of conservation of this gene. PMID:23685785

  4. Detection of nucleic acid sequences by invader-directed cleavage

    DOEpatents

    Brow, Mary Ann D.; Hall, Jeff Steven Grotelueschen; Lyamichev, Victor; Olive, David Michael; Prudent, James Robert

    1999-01-01

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The 5' nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge.

  5. Composition for nucleic acid sequencing

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2008-08-26

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
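    The idea of deducing a sequence from the temporal order of base additions can be illustrated with a toy sketch. It assumes hypothetical, error-free incorporation events given as (time, base) pairs; the function name and data are illustrative and are not part of the patent.

```python
# Watson-Crick pairing for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def template_from_incorporations(events):
    """Reconstruct strands from time-stamped incorporation events.

    events -- iterable of (time, base) pairs, one per incorporated nucleotide
              (hypothetical input format, assumed free of errors).
    Returns (growing_strand, template_strand), both written 5'->3'.
    """
    ordered = sorted(events)  # temporal order of base additions
    growing = "".join(base for _, base in ordered)
    # The template is the reverse complement of the growing strand.
    template = "".join(COMPLEMENT[base] for base in reversed(growing))
    return growing, template


# Events may arrive out of order; sorting by time restores the order of addition.
growing, template = template_from_incorporations([(0.3, "T"), (0.1, "A"), (0.2, "C")])
print(growing, template)
```

    Real single-molecule data would also require signal detection and error handling, which this sketch omits.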

  6. Sequence variations in the FAD2 gene in seeded pumpkins.

    PubMed

    Ge, Y; Chang, Y; Xu, W L; Cui, C S; Qu, S P

    2015-01-01

    Seeded pumpkins are important economic crops; the seeds contain various unsaturated fatty acids, such as oleic acid and linoleic acid, which are crucial for human and animal nutrition. The fatty acid desaturase-2 (FAD2) gene encodes delta-12 desaturase, which converts oleic acid to linoleic acid. However, little is known about sequence variations in FAD2 in seeded pumpkins. Twenty-seven FAD2 clones from 27 accessions of Cucurbita moschata, Cucurbita maxima, Cucurbita pepo, and Cucurbita ficifolia were obtained (1152 bp in total; a single gene without introns). Nucleotide identities of more than 90% were detected among the 27 FAD2 clones. Nucleotide substitution, rather than nucleotide insertion and deletion, led to sequence polymorphism in the 27 FAD2 clones. Furthermore, the 27 selected FAD2 clones all encoded the FAD2 enzyme (delta-12 desaturase), with amino acid sequence identities from 91.7 to 100% over 384 amino acids. The same main functional domain, between amino acids 47 and 329, was identified. The four species clustered separately based on sequence differences identified using the unweighted pair group method with arithmetic mean. Geographic origin and species were found to be closely related to sequence variation in FAD2. PMID:26782391
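    Identity figures like those quoted above are typically computed from pairwise alignments. The sketch below shows one common convention (matches divided by aligned, gap-free columns); the abstract does not state which convention or software the authors used, and the sequences here are invented.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where either sequence has a gap ('-') are skipped; this is one
    of several conventions and is an assumption of this sketch.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not compared:
        raise ValueError("no gap-free columns to compare")
    matches = sum(a == b for a, b in compared)
    return 100.0 * matches / len(compared)


# Invented 8-base fragments differing by one substitution:
print(percent_identity("ATGCATGC", "ATGCATGA"))
```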

  7. High speed nucleic acid sequencing

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2011-05-17

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid. Each type of labeled nucleotide comprises an acceptor fluorophore attached to a phosphate portion of the nucleotide such that the fluorophore is removed upon incorporation into a growing strand. Fluorescent signal is emitted via fluorescent resonance energy transfer between the donor fluorophore and the acceptor fluorophore as each nucleotide is incorporated into the growing strand. The sequence is deduced by identifying which base is being incorporated into the growing strand.

  8. The regions of sequence variation in caulimovirus gene VI.

    PubMed

    Sanger, M; Daubert, S; Goodman, R M

    1991-06-01

    The sequence of gene VI from figwort mosaic virus (FMV) clone x4 was determined and compared with that previously published for FMV clone DxS. Both clones originated from the same virus isolation, but the virus used to clone DxS was propagated extensively in a host of a different family prior to cloning whereas that used to clone x4 was not. Differences in the amino acid sequence inferred from the DNA sequences occurred in two clusters. An N-terminal conserved region preceded two regions of variation separated by a central conserved region. Variation in cauliflower mosaic virus (CaMV) gene VI sequences, all of which were derived from virus isolates from hosts from one host family, was similar to that seen in the FMV comparison, though the extent of variation was less. Alignment of gene VI domains from FMV and CaMV revealed regions of amino acid sequence identical in both viruses within the conserved regions. The similarity in the pattern of conserved and variable domains of these two viruses suggests common host-interactive functions in caulimovirus gene VI homologues, and possibly an analogy between caulimoviruses and certain animal viruses in the influence of the host on sequence variability of viral genes. PMID:2024500

  9. A case study on the genetic origin of the high oleic acid trait through FAD2-1 DNA sequence variation in safflower (Carthamus tinctorius L.)

    PubMed Central

    Rapson, Sara; Wu, Man; Okada, Shoko; Das, Alpana; Shrestha, Pushkar; Zhou, Xue-Rong; Wood, Craig; Green, Allan; Singh, Surinder; Liu, Qing

    2015-01-01

    The safflower (Carthamus tinctorius L.) is considered a strongly domesticated species with a long history of cultivation. The hybridization of safflower with its wild relatives has played an important role in the evolution of cultivars and is of particular interest with regards to their production of high quality edible oils. Original safflower varieties were all rich in linoleic acid, while varieties rich in oleic acid have risen to prominence in recent decades. The high oleic acid trait is controlled by a partially recessive allele ol at a single locus OL. The ol allele was found to be a defective microsomal oleate desaturase FAD2-1. Here we present DNA sequence data and Southern blot analysis suggesting that there has been an ancient hybridization and introgression of the FAD2-1 gene into C. tinctorius from its wild relative C. palaestinus. It is from this gene that FAD2-1Δ was derived more recently. Identification and characterization of the genetic origin and diversity of FAD2-1 could aid safflower breeders in reducing population size and generations required for the development of new high oleic acid varieties by using perfect molecular marker-assisted selection. PMID:26442008

  10. Protein structure prediction from sequence variation

    PubMed Central

    Marks, Debora S; Hopf, Thomas A; Sander, Chris

    2015-01-01

    Genomic sequences contain rich evolutionary information about functional constraints on macromolecules such as proteins. This information can be efficiently mined to detect evolutionary couplings between residues in proteins and address the long-standing challenge to compute protein three-dimensional structures from amino acid sequences. Substantial progress has recently been made on this problem owing to the explosive growth in available sequences and the application of global statistical methods. In addition to three-dimensional structure, the improved understanding of covariation may help identify functional residues involved in ligand binding, protein-complex formation and conformational changes. We expect computation of covariation patterns to complement experimental structural biology in elucidating the full spectrum of protein structures, their functional interactions and evolutionary dynamics. PMID:23138306

  11. Indole acetic acid production by fluorescent Pseudomonas spp. from the rhizosphere of Plectranthus amboinicus (Lour.) Spreng. and their variation in extragenic repetitive DNA sequences.

    PubMed

    Sethia, Bedhya; Mustafa, Mariam; Manohar, Sneha; Patil, Savita V; Jayamohan, Nellickal Subramanian; Kumudini, Belur Satyan

    2015-06-01

    Fluorescent Pseudomonas (FP) is a heterogeneous group of growth-promoting rhizobacteria that regulate plant growth by releasing secondary metabolic compounds, viz. indole acetic acid (IAA), siderophores, ammonia and hydrogen cyanide. In the present study, IAA-producing FPs from the rhizosphere of Plectranthus amboinicus were characterized morphologically, biochemically and at the molecular level. Molecular identification of the isolates was carried out using Pseudomonas-specific primers. The effects of varying time (24, 48, 72 and 96 h), Trp concentration (100, 200, 300, 400 and 500 μg ml⁻¹), temperature (10, 26, 37 and 50 ± 2 °C) and pH (6, 7 and 8) on IAA production by the 10 best isolates were studied. Results showed higher IAA production at 72 h incubation, 300 μg ml⁻¹ Trp, 26 ± 2 °C and pH 7. TLC with acidified ethyl acetate extract showed that the IAA produced has an Rf value similar to that of standard IAA. Results of TLC were confirmed by HPLC analysis. Genetic diversity of the isolates was also studied using 40 RAPD and 4 Rep primers. Genetic diversity parameters such as dominance, Shannon index and Simpson index were calculated. Out of 40 RAPD primers tested, 9 (2 OP-D series and 7 OP-E series) were shortlisted for further analysis. Studies using RAPD, ERIC, BOX, REP and GTG5 primers revealed that the isolates exhibit significant diversity in repetitive DNA sequences irrespective of the rhizosphere. PMID:26155673
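    The diversity parameters named in this abstract have standard textbook definitions, sketched below. The abstract does not say which formulas or software the authors applied, so the definitions used here (dominance as the sum of squared proportions, Simpson index as one minus dominance, Shannon index with natural logarithms) are assumptions, and the counts are invented.

```python
import math


def diversity_indices(counts):
    """Dominance, Shannon and Simpson indices from category counts.

    counts -- e.g. number of isolates (or bands) per group; zero counts are ignored.
    """
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must contain at least one positive value")
    props = [c / total for c in counts if c > 0]
    dominance = sum(p * p for p in props)             # D = sum of p_i^2
    shannon = -sum(p * math.log(p) for p in props)    # H = -sum of p_i ln p_i
    return {"dominance": dominance, "shannon": shannon, "simpson": 1.0 - dominance}


# Invented example: four equally abundant groups.
print(diversity_indices([5, 5, 5, 5]))
```

    For equally abundant groups the Shannon index equals the natural logarithm of the number of groups, which makes a convenient sanity check.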

  12. Chip-based sequencing nucleic acids

    DOEpatents

    Beer, Neil Reginald

    2014-08-26

    A system for fast DNA sequencing by amplification of genetic material within microreactors, denaturing, demulsifying, and then sequencing the material, while retaining it in a PCR/sequencing zone by a magnetic field. One embodiment includes sequencing nucleic acids on a microchip that includes a microchannel flow channel in the microchip. The nucleic acids are isolated and hybridized to magnetic nanoparticles or to magnetic polystyrene-coated beads. Microreactor droplets are formed in the microchannel flow channel. The microreactor droplets containing the nucleic acids and the magnetic nanoparticles are retained in a magnetic trap in the microchannel flow channel and sequenced.

  13. Distinguishing Proteins From Arbitrary Amino Acid Sequences

    PubMed Central

    Yau, Stephen S.-T.; Mao, Wei-Guang; Benson, Max; He, Rong Lucy

    2015-01-01

    What kinds of amino acid sequences could possibly be protein sequences? From all existing databases that we can find, known proteins are only a small fraction of all possible combinations of amino acids. Beginning with Sanger's first detailed determination of a protein sequence in 1952, previous studies have focused on describing the structure of existing protein sequences in order to construct the protein universe. No one, however, has developed a criterion for determining whether an arbitrary amino acid sequence can be a protein. Here we show that when the collection of arbitrary amino acid sequences is viewed in an appropriate geometric context, the protein sequences cluster together. This leads to a new computational test, described here, that has proved to be remarkably accurate at determining whether an arbitrary amino acid sequence can be a protein. Even more, if the results of this test indicate that the sequence can be a protein, and it is indeed a protein sequence, then its identity as a protein sequence is uniquely defined. We anticipate our computational test will be useful for those who are attempting to complete the job of discovering all proteins, or constructing the protein universe. PMID:25609314

  14. Control for stochastic sampling variation and qualitative sequencing error in next generation sequencing

    PubMed Central

    Blomquist, Thomas; Crawford, Erin L.; Yeo, Jiyoun; Zhang, Xiaolu; Willey, James C.

    2015-01-01

    Background Clinical implementation of Next-Generation Sequencing (NGS) is challenged by poor control for stochastic sampling, library preparation biases and qualitative sequencing error. To address these challenges we developed and tested two hypotheses. Methods Hypothesis 1: Analytical variation in quantification is predicted by stochastic sampling effects at input of (a) amplifiable nucleic acid target molecules into the library preparation, (b) amplicons from library into sequencer, or (c) both. We derived equations using Monte Carlo simulation to predict assay coefficient of variation (CV) based on these three working models and tested them against NGS data from specimens with well characterized molecule inputs and sequence counts prepared using competitive multiplex-PCR amplicon-based NGS library preparation method comprising synthetic internal standards (IS). Hypothesis 2: Frequencies of technically-derived qualitative sequencing errors (i.e., base substitution, insertion and deletion) observed at each base position in each target native template (NT) are concordant with those observed in respective competitive synthetic IS present in the same reaction. We measured error frequencies at each base position within amplicons from each of 30 target NT, then tested whether they correspond to those within the 30 respective IS. Results For hypothesis 1, the Monte Carlo model derived from both sampling events best predicted CV and explained 74% of observed assay variance. For hypothesis 2, observed frequency and type of sequence variation at each base position within each IS was concordant with that observed in respective NTs (R² = 0.93). Conclusion In targeted NGS, synthetic competitive IS control for stochastic sampling at input of both target into library preparation and of target library product into sequencer, and control for qualitative errors generated during library preparation and sequencing. These controls enable accurate clinical diagnostic reporting of
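    How two successive sampling events inflate the coefficient of variation can be illustrated with a small Monte Carlo sketch. This is not the authors' model: it assumes, purely for illustration, that both the molecules entering library preparation and the reads entering the sequencer are Poisson-distributed, and all names and parameter values are hypothetical. Under that assumption the expected CV is sqrt(1/mean_molecules + 1/mean_reads).

```python
import math
import random
import statistics


def poisson(rng, lam):
    """Draw one Poisson variate by Knuth's multiplication method (small lam only)."""
    if lam <= 0:
        return 0
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1


def simulate_cv(mean_molecules, mean_reads, trials=20000, seed=1):
    """Simulated CV of read counts after two successive Poisson sampling steps."""
    rng = random.Random(seed)
    reads = []
    for _ in range(trials):
        # Step 1: target molecules sampled into the library preparation.
        molecules = poisson(rng, mean_molecules)
        # Step 2: reads sampled from the library, proportional to step 1.
        reads.append(poisson(rng, mean_reads * molecules / mean_molecules))
    return statistics.pstdev(reads) / statistics.mean(reads)


simulated = simulate_cv(mean_molecules=20, mean_reads=50)
expected = math.sqrt(1 / 20 + 1 / 50)  # analytic CV under the Poisson assumption
print(round(simulated, 3), round(expected, 3))
```

    With few input molecules the first sampling step dominates the CV; with shallow sequencing the second does, which is the qualitative behaviour the three working models in the abstract are meant to separate.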

  15. Method for sequencing nucleic acid molecules

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2006-05-30

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  16. Method for sequencing nucleic acid molecules

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2006-06-06

    The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.

  17. Variation in Seed Fatty Acid Composition, and Sequence Divergence in the FAD2 Gene Coding Region between Wild and Cultivated Sesame

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Sesame germplasm harbors genetic diversity which can be useful for sesame improvement in breeding programs. Seven accessions with different levels of oleic acid were selected from the entire USDA sesame germplasm collection (1232 accessions) and planted for morphological observation and re-examinati...

  18. Effect of amino acid sequence variations at position 149 on the fusogenic activity of the subtype B avian metapneumovirus fusion protein.

    PubMed

    Yun, Bingling; Gao, Yanni; Liu, Yongzhen; Guan, Xiaolu; Wang, Yongqiang; Qi, Xiaole; Gao, Honglei; Liu, Changjun; Cui, Hongyu; Zhang, Yanping; Gao, Yulong; Wang, Xiaomei

    2015-10-01

    The entry of enveloped viruses into host cells requires the fusion of viral and cell membranes. These membrane fusion reactions are mediated by virus-encoded glycoproteins. In the case of avian metapneumovirus (aMPV), the fusion (F) protein alone can mediate virus entry and induce syncytium formation in vitro. To investigate the fusogenic activity of the aMPV F protein, we compared the fusogenic activities of three subtypes of aMPV F proteins using a TCSD50 assay developed in this study. Interestingly, we found that the F protein of aMPV subtype B (aMPV/B) strain VCO3/60616 (aMPV/vB) was hyperfusogenic when compared with F proteins of aMPV/B strain aMPV/f (aMPV/fB), aMPV subtype A (aMPV/A), and aMPV subtype C (aMPV/C). We then further demonstrated that the amino acid (aa) residue 149F contributed to the hyperfusogenic activity of the aMPV/vB F protein. Moreover, we revealed that residue 149F had no effect on the fusogenic activities of aMPV/A, aMPV/C, and human metapneumovirus (hMPV) F proteins. Collectively, we provide the first evidence that the amino acid at position 149 affects the fusogenic activity of the aMPV/B F protein, and our findings will provide new insights into the fusogenic mechanism of this protein. PMID:26175070

  19. Variations on strongly lacunary quasi Cauchy sequences

    NASA Astrophysics Data System (ADS)

    Kaplan, Huseyin; Cakalli, Huseyin

    2016-08-01

    We introduce a new function space, namely the space of $N_\theta(p)$-ward continuous functions, which turns out to be a closed subspace of the space of continuous functions for each positive integer $p$. $N_\theta^\alpha(p)$-ward continuity is also introduced and investigated for any fixed $0 < \alpha \le 1$ and any fixed positive integer $p$. A real-valued function $f$ defined on a subset $A$ of $\mathbb{R}$, the set of real numbers, is $N_\theta^\alpha(p)$-ward continuous if it preserves $N_\theta^\alpha(p)$-quasi-Cauchy sequences, i.e. $(f(x_n))$ is an $N_\theta^\alpha(p)$-quasi-Cauchy sequence whenever $(x_n)$ is an $N_\theta^\alpha(p)$-quasi-Cauchy sequence of points in $A$, where a sequence $(x_k)$ of points in $\mathbb{R}$ is called $N_\theta^\alpha(p)$-quasi-Cauchy if $\lim_{r \to \infty} \frac{1}{h_r^\alpha} \sum_{k \in I_r} |\Delta x_k|^p = 0$, where $\Delta x_k = x_{k+1} - x_k$ for each positive integer $k$, $p$ is a fixed positive integer, $\alpha$ is fixed in $]0, 1]$, $I_r = (k_{r-1}, k_r]$, and $\theta = (k_r)$ is a lacunary sequence, i.e. an increasing sequence of positive integers such that $k_0 \neq 0$ and $h_r := k_r - k_{r-1} \to \infty$.
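    The defining limit can be explored numerically. The sketch below evaluates, for a given r, the block average (1 / h_r^α) · Σ |x_{k+1} − x_k|^p over k in (k_{r-1}, k_r]; the function name, the choice k_r = 2^r and the two test sequences are illustrative assumptions. A finite computation can only suggest, not prove, that the limit is zero.

```python
import math


def lacunary_term(x, theta, r, alpha=1.0, p=1):
    """r-th block average (1 / h_r**alpha) * sum_{k in I_r} |x(k+1) - x(k)|**p.

    x     -- function mapping a positive integer k to x_k
    theta -- list [k_0, k_1, ...] of increasing positive integers
    r     -- block index, r >= 1; I_r = (k_{r-1}, k_r], h_r = k_r - k_{r-1}
    """
    lo, hi = theta[r - 1], theta[r]
    h_r = hi - lo
    total = sum(abs(x(k + 1) - x(k)) ** p for k in range(lo + 1, hi + 1))
    return total / h_r ** alpha


# Hypothetical lacunary sequence k_r = 2**r, so k_0 = 1 and h_r = 2**(r-1).
theta = [2 ** r for r in range(16)]

# x_k = sqrt(k): consecutive differences shrink, so the block averages decay.
print([lacunary_term(math.sqrt, theta, r) for r in (5, 10, 15)])

# x_k = (-1)**k: every difference has size 2, so the block averages stay at 2.
print(lacunary_term(lambda k: (-1) ** k, theta, 10))
```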

  20. A variation on lacunary quasi Cauchy sequences

    NASA Astrophysics Data System (ADS)

    Cakalli, Huseyin; Et, Mikail; Sengul, Hacer

    2016-08-01

    In the present paper, we introduce a concept of ideal lacunary statistical quasi-Cauchy sequence of order $\alpha$ of real numbers in the sense that a sequence $(x_k)$ of points in $\mathbb{R}$ is called $I$-lacunary statistically quasi-Cauchy of order $\alpha$ if $\{ r \in \mathbb{N} : \frac{1}{h_r^\alpha} |\{ k \in I_r : |\Delta x_k| \ge \varepsilon \}| \ge \delta \} \in I$ for each $\varepsilon > 0$ and for each $\delta > 0$, where an ideal $I$ is a family of subsets of the positive integers $\mathbb{N}$ which is closed under taking finite unions and subsets of its elements. The main purpose of this paper is to investigate ideal lacunary statistical ward continuity of order $\alpha$, where a function $f$ is called $I$-lacunary statistically ward continuous of order $\alpha$ if it preserves $I$-lacunary statistically quasi-Cauchy sequences of order $\alpha$, i.e. $(f(x_n))$ is an $S_\theta^\alpha(I)$-quasi-Cauchy sequence whenever $(x_n)$ is.

  1. Phenolic acid esterases, coding sequences and methods

    DOEpatents

    Blum, David L.; Kataeva, Irina; Li, Xin-Liang; Ljungdahl, Lars G.

    2002-01-01

    Described herein are four phenolic acid esterases, three of which correspond to domains of previously unknown function within bacterial xylanases, from XynY and XynZ of Clostridium thermocellum and from a xylanase of Ruminococcus. The fourth specifically exemplified xylanase is a protein encoded within the genome of Orpinomyces PC-2. The amino acids of these polypeptides and nucleotide sequences encoding them are provided. Recombinant host cells, expression vectors and methods for the recombinant production of phenolic acid esterases are also provided.

  2. Amino-Acid Sequence of Porcine Pepsin

    PubMed Central

    Tang, J.; Sepulveda, P.; Marciniszyn, J.; Chen, K. C. S.; Huang, W-Y.; Tao, N.; Liu, D.; Lanier, J. P.

    1973-01-01

    As the culmination of several years of experiments, we propose a complete amino-acid sequence for porcine pepsin, an enzyme containing 327 amino-acid residues in a single polypeptide chain. In the sequence determination, the enzyme was treated with cyanogen bromide. Five resulting fragments were purified. The amino-acid sequence of four of the fragments accounted for 290 residues. Because the structure of a 37-residue carboxyl-terminal fragment was already known, it was not studied. The alignment of these fragments was determined from the sequence of methionyl-peptides we had previously reported. We also discovered the locations of active-site aspartyl residues, as well as the pairing of the three disulfide bridges. A minor component of commercial crystalline pepsin was found to contain two extra amino-acid residues, Ala-Leu-, at the amino-terminus of the molecule. This minor component was apparently derived from a different site of cleavage during the activation of porcine pepsinogen. PMID:4587252

  3. Method for identifying and quantifying nucleic acid sequence aberrations

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1998-01-01

    A method for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe.

  4. Method for identifying and quantifying nucleic acid sequence aberrations

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1998-07-21

    A method is disclosed for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe. 11 figs.

  5. Methods for analyzing nucleic acid sequences

    DOEpatents

    Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

    2011-05-17

    The present invention is directed to a method of sequencing a target nucleic acid. The method provides a complex comprising a polymerase enzyme, a target nucleic acid molecule, and a primer, wherein the complex is immobilized on a support. A fluorescent label is attached to a terminal phosphate group of the nucleotide or nucleotide analog. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The time duration of the signal from labeled nucleotides or nucleotide analogs that become incorporated is distinguished from freely diffusing labels by a longer retention in the observation volume for the nucleotides or nucleotide analogs that become incorporated than for the freely diffusing labels.

  6. Nucleotide sequence variation of chitin synthase genes among ectomycorrhizal fungi and its potential use in taxonomy.

    PubMed Central

    Mehmann, B; Brunner, I; Braus, G H

    1994-01-01

    DNA sequences of single-copy genes coding for chitin synthases (UDP-N-acetyl-D-glucosamine:chitin 4-beta-N-acetylglucosaminyltransferase; EC 2.4.1.16) were used to characterize ectomycorrhizal fungi. Degenerate primers deduced from short, completely conserved amino acid stretches flanking a region of about 200 amino acids of zymogenic chitin synthases allowed the amplification of DNA fragments of several members of this gene family. Different DNA band patterns were obtained from basidiomycetes because of variation in the number and length of amplified fragments. Cloning and sequencing of the most prominent DNA fragments revealed that these differences were due to various introns at conserved positions. The presence of introns in basidiomycetous fungi therefore has a potential use in identification of genera by analyzing PCR-generated DNA fragment patterns. Analyses of the nucleotide sequences of cloned fragments revealed variations in nucleotide sequences from 4 to 45%. By comparison of the deduced amino acid sequences, the majority of the DNA fragments were identified as members of genes for chitin synthase class II. The deduced amino acid sequences from species of the same genus differed only in one amino acid residue, whereas identity between the amino acid sequences of ascomycetous and basidiomycetous fungi within the same taxonomic class was found to be approximately 43 to 66%. Phylogenetic analysis of the amino acid sequence of class II chitin synthase-encoding gene fragments by using parsimony confirmed the current taxonomic groupings. In addition, our data revealed a fourth class of putative zymogenic chitin synthases. PMID:7944356

  7. Terminal region sequence variations in variola virus DNA.

    PubMed

    Massung, R F; Loparev, V N; Knight, J C; Totmenin, A V; Chizhikov, V E; Parsons, J M; Safronov, P F; Gutorov, V V; Shchelkunov, S N; Esposito, J J

    1996-07-15

    Genome DNA terminal region sequences were determined for a Brazilian alastrim variola minor virus strain Garcia-1966 that was associated with an 0.8% case-fatality rate and African smallpox strains Congo-1970 and Somalia-1977 associated with variola major (9.6%) and minor (0.4%) mortality rates, respectively. A base sequence identity of ≥ 98.8% was determined after aligning 30 kb of the left- or right-end region sequences with cognate sequences previously determined for Asian variola major strains India-1967 (31% death rate) and Bangladesh-1975 (18.5% death rate). The deduced amino acid sequences of putative proteins of ≥ 65 amino acids also showed relatively high identity, although the Asian and African viruses were clearly more related to each other than to alastrim virus. Alastrim virus contained only 10 of 70 proteins that were 100% identical to homologs in Asian strains, and 7 alastrim-specific proteins were noted. PMID:8661439
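
    The base sequence identity quoted above is a percentage computed over aligned positions. As a minimal sketch (not the authors' procedure), the function below assumes two pre-aligned sequences of equal length and skips any column containing a gap; the name `percent_identity`, the gap convention, and the toy sequences are all illustrative assumptions, and other definitions of identity treat gaps differently.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over aligned, ungapped columns (one of several conventions)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a, seq_b):
        # Assumption: columns with a gap in either sequence are ignored.
        if a == gap or b == gap:
            continue
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Hypothetical toy sequences, not taken from any variola genome.
identity = percent_identity("ACGTACGTAC", "ACGTTCGTAC")
gapped_identity = percent_identity("ACG-T", "ACGTT")
```

    Here nine of ten columns match in the first pair, and the gapped column is excluded from the second comparison.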

  8. Sequence variation in ligand binding sites in proteins

    PubMed Central

    Magliery, Thomas J; Regan, Lynne

    2005-01-01

    Background The recent explosion in the availability of complete genome sequences has led to the cataloging of tens of thousands of new proteins and putative proteins. Many of these proteins can be structurally or functionally categorized from sequence conservation alone. In contrast, little attention has been given to the meaning of poorly-conserved sites in families of proteins, which are typically assumed to be of little structural or functional importance. Results Recently, using statistical free energy analysis of tetratricopeptide repeat (TPR) domains, we observed that positions in contact with peptide ligands are more variable than surface positions in general. Here we show that statistical analysis of TPRs, ankyrin repeats, Cys2His2 zinc fingers and PDZ domains accurately identifies specificity-determining positions by their sequence variation. Sequence variation is measured as deviation from a neutral reference state, and we present probabilistic and information theory formalisms that improve upon recently suggested methods such as statistical free energies and sequence entropies. Conclusion Sequence variation has been used to identify functionally-important residues in four selected protein families. With TPRs and ankyrin repeats, protein families that bind highly diverse ligands, the effect is so pronounced that sequence "hypervariation" alone can be used to predict ligand binding sites. PMID:16194281
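
    The idea of scoring a position by its deviation from a neutral reference state can be illustrated with a per-column relative entropy. This is a minimal sketch under stated assumptions, not the formalism of the paper: the function name, the four-letter toy alphabet, the toy alignment, and the uniform reference distribution are all hypothetical.

```python
import math
from collections import Counter


def column_relative_entropy(column, background):
    """Relative entropy (bits) of observed residue frequencies versus a reference."""
    counts = Counter(column)
    total = sum(counts.values())
    score = 0.0
    for residue, n in counts.items():
        p = n / total            # observed frequency in this column
        q = background[residue]  # assumed reference frequency
        score += p * math.log2(p / q)
    return score


# Hypothetical toy alignment over a four-letter alphabet.
alignment = ["AGLK", "AGLK", "AKLG", "ALLA"]
background = {residue: 0.25 for residue in "AGLK"}  # assumed uniform reference
columns = list(zip(*alignment))
scores = [column_relative_entropy(col, background) for col in columns]
```

    With a uniform reference, the invariant columns score 2.0 bits and the mixed columns score 0.5 bits, so in this sketch a low score marks a position whose residue usage stays close to the reference, that is, a more variable position.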

  9. Amino acid sequence of bovine heart coupling factor 6.

    PubMed Central

    Fang, J K; Jacobs, J W; Kanner, B I; Racker, E; Bradshaw, R A

    1984-01-01

    The amino acid sequence of bovine heart mitochondrial coupling factor 6 (F6) has been determined by automated Edman degradation of the whole protein and derived peptides. Preparations based on heat precipitation and ethanol extraction showed allotypic variation at three positions while material further purified by HPLC yielded only one sequence that also differed by a Phe-Thr replacement at residue 62. The mature protein contains 76 amino acids with a calculated molecular weight of 9006 and a pI of approximately 5, in good agreement with experimentally measured values. The charged amino acids are mainly clustered at the termini and in one section in the middle; these three polar segments are separated by two segments relatively rich in nonpolar residues. Chou-Fasman analysis suggests three stretches of alpha-helix coinciding with (or within) the high-charge-density sequences with a single beta-turn at the first polar-nonpolar junction. Comparison of the F6 sequence with those of other proteins did not reveal any homologous structures. PMID:6149548

  10. Bm86 midgut protein sequence variation in South Texas cattle fever ticks

    PubMed Central

    2010-01-01

    Background Cattle fever ticks, Rhipicephalus (Boophilus) microplus and R. (B.) annulatus, vector bovine and equine babesiosis, and have significantly expanded beyond the permanent quarantine zone established in South Texas. Currently, there are no vaccines approved for use within the United States for controlling these vectors. Vaccines developed in Australia and Cuba based on the midgut antigen Bm86 have variable efficacy against cattle fever ticks. A possible explanation for this variation in vaccine efficacy is amino acid sequence divergence between the recombinant Bm86 vaccine component and native Bm86 expressed in ticks from different geographical regions of the world. Results There was 91.8% amino acid sequence identity in Bm86 among R. microplus and R. annulatus sequenced from South Texas infestations. When South Texas isolates were compared to the Australian Yeerongpilly and Cuban Camcord vaccine strains, there was 89.8% and 90.0% identity, respectively. Most of the sequence divergence was focused in one region of the protein, amino acids 206-298. Hydrophilicity profiles revealed that two short regions of Bm86 (amino acids 206-210 and 560-570) appear to be more hydrophilic in South Texas isolates compared to vaccine strains. Only one amino acid difference was found between South Texas and vaccine strains within two previously described B-cell epitopes. A total of 4 amino acid differences were observed within three peptides previously shown to induce protective immune responses in cattle. Conclusions Sequence differences between South Texas isolates and Yeerongpilly and Camcord strains are spread throughout the entire Bm86 sequence, suggesting that geographic variation does exist. Differences within previously described B-cell epitopes between South Texas isolates and vaccine strains are minimal; however, short regions of hydrophilic amino acids found unique to South Texas isolates suggest that additional unique surface exposed peptides could be targeted

  11. Determining Word Sequence Variation Patterns in Clinical Documents using Multiple Sequence Alignment

    PubMed Central

    Meng, Frank; Morioka, Craig A.; El-Saden, Suzie

    2011-01-01

    Sentences and phrases that represent a certain meaning often exhibit patterns of variation where they differ from a basic structural form by one or two words. We present an algorithm that utilizes multiple sequence alignments (MSAs) to generate a representation of groups of phrases that possess the same semantic meaning but also share in common the same basic word sequence structure. The MSA enables the determination not only of the words that compose the basic word sequence, but also of the locations within the structure that exhibit variation. The algorithm can be utilized to generate patterns of text sequences that can be used as the basis for a pattern-based classifier, as a starting point to bootstrap the pattern building process for regular expression-based classifiers, or serve to reveal the variation characteristics of sentences and phrases within a particular domain. PMID:22195152
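
    The notion of a shared word sequence with marked variation slots can be sketched with a pairwise alignment. This is a simplification and not the published algorithm: it aligns only two phrases using Python's standard `difflib.SequenceMatcher` rather than building a multiple sequence alignment, and the function name, the `*` wildcard, and the example phrases are hypothetical.

```python
from difflib import SequenceMatcher


def word_variation_pattern(phrase_a, phrase_b, wildcard="*"):
    """Align two phrases word by word; keep shared words, mark differences."""
    words_a = phrase_a.split()
    words_b = phrase_b.split()
    matcher = SequenceMatcher(a=words_a, b=words_b, autojunk=False)
    pattern = []
    for tag, i1, i2, _j1, _j2 in matcher.get_opcodes():
        if tag == "equal":
            # Words common to both phrases form the basic structure.
            pattern.extend(words_a[i1:i2])
        else:
            # Any replace/insert/delete region becomes one variation slot.
            pattern.append(wildcard)
    return pattern


# Hypothetical phrases, written for illustration only.
pattern = word_variation_pattern(
    "no evidence of acute infarct",
    "no evidence of recent infarct",
)
```

    The resulting pattern keeps the shared words and replaces the single differing word with a wildcard; extending this to many phrases would need a true multiple alignment step.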

  12. Hybridization and sequencing of nucleic acids using base pair mismatches

    DOEpatents

    Fodor, Stephen P. A.; Lipshutz, Robert J.; Huang, Xiaohua

    2001-01-01

    Devices and techniques for hybridization of nucleic acids and for determining the sequence of nucleic acids. Arrays of nucleic acids are formed by techniques, preferably high resolution, light-directed techniques. Positions of hybridization of a target nucleic acid are determined by, e.g., epifluorescence microscopy. Devices and techniques are proposed to determine the sequence of a target nucleic acid more efficiently and more quickly through such synthesis and detection techniques.

  13. Dissecting the relationship between protein structure and sequence variation

    NASA Astrophysics Data System (ADS)

    Shahmoradi, Amir; Wilke, Claus; Wilke Lab Team

    2015-03-01

    Over the past decade several independent works have shown that some structural properties of proteins are capable of predicting protein evolution. The strength and significance of these structure-sequence relations, however, appear to vary widely among different proteins, with absolute correlation strengths ranging from 0.1 to 0.8. Here we present the results from a comprehensive search for the potential biophysical and structural determinants of protein evolution by studying more than 200 structural and evolutionary properties in a dataset of 209 monomeric enzymes. We discuss the main protein characteristics responsible for the general patterns of protein evolution, and identify sequence divergence as the main determinant of the strengths of virtually all structure-evolution relationships, explaining ~10-30% of observed variation in sequence-structure relations. In addition to sequence divergence, we identify several protein structural properties that are moderately but significantly coupled with the strength of sequence-structure relations. In particular, proteins with more homogeneous backbone hydrogen bond energies, large fractions of helical secondary structures and low fraction of beta sheets tend to have the strongest sequence-structure relation. BEACON-NSF center for the study of evolution in action.

  14. Correlation between fibroin amino acid sequence and physical silk properties.

    PubMed

    Fedic, Robert; Zurovec, Michal; Sehnal, Frantisek

    2003-09-12

    The fiber properties of lepidopteran silk depend on the amino acid repeats that interact during H-fibroin polymerization. The aim of our research was to relate repeat composition to insect biology and fiber strength. Representative regions of the H-fibroin genes were sequenced and analyzed in three pyralid species: wax moth (Galleria mellonella), European flour moth (Ephestia kuehniella), and Indian meal moth (Plodia interpunctella). The amino acid repeats are species-specific, evidently a diversification of an ancestral region of 43 residues, and include three types of regularly dispersed motifs: modifications of GSSAASAA sequence, stretches of tripeptides GXZ where X and Z represent bulky residues, and sequences similar to PVIVIEE. No concatenations of GX dipeptide or alanine, which are typical for Bombyx silkworms and Antheraea silk moths, respectively, were found. Despite different repeat structure, the silks of G. mellonella and E. kuehniella exhibit similar tensile strength as the Bombyx and Antheraea silks. We suggest that in these latter two species, variations in the repeat length obstruct repeat alignment, but sufficiently long stretches of iterated residues get superposed to interact. In the pyralid H-fibroins, interactions of the widely separated and diverse motifs depend on the precision of repeat matching; silk is strong in G. mellonella and E. kuehniella, with 2-3 types of long homogeneous repeats, and nearly 10 times weaker in P. interpunctella, with seven types of shorter erratic repeats. The high proportion of large amino acids in the H-fibroin of pyralids has probably evolved in connection with the spinning habit of caterpillars that live in protective silk tubes and spin continuously, enlarging the tubes on one end and partly devouring the other one. The silk serves as a depot of energetically rich and essential amino acids that may be scarce in the diet. PMID:12816957

  15. Lysoplex: An efficient toolkit to detect DNA sequence variations in the autophagy-lysosomal pathway

    PubMed Central

    Di Fruscio, Giuseppina; Schulz, Angela; De Cegli, Rossella; Savarese, Marco; Mutarelli, Margherita; Parenti, Giancarlo; Banfi, Sandro; Braulke, Thomas; Nigro, Vincenzo; Ballabio, Andrea

    2015-01-01

    The autophagy-lysosomal pathway (ALP) regulates cell homeostasis and plays a crucial role in human diseases, such as lysosomal storage disorders (LSDs) and common neurodegenerative diseases. Therefore, the identification of DNA sequence variations in genes involved in this pathway and their association with human diseases would have a significant impact on health. To this aim, we developed Lysoplex, a targeted next-generation sequencing (NGS) approach, which allowed us to obtain a uniform and accurate coding sequence coverage of a comprehensive set of 891 genes involved in lysosomal, endocytic, and autophagic pathways. Lysoplex was successfully validated on 14 different types of LSDs and then used to analyze 48 mutation-unknown patients with a clinical phenotype of neuronal ceroid lipofuscinosis (NCL), a genetically heterogeneous subtype of LSD. Lysoplex allowed us to identify pathogenic mutations in 67% of patients, most of whom had been unsuccessfully analyzed by several sequencing approaches. In addition, in 3 patients, we found potential disease-causing variants in novel NCL candidate genes. We then compared the variant detection power of Lysoplex with data derived from public whole exome sequencing (WES) efforts. On average, a 50% higher number of validated amino acid changes and truncating variations per gene were identified. Overall, we identified 61 truncating sequence variations and 488 missense variations with a high probability to cause loss of function in a total of 316 genes. Interestingly, some loss-of-function variations of genes involved in the ALP pathway were found in homozygosity in the normal population, suggesting that their role is not essential. Thus, Lysoplex provided a comprehensive catalog of sequence variants in ALP genes and allows the assessment of their relevance in cell biology as well as their contribution to human disease. PMID:26075876

  16. GeneSV - an Approach to Help Characterize Possible Variations in Genomic and Protein Sequences.

    PubMed

    Zemla, Adam; Kostova, Tanya; Gorchakov, Rodion; Volkova, Evgeniya; Beasley, David W C; Cardosa, Jane; Weaver, Scott C; Vasilakis, Nikos; Naraghi-Arani, Pejman

    2014-01-01

    A computational approach for identification and assessment of genomic sequence variability (GeneSV) is described. For a given nucleotide sequence, GeneSV collects information about the permissible nucleotide variability (changes that potentially preserve function) observed in corresponding regions in genomic sequences, and combines it with conservation/variability results from protein sequence and structure-based analyses of evaluated protein coding regions. GeneSV was used to predict effects (functional vs. non-functional) of 37 amino acid substitutions on the NS5 polymerase (RdRp) of dengue virus type 2 (DENV-2), 36 of which are not observed in any publicly available DENV-2 sequence. 32 novel mutants with single amino acid substitutions in the RdRp were generated using a DENV-2 reverse genetics system. In 81% (26 of 32) of predictions tested, GeneSV correctly predicted viability of introduced mutations. In 4 of 5 (80%) mutants with double amino acid substitutions proximal in structure to one another GeneSV was also correct in its predictions. Predictive capabilities of the developed system were illustrated on dengue RNA virus, but the general approach described in the manuscript to characterize real or theoretically possible variations in genomic and protein sequences can be applied to any organism. PMID:24453480

  17. GeneSV – an Approach to Help Characterize Possible Variations in Genomic and Protein Sequences

    PubMed Central

    Zemla, Adam; Kostova, Tanya; Gorchakov, Rodion; Volkova, Evgeniya; Beasley, David W. C.; Cardosa, Jane; Weaver, Scott C.; Vasilakis, Nikos; Naraghi-Arani, Pejman

    2014-01-01

    A computational approach for identification and assessment of genomic sequence variability (GeneSV) is described. For a given nucleotide sequence, GeneSV collects information about the permissible nucleotide variability (changes that potentially preserve function) observed in corresponding regions in genomic sequences, and combines it with conservation/variability results from protein sequence and structure-based analyses of evaluated protein coding regions. GeneSV was used to predict effects (functional vs. non-functional) of 37 amino acid substitutions on the NS5 polymerase (RdRp) of dengue virus type 2 (DENV-2), 36 of which are not observed in any publicly available DENV-2 sequence. 32 novel mutants with single amino acid substitutions in the RdRp were generated using a DENV-2 reverse genetics system. In 81% (26 of 32) of predictions tested, GeneSV correctly predicted viability of introduced mutations. In 4 of 5 (80%) mutants with double amino acid substitutions proximal in structure to one another GeneSV was also correct in its predictions. Predictive capabilities of the developed system were illustrated on dengue RNA virus, but the general approach described in the manuscript to characterize real or theoretically possible variations in genomic and protein sequences can be applied to any organism. PMID:24453480

  18. 77 FR 65537 - Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-10-29

    ... Amino Acid Sequence Disclosures ACTION: Proposed collection; comment request. SUMMARY: The United States....'' SUPPLEMENTARY INFORMATION: I. Abstract Patent applications that contain nucleotide and/or amino acid sequence disclosures must include a copy of the sequence listing in accordance with the requirements in 37 CFR...

  19. DNA Shape versus Sequence Variations in the Protein Binding Process.

    PubMed

    Chen, Chuanying; Pettitt, B Montgomery

    2016-02-01

    The binding process of a protein with a DNA involves three stages: approach, encounter, and association. It has been known that the complexation of protein and DNA involves mutual conformational changes, especially for a specific sequence association. However, it is still unclear how the conformation and the information in the DNA sequences affect the binding process. What is the extent to which the DNA structure adopted in the complex is induced by protein binding, or is instead intrinsic to the DNA sequence? In this study, we used the multiscale simulation method to explore the binding process of a protein with DNA in terms of DNA sequence, conformation, and interactions. We found that in the approach stage the protein can bind both the major and minor groove of the DNA, but uses different features to locate the binding site. The intrinsic conformational properties of the DNA play a significant role in this binding stage. By comparing the specific DNA with the nonspecific in unbound, intermediate, and associated states, we found that for a specific DNA sequence, ∼40% of the bending in the association forms is intrinsic and that ∼60% is induced by the protein. The protein does not induce appreciable bending of nonspecific DNA. In addition, we proposed that the DNA shape variations induced by protein binding are required in the early stage of the binding process, so that the protein is able to approach, encounter, and form an intermediate at the correct site on DNA. PMID:26840719

  20. Sequence variation in the androgen receptor gene is not a common determinant of male sexual orientation.

    PubMed Central

    Macke, J P; Hu, N; Hu, S; Bailey, M; King, V L; Brown, T; Hamer, D; Nathans, J

    1993-01-01

    To test the hypothesis that DNA sequence variation in the androgen receptor gene plays a causal role in the development of male sexual orientation, we have (1) measured the degree of concordance of androgen receptor alleles in 36 pairs of homosexual brothers, (2) compared the lengths of polyglutamine and polyglycine tracts in the amino-terminal domain of the androgen receptor in a sample of 197 homosexual males and 213 unselected subjects, and (3) screened the entire androgen receptor coding region for sequence variation by PCR and denaturing gradient-gel electrophoresis (DGGE) and/or single-strand conformation polymorphism analysis in 20 homosexual males with homosexual or bisexual brothers and one homosexual male with no homosexual brothers, and screened the amino-terminal domain of the receptor for sequence variation in an additional 44 homosexual males, 37 of whom had one or more first- or second-degree male relatives who were either homosexual or bisexual. These analyses show that (1) homosexual brothers are as likely to be discordant as concordant for androgen receptor alleles; (2) there are no large-scale differences between the distributions of polyglycine or polyglutamine tract lengths in the homosexual and control groups; and (3) coding region sequence variation is not commonly found within the androgen receptor gene of homosexual men. The DGGE screen identified two rare amino acid substitutions, ser205-to-arg and glu793-to-asp, the biological significance of which is unknown. PMID:8213813

  1. Sequence variation in the androgen receptor gene is not a common determinant of male sexual orientation

    SciTech Connect

    Macke, J.P.; Nathans, J.; King, V.L.; Hu, N.; Hu, S.; Hamer, D.; Bailey, M.; Brown, T.

    1993-10-01

    To test the hypothesis that DNA sequence variation in the androgen receptor gene plays a causal role in the development of male sexual orientation, the authors have (1) measured the degree of concordance of androgen receptor alleles in 36 pairs of homosexual brothers, (2) compared the lengths of polyglutamine and polyglycine tracts in the amino-terminal domain of the androgen receptor in a sample of 197 homosexual males and 213 unselected subjects, and (3) screened the entire androgen receptor coding region for sequence variation by PCR and denaturing gradient-gel electrophoresis (DGGE) and/or single-strand conformation polymorphism analysis in 20 homosexual males with homosexual or bisexual brothers and one homosexual male with no homosexual brothers, and screened the amino-terminal domain of the receptor for sequence variation in an additional 44 homosexual males, 37 of whom had one or more first- or second-degree male relatives who were either homosexual or bisexual. These analyses show that (1) homosexual brothers are as likely to be discordant as concordant for androgen receptor alleles; (2) there are no large-scale differences between the distributions of polyglycine or polyglutamine tract lengths in the homosexual and control groups; and (3) coding region sequence variation is not commonly found within the androgen receptor gene of homosexual men. The DGGE screen identified two rare amino acid substitutions, ser205-to-arg and glu793-to-asp, the biological significance of which is unknown. 32 refs., 2 figs., 2 tabs.

  2. Attacin gene sequence variations in different ecoraces of tasar silkworm Antheraea mylitta

    PubMed Central

    Sudha, Rati; Murthy, Geetha N; Awasthi, Arvind K; Ponnuvel, Kangayam M

    2015-01-01

    Attacin gene exists as paralogous conversion and is being used for identification of strain variations in insects based on the sequence variation. Hence, a study was undertaken to analyze the sequence variation of the attacin gene isoforms in the tasar silkworm Antheraea mylitta that exists in the form of different ecoraces depending upon the environment, food plant and location. Comparison of the previously reported attacin sequences with the DNA sequences of attacin A and B genes revealed six amino acid substitutions among the sequences of the ecoraces which however did not affect the functional domain of Attacin. The generated dendrogram clearly indicated unique branches for each ecorace with two separate gene clusters for attacin A and B. The Sarihan ecorace formed a separate sub-group under both the gene clusters. The present study also revealed the presence of Attacin_N Superfamily domain exclusively in Exon I separated from the Attacin_C Superfamily domain that was present in Exon II and part of Exon III, a prominent character of attacin gene. The phylogenetic reconstruction analysis of attacin gene in A. mylitta supported the common evolutionary origin of attacin genes belonging to the Lepidopteran and Dipteran families that formed two separate clusters. PMID:26664033

  3. STR allele sequence variation: Current knowledge and future issues.

    PubMed

    Gettings, Katherine Butler; Aponte, Rachel A; Vallone, Peter M; Butler, John M

    2015-09-01

    This article reviews what is currently known about short tandem repeat (STR) allelic sequence variation in and around the twenty-four loci most commonly used throughout the world to perform forensic DNA investigations. These STR loci include D1S1656, TPOX, D2S441, D2S1338, D3S1358, FGA, CSF1PO, D5S818, SE33, D6S1043, D7S820, D8S1179, D10S1248, TH01, vWA, D12S391, D13S317, Penta E, D16S539, D18S51, D19S433, D21S11, Penta D, and D22S1045. All known reported variant alleles are compiled along with genomic information available from GenBank, dbSNP, and the 1000 Genomes Project. Supplementary files are included which provide annotated reference sequences for each STR locus, characterize genomic variation around the STR repeat region, and compare alleles present in currently available STR kit allelic ladders. Looking to the future, STR allele nomenclature options are discussed as they relate to next generation sequencing efforts underway. PMID:26197946

  4. Predicting intrinsic disorder from amino acid sequence.

    PubMed

    Obradovic, Zoran; Peng, Kang; Vucetic, Slobodan; Radivojac, Predrag; Brown, Celeste J; Dunker, A Keith

    2003-01-01

    Blind predictions of intrinsic order and disorder were made on 42 proteins subsequently revealed to contain 9,044 ordered residues, 284 disordered residues in 26 segments of length 30 residues or less, and 281 disordered residues in 2 disordered segments of length greater than 30 residues. The accuracies of the six predictors used in this experiment ranged from 77% to 91% for the ordered regions and from 56% to 78% for the disordered segments. The average of the order and disorder predictions ranged from 73% to 77%. The prediction of disorder in the shorter segments was poor, from 25% to 66% correct, while the prediction of disorder in the longer segments was better, from 75% to 95% correct. Four of the predictors were composed of ensembles of neural networks. This enabled them to deal more efficiently with the large asymmetry in the training data through diversified sampling from the significantly larger ordered set and achieve better accuracy on ordered and long disordered regions. The exclusive use of long disordered regions for predictor training likely contributed to the disparity of the predictions on long versus short disordered regions, while averaging the output values over 61-residue windows to eliminate short predictions of order or disorder probably contributed to the even greater disparity for three of the predictors. This experiment supports the predictability of intrinsic disorder from amino acid sequence. PMID:14579347
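
    The window averaging mentioned above can be sketched as a centered moving average over per-residue scores. This is an illustrative sketch only: the abstract does not say how chain ends were handled, so truncating the window at the ends is an assumption, and the function name and toy scores are hypothetical.

```python
def smooth_scores(scores, window=61):
    """Centered moving average over per-residue scores.

    Assumption: near the chain ends the window is truncated to the
    residues that exist, rather than padded.
    """
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - half)
        hi = min(len(scores), i + half + 1)
        smoothed.append(sum(scores[lo:hi]) / (hi - lo))
    return smoothed


# Hypothetical raw scores with one isolated spike; a 3-residue window
# is used here only to keep the example small.
toy = smooth_scores([0.0, 0.0, 1.0, 0.0, 0.0], window=3)
```

    A wide window such as 61 residues spreads an isolated spike thinly across its neighbours, which is consistent with the abstract's remark that averaging suppresses short predicted segments.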

  5. DYZ1 arrays show sequence variation between the monozygotic males

    PubMed Central

    2014-01-01

    Background Monozygotic twins (MZT) are an important resource for genetical studies in the context of normal and diseased genomes. In the present study we used DYZ1, a satellite fraction present in the form of tandem arrays on the long arm of the human Y chromosome, as a tool to uncover sequence variations between the monozygotic males. Results We detected copy number variation, frequent insertions and deletions within the sequences of DYZ1 arrays amongst all the three sets of twins used in the present study. MZT1b showed loss of 35 bp compared to that in 1a, whereas 2a showed loss of 31 bp compared to that in 2b. Similarly, 3b showed 10 bp insertion compared to that in 3a. MZT1a germline DNA showed loss of 5 bp and 1b blood DNA showed loss of 26 bp compared to that of 1a blood and 1b germline DNA, respectively. Of the 69 restriction sites detected in DYZ1 arrays, MboII, BsrI, TspEI and TaqI enzymes showed frequent loss and/or gain amongst all the 3 pairs studied. MZT1 pair showed loss/gain of VspI, BsrDI, AgsI, PleI, TspDTI, TspEI, TfiI and TaqI restriction sites in both blood and germline DNA. All the three sets of MZT showed differences in the number of DYZ1 copies. FISH signals reflected somatic mosaicism of the DYZ1 copies across the cells. Conclusions DYZ1 showed both sequence and copy number variation between the MZT males. Sequence variation was also noticed between germline and blood DNA samples of the same individual, as we observed in at least one set of samples. The result suggests that DYZ1 faithfully records all the genetical changes occurring after the twinning which may be ascribed to the environmental factors. PMID:24495361

  6. Comparative RNA sequencing reveals substantial genetic variation in endangered primates

    PubMed Central

    Perry, George H.; Melsted, Páll; Marioni, John C.; Wang, Ying; Bainer, Russell; Pickrell, Joseph K.; Michelini, Katelyn; Zehr, Sarah; Yoder, Anne D.; Stephens, Matthew; Pritchard, Jonathan K.; Gilad, Yoav

    2012-01-01

    Comparative genomic studies in primates have yielded important insights into the evolutionary forces that shape genetic diversity and revealed the likely genetic basis for certain species-specific adaptations. To date, however, these studies have focused on only a small number of species. For the majority of nonhuman primates, including some of the most critically endangered, genome-level data are not yet available. In this study, we have taken the first steps toward addressing this gap by sequencing RNA from the livers of multiple individuals from each of 16 mammalian species, including humans and 11 nonhuman primates. Of the nonhuman primate species, five are lemurs and two are lorisoids, for which little or no genomic data were previously available. To analyze these data, we developed a method for de novo assembly and alignment of orthologous gene sequences across species. We assembled an average of 5721 gene sequences per species and characterized diversity and divergence of both gene sequences and gene expression levels. We identified patterns of variation that are consistent with the action of positive or directional selection, including an 18-fold enrichment of peroxisomal genes among genes whose regulation likely evolved under directional selection in the ancestral primate lineage. Importantly, we found no relationship between genetic diversity and endangered status, with the two most endangered species in our study, the black and white ruffed lemur and the Coquerel's sifaka, having the highest genetic diversity among all primates. Our observations imply that many endangered lemur populations still harbor considerable genetic variation. Timely efforts to conserve these species alongside their habitats have, therefore, strong potential to achieve long-term success. PMID:22207615

  7. Methods and compositions for efficient nucleic acid sequencing

    DOEpatents

    Drmanac, Radoje

    2002-01-01

    Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.

  8. Methods and compositions for efficient nucleic acid sequencing

    DOEpatents

    Drmanac, Radoje

    2006-07-04

    Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.

  9. Kit for detecting nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    2001-01-01

    A kit is provided for detecting a target nucleic acid sequence in a sample, the kit comprising: a first hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the first hybridization probe including a first complexing agent for forming a binding pair with a second complexing agent; and a second hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the first hybridization probe does not selectively hybridize, the second hybridization probe including a detectable marker; a third hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the third hybridization probe including the same detectable marker as the second hybridization probe; and a fourth hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the third hybridization probe does not selectively hybridize, the fourth hybridization probe including the first complexing agent for forming a binding pair with the second complexing agent; wherein the first and second hybridization probes are capable of simultaneously hybridizing to the target sequence and the third and fourth hybridization probes are capable of simultaneously hybridizing to the target sequence, the detectable marker is not present on the first or fourth hybridization probes and the first, second, third, and fourth hybridization probes each include a competitive nucleic acid sequence which is sufficiently complementary to a third portion of the target sequence that the competitive sequences of the first, second, third, and fourth hybridization probes compete with each other to hybridize to the third portion of the

  10. tax and rex Sequences of bovine leukaemia virus from globally diverse isolates: rex amino acid sequence more variable than tax.

    PubMed

    McGirr, K M; Buehring, G C

    2005-02-01

    Bovine leukaemia virus (BLV) is an important agricultural problem with high costs to the dairy industry. Here, we examine the variation of the tax and rex genes of BLV. The tax and rex genes share 420 bases and have overlapping reading frames. The tax gene encodes a protein that functions as a transactivator of the BLV promoter, is required for viral replication, acts on cellular promoters, and is responsible for oncogenesis. The rex gene product facilitates the export of viral mRNAs from the nucleus and regulates transcription. We have sequenced five new isolates of the tax/rex gene. We examined the five new and three previously published tax/rex DNA and predicted amino acid sequences of BLV isolates from cattle in representative regions worldwide. The highest variation among nucleic acid sequences for tax and rex was 7% and 5%, respectively; among predicted amino acid sequences for Tax and Rex, 9% and 11%, respectively. Significantly more nucleotide changes resulted in predicted amino acid changes in the rex gene than in the tax gene (P ≤ 0.0006). This variability is higher than previously reported for any region of the viral genome. This research may also have implications for the development of Tax-based vaccines. PMID:15702995

  11. Unraveling genomic variation from next generation sequencing data

    PubMed Central

    2013-01-01

    Elucidating the content of a DNA sequence is critical to understanding and decoding the genetic information of any biological system. As next generation sequencing (NGS) techniques have become cheaper and more advanced in throughput over time, great innovations and breakthrough conclusions have been generated in various biological areas. A few of the areas shaped by these technological advances are evolution of species, microbial mapping, population genetics, genome-wide association studies (GWAs), comparative genomics, variant analysis, gene expression, gene regulation, epigenetics and personalized medicine. While NGS techniques stand as key players in modern biological research, the analysis and interpretation of the vast amount of data produced is not an easy or trivial task and remains a great challenge in the field of bioinformatics. Therefore, efficient tools that cope with information overload, tackle the high complexity and provide meaningful visualizations to make knowledge extraction easier are essential. In this article, we briefly refer to the sequencing methodologies and the available equipment that serve these analyses, and we describe the data formats of the files they produce. We conclude with a thorough review of tools developed to efficiently store, analyze and visualize such data, with emphasis on structural variation analysis and comparative genomics. Finally, we comment on their functionality, strengths and weaknesses, and we discuss how future applications could further develop in this field. PMID:23885890

  12. Solid phase sequencing of double-stranded nucleic acids

    DOEpatents

    Fu, Dong-Jing; Cantor, Charles R.; Koster, Hubert; Smith, Cassandra L.

    2002-01-01

    This invention relates to methods for detecting and sequencing of target double-stranded nucleic acid sequences, to nucleic acid probes and arrays of probes useful in these methods, and to kits and systems which contain these probes. Useful methods involve hybridizing the nucleic acids or nucleic acids which represent complementary or homologous sequences of the target to an array of nucleic acid probes. These probes comprise a single-stranded portion, an optional double-stranded portion and a variable sequence within the single-stranded portion. The molecular weights of the hybridized nucleic acids of the set can be determined by mass spectroscopy, and the sequence of the target determined from the molecular weights of the fragments. Nucleic acids whose sequences can be determined include nucleic acids in biological samples such as patient biopsies and environmental samples. Probes may be fixed to a solid support such as a hybridization chip to facilitate automated determination of molecular weights and identification of the target sequence.

  13. Analysis and Annotation of Nucleic Acid Sequence

    SciTech Connect

    States, David J.

    2004-07-28

    The aims of this project were to develop improved methods for computational genome annotation and to apply these methods to improve the annotation of genomic sequence data with a specific focus on human genome sequencing. The project resulted in a substantial body of published work. Notable contributions of this project were the identification of basecalling and lane tracking as error processes in genome sequencing and contributions to improved methods for these steps in genome sequencing. This technology improved the accuracy and throughput of genome sequence analysis. Probabilistic methods for physical map construction were developed. Improved methods for sequence alignment, alternative splicing analysis, promoter identification and NF kappa B response gene prediction were also developed.

  14. Analysis and Annotation of Nucleic Acid Sequence

    SciTech Connect

    David J. States

    1998-08-01

    The aims of this project were to develop improved methods for computational genome annotation and to apply these methods to improve the annotation of genomic sequence data with a specific focus on human genome sequencing. The project resulted in a substantial body of published work. Notable contributions of this project were the identification of basecalling and lane tracking as error processes in genome sequencing and contributions to improved methods for these steps in genome sequencing. This technology improved the accuracy and throughput of genome sequence analysis. Probabilistic methods for physical map construction were developed. Improved methods for sequence alignment, alternative splicing analysis, promoter identification and NF kappa B response gene prediction were also developed.

  15. Using evolutionary sequence variation to make inferences about protein structure and function

    NASA Astrophysics Data System (ADS)

    Colwell, Lucy

    2015-03-01

    The evolutionary trajectory of a protein through sequence space is constrained by its function. Collections of sequence homologs record the outcomes of millions of evolutionary experiments in which the protein evolves according to these constraints. The explosive growth in the number of available protein sequences raises the possibility of using the natural variation present in homologous protein sequences to infer these constraints and thus identify residues that control different protein phenotypes. Because in many cases phenotypic changes are controlled by more than one amino acid, the mutations that separate one phenotype from another may not be independent, requiring us to understand the correlation structure of the data. To address this we build a maximum entropy probability model for the protein sequence. The parameters of the inferred model are constrained by the statistics of a large sequence alignment. Pairs of sequence positions with the strongest interactions accurately predict contacts in protein tertiary structure, enabling all atom structural models to be constructed. We describe development of a theoretical inference framework that enables the relationship between the amount of available input data and the reliability of structural predictions to be better understood.
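
    For orientation, a pairwise maximum entropy model of this kind is commonly written in the form below. This is a sketch under stated assumptions, not necessarily the author's exact formulation: it assumes the constraints are the single-site frequencies f_i(a) and pairwise frequencies f_ij(a,b) of an alignment of length L, and the symbols h_i, J_ij and Z are introduced here for illustration.

```latex
% Assumed pairwise (Potts-type) maximum entropy model over sequences s = (s_1, ..., s_L)
P(s_1,\dots,s_L) = \frac{1}{Z}\exp\!\Big(\sum_{i=1}^{L} h_i(s_i) + \sum_{1\le i<j\le L} J_{ij}(s_i,s_j)\Big),
\qquad
Z = \sum_{s'} \exp\!\Big(\sum_{i=1}^{L} h_i(s'_i) + \sum_{1\le i<j\le L} J_{ij}(s'_i,s'_j)\Big)

% Parameters are chosen so that model marginals match the alignment statistics
\sum_{s:\, s_i = a} P(s) = f_i(a),
\qquad
\sum_{s:\, s_i = a,\; s_j = b} P(s) = f_{ij}(a,b)
```

    Under this form, the strength of the inferred couplings J_ij between positions i and j is what would be ranked to propose contacts in the tertiary structure.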

  16. Sequence variation in ROP8 gene among Toxoplasma gondii isolates from different hosts and geographical localities.

    PubMed

    Li, Z Y; Chen, J; Lu, J; Wang, C R; Zhu, X Q

    2015-01-01

    The protozoan parasite Toxoplasma gondii has a worldwide distribution; it can cause serious diseases in humans and almost all other warm-blooded animals. Different genotypes of T. gondii result in different lesions in the same host. T. gondii rhoptry protein 8 (TgROP8) is a major factor of T. gondii acute virulence. We examined sequence variation in the TgROP8 gene among T. gondii isolates from different hosts and geographical localities. The TgROP8 gene was amplified from individual isolates and sequenced. A phylogenetic tree was constructed using Bayesian inference, maximum parsimony, and maximum likelihood based on the sequences obtained plus TgME49 from the ToxoDB database. The TgROP8 gene was 1728 bp in length for all the examined T. gondii strains, and their A+T contents were 45.37-45.95%. Sequence analysis detected 140 (0.06-5.56%) variable nucleotide positions resulting in 96 (0-10.78%) amino acid substitutions. Sequence variations in the TgROP8 gene resulted in polymorphic restriction sites for endonucleases BstBI, BsaI, and XhoI, which allowed the differentiation of the three classical genotype strains (types I, II, and III) by polymerase chain reaction-restriction fragment length polymorphism (PCR-RFLP). However, phylogenetic analyses indicated that the TgROP8 gene is not a suitable genetic marker for population studies of T. gondii. PMID:26436382
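
    The idea behind distinguishing alleles by polymorphic restriction sites can be sketched as follows. The recognition sequences are those commonly listed for BstBI, BsaI and XhoI and should be checked against supplier documentation; the two short alleles are invented toy sequences, not TgROP8 data, and the function names are hypothetical.

```python
# Recognition sequences as commonly listed for these enzymes (verify before use).
SITES = {"BstBI": "TTCGAA", "BsaI": "GGTCTC", "XhoI": "CTCGAG"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def site_positions(seq, site):
    """Return sorted 0-based start positions of a site on either strand."""
    seq = seq.upper()
    hits = set()
    # Non-palindromic sites (such as BsaI) must also be searched as their
    # reverse complement; for palindromic sites both patterns are identical.
    for pattern in {site, reverse_complement(site)}:
        start = seq.find(pattern)
        while start != -1:
            hits.add(start)
            start = seq.find(pattern, start + 1)
    return sorted(hits)


def at_content(seq):
    """Return the A+T content of a sequence as a percentage."""
    seq = seq.upper()
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)


# Toy alleles differing by one substitution that abolishes the XhoI site.
allele_a = "AAACTCGAGTTT"
allele_b = "AAACTCAAGTTT"
xhoi_a = site_positions(allele_a, SITES["XhoI"])
xhoi_b = site_positions(allele_b, SITES["XhoI"])
```

    An allele that retains the site would be cut by the enzyme and one that has lost it would not, which is the basis of the PCR-RFLP differentiation mentioned in the abstract.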

  17. Deep sequencing of the hepatitis B virus in hepatocellular carcinoma patients reveals enriched integration events, structural alterations and sequence variations.

    PubMed

    Toh, Soo Ting; Jin, Yu; Liu, Lizhen; Wang, Jingbo; Babrzadeh, Farbod; Gharizadeh, Baback; Ronaghi, Mostafa; Toh, Han Chong; Chow, Pierce Kah-Hoe; Chung, Alexander Y-F; Ooi, London L-P-J; Lee, Caroline G-L

    2013-04-01

    Chronic hepatitis B virus (HBV) infection is epidemiologically associated with hepatocellular carcinoma (HCC), but its role in HCC remains poorly understood due to technological limitations. In this study, we systematically characterize HBV in HCC patients. HBV sequences were enriched from 48 HCC patients using an oligo-bead-based strategy, pooled together and sequenced using the FLX-Genome-Sequencer. In the tumors, preferential integration of HBV into promoters of genes (P < 0.001) and significant enrichment of integration into chromosome 10 (P < 0.01) were observed. Integration into chromosome 10 was significantly associated with poorly differentiated tumors (P < 0.05). Notably, in the tumors, recurrent integration into the promoter of the human telomerase reverse transcriptase (TERT) gene was found to correlate with increased TERT expression. The preferred region within the HBV genome involved in integration and viral structural alteration is at the 3'-end of hepatitis B virus X protein (HBx), where viral replication/transcription initiates. Upon integration, the 3'-end of the HBx is often deleted. HBx-human chimeric transcripts, the most common type of chimeric transcripts, can be expressed as chimeric proteins. Sequence variations resulting in non-conservative amino acid substitutions are commonly observed in the HBV genome. This study highlights HBV as highly mutable in HCC patients, with preferential regions within the host and virus genomes for HBV integration/structural alterations. PMID:23276797

  18. Conversion of amino-acid sequence in proteins to classical music: search for auditory patterns

    PubMed Central

    2007-01-01

    We have converted genome-encoded protein sequences into musical notes to reveal auditory patterns without compromising musicality. We derived a reduced range of 13 base notes by pairing similar amino acids and distinguishing them using variations of three-note chords and codon distribution to dictate rhythm. The conversion will help make genomic coding sequences more approachable for the general public, young children, and vision-impaired scientists. PMID:17477882

  19. Alpha-CENTAURI: assessing novel centromeric repeat sequence variation with long read sequencing

    PubMed Central

    Sevim, Volkan; Bashir, Ali; Chin, Chen-Shan; Miga, Karen H.

    2016-01-01

    Motivation: Long arrays of near-identical tandem repeats are a common feature of centromeric and subtelomeric regions in complex genomes. These sequences present a source of repeat structure diversity that is commonly ignored by standard genomic tools. Unlike reads shorter than the underlying repeat structure that rely on indirect inference methods, e.g. assembly, long reads allow direct inference of satellite higher order repeat structure. To automate characterization of local centromeric tandem repeat sequence variation we have designed Alpha-CENTAURI (ALPHA satellite CENTromeric AUtomated Repeat Identification), that takes advantage of Pacific Bioscience long-reads from whole-genome sequencing datasets. By operating on reads prior to assembly, our approach provides a more comprehensive set of repeat-structure variants and is not impacted by rearrangements or sequence underrepresentation due to misassembly. Results: We demonstrate the utility of Alpha-CENTAURI in characterizing repeat structure for alpha satellite containing reads in the hydatidiform mole (CHM1, haploid-like) genome. The pipeline is designed to report local repeat organization summaries for each read, thereby monitoring rearrangements in repeat units, shifts in repeat orientation and sites of array transition into non-satellite DNA, typically defined by transposable element insertion. We validate the method by showing consistency with existing centromere high order repeat references. Alpha-CENTAURI can, in principle, run on any sequence data, offering a method to generate a sequence repeat resolution that could be readily performed using consensus sequences available for other satellite families in genomes without high-quality reference assemblies. Availability and implementation: Documentation and source code for Alpha-CENTAURI are freely available at http://github.com/volkansevim/alpha-CENTAURI. Contact: ali.bashir@mssm.edu Supplementary information: Supplementary data are available at

  20. Mitochondrial sequence variation in the Guahibo Amerindian population from Venezuela.

    PubMed

    Vona, Giuseppe; Falchi, Alessandra; Moral, Pedro; Calò, Carla M; Varesi, Laurent

    2005-07-01

    New data were obtained on mitochondrial DNA (mtDNA) from Guahibo from Venezuela, a group so far not studied using molecular data. A population sample (n = 59) was analyzed for mtDNA variation in two control-region hypervariable segments (HV1 and HV2) by sequencing. The presence or absence of a 9-bp polymorphism in the COII/tRNA(Lys) region was studied by direct amplification and electrophoretic identification. Thirty-eight variable sites were detected in regions HV1 and HV2, defining 26 mtDNA lineages; 23.7% of these were present in a single individual. The 9-bp deletion was found in 3.39% of individuals. Nucleotide and haplotype diversities were relatively high compared with other New World populations. The identified sequence haplotypes were classified into four major haplogroups (A-D) according to previous studies, with high frequencies for A (47.46%) and C (49.15%), low frequency for B (3.39%), and an absence of D. PMID:15558610
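
    As a consistency check on the reported figures, the haplogroup percentages can be converted back to whole-number counts for n = 59. The counts below are implied by the percentages, not stated in the abstract, and the helper name is hypothetical.

```python
SAMPLE_SIZE = 59  # n reported in the abstract

# Haplogroup frequencies (percent) as reported in the abstract.
reported = {"A": 47.46, "B": 3.39, "C": 49.15, "D": 0.0}


def implied_count(percent, n=SAMPLE_SIZE):
    """Return the nearest whole number of individuals for a percentage of n."""
    return round(percent * n / 100)


counts = {group: implied_count(pct) for group, pct in reported.items()}

# Recompute percentages from the implied counts, to two decimal places.
recomputed = {group: round(100 * c / SAMPLE_SIZE, 2) for group, c in counts.items()}
```

    The implied counts sum to the sample size and reproduce the reported percentages, so the figures are internally consistent. The 3.39% frequency of the 9-bp deletion likewise corresponds to 2 of 59 individuals.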

  1. Mapping copy number variation by population-scale genome sequencing.

    PubMed

    Mills, Ryan E; Walter, Klaudia; Stewart, Chip; Handsaker, Robert E; Chen, Ken; Alkan, Can; Abyzov, Alexej; Yoon, Seungtai Chris; Ye, Kai; Cheetham, R Keira; Chinwalla, Asif; Conrad, Donald F; Fu, Yutao; Grubert, Fabian; Hajirasouliha, Iman; Hormozdiari, Fereydoun; Iakoucheva, Lilia M; Iqbal, Zamin; Kang, Shuli; Kidd, Jeffrey M; Konkel, Miriam K; Korn, Joshua; Khurana, Ekta; Kural, Deniz; Lam, Hugo Y K; Leng, Jing; Li, Ruiqiang; Li, Yingrui; Lin, Chang-Yun; Luo, Ruibang; Mu, Xinmeng Jasmine; Nemesh, James; Peckham, Heather E; Rausch, Tobias; Scally, Aylwyn; Shi, Xinghua; Stromberg, Michael P; Stütz, Adrian M; Urban, Alexander Eckehart; Walker, Jerilyn A; Wu, Jiantao; Zhang, Yujun; Zhang, Zhengdong D; Batzer, Mark A; Ding, Li; Marth, Gabor T; McVean, Gil; Sebat, Jonathan; Snyder, Michael; Wang, Jun; Ye, Kenny; Eichler, Evan E; Gerstein, Mark B; Hurles, Matthew E; Lee, Charles; McCarroll, Steven A; Korbel, Jan O

    2011-02-01

    Genomic structural variants (SVs) are abundant in humans, differing from other forms of variation in extent, origin and functional impact. Despite progress in SV characterization, the nucleotide resolution architecture of most SVs remains unknown. We constructed a map of unbalanced SVs (that is, copy number variants) based on whole genome DNA sequencing data from 185 human genomes, integrating evidence from complementary SV discovery approaches with extensive experimental validations. Our map encompassed 22,025 deletions and 6,000 additional SVs, including insertions and tandem duplications. Most SVs (53%) were mapped to nucleotide resolution, which facilitated analysing their origin and functional impact. We examined numerous whole and partial gene deletions with a genotyping approach and observed a depletion of gene disruptions amongst high frequency deletions. Furthermore, we observed differences in the size spectra of SVs originating from distinct formation mechanisms, and constructed a map of SV hotspots formed by common mechanisms. Our analytical framework and SV map serve as a resource for sequencing-based association studies. PMID:21293372

  2. From Artificial Amino Acids to Sequence-Defined Targeted Oligoaminoamides.

    PubMed

    Morys, Stephan; Wagner, Ernst; Lächelt, Ulrich

    2016-01-01

    Artificial oligoamino acids with appropriate protecting groups can be used for the sequential assembly of oligoaminoamides on solid-phase. With the help of these oligoamino acids multifunctional nucleic acid (NA) carriers can be designed and produced in highly defined topologies. Here we describe the synthesis of the artificial oligoamino acid Fmoc-Stp(Boc3)-OH, the subsequent assembly into sequence-defined oligomers and the formulation of tumor-targeted plasmid DNA (pDNA) polyplexes. PMID:27436323

  3. Sequence variation in the human T-cell receptor loci.

    PubMed

    Mackelprang, Rachel; Carlson, Christopher S; Subrahmanyan, Lakshman; Livingston, Robert J; Eberle, Michael A; Nickerson, Deborah A

    2002-12-01

    Identifying common sequence variations known as single nucleotide polymorphisms (SNPs) in human populations is one of the current objectives of the human genome project. Nearly 3 million SNPs have been identified. Analysis of the relative allele frequency of these markers in human populations and the genetic associations between these markers, known as linkage disequilibrium, is now underway to generate a high-density genetic map. Because of the central role T cells play in immune reactivity, the T-cell receptor (TCR) loci have long been considered important candidates for common disease susceptibility within the immune system (e.g., asthma, atopy and autoimmunity). Over the past two decades, hundreds of SNPs in the TCR loci have been identified. Most studies have focused on defining SNPs in the variable gene segments which are involved in antigenic recognition. On average, the coding sequence of each TCR variable gene segment contains two SNPs, with many more found in the 5', 3' and intronic sequences of these segments. Therefore, a potentially large repertoire of functional variants exists in these loci. Association between SNPs (linkage disequilibrium) extends approximately 30 kb in the TCR loci, although a few larger regions of disequilibrium have been identified. Therefore, the SNPs found in one variable gene segment may or may not be associated with SNPs in other surrounding variable gene segments. This suggests that meaningful association studies in the TCR loci will require the analysis and typing of large marker sets to fully evaluate the role of TCR loci in common disease susceptibility in human populations. PMID:12493004

  4. Studies on monotreme proteins. VII. Amino acid sequence of myoglobin from the platypus, Ornithorhynchus anatinus.

    PubMed

    Fisher, W K; Thompson, E O

    1976-03-01

    Myoglobin isolated from skeletal muscle of the platypus contains 153 amino acid residues. The complete amino acid sequence has been determined following cleavage with cyanogen bromide and further digestion of the four fragments with trypsin, chymotrypsin, pepsin and thermolysin. Sequences of the purified peptides were determined by the dansyl-Edman procedure. The amino acid sequence showed 25 differences from human myoglobin and 24 from kangaroo myoglobin. Amino acid sequences in myoglobins are more conserved than sequences in the alpha- and beta-globin chains, and platypus myoglobin shows a similar number of variations in sequence to kangaroo myoglobin when compared with myoglobin of other species. The date of divergence of the platypus from other mammals was estimated at 102 +/- 31 million years, based on the number of amino acid differences between species and allowing for mutations during the evolutionary period. This estimate differs widely from the estimate given by similar treatment of the alpha- and beta-chain sequences and a constant rate of mutation of globin chains is not supported. PMID:962722
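
    Difference counts like these are obtained by comparing aligned sequences position by position. The sketch below assumes two gap-free sequences of equal length; the seven-residue peptides are invented for illustration and are not the myoglobin sequences, and the function names are hypothetical.

```python
def count_differences(seq_a, seq_b):
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def percent_difference(differences, length):
    """Express a difference count as a percentage of the sequence length."""
    return round(100 * differences / length, 1)


# Toy aligned peptides differing at one position (hypothetical data).
toy_differences = count_differences("GLSDGEW", "GLSEGEW")

# Figures reported in the abstract: 153 residues, 25 differences from human
# myoglobin and 24 from kangaroo myoglobin.
vs_human = percent_difference(25, 153)
vs_kangaroo = percent_difference(24, 153)
```

    On the reported figures, platypus myoglobin differs from human myoglobin at roughly 16.3% of positions and from kangaroo myoglobin at roughly 15.7%.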

  5. Geographical distribution and temporal variation of rain acidity over China

    SciTech Connect

    Wen-Xing Wang; Yan-Bo Pang; Guo-An Ding

    1996-12-31

    In the past decade, large areas of acid rain have appeared in China. With emissions of SO2 and NOx increasing year by year, the acidity of precipitation has increased, and the acid rain area is expanding. China now has the third largest area of acid rain in the world, after Europe and North America. The Chinese government took action against acid rain and planned a five-year National Acid Deposition Research Project. The space-time distribution and variation of rain acidity described in this paper is a part of this project. China is a large country, with an area almost equal to that of Europe. Its climate varies greatly and spans the tropical, subtropical, temperate and frigid zones. The topography is varied, including mountains, hilly country, desert and plains, and anthropogenic sources are unevenly distributed. Together, these human and natural factors produce different chemical compositions of precipitation in different parts of China, and the acidity of precipitation varies as well. The acidity of precipitation is the most important parameter in acid rain research. In order to obtain a regionally representative distribution of rain acidity, the National Acidic Deposition Research Monitoring Network, with 261 monitoring sites, was established in 1992. This paper summarizes the rain acidity of 21355 precipitation samples and gives the annual, seasonal, and monthly pH contours. Results show that the acid rain area has expanded from the south during winter. Regional differences in monthly acid precipitation exist: generally, the rain acidity level is higher during summer and fall and lower during winter and spring in the northern provinces. The opposite is the case in the southern provinces. The central areas are in a transitional situation. The geographical distribution and temporal variation of rain acidity are quite different from those in North America and Europe.

  6. Detecting frame shifts by amino acid sequence comparison.

    PubMed

    Claverie, J M

    1993-12-20

    Various amino acid substitution scoring matrices are used in conjunction with local alignment programs to detect regions of similarity and infer potential common ancestry between proteins. The usual scoring schemes derive from the implicit hypothesis that related proteins evolve from a common ancestor by the accumulation of point mutations and that amino acids tend to be progressively substituted by others with similar properties. However, other frequent single mutation events, like nucleotide insertion or deletion and gene inversion, change the translation reading frame and cause previously encoded amino acid sequences to become unrecognizable at once. Here, I derive five new types of scoring matrix, each capable of detecting a specific frame shift (deletion, insertion and inversion in 3 frames), and use them with a regular local alignment program to detect amino acid sequences that may have derived from alternative reading frames of the same nucleotide sequence. Frame shifts are inferred from the sole comparison of the protein sequences. The five scoring matrices were used with the BLASTP program to compare all the protein sequences in the Swissprot database. Surprisingly, the searches revealed hundreds of highly significant frame shift matches, of which many are likely to represent sequencing errors. Others provide some evidence that frame shift mutations might be used in protein evolution as a way to create new amino acid sequences from pre-existing coding regions. PMID:7903399
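
    Why a single-nucleotide deletion makes the downstream protein unrecognizable, yet still related to an alternative reading frame of the original, can be seen in a small sketch. It uses the standard genetic code and an invented 15-nucleotide sequence; it does not reproduce the scoring matrices derived in the paper, and the function name is hypothetical.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order ('*' marks a stop).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna, frame=0):
    """Translate complete codons of a DNA sequence starting at the given offset."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[codon] for codon in codons)


# Invented toy coding sequence and a copy with the fourth nucleotide deleted.
original = "ATGGCCAAATTTGGG"
deleted = original[:3] + original[4:]

protein_original = translate(original)             # reading frame 0
protein_deleted = translate(deleted)               # after the deletion
protein_alt_frame = translate(original, frame=1)   # original, shifted by one
```

    After the deletion, every residue downstream of the first codon changes, but those residues coincide with the original sequence read one nucleotide out of frame, which is the kind of relationship the frame shift matrices are designed to detect from protein sequences alone.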

  7. Segments of amino acid sequence similarity in beta-amylases.

    PubMed

    Friedberg, F; Rhodes, C

    1988-01-01

    In alpha-amylases from animals, plants and bacteria and in beta-amylases from plants and bacteria a number of segments exhibit amino acid sequence similarity specific to the alpha or to the beta type, respectively. In the case of the beta-amylases the similar sequence regions are extensive and they are disrupted only by short interspersed dissimilar regions. Close to the C terminus, however, no such sequence similarity exists. PMID:2464171

  8. Genotypic variation in fatty acid content of blackcurrant seeds.

    PubMed

    Ruiz del Castillo, M L; Dobson, G; Brennan, R; Gordon, S

    2002-01-16

    The fatty acid composition and total fatty acid content of seeds from 36 blackcurrant genotypes developed at the Scottish Crop Research Institute were examined. A rapid small-scale procedure, involving homogenization of seeds in toluene followed by sodium methoxide transesterification and gas chromatography, was used. There was considerable variation between genotypes. The gamma-linolenic acid content generally varied from 11 to 19% of the total fatty acids, but three genotypes had higher values of 22-24%, levels previously not reported for blackcurrant seed and similar to those for borage seed. Other nutritionally important fatty acids, stearidonic acid and alpha-linolenic acid, varied from 2 to 4% and 10-19%, respectively. The mean total fatty acid contents ranged from 14 to 23% of the seed, but repeatability was poor. The results are discussed. Blackcurrant seeds are mainly byproducts from juice production, and the study shows the potential for developing blackcurrant genotypes with optimal added value. PMID:11782203

  9. Simple sequence repeat variations expedite phage divergence: Mechanisms of indels and gene mutations.

    PubMed

    Lin, Tiao-Yin

    2016-07-01

    Phages are the most abundant biological entities and influence prokaryotic communities on Earth. Comparing closely related genomes sheds light on molecular events shaping phage evolution. Simple sequence repeat (SSR) variations impart over half of the genomic changes between T7M and T3, indicating an important role of SSRs in accelerating phage genetic divergence. Differences in coding and noncoding regions of phages infecting different hosts, coliphages T7M and T3, Yersinia phage ϕYeO3-12, and Salmonella phage ϕSG-JL2, frequently arise from SSR variations. Such variations modify noncoding and coding regions; the latter efficiently changes multiple amino acids, thereby hastening protein evolution. Four classes of events are found to drive SSR variations: insertion/deletion of SSR units, expansion/contraction of SSRs without alteration of genome length, changes of repeat motifs, and generation/loss of repeats. The categorization demonstrates the ways SSRs mutate in genomes during phage evolution. Indels are common constituents of genome variations and human diseases, yet, how they occur without preexisting repeat sequence is less understood. Non-repeat-unit-based misalignment-elongation (NRUBME) is proposed to be one mechanism for indels without adjacent repeats. NRUBME or consecutive NRUBME may also change repeat motifs or generate new repeats. NRUBME invoking a non-Watson-Crick base pair explains insertions that initiate mononucleotide repeats. Furthermore, NRUBME successfully interprets many inexplicable human di- to tetranucleotide repeat generations. This study provides the first evidence of SSR variations expediting phage divergence, and enables insights into the events and mechanisms of genome evolution. NRUBME allows us to emulate natural evolution to design indels for various applications. PMID:27133219

  10. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... acids are not intended to be embraced by this definition. Any amino acid sequence that contains post-translationally modified amino acids may be described as the amino acid sequence that is initially translated... sequence of four or more amino acids or an unbranched sequence of ten or more nucleotides....

  11. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... acids are not intended to be embraced by this definition. Any amino acid sequence that contains post-translationally modified amino acids may be described as the amino acid sequence that is initially translated... sequence of four or more amino acids or an unbranched sequence of ten or more nucleotides....

  12. 37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... acids are not intended to be embraced by this definition. Any amino acid sequence that contains post-translationally modified amino acids may be described as the amino acid sequence that is initially translated... sequence of four or more amino acids or an unbranched sequence of ten or more nucleotides....

  13. Intraspecific Variation of Unusual Phospholipids from Corynebacterium spp. Containing a Novel Fatty Acid

    PubMed Central

    Niepel, Tanja; Meyer, Holger; Wray, Victor; Abraham, Wolf-Rainer

    1998-01-01

    The novel fatty acid trans-9-methyl-10-octadecenoic acid was isolated from the coryneform bacterial strain LMG 3820 (previously misidentified as Arthrobacter globiformis) and identified by spectroscopic methods and chemical derivatization. This fatty acid is attached to the unusual lipid acyl phosphatidylglycerol. Five different species of this lipid type were identified; their structures were elucidated by tandem mass spectrometry and are reported here for the first time. Additionally, we identified three different cardiolipins, two bearing the novel fatty acid. The characteristic 10-methyl-octadecanoic acid was present only in phosphatidylinositol. Because of the unusual fatty acid pattern of strain LMG 3820, the 16S rDNA sequence was determined and showed regions of identity to sequences of Corynebacterium variabilis DSM 20132T and DSM 20536. All three strains possessed the novel fatty acid, identifying trans-9-methyl-10-octadecenoic acid as a potential biomarker characteristic for this taxon. Surprisingly, the fatty acid and relative abundances of phospholipids of Corynebacterium sp. strain LMG 3820 were similar to those of the type strain but different from those of Corynebacterium variabilis DSM 20536, although all three strains possessed identical 16S rDNA sequences and strains DSM 20132T and DSM 20536 have 90.5% DNA-DNA homology. This is one of the rare cases wherein different organisms with identical 16S rDNA sequences have been observed to present recognizably different fatty acid and lipid compositions. Since methylation of a fatty acid considerably lowers the transition temperature of the corresponding lipid resulting in a more flexible cell membrane, the intraspecific variation in the lipid composition, coinciding with the morphological and Gram stain reaction variability of this species, probably offers an advantage for this species to inhabit different environmental niches. PMID:9721308

  14. Nucleotide sequence variation of the envelope protein gene identifies two distinct genotypes of yellow fever virus.

    PubMed Central

    Chang, G J; Cropp, B C; Kinney, R M; Trent, D W; Gubler, D J

    1995-01-01

    The evolution of yellow fever virus over 67 years was investigated by comparing the nucleotide sequences of the envelope (E) protein genes of 20 viruses isolated in Africa, the Caribbean, and South America. Uniformly weighted parsimony algorithm analysis defined two major evolutionary yellow fever virus lineages designated E genotypes I and II. E genotype I contained viruses isolated from East and Central Africa. E genotype II viruses were divided into two sublineages: IIA viruses from West Africa and IIB viruses from America, except for a 1979 virus isolated from Trinidad (TRINID79A). Unique signature patterns were identified at 111 nucleotide and 12 amino acid positions within the yellow fever virus E gene by signature pattern analysis. Yellow fever viruses from East and Central Africa contained unique signatures at 60 nucleotide and five amino acid positions, those from West Africa contained unique signatures at 25 nucleotide and two amino acid positions, and viruses from America contained such signatures at 30 nucleotide and five amino acid positions in the E gene. The dissemination of yellow fever viruses from Africa to the Americas is supported by the close genetic relatedness of genotype IIA and IIB viruses and genetic evidence of a possible second introduction of yellow fever virus from West Africa, as illustrated by the TRINID79A virus isolate. The E protein genes of American IIB yellow fever viruses had higher frequencies of amino acid substitutions than did genes of yellow fever viruses of genotypes I and IIA on the basis of comparisons with a consensus amino acid sequence for the yellow fever E gene. The great variation in the E proteins of American yellow fever virus probably results from positive selection imposed by virus interaction with different species of mosquitoes or nonhuman primates in the Americas. PMID:7637022

  15. Haplotypes and Sequence Variation in the Ovine Adiponectin Gene (ADIPOQ)

    PubMed Central

    An, Qing-Ming; Zhou, Hui-Tong; Hu, Jiang; Luo, Yu-Zhu; Hickford, Jon G. H.

    2015-01-01

    The adiponectin gene (ADIPOQ) plays an important role in energy homeostasis. In this study five separate regions (regions 1 to 5) of ovine ADIPOQ were analysed using PCR-SSCP. Four different PCR-SSCP patterns (A1-D1, A2-D2) were detected in region-1 and region-2, respectively, with seven and six SNPs being revealed. In region-3, three different patterns (A3-C3) and three SNPs were observed. Two patterns (A4-B4, A5-B5) and two and one SNPs were observed in region-4 and region-5, respectively. In total, nineteen SNPs were detected, with five of them in the coding region and two (c.46T/C and c.515G/A) putatively resulting in amino acid changes (p.Tyr16His and p.Lys172Arg). In region-1, -2 and -3 of 316 sheep from eight New Zealand breeds, variants A1, A2 and A3 were the most common, although variant frequencies differed in the eight breeds. Across region-1 and region-3, nine haplotypes were identified and haplotypes A1-A3, A1-C3, B1-A3 and B1-C3 were most common. These results indicate that the ADIPOQ gene is polymorphic and suggest that further analysis is required to see if the variation in the gene is associated with animal production traits. PMID:26610572
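The correspondence between the coding-sequence positions and the affected residues reported above (c.46 with residue 16, c.515 with residue 172) follows from simple codon arithmetic. A small sketch, assuming 1-based cDNA numbering that starts at the first base of the start codon; the helper name `codon_position` is hypothetical.

```python
def codon_position(cdna_pos):
    """Return (codon number, position within the codon), both 1-based,
    for a 1-based coding-sequence position."""
    codon = (cdna_pos - 1) // 3 + 1
    offset = (cdna_pos - 1) % 3 + 1
    return codon, offset


for pos in (46, 515):
    codon, offset = codon_position(pos)
    print(f"c.{pos} lies in codon {codon}, base {offset} of the codon")
```

Position 46 is the first base of codon 16, where a T/C change converts a Tyr codon (TAT or TAC) into a His codon (CAT or CAC); position 515 is the second base of codon 172, where a G/A change switches between Arg (AGA or AGG) and Lys (AAA or AAG) codons.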

  16. A method to find palindromes in nucleic acid sequences.

    PubMed

    Anjana, Ramnath; Shankar, Mani; Vaishnavi, Marthandan Kirti; Sekar, Kanagaraj

    2013-01-01

    Various types of sequences in the human genome are known to play important roles in different aspects of genomic functioning. Among these sequences, palindromic nucleic acid sequences are one such type that has been studied in detail and found to influence a wide variety of genomic characteristics. For a nucleotide sequence to be considered as a palindrome, its complementary strand must read the same in the opposite direction. For example, both the strands, i.e., the strand going from 5' to 3' and its complementary strand from 3' to 5', must be complementary. A typical nucleotide palindromic sequence would be TATA (5' to 3') and its complementary sequence from 3' to 5' would be ATAT. Thus, a new method has been developed using dynamic programming to fetch the palindromic nucleic acid sequences. The new method uses less memory and thereby it increases the overall speed and efficiency. The proposed method has been tested using the bacterial (3891 KB bases) and human chromosomal sequences (Chr-18: 74366 kb and Chr-Y: 25554 kb) and the computation time for finding the palindromic sequences is in milliseconds. PMID:23515654
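The palindrome definition above (a sequence equal to its own reverse complement) can be sketched in a few lines of Python. This is a brute-force window scan written for illustration, not the dynamic-programming method of the abstract; the function names and the `min_len`/`max_len` parameters are assumptions, and `min_len` is assumed to be even because a reverse-complement palindrome over A, C, G, T always has even length.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_palindromes(seq, min_len=4, max_len=12):
    """Return (start, subsequence) pairs for every window that equals
    its own reverse complement. Only even lengths are scanned."""
    seq = seq.upper()
    hits = []
    for length in range(min_len, max_len + 1, 2):
        for start in range(len(seq) - length + 1):
            window = seq[start:start + length]
            if window == reverse_complement(window):
                hits.append((start, window))
    return hits


print(find_palindromes("GGTATACC", min_len=4, max_len=8))
```

The scan costs time proportional to the sequence length times the number and size of window lengths tried, so it is only suitable for short inputs.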

  17. Sequence variations in the introns of the triosephosphate isomerase genes of Oesophagostomum dentatum and O. quadrispinulatum.

    PubMed

    Joachim, A; von Samson-Himmelstjerna, G

    2001-09-01

    Degenerate primers were used to amplify DNA fragments of the triosephosphate isomerase (TPI) gene from complementary DNA (cDNA) and from genomic DNA of two species of porcine gastrointestinal nematodes, Oesophagostomum dentatum and O. quadrispinulatum. Polymerase chain reaction (PCR) fragments amplified from cDNA were 520 bp in size for both species, while genomic fragments were 1,035 bp for O. dentatum (GC-content: 45%) and 1,331 bp for O. quadrispinulatum (44%). Sequence analyses revealed blocks of high homology in the exons interrupted by more variable parts in the intron regions. Five exons were predicted from the genomic sequences in the conserved regions which corresponded to the respective cDNA sequences with 6% interspecific differences. The predicted protein sequences (161 amino acids) were 98% similar between the species and showed 71% similarity to the putative protein of Caenorhabditis elegans. As a housekeeping gene, TPI could be amplified from cDNA of both infectious third-stage larvae and adults. Interspecific variations in the non-coding regions allow the PCR-based differentiation of the two Oesophagostomum spp. PMID:11570563

  18. Geochemical variations during the 2012 Emilia seismic sequence

    NASA Astrophysics Data System (ADS)

    Sciarra, Alessandra; Cantucci, Barbara; Galli, Gianfranco; Cinti, Daniele; Pizzino, Luca

    2015-04-01

    Several geochemical surveys (soil gas and shallow water) were performed in the Modena province (Massa Finalese, Finale Emilia, Medolla and S. Felice sul Panaro) during the 2006-2014 period. In May-June 2012, a seismic sequence (main shocks of ML 5.9 and 5.8) occurred close to the investigated area. In this area, 300 CO2 and CH4 flux measurements, 150 soil gas concentration measurements (He, H2, CO2, CH4 and C2H6), and 30 shallow water samples with their isotopic analyses (δ13C-CH4, δD-CH4 and δ13C-CO2) were performed in April-May 2006 and in October and December 2008, and repeated in May and September 2012, June 2013 and July 2014, after the 2012 Emilia seismic sequence. The chemical composition of soil gas is dominated by CH4 in the southern part and by CO2 in the northern part. Highly anomalous fluxes and concentrations are recorded in spot areas; elsewhere CO2 and CH4 values are very low, within the typical range of vegetative and organic exhalation of cultivated soil. After the seismic sequence, the CH4 and CO2 fluxes increased by one order of magnitude in the spot areas, whereas in the surrounding area the values remained within the background. In contrast, CH4 concentration decreased (40%v/v in the 2012 surveys) and CO2 concentration increased up to 12.7%v/v (2013 survey). Isotopic gas analyses were carried out only on samples with anomalous values. Pre-seismic data hint at a thermogenic origin of CH4, probably linked to leakage from a deep source in the Medolla area. Conversely, 2012/2013 isotopic data indicate a typical biogenic origin (i.e. microbial hydrocarbon production) of the CH4, as recognized elsewhere in the Po Plain and surroundings. The δ13C-CO2 value suggests a prevalent shallow origin of CO2 (i.e. organic and/or soil-derived), probably related to anaerobic oxidation of heavy hydrocarbons. Water samples, collected from domestic, industrial and hydrocarbon exploration wells, allowed us to recognize different families of waters. Waters are meteoric in origin and

  19. Amino acid sequence repertoire of the bacterial proteome and the occurrence of untranslatable sequences.

    PubMed

    Navon, Sharon Penias; Kornberg, Guy; Chen, Jin; Schwartzman, Tali; Tsai, Albert; Puglisi, Elisabetta Viani; Puglisi, Joseph D; Adir, Noam

    2016-06-28

    Bioinformatic analysis of Escherichia coli proteomes revealed that all possible amino acid triplet sequences occur at their expected frequencies, with four exceptions. Two of the four underrepresented sequences (URSs) were shown to interfere with translation in vivo and in vitro. Enlarging the URS by a single amino acid resulted in increased translational inhibition. Single-molecule methods revealed stalling of translation at the entrance of the peptide exit tunnel of the ribosome, adjacent to ribosomal nucleotides A2062 and U2585. Interaction with these same ribosomal residues is involved in regulation of translation by longer, naturally occurring protein sequences. The E. coli exit tunnel has evidently evolved to minimize interaction with the exit tunnel and maximize the sequence diversity of the proteome, although allowing some interactions for regulatory purposes. Bioinformatic analysis of the human proteome revealed no underrepresented triplet sequences, possibly reflecting an absence of regulation by interaction with the exit tunnel. PMID:27307442
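A rough sketch of how observed amino acid triplet counts might be compared with expected frequencies is given below. The abstract does not state how the expected frequencies were computed; this example assumes the simplest null model, in which residues occur independently at their overall proteome frequencies, and the function name `triplet_ratios` and the toy input are hypothetical.

```python
from collections import Counter


def triplet_ratios(proteins):
    """Return {triplet: observed / expected} for every triplet seen at
    least once, with expected counts from single-residue frequencies."""
    residue_counts = Counter()
    triplet_counts = Counter()
    for seq in proteins:
        residue_counts.update(seq)
        triplet_counts.update(seq[i:i + 3] for i in range(len(seq) - 2))
    total_residues = sum(residue_counts.values())
    total_triplets = sum(triplet_counts.values())
    ratios = {}
    for triplet, observed in triplet_counts.items():
        expected = total_triplets
        for residue in triplet:
            expected *= residue_counts[residue] / total_residues
        ratios[triplet] = observed / expected
    return ratios


# Toy input, not real proteome data.
print(triplet_ratios(["ABAB"]))
```

Ratios well below 1 would flag underrepresented triplets under this null model. Triplets that never occur are not reported by this sketch and would have to be enumerated separately.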

  20. On Quantum Algorithm for Multiple Alignment of Amino Acid Sequences

    NASA Astrophysics Data System (ADS)

    Iriyama, Satoshi; Ohya, Masanori

    2009-02-01

    The alignment of genome sequences or amino acid sequences is one of the fundamental operations for the study of life. The usual computational complexity for the multiple alignment of N sequences with common length L by dynamic programming is O(L^N). This alignment is considered as one of the NP problems, so that it is desirable to find a nice algorithm of the multiple alignment. Thus in this paper we propose the quantum algorithm for the multiple alignment based on the works [12, 1, 2] in which the NP complete problem was shown to be the P problem by means of quantum algorithm and chaos information dynamics.
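The growth of the classical dynamic-programming cost with the number of sequences can be seen from the pairwise case. The sketch below is a standard Needleman-Wunsch-style global alignment score for N = 2, which fills a table of (L+1)^2 cells; extending the same recurrence to N sequences needs a table with (L+1)^N cells. It illustrates the classical baseline only, not the quantum algorithm of the abstract, and the scoring parameters and function names are assumptions.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
    """Score of the best global alignment of strings a and b."""
    rows, cols = len(a) + 1, len(b) + 1
    score = [[0] * cols for _ in range(rows)]
    # Aligning a prefix against the empty string costs one gap per symbol.
    for i in range(1, rows):
        score[i][0] = i * gap
    for j in range(1, cols):
        score[0][j] = j * gap
    for i in range(1, rows):
        for j in range(1, cols):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)
    return score[-1][-1]


def table_cells(length, n_sequences):
    """Number of cells in the DP table for n_sequences of equal length."""
    return (length + 1) ** n_sequences


print(global_alignment_score("ACGT", "AGT"))
print(table_cells(100, 2), table_cells(100, 3))
```

Each added sequence multiplies the table size by roughly L, which is why exact dynamic programming becomes impractical beyond a handful of sequences.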

  1. Sequence of morphological transitions in two-dimensional pattern growth from aqueous ascorbic acid solutions.

    PubMed

    Paranjpe, A S

    2002-08-12

    A sequence of morphological transitions in two-dimensional dehydration patterns of aqueous solutions of ascorbic acid is observed with humidity as a control parameter. Change in morphology occurs due to humidity induced variation in the concentration of the metastable supersaturated solution phase formed after initial solvent evaporation. As percent humidity is varied from 40 to 80, patterns change from compact circular --> radial --> density modulated radial (a new morphology) --> density modulated circular --> density modulated dendritic (a new morphology) --> dense branching. PMID:12190528

  2. Prebiotically plausible mechanisms increase compositional diversity of nucleic acid sequences

    PubMed Central

    Derr, Julien; Manapat, Michael L.; Rajamani, Sudha; Leu, Kevin; Xulvi-Brunet, Ramon; Joseph, Isaac; Nowak, Martin A.; Chen, Irene A.

    2012-01-01

    During the origin of life, the biological information of nucleic acid polymers must have increased to encode functional molecules (the RNA world). Ribozymes tend to be compositionally unbiased, as is the vast majority of possible sequence space. However, ribonucleotides vary greatly in synthetic yield, reactivity and degradation rate, and their non-enzymatic polymerization results in compositionally biased sequences. While natural selection could lead to complex sequences, molecules with some activity are required to begin this process. Was the emergence of compositionally diverse sequences a matter of chance, or could prebiotically plausible reactions counter chemical biases to increase the probability of finding a ribozyme? Our in silico simulations using a two-letter alphabet show that template-directed ligation and high concatenation rates counter compositional bias and shift the pool toward longer sequences, permitting greater exploration of sequence space and stable folding. We verified experimentally that unbiased DNA sequences are more efficient templates for ligation, thus increasing the compositional diversity of the pool. Our work suggests that prebiotically plausible chemical mechanisms of nucleic acid polymerization and ligation could predispose toward a diverse pool of longer, potentially structured molecules. Such mechanisms could have set the stage for the appearance of functional activity very early in the emergence of life. PMID:22319215

  3. The amino-acid sequence of kangaroo pancreatic ribonuclease.

    PubMed

    Gaastra, W; Welling, G W; Beintema, J J

    1978-05-01

    Red kangaroo (Macropus rufus) ribonuclease was isolated from pancreatic tissue by affinity chromatography. The amino acid sequence was determined by automatic sequencing of overlapping large fragments and by analysis of shorter peptides obtained by digestion with a number of proteolytic enzymes. The polypeptide chain consists of 122 amino acid residues. Compared to other ribonucleases, the N-terminal residue and residue 114 are deleted. In other pancreatic ribonucleases position 114 is occupied by a cis proline residue in an external loop at the surface of the molecule. Other remarkable substitutions are the presence of a tyrosine residue at position 123 instead of a serine which forms a hydrogen bond with the pyrimidine ring of a nucleotide substrate, and a number of hydrophobic-hydrophilic interchanges in the sequence 51-55, which forms part of an alpha-helix in bovine ribonuclease and exhibits few substitutions in the placental mammals. Kangaroo ribonuclease contains no carbohydrate, although the enzyme possesses a recognition site for carbohydrate attachment in the sequence Asn-Val-Thr (62-64). The enzyme differs at about 35-40% of the positions from all other mammalian pancreatic ribonucleases sequenced to date, which is in agreement with the early divergence between the marsupials and the placental mammals. From fragmentary data a tentative sequence of red-necked wallaby (Macropus rufogriseus) pancreatic ribonuclease has been derived. Eight differences with the kangaroo sequence were found. PMID:658039

  4. Sources of variation in ancestral sequence reconstruction for HIV-1 envelope genes

    PubMed Central

    Ross, Howard A.; Nickle, David C.; Liu, Yi; Heath, Laura; Jensen, Mark A.; Rodrigo, Allen G.; Mullins, James I.

    2007-01-01

    We characterized the variation in the reconstructed ancestor of 118 HIV-1 envelope gene sequences arising from the methods used for (a) estimating and (b) rooting the phylogenetic tree, and (c) reconstructing the ancestor on that tree, from (d) the sequence format, and from (e) the number of input sequences. The method of rooting the tree was responsible for most of the sequence variation both among the reconstructed ancestral sequences and between the ancestral and observed sequences. Variation in predicted 3-D structural properties of the ancestors mirrored their sequence variation. The observed sequence consensus and ancestral sequences from center-rooted trees were most similar in all predicted attributes. Only for the predicted number of N-glycosylation sites was there a difference between MP and ML methods of reconstruction. Taxon sampling effects were observed only for outgroup-rooted trees, not center-rooted, reflecting the occurrence of several divergent basal sequences. Thus, for sequences exhibiting a radial phylogenetic tree, as does HIV-1, most of the variation in the estimated ancestor arises from the method of rooting the phylogenetic tree. Those investigating the ancestors of genes exhibiting such a radial tree should pay particular attention to alternate rooting methods in order to obtain a representative sample of ancestors. PMID:19455202

  5. Clinal variation for amino acid polymorphisms at the Pgm locus in Drosophila melanogaster.

    PubMed Central

    Verrelli, B C; Eanes, W F

    2001-01-01

    Clinal variation is common for enzymes in the glycolytic pathway for Drosophila melanogaster and is generally accepted as an adaptive response to different climates. Although the enzyme phosphoglucomutase (PGM) possesses several allozyme polymorphisms, it is unique in that it had been reported to show no clinal variation. Our recent DNA sequence investigation of Pgm found extensive cryptic amino acid polymorphism segregating with the allozyme alleles. In this study, we characterize the geographic variation of Pgm amino acid polymorphisms at the nucleotide level along a latitudinal cline in the eastern United States. A survey of 15 SNPs across the Pgm gene finds significant clinal differentiation for the allozyme polymorphisms as well as for many of the cryptic amino acid polymorphisms. A test of independence shows that pervasive linkage disequilibrium across this gene region can explain many of the amino acid clines. A single Pgm haplotype defined by two amino acid polymorphisms shows the strongest correlation with latitude and the steepest change in allele frequency across the cline. We propose that clinal selection at Pgm may in part explain the extensive amino acid polymorphism at this locus and is consistent with a multilocus response to selection in the glycolytic pathway. PMID:11290720

  6. Flagellin gene sequence variation in the genus Pseudomonas.

    PubMed

    Bellingham, N F; Morgan, J A; Saunders, J R; Winstanley, C

    2001-07-01

    Flagellin gene (fliC) sequences from 18 strains of Pseudomonas sensu stricto representing 8 different species, and 9 representative fliC sequences from other members of the gamma sub-division of proteobacteria, were compared. Analysis was performed on N-terminal, C-terminal and whole fliC sequences. The fliC analyses confirmed the inferred relationship between P. mendocina, P. oleovorans and P. aeruginosa based on 16S rRNA sequence comparisons. In addition, the analyses indicated that P. putida PRS2000 was closely related to P. fluorescens SBW25 and P. fluorescens NCIMB 9046T, but suggested that P. putida PaW8 and P. putida PRS2000 were more closely related to other Pseudomonas spp. than they were to each other. There were a number of inconsistencies in inferred evolutionary relationships between strains, depending on the analysis performed. In particular, whole flagellin gene comparisons often differed from those obtained using N- and C-terminal sequences. However, there were also inconsistencies between the terminal region analyses, suggesting that phylogenetic relationships inferred on the basis of fliC sequence should be treated with caution. Although the central domain of fliC is highly variable between Pseudomonas strains, there was evidence of sequence similarities between the central domains of different Pseudomonas fliC sequences. This indicates the possibility of recombination in the central domain of fliC genes within Pseudomonas species, and between these genes and those from other bacteria. PMID:11518318

  7. Sequence Variation among Group III F-Specific RNA Coliphages from Water Samples and Swine Lagoons

    PubMed Central

    Stewart, Jill R.; Vinjé, Jan; Oudejans, Sjon J. G.; Scott, Geoff I.; Sobsey, Mark D.

    2006-01-01

    Typing of F-specific RNA (FRNA) coliphages has been proposed as a useful method for distinguishing human from animal fecal contamination in environmental samples. Group II and III FRNA coliphages are generally associated with human wastes, but several exceptions have been noted. In the present study, we have genotyped and partially sequenced group III FRNA coliphage field isolates from swine lagoons in North Carolina (NC) and South Carolina (SC), along with isolates from surface waters and municipal wastewaters. Phylogenetic analysis of a region of the 5′ end of the maturation protein gene revealed two genetically different group III FRNA subclusters with 36.6% sequence variation. The SC swine lagoon isolates were more closely related to group III prototype virus M11, whereas the isolates from a swine lagoon in NC, surface waters, and wastewaters grouped with prototype virus Q-beta. These results suggest that refining phage genotyping systems to discriminate M11-like phages from Q-beta-like phages would not necessarily provide greater discriminatory power in distinguishing human from animal sources of pollution. Within the group III subclusters, nucleotide sequence diversity ranged from 0% to 6.9% for M11-like strains and from 0% to 8.7% for Q-beta-like strains. It is demonstrated here that nucleotide sequencing of closely related FRNA strains can be used to help track sources of contamination in surface waters. A similar use of phage genomic sequence information to track fecal pollution promises more reliable results than phage typing by nucleic acid hybridization and may hold more potential for field applications. PMID:16461670

  8. GENETIC VARIATION IN CLONAL VERTEBRATES DETECTED BY SIMPLE SEQUENCE FINGERPRINTING

    EPA Science Inventory

    Measurement of clonal heterogeneity is central to understanding evolutionary and population genetics of roughly 50 species of vertebrates that lack effective genetic recombination. Simple-sequence DNA fingerprinting with oligonucleotide probes (CAG)5 and (GACA)4 was used to detect hete...

  9. Fatty acid metabolism: Implications for diet, genetic variation, and disease

    PubMed Central

    Suburu, Janel; Gu, Zhennan; Chen, Haiqin; Chen, Wei; Zhang, Hao; Chen, Yong Q.

    2014-01-01

    Cultures across the globe, especially Western societies, are burdened by chronic diseases such as obesity, metabolic syndrome, cardiovascular disease, and cancer. Several factors, including diet, genetics, and sedentary lifestyle, are suspected culprits to the development and progression of these health maladies. Fatty acids are primary constituents of cellular physiology. Humans can acquire fatty acids by de novo synthesis from carbohydrate or protein sources or by dietary consumption. Importantly, regulation of their metabolism is critical to sustain balanced homeostasis, and perturbations of such can lead to the development of disease. Here, we review de novo and dietary fatty acid metabolism and highlight recent advances in our understanding of the relationship between dietary influences and genetic variation in fatty acid metabolism and their role in chronic diseases. PMID:24511462

  10. Pyrosequencing for discovery and analysis of DNA sequence variations.

    PubMed

    Ronaghi, Mostafa; Shokralla, Shadi; Gharizadeh, Baback

    2007-10-01

    Since the invention of pyrosequencing, more than 500 articles have been published describing different applications of this technology, most notably for DNA structure variation and microbial detection. Technological advances have been made to enhance the robustness and accuracy of this technique as well as to reduce the cost and increase the throughput. This review intends to cover recent advances in this technology and discuss its application for low and high-throughput DNA variation studies. PMID:17979516

  11. Amino acid sequence of Salmonella typhimurium branched-chain amino acid aminotransferase.

    PubMed

    Feild, M J; Nguyen, D C; Armstrong, F B

    1989-06-13

    The complete amino acid sequence of the subunit of branched-chain amino acid aminotransferase (transaminase B, EC 2.6.1.42) of Salmonella typhimurium was determined. An Escherichia coli recombinant containing the ilvGEDAY gene cluster of Salmonella was used as the source of the hexameric enzyme. The peptide fragments used for sequencing were generated by treatment with trypsin, Staphylococcus aureus V8 protease, endoproteinase Lys-C, and cyanogen bromide. The enzyme subunit contains 308 residues and has a molecular weight of 33,920. To determine the coenzyme-binding site, the pyridoxal 5-phosphate containing enzyme was treated with tritiated sodium borohydride prior to trypsin digestion. Peptide map comparisons with an apoenzyme tryptic digest and monitoring radioactivity incorporation allowed identification of the pyridoxylated peptide, which was then isolated and sequenced. The coenzyme-binding site is the lysyl residue at position 159. The amino acid sequence of Salmonella transaminase B is 97.4% identical with that of Escherichia coli, differing in only eight amino acid positions. Sequence comparisons of transaminase B to other known aminotransferase sequences revealed limited sequence similarity (24-33%) when conserved amino acid substitutions are allowed and alignments were forced to occur on the coenzyme-binding site. PMID:2669973
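The reported identity can be checked from the numbers in the abstract: 8 differing positions out of 308 residues leave 300 identical positions. A short sketch, assuming identity is computed over an ungapped alignment of equal-length sequences; the helper `percent_identity` and the toy sequences are hypothetical.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of identical positions between two aligned,
    equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Figures from the abstract: 308 residues, 8 differing positions.
print(round(100.0 * (308 - 8) / 308, 1))

# Toy example with one difference in five positions.
print(percent_identity("MKVLA", "MKILA"))
```

The first value rounds to 97.4, in agreement with the identity stated in the abstract.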

  12. Protein 3D Structure Computed from Evolutionary Sequence Variation

    PubMed Central

    Sheridan, Robert; Hopf, Thomas A.; Pagnani, Andrea; Zecchina, Riccardo; Sander, Chris

    2011-01-01

    The evolutionary trajectory of a protein through sequence space is constrained by its function. Collections of sequence homologs record the outcomes of millions of evolutionary experiments in which the protein evolves according to these constraints. Deciphering the evolutionary record held in these sequences and exploiting it for predictive and engineering purposes presents a formidable challenge. The potential benefit of solving this challenge is amplified by the advent of inexpensive high-throughput genomic sequencing. In this paper we ask whether we can infer evolutionary constraints from a set of sequence homologs of a protein. The challenge is to distinguish true co-evolution couplings from the noisy set of observed correlations. We address this challenge using a maximum entropy model of the protein sequence, constrained by the statistics of the multiple sequence alignment, to infer residue pair couplings. Surprisingly, we find that the strength of these inferred couplings is an excellent predictor of residue-residue proximity in folded structures. Indeed, the top-scoring residue couplings are sufficiently accurate and well-distributed to define the 3D protein fold with remarkable accuracy. We quantify this observation by computing, from sequence alone, all-atom 3D structures of fifteen test proteins from different fold classes, ranging in size from 50 to 260 residues, including a G-protein coupled receptor. These blinded inferences are de novo, i.e., they do not use homology modeling or sequence-similar fragments from known structures. The co-evolution signals provide sufficient information to determine accurate 3D protein structure to 2.7–4.8 Å Cα-RMSD error relative to the observed structure, over at least two-thirds of the protein (method called EVfold, details at http://EVfold.org). This discovery provides insight into essential interactions constraining protein evolution and will facilitate a comprehensive survey of the universe of protein

  13. Genotyping common and rare variation using overlapping pool sequencing

    PubMed Central

    2011-01-01

    Background Recent advances in sequencing technologies set the stage for large, population based studies, in which the DNA or RNA of thousands of individuals will be sequenced. Currently, however, such studies are still infeasible using a straightforward sequencing approach; as a result, recently a few multiplexing schemes have been suggested, in which a small number of DNA pools are sequenced, and the results are then deconvoluted using compressed sensing or similar approaches. These methods, however, are limited to the detection of rare variants. Results In this paper we provide a new algorithm for the deconvolution of DNA pools multiplexing schemes. The presented algorithm utilizes a likelihood model and linear programming. The approach allows for the addition of external data, particularly imputation data, resulting in a flexible environment that is suitable for different applications. Conclusions Particularly, we demonstrate that both low and high allele frequency SNPs can be accurately genotyped when the DNA pooling scheme is performed in conjunction with microarray genotyping and imputation. Additionally, we demonstrate the use of our framework for the detection of cancer fusion genes from RNA sequences. PMID:21989232
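
    The idea behind overlapping pools can be sketched with a toy combinatorial design: each individual is placed in a distinct pair of pools, so a variant carried by a single individual shows up in exactly that pair. This is only an illustration of why such schemes work for rare variants; it is not the likelihood and linear programming algorithm of the paper, and the function names and the 6-individual, 4-pool layout are assumptions.

```python
from itertools import combinations


def build_pool_design(n_individuals, n_pools, pools_per_individual=2):
    """Assign each individual a distinct set of pools (its signature)."""
    signatures = list(combinations(range(n_pools), pools_per_individual))
    if n_individuals > len(signatures):
        raise ValueError("not enough distinct pool combinations")
    return {ind: frozenset(signatures[ind]) for ind in range(n_individuals)}


def decode_single_carrier(design, positive_pools):
    """Return individuals whose signature equals the set of positive pools.

    Only valid under the assumption that exactly one individual carries
    the variant; with several carriers the signatures overlap.
    """
    positive = frozenset(positive_pools)
    return [ind for ind, pools in design.items() if pools == positive]


# Hypothetical design: 6 individuals, 4 pools, each individual in 2 pools.
design = build_pool_design(n_individuals=6, n_pools=4)
observed_positive_pools = design[3]  # simulate individual 3 carrying the variant
decoded = decode_single_carrier(design, observed_positive_pools)
print(decoded)
```

    Exact signature matching breaks down once several individuals carry the variant, which is consistent with the abstract's remark that earlier schemes are limited to rare variants.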

  14. Variation among Bm86 sequences in Rhipicephalus (Boophilus) microplus ticks collected from cattle across Thailand.

    PubMed

    Kaewmongkol, S; Kaewmongkol, G; Inthong, N; Lakkitjaroen, N; Sirinarumitr, T; Berry, C M; Jonsson, N N; Stich, R W; Jittapalapong, S

    2015-06-01

    Anti-tick vaccines based on recombinant homologues Bm86 and Bm95 have become a more cost-effective and sustainable alternative to chemical pesticides commonly used to control the cattle tick, Rhipicephalus (Boophilus) microplus. However, Bm86 polymorphism among geographically separate ticks is reportedly associated with reduced effectiveness of these vaccines. The purpose of this study was to investigate the variation of Bm86 among cattle ticks collected from Northern, Northeastern, Central and Southern areas across Thailand. Bm86 cDNA and deduced amino acid sequences representing 29 female tick midgut samples were 95.6-97.0 and 91.5-93.5 % identical to the nucleotide and amino acid reference sequences, respectively, of the Australian Yeerongpilly vaccine strain. Multiple sequence analyses of these Bm86 variants indicated geographical relationships and polymorphism among Thai cattle ticks. Two larger groups of cattle tick strains were discernable based on this phylogenetic analysis of Bm86, a Thai group and a Latin American group. Thai female and male cattle ticks (50 pairs) were also subjected to detailed morphological characterization to confirm their identity. The majority of female ticks had morphological features consistent with those described for R. (B.) microplus, whereas, curiously, the majority of male ticks were more consistent with the recently re-instated R. (B.) australis. A number of these ticks had features consistent with both species. Further investigations are warranted to test the efficacies of rBm86-based vaccines to homologous and heterologous challenge infestations with Thai tick strains and for in-depth study of the phylogeny of Thai cattle ticks. PMID:25777941

  15. Cytochrome b nucleotide sequence variation among the Atlantic Alcidae.

    PubMed

    Friesen, V L; Montevecchi, W A; Davidson, W S

    1993-01-01

    Analysis of cytochrome b nucleotide sequences of the six extant species of Atlantic alcids and a gull revealed an excess of adenines and cytosines and a deficit of guanines at silent sites on the coding strand. Phylogenetic analyses grouped the sequences of the common (Uria aalge) and Brünnich's (U. lomvia) guillemots, followed by the razorbill (Alca torda) and little auk (Alle alle). The black guillemot (Cepphus grylle) sequence formed a sister taxon, and the puffin (Fratercula arctica) fell outside the other alcids. Phylogenetic comparisons of substitutions indicated that mutabilities of bases did not differ, but that C was much more likely to be incorporated than was G. Imbalances in base composition appear to result from a strand bias in replication errors, which may result from selection on secondary RNA structure and/or the energetics of codon-anticodon interactions. PMID:7916741
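
    A base-composition tally of the kind reported here can be sketched as follows. The sketch counts bases at third codon positions of an in-frame coding sequence, used here as a rough stand-in for silent sites (an assumption; the study's own definition of silent sites is not given in the abstract). The function name and the toy sequence are hypothetical.

```python
from collections import Counter


def third_position_composition(coding_sequence):
    """Fraction of A, C, G and T at third codon positions.

    Assumes the sequence starts in frame; a trailing partial codon is ignored.
    """
    usable = len(coding_sequence) - len(coding_sequence) % 3
    third_bases = coding_sequence[2:usable:3].upper()
    if not third_bases:
        raise ValueError("sequence contains no complete codon")
    counts = Counter(third_bases)
    total = len(third_bases)
    return {base: counts.get(base, 0) / total for base in "ACGT"}


# Hypothetical toy coding sequence: ATG GCA CTC ACC GGA
toy_cds = "ATGGCACTCACCGGA"
composition = third_position_composition(toy_cds)
print(composition)
```

    Comparing such fractions across species or strands is one simple way to look for the kind of compositional skew described in the abstract.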

  16. Sequences Of Amino Acids For Human Serum Albumin

    NASA Technical Reports Server (NTRS)

    Carter, Daniel C.

    1992-01-01

    Sequences of amino acids defined for use in making polypeptides one-third to one-sixth as large as parent human serum albumin molecule. Smaller, chemically stable peptides have diverse applications including service as artificial human serum and as active components of biosensors and chromatographic matrices. In applications involving production of artificial sera from new sequences, little or no concern about viral contaminants. Smaller genetically engineered polypeptides more easily expressed and produced in large quantities, making commercial isolation and production more feasible and profitable.

  17. Extensive sequence variation in rice blast resistance gene Pi54 makes it broad spectrum in nature

    PubMed Central

    Thakur, Shallu; Singh, Pankaj K.; Das, Alok; Rathour, R.; Variar, M.; Prashanthi, S. K.; Singh, A. K.; Singh, U. D.; Chand, Duni; Singh, N. K.; Sharma, Tilak R.

    2015-01-01

    The rice blast resistance gene Pi54, cloned from the rice line Tetep, is effective against diverse isolates of Magnaporthe oryzae. In this study, we prospected the allelic variants of the dominant blast resistance gene from a set of 92 rice lines to determine the nucleotide diversity, pattern of its molecular evolution, phylogenetic relationships and evolutionary dynamics, and to develop allele specific markers. High quality sequences were generated for homologs of the Pi54 gene. Using comparative sequence analysis, InDels of variable sizes in all the alleles were observed. Profiling of the selected sites of SNP (Single Nucleotide Polymorphism) and amino acids (N sites ≥ 10) exhibited constant frequency distribution of mutational and substitutional sites between the resistance and susceptible rice lines, respectively. A total of 50 new haplotypes based on the nucleotide polymorphism were also identified. A unique haplotype (H_3) was found to be linked to all the resistant alleles isolated from indica rice lines. Unique leucine zipper and tyrosine sulfation sites were identified in the predicted Pi54 proteins. Selection signals were observed in the entire coding sequence of resistance alleles, as compared to LRR domains for susceptible alleles. This is a maiden report of extensive variability of Pi54 alleles in different landraces and cultivated varieties, possibly contributing to broad-spectrum resistance to Magnaporthe oryzae. The sequence variation in two consensus regions (163 and 144 bp) was used for the development of allele specific DNA markers. Validated markers can be used for the selection and identification of better allele(s) and their introgression in commercial rice cultivars employing marker assisted selection. PMID:26052332

  18. Variation in Symbiodinium ITS2 sequence assemblages among coral colonies.

    PubMed

    Stat, Michael; Bird, Christopher E; Pochon, Xavier; Chasqui, Luis; Chauka, Leonard J; Concepcion, Gregory T; Logan, Dan; Takabayashi, Misaki; Toonen, Robert J; Gates, Ruth D

    2011-01-01

    Endosymbiotic dinoflagellates in the genus Symbiodinium are fundamentally important to the biology of scleractinian corals, as well as to a variety of other marine organisms. The genus Symbiodinium is genetically and functionally diverse and the taxonomic nature of the union between Symbiodinium and corals is implicated as a key trait determining the environmental tolerance of the symbiosis. Surprisingly, the question of how Symbiodinium diversity partitions within a species across spatial scales of meters to kilometers has received little attention, but is important to understanding the intrinsic biological scope of a given coral population and adaptations to the local environment. Here we address this gap by describing the Symbiodinium ITS2 sequence assemblages recovered from colonies of the reef building coral Montipora capitata sampled across Kāne'ohe Bay, Hawai'i. A total of 52 corals were sampled in a nested design of Coral Colony(Site(Region)) reflecting spatial scales of meters to kilometers. A diversity of Symbiodinium ITS2 sequences was recovered with the majority of variance partitioning at the level of the Coral Colony. To confirm this result, the Symbiodinium ITS2 sequence diversity in six M. capitata colonies were analyzed in much greater depth with 35 to 55 clones per colony. The ITS2 sequences and quantitative composition recovered from these colonies varied significantly, indicating that each coral hosted a different assemblage of Symbiodinium. The diversity of Symbiodinium ITS2 sequence assemblages retrieved from individual colonies of M. capitata here highlights the problems inherent in interpreting multi-copy and intra-genomically variable molecular markers, and serves as a context for discussing the utility and biological relevance of assigning species names based on Symbiodinium ITS2 genotyping. PMID:21246044

  19. Variation in Symbiodinium ITS2 Sequence Assemblages among Coral Colonies

    PubMed Central

    Stat, Michael; Bird, Christopher E.; Pochon, Xavier; Chasqui, Luis; Chauka, Leonard J.; Concepcion, Gregory T.; Logan, Dan; Takabayashi, Misaki; Toonen, Robert J.; Gates, Ruth D.

    2011-01-01

    Endosymbiotic dinoflagellates in the genus Symbiodinium are fundamentally important to the biology of scleractinian corals, as well as to a variety of other marine organisms. The genus Symbiodinium is genetically and functionally diverse and the taxonomic nature of the union between Symbiodinium and corals is implicated as a key trait determining the environmental tolerance of the symbiosis. Surprisingly, the question of how Symbiodinium diversity partitions within a species across spatial scales of meters to kilometers has received little attention, but is important to understanding the intrinsic biological scope of a given coral population and adaptations to the local environment. Here we address this gap by describing the Symbiodinium ITS2 sequence assemblages recovered from colonies of the reef building coral Montipora capitata sampled across Kāne'ohe Bay, Hawai'i. A total of 52 corals were sampled in a nested design of Coral Colony(Site(Region)) reflecting spatial scales of meters to kilometers. A diversity of Symbiodinium ITS2 sequences was recovered with the majority of variance partitioning at the level of the Coral Colony. To confirm this result, the Symbiodinium ITS2 sequence diversity in six M. capitata colonies were analyzed in much greater depth with 35 to 55 clones per colony. The ITS2 sequences and quantitative composition recovered from these colonies varied significantly, indicating that each coral hosted a different assemblage of Symbiodinium. The diversity of Symbiodinium ITS2 sequence assemblages retrieved from individual colonies of M. capitata here highlights the problems inherent in interpreting multi-copy and intra-genomically variable molecular markers, and serves as a context for discussing the utility and biological relevance of assigning species names based on Symbiodinium ITS2 genotyping. PMID:21246044

  20. Tandem gene arrays in Trypanosoma brucei: Comparative phylogenomic analysis of duplicate sequence variation

    PubMed Central

    Jackson, Andrew P

    2007-01-01

    Background The genome sequence of the protistan parasite Trypanosoma brucei contains many tandem gene arrays. Gene duplicates are created through tandem duplication and are expressed through polycistronic transcription, suggesting that the primary purpose of long, tandem arrays is to increase gene dosage in an environment where individual gene promoters are absent. This report presents the first account of the tandem gene arrays in the T. brucei genome, employing several related genome sequences to establish how variation is created and removed. Results A systematic survey of tandem gene arrays showed that substantial sequence variation existed across the genome; variation from different regions of an array often produced inconsistent phylogenetic affinities. Phylogenetic relationships of gene duplicates were consistent with concerted evolution being a widespread homogenising force. However, tandem duplicates were not usually identical; therefore, any homogenising effect was coincident with divergence among duplicates. Allelic gene conversion was detected using various criteria and was apparently able to both remove and introduce sequence variation. Tandem arrays containing structural heterogeneity demonstrated how sequence homogenisation and differentiation can occur within a single locus. Conclusion The use of multiple genome sequences in a comparative analysis of tandem gene arrays identified substantial sequence variation among gene duplicates. The distribution of sequence variation is determined by a dynamic balance of conservative and innovative evolutionary forces. Gene trees from various species showed that intraspecific duplicates evolve in concert, perhaps through frequent gene conversion, although this does not prevent sequence divergence, especially where structural heterogeneity physically separates a duplicate from its neighbours. In describing dynamics of sequence variation that have consequences beyond gene dosage, this survey provides a basis for

  1. Nanopores and nucleic acids: prospects for ultrarapid sequencing

    NASA Technical Reports Server (NTRS)

    Deamer, D. W.; Akeson, M.

    2000-01-01

    DNA and RNA molecules can be detected as they are driven through a nanopore by an applied electric field at rates ranging from several hundred microseconds to a few milliseconds per molecule. The nanopore can rapidly discriminate between pyrimidine and purine segments along a single-stranded nucleic acid molecule. Nanopore detection and characterization of single molecules represents a new method for directly reading information encoded in linear polymers. If single-nucleotide resolution can be achieved, it is possible that nucleic acid sequences can be determined at rates exceeding a thousand bases per second.

  2. Amino acid sequence of the Amur tiger prion protein.

    PubMed

    Wu, Changde; Pang, Wanyong; Zhao, Deming

    2006-10-01

    Prion diseases are fatal neurodegenerative disorders in humans and animals associated with conformational conversion of a cellular prion protein (PrP(C)) into the pathologic isoform (PrP(Sc)). Various data indicate that the polymorphisms within the open reading frame (ORF) of PrP are associated with the susceptibility and control the species barrier in prion diseases. In the present study, partial Prnp from 25 Amur tigers (tPrnp) were cloned and screened for polymorphisms. Four single nucleotide polymorphisms (T423C, A501G, C511A, A610G) were found; the C511A and A610G nucleotide substitutions resulted in the amino acid changes Lysine171Glutamine and Alanine204Threonine, respectively. The tPrnp amino acid sequence is similar to house cat (Felis catus) and sheep, but differs significantly from two other cat Prnp sequences that were previously deposited in GenBank. PMID:16780982

  3. Allelic sequence variation of the HLA-DQ loci: relationship to serology and to insulin-dependent diabetes susceptibility.

    PubMed Central

    Horn, G T; Bugawan, T L; Long, C M; Erlich, H A

    1988-01-01

    Analysis of sequence variation in the polymorphic second exon of the major histocompatibility complex genes HLA-DQ alpha and -DQ beta has revealed 8 allelic variants at the alpha locus and 13 variants at the beta locus. Correlation of sequence variation with serologic typing suggests that the DQw2, DQw3, and DQ(blank) types are determined by the DQ beta subunit, while the DQw1 specificity is determined by DQ alpha. The nature of the amino acid at position 57 in the DQ beta subunit is correlated with susceptibility to insulin-dependent diabetes mellitus. This region of the DQ beta chain contains shared peptides with Epstein-Barr virus and rubella virus. PMID:2842756

  4. [Sequence variation of mitochondrial cytochrome b gene and phylogenetic relationships among twelve species of Charadriiformes].

    PubMed

    Chen, Xiao-Fang; Wang, Xiang; Yuan, Xiao-Dong; Tang, Min-Qian; Li, Yu-Xiang; Guo, Yu-Mei; Li, Qing-Wei

    2003-05-01

    Studies of the phylogenetic relationships of the Charadriiformes have been largely based on conservative morphological characters. During the past 10 years, many studies on the evolutionary biology of birds adopted phylogenetic information obtained from mitochondrial DNA, but little work on the Charadriiformes has been reported to date. Therefore, the phylogenetic relationships and classification of the Charadriiformes remain controversial. In this study, we try to shed light on these relationships via DNA sequence analysis of the mitochondrial Cyt b gene in 12 species of Charadriiformes. It was a preliminary study of the origin and evolution of the species by using nucleotide sequence data. Using well-known PCR techniques, the complete mitochondrial Cyt b gene sequences were amplified and sequenced from Charadrius mongolus, Charadrius alexandrinus, Numenius madagascariensis, Numenius arquata, Numenius phaeopus, Tringa totanus, Tringa glareola, Xenus cineres, Arenaria interpres, Calidris tenuirostris, Recurvirostra avosetts and Haematopus ostralensis. The 1143 bp long DNA sequences of the gene from these species were obtained, in which 381 variable sites were identified without insertions or deletions. The nucleic acid sequence variation of the mitochondrial Cyt b gene was 5.16%-16.01% among these species. Phylogenetic trees constructed using the NJ method, MP method and ML method with Ciconia ciconia as the outgroup indicate that the 12 species of Charadriiformes examined in this study are clustered in two major clades. The first clade includes T. totanus, T. glareola, A. interpres, C. tenuirostris, X. cineres, N. madagascariensis, N. arquata and N. phaeopus. The second one includes C. mongolus, C. alexandrinus, R. avosetts and H. ostralensis. Our molecular data show that the phylogenetic relationships among species of Scolopacidae are consistent with the classification based on morphological studies; R. avosetts and H. ostralensis are relatively closer
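
    The two summary statistics quoted in this abstract, the number of variable sites and the percentage sequence variation, can be sketched for gap-free aligned sequences as below. The abstract does not state how its 5.16%-16.01% figures were computed; the uncorrected pairwise distance (p-distance) shown here is one common choice and is an assumption, as are the function names and the toy sequences.

```python
def count_variable_sites(aligned_sequences):
    """Number of alignment columns in which at least two sequences differ."""
    return sum(1 for column in zip(*aligned_sequences) if len(set(column)) > 1)


def p_distance(seq_a, seq_b):
    """Uncorrected proportion of differing sites between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    differences = sum(1 for a, b in zip(seq_a, seq_b) if a != b)
    return differences / len(seq_a)


# Hypothetical toy alignment of three 10 bp sequences (no gaps).
toy = ["ATGACCAACA", "ATGACTAACA", "ATGGCCAATA"]
variable_sites = count_variable_sites(toy)
distance_0_1 = p_distance(toy[0], toy[1])
distance_0_2 = p_distance(toy[0], toy[2])

# Figures reported in the abstract: 381 variable sites in 1143 bp.
reported_variable_fraction = 381 / 1143
print(variable_sites, distance_0_1, distance_0_2, reported_variable_fraction)
```

    With the reported figures, one third of the 1143 aligned positions vary in at least one of the 12 species, which is a different quantity from any single pairwise distance.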

  5. Using Disease-Associated Coding Sequence Variation to Investigate Functional Compensation by Human Paralogous Proteins

    PubMed Central

    Miura, Sayaka; Tate, Stephanie; Kumar, Sudhir

    2015-01-01

    Gene duplication enables the functional diversification in species. It is thought that duplicated genes may be able to compensate if the function of one of the gene copies is disrupted. This possibility is extensively debated with some studies reporting proteome-wide compensation, whereas others suggest functional compensation among only recent gene duplicates or no compensation at all. We report results from a systematic molecular evolutionary analysis to test the predictions of the functional compensation hypothesis. We contrasted the density of Mendelian disease-associated single nucleotide variants (dSNVs) in proteins with no discernable paralogs (singletons) with the dSNV density in proteins found in multigene families. Under the functional compensation hypothesis, we expected to find greater numbers of dSNVs in singletons due to the lack of any compensating partners. Our analyses produced an opposite pattern; paralogs have over 35% higher dSNV density than singletons. We found that these patterns are concordant with similar differences in the rates of amino acid evolution (ie, functional constraints), as the proteins with paralogs have evolved 33% slower than singletons. Our evolutionary constraint explanation is robust to differences in family sizes, ages (young vs. old duplicates), and degrees of amino acid sequence similarities among paralogs. Therefore, disease-associated human variation does not exhibit significant signals of functional compensation among paralogous proteins, but rather an evolutionary constraint hypothesis provides a better explanation for the observed patterns of disease-associated and neutral polymorphisms in the human genome. PMID:26604664

  6. HIV-1 sequence variation between isolates from mother-infant transmission pairs

    SciTech Connect

    Wike, C.M.; Daniels, M.R.; Furtado, M.; Wolinsky, M.; Korber, B.; Hutto, C.; Munoz, J.; Parks, W.; Saah, A.

    1991-12-31

    To examine the sequence diversity of human immunodeficiency virus type 1 (HIV-1) between known transmission sets, sequences from the V3 and V4-V5 region of the env gene from 4 mother-infant pairs were analyzed. The mean interpatient sequence variation between isolates from linked mother-infant pairs was comparable to the sequence diversity found between isolates from other close contacts. The mean intrapatient variation was significantly less in the infants' isolates than the isolates from both their mothers and other characterized intrapatient sequence sets. In addition, a distinct and characteristic difference in the glycosylation pattern preceding the V3 loop was found between each linked transmission pair. These findings indicate that selection of specific genotypic variants, which may play a role in some direct transmission sets, and the duration of infection are important factors in the degree of diversity seen between the sequence sets.

  7. HIV-1 sequence variation between isolates from mother-infant transmission pairs

    SciTech Connect

    Wike, C.M.; Daniels, M.R.; Furtado, M.; Wolinsky, M.; Korber, B.; Hutto, C.; Munoz, J.; Parks, W.; Saah, A.

    1991-01-01

    To examine the sequence diversity of human immunodeficiency virus type 1 (HIV-1) between known transmission sets, sequences from the V3 and V4-V5 region of the env gene from 4 mother-infant pairs were analyzed. The mean interpatient sequence variation between isolates from linked mother-infant pairs was comparable to the sequence diversity found between isolates from other close contacts. The mean intrapatient variation was significantly less in the infants' isolates than the isolates from both their mothers and other characterized intrapatient sequence sets. In addition, a distinct and characteristic difference in the glycosylation pattern preceding the V3 loop was found between each linked transmission pair. These findings indicate that selection of specific genotypic variants, which may play a role in some direct transmission sets, and the duration of infection are important factors in the degree of diversity seen between the sequence sets.

  8. Rate variation of DNA sequence evolution in the Drosophila lineages.

    PubMed Central

    Takano, T S

    1998-01-01

    Rate constancy of DNA sequence evolution was examined for three species of Drosophila, using two samples: the published sequences of eight genes from regions of the normal recombination rates and new data of the four AS-C (ac, sc, l'sc and ase) and ci genes. The AS-C and ci genes were chosen because these genes are located in the regions of very reduced recombination in Drosophila melanogaster and their locations remain unchanged throughout the entire lineages involved, yielding less effect of ancestral polymorphism in the study of rate constancy. The synonymous substitution pattern of the three lineages was found to be erratic in both samples. The dispersion index for replacement substitution was relatively high for the per, G6pd and ac genes. A significant heterogeneity was found in the number of synonymous substitutions in the three lineages between the two samples of genes with different recombination rates. This is partly due to a lack of the lineage effect in the D. melanogaster and Drosophila simulans lineages in the AS-C and ci genes in contrast to Akashi's observation of genes in regions of normal recombination. The higher codon bias in Drosophila yakuba as compared with D. melanogaster and D. simulans was observed in the four AS-C genes, which suggests change(s) in action of natural selection involved in codon usage on these genes. Fluctuating selection intensity may also be responsible for the observed locus-lineage interaction effects in synonymous substitution. PMID:9611206

  9. Quantum-Sequencing: Biophysics of quantum tunneling through nucleic acids

    NASA Astrophysics Data System (ADS)

    Casamada Ribot, Josep; Chatterjee, Anushree; Nagpal, Prashant

    2014-03-01

    Tunneling microscopy and spectroscopy have been used extensively in physical surface sciences to study quantum tunneling, to measure the electronic local density of states of nanomaterials, and to characterize adsorbed species. Quantum-Sequencing (Q-Seq) is a new method based on tunneling microscopy for electronic sequencing of single molecules of nucleic acids. A major goal of third-generation sequencing technologies is to develop a fast, reliable, enzyme-free single-molecule sequencing method. Here, we present the unique "electronic fingerprints" for all nucleotides on DNA and RNA using Q-Seq, along with their intrinsic biophysical parameters. We have analyzed tunneling spectra for the nucleotides at different pH conditions and analyzed the HOMO, LUMO and energy gap for all of them. In addition, we show a number of biophysical parameters to further characterize all nucleobases (electron and hole transition voltage and energy barriers). These results highlight the robustness of Q-Seq as a technique for next-generation sequencing.

  10. Amino acid sequence of the nonsecretory ribonuclease of human urine.

    PubMed

    Beintema, J J; Hofsteenge, J; Iwama, M; Morita, T; Ohgi, K; Irie, M; Sugiyama, R H; Schieven, G L; Dekker, C A; Glitz, D G

    1988-06-14

    The amino acid sequence of a nonsecretory ribonuclease isolated from human urine was determined except for the identity of the residue at position 7. Sequence information indicates that the ribonucleases of human liver and spleen and an eosinophil-derived neurotoxin are identical or very closely related gene products. The sequence is identical at about 30% of the amino acid positions with those of all of the secreted mammalian ribonucleases for which information is available. Identical residues include active-site residues histidine-12, histidine-119, and lysine-41, other residues known to be important for substrate binding and catalytic activity, and all eight half-cystine residues common to these enzymes. Major differences include a deletion of six residues in the (so-called) S-peptide loop, insertions of two, and nine residues, respectively, in three other external loops of the molecule, and an addition of three residues at the amino terminus. The sequence shows the human nonsecretory ribonuclease to belong to the same ribonuclease superfamily as the mammalian secretory ribonucleases, turtle pancreatic ribonuclease, and human angiogenin. Sequence data suggest that a gene duplication occurred in an ancient vertebrate ancestor; one branch led to the nonsecretory ribonuclease, while the other branch led to a second duplication, with one line leading to the secretory ribonucleases (in mammals) and the second line leading to pancreatic ribonuclease in turtle and an angiogenic factor in mammals (human angiogenin). The nonsecretory ribonuclease has five short carbohydrate chains attached via asparagine residues at the surface of the molecule; these chains may have been shortened by exoglycosidase action.(ABSTRACT TRUNCATED AT 250 WORDS) PMID:3166997

  11. Sequence variation at the major histocompatibility complex locus DQ beta in beluga whales (Delphinapterus leucas)

    PubMed

    Murray, B W; Malik, S; White, B N

    1995-07-01

    Genetic variation at the Major Histocompatibility Complex locus DQ beta was analyzed in 233 beluga whales (Delphinapterus leucas) from seven populations: St. Lawrence Estuary, eastern Beaufort Sea, eastern Chukchi Sea, western Hudson Bay, eastern Hudson Bay, southeastern Baffin Island, and High Arctic and in 12 narwhals (Monodon monoceros) sympatric with the High Arctic beluga population. Variation was assessed by amplification of the exon coding for the peptide binding region via the polymerase chain reaction, followed by either cloning and DNA sequencing or single-stranded conformation polymorphism analysis. Five alleles were found across the beluga populations and one in the narwhal. Pairwise comparisons of these alleles showed a 5:1 ratio of nonsynonymous to synonymous substitutions per site leading to eight amino acid differences, five of which were nonconservative substitutions, centered around positions previously shown to be important for peptide binding. Although the amount of allelic variation is low when compared with terrestrial mammals, the nature of the substitutions in the peptide binding sites indicates an important role for the DQ beta locus in the cellular immune response of beluga whales. Comparisons of allele frequencies among populations show the High Arctic population to be different (P ≤ .005) from the other beluga populations surveyed. In these other populations an allele, Dele-DQ beta*0101-2, was found in 98% of the animals, while in the High Arctic it was found in only 52% of the animals. Two other alleles were found at high frequencies in the High Arctic population, one being very similar to the single allele found in narwhal. PMID:7659014

  12. Characterization and amino acid sequence of a fatty acid-binding protein from human heart.

    PubMed

    Offner, G D; Brecher, P; Sawlivich, W B; Costello, C E; Troxler, R F

    1988-05-15

    The complete amino acid sequence of a fatty acid-binding protein from human heart was determined by automated Edman degradation of CNBr, BNPS-skatole [3'-bromo-3-methyl-2-(2-nitrobenzenesulphenyl)indolenine], hydroxylamine, Staphylococcus aureus V8 proteinase, tryptic and chymotryptic peptides, and by digestion of the protein with carboxypeptidase A. The sequence of the blocked N-terminal tryptic peptide from citraconylated protein was determined by collisionally induced decomposition mass spectrometry. The protein contains 132 amino acid residues, is enriched with respect to threonine and lysine, lacks cysteine, has an acetylated valine residue at the N-terminus, and has an Mr of 14768 and an isoelectric point of 5.25. This protein contains two short internal repeated sequences from residues 48-54 and from residues 114-119 located within regions of predicted beta-structure and decreasing hydrophobicity. These short repeats are contained within two longer repeated regions from residues 48-60 and residues 114-125, which display 62% sequence similarity. These regions could accommodate the charged and uncharged moieties of long-chain fatty acids and may represent fatty acid-binding domains consistent with the finding that human heart fatty acid-binding protein binds 2 mol of oleate or palmitate/mol of protein. Detailed evidence for the amino acid sequences of the peptides has been deposited as Supplementary Publication SUP 50143 (23 pages) at the British Library Lending Division, Boston Spa, Yorkshire LS23 7BQ, U.K., from whom copies may be obtained as indicated in Biochem. J. (1988) 249, 5. PMID:3421901

  13. Molecular cloning and amino acid sequence of human 5-lipoxygenase

    SciTech Connect

    Matsumoto, T.; Funk, C.D.; Radmark, O.; Hoeoeg, J.O.; Joernvall, H.; Samuelsson, B.

    1988-01-01

    5-Lipoxygenase (EC 1.13.11.34), a Ca²⁺- and ATP-requiring enzyme, catalyzes the first two steps in the biosynthesis of the peptidoleukotrienes and the chemotactic factor leukotriene B₄. A cDNA clone corresponding to 5-lipoxygenase was isolated from a human lung lambda gt11 expression library by immunoscreening with a polyclonal antibody. Additional clones from a human placenta lambda gt11 cDNA library were obtained by plaque hybridization with the ³²P-labeled lung cDNA clone. Sequence data obtained from several overlapping clones indicate that the composite DNAs contain the complete coding region for the enzyme. From the deduced primary structure, 5-lipoxygenase encodes a 673 amino acid protein with a calculated molecular weight of 77,839. Direct analysis of the native protein and its proteolytic fragments confirmed the deduced composition, the amino-terminal amino acid sequence, and the structure of many internal segments. 5-Lipoxygenase has no apparent sequence homology with leukotriene A₄ hydrolase or Ca²⁺-binding proteins. RNA blot analysis indicated substantial amounts of an mRNA species of approximately 2700 nucleotides in leukocytes, lung, and placenta.

  14. Nucleic acid sequence detection using multiplexed oligonucleotide PCR

    DOEpatents

    Nolan, John P.; White, P. Scott

    2006-12-26

    Methods for rapidly detecting single or multiple sequence alleles in a sample nucleic acid are described. Provided are all of the oligonucleotide pairs capable of annealing specifically to a target allele and discriminating among possible sequences thereof, and ligating to each other to form an oligonucleotide complex when a particular sequence feature is present (or, alternatively, absent) in the sample nucleic acid. The design of each oligonucleotide pair permits the subsequent high-level PCR amplification of a specific amplicon when the oligonucleotide complex is formed, but not when the oligonucleotide complex is not formed. The presence or absence of the specific amplicon is used to detect the allele. Detection of the specific amplicon may be achieved using a variety of methods well known in the art, including without limitation, oligonucleotide capture onto DNA chips or microarrays, oligonucleotide capture onto beads or microspheres, electrophoresis, and mass spectrometry. Various labels and address-capture tags may be employed in the amplicon detection step of multiplexed assays, as further described herein.

  15. The amino acid sequence of rabbit muscle triose phosphate isomerase.

    PubMed Central

    Corran, P H; Waley, S G

    1975-01-01

    The amino acid sequence of rabbit muscle triose phosphate isomerase was deduced by characterizing peptides that overlap the tryptic peptides. Thiol groups were modified by oxidation, carboxymethylation or aminoethylation. About 50 peptides that provided information about overlaps were isolated; the peptides were mostly characterized by their compositions and N-terminal residues. The peptide chains contain 248 amino acid residues, and no evidence for dissimilarity of the two subunits that comprise the native enzyme was found. The sequence of the rabbit muscle enzyme may be compared with that of the coelacanth enzyme (Kolb et al., 1974): 84% of the residues are in identical positions. Similarly, comparison of the sequence with that inferred for the chicken enzyme (Furth et al., 1974) shows that 87% of the residues are in identical positions. Limited though these comparisons are, they suggest that triose phosphate isomerase has one of the lowest rates of evolutionary change. An extended version of the present paper has been deposited as Supplementary Publication SUP 50040 (42 pages) at the British Library (Lending Division) (formerly the National Lending Library for Science and Technology), Boston Spa, Yorks. LS23 7BQ, U.K., from whom copies can be obtained on the terms given in Biochem. J. (1975) 145, 5. PMID:1171682
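
    The "residues in identical positions" figures quoted above are percent-identity values between aligned sequences. A minimal sketch of that calculation follows, assuming the two sequences are already aligned to equal length and that gap columns are left out of the denominator; the abstract does not state which convention was used, and the function name and toy sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percentage of aligned, gap-free positions holding the same residue."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be pre-aligned to equal length")
    # Assumption: columns with a gap in either sequence are not counted.
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != gap and b != gap]
    if not pairs:
        raise ValueError("no comparable positions")
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Toy pre-aligned fragments, made up for illustration (not real enzyme sequences).
toy_a = "APSRKFFVGG"
toy_b = "APTRKFFVGG"
print(percent_identity(toy_a, toy_b))
```

    Counting gap columns as mismatches instead would lower the value, so the convention should be reported alongside any identity figure.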

  16. Proteogenomics: Integrating Next-Generation Sequencing and Mass Spectrometry to Characterize Human Proteomic Variation

    NASA Astrophysics Data System (ADS)

    Sheynkman, Gloria M.; Shortreed, Michael R.; Cesnik, Anthony J.; Smith, Lloyd M.

    2016-06-01

    Mass spectrometry–based proteomics has emerged as the leading method for detection, quantification, and characterization of proteins. Nearly all proteomic workflows rely on proteomic databases to identify peptides and proteins, but these databases typically contain a generic set of proteins that lack variations unique to a given sample, precluding their detection. Fortunately, proteogenomics enables the detection of such proteomic variations and can be defined, broadly, as the use of nucleotide sequences to generate candidate protein sequences for mass spectrometry database searching. Proteogenomics is experiencing heightened significance due to two developments: (a) advances in DNA sequencing technologies that have made complete sequencing of human genomes and transcriptomes routine, and (b) the unveiling of the tremendous complexity of the human proteome as expressed at the levels of genes, cells, tissues, individuals, and populations. We review here the field of human proteogenomics, with an emphasis on its history, current implementations, the types of proteomic variations it reveals, and several important applications.
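
    The core idea described above, using nucleotide sequences to generate candidate protein sequences for database searching, can be sketched as follows. This is only an illustration under stated assumptions: the standard genetic code, a single-nucleotide variant applied to a made-up coding sequence, and the common in-silico trypsin rule (cleave after K or R unless followed by P). None of the function names or sequences come from the review.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds: str) -> str:
    """Translate a coding sequence in frame 0, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def apply_snv(cds: str, pos: int, alt: str) -> str:
    """Return the sequence with the base at 0-based position pos replaced by alt."""
    return cds[:pos] + alt + cds[pos + 1:]


def tryptic_peptides(protein: str) -> list[str]:
    """Split after K or R unless the next residue is P (no missed cleavages)."""
    peptides, start = [], 0
    for i, aa in enumerate(protein):
        nxt = protein[i + 1] if i + 1 < len(protein) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(protein[start:i + 1])
            start = i + 1
    if start < len(protein):
        peptides.append(protein[start:])
    return peptides


# Hypothetical reference CDS and sample-specific variant (illustration only).
reference_cds = "ATGGCTAAAGGTGAACGTTTTTAA"
variant_cds = apply_snv(reference_cds, 10, "A")

ref_peptides = set(tryptic_peptides(translate(reference_cds)))
var_peptides = set(tryptic_peptides(translate(variant_cds)))
# Peptides that exist only in the sample-specific database.
print(sorted(var_peptides - ref_peptides))
```

    Only the variant-specific peptides need to be appended to the generic search database; real workflows also handle indels, splice junctions, and missed cleavages, which this sketch omits.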

  17. Proteogenomics: Integrating Next-Generation Sequencing and Mass Spectrometry to Characterize Human Proteomic Variation.

    PubMed

    Sheynkman, Gloria M; Shortreed, Michael R; Cesnik, Anthony J; Smith, Lloyd M

    2016-06-12

    Mass spectrometry-based proteomics has emerged as the leading method for detection, quantification, and characterization of proteins. Nearly all proteomic workflows rely on proteomic databases to identify peptides and proteins, but these databases typically contain a generic set of proteins that lack variations unique to a given sample, precluding their detection. Fortunately, proteogenomics enables the detection of such proteomic variations and can be defined, broadly, as the use of nucleotide sequences to generate candidate protein sequences for mass spectrometry database searching. Proteogenomics is experiencing heightened significance due to two developments: (a) advances in DNA sequencing technologies that have made complete sequencing of human genomes and transcriptomes routine, and (b) the unveiling of the tremendous complexity of the human proteome as expressed at the levels of genes, cells, tissues, individuals, and populations. We review here the field of human proteogenomics, with an emphasis on its history, current implementations, the types of proteomic variations it reveals, and several important applications. PMID:27049631

  18. Proteogenomics: Integrating Next-Generation Sequencing and Mass Spectrometry to Characterize Human Proteomic Variation

    PubMed Central

    Sheynkman, Gloria M.; Shortreed, Michael R.; Cesnik, Anthony J.; Smith, Lloyd M.

    2016-01-01

    Mass spectrometry–based proteomics has emerged as the leading method for detection, quantification, and characterization of proteins. Nearly all proteomic workflows rely on proteomic databases to identify peptides and proteins, but these databases typically contain a generic set of proteins that lack variations unique to a given sample, precluding their detection. Fortunately, proteogenomics enables the detection of such proteomic variations and can be defined, broadly, as the use of nucleotide sequences to generate candidate protein sequences for mass spectrometry database searching. Proteogenomics is experiencing heightened significance due to two developments: (a) advances in DNA sequencing technologies that have made complete sequencing of human genomes and transcriptomes routine, and (b) the unveiling of the tremendous complexity of the human proteome as expressed at the levels of genes, cells, tissues, individuals, and populations. We review here the field of human proteogenomics, with an emphasis on its history, current implementations, the types of proteomic variations it reveals, and several important applications. PMID:27049631

  19. The amino acid sequence of chymopapain from Carica papaya.

    PubMed Central

    Watson, D C; Yaguchi, M; Lynn, K R

    1990-01-01

    Chymopapain is a polypeptide of 218 amino acid residues. It has considerable structural similarity with papain and papaya proteinase omega, including conservation of the catalytic site and of the disulphide bonding. Chymopapain is like papaya proteinase omega in carrying four extra residues between papain positions 168 and 169, but differs from both papaya proteinases in the composition of its S2 subsite, as well as in having a second thiol group, Cys-117. Some evidence for the amino acid sequence of chymopapain has been deposited as Supplementary Publication SUP 50153 (12 pages) at the British Library Document Supply Centre, Boston Spa., Wetherby, West Yorkshire LS23 7BQ, U.K., from whom copies may be obtained on the terms indicated in Biochem. J. (1990) 265, 5. The information comprises Supplement Tables 1-4, which contain, in order, amino acid compositions of peptides from tryptic, peptic, CNBr and mild acid cleavages, Supplement Fig. 1, showing re-fractionation of selected peaks from Fig. 2 of the main paper. Supplement Fig. 2, showing cation-exchange chromatography of the earliest-eluted peak of Fig. 3 of the main paper, Supplement Fig. 3, showing reverse-phase h.p.l.c. of the later-eluted peak from Fig. 3 of the main paper, and Supplement Fig. 4, showing the separation of peptides after mild acid hydrolysis of CNBr-cleavage fragment CB3. PMID:2106878

  20. The amino acid sequence of chymopapain from Carica papaya.

    PubMed

    Watson, D C; Yaguchi, M; Lynn, K R

    1990-02-15

    Chymopapain is a polypeptide of 218 amino acid residues. It has considerable structural similarity with papain and papaya proteinase omega, including conservation of the catalytic site and of the disulphide bonding. Chymopapain is like papaya proteinase omega in carrying four extra residues between papain positions 168 and 169, but differs from both papaya proteinases in the composition of its S2 subsite, as well as in having a second thiol group, Cys-117. Some evidence for the amino acid sequence of chymopapain has been deposited as Supplementary Publication SUP 50153 (12 pages) at the British Library Document Supply Centre, Boston Spa., Wetherby, West Yorkshire LS23 7BQ, U.K., from whom copies may be obtained on the terms indicated in Biochem. J. (1990) 265, 5. The information comprises Supplement Tables 1-4, which contain, in order, amino acid compositions of peptides from tryptic, peptic, CNBr and mild acid cleavages, Supplement Fig. 1, showing re-fractionation of selected peaks from Fig. 2 of the main paper. Supplement Fig. 2, showing cation-exchange chromatography of the earliest-eluted peak of Fig. 3 of the main paper, Supplement Fig. 3, showing reverse-phase h.p.l.c. of the later-eluted peak from Fig. 3 of the main paper, and Supplement Fig. 4, showing the separation of peptides after mild acid hydrolysis of CNBr-cleavage fragment CB3. PMID:2106878

  1. Mitochondrial control-region sequence variation in aboriginal Australians.

    PubMed

    van Holst Pellekaan, S; Frommer, M; Sved, J; Boettcher, B

    1998-02-01

    The mitochondrial D-loop hypervariable segment 1 (mt HVS1) between nucleotides 15997 and 16377 has been examined in aboriginal Australian people from the Darling River region of New South Wales (riverine) and from Yuendumu in central Australia (desert). Forty-seven unique HVS1 types were identified, varying at 49 nucleotide positions. Pairwise analysis by calculation of BEPPI (between population proportion index) reveals statistically significant structure in the populations, although some identical HVS1 types are seen in the two contrasting regions. mt HVS1 types may reflect more-ancient distributions than do linguistic diversity and other culturally distinguishing attributes. Comparison with sequences from five published global studies reveals that these Australians demonstrate greatest divergence from some Africans, least from Papua New Guinea highlanders, and only slightly more from some Pacific groups (Indonesian, Asian, Samoan, and coastal Papua New Guinea), although the HVS1 types vary at different nucleotide sites. Construction of a median network, displaying three main groups, suggests that several hypervariable nucleotide sites within the HVS1 are likely to have undergone mutation independently, making phylogenetic comparison with global samples by conventional methods difficult. Specific nucleotide-site variants are major separators in median networks constructed from Australian HVS1 types alone and for one global selection. The distribution of these, requiring extended study, suggests that they may be signatures of different groups of prehistoric colonizers into Australia, for which the time of colonization remains elusive. PMID:9463317

  2. Anchored pseudo-de novo assembly of human genomes identifies extensive sequence variation from unmapped sequence reads.

    PubMed

    Faber-Hammond, Joshua J; Brown, Kim H

    2016-07-01

    The human genome reference (HGR) completion marked the genomics era beginning, yet despite its utility universal application is limited by the small number of individuals used in its development. This is highlighted by the presence of high-quality sequence reads failing to map within the HGR. Sequences failing to map generally represent 2-5 % of total reads, which may harbor regions that would enhance our understanding of population variation, evolution, and disease. Alternatively, complete de novo assemblies can be created, but these effectively ignore the groundwork of the HGR. In an effort to find a middle ground, we developed a bioinformatic pipeline that maps paired-end reads to the HGR as separate single reads, exports unmappable reads, de novo assembles these reads per individual and then combines assemblies into a secondary reference assembly used for comparative analysis. Using 45 diverse 1000 Genomes Project individuals, we identified 351,361 contigs covering 195.5 Mb of sequence unincorporated in GRCh38. 30,879 contigs are represented in multiple individuals with ~40 % showing high sequence complexity. Genomic coordinates were generated for 99.9 %, with 52.5 % exhibiting high-quality mapping scores. Comparative genomic analyses with archaic humans and primates revealed significant sequence alignments and comparisons with model organism RefSeq gene datasets identified novel human genes. If incorporated, these sequences will expand the HGR, but more importantly our data highlight that with this method low coverage (~10-20×) next-generation sequencing can still be used to identify novel unmapped sequences to explore biological functions contributing to human phenotypic variation, disease and functionality for personal genomic medicine. PMID:27061184

  3. Population variation of human mtDNA control region sequences detected by enzymatic amplification and sequence-specific oligonucleotide probes.

    PubMed Central

    Stoneking, M; Hedgecock, D; Higuchi, R G; Vigilant, L; Erlich, H A

    1991-01-01

    A method for detecting sequence variation of hypervariable segments of the mtDNA control region was developed. The technique uses hybridization of sequence-specific oligonucleotide (SSO) probes to DNA sequences that have been amplified by PCR. The nucleotide sequences of the two hypervariable segments of the mtDNA control region from 52 individuals were determined; these sequences were then used to define nine regions suitable for SSO typing. A total of 23 SSO probes were used to detect sequence variants at these nine regions in 525 individuals from five ethnic groups (African, Asian, Caucasian, Japanese, and Mexican). The SSO typing revealed an enormous amount of variability, with 274 mtDNA types observed among these 525 individuals and with diversity values, for each population, exceeding 0.95. For each of the nine mtDNA regions significant differences in the frequencies of sequence variants were observed between these five populations. The mtDNA SSO-typing system was successfully applied to a case involving individual identification of skeletal remains; the probability of a random match was approximately 0.7%. The potential useful applications of this mtDNA SSO-typing system thus include the analysis of individual identity as well as population genetic studies. PMID:1990843
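
    A diversity value of the kind quoted above can be computed from a list of observed mtDNA types. The sketch below assumes the widely used unbiased estimator h = n/(n-1) · (1 - Σ pᵢ²), where pᵢ is the frequency of type i among n individuals; the abstract does not say which estimator the authors applied, and the function name and sample data are hypothetical.

```python
from collections import Counter


def type_diversity(types: list[str]) -> float:
    """Unbiased diversity: n/(n-1) * (1 - sum of squared type frequencies)."""
    n = len(types)
    if n < 2:
        raise ValueError("need at least two individuals")
    counts = Counter(types)
    # Sum of squared frequencies: chance that two draws (with replacement) match.
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)


# Made-up sample of typed individuals (labels are arbitrary).
sample = ["A", "A", "B", "C"]
print(type_diversity(sample))
```

    The value approaches 1 when nearly every individual carries a distinct type, which is the situation a figure above 0.95 describes.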

  4. On human disease-causing amino acid variants: statistical study of sequence and structural patterns

    PubMed Central

    Alexov, Emil

    2015-01-01

    Statistical analysis was carried out on a large set of naturally occurring human amino acid variations, and it was demonstrated that there is a preference for some amino acid substitutions to be associated with diseases. At the amino acid sequence level, it was shown that disease-causing variants frequently involve drastic changes in the physico-chemical properties of amino acids, such as charge, hydrophobicity and geometry. Structural analysis of variants involved in diseases and of variants frequently observed in the human population showed similar trends: disease-causing variants tend to cause more changes in the hydrogen bond network and salt bridges than harmless amino acid mutations do. Analysis of thermodynamics data reported in the literature, both experimental and computational, indicated that disease-causing variants tend to destabilize proteins and their interactions, which prompted us to investigate the effects of amino acid mutations on large databases of experimentally measured energy changes in unrelated proteins. Although the experimental datasets were neither linked to diseases nor exclusive to human proteins, the observed trends were the same: amino acid mutations tend to destabilize proteins and their interactions. Bearing in mind that structural and thermodynamic properties are interrelated, it is pointed out that a large change in any of them is anticipated to cause a disease. PMID:25689729

  5. The Quantification of Representative Sequences pipeline for amplicon sequencing: case study on within-population ITS1 sequence variation in a microparasite infecting Daphnia.

    PubMed

    González-Tortuero, E; Rusek, J; Petrusek, A; Gießler, S; Lyras, D; Grath, S; Castro-Monzón, F; Wolinska, J

    2015-11-01

    Next generation sequencing (NGS) platforms are replacing traditional molecular biology protocols like cloning and Sanger sequencing. However, accuracy of NGS platforms has rarely been measured when quantifying relative frequencies of genotypes or taxa within populations. Here we developed a new bioinformatic pipeline (QRS) that pools similar sequence variants and estimates their frequencies in NGS data sets from populations or communities. We tested whether the estimated frequency of representative sequences, generated by 454 amplicon sequencing, differs significantly from that obtained by Sanger sequencing of cloned PCR products. This was performed by analysing sequence variation of the highly variable first internal transcribed spacer (ITS1) of the ichthyosporean Caullerya mesnili, a microparasite of cladocerans of the genus Daphnia. This analysis also serves as a case example of the usage of this pipeline to study within-population variation. Additionally, a public Illumina data set was used to validate the pipeline on community-level data. Overall, there was a good correspondence in absolute frequencies of C. mesnili ITS1 sequences obtained from Sanger and 454 platforms. Furthermore, analyses of molecular variance (AMOVA) revealed that population structure of C. mesnili differs across lakes and years independently of the sequencing platform. Our results support not only the usefulness of amplicon sequencing data for studies of within-population structure but also the successful application of the QRS pipeline on Illumina-generated data. The QRS pipeline is freely available together with its documentation under GNU Public Licence version 3 at http://code.google.com/p/quantification-representative-sequences. PMID:25728529

  6. Amino acid sequence prerequisites for the formation of cn ions.

    PubMed

    Downard, K M; Biemann, K

    1993-11-01

    Amino acid sequence prerequisites are described for the formation of cn ions observed in high-energy collision-induced decomposition spectra of peptides. It is shown that the formation of cn ions is promoted by the nature of the amino acid C-terminal to the cleavage site. A propensity for cn cleavage preceding threonine, and to a lesser extent tryptophan, lysine, and serine, is demonstrated where fragmentation is directed N-terminally at these residues. In addition, the nature of the residue N-terminal to the cleavage site is shown to have little effect on cn ion formation. A mechanism for cn ion formation is proposed and its applicability to the results observed is discussed. PMID:24227531

  7. Ultrasensitive nucleic acid sequence detection by single-molecule electrophoresis

    SciTech Connect

    Castro, A; Shera, E.B.

    1996-09-01

    This is the final report of a one-year laboratory-directed research and development project at Los Alamos National Laboratory. There has been considerable interest in the development of very sensitive clinical diagnostic techniques over the last few years. Many pathogenic agents are often present in extremely small concentrations in clinical samples, especially at the initial stages of infection, making their detection very difficult. This project sought to develop a new technique for the detection and accurate quantification of specific bacterial and viral nucleic acid sequences in clinical samples. The scheme involved the use of novel hybridization probes for the detection of nucleic acids combined with our recently developed technique of single-molecule electrophoresis. This project is directly relevant to the DOE's Defense Programs strategic directions in the area of biological warfare counter-proliferation.

  8. Temporal Variation of Aristolochia chilensis Aristolochic Acids during Spring.

    PubMed

    Santander, Rocío; Urzúa, Alejandro; Olguín, Ángel; Sánchez, María

    2015-01-01

    In this communication, we report the springtime variation of the composition of aristolochic acids (AAs) in Aristolochia chilensis leaves and stems. The dominant AA in the leaves of all samples, which were collected between October and December, was AA-I (1), and its concentration varied between 212.6±3.8 and 145.6±1.2 mg/kg and decreased linearly. This decrease occurred in parallel with the increase in AA-Ia (5) concentration from 15.9±0.8 mg/kg at the beginning of October to 96.8±7.8 mg/kg in mid-December. Both acids are enzymatically related by methylation-demethylation reactions. Other AAs also showed important variations: AA-II (2) significantly increased in concentration, reaching a maximum in the first two weeks of November and subsequently decreasing in mid-December to approximately the October levels. The principal component in the AA mixture of the stems was also AA-I (1); similar to AA-II (2), its concentration increased beginning in October, peaked in the second week of November and subsequently decreased. The concentrations of AA-IIIa (6) and AA-IVa (7) in the leaves and stems varied throughout the study period, but no clear pattern was identified. Based on the variation of AAs in A. chilensis leaves and stems during the study period, the reduced contents of non-phenolic AAs and increased concentrations of phenolic AAs are likely associated with a decrease in this plant's toxicity during the spring. PMID:26580587

  9. Targeted Exome Sequencing Outcome Variations of Colorectal Tumors within and across Two Sequencing Platforms

    PubMed Central

    Ashktorab, Hassan; Azimi, Hamed; Nickerson, Michael L.; Bass, Sara; Varma, Sudhir; Brim, Hassan

    2016-01-01

    Background and Aim: Next generation sequencing (NGS) has quickly become the tool of choice for genome and exome data generation. The multitude of sequencing platforms, as well as the variability within each platform, needs to be assessed. In this paper we used two platforms (Ion Torrent and Illumina) to assess single nucleotide variants (SNVs) in colorectal cancer (CRC) specimens. Methods: CRC specimens (n = 13) collected from 6 CRC patients (cancer and matched normal) were used to establish the mutational profile using the Ion Torrent and Illumina sequencing platforms. We analyzed a set of formalin-fixed paraffin-embedded (FFPE) and fresh frozen (FF) samples on both platforms to assess the effect of sample nature (FFPE vs. FF) on sequencing outcome and to evaluate the similarities and differences of SNVs across the two platforms. In addition, duplicates of FF samples were sequenced on each platform to assess variability within platform. Results: The comparison of FF replicates to each other gave a concordance of 77% (± 15.3%) in Ion Torrent and 70% (± 3.7%) in Illumina. FFPE vs. FF replicates gave a concordance of 40% (± 32%) in Ion Torrent and 49% (± 19%) in Illumina. Cross-platform concordance averaged 75% (± 9.8%) for FFPE samples and 67% (± 32%) for FF samples, with an overall average of 70% (± 26.8%). Conclusion: Our data show significant variability within and across platforms. Also, the number of detected variants depends on the nature of the specimen (FF vs. FFPE). Validation of NGS-discovered mutations is a must to rule out false-positive mutations. This validation might be performed either through a second NGS platform or through Sanger sequencing.

  10. Targeted capture enrichment and sequencing identifies extensive nucleotide variation in the turkey MHC-B.

    PubMed

    Reed, Kent M; Mendoza, Kristelle M; Settlage, Robert E

    2016-03-01

    Variation in the major histocompatibility complex (MHC) is increasingly associated with disease susceptibility and resistance in avian species of agricultural importance. This variation includes sequence polymorphisms but also structural differences (gene rearrangement) and copy number variation (CNV). The MHC has now been described for multiple galliform species including the best defined assemblies of the chicken (Gallus gallus) and domestic turkey (Meleagris gallopavo). Using this sequence resource, this study applied high-throughput sequencing to investigate MHC variation in turkeys of North America (NA turkeys). An MHC-specific SureSelect (Agilent) capture array was developed, and libraries were created for 14 turkeys representing domestic (commercial bred), heritage breed, and wild turkeys. In addition, a representative of the Ocellated turkey (M. ocellata) and chicken (G. gallus) was included to test cross-species applicability of the capture array allowing for identification of new species-specific polymorphisms. Libraries were hybridized to ∼12 K cRNA baits and the resulting pools were sequenced. On average, 98% of processed reads mapped to the turkey whole genome sequence and 53% to the MHC target. In addition to the MHC, capture hybridization recovered sequences corresponding to other MHC regions. Sequence alignment and de novo assembly indicated the presence of several additional BG genes in the turkey with evidence for CNV. Variant detection identified an average of 2245 polymorphisms per individual for the NA turkeys, 3012 for the Ocellated turkey, and 462 variants in the chicken (RJF-256). This study provides an extensive sequence resource for examining MHC variation and its relation to health of this agriculturally important group of birds. PMID:26729471

  11. Individual variation and intraclass correlation in arachidonic acid and eicosapentaenoic acid in chicken muscle

    PubMed Central

    2010-01-01

    Chicken meat with reduced concentration of arachidonic acid (AA) and reduced ratio between omega-6 and omega-3 fatty acids has potential health benefits because a reduction in AA intake dampens prostanoid signaling, and the proportion between omega-6 and omega-3 fatty acids is too high in our diet. Analyses for fatty acid determination are expensive, and finding the optimal number of analyses to give reliable results is a challenge. The objective of the present study was i) to analyse the intraclass correlation of different fatty acids in five meat samples, of one gram each, within the same chicken thigh, and ii) to study individual variations in the concentrations of a range of fatty acids and the ratio between omega-6 and omega-3 fatty acid concentrations among fifteen chickens. Fifteen newly hatched broilers were fed a wheat-based diet containing 4% rapeseed oil and 1% linseed oil for three weeks. Five muscle samples from the mid location of the thigh of each chicken were analysed for fatty acid composition. The intraclass correlation (sample correlation within the same animal) was 0.85-0.98 for the ratios of total omega-6 to total omega-3 fatty acids and of AA to eicosapentaenoic acid (EPA). This indicates that when studying these fatty acid ratios, one sample of one gram per animal is sufficient. However, due to the high individual variation between chicken for these ratios, a relatively high number of animals (minimum 15) are required to obtain a sufficiently high power to reveal significant effects of experimental factors (e.g. feeding regimes). The present experiment resulted in meat with a favorable concentration ratio between omega-6 and omega-3 fatty acids. The AA concentration varied from 1.5 to 2.8 g/100 g total fatty acids in thigh muscle in the fifteen broilers, and the ratio between AA and EPA concentrations ranged from 2.3 to 3.9. These differences among the birds may be due to genetic variance that can be exploited by breeding for lower AA

  12. Otopalatodigital syndrome type 2 in a male infant: A case report with a novel sequence variation

    PubMed Central

    Sankararaman, Senthilkumar; Kurepa, Dalibor; Shen, Yiping; Kakkilaya, Venkatakrishna; Ursin, Sussone; Chen, Harold

    2013-01-01

    We report a male infant with typical clinical, pathological and radiological features of otopalatodigital syndrome type 2 (OPD 2) with a novel sequence variation in the FLNA gene. His clinical manifestations include typical craniofacial features, cleft palate, hearing impairment, omphalocele, bowing of the long bones, absent fibulae and digital abnormalities consistent with OPD 2. Two hemizygous sequence variations in the FLNA gene were identified. The variation c.5290G>A/p.Ala1764Thr has been previously reported in a patient with periventricular nodular heterotopia, but subsequently it has been reported as a polymorphism. The other variation c.613T>C/p.Cys205Arg detected in the proband has not been previously reported and our analysis indicates that this is a novel disease-causing mutation for OPD2.

  13. Major Breeding Plumage Color Differences of Male Ruffs (Philomachus pugnax) Are Not Associated With Coding Sequence Variation in the MC1R Gene

    PubMed Central

    Küpper, Clemens; Burke, Terry; Lank, David B.

    2015-01-01

    Sequence variation in the melanocortin-1 receptor (MC1R) gene explains color morph variation in several species of birds and mammals. Ruffs (Philomachus pugnax) exhibit major dark/light color differences in melanin-based male breeding plumage which is closely associated with alternative reproductive behavior. A previous study identified a microsatellite marker (Ppu020) near the MC1R locus associated with the presence/absence of ornamental plumage. We investigated whether coding sequence variation in the MC1R gene explains major dark/light plumage color variation and/or the presence/absence of ornamental plumage in ruffs. Among 821bp of the MC1R coding region from 44 male ruffs we found 3 single nucleotide polymorphisms, representing 1 nonsynonymous and 2 synonymous amino acid substitutions. None were associated with major dark/light color differences or the presence/absence of ornamental plumage. At all amino acid sites known to be functionally important in other avian species with dark/light plumage color variation, ruffs were either monomorphic or the shared polymorphism did not coincide with color morph. Neither ornamental plumage color differences nor the presence/absence of ornamental plumage in ruffs are likely to be caused entirely by amino acid variation within the coding regions of the MC1R locus. Regulatory elements and structural variation at other loci may be involved in melanin expression and contribute to the extreme plumage polymorphism observed in this species. PMID:25534935

  14. Genetic variation assessment of acid lime accessions collected from south of Iran using SSR and ISSR molecular markers.

    PubMed

    Sharafi, Ata Allah; Abkenar, Asad Asadi; Sharafi, Ali; Masaeli, Mohammad

    2016-01-01

    Iran has a long history of acid lime cultivation and propagation. In this study, genetic variation in 28 acid lime accessions from five regions of the south of Iran, and their relatedness to 19 other citrus cultivars, were analyzed using Simple Sequence Repeat (SSR) and Inter-Simple Sequence Repeat (ISSR) molecular markers. Nine SSR primers and nine ISSR primers were used for allele scoring. In total, 49 SSR and 131 ISSR polymorphic alleles were detected. Cluster analysis of SSR and ISSR data showed that most of the acid lime accessions (19 genotypes) have a hybrid origin and are genetically distant from the nucellar Mexican lime accessions (9 genotypes). As nucellar Mexican limes are susceptible to phytoplasma, these acid lime genotypes can be evaluated for their tolerance to biotic constraints such as lime "witches' broom disease". PMID:27186022

  15. Sequence variation of ribosomal internal transcribed spacers (ITS) in commercially important Phytoseiidae mites.

    PubMed

    Navajas, M; Lagnel, J; Fauvel, G; de Moraes, G

    1999-11-01

    Preliminary work is needed to assess the usefulness of different markers at different taxonomic scales when a new group is analyzed, such as the commercially important Phytoseiidae mites. We investigate here the level of sequence variation of the nuclear ribosomal spacers ITS 1 and 2 and the 5.8S gene in six species of Phytoseiidae: Neoseiulus californicus, N. fallacis, Euseius concordis, Metaseiulus occidentalis, Typhlodromus pyri and Phytoseiulus persimilis. As expected, the 5.8S gene (148 base pairs) is markedly conserved and displays little variation in between-genera comparisons. ITS1 and ITS2 show contrasting patterns: while the ITS2 is short (80-89 bp) and shows little variation, the ITS1 is longer (303-404 bp) and is very variable in sequence. This fact compromises reliable nucleotide homologies when comparing the genera. The comparison of ITS1 sequence similarity at the species level might be useful for species identification; however, the value of ITS in taxonomic studies does not extend to the level of the family. The intraspecific variations of ITS were investigated in three species: N. californicus, N. fallacis and E. concordis. The first species has identical ITS1 sequences and the last two display low polymorphism (2 nucleotide substitutions). The ITS2 and 5.8S sequences were identical in all three subspecies comparisons. PMID:10668860

  16. Low-level sequence variation in Toxoplasma gondii calcium-dependent protein kinases among different genotypes.

    PubMed

    Wang, J L; Zhang, N Z; Huang, S Y; Xu, Y; Wang, R A; Zhu, X Q

    2015-01-01

    The causative agent of toxoplasmosis, Toxoplasma gondii, can infect virtually all nucleated cell types of warm-blooded animals. In this study, we examined the sequence variation in calcium-dependent protein kinase 2 (CDPK2) genes among 13 T. gondii strains from different hosts and geographical locations. The results showed that the lengths of the complete CDPK2 DNA and cDNA sequences were 3671-3673 and 2136 bp, respectively, and the sequence variation was 0-0.9% among different T. gondii strains. Phylogenetic analysis based on the CDPK2 gene sequences revealed that T. gondii strains of the same genotypes were clustered in different clades. Further analysis of all the other T. gondii CDPK genes in genotype I (GT1), II (ME49), or III (VEG) strains indicated the T. gondii CDPK gene family is quite conserved, with sequence variation ranging from 0 to 1.40%. We concluded that CDPK2 as well as all the other CDPK genes in T. gondii cannot be used as proper markers for studying the variants of different T. gondii genotypes from different hosts and geographical locations, but their sequence conservation may be a useful feature promoting them as anti-T. gondii vaccine candidates in further studies. PMID:25966270
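
    The sequence-variation percentages quoted above (0-0.9% for CDPK2) can be read as pairwise differences between aligned sequences. The abstract does not say how the figures were computed, so the following is only a minimal sketch under that assumption; the function names and the toy sequences are hypothetical, and gap characters are simply treated as ordinary symbols.

```python
from itertools import combinations


def percent_difference(seq_a: str, seq_b: str) -> float:
    """Percent of aligned positions at which two equal-length sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    mismatches = sum(1 for a, b in zip(seq_a.upper(), seq_b.upper()) if a != b)
    return 100.0 * mismatches / len(seq_a)


def variation_range(sequences):
    """Return (min, max) pairwise percent difference over all sequence pairs."""
    values = [percent_difference(a, b) for a, b in combinations(sequences, 2)]
    if not values:
        raise ValueError("need at least two sequences")
    return min(values), max(values)


if __name__ == "__main__":
    # Toy aligned sequences, not real CDPK2 data.
    strains = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT"]
    low, high = variation_range(strains)
    print(f"sequence variation: {low:.1f}-{high:.1f}%")
```

    Reporting the minimum and maximum over all pairs mirrors the "0-0.9%" style of range used in the abstract; a real analysis would start from a proper multiple alignment.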

  17. Oxytocin receptor gene sequences in owl monkeys and other primates show remarkable interspecific regulatory and protein coding variation.

    PubMed

    Babb, Paul L; Fernandez-Duque, Eduardo; Schurr, Theodore G

    2015-10-01

    The oxytocin (OT) hormone pathway is involved in numerous physiological processes, and one of its receptor genes (OXTR) has been implicated in pair bonding behavior in mammalian lineages. This observation is important for understanding social monogamy in primates, which occurs in only a small subset of taxa, including Azara's owl monkey (Aotus azarae). To examine the potential relationship between social monogamy and OXTR variation, we sequenced its 5' regulatory (4936bp) and coding (1167bp) regions in 25 owl monkeys from the Argentinean Gran Chaco, and examined OXTR sequences from 1092 humans from the 1000 Genomes Project. We also assessed interspecific variation of OXTR in 25 primate and rodent species that represent a set of phylogenetically and behaviorally disparate taxa. Our analysis revealed substantial variation in the putative 5' regulatory region of OXTR, with marked structural differences across primate taxa, particularly for humans and chimpanzees, which exhibited unique patterns of large motifs of dinucleotide A+T repeats upstream of the OXTR 5' UTR. In addition, we observed a large number of amino acid substitutions in the OXTR CDS region among New World primate taxa that distinguish them from Old World primates. Furthermore, primate taxa traditionally defined as socially monogamous (e.g., gibbons, owl monkeys, titi monkeys, and saki monkeys) all exhibited different amino acid motifs for their respective OXTR protein coding sequences. These findings support the notion that monogamy has evolved independently in Old World and New World primates, and that it has done so through different molecular mechanisms, not exclusively through the oxytocin pathway. PMID:26025428

  18. Polarimetric Variations of Binary Stars. VI. Orbit-Induced Variations in the Pre-Main-Sequence Binary AK Scorpii

    NASA Astrophysics Data System (ADS)

    Manset, N.; Bastien, P.; Bertout, C.

    2005-01-01

    We present simultaneous UBV polarimetric and photometric observations of the pre-main-sequence binary AK Sco, obtained over 12 nights, slightly less than the orbital period of 13.6 days. The polarization is a sum of interstellar and intrinsic polarization, with a significant intrinsic polarization of 1% at 5250 Å, indicating the presence of circumstellar matter distributed in an asymmetric geometry. The polarization and its position angle are clearly variable on timescales of hours and nights in all three wavelengths, with a behavior related to the orbital motion. The variations have the highest amplitudes seen so far for pre-main-sequence binaries (~1% and ~30°) and are sinusoidal with periods similar to the orbital period and half of it. The polarization variations are generally correlated with the photometric ones: when the star gets fainter, it also gets redder, and its polarization increases. The (B-V, V) color-magnitude diagram exhibits a ratio of total to selective absorption R=4.3, higher than in normal interstellar clouds (R=3.1). The interpretation of the simultaneous photometric and polarimetric observations is that a cloud of circumstellar matter passes in front of the star, decreasing the amount of direct, unpolarized light and hence increasing the contribution of scattered (blue) light. We show that the large amplitude of the polarization variations cannot be reproduced with a single-scattering model and axially symmetric circumbinary or circumstellar disks. Based on observations made with the ESO telescopes at the La Silla Observatory.

  19. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... in the sequence. (4) The enumeration of amino acids may start at the first amino acid of the first..., counting backwards starting with the amino acid next to number 1. Otherwise, the enumeration of amino acids... sequence every 5 amino acids. The enumeration method for amino acid sequences that is set forth......

  20. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... in the sequence. (4) The enumeration of amino acids may start at the first amino acid of the first..., counting backwards starting with the amino acid next to number 1. Otherwise, the enumeration of amino acids... sequence every 5 amino acids. The enumeration method for amino acid sequences that is set forth......

  1. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... in the sequence. (4) The enumeration of amino acids may start at the first amino acid of the first..., counting backwards starting with the amino acid next to number 1. Otherwise, the enumeration of amino acids... sequence every 5 amino acids. The enumeration method for amino acid sequences that is set forth......

  2. Predicting protein disorder by analyzing amino acid sequence

    PubMed Central

    Yang, Jack Y; Yang, Mary Qu

    2008-01-01

    Background Many protein regions and some entire proteins have no definite tertiary structure, presenting instead as dynamic, disordered ensembles under different physicochemical circumstances. These proteins and regions are known as Intrinsically Unstructured Proteins (IUP). IUP have been associated with a wide range of protein functions, along with roles in diseases characterized by protein misfolding and aggregation. Results Identifying IUP is an important task in structural and functional genomics. We extract useful features from sequences and develop machine learning algorithms for the above task. We compare our IUP predictor with PONDRs (mainly neural-network-based predictors), disEMBL (also based on neural networks) and Globplot (based on disorder propensity). Conclusion We find that augmenting features derived from physicochemical properties of amino acids (such as hydrophobicity, complexity etc.) and using an ensemble method proved beneficial. The IUP predictor is a viable alternative software tool for identifying IUP protein regions and proteins. PMID:18831799

  3. Copy number variation of individual cattle genomes using next-generation sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Copy number variations (CNVs) affect a wide range of phenotypic traits; however, CNVs in or near segmental duplication regions are often intractable. Using a read depth approach based on next-generation sequencing, we examined genome-wide copy number differences among five taurine (three Angus, one ...

  4. Copy number variation of individual cattle genomes using next-generation sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Copy Number Variations (CNVs) affect a wide range of phenotypic traits; however, CNVs in or near segmental duplication regions are often difficult to track. Using a read depth approach based on next generation sequencing, we examined genome-wide copy number differences among five taurine (three Angu...

  5. Whole-genome sequencing reveals the diversity of cattle copy number variations and multicopy genes

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Structural and functional impacts of copy number variations (CNVs) on livestock genomes are not yet well understood. We identified 1853 CNV regions using population-scale sequencing data generated from 75 cattle representing 8 breeds (Angus, Brahman, Gir, Holstein, Jersey, Limousin, Nelore, Romagnol...

  6. A sequencing strategy for identifying variation throughout the prion gene of BSE-affected cattle

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Cattle prion gene (PRNP) polymorphisms have been associated with bovine spongiform encephalopathy (BSE) susceptibility. We developed a method for sequencing bovine PRNP through all exons, introns and part of the promoter (25.2 kb) that accounts for known variation. The method can be used to detect...

  7. Mapping cattle copy number variation by population-scale genome sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Copy number variation (CNV) is abundant in livestock, differing from SNPs in extent, origin and functional impact. Despite progress in CNV discovery, the nucleotide resolution architecture of most CNVs remains elusive. As a pilot population study of cattle CNV, we sequenced 100 representative cattle...

  8. Mitochondrial COI sequences in mites: evidence for variations in base composition.

    PubMed

    Navajas, M; Fournier, D; Lagnel, J; Gutierrez, J; Boursot, P

    1996-11-01

    Studies of mitochondrial DNA sequences in a variety of animals have shown important differences between phyla, including differences in the genetic codes used, and varying constraints on base composition. In that respect, little is known of mites, an important and diversified group. We sequenced a portion (340 nt) of the cytochrome oxidase subunit I (COI) encoding gene in twenty species of phytophagous mites belonging to nine genera of the two families Tetranychidae and Tenuipalpidae. The mitochondrial genetic code used in mites appeared to be the same as in insects. As is generally also the case in insects, the mite sequences were very rich in A + T (75% on average), especially at the third codon position (94%). However, important variations of base composition were observed among mite species, one of them showing as little as 69% A + T. Variations of base composition occur mostly through synonymous transitions, and do not have detectable effects on polypeptide evolution in this group. PMID:8933179
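
    The A + T percentages reported above, overall and at the third codon position, are simple base counts. As a rough illustration only, the sketch below computes them for a coding sequence; the function names, the `frame` parameter and the toy sequence are assumptions, since the abstract does not give the reading frame of the 340 nt fragment.

```python
def at_content(seq: str) -> float:
    """Percent of bases in seq that are A or T."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must be non-empty")
    return 100.0 * sum(base in "AT" for base in seq) / len(seq)


def at_content_by_codon_position(coding_seq: str, frame: int = 0):
    """Return A+T percent at codon positions 1, 2 and 3.

    frame is the offset (0, 1 or 2) of the first complete codon; it is an
    assumed input. Trailing bases that do not fill a codon are ignored.
    """
    seq = coding_seq.upper()[frame:]
    usable = len(seq) - len(seq) % 3
    if usable == 0:
        raise ValueError("sequence contains no complete codon")
    return tuple(at_content(seq[pos:usable:3]) for pos in range(3))


if __name__ == "__main__":
    toy = "ATAGCTTTA"  # codons ATA GCT TTA, not real COI data
    first, second, third = at_content_by_codon_position(toy)
    print(f"overall A+T: {at_content(toy):.1f}%")
    print(f"A+T at codon position 3: {third:.1f}%")
```

    Slicing with a step of 3 keeps each codon position separate, which is the comparison the abstract draws between the average (75%) and the third position (94%).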

  9. High-throughput sequencing provides insights into genome variation and evolution in Salmonella Typhi

    PubMed Central

    Holt, Kathryn E; Parkhill, Julian; Mazzoni, Camila J; Roumagnac, Philippe; Weill, François-Xavier; Goodhead, Ian; Rance, Richard; Baker, Stephen; Maskell, Duncan J; Wain, John; Dolecek, Christiane; Achtman, Mark; Dougan, Gordon

    2009-01-01

    Isolates of Salmonella enterica serovar Typhi (Typhi), a human-restricted bacterial pathogen that causes typhoid, show limited genetic variation. We generated whole-genome sequences for 19 Typhi isolates using 454 (Roche) and Solexa (Illumina) technologies. Isolates, including the previously sequenced CT18 and Ty2 isolates, were selected to represent major nodes in the phylogenetic tree. Comparative analysis showed little evidence of purifying selection, antigenic variation or recombination between isolates. Rather, evolution in the Typhi population seems to be characterized by ongoing loss of gene function, consistent with a small effective population size. The lack of evidence for antigenic variation driven by immune selection is in contrast to strong adaptive selection for mutations conferring antibiotic resistance in Typhi. The observed patterns of genetic isolation and drift are consistent with the proposed key role of asymptomatic carriers of Typhi as the main reservoir of this pathogen, highlighting the need for identification and treatment of carriers. PMID:18660809

  10. Mitochondrial Sequence Variation in African-American Primary Open-Angle Glaucoma Patients

    PubMed Central

    Collins, David W.; Gudiseva, Harini V.; Trachtman, Benjamin T.; Jerrehian, Matthew; Gorry, Thomasine; Merritt III, William T.; Rhodes, Allison L.; Sankar, Prithvi S.; Regina, Meredith; Miller-Ellis, Eydie; O’Brien, Joan M.

    2013-01-01

    Primary open-angle glaucoma (POAG) is a major cause of blindness and results from irreversible retinal ganglion cell damage and optic nerve degeneration. In the United States, POAG is most prevalent in African-Americans. Mitochondrial genetics and dysfunction have been implicated in POAG, and potentially pathogenic sequence variations, in particular novel transversional base substitutions, are reportedly common in mitochondrial genomes (mtDNA) from POAG patient blood. The purpose of this study was to ascertain the spectrum of sequence variation in mtDNA from African-American POAG patients and determine whether novel nonsynonymous, transversional or other potentially pathogenic sequence variations are observed more commonly in POAG cases than controls. mtDNA from African-American POAG cases (n = 22) and age-matched controls (n = 22) was analyzed by deep sequencing of a single 16,487 base pair PCR amplicon by Ion Torrent, and candidate novel variants were validated by Sanger sequencing. Sequence variants were classified and interpreted using the MITOMAP compendium of polymorphisms. 99.8% of the observed variations had been previously reported. The ratio of novel variants to POAG cases was 7-fold lower than a prior estimate. Novel mtDNA variants were present in 3 of 22 cases, novel nonsynonymous changes in 1 of 22 cases and novel transversions in 0 of 22 cases; these proportions are significantly lower (p<.0005, p<.0004, p<.0001) than estimated previously for POAG, and did not differ significantly from controls. Although it is possible that mitochondrial genetics play a role in African-Americans’ high susceptibility to POAG, it is unlikely that any mitochondrial respiratory dysfunction is due to an abnormally high incidence of novel mutations that can be detected in mtDNA from peripheral blood. PMID:24146900

  11. Mitochondrial sequence variation in African-American primary open-angle glaucoma patients.

    PubMed

    Collins, David W; Gudiseva, Harini V; Trachtman, Benjamin T; Jerrehian, Matthew; Gorry, Thomasine; Merritt, William T; Rhodes, Allison L; Sankar, Prithvi S; Regina, Meredith; Miller-Ellis, Eydie; O'Brien, Joan M

    2013-01-01

    Primary open-angle glaucoma (POAG) is a major cause of blindness and results from irreversible retinal ganglion cell damage and optic nerve degeneration. In the United States, POAG is most prevalent in African-Americans. Mitochondrial genetics and dysfunction have been implicated in POAG, and potentially pathogenic sequence variations, in particular novel transversional base substitutions, are reportedly common in mitochondrial genomes (mtDNA) from POAG patient blood. The purpose of this study was to ascertain the spectrum of sequence variation in mtDNA from African-American POAG patients and determine whether novel nonsynonymous, transversional or other potentially pathogenic sequence variations are observed more commonly in POAG cases than controls. mtDNA from African-American POAG cases (n = 22) and age-matched controls (n = 22) was analyzed by deep sequencing of a single 16,487 base pair PCR amplicon by Ion Torrent, and candidate novel variants were validated by Sanger sequencing. Sequence variants were classified and interpreted using the MITOMAP compendium of polymorphisms. 99.8% of the observed variations had been previously reported. The ratio of novel variants to POAG cases was 7-fold lower than a prior estimate. Novel mtDNA variants were present in 3 of 22 cases, novel nonsynonymous changes in 1 of 22 cases and novel transversions in 0 of 22 cases; these proportions are significantly lower (p<.0005, p<.0004, p<.0001) than estimated previously for POAG, and did not differ significantly from controls. Although it is possible that mitochondrial genetics play a role in African-Americans' high susceptibility to POAG, it is unlikely that any mitochondrial respiratory dysfunction is due to an abnormally high incidence of novel mutations that can be detected in mtDNA from peripheral blood. PMID:24146900

  12. Simulated seasonal variations in wet acid depositions over East Asia.

    PubMed

    Ge, Cui; Zhang, Meigen; Zhu, Lingyun; Han, Xiao; Wang, Jun

    2011-11-01

    The air quality modeling system Regional Atmospheric Modeling System-Community Multi-scale Air Quality (RAMS-CMAQ) was applied to analyze temporospatial variations in wet acid deposition over East Asia in 2005, and model results obtained on a monthly basis were evaluated against extensive observations, including precipitation amounts at 704 stations and SO4(2-), NO3-, and NH4+ concentrations in the atmosphere and rainwater at 18 EANET (the Acid Deposition Monitoring Network in East Asia) stations. The comparison shows that the modeling system can reasonably reproduce seasonal precipitation patterns, especially the extensive area of dry conditions in northeast China and north China and the major precipitation zones. For ambient concentrations and wet depositions, the simulated results are in reasonable agreement (within a factor of 2) with observations in most cases, and the major observed features are mostly well reproduced. The analysis of modeled wet deposition distributions indicates that East Asia experiences noticeable variations in its wet deposition patterns throughout the year. In winter, southern China and the coastal areas of the Japan Sea report higher SO4(2-) and NO3- wet depositions. In spring, elevated SO4(2-) and NO3- wet depositions are found in northeastern China, southern China, and around the Yangtze River. In summer, a remarkable rise in precipitation in northeastern China, the valleys of the Huaihe and Yangtze rivers, Korea, and Japan leads to a noticeable increase in SO4(2-) and NO3- wet depositions, whereas in autumn, higher SO4(2-) and NO3- wet depositions are found around Sichuan Province. Meanwhile, due to the high emission of SO2, high wet depositions of SO4(2-) are found throughout the entire year in the area surrounding Sichuan Province. There is a tendency toward decreasing NO3- concentrations in rainwater from China through Korea to Japan in both observed and simulated results, which is a consequence of the influence of the continental

  13. Oleic Acid: Natural variation and potential enhancement in oilseed crops.

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Oleic acid is a monounsaturated omega 9 fatty acid (MUFA, C18:1) which can be found in various plant lipids and animal fats. Unlike omega 3 (a-linolenic acid, C18:3) and omega 6 (linoleic acid, C18:2) fatty acids which are essential because they cannot be synthesized by humans and must be obtained f...

  14. Comparison of Predicted Scaffold-Compatible Sequence Variation in the Triple-Hairpin Structure of Human Immunodeficiency Virus Type 1 gp41 with Patient Data

    PubMed Central

    Boutonnet, Nathalie; Janssens, Wouter; Boutton, Carlo; Verschelde, Jean-Luc; Heyndrickx, Leo; Beirnaert, Els; van der Groen, Guido; Lasters, Ignace

    2002-01-01

    It has been proposed that the ectodomain of human immunodeficiency virus type 1 (HIV-1) gp41 (e-gp41), involved in HIV entry into the target cell, exists in at least two conformations, a pre-hairpin intermediate and a fusion-active hairpin structure. To obtain more information on the structure-sequence relationship in e-gp41, we performed in silico a full single-amino-acid substitution analysis, resulting in a Fold Compatible Database (FCD) for each conformation. The FCD contains for each residue position in a given protein a list of values assessing the energetic compatibility (ECO) of each of the 20 natural amino acids at that position. Our results suggest that FCD predictions are in good agreement with the sequence variation observed for well-validated e-gp41 sequences. The data show that at a minECO threshold value of 5 kcal/mol, about 90% of the observed patient sequence variation is encompassed by the FCD predictions. Some inconsistent FCD predictions at N-helix positions packing against residues of the C helix suggest that packing of both peptides may involve some flexibility and may be attributed to an altered orientation of the C-helical domain versus the N-helical region. The permissiveness of sequence variation in the C helices is in agreement with FCD predictions. Comparison of N-core and triple-hairpin FCDs suggests that the N helices may impose more constraints on sequence variation than the C helices. Although the observed sequences of e-gp41 contain many multiple mutations, our method, which is based on single-point mutations, can predict the natural sequence variability of e-gp41 very well. PMID:12097573

  15. Analysis of microbial community variation during the mixed culture fermentation of agricultural peel wastes to produce lactic acid.

    PubMed

    Liang, Shaobo; Gliniewicz, Karol; Gerritsen, Alida T; McDonald, Armando G

    2016-05-01

    Mixed culture fermentation can be used to convert organic wastes into various chemicals and fuels. This study examined the fermentation performance of four batch reactors fed with different agricultural (orange, banana, and potato (mechanical and steam)) peel wastes using mixed cultures, and monitored the interval variation of reactor microbial communities with 16S rRNA genes using Illumina sequencing. All four reactors produced a similar chemical profile with lactic acid (LA) as the dominant compound. Acetic acid and ethanol were also observed in small fractions. The Illumina sequencing results revealed that the diversity of the microbial community decreased during fermentation and that a community of largely lactic acid producing bacteria, dominated by species of Lactobacillus, developed. PMID:26913642

  16. A quantitative trait locus for variation in dopamine metabolism mapped in a primate model using reference sequences from related species

    PubMed Central

    Freimer, Nelson B.; Service, Susan K.; Ophoff, Roel A.; Jasinska, Anna J.; McKee, Kevin; Villeneuve, Amelie; Belisle, Alexandre; Bailey, Julia N.; Breidenthal, Sherry E.; Jorgensen, Matthew J.; Mann, J. John; Cantor, Rita M.; Dewar, Ken; Fairbanks, Lynn A.

    2007-01-01

    Non-human primates (NHP) provide crucial research models. Their strong similarities to humans make them particularly valuable for understanding complex behavioral traits and brain structure and function. We report here the genetic mapping of an NHP nervous system biologic trait, the cerebrospinal fluid (CSF) concentration of the dopamine metabolite homovanillic acid (HVA), in an extended inbred vervet monkey (Chlorocebus aethiops sabaeus) pedigree. CSF HVA is an index of CNS dopamine activity, which is hypothesized to contribute substantially to behavioral variations in NHP and humans. For quantitative trait locus (QTL) mapping, we carried out a two-stage procedure. We first scanned the genome using a first-generation genetic map of short tandem repeat markers. Subsequently, using >100 SNPs within the most promising region identified by the genome scan, we mapped a QTL for CSF HVA at a genome-wide level of significance (peak logarithm of odds score >4) to a narrow well delineated interval (<10 Mb). The SNP discovery exploited conserved segments between human and rhesus macaque reference genome sequences. Our findings demonstrate the potential of using existing primate reference genome sequences for designing high-resolution genetic analyses applicable across a wide range of NHP species, including the many for which full genome sequences are not yet available. Leveraging genomic information from sequenced to nonsequenced species should enable the utilization of the full range of NHP diversity in behavior and disease susceptibility to determine the genetic basis of specific biological and behavioral traits. PMID:17884980

  17. Effect of Sequence Variation on the Mechanical Response of Amyloid Fibrils Probed by Steered Molecular Dynamics Simulation

    PubMed Central

    Ndlovu, Hlengisizwe; Ashcroft, Alison E.; Radford, Sheena E.; Harris, Sarah A.

    2012-01-01

    The mechanical failure of mature amyloid fibers produces fragments that act as seeds for the growth of new fibrils. Fragmentation may also be correlated with cytotoxicity. We have used steered atomistic molecular dynamics simulations to study the mechanical failure of fibrils formed by the amyloidogenic fragment of human amylin hIAPP20-29 subjected to force applied in a variety of directions. By introducing systematic variations to this peptide sequence in silico, we have also investigated the role of the amino-acid sequence in determining the mechanical stability of amyloid fibrils. Our calculations show that the force required to induce mechanical failure depends on the direction of the applied stress and upon the degree of structural order present in the β-sheet assemblies, which in turn depends on the peptide sequence. The results have implications for the importance of sequence-dependent mechanical properties on seeding the growth of new fibrils and the role of breakage events in cytotoxicity. PMID:22325282

  18. Temporal Variations of Organic Acids in Sumac Fruit

    SciTech Connect

    Robbins, C.; Mulcahy, F.; Somayajula, K.; Edenborn, H.M.

    2006-10-01

    Extracts from staghorn sumac (Rhus typhina) fruits were prepared from fresh fruits collected from June to October in two successive years. Total acidity, pH, and concentrations of malic and succinic acids (determined using liquid chromatography) were measured for each extract. Acidity and acid concentrations reached their maxima in late July, and declined slowly thereafter. Malic and succinic acid concentrations in the extracts reached maxima of about 4 and 0.2% (expressed per unit weight of fruit), respectively. Malic and succinic acids were the only organic acids observed in the extracts, and mass balance determinations indicate that these acids are most likely the only ones present in appreciable amounts.

  19. High intraindividual variation in internal transcribed spacer sequences in Aeschynanthus (Gesneriaceae): implications for phylogenetics.

    PubMed Central

    Denduangboripant, J; Cronk, Q C

    2000-01-01

    Aeschynanthus (Gesneriaceae) is a large genus of tropical epiphytes that is widely distributed from the Himalayas and China throughout South-East Asia to New Guinea and the Solomon Islands. Polymerase chain reaction (PCR) consensus sequences of the internal transcribed spacers (ITS) of Aeschynanthus nuclear ribosomal DNA showed sequence polymorphism that was difficult to interpret. Cloning individual sequences from the PCR product generated a phylogenetic tree of 23 Aeschynanthus species (two clones per species). The intraindividual clone pairs varied from 0 to 5.01%. We suggest that the high intraindividual sequence variation results from low molecular drive in the ITS of Aeschynanthus. However, this study shows that, despite the variation found within some individuals, it is still possible to use these data to reconstruct phylogenetic relationships of the species, suggesting that clone variation, although persistent, does not pre-date the divergence of Aeschynanthus species. The Aeschynanthus analysis revealed two major clades with different but overlapping geographic distributions and reflected classification based on morphology (particularly seed hair type). PMID:10983824

  20. Structural gene and complete amino acid sequence of Pseudomonas aeruginosa IFO 3455 elastase.

    PubMed Central

    Fukushima, J; Yamamoto, S; Morihara, K; Atsumi, Y; Takeuchi, H; Kawamoto, S; Okuda, K

    1989-01-01

    The DNA encoding the elastase of Pseudomonas aeruginosa IFO 3455 was cloned, and its complete nucleotide sequence was determined. When the cloned gene was ligated to pUC18, the Escherichia coli expression vector, bacteria carrying the gene exhibited high levels of both elastase activity and elastase antigens. The amino acid sequence, deduced from the nucleotide sequence, revealed that the mature elastase consisted of 301 amino acids with a relative molecular mass of 32,926 daltons. The amino acid composition predicted from the DNA sequence was quite similar to the chemically determined composition of purified elastase reported previously. We also observed nucleotide sequence encoding a signal peptide and "pro" sequence consisting of 197 amino acids upstream from the mature elastase protein gene. The amino acid sequence analysis revealed that both the N-terminal sequence of the purified elastase and the N-terminal side sequences of the C-terminal tryptic peptide as well as the internal lysyl peptide fragment were completely identical to the deduced amino acid sequences. The pattern of identity of amino acid sequences was quite evident in the regions that include structurally and functionally important residues of Bacillus subtilis thermolysin. PMID:2493453

  1. ENTPRISE: An Algorithm for Predicting Human Disease-Associated Amino Acid Substitutions from Sequence Entropy and Predicted Protein Structures

    PubMed Central

    Zhou, Hongyi; Gao, Mu; Skolnick, Jeffrey

    2016-01-01

    The advance of next-generation sequencing technologies has made exome sequencing rapid and relatively inexpensive. A major application of exome sequencing is the identification of genetic variations likely to cause Mendelian diseases. This requires processing large amounts of sequence information and therefore computational approaches that can accurately and efficiently identify the subset of disease-associated variations are needed. The accuracy and high false positive rates of existing computational tools leave much room for improvement. Here, we develop a boosted tree regression machine-learning approach to predict human disease-associated amino acid variations by utilizing a comprehensive combination of protein sequence and structure features. On comparing our method, ENTPRISE, to the state-of-the-art methods SIFT, PolyPhen-2, MUTATIONASSESSOR, MUTATIONTASTER, FATHMM, ENTPRISE exhibits significant improvement. In particular, on a testing dataset consisting of only proteins with balanced disease-associated and neutral variations defined as having the ratio of neutral/disease-associated variations between 0.3 and 3, the Matthews Correlation Coefficient by ENTPRISE is 0.493 as compared to 0.432 by PPH2-HumVar, 0.406 by SIFT, 0.403 by MUTATIONASSESSOR, 0.402 by PPH2-HumDiv, 0.305 by MUTATIONTASTER, and 0.181 by FATHMM. ENTPRISE is then applied to nucleic acid binding proteins in the human proteome. Disease-associated predictions are shown to be highly correlated with the number of protein-protein interactions. Both these predictions and the ENTPRISE server are freely available for academic users as a web service at http://cssb.biology.gatech.edu/entprise/. PMID:26982818
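
    The correlation coefficient used above to compare predictors is a standard function of the binary confusion matrix. The sketch below is not taken from ENTPRISE; it shows only the textbook formula, plus a hypothetical `is_balanced` helper for the neutral/disease-associated ratio filter (treating the 0.3 and 3 bounds as inclusive is an assumption).

```python
import math


def mcc(tp: int, tn: int, fp: int, fn: int) -> float:
    """Matthews correlation coefficient from confusion-matrix counts.

    Returns 0.0 when any marginal total is zero, a common convention.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


def is_balanced(neutral: int, disease: int,
                low: float = 0.3, high: float = 3.0) -> bool:
    """True if the neutral/disease-associated ratio lies within [low, high]."""
    if disease == 0:
        return False
    return low <= neutral / disease <= high


if __name__ == "__main__":
    # Toy counts, not results from the paper.
    print(f"MCC = {mcc(tp=40, tn=45, fp=5, fn=10):.3f}")
    print("balanced protein:", is_balanced(neutral=12, disease=10))
```

    The coefficient ranges from -1 (total disagreement) through 0 (no better than chance) to 1 (perfect prediction), which is why it suits the balanced test set described in the abstract.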

  2. Contemporary environmental variation determines microbial diversity patterns in acid mine drainage

    PubMed Central

    Kuang, Jia-Liang; Huang, Li-Nan; Chen, Lin-Xing; Hua, Zheng-Shuang; Li, Sheng-Jin; Hu, Min; Li, Jin-Tian; Shu, Wen-Sheng

    2013-01-01

    A wide array of microorganisms survive and thrive in extreme environments. However, we know little about the patterns of, and controls over, their large-scale ecological distribution. To this end, we have applied a bar-coded 16S rRNA pyrosequencing technology to explore the phylogenetic differentiation among 59 microbial communities from physically and geochemically diverse acid mine drainage (AMD) sites across Southeast China, revealing for the first time environmental variation as the major factor explaining community differences in these harsh environments. Our data showed that overall microbial diversity estimates, including phylogenetic diversity, phylotype richness and pairwise UniFrac distance, were largely correlated with pH conditions. Furthermore, multivariate regression tree analysis also identified solution pH as a strong predictor of relative lineage abundance. Betaproteobacteria, mostly affiliated with the 'Ferrovum' genus, were explicitly predominant in assemblages under moderate pH conditions, whereas Alphaproteobacteria, Euryarchaeota, Gammaproteobacteria and Nitrospira exhibited a strong adaptation to more acidic environments. Strikingly, such pH-dependent patterns could also be observed in a subsequent comprehensive analysis of the environmental distribution of acidophilic microorganisms based on 16S rRNA gene sequences previously retrieved from globally distributed AMD and associated environments, regardless of the long-distance isolation and the distinct substrate types. Collectively, our results suggest that microbial diversity patterns are better predicted by contemporary environmental variation than by geographical distance in extreme AMD systems. PMID:23178673

  3. BayesPI-BAR: a new biophysical model for characterization of regulatory sequence variations.

    PubMed

    Wang, Junbai; Batmanov, Kirill

    2015-12-01

    Sequence variations in regulatory DNA regions are known to cause functionally important consequences for gene expression. DNA sequence variations may have an essential role in determining phenotypes and may be linked to disease; however, their identification through analysis of massive genome-wide sequencing data is a great challenge. In this work, a new computational pipeline, a Bayesian method for protein-DNA interaction with binding affinity ranking (BayesPI-BAR), is proposed for quantifying the effect of sequence variations on protein binding. BayesPI-BAR uses biophysical modeling of protein-DNA interactions to predict single nucleotide polymorphisms (SNPs) that cause significant changes in the binding affinity of a regulatory region for transcription factors (TFs). The method includes two new parameters (TF chemical potentials or protein concentrations and direct TF binding targets) that are neglected by previous methods. The new method is verified on 67 known human regulatory SNPs, of which 47 (70%) have predicted true TFs ranked in the top 10. Importantly, the performance of BayesPI-BAR, which uses principal component analysis to integrate multiple predictions from various TF chemical potentials, is found to be better than that of existing programs, such as sTRAP and is-rSNP, when evaluated on the same SNPs. BayesPI-BAR is a publicly available tool and is able to carry out parallelized computation, which helps to investigate a large number of TFs or SNPs and to detect disease-associated regulatory sequence variations in the sea of genome-wide noncoding regions. PMID:26202972

  4. Genome-wide Mycobacterium tuberculosis variation (GMTV) database: a new tool for integrating sequence variations and epidemiology

    PubMed Central

    2014-01-01

    Background: Tuberculosis (TB) poses a worldwide threat due to advancing multidrug-resistant strains and deadly co-infections with Human immunodeficiency virus. Today large amounts of Mycobacterium tuberculosis whole genome sequencing data are being assessed broadly and yet there exists no comprehensive online resource that connects M. tuberculosis genome variants with geographic origin, with drug resistance or with clinical outcome. Description: Here we describe a broadly inclusive unifying Genome-wide Mycobacterium tuberculosis Variation (GMTV) database (http://mtb.dobzhanskycenter.org) that catalogues genome variations of M. tuberculosis strains collected across Russia. GMTV contains a broad spectrum of data derived from different sources and related to M. tuberculosis molecular biology, epidemiology, TB clinical outcome, year and place of isolation, and drug resistance profiles, and displays the variants across the genome using a dedicated genome browser. The GMTV database, which includes 1084 genomes and over 69,000 SNP or Indel variants, can be queried about M. tuberculosis genome variation and putative associations with drug resistance, geographical origin, and clinical stages and outcomes. Conclusions: Implementation of GMTV tracks the pattern of changes of M. tuberculosis strains in different geographical areas, facilitates disease gene discoveries associated with drug resistance or different clinical sequelae, and automates comparative genomic analyses among M. tuberculosis strains. PMID:24767249

  5. Human retroviruses and AIDS 1996. A compilation and analysis of nucleic acid and amino acid sequences

    SciTech Connect

    Myers, G.; Foley, B.; Korber, B.; Mellors, J.W.; Jeang, K.T.; Wain-Hobson, S.

    1997-04-01

    This compendium and the accompanying floppy diskettes are the result of an effort to compile and rapidly publish all relevant molecular data concerning the human immunodeficiency viruses (HIV) and related retroviruses. The scope of the compendium and database is best summarized by the five parts that it comprises: (1) Nucleic Acid Alignments and Sequences; (2) Amino Acid Alignments; (3) Analysis; (4) Related Sequences; and (5) Database Communications. Information within all the parts is updated throughout the year on the Web site, http://hiv-web.lanl.gov. While this publication could take the form of a review or sequence monograph, it is not so conceived. Instead, the literature from which the database is derived has simply been summarized and some elementary computational analyses have been performed upon the data. Interpretation and commentary have been avoided insofar as possible so that the reader can form his or her own judgments concerning the complex information. In addition to the general descriptions of the parts of the compendium, the user should read the individual introductions for each part.

  6. Sequence-Based Pronunciation Variation Modeling for Spontaneous ASR Using a Noisy Channel Approach

    NASA Astrophysics Data System (ADS)

    Hofmann, Hansjörg; Sakti, Sakriani; Hori, Chiori; Kashioka, Hideki; Nakamura, Satoshi; Minker, Wolfgang

    The performance of English automatic speech recognition systems decreases when recognizing spontaneous speech, mainly due to multiple pronunciation variants in the utterances. Previous approaches address this problem by modeling the alteration of the pronunciation on a phoneme-to-phoneme level. However, the phonetic transformation effects induced by the pronunciation of the whole sentence have not yet been considered. In this article, the sequence-based pronunciation variation is modeled using a noisy channel approach where the spontaneous phoneme sequence is considered as a “noisy” string and the goal is to recover the “clean” string of the word sequence. In this way, the whole word sequence and its effect on the alteration of the phonemes are taken into consideration. Moreover, the system learns not only the phoneme transformation but also the mapping from the phoneme to the word directly. In this study, the phonemes are first recognized with the present recognition system, and afterwards the pronunciation variation model based on the noisy channel approach maps from the phoneme to the word level. Two well-known natural language processing approaches are adopted and derived from the noisy channel model theory: joint-sequence models and statistical machine translation. Both are applied, and various experiments are conducted using microphone and telephone recordings of spontaneous speech.
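
For readers unfamiliar with the noisy channel idea referred to above, it is usually written as a Bayes-rule decoding problem. The notation below is an assumption made for illustration (S for the observed spontaneous phoneme sequence, W for a candidate word sequence) and is not taken from the article itself.

```latex
\hat{W} \;=\; \arg\max_{W} P(W \mid S)
        \;=\; \arg\max_{W} \frac{P(S \mid W)\,P(W)}{P(S)}
        \;=\; \arg\max_{W} P(S \mid W)\,P(W)
```

Here P(S | W) plays the role of the channel model (how a clean word sequence is distorted into the observed phonemes) and P(W) is a language model over word sequences; P(S) can be dropped because it does not depend on W.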

  7. SoftSearch: Integration of Multiple Sequence Features to Identify Breakpoints of Structural Variations

    PubMed Central

    Hart, Steven N.; Sarangi, Vivekananda; Moore, Raymond; Baheti, Saurabh; Bhavsar, Jaysheel D.; Couch, Fergus J.; Kocher, Jean-Pierre A.

    2013-01-01

    Background: Structural variation (SV) represents a significant, yet poorly understood contribution to an individual’s genetic makeup. Advanced next-generation sequencing technologies are widely used to discover such variations, but there is no single detection tool that is considered a community standard. In an attempt to fulfil this need, we developed an algorithm, SoftSearch, for discovering structural variant breakpoints in Illumina paired-end next-generation sequencing data. SoftSearch combines multiple strategies for detecting SV including split-read, discordant read-pair, and unmated pairs. Co-localized split-reads and discordant read pairs are used to refine the breakpoints. Results: We developed and validated SoftSearch using real and synthetic datasets. SoftSearch’s key features are 1) no requirement for secondary (or exhaustive primary) alignment, 2) portability into established sequencing workflows, and 3) applicability to any DNA-sequencing experiment (e.g. whole genome, exome, custom capture, etc.). SoftSearch identifies breakpoints from a small number of soft-clipped bases from split reads and a few discordant read-pairs which on their own would not be sufficient to make an SV call. Conclusions: We show that SoftSearch can identify more true SVs by combining multiple sequence features. SoftSearch was able to call clinically relevant SVs in the BRCA2 gene not reported by other tools while offering significantly improved overall performance. PMID:24358278

  8. Sequence analysis and identification of new variations in the coding sequence of melatonin receptor gene (MTNR1A) of Indian Chokla sheep breed

    PubMed Central

    Saxena, Vijay Kumar; Jha, Bipul Kumar; Meena, Amar Singh; Naqvi, S.M.K.

    2014-01-01

    The melatonin receptor 1A gene encodes the prime receptor mediating the effect of melatonin at the neuroendocrine level for control of seasonal reproduction in sheep. The aims of this study were to examine the polymorphism pattern of the coding sequence of the MTNR1A gene in Chokla sheep, a breed of the Indian arid tract, and to identify new variations in relation to its aseasonal status. Genomic DNAs of 101 Chokla sheep were collected and an 824 bp coding sequence of Exon II was amplified. RFLP was performed with the enzymes RsaI and MnlI to assess the presence of polymorphism at positions C606T and G612A, respectively. Genotyping revealed a significantly higher frequency of M and R alleles than m and r alleles. RR and MM were found to be dominantly present in the studied population. Cloning and sequencing of Exon II followed by mutation/polymorphism analysis revealed ten mutations, of which three were non-synonymous (G706A, C893A, G931C). G706A leads to substitution of valine by isoleucine Val125I (U14109) in the fifth transmembrane domain. C893A leads to substitution of alanine by aspartic acid in the third extracellular loop. The G931C mutation brings about substitution of alanine by proline in the seventh transmembrane helix, which can affect the conformational stability of the molecule. Polyphen-2 analysis revealed that the polymorphism at position 931 is potentially damaging while the mutations at positions 706 and 893 were benign. It is concluded that the G931C mutation of the MTNR1A gene may explain, in part, the importance of melatonin structure integrity in influencing seasonality in sheep. PMID:25606429

  9. Genome-wide copy number variation in the bovine genome detected using low coverage sequence of popular beef breeds

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Genomic structural variations are an important source of genetic diversity. Copy number variations (CNVs), gains and losses of large regions of genomic sequence between individuals of a species, are known to be associated with both diseases and phenotypic traits. Deeply sequenced genomes are often u...

  10. Natural vs. random protein sequences: Discovering combinatorics properties on amino acid words.

    PubMed

    Santoni, Daniele; Felici, Giovanni; Vergni, Davide

    2016-02-21

    Random mutations and natural selection have driven the evolution of the protein amino acid sequences that we observe at present in nature. The question of which is the dominant force in protein evolution still lacks an unambiguous answer. Random mutations tend to randomize protein sequences while, in order to have the correct functionality, one expects that selection mechanisms impose rigid constraints on amino acid sequences. Moreover, one also has to consider that the space of all possible amino acid sequences is so astonishingly large that it could be reasonable for a well-tuned amino acid sequence to be indistinguishable from a random one. In order to study the possibility of discriminating between random and natural amino acid sequences, we introduce different measures of association between pairs of amino acids in a sequence, and apply them to a dataset of 1047 natural protein sequences and 10,470 random sequences, carefully generated in order to preserve the relative length and amino acid distribution of the natural proteins. We analyze the multidimensional measures with machine learning techniques and show that, to a reasonable extent, natural protein sequences can be differentiated from random ones. PMID:26656109
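
The abstract does not say how the random sequences were generated. One simple way to preserve both the length and the amino acid composition of a natural protein is to shuffle its residues, and the sketch below assumes that approach. The function name and the example sequence are hypothetical, and the factor of ten copies per protein is only inferred from the reported counts (10,470 = 10 × 1047).

```python
import random


def shuffled_copies(sequence: str, n_copies: int, seed: int = 0) -> list[str]:
    """Return n_copies random permutations of sequence.

    Each copy has exactly the same length and residue counts as the input,
    so only the order of the amino acids is randomized.
    """
    rng = random.Random(seed)  # seeded for reproducibility
    copies = []
    for _ in range(n_copies):
        residues = list(sequence)
        rng.shuffle(residues)
        copies.append("".join(residues))
    return copies


if __name__ == "__main__":
    natural = "MKTAYIAKQRQISFVKSHFSRQ"  # made-up example sequence
    for random_sequence in shuffled_copies(natural, n_copies=10):
        print(random_sequence)
```

Shuffling keeps the composition exact for every copy; sampling residues independently from the observed frequencies would be an alternative that preserves composition only on average.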

  11. Transcriptome Sequencing in Response to Salicylic Acid in Salvia miltiorrhiza

    PubMed Central

    Zhang, Xiaoru; Dong, Juane; Liu, Hailong; Wang, Jiao; Qi, Yuexin; Liang, Zongsuo

    2016-01-01

    Salvia miltiorrhiza is a traditional Chinese herbal medicine, whose quality and yield are often affected by diseases and environmental stresses during its growing season. Salicylic acid (SA) plays a significant role in plants responding to biotic and abiotic stresses, but the involved regulatory factors and their signaling mechanisms are largely unknown. In order to identify the genes involved in SA signaling, the RNA sequencing (RNA-seq) strategy was employed to evaluate the transcriptional profiles in S. miltiorrhiza cell cultures. A total of 50,778 unigenes were assembled, in which 5,316 unigenes were differentially expressed among 0-, 2-, and 8-h SA induction. The up-regulated genes were mainly involved in stimulus response and multi-organism process. A core set of candidate novel genes coding SA signaling component proteins was identified. Many transcription factors (e.g., WRKY, bHLH and GRAS) and genes involved in hormone signal transduction were differentially expressed in response to SA induction. Detailed analysis revealed that genes associated with defense signaling, such as antioxidant system genes, cytochrome P450s and ATP-binding cassette transporters, were significantly overexpressed, which can be used as genetic tools to investigate disease resistance. Our transcriptome analysis will help understand SA signaling and its mechanism of defense systems in S. miltiorrhiza. PMID:26808150

  12. Transcriptome Sequencing in Response to Salicylic Acid in Salvia miltiorrhiza.

    PubMed

    Zhang, Xiaoru; Dong, Juane; Liu, Hailong; Wang, Jiao; Qi, Yuexin; Liang, Zongsuo

    2016-01-01

    Salvia miltiorrhiza is a traditional Chinese herbal medicine, whose quality and yield are often affected by diseases and environmental stresses during its growing season. Salicylic acid (SA) plays a significant role in plants responding to biotic and abiotic stresses, but the involved regulatory factors and their signaling mechanisms are largely unknown. In order to identify the genes involved in SA signaling, the RNA sequencing (RNA-seq) strategy was employed to evaluate the transcriptional profiles in S. miltiorrhiza cell cultures. A total of 50,778 unigenes were assembled, in which 5,316 unigenes were differentially expressed among 0-, 2-, and 8-h SA induction. The up-regulated genes were mainly involved in stimulus response and multi-organism process. A core set of candidate novel genes coding SA signaling component proteins was identified. Many transcription factors (e.g., WRKY, bHLH and GRAS) and genes involved in hormone signal transduction were differentially expressed in response to SA induction. Detailed analysis revealed that genes associated with defense signaling, such as antioxidant system genes, cytochrome P450s and ATP-binding cassette transporters, were significantly overexpressed, which can be used as genetic tools to investigate disease resistance. Our transcriptome analysis will help understand SA signaling and its mechanism of defense systems in S. miltiorrhiza. PMID:26808150

  13. Identification of the ovine KAP11-1 gene (KRTAP11-1) and genetic variation in its coding sequence.

    PubMed

    Gong, Hua; Zhou, Huitong; Dyer, Jolon M; Hickford, Jon G H

    2011-11-01

    Keratin-associated proteins (KAPs) are a structural component of the wool fibre and form the matrix between the keratin intermediate filaments (KIFs). The gene encoding high sulphur-protein KAP11-1 has been identified in human, cattle and mouse, but not yet in sheep, despite the economic importance of wool. In this study, PCR using primers based on the cattle KAP11-1 gene sequence produced an amplicon of the expected size with sheep DNA. Upon using PCR-Single Stranded Conformational Polymorphism (PCR-SSCP) analysis in 260 sheep, six different PCR-SSCP patterns were detected. Either one or a combination of two banding patterns was observed for each sheep, suggesting they were either homozygous or heterozygous for this gene. Sequencing of the amplicons confirmed the occurrence of six DNA sequences. All of these were unique, and the greatest homology was with KRTAP11-1 sequences from cattle, human and mouse, suggesting that they were derived from the ovine KAP11-1 gene and were allelic variants. The ovine KAP11-1 gene had an open reading frame of 477 nucleotides encoding 159 amino acids. The putative protein was rich in serine, cysteine, and threonine which account for 18.2-18.9, 12.6 and 12.0 mol%, respectively. Of these, approximately 20 of the serine and threonine residues might be phosphorylated. Five nucleotide substitutions were identified, and one was non-synonymous and would result in an amino acid change at a potential phosphorylation site. The genetic variation found in KRTAP11-1 may influence its expression, protein structure, and/or post-translational modifications, and consequently affect wool fibre structure and wool traits. PMID:21400094

  14. Sequence variations in the Boophilus microplus Bm86 locus and implications for immunoprotection in cattle vaccinated with this antigen.

    PubMed

    García-García, J C; Gonzalez, I L; González, D M; Valdés, M; Méndez, L; Lamberti, J; D'Agostino, B; Citroni, D; Fragoso, H; Ortiz, M; Rodríguez, M; de la Fuente, J

    1999-11-01

    Cattle tick infestations constitute a major problem for the cattle industry in tropical and subtropical regions of the world. Traditional control methods have been only partially successful, hampered by the selection of chemical-resistant tick populations. The Boophilus microplus Bm86 protein was isolated from tick gut epithelial cells and shown to induce a protective response against tick infestations in vaccinated cattle. Vaccine preparations including the recombinant Bm86 are used to control cattle tick infestations in the field as an alternative measure to reduce the losses produced by this ectoparasite. The principle for the immunological control of tick infestations relies on a polyclonal antibody response against the target antigen; it should therefore be difficult to select for resistant tick populations. However, sequence variations in the Bm86 locus, among other factors, could affect the effectiveness of Bm86-containing vaccines. In the present study we have addressed this issue, employing data obtained with B. microplus strains from Australia, Mexico, Cuba, Argentina and Venezuela. The results showed a tendency towards an inverse correlation between the efficacy of the vaccination with Bm86 and the sequence variations in the Bm86 locus (R² = 0.7). The mutation fixation index in the Bm86 locus was calculated and shown to be between 0.02 and 0.1 amino acids per year. Possible implications of these findings for the immunoprotection of cattle against tick infestations employing the Bm86 antigen are discussed. PMID:10668863

  15. A map of human genome variation from population-scale sequencing.

    PubMed

    Abecasis, Gonçalo R; Altshuler, David; Auton, Adam; Brooks, Lisa D; Durbin, Richard M; Gibbs, Richard A; Hurles, Matt E; McVean, Gil A

    2010-10-28

    The 1000 Genomes Project aims to provide a deep characterization of human genome sequence variation as a foundation for investigating the relationship between genotype and phenotype. Here we present results of the pilot phase of the project, designed to develop and compare different strategies for genome-wide sequencing with high-throughput platforms. We undertook three projects: low-coverage whole-genome sequencing of 179 individuals from four populations; high-coverage sequencing of two mother-father-child trios; and exon-targeted sequencing of 697 individuals from seven populations. We describe the location, allele frequency and local haplotype structure of approximately 15 million single nucleotide polymorphisms, 1 million short insertions and deletions, and 20,000 structural variants, most of which were previously undescribed. We show that, because we have catalogued the vast majority of common variation, over 95% of the currently accessible variants found in any individual are present in this data set. On average, each person is found to carry approximately 250 to 300 loss-of-function variants in annotated genes and 50 to 100 variants previously implicated in inherited disorders. We demonstrate how these results can be used to inform association and functional studies. From the two trios, we directly estimate the rate of de novo germline base substitution mutations to be approximately 10⁻⁸ per base pair per generation. We explore the data with regard to signatures of natural selection, and identify a marked reduction of genetic variation in the neighbourhood of genes, due to selection at linked sites. These methods and public data will support the next phase of human genetic research. PMID:20981092

  16. Sequence variation of koala retrovirus transmembrane protein p15E among koalas from different geographic regions.

    PubMed

    Ishida, Yasuko; McCallister, Chelsea; Nikolaidis, Nikolas; Tsangaras, Kyriakos; Helgen, Kristofer M; Greenwood, Alex D; Roca, Alfred L

    2015-01-15

    The koala retrovirus (KoRV), which is transitioning from an exogenous to an endogenous form, has been associated with high mortality in koalas. For other retroviruses, the envelope protein p15E has been considered a candidate for vaccine development. We therefore examined proviral sequence variation of KoRV p15E in a captive Queensland and three wild southern Australian koalas. We generated 163 sequences with intact open reading frames, which grouped into 39 distinct haplotypes. Sixteen distinct haplotypes comprising 139 of the sequences (85%) coded for the same polypeptide. Among the remaining 23 haplotypes, 22 were detected only once among the sequences, and each had 1 or 2 non-synonymous differences from the majority sequence. Several analyses suggested that p15E was under purifying selection. Important epitopes and domains were highly conserved across the p15E sequences and in previously reported exogenous KoRVs. Overall, these results support the potential use of p15E for KoRV vaccine development. PMID:25462343

  17. Sequence variation between 462 human individuals fine-tunes functional sites of RNA processing.

    PubMed

    Ferreira, Pedro G; Oti, Martin; Barann, Matthias; Wieland, Thomas; Ezquina, Suzana; Friedländer, Marc R; Rivas, Manuel A; Esteve-Codina, Anna; Rosenstiel, Philip; Strom, Tim M; Lappalainen, Tuuli; Guigó, Roderic; Sammeth, Michael

    2016-01-01

    Recent advances in the cost-efficiency of sequencing technologies enabled the combined DNA- and RNA-sequencing of human individuals at the population-scale, making genome-wide investigations of the inter-individual genetic impact on gene expression viable. Employing mRNA-sequencing data from the Geuvadis Project and genome sequencing data from the 1000 Genomes Project we show that the computational analysis of DNA sequences around splice sites and poly-A signals is able to explain several observations in the phenotype data. In contrast to widespread assessments of statistically significant associations between DNA polymorphisms and quantitative traits, we developed a computational tool to pinpoint the molecular mechanisms by which genetic markers drive variation in RNA-processing, cataloguing and classifying alleles that change the affinity of core RNA elements to their recognizing factors. The in silico models we employ further suggest RNA editing can moonlight as a splicing-modulator, albeit less frequently than genomic sequence diversity. Beyond existing annotations, we demonstrate that the ultra-high resolution of RNA-Seq combined from 462 individuals also provides evidence for thousands of bona fide novel elements of RNA processing (alternative splice sites, introns, and cleavage sites), which are often rare and lowly expressed but in other characteristics similar to their annotated counterparts. PMID:27617755

  18. Sequence variation of koala retrovirus transmembrane protein p15E among koalas from different geographic regions

    PubMed Central

    Ishida, Yasuko; McCallister, Chelsea; Nikolaidis, Nikolas; Tsangaras, Kyriakos; Helgen, Kristofer M.; Greenwood, Alex D.; Roca, Alfred L.

    2014-01-01

    The koala retrovirus (KoRV), which is transitioning from an exogenous to an endogenous form, has been associated with high mortality in koalas. For other retroviruses, the envelope protein p15E has been considered a candidate for vaccine development. We therefore examined proviral sequence variation of KoRV p15E in a captive Queensland and three wild southern Australian koalas. We generated 163 sequences with intact open reading frames, which grouped into 39 distinct haplotypes. Sixteen distinct haplotypes comprising 139 of the sequences (85%) coded for the same polypeptide. Among the remaining 23 haplotypes, 22 were detected only once among the sequences, and each had 1 or 2 non-synonymous differences from the majority sequence. Several analyses suggested that p15E was under purifying selection. Important epitopes and domains were highly conserved across the p15E sequences and in previously reported exogenous KoRVs. Overall, these results support the potential use of p15E for KoRV vaccine development. PMID:25462343

  19. A framework for variation discovery and genotyping using next-generation DNA sequencing data.

    PubMed

    DePristo, Mark A; Banks, Eric; Poplin, Ryan; Garimella, Kiran V; Maguire, Jared R; Hartl, Christopher; Philippakis, Anthony A; del Angel, Guillermo; Rivas, Manuel A; Hanna, Matt; McKenna, Aaron; Fennell, Tim J; Kernytsky, Andrew M; Sivachenko, Andrey Y; Cibulskis, Kristian; Gabriel, Stacey B; Altshuler, David; Daly, Mark J

    2011-05-01

    Recent advances in sequencing technology make it possible to comprehensively catalog genetic variation in population samples, creating a foundation for understanding human disease, ancestry and evolution. The amounts of raw data produced are prodigious, and many computational steps are required to translate this output into high-quality variant calls. We present a unified analytic framework to discover and genotype variation among multiple samples simultaneously that achieves sensitive and specific results across five sequencing technologies and three distinct, canonical experimental designs. Our process includes (i) initial read mapping; (ii) local realignment around indels; (iii) base quality score recalibration; (iv) SNP discovery and genotyping to find all potential variants; and (v) machine learning to separate true segregating variation from machine artifacts common to next-generation sequencing technologies. We here discuss the application of these tools, instantiated in the Genome Analysis Toolkit, to deep whole-genome, whole-exome capture and multi-sample low-pass (∼4×) 1000 Genomes Project datasets. PMID:21478889

  20. Virus Load and Sequence Variation in Simian Retrovirus Type 2 Infection

    PubMed Central

    Rosenblum, Lisa L.; Weiss, Robin A.; McClure, Myra O.

    2000-01-01

    The natural history of type D simian retrovirus (SRV) infection is poorly characterized in terms of viral load, antibody status, and sequence variation. To investigate this, blood samples were taken from a small cohort of mostly asymptomatic cynomolgus macaques (Macaca fascicularis), naturally infected with SRV type 2 (SRV-2), some of which were followed over an 8-month period with blood taken every 2 months. Provirus and RNA virus loads were obtained, the samples were screened for presence of antibodies to SRV-2 and neutralizing antibody titers to SRV-2 were assayed. env sequences were aligned to determine intra- and intermonkey variation over time. Virus loads varied greatly among cohort individuals but, conversely, remained steady for each macaque over the 8-month period, regardless of their initial levels. No significant sequence variation was found within an individual over time. No clear picture emerged from these results, which indicate that the variables of SRV-2 infection are complex, differ from those for lentivirus infection, and are not distinctly related to disease outcome. PMID:10729117

  1. Population genetic structure of Indian shad, Tenualosa ilisha inferred from variation in mitochondrial DNA sequences.

    PubMed

    Behera, B K; Singh, N S; Paria, P; Sahoo, A K; Panda, D; Meena, D K; Das, P; Pakrashi, S; Biswas, D K; Sharma, A P

    2015-09-01

    Indian shad, Tenualosa ilisha, is a commercially important anadromous fish representing a major catch in the Indo-Pacific region. The present study evaluated the partial Cytochrome b (Cyt b) gene sequence of mtDNA in T. ilisha to determine genetic variation between fish of Bay of Bengal and Arabian Sea origins. Genomic DNA extracted from T. ilisha samples representing two distant rivers in the Indian subcontinent, the Bhagirathi (lower stretch of the Ganges) and the Tapi, was analyzed. Sequencing of a 307 bp mtDNA Cytochrome b gene fragment revealed the presence of 5 haplotypes, with high haplotype diversity (Hd) of 0.9048 with variance 0.103 and low nucleotide diversity (π) of 0.14301. Three population-specific haplotypes were observed in river Ganga and two haplotypes in river Tapi. A neighbour-joining tree based on Cytochrome b gene sequences of T. ilisha showed that populations of Bay of Bengal and Arabian Sea origins belonged to two distinct clusters. PMID:26521565
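
Haplotype diversity (Hd), as reported above, is commonly computed with Nei's estimator, Hd = n/(n − 1) × (1 − Σ pᵢ²), where n is the number of sampled sequences and pᵢ is the frequency of haplotype i. The abstract does not state which estimator or software was used, so the sketch below is an assumption, and the haplotype counts in the example are made up rather than taken from the study.

```python
def haplotype_diversity(counts: list[int]) -> float:
    """Nei's haplotype (gene) diversity from per-haplotype sample counts.

    Hd = n / (n - 1) * (1 - sum(p_i ** 2)), with p_i = count_i / n.
    """
    n = sum(counts)
    if n < 2:
        raise ValueError("at least two sampled sequences are required")
    sum_squared_freqs = sum((count / n) ** 2 for count in counts)
    return n / (n - 1) * (1.0 - sum_squared_freqs)


if __name__ == "__main__":
    # Hypothetical sample: 10 sequences falling into 5 haplotypes.
    hypothetical_counts = [3, 2, 2, 2, 1]
    print(round(haplotype_diversity(hypothetical_counts), 4))
```

Values close to 1 indicate that two randomly drawn sequences are very likely to carry different haplotypes; a sample with a single haplotype gives 0.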

  2. Spatio-temporal Variations of Characteristic Repeating Earthquake Sequences along the Middle America Trench in Mexico

    NASA Astrophysics Data System (ADS)

    Dominguez, L. A.; Taira, T.; Hjorleifsdottir, V.; Santoyo, M. A.

    2015-12-01

    Repeating earthquake sequences are sets of events that are thought to rupture the same area on the plate interface and thus provide nearly identical waveforms. We systematically analyzed seismic records from 2001 through 2014 to identify repeating earthquakes with highly correlated waveforms occurring along the subduction zone of the Cocos plate. Using the correlation coefficient (cc) and spectral coherency (coh) of the vertical components as selection criteria, we found a set of 214 sequences whose waveforms satisfy cc≥95% and coh≥95%. Spatial clustering along the trench shows large variations in repeating earthquake activity. In particular, the rupture zone of the M8.1, 1985 earthquake shows an almost complete absence of characteristic repeating earthquakes, whereas the Guerrero Gap zone and the segment of the trench close to the Guerrero-Oaxaca border show a significantly larger number of repeating earthquake sequences. Furthermore, temporal variations associated with stress changes due to major earthquakes show episodes of unlocking and healing of the interface. Understanding the different components that control the location and recurrence time of characteristic repeating sequences is a key factor in pinpointing areas where large megathrust earthquakes may nucleate and, consequently, in improving seismic hazard assessment.
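
The selection criterion above can be illustrated with a zero-lag correlation coefficient between two equally sampled waveforms. This is a minimal sketch under stated assumptions: the study's actual processing (filtering, windowing, lag search, and the coherency calculation) is not described in the abstract, and the function names and sample values below are hypothetical.

```python
import math


def correlation_coefficient(x: list[float], y: list[float]) -> float:
    """Zero-lag Pearson correlation between two waveforms of equal length."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("waveforms must have the same length (at least 2 samples)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    energy_x = sum((a - mean_x) ** 2 for a in x)
    energy_y = sum((b - mean_y) ** 2 for b in y)
    if energy_x == 0 or energy_y == 0:
        raise ValueError("a constant waveform has no defined correlation")
    return covariance / math.sqrt(energy_x * energy_y)


def passes_selection(cc: float, coh: float, threshold: float = 0.95) -> bool:
    """Apply the cc >= 95% and coh >= 95% criterion quoted in the abstract."""
    return cc >= threshold and coh >= threshold


if __name__ == "__main__":
    # Made-up samples standing in for two vertical-component records.
    event_a = [0.0, 0.8, 1.0, 0.3, -0.6, -1.0, -0.4, 0.1]
    event_b = [0.1, 0.7, 1.1, 0.2, -0.5, -0.9, -0.5, 0.0]
    cc = correlation_coefficient(event_a, event_b)
    print(cc, passes_selection(cc, coh=0.97))  # coh value is a placeholder
```

Because the coefficient is normalized, two events with the same waveform shape but different amplitudes still correlate at 1, which suits the comparison of repeating events recorded at the same station.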

  3. Magnetic susceptibility variations in Loess sequences and their relationship to astronomical forcing

    NASA Technical Reports Server (NTRS)

    Verosub, Kenneth L.; Singer, Michael J.

    1992-01-01

    The long, well-exposed and often continuous sequences of loess found throughout the world are generally thought to provide an excellent opportunity for studying long-term, large-scale environmental change during the last few million years. In recent years, the most fruitful loess studies have been those involving the deposits of the loess in China. One of the most intriguing results of that work has been the discovery of an apparent correlation between variations in the magnetic susceptibility of the loess sequence and the oxygen isotope record of the deep sea. This correlation implies that magnetic susceptibility variations are being driven by astronomical parameters. However, the basic data have been interpreted in various ways by different authors, most of whom assumed that the magnetic minerals in the loess have not been affected by post-depositional processes. Using a chemical extraction procedure that allows us to separate the contribution of secondary pedogenic magnetic minerals from primary inherited magnetic minerals, we have found that the magnetic susceptibility of the Chinese paleosols is largely due to a pedogenic component which is present to a lesser degree in the loess. We have also found that the smaller inherited component of the magnetic susceptibility is about the same in the paleosols and the loess. These results demonstrate the need for additional study of the processes that create magnetic susceptibility variations in order to interpret properly the role of astronomical forcing in producing these variations.

  4. Matrix genes of measles virus and canine distemper virus: cloning, nucleotide sequences, and deduced amino acid sequences.

    PubMed Central

    Bellini, W J; Englund, G; Richardson, C D; Rozenblatt, S; Lazzarini, R A

    1986-01-01

    The nucleotide sequences encoding the matrix (M) proteins of measles virus (MV) and canine distemper virus (CDV) were determined from cDNA clones containing these genes in their entirety. In both cases, single open reading frames specifying basic proteins of 335 amino acid residues were predicted from the nucleotide sequences. Both viral messages were composed of approximately 1,450 nucleotides and contained 400 nucleotides of presumptive noncoding sequences at their respective 3' ends. MV and CDV M-protein-coding regions were 67% homologous at the nucleotide level and 76% homologous at the amino acid level. Only chance homology was observed in the 400-nucleotide trailer sequences. Comparisons of the M protein sequences of MV and CDV with the sequence reported for Sendai virus (B. M. Blumberg, K. Rose, M. G. Simona, L. Roux, C. Giorgi, and D. Kolakofsky, J. Virol. 52:656-663; Y. Hidaka, T. Kanda, K. Iwasaki, A. Nomoto, T. Shioda, and H. Shibuta, Nucleic Acids Res. 12:7965-7973) indicated the greatest homology among these M proteins in the carboxyterminal third of the molecule. Secondary-structure analyses of this shared region indicated a structurally conserved, hydrophobic sequence which possibly interacted with the lipid bilayer. PMID:3754588

  5. The Intolerance of Regulatory Sequence to Genetic Variation Predicts Gene Dosage Sensitivity.

    PubMed

    Petrovski, Slavé; Gussow, Ayal B; Wang, Quanli; Halvorsen, Matt; Han, Yujun; Weir, William H; Allen, Andrew S; Goldstein, David B

    2015-09-01

    Noncoding sequence contains pathogenic mutations. Yet, compared with mutations in protein-coding sequence, pathogenic regulatory mutations are notoriously difficult to recognize. Most fundamentally, we are not yet adept at recognizing the sequence stretches in the human genome that are most important in regulating the expression of genes. For this reason, it is difficult to apply to the regulatory regions the same kinds of analytical paradigms that are being successfully applied to identify mutations among protein-coding regions that influence risk. To determine whether dosage sensitive genes have distinct patterns among their noncoding sequence, we present two primary approaches that focus solely on a gene's proximal noncoding regulatory sequence. The first approach is a regulatory sequence analogue of the recently introduced residual variation intolerance score (RVIS), termed noncoding RVIS, or ncRVIS. The ncRVIS compares observed and predicted levels of standing variation in the regulatory sequence of human genes. The second approach, termed ncGERP, reflects the phylogenetic conservation of a gene's regulatory sequence using GERP++. We assess how well these two approaches correlate with four gene lists that use different ways to identify genes known or likely to cause disease through changes in expression: 1) genes that are known to cause disease through haploinsufficiency, 2) genes curated as dosage sensitive in ClinGen's Genome Dosage Map, 3) genes judged likely to be under purifying selection for mutations that change expression levels because they are statistically depleted of loss-of-function variants in the general population, and 4) genes judged unlikely to cause disease based on the presence of copy number variants in the general population. We find that both noncoding scores are highly predictive of dosage sensitivity using any of these criteria. In a similar way to ncGERP, we assess two ensemble-based predictors of regional noncoding importance, nc

  6. The Intolerance of Regulatory Sequence to Genetic Variation Predicts Gene Dosage Sensitivity

    PubMed Central

    Wang, Quanli; Halvorsen, Matt; Han, Yujun; Weir, William H.; Allen, Andrew S.; Goldstein, David B.

    2015-01-01

    Noncoding sequence contains pathogenic mutations. Yet, compared with mutations in protein-coding sequence, pathogenic regulatory mutations are notoriously difficult to recognize. Most fundamentally, we are not yet adept at recognizing the sequence stretches in the human genome that are most important in regulating the expression of genes. For this reason, it is difficult to apply to the regulatory regions the same kinds of analytical paradigms that are being successfully applied to identify mutations among protein-coding regions that influence risk. To determine whether dosage sensitive genes have distinct patterns among their noncoding sequence, we present two primary approaches that focus solely on a gene’s proximal noncoding regulatory sequence. The first approach is a regulatory sequence analogue of the recently introduced residual variation intolerance score (RVIS), termed noncoding RVIS, or ncRVIS. The ncRVIS compares observed and predicted levels of standing variation in the regulatory sequence of human genes. The second approach, termed ncGERP, reflects the phylogenetic conservation of a gene’s regulatory sequence using GERP++. We assess how well these two approaches correlate with four gene lists that use different ways to identify genes known or likely to cause disease through changes in expression: 1) genes that are known to cause disease through haploinsufficiency, 2) genes curated as dosage sensitive in ClinGen’s Genome Dosage Map, 3) genes judged likely to be under purifying selection for mutations that change expression levels because they are statistically depleted of loss-of-function variants in the general population, and 4) genes judged unlikely to cause disease based on the presence of copy number variants in the general population. We find that both noncoding scores are highly predictive of dosage sensitivity using any of these criteria. In a similar way to ncGERP, we assess two ensemble-based predictors of regional noncoding importance

  7. Detection and isolation of nucleic acid sequences using a bifunctional hybridization probe

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    2000-01-01

    A method for detecting and isolating a target sequence in a sample of nucleic acids is provided using a bifunctional hybridization probe capable of hybridizing to the target sequence that includes a detectable marker and a first complexing agent capable of forming a binding pair with a second complexing agent. A kit is also provided for detecting a target sequence in a sample of nucleic acids using a bifunctional hybridization probe according to this method.

  8. BreaKmer: detection of structural variation in targeted massively parallel sequencing data using kmers

    PubMed Central

    Abo, Ryan P.; Ducar, Matthew; Garcia, Elizabeth P.; Thorner, Aaron R.; Rojas-Rudilla, Vanesa; Lin, Ling; Sholl, Lynette M.; Hahn, William C.; Meyerson, Matthew; Lindeman, Neal I.; Van Hummelen, Paul; MacConaill, Laura E.

    2015-01-01

    Genomic structural variation (SV), a common hallmark of cancer, has important predictive and therapeutic implications. However, accurately detecting SV using high-throughput sequencing data remains challenging, especially for ‘targeted’ resequencing efforts. This is critically important in the clinical setting where targeted resequencing is frequently being applied to rapidly assess clinically actionable mutations in tumor biopsies in a cost-effective manner. We present BreaKmer, a novel approach that uses a ‘kmer’ strategy to assemble misaligned sequence reads for predicting insertions, deletions, inversions, tandem duplications and translocations at base-pair resolution in targeted resequencing data. Variants are predicted by realigning an assembled consensus sequence created from sequence reads that were abnormally aligned to the reference genome. Using targeted resequencing data from tumor specimens with orthogonally validated SV, non-tumor samples and whole-genome sequencing data, BreaKmer had a 97.4% overall sensitivity for known events and predicted 17 positively validated, novel variants. Relative to four publicly available algorithms, BreaKmer detected SV with increased sensitivity and limited calls in non-tumor samples, key features for variant analysis of tumor specimens in both the clinical and research settings. PMID:25428359
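
    The general idea behind a k-mer strategy can be sketched in a few lines: k-mers that occur in a read but nowhere in the reference are candidates for spanning a breakpoint. This is a toy illustration under stated assumptions, not BreaKmer's implementation; the sequences, the value of k, and the function names are hypothetical.

```python
def kmers(seq, k):
    """Yield every length-k substring of seq, left to right."""
    for i in range(len(seq) - k + 1):
        yield seq[i:i + k]


def novel_kmers(read, reference, k):
    """Return the read's k-mers (in order) that do not occur in the reference."""
    ref_set = set(kmers(reference, k))
    return [km for km in kmers(read, k) if km not in ref_set]


# Toy data (made up): the read joins two reference segments that are not
# adjacent, simulating a deletion of reference[8:18].
reference = "ACGTACGGTCATTGCAAGCTTAGGCTA"
read = reference[:8] + reference[18:26]

print(novel_kmers(read, reference, 6))
```

    Only the k-mers that straddle the simulated junction are reported; a read taken directly from the reference yields an empty list.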

  9. Relation between mRNA expression and sequence information in Desulfovibrio vulgaris: Combinatorial contributions of upstream regulatory motifs and coding sequence features to variations in mRNA abundance

    SciTech Connect

    Wu, Gang; Nie, Lei; Zhang, Weiwen

    2006-05-26

    The context-dependent expression of genes is at the core of biological activities, and significant attention has been given to identifying the various factors contributing to gene expression at the genomic scale. However, so far this type of analysis has focused either on the relation between mRNA expression and non-coding sequence features such as upstream regulatory motifs or on the correlation between mRNA abundance and non-random features in coding sequences (e.g. codon usage and amino acid usage). In this study, multiple regression analyses of the mRNA abundance and all sequence information in Desulfovibrio vulgaris were performed, with the goal of investigating how much coding and non-coding sequence features contribute to the variations in mRNA expression, and in what manner they act together...

  10. Population subdivision and molecular sequence variation: theory and analysis of Drosophila ananassae data.

    PubMed Central

    Vogl, Claus; Das, Aparup; Beaumont, Mark; Mohanty, Sujata; Stephan, Wolfgang

    2003-01-01

    Population subdivision complicates analysis of molecular variation. Even if neutrality is assumed, three evolutionary forces need to be considered: migration, mutation, and drift. Simplification can be achieved by assuming that the process of migration among and drift within subpopulations is occurring fast compared to mutation and drift in the entire population. This allows a two-step approach in the analysis: (i) analysis of population subdivision and (ii) analysis of molecular variation in the migrant pool. We model population subdivision using an infinite island model, where we allow the migration/drift parameter Theta to vary among populations. Thus, central and peripheral populations can be differentiated. For inference of Theta, we use a coalescence approach, implemented via a Markov chain Monte Carlo (MCMC) integration method that allows estimation of allele frequencies in the migrant pool. The second step of this approach (analysis of molecular variation in the migrant pool) uses the estimated allele frequencies in the migrant pool for the study of molecular variation. We apply this method to a Drosophila ananassae sequence data set. We find little indication of isolation by distance, but large differences in the migration parameter among populations. The population as a whole seems to be expanding. A population from Bogor (Java, Indonesia) shows the highest variation and seems closest to the species center. PMID:14668389

  11. Analysis of simian immunodeficiency virus sequence variation in tissues of rhesus macaques with simian AIDS.

    PubMed Central

    Kodama, T; Mori, K; Kawahara, T; Ringler, D J; Desrosiers, R C

    1993-01-01

    One rhesus macaque displayed severe encephalomyelitis and another displayed severe enterocolitis following infection with molecularly cloned simian immunodeficiency virus (SIV) strain SIVmac239. Little or no free anti-SIV antibody developed in these two macaques, and they died relatively quickly (4 to 6 months) after infection. Manifestation of the tissue-specific disease in these macaques was associated with the emergence of variants with high replicative capacity for macrophages and primary infection of tissue macrophages. The nature of sequence variation in the central region (vif, vpr, and vpx), the env gene, and the nef long terminal repeat (LTR) region in brain, colon, and other tissues was examined to see whether specific genetic changes were associated with SIV replication in brain or gut. Sequence analysis revealed strong conservation of the intergenic central region, nef, and the LTR. However, analysis of env sequences in these two macaques and one other revealed significant, interesting patterns of sequence variation. (i) Changes in env that were found previously to contribute to the replicative ability of SIVmac for macrophages in culture were present in the tissues of these animals. (ii) The greatest variability was located in the regions between V1 and V2 and from "V3" through C3 in gp120, which are different in location from the variable regions observed previously in animals with strong antibody responses and long-term persistent infection. (iii) The predominant sequence change of D-->N at position 385 in C3 is most surprising, since this change in both SIV and human immunodeficiency virus type 1 has been associated with dramatically diminished affinity for CD4 and replication in vitro. (iv) The nature of sequence changes at some positions (146, 178, 345, 385, and "V3") suggests that viral replication in brain and gut may be facilitated by specific sequence changes in env in addition to those that impart a general ability to replicate well in

  12. From sequence to function: Insights from natural variation in budding yeasts

    PubMed Central

    Nieduszynski, Conrad A.; Liti, Gianni

    2011-01-01

    Background: Natural variation offers a powerful approach for assigning function to DNA sequence, a pressing challenge in the age of high throughput sequencing technologies. Scope of Review: Here we review comparative genomic approaches that are bridging the sequence–function and genotype–phenotype gaps. Reverse genomic approaches aim to analyse sequence to assign function, whereas forward genomic approaches start from a phenotype and aim to identify the underlying genotype responsible. Major Conclusions: Comparative genomic approaches, pioneered in budding yeasts, have resulted in dramatic improvements in our understanding of the function of both genes and regulatory sequences. Analogous studies in other systems, including humans, demonstrate the ubiquity of comparative genomic approaches. Recently, forward genomic approaches, exploiting natural variation within yeast populations, have started to offer powerful insights into how genotype influences phenotype and even the ability to predict phenotypes. General Significance: Comparative genomic experiments are defining the fundamental rules that govern complex traits in natural populations from yeast to humans. This article is part of a Special Issue entitled Systems Biology of Microorganisms. PMID:21320572

  13. CNV-TV: A robust method to discover copy number variation from short sequencing reads

    PubMed Central

    2013-01-01

    Background: Copy number variation (CNV) is an important structural variation (SV) in the human genome. Various studies have shown that CNVs are associated with complex diseases. Traditional CNV detection methods such as fluorescence in situ hybridization (FISH) and array comparative genomic hybridization (aCGH) suffer from low resolution. The next generation sequencing (NGS) technique promises a higher resolution detection of CNVs and several methods were recently proposed for realizing such a promise. However, the performances of these methods are not robust under some conditions, e.g., some of them may fail to detect CNVs of short sizes. There has been a strong demand for reliable detection of CNVs from high resolution NGS data. Results: A novel and robust method to detect CNV from short sequencing reads is proposed in this study. The detection of CNV is modeled as a change-point detection from the read depth (RD) signal derived from the NGS, which is fitted with a total variation (TV) penalized least squares model. The performance (e.g., sensitivity and specificity) of the proposed approach is evaluated by comparison with several recently published methods on both simulated and real data from the 1000 Genomes Project. Conclusion: The experimental results showed that both the true positive rate and false positive rate of the proposed detection method do not change significantly for CNVs with different copy numbers and lengths, when compared with several existing methods. Therefore, our proposed approach results in a more reliable detection of CNVs than the existing methods. PMID:23634703
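
    A total variation penalized least squares model trades fidelity to the read-depth signal against the number and size of jumps in the fit. The sketch below only evaluates one common form of such an objective, 0.5 * sum (y_i - x_i)^2 + lam * sum |x_{i+1} - x_i|, for a few candidate fits; it does not solve the minimization, and the exact formulation used by CNV-TV is not given in the abstract. The read-depth values, the candidate fits, and lam are hypothetical.

```python
def tv_objective(read_depth, fit, lam):
    """Least-squares misfit plus lam times the total variation of the fit."""
    if len(read_depth) != len(fit):
        raise ValueError("signal and fit must have the same length")
    misfit = 0.5 * sum((y - x) ** 2 for y, x in zip(read_depth, fit))
    total_variation = sum(abs(b - a) for a, b in zip(fit, fit[1:]))
    return misfit + lam * total_variation


# Hypothetical read-depth signal with a duplicated segment in the middle.
y = [10, 11, 9, 10, 20, 21, 19, 20, 10, 9, 11, 10]

raw = list(y)                          # follows every fluctuation
step = [10] * 4 + [20] * 4 + [10] * 4  # piecewise constant, two change points
flat = [sum(y) / len(y)] * len(y)      # no change points at all

lam = 2.0
for name, fit in [("raw", raw), ("step", step), ("flat", flat)]:
    print(name, tv_objective(y, fit, lam))
```

    With this penalty weight the piecewise-constant fit scores lowest of the three candidates, which is the behaviour that makes the change points (candidate CNV boundaries) stand out.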

  14. In vivo activity of epoxide hydrolase according to sequence variation affects the progression of human IgA nephropathy.

    PubMed

    Lee, Jung Pyo; Yang, Seung Hee; Kim, Dong Ki; Lee, Hajeong; Kim, Bora; Cho, Joo-Youn; Yu, Kyung-Sang; Paik, Jin Ho; Kim, Myounghee; Lim, Chun Soo; Kim, Yon Su

    2011-06-01

    Epoxyeicosatrienoic acid (EET) regulates the functional integrity of the endothelium. It is hypothesized that the activity of epoxide hydrolase (EPHX2), which determines EET concentration through hydrolysis, may affect the progression of glomerulonephritis. Here, we evaluated the relationship between genetic variations, the in vivo activity of EPHX2, and progression of IgA nephropathy (IgAN). Three single-nucleotide polymorphisms (SNPs) [rs41507953 (K55R), rs751141 (R287Q), and rs1042032] were traced in 401 IgAN patients and 402 normal healthy controls. The in vivo activity of EPHX2 was assessed by measuring substrates/metabolites of the enzyme. None of the polymorphism frequencies differed significantly between patients and controls. However, patients carrying the variant allele (A) of rs751141 possessed better kidney survival than those with the wild-type allele (G; P < 0.001). This association remained significant after adjustment for several risk factors (hazard ratio 1.83, 95% confidence interval 1.13-2.96, P = 0.014). Vascular damage was more prominent in kidney biopsies from patients carrying the G allele of rs751141. The in vivo activity of EPHX2, assessed by the epoxyoctadecenoic acid/dihydroxyoctadecenoic acid ratio using liquid chromatography/mass spectrometry analysis, was elevated in patients with the G allele. The expression of EPHX2 in the human kidney was independent of the sequence variation of the rs751141 allele. Variant rs41507953 was not present in this cohort, and rs1042032 was not associated with progression. Thus the specific measures which regulate EPHX2 activity should be designed for potential therapeutics. PMID:21429967

  15. Partial amino acid sequence of human factor D: homology with serine proteases.

    PubMed Central

    Volanakis, J E; Bhown, A; Bennett, J C; Mole, J E

    1980-01-01

    Human factor D purified to homogeneity by a modified procedure was subjected to NH2-terminal amino acid sequence analysis by using a modified automated Beckman sequencer. We identified 48 of the first 57 NH2-terminal amino acids in a single sequencer run, using microgram quantities of factor D. The deduced amino acid sequence represents approximately 25% of the primary structure of factor D. This extended NH2-terminal amino acid sequence of factor D was compared to that of other trypsin-related serine proteases. By visual inspection, strong homologies (33-50% identity) were observed with all the serine proteases included in the comparison. Interestingly, factor D showed a higher degree of homology to serine proteases of pancreatic origin than to those of serum origin. PMID:6987665

  16. Amino acid sequence of Japanese quail (Coturnix japonica) and northern bobwhite (Colinus virginianus) myoglobin.

    PubMed

    Goodson, John; Beckstead, Robert B; Payne, Jason; Singh, Rakesh K; Mohan, Anand

    2015-08-15

    Myoglobin has an important physiological role in vertebrates, and as the primary sarcoplasmic pigment in meat, influences quality perception and consumer acceptability. In this study, the amino acid sequences of Japanese quail and northern bobwhite myoglobin were deduced by cDNA cloning of the coding sequence from mRNA. Japanese quail myoglobin was isolated from quail cardiac muscles, purified using ammonium sulphate precipitation and gel-filtration, and subjected to multiple enzymatic digestions. Mass spectrometry corroborated the deduced protein amino acid sequence at the protein level. Sequence analysis revealed both species' myoglobin structures consist of 153 amino acids, differing at only three positions. When compared with chicken myoglobin, Japanese quail showed 98% sequence identity, and northern bobwhite 97% sequence identity. The myoglobin in both quail species contained eight histidine residues instead of the nine present in chicken and turkey. PMID:25794748
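
    Three differences across 153 aligned residues, as reported between the two quail species, correspond to 150/153, or about 98.0% identity. A minimal sketch of that calculation for two pre-aligned, equal-length sequences follows; the placeholder sequences are not real myoglobin sequences and the function name is an assumption.

```python
def percent_identity(seq_a, seq_b):
    """Percentage of positions at which two equal-length aligned sequences match."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for a, b in zip(seq_a, seq_b) if a == b)
    return 100.0 * matches / len(seq_a)


# Placeholder sequences (not real proteins): 153 residues, 3 substitutions.
seq_1 = "A" * 153
seq_2 = "G" + "A" * 75 + "G" + "A" * 75 + "G"

print(round(percent_identity(seq_1, seq_2), 1))
```

    Sequences of different lengths would first need an alignment step, which this sketch deliberately leaves out.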

  17. Exome sequencing-based copy-number variation and loss of heterozygosity detection: ExomeCNV

    PubMed Central

    Sathirapongsasuti, Jarupon Fah; Lee, Hane; Horst, Basil A. J.; Brunner, Georg; Cochran, Alistair J.; Binder, Scott; Quackenbush, John; Nelson, Stanley F.

    2011-01-01

    Motivation: The ability to detect copy-number variation (CNV) and loss of heterozygosity (LOH) from exome sequencing data extends the utility of this powerful approach that has mainly been used for point or small insertion/deletion detection. Results: We present ExomeCNV, a statistical method to detect CNV and LOH using depth-of-coverage and B-allele frequencies, from mapped short sequence reads, and we assess both the method's power and the effects of confounding variables. We apply our method to a cancer exome resequencing dataset. As expected, accuracy and resolution are dependent on depth-of-coverage and capture probe design. Availability: CRAN package ‘ExomeCNV’. Contact: fsathira@fas.harvard.edu; snelson@ucla.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:21828086
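
    The two signals named in this record, depth-of-coverage and B-allele frequency, can be illustrated with a minimal sketch. It is not the ExomeCNV package's method or API: the normalization by total library depth, the function names, and all numbers are assumptions made for illustration only.

```python
import math


def log2_depth_ratio(case_depth, control_depth, case_total, control_total):
    """log2 of case/control depth for one exon, each scaled by its library total."""
    if min(case_depth, control_depth, case_total, control_total) <= 0:
        raise ValueError("depths and totals must be positive")
    return math.log2((case_depth / case_total) / (control_depth / control_total))


def b_allele_frequency(b_reads, total_reads):
    """Fraction of reads at a heterozygous site that carry the B allele."""
    if total_reads <= 0 or not 0 <= b_reads <= total_reads:
        raise ValueError("need 0 <= b_reads <= total_reads and total_reads > 0")
    return b_reads / total_reads


# Hypothetical exon: half the normalized depth in the tumor sample.
ratio = log2_depth_ratio(50, 100, 1000, 1000)
# Hypothetical heterozygous site: 18 of 60 reads carry the B allele.
baf = b_allele_frequency(18, 60)
print(ratio, baf)
```

    A log2 ratio near -1 is what a single-copy loss would look like in a pure sample, and a B-allele frequency far from 0.5 at a germline heterozygous site is the kind of imbalance used to flag LOH.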

  18. Capturing genomic signatures of DNA sequence variation using a standard anonymous microarray platform

    PubMed Central

    Cannon, C. H.; Kua, C. S.; Lobenhofer, E. K.; Hurban, P.

    2006-01-01

    Comparative genomics, using the model organism approach, has provided powerful insights into the structure and evolution of whole genomes. Unfortunately, only a small fraction of Earth's biodiversity will have its genome sequenced in the foreseeable future. Most wild organisms have radically different life histories and evolutionary genomics than current model systems. A novel technique is needed to expand comparative genomics to a wider range of organisms. Here, we describe a novel approach using an anonymous DNA microarray platform that gathers genomic samples of sequence variation from any organism. Oligonucleotide probe sequences placed on a custom 44 K array were 25 bp long and designed using a simple set of criteria to maximize their complexity and dispersion in sequence probability space. Using whole genomic samples from three known genomes (mouse, rat and human) and one unknown (Gonystylus bancanus), we demonstrate and validate its power, reliability, transitivity and sensitivity. Using two separate statistical analyses, a large number of genomic ‘indicator’ probes were discovered. The construction of a genomic signature database based upon this technique would allow virtual comparisons, and simple queries could generate optimal subsets of markers to be used in large-scale assays, using simple downstream techniques. Biologists from a wide range of fields, studying almost any organism, could efficiently perform genomic comparisons, at potentially any phylogenetic level, after performing a small number of standardized DNA microarray hybridizations. Possibilities for refining and expanding the approach are discussed. PMID:17000641

  19. Identification of random nucleic acid sequence aberrations using dual capture probes which hybridize to different chromosome regions

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1998-01-01

    A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration.

  20. Identification of random nucleic acid sequence aberrations using dual capture probes which hybridize to different chromosome regions

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1998-03-24

    A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration. 14 figs.

  1. The amino acid sequence of protein CM-3 from Dendroaspis polylepis polylepis (black mamba) venom.

    PubMed

    Joubert, F J

    1985-01-01

    Protein CM-3 from Dendroaspis polylepis polylepis venom was purified by gel filtration and ion exchange chromatography. It comprises 65 amino acids including eight half-cystines. The complete amino acid sequence of protein CM-3 has been elucidated. The sequence (residues 1-50) resembles that of the N-terminal sequence of the subunits of a synergistic type protein and residues 51-65 that of the C-terminal sequence of an angusticeps type protein. Mixtures of protein CM-3 and angusticeps type proteins showed no apparent synergistic effect, in that their toxicity in combination was no greater than the sum of their individual toxicities. PMID:4029488

  2. BRCA1 and BRCA2 sequence variations detected with next-generation sequencing in patients with premature ovarian insufficiency

    PubMed Central

    Yılmaz, Nafiye Karakaş; Karagin, Peren Hatice; Terzi, Yunus Kasım; Kahyaoğlu, İnci; Yılmaz, Saynur; Erkaya, Salim; Şahin, Feride İffet

    2016-01-01

    Objective: Although the association between BRCA1 and BRCA2 gene mutations and breast and ovarian cancer is known, there are insufficient data about premature ovarian insufficiency (POI). However, several studies have reported that there might be a relationship between POI and BRCA1 and BRCA2 gene mutation. Therefore, in the present study, we aimed to investigate the role of BRCA1 and BRCA2 gene mutations in the etiology of POI in a Turkish population. Material and Methods: The cohort was classified into two groups: a study group, consisting of 56 individuals diagnosed with premature ovarian insufficiency (and who were younger than 40 years of age, had an antral follicle count <3–5, and FSH levels >12 IU/L), and a control group, consisting of 45 fertile individuals. A total of 101 individuals were analyzed by next-generation sequencing to detect BRCA1 and BRCA2 gene mutations. Results: We detected four new variations (p.T1246N and p.R1835Q in BRCA1 and p.I3312V and IVS-7T>A in BRCA2) that had not been reported before. Conclusion: We did not find an association between the BRCA1 and BRCA2 gene mutations and premature ovarian insufficiency. However, larger, functional studies are needed to clarify the association. PMID:27403073

  3. Sequence Variation and Immunologic Cross-Reactivity among Babesia bovis Merozoite Surface Antigen 1 Proteins from Vaccine Strains and Vaccine Breakthrough Isolates

    PubMed Central

    LeRoith, Tanya; Brayton, Kelly A.; Molloy, John B.; Bock, Russell E.; Hines, Stephen A.; Lew, Ala E.; McElwain, Terry F.

    2005-01-01

    The Babesia bovis merozoite surface antigen 1 (MSA-1) is an immunodominant membrane glycoprotein that is the target of invasion-blocking antibodies. While antigenic variation has been demonstrated in MSA-1 among strains from distinct geographical areas, the extent of sequence variation within a region where it is endemic and the effect of variation on immunologic cross-reactivity have not been assessed. In this study, sequencing of MSA-1 from two Australian B. bovis vaccine strains and 14 breakthrough isolates from vaccinated animals demonstrated low sequence identity in the extracellular region of the molecule, ranging from 19.8 to 46.7% between the T vaccine strain and eight T vaccine breakthrough isolates, and from 18.7 to 99% between the K vaccine strain and six K vaccine breakthrough isolates. Although MSA-1 amino acid sequence varied substantially among strains, overall predicted regions of hydrophilicity and hydrophobicity in the extracellular domain were conserved in all strains examined, suggesting a conserved functional role for MSA-1 despite sequence polymorphism. Importantly, the antigenic variation created by sequence differences resulted in a lack of immunologic cross-reactivity among outbreak strains using sera from animals infected with the B. bovis vaccine strains. Additionally, sera from cattle hyperinfected with the Mexico strain of B. bovis and shown to be clinically immune did not cross-react with MSA-1 from any other isolate tested. The results indicate that isolates of B. bovis capable of evading vaccine-induced immunity contain an msa-1 gene that is significantly different from the msa-1 of the vaccine strain, and that the difference can result in a complete lack of cross-reactivity between MSA-1 from vaccine and breakthrough strains in immunized animals. PMID:16113254

  4. Mitochondrial DNA hypervariable region-1 sequence variation and phylogeny of the concolor gibbons, Nomascus.

    PubMed

    Monda, Keri; Simmons, Rachel E; Kressirer, Philipp; Su, Bing; Woodruff, David S

    2007-11-01

    The still little known concolor gibbons are represented by 14 taxa (five species, nine subspecies) distributed parapatrically in China, Myanmar, Vietnam, Laos and Cambodia. To set the stage for a phylogeographic study of the genus we examined DNA sequences from the highly variable mitochondrial hypervariable region-1 (HVR-1 or control region) in 51 animals, mostly of unknown geographic provenance. We developed gibbon-specific primers to amplify mtDNA noninvasively and obtained >477 bp sequences from 38 gibbons in North American and European zoos and >159 bp sequences from ten Chinese museum skins. In hindsight, we believe these animals represent eight of the nine nominal subspecies and four of the five nominal species. Bayesian, maximum likelihood and maximum parsimony haplotype network analyses gave concordant results and show Nomascus to be monophyletic. Significant intraspecific variation within N. leucogenys (17 haplotypes) is comparable with that reported earlier in Hylobates lar and less than half the known interspecific pairwise distances in gibbons. Sequence data support the recognition of five species (concolor, leucogenys, nasutus, gabriellae and probably hainanus) and suggest that nasutus is the oldest and leucogenys, the youngest taxon. In contrast, the subspecies N. c. furvogaster, N. c. jingdongensis, and N. leucogenys siki, are not recognizable at this otherwise informative genetic locus. These results show that HVR-1 sequence is variable enough to define evolutionarily significant units in Nomascus and, if coupled with multilocus microsatellite or SNP genotyping, more than adequate to characterize their phylogeographic history. There is an urgent need to obtain DNA from gibbons of known geographic provenance before they are extirpated to facilitate the conservation genetic management of the surviving animals. PMID:17455231

  5. Sequence variation within the rRNA gene loci of 12 Drosophila species

    PubMed Central

    Stage, Deborah E.; Eickbush, Thomas H.

    2007-01-01

    Concerted evolution maintains at near identity the hundreds of tandemly arrayed ribosomal RNA (rRNA) genes and their spacers present in any eukaryote. Few comprehensive attempts have been made to directly measure the identity between the rDNA units. We used the original sequencing reads (trace archives) available through the whole-genome shotgun sequencing projects of 12 Drosophila species to locate the sequence variants within the 7.8–8.2 kb transcribed portions of the rDNA units. Three to 18 variants were identified in >3% of the total rDNA units from 11 species. Species where the rDNA units are present on multiple chromosomes exhibited only minor increases in sequence variation. Variants were 10–20 times more abundant in the noncoding compared with the coding regions of the rDNA unit. Within the coding regions, variants were three to eight times more abundant in the expansion compared with the conserved core regions. The distribution of variants was largely consistent with models of concerted evolution in which there is uniform recombination across the transcribed portion of the unit with the frequency of standing variants dependent upon the selection pressure to preserve that sequence. However, the 28S gene was found to contain fewer variants than the 18S gene despite evolving 2.5-fold faster. We postulate that the fewer variants in the 28S gene is due to localized gene conversion or DNA repair triggered by the activity of retrotransposable elements that are specialized for insertion into the 28S genes of these species. PMID:17989256

  6. The Chinese hamster Alu-equivalent sequence: a conserved highly repetitious, interspersed deoxyribonucleic acid sequence in mammals has a structure suggestive of a transposable element.

    PubMed Central

    Haynes, S R; Toomey, T P; Leinwand, L; Jelinek, W R

    1981-01-01

    A consensus sequence has been determined for a major interspersed deoxyribonucleic acid repeat in the genome of Chinese hamster ovary cells (CHO cells). This sequence is extensively homologous to (i) the human Alu sequence (P. L. Deininger et al., J. Mol. Biol., in press), (ii) the mouse B1 interspersed repetitious sequence (Krayev et al., Nucleic Acids Res. 8:1201-1215, 1980), (iii) an interspersed repetitious sequence from African green monkey deoxyribonucleic acid (Dhruva et al., Proc. Natl. Acad. Sci. U.S.A. 77:4514-4518, 1980), and (iv) the CHO and mouse 4.5S ribonucleic acid (this report; F. Harada and N. Kato, Nucleic Acids Res. 8:1273-1285, 1980). Because the CHO consensus sequence shows significant homology to the human Alu sequence, it is termed the CHO Alu-equivalent sequence. A conserved structure surrounding CHO Alu-equivalent family members can be recognized. It is similar to that surrounding the human Alu and the mouse B1 sequences, and is represented as follows: direct repeat-CHO-Alu-A-rich sequence-direct repeat. A composite interspersed repetitious sequence has been identified. Its structure is represented as follows: direct repeat-residue 47 to 107 of CHO-Alu-non-Alu repetitious sequence-A-rich sequence-direct repeat. Because the Alu flanking sequences resemble those that flank known transposable elements, we think it likely that the Alu sequence dispersed throughout the mammalian genome by transposition. PMID:9279371

  7. No increase in bleeding identified in type 1 VWD subjects with D1472H sequence variation.

    PubMed

    Flood, Veronica H; Friedman, Kenneth D; Gill, Joan Cox; Haberichter, Sandra L; Christopherson, Pamela A; Branchford, Brian R; Hoffmann, Raymond G; Abshire, Thomas C; Dunn, Amy L; Di Paola, Jorge A; Hoots, W Keith; Brown, Deborah L; Leissinger, Cindy; Lusher, Jeanne M; Ragni, Margaret V; Shapiro, Amy D; Montgomery, Robert R

    2013-05-01

    The diagnosis of von Willebrand disease (VWD) is complicated by issues with current laboratory testing, particularly the ristocetin cofactor activity assay (VWF:RCo). We have recently reported a sequence variation in the von Willebrand factor (VWF) A1 domain, p.D1472H (D1472H), associated with a decrease in the VWF:RCo/VWF antigen (VWF:Ag) ratio but not associated with bleeding in healthy control subjects. This report expands the previous study to include subjects with symptoms leading to the diagnosis of type 1 VWD. Type 1 VWD subjects with D1472H had a significant decrease in the VWF:RCo/VWF:Ag ratio compared with those without D1472H, similar to the findings in the healthy control population. No increase in bleeding score was observed, however, for VWD subjects with D1472H compared with those without D1472H. These results suggest that the presence of the D1472H sequence variation is not associated with a significant increase in bleeding symptoms, even in type 1 VWD subjects. PMID:23520336

  8. Population-genomic variation within RNA viruses of the Western honey bee, Apis mellifera, inferred from deep sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Deep sequencing of viruses isolated from infected hosts is an efficient way to measure population-genetic variation and can reveal patterns of dispersal and natural selection. In this study, we mined existing Illumina sequence reads to investigate single-nucleotide polymorphisms (SNPs) within two RN...

  9. Computer Simulation of the Determination of Amino Acid Sequences in Polypeptides

    ERIC Educational Resources Information Center

    Daubert, Stephen D.; Sontum, Stephen F.

    1977-01-01

    Describes a computer program that generates a random string of amino acids and guides the student in determining the correct sequence of a given protein by using experimental analytic data for that protein. (MLH)

  10. Genome-Wide Characterization of Insertion and Deletion Variation in Chicken Using Next Generation Sequencing

    PubMed Central

    Yan, Yiyuan; Yi, Guoqiang; Sun, Congjiao; Qu, Lujiang; Yang, Ning

    2014-01-01

    Insertions and deletions (INDELs) are among the main events contributing to genetic and phenotypic diversity, yet they receive less attention than SNPs and large structural variation. To gain better knowledge of INDEL variation in the chicken genome, we applied next-generation sequencing to 12 diverse chicken breeds at an average effective depth of 8.6. Over 1.3 million non-redundant short INDELs (1–49 bp) were obtained, the vast majority (92.48%) of which were novel. Follow-up validation assays confirmed that most (88.00%) of the randomly selected INDELs represent true variations. The majority (95.76%) of INDELs were less than 10 bp. Both the detected number and affected bases were larger for deletions than for insertions. In total, INDELs covered 3.8 Mbp, corresponding to 0.36% of the chicken genome. The average genomic INDEL density was estimated as 0.49 per kb. INDELs were ubiquitous and distributed in a non-uniform fashion across chromosomes, with lower INDEL density in micro-chromosomes than in others, and some functional regions such as exons and UTRs carried fewer INDELs than introns and intergenic regions. In all, 620,253 INDELs fell in genic regions, 1,765 (0.28%) of which were located in exons, spanning 1,358 (7.56%) unique Ensembl genes. Many of them are associated with economically important traits and some are the homologues of human disease-related genes. We demonstrate that sequencing multiple individuals at a medium depth offers a promising way to reliably identify INDELs. The coding INDELs are valuable candidates for further elucidation of the association between genotypes and phenotypes. The chicken INDELs revealed by our study can be useful for future studies, including development of INDEL markers, construction of high-density linkage maps, INDEL array design, and, hopefully, molecular breeding programs in chicken. PMID:25133774
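
    The proportions quoted in this abstract can be checked with a few lines of arithmetic. The sketch below is illustrative only and uses just the figures stated above; the variable names and the helper `percent` are assumptions made for this example, not anything from the study.

```python
# Figures quoted in the abstract above.
genic_indels = 620_253      # INDELs falling in genic regions
exonic_indels = 1_765       # of which located in exons
indel_span_bp = 3.8e6       # 3.8 Mbp covered by INDELs
genome_fraction = 0.0036    # stated as 0.36% of the chicken genome


def percent(part, whole):
    """Return part as a percentage of whole."""
    return 100.0 * part / whole


# Share of genic INDELs that fall in exons (the abstract reports 0.28%).
exonic_pct = percent(exonic_indels, genic_indels)

# Genome size implied by "3.8 Mbp corresponds to 0.36% of the genome".
implied_genome_bp = indel_span_bp / genome_fraction

print(f"exonic share of genic INDELs: {exonic_pct:.2f}%")
print(f"implied genome size: {implied_genome_bp / 1e9:.2f} Gbp")
```

    The implied genome size is only a back-calculation from two rounded figures, so it should be read as approximate.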

  11. Effects of sequence variation on differential allelic transcription factor occupancy and gene expression.

    PubMed

    Reddy, Timothy E; Gertz, Jason; Pauli, Florencia; Kucera, Katerina S; Varley, Katherine E; Newberry, Kimberly M; Marinov, Georgi K; Mortazavi, Ali; Williams, Brian A; Song, Lingyun; Crawford, Gregory E; Wold, Barbara; Willard, Huntington F; Myers, Richard M

    2012-05-01

    A complex interplay between transcription factors (TFs) and the genome regulates transcription. However, connecting variation in genome sequence with variation in TF binding and gene expression is challenging due to environmental differences between individuals and cell types. To address this problem, we measured genome-wide differential allelic occupancy of 24 TFs and EP300 in a human lymphoblastoid cell line GM12878. Overall, 5% of human TF binding sites have an allelic imbalance in occupancy. At many sites, TFs clustered in TF-binding hubs on the same homolog in especially open chromatin. While genetic variation in core TF binding motifs generally resulted in large allelic differences in TF occupancy, most allelic differences in occupancy were subtle and associated with disruption of weak or noncanonical motifs. We also measured genome-wide differential allelic expression of genes with and without heterozygous exonic variants in the same cells. We found that genes with differential allelic expression were overall less expressed both in GM12878 cells and in unrelated human cell lines. Comparing TF occupancy with expression, we found strong association between allelic occupancy and expression within 100 bp of transcription start sites (TSSs), and weak association up to 100 kb from TSSs. Sites of differential allelic occupancy were significantly enriched for variants associated with disease, particularly autoimmune disease, suggesting that allelic differences in TF occupancy give functional insights into intergenic variants associated with disease. Our results have the potential to increase the power and interpretability of association studies by targeting functional intergenic variants in addition to protein coding sequences. PMID:22300769

  12. Whole-Genome Sequencing Reveals Genetic Variation in the Asian House Rat.

    PubMed

    Teng, Huajing; Zhang, Yaohua; Shi, Chengmin; Mao, Fengbiao; Hou, Lingling; Guo, Hongling; Sun, Zhongsheng; Zhang, Jianxu

    2016-01-01

    Whole-genome sequencing of wild-derived rat species can provide novel genomic resources, which may help decipher the genetics underlying complex phenotypes. As a notorious pest, reservoir of human pathogens, and colonizer, the Asian house rat, Rattus tanezumi, is successfully adapted to its habitat. However, little is known regarding genetic variation in this species. In this study, we identified over 41,000,000 single-nucleotide polymorphisms, plus insertions and deletions, through whole-genome sequencing and bioinformatics analyses. Moreover, we identified over 12,000 structural variants, including 143 chromosomal inversions. Further functional analyses revealed several fixed nonsense mutations associated with infection and immunity-related adaptations, and a number of fixed missense mutations that may be related to anticoagulant resistance. A genome-wide scan for loci under selection identified various genes related to neural activity. Our whole-genome sequencing data provide a genomic resource for future genetic studies of the Asian house rat species and have the potential to facilitate understanding of the molecular adaptations of rats to their ecological niches. PMID:27172215

  13. [Genuineness of Morinda officinalis How germplasm inferred from ITS sequences variation of nuclear ribosomal DNA].

    PubMed

    Ding, Ping; Liu, Jin; Qiu, Jin-Ying; Lai, Xiao-Ping

    2012-04-01

    PCR-based ITS sequencing was used to assess genetic diversity among different populations of Morinda officinalis How. The ITS sequence of Morinda officinalis was 567 bp in length, with a G/C content of 64.5%. In this study, 17 haplotypes were obtained, which showed a high level of branching, and the haplotypes of the Guangdong population appeared to be the origin of expansion. Analysis of molecular variance (AMOVA) showed that the percentage of variation among populations (56.65%) was greater than that within populations (43.35%). The F(ST) value was 0.5665, and the genetic divergence among populations was significant. Mantel test results also indicated that the level of gene flow was positively correlated with geographic distance (R2 = 0.7211). The results showed a good correlation between genotype and geographic distribution of Morinda officinalis, and ITS sequencing could be a useful molecular method for assessing the genuineness and phylogeography of Morinda officinalis. PMID:22799040
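
    The reported fixation index follows directly from the AMOVA percentages. The sketch below assumes a two-level AMOVA (among populations versus within populations), in which the index is the among-population share of the total variance; the function name `fst_from_amova` is made up for this example.

```python
def fst_from_amova(among_pct, within_pct):
    """Fixation index for a two-level AMOVA.

    Assumes variance is partitioned only into an among-population and a
    within-population component, so the index is the among-population
    share of the total.
    """
    total = among_pct + within_pct
    if total <= 0:
        raise ValueError("variance components must sum to a positive value")
    return among_pct / total


# Percentages quoted in the abstract above.
fst = fst_from_amova(56.65, 43.35)
print(round(fst, 4))
```

    With the abstract's values this reproduces the stated index, which is a useful consistency check when transcribing AMOVA tables.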

  14. Deleted copy number variation of Hanwoo and Holstein using next generation sequencing at the population level

    PubMed Central

    2014-01-01

    Background: Copy number variation (CNV), a source of genetic diversity in mammals, has been shown to underlie biological functions related to production traits. Notwithstanding, few studies have examined CNVs using next generation sequencing at the population level. Results: Illumina NGS data were obtained for ten Holsteins, a dairy breed, and 22 Hanwoo, a beef breed. The sequence data for each of the 32 animals varied from 13.58-fold to almost 20-fold coverage. We detected a total of 6,811 deleted CNVs across the analyzed individuals (average length = 2732.2 bp) corresponding to 0.74% of the cattle genome (18.6 Mbp of variable sequence). By examining the overlap between CNV deletion regions and genes, we selected 30 genes with the highest deletion scores. These genes were found to be related to the nervous system, more specifically to nervous transmission, neuron motion, and neurogenesis. We regarded these genes as having been affected by the domestication process. Further analysis of the CNV genotyping information revealed 94 putative selected CNVs and 954 breed-specific CNVs. Conclusions: This study provides useful information for assessing the impact of CNVs on cattle traits using NGS at the population level. PMID:24673797
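
    The total length of variable sequence quoted here is the number of deletions multiplied by their average length. The short sketch below reproduces that figure from the abstract's numbers; the variable names are illustrative, and the implied genome size is a rough back-calculation rather than a value reported by the authors.

```python
# Figures quoted in the abstract above.
n_deletions = 6_811          # deleted CNVs detected
mean_length_bp = 2732.2      # average CNV length in bp
genome_fraction = 0.0074     # stated as 0.74% of the cattle genome

# Total variable sequence = count x mean length.
total_bp = n_deletions * mean_length_bp

# Genome size implied by the stated fraction (approximate).
implied_genome_bp = total_bp / genome_fraction

print(f"total deleted sequence: {total_bp / 1e6:.1f} Mbp")
print(f"implied genome size: {implied_genome_bp / 1e9:.2f} Gbp")
```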

  15. Whole-Genome Sequencing Reveals Genetic Variation in the Asian House Rat

    PubMed Central

    Teng, Huajing; Zhang, Yaohua; Shi, Chengmin; Mao, Fengbiao; Hou, Lingling; Guo, Hongling; Sun, Zhongsheng; Zhang, Jianxu

    2016-01-01

    Whole-genome sequencing of wild-derived rat species can provide novel genomic resources, which may help decipher the genetics underlying complex phenotypes. As a notorious pest, reservoir of human pathogens, and colonizer, the Asian house rat, Rattus tanezumi, is successfully adapted to its habitat. However, little is known regarding genetic variation in this species. In this study, we identified over 41,000,000 single-nucleotide polymorphisms, plus insertions and deletions, through whole-genome sequencing and bioinformatics analyses. Moreover, we identified over 12,000 structural variants, including 143 chromosomal inversions. Further functional analyses revealed several fixed nonsense mutations associated with infection and immunity-related adaptations, and a number of fixed missense mutations that may be related to anticoagulant resistance. A genome-wide scan for loci under selection identified various genes related to neural activity. Our whole-genome sequencing data provide a genomic resource for future genetic studies of the Asian house rat species and have the potential to facilitate understanding of the molecular adaptations of rats to their ecological niches. PMID:27172215

  16. Genome organization and variation in the 3'-partial sequence of garlic latent virus in China.

    PubMed

    Chen, Jiong; Zheng, Hongying; Chen, Jianping; Yang, Chongliang

    2002-08-01

    Ten different isolates of a carlavirus were detected by degenerate PCR from 12 garlic samples collected from 6 provinces in China, and the complete genome sequence of the Zhejiang isolate ZJ1 and the 3'-terminal sequences of 9 other isolates were determined. The RNA genome of isolate ZJ1 consisted of 8,363 nt excluding the 3'-poly(A) tail, and the genome organization was similar to that of other carlaviruses, with 6 open reading frames encoding a replicase, TGB1, TGB2, TGB3, CP and NABP, respectively. Sequence comparisons showed that all 10 isolates were Garlic latent virus (GarLV). The variations in TGB2, TGB3 and NABP were more significant than those in the CP. High homology was also detected between those isolates and Shallot latent virus (ShLV). Phylogenetic analysis suggested that GarLV isolates from garlic can be divided into 4 main groups and that Chinese isolates belonged to each group. This is the first reported molecular analysis of members of the genus Carlavirus in China. PMID:18759032

  17. Cytochrome Oxidase I (COI) sequence conservation and variation patterns in the yellowfin and longtail tunas.

    PubMed

    Kunal, Swaraj Priyaranjan; Kumar, Girish

    2013-01-01

    Tunas support commercially important fisheries worldwide. There are at least 13 species of tuna belonging to three genera, of which the genus Thunnus has the most, with eight species. On the basis of where they occur, they can be characterised as oceanic, such as Thunnus albacares (yellowfin tuna), or coastal, such as Thunnus tonggol (longtail tuna). Although these two are different species, morphological differentiation can only be seen in mature individuals; hence misidentification may result in erroneous data sets, which ultimately affect conservation strategies. The mitochondrial DNA cytochrome c oxidase subunit I (COI) gene is one of the most popular markers for population genetic and phylogeographic studies across the animal kingdom. The present study examines sequence conservation and variation in mitochondrial COI between these two species of tuna. COI sequence analysis of yellowfin and longtail tunas revealed a close relationship between them within the genus Thunnus. The present study is the first direct comparison of mitochondrial COI sequences of these two tuna species. PMID:23649742

  18. Extra-binomial variation approach for analysis of pooled DNA sequencing data

    PubMed Central

    Wallace, Chris

    2012-01-01

    Motivation: The invention of next-generation sequencing technology has made it possible to study the rare variants that are more likely to pinpoint causal disease genes. To make such experiments financially viable, DNA samples from several subjects are often pooled before sequencing. This induces large between-pool variation which, together with other sources of experimental error, creates over-dispersed data. Statistical analysis of pooled sequencing data needs to appropriately model this additional variance to avoid inflating the false-positive rate. Results: We propose a new statistical method based on an extra-binomial model to address the over-dispersion and apply it to pooled case-control data. We demonstrate that our model provides a better fit to the data than either a standard binomial model or a traditional extra-binomial model proposed by Williams and can analyse both rare and common variants with lower or more variable pool depths compared to the other methods. Availability: Package ‘extraBinomial’ is on http://cran.r-project.org/ Contact: chris.wallace@cimr.cam.ac.uk Supplementary information: Supplementary data are available at Bioinformatics Online. PMID:22976083
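
    To illustrate why over-dispersion matters, the sketch below compares the variance of a plain binomial count with that of a beta-binomial count, one common extra-binomial model. It is an illustration only, not the model proposed in this paper and not the code of the extraBinomial package; the pool depth, allele frequency, and correlation parameter `rho` are made-up values.

```python
def binomial_variance(n, p):
    """Variance of a Binomial(n, p) count."""
    return n * p * (1.0 - p)


def beta_binomial_variance(n, p, rho):
    """Variance of a beta-binomial count with mean n*p.

    rho is the intra-pool correlation (0 <= rho <= 1); rho = 0 recovers
    the plain binomial variance.
    """
    return binomial_variance(n, p) * (1.0 + (n - 1) * rho)


# Hypothetical pool: 100 reads at a site, true allele frequency 1%.
depth = 100
freq = 0.01
rho = 0.05  # assumed over-dispersion, for illustration only

plain = binomial_variance(depth, freq)
inflated = beta_binomial_variance(depth, freq, rho)
inflation = inflated / plain

print(f"binomial variance:      {plain:.4f}")
print(f"beta-binomial variance: {inflated:.4f}")
print(f"variance inflation:     {inflation:.2f}x")
```

    Under these assumed values, a test that treats the counts as binomial would understate the variance several-fold, which is the mechanism behind the inflated false-positive rate the abstract warns about.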

  19. A Genome-Wide Survey of Genetic Variation in Gorillas Using Reduced Representation Sequencing

    PubMed Central

    Xue, Yali; Ayub, Qasim; Durbin, Richard; Tyler-Smith, Chris

    2013-01-01

    All non-human great apes are endangered in the wild, and it is therefore important to gain an understanding of their demography and genetic diversity. Whole genome assembly projects have provided an invaluable foundation for understanding genetics in all four genera, but to date genetic studies of multiple individuals within great ape species have largely been confined to mitochondrial DNA and a small number of other loci. Here, we present a genome-wide survey of genetic variation in gorillas using a reduced representation sequencing approach, focusing on the two lowland subspecies. We identify 3,006,670 polymorphic sites in 14 individuals: 12 western lowland gorillas (Gorilla gorilla gorilla) and 2 eastern lowland gorillas (Gorilla beringei graueri). We find that the two species are genetically distinct, based on levels of heterozygosity and patterns of allele sharing. Focusing on the western lowland population, we observe evidence for population substructure, and a deficit of rare genetic variants suggesting a recent episode of population contraction. In western lowland gorillas, there is an elevation of variation towards telomeres and centromeres on the chromosomal scale. On a finer scale, we find substantial variation in genetic diversity, including a marked reduction close to the major histocompatibility locus, perhaps indicative of recent strong selection there. These findings suggest that despite their maintaining an overall level of genetic diversity equal to or greater than that of humans, population decline, perhaps associated with disease, has been a significant factor in recent and long-term pressures on wild gorilla populations. PMID:23750230

  20. A genome-wide survey of genetic variation in gorillas using reduced representation sequencing.

    PubMed

    Scally, Aylwyn; Yngvadottir, Bryndis; Xue, Yali; Ayub, Qasim; Durbin, Richard; Tyler-Smith, Chris

    2013-01-01

    All non-human great apes are endangered in the wild, and it is therefore important to gain an understanding of their demography and genetic diversity. Whole genome assembly projects have provided an invaluable foundation for understanding genetics in all four genera, but to date genetic studies of multiple individuals within great ape species have largely been confined to mitochondrial DNA and a small number of other loci. Here, we present a genome-wide survey of genetic variation in gorillas using a reduced representation sequencing approach, focusing on the two lowland subspecies. We identify 3,006,670 polymorphic sites in 14 individuals: 12 western lowland gorillas (Gorilla gorilla gorilla) and 2 eastern lowland gorillas (Gorilla beringei graueri). We find that the two species are genetically distinct, based on levels of heterozygosity and patterns of allele sharing. Focusing on the western lowland population, we observe evidence for population substructure, and a deficit of rare genetic variants suggesting a recent episode of population contraction. In western lowland gorillas, there is an elevation of variation towards telomeres and centromeres on the chromosomal scale. On a finer scale, we find substantial variation in genetic diversity, including a marked reduction close to the major histocompatibility locus, perhaps indicative of recent strong selection there. These findings suggest that despite their maintaining an overall level of genetic diversity equal to or greater than that of humans, population decline, perhaps associated with disease, has been a significant factor in recent and long-term pressures on wild gorilla populations. PMID:23750230

  1. Whole Genome Sequencing demonstrates that Geographic Variation of Escherichia coli O157 Genotypes Dominates Host Association

    PubMed Central

    Strachan, Norval J. C.; Rotariu, Ovidiu; Lopes, Bruno; MacRae, Marion; Fairley, Susan; Laing, Chad; Gannon, Victor; Allison, Lesley J.; Hanson, Mary F.; Dallman, Tim; Ashton, Philip; Franz, Eelco; van Hoek, Angela H. A. M.; French, Nigel P.; George, Tessy; Biggs, Patrick J.; Forbes, Ken J.

    2015-01-01

    Genetic variation in an infectious disease pathogen can be driven by ecological niche dissimilarities arising from different host species and different geographical locations. Whole genome sequencing was used to compare E. coli O157 isolates from host reservoirs (cattle and sheep) from Scotland and to compare genetic variation of isolates (human, animal, environmental/food) obtained from Scotland, New Zealand, Netherlands, Canada and the USA. Nei’s genetic distance calculated from core genome single nucleotide polymorphisms (SNPs) demonstrated that the animal isolates were from the same population. Investigation of the Shiga toxin bacteriophage and their insertion sites (SBI typing) revealed that cattle and sheep isolates had statistically indistinguishable rarefaction profiles, diversity and genotypes. In contrast, isolates from different countries exhibited significant differences in Nei’s genetic distance and SBI typing. Hence, after successful international transmission, which has occurred on multiple occasions, local genetic variation occurs, resulting in a global patchwork of continental and trans-continental phylogeographic clades. These findings are important for three reasons: first, understanding transmission and evolution of infectious diseases associated with multiple host reservoirs and multi-geographic locations; second, highlighting the relevance of the sheep reservoir when considering farm based interventions; and third, improving our understanding of why human disease incidence varies across the world. PMID:26442781

  2. Accuracy of sequence alignment and fold assessment using reduced amino acid alphabets.

    PubMed

    Melo, Francisco; Marti-Renom, Marc A

    2006-06-01

    Reduced or simplified amino acid alphabets group the 20 naturally occurring amino acids into a smaller number of representative protein residues. To date, several reduced amino acid alphabets have been proposed, which have been derived and optimized by a variety of methods. The resulting reduced amino acid alphabets have been applied to pattern recognition, generation of consensus sequences from multiple alignments, protein folding, and protein structure prediction. In this work, amino acid substitution matrices and statistical potentials were derived based on several reduced amino acid alphabets and their performance assessed in a large benchmark for the tasks of sequence alignment and fold assessment of protein structure models, using as a reference frame the standard alphabet of 20 amino acids. The results showed that a large reduction in the total number of residue types does not necessarily translate into a significant loss of discriminative power for sequence alignment and fold assessment. Therefore, some definitions of a few residue types are able to encode most of the relevant sequence/structure information that is present in the 20 standard amino acids. Based on these results, we suggest that the use of reduced amino acid alphabets may allow increasing the accuracy of current substitution matrices and statistical potentials for the prediction of protein structure of remote homologs. PMID:16506243
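
    As a concrete picture of what a reduced alphabet does, the sketch below maps each of the 20 standard amino acids onto one representative letter per group. The six-group scheme is a hypothetical grouping by broad physicochemical class chosen for this example; it is not one of the alphabets evaluated in the paper.

```python
# Hypothetical grouping by broad physicochemical class (an assumption for
# illustration). The first letter of each group is its representative.
GROUPS = [
    "AVLIMC",  # aliphatic / hydrophobic
    "FWY",     # aromatic
    "STNQ",    # polar, uncharged
    "KRH",     # positively charged
    "DE",      # negatively charged
    "GP",      # glycine and proline
]

# Lookup table: amino acid -> representative letter of its group.
REDUCE = {aa: group[0] for group in GROUPS for aa in group}


def reduce_sequence(seq):
    """Rewrite a protein sequence in the reduced alphabet.

    Characters outside the 20 standard amino acids become 'X'.
    """
    return "".join(REDUCE.get(aa, "X") for aa in seq.upper())


print(reduce_sequence("MKTAYIAKQR"))
```

    Substitution matrices or statistical potentials would then be estimated over the six representative letters instead of the full 20, which is the kind of reduction the abstract benchmarks.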

  3. Characterization of mouse cellular deoxyribonucleic acid homologous to Abelson murine leukemia virus-specific sequences.

    PubMed Central

    Dale, B; Ozanne, B

    1981-01-01

    The genome of Abelson murine leukemia virus (A-MuLV) consists of sequences derived from both BALB/c mouse deoxyribonucleic acid and the genome of Moloney murine leukemia virus. Using deoxyribonucleic acid linear intermediates as a source of retroviral deoxyribonucleic acid, we isolated a recombinant plasmid which contained 1.9 kilobases of the 3.5-kilobase mouse-derived sequences found in A-MuLV (A-MuLV-specific sequences). We used this clone, designated pSA-17, as a probe in restriction enzyme and Southern blot analyses to examine the arrangement of homologous sequences in BALB/c deoxyribonucleic acid (endogenous Abelson sequences). The endogenous Abelson sequences within the mouse genome were interrupted by noncoding regions, suggesting that a rearrangement of the cell sequences was required to produce the sequence found in the virus. Endogenous Abelson sequences were arranged similarly in mice that were susceptible to A-MuLV tumors and in mice that were resistant to A-MuLV tumors. An examination of three BALB/c plasmacytomas and a BALB/c early B-cell tumor likewise revealed no alteration in the arrangement of the endogenous Abelson sequences. Homology to pSA-17 was also observed in deoxyribonucleic acids prepared from rat, hamster, chicken, and human cells. An isolate of A-MuLV which encoded a 160,000-dalton transforming protein (P160) contained 700 more base pairs of mouse sequences than the standard A-MuLV isolate, which encoded a 120,000-dalton transforming protein (P120). PMID:9279386

  4. The amino acid sequence of monal pheasant lysozyme and its activity.

    PubMed

    Araki, T; Matsumoto, T; Torikata, T

    1998-10-01

    The amino acid sequence of monal pheasant lysozyme and its activity were analyzed. Carboxymethylated lysozyme was digested with trypsin and the resulting peptides were sequenced. The established amino acid sequence had one amino acid substitution at position 102 (Arg to Gly) compared with Indian peafowl lysozyme and four amino acid substitutions at positions 3 (Phe to Tyr), 15 (His to Leu), 41 (Gln to His), and 121 (Gln to His) compared with chicken lysozyme. Analysis of the time-courses of reaction using N-acetylglucosamine pentamer as a substrate showed a difference in binding free energy change (-0.4 kcal/mol) at subsite A between monal pheasant and Indian peafowl lysozyme. This was assumed to be caused by the amino acid substitution at subsite A with loss of a positive charge at position 102 (Arg102 to Gly). PMID:9836434

  5. cDNA-derived amino acid sequences of myoglobins from nine species of whales and dolphins.

    PubMed

    Iwanami, Kentaro; Mita, Hajime; Yamamoto, Yasuhiko; Fujise, Yoshihiro; Yamada, Tadasu; Suzuki, Tomohiko

    2006-10-01

    We determined the myoglobin (Mb) cDNA sequences of nine cetaceans, of which six are the first reports of Mb sequences: sei whale (Balaenoptera borealis), Bryde's whale (Balaenoptera edeni), pygmy sperm whale (Kogia breviceps), Stejneger's beaked whale (Mesoplodon stejnegeri), Longman's beaked whale (Indopacetus pacificus), and melon-headed whale (Peponocephala electra), and three confirm the previously determined chemical amino acid sequences: sperm whale (Physeter macrocephalus), common minke whale (Balaenoptera acutorostrata) and pantropical spotted dolphin (Stenella attenuata). We found two types of Mb in the skeletal muscle of pantropical spotted dolphin: Mb I with the same amino acid sequence as that deposited in the protein database, and Mb II, which differs at two amino acid residues compared with Mb I. Using an alignment of the amino acid or cDNA sequences of cetacean Mb, we constructed a phylogenetic tree by the NJ method. Clustering of cetacean Mb amino acid and cDNA sequences essentially follows the classical taxonomy of cetaceans, suggesting that Mb sequence data is valid for classification of cetaceans at least to the family level. PMID:16962803

  6. Organismal, genetic, and transcriptional variation in the deeply sequenced gut microbiomes of identical twins

    PubMed Central

    Turnbaugh, Peter J.; Quince, Christopher; Faith, Jeremiah J.; McHardy, Alice C.; Yatsunenko, Tanya; Niazi, Faheem; Affourtit, Jason; Egholm, Michael; Henrissat, Bernard; Knight, Rob; Gordon, Jeffrey I.

    2010-01-01

    We deeply sampled the organismal, genetic, and transcriptional diversity in fecal samples collected from a monozygotic (MZ) twin pair and compared the results to 1,095 communities from the gut and other body habitats of related and unrelated individuals. Using a new scheme for noise reduction in pyrosequencing data, we estimated the total diversity of species-level bacterial phylotypes in the 1.2-1.5 million bacterial 16S rRNA reads obtained from each deeply sampled cotwin to be ~800 (35.9%, 49.1% detected in both). A combined 1.1 million read 16S rRNA dataset representing 281 shallowly sequenced fecal samples from 54 twin pairs and their mothers contained an estimated 4,018 species-level phylotypes, with each sample having a unique species assemblage (53.4 ± 0.6% and 50.3 ± 0.5% overlap with the deeply sampled cotwins). Of the 134 phylotypes with a relative abundance of >0.1% in the combined dataset, only 37 appeared in >50% of the samples, with one phylotype in the Lachnospiraceae family present in 99%. Nongut communities had significantly reduced overlap with the deeply sequenced twins’ fecal microbiota (18.3 ± 0.3%, 15.3 ± 0.3%). The MZ cotwins’ fecal DNA was deeply sequenced (3.8-6.3 Gbp/sample) and assembled reads were assigned to 25 genus-level phylogenetic bins. Only 17% of the genes in these bins were shared between the cotwins. Bins exhibited differences in their degree of sequence variation, gene content including the repertoire of carbohydrate active enzymes present within and between twins (e.g., predicted cellulases, dockerins), and transcriptional activities. These results provide an expanded perspective about features that make each of us unique life forms and directions for future characterization of our gut ecosystems. PMID:20363958

  7. Mitochondrial DNA control region sequence variation in migraine headache and cyclic vomiting syndrome.

    PubMed

    Wang, Qingxue; Ito, Masamichi; Adams, Kathleen; Li, B U K; Klopstock, Thomas; Maslim, Audrey; Higashimoto, Tomoyasu; Herzog, Juergen; Boles, Richard G

    2004-11-15

    Migraine headache is a very common condition affecting about 10% of the population that results in substantial morbidity and economic loss. The two most common variants are migraine with (MA) and without (MO) aura. Often considered to be a migraine-like variant, cyclic vomiting syndrome (CVS) is a predominately childhood condition characterized by severe, discrete episodes of nausea, vomiting, and lethargy. Disease-associated mitochondrial DNA (mtDNA) sequence variants are suggested in common migraine and CVS based upon a strong bias towards the maternal inheritance of disease, and several other factors. Temporal temperature gradient gel electrophoresis (TTGE) followed by cyclosequencing and RFLP was used to screen almost 90% of the mtDNA, including the control region (CR), for heteroplasmy in 62 children with CVS and neuromuscular disease (CVS+) and in 95 control subjects. One or two rare mtDNA-CR heteroplasmic sequence variants were found in six CVS+ and in zero control subjects (P = 0.003). These variants comprised 6 point and 2 length variants in hypervariable regions 1 and 2 (HV1 and HV2, both part of the mtDNA-CR), one half of which were clustered in the nt 16040-16188 segment of HV1 that includes the termination associated sequence (TAS), a functional location important in the regulation of mtDNA replication. Based upon our findings, sequencing and statistical analysis looking for homoplasmic nucleotide changes was performed in HV1 among 30 CVS+, 30 randomly-ascertained CVS (rCVS), 18 MA, 32 MO, and 35 control haplogroup H cases. Within the nt 16040-16188 segment, homoplasmic sequence variants were three-fold more common relative to control subjects in both CVS groups (P = 0.01 combined data) and in MO (P = 0.02), but not in MA (P = 0.5 vs. control subjects and 0.02 vs. MO). No group differences were noted in the remainder of HV1. We conclude that sequence variation in this small "peri-TAS" segment is associated with CVS and MO, but not MA. These variants

  8. Draft Genome Sequences of Two Novel Acidimicrobiaceae Members from an Acid Mine Drainage Biofilm Metagenome.

    PubMed

    Pinto, Ameet J; Sharp, Jonathan O; Yoder, Michael J; Almstrand, Robert

    2016-01-01

    Bacteria belonging to the family Acidimicrobiaceae are frequently encountered in heavy metal-contaminated acidic environments. However, their phylogenetic and metabolic diversity is poorly resolved. We present draft genome sequences of two novel and phylogenetically distinct Acidimicrobiaceae members assembled from an acid mine drainage biofilm metagenome. PMID:26769942

  9. Draft Genome Sequences of Two Novel Acidimicrobiaceae Members from an Acid Mine Drainage Biofilm Metagenome

    PubMed Central

    Pinto, Ameet J.; Sharp, Jonathan O.; Yoder, Michael J.

    2016-01-01

    Bacteria belonging to the family Acidimicrobiaceae are frequently encountered in heavy metal-contaminated acidic environments. However, their phylogenetic and metabolic diversity is poorly resolved. We present draft genome sequences of two novel and phylogenetically distinct Acidimicrobiaceae members assembled from an acid mine drainage biofilm metagenome. PMID:26769942

  10. Recurrent Coding Sequence Variation Explains Only A Small Fraction of the Genetic Architecture of Colorectal Cancer

    PubMed Central

    Timofeeva, Maria N.; Kinnersley, Ben; Farrington, Susan M.; Whiffin, Nicola; Palles, Claire; Svinti, Victoria; Lloyd, Amy; Gorman, Maggie; Ooi, Li-Yin; Hosking, Fay; Barclay, Ella; Zgaga, Lina; Dobbins, Sara; Martin, Lynn; Theodoratou, Evropi; Broderick, Peter; Tenesa, Albert; Smillie, Claire; Grimes, Graeme; Hayward, Caroline; Campbell, Archie; Porteous, David; Deary, Ian J.; Harris, Sarah E.; Northwood, Emma L.; Barrett, Jennifer H.; Smith, Gillian; Wolf, Roland; Forman, David; Morreau, Hans; Ruano, Dina; Tops, Carli; Wijnen, Juul; Schrumpf, Melanie; Boot, Arnoud; Vasen, Hans F A; Hes, Frederik J.; van Wezel, Tom; Franke, Andre; Lieb, Wolgang; Schafmayer, Clemens; Hampe, Jochen; Buch, Stephan; Propping, Peter; Hemminki, Kari; Försti, Asta; Westers, Helga; Hofstra, Robert; Pinheiro, Manuela; Pinto, Carla; Teixeira, Manuel; Ruiz-Ponte, Clara; Fernández-Rozadilla, Ceres; Carracedo, Angel; Castells, Antoni; Castellví-Bel, Sergi; Campbell, Harry; Bishop, D. Timothy; Tomlinson, Ian P M; Dunlop, Malcolm G.; Houlston, Richard S.

    2015-01-01

    Whilst common genetic variation in many non-coding genomic regulatory regions is known to impart risk of colorectal cancer (CRC), much of the heritability of CRC remains unexplained. To examine the role of recurrent coding sequence variation in CRC aetiology, we genotyped 12,638 CRC cases and 29,045 controls from six European populations. Single-variant analysis identified a coding variant (rs3184504) in SH2B3 (12q24) associated with CRC risk (OR = 1.08, P = 3.9 × 10(-7)), and novel damaging coding variants in 3 genes previously tagged by GWAS efforts: rs16888728 (8q24) in UTP23 (OR = 1.15, P = 1.4 × 10(-7)); rs6580742 and rs12303082 (12q13) in FAM186A (OR = 1.11, P = 1.2 × 10(-7) and OR = 1.09, P = 7.4 × 10(-8)); rs1129406 (12q13) in ATF1 (OR = 1.11, P = 8.3 × 10(-9)), all reaching exome-wide significance levels. Gene-based tests identified associations between CRC and PCDHGA genes (P < 2.90 × 10(-6)). We found an excess of rare, damaging variants in base-excision (P = 2.4 × 10(-4)) and DNA mismatch repair genes (P = 6.1 × 10(-4)) consistent with a recessive mode of inheritance. This study comprehensively explores the contribution of coding sequence variation to CRC risk, identifying associations with coding variation in 4 genes and the PCDHG gene cluster and several candidate recessive alleles. However, these findings suggest that recurrent, low-frequency coding variants account for a minority of the unexplained heritability of CRC. PMID:26553438

  11. Color differences among feral pigeons (Columba livia) are not attributable to sequence variation in the coding region of the melanocortin-1 receptor gene (MC1R)

    PubMed Central

    2013-01-01

    Background: Genetic variation at the melanocortin-1 receptor (MC1R) gene is correlated with melanin color variation in many birds. Feral pigeons (Columba livia) show two major melanin-based colorations: a red coloration due to pheomelanic pigment and a black coloration due to eumelanic pigment. Furthermore, within each color type, feral pigeons display continuous variation in the amount of melanin pigment present in the feathers, with individuals varying from pure white to a full dark melanic color. Coloration is highly heritable and it has been suggested that it is under natural or sexual selection, or both. Our objective was to investigate whether MC1R allelic variants are associated with plumage color in feral pigeons. Findings: We sequenced 888 bp of the coding sequence of MC1R among pigeons varying both in the type, eumelanin or pheomelanin, and the amount of melanin in their feathers. We detected 10 non-synonymous substitutions and 2 synonymous substitutions, but none of them was associated with a plumage type. It remains possible that non-synonymous substitutions that influence coloration are present in the short MC1R fragment that we did not sequence, but this seems unlikely because we analyzed the entire functionally important region of the gene. Conclusions: Our results show that color differences among feral pigeons are probably not attributable to amino acid variation at the MC1R locus. Therefore, variation in regulatory regions of MC1R or variation in other genes may be responsible for the color polymorphism of feral pigeons. PMID:23915680

  12. Association Between Absolute Neutrophil Count and Variation at TCIRG1: The NHLBI Exome Sequencing Project.

    PubMed

    Rosenthal, Elisabeth A; Makaryan, Vahagn; Burt, Amber A; Crosslin, David R; Kim, Daniel Seung; Smith, Joshua D; Nickerson, Deborah A; Reiner, Alex P; Rich, Stephen S; Jackson, Rebecca D; Ganesh, Santhi K; Polfus, Linda M; Qi, Lihong; Dale, David C; Jarvik, Gail P

    2016-09-01

    Neutrophils are a key component of innate immunity. Individuals with low neutrophil count are susceptible to frequent infections. Linkage and association between congenital neutropenia and a single rare missense variant in TCIRG1 have been reported in a single family. Here, we report on nine rare missense variants at evolutionarily conserved sites in TCIRG1 that are associated with lower absolute neutrophil count (ANC; p = 0.005) in 1,058 participants from three cohorts: Atherosclerosis Risk in Communities (ARIC), Coronary Artery Risk Development in Young Adults (CARDIA), and Jackson Heart Study (JHS) of the NHLBI Grand Opportunity Exome Sequencing Project (GO ESP). These results validate the effects of TCIRG1 coding variation on ANC and suggest that this gene may be associated with a spectrum of mild to severe effects on ANC. PMID:27229898

  13. Ethnic variation in the mitochondrial targeting sequence polymorphism of MnSOD.

    PubMed

    Van Landeghem, G F; Tabatabaie, P; Kucinskas, V; Saha, N; Beckman, G

    1999-07-01

    In contrast to CuZn superoxide dismutase (SOD), only a very limited number of mutations have been described in MnSOD. One interesting example is a polymorphism (Ala-9Val) in the mitochondrial targeting sequence of this radical-scavenging enzyme. We have studied the Ala-9Val polymorphism in various ethnic groups by means of the oligonucleotide ligation assay. There were significant variations in this unique polymorphism between three different language groups: Baltic (Lithuanians), Finnic (Finns and Saamis) and Germanic (Swedes). The Ala frequency in an Asiatic population (Chinese) was significantly lower than in most European populations. This polymorphism may affect the mitochondrial targeting rate of MnSOD, which may result in mitochondrial damage with implications for various late-onset neurological diseases. PMID:10436379

  14. Virology. Mutation rate and genotype variation of Ebola virus from Mali case sequences.

    PubMed

    Hoenen, T; Safronetz, D; Groseth, A; Wollenberg, K R; Koita, O A; Diarra, B; Fall, I S; Haidara, F C; Diallo, F; Sanogo, M; Sarro, Y S; Kone, A; Togo, A C G; Traore, A; Kodio, M; Dosseh, A; Rosenke, K; de Wit, E; Feldmann, F; Ebihara, H; Munster, V J; Zoon, K C; Feldmann, H; Sow, S

    2015-04-01

    The occurrence of Ebola virus (EBOV) in West Africa during 2013-2015 is unprecedented. Early reports suggested that in this outbreak EBOV is mutating twice as fast as previously observed, which indicates the potential for changes in transmissibility and virulence and could render current molecular diagnostics and countermeasures ineffective. We have determined additional full-length sequences from two clusters of imported EBOV infections into Mali, and we show that the nucleotide substitution rate (9.6 × 10(-4) substitutions per site per year) is consistent with rates observed in Central African outbreaks. In addition, overall variation among all genotypes observed remains low. Thus, our data indicate that EBOV is not undergoing rapid evolution in humans during the current outbreak. This finding has important implications for outbreak response and public health decisions and should alleviate several previously raised concerns. PMID:25814067

  15. Two distinct ferredoxins from Rhodobacter capsulatus: complete amino acid sequences and molecular evolution.

    PubMed

    Saeki, K; Suetsugu, Y; Yao, Y; Horio, T; Marrs, B L; Matsubara, H

    1990-09-01

    Two distinct ferredoxins were purified from Rhodobacter capsulatus SB1003. Their complete amino acid sequences were determined by a combination of protease digestion, BrCN cleavage and Edman degradation. Ferredoxins I and II were composed of 64 and 111 amino acids, respectively, with molecular weights of 6,728 and 12,549 excluding iron and sulfur atoms. Both contained two Cys clusters in their amino acid sequences. The first cluster of ferredoxin I and the second cluster of ferredoxin II had a sequence, CxxCxxCxxxCP, in common with the ferredoxins found in Clostridia. The second cluster of ferredoxin I had a sequence, CxxCxxxxxxxxCxxxCM, with extra amino acids between the second and third Cys, which has been reported for other photosynthetic bacterial ferredoxins and putative ferredoxins (nif-gene products) from nitrogen-fixing bacteria, and with a unique occurrence of Met. The first cluster of ferredoxin II had a CxxCxxxxCxxxCP sequence, with two additional amino acids between the second and third Cys, a characteristic feature of Azotobacter-[3Fe-4S] [4Fe-4S]-ferredoxin. Ferredoxin II was also similar to Azotobacter-type ferredoxins with an extended carboxyl (C-) terminal sequence compared to the common Clostridium-type. The evolutionary relationship of the two together with a putative one recently found to be encoded in nifENXQ region in this bacterium [Moreno-Vivian et al. (1989) J. Bacteriol. 171, 2591-2598] is discussed. PMID:2277040
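The cysteine-cluster motifs quoted in this abstract are written with "x" as a wildcard residue, which maps directly onto a regular expression. The following is a minimal sketch of how such motifs could be located in a protein sequence. The function names, the motif labels and the toy peptide are hypothetical illustrations and are not taken from the paper; the toy peptide is not a real ferredoxin sequence.

```python
import re

# Motifs exactly as written in the abstract; "x" means "any residue".
# The dictionary keys are illustrative labels, not names used by the authors.
MOTIFS = {
    "Clostridium-type cluster": "CxxCxxCxxxCP",
    "ferredoxin I, second cluster": "CxxCxxxxxxxxCxxxCM",
    "ferredoxin II, first cluster": "CxxCxxxxCxxxCP",
}


def motif_to_regex(motif):
    """Turn each 'x' wildcard into '.', leaving the fixed residues as they are."""
    return "".join("." if ch == "x" else ch for ch in motif)


def find_motifs(sequence):
    """Return {motif label: [0-based start positions]} for motifs that occur."""
    hits = {}
    for label, motif in MOTIFS.items():
        # A lookahead reports overlapping occurrences as well.
        pattern = "(?=(" + motif_to_regex(motif) + "))"
        starts = [m.start() for m in re.finditer(pattern, sequence)]
        if starts:
            hits[label] = starts
    return hits


# Hypothetical toy peptide built to contain two of the motifs.
toy_peptide = "MA" + "CAACGGCEEECP" + "KLV" + "CDDCAAAACTTTCP" + "G"
print(find_motifs(toy_peptide))
# {'Clostridium-type cluster': [2], 'ferredoxin II, first cluster': [17]}
```

The only difference between the first and third motifs is the spacing between the second and third Cys (two versus four residues), which is why a fixed-width wildcard pattern is enough to tell them apart.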

  16. Physicochemical consequences of amino acid variations that contribute to fibril formation by immunoglobulin light chains.

    PubMed Central

    Raffen, R.; Dieckman, L. J.; Szpunar, M.; Wunschl, C.; Pokkuluri, P. R.; Dave, P.; Wilkins Stevens, P.; Cai, X.; Schiffer, M.; Stevens, F. J.

    1999-01-01

    The most common form of systemic amyloidosis originates from antibody light chains. The large number of amino acid variations that distinguish amyloidogenic from nonamyloidogenic light chain proteins has impeded our understanding of the structural basis of light-chain fibril formation. Moreover, even among the subset of human light chains that are amyloidogenic, many primary structure differences are found. We compared the thermodynamic stabilities of two recombinant kappa4 light-chain variable domains (V(L)s) derived from amyloidogenic light chains with a V(L) from a benign light chain. The amyloidogenic V(L)s were significantly less stable than the benign V(L). Furthermore, only the amyloidogenic V(L)s formed fibrils under native conditions in an in vitro fibril formation assay. We used site-directed mutagenesis to examine the consequences of individual amino acid substitutions found in the amyloidogenic V(L)s on stability and fibril formation capability. Both stabilizing and destabilizing mutations were found; however, only destabilizing mutations induced fibril formation in vitro. We found that fibril formation by the benign V(L) could be induced by low concentrations of a denaturant. This indicates that there are no structural or sequence-specific features of the benign V(L) that are incompatible with fibril formation, other than its greater stability. These studies demonstrate that the V(L) beta-domain structure is vulnerable to destabilizing mutations at a number of sites, including complementarity determining regions (CDRs), and that loss of variable domain stability is a major driving force in fibril formation. PMID:10091653

  17. Amino Acid Sequence of Anionic Peroxidase from the Windmill Palm Tree Trachycarpus fortunei

    PubMed Central

    2015-01-01

    Palm peroxidases are extremely stable and have uncommon substrate specificity. This study was designed to fill in the knowledge gap about the structures of a peroxidase from the windmill palm tree Trachycarpus fortunei. The complete amino acid sequence and partial glycosylation were determined by MALDI-top-down sequencing of native windmill palm tree peroxidase (WPTP), MALDI-TOF/TOF MS/MS of WPTP tryptic peptides, and cDNA sequencing. The propeptide of WPTP contained N- and C-terminal signal sequences which contained 21 and 17 amino acid residues, respectively. Mature WPTP was 306 amino acids in length, and its carbohydrate content ranged from 21% to 29%. Comparison to closely related royal palm tree peroxidase revealed structural features that may explain differences in their substrate specificity. The results can be used to guide engineering of WPTP and its novel applications. PMID:25383699

  18. Extensive Variation and Rapid Shift of the MG192 Sequence in Mycoplasma genitalium Strains from Patients with Chronic Infection

    PubMed Central

    Mancuso, Miriam; Williams, James A.; Van Der Pol, Barbara; Fortenberry, J. Dennis; Jia, Qiuyao; Myers, Leann; Martin, David H.

    2014-01-01

    Mycoplasma genitalium causes persistent urogenital tract infection in humans. Antigenic variation of the protein encoded by the MG192 gene has been proposed as one of the mechanisms for persistence. The aims of this study were to determine MG192 sequence variation in patients with chronic M. genitalium infection and to analyze the sequence structural features of the MG192 gene and its encoded protein. Urogenital specimens were obtained from 13 patients who were followed for 10 days to 14 months. The variable region of the MG192 gene was PCR amplified, subcloned into plasmids, and sequenced. Sequence analysis of 220 plasmid clones yielded 97 unique MG192 variant sequences. MG192 sequence shift was identified between sequential specimens from all but one patient. Despite great variation of the MG192 gene among and within clinical specimens from different patients, MG192 sequences were more related within M. genitalium specimens from an individual patient than between patients. The MG192 variable region consisted of 11 discrete subvariable regions with different degrees of variability. Analysis of the two most variable regions (V4 and V6) in five sequential specimens from one patient showed that sequence changes increased over time and that most sequences were present at only one time point, suggesting immune selection. Topology analysis of the deduced MG192 protein predicted a surface-exposed membrane protein. Extensive variation of the MG192 sequence may not only change the antigenicity of the protein to allow immune evasion but also alter the mobility and adhesion ability of the organism to adapt to diverse host microenvironments, thus facilitating persistent infection. PMID:24396043

  19. Variations in prebiotic oligosaccharide fermentation by intestinal lactic acid bacteria.

    PubMed

    Endo, Akihito; Nakamura, Saki; Konishi, Kenta; Nakagawa, Junichi; Tochio, Takumi

    2016-01-01

    Prebiotic oligosaccharides confer health benefits on the host by modulating the gut microbiota. Intestinal lactic acid bacteria (LAB) are potential targets of prebiotics; however, the metabolism of oligosaccharides by LAB has not been fully characterized. Here, we studied the metabolism of eight oligosaccharides by 19 strains of intestinal LAB. Among the eight oligosaccharides used, 1-kestose, lactosucrose and galactooligosaccharides (GOSs) led to the greatest increases in the numbers of the strains tested. However, mono- and disaccharides accounted for more than half of the GOSs used, and several strains only metabolized the mono- and disaccharides in GOSs. End product profiles indicated that the amounts of lactate produced were generally consistent with the bacterial growth recorded. Oligosaccharide profiling revealed an interesting metabolic pattern in Lactobacillus paracasei strains, which metabolized all oligosaccharides, but left sucrose when cultured with fructooligosaccharides. The present study clearly indicated that the prebiotic potential of each oligosaccharide differs. PMID:26888650

  20. The rules of variation: Amino acid exchange according to the rotating circular genetic code

    PubMed Central

    Castro-Chavez, Fernando

    2011-01-01

    General guidelines for the molecular basis of functional variation are presented while focused on the rotating circular genetic code and allowable exchanges that make it resistant to genetic diseases under normal conditions. The rules of variation, bioinformatics aids for preventive medicine, are: (1) same position in the four quadrants for hydrophobic codons, (2) same or contiguous position in two quadrants for synonymous or related codons, and (3) same quadrant for equivalent codons. To preserve protein function, amino acid exchange according to the first rule takes into account the positional homology of essential hydrophobic amino acids with every codon with a central uracil in the four quadrants, the second rule includes codons for identical, acidic, or their amidic amino acids present in two quadrants, and the third rule, the smaller, aromatic, stop codons, and basic amino acids, each in proximity within a 90 degree angle. I also define codifying genes and palindromati, CTCGTGCCGAATTCGGCACGAG. PMID:20371250
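The 22-nucleotide sequence quoted at the end of this abstract can be checked for the usual molecular-biology sense of "palindrome", namely a strand that equals its own reverse complement. The sketch below assumes that this is the intended sense (the abstract does not define the term), and the function names are illustrative.

```python
# Watson-Crick complement table for an unambiguous DNA alphabet.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Complement every base, then reverse the strand."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def is_palindromic(seq):
    """True when the sequence reads the same as its reverse complement."""
    return seq.upper() == reverse_complement(seq)


quoted = "CTCGTGCCGAATTCGGCACGAG"  # sequence as printed in the abstract
print(len(quoted), is_palindromic(quoted))  # 22 True
```

Under that definition the quoted sequence is a perfect palindrome across its full length.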

  1. Protein chemotaxonomy. XIII. Amino acid sequence of ferredoxin from Panax ginseng.

    PubMed

    Mino, Yoshiki

    2006-08-01

    The complete amino acid sequence of [2Fe-2S] ferredoxin from Panax ginseng (Araliaceae) has been determined by automated Edman degradation of the entire S-carboxymethylcysteinyl protein and of the peptides obtained by enzymatic digestion. This ferredoxin has a unique amino acid sequence, which includes an insertion of Tyr at the 3rd position from the amino-terminus and a deletion of two amino acid residues at the carboxyl terminus. This ferredoxin had 18 differences in its amino acid sequence compared to that of Petroselinum sativum (Umbelliferae). In contrast, 23-33 differences were observed compared to other dicotyledonous plants. This suggests that Panax ginseng is related taxonomically to umbelliferous plants. PMID:16880642

  2. Complete amino acid sequence and structure characterization of the taste-modifying protein, miraculin.

    PubMed

    Theerasilp, S; Hitotsuya, H; Nakajo, S; Nakaya, K; Nakamura, Y; Kurihara, Y

    1989-04-25

    The taste-modifying protein, miraculin, has the unusual property of modifying sour taste into sweet taste. The complete amino acid sequence of miraculin purified from miracle fruits by a newly developed method (Theerasilp, S., and Kurihara, Y. (1988) J. Biol. Chem. 263, 11536-11539) was determined by an automatic Edman degradation method. Miraculin was a single polypeptide with 191 amino acid residues. The calculated molecular weight based on the amino acid sequence and the carbohydrate content (13.9%) was 24,600. Asn-42 and Asn-186 were linked N-glycosidically to carbohydrate chains. High homology was found between the amino acid sequences of miraculin and soybean trypsin inhibitor. PMID:2708331

  3. Validation of copy number variation sequencing for detecting chromosome imbalances in human preimplantation embryos.

    PubMed

    Wang, Li; Cram, David S; Shen, Jiandong; Wang, Xiaohong; Zhang, Jianguang; Song, Zhuo; Xu, Genming; Li, Na; Fan, Junmei; Wang, Shufang; Luo, Yaning; Wang, Jun; Yu, Li; Liu, Jiayin; Yao, Yuanqing

    2014-08-01

    Chromosome aneuploidies commonly arise in embryos produced by assisted reproductive technologies and represent a major cause of implantation failure and miscarriage. Currently, preimplantation genetic diagnosis (PGD) is performed by array-based methods to identify euploid embryos for transfer to the patient. We speculated that a combination of next-generation sequencing technologies and sophisticated bioinformatics would deliver a more comprehensive and accurate methodology to improve the overall efficacy of embryo testing. To meet this challenge, we developed a high-resolution copy number variation (CNV) sequencing pipeline suitable for single-cell analysis. In validation studies, we showed that CNV-Seq was highly sensitive and specific for detection of euploidy, aneuploidy, and segmental imbalances in 24 whole genome amplification samples from PGD embryos that were originally diagnosed by gold standard array comparative genomic hybridization. In addition, CNV-Seq was capable of detecting, mapping, and accurately quantifying terminal chromosome imbalances down to 1 Mb in size originating from abnormal segregation of translocation chromosomes. These validation studies indicate that CNV-Seq displays the hallmarks of an accurate and reliable embryo test with the potential to further improve the overall efficacy of PGD. PMID:24966395

  4. Copy number variations in Hanwoo and Yanbian cattle genomes using the massively parallel sequencing data.

    PubMed

    Choi, Jung-Woo; Chung, Won-Hyong; Lim, Kyu-Sang; Lim, Won-Jun; Choi, Bong-Hwan; Lee, Seung-Hwan; Kim, Hyeong-Cheol; Lee, Seung-Soo; Cho, Eun-Seok; Lee, Kyung-Tai; Kim, Namshin; Kim, Jeong-Dae; Kim, Jong-Bok; Chai, Han-Ha; Cho, Yong-Min; Kim, Tae-Hun; Lim, Dajeong

    2016-09-01

    Hanwoo is an indigenous Korean beef cattle breed, and it shared an ancestor with Yanbian cattle that are found in the Northeast provinces in China until the last century. During recent decades, those cattle breeds experienced different selection pressures. Here, we present genome-wide copy number variations (CNVs) by comparing Hanwoo and Yanbian cattle sequencing data. We used ~3.12 and ~3.07 billion sequence reads from Hanwoo and Yanbian cattle, respectively. A total of 901 putative CNV regions (CNVRs) were identified throughout the genome, representing 5,513,340 bp. This is a smaller number than has been reported in previous studies, indicating that Hanwoo are genetically close to Yanbian cattle. Of the CNVRs, 53.2% and 46.8% were found to be gains and losses in Hanwoo. Potential functional roles of each CNVR were assessed by annotating all CNVRs and gene ontology (GO) enrichment analysis. We found that 278 CNVRs overlapped with cattle gene-sets (genic-CNVRs) that could be promising candidates to account for economically important traits in cattle. The enrichment analysis indicated that genes were significantly over-represented in GO terms, including developmental process, multicellular organismal process, reproduction, and response to stimulus. These results provide a valuable genomic resource for determining how CNVs are associated with cattle traits. PMID:27188257

  5. Serine Hydroxymethyltransferase 1 and 2: Gene Sequence Variation and Functional Genomic Characterization

    PubMed Central

    Hebbring, Scott J.; Chai, Yubo; Ji, Yuan; Abo, Ryan P.; Jenkins, Gregory D.; Fridley, Brooke; Zhang, Jianping; Eckloff, Bruce W.; Wieben, Eric D.; Weinshilboum, Richard M.

    2012-01-01

    Serine hydroxymethyltransferase (SHMT) catalyzes the transfer of a beta carbon from serine to tetrahydrofolate (THF) to form glycine and 5,10-methylene-THF. This reaction plays an important role in neurotransmitter synthesis and metabolism. We set out to resequence SHMT1 and SHMT2, followed by functional genomic studies. We identified 87 and 60 polymorphisms in SHMT1 and SHMT2, respectively. We observed no significant functional effect of the 13 nonsynonymous SNPs in these genes, either on catalytic activity or protein quantity. We imputed additional variants across the two genes using “1000 Genomes” data, and identified 14 variants that were significantly associated (p-value < 1.0E-10) with SHMT1 mRNA expression in lymphoblastoid cell lines. Many of these SNPs were also significantly correlated with basal SHMT1 protein expression in 268 human liver biopsy samples. Reporter gene assays suggested that the SHMT1 promoter SNP, rs669340, contributed to this variation. Finally, SHMT1 and SHMT2 expression were significantly correlated with those of other Folate and Methionine Cycle genes at both the mRNA and protein levels. These experiments represent a comprehensive study of SHMT1 and SHMT2 gene sequence variation and its functional implications. In addition, we obtained preliminary indications that these genes may be co-regulated with other Folate and Methionine Cycle genes. PMID:22220685

  6. Complete cDNA and derived amino acid sequence of human factor V

    SciTech Connect

    Jenny, R.J.; Pittman, D.D.; Toole, J.J.; Kriz, R.W.; Aldape, R.A.; Hewick, R.M.; Kaufman, R.J.; Mann, K.G.

    1987-07-01

    cDNA clones encoding human factor V have been isolated from an oligo(dT)-primed human fetal liver cDNA library prepared with vector Charon 21A. The cDNA sequence of factor V from three overlapping clones includes a 6672-base-pair (bp) coding region, a 90-bp 5' untranslated region, and a 163-bp 3' untranslated region within which is a poly(A) tail. The deduced amino acid sequence consists of 2224 amino acids inclusive of a 28-amino acid leader peptide. Direct comparison with human factor VIII reveals considerable homology between proteins in amino acid sequence and domain structure: a triplicated A domain and duplicated C domain show approx. 40% identity with the corresponding domains in factor VIII. As in factor VIII, the A domains of factor V share approx. 40% amino acid-sequence homology with the three highly conserved domains in ceruloplasmin. The B domain of factor V contains 35 tandem and approx. 9 additional semiconserved repeats of nine amino acids of the form Asp-Leu-Ser-Gln-Thr-Thr/Asn-Leu-Ser-Pro and 2 additional semiconserved repeats of 17 amino acids. Factor V contains 37 potential N-linked glycosylation sites, 25 of which are in the B domain, and a total of 19 cysteine residues.
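The lengths reported in this abstract can be cross-checked with simple codon arithmetic: three bases per residue links the coding region to the deduced protein length. The sketch below uses only the figures stated in the abstract; the variable names are illustrative, and the derived values (mature chain length, summed cDNA length, residues in the tandem repeats) are computed here rather than quoted from the source.

```python
# Figures as stated in the abstract.
coding_region_bp = 6672
utr_5_bp = 90
utr_3_bp = 163
leader_peptide_aa = 28
tandem_repeats = 35
repeat_length_aa = 9

# Three nucleotides encode one amino acid.
codons, leftover_bases = divmod(coding_region_bp, 3)
assert leftover_bases == 0  # the coding region is a whole number of codons

# Derived quantities (computed here, not stated in the abstract).
mature_chain_aa = codons - leader_peptide_aa
listed_cdna_bp = utr_5_bp + coding_region_bp + utr_3_bp
tandem_repeat_aa = tandem_repeats * repeat_length_aa

print(codons, mature_chain_aa, listed_cdna_bp, tandem_repeat_aa)
# 2224 2196 6925 315
```

The 2224 codons agree with the 2224 deduced amino acids reported, so the stated coding-region length and protein length are mutually consistent.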

  7. Detection and implication of significant temporal b-value variation during earthquake sequences

    NASA Astrophysics Data System (ADS)

    Gulia, Laura; Tormann, Thessa; Schorlemmer, Danijel; Wiemer, Stefan

    2016-04-01

    Earthquakes tend to cluster in space and time and periods of increased seismic activity are also periods of increased seismic hazard. Forecasting models currently used in statistical seismology and in Operational Earthquake Forecasting (e.g. ETAS) consider the spatial and temporal changes in the activity rates whilst the spatio-temporal changes in the earthquake size distribution, the b-value, are not included. Laboratory experiments on rock samples show an increasing relative proportion of larger events as the system approaches failure, and a sudden reversal of this trend after the main event. The increasing fraction of larger events during the stress increase period can be mathematically represented by a systematic b-value decrease, while the b-value increases immediately following the stress release. We investigate whether these lab-scale observations also apply to natural earthquake sequences and can help to improve our understanding of the physical processes generating damaging earthquakes. A number of large events nucleated in low b-value regions and spatial b-value variations have been extensively documented in the past. Detecting temporal b-value evolution with confidence is more difficult, one reason being the very different scales that have been suggested for a precursory drop in b-value, from a few days to decadal scale gradients. We demonstrate with the results of detailed case studies of the 2009 M6.3 L'Aquila and 2011 M9 Tohoku earthquakes that significant and meaningful temporal b-value variability can be detected throughout the sequences, which e.g. suggests that foreshock probabilities are not generic but subject to significant spatio-temporal variability. Such potential conclusions require and motivate the systematic study of many sequences to investigate whether general patterns exist that might eventually be useful for time-dependent or even real-time seismic hazard assessment.
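
    The b-value discussed above is commonly estimated from a catalogue by maximum likelihood. The sketch below uses the Aki estimator with Utsu's correction for magnitude binning; it is an illustration only, not necessarily the procedure used in the study. The function name, the completeness magnitude argument and the default bin width are assumptions made for this example.

```python
import math


def b_value_aki_utsu(magnitudes, completeness_mag, bin_width=0.1):
    """Maximum-likelihood b-value for events at or above completeness_mag.

    b = log10(e) / (mean(M) - (Mc - bin_width / 2))
    The returned uncertainty is Aki's first-order estimate b / sqrt(N).
    All parameter names are hypothetical, chosen for this sketch.
    """
    selected = [m for m in magnitudes if m >= completeness_mag]
    if len(selected) < 2:
        raise ValueError("need at least two events above the completeness magnitude")
    mean_mag = sum(selected) / len(selected)
    denom = mean_mag - (completeness_mag - bin_width / 2.0)
    if denom <= 0:
        raise ValueError("mean magnitude must exceed the corrected completeness magnitude")
    b = math.log10(math.e) / denom
    sigma = b / math.sqrt(len(selected))
    return b, sigma
```

    Applying such an estimator in a sliding time window over a sequence is one simple way to look for temporal b-value changes; the window length strongly affects the result and would have to be chosen with care.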

  8. Temporal Stability of Epigenetic Markers: Sequence Characteristics and Predictors of Short-Term DNA Methylation Variations

    PubMed Central

    Coull, Brent A.; Tarantini, Letizia; Hou, Lifang; Bonzini, Matteo; Apostoli, Pietro; Bertazzi, Pier Alberto; Baccarelli, Andrea

    2012-01-01

    Background: DNA methylation is an epigenetic mechanism that has been increasingly investigated in observational human studies, particularly on blood leukocyte DNA. Characterizing the degree and determinants of DNA methylation stability can provide critical information for the design and conduct of human epigenetic studies. Methods: We measured DNA methylation in 12 gene-promoter regions (APC, p16, p53, RASSF1A, CDH13, eNOS, ET-1, IFNγ, IL-6, TNFα, iNOS, and hTERT) and 2 non-long terminal repeat elements, i.e., L1 and Alu, in blood samples obtained from 63 healthy individuals at baseline (Day 1) and after three days (Day 4). DNA methylation was measured by bisulfite-PCR-Pyrosequencing. We calculated intraclass correlation coefficients (ICCs) to measure the within-individual stability of DNA methylation between Day 1 and 4, corrected for pyrosequencing error and adjusted for multiple covariates. Results: Methylation markers showed different temporal behaviors ranging from high (IL-6, ICC = 0.89) to low stability (APC, ICC = 0.08) between Day 1 and 4. Multiple sequence and marker characteristics were associated with the degree of variation. Density of CpG dinucleotides near the sequence analyzed (measured as CpG(o/e) or G+C content within ±200bp) was positively associated with DNA methylation stability. The 3′ proximity to repeat elements and range of DNA methylation on Day 1 were also positively associated with methylation stability. An inverted U-shaped correlation was observed between mean DNA methylation on Day 1 and stability. Conclusions: The degree of short-term DNA methylation stability is marker-dependent and associated with sequence characteristics and methylation levels. PMID:22745719
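
    The CpG(o/e) and G+C measures mentioned above can be computed directly from a sequence window. The sketch below assumes the commonly used observed/expected definition (number of CpG × window length) / (number of C × number of G); the abstract does not state which definition the authors used, and the function name is hypothetical.

```python
def cpg_obs_exp(window):
    """Return (CpG observed/expected ratio, G+C fraction) for a DNA window.

    Assumes the definition CpG(o/e) = (#CpG * N) / (#C * #G), where N is
    the window length. Returns a ratio of 0.0 when C or G is absent.
    """
    seq = window.upper()
    n = len(seq)
    if n == 0:
        return 0.0, 0.0
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")  # "CG" cannot overlap itself
    gc_fraction = (c_count + g_count) / n
    if c_count == 0 or g_count == 0:
        return 0.0, gc_fraction
    return cpg_count * n / (c_count * g_count), gc_fraction
```

    For a ±200 bp measure, the window passed in would be the 400 bp or so surrounding the assayed position, extracted from a reference sequence.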

  9. Effect of laying sequence on egg mercury in captive zebra finches: an interpretation considering individual variation.

    PubMed

    Ou, Langbo; Varian-Ramos, Claire W; Cristol, Daniel A

    2015-08-01

    Bird eggs are used widely as noninvasive bioindicators for environmental mercury availability. Previous studies, however, have found varying relationships between laying sequence and egg mercury concentrations. Some studies have reported that the mercury concentration was higher in first-laid eggs or declined across the laying sequence, whereas in other studies mercury concentration was not related to egg order. Approximately 300 eggs (61 clutches) were collected from captive zebra finches dosed throughout their reproductive lives with methylmercury (0.3 μg/g, 0.6 μg/g, 1.2 μg/g, or 2.4 μg/g wet wt in diet); the total mercury concentration (mean ± standard deviation [SD] dry wt basis) of their eggs was 7.03 ± 1.38 μg/g, 14.15 ± 2.52 μg/g, 26.85 ± 5.85 μg/g, and 49.76 ± 10.37 μg/g, respectively (equivalent to fresh wt egg mercury concentrations of 1.24 μg/g, 2.50 μg/g, 4.74 μg/g, and 8.79 μg/g). The authors observed a significant decrease in the mercury concentration of successive eggs when compared with the first egg and notable variation between clutches within treatments. The mercury level of individual females within and among treatments did not alter this relationship. Based on the results, sampling of a single egg in each clutch from any position in the laying sequence is sufficient for purposes of population risk assessment, but it is not recommended as a proxy for individual female exposure or as an estimate of average mercury level within the clutch. PMID:25760460

  10. N-terminal sequence of amino acids and some properties of an acid-stable alpha-amylase from citric acid-koji (Aspergillus usamii var.).

    PubMed

    Suganuma, T; Tahara, N; Kitahara, K; Nagahama, T; Inuzuka, K

    1996-01-01

    An acid-stable alpha-amylase (AA) was purified from an acidic extract of citric acid-koji (A. usamii var.). The N-terminal sequence of the first 20 amino acids of the enzyme was identical with that of AA from A. niger, but the two enzymes differed in molecular weight. HPLC analysis for identifying the anomers of products indicated that the AA hydrolyzed maltopentaose (G5) at the third glycoside bond predominantly, which differed from Taka-amylase A and the neutral alpha-amylase (NA) from the citric acid-koji. PMID:8824843

  11. LOVD: easy creation of a locus-specific sequence variation database using an "LSDB-in-a-box" approach.

    PubMed

    Fokkema, Ivo F A C; den Dunnen, Johan T; Taschner, Peter E M

    2005-08-01

    The completion of the human genome project has initiated, as well as provided the basis for, the collection and study of all sequence variation between individuals. Direct access to up-to-date information on sequence variation is currently provided most efficiently through web-based, gene-centered, locus-specific databases (LSDBs). We have developed the Leiden Open (source) Variation Database (LOVD) software approaching the "LSDB-in-a-Box" idea for the easy creation and maintenance of a fully web-based gene sequence variation database. LOVD is platform-independent and uses PHP and MySQL open source software only. The basic gene-centered and modular design of the database follows the recommendations of the Human Genome Variation Society (HGVS) and focuses on the collection and display of DNA sequence variations. With minimal effort, the LOVD platform is extendable with clinical data. The open set-up should both facilitate and promote functional extension with scripts written by the community. The LOVD software is freely available from the Leiden Muscular Dystrophy pages (www.DMD.nl/LOVD/). To promote the use of LOVD, we currently offer curators the possibility to set up an LSDB on our Leiden server. PMID:15977173

  12. Variation in conserved non-coding sequences on chromosome 5q and susceptibility to asthma and atopy

    SciTech Connect

    Donfack, Joseph; Schneider, Daniel H.; Tan, Zheng; Kurz, Thorsten; Dubchak, Inna; Frazer, Kelly A.; Ober, Carole

    2005-09-10

    Background: Evolutionarily conserved sequences likely have biological function. Methods: To determine whether variation in conserved sequences in non-coding DNA contributes to risk for human disease, we studied six conserved non-coding elements in the Th2 cytokine cluster on human chromosome 5q31 in a large Hutterite pedigree and in samples of outbred European American and African American asthma cases and controls. Results: Among six conserved non-coding elements (>100 bp, >70 percent identity; human-mouse comparison), we identified one single nucleotide polymorphism (SNP) in each of two conserved elements and six SNPs in the flanking regions of three conserved elements. We genotyped our samples for four of these SNPs and an additional three SNPs each in the IL13 and IL4 genes. While there was only modest evidence for association with single SNPs in the Hutterite and European American samples (P<0.05), there were highly significant associations in European Americans between asthma and haplotypes comprised of SNPs in the IL4 gene (P<0.001), including a SNP in a conserved non-coding element. Furthermore, variation in the IL13 gene was strongly associated with total IgE (P = 0.00022) and allergic sensitization to mold allergens (P = 0.00076) in the Hutterites, and more modestly associated with sensitization to molds in the European Americans and African Americans (P<0.01). Conclusion: These results indicate that there is overall little variation in the conserved non-coding elements on 5q31, but variation in IL4 and IL13, including possibly one SNP in a conserved element, influences asthma and atopic phenotypes in diverse populations.

  13. Spatial and Temporal Stress Drop Variations of the 2011 Tohoku Earthquake Sequence

    NASA Astrophysics Data System (ADS)

    Miyake, H.

    2013-12-01

    The 2011 Tohoku earthquake sequence consists of foreshocks, mainshock, aftershocks, and repeating earthquakes. Quantifying spatial and temporal stress drop variations is important for understanding M9-class megathrust earthquakes. The variability and the spatial and temporal pattern of stress drop are basic information for rupture dynamics and are also useful for source modeling. As pointed out in the ground motion prediction equations by Campbell and Bozorgnia [2008, Earthquake Spectra], mainshock-aftershock pairs often show a significant decrease of stress drop. Here we focus on strong motion records before and after the Tohoku earthquake, and analyze source spectral ratios considering azimuth and distance dependency [Miyake et al., 2001, GRL]. Due to the limitation of station locations on land, spatial and temporal stress drop variations are estimated by adjusting shifts from the omega-squared source spectral model. The adjustment is based on stochastic Green's function simulations of source spectra considering azimuth and distance dependency. Because we assumed the same Green's functions for event pairs at each station, both the propagation path and site amplification effects are cancelled out. While precise studies of spatial and temporal stress drop variations have been performed [e.g., Allmann and Shearer, 2007, JGR], this study targets the relations of stress drop to the progression of slow slip prior to the Tohoku earthquake by Kato et al. [2012, Science] and to plate structures. Acknowledgement: This study is partly supported by ERI Joint Research (2013-B-05). We used the JMA unified earthquake catalogue and K-NET, KiK-net, and F-net data provided by NIED.

  14. The developmental transcriptome landscape of bovine skeletal muscle defined by Ribo-Zero ribonucleic acid sequencing.

    PubMed

    Sun, X; Li, M; Sun, Y; Cai, H; Li, R; Wei, X; Lan, X; Huang, Y; Lei, C; Chen, H

    2015-12-01

    Ribonucleic acid sequencing (RNA-Seq) libraries are normally prepared with oligo(dT) selection of poly(A)+ mRNA, but this depends on intact total RNA samples. Recent studies have described Ribo-Zero technology, a novel method that can capture both poly(A)+ and poly(A)- transcripts from intact or fragmented RNA samples. We report here the first application of Ribo-Zero RNA-Seq for the analysis of the bovine embryonic, neonatal, and adult skeletal muscle whole transcriptome at an unprecedented depth. Overall, 19,893 genes were found to be expressed, with a high correlation of expression levels between the calf and the adult. Hundreds of genes were found to be highly expressed in the embryo and decreased at least 10-fold after birth, indicating their potential roles in embryonic muscle development. In addition, we present for the first time the analysis of global transcript isoform discovery in bovine skeletal muscle and identified 36,694 transcript isoforms. Transcriptomic data were also analyzed to unravel sequence variations; 185,036 putative SNP and 12,428 putative short insertions-deletions (InDel) were detected. Specifically, many stop-gain, stop-loss, and frameshift mutations were identified that probably change the relative protein production and subsequently affect the gene function. Notably, the numbers of stage-specific transcripts, alternative splicing events, SNP, and InDel were greater in the embryo than in the calf and the adult, suggesting that gene expression is most active in the embryo. The resulting view of the transcriptome at a single-base resolution greatly enhances the comprehensive transcript catalog and uncovers the global trends in gene expression during bovine skeletal muscle development. PMID:26641174

  15. [Genetic structure of the Siberian Sucker (Catostomus catostomus rostratus) according to data on sequence variation of the mtDNA cytochrome B gene].

    PubMed

    Bachevskaia, L T; Pereverzeva, V V; Ivanova, G D; Agapova, G A; Primak, A A

    2014-01-01

    Data regarding the structure and variation of the nucleotide sequence of the cytochrome b gene of mitochondrial DNA of the Siberian Sucker from the Kolyma River were obtained. Analysis of the median network revealed that evolutionary lines diverged from a common ancestor. Penetration of the sucker into Asia from Northern America took place between the Early and Middle Pleistocene. Prolonged reproductive isolation of the Siberian and Northern American suckers led to interspecies divergence with the appearance of amino acid substitutions, which apparently became fixed due to positive selection. The Siberian Sucker appeared to have three modifications of the Cytb protein. PMID:25735175

  16. Analysis of seasonal variation of stratospheric nitric acid

    NASA Astrophysics Data System (ADS)

    Gruzdev, A. N.

    1998-11-01

    Data from the draft COSPAR reference model for stratospheric nitric acid (HNO3) are analysed. Eight months of LIMS HNO3 measurements allow the analysis of dynamics of regimes associated with the annual HNO3 maximum followed by the HNO3 decrease in the Northern Hemisphere and the annual HNO3 minimum followed by the HNO3 increase in the Southern Hemisphere. The HNO3 minimum is noted earlier (in November) in the Southern Hemisphere subtropical upper stratosphere, from where the regime of minimum HNO3 values propagates to the southern high-latitude middle stratosphere, and then (in Austral summer) the equatorward propagation of the regime is observed, with a persistent downward component. The regime of the HNO3 annual maximum in the Northern Hemisphere propagates from the Arctic lower stratosphere (in autumn) and from the tropical middle stratosphere (in late summer), so that in the mid-latitude middle stratosphere the downward propagation of the regime is observed. Evolution of areas with HNO3 increase and decrease by 1 ppbv against the January HNO3 distribution quantifies intensity of the HNO3 decrease in winter-spring in the Northern Hemisphere and the HNO3 increase in Austral summer-autumn in the Southern Hemisphere.

  17. Simultaneous alignment and folding of 28S rRNA sequences uncovers phylogenetic signal in structure variation.

    PubMed

    Letsch, Harald O; Greve, Carola; Kück, Patrick; Fleck, Günther; Stocsits, Roman R; Misof, Bernhard

    2009-12-01

    Secondary structure models of mitochondrial and nuclear (r)RNA sequences are frequently applied to aid the alignment of these molecules in phylogenetic analyses. Additionally, it is often speculated that structure variation of (r)RNA sequences might profitably be used as phylogenetic markers. The benefit of these approaches depends on the reliability of structure models. We used a recently developed approach to show that reliable inference of large (r)RNA secondary structures as a prerequisite of simultaneous sequence and structure alignment is feasible. The approach iteratively establishes local structure constraints of each sequence and infers fully folded individual structures by constrained MFE optimization. A comparison of structure edit distances of individual constraints and fully folded structures showed pronounced phylogenetic signal in fully folded structures. As model sequences we characterized secondary structures of 28S rRNA sequences of selected insects and examined their phylogenetic signal according to established phylogenetic hypotheses. PMID:19654047

  18. Detection and quantitation of single nucleotide polymorphisms, DNA sequence variations, DNA mutations, DNA damage and DNA mismatches

    DOEpatents

    McCutchen-Maloney, Sandra L.

    2002-01-01

    DNA mutation binding proteins alone and as chimeric proteins with nucleases are used with solid supports to detect DNA sequence variations, DNA mutations and single nucleotide polymorphisms. The solid supports may be flow cytometry beads, DNA chips, glass slides or DNA dipsticks. DNA molecules are coupled to solid supports to form DNA-support complexes. Labeled DNA is used with unlabeled DNA mutation binding proteins such as TthMutS to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by binding which gives an increase in signal. Unlabeled DNA is utilized with labeled chimeras to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by nuclease activity of the chimera which gives a decrease in signal.

  19. Detection and isolation of nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

    1997-01-01

    A method for detecting a target nucleic acid sequence in a sample is provided using hybridization probes which competitively hybridize to a target nucleic acid. According to the method, a target nucleic acid sequence is hybridized to first and second hybridization probes which are complementary to overlapping portions of the target nucleic acid sequence, the first hybridization probe including a first complexing agent capable of forming a binding pair with a second complexing agent and the second hybridization probe including a detectable marker. The first complexing agent attached to the first hybridization probe is contacted with a second complexing agent, the second complexing agent being attached to a solid support such that when the first and second complexing agents are attached, target nucleic acid sequences hybridized to the first hybridization probe become immobilized on to the solid support. The immobilized target nucleic acids are then separated and detected by detecting the detectable marker attached to the second hybridization probe. A kit for performing the method is also provided.

  20. Detection and isolation of nucleic acid sequences using competitive hybridization probes

    DOEpatents

    Lucas, J.N.; Straume, T.; Bogen, K.T.

    1997-04-01

    A method for detecting a target nucleic acid sequence in a sample is provided using hybridization probes which competitively hybridize to a target nucleic acid. According to the method, a target nucleic acid sequence is hybridized to first and second hybridization probes which are complementary to overlapping portions of the target nucleic acid sequence, the first hybridization probe including a first complexing agent capable of forming a binding pair with a second complexing agent and the second hybridization probe including a detectable marker. The first complexing agent attached to the first hybridization probe is contacted with a second complexing agent, the second complexing agent being attached to a solid support such that when the first and second complexing agents are attached, target nucleic acid sequences hybridized to the first hybridization probe become immobilized on to the solid support. The immobilized target nucleic acids are then separated and detected by detecting the detectable marker attached to the second hybridization probe. A kit for performing the method is also provided. 7 figs.

  1. Sequence Variation in the Small-Subunit rRNA Gene of Plasmodium malariae and Prevalence of Isolates with the Variant Sequence in Sichuan, China

    PubMed Central

    Liu, Qing; Zhu, Shenghua; Mizuno, Sahoko; Kimura, Masatsugu; Liu, Peina; Isomura, Shin; Wang, Xingzhen; Kawamoto, Fumihiko

    1998-01-01

    By two PCR-based diagnostic methods, Plasmodium malariae infections have been rediscovered at two foci in the Sichuan province of China, a region where no cases of P. malariae have been officially reported for the last 2 decades. In addition, a variant form of P. malariae which has a deletion of 19 bp and seven substitutions of base pairs in the target sequence of the small-subunit (SSU) rRNA gene was detected with high frequency. Alignment analysis of Plasmodium sp. SSU rRNA gene sequences revealed that the 5′ region of the variant sequence is identical to that of P. vivax or P. knowlesi and its 3′ region is identical to that of P. malariae. The same sequence variations were also found in P. malariae isolates collected along the Thai-Myanmar border, suggesting a wide distribution of this variant form from southern China to Southeast Asia. PMID:9774600

  2. Genetic Variation of Fatty Acid Oxidation and Obesity, A Literature Review

    PubMed Central

    Freitag Luglio, Harry

    2016-01-01

    Modulation of fat metabolism is an important component of the etiology of obesity as well as of the individual response to weight loss programs. The influence of the lipolysis process has received much attention in recent decades. By comparison, fatty acid oxidation, which occurs after lipolysis, has received less. There are limited publications on how fatty acid oxidation influences predisposition to obesity, especially on the importance of genetic variations of fatty acid oxidation proteins in the development of obesity. The aim of this review is to present recent knowledge on polymorphisms of genes related to fatty acid oxidation. Studies in humans as well as animal models showed that disturbance of genes related to the fatty acid oxidation process had an impact on body weight and risk of obesity. Several polymorphisms in CD36, CPT, ACS and FABP have been shown to be related to obesity, either by regulating enzymatic activity or by directly influencing the fatty acid oxidation process. PMID:27127449

  3. Sequence polymorphism of GroEL gene in natural population of Bacillus and Brevibacillus spp. that showed variation in thermal tolerance capacity and mRNA expression.

    PubMed

    Sen, R; Tripathy, S; Padhi, S K; Mohanty, S; Maiti, N K

    2014-10-01

    GroEL, a class I chaperonin, plays an important role in the thermal adaptation of the cell and helps to maintain the viability of the cell under heat shock conditions. The function of GroEL in vivo depends on the maintenance of proper structure of the protein, which in turn depends on the nucleotide and amino acid sequence of the gene. In this study, we investigated the changes in nucleotide and amino acid sequences of the partial groEL gene that may affect the thermotolerance capacity as well as mRNA expression of bacterial isolates. Sequences among the same species having differences at the amino acid level were identified as different alleles. The effect of allelic variation on groEL gene expression was analyzed by comparison and relative quantification in each allele under thermal shock conditions by RT-PCR. Evaluation of the Ka/Ks ratio among the strains of the same species showed that the groEL gene of all the species had undergone similar functional constraint during evolution. The strains showing similar thermotolerance capacity were found to carry the same allele of the groEL gene. The isolates carrying an allele having an amino acid substitution inside the highly ATP/ADP or Mg(2+)-binding region could not tolerate thermal stress and showed lower expression of the groEL gene. Our results indicate that during evolution of these bacterial species the groEL gene has undergone the process of natural selection, and the isolates have evolved with the groEL allelic sequences that help them to withstand the thermal stress during their interaction with the environment. PMID:24894903

  4. Conservation of Shannon's redundancy for proteins. [information theory applied to amino acid sequences

    NASA Technical Reports Server (NTRS)

    Gatlin, L. L.

    1974-01-01

    Concepts of information theory are applied to examine various proteins in terms of their redundancy in natural originators such as animals and plants. The Monte Carlo method is used to derive information parameters for random protein sequences. Real protein sequence parameters are compared with the standard parameters of protein sequences having a specific length. The tendency of a chain to contain some amino acids more frequently than others and the tendency of a chain to contain certain amino acid pairs more frequently than other pairs are used as randomness measures of individual protein sequences. Non-periodic proteins are generally found to have random Shannon redundancies except in cases of constraints due to short chain length and genetic codes. Redundant characteristics of highly periodic proteins are discussed. A degree of periodicity parameter is derived.
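
    As a rough illustration of the kind of measure described above, the sketch below computes the first-order Shannon entropy of an amino acid sequence and the corresponding redundancy R = 1 - H/H_max with H_max = log2(20). This covers only the single-residue frequency bias; the pair-frequency measure mentioned in the abstract is not implemented here, and the function name and the choice of a 20-letter alphabet are assumptions for this example.

```python
import math
from collections import Counter


def shannon_redundancy(sequence, alphabet_size=20):
    """Return (H, R) for a protein sequence.

    H is the first-order Shannon entropy in bits per residue and
    R = 1 - H / log2(alphabet_size) is the redundancy relative to a
    uniform distribution over the alphabet (an assumption of this sketch).
    """
    if not sequence:
        raise ValueError("sequence must not be empty")
    counts = Counter(sequence)
    total = len(sequence)
    entropy = -sum((c / total) * math.log2(c / total) for c in counts.values())
    h_max = math.log2(alphabet_size)
    return entropy, 1.0 - entropy / h_max
```

    A Monte Carlo baseline of the sort the abstract describes could be approximated by shuffling or randomly generating sequences of the same length and comparing their redundancy with that of the real protein.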

  5. Paleosecular Variation Study on a Pliocene Lava Flow Sequence in the Lesser Caucasus

    NASA Astrophysics Data System (ADS)

    Caccavari, A.; Calvo-Rathert, M.; Gogichaishvili, A.; Huaiyu, H.; Vashakidze, G.; Vegas, N.; Aguilar, B.

    2013-05-01

    A paleomagnetic and rock magnetic study was carried out on 39 successive Pliocene lava flows from the Saro sequence, which is located in the Djavakheti Highland in the Lesser Caucasus in Georgia. Previous K-Ar dating carried out by Lebedev et al. (Stratigraphy and Geological Correlation, 2008, Vol. 16, No.2, 204-224) yielded an age of 2.2 Ma for the sequence. For the present study a new Ar-Ar dating has been performed on samples from the lower and the upper part of the section. Rock magnetism experiments were carried out to characterize the carriers of remanence and obtain information about their stability. Thermomagnetic experiments show that titanomagnetite with differing titanium content is the main carrier in the 39 lava flows. Analysis of hysteresis parameters suggests that the grain size of most studied samples corresponds to pseudo single-domain particles, which can also be interpreted in terms of a mixture of single-domain and multi-domain grains. Paleomagnetic experiments reveal in all flows only a single paleomagnetic component with reverse polarity, D= 205.6°, I= -60.7° (α95= 2.0°, k= 129.6), and the calculated paleomagnetic pole yields a longitude λ= 123.1° and a latitude of 71.1° (α95= 2.8°, k= 72.1). The angular distance between the Pliocene paleomagnetic pole obtained in this work and the expected one is 17°. With the purpose of analysing the behaviour of paleosecular variation (PSV), the scatter of virtual geomagnetic poles was calculated and a value SB = 12.9, with an upper confidence limit Sup= 14.28 and a lower confidence limit Slow= 10.45, was obtained. This result is lower than predicted by specific models for VGP dispersion at 41°N.
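
    The two quantities quoted above, the angular distance between poles and the scatter of virtual geomagnetic poles (VGPs), can be sketched as follows. This is a simplified illustration: it computes the total angular dispersion about a given mean pole and does not apply the within-site correction or cutoff criteria that a published SB value normally involves. Function names and argument order are assumptions for this example.

```python
import math


def angular_distance_deg(lat1, lon1, lat2, lon2):
    """Great-circle angle in degrees between two points given in degrees."""
    p1, l1, p2, l2 = map(math.radians, (lat1, lon1, lat2, lon2))
    cos_d = (math.sin(p1) * math.sin(p2)
             + math.cos(p1) * math.cos(p2) * math.cos(l1 - l2))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_d))))


def vgp_scatter_deg(vgps, mean_pole):
    """Total VGP angular dispersion S = sqrt(sum(delta_i^2) / (N - 1)).

    vgps is a list of (lat, lon) pairs and mean_pole a (lat, lon) pair,
    all in degrees. No within-site correction is applied (simplification).
    """
    n = len(vgps)
    if n < 2:
        raise ValueError("need at least two VGPs")
    mean_lat, mean_lon = mean_pole
    sum_sq = sum(angular_distance_deg(lat, lon, mean_lat, mean_lon) ** 2
                 for lat, lon in vgps)
    return math.sqrt(sum_sq / (n - 1))
```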

  6. Structural variation detection using next-generation sequencing data: A comparative technical review.

    PubMed

    Guan, Peiyong; Sung, Wing-Kin

    2016-06-01

    Structural variations (SVs) are mutations in the genome of size at least fifty nucleotides. They contribute to the phenotypic differences among healthy individuals, cause severe diseases and even cancers by breaking or linking genes. Thus, it is crucial to systematically profile SVs in the genome. In the past decade, many next-generation sequencing (NGS)-based SV detection methods have been proposed due to the significant cost reduction of NGS experiments and their ability to unbiasedly detect SVs to the base-pair resolution. These SV detection methods vary in both sensitivity and specificity, since they use different SV-property-dependent and library-property-dependent features. As a result, predictions from different SV callers are often inconsistent. Besides, noise in the data (both platform-specific sequencing error and artificial chimeric reads) impedes the specificity of SV detection. Poorly characterized regions in the human genome (e.g., repeat regions) greatly impact read mapping and in turn affect the SV calling accuracy. Calling of complex SVs requires specialized SV callers. Apart from accuracy, the processing speed of an SV caller is another factor deciding its usability. Knowing the pros and cons of different SV calling techniques and the objectives of the biological study is essential for biologists and bioinformaticians to make informed decisions. This paper describes different components in the SV calling pipeline and reviews the techniques used by existing SV callers. Through simulation study, we also demonstrate that library properties, especially insert size, greatly impact the sensitivity of different SV callers. We hope the community can benefit from this work both in designing new SV calling methods and in selecting the appropriate SV caller for specific biological studies. PMID:26845461

  7. Sequence variation and differential splicing of the midgut cadherin gene in Trichoplusia ni.

    PubMed

    Zhang, Xin; Kain, Wendy; Wang, Ping

    2013-08-01

    The insect midgut cadherin serves as an important receptor for the Cry toxins from Bacillus thuringiensis (Bt). Variation of the cadherin in insect populations provides a genetic potential for development of cadherin-based Bt resistance in insect populations. Sequence analysis of the cadherin from the cabbage looper, Trichoplusia ni, together with cadherins from 18 other lepidopterans showed a similar phylogenetic relationship of the cadherins to the phylogeny of Lepidoptera. The midgut cadherin in three laboratory populations of T. ni exhibited high variability, although the resistance to Bt toxin Cry1Ac in the T. ni strain is not genetically associated with cadherin gene mutations. A total of 142 single nucleotide polymorphisms (SNPs) were identified in the cadherin cDNAs from the T. ni strains, including 20 missense mutations. In addition, insertion and deletion polymorphisms (indels) were also identified in the cadherin alleles in T. ni. More interestingly, the results from this study reveal that differential splicing of mRNA also occurs in the cadherin gene expression. Therefore, variation of the midgut cadherin in insects may not only be caused by cadherin gene mutations, but could also result from alternative splicing of its mRNA regulated by factors acting in trans. Analysis of cadherin gene alleles in F2, F3 and F4 progenies from the cross between the Cry1Ac resistant and the susceptible strain after consecutive selections with Cry1Ac for three generations showed that selection with Cry1Ac did not result in an increase of frequencies of the cadherin alleles originated from the resistant strain. PMID:23743444

  8. Protein location prediction using atomic composition and global features of the amino acid sequence

    SciTech Connect

    Cherian, Betsy Sheena; Nair, Achuthsankar S.

    2010-01-22

    The subcellular location of a protein is constructive information in determining its function, screening for drug candidates, vaccine design, annotation of gene products and in selecting relevant proteins for further studies. Computational prediction of subcellular localization deals with predicting the location of a protein from its amino acid sequence. For a computational localization prediction method to be more accurate, it should exploit all possible relevant biological features that contribute to the subcellular localization. In this work, we extracted the biological features from the full length protein sequence to incorporate more biological information. A new biological feature, distribution of atomic composition, is used together with multiple physicochemical properties, amino acid composition, three-part amino acid composition, and sequence similarity for predicting the subcellular location of the protein. Support Vector Machines are designed for four modules and prediction is made by a weighted voting system. Our system makes predictions with accuracies of 100, 82.47 and 88.81 for the self-consistency test, jackknife test and independent data test, respectively. Our results provide evidence that the prediction based on the biological features derived from the full length amino acid sequence gives better accuracy than those derived from the N-terminal alone. Considering the features as a distribution within the entire sequence will bring out the underlying property distribution in greater detail to enhance the prediction accuracy.
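
    The weighted voting step mentioned in this abstract can be illustrated with a minimal sketch. The abstract does not name the four modules or give their weights, so the module names, labels and weights below are hypothetical and chosen only for illustration; this is not the authors' implementation.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Return the label with the largest total weight.

    predictions: dict mapping module name -> predicted label
    weights:     dict mapping module name -> non-negative weight
    """
    totals = defaultdict(float)
    for module, label in predictions.items():
        totals[label] += weights[module]
    # Iterate over sorted labels so that ties break alphabetically.
    return max(sorted(totals), key=lambda label: totals[label])


# Hypothetical module names, labels and weights (assumptions, not from the paper).
predictions = {
    "atomic_composition": "nucleus",
    "physicochemical": "cytoplasm",
    "amino_acid_composition": "nucleus",
    "sequence_similarity": "cytoplasm",
}
weights = {
    "atomic_composition": 0.30,
    "physicochemical": 0.15,
    "amino_acid_composition": 0.20,
    "sequence_similarity": 0.25,
}
winner = weighted_vote(predictions, weights)
```

    Here the two modules voting "nucleus" carry a combined weight of 0.50 against 0.40 for "cytoplasm", so "nucleus" is returned even though the vote count is tied.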

  9. Effective normalization for copy number variation detection from whole genome sequencing

    PubMed Central

    2012-01-01

    Background Whole genome sequencing enables a high resolution view of the human genome and provides unique insights into genome structure at an unprecedented scale. There have been a number of tools to infer copy number variation in the genome. These tools, while validated, also include a number of parameters that are configurable to genome data being analyzed. These algorithms allow for normalization to account for individual and population-specific effects on individual genome CNV estimates but the impact of these changes on the estimated CNVs is not well characterized. We evaluate in detail the effect of normalization methodologies in two CNV algorithms FREEC and CNV-seq using whole genome sequencing data from 8 individuals spanning four populations. Methods We apply FREEC and CNV-seq to a sequencing data set consisting of 8 genomes. We use multiple configurations corresponding to different read-count normalization methodologies in FREEC, and statistically characterize the concordance of the CNV calls between FREEC configurations and the analogous output from CNV-seq. The normalization methodologies evaluated in FREEC are: GC content, mappability and control genome. We further stratify the concordance analysis within genic, non-genic, and a collection of validated variant regions. Results The GC content normalization methodology generates the highest number of altered copy number regions. Both mappability and control genome normalization reduce the total number and length of copy number regions. Mappability normalization yields Jaccard indices in the 0.07 - 0.3 range, whereas using a control genome normalization yields Jaccard index values around 0.4 with normalization based on GC content. The most critical impact of using mappability as a normalization factor is substantial reduction of deletion CNV calls. The output of another method based on control genome normalization, CNV-seq, resulted in comparable CNV call profiles, and substantial agreement in variable
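
    The Jaccard index used above to compare CNV call sets can be sketched as follows. The abstract does not state how overlap was measured, so this example assumes a base-pair-level definition on a single chromosome, with calls given as half-open (start, end) intervals; the function names and coordinates are hypothetical.

```python
def merge(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered_length(intervals):
    """Number of base pairs covered by at least one interval."""
    return sum(end - start for start, end in merge(intervals))


def jaccard(calls_a, calls_b):
    """Base-pair Jaccard index: |A intersect B| / |A union B|."""
    union = covered_length(list(calls_a) + list(calls_b))
    if union == 0:
        return 0.0
    # Inclusion-exclusion: |A intersect B| = |A| + |B| - |A union B|.
    intersection = covered_length(calls_a) + covered_length(calls_b) - union
    return intersection / union


# Hypothetical call sets from two normalization configurations.
config_a = [(100, 200), (500, 600)]
config_b = [(150, 250)]
score = jaccard(config_a, config_b)  # 50 shared bp out of 250 covered bp
```

    A value of 1 means the two call sets cover exactly the same bases and 0 means they share none, which gives a sense of how low the reported 0.07 to 0.4 values are.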

  10. Ab initio detection of fuzzy amino acid tandem repeats in protein sequences

    PubMed Central

    2012-01-01

    Background Tandem repetitions within protein amino acid sequences often correspond to regular secondary structures and form multi-repeat 3D assemblies of varied size and function. Developing internal repetitions is one of the evolutionary mechanisms that proteins employ to adapt their structure and function under evolutionary pressure. While there is keen interest in understanding such phenomena, detection of repeating structures based only on sequence analysis is considered an arduous task, since structure and function are often preserved even under considerable sequence divergence (fuzzy tandem repeats). Results In this paper we present PTRStalker, a new algorithm for ab initio detection of fuzzy tandem repeats in protein amino acid sequences. In the reported results we show that by feeding PTRStalker with amino acid sequences from the UniProtKB/Swiss-Prot database we detect novel tandemly repeated structures not captured by other state-of-the-art tools. Experiments with membrane proteins indicate that PTRStalker can detect global symmetries in the primary structure which are then reflected in the tertiary structure. Conclusions PTRStalker is able to detect fuzzy tandem repeating structures in protein sequences, with performance beyond the current state of the art. Such a tool may be a valuable support to investigating protein structural properties when tertiary X-ray data is not available. PMID:22536906

  11. Multimodal phylogeny for taxonomy: integrating information from nucleotide and amino acid sequences.

    PubMed

    Bicego, Manuele; Dellaglio, Franco; Felis, Giovanna E

    2007-10-01

    The crucial role played by the analysis of microbial diversity in biotechnology-based innovations has increased the interest in the microbial taxonomy research area. Phylogenetic sequence analyses have contributed significantly to the advances in this field, also in view of the large amount of sequence data collected in recent years. Phylogenetic analyses could be realized on the basis of protein-encoding nucleotide sequences or encoded amino acid molecules: these two mechanisms present different peculiarities, while starting from two alternative representations of the same information. This complementarity could be exploited to achieve a multimodal phylogenetic scheme that is able to integrate gene and protein information in order to realize a single final tree. This aspect has been poorly addressed in the literature. In this paper, we propose to integrate the two phylogenetic analyses using basic schemes derived from the multimodality fusion theory (or multiclassifier systems theory), a well-founded and rigorous branch whose power has already been demonstrated in other pattern recognition contexts. The proposed approach could be applied to distance matrix-based phylogenetic techniques (like neighbor joining), resulting in a smart and fast method. The proposed methodology has been tested in a real case involving sequences of some species of lactic acid bacteria. With this dataset, both nucleotide sequence- and amino acid sequence-based phylogenetic analyses present some drawbacks, which are overcome with the multimodal analysis. PMID:17933011
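
    One simple way to combine a nucleotide-based and an amino acid-based distance matrix before tree building is a weighted average of the two matrices after rescaling. The abstract does not specify which fusion schemes were used, so the sketch below is an assumption, not the authors' method; the function names, the max-scaling step and the toy distances are all hypothetical.

```python
def normalize(matrix):
    """Scale a distance matrix so that its largest entry equals 1."""
    largest = max(max(row) for row in matrix)
    if largest == 0:
        return [row[:] for row in matrix]
    return [[value / largest for value in row] for row in matrix]


def fuse(dist_nt, dist_aa, weight_nt=0.5):
    """Weighted average of two rescaled distance matrices (assumed scheme)."""
    if len(dist_nt) != len(dist_aa):
        raise ValueError("matrices must describe the same set of taxa")
    a, b = normalize(dist_nt), normalize(dist_aa)
    n = len(a)
    return [
        [weight_nt * a[i][j] + (1 - weight_nt) * b[i][j] for j in range(n)]
        for i in range(n)
    ]


# Toy pairwise distances for three taxa (made-up numbers).
dist_nt = [[0, 2, 4], [2, 0, 6], [4, 6, 0]]
dist_aa = [[0, 0.1, 0.2], [0.1, 0, 0.2], [0.2, 0.2, 0]]
fused = fuse(dist_nt, dist_aa)
```

    The fused matrix could then be passed to any distance-based tree builder such as neighbor joining, which is not implemented here.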

  12. Cleavage of nucleic acids

    DOEpatents

    Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.

    2007-12-11

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.

  13. Cleavage of nucleic acids

    SciTech Connect

    Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.

    2010-11-09

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.

  14. Cleavage of nucleic acids

    DOEpatents

    Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.

    2000-01-01

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.

  15. Nucleic acid detection assays

    DOEpatents

    Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann; Dahlberg, James E.

    2005-04-05

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.

  16. Nucleic acid detection compositions

    DOEpatents

    Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann; Dahlberg, James E.

    2008-08-05

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.

  17. The amino-acid sequence of leghemoglobin component a from Phaseolus vulgaris (kidney bean).

    PubMed

    Lehtovaara, P; Ellfolk, N

    1975-06-01

    1. Leghemoglobin component a from Phaseolus vulgaris (kidney bean) was digested with trypsin; 15 tryptic peptides and free lysine were purified and the amino acid sequences of the peptides determined. 2. The internal order of the tryptic peptides was determined by the bridge peptides obtained from the thermolytic digest and the dilute acid hydrolyzate of kidney bean leghemoglobin a; 12 thermolytic peptides and two acid hydrolysis peptides were purified and the sequences were partially or completely determined. 3. The complete amino acid sequence of kidney bean leghemoglobin a is compared to that of leghemoglobin a from soybean (Glycine max) and to some animal globins. As regards sequence, the kidney bean globin has 79% identity with the soybean globin and 21% identity with human hemoglobin gamma-chain. Seven of the 14 amino acid residues common to most globins are found in the kidney bean globin. Trp-15 and Tyr-145 are evolutionarily conserved in this globin, which confirms the concept of a common origin of animal and plant globins. PMID:809270
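
    The percent identity figures quoted in this abstract (79% and 21%) are the kind of value obtained by counting matching positions in a pairwise alignment. A minimal sketch follows, assuming the two sequences are already aligned and that columns containing a gap are excluded from the denominator; conventions for the denominator vary, and the abstract does not say which one was used. The toy sequences are invented and are not leghemoglobin fragments.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over aligned, gap-free columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a == b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


# Invented toy alignment: 7 gap-free columns, 6 of them identical.
identity = percent_identity("GALTE-SQ", "GAFTEKSQ")
```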

  18. Studies on fatty acid-binding proteins. The diurnal variation shown by rat liver fatty acid-binding protein.

    PubMed Central

    Wilkinson, T C; Wilton, D C

    1987-01-01

    The concentration of fatty acid-binding protein in rat liver was examined by SDS/polyacrylamide-gel electrophoresis, by Western blotting and by quantifying the fluorescence enhancement achieved on the binding of the fluorescent probe 11-(dansylamino)undecanoic acid. A 2-3-fold increase in the concentration of this protein produced by treatment of rats with the peroxisome proliferator tiadenol was readily detected; however, only a small variation in the concentration of the protein due to a diurnal rhythm was observed. This result contradicts the 7-10-fold variation previously reported for this protein [Hargis, Olson, Clarke & Dempsey (1986) J. Biol. Chem. 261, 1988-1991]. PMID:3593284

  19. Genetic variation of Sargassum horneri populations detected by inter-simple sequence repeats.

    PubMed

    Ren, J R; Yang, R; He, Y Y; Sun, Q H

    2015-01-01

    The seaweed Sargassum horneri is an important brown alga in the marine environment, and it is an important raw material in the alginate industry. Unfortunately, the fixed resource that was originally reported has now declined or disappeared, and increased floating populations have been reported in recent years. We sampled a floating population and 4 fixed cultivated populations of S. horneri along the coast of Zhejiang, China. Inter-simple sequence repeat (ISSR) markers were applied in this research to analyze the genetic variation between floating populations and fixed cultivated populations of S. horneri. In total, 220 loci were amplified with 23 ISSR primers. The percentage of polymorphic loci within each population ranged from 53.64 to 95.45%. The highest diversity was observed in population 3, which was the local species that was suspension cultured in the lab and then fixed cultivated in the Nanji Islands before sampling. The lowest diversity was obtained in the floating population 4. The genetic distances among the 5 S. horneri populations ranged from 0.0819 to 0.2889, and the distance tendency confirmed the genetic diversity. The results suggest that the floating population had the lowest genetic diversity and could not be joined into the cluster branch of the fixed cultivated populations. PMID:25729997

  20. Structural variation discovery in the cancer genome using next generation sequencing: computational solutions and perspectives.

    PubMed

    Liu, Biao; Conroy, Jeffrey M; Morrison, Carl D; Odunsi, Adekunle O; Qin, Maochun; Wei, Lei; Trump, Donald L; Johnson, Candace S; Liu, Song; Wang, Jianmin

    2015-03-20

    Somatic Structural Variations (SVs) are a complex collection of chromosomal mutations that could directly contribute to carcinogenesis. Next Generation Sequencing (NGS) technology has emerged as the primary means of interrogating the SVs of the cancer genome in recent investigations. Sophisticated computational methods are required to accurately identify the SV events and delineate their breakpoints from the massive amounts of reads generated by a NGS experiment. In this review, we provide an overview of current analytic tools used for SV detection in NGS-based cancer studies. We summarize the features of common SV groups and the primary types of NGS signatures that can be used in SV detection methods. We discuss the principles and key similarities and differences of existing computational programs and comment on unresolved issues related to this research field. The aim of this article is to provide a practical guide of relevant concepts, computational methods, software tools and important factors for analyzing and interpreting NGS data for the detection of SVs in the cancer genome. PMID:25849937

  1. Structural variation discovery in the cancer genome using next generation sequencing: Computational solutions and perspectives

    PubMed Central

    Liu, Biao; Conroy, Jeffrey M.; Morrison, Carl D.; Odunsi, Adekunle O.; Qin, Maochun; Wei, Lei; Trump, Donald L.; Johnson, Candace S.; Liu, Song; Wang, Jianmin

    2015-01-01

    Somatic Structural Variations (SVs) are a complex collection of chromosomal mutations that could directly contribute to carcinogenesis. Next Generation Sequencing (NGS) technology has emerged as the primary means of interrogating the SVs of the cancer genome in recent investigations. Sophisticated computational methods are required to accurately identify the SV events and delineate their breakpoints from the massive amounts of reads generated by a NGS experiment. In this review, we provide an overview of current analytic tools used for SV detection in NGS-based cancer studies. We summarize the features of common SV groups and the primary types of NGS signatures that can be used in SV detection methods. We discuss the principles and key similarities and differences of existing computational programs and comment on unresolved issues related to this research field. The aim of this article is to provide a practical guide of relevant concepts, computational methods, software tools and important factors for analyzing and interpreting NGS data for the detection of SVs in the cancer genome. PMID:25849937

  2. Draft genome sequence of the docosahexaenoic acid producing thraustochytrid Aurantiochytrium sp. T66.

    PubMed

    Liu, Bin; Ertesvåg, Helga; Aasen, Inga Marie; Vadstein, Olav; Brautaset, Trygve; Heggeset, Tonje Marita Bjerkan

    2016-06-01

    Thraustochytrids are unicellular, marine protists, and there is a growing industrial interest in these organisms, particularly because some species, including strains belonging to the genus Aurantiochytrium, accumulate high levels of docosahexaenoic acid (DHA). Here, we report the draft genome sequence of Aurantiochytrium sp. T66 (ATCC PRA-276), with a size of 43 Mbp, and 11,683 predicted protein-coding sequences. The data has been deposited at DDBJ/EMBL/Genbank under the accession LNGJ00000000. The genome sequence will contribute new insight into DHA biosynthesis and regulation, providing a basis for metabolic engineering of thraustochytrids. PMID:27222814

  3. A classification of glycosyl hydrolases based on amino acid sequence similarities.

    PubMed Central

    Henrissat, B

    1991-01-01

    The amino acid sequences of 301 glycosyl hydrolases and related enzymes have been compared. A total of 291 sequences corresponding to 39 EC entries could be classified into 35 families. Only ten sequences (less than 5% of the sample) could not be assigned to any family. With the sequences available for this analysis, 18 families were found to be monospecific (containing only one EC number) and 17 were found to be polyspecific (containing at least two EC numbers). Implications on the folding characteristics and mechanism of action of these enzymes and on the evolution of carbohydrate metabolism are discussed. With the steady increase in sequence and structural data, it is suggested that the enzyme classification system should perhaps be revised. PMID:1747104

  4. New families in the classification of glycosyl hydrolases based on amino acid sequence similarities.

    PubMed Central

    Henrissat, B; Bairoch, A

    1993-01-01

    301 glycosyl hydrolases and related enzymes corresponding to 39 EC entries of the I.U.B. classification system have been classified into 35 families on the basis of amino-acid-sequence similarities [Henrissat (1991) Biochem. J. 280, 309-316]. Approximately half of the families were found to be monospecific (containing only one EC number), whereas the other half were found to be polyspecific (containing at least two EC numbers). A > 60% increase in sequence data for glycosyl hydrolases (181 additional enzymes or enzyme domains sequences have since become available) allowed us to update the classification not only by the addition of more members to already identified families, but also by the finding of ten new families. On the basis of a comparison of 482 sequences corresponding to 52 EC entries, 45 families, out of which 22 are polyspecific, can now be defined. This classification has been implemented in the SWISS-PROT protein sequence data bank. PMID:8352747

  5. Sequence-specific purification of nucleic acids by PNA-controlled hybrid selection.

    PubMed

    Orum, H; Nielsen, P E; Jørgensen, M; Larsson, C; Stanley, C; Koch, T

    1995-09-01

    Using an oligohistidine peptide nucleic acid (oligohistidine-PNA) chimera, we have developed a rapid hybrid selection method that allows efficient, sequence-specific purification of a target nucleic acid. The method exploits two fundamental features of PNA. First, that PNA binds with high affinity and specificity to its complementary nucleic acid. Second, that amino acids are easily attached to the PNA oligomer during synthesis. We show that a (His)6-PNA chimera exhibits strong binding to chelated Ni2+ ions without compromising its native PNA hybridization properties. We further show that these characteristics allow the (His)6-PNA/DNA complex to be purified by the well-established method of metal ion affinity chromatography using a Ni2+-NTA (nitrilotriacetic acid) resin. Specificity and efficiency are the touchstones of any nucleic acid purification scheme. We show that the specificity of the (His)6-PNA selection approach is such that oligonucleotides differing by only a single nucleotide can be selectively purified. We also show that large RNAs (2224 nucleotides) can be captured with high efficiency by using multiple (His)6-PNA probes. PNA can hybridize to nucleic acids in low-salt concentrations that destabilize native nucleic acid structures. We demonstrate that this property of PNA can be utilized to purify an oligonucleotide in which the target sequence forms part of an intramolecular stem/loop structure. PMID:7495562

  6. In silico comparative analysis of DNA and amino acid sequences for prion protein gene.

    PubMed

    Kim, Y; Lee, J; Lee, C

    2008-01-01

    Genetic variability might contribute to species specificity of prion diseases in various organisms. In this study, structures of the prion protein gene (PRNP) and its amino acids were compared among species for which sequence data were available. Comparisons of PRNP DNA sequences among 12 species including human, chimpanzee, monkey, bovine, ovine, dog, mouse, rat, wallaby, opossum, chicken and zebrafish allowed us to identify candidate regulatory regions in intron 1 and the 3'-untranslated region (UTR) in addition to the coding region. Highly conserved putative binding sites for transcription factors, such as heat shock factor 2 (HSF2) and myocyte enhancer factor 2 (MEF2), were discovered in intron 1. In the 3'-UTR, the functional sequence (ATTAAA) for nucleus-specific polyadenylation was found in all the analysed species. The functional sequence (TTTTTAT) for maturation-specific polyadenylation was identically observed only in ovine, with one or two nucleotide mismatches in the other species. A comparison of the amino acid sequences in 53 species revealed a large sequence identity. In particular, the octapeptide repeat region was observed in all the species but frog and zebrafish. Functional changes and susceptibility to prion diseases with various isoforms of prion protein could be caused by numeric variability and conformational changes discovered in the repeat sequences. PMID:18397498

  7. Genotypic variation in phenolic acids, vitamin E and fatty acids in whole grain rice.

    PubMed

    Yu, Lilei; Li, Guanglei; Li, Mei; Xu, Feifei; Beta, Trust; Bao, Jinsong

    2016-04-15

    The genetic diversity of phenolic content (PC), individual phenolic acids, vitamin E isomers (VE) and fatty acids (FA) in whole grain rice was investigated. The most abundant phenolic acid was ferulic acid ranging from 155.6 to 271.1 μg/g and comprising approximately 40-57% of total phenolic acid (TPA). The predominant tocopherols (T) and tocotrienols (T3) were α-T (6.43-12.67 μg/g) and γ-T3 (12.88-32.75 μg/g). The unsaturated fractions of FAs accounted for 74-81% of the total FAs in rice. Most of the phytochemicals among phenolics and VEs showed significant differences between white and red rice, with red rice demonstrating significantly higher levels. However, white rice had higher content of oleic, linolenic, eicosenoic and total fatty acids than red rice. The wide genetic diversity in whole grain rice allows food processors to have a good selection for producing rice products, and breeders to have new rice lines that can be bred for high nutrient levels. PMID:26617016

  8. Sequence variation in three mitochondrial DNA genes among isolates of Ascaridia galli originating from Guangdong, Hunan and Yunnan provinces, China.

    PubMed

    Li, J Y; Liu, G H; Wang, Y; Song, H Q; Lin, R Q; Zou, F C; Liu, W; Xu, M J; Zhu, X Q

    2013-09-01

    The present study examined sequence variation in three mitochondrial DNA (mtDNA) genes, namely cytochrome c oxidase subunit 3 (cox3) and NADH dehydrogenase subunits 1 and 4 (nad1 and nad4), among Ascaridia galli isolates from different geographical localities in China. Portions of the cox3 (pcox3), nad1 (pnad1) and nad4 (pnad4) genes were amplified by polymerase chain reaction (PCR) separately from adult A. galli individuals and the amplicons were subjected to sequencing from both directions. The lengths of the sequences of pcox3, pnad1 and pnad4 were 408 bp, 471 bp and 333 bp, respectively. The intraspecific sequence variations within A. galli were 0-1.7% for pcox3, 0-2.8% for pnad1 and 0-3.4% for pnad4. The A+T contents of the sequences were 67.16-67.65% (pcox3), 67.09-67.94% (pnad1) and 69.91-71.77% (pnad4). The interspecific sequence differences among members of the Ascaridida were significantly higher, being 13.2-30.9%, 12.8-29.0% and 15.1-34.1% for pcox3, pnad1 and pnad4, respectively. Phylogenetic analyses using combined sequences of pcox3, pnad1 and pnad4, with three different computational algorithms (Bayesian analysis, maximum likelihood and maximum parsimony), all revealed distinct groups with high statistical support. These findings demonstrated the existence of intraspecific variation in mtDNA sequences among A. galli isolates from different geographical regions in China, and have implications for studying molecular epidemiology and population genetics of A. galli. PMID:23046568
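
    The two summary statistics reported in this abstract, A+T content and percent sequence difference, can be computed as in the sketch below. It assumes equal-length, ungapped sequences and a simple site-by-site comparison, which may differ from the software the authors used. The sequence is synthetic, built only to have the 408 bp length reported for pcox3.

```python
def at_content(seq):
    """Percentage of A and T bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return 100.0 * sum(base in "AT" for base in seq) / len(seq)


def percent_difference(seq_a, seq_b):
    """Percentage of sites that differ between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have equal length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    diffs = sum(a != b for a, b in zip(seq_a.upper(), seq_b.upper()))
    return 100.0 * diffs / len(seq_a)


# Synthetic 408 bp sequence (not real data): 274 A/T bases and 134 G bases.
synthetic = "A" * 274 + "G" * 134
at_percent = round(at_content(synthetic), 2)

# A copy with 7 substituted sites at the G-rich end.
variant = synthetic[:-7] + "C" * 7
difference = round(percent_difference(synthetic, variant), 1)
```

    With these synthetic inputs, 274 A/T bases out of 408 rounds to 67.16% and 7 differing sites out of 408 rounds to 1.7%, the same magnitudes as the pcox3 figures quoted above.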

  9. Modeling kinetic rate variation in third generation DNA sequencing data to detect putative modifications to DNA bases.

    PubMed

    Schadt, Eric E; Banerjee, Onureena; Fang, Gang; Feng, Zhixing; Wong, Wing H; Zhang, Xuegong; Kislyuk, Andrey; Clark, Tyson A; Luong, Khai; Keren-Paz, Alona; Chess, Andrew; Kumar, Vipin; Chen-Plotkin, Alice; Sondheimer, Neal; Korlach, Jonas; Kasarskis, Andrew

    2013-01-01

    Current generation DNA sequencing instruments are moving closer to seamlessly sequencing genomes of entire populations as a routine part of scientific investigation. However, while significant inroads have been made identifying small nucleotide variation and structural variations in DNA that impact phenotypes of interest, progress has not been as dramatic regarding epigenetic changes and base-level damage to DNA, largely due to technological limitations in assaying all known and unknown types of modifications at genome scale. Recently, single-molecule real time (SMRT) sequencing has been reported to identify kinetic variation (KV) events that have been demonstrated to reflect epigenetic changes of every known type, providing a path forward for detecting base modifications as a routine part of sequencing. However, to date no statistical framework has been proposed to enhance the power to detect these events while also controlling for false-positive events. By modeling enzyme kinetics in the neighborhood of an arbitrary location in a genomic region of interest as a conditional random field, we provide a statistical framework for incorporating kinetic information at a test position of interest as well as at neighboring sites that help enhance the power to detect KV events. The performance of this and related models is explored, with the best-performing model applied to plasmid DNA isolated from Escherichia coli and mitochondrial DNA isolated from human brain tissue. We highlight widespread kinetic variation events, some of which strongly associate with known modification events, while others represent putative chemically modified sites of unknown types. PMID:23093720

  10. Antibody-specific model of amino acid substitution for immunological inferences from alignments of antibody sequences.

    PubMed

    Mirsky, Alexander; Kazandjian, Linda; Anisimova, Maria

    2015-03-01

    Antibodies are glycoproteins produced by the immune system as a dynamically adaptive line of defense against invading pathogens. Very elegant and specific mutational mechanisms allow B lymphocytes to produce a large and diversified repertoire of antibodies, which is modified and enhanced throughout all adulthood. One of these mechanisms is somatic hypermutation, which stochastically mutates nucleotides in the antibody genes, forming new sequences with different properties and, eventually, higher affinity and selectivity to the pathogenic target. As somatic hypermutation involves fast mutation of antibody sequences, this process can be described using a Markov substitution model of molecular evolution. Here, using large sets of antibody sequences from mice and humans, we infer an empirical amino acid substitution model AB, which is specific to antibody sequences. Compared with existing general amino acid models, we show that the AB model provides significantly better description for the somatic evolution of mice and human antibody sequences, as demonstrated on large next generation sequencing (NGS) antibody data. General amino acid models are reflective of conservation at the protein level due to functional constraints, with most frequent amino acids exchanges taking place between residues with the same or similar physicochemical properties. In contrast, within the variable part of antibody sequences we observed an elevated frequency of exchanges between amino acids with distinct physicochemical properties. This is indicative of a sui generis mutational mechanism, specific to antibody somatic hypermutation. We illustrate this property of antibody sequences by a comparative analysis of the network modularity implied by the AB model and general amino acid substitution models. We recommend using the new model for computational studies of antibody sequence maturation, including inference of alignments and phylogenetic trees describing antibody somatic hypermutation in

  11. Maternal effects and maternal selection arising from variation in allocation of free amino acid to eggs

    PubMed Central

    Newcombe, Devi; Hunt, John; Mitchell, Christopher; Moore, Allen J

    2015-01-01

    Maternal provisioning can have profound effects on offspring phenotypes, or maternal effects, especially early in life. One ubiquitous form of provisioning is in the makeup of the egg. However, only a few studies examine the role of specific egg constituents in maternal effects, especially as they relate to maternal selection (a standardized selection gradient reflecting the covariance between maternal traits and offspring fitness). Here, we report on the evolutionary consequences of differences in maternal acquisition and allocation of amino acids to eggs. We manipulated acquisition by varying maternal diet (milkweed or sunflower) in the large milkweed bug, Oncopeltus fasciatus. Variation in allocation was detected by examining two source populations with different evolutionary histories and life-history response to sunflower as food. We measured amino acid composition in eggs in this 2 × 2 design and found significant effects of source population and maternal diet on egg and nymph mass and of source population, maternal diet, and their interaction on amino acid composition of eggs. We measured significant linear and quadratic maternal selection on offspring mass associated with variation in amino acid allocation. Visualizing the performance surface along the major axes of nonlinear selection and plotting the mean amino acid profile of eggs from each treatment onto the surface revealed a saddle-shaped fitness surface. While maternal selection appears to have influenced how females allocate amino acids, this maternal effect did not evolve equally in the two populations. Furthermore, none of the population means coincided with peak performance. Thus, we found that the composition of free amino acids in eggs was due to variation in both acquisition and allocation, which had significant fitness effects and created selection. However, although there can be an evolutionary response to novel food resources, females may be constrained from reaching phenotypic optima with

  12. Maternal effects and maternal selection arising from variation in allocation of free amino acid to eggs.

    PubMed

    Newcombe, Devi; Hunt, John; Mitchell, Christopher; Moore, Allen J

    2015-06-01

    Maternal provisioning can have profound effects on offspring phenotypes, or maternal effects, especially early in life. One ubiquitous form of provisioning is in the makeup of the egg. However, only a few studies examine the role of specific egg constituents in maternal effects, especially as they relate to maternal selection (a standardized selection gradient reflecting the covariance between maternal traits and offspring fitness). Here, we report on the evolutionary consequences of differences in maternal acquisition and allocation of amino acids to eggs. We manipulated acquisition by varying maternal diet (milkweed or sunflower) in the large milkweed bug, Oncopeltus fasciatus. Variation in allocation was detected by examining two source populations with different evolutionary histories and life-history response to sunflower as food. We measured amino acid composition in eggs in this 2 × 2 design and found significant effects of source population and maternal diet on egg and nymph mass and of source population, maternal diet, and their interaction on amino acid composition of eggs. We measured significant linear and quadratic maternal selection on offspring mass associated with variation in amino acid allocation. Visualizing the performance surface along the major axes of nonlinear selection and plotting the mean amino acid profile of eggs from each treatment onto the surface revealed a saddle-shaped fitness surface. While maternal selection appears to have influenced how females allocate amino acids, this maternal effect did not evolve equally in the two populations. Furthermore, none of the population means coincided with peak performance. Thus, we found that the composition of free amino acids in eggs was due to variation in both acquisition and allocation, which had significant fitness effects and created selection. However, although there can be an evolutionary response to novel food resources, females may be constrained from reaching phenotypic optima with

  13. Computational tools for copy number variation (CNV) detection using next-generation sequencing data: features and perspectives

    PubMed Central

    2013-01-01

    Copy number variation (CNV) is a prevalent form of critical genetic variation that leads to an abnormal number of copies of large genomic regions in a cell. Microarray-based comparative genome hybridization (arrayCGH) and genotyping arrays were the standard technologies for detecting large regions subject to copy number changes in genomes until recently, when next-generation sequencing (NGS) made high-resolution sequence data available for analysis. During the last several years, NGS-based analysis has been widely applied to identify CNVs in both healthy and diseased individuals. Correspondingly, the strong demand for NGS-based CNV analyses has fuelled development of numerous computational methods and tools for CNV detection. In this article, we review the recent advances in computational methods pertaining to CNV detection using whole genome and whole exome sequencing data. Additionally, we discuss their strengths and weaknesses and suggest directions for future development. PMID:24564169
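
    One family of NGS-based CNV methods compares read depth between a sample and a control across genomic windows. The sketch below illustrates that general idea only; it is not taken from the review or from any specific tool it covers. The read counts, the median normalisation, the 0.5 pseudocount, and the log2 thresholds (0.58 for a gain, -1.0 for a loss) are all assumptions chosen for illustration.

```python
import math
import statistics


def call_cnv_windows(sample_depth, control_depth, gain=0.58, loss=-1.0):
    """Label each window 'gain', 'loss', or 'neutral' from its log2 depth ratio.

    The thresholds are illustrative assumptions: log2(1.5) ~ 0.58 for a
    single-copy gain and log2(0.5) = -1.0 for a single-copy loss in a diploid.
    """
    # Median normalisation makes libraries of different depth comparable.
    sample_median = statistics.median(sample_depth)
    control_median = statistics.median(control_depth)
    calls = []
    for s, c in zip(sample_depth, control_depth):
        # A pseudocount of 0.5 avoids log(0) in windows with no reads.
        ratio = ((s + 0.5) / sample_median) / ((c + 0.5) / control_median)
        log2_ratio = math.log2(ratio)
        if log2_ratio >= gain:
            calls.append("gain")
        elif log2_ratio <= loss:
            calls.append("loss")
        else:
            calls.append("neutral")
    return calls


# Hypothetical read counts per window (not data from the review).
sample = [100, 102, 98, 205, 198, 101, 45, 99]
control = [100, 100, 100, 100, 100, 100, 100, 100]
print(call_cnv_windows(sample, control))
```

    Production tools additionally correct for GC content and mappability and merge adjacent windows into segments; those steps are omitted here.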

  14. Ovine mitochondrial DNA sequence variation and its association with production and reproduction traits within an Afec-Assaf flock.

    PubMed

    Reicher, S; Seroussi, E; Weller, J I; Rosov, A; Gootwine, E

    2012-07-01

    Polymorphisms in mitochondrial DNA (mtDNA) protein- and tRNA-coding genes were shown to be associated with various diseases in humans as well as with production and reproduction traits in livestock. Alignment of full-length mitochondrial sequences from the 5 known ovine haplogroups: HA (n = 3), HB (n = 5), HC (n = 3), HD (n = 2), and HE (n = 2; GenBank accession nos. HE577847-50 and 11 published complete ovine mitochondrial sequences) revealed sequence variation in 10 out of the 13 protein coding mtDNA sequences. Twenty-six of the 245 variable sites found in the protein coding sequences represent non-synonymous mutations. Sequence variation was observed also in 8 out of the 22 tRNA mtDNA sequences. On the basis of the mtDNA control region and cytochrome b partial sequences along with information on maternal lineages within an Afec-Assaf flock, 1,126 Afec-Assaf ewes were assigned to mitochondrial haplogroups HA, HB, and HC, with frequencies of 0.43, 0.43, and 0.14, respectively. Analysis of birth weight and growth rate records of lambs (n = 1,286) and productivity from 4,993 lambing records revealed no association between mitochondrial haplogroup affiliation and female longevity, lamb perinatal survival rate, birth weight, and daily growth rate of lambs up to 150 d that averaged 1,664 d, 88.3%, 4.5 kg, and 320 g/d, respectively. However, significant (P < 0.0001) differences among the haplogroups were found for prolificacy of ewes, with prolificacies (mean ± SE) of 2.14 ± 0.04, 2.25 ± 0.04, and 2.30 ± 0.06 lambs born/ewe lambing for the HA, HB, and the HC haplogroups, respectively. Our results highlight the ovine mitogenome genetic variation in protein- and tRNA-coding genes and suggest that sequence variation in ovine mtDNA is associated with variation in ewe prolificacy. PMID:22266988

  15. Amino acid sequence of a vitamin K-dependent Ca2+-binding peptide from bovine prothrombin.

    PubMed

    Howard, J B; Fausch, M D

    1975-08-10

    The amino acid sequence of a 31-residue peptide from bovine prothrombin has been determined. This peptide has been shown to contain the vitamin K-dependent modification required for Ca2+ binding (Nelsestuen, G. L., and Suttie, J. W. (1973) Proc. Natl. Acad. Sci. U. S. A. 70, 3366-3370) and the modified amino acid, gamma-carboxyglutamic acid (Nelsestuen, G. L., Zytkovicz, T., and Howard, J. B. (1974) J. Biol. Chem. 249, 6347-6350). The peptide was shown to correspond to residues 12 to 42 of prothrombin. PMID:807581

  16. Amino acid sequences around the cysteine residues of rabbit muscle triose phosphate isomerase

    PubMed Central

    Miller, Janet C.; Waley, S. G.

    1971-01-01

    1. The nature of the subunits in rabbit muscle triose phosphate isomerase has been investigated. 2. Amino acid analyses show that there are five cysteine residues and two methionine residues/subunit. 3. The amino acid sequences around the cysteine residues have been determined; these account for about 75 residues. 4. Cleavage at the methionine residues with cyanogen bromide gave three fragments. 5. These results show that the subunits correspond to polypeptide chains, containing about 230 amino acid residues. The chains in triose phosphate isomerase seem to be shorter than those of other glycolytic enzymes. PMID:5165707
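
    The link between two methionine residues per subunit and three cyanogen bromide fragments can be sketched as follows. Cyanogen bromide cleaves the peptide bond on the C-terminal side of methionine, so a chain with n internal methionines gives n + 1 fragments. The sequence below is a made-up toy chain, not the triose phosphate isomerase sequence.

```python
def cnbr_fragments(sequence):
    """Split a one-letter protein sequence after every methionine (M)."""
    fragments = []
    current = ""
    for residue in sequence:
        current += residue
        if residue == "M":  # cleavage on the C-terminal side of Met
            fragments.append(current)
            current = ""
    if current:  # trailing fragment that does not end in Met
        fragments.append(current)
    return fragments


# Hypothetical 22-residue chain containing two methionines.
toy_chain = "ACDEFGMHIKLCNPQMRSTVWY"
print(cnbr_fragments(toy_chain))
```

    With two methionines the toy chain splits into three pieces, mirroring the count reported in the abstract.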

  17. Apolipoprotein E Variation at the Sequence Haplotype Level: Implications for the Origin and Maintenance of a Major Human Polymorphism

    PubMed Central

    Fullerton, Stephanie M.; Clark, Andrew G.; Weiss, Kenneth M.; Nickerson, Deborah A.; Taylor, Scott L.; Stengård, Jari H.; Salomaa, Veikko; Vartiainen, Erkki; Perola, Markus; Boerwinkle, Eric; Sing, Charles F.

    2000-01-01

    Three common protein isoforms of apolipoprotein E (apoE), encoded by the ε2, ε3, and ε4 alleles of the APOE gene, differ in their association with cardiovascular and Alzheimer's disease risk. To gain a better understanding of the genetic variation underlying this important polymorphism, we identified sequence haplotype variation in 5.5 kb of genomic DNA encompassing the whole of the APOE locus and adjoining flanking regions in 96 individuals from four populations: blacks from Jackson, MS (n=48 chromosomes), Mayans from Campeche, Mexico (n=48), Finns from North Karelia, Finland (n=48), and non-Hispanic whites from Rochester, MN (n=48). In the region sequenced, 23 sites varied (21 single nucleotide polymorphisms, or SNPs, 1 diallelic indel, and 1 multiallelic indel). The 22 diallelic sites defined 31 distinct haplotypes in the sample. The estimate of nucleotide diversity (site-specific heterozygosity) for the locus was 0.0005±0.0003. Sequence analysis of the chimpanzee APOE gene showed that it was most closely related to human ε4-type haplotypes, differing from the human consensus sequence at 67 synonymous (54 substitutions and 13 indels) and 9 nonsynonymous fixed positions. The evolutionary history of allelic divergence within humans was inferred from the pattern of haplotype relationships. This analysis suggests that haplotypes defining the ε3 and ε2 alleles are derived from the ancestral ε4s and that the ε3 group of haplotypes have increased in frequency, relative to ε4s, in the past 200,000 years. Substantial heterogeneity exists within all three classes of sequence haplotypes, and there are important interpopulation differences in the sequence variation underlying the protein isoforms that may be relevant to interpreting conflicting reports of phenotypic associations with variation in the common protein isoforms. PMID:10986041

  18. Complete amino acid sequence of the Mu heavy chain of a human IgM immunoglobulin.

    PubMed

    Putnam, F W; Florent, G; Paul, C; Shinoda, T; Shimizu, A

    1973-10-19

    The amino acid sequence of the mu chain of a human IgM immunoglobulin, including the location of all disulfide bridges and oligosaccharides, has been determined. The homology of the constant regions of immunoglobulin mu, gamma, alpha, and epsilon heavy chains reveals evolutionary relationships and suggests that two genes code for each heavy chain. PMID:4742735

  19. Draft Genome Sequence of the Butyric Acid Producer Clostridium tyrobutyricum Strain CIP I-776 (IFP923)

    PubMed Central

    Clément, Benjamin; Lopes Ferreira, Nicolas

    2016-01-01

    Here, we report the draft genome sequence of Clostridium tyrobutyricum CIP I-776 (IFP923), an efficient producer of butyric acid. The genome consists of a single chromosome of 3.19 Mb and provides useful data concerning the metabolic capacities of the strain. PMID:26941139

  20. Draft Genome Sequence of Perfluorooctane Acid-Degrading Bacterium Pseudomonas parafulva YAB-1

    PubMed Central

    Tang, Chongjian; Peng, Qingjing; Peng, Qingzhong

    2015-01-01

    Pseudomonas parafulva YAB-1, isolated from perfluorinated compound-contaminated soil, has the ability to degrade perfluorooctane acid (PFOA) compound. Here, we report the draft genome sequence and annotation of the PFOA-degrading bacterium P. parafulva YAB-1. The data provide the basis to investigate the molecular mechanism of PFOA metabolism. PMID:26337877

  1. Laying-sequence-specific variation in yolk oestrogen levels, and relationship to plasma oestrogen in female zebra finches (Taeniopygia guttata)

    PubMed Central

    Williams, Tony D.; Ames, Caroline E.; Kiparissis, Yiannis; Wynne-Edwards, Katherine E.

    2005-01-01

    We investigated the relationship between plasma and yolk oestrogens in laying female zebra finches (Taeniopygia guttata) by manipulating plasma oestradiol (E2) levels, via injection of oestradiol-17β, in a sequence-specific manner to maintain chronically high plasma levels for later-developing eggs (contrasting with the endogenous pattern of decreasing plasma E2 concentrations during laying). We report systematic variation in yolk oestrogen concentrations, in relation to laying sequence, similar to that widely reported for androgenic steroids. In sham-manipulated females, yolk E2 concentrations decreased with laying sequence. However, in E2-treated females plasma E2 levels were higher during the period of rapid yolk development of later-laid eggs, compared with control females. As a consequence, we reversed the laying-sequence-specific pattern of yolk E2: in E2-treated females, yolk E2 concentrations increased with laying-sequence. In general therefore, yolk E2 levels were a direct reflection of plasma E2 levels. However, in control females there was some inter-individual variability in the endogenous pattern of plasma E2 levels through the laying cycle which could generate variation in sequence-specific patterns of yolk hormone levels even if these primarily reflect circulating steroid levels. PMID:15695208

  2. Sequence variation of Bemisia tabaci Chemosensory Protein 2 in cryptic species B and Q: New DNA markers for whitefly recognition.

    PubMed

    Liu, Guo-Xia; Ma, Hong-Mei; Xie, Hong-Yan; Xuan, Ning; Picimbon, Jean-François

    2016-01-15

    Bemisia tabaci Gennadius biotypes B and Q are two of the most important worldwide agricultural insect pests. Genomic sequences of Type-2 B. tabaci chemosensory protein (BtabCSP2) were cloned and sequenced in B and Q biotypes, revealing key biotype-specific variations in the intron sequence. A Q260 sequence was found specifically in Q-BtabCSP2 and Cucumis melo LN692399, suggesting ancestral horizontal gene transfer between the insect and the plant through bacteria. A cleaved amplified polymorphic sequences (CAPS) method was then developed to differentiate B and Q based on exon sequence variation in the BtabCSP2 gene. The performance of CSP2-based CAPS for whitefly recognition was assessed using B. tabaci field collections from Shandong Province (P.R. China). Our SacII-based CAPS method gave the same results as the mitochondrial cytochrome oxidase-based CAPS method in the field collections. We therefore propose an explanation for CSP origin and a new, rapid, and simple molecular method based on genomic DNA and a chemosensory gene to accurately differentiate the B and Q whiteflies of the Bemisia complex around the world. PMID:26481237
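
    A CAPS assay distinguishes alleles by whether a PCR amplicon carries a restriction site, which changes the fragment pattern after digestion. The sketch below predicts fragment lengths for a SacII digest (recognition sequence CCGCGG, top strand cut after CCGC). The two amplicons are invented toy sequences, not BtabCSP2 sequences, and the sketch makes no claim about which biotype carries the site.

```python
SACII_SITE = "CCGCGG"   # SacII recognition sequence
SACII_CUT_OFFSET = 4    # top strand is cut after CCGC (CCGC^GG)


def digest_lengths(amplicon, site=SACII_SITE, cut_offset=SACII_CUT_OFFSET):
    """Return fragment lengths after cutting a linear amplicon at every site."""
    amplicon = amplicon.upper()
    cuts = []
    start = amplicon.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = amplicon.find(site, start + 1)
    bounds = [0] + cuts + [len(amplicon)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Hypothetical 28-bp amplicons that differ by one base inside the site.
allele_with_site = "ATGCATGCAT" + "CCGCGG" + "TTAACCGGTTAA"
allele_without_site = "ATGCATGCAT" + "CCGAGG" + "TTAACCGGTTAA"

print(digest_lengths(allele_with_site))     # two fragments
print(digest_lengths(allele_without_site))  # one uncut fragment
```

    On a gel, the cut allele would show two bands and the uncut allele a single band at the full amplicon length.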

  3. The amino acid sequence of cytochrome c-555 from the methane-oxidizing bacterium Methylococcus capsulatus.

    PubMed Central

    Ambler, R P; Dalton, H; Meyer, T E; Bartsch, R G; Kamen, M D

    1986-01-01

    The amino acid sequence of the cytochrome c-555 from the obligate methanotroph Methylococcus capsulatus strain Bath (N.C.I.B. 11132) was determined. It is a single polypeptide chain of 96 residues, binding a haem group through the cysteine residues at positions 19 and 22, and the only methionine residue is at position 59. The sequence does not closely resemble that of any other cytochrome c that has yet been characterized. Detailed evidence for the amino acid sequence of the protein has been deposited as Supplementary Publication SUP 50131 (12 pages) at the British Library Lending Division, Boston Spa, West Yorkshire LS23 7BQ, U.K., from whom copies are available on prepayment. PMID:3006666

  4. Identification of conserved genomic regions and variation therein amongst Cetartiodactyla species using next generation sequencing

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Background Next Generation Sequencing has created an opportunity to genetically characterize an individual both inexpensively and comprehensively. In earlier work produced in our collaboration [1], it was demonstrated that, for animals without a reference genome, their Next Generation Sequence data ...

  5. The Use of High-Throughput DNA Sequencing in the Investigation of Antigenic Variation: Application to Neisseria Species

    PubMed Central

    Davies, John K.; Harrison, Paul F.; Lin, Ya-Hsun; Bartley, Stephanie; Khoo, Chen Ai; Seemann, Torsten; Ryan, Catherine S.; Kahler, Charlene M.; Hill, Stuart A.

    2014-01-01

    Antigenic variation occurs in a broad range of species. This process resembles gene conversion in that variant DNA is unidirectionally transferred from partial gene copies (or silent loci) into an expression locus. Previous studies of antigenic variation have involved the amplification and sequencing of individual genes from hundreds of colonies. Using the pilE gene from Neisseria gonorrhoeae we have demonstrated that it is possible to use PCR amplification, followed by high-throughput DNA sequencing and a novel assembly process, to detect individual antigenic variation events. The ability to detect these events was much greater than has previously been possible. In N. gonorrhoeae most silent loci contain multiple partial gene copies. Here we show that there is a bias towards using the copy at the 3′ end of the silent loci (copy 1) as the donor sequence. The pilE gene of N. gonorrhoeae and some strains of Neisseria meningitidis encode class I pilin, but strains of N. meningitidis from clonal complexes 8 and 11 encode a class II pilin. We have confirmed that the class II pili of meningococcal strain FAM18 (clonal complex 11) are non-variable, and this is also true for the class II pili of strain NMB from clonal complex 8. In addition when a gene encoding class I pilin was moved into the meningococcal strain NMB background there was no evidence of antigenic variation. Finally we investigated several members of the opa gene family of N. gonorrhoeae, where it has been suggested that limited variation occurs. Variation was detected in the opaK gene that is located close to pilE, but not at the opaJ gene located elsewhere on the genome. The approach described here promises to dramatically improve studies of the extent and nature of antigenic variation systems in a variety of species. PMID:24466206

  6. Low levels of haptoglobin and putative amino acid sequence in Taiwanese Lanyu miniature pigs.

    PubMed

    Yueh, Sunny C H; Wang, Yao Horng; Lin, Kuan Yu; Tseng, Chi Feng; Chu, Hsien Pin; Chen, Kuen Jaw; Wang, Shih Sheng; Lai, I Hsiang; Mao, Simon J T

    2008-04-01

    Porcine haptoglobin (Hp) is an acute phase protein. Its plasma level increases significantly during inflammation and infection. One of the main functions of Hp is to bind free hemoglobin (Hb) and inhibit its oxidative activity. In the present report, we studied the Hp phenotype of Taiwanese Lanyu miniature pigs (TLY minipigs; n=43) and found their Hp structure to be a homodimer (beta-alpha-alpha-beta) similar to human Hp 1-1. Interestingly, Western blot and high performance liquid chromatographic (HPLC) analysis showed that 25% of the TLY minipigs possessed low or no plasma Hp level (<0.05 mg/ml). The Hp cDNA of these TLY minipigs was then cloned, and the translated amino acid sequence was analyzed. No sequences were found to be deficient; they showed a 99.7% identity with domestic pigs (NP_999165). The mean overall Hp level of the TLY minipigs (0.21 +/- 0.25 mg/ml; n=43) determined by enzyme-linked immunosorbent assay (ELISA) was markedly lower than that of domestic pigs (0.78 +/- 0.45 mg/ml; p<0.001), while 25% of the TLY minipigs had an Hp level that was extremely low (<0.05 mg/ml). In addition, the initial recovery rate (first 40 min) in the circulation of infused fluorescein isothiocyanate (FITC)-Hb was significantly higher in the TLY minipigs with extremely low Hp levels than those with high levels. These data suggest that the low concentration of Hp-Hb complex is responsible for the higher recovery rate of Hb in the circulation. TLY minipigs have been used as an experimental model for cardiovascular diseases; whether they can be used as a model for inflammatory diseases, with Hp as a marker, remains a topic of interest. However, since the Hp level varies significantly among individual TLY minipigs, it is necessary to prescreen the Hp levels of the animals to minimize variation in the experimental baseline. The present study may provide a reference value for future use of the TLY minipig as an animal model for inflammation-associated diseases. PMID:18460833

  7. Sequence variations in the collagen IX and XI genes are associated with degenerative lumbar spinal stenosis

    PubMed Central

    Noponen-Hietala, N; Kyllonen, E; Mannikko, M; Ilkko, E; Karppinen, J; Ott, J; Ala-Kokko, L

    2003-01-01

    Background: Degenerative lumbar spinal stenosis (LSS) is usually caused by disc herniation or degeneration. Several genetic factors have been implicated in disc disease. Tryptophan alleles in COL9A2 and COL9A3 have been shown to be associated with lumbar disc disease in the Finnish population, and polymorphisms in the vitamin D receptor gene (VDR) (FokI and TaqI), the matrix metalloproteinase-3 gene (MMP-3) and an aggrecan gene (AGC1) VNTR have been reported to be associated with disc degeneration. In addition, an IVS6-4 a>t polymorphism in COL11A2 has been found in connection with stenosis caused by ossification of the posterior longitudinal ligament in the Japanese population. Objective: To study the role of genetic factors in LSS. Methods: 29 Finnish probands were analysed for mutations in the genes coding for intervertebral disc matrix proteins, COL1A1, COL1A2, COL2A1, COL9A1, COL9A2, COL9A3, COL11A1, COL11A2, and AGC1. VDR and MMP-3 polymorphisms were also analysed. Sequence variations were tested in 56 Finnish controls. Results: Several disease associated alleles were identified. A splice site mutation in COL9A2 leading to a premature translation termination codon and the generation of a truncated protein was identified in one proband, another had the Trp2 allele, and four others the Trp3 allele. The frequency of the COL11A2 IVS6-4 t allele was 93.1% in the probands and 72.3% in controls (p = 0.0016). The differences in genotype frequencies for this site were less significant (p = 0.0043). Conclusions: Genetic factors have an important role in the pathogenesis of LSS. PMID:14644861
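
    The reported COL11A2 IVS6-4 t allele frequencies can be checked with simple arithmetic. The allele counts below are back-calculated on the assumption that all 29 probands and all 56 controls were genotyped (58 and 112 alleles); they are not given in the abstract. The abstract also does not name the statistical test, so the uncorrected Pearson chi-square used here is a stand-in chosen for illustration.

```python
import math


def allele_frequency(allele_count, n_individuals):
    """Frequency of an allele among 2 * n_individuals diploid chromosomes."""
    return allele_count / (2 * n_individuals)


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]]."""
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Assumed counts: 54 of 58 proband alleles and 81 of 112 control alleles are t.
proband_t, proband_other = 54, 4
control_t, control_other = 81, 31

print(round(100 * allele_frequency(proband_t, 29), 1))  # percent in probands
print(round(100 * allele_frequency(control_t, 56), 1))  # percent in controls
print(chi_square_2x2(proband_t, proband_other, control_t, control_other))
```

    Under these assumptions the frequencies round to the 93.1% and 72.3% reported, and the chi-square p-value falls in the same range as the published p = 0.0016.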

  8. Whole-Genome Sequencing Reveals Diverse Models of Structural Variations in Esophageal Squamous Cell Carcinoma.

    PubMed

    Cheng, Caixia; Zhou, Yong; Li, Hongyi; Xiong, Teng; Li, Shuaicheng; Bi, Yanghui; Kong, Pengzhou; Wang, Fang; Cui, Heyang; Li, Yaoping; Fang, Xiaodong; Yan, Ting; Li, Yike; Wang, Juan; Yang, Bin; Zhang, Ling; Jia, Zhiwu; Song, Bin; Hu, Xiaoling; Yang, Jie; Qiu, Haile; Zhang, Gehong; Liu, Jing; Xu, Enwei; Shi, Ruyi; Zhang, Yanyan; Liu, Haiyan; He, Chanting; Zhao, Zhenxiang; Qian, Yu; Rong, Ruizhou; Han, Zhiwei; Zhang, Yanlin; Luo, Wen; Wang, Jiaqian; Peng, Shaoliang; Yang, Xukui; Li, Xiangchun; Li, Lin; Fang, Hu; Liu, Xingmin; Ma, Li; Chen, Yunqing; Guo, Shiping; Chen, Xing; Xi, Yanfeng; Li, Guodong; Liang, Jianfang; Yang, Xiaofeng; Guo, Jiansheng; Jia, JunMei; Li, Qingshan; Cheng, Xiaolong; Zhan, Qimin; Cui, Yongping

    2016-02-01

    Comprehensive identification of somatic structural variations (SVs) and understanding their mutational mechanisms in cancer might contribute to understanding biological differences and help to identify new therapeutic targets. Unfortunately, complex SVs across the whole genome and the mutational mechanisms underlying esophageal squamous cell carcinoma (ESCC) remain largely uncharacterized. To define a comprehensive catalog of somatic SVs, affected target genes, and their underlying mechanisms in ESCC, we re-analyzed whole-genome sequencing (WGS) data from 31 ESCCs using the Meerkat algorithm to predict somatic SVs and Patchwork to determine copy-number changes. We found deletions and translocations with NHEJ and alt-EJ signature as the dominant SV types, and 16% of deletions were complex deletions. SVs frequently led to disruption of cancer-associated genes (e.g., CDKN2A and NOTCH1) with different mutational mechanisms. Moreover, chromothripsis, kataegis, and breakage-fusion-bridge (BFB) were identified as contributing to locally mis-arranged chromosomes that occurred in 55% of ESCCs. These genomic catastrophes led to amplification of oncogenes through chromothripsis-derived double-minute chromosome formation (e.g., FGFR1 and LETM2) or BFB-affected chromosomes (e.g., CCND1, EGFR, ERBB2, MMPs, and MYC), with approximately 30% of ESCCs harboring BFB-derived CCND1 amplification. Furthermore, analyses of copy-number alterations reveal high frequency of whole-genome duplication (WGD) and recurrent focal amplification of CDCA7 that might act as a potential oncogene in ESCC. Our findings reveal molecular defects such as chromothripsis and BFB in malignant transformation of ESCCs and demonstrate diverse models of SVs-derived target genes in ESCCs. These genome-wide SV profiles and their underlying mechanisms provide preventive, diagnostic, and therapeutic implications for ESCCs. PMID:26833333

  9. Whole-Genome Sequencing Reveals Diverse Models of Structural Variations in Esophageal Squamous Cell Carcinoma

    PubMed Central

    Cheng, Caixia; Zhou, Yong; Li, Hongyi; Xiong, Teng; Li, Shuaicheng; Bi, Yanghui; Kong, Pengzhou; Wang, Fang; Cui, Heyang; Li, Yaoping; Fang, Xiaodong; Yan, Ting; Li, Yike; Wang, Juan; Yang, Bin; Zhang, Ling; Jia, Zhiwu; Song, Bin; Hu, Xiaoling; Yang, Jie; Qiu, Haile; Zhang, Gehong; Liu, Jing; Xu, Enwei; Shi, Ruyi; Zhang, Yanyan; Liu, Haiyan; He, Chanting; Zhao, Zhenxiang; Qian, Yu; Rong, Ruizhou; Han, Zhiwei; Zhang, Yanlin; Luo, Wen; Wang, Jiaqian; Peng, Shaoliang; Yang, Xukui; Li, Xiangchun; Li, Lin; Fang, Hu; Liu, Xingmin; Ma, Li; Chen, Yunqing; Guo, Shiping; Chen, Xing; Xi, Yanfeng; Li, Guodong; Liang, Jianfang; Yang, Xiaofeng; Guo, Jiansheng; Jia, JunMei; Li, Qingshan; Cheng, Xiaolong; Zhan, Qimin; Cui, Yongping

    2016-01-01

    Comprehensive identification of somatic structural variations (SVs) and understanding their mutational mechanisms in cancer might contribute to understanding biological differences and help to identify new therapeutic targets. Unfortunately, complex SVs across the whole genome and the mutational mechanisms underlying esophageal squamous cell carcinoma (ESCC) remain largely uncharacterized. To define a comprehensive catalog of somatic SVs, affected target genes, and their underlying mechanisms in ESCC, we re-analyzed whole-genome sequencing (WGS) data from 31 ESCCs using the Meerkat algorithm to predict somatic SVs and Patchwork to determine copy-number changes. We found deletions and translocations with NHEJ and alt-EJ signature as the dominant SV types, and 16% of deletions were complex deletions. SVs frequently led to disruption of cancer-associated genes (e.g., CDKN2A and NOTCH1) with different mutational mechanisms. Moreover, chromothripsis, kataegis, and breakage-fusion-bridge (BFB) were identified as contributing to locally mis-arranged chromosomes that occurred in 55% of ESCCs. These genomic catastrophes led to amplification of oncogenes through chromothripsis-derived double-minute chromosome formation (e.g., FGFR1 and LETM2) or BFB-affected chromosomes (e.g., CCND1, EGFR, ERBB2, MMPs, and MYC), with approximately 30% of ESCCs harboring BFB-derived CCND1 amplification. Furthermore, analyses of copy-number alterations reveal high frequency of whole-genome duplication (WGD) and recurrent focal amplification of CDCA7 that might act as a potential oncogene in ESCC. Our findings reveal molecular defects such as chromothripsis and BFB in malignant transformation of ESCCs and demonstrate diverse models of SVs-derived target genes in ESCCs. These genome-wide SV profiles and their underlying mechanisms provide preventive, diagnostic, and therapeutic implications for ESCCs. PMID:26833333

  10. Fin whale MDH-1 and MPI allozyme variation is not reflected in the corresponding DNA sequences

    PubMed Central

    Olsen, Morten Tange; Pampoulie, Christophe; Daníelsdóttir, Anna K; Lidh, Emmelie; Bérubé, Martine; Víkingsson, Gísli A; Palsbøll, Per J

    2014-01-01

    The appeal of genetic inference methods to assess population genetic structure and guide management efforts is grounded in the correlation between the genetic similarity and gene flow among populations. Effects of such gene flow are typically genomewide; however, some loci may appear as outliers, displaying above or below average genetic divergence relative to the genomewide level. Above-average population genetic divergence may be due to divergent selection as a result of local adaptation. Consequently, substantial efforts have been directed toward such outlying loci in order to identify traits subject to local adaptation. Here, we report the results of an investigation into the molecular basis of the substantial degree of genetic divergence previously reported at allozyme loci among North Atlantic fin whale (Balaenoptera physalus) populations. We sequenced the exons encoding the two most divergent allozyme loci (MDH-1 and MPI) and failed to detect any nonsynonymous substitutions. Following extensive error checking and analysis of additional bioinformatic and morphological data, we hypothesize that the observed allozyme polymorphisms may reflect phenotypic plasticity at the cellular level, perhaps as a response to nutritional stress. While such plasticity is intriguing in itself, and of fundamental evolutionary interest, our key finding is that the observed allozyme variation does not appear to be a result of genetic drift, migration, or selection on the MDH-1 and MPI exons themselves, stressing the importance of interpreting allozyme data with caution. As for North Atlantic fin whale population structure, our findings support the low levels of differentiation found in previous analyses of DNA nucleotide loci. PMID:24963377
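
    Checking exons for nonsynonymous substitutions amounts to comparing aligned coding sequences codon by codon and asking whether each change alters the encoded amino acid. The sketch below shows that comparison using the standard genetic code. The two sequences are invented toy examples, not MDH-1 or MPI data, and the function assumes the inputs are already aligned, in frame, and free of gaps.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def classify_substitutions(ref_cds, alt_cds):
    """Compare two aligned, in-frame coding sequences codon by codon."""
    ref_cds, alt_cds = ref_cds.upper(), alt_cds.upper()
    if len(ref_cds) != len(alt_cds) or len(ref_cds) % 3 != 0:
        raise ValueError("sequences must be equal length and a multiple of 3")
    results = []
    for pos in range(0, len(ref_cds), 3):
        ref_codon = ref_cds[pos:pos + 3]
        alt_codon = alt_cds[pos:pos + 3]
        if ref_codon == alt_codon:
            continue
        same_residue = CODON_TABLE[ref_codon] == CODON_TABLE[alt_codon]
        kind = "synonymous" if same_residue else "nonsynonymous"
        results.append((pos // 3 + 1, ref_codon, alt_codon, kind))
    return results


# Hypothetical sequences: codon 2 changes silently, codon 4 changes Asp to Glu.
reference = "ATGGCTAAAGAT"
variant = "ATGGCCAAAGAA"
print(classify_substitutions(reference, variant))
```

    An empty list of nonsynonymous entries across all sampled individuals would correspond to the negative result described in the abstract.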

  11. Allelic polymorphism in arabian camel ribonuclease and the amino acid sequence of bactrian camel ribonuclease.

    PubMed

    Welling, G W; Mulder, H; Beintema, J J

    1976-04-01

    Pancreatic ribonucleases from several species (whitetail deer, roe deer, guinea pig, and arabian camel) exhibit more than one amino acid at particular positions in their amino acid sequences. Since these enzymes were isolated from pooled pancreas, the origin of this heterogeneity is not clear. The pancreatic ribonucleases from 11 individual arabian camels (Camelus dromedarius) have been investigated with respect to the lysine-glutamine heterogeneity at position 103 (Welling et al., 1975). Six ribonucleases showed only one basic band and five showed two bands after polyacrylamide gel electrophoresis, suggesting a gene frequency of about 0.75 for the Lys gene and about 0.25 for the Gln gene. The amino acid sequence of bactrian camel (Camelus bactrianus) ribonuclease isolated from individual pancreatic tissue was determined and compared with that of arabian camel ribonuclease. The only difference was observed at position 103. In the ribonucleases from two unrelated bactrian camels, only glutamine was observed at that position. PMID:962846
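
    The gene frequencies quoted above follow from counting alleles. The sketch assumes that the six animals showing a single basic band are Lys/Lys homozygotes and the five showing two bands are Lys/Gln heterozygotes; this reading is consistent with the abstract but is not stated there explicitly.

```python
def allele_frequencies(n_lys_homozygotes, n_heterozygotes, n_gln_homozygotes=0):
    """Return (Lys, Gln) allele frequencies from genotype counts in diploids."""
    n_individuals = n_lys_homozygotes + n_heterozygotes + n_gln_homozygotes
    n_alleles = 2 * n_individuals
    lys = (2 * n_lys_homozygotes + n_heterozygotes) / n_alleles
    gln = (2 * n_gln_homozygotes + n_heterozygotes) / n_alleles
    return lys, gln


# 11 arabian camels: 6 with one band, 5 with two bands.
lys_freq, gln_freq = allele_frequencies(6, 5)
print(lys_freq, gln_freq)
```

    This gives 17 of 22 alleles for Lys and 5 of 22 for Gln, in line with the "about 0.75" and "about 0.25" reported.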

  12. Use of a structural alphabet to find compatible folds for amino acid sequences

    PubMed Central

    Mahajan, Swapnil; de Brevern, Alexandre G; Sanejouand, Yves-Henri; Srinivasan, Narayanaswamy; Offmann, Bernard

    2015-01-01

    The structural annotation of proteins with no detectable homologs of known 3D structure identified using sequence-search methods is a major challenge today. We propose an original method that computes the conditional probabilities for the amino-acid sequence of a protein to fit to known protein 3D structures using a structural alphabet, known as “Protein Blocks” (PBs). PBs constitute a library of 16 local structural prototypes that approximate every part of protein backbone structures. They are used to encode 3D protein structures into 1D PB sequences and to capture sequence to structure relationships. Our method relies on amino acid occurrence matrices, one for each PB, to score global and local threading of query amino acid sequences to protein folds encoded into PB sequences. It does not use any information from residue contacts or sequence-search methods or explicit incorporation of hydrophobic effect. The performance of the method was assessed with independent test datasets derived from SCOP 1.75A. With a Z-score cutoff that achieved 95% specificity (i.e., less than 5% false positives), global and local threading showed sensitivity of 64.1% and 34.2%, respectively. We further tested its performance on 57 difficult CASP10 targets that had no known homologs in PDB: 38 compatible templates were identified by our approach and 66% of these hits yielded correctly predicted structures. This method scales up well and offers promising perspectives for structural annotations at genomic level. It has been implemented in the form of a web-server that is freely available at http://www.bo-protscience.fr/forsa. PMID:25297700
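
    The idea of scoring a query sequence against a fold encoded as a PB sequence can be sketched with per-PB amino acid occurrence counts. Everything below is a simplified illustration under stated assumptions: the log-odds scoring, the uniform background, the two toy PB labels, the three-letter amino acid alphabet, and all counts are invented, and the abstract does not specify the authors' actual scoring function.

```python
import math


def log_odds_matrix(counts, background):
    """Turn per-PB amino acid counts into log-odds scores against a background."""
    matrix = {}
    for pb, aa_counts in counts.items():
        total = sum(aa_counts.values())
        matrix[pb] = {
            aa: math.log((n / total) / background[aa])
            for aa, n in aa_counts.items()
        }
    return matrix


def threading_score(aa_sequence, pb_sequence, matrix):
    """Sum position-wise scores for an ungapped global sequence-to-PB alignment."""
    if len(aa_sequence) != len(pb_sequence):
        raise ValueError("sequences must have the same length")
    return sum(matrix[pb][aa] for aa, pb in zip(aa_sequence, pb_sequence))


# Hypothetical occurrence counts for two toy PB labels and three residues.
toy_counts = {
    "m": {"A": 60, "V": 20, "G": 20},
    "d": {"A": 20, "V": 60, "G": 20},
}
toy_background = {"A": 1 / 3, "V": 1 / 3, "G": 1 / 3}
toy_matrix = log_odds_matrix(toy_counts, toy_background)

compatible = threading_score("AAVV", "mmdd", toy_matrix)
incompatible = threading_score("AAVV", "ddmm", toy_matrix)
print(compatible, incompatible)
```

    A positive score marks a query that matches the residue preferences of the encoded fold; in practice such raw scores would be converted to Z-scores against a reference distribution before applying a cutoff like the one described above.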

  13. Use of a structural alphabet to find compatible folds for amino acid sequences.

    PubMed

    Mahajan, Swapnil; de Brevern, Alexandre G; Sanejouand, Yves-Henri; Srinivasan, Narayanaswamy; Offmann, Bernard

    2015-01-01

    The structural annotation of proteins with no detectable homologs of known 3D structure identified using sequence-search methods is a major challenge today. We propose an original method that computes the conditional probabilities for the amino-acid sequence of a protein to fit to known protein 3D structures using a structural alphabet, known as "Protein Blocks" (PBs). PBs constitute a library of 16 local structural prototypes that approximate every part of protein backbone structures. They are used to encode 3D protein structures into 1D PB sequences and to capture sequence-to-structure relationships. Our method relies on amino acid occurrence matrices, one for each PB, to score global and local threading of query amino acid sequences to protein folds encoded into PB sequences. It does not use any information from residue contacts or sequence-search methods, nor does it explicitly incorporate the hydrophobic effect. The performance of the method was assessed with independent test datasets derived from SCOP 1.75A. With a Z-score cutoff that achieved 95% specificity (i.e., less than 5% false positives), global and local threading showed sensitivity of 64.1% and 34.2%, respectively. We further tested its performance on 57 difficult CASP10 targets that had no known homologs in PDB: 38 compatible templates were identified by our approach and 66% of these hits yielded correctly predicted structures. This method scales up well and offers promising perspectives for structural annotation at the genomic level. It has been implemented as a web server that is freely available at http://www.bo-protscience.fr/forsa. PMID:25297700

  14. Hybridization properties of long nucleic acid probes for detection of variable target sequences, and development of a hybridization prediction algorithm

    PubMed Central

    Öhrmalm, Christina; Jobs, Magnus; Eriksson, Ronnie; Golbob, Sultan; Elfaitouri, Amal; Benachenhou, Farid; Strømme, Maria; Blomberg, Jonas

    2010-01-01

    One of the main problems in nucleic acid-based techniques for detection of infectious agents, such as influenza viruses, is that of nucleic acid sequence variation. DNA probes, 70-nt long, some including the nucleotide analog deoxyribose-Inosine (dInosine), were analyzed for hybridization tolerance to different amounts and distributions of mismatching bases, e.g. synonymous mutations, in target DNA. Microsphere-linked 70-mer probes were hybridized in 3M TMAC buffer to biotinylated single-stranded (ss) DNA for subsequent analysis in a Luminex® system. When mismatches interrupted contiguous matching stretches of 6 nt or longer, it had a strong impact on hybridization. Contiguous matching stretches are more important than the same number of matching nucleotides separated by mismatches into several regions. dInosine, but not 5-nitroindole, substitutions at mismatching positions stabilized hybridization remarkably well, comparable to N (4-fold) wobbles in the same positions. In contrast to shorter probes, 70-nt probes with judiciously placed dInosine substitutions and/or wobble positions were remarkably mismatch tolerant, with preserved specificity. An algorithm, NucZip, was constructed to model the nucleation and zipping phases of hybridization, integrating both local and distant binding contributions. It predicted hybridization more exactly than previous algorithms, and has the potential to guide the design of variation-tolerant yet specific probes. PMID:20864443
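
    The distinction between contiguous matching stretches and the same number of scattered matches can be made concrete with a small helper. This is a minimal sketch, not the NucZip algorithm: it assumes an ungapped alignment of equal-length strings written in the same sense (so a match is simply an identical letter), and the short sequences are invented for illustration.

```python
def matching_stretches(probe: str, target: str) -> list[int]:
    """Return lengths of contiguous matching runs in an ungapped alignment."""
    if len(probe) != len(target):
        raise ValueError("probe and target must have the same length")
    runs, current = [], 0
    for p, t in zip(probe.upper(), target.upper()):
        if p == t:
            current += 1
        else:
            if current:
                runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


def long_stretches(runs: list[int], min_len: int = 6) -> int:
    """Count runs at least min_len long (6 nt is the length named in the abstract)."""
    return sum(1 for r in runs if r >= min_len)


probe = "ACGTACGTACGT"
runs_a = matching_stretches(probe, "ACGTACCTACGT")  # one mismatch
runs_b = matching_stretches(probe, "ACCTACCTACCT")  # three scattered mismatches
print(runs_a, long_stretches(runs_a))  # [6, 5] 1
print(runs_b, long_stretches(runs_b))  # [2, 3, 3, 1] 0
```

    The second target still matches at 9 of 12 positions, yet it has no stretch of 6 nt or longer, which is the kind of pattern the abstract reports as strongly affecting hybridization.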

  15. Mitochondrial DNA variation and phylogenetic relationships among five tuna species based on sequencing of D-loop region.

    PubMed

    Kumar, Girish; Kocour, Martin; Kunal, Swaraj Priyaranjan

    2016-05-01

    In order to assess the DNA sequence variation and phylogenetic relationships among five tuna species (Auxis thazard, Euthynnus affinis, Katsuwonus pelamis, Thunnus tonggol, and T. albacares) representing all four tuna genera, partial sequences of the mitochondrial DNA (mtDNA) D-loop region were analyzed. The estimate of intra-specific sequence variation in the studied species was low, ranging from 0.027 to 0.080 [Kimura's two-parameter distance (K2P)], whereas values of inter-specific variation ranged from 0.049 to 0.491. The longtail tuna (T. tonggol) and yellowfin tuna (T. albacares) were found to share a close relationship (K2P = 0.049), while skipjack tuna (K. pelamis) was the most divergent of the species studied. Phylogenetic analysis using Maximum-Likelihood (ML) and Neighbor-Joining (NJ) methods supported the monophyletic origin of Thunnus species. Similarly, the phylogeny of Auxis and Euthynnus species substantiates their monophyly. However, results showed a distinct origin of K. pelamis from genus Thunnus as well as Auxis and Euthynnus. Thus, the mtDNA D-loop region sequence data support the polyphyletic origin of tuna species. PMID:25329285
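
    For readers unfamiliar with the K2P distance, the standard formula is d = -0.5 ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions between two aligned sequences. The sketch below applies it to a pair of invented 10-nt sequences; skipping gaps and ambiguous bases is an assumption of this example, not a detail taken from the study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura two-parameter distance between two aligned DNA sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases (assumption of this sketch)
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError if the sequences are too divergent (saturation).
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))


d = k2p_distance("ACGTACGTAC", "ACGTACGTAT")  # one C/T transition in 10 sites
print(round(d, 4))
```

    With one transition in ten sites the uncorrected difference is 0.1, and the K2P correction raises it slightly because it allows for multiple substitutions at the same site.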

  16. Software scripts for quality checking of high-throughput nucleic acid sequencers.

    PubMed

    Lazo, G R; Tong, J; Miller, R; Hsia, C; Rausch, C; Kang, Y; Anderson, O D

    2001-06-01

    We have developed a graphical interface to allow the researcher to view and assess the quality of sequencing results using a series of program scripts developed to process data generated by automated sequencers. The scripts are written in the Perl programming language and are executable under the cgi-bin directory of a Web server environment. The scripts direct nucleic acid sequencing trace file data output from automated sequencers to be analyzed by the phred molecular biology program, and the results are displayed as graphical hypertext mark-up language (HTML) pages. The scripts are mainly designed to handle 96-well microtiter dish samples, but they are also able to read data from 384-well microtiter dishes 96 samples at a time. The scripts may be customized for different laboratory environments and computer configurations. Web links to the sources and discussion page are provided. PMID:11414222
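
    The kind of per-well summary such a quality-checking page might show can be sketched briefly. The scripts described are written in Perl and their internals are not given here, so the Python below is purely illustrative: the helper names, the Q20 threshold, and the A1 to H12 well labels are assumptions. The only fixed point is the phred definition Q = -10 log10(P), where P is the estimated base-call error probability.

```python
def error_probability(q: float) -> float:
    """Convert a phred quality score to an error probability (Q = -10 log10 P)."""
    return 10 ** (-q / 10)


def bases_at_or_above(qualities: list[int], threshold: int = 20) -> int:
    """Count bases whose quality meets the threshold (Q20 is an assumed default)."""
    return sum(1 for q in qualities if q >= threshold)


def plate_wells() -> list[str]:
    """Well labels for a 96-well plate, A1 through H12 (assumed labelling)."""
    return [f"{row}{col}" for row in "ABCDEFGH" for col in range(1, 13)]


wells = plate_wells()
example_read = [10, 20, 30, 40, 5]  # invented per-base qualities for one well
print(len(wells), bases_at_or_above(example_read), error_probability(20))
```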

  17. Seasonal variation in soil nitrogen availability across a fertilization chronosequence in moist acidic tundra

    NASA Astrophysics Data System (ADS)

    McLaren, J. R.; Gough, L.; Weintraub, M. N.

    2012-12-01

    Changes in global climate may result in altered timing of seasonal events including the timing of the spring-thaw and fall freeze-up. In addition to this changing seasonality, arctic environments are experiencing overall increases in nutrient availability caused by climate warming resulting in alterations of plant species composition, such as the observed increases in the abundance of deciduous shrubs. Changing species composition may have large effects on nutrient dynamics in the surrounding ecosystem because of documented differences in how particular plant species influence soil nutrient availability. Although we have some idea of how plant identity influences soil nutrients, soil biogeochemical processes are strongly seasonal, and we have a poor understanding of how plant identity, or nutrient levels, may influence these seasonal patterns. We examined the responses of moist acidic tundra to experimentally increased soil nutrient availability and the accompanying increase in shrub abundance at the Arctic Long Term Ecological Research (LTER) site at Toolik Lake, Alaska. We examined a chrono-sequence of long-term fertilization experiments, composed of experiments fertilized for 5, 15 and 22 years, which has resulted in increasing shrub density with time since fertilization. The fertilized plots receive both nitrogen (N, 10 g/m2/yr) and phosphorus (5 g/m2/yr) annually following snowmelt. In the 2011 growing season we measured variation in soil available N weekly, including measures of ammonium (NH4), nitrate (NO3) and total free amino acids (TFAA). We found that differences between fertilized and control plots depended strongly on both the seasonal timing of measurements, as well as the duration of the fertilization treatment. Early in the growing season fertilization resulted in large increases in available soil N (both NH4 and NO3) across the entire chronosequence. As the season progressed, however, older fertilized plots show evidence of N saturation, where

  18. Combined examination of sequence and copy number variations in human deafness genes improves diagnosis for cases of genetic deafness

    PubMed Central

    2014-01-01

    Background: Copy number variations (CNVs) are the major type of structural variation in the human genome, and are more common than DNA sequence variations in populations. CNVs are important factors for human genetic and phenotypic diversity. Many CNVs have been associated with resistance to diseases or identified as the cause of diseases. Currently little is known about the role of CNVs in causing deafness, and CNVs are not analyzed by the conventional genetic analysis methods used to study deafness. Here we detected both DNA sequence variations and CNVs affecting 80 genes known to be required for normal hearing. Methods: Coding regions of the deafness genes were captured by a hybridization-based method and processed through the standard next-generation sequencing (NGS) protocol using the Illumina platform. Samples hybridized together in the same reaction were analyzed to obtain CNVs. A read-depth-based method was used to measure CNVs at the resolution of a single exon. Results were validated by a quantitative PCR (qPCR) based method. Results: Among 79 sporadic cases clinically diagnosed with sensorineural hearing loss, we identified previously reported disease-causing sequence mutations in 16 cases. In addition, we identified a total of 97 CNVs (72 CNV gains and 25 CNV losses) in 27 deafness genes. The CNVs included homozygous deletions, which may directly give rise to deleterious effects on protein functions known to be essential for hearing, as well as heterozygous deletions and CNV gains compounded with sequence mutations in deafness genes that could potentially harm gene functions. Conclusions: We studied how CNVs in known deafness genes may result in deafness. Data provided here served as a basis to explain how CNVs disrupt normal functions of deafness genes. These results may significantly expand our understanding about how various types of genetic mutations cause deafness in humans. PMID:25342930
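
    A read-depth comparison at exon resolution can be sketched as follows. This is a simplified illustration, not the authors' pipeline: the exon labels, the depth values, the normalization by total depth, and the log2-ratio thresholds for calling gains and losses are all assumptions chosen for the example.

```python
import math


def exon_log2_ratios(sample_depth: dict, reference_depth: dict) -> dict:
    """Per-exon log2 ratio of sample to reference depth after total-depth normalization."""
    sample_total = sum(sample_depth.values())
    reference_total = sum(reference_depth.values())
    ratios = {}
    for exon, depth in sample_depth.items():
        ref_fraction = reference_depth[exon] / reference_total
        if ref_fraction == 0:
            continue  # exon not covered in the reference; cannot be compared
        sample_fraction = depth / sample_total
        if sample_fraction == 0:
            ratios[exon] = float("-inf")  # no reads at all in the sample
        else:
            ratios[exon] = math.log2(sample_fraction / ref_fraction)
    return ratios


def call_cnv(log2_ratio: float, gain: float = 0.4, loss: float = -0.6) -> str:
    """Classify an exon using hypothetical thresholds."""
    if log2_ratio >= gain:
        return "gain"
    if log2_ratio <= loss:
        return "loss"
    return "neutral"


sample = {"exon_1": 100, "exon_2": 100, "exon_3": 50, "exon_4": 150}
reference = {"exon_1": 100, "exon_2": 100, "exon_3": 100, "exon_4": 100}
calls = {exon: call_cnv(r) for exon, r in exon_log2_ratios(sample, reference).items()}
print(calls)
```

    In practice the reference would typically be pooled from the other samples captured in the same reaction, and candidate calls would be confirmed independently, as the abstract describes with qPCR.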

  19. DNA sequence variation in a non-coding region of low recombination on the human X chromosome.

    PubMed

    Kaessmann, H; Heissig, F; von Haeseler, A; Pääbo, S

    1999-05-01

    DNA sequence variation has become a major source of insight regarding the origin and history of our species as well as an important tool for the identification of allelic variants associated with disease. Comparative sequencing of DNA has to date focused mainly on mitochondrial (mt) DNA, which due to its apparent lack of recombination and high evolutionary rate lends itself well to the study of human evolution. These advantages also entail limitations. For example, the high mutation rate of mtDNA results in multiple substitutions that make phylogenetic analysis difficult and, because mtDNA is maternally inherited, it reflects only the history of females. For the history of males, the non-recombining part of the paternally inherited Y chromosome can be studied. The extent of variation on the Y chromosome is so low that variation at particular sites known to be polymorphic, rather than entire sequences, is typically determined. It is currently unclear how some forms of analysis (such as the coalescent) should be applied to such data. Furthermore, the lack of recombination means that selection at any locus affects all 59 Mb of DNA. To gauge the extent and pattern of point substitutional variation in non-coding parts of the human genome, we have sequenced 10 kb of non-coding DNA in a region of low recombination at Xq13.3. Analysis of this sequence in 69 individuals representing all major linguistic groups reveals the highest overall diversity in Africa, whereas deep divergences also exist in Asia. The time elapsed since the most recent common ancestor (MRCA) is 535,000 ± 119,000 years. We expect this type of nuclear locus to provide more answers about the genetic origin and history of humans. PMID:10319866

  20. Phylogenetic Relationships and Genetic Variation in Longidorus and Xiphinema Species (Nematoda: Longidoridae) Using ITS1 Sequences of Nuclear Ribosomal DNA

    PubMed Central

    Ye, Weimin; Szalanski, Allen L.; Robbins, R. T.

    2004-01-01

    Genetic analyses using DNA sequences of nuclear ribosomal DNA ITS1 were conducted to determine the extent of genetic variation within and among Longidorus and Xiphinema species. DNA sequences were obtained from samples collected from Arkansas, California and Australia, and 4 Xiphinema DNA sequences were taken from GenBank. The sequences of the ITS1 region including the 3' end of the 18S rDNA gene and the 5' end of the 5.8S rDNA gene ranged from 1020 bp to 1244 bp for the 9 Longidorus species, and from 870 bp to 1354 bp for the 7 Xiphinema species. Nucleotide frequencies were: A = 25.5%, C = 21.0%, G = 26.4%, and T = 27.1%. Genetic variation between the two genera had a maximum divergence of 38.6% between X. chambersi and L. crassus. Genetic variation among Xiphinema species ranged from 3.8% between X. diversicaudatum and X. bakeri to 29.9% between X. chambersi and X. italiae. Within Longidorus, genetic variation ranged from 8.9% between L. crassus and L. grandis to 32.4% between L. fragilis and L. diadecturus. Intraspecific genetic variation ranged from 0.3% to 1.9% in X. americanum sensu lato, was 0.8% in L. diadecturus, and ranged from 0.6% to 10.9% in L. biformis. Identical sequences were obtained between the two populations of L. grandis, and between the two populations of X. bakeri. Phylogenetic analyses based on the ITS1 DNA sequence data were conducted on each genus separately using both maximum parsimony and maximum likelihood analysis. Among the Longidorus taxa, 4 subgroups are supported: L. grandis, L. crassus, and L. elongatus are in one cluster; L. biformis and L. paralongicaudatus are in a second cluster; L. fragilis and L. breviannulatus are in a third cluster; and L. diadecturus is in a fourth cluster. Among the Xiphinema taxa, 3 subgroups are supported: X. americanum with X. chambersi, X. bakeri with X. diversicaudatum, and X. italiae and X. vuittenezi forming a sister group with X. index. The relationships observed in this study

  1. Nucleotide and predicted amino acid sequences of cloned human and mouse preprocathepsin B cDNAs.

    PubMed Central

    Chan, S J; San Segundo, B; McCormick, M B; Steiner, D F

    1986-01-01

    Cathepsin B is a lysosomal thiol proteinase that may have additional extralysosomal functions. To further our investigations on the structure, mode of biosynthesis, and intracellular sorting of this enzyme, we have determined the complete coding sequences for human and mouse preprocathepsin B by using cDNA clones isolated from human hepatoma and kidney phage libraries. The nucleotide sequences predict that the primary structure of preprocathepsin B contains 339 amino acids organized as follows: a 17-residue NH2-terminal prepeptide sequence followed by a 62-residue propeptide region, 254 residues in mature (single chain) cathepsin B, and a 6-residue extension at the COOH terminus. A comparison of procathepsin B sequences from three species (human, mouse, and rat) reveals that the propeptides are relatively conserved, with a minimum of 68% sequence identity. In particular, two conserved features in the propeptide that may be functionally significant are a potential glycosylation site and a single cysteine at position 59. Comparative analysis of the three sequences also suggests that processing of procathepsin B is a multistep process, during which enzymatically active intermediate forms may be generated. The availability of the cDNA clones will facilitate the identification of possible active or inactive intermediate processive forms as well as studies on the transcriptional regulation of the cathepsin B gene. PMID:3463996

  2. Efficient Nucleic Acid Extraction and 16S rRNA Gene Sequencing for Bacterial Community Characterization.

    PubMed

    Anahtar, Melis N; Bowman, Brittany A; Kwon, Douglas S

    2016-01-01

    There is a growing appreciation for the role of microbial communities as critical modulators of human health and disease. High-throughput sequencing technologies have allowed for the rapid and efficient characterization of bacterial communities using 16S rRNA gene sequencing from a variety of sources. Although readily available tools for 16S rRNA sequence analysis have standardized computational workflows, sample processing for DNA extraction remains a continued source of variability across studies. Here we describe an efficient, robust, and cost-effective method for extracting nucleic acid from swabs. We also delineate downstream methods for 16S rRNA gene sequencing, including generation of sequencing libraries, data quality control, and sequence analysis. The workflow can accommodate multiple sample types, including stool and swabs collected from a variety of anatomical locations and host species. Additionally, recovered DNA and RNA can be separated and used for other applications, including whole genome sequencing or RNA-seq. The method described allows for a common processing approach for multiple sample types and accommodates downstream analysis of genomic, metagenomic and transcriptional information. PMID:27168460

  3. Efficient Nucleic Acid Extraction and 16S rRNA Gene Sequencing for Bacterial Community Characterization

    PubMed Central

    Anahtar, Melis N.; Bowman, Brittany A.; Kwon, Douglas S.

    2016-01-01

    There is a growing appreciation for the role of microbial communities as critical modulators of human health and disease. High-throughput sequencing technologies have allowed for the rapid and efficient characterization of bacterial communities using 16S rRNA gene sequencing from a variety of sources. Although readily available tools for 16S rRNA sequence analysis have standardized computational workflows, sample processing for DNA extraction remains a continued source of variability across studies. Here we describe an efficient, robust, and cost-effective method for extracting nucleic acid from swabs. We also delineate downstream methods for 16S rRNA gene sequencing, including generation of sequencing libraries, data quality control, and sequence analysis. The workflow can accommodate multiple sample types, including stool and swabs collected from a variety of anatomical locations and host species. Additionally, recovered DNA and RNA can be separated and used for other applications, including whole genome sequencing or RNA-seq. The method described allows for a common processing approach for multiple sample types and accommodates downstream analysis of genomic, metagenomic and transcriptional information. PMID:27168460

  4. Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology (Seventh Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting 2012)

    ScienceCinema

    Patel, Kamlesh D [Ken]; SNL,

    2013-01-25

    Kamlesh (Ken) Patel from Sandia National Laboratories (Livermore, California) presents "Preparation of Nucleic Acid Libraries for Personalized Sequencing Systems Using an Integrated Microfluidic Hub Technology" at the 7th Annual Sequencing, Finishing, Analysis in the Future (SFAF) Meeting held in June 2012 in Santa Fe, NM.

  5. The amino acid sequence of ribonuclease U2 from Ustilago sphaerogena.

    PubMed Central

    Sato, S; Uchida, T

    1975-01-01

    1. RNAase (ribonuclease) U2, a purine-specific RNAase, was reduced, aminoethylated and hydrolysed with trypsin, chymotrypsin and thermolysin. On the basis of the analyses of the resulting peptides, the complete amino acid sequence of RNAase U2 was determined. 2. When the sequence was compared with the amino acid sequence of RNAase T1 (EC 3.1.4.8), the following regions were found to be similar in the two enzymes: Tyr-Pro-His-Gln-Tyr (38-42) in RNAase U2 and Tyr-Pro-His-Lys-Tyr (38-42) in RNAase T1; Glu-Phe-Pro-Leu-Val (61-65) in RNAase U2 and Glu-Trp-Pro-Ile-Leu (58-62) in RNAase T1; Asp-Arg-Val-Ile-Tyr-Gln (83-88) in RNAase U2 and Asp-Arg-Val-Phe-Asn (76-81) in RNAase T1; and Val-Thr-His-Thr-Gly-Ala (98-103) in RNAase U2 and Ile-Thr-His-Thr-Gly-Ala (90-95) in RNAase T1. All of the amino acid residues, histidine-40, glutamate-58, arginine-77 and histidine-92, which were found to play a crucial role in the biological activity of RNAase T1, were included in the regions cited here. 3. Detailed evidence for the amino acid sequence of the protein has been deposited as Supplementary Publication SUP 50041 (33 pages) at the British Library (Lending Division) (formerly the National Lending Library for Science and Technology), Boston Spa, Yorks. LS23 7BQ, U.K., from whom copies can be obtained on the terms indicated in Biochem. J. (1975), 145, 5. PMID:1156364
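
    The degree of similarity between the listed RNAase U2 and T1 regions can be counted position by position. The sketch below covers three of the four pairs; the Asp-Arg-Val pair is left out because, as printed, the two segments list different numbers of residues and cannot be compared one-to-one without an alignment that the abstract does not give.

```python
# Segment pairs copied from the abstract: (RNAase U2, RNAase T1).
SEGMENT_PAIRS = [
    ("Tyr-Pro-His-Gln-Tyr", "Tyr-Pro-His-Lys-Tyr"),
    ("Glu-Phe-Pro-Leu-Val", "Glu-Trp-Pro-Ile-Leu"),
    ("Val-Thr-His-Thr-Gly-Ala", "Ile-Thr-His-Thr-Gly-Ala"),
]


def segment_identity(u2: str, t1: str) -> tuple[int, int]:
    """Return (identical positions, segment length) for two equal-length segments."""
    u2_residues = u2.split("-")
    t1_residues = t1.split("-")
    if len(u2_residues) != len(t1_residues):
        raise ValueError("segments must contain the same number of residues")
    matches = sum(a == b for a, b in zip(u2_residues, t1_residues))
    return matches, len(u2_residues)


identities = [segment_identity(u2, t1) for u2, t1 in SEGMENT_PAIRS]
for (u2, t1), (matches, length) in zip(SEGMENT_PAIRS, identities):
    print(f"{u2} / {t1}: {matches}/{length} identical")
```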

  6. Deduced amino acid sequence of human pulmonary surfactant proteolipid: SPL(pVal)

    SciTech Connect

    Whitsett, J.A.; Glasser, S.W.; Korfhagen, T.R.; Weaver, T.E.; Clark, J.; Pilot-Matias, T.; Meuth, J.; Fox, J.L.

    1987-05-01

    A hydrophobic, proteolipid-like protein of Mr 6500 was isolated from ether/ethanol extracts of human, canine and bovine pulmonary surfactant. Amino acid composition of the protein demonstrated a remarkable abundance of hydrophobic residues, particularly valine and leucine. The N-terminal amino acid sequence of the human protein was determined: N-Leu-Ile-Pro-Cys-Cys-Pro-Val-Asn-Leu-Lys-Arg-Leu-Leu-Ile-Val4... An oligonucleotide probe was used to screen an adult human lung cDNA library and resulted in detection of cDNA clones whose predicted amino acid sequence closely matched the N-terminal amino acid sequence of the human peptide. SPL(pVal) was found within the reading frame of a larger peptide, indicating that SPL(pVal) results from proteolytic processing of a larger preprotein. Northern blot analysis detected a single 1.0-kilobase SPL(pVal) RNA, which was less abundant in fetal than in adult lung. Mixtures of purified canine and bovine SPL(pVal) and synthetic phospholipids display the rapid adsorption and surface-tension-lowering activity characteristic of surfactant. Human SPL(pVal) is a pulmonary surfactant proteolipid which may therefore be useful in combination with phospholipids and/or other surfactant proteins for the treatment of surfactant deficiency such as hyaline membrane disease in newborn infants.

  7. Variation in Lake Michigan alewife (Alosa pseudoharengus) thiaminase and fatty acids composition

    USGS Publications Warehouse

    Honeyfield, D.C.; Tillitt, D.E.; Fitzsimons, J.D.; Brown, S.B.

    2010-01-01

    Thiaminase activity of alewife (Alosa pseudoharengus) is variable across Lake Michigan, yet factors that contribute to the variability in alewife thiaminase activity are unknown. The fatty acid content of Lake Michigan alewife has not been previously reported. Analysis of 53 Lake Michigan alewives found a positive correlation between thiaminase activity and the following fatty acid measures: C22:1n9, the sum of omega-6 fatty acids (Σω6), and the sum of the polyunsaturated fatty acids. Thiaminase activity was negatively correlated with C15:0, C16:0, C17:0, C18:0, C20:0, C22:0, C24:0, C18:1n9t, C20:3n3, C22:2, and the sum of all saturated fatty acids (SAFA). Multivariate regression analysis resulted in three variables (C18:1n9t, Σω6, SAFA) that explained 71% (R2=0.71, P<0.0001) of the variation in thiaminase activity. Because the fatty acid content of an organism is related to its food source, diet may be an important factor modulating alewife thiaminase activity. These data suggest there is an association between fatty acids and thiaminase activity in Lake Michigan alewife.

  8. Complete nucleic acid sequence of Penaeus stylirostris densovirus (PstDNV) from India.

    PubMed

    Rai, Praveen; Safeena, Muhammed P; Karunasagar, Iddya; Karunasagar, Indrani

    2011-06-01

    Infectious hypodermal and hematopoietic necrosis virus (IHHNV) of shrimp has recently been classified as Penaeus stylirostris densovirus (PstDNV). The complete nucleic acid sequence of PstDNV from India was obtained by cloning and sequencing different DNA fragments of the virus. The genome organisation of PstDNV revealed three major coding domains: a left ORF (NS1) of 2001 bp, a mid ORF (NS2) of 1092 bp and a right ORF (VP) of 990 bp. The complete genome and the amino acid sequences of the three proteins, NS1, NS2 and VP, were compared with the genomes of the virus reported from Hawaii, China and Mexico and with partial sequences available for isolates from different regions. The phylogenetic analysis of shrimp, insect and vertebrate parvovirus sequences showed that the Indian PstDNV isolate is phylogenetically more closely related to one of the three isolates from Taiwan (AY355307), and two isolates (AY362547 and AY102034) from Thailand. PMID:21402111

  9. Human liver type pyruvate kinase: complete amino acid sequence and the expression in mammalian cells.

    PubMed Central

    Tani, K; Fujii, H; Nagata, S; Miwa, S

    1988-01-01

    Pyruvate kinase (PK) has four isozymes (L, R, M1, M2) that are encoded by two different genes. Among these isozymes, abnormalities of liver (L)-type PK are considered to be associated with hereditary nonspherocytic hemolytic anemia in humans. We isolated and determined the full-length sequence of human L-type PK cDNA. The cDNA contains 1629 base pairs encoding 543 amino acids, 68 base pairs of 5'-noncoding sequence, and 734 base pairs of 3'-noncoding sequence. The similarity between human and rat L-type PK was 86.9% at the nucleotide sequence level and 92.4% at the amino acid sequence level. The full-length L-type PK cDNA was placed under the promoter of simian virus 40 and introduced into monkey COS cells. Human L-type PK activity was detected in the extract of COS cells by the classical PK electrophoresis method. PMID:3126495

  10. Human liver type pyruvate kinase: Complete amino acid sequence and the expression in mammalian cells

    SciTech Connect

    Tani, Kenzaburo; Nagata, Shigekazu; Fujii, Hisaichi; Miwa, Shiro

    1988-03-01

    Pyruvate kinase (PK) has four isozymes (L, R, M1, M2) that are encoded by two different genes. Among these isozymes, abnormalities of liver (L)-type PK are considered to be associated with hereditary nonspherocytic hemolytic anemia in humans. The authors isolated and determined the full-length sequence of human L-type PK cDNA. The cDNA contains 1,629 base pairs encoding 543 amino acids, 68 base pairs of 5'-noncoding sequence, and 734 base pairs of 3'-noncoding sequence. The similarity between human and rat L-type PK was 86.9% at the nucleotide sequence level and 92.4% at the amino acid sequence level. The full-length L-type PK cDNA was placed under the promoter of simian virus 40 and introduced into monkey COS cells. Human L-type PK activity was detected in the extract of COS cells by the classical PK electrophoresis method.

  11. Gene-related strain variation of Staphylococcus aureus for homologous resistance response to acid stress.

    PubMed

    Lee, Soomin; Ahn, Sooyeon; Lee, Heeyoung; Kim, Won-Il; Kim, Hwang-Yong; Ryu, Jae-Gee; Kim, Se-Ri; Choi, Kyoung-Hee; Yoon, Yohan

    2014-10-01

    This study investigated the effect of adaptation of Staphylococcus aureus strains to the acidic condition of tomato in response to environmental stresses, such as heat and acid. S. aureus ATCC 13565, ATCC 14458, ATCC 23235, ATCC 27664, and NCCP10826 habituated in tomato extract at 35°C for 24 h were inoculated in tryptic soy broth. The culture suspensions were then subjected to heat challenge or acid challenge at 60°C and pH 3.0, respectively, for 60 min. In addition, transcriptional analysis using quantitative real-time PCR was performed to evaluate the expression level of acid-shock genes, such as clpB, zwf, nuoF, and gnd, from five S. aureus strains after the acid habituation of strains in tomato at 35°C for 15 min and 60 min in comparison with that of the nonhabituated strains. In comparison with the nonhabituated strains, the five tomato-habituated S. aureus strains did not show cross protection to heat, but tomato-habituated S. aureus ATCC 23235 showed acid resistance. In quantitative real-time-PCR analysis, the relative expression levels of acid-shock genes (clpB, zwf, nuoF, and gnd) were increased the most in S. aureus ATCC 23235 after 60 min of tomato habituation, but there was little difference in the expression levels among the five S. aureus strains after 15 min of tomato habituation. These results indicate that the variation of acid resistance of S. aureus is related to the expression of acid-shock genes during acid habituation. PMID:25285500

  12. Molecular cytogenetics by polymerase catalyzed amplification or in situ labelling of specific nucleic acid sequences

    SciTech Connect

    Bolund, L.; Brandt, C.; Hindkjaer, J.; Koch, J.; Koelvraa, S.; Pedersen, S.

    1993-01-01

    The Polymerase Chain Reaction (PCR) can be performed on isolated cells or chromosomes and the product can be analyzed by DNA technology or by FISH to test metaphases. The authors have had good experience analyzing aberrant chromosomes by FACS sorting, PCR with degenerate primers and painting of test metaphases with the PCR product. They also utilize polymerases for PRimed IN Situ labelling (PRINS) of specific nucleic acid sequences. In PRINS, oligonucleotides are hybridized to their target sequences and labeled nucleotides are incorporated at the site of hybridization with the oligonucleotide as primer. PRINS may eventually allow the study of individual genes, gene expression and even somatic mutations (in mRNA) in single cells.

  13. DNA Cloning of Plasmodium falciparum Circumsporozoite Gene: Amino Acid Sequence of Repetitive Epitope

    NASA Astrophysics Data System (ADS)

    Enea, Vincenzo; Ellis, Joan; Zavala, Fidel; Arnot, David E.; Asavanich, Achara; Masuda, Aoi; Quakyi, Isabella; Nussenzweig, Ruth S.

    1984-08-01

    A clone of complementary DNA encoding the circumsporozoite (CS) protein of the human malaria parasite Plasmodium falciparum has been isolated by screening an Escherichia coli complementary DNA library with a monoclonal antibody to the CS protein. The DNA sequence of the complementary DNA insert encodes a four-amino acid sequence: proline-asparagine-alanine-asparagine, tandemly repeated 23 times. The CS β-lactamase fusion protein specifically binds monoclonal antibodies to the CS protein and inhibits the binding of these antibodies to native Plasmodium falciparum CS protein. These findings provide a basis for the development of a vaccine against Plasmodium falciparum malaria.

  14. Method for high-volume sequencing of nucleic acids: random and directed priming with libraries of oligonucleotides

    DOEpatents

    Studier, F.W.

    1995-04-18

    Random and directed priming methods for determining nucleotide sequences by enzymatic sequencing techniques, using libraries of primers of lengths 8, 9 or 10 bases, are disclosed. These methods permit direct sequencing of nucleic acids as large as 45,000 base pairs or larger without the necessity for subcloning. Individual primers are used repeatedly to prime sequence reactions in many different nucleic acid molecules. Libraries containing as few as 10,000 octamers, 14,200 nonamers, or 44,000 decamers would have the capacity to determine the sequence of almost any cosmid DNA. Random priming with a fixed set of primers from a smaller library can also be used to initiate the sequencing of individual nucleic acid molecules, with the sequence being completed by directed priming with primers from the library. In contrast to random cloning techniques, a combined random and directed priming strategy is far more efficient. 2 figs.
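
    To put the quoted library sizes in perspective, they can be compared with the number of all possible primers of each length, which is 4^k for length k. The sketch below does only that arithmetic; it does not model how the libraries were selected, which the abstract does not describe.

```python
def library_coverage(k: int, library_size: int) -> tuple[int, float]:
    """Return (number of possible k-mers, fraction covered by the library)."""
    total = 4 ** k  # four possible bases at each of k positions
    return total, library_size / total


# Library sizes quoted in the abstract.
libraries = [("octamers", 8, 10_000), ("nonamers", 9, 14_200), ("decamers", 10, 44_000)]
coverage = {name: library_coverage(k, size) for name, k, size in libraries}
for name, (total, fraction) in coverage.items():
    print(f"{name}: {total} possible, library covers {fraction:.1%}")
```

    Each library is therefore a small fraction of the full primer space (roughly 15% of octamers and about 4 to 5% of nonamers and decamers), which is what makes a fixed, reusable library practical.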

  15. Method for high-volume sequencing of nucleic acids: random and directed priming with libraries of oligonucleotides

    DOEpatents

    Studier, F. William

    1995-04-18

    Random and directed priming methods for determining nucleotide sequences by enzymatic sequencing techniques, using libraries of primers of lengths 8, 9 or 10 bases, are disclosed. These methods permit direct sequencing of nucleic acids as large as 45,000 base pairs or larger without the necessity for subcloning. Individual primers are used repeatedly to prime sequence reactions in many different nucleic acid molecules. Libraries containing as few as 10,000 octamers, 14,200 nonamers, or 44,000 decamers would have the capacity to determine the sequence of almost any cosmid DNA. Random priming with a fixed set of primers from a smaller library can also be used to initiate the sequencing of individual nucleic acid molecules, with the sequence being completed by directed priming with primers from the library. In contrast to random cloning techniques, a combined random and directed priming strategy is far more efficient.
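
    As a rough check on the library sizes quoted in the two records above, the sketch below compares each library with the number of distinct primers that exist at that length (four bases per position, so 4^k for length k). The function names are hypothetical, and the calculation only counts sequences; it says nothing about how the patent selects which primers to include.

```python
# Library sizes quoted in the abstract: primer length -> number of primers.
LIBRARY_SIZES = {8: 10_000, 9: 14_200, 10: 44_000}


def possible_primers(length: int) -> int:
    """Number of distinct DNA oligonucleotides of a given length (A, C, G, T per position)."""
    return 4 ** length


def library_coverage(length: int, library_size: int) -> float:
    """Fraction of all possible primers of this length that the library contains."""
    return library_size / possible_primers(length)


for length, size in LIBRARY_SIZES.items():
    total = possible_primers(length)
    fraction = library_coverage(length, size)
    print(f"{length}-mers: {size:,} of {total:,} possible ({fraction:.1%})")
```

    On these numbers the quoted libraries hold roughly 15% of all octamers, 5% of all nonamers and 4% of all decamers.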

  16. Individual Variation in Lipidomic Profiles of Healthy Subjects in Response to Omega-3 Fatty Acids

    PubMed Central

    Nording, Malin L.; Yang, Jun; Georgi, Katrin; Hegedus Karbowski, Christine; German, J. Bruce; Weiss, Robert H.; Hogg, Ronald J.; Trygg, Johan; Hammock, Bruce D.; Zivkovic, Angela M.

    2013-01-01

    Introduction Conflicting findings in both interventional and observational studies have resulted in a lack of consensus on the benefits of ω3 fatty acids in reducing disease risk. This may be due to individual variability in response. We used a multi-platform lipidomic approach to investigate both the consistent and inconsistent responses of individuals comprehensively to a defined ω3 intervention. Methods The lipidomic profile including fatty acids, lipid classes, lipoprotein distribution, and oxylipins was examined multi- and uni-variately in 12 healthy subjects pre vs. post six weeks of ω3 fatty acids (1.9 g/d eicosapentaenoic acid [EPA] and 1.5 g/d docosahexaenoic acid [DHA]). Results Total lipidomic and oxylipin profiles were significantly different pre vs. post treatment across all subjects (p=0.00007 and p=0.00002 respectively). There was a strong correlation between oxylipin profiles and EPA and DHA incorporated into different lipid classes (r2=0.93). However, strikingly divergent responses among individuals were also observed. Both ω3 and ω6 fatty acid metabolites displayed a large degree of variation among the subjects. For example, in half of the subjects, two arachidonic acid cyclooxygenase products, prostaglandin E2 (PGE2) and thromboxane B2 (TXB2), and a lipoxygenase product, 12-hydroxyeicosatetraenoic acid (12-HETE) significantly decreased post intervention, whereas in the other half they either did not change or increased. The EPA lipoxygenase metabolite 12-hydroxyeicosapentaenoic acid (12-HEPE) varied among subjects from an 82% decrease to a 5,000% increase. Conclusions Our results show that certain defined responses to ω3 fatty acid intervention were consistent across all subjects. However, there was also a high degree of inter-individual variability in certain aspects of lipid metabolism. This lipidomic based phenotyping approach demonstrated that individual responsiveness to ω3 fatty acids is highly variable and measurable, and could be

  17. Partial amino acid sequence of apolipoprotein(a) shows that it is homologous to plasminogen

    SciTech Connect

    Eaton, D.L.; Fless, G.M.; Kohr, W.J.; McLean, J.W.; Xu, Q.T.; Miller, C.G.; Lawn, R.M.; Scanu, A.M.

    1987-05-01

    Apolipoprotein(a) (apo(a)) is a glycoprotein with M_r approx. 280,000 that is disulfide linked to apolipoprotein B in lipoprotein(a) particles. Elevated plasma levels of lipoprotein(a) are correlated with atherosclerosis. Partial amino acid sequence of apo(a) shows that it has striking homology to plasminogen. Plasminogen is a plasma serine protease zymogen that consists of five homologous and tandemly repeated domains called kringles and a trypsin-like protease domain. The amino-terminal sequence obtained for apo(a) is homologous to the beginning of kringle 4 but not the amino terminus of plasminogen. Apo(a) was subjected to limited proteolysis by trypsin or V8 protease, and fragments generated were isolated and sequenced. Sequences obtained from several of these fragments are highly (77-100%) homologous to plasminogen residues 391-421, which reside within kringle 4. Analysis of these internal apo(a) sequences revealed that apo(a) may contain at least two kringle 4-like domains. A sequence obtained from another tryptic fragment also shows homology to the end of kringle 4 and the beginning of kringle 5. Sequence data obtained from the two tryptic fragments shows homology with the protease domain of plasminogen. One of these sequences is homologous to the sequences surrounding the activation site of plasminogen. Plasminogen is activated by the cleavage of a specific arginine residue by urokinase and tissue plasminogen activator; however, the corresponding site in apo(a) is a serine that would not be cleaved by tissue plasminogen activator or urokinase. Using a plasmin-specific assay, no proteolytic activity could be demonstrated for lipoprotein(a) particles. These results suggest that apo(a) contains kringle-like domains and an inactive protease domain.

  18. Phylogenetic and functional analysis of sequence variation of human papillomavirus type 31 E6 and E7 oncoproteins.

    PubMed

    Ferenczi, Annamária; Gyöngyösi, Eszter; Szalmás, Anita; László, Brigitta; Kónya, József; Veress, György

    2016-09-01

    High-risk human papillomaviruses (HPV) are the causative agents of cervical and other anogenital cancers as well as a subset of head and neck cancers. The E6 and E7 oncoproteins of HPV contribute to oncogenesis by associating with the tumour suppressor proteins p53 and pRb, respectively. For HPV types 16 and 18, intratypic sequence variation was shown to have biological and clinical significance. The functional significance of sequence variation among HPV 31 variants was studied less intensively. HPV 31 variants belonging to different variant lineages were found to have differences in persistence and in the ability to cause high grade cervical intraepithelial neoplasia. In the present study, we started to explore the functional effects of natural sequence variation of HPV 31 E6 and E7 oncoproteins. The E6 variants were tested for their effects on p53 protein stability and transcriptional activity, while the E7 variants were tested for their effects on pRb protein level and also on the transcriptional activity of E2F transcription factors. HPV 31 E7 variants displayed uniform effects on pRb stability and also on the activity of E2F transcription factors. HPV 31 E6 variants had remarkable differences in the ability to inhibit the trans-activation function of p53 but not in the ability to induce the in vivo degradation of p53. Our results indicate that natural sequence variation of the HPV 31 E6 protein may be involved in the observed differences in the oncogenic potential between HPV 31 variants. PMID:27197052

  19. The Complete Genome Sequence of the Lactic Acid Bacterium Lactococcus lactis ssp. lactis IL1403

    PubMed Central

    Bolotin, Alexander; Wincker, Patrick; Mauger, Stéphane; Jaillon, Olivier; Malarme, Karine; Weissenbach, Jean; Ehrlich, S. Dusko; Sorokin, Alexei

    2001-01-01

    Lactococcus lactis is a nonpathogenic AT-rich gram-positive bacterium closely related to the genus Streptococcus and is the most commonly used cheese starter. It is also the best-characterized lactic acid bacterium. We sequenced the genome of the laboratory strain IL1403, using a novel two-step strategy that comprises diagnostic sequencing of the entire genome and a shotgun polishing step. The genome contains 2,365,589 base pairs and encodes 2310 proteins, including 293 protein-coding genes belonging to six prophages and 43 insertion sequence (IS) elements. Nonrandom distribution of IS elements indicates that the chromosome of the sequenced strain may be a product of recent recombination between two closely related genomes. A complete set of late competence genes is present, indicating the ability of L. lactis to undergo DNA transformation. Genomic sequence revealed new possibilities for fermentation pathways and for aerobic respiration. It also indicated a horizontal transfer of genetic information from Lactococcus to gram-negative enteric bacteria of Salmonella-Escherichia group. [The sequence data described in this paper has been submitted to the GenBank data library under accession no. AE005176.] PMID:11337471

  20. Sequence variation in the IL4 gene and resistance to Trypanosoma cruzi infection in Bolivians

    PubMed Central

    Alvarado Arnez, Lucia Elena; Venegas, Evaristo N.; Ober, Carole; Thompson, Emma E.

    2013-01-01

    Summary Variation in the IL4 gene has been associated with parasitic infections, but has not been studied in Bolivians infected with Trypanosoma cruzi. Our results suggest that variation at IL4 influences susceptibility to T. cruzi infection in Bolivians. PMID:21211660

  1. Self-sequencing of amino acids and origins of polyfunctional protocells.

    PubMed

    Fox, S W

    1984-01-01

    The primal role of the origins of proteins in molecular evolution is discussed. On the basis of this premise, the significance of the experimentally established self-sequencing of amino acids under simulated geological conditions is explained as due to the fact that the products are highly nonrandom and accordingly contain many kinds of information. When such thermal proteins are aggregated into laboratory protocells, an action that occurs readily, the resultant protocells also contain many kinds of information. Residue-by-residue order, enzymic activities, and lipid quality accordingly occur within each preparation of proteinoid (thermal protein). In this paper are reviewed briefly the phenomenon of self-sequencing of amino acids, its relationship to evolutionary processes, other significance of such self-ordering, and the experimental evidence for original polyfunctional protocells. PMID:6462684

  2. Self-Sequencing of Amino Acids and Origins of Polyfunctional Protocells

    NASA Astrophysics Data System (ADS)

    Fox, Sidney W.

    1984-12-01

    The primal role of the origins of proteins in molecular evolution is discussed. On the basis of this premise, the significance of the experimentally established self-sequencing of amino acids under simulated geological conditions is explained as due to the fact that the products are highly nonrandom and accordingly contain many kinds of information. When such thermal proteins are aggregated into laboratory protocells, an action that occurs readily, the resultant protocells also contain many kinds of information. Residue-by-residue order, enzymic activities, and lipid quality accordingly occur within each preparation of proteinoid (thermal protein). In this paper are reviewed briefly the phenomenon of self-sequencing of amino acids, its relationship to evolutionary processes, other significance of such self-ordering, and the experimental evidence for original polyfunctional protocells.

  3. Detailed Analysis of Sequence Changes Occurring during vlsE Antigenic Variation in the Mouse Model of Borrelia burgdorferi Infection

    PubMed Central

    Coutte, Loïc; Botkin, Douglas J.; Gao, Lihui; Norris, Steven J.

    2009-01-01

    Lyme disease Borrelia can infect humans and animals for months to years, despite the presence of an active host immune response. The vls antigenic variation system, which expresses the surface-exposed lipoprotein VlsE, plays a major role in B. burgdorferi immune evasion. Gene conversion between vls silent cassettes and the vlsE expression site occurs at high frequency during mammalian infection, resulting in sequence variation in the VlsE product. In this study, we examined vlsE sequence variation in B. burgdorferi B31 during mouse infection by analyzing 1,399 clones isolated from bladder, heart, joint, ear, and skin tissues of mice infected for 4 to 365 days. The median number of codon changes increased progressively in C3H/HeN mice from 4 to 28 days post infection, and no clones retained the parental vlsE sequence at 28 days. In contrast, the decrease in the number of clones with the parental vlsE sequence and the increase in the number of sequence changes occurred more gradually in severe combined immunodeficiency (SCID) mice. Clones containing a stop codon were isolated, indicating that continuous expression of full-length VlsE is not required for survival in vivo; also, these clones continued to undergo vlsE recombination. Analysis of clones with apparent single recombination events indicated that recombinations into vlsE are nonselective with regard to the silent cassette utilized, as well as the length and location of the recombination event. Sequence changes as small as one base pair were common. Fifteen percent of recovered vlsE variants contained “template-independent” sequence changes, which clustered in the variable regions of vlsE. We hypothesize that the increased frequency and complexity of vlsE sequence changes observed in clones recovered from immunocompetent mice (as compared with SCID mice) is due to rapid clearance of relatively invariant clones by variable region-specific anti-VlsE antibody responses. PMID:19214205

  4. Conservation of the C-type lectin fold for massive sequence variation in a Treponema diversity-generating retroelement

    SciTech Connect

    Le Coq, Johanne; Ghosh, Partho

    2012-06-19

    Anticipatory ligand binding through massive protein sequence variation is rare in biological systems, having been observed only in the vertebrate adaptive immune response and in a phage diversity-generating retroelement (DGR). Earlier work has demonstrated that the prototypical DGR variable protein, major tropism determinant (Mtd), meets the demands of anticipatory ligand binding by novel means through the C-type lectin (CLec) fold. However, because of the low sequence identity among DGR variable proteins, it has remained unclear whether the CLec fold is a general solution for DGRs. We have addressed this problem by determining the structure of a second DGR variable protein, TvpA, from the pathogenic oral spirochete Treponema denticola. Despite its weak sequence identity to Mtd (~16%), TvpA was found to also have a CLec fold, with predicted variable residues exposed in a ligand-binding site. However, this site in TvpA was markedly more variable than the one in Mtd, reflecting the unprecedented ~10^20 potential variability of TvpA. In addition, similarity between TvpA and Mtd with formylglycine-generating enzymes was detected. These results provide strong evidence for the conservation of the formylglycine-generating enzyme-type CLec fold among DGRs as a means of accommodating massive sequence variation.

  5. Genome sequencing of Metrosideros polymorpha (Myrtaceae), a dominant species in various habitats in the Hawaiian Islands with remarkable phenotypic variations.

    PubMed

    Izuno, Ayako; Hatakeyama, Masaomi; Nishiyama, Tomoaki; Tamaki, Ichiro; Shimizu-Inatsugi, Rie; Sasaki, Ryuta; Shimizu, Kentaro K; Isagi, Yuji

    2016-07-01

    Whole genome sequences, which can be provided even for non-model organisms owing to high-throughput sequencers, are valuable in enhancing the understanding of adaptive evolution. Metrosideros polymorpha, a tree species endemic to the Hawaiian Islands, occupies a wide range of ecological habitats and shows remarkable polymorphism in phenotypes among/within populations. The biological functions of genetic variations observed within this species could provide significant insights into the adaptive radiation found in a single species. Here de novo assembled genome sequences of M. polymorpha are presented to reveal basic genomic parameters about this species and to develop our knowledge of ecological divergences. The assembly yielded 304-Mbp genome sequences, half of which were covered by 19 scaffolds with >5 Mbp, and contained 30 K protein-coding genes. Demographic history inferred from the genome-wide heterozygosity indicated that this species experienced a dramatic rise and fall in the effective population size, possibly owing to past geographic or climatic changes in the Hawaiian Islands. This M. polymorpha genome assembly represents a high-quality genome resource useful for future functional analyses of both intra- and interspecies genetic variations or comparative genomics. PMID:27052216

  6. The Effects of Sequence Variation on Genome-wide NRF2 Binding—New Target Genes and Regulatory SNPs

    PubMed Central

    Kuosmanen, Suvi M.; Viitala, Sari; Laitinen, Tuomo; Peräkylä, Mikael; Pölönen, Petri; Kansanen, Emilia; Leinonen, Hanna; Raju, Suresh; Wienecke-Baldacchino, Anke; Närvänen, Ale; Poso, Antti; Heinäniemi, Merja; Heikkinen, Sami; Levonen, Anna-Liisa

    2016-01-01

    Transcription factor binding specificity is crucial for proper target gene regulation. Motif discovery algorithms identify the main features of the binding patterns, but the accuracy on the lower affinity sites is often poor. Nuclear factor E2-related factor 2 (NRF2) is a ubiquitous redox-activated transcription factor having a key protective role against endogenous and exogenous oxidant and electrophile stress. Herein, we decipher the effects of sequence variation on the DNA binding sequence of NRF2, in order to identify both genome-wide binding sites for NRF2 and disease-associated regulatory SNPs (rSNPs) with drastic effects on NRF2 binding. Interactions between NRF2 and DNA were studied using molecular modelling, and NRF2 chromatin immunoprecipitation-sequence datasets together with protein binding microarray measurements were utilized to study binding sequence variation in detail. The binding model thus generated was used to identify genome-wide binding sites for NRF2, and genomic binding sites with rSNPs that have strong effects on NRF2 binding and reside on active regulatory elements in human cells. As a proof of concept, miR-126–3p and -5p were identified as NRF2 target microRNAs, and a rSNP (rs113067944) residing on NRF2 target gene (Ferritin, light polypeptide, FTL) promoter was experimentally verified to decrease NRF2 binding and result in decreased transcriptional activity. PMID:26826707

  7. Self-sequencing of amino acids and origins of polyfunctional protocells

    NASA Technical Reports Server (NTRS)

    Fox, S. W.

    1984-01-01

    The role of proteins in the origin of living things is discussed. It has been experimentally established that amino acids can sequence themselves under simulated geological conditions with highly nonrandom products which accordingly contain diverse information. Multiple copies of each type of macromolecule are formed, resulting in greater power for any protoenzymic molecule than would accrue from a single copy of each type. Thermal proteins are readily incorporated into laboratory protocells. The experimental evidence for original polyfunctional protocells is discussed.

  8. Snake venom. The amino acid sequence of protein A from Dendroaspis polylepis polylepis (black mamba) venom.

    PubMed

    Joubert, F J; Strydom, D J

    1980-12-01

    Protein A from Dendroaspis polylepis polylepis venom comprises 81 amino acids, including ten half-cystine residues. The complete primary structures of protein A and its variant A' were elucidated. The sequences of proteins A and A', which differ in a single position, show no homology with various neurotoxins and non-neurotoxic proteins and represent a new type of elapid venom protein. PMID:7461607

  9. Chromosomal localization and sequence variation of 5S rRNA gene in five Capsicum species.

    PubMed

    Park, Y K; Park, K C; Park, C H; Kim, N S

    2000-02-29

    Chromosomal localization and sequence analysis of the 5S rRNA gene were carried out in five Capsicum species. Fluorescence in situ hybridization revealed that chromosomal location of the 5S rRNA gene was conserved in a single locus at a chromosome which was assigned to chromosome 1 by the synteny relationship with tomato. In sequence analysis, the repeating units of the 5S rRNA genes in the Capsicum species were variable in size from 278 bp to 300 bp. In sequence comparison of our results to the results with other Solanaceae plants as published by others, the coding region was highly conserved, but the spacer regions varied in size and sequence. T stretch regions, just after the end of the coding sequences, were more prominent in the Capsicum species than in two other plants. High G·C rich regions, which might have similar functions as that of the GC islands in the genes transcribed by RNA PolII, were observed after the T stretch region. Although we could not observe the TATA like sequences, an AT rich segment at -27 to -18 was detected in the 5S rRNA genes of the Capsicum species. Species relationship among the Capsicum species was also studied by the sequence comparison of the 5S rRNA genes. While C. chinense, C. frutescens, and C. annuum formed one lineage, C. baccatum was revealed to be an intermediate species between the former three species and C. pubescens. PMID:10774742

  10. Sequence variation determining stereochemistry of a Δ11 desaturase active in moth sex pheromone biosynthesis.

    PubMed

    Ding, Bao-Jian; Carraher, Colm; Löfstedt, Christer

    2016-07-01

    A Δ11 desaturase from the oblique banded leaf roller moth Choristoneura rosaceana takes the saturated myristic acid and produces a mixture of (E)-11-tetradecenoate and (Z)-11-tetradecenoate with an excess of the Z isomer (35:65). A desaturase from the spotted fireworm moth Choristoneura parallela also operates on myristic acid substrate but produces almost pure (E)-11-tetradecenoate. The two desaturases share 92% amino acid identity and 97% amino acid similarity. There are 24 amino acids differing between these two desaturases. We constructed mutations at all of these positions to pinpoint the sites that determine the product stereochemistry. We demonstrated with a yeast functional assay that one amino acid at the cytosolic carboxyl terminus of the protein (258E) is critical for the Z activity of the C. rosaceana desaturase. Mutating the glutamic acid (E) into aspartic acid (D) transforms the C. rosaceana enzyme into a desaturase with C. parallela-like activity, whereas the reciprocal mutation of the C. parallela desaturase transformed it into an enzyme producing an intermediate 64:36 E/Z product ratio. We discuss the causal link between this amino acid change and the stereochemical properties of the desaturase and the role of desaturase mutations in pheromone evolution. PMID:27163509
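
    The identity figure quoted in this abstract is a simple ratio of matching positions over aligned positions, which the sketch below computes for two already-aligned, equal-length sequences. The function name and the toy sequences are hypothetical. The 300-residue length is an assumption chosen only because 24 differences in 300 positions reproduces the quoted 92%; the record does not state the protein length.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percentage of positions with identical residues in two aligned sequences."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    if not aligned_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(1 for a, b in zip(aligned_a, aligned_b) if a == b)
    return 100.0 * matches / len(aligned_a)


# Toy pair: 300 positions (assumed length), 24 of which differ.
toy_a = "A" * 300
toy_b = "G" * 24 + "A" * 276

print(percent_identity(toy_a, toy_b))
```

    Similarity (the 97% figure) would additionally count conservative substitutions as matches, which needs a substitution matrix and is not attempted here.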

  11. Organosulfates and organic acids in Arctic aerosols: speciation, annual variation and concentration levels

    NASA Astrophysics Data System (ADS)

    Hansen, A. M. K.; Kristensen, K.; Nguyen, Q. T.; Zare, A.; Cozzi, F.; Nøjgaard, J. K.; Skov, H.; Brandt, J.; Christensen, J. H.; Ström, J.; Tunved, P.; Krejci, R.; Glasius, M.

    2014-02-01

    Sources, composition and occurrence of secondary organic aerosols (SOA) in the Arctic were investigated at Zeppelin Mountain, Svalbard, and Station Nord, northeast Greenland, during the full annual cycle of 2008 and 2010 respectively. We focused on the speciation of three types of SOA tracers: organic acids, organosulfates and nitrooxy organosulfates from both anthropogenic and biogenic precursors, here presenting organosulfate concentrations and compositions during a full annual cycle and chemical speciation of organosulfates in Arctic aerosols for the first time. Aerosol samples were analysed using High Performance Liquid Chromatography coupled to a quadrupole Time-of-Flight mass spectrometer (HPLC-q-TOF-MS). A total of 11 organic acids (terpenylic acid, benzoic acid, phthalic acid, pinic acid, suberic acid, azelaic acid, adipic acid, pimelic acid, pinonic acid, diaterpenylic acid acetate (DTAA) and 3-methyl-1,2,3-butanetricarboxylic acid (MBTCA)), 12 organosulfates and one nitrooxy organosulfate were identified at the two sites. Six out of the 12 organosulfates are reported for the first time. Concentrations of organosulfates follow a distinct annual pattern at Station Nord, where high concentrations were observed in late winter and early spring, with a mean total concentration of 47 (±14) ng m⁻³, accounting for 7 (±2)% of total organic matter, contrary to a considerably lower organosulfate mean concentration of 2 (±3) ng m⁻³ (accounting for 1 (±1)% of total organic matter) observed during the rest of the year. The organic acids followed the same temporal trend as the organosulfates at Station Nord; however the variations in organic acid concentrations were less pronounced, with a total mean organic acid concentration of 11.5 (±4) ng m⁻³ (accounting for 1.7 (±0.6)% of total organic matter) in late winter and early spring, and 2.2 (±1) ng m⁻³ (accounting for 0.9 (±0.4)% of total organic matter) during the rest of the year. At Zeppelin Mountain

  12. Characterization of the microbial acid mine drainage microbial community using culturing and direct sequencing techniques.

    PubMed

    Auld, Ryan R; Myre, Maxine; Mykytczuk, Nadia C S; Leduc, Leo G; Merritt, Thomas J S

    2013-05-01

    We characterized the bacterial community from an AMD tailings pond using both classical culturing and modern direct sequencing techniques and compared the two methods. Acid mine drainage (AMD) is produced by the environmental and microbial oxidation of minerals dissolved from mining waste. Surprisingly, we know little about the microbial communities associated with AMD, despite the fundamental ecological roles of these organisms and large-scale economic impact of these waste sites. AMD microbial communities have classically been characterized by laboratory culturing-based techniques and more recently by direct sequencing of marker gene sequences, primarily the 16S rRNA gene. In our comparison of the techniques, we find that their results are complementary, overall indicating very similar community structure with similar dominant species, but with each method identifying some species that were missed by the other. We were able to culture the majority of species that our direct sequencing results indicated were present, primarily species within the Acidithiobacillus and Acidiphilium genera, although estimates of relative species abundance were only obtained from direct sequencing. Interestingly, our culture-based methods recovered four species that had been overlooked from our sequencing results because of the rarity of the marker gene sequences, likely members of the rare biosphere. Further, direct sequencing indicated that a single genus, completely missed in our culture-based study, Legionella, was a dominant member of the microbial community. Our results suggest that while either method does a reasonable job of identifying the dominant members of the AMD microbial community, together the methods combine to give a more complete picture of the true diversity of this environment. PMID:23485423

  13. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2011 CFR

    2011-07-01

    ... approved by the Director of the Federal Register in accordance with 5 U.S.C. 552(a) and 1 CFR part 51... base or modified or unusual amino acid may be presented in a given sequence as the corresponding unmodified base or amino acid if the modified base or modified or unusual amino acid is one of those...

  14. 37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

    Code of Federal Regulations, 2010 CFR

    2010-07-01

    ... approved by the Director of the Federal Register in accordance with 5 U.S.C. 552(a) and 1 CFR part 51... base or modified or unusual amino acid may be presented in a given sequence as the corresponding unmodified base or amino acid if the modified base or modified or unusual amino acid is one of those...

  15. Nanopore Analysis of Nucleic Acids: Single-Molecule Studies of Molecular Dynamics, Structure, and Base Sequence

    NASA Astrophysics Data System (ADS)

    Olasagasti, Felix; Deamer, David W.

    Nucleic acids are linear polynucleotides in which each base is covalently linked to a pentose sugar and a phosphate group carrying a negative charge. If a pore having roughly the cross-sectional diameter of a single-stranded nucleic acid is embedded in a thin membrane and a voltage of 100 mV or more is applied, individual nucleic acids in solution can be captured by the electrical field in the pore and translocated through by single-molecule electrophoresis. The dimensions of the pore cannot accommodate anything larger than a single strand, so each base in the molecule passes through the pore in strict linear sequence. The nucleic acid strand occupies a large fraction of the pore's volume during translocation and therefore produces a transient blockade of the ionic current created by the applied voltage. If it could be demonstrated that each nucleotide in the polymer produced a characteristic modulation of the ionic current during its passage through the nanopore, the sequence of current modulations would reflect the sequence of bases in the polymer. According to this basic concept, nanopores are analogous to a Coulter counter that detects nanoscopic molecules rather than microscopic [1,2]. However, the advantage of nanopores is that individual macromolecules can be characterized because different chemical and physical properties affect their passage through the pore. Because macromolecules can be captured in the pore as well as translocated, the nanopore can be used to detect individual functional complexes that form between a nucleic acid and an enzyme. No other technique has this capability.

  16. Complete plastid genome sequence of Primula sinensis (Primulaceae): structure comparison, sequence variation and evidence for accD transfer to nucleus

    PubMed Central

    Liu, Tong-Jian; Zhang, Cai-Yun; Yan, Hai-Fei; Zhang, Lu

    2016-01-01

    The species-rich genus Primula L. is a typical plant group in which to study genetic variation between species at different levels of relationship. Chloroplast genome sequences are used as the information resource for quantifying this variation and reconstructing evolutionary history. In this study, we reported the complete chloroplast genome sequence of Primula sinensis and compared it with other related species. The chloroplast genome showed a typical circular quadripartite structure, 150,859 bp in length with a GC content of 37.2%. Two inverted repeat (IR) regions (25,535 bp each) were separated by a large single-copy region (82,064 bp) and a small single-copy region (17,725 bp). The genome consists of 112 genes, including 78 protein-coding genes, 30 tRNA genes and four rRNA genes. Among them, seven coding genes, seven tRNA genes and four rRNA genes have two copies due to their locations in the IR regions. The accD and infA genes lacking intact open reading frames (ORF) were identified as pseudogenes. SSR and sequence variation analyses were also performed on the plastome of Primula sinensis, compared with another available plastome, that of P. poissonii. The four most variable regions, rpl36–rps8, rps16–trnQ, trnH–psbA and ndhC–trnV, were identified. Phylogenetic relationship estimates using three sub-datasets extracted from a matrix of 57 protein-coding gene sequences gave identical results that were consistent with previous studies. A transcript found in the P. sinensis transcriptome showed high similarity to the plastid accD functional region and carried a putative plastid transit peptide at its N-terminal region. The result strongly suggested that plastid accD has been functionally transferred to the nucleus in P. sinensis. PMID:27375965
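
    The region lengths quoted in this abstract can be cross-checked with a line of arithmetic: in a quadripartite plastome the total is the large single-copy region plus the small single-copy region plus two copies of the inverted repeat. The sketch below assumes the 25,535 bp figure is the length of each IR copy; the function and constant names are hypothetical.

```python
# Region lengths in base pairs, as quoted in the abstract.
LSC = 82_064            # large single-copy region
SSC = 17_725            # small single-copy region
IR = 25_535             # one inverted repeat copy (assumed to be the per-copy length)
REPORTED_TOTAL = 150_859


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total plastome length: LSC + SSC + two inverted repeat copies."""
    return lsc + ssc + 2 * ir


total = quadripartite_length(LSC, SSC, IR)
print(total, total == REPORTED_TOTAL)
```

    The sum matches the reported genome length, which supports reading 25,535 bp as the size of each repeat rather than of both combined.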

  17. Invasive cleavage of nucleic acids

    DOEpatents

    Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.

    1999-01-01

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.

  18. Invasive cleavage of nucleic acids

    DOEpatents

    Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.

    2002-01-01

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.

  19. Characterizing novel endogenous retroviruses from genetic variation inferred from short sequence reads

    PubMed Central

    Mourier, Tobias; Mollerup, Sarah; Vinner, Lasse; Hansen, Thomas Arn; Kjartansdóttir, Kristín Rós; Guldberg Frøslev, Tobias; Snogdal Boutrup, Torsten; Nielsen, Lars Peter; Willerslev, Eske; Hansen, Anders J.

    2015-01-01

    From Illumina sequencing of DNA from brain and liver tissue from the lion, Panthera leo, and tumor samples from the pike-perch, Sander lucioperca, we obtained two assembled sequence contigs with similarity to known retroviruses. Phylogenetic analyses suggest that the pike-perch retrovirus belongs to the epsilonretroviruses, and the lion retrovirus to the gammaretroviruses. To determine if these novel retroviral sequences originate from an endogenous retrovirus or from a recently integrated exogenous retrovirus, we assessed the genetic diversity of the parental sequences from which the short Illumina reads are derived. First, we showed by simulations that we can robustly infer the level of genetic diversity from short sequence reads. Second, we find that the measures of nucleotide diversity inferred from our retroviral sequences significantly exceed the level observed from Human Immunodeficiency Virus infections, prompting us to conclude that the novel retroviruses are both of endogenous origin. Through further simulations, we rule out the possibility that the observed elevated levels of nucleotide diversity are the result of co-infection with two closely related exogenous retroviruses. PMID:26493184
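    As a rough illustration of how nucleotide diversity can be estimated from reads piled up over a site, the sketch below uses a standard per-site estimator (expected heterozygosity with a small-sample correction). It is not taken from the paper and may differ from the authors' procedure; the function names and the input layout (one dictionary of base counts per site) are assumptions made for the example.

```python
def site_diversity(counts):
    """Per-site diversity from base counts, e.g. {"A": 5, "G": 5}.

    Uses (n / (n - 1)) * (1 - sum(p_i ** 2)), where p_i is the frequency
    of base i among the n reads covering the site.
    Returns None when fewer than two reads cover the site.
    """
    n = sum(counts.values())
    if n < 2:
        return None
    homozygosity = sum((c / n) ** 2 for c in counts.values())
    return (n / (n - 1)) * (1.0 - homozygosity)


def mean_diversity(pileup):
    """Average per-site diversity over all sites with at least two reads."""
    values = [d for d in (site_diversity(c) for c in pileup) if d is not None]
    return sum(values) / len(values) if values else 0.0


# Toy pileup of three sites (invented counts, for illustration only).
example_pileup = [{"A": 10}, {"A": 5, "G": 5}, {"C": 9, "T": 1}]
print(mean_diversity(example_pileup))
```

    Real read data would also need handling of sequencing errors and uneven coverage, which this sketch ignores.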

  20. Characterizing novel endogenous retroviruses from genetic variation inferred from short sequence reads.

    PubMed

    Mourier, Tobias; Mollerup, Sarah; Vinner, Lasse; Hansen, Thomas Arn; Kjartansdóttir, Kristín Rós; Guldberg Frøslev, Tobias; Snogdal Boutrup, Torsten; Nielsen, Lars Peter; Willerslev, Eske; Hansen, Anders J

    2015-01-01

    From Illumina sequencing of DNA from brain and liver tissue from the lion, Panthera leo, and tumor samples from the pike-perch, Sander lucioperca, we obtained two assembled sequence contigs with similarity to known retroviruses. Phylogenetic analyses suggest that the pike-perch retrovirus belongs to the epsilonretroviruses, and the lion retrovirus to the gammaretroviruses. To determine if these novel retroviral sequences originate from an endogenous retrovirus or from a recently integrated exogenous retrovirus, we assessed the genetic diversity of the parental sequences from which the short Illumina reads are derived. First, we showed by simulations that we can robustly infer the level of genetic diversity from short sequence reads. Second, we find that the measures of nucleotide diversity inferred from our retroviral sequences significantly exceed the level observed from Human Immunodeficiency Virus infections, prompting us to conclude that the novel retroviruses are both of endogenous origin. Through further simulations, we rule out the possibility that the observed elevated levels of nucleotide diversity are the result of co-infection with two closely related exogenous retroviruses. PMID:26493184

  1. Complete amino acid sequence of a histidine-rich proteolytic fragment of human ceruloplasmin.

    PubMed

    Kingston, I B; Kingston, B L; Putnam, F W

    1979-04-01

    The complete amino acid sequence has been determined for a fragment of human ceruloplasmin [ferroxidase; iron(II):oxygen oxidoreductase, EC 1.16.3.1]. The fragment (designated Cp F5) contains 159 amino acid residues and has a molecular weight of 18,650; it lacks carbohydrate, is rich in histidine, and contains one free cysteine that may be part of a copper-binding site. This fragment is present in most commercial preparations of ceruloplasmin, probably owing to proteolytic degradation, but can also be obtained by limited cleavage of single-chain ceruloplasmin with plasmin. Cp F5 probably is an intact domain attached to the COOH-terminal end of single-chain ceruloplasmin via a labile interdomain peptide bond. A model of the secondary structure predicted by empirical methods suggests that almost one-third of the amino acid residues are distributed in alpha helices, about a third in beta-sheet structure, and the remainder in beta turns and unidentified structures. Computer analysis of the amino acid sequence has not demonstrated a statistically significant relationship between this ceruloplasmin fragment and any other protein, but there is some evidence for an internal duplication. PMID:287005

  2. Variation in the Nucleotide Sequence of Cottontail Rabbit Papillomavirus a and b Subtypes Affects Wart Regression and Malignant Transformation and Level of Viral Replication in Domestic Rabbits

    PubMed Central

    Salmon, Jérôme; Nonnenmacher, Mathieu; Cazé, Sandrine; Flamant, Patricia; Croissant, Odile; Orth, Gérard; Breitburd, Françoise

    2000-01-01

    We previously reported the partial characterization of two cottontail rabbit papillomavirus (CRPV) subtypes with strikingly divergent E6 and E7 oncoproteins. We report now the complete nucleotide sequences of these subtypes, referred to as CRPVa4 (7,868 nucleotides) and CRPVb (7,867 nucleotides). The CRPVa4 and CRPVb genomes differed at 238 (3%) nucleotide positions, whereas CRPVa4 and the prototype CRPV differed by only 5 nucleotides. The most variable region (7% nucleotide divergence) included the long regulatory region (LRR) and the E6 and E7 genes. A mutation in the stop codon resulted in an 8-amino-acid-longer CRPVb E4 protein, and a nucleotide deletion reduced the coding capacity of the E5 gene from 101 to 25 amino acids. In domestic rabbits homozygous for a specific haplotype of the DRA and DQA genes of the major histocompatibility complex, warts induced by CRPVb DNA or a chimeric genome containing the CRPVb LRR/E6/E7 region showed an early regression, whereas warts induced by CRPVa4 or a chimeric genome containing the CRPVa4 LRR/E6/E7 region persisted and evolved into carcinomas. In contrast, most CRPVa, CRPVb, and chimeric CRPV DNA-induced warts showed no early regression in rabbits homozygous for another DRA-DQA haplotype. Little, if any, viral replication is usually observed in domestic rabbit warts. When warts induced by CRPVa and CRPVb virions and DNA were compared, the number of cells positive for viral DNA or capsid antigens was found to be greater by 1 order of magnitude for specimens induced by CRPVb. Thus, both sequence variation in the LRR/E6/E7 region and the genetic constitution of the host influence the expression of the oncogenic potential of CRPV. Furthermore, intratype variation may overcome to some extent the host restriction of CRPV replication in domestic rabbits. PMID:11044121
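    The divergence percentages in this abstract follow from the counts it reports. The short sketch below recomputes them; the helper name is hypothetical, and using the CRPVa4 genome length (7,868 nucleotides) as the denominator is an assumption, since the abstract does not say which length was used.

```python
def percent_divergence(differences, length):
    """Percentage of positions that differ between two aligned genomes."""
    return 100.0 * differences / length


CRPVA4_LENGTH = 7_868  # nucleotides, from the abstract (assumed denominator)

a4_vs_b = percent_divergence(238, CRPVA4_LENGTH)
a4_vs_prototype = percent_divergence(5, CRPVA4_LENGTH)

print(f"CRPVa4 vs CRPVb: {a4_vs_b:.2f}%")
print(f"CRPVa4 vs prototype CRPV: {a4_vs_prototype:.2f}%")
```

    The first value rounds to the 3% quoted in the abstract; the second shows how close CRPVa4 is to the prototype by comparison.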

  3. Population-genomic variation within RNA viruses of the Western honey bee, Apis mellifera, inferred from deep sequencing

    PubMed Central

    2013-01-01

    Background Deep sequencing of viruses isolated from infected hosts is an efficient way to measure population-genetic variation and can reveal patterns of dispersal and natural selection. In this study, we mined existing Illumina sequence reads to investigate single-nucleotide polymorphisms (SNPs) within two RNA viruses of the Western honey bee (Apis mellifera), deformed wing virus (DWV) and Israel acute paralysis virus (IAPV). All viral RNA was extracted from North American samples of honey bees or, in one case, the ectoparasitic mite Varroa destructor. Results Coverage depth was generally lower for IAPV than DWV, and marked gaps in coverage occurred in several narrow regions (< 50 bp) of IAPV. These coverage gaps occurred across sequencing runs and were virtually unchanged when reads were re-mapped with greater permissiveness (up to 8% divergence), suggesting a recurrent sequencing artifact rather than strain divergence. Consensus sequences of DWV for each sample showed little phylogenetic divergence, low nucleotide diversity, and strongly negative values of Fu and Li’s D statistic, suggesting a recent population bottleneck and/or purifying selection. The Kakugo strain of DWV fell outside of all other DWV sequences at 100% bootstrap support. IAPV consensus sequences supported the existence of multiple clades as had been previously reported, and Fu and Li’s D was closer to neutral expectation overall, although a sliding-window analysis identified a significantly positive D within the protease region, suggesting selection maintains diversity in that region. Within-sample mean diversity was comparable between the two viruses on average, although for both viruses there was substantial variation among samples in mean diversity at third codon positions and in the number of high-diversity sites. FST values were bimodal for DWV, likely reflecting neutral divergence in two low-diversity populations, whereas IAPV had several sites that were strong outliers with very low

  4. The amino acid sequence of Lady Amherst's pheasant (Chrysolophus amherstiae) and golden pheasant (Chrysolophus pictus) egg-white lysozymes.

    PubMed

    Araki, T; Kuramoto, M; Torikata, T

    1990-09-01

    The amino acid sequences of Lady Amherst's pheasant and golden pheasant egg-white lysozymes have been determined. The carboxymethylated lysozymes were digested with trypsin, followed by sequencing of the tryptic peptides. Lady Amherst's pheasant lysozyme proved to consist of 129 amino acid residues, and a relative molecular mass of 14,423 Da was calculated. This lysozyme had 6 amino acid substitutions when compared with hen egg-white lysozyme: Phe3 to Tyr, His15 to Leu, Gln41 to His, Asn77 to His, Gln121 to Asn, and a newly found substitution of Ile124 to Thr. The amino acid sequence of golden pheasant lysozyme was identical to that of Lady Amherst's pheasant lysozyme. The phylogenetic tree constructed by comparing the amino acid sequences of phasianoid bird lysozymes revealed a minimum genetic distance between these pheasants and the turkey-peafowl group. PMID:1368578
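    The substitutions listed in this abstract can be turned into a sequence-identity figure relative to hen egg-white lysozyme. The sketch below parses the list as quoted and assumes both proteins are 129 residues long with no insertions or deletions; the pattern and function names are hypothetical.

```python
import re

# Substitution list as quoted in the abstract (hen residue, position, pheasant residue).
SUBSTITUTIONS = (
    "Phe3 to Tyr, His15 to Leu, Gln41 to His, "
    "Asn77 to His, Gln121 to Asn, Ile124 to Thr"
)
SEQUENCE_LENGTH = 129  # residues, from the abstract

# Three-letter code, optional space, position, "to", three-letter code.
PATTERN = re.compile(r"([A-Z][a-z]{2})\s*(\d+) to ([A-Z][a-z]{2})")


def parse_substitutions(text):
    """Return a list of (position, original_residue, new_residue) tuples."""
    return [(int(pos), old, new) for old, pos, new in PATTERN.findall(text)]


subs = parse_substitutions(SUBSTITUTIONS)
identity = 100.0 * (SEQUENCE_LENGTH - len(subs)) / SEQUENCE_LENGTH

print(subs)
print(f"identity to hen lysozyme: {identity:.1f}%")
```

    Under these assumptions, six substitutions in 129 residues correspond to roughly 95% identity.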

  5. ClinVar: public archive of relationships among sequence variation and human phenotype

    PubMed Central

    Landrum, Melissa J.; Lee, Jennifer M.; Riley, George R.; Jang, Wonhee; Rubinstein, Wendy S.; Church, Deanna M.; Maglott, Donna R.

    2014-01-01

    ClinVar (http://www.ncbi.nlm.nih.gov/clinvar/) provides a freely available archive of reports of relationships among medically important variants and phenotypes. ClinVar accessions submissions reporting human variation, interpretations of the relationship of that variation to human health and the evidence supporting each interpretation. The database is tightly coupled with dbSNP and dbVar, which maintain information about the location of variation on human assemblies. ClinVar is also based on the phenotypic descriptions maintained in MedGen (http://www.ncbi.nlm.nih.gov/medgen). Each ClinVar record represents the submitter, the variation and the phenotype, i.e. the unit that is assigned an accession of the format SCV000000000.0. The submitter can update the submission at any time, in which case a new version is assigned. To facilitate evaluation of the medical importance of each variant, ClinVar aggregates submissions with the same variation/phenotype combination, adds value from other NCBI databases, assigns a distinct accession of the format RCV000000000.0 and reports if there are conflicting clinical interpretations. Data in ClinVar are available in multiple formats, including html, download as XML, VCF or tab-delimited subsets. Data from ClinVar are provided as annotation tracks on genomic RefSeqs and are used in tools such as Variation Reporter (http://www.ncbi.nlm.nih.gov/variation/tools/reporter), which reports what is known about variation based on user-supplied locations. PMID:24234437
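    The two accession formats quoted in this abstract (SCV000000000.0 for a submission and RCV000000000.0 for an aggregated record) can be recognized with a small parser. This is a minimal sketch that infers the pattern only from those two examples (three-letter prefix, nine digits, a dot, a version number); it is not an official ClinVar specification, and the function name and returned keys are hypothetical.

```python
import re

# Pattern inferred from the examples in the abstract: SCV/RCV + 9 digits + "." + version.
ACCESSION_RE = re.compile(r"^(SCV|RCV)(\d{9})\.(\d+)$")


def parse_accession(text):
    """Split an accession into its kind, numeric part and version.

    Returns None if the text does not match the assumed pattern.
    """
    match = ACCESSION_RE.match(text.strip())
    if match is None:
        return None
    prefix, digits, version = match.groups()
    kind = "submission" if prefix == "SCV" else "aggregate"
    return {"kind": kind, "number": int(digits), "version": int(version)}


print(parse_accession("SCV000000000.0"))
print(parse_accession("RCV000012345.2"))
print(parse_accession("not-an-accession"))
```

    The version suffix matters because, as the abstract notes, a submitter can update a submission at any time and a new version is then assigned.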

  6. Complete Genome Sequence of a thermotolerant sporogenic lactic acid bacterium, Bacillus coagulans strain 36D1

    PubMed Central

    Rhee, Mun Su; Moritz, Brélan E.; Xie, Gary; Glavina del Rio, T.; Dalin, E.; Tice, H.; Bruce, D.; Goodwin, L.; Chertkov, O.; Brettin, T.; Han, C.; Detter, C.; Pitluck, S.; Land, Miriam L.; Patel, Milind; Ou, Mark; Harbrucker, Roberta; Ingram, Lonnie O.; Shanmugam, K. T.

    2011-01-01

    Bacillus coagulans is a ubiquitous soil bacterium that grows at 50-55 °C and pH 5.0 and ferments various sugars that constitute plant biomass to L(+)-lactic acid. The ability of this sporogenic lactic acid bacterium to grow at 50-55 °C and pH 5.0 makes it an attractive microbial biocatalyst for industrial-scale production of optically pure lactic acid, not only from glucose derived from cellulose but also from xylose, a major constituent of hemicellulose. This bacterium is also considered a potential probiotic. The complete genome sequence of a representative strain, B. coagulans strain 36D1, is presented and discussed. PMID:22675583

  7. Complete amino acid sequence of globin chains and biological activity of fragmented crocodile hemoglobin (Crocodylus siamensis).

    PubMed

    Srihongthong, Saowaluck; Pakdeesuwan, Anawat; Daduang, Sakda; Araki, Tomohiro; Dhiravisit, Apisak; Thammasirirak, Sompong

    2012-08-01

    Hemoglobin, α-chain, β-chain and fragmented hemoglobin of Crocodylus siamensis demonstrated both antibacterial and antioxidant activities. Antibacterial and antioxidant properties of the hemoglobin did not depend on the heme structure but could result from the compositions of amino acid residues and structures present in their primary structure. Furthermore, thirteen purified active peptides were obtained by RP-HPLC analyses, corresponding to fragments in the α-globin chain and the β-globin chain which are mostly located at the N-terminal and C-terminal parts. These active peptides operate on the bacterial cell membrane. The globin chains of Crocodylus siamensis showed similar amino acids to the sequences of Crocodylus niloticus. The novel amino acid substitutions of α-chain and β-chain are not associated with the heme binding site or the bicarbonate ion binding site, but could be important through their interactions with membranes of bacteria. PMID:22648692

  8. [Partial sequence homology of FtsZ in phylogenetics analysis of lactic acid bacteria].

    PubMed

    Zhang, Bin; Dong, Xiu-zhu

    2005-10-01

    FtsZ is a structurally conserved protein that is universal among the prokaryotes and plays a key role in prokaryotic cell division. A partial fragment of the ftsZ gene, about 800 bp in length, was amplified and sequenced, and a partial FtsZ protein phylogenetic tree for the lactic acid bacteria was constructed. Comparison of the FtsZ phylogenetic tree with the 16S rDNA tree showed that the two trees were similar in topology. Both trees revealed that Pediococcus spp. were closely related to the L. casei group of Lactobacillus spp., but less closely related to other lactic acid cocci such as Enterococcus and Streptococcus. The results also showed that the discriminative power of FtsZ was higher than that of 16S rDNA for both inter-species and inter-genus comparisons, and that FtsZ could be a very useful tool in species identification of lactic acid bacteria. PMID:16342751

  9. Whole Genome Re-Sequencing and Characterization of Powdery Mildew Disease-Associated Allelic Variation in Melon

    PubMed Central

    Natarajan, Sathishkumar; Kim, Hoy-Taek; Thamilarasan, Senthil Kumar; Veerappan, Karpagam; Park, Jong-In; Nou, Ill-Sup

    2016-01-01

    Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L.) and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, ‘SCNU1154’, ‘Edisto47’, ‘MR-1’, and ‘PMR5’. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs), 1.9 million InDels, and 182,398 putative structural variations (SVs). Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon. PMID:27311063

  10. Whole Genome Re-Sequencing and Characterization of Powdery Mildew Disease-Associated Allelic Variation in Melon.

    PubMed

    Natarajan, Sathishkumar; Kim, Hoy-Taek; Thamilarasan, Senthil Kumar; Veerappan, Karpagam; Park, Jong-In; Nou, Ill-Sup

    2016-01-01

    Powdery mildew is one of the most common fungal diseases in the world. This disease frequently affects melon (Cucumis melo L.) and other Cucurbitaceous family crops in both open field and greenhouse cultivation. One of the goals of genomics is to identify the polymorphic loci responsible for variation in phenotypic traits. In this study, powdery mildew disease assessment scores were calculated for four melon accessions, 'SCNU1154', 'Edisto47', 'MR-1', and 'PMR5'. To investigate the genetic variation of these accessions, whole genome re-sequencing using the Illumina HiSeq 2000 platform was performed. A total of 754,759,704 quality-filtered reads were generated, with an average of 82.64% coverage relative to the reference genome. Comparisons of the sequences for the melon accessions revealed around 7.4 million single nucleotide polymorphisms (SNPs), 1.9 million InDels, and 182,398 putative structural variations (SVs). Functional enrichment analysis of detected variations classified them into biological process, cellular component and molecular function categories. Further, a disease-associated QTL map was constructed for 390 SNPs and 45 InDels identified as related to defense-response genes. Among them 112 SNPs and 12 InDels were observed in powdery mildew responsive chromosomes. Accordingly, this whole genome re-sequencing study identified SNPs and InDels associated with defense genes that will serve as candidate polymorphisms in the search for sources of resistance against powdery mildew disease and could accelerate marker-assisted breeding in melon. PMID:27311063

  11. Comparative Analysis of Mycobacterium tuberculosis pe and ppe Genes Reveals High Sequence Variation and an Apparent Absence of Selective Constraints

    PubMed Central

    McEvoy, Christopher R. E.; Cloete, Ruben; Müller, Borna; Schürch, Anita C.; van Helden, Paul D.; Gagneux, Sebastien; Warren, Robin M.; Gey van Pittius, Nicolaas C.

    2012-01-01

    Mycobacterium tuberculosis complex (MTBC) genomes contain 2 large gene families termed pe and ppe. The function of pe/ppe proteins remains enigmatic, but studies suggest that they are secreted or cell surface associated and are involved in bacterial virulence. Previous studies have also shown that some pe/ppe genes are polymorphic, a finding that suggests involvement in antigenic variation. Using comparative sequence analysis of 18 publicly available MTBC whole genome sequences, we have performed alignments of 33 pe (excluding pe_pgrs) and 66 ppe genes in order to detect the frequency and nature of genetic variation. This work has been supplemented by whole gene sequencing of 14 pe/ppe (including 5 pe_pgrs) genes in a cohort of 40 diverse and well defined clinical isolates covering all the main lineages of the M. tuberculosis phylogenetic tree. We show that nsSNPs in pe (excluding pgrs) and ppe genes are 3.0 and 3.3 times more frequent, respectively, than in non-pe/ppe genes, and that numerous other mutation types are also present at a high frequency. It has previously been shown that non-pe/ppe M. tuberculosis genes display a remarkably low level of purifying selection. Here, we also show that, compared to these genes, those of the pe/ppe families show a further reduction of selection pressure that suggests neutral evolution. This is inconsistent with the positive selection pressure of “classical” antigenic variation. Finally, by analyzing such a large number of genes we were able to detect large differences in mutation type and frequency between both individual genes and gene sub-families. The high variation rates and absence of selective constraints provide valuable insights into potential pe/ppe function. Since pe/ppe proteins are highly antigenic and have been studied as potential vaccine components, these results should also prove informative for aspects of M. tuberculosis vaccine design. PMID:22496726

  12. Source quality variations tied to sequence development: Integration of physical and chemical aspects, Lower to Middle Triassic, western Barents Sea

    SciTech Connect

    Bohacs, K.M.; Isaksen, G.H.

    1991-03-01

    Triassic mudrocks from the Barents Sea area demonstrate the covariance of the physical and chemical properties of mudrocks deposited in shelfal environments with the aspect of depositional sequences in distal settings. Tying physical parameters to chemical character within a detailed sequence-stratigraphic framework enables the construction of depositional-facies models to predict organic-matter content and quality. This allows the explorer to more closely constrain and predict the nature of potential source rocks using seismic and well-log data. Changes in lithology, bedding geometry, sedimentary structures, body and trace-fossil assemblages, and inorganic, bulk-organic, and molecular geochemistry revealed the detailed depositional environments. The depositional environments stack predictably, according to their position in the depositional sequence: from aerobic lower-shoreface–offshore transition environments in lowstand systems tracts to dysaerobic-anaerobic distal open-marine-shelf environments in transgressive and early highstand systems tracts. Quantitative molecular geochemistry also revealed variations within this distal setting and strong covariance with sequence position. Input of organic matter from terrigenous higher plants dominates the lowstands, whereas marine-algal organic matter is most prevalent within transgressive and highstand systems tracts. Specifically, the abundance of C30 steranes, total steranes, and moretane reflected development of the sequences.

  13. The influence of aging, environmental exposures and local sequence features on the variation of DNA methylation in blood

    PubMed Central

    Langevin, Scott M; Houseman, E Andres; Christensen, Brock C; Wiencke, John K; Nelson, Heather H; Karagas, Margaret R; Marsit, Carmen J

    2011-01-01

    In order to properly comprehend the epigenetic dysregulation that occurs during the course of disease, there is a need to characterize the epigenetic variability in healthy individuals that arises in response to aging and exposures, and to understand such variation within the biological context of the DNA sequence. We analyzed the methylation of 26,486 autosomal CpG loci in blood from 205 healthy subjects, using three complementary approaches to assess the association between methylation, age or exposures and local sequence features, such as CpG island status, repeat sequences, location within a polycomb target gene or proximity to a transcription factor binding site. We clustered CpGs using (1) unsupervised recursively partitioned mixture modeling (RPMM) and (2) bioinformatically informed methods, and (3) also employed a marginal model-based (non-clustering) approach. We observed associations between age and methylation and between hair dye use and methylation, where the direction and magnitude were contingent on the local sequence features of the CpGs. Our results demonstrate that CpGs are differentially methylated depending on the genomic features of the sequence in which they are embedded, and that CpG methylation is associated with age and hair dye use in a CpG context-dependent manner in healthy individuals. PMID:21617368

  14. Comparative characterization of random-sequence proteins consisting of 5, 12, and 20 kinds of amino acids.

    PubMed

    Tanaka, Junko; Doi, Nobuhide; Takashima, Hideaki; Yanagawa, Hiroshi

    2010-04-01

    Screening of functional proteins from a random-sequence library has been used to evolve novel proteins in the field of evolutionary protein engineering. However, random-sequence proteins consisting of the 20 natural amino acids tend to aggregate, and the occurrence rate of functional proteins in a random-sequence library is low. From the viewpoint of the origin of life, it has been proposed that primordial proteins consisted of a limited set of amino acids that could have been abundantly formed early during chemical evolution. We have previously found that members of a random-sequence protein library constructed with five primitive amino acids show high solubility (Doi et al., Protein Eng Des Sel 2005;18:279-284). Although such a library is expected to be appropriate for finding functional proteins, the functionality may be limited, because they have no positively charged amino acid. Here, we constructed three libraries of 120-amino acid, random-sequence proteins using alphabets of 5, 12, and 20 amino acids by preselection using mRNA display (to eliminate sequences containing stop codons and frameshifts) and characterized and compared the structural properties of random-sequence proteins arbitrarily chosen from these libraries. We found that random-sequence proteins constructed with the 12-member alphabet (including five primitive amino acids and positively charged amino acids) have higher solubility than those constructed with the 20-member alphabet, though other biophysical properties are very similar in the two libraries. Thus, a library of moderate complexity constructed from 12 amino acids may be a more appropriate resource for functional screening than one constructed from 20 amino acids. PMID:20162614

  15. Distribution of sequence variation in the mtDNA control region of Native North Americans.

    PubMed

    Lorenz, J G; Smith, D G

    1997-12-01

    The distributions of mtDNA diversity within and/or among North American haplogroups, language groups, and tribes were used to characterize the process of tribalization that followed the colonization of the New World. Approximately 400 bp from the mtDNA control region of 1 Na-Dene and 33 Amerind individuals representing a wide variety of languages and geographic origins were sequenced. With the inclusion of data from previous studies, 225 native North American (284 bp) sequences representing 85 distinct mtDNA lineages were analyzed. Mean pairwise sequence differences between (and within) tribes and language groups were primarily due to differences in the distribution of three of the four major haplogroups that evolved before settlement of the New World. Pairwise sequence differences within each of these three haplogroups were more similar than previous studies based on restriction enzyme analysis have indicated. The mean of pairwise sequence differences between Amerind members of haplogroup A, the most common of the four haplogroups in North America, was only slightly higher than that for the Eskimo, providing no evidence of separate ancestry, but was about two-thirds higher than that for the Na-Dene. However, analysis of pairwise sequence divergence between only tribal-specific lineages, unweighted for sample size, suggests that random evolutionary processes have reduced sequence diversity within the Na-Dene and that members of all three language groups possess approximately equally diverse mtDNA lineages. Comparisons of diversity within and between specific ethnic groups with the largest sample size were also consistent with this outcome. These data are not consistent with the hypothesis that the New World was settled by more than a single migration. Because lineages tended not to cluster by tribe and because lineage sharing among linguistically unrelated groups was restricted to geographically proximate groups, the tribalization process probably did not occur

  16. Variations of the sequence stratigraphic model: Past concepts, present understandings, and future directions

    SciTech Connect

    Posamentier, H.W.; James, D.H.

    1991-03-01

    The working hypothesis upon which the sequence concepts are based is that the relative sea level change results in changes in the capacity of a basin to accommodate sediment, which, in turn, results in a succession of sequences. The interplay between eustasy, tectonics, sediment flux, and physiography yields a predictable geologic response in carbonate, clastic, as well as mixed carbonate/clastic settings. The criteria for recognition of sequence boundaries can be varied within a given basin as well as between basins. They include but are not restricted to (1) a basinward shift of facies across a sharp bedding contact, (2) onlapping stratal geometry, and (3) truncation of strata. The key to the correct utilization of these concepts is to recognize sequence stratigraphy as an approach or a tool rather than a rigid template. Observations from the upper Albian, Cretaceous, Viking Formation of the western Canadian sedimentary basin are presented to illustrate the stratigraphic expression of clastic depositional sequences on a ramp margin. In this setting, forced regressions and lowstand shorelines commonly occur, incised valleys sometimes occur, and submarine fans rarely occur, in response to fluctuations of relative sea level. The base of the Viking Formation sometimes is characterized by relatively coarse-grained sediments sharply overlying fine-grained offshore muds and is interpreted as a third-order sequence boundary. Pebbles occasionally are observed at this contact. Subsequently, a number of higher-order sequences within the lower to middle Viking are observed and are characterized by the occurrence of forced regressions and lowstand shorelines without associated incised valleys.

  17. N-Terminal Amino Acid Sequence Determination of Proteins by N-Terminal Dimethyl Labeling: Pitfalls and Advantages When Compared with Edman Degradation Sequence Analysis.

    PubMed

    Chang, Elizabeth; Pourmal, Sergei; Zhou, Chun; Kumar, Rupesh; Teplova, Marianna; Pavletich, Nikola P; Marians, Kenneth J; Erdjument-Bromage, Hediye

    2016-07-01

    In recent history, alternative approaches to Edman sequencing have been investigated, and to this end, the Association of Biomolecular Resource Facilities (ABRF) Protein Sequencing Research Group (PSRG) initiated studies in 2014 and 2015, looking into bottom-up and top-down N-terminal (Nt) dimethyl derivatization of standard quantities of intact proteins with the aim to determine Nt sequence information. We have expanded this initiative and used low picomole amounts of myoglobin to determine the efficiency of Nt-dimethylation. Application of this approach on protein domains, generated by limited proteolysis of overexpressed proteins, confirms that it is a universal labeling technique and is very sensitive when compared with Edman sequencing. Finally, we compared Edman sequencing and Nt-dimethylation of the same polypeptide fragments; results confirm that there is agreement in the identity of the Nt amino acid sequence between these 2 methods. PMID:27006647

  18. N-Terminal Amino Acid Sequence Determination of Proteins by N-Terminal Dimethyl Labeling: Pitfalls and Advantages When Compared with Edman Degradation Sequence Analysis

    PubMed Central

    Chang, Elizabeth; Pourmal, Sergei; Zhou, Chun; Kumar, Rupesh; Teplova, Marianna; Pavletich, Nikola P.; Marians, Kenneth J.

    2016-01-01

    In recent history, alternative approaches to Edman sequencing have been investigated, and to this end, the Association of Biomolecular Resource Facilities (ABRF) Protein Sequencing Research Group (PSRG) initiated studies in 2014 and 2015, looking into bottom-up and top-down N-terminal (Nt) dimethyl derivatization of standard quantities of intact proteins with the aim to determine Nt sequence information. We have expanded this initiative and used low picomole amounts of myoglobin to determine the efficiency of Nt-dimethylation. Application of this approach on protein domains, generated by limited proteolysis of overexpressed proteins, confirms that it is a universal labeling technique and is very sensitive when compared with Edman sequencing. Finally, we compared Edman sequencing and Nt-dimethylation of the same polypeptide fragments; results confirm that there is agreement in the identity of the Nt amino acid sequence between these 2 methods. PMID:27006647

  19. Partial amino acid sequence of fructose-1,6-bisphosphatase from the blue-green algae Synechococcus leopoliensis.

    PubMed

    Marcus, F; Latshaw, S P; Steup, M; Gerbling, K P

    1989-08-01

    Purified fructose-1,6-bisphosphatase from the cyanobacterium Synechococcus leopoliensis was S-carboxymethylated and cleaved with trypsin. The resulting peptides were purified by reversed-phase high performance liquid chromatography and the amino acid sequence of six of the purified peptides was determined by gas-phase microsequencing. The results revealed sequence homology with other fructose-1,6-bisphosphatases. The obtained sequence data provides information required for the design of oligonucleotide hybridization probes to screen existing libraries of cyanobacterial DNA. The determination of the amino acid sequence of cyanobacterial proteins may yield important information with respect to the endosymbiotic theory of evolution. PMID:2550924

  20. Protein sequence analysis by incorporating modified chaos game and physicochemical properties into Chou's general pseudo amino acid composition.

    PubMed

    Xu, Chunrui; Sun, Dandan; Liu, Shenghui; Zhang, Yusen

    2016-10-01

    In this contribution we introduced a novel graphical method to compare protein sequences. By mapping a protein sequence into 3D space based on codons and physicochemical properties of 20 amino acids, we are able to get a unique P-vector from the 3D curve. This approach is consistent with wobble theory of amino acids. We compute the distance between sequences by their P-vectors to measure similarities/dissimilarities among protein sequences. Finally, we use our method to analyze four datasets and get better results compared with previous approaches. PMID:27375218

  1. Single-Molecule LATE-PCR Analysis of Human Mitochondrial Genomic Sequence Variations

    PubMed Central

    Osborne, Adam; Reis, Arthur H.; Bach, Loren; Wangh, Lawrence J.

    2009-01-01

    It is thought that changes in mitochondrial DNA are associated with many degenerative diseases, including Alzheimer's and diabetes. Much of the evidence, however, depends on correlating disease states with changing levels of heteroplasmy within populations of mitochondrial genomes, rather than individual mitochondrial genomes. Thus these measurements are likely to either overestimate the extent of heteroplasmy due to technical artifacts, or underestimate the actual level of heteroplasmy because only the most abundant changes are observable. In contrast, Single Molecule (SM) LATE-PCR analysis achieves efficient amplification of single-stranded amplicons from single target molecules. The product molecules, in turn, can be accurately sequenced using a convenient Dilute-‘N’-Go protocol, as shown here. Using these novel technologies we have rigorously analyzed levels of mitochondrial genome heteroplasmy found in single hair shafts of healthy adult individuals. Two of the single molecule sequences (7% of the samples) were found to contain mutations. Most of the mtDNA sequence changes, however, were due to the presence of laboratory contaminants. Amplification and sequencing errors did not result in mis-identification of mutations. We conclude that SM-LATE-PCR in combination with Dilute-‘N’-Go Sequencing are convenient technologies for detecting infrequent mutations in mitochondrial genomes, provided great care is taken to control and document contamination. We plan to use these technologies in the future to look for age, drug, and disease related mitochondrial genome changes in model systems and clinical samples. PMID:19461959

  2. Associations between sequence variations in the mitochondrial DNA D-loop region and outcome of hepatocellular carcinoma

    PubMed Central

    Li, Shilai; Wan, Peiqi; Peng, Tao; Xiao, Kaiyin; Su, Ming; Shang, Liming; Xu, Banghao; Su, Zhixiong; Ye, Xinping; Peng, Ning; Qin, Quanlin; Li, Lequn

    2016-01-01

    The association between mitochondrial DNA (mtDNA) polymorphisms or mutations and the prognoses of cancer have been investigated previously, but the results have been ambiguous. In the present study, the associations between sequence variations in the mtDNA D-loop region and the outcomes of patients with hepatocellular carcinoma (HCC) were analysed. A total of 140 patients with HCC (123 males and 17 females), who were hospitalised to undergo radical resection, were studied. Polymerase chain reaction and direct sequencing were performed to detect the sequence variations in the mtDNA D-loop region. Multivariate and univariate analyses were conducted to determine important factors in the prognosis of HCC. A total of 150 point sequence variations were observed in the 140 cases (13 point mutations, 8 insertions, 20 deletions and 116 polymorphisms). The variation rate was 13.4% (150/1,122). mtDNA nucleotide 150 (C/T) was an independent factor in the logistic regression for early/late recurrence of HCC. Patients with 150T appeared to have later recurrences. In a Cox proportional hazards regression model, hepatitis B virus DNA, Child-Pugh class, differentiation degree, tumour-node-metastasis (TNM) stage, nucleotide 16263 (T/C) and nucleotide 315 (N/insertion C) were independent factors for tumour-free survival time. Patients with the 16263T allele had a greater tumour-free survival time than patients with the 16263C allele. Similarly, patients with 315 insertion C had a superior tumour-free survival time when compared with patients with 315 N (normal). In the Cox proportional hazards regression model, recurrence type (early/late), Child-Pugh class, TNM stage and adjuvant treatment after tumour recurrence (none or one/more than one treatment) were independent factors for overall survival. None of the mtDNA variations served as independent factors. Patients with late recurrence, Child-Pugh class A, and low TNM stages and/or those who received more than one adjuvant treatment

  3. Differential transcriptional activity of SAD, FAD2 and FAD3 desaturase genes in developing seeds of linseed contributes to varietal variation in α-linolenic acid content.

    PubMed

    Rajwade, Ashwini V; Kadoo, Narendra Y; Borikar, Sanjay P; Harsulkar, Abhay M; Ghorpade, Prakash B; Gupta, Vidya S

    2014-02-01

    Linseed or flax (Linum usitatissimum L.) varieties differ markedly in their seed α-linolenic acid (ALA) levels. Fatty acid desaturases play a key role in accumulating ALA in seed. We performed fatty acid (FA) profiling of various seed developmental stages of ten Indian linseed varieties including one mutant variety. Depending on their ALA contents, these varieties were grouped under high ALA and low ALA groups. Transcript profiling of six microsomal desaturase genes (SAD1, SAD2, FAD2, FAD2-2, FAD3A and FAD3B), which act sequentially in the fatty acid desaturation pathway, was performed using real-time PCR. We observed gene-specific as well as temporal expression patterns for all the desaturases, and their differential expression profiles corresponded well with the variation in FA accumulation in the two groups. Our study points to efficient conversion of intermediate FAs [stearic (SA), oleic (OA) and linoleic acids (LA)] to the final product, ALA, due to efficient action of all the desaturases in the high ALA group. In the low ALA group, by contrast, even though the initial conversion up to OA was efficient, later conversions up to ALA seemed to be inefficient, leading to higher accumulation of OA and LA instead of ALA. We sequenced the six desaturase genes from the ten varieties and observed that variation in the amino acid (AA) sequences of desaturases was not responsible for differential ALA accumulation, except in the mutant variety TL23 with very low (<2%) ALA content. In TL23, a point mutation in the FAD3A gene resulted in a premature stop codon generating a truncated protein with 291 AA. PMID:24380374

  4. Association Between Sequence Variations in RCAN1 Promoter and the Risk of Sporadic Congenital Heart Disease in a Chinese Population.

    PubMed

    Li, Xiaoyong; Wang, Gang; An, Yong; Li, Hongbo; Li, Yonggang; Wu, Chun

    2015-10-01

    The pathogenesis of congenital heart disease (CHD) is unclear. There is a high incidence of CHD in Down syndrome, in which RCAN1 (regulator of calcineurin 1) overexpression is observed. However, whether RCAN1 plays an important role in non-syndromic CHD is unknown. This study investigates the relationship between sequence variations in the RCAN1 promoter and sporadic CHD. This was a case-control study in which the RCAN1 promoter was cloned and sequenced in 128 CHD patients (median age 1.1 years) and 150 normal controls (median age 3.0 years). No mutation sites were identified in this study. Three single-nucleotide (C to T) polymorphisms were detected: rs193289374, rs149048873 and rs143081213. The polymorphisms were not associated with CHD risk according to a logistic regression analysis. Functional assays in vitro showed that, compared with the wild-type genotype, the rs149048873 polymorphism decreased, and the rs143081213 polymorphism increased, the RCAN1 promoter activity, whereas the rs193289374 polymorphism had no effect. In conclusion, the sequence variations in the RCAN1 promoter are not major genetic factors involved in sporadic CHD, at least in the current research population. PMID:25863471

  5. mit-o-matic: a comprehensive computational pipeline for clinical evaluation of mitochondrial variations from next-generation sequencing datasets.

    PubMed

    Vellarikkal, Shamsudheen Karuthedath; Dhiman, Heena; Joshi, Kandarp; Hasija, Yasha; Sivasubbu, Sridhar; Scaria, Vinod

    2015-04-01

    The human mitochondrial genome has been reported to have a very high mutation rate as compared with the nuclear genome. A large number of mitochondrial mutations show significant phenotypic association and are involved in a broad spectrum of diseases. In recent years, there has been remarkable progress in the understanding of mitochondrial genetics. The availability of next-generation sequencing (NGS) technologies has not only reduced sequencing cost by orders of magnitude but has also provided good-quality mitochondrial genome sequences with high coverage, thereby enabling decoding of a number of human mitochondrial diseases. In this study, we report a computational and experimental pipeline to decipher the human mitochondrial DNA variations and examine them for their clinical correlation. As a proof of principle, we also present a clinical study of a patient with Leigh disease and confirmed maternal inheritance of the causative allele. The pipeline is made available as a user-friendly online tool to annotate variants and find haplogroup, disease association, and heteroplasmic sites. The "mit-o-matic" computational pipeline represents a comprehensive cloud-based tool for clinical evaluation of mitochondrial genomic variations from NGS datasets. The tool is freely available at http://genome.igib.res.in/mitomatic/. PMID:25677119

  6. Variation and genetic structure of Tunisian Festuca arundinacea populations based on inter-simple sequence repeat pattern.

    PubMed

    Chtourou-Ghorbel, N; Elazreg, H; Ghariani, S; Ben Mheni, N; Sekmani, M; Chakroun, M; Trifi-Farah, N

    2015-01-01

    Tunisian tall fescue (Festuca arundinacea Schreb.) is an important grass for forages or soil conservation, particularly in marginal sites. Inter-simple sequence repeats were used to estimate genetic diversity within and among 8 natural populations and 1 cultivar from Northern Tunisia. A total of 181 polymorphic inter-simple sequence repeat markers were generated using 7 primers. Shannon's index and analysis of molecular variance evidenced a high molecular polymorphism at intra-specific levels for wild and cultivated accessions, showing that Tunisian tall fescue germplasm constitutes an important pool of diversity. Within-population variation accounted for 39.42% of the total variation, but no regional differentiation was discernible to designate close relationships between regions. Most of the variation (GST = 67%) occurred between populations, rather than within populations. The ɸST (0.60) revealed high population structuring. Additionally, the population structure was independent of the geographic origin and was not affected by environmental factors. The unweighted pair group method with arithmetic mean tree based on genetic similarity and principal coordinate analysis based on coefficient similarity illustrated that continental populations from the proximate localities of Beja and Jendouba were genetically closely related, while the wild Skalba population from the littoral Tunisian locality was the most diverse from the others. Moreover, great molecular similarity of the spontaneous population Sedjnane originated from the mountain areas was revealed with the local cultivar Mornag. The observed genetic diversity can be used to implement conservation strategies and breeding programs for improving forage crops in Tunisia. PMID:25966071
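
    The abstract reports Shannon's index without giving the formula or the band data. As a minimal sketch, assuming one common convention for dominant markers (H = -sum of p * ln p, where p is the frequency of band presence at each marker), the calculation might look like the following. The function name and the toy presence/absence matrix are hypothetical and are not taken from the study.

```python
import math

def shannon_index(band_matrix):
    """Shannon's index H = -sum(p * ln p) over marker bands.

    band_matrix: list of rows (one per individual), each a list of 0/1
    band scores. p is the frequency of band presence at each marker.
    Markers with p == 0 contribute nothing (0 * ln 0 is taken as 0).
    """
    n_individuals = len(band_matrix)
    n_markers = len(band_matrix[0])
    h = 0.0
    for j in range(n_markers):
        p = sum(row[j] for row in band_matrix) / n_individuals
        if p > 0:
            h -= p * math.log(p)
    return h

# Hypothetical toy data: 4 individuals scored at 3 markers.
toy = [
    [1, 0, 1],
    [1, 1, 0],
    [1, 0, 0],
    [1, 1, 1],
]
print(round(shannon_index(toy), 4))
```

    Other conventions exist (for example log base 2, or including the band-absence term), so values are only comparable when the same definition is used throughout.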

  7. Purifying selection, sequence composition, and context-specific indel mutations shape intraspecific variation in a bacterial endosymbiont.

    PubMed

    Williams, Laura E; Wernegreen, Jennifer J

    2012-01-01

    Comparative genomics of closely related bacterial strains can clarify mutational processes and selective forces that impact genetic variation. Among primary bacterial endosymbionts of insects, such analyses have revealed ongoing genome reduction, raising questions about the ultimate evolutionary fate of these partnerships. Here, we explored genomic variation within Blochmannia vafer, an obligate mutualist of the ant Camponotus vafer. Polymorphism analysis of the Illumina data set used previously for de novo assembly revealed a second Bl. vafer genotype. To determine why a single ant colony contained two symbiont genotypes, we examined polymorphisms in 12 C. vafer mitochondrial sequences assembled from the Illumina data; the spectrum of variants suggests that the colony contained two maternal lineages, each harboring a distinct Bl. vafer genotype. Comparing the two Bl. vafer genotypes revealed that purifying selection purged most indels and nonsynonymous differences from protein-coding genes. We also discovered that indels occur frequently in multimeric simple sequence repeats, which are relatively abundant in Bl. vafer and may play a more substantial role in generating variation in this ant mutualist than in the aphid endosymbiont Buchnera. Finally, we explored how an apparent relocation of the origin of replication in Bl. vafer and the resulting shift in strand-associated mutational pressures may have caused accelerated gene loss and an elevated rate of indel polymorphisms in the region spanning the origin relocation. Combined, these results point to significant impacts of purifying selection on genomic polymorphisms as well as distinct patterns of indels associated with unusual genomic features of Blochmannia. PMID:22117087

  8. Analysis of genetic variation within clonal lineages of grape phylloxera (Daktulosphaira vitifoliae Fitch) using AFLP fingerprinting and DNA sequencing.

    PubMed

    Vorwerk, S; Forneck, A

    2007-07-01

    Two AFLP fingerprinting methods were employed to estimate the potential of AFLP fingerprints for the detection of genetic diversity within single founder lineages of grape phylloxera (Daktulosphaira vitifoliae Fitch). Eight clonal lineages, reared under controlled conditions in a greenhouse and reproducing asexually throughout a minimum of 15 generations, were monitored and mutations were scored as polymorphisms between the founder individual and individuals of succeeding generations. Genetic variation was detected within all lineages, from early generations on. Six to 15 polymorphic loci (from a total of 141 loci) were detected within the lineages, making up 4.3% of the total amount of genetic variation. The presence of contaminating extra-genomic sequences (e.g., viral material, bacteria, or ingested chloroplast DNA) was excluded as a source of intraclonal variation. Sequencing of 37 selected polymorphic bands confirmed their origin in mostly noncoding regions of the grape phylloxera genome. AFLP techniques were revealed to be powerful for the identification of reproducible banding patterns within clonal lineages. PMID:17893744

  9. Estimation of Response Functions Based on Variational Bayes Algorithm in Dynamic Images Sequences

    PubMed Central

    2016-01-01

    We proposed a nonparametric Bayesian model based on variational Bayes algorithm to estimate the response functions in dynamic medical imaging. In dynamic renal scintigraphy, the impulse response or retention functions are rather complicated and finding a suitable parametric form is problematic. In this paper, we estimated the response functions using nonparametric Bayesian priors. These priors were designed to favor desirable properties of the functions, such as sparsity or smoothness. These assumptions were used within hierarchical priors of the variational Bayes algorithm. We performed our algorithm on the real online dataset of dynamic renal scintigraphy. The results demonstrated that this algorithm improved the estimation of response functions with nonparametric priors.

  10. Nucleotide sequence of the phosphoglycerate kinase gene from the extreme thermophile Thermus thermophilus. Comparison of the deduced amino acid sequence with that of the mesophilic yeast phosphoglycerate kinase.

    PubMed Central

    Bowen, D; Littlechild, J A; Fothergill, J E; Watson, H C; Hall, L

    1988-01-01

    Using oligonucleotide probes derived from amino acid sequencing information, the structural gene for phosphoglycerate kinase from the extreme thermophile, Thermus thermophilus, was cloned in Escherichia coli and its complete nucleotide sequence determined. The gene consists of an open reading frame corresponding to a protein of 390 amino acid residues (calculated Mr 41,791) with an extreme bias for G or C (93.1%) in the codon third base position. Comparison of the deduced amino acid sequence with that of the corresponding mesophilic yeast enzyme indicated a number of significant differences. These are discussed in terms of the unusual codon bias and their possible role in enhanced protein thermal stability. PMID:3052437
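
    The 93.1% figure above is a G or C frequency at the third codon position (often called GC3). A minimal sketch of how such a value could be computed for an in-frame coding sequence follows; the function name and the short example sequence are hypothetical and are not taken from the T. thermophilus gene.

```python
def gc3_fraction(cds):
    """Fraction of codons whose third base is G or C.

    cds: in-frame coding sequence (string of A/C/G/T); trailing bases
    that do not fill a complete codon are ignored.
    """
    cds = cds.upper()
    n_codons = len(cds) // 3
    if n_codons == 0:
        raise ValueError("sequence shorter than one codon")
    third_bases = [cds[3 * i + 2] for i in range(n_codons)]
    gc = sum(1 for base in third_bases if base in "GC")
    return gc / n_codons

# Hypothetical example: codons ATG GCC CTG AAG GAA
example = "ATGGCCCTGAAGGAA"
print(gc3_fraction(example))
```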

  11. Bacteria obtained from a sequencing batch reactor that are capable of growth on dehydroabietic acid.

    PubMed Central

    Mohn, W W

    1995-01-01

    Eleven isolates capable of growth on the resin acid dehydroabietic acid (DhA) were obtained from a sequencing batch reactor designed to treat a high-strength process stream from a paper mill. The isolates belonged to two groups, represented by strains DhA-33 and DhA-35, which were characterized. In the bioreactor, bacteria like DhA-35 were more abundant than those like DhA-33. The population in the bioreactor of organisms capable of growth on DhA was estimated to be 1.1 × 10^6 propagules per ml, based on a most-probable-number determination. Analysis of small-subunit rRNA partial sequences indicated that DhA-33 was most closely related to Sphingomonas yanoikuyae (Sab = 0.875) and that DhA-35 was most closely related to Zoogloea ramigera (Sab = 0.849). Both isolates additionally grew on other abietanes, i.e., abietic and palustric acids, but not on the pimaranes, pimaric and isopimaric acids. For DhA-33 and DhA-35 with DhA as the sole organic substrate, doubling times were 2.7 and 2.2 h, respectively, and growth yields were 0.30 and 0.25 g of protein per g of DhA, respectively. Glucose as a cosubstrate stimulated growth of DhA-33 on DhA and stimulated DhA degradation by the culture. Pyruvate as a cosubstrate did not stimulate growth of DhA-35 on DhA and reduced the specific rate of DhA degradation of the culture. DhA induced DhA and abietic acid degradation activities in both strains, and these activities were heat labile. Cell suspensions of both strains consumed DhA at a rate of 6 µmol mg of protein^-1 h^-1. (ABSTRACT TRUNCATED AT 250 WORDS) PMID:7793937
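
    The reported doubling times can be converted to specific growth rates with the standard relation for exponential growth, mu = ln(2) / t_d. The sketch below is illustrative only; the abstract does not report growth rates, and the function name is hypothetical.

```python
import math

def specific_growth_rate(doubling_time_h):
    """Specific growth rate mu (per hour) for exponential growth: mu = ln(2) / t_d."""
    if doubling_time_h <= 0:
        raise ValueError("doubling time must be positive")
    return math.log(2) / doubling_time_h

# Doubling times reported in the abstract for growth on DhA.
for strain, t_d in {"DhA-33": 2.7, "DhA-35": 2.2}.items():
    print(strain, round(specific_growth_rate(t_d), 3), "per hour")
```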

  12. Bovine herpesvirus-1: comparison and differentiation of vaccine and field strains based on genomic sequence variation.

    PubMed

    Fulton, R W; d'Offay, J M; Eberle, R

    2013-03-01

    Bovine herpesvirus-1 (BoHV-1) causes significant disease in cattle including respiratory, fetal diseases, and reproductive tract infections. Control programs usually include vaccination with a modified live viral (MLV) vaccine. On occasion BoHV-1 strains are isolated from diseased animals or fetuses postvaccination. Currently there are no markers for differentiating MLV strains from field strains of BoHV-1. In this study several BoHV-1 strains were sequenced using whole-genome sequencing technologies and the data analyzed to identify single nucleotide polymorphisms (SNPs). Strains sequenced included the reference BoHV-1 Cooper strain (GenBank Accession JX898220), eight commercial MLV vaccine strains, and 14 field strains from cases presented for diagnosis. Based on SNP analyses, the viruses could be classified into groups having similar SNP patterns. The eight MLV strains could be differentiated from one another although some were closely related to each other. A number of field strains isolated from animals with a history of prior vaccination had SNP patterns similar to specific MLV viruses, while other field isolates were very distinct from all vaccine strains. The results indicate that some BoHV-1 isolates from clinically ill cattle/fetuses can be associated with a prior MLV vaccination history, but more information is needed on the rate of BoHV-1 genome sequence change before irrefutable associations can be drawn. PMID:23333211

  13. Synthetic promoter elements obtained by nucleotide sequence variation and selection for activity

    PubMed Central

    Edelman, Gerald M.; Meech, Robyn; Owens, Geoffrey C.; Jones, Frederick S.

    2000-01-01

    Eukaryotic transcriptional regulation in different cells involves large numbers and arrangements of cis and trans elements. To survey the number of cis regulatory elements that are active in different contexts, we have devised a high-throughput selection procedure permitting synthesis of active cis motifs that enhance the activity of a minimal promoter. This synthetic promoter construction method (SPCM) was used to identify >100 DNA sequences that showed increased promoter activity in the neuroblastoma cell line Neuro2A. After determining DNA sequences of selected synthetic promoters, database searches for known elements revealed a predominance of eight motifs: AP2, CEBP, GRE, Ebox, ETS, CREB, AP1, and SP1/MAZ. The most active of the selected synthetic promoters contain composites of a number of these motifs. Assays of DNA binding and promoter activity of three exemplary motifs (ETS, CREB, and SP1/MAZ) were used to prove the effectiveness of SPCM in uncovering active sequences. Up to 10% of 133 selected active sequences had no match in currently available databases, raising the possibility that new motifs and transcriptional regulatory proteins to which they bind may be revealed by SPCM. The method may find uses in constructing databases of active cis motifs, in diagnostics, and in gene therapy. PMID:10725347

  14. Mitochondrial intronic open reading frames in Podospora: Mobility and consecutive exonic sequence variations

    SciTech Connect

    Sellem, C.H.; Rossignol, M.; Belcour, L.

    1996-06-01

    The mitochondrial genome of 23 wild-type strains belonging to three different species of the filamentous fungus Podospora was examined. Among the 15 optional sequences identified are two intronic reading frames, nad1-i4-orf1 and cox1-i7-orf2. We show that the presence of these sequences was strictly correlated with tightly clustered nucleotide substitutions in the adjacent exon. This correlation applies to the presence or absence of closely related open reading frames (ORFs), found at the same genetic locations, in all the Pyrenomycete genera examined. The recent gain of these optional ORFs in the evolution of the genus Podospora probably accounts for such sequence differences. In the homoplasmic progeny from heteroplasmons constructed between Podospora strains differing by the presence of these optional ORFs, nad1-i4-orf1 and cox1-i7-orf2 appeared highly invasive. Sequence comparisons in the nad1-i4 intron of various strains of the Pyrenomycete family led us to propose a scenario of its evolution that includes several events of loss and gain of intronic ORFs. These results strongly reinforce the idea that group I intronic ORFs are mobile elements and that their transfer, and concomitant modification of the adjacent exon, could participate in the modular evolution of mitochondrial genomes. 46 refs., 5 figs., 2 tabs.

  15. A pedigree-based study of mitochondrial D-loop DNA sequence variation among Arabian horses.

    PubMed

    Bowling, A T; Del Valle, A; Bowling, M

    2000-02-01

    Through DNA sequence comparisons of a mitochondrial D-loop hypervariable region, we investigated matrilineal diversity for Arabian horses in the United States. Sixty-two horses were tested. From published pedigrees they traced in the maternal line to 34 mares acquired primarily in the mid to late 19th century from nomadic Bedouin tribes. Compared with the reference sequence (GenBank X79547), these samples showed 27 haplotypes with altogether 31 base substitution sites within 397 bp of sequence. Based on examination of pedigrees from a random sampling of 200 horses in current studbooks of the Arabian Horse Registry of America, we estimated that this study defined the expected mtDNA haplotypes for at least 89% of Arabian horses registered in the US. The reliability of the studbook recorded maternal lineages of Arabian pedigrees was demonstrated by haplotype concordance among multiple samplings in 14 lines. Single base differences observed within two maternal lines were interpreted as representing alternative fixations of past heteroplasmy. The study also demonstrated the utility of mtDNA sequence studies to resolve historical maternity questions without access to biological material from the horses whose relationship was in question, provided that representatives of the relevant female lines were available for comparison. The data call into question the traditional assumption that Arabian horses of the same strain necessarily share a common maternal ancestry. PMID:10690354
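
    Defining haplotypes by comparison with a reference sequence, as described above, can be sketched as follows. This is a minimal illustration that assumes the sequences are already aligned to equal length and that only base substitutions are scored; the function names and the toy sequences are hypothetical and are not real D-loop data.

```python
def substitutions(reference, sample):
    """Return a tuple of (position, ref_base, sample_base) differences.

    Assumes both sequences are already aligned and of equal length;
    positions are 1-based.
    """
    if len(reference) != len(sample):
        raise ValueError("sequences must be aligned to equal length")
    return tuple(
        (i + 1, r, s)
        for i, (r, s) in enumerate(zip(reference, sample))
        if r != s
    )

def group_haplotypes(reference, samples):
    """Group sample names by their substitution pattern vs. the reference."""
    groups = {}
    for name, seq in samples.items():
        groups.setdefault(substitutions(reference, seq), []).append(name)
    return groups

# Hypothetical toy sequences.
ref = "ACGTACGTAC"
samples = {
    "mare_1": "ACGTACGTAC",
    "mare_2": "ACGTATGTAC",
    "mare_3": "ACGTATGTAC",
    "mare_4": "GCGTACGTAT",
}
haplotypes = group_haplotypes(ref, samples)
variable_sites = {pos for hap in haplotypes for (pos, _, _) in hap}
print(len(haplotypes), "haplotypes,", len(variable_sites), "variable sites")
```

    Samples that share a key in the resulting dictionary carry the same haplotype, which is the kind of concordance check the abstract describes for multiple samplings within a maternal line.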

  16. Distribution and sequence variations of selected virulence genes among group A streptococcal isolates from western Norway.

    PubMed

    Mylvaganam, H; Bjorvatn, B; Osland, A

    2000-11-01

    In order to compare the distribution of selected virulence genes among group A streptococci recovered from invasive disease and superficial infections, 42 isolates were screened for mga, speB, speA, ssa and ska, by PCR. The isolates were predominantly of the sequence types emm1, emm3 and emm6, but also included a few of the types emm22, emm28, emm75 and emm78. The phage-mediated speA seemed to be prevalent in emm types 1 and 3, and its distribution was not related to disease severity. The other genes were present in all isolates. The mga, speB and speA were further studied by sequence analysis. Although allotypic associations with invasiveness were not found, allelic specificity to the emm sequence type was observed. In addition, the mga sequences indicated two lineages, related to opacity factor production. A possible recombination between these two main divergent mga genes was observed in isolates of the types emm22 and emm75. A logical nomenclature of the alleles of mga and speB is suggested. PMID:11211972

  17. The variation of nitric acid vapor and nitrate aerosol concentrations near the island of Hawaii

    SciTech Connect

    Lee, G.

    1992-01-01

    Anthropogenic emissions of nitrogen oxides (NO + NO2) are estimated to be half of the global emissions to the atmosphere. To understand the effect of increasing anthropogenic reactive nitrogen inputs to the global atmosphere, one needs to monitor their long-term variations. This dissertation examines the variations of total nitrate (nitric acid vapor and nitrate aerosol) at the Mauna Loa Observatory (MLO), Hawaii. During the Mauna Loa Observatory Photochemistry Experiment (MLOPEX) in May, 1988, six different air types were identified at MLO with statistical analysis. They were: (1) volcano influenced air, (2) stratosphere-like air, (3) boundary-layer air with recent anthropogenic influence, (4) photochemical haze, (5) marine boundary-layer air, (6) well-aged and modified marine air. Samples that might be influenced by marine air or human activity from local islands were eliminated with three meteorological criteria (wind direction, condensation nuclei, and dew point). To examine the negative sampling artifacts of nitric acid vapor due to ground loss, mixing ratio gradients with height were measured during August of 1991. The observed gradients of nitric acid vapor indicated that the long-term samplers at 8 m at MLO may underestimate the free tropospheric nitric acid vapor mixing ratio by about 20%. The three year mean and median of free tropospheric total nitrate during long-term measurements were 113 pptv and 93 pptv, respectively. Each year, the total nitrate mixing ratios at MLO during the spring and summer were more than a factor of two higher than during fall and winter. NOy from remote continents (Asia and North America) is a likely source of this increased total nitrate at MLO during these seasons. However, other processes govern the total nitrate mixing ratios, e.g., degree of mixing between free tropospheric air and boundary air at source regions, stratospheric injection, and wet removal of total nitrate.

  18. Day-to-night variations of cytoplasmic pH in a crassulacean acid metabolism plant.

    PubMed

    Hafke, J B; Neff, R; Hütt, M T; Lüttge, U; Thiel, G

    2001-01-01

    In crassulacean acid metabolism (CAM) large amounts of malic acid are redistributed between vacuole and cytoplasm in the course of night-to-day transitions. The corresponding changes of the cytoplasmic pH (pHcyt) were monitored in mesophyll protoplasts from the CAM plant Kalanchoe daigremontiana Hamet et Perrier by ratiometric fluorimetry with the fluorescent dye 2',7'-bis-(2-carboxyethyl)-5-(and-6-)carboxyfluorescein as a pHcyt indicator. At the beginning of the light phase, pHcyt was slightly alkaline (about 7.5). It dropped during midday by about 0.3 pH units before recovering again in the late-day-to-early-dark phase. In the physiological context the variation in pHcyt may be a component of CAM regulation. Due to its pH sensitivity, phosphoenolpyruvate carboxylase appears as a likely target enzyme. From monitoring delta pHcyt in response to loading the cytoplasm with the weak acid salt K-acetate a cytoplasmic H(+)-buffer capacity in the order of 65 mM H+ per pH unit was estimated at a pHcyt of about 7.5. With this value, an acid load of the cytoplasm by about 10 mM malic acid can be estimated as the cause of the observed drop in pHcyt. A diurnal oscillation in pHcyt and a quantitatively similar cytoplasmic malic acid is predicted from an established mathematical model which allows simulation of the CAM dynamics. The similarity of model predictions and experimental data supports the view put forward in this model that a phase transition of the tonoplast is an essential functional element in CAM dynamics. PMID:11732184
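
    The acid-load estimate in this abstract can be reproduced with a short calculation. The sketch below uses only the two figures quoted above (a buffer capacity of about 65 mM H+ per pH unit and a drop of about 0.3 pH units). The factor of two protons per malic acid molecule is an assumption made here (both carboxyl groups taken as dissociated near pH 7.5), not a value stated in the record.

```python
# Back-of-envelope check of the acid load implied by the reported pH drop.
BUFFER_CAPACITY_MM_PER_PH = 65.0  # from the abstract: ~65 mM H+ per pH unit
PH_DROP = 0.3                     # from the abstract: midday drop in pHcyt
PROTONS_PER_MALIC_ACID = 2        # assumption: both carboxyl groups dissociate


def proton_load_mm(buffer_capacity: float, delta_ph: float) -> float:
    """Protons (mM) needed to shift a buffered compartment by delta_ph."""
    return buffer_capacity * delta_ph


def malic_acid_load_mm(buffer_capacity: float, delta_ph: float,
                       protons_per_acid: int = PROTONS_PER_MALIC_ACID) -> float:
    """Malic acid (mM) that would release the required proton load."""
    return proton_load_mm(buffer_capacity, delta_ph) / protons_per_acid


if __name__ == "__main__":
    h_load = proton_load_mm(BUFFER_CAPACITY_MM_PER_PH, PH_DROP)
    acid = malic_acid_load_mm(BUFFER_CAPACITY_MM_PER_PH, PH_DROP)
    print(f"proton load: {h_load:.1f} mM, malic acid: {acid:.2f} mM")
```

    Under these assumptions the result is close to the "about 10 mM malic acid" quoted in the abstract.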

  19. Discovery of a novel amino acid racemase through exploration of natural variation in Arabidopsis thaliana

    PubMed Central

    Strauch, Renee C.; Svedin, Elisabeth; Dilkes, Brian; Chapple, Clint; Li, Xu

    2015-01-01

    Plants produce diverse low-molecular-weight compounds via specialized metabolism. Discovery of the pathways underlying production of these metabolites is an important challenge for harnessing the huge chemical diversity and catalytic potential in the plant kingdom for human uses, but this effort is often encumbered by the necessity to initially identify compounds of interest or purify a catalyst involved in their synthesis. As an alternative approach, we have performed untargeted metabolite profiling and genome-wide association analysis on 440 natural accessions of Arabidopsis thaliana. This approach allowed us to establish genetic linkages between metabolites and genes. Investigation of one of the metabolite–gene associations led to the identification of N-malonyl-d-allo-isoleucine, and the discovery of a novel amino acid racemase involved in its biosynthesis. This finding provides, to our knowledge, the first functional characterization of a eukaryotic member of a large and widely conserved phenazine biosynthesis protein PhzF-like protein family. Unlike most of known eukaryotic amino acid racemases, the newly discovered enzyme does not require pyridoxal 5′-phosphate for its activity. This study thus identifies a new d-amino acid racemase gene family and advances our knowledge of plant d-amino acid metabolism that is currently largely unexplored. It also demonstrates that exploitation of natural metabolic variation by integrating metabolomics with genome-wide association is a powerful approach for functional genomics study of specialized metabolism. PMID:26324904

  20. Nucleic and amino acid sequences relating to a novel transketolase, and methods for the expression thereof

    DOEpatents

    Croteau, Rodney Bruce; Wildung, Mark Raymond; Lange, Bernd Markus; McCaskill, David G.

    2001-01-01

    cDNAs encoding 1-deoxyxylulose-5-phosphate synthase from peppermint (Mentha piperita) have been isolated and sequenced, and the corresponding amino acid sequences have been determined. Accordingly, isolated DNA sequences (SEQ ID NO:3, SEQ ID NO:5, SEQ ID NO:7) are provided which code for the expression of 1-deoxyxylulose-5-phosphate synthase from plants. In another aspect the present invention provides for isolated, recombinant DXPS proteins, such as the proteins having the sequences set forth in SEQ ID NO:4, SEQ ID NO:6 and SEQ ID NO:8. In other aspects, replicable recombinant cloning vehicles are provided which code for plant 1-deoxyxylulose-5-phosphate synthases, or for a base sequence sufficiently complementary to at least a portion of 1-deoxyxylulose-5-phosphate synthase DNA or RNA to enable hybridization therewith. In yet other aspects, modified host cells are provided that have been transformed, transfected, infected and/or injected with a recombinant cloning vehicle and/or DNA sequence encoding a plant 1-deoxyxylulose-5-phosphate synthase. Thus, systems and methods are provided for the recombinant expression of the aforementioned recombinant 1-deoxyxylulose-5-phosphate synthase that may be used to facilitate its production, isolation and purification in significant amounts. Recombinant 1-deoxyxylulose-5-phosphate synthase may be used to obtain expression or enhanced expression of 1-deoxyxylulose-5-phosphate synthase in plants in order to enhance the production of 1-deoxyxylulose-5-phosphate, or its derivatives such as isopentenyl diphosphate (IPP), or may be otherwise employed for the regulation or expression of 1-deoxyxylulose-5-phosphate synthase, or the production of its products.

  1. Novel method for PIK3CA mutation analysis: locked nucleic acid--PCR sequencing.

    PubMed

    Ang, Daphne; O'Gara, Rebecca; Schilling, Amy; Beadling, Carol; Warrick, Andrea; Troxell, Megan L; Corless, Christopher L

    2013-05-01

    Somatic mutations in PIK3CA are commonly seen in invasive breast cancer and several other carcinomas, occurring in three hotspots: codons 542 and 545 of exon 9 and in codon 1047 of exon 20. We designed a locked nucleic acid (LNA)-PCR sequencing assay to detect low levels of mutant PIK3CA DNA with attention to avoiding amplification of a pseudogene on chromosome 22 that has >95% homology to exon 9 of PIK3CA. We tested 60 FFPE breast DNA samples with known PIK3CA mutation status (48 cases had one or more PIK3CA mutations, and 12 were wild type) as identified by PCR-mass spectrometry. PIK3CA exons 9 and 20 were amplified in the presence or absence of LNA-oligonucleotides designed to bind to the wild-type sequences for codons 542, 545, and 1047, and partially suppress their amplification. LNA-PCR sequencing confirmed all 51 PIK3CA mutations; however, the mutation detection rate by standard Sanger sequencing was only 69% (35 of 51). Of the 12 PIK3CA wild-type cases, LNA-PCR sequencing detected three additional H1047R mutations in "normal" breast tissue and one E545K in usual ductal hyperplasia. Histopathological review of these three normal breast specimens showed columnar cell change in two (both with known H1047R mutations) and apocrine metaplasia in one. The novel LNA-PCR shows higher sensitivity than standard Sanger sequencing and did not amplify the known pseudogene. PMID:23541593
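
    The detection rates quoted in this abstract follow directly from the reported counts. A minimal sketch, using only the figures given above (51 known mutations, 35 found by standard Sanger sequencing, all 51 confirmed by LNA-PCR sequencing); the function name is illustrative:

```python
def detection_rate(detected: int, total: int) -> float:
    """Percentage of known mutations that an assay detected."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * detected / total


if __name__ == "__main__":
    sanger = detection_rate(35, 51)   # standard Sanger sequencing
    lna_pcr = detection_rate(51, 51)  # LNA-PCR sequencing
    print(f"Sanger: {sanger:.0f}%, LNA-PCR: {lna_pcr:.0f}%")
```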

  2. Genome Sequence Analysis of the Naphthenic Acid Degrading and Metal Resistant Bacterium Cupriavidus gilardii CR3.

    PubMed

    Wang, Xiaoyu; Chen, Meili; Xiao, Jingfa; Hao, Lirui; Crowley, David E; Zhang, Zhewen; Yu, Jun; Huang, Ning; Huo, Mingxin; Wu, Jiayan

    2015-01-01

    Cupriavidus sp. are generally heavy metal tolerant bacteria with the ability to degrade a variety of aromatic hydrocarbon compounds, although the degradation pathways and substrate versatilities remain largely unknown. Here we studied the bacterium Cupriavidus gilardii strain CR3, which was isolated from a natural asphalt deposit, and which was shown to utilize naphthenic acids as a sole carbon source. Genome sequencing of C. gilardii CR3 was carried out to elucidate possible mechanisms for the naphthenic acid biodegradation. The genome of C. gilardii CR3 was composed of two circular chromosomes chr1 and chr2 of respectively 3,539,530 bp and 2,039,213 bp in size. The genome for strain CR3 encoded 4,502 putative protein-coding genes, 59 tRNA genes, and many other non-coding genes. Many genes were associated with xenobiotic biodegradation and metal resistance functions. Pathway prediction for degradation of cyclohexanecarboxylic acid, a representative naphthenic acid, suggested that naphthenic acid undergoes initial ring-cleavage, after which the ring fission products can be degraded via several plausible degradation pathways including a mechanism similar to that used for fatty acid oxidation. The final metabolic products of these pathways are unstable or volatile compounds that were not toxic to CR3. Strain CR3 was also shown to have tolerance to at least 10 heavy metals, which was mainly achieved by self-detoxification through ion efflux, metal-complexation and metal-reduction, and a powerful DNA self-repair mechanism. Our genomic analysis suggests that CR3 is well adapted to survive the harsh environment in natural asphalts containing naphthenic acids and high concentrations of heavy metals. PMID:26301592

  3. Genome Sequence Analysis of the Naphthenic Acid Degrading and Metal Resistant Bacterium Cupriavidus gilardii CR3

    PubMed Central

    Xiao, Jingfa; Hao, Lirui; Crowley, David E.; Zhang, Zhewen; Yu, Jun; Huang, Ning; Huo, Mingxin; Wu, Jiayan

    2015-01-01

    Cupriavidus sp. are generally heavy metal tolerant bacteria with the ability to degrade a variety of aromatic hydrocarbon compounds, although the degradation pathways and substrate versatilities remain largely unknown. Here we studied the bacterium Cupriavidus gilardii strain CR3, which was isolated from a natural asphalt deposit, and which was shown to utilize naphthenic acids as a sole carbon source. Genome sequencing of C. gilardii CR3 was carried out to elucidate possible mechanisms for the naphthenic acid biodegradation. The genome of C. gilardii CR3 was composed of two circular chromosomes chr1 and chr2 of respectively 3,539,530 bp and 2,039,213 bp in size. The genome for strain CR3 encoded 4,502 putative protein-coding genes, 59 tRNA genes, and many other non-coding genes. Many genes were associated with xenobiotic biodegradation and metal resistance functions. Pathway prediction for degradation of cyclohexanecarboxylic acid, a representative naphthenic acid, suggested that naphthenic acid undergoes initial ring-cleavage, after which the ring fission products can be degraded via several plausible degradation pathways including a mechanism similar to that used for fatty acid oxidation. The final metabolic products of these pathways are unstable or volatile compounds that were not toxic to CR3. Strain CR3 was also shown to have tolerance to at least 10 heavy metals, which was mainly achieved by self-detoxification through ion efflux, metal-complexation and metal-reduction, and a powerful DNA self-repair mechanism. Our genomic analysis suggests that CR3 is well adapted to survive the harsh environment in natural asphalts containing naphthenic acids and high concentrations of heavy metals. PMID:26301592

  4. Bile acid sulfotransferase I from rat liver sulfates bile acids and 3-hydroxy steroids: purification, N-terminal amino acid sequence, and kinetic properties.

    PubMed

    Barnes, S; Buchina, E S; King, R J; McBurnett, T; Taylor, K B

    1989-04-01

    A bile acid:3'phosphoadenosine-5'phosphosulfate:sulfotransferase (BAST I) from adult female rat liver cytosol has been purified 157-fold by a two-step isolation procedure. The N-terminal amino acid sequence of the 30,000 subunit has been determined for the first 35 residues. The Vmax of purified BAST I is 18.7 nmol/min per mg protein with N-(3-hydroxy-5 beta-cholanoyl)glycine (glycolithocholic acid) as substrate, comparable to that of the corresponding purified human BAST (Chen, L-J., and I. H. Segel, 1985. Arch. Biochem. Biophys. 241: 371-379). BAST I activity has a broad pH optimum from 5.5-7.5. Although maximum activity occurs with 5 mM MgCl2, Mg2+ is not essential for BAST I activity. The greatest sulfotransferase activity and the highest substrate affinity is observed with bile acids or steroids that have a steroid nucleus containing a 3 beta-hydroxy group and a 5-6 double bond or a trans A-B ring junction. These substrates have normal hyperbolic initial velocity curves with substrate inhibition occurring above 5 microM. Of the saturated 5 beta-bile acids, those with a single 3-hydroxy group are the most active. The addition of a second hydroxy group at the 6- or 7-position eliminates more than 99% of the activity. In contrast, 3 alpha,12 alpha-dihydroxy-5 beta-cholan-24-oic acid (deoxycholic acid) is an excellent substrate. The initial velocity curves for glycolithocholic and deoxycholic acid conjugates are sigmoidal rather than hyperbolic, suggestive of an allosteric effect. Maximum activity is observed at 80 microM for glycolithocholic acid. All substrates, bile acids and steroids, are inhibited by the 5 beta-bile acid, 3-keto-5 beta-cholanoic acid. The data suggest that BAST I is the same protein as hydrosteroid sulfotransferase 2 (Marcus, C. J., et al. 1980. Anal. Biochem. 107: 296-304). PMID:2754334

  5. Detection of sequence variation in parasite ribosomal DNA by electrophoresis in agarose gels supplemented with a DNA-intercalating agent.

    PubMed

    Zhu, X Q; Chilton, N B; Gasser, R B

    1998-05-01

    This study evaluated the use of a commercially available DNA intercalating agent (Resolver Gold) in agarose gels for the direct detection of sequence variation in ribosomal DNA (rDNA). This agent binds preferentially to AT sequence motifs in DNA. Regions of nuclear rDNA, known to provide genetic markers for the identification of species of parasitic ascarid nematodes (order Ascaridida), were amplified by polymerase chain reaction (PCR) and subjected to electrophoresis in standard agarose gels versus gels supplemented with Resolver Gold. Individual taxa examined could not be distinguished reliably based on the size of their amplicons in standard agarose gels, whereas they could be readily delineated based on mobility using Resolver Gold-supplemented gels. The latter was achieved because of differences (approximately 0.1-8.2%) in the AT content of the fragments among different taxa, which were associated with significant interspecific differences (approximately 11-39%) in the rDNA sequences employed. There was a tendency for fragments with higher AT content to migrate slower in supplemented agarose gels compared with those of lower AT content. The results indicate the usefulness of this electrophoretic approach to rapidly screen for sequence variability within or among PCR-amplified rDNA fragments of similar sizes but differing AT contents. Although evaluated on rDNA of parasites, the approach has potential to be applied to a range of genes of different groups of infectious organisms. PMID:9629896
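
    Because the separation described here depends on the AT content of each fragment, a rough in-silico screen can compare AT percentages before running a gel. The sketch below is illustrative only: the two fragments are hypothetical sequences, not data from the study, and the function does not model gel mobility.

```python
def at_content(seq: str) -> float:
    """Return the percentage of A and T bases in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    at_bases = sum(1 for base in seq if base in "AT")
    return 100.0 * at_bases / len(seq)


# Hypothetical fragments of equal length (not taken from the study).
fragment_a = "ATATGCGCATTA"
fragment_b = "GCGCGCATGCGC"

if __name__ == "__main__":
    difference = at_content(fragment_a) - at_content(fragment_b)
    print(f"AT content differs by {difference:.1f} percentage points")
```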

  6. Genetic variation in and spatial structure of natural populations of Dipterocarpus alatus (Dipterocarpaceae) determined using single sequence repeat markers.

    PubMed

    Tam, N M; Duy, V D; Duc, N M; Giap, V D; Xuan, B T T

    2014-01-01

    Dipterocarpus alatus (Dipterocarpaceae) is widely distributed in lowland forests in central and southern Vietnam, Cambodia, Laos, Myanmar, Philippines, Thailand, and India. Due to over-exploitation and habitat destruction, the species is now threatened. The genetic variation within and among populations of D. alatus was investigated on the basis of 9 microsatellite (single sequence repeat, SSR) loci. In all, 268 sampled trees from 10 populations in central and southern Vietnam were analyzed in this study. The SSR data showed a high genetic variability within populations with an average of HO = 0.209 and HE = 0.239. Genetic differentiation among populations was high (FST = 0.266), indicating limited gene flow (Nm = 0.69). Analysis of molecular variance showed that most genetic variation was within populations (74.96%). This study highlights the importance of conserving the genetic resources of D. alatus species. PMID:25078594
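
    The abstract does not say how Nm was derived from FST. One common choice, assumed here, is Wright's island-model approximation Nm = (1 - FST) / (4 FST), which reproduces the reported value when applied to FST = 0.266:

```python
def gene_flow_nm(fst: float) -> float:
    """Estimate Nm from FST using Wright's island-model approximation.

    Assumption: Nm = (1 - FST) / (4 * FST); the abstract does not name
    the estimator the authors actually used.
    """
    if not 0.0 < fst <= 1.0:
        raise ValueError("FST must be in the interval (0, 1]")
    return (1.0 - fst) / (4.0 * fst)


if __name__ == "__main__":
    print(f"Nm = {gene_flow_nm(0.266):.2f}")
```

    Values of Nm below 1 are conventionally read as gene flow too low to counteract drift, which matches the abstract's interpretation of limited gene flow.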

  7. Identification of eight mutations and three sequence variations in the cystic fibrosis transmembrane conductance regulator (CFTR) gene

    SciTech Connect

    Ghanem, N.; Costes, B.; Girodon, E.; Martin, J.; Fanen, P.; Goossens, M.

    1994-05-15

    To determine cystic fibrosis (CF) defects in a sample of 224 non-ΔF508 CF chromosomes, the authors used denaturing gradient gel multiplex analysis of CF transmembrane conductance regulator gene segments, a strategy based on blind exhaustive analysis rather than a search for known mutations. This process allowed detection of 11 novel variations comprising two nonsense mutations (Q890X and W1204X), a splice defect (405+4 A→G), a frameshift (3293delA), four presumed missense mutations (S912L, H949Y, L1065P, Q1071P), and three sequence polymorphisms (R31C or 223 C/T, 3471 T/C, and T1220I or 3791 C/T). The authors describe these variations, together with the associated phenotype when defects on both CF chromosomes were identified. 8 refs., 1 fig., 1 tab.

  8. Posttranslational modification and sequence variation of redox-active proteins correlate with biofilm life cycle in natural microbial communities

    SciTech Connect

    Singer, Steven; Erickson, Brian K; Verberkmoes, Nathan C; Hwang, Mona; Shah, Manesh B; Hettich, Robert L; Banfield, Jillian F.; Thelen, Michael P.

    2010-01-01

    Characterizing proteins recovered from natural microbial communities affords the opportunity to correlate protein expression and modification with environmental factors, including species composition and successional stage. Proteogenomic and biochemical studies of pellicle biofilms from subsurface acid mine drainage streams have shown abundant cytochromes from the dominant organism, Leptospirillum Group II. These cytochromes are proposed to be key proteins in aerobic Fe(II) oxidation, the dominant mode of cellular energy generation by the biofilms. In this study, we determined that posttranslational modification and expression of amino-acid sequence variants change as a function of biofilm maturation. For Cytochrome579 (Cyt579), the most abundant cytochrome in the biofilms, late developmental-stage biofilms differed from early-stage biofilms in N-terminal truncations and decreased redox potentials. Expression of sequence variants of two monoheme c-type cytochromes also depended on biofilm development. For Cyt572, an abundant membrane-bound cytochrome, the expression of multiple sequence variants was observed in both early and late developmental-stage biofilms; however, redox potentials of Cyt572 from these different sources did not vary significantly. These cytochrome analyses show a complex response of the Leptospirillum Group II electron transport chain to growth within a microbial community and illustrate the power of multiple proteomics techniques to define biochemistry in natural systems.

  9. Sequence-defined bioactive macrocycles via an acid-catalysed cascade reaction

    NASA Astrophysics Data System (ADS)

    Porel, Mintu; Thornlow, Dana N.; Phan, Ngoc N.; Alabi, Christopher A.

    2016-06-01

    Synthetic macrocycles derived from sequence-defined oligomers are a unique structural class whose ring size, sequence and structure can be tuned via precise organization of the primary sequence. Similar to peptides and other peptidomimetics, these well-defined synthetic macromolecules become pharmacologically relevant when bioactive side chains are incorporated into their primary sequence. In this article, we report the synthesis of oligothioetheramide (oligoTEA) macrocycles via a one-pot acid-catalysed cascade reaction. The versatility of the cyclization chemistry and modularity of the assembly process was demonstrated via the synthesis of >20 diverse oligoTEA macrocycles. Structural characterization via NMR spectroscopy revealed the presence of conformational isomers, which enabled the determination of local chain dynamics within the macromolecular structure. Finally, we demonstrate the biological activity of oligoTEA macrocycles designed to mimic facially amphiphilic antimicrobial peptides. The preliminary results indicate that macrocyclic oligoTEAs with just two-to-three cationic charge centres can elicit potent antibacterial activity against Gram-positive and Gram-negative bacteria.

  10. Unconventional amino acid sequence of the sun anemone (Stoichactis helianthus) polypeptide neurotoxin

    SciTech Connect

    Kem, W.; Dunn, B.; Parten, B.; Pennington, M.; Price, D.

    1986-05-01

    A 5000 dalton polypeptide neurotoxin (Sh-NI) purified by G50 Sephadex, P-cellulose, and SP-Sephadex chromatography was homogeneous by isoelectric focusing. Sh-NI was highly toxic to crayfish (LD50 0.6 μg/kg) but without effect upon mice at 15,000 μg/kg (i.p. injection). The reduced, 3H-carboxymethylated toxin and its fragments were subjected to automatic Edman degradation and the resulting PTH-amino acids were identified by HPLC, back hydrolysis, and scintillation counting. Peptides resulting from proteolytic (clostripain, staphylococcal protease) and chemical (tryptophan) cleavage were sequenced. The sequence is: AACKCDDEGPDIRTAPLTGTVDLGSCNAGWEKCASYYTIIADCCRKKK. This sequence differs considerably from the homologous Anemonia and Anthopleura toxins; many of the identical residues (6 half-cystines, G9, P10, R13, G19, G29, W30) are probably critical for folding rather than receptor recognition. However, the Sh-NI sequence closely resembles Radianthus macrodactylus neurotoxin III and R. paumotensis II. The authors propose that Sh-NI and related Radianthus toxins act upon a different site on the sodium channel.
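
    The residue positions named in this abstract can be checked against the printed sequence. The sketch below assumes 1-based numbering from the N-terminus, which is the usual convention but is not stated in the record; the helper names are illustrative.

```python
# Sh-NI sequence exactly as printed in the abstract.
SH_NI = "AACKCDDEGPDIRTAPLTGTVDLGSCNAGWEKCASYYTIIADCCRKKK"

# Residues the abstract lists as shared with the homologous toxins
# (1-based positions, assumed to count from the N-terminus).
CONSERVED = {9: "G", 10: "P", 13: "R", 19: "G", 29: "G", 30: "W"}


def residue_at(seq: str, position: int) -> str:
    """Return the residue at a 1-based position."""
    return seq[position - 1]


def positions_of(seq: str, residue: str) -> list[int]:
    """Return all 1-based positions at which a residue occurs."""
    return [i for i, aa in enumerate(seq, start=1) if aa == residue]


if __name__ == "__main__":
    print("length:", len(SH_NI))
    print("half-cystine positions:", positions_of(SH_NI, "C"))
    for pos, expected in sorted(CONSERVED.items()):
        status = "ok" if residue_at(SH_NI, pos) == expected else "MISMATCH"
        print(f"{expected}{pos}: {status}")
```

    With this numbering the sequence has 48 residues, and the six cysteines and the six named positions agree with the abstract.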

  11. Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using

    DOEpatents

    Weier, H.U.G.; Gray, J.W.

    1995-06-27

    A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity. 18 figs.

  12. Repeat sequence chromosome specific nucleic acid probes and methods of preparing and using

    DOEpatents

    Weier, Heinz-Ulrich G.; Gray, Joe W.

    1995-01-01

    A primer directed DNA amplification method to isolate efficiently chromosome-specific repeated DNA wherein degenerate oligonucleotide primers are used is disclosed. The probes produced are a heterogeneous mixture that can be used with blocking DNA as a chromosome-specific staining reagent, and/or the elements of the mixture can be screened for high specificity, size and/or high degree of repetition among other parameters. The degenerate primers are sets of primers that vary in sequence but are substantially complementary to highly repeated nucleic acid sequences, preferably clustered within the template DNA, for example, pericentromeric alpha satellite repeat sequences. The template DNA is preferably chromosome-specific. Exemplary primers and probes are disclosed. The probes of this invention can be used to determine the number of chromosomes of a specific type in metaphase spreads, in germ line and/or somatic cell interphase nuclei, micronuclei and/or in tissue sections. Also provided is a method to select arbitrarily repeat sequence probes that can be screened for chromosome-specificity.

  13. Lack of sequence variation in sporadic bovine leucosis in regions of tumour suppressor genes p53 and p16.

    PubMed

    Mayr, B; Grüneis, C; Brem, G; Reifinger, M; Schaffner, G; Hochsteiner, W

    2001-08-01

    Regions of the promoter and exons 5-8 of the tumour suppressor gene p53 were analysed in 25 cases of sporadic bovine leucosis. The study included 17 cases of juvenile leucosis, five cases of adult leucosis and three cases of skin leucosis. Exon 2 of tumour suppressor gene p16 was also investigated in the same samples. No sequence variations were present in the analysed areas of the genes. In p53, this fact represents a clear difference in comparison with enzootic bovine leucosis. In p16, no comparative data are available. PMID:11554494

  14. The non-obese diabetic mouse sequence, annotation and variation resource: an aid for investigating type 1 diabetes

    PubMed Central

    Steward, Charles A.; Gonzalez, Jose M.; Trevanion, Steve; Sheppard, Dan; Kerry, Giselle; Gilbert, James G. R.; Wicker, Linda S.; Rogers, Jane; Harrow, Jennifer L.

    2013-01-01

    Model organisms are becoming increasingly important for the study of complex diseases such as type 1 diabetes (T1D). The non-obese diabetic (NOD) mouse is an experimental model for T1D having been bred to develop the disease spontaneously in a process that is similar to humans. Genetic analysis of the NOD mouse has identified around 50 disease loci, which have the nomenclature Idd for insulin-dependent diabetes, distributed across at least 11 different chromosomes. In total, 21 Idd regions across 6 chromosomes, that are major contributors to T1D susceptibility or resistance, were selected for finished sequencing and annotation at the Wellcome Trust Sanger Institute. Here we describe the generation of 40.4 mega base-pairs of finished sequence from 289 bacterial artificial chromosomes for the NOD mouse. Manual annotation has identified 738 genes in the diabetes sensitive NOD mouse and 765 genes in homologous regions of the diabetes resistant C57BL/6J reference mouse across 19 candidate Idd regions. This has allowed us to call variation consequences between homologous exonic sequences for all annotated regions in the two mouse strains. We demonstrate the importance of this resource further by illustrating the technical difficulties that regions of inter-strain structural variation between the NOD mouse and the C57BL/6J reference mouse can cause for current next generation sequencing and assembly techniques. Furthermore, we have established that the variation rate in the Idd regions is 2.3 times higher than the mean found for the whole genome assembly for the NOD/ShiLtJ genome, which we suggest reflects the fact that positive selection for functional variation in immune genes is beneficial in regard to host defence. In summary, we provide an important resource, which aids the analysis of potential causative genes involved in T1D susceptibility. Database URLs: http://www.sanger.ac.uk/resources/mouse/nod/; http://vega

  15. Chimeric proteins for detection and quantitation of DNA mutations, DNA sequence variations, DNA damage and DNA mismatches

    DOEpatents

    McCutchen-Maloney, Sandra L.

    2002-01-01

    Chimeric proteins having both DNA mutation binding activity and nuclease activity are synthesized by recombinant technology. The proteins are of the general formula A-L-B and B-L-A where A is a peptide having DNA mutation binding activity, L is a linker and B is a peptide having nuclease activity. The chimeric proteins are useful for detection and identification of DNA sequence variations including DNA mutations (including DNA damage and mismatches) by binding to the DNA mutation and cutting the DNA once the DNA mutation is detected.

  16. Detection of Nucleic Acids with Graphene Nanopores: Ab Initio Characterization of a Novel Sequencing Device

    NASA Astrophysics Data System (ADS)

    Nelson, Tammie; Zhang, Bo; Prezhdo, Oleg

    2010-03-01

    We report an ab initio study of the interaction of two nucleobases, cytosine and adenine, with a novel graphene nanopore device for detecting the base sequence of a single-stranded nucleic acid (ssDNA or RNA). The nucleobases were inserted into a pore in a graphene nanoribbon, and the electrical current and conductance spectra were calculated as functions of voltage applied across the nanoribbon. The conductance spectra and charge densities were analyzed in the presence of each nucleobase in the graphene nanopore. The results indicate that, due to significant differences in the conductance spectra, the proposed device has adequate sensitivity to discriminate between different nucleotides. Moreover, we show that the conductance spectrum of a nucleotide is not affected by its orientation inside the graphene nanopore. The proposed technique may be extremely useful for real applications in developing ultrafast, low-cost DNA sequencing methods.

  17. Patterns of structural and sequence variation within isotype lineages of the Neisseria meningitidis transferrin receptor system

    PubMed Central

    Adamiak, Paul; Calmettes, Charles; Moraes, Trevor F; Schryvers, Anthony B

    2015-01-01

    Neisseria meningitidis inhabits the human upper respiratory tract and is an important cause of sepsis and meningitis. A surface receptor composed of transferrin-binding proteins A and B (TbpA and TbpB) is responsible for acquiring iron from host transferrin. Sequence and immunological diversity divides TbpBs into two distinct lineages: isotype I and isotype II. Two representative isotype I and II strains, B16B6 and M982, differ in their dependence on TbpB for in vitro growth on exogenous transferrin. The crystal structure of TbpB and a structural model for TbpA from the representative isotype I N. meningitidis strain B16B6 were obtained. The structures were integrated with a comprehensive analysis of the sequence diversity of these proteins to probe for potential functional differences. A distinct isotype I TbpA was identified that co-varied with TbpB and lacked sequence in the region for the loop 3 α-helix that is proposed to be involved in iron removal from transferrin. The tightly associated isotype I TbpBs had a distinct anchor peptide region, a distinct, smaller linker region between the lobes and lacked the large loops in the isotype II C-lobe. Sequences of the intact TbpB, the TbpB N-lobe, the TbpB C-lobe, and TbpA were subjected to phylogenetic analyses. The phylogenetic clustering of TbpA and the TbpB C-lobe were similar with two main branches comprising the isotype 1 and isotype 2 TbpBs, possibly suggesting an association between TbpA and the TbpB C-lobe. The intact TbpB and TbpB N-lobe had 4 main branches, one consisting of the isotype 1 TbpBs. One isotype 2 TbpB cluster appeared to consist of isotype 1 N-lobe sequences and isotype 2 C-lobe sequences, indicating the swapping of N-lobes and C-lobes. Our findings should inform future studies on the interaction between TbpB and TbpA and the process of iron acquisition. PMID:25800619

  18. Morphological transformation of calcite crystal growth by prismatic "acidic" polypeptide sequences.

    SciTech Connect

    Kim, I; Giocondi, J L; Orme, C A; Collino, J; Evans, J S

    2007-02-13

    Many of the interesting mechanical and materials properties of the mollusk shell are thought to stem from the prismatic calcite crystal assemblies within this composite structure. It is now evident that proteins play a major role in the formation of these assemblies. Recently, a superfamily of 7 conserved prismatic layer-specific mollusk shell proteins, Asprich, were sequenced, and the 42 AA C-terminal sequence region of this protein superfamily was found to introduce surface voids or porosities on calcite crystals in vitro. Using AFM imaging techniques, we further investigate the effect that this 42 AA domain (Fragment-2) and its constituent subdomains, DEAD-17 and Acidic-2, have on the morphology and growth kinetics of calcite dislocation hillocks. We find that Fragment-2 adsorbs on terrace surfaces and pins acute steps, accelerates then decelerates the growth of obtuse steps, forms clusters and voids on terrace surfaces, and transforms calcite hillock morphology from a rhombohedral form to a rounded one. These results mirror yet are distinct from some of the earlier findings obtained for nacreous polypeptides. The subdomains Acidic-2 and DEAD-17 were found to accelerate then decelerate obtuse steps and induce oval rather than rounded hillock morphologies. Unlike DEAD-17, Acidic-2 does form clusters on terrace surfaces and exhibits stronger obtuse velocity inhibition effects than either DEAD-17 or Fragment-2. Interestingly, a 1:1 mixture of both subdomains induces an irregular polygonal morphology to hillocks, and exhibits the highest degree of acute step pinning and obtuse step velocity inhibition. This suggests that there is some interplay between subdomains within an intra (Fragment-2) or intermolecular (1:1 mixture) context, and sequence interplay phenomena may be employed by biomineralization proteins to exert net effects on crystal growth and morphology.

  19. Fast computational methods for predicting protein structure from primary amino acid sequence

    DOEpatents

    Agarwal, Pratul Kumar

    2011-07-19

    The present invention provides a method utilizing primary amino acid sequence of a protein, energy minimization, molecular dynamics and protein vibrational modes to predict three-dimensional structure of a protein. The present invention also determines possible intermediates in the protein folding pathway. The present invention has important applications to the design of novel drugs as well as protein engineering. The present invention predicts the three-dimensional structure of a protein independent of size of the protein, overcoming a significant limitation in the prior art.

  20. Natural variation in Brachypodium distachyon: Deep Sequencing of Highly Diverse Natural Accessions (2013 DOE JGI Genomics of Energy and Environment 8th Annual User Meeting)

    SciTech Connect

    Gordon, Sean

    2013-03-01

    Sean Gordon of the USDA on "Natural variation in Brachypodium distachyon: Deep Sequencing of Highly Diverse Natural Accessions" at the 8th Annual Genomics of Energy & Environment Meeting on March 27, 2013 in Walnut Creek, Calif.

  1. Mitochondrial DNA D-loop sequence variation in maternal lineages of Iranian native horses.

    PubMed

    Moridi, M; Masoudi, A A; Vaez Torshizi, R; Hill, E W

    2013-04-01

    To understand the origin and genetic diversity of Iranian native horses, mitochondrial DNA (mtDNA) D-loop sequences were generated for 95 horses from five breeds sampled in eight geographical locations in Iran. Sequence analysis of a 247-bp segment revealed a total of 27 haplotypes with 38 polymorphic sites. Twelve of 19 mtDNA haplogroups were identified in the samples. The most common haplotypes were found within haplogroup X2. Within-population haplotype and nucleotide diversities of the five breeds ranged from 0.838 ± 0.056 to 0.974 ± 0.022 and 0.011 ± 0.002 to 0.021 ± 0.001 respectively, indicating a relatively high genetic diversity in Iranian horses. The identification of several ancient sequences common between the breeds suggests that the lineage of the majority of Iranian horse breeds is old and obviously originated from a vast number of mares. We found in all native Iranian horse breeds lineages of the haplogroups D and K, which is concordant with the previous findings of Asian origins of these haplogroups. The presence of haplotypes E and K in our study also is consistent with a geographical west-east direction of increasing frequency of these haplotypes and a genetic fusion in Iranian horse breeds. PMID:22732008
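
    The haplotype and nucleotide diversities quoted above are standard summary statistics for aligned sequences. The following is a minimal sketch of how they are commonly computed (Nei's haplotype diversity and the mean pairwise proportion of differing sites), not the authors' pipeline; the toy alignment and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def haplotype_diversity(seqs):
    """Nei's haplotype diversity: n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    counts = Counter(seqs)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))


def nucleotide_diversity(seqs):
    """Mean proportion of differing sites over all pairs of equal-length aligned sequences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)


# Hypothetical toy alignment (not data from the study): 4 sequences, 3 haplotypes.
toy = ["ACGTACGT", "ACGTACGT", "ACGTACGA", "ACCTACGA"]
print(round(haplotype_diversity(toy), 3))
print(round(nucleotide_diversity(toy), 3))
```

    Real analyses would also handle gaps, ambiguous bases and standard errors, which this sketch ignores.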

  2. Spatio-Temporal Variations of High and Low Nucleic Acid Content Bacteria in an Exorheic River

    PubMed Central

    Ma, Lili; Ji, Yurui; Bartlam, Mark; Wang, Yingying

    2016-01-01

    Bacteria with high nucleic acid (HNA) and low nucleic acid (LNA) content are commonly observed in aquatic environments. To date, limited knowledge is available on their temporal and spatial variations in freshwater environments. Here an investigation of HNA and LNA bacterial abundance and their flow cytometric characteristics was conducted in an exorheic river (Haihe River, Northern China) over a one year period covering September (autumn) 2011, December (winter) 2011, April (spring) 2012, and July (summer) 2012. The results showed that LNA and HNA bacteria contributed similarly to the total bacterial abundance on both the spatial and temporal scale. The variability of HNA on abundance, fluorescence intensity (FL1) and side scatter (SSC) were more sensitive to environmental factors than that of LNA bacteria. Meanwhile, the relative distance of SSC between HNA and LNA was more variable than that of FL1. Multivariate analysis further demonstrated that the influence of geographical distance (reflected by the salinity gradient along river to ocean) and temporal changes (as temperature variation due to seasonal succession) on the patterns of LNA and HNA were stronger than the effects of nutrient conditions. Furthermore, the results demonstrated that the distribution of LNA and HNA bacteria, including the abundance, FL1 and SSC, was controlled by different variables. The results suggested that LNA and HNA bacteria might play different ecological roles in the exorheic river. PMID:27082986

  3. Amino-terminal amino acid sequence of the major structural polypeptides of avian retroviruses: sequence homology between reticuloendotheliosis virus p30 and p30s of mammalian retroviruses.

    PubMed Central

    Hunter, E; Bhown, A S; Bennett, J C

    1978-01-01

    The major structural polypeptides, p30 of reticuloendotheliosis virus (REV) (strain T) and p27 of avian sarcoma virus B77, have been compared with regard to amino acid composition, NH2-terminal amino acid sequence, and immunological crossreactions. The amino acid composition of the two polypeptides is distinct, and a comparison of the first 30 NH2-terminal amino acids of REV p30 with that for the first 25 of B77 p27 yields only three homologous residues. In competition radioimmunoassays the polypeptides show no crossreactivity. A comparison of the amino acid composition and NH2-terminal amino acid sequence of REV p30 with those reported for several mammalian retrovirus p30s shows remarkable similarities. Both REV and mammalian p30s contain a large number of polar residues in their amino acid composition and show approximately 40% homology in the first 30 NH2-terminal amino acids. No crossreactivity could be observed, however, in competition radioimmunoassays between Rauscher murine leukemia virus p30 and that of REV. The observations reported here suggest a close evolutionary relationship between REV and the mammalian retroviruses. PMID:208072

  4. Serum uric acid concentrations and SLC2A9 genetic variation in Hispanic children: The Viva La Familia Study

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Elevated concentrations of serum uric acid are associated with increased risk of gout and renal and cardiovascular diseases. Genetic studies in adults have consistently identified associations of solute carrier family 2, member 9 (SLC2A9), polymorphisms with variation in serum uric acid. However, it...

  5. Purification and amino acid sequence of aminopeptidase P from pig kidney.

    PubMed

    Vergas Romero, C; Neudorfer, I; Mann, K; Schäfer, W

    1995-04-01

    Aminopeptidase P from kidney cortex was purified in high yield (recovery greater than or equal to 20%) by a series of column chromatographic steps after solubilization of the membrane-bound glycoprotein with n-butanol. A coupled enzymic assay, using Gly-Pro-Pro-NH-Nap as substrate and dipeptidyl-peptidase IV as auxiliary enzyme, was used to monitor the purification. The purification procedure yielded two forms of aminopeptidase P differing in their carbohydrate composition (glycoforms). Both enzyme preparations were homogeneous as assessed by SDS/PAGE, silver staining, and isoelectric focusing. Both forms possessed the same substrate specificity, catalysed the same reaction, and consisted of identical protein chains. The amino acid sequence determined by Edman degradation and mass spectrometry consisted of 623 amino acids. Six N-glycosylation sites, all contained in the N-terminal half of the protein, were characterized. PMID:7744038

  6. Draft Genome Sequence of Cupriavidus sp. Strain SK-3, a 4-Chlorobiphenyl- and 4-Chlorobenzoic Acid-Degrading Bacterium

    PubMed Central

    Vilo, Claudia; Benedik, Michael J.; Ilori, Matthew

    2014-01-01

    We report the draft genome sequence of Cupriavidus sp. strain SK-3, which can use 4-chlorobiphenyl and 4-chlorobenzoic acid as the sole carbon source for growth. The draft genome sequence allowed the study of the polychlorinated biphenyl degradation mechanism and the recharacterization of the strain SK-3 as a Cupriavidus species. PMID:24994805

  7. Draft Genome Sequence of Bacillus subtilis subsp. natto Strain CGMCC 2108, a High Producer of Poly-γ-Glutamic Acid

    PubMed Central

    Tan, Siyuan; Su, Anping; Zhang, Chen; Ren, Yuanyuan

    2016-01-01

    Here, we report the 4.1-Mb draft genome sequence of Bacillus subtilis subsp. natto strain CGMCC 2108, a high producer of poly-γ-glutamic acid (γ-PGA). This sequence will provide further help for the biosynthesis of γ-PGA and will greatly facilitate research efforts in metabolic engineering of B. subtilis subsp. natto strain CGMCC 2108. PMID:27231363

  8. New monoclonal antibodies to the Ebola virus glycoprotein: Identification and analysis of the amino acid sequence of the variable domains.

    PubMed

    Panina, A A; Aliev, T K; Shemchukova, O B; Dement'yeva, I G; Varlamov, N E; Pozdnyakova, L P; Bokov, M N; Dolgikh, D A; Sveshnikov, P G; Kirpichnikov, M P

    2016-03-01

    We determined the nucleotide and amino acid sequences of variable domains of three new monoclonal antibodies to the glycoprotein of Ebola virus capsid. The framework and hypervariable regions of immunoglobulin heavy and light chains were identified. The primary structures were confirmed using mass spectrometry analysis. Immunoglobulin database search showed the uniqueness of the sequences obtained. PMID:27193713

  9. Genome Sequence of the Lactic Acid Bacterium Lactococcus lactis subsp. lactis TOMSC161, Isolated from a Nonscalded Curd Pressed Cheese

    PubMed Central

    Velly, H.; Abraham, A.-L.; Loux, V.; Delacroix-Buchet, A.; Fonseca, F.; Bouix, M.

    2014-01-01

    Lactococcus lactis is a lactic acid bacterium used in the production of many fermented foods, such as dairy products. Here, we report the genome sequence of L. lactis subsp. lactis TOMSC161, isolated from nonscalded curd pressed cheese. This genome sequence provides information in relation to dairy environment adaptation. PMID:25377704

  10. Draft Genome Sequence of Bacillus subtilis subsp. natto Strain CGMCC 2108, a High Producer of Poly-γ-Glutamic Acid.

    PubMed

    Tan, Siyuan; Meng, Yonghong; Su, Anping; Zhang, Chen; Ren, Yuanyuan

    2016-01-01

    Here, we report the 4.1-Mb draft genome sequence of Bacillus subtilis subsp. natto strain CGMCC 2108, a high producer of poly-γ-glutamic acid (γ-PGA). This sequence will provide further help for the biosynthesis of γ-PGA and will greatly facilitate research efforts in metabolic engineering of B. subtilis subsp. natto strain CGMCC 2108. PMID:27231363

  11. Chemical profile and seasonal variation of phenolic acid content in bastard balm (Melittis melissophyllum L., Lamiaceae).

    PubMed

    Skrzypczak-Pietraszek, Ewa; Pietraszek, Jacek

    2012-07-01

    Melittis melissophyllum L. is an old medicinal plant. Nowadays it is used only in folk medicine, but formerly it was applied in official medicine as a natural product described in the French Pharmacopoeia. M. melissophyllum herbs used in our studies were collected from two localities in Poland in May and September. Methanolic plant extracts were purified by means of solid-phase extraction and then analysed by HPLC-DAD for their phenolic acid profile. Eleven compounds were identified in all plant samples and quantitatively analysed as: protocatechuic, chlorogenic, p-hydroxybenzoic, vanillic, caffeic, syringic, p-coumaric, ferulic, sinapic, o-coumaric and cinnamic acid. Plant materials contained free and bound phenolic acids. The main compounds were: p-hydroxybenzoic acid (30.21-54.16 mg/100 g dw and 37.04-56.75 mg/100 g dw, free and bound, respectively) and p-coumaric acid (40.48-80.55 mg/100 g dw and 28.09-40.85 mg/100 g dw, free and bound, respectively). The highest amounts of the investigated compounds were found in all samples collected in September, e.g. p-hydroxybenzoic acid (September 51.72-54.16 mg/100 g dw vs. May 30.21-34.07 mg/100 g dw), p-coumaric acid (September 77.14-80.55 mg/100 g dw vs. May 40.48-43.25 mg/100 g dw). Multivariate statistical and data mining techniques, such as cluster analysis (CA) and principal component analysis (PCA), were used to characterize the sample populations according to the geographical localities, vegetation period and compound form (free or bound). To the best of our knowledge we report for the first time the results of quantitative analysis of M. melissophyllum phenolic acids and seasonal variation of their content. Plant herbs are usually collected at flowering for plant-derived medical preparations. Our results show that it is not always the optimal time for the highest contents of active compounds. PMID:22513117

  12. ANTICALIgN: visualizing, editing and analyzing combined nucleotide and amino acid sequence alignments for combinatorial protein engineering.

    PubMed

    Jarasch, Alexander; Kopp, Melanie; Eggenstein, Evelyn; Richter, Antonia; Gebauer, Michaela; Skerra, Arne

    2016-07-01

    ANTICALIgN is an interactive software developed to simultaneously visualize, analyze and modify alignments of DNA and/or protein sequences that arise during combinatorial protein engineering, design and selection. ANTICALIgN combines powerful functions known from currently available sequence analysis tools with unique features for protein engineering, in particular the possibility to display and manipulate nucleotide sequences and their translated amino acid sequences at the same time. ANTICALIgN offers both template-based multiple sequence alignment (MSA), using the unmutated protein as reference, and conventional global alignment, to compare sequences that share an evolutionary relationship. The application of similarity-based clustering algorithms facilitates the identification of duplicates or of conserved sequence features among a set of selected clones. Imported nucleotide sequences from DNA sequence analysis are automatically translated into the corresponding amino acid sequences and displayed, offering numerous options for selecting reading frames, highlighting of sequence features and graphical layout of the MSA. The MSA complexity can be reduced by hiding the conserved nucleotide and/or amino acid residues, thus putting emphasis on the relevant mutated positions. ANTICALIgN is also able to handle suppressed stop codons or even to incorporate non-natural amino acids into a coding sequence. We demonstrate crucial functions of ANTICALIgN in an example of Anticalins selected from a lipocalin random library against the fibronectin extradomain B (ED-B), an established marker of tumor vasculature. Apart from engineered protein scaffolds, ANTICALIgN provides a powerful tool in the area of antibody engineering and for directed enzyme evolution. PMID:27261456
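
    The two operations this abstract emphasises are translating a nucleotide sequence in a chosen reading frame and reporting only the positions that differ from an unmutated template. The sketch below illustrates both under simple assumptions; it is not ANTICALIgN's code, and the function names and example sequences are hypothetical. It uses the standard genetic code only and does not model suppressed stop codons or non-natural amino acids.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna, frame=0):
    """Translate DNA in the given frame (0, 1 or 2); unknown codons become 'X'."""
    dna = dna.upper()
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def mutated_positions(template, clone):
    """1-based positions at which an aligned clone differs from the template."""
    return [i for i, (a, b) in enumerate(zip(template, clone), start=1) if a != b]


# Hypothetical template and selected clone (one codon changed: GCT -> GTT).
template_dna = "ATGGCTGAA"
clone_dna = "ATGGTTGAA"
print(translate(template_dna), translate(clone_dna))
print(mutated_positions(translate(template_dna), translate(clone_dna)))
```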

  13. FimH family of type 1 fimbrial adhesins: functional heterogeneity due to minor sequence variations among fimH genes.

    PubMed Central

    Sokurenko, E V; Courtney, H S; Ohman, D E; Klemm, P; Hasty, D L

    1994-01-01

    We recently reported that the type 1-fimbriated Escherichia coli strains CSH-50 and HB101(pPKL4), both K-12 derivatives, have different patterns of adhesion to yeast mannan, human plasma fibronectin, and fibronectin derivatives, suggesting functional heterogeneity of type 1 fimbriae. In this report, we provide evidence that this functional heterogeneity is due to variations in the fimH genes. We also investigated functional heterogeneity among clinical isolates and whether variation in fimH genes accounts for differences in receptor specificity. Twelve isolates obtained from human urine were tested for their ability to adhere to mannan, fibronectin, periodate-treated fibronectin, and a synthetic peptide copying the 30 amino-terminal residues of fibronectin. CSH-50 and HB101(pPKL4) were tested for comparison. Selected isolates were also tested for adhesion to purified fragments spanning the entire fibronectin molecule. Three distinct functional classes, designated M, MF, and MFP, were observed. The fimH genes were amplified by PCR from chromosomal DNA obtained from representative strains and expressed in a delta fim strain (AAEC191A) transformed with a recombinant plasmid containing the entire fim gene cluster but with a translational stop-linker inserted into the fimH gene (pPKL114). Cloned fimH genes conferred on AAEC191A(pPKL114) receptor specificities mimicking those of the parent strains from which the fimH genes were obtained, demonstrating that the FimH subunits are responsible for the functional heterogeneity. Representative fimH genes were sequenced, and the deduced amino acid sequences were compared with the previously published FimH sequence. Allelic variants exhibiting >98% homology and encoding proteins differing by as little as a single amino acid substitution confer distinct adhesive phenotypes. 
    This unexpected adhesive diversity within the FimH family broadens the scope of potential receptors for enterobacterial adhesion and may lead to a fundamental

  14. Multiple Amino Acid Sequence Alignment Nitrogenase Component 1: Insights into Phylogenetics and Structure-Function Relationships

    PubMed Central

    Howard, James B.; Kechris, Katerina J.; Rees, Douglas C.; Glazer, Alexander N.

    2013-01-01

    Amino acid residues critical for a protein's structure-function are retained by natural selection and these residues are identified by the level of variance in co-aligned homologous protein sequences. The relevant residues in the nitrogen fixation Component 1 α- and β-subunits were identified by the alignment of 95 protein sequences. Proteins were included from species encompassing multiple microbial phyla and diverse ecological niches as well as the nitrogen fixation genotypes, anf, nif, and vnf, which encode proteins associated with cofactors differing at one metal site. After adjusting for differences in sequence length, insertions, and deletions, the remaining >85% of the sequence co-aligned the subunits from the three genotypes. Six Groups, designated Anf, Vnf, and Nif I-IV, were assigned based upon genetic origin, sequence adjustments, and conserved residues. Both subunits subdivided into the same groups. Invariant and single variant residues were identified and were defined as “core” for nitrogenase function. Three species in Group Nif-III, Candidatus Desulforudis audaxviator, Desulfotomaculum kuznetsovii, and Thermodesulfatator indicus, were found to have a seleno-cysteine that replaces one cysteinyl ligand of the 8Fe:7S, P-cluster. Subsets of invariant residues, limited to individual groups, were identified; these unique residues help identify the gene of origin (anf, nif, or vnf) yet should not be considered diagnostic of the metal content of associated cofactors. Fourteen of the 19 residues that compose the cofactor pocket are invariant or single variant; the other five residues are highly variable but do not correlate with the putative metal content of the cofactor. The variable residues are clustered on one side of the cofactor, away from other functional centers in the three dimensional structure. Many of the invariant and single variant residues were not previously recognized as potentially critical and their identification provides the bases
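
    Classifying alignment columns by how many residue types they contain can be sketched as follows. This is an illustration only: the abstract does not define "single variant" precisely, so the code assumes it means a gap-free column with exactly two residue types, and the toy alignment and function name are hypothetical.

```python
def classify_columns(alignment):
    """Label each column of an equal-length alignment by its residue variability."""
    labels = []
    for column in zip(*alignment):
        residues = set(column)
        if "-" in residues:
            labels.append("gapped")
        elif len(residues) == 1:
            labels.append("invariant")
        elif len(residues) == 2:
            # Assumption: "single variant" = exactly two residue types observed.
            labels.append("single variant")
        else:
            labels.append("variable")
    return labels


# Hypothetical three-sequence toy alignment (not nitrogenase data).
toy_alignment = ["MKCHA", "MKCHG", "MRCHS"]
for position, label in enumerate(classify_columns(toy_alignment), start=1):
    print(position, label)
```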

  15. Phase Variation in the Helicobacter pylori Phospholipase A Gene and Its Role in Acid Adaptation

    PubMed Central

    Tannæs, Tone; Dekker, Niek; Bukholm, Geir; Bijlsma, Jetta J. E.; Appelmelk, Ben J.

    2001-01-01

    Previously, we have shown that Helicobacter pylori can spontaneously and reversibly change its membrane lipid composition, producing variants with low or high content of lysophospholipids. The “lyso” variant contains a high percentage of lysophospholipids, adheres better to epithelial cells, and releases more proteins such as urease and VacA, compared to the “normal” variant, which has a low content of lysophospholipids. Prolonged growth of the normal variant at pH 3.5, but not under neutral conditions, leads to enrichment of lyso variant colonies, suggesting that the colony switch is relevant to acid adaptation. In this study we show that the change in membrane lipid composition is due to phase variation in the pldA gene. A change in the (C) tract length of this gene results in reversible frameshifts, translation of a full-length or truncated pldA, and the production of active or inactive outer membrane phospholipase A (OMPLA). The role of OMPLA in determining the colony morphology was confirmed by the construction of an OMPLA-negative mutant. Furthermore, variants with an active OMPLA were able to survive acidic conditions better than variants with the inactive form. This explains why the lyso variant is selected at low pH. Our studies demonstrate that phase variation in the pldA gene, resulting in an active form of OMPLA, is important for survival under acidic conditions. We also demonstrated the active OMPLA genotype in fresh isolates of H. pylori from patients referred to gastroscopy for dyspepsia. PMID:11705905
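
    The ON/OFF switch described here follows from reading-frame arithmetic: adding or removing bases in a homopolymeric tract keeps the downstream codons in frame only when the length change is a multiple of three. A minimal sketch of that logic follows. The tract length treated as "ON" and the toy sequences are hypothetical, since the abstract does not give the actual (C) tract lengths in pldA.

```python
import re


def c_tract_length(seq):
    """Length of the longest uninterrupted run of C in the sequence."""
    runs = re.findall(r"C+", seq.upper())
    return max((len(run) for run in runs), default=0)


def is_in_frame(seq, on_length):
    """True if the tract length differs from the in-frame length by a multiple of 3."""
    return (c_tract_length(seq) - on_length) % 3 == 0


ON_LENGTH = 9  # hypothetical in-frame tract length, for illustration only

for n in (8, 9, 10, 12):
    toy_gene = "ATGAAA" + "C" * n + "GGTTAA"  # hypothetical toy sequence
    state = "full-length (ON)" if is_in_frame(toy_gene, ON_LENGTH) else "frameshifted (OFF)"
    print(n, state)
```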

  16. Liver fatty acid binding protein: species variation and the accommodation of different ligands.

    PubMed

    Thompson, J; Reese-Wagoner, A; Banaszak, L

    1999-11-23

    The crystal structure of rat liver fatty acid binding protein (LFABP) and an alignment of amino acid sequences of all known species have been used to demonstrate two groups or sub-classes. Based on estimates at neutral pH and the electrostatic field calculated using the crystal coordinates, some evidence of changes that occur in going from holo- to apo-forms has been obtained. LFABP belongs to a large family frequently referred to as the intracellular lipid binding proteins or iLBPs. LFABP, unlike other family members, has two fatty acid binding sites. The two cavity sites have been reviewed and arguments for interactions between the sites are presented. Based on the crystal structure of rat LFABP, differences between the A and B groups have been postulated. Last of all, hypothetical models have been built of complexes of LFABP and heme, and LFABP and oleoyl CoA. In both cases, the stoichiometry is one to one and the models show why this is likely. PMID:10570240

  17. Genetic variation in Labeo fimbriatus (Cypriniformes: Cyprinidae) populations as revealed by partial cytochrome b sequences of mitochondrial DNA.

    PubMed

    Swain, Subrat Kumar; Bej, Dillip; Das, Sofia Priyadarsani; Sahoo, Lakshman; Jayasankar, Pallipuram; Das, Pratap Chandra; Das, Paramananda

    2016-05-01

    Labeo fimbriatus, a medium-sized carp, is a commercially important aquaculture species in the Indian subcontinent. In the present study, the genetic diversity and population structure of four Indian riverine populations of L. fimbriatus were evaluated using partial cytochrome b sequences of mitochondrial DNA. Sequencing and analysis of this gene from 108 individuals defined 7 distinct haplotypes. Haplotype diversity (Hd) and nucleotide diversity (π) ranged from 0.067 to 0.405 and 0.00023 to 0.03231, respectively. The Mahanadi population had the highest π level. Analysis of molecular variance (AMOVA) indicated that 47.36% of the genetic variation was contained within populations and 53.76% among groups. Pairwise FST analysis indicated that there was little or no genetic differentiation among populations (-0.0018 to 0.4572) from different geographical regions, except for the Mahanadi population. The Mahanadi population can be considered a separate stock from the other three riverine populations. Accordingly, the genetic information generated in this study can inform the choice of base population for sustainable selective breeding programs of this species. PMID:25329277

  18. Evolution of simple sequence repeat-mediated phase variation in bacterial genomes.

    PubMed

    Bayliss, Christopher D; Palmer, Michael E

    2012-09-01

    Mutability as a mechanism for rapid adaptation to environmental challenge is an alluringly simple concept whose apotheosis is realized in simple sequence repeats (SSR). Bacterial genomes of several species contain SSRs with a proven role in adaptation to environmental fluctuations. SSRs are hypermutable and generate reversible mutations in localized regions of bacterial genomes, leading to phase-variable ON/OFF switches in gene expression. The application of genetic, bioinformatic, and mathematical/computational modeling approaches is revolutionizing our current understanding of how genomic molecular forces and environmental factors influence SSR-mediated adaptation and led to the evolution of this mechanism of localized hypermutation in bacterial genomes. PMID:22954215

  19. A haplotype method detects diverse scenarios of local adaptation from genomic sequence variation.

    PubMed

    Lange, Jeremy D; Pool, John E

    2016-07-01

    Identifying genomic targets of population-specific positive selection is a major goal in several areas of basic and applied biology. However, it is unclear how often such selection should act on new mutations versus standing genetic variation or recurrent mutation, and furthermore, favoured alleles may either become fixed or remain variable in the population. Very few population genetic statistics are sensitive to all of these modes of selection. Here, we introduce and evaluate the Comparative Haplotype Identity statistic (χMD), which assesses whether pairwise haplotype sharing at a locus in one population is unusually large compared with another population, relative to genomewide trends. Using simulations that emulate human and Drosophila genetic variation, we find that χMD is sensitive to a wide range of selection scenarios, and for some very challenging cases (e.g. partial soft sweeps), it outperforms other two-population statistics. We also find that, as with FST, our haplotype approach has the ability to detect surprisingly ancient selective sweeps. Particularly for the scenarios resembling human variation, we find that χMD outperforms other frequency- and haplotype-based statistics for soft and/or partial selective sweeps. Applying χMD and other between-population statistics to published population genomic data from D. melanogaster, we find both shared and unique genes and functional categories identified by each statistic. The broad utility and computational simplicity of χMD will make it an especially valuable tool in the search for genes targeted by local adaptation. PMID:27135633

  20. Draft Genome Sequences of Gluconobacter cerinus CECT 9110 and Gluconobacter japonicus CECT 8443, Acetic Acid Bacteria Isolated from Grape Must

    PubMed Central

    Sainz, Florencia

    2016-01-01

    We report here the draft genome sequences of Gluconobacter cerinus strain CECT9110 and Gluconobacter japonicus CECT8443, acetic acid bacteria isolated from grape must. Gluconobacter species are well known for their ability to oxidize sugar alcohols into the corresponding acids. Our objective was to select strains to oxidize effectively d-glucose. PMID:27365351

  1. Variation in photoreactivity of iron hydroxides taken from an acidic mountain stream

    SciTech Connect

    Hrncir, D.C.; McKnight, D.

    1998-07-15

    The photoreduction of iron hydroxides is known to exert significant influence over many biogeochemical processes in streams impacted by acid mine drainage. Using laboratory and in-stream measurements, the variation in reactivity of iron hydroxides taken from a stream receiving acid mine drainage (AMD) was studied. The reactivity decreased for material collected at sites progressively downstream from the AMD inflow. In the presence of two simple organic ligands, photoreduction increased for the fresher iron hydroxides but remained unchanged for the older hydroxides. The importance of ligand coordination to the enhancement of photoreduction in natural waters was further demonstrated in experiments using two types of fulvic acids. In-stream measurements of hydrogen peroxide concentration are consistent with the conclusions drawn from the batch experiments. Iron hydroxides were observed to age over time, becoming less photoreactive. This aging was accompanied by an increase in crystallinity. The loss of photoreactivity for the older material can be explained by a decrease in the number of active surface sites, a change in the nature of the surface sites, or a combination of both.

  2. Swfoldrate: predicting protein folding rates from amino acid sequence with sliding window method.

    PubMed

    Cheng, Xiang; Xiao, Xuan; Wu, Zhi-cheng; Wang, Pu; Lin, Wei-zhong

    2013-01-01

    Protein folding is the process by which a protein proceeds from its denatured state to its specific biologically active conformation. Understanding the relationship between sequences and the folding rates of proteins remains an important challenge. Most previous methods of predicting protein folding rate require the tertiary structure of a protein as an input. In this study, the long-range and short-range contacts in proteins were used to derive an extended version of the pseudo amino acid composition based on a sliding window method. This method is capable of predicting protein folding rates from the amino acid sequence alone, without the aid of any structural class information. We systematically studied the contributions of individual features to folding rate prediction. The optimal feature selection procedure combines forward feature selection with sequential backward selection. The method was demonstrated on a large dataset using the jackknife cross-validation test. The predictor was built on numerous physicochemical and statistical features of proteins using a nonlinear support vector machine (SVM) regression model, and it obtained excellent agreement between predicted and experimentally observed folding rates of proteins. The correlation coefficient is 0.9313 and the standard error is 2.2692. The prediction server is freely available at http://www.jci-bioinfo.cn/swfrate/input.jsp. PMID:22933332
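
    The general idea of a sliding-window sequence feature can be sketched as below. This is a generic illustration, not the Swfoldrate feature set: the abstract does not specify the window size or the physicochemical properties used, so the window length, the residue class and the function names are all assumptions.

```python
def sliding_windows(seq, size):
    """All contiguous windows of the given size, in order along the sequence."""
    return [seq[i:i + size] for i in range(len(seq) - size + 1)]


def window_fraction(seq, size, residues):
    """Fraction of residues in each window that belong to the given residue class."""
    return [sum(aa in residues for aa in window) / size
            for window in sliding_windows(seq, size)]


# Hypothetical residue class and toy sequence, for illustration only.
HYDROPHOBIC = set("AVILMFWC")
profile = window_fraction("MKVLAAGDE", size=3, residues=HYDROPHOBIC)
print(profile)

# A per-window profile is usually reduced to fixed-length features
# (here simply mean and maximum) before being passed to a regression model.
features = [sum(profile) / len(profile), max(profile)]
print(features)
```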

  3. From amino acid sequence to bioactivity: The biomedical potential of antitumor peptides.

    PubMed

    Blanco-Míguez, Aitor; Gutiérrez-Jácome, Alberto; Pérez-Pérez, Martín; Pérez-Rodríguez, Gael; Catalán-García, Sandra; Fdez-Riverola, Florentino; Lourenço, Anália; Sánchez, Borja

    2016-06-01

    Chemoprevention is the use of natural and/or synthetic substances to block, reverse, or retard the process of carcinogenesis. In this field, the use of antitumor peptides is of interest as, (i) these molecules are small in size, (ii) they show good cell diffusion and permeability, (iii) they affect one or more specific molecular pathways involved in carcinogenesis, and (iv) they are not usually genotoxic. We have checked the Web of Science Database (23/11/2015) in order to collect papers reporting on bioactive peptide (1691 registers), which was further filtered searching terms such as "antiproliferative," "antitumoral," or "apoptosis" among others. Works reporting the amino acid sequence of an antiproliferative peptide were kept (60 registers), and this was complemented with the peptides included in CancerPPD, an extensive resource for antiproliferative peptides and proteins. Peptides were grouped according to one of the following mechanism of action: inhibition of cell migration, inhibition of tumor angiogenesis, antioxidative mechanisms, inhibition of gene transcription/cell proliferation, induction of apoptosis, disorganization of tubulin structure, cytotoxicity, or unknown mechanisms. The main mechanisms of action of those antiproliferative peptides with known amino acid sequences are presented and finally, their potential clinical usefulness and future challenges on their application is discussed. PMID:27010507

  4. The amino acid sequences and activities of synergistic hemolysins from Staphylococcus cohnii.

    PubMed

    Mak, Pawel; Maszewska, Agnieszka; Rozalska, Malgorzata

    2008-10-01

    Staphylococcus cohnii ssp. cohnii and S. cohnii ssp. urealyticus are coagulase-negative staphylococci considered for a long time as unable to cause infections. This situation changed recently and pathogenic strains of these bacteria were isolated from hospital environments, patients and medical staff. Most of the isolated strains were resistant to many antibiotics. The present work describes isolation and characterization of several synergistic peptide hemolysins produced by these bacteria and acting as virulence factors responsible for hemolytic and cytotoxic activities. Amino acid sequences of respective hemolysins from S. cohnii ssp. cohnii (named as H1C, H2C and H3C) and S. cohnii ssp. urealyticus (H1U, H2U and H3U) were identical. Peptides H1 and H3 possessed significant amino acid homology to three synergistic hemolysins secreted by Staphylococcus lugdunensis and to putative antibacterial peptide produced by Staphylococcus saprophyticus ssp. saprophyticus. On the other hand, hemolysin H2 had a unique sequence. All isolated peptides lysed red cells from different mammalian species and exerted a cytotoxic effect on human fibroblasts. PMID:18752624

  5. Haplogroup Classification of Korean Cattle Breeds Based on Sequence Variations of mtDNA Control Region

    PubMed Central

    Kim, Jae-Hwan; Lee, Seong-Su; Kim, Seung Chang; Choi, Seong-Bok; Kim, Su-Hyun; Lee, Chang Woo; Jung, Kyoung-Sub; Kim, Eun Sung; Choi, Young-Sun; Kim, Sung-Bok; Kim, Woo Hyun; Cho, Chang-Yeon

    2016-01-01

    Many studies have reported the frequency and distribution of haplogroups among various cattle breeds for verification of their origins and genetic diversity. In this study, 318 complete sequences of the mtDNA control region from four Korean cattle breeds were used for haplogroup classification. 71 polymorphic sites and 66 haplotypes were found in these sequences. Consistent with the genetic patterns in previous reports, four haplogroups (T1, T2, T3, and T4) were identified in Korean cattle breeds. In addition, T1a, T3a, and T3b sub-haplogroups were classified. In the phylogenetic tree, each haplogroup formed an independent cluster. The frequencies of T3, T4, T1 (containing T1a), and T2 were 66%, 16%, 10%, and 8%, respectively. Especially, the T1 haplogroup contained only one haplotype and a sample. All four haplogroups were found in Chikso, Jeju black and Hanwoo. However, only the T3 and T4 haplogroups appeared in Heugu, and most Chikso populations showed a partial of four haplogroups. These results will be useful for stable conservation and efficient management of Korean cattle breeds. PMID:26954229
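
    The two summary counts quoted in this abstract (polymorphic sites and haplotypes) can be illustrated with a minimal sketch. It assumes the sequences are already aligned to equal length and treats every distinct aligned string as one haplotype; the function names and the toy alignment are hypothetical and are not taken from the study.

```python
def polymorphic_sites(aligned):
    """Return 0-based column indices where the aligned sequences differ."""
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("sequences must be aligned to equal length")
    # A column is polymorphic if more than one character state occurs in it.
    return [i for i in range(length) if len({s[i] for s in aligned}) > 1]


def haplotype_counts(aligned):
    """Map each distinct sequence (haplotype) to the number of samples carrying it."""
    counts = {}
    for s in aligned:
        counts[s] = counts.get(s, 0) + 1
    return counts


# Toy alignment (hypothetical, not data from the study).
toy = ["ACGTAC", "ACGTAC", "ACATAC", "ACATGC"]
sites = polymorphic_sites(toy)
haps = haplotype_counts(toy)
print(len(sites), "polymorphic sites;", len(haps), "haplotypes")
```

    Gap characters are treated here like any other state; a real analysis would need an explicit rule for gaps and ambiguous bases.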

  6. Mitochondrial sequence variation in ancient horses from the Carpathian Basin and possible modern relatives.

    PubMed

    Priskin, K; Szabó, K; Tömöry, G; Bogácsi-Szabó, E; Csányi, B; Eördögh, R; Downes, C S; Raskó, I

    2010-02-01

    Movements of human populations leave their traces in the genetic makeup of the areas affected; the same applies to the horses that move with their owners. This study is concerned with the mitochondrial control region genotypes of 31 archaeological horse remains, excavated from pre-conquest Avar and post-conquest Hungarian burial sites in the Carpathian Basin dating from the sixth to the tenth century. To investigate relationships to other ancient and recent breeds, modern Hucul and Akhal Teke samples were also collected, and mtDNA control region (CR) sequences from 76 breeds representing 921 individual specimens were combined with our sequence data. Phylogenetic relationships among horse mtDNA CR haplotypes were estimated using both genetic distance and the non-dichotomous network method. Both methods indicated a separation between horses of the Avars and the Hungarians. Our results show that the ethnic changes induced by the Hungarian Conquest were accompanied by a corresponding change in the stables of the Carpathian Basin. PMID:19789983

  7. Passing faces: sequence-dependent variations in the perceptual processing of emotional faces.

    PubMed

    Karl, Christian; Hewig, Johannes; Osinsky, Roman

    2016-10-01

    There is broad evidence that contextual factors influence the processing of emotional facial expressions. Yet temporal-dynamic aspects, inter alia how face processing is influenced by the specific order of neutral and emotional facial expressions, have been largely neglected. To shed light on this topic, we recorded electroencephalogram from 168 healthy participants while they performed a gender-discrimination task with angry and neutral faces. Our event-related potential (ERP) analyses revealed a strong emotional modulation of the N170 component, indicating that the basic visual encoding and emotional analysis of a facial stimulus happen, at least partially, in parallel. While the N170 and the late positive potential (LPP; 400-600 ms) were only modestly affected by the sequence of preceding faces, we observed a strong influence of face sequences on the early posterior negativity (EPN; 200-300 ms). Finally, the differing response patterns of the EPN and LPP indicate that these two ERPs represent distinct processes during face analysis: while the former seems to represent the integration of contextual information in the perception of a current face, the latter appears to represent the net emotional interpretation of a current face. PMID:26599470

  8. Haplogroup Classification of Korean Cattle Breeds Based on Sequence Variations of mtDNA Control Region.

    PubMed

    Kim, Jae-Hwan; Lee, Seong-Su; Kim, Seung Chang; Choi, Seong-Bok; Kim, Su-Hyun; Lee, Chang Woo; Jung, Kyoung-Sub; Kim, Eun Sung; Choi, Young-Sun; Kim, Sung-Bok; Kim, Woo Hyun; Cho, Chang-Yeon

    2016-05-01

    Many studies have reported the frequency and distribution of haplogroups among various cattle breeds for verification of their origins and genetic diversity. In this study, 318 complete sequences of the mtDNA control region from four Korean cattle breeds were used for haplogroup classification. 71 polymorphic sites and 66 haplotypes were found in these sequences. Consistent with the genetic patterns in previous reports, four haplogroups (T1, T2, T3, and T4) were identified in Korean cattle breeds. In addition, T1a, T3a, and T3b sub-haplogroups were classified. In the phylogenetic tree, each haplogroup formed an independent cluster. The frequencies of T3, T4, T1 (containing T1a), and T2 were 66%, 16%, 10%, and 8%, respectively. Especially, the T1 haplogroup contained only one haplotype and a sample. All four haplogroups were found in Chikso, Jeju black and Hanwoo. However, only the T3 and T4 haplogroups appeared in Heugu, and most Chikso populations showed a partial of four haplogroups. These results will be useful for stable conservation and efficient management of Korean cattle breeds. PMID:26954229

  9. Sequence Polymorphisms at the REDUCED DORMANCY5 Pseudophosphatase Underlie Natural Variation in Arabidopsis Dormancy.

    PubMed

    Xiang, Yong; Song, Baoxing; Née, Guillaume; Kramer, Katharina; Finkemeier, Iris; Soppe, Wim J J

    2016-08-01

    Seed dormancy controls the timing of germination, which regulates the adaptation of plants to their environment and influences agricultural production. The time of germination is under strong natural selection and shows variation within species due to local adaptation. The identification of genes underlying dormancy quantitative trait loci is a major scientific challenge, which is relevant for agricultural and ecological goals. In this study, we describe the identification of the DELAY OF GERMINATION18 (DOG18) quantitative trait locus, which was identified as a factor in natural variation for seed dormancy in Arabidopsis (Arabidopsis thaliana). DOG18 encodes a member of the clade A of the type 2C protein phosphatases family, which we previously identified as the REDUCED DORMANCY5 (RDO5) gene. DOG18/RDO5 shows a relatively high frequency of loss-of-function alleles in natural accessions restricted to northwestern Europe. The loss of dormancy in these loss-of-function alleles can be compensated for by genetic factors like DOG1 and DOG6, and by environmental factors such as low temperature. RDO5 does not have detectable phosphatase activity. Analysis of the phosphoproteome in dry and imbibed seeds revealed a general decrease in protein phosphorylation during seed imbibition that is enhanced in the rdo5 mutant. We conclude that RDO5 acts as a pseudophosphatase that inhibits dephosphorylation during seed imbibition. PMID:27288362

  10. Functional and genetic analysis of haplotypic sequence variation at the nicastrin genomic locus

    PubMed Central

    Hamilton, Gillian; Killick, Richard; Lambert, Jean-Charles; Amouyel, Philippe; Carrasquillo, Minerva M.; Pankratz, V. Shane; Graff-Radford, Neill R.; Dickson, Dennis W.; Petersen, Ronald C.; Younkin, Steven G.; Powell, John F.; Wade-Martins, Richard

    2013-01-01

    Nicastrin (NCSTN) is a component of the γ-secretase complex and therefore potentially a candidate risk gene for Alzheimer's disease. Here, we have developed a novel functional genomics methodology to express common locus haplotypes to assess functional differences. DNA recombination was used to engineer 5 bacterial artificial chromosomes (BACs) to each express a different haplotype of the NCSTN locus. Each NCSTN-BAC was delivered to knockout nicastrin (Ncstn−/−) cells and clonal NCSTN-BAC+/Ncstn−/− cell lines were created for functional analyses. We showed that all NCSTN-BAC haplotypes expressed nicastrin protein and rescued γ-secretase activity and amyloid beta (Aβ) production in NCSTN-BAC+/Ncstn−/− lines. We then showed that genetic variation at the NCSTN locus affected alternative splicing in human postmortem brain tissue. However, there was no robust functional difference between clonal cell lines rescued by each of the 5 different haplotypes. Finally, there was no statistically significant association of NCSTN with disease risk in the 4 cohorts. We therefore conclude that it is unlikely that common variation at the NCSTN locus is a risk factor for Alzheimer's disease. PMID:22405046

  11. Variations of a Y chromosome repeated sequence across subspecies of Mus musculus.

    PubMed

    Boursot, P; Bonhomme, F; Catalan, J; Moriwaki, K

    1989-12-01

    The complex species Mus musculus is widespread in Eurasia and consists of four parapatric genetical entities (subspecies) that have recently radiated. Two of them (M. m. domesticus and M. m. musculus) are known to interact through a narrow zone of hybridisation across which autosomal and mitochondrial exchanges are very limited and Y chromosome exchange is absent. We extend here the study of this group by the genetical analysis of 22 Asian strains of various origins (China, Korea, Japan, Taiwan, Philippines and Indonesia). A survey of protein variation at ten polymorphic loci confirmed that these animals belong to either the subspecies M. m. musculus (northern type in Asia, ranging westwards to Eastern Europe) or to M. m. castaneus (southern Asian type) and revealed a certain degree of intergradation between the two taxa. Y chromosome variations were assessed in these strains using a Y specific DNA probe representing part of a small multigene family and also in four M. m. domesticus (the Western European house mouse) strains of various origins and one M. m. bactrianus (from Pakistan). Musculus and castaneus were identically monomorphic for one type of organisation of this Y repeated family, while domesticus and bactrianus were very similar to each other, showing slightly different types of organisation. Introgression of a bactrianus Y chromosome into the territory of castaneus was found in Indonesia. The present distribution of the Y types among the four subspecies is not phylogenetically concordant with the known distributions of autosomal and mitochondrial variants.(ABSTRACT TRUNCATED AT 250 WORDS) PMID:2575606

  12. Sequence Polymorphisms at the REDUCED DORMANCY5 Pseudophosphatase Underlie Natural Variation in Arabidopsis Dormancy

    PubMed Central

    Xiang, Yong; Song, Baoxing; Née, Guillaume; Kramer, Katharina; Soppe, Wim J.J.

    2016-01-01

    Seed dormancy controls the timing of germination, which regulates the adaptation of plants to their environment and influences agricultural production. The time of germination is under strong natural selection and shows variation within species due to local adaptation. The identification of genes underlying dormancy quantitative trait loci is a major scientific challenge, which is relevant for agricultural and ecological goals. In this study, we describe the identification of the DELAY OF GERMINATION18 (DOG18) quantitative trait locus, which was identified as a factor in natural variation for seed dormancy in Arabidopsis (Arabidopsis thaliana). DOG18 encodes a member of the clade A of the type 2C protein phosphatases family, which we previously identified as the REDUCED DORMANCY5 (RDO5) gene. DOG18/RDO5 shows a relatively high frequency of loss-of-function alleles in natural accessions restricted to northwestern Europe. The loss of dormancy in these loss-of-function alleles can be compensated for by genetic factors like DOG1 and DOG6, and by environmental factors such as low temperature. RDO5 does not have detectable phosphatase activity. Analysis of the phosphoproteome in dry and imbibed seeds revealed a general decrease in protein phosphorylation during seed imbibition that is enhanced in the rdo5 mutant. We conclude that RDO5 acts as a pseudophosphatase that inhibits dephosphorylation during seed imbibition. PMID:27288362

  13. Nucleic acid detection kits

    DOEpatents

    Hall, Jeff G.; Lyamichev, Victor I.; Mast, Andrea L.; Brow, Mary Ann; Kwiatkowski, Robert W.; Vavra, Stephanie H.

    2005-03-29

    The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge. The present invention also provides methods for the detection of non-target cleavage products via the formation of a complete and activated protein binding region. The invention further provides sensitive and specific methods for the detection of nucleic acid from various viruses in a sample.

  14. Comparative Mitogenomics of the Genus Odontobutis (Perciformes: Gobioidei: Odontobutidae) Revealed Conserved Gene Rearrangement and High Sequence Variations

    PubMed Central

    Ma, Zhihong; Yang, Xuefen; Bercsenyi, Miklos; Wu, Junjie; Yu, Yongyao; Wei, Kaijian; Fan, Qixue; Yang, Ruibin

    2015-01-01

    To understand the molecular evolution of mitochondrial genomes (mitogenomes) in the genus Odontobutis, the mitogenome of Odontobutis yaluensis was sequenced and compared with those of another four Odontobutis species. Our results displayed similar mitogenome features among species in genome organization, base composition, codon usage, and gene rearrangement. The identical gene rearrangement of trnS-trnL-trnH tRNA cluster observed in mitogenomes of these five closely related freshwater sleepers suggests that this unique gene order is conserved within Odontobutis. Additionally, the present gene order and the positions of associated intergenic spacers of these Odontobutis mitogenomes indicate that this unusual gene rearrangement results from tandem duplication and random loss of large-scale gene regions. Moreover, these mitogenomes exhibit a high level of sequence variation, mainly due to the differences of corresponding intergenic sequences in gene rearrangement regions and the heterogeneity of tandem repeats in the control regions. Phylogenetic analyses support Odontobutis species with shared gene rearrangement forming a monophyletic group, and the interspecific phylogenetic relationships are associated with structural differences among their mitogenomes. The present study contributes to understanding the evolutionary patterns of Odontobutidae species. PMID:26492246

  15. The thermostable direct hemolysin-related hemolysin (trh) gene of Vibrio parahaemolyticus: Sequence variation and implications for detection and function.

    PubMed

    Nilsson, William B; Turner, Jeffrey W

    2016-07-01

    Vibrio parahaemolyticus is a leading cause of bacterial food-related illness associated with the consumption of undercooked seafood. Only a small subset of strains is pathogenic. Most clinical strains encode the thermostable direct hemolysin (TDH) and/or the TDH-related hemolysin (TRH). In this work, we amplify and sequence the trh gene from over 80 trh+ strains of this bacterium and identify thirteen genetically distinct alleles, most of which have not been deposited in GenBank previously. Sequence data was used to design new primers for more reliable detection of trh by endpoint PCR. We also designed a new quantitative PCR assay to target a more conserved gene that is genetically linked to trh. This gene, ureR, encodes the transcriptional regulator for the urease gene cluster immediately upstream of trh. We propose that this ureR assay can be a useful screening tool as a surrogate for direct detection of trh that circumvents challenges associated with trh sequence variation. PMID:27094247

  16. Clostridium sticklandii, a specialist in amino acid degradation: revisiting its metabolism through its genome sequence

    PubMed Central

    2010-01-01

    Background: Clostridium sticklandii belongs to a cluster of non-pathogenic proteolytic clostridia which utilize amino acids as carbon and energy sources. Isolated by T.C. Stadtman in 1954, it has been generally regarded as a "gold mine" for novel biochemical reactions and is used as a model organism for studying metabolic aspects such as the Stickland reaction, coenzyme-B12- and selenium-dependent reactions of amino acids. With the goal of revisiting its carbon, nitrogen, and energy metabolism, and comparing studies with other clostridia, its genome has been sequenced and analyzed. Results: C. sticklandii is one of the best biochemically studied proteolytic clostridial species. Useful additional information has been obtained from the sequencing and annotation of its genome, which is presented in this paper. Besides, experimental procedures reveal that C. sticklandii degrades amino acids in a preferential and sequential way. The organism prefers threonine, arginine, serine, cysteine, proline, and glycine, whereas glutamate, aspartate and alanine are excreted. Energy conservation is primarily obtained by substrate-level phosphorylation in fermentative pathways. The reactions catalyzed by different ferredoxin oxidoreductases and the exergonic NADH-dependent reduction of crotonyl-CoA point to a possible chemiosmotic energy conservation via the Rnf complex. C. sticklandii possesses both the F-type and V-type ATPases. The discovery of an as yet unrecognized selenoprotein in the D-proline reductase operon suggests a more detailed mechanism for NADH-dependent D-proline reduction. A rather unusual metabolic feature is the presence of genes for all the enzymes involved in two different CO2-fixation pathways: C. sticklandii harbours both the glycine synthase/glycine reductase and the Wood-Ljungdahl pathways. This unusual pathway combination has retrospectively been observed in only four other sequenced microorganisms. Conclusions: Analysis of the C. sticklandii genome and

  17. Complete amino acid sequence of the myoglobin from the Pacific spotted dolphin, Stenella attenuata graffmani.

    PubMed

    Jones, B N; Wang, C C; Dwulet, F E; Lehman, L D; Meuth, J L; Bogardt, R A; Gurd, F R

    1979-04-25

    The complete amino acid sequence of the major component myoglobin from the Pacific spotted dolphin, Stenella attenuata graffmani, was determined by the automated Edman degradation of several large peptides obtained by specific cleavage of the protein. The acetimidated apomyoglobin was selectively cleaved at its two methionyl residues with cyanogen bromide and at its three arginyl residues by trypsin. By subjecting four of these peptides and the apomyoglobin to automated Edman degradation, over 80% of the primary structure of the protein was obtained. The remainder of the covalent structure was determined by the sequence analysis of peptides that resulted from further digestion of the central cyanogen bromide fragment. This fragment was cleaved at its glutamyl residues with staphylococcal protease and its lysyl residues with trypsin. The action of trypsin was restricted to the lysyl residues by chemical modification of the single arginyl residue of the fragment with 1,2-cyclohexanedione. The primary structure of this myoglobin proved to be identical with that from the Atlantic bottlenosed dolphin and Pacific common dolphin but differs from the myoglobins of the killer whale and pilot whale at two positions. The above sequence identities and differences reflect the close taxonomic relationship of these five species of Cetacea. PMID:454657
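
    The fragmentation strategy described here (cyanogen bromide at methionyl residues, trypsin restricted to arginyl residues) can be sketched as a simple in silico digest. This is an illustration only: the peptide below is a made-up example rather than the dolphin myoglobin sequence, `cleave_after` is a hypothetical helper, and the sketch ignores side reactions and missed cleavages.

```python
def cleave_after(sequence, residues):
    """Split a one-letter protein sequence on the C-terminal side of the given residues."""
    fragments, start = [], 0
    for i, aa in enumerate(sequence):
        if aa in residues:
            fragments.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        # Keep the trailing fragment that ends at the C-terminus.
        fragments.append(sequence[start:])
    return fragments


toy = "GLSMDGERKAMLKR"  # hypothetical peptide, not the myoglobin sequence

cnbr = cleave_after(toy, "M")    # cyanogen bromide: cleaves after Met
arg_c = cleave_after(toy, "R")   # trypsin with lysines assumed blocked: after Arg only
print(cnbr)   # ['GLSM', 'DGERKAM', 'LKR']
print(arg_c)  # ['GLSMDGER', 'KAMLKR']
```

    Overlapping the two fragment sets is what allows the peptides to be ordered along the chain, since each cleavage set spans the junctions of the other.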

  18. Isolation and amino acid sequences of squirrel monkey (Saimiri sciurea) insulin and glucagon.

    PubMed Central

    Yu, J H; Eng, J; Yalow, R S

    1990-01-01

    It was reported two decades ago that insulin was not detectable in the glucose-stimulated state in Saimiri sciurea, the New World squirrel monkey, by a radioimmunoassay system developed with guinea pig anti-pork insulin antibody and labeled pork insulin. With the same system, reasonable levels were observed in rhesus monkeys and chimpanzees. This suggested that New World monkeys, like the New World hystricomorph rodents such as the guinea pig and the coypu, might have insulins whose sequences differ markedly from those of Old World mammals. In this report we describe the purification and amino acid sequences of squirrel monkey insulin and glucagon. We demonstrate that the substitutions at B29, B27, A2, A4, and A17 of squirrel monkey insulin are identical with those previously found in another New World primate, the owl monkey (Aotus trivirgatus). The immunologic cross-reactivity of this insulin in our immunoassay system is only a few percent of that of human insulin. Squirrel monkey glucagon is identical with the usual glucagon found in Old World mammals, which predicts that the glucagons of other New World monkeys would not differ from the usual Old World mammalian glucagon. It appears that the peptides of the New World monkeys have diverged less from those of the Old World mammals than have those of the New World hystricomorph rodents. The striking improvements in peptide purification and sequencing have the potential for adding new information concerning the evolutionary divergence of species. PMID:2263627

  19. Isolation and amino acid sequences of squirrel monkey (Saimiri sciurea) insulin and glucagon

    SciTech Connect

    Yu, Jinghua; Eng, J.; Yalow, R.S. (City Univ. of New York, NY)

    1990-12-01

    It was reported two decades ago that insulin was not detectable in the glucose-stimulated state in Saimiri sciurea, the New World squirrel monkey, by a radioimmunoassay system developed with guinea pig anti-pork insulin antibody and labeled pork insulin. With the same system, reasonable levels were observed in rhesus monkeys and chimpanzees. This suggested that New World monkeys, like the New World hystricomorph rodents such as the guinea pig and the coypu, might have insulins whose sequences differ markedly from those of Old World mammals. In this report the authors describe the purification and amino acid sequences of squirrel monkey insulin and glucagon. They demonstrate that the substitutions at B29, B27, A2, A4, and A17 of squirrel monkey insulin are identical with those previously found in another New World primate, the owl monkey (Aotus trivirgatus). The immunologic cross-reactivity of this insulin in their immunoassay system is only a few percent of that of human insulin. It appears that the peptides of the New World monkeys have diverged less from those of the Old World mammals than have those of the New World hystricomorph rodents. The striking improvements in peptide purification and sequencing have the potential for adding new information concerning the evolutionary divergence of species.

  20. Binding site discovery from nucleic acid sequences by discriminative learning of hidden Markov models

    PubMed Central

    Maaskola, Jonas; Rajewsky, Nikolaus

    2014-01-01

    We present a discriminative learning method for pattern discovery of binding sites in nucleic acid sequences based on hidden Markov models. Sets of positive and negative example sequences are mined for sequence motifs whose occurrence frequency varies between the sets. The method offers several objective functions, but we concentrate on mutual information of condition and motif occurrence. We perform a systematic comparison of our method and numerous published motif-finding tools. Our method achieves the highest motif discovery performance, while being faster than most published methods. We present case studies of data from various technologies, including ChIP-Seq, RIP-Chip and PAR-CLIP, of embryonic stem cell transcription factors and of RNA-binding proteins, demonstrating practicality and utility of the method. For the alternative splicing factor RBM10, our analysis finds motifs known to be splicing-relevant. The motif discovery method is implemented in the free software package Discrover. It is applicable to genome- and transcriptome-scale data, makes use of available repeat experiments and aside from binary contrasts also more complex data configurations can be utilized. PMID:25389269
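
    The objective named in this abstract, mutual information of condition and motif occurrence, can be illustrated with a small sketch. Note the simplification: the published method scores hidden Markov models, whereas this example swaps in exact substring matching of a fixed word. The function name and the toy sequences are hypothetical.

```python
import math


def mutual_information(pos, neg, motif):
    """Mutual information (bits) between set membership and motif presence."""
    # 2x2 contingency table: rows = set (positive, negative),
    # columns = motif present, motif absent.
    table = [
        [sum(motif in s for s in pos), sum(motif not in s for s in pos)],
        [sum(motif in s for s in neg), sum(motif not in s for s in neg)],
    ]
    total = sum(map(sum, table))
    row = [sum(r) for r in table]
    col = [table[0][j] + table[1][j] for j in range(2)]
    mi = 0.0
    for i in range(2):
        for j in range(2):
            n = table[i][j]
            if n:  # empty cells contribute nothing to the sum
                mi += (n / total) * math.log2(n * total / (row[i] * col[j]))
    return mi


# Toy data (hypothetical): the word ACGT occurs only in the positive set.
pos = ["AACGTT", "CCACGT"]
neg = ["TTTTTT", "GGGGGG"]
print(mutual_information(pos, neg, "ACGT"))  # 1.0
```

    A motif that separates the two sets perfectly scores 1 bit when the sets are the same size, and a motif that is equally frequent in both scores 0, which is why the quantity works as a discriminative objective.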

  1. Nucleotide and derived amino acid sequences of the major porin of Comamonas acidovorans and comparison of porin primary structures.

    PubMed Central

    Gerbl-Rieger, S; Peters, J; Kellermann, J; Lottspeich, F; Baumeister, W

    1991-01-01

    The DNA sequence of the gene which codes for the major outer membrane porin (Omp32) of Comamonas acidovorans has been determined. The structural gene encodes a precursor consisting of 351 amino acid residues with a signal peptide of 19 amino acid residues. Comparisons with amino acid sequences of outer membrane proteins and porins from several other members of the class Proteobacteria and of the Chlamydia trachomatis porin and the Neurospora crassa mitochondrial porin revealed a motif of eight regions of local homology. The results of this analysis are discussed with regard to common structural features of porins. PMID:1848840

  2. A worldwide survey of genome sequence variation provides insight into the evolutionary history of the honeybee Apis mellifera.

    PubMed

    Wallberg, Andreas; Han, Fan; Wellhagen, Gustaf; Dahle, Bjørn; Kawata, Masakado; Haddad, Nizar; Simões, Zilá Luz Paulino; Allsopp, Mike H; Kandemir, Irfan; De la Rúa, Pilar; Pirk, Christian W; Webster, Matthew T

    2014-10-01

    The honeybee Apis mellifera has major ecological and economic importance. We analyze patterns of genetic variation at 8.3 million SNPs, identified by sequencing 140 honeybee genomes from a worldwide sample of 14 populations at a combined total depth of 634×. These data provide insight into the evolutionary history and genetic basis of local adaptation in this species. We find evidence that population sizes have fluctuated greatly, mirroring historical fluctuations in climate, although contemporary populations have high genetic diversity, indicating the absence of domestication bottlenecks. Levels of genetic variation are strongly shaped by natural selection and are highly correlated with patterns of gene expression and DNA methylation. We identify genomic signatures of local adaptation, which are enriched in genes expressed in workers and in immune system- and sperm motility-related genes that might underlie geographic variation in reproduction, dispersal and disease resistance. This study provides a framework for future investigations into responses to pathogens and climate change in honeybees. PMID:25151355

  3. Whole genome sequencing of emerging multidrug resistant Candida auris isolates in India demonstrates low genetic variation.

    PubMed

    Sharma, C; Kumar, N; Pandey, R; Meis, J F; Chowdhary, A

    2016-09-01

    Candida auris is an emerging multidrug resistant yeast that causes nosocomial fungaemia and deep-seated infections. Notably, the emergence of this yeast is alarming as it exhibits resistance to azoles, amphotericin B and caspofungin, which may lead to clinical failure in patients. The multigene phylogeny and amplified fragment length polymorphism typing methods report the C. auris population as clonal. Here, using whole genome sequencing analysis, we decipher for the first time that C. auris strains from four Indian hospitals were highly related, suggesting clonal transmission. Further, all C. auris isolates originated from cases of fungaemia and were resistant to fluconazole (MIC >64 mg/L). PMID:27617098

  4. Assessment of megabase-scale somatic copy number variation using single-cell sequencing

    PubMed Central

    Knouse, Kristin A.; Wu, Jie; Amon, Angelika

    2016-01-01

    Megabase-scale copy number variants (CNVs) can have profound phenotypic consequences. Germline CNVs of this magnitude are associated with disease and experience negative selection. However, it is unknown whether organismal function requires that every cell maintain a balanced genome. It is possible that large somatic CNVs are tolerated or even positively selected. Single-cell sequencing is a useful tool for assessing somatic genomic heterogeneity, but its performance in CNV detection has not been rigorously tested. Here, we develop an approach that allows for reliable detection of megabase-scale CNVs in single somatic cells. We discover large CNVs in 8%–9% of cells across tissues and identify two recurrent CNVs. We conclude that large CNVs can be tolerated in subpopulations of cells, and particular CNVs are relatively prevalent within and across individuals. PMID:26772196

  5. Variations of the ISM Compactness Across the Main Sequence of Star Forming Galaxies: Observations and Simulations

    NASA Astrophysics Data System (ADS)

    Martínez-Galarza, J. R.; Smith, H. A.; Lanz, L.; Hayward, Christopher C.; Zezas, A.; Rosenthal, L.; Weiner, A.; Hung, C.; Ashby, M. L. N.; Groves, B.

    2016-01-01

    The majority of star-forming galaxies follow a simple empirical correlation in the star formation rate (SFR) versus stellar mass (M*) plane, of the form $\mathrm{SFR} \propto M_*^{\alpha}$, usually referred to as the star formation main sequence (MS). The physics that sets the properties of the MS is currently a subject of debate, and no consensus has been reached regarding the fundamental difference between members of the sequence and its outliers. Here we combine a set of hydro-dynamical simulations of interacting galactic disks with state-of-the-art radiative transfer codes to analyze how the evolution of mergers is reflected upon the properties of the MS. We present Chiburst, a Markov Chain Monte Carlo spectral energy distribution (SED) code that fits the multi-wavelength, broad-band photometry of galaxies and derives stellar masses, SFRs, and geometrical properties of the dust distribution. We apply this tool to the SEDs of simulated mergers and compare the derived results with the reference output from the simulations. Our results indicate that changes in the SEDs of mergers as they approach coalescence and depart from the MS are related to an evolution of dust geometry in scales larger than a few hundred parsecs. This is reflected in a correlation between the specific star formation rate, and the compactness parameter $\mathcal{C}$, that parametrizes this geometry and hence the evolution of dust temperature ($T_{\mathrm{dust}}$) with time. As mergers approach coalescence, they depart from the MS and increase their compactness, which implies that moderate outliers of the MS are consistent with late-type mergers. By further applying our method to real observations of luminous infrared galaxies (LIRGs), we show that the merger scenario is unable to explain these extreme outliers of the MS. Only by significantly increasing the gas fraction in the simulations are we able to reproduce the SEDs of LIRGs.
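
    A power-law main sequence of the kind quoted at the start of this abstract is a straight line in log-log space, so its slope can be estimated by ordinary least squares on the logarithms. The sketch below assumes that simple approach; the function name, the data points, and the slope of 0.8 are made up for illustration and are not values from the paper.

```python
import math


def fit_power_law(masses, sfrs):
    """Least-squares fit of log10(SFR) = alpha * log10(M) + log10(A); returns (alpha, A)."""
    x = [math.log10(m) for m in masses]
    y = [math.log10(s) for s in sfrs]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    alpha = sxy / sxx
    return alpha, 10 ** (my - alpha * mx)


# Synthetic, noise-free points generated with an assumed slope of 0.8.
masses = [1e9, 1e10, 1e11]                        # stellar masses, solar units
sfrs = [2.0 * (m / 1e10) ** 0.8 for m in masses]  # star formation rates
alpha, norm = fit_power_law(masses, sfrs)
print(round(alpha, 3))  # 0.8
```

    In this picture an outlier is a galaxy whose log SFR lies well above or below the fitted line at its stellar mass.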

  6. DNA sequence variations of metalloproteinases: their role in asthma and COPD

    PubMed Central

    Sampsonas, Fotis; Kaparianos, Alexander; Lykouras, Dimosthenis; Karkoulias, Kiriakos; Spiropoulos, Kostas

    2007-01-01

    Asthma and chronic obstructive pulmonary disease (COPD) are complex genetic diseases that cause considerable morbidity and mortality worldwide. Genetic variability interacting with environmental and ethnic factors is presumed to cause tobacco smoke susceptibility and to influence asthma severity. A disintegrin and metalloproteinase 33 (ADAM33) and matrix metalloproteinase‐9 (MMP9) appear to have important roles in asthma and COPD pathogenesis. ADAM33 and MMP9 genetic alterations could possibly contribute to the establishment and progression of these multifactorial diseases, although their association with the clinical phenotypes has not yet been elucidated. However, the occurrence of these alterations does not always result in clear disease, implying that either they are an epiphenomenon or they are in proximity to the true causative alteration. This review summarises the most recent literature dealing with the genetic variations of metalloproteinases and outlines their potential pathogenetic outcome. PMID:17403951

  7. Identification of Genes Responsible for Natural Variation in Volatile Content Using Next-Generation Sequencing Technology.

    PubMed

    Amaya, Iraida; Pillet, Jeremy; Folta, Kevin M

    2016-01-01

    Identification of the genes controlling the variation of key traits remains a challenge for plant researchers and represents a goal for the development of functional markers and their implementation in marker-assisted crop breeding. As an example we describe the identification of volatile organic compounds (VOCs) that segregate as single locus or major quantitative trait loci (QTL) in strawberry F1 segregating populations. Next, we describe a fast and efficient method for RNA extraction in strawberry that yields high-quality RNA for downstream RNA-seq analysis. Finally, two alternative methods for analysis of global transcript expression in contrasting lines will be described in order to identify the candidate gene and genes with differential expression using RNA-seq. PMID:26577779

  8. Amino acid sequence analysis and characterization of a ribonuclease from starfish Asterias amurensis.

    PubMed

    Motoyoshi, Naomi; Kobayashi, Hiroko; Itagaki, Tadashi; Inokuchi, Norio

    2016-09-01

    The aim of this study was to phylogenetically characterize the location of the RNase T2 enzyme in the starfish (Asterias amurensis). We isolated an RNase T2 ribonuclease (RNase Aa) from the ovaries of starfish and determined its amino acid sequence by protein chemistry and cloning cDNA encoding RNase Aa. The isolated protein had 231 amino acid residues, a predicted molecular mass of 25,906 Da, and an optimal pH of 5.0. RNase Aa preferentially released guanylic acid from the RNA. The catalytic sites of the RNase T2 family are conserved in RNase Aa; furthermore, the distribution of the cysteine residues in RNase Aa is similar to that in other animal and plant T2 RNases. RNase Aa is cleaved at two points: 21 residues from the N-terminus and 29 residues from the C-terminus; however, both fragments may remain attached to the protein via disulfide bridges, leading to the maintenance of its conformation, as suggested by circular dichroism spectrum analysis. The phylogenetic analysis revealed that starfish RNase Aa is evolutionarily an intermediate between protozoan and oyster RNases. PMID:26920046

  9. Phylogeny, Floral Evolution, and Inter-Island Dispersal in Hawaiian Clermontia (Campanulaceae) Based on ISSR Variation and Plastid Spacer Sequences

    PubMed Central

    Givnish, Thomas J.; Bean, Gregory J.; Ames, Mercedes; Lyon, Stephanie P.; Sytsma, Kenneth J.

    2013-01-01

    Previous studies based on DNA restriction-site and sequence variation have shown that the Hawaiian lobeliads are monophyletic and that the two largest genera, Cyanea and Clermontia, diverged from each other ca. 9.7 Mya. Sequence divergence among species of Clermontia is quite limited, however, and extensive hybridization is suspected, which has interfered with production of a well-resolved molecular phylogeny for the genus. Clermontia is of considerable interest because several species possess petal-like sepals, raising the question of whether such a homeotic mutation has arisen once or several times. In addition, morphological and molecular studies have implied different patterns of inter-island dispersal within the genus. Here we use nuclear ISSRs (inter-simple sequence repeat polymorphisms) and five plastid non-coding sequences to derive biparental and maternal phylogenies for Clermontia. Our findings imply that (1) Clermontia is not monophyletic, with Cl. pyrularia nested within Cyanea and apparently an intergeneric hybrid; (2) the earliest divergent clades within Clermontia are native to Kauaʻi, then Oʻahu, then Maui, supporting the progression rule of dispersal down the chain toward progressively younger islands, although that rule is violated in later-evolving taxa in the ISSR tree; (3) almost no sequence divergence among several Clermontia species in 4.5 kb of rapidly evolving plastid DNA; (4) several apparent cases of hybridization/introgression or incomplete lineage sorting (i.e., Cl. oblongifolia, peleana, persicifolia, pyrularia, samuelii, tuberculata), based on extensive conflict between the ISSR and plastid phylogenies; and (5) two origins and two losses of petaloid sepals, or, perhaps more plausibly, a single origin and two losses of this homeotic mutation, with its introgression into Cl. persicifolia. Our phylogenies are better resolved and geographically more informative than others based on ITS and 5S-NTS sequences and nuclear SNPs, but agree

  10. Phylogeny, floral evolution, and inter-island dispersal in Hawaiian Clermontia (Campanulaceae) based on ISSR variation and plastid spacer sequences.

    PubMed

    Givnish, Thomas J; Bean, Gregory J; Ames, Mercedes; Lyon, Stephanie P; Sytsma, Kenneth J

    2013-01-01

    Previous studies based on DNA restriction-site and sequence variation have shown that the Hawaiian lobeliads are monophyletic and that the two largest genera, Cyanea and Clermontia, diverged from each other ca. 9.7 Mya. Sequence divergence among species of Clermontia is quite limited, however, and extensive hybridization is suspected, which has interfered with production of a well-resolved molecular phylogeny for the genus. Clermontia is of considerable interest because several species possess petal-like sepals, raising the question of whether such a homeotic mutation has arisen once or several times. In addition, morphological and molecular studies have implied different patterns of inter-island dispersal within the genus. Here we use nuclear ISSRs (inter-simple sequence repeat polymorphisms) and five plastid non-coding sequences to derive biparental and maternal phylogenies for Clermontia. Our findings imply that (1) Clermontia is not monophyletic, with Cl. pyrularia nested within Cyanea and apparently an intergeneric hybrid; (2) the earliest divergent clades within Clermontia are native to Kauaʻi, then Oʻahu, then Maui, supporting the progression rule of dispersal down the chain toward progressively younger islands, although that rule is violated in later-evolving taxa in the ISSR tree; (3) almost no sequence divergence among several Clermontia species in 4.5 kb of rapidly evolving plastid DNA; (4) several apparent cases of hybridization/introgression or incomplete lineage sorting (i.e., Cl. oblongifolia, peleana, persicifolia, pyrularia, samuelii, tuberculata), based on extensive conflict between the ISSR and plastid phylogenies; and (5) two origins and two losses of petaloid sepals, or, perhaps more plausibly, a single origin and two losses of this homeotic mutation, with its introgression into Cl. persicifolia. Our phylogenies are better resolved and geographically more informative than others based on ITS and 5S-NTS sequences and nuclear SNPs, but agree with

  11. Next-generation sequencing analysis of off-ladder alleles due to migration shift caused by sequence variation at D12S391 locus.

    PubMed

    Fujii, Koji; Watahiki, Haruhiko; Mita, Yusuke; Iwashima, Yasuki; Miyaguchi, Hajime; Kitayama, Tetsushi; Nakahara, Hiroaki; Mizuno, Natsuko; Sekiguchi, Kazumasa

    2016-09-01

    In short tandem repeat (STR) analysis, length polymorphisms are detected by capillary electrophoresis (CE). At most STR loci, mobility shift due to sequence variation in the repeat region was thought not to affect the typing results. In our recent population studies of 1501 Japanese individuals, off-ladder calls were observed at the D12S391 locus using PowerPlex Fusion in nine samples for allele 22, one sample for allele 25, and one sample for allele 26. However, these samples were typed as ordinary alleles within the bins using GlobalFiler. In this study, next-generation sequencing analysis using MiSeq was performed for the D12S391 locus from the 11 off-ladder samples and 33 other samples, as well as the allelic ladders of PowerPlex Fusion and GlobalFiler. All off-ladder allele 22 in the nine samples had [AGAT]11[AGAC]11 as a repeat structure, while the corresponding allele was [AGAT]15[AGAC]6[AGAT] for the PowerPlex Fusion ladder, and [AGAT]13[AGAC]9 for the GlobalFiler ladder. Overall, as the number of [AGAT] in the repeat structure decreased at the D12S391 locus, the peak migrated more slowly using PowerPlex Fusion, the reverse strand of which was labeled, and it migrated more rapidly using GlobalFiler, the forward strand of which was labeled. The allelic ladders of both STR kits were reamplified with our small amplicon D12S391 primers and their mobility was also examined. In conclusion, off-ladder observations of allele 22 at the D12S391 locus using PowerPlex Fusion were mainly attributed to a relatively large difference of the repeat structure between its allelic ladder and off-ladder allele 22. PMID:27591542
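The bracket notation used in the abstract above can be checked mechanically. The following is a minimal sketch, not the authors' pipeline: the helper names (`parse_repeat_structure`, `total_repeats`, `motif_count`, `expand`) are hypothetical, and a motif written without a trailing number is assumed to count as one repeat. It shows that all three reported structures add up to 22 repeats while differing in how many of them are [AGAT].

```python
import re

def parse_repeat_structure(structure):
    """Parse bracket notation such as '[AGAT]11[AGAC]11' into (motif, count) pairs.

    Assumption: a motif with no trailing number, e.g. the final '[AGAT]',
    stands for a single repeat.
    """
    pairs = re.findall(r"\[([ACGT]+)\](\d*)", structure)
    return [(motif, int(count) if count else 1) for motif, count in pairs]

def total_repeats(structure):
    """Total number of repeat units, i.e. the nominal allele number."""
    return sum(count for _, count in parse_repeat_structure(structure))

def motif_count(structure, motif):
    """Number of repeat units of one motif, summed over all blocks."""
    return sum(c for m, c in parse_repeat_structure(structure) if m == motif)

def expand(structure):
    """Expand the notation into the plain nucleotide string."""
    return "".join(m * c for m, c in parse_repeat_structure(structure))

# Repeat structures quoted in the abstract for allele 22 at D12S391.
structures = {
    "off-ladder allele 22": "[AGAT]11[AGAC]11",
    "PowerPlex Fusion ladder": "[AGAT]15[AGAC]6[AGAT]",
    "GlobalFiler ladder": "[AGAT]13[AGAC]9",
}

for label, s in structures.items():
    print(label, total_repeats(s), motif_count(s, "AGAT"), len(expand(s)))
```

Each structure yields 22 repeats over 88 bp, with 11, 16, and 13 [AGAT] units respectively, so the off-ladder allele differs most from the PowerPlex Fusion ladder in composition even though the lengths are identical.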

  12. Analysis of genetic variation and diversity of Rice stripe virus populations through high-throughput sequencing.

    PubMed

    Huang, Lingzhe; Li, Zefeng; Wu, Jianxiang; Xu, Yi; Yang, Xiuling; Fan, Longjiang; Fang, Rongxiang; Zhou, Xueping

    2015-01-01

    Plant RNA viruses often generate diverse populations in their host plants through error-prone replication and recombination. Recent studies on the genetic diversity of plant RNA viruses in various host plants have provided valuable information about RNA virus evolution and emergence of new diseases caused by RNA viruses. We analyzed the genetic diversity of Rice stripe virus (RSV) populations in Oryza sativa (a natural host of RSV) and compared it with that of the RSV populations generated in an infection of Nicotiana benthamiana, an experimental host of RSV, using high-throughput sequencing technology. From infected O. sativa and N. benthamiana plants, a total of 341 and 1675 site substitutions were identified in the RSV genome, respectively, and the average substitution ratio in these sites was 1.47% and 7.05%, respectively, indicating that the RSV populations from the infected N. benthamiana plant are more diverse than those from the infected O. sativa plant. Our results provide direct evidence that the virus might allow higher genetic diversity for host adaptation. PMID:25852724

  13. An integrative approach to predicting the functional effects of non-coding and coding sequence variation

    PubMed Central

    Shihab, Hashem A.; Rogers, Mark F.; Gough, Julian; Mort, Matthew; Cooper, David N.; Day, Ian N. M.; Gaunt, Tom R.; Campbell, Colin

    2015-01-01

    Motivation: Technological advances have enabled the identification of an increasingly large spectrum of single nucleotide variants within the human genome, many of which may be associated with monogenic disease or complex traits. Here, we propose an integrative approach, named FATHMM-MKL, to predict the functional consequences of both coding and non-coding sequence variants. Our method utilizes various genomic annotations, which have recently become available, and learns to weight the significance of each component annotation source. Results: We show that our method outperforms current state-of-the-art algorithms, CADD and GWAVA, when predicting the functional consequences of non-coding variants. In addition, FATHMM-MKL is comparable to the best of these algorithms when predicting the impact of coding variants. The method includes a confidence measure to rank order predictions. Availability and implementation: The FATHMM-MKL webserver is available at: http://fathmm.biocompute.org.uk Contact: H.Shihab@bristol.ac.uk or Mark.Rogers@bristol.ac.uk or C.Campbell@bristol.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25583119

  14. Genetic relationship of Chinese and Japanese gamecocks revealed by mtDNA sequence variation.

    PubMed

    Liu, Yi-Ping; Zhu, Qing; Yao, Yong-Gang

    2006-02-01

    Cockfighting has a very long history dating back to as early as 2500 years ago in China. Cockfighting was intertwined with human cultural traditions, helped disperse chickens across the world, and influenced the subsequent breed selection. Therefore, tracing the origin of gamecocks could mirror the distribution of the cockfighting culture. In this study, we compared the available mtDNA control region sequences in Chinese and Japanese gamecocks to test the recently proposed hypothesis behind the dual origin of the Japanese cockfighting culture (from China and Southeast Asia independently). We assigned gamecock mtDNAs to different matrilineal components (or phylogenetic clades) that emerged from the phylogenetic tree and network profile, and compared the frequency differences between Chinese and Japanese gamecocks. Among the six clades (A-F) identified, Japanese gamecocks were most frequently found in clades C and D (74%, 32/43), whereas more than half of the Chinese gamecock samples (69%, 35/51) were grouped in clades A and B. Haplotypes in Japanese gamecocks assigned to clades A, B, and E were either shared with those of the Chinese samples or differed from the close Chinese types by no more than a three-mutation distance. This genetic pattern is in accordance with the proposed dual origin of Japanese gamecocks but has left room for single origin of Japanese gamecocks from China. The genetic structure of gamecocks in China and Japan might also be influenced by subsequent breed selection and conservation after the initial gamecock introduction. PMID:16648993

  15. Sequence variation and linkage disequilibrium in the human T-cell receptor beta (TCRB) locus.

    PubMed

    Subrahmanyan, L; Eberle, M A; Clark, A G; Kruglyak, L; Nickerson, D A

    2001-08-01

    The T-cell receptor (TCR) plays a central role in the immune system, and >90% of human T cells present a receptor that consists of the alpha TCR subunit (TCRA) and the beta subunit (TCRB). Here we report an analysis of 63 variable genes (BV), spanning 553 kb of TCRB that yielded 279 single-nucleotide polymorphisms (SNPs). Samples were drawn from 10 individuals and represent four populations: African American, Chinese, Mexican, and Northern European. We found nine variants that produce nonfunctional BV segments, removing those genes from the TCRB genomic repertoire. There was significant heterogeneity among population samples in SNP frequency (including the BV-inactivating sites), indicating the need for multiple-population samples for adequate variant discovery. In addition, we observed considerable linkage disequilibrium (LD) (r^2 > 0.1) over distances of approximately 30 kb in TCRB, and, in general, the distribution of r^2 as a function of physical distance was in close agreement with neutral coalescent simulations. LD in TCRB showed considerable spatial variation across the locus, being concentrated in "blocks" of LD; however, coalescent simulations of the locus illustrated that the heterogeneity of LD we observed in TCRB did not differ markedly from that expected from neutral processes. Finally, examination of the extended genotypes for each subject demonstrated homozygous stretches of >100 kb in the locus of several individuals. These results provide the basis for optimization of locuswide SNP typing in TCRB for studies of genotype-phenotype association. PMID:11438886

  16. Sequence Variation and Linkage Disequilibrium in the Human T-Cell Receptor β (TCRB) Locus

    PubMed Central

    Subrahmanyan, Lakshman; Eberle, Michael A.; Clark, Andrew G.; Kruglyak, Leonid; Nickerson, Deborah A.

    2001-01-01

    The T-cell receptor (TCR) plays a central role in the immune system, and >90% of human T cells present a receptor that consists of the α TCR subunit (TCRA) and the β subunit (TCRB). Here we report an analysis of 63 variable genes (BV), spanning 553 kb of TCRB that yielded 279 single-nucleotide polymorphisms (SNPs). Samples were drawn from 10 individuals and represent four populations: African American, Chinese, Mexican, and Northern European. We found nine variants that produce nonfunctional BV segments, removing those genes from the TCRB genomic repertoire. There was significant heterogeneity among population samples in SNP frequency (including the BV-inactivating sites), indicating the need for multiple-population samples for adequate variant discovery. In addition, we observed considerable linkage disequilibrium (LD) (r^2 > 0.1) over distances of ∼30 kb in TCRB, and, in general, the distribution of r^2 as a function of physical distance was in close agreement with neutral coalescent simulations. LD in TCRB showed considerable spatial variation across the locus, being concentrated in “blocks” of LD; however, coalescent simulations of the locus illustrated that the heterogeneity of LD we observed in TCRB did not differ markedly from that expected from neutral processes. Finally, examination of the extended genotypes for each subject demonstrated homozygous stretches of >100 kb in the locus of several individuals. These results provide the basis for optimization of locuswide SNP typing in TCRB for studies of genotype-phenotype association. PMID:11438886

  17. Rapid differentiation of Australian, European and American ranaviruses based on variation in major capsid protein gene sequence.

    PubMed

    Marsh, I B; Whittington, R J; O'Rourke, B; Hyatt, A D; Chisholm, O

    2002-04-01

    Epizootic haematopoietic necrosis virus (EHNV), Bohle iridovirus (BIV) and Wamena virus (WV) cause serious diseases in fish, amphibians and snakes, respectively but are restricted to Australasia. European catfish virus (ECV) and sheatfish virus (ESV) have caused epizootics in fish on farms in continental Europe. Currently there are no simple or readily available methods to distinguish these viruses, which are in the Iridoviridae. They are culturally, morphologically and antigenically very similar to Frog Virus 3 (FV3), the type species in Ranavirus in this family and Gutapo virus (GV), another amphibian ranavirus from America. The diseases caused by EHNV, ESV and ECV are so serious that they are internationally notifiable. Tests to distinguish these viruses are desirable to ensure that disease occurrences do not unnecessarily restrict trade in aquaculture products. The gene encoding the major capsid protein from two EHNV isolates from different fish species (Perca fluviatilis and Oncorhynchus mykiss) and one BIV isolate were sequenced and the data and deduced amino acid sequences were compared with those from FV3 and other iridoviruses. The sequences for the two EHNV isolates were identical, confirming suggestions from existing partial MCP sequence that the same type of EHNV infects wild redfin perch and farmed rainbow trout. Differences in restriction endonuclease patterns of specific PCR products were predicted and confirmed between EHNV, BIV, and WV and provided a basis for rapid differentiation of these viruses from each other and from ESV/ECV and FV3/GV. These simple and rapid tests to distinguish important ranaviruses from the regions of Europe, Australia and America will help regulatory authorities assess the need for disease control responses in the event of occurrence of ranavirus infection in aquaculture species. PMID:12030764

  18. Full Genome Virus Detection in Fecal Samples Using Sensitive Nucleic Acid Preparation, Deep Sequencing, and a Novel Iterative Sequence Classification Algorithm

    PubMed Central

    Cotten, Matthew; Oude Munnink, Bas; Canuti, Marta; Deijs, Martin; Watson, Simon J.; Kellam, Paul; van der Hoek, Lia

    2014-01-01

    We have developed a full genome virus detection process that combines sensitive nucleic acid preparation optimised for virus identification in fecal material with Illumina MiSeq sequencing and a novel post-sequencing virus identification algorithm. Enriched viral nucleic acid was converted to double-stranded DNA and subjected to Illumina MiSeq sequencing. The resulting short reads were processed with a novel iterative Python algorithm SLIM for the identification of sequences with homology to known viruses. De novo assembly was then used to generate full viral genomes. The sensitivity of this process was demonstrated with a set of fecal samples from HIV-1 infected patients. A quantitative assessment of the mammalian, plant, and bacterial virus content of this compartment was generated and the deep sequencing data were sufficient to assemble 12 complete viral genomes from 6 virus families. The method detected high levels of enteropathic viruses that are normally controlled in healthy adults, but may be involved in the pathogenesis of HIV-1 infection and will provide a powerful tool for virus detection and for analyzing changes in the fecal virome associated with HIV-1 progression and pathogenesis. PMID:24695106

  19. [Mitochondrial DNA sequence variation, demographic history, and population structure of Amur sturgeon Acipenser schrenckii Brandt, 1869].

    PubMed

    Shedko, S V; Miroshnichenko, I L; Nemkova, G A; Koshelev, V N; Shedko, M B

    2015-02-01

    The variability of the mtDNA control region (D-loop) was examined in Amur sturgeon endemic to the Amur River. This species is also classified as critically endangered by the IUCN Red List of Threatened Species. Sequencing of 796- to 812-bp fragments of the D-loop in 112 sturgeon collected in the Lower Amur revealed 73 different genotypes. The sample was characterized by a high level of haplotypic (0.976) and nucleotide (0.0194) diversity. The identified haplotypes split into two well-defined monophyletic groups, BG (n = 39) and SM (n = 34), differing (HKY distance) on average by 3.41% of nucleotide positions upon an average level of intragroup differences of 0.54 and 1.23%, respectively. Moreover, the haplotypes of the SM group differed by the presence of a 13-14 bp deletion. Most of the samples (66 out of 112) carried BG haplotypes. Overall, the pattern of pairwise nucleotide differences and the results of neutrality tests, as well as the results of tests for compliance with the model of sudden demographic expansion or with the model of exponential growth pointed to a past significant increase in the number of Amur sturgeon, which was most clearly manifested in the analysis of data on the BG haplogroup. The constructed Bayesian skyline plots showed that this growth began about 18 to 16 thousand years ago. At present, the effective size of the strongly reduced (due to overharvesting) population of Amur sturgeon may be equal to or even lower than it was before the beginning of this growth during the Last Glacial Maximum. The presence in the mitochondrial gene pool of Amur sturgeon of two haplogroups, their unequal evolutionary dynamics, and, judging by scanty data, their unequal representation in the Russian and Chinese parts of the Amur River basin point to the possible existence of at least two distinct populations of Amur sturgeon in the past. PMID:25966586
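For readers unfamiliar with the haplotypic diversity figure quoted in the abstract above, the following is a minimal sketch of one commonly used estimator, Nei's haplotype diversity, h = n/(n-1) * (1 - sum of squared haplotype frequencies). The abstract does not say which estimator the authors used, so this is an assumption, and the function name and toy sample are hypothetical illustrations rather than the study's data.

```python
from collections import Counter

def haplotype_diversity(haplotypes):
    """Nei's unbiased haplotype diversity for a list of haplotype labels.

    h = n / (n - 1) * (1 - sum(p_i ** 2)), where p_i is the sample
    frequency of haplotype i and n is the number of sampled individuals.
    """
    n = len(haplotypes)
    if n < 2:
        raise ValueError("at least two sampled individuals are required")
    counts = Counter(haplotypes)
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - sum_sq)

# Hypothetical toy sample: four individuals carrying three haplotypes.
toy_sample = ["H1", "H1", "H2", "H3"]
print(round(haplotype_diversity(toy_sample), 4))
```

The estimator ranges from 0, when every individual carries the same haplotype, to 1, when every individual carries a different one, so a value near 1 indicates that most sampled individuals carry distinct haplotypes.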

  20. Structure of the Bacterial Cytoskeleton Protein Bactofilin by NMR Chemical Shifts and Sequence Variation.

    PubMed

    Kassem, Maher M; Wang, Yong; Boomsma, Wouter; Lindorff-Larsen, Kresten

    2016-06-01

    Bactofilins constitute a recently discovered class of bacterial proteins that form cytoskeletal filaments. They share a highly conserved domain (DUF583) of which the structure remains unknown, in part due to the large size and noncrystalline nature of the filaments. Here, we describe the atomic structure of a bactofilin domain from Caulobacter crescentus. To determine the structure, we developed an approach that combines a biophysical model for proteins with recently obtained solid-state NMR spectroscopy data and amino acid contacts predicted from a detailed analysis of the evolutionary history of bactofilins. Our structure reveals a triangular β-helical (solenoid) conformation with conserved residues forming the tightly packed core and polar residues lining the surface. The repetitive structure explains the presence of internal repeats as well as strongly conserved positions, and is reminiscent of other fibrillar proteins. Our work provides a structural basis for future studies of bactofilin biology and for designing molecules that target them, as well as a starting point for determining the organization of the entire bactofilin filament. Finally, our approach presents new avenues for determining structures that are difficult to obtain by traditional means. PMID:27276252

  1. Genetic variation and population structure of hair crab (Erimacrus isenbeckii) in Japan inferred from mitochondrial DNA sequence analysis.

    PubMed

    Azuma, Noriko; Kunihiro, Yasushi; Sasaki, Jun; Mihara, Eiji; Mihara, Yukio; Yasunaga, Tomoaki; Jin, Deuk-Hee; Abe, Syuiti

    2008-01-01

    Genetic variation and population structure of hair crab (Erimacrus isenbeckii) were examined using nucleotide sequence analysis of 580 base pairs (bp) in the 3' portion of the mitochondrial cytochrome c oxidase subunit I gene (COI) of 20 samples collected from 16 locales in Japan (the Hokkaido and Honshu Islands) and one in Korea. A total of 27 haplotypes was defined by 23 variable nucleotide sites in the examined COI region. Pairwise population F_ST estimates and neighbor-joining tree inferred distinct genetic differentiation between the representative samples from the Pacific Ocean off the Eastern Hokkaido Island and the Sea of Japan, while others were intermediate between these two groups. AMOVA also showed a weak but significant differentiation among these three groups. The present results suggest a moderate population structure of hair crab, probably influenced by high gene flow between regional populations due to sea current dependent larval dispersal of this species. PMID:17955293

  2. Computational methods for detecting copy number variations in cancer genome using next generation sequencing: principles and challenges

    PubMed Central

    Liu, Biao; Morrison, Carl D.; Johnson, Candace S.; Trump, Donald L.; Qin, Maochun; Conroy, Jeffrey C.; Wang, Jianmin; Liu, Song

    2013-01-01