acid sequence variations: Topics by Science.gov

Sample records for acid sequence variations

Cleavage of nucleic acids

DOEpatents

Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.

2007-12-11

The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.
Invasive cleavage of nucleic acids

DOEpatents

Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.

1999-01-01

The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.
Invasive cleavage of nucleic acids

DOEpatents

Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.

2002-01-01

The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.
Cleavage of nucleic acids

DOEpatents

Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.

2010-11-09

The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.
Cleavage of nucleic acids

DOEpatents

Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann D.; Dahlberg, James E.

2000-01-01

The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.
Nucleic acid detection assays

DOEpatents

Prudent, James R.; Hall, Jeff G.; Lyamichev, Victor I.; Brow, Mary Ann; Dahlberg, James E.

2005-04-05

The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof.
Detection of nucleic acid sequences by invader-directed cleavage

DOEpatents

Brow, Mary Ann D.; Hall, Jeff Steven Grotelueschen; Lyamichev, Victor; Olive, David Michael; Prudent, James Robert

1999-01-01

The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The 5' nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge.
Artificial mismatch hybridization

DOEpatents

Guo, Zhen; Smith, Lloyd M.

1998-01-01

An improved nucleic acid hybridization process is provided which employs a modified oligonucleotide and improves the ability to discriminate a control nucleic acid target from a variant nucleic acid target containing a sequence variation. The modified probe contains at least one artificial mismatch relative to the control nucleic acid target in addition to any mismatch(es) arising from the sequence variation. The invention has direct and advantageous application to numerous existing hybridization methods, including applications that employ, for example, the Polymerase Chain Reaction, allele-specific nucleic acid sequencing methods, and diagnostic hybridization methods.
Thermal and acid tolerant beta-xylosidases, genes encoding, related organisms, and methods

DOEpatents

Thompson, David N [Idaho Falls, ID]; Thompson, Vicki S [Idaho Falls, ID]; Schaller, Kastli D [Ammon, ID]; Apel, William A [Jackson, WY]; Lacey, Jeffrey A [Idaho Falls, ID]; Reed, David W [Idaho Falls, ID]

2011-04-12

Isolated and/or purified polypeptides and nucleic acid sequences encoding polypeptides from Alicyclobacillus acidocaldarius and variations thereof are provided. Further provided are methods of at least partially degrading xylotriose and/or xylobiose using isolated and/or purified polypeptides and nucleic acid sequences encoding polypeptides from Alicyclobacillus acidocaldarius and variations thereof.
Thermal and acid tolerant beta xylosidases, arabinofuranosidases, genes encoding, related organisms, and methods

DOEpatents

Thompson, David N; Thompson, Vicki S; Schaller, Kastli D; Apel, William A; Reed, David W; Lacey, Jeffrey A

2013-04-30

Isolated and/or purified polypeptides and nucleic acid sequences encoding polypeptides from Alicyclobacillus acidocaldarius and variations thereof are provided. Further provided are methods of at least partially degrading xylotriose, xylobiose, and/or arabinofuranose-substituted xylan using isolated and/or purified polypeptides and nucleic acid sequences encoding polypeptides from Alicyclobacillus acidocaldarius and variations thereof.
Detection of nucleic acids by multiple sequential invasive cleavages

DOEpatents

Hall, Jeff G.; Lyamichev, Victor I.; Mast, Andrea L.; Brow, Mary Ann D.

1999-01-01

The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge. The present invention also provides methods for the detection of non-target cleavage products via the formation of a complete and activated protein binding region. The invention further provides sensitive and specific methods for the detection of human cytomegalovirus nucleic acid in a sample.
Nucleic acid detection kits

DOEpatents

Hall, Jeff G.; Lyamichev, Victor I.; Mast, Andrea L.; Brow, Mary Ann; Kwiatkowski, Robert W.; Vavra, Stephanie H.

2005-03-29

The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge. The present invention also provides methods for the detection of non-target cleavage products via the formation of a complete and activated protein binding region. The invention further provides sensitive and specific methods for the detection of nucleic acid from various viruses in a sample.
Detection of nucleic acids by multiple sequential invasive cleavages 02

DOEpatents

Hall, Jeff G.; Lyamichev, Victor I.; Mast, Andrea L.; Brow, Mary Ann D.

2002-01-01

The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge. The present invention also provides methods for the detection of non-target cleavage products via the formation of a complete and activated protein binding region. The invention further provides sensitive and specific methods for the detection of human cytomegalovirus nucleic acid in a sample.
Detection of nucleic acids by multiple sequential invasive cleavages

DOEpatents

Hall, Jeff G; Lyamichev, Victor I; Mast, Andrea L; Brow, Mary Ann D

2012-10-16

The present invention relates to means for the detection and characterization of nucleic acid sequences, as well as variations in nucleic acid sequences. The present invention also relates to methods for forming a nucleic acid cleavage structure on a target sequence and cleaving the nucleic acid cleavage structure in a site-specific manner. The structure-specific nuclease activity of a variety of enzymes is used to cleave the target-dependent cleavage structure, thereby indicating the presence of specific nucleic acid sequences or specific variations thereof. The present invention further relates to methods and devices for the separation of nucleic acid molecules based on charge. The present invention also provides methods for the detection of non-target cleavage products via the formation of a complete and activated protein binding region. The invention further provides sensitive and specific methods for the detection of human cytomegalovirus nucleic acid in a sample.
Sequence diversity of hepatitis C virus 6a within the extended interferon sensitivity-determining region correlates with interferon-alpha/ribavirin treatment outcomes.

PubMed

Zhou, Daniel X M; Chan, Paul K S; Zhang, Tiejun; Tully, Damien C; Tam, John S

2010-10-01

Studies on the association between sequence variability of the interferon sensitivity-determining region (ISDR) of hepatitis C virus and the outcome of treatment have reached conflicting results. In this study, 25 patients infected with HCV 6a who had received interferon-alpha/ribavirin combination treatment were analyzed for sequence variations. Fourteen of them had full genome sequences obtained from a previous study, whereas the other 11 samples were sequenced for the extended ISDR (eISDR). This eISDR fragment covers 192 bp (64 amino acids) upstream and 201 bp (67 amino acids) downstream from the ISDR previously defined for HCV 1b. The comparison between interferon-alpha resistance and response groups for the amino acid mutations located in the full genome (6 and 8 patients, respectively) as well as the mutations located in the eISDR (10 and 15 patients, respectively) showed that the mutations I2160V, I2256V, and V2292I (P<0.05) within the eISDR were significantly associated with resistance to treatment. However, the extent of amino acid variation within the previously defined ISDR was not associated with resistance to treatment, in contrast to previous reports. Four amino acid variations located outside the eISDR, I248V (P=0.03-0.06) within E1, R445K (P=0.02-0.05) and S747T (P=0.03) within E2, and I861V (P=0.01) within NS2, may also be associated with treatment outcome, as identified by a prescreening of variations within 14 HCV 6a full genomes. (c) 2010 Elsevier B.V. All rights reserved.
Overdispersion of the Molecular Clock: Temporal Variation of Gene-Specific Substitution Rates in Drosophila

PubMed Central

Hartl, Daniel L.

2008-01-01

Simple models of molecular evolution assume that sequences evolve by a Poisson process in which nucleotide or amino acid substitutions occur as rare independent events. In these models, the expected ratio of the variance to the mean of substitution counts equals 1, and substitution processes with a ratio greater than 1 are called overdispersed. Comparing the genomes of 10 closely related species of Drosophila, we extend earlier evidence for overdispersion in amino acid replacements as well as in four-fold synonymous substitutions. The observed deviation from the Poisson expectation can be described as a linear function of the rate at which substitutions occur on a phylogeny, which implies that deviations from the Poisson expectation arise from gene-specific temporal variation in substitution rates. Amino acid sequences show greater temporal variation in substitution rates than do four-fold synonymous sequences. Our findings provide a general phenomenological framework for understanding overdispersion in the molecular clock. Also, the presence of substantial variation in gene-specific substitution rates has broad implications for work in phylogeny reconstruction and evolutionary rate estimation. PMID:18480070
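The variance-to-mean ratio described in the abstract above can be illustrated with a minimal sketch. The function name and the substitution counts below are hypothetical and are not taken from the study; the sketch only shows how a ratio greater than 1 flags overdispersion relative to the Poisson expectation.

```python
from statistics import mean, variance


def dispersion_index(counts):
    """Return the variance-to-mean ratio of a list of substitution counts.

    Under a Poisson process the expected ratio is 1; values above 1
    indicate overdispersion.
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("mean substitution count is zero; ratio undefined")
    # statistics.variance is the sample variance (n - 1 denominator).
    return variance(counts) / m


# Hypothetical substitution counts for one gene on six lineages.
steady_gene = [2, 5, 8, 3, 7, 5]    # ratio close to 1
bursty_gene = [0, 1, 12, 0, 15, 2]  # ratio well above 1

for name, counts in [("steady", steady_gene), ("bursty", bursty_gene)]:
    print(name, round(dispersion_index(counts), 2))
```

Both example genes have the same mean count, so the difference in the ratio comes entirely from how unevenly the substitutions are spread across lineages.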
Diagnostics based on nucleic acid sequence variant profiling: PCR, hybridization, and NGS approaches.

PubMed

Khodakov, Dmitriy; Wang, Chunyan; Zhang, David Yu

2016-10-01

Nucleic acid sequence variations have been implicated in many diseases, and reliable detection and quantitation of DNA/RNA biomarkers can inform effective therapeutic action, enabling precision medicine. Nucleic acid analysis technologies being translated into the clinic can broadly be classified into hybridization, PCR, and sequencing, as well as their combinations. Here we review the molecular mechanisms of popular commercial assays, and their progress in translation into in vitro diagnostics. Copyright © 2016 The Authors. Published by Elsevier B.V. All rights reserved.
Major Breeding Plumage Color Differences of Male Ruffs (Philomachus pugnax) Are Not Associated With Coding Sequence Variation in the MC1R Gene

PubMed Central

Küpper, Clemens; Burke, Terry; Lank, David B.

2015-01-01

Sequence variation in the melanocortin-1 receptor (MC1R) gene explains color morph variation in several species of birds and mammals. Ruffs (Philomachus pugnax) exhibit major dark/light color differences in melanin-based male breeding plumage which is closely associated with alternative reproductive behavior. A previous study identified a microsatellite marker (Ppu020) near the MC1R locus associated with the presence/absence of ornamental plumage. We investigated whether coding sequence variation in the MC1R gene explains major dark/light plumage color variation and/or the presence/absence of ornamental plumage in ruffs. Among 821 bp of the MC1R coding region from 44 male ruffs we found 3 single nucleotide polymorphisms, representing 1 nonsynonymous and 2 synonymous substitutions. None were associated with major dark/light color differences or the presence/absence of ornamental plumage. At all amino acid sites known to be functionally important in other avian species with dark/light plumage color variation, ruffs were either monomorphic or the shared polymorphism did not coincide with color morph. Neither ornamental plumage color differences nor the presence/absence of ornamental plumage in ruffs is likely to be caused entirely by amino acid variation within the coding regions of the MC1R locus. Regulatory elements and structural variation at other loci may be involved in melanin expression and contribute to the extreme plumage polymorphism observed in this species. PMID:25534935
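The distinction between synonymous and nonsynonymous single nucleotide polymorphisms mentioned above can be sketched as follows. This is a minimal illustration using the standard genetic code; the function name and the example codons are hypothetical and do not correspond to the actual MC1R variants reported in the study.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order
# (third base varying fastest); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(codon, position, alt_base):
    """Classify a single-base change inside a codon.

    codon: reference codon (three DNA letters), position: 0, 1, or 2,
    alt_base: the variant base. Returns (kind, ref_aa, alt_aa).
    """
    mutated = codon[:position] + alt_base + codon[position + 1:]
    ref_aa, alt_aa = CODON_TABLE[codon], CODON_TABLE[mutated]
    kind = "synonymous" if ref_aa == alt_aa else "nonsynonymous"
    return kind, ref_aa, alt_aa


# Hypothetical examples: a third-position change that keeps valine,
# and a first-position change that replaces valine with isoleucine.
print(classify_snp("GTT", 2, "C"))
print(classify_snp("GTT", 0, "A"))
```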
Variability of the protein sequences of lcrV between epidemic and atypical rhamnose-positive strains of Yersinia pestis.

PubMed

Anisimov, Andrey P; Panfertsev, Evgeniy A; Svetoch, Tat'yana E; Dentovskaya, Svetlana V

2007-01-01

Sequencing of lcrV genes and comparison of the deduced amino acid sequences from ten Y. pestis strains belonging mostly to the group of atypical rhamnose-positive isolates (non-pestis subspecies or pestoides group) showed that the LcrV proteins analyzed could be classified into five sequence types. This classification was based on major amino acid polymorphisms among LcrV proteins in the four "hot points" of the protein sequences. Some additional minor polymorphisms were found throughout these sequence types. The "hot points" corresponded to amino acids 18 (Lys --> Asn), 72 (Lys --> Arg), 273 (Cys --> Ser), and 324-326 (Ser-Gly-Lys --> Arg) in the LcrV sequence of the reference Y. pestis strain CO92. One possible explanation for polymorphism in amino acid sequences of LcrV among different strains is that strain-specific variation resulted from adaptation of the plague pathogen to different rodent and lagomorph hosts.
Evolutionary Pattern of the FAE1 Gene in Brassicaceae and Its Correlation with the Erucic Acid Trait

PubMed Central

Li, Mimi; Peng, Bin; Guo, Haisong; Yan, Qinqin; Hang, Yueyu

2013-01-01

The fatty acid elongase 1 (FAE1) gene catalyzes the initial condensation step in the elongation pathway of VLCFA (very long chain fatty acid) biosynthesis and is thus a key gene in erucic acid biosynthesis. Based on a worldwide collection of 62 accessions representing 14 tribes, 31 genera, 51 species, 4 subspecies and 7 varieties, we conducted a phylogenetic reconstruction and correlation analysis between genetic variations in the FAE1 gene and the erucic acid trait, attempting to gain insight into the evolutionary patterns and the correlations between genetic variations in FAE1 and trait variations. The five clear, deeply diverged clades detected in the phylogenetic reconstruction are largely congruent with a previous multiple gene-derived phylogeny. The Ka/Ks ratio (<1) and overall low level of nucleotide diversity in the FAE1 gene suggest that purifying selection is the major evolutionary force acting on this gene. Sequence variations in FAE1 show a strong correlation with the content of erucic acid in seeds, suggesting a causal link between the two. Furthermore, we detected 16 mutations that were fixed between the low and high phenotypes of the FAE1 gene, which constitute candidate active sites in this gene for altering the content of erucic acid in seeds. Our findings begin to shed light on the evolutionary pattern of this important gene and represent the first step in elucidating how the sequence variations impact the production of erucic acid in plants. PMID:24358289
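The reading of a Ka/Ks ratio below 1 as evidence of purifying selection follows the conventional interpretation of that ratio, which can be sketched as below. The function name, the tolerance band around 1, and the example rates are assumptions for illustration; the abstract reports only that the ratio is below 1.

```python
def selection_regime(ka, ks, tolerance=0.05):
    """Interpret a Ka/Ks ratio using the conventional thresholds.

    ka: nonsynonymous substitutions per nonsynonymous site
    ks: synonymous substitutions per synonymous site
    tolerance: hypothetical band around 1 treated as "neutral"
    """
    if ks <= 0:
        raise ValueError("Ks must be positive for the ratio to be defined")
    omega = ka / ks
    if omega < 1 - tolerance:
        label = "purifying selection"
    elif omega > 1 + tolerance:
        label = "positive selection"
    else:
        label = "approximately neutral"
    return omega, label


# Hypothetical rates, not values from the FAE1 study.
omega, label = selection_regime(ka=0.02, ks=0.20)
print(f"Ka/Ks = {omega:.2f}: {label}")
```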

Full genome sequence of Rocio virus reveal substantial variations from the prototype Rocio virus SPH 34675 sequence.

PubMed

Setoh, Yin Xiang; Amarilla, Alberto A; Peng, Nias Y; Slonchak, Andrii; Periasamy, Parthiban; Figueiredo, Luiz T M; Aquino, Victor H; Khromykh, Alexander A

2018-01-01

Rocio virus (ROCV) is an arbovirus belonging to the genus Flavivirus, family Flaviviridae. We present an updated sequence of ROCV strain SPH 34675 (GenBank: AY632542.4), the only available full genome sequence prior to this study. Using next-generation sequencing of the entire genome, we reveal substantial sequence variation from the prototype sequence, with 30 nucleotide differences amounting to 14 amino acid changes, as well as significant changes to predicted 3'UTR RNA structures. Our results present an updated and corrected sequence of a potential emerging human-virulent flavivirus uniquely indigenous to Brazil (GenBank: MF461639).
Genetic variation assessment of acid lime accessions collected from south of Iran using SSR and ISSR molecular markers.

PubMed

Sharafi, Ata Allah; Abkenar, Asad Asadi; Sharafi, Ali; Masaeli, Mohammad

2016-01-01

Iran has a long history of acid lime cultivation and propagation. In this study, genetic variation in 28 acid lime accessions from five regions of southern Iran, and their relatedness to 19 other citrus cultivars, were analyzed using Simple Sequence Repeat (SSR) and Inter-Simple Sequence Repeat (ISSR) molecular markers. Nine SSR primers and nine ISSR primers were used for allele scoring. In total, 49 SSR and 131 ISSR polymorphic alleles were detected. Cluster analysis of SSR and ISSR data showed that most of the acid lime accessions (19 genotypes) have a hybrid origin and are genetically distant from nucellar Mexican lime (9 genotypes). As nucellar Mexican lime plants are susceptible to phytoplasma, these acid lime genotypes can be used to evaluate their tolerance against biotic constraints such as lime "witches' broom disease".
Isolation and molecular characterization of partial FSH and LH receptor genes in Arabian camels (Camelus dromedarius)

PubMed Central

Jelokhani-Niaraki, Saber; Tahmoorespur, Mojtaba; Bitaraf-Sani, Morteza

2015-01-01

Very little is known about LHR and FSHR genes of domestic dromedary camels. The main objective of this study was to determine and analyze partial genomic regions of FSHR and LHR genes in dromedary camels for the first time. To this end, a total of 50 DNA samples belonging to dromedary camels raised in Iran were sent for sequencing (25 samples of each gene). We compared the nucleotide sequences of Camelus dromedarius with corresponding sequences of previously published FSHR and LHR genes in bactrian camels and other species. According to the data, the same nucleotide variation was identified in both regions of the two camel species. The alignment of deduced protein sequences of the two different species revealed an amino acid variation at the FSHR region. No evidence of amino acid variation was observed, however, in LHR sequences. Phylogenetic analysis indicated that both camel species had a close relationship and clustered together in a separate branch. This was further confirmed by genetic distance values illustrating significant sequence identity between Camelus dromedarius and Camelus bactrianus. Interestingly, sequence comparisons revealed heterozygote patterns in FSHR sequences isolated from dromedary camels of Iran. In comparison to other species, this camel contains three amino acid substitutions at positions 5, 67, and 105 in the FSHR coding region. These positions are found exclusively in camels and can be considered as species specific. The results of our study can be used for hormone functionality research (FSHR and LHR) as well as reproduction-linked polymorphisms and breeding programs. PMID:27844002
Isolation and molecular characterization of partial FSH and LH receptor genes in Arabian camels (Camelus dromedarius).

PubMed

Jelokhani-Niaraki, Saber; Tahmoorespur, Mojtaba; Bitaraf-Sani, Morteza

2015-06-01

Very little is known about LHR and FSHR genes of domestic dromedary camels. The main objective of this study was to determine and analyze partial genomic regions of FSHR and LHR genes in dromedary camels for the first time. To this end, a total of 50 DNA samples belonging to dromedary camels raised in Iran were sent for sequencing (25 samples of each gene). We compared the nucleotide sequences of Camelus dromedarius with corresponding sequences of previously published FSHR and LHR genes in bactrian camels and other species. According to the data, the same nucleotide variation was identified in both regions of the two camel species. The alignment of deduced protein sequences of the two different species revealed an amino acid variation at the FSHR region. No evidence of amino acid variation was observed, however, in LHR sequences. Phylogenetic analysis indicated that both camel species had a close relationship and clustered together in a separate branch. This was further confirmed by genetic distance values illustrating significant sequence identity between Camelus dromedarius and Camelus bactrianus. Interestingly, sequence comparisons revealed heterozygote patterns in FSHR sequences isolated from dromedary camels of Iran. In comparison to other species, this camel contains three amino acid substitutions at positions 5, 67, and 105 in the FSHR coding region. These positions are found exclusively in camels and can be considered as species specific. The results of our study can be used for hormone functionality research (FSHR and LHR) as well as reproduction-linked polymorphisms and breeding programs.
Analysis of microbial community variation during the mixed culture fermentation of agricultural peel wastes to produce lactic acid.

PubMed

Liang, Shaobo; Gliniewicz, Karol; Gerritsen, Alida T; McDonald, Armando G

2016-05-01

Mixed culture fermentation can be used to convert organic wastes into various chemicals and fuels. This study examined the fermentation performance of four batch reactors fed with different agricultural peel wastes (orange, banana, and potato (mechanical and steam)) using mixed cultures, and monitored the variation of the reactor microbial communities at intervals by Illumina sequencing of 16S rRNA genes. All four reactors produced a similar chemical profile with lactic acid (LA) as the dominant compound. Acetic acid and ethanol were also observed in small fractions. The Illumina sequencing results revealed that the diversity of the microbial community decreased during fermentation and that a community of largely lactic acid-producing bacteria, dominated by species of Lactobacillus, developed. Copyright © 2016 Elsevier Ltd. All rights reserved.
Conversion of amino-acid sequence in proteins to classical music: search for auditory patterns

PubMed Central

2007-01-01

We have converted genome-encoded protein sequences into musical notes to reveal auditory patterns without compromising musicality. We derived a reduced range of 13 base notes by pairing similar amino acids and distinguishing them using variations of three-note chords and codon distribution to dictate rhythm. The conversion will help make genomic coding sequences more approachable for the general public, young children, and vision-impaired scientists. PMID:17477882
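The idea of reducing the 20 amino acids to 13 base notes by pairing similar residues can be sketched as below. The pairings, the MIDI note numbers, and the 0/1 variant flag that stands in for the chord variation are all illustrative assumptions and are not the authors' published mapping; the codon-based rhythm described in the abstract is omitted.

```python
# Hypothetical grouping: 7 pairs of similar amino acids plus 6 singles
# gives 13 groups, one base note per group.
PAIRS = ["DE", "NQ", "KR", "IL", "FY", "ST", "AV"]
SINGLES = ["G", "P", "C", "M", "H", "W"]
GROUPS = PAIRS + SINGLES

BASE_MIDI = 60  # assumed starting pitch (middle C); one semitone per group

# Map each amino acid to (base note, variant). The variant index is a
# stand-in for the chord variation that distinguishes paired residues.
NOTE_OF = {
    aa: (BASE_MIDI + group_index, variant)
    for group_index, group in enumerate(GROUPS)
    for variant, aa in enumerate(group)
}


def protein_to_notes(sequence):
    """Convert a one-letter protein sequence to a list of (note, variant)."""
    return [NOTE_OF[aa] for aa in sequence.upper()]


print(protein_to_notes("MDEK"))
```

Paired residues such as D and E share a base note and differ only in the variant flag, which is how the sketch keeps the range at 13 notes while still distinguishing all 20 amino acids.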
Ultra-deep sequencing reveals high prevalence and broad structural diversity of hepatitis B surface antigen mutations in a global population

PubMed Central

Gencay, Mikael; Hübner, Kirsten; Gohl, Peter; Seffner, Anja; Weizenegger, Michael; Neofytos, Dionysios; Batrla, Richard; Woeste, Andreas; Kim, Hyon-suk; Westergaard, Gaston; Reinsch, Christine; Brill, Eva; Thu Thuy, Pham Thi; Hoang, Bui Huu; Sonderup, Mark; Spearman, C. Wendy; Pabinger, Stephan; Gautier, Jérémie; Brancaccio, Giuseppina; Fasano, Massimo; Santantonio, Teresa; Gaeta, Giovanni B.; Nauck, Markus; Kaminski, Wolfgang E.

2017-01-01

The diversity of the hepatitis B surface antigen (HBsAg) has a significant impact on the performance of diagnostic screening tests and the clinical outcome of hepatitis B infection. Neutralizing or diagnostic antibodies against the HBsAg are directed towards its highly conserved major hydrophilic region (MHR), in particular towards its “a” determinant subdomain. Here, we explored, on a global scale, the genetic diversity of the HBsAg MHR in a large, multi-ethnic cohort of randomly selected subjects with HBV infection from four continents. A total of 1553 HBsAg positive blood samples of subjects originating from 20 different countries across Africa, America, Asia and central Europe were characterized for amino acid variation in the MHR. Using highly sensitive ultra-deep sequencing, we found 72.8% of the successfully sequenced subjects (n = 1391) demonstrated amino acid sequence variation in the HBsAg MHR. This indicates that the global variation frequency in the HBsAg MHR is threefold higher than previously reported. The majority of the amino acid mutations were found in the HBV genotypes B (28.9%) and C (25.4%). Collectively, we identified 345 distinct amino acid mutations in the MHR. Among these, we report 62 previously unknown mutations, which extends the worldwide pool of currently known HBsAg MHR mutations by 22%. Importantly, topological analysis identified the “a” determinant upstream flanking region as the structurally most diverse subdomain of the HBsAg MHR. The highest prevalence of “a” determinant region mutations was observed in subjects from Asia, followed by the African, American and European cohorts, respectively. Finally, we found that more than half (59.3%) of all HBV subjects investigated carried multiple MHR mutations. Together, this worldwide ultra-deep sequencing based genotyping study reveals that the global prevalence and structural complexity of variation in the hepatitis B surface antigen have, to date, been significantly underappreciated. 
PMID:28472040
Ultra-deep sequencing reveals high prevalence and broad structural diversity of hepatitis B surface antigen mutations in a global population.

PubMed

Gencay, Mikael; Hübner, Kirsten; Gohl, Peter; Seffner, Anja; Weizenegger, Michael; Neofytos, Dionysios; Batrla, Richard; Woeste, Andreas; Kim, Hyon-Suk; Westergaard, Gaston; Reinsch, Christine; Brill, Eva; Thu Thuy, Pham Thi; Hoang, Bui Huu; Sonderup, Mark; Spearman, C Wendy; Pabinger, Stephan; Gautier, Jérémie; Brancaccio, Giuseppina; Fasano, Massimo; Santantonio, Teresa; Gaeta, Giovanni B; Nauck, Markus; Kaminski, Wolfgang E

2017-01-01

The diversity of the hepatitis B surface antigen (HBsAg) has a significant impact on the performance of diagnostic screening tests and the clinical outcome of hepatitis B infection. Neutralizing or diagnostic antibodies against the HBsAg are directed towards its highly conserved major hydrophilic region (MHR), in particular towards its "a" determinant subdomain. Here, we explored, on a global scale, the genetic diversity of the HBsAg MHR in a large, multi-ethnic cohort of randomly selected subjects with HBV infection from four continents. A total of 1553 HBsAg positive blood samples of subjects originating from 20 different countries across Africa, America, Asia and central Europe were characterized for amino acid variation in the MHR. Using highly sensitive ultra-deep sequencing, we found 72.8% of the successfully sequenced subjects (n = 1391) demonstrated amino acid sequence variation in the HBsAg MHR. This indicates that the global variation frequency in the HBsAg MHR is threefold higher than previously reported. The majority of the amino acid mutations were found in the HBV genotypes B (28.9%) and C (25.4%). Collectively, we identified 345 distinct amino acid mutations in the MHR. Among these, we report 62 previously unknown mutations, which extends the worldwide pool of currently known HBsAg MHR mutations by 22%. Importantly, topological analysis identified the "a" determinant upstream flanking region as the structurally most diverse subdomain of the HBsAg MHR. The highest prevalence of "a" determinant region mutations was observed in subjects from Asia, followed by the African, American and European cohorts, respectively. Finally, we found that more than half (59.3%) of all HBV subjects investigated carried multiple MHR mutations. Together, this worldwide ultra-deep sequencing based genotyping study reveals that the global prevalence and structural complexity of variation in the hepatitis B surface antigen have, to date, been significantly underappreciated.
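The headline figures in this abstract can be checked with a short calculation. The sketch below uses only numbers quoted above; the baseline for the 22% increase is not stated in the abstract, so taking it to be the mutations found in this study that were already known (345 minus 62) is an assumption that happens to reproduce the reported value.

```python
# Figures quoted in the abstract.
sequenced_subjects = 1391
fraction_with_variation = 0.728
distinct_mutations = 345
novel_mutations = 62

# Approximate number of subjects showing MHR amino acid variation.
subjects_with_variation = round(sequenced_subjects * fraction_with_variation)

# Assumed baseline: mutations observed here that were previously known.
previously_known = distinct_mutations - novel_mutations
relative_increase = novel_mutations / previously_known

print(subjects_with_variation)      # about 1013 subjects
print(previously_known)             # 283 mutations
print(f"{relative_increase:.1%}")   # roughly 22%
```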
Identification and characterization of transcript polymorphisms in soybean lines varying in oil composition and content.

PubMed

Goettel, Wolfgang; Xia, Eric; Upchurch, Robert; Wang, Ming-Li; Chen, Pengyin; An, Yong-Qiang Charles

2014-04-23

Variation in seed oil composition and content among soybean varieties is largely attributed to differences in transcript sequences and/or transcript accumulation of oil production related genes in seeds. Discovery and analysis of sequence and expression variations in these genes will accelerate soybean oil quality improvement. In an effort to identify these variations, we sequenced the transcriptomes of soybean seeds from nine lines varying in oil composition and/or total oil content. Our results showed that 69,338 distinct transcripts from 32,885 annotated genes were expressed in seeds. A total of 8,037 transcript expression polymorphisms and 50,485 transcript sequence polymorphisms (48,792 SNPs and 1,693 small Indels) were identified among the lines. Effects of the transcript polymorphisms on their encoded protein sequences and functions were predicted. The studies also provided independent evidence that the lack of FAD2-1A gene activity and a non-synonymous SNP in the coding sequence of FAB2C caused elevated oleic acid and stearic acid levels in soybean lines M23 and FAM94-41, respectively. As a proof-of-concept, we developed an integrated RNA-seq and bioinformatics approach to identify and functionally annotate transcript polymorphisms, and demonstrated its high effectiveness for discovery of genetic and transcript variations that result in altered oil quality traits. The collection of transcript polymorphisms coupled with their predicted functional effects will be a valuable asset for further discovery of genes, gene variants, and functional markers to improve soybean oil quality.
SNP in Chalcone Synthase gene is associated with variation of 6-gingerol content in contrasting landraces of Zingiber officinale Roscoe.

PubMed

Ghosh, Subhabrata; Mandi, Swati Sen

2015-07-25

Zingiber officinale, medicinally the most important species within the genus Zingiber, contains 6-gingerol as the active principle. This compound, obtained from rhizomes of Z. officinale, has immense medicinal importance and is used in various herbal drug formulations. Our record of variation in the content of this active principle, viz. 6-gingerol, in landraces of this drug plant collected from different locations correlated with our gene expression studies, which showed higher Chalcone Synthase gene expression in high 6-gingerol containing landraces than in low 6-gingerol containing landraces (Chalcone Synthase is the rate-limiting enzyme of the 6-gingerol biosynthesis pathway). Sequencing of Chalcone Synthase cDNA and subsequent multiple sequence alignment revealed seven SNPs between these contrasting genotypes. On converting the nucleotide sequence to the amino acid sequence, alteration of two amino acids becomes evident: one amino acid change (asparagine to serine at position 336) is associated with a base change (A→G), and another (serine to leucine at position 142) is associated with a base change (C→T). Since asparagine at position 336 is one of the critical amino acids of the catalytic triad of the Chalcone Synthase enzyme, responsible for substrate binding, our study suggests that a specific amino acid change, viz. asparagine (found in high 6-gingerol containing landraces) to serine, causes low 6-gingerol content. This is probably due to a weak enzyme-substrate association caused by the absence of asparagine in the catalytic triad. Detailed study of this finding could also help to understand the molecular mechanism associated with variation in 6-gingerol content in Z. officinale genotypes and thereby inform strategies for developing elite genotypes containing high 6-gingerol content. Copyright © 2015 Elsevier B.V. All rights reserved.
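The link between a single base change and an amino acid change described in this abstract can be illustrated with a minimal Python sketch. The abstract does not report the actual codons in the Chalcone Synthase cDNA, so the codons below (AAT, TCA) are hypothetical choices from the standard genetic code that happen to be consistent with the reported A→G (Asn→Ser) and C→T (Ser→Leu) changes.

```python
# Partial standard genetic code: only the codons needed for this illustration.
# The specific codons are assumptions; the abstract gives only the base changes.
CODON_TABLE = {
    "AAT": "Asn", "AGT": "Ser",
    "TCA": "Ser", "TTA": "Leu",
}


def substitute(codon, pos, new_base):
    """Return the codon with the base at 0-based position `pos` replaced."""
    return codon[:pos] + new_base + codon[pos + 1:]


def describe_change(codon, pos, new_base):
    """Summarise the codon change and the resulting amino acid change."""
    mutant = substitute(codon, pos, new_base)
    return f"{codon}->{mutant}: {CODON_TABLE[codon]}->{CODON_TABLE[mutant]}"


# A->G at the second codon position (hypothetical codon for residue 336)
print(describe_change("AAT", 1, "G"))
# C->T at the second codon position (hypothetical codon for residue 142)
print(describe_change("TCA", 1, "T"))
```

Both changes in this sketch are transitions at the second codon position, which always alters the encoded amino acid in the standard code for these codons.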
Variation of clinical expression in patients with Stargardt dystrophy and sequence variations in the ABCR gene.

PubMed

Fishman, G A; Stone, E M; Grover, S; Derlacki, D J; Haines, H L; Hockey, R R

1999-04-01

To report the spectrum of ophthalmic findings in patients with Stargardt dystrophy or fundus flavimaculatus who have a specific sequence variation in the ABCR gene. Twenty-nine patients with Stargardt dystrophy or fundus flavimaculatus from different pedigrees were identified with possible disease-causing sequence variations in the ABCR gene from a group of 66 patients who were screened for sequence variations in this gene. Patients underwent a routine ocular examination, including slitlamp biomicroscopy and a dilated fundus examination. Fluorescein angiography was performed on 22 patients, and electroretinographic measurements were obtained on 24 of 29 patients. Kinetic visual fields were measured with a Goldmann perimeter in 26 patients. Single-strand conformation polymorphism analysis and DNA sequencing were used to identify variations in coding sequences of the ABCR gene. Three clinical phenotypes were observed among these 29 patients. In phenotype I, 9 of 12 patients had a sequence change in exon 42 of the ABCR gene in which the amino acid glutamic acid was substituted for glycine (Gly1961Glu). In only 4 of these 9 patients was a second possible disease-causing mutation found on the other ABCR allele. In addition to an atrophic-appearing macular lesion, phenotype I was characterized by localized perifoveal yellowish white flecks, the absence of a dark choroid, and normal electroretinographic amplitudes. Phenotype II consisted of 10 patients who showed a dark choroid and more diffuse yellowish white flecks in the fundus. None exhibited the Gly1961Glu change. Phenotype III consisted of 7 patients who showed extensive atrophic-appearing changes of the retinal pigment epithelium. Electroretinographic cone and rod amplitudes were reduced. One patient showed the Gly1961Glu change. A wide variation in clinical phenotype can occur in patients with sequence changes in the ABCR gene. 
In individual patients, a certain phenotype seems to be associated with the presence of a Gly1961Glu change in exon 42 of the ABCR gene. The identification of correlations between specific mutations in the ABCR gene and clinical phenotypes will better facilitate the counseling of patients on their visual prognosis. This information will also likely be important for future therapeutic trials in patients with Stargardt dystrophy.
Hermes Transposon Distribution and Structure in Musca domestica

PubMed Central

Subramanian, Ramanand A.; Cathcart, Laura A.; Krafsur, Elliot S.; Atkinson, Peter W.

2009-01-01

Hermes are hAT transposons from Musca domestica that are very closely related to the hobo transposons from Drosophila melanogaster and are useful as gene vectors in a wide variety of organisms including insects, planaria, and yeast. hobo elements show distinct length variations in a rapidly evolving region of the transposase-coding region as a result of expansions and contractions of a simple repeat sequence encoding 3 amino acids threonine, proline, and glutamic acid (TPE). These variations in length may influence the function of the protein and the movement of hobo transposons in natural populations. Here, we determine the distribution of Hermes in populations of M. domestica as well as whether Hermes transposase has undergone similar sequence expansions and contractions during its evolution in this species. Hermes transposons were found in all M. domestica individuals sampled from 14 populations collected from 4 continents. All individuals with Hermes transposons had evidence for the presence of intact transposase open reading frames, and little sequence variation was observed among Hermes elements. A systematic analysis of the TPE-homologous region of the Hermes transposase-coding region revealed no evidence for length variation. The simple sequence repeat found in hobo elements is a feature of this transposon that evolved since the divergence of hobo and Hermes. PMID:19366812
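A length-variation screen of the kind described for the TPE repeat region could be sketched as follows. This is a minimal illustration, not the authors' method: the sequences are hypothetical toy peptides, and the function name `longest_tpe_run` is introduced here only for the example.

```python
import re


def longest_tpe_run(protein):
    """Return the number of TPE units in the longest uninterrupted tandem run."""
    runs = re.findall(r"(?:TPE)+", protein)
    return max((len(run) // 3 for run in runs), default=0)


# Hypothetical toy sequences that differ only in the number of TPE repeats.
variants = {
    "hypothetical_A": "MKL" + "TPE" * 3 + "GS",
    "hypothetical_B": "MKL" + "TPE" * 5 + "GS",
    "hypothetical_C": "MKL" + "TPE" * 1 + "GS",
}

repeat_counts = {name: longest_tpe_run(seq) for name, seq in variants.items()}
print(repeat_counts)

# Length variation is present when more than one repeat count is observed.
has_length_variation = len(set(repeat_counts.values())) > 1
print(has_length_variation)
```

Under this sketch, a set of elements with no length variation (as reported for Hermes) would yield a single repeat count across all sequences.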
Variation of amino acid sequences of serum amyloid a (SAA) and immunohistochemical analysis of amyloid a (AA) in Japanese domestic cats.

PubMed

Tei, Meina; Uchida, Kazuyuki; Chambers, James K; Watanabe, Ken-Ichi; Tamamoto, Takashi; Ohno, Koichi; Nakayama, Hiroyuki

2018-02-02

Amyloid A (AA) amyloidosis, a fatal systemic amyloid disease, occurs secondary to chronic inflammatory conditions in humans. Although persistently elevated serum amyloid A (SAA) levels are required for its pathogenesis, not all individuals with chronic inflammation necessarily develop AA amyloidosis. Furthermore, many diseases in cats are associated with the elevated production of SAA, whereas only a small number actually develop AA amyloidosis. We hypothesized that a genetic mutation in the SAA gene may strongly contribute to the pathogenesis of feline AA amyloidosis. In the present study, genomic DNA from four Japanese domestic cats (JDCs) with AA amyloidosis and from five without amyloidosis was analyzed using polymerase chain reaction (PCR) amplification and direct sequencing. We identified the novel variation combination of 45R-51A in the deduced amino acid sequences of four JDCs with amyloidosis and five without. However, there was no relationship between amino acid variations and the distribution of AA amyloid deposits, indicating that differences in SAA sequences do not contribute to the pathogenesis of AA amyloidosis. Immunohistochemical analysis using antisera against the three different parts of the feline SAA protein (i.e., the N-terminal, central, and C-terminal regions) revealed that feline AA contained the C-terminus, unlike human AA. These results indicate that the cleavage and degradation of the C-terminus are not essential for amyloid fibril formation in JDCs.
Variation in a surface-exposed region of the Mycoplasma pneumoniae P40 protein as a consequence of homologous DNA recombination between RepMP5 elements.

PubMed

Spuesens, Emiel B M; van de Kreeke, Nick; Estevão, Silvia; Hoogenboezem, Theo; Sluijter, Marcel; Hartwig, Nico G; van Rossum, Annemarie M C; Vink, Cornelis

2011-02-01

Mycoplasma pneumoniae is a human pathogen that causes a range of respiratory tract infections. The first step in infection is adherence of the bacteria to the respiratory epithelium. This step is mediated by a specialized organelle, which contains several proteins (cytadhesins) that have an important function in adherence. Two of these cytadhesins, P40 and P90, represent the proteolytic products from a single 130 kDa protein precursor, which is encoded by the MPN142 gene. Interestingly, MPN142 contains a repetitive DNA element, termed RepMP5, of which homologues are found at seven other loci within the M. pneumoniae genome. It has been hypothesized that these RepMP5 elements, which are similar but not identical in sequence, recombine with their counterpart within MPN142 and thereby provide a source of sequence variation for this gene. As this variation may give rise to amino acid changes within P40 and P90, the recombination between RepMP5 elements may constitute the basis of antigenic variation and, possibly, immune evasion by M. pneumoniae. To investigate the sequence variation of MPN142 in relation to inter-RepMP5 recombination, we determined the sequences of all RepMP5 elements in a collection of 25 strains. The results indicate that: (i) inter-RepMP5 recombination events have occurred in seven of the strains, and (ii) putative RepMP5 recombination events involving MPN142 have induced amino acid changes in a surface-exposed part of the P40 protein in two of the strains. We conclude that recombination between RepMP5 elements is a common phenomenon that may lead to sequence variation of MPN142-encoded proteins.
Major histocompatibility complex variation in the endangered Przewalski's horse.

PubMed Central

Hedrick, P W; Parker, K M; Miller, E L; Miller, P S

1999-01-01

The major histocompatibility complex (MHC) is a fundamental part of the vertebrate immune system, and the high variability in many MHC genes is thought to play an essential role in recognition of parasites. The Przewalski's horse is extinct in the wild and all the living individuals descend from 13 founders, most of whom were captured around the turn of the century. One of the primary genetic concerns in endangered species is whether they have ample adaptive variation to respond to novel selective factors. In examining 14 Przewalski's horses that are broadly representative of the living animals, we found six different class II DRB major histocompatibility sequences. The sequences showed extensive nonsynonymous variation, concentrated in the putative antigen-binding sites, and little synonymous variation. Individuals had from two to four sequences as determined by single-stranded conformation polymorphism (SSCP) analysis. On the basis of the SSCP data, phylogenetic analysis of the nucleotide sequences, and segregation in a family group, we conclude that four of these sequences are from one gene (although one sequence codes for a nonfunctional allele because it contains a stop codon) and two other sequences are from another gene. The position of the stop codon is at the same amino-acid position as in a closely related sequence from the domestic horse. Because other organisms have extensive variation at homologous loci, the Przewalski's horse may have quite low variation in this important adaptive region. PMID:10430594
Evidence of Divergent Amino Acid Usage in Comparative Analyses of R5- and X4-Associated HIV-1 Vpr Sequences

PubMed Central

Antell, Gregory C.; Zhong, Wen; Kercher, Katherine; Passic, Shendra; Williams, Jean; Liu, Yucheng; James, Tony; Jacobson, Jeffrey M.; Szep, Zsofia

2017-01-01

Vpr is an HIV-1 accessory protein that plays numerous roles during viral replication, some of which are cell type dependent. To test the hypothesis that HIV-1 tropism extends beyond the envelope into the vpr gene, studies were performed to identify the associations between coreceptor usage and Vpr variation in HIV-1-infected patients. Colinear HIV-1 Env-V3 and Vpr amino acid sequences were obtained from the LANL HIV-1 sequence database and from well-suppressed patients in the Drexel/Temple Medicine CNS AIDS Research and Eradication Study (CARES) Cohort. Genotypic classification of Env-V3 sequences as X4 (CXCR4-utilizing) or R5 (CCR5-utilizing) was used to group colinear Vpr sequences. To reveal the sequences associated with a specific coreceptor usage genotype, Vpr amino acid sequences were assessed for amino acid diversity and Jensen-Shannon divergence between the two groups. Five amino acid alphabets were used to comprehensively examine the impact of amino acid substitutions involving side chains with similar physicochemical properties. Positions 36, 37, 41, 89, and 96 of Vpr were characterized by statistically significant divergence across multiple alphabets when X4 and R5 sequence groups were compared. In addition, consensus amino acid switches were found at positions 37 and 41 in comparisons of the R5 and X4 sequence populations. These results suggest an evolutionary link between Vpr and gp120 in HIV-1-infected patients. PMID:28620613
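The per-position Jensen-Shannon divergence between two sequence groups can be sketched in Python as below. This is an illustration under stated assumptions rather than the study's pipeline: the alignment columns are invented, and the reduced alphabet `GROUPS` is a hypothetical two-class grouping, not one of the five alphabets used by the authors.

```python
import math
from collections import Counter


def distribution(residues, alphabet):
    """Relative frequency of each alphabet symbol in one alignment column."""
    counts = Counter(residues)
    total = sum(counts[s] for s in alphabet)
    return [counts[s] / total for s in alphabet]


def kl(p, q):
    """Kullback-Leibler divergence in bits; terms with p_i = 0 contribute 0."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def jensen_shannon(p, q):
    """Jensen-Shannon divergence in bits (0 = identical, 1 = disjoint)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def recode(residues, groups):
    """Map residues onto a reduced alphabet; unmapped residues are kept."""
    return [groups.get(r, r) for r in residues]


# Hypothetical column of residues at one position in each group.
r5_column = list("IIIIIIVI")
x4_column = list("RRRKRRRI")

full = sorted(set(r5_column) | set(x4_column))
jsd_full = jensen_shannon(distribution(r5_column, full),
                          distribution(x4_column, full))

# Hypothetical reduced alphabet: "a" = aliphatic, "b" = basic.
GROUPS = {"I": "a", "L": "a", "V": "a", "R": "b", "K": "b"}
r5_red, x4_red = recode(r5_column, GROUPS), recode(x4_column, GROUPS)
reduced = sorted(set(r5_red) | set(x4_red))
jsd_reduced = jensen_shannon(distribution(r5_red, reduced),
                             distribution(x4_red, reduced))

print(f"full alphabet: {jsd_full:.3f} bits, reduced: {jsd_reduced:.3f} bits")
```

Recoding into a reduced alphabet merges substitutions between similar side chains, so a divergence that persists across alphabets reflects a change in residue class and not only in residue identity. Assessing statistical significance would need an additional step (for example, a permutation test), which is not shown here.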
Identification and characterization of transcript polymorphisms in soybean lines varying in oil composition and content

PubMed Central

2014-01-01

Background: Variation in seed oil composition and content among soybean varieties is largely attributed to differences in transcript sequences and/or transcript accumulation of oil production related genes in seeds. Discovery and analysis of sequence and expression variations in these genes will accelerate soybean oil quality improvement. Results: In an effort to identify these variations, we sequenced the transcriptomes of soybean seeds from nine lines varying in oil composition and/or total oil content. Our results showed that 69,338 distinct transcripts from 32,885 annotated genes were expressed in seeds. A total of 8,037 transcript expression polymorphisms and 50,485 transcript sequence polymorphisms (48,792 SNPs and 1,693 small Indels) were identified among the lines. Effects of the transcript polymorphisms on their encoded protein sequences and functions were predicted. The studies also provided independent evidence that the lack of FAD2-1A gene activity and a non-synonymous SNP in the coding sequence of FAB2C caused elevated oleic acid and stearic acid levels in soybean lines M23 and FAM94-41, respectively. Conclusions: As a proof-of-concept, we developed an integrated RNA-seq and bioinformatics approach to identify and functionally annotate transcript polymorphisms, and demonstrated its high effectiveness for discovery of genetic and transcript variations that result in altered oil quality traits. The collection of transcript polymorphisms coupled with their predicted functional effects will be a valuable asset for further discovery of genes, gene variants, and functional markers to improve soybean oil quality. PMID:24755115
[Genetic characteristics of hemagglutinin in measles viruses isolated in Henan Province, China].

PubMed

Feng, Da-Xing; Seng, Ming-Hua; Liu, Qian; Zhang, Zhen-Ying

2014-03-01

This study aims to investigate the genetic characteristics of hemagglutinin in wild-type measles viruses in Henan Province, China and to provide a basis for measles control and elimination. Specimens were collected from suspected measles cases in Henan during 2008-2012. Cell culture was performed for virus isolation, and RT-PCR was used to amplify hemagglutinin gene. The PCR products were sequenced and analyzed, including construction of phylogenetic tree and analysis of the distance between the isolated virus and the reference virus; then, the variations in predicted amino acids were analyzed. The results showed that 12 measles viruses were isolated in Henan Province and identified as H1a genotype; the nucleotide and amino acid homologies were 98.0%-100% and 97.2%-99.8%, respectively. One glycosylation site changed in all the 12 sequences because of the amino acid mutation from serine to asparagine at the 240th site, as compared with Edmonston-wt. USA/54/A. Overall, the wild-type measles virus genotype circulating in Henan Province from 2008 to 2012 was H1a, with high homology between strains; there were some variations in amino acid sequences, resulting in glycosylation site deletion.
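How a serine-to-asparagine substitution can remove a predicted glycosylation site may be illustrated with a short Python sketch. It assumes the general N-X-S/T consensus motif for N-linked glycosylation (X being any residue except proline), which the abstract does not state explicitly, and it uses a hypothetical toy peptide, not the measles hemagglutinin sequence.

```python
import re

# Lookahead so that overlapping motifs are all reported.
# Assumed consensus: Asn, any residue except Pro, then Ser or Thr.
SEQUON = re.compile(r"(?=N[^P][ST])")


def sequon_positions(protein):
    """Return 1-based positions of Asn residues that start an N-X-[S/T] motif."""
    return [m.start() + 1 for m in SEQUON.finditer(protein)]


# Hypothetical peptides: the mutant carries Ser -> Asn at the third motif position.
reference = "GANLSTK"
mutant = "GANLNTK"

print(sequon_positions(reference))
print(sequon_positions(mutant))
```

In this toy case the motif present in the reference peptide is absent from the mutant, which mirrors the kind of predicted site loss reported in the abstract.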
Cloning and characterization of acid invertase genes in the roots of the metallophyte Kummerowia stipulacea (Maxim.) Makino from two populations: Differential expression under copper stress.

PubMed

Zhang, Luan; Xiong, Zhi-ting; Xu, Zhong-rui; Liu, Chen; Cai, Shen-wen

2014-06-01

The roots of metallophytes serve as the key interface between plants and heavy metal-contaminated underground environments. It is known that the roots of metallicolous plants show a higher activity of acid invertase enzymes than those of non-metallicolous plants when under copper stress. To test whether the higher activity of acid invertases is the result of increased expression of acid invertase genes or variations in the amino acid sequences between the two population types, we isolated full cDNAs for acid invertases from two populations of Kummerowia stipulacea (from metalliferous and non-metalliferous soils), determined their nucleotide sequences, expressed them in Pichia pastoris, and conducted real-time PCR to determine differences in transcript levels during Cu stress. Heterologous expression of acid invertase cDNAs in P. pastoris indicated that variations in the amino acid sequences of acid invertases between the two populations played no significant role in determining enzyme characteristics. Seedlings of K. stipulacea were exposed to 0.3 µM Cu(2+) (control) and 10 µM Cu(2+) for 7 days under hydroponic conditions. The transcript levels of acid invertases in metallicolous plants were significantly higher than in non-metallicolous plants when under copper stress. The results suggest that the expression of acid invertase genes in metallicolous plants of K. stipulacea differed from that in non-metallicolous plants under such conditions. In addition, sugars may play an important role in regulating the transcript level of acid invertase genes, and acid invertase genes may also be involved in root/shoot biomass allocation. Copyright © 2014 Elsevier Inc. All rights reserved.
Diff-seq: A high throughput sequencing-based mismatch detection assay for DNA variant enrichment and discovery

PubMed Central

Karas, Vlad O; Sinnott-Armstrong, Nicholas A; Varghese, Vici; Shafer, Robert W; Greenleaf, William J; Sherlock, Gavin

2018-01-01

Much of the within-species genetic variation is in the form of single nucleotide polymorphisms (SNPs), typically detected by whole genome sequencing (WGS) or microarray-based technologies. However, WGS produces mostly uninformative reads that perfectly match the reference, while microarrays require genome-specific reagents. We have developed Diff-seq, a sequencing-based mismatch detection assay for SNP discovery without the requirement for specialized nucleic-acid reagents. Diff-seq leverages the Surveyor endonuclease to cleave mismatched DNA molecules that are generated after cross-annealing of a complex pool of DNA fragments. Sequencing libraries enriched for Surveyor-cleaved molecules result in increased coverage at the variant sites. Diff-seq detected all mismatches present in an initial test substrate, with specific enrichment dependent on the identity and context of the variation. Application to viral sequences resulted in increased observation of variant alleles in a biologically relevant context. Diff-seq has the potential to increase the sensitivity and efficiency of high-throughput sequencing in the detection of variation. PMID:29361139

The Diversity Present in 5140 Human Mitochondrial Genomes

PubMed Central

Pereira, Luísa; Freitas, Fernando; Fernandes, Verónica; Pereira, Joana B.; Costa, Marta D.; Costa, Stephanie; Máximo, Valdemar; Macaulay, Vincent; Rocha, Ricardo; Samuels, David C.

2009-01-01

We analyzed the current status (as of the end of August 2008) of human mitochondrial genomes deposited in GenBank, amounting to 5140 complete or coding-region sequences, in order to present an overall picture of the diversity present in the mitochondrial DNA of the global human population. To perform this task, we developed mtDNA-GeneSyn, a computer tool that identifies and exhaustively classifies the diversity present in large genetic data sets. The diversity observed in the 5140 human mitochondrial genomes was compared with all possible transitions and transversions from the standard human mitochondrial reference genome. This comparison showed that tRNA and rRNA secondary structures have a large effect in limiting the diversity of the human mitochondrial sequences, whereas for the protein-coding genes there is a bias toward less variation at the second codon positions. The analysis of the observed amino acid variations showed a tolerance of variations that convert between the amino acids V, I, A, M, and T. This defines a group of amino acids with similar chemical properties that can interconvert by a single transition. PMID:19426953
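Comparing observed diversity against all possible transitions and transversions from a reference can be sketched as follows. This is a minimal Python illustration and not the mtDNA-GeneSyn tool itself; the reference fragment and the set of observed variants are hypothetical.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify(ref, alt):
    """Classify a single-base substitution as a transition or a transversion."""
    if ref == alt:
        raise ValueError("ref and alt must differ")
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def all_substitutions(reference):
    """Yield (1-based position, ref, alt, class) for every possible substitution."""
    for pos, ref in enumerate(reference, start=1):
        for alt in "ACGT":
            if alt != ref:
                yield pos, ref, alt, classify(ref, alt)


reference = "ACGT"  # hypothetical reference fragment
possible = list(all_substitutions(reference))

# Hypothetical observed variants as (position, alt) pairs.
observed = {(1, "G"), (2, "T"), (3, "T")}

for kind in ("transition", "transversion"):
    total = [s for s in possible if s[3] == kind]
    seen = [s for s in total if (s[0], s[2]) in observed]
    print(f"{kind}: {len(seen)} observed of {len(total)} possible")
```

Every site has exactly one possible transition and two possible transversions, so the possible counts scale with reference length, and the observed fraction of each class can then be compared between regions (for example, tRNA genes versus codon positions).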
LenVarDB: database of length-variant protein domains.

PubMed

Mutt, Eshita; Mathew, Oommen K; Sowdhamini, Ramanathan

2014-01-01

Protein domains are functionally and structurally independent modules, which add to the functional variety of proteins. This array of functional diversity has been enabled by evolutionary changes, such as amino acid substitutions or insertions or deletions, occurring in these protein domains. Length variations (indels) can introduce changes at structural, functional and interaction levels. LenVarDB (freely available at http://caps.ncbs.res.in/lenvardb/) traces these length variations, starting from structure-based sequence alignments in our Protein Alignments organized as Structural Superfamilies (PASS2) database, across 731 structural classification of proteins (SCOP)-based protein domain superfamilies connected to 2 730 625 sequence homologues. Alignment of sequence homologues corresponding to a structural domain is available, starting from a structure-based sequence alignment of the superfamily. Orientation of the length-variant (indel) regions in protein domains can be visualized by mapping them on the structure and on the alignment. Knowledge about location of length variations within protein domains and their visual representation will be useful in predicting changes within structurally or functionally relevant sites, which may ultimately regulate protein function. Non-technical summary: Evolutionary changes bring about natural changes to proteins that may be found in many organisms. Such changes could be reflected as amino acid substitutions or insertions-deletions (indels) in protein sequences. LenVarDB is a database that provides an early overview of observed length variations that were set among 731 protein families and after examining >2 million sequences. Indels are followed up to observe if they are close to the active site such that they can affect the activity of proteins. Inclusion of such information can aid the design of bioengineering experiments.
Structural requirements for recognition of the HLA-Dw14 class II epitope: A key HLA determinant associated with rheumatoid arthritis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hiraiwa, Akikazu; Yamanaka, Katsuo; Kwok, W.W.

Although HLA genes have been shown to be associated with certain diseases, the basis for this association is unknown. Recent studies, however, have documented patterns of nucleotide sequence variation among some HLA genes associated with a particular disease. For rheumatoid arthritis, HLA genes in most patients have a shared nucleotide sequence encoding a key structural element of an HLA class II polypeptide; this sequence element is critical for the interaction of the HLA molecule with antigenic peptides and with responding T cells, suggestive of a direct role for this sequence element in disease susceptibility. The authors describe the serological and cellular immunologic characteristics encoded by this rheumatoid arthritis-associated sequence element. Site-directed mutagenesis of the DRB1 gene was used to define amino acids critical for antibody and T-cell recognition of this structural element, focusing on residues that distinguish the rheumatoid arthritis-associated alleles Dw4 and Dw14 from a closely related allele, Dw10, not associated with disease. Both the gain and loss of rheumatoid arthritis-associated epitopes were highly dependent on three residues within a discrete domain of the HLA-DR molecule. Recognition was most strongly influenced by the following amino acids (in order): 70 > 71 > 67. Some alloreactive T-cell clones were also influenced by amino acid variation in portions of the DR molecule lying outside the shared sequence element.
Genetic Variation and Its Reflection on Posttranslational Modifications in Frequency Clock and Mating Type a-1 Proteins in Sordaria fimicola

PubMed Central

Arif, Rabia; Akram, Faiza; Jamil, Tazeen; Lee, Siu Fai

2017-01-01

Posttranslational modifications (PTMs) occur in all essential proteins taking command of their functions. There are many domains inside proteins where modifications take place on side-chains of amino acids through various enzymes to generate different species of proteins. In this manuscript we have, for the first time, predicted posttranslational modifications of frequency clock and mating type a-1 proteins in Sordaria fimicola collected from different sites to see the effect of environment on proteins or various amino acids pickings and their ultimate impact on consensus sequences present in mating type proteins using bioinformatics tools. Furthermore, we have also measured and walked through genomic DNA of various Sordaria strains to determine genetic diversity by genotyping the short sequence repeats (SSRs) of wild strains of S. fimicola collected from contrasting environments of two opposing slopes (harsh and xeric south facing slope and mild north facing slope) of Evolution Canyon (EC), Israel. Based on the whole genome sequence of S. macrospora, we targeted 20 genomic regions in S. fimicola which contain short sequence repeats (SSRs). Our data revealed genetic variations in strains from south facing slope and these findings assist in the hypothesis that genetic variations caused by stressful environments lead to evolution. PMID:28717646
Genetic Variation and Its Reflection on Posttranslational Modifications in Frequency Clock and Mating Type a-1 Proteins in Sordaria fimicola.

PubMed

Arif, Rabia; Akram, Faiza; Jamil, Tazeen; Mukhtar, Hamid; Lee, Siu Fai; Saleem, Muhammad

2017-01-01

Posttranslational modifications (PTMs) occur in all essential proteins taking command of their functions. There are many domains inside proteins where modifications take place on side-chains of amino acids through various enzymes to generate different species of proteins. In this manuscript we have, for the first time, predicted posttranslational modifications of frequency clock and mating type a-1 proteins in Sordaria fimicola collected from different sites to see the effect of environment on proteins or various amino acids pickings and their ultimate impact on consensus sequences present in mating type proteins using bioinformatics tools. Furthermore, we have also measured and walked through genomic DNA of various Sordaria strains to determine genetic diversity by genotyping the short sequence repeats (SSRs) of wild strains of S. fimicola collected from contrasting environments of two opposing slopes (harsh and xeric south facing slope and mild north facing slope) of Evolution Canyon (EC), Israel. Based on the whole genome sequence of S. macrospora, we targeted 20 genomic regions in S. fimicola which contain short sequence repeats (SSRs). Our data revealed genetic variations in strains from south facing slope and these findings assist in the hypothesis that genetic variations caused by stressful environments lead to evolution.
Sequence, distribution and chromosomal context of class I and class II pilin genes of Neisseria meningitidis identified in whole genome sequences

PubMed Central

2014-01-01

Background: Neisseria meningitidis expresses type four pili (Tfp) which are important for colonisation and virulence. Tfp have been considered as one of the most variable structures on the bacterial surface due to high frequency gene conversion, resulting in amino acid sequence variation of the major pilin subunit (PilE). Meningococci express either a class I or a class II pilE gene and recent work has indicated that class II pilins do not undergo antigenic variation, as class II pilE genes encode conserved pilin subunits. The purpose of this work was to use whole genome sequences to further investigate the frequency and variability of the class II pilE genes in meningococcal isolate collections. Results: We analysed over 600 publicly available whole genome sequences of N. meningitidis isolates to determine the sequence and genomic organization of pilE. We confirmed that meningococcal strains belonging to a limited number of clonal complexes (ccs, namely cc1, cc5, cc8, cc11 and cc174) harbour a class II pilE gene which is conserved in terms of sequence and chromosomal context. We also identified pilS cassettes in all isolates with class II pilE; however, our analysis indicates that these do not serve as donor sequences for pilE/pilS recombination. Furthermore, our work reveals that the class II pilE locus lacks the DNA sequence motifs that enable (G4) or enhance (Sma/Cla repeat) pilin antigenic variation. Finally, through analysis of pilin genes in commensal Neisseria species we found that meningococcal class II pilE genes are closely related to pilE from Neisseria lactamica and Neisseria polysaccharea, suggesting horizontal transfer among these species. Conclusions: Class II pilins can be defined by their amino acid sequence and genomic context and are present in meningococcal isolates which have persisted and spread globally.
The absence of G4 and Sma/Cla sequences adjacent to the class II pilE genes is consistent with the lack of pilin subunit variation in these isolates, although horizontal transfer may generate class II pilin diversity. This study supports the suggestion that high frequency antigenic variation of pilin is not universal in pathogenic Neisseria. PMID:24690385
RSAT 2015: Regulatory Sequence Analysis Tools.

PubMed

Medina-Rivera, Alejandra; Defrance, Matthieu; Sand, Olivier; Herrmann, Carl; Castro-Mondragon, Jaime A; Delerce, Jeremy; Jaeger, Sébastien; Blanchet, Christophe; Vincens, Pierre; Caron, Christophe; Staines, Daniel M; Contreras-Moreira, Bruno; Artufel, Marie; Charbonnier-Khamvongsa, Lucie; Hernandez, Céline; Thieffry, Denis; Thomas-Chollier, Morgane; van Helden, Jacques

2015-07-01

RSAT (Regulatory Sequence Analysis Tools) is a modular software suite for the analysis of cis-regulatory elements in genome sequences. Its main applications are (i) motif discovery, appropriate to genome-wide data sets like ChIP-seq, (ii) transcription factor binding motif analysis (quality assessment, comparisons and clustering), (iii) comparative genomics and (iv) analysis of regulatory variations. Nine new programs have been added to the 43 described in the 2011 NAR Web Software Issue, including a tool to extract sequences from a list of coordinates (fetch-sequences from UCSC), novel programs dedicated to the analysis of regulatory variants from GWAS or population genomics (retrieve-variation-seq and variation-scan), a program to cluster motifs and visualize the similarities as trees (matrix-clustering). To deal with the drastic increase of sequenced genomes, RSAT public sites have been reorganized into taxon-specific servers. The suite is well-documented with tutorials and published protocols. The software suite is available through Web sites, SOAP/WSDL Web services, virtual machines and stand-alone programs at http://www.rsat.eu/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Characterization and mapping of cDNA encoding aspartate aminotransferase in rice, Oryza sativa L.

PubMed

Song, J; Yamamoto, K; Shomura, A; Yano, M; Minobe, Y; Sasaki, T

1996-10-31

Fifteen cDNA clones, putatively identified as encoding aspartate aminotransferase (AST, EC 2.6.1.1), were isolated and partially sequenced. Together with six previously isolated clones putatively identified to encode ASTs (Sasaki et al. 1994, Plant Journal 6, 615-624), their sequences were characterized and classified into 4 cDNA species. Two of the isolated clones, C60213 and C2079, were full-length cDNAs, and their complete nucleotide sequences were determined. C60213 was 1612 bp long and its deduced amino acid sequence showed 88% homology with that of Panicum miliaceum L. mitochondrial AST. The C60213-encoded protein had an N-terminal amino acid sequence that was characteristic of a mitochondrial transit peptide. On the other hand, C2079 was 1546 bp long and had 91% amino acid sequence homology with P. miliaceum L. cytosolic AST but lacked the transit peptide sequence. The homologies of nucleotide sequences and deduced amino acid sequences of C2079 and C60213 were 54% and 52%, respectively. C2079 and C60213 were mapped on chromosomes 1 and 6, respectively, by restriction fragment length polymorphism linkage analysis. Northern blot analysis using C2079 as a probe revealed much higher transcript levels in callus and root than in green and etiolated shoots, suggesting tissue-specific variations of AST gene expression.
Variation in Seed Fatty Acid Composition, and Sequence Divergence in the FAD2 Gene Coding Region between Wild and Cultivated Sesame

USDA-ARS's Scientific Manuscript database

Sesame germplasm harbors genetic diversity which can be useful for sesame improvement in breeding programs. Seven accessions with different levels of oleic acid were selected from the entire USDA sesame germplasm collection (1232 accessions) and planted for morphological observation and re-examinati...
A gene variation of 14-3-3 zeta isoform in rat hippocampus.

PubMed

Murakami, K; Situ, S Y; Eshete, F

1996-11-14

A variant form of 14-3-3 zeta was isolated from a rat hippocampal cDNA library. The cloned cDNA is 1687 bp in length and contains an entire ORF (nt 63-797) encoding 245 amino acids, which is characteristic of the 14-3-3 zeta subtype. By comparison with reported sequences of 14-3-3 zeta, we found three nucleotide substitutions within the coding sequence of our clone: a C↔T transition at nt 325 and G↔C transversions at nt 387 and 388. Both changes are missense mutations, converting ACG (Thr) to ATG (Met) and CGT (Arg) to GCT (Ala) at residues 88 and 109, respectively. Our results show that at least three different genetic variants of 14-3-3 zeta are present in the rat, resulting in protein variation. Such mutations in the amino acid sequence are an important indication of the diverse functions of this protein and may also contribute to the recent contradictory observations regarding the role of the 14-3-3 zeta subtype.
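The coordinates quoted in this abstract can be cross-checked with simple arithmetic. The sketch below is not taken from the paper: it assumes 1-based cDNA numbering with the ORF starting at nt 63, and the helper names `cdna_to_codon` and `changed_positions` are hypothetical.

```python
ORF_START, ORF_END = 63, 797  # 1-based cDNA coordinates quoted in the abstract


def cdna_to_codon(nt_pos, orf_start=ORF_START):
    """Map a 1-based cDNA coordinate to (codon number, position within codon)."""
    offset = nt_pos - orf_start  # 0-based offset from the first ORF base
    if offset < 0:
        raise ValueError("position lies upstream of the ORF")
    return offset // 3 + 1, offset % 3 + 1


def changed_positions(ref_codon, alt_codon):
    """Return the 1-based codon positions at which two codons differ."""
    return [i + 1 for i, (r, a) in enumerate(zip(ref_codon, alt_codon)) if r != a]


# Number of complete codons spanned by nt 63-797.
orf_codons = (ORF_END - ORF_START + 1) // 3

print(orf_codons)
print(cdna_to_codon(325))
print(cdna_to_codon(387), cdna_to_codon(388))
print(changed_positions("ACG", "ATG"), changed_positions("CGT", "GCT"))
```

Under these assumptions the ORF spans 245 codons, nt 325 is the second base of codon 88, and nt 387 and 388 are the first and second bases of codon 109, which agrees with the codon changes ACG to ATG and CGT to GCT reported above.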
Markov-modulated Markov chains and the covarion process of molecular evolution.

PubMed

Galtier, N; Jean-Marie, A

2004-01-01

The covarion (or site specific rate variation, SSRV) process of biological sequence evolution is a process by which the evolutionary rate of a nucleotide/amino acid/codon position can change in time. In this paper, we introduce time-continuous, space-discrete, Markov-modulated Markov chains as a model for representing SSRV processes, generalizing existing theory to any model of rate change. We propose a fast algorithm for diagonalizing the generator matrix of relevant Markov-modulated Markov processes. This algorithm makes phylogeny likelihood calculation tractable even for a large number of rate classes and a large number of states, so that SSRV models become applicable to amino acid or codon sequence datasets. Using this algorithm, we investigate the accuracy of the discrete approximation to the Gamma distribution of evolutionary rates, widely used in molecular phylogeny. We show that a relatively large number of classes is required to achieve accurate approximation of the exact likelihood when the number of analyzed sequences exceeds 20, both under the SSRV and among site rate variation (ASRV) models.
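As a rough illustration of what a Markov-modulated Markov chain looks like, the joint process over (rate class, character state) pairs is commonly written with generator G = diag(r) ⊗ Q + R ⊗ I, where Q is the substitution generator, r the per-class rate multipliers and R the generator for switching between rate classes. The sketch below only assembles G for a toy two-class, Jukes-Cantor case; it does not implement the fast diagonalization algorithm of the paper, and the function name, rates and switching values are assumptions chosen for illustration.

```python
def modulated_generator(rates, switch, subst):
    """Build the generator on (class, state) pairs, indexed as class * n + state.

    rates  : rate multiplier r_i for each rate class
    switch : k x k generator R for moves between rate classes
    subst  : n x n substitution generator Q
    """
    k, n = len(rates), len(subst)
    G = [[0.0] * (k * n) for _ in range(k * n)]
    for i in range(k):
        for x in range(n):
            row = i * n + x
            # Substitutions within class i, scaled by that class's rate.
            for y in range(n):
                G[row][i * n + y] += rates[i] * subst[x][y]
            # Switches between rate classes leave the character state unchanged.
            for j in range(k):
                G[row][j * n + x] += switch[i][j]
    return G


# Toy inputs (assumed values): Jukes-Cantor Q, an "off" and an "on" rate class.
JC = [[-1.0 if x == y else 1.0 / 3.0 for y in range(4)] for x in range(4)]
SWITCH = [[-0.5, 0.5], [0.5, -0.5]]
RATES = [0.0, 2.0]

G = modulated_generator(RATES, SWITCH, JC)
print(len(G), "x", len(G[0]))
```

Because both Q and R have rows summing to zero, every row of G also sums to zero, which is a quick sanity check on the construction.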
ACTG: novel peptide mapping onto gene models.

PubMed

Choi, Seunghyuk; Kim, Hyunwoo; Paek, Eunok

2017-04-15

In many proteogenomic applications, mapping peptide sequences onto genome sequences can be very useful, because it allows us to understand the origins of the gene products. Existing software tools either take the genomic position of a peptide start site as an input or assume that the peptide sequence exactly matches the coding sequence of a given gene model. In the case of novel peptides resulting from genomic variations, especially structural variations such as alternative splicing, these existing tools cannot be directly applied unless users supply information about the variant, either its genomic position or its transcription model. Mapping potentially novel peptides to genome sequences, while allowing certain genomic variations, requires introducing novel gene models when aligning peptide sequences to gene structures. We have developed a new tool called ACTG (Amino aCids To Genome), which maps peptides to the genome, assuming all possible single exon skipping, junction variation allowing three edit distances from the original splice sites, exon extension and frame shift. In addition, it can also consider SNVs (single nucleotide variations) during the mapping phase if a user provides a VCF (variant call format) file as an input. Availability: http://prix.hanyang.ac.kr/ACTG/search.jsp. Contact: eunokpaek@hanyang.ac.kr. Supplementary data are available at Bioinformatics online. © The Author 2016. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com
alpha-Lactalbumin species variation, HAMLET formation, and tumor cell death.

PubMed

Pettersson, Jenny; Mossberg, Ann-Kristin; Svanborg, Catharina

2006-06-23

HAMLET (human alpha-lactalbumin made lethal to tumor cells) is a tumoricidal complex of apo alpha-lactalbumin and oleic acid, formed in casein after low pH treatment of human milk. This study examined if HAMLET-like complexes are present in casein from different species and if isolated alpha-lactalbumin from those species can form such complexes with oleic acid. Casein from human, bovine, equine, and porcine milk was separated by ion exchange chromatography and active complexes were only found in human casein. This was not explained by alpha-lactalbumin sequence variation, as purified bovine, equine, porcine, and caprine alpha-lactalbumins formed complexes with oleic acid with biological activity similar to HAMLET. We conclude that structural variation of alpha-lactalbumins does not preclude the formation of HAMLET-like complexes and that natural HAMLET formation in casein was unique to human milk, which also showed the highest oleic acid content.
Genome sequences of five Lactobacillus sp. isolates from traditional Turkish sourdough

USDA-ARS's Scientific Manuscript database

A high level of variation in microflora can be observed in lactic acid bacteria (LAB) profiles of sourdoughs. Here, we present draft genome sequences of Lactobacillus reuteri E81, L. reuteri LR5A, L. rhamnosus LR2, L. plantarum PFC-311 and a novel Lactobacillus sp. PFC-70 isolated from traditional T...
Terminal region sequence variations in variola virus DNA.

PubMed

Massung, R F; Loparev, V N; Knight, J C; Totmenin, A V; Chizhikov, V E; Parsons, J M; Safronov, P F; Gutorov, V V; Shchelkunov, S N; Esposito, J J

1996-07-15

Genome DNA terminal region sequences were determined for a Brazilian alastrim variola minor virus strain Garcia-1966 that was associated with a 0.8% case-fatality rate and African smallpox strains Congo-1970 and Somalia-1977 associated with variola major (9.6%) and minor (0.4%) mortality rates, respectively. A base sequence identity of ≥ 98.8% was determined after aligning 30 kb of the left- or right-end region sequences with cognate sequences previously determined for Asian variola major strains India-1967 (31% death rate) and Bangladesh-1975 (18.5% death rate). The deduced amino acid sequences of putative proteins of ≥ 65 amino acids also showed relatively high identity, although the Asian and African viruses were clearly more related to each other than to alastrim virus. Alastrim virus contained only 10 of 70 proteins that were 100% identical to homologs in Asian strains, and 7 alastrim-specific proteins were noted.
Two missense mutations in melanocortin 1 receptor (MC1R) are strongly associated with dark ventral coat color in reindeer (Rangifer tarandus).

PubMed

Våge, D I; Nieminen, M; Anderson, D G; Røed, K H

2014-10-01

The protein-coding region of melanocortin 1 receptor (MC1R) was sequenced to identify potential variation affecting coat color in reindeer (Rangifer tarandus). A T→C sequence variation at nucleotide position 218 (c.218T>C) causing an amino acid (aa) change from methionine to threonine at aa position 73 (p.Met73Thr) was identified. In addition, a T→G sequence variation was found at nucleotide position 839 (c.839T>G), causing phenylalanine to be replaced by cysteine at aa position 280 (p.Phe280Cys). The two sequence variants (c.218C and c.839G) were found to be closely associated with a darker belly coat compared with animals not having any of these two variants. The amino acid change p.Met73Thr affects the same position as p.Met73Lys previously reported to give constitutive activation of MC1R in black sheep (Ovis aries), whereas p.Phe280Cys is identical to one of two variants previously reported to be associated with dark coat color in Arctic fox (Alopex lagopus), supporting that the two variants found in reindeer are functional. The complete absence of Thr73 and Cys280 among the 51 wild reindeer analyzed provides some evidence that these variants are more common in the domestic herds. © 2014 Stichting International Foundation for Animal Genetics.
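The link between the coding-sequence variants and the protein changes named in this abstract can be reproduced with the standard genetic code. The following is a minimal sketch, not the authors' method: it assumes c. positions are counted from the A of the start codon as position 1, and because the reference codons are not given in the abstract it tries every codon of the reference amino acid that carries the stated reference base. The name `missense_effect` is hypothetical.

```python
BASES = "TCAG"
# Standard genetic code in TCAG order; one-letter amino acid codes, "*" = stop.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def missense_effect(cds_pos, ref_base, alt_base, ref_aa):
    """Return (codon number, set of amino acids produced by the substitution).

    cds_pos is a 1-based coding-sequence position; ref_aa is a one-letter code.
    """
    codon_number = (cds_pos - 1) // 3 + 1
    idx = (cds_pos - 1) % 3  # 0-based position of the base within its codon
    candidates = [
        codon for codon, aa in CODON_TABLE.items()
        if aa == ref_aa and codon[idx] == ref_base
    ]
    if not candidates:
        raise ValueError("no codon for ref_aa carries ref_base at that position")
    altered = {CODON_TABLE[c[:idx] + alt_base + c[idx + 1:]] for c in candidates}
    return codon_number, altered


# c.218T>C on a methionine codon and c.839T>G on a phenylalanine codon.
print(missense_effect(218, "T", "C", "M"))
print(missense_effect(839, "T", "G", "F"))
```

Here the one-letter results "T" and "C" denote threonine and cysteine, not nucleotides; the computed codon numbers are 73 and 280, in line with p.Met73Thr and p.Phe280Cys.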
[Genetic variation analysis of canine parvovirus VP2 gene in China].

PubMed

Yi, Li; Cheng, Shi-Peng; Yan, Xi-Jun; Wang, Jian-Ke; Luo, Bin

2009-11-01

To characterise the molecular biology, phylogenetic relationships and current prevalence of canine parvovirus (CPV), faecal samples from pet dogs with acute enteritis in the cities of Beijing, Wuhan, and Nanjing were collected between 2006 and 2008 and tested for CPV by PCR and other assays. PCR-RFLP analysis detected no CPV to FPV (MEV) variation in any sample. The complete ORFs of the VP2 genes were obtained by PCR from 15 clinical CPVs and 2 CPV vaccine strains. All amplicons were cloned and sequenced. Analysis of the VP2 sequences showed that the clinical CPVs all belong to the CPV-2a subtype and could be classified, by amino acid comparison, into a new cluster that contains the Tyr→Ile (324) mutation. The 2 CPV vaccine strains belong to the CPV-2 subtype, and both have scattered variation in amino acid residues of the VP2 protein. A phylogenetic tree based on the CPV VP2 sequence showed that these 15 clinical CPV strains were more closely related to Korean strain K001 than to earlier CPV-2a isolates from other countries, indicating that canine parvovirus genetic variation is associated to some degree with location and time. The survey of the CPV capsid protein VP2 gene provides useful information for the identification of CPV types and the understanding of their genetic relationships.
Dynamic Variation and Reversion in the Signature Amino Acids of H7N9 Virus During Human Infection.

PubMed

Zou, Xiaohui; Guo, Qiang; Zhang, Wei; Chen, Hui; Bai, Wei; Lu, Binghuai; Zhang, Wang; Fan, Yanyan; Liu, Chao; Wang, Yeming; Zhou, Fei; Cao, Bin

2018-04-24

Signature amino acids of H7N9 influenza virus play critical roles in human adaption and pathogenesis, but their dynamic variation is unknown during disease development. We sequentially collected respiratory samples from H7N9 patients at different timepoints and applied next-generation sequencing (NGS) to the whole genome of the H7N9 virus to investigate the variation at signature sites. A total of 11 patients were involved and from whom 29 samples were successfully sequenced, including samples from multiple timepoints in 9 patients. NA R292K, PB2 E627K, and D701N were the three most dynamic mutations. The oseltamivir resistance-related NA R292K mutation was present in 9 samples from 5 patients, including one sample obtained before antiviral therapy. In all patients with the NA 292K mutation, the oseltamivir-sensitive 292R genotype persisted and was not eliminated by antiviral treatment. The PB2 E627K substitution was present in 18 samples from 8 patients, among which 12 samples demonstrated a mixture of E/K and the 627K frequency exhibited dynamic variation. Dual D701N and E627K mutations emerged but failed to achieve predominance in any of the samples. Signature amino acids in PB2 and NA demonstrated high polymorphism and dynamic variation within individual patients during H7N9 virus infection.
Population genomic analysis of strain variation in Leptospirillum group II bacteria involved in acid mine drainage formation.

PubMed

Simmons, Sheri L; Dibartolo, Genevieve; Denef, Vincent J; Goltsman, Daniela S Aliaga; Thelen, Michael P; Banfield, Jillian F

2008-07-22

Deeply sampled community genomic (metagenomic) datasets enable comprehensive analysis of heterogeneity in natural microbial populations. In this study, we used sequence data obtained from the dominant member of a low-diversity natural chemoautotrophic microbial community to determine how coexisting closely related individuals differ from each other in terms of gene sequence and gene content, and to uncover evidence of evolutionary processes that occur over short timescales. DNA sequence obtained from an acid mine drainage biofilm was reconstructed, taking into account the effects of strain variation, to generate a nearly complete genome tiling path for a Leptospirillum group II species closely related to L. ferriphilum (sampling depth approximately 20x). The population is dominated by one sequence type, yet we detected evidence for relatively abundant variants (>99.5% sequence identity to the dominant type) at multiple loci, and a few rare variants. Blocks of other Leptospirillum group II types (approximately 94% sequence identity) have recombined into one or more variants. Variant blocks of both types are more numerous near the origin of replication. Heterogeneity in genetic potential within the population arises from localized variation in gene content, typically focused in integrated plasmid/phage-like regions. Some laterally transferred gene blocks encode physiologically important genes, including quorum-sensing genes of the LuxIR system. Overall, results suggest inter- and intrapopulation genetic exchange involving distinct parental genome types and implicate gain and loss of phage and plasmid genes in recent evolution of this Leptospirillum group II population. Population genetic analyses of single nucleotide polymorphisms indicate variation between closely related strains is not maintained by positive selection, suggesting that these regions do not represent adaptive differences between strains. 
Thus, the most likely explanation for the observed patterns of polymorphism is divergence of ancestral strains due to geographic isolation, followed by mixing and subsequent recombination.
Population Genomic Analysis of Strain Variation in Leptospirillum Group II Bacteria Involved in Acid Mine Drainage Formation

PubMed Central

Denef, Vincent J; Goltsman, Daniela S. Aliaga; Thelen, Michael P; Banfield, Jillian F

2008-01-01

Deeply sampled community genomic (metagenomic) datasets enable comprehensive analysis of heterogeneity in natural microbial populations. In this study, we used sequence data obtained from the dominant member of a low-diversity natural chemoautotrophic microbial community to determine how coexisting closely related individuals differ from each other in terms of gene sequence and gene content, and to uncover evidence of evolutionary processes that occur over short timescales. DNA sequence obtained from an acid mine drainage biofilm was reconstructed, taking into account the effects of strain variation, to generate a nearly complete genome tiling path for a Leptospirillum group II species closely related to L. ferriphilum (sampling depth ∼20×). The population is dominated by one sequence type, yet we detected evidence for relatively abundant variants (>99.5% sequence identity to the dominant type) at multiple loci, and a few rare variants. Blocks of other Leptospirillum group II types (∼94% sequence identity) have recombined into one or more variants. Variant blocks of both types are more numerous near the origin of replication. Heterogeneity in genetic potential within the population arises from localized variation in gene content, typically focused in integrated plasmid/phage-like regions. Some laterally transferred gene blocks encode physiologically important genes, including quorum-sensing genes of the LuxIR system. Overall, results suggest inter- and intrapopulation genetic exchange involving distinct parental genome types and implicate gain and loss of phage and plasmid genes in recent evolution of this Leptospirillum group II population. Population genetic analyses of single nucleotide polymorphisms indicate variation between closely related strains is not maintained by positive selection, suggesting that these regions do not represent adaptive differences between strains. 
Thus, the most likely explanation for the observed patterns of polymorphism is divergence of ancestral strains due to geographic isolation, followed by mixing and subsequent recombination. PMID:18651792

Integrating mRNA and Protein Sequencing Enables the Detection and Quantitative Profiling of Natural Protein Sequence Variants of Populus trichocarpa.

PubMed

Abraham, Paul E; Wang, Xiaojing; Ranjan, Priya; Nookaew, Intawat; Zhang, Bing; Tuskan, Gerald A; Hettich, Robert L

2015-12-04

Next-generation sequencing has transformed the ability to link genotypes to phenotypes and facilitates the dissection of genetic contribution to complex traits. However, it is challenging to link genetic variants with the perturbed functional effects on proteins encoded by such genes. Here we show how RNA sequencing can be exploited to construct genotype-specific protein sequence databases to assess natural variation in proteins, providing information about the molecular toolbox driving cellular processes. For this study, we used two natural genotypes selected from a recent genome-wide association study of Populus trichocarpa, an obligate outcrosser with tremendous phenotypic variation across the natural population. This strategy allowed us to comprehensively catalogue proteins containing single amino acid polymorphisms (SAAPs), as well as insertions and deletions. We profiled the frequency of 128 types of naturally occurring amino acid substitutions, including both expected (neutral) and unexpected (non-neutral) SAAPs, with a subset occurring in regions of the genome having strong polymorphism patterns consistent with recent positive and/or divergent selection. By zeroing in on the molecular signatures of these important regions that might have previously been uncharacterized, we now provide a high-resolution molecular inventory that should improve accessibility and subsequent identification of natural protein variants in future genotype-to-phenotype studies.
Microbial Diversity of Acidic Hot Spring (Kawah Hujan B) in Geothermal Field of Kamojang Area, West Java-Indonesia

PubMed Central

Aditiawati, Pingkan; Yohandini, Heni; Madayanti, Fida; Akhmaloka

2009-01-01

Microbial communities in an acidic hot spring, namely Kawah Hujan B, at Kamojang geothermal field, West Java-Indonesia, were examined using culture-dependent and culture-independent strategies. Chemical analysis of the hot spring water showed characteristics of acidic-sulfate geothermal activity, with high sulfate concentrations and low pH values (pH 1.8 to 1.9). The microbial community present in the spring was characterized by 16S rRNA gene analysis combined with denaturing gradient gel electrophoresis (DGGE). The majority of the sequences recovered by the culture-independent method were closely related to the Crenarchaeota and Proteobacteria phyla. However, detailed comparison among the Crenarchaeota members showed some sequence variation relative to published data, especially in the hypervariable and variable regions. In addition, the sequences did not belong to any particular genus. Meanwhile, the 16S rDNA sequences from culture-dependent samples were mostly close to Firmicutes and gamma Proteobacteria. PMID:19440252
Nucleotide and amino acid variations of tannase gene from different Aspergillus strains.

PubMed

Borrego-Terrazas, J A; Lara-Victoriano, F; Flores-Gallegos, A C; Veana, F; Aguilar, C N; Rodríguez-Herrera, R

2014-08-01

Tannase is an enzyme that catalyses the hydrolysis of ester bonds present in tannins. Most of the scientific reports about this biocatalysis focus on aspects related to tannase production and its recovery; on the other hand, reports assessing the molecular aspects of the tannase gene or protein are scarce. In the present study, a tannase gene fragment from several Aspergillus strains isolated from the Mexican semidesert was sequenced and compared with tannase amino acid sequences reported in NCBI database using bioinformatics tools. The genetic relationship among the different tannase sequences was also determined. A conserved region of 7 amino acids was found with the conserved motif GXSXG common to esterases, in which the active-site serine residue is located. In addition, in Aspergillus niger strains GH1 and PSH, we found an extra codon in the tannase sequences encoding glycine. The tannase gene belonging to semidesert fungal strains followed a neutral evolution path with the formation of 10 haplotypes, of which A. niger GH1 and PSH haplotypes are the oldest.
Microbial diversity of acidic hot spring (kawah hujan B) in geothermal field of kamojang area, west java-indonesia.

PubMed

Aditiawati, Pingkan; Yohandini, Heni; Madayanti, Fida; Akhmaloka

2009-01-01

Microbial communities in an acidic hot spring, namely Kawah Hujan B, at Kamojang geothermal field, West Java-Indonesia, were examined using culture-dependent and culture-independent strategies. Chemical analysis of the hot spring water showed characteristics of acidic-sulfate geothermal activity, with high sulfate concentrations and low pH values (pH 1.8 to 1.9). The microbial community present in the spring was characterized by 16S rRNA gene analysis combined with denaturing gradient gel electrophoresis (DGGE). The majority of the sequences recovered by the culture-independent method were closely related to the Crenarchaeota and Proteobacteria phyla. However, detailed comparison among the Crenarchaeota members showed some sequence variation relative to published data, especially in the hypervariable and variable regions. In addition, the sequences did not belong to any particular genus. Meanwhile, the 16S rDNA sequences from culture-dependent samples were mostly close to Firmicutes and gamma Proteobacteria.
Evolution of proteins.

NASA Technical Reports Server (NTRS)

Dayhoff, M. O.

1971-01-01

The amino acid sequences of proteins from living organisms are dealt with. The structure of proteins is first discussed; the variation in this structure from one biological group to another is illustrated by the first halves of the sequences of cytochrome c, and a phylogenetic tree is derived from the cytochrome c data. The relative geological times associated with the events of this tree are discussed. Errors which occur in the duplication of cells during the evolutionary process are examined. Particular attention is given to evolution of mutant proteins, globins, ferredoxin, and transfer ribonucleic acids (tRNA's). Finally, a general outline of biological evolution is presented.
Genotype-specific signal generation based on digestion of 3-way DNA junctions: application to KRAS variation detection.

PubMed

Amicarelli, Giulia; Adlerstein, Daniel; Shehi, Erlet; Wang, Fengfei; Makrigiorgos, G Mike

2006-10-01

Genotyping methods that reveal single-nucleotide differences are useful for a wide range of applications. We used digestion of 3-way DNA junctions in a novel technology, OneCutEventAmplificatioN (OCEAN) that allows sequence-specific signal generation and amplification. We combined OCEAN with peptide-nucleic-acid (PNA)-based variant enrichment to detect and simultaneously genotype v-Ki-ras2 Kirsten rat sarcoma viral oncogene homolog (KRAS) codon 12 sequence variants in human tissue specimens. We analyzed KRAS codon 12 sequence variants in 106 lung cancer surgical specimens. We conducted a PNA-PCR reaction that suppresses wild-type KRAS amplification and genotyped the product with a set of OCEAN reactions carried out in fluorescence microplate format. The isothermal OCEAN assay enabled a 3-way DNA junction to form between the specific target nucleic acid, a fluorescently labeled "amplifier", and an "anchor". The amplifier-anchor contact contains the recognition site for a restriction enzyme. Digestion produces a cleaved amplifier and generation of a fluorescent signal. The cleaved amplifier dissociates from the 3-way DNA junction, allowing a new amplifier to bind and propagate the reaction. The system detected and genotyped KRAS sequence variants down to approximately 0.3% variant-to-wild-type alleles. PNA-PCR/OCEAN had a concordance rate with PNA-PCR/sequencing of 93% to 98%, depending on the exact implementation. Concordance rate with restriction endonuclease-mediated selective-PCR/sequencing was 89%. OCEAN is a practical and low-cost novel technology for sequence-specific signal generation. Reliable analysis of KRAS sequence alterations in human specimens circumvents the requirement for sequencing. Application is expected in genotyping KRAS codon 12 sequence variants in surgical specimens or in bodily fluids, as well as single-base variations and sequence alterations in other genes.
Diversity and evolutionary patterns of immune genes in free-ranging Namibian leopards (Panthera pardus pardus).

PubMed

Castro-Prieto, Aines; Wachter, Bettina; Melzheimer, Joerg; Thalwitzer, Susanne; Sommer, Simone

2011-01-01

The genes of the major histocompatibility complex (MHC) are a key component of the mammalian immune system and have become important molecular markers for fitness-related genetic variation in wildlife populations. Currently, no information about the MHC sequence variation and constitution in African leopards exists. In this study, we isolated and characterized genetic variation at the adaptively most important region of MHC class I and MHC class II-DRB genes in 25 free-ranging African leopards from Namibia and investigated the mechanisms that generate and maintain MHC polymorphism in the species. Using single-stranded conformation polymorphism analysis and direct sequencing, we detected 6 MHC class I and 6 MHC class II-DRB sequences, which likely correspond to at least 3 MHC class I and 3 MHC class II-DRB loci. Amino acid sequence variation in both MHC classes was higher or similar in comparison to other reported felids. We found signatures of positive selection shaping the diversity of MHC class I and MHC class II-DRB loci during the evolutionary history of the species. A comparison of MHC class I and MHC class II-DRB sequences of the leopard to those of other felids revealed a trans-species mode of evolution. In addition, the evolutionary relationships of MHC class II-DRB sequences between African and Asian leopard subspecies are discussed.
Genotypic characterization of CRF01_AE env genes derived from human immunodeficiency virus type 1-infected patients residing in central Thailand.

PubMed

Utachee, Piraporn; Jinnopat, Piyamat; Isarangkura-Na-Ayuthaya, Panasda; de Silva, Udayanga Chandimal; Nakamura, Shota; Siripanyaphinyo, Uamporn; Wichukchinda, Nuanjun; Tokunaga, Kenzo; Yasunaga, Teruo; Sawanpanyalert, Pathom; Ikuta, Kazuyoshi; Auwanit, Wattana; Kameoka, Masanori

2009-02-01

CRF01_AE is a major subtype of human immunodeficiency virus type 1 (HIV-1) circulating in Southeast Asia, including Thailand. HIV-1 env genes were amplified by polymerase chain reaction from blood samples of HIV-1-infected patients residing in Thailand in 2006, and cloned into the pNL4-3-derived reporter viral construct. Generated envelope protein (Env)-recombinant virus was examined for its infectivity, and then 35 infectious CRF01_AE Env-recombinant viruses were selected. Sequencing analysis revealed that the interclone variation of the deduced amino acid sequences was higher in CRF01_AE env genes isolated in 2006 than in those isolated in the early 1990s, suggesting that env gene variation has been increasing gradually among CRF01_AE viruses prevalent in Thailand. We also examined the characteristics of the deduced amino acid sequences of 35 CRF01_AE env genes. Our results may provide useful information to help in better understanding the genotype of env genes of CRF01_AE viruses currently circulating in Thailand.
Mutations around interferon sensitivity-determining region: a pilot resistance report of hepatitis C virus 1b in a Hong Kong population.

PubMed

Zhou, Xiao-Ming; Chan, Paul Ks; Tam, John S

2011-12-28

To explore mutations around the interferon sensitivity-determining region (ISDR) which are associated with the resistance of hepatitis C virus 1b (HCV-1b) to interferon-α treatment. Thirty-seven HCV-1b samples were obtained from Hong Kong patients who had completed the combined interferon-α/ribavirin treatment for more than one year with available response data. Nineteen of them were sustained virological responders, while 18 were non-responders. The amino acid sequences of the extended ISDR (eISDR) covering 64 amino acids upstream and 67 amino acids downstream from the previously reported ISDR were analyzed. One amino acid variation (I2268V, P = 0.023) was significantly correlated with treatment outcome in this pilot study with a limited number of patients, while two amino acid variations (R2260H, P = 0.05 and S2278T, P = 0.05) were weakly associated with treatment outcome. The extent of amino acid variations within the ISDR or eISDR was not correlated with treatment outcome as previously reported. Three amino acid mutations near but outside of ISDR may associate with interferon treatment resistance of HCV-1b patients in Hong Kong.
Molecular phylogeny of Coxsackievirus A16 in Shenzhen, China, from 2005 to 2009.

PubMed

Zong, Wenping; He, Yaqing; Yu, Shouyi; Yang, Hong; Xian, Huixia; Liao, Yuxue; Hu, Guifang

2011-04-01

Phylogenetic analysis of a Coxsackievirus A16 (CA16) sequence from Shenzhen, China, and other Chinese and international CA16 sequences revealed a pattern of endemic cocirculation of strains of clusters B2a and B2b within subtype B2 viruses. Amino acid evolution and nucleotide variation in the VP1 region were slight for 5 years.
Does the Genetic Code Have A Eukaryotic Origin?

PubMed Central

Zhang, Zhang; Yu, Jun

2013-01-01

In the RNA world, RNA is assumed to be the dominant macromolecule performing most, if not all, core “house-keeping” functions. The ribo-cell hypothesis suggests that the genetic code and the translation machinery may both be born of the RNA world, and the introduction of DNA to ribo-cells may take over the informational role of RNA gradually, such as a mature set of genetic code and mechanism enabling stable inheritance of sequence and its variation. In this context, we modeled the genetic code in two content variables (GC and purine contents) of protein-coding sequences and measured the purine content sensitivities for each codon when the sensitivity (% usage) is plotted as a function of GC content variation. The analysis leads to a new pattern, the symmetric pattern, where the sensitivity of purine content variation shows diagonal symmetry in the codon table, more significantly in the two GC content invariable quarters, in addition to the two existing patterns where the table is divided into either four GC content sensitivity quarters or two amino acid diversity halves. The most insensitive codon sets are GUN (valine) and CAN (CAR for asparagine and CAY for aspartic acid) and the most biased amino acid is valine (always over-estimated) followed by alanine (always under-estimated). The unique position of valine and its codons suggests its key roles in the final recruitment of the complete codon set of the canonical table. The distinct choice may only be attributable to sequence signatures or signals of splice sites for spliceosomal introns shared by all extant eukaryotes. PMID:23402863
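As a rough illustration of the two content variables named in this abstract, the sketch below computes the GC and purine fractions of a coding sequence. The function name `base_fractions` and the handling of ambiguous bases are assumptions made for the example, not taken from the paper.

```python
def base_fractions(cds: str) -> tuple[float, float]:
    """Return (GC fraction, purine fraction) of a nucleotide sequence.

    Assumption for this sketch: RNA 'U' is treated as 'T', and any
    character other than A, C, G, T (for example 'N') is ignored.
    """
    seq = cds.upper().replace("U", "T")
    counted = [base for base in seq if base in "ACGT"]
    if not counted:
        raise ValueError("sequence contains no unambiguous bases")
    total = len(counted)
    gc = sum(base in "GC" for base in counted) / total      # G + C
    purine = sum(base in "AG" for base in counted) / total  # A + G
    return gc, purine


if __name__ == "__main__":
    gc, purine = base_fractions("ATGGTTGCACAT")
    print(f"GC = {gc:.3f}, purine = {purine:.3f}")
```

Per-codon usage as a function of GC content, as analyzed in the abstract, would need a collection of genomes; this sketch covers only the per-sequence content calculation.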
Role of DNA conformation & energetic insights in Msx-1-DNA recognition as revealed by molecular dynamics studies on specific and nonspecific complexes.

PubMed

Kachhap, Sangita; Singh, Balvinder

2015-01-01

In most homeodomain-DNA complexes, glutamine or lysine is present at the 50th position and interacts with the 5th and 6th nucleotides of the core recognition region. Molecular dynamics simulations of the Msx-1-DNA complex (Q50-TG) and its variant complexes, namely the specific complex (Q50K-CC) and the nonspecific complexes with a mutation in the DNA (Q50-CC) or in the protein (Q50K-TG), have been carried out. Analysis of protein-DNA interactions and DNA structure in the specific and nonspecific complexes shows that amino acid residues use the sequence-dependent shape of DNA to interact. The binding free energies of all four complexes were analysed to define the role of the amino acid residue at the 50th position in binding strength, considering the effect of DNA variation on the stability of protein-DNA complexes. The order of stability of protein-DNA complexes shows that specific complexes are more stable than nonspecific ones. Decomposition analysis shows that N-terminal amino acid residues contribute most to the binding free energy of protein-DNA complexes. Among the specific protein-DNA complexes, K50 contributes more than Q50 to the binding free energy in the respective complexes. The sequence dependence of the local conformation of DNA enables Q50/Q50K to form hydrogen bonds with nucleotide(s) of DNA. The changes in the amino acid sequence of the protein are accommodated and stabilized around the TAAT core region of DNA having variation in nucleotides.
Hydroxamic acids as weak base indicators: protonation in strong acid media.

PubMed

García, B; Ibeas, S; Hoyuelos, F J; Leal, J M; Secco, F; Venturini, M

2001-11-30

The protonation equilibria of N-phenylbenzohydroxamic, benzohydroxamic, salicylhydroxamic, and N-p-tolylcinnamohydroxamic acids have been studied at 25 °C in concentrated sulfuric, hydrochloric, and perchloric acid media; the UV-vis spectral measurements were analyzed using the Hammett equation and the Bunnett-Olsen and excess acidity methods. The medium effects observed in the UV spectral curves were corrected with the Cox-Yates and vector analysis methods. The H(A) acidity function based on benzamides provided the best results. The range of variation of the solvation coefficient m is similar to that of amides, indicating similar solvation requirements for amides and hydroxamic acids. For the same substrate, the observed variations of pK(BH+) with the mineral acid used were justified by formation of solvent-separated ion pairs; for the same mineral acid, the observed changes in pK(BH+) can be explained by the solvation of BH+. The change of the pK(BH+) values was in reasonably good agreement with the sequence of the catalytic efficiency of the mineral acids used, HCl > H2SO4 > HClO4.
[Sequencing and analysis of complete genome of rabies viruses isolated from Chinese Ferret-Badger and dog in Zhejiang province].

PubMed

Lei, Yong-Liang; Wang, Xiao-Guang; Tao, Xiao-Yan; Li, Hao; Meng, Sheng-Li; Chen, Xiu-Ying; Liu, Fu-Ming; Ye, Bi-Feng; Tang, Qing

2010-01-01

Based on the full-length genome sequences of four rabies virus isolates from Chinese Ferret-Badger and dog, we analyzed the genetic variation of rabies viruses at the molecular level, gathered information on rabies virus prevalence and variation in Zhejiang, and enriched the genome database of rabies virus street strains isolated from China. Rabies viruses were isolated in suckling mice, overlapping fragments were amplified by RT-PCR, and full-length genomes were assembled; nucleotide and deduced protein similarities were analyzed, and phylogenetic relationships with strains from Chinese Ferret-Badger, dog, sika deer, vole, and vaccine strains were determined. The four full-length genomes were sequenced completely and had the same genetic structure, with a length of 11,923 nts or 11,925 nts, including a 58-nt leader, 1353-nt NP, 894-nt PP, 609-nt MP, 1575-nt GP, 6386-nt LP, intergenic regions (IGRs) of 2, 5, and 5 nts, a 423-nt pseudogene-like sequence (psi), and a 70-nt trailer. The four full-length genomes were in accordance with the properties of Rhabdoviridae Lyssavirus by BLAST and multi-sequence alignment. The nucleotide and amino acid sequences among Chinese strains had the highest similarity, especially among animals of the same species. Among the four full-length genomes, similarity at the amino acid level was markedly higher than at the nucleotide level, indicating that most nucleotide mutations in these four genomes were synonymous. Compared with the reference rabies viruses, the lengths of the five protein-coding regions were unchanged and no recombination was found, with only a few point mutations. It was evident that the five proteins appeared to be stable. The variation sites and types of the four genomes were similar to those of the reference vaccine or street strains. The four strains were genotype 1 according to the multi-sequence and phylogenetic analyses, and showed distinct regional characteristics of China. Therefore, these four rabies viruses are likely to be street viruses already existing in the natural world.
Genetic variation and dynamics of infections of equid herpesvirus 5 in individual horses.

PubMed

Back, Helena; Ullman, Karin; Leijon, Mikael; Söderlund, Robert; Penell, Johanna; Ståhl, Karl; Pringle, John; Valarcher, Jean-François

2016-01-01

Equid herpesvirus 5 (EHV-5) is related to the human Epstein-Barr virus (human herpesvirus 4) and has frequently been observed in equine populations worldwide. EHV-5 was previously assumed to be low to non-pathogenic; however, studies have also related the virus to the severe lung disease equine multinodular pulmonary fibrosis (EMPF). Genetic information of EHV-5 is scanty: the whole genome was recently described and only limited nucleotide sequences are available. In this study, samples were taken twice 1 year apart from eight healthy horses at the same professional training yard and samples from a ninth horse that was diagnosed with EMPF with samples taken pre- and post-mortem to analyse partial glycoprotein B (gB) gene of EHV-5 by using next-generation sequencing. The analysis resulted in 27 partial gB gene sequences, 11 unique sequence types and five amino acid sequences. These sequences could be classified within four genotypes (I-IV) of the EHV-5 gB gene based on the degree of similarity of the nucleotide and amino acid sequences, and in this work horses were shown to be identified with up to three different genotypes simultaneously. The observations showed a range of interactions between EHV-5 and the host over time, where the same virus persists in some horses, whereas others have a more dynamic infection pattern including strains from different genotypes. This study provides insight into the genetic variation and dynamics of EHV-5, and highlights that further work is needed to understand the EHV-5 interaction with its host.
Sequence dependent aggregation of peptides and fibril formation

NASA Astrophysics Data System (ADS)

Hung, Nguyen Ba; Le, Duy-Manh; Hoang, Trinh X.

2017-09-01

Deciphering the links between amino acid sequence and amyloid fibril formation is key for understanding protein misfolding diseases. Here we use Monte Carlo simulations to study the aggregation of short peptides in a coarse-grained model with hydrophobic-polar (HP) amino acid sequences and correlated side chain orientations for hydrophobic contacts. A significant heterogeneity is observed in the aggregate structures and in the thermodynamics of aggregation for systems of different HP sequences and different numbers of peptides. Fibril-like ordered aggregates are found for several sequences that contain the common HPH pattern, while other sequences may form helix bundles or disordered aggregates. A wide variation of the aggregation transition temperatures among sequences, even among those of the same hydrophobic fraction, indicates that not all sequences undergo aggregation at a presumable physiological temperature. The transition is found to be the most cooperative for sequences forming fibril-like structures. For a fibril-prone sequence, it is shown that fibril formation follows the nucleation and growth mechanism. Interestingly, a binary mixture of peptides of an aggregation-prone and a non-aggregation-prone sequence shows the association and conversion of the latter to the fibrillar structure. Our study highlights the role of a sequence in selecting fibril-like aggregates and also the impact of a structural template on fibril formation by peptides of unrelated sequences.
Automated design evolution of stereochemically randomized protein foldamers

NASA Astrophysics Data System (ADS)

Ranbhor, Ranjit; Kumar, Anil; Patel, Kirti; Ramakrishnan, Vibin; Durani, Susheel

2018-05-01

Diversification of chain stereochemistry opens up the possibilities of an ‘in principle’ increase in the design space of proteins. This huge increase in the sequence and consequent structural variation is aimed at the generation of smart materials. To diversify protein structure stereochemically, we introduced L- and D-α-amino acids as the design alphabet. With a sequence design algorithm, we explored the usage of specific variables such as chirality and the sequence of this alphabet in independent steps. With molecular dynamics, we folded stereochemically diverse homopolypeptides and evaluated their ‘fitness’ for possible design as protein-like foldamers. We propose a fitness function to prune the most optimal fold among 1000 structures simulated with an automated repetitive simulated annealing molecular dynamics (AR-SAMD) approach. The highly scored poly-leucine fold with sequence lengths of 24 and 30 amino acids were later sequence-optimized using a Dead End Elimination cum Monte Carlo based optimization tool. This paper demonstrates a novel approach for the de novo design of protein-like foldamers.
Analyses of natural variation indicates that the absence of RPS4/RRS1 and amino acid change in RPS4 cause loss of their functions and resistance to pathogens.

PubMed

Narusaka, Mari; Iuchi, Satoshi; Narusaka, Yoshihiro

2017-03-04

A pair of Arabidopsis thaliana resistance proteins, RPS4 and RRS1, recognizes the cognate Avr effector from the bacterial pathogens Pseudomonas syringae pv. tomato expressing avrRps4 (Pst-avrRps4), Ralstonia solanacearum, and the fungal pathogen Colletotrichum higginsianum and leads to defense signaling activation against the pathogens. In the present study, we analyzed 14 A. thaliana accessions for natural variation in Pst-avrRps4 and C. higginsianum susceptibility, and found new compatible and incompatible Arabidopsis-pathogen interactions. We first found that A. thaliana accession Cvi-0 is susceptible to Pst-avrRps4. Interestingly, the genome sequence assembly indicated that Cvi-0 lost both RPS4 and RRS1, but not RPS4B and RRS1B, compared to the reference genome sequence from A. thaliana accession Col-0. On the other hand, the natural variation analysis of RPS4 alleles from various Arabidopsis accessions revealed that one amino-acid change, Y950H, is responsible for the loss of resistance to Pst-avrRps4 and C. higginsianum in RLD-0. Our data indicate that the amino acid change, Y950H, in RPS4 resulted in the loss of both RPS4 and RRS1 functions and resistance to pathogens.
Mutations around interferon sensitivity-determining region: A pilot resistance report of hepatitis C virus 1b in a Hong Kong population

PubMed Central

Zhou, Xiao-Ming; Chan, Paul KS; Tam, John S

2011-01-01

AIM: To explore mutations around the interferon sensitivity-determining region (ISDR) which are associated with the resistance of hepatitis C virus 1b (HCV-1b) to interferon-α treatment. METHODS: Thirty-seven HCV-1b samples were obtained from Hong Kong patients who had completed the combined interferon-α/ribavirin treatment for more than one year with available response data. Nineteen of them were sustained virological responders, while 18 were non-responders. The amino acid sequences of the extended ISDR (eISDR) covering 64 amino acids upstream and 67 amino acids downstream from the previously reported ISDR were analyzed. RESULTS: One amino acid variation (I2268V, P = 0.023) was significantly correlated with treatment outcome in this pilot study with a limited number of patients, while two amino acid variations (R2260H, P = 0.05 and S2278T, P = 0.05) were weakly associated with treatment outcome. The extent of amino acid variations within the ISDR or eISDR was not correlated with treatment outcome as previously reported. CONCLUSION: Three amino acid mutations near but outside of ISDR may associate with interferon treatment resistance of HCV-1b patients in Hong Kong. PMID:22219602
A natural mutation-led truncation in one of the two aluminum-activated malate transporter-like genes at the Ma locus is associated with low fruit acidity in apple.

PubMed

Bai, Yang; Dougherty, Laura; Li, Mingjun; Fazio, Gennaro; Cheng, Lailiang; Xu, Kenong

2012-08-01

Acidity levels greatly affect the taste and flavor of fruit, and consequently its market value. In mature apple fruit, malic acid is the predominant organic acid. Several studies have confirmed that the major quantitative trait locus Ma largely controls the variation of fruit acidity levels. The Ma locus has recently been defined in a region of 150 kb that contains 44 predicted genes on chromosome 16 in the Golden Delicious genome. In this study, we identified two aluminum-activated malate transporter-like genes, designated Ma1 and Ma2, as strong candidates of Ma by narrowing down the Ma locus to 65-82 kb containing 12-19 predicted genes depending on the haplotypes. The Ma haplotypes were determined by sequencing two bacterial artificial chromosome clones from G.41 (an apple rootstock of genotype Mama) that cover the two distinct haplotypes at the Ma locus. Gene expression profiling in 18 apple germplasm accessions suggested that Ma1 is the major determinant at the Ma locus controlling fruit acidity as Ma1 is expressed at a much higher level than Ma2 and the Ma1 expression is significantly correlated with fruit titratable acidity (R² = 0.4543, P = 0.0021). In the coding sequences of low acidity alleles of Ma1 and Ma2, sequence variations at the amino acid level between Golden Delicious and G.41 were not detected. But the alleles for high acidity vary considerably between the two genotypes. The low acidity allele of Ma1, Ma1-1455A, is mainly characterized by a mutation at base 1455 in the open reading frame. The mutation leads to a premature stop codon that truncates the carboxyl terminus of Ma1-1455A by 84 amino acids compared with Ma1-1455G. A survey of 29 apple germplasm accessions using marker CAPS(1455) that targets the SNP(1455) in Ma1 showed that the CAPS(1455A) allele was associated completely with high pH and highly with low titratable acidity, suggesting that the natural mutation-led truncation is most likely responsible for the abolished function of Ma for low pH or high acidity in apple.
Diversity of Pneumolysin and Pneumococcal Histidine Triad Protein D of Streptococcus pneumoniae Isolated from Invasive Diseases in Korean Children.

PubMed

Yun, Ki Wook; Lee, Hyunju; Choi, Eun Hwa; Lee, Hoan Jong

2015-01-01

Pneumolysin (Ply) and pneumococcal histidine triad protein D (PhtD) are candidate proteins for a next-generation pneumococcal vaccine. We aimed to analyze the genetic diversity and antigenic heterogeneity of Ply and PhtD for 173 pneumococci isolated from invasive diseases in Korean children. Allele was designated based on the variation of amino acid sequence. Antigenicity was predicted by the amino acid hydrophobicity of the region. There were seven and 39 allele types for the ply and phtD genes, respectively. The nucleotide sequence identity was 97.2%-99.9% for ply and 91.4%-98.0% for phtD gene. Only minor variations in hydrophobicity were noted among the antigenicity plots of Ply and PhtD. Overall, the allele types of the ply and phtD genes were remarkably homogeneous, and the antigenic diversity of the corresponding proteins was very limited. The Ply and PhtD could be useful antigens for universal pneumococcal vaccines.
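The nucleotide identity ranges quoted in this abstract are pairwise percent identities. A minimal sketch of that calculation is given below, assuming the two sequences are already aligned to the same length and that columns containing a gap character ('-') are excluded. Identity definitions differ between tools, and the abstract does not state which one was used, so `percent_identity` is only an illustrative helper.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption for this sketch: gap-containing columns are skipped, so
    the denominator is the number of ungapped aligned positions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x != "-" and y != "-"
    ]
    if not columns:
        raise ValueError("no ungapped columns to compare")
    matches = sum(x == y for x, y in columns)
    return 100.0 * matches / len(columns)


if __name__ == "__main__":
    print(percent_identity("ACGTACGT", "ACGTACGA"))
```

Applying the same function to aligned protein sequences gives amino acid identity, which is the level at which the abstract defines allele types.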
Comparative genomics of citric-acid producing Aspergillus niger ATCC 1015 versus enzyme-producing CBS 513.88

DOE Office of Scientific and Technical Information (OSTI.GOV)

Andersen, Mikael R.; Salazar, Margarita; Schaap, Peter

2011-06-01

The filamentous fungus Aspergillus niger exhibits great diversity in its phenotype. It is found globally, both as marine and terrestrial strains, produces both organic acids and hydrolytic enzymes in high amounts, and some isolates exhibit pathogenicity. Although the genome of an industrial enzyme-producing A. niger strain (CBS 513.88) has already been sequenced, the versatility and diversity of this species compels additional exploration. We therefore undertook whole genome sequencing of the acidogenic A. niger wild type strain (ATCC 1015), and produced a genome sequence of very high quality. Only 15 gaps are present in the sequence and half the telomeric regions have been elucidated. Moreover, sequence information from ATCC 1015 was utilized to improve the genome sequence of CBS 513.88. Chromosome-level comparisons uncovered several genome rearrangements, deletions, a clear case of strain-specific horizontal gene transfer, and identification of 0.8 megabase of novel sequence. Single nucleotide polymorphisms per kilobase (SNPs/kb) between the two strains were found to be exceptionally high (average: 7.8, maximum: 160 SNPs/kb). High variation within the species was confirmed with exo-metabolite profiling and phylogenetics. Detailed lists of alleles were generated, and genotypic differences were observed to accumulate in metabolic pathways essential to acid production and protein synthesis. A transcriptome analysis revealed up-regulation of the electron transport chain, specifically the alternative oxidative pathway in ATCC 1015, while CBS 513.88 showed significant up regulation of genes associated with biosynthesis of amino acids that are abundant in glucoamylase A, tRNA-synthases and protein transporters.
Comparison of theoretical proteomes: identification of COGs with conserved and variable pI within the multimodal pI distribution.

PubMed

Nandi, Soumyadeep; Mehra, Nipun; Lynn, Andrew M; Bhattacharya, Alok

2005-09-09

Theoretical proteome analysis, generated by plotting theoretical isoelectric points (pI) against molecular masses of all proteins encoded by the genome, shows a multimodal distribution for pI. This multimodal distribution is an effect of allowed combinations of the charged amino acids, and not due to evolutionary causes. The variation in this distribution can be correlated to the organism's ecological niche. Contributions to this variation may be mapped to individual proteins by studying the variation in pI of orthologs across microorganism genomes. The distribution of ortholog pI values showed trimodal distributions for all prokaryotic genomes analyzed, similar to whole proteome plots. Pairwise analysis of pI variation shows that a few COGs are conserved within, but most vary between, the acidic and basic regions of the distribution, while molecular mass is more highly conserved. At the level of functional grouping of orthologs, five groups vary significantly from the population of orthologs, which is attributed to either conservation at the level of sequences or a bias for either positively or negatively charged residues contributing to the function. Individual COGs conserved in both the acidic and basic regions of the trimodal distribution are identified, and orthologs that best represent the variation in levels of the acidic and basic regions are listed. The analysis of pI distribution by using orthologs provides a basis for resolution of theoretical proteome comparison at the level of individual proteins. Orthologs identified that significantly vary between the major acidic and basic regions may be used as representative of the variation of the entire proteome.
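A theoretical pI of the kind plotted in this study is usually obtained by finding the pH at which the Henderson-Hasselbalch net charge of the sequence is zero. The sketch below does this by bisection. The pKa values are one commonly used illustrative set and are an assumption here, since the abstract does not say which pKa table or program the authors used.

```python
# Illustrative pKa values (assumed for this sketch; published tables differ).
POSITIVE_PKA = {"K": 10.8, "R": 12.5, "H": 6.5}
NEGATIVE_PKA = {"D": 3.9, "E": 4.1, "C": 8.5, "Y": 10.1}
N_TERMINUS_PKA = 8.6
C_TERMINUS_PKA = 3.6


def net_charge(sequence: str, ph: float) -> float:
    """Net charge of a peptide at a given pH (Henderson-Hasselbalch)."""
    charge = 1.0 / (1.0 + 10 ** (ph - N_TERMINUS_PKA))   # free amino group
    charge -= 1.0 / (1.0 + 10 ** (C_TERMINUS_PKA - ph))  # free carboxyl group
    for residue in sequence.upper():
        if residue in POSITIVE_PKA:
            charge += 1.0 / (1.0 + 10 ** (ph - POSITIVE_PKA[residue]))
        elif residue in NEGATIVE_PKA:
            charge -= 1.0 / (1.0 + 10 ** (NEGATIVE_PKA[residue] - ph))
    return charge


def isoelectric_point(sequence: str, tolerance: float = 1e-4) -> float:
    """pH at which the net charge is zero, found by bisection on [0, 14]."""
    low, high = 0.0, 14.0
    while high - low > tolerance:
        mid = (low + high) / 2.0
        # Net charge decreases monotonically as pH rises.
        if net_charge(sequence, mid) > 0:
            low = mid
        else:
            high = mid
    return (low + high) / 2.0


if __name__ == "__main__":
    for peptide in ("DDEEDD", "KKRRKK", "MKTAYIAKQR"):
        print(peptide, round(isoelectric_point(peptide), 2))
```

Running this over every protein in a proteome and plotting the results against molecular mass would give the kind of distribution the abstract describes, although the exact values depend on the pKa table chosen.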
Integrating mRNA and protein sequencing enables the detection and quantitative profiling of natural protein sequence variants of Populus trichocarpa

DOE Office of Scientific and Technical Information (OSTI.GOV)

Abraham, Paul E.; Wang, Xiaojing; Ranjan, Priya

The availability of next-generation sequencing technologies has rapidly transformed our ability to link genotypes to phenotypes, and as such, promises to facilitate the dissection of genetic contribution to complex traits. Although discoveries of genetic associations will further our understanding of biology, once candidate variants have been identified, investigators are faced with the challenge of characterizing the functional effects on proteins encoded by such genes. Here we show how next-generation RNA sequencing data can be exploited to construct genotype-specific protein sequence databases, which provide a clearer picture of the molecular toolbox underlying cellular and organismal processes and their variation in a natural population. For this study, we used two individual genotypes (DENA-17-3 and VNDL-27-4) from a recent genome wide association (GWA) study of Populus trichocarpa, an obligate outcrosser that exhibits tremendous phenotypic variation across the natural population. This strategy allowed us to comprehensively catalogue proteins containing single amino acid polymorphisms (SAAPs) and insertions and deletions (INDELS). Based on large-scale identification of SAAPs, we profiled the frequency of 128 types of naturally occurring amino acid substitutions, with a subset of SAAPs occurring in regions of the genome having strong polymorphism patterns consistent with recent positive and/or divergent selection. In addition, we were able to explore the diploid landscape of Populus at the proteome-level, allowing the characterization of heterozygous variants.
Integrating mRNA and protein sequencing enables the detection and quantitative profiling of natural protein sequence variants of Populus trichocarpa

DOE PAGES

Abraham, Paul E.; Wang, Xiaojing; Ranjan, Priya; ...

2015-10-20

The availability of next-generation sequencing technologies has rapidly transformed our ability to link genotypes to phenotypes, and as such, promises to facilitate the dissection of genetic contribution to complex traits. Although discoveries of genetic associations will further our understanding of biology, once candidate variants have been identified, investigators are faced with the challenge of characterizing the functional effects on proteins encoded by such genes. Here we show how next-generation RNA sequencing data can be exploited to construct genotype-specific protein sequence databases, which provide a clearer picture of the molecular toolbox underlying cellular and organismal processes and their variation in a natural population. For this study, we used two individual genotypes (DENA-17-3 and VNDL-27-4) from a recent genome wide association (GWA) study of Populus trichocarpa, an obligate outcrosser that exhibits tremendous phenotypic variation across the natural population. This strategy allowed us to comprehensively catalogue proteins containing single amino acid polymorphisms (SAAPs) and insertions and deletions (INDELS). Based on large-scale identification of SAAPs, we profiled the frequency of 128 types of naturally occurring amino acid substitutions, with a subset of SAAPs occurring in regions of the genome having strong polymorphism patterns consistent with recent positive and/or divergent selection. In addition, we were able to explore the diploid landscape of Populus at the proteome-level, allowing the characterization of heterozygous variants.
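The SAAP cataloguing step described above can be pictured as a position-by-position comparison of a reference protein with its genotype-specific counterpart, followed by a tally of substitution types. The sketch below is a simplified illustration, not the authors' pipeline. It assumes two equal-length protein sequences with no insertions or deletions, and the names `find_saaps` and `substitution_type_counts` are hypothetical.

```python
from collections import Counter


def find_saaps(reference: str, variant: str) -> list[tuple[int, str, str]]:
    """List single amino acid differences as (1-based position, ref, alt).

    Assumption for this sketch: the sequences have equal length, so
    positions correspond directly and INDELs are out of scope.
    """
    if len(reference) != len(variant):
        raise ValueError("sequences must have equal length (no INDELs)")
    return [
        (position, ref_aa, alt_aa)
        for position, (ref_aa, alt_aa) in enumerate(zip(reference, variant), start=1)
        if ref_aa != alt_aa
    ]


def substitution_type_counts(saaps: list[tuple[int, str, str]]) -> Counter:
    """Count how often each directed substitution type (e.g. 'T>S') occurs."""
    return Counter(f"{ref_aa}>{alt_aa}" for _, ref_aa, alt_aa in saaps)


if __name__ == "__main__":
    # Toy sequences invented for illustration only.
    saaps = find_saaps("MKTAYLTK", "MKSAHLSK")
    print(saaps)
    print(substitution_type_counts(saaps))
```

In a real workflow the variant sequences would come from translating genotype-specific transcripts, and the comparison would first require an alignment that allows for INDELs.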
Computational analysis of sequence selection mechanisms.

PubMed

Meyerguz, Leonid; Grasso, Catherine; Kleinberg, Jon; Elber, Ron

2004-04-01

Mechanisms leading to gene variations are responsible for the diversity of species and are important components of the theory of evolution. One constraint on gene evolution is that of protein foldability; the three-dimensional shapes of proteins must be thermodynamically stable. We explore the impact of this constraint and calculate properties of foldable sequences using 3660 structures from the Protein Data Bank. We seek a selection function that receives sequences as input, and outputs survival probability based on sequence fitness to structure. We compute the number of sequences that match a particular protein structure with energy lower than the native sequence, the density of the number of sequences, the entropy, and the "selection" temperature. The mechanism of structure selection for sequences longer than 200 amino acids is approximately universal. For shorter sequences, it is not. We speculate on concrete evolutionary mechanisms that show this behavior.
Ovine Reference Materials and Assays for Prion Genetic Testing

USDA-ARS's Scientific Manuscript database

Background: Genetic predisposition to scrapie in sheep is associated with variation in the peptide sequence of the ovine prion protein encoded by Prnp. Codon variants implicated in scrapie susceptibility or disease progression include those at amino acid positions 112, 136, 141, 154, and 171. Nin...
[Susceptibility HLA alleles and amino acids to Takayasu arteritis].

PubMed

Terao, Chikashi; Yoshifuji, Hajime; Mimori, Tsuneyo; Matsuda, Fumihiko

2014-01-01

Takayasu arteritis (TAK) is a systemic vasculitis affecting the aorta and its large branches that was first reported in Japan. TAK develops mainly in young females, and the number of patients with TAK in Japan is estimated to be about 6,000 to 10,000. This low prevalence has made genetic studies to elucidate the genetic background of TAK difficult. The HLA region, especially the HLA-B locus, is the strongest susceptibility locus for TAK. The association between TAK and HLA-B*52:01 has been established beyond ethnicity. Recently, two different Japanese research groups identified HLA-B*67:01, a relatively rare allele in the East Asian population, as a novel susceptibility allele. At the same time, two amino acid variations, namely histidine at position 171 and phenylalanine at position 67, were reported as susceptibility and protective variations, respectively. Since these amino acid positions are in the peptide-binding grooves of the HLA-B protein, changes in peptide binding in MHC class I seem to play a critical role in susceptibility to TAK. Furthermore, the importance of these two amino acid variations would explain the lack of a susceptibility effect of HLA-B*51:01 on TAK, which shares most of its amino acid sequence with HLA-B*52:01 except for two amino acids including position 67.
Helicobacter pylori Heat Shock Protein A: Serologic Responses and Genetic Diversity

PubMed Central

Ng, Enders K. W.; Thompson, Stuart A.; Pérez-Pérez, Guillermo I.; Kansau, Imad; van der Ende, Arie; Labigne, Agnès; Sung, Joseph J. Y.; Chung, S. C. Sydney; Blaser, Martin J.

1999-01-01

Helicobacter pylori synthesizes an unusual GroES homolog, heat shock protein A (HspA). The present study was aimed at an assessment of the serological response to HspA in a group of Chinese patients with defined gastroduodenal pathologies and determination of whether diversity is present in the nucleotide sequences encoding HspA in isolates from these patients. Serum samples collected from 154 patients who had an upper gastrointestinal pathology and the presence of H. pylori defined by biopsy were tested for an immunoglobulin G (IgG) serologic response to H. pylori HspA by an enzyme-linked immunosorbent assay. HspA-encoding nucleotide sequences in H. pylori isolates from 14 patients (7 seropositive and 7 seronegative for HspA) were analyzed by PCR and direct sequencing of the PCR products. The sequencing results were compared to those of 48 isolates from other parts of the world. Of the 154 known H. pylori-positive patients, 54 (35.1%) were seropositive for HspA. The A domain (GroES homology) of HspA was highly conserved in the 14 isolates tested. Although the B domain (metal-binding site unique to H. pylori) resembled that in the known major variant, particular amino acid substitutions allowed definition of an HspA variant associated with isolates from East Asia. There were no associations between patient characteristics and HspA seropositivity or amino acid sequences. We confirmed in this study that the clinical outcomes of H. pylori infection are not related to HspA antigenicity or to sequence variation. However, B-domain sequence variation may be a marker for the study of the genetic diversity of H. pylori strains of different geographic origins. PMID:10225839
Droplet digital PCR technology promises new applications and research areas.

PubMed

Manoj, P

2016-01-01

Digital polymerase chain reaction (dPCR) is used to quantify nucleic acids; its applications include the detection and precise quantification of low-level pathogens, rare genetic sequences, copy number variants, and rare mutations, as well as relative gene expression. The PCR is performed in a large number of reaction chambers or partitions, and the reaction is carried out in each partition individually. This separation allows more reliable collection and more sensitive measurement of nucleic acid. Results are calculated by counting the partitions containing amplified target sequence (positive droplets) and the partitions in which there is no amplification (negative droplets). The mean number of target sequences per partition is calculated using Poisson statistics; the Poisson correction compensates for the presence of more than one copy of the target gene in a droplet. The method provides accurate, precise, and highly reproducible information and is less susceptible to inhibitors than qPCR. It has been demonstrated in studying variations in gene sequences, such as copy number variants and point mutations, in distinguishing differences between expression of nearly identical alleles, and in the assessment of clinically relevant genetic variations, and it is routinely used for clonal amplification of samples for NGS methods. dPCR enables more reliable predictors of tumor status and patient prognosis by absolute quantitation using reference normalizations. Rare mitochondrial DNA deletions associated with a range of diseases and disorders as well as aging can be accurately detected with droplet digital PCR.
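The Poisson correction mentioned in this abstract follows from assuming that target copies are distributed randomly across partitions. The probability that a partition is negative is then exp(-λ), so the mean copies per partition is λ = -ln(1 - p), where p is the fraction of positive partitions. A minimal sketch is given below. The function names are illustrative, and the partition volume is a parameter the caller must supply, as it is not given in the source.

```python
import math


def copies_per_partition(positive: int, total: int) -> float:
    """Mean target copies per partition from the positive fraction.

    Under a Poisson model, P(negative) = exp(-lambda), so
    lambda = -ln(1 - positive / total).
    """
    if total <= 0 or not 0 <= positive < total:
        # If every partition is positive, the estimate is unbounded.
        raise ValueError("require 0 <= positive < total")
    fraction_positive = positive / total
    return -math.log(1.0 - fraction_positive)


def copies_per_microliter(positive: int, total: int, partition_volume_ul: float) -> float:
    """Target concentration in copies per microliter of partitioned reaction."""
    if partition_volume_ul <= 0:
        raise ValueError("partition volume must be positive")
    return copies_per_partition(positive, total) / partition_volume_ul


if __name__ == "__main__":
    lam = copies_per_partition(positive=5000, total=20000)
    print(f"uncorrected fraction = {5000 / 20000:.4f}, Poisson-corrected = {lam:.4f}")
```

The corrected value always exceeds the raw positive fraction because some positive partitions hold more than one copy, and the gap widens as the positive fraction grows.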
Detection and sequence/structure mapping of biophysical constraints to protein variation in saturated mutational libraries and protein sequence alignments with a dedicated server.

PubMed

Abriata, Luciano A; Bovigny, Christophe; Dal Peraro, Matteo

2016-06-17

Protein variability can now be studied by measuring high-resolution tolerance-to-substitution maps and fitness landscapes in saturated mutational libraries. But these rich and expensive datasets are typically interpreted coarsely, restricting detailed analyses to positions of extremely high or low variability or dubbed important beforehand based on existing knowledge about active sites, interaction surfaces, (de)stabilizing mutations, etc. Our new webserver PsychoProt (freely available without registration at http://psychoprot.epfl.ch or at http://lucianoabriata.altervista.org/psychoprot/index.html ) helps to detect, quantify, and sequence/structure map the biophysical and biochemical traits that shape amino acid preferences throughout a protein as determined by deep-sequencing of saturated mutational libraries or from large alignments of naturally occurring variants. We exemplify how PsychoProt helps to (i) unveil protein structure-function relationships from experiments and from alignments that are consistent with structures according to coevolution analysis, (ii) recall global information about structural and functional features and identify hitherto unknown constraints to variation in alignments, and (iii) point at different sources of variation among related experimental datasets or between experimental and alignment-based data. Remarkably, metabolic costs of the amino acids pose strong constraints to variability at protein surfaces in nature but not in the laboratory. This and other differences call for caution when extrapolating results from in vitro experiments to natural scenarios in, for example, studies of protein evolution. We show through examples how PsychoProt can be a useful tool for the broad communities of structural biology and molecular evolution, particularly for studies about protein modeling, evolution and design.
Color differences among feral pigeons (Columba livia) are not attributable to sequence variation in the coding region of the melanocortin-1 receptor gene (MC1R)

PubMed Central

2013-01-01

Background Genetic variation at the melanocortin-1 receptor (MC1R) gene is correlated with melanin color variation in many birds. Feral pigeons (Columba livia) show two major melanin-based colorations: a red coloration due to pheomelanic pigment and a black coloration due to eumelanic pigment. Furthermore, within each color type, feral pigeons display continuous variation in the amount of melanin pigment present in the feathers, with individuals varying from pure white to a full dark melanic color. Coloration is highly heritable and it has been suggested that it is under natural or sexual selection, or both. Our objective was to investigate whether MC1R allelic variants are associated with plumage color in feral pigeons. Findings We sequenced 888 bp of the coding sequence of MC1R among pigeons varying both in the type, eumelanin or pheomelanin, and the amount of melanin in their feathers. We detected 10 non-synonymous substitutions and 2 synonymous substitutions, but none of them was associated with a plumage type. It remains possible that non-synonymous substitutions that influence coloration are present in the short MC1R fragment that we did not sequence, but this seems unlikely because we analyzed the entire functionally important region of the gene. Conclusions Our results show that color differences among feral pigeons are probably not attributable to amino acid variation at the MC1R locus. Therefore, variation in regulatory regions of MC1R or variation in other genes may be responsible for the color polymorphism of feral pigeons. PMID:23915680
The wheat cytochrome oxidase subunit II gene has an intron insert and three radical amino acid changes relative to maize

PubMed Central

Bonen, Linda; Boer, Poppo H.; Gray, Michael W.

1984-01-01

We have determined the sequence of the wheat mitochondrial gene for cytochrome oxidase subunit II (COII) and find that its derived protein sequence differs from that of maize at only three amino acid positions. Unexpectedly, all three replacements are non-conservative ones. The wheat COII gene has a highly-conserved intron at the same position as in maize, but the wheat intron is 1.5 times longer because of an insert relative to its maize counterpart. Hybridization analysis of mitochondrial DNA from rye, pea, broad bean and cucumber indicates strong sequence conservation of COII coding sequences among all these higher plants. However, only rye and maize mitochondrial DNA show homology with wheat COII intron sequences and rye alone with intron-insert sequences. We find that a sequence identical to the region of the 5' exon corresponding to the transmembrane domain of the COII protein is present at a second genomic location in wheat mitochondria. These variations in COII gene structure and size, as well as the presence of repeated COII sequences, illustrate at the DNA sequence level, factors which contribute to higher plant mitochondrial DNA diversity and complexity. PMID:16453565
Sequence diversity of the leukotoxin (lktA) gene in caprine and ovine strains of Mannheimia haemolytica.

PubMed

Vougidou, C; Sandalakis, V; Psaroulaki, A; Petridou, E; Ekateriniadou, L

2013-04-20

Mannheimia haemolytica is the aetiological agent of pneumonic pasteurellosis in small ruminants. The primary virulence factor of the bacterium is a leukotoxin (LktA), which induces apoptosis in susceptible cells via mitochondrial targeting. It has been previously shown that certain lktA alleles are associated either with cattle or sheep. The objective of the present study was to investigate lktA sequence variation among ovine and caprine M. haemolytica strains isolated from pneumonic lungs, revealing any potential adaptation for the caprine host, for which there are no available data. Furthermore, we investigated amino acid variation in the N-terminal part of the sequences and its effect on targeting mitochondria. Data analysis showed that the prevalent caprine genotype differed at a single non-synonymous site from a previously described uncommon bovine allele, whereas the ovine sequences represented new, distinct alleles. N-terminal sequence differences did not affect the mitochondrial targeting ability of the isolates; interestingly, in one case mitochondrial matrix targeting was indicated rather than membrane association, suggesting an alternative LktA trafficking pattern.
[Complete genome sequencing and analyses of rabies viruses isolated from wild animals (Chinese Ferret-Badger) in Zhejiang province].

PubMed

Lei, Yong-Liang; Wang, Xiao-Guang; Liu, Fu-Ming; Chen, Xiu-Ying; Ye, Bi-Feng; Mei, Jian-Hua; Lan, Jin-Quan; Tang, Qing

2009-08-01

By sequencing the full-length genomes of rabies viruses from two Chinese ferret-badgers, we analyzed the genetic variation of rabies viruses at the molecular level to obtain information on the prevalence and variation of rabies viruses in Zhejiang, and to enrich the genome database of rabies virus street strains isolated from Chinese wildlife. Overlapping fragments were amplified by RT-PCR and assembled into full-length genomes, which were used to analyze nucleotide and deduced protein similarities; phylogenetic analyses were performed on the N genes from Chinese ferret-badger, sika deer, vole and dog isolates and from vaccine strains. The two full-length genomes had the same genetic structure of 11 923 nt, comprising a 58-nt leader, 1353-nt NP, 894-nt PP, 609-nt MP, 1575-nt GP and 6386-nt LP genes, intergenic regions (IGRs) of 2, 5 and 5 nt, a 423-nt pseudogene-like sequence (Psi) and a 70-nt trailer. BLAST and multiple sequence alignment showed that both genomes were consistent with the properties of the genus Lyssavirus, family Rhabdoviridae. Nucleotide and amino acid sequence similarity was highest among Chinese strains, especially among strains from animals of the same species. Between the two full-length genomes, similarity at the amino acid level was markedly higher than at the nucleotide level, so most of the nucleotide mutations in these two genomes were probably synonymous. Compared with the reference rabies viruses, the five protein-coding regions showed no changes in length and no recombination, only a few point mutations, indicating that the five proteins are stable. The sites and types of variation in the two ferret-badger genomes were similar to those in the reference vaccine or street strains. According to multiple sequence alignment and phylogenetic analyses, the two strains belonged to genotype 1 and possessed distinct geographic characteristics of China. All the evidence suggested that these two ferret-badger rabies viruses were likely street viruses already circulating in wildlife.
Structure and genetic variability of envelope glycoproteins of two antigenic variants of caprine arthritis-encephalitis lentivirus.

PubMed

Knowles, D P; Cheevers, W P; McGuire, T C; Brassfield, A L; Harwood, W G; Stem, T A

1991-11-01

To define the structure of the caprine arthritis-encephalitis virus (CAEV) env gene and characterize genetic changes which occur during antigenic variation, we sequenced the env genes of CAEV-63 and CAEV-Co, two antigenic variants of CAEV defined by serum neutralization. The deduced primary translation product of the CAEV env gene consists of a 60- to 80-amino-acid signal peptide followed by an amino-terminal surface protein (SU) and a carboxy-terminal transmembrane protein (TM) separated by an Arg-Lys-Lys-Arg cleavage site. The signal peptide cleavage site was verified by amino-terminal amino acid sequencing of native CAEV-63 SU. In addition, immunoprecipitation of [35S]methionine-labeled CAEV-63 proteins by sera from goats immunized with recombinant vaccinia virus expressing the CAEV-63 env gene confirmed that antibodies induced by env-encoded recombinant proteins react specifically with native virion SU and TM. The env genes of CAEV-63 and CAEV-Co encode 28 conserved cysteines and 25 conserved potential N-linked glycosylation sites. Nucleotide sequence variability results in 62 amino acid changes and one deletion within the SU and 34 amino acid changes within the TM.
Structure and genetic variability of envelope glycoproteins of two antigenic variants of caprine arthritis-encephalitis lentivirus.

PubMed Central

Knowles, D P; Cheevers, W P; McGuire, T C; Brassfield, A L; Harwood, W G; Stem, T A

1991-01-01

To define the structure of the caprine arthritis-encephalitis virus (CAEV) env gene and characterize genetic changes which occur during antigenic variation, we sequenced the env genes of CAEV-63 and CAEV-Co, two antigenic variants of CAEV defined by serum neutralization. The deduced primary translation product of the CAEV env gene consists of a 60- to 80-amino-acid signal peptide followed by an amino-terminal surface protein (SU) and a carboxy-terminal transmembrane protein (TM) separated by an Arg-Lys-Lys-Arg cleavage site. The signal peptide cleavage site was verified by amino-terminal amino acid sequencing of native CAEV-63 SU. In addition, immunoprecipitation of [35S]methionine-labeled CAEV-63 proteins by sera from goats immunized with recombinant vaccinia virus expressing the CAEV-63 env gene confirmed that antibodies induced by env-encoded recombinant proteins react specifically with native virion SU and TM. The env genes of CAEV-63 and CAEV-Co encode 28 conserved cysteines and 25 conserved potential N-linked glycosylation sites. Nucleotide sequence variability results in 62 amino acid changes and one deletion within the SU and 34 amino acid changes within the TM. PMID:1656067
Genetic variation of viral protein 1 genes of field strains of waterfowl parvoviruses and their attenuated derivatives.

PubMed

Tsai, Hsiang-Jung; Tseng, Chun-hsien; Chang, Poa-chun; Mei, Kai; Wang, Shih-Chi

2004-09-01

To understand the genetic variations between the field strains of waterfowl parvoviruses and their attenuated derivatives, we analyzed the complete nucleotide sequences of the viral protein 1 (VP1) genes of nine field strains and two vaccine strains of waterfowl parvoviruses. Sequence comparison of the VP1 proteins showed that these viruses could be divided into goose parvovirus (GPV) related and Muscovy duck parvovirus (MDPV) related groups. The amino acid difference between GPV- and MDPV-related groups ranged from 13.1% to 15.8%, and the most variable region resided in the N terminus of VP2. The vaccine strains of GPV and MDPV exhibited only 1.2% and 0.3% difference in amino acid when compared with their parental field strains, and most of these differences resided in residues 497-575 of VP1, suggesting that these residues might be important for the attenuation of GPV and MDPV. When the GPV strains isolated in 1982 (the strain 82-0308) and in 2001 (the strain 01-1001) were compared, only 0.3% difference in amino acid was found, while MDPV strains isolated in 1990 (the strain 90-0219) and 1997 (the strain 97-0104) showed only 0.4% difference in amino acid. The result indicates that the genome of waterfowl parvovirus had remained highly stable in the field.
Characterization of durum wheat high molecular weight glutenin subunits Bx20 and By20 sequences by a molecular and proteomic approach.

PubMed

Santagati, Vito Davide; Sestili, Francesco; Lafiandra, Domenico; D'Ovidio, Renato; Rogniaux, Helene; Masci, Stefania

2016-07-01

Wheat high molecular weight glutenin subunit variation is important because of its great influence on glutenin polymer structure, which is related to dough technological properties. Among the different subunits, the pair Bx20 and By20 is known to have a negative effect on quality, but the reasons are not clear: Bx20 has two cysteines, which theoretically make this subunit a chain extender of the glutenin polymer, just like the other Bx subunits, showing four cysteines, two of which should be involved in intra-molecular disulfide bonds. By20 has never been characterized so far at molecular level. Here we report the nucleotide sequences of Bx20 and By20 genes isolated from the durum wheat cultivar 'Lira 45' and the validation of the corresponding deduced amino acid sequences by using MALDI-TOF and LC-MS/MS. Four nucleotide differences were identified in the Bx20 gene with respect to the deduced sequence present in NCBI, causing two amino acid substitutions. For the By20 subunit, nucleotide and amino acid sequences revealed a great similarity to By15, both at gene and protein levels, showing five nucleotide changes generating two amino acid differences. No evidence of post-translational modifications has been found. Hypotheses are formulated in regard to relationships with technological quality. Copyright © 2016 John Wiley & Sons, Ltd.
Genetic diversity and phylogenetic analysis of Aleutian mink disease virus isolates in north-east China.

PubMed

Leng, Xue; Liu, Dongxu; Li, Jianming; Shi, Kun; Zeng, Fanli; Zong, Ying; Liu, Yi; Sun, Zhibo; Zhang, Shanshan; Liu, Yadong; Du, Rui

2018-05-01

Aleutian mink disease is the most important disease in the mink-farming industry worldwide. So far, few large-scale molecular epidemiological studies of Aleutian mink disease virus (AMDV), based on the NS1 and VP2 genes, have been conducted in China. Here, eight new Chinese isolates of AMDV from three provinces in north-east China were analyzed to clarify the molecular epidemiology of AMDV. The seroprevalence of AMDV in north-east China was 41.8% according to counterimmunoelectrophoresis. Genetic variation analysis of the eight isolates showed significant non-synonymous substitutions in the NS1 and VP2 genes, especially in the NS1 gene. All eight isolates included the caspase-recognition sequence NS1:285 (DQTD↓S), but not the caspase-recognition sequence NS1:227 (INTD↓S). Compared with the AMDV-G strain, the LN1 and LN2 strains had a new 10-amino-acid deletion at amino acids 28-37 of the VP2 protein, while the JL3 strain had a one-amino-acid deletion at position 28. Phylogenetic analysis based on most of NS1 (1755 bp) and complete VP2 showed that the AMDV genotypes did not cluster according to their pathogenicity or geographic origin. Local and imported AMDV strains are all prevalent in mink-farming populations in the north-east of China. This is the first study to report the molecular epidemiology of AMDV in north-east China based on most of NS1 and the complete VP2, and it further provides information about polyG deletions and new variations in the amino acid sequences of the NS1 and VP2 proteins. This report is a good foundation for further study of AMDV in China.

Comparative genomics of citric-acid-producing Aspergillus niger ATCC 1015 versus enzyme-producing CBS 513.88

PubMed Central

Andersen, Mikael R.; Salazar, Margarita P.; Schaap, Peter J.; van de Vondervoort, Peter J.I.; Culley, David; Thykaer, Jette; Frisvad, Jens C.; Nielsen, Kristian F.; Albang, Richard; Albermann, Kaj; Berka, Randy M.; Braus, Gerhard H.; Braus-Stromeyer, Susanna A.; Corrochano, Luis M.; Dai, Ziyu; van Dijck, Piet W.M.; Hofmann, Gerald; Lasure, Linda L.; Magnuson, Jon K.; Menke, Hildegard; Meijer, Martin; Meijer, Susan L.; Nielsen, Jakob B.; Nielsen, Michael L.; van Ooyen, Albert J.J.; Pel, Herman J.; Poulsen, Lars; Samson, Rob A.; Stam, Hein; Tsang, Adrian; van den Brink, Johannes M.; Atkins, Alex; Aerts, Andrea; Shapiro, Harris; Pangilinan, Jasmyn; Salamov, Asaf; Lou, Yigong; Lindquist, Erika; Lucas, Susan; Grimwood, Jane; Grigoriev, Igor V.; Kubicek, Christian P.; Martinez, Diego; van Peij, Noël N.M.E.; Roubos, Johannes A.; Nielsen, Jens; Baker, Scott E.

2011-01-01

The filamentous fungus Aspergillus niger exhibits great diversity in its phenotype. It is found globally, both as marine and terrestrial strains, produces both organic acids and hydrolytic enzymes in high amounts, and some isolates exhibit pathogenicity. Although the genome of an industrial enzyme-producing A. niger strain (CBS 513.88) has already been sequenced, the versatility and diversity of this species compel additional exploration. We therefore undertook whole-genome sequencing of the acidogenic A. niger wild-type strain (ATCC 1015) and produced a genome sequence of very high quality. Only 15 gaps are present in the sequence, and half the telomeric regions have been elucidated. Moreover, sequence information from ATCC 1015 was used to improve the genome sequence of CBS 513.88. Chromosome-level comparisons uncovered several genome rearrangements, deletions, a clear case of strain-specific horizontal gene transfer, and identification of 0.8 Mb of novel sequence. Single nucleotide polymorphisms per kilobase (SNPs/kb) between the two strains were found to be exceptionally high (average: 7.8, maximum: 160 SNPs/kb). High variation within the species was confirmed with exo-metabolite profiling and phylogenetics. Detailed lists of alleles were generated, and genotypic differences were observed to accumulate in metabolic pathways essential to acid production and protein synthesis. A transcriptome analysis supported up-regulation of genes associated with biosynthesis of amino acids that are abundant in glucoamylase A, tRNA-synthases, and protein transporters in the protein producing CBS 513.88 strain. Our results and data sets from this integrative systems biology analysis resulted in a snapshot of fungal evolution and will support further optimization of cell factories based on filamentous fungi. PMID:21543515
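The SNPs/kb figures quoted in this abstract (an average and a maximum) can be understood through a short sketch. It assumes SNP density is counted in non-overlapping 1-kb windows along an aligned region; the abstract does not state how the windows were defined, so the window scheme, function name and example positions here are hypothetical.

```python
from collections import Counter


def snp_density(snp_positions, region_length_bp, window_bp=1000):
    """Return (average SNPs/kb over the region, maximum SNPs/kb in any window).

    snp_positions are 0-based coordinates within the region.
    """
    if region_length_bp <= 0 or window_bp <= 0:
        raise ValueError("lengths must be positive")
    average = len(snp_positions) / region_length_bp * 1000
    # Count SNPs falling into each non-overlapping window.
    per_window = Counter(pos // window_bp for pos in snp_positions)
    maximum = max(per_window.values(), default=0) * 1000 / window_bp
    return average, maximum
```

With hypothetical SNPs at positions 10, 20, 30 and 1500 in a 3000-bp region, the average is about 1.33 SNPs/kb while the densest window holds 3 SNPs/kb, which shows how a modest average can coexist with a much higher local maximum.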
Specificity determinants for the abscisic acid response element.

PubMed

Sarkar, Aditya Kumar; Lahiri, Ansuman

2013-01-01

Abscisic acid (ABA) response elements (ABREs) are a group of cis-acting DNA elements that have been identified from promoter analysis of many ABA-regulated genes in plants. We are interested in understanding the mechanism of binding specificity between ABREs and a class of bZIP transcription factors known as ABRE binding factors (ABFs). In this work, we have modeled the homodimeric structure of the bZIP domain of ABRE binding factor 1 from Arabidopsis thaliana (AtABF1) and studied its interaction with ACGT core motif-containing ABRE sequences. We have also examined the variation in the stability of the protein-DNA complex upon mutating ABRE sequences using the protein design algorithm FoldX. The high throughput free energy calculations successfully predicted the ability of ABF1 to bind to alternative core motifs like GCGT or AAGT and also rationalized the role of the flanking sequences in determining the specificity of the protein-DNA interaction.
Comparison of theoretical proteomes: Identification of COGs with conserved and variable pI within the multimodal pI distribution

PubMed Central

Nandi, Soumyadeep; Mehra, Nipun; Lynn, Andrew M; Bhattacharya, Alok

2005-01-01

Background Theoretical proteome plots, generated by plotting theoretical isoelectric points (pI) against molecular masses of all proteins encoded by the genome, show a multimodal distribution for pI. This multimodal distribution is an effect of allowed combinations of the charged amino acids, and not due to evolutionary causes. The variation in this distribution can be correlated to the organism's ecological niche. Contributions to this variation may be mapped to individual proteins by studying the variation in pI of orthologs across microorganism genomes. Results The distribution of ortholog pI values showed trimodal distributions for all prokaryotic genomes analyzed, similar to whole proteome plots. Pairwise analysis of pI variation shows that a few COGs are conserved within, but most vary between, the acidic and basic regions of the distribution, while molecular mass is more highly conserved. At the level of functional grouping of orthologs, five groups vary significantly from the population of orthologs, which is attributed to either conservation at the level of sequences or a bias for either positively or negatively charged residues contributing to the function. Individual COGs conserved in both the acidic and basic regions of the trimodal distribution are identified, and orthologs that best represent the variation in levels of the acidic and basic regions are listed. Conclusion The analysis of pI distribution by using orthologs provides a basis for resolution of theoretical proteome comparison at the level of individual proteins. Orthologs identified that significantly vary between the major acidic and basic regions may be used as representative of the variation of the entire proteome. PMID:16150155
[Comparison of genotype characteristics between the circulating mumps virus strain in Beijing area and the vaccine strain].

PubMed

Chen, Meng; Zhang, Tie-gang; Chen, Li-juan; Wu, Jiang; Yang, Jie; Zhang, Wei

2009-11-01

To compare the genetic characteristics of the mumps virus strains circulating in Beijing with those of the vaccine strain and to make a preliminary analysis of the reasons for vaccine ineffectiveness. The following methods were used: isolation and identification of mumps viruses circulating in Beijing, analysis of immunization history, SH gene sequence analysis and comparison of genotype homology with reference strains, and analysis of variation at key amino acid sites of HN. Of the 38 mumps cases from which virus had been isolated, seven were IgM negative. In 2007 and 2008, the positive rates of virus isolation, RT-PCR and IgM decreased significantly, while the number of cases with an immunization history increased. Cases without a history of vaccination had higher positive rates of both virus isolation and IgM. All thirty-eight strains belonged to genotype F, whereas the vaccine strain was genotype A. The circulating viruses showed 5.6% nucleotide sequence divergence in the SH gene and 16.0%-18.1% divergence from the vaccine strain. Conserved hydrophobic amino acids in the SH protein of some Beijing strains had changed; for example, six strains carried an L-->F substitution at position 8. The circulating viruses showed 2.3% divergence in HN protein amino acid sequences and 4.2%-5.3% divergence from the vaccine strain. The amino acid sites that determine cross-neutralization ability differed between the Beijing strains and the vaccine strain: at sites 354 and 356, all the Beijing strains differed from the vaccine strain. The N-glycosylation sites on HN also differed: positions 464-466 were NCS in the Beijing strains but NCR in the vaccine strain. Another 18 amino acid sites of unknown function in all Beijing strains differed from those in the vaccine strain. In recent years, genotype F has been the main genotype of circulating strains in Beijing, without genotype variation, although larger differences were found among them. The SH and HN proteins of the Beijing strains differed substantially from those of the vaccine strain, which might explain the ineffectiveness of the vaccine.
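The NCS versus NCR difference at HN positions 464-466 reported above matters because N-linked glycosylation is generally associated with the sequon N-X-S/T, where X is any residue except proline. The following sketch scans a protein string for that sequon; the function name and the `offset` parameter are hypothetical conveniences, and the three-residue inputs in the usage note are only the motifs quoted in the abstract, not full HN sequences.

```python
import re

# N-X-S/T sequon, X != P. The lookahead lets overlapping sequons all be reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(protein: str, offset: int = 1) -> list[int]:
    """Return the position of the Asn in each potential N-glycosylation sequon.

    offset is the residue number assigned to the first character of the string.
    """
    return [m.start() + offset for m in SEQUON.finditer(protein.upper())]
```

Under this rule, `find_sequons("NCS", offset=464)` reports a potential site at residue 464, whereas `find_sequons("NCR", offset=464)` reports none, which is consistent with the abstract's statement that the N-glycosylation sites differ between the two groups of strains.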
Genetic variation of six desaturase genes in flax and their impact on fatty acid composition.

PubMed

Thambugala, Dinushika; Duguid, Scott; Loewen, Evelyn; Rowland, Gordon; Booker, Helen; You, Frank M; Cloutier, Sylvie

2013-10-01

Flax (Linum usitatissimum L.) is one of the richest plant sources of omega-3 fatty acids praised for their health benefits. In this study, the extent of the genetic variability of genes encoding stearoyl-ACP desaturase (SAD), and fatty acid desaturase 2 (FAD2) and 3 (FAD3) was determined by sequencing the six paralogous genes from 120 flax accessions representing a broad range of germplasm including some EMS mutant lines. A total of 6 alleles for sad1 and sad2, 21 for fad2a, 5 for fad2b, 15 for fad3a and 18 for fad3b were identified. Deduced amino acid sequences of the alleles predicted 4, 2, 3, 4, 6 and 7 isoforms, respectively. Allele frequencies varied greatly across genes. Fad3a, with 110 SNPs and 19 indels, and fad3b, with 50 SNPs and 5 indels, showed the highest levels of genetic variations. While most of the SNPs and all the indels were silent mutations, both genes carried nonsense SNP mutations resulting in premature stop codons, a feature not observed in sad and fad2 genes. Some alleles and isoforms discovered in induced mutant lines were absent in the natural germplasm. Correlation of these genotypic data with fatty acid composition data of 120 flax accessions phenotyped in six field experiments revealed statistically significant effects of some of the SAD and FAD isoforms on fatty acid composition, oil content and iodine value. The novel allelic variants and isoforms identified for the six desaturases will be a resource for the development of oilseed flax with unique and useful fatty acid profiles.
Dynamics of actin evolution in dinoflagellates.

PubMed

Kim, Sunju; Bachvaroff, Tsvetan R; Handy, Sara M; Delwiche, Charles F

2011-04-01

Dinoflagellates have unique nuclei and intriguing genome characteristics with very high DNA content making complete genome sequencing difficult. In dinoflagellates, many genes are found in multicopy gene families, but the processes involved in the establishment and maintenance of these gene families are poorly understood. Understanding the dynamics of gene family evolution in dinoflagellates requires comparisons at different evolutionary scales. Studies of closely related species provide fine-scale information relative to species divergence, whereas comparisons of more distantly related species provides broad context. We selected the actin gene family as a highly expressed conserved gene previously studied in dinoflagellates. Of the 142 sequences determined in this study, 103 were from the two closely related species, Dinophysis acuminata and D. caudata, including full length and partial cDNA sequences as well as partial genomic amplicons. For these two Dinophysis species, at least three types of sequences could be identified. Most copies (79%) were relatively similar and in nucleotide trees, the sequences formed two bushy clades corresponding to the two species. In comparisons within species, only eight to ten nucleotide differences were found between these copies. The two remaining types formed clades containing sequences from both species. One type included the most similar sequences in between-species comparisons with as few as 12 nucleotide differences between species. The second type included the most divergent sequences in comparisons between and within species with up to 93 nucleotide differences between sequences. In all the sequences, most variation occurred in synonymous sites or the 5' UnTranslated Region (UTR), although there was still limited amino acid variation between most sequences. 
Several potential pseudogenes were found (approximately 10% of all sequences depending on species) with incomplete open reading frames due to frameshifts or early stop codons. Overall, variation in the actin gene family fits best with the "birth and death" model of evolution based on recent duplications, pseudogenes, and incomplete lineage sorting. Divergence between species was similar to variation within species, so that actin may be too conserved to be useful for phylogenetic estimation of closely related species.
Robustness of Reconstructed Ancestral Protein Functions to Statistical Uncertainty.

PubMed

Eick, Geeta N; Bridgham, Jamie T; Anderson, Douglas P; Harms, Michael J; Thornton, Joseph W

2017-02-01

Hypotheses about the functions of ancient proteins and the effects of historical mutations on them are often tested using ancestral protein reconstruction (APR)-phylogenetic inference of ancestral sequences followed by synthesis and experimental characterization. Usually, some sequence sites are ambiguously reconstructed, with two or more statistically plausible states. The extent to which the inferred functions and mutational effects are robust to uncertainty about the ancestral sequence has not been studied systematically. To address this issue, we reconstructed ancestral proteins in three domain families that have different functions, architectures, and degrees of uncertainty; we then experimentally characterized the functional robustness of these proteins when uncertainty was incorporated using several approaches, including sampling amino acid states from the posterior distribution at each site and incorporating the alternative amino acid state at every ambiguous site in the sequence into a single "worst plausible case" protein. In every case, qualitative conclusions about the ancestral proteins' functions and the effects of key historical mutations were robust to sequence uncertainty, with similar functions observed even when scores of alternate amino acids were incorporated. There was some variation in quantitative descriptors of function among plausible sequences, suggesting that experimentally characterizing robustness is particularly important when quantitative estimates of ancient biochemical parameters are desired. The worst plausible case method appears to provide an efficient strategy for characterizing the functional robustness of ancestral proteins to large amounts of sequence uncertainty. Sampling from the posterior distribution sometimes produced artifactually nonfunctional proteins for sequences reconstructed with substantial ambiguity. © The Author 2016. 
Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
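The two ways of incorporating uncertainty described in this abstract can be sketched in a few lines. The sketch assumes each site is given as a mapping from amino acid to posterior probability; the 0.2 cutoff for calling a site ambiguous, the function names and the toy posteriors are hypothetical and are not taken from the article.

```python
import random


def ml_sequence(posteriors):
    """Maximum-posterior ancestral sequence: the most probable state at every site."""
    return "".join(max(site, key=site.get) for site in posteriors)


def worst_plausible_case(posteriors, threshold=0.2):
    """Use the second-best state at every ambiguous site, the best state elsewhere.

    A site counts as ambiguous when its second-best state has posterior >= threshold
    (the threshold value is an assumption made for this sketch).
    """
    residues = []
    for site in posteriors:
        ranked = sorted(site, key=site.get, reverse=True)
        ambiguous = len(ranked) > 1 and site[ranked[1]] >= threshold
        residues.append(ranked[1] if ambiguous else ranked[0])
    return "".join(residues)


def sample_sequence(posteriors, rng):
    """Draw one sequence by sampling each site from its posterior distribution."""
    return "".join(
        rng.choices(list(site), weights=list(site.values()))[0] for site in posteriors
    )
```

For toy posteriors `[{"A": 0.95, "G": 0.05}, {"L": 0.6, "I": 0.4}, {"K": 0.55, "R": 0.45}]`, the maximum-posterior sequence is `ALK` and the worst plausible case is `AIR`; `sample_sequence(posteriors, random.Random(0))` returns one random draw from the same distributions.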
[Application of the vanillin sulfuric acid colorimetry-ultraviolet spectrometry on quality evaluation of Panax notoginseng].

PubMed

Ding, Yong-Li; Wang, Yuan-Zhong; Zhang, Ji; Zhang, Qing-Zhi; Zhang, Jin-Yu; Jin, Hang

2013-02-01

In this study, Panax notoginseng samples were extracted with chloroform, ethanol and water, and the extracts were analyzed either directly or after reaction with 5% vanillin sulfuric acid, to establish two kinds of UV fingerprints of P. notoginseng. The fingerprints were compared using the common and variation peak ratio dual index sequence analysis method and qualitative analysis in SIMCA software. The results indicated that the optimal extraction time for P. notoginseng samples was 20 min with chloroform, ethanol and water, but the fingerprints differed significantly after the addition of vanillin sulfuric acid. Without vanillin sulfuric acid, the common peak ratios of the UV fingerprints were scattered: the minimum was 25% (Y5-Y8) and the maximum was 84.38% (Y11-Y13, Y20-Y21). The maximum variation peak ratio was 177.78% (Y8-Y5), and the variation peak ratios of several samples exceeded 100%. With vanillin sulfuric acid, the common peak ratios were concentrated (mostly in the range of 50%-70%): the minimum was 42.86% (Y1-Y19) and the maximum was 79.55% (Y22-Y23); the variation peak ratios also spanned a smaller range, generally 20%-50%. The results of the dual index sequence analysis agreed with what the fingerprints implied. The similarity of the UV fingerprints of the P. notoginseng extracts was greater after adding vanillin sulfuric acid than before. Both age and origin were related to differences in the UV fingerprints. Samples of the same age were more similar to each other than samples of different ages. Similarity and difference between samples showed no correlation with geographic distance; samples from nearby origins could be either markedly similar or markedly different. This method appears to be a good alternative for evaluating the quality of P. notoginseng and can distinguish at least two samples quantitatively, overcoming the limitation of many methods that can only distinguish herbs indistinctly.
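The two indices named in this abstract can be illustrated with a short sketch. The abstract does not spell out the formulas, so the sketch assumes the definitions commonly used in dual index sequence analysis: the common peak ratio is the number of shared peaks divided by the number of distinct peaks in the pair, and the variation peak ratio of a sample is its unshared peaks divided by the shared peaks. The peak positions, the exact-match rule, and the function name are hypothetical.

```python
def dual_index(peaks_a, peaks_b):
    """Return (common peak ratio, variation ratio of a, variation ratio of b) in percent.

    Assumed definitions (not stated in the abstract):
      common ratio    = shared peaks / distinct peaks in the pair
      variation ratio = unshared peaks of one sample / shared peaks
    """
    a, b = set(peaks_a), set(peaks_b)
    common = a & b
    if not common:
        raise ValueError("no common peaks; variation ratios are undefined")
    distinct = a | b
    common_ratio = 100.0 * len(common) / len(distinct)
    variation_a = 100.0 * len(a - common) / len(common)
    variation_b = 100.0 * len(b - common) / len(common)
    return common_ratio, variation_a, variation_b


# Hypothetical peak wavelengths (nm) for two samples.
sample_a = [203, 210, 218, 226, 254, 280]
sample_b = [203, 210, 218, 260, 275]
print(dual_index(sample_a, sample_b))
```

Under these definitions a variation peak ratio above 100% simply means that a sample has more unshared than shared peaks relative to its partner, which is how values such as 177.78% can arise.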
[Variations on hemagglutinin gene of Zhejiang measles virus strains and differences with measles strains circulated both at home and abroad].

PubMed

Feng, Yan; Zhong, Shu-ling; Xu, Chang-ping; Lu, Yi-yu

2013-07-01

To investigate the variations in the hemagglutinin (H) gene of measles virus (MV) in Zhejiang province, and to analyze the differences from strains circulating both at home and abroad. In total, 33 MV strains isolated in Zhejiang province between 1999 and 2011 were collected. RNA of the isolated MV strains was extracted and the complete sequences of the H gene were amplified by RT-PCR. The products were compared with the Chinese vaccine strain Shanghai-191, which was downloaded from GenBank, and with 95 other MV strains from all over the world. The 33 MV strains, isolated from throat swab specimens collected from MV patients in Zhejiang province during 1999 to 2001, were used to conduct phylogenetic analysis with MV strains circulating in other areas of China during 1993 to 2007. The phylogenetic tree based on H gene sequences showed that all the Zhejiang MV strains were located in the H1a cluster, and no apparent temporal or geographic restrictions were observed. Compared to the reference vaccine strain Shanghai-191, the average variation rates of nucleotides and amino acids, and the evolutionary rate, of H1a viruses from China during 2003 to 2011 were 5.15%, 4.44% and 5.81%, respectively, which were higher than the rates of H1a viruses during 1965 to 1993 (4.75%, 3.86% and 5.30%) and the rates of viruses during 1994 to 2002 (4.80%, 4.08% and 5.37%). However, the dn/ds ratios of strains within the three time periods were 0.19, 0.21 and 0.23, respectively, which indicated that no evidence of positive selection was found in H1a MV strains during 1993 to 2011. A total of 24 stable amino acid variation sites in the H gene were found between H1a viruses during 2003 to 2011 and the vaccine strain Shanghai-191. The largest variation occurred between the vaccine and H1a strains, with a p-distance of 0.053 and 26-28 amino acid mutations. However, only 15 stable amino acid variations were found between the vaccine strain and genotype B3 or D4 strains. In addition, significant differences were found between H1a viruses and genotype B or D viruses, with p-distances of 0.074 and 0.071 and 27-33 amino acid differences. Significant differences were found in the H gene between MV strains of subtype H1a and vaccine strains and other genotype strains. The variations increased over time; therefore, surveillance of variation in Chinese MV strains should be taken into account.
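The p-distance quoted in this abstract is the proportion of aligned sites at which two sequences differ. A minimal sketch follows, assuming pre-aligned sequences of equal length and pairwise deletion of gapped columns; the two toy sequences are invented and are not H gene data.

```python
def p_distance(seq1, seq2):
    """Proportion of differing sites between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped
    (pairwise deletion), which is an assumption of this sketch.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for x, y in zip(seq1.upper(), seq2.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Invented 20-nt toy alignment with two mismatches.
vaccine = "ATGTCACCACAACGAGACCG"
isolate = "ATGTCGCCACAACGGGACCG"
print(p_distance(vaccine, isolate))
```

A p-distance of 0.053 therefore corresponds to roughly 5 differing sites per 100 compared.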
Development of SSR Markers Linked to Low Hydrocyanic Acid Content in Sorghum-Sudan Grass Hybrid Based on BSA Method.

PubMed

Xiao-Xia, Yu; Zhi-Hua, Liu; Zhuo, Yu; Yue, Shi; Xiao-Yu, Li

2016-01-01

Sorghum-Sudan grass hybrid containing high hydrocyanic acid content can cause hydrocyanic acid poisoning in livestock and limit the popularization of this forage crop. Molecular markers associated with low hydrocyanic acid content can speed up the identification of genotypes with low hydrocyanic acid content. In the present study, 11 polymorphic SSR primers were screened and used for bulked segregant analysis (BSA) and single marker analysis. Three SSR markers, Xtxp7230, Xtxp7375 and Bnlg667960, associated with low hydrocyanic acid content were rapidly identified by BSA. In single marker analysis, six markers, Xtxp7230, Xtxp7375, Bnlg667960, Xtxp67-11, Xtxp295-7 and Xtxp12-9, were linked to low hydrocyanic acid content, explaining 7.6% to 41.2% of the phenotypic variation. The markers identified by BSA were also verified by single marker analysis. The three SSR marker bands were then cloned and sequenced for sequence homology analysis in NCBI. This is the first report on the development of molecular markers associated with low hydrocyanic acid content in sorghum-Sudan grass hybrid. These markers will be useful for genetic improvement of low hydrocyanic acid sorghum-Sudan grass hybrid by marker-assisted breeding.
sapFinder: an R/Bioconductor package for detection of variant peptides in shotgun proteomics experiments.

PubMed

Wen, Bo; Xu, Shaohang; Sheynkman, Gloria M; Feng, Qiang; Lin, Liang; Wang, Quanhui; Xu, Xun; Wang, Jun; Liu, Siqi

2014-11-01

Single nucleotide variations (SNVs) located within a reading frame can result in single amino acid polymorphisms (SAPs), leading to alteration of the corresponding amino acid sequence as well as function of a protein. Accurate detection of SAPs is an important issue in proteomic analysis at the experimental and bioinformatic level. Herein, we present sapFinder, an R software package, for detection of the variant peptides based on tandem mass spectrometry (MS/MS)-based proteomics data. This package automates the construction of variation-associated databases from public SNV repositories or sample-specific next-generation sequencing (NGS) data and the identification of SAPs through database searching, post-processing and generation of HTML-based report with visualized interface. sapFinder is implemented as a Bioconductor package in R. The package and the vignette can be downloaded at http://bioconductor.org/packages/devel/bioc/html/sapFinder.html and are provided under a GPL-2 license. © The Author 2014. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
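The relationship between an SNV in a reading frame and the resulting SAP can be sketched in a few lines. This illustration is written in Python rather than R and does not use or reproduce the sapFinder API; the coding sequence, the function names, and the 1-based coordinate convention are assumptions made for the example.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(cds):
    """Translate a coding sequence codon by codon ('*' marks a stop)."""
    usable = len(cds) - len(cds) % 3
    return "".join(CODON_TABLE[cds[i:i + 3]] for i in range(0, usable, 3))


def apply_snv(cds, pos, ref, alt):
    """Apply an SNV at 1-based CDS position `pos`.

    Returns (SAP label, variant protein), or None if the change is synonymous.
    """
    if cds[pos - 1] != ref:
        raise ValueError("reference base does not match the sequence")
    mutated = cds[:pos - 1] + alt + cds[pos:]
    ref_prot, alt_prot = translate(cds), translate(mutated)
    aa_index = (pos - 1) // 3
    if ref_prot[aa_index] == alt_prot[aa_index]:
        return None
    label = f"{ref_prot[aa_index]}{aa_index + 1}{alt_prot[aa_index]}"
    return label, alt_prot


cds = "ATGGATCTGAAATAA"             # hypothetical CDS: M D L K stop
print(apply_snv(cds, 6, "T", "A"))  # GAT -> GAA changes the residue
print(apply_snv(cds, 9, "G", "A"))  # CTG -> CTA is synonymous
```

A variant protein produced this way is the kind of entry that a variation-associated search database would contain alongside the reference sequence.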
GWA Mapping of Anthocyanin Accumulation Reveals Balancing Selection of MYB90 in Arabidopsis thaliana

PubMed Central

Bac-Molenaar, Johanna A.; Fradin, Emilie F.; Rienstra, Juriaan A.; Vreugdenhil, Dick; Keurentjes, Joost J. B.

2015-01-01

Induction of anthocyanin accumulation by osmotic stress was assessed in 360 accessions of Arabidopsis thaliana. A wide range of natural variation, with phenotypes ranging from green to completely red/purple rosettes, was observed. A genome wide association (GWA) mapping approach revealed that sequence diversity in a small 15 kb region on chromosome 1 explained 40% of the variation observed. Sequence and expression analyses of alleles of the candidate gene MYB90 identified a causal polymorphism at amino acid (AA) position 210 of this transcription factor of the anthocyanin biosynthesis pathway. This amino acid discriminates the two most frequent alleles of MYB90. Both alleles are present in a substantial part of the population, suggesting balancing selection between these two alleles. Analysis of the geographical origin of the studied accessions suggests that the macro climate is not the driving force behind positive or negative selection for anthocyanin accumulation. An important role for local climatic conditions is, therefore, suggested. This study emphasizes that GWA mapping is a powerful approach to identify alleles that are under balancing selection pressure in nature. PMID:26588092
The international Genome sample resource (IGSR): A worldwide collection of genome variation incorporating the 1000 Genomes Project data.

PubMed

Clarke, Laura; Fairley, Susan; Zheng-Bradley, Xiangqun; Streeter, Ian; Perry, Emily; Lowy, Ernesto; Tassé, Anne-Marie; Flicek, Paul

2017-01-04

The International Genome Sample Resource (IGSR; http://www.internationalgenome.org) expands in data type and population diversity the resources from the 1000 Genomes Project. IGSR represents the largest open collection of human variation data and provides easy access to these resources. IGSR was established in 2015 to maintain and extend the 1000 Genomes Project data, which has been widely used as a reference set of human variation and by researchers developing analysis methods. IGSR has mapped all of the 1000 Genomes sequence to the newest human reference (GRCh38), and will release updated variant calls to ensure maximal usefulness of the existing data. IGSR is collecting new structural variation data on the 1000 Genomes samples from long read sequencing and other technologies, and will collect relevant functional data into a single comprehensive resource. IGSR is extending coverage with new populations sequenced by collaborating groups. Here, we present the new data and analysis that IGSR has made available. We have also introduced a new data portal that increases discoverability of our data (previously only browseable through our FTP site) by focusing on particular samples, populations or data sets of interest. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
WS-SNPs&GO: a web server for predicting the deleterious effect of human protein variants using functional annotation

PubMed Central

2013-01-01

Background: SNPs&GO is a method for the prediction of deleterious Single Amino acid Polymorphisms (SAPs) using protein functional annotation. In this work, we present the web server implementation of SNPs&GO (WS-SNPs&GO). The server is based on Support Vector Machines (SVM) and for a given protein, its input comprises: the sequence and/or its three-dimensional structure (when available), a set of target variations and its functional Gene Ontology (GO) terms. The output of the server provides, for each protein variation, the probabilities to be associated to human diseases. Results: The server consists of two main components, including updated versions of the sequence-based SNPs&GO (recently scored as one of the best algorithms for predicting deleterious SAPs) and of the structure-based SNPs&GO3d programs. Sequence and structure based algorithms are extensively tested on a large set of annotated variations extracted from the SwissVar database. Selecting a balanced dataset with more than 38,000 SAPs, the sequence-based approach achieves 81% overall accuracy, 0.61 correlation coefficient and an Area Under the Curve (AUC) of the Receiver Operating Characteristic (ROC) curve of 0.88. For the subset of ~6,600 variations mapped on protein structures available at the Protein Data Bank (PDB), the structure-based method scores with 84% overall accuracy, 0.68 correlation coefficient, and 0.91 AUC. When tested on a new blind set of variations, the results of the server are 79% and 83% overall accuracy for the sequence-based and structure-based inputs, respectively. Conclusions: WS-SNPs&GO is a valuable tool that includes in a unique framework information derived from protein sequence, structure, evolutionary profile, and protein function. WS-SNPs&GO is freely available at http://snps.biofold.org/snps-and-go. PMID:23819482
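The performance figures in this abstract (overall accuracy and a correlation coefficient) can be derived from a binary confusion matrix. The sketch below assumes that the correlation coefficient is the Matthews correlation coefficient, which the abstract does not state explicitly; the counts are hypothetical and are not the SwissVar results.

```python
import math


def accuracy_and_mcc(tp, tn, fp, fn):
    """Overall accuracy and Matthews correlation coefficient from confusion counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention the coefficient is reported as 0 when a marginal is empty.
    mcc = (tp * tn - fp * fn) / denominator if denominator else 0.0
    return accuracy, mcc


# Hypothetical balanced test set: 100 disease-related and 100 neutral variants.
acc, mcc = accuracy_and_mcc(tp=80, tn=82, fp=18, fn=20)
print(f"accuracy={acc:.2f}, MCC={mcc:.2f}")
```

On a balanced dataset the two measures track each other closely, which is one reason a balanced selection is convenient for comparing predictors.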
Detection of a single nucleotide polymorphism in the human alpha-lactalbumin gene: implications for human milk proteins.

PubMed

Chowanadisai, Winyoo; Kelleher, Shannon L; Nemeth, Jennifer F; Yachetti, Stephen; Kuhlman, Charles F; Jackson, Joan G; Davis, Anne M; Lien, Eric L; Lönnerdal, Bo

2005-05-01

Variability in the protein composition of breast milk has been observed in many women and is believed to be due to natural variation of the human population. Single nucleotide polymorphisms (SNPs) are present throughout the entire human genome, but the impact of this variation on human milk composition and biological activity and infant nutrition and health is unclear. The goals of this study were to characterize a variant of human alpha-lactalbumin observed in milk from a Filipino population by determining the location of the polymorphism in the amino acid and genomic sequences of alpha-lactalbumin. Milk and blood samples were collected from 20 Filipino women, and milk samples were collected from an additional 450 women from nine different countries. alpha-Lactalbumin concentration was measured by high-performance liquid chromatography (HPLC), and milk samples containing the variant form of the protein were identified with both HPLC and mass spectrometry (MS). The molecular weight of the variant form was measured by MS, and the location of the polymorphism was narrowed down by protein reduction, alkylation and trypsin digestion. Genomic DNA was isolated from whole blood, and the polymorphism location and subject genotype were determined by amplifying the entire coding sequence of human alpha-lactalbumin by PCR, followed by DNA sequencing. A variant form of alpha-lactalbumin was observed in HPLC chromatograms, and the difference in molecular weight was determined by MS (wild type=14,070 Da, variant=14,056 Da). Protein reduction and digestion narrowed the polymorphism between the 33rd and 77th amino acid of the protein. The genetic polymorphism was identified as adenine to guanine, which translates to a substitution from isoleucine to valine at amino acid 46. The frequency of variation was higher in milk from China, Japan and Philippines, which suggests that this polymorphism is most prevalent in Asia. 
There are SNPs in the genome for human milk proteins and their implications for protein bioactivity and infant nutrition need to be considered.
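The 14 Da difference between the wild-type and variant masses can be checked against the shift expected for an isoleucine-to-valine substitution, which removes one CH2 group. The sketch uses standard average residue masses; the function name is illustrative.

```python
# Average residue masses in daltons (standard values for Ile and Val).
AVERAGE_RESIDUE_MASS = {"I": 113.1594, "V": 99.1326}


def substitution_shift(ref_residue, alt_residue):
    """Expected change in protein mass when ref_residue is replaced by alt_residue."""
    return AVERAGE_RESIDUE_MASS[alt_residue] - AVERAGE_RESIDUE_MASS[ref_residue]


observed_shift = 14056 - 14070          # variant minus wild type, from the abstract
expected_shift = substitution_shift("I", "V")
print(observed_shift, round(expected_shift, 2))
```

The observed and expected shifts agree to within the rounding of the reported masses, consistent with the Ile-to-Val assignment at amino acid 46.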
Geometric Patterns for Neighboring Bases Near the Stacked State in Nucleic Acid Strands.

PubMed

Sedova, Ada; Banavali, Nilesh K

2017-03-14

Structural variation in base stacking has been analyzed frequently in isolated double helical contexts for nucleic acids, but not as often in nonhelical geometries or in complex biomolecular environments. In this study, conformations of two neighboring bases near their stacked state in any environment are comprehensively characterized for single-strand dinucleotide (SSD) nucleic acid crystal structure conformations. An ensemble clustering method is used to identify a reduced set of representative stacking geometries based on pairwise distances between select atoms in consecutive bases, with multiple separable conformational clusters obtained for categories divided by nucleic acid type (DNA/RNA), SSD sequence, stacking face orientation, and the presence or absence of a protein environment. For both DNA and RNA, SSD conformations are observed that are either close to the A-form, or close to the B-form, or intermediate between the two forms, or further away from either form, illustrating the local structural heterogeneity near the stacked state. Among this large variety of distinct conformations, several common stacking patterns are observed between DNA and RNA, and between nucleic acids in isolation or in complex with proteins, suggesting that these might be stable stacking orientations. Noncanonical face/face orientations of the two bases are also observed for neighboring bases in the same strand, but their frequency is much lower, with multiple SSD sequences across categories showing no occurrences of such unusual stacked conformations. The resulting reduced set of stacking geometries is directly useful for stacking-energy comparisons between empirical force fields, prediction of plausible localized variations in single-strand structures near their canonical states, and identification of analogous stacking patterns in newly solved nucleic acid containing structures.
A novel cry2Ab gene from the indigenous isolate Bacillus thuringiensis subsp. kurstaki.

PubMed

Sevim, Ali; Eryüzlü, Emine; Demirbağ, Zihni; Demir, Ismail

2012-01-01

A novel cry2Ab gene was cloned and sequenced from the indigenous isolate of Bacillus thuringiensis subsp. kurstaki. This gene was designated as cry2Ab25 and its sequence revealed an open reading frame of 1,902 bp encoding a 633 aa protein with calculated molecular mass of 70 kDa and pI value of 8.98. The amino acid sequence of the Cry2Ab25 protein was compared with previously known Cry2Ab toxins, and the phylogenetic relationships among them were determined. The deduced amino acid sequence of the Cry2Ab25 protein showed 99% homology to the known Cry2Ab proteins, except for Cry2Ab10 and Cry2Ab12 with 97% homology, and a variation in one amino acid residue in comparison with all known Cry2Ab proteins. The cry2Ab25 gene was expressed in Escherichia coli BL21(DE3) cells. Sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) revealed that the Cry2Ab25 protein is about 70 kDa. The toxin expressed in BL21(DE3) exhibited high toxicity against Malacosoma neustria and Rhagoletis cerasi with 73% and 75% mortality after 5 days of treatment, respectively.
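The reported figures are internally consistent, as a quick calculation shows. The sketch assumes that the 1,902 bp open reading frame includes the stop codon and uses the common rule of thumb of about 110 Da per residue; both are assumptions, not statements from the abstract.

```python
def protein_from_orf(orf_bp, avg_residue_da=110.0):
    """Return (residue count, approximate mass in kDa) for an ORF that includes its stop codon."""
    if orf_bp % 3 != 0:
        raise ValueError("ORF length must be a multiple of three")
    residues = orf_bp // 3 - 1          # the stop codon encodes no residue
    return residues, residues * avg_residue_da / 1000.0


residues, mass_kda = protein_from_orf(1902)
print(residues, round(mass_kda, 1))
```

The result, 633 residues and roughly 70 kDa, matches both the calculated molecular mass and the SDS-PAGE estimate given above.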
Distribution of human papilloma virus type 16 E6/E7 gene mutation in cervical precancer or cancer: A case control study in Guizhou Province, China.

PubMed

Yang, Yingjie; Ren, Jie; Zhang, Qizhu

2016-02-01

HPV-16 varies geographically and is correlated with cervical cancer genesis and progression. This study aimed to determine the distribution of HPV-16 E6/E7 genetic variation in patients with invasive cervical cancer or precancer in Guizhou Province, China. A case-control study was designed, and the distribution of HPV-16 E6/E7 genetic variation was compared among women with cervical cancer, women with precancer, and sexually active women without cervical lesions. HPV infection was detected through flow-through hybridization and gene chip techniques to determine the prevalence of HPV-16 E6/E7 genetic variation. Among 90 specimens (30 cervical cancer, 30 precancer, 30 controls), 81 were subjected to HPV-16 E6/E7 gene sequencing. The rates of DNA sequence mutation and amino acid mutation were 76.5% (62/81) and 66.7% (54/81), respectively. Both E6 and E7 genes showed a higher mutation rate than their prototypes. The prevalence of E6/E7 mutation significantly differed between the cervical cancer group and the controls (P < 0.05) and between the cervical precancer group and the controls (P < 0.05). Mutations were simultaneously detected at the E6-D32E (T96A) and E7-M28V (A82G)/L94P (T281C) sites of the amino acid sequence. The most common genetic variation was D32E/M28V/L94P, which accounted for 35.8% of the cases (29/81). The D32E/M28V/L94P mutation was more frequent than the prototype in cervical cancer and precancer. HPV-16 E6/E7 genetic variations, such as D32E/M28V/L94P, are more prevalent in cervical cancer or precancer than in the controls. The possible correlation between genetic variation and carcinogenesis may be used to design an HPV vaccine for cervical carcinoma. © 2015 Wiley Periodicals, Inc.
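The paired notations in this abstract, such as D32E (T96A), link an amino acid change to the nucleotide change that causes it. A minimal consistency check is sketched below, assuming that nucleotide positions are numbered from the first base of each open reading frame; the helper names are hypothetical.

```python
import re


def codon_number(nt_pos):
    """1-based codon index containing a 1-based nucleotide position."""
    return (nt_pos - 1) // 3 + 1


def positions_agree(aa_change, nt_change):
    """True if the nucleotide change falls inside the codon named by the amino acid change."""
    aa_pos = int(re.fullmatch(r"[A-Z](\d+)[A-Z]", aa_change).group(1))
    nt_pos = int(re.fullmatch(r"[ACGT](\d+)[ACGT]", nt_change).group(1))
    return codon_number(nt_pos) == aa_pos


for aa_change, nt_change in [("D32E", "T96A"), ("M28V", "A82G"), ("L94P", "T281C")]:
    print(aa_change, nt_change, positions_agree(aa_change, nt_change))
```

All three pairs reported above pass this check under the ORF-relative numbering assumption.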
In Vitro Modeling of Bile Acid Processing by the Human Fecal Microbiota.

PubMed

Martin, Glynn; Kolida, Sofia; Marchesi, Julian R; Want, Elizabeth; Sidaway, James E; Swann, Jonathan R

2018-01-01

Bile acids, the products of concerted host and gut bacterial metabolism, have important signaling functions within the mammalian metabolic system and a key role in digestion. Given the complexity of the mega-variate bacterial community residing in the gastrointestinal tract, studying associations between individual bacterial genera and bile acid processing remains a challenge. Here, we present a novel in vitro approach to determine the bacterial genera associated with the metabolism of different primary bile acids and their potential to contribute to inter-individual variation in this processing. Anaerobic, pH-controlled batch cultures were inoculated with human fecal microbiota and treated with individual conjugated primary bile acids (500 μg/ml) to serve as the sole substrate for 24 h. Samples were collected throughout the experiment (0, 5, 10, and 24 h) and the bacterial composition was determined by 16S rRNA gene sequencing and the bile acid signatures were characterized using a targeted ultra-performance liquid chromatography-mass spectrometry (UPLC-MS) approach. Data fusion techniques were used to identify statistical bacterial-metabolic linkages. An increase in gut bacteria associated bile acids was observed over 24 h with variation in the rate of bile acid metabolism across the volunteers (n = 7). Correlation analysis identified a significant association between the Gemmiger genus and the deconjugation of glycine conjugated bile acids while the deconjugation of taurocholic acid was associated with bacteria from the Eubacterium and Ruminococcus genera. A positive correlation between Dorea and deoxycholic acid production suggests a potential role for this genus in cholic acid dehydroxylation. A slower deconjugation of taurocholic acid was observed in individuals with a greater abundance of Parasutterella and Akkermansia. 
This work demonstrates the utility of integrating compositional (metataxonomics) and functional (metabonomics) systems biology approaches, coupled to in vitro model systems, to study the biochemical capabilities of bacteria within complex ecosystems. Characterizing the dynamic interactions between the gut microbiota and the bile acid pool enables a greater understanding of how variation in the gut microbiota influences host bile acid signatures, their associated functions and their implications for health.
Identification of single amino acid substitutions (SAAS) in neuraminidase from influenza A virus (H1N1) via mass spectrometry analysis coupled with de novo peptide sequencing.

PubMed

Peng, Qisheng; Wang, Zijian; Wu, Donglin; Li, Xiaoou; Liu, Xiaofeng; Sun, Wanchun; Liu, Ning

2016-08-01

Amino acid substitutions in the neuraminidase of the influenza virus are the main cause of the emergence of resistance to zanamivir or oseltamivir during seasonal influenza treatment; they are the result of non-synonymous mutations in the viral genome that can be successfully detected by polymerase chain reaction (PCR)-based approaches. There is always an urgent need to detect variation in amino acid sequences directly at the protein level. Mass spectrometry coupled with de novo sequencing has also been explored as an alternative and straightforward strategy for detecting amino acid substitutions, and this approach is the primary focus of the present study. Influenza virus (A/Puerto Rico/8/1934 H1N1) propagated in embryonated chicken eggs was purified by ultracentrifugation, followed by PNGase F treatment. The deglycosylated virion was lysed and separated by sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE). The gel band corresponding to neuraminidase was excised and subjected to liquid chromatography tandem mass spectrometry (LC-MS/MS) analysis. LC-MS/MS analyses, coupled with manual de novo sequencing, allowed the determination of three amino acid substitutions, R346K, S349N, and S370I/L, in the neuraminidase from the influenza virus (A/Puerto Rico/8/1934 H1N1), which were located in three mutated peptides of the neuraminidase: YGNGVWIGK, TKNHSSR, and PNGWTETDI/LK, respectively. We found that the amino acid substitutions in the proteins of RNA viruses (including influenza A virus) resulting from non-synonymous gene mutations can indeed be directly analyzed via mass spectrometry, and that manual interpretation of the MS/MS data may be beneficial. Copyright © 2016 John Wiley & Sons, Ltd.
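The ambiguous notation S370I/L reflects the fact that isoleucine and leucine have identical masses, so mass-based sequencing cannot tell them apart. The sketch below illustrates this with standard monoisotopic residue masses; the function names are illustrative, and only the residues needed for the peptides above are listed.

```python
# Monoisotopic residue masses in daltons (standard values).
MONO = {
    "G": 57.02146, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "K": 128.09496, "E": 129.04259, "H": 137.05891,
    "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056


def peptide_mass(sequence):
    """Monoisotopic mass of a neutral, unmodified peptide."""
    return sum(MONO[aa] for aa in sequence) + WATER


def substitution_shift(ref_residue, alt_residue):
    """Mass change caused by replacing ref_residue with alt_residue."""
    return MONO[alt_residue] - MONO[ref_residue]


# The two candidate peptides for position 370 are isobaric.
print(peptide_mass("PNGWTETDIK") == peptide_mass("PNGWTETDLK"))
# The other two substitutions give distinct, measurable shifts.
print(round(substitution_shift("R", "K"), 4), round(substitution_shift("S", "N"), 4))
```

Substitutions such as R346K and S349N change the peptide mass by tens of daltons and are therefore directly visible in the spectra, whereas I and L remain indistinguishable without additional evidence.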

Differential display detects host nucleic acid motifs altered in scrapie-infected brain.

PubMed

Lathe, Richard; Harris, Alyson

2009-09-25

The transmissible spongiform encephalopathies (TSEs) including scrapie have been attributed to an infectious protein or prion. Infectivity is allied to conversion of the endogenous nucleic-acid-binding protein PrP to an infectious modified form known as PrP(sc). The protein-only theory does not easily explain the enigmatic properties of the agent including strain variation. It was previously suggested that a short nucleic acid, perhaps host-encoded, might contribute to the pathoetiology of the TSEs. No candidate host molecules that might explain transmission of strain differences have yet been put forward. Differential display is a robust technique for detecting nucleic acid differences between two populations. We applied this technique to total nucleic acid preparations from scrapie-infected and control brain. Independent RNA preparations from eight normal and eight scrapie-infected (strain 263K) hamster brains were randomly amplified and visualized in parallel. Though the nucleic acid patterns were generally identical in scrapie-infected versus control brain, some rare bands were differentially displayed. Molecular species consistently overrepresented (or underrepresented) in all eight infected brain samples versus all eight controls were excised from the display, sequenced, and assembled into contigs. Only seven ros contigs (RNAs over- or underrepresented in scrapie) emerged, representing <4 kb from the transcriptome. All contained highly stable regions of secondary structure. The most abundant scrapie-only ros sequence was homologous to a repetitive transposable element (LINE; long interspersed nuclear element). Other ros sequences identified cellular RNA 7SL, clathrin heavy chain, visinin-like protein-1, and three highly specific subregions of ribosomal RNA (ros1-3). The ribosomal ros sequences accurately corresponded to LINE; retrotransposon insertion sites in ribosomal DNA (p<0.01). 
These differential motifs implicate specific host RNAs in the pathoetiology of the TSEs.
Molecular epidemiology of porcine reproductive and respiratory syndrome virus in Central China since 2014: The prevalence of NADC30-like PRRSVs.

PubMed

Wang, Lin-Jian; Xie, Weitao; Chen, Xin-Xin; Qiao, Songlin; Zhao, Mengmeng; Gu, Yu; Zhao, Bao-Lei; Zhang, Gaiping

2017-08-01

Porcine reproductive and respiratory syndrome (PRRS), characterized by respiratory disorders in piglets and reproductive failure in sows, remains a major threat to the swine industry. Recently, the emergence of novel NADC30-like PRRS viruses (PRRSVs) has caused widespread outbreaks of PRRS. To investigate the epidemic characteristics of PRRSVs in Central China since 2014, 6372 clinical serum samples were tested by ELISA, 250 tissue samples were tested by RT-PCR, and among these, 30 ORF5 and 17 Nsp2 gene sequences were analyzed. A phylogenetic tree based on ORF5 revealed that 17 isolates clustered into subgroup 1, represented by NADC30. For Nsp2, the strains with a discontinuous 131-amino-acid deletion in Nsp2, called NADC30-like strains, clustered into subgroup 2. Our data suggest that the NADC30-like PRRSV strains have spread quickly and are now circulating and prevalent in Central China alongside the classical HP-PRRSV strains. In addition, amino acid variation analysis of GP5 revealed that the amino acid sequences of NADC30-like PRRSV strains underwent rapid evolution and contained extensive amino acid substitutions in important motifs, such as the potential neutralization epitope and the N-glycosylation sites. In summary, our data provide a large amount of detailed information on the molecular variation and genetic diversity of PRRSV in Central China. Copyright © 2017. Published by Elsevier Ltd.
DNA–DNA kissing complexes as a new tool for the assembly of DNA nanostructures

PubMed Central

Barth, Anna; Kobbe, Daniela; Focke, Manfred

2016-01-01

Kissing-loop annealing of nucleic acids occurs in nature in several viruses and in prokaryotic replication, among other circumstances. Nucleobases of two nucleic acid strands (loops) interact with each other, although the two strands cannot wrap around each other completely because of the adjacent double-stranded regions (stems). In this study, we exploited DNA kissing-loop interaction for nanotechnological application. We functionalized the vertices of DNA tetrahedrons with DNA stem-loop sequences. The complementary loop sequence design allowed the hybridization of different tetrahedrons via kissing-loop interaction, which might be further exploited for nanotechnology applications like cargo transport and logical elements. Importantly, we were able to manipulate the stability of those kissing-loop complexes based on the choice and concentration of cations, the temperature and the number of complementary loops per tetrahedron either at the same or at different vertices. Moreover, variations in loop sequences allowed the characterization of necessary sequences within the loop as well as additional stability control of the kissing complexes. Therefore, the properties of the presented nanostructures make them an important tool for DNA nanotechnology. PMID:26773051
Molecular cloning of a cDNA coding for GTP cyclohydrolase I from Dictyostelium discoideum.

PubMed Central

Witter, K; Cahill, D J; Werner, T; Ziegler, I; Rödl, W; Bacher, A; Gütlich, M

1996-01-01

The GTP cyclohydrolase I (GTP-CH) gene of the cellular slime mould Dictyostelium discoideum has been cloned and sequenced. The 855 bp cDNA of this gene contains the open reading frame (ORF) encoding 232 amino acids with a predicted molecular mass of approx. 26 kDa. Southern blot analysis indicated the presence of a single gene for GTP-CH in Dictyostelium. PCR amplification of the ORF from chromosomal DNA and sequencing showed the existence of a 101 bp intron in the GTP-CH gene of Dictyostelium discoideum. The amino acid sequence has 47% and 49% positional identity to those of the human and yeast enzymes respectively. Most of the sequence variation between species is located in the N-terminal part of the protein. The overall identity with the E. coli protein is markedly lower. The enzyme was expressed in E. coli and purified as a 68 kDa fusion protein with the maltose-binding protein of E. coli. GTP-CH of Dictyostelium is heat-stable and showed maximal activity at 60 °C. The Km value for GTP is 50 μM. PMID:8870645
Glutamate cysteine ligase (GCL) in the freshwater bivalve Unio tumidus: impact of storage conditions and seasons on activity and identification of partial coding sequence of the catalytic subunit.

PubMed

Coffinet, Stéphanie; Cossu-Leguille, Carole; Rodius, François; Vasseur, Paule

2008-09-01

Glutamate cysteine ligase (GCL; EC 6.3.2.2) is the first enzyme involved in the synthesis of glutathione. An HPLC method with fluorimetric detection was used to measure GCL activity in the gills and the digestive gland of the freshwater bivalve, Unio tumidus. Storage conditions were optimized in order to prevent a decrease in GCL activity and consisted of freezing the cytosolic fraction in the presence of protease (1 mM phenylmethylsulfonic fluoric acid) and gamma-glutamyltranspeptidase (1 mM L-serine borate mixture and 0.5 mM acivicin) inhibitors. Seasonal variations of activity were found in the digestive gland and, to a lesser extent, in the gills, with activity increasing in spring compared to winter. No sex differences were revealed. The GCL coding sequence was identified using degenerate primers designed in the highly conserved regions of the catalytic subunit of GCL. The partial sequence identified encoded 121 amino acids. Comparison of the identified partial coding sequence of U. tumidus with those available from vertebrates and invertebrates indicated that the GCL sequence was highly conserved.
Rapid Characterization of Insulin Modifications and Sequence Variations by Proteinase K Digestion and UHPLC-ESI-MS

NASA Astrophysics Data System (ADS)

Yang, Rong-Sheng; Tang, Weijuan; Sheng, Huaming; Meng, Fanyu

2018-01-01

Discovery of novel insulin analogs as therapeutics has remained an active area of research. Compared with native human insulin, insulin analog molecules normally incorporate either covalent modifications or amino acid sequence variations. From the drug discovery and development perspective, methods for efficient and detailed characterization of these primary structural changes are very important. In this report, we demonstrate that proteinase K digestion coupled with UPLC-ESI-MS analysis provides a simple and rapid approach to characterize the modifications and sequence variations of insulin molecules. A commercially available proteinase K digestion kit was used to process recombinant human insulin (RHI), insulin glargine, and fluorescein isothiocyanate-labeled recombinant human insulin (FITC-RHI) samples. The LC-MS data clearly showed that RHI and insulin glargine samples can be differentiated, and the FITC modifications in all three amine sites of the RHI molecule are well characterized. The end-to-end experiment and data interpretation were completed within 60 min. This approach is fast and simple, and can be easily implemented in early drug discovery laboratories to facilitate research on more advanced insulin therapeutics.
Rapid Characterization of Insulin Modifications and Sequence Variations by Proteinase K Digestion and UHPLC-ESI-MS

NASA Astrophysics Data System (ADS)

Yang, Rong-Sheng; Tang, Weijuan; Sheng, Huaming; Meng, Fanyu

2018-05-01

Discovery of novel insulin analogs as therapeutics has remained an active area of research. Compared with native human insulin, insulin analog molecules normally incorporate either covalent modifications or amino acid sequence variations. From the drug discovery and development perspective, methods for efficient and detailed characterization of these primary structural changes are very important. In this report, we demonstrate that proteinase K digestion coupled with UPLC-ESI-MS analysis provides a simple and rapid approach to characterize the modifications and sequence variations of insulin molecules. A commercially available proteinase K digestion kit was used to process recombinant human insulin (RHI), insulin glargine, and fluorescein isothiocyanate-labeled recombinant human insulin (FITC-RHI) samples. The LC-MS data clearly showed that RHI and insulin glargine samples can be differentiated, and the FITC modifications in all three amine sites of the RHI molecule are well characterized. The end-to-end experiment and data interpretation were achieved within 60 min. This approach is fast and simple, and can be easily implemented in early drug discovery laboratories to facilitate research on more advanced insulin therapeutics.
BreaKmer: detection of structural variation in targeted massively parallel sequencing data using kmers.

PubMed

Abo, Ryan P; Ducar, Matthew; Garcia, Elizabeth P; Thorner, Aaron R; Rojas-Rudilla, Vanesa; Lin, Ling; Sholl, Lynette M; Hahn, William C; Meyerson, Matthew; Lindeman, Neal I; Van Hummelen, Paul; MacConaill, Laura E

2015-02-18

Genomic structural variation (SV), a common hallmark of cancer, has important predictive and therapeutic implications. However, accurately detecting SV using high-throughput sequencing data remains challenging, especially for 'targeted' resequencing efforts. This is critically important in the clinical setting where targeted resequencing is frequently being applied to rapidly assess clinically actionable mutations in tumor biopsies in a cost-effective manner. We present BreaKmer, a novel approach that uses a 'kmer' strategy to assemble misaligned sequence reads for predicting insertions, deletions, inversions, tandem duplications and translocations at base-pair resolution in targeted resequencing data. Variants are predicted by realigning an assembled consensus sequence created from sequence reads that were abnormally aligned to the reference genome. Using targeted resequencing data from tumor specimens with orthogonally validated SV, non-tumor samples and whole-genome sequencing data, BreaKmer had a 97.4% overall sensitivity for known events and predicted 17 positively validated, novel variants. Relative to four publicly available algorithms, BreaKmer detected SV with increased sensitivity and limited calls in non-tumor samples, key features for variant analysis of tumor specimens in both the clinical and research settings. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
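As a rough illustration of the general k-mer idea mentioned in the abstract above, the following Python sketch collects k-mers that occur in sequence reads but not in a reference. It is not BreaKmer's implementation; the function names, the toy sequences, and the choice of k = 5 are all assumptions made for the example.

```python
def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def novel_kmers(reads, reference, k=5):
    """Return k-mers found in any read but absent from the reference.

    Hypothetical helper: such k-mers hint that a read spans sequence
    that does not match the reference contiguously.
    """
    reference_kmers = kmers(reference, k)
    sample_kmers = set()
    for read in reads:
        sample_kmers |= kmers(read, k)
    return sample_kmers - reference_kmers


# Toy data (invented): the first read matches the reference exactly,
# the second one diverges after "CGTT".
reference = "ACGTACGTTAGC"
reads = ["ACGTACG", "CGTTGGCA"]
print(sorted(novel_kmers(reads, reference, k=5)))
```

Only the second read contributes k-mers missing from the reference, which is the kind of signal a k-mer based method could use to select reads for local assembly.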
Intraspecific Polymorphism, Interspecific Divergence, and the Origins of Function-Altering Mutations in Deer Mouse Hemoglobin

PubMed Central

Natarajan, Chandrasekhar; Hoffmann, Federico G.; Lanier, Hayley C.; Wolf, Cole J.; Cheviron, Zachary A.; Spangler, Matthew L.; Weber, Roy E.; Fago, Angela; Storz, Jay F.

2015-01-01

Major challenges for illuminating the genetic basis of phenotypic evolution are to identify causative mutations, to quantify their functional effects, to trace their origins as new or preexisting variants, and to assess the manner in which segregating variation is transduced into species differences. Here, we report an experimental analysis of genetic variation in hemoglobin (Hb) function within and among species of Peromyscus mice that are native to different elevations. A multilocus survey of sequence variation in the duplicated HBA and HBB genes in Peromyscus maniculatus revealed that function-altering amino acid variants are widely shared among geographically disparate populations from different elevations, and numerous amino acid polymorphisms are also shared with closely related species. Variation in Hb-O2 affinity within and among populations of P. maniculatus is attributable to numerous amino acid mutations that have individually small effects. One especially surprising feature of the Hb polymorphism in P. maniculatus is that an appreciable fraction of functional standing variation in the two transcriptionally active HBA paralogs is attributable to recurrent gene conversion from a tandemly linked HBA pseudogene. Moreover, transpecific polymorphism in the duplicated HBA genes is not solely attributable to incomplete lineage sorting or introgressive hybridization; instead, it is mainly attributable to recurrent interparalog gene conversion that has occurred independently in different species. Partly as a result of concerted evolution between tandemly duplicated globin genes, the same amino acid changes that contribute to variation in Hb function within P. maniculatus also contribute to divergence in Hb function among different species of Peromyscus. In the case of function-altering Hb mutations in Peromyscus, there is no qualitative or quantitative distinction between segregating variants within species and fixed differences between species. PMID:25556236
Genomic Characterization of Phenylalanine Ammonia Lyase Gene in Buckwheat

PubMed Central

Thiyagarajan, Karthikeyan; Vitali, Fabio; Tolaini, Valentina; Galeffi, Patrizia; Cantale, Cristina; Vikram, Prashant; Singh, Sukhwinder; De Rossi, Patrizia; Nobili, Chiara; Procacci, Silvia; Del Fiore, Antonella; Antonini, Alessandro; Presenti, Ombretta; Brunori, Andrea

2016-01-01

The Phenylalanine Ammonia Lyase (PAL) gene, which plays a key role in the biosynthesis of the medicinally important compounds rutin/quercetin, was sequence characterized for its efficient genomics application. These compounds possess anti-diabetic and anti-cancer properties and are predominantly produced by Fagopyrum spp. In the present study, the PAL gene was sequenced from three Fagopyrum spp. (F. tataricum, F. esculentum and F. dibotrys) and showed the presence of three SNPs and four insertion/deletions at the intra- and interspecific level. Among them, the potential SNP (position 949 bp, G>C) with a Parsimony Informative Site was selected and successfully utilised to individuate the zygosity/allelic variation of 16 F. tataricum varieties. Insertion mutations were identified in the coding region, which resulted in the change of a stretch of 39 amino acids in the putative protein. Our study revealed that the autogamous species (F. tataricum) has a lower frequency of observed SNPs than the allogamous species (F. dibotrys and F. esculentum). The identified SNPs in F. tataricum did not result in amino acid changes, while in the other two species they caused both conservative and non-conservative variations. The consistent pattern of SNPs across the species revealed their phylogenetic importance. We found two groups of F. tataricum, one of which was closely related to F. dibotrys. The sequence characterization information for the PAL gene reported in the present investigation can be utilized in the genetic improvement of buckwheat with reference to its medicinal value. PMID:26990297
Human papillomavirus type 18 variant lineages in United States populations characterized by sequence analysis of LCR-E6, E2, and L1 regions.

PubMed

Arias-Pulido, Hugo; Peyton, Cheri L; Torrez-Martínez, Norah; Anderson, D Nelson; Wheeler, Cosette M

2005-07-20

While HPV 16 variant lineages have been well characterized, the knowledge about HPV 18 variants is limited. In this study, HPV 18 nucleotide variations in the E2 hinge region were characterized by sequence analysis in 47 control and 51 tumor specimens. Fifty of these specimens were randomly selected for sequencing of an LCR-E6 segment and 20 samples representative of LCR-E6 and E2 sequence variants were examined across the L1 region. A total of 2770 nucleotides per HPV 18 variant genome were considered in this study. HPV 18 variant nucleotides were linked among all gene segments analyzed and grouped into three main branches: Asian-American (AA), European (E), and African (Af). These three branches were equally distributed among controls and cases and when stratified by Hispanic and non-Hispanic ethnicities. Among invasive cervical cancer cases, no significant differences in the three HPV variant branches were observed among ethnic groups or when stratified by histopathology (squamous vs. adenocarcinoma). The Af branch showed the greatest nucleotide variability when compared to the HPV 18 reference sequence and was more closely related to HPV 45 than either AA or E branches. Our data also characterize nucleotide and amino acid variations in the L1 capsid gene among HPV 18 variants, which may be relevant to vaccine strategies and subsequent studies of naturally occurring HPV 18 variants. Several novel HPV 18 nucleotide variations were identified in this study.
Protein conformation and disease : pathological consequences of analogous mutations in homologous proteins.

DOE Office of Scientific and Technical Information (OSTI.GOV)

Stevens, F. J.; Pokkuluri, P. R.; Schiffer, M.

2000-12-19

The antibody light chain variable domain (VL) and myelin protein zero (MPZ) are representatives of the functionally diverse immunoglobulin superfamily. The VL is a subunit of the antigen-binding component of antibodies, while MPZ is the major membrane-linked constituent of the myelin sheaths that coat peripheral nerves. Despite limited amino acid sequence homology, the conformations of the core structures of the two proteins are largely superimposable. Amino acid variations in VL account for various conformational disease outcomes, including amyloidosis. However, the specific amino acid changes in VL that are responsible for disease have been obscured by multiple concurrent primary structure alterations. Recently, certain demyelination disorders have been linked to point mutations and single amino acid polymorphisms in MPZ. We demonstrate here that some pathogenic variations in MPZ correspond to changes suspected of determining amyloidosis in VL. This unanticipated observation suggests that studies of the biophysical origin of conformational disease in one member of a superfamily of homologous proteins may have implications throughout the superfamily. In some cases, findings may account for overt disease; in other cases, due to the natural repertoire of inherited polymorphisms, variations in a representative protein may predict subclinical impairment of homologous proteins.
GibbsCluster: unsupervised clustering and alignment of peptide sequences.

PubMed

Andreatta, Massimo; Alvarez, Bruno; Nielsen, Morten

2017-07-03

Receptor interactions with short linear peptide fragments (ligands) are at the base of many biological signaling processes. Conserved and information-rich amino acid patterns, commonly called sequence motifs, shape and regulate these interactions. Because of the properties of a receptor-ligand system or of the assay used to interrogate it, experimental data often contain multiple sequence motifs. GibbsCluster is a powerful tool for unsupervised motif discovery because it can simultaneously cluster and align peptide data. The GibbsCluster 2.0 presented here is an improved version incorporating insertions and deletions, accounting for variations in motif length in the peptide input. In basic terms, the program takes as input a set of peptide sequences and clusters them into meaningful groups. It returns the optimal number of clusters it identified, together with the sequence alignment and sequence motif characterizing each cluster. Several parameters are available to customize cluster analysis, including adjustable penalties for small clusters and overlapping groups and a trash cluster to remove outliers. As an example application, we used the server to deconvolute multiple specificities in large-scale peptidome data generated by mass spectrometry. The server is available at http://www.cbs.dtu.dk/services/GibbsCluster-2.0. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
Genetic variation in Dip5, an amino acid permease, and Pdr5, a multiple drug transporter, regulates glyphosate resistance in S. cerevisiae.

PubMed

Rong-Mullins, Xiaoqing; Ravishankar, Apoorva; McNeal, Kirsten A; Lonergan, Zachery R; Biega, Audrey C; Creamer, J Philip; Gallagher, Jennifer E G

2017-01-01

S. cerevisiae from different environments are subject to a wide range of selective pressures, whether intentional or by happenstance. Chemicals classified by their application, such as herbicides, fungicides and antibiotics, can affect non-target organisms. First marketed as RoundUp™, glyphosate is the most widely used herbicide. In plants, glyphosate inhibits EPSPS, an enzyme of the shikimate pathway, which is present in many organisms but lacking in mammals. The shikimate pathway produces chorismate, which is the precursor to all the aromatic amino acids, para-aminobenzoic acid, and Coenzyme Q10. Crops engineered to be resistant to glyphosate contain a homolog of EPSPS that is not bound by glyphosate. Here, we show that S. cerevisiae has a wide range of glyphosate resistance. Sequence comparison between the target proteins, i.e., the plant EPSPS and the yeast orthologous protein Aro1, predicted that yeast would be resistant to glyphosate. However, the growth variation seen in the subset of yeast tested was not due to polymorphisms within Aro1; instead, it was caused by genetic variation in an ABC multiple drug transporter, Pdr5, and an amino acid permease, Dip5. Using genetic variation as a probe into glyphosate response, we uncovered mechanisms that contribute to the transportation of glyphosate in and out of the cell. Taking advantage of the natural genetic variation within yeast and measuring growth under different conditions that would change the use of the shikimate pathway, we uncovered a general transport mechanism of glyphosate into eukaryotic cells.
Genetic variation in Dip5, an amino acid permease, and Pdr5, a multiple drug transporter, regulates glyphosate resistance in S. cerevisiae

PubMed Central

McNeal, Kirsten A.; Lonergan, Zachery R.; Biega, Audrey C.; Creamer, J. Philip

2017-01-01

S. cerevisiae from different environments are subject to a wide range of selective pressures, whether intentional or by happenstance. Chemicals classified by their application, such as herbicides, fungicides and antibiotics, can affect non-target organisms. First marketed as RoundUp™, glyphosate is the most widely used herbicide. In plants, glyphosate inhibits EPSPS, an enzyme of the shikimate pathway, which is present in many organisms but lacking in mammals. The shikimate pathway produces chorismate, which is the precursor to all the aromatic amino acids, para-aminobenzoic acid, and Coenzyme Q10. Crops engineered to be resistant to glyphosate contain a homolog of EPSPS that is not bound by glyphosate. Here, we show that S. cerevisiae has a wide range of glyphosate resistance. Sequence comparison between the target proteins, i.e., the plant EPSPS and the yeast orthologous protein Aro1, predicted that yeast would be resistant to glyphosate. However, the growth variation seen in the subset of yeast tested was not due to polymorphisms within Aro1; instead, it was caused by genetic variation in an ABC multiple drug transporter, Pdr5, and an amino acid permease, Dip5. Using genetic variation as a probe into glyphosate response, we uncovered mechanisms that contribute to the transportation of glyphosate in and out of the cell. Taking advantage of the natural genetic variation within yeast and measuring growth under different conditions that would change the use of the shikimate pathway, we uncovered a general transport mechanism of glyphosate into eukaryotic cells. PMID:29155836
Modular probes for enriching and detecting complex nucleic acid sequences

NASA Astrophysics Data System (ADS)

Wang, Juexiao Sherry; Yan, Yan Helen; Zhang, David Yu

2017-12-01

Complex DNA sequences are difficult to detect and profile, but are important contributors to human health and disease. Existing hybridization probes lack the capability to selectively bind and enrich hypervariable, long or repetitive sequences. Here, we present a generalized strategy for constructing modular hybridization probes (M-Probes) that overcomes these challenges. We demonstrate that M-Probes can tolerate sequence variations of up to 7 nt at prescribed positions while maintaining single nucleotide sensitivity at other positions. M-Probes are also shown to be capable of sequence-selectively binding a continuous DNA sequence of more than 500 nt. Furthermore, we show that M-Probes can detect genes with triplet repeats exceeding a programmed threshold. As a demonstration of this technology, we have developed a hybrid capture method to determine the exact triplet repeat expansion number in the Huntington's gene of genomic DNA using quantitative PCR.
Evolution-Based Functional Decomposition of Proteins

PubMed Central

Rivoire, Olivier; Reynolds, Kimberly A.; Ranganathan, Rama

2016-01-01

The essential biological properties of proteins—folding, biochemical activities, and the capacity to adapt—arise from the global pattern of interactions between amino acid residues. The statistical coupling analysis (SCA) is an approach to defining this pattern that involves the study of amino acid coevolution in an ensemble of sequences comprising a protein family. This approach indicates a functional architecture within proteins in which the basic units are coupled networks of amino acids termed sectors. This evolution-based decomposition has potential for new understandings of the structural basis for protein function. To facilitate its usage, we present here the principles and practice of the SCA and introduce new methods for sector analysis in a python-based software package (pySCA). We show that the pattern of amino acid interactions within sectors is linked to the divergence of functional lineages in a multiple sequence alignment—a model for how sector properties might be differentially tuned in members of a protein family. This work provides new tools for studying proteins and for generally testing the concept of sectors as the principal units of function and adaptive variation. PMID:27254668
High levels of MHC class II allelic diversity in lake trout from Lake Superior

USGS Publications Warehouse

Dorschner, M.O.; Duris, T.; Bronte, C.R.; Burnham-Curtis, M. K.; Phillips, R.B.

2000-01-01

Sequence variation in a 216 bp portion of the major histocompatibility complex (MHC) II B1 domain was examined in 74 individual lake trout (Salvelinus namaycush) from different locations in Lake Superior. Forty-three alleles were obtained which encoded 71-72 amino acids of the mature protein. These sequences were compared with previous data obtained from five Pacific salmon species and Atlantic salmon using the same primers. Although all of the lake trout alleles clustered together in the neighbor-joining analysis of amino acid sequences, one amino acid allelic lineage was shared with Atlantic salmon (Salmo salar), a species in another genus which probably diverged from Salvelinus more than 10-20 million years ago. As shown previously in other salmonids, the level of nonsynonymous nucleotide substitution (d(N)) exceeded the level of synonymous substitution (d(S)). The level of nucleotide diversity at the MHC class II B1 locus was considerably higher in lake trout than in the Pacific salmon (genus Oncorhynchus). These results are consistent with the hypothesis that lake trout colonized Lake Superior from more than one refuge following the Wisconsin glaciation. Recent population bottlenecks may have reduced nucleotide diversity in Pacific salmon populations.
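The abstract above compares nucleotide diversity between species. One common way to define that quantity is the average number of pairwise differences per site among aligned sequences. The Python sketch below computes it for a toy alignment. The function name and the example sequences are invented for illustration, each sequence is weighted equally (no allele frequencies), and nothing here is claimed to reproduce the study's own calculation.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Average pairwise differences per site for equal-length aligned sequences.

    Assumes an ungapped alignment and at least two sequences; every
    sequence counts once (allele frequencies are not modelled).
    """
    if len(seqs) < 2:
        raise ValueError("need at least two sequences")
    n_sites = len(seqs[0])
    if any(len(s) != n_sites for s in seqs):
        raise ValueError("sequences must be aligned to the same length")
    pairs = list(combinations(seqs, 2))
    # Count mismatching positions for every pair of sequences.
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * n_sites)


# Toy alignment (invented): pairwise differences are 1, 2 and 1 over 4 sites.
print(nucleotide_diversity(["ACGT", "ACGA", "TCGA"]))
```

For the toy alignment, the three pairs differ at 4 positions in total across 3 pairs of 4 sites each, so the value is 4 / 12.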
UET: a database of evolutionarily-predicted functional determinants of protein sequences that cluster as functional sites in protein structures.

PubMed

Lua, Rhonald C; Wilson, Stephen J; Konecki, Daniel M; Wilkins, Angela D; Venner, Eric; Morgan, Daniel H; Lichtarge, Olivier

2016-01-04

The structure and function of proteins underlie most aspects of biology and their mutational perturbations often cause disease. To identify the molecular determinants of function as well as targets for drugs, it is central to characterize the important residues and how they cluster to form functional sites. The Evolutionary Trace (ET) achieves this by ranking the functional and structural importance of the protein sequence positions. ET uses evolutionary distances to estimate functional distances and correlates genotype variations with those in the fitness phenotype. Thus, ET ranks are worse for sequence positions that vary among evolutionarily closer homologs but better for positions that vary mostly among distant homologs. This approach identifies functional determinants, predicts function, guides the mutational redesign of functional and allosteric specificity, and interprets the action of coding sequence variations in proteins, people and populations. Now, the UET database offers pre-computed ET analyses for the protein structure databank, and on-the-fly analysis of any protein sequence. A web interface retrieves ET rankings of sequence positions and maps results to a structure to identify functionally important regions. This UET database integrates several ways of viewing the results on the protein sequence or structure and can be found at http://mammoth.bcm.tmc.edu/uet/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Hybridization properties of long nucleic acid probes for detection of variable target sequences, and development of a hybridization prediction algorithm

PubMed Central

Öhrmalm, Christina; Jobs, Magnus; Eriksson, Ronnie; Golbob, Sultan; Elfaitouri, Amal; Benachenhou, Farid; Strømme, Maria; Blomberg, Jonas

2010-01-01

One of the main problems in nucleic acid-based techniques for detection of infectious agents, such as influenza viruses, is that of nucleic acid sequence variation. DNA probes, 70-nt long, some including the nucleotide analog deoxyribose-Inosine (dInosine), were analyzed for hybridization tolerance to different amounts and distributions of mismatching bases, e.g. synonymous mutations, in target DNA. Microsphere-linked 70-mer probes were hybridized in 3M TMAC buffer to biotinylated single-stranded (ss) DNA for subsequent analysis in a Luminex® system. When mismatches interrupted contiguous matching stretches of 6 nt or longer, it had a strong impact on hybridization. Contiguous matching stretches are more important than the same number of matching nucleotides separated by mismatches into several regions. dInosine, but not 5-nitroindole, substitutions at mismatching positions stabilized hybridization remarkably well, comparable to N (4-fold) wobbles in the same positions. In contrast to shorter probes, 70-nt probes with judiciously placed dInosine substitutions and/or wobble positions were remarkably mismatch tolerant, with preserved specificity. An algorithm, NucZip, was constructed to model the nucleation and zipping phases of hybridization, integrating both local and distant binding contributions. It predicted hybridization more exactly than previous algorithms, and has the potential to guide the design of variation-tolerant yet specific probes. PMID:20864443
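The abstract above stresses that contiguous matching stretches matter more than the total number of matching nucleotides. The Python sketch below makes that distinction concrete by listing the lengths of contiguous matching runs between two sequences. It is not the NucZip algorithm; the function name and toy sequences are invented, and the comparison assumes two equal-length, ungapped strings in which identical letters stand for a matching position.

```python
def matching_stretches(probe, target):
    """Return the lengths of contiguous runs of matching positions.

    Simplification (assumption): probe and target are compared
    position by position, with no gaps and no base-pairing model.
    """
    runs = []
    current = 0
    for p, t in zip(probe, target):
        if p == t:
            current += 1
        else:
            if current:
                runs.append(current)
            current = 0
    if current:
        runs.append(current)
    return runs


probe = "ACGTACGTAC"
# One mismatch: 9 matching positions split into two long runs.
print(matching_stretches(probe, "ACGTTCGTAC"))
# Two mismatches: 8 matching positions split into shorter runs.
print(matching_stretches(probe, "ACCTAGGTAC"))
```

Summarizing a probe-target pair by its run lengths rather than by a single match count is one simple way to express the observation that mismatches interrupting long matching stretches have a strong impact.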

Quasispecies characters of hepatitis B virus in immunoprophylaxis failure infants.

PubMed

Wang, Xin; Deng, Wanyan; Qian, Keli; Deng, Haijun; Huang, Yong; Tu, Zeng; Huang, Ailong; Long, Quanxin

2018-06-01

Hepatitis B vaccination prevents 80-95% of transmission and reduces the incidence of HBV in children. Variations in the "a" determinant of HBV surface antigen (HBsAg) have been reported to be the most prevalent cause of vaccine or antibody escape. There is conflicting evidence as to whether escape mutants arise de novo in infected infants or whether mutants that preexisted maternally subsequently undergo selective replication in the infant under immune pressure. Here, we report that nearly 65% (55 of 85) of child patients with vaccination failure have no amino acid substitution in the "a" determinant as seen by Sanger sequencing. We further employed an Illumina sequencing platform-based method to detect HBV quasispecies in four immunoprophylaxis failure infants and their mothers. In our data, the substitution rate of amino acids located in the "a" determinant is relatively low (< 10%) for I/T126A, C124S, F134Y, K141Q, Q129H, D144A, G145V, and N146K, and showed no statistical difference from that of their mothers, proving that these vaccine escape mutants preexist maternally as minor variants. In addition, bioinformatic analysis showed that the binding affinity of high-variation epitopes (amino acid divergence between mothers and their infants > 20%) to related HLA molecules was generally decreased; these traces of immune escape suggest that immune pressure was present and effective in all samples.
Conservation of Three-Dimensional Helix-Loop-Helix Structure through the Vertebrate Lineage Reopens the Cold Case of Gonadotropin-Releasing Hormone-Associated Peptide.

PubMed

Pérez Sirkin, Daniela I; Lafont, Anne-Gaëlle; Kamech, Nédia; Somoza, Gustavo M; Vissio, Paula G; Dufour, Sylvie

2017-01-01

GnRH-associated peptide (GAP) is the C-terminal portion of the gonadotropin-releasing hormone (GnRH) preprohormone. Although it was reported in mammals that GAP may act as a prolactin-inhibiting factor and can be co-secreted with GnRH into the hypophyseal portal blood, GAP has been practically out of the research circuit for about 20 years. Comparative studies highlighted the low conservation of GAP primary amino acid sequences among vertebrates, which contributed to the view that this peptide only participates in the folding or carrying process of GnRH. Considering that the three-dimensional (3D) structure of a protein may define its function, the aim of this study was to evaluate whether GAP sequences and 3D structures are conserved in the vertebrate lineage. GAP sequences from various vertebrates were retrieved from databases. Analysis of primary amino acid sequence identity and similarity, molecular phylogeny, and prediction of 3D structures were performed. Amino acid sequence comparison and phylogeny analyses confirmed the large variation of GAP sequences throughout vertebrate radiation. In contrast, prediction of the 3D structure revealed a striking conservation of the 3D structure of GAP1 (GAP associated with the hypophysiotropic type 1 GnRH), despite low amino acid sequence conservation. This GAP1 peptide presented a typical helix-loop-helix (HLH) structure in all the vertebrate species analyzed. This HLH structure could also be predicted for GAP2 in some but not all vertebrate species and in none of the GAP3 analyzed. These results allowed us to infer that selective pressures have maintained GAP1 HLH structure throughout the vertebrate lineage. The conservation of the HLH motif, known to confer biological activity to various proteins, suggests that GAP1 peptides may exert some hypophysiotropic biological functions across vertebrate radiation.
Conservation of Three-Dimensional Helix-Loop-Helix Structure through the Vertebrate Lineage Reopens the Cold Case of Gonadotropin-Releasing Hormone-Associated Peptide

PubMed Central

Pérez Sirkin, Daniela I.; Lafont, Anne-Gaëlle; Kamech, Nédia; Somoza, Gustavo M.; Vissio, Paula G.; Dufour, Sylvie

2017-01-01

GnRH-associated peptide (GAP) is the C-terminal portion of the gonadotropin-releasing hormone (GnRH) preprohormone. Although it was reported in mammals that GAP may act as a prolactin-inhibiting factor and can be co-secreted with GnRH into the hypophyseal portal blood, GAP has been practically out of the research circuit for about 20 years. Comparative studies highlighted the low conservation of GAP primary amino acid sequences among vertebrates, which contributed to the view that this peptide only participates in the folding or carrying process of GnRH. Considering that the three-dimensional (3D) structure of a protein may define its function, the aim of this study was to evaluate whether GAP sequences and 3D structures are conserved in the vertebrate lineage. GAP sequences from various vertebrates were retrieved from databases. Analysis of primary amino acid sequence identity and similarity, molecular phylogeny, and prediction of 3D structures were performed. Amino acid sequence comparison and phylogeny analyses confirmed the large variation of GAP sequences throughout vertebrate radiation. In contrast, prediction of the 3D structure revealed a striking conservation of the 3D structure of GAP1 (GAP associated with the hypophysiotropic type 1 GnRH), despite low amino acid sequence conservation. This GAP1 peptide presented a typical helix-loop-helix (HLH) structure in all the vertebrate species analyzed. This HLH structure could also be predicted for GAP2 in some but not all vertebrate species and in none of the GAP3 analyzed. These results allowed us to infer that selective pressures have maintained GAP1 HLH structure throughout the vertebrate lineage. The conservation of the HLH motif, known to confer biological activity to various proteins, suggests that GAP1 peptides may exert some hypophysiotropic biological functions across vertebrate radiation. PMID:28878737
Theileria parva antigens recognized by CD8+ T cells show varying degrees of diversity in buffalo-derived infected cell lines.

PubMed

Sitt, Tatjana; Pelle, Roger; Chepkwony, Maurine; Morrison, W Ivan; Toye, Philip

2018-05-06

The extent of sequence diversity among the genes encoding 10 antigens (Tp1-10) known to be recognized by CD8+ T lymphocytes from cattle immune to Theileria parva was analysed. The sequences were derived from parasites in 23 buffalo-derived cell lines, three cattle-derived isolates and one cloned cell line obtained from a buffalo-derived stabilate. The results revealed substantial variation among the antigens through sequence diversity. The greatest nucleotide and amino acid diversity were observed in Tp1, Tp2 and Tp9. Tp5 and Tp7 showed the least amount of allelic diversity, and Tp5, Tp6 and Tp7 had the lowest levels of protein diversity. Tp6 was the most conserved protein; only a single non-synonymous substitution was found in all obtained sequences. The ratio of non-synonymous: synonymous substitutions varied from 0.84 (Tp1) to 0.04 (Tp6). Apart from Tp2 and Tp9, we observed no variation in the other defined CD8+ T cell epitopes (Tp4, 5, 7 and 8), indicating that epitope variation is not a universal feature of T. parva antigens. In addition to providing markers that can be used to examine the diversity in T. parva populations, the results highlight the potential for using conserved antigens to develop vaccines that provide broad protection against T. parva.
Genomic stability of adipogenic human adenovirus 36.

PubMed

Nam, J-H; Na, H-N; Atkinson, R L; Dhurandhar, N V

2014-02-01

Human adenovirus Ad36 increases adiposity in several animal models, including rodents and non-human primates. Importantly, Ad36 is associated with human obesity, which has prompted research to understand its epidemiology and to develop a vaccine to prevent a subgroup of obesity. For this purpose, understanding the genomic stability of Ad36 in vivo and in vitro infections is critical. Here, we examined whether in vitro cell passaging over a 14-year period introduced any genetic variation in Ad36. We sequenced the whole genome of Ad36, which was plaque purified in 1998 from the original strain obtained from American Type Culture Collection and passaged approximately 12 times over the past 14 years (Ad36-2012). This DNA sequence was compared with a previously published sequence of Ad36 likely obtained from the same source (Ad36-1988). Compared with Ad36-1988, only two nucleotides were altered in Ad36-2012: a T insertion at nucleotide 1862, which may induce early termination of the E1B viral protein, and a T➝C transition at nucleotide 26 136. Virus with the T insertion (designated Ad36-2012-T6) was mixed with wild-type virus lacking the T insertion (designated Ad36-2012-T5) in the viral stock. The transition at nucleotide 26 136 does not change the encoded amino acid (aspartic acid) in the pVIII viral protein. The rate of genetic variation in Ad36 is ∼2.37 × 10^-6 mutations/nucleotide/passage. Of particular importance, there were no mutations in the E4orf1 gene, the critical gene for producing obesity. This very low variation rate should reduce concerns about genetic variability when developing Ad36 vaccines or developing assays for detecting Ad36 infection in populations.
Porcine MYF6 gene: sequence, homology analysis, and variation in the promoter region.

PubMed

Wyszyńska-Koko, J; Kurył, J

2004-01-01

The MYF6 gene codes for a bHLH transcription factor belonging to the MyoD family. Its expression accompanies the processes of differentiation and maturation of myotubes during embryogenesis and continues at a relatively high level after birth, affecting the muscle phenotype. The porcine MYF6 gene was amplified, sequenced, and compared with MYF6 gene sequences of other species. The amino acid sequence was deduced and an interspecies homology analysis was performed. Myf-6 protein shows high conservation among species, with 99 and 97% identity when comparing pig with cow and human, respectively, and 93% when comparing pig with mouse and rat. A single nucleotide polymorphism (SNP) was found within the promoter region; it is a T→C transition recognized by the MspI restriction enzyme.
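A polymorphism that changes a restriction site can be typed by checking which allele carries the recognition sequence. The sketch below is only illustrative: MspI recognizes CCGG, but the two allele strings are hypothetical toy sequences, and it assumes the C allele creates the site (the abstract does not say whether the site is gained or lost).

```python
import re

MSPI_SITE = "CCGG"  # MspI recognition sequence


def cut_sites(seq, site=MSPI_SITE):
    """Return 0-based start indices of every occurrence of the site (overlaps allowed)."""
    return [m.start() for m in re.finditer(f"(?={site})", seq.upper())]


# Hypothetical promoter fragments, not the porcine MYF6 sequence.
allele_t = "GATTACCTGGATCA"  # T allele: no CCGG present
allele_c = "GATTACCCGGATCA"  # C allele: T->C change creates CCGG

print(cut_sites(allele_t))  # []
print(cut_sites(allele_c))  # [6]
```

In a PCR-RFLP assay the same logic shows up as band patterns: an amplicon with the site is cut into two fragments, while one without it stays intact.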
The genetic diversity and complete genome analysis of two novel porcine deltacoronavirus isolates in Thailand in 2015.

PubMed

Lorsirigool, Athip; Saeng-Chuto, Kepalee; Madapong, Adthakorn; Temeeyasen, Gun; Tripipat, Thitima; Kaewprommal, Pavita; Tantituvanont, Angkana; Piriyapongsa, Jittima; Nilubol, Dachrit

2017-04-01

Porcine deltacoronavirus (PDCoV) was identified in intestinal samples collected from piglets with diarrhea in Thailand in 2015. Two Thai PDCoV isolates, P23_15_TT_1115 and P24_15_NT1_1215, were isolated and identified. The full-length genome sequences of the P23_15_TT_1115 and P24_15_NT1_1215 isolates were 25,404 and 25,407 nucleotides in length, respectively, which is shorter than those of US and China PDCoV. The phylogenetic analysis based on the full-length genome demonstrated that Thai PDCoV isolates form a new cluster separated from US and China PDCoV but are more closely related to China PDCoV than to US isolates. The genetic analyses demonstrated that Thai PDCoVs have 97.0-97.8% and 92.2-94.0% similarity with China PDCoV at the nucleotide and amino acid levels, respectively, but share 97.1-97.3% and 92.5-93.0% similarity with US PDCoV at the nucleotide and amino acid levels, respectively. Thai PDCoV possesses two discontinuous deletions of five amino acids in the ORF1a/b region. One additional deletion of one amino acid was identified in P23_15_TT_1115. The variation analyses demonstrated that six regions (nt 1317-1436, 2997-3096, 19,737-19,836, 20,277-20,376, 21,177-21,276, and 22,371-22,416) in ORF1a/b and spike genes exhibit high sequence variation between Thai and other PDCoV. The analyses of amino acid changes suggested that they could potentially be from different lineages.
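Similarity percentages of this kind are typically derived from pairwise comparisons of aligned sequences. The following is a minimal sketch of one common definition (identical residues divided by gap-free aligned columns); the abstract does not state which definition or software the authors used, and the two sequences are hypothetical toy data rather than PDCoV genomes.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # skip columns containing a gap (assumed convention)
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Hypothetical pre-aligned toy sequences with one mismatch and a 3-nt deletion.
ref = "ATGGCTAAAGTTCTT"
query = "ATGGCTAAG---CTT"
print(round(percent_identity(ref, query), 1))  # 91.7
```

Whether gapped columns are skipped or counted as mismatches changes the result, so reported values are only comparable when the same convention is used.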
Using msa-2b as a molecular marker for genotyping Mexican isolates of Babesia bovis.

PubMed

Genis, Alma D; Perez, Jocelin; Mosqueda, Juan J; Alvarez, Antonio; Camacho, Minerva; Muñoz, Maria de Lourdes; Rojas, Carmen; Figueroa, Julio V

2009-12-01

Variable merozoite surface antigens of Babesia bovis are exposed glycoproteins having a role in erythrocyte invasion. Members of this gene family include msa-1 and msa-2 (msa-2c, msa-2a(1), msa-2a(2) and msa-2b). To determine the sequence variation among B. bovis Mexican isolates using msa-2b as a genetic marker, PCR amplicons corresponding to msa-2b were cloned and plasmids carrying the corresponding inserts were purified and sequenced. Comparative analysis of nucleotide and deduced amino acid sequences revealed distinct degrees of variability and identity among the coding gene sequences obtained from 16 geographically different Mexican B. bovis isolates and a reference strain. Clustal-W multiple alignments of the MSA-2b deduced amino acid sequences from the 17 B. bovis Mexican isolates identified three genotypes, each with a distinct set of amino acid residues present at the variable region: Genotype I, represented by the MO7 strain (in vitro culture-derived from the Mexico isolate) as well as the RAD, Chiapas-1, Tabasco and Veracruz-3 isolates; Genotype II, represented by the Jalisco, Mexico and Veracruz-2 isolates; and Genotype III, comprising the sequences from most of the isolates studied: Tamaulipas-1, Chiapas-2, Guerrero-1, Nayarit, Quintana Roo, Nuevo Leon, Tamaulipas-2, Yucatan and Guerrero-2. Moreover, these three genotypes could be discriminated from each other by using a PCR-RFLP approach. The results suggest that indels within the variable region of msa-2b sequences can be useful markers for identifying a particular genotype present in field populations of B. bovis isolated from infected cattle in Mexico.
Cloning, characterization, expression and comparative analysis of pig Golgi membrane sphingomyelin synthase 1.

PubMed

Guillén, Natalia; Navarro, María A; Surra, Joaquín C; Arnal, Carmen; Fernández-Juan, Marta; Cebrián-Pérez, Jose Alvaro; Osada, Jesús

2007-02-15

Pig sphingomyelin synthase 1 (SMS1) cDNA was cloned, characterized and compared to the human ortholog. The porcine protein consists of 413 amino acids and displays a 97% sequence identity with the human protein. A phylogenetic tree of proteins reveals that porcine SMS1 is more closely related to bovine and rodent proteins than to human. The observed protein mass was higher than the theoretical prediction based on the amino acid sequence, suggesting some kind of posttranslational modification. Quantitative representation of tissue distribution obtained by real-time RT-PCR showed that it was widely expressed, although important variations in levels were obtained among organs. Thus, the cardiovascular system, especially the heart, showed the highest value of all the tissues studied. Regional differences of expression were observed in the central nervous system and intestinal tract. Analysis of the hepatic mRNA and protein expressions of SMS1 following turpentine treatment revealed a progressive decrease in the former paralleled by a decrease in the protein concentration. These findings indicate that the variation in expression among tissues might reflect a different requirement of Golgi sphingomyelin for the specific function of each organ, and that the enzyme is regulated in response to turpentine-induced hepatic injury.
Sequence variation of functional HTLV-II tax alleles among isolates from an endemic population: lack of evidence for oncogenic determinant in tax.

PubMed

Hjelle, B; Chaney, R

1992-02-01

Human T-cell leukemia-lymphoma virus type II (HTLV-II) has been isolated from patients with hairy cell leukemia (HCL). We previously described a population with longstanding endemic HTLV-II infection, and showed that there is no increased risk for HCL in the affected groups. We thus have direct evidence that the endemic form(s) of HTLV-II cause HCL infrequently, if at all. By comparison, there is reason to suspect that the viruses isolated from patients with HCL had an etiologic role in the disease in those patients. One way to reconcile these conflicting observations is to consider that isolates of HTLV-II might differ in oncogenic potential. To determine whether the structure of the putative oncogenic determinant of HTLV-II, tax2, might differ in the new isolates compared to the tax of the prototype HCL isolate, MO, four new functional tax cDNAs were cloned from new isolates. Sequence analysis showed only minor (0.9-2.0%) amino acid variation compared to the published sequence of MO tax2. Some codons were consistently different from published sequences of the MO virus, but in most cases, such variations were also found in each of two tax2 clones we isolated from the MO T-cell line. These variations rendered the new clones more similar to the tax1 of the pathogenic virus HTLV-I. Thus we find no evidence that pathologic determinants of HTLV-II can be assigned to the tax gene.
Analyzing endocrine system conservation and evolution.

PubMed

Bonett, Ronald M

2016-08-01

Analyzing variation in rates of evolution can provide important insights into the factors that constrain trait evolution, as well as those that promote diversification. Metazoan endocrine systems exhibit apparent variation in evolutionary rates of their constituent components at multiple levels, yet relatively few studies have quantified these patterns and analyzed them in a phylogenetic context. This may be in part due to historical and current data limitations for many endocrine components and taxonomic groups. However, recent technological advancements such as high-throughput sequencing provide the opportunity to collect large-scale comparative data sets for even non-model species. Such ventures will produce a fertile data landscape for evolutionary analyses of nucleic acid and amino acid based endocrine components. Here I summarize evolutionary rate analyses that can be applied to categorical and continuous endocrine traits, and also those for nucleic acid and protein-based components. I emphasize analyses that could be used to test whether other variables (e.g., ecology, ontogenetic timing of expression, etc.) are related to patterns of rate variation and endocrine component diversification. The application of phylogenetic-based rate analyses to comparative endocrine data will greatly enhance our understanding of the factors that have shaped endocrine system evolution.
Correlation between fibroin amino acid sequence and physical silk properties.

PubMed

Fedic, Robert; Zurovec, Michal; Sehnal, Frantisek

2003-09-12

The fiber properties of lepidopteran silk depend on the amino acid repeats that interact during H-fibroin polymerization. The aim of our research was to relate repeat composition to insect biology and fiber strength. Representative regions of the H-fibroin genes were sequenced and analyzed in three pyralid species: wax moth (Galleria mellonella), European flour moth (Ephestia kuehniella), and Indian meal moth (Plodia interpunctella). The amino acid repeats are species-specific, evidently a diversification of an ancestral region of 43 residues, and include three types of regularly dispersed motifs: modifications of GSSAASAA sequence, stretches of tripeptides GXZ where X and Z represent bulky residues, and sequences similar to PVIVIEE. No concatenations of GX dipeptide or alanine, which are typical for Bombyx silkworms and Antheraea silk moths, respectively, were found. Despite different repeat structure, the silks of G. mellonella and E. kuehniella exhibit similar tensile strength as the Bombyx and Antheraea silks. We suggest that in these latter two species, variations in the repeat length obstruct repeat alignment, but sufficiently long stretches of iterated residues get superposed to interact. In the pyralid H-fibroins, interactions of the widely separated and diverse motifs depend on the precision of repeat matching; silk is strong in G. mellonella and E. kuehniella, with 2-3 types of long homogeneous repeats, and nearly 10 times weaker in P. interpunctella, with seven types of shorter erratic repeats. The high proportion of large amino acids in the H-fibroin of pyralids has probably evolved in connection with the spinning habit of caterpillars that live in protective silk tubes and spin continuously, enlarging the tubes on one end and partly devouring the other one. The silk serves as a depot of energetically rich and essential amino acids that may be scarce in the diet.
A global analysis of CNVs in swine using whole genome sequence data and association analysis with fatty acid composition and growth traits.

PubMed

Revilla, Manuel; Puig-Oliveras, Anna; Castelló, Anna; Crespo-Piazuelo, Daniel; Paludo, Ediane; Fernández, Ana I; Ballester, Maria; Folch, Josep M

2017-01-01

Copy number variations (CNVs) are important genetic variants complementary to SNPs, and can be considered as biomarkers for some economically important traits in domestic animals. In the present study, a genomic analysis of porcine CNVs based on next-generation sequencing data was carried out to identify CNVs segregating in an Iberian x Landrace backcross population and study their association with fatty acid composition and growth-related traits. A total of 1,279 CNVs, including duplications and deletions, were detected, ranging from 106 to 235 CNVs across samples, with an average of 183 CNVs per sample. Moreover, we detected 540 CNV regions (CNVRs) containing 245 genes. Functional annotation suggested that these genes possess a great variety of molecular functions and may play a role in production traits in commercial breeds. Some of the identified CNVRs contained relevant functional genes (e.g., CLCA4, CYP4X1, GPAT2, MOGAT2, PLA2G2A and PRKG1, among others). The variation in copy number of four of them (CLCA4, GPAT2, MOGAT2 and PRKG1) was validated in 150 BC1_LD (25% Iberian and 75% Landrace) animals by qPCR. Additionally, their contribution regarding backfat and intramuscular fatty acid composition and growth-related traits was analyzed. Statistically significant associations were obtained for CNVR112 (GPAT2) for the C18:2(n-6)/C18:3(n-3) ratio in backfat and carcass length, among others. Notably, GPATs are enzymes that catalyze the first step in the biosynthesis of both triglycerides and glycerophospholipids, suggesting that this CNVR may contribute to genetic variation in fatty acid composition and growth traits. These findings provide useful genomic information to facilitate the further identification of trait-related CNVRs affecting economically important traits in pigs.
A wide extent of inter-strain diversity in virulent and vaccine strains of alphaherpesviruses.

PubMed

Szpara, Moriah L; Tafuri, Yolanda R; Parsons, Lance; Shamim, S Rafi; Verstrepen, Kevin J; Legendre, Matthieu; Enquist, L W

2011-10-01

Alphaherpesviruses are widespread in the human population, and include herpes simplex virus 1 (HSV-1) and 2, and varicella zoster virus (VZV). These viral pathogens cause epithelial lesions, and then infect the nervous system to cause lifelong latency, reactivation, and spread. A related veterinary herpesvirus, pseudorabies (PRV), causes similar disease in livestock that results in significant economic losses. Vaccines developed for VZV and PRV serve as useful models for the development of an HSV-1 vaccine. We present full genome sequence comparisons of the PRV vaccine strain Bartha, and two virulent PRV isolates, Kaplan and Becker. These genome sequences were determined by high-throughput sequencing and assembly, and present new insights into the attenuation of a mammalian alphaherpesvirus vaccine strain. We find many previously unknown coding differences between PRV Bartha and the virulent strains, including changes to the fusion proteins gH and gB, and over forty other viral proteins. Inter-strain variation in PRV protein sequences is much closer to levels previously observed for HSV-1 than for the highly stable VZV proteome. Almost 20% of the PRV genome contains tandem short sequence repeats (SSRs), a class of nucleic acid motifs whose length variation has been associated with changes in DNA binding site efficiency, transcriptional regulation, and protein interactions. We find SSRs throughout the herpesvirus family, and provide the first global characterization of SSRs in viruses, both within and between strains. We find SSR length variation between different isolates of PRV and HSV-1, which may provide a new mechanism for phenotypic variation between strains. Finally, we detected a small number of polymorphic bases within each plaque-purified PRV strain, and we characterize the effect of passage and plaque-purification on these polymorphisms. These data add to growing evidence that even plaque-purified stocks of stable DNA viruses exhibit limited sequence heterogeneity, which likely seeds future strain evolution.
Identification of random nucleic acid sequence aberrations using dual capture probes which hybridize to different chromosome regions

DOEpatents

Lucas, J.N.; Straume, T.; Bogen, K.T.

1998-03-24

A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration. 14 figs.
Identification of random nucleic acid sequence aberrations using dual capture probes which hybridize to different chromosome regions

DOEpatents

Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

1998-01-01

A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration.
Characterization of natural polymorphic sites of the HIV-1 integrase before the introduction of HIV-1 integrase inhibitors in Germany

PubMed Central

Meixenberger, Karolin; Pouran Yousef, Kaveh; Somogyi, Sybille; Fiedler, Stefan; Bartmeyer, Barbara; von Kleist, Max; Kücherer, Claudia

2014-01-01

Introduction: The aim of our study was to analyze the occurrence and evolution of HIV-1 integrase polymorphisms during the HIV-1 epidemic in Germany prior to the introduction of the first integrase inhibitor raltegravir in 2007. Materials and Methods: Plasma samples from drug-naïve HIV-1 infected individuals newly diagnosed between 1986 and 2006 were used to determine PCR-based population sequences of the HIV-1 integrase (amino acids 1–278). The HIV-1 subtype was determined using the REGA HIV-1 subtyping tool. We calculated the frequency of amino acids at each position of the HIV-1 integrase in 337 subtype B strains for the time periods 1986–1989, 1991–1994, 1995–1998, 1999–2002, and 2003–2006. Positions were defined as polymorphic if amino acid variation was >1% in any period. Logistic regression was used to identify trends in amino acid variation over time. Resistance-associated mutations were identified according to the IAS 2013 list and the HIVdb, ANRS and GRADE algorithms. Results: Overall, 56.8% (158/278) of amino acid positions were polymorphic and 15.8% (25/158) of these positions exhibited a significant trend in amino acid variation over time. Proportionately, most polymorphic positions (63.3%, 31/49) were detected in the N-terminal zinc finger domain of the HIV-1 integrase. Motifs and residues essential for HIV-1 integrase activity showed little polymorphism, but within the minimal non-specific DNA binding region I220-D270 up to 18.1% amino acid variation was noticed, including four positions with significant amino acid variation over time (S230, D232, D256, A265). No major resistance mutations were identified, and minor resistance mutations were rarely observed and showed no trend over time. E157Q, considered by the HIVdb, ANRS, and GRADE algorithms, was the most frequent resistance-associated polymorphism with an overall prevalence of 2.4%. Conclusions: Detailed knowledge of the evolutionary variation of HIV-1 integrase polymorphisms is important to understand the development of resistance in the presence of the drug. Our results will help define the relevance of integrase polymorphisms in HIV strains resistant to integrase inhibitors and improve resistance interpretation algorithms. PMID:25397491
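The rule that a position is polymorphic if amino acid variation exceeds 1% in any period can be sketched as follows. This is a minimal illustration under stated assumptions: "variation" is taken to mean the fraction of sequences that differ from the most common residue at a position (the abstract does not define it), and the period names and four-residue sequences are hypothetical toy data.

```python
from collections import Counter


def variation_per_position(sequences):
    """Fraction of sequences differing from the most common residue at each position."""
    length = len(sequences[0])
    if any(len(s) != length for s in sequences):
        raise ValueError("sequences must have equal length")
    fractions = []
    for column in zip(*sequences):
        top_count = Counter(column).most_common(1)[0][1]
        fractions.append(1.0 - top_count / len(column))
    return fractions


def polymorphic_positions(periods, threshold=0.01):
    """Return 1-based positions whose variation exceeds the threshold in any period."""
    hits = set()
    for seqs in periods.values():
        for pos, frac in enumerate(variation_per_position(seqs), start=1):
            if frac > threshold:
                hits.add(pos)
    return sorted(hits)


# Hypothetical toy data: two time periods, three sequences each.
periods = {
    "early": ["MKDS", "MKDS", "MKDA"],
    "late": ["MKDS", "MRDS", "MKDS"],
}
print(polymorphic_positions(periods))  # [2, 4]
```

With real data each period would hold many aligned integrase sequences, and the per-period fractions could then feed a trend test such as the logistic regression mentioned in the abstract.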
Use of extremely short Förster resonance energy transfer probes in real-time polymerase chain reaction

PubMed Central

Kutyavin, Igor V.

2013-01-01

Described in the article is a new approach for the sequence-specific detection of nucleic acids in real-time polymerase chain reaction (PCR) using fluorescently labeled oligonucleotide probes. The method is based on the production of PCR amplicons, which fold into dumbbell-like secondary structures carrying a specially designed ‘probe-luring’ sequence at their 5′ ends. Hybridization of this sequence to a complementary ‘anchoring’ tail introduced at the 3′ end of a fluorescent probe enables the probe to bind to its target during PCR, and the subsequent probe cleavage results in the fluorescence signal. As shown in the study, this amplicon-endorsed and guided formation of the probe-target duplex allows the use of extremely short oligonucleotide probes, down to tetranucleotides in length. In particular, the short length of the fluorescent probes makes possible the development of a ‘universal’ probe inventory that is relatively small in size but represents all possible sequence variations. The unparalleled cost-effectiveness of the inventory approach is discussed. Despite the short length of the probes, this new method, named Angler real-time PCR, remains highly sequence specific, and the results of the study indicate that it can be effectively used for quantitative PCR and the detection of polymorphic variations. PMID:24013564
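The size of a probe inventory that covers every possible sequence of a given length follows directly from combinatorics: with four standard bases there are 4^k sequences of length k, so 256 tetranucleotides. The sketch below simply enumerates them; it assumes unmodified A, C, G and T only, and the function name is illustrative. The article's actual inventory design (labels, tails, modified bases) is not described in the abstract.

```python
from itertools import product


def probe_inventory(k, alphabet="ACGT"):
    """Enumerate every sequence of length k over the alphabet, in lexicographic order."""
    return ["".join(bases) for bases in product(alphabet, repeat=k)]


tetramers = probe_inventory(4)
print(len(tetramers))  # 256
print(tetramers[:3])   # ['AAAA', 'AAAC', 'AAAG']
```

The count grows fourfold with each added base (1,024 pentamers, 4,096 hexamers), which is why very short probes keep a complete inventory small.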
Method for identifying and quantifying nucleic acid sequence aberrations

DOEpatents

Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

1998-01-01

A method for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe.
Method for identifying and quantifying nucleic acid sequence aberrations

DOEpatents

Lucas, J.N.; Straume, T.; Bogen, K.T.

1998-07-21

A method is disclosed for detecting nucleic acid sequence aberrations by detecting nucleic acid sequences having both a first and a second nucleic acid sequence type, the presence of the first and second sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. The method uses a first hybridization probe which includes a nucleic acid sequence that is complementary to a first sequence type and a first complexing agent capable of attaching to a second complexing agent and a second hybridization probe which includes a nucleic acid sequence that selectively hybridizes to the second nucleic acid sequence type over the first sequence type and includes a detectable marker for detecting the second hybridization probe. 11 figs.

Sperm Bindin Divergence under Sexual Selection and Concerted Evolution in Sea Stars.

PubMed

Patiño, Susana; Keever, Carson C; Sunday, Jennifer M; Popovic, Iva; Byrne, Maria; Hart, Michael W

2016-08-01

Selection associated with competition among males or sexual conflict between mates can create positive selection for high rates of molecular evolution of gamete recognition genes and lead to reproductive isolation between species. We analyzed coding sequence and repetitive domain variation in the gene encoding the sperm acrosomal protein bindin in 13 diverse sea star species. We found that bindin has a conserved coding sequence domain structure in all 13 species, with several repeated motifs in a large central region that is similar among all sea stars in organization but highly divergent among genera in nucleotide and predicted amino acid sequence. More bindin codons and lineages showed positive selection for high relative rates of amino acid substitution in genera with gonochoric outcrossing adults (and greater expected strength of sexual selection) than in selfing hermaphrodites. That difference is consistent with the expectation that selfing (a highly derived mating system) may moderate the strength of sexual selection and limit the accumulation of bindin amino acid differences. The results implicate both positive selection on single codons and concerted evolution within the repetitive region in bindin divergence, and suggest that both single amino acid differences and repeat differences may affect sperm-egg binding and reproductive compatibility.
A genome-wide analysis of the lysophosphatidate acyltransferase (LPAAT) gene family in cotton: organization, expression, sequence variation, and association with seed oil content and fiber quality.

PubMed

Wang, Nuohan; Ma, Jianjiang; Pei, Wenfeng; Wu, Man; Li, Haijing; Li, Xingli; Yu, Shuxun; Zhang, Jinfa; Yu, Jiwen

2017-03-01

Lysophosphatidic acid acyltransferase (LPAAT), encoded by a multigene family, is a rate-limiting enzyme in the Kennedy pathway in higher plants. Cotton is the most important natural fiber crop and one of the most important oilseed crops. However, little is known about the genes coding for LPAATs involved in oil biosynthesis with regard to their genome organization, diversity, expression, natural genetic variation, and association with fiber development and oil content in cotton. In this study, a comprehensive genome-wide analysis in four Gossypium species with genome sequences, i.e., the tetraploids G. hirsutum (AD1) and G. barbadense (AD2) and their possible ancestral diploids G. raimondii (D5) and G. arboreum (A2), identified 13, 10, 8, and 9 LPAAT genes, respectively, that were divided into four subfamilies. RNA-seq analyses of the LPAAT genes in the widely grown G. hirsutum suggest their differential expression at the transcriptional level in developing cottonseeds and fibers. Although 10 LPAAT genes were co-localised with quantitative trait loci (QTL) for cottonseed oil or protein content within a 25-cM region, only one single strand conformation polymorphic (SSCP) marker developed from a synonymous single nucleotide polymorphism (SNP) of the At-Gh13LPAAT5 gene was significantly correlated with cottonseed oil and protein contents in one of the three field tests. Moreover, transformed yeasts using the At-Gh13LPAAT5 gene with the two sequences for the SNP led to similar results, i.e., a 25-31% increase in palmitic acid and oleic acid, and a 16-29% increase in total triacylglycerol (TAG). The results in this study demonstrated that the natural variation in the LPAAT genes available for improving cottonseed oil content and fiber quality is limited; therefore, traditional cross breeding should not expect much progress in improving cottonseed oil content or fiber quality through marker-assisted selection for the LPAAT genes. However, enhancing the expression of one of the LPAAT genes such as At-Gh13LPAAT5 can significantly increase the production of total TAG and other fatty acids, providing an incentive for further studies into the use of LPAAT genes to increase cottonseed oil content through biotechnology.
A complete Neandertal mitochondrial genome sequence determined by high-throughput sequencing

PubMed Central

Green, Richard E.; Malaspinas, Anna-Sapfo; Krause, Johannes; Briggs, Adrian W.; Johnson, Philip L. F.; Uhler, Caroline; Meyer, Matthias; Good, Jeffrey M.; Maricic, Tomislav; Stenzel, Udo; Prüfer, Kay; Siebauer, Michael; Burbano, Hernán A.; Ronan, Michael; Rothberg, Jonathan M.; Egholm, Michael; Rudan, Pavao; Brajković, Dejana; Kućan, Željko; Gušić, Ivan; Wikström, Mårten; Laakkonen, Liisa; Kelso, Janet; Slatkin, Montgomery; Pääbo, Svante

2008-01-01

Summary A complete mitochondrial (mt) genome sequence was reconstructed from a 38,000-year-old Neandertal individual using 8,341 mtDNA sequences identified among 4.8 Gb of DNA generated from ~0.3 grams of bone. Analysis of the assembled sequence unequivocally establishes that the Neandertal mtDNA falls outside the variation of extant human mtDNAs and allows an estimate of the divergence date between the two mtDNA lineages of 660,000±140,000 years. Of the 13 proteins encoded in the mtDNA, subunit 2 of cytochrome c oxidase of the mitochondrial electron transport chain has experienced the largest number of amino acid substitutions in human ancestors since the separation from Neandertals. There is evidence that purifying selection in the Neandertal mtDNA was reduced compared to other primate lineages suggesting that the effective population size of Neandertals was small. PMID:18692465
Deep sequencing in library selection projects: what insight does it bring?

PubMed

Glanville, J; D'Angelo, S; Khan, T A; Reddy, S T; Naranjo, L; Ferrara, F; Bradbury, A R M

2015-08-01

High throughput sequencing is poised to change all aspects of the way antibodies and other binders are discovered and engineered. Millions of available sequence reads provide an unprecedented sampling depth able to guide the design and construction of effective, high quality naïve libraries containing tens of billions of unique molecules. Furthermore, during selections, high throughput sequencing enables quantitative tracing of enriched clones and position-specific guidance to amino acid variation under positive selection during antibody engineering. Successful application of the technologies relies on specific PCR reagent design, correct sequencing platform selection, and effective use of computational tools and statistical measures to remove error, identify antibodies, estimate diversity, and extract signatures of selection from the clone down to individual structural positions. Here we review these considerations and discuss some of the remaining challenges to the widespread adoption of the technology.
Deep sequencing in library selection projects: what insight does it bring?

PubMed Central

Glanville, J; D’Angelo, S; Khan, T.A.; Reddy, S. T.; Naranjo, L.; Ferrara, F.; Bradbury, A.R.M.

2015-01-01

High throughput sequencing is poised to change all aspects of the way antibodies and other binders are discovered and engineered. Millions of available sequence reads provide an unprecedented sampling depth able to guide the design and construction of effective, high quality naïve libraries containing tens of billions of unique molecules. Furthermore, during selections, high throughput sequencing enables quantitative tracing of enriched clones and position-specific guidance to amino acid variation under positive selection during antibody engineering. Successful application of the technologies relies on specific PCR reagent design, correct sequencing platform selection, and effective use of computational tools and statistical measures to remove error, identify antibodies, estimate diversity, and extract signatures of selection from the clone down to individual structural positions. Here we review these considerations and discuss some of the remaining challenges to the widespread adoption of the technology. PMID:26451649
Conserved Sequence Preferences Contribute to Substrate Recognition by the Proteasome*

PubMed Central

Yu, Houqing; Singh Gautam, Amit K.; Wilmington, Shameika R.; Wylie, Dennis; Martinez-Fonts, Kirby; Kago, Grace; Warburton, Marie; Chavali, Sreenivas; Inobe, Tomonao; Finkelstein, Ilya J.; Babu, M. Madan

2016-01-01

The proteasome has pronounced preferences for the amino acid sequence of its substrates at the site where it initiates degradation. Here, we report that modulating these sequences can tune the steady-state abundance of proteins over 2 orders of magnitude in cells. This is the same dynamic range as seen for inducing ubiquitination through a classic N-end rule degron. The stability and abundance of His3 constructs dictated by the initiation site affect survival of yeast cells and show that variation in proteasomal initiation can affect fitness. The proteasome's sequence preferences are linked directly to the affinity of the initiation sites to their receptor on the proteasome and are conserved between Saccharomyces cerevisiae, Schizosaccharomyces pombe, and human cells. These findings establish that the sequence composition of unstructured initiation sites influences protein abundance in vivo in an evolutionarily conserved manner and can affect phenotype and fitness. PMID:27226608
Sequence analysis of Epstein-Barr virus (EBV) early genes BARF1 and BHRF1 in NK/T cell lymphoma from Northern China.

PubMed

Sun, Lingling; Che, Kui; Zhao, Zhenzhen; Liu, Song; Xing, Xiaoming; Luo, Bing

2015-09-04

NK/T cell lymphoma is an aggressive lymphoma almost always associated with EBV. BamHI-A rightward open reading frame 1 (BARF1) and BamHI-H rightward open reading frame 1 (BHRF1) are two EBV early genes that may be involved in the oncogenicity of EBV. V29A strains, a BARF1 mutant subtype, have shown higher prevalence in nasopharyngeal carcinoma (NPC), which may suggest an association between this variation and NPC. To characterize the sequence variation patterns of the Epstein-Barr virus (EBV) early genes and to elucidate their association with NK/T cell lymphoma, we analyzed the sequences of BARF1 and BHRF1 in EBV-positive NK/T cell lymphoma samples from Northern China. In situ hybridization (ISH) for EBV-encoded small RNA1 (EBER1) with specific digoxigenin-labeled probes was used to select the EBV-positive lymphoma samples. Nested polymerase chain reaction (nested PCR) and DNA sequence analysis were used to obtain the sequences of BARF1 and BHRF1. The polymorphisms of these two genes were classified according to their signature changes and compared with known variation data for the corresponding EBV genes. Two major subtypes of the BARF1 gene, designated B95-8 and V29A, were identified. B95-8 was the dominant subtype. The V29A subtype had one consistent amino acid change at residue 29 (V → A). Compared with B95-8, an amino acid change at residue 88 (L → V) of BHRF1 was found in the majority of the isolates, and a change at residue 79 (V → L) in a few isolates. The functional domains of BARF1 and BHRF1 were highly conserved. The distributions of BARF1 and BHRF1 subtypes did not differ significantly among different EBV-associated malignancies and healthy donors. The sequences of BARF1 and BHRF1 are highly conserved, which may help maintain the biological function of these two genes. There is no evidence that particular EBV substrains of BARF1 or BHRF1 are region-restricted or disease-specific.
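The substitution notation used in this abstract (a V → A change at residue 29, written V29A) can be derived mechanically from a reference sequence and an isolate sequence. The sketch below is a minimal illustration and not the authors' pipeline: the function name `call_substitutions` and the toy sequences are hypothetical, and it assumes the two protein sequences are already aligned to equal length.

```python
def call_substitutions(reference: str, isolate: str) -> list[str]:
    """Return substitutions such as 'V29A' between two aligned protein sequences.

    Positions are 1-based and counted along the alignment; columns containing
    a gap ('-') in either sequence are skipped rather than reported.
    """
    if len(reference) != len(isolate):
        raise ValueError("sequences must be aligned to the same length")
    calls = []
    for position, (ref_aa, iso_aa) in enumerate(zip(reference, isolate), start=1):
        if "-" in (ref_aa, iso_aa):
            continue  # indels are not substitutions
        if ref_aa != iso_aa:
            calls.append(f"{ref_aa}{position}{iso_aa}")
    return calls


# Toy sequences (hypothetical, not BARF1 or BHRF1): one V -> A change at position 3.
print(call_substitutions("MKVLA", "MKALA"))
```

With real data, the reference would be the B95-8 protein sequence and each isolate would be aligned to it first, so that reported positions match the reference numbering.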
THE SMALL ACID SOLUBLE PROTEINS (SASP α and SASP β) OF BACILLUS WEIHENSTEPHANENSIS AND B. MYCOIDES GROUP 2 ARE THE MOST DISTINCT AMONG THE B. CEREUS GROUP

PubMed Central

Callahan, Courtney; Fox, Karen; Fox, Alvin

2009-01-01

The Bacillus cereus group includes Bacillus anthracis, Bacillus cereus, Bacillus thuringiensis, Bacillus mycoides and Bacillus weihenstephanensis. The small acid-soluble spore protein (SASP) β has been previously demonstrated to be among the biomarkers differentiating B. anthracis and B. cereus; SASP β of B. cereus most commonly exhibits one or two amino acid substitutions when compared to B. anthracis. SASP α is conserved in sequence between these two species. Neither SASP α nor β of B. thuringiensis, B. mycoides and B. weihenstephanensis has been previously characterized as a taxonomic discriminator. In the current work, molecular weight (MW) variation of these SASPs was determined by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI TOF MS) for representative strains of the 5 species within the B. cereus group. The measured MWs also correlate with calculated MWs of translated amino acid sequences generated from whole genome sequencing projects. SASP α and β demonstrated consistent MW among B. cereus, B. thuringiensis, and B. mycoides strains (group 1). However, B. mycoides (group 2) and B. weihenstephanensis SASP α and β were quite distinct, making them unique among the B. cereus group. Limited sequence changes were observed in SASP α (at most 3 substitutions and 2 deletions), indicating that it is a more conserved protein than SASP β (up to 6 substitutions and a deletion). Another, even more conserved SASP, SASP α-β type, is described here for the first time. PMID:19616612
Complete Chloroplast Genome Sequence of Tartary Buckwheat (Fagopyrum tataricum) and Comparative Analysis with Common Buckwheat (F. esculentum)

PubMed Central

Cho, Kwang-Soo; Yun, Bong-Kyoung; Yoon, Young-Ho; Hong, Su-Young; Mekapogu, Manjulatha; Kim, Kyung-Hee; Yang, Tae-Jin

2015-01-01

We report the chloroplast (cp) genome sequence of tartary buckwheat (Fagopyrum tataricum), obtained by next-generation sequencing technology, and compare it with the previously reported common buckwheat (F. esculentum ssp. ancestrale) cp genome. The cp genome of F. tataricum has a total sequence length of 159,272 bp, which is 327 bp shorter than the common buckwheat cp genome. The cp gene content, order, and orientation are similar to those of common buckwheat, but with some structural variation in tandem and palindromic repeat frequencies and in junction areas. A total of seven InDels (around 100 bp) were found within the intergenic sequences and the ycf1 gene. The copy number of the 21-bp tandem repeat differed between F. tataricum (four repeats) and F. esculentum (one repeat), and the InDel of the ycf1 gene was 63 bp long. Coding sequences are highly conserved at both the nucleotide and amino acid levels, with about 98% homology, and four genes (rpoC2, ycf3, accD, and clpP) have high synonymous (Ks) values. PCR-based InDel markers were applied to diverse genetic resources of F. tataricum and F. esculentum, and the amplicon size was identical to that expected in silico. Therefore, these InDel markers are informative biomarkers to practically distinguish raw or processed buckwheat products derived from F. tataricum and F. esculentum. PMID:25966355
Structural details (kinks and non-α conformations) in transmembrane helices are intrahelically determined and can be predicted by sequence pattern descriptors

PubMed Central

Rigoutsos, Isidore; Riek, Peter; Graham, Robert M.; Novotny, Jiri

2003-01-01

One of the promising methods of protein structure prediction involves the use of amino acid sequence-derived patterns. Here we report on the creation of non-degenerate motif descriptors derived through data mining of training sets of residues taken from the transmembrane-spanning segments of polytopic proteins. These residues correspond to short regions in which there is a deviation from the regular α-helical character (i.e. π-helices, 3(10)-helices and kinks). A ‘search engine’ derived from these motif descriptors correctly identifies, and discriminates amongst instances of the above ‘non-canonical’ helical motifs contained in the SwissProt/TrEMBL database of protein primary structures. Our results suggest that deviations from α-helicity are encoded locally in sequence patterns only about 7–9 residues long and can be determined in silico directly from the amino acid sequence. Delineation of such variations in helical habit is critical to understanding the complex structure–function relationships of polytopic proteins and for drug discovery. The success of our current methodology foretells development of similar prediction tools capable of identifying other structural motifs from sequence alone. The method described here has been implemented and is available on the World Wide Web at http://cbcsrv.watson.ibm.com/Ttkw.html. PMID:12888523
Structural details (kinks and non-alpha conformations) in transmembrane helices are intrahelically determined and can be predicted by sequence pattern descriptors.

PubMed

Rigoutsos, Isidore; Riek, Peter; Graham, Robert M; Novotny, Jiri

2003-08-01

One of the promising methods of protein structure prediction involves the use of amino acid sequence-derived patterns. Here we report on the creation of non-degenerate motif descriptors derived through data mining of training sets of residues taken from the transmembrane-spanning segments of polytopic proteins. These residues correspond to short regions in which there is a deviation from the regular alpha-helical character (i.e. pi-helices, 3(10)-helices and kinks). A 'search engine' derived from these motif descriptors correctly identifies, and discriminates amongst instances of the above 'non-canonical' helical motifs contained in the SwissProt/TrEMBL database of protein primary structures. Our results suggest that deviations from alpha-helicity are encoded locally in sequence patterns only about 7-9 residues long and can be determined in silico directly from the amino acid sequence. Delineation of such variations in helical habit is critical to understanding the complex structure-function relationships of polytopic proteins and for drug discovery. The success of our current methodology foretells development of similar prediction tools capable of identifying other structural motifs from sequence alone. The method described here has been implemented and is available on the World Wide Web at http://cbcsrv.watson.ibm.com/Ttkw.html.
Determining divergence times with a protein clock: update and reevaluation

NASA Technical Reports Server (NTRS)

Feng, D. F.; Cho, G.; Doolittle, R. F.; Bada, J. L. (Principal Investigator)

1997-01-01

A recent study of the divergence times of the major groups of organisms as gauged by amino acid sequence comparison has been expanded and the data have been reanalyzed with a distance measure that corrects for both constraints on amino acid interchange and variation in substitution rate at different sites. Beyond that, the availability of complete genome sequences for several eubacteria and an archaebacterium has had a great impact on the interpretation of certain aspects of the data. Thus, the majority of the archaebacterial sequences are not consistent with currently accepted views of the Tree of Life which cluster the archaebacteria with eukaryotes. Instead, they are either outliers or mixed in with eubacterial orthologs. The simplest resolution of the problem is to postulate that many of these sequences were carried into eukaryotes by early eubacterial endosymbionts about 2 billion years ago, only very shortly after or even coincident with the divergence of eukaryotes and archaebacteria. The strong resemblances of these same enzymes among the major eubacterial groups suggest that the cyanobacteria and Gram-positive and Gram-negative eubacteria also diverged at about this same time, whereas the much greater differences between archaebacterial and eubacterial sequences indicate these two groups may have diverged between 3 and 4 billion years ago.
Structurally detailed coarse-grained model for Sec-facilitated co-translational protein translocation and membrane integration

PubMed Central

Miller, Thomas F.

2017-01-01

We present a coarse-grained simulation model that is capable of simulating the minute-timescale dynamics of protein translocation and membrane integration via the Sec translocon, while retaining sufficient chemical and structural detail to capture many of the sequence-specific interactions that drive these processes. The model includes accurate geometric representations of the ribosome and Sec translocon, obtained directly from experimental structures, and interactions parameterized from nearly 200 μs of residue-based coarse-grained molecular dynamics simulations. A protocol for mapping amino-acid sequences to coarse-grained beads enables the direct simulation of trajectories for the co-translational insertion of arbitrary polypeptide sequences into the Sec translocon. The model reproduces experimentally observed features of membrane protein integration, including the efficiency with which polypeptide domains integrate into the membrane, the variation in integration efficiency upon single amino-acid mutations, and the orientation of transmembrane domains. The central advantage of the model is that it connects sequence-level protein features to biological observables and timescales, enabling direct simulation for the mechanistic analysis of co-translational integration and for the engineering of membrane proteins with enhanced membrane integration efficiency. PMID:28328943
Eco-geographical diversification of bitter taste receptor genes (TAS2Rs) among subspecies of chimpanzees (Pan troglodytes).

PubMed

Hayakawa, Takashi; Sugawara, Tohru; Go, Yasuhiro; Udono, Toshifumi; Hirai, Hirohisa; Imai, Hiroo

2012-01-01

Chimpanzees (Pan troglodytes) have region-specific difference in dietary repertoires from East to West across tropical Africa. Such differences may result from different genetic backgrounds in addition to cultural variations. We analyzed the sequences of all bitter taste receptor genes (cTAS2Rs) in a total of 59 chimpanzees, including 4 putative subspecies. We identified genetic variations including single-nucleotide variations (SNVs), insertions and deletions (indels), gene-conversion variations, and copy-number variations (CNVs) in cTAS2Rs. Approximately two-thirds of all cTAS2R haplotypes in the amino acid sequence were unique to each subspecies. We analyzed the evolutionary backgrounds of natural selection behind such diversification. Our previous study concluded that diversification of cTAS2Rs in western chimpanzees (P. t. verus) may have resulted from balancing selection. In contrast, the present study found that purifying selection dominates as the evolutionary form of diversification of the so-called human cluster of cTAS2Rs in eastern chimpanzees (P. t. schweinfurthii) and that the other cTAS2Rs were under no obvious selection as a whole. Such marked diversification of cTAS2Rs with different evolutionary backgrounds among subspecies of chimpanzees probably reflects their subspecies-specific dietary repertoires.
Eco-Geographical Diversification of Bitter Taste Receptor Genes (TAS2Rs) among Subspecies of Chimpanzees (Pan troglodytes)

PubMed Central

Hayakawa, Takashi; Sugawara, Tohru; Go, Yasuhiro; Udono, Toshifumi; Hirai, Hirohisa; Imai, Hiroo

2012-01-01

Chimpanzees (Pan troglodytes) have region-specific difference in dietary repertoires from East to West across tropical Africa. Such differences may result from different genetic backgrounds in addition to cultural variations. We analyzed the sequences of all bitter taste receptor genes (cTAS2Rs) in a total of 59 chimpanzees, including 4 putative subspecies. We identified genetic variations including single-nucleotide variations (SNVs), insertions and deletions (indels), gene-conversion variations, and copy-number variations (CNVs) in cTAS2Rs. Approximately two-thirds of all cTAS2R haplotypes in the amino acid sequence were unique to each subspecies. We analyzed the evolutionary backgrounds of natural selection behind such diversification. Our previous study concluded that diversification of cTAS2Rs in western chimpanzees (P. t. verus) may have resulted from balancing selection. In contrast, the present study found that purifying selection dominates as the evolutionary form of diversification of the so-called human cluster of cTAS2Rs in eastern chimpanzees (P. t. schweinfurthii) and that the other cTAS2Rs were under no obvious selection as a whole. Such marked diversification of cTAS2Rs with different evolutionary backgrounds among subspecies of chimpanzees probably reflects their subspecies-specific dietary repertoires. PMID:22916235
Nucleotide sequence variation at two genes of the phenylpropanoid pathway, the FAH1 and F3H genes, in Arabidopsis thaliana.

PubMed

Aguadé, M

2001-01-01

The FAH1 and F3H genes encode ferulate-5-hydroxylase and flavanone-3-hydroxylase, which are enzymes in the pathways leading to the synthesis of sinapic acid esters and flavonoids, respectively. Nucleotide variation at these genes was surveyed by sequencing a sample of 20 worldwide Arabidopsis thaliana ecotypes and one Arabidopsis lyrata spp. petraea stock. In contrast with most previously studied genes, the percentage of singletons was rather low in both the FAH1 and the F3H gene regions. There was, therefore, no footprint of a recent species expansion in the pattern of nucleotide variation in these regions. In both FAH1 and F3H, nucleotide variation was structured into two major highly differentiated haplotypes. In both genes, there was a peak of silent polymorphism in the 5' part of the coding region without a parallel increase in silent divergence. In FAH1, the peak was centered at the beginning of the second exon. In F3H, nucleotide diversity was highest at the beginning of the gene. The observed pattern of variation in both FAH1 and F3H, although suggestive of balancing selection, was compatible with a neutral model with no recombination.
Proteogenomic Investigation of Strain Variation in Clinical Mycobacterium tuberculosis Isolates.

PubMed

Heunis, Tiaan; Dippenaar, Anzaan; Warren, Robin M; van Helden, Paul D; van der Merwe, Ruben G; Gey van Pittius, Nicolaas C; Pain, Arnab; Sampson, Samantha L; Tabb, David L

2017-10-06

Mycobacterium tuberculosis consists of a large number of different strains that display unique virulence characteristics. Whole-genome sequencing has revealed substantial genetic diversity among clinical M. tuberculosis isolates, and elucidating the phenotypic variation encoded by this genetic diversity will be of the utmost importance to fully understand M. tuberculosis biology and pathogenicity. In this study, we integrated whole-genome sequencing and mass spectrometry (GeLC-MS/MS) to reveal strain-specific characteristics in the proteomes of two clinical M. tuberculosis Latin American-Mediterranean isolates. Using this approach, we identified 59 peptides containing single amino acid variants, which covered ∼9% of all coding nonsynonymous single nucleotide variants detected by whole-genome sequencing. Furthermore, we identified 29 distinct peptides that mapped to a hypothetical protein not present in the M. tuberculosis H37Rv reference proteome. Here, we provide evidence for the expression of this protein in the clinical M. tuberculosis SAWC3651 isolate. The strain-specific databases enabled confirmation of genomic differences (i.e., large genomic regions of difference and nonsynonymous single nucleotide variants) in these two clinical M. tuberculosis isolates and allowed strain differentiation at the proteome level. Our results contribute to the growing field of clinical microbial proteogenomics and can improve our understanding of phenotypic variation in clinical M. tuberculosis isolates.
Nucleic acid probes in diagnostic medicine

NASA Technical Reports Server (NTRS)

Oberry, Phillip A.

1991-01-01

The need for improved diagnostic procedures is outlined and variations in probe technology are briefly reviewed. A discussion of the application of probe technology to the diagnosis of disease in animals and humans is presented. A comparison of probe versus nonprobe diagnostics and isotopic versus nonisotopic probes is made and the current state of sequence amplification is described. The current market status of nucleic acid probes is reviewed with respect to their diagnostic application in human and veterinary medicine. Representative product examples are described and information on probes being developed that offer promise as future products is discussed.
Method for isolating chromosomal DNA in preparation for hybridization in suspension

DOEpatents

Lucas, Joe N.

2000-01-01

A method is provided for detecting nucleic acid sequence aberrations using two immobilization steps. According to the method, a nucleic acid sequence aberration is detected by detecting nucleic acid sequences having both a first nucleic acid sequence type (e.g., from a first chromosome) and a second nucleic acid sequence type (e.g., from a second chromosome), the presence of the first and the second nucleic acid sequence type on the same nucleic acid sequence indicating the presence of a nucleic acid sequence aberration. In the method, immobilization of a first hybridization probe is used to isolate a first set of nucleic acids in the sample which contain the first nucleic acid sequence type. Immobilization of a second hybridization probe is then used to isolate a second set of nucleic acids from within the first set of nucleic acids which contain the second nucleic acid sequence type. The second set of nucleic acids are then detected, their presence indicating the presence of a nucleic acid sequence aberration. Chromosomal DNA in a sample containing cell debris is prepared for hybridization in suspension by treating the mixture with RNase. The treated DNA can also be fixed prior to hybridization.
Genome Wide Analysis of Fatty Acid Desaturation and Its Response to Temperature

PubMed Central

Menard, Guillaume N.; Moreno, Jose Martin; Bryant, Fiona M.; Munoz-Azcarate, Olaya; Hassani-Pak, Keywan; Kurup, Smita

2017-01-01

Plants modify the polyunsaturated fatty acid content of their membrane and storage lipids in order to adapt to changes in temperature. In developing seeds, this response is largely controlled by the activities of the microsomal ω-6 and ω-3 fatty acid desaturases, FAD2 and FAD3. Although temperature regulation of desaturation has been studied at the molecular and biochemical levels, the genetic control of this trait is poorly understood. Here, we have characterized the response of Arabidopsis (Arabidopsis thaliana) seed lipids to variation in ambient temperature and found that heat inhibits both ω-6 and ω-3 desaturation in phosphatidylcholine, leading to a proportional change in triacylglycerol composition. Analysis of the 19 parental accessions of the multiparent advanced generation intercross (MAGIC) population showed that significant natural variation exists in the temperature responsiveness of ω-6 desaturation. A combination of quantitative trait locus (QTL) analysis and genome-wide association studies (GWAS) using the MAGIC population suggests that ω-6 desaturation is largely controlled by cis-acting sequence variants in the FAD2 5′ untranslated region intron that determine the expression level of the gene. However, the temperature responsiveness of ω-6 desaturation is controlled by a separate QTL on chromosome 2. The identity of this locus is unknown, but genome-wide association studies identified potentially causal sequence variants within ∼40 genes in an ∼450-kb region of the QTL. PMID:28108698

Helicobacter pylori cagL amino acid polymorphism D58E59 pave the way toward peptic ulcer disease while N58E59 is associated with gastric cancer in north of Iran.

PubMed

Cherati, Mina Rezaee; Shokri-Shirvani, Javad; Karkhah, Ahmad; Rajabnia, Ramzan; Nouri, Hamid Reza

2017-06-01

The cagL protein of Helicobacter pylori is involved in the pathogenesis of gastroduodenal disorders. Therefore, the current study was conducted to determine the cagL amino acid polymorphisms in patients with gastric diseases. One hundred gastric biopsies were collected from gastritis, peptic ulcer disease (PUD) and gastric cancer (GC) patients and were screened for cagL using polymerase chain reaction (PCR). Sequence variations of cagL were also assessed via sequence translation. The cagL genopositivity was 71.6% in patients infected with H. pylori. cagL from PUD patients showed a higher rate of the D58 amino acid polymorphism than cagL from GC and gastritis patients (P < 0.05). The D58 polymorphism was associated with an increased risk of PUD of up to 6.5-fold (95% CI: 1.2-35.7). In GC, this position was occupied by N58. The E59 polymorphism was found more frequently in PUD and GC than in gastritis patients. Additionally, the presence of Q62 and N122 was significantly observed in PUD and GC, whereas I60 was detected in PUD patients. Our results demonstrated that the presence of D, I, Q and N at positions 58, 60, 62 and 122, respectively, increased the risk of peptic ulcer. However, amino acids N, M, Q and N at the same positions, alongside V134, increased the risk of gastric cancer. Copyright © 2017 Elsevier Ltd. All rights reserved.
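The abstract reports a fold-risk with a 95% confidence interval but does not say how it was computed. One common way to obtain such a figure from a case-control table is an odds ratio with a Wald interval on the log scale. The sketch below shows that calculation under that assumption; the function name `odds_ratio_ci` and the example counts are hypothetical and are not data from the study.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and Wald confidence interval from a 2x2 table.

    a: cases carrying the variant      b: cases without it
    c: controls carrying the variant   d: controls without it
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Wald interval")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) is the root of the summed reciprocal counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
estimate, low, high = odds_ratio_ci(20, 10, 10, 20)
print(f"OR = {estimate:.2f}, 95% CI {low:.2f}-{high:.2f}")
```

Small cell counts make the interval wide, which would be consistent with a range as broad as the one quoted in the abstract.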
DNA Sequence Polymorphism of the Lactate Dehydrogenase Gene from Iranian Plasmodium vivax and Plasmodium falciparum Isolates.

PubMed

Getacher Feleke, Daniel; Nateghpour, Mehdi; Motevalli Haghi, Afsaneh; Hajjaran, Homa; Farivar, Leila; Mohebali, Mehdi; Raoofian, Reza

2015-01-01

Parasite lactate dehydrogenase (pLDH) is extensively used in malaria rapid diagnostic tests (RDTs). Moreover, it is a well-known drug target candidate. However, the genetic diversity of this gene might influence the performance of RDT kits and its drug target candidacy. This study aimed to determine the polymorphism of the pLDH gene in Iranian isolates of P. vivax and P. falciparum. Genomic DNA was extracted from whole blood of microscopically confirmed P. vivax and P. falciparum infected patients. The pLDH gene of P. falciparum and P. vivax was amplified using conventional PCR from 43 symptomatic malaria patients from Sistan and Baluchistan Province, Southeast Iran, from 2012 to 2013. Sequence analysis of 15 P. vivax LDH sequences showed that fourteen had 100% identity with the P. vivax Sal-1 and Belem strains. Two nucleotide substitutions were detected, of which only one resulted in an amino acid change. Analysis of P. falciparum LDH sequences showed that six of the seven sequences had 100% homology with P. falciparum 3D7 and Mzr-1. Moreover, PfLDH displayed three nucleotide changes that resulted in only one amino acid change. PvLDH and PfLDH showed 75%-76% nucleotide and 90.4%-90.76% amino acid homology. The pLDH gene from Iranian P. falciparum and P. vivax isolates displayed 98.8-100% homology with 1-3 nucleotide substitutions, indicating that this gene is relatively conserved. Additional studies are needed to determine whether this genetic variation can influence the performance of pLDH-based RDTs.
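The identity and homology percentages quoted in this abstract are the kind of figure obtained by comparing aligned sequences column by column. The sketch below is a minimal illustration under stated assumptions: the function name `percent_identity` is hypothetical, the inputs are assumed to be pre-aligned to equal length, and gapped columns are excluded from the denominator, which is only one of several conventions in use. The abstract does not say which tool or convention the authors used.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent identity over the ungapped columns of two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns with a gap in either sequence
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy nucleotide strings (hypothetical): three of four columns match.
print(percent_identity("ATGC", "ATGA"))
```

The same function works for amino acid alignments, which is how a nucleotide figure and an amino acid figure can be reported side by side for the same pair of genes.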
Bioinformatic analysis suggests that the Orbivirus VP6 cistron encodes an overlapping gene

PubMed Central

Firth, Andrew E

2008-01-01

Background: The genus Orbivirus includes several species that infect livestock, including Bluetongue virus (BTV) and African horse sickness virus (AHSV). These viruses have linear dsRNA genomes divided into ten segments, all of which have previously been assumed to be monocistronic. Results: Bioinformatic evidence is presented for a short overlapping coding sequence (CDS) in the Orbivirus genome segment 9, overlapping the VP6 cistron in the +1 reading frame. In BTV, a 77–79 codon AUG-initiated open reading frame (hereafter ORFX) is present in all 48 segment 9 sequences analysed. The pattern of base variations across the 48-sequence alignment indicates that ORFX is subject to functional constraints at the amino acid level (even when the constraints due to coding in the overlapping VP6 reading frame are taken into account; MLOGD software). In fact the translated ORFX shows greater amino acid conservation than the overlapping region of VP6. The ORFX AUG codon has a strong Kozak context in all 48 sequences. Each has only one or two upstream AUG codons, always in the VP6 reading frame, and (with a single exception) always with weak or medium Kozak context. Thus, in BTV, ORFX may be translated via leaky scanning. A long (83–169 codon) ORF is present in a corresponding location and reading frame in all other Orbivirus species analysed except Saint Croix River virus (SCRV; the most divergent). Again, the pattern of base variations across sequence alignments indicates multiple coding in the VP6 and ORFX reading frames. Conclusion: At ~9.5 kDa, the putative ORFX product in BTV is too small to appear on most published protein gels. Nonetheless, a review of past literature reveals a number of possible detections. We hope that presentation of this bioinformatic analysis will stimulate an attempt to experimentally verify the expression and functional role of ORFX, and hence lead to a greater understanding of the molecular biology of these important pathogens. PMID:18489030
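The two sequence-level ideas in this abstract, scanning an alternative reading frame for AUG-initiated ORFs and grading the Kozak context of each AUG, can be sketched in a few lines. This is an illustration only and not the MLOGD analysis the author used: the function names are hypothetical, the frame is given as a 0-based offset into the supplied sequence, and the strong/medium/weak rule (a purine at -3 and a G at +4) is a commonly used simplification that the abstract itself does not spell out.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(seq: str, frame: int, min_codons: int = 50):
    """Return (start, end, n_codons) for ATG-initiated ORFs in one frame.

    frame is the 0-based offset (0, 1 or 2) at which codons are read;
    n_codons counts the ATG but not the stop codon; end is exclusive.
    """
    seq = seq.upper().replace("U", "T")
    orfs = []
    start = None
    for i in range(frame, len(seq) - 2, 3):
        codon = seq[i:i + 3]
        if start is None and codon == "ATG":
            start = i
        elif start is not None and codon in STOP_CODONS:
            n_codons = (i - start) // 3
            if n_codons >= min_codons:
                orfs.append((start, i + 3, n_codons))
            start = None
    return orfs


def kozak_strength(seq: str, atg_index: int) -> str:
    """Grade the context of the ATG at atg_index as strong, medium or weak."""
    seq = seq.upper().replace("U", "T")
    minus3 = seq[atg_index - 3] if atg_index >= 3 else None
    plus4 = seq[atg_index + 3] if atg_index + 3 < len(seq) else None
    hits = int(minus3 in ("A", "G")) + int(plus4 == "G")
    return ("weak", "medium", "strong")[hits]


# Toy sequence (hypothetical): an ORF that is only visible at offset 1.
toy = "C" + "ATG" + "GCT" * 3 + "TAA"
print(find_orfs(toy, frame=1, min_codons=1))
print(kozak_strength("GCCACCATGG", 6))
```

An upstream AUG in a weak or medium context followed by a downstream AUG in a strong context is the arrangement the abstract describes as compatible with leaky scanning.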
Acquisition and evolution of plant pathogenesis-associated gene clusters and candidate determinants of tissue-specificity in xanthomonas.

PubMed

Lu, Hong; Patil, Prabhu; Van Sluys, Marie-Anne; White, Frank F; Ryan, Robert P; Dow, J Maxwell; Rabinowicz, Pablo; Salzberg, Steven L; Leach, Jan E; Sonti, Ramesh; Brendel, Volker; Bogdanove, Adam J

2008-01-01

Xanthomonas is a large genus of plant-associated and plant-pathogenic bacteria. Collectively, members cause diseases on over 392 plant species. Individually, they exhibit marked host- and tissue-specificity. The determinants of this specificity are unknown. To assess potential contributions to host- and tissue-specificity, pathogenesis-associated gene clusters were compared across genomes of eight Xanthomonas strains representing vascular or non-vascular pathogens of rice, brassicas, pepper and tomato, and citrus. The gum cluster for extracellular polysaccharide is conserved except for gumN and sequences downstream. The xcs and xps clusters for type II secretion are conserved, except in the rice pathogens, in which xcs is missing. In the otherwise conserved hrp cluster, sequences flanking the core genes for type III secretion vary with respect to insertion sequence element and putative effector gene content. Variation at the rpf (regulation of pathogenicity factors) cluster is more pronounced, though genes with established functional relevance are conserved. A cluster for synthesis of lipopolysaccharide varies highly, suggesting multiple horizontal gene transfers and reassortments, but this variation does not correlate with host- or tissue-specificity. Phylogenetic trees based on amino acid alignments of gum, xps, xcs, hrp, and rpf cluster products generally reflect strain phylogeny. However, amino acid residues at four positions correlate with tissue specificity, revealing hpaA and xpsD as candidate determinants. Examination of genome sequences of xanthomonads Xylella fastidiosa and Stenotrophomonas maltophilia revealed that the hrp, gum, and xcs clusters are recent acquisitions in the Xanthomonas lineage. Our results provide insight into the ancestral Xanthomonas genome and indicate that differentiation with respect to host- and tissue-specificity involved not major modifications or wholesale exchange of clusters, but subtle changes in a small number of genes or in non-coding sequences, and/or differences outside the clusters, potentially among regulatory targets or secretory substrates.
Metagenomic sequencing reveals altered metabolic pathways in the oral microbiota of sailors during a long sea voyage

PubMed Central

Zheng, Weiwei; Zhang, Ze; Liu, Cuihua; Qiao, Yuanyuan; Zhou, Dianrong; Qu, Jia; An, Huaijie; Xiong, Ming; Zhu, Zhiming; Zhao, Xiaohang

2015-01-01

Seafaring is a difficult occupation, and sailors face higher health risks than individuals on land. Commensal microbiota participates in the host immune system and metabolism, reflecting the host's health condition. However, the interaction mechanisms between the microbiota and the host's health condition remain unclear. This study reports the influence of long sea voyages on human health by utilising a metagenomic analysis of variation in the microbiota of the buccal mucosa. Paired samples collected before and after a sea voyage were analysed. After more than 120 days of ocean sailing, the oral microbial diversity of sailors was reduced approximately 5-fold, and the levels of several pathogens (e.g., Streptococcus pneumoniae) increased. Moreover, 69.46% of the identified microbial sequences were unclassified microbiota. Notably, several metabolic pathways were dramatically decreased, including folate biosynthesis, carbohydrate, lipid and amino acid pathways. Clinical examination of the hosts confirmed the identified metabolic changes, as demonstrated by decreased serum levels of haemoglobin and folic acid, a decreased neutrophil-to-lymphocyte ratio, and increased levels of triglycerides, cholesterol and homocysteine, which are consistent with the observed microbial variation. Our study suggests that oral mucosal bacteria may reflect host health conditions and could provide approaches for improving the health of sailors. PMID:26154405
Influence of structural variation on nuclear localization of DNA-binding polyamide-fluorophore conjugates.

PubMed

Edelson, Benjamin S; Best, Timothy P; Olenyuk, Bogdan; Nickols, Nicholas G; Doss, Raymond M; Foister, Shane; Heckel, Alexander; Dervan, Peter B

2004-01-01

A pivotal step forward in chemical approaches to controlling gene expression is the development of sequence-specific DNA-binding molecules that can enter live cells and traffic to nuclei unaided. DNA-binding polyamides are a class of programmable, sequence-specific small molecules that have been shown to influence a wide variety of protein-DNA interactions. We have synthesized over 100 polyamide-fluorophore conjugates and assayed their nuclear uptake profiles in 13 mammalian cell lines. The compiled dataset, comprising 1300 entries, establishes a benchmark for the nuclear localization of polyamide-dye conjugates. Compounds in this series were chosen to provide systematic variation in several structural variables, including dye composition and placement, molecular weight, charge, ordering of the aromatic and aliphatic amino-acid building blocks and overall shape. Nuclear uptake does not appear to be correlated with polyamide molecular weight or with the number of imidazole residues, although the positions of imidazole residues affect nuclear access properties significantly. Generally negative determinants for nuclear access include the presence of a beta-Ala-tail residue and the lack of a cationic alkyl amine moiety, whereas the presence of an acetylated 2,4-diaminobutyric acid-turn is a positive factor for nuclear localization. We discuss implications of these data on the design of polyamide-dye conjugates for use in biological systems.
The Saccharomyces Genome Database Variant Viewer.

PubMed

Sheppard, Travis K; Hitz, Benjamin C; Engel, Stacia R; Song, Giltae; Balakrishnan, Rama; Binkley, Gail; Costanzo, Maria C; Dalusag, Kyla S; Demeter, Janos; Hellerstedt, Sage T; Karra, Kalpana; Nash, Robert S; Paskov, Kelley M; Skrzypek, Marek S; Weng, Shuai; Wong, Edith D; Cherry, J Michael

2016-01-04

The Saccharomyces Genome Database (SGD; http://www.yeastgenome.org) is the authoritative community resource for the Saccharomyces cerevisiae reference genome sequence and its annotation. In recent years, we have moved toward increased representation of sequence variation and allelic differences within S. cerevisiae. The publication of numerous additional genomes has motivated the creation of new tools for their annotation and analysis. Here we present the Variant Viewer: a dynamic open-source web application for the visualization of genomic and proteomic differences. Multiple sequence alignments have been constructed across high quality genome sequences from 11 different S. cerevisiae strains and stored in the SGD. The alignments and summaries are encoded in JSON and used to create a two-tiered dynamic view of the budding yeast pan-genome, available at http://www.yeastgenome.org/variant-viewer. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
A theoretical method to compute sequence dependent configurational properties in charged polymers and proteins.

PubMed

Sawle, Lucas; Ghosh, Kingshuk

2015-08-28

A general formalism to compute configurational properties of proteins and other heteropolymers with an arbitrary sequence of charges and non-uniform excluded volume interaction is presented. A variational approach is utilized to predict average distance between any two monomers in the chain. The presented analytical model, for the first time, explicitly incorporates the role of sequence charge distribution to determine relative sizes between two sequences that vary not only in total charge composition but also in charge decoration (even when charge composition is fixed). Furthermore, the formalism is general enough to allow variation in excluded volume interactions between two monomers. Model predictions are benchmarked against the all-atom Monte Carlo studies of Das and Pappu [Proc. Natl. Acad. Sci. U. S. A. 110, 13392 (2013)] for 30 different synthetic sequences of polyampholytes. These sequences possess an equal number of glutamic acid (E) and lysine (K) residues but differ in the patterning within the sequence. Without any fit parameter, the model captures the strong sequence dependence of the simulated values of the radius of gyration with a correlation coefficient of R² = 0.9. The model is then applied to real proteins to compare the unfolded state dimensions of 540 orthologous pairs of thermophilic and mesophilic proteins. The excluded volume parameters are assumed similar under denatured conditions, and only electrostatic effects encoded in the sequence are accounted for. With these assumptions, thermophilic proteins are found, with high statistical significance, to have a more compact disordered ensemble compared to their mesophilic counterparts. The method presented here, due to its analytical nature, is capable of such high-throughput analysis of multiple proteins and will have broad applications in proteomic studies as well as in other heteropolymeric systems.
Development and Characterization of Somatic Hybrids of Ulva reticulata Forsskål (×) Monostroma oxyspermum (Kutz.) Doty

PubMed Central

Gupta, Vishal; Kumari, Puja; Reddy, CRK

2015-01-01

Ulvophycean species with diverse trait characteristics provide an opportunity to create novel allelic recombinant variants. The present study reports the development of seaweed variants with improved agronomic traits through protoplast fusion between Monostroma oxyspermum (Kutz.) Doty and Ulva reticulata Forsskål. A total of 12 putative hybrids were screened based on the variations in morphology and total DNA content over the fusion partners. DNA-fingerprinting by inter simple sequence repeat (ISSR) and amplified fragment length polymorphism (AFLP) analysis confirmed genomic introgression in the hybrids. The DNA fingerprint revealed sharing of parental alleles in regenerated hybrids and a few alleles that were unique to hybrids. The epigenetic variations in hybrids estimated in terms of DNA methylation polymorphism also revealed sharing of methylation loci with both the fusion partners. The functional trait analysis for growth showed a hybrid with heterotic trait (DGR% = 36.7 ± 1.55%) over the fusion partners U. reticulata (33.2 ± 2.6%) and M. oxyspermum (17.8 ± 1.77%), while others were superior to the mid-parental value (25.2 ± 2.2%) (p < 0.05). The fatty acid (FA) analysis of hybrids showed notable variations over fusion partners. Most hybrids showed increased polyunsaturated FAs (PUFAs) compared to saturated FAs (SFAs), mainly including the nutritionally important linoleic acid, α-linolenic acid, oleic acid, stearidonic acid, and docosahexaenoic acid. The other differences observed include superior cellulose content and antioxidative potential in hybrids over fusion partners. The hybrid varieties with superior traits developed in this study unequivocally demonstrate the significance of the protoplast fusion technique in developing improved variants of macroalgae. PMID:25688248
Genomic location of the bovine growth hormone secretagogue receptor (GHSR) gene and investigation of genetic polymorphism.

PubMed

Colinet, F G; Vanderick, S; Charloteaux, B; Eggen, A; Gengler, N; Renaville, B; Brasseur, R; Portetelle, D; Renaville, Robert

2009-01-01

The growth hormone secretagogue receptor (GHSR) is involved in the regulation of energetic homeostasis and GH secretion. In this study, the bovine GHSR gene was mapped to BTA1 between BL26 and BMS4004. Two different bovine GHSR CDS (GHSR1a and GHSR1b) were sequenced. Six polymorphisms (five SNPs and one 3-bp indel) were also identified, three of them leading to amino acid variations L24V, D194N, and Del R242. These variations are located in the extracellular N-terminal end, the exoloop 2, and the cytoloop 3 of the receptor, respectively.
Multiple-strand displacement and identification of single nucleotide polymorphisms as markers of genotypic variation of Pasteuria penetrans biotypes infecting root-knot nematodes.

PubMed

Nong, Guang; Chow, Virginia; Schmidt, Liesbeth M; Dickson, Don W; Preston, James F

2007-08-01

Pasteuria species are endospore-forming obligate bacterial parasites of soil-inhabiting nematodes and water-inhabiting cladocerans, e.g. water fleas, and are closely related to Bacillus spp. by 16S rRNA gene sequence. As naturally occurring bacteria, biotypes of Pasteuria penetrans are attractive candidates for the biocontrol of various Meloidogyne spp. (root-knot nematodes). Failure to culture these bacteria outside their hosts has prevented isolation of genomic DNA in quantities sufficient for identification of genes associated with host recognition and virulence. We have applied multiple-strand displacement amplification (MDA) to generate DNA for comparative genomics of biotypes exhibiting different host preferences. Using the genome of Bacillus subtilis as a paradigm, MDA allowed quantitative detection and sequencing of 12 marker genes from 2000 cells. Meloidogyne spp. infected with P. penetrans P20 or B4 contained single nucleotide polymorphisms (SNPs) in the spoIIAB gene that did not change the amino acid sequence, or that substituted amino acids with similar chemical properties. Individual nematodes infected with P. penetrans P20 or B4 contained SNPs in the spoIIAB gene sequenced in MDA-generated products. Detection of SNPs in the spoIIAB gene in a nematode indicates infection by more than one genotype, supporting the need to sequence genomes of Pasteuria spp. derived from single spore isolates.
Nucleic acid arrays and methods of synthesis

DOEpatents

Sabanayagam, Chandran R.; Sano, Takeshi; Misasi, John; Hatch, Anson; Cantor, Charles

2001-01-01

The present invention generally relates to high density nucleic acid arrays and methods of synthesizing nucleic acid sequences on a solid surface. Specifically, the present invention contemplates the use of stabilized nucleic acid primer sequences immobilized on solid surfaces, and circular nucleic acid sequence templates combined with the use of isothermal rolling circle amplification to thereby increase nucleic acid sequence concentrations in a sample or on an array of nucleic acid sequences.
ChAy/Bx, a novel chimeric high-molecular-weight glutenin subunit gene apparently created by homoeologous recombination in Triticum turgidum ssp. dicoccoides.

PubMed

Guo, Xiao-Hui; Bi, Zhe-Guang; Wu, Bi-Hua; Wang, Zhen-Zhen; Hu, Ji-Liang; Zheng, You-Liang; Liu, Deng-Cai

2013-12-01

High-molecular-weight glutenin subunits (HMW-GSs) are of considerable interest, because they play a crucial role in determining dough viscoelastic properties and end-use quality of wheat flour. In this paper, ChAy/Bx, a novel chimeric HMW-GS gene from Triticum turgidum ssp. dicoccoides (AABB, 2n=4x=28) accession D129, was isolated and characterized. Sodium dodecyl sulfate polyacrylamide gel electrophoresis (SDS-PAGE) analysis revealed that the electrophoretic mobility of the glutenin subunit encoded by ChAy/Bx was slightly faster than that of 1Dy12. The complete ORF of ChAy/Bx contained 1,671 bp encoding a deduced polypeptide of 555 amino acid residues (or 534 amino acid residues for the mature protein), making it the smallest HMW-GS gene known from Triticum species. Sequence analysis showed that ChAy/Bx was neither a conventional x-type nor a conventional y-type subunit gene, but a novel chimeric gene. Its first 1305 nt sequence was highly homologous with the corresponding sequence of 1Ay type genes, while its final 366 nt sequence was highly homologous with the corresponding sequence of 1Bx type genes. The mature ChAy/Bx protein consisted of the N-terminus of 1Ay type subunit (the first 414 amino acid residues) and the C-terminus of 1Bx type subunit (the final 120 amino acid residues). Secondary structure prediction showed that ChAy/Bx contained some domains of 1Ay subunit and some domains of 1Bx subunit. The special structure of this HMW glutenin chimera ChAy/Bx subunit might have unique effects on the end-use quality of wheat flour. Here we propose that homoeologous recombination might be a novel pathway for allelic variation or molecular evolution of HMW-GSs. © 2013.
Characterisation of the nicotianamine aminotransferase and deoxymugineic acid synthase genes essential to Strategy II iron uptake in bread wheat (Triticum aestivum L.)

PubMed Central

Johnson, Alexander A. T.

2017-01-01

Iron (Fe) uptake in graminaceous plant species occurs via the release and uptake of Fe-chelating compounds known as mugineic acid family phytosiderophores (MAs). In the MAs biosynthetic pathway, nicotianamine aminotransferase (NAAT) and deoxymugineic acid synthase (DMAS) enzymes catalyse the formation of 2’-deoxymugineic acid (DMA) from nicotianamine (NA). Here we describe the identification and characterisation of six TaNAAT and three TaDMAS1 genes in bread wheat (Triticum aestivum L.). The coding sequences of all six TaNAAT homeologs consist of seven exons with ≥88.0% nucleotide sequence identity and most sequence variation present in the first exon. The coding sequences of the three TaDMAS1 homeologs consist of three exons with ≥97.8% nucleotide sequence identity. Phylogenetic analysis revealed that the TaNAAT and TaDMAS1 proteins are most closely related to the HvNAAT and HvDMAS1 proteins of barley and that there are two distinct groups of TaNAAT proteins, TaNAAT1 and TaNAAT2, that correspond to the HvNAATA and HvNAATB proteins, respectively. Quantitative reverse transcription-PCR analysis revealed that the TaNAAT2 genes are expressed at highest levels in anther tissues whilst the TaNAAT1 and TaDMAS1 genes are expressed at highest levels in root tissues of bread wheat. Furthermore, the TaNAAT1, TaNAAT2 and TaDMAS1 genes were differentially regulated by plant Fe status and their expression was significantly upregulated in root tissues from day five onwards during a seven-day Fe deficiency treatment. The identification and characterization of the TaNAAT1, TaNAAT2 and TaDMAS1 genes provides a valuable genetic resource for improving bread wheat growth on Fe deficient soils and enhancing grain Fe nutrition. PMID:28475636
Characterization of Prdm9 in equids and sterility in mules.

PubMed

Steiner, Cynthia C; Ryder, Oliver A

2013-01-01

Prdm9 (Meisetz) is the first speciation gene discovered in vertebrates conferring reproductive isolation. This locus encodes a meiosis-specific histone H3 methyltransferase that specifies meiotic recombination hotspots during gametogenesis. Allelic differences in Prdm9, characterized for a variable number of zinc finger (ZF) domains, have been associated with hybrid sterility in male house mice via spermatogenic failure at the pachytene stage. The mule, a classic example of hybrid sterility in mammals also exhibits a similar spermatogenesis breakdown, making Prdm9 an interesting candidate to evaluate in equine hybrids. In this study, we characterized the Prdm9 gene in all species of equids by analyzing sequence variation of the ZF domains and estimating positive selection. We also evaluated the role of Prdm9 in hybrid sterility by assessing allelic differences of ZF domains in equine hybrids. We found remarkable variation in the sequence and number of ZF domains among equid species, ranging from five domains in the Tibetan kiang and Asiatic wild ass, to 14 in the Grevy's zebra. Positive selection was detected in all species at amino acid sites known to be associated with DNA-binding specificity of ZF domains in mice and humans. Equine hybrids, in particular a quartet pedigree composed of a fertile mule showed a mosaic of sequences and number of ZF domains suggesting that Prdm9 variation does not seem by itself to contribute to equine hybrid sterility.
Characterization of Prdm9 in Equids and Sterility in Mules

PubMed Central

Steiner, Cynthia C.; Ryder, Oliver A.

2013-01-01

Prdm9 (Meisetz) is the first speciation gene discovered in vertebrates conferring reproductive isolation. This locus encodes a meiosis-specific histone H3 methyltransferase that specifies meiotic recombination hotspots during gametogenesis. Allelic differences in Prdm9, characterized for a variable number of zinc finger (ZF) domains, have been associated with hybrid sterility in male house mice via spermatogenic failure at the pachytene stage. The mule, a classic example of hybrid sterility in mammals also exhibits a similar spermatogenesis breakdown, making Prdm9 an interesting candidate to evaluate in equine hybrids. In this study, we characterized the Prdm9 gene in all species of equids by analyzing sequence variation of the ZF domains and estimating positive selection. We also evaluated the role of Prdm9 in hybrid sterility by assessing allelic differences of ZF domains in equine hybrids. We found remarkable variation in the sequence and number of ZF domains among equid species, ranging from five domains in the Tibetan kiang and Asiatic wild ass, to 14 in the Grevy’s zebra. Positive selection was detected in all species at amino acid sites known to be associated with DNA-binding specificity of ZF domains in mice and humans. Equine hybrids, in particular a quartet pedigree composed of a fertile mule showed a mosaic of sequences and number of ZF domains suggesting that Prdm9 variation does not seem by itself to contribute to equine hybrid sterility. PMID:23613924
Association of SSR markers with contents of fatty acids in olive oil and genetic diversity analysis of an olive core collection.

PubMed

Ipek, M; Ipek, A; Seker, M; Gul, M K

2015-03-27

The purpose of this research was to characterize an olive core collection using some agronomic characters and simple sequence repeat (SSR) markers and to determine SSR markers associated with the content of fatty acids in olive oil. SSR marker analysis demonstrated the presence of a high amount of genetic variation between the olive cultivars analyzed. A UPGMA dendrogram demonstrated that olive cultivars did not cluster on the basis of their geographic origin. Fatty acid components of olive oil in these cultivars were determined. The results also showed that there was a great amount of variation between the olive cultivars in terms of fatty acid composition. For example, oleic acid content ranged from 57.76 to 76.9% with standard deviation of 5.10%. Significant correlations between fatty acids of olive oil were observed. For instance, a very high negative correlation (-0.812) between oleic and linoleic acids was detected. A structured association analysis between the content of fatty acids in olive oil and SSR markers was performed. STRUCTURE analysis assigned olive cultivars to two gene pools (K = 2). Assignment of olive cultivars to these gene pools was not based on geographical origin. Association between fatty acid traits and SSR markers was evaluated using the general linear model of TASSEL. Significant associations were determined between five SSR markers and stearic, oleic, linoleic, and linolenic acids of olive oil. Very high associations (P < 0.001) between ssrOeUA-DCA14 and stearic acid and between GAPU71B and oleic acid indicated that these markers could be used for marker-assisted selection in olive.
In silico Derivation of HLA-Specific Alloreactivity Potential from Whole Exome Sequencing of Stem-Cell Transplant Donors and Recipients: Understanding the Quantitative Immunobiology of Allogeneic Transplantation

PubMed Central

Jameson-Lee, Max; Koparde, Vishal; Griffith, Phil; Scalora, Allison F.; Sampson, Juliana K.; Khalid, Haniya; Sheth, Nihar U.; Batalo, Michael; Serrano, Myrna G.; Roberts, Catherine H.; Hess, Michael L.; Buck, Gregory A.; Neale, Michael C.; Manjili, Masoud H.; Toor, Amir Ahmed

2014-01-01

Donor T-cell mediated graft versus host (GVH) effects may result from the aggregate alloreactivity to minor histocompatibility antigens (mHA) presented by the human leukocyte antigen (HLA) molecules in each donor–recipient pair undergoing stem-cell transplantation (SCT). Whole exome sequencing has previously demonstrated a large number of non-synonymous single nucleotide polymorphisms (SNP) present in HLA-matched recipients of SCT donors (GVH direction). The nucleotide sequence flanking each of these SNPs was obtained and the amino acid sequence determined. All the possible nonameric peptides incorporating the variant amino acid resulting from these SNPs were interrogated in silico for their likelihood to be presented by the HLA class I molecules using the Immune Epitope Database stabilized matrix method (SMM) and NetMHCpan algorithms. The SMM algorithm predicted that a median of 18,396 peptides weakly bound HLA class I molecules in individual SCT recipients, and 2,254 peptides displayed strong binding. A similar library of presented peptides was identified when the data were interrogated using the NetMHCpan algorithm. The bioinformatic algorithm presented here demonstrates that there may be a high level of mHA variation in HLA-matched individuals, constituting a HLA-specific alloreactivity potential. PMID:25414699
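The peptide-enumeration step described in this abstract (every nonameric peptide that incorporates the variant amino acid) can be illustrated with a minimal sketch. The function name `peptides_spanning` and the example sequence are hypothetical and are not taken from the study; the binding-prediction step (SMM, NetMHCpan) is not reproduced here.

```python
def peptides_spanning(protein: str, variant_index: int, length: int = 9) -> list[str]:
    """Return every window of `length` residues that contains the variant residue.

    `variant_index` is the 0-based position of the variant amino acid.
    """
    if not 0 <= variant_index < len(protein):
        raise ValueError("variant_index is outside the protein sequence")
    # Earliest window start that still reaches the variant residue.
    first_start = max(0, variant_index - length + 1)
    # Latest window start that still fits inside the sequence.
    last_start = min(variant_index, len(protein) - length)
    return [protein[s:s + length] for s in range(first_start, last_start + 1)]


# Hypothetical 17-residue flanking sequence; the variant residue sits at index 8.
flank = "ACDEFGHIKLMNPQRST"
nonamers = peptides_spanning(flank, 8)
print(len(nonamers))  # 9 windows when eight residues flank the variant on each side
```

With at least eight flanking residues on each side, each non-synonymous SNP yields nine candidate nonamers, which is why the per-recipient peptide counts reported above run into the thousands.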
In silico Derivation of HLA-Specific Alloreactivity Potential from Whole Exome Sequencing of Stem-Cell Transplant Donors and Recipients: Understanding the Quantitative Immunobiology of Allogeneic Transplantation.

PubMed

Jameson-Lee, Max; Koparde, Vishal; Griffith, Phil; Scalora, Allison F; Sampson, Juliana K; Khalid, Haniya; Sheth, Nihar U; Batalo, Michael; Serrano, Myrna G; Roberts, Catherine H; Hess, Michael L; Buck, Gregory A; Neale, Michael C; Manjili, Masoud H; Toor, Amir Ahmed

2014-01-01

Donor T-cell mediated graft versus host (GVH) effects may result from the aggregate alloreactivity to minor histocompatibility antigens (mHA) presented by the human leukocyte antigen (HLA) molecules in each donor-recipient pair undergoing stem-cell transplantation (SCT). Whole exome sequencing has previously demonstrated a large number of non-synonymous single nucleotide polymorphisms (SNP) present in HLA-matched recipients of SCT donors (GVH direction). The nucleotide sequence flanking each of these SNPs was obtained and the amino acid sequence determined. All the possible nonameric peptides incorporating the variant amino acid resulting from these SNPs were interrogated in silico for their likelihood to be presented by the HLA class I molecules using the Immune Epitope Database stabilized matrix method (SMM) and NetMHCpan algorithms. The SMM algorithm predicted that a median of 18,396 peptides weakly bound HLA class I molecules in individual SCT recipients, and 2,254 peptides displayed strong binding. A similar library of presented peptides was identified when the data were interrogated using the NetMHCpan algorithm. The bioinformatic algorithm presented here demonstrates that there may be a high level of mHA variation in HLA-matched individuals, constituting a HLA-specific alloreactivity potential.
PknB remains an essential and a conserved target for drug development in susceptible and MDR strains of M. tuberculosis.

PubMed

Gupta, Anamika; Pal, Sudhir K; Pandey, Divya; Fakir, Najneen A; Rathod, Sunita; Sinha, Dhiraj; SivaKumar, S; Sinha, Pallavi; Periera, Mycal; Balgam, Shilpa; Sekar, Gomathi; UmaDevi, K R; Anupurba, Shampa; Nema, Vijay

2017-08-18

The Mycobacterium tuberculosis (M.tb) protein kinase B (PknB), which has now been shown to be essential for the growth and survival of M.tb, is a transmembrane protein with the potential to be a good drug target. However, it is not known whether this target remains conserved in otherwise resistant clinical isolates. The present study describes a conservation analysis of sequences covering the inhibitor-binding domain of PknB, to assess whether it remains conserved in susceptible and resistant clinical strains of mycobacteria collected from three different geographical areas of India. A total of 116 isolates from North, South and West India were used in the study, with variable susceptibility profiles towards streptomycin, isoniazid, rifampicin, ethambutol and ofloxacin. Isolates were also spoligotyped to determine whether the conservation pattern of the pknB gene remains consistent or differs across spoligotypes. The impact of the variation found in the study was analyzed using molecular dynamics simulations. Sequencing showed that pknB sequences were conserved in 115 of 116 isolates, irrespective of susceptibility status and spoligotype. The only variation was found in one strain, in which the pknB sequence had a G to A mutation at position 664, translating into an amino acid change from valine to isoleucine. Molecular dynamics simulations indicated that this variation causes no significant change in protein structure or inhibitor binding. Hence, the study endorses PknB as an ideal target for drug development, with no pre-existing or induced resistance with respect to the sequences involved in inhibitor binding. If the mutation reported here for the first time is found again in subsequent work, it should be checked against the phenotypic profile before concluding that it affects activity in any way. The bioinformatics analysis in this study indicates that it has no significant effect on binding, and hence on the activity of the protein.
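As a worked check of the G to A change at position 664, the sketch below maps a coding position to its codon and tests which valine codons give isoleucine after a first-base substitution. It assumes that position 664 is counted from the first base of the pknB coding sequence (the abstract does not state the numbering scheme), and the actual codon in the isolate is not given, so all four valine codons are tried. Only the needed entries of the standard genetic code are included.

```python
# Partial standard genetic code: only the codons needed for this illustration.
CODON_TABLE = {
    "GTT": "Val", "GTC": "Val", "GTA": "Val", "GTG": "Val",
    "ATT": "Ile", "ATC": "Ile", "ATA": "Ile", "ATG": "Met",
}


def locate(position: int) -> tuple[int, int]:
    """Map a 1-based coding position to (codon number, 0-based offset in the codon)."""
    return (position - 1) // 3 + 1, (position - 1) % 3


codon_number, offset = locate(664)
print(codon_number, offset)  # 222 0 (first base of codon 222 under the assumed numbering)

# Apply G>A at that offset to each valine codon and translate the result.
outcomes = {}
for codon in ("GTT", "GTC", "GTA", "GTG"):
    mutated = codon[:offset] + "A" + codon[offset + 1:]
    outcomes[codon] = CODON_TABLE[mutated]
print(outcomes)
```

Under these assumptions, a first-base G to A change turns three of the four valine codons into isoleucine codons, which is consistent with the valine to isoleucine substitution reported in the abstract.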

Variation in the Nucleotide Sequence of Cottontail Rabbit Papillomavirus a and b Subtypes Affects Wart Regression and Malignant Transformation and Level of Viral Replication in Domestic Rabbits

PubMed Central

Salmon, Jérôme; Nonnenmacher, Mathieu; Cazé, Sandrine; Flamant, Patricia; Croissant, Odile; Orth, Gérard; Breitburd, Françoise

2000-01-01

We previously reported the partial characterization of two cottontail rabbit papillomavirus (CRPV) subtypes with strikingly divergent E6 and E7 oncoproteins. We report now the complete nucleotide sequences of these subtypes, referred to as CRPVa4 (7,868 nucleotides) and CRPVb (7,867 nucleotides). The CRPVa4 and CRPVb genomes differed at 238 (3%) nucleotide positions, whereas CRPVa4 and the prototype CRPV differed by only 5 nucleotides. The most variable region (7% nucleotide divergence) included the long regulatory region (LRR) and the E6 and E7 genes. A mutation in the stop codon resulted in an 8-amino-acid-longer CRPVb E4 protein, and a nucleotide deletion reduced the coding capacity of the E5 gene from 101 to 25 amino acids. In domestic rabbits homozygous for a specific haplotype of the DRA and DQA genes of the major histocompatibility complex, warts induced by CRPVb DNA or a chimeric genome containing the CRPVb LRR/E6/E7 region showed an early regression, whereas warts induced by CRPVa4 or a chimeric genome containing the CRPVa4 LRR/E6/E7 region persisted and evolved into carcinomas. In contrast, most CRPVa, CRPVb, and chimeric CRPV DNA-induced warts showed no early regression in rabbits homozygous for another DRA-DQA haplotype. Little, if any, viral replication is usually observed in domestic rabbit warts. When warts induced by CRPVa and CRPVb virions and DNA were compared, the number of cells positive for viral DNA or capsid antigens was found to be greater by 1 order of magnitude for specimens induced by CRPVb. Thus, both sequence variation in the LRR/E6/E7 region and the genetic constitution of the host influence the expression of the oncogenic potential of CRPV. Furthermore, intratype variation may overcome to some extent the host restriction of CRPV replication in domestic rabbits. PMID:11044121
Size and sequence polymorphisms in the glutamate-rich protein gene of the human malaria parasite Plasmodium falciparum in Thailand.

PubMed

Pattaradilokrat, Sittiporn; Trakoolsoontorn, Chawinya; Simpalipan, Phumin; Warrit, Natapot; Kaewthamasorn, Morakot; Harnyuttanakorn, Pongchai

2018-01-22

The glutamate-rich protein (GLURP) of the malaria parasite Plasmodium falciparum is a key surface antigen that serves as a component of a clinical vaccine. Moreover, the GLURP gene is also employed routinely as a genetic marker for malarial genotyping in epidemiological studies. While extensive size polymorphisms in GLURP are well recorded, the extent of the sequence diversity of this gene is rarely investigated. The present study aimed to explore the genetic diversity of GLURP in natural populations of P. falciparum. The polymorphic C-terminal repetitive R2 region of GLURP sequences from 65 P. falciparum isolates in Thailand were generated and combined with the data from 103 worldwide isolates to generate a GLURP database. The collection comprised 168 alleles, encoding 105 unique GLURP subtypes, characterized by 18 types of amino acid repeat units (AAU). Of these, 28 GLURP subtypes, formed by 10 AAU types, were detected in P. falciparum in Thailand. Among them, 19 GLURP subtypes and 2 AAU types are described for the first time in the Thai parasite population. The AAU sequences were highly conserved, which is likely due to negative selection. Standard Fst analysis revealed the shared distributions of GLURP types among the P. falciparum populations, providing evidence of gene flow among the different demographic populations. Sequence diversity causing size variations in GLURP in Thai P. falciparum populations was detected and was caused by non-synonymous substitutions in repeat units and by insertions/deletions of aspartic acid or glutamic acid codons between repeat units. The P. falciparum population structure based on GLURP showed promising implications for the development of GLURP-based vaccines and for monitoring vaccine efficacy.
Sequence heterogeneity of cannabidiolic- and tetrahydrocannabinolic acid-synthase in Cannabis sativa L. and its relationship with chemical phenotype.

PubMed

Onofri, Chiara; de Meijer, Etienne P M; Mandolino, Giuseppe

2015-08-01

Sequence variants of THCA- and CBDA-synthases were isolated from different Cannabis sativa L. strains expressing various wild-type and mutant chemical phenotypes (chemotypes). Expressed and complete sequences were obtained from mature inflorescences. Each strain was shown to have a different specificity and/or ability to convert the precursor CBGA into CBDA and/or THCA type products. The comparison of the expressed sequences led to the identification of different mutations, all of them due to SNPs. These SNPs were found to relate to the cannabinoid composition of the inflorescence at maturity and are therefore proposed to have a functional significance. The amount of variation was found to be higher within the CBDAS sequence family than in the THCAS family, suggesting a more recent evolution of THCA-forming enzymes from the CBDAS group. We therefore consider CBDAS as the ancestral type of these synthases. Copyright © 2015 Elsevier Ltd. All rights reserved.
Molecular characterization and expression profiling of BMP 3 gene in broiler and layer chicken.

PubMed

Divya, Devara; Bhattacharya, Tarun Kumar; Gnana Prakash, Manthani; Chatterjee, R N; Shukla, Renu; Guru Vishnu, Pothana Boyina; Vinoth, Amirthalingam; Dushyanth, Kotha

2018-04-10

A study was carried out to characterize and explore the expression profile of the BMP 3 gene in control broiler and control layer chicken. The total open reading frame of BMP 3 (1389 bp) was cloned and sequenced. The control broiler and control layer chicken showed variation at the nucleotide and amino acid level relative to the reference gene (Gallus gallus, NCBI Acc. No. NM_001034819). Compared with the reference gene, the control broiler showed four nucleotide differences (c.192A>G, c.519C>T, 903G>A and 960C>G), while the control layer showed variation at c.33G>C, 192A>G, 858G>A, 904G>A, 960C>G and 1257C>T, making six differences in total. Between the control broiler and control layer lines, nucleotide differences were observed at c.33G>C, 519T>C, 858G>A, 903A>G, 904G>A and 1257C>T. The change at the amino acid level between the reference and the control broiler was p.D320N, and with the control layer chicken it was p.D302N and p.D320N. On the other hand, a single amino acid difference (p.D302N) was observed between the control broiler and control layer chicken lines. The phylogenetic study displayed a close relationship between the broiler and layer lines and the reference gene, and also with other avian species, resulting in cluster formation. These clusters in turn displayed a distant link with the mammalian species. The expression profile of the BMP 3 gene varied at different stages of embryonic development and in the post-embryonic period among the lines, with the control layer showing higher expression than the broiler chicken. The protein was also detected in bone marrow tissue of the broiler and layer lines by western blotting. It is concluded that the BMP 3 gene sequence differed at the nucleotide and amino acid level among the lines, and that the gene was expressed differentially at different periods of embryonic development and in the post-hatch period.
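The three comparisons in this abstract (broiler versus reference, layer versus reference, broiler versus layer) are related by simple set arithmetic, sketched below. The variant strings are copied from the abstract; the variable names and the helper `position` are illustrative only.

```python
import re

# Differences from the reference gene, as listed in the abstract.
broiler = {"192A>G", "519C>T", "903G>A", "960C>G"}
layer = {"33G>C", "192A>G", "858G>A", "904G>A", "960C>G", "1257C>T"}


def position(change: str) -> int:
    """Extract the leading coding position from a change such as '519C>T'."""
    return int(re.match(r"\d+", change).group())


# Changes carried by both lines do not distinguish them from each other.
shared = broiler & layer
# Changes present in exactly one line are the broiler-versus-layer differences.
line_specific = broiler ^ layer
between_lines = sorted(position(c) for c in line_specific)

print(sorted(shared))
print(between_lines)
```

The symmetric difference gives positions 33, 519, 858, 903, 904 and 1257, the same six sites the abstract reports between the two lines, while 192A>G and 960C>G are shared by both lines.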
Interspecific variation in mitochondrial serine transfer RNA (UCN) in Euptychiina butterflies (Lepidoptera: Satyrinae): structure and alignment.

PubMed

Marín, Mario Alejandro; López, Andrés; Uribe, Sandra Inés

2012-06-01

The nucleotide variation and structural patterns of mitochondrial RNA molecules have been proposed as useful tools in molecular systematics; however, their usefulness is always subject to a proper assessment of homology in the sequence alignment. The present study describes the secondary structure of mitochondrial tRNA for the amino acid serine (UCN) in 13 Euptychiina species and the evaluation of its potential use for evolutionary studies in this group of butterflies. The secondary structure of tRNAs showed variation among the included species except between Hermeuptychia sp1 and sp2. Variation was concentrated in the ribothymidine-pseudouridine-cytosine (TψC), dihydrouridine (DHU) and variable loops and in the DHU and TψC arms. These results suggest this region as a potential marker useful for taxonomic differentiation of species in this group and also confirm the importance of including information from the secondary structure of tRNA to optimize the alignments.
Exploring variation-aware contig graphs for (comparative) metagenomics using MaryGold

PubMed Central

Nijkamp, Jurgen F.; Pop, Mihai; Reinders, Marcel J. T.; de Ridder, Dick

2013-01-01

Motivation: Although many tools are available to study variation and its impact in single genomes, there is a lack of algorithms for finding such variation in metagenomes. This hampers the interpretation of metagenomics sequencing datasets, which are increasingly acquired in research on the (human) microbiome, in environmental studies and in the study of processes in the production of foods and beverages. Existing algorithms often depend on the use of reference genomes, which poses a problem when a metagenome of a priori unknown strain composition is studied. In this article, we develop a method to perform reference-free detection and visual exploration of genomic variation, both within a single metagenome and between metagenomes. Results: We present the MaryGold algorithm and its implementation, which efficiently detects bubble structures in contig graphs using graph decomposition. These bubbles represent variable genomic regions in closely related strains in metagenomic samples. The variation found is presented in a condensed Circos-based visualization, which allows for easy exploration and interpretation of the found variation. We validated the algorithm on two simulated datasets containing three and seven Escherichia coli genomes, respectively, and showed that finding allelic variation in these genomes improves assemblies. Additionally, we applied MaryGold to publicly available real metagenomic datasets, enabling us to find within-sample genomic variation in the metagenomes of a kimchi fermentation process, the microbiome of a premature infant and in microbial communities living on acid mine drainage. Moreover, we used MaryGold for between-sample variation detection and exploration by comparing sequencing data sampled at different time points for both of these datasets. Availability: MaryGold has been written in C++ and Python and can be downloaded from http://bioinformatics.tudelft.nl/software Contact: d.deridder@tudelft.nl PMID:24058058
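To make the idea of a "bubble" in a contig graph concrete, the following is a minimal sketch that finds only the simplest kind of bubble: a source contig that branches into two or more alternative contigs which all rejoin at the same sink. It is not MaryGold's graph-decomposition method; the function name, the edge-list input, and the contig labels are assumptions made for illustration.

```python
from collections import defaultdict

def find_simple_bubbles(edges):
    """Return (source, sink, branches) for each simple bubble in a directed graph.

    A simple bubble here means: a source node with two or more successors,
    where every successor is entered only from the source, has exactly one
    outgoing edge, and all successors lead to the same sink node.
    """
    succ = defaultdict(set)
    pred = defaultdict(set)
    for u, v in edges:
        succ[u].add(v)
        pred[v].add(u)

    bubbles = []
    for source in sorted(succ):
        branches = sorted(succ[source])
        if len(branches) < 2:
            continue
        # Reject branches with extra incoming edges or with zero/multiple exits.
        if any(pred[b] != {source} or len(succ[b]) != 1 for b in branches):
            continue
        sinks = {next(iter(succ[b])) for b in branches}
        if len(sinks) == 1:
            bubbles.append((source, sinks.pop(), branches))
    return bubbles

# Hypothetical contig graph: c2a and c2b are alternative variants between c1 and c3.
toy_edges = [("c1", "c2a"), ("c1", "c2b"), ("c2a", "c3"), ("c2b", "c3"), ("c3", "c4")]
bubbles = find_simple_bubbles(toy_edges)
```

In this toy graph the two branch contigs would correspond to a variable region shared by closely related strains; nested or longer bubbles need a more general search than this sketch provides.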
Genomic Landscape of Intrahost Variation in Group A Streptococcus: Repeated and Abundant Mutational Inactivation of the fabT Gene Encoding a Regulator of Fatty Acid Synthesis

PubMed Central

Eraso, Jesus M.; Olsen, Randall J.; Beres, Stephen B.; Kachroo, Priyanka; Porter, Adeline R.; Nasser, Waleed; Bernard, Paul E.; DeLeo, Frank R.

2016-01-01

To obtain new information about Streptococcus pyogenes intrahost genetic variation during invasive infection, we sequenced the genomes of 2,954 serotype M1 strains recovered from a nonhuman primate experimental model of necrotizing fasciitis. A total of 644 strains (21.8%) acquired polymorphisms relative to the input parental strain. The fabT gene, encoding a transcriptional regulator of fatty acid biosynthesis genes, contained 54.5% of these changes. The great majority of polymorphisms were predicted to deleteriously alter FabT function. Transcriptome-sequencing (RNA-seq) analysis of a wild-type strain and an isogenic fabT deletion mutant strain found that between 3.7 and 28.5% of the S. pyogenes transcripts were differentially expressed, depending on the growth temperature (35°C or 40°C) and growth phase (mid-exponential or stationary phase). Genes implicated in fatty acid synthesis and lipid metabolism were significantly upregulated in the fabT deletion mutant strain. FabT also directly or indirectly regulated central carbon metabolism genes, including pyruvate hub enzymes and fermentation pathways and virulence genes. Deletion of fabT decreased virulence in a nonhuman primate model of necrotizing fasciitis. In addition, the fabT deletion strain had significantly decreased survival in human whole blood and during phagocytic interaction with polymorphonuclear leukocytes ex vivo. We conclude that FabT mutant progeny arise during infection, constitute a metabolically distinct subpopulation, and are less virulent in the experimental models used here. PMID:27600505
Survey of duckweed diversity in Lake Chao and total fatty acid, triacylglycerol, profiles of representative strains.

PubMed

Tang, J; Li, Y; Ma, J; Cheng, J J

2015-09-01

Lemnaceae (duckweeds) are widely distributed aquatic flowering plants. Their high growth rate, starch content and suitability for bioremediation make them potential feedstock for biofuels. However, few natural duckweed resources have been investigated in China, and there is no information about total fatty acid (TFA) and triacylglycerol (TAG) composition of duckweeds from China. Here, the genetic diversity of a natural duckweed population collected from Lake Chao, China, was investigated using multilocus sequence typing (MLST). The 54 strains were categorised into four species in four genera, representing 12 distinct sequence types. Strains representing Lemna aequinoctialis and Spirodela polyrhiza were predominant. Interestingly, a surprisingly high degree of genetic diversification within L. aequinoctialis was observed. The four duckweed species revealed a uniform fatty acid composition, with three fatty acids, palmitic acid, linoleic acid and linolenic acid, accounting for more than 80% of the TFA. The TFA in biomass varied among species, ranging from 1.05% (of dry weight, DW) for L. punctata and S. polyrhiza to 1.62% for Wolffia globosa. The four duckweed species contained similar TAG contents, 0.02% mg · DW(-1). The fatty acid profiles of TAG were different from those of TFA, and also varied among the four species. The survey investigated the genetic diversity of duckweeds from Lake Chao, and provides an initial insight into TFA and TAG of four duckweed species, indicating that intraspecific and interspecific variations exist in the content and composition of both TFA and TAG in comparison with other studies. © 2015 German Botanical Society and The Royal Botanical Society of the Netherlands.
Universal digital high-resolution melt: a novel approach to broad-based profiling of heterogeneous biological samples.

PubMed

Fraley, Stephanie I; Hardick, Justin; Masek, Billie J; Athamanolap, Pornpat; Rothman, Richard E; Gaydos, Charlotte A; Carroll, Karen C; Wakefield, Teresa; Wang, Tza-Huei; Yang, Samuel

2013-10-01

Comprehensive profiling of nucleic acids in genetically heterogeneous samples is important for clinical and basic research applications. Universal digital high-resolution melt (U-dHRM) is a new approach to broad-based PCR diagnostics and profiling technologies that can overcome issues of poor sensitivity due to contaminating nucleic acids and poor specificity due to primer or probe hybridization inaccuracies for single nucleotide variations. The U-dHRM approach uses broad-based primers or ligated adapter sequences to universally amplify all nucleic acid molecules in a heterogeneous sample, which have been partitioned, as in digital PCR. Extensive assay optimization enables direct sequence identification by algorithm-based matching of melt curve shape and Tm to a database of known sequence-specific melt curves. We show that single-molecule detection and single nucleotide sensitivity is possible. The feasibility and utility of U-dHRM is demonstrated through detection of bacteria associated with polymicrobial blood infection and microRNAs (miRNAs) associated with host response to infection. U-dHRM using broad-based 16S rRNA gene primers demonstrates universal single cell detection of bacterial pathogens, even in the presence of larger amounts of contaminating bacteria; U-dHRM using universally adapted Lethal-7 miRNAs in a heterogeneous mixture showcases the single copy sensitivity and single nucleotide specificity of this approach.
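The matching of a melt curve against a database of known curves can be illustrated with a small sketch. It assumes that all curves are sampled on the same temperature grid, scales each curve to the range 0 to 1, and picks the reference with the smallest sum of squared differences; it also estimates Tm as the midpoint of the steepest fluorescence drop. The function names, the distance measure, and the toy curves are assumptions for illustration, not the published U-dHRM algorithm.

```python
def melting_temperature(temps, fluorescence):
    """Estimate Tm as the midpoint of the interval with the steepest fluorescence drop."""
    steepest = max(
        range(len(temps) - 1),
        key=lambda i: (fluorescence[i] - fluorescence[i + 1]) / (temps[i + 1] - temps[i]),
    )
    return (temps[steepest] + temps[steepest + 1]) / 2

def normalize(curve):
    """Scale a curve to the range 0..1 so that only its shape is compared."""
    lo, hi = min(curve), max(curve)
    return [(x - lo) / (hi - lo) for x in curve]

def best_match(query, references):
    """Return the name of the reference curve closest in shape to the query."""
    q = normalize(query)

    def distance(name):
        r = normalize(references[name])
        return sum((a - b) ** 2 for a, b in zip(q, r))

    return min(references, key=distance)

# Hypothetical data: temperatures in degrees C and raw fluorescence readings.
temps = [80, 82, 84, 86, 88, 90]
references = {
    "seq_A": [1.0, 0.95, 0.5, 0.1, 0.05, 0.0],
    "seq_B": [1.0, 0.98, 0.9, 0.5, 0.1, 0.0],
}
query = [2.0, 1.9, 1.1, 0.3, 0.1, 0.0]
match = best_match(query, references)
tm_a = melting_temperature(temps, references["seq_A"])
```

A practical matcher would likely combine the shape distance with a Tm tolerance and would need to handle noise and differing temperature grids, which this sketch ignores.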
Activation of c-jun N-terminal kinase upon influenza A virus (IAV) infection is independent of pathogen-related receptors but dependent on amino acid sequence variations of IAV NS1.

PubMed

Nacken, Wolfgang; Anhlan, Darisuren; Hrincius, Eike R; Mostafa, Ahmed; Wolff, Thorsten; Sadewasser, Anne; Pleschka, Stephan; Ehrhardt, Christina; Ludwig, Stephan

2014-08-01

A hallmark cell response to influenza A virus (IAV) infections is the phosphorylation and activation of c-jun N-terminal kinase (JNK). However, so far it is not fully clear which molecules are involved in the activation of JNK upon IAV infection. Here, we report that the transfection of influenza viral-RNA induces JNK in a retinoic acid-inducible gene I (RIG-I)-dependent manner. However, neither RIG-I-like receptors nor MyD88-dependent Toll-like receptors were found to be involved in the activation of JNK upon IAV infection. Viral JNK activation may be blocked by addition of cycloheximide and heat shock protein inhibitors during infection, suggesting that the expression of an IAV-encoded protein is responsible for JNK activation. Indeed, the overexpression of nonstructural protein 1 (NS1) of certain IAV subtypes activated JNK, whereas those of some other subtypes failed to activate JNK. Site-directed mutagenesis experiments using NS1 of the IAV H7N7, H5N1, and H3N2 subtypes identified the amino acid residue phenylalanine (F) at position 103 to be decisive for JNK activation. Cleavage- and polyadenylation-specific factor 30 (CPSF30), whose binding to NS1 is stabilized by the amino acids F103 and M106, is not involved in JNK activation. Conclusively, subtype-specific sequence variations in the IAV NS1 protein result in subtype-specific differences in JNK signaling upon IAV infection. Influenza A virus (IAV) infection leads to the activation or modulation of multiple signaling pathways. Here, we demonstrate for the first time that the c-jun N-terminal kinase (JNK), a long-known stress-activated mitogen-activated protein (MAP) kinase, is activated by RIG-I when cells are treated with IAV RNA. However, at the same time, nonstructural protein 1 (NS1) of IAV has an intrinsic JNK-activating property that is dependent on IAV subtype-specific amino acid variations around position 103. 
Our findings identify two different and independent pathways that result in the activation of JNK in the course of an IAV infection. Copyright © 2014, American Society for Microbiology. All Rights Reserved.
Activation of c-jun N-Terminal Kinase upon Influenza A Virus (IAV) Infection Is Independent of Pathogen-Related Receptors but Dependent on Amino Acid Sequence Variations of IAV NS1

PubMed Central

Nacken, Wolfgang; Anhlan, Darisuren; Hrincius, Eike R.; Mostafa, Ahmed; Wolff, Thorsten; Sadewasser, Anne; Pleschka, Stephan; Ehrhardt, Christina

2014-01-01

ABSTRACT A hallmark cell response to influenza A virus (IAV) infections is the phosphorylation and activation of c-jun N-terminal kinase (JNK). However, so far it is not fully clear which molecules are involved in the activation of JNK upon IAV infection. Here, we report that the transfection of influenza viral-RNA induces JNK in a retinoic acid-inducible gene I (RIG-I)-dependent manner. However, neither RIG-I-like receptors nor MyD88-dependent Toll-like receptors were found to be involved in the activation of JNK upon IAV infection. Viral JNK activation may be blocked by addition of cycloheximide and heat shock protein inhibitors during infection, suggesting that the expression of an IAV-encoded protein is responsible for JNK activation. Indeed, the overexpression of nonstructural protein 1 (NS1) of certain IAV subtypes activated JNK, whereas those of some other subtypes failed to activate JNK. Site-directed mutagenesis experiments using NS1 of the IAV H7N7, H5N1, and H3N2 subtypes identified the amino acid residue phenylalanine (F) at position 103 to be decisive for JNK activation. Cleavage- and polyadenylation-specific factor 30 (CPSF30), whose binding to NS1 is stabilized by the amino acids F103 and M106, is not involved in JNK activation. Conclusively, subtype-specific sequence variations in the IAV NS1 protein result in subtype-specific differences in JNK signaling upon IAV infection. IMPORTANCE Influenza A virus (IAV) infection leads to the activation or modulation of multiple signaling pathways. Here, we demonstrate for the first time that the c-jun N-terminal kinase (JNK), a long-known stress-activated mitogen-activated protein (MAP) kinase, is activated by RIG-I when cells are treated with IAV RNA. However, at the same time, nonstructural protein 1 (NS1) of IAV has an intrinsic JNK-activating property that is dependent on IAV subtype-specific amino acid variations around position 103. 
Our findings identify two different and independent pathways that result in the activation of JNK in the course of an IAV infection. PMID:24872593
The nucleotide sequence and genome organization of Plasmopara halstedii virus.

PubMed

Heller-Dohmen, Marion; Göpfert, Jens C; Pfannstiel, Jens; Spring, Otmar

2011-03-17

Only very few viruses of Oomycetes have been studied in detail. Isometric virions were found in different isolates of the oomycete Plasmopara halstedii, the downy mildew pathogen of sunflower. However, complete nucleotide sequences and data on the genome organization were lacking. Viral RNA of different P. halstedii isolates was subjected to nucleotide sequencing and analysis of the viral genome. The N-terminal sequence of the viral coat protein was determined using Top-Down MALDI-TOF analysis. The complete nucleotide sequences of both single-stranded RNA segments (RNA1 and RNA2) were established. RNA1 consisted of 2793 nucleotides (nt) excluding its 3' poly(A) tract and a single open-reading frame (ORF1) of 2745 nt. ORF1 was framed by a 5' untranslated region (5' UTR) of 18 nt and a 3' untranslated region (3' UTR) of 30 nt. ORF1 contained motifs of RNA-dependent RNA polymerases (RdRp) and showed similarities to RdRp of Sclerophthora macrospora virus A (SmV A) and viruses within the Nodaviridae family. RNA2 consisted of 1526 nt excluding its 3' poly(A) tract and a second ORF (ORF2) of 1128 nt. ORF2 coded for the single viral coat protein (CP) and was framed by a 5' UTR of 164 nt and a 3' UTR of 234 nt. The deduced amino acid sequence of ORF2 was verified by nano-LC-ESI-MS/MS experiments. Top-Down MALDI-TOF analysis revealed the N-terminal sequence of the CP. The N-terminal sequence represented a region within ORF2 suggesting a proteolytic processing of the CP in vivo. The CP showed similarities to CP of SmV A and viruses within the Tombusviridae family. Fragments of RNA1 (ca. 1.9 kb) and RNA2 (ca. 1.4 kb) were used to analyze the nucleotide sequence variation of virions in different P. halstedii isolates. Viral sequence variation was 0.3% or less regardless of their host's pathotypes, the geographical origin and the sensitivity towards the fungicide metalaxyl. The results showed the presence of a single and new virus type in different P. halstedii isolates. 
Insignificant viral sequence variation indicated that the virus did not account for differences in pathogenicity of the oomycete P. halstedii.
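The segment lengths reported above can be cross-checked with simple arithmetic: for each RNA, the 5' UTR, the ORF and the 3' UTR should add up to the stated total. The short sketch below does this using only the numbers given in the abstract; the dictionary layout and function names are illustrative assumptions.

```python
# Lengths in nucleotides as reported in the abstract (poly(A) tract excluded).
segments = {
    "RNA1": {"utr5": 18, "orf": 2745, "utr3": 30, "total": 2793},
    "RNA2": {"utr5": 164, "orf": 1128, "utr3": 234, "total": 1526},
}

def parts_match_total(seg):
    """True if 5' UTR + ORF + 3' UTR equals the reported segment length."""
    return seg["utr5"] + seg["orf"] + seg["utr3"] == seg["total"]

def orf_triplets(seg):
    """Number of whole nucleotide triplets in the reported ORF length."""
    return seg["orf"] // 3

checks = {name: parts_match_total(seg) for name, seg in segments.items()}
triplets = {name: orf_triplets(seg) for name, seg in segments.items()}
```

Both segments are internally consistent, and both ORF lengths are multiples of three (915 and 376 triplets). Whether those counts include a stop codon depends on how the ORF length was reported, which the abstract does not state.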
37 CFR 1.822 - Symbols and format to be used for nucleotide and/or amino acid sequence data.

Code of Federal Regulations, 2011 CFR

2011-07-01

... for nucleotide and/or amino acid sequence data. 1.822 Section 1.822 Patents, Trademarks, and... Amino Acid Sequences § 1.822 Symbols and format to be used for nucleotide and/or amino acid sequence data. (a) The symbols and format to be used for nucleotide and/or amino acid sequence data shall...
Genome-Wide Association Studies Identify Heavy Metal ATPase3 as the Primary Determinant of Natural Variation in Leaf Cadmium in Arabidopsis thaliana

PubMed Central

Chao, Dai-Yin; Silva, Adriano; Baxter, Ivan; Huang, Yu S.; Nordborg, Magnus; Danku, John; Lahner, Brett; Yakubova, Elena; Salt, David E.

2012-01-01

Understanding the mechanism of cadmium (Cd) accumulation in plants is important to help reduce its potential toxicity to both plants and humans through dietary and environmental exposure. Here, we report on a study to uncover the genetic basis underlying natural variation in Cd accumulation in a world-wide collection of 349 wild collected Arabidopsis thaliana accessions. We identified a 4-fold variation (0.5–2 µg Cd g−1 dry weight) in leaf Cd accumulation when these accessions were grown in a controlled common garden. By combining genome-wide association mapping, linkage mapping in an experimental F2 population, and transgenic complementation, we reveal that HMA3 is the sole major locus responsible for the variation in leaf Cd accumulation we observe in this diverse population of A. thaliana accessions. Analysis of the predicted amino acid sequence of HMA3 from 149 A. thaliana accessions reveals the existence of 10 major natural protein haplotypes. Association of these haplotypes with leaf Cd accumulation and genetic complementation experiments indicate that 5 of these haplotypes are active and 5 are inactive, and that elevated leaf Cd accumulation is associated with the reduced function of HMA3 caused by a nonsense mutation and polymorphisms that change two specific amino acids. PMID:22969436
Regional variations in the diversity and predicted metabolic potential of benthic prokaryotes in coastal northern Zhejiang, East China Sea

PubMed Central

Wang, Kai; Ye, Xiansen; Zhang, Huajun; Chen, Heping; Zhang, Demin; Liu, Lian

2016-01-01

Knowledge about the drivers of benthic prokaryotic diversity and metabolic potential in interconnected coastal sediments at regional scales is limited. We collected surface sediments across six zones covering ~200 km in coastal northern Zhejiang, East China Sea and combined 16S rRNA gene sequencing, community-level metabolic prediction, and sediment physicochemical measurements to investigate variations in prokaryotic diversity and metabolic gene composition with geographic distance and under local environmental conditions. Geographic distance was the most influential factor in prokaryotic β-diversity compared with major environmental drivers, including temperature, sediment texture, acid-volatile sulfide, and water depth, but a large unexplained variation in community composition suggested the potential effects of unmeasured abiotic/biotic factors and stochastic processes. Moreover, prokaryotic assemblages showed a biogeographic provincialism across the zones. The predicted metabolic gene composition shifted in parallel with taxonomic composition. Acid-volatile sulfide was strongly correlated with variation in metabolic gene composition. Enrichments in the relative abundance of sulfate-reducing bacteria and of genes relevant to dissimilatory sulfate reduction were observed and predicted, respectively, in the Yushan area. These results provide insights into the relative importance of geographic distance and environmental condition in driving benthic prokaryotic diversity in coastal areas and predict specific biogeochemically-relevant genes for future studies. PMID:27917954
PopHuman: the human population genomics browser.

PubMed

Casillas, Sònia; Mulet, Roger; Villegas-Mirón, Pablo; Hervas, Sergi; Sanz, Esteve; Velasco, Daniel; Bertranpetit, Jaume; Laayouni, Hafid; Barbadilla, Antonio

2018-01-04

The 1000 Genomes Project (1000GP) represents the most comprehensive world-wide nucleotide variation data set so far in humans, providing the sequencing and analysis of 2504 genomes from 26 populations and reporting >84 million variants. The availability of this sequence data provides the human lineage with an invaluable resource for population genomics studies, allowing the testing of molecular population genetics hypotheses and eventually the understanding of the evolutionary dynamics of genetic variation in human populations. Here we present PopHuman, a new population genomics-oriented genome browser based on JBrowse that allows the interactive visualization and retrieval of an extensive inventory of population genetics metrics. Efficient and reliable parameter estimates have been computed using a novel pipeline that faces the unique features and limitations of the 1000GP data, and include a battery of nucleotide variation measures, divergence and linkage disequilibrium parameters, as well as different tests of neutrality, estimated in non-overlapping windows along the chromosomes and in annotated genes for all 26 populations of the 1000GP. PopHuman is open and freely available at http://pophuman.uab.cat. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
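As an illustration of a nucleotide variation measure estimated in non-overlapping windows, the sketch below computes per-site nucleotide diversity (π) from biallelic variant positions and alternate-allele counts. It is not the PopHuman pipeline; the input format, function name, and toy numbers are assumptions, and it treats every site in a window as callable.

```python
def windowed_pi(variants, n_chromosomes, window_size, region_length):
    """Per-site nucleotide diversity in consecutive non-overlapping windows.

    variants: iterable of (position, alt_allele_count) with 1-based positions.
    Each biallelic site contributes 2*j*(n-j) / (n*(n-1)), the expected
    number of differences between two randomly drawn chromosomes.
    """
    n = n_chromosomes
    n_windows = -(-region_length // window_size)  # ceiling division
    totals = [0.0] * n_windows
    for pos, alt in variants:
        totals[(pos - 1) // window_size] += 2 * alt * (n - alt) / (n * (n - 1))
    result = []
    for w, total in enumerate(totals):
        # The last window may be shorter than window_size.
        length = min(window_size, region_length - w * window_size)
        result.append(total / length)
    return result

# Hypothetical toy data: 4 sampled chromosomes, a 20 bp region, 10 bp windows.
toy_variants = [(3, 1), (7, 2), (15, 2)]
pi_values = windowed_pi(toy_variants, n_chromosomes=4, window_size=10, region_length=20)
```

Real data would additionally need masks for sites that could not be genotyped, since dividing by the full window length underestimates π where coverage is incomplete.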
Scop3D: three-dimensional visualization of sequence conservation.

PubMed

Vermeire, Tessa; Vermaere, Stijn; Schepens, Bert; Saelens, Xavier; Van Gucht, Steven; Martens, Lennart; Vandermarliere, Elien

2015-04-01

The integration of a protein's structure with its known sequence variation provides insight on how that protein evolves, for instance in terms of (changing) function or immunogenicity. Yet, collating the corresponding sequence variants into a multiple sequence alignment, calculating each position's conservation, and mapping this information back onto a relevant structure is not straightforward. We therefore built the Sequence Conservation on Protein 3D structure (scop3D) tool to perform these tasks automatically. The output consists of two modified PDB files in which the B-values for each position are replaced by the percentage sequence conservation, or the information entropy for each position, respectively. Furthermore, text files with absolute and relative amino acid occurrences for each position are also provided, along with snapshots of the protein from six distinct directions in space. The visualization provided by scop3D can for instance be used as an aid in vaccine development or to identify antigenic hotspots, which we here demonstrate based on an analysis of the fusion proteins of human respiratory syncytial virus and mumps virus. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
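The two per-position quantities mentioned above, percentage conservation and information entropy, can be sketched for a small multiple sequence alignment as follows. This is an illustrative calculation only: it assumes conservation means the share of the most frequent residue in a column and that gaps are counted like any other symbol, which may differ from how scop3D defines them. The function name and toy alignment are hypothetical.

```python
import math
from collections import Counter

def column_stats(alignment):
    """Return a list of (percent_conservation, entropy_bits), one per alignment column."""
    stats = []
    for column in zip(*alignment):
        counts = Counter(column)
        total = len(column)
        most_common_count = counts.most_common(1)[0][1]
        # Shannon entropy in bits over the residue frequencies of this column.
        entropy = sum(-(n / total) * math.log2(n / total) for n in counts.values())
        stats.append((100.0 * most_common_count / total, entropy))
    return stats

# Hypothetical alignment of four equally long sequences.
toy_alignment = ["MKV", "MKI", "MRI", "MKL"]
stats = column_stats(toy_alignment)
```

Here the first column is fully conserved (entropy 0), while the third column, with three different residues, has the lowest conservation and the highest entropy. Values like these are what would be written into the B-value field of a structure file for colouring.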
Polymorphisms and variants in the prion protein sequence of European moose (Alces alces), reindeer (Rangifer tarandus), roe deer (Capreolus capreolus) and fallow deer (Dama dama) in Scandinavia

PubMed Central

Wik, Lotta; Mikko, Sofia; Klingeborn, Mikael; Stéen, Margareta; Simonsson, Magnus; Linné, Tommy

2012-01-01

The prion protein (PrP) sequence of European moose, reindeer, roe deer and fallow deer in Scandinavia has high homology to the PrP sequence of North American cervids. Variants in the European moose PrP sequence were found at amino acid position 109 as K or Q. The 109Q variant is unique in the PrP sequence of vertebrates. During the 1980s a wasting syndrome in Swedish moose, Moose Wasting Syndrome (MWS), was described. SNP analysis demonstrated a difference in the observed genotype proportions of the heterozygous Q/K and homozygous Q/Q variants in the MWS animals compared with the healthy animals. In MWS moose the allele frequencies for 109K and 109Q were 0.73 and 0.27, respectively, and for healthy animals 0.69 and 0.31. Both alleles were seen as heterozygotes and homozygotes. In reindeer, PrP sequence variation was demonstrated at codon 176 as D or N and codon 225 as S or Y. The PrP sequences in roe deer and fallow deer were identical with published GenBank sequences. PMID:22441661
Microsatellite analysis in the genome of Acanthaceae: An in silico approach.

PubMed

Kaliswamy, Priyadharsini; Vellingiri, Srividhya; Nathan, Bharathi; Selvaraj, Saravanakumar

2015-01-01

Acanthaceae is one of the advanced and specialized families with conventionally used medicinal plants. Simple sequence repeats (SSRs) play a major role as molecular markers for genome analysis and plant breeding. The microsatellites existing in the complete genome sequences would help to attain a direct role in the genome organization, recombination, gene regulation, quantitative genetic variation, and evolution of genes. The current study reports the frequency of microsatellites and appropriate markers for the Acanthaceae family genome sequences. The whole nucleotide sequences of Acanthaceae species were obtained from the National Center for Biotechnology Information database and screened for the presence of SSRs. The SSR Locator tool was used to predict the microsatellites and its inbuilt Primer3 module was used for primer design. In total, 110 repeats from 108 sequences of Acanthaceae family plant genomes were identified, and dinucleotide repeats were found to be abundant in the genome sequences. The essential amino acid isoleucine was found to be abundant in all the sequences. We also designed SSR-based primers/markers for 59 sequences of this family that contain microsatellite repeats in their genome. The identified microsatellites and primers might be useful for breeding and genetic studies of plants that belong to the Acanthaceae family in the future.
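Screening a nucleotide sequence for perfect simple sequence repeats can be sketched with a regular expression. The example below is not the SSR Locator algorithm; the function name, the thresholds (motifs of 1 to 6 nt, at least 4 tandem copies), and the toy sequence are assumptions chosen for illustration.

```python
import re

def find_ssrs(sequence, min_repeats=4, max_motif=6):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    start is a 0-based index. The motif group is lazy, so the shortest
    repeating unit is reported (for example AT rather than ATAT).
    """
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1))
    hits = []
    for match in pattern.finditer(sequence.upper()):
        motif = match.group(1)
        hits.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return hits

# Hypothetical sequence with one dinucleotide and one trinucleotide repeat.
toy_sequence = "TTGCATATATATGGCCAGCAGCAGCAGTT"
ssrs = find_ssrs(toy_sequence)
```

Dedicated tools typically apply different minimum repeat counts per motif length and can report imperfect or compound repeats, none of which this sketch attempts.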
Intact coding region of the serotonin transporter gene in obsessive-compulsive disorder

DOE Office of Scientific and Technical Information (OSTI.GOV)

Altemus, M.; Murphy, D.L.; Greenberg, B.

1996-07-26

Epidemiologic studies indicate that obsessive-compulsive disorder is genetically transmitted in some families, although no genetic abnormalities have been identified in individuals with this disorder. The selective response of obsessive-compulsive disorder to treatment with agents which block serotonin reuptake suggests the gene coding for the serotonin transporter as a candidate gene. The primary structure of the serotonin-transporter coding region was sequenced in 22 patients with obsessive-compulsive disorder, using direct PCR sequencing of cDNA synthesized from platelet serotonin-transporter mRNA. No variations in amino acid sequence were found among the obsessive-compulsive disorder patients or healthy controls. These results do not support a role for alteration in the primary structure of the coding region of the serotonin-transporter gene in the pathogenesis of obsessive-compulsive disorder. 27 refs.

RNA sequencing to study gene expression and single nucleotide polymorphism variation associated with citrate content in cow milk.

PubMed

Cánovas, A; Rincón, G; Islas-Trejo, A; Jimenez-Flores, R; Laubscher, A; Medrano, J F

2013-04-01

The technological properties of milk have significant importance for the dairy industry. Citrate, a normal constituent of milk, forms one of the main buffer systems that regulate the equilibrium between Ca(2+) and H(+) ions. Higher-than-normal citrate content is associated with poor coagulation properties of milk. To identify the genes responsible for the variation of citrate content in milk in dairy cattle, the metabolic steps involved in citrate and fatty acid synthesis pathways in ruminant mammary tissue using RNA sequencing were studied. Genetic markers that could influence milk citrate content in Holstein cows were used in a marker-trait association study to establish the relationship between 74 single nucleotide polymorphisms (SNP) in 20 candidate genes and citrate content in 250 Holstein cows. This analysis revealed 6 SNP in key metabolic pathway genes [isocitrate dehydrogenase 1 (NADP+), soluble (IDH1); pyruvate dehydrogenase (lipoamide) β (PDHB); pyruvate kinase (PKM2); and solute carrier family 25 (mitochondrial carrier; citrate transporter), member 1 (SLC25A1)] significantly associated with increased milk citrate content. The amount of the phenotypic variation explained by the 6 SNP ranged from 10.1 to 13.7%. Also, genotype-combination analysis revealed the highest phenotypic variation was explained combining IDH1_23211, PDHB_5562, and SLC25A1_4446 genotypes. This specific genotype combination explained 21.3% of the phenotypic variation. The largest citrate associated effect was in the 3' untranslated region of the SLC25A1 gene, which is responsible for the transport of citrate across the mitochondrial inner membrane. This study provides an approach using RNA sequencing, metabolic pathway analysis, and association studies to identify genetic variation in functional target genes determining complex trait phenotypes. Copyright © 2013 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
Solid phase sequencing of double-stranded nucleic acids

DOEpatents

Fu, Dong-Jing; Cantor, Charles R.; Koster, Hubert; Smith, Cassandra L.

2002-01-01

This invention relates to methods for detecting and sequencing of target double-stranded nucleic acid sequences, to nucleic acid probes and arrays of probes useful in these methods, and to kits and systems which contain these probes. Useful methods involve hybridizing the nucleic acids or nucleic acids which represent complementary or homologous sequences of the target to an array of nucleic acid probes. These probes comprise a single-stranded portion, an optional double-stranded portion and a variable sequence within the single-stranded portion. The molecular weights of the hybridized nucleic acids of the set can be determined by mass spectroscopy, and the sequence of the target determined from the molecular weights of the fragments. Nucleic acids whose sequences can be determined include nucleic acids in biological samples such as patient biopsies and environmental samples. Probes may be fixed to a solid support such as a hybridization chip to facilitate automated determination of molecular weights and identification of the target sequence.
Searching for evidence of selection in avian DNA barcodes.

PubMed

Kerr, Kevin C R

2011-11-01

The barcode of life project has assembled a tremendous number of mitochondrial cytochrome c oxidase I (COI) sequences. Although these sequences were gathered to develop a DNA-based system for species identification, it has been suggested that further biological inferences may also be derived from this wealth of data. Recurrent selective sweeps have been invoked as an evolutionary mechanism to explain limited intraspecific COI diversity, particularly in birds, but this hypothesis has not been formally tested. In this study, I collated COI sequences from previous barcoding studies on birds and tested them for evidence of selection. Using this expanded data set, I re-examined the relationships between intraspecific diversity and interspecific divergence and sampling effort, respectively. I employed the McDonald-Kreitman test to test for neutrality in sequence evolution between closely related pairs of species. Because amino acid sequences were generally constrained between closely related pairs, I also included broader intra-order comparisons to quantify patterns of protein variation in avian COI sequences. Lastly, using 22 published whole mitochondrial genomes, I compared the evolutionary rate of COI against the other 12 protein-coding mitochondrial genes to assess intragenomic variability. I found no conclusive evidence of selective sweeps. Most evidence pointed to an overall trend of strong purifying selection and functional constraint. The COI protein did vary across the class Aves, but to a very limited extent. COI was the least variable gene in the mitochondrial genome, suggesting that other genes might be more informative for probing factors constraining mitochondrial variation within species. © 2011 Blackwell Publishing Ltd.
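The McDonald-Kreitman test mentioned above compares nonsynonymous and synonymous changes that are fixed between species (Dn, Ds) with those that are polymorphic within species (Pn, Ps). A minimal sketch follows, using a two-sided Fisher's exact test on the 2x2 table and the neutrality index NI = (Pn/Ps)/(Dn/Ds). The counts are invented for illustration and are not taken from the study; the function names are assumptions.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    observed = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1)) if p <= observed * (1 + 1e-9))

def mcdonald_kreitman(dn, ds, pn, ps):
    """Return (neutrality_index, p_value); requires dn, ds and ps to be non-zero."""
    neutrality_index = (pn / ps) / (dn / ds)
    return neutrality_index, fisher_exact_two_sided(dn, ds, pn, ps)

# Hypothetical counts for one species pair.
ni, p_value = mcdonald_kreitman(dn=2, ds=20, pn=10, ps=20)
```

A neutrality index above 1 indicates an excess of amino acid polymorphism relative to divergence, the pattern usually read as purifying selection; values below 1 point toward an excess of fixed amino acid differences.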
Evaluating variation in human gut microbiota profiles due to DNA extraction method and inter-subject differences.

PubMed

Wagner Mackenzie, Brett; Waite, David W; Taylor, Michael W

2015-01-01

The human gut contains dense and diverse microbial communities which have profound influences on human health. Gaining meaningful insights into these communities requires provision of high quality microbial nucleic acids from human fecal samples, as well as an understanding of the sources of variation and their impacts on the experimental model. We present here a systematic analysis of commonly used microbial DNA extraction methods, and identify significant sources of variation. Five extraction methods (Human Microbiome Project protocol, MoBio PowerSoil DNA Isolation Kit, QIAamp DNA Stool Mini Kit, ZR Fecal DNA MiniPrep, phenol:chloroform-based DNA isolation) were evaluated based on the following criteria: DNA yield, quality and integrity, and microbial community structure based on Illumina amplicon sequencing of the V4 region of bacterial and archaeal 16S rRNA genes. Our results indicate that the largest portion of variation within the model was attributed to differences between subjects (biological variation), with a smaller proportion of variation associated with DNA extraction method (technical variation) and intra-subject variation. A comprehensive understanding of the potential impact of technical variation on the human gut microbiota will help limit preventable bias, enabling more accurate diversity estimates.
ActiveDriverDB: human disease mutations and genome variation in post-translational modification sites of proteins

PubMed Central

Krassowski, Michal; Paczkowska, Marta; Cullion, Kim; Huang, Tina; Dzneladze, Irakli; Ouellette, B F Francis; Yamada, Joseph T; Fradet-Turcotte, Amelie

2018-01-01

Abstract Interpretation of genetic variation is needed for deciphering genotype-phenotype associations, mechanisms of inherited disease, and cancer driver mutations. Millions of single nucleotide variants (SNVs) in human genomes are known and thousands are associated with disease. An estimated 21% of disease-associated amino acid substitutions corresponding to missense SNVs are located in protein sites of post-translational modifications (PTMs), chemical modifications of amino acids that extend protein function. ActiveDriverDB is a comprehensive human proteo-genomics database that annotates disease mutations and population variants through the lens of PTMs. We integrated >385,000 published PTM sites with ∼3.6 million substitutions from The Cancer Genome Atlas (TCGA), the ClinVar database of disease genes, and human genome sequencing projects. The database includes site-specific interaction networks of proteins, upstream enzymes such as kinases, and drugs targeting these enzymes. We also predicted network-rewiring impact of mutations by analyzing gains and losses of kinase-bound sequence motifs. ActiveDriverDB provides detailed visualization, filtering, browsing and searching options for studying PTM-associated mutations. Users can upload mutation datasets interactively and use our application programming interface in pipelines. Integrative analysis of mutations and PTMs may help decipher molecular mechanisms of phenotypes and disease, as exemplified by case studies of TP53, BRCA2 and VHL. The open-source database is available at https://www.ActiveDriverDB.org. PMID:29126202
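The gain and loss of kinase-bound sequence motifs described above can be illustrated with a small sketch. This is not the ActiveDriverDB method or its API: the function names are hypothetical, and the pattern `[ST]P` (a serine or threonine followed by proline) is used only as a simplified stand-in for a real kinase motif model.

```python
import re


def motif_sites(seq, pattern):
    """Return the 0-based start positions of every (overlapping) motif match."""
    return {m.start() for m in re.finditer(f"(?={pattern})", seq)}


def motif_change(seq, pos, alt, pattern):
    """Compare motif matches before and after substituting seq[pos] with alt.

    pos is 0-based; alt is a single-letter amino acid code.
    """
    mutant = seq[:pos] + alt + seq[pos + 1:]
    before = motif_sites(seq, pattern)
    after = motif_sites(mutant, pattern)
    return {"gained": sorted(after - before), "lost": sorted(before - after)}


# Toy peptide, for illustration only: replacing the proline removes the motif.
print(motif_change("AASPKR", 3, "A", "[ST]P"))
```

In practice the comparison would be restricted to a window around each annotated PTM site rather than run over the whole protein.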
A phylogenetic analysis using full-length viral genomes of South American dengue serotype 3 in consecutive Venezuelan outbreaks reveals novel NS5 mutation

PubMed Central

Schmidt, DJ; Pickett, BE; Camacho, D; Comach, G; Xhaja, K; Lennon, NJ; Rizzolo, K; de Bosch, N; Becerra, A; Nogueira, ML; Mondini, A; da Silva, EV; Vasconcelos, PF; Muñoz-Jordán, JL; Santiago, GA; Ocazionez, R; Gehrke, L; Lefkowitz, EJ; Birren, BW; Henn, MR; Bosch, I

2013-01-01

Dengue virus currently causes 50-100 million infections annually. Comprehensive knowledge about the evolution of Dengue in response to selection pressure is currently unavailable, but would greatly enhance vaccine design efforts. In the current study, we sequenced 187 new dengue virus serotype 3 (DENV-3) genotype III whole genomes isolated from Asia and the Americas. We analyzed them together with previously-sequenced isolates to gain a more detailed understanding of the evolutionary adaptations existing in this prevalent American serotype. In order to analyze the phylogenetic dynamics of DENV-3 during outbreak periods, we incorporated datasets of 48 and 11 sequences spanning two major outbreaks in Venezuela during 2001 and 2007-2008 respectively. Our phylogenetic analysis of newly sequenced viruses shows that subsets of genomes cluster primarily by geographic location, and secondarily by time of virus isolation. DENV-3 genotype III sequences from Asia are significantly divergent from those from the Americas due to their geographical separation and subsequent speciation. We measured amino acid variation for the E protein by calculating the Shannon entropy at each position between Asian and American genomes. We found a cluster of 7 amino acid substitutions having high variability within E protein domain III, which has previously been implicated in serotype-specific neutralization escape mutants. No novel mutations were found in the E protein of sequences isolated during either Venezuelan outbreak. Shannon entropy analysis of the NS5 polymerase mature protein revealed that a G374E mutation, in a region that contributes to interferon resistance in other flaviviruses by interfering with JAK-STAT signaling, was present in both the Asian and American sequences from the 2007-2008 Venezuelan outbreak, but was absent in the sequences from the 2001 Venezuelan outbreak. In addition to E, several NS5 amino acid changes were unique to the 2007-2008 epidemic in Venezuela and may give additional insight into the adaptive response of DENV-3 at the population level. PMID:21964598
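The per-position Shannon entropy used in this abstract can be sketched as follows. This is a generic illustration rather than the authors' code: it assumes the protein sequences are already aligned to equal length, ignores gap characters, and uses hypothetical function names and toy sequences.

```python
from collections import Counter
from math import log2


def column_entropy(column):
    """Shannon entropy (in bits) of the residues in one alignment column."""
    counts = Counter(res for res in column if res != "-")  # skip gaps
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return sum((n / total) * log2(total / n) for n in counts.values())


def alignment_entropy(seqs):
    """Return one entropy value per column of an aligned set of sequences."""
    if len({len(s) for s in seqs}) != 1:
        raise ValueError("sequences must be aligned to the same length")
    return [column_entropy(col) for col in zip(*seqs)]


# Toy alignment, for illustration only.
for position, h in enumerate(alignment_entropy(["MKT", "MRT", "MKT", "MRS"]), start=1):
    print(position, round(h, 3))
```

A fully conserved position scores 0 bits, and the score rises as more residues share the position more evenly, so clusters of high-entropy positions mark variable regions.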
Impact of germline and somatic missense variations on drug binding sites.

PubMed

Yan, C; Pattabiraman, N; Goecks, J; Lam, P; Nayak, A; Pan, Y; Torcivia-Rodriguez, J; Voskanian, A; Wan, Q; Mazumder, R

2017-03-01

Advancements in next-generation sequencing (NGS) technologies are generating a vast amount of data. This exacerbates the current challenge of translating NGS data into actionable clinical interpretations. We have comprehensively combined germline and somatic nonsynonymous single-nucleotide variations (nsSNVs) that affect drug binding sites in order to investigate their prevalence. The integrated data thus generated in conjunction with exome or whole-genome sequencing can be used to identify patients who may not respond to a specific drug because of alterations in drug binding efficacy due to nsSNVs in the target protein's gene. To identify the nsSNVs that may affect drug binding, protein-drug complex structures were retrieved from Protein Data Bank (PDB) followed by identification of amino acids in the protein-drug binding sites using an occluded surface method. Then, the germline and somatic mutations were mapped to these amino acids to identify which of these alter protein-drug binding sites. Using this method we identified 12 993 amino acid-drug binding sites across 253 unique proteins bound to 235 unique drugs. The integration of amino acid-drug binding sites data with both germline and somatic nsSNVs data sets revealed 3133 nsSNVs affecting amino acid-drug binding sites. In addition, a comprehensive drug target discovery was conducted based on protein structure similarity and conservation of amino acid-drug binding sites. Using this method, 81 paralogs were identified that could serve as alternative drug targets. In addition, non-human mammalian proteins bound to drugs were used to identify 142 homologs in humans that can potentially bind to drugs. In the current protein-drug pairs that contain somatic mutations within their binding site, we identified 85 proteins with significant differential gene expression changes associated with specific cancer types. 
Information on protein-drug binding, predicted drug target proteins, and the prevalence of both somatic and germline nsSNVs that disrupt these binding sites can provide valuable knowledge for personalized medicine treatment. A web portal is available where nsSNVs from an individual patient can be checked by scanning against DrugVar to determine whether any of the SNVs affect the binding of any drug in the database.
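The mapping step described in this abstract, in which variants are matched against binding-site residues, reduces to a position lookup once both data sets share a coordinate system. The sketch below illustrates that idea only; the data layout, the function name and the identifiers `PROT_A` and `PROT_B` are hypothetical and do not reflect the DrugVar portal or its data format.

```python
def variants_in_binding_sites(binding_sites, variants):
    """Return the variants whose residue position lies in a drug binding site.

    binding_sites: dict mapping protein ID -> set of 1-based residue positions.
    variants: iterable of (protein ID, position, reference aa, alternate aa).
    """
    return [
        v for v in variants
        if v[1] in binding_sites.get(v[0], set())
    ]


# Hypothetical data, for illustration only.
sites = {"PROT_A": {45, 46, 102}, "PROT_B": {12}}
patient_variants = [
    ("PROT_A", 46, "L", "F"),   # falls in a binding site
    ("PROT_A", 200, "G", "S"),  # outside any binding site
    ("PROT_C", 12, "K", "E"),   # protein with no annotated sites
]
print(variants_in_binding_sites(sites, patient_variants))
```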
Purification, characterization and molecular cloning of chymotrypsin inhibitor peptides from the venom of Burmese Daboia russelii siamensis.

PubMed

Guo, Chun-Teng; McClean, Stephen; Shaw, Chris; Rao, Ping-Fan; Ye, Ming-Yu; Bjourson, Anthony J

2013-05-01

One novel Kunitz BPTI-like peptide designated as BBPTI-1, with chymotrypsin inhibitory activity was identified from the venom of Burmese Daboia russelii siamensis. It was purified by three steps of chromatography including gel filtration, cation exchange and reversed phase. A partial N-terminal sequence of BBPTI-1, HDRPKFCYLPADPGECLAHMRSF was obtained by automated Edman degradation and a Ki value of 4.77nM determined. Cloning of BBPTI-1 including the open reading frame and 3' untranslated region was achieved from cDNA libraries derived from lyophilized venom using a 3' RACE strategy. In addition a cDNA sequence, designated as BBPTI-5, was also obtained. Alignment of cDNA sequences showed that BBPTI-5 exhibited an identical sequence to BBPTI-1 cDNA except for an eight nucleotide deletion in the open reading frame. Gene variations that represented deletions in the BBPTI-5 cDNA resulted in a novel protease inhibitor analog. Amino acid sequence alignment revealed that deduced peptides derived from cloning of their respective precursor cDNAs from libraries showed high similarity and homology with other Kunitz BPTI proteinase inhibitors. BBPTI-1 and BBPTI-5 consist of 60 and 66 amino acid residues respectively, including six conserved cysteine residues. As these peptides have been reported to have influence on the processes of coagulation, fibrinolysis and inflammation, their potential application in biomedical contexts warrants further investigation. Copyright © 2013 Elsevier Inc. All rights reserved.
Site-directed mutagenesis of Autographa californica nucleopolyhedrovirus (AcNPV) polyhedrin: effect on polyhedron structure.

PubMed

Bravo-Patiño, A; Ibarra, J E

2000-01-01

Amino acids Lys34, His36, and Phe37 were substituted by PCR-mediated, site-directed mutagenesis for three Trp's in the AcNPV polyhedrin sequence. Phase contrast microscopy revealed refringent, amorphous polyhedra in the nuclei of infected cells. Electron microscopy confirmed a great variation in form and size of the mutated polyhedra. Although crystallization of the mutated polyhedrin occurred, it was irregular within each polyhedron. Virion occlusion was also severely affected. Virions were partially occluded, or only one virion was occluded per polyhedron. Results suggest that the substitution of these three amino acids affected the morphology of polyhedra, the uniformity of crystallization within each polyhedron, and the virion occlusion.
The composition of cheetah (Acinonyx jubatus) milk.

PubMed

Osthoff, G; Hugo, A; de Wit, M

2006-01-01

Milk was obtained from two captive bred cheetahs. The nutrient content was 99.6 g protein; 64.8 g fat; and 40.21 g lactose per kg milk. Small amounts of oligosaccharides, glucose, galactose and fucose were noted. The protein fraction respectively consisted of 34.2 g caseins per kg milk and of 65.3 g whey proteins per kg milk. Very little variation in milk composition among the individual cheetahs was noted. Electrophoresis and identification of protein bands showed a similar migrating sequence of proteins as seen in lion's and cat's milk, with small differences in the beta-caseins. The lipid fraction contains 290.4 g saturated and 337.3 g mono-unsaturated fatty acids per kg milk fat respectively. The high content of 279.5 g kg(-1) milk fat of polyunsaturated fatty acids is due to a high content in alpha-linolenic acid. No short chain fatty acids, but substantial levels of uneven carbon chain fatty acids were observed.
Solid phase sequencing of biopolymers

DOEpatents

Cantor, Charles; Koster, Hubert

2010-09-28

This invention relates to methods for detecting and sequencing target nucleic acid sequences, to mass modified nucleic acid probes and arrays of probes useful in these methods, and to kits and systems which contain these probes. Useful methods involve hybridizing the nucleic acids or nucleic acids which represent complementary or homologous sequences of the target to an array of nucleic acid probes. These probes comprise a single-stranded portion, an optional double-stranded portion and a variable sequence within the single-stranded portion. The molecular weights of the hybridized nucleic acids of the set can be determined by mass spectroscopy, and the sequence of the target determined from the molecular weights of the fragments. Nucleic acids whose sequences can be determined include DNA or RNA in biological samples such as patient biopsies and environmental samples. Probes may be fixed to a solid support such as a hybridization chip to facilitate automated molecular weight analysis and identification of the target sequence.
Natural Selection and Adaptive Evolution of Leptin in the Ochotona Family Driven by the Cold Environmental Stress

PubMed Central

Yang, Jie; Wang, Zhen Long; Zhao, Xin Quan; Wang, De Peng; Qi, De Lin; Xu, Bao Hong; Ren, Yong Hong; Tian, Hui Fang

2008-01-01

Background Environmental stress can accelerate the evolutionary rate of specific stress-response proteins and create new functions specialized for different environments, enhancing an organism's fitness to stressful environments. Pikas (order Lagomorpha), endemic, non-hibernating mammals in the modern Holarctic Region, live in cold regions at either high altitudes or high latitudes and have a maximum distribution of species diversification confined to the Qinghai-Tibet Plateau. Variations in energy metabolism are remarkable in pikas living in cold environments. Leptin, an adipocyte-derived hormone, plays important roles in energy homeostasis. Methodology/Principal Findings To examine the extent of leptin variations within the Ochotona family, we cloned the entire coding sequence of pika leptin from 6 species in two regions (Qinghai-Tibet Plateau and Inner Mongolia steppe in China) and the leptin sequences of plateau pikas (O. curzoniae) from different altitudes on Qinghai-Tibet Plateau. We carried out both DNA and amino acid sequence analyses in molecular evolution and compared modeled spatial structures. Our results show that positive selection (PS) acts on pika leptin, with nine PS sites located within the functionally significant segment 85-119 of leptin, and that one unique motif, the ATP synthase α and β subunit signature site, appeared only in pika lineages. To reveal the environmental factors affecting sequence evolution of pika leptin, a relative rate test was performed in pikas from different altitudes. Stepwise multiple regression shows that temperature is significantly and negatively correlated with the rates of non-synonymous substitution (Ka) and amino acid substitution (Aa), whereas altitude does not significantly affect synonymous substitution (Ks), Ka and Aa.
Conclusions/Significance Our findings support the viewpoint that adaptive evolution may occur in pika leptin, which may play important roles in pikas' ecological adaptation to extreme environmental stress. We speculate that cold, and probably not hypoxia, may be the primary environmental factor for driving adaptive evolution of pika leptin. PMID:18213380
Whole exome sequencing of rare variants in EIF4G1 and VPS35 in Parkinson disease

PubMed Central

Nuytemans, Karen; Bademci, Guney; Inchausti, Vanessa; Dressen, Amy; Kinnamon, Daniel D.; Mehta, Arpit; Wang, Liyong; Züchner, Stephan; Beecham, Gary W.; Martin, Eden R.; Scott, William K.

2013-01-01

Objective: Recently, vacuolar protein sorting 35 (VPS35) and eukaryotic translation initiation factor 4 gamma 1 (EIF4G1) have been identified as 2 causal Parkinson disease (PD) genes. We used whole exome sequencing for rapid, parallel analysis of variations in these 2 genes. Methods: We performed whole exome sequencing in 213 patients with PD and 272 control individuals. Those rare variants (RVs) with <5% frequency in the exome variant server database and our own control data were considered for analysis. We performed joint gene-based tests for association using RVASSOC and SKAT (Sequence Kernel Association Test) as well as single-variant test statistics. Results: We identified 3 novel VPS35 variations that changed the coded amino acid (nonsynonymous) in 3 cases. Two variations were in multiplex families and neither segregated with PD. In EIF4G1, we identified 11 (9 nonsynonymous and 2 small indels) RVs including the reported pathogenic mutation p.R1205H, which segregated in all affected members of a large family, but also in 1 unaffected 86-year-old family member. Two additional RVs were found in isolated patients only. Whereas initial association studies suggested an association (p = 0.04) with all RVs in EIF4G1, subsequent testing in a second dataset for the driving variant (p.F1461) suggested no association between RVs in the gene and PD. Conclusions: We confirm that the specific EIF4G1 variation p.R1205H seems to be a strong PD risk factor, but is nonpenetrant in at least one 86-year-old. A few other select RVs in both genes could not be ruled out as causal. However, there was no evidence for an overall contribution of genetic variability in VPS35 or EIF4G1 to PD development in our dataset. PMID:23408866
Natural Variation in the Pto Pathogen Resistance Gene Within Species of Wild Tomato (Lycopersicon). I. Functional Analysis of Pto Alleles

PubMed Central

Rose, Laura E.; Langley, Charles H.; Bernal, Adriana J.; Michelmore, Richard W.

2005-01-01

Disease resistance to the bacterial pathogen Pseudomonas syringae pv. tomato (Pst) in the cultivated tomato, Lycopersicon esculentum, and the closely related L. pimpinellifolium is triggered by the physical interaction between plant disease resistance protein, Pto, and the pathogen avirulence protein, AvrPto. To investigate the extent to which variation in the Pto gene is responsible for naturally occurring variation in resistance to Pst, we determined the resistance phenotype of 51 accessions from seven species of Lycopersicon to isogenic strains of Pst differing in the presence of avrPto. One-third of the plants displayed resistance specifically when the pathogen expressed AvrPto, consistent with a gene-for-gene interaction. To test whether this resistance in these species was conferred specifically by the Pto gene, alleles of Pto were amplified and sequenced from 49 individuals and a subset (16) of these alleles was tested in planta using Agrobacterium-mediated transient assays. Eleven alleles conferred a hypersensitive resistance response (HR) in the presence of AvrPto, while 5 did not. Ten amino acid substitutions associated with the absence of AvrPto recognition and HR were identified, none of which had been identified in previous structure-function studies. Additionally, 3 alleles encoding putative pseudogenes of Pto were isolated from two species of Lycopersicon. Therefore, a large proportion, but not all, of the natural variation in the reaction to strains of Pst expressing AvrPto can be attributed to sequence variation in the Pto gene. PMID:15944360
Limited Variation in BK Virus T-Cell Epitopes Revealed by Next-Generation Sequencing

PubMed Central

Sahoo, Malaya K.; Tan, Susanna K.; Chen, Sharon F.; Kapusinszky, Beatrix; Concepcion, Katherine R.; Kjelson, Lynn; Mallempati, Kalyan; Farina, Heidi M.; Fernández-Viña, Marcelo; Tyan, Dolly; Grimm, Paul C.; Anderson, Matthew W.; Concepcion, Waldo

2015-01-01

BK virus (BKV) infection causing end-organ disease remains a formidable challenge to the hematopoietic cell transplant (HCT) and kidney transplant fields. As BKV-specific treatments are limited, immunologic-based therapies may be a promising and novel therapeutic option for transplant recipients with persistent BKV infection. Here, we describe a whole-genome, deep-sequencing methodology and bioinformatics pipeline that identify BKV variants across the genome and at BKV-specific HLA-A2-, HLA-B0702-, and HLA-B08-restricted CD8 T-cell epitopes. BKV whole genomes were amplified using long-range PCR with four inverse primer sets, and fragmentation libraries were sequenced on the Ion Torrent Personal Genome Machine (PGM). An error model and variant-calling algorithm were developed to accurately identify rare variants. A total of 65 samples from 18 pediatric HCT and kidney recipients with quantifiable BKV DNAemia underwent whole-genome sequencing. Limited genetic variation was observed. The median number of amino acid variants identified per sample was 8 (range, 2 to 37; interquartile range, 10), with the majority of variants (77%) detected at a frequency of <5%. When normalized for length, there was no statistical difference in the median number of variants across all genes. Similarly, the predominant virus population within samples harbored T-cell epitopes similar to the reference BKV strain that was matched for the BKV genotype. Despite the conservation of epitopes, low-level variants in T-cell epitopes were detected in 77.7% (14/18) of patients. Understanding epitope variation across the whole genome provides insight into the virus-immune interface and may help guide the development of protocols for novel immunologic-based therapies. PMID:26202116
Diversity and evolution analysis of glycoprotein GP85 from avian leukosis virus subgroup J isolates from chickens of different genetic backgrounds during 1989-2016: Coexistence of five extremely different clusters.

PubMed

Wang, Peikun; Lin, Lulu; Li, Haijuan; Yang, Yongli; Huang, Teng; Wei, Ping

2018-02-01

ALV-J has caused the most serious losses to the poultry industry in China. The gp85-coding sequence of ALV-J is known to be prone to mutation, but any association between the gp85 gene and breed of chicken remains unclear. A comprehensive and systematic study of the evolutionary process of ALV-J in China is needed. In this study, we compared and analyzed gp85 gene sequences from 198 ALV-J isolates, originating from China, USA, UK and France during 1989-2016. These were sorted into five clusters. Cluster 1, 2, 3, 4 and 5 included isolates from chicken types of different genetic backgrounds, e.g. white-feather broiler, Guangxi indigenous chicken breeds, Yellow chickens and layer chickens respectively. A correlation comparison of amino acid sequence similarities in the gp85 protein among the five clusters showed significant differences (P < 0.01) with the exception being when the third and fifth cluster were compared (P > 0.05). Results of entropy analysis of the gp85 sequences revealed that cluster 3 had the largest variation and cluster 1 had the least variation. The N-glycosylation sites in the majority of isolates numbered 14, 16, 17, 16 and 16, respectively, with regards to clusters 1-5. In addition, 5 isolates from cluster 3 had one more glycosylation site than the other isolates from cluster 3. Our study provides evidence that there were five extremely different ALV-J clusters during 1989-2016 and that the gp85 genes isolated from indigenous chicken breed isolates had the largest variation.
Identical mitochondrial somatic mutations unique to chronic periodontitis and coronary artery disease

PubMed Central

Pallavi, Tokala; Chandra, Rampalli Viswa; Reddy, Aileni Amarender; Reddy, Bavigadda Harish; Naveen, Anumala

2016-01-01

Context: The inflammatory processes involved in chronic periodontitis and coronary artery diseases (CADs) are similar and produce reactive oxygen species that may result in similar somatic mutations in mitochondrial deoxyribonucleic acid (mtDNA). Aims: The aims of the present study were to identify somatic mtDNA mutations in periodontal and cardiac tissues from subjects undergoing coronary artery bypass surgery and determine what fraction was identical and unique to these tissues. Settings and Design: The study population consisted of 30 chronic periodontitis subjects who underwent coronary artery surgery after an angiogram had indicated CAD. Materials and Methods: Gingival tissue samples were taken from the site with deepest probing depth; coronary artery tissue samples were taken during the coronary artery bypass grafting procedures, and blood samples were drawn during this surgical procedure. These samples were stored under aseptic conditions and later transported for mtDNA analysis. Statistical Analysis Used: Complete mtDNA sequences were obtained and aligned with the revised Cambridge reference sequence (NC_012920) using sequence analysis and auto assembler tools. Results: Among the complete mtDNA sequences, a total of 162 variations were spread across the whole mitochondrial genome and present only in the coronary artery and the gingival tissue samples but not in the blood samples. Among the 162 variations, 12 were novel and four of the 12 novel variations were found in mitochondrial NADH dehydrogenase subunit 5 complex I gene (33.3%). Conclusions: Analysis of mtDNA mutations indicated 162 variants unique to periodontitis and CAD. Of these, 12 were novel and may have resulted from destructive oxidative forces common to these two diseases. PMID:27041832
The genetic diversity and epizootiology of infectious hematopoietic necrosis virus

USGS Publications Warehouse

Oshima, Kevin H.; Arakawa, Cindy K.; Higman, Keith H.; Landolt, Marsha L.; Nichol, Stuart T.; Winton, James R.

1994-01-01

Infectious hematopoietic necrosis virus (IHNV) is a rhabdovirus which causes a serious disease in salmonid fish. The T1 ribonuclease fingerprinting method was used to compare the RNA genomes of 26 isolates of IHNV recovered from sockeye salmon (Oncorhynchus nerka), chinook salmon (O. tshawytscha), and steelhead trout (O. mykiss) throughout the enzootic portion of western North America. Most of the isolates were from a single year (1987) to limit time of isolation as a source of genetic variation. In addition, isolates from different years collected at three sites were analyzed to investigate genetic drift or evolution of IHNV within specific locations. All of the isolates examined by T1 fingerprint analysis contained less than a 50% variation in spot location and were represented by a single fingerprint group. The observed variation was estimated to correspond to less than 5% variation in the nucleic acid sequence. However, sufficient variation was detected to separate the isolates into four subgroups which appeared to correlate to different geographic regions. Host species appeared not to be a significant source of variation. The evolutionary and epizootiologic significance of these findings and their relationship to other evidence of genetic variation in IHNV isolates are discussed.
Phylogenetic analyses indicate little variation among reticuloendotheliosis viruses infecting avian species, including the endangered Attwater's prairie chicken.

PubMed

Bohls, Ryan L; Linares, Jose A; Gross, Shannon L; Ferro, Pam J; Silvy, Nova J; Collisson, Ellen W

2006-08-01

Reticuloendotheliosis virus infection, which typically causes systemic lymphomas and high mortality in the endangered Attwater's prairie chicken, has been described as a major obstacle in repopulation efforts of captive breeding facilities in Texas. Although antigenic relationships among reticuloendotheliosis virus (REV) strains have been previously determined, phylogenetic relationships have not been reported. The pol and env of REV proviral DNA from prairie chickens (PC-R92 and PC-2404), from poxvirus lesions in domestic chickens, the prototype poultry derived REV-A and chick syncytial virus (CSV), and duck derived spleen necrosis virus (SNV) were PCR amplified and sequenced. The 5032bp, that included the pol and most of env genes, of the PC-R92 and REV-A were 98% identical, and nucleotide sequence identities of smaller regions within the pol and env from REV strains examined ranged from 95 to 99% and 93 to 99%, respectively. The putative amino acid sequences were 97-99% identical in the polymerase and 90-98% in the envelope. Phylogenetic analyses of the nucleotide and amino acid sequences indicated the closest relationship among the recent fowl pox-associated chicken isolates, the prairie chicken isolates and the prototype CSV while only the SNV appeared to be distinctly divergent. While the origin of the naturally occurring viruses is not known, the avian poxvirus may be a critical component of transmission of these ubiquitous oncogenic viruses.
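The identity percentages reported in this abstract are pairwise comparisons of aligned sequences. A minimal sketch of such a calculation is shown below; it assumes the two sequences have already been aligned to the same length, skips any column containing a gap, and uses a hypothetical function name and toy sequences rather than the REV data.

```python
def percent_identity(seq_a, seq_b):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns where either sequence has a gap ('-') are excluded.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a, seq_b) if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable positions")
    matches = sum(a == b for a, b in pairs)
    return 100.0 * matches / len(pairs)


# Toy sequences, for illustration only: one mismatch in ten positions.
print(percent_identity("ACGTACGTAC", "ACGTACGTAT"))
```

Whether gapped columns are excluded or counted as mismatches changes the result, so the convention should be stated alongside any reported identity value.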
Simian immunodeficiency viruses from African green monkeys display unusual genetic diversity.

PubMed Central

Johnson, P R; Fomsgaard, A; Allan, J; Gravell, M; London, W T; Olmsted, R A; Hirsch, V M

1990-01-01

African green monkeys are asymptomatic carriers of simian immunodeficiency viruses (SIV), commonly called SIVagm. As many as 50% of African green monkeys in the wild may be SIV seropositive. This high seroprevalence rate and the potential for genetic variation of lentiviruses suggested to us that African green monkeys may harbor widely differing genotypes of SIVagm. To investigate this hypothesis, we determined the entire nucleotide sequence of an infectious proviral molecular clone of SIVagm (155-4) and partial sequences (long terminal repeat and Gag) of three other distinct SIVagm isolates (90, gri-1, and ver-1). Comparisons among the SIVagm isolates revealed extreme diversity at the nucleotide and amino acid levels. Long terminal repeat nucleotide sequences varied up to 35% and Gag protein sequences varied up to 30%. The variability among SIVagm isolates exceeded the variability among any other group of primate lentiviruses. Our data suggest that SIVagm has been in the African green monkey population for a long time and may be the oldest primate lentivirus group in existence. PMID:2304139

The Anopheles stephensi odorant binding protein 1 (AsteObp1) gene: a new molecular marker for biological forms diagnosis.

PubMed

Gholizadeh, S; Firooziyan, S; Ladonni, H; Hajipirloo, H Mohammadzadeh; Djadid, N Dinparast; Hosseini, A; Raz, A

2015-06-01

Anopheles (Cellia) stephensi Liston 1901 is known as an Asian malaria vector. Three biological forms, namely "mysorensis", "intermediate", and "type" have been earlier reported in this species. Nevertheless, the present morphological and molecular information is insufficient to diagnose these forms. During this investigation, An. stephensi biological forms were morphologically identified and sequenced for the odorant-binding protein 1 (Obp1) gene. Also, intron I sequences were used to construct phylogenetic trees. Despite nucleotide sequence variation in the exon sequence of AsteObp1, nearly 100% identity was observed at the amino acid level among the three biological forms. To overcome difficulties in using egg morphology characters, intron I sequences of An. stephensi Obp1 open a new molecular route to identifying the biological forms of this main Asian malaria vector. However, multidisciplinary studies are needed to establish the taxonomic status of An. stephensi. Copyright © 2015 Elsevier B.V. All rights reserved.
3D RNA and functional interactions from evolutionary couplings

PubMed Central

Weinreb, Caleb; Riesselman, Adam; Ingraham, John B.; Gross, Torsten; Sander, Chris; Marks, Debora S.

2016-01-01

Summary Non-coding RNAs are ubiquitous, but the discovery of new RNA gene sequences far outpaces research on their structure and functional interactions. We mine the evolutionary sequence record to derive precise information about function and structure of RNAs and RNA-protein complexes. As in protein structure prediction, we use maximum entropy global probability models of sequence co-variation to infer evolutionarily constrained nucleotide-nucleotide interactions within RNA molecules, and nucleotide-amino acid interactions in RNA-protein complexes. The predicted contacts allow all-atom blinded 3D structure prediction at good accuracy for several known RNA structures and RNA-protein complexes. For unknown structures, we predict contacts in 160 non-coding RNA families. Beyond 3D structure prediction, evolutionary couplings help identify important functional interactions, e.g., at switch points in riboswitches and at a complex nucleation site in HIV. Aided by accelerating sequence accumulation, evolutionary coupling analysis can accelerate the discovery of functional interactions and 3D structures involving RNA. PMID:27087444
Evaluation of cysteine proteases of Plasmodium vivax as antimalarial drug targets: sequence analysis and sensitivity to cysteine protease inhibitors.

PubMed

Na, Byoung-Kuk; Kim, Tong-Soo; Rosenthal, Philip J; Lee, Jong-Koo; Kong, Yoon

2004-10-01

Cysteine proteases perform critical roles in the life cycles of malaria parasites. In Plasmodium falciparum, treatment with cysteine protease inhibitors inhibits hemoglobin hydrolysis and blocks parasite development in vitro and in vivo, suggesting that plasmodial cysteine proteases may be interesting targets for new chemotherapeutics. To determine whether sequence diversity may limit chemotherapy against Plasmodium vivax, we analyzed sequence variations in the genes encoding three cysteine proteases, vivapain-1, -2 and -3, in 22 wild isolates of P. vivax. The sequences were highly conserved among wild isolates. A small number of substitutions leading to amino acid changes were found, but they did not modify residues essential for the function or structure of the enzymes. The substrate specificities and sensitivities to synthetic cysteine protease inhibitors of vivapain-2 and -3 from wild isolates were also very similar. These results support the suggestion that cysteine proteases of P. vivax are promising antimalarial chemotherapeutic targets.
Indication for Co-evolution of Lactobacillus johnsonii with its hosts

PubMed Central

2012-01-01

Background The intestinal microbiota, composed of complex bacterial populations, is host-specific and affected by environmental factors as well as host genetics. One important bacterial group is the lactic acid bacteria (LAB), which include many health-promoting strains. Here, we studied the genetic variation within a potentially probiotic LAB species, Lactobacillus johnsonii, isolated from various hosts. Results A wide survey of 104 fecal samples was carried out for the isolation of L. johnsonii. As part of the isolation procedure, terminal restriction fragment length polymorphism (tRFLP) was performed to identify L. johnsonii within a selected narrow spectrum of fecal LAB. The tRFLP results showed host specificity of two bacterial species, the Enterococcus faecium species cluster and Lactobacillus intestinalis, to different host taxonomic groups while the appearance of L. johnsonii and E. faecalis was not correlated with any taxonomic group. The survey ultimately resulted in the isolation of L. johnsonii from few host species. The genetic variation among the 47 L. johnsonii strains isolated from the various hosts was analyzed based on variation at simple sequence repeats (SSR) loci and multi-locus sequence typing (MLST) of conserved hypothetical genes. The genetic relationships among the strains inferred by each of the methods were similar, revealing three different clusters of L. johnsonii strains, each cluster consisting of strains from a different host, i.e. chickens, humans or mice. Conclusions Our typing results support phylogenetic separation of L. johnsonii strains isolated from different animal hosts, suggesting specificity of L. johnsonii strains to their hosts. Taken together with the tRFLP results, that indicated the association of specific LAB species with the host taxonomy, our study supports co-evolution of the host and its intestinal lactic acid bacteria. PMID:22827843
Indication for Co-evolution of Lactobacillus johnsonii with its hosts.

PubMed

Buhnik-Rosenblau, Keren; Matsko-Efimov, Vera; Jung, Minju; Shin, Heuynkil; Danin-Poleg, Yael; Kashi, Yechezkel

2012-07-25

The intestinal microbiota, composed of complex bacterial populations, is host-specific and affected by environmental factors as well as host genetics. One important bacterial group is the lactic acid bacteria (LAB), which include many health-promoting strains. Here, we studied the genetic variation within a potentially probiotic LAB species, Lactobacillus johnsonii, isolated from various hosts. A wide survey of 104 fecal samples was carried out for the isolation of L. johnsonii. As part of the isolation procedure, terminal restriction fragment length polymorphism (tRFLP) was performed to identify L. johnsonii within a selected narrow spectrum of fecal LAB. The tRFLP results showed host specificity of two bacterial species, the Enterococcus faecium species cluster and Lactobacillus intestinalis, to different host taxonomic groups while the appearance of L. johnsonii and E. faecalis was not correlated with any taxonomic group. The survey ultimately resulted in the isolation of L. johnsonii from few host species. The genetic variation among the 47 L. johnsonii strains isolated from the various hosts was analyzed based on variation at simple sequence repeats (SSR) loci and multi-locus sequence typing (MLST) of conserved hypothetical genes. The genetic relationships among the strains inferred by each of the methods were similar, revealing three different clusters of L. johnsonii strains, each cluster consisting of strains from a different host, i.e. chickens, humans or mice. Our typing results support phylogenetic separation of L. johnsonii strains isolated from different animal hosts, suggesting specificity of L. johnsonii strains to their hosts. Taken together with the tRFLP results, that indicated the association of specific LAB species with the host taxonomy, our study supports co-evolution of the host and its intestinal lactic acid bacteria.
Mitochondrial genomic comparison of Clonorchis sinensis from South Korea with other isolates of this species.

PubMed

Wang, Daxi; Young, Neil D; Koehler, Anson V; Tan, Patrick; Sohn, Woon-Mok; Korhonen, Pasi K; Gasser, Robin B

2017-07-01

Clonorchiasis is a neglected tropical disease that affects >35 million people mainly in China, Vietnam, South Korea and some parts of Russia. The disease-causing agent, Clonorchis sinensis, is a liver fluke of humans and other piscivorous animals, and has a complex aquatic life cycle involving snails and fish intermediate hosts. Chronic infection in humans causes liver disease and associated complications including malignant bile duct cancer. Central to control and to understanding the epidemiology of this disease is knowledge of the specific identity of the causative agent as well as genetic variation within and among populations of this parasite. Although most published molecular studies seem to suggest that C. sinensis represents a single species and that genetic variation within the species is limited, karyotypic variation within C. sinensis among China, Korea (2n=56) and Russian Far East (2n=14) suggests that this taxon might contain sibling species. Here, we assessed and applied a deep sequencing-bioinformatic approach to sequence and define a reference mitochondrial (mt) genome for a particular isolate of C. sinensis from Korea (Cs-k2), to confirm its specific identity, and compared this mt genome with homologous data sets available for this species. Comparative analyses revealed consistency in the number and structure of genes as well as in the lengths of protein-coding genes, and limited genetic variation among isolates of C. sinensis. Phylogenetic analyses of amino acid sequences predicted from mt genes showed that representatives of C. sinensis clustered together, with absolute nodal support, to the exclusion of other liver fluke representatives, but sub-structuring within C. sinensis was not well supported. The plan now is to proceed with the sequencing, assembly and annotation of a high quality draft nuclear genome of this defined isolate (Cs-k2) as a basis for a detailed investigation of molecular variation within C. sinensis from disparate geographical locations in parts of Asia and to prospect for cryptic species. Copyright © 2017 Elsevier B.V. All rights reserved.
Substantial variation in the hepatitis B surface antigen (HBsAg) in hepatitis B virus (HBV)-positive patients from South Africa: Reliable detection of HBV by the Elecsys HBsAg II assay.

PubMed

Gencay, Mikael; Vermeulen, Marion; Neofytos, Dionysis; Westergaard, Gaston; Pabinger, Stephan; Kriegner, Albert; Seffner, Anja; Gohl, Peter; Huebner, Kirsten; Nauck, Markus; Kaminski, Wolfgang E

2018-04-01

It is essential that hepatitis B surface antigen (HBsAg) diagnostic assays reliably detect genetic diversity in the major hydrophilic region (MHR) of HBsAg to avoid false-negative results. Mutations in this domain display marked ethno-geographic variation and may lead to failure to diagnose hepatitis B virus (HBV) infection. Evaluate diagnostic performance of the Elecsys ® HBsAg II Qualitative assay in a cohort of South African HBV-positive blood donors. A total of 179 South African HBsAg- and HBV DNA > 100 IU/mL-positive blood donor samples were included. Samples were sequenced for genetic variation in HBsAg MHR using next-generation ultra-deep sequencing. HBsAg seropositivity was determined using the Roche Elecsys HBsAg II Qualitative assay. Mutation rates were compared between the first (amino acids 124-137) and second (amino acids 139-147) loops of the immunodominant MHR 'a' determinant region. Frequency of occult HBV infection-associated Y100C mutations was also determined. We observed a total of 279 MHR mutations (117 variants) in 102 (57%) samples, of which 91 were located in the 'a' determinant region. The major vaccine-induced escape mutation G145R was observed in two samples. All occult HBV infection-associated Y100C and common diagnostic and vaccine-escape-associated P120T, G145R, K122R, M133L, M133T, Q129H, G130N, and T126S mutations were reliably detected by the assay, which consistently detected the presence of HBsAg in all 179 samples including samples with 11 novel mutations. Despite substantial variation in HBsAg MHR, the Elecsys HBsAg II Qualitative assay robustly detects HBV infection in this South African cohort. Copyright © 2018 The Author(s). Published by Elsevier B.V. All rights reserved.
Unique variations of Epstein-Barr virus-encoded BARF1 gene in nasopharyngeal carcinoma biopsies.

PubMed

Wang, Yun; Wang, Xiao-Feng; Sun, Zhi-Fu; Luo, Bing

2012-06-01

The Epstein-Barr virus (EBV) BamHI-A rightward frame 1 (BARF1) gene is frequently expressed in EBV-associated epithelial malignancies and is involved in oncogenicity and immunomodulation. To characterize the variations of the BARF1 gene in different populations, the sequences of the BARF1 gene in Northern Chinese nasopharyngeal carcinoma (NPC), EBV-associated gastric carcinoma (EBVaGC) and healthy donors were analyzed. The correlation of BARF1 variation with polymorphisms of the BamHI F fragment (type F and f variants) and the EBV-coded viral interleukin-10 (vIL-10) gene (B95-8 and SPM patterns) was also explored. Two major subtypes of the BARF1 gene, designated B95-8 and V29A, were identified. The B95-8 subtype had an amino acid sequence identical to that of B95-8 and was the dominant subtype among the EBV isolates from Northern China. The V29A subtype, with one consistent amino acid change at residue 29 (V→A) and several nucleotide changes, showed a higher frequency in NPC cases (25.3%, 20/79) than in EBVaGC cases (0/45) or healthy donors (4.3%, 2/46) (NPC vs. EBVaGC: P=0.0001; NPC vs. healthy donor: P=0.004). A preferential linkage between BamHI F and BARF1/vIL-10 polymorphisms was found. Type f isolates were specifically correlated with the V29A/SPM genotype in NPC isolates, and type f/V29A/SPM was preferentially found in NPC. The BARF1/c-fms homology domain, transforming domain and cytotoxic T lymphocyte (CTL) epitopes of BARF1 were highly conserved in most isolates, suggesting an important role of BARF1 in virus infection and its potential usefulness in EBV-targeting immunotherapy of EBV-associated tumors. The relatively higher prevalence of type f/V29A/SPM strains in NPC may also suggest an association between these variations in multiple viral genes and NPC. Copyright © 2012 Elsevier B.V. All rights reserved.
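The group frequencies quoted above can be rechecked from the counts in the abstract. The abstract does not say which statistical test produced its P values, so the sketch below assumes a two-sided Fisher's exact test for illustration, and its output need not match the reported figures exactly. The function name `fisher_exact_two_sided` is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x carriers in the first row.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are as likely as, or less likely than, the observed one.
    return sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= observed * (1 + 1e-9)
    )


# Counts taken from the abstract: V29A carriers out of all cases per group.
npc_v29a, npc_total = 20, 79
gc_v29a, gc_total = 0, 45
healthy_v29a, healthy_total = 2, 46

npc_pct = round(100 * npc_v29a / npc_total, 1)
healthy_pct = round(100 * healthy_v29a / healthy_total, 1)

p_npc_vs_gc = fisher_exact_two_sided(
    npc_v29a, npc_total - npc_v29a, gc_v29a, gc_total - gc_v29a
)
p_npc_vs_healthy = fisher_exact_two_sided(
    npc_v29a, npc_total - npc_v29a, healthy_v29a, healthy_total - healthy_v29a
)
```

An exact test is a reasonable choice here because one cell of the NPC vs. EBVaGC table is zero, where large-sample approximations are least reliable.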
Identification of promoter motifs regulating ZmeIF4E expression level involved in maize rough dwarf disease resistance in maize (Zea Mays L.).

PubMed

Shi, Liyu; Weng, Jianfeng; Liu, Changlin; Song, Xinyuan; Miao, Hongqin; Hao, Zhuanfang; Xie, Chuanxiao; Li, Mingshun; Zhang, Degui; Bai, Li; Pan, Guangtang; Li, Xinhai; Zhang, Shihuang

2013-04-01

Maize rough dwarf disease (MRDD, a viral disease) results in significant grain yield losses, but its genetic basis is largely unknown. Based on comparative genomics, eukaryotic translation initiation factor 4E (eIF4E) was considered a candidate gene for MRDD resistance, and its validation will help to understand the possible genetic mechanism of this disease. ZmeIF4E (the ortholog of the eIF4E gene in maize) harbors five exons and encodes a protein of 218 amino acids, and no variation in the cDNA sequence was identified between the resistant inbred line X178 and the susceptible line Ye478. ZmeIF4E expression differed between plants of the two lines treated with three plant hormones (ethylene, salicylic acid, and jasmonates) at the V3 developmental stage, suggesting that ZmeIF4E is more likely to be involved in the regulation of defense gene expression and the induction of local and systemic resistance. Moreover, four cis-acting elements related to plant defense responses, including DOFCOREZM, EECCRCAH1, GT1GAMSCAM4, and GT1CONSENSUS, were detected in the ZmeIF4E promoter as harboring sequence variation between the two lines. Association analysis with 163 inbred lines revealed that one SNP in EECCRCAH1 is significantly associated with CSI of MRDD in two environments, explaining 3.33 and 9.04 % of phenotypic variation, respectively. Meanwhile, one SNP in the GT-1 motif was found to affect MRDD resistance in only one of the two environments, explaining 5.17 % of phenotypic variation. Collectively, the regulatory motifs harboring the two significant SNPs in the ZmeIF4E promoter could be involved in the defense process of maize after viral infection. These results contribute to understanding maize defense mechanisms against maize rough dwarf virus.
Quantitative and discriminative analysis of nucleic acid samples using luminometric nonspecific nanoparticle methods

NASA Astrophysics Data System (ADS)

Pihlasalo, S.; Mariani, L.; Härmä, H.

2016-03-01

Homogeneous simple assays utilizing luminescence quenching and time-resolved luminescence resonance energy transfer (TR-LRET) were developed for the quantification of nucleic acids without sequence information. Nucleic acids prevent the adsorption of a protein to europium nanoparticles which is detected as a luminescence quenching of europium nanoparticles with a soluble quencher or as a decrease of TR-LRET from europium nanoparticles to the acceptor dye. Contrary to the existing methods based on fluorescent dye binding to nucleic acids, equal sensitivities for both single- (ssDNA) and double-stranded DNA (dsDNA) were measured and a detection limit of 60 pg was calculated for the quenching assay. The average coefficient of variation was 5% for the quenching assay and 8% for the TR-LRET assay. The TR-LRET assay was also combined with a nucleic acid dye selective to dsDNA in a single tube assay to measure the total concentration of DNA and the ratio of ssDNA and dsDNA in the mixture. To our knowledge, such a multiplexed assay is not accomplished with commercially available assays. Electronic supplementary information (ESI) available: The labeling of amino modified polystyrene nanoparticles with Eu3+ chelate and the experimental details and results for the optimization of nucleic acid binding protein and for the ratiometric measurement of DNA and RNA with quenching assay. See DOI: 10.1039/c5nr09252c
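For readers unfamiliar with the precision figure quoted in this abstract, the coefficient of variation is the standard deviation expressed as a percentage of the mean. The sketch below shows the calculation on hypothetical replicate readings; the numbers are invented for illustration and are not data from the study.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Return the sample coefficient of variation as a percentage."""
    return 100 * stdev(values) / mean(values)


# Hypothetical replicate luminescence signals (arbitrary units).
replicate_signals = [95.0, 100.0, 105.0]
cv_percent = coefficient_of_variation(replicate_signals)
```

Here `stdev` is the sample standard deviation (n - 1 in the denominator); whether the study used the sample or population form is not stated in the abstract.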
37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

Code of Federal Regulations, 2011 CFR

2011-07-01

... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...
37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

Code of Federal Regulations, 2013 CFR

2013-07-01

... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...
37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

Code of Federal Regulations, 2012 CFR

2012-07-01

... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...
37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

Code of Federal Regulations, 2010 CFR

2010-07-01

... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...
37 CFR 1.823 - Requirements for nucleotide and/or amino acid sequences as part of the application.

Code of Federal Regulations, 2014 CFR

2014-07-01

... and/or amino acid sequences as part of the application. 1.823 Section 1.823 Patents, Trademarks, and... Amino Acid Sequences § 1.823 Requirements for nucleotide and/or amino acid sequences as part of the... incorporation-by-reference of the Sequence Listing as required by § 1.52(e)(5). The presentation of the...
A cluster of diagnostic Hsp68 amino acid sites that are identified in Drosophila from the melanogaster species group are concentrated around beta-sheet residues involved with substrate binding.

PubMed

Kellett, Mark; McKechnie, Stephen W

2005-04-01

The coding region of the hsp68 gene has been amplified, cloned, and sequenced from 10 Drosophila species, 5 from the melanogaster subgroup and 5 from the montium subgroup. When the predicted amino acid sequences are compared with available Hsp70 sequences, patterns of conservation suggest that the C-terminal region should be subdivided according to predominant secondary structure. Conservation levels between Hsp68 and Hsp70 proteins were high in the N-terminal ATPase and adjacent beta-sheet domains, medium in the alpha-helix domain, and low in the C-terminal mobile domain (78%, 72%, 41%, and 21% identity, respectively). A number of amino acid sites were found to be "diagnostic" for Hsp68 (28 of approximately 635 residues). A few of these occur in the ATPase domain (385 residues) but most (75%) are concentrated in the beta-sheet and alpha-helix domains (34% of the protein) with none in the short mobile domain. Five of the diagnostic sites in the beta-sheet domain are clustered around, but not coincident with, functional sites known to be involved in substrate binding. Nearly all of the Hsp70 family length variation occurs in the mobile domain. Within montium subgroup species, 2 nearly identical hsp68 PCR products that differed in length are either different alleles or products of an ancestral hsp68 duplication.
Using Evolution to Guide Protein Engineering: The Devil IS in the Details.

PubMed

Swint-Kruse, Liskin

2016-07-12

For decades, protein engineers have endeavored to reengineer existing proteins for novel applications. Overall, protein folds and gross functions can be readily transferred from one protein to another by transplanting large blocks of sequence (i.e., domain recombination). However, predictably fine-tuning function (e.g., by adjusting ligand affinity, specificity, catalysis, and/or allosteric regulation) remains a challenge. One approach has been to use the sequences of protein families to identify amino acid positions that change during the evolution of functional variation. The rationale is that these nonconserved positions could be mutated to predictably fine-tune function. Evolutionary approaches to protein design have had some success, but the engineered proteins seldom replicate the functional performances of natural proteins. This Biophysical Perspective reviews several complexities that have been revealed by evolutionary and experimental studies of protein function. These include 1) challenges in defining computational and biological thresholds that define important amino acids; 2) the co-occurrence of many different patterns of amino acid changes in evolutionary data; 3) difficulties in mapping the patterns of amino acid changes to discrete functional parameters; 4) the nonconventional mutational outcomes that occur for a particular group of functionally important, nonconserved positions; 5) epistasis (nonadditivity) among multiple mutations; and 6) the fact that a large fraction of a protein's amino acids contribute to its overall function. To overcome these challenges, new goals are identified for future studies. Copyright © 2016 Biophysical Society. Published by Elsevier Inc. All rights reserved.
Characterization of a stearoyl-acyl carrier protein desaturase gene family from chocolate tree, Theobroma cacao L.

PubMed

Zhang, Yufan; Maximova, Siela N; Guiltinan, Mark J

2015-01-01

In plants, the conversion of stearoyl-ACP to oleoyl-ACP is catalyzed by a plastid-localized soluble stearoyl-acyl carrier protein (ACP) desaturase (SAD). The activity of SAD significantly impacts the ratio of saturated and unsaturated fatty acids, and is thus a major determinant of fatty acid composition. The cacao genome contains eight putative SAD isoforms with high amino acid sequence similarities and functional domain conservation with SAD genes from other species. Sequence variation in known functional domains between different SAD family members suggested that these eight SAD isoforms might have distinct functions in plant development, a hypothesis supported by their diverse expression patterns in various cacao tissues. Notably, TcSAD1 is universally expressed across all the tissues, and its expression pattern in seeds is highly correlated with the dramatic change in fatty acid composition during seed maturation. Interestingly, TcSAD3 and TcSAD4 appear to be exclusively and highly expressed in flowers, where their functions remain unknown. To test the function of TcSAD1 in vivo, transgenic complementation of the Arabidopsis ssi2 mutant was performed, demonstrating that TcSAD1 successfully rescued all AtSSI2-related phenotypes, further supporting the functional orthology between these two genes. The identification of the major SAD gene responsible for cocoa butter biosynthesis provides new strategies for screening for novel genotypes with desirable fatty acid compositions, and for use in breeding programs to help pyramid genes for quality and other traits such as disease resistance.
Pyrin gene and mutants thereof, which cause familial Mediterranean fever

DOEpatents

Kastner, Daniel L [Bethesda, MD]; Aksentijevich, Ivona [Bethesda, MD]; Centola, Michael [Tacoma Park, MD]; Deng, Zuoming [Gaithersburg, MD]; Sood, Ramen [Rockville, MD]; Collins, Francis S [Rockville, MD]; Blake, Trevor [Laytonsville, MD]; Liu, P Paul [Ellicott City, MD]; Fischel-Ghodsian, Nathan [Los Angeles, CA]; Gumucio, Deborah L [Ann Arbor, MI]; Richards, Robert I [North Adelaide, AU]; Ricke, Darrell O [San Diego, CA]; Doggett, Norman A [Santa Cruz, NM]; Pras, Mordechai [Tel-Hashomer, IL]

2003-09-30

The invention provides the nucleic acid sequence encoding the protein associated with familial Mediterranean fever (FMF). The cDNA sequence is designated as MEFV. The invention is also directed towards fragments of the DNA sequence, as well as the corresponding sequence for the RNA transcript and fragments thereof. Another aspect of the invention provides the amino acid sequence for a protein (pyrin) associated with FMF. The invention is directed towards both the full length amino acid sequence, fusion proteins containing the amino acid sequence and fragments thereof. The invention is also directed towards mutants of the nucleic acid and amino acid sequences associated with FMF. In particular, the invention discloses three missense mutations, clustered within about 40 to 50 amino acids, in the highly conserved rfp (B30.2) domain at the C-terminal of the protein. These mutants include M680I, M694V, K695R, and V726A. Additionally, the invention includes methods for diagnosing a patient at risk for having FMF and kits therefor.
Comparison of ZP3 protein sequences among vertebrate species: to obtain a consensus sequence for immunocontraception.

PubMed

Zhu, X; Naz, R K

1999-03-01

The deduced ZP3 amino acid (aa) sequences of 13 vertebrate species namely mouse, hamster, rabbit, pig, porcine, cow, dog, cat, human, bonnet, marmoset, carp, and frog were compared using the PILEUP and PRETTY alignment programs (GCG, Wisconsin, USA). The published aa sequences obtained from the 13 vertebrate species indicated overall evolutionary conservation in the N-terminus, central region, and C-terminus of the ZP3 polypeptide. More variations of ZP3 polypeptide sequences were seen in the alignments of carp and frog from the 11 mammalian species, making the leader sequence more prominent. The canonical furin proteolytic processing signal at the C-terminus was found in all the ZP3 polypeptide sequences except those of carp and frog. In the central region, the ZP3 deduced aa sequences of all the 13 vertebrate species aligned well, and six relatively conserved sequences were found. There are 11 conserved cysteine residues in the central region across all species including carp and frog, indicating that these residues have a longer evolutionary history. The ZP3 aa sequence similarities were examined using the GAP program (GCG). The highest aa similarities are observed between the members of the same order within the class Mammalia, and also (95.4%) between pig (ungulata) and rabbit (lagomorpha). The deduced ZP3 aa sequences per se may not be enough to build a phylogenetic tree.

77 FR 65537 - Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence...

Federal Register 2010, 2011, 2012, 2013, 2014

2012-10-29

... DEPARTMENT OF COMMERCE Patent and Trademark Office Requirements for Patent Applications Containing Nucleotide Sequence and/or Amino Acid Sequence Disclosures ACTION: Proposed collection; comment request... Patent applications that contain nucleotide and/or amino acid sequence disclosures must include a copy of...
Identification and Structural Characterization of Naturally-Occurring Broad-Spectrum Cyclic Antibiotics Isolated from Paenibacillus

NASA Astrophysics Data System (ADS)

Knolhoff, Ann M.; Zheng, Jie; McFarland, Melinda A.; Luo, Yan; Callahan, John H.; Brown, Eric W.; Croley, Timothy R.

2015-08-01

The rise of antimicrobial resistance necessitates the discovery and/or production of novel antibiotics. Isolated strains of Paenibacillus alvei were previously shown to exhibit antimicrobial activity against a number of pathogens, such as E. coli, Salmonella, and methicillin-resistant Staphylococcus aureus (MRSA). The responsible antimicrobial compounds were isolated from these Paenibacillus strains and a combination of low and high resolution mass spectrometry with multiple-stage tandem mass spectrometry was used for identification. A group of closely related cyclic lipopeptides was identified, differing primarily by fatty acid chain length and one of two possible amino acid substitutions. Variation in the fatty acid length resulted in mass differences of 14 Da and yielded groups of related MSn spectra. Despite the inherent complexity of MS/MS spectra of cyclic compounds, straightforward analysis of these spectra was accomplished by determining differences in complementary product ion series between compounds that differ in molecular weight by 14 Da. The primary peptide sequence assignment was confirmed through genome mining; the combination of these analytical tools represents a workflow that can be used for the identification of complex antibiotics. The compounds also share amino acid sequence similarity to a previously identified broad-spectrum antibiotic isolated from Paenibacillus. The presence of such a wide distribution of related compounds produced by the same organism represents a novel class of broad-spectrum antibiotic compounds.
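The 14 Da spacing described above can be used to sort observed masses into homologous series. The sketch below assumes that the nominal 14 Da difference corresponds to one CH2 unit (monoisotopic mass about 14.01565 Da), which is consistent with a change in fatty acid chain length but is not stated explicitly in the abstract. The masses, tolerance, and function name `methylene_series` are hypothetical.

```python
# Monoisotopic mass of one CH2 unit: 12C plus two 1H, about 14.01565 Da.
CH2_MASS = 12.0 + 2 * 1.00782503


def methylene_series(masses, tolerance=0.01):
    """Group masses into chains whose consecutive members differ by one CH2 unit."""
    series = []
    for mass in sorted(masses):
        for chain in series:
            if abs(mass - chain[-1] - CH2_MASS) <= tolerance:
                chain.append(mass)
                break
        else:
            # No existing chain ends one CH2 unit below this mass: start a new chain.
            series.append([mass])
    return series


# Hypothetical monoisotopic masses (Da), invented for illustration.
hypothetical_masses = [1000.50, 1014.5157, 1028.5313, 1100.60]
groups = methylene_series(hypothetical_masses)
```

Members of one group would then be candidates for the pairwise comparison of product ion series that the abstract describes.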
Variations in gut microbiota and fecal metabolic phenotype associated with depression by 16S rRNA gene sequencing and LC/MS-based metabolomics.

PubMed

Yu, Meng; Jia, Hongmei; Zhou, Chao; Yang, Yong; Zhao, Yang; Yang, Maohua; Zou, Zhongmei

2017-05-10

As a prevalent, life-threatening and highly recurrent psychiatric illness, depression is characterized by a wide range of pathological changes; however, its etiology remains incompletely understood. Accumulating evidence supports that gut microbiota affects not only gastrointestinal physiology but also central nervous system (CNS) function and behavior through the microbiota-gut-brain axis. To assess the impact of gut microbiota on fecal metabolic phenotype in depressive conditions, an integrated approach of 16S rRNA gene sequencing combined with ultra high-performance liquid chromatography-mass spectrometry (UHPLC-MS) based metabolomics was performed in chronic variable stress (CVS)-induced depression rat model. Interestingly, depression led to significant gut microbiota changes, at the phylum and genus levels in rats treated with CVS compared to controls. The relative abundances of the bacterial genera Marvinbryantia, Corynebacterium, Psychrobacter, Christensenella, Lactobacillus, Peptostreptococcaceae incertae sedis, Anaerovorax, Clostridiales incertae sedis and Coprococcus were significantly decreased, whereas Candidatus Arthromitus and Oscillibacter were markedly increased in model rats compared with normal controls. Meanwhile, distinct changes in fecal metabolic phenotype of depressive rats were also found, including lower levels of amino acids, and fatty acids, and higher amounts of bile acids, hypoxanthine and stercobilins. Moreover, there were substantial associations of perturbed gut microbiota genera with the altered fecal metabolites, especially compounds involved in the metabolism of tryptophan and bile acids. These results showed that the gut microbiota was altered in association with fecal metabolism in depressive conditions. These findings suggest that the 16S rRNA gene sequencing and LC-MS based metabolomics approach can be further applied to assess pathogenesis of depression. Copyright © 2017 Elsevier B.V. All rights reserved.
Surface gene variants of hepatitis B Virus in Saudi Patients.

PubMed

Al-Qudari, Ahmed Y; Amer, Haitham M; Abdo, Ayman A; Hussain, Zahid; Al-Hamoudi, Waleed; Alswat, Khalid; Almajhdi, Fahad N

2016-01-01

Hepatitis B virus (HBV) continues to be one of the most important viral pathogens in humans. Surface (S) protein is the major HBV antigen that mediates virus attachment and entry and determines the virus subtype. Mutations in S gene, particularly in the "a" determinant, can influence virus detection by ELISA and may generate escape mutants. Since no records have documented the S gene mutations in HBV strains circulating in Saudi Arabia, the current study was designed to study sequence variation of S gene in strains circulating in Saudi Arabia and its correlation with clinical and risk factors. A total of 123 HBV-infected patients were recruited for this study. Clinical and biochemical parameters, serological markers, and viral load were determined in all patients. The entire S gene sequence of samples with viral load exceeding 2000 IU/mL was retrieved and exploited in sequence and phylogenetic analysis. A total of 48 mutations (21 unique) were recorded in viral strains in Saudi Arabia, among which 24 (11 unique) changed their respective amino acids. Two amino acid changes were recorded in "a" determinant, including F130L and S135F with no evidence of the vaccine escape mutant G145R in any of the samples. No specific relationship was recognized between the mutation/amino acid change record of HBsAg in strains in Saudi Arabia and clinical or laboratory data. Phylogenetic analysis categorized HBV viral strains in Saudi Arabia as members of subgenotypes D1 and D3. The present report is the first that describes mutation analysis of HBsAg in strains in Saudi Arabia on both nucleotide and amino acid levels. Different substitutions, particularly in major hydrophilic region, may have a potential influence on disease diagnosis, vaccination strategy, and antiviral chemotherapy.
Characterization of Novel Fusaricidins Produced by Paenibacillus polymyxa-M1 Using MALDI-TOF Mass Spectrometry

NASA Astrophysics Data System (ADS)

Vater, Joachim; Niu, Ben; Dietel, Kristin; Borriss, Rainer

2015-09-01

Paenibacillus polymyxa-M1 is a potent producer of bioactive compounds, such as lipopeptides, polyketides, and lantibiotics of biotechnological and medical interest. Genome sequencing revealed nine gene clusters for nonribosomal biosynthesis of such agents. Here we report on the investigation of the fusaricidins, a complex of cyclic lipopeptides containing 15-guanidino-3-hydroxypentadecanoic acid (GHPD) as fatty acid component by matrix-assisted laser-desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS). More than 20 variants of these compounds were detected and characterized in detail. Mass spectrometric sequence analysis was performed by MALDI-LIFT-TOF/TOF fragment analysis. The obtained product ion spectra show a specific processing in the fatty acid part. GHPD is cleaved between the α- and β-position yielding two fragments a and b, one bearing the end-standing guanidine group and another one comprising the residual two C-atoms of GHPD with the attached peptide moiety. The complete sequence of all fusaricidins was derived from sets of bn- and yn-ions. The fusaricidin complex can be divided into four lipopeptide families, three of them showing variations of the amino acid in position 3, Val or Ile for the first and Tyr or Phe for families 2 and 3, respectively. A collection of novel fusaricidins was detected differing from those of families 1-3 by an additional residue of 71 Da (family 4). LIFT-TOF/TOF fragment spectra of these species imply that in their peptide moiety, an Ala-residue is attached by an ester bond to the free hydroxyl group of Thr4. More than 10 novel fusaricidins were characterized mass spectrometrically.
Unraveling Selection in the Mitochondrial Genome of Drosophila

PubMed Central

Ballard, JWO.; Kreitman, M.

1994-01-01

We examine mitochondrial DNA variation at the cytochrome b locus within and between three species of Drosophila to determine whether patterns of variation conform to the predictions of neutral molecular evolution. The entire 1137-bp cytochrome b locus was sequenced in 16 lines of Drosophila melanogaster, 18 lines of Drosophila simulans and 13 lines of Drosophila yakuba. Patterns of variation depart from neutrality by several test criteria. Analysis of the evolutionary clock hypothesis shows unequal rates of change along D. simulans lineages. A comparison within and between species of the ratio of amino acid replacement change to synonymous change reveals a relative excess of amino acid replacement polymorphism compared to the neutral prediction, suggestive of slightly deleterious or diversifying selection. There is evidence for excess homozygosity in our worldwide sample of D. melanogaster and D. simulans alleles, as well as a reduction in the number of segregating sites in D. simulans, indicative of selective sweeps. Furthermore, a test of neutrality for codon usage shows the direction of mutations at third positions differs among different topological regions of the gene tree. The analyses indicate that molecular variation and evolution of mtDNA are governed by many of the same selective forces that have been shown to govern nuclear genome evolution and suggest caution be taken in the use of mtDNA as a "neutral" molecular marker. PMID:7851772
Genetic variation of coat protein gene among the isolates of Rice tungro spherical virus from tungro-endemic states of the India.

PubMed

Mangrauthia, Satendra K; Malathi, P; Agarwal, Surekha; Ramkumar, G; Krishnaveni, D; Neeraja, C N; Madhav, M Sheshu; Ladhalakshmi, D; Balachandran, S M; Viraktamath, B C

2012-06-01

Rice tungro disease, one of the major constraints to rice production in South and Southeast Asia, is caused by a combination of two viruses: Rice tungro spherical virus (RTSV) and Rice tungro bacilliform virus (RTBV). The present study was undertaken to determine the genetic variation of the RTSV population present in tungro-endemic states of the Indian subcontinent. Phylogenetic analysis based on coat protein sequences showed distinct divergence of Indian RTSV isolates into two groups: one consisted of isolates from Hyderabad (Andhra Pradesh), Cuttack (Orissa), and Puducherry, and the other of isolates from West Bengal, Coimbatore (Tamil Nadu), and Kanyakumari (Tamil Nadu). The results of the phylogenetic study were further supported by SNP (single nucleotide polymorphism), INDEL (insertion and deletion) and evolutionary distance analyses. In addition, a sequence difference count matrix revealed 2-68 nucleotide differences among all the Indian RTSV isolates included in this study. However, at the protein level these differences were not significant, as revealed by Ka/Ks ratio calculation. Sequence identity at the nucleotide and amino acid levels was 92-100% and 97-100%, respectively, among Indian isolates of RTSV. Understanding the population structure of RTSV from tungro-endemic regions of India would potentially provide insights into the molecular diversification of this virus.
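The difference-count matrix and percent-identity figures mentioned in this abstract can be illustrated with a minimal sketch. It assumes the sequences are already aligned to equal length and counts a gap as a mismatch; the isolate names and toy sequences are hypothetical and are not data from the study.

```python
from itertools import combinations


def count_differences(a: str, b: str) -> int:
    """Count mismatched columns between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def percent_identity(a: str, b: str) -> float:
    """Percentage of alignment columns at which the two sequences agree."""
    return 100.0 * (len(a) - count_differences(a, b)) / len(a)


# Hypothetical toy alignment, for illustration only.
isolates = {
    "isolate_A": "ATGGCTAAGCTT",
    "isolate_B": "ATGGCTAAGCTT",
    "isolate_C": "ATGGCAAAGCTA",
}

# Print every pairwise comparison: names, difference count, percent identity.
for (name1, seq1), (name2, seq2) in combinations(isolates.items(), 2):
    print(name1, name2, count_differences(seq1, seq2),
          round(percent_identity(seq1, seq2), 1))
```

Real analyses would normally compute these values from a multiple sequence alignment produced by a dedicated tool; the sketch only shows what the two reported quantities measure.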
WNT10A mutation results in severe tooth agenesis in a family of three sisters.

PubMed

Abid, M F; Simpson, M A; Barbosa, I A; Seppala, M; Irving, M; Sharpe, P T; Cobourne, M T

2018-06-21

To identify the genetic basis of severe tooth agenesis in a family of three affected sisters. A family of three sisters with severe tooth agenesis was recruited for whole-exome sequencing to identify potential genetic variation responsible for this penetrant phenotype. The unaffected father was tested for specific mutations using Sanger sequencing. Gene discovery was supplemented with in situ hybridization to localize gene expression during human tooth development. We report a nonsense heterozygous mutation in exon 2 of WNT10A c.321C>A[p.Cys107*] likely to be responsible for the severe tooth agenesis identified in this family through the creation of a premature stop codon, resulting in truncation of the amino acid sequence and therefore loss of protein function. In situ hybridization showed expression of WNT10A in odontogenic epithelium during the early and late stages of human primary tooth development. WNT10A has previously been associated with both syndromic and non-syndromic forms of tooth agenesis, and this report further expands our knowledge of genetic variation underlying non-syndromic forms of this condition. We also demonstrate expression of WNT10A in the epithelial compartment of human tooth germs during development. © 2018 John Wiley & Sons A/S. Published by John Wiley & Sons Ltd.
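The variant notation c.321C>A [p.Cys107*] can be checked with a little arithmetic: coding position 321 is the third base of codon 107. The sketch below assumes the standard genetic code. The reference codon TGC is inferred from the notation (cysteine is encoded by TGC or TGT, and the reference base at this position is C); it is not read from the WNT10A reference sequence. Function names are hypothetical.

```python
# Only the codons needed for this illustration (standard genetic code).
CODON_TABLE_SUBSET = {"TGC": "Cys", "TGT": "Cys", "TGA": "*"}


def codon_number(cds_position: int) -> int:
    """1-based codon index for a 1-based coding-sequence position."""
    return (cds_position - 1) // 3 + 1


def position_in_codon(cds_position: int) -> int:
    """1-based position (1, 2 or 3) of the base within its codon."""
    return (cds_position - 1) % 3 + 1


def apply_substitution(codon: str, pos_in_codon: int, alt_base: str) -> str:
    """Return the codon with one base replaced."""
    i = pos_in_codon - 1
    return codon[:i] + alt_base + codon[i + 1:]


reference_codon = "TGC"  # inferred from the notation, see lead-in
mutant_codon = apply_substitution(reference_codon, position_in_codon(321), "A")
print(codon_number(321), position_in_codon(321),
      CODON_TABLE_SUBSET[reference_codon], "->", CODON_TABLE_SUBSET[mutant_codon])
```

The substitution turns a cysteine codon into the stop codon TGA, which is the premature termination the abstract describes.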
Microsatellite analysis in the genome of Acanthaceae: An in silico approach

PubMed Central

Kaliswamy, Priyadharsini; Vellingiri, Srividhya; Nathan, Bharathi; Selvaraj, Saravanakumar

2015-01-01

Background: Acanthaceae is one of the advanced and specialized families with conventionally used medicinal plants. Simple sequence repeats (SSRs) play a major role as molecular markers for genome analysis and plant breeding. Microsatellites present in complete genome sequences may play a direct role in genome organization, recombination, gene regulation, quantitative genetic variation, and evolution of genes. Objective: The current study reports the frequency of microsatellites and appropriate markers for the Acanthaceae family genome sequences. Materials and Methods: The whole nucleotide sequences of Acanthaceae species were obtained from the National Center for Biotechnology Information database and screened for the presence of SSRs. The SSR Locator tool was used to predict the microsatellites and its inbuilt Primer3 module was used for primer design. Results: In total, 110 repeats from 108 sequences of Acanthaceae family plant genomes were identified, and dinucleotide repeats were found to be abundant in the genome sequences. The essential amino acid isoleucine was found to be abundant in all the sequences. We also designed SSR-based primers/markers for 59 sequences of this family that contain microsatellite repeats in their genome. Conclusion: The identified microsatellites and primers might be useful for breeding and genetic studies of plants that belong to the Acanthaceae family in the future. PMID:25709226
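As an illustration of what screening a sequence for SSRs involves, here is a minimal regex-based sketch. It is not the algorithm used by SSR Locator, which the abstract does not describe; the function name and the minimum repeat counts are assumptions chosen only for the example.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Return (start, motif, repeat_count) for perfect mono-, di- and trinucleotide repeats."""
    if min_repeats is None:
        # Hypothetical thresholds: motif length -> minimum number of tandem copies.
        min_repeats = {1: 10, 2: 6, 3: 5}
    seq = seq.upper()
    hits = []
    for unit_len, min_rep in min_repeats.items():
        # Group 2 captures the motif; \2{n,} requires at least n further copies.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(2)
            # Skip motifs such as "AA" or "TTT", which are really mononucleotide runs.
            if unit_len > 1 and len(set(motif)) == 1:
                continue
            hits.append((match.start(), motif, len(match.group(1)) // unit_len))
    return sorted(hits)


# Hypothetical test sequence containing seven tandem copies of "AT".
example = "GGC" + "AT" * 7 + "CCG"
print(find_ssrs(example))  # [(3, 'AT', 7)]
```

Positions are 0-based, and only perfect repeats are reported; dedicated tools typically also handle longer motifs and imperfect or compound repeats.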
Sequence analysis of Jembrana disease virus strains reveals a genetically stable lentivirus.

PubMed

Desport, Moira; Stewart, Meredith E; Mikosza, Andrew S; Sheridan, Carol A; Peterson, Shane E; Chavand, Olivier; Hartaningsih, Nining; Wilcox, Graham E

2007-06-01

Jembrana disease virus (JDV) is a lentivirus associated with an acute disease syndrome with a 20% case fatality rate in Bos javanicus (Bali cattle) in Indonesia, occurring after a short incubation period and with no recurrence of the disease after recovery. Partial regions of gag and pol and the entire env were examined for sequence variation in DNA samples from cases of Jembrana disease obtained from Bali, Sumatra and South Kalimantan in Indonesian Borneo. A high level of nucleotide conservation (97-100%) was observed in gag sequences from samples taken in Bali and Sumatra, indicating that the source of JDV in Sumatra was most likely to have originated from Bali. The pol sequences and, unexpectedly, the env sequences from Bali samples were also well conserved with low nucleotide (96-99%) and amino acid substitutions (95-99%). However, the sample from South Kalimantan (JDV(KAL/01)) contained more divergent sequences, particularly in env (88% identity). Phylogenetic analysis revealed that the JDV(KAL/01)env sequences clustered with the sequence from the Pulukan sample (Bali) from 2001. JDV appears to be remarkably stable genetically and has undergone minor genetic changes over a period of nearly 20 years in Bali despite becoming endemic in the cattle population of the island.
Amino acid sequence analysis of the annexin super-gene family of proteins.

PubMed

Barton, G J; Newman, R H; Freemont, P S; Crumpton, M J

1991-06-15

The annexins are a widespread family of calcium-dependent membrane-binding proteins. No common function has been identified for the family and, until recently, no crystallographic data existed for an annexin. In this paper we draw together 22 available annexin sequences consisting of 88 similar repeat units, and apply the techniques of multiple sequence alignment, pattern matching, secondary structure prediction and conservation analysis to the characterisation of the molecules. The analysis clearly shows that the repeats cluster into four distinct families and that greatest variation occurs within the repeat 3 units. Multiple alignment of the 88 repeats shows amino acids with conserved physicochemical properties at 22 positions, with only Gly at position 23 being absolutely conserved in all repeats. Secondary structure prediction techniques identify five conserved helices in each repeat unit and patterns of conserved hydrophobic amino acids are consistent with one face of a helix packing against the protein core in predicted helices a, c, d, e. Helix b is generally hydrophobic in all repeats, but contains a striking pattern of repeat-specific residue conservation at position 31, with Arg in repeats 4 and Glu in repeats 2, but unconserved amino acids in repeats 1 and 3. This suggests repeats 2 and 4 may interact via a buried salt bridge. The loop between predicted helices a and b of repeat 3 shows features distinct from the equivalent loop in repeats 1, 2 and 4, suggesting an important structural and/or functional role for this region. No compelling evidence emerges from this study for uteroglobin and the annexins sharing similar tertiary structures, or for uteroglobin representing a derivative of a primordial one-repeat structure that underwent duplication to give the present day annexins. The analyses performed in this paper are re-evaluated in the Appendix, in the light of the recently published X-ray structure for human annexin V. The structure confirms most of the predictions and shows the power of techniques for the determination of tertiary structural information from the amino acid sequences of an aligned protein family.
Insect sex determination: it all evolves around transformer.

PubMed

Verhulst, Eveline C; van de Zande, Louis; Beukeboom, Leo W

2010-08-01

Insects exhibit a variety of sex determining mechanisms including male or female heterogamety and haplodiploidy. The primary signal that starts sex determination is processed by a cascade of genes ending with the conserved switch doublesex that controls sexual differentiation. Transformer is the doublesex splicing regulator and has been found in all examined insects, indicating its ancestral function as a sex-determining gene. Despite this conserved function, the variation in transformer nucleotide sequence, amino acid composition and protein structure can accommodate a multitude of upstream sex determining signals. Transformer regulation of doublesex and its taxonomic distribution indicate that the doublesex-transformer axis is conserved among all insects and that transformer is the key gene around which variation in sex determining mechanisms has evolved.
The adaptive evolution of the mammalian mitochondrial genome

PubMed Central

da Fonseca, Rute R; Johnson, Warren E; O'Brien, Stephen J; Ramos, Maria João; Antunes, Agostinho

2008-01-01

Background: The mitochondria produce up to 95% of a eukaryotic cell's energy through oxidative phosphorylation. The proteins involved in this vital process are under high functional constraints. However, metabolic requirements vary across species, potentially modifying selective pressures. We evaluate the adaptive evolution of 12 protein-coding mitochondrial genes in 41 placental mammalian species by assessing amino acid sequence variation and exploring the functional implications of observed variation in secondary and tertiary protein structures. Results: Wide variation in the properties of amino acids was observed at functionally important regions of cytochrome b in species with more-specialized metabolic requirements (such as adaptation to low energy diet or large body size, such as in elephant, dugong, sloth, and pangolin, and adaptation to unusual oxygen requirements, for example diving in cetaceans, flying in bats, and living at high altitudes in alpacas). Signatures of adaptive variation in the NADH dehydrogenase complex were restricted to the loop regions of the transmembrane units which likely function as proton pumps. Evidence of adaptive variation in the cytochrome c oxidase complex was observed mostly at the interface between the mitochondrial and nuclear-encoded subunits, perhaps evidence of co-evolution. The ATP8 subunit, which has an important role in the assembly of F0, exhibited the highest signal of adaptive variation. ATP6, which has an essential role in rotor performance, showed a high adaptive variation in predicted loop areas. Conclusion: Our study provides insight into the adaptive evolution of the mtDNA genome in mammals and its implications for the molecular mechanism of oxidative phosphorylation. We present a framework for future experimental characterization of the impact of specific mutations in the function, physiology, and interactions of the mtDNA encoded proteins involved in oxidative phosphorylation. PMID:18318906
Fine mapping and identification of a candidate gene for the barley Un8 true loose smut resistance gene.

PubMed

Zang, Wen; Eckstein, Peter E; Colin, Mark; Voth, Doug; Himmelbach, Axel; Beier, Sebastian; Stein, Nils; Scoles, Graham J; Beattie, Aaron D

2015-07-01

The candidate gene for the barley Un8 true loose smut resistance gene encodes a deduced protein containing two tandem protein kinase domains. In North America, durable resistance against all known isolates of barley true loose smut, caused by the basidiomycete pathogen Ustilago nuda (Jens.) Rostr. (U. nuda), is under the control of the Un8 resistance gene. Previous genetic studies mapped Un8 to the long arm of chromosome 5 (1HL). Here, a population of 4625 lines segregating for Un8 was used to delimit the Un8 gene to a 0.108 cM interval on chromosome arm 1HL, and assign it to fingerprinted contig 546 of the barley physical map. The minimal tilling path was identified for the Un8 locus using two flanking markers and consisted of two overlapping bacterial artificial chromosomes. One gene located close to a marker co-segregating with Un8 showed high sequence identity to a disease resistance gene containing two kinase domains. Sequence of the candidate gene from the parents of the segregating population, and in an additional 19 barley lines representing a broader spectrum of diversity, showed there was no intron in alleles present in either resistant or susceptible lines, and fifteen amino acid variations unique to the deduced protein sequence in resistant lines differentiated it from the deduced protein sequences in susceptible lines. Some of these variations were present within putative functional domains which may cause a loss of function in the deduced protein sequences within susceptible lines.
Method for nucleic acid hybridization using single-stranded DNA binding protein

DOEpatents

Tabor, Stanley; Richardson, Charles C.

1996-01-01

Method of nucleic acid hybridization for detecting the presence of a specific nucleic acid sequence in a population of different nucleic acid sequences using a nucleic acid probe. The nucleic acid probe hybridizes with the specific nucleic acid sequence but not with other nucleic acid sequences in the population. The method includes contacting a sample (potentially including the nucleic acid sequence) with the nucleic acid probe under hybridizing conditions in the presence of a single-stranded DNA binding protein provided in an amount which stimulates renaturation of a dilute solution (i.e., one in which the t1/2 of renaturation is longer than 3 weeks) of single-stranded DNA greater than 500 fold (i.e., to a t1/2 less than 60 min, preferably less than 5 min, and most preferably about 1 min.) in the absence of nucleotide triphosphates.
Sequence quality analysis tool for HIV type 1 protease and reverse transcriptase.

PubMed

Delong, Allison K; Wu, Mingham; Bennett, Diane; Parkin, Neil; Wu, Zhijin; Hogan, Joseph W; Kantor, Rami

2012-08-01

Access to antiretroviral therapy is increasing globally and drug resistance evolution is anticipated. Currently, protease (PR) and reverse transcriptase (RT) sequence generation is increasing, including the use of in-house sequencing assays, and quality assessment prior to sequence analysis is essential. We created a computational HIV PR/RT Sequence Quality Analysis Tool (SQUAT) that runs in the R statistical environment. Sequence quality thresholds are calculated from a large dataset (46,802 PR and 44,432 RT sequences) from the published literature ( http://hivdb.Stanford.edu ). Nucleic acid sequences are read into SQUAT, identified, aligned, and translated. Nucleic acid sequences are flagged if they have >five 1-2-base insertions; >one 3-base insertion; >one deletion; >six PR or >18 RT ambiguous bases; >three consecutive PR or >four RT nucleic acid mutations; >zero stop codons; >three PR or >six RT ambiguous amino acids; >three consecutive PR or >four RT amino acid mutations; >zero unique amino acids; or <0.5% or >15% genetic distance from another submitted sequence. Thresholds are user modifiable. SQUAT output includes a summary report with detailed comments for troubleshooting of flagged sequences, histograms of pairwise genetic distances, neighbor joining phylogenetic trees, and aligned nucleic and amino acid sequences. SQUAT is a stand-alone, free, web-independent tool to ensure use of high-quality HIV PR/RT sequences in interpretation and reporting of drug resistance, while increasing awareness and expertise and facilitating troubleshooting of potentially problematic sequences.
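The threshold-based flagging described in this abstract can be sketched as a simple rule check. SQUAT itself runs in R and its internals are not given here, so the Python below is only an illustration: it covers four of the listed criteria, uses the limits quoted in the abstract, and every class, field and function name is hypothetical.

```python
from dataclasses import dataclass

# Region-specific limits quoted in the abstract; a value above the limit is flagged.
LIMITS = {
    "PR": {"ambiguous_bases": 6, "ambiguous_amino_acids": 3},
    "RT": {"ambiguous_bases": 18, "ambiguous_amino_acids": 6},
}
MAX_STOP_CODONS = 0
DISTANCE_RANGE_PCT = (0.5, 15.0)  # flagged when outside this range


@dataclass
class SequenceStats:
    """Per-sequence summary; the field names are hypothetical."""
    region: str                 # "PR" or "RT"
    ambiguous_bases: int
    ambiguous_amino_acids: int
    stop_codons: int
    distance_pct: float         # genetic distance to another submitted sequence


def flag_reasons(stats):
    """Return a list of human-readable reasons why a sequence would be flagged."""
    limits = LIMITS[stats.region]
    reasons = []
    if stats.ambiguous_bases > limits["ambiguous_bases"]:
        reasons.append("too many ambiguous bases")
    if stats.ambiguous_amino_acids > limits["ambiguous_amino_acids"]:
        reasons.append("too many ambiguous amino acids")
    if stats.stop_codons > MAX_STOP_CODONS:
        reasons.append("stop codon present")
    low, high = DISTANCE_RANGE_PCT
    if stats.distance_pct < low or stats.distance_pct > high:
        reasons.append("genetic distance outside expected range")
    return reasons


# Hypothetical protease sequence summary with seven ambiguous bases.
example = SequenceStats("PR", ambiguous_bases=7, ambiguous_amino_acids=1,
                        stop_codons=0, distance_pct=4.2)
print(flag_reasons(example))  # ['too many ambiguous bases']
```

Keeping the limits in a plain dictionary mirrors the abstract's statement that thresholds are user modifiable: the same seven ambiguous bases would pass under the RT limit of 18.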
Genomic analysis reveals Nairobi sheep disease virus to be highly diverse and present in both Africa, and in India in the form of the Ganjam virus variant.

PubMed

Yadav, Pragya D; Vincent, Martin J; Khristova, Marina; Kale, Charuta; Nichol, Stuart T; Mishra, Akhilesh C; Mourya, Devendra T

2011-07-01

Nairobi sheep disease (NSD) virus, the prototype tick-borne virus of the genus Nairovirus, family Bunyaviridae, is associated with acute hemorrhagic gastroenteritis in sheep and goats in East and Central Africa. The closely related Ganjam virus found in India is associated with febrile illness in humans and disease in livestock. The complete S, M and L segment sequences of Ganjam and NSD virus were analysed, and partial sequence analysis of the S, M and L segment regions (396 bp, 701 bp and 425 bp) encoding the viral nucleocapsid (N), glycoprotein precursor (GPC) and L polymerase (L) proteins, respectively, was carried out for multiple Ganjam virus isolates obtained from 1954 to 2002 and from various regions of India. M segments of NSD and Ganjam virus encode a large ORF for the glycoprotein precursor (GPC) (1627 and 1624 amino acids in length, respectively) and their L segments encode a very large L polymerase (3991 amino acids). The complete S, M and L segments of NSD and Ganjam viruses were more closely related to one another than to other characterized nairoviruses, and no evidence of reassortment was found. However, the complete M segments of NSD and Ganjam virus differed by 22.90% and 14.70% at the nucleotide and amino acid levels, respectively, and the complete L segments differed by 9.90% and 2.70% at the nucleotide and protein levels, respectively. The complete S segments of Ganjam and NSD virus differed by 9.40-10.40% and 3.2-4.10% at the nucleotide and protein levels, respectively, while among Ganjam viruses 0.0-6.20% and 0.0-1.4% variation was found at the nucleotide and amino acid levels, respectively. Ganjam virus isolates differed by up to 17% and 11% at the nucleotide level for the partial S and L gene fragments, respectively, with less variation observed at the deduced amino acid level (10.5 and 2%, S and L, respectively). However, the virus partial M gene fragment (which encodes the hypervariable mucin-like domain) of these viruses differed by as much as 56% at the nucleotide level. Phylogenetic analysis of partial sequence differences suggests considerable mixing and movement of Ganjam virus strains within India, with no clear relationship between genetic lineages and virus geographic origin or year of isolation. Surprisingly, NSD virus does not represent a distinct lineage, but appears as a variant alongside other Ganjam viruses within the NSD virus group. Copyright © 2011 Elsevier B.V. All rights reserved.
Variations in endothelin receptor B subtype 2 (EDNRB2) coding sequences and mRNA expression levels in 4 Muscovy duck plumage colour phenotypes.

PubMed

Wu, N; Qin, H; Wang, M; Bian, Y; Dong, B; Sun, G; Zhao, W; Chang, G; Xu, Q; Chen, G

2017-04-01

1. Endothelin receptor B subtype 2 (EDNRB2) is a paralog of EDNRB, which encodes a 7-transmembrane G-protein coupled receptor. Previous studies reported that EDNRB was essential for melanoblast migration in mammals and ducks. 2. Muscovy ducks have different plumage colour phenotypes. Variations in EDNRB2 coding sequences (CDSs) and mRNA expression levels were investigated in 4 different Muscovy duck plumage colour phenotypes, including black, black mutant, silver and white head. 3. The EDNRB2 gene from Muscovy duck was cloned; it had a length of 6435 bp and encoded 437 amino acids. The coding region was screened and potential single nucleotide polymorphisms were identified. Eight mutations were obtained, including one missense variant (c.64C > T) and 7 synonymous substitutions. The substitutions were associated with plumage colour phenotypes. 4. The EDNRB2 mRNA expression levels were compared between feather pulp from black birds and black mutant birds. The results indicated that EDNRB2 transcripts in feather pulp were significantly higher in black feathers than in white feathers. 5. The results determined the variation of EDNRB2 CDS and mRNA expression in Muscovy ducks of various plumage colours.
Constitutional sequence variation in the Fanconi anaemia group C (FANCC) gene in childhood acute myeloid leukaemia.

PubMed

Barber, Lisa M; McGrath, Helen E N; Meyer, Stefan; Will, Andrew M; Birch, Jillian M; Eden, Osborn B; Taylor, G Malcolm

2003-04-01

The extent to which genetic susceptibility contributes to the causation of childhood acute myeloid leukaemia (AML) is not known. The inherited bone marrow failure disorder Fanconi anaemia (FA) carries a substantially increased risk of AML, raising the possibility that constitutional variation in the FA (FANC) genes is involved in the aetiology of childhood AML. We have screened genomic DNA extracted from remission blood samples of 97 children with sporadic AML and 91 children with sporadic acute lymphoblastic leukaemia (ALL), together with 104 cord blood DNA samples from newborn children, for variations in the Fanconi anaemia group C (FANCC) gene. We found no evidence of known FANCC pathogenic mutations in children with AML, ALL or in the cord blood samples. However, we detected 12 different FANCC sequence variants, of which five were novel to this study. Among six FANCC variants leading to amino-acid substitutions, one (S26F) was present at a fourfold greater frequency in children with AML than in the cord blood samples (odds ratio: 4.09, P = 0.047; 95% confidence interval 1.08-15.54). Our results thus do not exclude the possibility that this polymorphic variant contributes to the risk of a small proportion of childhood AML.
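The abstract reports an odds ratio with a 95% confidence interval but not the method used to derive it. One common approach, shown here as an assumption rather than as the authors' procedure, is the log-based (Woolf) interval computed from a 2x2 table of carriers and non-carriers. The counts in the example are hypothetical and are not the study's data.

```python
import math


def odds_ratio_with_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-based) confidence interval for a 2x2 table.

    a: cases carrying the variant      b: cases without it
    c: controls carrying the variant   d: controls without it
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cell counts must be positive")
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(odds ratio).
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
estimate, lower, upper = odds_ratio_with_ci(a=10, b=90, c=5, d=95)
print(round(estimate, 2), round(lower, 2), round(upper, 2))
```

An interval whose lower bound sits just above 1, as in the reported 1.08-15.54, indicates an association that is statistically significant but imprecisely estimated, which fits the cautious wording of the abstract's conclusion.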
Sequence characterization of S100A8 gene reveals structural differences of protein and transcriptional factor binding sites in water buffalo and yak.

PubMed

Kathiravan, P; Goyal, S; Kataria, R S; Mishra, B P; Jayakumar, S; Joshi, B K

2011-01-01

The present study was undertaken to characterize the structure of S100A8 gene and its promoter in water buffalo and yak. Sequence data of 2.067 kb, 2.071 kb, and 2.052 kb with respect to complete S100A8 gene including 5' flanking region was generated in river buffalo, swamp buffalo, and yak, respectively. BLAST analysis of coding DNA sequences (CDS) of S100A8 gene revealed 95% homology of buffalo sequence with cattle, 85% with pig and horse, 83% with dog, 72-73% with murines, and around 79% with primates and humans. Phylogenetic analysis of predicted CDS revealed distinct clustering of murines, primates, and domestic animals with bovines and bubalines forming a subcluster among farm animals. In silico translation of predicted CDS revealed a sequence of 89 amino acids with 7 amino acid changes between cattle and buffalo and 2 changes between cattle and yak. The search for Pfam family revealed the N-terminal calcium binding domain and the noncanonical EF hand domain in the carboxy terminus, with more variations being observed in the N-terminal domain among different species. Two amino acid changes observed in carboxy terminal EF hand domain resulted in altered secondary structure of yak S100A8 protein. Analysis of S100A8 gene promoter revealed 14 putative motifs for transcriptional factor binding sites. Two putative motifs viz. C/EBP and v-Myb were found to be absent in swamp buffalo as compared to river buffalo and cattle. Differences in the structure of S100A8 protein and the transcriptional factor binding sites identified in the present study need to be analyzed further for their functional significance in yak and swamp buffalo respectively. Copyright © Taylor & Francis Group, LLC

Polymorphic variations in the FANCA gene in high-risk non-BRCA1/2 breast cancer individuals from the French Canadian population.

PubMed

Litim, Nadhir; Labrie, Yvan; Desjardins, Sylvie; Ouellette, Geneviève; Plourde, Karine; Belleau, Pascal; Durocher, Francine

2013-02-01

The majority of genes associated with breast cancer susceptibility, including BRCA1 and BRCA2 genes, are involved in DNA repair mechanisms. Moreover, among the genes recently associated with an increased susceptibility to breast cancer, four are Fanconi Anemia (FA) genes: FANCD1/BRCA2, FANCJ/BACH1/BRIP1, FANCN/PALB2 and FANCO/RAD51C. FANCA is implicated in DNA repair and has been shown to interact directly with BRCA1. It has been proposed that the formation of FANCA/G (dependent upon the phosphorylation of FANCA) and FANCB/L sub-complexes, together with FANCM, represents the initial step for DNA repair activation and subsequent formation of other sub-complexes leading to ubiquitination of FANCD2 and FANCI. As only approximately 25% of inherited breast cancers are attributable to BRCA1/2 mutations, FANCA therefore becomes an attractive candidate for breast cancer susceptibility. We thus analyzed the FANCA gene in 97 high-risk French Canadian non-BRCA1/2 breast cancer individuals by direct sequencing as well as in 95 healthy control individuals from the same population. Among a total of 85 sequence variants found in either or both series, 28 are coding variants and 19 of them are missense variations leading to amino acid change. Three of the amino acid changes, namely Thr561Met, Cys625Ser and particularly Ser1088Phe, which has been previously reported to be associated with FA, are predicted to be damaging by the SIFT and PolyPhen software. cDNA amplification revealed significant expression of 4 alternative splicing events (insertion of an intronic portion of intron 10, and the skipping of exons 11, 30 and 31). In silico analyses of relevant genomic variants have been performed in order to identify potential variations involved in the expression of these spliced transcripts. Sequence variants in FANCA could therefore be potential spoilers of the Fanconi-BRCA pathway and as a result, they could in turn have an impact in non-BRCA1/2 breast cancer families. Copyright © 2012 Federation of European Biochemical Societies. Published by Elsevier B.V. All rights reserved.
Arbuscular mycorrhizal fungi (Glomeromycota) harbour ancient fungal tubulin genes that resemble those of the chytrids (Chytridiomycota).

PubMed

Corradi, Nicolas; Hijri, Mohamed; Fumagalli, Luca; Sanders, Ian R

2004-11-01

The genes encoding alpha- and beta-tubulins have been widely sampled in most major fungal phyla and they are useful tools for fungal phylogeny. Here, we report the first isolation of alpha-tubulin sequences from arbuscular mycorrhizal fungi (AMF). In parallel, AMF beta-tubulins were sampled and analysed to identify the presence of paralogs of this gene. The AMF alpha-tubulin amino acid phylogeny was congruent with the results previously reported for AMF beta-tubulins and showed that AMF tubulins group together at a basal position in the fungal clade and showed high sequence similarities with members of the Chytridiomycota. This is in contrast with phylogenies for other regions of the AMF genome. The amount and nature of substitutions are consistent with an ancient divergence of both orthologs and paralogs of AMF tubulins. At the amino acid level, however, AMF tubulins have hardly evolved from those of the chytrids. This is remarkable given that these two groups are ancient and the monophyletic Glomeromycota probably diverged from basal fungal ancestors at least 500 million years ago. The specific primers we designed for the AMF tubulins, together with the high molecular variation we found among the AMF species we analysed, make AMF tubulin sequences potentially useful for AMF identification purposes.
ESTs from Seeds to Assist the Selective Breeding of Jatropha curcas L. for Oil and Active Compounds

PubMed Central

Gomes, Kleber A; Almeida, Tiago C; Gesteira, Abelmon S; Lôbo, Ivon P; Guimarães, Ana Carolina R; de Miranda, Antonio B; Van Sluys, Marie-Anne; da Cruz, Rosenira S; Cascardo, Júlio CM; Carels, Nicolas

2010-01-01

We report here on the characterization of a cDNA library from seeds of Jatropha curcas L. at three stages of fruit maturation before yellowing. We sequenced a total of 2200 clones and obtained a set of 931 non-redundant sequences (unigenes) after trimming and quality control, i.e., 140 contigs and 791 singlets with PHRED quality ≥10. We found low levels of sequence redundancy and extensive metabolic coverage by homology comparison to GO. After comparison of 5841 non-redundant ESTs from a total of 13193 reads from GenBank with KEGG, we identified tags with nucleotide variations among J. curcas accessions for genes of fatty acid, terpene, alkaloid, quinone and hormone pathways of biosynthesis. More specifically, the expression level of four genes (palmitoyl-acyl carrier protein thioesterase, 3-ketoacyl-CoA thiolase B, lysophosphatidic acid acyltransferase and geranyl pyrophosphate synthase) measured by real-time PCR proved to be significantly different between leaves and fruits. Since the nucleotide polymorphism of these tags is associated with a higher level of gene expression in fruits compared to leaves, we propose this approach to speed up the search for quantitative traits in selective breeding of J. curcas. We also discuss its potential utility for the selective breeding of economically important traits in J. curcas. PMID:26217103
The NS3 proteins of global strains of bluetongue virus evolve into regional topotypes through negative (purifying) selection.

PubMed

Balasuriya, U B R; Nadler, S A; Wilson, W C; Pritchard, L I; Smythe, A B; Savini, G; Monaco, F; De Santis, P; Zhang, N; Tabachnick, W J; Maclachlan, N J

2008-01-01

Comparison of the deduced amino acid sequences of the genes (S10) encoding the NS3 protein of 137 strains of bluetongue virus (BTV) from Africa, the Americas, Asia, Australia and the Mediterranean Basin showed limited variation. Common to all NS3 sequences were potential glycosylation sites at amino acid residues 63 and 150 and a cysteine at residue 137, whereas a cysteine at residue 181 was not conserved. The PPXY and PS/TAP late-domain motifs were conserved in all but three of the viruses. Phylogenetic analyses of these same sequences yielded two principal clades that grouped the viruses irrespective of their serotype or year of isolation (1900-2003). All viruses from Asia and Australia were grouped in one clade, whereas those from the other regions were present in both clades. Each clade segregated into distinct subclades that included viruses from single or multiple regions, and the S10 genes of some field viruses were identical to those of live-attenuated BTV vaccines. There was no evidence of positive selection on the S10 gene as assessed by reconstruction of ancestral codon states on the phylogeny, rather the functional constraints of the NS3 protein are expressed through substantial negative (purifying) selection.
A candidate gene for choanal atresia in alpaca.

PubMed

Reed, Kent M; Bauer, Miranda M; Mendoza, Kristelle M; Armién, Aníbal G

2010-03-01

Choanal atresia (CA) is a common nasal craniofacial malformation in New World domestic camelids (alpaca and llama). CA results from abnormal development of the nasal passages and is especially debilitating to newborn crias. CA in camelids shares many of the clinical manifestations of a similar condition in humans (CHARGE syndrome). Herein we report on the regulatory gene CHD7 of alpaca, whose homologue in humans is most frequently associated with CHARGE. Sequence of the CHD7 coding region was obtained from a non-affected cria. The complete coding region was 9003 bp, corresponding to a translated amino acid sequence of 3000 aa. Additional genomic sequences corresponding to a significant portion of the CHD7 gene were identified and assembled from the 2x alpaca whole genome sequence, providing confirmatory sequence for much of the CHD7 coding region. The alpaca CHD7 mRNA sequence was 97.9% similar to the human sequence, with the greatest sequence difference being an insertion in exon 38 that results in a polyalanine repeat (A12). Polymorphism in this repeat was tested for association with CA in alpaca by cloning and sequencing the repeat from both affected and non-affected individuals. Variation in length of the poly-A repeat was not associated with CA. Complete sequencing of the CHD7 gene will be necessary to determine whether other mutations in CHD7 are the cause of CA in camelids.
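The relationship between the coding length and the protein length reported above can be checked with simple arithmetic. The sketch below assumes that the 9003 bp figure includes the terminal stop codon, which the abstract does not state explicitly; the function name is purely illustrative.

```python
def protein_length(cds_bp: int, includes_stop: bool = True) -> int:
    """Return the number of amino acids encoded by a coding sequence.

    cds_bp        -- length of the coding sequence in base pairs
    includes_stop -- whether the stop codon is counted in cds_bp (assumption)
    """
    if cds_bp % 3 != 0:
        raise ValueError("coding length must be a multiple of 3")
    codons = cds_bp // 3
    # The stop codon occupies one codon but encodes no amino acid.
    return codons - 1 if includes_stop else codons


print(protein_length(9003))  # 3000
```

Under that assumption, 9003 bp gives 3001 codons, one of which is the stop codon, which is consistent with the 3000 aa translation reported.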
Identification of single-nucleotide polymorphisms of the prion protein gene in sika deer (Cervus nippon laiouanus)

PubMed Central

Jeong, Hyun-Jeong; Lee, Joong-Bok; Park, Seung-Yong; Song, Chang-Seon; Kim, Bo-Sook; Rho, Jung-Rae; Yoo, Mi-Hyun; Jeong, Byung-Hoon; Kim, Yong-Sun

2007-01-01

Polymorphisms of the prion protein gene (PRNP) have been detected in several cervid species. In order to confirm the genetic variations, this study examined the DNA sequences of the PRNP obtained from 33 captive sika deer (Cervus nippon laiouanus) in Korea. A total of three single-nucleotide polymorphisms (SNPs) at codons 100, 136 and 226 in the PRNP of the sika deer were identified. The polymorphic site located at codon 100 has not been reported. The SNPs detected at codons 100 and 226 induced amino acid substitutions. The SNP at codon 136 was a silent mutation that does not induce any amino acid change. The genotype and allele frequencies were determined for each of the SNPs. PMID:17679779
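The distinction drawn above between SNPs that induce an amino acid substitution and a silent SNP can be illustrated with the standard genetic code. This is a minimal sketch: the codons used in the example are hypothetical and are not the sika deer PRNP codons, which the abstract does not give.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(ref_codon: str, alt_codon: str) -> str:
    """Classify a single-base codon change as silent, missense or nonsense."""
    ref_codon, alt_codon = ref_codon.upper(), alt_codon.upper()
    differences = sum(a != b for a, b in zip(ref_codon, alt_codon))
    if len(ref_codon) != 3 or len(alt_codon) != 3 or differences != 1:
        raise ValueError("codons must differ at exactly one position")
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    if ref_aa == alt_aa:
        return "silent"
    if alt_aa == "*":
        return "nonsense"
    return "missense"


# Hypothetical codons chosen only to show each outcome.
print(classify_snp("GCT", "GCC"))  # Ala -> Ala
print(classify_snp("GCT", "GTT"))  # Ala -> Val
```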
Investigation of occult hepatitis B virus infection in anti-HBc positive patients from a liver clinic.

PubMed

Martinez, Maria Carmela; Kok, Chee Choy; Baleriola, Cristina; Robertson, Peter; Rawlinson, William D

2015-01-01

Occult hepatitis B infection (OBI) is manifested by presence of very low levels (<200IU/mL) of Hepatitis B viral DNA (HBV DNA) in the blood and the liver while exhibiting undetectable HBV surface antigen (HBsAg). The molecular mechanisms underlying this occurrence are still not completely understood. This study investigated the prevalence of OBI in a high-risk Australian population and compared the HBV S gene sequences of our cohort with reference sequences. Serum from HBV DNA positive, HBsAg negative, and hepatitis B core antibody (anti-HBc) positive patients (study cohort) were obtained from samples tested at SEALS Serology Laboratory using the Abbott Architect, as part of screening and diagnostic testing. From a total of 228,108 samples reviewed, 1,451 patients were tested for all three OBI markers. Only 10 patients (0.69%) out of the 1,451 patients were found to fit the selection criteria for OBI. Sequence analysis of the HBV S gene from 5 suspected OBI infected patients showed increased sequence variability in the 'a' epitope of the major hydrophilic region compared to reference sequences. In addition, a total of eight consistent nucleotide substitutions resulting in seven amino acid changes were observed, and three patients had truncated S gene sequence. These mutations appeared to be stable and may result in alterations in HBsAg conformation. These may negatively impact the affinity of hepatitis B surface antibody (anti-HBs) and may explain the false negative results in serological HBV diagnosis. These changes may also enable the virus to persist in the liver by evading immune surveillance. Further studies on a bigger cohort are required to determine whether these amino acid variations have been acquired in the process of immune escape and serve as markers of OBI.
Characterization of a new GmFAD3A allele in Brazilian CS303TNKCA soybean cultivar.

PubMed

Silva, Luiz Claudio Costa; Bueno, Rafael Delmond; da Matta, Loreta Buuda; Pereira, Pedro Henrique Scarpelli; Mayrink, Danyelle Barbosa; Piovesan, Newton Deniz; Sediyama, Carlos Sigueyuki; Fontes, Elizabeth Pacheco Batista; Cardinal, Andrea J; Dal-Bianco, Maximiller

2018-05-01

We molecularly characterized a new mutation in the GmFAD3A gene associated with low linolenic content in the Brazilian soybean cultivar CS303TNKCA and developed a molecular marker to select this mutation. Soybean is one of the most important crops cultivated worldwide. Soybean oil has 13% palmitic acid, 4% stearic acid, 20% oleic acid, 55% linoleic acid and 8% linolenic acid. Breeding programs are developing varieties with high oleic and low polyunsaturated fatty acids (linoleic and linolenic) to improve the oil oxidative stability and make the varieties more attractive for the soy industry. The main goal of this study was to characterize the low linolenic acid trait in the CS303TNKCA cultivar. We sequenced the CS303TNKCA GmFAD3A, GmFAD3B and GmFAD3C genes and identified an adenine point deletion in GmFAD3A exon 5 (delA). This alteration creates a premature stop codon, leading to a truncated, non-functional enzyme of just 207 residues. Analysis of enzymatic activity by heterologous expression in yeast supports delA as the cause of low linolenic acid content in CS303TNKCA. Thus, we developed a TaqMan genotyping assay to associate delA with low linolenic acid content in segregating populations. Lines homozygous for delA had a linolenic acid content of 3.3 to 4.4%, and the variation at this locus accounted for 50.83 to 73.70% of the phenotypic variation. This molecular marker is a new tool to introgress the low linolenic acid trait into elite soybean cultivars and can be combined with high oleic trait markers to produce soybean with enhanced economic value. The advantage of using CS303TNKCA compared to other lines available in the literature is that this cultivar has good agronomic characteristics and is adapted to Brazilian conditions.
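How a single-base deletion can shift the reading frame and introduce a premature stop codon, as described for delA, can be sketched on a toy sequence. The sequence and deletion position below are invented for illustration and are not taken from GmFAD3A.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds: str) -> str:
    """Translate from the first base and stop at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3]]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def delete_base(cds: str, index: int) -> str:
    """Return the sequence with the base at a 0-based index removed."""
    return cds[:index] + cds[index + 1:]


# Hypothetical wild-type coding sequence: ATG AAA TTG ACC GGT TAA
wild_type = "ATGAAATTGACCGGTTAA"
# Deleting one adenine shifts the frame: ATG AAT TGA ...
mutant = delete_base(wild_type, 3)

print(translate(wild_type))  # MKLTG
print(translate(mutant))     # MN
```

In this toy case the deletion brings a TGA codon into frame, so translation ends after two residues instead of five.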
Variation in opsin genes correlates with signaling ecology in North American fireflies

PubMed Central

Sander, Sarah E.; Hall, David W.

2015-01-01

Genes underlying signal reception should evolve to maximize signal detection in a particular environment. In animals, opsins, the protein component of visual pigments, are predicted to evolve according to this expectation. Fireflies are known for their bioluminescent mating signals. The eyes of nocturnal species are expected to maximize detection of conspecific signal colors emitted in the typical low-light environment. This is not expected for species that have transitioned to diurnal activity in bright daytime environments. Here we test the hypothesis that opsin gene sequence plays a role in modifying firefly eye spectral sensitivity. We use genome and transcriptome sequencing in four firefly species, transcriptome sequencing in six additional species, and targeted gene sequencing in 28 other species to identify all opsin genes present in North American fireflies and to elucidate amino acid sites under positive selection. We also determine whether amino acid substitutions in opsins are linked to evolutionary changes in signal mode, signal color, and light environment. We find only two opsins, one long wavelength and one ultraviolet, in all firefly species and identify 25 candidate sites that may be involved in determining spectral sensitivity. In addition, we find elevated rates of evolution at transitions to diurnal activity, and changes in selective constraint on LW opsin associated with changes in light environment. Our results suggest that changes in eye spectral sensitivity are at least partially due to opsin sequence. Fireflies continue to be a promising system in which to investigate the evolution of signals, receptors, and signaling environments. PMID:26289828
Solution Structure of Acidocin B, a Circular Bacteriocin Produced by Lactobacillus acidophilus M46

PubMed Central

Acedo, Jeella Z.; van Belkum, Marco J.; Lohans, Christopher T.; McKay, Ryan T.; Miskolzie, Mark

2015-01-01

Acidocin B, a bacteriocin produced by Lactobacillus acidophilus M46, was originally reported to be a linear peptide composed of 59 amino acid residues. However, its high sequence similarity to gassericin A, a circular bacteriocin from Lactobacillus gasseri LA39, suggested that acidocin B might be circular as well. Acidocin B was purified from culture supernatant by a series of hydrophobic interaction chromatographic steps. Its circular nature was ascertained by matrix-assisted laser desorption ionization–time of flight (MALDI-TOF) mass spectrometry and tandem mass spectrometry (MS/MS) sequencing. The peptide sequence was found to consist of 58 amino acids with a molecular mass of 5,621.5 Da. The sequence of the acidocin B biosynthetic gene cluster was also determined and showed high nucleotide sequence similarity to that of gassericin A. The nuclear magnetic resonance (NMR) solution structure of acidocin B in sodium dodecyl sulfate micelles was elucidated, revealing that it is composed of four α-helices of similar length that are folded to form a compact, globular bundle with a central pore. This is a three-dimensional structure for a member of subgroup II circular bacteriocins, which are classified based on their isoelectric points of ∼7 or lower. Comparison of acidocin B with carnocyclin A, a subgroup I circular bacteriocin with four α-helices and a pI of 10, revealed differences in the overall folding. The observed variations could be attributed to inherent diversity in their physical properties, which also required the use of different solvent systems for three-dimensional structural elucidation. PMID:25681186
Identification of Delta5-fatty acid desaturase from the cellular slime mold Dictyostelium discoideum.

PubMed

Saito, T; Ochiai, H

1999-10-01

cDNA fragments putatively encoding amino acid sequences characteristic of the fatty acid desaturase were obtained using expressed sequence tag (EST) information of the Dictyostelium cDNA project. Using this sequence, we have determined the cDNA sequence and genomic sequence of a desaturase. The cloned cDNA is 1489 nucleotides long and the deduced amino acid sequence comprises 464 amino acid residues containing an N-terminal cytochrome b5 domain. The whole sequence was 38.6% identical to the initially identified Delta5-desaturase of Mortierella alpina. We have confirmed its function as a Delta5-desaturase by overexpression in D. discoideum and by gain of function in the yeast Saccharomyces cerevisiae. Analysis of the lipids from transformed D. discoideum and yeast demonstrated the accumulation of Delta5-desaturated products. This is the first report concerning fatty acid desaturase in cellular slime molds.
Intraspecific variation of Centruroides sculpturatus scorpion venom from two regions of Arizona.

PubMed

Carcamo-Noriega, Edson Norberto; Olamendi-Portugal, Timoteo; Restano-Cassulini, Rita; Rowe, Ashlee; Uribe-Romero, Selene Jocelyn; Becerril, Baltazar; Possani, Lourival Domingos

2018-01-15

This study investigated geographic variability in the venom of Centruroides sculpturatus scorpions from different biotopes. Venoms from scorpions collected from two different regions in Arizona, Santa Rita Foothills (SR) and Yarnell (Yar), were analyzed. We found differences between venoms, mainly in the two most abundant peptides, SR (CsEv2e and CsEv1f) and Yar (CsEv2 and CsEv1c), identified as natural variants of CsEv1 and CsEv2. Sequence analyses of these peptides revealed conservative amino acid changes between variants, which may underlie biological activity against arthropods. A third peptide (CsEv6) was highly abundant in the Yar venom compared to the SR venom. CsEv6 is a 67 amino acid peptide with 8 cysteines. CsEv6 did not exhibit toxicity to the three animal models tested. However, both venoms shared similarities in peptides that are predicted to deter predators. For example, both venoms expressed CsEI (lethal to chick) in similar abundance, while CsEd and CsEM1a (toxic to mammals) displayed only moderate variation in their abundance. Electrophysiological evaluation of CsEd and CsEM1a showed that both toxins act on the human sodium-channel subtype 1.6 (hNav 1.6). Complete sequencing revealed that both toxins are structurally similar to beta-toxins isolated from different Centruroides species that also target hNav 1.6. Copyright © 2017 Elsevier Inc. All rights reserved.
New approach to real-time nucleic acids detection: folding polymerase chain reaction amplicons into a secondary structure to improve cleavage of Förster resonance energy transfer probes in 5′-nuclease assays

PubMed Central

Kutyavin, Igor V.

2010-01-01

The article describes a new technology for real-time polymerase chain reaction (PCR) detection of nucleic acids. Similar to Taqman, this new method, named Snake, utilizes the 5′-nuclease activity of Thermus aquaticus (Taq) DNA polymerase that cleaves dual-labeled Förster resonance energy transfer (FRET) probes and generates a fluorescent signal during PCR. However, the mechanism of the probe cleavage in Snake is different. In this assay, PCR amplicons fold into stem–loop secondary structures. Hybridization of FRET probes to one of these structures leads to the formation of optimal substrates for the 5′-nuclease activity of Taq. The stem–loop structures in the Snake amplicons are introduced by the unique design of one of the PCR primers, which carries a special 5′-flap sequence. It was found that at a certain length of these 5′-flap sequences the folded Snake amplicons have very little, if any, effect on PCR yield but benefit many aspects of the detection process, particularly the signal productivity. Unlike Taqman, the Snake system favors the use of short FRET probes with improved fluorescence background. The head-to-head comparison study of Snake and Taqman revealed that these two technologies have more differences than similarities with respect to their responses to changes in PCR protocol, e.g. the variations in primer concentration, annealing time, PCR asymmetry. The optimal PCR protocol for Snake has been identified. The technology’s real-time performance was compared to a number of conventional assays including Taqman, 3′-MGB-Taqman, Molecular Beacon and Scorpion primers. The test trial showed that Snake supersedes the conventional assays in the signal productivity and detection of sequence variations as small as single nucleotide polymorphisms. Due to the assay’s cost-effectiveness and simplicity of design, the technology is anticipated to quickly replace all known conventional methods currently used for real-time nucleic acid detection. PMID:19969535
Composition for nucleic acid sequencing

DOEpatents

Korlach, Jonas [Ithaca, NY]; Webb, Watt W [Ithaca, NY]; Levene, Michael [Ithaca, NY]; Turner, Stephen [Ithaca, NY]; Craighead, Harold G [Ithaca, NY]; Foquet, Mathieu [Ithaca, NY]

2008-08-26

The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
Method for sequencing nucleic acid molecules

DOEpatents

Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

2006-06-06

The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
Method for sequencing nucleic acid molecules

DOEpatents

Korlach, Jonas; Webb, Watt W.; Levene, Michael; Turner, Stephen; Craighead, Harold G.; Foquet, Mathieu

2006-05-30

The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
Dipeptide Sequence Determination: Analyzing Phenylthiohydantoin Amino Acids by HPLC

NASA Astrophysics Data System (ADS)

Barton, Janice S.; Tang, Chung-Fei; Reed, Steven S.

2000-02-01

Amino acid composition and sequence determination, important techniques for characterizing peptides and proteins, are essential for predicting conformation and studying sequence alignment. This experiment presents improved, fundamental methods of sequence analysis for an upper-division biochemistry laboratory. Working in pairs, students use the Edman reagent to prepare phenylthiohydantoin derivatives of amino acids for determination of the sequence of an unknown dipeptide. With a single HPLC technique, students identify both the N-terminal amino acid and the composition of the dipeptide. This method yields good precision of retention times and allows use of a broad range of amino acids as components of the dipeptide. Students learn fundamental principles and techniques of sequence analysis and HPLC.
Analysis of nucleotide diversity among alleles of the major bacterial blight resistance gene Xa27 in cultivars of rice (Oryza sativa) and its wild relatives.

PubMed

Bimolata, Waikhom; Kumar, Anirudh; Sundaram, Raman Meenakshi; Laha, Gouri Shankar; Qureshi, Insaf Ahmed; Reddy, Gajjala Ashok; Ghazi, Irfan Ahmad

2013-08-01

Xa27 is one of the important R-genes, effective against bacterial blight disease of rice caused by Xanthomonas oryzae pv. oryzae (Xoo). Using natural population of Oryza, we analyzed the sequence variation in the functionally important domains of Xa27 across the Oryza species. DNA sequences of Xa27 alleles from 27 rice accessions revealed higher nucleotide diversity among the reported R-genes of rice. Sequence polymorphism analysis revealed synonymous and non-synonymous mutations in addition to a number of InDels in non-coding regions of the gene. High sequence variation was observed in the promoter region including the 5'UTR, with π = 0.00916 and θw = 0.01785. Comparative analysis of the identified Xa27 alleles with that of IRBB27 and IR24 indicated the operation of both positive selection (Ka/Ks > 1) and neutral selection (Ka/Ks ≈ 0). The genetic distances of alleles of the gene from Oryza nivara were nearer to IRBB27 as compared to IR24. We also found the presence of conserved and null UPT (upregulated by transcriptional activator) box in the isolated alleles. Considerable amino acid polymorphism was localized in the trans-membrane domain for which the functional significance is yet to be elucidated. However, the absence of functional UPT box in all the alleles except IRBB27 suggests the maintenance of single resistant allele throughout the natural population.
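The two diversity statistics quoted above have simple per-site definitions: π is the mean number of pairwise differences divided by the alignment length, and Watterson's θw is the number of segregating sites S divided by a_n times the alignment length, where a_n is the sum of 1/i for i from 1 to n-1 and n is the number of sequences. The sketch below computes both on a toy alignment. The sequences are invented, gap-free and of equal length (an assumption), and this is not the software or data used in the study.

```python
from itertools import combinations


def nucleotide_diversity(seqs: list[str]) -> float:
    """Per-site pi: mean pairwise differences divided by alignment length."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    total_diffs = sum(
        sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs
    )
    return total_diffs / (len(pairs) * length)


def watterson_theta(seqs: list[str]) -> float:
    """Per-site Watterson estimator: S / (a_n * L)."""
    n, length = len(seqs), len(seqs[0])
    # A site is segregating if more than one base is observed in its column.
    segregating = sum(len(set(column)) > 1 for column in zip(*seqs))
    a_n = sum(1 / i for i in range(1, n))
    return segregating / (a_n * length)


# Hypothetical aligned alleles, for illustration only.
alignment = ["ATGCATGC", "ATGCATGA", "ATGTATGC", "ATGCATGC"]

print(nucleotide_diversity(alignment))  # 0.125
print(round(watterson_theta(alignment), 5))
```

In this toy alignment there are 6 pairwise differences over 6 pairs and 8 sites, giving π = 0.125, and 2 segregating sites among 4 sequences for θw.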
EGVII endoglucanase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

2014-02-25

The present invention provides a novel endoglucanase nucleic acid sequence, designated egl7, and the corresponding EGVII amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVII, recombinant EGVII proteins and methods for producing the same.
EGVII endoglucanase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

2006-05-16

The present invention provides a novel endoglucanase nucleic acid sequence, designated egl7, and the corresponding EGVII amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVII, recombinant EGVII proteins and methods for producing the same.

EGVI endoglucanase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel [Los Gatos, CA]; Goedegebuur, Frits [Vlaardingen, NL]; Ward, Michael [San Francisco, CA]; Yao, Jian [Sunnyvale, CA]

2008-04-01

The present invention provides a novel endoglucanase nucleic acid sequence, designated egl6, and the corresponding EGVI amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVI, recombinant EGVI proteins and methods for producing the same.
EGVI endoglucanase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

2010-10-12

The present invention provides a novel endoglucanase nucleic acid sequence, designated egl6, and the corresponding EGVI amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVI, recombinant EGVI proteins and methods for producing the same.
EGVIII endoglucanase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

2006-05-23

The present invention provides a novel endoglucanase nucleic acid sequence, designated egl8, and the corresponding EGVIII amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVIII, recombinant EGVIII proteins and methods for producing the same.
EGVI endoglucanase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

2010-10-05

The present invention provides a novel endoglucanase nucleic acid sequence, designated egl6, and the corresponding EGVI amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVI, recombinant EGVI proteins and methods for producing the same.
EGVI endoglucanase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

2006-06-06

The present invention provides a novel endoglucanase nucleic acid sequence, designated egl6, and the corresponding EGVI amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVI, recombinant EGVI proteins and methods for producing the same.
EGVII endoglucanase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel [Los Gatos, CA]; Goedegebuur, Frits [Vlaardingen, NL]; Ward, Michael [San Francisco, CA]; Yao, Jian [Sunnyvale, CA]

2009-05-05

The present invention provides an endoglucanase nucleic acid sequence, designated egl7, and the corresponding EGVII amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVII, recombinant EGVII proteins and methods for producing the same.
EGVII endoglucanase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

2013-07-16

The present invention provides a novel endoglucanase nucleic acid sequence, designated egl7, and the corresponding EGVII amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVII, recombinant EGVII proteins and methods for producing the same.
EGVII endoglucanase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel [Los Gatos, CA]; Goedegebuur, Frits [Vlaardingen, NL]; Ward, Michael [San Francisco, CA]; Yao, Jian [Sunnyvale, CA]

2012-02-14

The present invention provides a novel endoglucanase nucleic acid sequence, designated egl7, and the corresponding EGVII amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVII, recombinant EGVII proteins and methods for producing the same.
EGVII endoglucanase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

2015-04-14

The present invention provides a novel endoglucanase nucleic acid sequence, designated egl7, and the corresponding EGVII amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVII, recombinant EGVII proteins and methods for producing the same.
Kit for detecting nucleic acid sequences using competitive hybridization probes

DOEpatents

Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

2001-01-01

A kit is provided for detecting a target nucleic acid sequence in a sample, the kit comprising: a first hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the first hybridization probe including a first complexing agent for forming a binding pair with a second complexing agent; and a second hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the first hybridization probe does not selectively hybridize, the second hybridization probe including a detectable marker; a third hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a first portion of the target sequence, the third hybridization probe including the same detectable marker as the second hybridization probe; and a fourth hybridization probe which includes a nucleic acid sequence that is sufficiently complementary to selectively hybridize to a second portion of the target sequence to which the third hybridization probe does not selectively hybridize, the fourth hybridization probe including the first complexing agent for forming a binding pair with the second complexing agent; wherein the first and second hybridization probes are capable of simultaneously hybridizing to the target sequence and the third and fourth hybridization probes are capable of simultaneously hybridizing to the target sequence, the detectable marker is not present on the first or fourth hybridization probes and the first, second, third, and fourth hybridization probes each include a competitive nucleic acid sequence which is sufficiently complementary to a third portion of the target sequence that the competitive sequences of the first, second, third, and fourth hybridization probes compete with each other to hybridize to the third portion of the target sequence.
Structural defects and variations in the HIV-1 nef gene from rapid, slow and non-progressor children.

PubMed

Casartelli, Nicoletta; Di Matteo, Gigliola; Argentini, Claudio; Cancrini, Caterina; Bernardi, Stefania; Castelli, Guido; Scarlatti, Gabriella; Plebani, Anna; Rossi, Paolo; Doria, Margherita

2003-06-13

Evaluation of sequence evolution as well as structural defects and mutations of the human immunodeficiency virus-type 1 (HIV-1) nef gene in relation to disease progression in infected children. We examined a large number of nef alleles sequentially derived from perinatally HIV-1-infected children with different rates of disease progression: six non-progressors (NPs), four rapid progressors (RPs), and three slow progressors (SPs). Nef alleles (182 total) were isolated from patients' peripheral blood mononuclear cells (PBMCs), sequenced and analysed for their evolutionary pattern, frequency of mutations and occurrence of amino acid variations associated with different stages of disease. The evolution rate of the nef gene apparently correlated with CD4+ decline in all progression groups. Evidence for rapid viral turnover and positive selection for changes were found only in two SPs and two RPs respectively. In NPs, a higher proportion of disrupted sequences and mutations at various functional motifs were observed. Furthermore, NP-derived Nef proteins were often changed at residues localized in the folded core domain at cytotoxic T lymphocytes (CTL) epitopes (E(105), K(106), E(110), Y(132), K(164), and R(200)), while other residues outside the core domain are more often changed in RPs (A(43)) and SPs (N(173) and Y(214)). Our results suggest a link between nef gene functions and the progression rate in HIV-1-infected children. Moreover, non-progressor-associated variations in the core domain of Nef, together with the genetic analysis, suggest that nef gene evolution is shaped by an effective immune system in these patients.
MipLAAO, a new L-amino acid oxidase from the redtail coral snake Micrurus mipartitus

PubMed Central

2018-01-01

L-amino acid oxidases (LAAOs) are ubiquitous enzymes in nature. Bioactivities described for these enzymes include apoptosis induction, edema formation, induction or inhibition of platelet aggregation, as well as antiviral, antiparasite, and antibacterial actions. With over 80 species, Micrurus snakes are the representatives of the Elapidae family in the New World. Although LAAOs in Micrurus venoms have been predicted by venom gland transcriptomic studies and detected in proteomic studies, no enzymes of this kind have been previously purified from their venoms. Earlier proteomic studies revealed that the venom of M. mipartitus from Colombia contains ∼4% of LAAO. This enzyme, here named MipLAAO, was isolated and biochemically and functionally characterized. The enzyme is found in monomeric form, with an isotope-averaged molecular mass of 59,100.6 Da, as determined by MALDI-TOF. Its oxidase activity shows substrate preference for hydrophobic amino acids, being optimal at pH 8.0. By nucleotide sequencing of venom gland cDNA of mRNA transcripts obtained from a single snake, six isoforms of MipLAAO with minor variations among them were retrieved. The deduced sequences present a mature chain of 483 amino acids, with a predicted pI of 8.9, and theoretical masses between 55,010.9 and 55,121.0 Da. The difference with experimentally observed mass is likely due to glycosylation, in agreement with the finding of three putative N-glycosylation sites in its amino acid sequence. A phylogenetic analysis of MipLAAO placed this new enzyme within the clade of homologous proteins from elapid snakes, characterized by the conserved Serine at position 223, in contrast to LAAOs from viperids. MipLAAO showed a potent bactericidal effect on S. aureus (MIC: 2 µg/mL), but not on E. coli. The former activity could be of interest to future studies assessing its potential as antimicrobial agent. PMID:29900074
MipLAAO, a new L-amino acid oxidase from the redtail coral snake Micrurus mipartitus.

PubMed

Rey-Suárez, Paola; Acosta, Cristian; Torres, Uday; Saldarriaga-Córdoba, Mónica; Lomonte, Bruno; Núñez, Vitelbina

2018-01-01

L-amino acid oxidases (LAAOs) are ubiquitous enzymes in nature. Bioactivities described for these enzymes include apoptosis induction, edema formation, induction or inhibition of platelet aggregation, as well as antiviral, antiparasite, and antibacterial actions. With over 80 species, Micrurus snakes are the representatives of the Elapidae family in the New World. Although LAAOs in Micrurus venoms have been predicted by venom gland transcriptomic studies and detected in proteomic studies, no enzymes of this kind have been previously purified from their venoms. Earlier proteomic studies revealed that the venom of M. mipartitus from Colombia contains ∼4% of LAAO. This enzyme, here named MipLAAO, was isolated and biochemically and functionally characterized. The enzyme is found in monomeric form, with an isotope-averaged molecular mass of 59,100.6 Da, as determined by MALDI-TOF. Its oxidase activity shows substrate preference for hydrophobic amino acids, being optimal at pH 8.0. By nucleotide sequencing of venom gland cDNA of mRNA transcripts obtained from a single snake, six isoforms of MipLAAO with minor variations among them were retrieved. The deduced sequences present a mature chain of 483 amino acids, with a predicted pI of 8.9, and theoretical masses between 55,010.9 and 55,121.0 Da. The difference with experimentally observed mass is likely due to glycosylation, in agreement with the finding of three putative N-glycosylation sites in its amino acid sequence. A phylogenetic analysis of MipLAAO placed this new enzyme within the clade of homologous proteins from elapid snakes, characterized by the conserved Serine at position 223, in contrast to LAAOs from viperids. MipLAAO showed a potent bactericidal effect on S. aureus (MIC: 2 µg/mL), but not on E. coli. The former activity could be of interest to future studies assessing its potential as antimicrobial agent.
Identification of three genotypes of sugarcane yellow leaf virus causing yellow leaf disease from India and their molecular characterization.

PubMed

Viswanathan, R; Balamuralikrishnan, M; Karuppaiah, R

2008-12-01

Sugarcane yellow leaf virus (SCYLV) that causes yellow leaf disease (YLD) in sugarcane (recently reported in India) belongs to Polerovirus. Detailed studies were conducted to characterize the virus based on partial open reading frames (ORFs) 1 and 2 and complete ORFs 3 and 4 sequences in their genome. Reverse-transcriptase polymerase chain reaction (RT-PCR) was performed on 48 sugarcane leaf samples to detect the virus using a specific set of primers. Of the 48 samples, 36 samples (field samples with and without foliar symptoms) including 10 meristem culture derived plants were found to be positive for SCYLV infection. Additionally, an aphid colony collected from symptomatic sugarcane in the field was also found to be SCYLV positive. The amplicons from 22 samples were cloned, sequenced and designated as SCYLV-CB isolates. The nucleotide (nt) and amino acid (aa) sequence comparison showed a significant variation between SCYLV-CB and the database sequences at nt (3.7-5.1%) and aa (3.2-5.3%) sequence level in the CP coding region. However, the database sequences comprising isolates of three reported genotypes, viz., BRA, PER and REU, were observed with least nt and aa sequence dissimilarities (0.0-1.6%). The phylogenetic analyses of the overlapping ORFs (ORF 3 and ORF 4) of SCYLV encoding CP and MP determined in this study and additional sequences of 26 other isolates including an Indian isolate (SCYLV-IND) available from GenBank were distributed in four phylogenetic clusters. The SCYLV-CB isolates from this study grouped into two clusters (C1 and C2) and all the other isolates from the worldwide locations into another two clusters (C3 and C4). The sequence variation of the isolates in this study with the database isolates, even in the least variable region of the SCYLV genome, showed that the population existing in India is significantly different from that in the rest of the world. Further, comparison of partial sequences encoding for ORFs 1 and 2 revealed that YLD in sugarcane in India is caused by at least three genotypes, viz., CUB, IND and BRA-PER, of which a majority of the samples were found infected with Cuban genotype (CUB) and fewer with IND and BRA-PER genotypes. The genotype IND was identified as a new genotype from this study, and this was found to have significant variation with the reported genotypes.
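
The nt and aa percentages quoted in this record are pairwise sequence dissimilarities. As a minimal sketch of how such a figure can be computed from two already-aligned sequences, the following Python function counts mismatching positions and skips gapped columns. The function name, the gap handling, and the toy sequences are assumptions made for illustration; the abstract does not state which software or distance measure the authors used.

```python
def percent_dissimilarity(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Percentage of differing positions between two aligned sequences.

    Columns in which either sequence carries a gap are ignored
    (an assumption; other tools treat gaps differently).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable positions")
    return 100.0 * differing / compared


# Toy aligned fragments, made up for illustration (not SCYLV data).
isolate_1 = "ATGAATACGGGCGCTAACCG"
isolate_2 = "ATGAATACGGGTGCTAACCG"
print(round(percent_dissimilarity(isolate_1, isolate_2), 1))
```

The same function works on aligned amino acid strings, which is how nt and aa dissimilarities can be reported side by side.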
Transcriptome and Proteome Expression Analysis of the Metabolism of Amino Acids by the Fungus Aspergillus oryzae in Fermented Soy Sauce

PubMed Central

Zhao, Guozhong; Yao, Yunping; Wang, Chunling; Tian, Fengwei; Liu, Xiaoming; Hou, Lihua; Yang, Zhen; Zhao, Jianxin; Zhang, Hao

2015-01-01

Amino acids comprise the majority of the flavor compounds in soy sauce. A portion of these amino acids are formed from the biosynthesis and metabolism of the fungus Aspergillus oryzae; however, the metabolic pathways leading to the formation of these amino acids in A. oryzae remain largely unknown. We sequenced the transcriptomes of A. oryzae 100-8 and A. oryzae 3.042 under similar soy sauce fermentation conditions. 2D gel electrophoresis was also used to find some differences in protein expression. We found that many amino acid hydrolases (endopeptidases, aminopeptidases, and X-pro-dipeptidyl aminopeptidase) were expressed at much higher levels (mostly greater than double) in A. oryzae 100-8 than in A. oryzae 3.042. Our results indicated that glutamate dehydrogenase may activate the metabolism of amino acids. We also found that the expression levels of some genes changed simultaneously in the metabolic pathways of tyrosine and leucine and that these conserved genes may modulate the function of the metabolic pathway. Such variation in the metabolic pathways of amino acids is important as it can significantly alter the flavor of fermented soy sauce. PMID:25945335
Transcriptome and Proteome Expression Analysis of the Metabolism of Amino Acids by the Fungus Aspergillus oryzae in Fermented Soy Sauce.

PubMed

Zhao, Guozhong; Yao, Yunping; Wang, Chunling; Tian, Fengwei; Liu, Xiaoming; Hou, Lihua; Yang, Zhen; Zhao, Jianxin; Zhang, Hao; Cao, Xiaohong

2015-01-01

Amino acids comprise the majority of the flavor compounds in soy sauce. A portion of these amino acids are formed from the biosynthesis and metabolism of the fungus Aspergillus oryzae; however, the metabolic pathways leading to the formation of these amino acids in A. oryzae remain largely unknown. We sequenced the transcriptomes of A. oryzae 100-8 and A. oryzae 3.042 under similar soy sauce fermentation conditions. 2D gel electrophoresis was also used to find some differences in protein expression. We found that many amino acid hydrolases (endopeptidases, aminopeptidases, and X-pro-dipeptidyl aminopeptidase) were expressed at much higher levels (mostly greater than double) in A. oryzae 100-8 than in A. oryzae 3.042. Our results indicated that glutamate dehydrogenase may activate the metabolism of amino acids. We also found that the expression levels of some genes changed simultaneously in the metabolic pathways of tyrosine and leucine and that these conserved genes may modulate the function of the metabolic pathway. Such variation in the metabolic pathways of amino acids is important as it can significantly alter the flavor of fermented soy sauce.
Alt a 1 allergen homologs from Alternaria and related taxa: analysis of phylogenetic content and secondary structure.

PubMed

Hong, Soon Gyu; Cramer, Robert A; Lawrence, Christopher B; Pryor, Barry M

2005-02-01

A gene for the Alternaria major allergen, Alt a 1, was amplified from 52 species of Alternaria and related genera, and sequence information was used for phylogenetic study. Alt a 1 gene sequences evolved 3.8 times faster and contained 3.5 times more parsimony-informative sites than glyceraldehyde-3-phosphate dehydrogenase (gpd) sequences. Analyses of Alt a 1 gene and gpd exon sequences strongly supported grouping of Alternaria spp. and related taxa into several species-groups described in previous studies, especially the infectoria, alternata, porri, brassicicola, and radicina species-groups and the Embellisia group. The sonchi species-group was newly suggested in this study. Monophyly of the Nimbya group was moderately supported, and monophyly of the Ulocladium group was weakly supported. Relationships among species-groups and among closely related species of the same species-group were not fully resolved. However, higher resolution could be obtained using Alt a 1 sequences or a combined dataset than using gpd sequences alone. Despite high levels of variation in amino acid sequences, results of in silico prediction of protein secondary structure for Alt a 1 demonstrated a high degree of structural similarity for most of the species suggesting a conservation of function.
Chip-based sequencing nucleic acids

DOEpatents

Beer, Neil Reginald

2014-08-26

A system for fast DNA sequencing by amplification of genetic material within microreactors, denaturing, demulsifying, and then sequencing the material, while retaining it in a PCR/sequencing zone by a magnetic field. One embodiment includes sequencing nucleic acids on a microchip that includes a microchannel flow channel in the microchip. The nucleic acids are isolated and hybridized to magnetic nanoparticles or to magnetic polystyrene-coated beads. Microreactor droplets are formed in the microchannel flow channel. The microreactor droplets containing the nucleic acids and the magnetic nanoparticles are retained in a magnetic trap in the microchannel flow channel and sequenced.
"De-novo" amino acid sequence elucidation of protein G'e by combined "top-down" and "bottom-up" mass spectrometry.

PubMed

Yefremova, Yelena; Al-Majdoub, Mahmoud; Opuni, Kwabena F M; Koy, Cornelia; Cui, Weidong; Yan, Yuetian; Gross, Michael L; Glocker, Michael O

2015-03-01

Mass spectrometric de-novo sequencing was applied to review the amino acid sequence of a commercially available recombinant protein G' with great scientific and economic importance. Substantial deviations to the published amino acid sequence (Uniprot Q54181) were found by the presence of 46 additional amino acids at the N-terminus, including a so-called "His-tag" as well as an N-terminal partial α-N-gluconoylation and α-N-phosphogluconoylation, respectively. The unexpected amino acid sequence of the commercial protein G' comprised 241 amino acids and resulted in a molecular mass of 25,998.9 ± 0.2 Da for the unmodified protein. Due to the higher mass that is caused by its extended amino acid sequence compared with the original protein G' (185 amino acids), we named this protein "protein G'e." By means of mass spectrometric peptide mapping, the suggested amino acid sequence, as well as the N-terminal partial α-N-gluconoylations, was confirmed with 100% sequence coverage. After the protein G'e sequence was determined, we were able to determine the expression vector pET-28b from Novagen with the Xho I restriction enzyme cleavage site as the best option that was used for cloning and expressing the recombinant protein G'e in E. coli. A dissociation constant (K(d)) value of 9.4 nM for protein G'e was determined thermophoretically, showing that the N-terminal flanking sequence extension did not cause significant changes in the binding affinity to immunoglobulins.
Seq2Logo: a method for construction and visualization of amino acid binding motifs and sequence profiles including sequence weighting, pseudo counts and two-sided representation of amino acid enrichment and depletion

PubMed Central

Thomsen, Martin Christen Frølund; Nielsen, Morten

2012-01-01

Seq2Logo is a web-based sequence logo generator. Sequence logos are a graphical representation of the information content stored in a multiple sequence alignment (MSA) and provide a compact and highly intuitive representation of the position-specific amino acid composition of binding motifs, active sites, etc. in biological sequences. Accurate generation of sequence logos is often compromised by sequence redundancy and low number of observations. Moreover, most methods available for sequence logo generation focus on displaying the position-specific enrichment of amino acids, discarding the equally valuable information related to amino acid depletion. Seq2Logo aims at resolving these issues allowing the user to include sequence weighting to correct for data redundancy, pseudo counts to correct for low number of observations and different logotype representations each capturing different aspects related to amino acid enrichment and depletion. Besides allowing input in the format of peptides and MSA, Seq2Logo accepts input as Blast sequence profiles, providing easy access for non-expert end-users to characterize and identify functionally conserved/variable amino acids in any given protein of interest. The output from the server is a sequence logo and a PSSM. Seq2Logo is available at http://www.cbs.dtu.dk/biotools/Seq2Logo (14 May 2012, date last accessed). PMID:22638583
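
To make the idea of per-position information content and pseudo counts concrete, here is a minimal Python sketch that computes the Shannon information (in bits) of each column of a protein alignment, with a flat pseudo count added to every amino acid. This is not Seq2Logo's own algorithm: the flat pseudo count, the uniform background, the absence of sequence weighting, and all function names are simplifying assumptions chosen for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_information(column: str, pseudocount: float = 1.0) -> float:
    """Shannon information content (bits) of one alignment column.

    A flat pseudocount is added to every amino acid (a simplification),
    and the background is assumed uniform over the 20 amino acids.
    Gaps and unknown symbols are ignored.
    """
    counts = Counter(c for c in column.upper() if c in AMINO_ACIDS)
    total = sum(counts.values()) + pseudocount * len(AMINO_ACIDS)
    if total == 0:
        raise ValueError("column has no observations and no pseudocount")
    entropy = 0.0
    for aa in AMINO_ACIDS:
        p = (counts[aa] + pseudocount) / total
        if p > 0:
            entropy -= p * math.log2(p)
    return math.log2(len(AMINO_ACIDS)) - entropy


def alignment_information(sequences: list[str], pseudocount: float = 1.0) -> list[float]:
    """Information content for every column of an equal-length alignment."""
    if len({len(s) for s in sequences}) != 1:
        raise ValueError("all sequences must have the same length")
    return [column_information("".join(col), pseudocount) for col in zip(*sequences)]


# Toy peptides, made up for illustration.
peptides = ["ALAKV", "ALSKV", "ALTRV", "ALAKI"]
print([round(bits, 2) for bits in alignment_information(peptides)])
```

With only four sequences, the pseudo count pulls every column well below the 4.32-bit maximum, which illustrates why a correction for low numbers of observations changes the height of a logo.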

Standing of nucleic acid testing strategies in veterinary diagnosis laboratories to uncover Mycobacterium tuberculosis complex members

PubMed Central

Costa, Pedro; Botelho, Ana; Couto, Isabel; Viveiros, Miguel; Inácio, João

2014-01-01

Nucleic acid testing (NAT) designates any molecular approach used for the detection, identification, and characterization of pathogenic microorganisms, enabling the rapid, specific, and sensitive diagnosis of infectious diseases, such as tuberculosis. These assays have been widely used since the 1990s in human clinical laboratories and, subsequently, also in veterinary diagnostics. Most NAT strategies are based on the polymerase chain reaction (PCR) and its several enhancements and variations. From the conventional PCR, real-time PCR and its combinations, isothermal DNA amplification, to the nanotechnologies, here we review how the NAT assays have been applied to decipher if and which member of the Mycobacterium tuberculosis complex is present in a clinical sample. Recent advances in DNA sequencing also brought new challenges and have made it possible to generate large amounts of sequence data rapidly and at low cost. This revolution with the high-throughput sequencing (HTS) technologies makes whole genome sequencing (WGS) and metagenomics the trendiest NAT strategies today. The ranking of NAT techniques in the field of clinical diagnostics is rising, and we provide a SWOT (Strengths, Weaknesses, Opportunities, and Threats) analysis with our view of the use of molecular diagnostics for detecting tuberculosis in veterinary laboratories, notwithstanding that the gold standard is still the classical culture of the agent. The complementary use of both classical and molecular diagnostic approaches is recommended to speed diagnosis, enabling a fast decision by competent authorities and rapid tackling of the disease. PMID:25988157
Standing of nucleic acid testing strategies in veterinary diagnosis laboratories to uncover Mycobacterium tuberculosis complex members.

PubMed

Costa, Pedro; Botelho, Ana; Couto, Isabel; Viveiros, Miguel; Inácio, João

2014-01-01

Nucleic acid testing (NAT) designates any molecular approach used for the detection, identification, and characterization of pathogenic microorganisms, enabling the rapid, specific, and sensitive diagnosis of infectious diseases, such as tuberculosis. These assays have been widely used since the 1990s in human clinical laboratories and, subsequently, also in veterinary diagnostics. Most NAT strategies are based on the polymerase chain reaction (PCR) and its several enhancements and variations. From the conventional PCR, real-time PCR and its combinations, isothermal DNA amplification, to the nanotechnologies, here we review how the NAT assays have been applied to decipher if and which member of the Mycobacterium tuberculosis complex is present in a clinical sample. Recent advances in DNA sequencing also brought new challenges and have made it possible to generate large amounts of sequence data rapidly and at low cost. This revolution with the high-throughput sequencing (HTS) technologies makes whole genome sequencing (WGS) and metagenomics the trendiest NAT strategies today. The ranking of NAT techniques in the field of clinical diagnostics is rising, and we provide a SWOT (Strengths, Weaknesses, Opportunities, and Threats) analysis with our view of the use of molecular diagnostics for detecting tuberculosis in veterinary laboratories, notwithstanding that the gold standard is still the classical culture of the agent. The complementary use of both classical and molecular diagnostic approaches is recommended to speed diagnosis, enabling a fast decision by competent authorities and rapid tackling of the disease.
VWF mutations and new sequence variations identified in healthy controls are more frequent in the African-American population.

PubMed

Bellissimo, Daniel B; Christopherson, Pamela A; Flood, Veronica H; Gill, Joan Cox; Friedman, Kenneth D; Haberichter, Sandra L; Shapiro, Amy D; Abshire, Thomas C; Leissinger, Cindy; Hoots, W Keith; Lusher, Jeanne M; Ragni, Margaret V; Montgomery, Robert R

2012-03-01

Diagnosis and classification of VWD is aided by molecular analysis of the VWF gene. Because VWF polymorphisms have not been fully characterized, we performed VWF laboratory testing and gene sequencing of 184 healthy controls with a negative bleeding history. The controls included 66 (35.9%) African Americans (AAs). We identified 21 new sequence variations, 13 (62%) of which occurred exclusively in AAs and 2 (G967D, T2666M) that were found in 10%-15% of the AA samples, suggesting they are polymorphisms. We identified 14 sequence variations reported previously as VWF mutations, the majority of which were type 1 mutations. These controls had VWF Ag levels within the normal range, suggesting that these sequence variations might not always reduce plasma VWF levels. Eleven mutations were found in AAs, and the frequency of M740I, H817Q, and R2185Q was 15%-18%. Ten AA controls had the 2N mutation H817Q; 1 was homozygous. The average factor VIII level in this group was 99 IU/dL, suggesting that this variation may confer little or no clinical symptoms. This study emphasizes the importance of sequencing healthy controls to understand ethnic-specific sequence variations so that asymptomatic sequence variations are not misidentified as mutations in other ethnic or racial groups.
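
The percentages in this record can be reproduced with simple arithmetic. The Python sketch below recomputes them from the counts given in the abstract. Reading the H817Q figure as a carrier frequency (10 of 66 African-American controls) rather than an allele frequency is an assumption; the allele-based value is shown alongside for comparison, and all variable names are illustrative.

```python
def percent(part: float, whole: float) -> float:
    """Return part/whole expressed as a percentage."""
    return 100.0 * part / whole


# Counts taken from the abstract.
controls_total = 184
aa_controls = 66
new_variations = 21
new_variations_aa_only = 13
h817q_carriers = 10      # AA controls carrying H817Q
h817q_homozygous = 1     # of those carriers, homozygous

aa_share = percent(aa_controls, controls_total)               # share of AA controls
aa_only_share = percent(new_variations_aa_only, new_variations)

# Assumption: the 15%-18% "frequency" is the fraction of AA controls
# carrying the variant at least once.
carrier_frequency = percent(h817q_carriers, aa_controls)

# Allele frequency: each control has two alleles; a homozygote contributes two.
h817q_alleles = (h817q_carriers - h817q_homozygous) + 2 * h817q_homozygous
allele_frequency = percent(h817q_alleles, 2 * aa_controls)

print(round(aa_share, 1), round(aa_only_share), round(carrier_frequency, 1),
      round(allele_frequency, 1))
```

Under this reading, the carrier frequency for H817Q falls at the lower end of the reported 15%-18% range, while the allele frequency is roughly half of that.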
Non-coding RNA sequence variations in human chronic lymphocytic leukemia and colorectal cancer.

PubMed

Wojcik, Sylwia E; Rossi, Simona; Shimizu, Masayoshi; Nicoloso, Milena S; Cimmino, Amelia; Alder, Hansjuerg; Herlea, Vlad; Rassenti, Laura Z; Rai, Kanti R; Kipps, Thomas J; Keating, Michael J; Croce, Carlo M; Calin, George A

2010-02-01

Cancer is a genetic disease in which the interplay between alterations in protein-coding genes and non-coding RNAs (ncRNAs) plays a fundamental role. In recent years, the full coding component of the human genome was sequenced in various cancers, whereas such attempts related to ncRNAs are still fragmentary. We screened genomic DNAs for sequence variations in 148 microRNAs (miRNAs) and ultraconserved regions (UCRs) loci in patients with chronic lymphocytic leukemia (CLL) or colorectal cancer (CRC) by Sanger technique and further tried to elucidate the functional consequences of some of these variations. We found sequence variations in miRNAs in both sporadic and familial CLL cases, mutations of UCRs in CLLs and CRCs and, in certain instances, detected functional effects of these variations. Furthermore, by integrating our data with previously published data on miRNA sequence variations, we have created a catalog of DNA sequence variations in miRNAs/ultraconserved genes in human cancers. These findings argue that ncRNAs are targeted by both germ line and somatic mutations as well as by single-nucleotide polymorphisms with functional significance for human tumorigenesis. Sequence variations in ncRNA loci are frequent and some have functional and biological significance. Such information can be exploited to further investigate on a genome-wide scale the frequency of genetic variations in ncRNAs and their functional meaning, as well as for the development of new diagnostic and prognostic markers for leukemias and carcinomas.
Non-coding RNA sequence variations in human chronic lymphocytic leukemia and colorectal cancer

PubMed Central

Wojcik, Sylwia E.; Rossi, Simona; Shimizu, Masayoshi; Nicoloso, Milena S.; Cimmino, Amelia; Alder, Hansjuerg; Herlea, Vlad; Rassenti, Laura Z.; Rai, Kanti R.; Kipps, Thomas J.; Keating, Michael J.

2010-01-01

Cancer is a genetic disease in which the interplay between alterations in protein-coding genes and non-coding RNAs (ncRNAs) plays a fundamental role. In recent years, the full coding component of the human genome was sequenced in various cancers, whereas such attempts related to ncRNAs are still fragmentary. We screened genomic DNAs for sequence variations in 148 microRNAs (miRNAs) and ultraconserved regions (UCRs) loci in patients with chronic lymphocytic leukemia (CLL) or colorectal cancer (CRC) by Sanger technique and further tried to elucidate the functional consequences of some of these variations. We found sequence variations in miRNAs in both sporadic and familial CLL cases, mutations of UCRs in CLLs and CRCs and, in certain instances, detected functional effects of these variations. Furthermore, by integrating our data with previously published data on miRNA sequence variations, we have created a catalog of DNA sequence variations in miRNAs/ultraconserved genes in human cancers. These findings argue that ncRNAs are targeted by both germ line and somatic mutations as well as by single-nucleotide polymorphisms with functional significance for human tumorigenesis. Sequence variations in ncRNA loci are frequent and some have functional and biological significance. Such information can be exploited to further investigate on a genome-wide scale the frequency of genetic variations in ncRNAs and their functional meaning, as well as for the development of new diagnostic and prognostic markers for leukemias and carcinomas. PMID:19926640
Comparative characterization of random-sequence proteins consisting of 5, 12, and 20 kinds of amino acids

PubMed Central

Tanaka, Junko; Doi, Nobuhide; Takashima, Hideaki; Yanagawa, Hiroshi

2010-01-01

Screening of functional proteins from a random-sequence library has been used to evolve novel proteins in the field of evolutionary protein engineering. However, random-sequence proteins consisting of the 20 natural amino acids tend to aggregate, and the occurrence rate of functional proteins in a random-sequence library is low. From the viewpoint of the origin of life, it has been proposed that primordial proteins consisted of a limited set of amino acids that could have been abundantly formed early during chemical evolution. We have previously found that members of a random-sequence protein library constructed with five primitive amino acids show high solubility (Doi et al., Protein Eng Des Sel 2005;18:279–284). Although such a library is expected to be appropriate for finding functional proteins, the functionality may be limited, because they have no positively charged amino acid. Here, we constructed three libraries of 120-amino acid, random-sequence proteins using alphabets of 5, 12, and 20 amino acids by preselection using mRNA display (to eliminate sequences containing stop codons and frameshifts) and characterized and compared the structural properties of random-sequence proteins arbitrarily chosen from these libraries. We found that random-sequence proteins constructed with the 12-member alphabet (including five primitive amino acids and positively charged amino acids) have higher solubility than those constructed with the 20-member alphabet, though other biophysical properties are very similar in the two libraries. Thus, a library of moderate complexity constructed from 12 amino acids may be a more appropriate resource for functional screening than one constructed from 20 amino acids. PMID:20162614
Assessment for Melting Temperature Measurement of Nucleic Acid by HRM.

PubMed

Wang, Jing; Pan, Xiaoming; Liang, Xingguo

2016-01-01

High resolution melting (HRM), which is sensitive enough to distinguish nucleic acid species with small variations, has been widely applied in mutation scanning, methylation analysis, and genotyping. To extend HRM to evaluating the sequence dependence of the thermal stability of nucleic acid secondary structures, we investigated the effects of the EvaGreen dye, metal ions, and impurities (such as dNTPs) on melting temperature (Tm) measurement by HRM. The accuracy of HRM was assessed by comparison with the UV melting method, and little difference between the two methods was found when the DNA Tm was higher than 40°C. Both insufficient and excessive EvaGreen gave a slightly higher Tm, showing that the proportion of dye should be considered for precise Tm measurement of nucleic acids. Finally, the HRM method was also successfully used to measure the Tm values of a DNA triplex, a hairpin, and an RNA duplex. In conclusion, HRM can be applied to evaluate the thermal stability of nucleic acids (DNA or RNA) or secondary structural elements (even when dNTPs are present).
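The abstract does not say how Tm is read off a melting curve. A common convention, assumed here, is to take the temperature at which the negative derivative of fluorescence with respect to temperature peaks. The function name and the synthetic curve below are illustrative only and are not taken from the study.

```python
import math

def melting_temperature(temps, fluorescence):
    """Estimate Tm as the temperature where -dF/dT is largest.

    Uses a central difference at each interior point. This is an assumed,
    generic approach, not the procedure reported by the authors.
    """
    if len(temps) != len(fluorescence) or len(temps) < 3:
        raise ValueError("need two equal-length series with at least 3 points")
    best_temp, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        # Negative slope: fluorescence drops as the duplex melts.
        slope = -(fluorescence[i + 1] - fluorescence[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_temp, best_slope = temps[i], slope
    return best_temp

# Synthetic sigmoidal melting curve centred on 70 degrees C (made-up data).
temps = list(range(50, 91))
signal = [1.0 / (1.0 + math.exp((t - 70) / 2.0)) for t in temps]
print(melting_temperature(temps, signal))
```

Real HRM traces are noisy, so in practice the curve would usually be smoothed before differentiation.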
Genetic diversity of HA1 domain of hemagglutinin gene of influenza A(H1N1)pdm09 in Tunisia

PubMed Central

2013-01-01

We present major results concerning isolation and determination of the nucleotide sequence of hemagglutinin (HA1) of the pandemic (H1N1)pdm09 influenza viruses found in Tunisia. Amino acid analysis revealed minor amino acid changes in the antigenic or receptor-binding domains. We found mutations that were also present in the 1918 pandemic virus, including S183P in 4 and S185T in 19 of the 27 viruses analyzed from 2011, while none of the 2009 viruses carried these mutations. Two specific amino acid differences in N-glycosylation sites (N288T and N276H) were also detected. The phylogenetic analysis revealed that the majority of the Tunisian isolates clustered with clade A/St. Petersburg/27/2011 viruses characterized by D97N and S185T mutations. However, it also revealed a trend for 2010 strains to accumulate amino acid variation and form a new phylogenetic clade with three specific amino acid substitutions: V47I, E172K and K308E. PMID:23679923
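Substitution labels such as S183P combine the reference residue, the position, and the variant residue. A minimal sketch of how such labels can be derived from two pre-aligned, equal-length protein sequences follows. The function name, the toy sequences, and the numbering offset are assumptions for illustration; the numbering scheme used in the study is not stated in the abstract.

```python
def list_substitutions(reference, variant, start=1):
    """Return labels like 'S183P' for every position where two aligned
    protein sequences differ. `start` is the residue number assigned to
    the first character (hypothetical numbering)."""
    if len(reference) != len(variant):
        raise ValueError("sequences must be aligned to the same length")
    return [
        f"{ref}{pos}{var}"
        for pos, (ref, var) in enumerate(zip(reference, variant), start=start)
        if ref != var
    ]

# Toy three-residue fragment numbered from position 182 (made-up data).
print(list_substitutions("KSS", "KPT", start=182))
```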
DOE Office of Scientific and Technical Information (OSTI.GOV)

Reiser, Steven E.; Somerville, Chris R.

The present invention relates to bacterial enzymes, in particular to an acyl-CoA reductase and a gene encoding an acyl-CoA reductase, the amino acid and nucleic acid sequences corresponding to the reductase polypeptide and gene, respectively, and to methods of obtaining such enzymes, amino acid sequences and nucleic acid sequences. The invention also relates to the use of such sequences to provide transgenic host cells capable of producing fatty alcohols and fatty aldehydes.
Identification of canine parvovirus with the Q370R point mutation in the VP2 gene from a giant panda (Ailuropoda melanoleuca).

PubMed

Guo, Ling; Yang, Shao-lin; Chen, Shi-jie; Zhang, Zhihe; Wang, Chengdong; Hou, Rong; Ren, Yupeng; Wen, Xintian; Cao, Sanjie; Guo, Wanzhu; Hao, Zhongxiang; Quan, Zifang; Zhang, Manli; Yan, Qi-gui

2013-05-26

In this study, we sequenced and phylogenetically analyzed the VP2 genes from twelve canine parvovirus (CPV) strains obtained from eleven domestic dogs and a giant panda (Ailuropoda melanoleuca) in China. A novel CPV was detected in the giant panda. Nucleotide and phylogenetic analysis of the capsid protein VP2 gene classified the CPV as a new CPV-2a type. Substitution of Arg for Gln at the conserved residue 370 (Q370R) is an unusual variation in the new CPV-2a amino acid sequence of the giant panda and is further evidence for the continuing evolution of the virus. These findings extend the knowledge of CPV molecular epidemiology, which is of particular relevance to wild carnivores.
Genetic Variation in FABP4 and Evaluation of Its Effects on Beef Cattle Fat Content.

PubMed

Goszczynski, Daniel E; Papaleo-Mazzucco, Juliana; Ripoli, María V; Villarreal, Edgardo L; Rogberg-Muñoz, Andrés; Mezzadra, Carlos A; Melucci, Lilia M; Giovambattista, Guillermo

2017-07-03

FABP4 is a protein primarily expressed in adipocytes and macrophages that plays a key role in fatty acid trafficking and lipid hydrolysis. FABP4 gene polymorphisms have been associated with meat quality traits in cattle, mostly in Asian breeds under feedlot conditions. The objectives of this work were to characterize FABP4 genetic variation in several worldwide cattle breeds and evaluate possible genotype effects on fat content in a pasture-fed crossbred (Angus-Hereford-Limousin) population. We re-sequenced 43 unrelated animals from nine cattle breeds (Angus, Brahman, Creole, Hereford, Holstein, Limousin, Nelore, Shorthorn, and Wagyu) and obtained 22 single nucleotide polymorphisms (SNPs) over 3,164 bp, including four novel polymorphisms. Haplotypes and linkage disequilibrium analyses showed a high variability. Five SNPs were selected to perform validation and association studies in our crossbred population. Four SNPs showed well-balanced allele frequencies (minor frequency > 0.159), and three showed no significant deviations from Hardy-Weinberg proportions. SNPs showed significant effects on backfat thickness and fatty acid composition (P < 0.05). The protein structure of one of the missense SNPs was analyzed to elucidate its possible effect on fat content in our studied population. Our results revealed a possible blockage of the fatty acid binding site by the missense mutation.
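For readers unfamiliar with the two SNP quality checks mentioned above, the sketch below computes a minor allele frequency and a chi-square statistic for deviation from Hardy-Weinberg proportions from genotype counts at one biallelic SNP. The function name and the counts are hypothetical; the abstract does not report genotype counts or say which test the authors used.

```python
def hardy_weinberg_chi2(n_aa, n_ab, n_bb):
    """Return (minor allele frequency, chi-square statistic) for one
    biallelic SNP given counts of the AA, AB and BB genotypes."""
    n = n_aa + n_ab + n_bb
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_aa + n_ab) / (2 * n)          # frequency of allele A
    q = 1.0 - p                              # frequency of allele B
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    return min(p, q), chi2

# Hypothetical counts: a heterozygote deficit gives a large statistic.
maf, chi2 = hardy_weinberg_chi2(40, 20, 40)
print(maf, chi2)
```

With one degree of freedom, a statistic above about 3.84 corresponds to P < 0.05.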
BGL7 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Ward, Michael

2013-01-29

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl7, and the corresponding BGL7 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL7, recombinant BGL7 proteins and methods for producing the same.
BGL6 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Ward, Michael

2012-10-02

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl6, and the corresponding BGL6 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL6, recombinant BGL6 proteins and methods for producing the same.
BGL5 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

2006-02-28

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl5, and the corresponding BGL5 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL5, recombinant BGL5 proteins and methods for producing the same.
BGL5 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel [Los Gatos, CA]; Goedegebuur, Frits [Vlaardingen, NL]; Ward, Michael [San Francisco, CA]; Yao, Jian [Sunnyvale, CA]

2008-03-18

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl5, and the corresponding BGL5 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL5, recombinant BGL5 proteins and methods for producing the same.
BGL6 beta-glucosidase and nucleic acids encoding the same

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dunn-Coleman, Nigel; Ward, Michael

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl6, and the corresponding BGL6 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL6, recombinant BGL6 proteins and methods for producing the same.
BGL6 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Ward, Michael

2014-03-04

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl6, and the corresponding BGL6 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL6, recombinant BGL6 proteins and methods for producing the same.
BGL7 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Ward, Michael

2015-04-14

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl7, and the corresponding BGL7 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL7, recombinant BGL7 proteins and methods for producing the same.
BGL7 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Ward, Michael

2014-03-25

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl7, and the corresponding BGL7 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL7, recombinant BGL7 proteins and methods for producing the same.
BGL6 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Ward, Michael

2015-08-11

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl6, and the corresponding BGL6 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL6, recombinant BGL6 proteins and methods for producing the same.

BGL3 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

2007-09-25

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl3, and the corresponding BGL3 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL3, recombinant BGL3 proteins and methods for producing the same.
BGL3 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel [Los Gatos, CA]; Goedegebuur, Frits [Vlaardingen, NL]; Ward, Michael [San Francisco, CA]; Yao, Jian [Sunnyvale, CA]

2008-04-01

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl3, and the corresponding BGL3 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL3, recombinant BGL3 proteins and methods for producing the same.
BGL4 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel [Los Gatos, CA]; Goedegebuur, Frits [Vlaardingen, NL]; Ward, Michael [San Francisco, CA]; Yao, Jian [Sunnyvale, CA]

2011-12-06

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl4, and the corresponding BGL4 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL4, recombinant BGL4 proteins and methods for producing the same.
BGL4 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

2006-05-16

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl4, and the corresponding BGL4 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL4, recombinant BGL4 proteins and methods for producing the same.
BGL3 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel [Los Gatos, CA]; Goedegebuur, Frits [Vlaardingen, NL]; Ward, Michael [San Francisco, CA]; Yao, Jian [Sunnyvale, CA]

2011-06-14

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl3, and the corresponding BGL3 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL3, recombinant BGL3 proteins and methods for producing the same.
BGL6 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel [Los Gatos, CA]; Ward, Michael [San Francisco, CA]

2009-09-01

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl6, and the corresponding BGL6 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL6, recombinant BGL6 proteins and methods for producing the same.
BGL3 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

2012-10-30

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl3, and the corresponding BGL3 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL3, recombinant BGL3 proteins and methods for producing the same.
BGL4 beta-glucosidase and nucleic acids encoding the same

DOEpatents

Dunn-Coleman, Nigel [Los Gatos, CA]; Goedegebuur, Frits [Vlaardingen, NL]; Ward, Michael [San Francisco, CA]; Yao, Jian [Sunnyvale, CA]

2008-01-22

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl4, and the corresponding BGL4 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL4, recombinant BGL4 proteins and methods for producing the same.
Comparative genomics of citric-acid producing Aspergillus niger ATCC 1015 versus enzyme-producing CBS 513.88

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grigoriev, Igor V.; Baker, Scott E.; Andersen, Mikael R.

2011-04-28

The filamentous fungus Aspergillus niger exhibits great diversity in its phenotype. It is found globally, both as marine and terrestrial strains, produces both organic acids and hydrolytic enzymes in high amounts, and some isolates exhibit pathogenicity. Although the genome of an industrial enzyme-producing A. niger strain (CBS 513.88) has already been sequenced, the versatility and diversity of this species compel additional exploration. We therefore undertook whole genome sequencing of the acidogenic A. niger wild type strain (ATCC 1015), and produced a genome sequence of very high quality. Only 15 gaps are present in the sequence and half the telomeric regions have been elucidated. Moreover, sequence information from ATCC 1015 was utilized to improve the genome sequence of CBS 513.88. Chromosome-level comparisons uncovered several genome rearrangements, deletions, a clear case of strain-specific horizontal gene transfer, and identification of 0.8 megabase of novel sequence. Single nucleotide polymorphisms per kilobase (SNPs/kb) between the two strains were found to be exceptionally high (average: 7.8, maximum: 160 SNPs/kb). High variation within the species was confirmed with exo-metabolite profiling and phylogenetics. Detailed lists of alleles were generated, and genotypic differences were observed to accumulate in metabolic pathways essential to acid production and protein synthesis. A transcriptome analysis revealed up-regulation of the electron transport chain, specifically the alternative oxidative pathway in ATCC 1015, while CBS 513.88 showed significant up-regulation of genes relevant to glucoamylase A production, such as tRNA-synthases and protein transporters.
Our results and datasets from this integrative systems biology analysis resulted in a snapshot of fungal evolution and will support further optimization of cell factories based on filamentous fungi. [Supplemental materials (10 figures, three text documents and 16 tables) have been made available. The whole genome sequence for A. niger ATCC 1015 is available from NCBI under acc. no ACJE00000000. The updated sequence for A. niger CBS 513.88 is available from EMBL under acc. no AM269948-AM270415. The sequence data from the phylogeny study has been submitted to NCBI (GU296686-296739). Microarray data from this study is submitted to GEO as series GSE10983. Accession for reviewers is possible through: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi token GSE10983] The dsmM_ANIGERa_coll511030F library and platform information is deposited at GEO under number GPL6758.
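The SNPs/kb figures quoted in this record are densities of single-base differences per 1,000 compared positions. The sketch below shows one way such a density could be computed in fixed windows over two pre-aligned sequences. The function name, the window size, and the rule of skipping gaps and ambiguous bases are assumptions; the abstract does not describe the authors' pipeline.

```python
def snps_per_kb(aligned_a, aligned_b, window=1000):
    """Return SNP density (differences per 1,000 compared bases) for each
    consecutive window of two aligned DNA sequences. Columns containing
    gaps or ambiguous bases are not counted (an assumed convention)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    densities = []
    for start in range(0, len(aligned_a), window):
        compared = mismatches = 0
        for x, y in zip(aligned_a[start:start + window], aligned_b[start:start + window]):
            if x in "ACGT" and y in "ACGT":
                compared += 1
                if x != y:
                    mismatches += 1
        if compared:
            densities.append(1000.0 * mismatches / compared)
    return densities

# Made-up 2 kb example: five differences in the first window, none in the second.
strain_a = "A" * 2000
strain_b = "A" * 995 + "C" * 5 + "A" * 1000
print(snps_per_kb(strain_a, strain_b))
```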
Lactobacillus kefiri shows inter-strain variations in the amino acid sequence of the S-layer proteins.

PubMed

Malamud, Mariano; Carasi, Paula; Bronsoms, Sílvia; Trejo, Sebastián A; Serradell, María de Los Angeles

2017-04-01

The S-layer is a proteinaceous envelope constituted by subunits that self-assemble to form a two-dimensional lattice that covers the surface of different species of Bacteria and Archaea, and it could be involved in cell recognition of microbes among other several distinct functions. In this work, both proteomic and genomic approaches were used to gain knowledge about the sequences of the S-layer protein (SLPs) encoding genes expressed by six aggregative and sixteen non-aggregative strains of potentially probiotic Lactobacillus kefiri. Peptide mass fingerprint (PMF) analysis confirmed the identity of SLPs extracted from L. kefiri, and based on the homology with phylogenetically related species, primers located outside and inside the SLP-genes were employed to amplify genomic DNA. The O-glycosylation site SASSAS was found in all L. kefiri SLPs. Ten strains were selected for sequencing of the complete genes. The total length of the mature proteins varies from 492 to 576 amino acids, and all SLPs have a calculated pI between 9.37 and 9.60. The N-terminal region is relatively conserved and shows a high percentage of positively charged amino acids. Major differences among strains are found in the C-terminal region. Different groups could be distinguished regarding the mature SLPs and the similarities observed in the PMF spectra. Interestingly, SLPs of the aggregative strains are 100% homologous, although these strains were isolated from different kefir grains. This knowledge provides relevant data for better understanding of the mechanisms involved in SLPs functionality and could contribute to the development of products of biotechnological interest from potentially probiotic bacteria.
Characterization of a stearoyl-acyl carrier protein desaturase gene family from chocolate tree, Theobroma cacao L

PubMed Central

Zhang, Yufan; Maximova, Siela N.; Guiltinan, Mark J.

2015-01-01

In plants, the conversion of stearoyl-ACP to oleoyl-ACP is catalyzed by a plastid-localized soluble stearoyl-acyl carrier protein (ACP) desaturase (SAD). The activity of SAD significantly impacts the ratio of saturated and unsaturated fatty acids, and is thus a major determinant of fatty acid composition. The cacao genome contains eight putative SAD isoforms with high amino acid sequence similarities and functional domain conservation with SAD genes from other species. Sequence variation in known functional domains between different SAD family members suggested that these eight SAD isoforms might have distinct functions in plant development, a hypothesis supported by their diverse expression patterns in various cacao tissues. Notably, TcSAD1 is universally expressed across all the tissues, and its expression pattern in seeds is highly correlated with the dramatic change in fatty acid composition during seed maturation. Interestingly, TcSAD3 and TcSAD4 appear to be exclusively and highly expressed in flowers, where their functions remain unknown. To test the function of TcSAD1 in vivo, transgenic complementation of the Arabidopsis ssi2 mutant was performed, demonstrating that TcSAD1 successfully rescued all AtSSI2-related phenotypes, further supporting the functional orthology between these two genes. The identification of the major SAD gene responsible for cocoa butter biosynthesis provides new strategies for screening for novel genotypes with desirable fatty acid compositions, and for use in breeding programs to help pyramid genes for quality and other traits such as disease resistance. PMID:25926841
Bottlenecks and selective sweeps during domestication have increased deleterious genetic variation in dogs.

PubMed

Marsden, Clare D; Ortega-Del Vecchyo, Diego; O'Brien, Dennis P; Taylor, Jeremy F; Ramirez, Oscar; Vilà, Carles; Marques-Bonet, Tomas; Schnabel, Robert D; Wayne, Robert K; Lohmueller, Kirk E

2016-01-05

Population bottlenecks, inbreeding, and artificial selection can all, in principle, influence levels of deleterious genetic variation. However, the relative importance of each of these effects on genome-wide patterns of deleterious variation remains controversial. Domestic and wild canids offer a powerful system to address the role of these factors in influencing deleterious variation because their history is dominated by known bottlenecks and intense artificial selection. Here, we assess genome-wide patterns of deleterious variation in 90 whole-genome sequences from breed dogs, village dogs, and gray wolves. We find that the ratio of amino acid changing heterozygosity to silent heterozygosity is higher in dogs than in wolves and, on average, dogs have 2-3% higher genetic load than gray wolves. Multiple lines of evidence indicate this pattern is driven by less efficient natural selection due to bottlenecks associated with domestication and breed formation, rather than recent inbreeding. Further, we find regions of the genome implicated in selective sweeps are enriched for amino acid changing variants and Mendelian disease genes. To our knowledge, these results provide the first quantitative estimates of the increased burden of deleterious variants directly associated with domestication and have important implications for selective breeding programs and the conservation of rare and endangered species. Specifically, they highlight the costs associated with selective breeding and question the practice favoring the breeding of individuals that best fit breed standards. Our results also suggest that maintaining a large population size, rather than just avoiding inbreeding, is a critical factor for preventing the accumulation of deleterious variants.
Bottlenecks and selective sweeps during domestication have increased deleterious genetic variation in dogs

PubMed Central

Marsden, Clare D.; Ortega-Del Vecchyo, Diego; O’Brien, Dennis P.; Taylor, Jeremy F.; Ramirez, Oscar; Vilà, Carles; Marques-Bonet, Tomas; Schnabel, Robert D.; Wayne, Robert K.; Lohmueller, Kirk E.

2016-01-01

Population bottlenecks, inbreeding, and artificial selection can all, in principle, influence levels of deleterious genetic variation. However, the relative importance of each of these effects on genome-wide patterns of deleterious variation remains controversial. Domestic and wild canids offer a powerful system to address the role of these factors in influencing deleterious variation because their history is dominated by known bottlenecks and intense artificial selection. Here, we assess genome-wide patterns of deleterious variation in 90 whole-genome sequences from breed dogs, village dogs, and gray wolves. We find that the ratio of amino acid changing heterozygosity to silent heterozygosity is higher in dogs than in wolves and, on average, dogs have 2–3% higher genetic load than gray wolves. Multiple lines of evidence indicate this pattern is driven by less efficient natural selection due to bottlenecks associated with domestication and breed formation, rather than recent inbreeding. Further, we find regions of the genome implicated in selective sweeps are enriched for amino acid changing variants and Mendelian disease genes. To our knowledge, these results provide the first quantitative estimates of the increased burden of deleterious variants directly associated with domestication and have important implications for selective breeding programs and the conservation of rare and endangered species. Specifically, they highlight the costs associated with selective breeding and question the practice favoring the breeding of individuals that best fit breed standards. Our results also suggest that maintaining a large population size, rather than just avoiding inbreeding, is a critical factor for preventing the accumulation of deleterious variants. PMID:26699508
Methods and compositions for efficient nucleic acid sequencing

DOEpatents

Drmanac, Radoje

2006-07-04

Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.
Methods and compositions for efficient nucleic acid sequencing

DOEpatents

Drmanac, Radoje

2002-01-01

Disclosed are novel methods and compositions for rapid and highly efficient nucleic acid sequencing based upon hybridization with two sets of small oligonucleotide probes of known sequences. Extremely large nucleic acid molecules, including chromosomes and non-amplified RNA, may be sequenced without prior cloning or subcloning steps. The methods of the invention also solve various current problems associated with sequencing technology such as, for example, high noise to signal ratios and difficult discrimination, attaching many nucleic acid fragments to a surface, preparing many, longer or more complex probes and labelling more species.
Excess amino acid polymorphism in mitochondrial DNA: contrasts among genes from Drosophila, mice, and humans.

PubMed

Rand, D M; Kann, L M

1996-07-01

Recent studies of mitochondrial DNA (mtDNA) variation in mammals and Drosophila have shown an excess of amino acid variation within species (replacement polymorphism) relative to the number of silent and replacement differences fixed between species. To examine further this pattern of nonneutral mtDNA evolution, we present sequence data for the ND3 and ND5 genes from 59 lines of Drosophila melanogaster and 29 lines of D. simulans. Of interest are the frequency spectra of silent and replacement polymorphisms, and potential variation among genes and taxa in the departures from neutral expectations. The Drosophila ND3 and ND5 data show no significant excess of replacement polymorphism using the McDonald-Kreitman test. These data are in contrast to significant departures from neutrality for the ND3 gene in mammals and other genes in Drosophila mtDNA (cytochrome b and ATPase 6). Pooled across genes, however, both Drosophila and human mtDNA show very significant excesses of amino acid polymorphism. Silent polymorphisms at ND5 show a significantly higher variance in frequency than replacement polymorphisms, and the latter show a significant skew toward low frequencies (Tajima's D = -1.954). These patterns are interpreted in light of the nearly neutral theory where mildly deleterious amino acid haplotypes are observed as ephemeral variants within species but do not contribute to divergence. The patterns of polymorphism and divergence at charge-altering amino acid sites are presented for the Drosophila ND5 gene to examine the evolution of functionally distinct mutations. Excess charge-altering polymorphism is observed at the carboxyl terminal and excess charge-altering divergence is detected at the amino terminal. While the mildly deleterious model fits as a net effect in the evolution of nonrecombining mitochondrial genomes, these data suggest that opposing evolutionary pressures may act on different regions of mitochondrial genes and genomes.
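The McDonald-Kreitman test mentioned above compares a 2x2 table of replacement and silent changes, split into polymorphisms within species and fixed differences between species. The sketch below computes the neutrality index and a two-sided Fisher exact P value for such a table. The function names and the counts are hypothetical, and Fisher's exact test is one common choice for the table rather than the statistic confirmed by the abstract.

```python
from math import comb

def neutrality_index(pn, ps, dn, ds):
    """(Pn/Ps) / (Dn/Ds): values above 1 indicate an excess of replacement
    polymorphism. Undefined when ps or dn is zero."""
    return (pn / ps) / (dn / ds)

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact P value for the table [[a, b], [c, d]]."""
    row1, row2, col1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more probable than the observed one.
    return sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))

# Hypothetical counts: replacement/silent polymorphisms, then fixed differences.
pn, ps, dn, ds = 10, 5, 2, 10
print(neutrality_index(pn, ps, dn, ds))
print(fisher_exact_two_sided(pn, ps, dn, ds))
```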
Hybridization and sequencing of nucleic acids using base pair mismatches

DOEpatents

Fodor, Stephen P. A.; Lipshutz, Robert J.; Huang, Xiaohua

2001-01-01

Devices and techniques for hybridization of nucleic acids and for determining the sequence of nucleic acids. Arrays of nucleic acids are formed by techniques, preferably high resolution, light-directed techniques. Positions of hybridization of a target nucleic acid are determined by, e.g., epifluorescence microscopy. Devices and techniques are proposed to determine the sequence of a target nucleic acid more efficiently and more quickly through such synthesis and detection techniques.
Human jagged polypeptide, encoding nucleic acids and methods of use

DOEpatents

Li, Linheng; Hood, Leroy

2000-01-01

The present invention provides an isolated polypeptide exhibiting substantially the same amino acid sequence as JAGGED, or an active fragment thereof, provided that the polypeptide does not have the amino acid sequence of SEQ ID NO:5 or SEQ ID NO:6. The invention further provides an isolated nucleic acid molecule containing a nucleotide sequence encoding substantially the same amino acid sequence as JAGGED, or an active fragment thereof, provided that the nucleotide sequence does not encode the amino acid sequence of SEQ ID NO:5 or SEQ ID NO:6. Also provided herein is a method of inhibiting differentiation of hematopoietic progenitor cells by contacting the progenitor cells with an isolated JAGGED polypeptide, or active fragment thereof. The invention additionally provides a method of diagnosing Alagille Syndrome in an individual. The method consists of detecting an Alagille Syndrome disease-associated mutation linked to a JAGGED locus.
Polypeptide having or assisting in carbohydrate material degrading activity and uses thereof

DOEpatents

Schooneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; Los, Alrik Pieter

2016-02-16

The invention relates to a polypeptide which comprises the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 76% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 76% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
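Claims of this kind turn on a percent sequence identity threshold. The sketch below computes percent identity for two pre-aligned sequences as identical columns divided by aligned columns, ignoring columns that are gaps in both. This denominator is an assumption; the abstract does not define how identity is calculated, and other conventions (for example dividing by the shorter sequence length) give different values. The function name and sequences are illustrative.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding the same residue in both
    sequences. Columns where both sequences have a gap are excluded."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to the same length")
    columns = [(x, y) for x, y in zip(aligned_a, aligned_b)
               if not (x == gap and y == gap)]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)

# Made-up peptides: one mismatch in eight columns.
identity = percent_identity("MKTAYIAK", "MKTAHIAK")
print(identity, identity >= 76.0)
```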
Polypeptide having beta-glucosidase activity and uses thereof

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schooneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; De Jong, Rene Marcel

The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 96% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 96% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.

Polypeptide having swollenin activity and uses thereof

DOEpatents

Schoonneveld-Bergmans, Margot Elizabeth Francoise; Heijne, Wilbert Herman Marie; Vlasie, Monica D; Damveld, Robbertus Antonius

2015-11-04

The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 73% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 73% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
Polypeptide having beta-glucosidase activity and uses thereof

DOEpatents

Schooneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; De Jong, Rene Marcel; Damveld, Robbertus Antonius

2015-09-01

The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 70% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 70% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
Polypeptide having cellobiohydrolase activity and uses thereof

DOEpatents

Sagt, Cornelis Maria Jacobus; Schooneveld-Bergmans, Margot Elisabeth Francoise; Roubos, Johannes Andries; Los, Alrik Pieter

2015-09-15

The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 93% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 93% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
Polypeptide having acetyl xylan esterase activity and uses thereof

DOEpatents

Schoonneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; Los, Alrik Pieter

2015-10-20

The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 82% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 82% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
Polypeptide having carbohydrate degrading activity and uses thereof

DOEpatents

Schooneveld-Bergmans, Margot Elisabeth Francoise; Heijne, Wilbert Herman Marie; Vlasie, Monica Diana; Damveld, Robbertus Antonius

2015-08-18

The invention relates to a polypeptide comprising the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 73% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 73% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional polypeptide and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
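The identity thresholds quoted in the polypeptide records above (70% to 96%) are percentages of matching positions between a variant and the reference sequence. The following is a minimal sketch only. It assumes the two sequences have already been aligned to equal length with "-" marking gaps, and that identity is counted over all alignment columns. The patents themselves define the alignment algorithm and denominator, which may differ. The function names and example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent of alignment columns holding the same residue in both sequences.

    Assumption: inputs are pre-aligned, equal length, with '-' for gaps.
    Gap columns count toward the denominator but never count as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


def meets_threshold(aligned_a: str, aligned_b: str, threshold: float) -> bool:
    """True if the pair reaches at least `threshold` percent identity."""
    return percent_identity(aligned_a, aligned_b) >= threshold


if __name__ == "__main__":
    # Made-up toy sequences, not taken from any SEQ ID NO.
    reference = "MKTAYIAR"
    variant = "MKTA-IAK"
    print(percent_identity(variant, reference))
    print(meets_threshold(variant, reference, 73.0))
```

Under these assumptions, the same variant can pass a 73% claim and fail a 76% claim, which is why the stated threshold matters when comparing the records.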
Mammoth and Mastodon collagen sequences; survival and utility

NASA Astrophysics Data System (ADS)

Buckley, M.; Larkin, N.; Collins, M.

2011-04-01

Near-complete collagen (I) sequences are proposed for elephantid and mammutid taxa, based upon available African elephant genomic data and supported with LC-MALDI-MS/MS and LC-ESI-MS/MS analyses of collagen digests from proboscidean bone. Collagen sequence coverage was investigated from several specimens of two extinct mammoths ( Mammuthus trogontherii and Mammuthus primigenius), the extinct American mastodon ( Mammut americanum), the extinct straight-tusked elephant ( Elephas ( Palaeoloxodon) antiquus) and extant Asian ( Elephas maximus) and African ( Loxodonta africana) elephants and compared between the two ionization techniques used. Two suspected mammoth fossils from the British Middle Pleistocene (Cromerian) deposits of the West Runton Forest Bed were analysed to investigate the potential use of peptide mass spectrometry for fossil identification. Despite the age of the fossils, sufficient peptides were obtained to identify these as elephantid, and sufficient sequence variation to discriminate elephantid and mammutid collagen (I). In-depth LC-MS analyses further failed to identify a peptide that could be used to reliably distinguish between the three genera of elephantids ( Elephas, Loxodonta and Mammuthus), an observation consistent with predicted amino acid substitution rates between these species.
Biosensing of BCR/ABL fusion gene using an intensity-interrogation surface plasmon resonance imaging system

NASA Astrophysics Data System (ADS)

Wu, Jiangling; Huang, Yu; Bian, Xintong; Li, DanDan; Cheng, Quan; Ding, Shijia

2016-10-01

In this work, a custom-made intensity-interrogation surface plasmon resonance imaging (SPRi) system has been developed to directly detect a specific sequence of the BCR/ABL fusion gene in chronic myelogenous leukemia (CML). The variation in the reflected light intensity detected from the sensor chip, composed of a gold island array, is proportional to the change of refractive index due to the selective hybridization of surface-bound DNA probes with target ssDNA. SPRi measurements were performed with different concentrations of synthetic target DNA sequence. The calibration curve of the synthetic target sequence shows a good relationship between the concentration of synthetic target and the change of reflected light intensity. The detection limit of this SPRi measurement could approach 10.29 nM. By comparing SPRi images, the target ssDNA and non-complementary DNA sequence can be distinguished. This SPRi system has been applied to assay the BCR/ABL fusion gene extracted from real samples. This nucleic acid-based SPRi biosensor therefore offers an alternative, highly effective, high-throughput, label-free tool for DNA detection in biomedical research and molecular diagnosis.
Evolutionary and biophysical relationships among the papillomavirus E2 proteins.

PubMed

Blakaj, Dukagjin M; Fernandez-Fuentes, Narcis; Chen, Zigui; Hegde, Rashmi; Fiser, Andras; Burk, Robert D; Brenowitz, Michael

2009-01-01

Infection by human papillomavirus (HPV) may result in clinical conditions ranging from benign warts to invasive cancer. The HPV E2 protein represses oncoprotein transcription and is required for viral replication. HPV E2 binds to palindromic DNA sequences of highly conserved four base pair sequences flanking an identical length variable 'spacer'. E2 proteins directly contact the conserved but not the spacer DNA. Variation in naturally occurring spacer sequences results in differential protein affinity that is dependent on their sensitivity to the spacer DNA's unique conformational and/or dynamic properties. This article explores the biophysical character of this core viral protein with the goal of identifying characteristics that are associated with risk of virally caused malignancy. The amino acid sequence, 3D structure and electrostatic features of the E2 protein DNA binding domain are highly conserved; specific interactions with DNA binding sites have also been conserved. In contrast, the E2 protein's transactivation domain does not have extensive surfaces of highly conserved residues. Rather, regions of high conservation are localized to small surface patches. Implications for cancer biology are discussed.
Organization of nif gene cluster in Frankia sp. EuIK1 strain, a symbiont of Elaeagnus umbellata.

PubMed

Oh, Chang Jae; Kim, Ho Bang; Kim, Jitae; Kim, Won Jin; Lee, Hyoungseok; An, Chung Sun

2012-01-01

The nucleotide sequence of a 20.5-kb genomic region harboring nif genes was determined and analyzed. The fragment was obtained from Frankia sp. EuIK1 strain, an indigenous symbiont of Elaeagnus umbellata. A total of 20 ORFs including 12 nif genes were identified and subjected to comparative analysis with the genome sequences of 3 Frankia strains representing diverse host plant specificities. The nucleotide and deduced amino acid sequences showed highest levels of identity with orthologous genes from an Elaeagnus-infecting strain. The gene organization patterns around the nif gene clusters were well conserved among all 4 Frankia strains. However, characteristic features appeared in the location of the nifV gene for each Frankia strain, depending on the type of host plant. Sequence analysis was performed to determine the transcription units and suggested that there could be an independent operon starting from the nifW gene in the EuIK strain. Considering the organization patterns and their total extensions on the genome, we propose that the nif gene clusters remained stable despite genetic variations occurring in the Frankia genomes.
The design of strain-specific polymerase chain reactions for discrimination of the racoon rabies virus strain from indigenous rabies viruses of Ontario.

PubMed

Nadin-Davis, S A; Huang, W; Wandeler, A I

1996-03-01

Since its recognition as a discrete epizootic in Florida in the early 1950s, the raccoon strain of rabies virus (RV) has spread over almost the entire eastern seaboard of the US and now threatens to enter the southernmost regions of Canada. To characterise this RV strain in more detail, nucleotide sequencing of the N and G genes, encoding the nucleoprotein and glycoprotein, respectively, of representative isolates has been undertaken. This sequence information generated a conserved restriction map of the N gene, thereby permitting unequivocal identification of this strain by molecular techniques. Comparisons of the predicted nucleoprotein and glycoprotein products with those of other RV strains identified a number of amino acid sequence variations conserved only in the raccoon strain. This information was used to design strain-specific primers targeted to the N gene sequences encoding these residues. The incorporation of these primers into a multiplex polymerase chain reaction (PCR) protocol permitted easy and rapid discrimination between the raccoon RV strain and indigenous Ontario RVs.
AlloRep: A Repository of Sequence, Structural and Mutagenesis Data for the LacI/GalR Transcription Regulators.

PubMed

Sousa, Filipa L; Parente, Daniel J; Shis, David L; Hessman, Jacob A; Chazelle, Allen; Bennett, Matthew R; Teichmann, Sarah A; Swint-Kruse, Liskin

2016-02-22

Protein families evolve functional variation by accumulating point mutations at functionally important amino acid positions. Homologs in the LacI/GalR family of transcription regulators have evolved to bind diverse DNA sequences and allosteric regulatory molecules. In addition to playing key roles in bacterial metabolism, these proteins have been widely used as a model family for benchmarking structural and functional prediction algorithms. We have collected manually curated sequence alignments for >3000 sequences, in vivo phenotypic and biochemical data for >5750 LacI/GalR mutational variants, and noncovalent residue contact networks for 65 LacI/GalR homolog structures. Using this rich data resource, we compared the noncovalent residue contact networks of the LacI/GalR subfamilies to design and experimentally validate an allosteric mutant of a synthetic LacI/GalR repressor for use in biotechnology. The AlloRep database (freely available at www.AlloRep.org) is a key resource for future evolutionary studies of LacI/GalR homologs and for benchmarking computational predictions of functional change. Copyright © 2015 Elsevier Ltd. All rights reserved.
Complete mitochondrial genome of Yangtze River wild common carp (Cyprinus carpio haematopterus) and Russian scattered scale mirror carp (Cyprinus carpio carpio).

PubMed

Hu, Guang Fu; Liu, Xiang Jiang; Zou, Gui Wei; Li, Zhong; Liang, Hong-Wei; Hu, Shao-Na

2016-01-01

We sequenced the complete mitogenomes of Yangtze River wild common carp (Cyprinus carpio haematopterus) and Russian scattered scale mirror carp (Cyprinus carpio carpio). Comparison of these two mitogenomes revealed that the two common carp strains were remarkably similar in genome length, gene order and content, and AT content. There were only 55 bp variations in 16,581 nucleotides: 1 bp variation was located in rRNAs, 2 bp in tRNAs, 9 bp in the control region and 43 bp in protein-coding genes. Furthermore, the forty-three variable nucleotides in the protein-coding genes of the two strains led to four variable amino acids, which were located in the ND2, ATPase 6, ND5 and ND6 genes, respectively.
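The per-region counts in the carp mitogenome record can be checked against the stated total, and the overall divergence follows from simple arithmetic. A minimal sketch, using only the figures given in the abstract; the function and variable names are illustrative.

```python
def divergence_summary(variations_by_region: dict, genome_length: int):
    """Return (total variable sites, percent of genome that differs)."""
    total = sum(variations_by_region.values())
    percent = 100.0 * total / genome_length
    return total, percent


# Counts as reported in the abstract above.
CARP_VARIATIONS = {
    "rRNA": 1,
    "tRNA": 2,
    "control region": 9,
    "protein-coding": 43,
}
CARP_GENOME_LENGTH = 16581  # nucleotides

if __name__ == "__main__":
    total, percent = divergence_summary(CARP_VARIATIONS, CARP_GENOME_LENGTH)
    print(f"{total} variable sites, {percent:.2f}% of the mitogenome")
```

The regional counts sum to the reported 55 sites, which corresponds to roughly a third of one percent of the mitogenome.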
37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

Code of Federal Regulations, 2010 CFR

2010-07-01

... 37 Patents, Trademarks, and Copyrights 1 2010-07-01 2010-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide and...
37 CFR 5.31-5.33 - [Reserved]

Code of Federal Regulations, 2011 CFR

2011-07-01

... from abandonment 1.135 Amino Acid Sequences. (See Nucleotide and/or Amino Acid Sequences) Appeal to... Appeals and Interference 41.47 Of rejection of an application 1.104(a) Nucleotide and/or Amino Acid...) Symbols for nucleotide and/or amino acid sequence data 1.822 T Tables in patent applications 1.58 Terminal...
37 CFR 1.821 - Nucleotide and/or amino acid sequence disclosures in patent applications.

Code of Federal Regulations, 2011 CFR

2011-07-01

... 37 Patents, Trademarks, and Copyrights 1 2011-07-01 2011-07-01 false Nucleotide and/or amino acid... Biotechnology Invention Disclosures Application Disclosures Containing Nucleotide And/or Amino Acid Sequences § 1.821 Nucleotide and/or amino acid sequence disclosures in patent applications. (a) Nucleotide and...
Genetic diversity and antigenicity variation of Babesia bovis merozoite surface antigen-1 (MSA-1) in Thailand.

PubMed

Tattiyapong, Muncharee; Sivakumar, Thillaiampalam; Takemae, Hitoshi; Simking, Pacharathon; Jittapalapong, Sathaporn; Igarashi, Ikuo; Yokoyama, Naoaki

2016-07-01

Babesia bovis, an intraerythrocytic protozoan parasite, causes severe clinical disease in cattle worldwide. The genetic diversity of parasite antigens often results in different immune profiles in infected animals, hindering efforts to develop immune control methodologies against the B. bovis infection. In this study, we analyzed the genetic diversity of the merozoite surface antigen-1 (msa-1) gene using 162 B. bovis-positive blood DNA samples sourced from cattle populations reared in different geographical regions of Thailand. The identity scores shared among 93 msa-1 gene sequences isolated by PCR amplification were 43.5-100%, and the similarity values among the translated amino acid sequences were 42.8-100%. Of 23 total clades detected in our phylogenetic analysis, Thai msa-1 gene sequences occurred in 18 clades; seven among them were composed of sequences exclusively from Thailand. To investigate differential antigenicity of isolated MSA-1 proteins, we expressed and purified eight recombinant MSA-1 (rMSA-1) proteins, including an rMSA-1 from B. bovis Texas (T2Bo) strain and seven rMSA-1 proteins based on the Thai msa-1 sequences. When these antigens were analyzed in a western blot assay, anti-T2Bo cattle serum strongly reacted with the rMSA-1 from T2Bo, as well as with three other rMSA-1 proteins that shared 54.9-68.4% sequence similarity with T2Bo MSA-1. In contrast, no or weak reactivity was observed for the remaining rMSA-1 proteins, which shared low sequence similarity (35.0-39.7%) with T2Bo MSA-1. While demonstrating the high genetic diversity of the B. bovis msa-1 gene in Thailand, the present findings suggest that the genetic diversity results in antigenicity variations among the MSA-1 antigens of B. bovis in Thailand. Copyright © 2016 Elsevier B.V. All rights reserved.
Genome-wide copy number variation in the bovine genome detected using low coverage sequence of popular beef breeds

USDA-ARS's Scientific Manuscript database

Genomic structural variations are an important source of genetic diversity. Copy number variations (CNVs), gains and losses of large regions of genomic sequence between individuals of a species, are known to be associated with both diseases and phenotypic traits. Deeply sequenced genomes are often u...
Gene encoding a novel extracellular metalloprotease in Bacillus subtilis.

PubMed Central

Sloma, A; Rudolph, C F; Rufo, G A; Sullivan, B J; Theriault, K A; Ally, D; Pero, J

1990-01-01

The gene for a novel extracellular metalloprotease was cloned, and its nucleotide sequence was determined. The gene (mpr) encodes a primary product of 313 amino acids that has little similarity to other known Bacillus proteases. The amino acid sequence of the mature protease was preceded by a signal sequence of approximately 34 amino acids and a pro sequence of 58 amino acids. Four cysteine residues were found in the deduced amino acid sequence of the mature protein, indicating the possible presence of disulfide bonds. The mpr gene mapped in the cysA-aroI region of the chromosome and was not required for growth or sporulation. PMID:2105291
Two-Way Gold Nanoparticle Label-Free Sensing of Specific Sequence and Small Molecule Targets Using Switchable Concatemers.

PubMed

Zhu, Longjiao; Shao, Xiangli; Luo, Yunbo; Huang, Kunlung; Xu, Wentao

2017-05-19

A two-way colorimetric biosensor based on unmodified gold nanoparticles (GNPs) and a switchable double-stranded DNA (dsDNA) concatemer has been demonstrated. Two hairpin probes (H1 and H2) were first designed that provided the fuels to assemble the dsDNA concatemers via hybridization chain reaction (HCR). A functional hairpin (FH) was rationally designed to recognize the target sequences. All the hairpins contained a single-stranded DNA (ssDNA) loop and sticky end to prevent GNPs from salt-induced aggregation. In the presence of the target sequence, the capture probe blocked in the FH recognizes the target to form a duplex DNA, which causes the release of the initiator probe by FH conformational change. This process then starts the alternate opening of H1 and H2 through HCR, and dsDNA concatemers grow from the target sequence. As a result, unmodified GNPs undergo salt-induced aggregation because the formed dsDNA concatemers are stiffer and provide less stabilization. A light purple-to-blue color variation was observed in the bulk solution, termed the light-off sensing way. Furthermore, an aptamer sequence was inserted into H1 to generate dsDNA concatemers with multiple small molecule binding sites. In the presence of small molecule targets, concatemers can be disassembled into mixtures with ssDNA sticky ends. A blue-to-purple reverse color variation was observed due to the regeneration of the ssDNA, termed the light-on way. The two-way biosensor can detect both nucleic acids and small molecule targets with one sensing device. This switchable sensing element is label-free and enzyme-free, and requires no sophisticated instrumentation. The detection limits of both targets were below nanomolar.
Comprehensive global amino acid sequence analysis of PB1F2 protein of influenza A H5N1 viruses and the influenza A virus subtypes responsible for the 20th‐century pandemics

PubMed Central

Pasricha, Gunisha; Mishra, Akhilesh C.; Chakrabarti, Alok K.

2012-01-01

Please cite this paper as: Pasricha et al. (2012) Comprehensive global amino acid sequence analysis of PB1F2 protein of influenza A H5N1 viruses and the Influenza A virus subtypes responsible for the 20th‐century pandemics. Influenza and Other Respiratory Viruses 7(4), 497–505. Background: PB1F2 is the 11th protein of influenza A virus, translated from the +1 alternate reading frame of the PB1 gene. Since its discovery, varying sizes and functions of the PB1F2 protein of influenza A viruses have been reported. Selection of the PB1 gene segment in the pandemics, together with the variable size and pleiotropic effect of PB1F2, intrigued us to analyze amino acid sequences of this protein in various influenza A viruses. Methods: Amino acid sequences for PB1F2 protein of influenza A H5N1, H1N1, H2N2, and H3N2 subtypes were obtained from the Influenza Research Database. Multiple sequence alignments of the PB1F2 protein sequences of the aforementioned subtypes were used to determine the size and the variable and conserved domains, and to perform mutational analysis. Results: Analysis showed that 96·4% of the H5N1 influenza viruses harbored full‐length PB1F2 protein. Except for the 2009 pandemic H1N1 virus, all the subtypes of the 20th‐century pandemic influenza viruses contained full‐length PB1F2 protein. Through the years, PB1F2 protein of the H1N1 and H3N2 viruses has undergone much variation. PB1F2 protein sequences of H5N1 viruses showed both human‐ and avian host‐specific conserved domains. The global database of PB1F2 protein revealed that the N66S mutation was present only in 3·8% of the H5N1 strains. We found a novel mutation, N84S, in the PB1F2 protein of 9·35% of the highly pathogenic avian influenza H5N1 viruses. Conclusions: Varying sizes and mutations of the PB1F2 protein in different influenza A virus subtypes with pandemic potential were obtained. There was genetic divergence of the protein in various hosts, which highlighted the host‐specific evolution of the virus. However, studies are required to correlate this sequence variability with the virulence and pathogenicity. PMID:22788742

Thermophilic cellobiohydrolase

DOEpatents

Sapra, Rajat; Park, Joshua I.; Datta, Supratim; Simmons, Blake A.

2017-04-18

The present invention provides for a composition comprising a polypeptide comprising a first amino acid sequence having at least 70% identity with the amino acid sequence of Csac GH5 wherein said first amino acid sequence has a thermostable or thermophilic cellobiohydrolase (CBH) or exoglucanase activity.
Investigation of sequential properties of snoring episodes for obstructive sleep apnoea identification.

PubMed

Cavusoglu, M; Ciloglu, T; Serinagaoglu, Y; Kamasak, M; Erogul, O; Akcam, T

2008-08-01

In this paper, 'snore regularity' is studied in terms of the variations of snoring sound episode durations, separations and average powers in simple snorers and in obstructive sleep apnoea (OSA) patients. The goal was to explore the possibility of distinguishing between simple snorers and OSA patients using only sleep sound recordings of individuals and to ultimately eliminate the need for spending a whole night in the clinic for polysomnographic recording. Sequences that contain snoring episode durations (SED), snoring episode separations (SES) and average snoring episode powers (SEP) were constructed from snoring sound recordings of 30 individuals (18 simple snorers and 12 OSA patients) who were also under polysomnographic recording in Gülhane Military Medical Academy Sleep Studies Laboratory (GMMA-SSL), Ankara, Turkey. Snore regularity is quantified in terms of mean, standard deviation and coefficient of variation values for the SED, SES and SEP sequences. In all three of these sequences, OSA patients' data displayed a higher variation than those of simple snorers. To exclude the effects of slow variations in the base-line of these sequences, new sequences that contain the coefficient of variation of the sample values in a 'short' signal frame, i.e., short time coefficient of variation (STCV) sequences, were defined. The mean, the standard deviation and the coefficient of variation values calculated from the STCV sequences displayed a stronger potential to distinguish between simple snorers and OSA patients than those obtained from the SED, SES and SEP sequences themselves. Spider charts were used to jointly visualize the three parameters, i.e., the mean, the standard deviation and the coefficient of variation values of the SED, SES and SEP sequences, and the corresponding STCV sequences as two-dimensional plots. Our observations showed that the statistical parameters obtained from the SED and SES sequences, and the corresponding STCV sequences, possessed a strong potential to distinguish between simple snorers and OSA patients, both marginally, i.e., when the parameters are examined individually, and jointly. The parameters obtained from the SEP sequences and the corresponding STCV sequences, on the other hand, did not have a strong discrimination capability. However, the joint behaviour of these parameters showed some potential to distinguish between simple snorers and OSA patients.
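The short time coefficient of variation (STCV) described in the snoring record can be illustrated as a sliding-window statistic. This is a minimal sketch only: the abstract does not give the frame length, the hop size, or whether the population or sample standard deviation was used, so `frame_len`, `hop`, and the use of the population standard deviation are all assumptions, and the input values are made up.

```python
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Standard deviation divided by the mean (population form, an assumption)."""
    m = mean(values)
    if m == 0:
        raise ValueError("coefficient of variation is undefined for zero mean")
    return pstdev(values) / m


def stcv_sequence(values, frame_len, hop=1):
    """Coefficient of variation computed over consecutive short frames.

    frame_len and hop are hypothetical parameters; the abstract only says
    the frame is 'short'.
    """
    if frame_len < 2 or hop < 1:
        raise ValueError("frame_len must be >= 2 and hop must be >= 1")
    return [
        coefficient_of_variation(values[i:i + frame_len])
        for i in range(0, len(values) - frame_len + 1, hop)
    ]


if __name__ == "__main__":
    # Made-up snoring episode durations in seconds, for illustration only.
    sed = [1.2, 1.3, 1.1, 2.4, 0.6, 1.2, 1.3]
    frames = stcv_sequence(sed, frame_len=3)
    print([round(v, 3) for v in frames])
    print(round(mean(frames), 3), round(pstdev(frames), 3))
```

Because each frame is normalised by its own local mean, a slow drift in the baseline changes the frame values much less than it changes the coefficient of variation of the whole sequence, which matches the motivation stated in the abstract.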
Phylogeny of North American Powassan virus.

PubMed

Ebel, G D; Spielman, A; Telford, S R

2001-07-01

To determine whether Powassan virus (POW) and deer tick virus (DTV) constitute distinct flaviviral populations transmitted by ixodid ticks in North America, we analysed diverse nucleotide sequences from 16 strains of these viruses. Two distinct genetic lineages are evident, which may be defined by geographical and host associations. The nucleotide and amino acid sequences of lineage one (comprising New York and Canadian POW isolates) are highly conserved across time and space, but those of lineage two (comprising isolates from deer ticks and a fox) are more variable. The divergence between lineages is much greater than the variation within either lineage, and lineage two appears to be more diverse genetically than is lineage one. Application of McDonald-Kreitman tests to the sequences of these strains indicates that adaptive evolution of the envelope protein separates lineage one from lineage two. The two POW lineages circulating in North America possess a pattern of genetic diversity suggesting that they comprise distinct subtypes that may perpetuate in separate enzootic cycles.
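The McDonald-Kreitman test mentioned in the Powassan virus record compares fixed differences between lineages with polymorphisms within them, each split into nonsynonymous and synonymous changes. The sketch below shows the usual 2x2 form of the test with a two-sided Fisher's exact test written out by hand. The counts in the example are invented for illustration and are not from the paper, and the function names are hypothetical.

```python
from math import comb


def neutrality_index(dn, ds, pn, ps):
    """(Pn/Ps) / (Dn/Ds); values below 1 indicate an excess of
    nonsynonymous fixed differences between lineages."""
    return (pn / ps) / (dn / ds)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum every table that is no more likely than the observed one.
    return sum(
        prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9)
    )


if __name__ == "__main__":
    # Invented counts: fixed (Dn, Ds) and polymorphic (Pn, Ps) changes.
    dn, ds, pn, ps = 7, 17, 2, 42
    print(neutrality_index(dn, ds, pn, ps))
    print(fisher_exact_two_sided(dn, ds, pn, ps))
```

A small p-value together with a neutrality index below 1 is the pattern conventionally read as evidence of adaptive evolution between the two lineages.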
The structure of the human interferon alpha/beta receptor gene.

PubMed

Lutfalla, G; Gardiner, K; Proudhon, D; Vielh, E; Uzé, G

1992-02-05

Using the cDNA coding for the human interferon alpha/beta receptor (IFNAR), the IFNAR gene has been physically mapped relative to the other loci of the chromosome 21q22.1 region. 32,906 base pairs covering the IFNAR gene have been cloned and sequenced. Primer extension and solution hybridization-ribonuclease protection have been used to determine that the transcription of the gene is initiated in a broad region of 20 base pairs. Some aspects of the polymorphism of the gene, including noncoding sequences, have been analyzed; some are allelic differences in the coding sequence that induce amino acid variations in the resulting protein. The exon structures of the IFNAR gene and of the available genes for the receptors of the cytokine/growth hormone/prolactin/interferon receptor family have been compared with the predictions for the secondary structure of those receptors. From this analysis, we postulate a common origin and propose a hypothesis for the divergence from the immunoglobulin superfamily.
Computer-aided visualization and analysis system for sequence evaluation

DOEpatents

Chee, M.S.

1998-08-18

A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments are improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device. 27 figs.
Computer-aided visualization and analysis system for sequence evaluation

DOEpatents

Chee, Mark S.; Wang, Chunwei; Jevons, Luis C.; Bernhart, Derek H.; Lipshutz, Robert J.

2004-05-11

A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments are improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device.
Computer-aided visualization and analysis system for sequence evaluation

DOEpatents

Chee, Mark S.

1998-08-18

A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments are improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device.
Computer-aided visualization and analysis system for sequence evaluation

DOEpatents

Chee, Mark S.

2003-08-19

A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments may be improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device.
Cell culture compositions

DOEpatents

Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

2014-03-18

The present invention provides a novel endoglucanase nucleic acid sequence, designated egl6 (SEQ ID NO:1 encodes the full length endoglucanase; SEQ ID NO:4 encodes the mature form), and the corresponding endoglucanase VI amino acid sequence ("EGVI"; SEQ ID NO:3 is the signal sequence; SEQ ID NO:2 is the mature sequence). The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding EGVI, recombinant EGVI proteins and methods for producing the same.
Fatty acid composition and desaturase gene expression in flax (Linum usitatissimum L.).

PubMed

Thambugala, Dinushika; Cloutier, Sylvie

2014-11-01

Little is known about the relationship between expression levels of fatty acid desaturase genes during seed development and fatty acid (FA) composition in flax. In the present study, we looked at promoter structural variations of six FA desaturase genes and their relative expression throughout seed development. Computational analysis of the nucleotide sequences of the sad1, sad2, fad2a, fad2b, fad3a and fad3b promoters showed several basic transcriptional elements including CAAT and TATA boxes, and several putative target-binding sites for transcription factors, which have been reported to be involved in the regulation of lipid metabolism. Using semi-quantitative reverse transcriptase PCR, the expression patterns throughout seed development of the six FA desaturase genes were measured in six flax genotypes that differed for FA composition but that carried the same desaturase isoforms. FA composition data were determined by phenotyping the field grown genotypes over four years in two environments. All six genes displayed a bell-shaped pattern of expression peaking at 20 or 24 days after anthesis. Sad2 was the most highly expressed. The expression of all six desaturase genes did not differ significantly between genotypes (P = 0.1400), hence there were no correlations between FA desaturase gene expression and variations in FA composition in relatively low, intermediate and high linolenic acid genotypes expressing identical isoforms for all six desaturases. These results provide further clues towards understanding the genetic factors responsible for FA composition in flax.
Genetic characterization of a novel astrovirus in Pekin ducks.

PubMed

Liao, Qinfeng; Liu, Ning; Wang, Xiaoyan; Wang, Fumin; Zhang, Dabing

2015-06-01

Three divergent groups of duck astroviruses (DAstVs), namely DAstV-1, DAstV-2 (formerly duck hepatitis virus type 3) and DAstV-3 (isolate CPH), and other avastroviruses are known to infect domestic ducks. To provide more data regarding the molecular epidemiology of astroviruses in domestic ducks, we examined the prevalence of astroviruses in 136 domestic duck samples collected from four different provinces of China. Nineteen goose samples were also included. Using an astrovirus-specific reverse transcription-PCR assay, two groups of astroviruses were detected from our samples. A group of astroviruses detected from Pekin ducks, Shaoxing ducks and Landes geese were highly similar to the newly discovered DAstV-3. More interestingly, a novel group of avastroviruses, which we named DAstV-4, was detected in Pekin ducks. Following full-length sequencing and sequence analysis, the variation between DAstV-4 and other avastroviruses in terms of lengths of genome and internal component was highlighted. Sequence identity and phylogenetic analyses based on the amino acid sequences of the three open reading frames (ORFs) clearly demonstrated that DAstV-4 was highly divergent from all other avastroviruses. Further analyses showed that DAstV-4 shared low levels of genome identities (50-58%) and high levels of mean amino acid genetic distances in the ORF2 sequences (0.520-0.801) with other avastroviruses, suggesting DAstV-4 may represent an additional avastrovirus species although the taxonomic relationship of DAstV-4 to DAstV-3 remains to be resolved. The present works contribute to the understanding of epidemiology, ecology and taxonomy of astroviruses in ducks. Copyright © 2015 Elsevier B.V. All rights reserved.
Labeled nucleotide phosphate (NP) probes

DOEpatents

Korlach, Jonas [Ithaca, NY]; Webb, Watt W. [Ithaca, NY]; Levene, Michael [Ithaca, NY]; Turner, Stephen [Ithaca, NY]; Craighead, Harold G. [Ithaca, NY]; Foquet, Mathieu [Ithaca, NY]

2009-02-03

The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
Trh (tdh-/trh+) gene analysis of clinical, environmental and food isolates of Vibrio parahaemolyticus as a tool for investigating pathogenicity.

PubMed

Leoni, Francesca; Talevi, Giulia; Masini, Laura; Ottaviani, Donatella; Rocchegiani, Elena

2016-05-16

Sequencing analysis of the trh gene encoding the TDH-related haemolysin of tdh-/trh+ Vibrio parahaemolyticus isolated in Italy between 2002 and 2011 from clinical, environmental, and food samples revealed the presence of the trh2 variant in all isolates. The trh2 of the clinical isolate was 100% identical to other clinical tdh-/trh2 V. parahaemolyticus from Europe. Nucleotide and amino acid differences in the trh2 sequences of clinical isolates from Italy and other countries allowed a differentiation of the clinical strains from the majority of environmental or food strains isolated in Italy. Aspartic acid and isoleucine at positions 113 and 115, encoded by nucleotide triplets GAT and ATT at positions 337-339 and 343-345 of the complete trh gene sequence, were present in clinical strains from Europe (Italy, Norway and Germany), Asia and the United States. Only 35.5% of the tdh-/trh2 V. parahaemolyticus of environmental or food origin from Italy shared the same triplets/amino acids detected in clinical isolates, while 64.5% of isolates from the marine environment were different from those of clinical origin, demonstrating that differences occur amongst the trh2 sequences of strains from the environment and that these polymorphisms may differentiate potentially pathogenic from less or non-pathogenic cultures found in the environment and seafood. In addition, the distribution of T3SS2 genes was investigated in this group of tdh-/trh+ V. parahaemolyticus from different sources and in three clinical tdh+/trh- V. parahaemolyticus isolates. All tdh-/trh+ V. parahaemolyticus of environmental or food source, independent of year of isolation or geographical origin, amplified all the screened T3SS2β genes and tested negative in PCR assays for all five T3SS2α genes, as did the tdh-/trh+ clinical V. parahaemolyticus isolate. The vopC genes, encoding one of the effector proteins of T3SS2, were partially sequenced and compared to clinical tdh-/trh+ and tdh+/trh+ V. parahaemolyticus isolates from other countries. Analysis of T3SS2β vopC sequences revealed variation in tdh-/trh2 isolates from Italy, which were separated from a group of vopC sequences derived from trh2 V. parahaemolyticus from the USA. Copyright © 2016 Elsevier B.V. All rights reserved.
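The residue and nucleotide positions quoted in this abstract are linked by simple codon arithmetic: residue n of a coding sequence is encoded by nucleotides 3(n-1)+1 through 3n. The short sketch below checks that relationship for positions 113 and 115; the function name and the two-entry codon lookup are illustrative helpers, not part of the study.

```python
# Illustrative check of the residue-to-nucleotide mapping quoted in the abstract.
# Positions are 1-based and counted from the first base of the coding sequence.

def codon_span(residue: int) -> tuple[int, int]:
    """Return the (start, end) nucleotide positions of the codon for a residue."""
    if residue < 1:
        raise ValueError("residue positions are 1-based")
    start = 3 * (residue - 1) + 1
    return start, start + 2


# Only the two codons mentioned in the abstract (standard genetic code).
CODON_TO_AMINO_ACID = {"GAT": "Asp", "ATT": "Ile"}

if __name__ == "__main__":
    for residue, triplet in ((113, "GAT"), (115, "ATT")):
        start, end = codon_span(residue)
        amino_acid = CODON_TO_AMINO_ACID[triplet]
        print(f"residue {residue} ({amino_acid}, {triplet}): nucleotides {start}-{end}")
```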
Comprehensive global amino acid sequence analysis of PB1F2 protein of influenza A H5N1 viruses and the influenza A virus subtypes responsible for the 20th-century pandemics.

PubMed

Pasricha, Gunisha; Mishra, Akhilesh C; Chakrabarti, Alok K

2013-07-01

PB1F2 is the 11th protein of influenza A virus, translated from the +1 alternate reading frame of the PB1 gene. Since its discovery, varying sizes and functions of the PB1F2 protein of influenza A viruses have been reported. Selection of the PB1 gene segment in the pandemics, together with the variable size and pleiotropic effect of PB1F2, prompted us to analyze amino acid sequences of this protein in various influenza A viruses. Amino acid sequences for the PB1F2 protein of influenza A H5N1, H1N1, H2N2, and H3N2 subtypes were obtained from the Influenza Research Database. Multiple sequence alignments of the PB1F2 protein sequences of the aforementioned subtypes were used to determine the size and the variable and conserved domains, and to perform mutational analysis. Analysis showed that 96.4% of the H5N1 influenza viruses harbored full-length PB1F2 protein. Except for the 2009 pandemic H1N1 virus, all the subtypes of the 20th-century pandemic influenza viruses contained full-length PB1F2 protein. Through the years, the PB1F2 protein of the H1N1 and H3N2 viruses has undergone much variation. PB1F2 protein sequences of H5N1 viruses showed both human- and avian host-specific conserved domains. The global database of PB1F2 protein revealed that the N66S mutation was present in only 3.8% of the H5N1 strains. We found a novel mutation, N84S, in the PB1F2 protein of 9.35% of the highly pathogenic avian influenza H5N1 viruses. Varying sizes and mutations of the PB1F2 protein were observed in different influenza A virus subtypes with pandemic potential. There was genetic divergence of the protein in various hosts, which highlighted the host-specific evolution of the virus. However, further studies are required to correlate this sequence variability with virulence and pathogenicity. © 2012 John Wiley & Sons Ltd.
The evolutionary dynamics of variant antigen genes in Babesia reveal a history of genomic innovation underlying host-parasite interaction.

PubMed

Jackson, Andrew P; Otto, Thomas D; Darby, Alistair; Ramaprasad, Abhinay; Xia, Dong; Echaide, Ignacio Eduardo; Farber, Marisa; Gahlot, Sunayna; Gamble, John; Gupta, Dinesh; Gupta, Yask; Jackson, Louise; Malandrin, Laurence; Malas, Tareq B; Moussa, Ehab; Nair, Mridul; Reid, Adam J; Sanders, Mandy; Sharma, Jyotsna; Tracey, Alan; Quail, Mike A; Weir, William; Wastling, Jonathan M; Hall, Neil; Willadsen, Peter; Lingelbach, Klaus; Shiels, Brian; Tait, Andy; Berriman, Matt; Allred, David R; Pain, Arnab

2014-06-01

Babesia spp. are tick-borne, intraerythrocytic hemoparasites that use antigenic variation to resist host immunity, through sequential modification of the parasite-derived variant erythrocyte surface antigen (VESA) expressed on the infected red blood cell surface. We identified the genomic processes driving antigenic diversity in genes encoding VESA (ves1) through comparative analysis within and between three Babesia species, (B. bigemina, B. divergens and B. bovis). Ves1 structure diverges rapidly after speciation, notably through the evolution of shortened forms (ves2) from 5' ends of canonical ves1 genes. Phylogenetic analyses show that ves1 genes are transposed between loci routinely, whereas ves2 genes are not. Similarly, analysis of sequence mosaicism shows that recombination drives variation in ves1 sequences, but less so for ves2, indicating the adoption of different mechanisms for variation of the two families. Proteomic analysis of the B. bigemina PR isolate shows that two dominant VESA1 proteins are expressed in the population, whereas numerous VESA2 proteins are co-expressed, consistent with differential transcriptional regulation of each family. Hence, VESA2 proteins are abundant and previously unrecognized elements of Babesia biology, with evolutionary dynamics consistently different to those of VESA1, suggesting that their functions are distinct. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Using chaos to generate variations on movement sequences

NASA Astrophysics Data System (ADS)

Bradley, Elizabeth; Stuart, Joshua

1998-12-01

We describe a method for introducing variations into predefined motion sequences using a chaotic symbol-sequence reordering technique. A progression of symbols representing the body positions in a dance piece, martial arts form, or other motion sequence is mapped onto a chaotic trajectory, establishing a symbolic dynamics that links the movement sequence and the attractor structure. A variation on the original piece is created by generating a trajectory with slightly different initial conditions, inverting the mapping, and using special corpus-based graph-theoretic interpolation schemes to smooth any abrupt transitions. Sensitive dependence guarantees that the variation is different from the original; the attractor structure and the symbolic dynamics guarantee that the two resemble one another in both aesthetic and mathematical senses.
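The core idea of this abstract, tying each symbol to a point on a chaotic trajectory and then reading symbols back along a slightly perturbed trajectory, can be sketched in a few lines. This is a deliberately simplified illustration: it substitutes the one-dimensional logistic map for whatever attractor the authors used, replaces their cell-based mapping with a nearest-point lookup, and omits the graph-theoretic interpolation step entirely. All names and parameter values are assumptions.

```python
# Simplified sketch: chaotic reordering of a symbol sequence.
# Assumptions: logistic map (r = 4) instead of the authors' attractor,
# nearest-point lookup instead of cells, no smoothing of abrupt transitions.

def logistic_trajectory(x0: float, n: int, r: float = 4.0) -> list[float]:
    """Return n successive points of the logistic map starting at x0."""
    points = []
    x = x0
    for _ in range(n):
        points.append(x)
        x = r * x * (1.0 - x)
    return points


def chaotic_variation(symbols: list[str], x0: float = 0.3, delta: float = 1e-3) -> list[str]:
    """Reorder `symbols` by following a trajectory started at x0 + delta."""
    # Forward mapping: symbol i sits at point i of the reference trajectory.
    reference = logistic_trajectory(x0, len(symbols))
    perturbed = logistic_trajectory(x0 + delta, len(symbols))
    variation = []
    for x in perturbed:
        # Inverse mapping: emit the symbol attached to the closest reference point.
        nearest = min(range(len(reference)), key=lambda i: abs(reference[i] - x))
        variation.append(symbols[nearest])
    return variation


if __name__ == "__main__":
    moves = list("ABCDEFGHIJKL")  # stand-ins for body positions
    print("original: ", "".join(moves))
    print("variation:", "".join(chaotic_variation(moves)))
```

Because the two trajectories start close together, the variation initially tracks the original and then diverges, which is the sensitive dependence the abstract relies on; every emitted symbol still comes from the original piece.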
Biosynthesis of Lipoic Acid in Arabidopsis: Cloning and Characterization of the cDNA for Lipoic Acid Synthase

PubMed Central

Yasuno, Rie; Wada, Hajime

1998-01-01

Lipoic acid is a coenzyme that is essential for the activity of enzyme complexes such as those of pyruvate dehydrogenase and glycine decarboxylase. We report here the isolation and characterization of LIP1 cDNA for lipoic acid synthase of Arabidopsis. The Arabidopsis LIP1 cDNA was isolated using an expressed sequence tag homologous to the lipoic acid synthase of Escherichia coli. This cDNA was shown to code for Arabidopsis lipoic acid synthase by its ability to complement a lipA mutant of E. coli defective in lipoic acid synthase. DNA-sequence analysis of the LIP1 cDNA revealed an open reading frame predicting a protein of 374 amino acids. Comparisons of the deduced amino acid sequence with those of E. coli and yeast lipoic acid synthase homologs showed a high degree of sequence similarity and the presence of a leader sequence presumably required for import into the mitochondria. Southern-hybridization analysis suggested that LIP1 is a single-copy gene in Arabidopsis. Western analysis with an antibody against lipoic acid synthase demonstrated that this enzyme is located in the mitochondrial compartment in Arabidopsis cells as a 43-kD polypeptide. PMID:9808738
The Role of Tetraether Lipid Composition in the Adaptation of Thermophilic Archaea to Acidity

PubMed Central

Boyd, Eric S.; Hamilton, Trinity L.; Wang, Jinxiang; He, Liu; Zhang, Chuanlun L.

2013-01-01

Diether and tetraether lipids are fundamental components of the archaeal cell membrane. Archaea adjust the degree of tetraether lipid cyclization in order to maintain functional membranes and cellular homeostasis when confronted with pH and/or thermal stress. Thus, the ability to adjust tetraether lipid composition likely represents a critical phenotypic trait that enabled archaeal diversification into environments characterized by extremes in pH and/or temperature. Here we assess the relationship between geochemical variation, core- and polar-isoprenoid glycerol dibiphytanyl glycerol tetraether (C-iGDGT and P-iGDGT, respectively) lipid composition, and archaeal 16S rRNA gene diversity and abundance in 27 geothermal springs in Yellowstone National Park, Wyoming. The composition and abundance of C-iGDGT and P-iGDGT lipids recovered from geothermal ecosystems were distinct from surrounding soils, indicating that they are synthesized endogenously. With the exception of GDGT-0 (no cyclopentyl rings), the abundances of individual C-iGDGT and P-iGDGT lipids were significantly correlated. The abundance of a number of individual tetraether lipids varied positively with the relative abundance of individual 16S rRNA gene sequences, most notably crenarchaeol in both the core and polar GDGT fraction and sequences closely affiliated with Candidatus Nitrosocaldus yellowstonii. This finding supports the proposal that crenarchaeol is a biomarker for nitrifying archaea. Variation in the degree of cyclization of C- and P-iGDGT lipids recovered from geothermal mats and sediments could best be explained by variation in spring pH, with lipids from acidic environments tending to have, on average, more internal cyclic rings than those from higher pH ecosystems. Likewise, variation in the phylogenetic composition of archaeal 16S rRNA genes could best be explained by spring pH. 
In turn, the phylogenetic similarity of archaeal 16S rRNA genes was significantly correlated with the similarity in the composition of C- and P-iGDGT lipids. Taken together, these data suggest that the ability to adjust the composition of GDGT lipid membranes played a central role in the diversification of archaea into or out of environments characterized by extremes of low pH and high temperature. PMID:23565112
[Analysis of acid rain characteristics of Lin'an Regional Background Station using long-term observation data].

PubMed

Li, Zheng-Quan; Ma, Hao; Mao, Yu-Ding; Feng, Tao

2014-02-01

Using long-term observation data of acid rain at Lin'an Regional Background Station (Lin'an RBS), this paper studied the interannual and monthly variations of acid rain, the reasons for the variations, and the relationships between acid rain and meteorological factors. Over the period from 1985 to 2012, acid rain at Lin'an RBS showed a generally increasing interannual trend, with two obvious intensifying phases and two distinct weakening phases. Over the last two decades, the monthly variation showed that rain acidity and the frequency of severe acid rain increased, while the frequency of weak acid rain decreased, moving away from July in either direction. Acid rain occurrence was affected by rainfall intensity, wind speed and wind direction. A high frequency of severe acid rain and a low frequency of weak acid rain occurred on days with drizzle, whereas a high frequency of weak acid rain and a low frequency of severe acid rain occurred on rainstorm days. As wind speed increased, the frequency of acid rain and the proportion of severe acid rain declined, and the pH value of precipitation also decreased. In addition, the daily dominant wind direction for weak acid rain was concentrated mainly in the S-W sector, whereas that for severe acid rain was more likely to lie in the N-E sector. The monthly variation of acid rain at Lin'an RBS was mainly attributed to variation in precipitation and to increases and decreases in monthly incoming wind from the SSE-WSW and NWN-ENE sectors. The interannual variation of acid rain could be due to rising energy consumption and significant green policies implemented in Zhejiang, Jiangsu and Shanghai.
Genetic Variation in Cardiomyopathy and Cardiovascular Disorders.

PubMed

McNally, Elizabeth M; Puckelwartz, Megan J

2015-01-01

With the wider deployment of massively-parallel, next-generation sequencing, it is now possible to survey human genome data for research and clinical purposes. The reduced cost of producing short-read sequencing has now shifted the burden to data analysis. Analysis of genome sequencing remains challenged by the complexity of the human genome, including redundancy and the repetitive nature of genome elements and the large amount of variation in individual genomes. Public databases of human genome sequences greatly facilitate interpretation of common and rare genetic variation, although linking database sequence information to detailed clinical information is limited by privacy and practical issues. Genetic variation is a rich source of knowledge for cardiovascular disease because many, if not all, cardiovascular disorders are highly heritable. The role of rare genetic variation in predicting risk and complications of cardiovascular diseases has been well established for hypertrophic and dilated cardiomyopathy, where the number of genes that are linked to these disorders is growing. Bolstered by family data, where genetic variants segregate with disease, rare variation can be linked to specific genetic variation that offers profound diagnostic information. Understanding genetic variation in cardiomyopathy is likely to help stratify forms of heart failure and guide therapy. Ultimately, genetic variation may be amenable to gene correction and gene editing strategies.

[Hydrologic variability and sensitivity based on Hurst coefficient and Bartels statistic].

PubMed

Lei, Xu; Xie, Ping; Wu, Zi Yi; Sang, Yan Fang; Zhao, Jiang Yan; Li, Bin Bin

2018-04-01

Due to global climate change and frequent human activities in recent years, the purely stochastic components of hydrological sequences are mixed with one or several variation components, including jump, trend, period and dependency. There is an urgent need to clarify which indices should be used to quantify the degree of their variability. In this study, we defined hydrological variability based on the Hurst coefficient and the Bartels statistic, and used Monte Carlo statistical tests to analyze their sensitivity to the different variation types. When the hydrological sequence had a jump or trend variation, both the Hurst coefficient and the Bartels statistic could reflect the variation, with the Hurst coefficient being more sensitive to weak jump or trend variation. When the sequence had a period, only the Bartels statistic could detect the variation of the sequence. When the sequence had a dependency, both the Hurst coefficient and the Bartels statistic could reflect the variation, with the latter able to detect weaker dependent variations. For all four variation types, both the Hurst variability and the Bartels variability increased as the variation range increased. Thus, they could be used to measure the variation intensity of a hydrological sequence. We analyzed the temperature series of different weather stations in the Lancang River basin. Results showed that the temperature at all stations showed an upward trend or jump, indicating that the entire basin had experienced warming in recent years, and that the temperature variability in the upper and lower reaches was much higher. This case study showed the practicability of the proposed method.
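For readers unfamiliar with the Bartels statistic, a minimal sketch follows. It computes the rank version of the von Neumann ratio (Bartels' test statistic): the sum of squared differences between successive ranks divided by the sum of squared deviations of the ranks from their mean. Values near 2 are expected for a random series, and values well below 2 point to a trend or positive dependence. How the authors convert this statistic, or the Hurst coefficient, into their "variability" index is not given in the abstract, so that step is not shown; the function names are illustrative.

```python
# Sketch of the rank von Neumann ratio used in Bartels' test for randomness.
# The mapping from this statistic to the paper's "variability" index is not shown.

def average_ranks(values: list[float]) -> list[float]:
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def bartels_rvn(series: list[float]) -> float:
    """Rank von Neumann ratio: near 2 for random data, small for trending data."""
    ranks = average_ranks(series)
    n = len(ranks)
    mean_rank = (n + 1) / 2
    denominator = sum((r - mean_rank) ** 2 for r in ranks)
    if n < 2 or denominator == 0:
        raise ValueError("need at least two distinct values")
    numerator = sum((ranks[i] - ranks[i + 1]) ** 2 for i in range(n - 1))
    return numerator / denominator


if __name__ == "__main__":
    trending = [float(v) for v in range(1, 11)]
    alternating = [1.0, 10.0, 2.0, 9.0, 3.0, 8.0, 4.0, 7.0, 5.0, 6.0]
    print("trend:      ", round(bartels_rvn(trending), 4))
    print("alternating:", round(bartels_rvn(alternating), 4))
```

A Monte Carlo sensitivity test of the kind the abstract mentions would repeat this calculation over many simulated series with a known jump, trend, period or dependency added.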
Research Associate | Center for Cancer Research

Cancer.gov

The Basic Science Program (BSP) at the Frederick National Laboratory for Cancer Research (FNLCR) pursues independent, multidisciplinary research programs in basic or applied molecular biology, immunology, retrovirology, cancer biology or human genetics. As part of the BSP, the Microbiome and Genetics Core (the Core) characterizes microbiomes by next-generation sequencing to determine their composition and variation, as influenced by immune, genetic, and host health factors. The Core provides support across a spectrum of processes, from nucleic acid isolation through bioinformatics and statistical analysis. KEY ROLES/RESPONSIBILITIES The Research Associate II will provide support in the areas of automated isolation, preparation, PCR and sequencing of DNA on next generation platforms (Illumina MiSeq and NextSeq). An opportunity exists to join the Core’s team of highly trained experimentalists and bioinformaticians working to characterize microbiome samples. The following represent requirements of the position: A minimum of five (5) years of related biomedical experience. Experience with high-throughput nucleic acid (DNA/RNA) extraction. Experience in performing PCR amplification (including quantitative real-time PCR). Experience or familiarity with robotic liquid handling protocols (especially on the Eppendorf epMotion 5073 or 5075 platforms). Experience in operating and maintaining benchtop Illumina sequencers (MiSeq and NextSeq). Ability to evaluate experimental quality and to troubleshoot molecular biology protocols. Experience with sample tracking, inventory management and biobanking. Ability to operate and communicate effectively in a team-oriented work environment.
Phylogenetic analysis of canine parvovirus isolates from Sichuan and Gansu provinces of China in 2011.

PubMed

Xu, J; Guo, H-C; Wei, Y-Q; Shu, L; Wang, J; Li, J-S; Cao, S-Z; Sun, S-Q

2015-02-01

Canine parvovirus (CPV) causes serious disease in dogs. Study of the genetic variation in emerging CPV strains is important for disease control strategy. The antigenic properties of CPV are connected with specific amino acid changes, mainly in the capsid protein VP2. This study was carried out to characterize the VP2 gene of CPV from two provinces of China in 2011. The complete VP2 genes of the CPV-positive samples were amplified and sequenced, and genetic analysis based on these VP2 genes was conducted. All of the isolates screened and sequenced in this study were typed as CPV-2a except the GS-K11 strain, which was typed as CPV-2b. Sequence comparison showed nucleotide identities of 98.8-100% among CPV strains, whereas the amino acid similarities were 99.6-100%. Compared with the reference strains, the strains isolated in this study had three distinctive amino acid changes at VP2 residues 267, 324 and 440. Of the 27 strains, fourteen (51.85%) had the 267 (Phe-Tyr) and 440 (Thr-Ala) substitutions, and all 27 (100%) had the 324 (Tyr-Ile) substitution. Phylogenetically, all of the strains isolated in this study formed a major monophyletic cluster together with one South Korean isolate, two Thailand isolates and four earlier Chinese isolates. © 2013 Blackwell Verlag GmbH.
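The identity percentages and substitution frequencies reported here follow from simple counting. The sketch below shows one common way to compute percent identity between two aligned sequences of equal length, and reproduces the 51.85% figure from 14 of 27 strains. The abstract does not state which software or gap-handling convention the authors used, so the function and the toy sequences are illustrative assumptions.

```python
# Illustrative calculation of pairwise identity and a substitution frequency.
# Assumes pre-aligned, equal-length sequences; gap handling is not addressed.

def percent_identity(seq_a: str, seq_b: str) -> float:
    """Return the percentage of aligned positions at which two sequences match."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def percent_of(count: int, total: int) -> float:
    """Return count/total as a percentage rounded to two decimals."""
    return round(100.0 * count / total, 2)


if __name__ == "__main__":
    # Toy sequences, not real VP2 data.
    print(percent_identity("ACGTACGT", "ACGTACGA"))
    # 14 of the 27 strains carried the 267 and 440 substitutions.
    print(percent_of(14, 27))
```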
Spatial Structure of the Mormon Cricket Gut Microbiome and its Predicted Contribution to Nutrition and Immune Function

PubMed Central

Smith, Chad C.; Srygley, Robert B.; Healy, Frank; Swaminath, Karthikeyan; Mueller, Ulrich G.

2017-01-01

The gut microbiome of insects plays an important role in their ecology and evolution, participating in nutrient acquisition, immunity, and behavior. Microbial community structure within the gut is heavily influenced by differences among gut regions in morphology and physiology, which determine the niches available for microbes to colonize. We present a high-resolution analysis of the structure of the gut microbiome in the Mormon cricket Anabrus simplex, an insect known for its periodic outbreaks in the western United States and nutrition-dependent mating system. The Mormon cricket microbiome was dominated by 11 taxa from the Lactobacillaceae, Enterobacteriaceae, and Streptococcaceae. While most of these were represented in all gut regions, there were marked differences in their relative abundance, with lactic-acid bacteria (Lactobacillaceae) more common in the foregut and midgut and enteric (Enterobacteriaceae) bacteria more common in the hindgut. Differences in community structure were driven by variation in the relative prevalence of three groups: a Lactobacillus in the foregut, Pediococcus lactic-acid bacteria in the midgut, and Pantoea agglomerans, an enteric bacterium, in the hindgut. These taxa have been shown to have beneficial effects on their hosts in insects and other animals by improving nutrition, increasing resistance to pathogens, and modulating social behavior. Using PICRUSt to predict gene content from our 16S rRNA sequences, we found enzymes that participate in carbohydrate metabolism and pathogen defense in other orthopterans. These were predominately represented in the hindgut and midgut, the most important sites for nutrition and pathogen defense. Phylogenetic analysis of 16S rRNA sequences from cultured isolates indicated low levels of divergence from sequences derived from plants and other insects, suggesting that these bacteria are likely to be exchanged between Mormon crickets and the environment. 
Our study shows strong spatial variation in microbiome community structure, which influences predicted gene content and thus the potential of the microbiome to influence host function. PMID:28553263
Taxonomic structure and stability of the bacterial community in Belgian sourdough ecosystems as assessed by culture and population fingerprinting.

PubMed

Scheirlinck, Ilse; Van der Meulen, Roel; Van Schoor, Ann; Vancanneyt, Marc; De Vuyst, Luc; Vandamme, Peter; Huys, Geert

2008-04-01

A total of 39 traditional sourdoughs were sampled at 11 bakeries located throughout Belgium which were visited twice with a 1-year interval. The taxonomic structure and stability of the bacterial communities occurring in these traditional sourdoughs were assessed using both culture-dependent and culture-independent methods. A total of 1,194 potential lactic acid bacterium (LAB) isolates were tentatively grouped and identified by repetitive element sequence-based PCR, followed by sequence-based identification using 16S rRNA and pheS genes from a selection of genotypically unique LAB isolates. In parallel, all samples were analyzed by denaturing gradient gel electrophoresis (DGGE) of V3-16S rRNA gene amplicons. In addition, extensive metabolite target analysis of more than 100 different compounds was performed. Both culturing and DGGE analysis showed that the species Lactobacillus sanfranciscensis, Lactobacillus paralimentarius, Lactobacillus plantarum, and Lactobacillus pontis dominated the LAB population of Belgian type I sourdoughs. In addition, DGGE band sequence analysis demonstrated the presence of Acetobacter sp. and a member of the Erwinia/Enterobacter/Pantoea group in some samples. Overall, the culture-dependent and culture-independent approaches each exhibited intrinsic limitations in assessing bacterial LAB diversity in Belgian sourdoughs. Irrespective of the LAB biodiversity, a large majority of the sugar and amino acid metabolites were detected in all sourdough samples. Principal component-based analysis of biodiversity and metabolic data revealed only little variation among the two samples of the sourdoughs produced at the same bakery. The rare cases of instability observed could generally be linked with variations in technological parameters or differences in detection capacity between culture-dependent and culture-independent approaches. 
Within a sampling interval of 1 year, this study reinforces previous observations that the bakery environment rather than the type or batch of flour largely determines the development of a stable LAB population in sourdoughs.
Taxonomic Structure and Stability of the Bacterial Community in Belgian Sourdough Ecosystems as Assessed by Culture and Population Fingerprinting

PubMed Central

Scheirlinck, Ilse; Van der Meulen, Roel; Van Schoor, Ann; Vancanneyt, Marc; De Vuyst, Luc; Vandamme, Peter; Huys, Geert

2008-01-01

A total of 39 traditional sourdoughs were sampled at 11 bakeries located throughout Belgium, which were visited twice with a 1-year interval. The taxonomic structure and stability of the bacterial communities occurring in these traditional sourdoughs were assessed using both culture-dependent and culture-independent methods. A total of 1,194 potential lactic acid bacterium (LAB) isolates were tentatively grouped and identified by repetitive element sequence-based PCR, followed by sequence-based identification using 16S rRNA and pheS genes from a selection of genotypically unique LAB isolates. In parallel, all samples were analyzed by denaturing gradient gel electrophoresis (DGGE) of V3-16S rRNA gene amplicons. In addition, extensive metabolite target analysis of more than 100 different compounds was performed. Both culturing and DGGE analysis showed that the species Lactobacillus sanfranciscensis, Lactobacillus paralimentarius, Lactobacillus plantarum, and Lactobacillus pontis dominated the LAB population of Belgian type I sourdoughs. In addition, DGGE band sequence analysis demonstrated the presence of Acetobacter sp. and a member of the Erwinia/Enterobacter/Pantoea group in some samples. Overall, the culture-dependent and culture-independent approaches each exhibited intrinsic limitations in assessing bacterial LAB diversity in Belgian sourdoughs. Irrespective of the LAB biodiversity, a large majority of the sugar and amino acid metabolites were detected in all sourdough samples. Principal component-based analysis of biodiversity and metabolic data revealed little variation between the two samples of the sourdoughs produced at the same bakery. The rare cases of instability observed could generally be linked with variations in technological parameters or differences in detection capacity between culture-dependent and culture-independent approaches. Within a sampling interval of 1 year, this study reinforces previous observations that the bakery environment rather than the type or batch of flour largely determines the development of a stable LAB population in sourdoughs. PMID:18310426
Trichoderma β-glucosidase

DOEpatents

Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

2006-01-03

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl3, and the corresponding BGL3 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL3, recombinant BGL3 proteins and methods for producing the same.
Computer-aided visualization and analysis system for sequence evaluation

DOEpatents

Chee, Mark S.

1999-10-26

A computer system (1) for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments may be improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area (814) and sample sequences in another area (816) on a display device (3).
Computer-aided visualization and analysis system for sequence evaluation

DOEpatents

Chee, Mark S.

2001-06-05

A computer system (1) for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments may be improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area (814) and sample sequences in another area (816) on a display device (3).
Carbohydrate degrading polypeptide and uses thereof

DOEpatents

Sagt, Cornelis Maria Jacobus; Schooneveld-Bergmans, Margot Elisabeth Francoise; Roubos, Johannes Andries; Los, Alrik Pieter

2015-10-20

The invention relates to a polypeptide having carbohydrate material degrading activity which comprises the amino acid sequence set out in SEQ ID NO: 2 or an amino acid sequence encoded by the nucleotide sequence of SEQ ID NO: 1 or SEQ ID NO: 4, or a variant polypeptide or variant polynucleotide thereof, wherein the variant polypeptide has at least 96% sequence identity with the sequence set out in SEQ ID NO: 2 or the variant polynucleotide encodes a polypeptide that has at least 96% sequence identity with the sequence set out in SEQ ID NO: 2. The invention features the full length coding sequence of the novel gene as well as the amino acid sequence of the full-length functional protein and functional equivalents of the gene or the amino acid sequence. The invention also relates to methods for using the polypeptide in industrial processes. Also included in the invention are cells transformed with a polynucleotide according to the invention suitable for producing these proteins.
Comparative study of the hemagglutinin and neuraminidase genes of influenza A virus H3N2, H9N2, and H5N1 subtypes using bioinformatics techniques.

PubMed

Ahn, Insung; Son, Hyeon S

2007-07-01

To investigate the genomic patterns of influenza A virus subtypes, such as H3N2, H9N2, and H5N1, we collected 1842 sequences of the hemagglutinin and neuraminidase genes from the NCBI database and parsed them into 7 categories: accession number, host species, sampling year, country, subtype, gene name, and sequence. The sequences that were isolated from the human, avian, and swine populations were extracted and stored in a MySQL database for intensive analysis. The GC content and relative synonymous codon usage (RSCU) values were calculated using JAVA codes. As a result, correspondence analysis of the RSCU values yielded the unique codon usage pattern (CUP) of each subtype and revealed no extreme differences among the human, avian, and swine isolates. H5N1 subtype viruses exhibited little variation in CUPs compared with other subtypes, suggesting that the H5N1 CUP has not yet undergone significant changes within each host species. Moreover, some observations may be relevant to CUP variation that has occurred over time among the H3N2 subtype viruses isolated from humans. All the sequences were divided into 3 groups over time, and each group seemed to have preferred synonymous codon patterns for each amino acid, especially for arginine, glycine, leucine, and valine. The bioinformatics technique we introduce in this study may be useful in predicting the evolutionary patterns of pandemic viruses.
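The two statistics named in this abstract can be sketched briefly. The following is a minimal illustration in Python rather than the authors' Java code, assuming the standard genetic code and the usual definition of RSCU (observed codon count divided by the count expected if all synonymous codons for that amino acid were used equally). The function names `gc_content` and `rscu` are hypothetical.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption: nuclear/standard table).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def gc_content(seq):
    """Fraction of G and C bases in a nucleotide sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq) if seq else 0.0


def rscu(seq):
    """Relative synonymous codon usage for an in-frame coding sequence.

    Returns a dict codon -> RSCU for every amino acid observed at least once.
    Stop codons and codons containing ambiguous bases are ignored.
    """
    seq = seq.upper().replace("U", "T")
    codons = [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group codons into synonymous families, one per amino acid.
    family = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            family[aa].append(codon)

    values = {}
    for aa, synonyms in family.items():
        total = sum(counts[c] for c in synonyms)
        if total == 0:
            continue  # amino acid not present in this sequence
        expected = total / len(synonyms)  # count per codon under equal usage
        for c in synonyms:
            values[c] = counts[c] / expected
    return values
```

An RSCU above 1 marks a codon used more often than expected under equal usage, and a value below 1 marks one used less often. A correspondence analysis such as the one the abstract describes would take these per-codon values as its input.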
Automated Sanger Analysis Pipeline (ASAP): A Tool for Rapidly Analyzing Sanger Sequencing Data with Minimum User Interference.

PubMed

Singh, Aditya; Bhatia, Prateek

2016-12-01

Sanger sequencing platforms, such as Applied Biosystems instruments, generate chromatogram files. Generally, a region of a sequence is sequenced with both forward and reverse primers, which yields 2 sequences that need to be aligned and a consensus generated before mutation detection studies. This work is cumbersome and takes time, especially if the gene is large with many exons. Hence, we devised a rapid automated command system to filter, build, and align consensus sequences and also optionally extract exonic regions, translate them in all frames, and perform an amino acid alignment starting from raw sequence data within a very short time. At its full capabilities, the Automated Sanger Analysis Pipeline (ASAP) is able to read "*.ab1" chromatogram files through a command line interface, convert them to the FASTQ format, trim the low-quality regions, reverse-complement the reverse sequence, create a consensus sequence, extract the exonic regions using a reference exonic sequence, translate the sequence in all frames, and align the nucleic acid and amino acid sequences to reference nucleic acid and amino acid sequences, respectively. All files are created and can be used for further analysis. ASAP is available as a Python 3.x executable at https://github.com/aditya-88/ASAP. The version described in this paper is 0.28.
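Two of the steps listed in this abstract, reverse-complementing the reverse read and translating a sequence in all frames, can be sketched in a few lines of Python. This is an illustrative sketch only and is not taken from the ASAP code base. The function names are hypothetical, and the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (assumption).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence; other characters pass through unchanged."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate from the first base; unknown codons become 'X', stops become '*'."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translations(seq):
    """Translate both strands in all three reading frames.

    Keys are '+1', '+2', '+3' for the given strand and '-1', '-2', '-3'
    for its reverse complement.
    """
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames
```

For example, `six_frame_translations("ATGGCCTAA")["+1"]` yields `"MA*"`. Reading the chromatogram, trimming by quality and building the consensus would come before this step and are not shown.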
Nucleic acid analysis using terminal-phosphate-labeled nucleotides

DOEpatents

Korlach, Jonas [Ithaca, NY]; Webb, Watt W [Ithaca, NY]; Levene, Michael [Ithaca, NY]; Turner, Stephen [Ithaca, NY]; Craighead, Harold G [Ithaca, NY]; Foquet, Mathieu [Ithaca, NY]

2008-04-22

The present invention is directed to a method of sequencing a target nucleic acid molecule having a plurality of bases. In its principle, the temporal order of base additions during the polymerization reaction is measured on a molecule of nucleic acid, i.e. the activity of a nucleic acid polymerizing enzyme on the template nucleic acid molecule to be sequenced is followed in real time. The sequence is deduced by identifying which base is being incorporated into the growing complementary strand of the target nucleic acid by the catalytic activity of the nucleic acid polymerizing enzyme at each step in the sequence of base additions. A polymerase on the target nucleic acid molecule complex is provided in a position suitable to move along the target nucleic acid molecule and extend the oligonucleotide primer at an active site. A plurality of labelled types of nucleotide analogs are provided proximate to the active site, with each distinguishable type of nucleotide analog being complementary to a different nucleotide in the target nucleic acid sequence. The growing nucleic acid strand is extended by using the polymerase to add a nucleotide analog to the nucleic acid strand at the active site, where the nucleotide analog being added is complementary to the nucleotide of the target nucleic acid at the active site. The nucleotide analog added to the oligonucleotide primer as a result of the polymerizing step is identified. The steps of providing labelled nucleotide analogs, polymerizing the growing nucleic acid strand, and identifying the added nucleotide analog are repeated so that the nucleic acid strand is further extended and the sequence of the target nucleic acid is determined.
Analysis of Vibrio cholerae Genome Sequences Reveals Unique rtxA Variants in Environmental Strains and an rtxA-Null Mutation in Recent Altered El Tor Isolates

PubMed Central

Dolores, Jazel; Satchell, Karla J. F.

2013-01-01

ABSTRACT Vibrio cholerae genome sequences were analyzed for variation in the rtxA gene that encodes the multifunctional autoprocessing RTX (MARTX) toxin. To accommodate genomic analysis, a discrepancy in the annotated rtxA start site was resolved experimentally. The correct start site is an ATG downstream from rtxC resulting in a gene of 13,638 bp and deduced protein of 4,545 amino acids. Among the El Tor O1 and closely related O139 and O37 genomes, rtxA was highly conserved, with nine alleles differing by only 1 to 6 nucleotides in 100 years. In contrast, 12 alleles from environment-associated isolates are highly variable, at 1 to 3% by nucleotide and 3 to 7% by amino acid. The difference in variation rates did not represent a bias for conservation of the El Tor rtxA compared to that of other strains but rather reflected the lack of gene variation in overall genomes. Three alleles were identified that would affect the function of the MARTX toxin. Two environmental isolates carry novel arrangements of effector domains. These include a variant from RC385 that would suggest an adenylate cyclase toxin and from HE-09 that may have actin ADP-ribosylating activity. Within the recently emerged altered El Tor strains that have a classical ctxB gene, a mutation arose in rtxA that introduces a premature stop codon that disabled toxin function. This null mutant is the genetic background for subsequent emergence of the ctxB7 allele resulting in the strain that spread into Haiti in 2010. Thus, similar to classical strains, the altered El Tor pandemic strains eliminated rtxA after acquiring a classical ctxB. PMID:23592265
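The gene and protein lengths quoted in this abstract are consistent with each other, as a short calculation shows. The Python sketch below assumes the 13,638 bp figure includes the stop codon. The helper names are hypothetical.

```python
def protein_length(gene_bp, includes_stop=True):
    """Number of amino acids encoded by an open reading frame of gene_bp bases."""
    if gene_bp % 3 != 0:
        raise ValueError("gene length is not a multiple of 3")
    codons = gene_bp // 3
    return codons - 1 if includes_stop else codons


def percent_divergence(differences, length):
    """Differences per site, expressed as a percentage."""
    return 100.0 * differences / length


# 13,638 bp / 3 = 4,546 codons; removing the stop codon leaves 4,545 residues.
rtxa_residues = protein_length(13638)

# The most divergent El Tor allele (6 nucleotide differences) as a percentage
# of the gene, for comparison with the 1 to 3% reported for environmental alleles.
el_tor_max = percent_divergence(6, 13638)
```

Under this assumption, 6 differences over 13,638 bp comes to about 0.04%, well below the 1 to 3% nucleotide divergence reported for the environment-associated alleles.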
DETERMINATE and LATE FLOWERING are two TERMINAL FLOWER1/CENTRORADIALIS homologs that control two distinct phases of flowering initiation and development in pea.

PubMed

Foucher, Fabrice; Morin, Julie; Courtiade, Juliette; Cadioux, Sandrine; Ellis, Noel; Banfield, Mark J; Rameau, Catherine

2003-11-01

Genes in the TERMINAL FLOWER1 (TFL1)/CENTRORADIALIS family are important key regulatory genes involved in the control of flowering time and floral architecture in several different plant species. To understand the functions of TFL1 homologs in pea, we isolated three TFL1 homologs, which we have designated PsTFL1a, PsTFL1b, and PsTFL1c. By genetic mapping and sequencing of mutant alleles, we demonstrate that PsTFL1a corresponds to the DETERMINATE (DET) gene and PsTFL1c corresponds to the LATE FLOWERING (LF) gene. DET acts to maintain the indeterminacy of the apical meristem during flowering, and consistent with this role, DET expression is limited to the shoot apex after floral initiation. LF delays the induction of flowering by lengthening the vegetative phase, and allelic variation at the LF locus is an important component of natural variation for flowering time in pea. The most severe class of alleles flowers early and carries either a deletion of the entire PsTFL1c gene or an amino acid substitution. Other natural and induced alleles for LF, with an intermediate flowering time phenotype, present no changes in the PsTFL1c amino acid sequence but affect LF transcript level in the shoot apex: low LF transcript levels are correlated with early flowering, and high LF transcript levels are correlated with late flowering. Thus, different TFL1 homologs control two distinct aspects of plant development in pea, whereas a single gene, TFL1, performs both functions in Arabidopsis. These results show that different species have evolved different strategies to control key developmental transitions and also that the genetic basis for natural variation in flowering time may differ among plant species.
Insight into biases and sequencing errors for amplicon sequencing with the Illumina MiSeq platform.

PubMed

Schirmer, Melanie; Ijaz, Umer Z; D'Amore, Rosalinda; Hall, Neil; Sloan, William T; Quince, Christopher

2015-03-31

With read lengths of currently up to 2 × 300 bp, high throughput and low sequencing costs, Illumina's MiSeq is becoming one of the most utilized sequencing platforms worldwide. The platform is manageable and affordable even for smaller labs. This enables quick turnaround on a broad range of applications such as targeted gene sequencing, metagenomics, small genome sequencing and clinical molecular diagnostics. However, Illumina error profiles are still poorly understood and programs are therefore not designed for the idiosyncrasies of Illumina data. A better knowledge of the error patterns is essential for sequence analysis and vital if we are to draw valid conclusions. Studying true genetic variation in a population sample is fundamental for understanding diseases, evolution and origin. We conducted a large study on the error patterns for the MiSeq based on 16S rRNA amplicon sequencing data. We tested state-of-the-art library preparation methods for amplicon sequencing and showed that the library preparation method and the choice of primers are the most significant sources of bias and cause distinct error patterns. Furthermore, we tested the efficiency of various error correction strategies and identified quality trimming (Sickle) combined with error correction (BayesHammer) followed by read overlapping (PANDAseq) as the most successful approach, reducing substitution error rates on average by 93%. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Method for high-volume sequencing of nucleic acids: random and directed priming with libraries of oligonucleotides

DOEpatents

Studier, F. William

1995-04-18

Random and directed priming methods for determining nucleotide sequences by enzymatic sequencing techniques, using libraries of primers of lengths 8, 9 or 10 bases, are disclosed. These methods permit direct sequencing of nucleic acids as large as 45,000 base pairs or larger without the necessity for subcloning. Individual primers are used repeatedly to prime sequence reactions in many different nucleic acid molecules. Libraries containing as few as 10,000 octamers, 14,200 nonamers, or 44,000 decamers would have the capacity to determine the sequence of almost any cosmid DNA. Random priming with a fixed set of primers from a smaller library can also be used to initiate the sequencing of individual nucleic acid molecules, with the sequence being completed by directed priming with primers from the library. In contrast to random cloning techniques, a combined random and directed priming strategy is far more efficient.
Method for high-volume sequencing of nucleic acids: random and directed priming with libraries of oligonucleotides

DOEpatents

Studier, F.W.

1995-04-18

Random and directed priming methods for determining nucleotide sequences by enzymatic sequencing techniques, using libraries of primers of lengths 8, 9 or 10 bases, are disclosed. These methods permit direct sequencing of nucleic acids as large as 45,000 base pairs or larger without the necessity for subcloning. Individual primers are used repeatedly to prime sequence reactions in many different nucleic acid molecules. Libraries containing as few as 10,000 octamers, 14,200 nonamers, or 44,000 decamers would have the capacity to determine the sequence of almost any cosmid DNA. Random priming with a fixed set of primers from a smaller library can also be used to initiate the sequencing of individual nucleic acid molecules, with the sequence being completed by directed priming with primers from the library. In contrast to random cloning techniques, a combined random and directed priming strategy is far more efficient. 2 figs.
Identification of canine parvovirus with the Q370R point mutation in the VP2 gene from a giant panda (Ailuropoda melanoleuca)

PubMed Central

2013-01-01

Background: In this study, we sequenced and phylogenetically analyzed the VP2 genes from twelve canine parvovirus (CPV) strains obtained from eleven domestic dogs and a giant panda (Ailuropoda melanoleuca) in China. A novel CPV was detected in the giant panda. Results: Nucleotide and phylogenetic analysis of the capsid protein VP2 gene classified the CPV as a new CPV-2a type. Substitution of Gln by Arg at the conserved residue 370 (Q370R) presents an unusual variation in the new CPV-2a amino acid sequence of the giant panda and is further evidence for the continuing evolution of the virus. Conclusions: These findings extend the knowledge on CPV molecular epidemiology, of particular relevance to wild carnivores. PMID:23706032
Complete mitochondrial genome of Xingguo red carp (Cyprinus carpio var. singuonensis) and purse red carp (Cyprinus carpio var. wuyuanensis).

PubMed

Hu, Guang-Fu; Liu, Xiang-Jiang; Li, Zhong; Liang, Hong-Wei; Hu, Shao-Na; Zou, Gui-Wei

2016-01-01

The complete mitochondrial genomes of Xingguo red carp (Cyprinus carpio var. singuonensis) and purse red carp (Cyprinus carpio var. wuyuanensis) were sequenced. Comparison of these two mitochondrial genomes revealed that the mtDNAs of these two common carp varieties were remarkably similar in genome length, gene order and content, and AT content. However, the two mitochondrial genomes showed 39 site differences overall: 2 were located in rRNAs, 3 in tRNAs, 3 in the control region, and 31 in protein-coding genes. The 31 variable bases in the protein-coding regions of the two varieties' mitochondrial sequences led to three variable amino acids, which were mainly located in the proteins ND5 and ND4.

A robust and cost-effective approach to sequence and analyze complete genomes of small RNA viruses

USDA-ARS's Scientific Manuscript database

Background: Next-generation sequencing (NGS) allows ultra-deep sequencing of nucleic acids. The use of sequence-independent amplification of viral nucleic acids without utilization of target-specific primers provides advantages over traditional sequencing methods and allows detection of unsuspected ...
β-glucosidase 5 (BGL5) compositions

DOEpatents

Dunn-Coleman, Nigel; Goedegebuur, Frits; Ward, Michael; Yao, Jian

2010-06-01

The present invention provides a novel β-glucosidase nucleic acid sequence, designated bgl5, and the corresponding BGL5 amino acid sequence. The invention also provides expression vectors and host cells comprising a nucleic acid sequence encoding BGL5, recombinant BGL5 proteins and methods for producing the same.
Methods of diagnosing Alagille syndrome

DOEpatents

Li, Linheng; Hood, Leroy; Krantz, Ian D.; Spinner, Nancy B.

2004-03-09

The present invention provides an isolated polypeptide exhibiting substantially the same amino acid sequence as JAGGED, or an active fragment thereof, provided that the polypeptide does not have the amino acid sequence of SEQ ID NO:5 or SEQ ID NO:6. The invention further provides an isolated nucleic acid molecule containing a nucleotide sequence encoding substantially the same amino acid sequence as JAGGED, or an active fragment thereof, provided that the nucleotide sequence does not encode the amino acid sequence of SEQ ID NO:5 or SEQ ID NO:6. Also provided herein is a method of inhibiting differentiation of hematopoietic progenitor cells by contacting the progenitor cells with an isolated JAGGED polypeptide, or active fragment thereof. The invention additionally provides a method of diagnosing Alagille Syndrome in an individual. The method consists of detecting an Alagille Syndrome disease-associated mutation linked to a JAGGED locus.
Sequence variation and phylogenetic analysis of envelope glycoprotein of hepatitis G virus.

PubMed

Lim, M Y; Fry, K; Yun, A; Chong, S; Linnen, J; Fung, K; Kim, J P

1997-11-01

A transfusion-transmissible agent provisionally designated hepatitis G virus (HGV) was recently identified. In this study, we examined the variability of the HGV genome by analysing sequences in the putative envelope region from 72 isolates obtained from diverse geographical sources. The 1561 nucleotide sequence of the E1/E2/NS2a region of HGV was determined from 12 isolates, and compared with three published sequences. The most variability was observed in 400 nucleotides at the N terminus of E2. We next analysed this 400 nucleotide envelope variable region (EV) from an additional 60 HGV isolates. This sequence varied considerably among the 75 isolates, with overall identity ranging from 79.3% to 99.5% at the nucleotide level, and from 83.5% to 100% at the amino acid level. However, hypervariable regions were not identified. Phylogenetic analyses indicated that the 75 HGV isolates belong to a single genotype. A single-tier distribution of evolutionary distances was observed among the 15 E1/E2/NS2a sequences and the 75 EV sequences. In contrast, 11 isolates of HCV were analysed and showed a three-tiered distribution, representing genotypes, subtypes, and isolates. The 75 isolates of HGV fell into four clusters on the phylogenetic tree. Tight geographical clustering was observed among the HGV isolates from Japan and Korea.
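Identity ranges such as the 79.3% to 99.5% reported here are typically obtained by comparing every pair of aligned sequences. The Python sketch below shows one common way to do that, assuming the sequences are already aligned to equal length and that columns containing a gap in either sequence are skipped. The abstract does not state how gaps were handled, and the function names are hypothetical.

```python
from itertools import combinations


def percent_identity(a, b, gap="-"):
    """Percent identity between two aligned sequences of equal length.

    Columns where either sequence has a gap are excluded from the comparison.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper()) if x != gap and y != gap]
    if not pairs:
        raise ValueError("no comparable columns")
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def identity_range(aligned):
    """Lowest and highest pairwise identity across two or more aligned sequences."""
    values = [percent_identity(a, b) for a, b in combinations(aligned, 2)]
    return min(values), max(values)
```

With 75 isolates this amounts to 2,775 pairwise comparisons. The same functions work on aligned amino acid sequences.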
DNA-binding proteins from marine bacteria expand the known sequence diversity of TALE-like repeats.

PubMed

de Lange, Orlando; Wolf, Christina; Thiel, Philipp; Krüger, Jens; Kleusch, Christian; Kohlbacher, Oliver; Lahaye, Thomas

2015-11-16

Transcription Activator-Like Effectors (TALEs) of Xanthomonas bacteria are programmable DNA binding proteins with unprecedented target specificity. Comparative studies into TALE repeat structure and function are hindered by the limited sequence variation among TALE repeats. More sequence-diverse TALE-like proteins are known from Ralstonia solanacearum (RipTALs) and Burkholderia rhizoxinica (Bats), but RipTAL and Bat repeats are conserved with those of TALEs around the DNA-binding residue. We study two novel marine-organism TALE-like proteins (MOrTL1 and MOrTL2), the first to date of non-terrestrial origin. We have assessed their DNA-binding properties and modelled repeat structures. We found that repeats from these proteins mediate sequence specific DNA binding conforming to the TALE code, despite low sequence similarity to TALE repeats, and with novel residues around the BSR. However, MOrTL1 repeats show greater sequence discriminating power than MOrTL2 repeats. Sequence alignments show that there are only three residues conserved between repeats of all TALE-like proteins including the two new additions. This conserved motif could prove useful as an identifier for future TALE-likes. Additionally, comparing MOrTL repeats with those of other TALE-likes suggests a common evolutionary origin for the TALEs, RipTALs and Bats. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Visualizing bacterial tRNA identity determinants and antideterminants using function logos and inverse function logos

PubMed Central

Freyhult, Eva; Moulton, Vincent; Ardell, David H.

2006-01-01

Sequence logos are stacked bar graphs that generalize the notion of consensus sequence. They employ entropy statistics very effectively to display variation in a structural alignment of sequences of a common function, while emphasizing its over-represented features. Yet sequence logos cannot display features that distinguish functional subclasses within a structurally related superfamily nor do they display under-represented features. We introduce two extensions to address these needs: function logos and inverse logos. Function logos display subfunctions that are over-represented among sequences carrying a specific feature. Inverse logos generalize both sequence logos and function logos by displaying under-represented, rather than over-represented, features or functions in structural alignments. To make inverse logos, a compositional inverse is applied to the feature or function frequency distributions before logo construction, where a compositional inverse is a mathematical transform that makes common features or functions rare and vice versa. We applied these methods to a database of structurally aligned bacterial tDNAs to create highly condensed, birds-eye views of potentially all so-called identity determinants and antideterminants that confer specific amino acid charging or initiator function on tRNAs in bacteria. We recovered both known and a few potentially novel identity elements. Function logos and inverse logos are useful tools for exploratory bioinformatic analysis of structure–function relationships in sequence families and superfamilies. PMID:16473848
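The entropy statistic behind an ordinary sequence logo can be sketched as follows. This Python example covers only the conventional sequence logo calculation (information content per alignment column, with letter heights proportional to frequency) and omits the small-sample correction. It does not implement the function logos or inverse logos introduced in this paper, and the function names are hypothetical.

```python
import math
from collections import Counter


def column_information(column, alphabet="ACGU"):
    """Information content (bits) and letter heights for one alignment column.

    Information is log2(alphabet size) minus the Shannon entropy of the
    observed letter frequencies; each letter's height is its frequency
    times that information. Characters outside the alphabet are ignored.
    """
    counts = Counter(ch for ch in column.upper() if ch in alphabet)
    n = sum(counts.values())
    if n == 0:
        return 0.0, {}
    freqs = {ch: counts[ch] / n for ch in alphabet if counts[ch]}
    entropy = -sum(f * math.log2(f) for f in freqs.values())
    info = math.log2(len(alphabet)) - entropy
    heights = {ch: f * info for ch, f in freqs.items()}
    return info, heights


def logo_heights(alignment, alphabet="ACGU"):
    """Apply column_information to every column of an equal-length alignment."""
    return [column_information("".join(col), alphabet) for col in zip(*alignment)]
```

A fully conserved RNA column carries 2 bits and a column with all four bases at equal frequency carries 0 bits, which is why conserved positions stand tall in a logo.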
Purification and properties of an O-acetyl-transferase from Escherichia coli that can O-acetylate polysialic acid sequences

DOE Office of Scientific and Technical Information (OSTI.GOV)

Higa, H.; Varki, A.

1986-05-01

Certain strains of bacteria synthesize an outer polysialic acid (K1) capsule. Some strains of K1+ E. coli are also capable of adding O-acetyl esters to the exocyclic hydroxyl groups of the sialic acid residues. Both the capsule and the O-acetyl modification have been correlated with differences in antigenicity and pathogenicity. The authors have developed an assay for an O-acetyl-transferase in E. coli that transfers O-[3H]acetyl groups from [3H]acetyl-Coenzyme A to colominic acid (fragments of the polysialic acid capsule). Using this assay, the enzyme was solubilized, and purified approx. 600-fold using a single affinity chromatography step with Procion Red-A Agarose. The enzyme also binds to Coenzyme A Sepharose, and can be eluted with high salt or Coenzyme A. The partially purified enzyme has a pH optimum of 7.0 - 7.5, is unaffected by divalent cations, is inhibited by high salt concentrations, is inhibited by Coenzyme A (50% inhibition at 100 μM), and shows an apparent Km for colominic acid of 3.7 mM (sialic acid concentration). This enzyme could be involved in the O-acetyl +/- form variation seen in some strains of K1+ E. coli.
Characterization of Bacteroides forsythus Strains from Cat and Dog Bite Wounds in Humans and Comparison with Monkey and Human Oral Strains

PubMed Central

Hudspeth, M. K.; Gerardo, S. Hunt; Maiden, M. F. J.; Citron, D. M.; Goldstein, E. J. C.

1999-01-01

Bacteroides forsythus strains recovered from cat and dog bite wound infections in humans (n = 3), monkey oral strains (n = 3), and the human oral ATCC 43037 type strain were characterized by using phenotypic characteristics, enzymatic tests, whole cell fatty acid analysis, sodium dodecyl sulfate-polyacrylamide gel electrophoresis (SDS-PAGE) analysis, PCR fingerprinting, and 16S rDNA (genes coding for rRNA) sequencing. All three bite wound isolates grew on brucella agar supplemented with 5% sheep blood, vitamin K1, and hemin. These strains, unlike the ATCC strain and previously described monkey oral and human clinical strains, did not require N-acetylmuramic acid supplementation for growth as pure cultures. However, their phenotypic characteristics, except for catalase production, were similar to those of previously identified strains. PCR fingerprinting analysis showed differences in band patterns from the ATCC strain. Also, SDS-PAGE and whole cell fatty acid analysis indicated that the dog and cat bite wound strains were similar but not identical to the human B. forsythus ATCC 43037 type strain and the monkey oral strains. The rDNA sequence analysis indicated that the three bite wound isolates had 99.93% homology with each other and 98.9 and 99.22% homology with the human ATCC 43037 and monkey oral strains, respectively. These results suggest that there are host-specific variations within each group. PMID:10325363
αIIbβ3 variants defined by next-generation sequencing: Predicting variants likely to cause Glanzmann thrombasthenia

PubMed Central

Buitrago, Lorena; Rendon, Augusto; Liang, Yupu; Simeoni, Ilenia; Negri, Ana; Filizola, Marta; Ouwehand, Willem H.; Coller, Barry S.; Alessi, Marie-Christine; Ballmaier, Matthias; Bariana, Tadbir; Bellissimo, Daniel; Bertoli, Marta; Bray, Paul; Bury, Loredana; Carrell, Robin; Cattaneo, Marco; Collins, Peter; French, Deborah; Favier, Remi; Freson, Kathleen; Furie, Bruce; Germeshausen, Manuela; Ghevaert, Cedric; Gomez, Keith; Goodeve, Anne; Gresele, Paolo; Guerrero, Jose; Hampshire, Dan J.; Hadinnapola, Charaka; Heemskerk, Johan; Henskens, Yvonne; Hill, Marian; Hogg, Nancy; Johnsen, Jill; Kahr, Walter; Kerr, Ron; Kunishima, Shinji; Laffan, Michael; Natwani, Amit; Neerman-Arbez, Marguerite; Nurden, Paquita; Nurden, Alan; Ormiston, Mark; Othman, Maha; Ouwehand, Willem; Perry, David; Vilk, Shoshana Ravel; Reitsma, Pieter; Rondina, Matthew; Simeoni, Ilenia; Smethurst, Peter; Stephens, Jonathan; Stevenson, William; Szkotak, Artur; Turro, Ernest; Van Geet, Christel; Vries, Minka; Ward, June; Waye, John; Westbury, Sarah; Whiteheart, Sidney; Wilcox, David; Zhang, Bi

2015-01-01

Next-generation sequencing is transforming our understanding of human genetic variation but assessing the functional impact of novel variants presents challenges. We analyzed missense variants in the integrin αIIbβ3 receptor subunit genes ITGA2B and ITGB3 identified by whole-exome or -genome sequencing in the ThromboGenomics project, comprising ∼32,000 alleles from 16,108 individuals. We analyzed the results in comparison with 111 missense variants in these genes previously reported as being associated with Glanzmann thrombasthenia (GT), 20 associated with alloimmune thrombocytopenia, and 5 associated with aniso/macrothrombocytopenia. We identified 114 novel missense variants in ITGA2B (affecting ∼11% of the amino acids) and 68 novel missense variants in ITGB3 (affecting ∼9% of the amino acids). Of the variants, 96% had minor allele frequencies (MAF) < 0.1%, indicating their rarity. Based on sequence conservation, MAF, and location on a complete model of αIIbβ3, we selected three novel variants that affect amino acids previously associated with GT for expression in HEK293 cells. αIIb P176H and β3 C547G severely reduced αIIbβ3 expression, whereas αIIb P943A partially reduced αIIbβ3 expression and had no effect on fibrinogen binding. We used receiver operating characteristic curves of combined annotation-dependent depletion, Polyphen 2-HDIV, and sorting intolerant from tolerant to estimate the percentage of novel variants likely to be deleterious. At optimal cut-off values, which had 69–98% sensitivity in detecting GT mutations, between 27% and 71% of the novel αIIb or β3 missense variants were predicted to be deleterious. Our data have implications for understanding the evolutionary pressure on αIIbβ3 and highlight the challenges in predicting the clinical significance of novel missense variants. PMID:25827233
Amino acid and structural variability of Yersinia pestis LcrV protein

DOE Office of Scientific and Technical Information (OSTI.GOV)

Anisimov, A P; Dentovskaya, S V; Panfertsev, E A

2009-11-09

The LcrV protein is a multifunctional virulence factor and protective antigen of the plague bacterium which is generally conserved between the epidemic strains of Yersinia pestis. They investigated the diversity in the LcrV sequences among non-epidemic Y. pestis strains which have a limited virulence in selected animal models and for humans. Sequencing of lcrV genes from ten Y. pestis strains belonging to different phylogenetic groups (subspecies) showed that the LcrV proteins possess four major variable hotspots at positions 18, 72, 273, and 324-326. These major variations, together with other minor substitutions in amino acid sequences, allowed them to classify the LcrV alleles into five sequence types (A-E). They observed that the strains of different Y. pestis subspecies can have the same type of LcrV, and different types of LcrV can exist within the same natural plague focus. The LcrV polymorphisms were structurally analyzed by comparing the modeled structures of LcrV from all available strains. All changes except one occurred either in flexible regions or on the surface of the protein, but local chemical properties (i.e. those of a hydrophobic, hydrophilic, amphipathic, or charged nature) were conserved across all of the strains. Polymorphisms in flexible and surface regions are likely subject to less selective pressure, and have a limited impact on the structure. In contrast, the substitution of tryptophan at position 113 with either glutamic acid or glycine likely has a serious influence on the regional structure of the protein, and these mutations might have an effect on the function of LcrV. The polymorphisms at positions 18, 72 and 273 were accountable for differences in oligomerization of LcrV. The importance of the latter property in emergence of epidemic strains of Y. pestis during evolution of this pathogen will need to be further investigated.
Effects of age, sex, and genotype on high-sensitivity metabolomic profiles in the fruit fly, Drosophila melanogaster

PubMed Central

Hoffman, Jessica M; Soltow, Quinlyn A; Li, Shuzhao; Sidik, Alfire; Jones, Dean P; Promislow, Daniel E L

2014-01-01

Researchers have used whole-genome sequencing and gene expression profiling to identify genes associated with age, in the hope of understanding the underlying mechanisms of senescence. But there is a substantial gap from variation in gene sequences and expression levels to variation in age or life expectancy. In an attempt to bridge this gap, here we describe the effects of age, sex, genotype, and their interactions on high-sensitivity metabolomic profiles in the fruit fly, Drosophila melanogaster. Among the 6800 features analyzed, we found that over one-quarter of all metabolites were significantly associated with age, sex, genotype, or their interactions, and multivariate analysis shows that individual metabolomic profiles are highly predictive of these traits. Using a metabolomic equivalent of gene set enrichment analysis, we identified numerous metabolic pathways that were enriched among metabolites associated with age, sex, and genotype, including pathways involving sugar and glycerophospholipid metabolism, neurotransmitters, amino acids, and the carnitine shuttle. Our results suggest that high-sensitivity metabolomic studies have excellent potential not only to reveal mechanisms that lead to senescence, but also to help us understand differences in patterns of aging among genotypes and between males and females. PMID:24636523
Considerable MHC Diversity Suggests That the Functional Extinction of Baiji Is Not Related to Population Genetic Collapse

PubMed Central

Xu, Shixia; Ju, Jianfeng; Zhou, Xuming; Wang, Lian; Zhou, Kaiya; Yang, Guang

2012-01-01

To further extend our understanding of the mechanism causing the current nearly extinct status of the baiji (Lipotes vexillifer), one of the most critically endangered species in the world, genetic diversity at the major histocompatibility complex (MHC) class II DRB locus was investigated in the baiji. Nine highly divergent DRB alleles were identified in 17 samples, with an average of 28.4 (13.2%) nucleotide differences and 16.7 (23.5%) amino acid differences between alleles. The unexpectedly high levels of DRB allelic diversity in the baiji may partly be attributable to its evolutionary adaptations to the freshwater environment, which is regarded as having a higher parasite diversity than the marine environment. In addition, balancing selection was found to be the main mechanism generating sequence diversity at the baiji DRB gene. Considerable sequence variation at the adaptive MHC genes, despite a significant loss of neutral genetic variation in the baiji genome, might suggest that intense selection has overpowered random genetic drift as the main evolutionary force, which further suggests that the critically endangered or nearly extinct status of the baiji is not an outcome of genetic collapse. PMID:22272349
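The average between-allele difference quoted in this abstract is, in general terms, a mean over all pairs of aligned alleles. The following is a minimal Python sketch of that calculation, not the authors' pipeline. The three toy sequences and the function names are hypothetical and are not baiji DRB data.

```python
from itertools import combinations


def pairwise_differences(a, b):
    """Count positions at which two aligned, equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def mean_pairwise_difference(alleles):
    """Return (mean number of differing sites, mean proportion of differing sites)."""
    pairs = list(combinations(alleles, 2))
    if not pairs:
        raise ValueError("need at least two alleles")
    diffs = [pairwise_differences(a, b) for a, b in pairs]
    mean_diff = sum(diffs) / len(diffs)
    return mean_diff, mean_diff / len(alleles[0])


# Hypothetical toy alleles, for illustration only.
toy_alleles = ["ACGTACGT", "ACGTTCGT", "ACCTTCGA"]
mean_diff, mean_prop = mean_pairwise_difference(toy_alleles)
print(mean_diff, mean_prop)
```

The same function applies to translated amino acid sequences, which is how a separate amino acid figure would be obtained alongside the nucleotide one.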
Proteogenomic Analysis of Polymorphisms and Gene Annotation Divergences in Prokaryotes using a Clustered Mass Spectrometry-Friendly Database

PubMed Central

de Souza, Gustavo A.; Arntzen, Magnus Ø.; Fortuin, Suereta; Schürch, Anita C.; Målen, Hiwa; McEvoy, Christopher R. E.; van Soolingen, Dick; Thiede, Bernd; Warren, Robin M.; Wiker, Harald G.

2011-01-01

Precise annotation of genes or open reading frames is still a difficult task that results in divergence even for data generated from the same genomic sequence. This has an impact on further proteomic studies, and also compromises the characterization of clinical isolates with many specific genetic variations that may not be represented in the selected database. We recently developed software called multistrain mass spectrometry prokaryotic database builder (MSMSpdbb) that can merge protein databases from several sources and be applied to any prokaryotic organism, in a proteomic-friendly approach. We generated a database for the Mycobacterium tuberculosis complex (using three strains of Mycobacterium bovis and five of M. tuberculosis), and analyzed data collected from two laboratory strains and two clinical isolates of M. tuberculosis. We identified 2561 proteins, of which 24 were present in M. tuberculosis H37Rv samples, but not annotated in the M. tuberculosis H37Rv genome. We were also able to identify 280 nonsynonymous single amino acid polymorphisms and confirm 367 translational start sites. As a proof of concept we applied the database to whole-genome DNA sequencing data of one of the clinical isolates, which allowed the validation of 116 predicted single amino acid polymorphisms and the annotation of 131 N-terminal start sites. Moreover, we identified regions not present in the original M. tuberculosis H37Rv sequence, indicating strain divergence or errors in the reference sequence. In conclusion, we demonstrated the potential of using a merged database to better characterize laboratory or clinical bacterial strains. PMID:21030493
Complete amino acid sequence of bovine colostrum low-Mr cysteine proteinase inhibitor.

PubMed

Hirado, M; Tsunasawa, S; Sakiyama, F; Niinobe, M; Fujii, S

1985-07-01

The complete amino acid sequence of bovine colostrum cysteine proteinase inhibitor was determined by sequencing native inhibitor and peptides obtained by cyanogen bromide degradation, Achromobacter lysylendopeptidase digestion and partial acid hydrolysis of reduced and S-carboxymethylated protein. Achromobacter peptidase digestion was successfully used to isolate two disulfide-containing peptides. The inhibitor consists of 112 amino acids with an Mr of 12787. Two disulfide bonds were established between Cys 66 and Cys 77 and between Cys 90 and Cys 110. A high degree of homology in the sequence was found between the colostrum inhibitor and human gamma-trace, human salivary acidic protein and chicken egg-white cystatin.
Detection and isolation of nucleic acid sequences using competitive hybridization probes

DOEpatents

Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

1997-01-01

A method for detecting a target nucleic acid sequence in a sample is provided using hybridization probes which competitively hybridize to a target nucleic acid. According to the method, a target nucleic acid sequence is hybridized to first and second hybridization probes which are complementary to overlapping portions of the target nucleic acid sequence, the first hybridization probe including a first complexing agent capable of forming a binding pair with a second complexing agent and the second hybridization probe including a detectable marker. The first complexing agent attached to the first hybridization probe is contacted with a second complexing agent, the second complexing agent being attached to a solid support such that when the first and second complexing agents are attached, target nucleic acid sequences hybridized to the first hybridization probe become immobilized on to the solid support. The immobilized target nucleic acids are then separated and detected by detecting the detectable marker attached to the second hybridization probe. A kit for performing the method is also provided.
Detection and isolation of nucleic acid sequences using competitive hybridization probes

DOEpatents

Lucas, J.N.; Straume, T.; Bogen, K.T.

1997-04-01

A method for detecting a target nucleic acid sequence in a sample is provided using hybridization probes which competitively hybridize to a target nucleic acid. According to the method, a target nucleic acid sequence is hybridized to first and second hybridization probes which are complementary to overlapping portions of the target nucleic acid sequence, the first hybridization probe including a first complexing agent capable of forming a binding pair with a second complexing agent and the second hybridization probe including a detectable marker. The first complexing agent attached to the first hybridization probe is contacted with a second complexing agent, the second complexing agent being attached to a solid support such that when the first and second complexing agents are attached, target nucleic acid sequences hybridized to the first hybridization probe become immobilized on to the solid support. The immobilized target nucleic acids are then separated and detected by detecting the detectable marker attached to the second hybridization probe. A kit for performing the method is also provided. 7 figs.
Lactic Acid Bacteria in Durum Wheat Flour Are Endophytic Components of the Plant during Its Entire Life Cycle.

PubMed

Minervini, Fabio; Celano, Giuseppe; Lattanzi, Anna; Tedone, Luigi; De Mastro, Giuseppe; Gobbetti, Marco; De Angelis, Maria

2015-10-01

This study aimed at assessing the dynamics of lactic acid bacteria and other Firmicutes associated with durum wheat organs and processed products. 16S rRNA gene-based high-throughput sequencing showed that Lactobacillus, Streptococcus, Enterococcus, and Lactococcus were the main epiphytic and endophytic genera among lactic acid bacteria. Bacillus, Exiguobacterium, Paenibacillus, and Staphylococcus completed the picture of the core genus microbiome. The relative abundance of each lactic acid bacterium genus was affected by cultivars, phenological stages, other Firmicutes genera, environmental temperature, and water activity (aw) of plant organs. Lactobacilli, showing the highest sensitivity to aw, markedly decreased during milk development (Odisseo) and physiological maturity (Saragolla). At these stages, Lactobacillus was mainly replaced by Streptococcus, Lactococcus, and Enterococcus. However, a key sourdough species, Lactobacillus plantarum, was associated with plant organs during the life cycle of Odisseo and Saragolla wheat. The composition of the sourdough microbiota and the overall quality of leavened baked goods are also determined throughout the phenological stages of wheat cultivation, with variations depending on environmental and agronomic factors. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
Lactic Acid Bacteria in Durum Wheat Flour Are Endophytic Components of the Plant during Its Entire Life Cycle

PubMed Central

Minervini, Fabio; Celano, Giuseppe; Lattanzi, Anna; Tedone, Luigi; De Mastro, Giuseppe; De Angelis, Maria

2015-01-01

This study aimed at assessing the dynamics of lactic acid bacteria and other Firmicutes associated with durum wheat organs and processed products. 16S rRNA gene-based high-throughput sequencing showed that Lactobacillus, Streptococcus, Enterococcus, and Lactococcus were the main epiphytic and endophytic genera among lactic acid bacteria. Bacillus, Exiguobacterium, Paenibacillus, and Staphylococcus completed the picture of the core genus microbiome. The relative abundance of each lactic acid bacterium genus was affected by cultivars, phenological stages, other Firmicutes genera, environmental temperature, and water activity (aw) of plant organs. Lactobacilli, showing the highest sensitivity to aw, markedly decreased during milk development (Odisseo) and physiological maturity (Saragolla). At these stages, Lactobacillus was mainly replaced by Streptococcus, Lactococcus, and Enterococcus. However, a key sourdough species, Lactobacillus plantarum, was associated with plant organs during the life cycle of Odisseo and Saragolla wheat. The composition of the sourdough microbiota and the overall quality of leavened baked goods are also determined throughout the phenological stages of wheat cultivation, with variations depending on environmental and agronomic factors. PMID:26187970
Relationship between amino acid changes in mitochondrial ATP6 and life-history variation in anguillid eels.

PubMed

Jacobsen, Magnus W; Pujolar, José Martin; Hansen, Michael M

2015-03-01

Mitochondrial genes are part of the oxidative phosphorylation pathway and important for energy production. Although evidence for positive selection at the mitochondrial level exists, few studies have investigated the link between amino acid changes and phenotype. Here we test the hypothesis that differences in two life-history related traits, migratory distance between spawning and foraging areas and larval phase duration, are associated with divergent selection within the mitochondrial ATP6 gene in anguillid eels. We compare amino acid changes among 18 species with the sequence of the putative ancestral species, believed to have shown short migratory distance and larval phase duration. We find positive correlations between both life-history related traits and (i) the number of amino acid changes and (ii) the strength of the combined physico-chemical and structural changes at positions previously identified as candidates for positive selection. This supports a link between genotype and phenotype driven by positive selection at ATP6. © 2015 The Author(s) Published by the Royal Society. All rights reserved.
Geographic origin is not supported by the genetic variability found in a large living collection of Jatropha curcas with accessions from three continents

PubMed Central

Maghuly, Fatemeh; Jankowicz-Cieslak, Joanna; Pabinger, Stephan; Till, Bradley J; Laimer, Margit

2015-01-01

Increasing economic interest in Jatropha curcas requires a major research focus on the genetic background and geographic origin of this non-edible biofuel crop. To determine the worldwide genetic structure of this species, amplified fragment length polymorphisms, inter simple sequence repeats, and novel single nucleotide polymorphisms (SNPs) were employed for a large collection of 907 J. curcas accessions and related species (RS) from three continents, 15 countries and 53 regions. PCoA, phenogram, and cophenetic analyses separated RS from two J. curcas groups. Accessions from Mexico, Bolivia, Paraguay, Kenya, and Ethiopia with unknown origins were found in both groups. In general, there was a considerable overlap between individuals from different regions and countries. The Bayesian approach using STRUCTURE demonstrated two groups with a low genetic variation. Analysis of molecular variance revealed significant variation among individuals within populations. SNPs found by in silico analyses of Δ12 fatty acid desaturase indicated possible changes in gene expression and thus in fatty acid profiles. SNP variation was higher in the curcin gene compared to genes involved in oil production. Novel SNPs allowed separating toxic, non-toxic, and Mexican accessions. The present study confirms that human activities had a major influence on the genetic diversity of J. curcas, not only because of domestication, but also because of biased selection. PMID:25511658

Genetic variation of pfhrp2 in Plasmodium falciparum isolates from Yemen and the performance of HRP2-based malaria rapid diagnostic test.

PubMed

Atroosh, Wahib M; Al-Mekhlafi, Hesham M; Al-Jasari, Adel; Sady, Hany; Al-Delaimy, Ahmed K; Nasr, Nabil A; Dawaki, Salwa; Abdulsalam, Awatif M; Ithoi, Init; Lau, Yee Ling; Fong, Mun Yik; Surin, Johari

2015-07-22

The genetic variation in the Plasmodium falciparum histidine-rich protein 2 (pfhrp2) gene that may compromise the use of pfhrp2-based rapid diagnostic tests (RDTs) for the diagnosis of malaria was assessed in P. falciparum isolates from Yemen. This study was conducted in Hodeidah and Al-Mahwit governorates, Yemen. A total of 622 individuals with fever were examined for malaria by CareStart malaria HRP2-RDT and Giemsa-stained thin and thick blood films. The pfhrp2 gene was amplified and sequenced from 180 isolates, and subjected to amino acid repeat type analysis. A total of 188 (30.2%) participants were found positive for P. falciparum by the RDT. Overall, 12 different amino acid repeat types were identified in Yemeni isolates. Six repeat types, namely types 1, 2, 6, 7, 10 and 12, were detected in all the isolates (100%), while types 9 and 11 were not detected in any of the isolates. Moreover, the sensitivity and specificity of the PfHRP2-based RDTs used were high (90.5% and 96.1%, respectively). The present study provides data on the genetic variation within the pfhrp2 gene, and its potential impact on the PfHRP2-based RDTs commonly used in Yemen. CareStart Malaria HRP2-based RDT showed high sensitivity and specificity in endemic areas of Yemen.
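For readers unfamiliar with the accuracy measures quoted in this abstract, the sketch below shows the standard definitions of sensitivity and specificity in Python. Only the 188 of 622 RDT-positive count comes from the abstract. The 2x2 counts (tp, fn, tn, fp) are hypothetical, because the abstract does not report the underlying table, and the function names are illustrative.

```python
def sensitivity(tp, fn):
    """True positives divided by all reference-positive samples."""
    return tp / (tp + fn)


def specificity(tn, fp):
    """True negatives divided by all reference-negative samples."""
    return tn / (tn + fp)


def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Counts stated in the abstract: 188 RDT-positive out of 622 examined.
rdt_positive_rate = percent(188, 622)

# Hypothetical 2x2 counts, for illustration only (not from the study).
tp, fn, tn, fp = 90, 10, 95, 5
print(rdt_positive_rate, sensitivity(tp, fn), specificity(tn, fp))
```

Both measures are computed against a reference method, so the choice of reference determines which samples count as true positives and true negatives.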
How Many Protein Sequences Fold to a Given Structure? A Coevolutionary Analysis.

PubMed

Tian, Pengfei; Best, Robert B

2017-10-17

Quantifying the relationship between protein sequence and structure is key to understanding the protein universe. A fundamental measure of this relationship is the total number of amino acid sequences that can fold to a target protein structure, known as the "sequence capacity," which has been suggested as a proxy for how designable a given protein fold is. Although sequence capacity has been extensively studied using lattice models and theory, numerical estimates for real protein structures are currently lacking. In this work, we have quantitatively estimated the sequence capacity of 10 proteins with a variety of different structures using a statistical model based on residue-residue co-evolution to capture the variation of sequences from the same protein family. Remarkably, we find that even for the smallest protein folds, such as the WW domain, the number of foldable sequences is extremely large, exceeding the Avogadro constant. In agreement with earlier theoretical work, the calculated sequence capacity is positively correlated with the size of the protein, or better, the density of contacts. This allows the absolute sequence capacity of a given protein to be approximately predicted from its structure. On the other hand, the relative sequence capacity, i.e., normalized by the total number of possible sequences, is an extremely tiny number and is strongly anti-correlated with the protein length. Thus, although there may be more foldable sequences for larger proteins, it will be much harder to find them. Lastly, we have correlated the evolutionary age of proteins in the CATH database with their sequence capacity as predicted by our model. The results suggest a trade-off between the opposing requirements of high designability and the likelihood of a novel fold emerging by chance. Published by Elsevier Inc.
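The distinction this abstract draws between absolute and relative sequence capacity can be illustrated with a short calculation. The relative capacity is the number of foldable sequences divided by the total number of possible sequences, which is 20^L for a protein of L residues. The Python sketch below works in log10 to avoid overflow. The length of 35 residues and the log10 capacity of 24 are hypothetical inputs chosen for illustration, not values reported by the authors.

```python
import math


def log10_total_sequences(length, alphabet_size=20):
    """log10 of the number of possible sequences of a given length."""
    return length * math.log10(alphabet_size)


def log10_relative_capacity(log10_capacity, length, alphabet_size=20):
    """log10 of (foldable sequences / all possible sequences)."""
    return log10_capacity - log10_total_sequences(length, alphabet_size)


# Hypothetical inputs: a 35-residue domain with 10^24 foldable sequences.
length = 35
log10_capacity = 24.0

log10_total = log10_total_sequences(length)
log10_relative = log10_relative_capacity(log10_capacity, length)
print(round(log10_total, 3), round(log10_relative, 3))
```

Because the denominator grows as 20^L, the relative capacity shrinks rapidly with length even when the absolute capacity grows, which matches the anti-correlation described in the abstract.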
LOVD: easy creation of a locus-specific sequence variation database using an "LSDB-in-a-box" approach.

PubMed

Fokkema, Ivo F A C; den Dunnen, Johan T; Taschner, Peter E M

2005-08-01

The completion of the human genome project has initiated, as well as provided the basis for, the collection and study of all sequence variation between individuals. Direct access to up-to-date information on sequence variation is currently provided most efficiently through web-based, gene-centered, locus-specific databases (LSDBs). We have developed the Leiden Open (source) Variation Database (LOVD) software approaching the "LSDB-in-a-Box" idea for the easy creation and maintenance of a fully web-based gene sequence variation database. LOVD is platform-independent and uses PHP and MySQL open source software only. The basic gene-centered and modular design of the database follows the recommendations of the Human Genome Variation Society (HGVS) and focuses on the collection and display of DNA sequence variations. With minimal effort, the LOVD platform is extendable with clinical data. The open set-up should both facilitate and promote functional extension with scripts written by the community. The LOVD software is freely available from the Leiden Muscular Dystrophy pages (www.DMD.nl/LOVD/). To promote the use of LOVD, we currently offer curators the possibility to set up an LSDB on our Leiden server. (c) 2005 Wiley-Liss, Inc.
Population genetic structure and natural selection of Plasmodium falciparum apical membrane antigen-1 in Myanmar isolates.

PubMed

Kang, Jung-Mi; Lee, Jinyoung; Moe, Mya; Jun, Hojong; Lê, Hương Giang; Kim, Tae Im; Thái, Thị Lam; Sohn, Woon-Mok; Myint, Moe Kyaw; Lin, Khin; Shin, Ho-Joon; Kim, Tong-Soo; Na, Byoung-Kuk

2018-02-07

Plasmodium falciparum apical membrane antigen-1 (PfAMA-1) is one of the leading blood stage malaria vaccine candidates. However, genetic variation and antigenic diversity identified in global PfAMA-1 are major hurdles in the development of an effective vaccine based on this antigen. In this study, genetic structure and the effect of natural selection of PfAMA-1 among Myanmar P. falciparum isolates were analysed. Blood samples were collected from 58 Myanmar patients with falciparum malaria. Full-length PfAMA-1 gene was amplified by polymerase chain reaction and cloned into a TA cloning vector. The PfAMA-1 gene of each isolate was sequenced. Polymorphic characteristics and the effect of natural selection were analysed using DNASTAR, MEGA4, and DnaSP programs. Polymorphic nature and natural selection in 459 global PfAMA-1 were also analysed. Thirty-seven different haplotypes of PfAMA-1 were identified in 58 Myanmar P. falciparum isolates. Most amino acid changes identified in Myanmar PfAMA-1 were found in domains I and III. Overall patterns of amino acid changes in Myanmar PfAMA-1 were similar to those in global PfAMA-1. However, frequencies of amino acid changes differed by country. Novel amino acid changes in Myanmar PfAMA-1 were also identified. Evidence for natural selection and recombination events was observed in global PfAMA-1. Among 51 commonly identified amino acid changes in global PfAMA-1 sequences, 43 were found in predicted RBC-binding sites, B-cell epitopes, or IUR regions. Myanmar PfAMA-1 showed similar patterns of nucleotide diversity and amino acid polymorphisms compared to those of global PfAMA-1. Balancing natural selection and intragenic recombination across PfAMA-1 are likely to play major roles in generating genetic diversity in global PfAMA-1. Most common amino acid changes in global PfAMA-1 were located in predicted B-cell epitopes where high levels of nucleotide diversity and balancing natural selection were found. These results highlight the strong selective pressure of host immunity on the PfAMA-1 gene. These results have significant implications in understanding the nature of Myanmar PfAMA-1 along with global PfAMA-1. They also provide useful information for the development of an effective malaria vaccine based on this antigen.
Prevalence, antimicrobial resistance and genetic diversity of Campylobacter coli and Campylobacter jejuni in Ecuadorian broilers at slaughter age

PubMed Central

Vinueza-Burgos, Christian; Wautier, Magali; Martiny, Delphine; Cisneros, Marco; Van Damme, Inge; De Zutter, Lieven

2017-01-01

Thermotolerant Campylobacter spp. are a major cause of foodborne gastrointestinal infections worldwide. The linkage of human campylobacteriosis and poultry has been widely described. In this study we aimed to investigate the prevalence, antimicrobial resistance and genetic diversity of C. coli and C. jejuni in broilers from Ecuador. Caecal content from 379 randomly selected broiler batches originating from 115 farms was collected from 6 slaughterhouses located in the province of Pichincha during 1 year. Microbiological isolation was performed by direct plating on mCCDA agar. Identification of Campylobacter species was done by PCR. Minimum inhibitory concentration (MIC) values for gentamicin, ciprofloxacin, nalidixic acid, tetracycline, streptomycin, and erythromycin were obtained. Genetic variation was assessed by RFLP-flaA typing and Multilocus Sequence Typing (MLST) of selected isolates. Prevalence at batch level was 64.1%. Of the positive batches 68.7% were positive for C. coli, 18.9% for C. jejuni, and 12.4% for C. coli and C. jejuni. Resistance rates above 67% were shown for tetracycline, ciprofloxacin, and nalidixic acid. The resistance pattern tetracycline, ciprofloxacin, and nalidixic acid was the dominant one in both Campylobacter species. RFLP-flaA typing analysis showed that C. coli and C. jejuni strains belonged to 38 and 26 profiles respectively. On the other hand MLST typing revealed that C. coli except one strain belonged to CC-828, while C. jejuni except 2 strains belonged to 12 assigned clonal complexes (CCs). Furthermore 4 new sequence types (STs) for both species were described, whereby 2 new STs for C. coli were based on new allele sequences. Further research is necessary to estimate the impact of the slaughter of Campylobacter positive broiler batches on the contamination level of carcasses in slaughterhouses and at retail in Ecuador. PMID:28339716
Formation and hydrolysis of amide bonds by lipase A from Candida antarctica; exceptional features.

PubMed

Liljeblad, Arto; Kallio, Pauli; Vainio, Marita; Niemi, Jarmo; Kanerva, Liisa T

2010-02-21

Various commercial lyophilized and immobilized preparations of lipase A from Candida antarctica (CAL-A) were studied for their ability to catalyze the hydrolysis of amide bonds in N-acylated alpha-amino acids, 3-butanamidobutanoic acid (beta-amino acid) and its ethyl ester. The activity toward amide bonds is highly untypical of lipases, despite the close mechanistic analogy to amidases which normally catalyze the corresponding reactions. Most CAL-A preparations cleaved amide bonds of various substrates with high enantioselectivity, although high variations in substrate selectivity and catalytic rates were detected. The possible role of contaminant protein species on the hydrolytic activity toward these bonds was studied by fractionation and analysis of the commercial lyophilized preparation of CAL-A (Cat#ICR-112, Codexis). In addition to minor impurities, two equally abundant proteins were detected, migrating on SDS-PAGE a few kDa apart around the calculated size of CAL-A. Based on peptide fragment analysis and sequence comparison both bands shared substantial sequence coverage with CAL-A. However, peptides at the C-terminal end constituting a motile domain described as an active-site flap were not identified in the smaller fragment. Separated gel filtration fractions of the two forms of CAL-A both catalyzed the amide bond hydrolysis of ethyl 3-butanamidobutanoate as well as the N-acylation of methyl pipecolinate. Hydrolytic activity towards N-acetylmethionine was, however, solely confined to the fractions containing the truncated form of CAL-A. These fractions were also found to contain a trace enzyme impurity identified in sequence analysis as a serine carboxypeptidase. The possible role of catalytic impurities versus the function of CAL-A in amide bond hydrolysis is further discussed in the paper.
Comparative Analysis of Genome Sequences Covering the Seven Cronobacter Species

PubMed Central

Cummings, Craig A.; Shih, Rita; Degoricija, Lovorka; Rico, Alain; Brzoska, Pius; Hamby, Stephen E.; Masood, Naqash; Hariri, Sumyya; Sonbol, Hana; Chuzhanova, Nadia; McClelland, Michael; Furtado, Manohar R.; Forsythe, Stephen J.

2012-01-01

Background: Species of Cronobacter are widespread in the environment and are occasional food-borne pathogens associated with serious neonatal diseases, including bacteraemia, meningitis, and necrotising enterocolitis. The genus is composed of seven species: C. sakazakii, C. malonaticus, C. turicensis, C. dublinensis, C. muytjensii, C. universalis, and C. condimenti. Clinical cases are associated with three species, C. malonaticus, C. turicensis and, in particular, with C. sakazakii multilocus sequence type 4. Thus, it is plausible that virulence determinants have evolved in certain lineages. Methodology/Principal Findings: We generated high quality sequence drafts for eleven Cronobacter genomes representing the seven Cronobacter species, including an ST4 strain of C. sakazakii. Comparative analysis of these genomes together with the two publicly available genomes revealed Cronobacter has over 6,000 genes in one or more strains and over 2,000 genes shared by all Cronobacter. Considerable variation in the presence of traits such as type six secretion systems, metal resistance (tellurite, copper and silver), and adhesins was found. C. sakazakii is unique in the Cronobacter genus in encoding genes enabling the utilization of exogenous sialic acid which may have clinical significance. The C. sakazakii ST4 strain 701 contained additional genes as compared to other C. sakazakii but none of them were known specific virulence-related genes. Conclusions/Significance: Genome comparison revealed that pair-wise DNA sequence identity varies between 89 and 97% in the seven Cronobacter species, and also suggested various degrees of divergence. Sets of universal core genes and accessory genes unique to each strain were identified. These gene sequences can be used for designing genus/species specific detection assays. Genes encoding adhesins, T6SS, and metal resistance genes as well as prophages are found in only subsets of genomes and have contributed considerably to the variation of genomic content. Differences in gene content likely contribute to differences in the clinical and environmental distribution of species and sequence types. PMID:23166675
Abundant raw material for cis-regulatory evolution in humans

NASA Technical Reports Server (NTRS)

Rockman, Matthew V.; Wray, Gregory A.

2002-01-01

Changes in gene expression and regulation--due in particular to the evolution of cis-regulatory DNA sequences--may underlie many evolutionary changes in phenotypes, yet little is known about the distribution of such variation in populations. We present in this study the first survey of experimentally validated functional cis-regulatory polymorphism. These data are derived from more than 140 polymorphisms involved in the regulation of 107 genes in Homo sapiens, the eukaryote species with the most available data. We find that functional cis-regulatory variation is widespread in the human genome and that the consequent variation in gene expression is twofold or greater for 63% of the genes surveyed. Transcription factor-DNA interactions are highly polymorphic, and regulatory interactions have been gained and lost within human populations. On average, humans are heterozygous at more functional cis-regulatory sites (>16,000) than at amino acid positions (<13,000), in part because of an overrepresentation among the former in multiallelic tandem repeat variation, especially (AC)(n) dinucleotide microsatellites. The role of microsatellites in gene expression variation may provide a larger store of heritable phenotypic variation, and a more rapid mutational input of such variation, than has been realized. Finally, we outline the distinctive consequences of cis-regulatory variation for the genotype-phenotype relationship, including ubiquitous epistasis and genotype-by-environment interactions, as well as underappreciated modes of pleiotropy and overdominance. Ordinary small-scale mutations contribute to pervasive variation in transcription rates and consequently to patterns of human phenotypic variation.
Neocortical malformation as consequence of nonadaptive regulation of neuronogenetic sequence

NASA Technical Reports Server (NTRS)

Caviness, V. S. Jr; Takahashi, T.; Nowakowski, R. S.

2000-01-01

Variations in the structure of the neocortex induced by single gene mutations may be extreme or subtle. They differ from variations in neocortical structure encountered across and within species in that these "normal" structural variations are adaptive (both structurally and behaviorally), whereas those associated with disorders of development are not. Here we propose that they also differ in principle in that they represent disruptions of molecular mechanisms that are not normally regulatory to variations in the histogenetic sequence. We propose an algorithm for the operation of the neuronogenetic sequence in relation to the overall neocortical histogenetic sequence and highlight the restriction point of the G1 phase of the cell cycle as the master regulatory control point for normal coordinate structural variation across species and importantly within species. From considerations based on the anatomic evidence from neocortical malformation in humans, we illustrate in principle how this overall sequence appears to be disrupted by molecular biological linkages operating principally outside the control mechanisms responsible for the normal structural variation of the neocortex. MRDD Research Reviews 6:22-33, 2000. Copyright 2000 Wiley-Liss, Inc.
Recombinant follicle-stimulating hormone: new biotechnology for infertility.

PubMed

Prevost, R R

1998-01-01

The frequency of infertility in developed countries is approximately 8-10%. New drugs are available for assisted reproduction techniques. Two recombinant follicle-stimulating hormone (FSH) products, follitropin-beta (Follistim in the United States, Puregon in Europe) and follitropin-alpha (Gonal-F), join compounds derived through transfecting nonhuman cell lines with genetic material capable of replicating identical amino acid sequences to human compounds. The cell line used for recombinant (r)-FSH production is the Chinese hamster ovary (CHO). Previously, the only agents that showed benefit in controlled ovulatory stimulation were derived from the urine of menopausal women. Those compounds contain additional substances, such as urinary proteins and various amounts of luteinizing hormone. The amino acid sequence of r-FSH is identical to that of human FSH, but the two recombinant products exist in many different isoforms and differ from each other and from human FSH due to varied carbohydrate side chains. Due to variation in the carbohydrate side chains, follitropin-beta in solution has a higher pH than urine-derived FSH, which enhances receptor affinity and therefore is a greater inducer of folliculogenesis. Follitropin-beta does not cause endogenous production of anti-CHO or anti-FSH antibodies, and is well tolerated.
Association of HLA-DRB1 genetic variants with the persistence of atopic dermatitis

PubMed Central

Margolis, David J.; Mitra, Nandita; Kim, Brian; Gupta, Jayanta; Hoffstad, Ole J; Papadopoulos, Maryte; Wubbenhorst, Bradley; Nathanson, Katherine L; Duke, Jamie L.; Monos, Dimtri.; Kamoun, Malek

2015-01-01

Atopic dermatitis (AD) is a waxing and waning illness of childhood that is likely caused by interactions between an altered skin barrier and immune dysregulation. The goal of our study was to evaluate the association of DRB1 genetic variants and the persistence of AD using whole exome sequencing and high resolution typing. DRB1 was interrogated based on previous reports that utilized high throughput techniques. We evaluated an ongoing nation-wide long-term cohort of children with AD in which patients are asked every 6 months about their medication use and their AD symptoms. In total, 87 African-American and 50 European-American children were evaluated. Genetic association analysis was performed using a software tool focusing on amino acid variable positions shared by HLA-DRB1 alleles covering the antigen presenting domain. Amino acid variations at position 9 (pocket 9), position 26, and position 78 (pocket 4) were marginally associated with the prevalence of AD. However, the odds ratio was 0.30 (0.14, 0.68; p=0.003) for residue 78, 0.27 (0.10, 0.69; p=0.006) for residue 26 and not significant for residue 9 with respect to the persistence of AD. In conclusion, amino acid variations at peptide-binding pockets of HLA-DRB1 were associated with the persistence of AD in African-American children. PMID:26307177
Genetic Characterization of Influenza A (H1N1) Pandemic 2009 Virus Isolates from Mumbai.

PubMed

Gohil, Devanshi; Kothari, Sweta; Shinde, Pramod; Meharunkar, Rhuta; Warke, Rajas; Chowdhary, Abhay; Deshmukh, Ranjana

2017-08-01

Pandemic influenza A (H1N1) 2009 virus was first detected in India in May 2009 and subsequently became endemic in many parts of the country. Influenza A viruses can evade the immune response through antigenic variation. The study aims to characterize influenza A (H1N1) pdm 09 viruses circulating in Mumbai during the pandemic and post-pandemic period. Nasopharyngeal swabs positive for influenza A (H1N1) pdm 09 viruses were inoculated on Madin-Darby canine kidney cell line for virus isolation. Molecular and phylogenetic analysis of influenza A (H1N1) pdm 09 isolates was conducted to understand the evolution and genetic diversity of the strains. Nucleotide and amino acid sequences of the HA gene of Mumbai isolates, when compared to the A/California/07/2009 vaccine strain, revealed 14 specific amino acid differences located at the antigenic sites. Amino acid variations in the HA and NA genes resulted in changes in the N-linked glycosylation motif, which may lead to immune evasion. Phylogenetic analysis of the isolates revealed their evolutionary relationship to the vaccine strain A/California/07/2009, from which they had gradually diverged. The findings in the present study confirm genetic variability of influenza viruses and highlight the importance of continuous surveillance during influenza outbreaks.
A comparative gene analysis with rice identified orthologous group II HKT genes and their association with Na(+) concentration in bread wheat.

PubMed

Ariyarathna, H A Chandima K; Oldach, Klaus H; Francki, Michael G

2016-01-19

Although the HKT transporter genes ascertain some of the key determinants of crop salt tolerance mechanisms, the diversity and functional role of group II HKT genes are not clearly understood in bread wheat. The advanced knowledge on rice HKT and whole genome sequence was, therefore, used in comparative gene analysis to identify orthologous wheat group II HKT genes and their role in trait variation under different saline environments. The four group II HKTs in rice identified two orthologous gene families from bread wheat, including the known TaHKT2;1 gene family and a new distinctly different gene family designated as TaHKT2;2. A single copy of TaHKT2;2 was found on each homeologous chromosome arm 7AL, 7BL and 7DL and each gene was expressed in leaf blade, sheath and root tissues under non-stressed and at 200 mM salt stressed conditions. The proteins encoded by genes of the TaHKT2;2 family revealed more than 93% amino acid sequence identity but ≤52% amino acid identity compared to the proteins encoded by TaHKT2;1 family. Specifically, variations in known critical domains predicted functional differences between the two protein families. Similar to orthologous rice genes on chromosome 6L, TaHKT2;1 and TaHKT2;2 genes were located approximately 3 kb apart on wheat chromosomes 7AL, 7BL and 7DL, forming a static syntenic block in the two species. The chromosomal region on 7AL containing TaHKT2;1 7AL-1 co-located with QTL for shoot Na(+) concentration and yield in some saline environments. The differences in copy number, gene sequences and encoded proteins between TaHKT2;2 homeologous genes and other group II HKT gene families within and across species likely reflect functional diversity for ion selectivity and transport in plants. Evidence indicated that neither TaHKT2;2 nor TaHKT2;1 was associated with primary root Na(+) uptake but TaHKT2;1 may be associated with trait variation for Na(+) exclusion and yield in some but not all saline environments.
A maize landrace that emits defense volatiles in response to herbivore eggs possesses a strongly inducible terpene synthase gene.

PubMed

Tamiru, Amanuel; Bruce, Toby J A; Richter, Annett; Woodcock, Christine M; Midega, Charles A O; Degenhardt, Jörg; Kelemu, Segenet; Pickett, John A; Khan, Zeyaur R

2017-04-01

Maize (Zea mays) emits volatile terpenes in response to insect feeding and egg deposition to defend itself against harmful pests. However, maize cultivars differ strongly in their ability to produce the defense signal. To further understand the agroecological role and underlying genetic mechanisms for variation in terpene emission among maize cultivars, we studied the production of an important signaling component, (E)-caryophyllene, in a South American maize landrace, Braz1006, possessing a stemborer (Chilo partellus) egg-inducible defense trait, in comparison with the European maize line Delprim and North American inbred line B73. The (E)-caryophyllene production level and transcript abundance of TPS23, the terpene synthase responsible for (E)-caryophyllene formation, were compared between Braz1006, Delprim, and B73 after mimicked herbivory. Braz1006-TPS23 was heterologously expressed in E. coli, and amino acid sequences were determined. Furthermore, electrophysiological and behavioral responses of a key parasitic wasp, Cotesia sesamiae, to C. partellus egg-induced Braz1006 volatiles were determined using coupled gas chromatography electroantennography and olfactometer bioassay studies. After elicitor treatment, Braz1006 released eightfold higher (E)-caryophyllene than Delprim, whereas no (E)-caryophyllene was detected in B73. The superior (E)-caryophyllene production by Braz1006 was positively correlated with high transcript levels of TPS23 in the landrace compared to Delprim. TPS23 alleles from Braz1006 showed dissimilarities at different sequence positions with Delprim and B73 and encode an active enzyme. Cotesia sesamiae was attracted to egg-induced volatiles from Braz1006 and synthetic (E)-caryophyllene. The variation in (E)-caryophyllene emission between Braz1006 and Delprim is positively correlated with induced levels of TPS23 transcripts.
The enhanced TPS23 activity and corresponding (E)-caryophyllene production by the maize landrace could be attributed to the differences in amino acid sequence with the other maize lines. This study suggested that the same analogous genes could have contrasting expression patterns in different maize genetic backgrounds. The current findings provide valuable insight not only into genetic mechanisms underlying variation in defense signal production but also the prospect of introgressing the novel defense traits into elite maize varieties for effective and ecologically sound protection of crops against damaging insect pests.
Fermentation and microbial population dynamics during the ensiling of native grass and subsequent exposure to air.

PubMed

Zhang, Qing; Wu, Baiyila; Nishino, Naoki; Wang, Xianguo; Yu, Zhu

2016-03-01

To study the microbial population and fermentation dynamics of large needlegrass (LN) and Chinese leymus (CL) during ensiling and subsequent exposure to air, silages were sampled and analyzed using culture-based techniques and denaturing gradient gel electrophoresis (DGGE). A total of 112 lactic acid bacteria (LAB) strains were isolated and identified using the 16S rRNA sequencing method. Lactic acid was not detected in the first 20 days in LN silage and the pH decreased to 6.13 after 45 days of ensiling. The temperature of the LN silage increased after approximately 30 h of air exposure and the CL silage showed a slight temperature variation. Enterococcus spp. were mainly present in LN silage. The proportion of Lactobacillus brevis in CL silage increased after exposure to air. LN silage with a higher proportion of Enterococcus spp. and propionic acid concentration did not show higher fermentation quality or aerobic stability than CL silage, which had a higher concentration of acetic acid, butyric acid and increased proportion of L. brevis after exposure to air. © 2015 Japanese Society of Animal Science.
Assessment for Melting Temperature Measurement of Nucleic Acid by HRM

PubMed Central

2016-01-01

High resolution melting (HRM), which is sensitive enough to distinguish nucleic acid species with small variations, has been widely applied in mutation scanning, methylation analysis, and genotyping. To extend HRM to the evaluation of the sequence dependence of the thermal stability of nucleic acid secondary structures, we investigated the effects of the dye EvaGreen, metal ions, and impurities (such as dNTPs) on melting temperature (Tm) measurement by HRM. The accuracy of HRM was assessed against the UV melting method, and little difference between the two methods was found when the DNA Tm was higher than 40°C. Both insufficient and excessive EvaGreen were found to give a slightly higher Tm, showing that the proportion of dye should be considered for precise Tm measurement of nucleic acids. Finally, the HRM method was also successfully used to measure the Tm values of DNA triplex, hairpin, and RNA duplex. In conclusion, HRM can be applied in the evaluation of thermal stability of nucleic acid (DNA or RNA) or secondary structural elements (even when dNTPs are present). PMID:27833775
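As an illustration of how a Tm is commonly read off a melt curve, the sketch below takes the temperature at which the negative derivative of fluorescence, -dF/dT, peaks. This is a generic convention and not necessarily the procedure used in the study; the function name `melting_temperature` and the synthetic sigmoidal curve are hypothetical.

```python
import math

def melting_temperature(temps, fluorescence):
    """Return the temperature where -dF/dT is largest (central differences)."""
    best_t, best_slope = None, float("-inf")
    for i in range(1, len(temps) - 1):
        # Negative slope of fluorescence around point i.
        slope = -(fluorescence[i + 1] - fluorescence[i - 1]) / (temps[i + 1] - temps[i - 1])
        if slope > best_slope:
            best_t, best_slope = temps[i], slope
    return best_t

# Synthetic melt curve (assumption): sigmoid centred at 62 degrees C, 0.1 degree steps.
temps = [40 + 0.1 * i for i in range(501)]
curve = [1.0 / (1.0 + math.exp((t - 62.0) / 1.5)) for t in temps]
tm = melting_temperature(temps, curve)
print(round(tm, 1))
```

Real HRM traces are noisy, so smoothing before differentiation is usually needed; that step is omitted here.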
Genomic Sequence Variation Markup Language (GSVML).

PubMed

Nakaya, Jun; Kimura, Michio; Hiroi, Kaei; Ido, Keisuke; Yang, Woosung; Tanaka, Hiroshi

2010-02-01

With the aim of making good use of internationally accumulated genomic sequence variation data, which is increasing rapidly due to the explosive amount of genomic research at present, the development of an interoperable data exchange format and its international standardization are necessary. Genomic Sequence Variation Markup Language (GSVML) will focus on genomic sequence variation data and human health applications, such as gene based medicine or pharmacogenomics. We developed GSVML through eight steps, based on case analysis and domain investigations. By focusing on the design scope to human health applications and genomic sequence variation, we attempted to eliminate ambiguity and to ensure practicability. We intended to satisfy the requirements derived from the use case analysis of human-based clinical genomic applications. Based on database investigations, we attempted to minimize the redundancy of the data format, while maximizing the data covering range. We also attempted to ensure communication and interface ability with other Markup Languages, for exchange of omics data among various omics researchers or facilities. The interface ability with developing clinical standards, such as the Health Level Seven Genotype Information model, was analyzed. We developed the human health-oriented GSVML comprising variation data, direct annotation, and indirect annotation categories; the variation data category is required, while the direct and indirect annotation categories are optional. The annotation categories contain omics and clinical information, and have internal relationships. For designing, we examined 6 cases for three criteria as human health application and 15 data elements for three criteria as data formats for genomic sequence variation data exchange. The data format of five international SNP databases and six Markup Languages and the interface ability to the Health Level Seven Genotype Model in terms of 317 items were investigated. 
GSVML was developed as a potential data exchanging format for genomic sequence variation data exchange focusing on human health applications. The international standardization of GSVML is necessary, and is currently underway. GSVML can be applied to enhance the utilization of genomic sequence variation data worldwide by providing a communicable platform between clinical and research applications. Copyright 2009 Elsevier Ireland Ltd. All rights reserved.
Complete nucleotide and derived amino acid sequence of cDNA encoding the mitochondrial uncoupling protein of rat brown adipose tissue: lack of a mitochondrial targeting presequence.

PubMed Central

Ridley, R G; Patel, H V; Gerber, G E; Morton, R C; Freeman, K B

1986-01-01

A cDNA clone spanning the entire amino acid sequence of the nuclear-encoded uncoupling protein of rat brown adipose tissue mitochondria has been isolated and sequenced. With the exception of the N-terminal methionine the deduced N-terminus of the newly synthesized uncoupling protein is identical to the N-terminal 30 amino acids of the native uncoupling protein as determined by protein sequencing. This proves that the protein contains no N-terminal mitochondrial targeting prepiece and that a targeting region must reside within the amino acid sequence of the mature protein. PMID:3012461
Protein composition correlates with the mechanical properties of spider ( Argiope trifasciata ) dragline silk.

PubMed

Marhabaie, Mohammad; Leeper, Thomas C; Blackledge, Todd A

2014-01-13

We investigated the natural variation in silk composition and mechanical performance of the orb-weaving spider Argiope trifasciata at multiple spatial and temporal scales in order to assess how protein composition contributes to the remarkable material properties of spider dragline silk. Major ampullate silk in orb-weaving spiders consists predominantly of two proteins (MaSp1 and MaSp2) with divergent amino acid compositions and functionally different microstructures. Adjusting the expression of these two proteins therefore provides spiders with a simple mechanism to alter the material properties of their silk. We first assessed the reliability and precision of the Waters AccQ-Tag amino acid composition analysis kit for determining the amino acid composition of small quantities of spider silk. We then tested how protein composition varied within single draglines, across draglines spun by the same spider on different days, and finally between spiders. Then, we correlated chemical composition with the material properties of dragline silk. Overall, we found that the chemical composition of major ampullate silk was in general homogeneous among individuals of the same population. Variation in chemical composition was not detectable within silk spun by a single spider on a single day. However, we found that variation within a single spider's silk across different days could, in rare instances, be greater than variation among individual spiders. Most of the variation in silk composition in our investigation resulted from a small number of outliers (three out of sixteen individuals) with a recent history of stress, suggesting stress affects silk production process in orb web spiders. Based on reported sequences for MaSp genes, we developed a gene expression model showing the covariation of the most abundant amino acids in major ampullate silk. 
Our gene expression model supports the conclusion that dragline silk composition was mostly determined by the relative abundance of MaSp1 and MaSp2. Finally, we showed that silk composition (especially proline content) strongly correlated with some measures of mechanical performance, particularly how much fibers shrank during supercontraction as well as their breaking strains. Our findings suggest that spiders are able to change the relative expression rates of different MaSp genes to produce silk fibers with different chemical compositions, and hence, different material properties.
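The idea that silk composition follows from the relative abundance of two proteins can be sketched as a two-component linear mixture. This is not the authors' gene expression model; the function names and all composition values below are hypothetical placeholders chosen only to show the arithmetic, using proline as the marker because it is scarce in MaSp1 and abundant in MaSp2.

```python
def masp2_fraction(observed, in_masp1, in_masp2):
    """Solve observed = (1 - f) * in_masp1 + f * in_masp2 for f."""
    return (observed - in_masp1) / (in_masp2 - in_masp1)

def mixed_composition(frac_masp2, comp1, comp2):
    """Amino acid composition of a silk with the given MaSp2 fraction."""
    return {aa: (1 - frac_masp2) * comp1[aa] + frac_masp2 * comp2[aa] for aa in comp1}

# Hypothetical mole fractions, for illustration only.
masp1 = {"Gly": 0.45, "Ala": 0.30, "Pro": 0.00}
masp2 = {"Gly": 0.35, "Ala": 0.20, "Pro": 0.15}

f = masp2_fraction(0.06, masp1["Pro"], masp2["Pro"])  # observed proline = 6%
silk = mixed_composition(f, masp1, masp2)
print(round(f, 2), {aa: round(x, 3) for aa, x in silk.items()})
```

Under such a model, residues enriched in the same protein covary across samples, which is the kind of covariation the abstract describes.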
Method of increasing conversion of a fatty acid to its corresponding dicarboxylic acid

DOEpatents

Craft, David L.; Wilson, C. Ron; Eirich, Dudley; Zhang, Yeyan

2004-09-14

A nucleic acid sequence including a CYP promoter operably linked to nucleic acid encoding a heterologous protein is provided to increase transcription of the nucleic acid. Expression vectors and host cells containing the nucleic acid sequence are also provided. The methods and compositions described herein are especially useful in the production of polycarboxylic acids by yeast cells.

A putative carbohydrate-binding domain of the lactose-binding Cytisus sessilifolius anti-H(O) lectin has a similar amino acid sequence to that of the L-fucose-binding Ulex europaeus anti-H(O) lectin.

PubMed

Konami, Y; Yamamoto, K; Osawa, T; Irimura, T

1995-04-01

The complete amino acid sequence of a lactose-binding Cytisus sessilifolius anti-H(O) lectin II (CSA-II) was determined using a protein sequencer. After digestion of CSA-II with endoproteinase Lys-C or Asp-N, the resulting peptides were purified by reversed-phase high performance liquid chromatography (HPLC) and then subjected to sequence analysis. Comparison of the complete amino acid sequence of CSA-II with the sequences of other leguminous seed lectins revealed regions of extensive homology. The amino acid sequence of a putative carbohydrate-binding domain of CSA-II was found to be similar to those of several anti-H(O) leguminous lectins, especially to that of the L-fucose-binding Ulex europaeus lectin I (UEA-I).
Genetic diversity of the merozoite surface protein-3 gene in Plasmodium falciparum populations in Thailand.

PubMed

Pattaradilokrat, Sittiporn; Sawaswong, Vorthon; Simpalipan, Phumin; Kaewthamasorn, Morakot; Siripoon, Napaporn; Harnyuttanakorn, Pongchai

2016-10-21

An effective malaria vaccine is an urgently needed tool to fight against human malaria, the most deadly parasitic disease of humans. One promising candidate is the merozoite surface protein-3 (MSP-3) of Plasmodium falciparum. This antigenic protein, encoded by the merozoite surface protein (msp-3) gene, is polymorphic and classified according to size into the two allelic types of K1 and 3D7. A recent study revealed that both the K1 and 3D7 alleles co-circulated within P. falciparum populations in Thailand, but the extent of the sequence diversity and variation within each allelic type remains largely unknown. The msp-3 gene was sequenced from 59 P. falciparum samples collected from five endemic areas (Mae Hong Son, Kanchanaburi, Ranong, Trat and Ubon Ratchathani) in Thailand and analysed for nucleotide sequence diversity, haplotype diversity and deduced amino acid sequence diversity. The gene was also subjected to population genetic analysis (Fst) and neutrality tests (Tajima's D, Fu and Li's D* and Fu and Li's F* tests) to determine any signature of selection. The sequence analyses revealed eight unique DNA haplotypes and seven amino acid sequence variants, with a haplotype and nucleotide diversity of 0.828 and 0.049, respectively. Neutrality tests indicated that the polymorphism detected in the alanine heptad repeat region of MSP-3 was maintained by positive diversifying selection, suggesting its role as a potential target of protective immune responses and supporting its role as a vaccine candidate. Comparison of MSP-3 variants among parasite populations in Thailand, India and Nigeria also inferred a close genetic relationship between P. falciparum populations in Asia. This study revealed the extent of the msp-3 gene diversity in P. falciparum in Thailand, providing the fundamental basis for the better design of future blood stage malaria vaccines against P. falciparum.
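For readers unfamiliar with the two summary statistics reported above, the sketch below computes haplotype diversity (Nei's unbiased estimator) and nucleotide diversity (mean pairwise differences per site) for a toy alignment. The sequences are invented for illustration and are not msp-3 data; gaps and missing data, which a real analysis must handle, are ignored.

```python
from collections import Counter
from itertools import combinations

def haplotype_diversity(seqs):
    """Hd = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(seqs)
    counts = Counter(seqs)
    return n / (n - 1) * (1 - sum((c / n) ** 2 for c in counts.values()))

def nucleotide_diversity(seqs):
    """Average number of pairwise differences per site (pi)."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s, t)) for s, t in pairs)
    return diffs / (len(pairs) * length)

# Toy aligned sequences of equal length (assumption: no gaps).
toy = ["ACGTACGT", "ACGTACGT", "ACGAACGT", "TCGAACGT"]
hd = haplotype_diversity(toy)
pi = nucleotide_diversity(toy)
print(round(hd, 3), round(pi, 3))
```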
Geochemical Constraints on the Distribution and Function of Thermoproteales Populations in Yellowstone National Park

NASA Astrophysics Data System (ADS)

Jay, Z.; Rusch, D.; Romine, M.; Beam, J.; Inskeep, W.

2014-12-01

Metagenome surveys in Yellowstone National Park (YNP) indicate that members of the order Thermoproteales (phylum Crenarchaeota) are abundant in high-temperature (> 70 °C) geothermal systems. The goals of this study were to compare Thermoproteales sequences from different geothermal environments across YNP, and determine the variation in metabolic potential associated with their distribution. Thermoproteales sequence assemblies (> 0.5 Mbases) were curated from 10 habitats ranging in pH from 3 - 9 (with or without dissolved sulfide). The distribution of specific Thermoproteales is constrained by pH: Vulcanisaeta-like sequences are the most abundant Thermoproteales at pH < 6, Caldivirga-like sequences more important from pH 4 - 6, and Thermoproteus-like sequences abundant from ~ pH 5 - 7, and at pH > 7, Pyrobaculum-like sequences are nearly the only Thermoproteales present. Thermoproteales populations are generally found in hypoxic systems where reduced forms of S and As often limit concentrations of dissolved oxygen. These environmental conditions are correlated with the presence or absence of system-defined respiratory complexes including different terminal oxidases (e.g., aa3 or bd), numerous DMSO-molybdopterins, and dissimilatory sulfate reductases. Metabolic reconstruction of different genera revealed similar pathways for the degradation of carbohydrates, amino acids, and lipids across sites. Only the Thermoproteus and Pyrobaculum populations contained the three marker genes for the dicarboxylate/4-hydroxybutyrate cycle, which is responsible for the fixation of inorganic carbon. Most Thermoproteales populations have the metabolic capacity to synthesize their requirements for vitamins, cofactors, amino acids, and/or nucleotides.
Our results indicate that Thermoproteales populations are important members of high-temperature microbial communities across a wide pH range, are responsible for the degradation of organic carbon, and may also serve as a source of metabolites required by other community members. Thermoproteales genera are abundant thermophiles in many hypoxic (and especially sulfidic) systems; however, the presence of introns in the 16S rRNA gene of many Thermoproteales often precludes accurate abundance estimates using universal primers.
WEB-server for search of a periodicity in amino acid and nucleotide sequences

NASA Astrophysics Data System (ADS)

Frenkel, F. E.; Skryabin, K. G.; Korotkov, E. V.

2017-12-01

A new web server (http://victoria.biengi.ac.ru/splinter/login.php) was designed and developed to search for periodicity in nucleotide and amino acid sequences. The web server operation is based upon a new mathematical method of searching for multiple alignments, which is founded on the optimization of position weight matrices and on two-dimensional dynamic programming. This approach allows the construction of multiple alignments of weakly similar amino acid and nucleotide sequences that have accumulated more than 1.5 substitutions per amino acid or nucleotide, without performing pairwise sequence comparisons. The article examines the principles of the web server operation and two examples of studying amino acid and nucleotide sequences, as well as information that could be obtained using the web server.
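To make the notion of sequence periodicity concrete, the sketch below scores each candidate period by the fraction of positions that match the symbol one period later. This is a deliberately naive stand-in, not the server's method: it uses neither position weight matrices nor dynamic programming, and it would miss the strongly diverged repeats the server targets. The function name `period_scores` and the test sequence are hypothetical.

```python
def period_scores(seq, max_period):
    """Map each period p to the fraction of positions i with seq[i] == seq[i + p]."""
    scores = {}
    for p in range(1, max_period + 1):
        matches = sum(seq[i] == seq[i + p] for i in range(len(seq) - p))
        scores[p] = matches / (len(seq) - p)
    return scores

# Perfect tandem repeat of a 6-letter unit (illustrative input).
seq = "ACGTTA" * 10
scores = period_scores(seq, 8)
best = max(scores, key=scores.get)
print(best, scores[best])
```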
Primary structure of prostaglandin G/H synthase from sheep vesicular gland determined from the complementary DNA sequence.

PubMed Central

DeWitt, D L; Smith, W L

1988-01-01

Prostaglandin G/H synthase (8,11,14-icosatrienoate, hydrogen-donor:oxygen oxidoreductase, EC 1.14.99.1) catalyzes the first step in the formation of prostaglandins and thromboxanes, the conversion of arachidonic acid to prostaglandin endoperoxides G and H. This enzyme is the site of action of nonsteroidal anti-inflammatory drugs. We have isolated a 2.7-kilobase complementary DNA (cDNA) encompassing the entire coding region of prostaglandin G/H synthase from sheep vesicular glands. This cDNA, cloned from a lambda gt 10 library prepared from poly(A)+ RNA of vesicular glands, hybridizes with a single 2.75-kilobase mRNA species. The cDNA clone was selected using oligonucleotide probes modeled from amino acid sequences of tryptic peptides prepared from the purified enzyme. The full-length cDNA encodes a protein of 600 amino acids, including a signal sequence of 24 amino acids. Identification of the cDNA as coding for prostaglandin G/H synthase is based on comparison of amino acid sequences of seven peptides comprising 103 amino acids with the amino acid sequence deduced from the nucleotide sequence of the cDNA. The molecular weight of the unglycosylated enzyme lacking the signal peptide is 65,621. The synthase is a glycoprotein, and there are three potential sites for N-glycosylation, two of them in the amino-terminal half of the molecule. The serine reported to be acetylated by aspirin is at position 530, near the carboxyl terminus. There is no significant similarity between the sequence of the synthase and that of any other protein in amino acid or nucleotide sequence libraries, and a heme binding site(s) is not apparent from the amino acid sequence. The availability of a full-length cDNA clone coding for prostaglandin G/H synthase should facilitate studies of the regulation of expression of this enzyme and the structural features important for catalysis and for interaction with anti-inflammatory drugs. PMID:3125548
Concentration variations of amino acids in mammalian fossils: effects of diagenesis and the implications for amino acid racemization analysis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Blackwell, B.; Rutter, N.W.

Detailed amino acid analyses of bones, teeth, and antler from several mammal species have shown that concentrations of several amino acids can be related to three factors: type of material analyzed, diagenetic alteration of the material, and relative age of the fossil. Concentrations of several amino acids are significantly different in enamel compared to those of dentine or cement. This can be used to check that no contamination of one material by another has occurred, which is critical for using the data for amino acid dating, since all three materials have different racemization rates for some acids. With increased ingrowth of secondary minerals, generally reduced amino acid concentrations are observed. Interacid ratios and concentrations vary significantly from the norms expected for the type of material with increasing degrees of alteration. These effects can be linked to abnormal racemization ratios observed in the same samples. Therefore, abnormal concentrations and/or interacid ratios can be used to detect altered samples in which the D/L amino acid ratios otherwise appear normal, thereby ensuring better accuracy of amino acid racemization analysis. For unaltered fossils, with increasing sample age regardless of the type of material, some amino acids steadily degrade, while others actually increase in concentration initially due to their generation as by-products of decay. Preliminary studies indicate that this progressive alteration can be used to complement racemization data for determining relative stratigraphic sequences.
Thermodynamic framework to assess low abundance DNA mutation detection by hybridization.

PubMed

Willems, Hanny; Jacobs, An; Hadiwikarta, Wahyu Wijaya; Venken, Tom; Valkenborg, Dirk; Van Roy, Nadine; Vandesompele, Jo; Hooyberghs, Jef

2017-01-01

The knowledge of genomic DNA variations in patient samples has a high and increasing value for human diagnostics in its broadest sense. Although many methods and sensors to detect or quantify these variations are available or under development, the number of underlying physico-chemical detection principles is limited. One of these principles is the hybridization of sample target DNA versus nucleic acid probes. We introduce a novel thermodynamics approach and develop a framework to exploit the specific detection capabilities of nucleic acid hybridization, using generic principles applicable to any platform. As a case study, we detect point mutations in the KRAS oncogene on a microarray platform. For the given platform and hybridization conditions, we demonstrate the multiplex detection capability of hybridization and assess the detection limit using thermodynamic considerations; DNA containing point mutations in a background of wild type sequences can be identified down to at least 1% relative concentration. In order to show the clinical relevance, the detection capabilities are confirmed on challenging formalin-fixed paraffin-embedded clinical tumor samples. This enzyme-free detection framework contains the accuracy and efficiency to screen for hundreds of mutations in a single run with many potential applications in molecular diagnostics and the field of personalised medicine.
PubDNA Finder: a web database linking full-text articles to sequences of nucleic acids.

PubMed

García-Remesal, Miguel; Cuevas, Alejandro; Pérez-Rey, David; Martín, Luis; Anguita, Alberto; de la Iglesia, Diana; de la Calle, Guillermo; Crespo, José; Maojo, Víctor

2010-11-01

PubDNA Finder is an online repository that we have created to link PubMed Central manuscripts to the sequences of nucleic acids appearing in them. It extends the search capabilities provided by PubMed Central by enabling researchers to perform advanced searches involving sequences of nucleic acids. This includes, among other features (i) searching for papers mentioning one or more specific sequences of nucleic acids and (ii) retrieving the genetic sequences appearing in different articles. These additional query capabilities are provided by a searchable index that we created by using the full text of the 176 672 papers available at PubMed Central at the time of writing and the sequences of nucleic acids appearing in them. To automatically extract the genetic sequences occurring in each paper, we used an original method we have developed. The database is updated monthly by automatically connecting to the PubMed Central FTP site to retrieve and index new manuscripts. Users can query the database via the web interface provided. PubDNA Finder can be freely accessed at http://servet.dia.fi.upm.es:8080/pubdnafinder
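The abstract does not describe the authors' extraction method. As a rough illustration of the task only, the sketch below pulls DNA-like tokens out of free text with a regular expression; the minimum length of 12 bases and the function name `extract_dna` are assumptions, and a production extractor would also need to handle line-wrapped sequences, RNA, and ambiguity codes.

```python
import re

# Uppercase runs of A/C/G/T at least 12 bases long (threshold is an assumption).
DNA_PATTERN = re.compile(r"\b[ACGT]{12,}\b")

def extract_dna(text):
    """Return DNA-like tokens found in the text, in order of appearance."""
    return DNA_PATTERN.findall(text)

sample = "The forward primer 5'-ACGTTGCAAGGCTTAACG-3' was used with CAT and TAG codons."
found = extract_dna(sample)
print(found)
```

The length threshold keeps short uppercase words such as codons or abbreviations from being reported as sequences.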
Sequence alterations in RX in patients with microphthalmia, anophthalmia, and coloboma

PubMed Central

London, Nikolas J.S.; Kessler, Patricia; Williams, Bryan; Pauer, Gayle J.; Hagstrom, Stephanie A.

2009-01-01

Purpose Microphthalmia, anophthalmia, and coloboma are ocular malformations with a significant genetic component. Rx is a homeobox gene expressed early in the developing retina and is important in retinal cell fate specification as well as stem cell proliferation. We screened a group of 24 patients with microphthalmia, coloboma, and/or anophthalmia for RX mutations. Methods We used standard PCR and automated sequencing techniques to amplify and sequence each of the three RX exons. Patients’ charts were reviewed for clinical information. The pathologic impact of the identified sequence variant was analyzed by computational methods using PolyPhen and PMut algorithms. Results In addition to the polymorphisms we identified a single patient with coloboma having a heterozygous nucleotide change (g.197G>C) in the first exon that results in a missense mutation of arginine to threonine at amino acid position 66 (R66T). In silico analysis predicted R66T to be a deleterious mutation. Conclusions Sequence variations in RX are uncommon in patients with congenital ocular malformations, but may play a role in disease pathogenesis. We observed a missense mutation in RX in a patient with a small, typical chorioretinal coloboma, and postulate that the mutation is responsible for the patient’s phenotype. PMID:19158959
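The link between nucleotide 197 and amino acid 66 in the abstract above can be checked with simple codon arithmetic. The sketch below is illustrative only: it assumes that the numbering starts at the first base of the coding sequence and that no intron precedes the position (the abstract does not state this), and the function names are hypothetical. It uses the standard genetic code.

```python
from itertools import product

# Standard genetic code, codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): a for c, a in zip(product(BASES, repeat=3), AMINO)}


def codon_of(nt_position):
    """Map a 1-based coding position to (codon number, position within codon)."""
    return (nt_position - 1) // 3 + 1, (nt_position - 1) % 3 + 1


def arg_codons_giving_thr():
    """Arginine codons that become threonine after G>C at the second base."""
    hits = []
    for codon, aa in CODON_TABLE.items():
        if aa == "R" and codon[1] == "G":
            mutant = codon[0] + "C" + codon[2]
            if CODON_TABLE[mutant] == "T":
                hits.append((codon, mutant))
    return sorted(hits)


print(codon_of(197))            # codon number and position within the codon
print(arg_codons_giving_thr())  # candidate reference/mutant codon pairs
```

Under these assumptions, position 197 falls on the second base of codon 66, and only the AGA and AGG arginine codons yield threonine after a G>C change at that base.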
Genetic variation in eleven phase I drug metabolism genes in an ethnically diverse population.

PubMed

Solus, Joseph F; Arietta, Brenda J; Harris, James R; Sexton, David P; Steward, John Q; McMunn, Chara; Ihrie, Patrick; Mehall, Janelle M; Edwards, Todd L; Dawson, Elliott P

2004-10-01

The extent of genetic variation found in drug metabolism genes and its contribution to interindividual variation in response to medication remains incompletely understood. To better determine the identity and frequency of variation in 11 phase I drug metabolism genes, the exons and flanking intronic regions of the cytochrome P450 (CYP) isoenzyme genes CYP1A1, CYP1A2, CYP2A6, CYP2B6, CYP2C8, CYP2C9, CYP2C19, CYP2D6, CYP2E1, CYP3A4 and CYP3A5 were amplified from genomic DNA and sequenced. A total of 60 kb of bi-directional sequence was generated from each of 93 human DNAs, which included Caucasian, African-American and Asian samples. There were 388 different polymorphisms identified. These included 269 non-coding, 45 synonymous and 74 non-synonymous polymorphisms. Of these, 54% were novel and included 176 non-coding, 14 synonymous and 21 non-synonymous polymorphisms. Of the novel variants observed, 85 were represented by single occurrences of the minor allele in the sample set. Much of the variation observed was from low-frequency alleles. Comparatively, these genes are variation-rich. Calculations measuring genetic diversity revealed that while the values for the individual genes are widely variable, the overall nucleotide diversity of 7.7 × 10^-4 and polymorphism parameter of 11.5 × 10^-4 are higher than those previously reported for other gene sets. Several independent measurements indicate that these genes are under selective pressure, particularly for polymorphisms corresponding to non-synonymous amino acid changes. There is relatively little difference in measurements of diversity among the ethnic groups, but there are large differences among the genes and gene subfamilies themselves. Of the three CYP subfamilies involved in phase I drug metabolism (1, 2, and 3), subfamily 2 displays the highest levels of genetic diversity.
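The two summary statistics quoted in this abstract are commonly computed as nucleotide diversity (π, the mean number of pairwise differences per site) and Watterson's estimator (θ, based on the number of segregating sites). The sketch below assumes that this is what the abstract's "polymorphism parameter" refers to, which the abstract does not state; the function names and toy sequences are hypothetical and are not the study's data.

```python
from itertools import combinations


def nucleotide_diversity(seqs):
    """Mean pairwise differences per site (pi) for equal-length aligned sequences."""
    length = len(seqs[0])
    pairs = list(combinations(seqs, 2))
    diffs = sum(sum(a != b for a, b in zip(s1, s2)) for s1, s2 in pairs)
    return diffs / (len(pairs) * length)


def watterson_theta(seqs):
    """Watterson's theta per site: S / (a_n * L), with a_n = sum_{i=1}^{n-1} 1/i."""
    length = len(seqs[0])
    segregating = sum(len(set(column)) > 1 for column in zip(*seqs))
    a_n = sum(1 / i for i in range(1, len(seqs)))
    return segregating / (a_n * length)


toy = ["AAAA", "AAAT", "AATT"]  # three aligned toy haplotypes
print(nucleotide_diversity(toy), watterson_theta(toy))
```

When θ exceeds π, as in the values reported above, the sample typically contains an excess of low-frequency alleles, which agrees with the abstract's remark that much of the variation came from rare variants.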
Isolation and molecular characterization of Thraustochytrium strain isolated from Antarctic Peninsula and its biotechnological potential in the production of fatty acids.

PubMed

Caamaño, Esteban; Loperena, Lyliam; Hinzpeter, Ivonne; Pradel, Paulina; Gordillo, Felipe; Corsini, Gino; Tello, Mario; Lavín, Paris; González, Alex R

Thraustochytrids are unicellular protists belonging to the Labyrinthulomycetes class, which are characterized by the presence of a high lipid content that could replace conventional fatty acids. They show a wide geographic distribution; however, their diversity in the Antarctic Region is rather scarce. The analysis based on the complete sequence of 18S rRNA gene showed that strain 34-2 belongs to the species Thraustochytrium kinnei, with 99% identity. The total lipid profile shows a wide range of saturated fatty acids with abundance of palmitic acid (16:0), showing a range of 16.1-19.7%. On the other hand, long-chain polyunsaturated fatty acids, mainly docosahexaenoic acid and eicosapentaenoic acid, are present in a range of 24-48% and 6.1-9.3%, respectively. All factors analyzed in cells (biomass, carbon consumption and lipid content) changed with variations of culture temperature (10°C and 25°C). The growth in glucose at a temperature of 10°C presented the most favorable conditions to produce omega-3 fatty acid. This research provides the identification and characterization of a Thraustochytrids strain, with a total lipid content that presents potential applications in the production of nutritional supplements as well as biofuels. Copyright © 2017 Sociedade Brasileira de Microbiologia. Published by Elsevier Editora Ltda. All rights reserved.
Cardiorespiratory fitness as a predictor of intestinal microbial diversity and distinct metagenomic functions.

PubMed

Estaki, Mehrbod; Pither, Jason; Baumeister, Peter; Little, Jonathan P; Gill, Sandeep K; Ghosh, Sanjoy; Ahmadi-Vand, Zahra; Marsden, Katelyn R; Gibson, Deanna L

2016-08-08

Reduced microbial diversity in human intestines has been implicated in various conditions such as diabetes, colorectal cancer, and inflammatory bowel disease. The role of physical fitness in the context of human intestinal microbiota is currently not known. We used high-throughput sequencing to analyze fecal microbiota of 39 healthy participants with similar age, BMI, and diets but with varying cardiorespiratory fitness levels. Fecal short-chain fatty acids were analyzed using gas chromatography. We showed that peak oxygen uptake (VO2peak), the gold standard measure of cardiorespiratory fitness, can account for more than 20 % of the variation in taxonomic richness, after accounting for all other factors, including diet. While VO2peak did not explain variation in beta diversity, it did play a significant role in explaining variation in the microbiomes' predicted metagenomic functions, aligning positively with genes related to bacterial chemotaxis, motility, and fatty acid biosynthesis. These predicted functions were supported by measured increases in production of fecal butyrate, a short-chain fatty acid associated with improved gut health, amongst physically fit participants. We also identified increased abundances of key butyrate-producing taxa (Clostridiales, Roseburia, Lachnospiraceae, and Erysipelotrichaceae) amongst these individuals, which likely contributed to the observed increases in butyrate levels. Results from this study show that cardiorespiratory fitness is correlated with increased microbial diversity in healthy humans and that the associated changes are anchored around a set of functional cores rather than specific taxa. The microbial profiles of fit individuals favor the production of butyrate. As increased microbiota diversity and butyrate production is associated with overall host health, our findings warrant the use of exercise prescription as an adjuvant therapy in combating dysbiosis-associated diseases.
Mapping Genetic Variants Underlying Differences in the Central Nitrogen Metabolism in Fermenter Yeasts

PubMed Central

García, Verónica; Salinas, Francisco; Aguilera, Omayra; Liti, Gianni; Martínez, Claudio

2014-01-01

Different populations within a species represent a rich reservoir of allelic variants, corresponding to an evolutionary signature of withstood environmental constraints. Saccharomyces cerevisiae strains are widely utilised in the fermentation of different kinds of alcoholic beverages, such as wine and sake, each of them derived from must with distinct nutrient composition. Importantly, adequate nitrogen levels in the medium are essential for the fermentation process; however, a comprehensive understanding of the genetic variants determining variation in nitrogen consumption is lacking. Here, we assessed the genetic factors underlying variation in nitrogen consumption in a segregating population derived from a cross between two main fermenter yeasts, a Wine/European and a Sake isolate. By linkage analysis we identified 18 main effect QTLs for ammonium and amino acids sources. Interestingly, the majority of QTLs were involved in more than a single trait, grouped based on amino acid structure and indicating high levels of pleiotropy across nitrogen sources, in agreement with the observed patterns of phenotypic co-variation. Accordingly, we performed reciprocal hemizygosity analysis validating an effect for three genes, GLT1, ASI1 and AGP1. Furthermore, we detected a widespread pleiotropic effect on these genes, with AGP1 affecting seven amino acids and nine in the case of GLT1 and ASI1. Based on sequence and comparative analysis, candidate causative mutations within these genes were also predicted. Altogether, the identification of these variants demonstrates how Sake and Wine/European genetic backgrounds differentially consume nitrogen sources, in part explaining independently evolved preferences for nitrogen assimilation and representing a niche of genetic diversity for the implementation of practical approaches towards more efficient strains for nitrogen metabolism. PMID:24466135
Coiled-coil length: Size does matter.

PubMed

Surkont, Jaroslaw; Diekmann, Yoan; Ryder, Pearl V; Pereira-Leal, Jose B

2015-12-01

Protein evolution is governed by processes that alter primary sequence but also the length of proteins. Protein length may change in different ways, but insertions, deletions and duplications are the most common. An optimal protein size is a trade-off between sequence extension, which may change protein stability or lead to acquisition of a new function, and shrinkage that decreases metabolic cost of protein synthesis. Despite the general tendency for length conservation across orthologous proteins, the propensity to accept insertions and deletions is heterogeneous along the sequence. For example, protein regions rich in repetitive peptide motifs are well known to extensively vary their length across species. Here, we analyze length conservation of coiled-coils, domains formed by an ubiquitous, repetitive peptide motif present in all domains of life, that frequently plays a structural role in the cell. We observed that, despite the repetitive nature, the length of coiled-coil domains is generally highly conserved throughout the tree of life, even when the remaining parts of the protein change, including globular domains. Length conservation is independent of primary amino acid sequence variation, and represents a conservation of domain physical size. This suggests that the conservation of domain size is due to functional constraints. © 2015 Wiley Periodicals, Inc.
Gene: a gene-centered information resource at NCBI.

PubMed

Brown, Garth R; Hem, Vichet; Katz, Kenneth S; Ovetsky, Michael; Wallin, Craig; Ermolaeva, Olga; Tolstoy, Igor; Tatusova, Tatiana; Pruitt, Kim D; Maglott, Donna R; Murphy, Terence D

2015-01-01

The National Center for Biotechnology Information's (NCBI) Gene database (www.ncbi.nlm.nih.gov/gene) integrates gene-specific information from multiple data sources. NCBI Reference Sequence (RefSeq) genomes for viruses, prokaryotes and eukaryotes are the primary foundation for Gene records in that they form the critical association between sequence and a tracked gene upon which additional functional and descriptive content is anchored. Additional content is integrated based on the genomic location and RefSeq transcript and protein sequence data. The content of a Gene record represents the integration of curation and automated processing from RefSeq, collaborating model organism databases, consortia such as Gene Ontology, and other databases within NCBI. Records in Gene are assigned unique, tracked integers as identifiers. The content (citations, nomenclature, genomic location, gene products and their attributes, phenotypes, sequences, interactions, variation details, maps, expression, homologs, protein domains and external databases) is available via interactive browsing through NCBI's Entrez system, via NCBI's Entrez programming utilities (E-Utilities and Entrez Direct) and for bulk transfer by FTP. Published by Oxford University Press on behalf of Nucleic Acids Research 2014. This work is written by (a) US Government employee(s) and is in the public domain in the US.
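Programmatic access through E-Utilities, as mentioned in this abstract, amounts to issuing HTTP requests against NCBI's public endpoints with the integer Gene identifiers. The sketch below only builds an `esummary` request URL and sends nothing; the helper name and the example identifiers are illustrative assumptions, and real use should follow NCBI's usage guidelines (rate limits, API key).

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def gene_summary_url(gene_ids, retmode="json"):
    """Build an esummary URL for one or more integer Gene identifiers."""
    params = {
        "db": "gene",                               # query the Gene database
        "id": ",".join(str(g) for g in gene_ids),   # comma-separated identifiers
        "retmode": retmode,                         # requested response format
    }
    return f"{EUTILS_BASE}/esummary.fcgi?{urlencode(params)}"


# Example identifiers chosen only for illustration.
print(gene_summary_url([672, 7157]))
```

The resulting URL can be fetched with any HTTP client; keeping URL construction separate from the network call makes the request easy to inspect before it is sent.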
Nucleotide sequence analysis of the gene encoding the Deinococcus radiodurans surface protein, derived amino acid sequence, and complementary protein chemical studies

DOE Office of Scientific and Technical Information (OSTI.GOV)

Peters, J.; Peters, M.; Lottspeich, F.

1987-11-01

The complete nucleotide sequence of the gene encoding the surface (hexagonally packed intermediate (HPI))-layer polypeptide of Deinococcus radiodurans Sark was determined and found to encode a polypeptide of 1036 amino acids. Amino acid sequence analysis of about 30% of the residues revealed that the mature polypeptide consists of at least 978 amino acids. The N terminus was blocked to Edman degradation. The results of proteolytic modification of the HPI layer in situ and M_r estimations of the HPI polypeptide expressed in Escherichia coli indicated that there is a leader sequence. The N-terminal region contained a very high percentage (29%) of threonine and serine, including a cluster of nine consecutive serine or threonine residues, whereas a stretch near the C terminus was extremely rich in aromatic amino acids (29%). The protein contained at least two disulfide bridges, as well as tightly bound reducing sugars and fatty acids.
Molecular Population Genetics of the Alcohol Dehydrogenase Gene Region of DROSOPHILA MELANOGASTER

PubMed Central

Aquadro, Charles F.; Desse, Susan F.; Bland, Molly M.; Langley, Charles H.; Laurie-Ahlberg, Cathy C.

1986-01-01

Variation in the DNA restriction map of a 13-kb region of chromosome II including the alcohol dehydrogenase structural gene (Adh) was examined in Drosophila melanogaster from natural populations. Detailed analysis of 48 D. melanogaster lines representing four eastern United States populations revealed extensive DNA sequence variation due to base substitutions, insertions and deletions. Cloning of this region from several lines allowed characterization of length variation as due to unique sequence insertions or deletions [nine sizes; 21–200 base pairs (bp)] or transposable element insertions (several sizes, 340 bp to 10.2 kb, representing four different elements). Despite this extensive variation in sequences flanking the Adh gene, only one length polymorphism is clearly associated with altered Adh expression (a copia element approximately 250 bp 5' to the distal transcript start site). Nonetheless, the frequency spectra of transposable elements within and between Drosophila species suggest they are slightly deleterious. Strong nonrandom associations are observed among Adh region sequence variants, ADH allozyme (Fast vs. Slow), ADH enzyme activity and the chromosome inversion In(2L)t. Phylogenetic analysis of restriction map haplotypes suggests that the major twofold component of ADH activity variation (high vs. low, typical of Fast and Slow allozymes, respectively) is due to sequence variation tightly linked to and possibly distinct from that underlying the allozyme difference. The patterns of nucleotide and haplotype variation for Fast and Slow allozyme lines are consistent with the recent increase in frequency and spread of the Fast haplotype associated with high ADH activity. These data emphasize the important role of evolutionary history and strong nonrandom associations among tightly linked sequence variation as determinants of the patterns of variation observed in natural populations. PMID:3026893
Detection and isolation of nucleic acid sequences using a bifunctional hybridization probe

DOEpatents

Lucas, Joe N.; Straume, Tore; Bogen, Kenneth T.

2000-01-01

A method for detecting and isolating a target sequence in a sample of nucleic acids is provided using a bifunctional hybridization probe capable of hybridizing to the target sequence that includes a detectable marker and a first complexing agent capable of forming a binding pair with a second complexing agent. A kit is also provided for detecting a target sequence in a sample of nucleic acids using a bifunctional hybridization probe according to this method.
Variation, Repetition, And Choice

PubMed Central

Abreu-Rodrigues, Josele; Lattal, Kennon A; dos Santos, Cristiano V; Matos, Ricardo A

2005-01-01

Experiment 1 investigated the controlling properties of variability contingencies on choice between repeated and variable responding. Pigeons were exposed to concurrent-chains schedules with two alternatives. In the REPEAT alternative, reinforcers in the terminal link depended on a single sequence of four responses. In the VARY alternative, a response sequence in the terminal link was reinforced only if it differed from the n previous sequences (lag criterion). The REPEAT contingency generated low, constant levels of sequence variation whereas the VARY contingency produced levels of sequence variation that increased with the lag criterion. Preference for the REPEAT alternative tended to increase directly with the degree of variation required for reinforcement. Experiment 2 examined the potential confounding effects in Experiment 1 of immediacy of reinforcement by yoking the interreinforcer intervals in the REPEAT alternative to those in the VARY alternative. Again, preference for REPEAT was a function of the lag criterion. Choice between varying and repeating behavior is discussed with respect to obtained behavioral variability, probability of reinforcement, delay of reinforcement, and switching within a sequence. PMID:15828592
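The lag criterion of the VARY alternative described above can be expressed as a short check: a four-response sequence is reinforced only if it differs from each of the previous n sequences. The sketch below is a minimal illustration with assumptions that the abstract does not state, namely that responses are coded as left/right key pecks ("L"/"R") and that every emitted sequence, reinforced or not, enters the comparison history.

```python
from collections import deque


def make_vary_schedule(lag):
    """Return a checker that applies a lag-n variability criterion."""
    history = deque(maxlen=lag)  # holds only the n most recent sequences

    def meets_criterion(sequence):
        reinforced = sequence not in history
        history.append(sequence)  # assumption: all sequences enter the history
        return reinforced

    return meets_criterion


check = make_vary_schedule(lag=2)
for seq in ["LRLR", "LRLR", "RRLL", "LLRR", "LRLR"]:
    print(seq, check(seq))
```

Raising `lag` lengthens the comparison window, so a sequence must differ from more of its predecessors to be reinforced, which mirrors the increasing variation requirement described in the abstract.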
A novel mutation in TFL1 homolog affecting determinacy in cowpea (Vigna unguiculata).

PubMed

Dhanasekar, P; Reddy, K S

2015-02-01

Mutations in the widely conserved Arabidopsis Terminal Flower 1 (TFL1) gene and its homologs have been demonstrated to result in determinacy across genera, the knowledge of which is lacking in cowpea. Understanding the molecular events leading to determinacy of apical meristems could hasten development of cowpea varieties with suitable ideotypes. Isolation and characterization of a novel mutation in cowpea TFL1 homolog (VuTFL1) affecting determinacy is reported here for the first time. Cowpea TFL1 homolog was amplified using primers designed based on conserved sequences in related genera and sequence variation was analysed in three gamma ray-induced determinate mutants, their indeterminate parent "EC394763" and two indeterminate varieties. The analyses of sequence variation exposed a novel SNP distinguishing the determinate mutants from the indeterminate types. The non-synonymous point mutation in exon 4 at position 1,176 resulted from transversion of cytosine (C) to adenine (A) leading to an amino acid change (Pro-136 to His) in determinate mutants. The effect of the mutation on protein function and stability was predicted to be detrimental using different bioinformatics/computational tools. The functionally significant novel substitution mutation is hypothesized to affect determinacy in the cowpea mutants. Development of suitable regeneration protocols in this hitherto recalcitrant crop and subsequent complementation assay in mutants or over-expressing assay in parents could decisively conclude the role of the SNP in regulating determinacy in these cowpea mutants.
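The abstract describes the C-to-A change as a transversion, that is, a substitution between a pyrimidine and a purine. A minimal sketch of that classification follows; the function name is hypothetical and only the four DNA bases are handled.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def classify_substitution(ref, alt):
    """Classify a single-base DNA substitution as a transition or a transversion."""
    ref, alt = ref.upper(), alt.upper()
    valid = PURINES | PYRIMIDINES
    if ref not in valid or alt not in valid or ref == alt:
        raise ValueError("expected two different bases from A, C, G, T")
    # Both bases in the same chemical class means a transition.
    same_class = {ref, alt} <= PURINES or {ref, alt} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


print(classify_substitution("C", "A"))  # the change reported in the abstract
```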

Why double-stranded RNA resists condensation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tolokh, Igor S.; Pabit, Suzette; Katz, Andrea M.

2014-09-15

The addition of small amounts of multivalent cations to solutions containing double-stranded DNA leads to attraction between the negatively charged helices and eventually to condensation. Surprisingly, this effect is suppressed in double-stranded RNA, which carries the same charge as the DNA, but assumes a different double helical form. However, additional characterization of short (25 base-pairs) nucleic acid (NA) duplex structures by circular dichroism shows that measured differences in condensation are not solely determined by duplex helical geometry. Here we combine experiment, theory, and atomistic simulations to propose a mechanism that connects the observed variations in condensation of short NA duplexes with the spatial variation of cobalt hexammine (CoHex) binding at the NA duplex surface. The atomistic picture that emerged showed that CoHex distributions around the NA reveals two major NA-CoHex binding modes -- internal and external -- distinguished by the proximity of bound CoHex to the helical axis. Decreasing trends in experimentally observed condensation propensity of the four studied NA duplexes (from B-like form of homopolymeric DNA, to mixed sequence DNA, to DNA:RNA hybrid, to A-like RNA) are explained by the progressive decrease of a single quantity: the fraction of CoHex ions in the external binding mode. Thus, while NA condensation depends on a complex interplay between various structural and sequence features, our coupled experimental and theoretical results suggest a new model in which a single parameter connects the NA condensation propensity with geometry and sequence dependence of CoHex binding.
Distinguishing functional polymorphism from random variation in the sequences of >10,000 HLA-A, -B and -C alleles.

PubMed

Robinson, James; Guethlein, Lisbeth A; Cereb, Nezih; Yang, Soo Young; Norman, Paul J; Marsh, Steven G E; Parham, Peter

2017-06-01

HLA class I glycoproteins contain the functional sites that bind peptide antigens and engage lymphocyte receptors. Recently, clinical application of sequence-based HLA typing has uncovered an unprecedented number of novel HLA class I alleles. Here we define the nature and extent of the variation in 3,489 HLA-A, 4,356 HLA-B and 3,111 HLA-C alleles. This analysis required development of suites of methods, having general applicability, for comparing and analyzing large numbers of homologous sequences. At least three amino-acid substitutions are present at every position in the polymorphic α1 and α2 domains of HLA-A, -B and -C. A minority of positions have an incidence >1% for the 'second' most frequent nucleotide, comprising 70 positions in HLA-A, 85 in HLA-B and 54 in HLA-C. The majority of these positions have three or four alternative nucleotides. These positions were subject to positive selection and correspond to binding sites for peptides and receptors. Most alleles of HLA class I (>80%) are very rare, often identified in one person or family, and they differ by point mutation from older, more common alleles. These alleles with single nucleotide polymorphisms reflect the germ-line mutation rate. Their frequency predicts the human population harbors 8-9 million HLA class I variants. The common alleles of human populations comprise 42 core alleles, which represent all selected polymorphism, and recombinants that have assorted this polymorphism.
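The per-position criterion used in this abstract (an incidence above 1% for the second most frequent nucleotide) can be illustrated with a simple column scan over aligned sequences. The sketch below is not the authors' suite of methods; it assumes equal-length, gap-free aligned strings, and the function name and toy alignment are hypothetical.

```python
from collections import Counter


def polymorphic_positions(aligned, threshold=0.01):
    """Return (position, number of distinct nucleotides) for each alignment column
    whose second most frequent nucleotide exceeds the threshold incidence."""
    hits = []
    for pos, column in enumerate(zip(*aligned), start=1):
        counts = Counter(column).most_common()
        if len(counts) > 1 and counts[1][1] / len(column) > threshold:
            hits.append((pos, len(counts)))
    return hits


# Toy alignment of 100 sequences: position 2 varies in 1%, position 4 in 2%.
toy = ["ACGT"] * 98 + ["ACGA", "ATGA"]
print(polymorphic_positions(toy))
```

Counting the distinct nucleotides at each flagged position also gives the number of alternative nucleotides per position, the quantity the abstract reports as mostly three or four.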
Distinguishing functional polymorphism from random variation in the sequences of >10,000 HLA-A, -B and -C alleles

PubMed Central

Cereb, Nezih; Yang, Soo Young; Marsh, Steven G. E.; Parham, Peter

2017-01-01

HLA class I glycoproteins contain the functional sites that bind peptide antigens and engage lymphocyte receptors. Recently, clinical application of sequence-based HLA typing has uncovered an unprecedented number of novel HLA class I alleles. Here we define the nature and extent of the variation in 3,489 HLA-A, 4,356 HLA-B and 3,111 HLA-C alleles. This analysis required development of suites of methods, having general applicability, for comparing and analyzing large numbers of homologous sequences. At least three amino-acid substitutions are present at every position in the polymorphic α1 and α2 domains of HLA-A, -B and -C. A minority of positions have an incidence >1% for the ‘second’ most frequent nucleotide, comprising 70 positions in HLA-A, 85 in HLA-B and 54 in HLA-C. The majority of these positions have three or four alternative nucleotides. These positions were subject to positive selection and correspond to binding sites for peptides and receptors. Most alleles of HLA class I (>80%) are very rare, often identified in one person or family, and they differ by point mutation from older, more common alleles. These alleles with single nucleotide polymorphisms reflect the germ-line mutation rate. Their frequency predicts the human population harbors 8–9 million HLA class I variants. The common alleles of human populations comprise 42 core alleles, which represent all selected polymorphism, and recombinants that have assorted this polymorphism. PMID:28650991
Optical resolution of phenylthiohydantoin-amino acids by capillary electrophoresis and identification of the phenylthiohydantoin-D-amino acid residue of [D-Ala2]-methionine enkephalin.

PubMed

Kurosu, Y; Murayama, K; Shindo, N; Shisa, Y; Ishioka, N

1996-11-01

This is an initial report to propose a protein sequence analysis system with DL differentiation using capillary electrophoresis (CE). This system consists of a protein sequencer and a CE system. After fractionation of phenyl-thiohydantoin (PTH)-amino acids using a protein sequencer, optical resolution for each PTH-amino acid is performed by CE using some chiral selectors such as digitonin, beta-escin and others. As a model peptide, [D-Ala2]-methionine enkephalin (L-Tyr-D-Ala-Gly-L-Phe-L-Met), was used and the sequence with DL differentiation was determined, with the exception of the fourth amino acid, L-Phe, using our proposed system.
Sequence analysis of a canine parvovirus isolated from a red panda (Ailurus fulgens) in China.

PubMed

Qin, Qin; Loeffler, I Kati; Li, Ming; Tian, Kegong; Wei, Fuwen

2007-06-01

Canine parvovirus (CPV) was first recognized in the late 1970s in dogs and has mutated and spread throughout the world in canid and felid species since then. In this study, a novel CPV was isolated from the endangered red panda (Ailurus fulgens) in China. Nucleotide and phylogenetic analysis of the capsid protein VP2 gene classified the red panda parvovirus (RPPV) as a CPV-2a type. Substitution of Val for Gly at the conserved 300 residue in RPPV presents an unusual variation in the CPV-2a amino acid sequence and is further evidence for the continuing evolution of the virus. The 300 residue is important in distinguishing the antigenicity and host range of CPVs. The clinical significance and population impact of RPPV infection in captive red pandas in China is unknown and is an important topic for future research.
Giraffe genome sequence reveals clues to its unique morphology and physiology

PubMed Central

Agaba, Morris; Ishengoma, Edson; Miller, Webb C.; McGrath, Barbara C.; Hudson, Chelsea N.; Bedoya Reina, Oscar C.; Ratan, Aakrosh; Burhans, Rico; Chikhi, Rayan; Medvedev, Paul; Praul, Craig A.; Wu-Cavener, Lan; Wood, Brendan; Robertson, Heather; Penfold, Linda; Cavener, Douglas R.

2016-01-01

The origins of giraffe's imposing stature and associated cardiovascular adaptations are unknown. Okapi, which lacks these unique features, is giraffe's closest relative and provides a useful comparison, to identify genetic variation underlying giraffe's long neck and cardiovascular system. The genomes of giraffe and okapi were sequenced, and through comparative analyses genes and pathways were identified that exhibit unique genetic changes and likely contribute to giraffe's unique features. Some of these genes are in the HOX, NOTCH and FGF signalling pathways, which regulate both skeletal and cardiovascular development, suggesting that giraffe's stature and cardiovascular adaptations evolved in parallel through changes in a small number of genes. Mitochondrial metabolism and volatile fatty acids transport genes are also evolutionarily diverged in giraffe and may be related to its unusual diet that includes toxic plants. Unexpectedly, substantial evolutionary changes have occurred in giraffe and okapi in double-strand break repair and centrosome functions. PMID:27187213
PSSRdb: a relational database of polymorphic simple sequence repeats extracted from prokaryotic genomes.

PubMed

Kumar, Pankaj; Chaitanya, Pasumarthy S; Nagarajaram, Hampapathalu A

2011-01-01

PSSRdb (Polymorphic Simple Sequence Repeats database) (http://www.cdfd.org.in/PSSRdb/) is a relational database of polymorphic simple sequence repeats (PSSRs) extracted from 85 different species of prokaryotes. Simple sequence repeats (SSRs) are the tandem repeats of nucleotide motifs of the sizes 1-6 bp and are highly polymorphic. SSR mutations in and around coding regions affect transcription and translation of genes. Such changes underpin phase variations and antigenic variations seen in some bacteria. Although SSR-mediated phase variation and antigenic variations have been well-studied in some bacteria there seems a lot of other species of prokaryotes yet to be investigated for SSR mediated adaptive and other evolutionary advantages. As a part of our on-going studies on SSR polymorphism in prokaryotes we compared the genome sequences of various strains and isolates available for 85 different species of prokaryotes and extracted a number of SSRs showing length variations and created a relational database called PSSRdb. This database gives useful information such as location of PSSRs in genomes, length variation across genomes, the regions harboring PSSRs, etc. The information provided in this database is very useful for further research and analysis of SSRs in prokaryotes.
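Simple sequence repeats as defined above (tandem repeats of 1-6 bp motifs) can be located with a back-referencing regular expression. The sketch below is a minimal illustration and not the pipeline used to build PSSRdb; the function name and the minimum repeat count of three are assumptions.

```python
import re


def find_ssrs(seq, min_repeats=3, max_motif=6):
    """Return (start, motif, repeat count) for tandem repeats of short motifs.

    The lazy quantifier makes the shortest repeating unit win, so "AAAA" is
    reported as a repeat of "A" rather than of "AA".
    """
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_motif, min_repeats - 1))
    return [
        (m.start(), m.group(1), len(m.group(0)) // len(m.group(1)))
        for m in pattern.finditer(seq)
    ]


print(find_ssrs("GGATATATATCC"))
```

Running the same scan on the corresponding region of several strains and comparing repeat counts would flag a repeat as polymorphic, which is the kind of length variation the database catalogues.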
High level of molecular and phenotypic biodiversity in Jatropha curcas from Central America compared to Africa, Asia and South America

PubMed Central

2014-01-01

Background The main bottleneck to elevate jatropha (Jatropha curcas L.) from a wild species to a profitable biodiesel crop is the low genetic and phenotypic variation found in different regions of the world, hampering efficient plant breeding for productivity traits. In this study, 182 accessions from Asia (91), Africa (35), South America (9) and Central America (47) were evaluated at genetic and phenotypic level to find genetic variation and important traits for oilseed production. Results Genetic variation was assessed with SSR (Simple Sequence Repeat), TRAP (Target Region Amplification Polymorphism) and AFLP (Amplified fragment length polymorphism) techniques. Phenotypic variation included seed morphological characteristics, seed oil content and fatty acid composition and early growth traits. Jaccard’s similarity and cluster analysis by UPGM (Unweighted Paired Group Method) with arithmetic mean and PCA (Principal Component Analysis) indicated higher variability in Central American accessions compared to Asian, African and South American accessions. Polymorphism Information Content (PIC) values ranged from 0 to 0.65. In the set of Central American accessions, PIC values were higher than in other regions. Accessions from the Central American population contain alleles that were not found in the accessions from other populations. Analysis of Molecular Variance (AMOVA; P < 0.0001) indicated high genetic variation within regions (81.7%) and low variation across regions (18.3%). A high level of genetic variation was found on early growth traits and on components of the relative growth rate (specific leaf area, leaf weight, leaf weight ratio and net assimilation rate) as indicated by significant differences between accessions and by the high heritability values (50–88%). The fatty acid composition of jatropha oil significantly differed (P < 0.05) between regions.
Conclusions The pool of Central American accessions showed very large genetic variation as assessed by DNA-marker variation compared to accessions from other regions. Central American accessions also showed the highest phenotypic variation and should be considered as the most important source for plant breeding. Some variation in early growth traits was found within a group of accessions from Asia and Africa, while these accessions did not differ in a single DNA-marker, possibly indicating epigenetic variation. PMID:24666927
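The two marker statistics named in the abstract above, Jaccard's similarity and PIC, can be illustrated with a minimal sketch. The abstract does not say which PIC formula the authors used; this example assumes the commonly cited Botstein et al. (1980) definition for codominant markers, and the function names `jaccard` and `pic` are hypothetical.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two 0/1 band-presence profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must score the same set of bands")
    shared = sum(1 for x, y in zip(a, b) if x and y)   # bands present in both
    either = sum(1 for x, y in zip(a, b) if x or y)    # bands present in at least one
    return shared / either if either else 0.0


def pic(freqs):
    """PIC of one locus from its allele frequencies (assumed formula):
    PIC = 1 - sum(p_i^2) - sum_{i<j} 2 * p_i^2 * p_j^2
    """
    if abs(sum(freqs) - 1.0) > 1e-9:
        raise ValueError("allele frequencies must sum to 1")
    homozygosity = sum(p * p for p in freqs)
    cross_term = sum(2 * (p * p) * (q * q) for p, q in combinations(freqs, 2))
    return 1.0 - homozygosity - cross_term


if __name__ == "__main__":
    # Illustrative inputs only, not data from the study.
    print(jaccard([1, 1, 0, 1], [1, 0, 0, 1]))
    print(pic([0.5, 0.5]))
```

A monomorphic locus (one allele at frequency 1) gives a PIC of 0 under this definition, which matches the lower end of the range reported in the abstract.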
Neofunctionalization of Duplicated P450 Genes Drives the Evolution of Insecticide Resistance in the Brown Planthopper.

PubMed

Zimmer, Christoph T; Garrood, William T; Singh, Kumar Saurabh; Randall, Emma; Lueke, Bettina; Gutbrod, Oliver; Matthiesen, Svend; Kohler, Maxie; Nauen, Ralf; Davies, T G Emyr; Bass, Chris

2018-01-22

Gene duplication is a major source of genetic variation that has been shown to underpin the evolution of a wide range of adaptive traits [1, 2]. For example, duplication or amplification of genes encoding detoxification enzymes has been shown to play an important role in the evolution of insecticide resistance [3-5]. In this context, gene duplication performs an adaptive function as a result of its effects on gene dosage and not as a source of functional novelty [3, 6-8]. Here, we show that duplication and neofunctionalization of a cytochrome P450, CYP6ER1, led to the evolution of insecticide resistance in the brown planthopper. Considerable genetic variation was observed in the coding sequence of CYP6ER1 in populations of brown planthopper collected from across Asia, but just two sequence variants are highly overexpressed in resistant strains and metabolize imidacloprid. Both variants are characterized by profound amino-acid alterations in substrate recognition sites, and the introduction of these mutations into a susceptible P450 sequence is sufficient to confer resistance. CYP6ER1 is duplicated in resistant strains with individuals carrying paralogs with and without the gain-of-function mutations. Despite numerical parity in the genome, the susceptible and mutant copies exhibit marked asymmetry in their expression with the resistant paralogs overexpressed. In the primary resistance-conferring CYP6ER1 variant, this results from an extended region of novel sequence upstream of the gene that provides enhanced expression. Our findings illustrate the versatility of gene duplication in providing opportunities for functional and regulatory innovation during the evolution of an adaptive trait. Copyright © 2017 The Authors. Published by Elsevier Ltd. All rights reserved.
Molecular diversity of α-gliadin expressed genes in genetically contrasted spelt (Triticum aestivum ssp. spelta) accessions and comparison with bread wheat (T. aestivum ssp. aestivum) and related diploid Triticum and Aegilops species.

PubMed

Dubois, Benjamin; Bertin, Pierre; Mingeot, Dominique

2016-01-01

The gluten proteins of cereals such as bread wheat ( Triticum aestivum ssp. aestivum ) and spelt ( T. aestivum ssp. spelta ) are responsible for celiac disease (CD). The α-gliadins constitute the most immunogenic class of gluten proteins as they include four main T-cell stimulatory epitopes that affect CD patients. Spelt has been less studied than bread wheat and could constitute a source of valuable diversity. The objective of this work was to study the genetic diversity of spelt α-gliadin transcripts and to compare it with those of bread wheat. Genotyping data from 85 spelt accessions obtained with 19 simple sequence repeat (SSR) markers were used to select 11 contrasted accessions, from which 446 full open reading frame α-gliadin genes were cloned and sequenced, which revealed a high allelic diversity. High variations among the accessions were highlighted, in terms of the proportion of α-gliadin sequences from each of the three genomes (A, B and D), and their composition in the four T-cell stimulatory epitopes. An accession from Tajikistan stood out, having a particularly high proportion of α-gliadins from the B genome and a low immunogenic content. Even if no clear separation between spelt and bread wheat sequences was shown, spelt α-gliadins displayed specific features concerning e.g. the frequencies of some amino acid substitutions. Given this observation and the variations in toxicity revealed in the spelt accessions in this study, the high genetic diversity held in spelt germplasm collections could be a valuable resource in the development of safer varieties for CD patients.
Positive selection in the SLC11A1 gene in the family Equidae.

PubMed

Bayerova, Zuzana; Janova, Eva; Matiasovic, Jan; Orlando, Ludovic; Horin, Petr

2016-05-01

Immunity-related genes are a suitable model for studying effects of selection at the genomic level. Some of them are highly conserved due to functional constraints and purifying selection, while others are variable and change quickly to cope with the variation of pathogens. The SLC11A1 gene encodes a transporter protein mediating antimicrobial activity of macrophages. Little is known about the patterns of selection shaping this gene during evolution. Although it is a typical evolutionarily conserved gene, functionally important polymorphisms associated with various diseases were identified in humans and other species. We analyzed the genomic organization, genetic variation, and evolution of the SLC11A1 gene in the family Equidae to identify patterns of selection within this important gene. Nucleotide SLC11A1 sequences were shown to be highly conserved in ten equid species, with more than 97% sequence identity across the family. Single nucleotide polymorphisms (SNPs) were found in the coding and noncoding regions of the gene. Seven codon sites were identified to be under strong purifying selection. Codons located in three regions, including the glycosylated extracellular loop, were shown to be under diversifying selection. A 3-bp indel resulting in a deletion of the amino acid 321 in the predicted protein was observed in all horses, while it has been maintained in all other equid species. This codon, which forms part of an N-glycosylation site, was found to be under positive selection. Interspecific variation in the presence of predicted N-glycosylation sites was observed.
Identification and Analysis of Novel Amino-Acid Sequence Repeats in Bacillus anthracis str. Ames Proteome Using Computational Tools

PubMed Central

Hemalatha, G. R.; Rao, D. Satyanarayana; Guruprasad, L.

2007-01-01

We have identified four repeats and ten domains that are novel in proteins encoded by the Bacillus anthracis str. Ames proteome using automated in silico methods. A “repeat” corresponds to a region of fewer than 55 amino acid residues that occurs more than once in the protein sequence and is sometimes present in tandem. A “domain” corresponds to a conserved region of more than 55 amino acid residues that may be present as single or multiple copies in the protein sequence. These correspond to (1) 57-amino-acid-residue PxV domain, (2) 122-amino-acid-residue FxF domain, (3) 111-amino-acid-residue YEFF domain, (4) 109-amino-acid-residue IMxxH domain, (5) 103-amino-acid-residue VxxT domain, (6) 84-amino-acid-residue ExW domain, (7) 104-amino-acid-residue NTGFIG domain, (8) 36-amino-acid-residue NxGK repeat, (9) 95-amino-acid-residue VYV domain, (10) 75-amino-acid-residue KEWE domain, (11) 59-amino-acid-residue AFL domain, (12) 53-amino-acid-residue RIDVK repeat, (13) (a) 41-amino-acid-residue AGQF repeat and (b) 42-amino-acid-residue GSAL repeat. A repeat or domain type is characterized by specific conserved sequence motifs. We discuss the presence of these repeats and domains in proteins from other genomes and their probable secondary structure. PMID:17538688
Comparative venom gland transcriptomics of Naja kaouthia (monocled cobra) from Malaysia and Thailand: elucidating geographical venom variation and insights into sequence novelty

PubMed Central

Chanhome, Lawan; Tan, Nget Hong

2017-01-01

Background The monocled cobra (Naja kaouthia) is a medically important venomous snake in Southeast Asia. Its venom has been shown to vary geographically in relation to venom composition and neurotoxic activity, indicating vast diversity of the toxin genes within the species. To investigate the polygenic trait of the venom and its locale-specific variation, we profiled and compared the venom gland transcriptomes of N. kaouthia from Malaysia (NK-M) and Thailand (NK-T) applying next-generation sequencing (NGS) technology. Methods The transcriptomes were sequenced on the Illumina HiSeq platform, assembled and followed by transcript clustering and annotations for gene expression and function. Pairwise or multiple sequence alignments were conducted on the toxin genes expressed. Substitution rates were studied for the major toxins co-expressed in NK-M and NK-T. Results and discussion The toxin transcripts showed high redundancy (41–82% of the total mRNA expression) and comprised 23 gene families expressed in NK-M and NK-T, respectively (22 gene families were co-expressed). Among the venom genes, three-finger toxins (3FTxs) predominated in the expression, with multiple sequences noted. Comparative analysis and selection study revealed that 3FTxs are genetically conserved between the geographical specimens whilst demonstrating distinct differential expression patterns, implying gene up-regulation for selected principal toxins, or alternatively, enhanced transcript degradation or lack of transcription of certain traits. One of the striking features that elucidates the inter-geographical venom variation is the up-regulation of α-neurotoxins (constitutes ∼80.0% of toxin’s fragments per kilobase of exon model per million mapped reads (FPKM)), particularly the long-chain α-elapitoxin-Nk2a (48.3%) in NK-T but only 1.7% was noted in NK-M. Instead, short neurotoxin isoforms were up-regulated in NK-M (46.4%). 
Another distinct transcriptional pattern observed is the exclusively and abundantly expressed cytotoxin CTX-3 in NK-T. The findings suggested correlation with the geographical variation in proteome and toxicity of the venom, and support the call for optimising antivenom production and use in the region. Besides, the current study uncovered full and partial sequences of numerous toxin genes from N. kaouthia which have not been reported hitherto; these include N. kaouthia-specific l-amino acid oxidase (LAAO), snake venom serine protease (SVSP), cystatin, acetylcholinesterase (AChE), hyaluronidase (HYA), waprin, phospholipase B (PLB), aminopeptidase (AP), neprilysin, etc. Taken together, the findings further enrich the snake toxin database and provide deeper insights into the genetic diversity of cobra venom toxins. PMID:28392982
37 CFR 1.824 - Form and format for nucleotide and/or amino acid sequence submissions in computer readable form.

Code of Federal Regulations, 2010 CFR

2010-07-01

... 37 Patents, Trademarks, and Copyrights 1 2010-07-01 2010-07-01 false Form and format for... And/or Amino Acid Sequences § 1.824 Form and format for nucleotide and/or amino acid sequence... Code for Information Interchange (ASCII) text. No other formats shall be allowed. (3) The computer...
The major origin of seedless grapes is associated with a missense mutation in the MADS-box gene VviAGL11.

PubMed

Royo, Carolina; Torres-Pérez, Rafael; Mauri, Nuria; Diestro, Nieves; Cabezas, José Antonio; Marchal, Cécile; Lacombe, Thierry; Ibáñez, Javier; Tornel, Manuel; Carreño, Juan; Martínez-Zapater, José M; Carbonell-Bejerano, Pablo

2018-05-31

Seedlessness is greatly prized by consumers of fresh grapes. While stenospermocarpic seed abortion determined by the SEED DEVELOPMENT INHIBITOR (SDI) locus is the usual source of seedlessness in commercial grapevine (Vitis vinifera) cultivars, the underlying sdi mutation remains unknown. Here, we undertook an integrative approach to identify the causal mutation. Quantitative genetics and fine mapping in two 'Crimson Seedless' (CS)-derived F1 mapping populations confirmed the major effect of the SDI locus and delimited the sdi mutation to a 323-kb region on chromosome 18. RNA-seq comparing seed traces of seedless and seeds of seeded F1 individuals identified processes triggered during sdi-determined seed abortion, including activation of salicylic acid-dependent defenses. The RNA-seq dataset was investigated for candidate genes and, while no evidence for causal cis-acting regulatory mutations was detected, deleterious nucleotide changes in coding sequences of the seedless haplotype were predicted in two genes within the sdi fine mapping interval. Targeted re-sequencing of the two genes in a collection of 124 grapevine cultivars showed that only the point variation causing the Arg197Leu substitution in the seed morphogenesis regulator gene AGAMOUS-LIKE 11 (VviAGL11) was fully linked with stenospermocarpy. The concurrent post-zygotic variation identified for this missense polymorphism and seedlessness phenotype in seeded somatic variants of the original stenospermocarpic cultivar supports a causal effect. We postulate that seed abortion caused by this amino acid substitution in VviAGL11 is the major cause of seedlessness in cultivated grapevine. This information can be exploited to boost seedless grape breeding. © 2018 American Society of Plant Biologists. All rights reserved.
SNPs in stress-responsive rice genes: validation, genotyping, functional relevance and population structure

PubMed Central

2012-01-01

Background Single nucleotide polymorphism (SNP) validation and large-scale genotyping are required to maximize the use of DNA sequence variation and determine the functional relevance of candidate genes for complex stress tolerance traits through genetic association in rice. We used the bead array platform-based Illumina GoldenGate assay to validate and genotype SNPs in a select set of stress-responsive genes to understand their functional relevance and study the population structure in rice. Results Of the 384 putative SNPs assayed, we successfully validated and genotyped 362 (94.3%). Of these 325 (84.6%) showed polymorphism among the 91 rice genotypes examined. Physical distribution, degree of allele sharing, admixtures and introgression, and amino acid replacement of SNPs in 263 abiotic and 62 biotic stress-responsive genes provided clues for identification and targeted mapping of trait-associated genomic regions. We assessed the functional and adaptive significance of validated SNPs in a set of contrasting drought tolerant upland and sensitive lowland rice genotypes by correlating their allelic variation with amino acid sequence alterations in catalytic domains and three-dimensional secondary protein structure encoded by stress-responsive genes. We found a strong genetic association among SNPs in the nine stress-responsive genes with upland and lowland ecological adaptation. Higher nucleotide diversity was observed in indica accessions compared with other rice sub-populations based on different population genetic parameters. The inferred ancestry of 16% among rice genotypes was derived from admixed populations with the maximum between upland aus and wild Oryza species. Conclusions SNPs validated in biotic and abiotic stress-responsive rice genes can be used in association analyses to identify candidate genes and develop functional markers for stress tolerance in rice. PMID:22921105
In search of actionable targets for agrigenomics and microalgal biofuel production: sequence-structural diversity studies on algal and higher plants with a focus on GPAT protein.

PubMed

Misra, Namrata; Panda, Prasanna Kumar

2013-04-01

The triacylglycerol (TAG) pathway provides several targets for genetic engineering to optimize microalgal lipid productivity. GPAT (glycerol-3-phosphate acyltransferase) is a crucial enzyme that catalyzes the initial step of TAG biosynthesis. Despite many recent biochemical studies, a comprehensive sequence-structure analysis of GPAT across diverse lipid-yielding organisms is lacking. Hence, we performed a comparative genomic analysis of plastid-located GPAT proteins from 7 microalgae and 3 higher plant species. The close evolutionary relationship observed between red algae/diatoms and green algae/plant lineages in the phylogenetic tree was further corroborated by motif and gene structure analysis. The predicted molecular weight, amino acid composition, Instability Index, and hydropathicity profile gave an overall representation of the biochemical features of GPAT protein across the species under study. Furthermore, homology models of GPAT from Chlamydomonas reinhardtii, Arabidopsis thaliana, and Glycine max provided deep insights into the protein architecture and substrate binding sites. Despite low sequence identity found between algal and plant GPATs, the developed models exhibited strikingly conserved topology consisting of 14 α-helices and 9 β-sheets arranged in two domains. However, subtle variations in amino acids of the fatty acyl binding site were identified that might influence the substrate selectivity of GPAT. Together, the results will provide useful resources to understand the functional and evolutionary relationship of GPAT and potentially benefit the development of engineered enzymes for augmenting algal biofuel production.
Ovine mitochondrial DNA sequence variation and its association with production and reproduction traits within an Afec-Assaf flock.

PubMed

Reicher, S; Seroussi, E; Weller, J I; Rosov, A; Gootwine, E

2012-07-01

Polymorphisms in mitochondrial DNA (mtDNA) protein- and tRNA-coding genes were shown to be associated with various diseases in humans as well as with production and reproduction traits in livestock. Alignment of full length mitochondria sequences from the 5 known ovine haplogroups: HA (n = 3), HB (n = 5), HC (n = 3), HD (n = 2), and HE (n = 2; GenBank accession nos. HE577847-50 and 11 published complete ovine mitochondria sequences) revealed sequence variation in 10 out of the 13 protein coding mtDNA sequences. Twenty-six of the 245 variable sites found in the protein coding sequences represent non-synonymous mutations. Sequence variation was observed also in 8 out of the 22 tRNA mtDNA sequences. On the basis of the mtDNA control region and cytochrome b partial sequences along with information on maternal lineages within an Afec-Assaf flock, 1,126 Afec-Assaf ewes were assigned to mitochondrial haplogroups HA, HB, and HC, with frequencies of 0.43, 0.43, and 0.14, respectively. Analysis of birth weight and growth rate records of lambs (n = 1,286) and productivity from 4,993 lambing records revealed no association between mitochondrial haplogroup affiliation and female longevity, lamb perinatal survival rate, birth weight, and daily growth rate of lambs up to 150 d that averaged 1,664 d, 88.3%, 4.5 kg, and 320 g/d, respectively. However, significant (P < 0.0001) differences among the haplogroups were found for prolificacy of ewes, with prolificacies (mean ± SE) of 2.14 ± 0.04, 2.25 ± 0.04, and 2.30 ± 0.06 lambs born/ewe lambing for the HA, HB, and the HC haplogroups, respectively. Our results highlight the ovine mitogenome genetic variation in protein- and tRNA-coding genes and suggest that sequence variation in ovine mtDNA is associated with variation in ewe prolificacy.
Application of 2D graphic representation of protein sequence based on Huffman tree method.

PubMed

Qi, Zhao-Hui; Feng, Jun; Qi, Xiao-Qin; Li, Ling

2012-05-01

Based on the Huffman tree method, we propose a new 2D graphic representation of protein sequence. This representation can completely avoid loss of information in the transfer of data from a protein sequence to its graphic representation. The method consists of two parts. One is about the 0-1 codes of the 20 amino acids obtained by a Huffman tree built from amino acid frequency. The amino acid frequency is defined as the statistical number of an amino acid in the analyzed protein sequences. The other is about the 2D graphic representation of protein sequence based on the 0-1 codes. Then the applications of the method on ten ND5 genes and seven Escherichia coli strains are presented in detail. The results show that the proposed model may provide us with some new insights to understand the evolution patterns determined from protein sequences and complete genomes. Copyright © 2012 Elsevier Ltd. All rights reserved.
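The first part of the method described above, deriving 0-1 codes for amino acids from their frequencies with a Huffman tree, can be sketched as follows. This is only an illustration of standard Huffman coding under stated assumptions: the abstract gives neither the authors' frequency table nor their mapping from codes to 2D coordinates, so the example sequence, the tie-breaking rule and the helper names (`huffman_codes`, `encode`, `decode`) are hypothetical, and the graphic step is not attempted.

```python
import heapq
from collections import Counter


def huffman_codes(freqs):
    """Build a prefix-free 0-1 code table from a {symbol: count} mapping."""
    # Heap entries are (weight, tiebreak, partial code table); the unique
    # integer tiebreak keeps Python from ever comparing the dicts.
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(sorted(freqs.items()))]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {s: "0" for s in heap[0][2]}
    next_id = len(heap)
    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)    # two lightest subtrees
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1
    return heap[0][2]


def encode(sequence, codes):
    """Concatenate the 0-1 code of each residue."""
    return "".join(codes[residue] for residue in sequence)


def decode(bits, codes):
    """Invert encode(); works because no code is a prefix of another."""
    inverse = {code: residue for residue, code in codes.items()}
    residues, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in inverse:
            residues.append(inverse[buffer])
            buffer = ""
    if buffer:
        raise ValueError("bit string ends in the middle of a code")
    return "".join(residues)


if __name__ == "__main__":
    protein = "MKVLAAGIVALLLA"               # made-up sequence for illustration
    table = huffman_codes(Counter(protein))  # frequency = residue counts
    bits = encode(protein, table)
    print(table)
    print(bits, decode(bits, table) == protein)
```

Because the codes are prefix-free, the bit string decodes back to the original residues, which is consistent with the abstract's claim that the representation avoids loss of information.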
Opsin cDNA sequences of a UV and green rhodopsin of the satyrine butterfly Bicyclus anynana.

PubMed

Vanhoutte, K J A; Eggen, B J L; Janssen, J J M; Stavenga, D G

2002-11-01

The cDNAs of an ultraviolet (UV) and long-wavelength (LW) (green) absorbing rhodopsin of the bush brown Bicyclus anynana were partially identified. The UV sequence, encoding 377 amino acids, is 76-79% identical to the UV sequences of the papilionids Papilio glaucus and Papilio xuthus and the moth Manduca sexta. A dendrogram derived from aligning the amino acid sequences reveals an equidistant position of Bicyclus between Papilio and Manduca. The sequence of the green opsin cDNA fragment, which encodes 242 amino acids, represents six of the seven transmembrane regions. At the amino acid level, this fragment is more than 80% identical to the corresponding LW opsin sequences of Dryas, Heliconius, Papilio (rhodopsin 2) and Manduca. Whereas three LW absorbing rhodopsins were identified in the papilionid butterflies, only one green opsin was found in B. anynana.
