kb sequence analysis: Topics by Science.gov

Sample records for kb sequence analysis

Structural analysis of two length variants of the rDNA intergenic spacer from Eruca sativa.

PubMed

Lakshmikumaran, M; Negi, M S

1994-03-01

Restriction enzyme analysis of the rRNA genes of Eruca sativa indicated the presence of many length variants within a single plant and also between different cultivars which is unusual for most crucifers studied so far. Two length variants of the rDNA intergenic spacer (IGS) from a single individual E. sativa (cv. Itsa) plant were cloned and characterized. The complete nucleotide sequences of both the variants (3 kb and 4 kb) were determined. The intergenic spacer contains three families of tandemly repeated DNA sequences denoted as A, B and C. However, the long (4 kb) variant shows the presence of an additional repeat, denoted as D, which is a duplication of a 224 bp sequence just upstream of the putative transcription initiation site. Repeat units belonging to the three different families (A, B and C) were in the size range of 22 to 30 bp. Such short repeat elements are present in the IGS of most of the crucifers analysed so far. Sequence analysis of the variants (3 kb and 4 kb) revealed that the length heterogeneity of the spacer is located at three different regions and is due to the varying copy numbers of repeat units belonging to families A and B. Length variation of the spacer is also due to the presence of a large duplication (D repeats) in the 4 kb variant which is absent in the 3 kb variant. The putative transcription initiation site was identified by comparisons with the rDNA sequences from other plant species.
Molecular analysis of the glucocerebrosidase gene locus

DOE Office of Scientific and Technical Information (OSTI.GOV)

Winfield, S.L.; Martin, B.M.; Fandino, A.

1994-09-01

Gaucher disease is due to a deficiency in the activity of the lysosomal enzyme glucocerebrosidase. Both the functional gene for this enzyme and a pseudogene are located in close proximity on chromosome 1q21. Analysis of the mutations present in patient samples has suggested interaction between the functional gene and the pseudogene in the origin of mutant genotypes. To investigate the involvement of regions flanking the functional gene and pseudogene in the origin of mutations found in Gaucher disease, a YAC clone containing DNA from this locus has been subcloned and characterized. The original YAC containing approximately 360 kb was truncated with the use of fragmentation plasmids to about 85 kb. A lambda library derived from this YAC was screened to obtain clones containing glucocerebrosidase sequences. PCR amplification was used to identify subclones containing 5', central, or 3' sequences of the functional gene or of the pseudogene. Clones spanning the entire distance from the last exon of the functional gene to intron 1 of the pseudogene, the 5' end of the functional gene and 16 kb of 5' flanking region and approximately 15 kb of 3' flanking region of the pseudogene were sequenced. Sequence data from 48 kb of intergenic and flanking regions of the glucocerebrosidase gene and its pseudogene has been generated. A large number of Alu sequences and several simple repeats have been found. Two of these repeats exhibit fragment length polymorphism. There is almost 100% homology between the 3' flanking regions of the functional gene and the pseudogene, extending to about 4 kb past the termination codons. A much lower degree of homology is observed in the 5' flanking region. Patient samples are currently being screened for polymorphisms in these flanking regions.
Origin of noncoding DNA sequences: molecular fossils of genome evolution

DOE Office of Scientific and Technical Information (OSTI.GOV)

Naora, H.; Miyahara, K.; Curnow, R.N.

The total amount of noncoding sequences on chromosomes of contemporary organisms varies significantly from species to species. The authors propose a hypothesis for the origin of these noncoding sequences that assumes that (i) an approx. 0.55-kilobase (kb)-long reading frame composed the primordial gene and (ii) a 20-kb-long single-stranded polynucleotide is the longest molecule (as a genome) that was polymerized at random and without a specific template in the primordial soup/cell. The statistical distribution of stop codons allows examination of the probability of generating reading frames of approx. 0.55 kb in this primordial polynucleotide. This analysis reveals that with three stop codons, a run of at least 0.55-kb equivalent length of nonstop codons would occur in 4.6% of 20-kb-long polynucleotide molecules. They attempt to estimate the total amount of noncoding sequences that would be present on the chromosomes of contemporary species assuming that present-day chromosomes retain the prototype primordial genome structure. Theoretical estimates thus obtained for most eukaryotes do not differ significantly from those reported for these specific organisms, with only a few exceptions. Furthermore, analysis of possible stop-codon distributions suggests that life on earth would not exist, at least in its present form, had two or four stop codons been selected early in evolution.
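The stop-codon argument in this abstract can be checked with a small calculation. The sketch below is not the authors' method, which the abstract does not spell out; it assumes a single reading frame, equal base frequencies (so each of the 64 codons is equally likely), 0.55 kb taken as 183 codons, and 20 kb taken as 6666 codons. The function name `prob_long_open_run` is hypothetical. Under these assumptions the three-stop-codon case should land close to the reported 4.6%.

```python
def prob_long_open_run(n_codons, run_len, n_stops=3):
    """Probability that a random string of n_codons codons contains at least
    one run of run_len consecutive non-stop codons (one reading frame).

    Assumes all 64 codons are equally likely, of which n_stops are stops.
    """
    p_open = (64 - n_stops) / 64.0   # chance that a random codon is not a stop
    p_stop = 1.0 - p_open
    # state[k]: probability that the current open run has length k and no
    # qualifying run has been seen so far (k = 0 .. run_len - 1).
    state = [0.0] * run_len
    state[0] = 1.0
    found = 0.0
    for _ in range(n_codons):
        alive = sum(state)
        # An open codon after a run of run_len - 1 completes a qualifying run.
        found += p_open * state[-1]
        # A stop resets the run to 0; an open codon extends shorter runs by 1.
        state = [p_stop * alive] + [p_open * s for s in state[:-1]]
    return found


n_codons = 20_000 // 3   # 20 kb read in a single frame (assumption)
run_len = 550 // 3       # about 0.55 kb of consecutive non-stop codons
print(prob_long_open_run(n_codons, run_len, n_stops=3))
```

Calling the function with `n_stops=2` or `n_stops=4` shows how sharply the probability depends on the number of stop codons, which is the comparison the abstract's last sentence relies on.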
Genome sequencing and analysis of a type A Clostridium perfringens isolate from a case of bovine clostridial abomasitis.

PubMed

Nowell, Victoria J; Kropinski, Andrew M; Songer, J Glenn; MacInnes, Janet I; Parreira, Valeria R; Prescott, John F

2012-01-01

Clostridium perfringens is a common inhabitant of the avian and mammalian gastrointestinal tracts and can behave commensally or pathogenically. Some enteric diseases caused by type A C. perfringens, including bovine clostridial abomasitis, remain poorly understood. To investigate the potential basis of virulence in strains causing this disease, we sequenced the genome of a type A C. perfringens isolate (strain F262) from a case of bovine clostridial abomasitis. The ∼3.34 Mbp chromosome of C. perfringens F262 is predicted to contain 3163 protein-coding genes, 76 tRNA genes, and an integrated plasmid sequence, Cfrag (∼18 kb). In addition, sequences of two complete circular plasmids, pF262C (4.8 kb) and pF262D (9.1 kb), and two incomplete plasmid fragments, pF262A (48.5 kb) and pF262B (50.0 kb), were identified. Comparison of the chromosome sequence of C. perfringens F262 to complete C. perfringens chromosomes, plasmids and phages revealed 261 unique genes. No novel toxin genes related to previously described clostridial toxins were identified: 60% of the 261 unique genes were hypothetical proteins. There was a two base pair deletion in virS, a gene reported to encode the main sensor kinase involved in virulence gene activation. Despite this frameshift mutation, C. perfringens F262 expressed perfringolysin O, alpha-toxin and the beta2-toxin, suggesting that another regulation system might contribute to the pathogenicity of this strain. Two complete plasmids, pF262C (4.8 kb) and pF262D (9.1 kb), unique to this strain of C. perfringens were identified.
Genome Sequencing and Analysis of a Type A Clostridium perfringens Isolate from a Case of Bovine Clostridial Abomasitis

PubMed Central

Nowell, Victoria J.; Kropinski, Andrew M.; Songer, J. Glenn; MacInnes, Janet I.; Parreira, Valeria R.; Prescott, John F.

2012-01-01

Clostridium perfringens is a common inhabitant of the avian and mammalian gastrointestinal tracts and can behave commensally or pathogenically. Some enteric diseases caused by type A C. perfringens, including bovine clostridial abomasitis, remain poorly understood. To investigate the potential basis of virulence in strains causing this disease, we sequenced the genome of a type A C. perfringens isolate (strain F262) from a case of bovine clostridial abomasitis. The ∼3.34 Mbp chromosome of C. perfringens F262 is predicted to contain 3163 protein-coding genes, 76 tRNA genes, and an integrated plasmid sequence, Cfrag (∼18 kb). In addition, sequences of two complete circular plasmids, pF262C (4.8 kb) and pF262D (9.1 kb), and two incomplete plasmid fragments, pF262A (48.5 kb) and pF262B (50.0 kb), were identified. Comparison of the chromosome sequence of C. perfringens F262 to complete C. perfringens chromosomes, plasmids and phages revealed 261 unique genes. No novel toxin genes related to previously described clostridial toxins were identified: 60% of the 261 unique genes were hypothetical proteins. There was a two base pair deletion in virS, a gene reported to encode the main sensor kinase involved in virulence gene activation. Despite this frameshift mutation, C. perfringens F262 expressed perfringolysin O, alpha-toxin and the beta2-toxin, suggesting that another regulation system might contribute to the pathogenicity of this strain. Two complete plasmids, pF262C (4.8 kb) and pF262D (9.1 kb), unique to this strain of C. perfringens were identified. PMID:22412860
Analysis of the Complete Mitochondrial Genome Sequence of the Diploid Cotton Gossypium raimondii by Comparative Genomics Approaches

PubMed Central

Paterson, Andrew H.; Wang, Xuelin; Xu, Yiqing; Wu, Dongyang; Qu, Yanshu; Jiang, Anna; Ye, Qiaolin

2016-01-01

Cotton is one of the most important economic crops and the primary source of natural fiber and is an important protein source for animal feed. The complete nuclear and chloroplast (cp) genome sequences of G. raimondii are already available but not mitochondria. Here, we assembled the complete mitochondrial (mt) DNA sequence of G. raimondii into a circular genome of length of 676,078 bp and performed comparative analyses with other higher plants. The genome contains 39 protein-coding genes, 6 rRNA genes, and 25 tRNA genes. We also identified four larger repeats (63.9 kb, 10.6 kb, 9.1 kb, and 2.5 kb) in this mt genome, which may be active in intramolecular recombination in the evolution of cotton. Strikingly, nearly all of the G. raimondii mt genome has been transferred to nucleus on Chr1, and the transfer event must be very recent. Phylogenetic analysis reveals that G. raimondii, as a member of Malvaceae, is much closer to another cotton (G. barbadense) than other rosids, and the clade formed by two Gossypium species is sister to Brassicales. The G. raimondii mt genome may provide a crucial foundation for evolutionary analysis, molecular biology, and cytoplasmic male sterility in cotton and other higher plants. PMID:27847816
Diagnostic screening identifies a wide range of mutations involving the SHOX gene, including a common 47.5 kb deletion 160 kb downstream with a variable phenotypic effect.

PubMed

Bunyan, David J; Baker, Kevin R; Harvey, John F; Thomas, N Simon

2013-06-01

Léri-Weill dyschondrosteosis (LWD) results from heterozygous mutations of the SHOX gene, with homozygosity or compound heterozygosity resulting in the more severe form, Langer mesomelic dysplasia (LMD). These mutations typically take the form of whole or partial gene deletions, point mutations within the coding sequence, or large (>100 kb) 3' deletions of downstream regulatory elements. We have analyzed the coding sequence of the SHOX gene and its downstream regulatory regions in a cohort of 377 individuals referred with symptoms of LWD, LMD or short stature. A causative mutation was identified in 68% of the probands with LWD or LMD (91/134). In addition, a 47.5 kb deletion was found 160 kb downstream of the SHOX gene in 17 of the 377 patients (12% of the LWD referrals, 4.5% of all referrals). In 14 of these 17 patients, this was the only potentially causative abnormality detected (13 had symptoms consistent with LWD and one had short stature only), but the other three 47.5 kb deletions were found in patients with an additional causative SHOX mutation (with symptoms of LWD rather than LMD). Parental samples were available on 14/17 of these families, and analysis of these showed a more variable phenotype ranging from apparently unaffected to LWD. Breakpoint sequence analysis has shown that the 47.5 kb deletion is identical in all 17 patients, most likely due to an ancient founder mutation rather than recurrence. This deletion was not seen in 471 normal controls (P<0.0001), providing further evidence for a phenotypic effect, albeit one with variable penetration. Copyright © 2013 Wiley Periodicals, Inc.
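The proportions and the case-control comparison quoted above can be reproduced from the counts in the abstract. The abstract does not name the statistical test behind P<0.0001, so the one-sided Fisher exact test below is an assumption chosen for illustration, and `fisher_one_sided` is a hypothetical helper built only on the standard library.

```python
from math import comb


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact P-value for the 2x2 table [[a, b], [c, d]]:
    probability of observing a or more in the top-left cell, margins fixed."""
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)
    p = 0.0
    for x in range(a, min(row1, col1) + 1):
        p += comb(row1, x) * comb(row2, col1 - x) / denom
    return p


# Counts taken from the abstract.
print(round(100 * 91 / 134))      # probands with a causative mutation, percent
print(round(100 * 17 / 377, 1))   # referrals carrying the 47.5 kb deletion, percent

# Deletion carriers vs non-carriers: 17 of 377 referrals, 0 of 471 controls.
p_value = fisher_one_sided(17, 377 - 17, 0, 471)
print(p_value)
```

With zero carriers among the controls, the one-sided tail reduces to a single table, so the P-value is simply the chance that all 17 carriers fall in the referral group by chance.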
Analysis of complex repeat sequences within the spinal muscular atrophy (SMA) candidate region in 5q13

DOE Office of Scientific and Technical Information (OSTI.GOV)

Davies, K.E.; Morrison, K.E.; Daniels, R.I.

1994-09-01

We previously reported that the 400 kb interval flanked by the polymorphic loci D5S435 and D5S557 contains blocks of a chromosome 5 specific repeat. This interval also defines the SMA candidate region by genetic analysis of recombinant families. A YAC contig of 2-3 Mb encompassing this area has been constructed and a 5.5 kb conserved fragment, isolated from a YAC end clone within the above interval, was used to obtain cDNAs from both fetal and adult brain libraries. We describe the identification of cDNAs with stretches of high DNA sequence homology to exons of β-glucuronidase on human chromosome 7. The cDNAs map both to the candidate region and to an area of 5p using FISH and deletion hybrid analysis. Hybridization to bacteriophage and cosmid clones from the YACs localizes the β-glucuronidase related sequences within the 400 kb region of the YAC contig. The cDNAs show a polymorphic pattern on hybridization to genomic BamH1 fragments in the size range of 10-250 kb. YAC fragmentation vectors are being used to determine how these β-glucuronidase related cDNAs are distributed within 5q13. Dinucleotide repeats within the region are being investigated to determine linkage disequilibrium with the disease locus.
Sequencing, annotation and comparative analysis of nine BACs of giant panda (Ailuropoda melanoleuca).

PubMed

Zheng, Yang; Cai, Jing; Li, JianWen; Li, Bo; Lin, Runmao; Tian, Feng; Wang, XiaoLing; Wang, Jun

2010-01-01

A 10-fold BAC library for giant panda was constructed and nine BACs were selected to generate finished sequences. These BACs could be used as a validation resource for the de novo assembly accuracy of the whole genome shotgun sequencing reads of giant panda newly generated by the Illumina GA sequencing technology. Complete Sanger sequencing, assembly, annotation and comparative analysis were carried out on the selected BACs, which have a joint length of 878 kb. Homologue search and de novo prediction methods were used to annotate genes and repeats. Twelve protein-coding genes were predicted, seven of which could be functionally annotated. The seven genes have an average gene size of about 41 kb, an average coding size of about 1.2 kb and an average exon number of 6 per gene. In addition, seven tRNA genes were found. About 27 percent of the BAC sequence is composed of repeats. A phylogenetic tree was constructed using the neighbor-joining algorithm across five species (giant panda, human, dog, cat and mouse), which reconfirms dog as the species most closely related to giant panda. Our results provide detailed sequence and structure information for new genes and repeats of giant panda, which will be helpful for further studies on the giant panda.
Cloning and sequence analysis of a cDNA clone coding for the mouse GM2 activator protein.

PubMed Central

Bellachioma, G; Stirling, J L; Orlacchio, A; Beccari, T

1993-01-01

A cDNA (1.1 kb) containing the complete coding sequence for the mouse GM2 activator protein was isolated from a mouse macrophage library using a cDNA for the human protein as a probe. There was a single ATG located 12 bp from the 5' end of the cDNA clone followed by an open reading frame of 579 bp. Northern blot analysis of mouse macrophage RNA showed that there was a single band with a mobility corresponding to a size of 2.3 kb. We deduce from this that the mouse mRNA, in common with the mRNA for the human GM2 activator protein, has a long 3' untranslated sequence of approx. 1.7 kb. Alignment of the mouse and human deduced amino acid sequences showed 68% identity overall and 75% identity for the sequence on the C-terminal side of the first 31 residues, which in the human GM2 activator protein contains the signal peptide. Hydropathicity plots showed great similarity between the mouse and human sequences even in regions of low sequence similarity. There is a single N-glycosylation site in the mouse GM2 activator protein sequence (Asn151-Phe-Thr) which differs in its location from the single site reported in the human GM2 activator protein sequence (Asn63-Val-Thr). PMID:7689829
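The deduction of the 3' untranslated length is simple subtraction, sketched here as a rough check. It assumes the 12 bp upstream of the ATG is the whole 5' untranslated region and that the 2.3 kb Northern estimate is the full mRNA length, neither of which the abstract guarantees; `three_prime_utr_length` is a hypothetical helper name.

```python
def three_prime_utr_length(mrna_len, atg_offset, orf_len):
    """Length left over downstream of the ORF, in bp.

    mrna_len   -- estimated mRNA size (here from the Northern blot)
    atg_offset -- bases upstream of the start codon
    orf_len    -- open reading frame length
    """
    return mrna_len - atg_offset - orf_len


utr = three_prime_utr_length(mrna_len=2300, atg_offset=12, orf_len=579)
print(utr)          # 1709 bp, i.e. roughly the 1.7 kb quoted above
print(579 // 3)     # number of codons spanned by the 579 bp ORF
```

Because the Northern size includes any poly(A) tail and the true 5' end may extend past the cloned cDNA, the figure is an approximation, consistent with the abstract's "approx."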
DOE Office of Scientific and Technical Information (OSTI.GOV)

Kennedy, M.A.; Morris, C.M.; Fitzgerald, P.H.

The human kappa deleting element (Kde) mediates loss of CK and JK genes in B cells. A probe for Kde detects two genomic sequences on Southern blots. The Kde is located 24 kb 3' to CK, but the position of the homologous sequence is unknown. The authors in situ hybridized m141-2 to metaphase cells of JC11, a B-cell line bearing a t(2;14)(p11;q32) in which the chromosome 2 breakpoint is within JK or the VK-JK intron. Three peaks of labelled sites were obtained. Southern analysis of BamH1 digested DNA showed that Kde (14 kb) and the homologous sequence (3 kb) were both intact. Kde accounts for hybridization to 14q+ and the 2p- signal presumably derives from the related sequence. This locates the sequence homologous to Kde upstream from JK, possibly within the VK cluster, and may reflect transposition or some other duplicative event as proposed for the evolution of other regions of the kappa locus.
Sequences in the intergenic spacer influence RNA Pol I transcription from the human rRNA promoter

DOE Office of Scientific and Technical Information (OSTI.GOV)

Li, W.M.; Sylvester, J.E.

1994-09-01

In most eucaryotic species, ribosomal genes are tandemly repeated about 100-5000 times per haploid genome. The 43 kb human rDNA repeat consists of a 13 kb coding region for the 18S, 5.8S, 28S ribosomal RNAs (rRNAs) and transcribed spacers separated by a 30 kb intergenic spacer. For species such as frog, mouse and rat, sequences in the intergenic spacer other than the gene promoter have been shown to modulate transcription of the ribosomal gene. These sequences are spacer promoters, enhancers and the terminator for spacer transcription. We are addressing whether the human ribosomal gene promoter is similarly influenced. In vitro transcription run-off assays have revealed that the 4.5 kb region (CBE), directly upstream of the gene promoter, has cis-stimulation and trans-competition properties. This suggests that the CBE fragment contains an enhancer(s) for ribosomal gene transcription. Further experiments have shown that a fragment (approximately 1.6 kb) within the CBE fragment also has trans-competition function. Deletion subclones of this region are being tested to delineate the exact sequences responsible for these modulating activities. Previous sequence analysis and functional studies have revealed that CBE contains regions of DNA capable of adopting alternative structures such as bent DNA, Z-DNA, and triple-stranded DNA. Whether these structures are required for modulating transcription remains to be determined as does the specific DNA-protein interaction involved.
Alternative polyadenylation of the gene transcripts encoding a rat DNA polymerase beta.

PubMed

Konopiński, R; Nowak, R; Siedlecki, J A

1996-10-17

Rat cells produce two different transcripts of DNA polymerase beta (beta-Pol). The low-molecular-weight transcript (1.4 kb) was already sequenced. We report here the cloning and sequencing of the full-length cDNA, corresponding to the high-molecular-weight (HMW) transcript (4.0 kb) of beta-Pol. Sequence data strongly suggest that both transcripts are produced from a single gene by alternative polyadenylation. The HMW transcript contains the entire 1.4 kb transcript sequence and additional 2.2 kb on the 3' end. The 3' UTR of the HMW transcript contains some regulatory sequences which are not present in the 1.4-kb transcript. The A + U-rich fragment and (GU)21 sequence are believed to influence the stability of the mRNA. The functional significance of the A-rich region locally destabilizing double-stranded secondary structure remains unknown.
Structural and transcription analysis of two homologous genes for the P700 chlorophyll a-apoproteins in Chlamydomonas reinhardii: evidence for in vivo trans-splicing

PubMed Central

Kück, Ulrich; Choquet, Yves; Schneider, Michel; Dron, Michel; Bennoun, Pierre

1987-01-01

The two homologous genes for the P700 chlorophyll a-apoproteins (ps1A1 and ps1A2) are encoded by the plastom in the green alga Chlamydomonas reinhardii. The structure and organization of the two genes were determined by comparison with the homologous genes from maize using data from heterologous hybridizations as well as from DNA and RNA sequencing. While the ps1A2 (736 codons) gene shows a continuous gene organization, the ps1A1 (754 codons) gene possesses some unusual features. The discontinuous gene is split into three separate exons which are scattered around the circular chloroplast genome. Exon 1 (86 bp) is separated by ∼50 kb from exon 2 (198 bp), which is located ∼90 kb apart from exon 3 (1984 bp). All exons are flanked by intronic sequences of group II. Transcription analysis reveals that the ps1A2 gene hybridizes with a 2.8-kb transcript, while all exon regions of the ps1A1 gene are homologous to a mature mRNA of 2.7 kb. From our data we conclude that the three distantly separated exonic sequences of the ps1A1 gene constitute a functional gene which probably operates by a trans-splicing mechanism. PMID:16453785
Machine Learned Replacement of N-Labels for Basecalled Sequences in DNA Barcoding.

PubMed

Ma, Eddie Y T; Ratnasingham, Sujeevan; Kremer, Stefan C

2018-01-01

This study presents a machine learning method that increases the number of identified bases in Sanger Sequencing. The system post-processes a KB basecalled chromatogram. It selects a recoverable subset of N-labels in the KB-called chromatogram to replace with basecalls (A,C,G,T). An N-label correction is defined given an additional read of the same sequence, and a human finished sequence. Corrections are added to the dataset when an alignment determines the additional read and human agree on the identity of the N-label. KB must also rate the replacement with quality value of in the additional read. Corrections are only available during system training. Developing the system, nearly 850,000 N-labels are obtained from Barcode of Life Datasystems, the premier database of genetic markers called DNA Barcodes. Increasing the number of correct bases improves reference sequence reliability, increases sequence identification accuracy, and assures analysis correctness. Keeping with barcoding standards, our system maintains an error rate of percent. Our system only applies corrections when it estimates low rate of error. Tested on this data, our automation selects and recovers: 79 percent of N-labels from COI (animal barcode); 80 percent from matK and rbcL (plant barcodes); and 58 percent from non-protein-coding sequences (across eukaryotes).
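The rule for accepting an N-label correction during training can be sketched as follows. This is an illustrative reading of the abstract, not the authors' code: it assumes the three sequences are already aligned to equal length, and the quality threshold `min_qv` is a hypothetical parameter because the required value is missing from the text above. The function name `find_n_corrections` is also hypothetical.

```python
def find_n_corrections(called, additional, finished, additional_qv, min_qv):
    """Return (position, base) pairs where an N in `called` can be corrected.

    called        -- KB-called read containing N-labels
    additional    -- a second read of the same sequence, aligned to `called`
    finished      -- the human-finished sequence, aligned to `called`
    additional_qv -- per-base quality values for `additional`
    min_qv        -- assumed minimum quality value (not given in the abstract)
    """
    corrections = []
    for i, base in enumerate(called):
        if base != "N":
            continue
        candidate = additional[i]
        if candidate not in ("A", "C", "G", "T"):
            continue
        # Accept only when the additional read and the human agree and the
        # additional read's basecall is rated highly enough.
        if candidate == finished[i] and additional_qv[i] >= min_qv:
            corrections.append((i, candidate))
    return corrections


example = find_n_corrections(
    called="ACNTNGN",
    additional="ACGTAGC",
    finished="ACGTTGC",
    additional_qv=[40, 40, 35, 40, 40, 40, 10],
    min_qv=20,  # illustrative value only
)
print(example)
```

In the example, only position 2 qualifies: position 4 fails because the additional read and the finished sequence disagree, and position 6 fails the assumed quality threshold.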
Molecular Cloning and Analysis of L(1)ogre, a Locus of Drosophila Melanogaster with Prominent Effects on the Postembryonic Development of the Central Nervous System

PubMed Central

Watanabe, T.; Kankel, D. R.

1990-01-01

Previous genetic studies have shown that wild-type function of the l(1)ogre (lethal (1) optic ganglion reduced) locus is essential for the generation and/or maintenance of the postembryonic neuroblasts including those from which the optic lobe is descended. In the present study molecular isolation and characterization of the l(1)ogre locus was carried out to study the structure and expression of this gene in order to gain information about the nature of l(1)ogre function and its relevance to the development of the central nervous system. About 70 kilobases (kb) of genomic DNA were isolated that spanned the region where l(1)ogre was known to reside. Southern analysis of a l(1)ogre mutation and subsequent P element-mediated DNA transformation mapped the l(1)ogre(+) function within a genomic fragment of 12.5 kb. Northern analyses showed that a 2.9-kb message transcribed from this 12.5-kb region represented l(1)ogre. A 2.15-kb portion of a corresponding cDNA clone was sequenced. An open reading frame (ORF) of 1,086 base pairs was found, and a protein sequence of 362 amino acids with one highly hydrophobic segment was deduced from conceptual translation of this ORF. PMID:1963867
Second-generation sequencing of entire mitochondrial coding-regions (∼15.4 kb) holds promise for study of the phylogeny and taxonomy of human body lice and head lice.

PubMed

Xiong, H; Campelo, D; Pollack, R J; Raoult, D; Shao, R; Alem, M; Ali, J; Bilcha, K; Barker, S C

2014-08-01

The Illumina Hiseq platform was used to sequence the entire mitochondrial coding-regions of 20 body lice, Pediculus humanus Linnaeus, and head lice, P. capitis De Geer (Phthiraptera: Pediculidae), from eight towns and cities in five countries: Ethiopia, France, China, Australia and the U.S.A. These data (∼310 kb) were used to see how much more informative entire mitochondrial coding-region sequences were than partial mitochondrial coding-region sequences, and thus to guide the design of future studies of the phylogeny, origin, evolution and taxonomy of body lice and head lice. Phylogenies were compared from entire coding-region sequences (∼15.4 kb), entire cox1 (∼1.5 kb), partial cox1 (∼700 bp) and partial cytb (∼600 bp) sequences. On the one hand, phylogenies from entire mitochondrial coding-region sequences (∼15.4 kb) were much more informative than phylogenies from entire cox1 sequences (∼1.5 kb) and partial gene sequences (∼600 to ∼700 bp). For example, 19 branches had > 95% bootstrap support in our maximum likelihood tree from the entire mitochondrial coding-regions (∼15.4 kb) whereas the tree from 700 bp cox1 had only two branches with bootstrap support > 95%. Yet, by contrast, partial cytb (∼600 bp) and partial cox1 (∼486 bp) sequences were sufficient to genotype lice to Clade A, B or C. The sequences of the mitochondrial genomes of the P. humanus, P. capitis and P. schaeffi Fahrenholz studied are in NCBI GenBank under the accession numbers KC660761-800, KC685631-6330, KC241882-97, EU219988-95, HM241895-8 and JX080388-407. © 2014 The Royal Entomological Society.
Mapping PDB chains to UniProtKB entries.

PubMed

Martin, Andrew C R

2005-12-01

UniProtKB/SwissProt is the main resource for detailed annotations of protein sequences. This database provides a jumping-off point to many other resources through the links it provides. Among others, these include other primary databases, secondary databases, the Gene Ontology and OMIM. While a large number of links are provided to Protein Data Bank (PDB) files, obtaining a regularly updated mapping between UniProtKB entries and PDB entries at the chain or residue level is not straightforward. In particular, there is no regularly updated resource which allows a UniProtKB/SwissProt entry to be identified for a given residue of a PDB file. We have created a completely automatically maintained database which maps PDB residues to residues in UniProtKB/SwissProt and UniProtKB/trEMBL entries. The protocol uses links from PDB to UniProtKB, from UniProtKB to PDB and a brute-force sequence scan to resolve PDB chains for which no annotated link is available. Finally the sequences from PDB and UniProtKB are aligned to obtain a residue-level mapping. The resource may be queried interactively or downloaded from http://www.bioinf.org.uk/pdbsws/.
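The final step described above, turning an alignment of a PDB chain sequence and a UniProtKB sequence into a residue-level mapping, can be sketched as below. This is a minimal illustration, not the PDBSWS implementation: it assumes the two sequences arrive already aligned with `-` as the gap character and that PDB residues can be numbered sequentially, which ignores insertion codes and numbering gaps in real PDB files. `residue_map` is a hypothetical name.

```python
def residue_map(aligned_pdb, aligned_uniprot, pdb_start=1, uniprot_start=1):
    """Map PDB residue indices to UniProtKB residue indices.

    Both inputs are equal-length aligned sequences using '-' for gaps.
    Only columns where both sequences have a residue produce a mapping.
    """
    mapping = {}
    i, j = pdb_start, uniprot_start
    for a, b in zip(aligned_pdb, aligned_uniprot):
        if a != "-" and b != "-":
            mapping[i] = j
        if a != "-":
            i += 1
        if b != "-":
            j += 1
    return mapping


# Toy alignment: the PDB chain lacks two N-terminal residues and one
# internal residue that are present in the UniProtKB sequence.
print(residue_map("--MKT-LV", "GSMKTALV"))
```

A dictionary keyed by PDB residue index keeps the lookup direction the abstract emphasizes: given a residue of a PDB file, find the corresponding position in the UniProtKB entry.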
Characterization of a linear DNA plasmid from the filamentous fungal plant pathogen Glomerella musae [Anamorph: Colletotrichum musae (Berk. and Curt.) arx.]

USGS Publications Warehouse

Freeman, S.; Redman, R.S.; Grantham, G.; Rodriguez, R.J.

1997-01-01

A 7.4-kilobase (kb) DNA plasmid was isolated from Glomerella musae isolate 927 and designated pGML1. Exonuclease treatments indicated that pGML1 was a linear plasmid with blocked 5' termini. Cell-fractionation experiments combined with sequence-specific PCR amplification revealed that pGML1 resided in mitochondria. The pGML1 plasmid hybridized to cesium chloride-fractionated nuclear DNA but not to A + T-rich mitochondrial DNA. An internal 7.0-kb section of pGML1 was cloned and did not hybridize with either nuclear or mitochondrial DNA from G. musae. Sequence analysis revealed identical terminal inverted repeats (TIR) of 520 bp at the ends of the cloned 7.0-kb section of pGML1. The occurrence of pGML1 did not correspond with the pathogenicity of G. musae on banana fruit. Four additional isolates of G. musae possessed extrachromosomal DNA fragments similar in size and sequence to pGML1.
The transcriptional terminator sequences downstream of the covR gene terminate covR/S operon transcription to generate covR monocistronic transcripts in Streptococcus pyogenes.

PubMed

Chiang-Ni, Chuan; Tsou, Chih-Cheng; Lin, Yee-Shin; Chuang, Woei-Jer; Lin, Ming-T; Liu, Ching-Chuan; Wu, Jiunn-Jong

2008-12-31

CovR/S is an important two component regulatory system, which regulates about 15% of the gene expression in Streptococcus pyogenes. The covR/S locus was identified as an operon generating an RNA transcript around 2.5-kb in size. In this study, we found the covR/S operon produced three RNA transcripts (around 2.5-, 1.0-, and 0.8-kb in size). Using RNA transcriptional terminator sequence prediction and transcriptional terminator analysis, we identified two atypical rho-independent terminator sequences downstream of the covR gene and showed these terminator sequences terminate RNA transcription efficiently. These results indicate that covR/S operon generates covR/S transcript and monocistronic covR transcripts.

Nucleotide sequence of the Varkud mitochondrial plasmid of Neurospora and synthesis of a hybrid transcript with a 5' leader derived from mitochondrial RNA.

PubMed

Akins, R A; Grant, D M; Stohl, L L; Bottorff, D A; Nargang, F E; Lambowitz, A M

1988-11-05

The Mauriceville and Varkud mitochondrial plasmids of Neurospora are closely related, closed circular DNAs (3.6 and 3.7 kb, respectively; 1 kb = 10^3 bases or base-pairs), whose characteristics suggest relationships to mitochondrial DNA introns and retrotransposons. Here, we characterized the structure of the Varkud plasmid, determined its complete nucleotide sequence and mapped its major transcripts. The Mauriceville and Varkud plasmids have more than 97% positional identity. Both plasmids contain a 710 amino acid open reading frame that encodes a reverse transcriptase-like protein. The amino acid sequence of this open reading frame is strongly conserved between the two plasmids (701/710 amino acids) as expected for a functionally important protein. Both plasmids have a 0.4 kb region that contains five PstI palindromes and a direct repeat of approximately 160 base-pairs. Comparison of sequences in this region suggests that the Varkud plasmid has diverged less from a common ancestor than has the Mauriceville plasmid. Two major transcripts of the Varkud plasmid were detected by Northern hybridization experiments: a full-length linear RNA of 3.7 kb and an additional prominent transcript of 4.9 kb, 1.2 kb longer than monomer plasmid. Remarkably, we find that the 4.9 kb transcript is a hybrid RNA consisting of the full-length 3.7 kb Varkud plasmid transcript plus a 5' leader of 1.2 kb that is derived from the 5' end of the mitochondrial small rRNA. This and other findings suggest that the Varkud plasmid, like certain RNA viruses, has a mechanism for joining heterologous RNAs to the 5' end of its major transcript, and that, under some circumstances, nucleotide sequences in mitochondria may be recombined at the RNA level.
Two sequence-ready contigs spanning the two copies of a 200-kb duplication on human 21q: partial sequence and polymorphisms.

PubMed

Potier, M; Dutriaux, A; Orti, R; Groet, J; Gibelin, N; Karadima, G; Lutfalla, G; Lynn, A; Van Broeckhoven, C; Chakravarti, A; Petersen, M; Nizetic, D; Delabar, J; Rossier, J

1998-08-01

Physical mapping across a duplication can be a tour de force if the region is larger than the size of a bacterial clone. This was the case of the 170- to 275-kb duplication present on the long arm of chromosome 21 in normal human at 21q11.1 (proximal region) and at 21q22.1 (distal region), which we described previously. We have constructed sequence-ready contigs of the two copies of the duplication of which all the clones are genuine representatives of one copy or the other. This required the identification of four duplicon polymorphisms that are copy-specific and nonallelic variations in the sequence of the STSs. Thirteen STSs were mapped inside the duplicated region and 5 outside but close to the boundaries. Among these STSs 10 were end clones from YACs, PACs, or cosmids, and the average interval between two markers in the duplicated region was 16 kb. Eight PACs and cosmids showing minimal overlaps were selected in both copies of the duplication. Comparative sequence analysis along the duplication showed three single-basepair changes between the two copies over 659 bp sequenced (4 STSs), suggesting that the duplication is recent (less than 4 mya). Two CpG islands were located in the duplication, but no genes were identified after a 36-kb cosmid from the proximal copy of the duplication was sequenced. The homology of this chromosome 21 duplicated region with the pericentromeric regions of chromosomes 13, 2, and 18 suggests that the mechanism involved is probably similar to pericentromeric-directed mechanisms described in interchromosomal duplications. Copyright 1998 Academic Press.
Cytogenetic and Sequence Analyses of Mitochondrial DNA Insertions in Nuclear Chromosomes of Maize

PubMed Central

Lough, Ashley N.; Faries, Kaitlyn M.; Koo, Dal-Hoe; Hussain, Abid; Roark, Leah M.; Langewisch, Tiffany L.; Backes, Teresa; Kremling, Karl A. G.; Jiang, Jiming; Birchler, James A.; Newton, Kathleen J.

2015-01-01

The transfer of mitochondrial DNA (mtDNA) into nuclear genomes is a regularly occurring process that has been observed in many species. Few studies, however, have focused on the variation of nuclear-mtDNA sequences (NUMTs) within a species. This study examined mtDNA insertions within chromosomes of a diverse set of Zea mays ssp. mays (maize) inbred lines by the use of fluorescence in situ hybridization. A relatively large NUMT on the long arm of chromosome 9 (9L) was identified at approximately the same position in four inbred lines (B73, M825, HP301, and Oh7B). Further examination of the similarly positioned 9L NUMT in two lines, B73 and M825, indicated that the large size of these sites is due to the presence of a majority of the mitochondrial genome; however, only portions of this NUMT (∼252 kb total) were found in the publicly available B73 nuclear sequence for chromosome 9. Fiber-fluorescence in situ hybridization analysis estimated the size of the B73 9L NUMT to be ∼1.8 Mb and revealed that the NUMT is methylated. Two regions of mtDNA (2.4 kb and 3.3 kb) within the 9L NUMT are not present in the B73 mitochondrial NB genome; however, these 2.4-kb and 3.3-kb segments are present in other Zea mitochondrial genomes, including that of Zea mays ssp. parviglumis, a progenitor of domesticated maize. PMID:26333837
Complete Genome Sequence of Escherichia coli Strain M8, Isolated from ob/ob Mice

PubMed Central

Siddharth, Jay; Membrez, Mathieu; Chakrabarti, Anirikh; Betrisey, Bertrand; Chou, Chieh Jason

2017-01-01

Escherichia coli is one of the common inhabitants of the mammalian gastrointestinal tract. We isolated a strain from an ob/ob mouse and performed whole-genome sequencing, which yielded a chromosome of ~5.1 Mb and three plasmids of ~160 kb, ~6 kb, and ~4 kb. PMID:28572322
Comparative Maps of Human 19p13.3 and Mouse Chromosome 10 Allow Identification of Sequences at Evolutionary Breakpoints

PubMed Central

Puttagunta, Radhika; Gordon, Laurie A.; Meyer, Gary E.; Kapfhamer, David; Lamerdin, Jane E.; Kantheti, Prameela; Portman, Kathleen M.; Chung, Wendy K.; Jenne, Dieter E.; Olsen, Anne S.; Burmeister, Margit

2000-01-01

A cosmid/bacterial artificial chromosome (BAC) contiguous (contig) map of human chromosome (HSA) 19p13.3 has been constructed, and over 50 genes have been localized to the contig. Genes and anonymous ESTs from ≈4000 kb of human 19p13.3 were placed on the central mouse chromosome 10 map by genetic mapping and pulsed-field gel electrophoresis (PFGE) analysis. A region of ∼2500 kb of HSA 19p13.3 is collinear to mouse chromosome (MMU) 10. In contrast, the adjacent ≈1200 kb are inverted. Two genes are located in a 50-kb region after the inversion on MMU 10, followed by a region of homology to mouse chromosome 17. The synteny breakpoint and one of the inversion breakpoints have been localized to sequenced regions in human <5 kb in size. Both breakpoints are rich in simple tandem repeats, including (TCTG)n, (CT)n, and (GTCTCT)n, suggesting that simple repeat sequences may be involved in chromosome breaks during evolution. The overall size of the region in mouse is smaller, although no large regions are missing. Comparing the physical maps to the genetic maps showed that in contrast to the higher-than-average rate of genetic recombination in the gene-rich telomeric region on HSA 19p13.3, the average rate of recombination is lower than expected in the homologous mouse region. This might indicate that a hot spot of recombination may have been lost in mouse or gained in human during evolution, or that the position of sequences along the chromosome (telomeric compared to the middle of a chromosome) is important for recombination rates. PMID:10984455
Identification and characterization of transcripts from the biotin biosynthetic operon of Bacillus subtilis.

PubMed Central

Perkins, J B; Bower, S; Howitt, C L; Yocum, R R; Pero, J

1996-01-01

Northern (RNA) blot analysis of the Bacillus subtilis biotin operon, bioWAFDBIorf2, detected at least two steady-state polycistronic transcripts initiated from a putative vegetative (Pbio) promoter that precedes the operon, i.e., a full-length 7.2-kb transcript covering the entire operon and a more abundant 5.1-kb transcript covering just the first five genes of the operon. Biotin and the B. subtilis birA gene product regulated synthesis of the transcripts. Moreover, replacing the putative Pbio promoter and regulatory sequence with a constitutive SP01 phage promoter resulted in higher-level constitutive synthesis. Removal of a rho-independent terminator-like sequence located between the fifth (bioB) and sixth (bioI) genes prevented accumulation of the 5.1-kb transcript, suggesting that the putative terminator functions to limit expression of bioI, which is thought to be involved in an early step in biotin synthesis. PMID:8892842
Identification and characterization of transcripts from the biotin biosynthetic operon of Bacillus subtilis.

PubMed

Perkins, J B; Bower, S; Howitt, C L; Yocum, R R; Pero, J

1996-11-01

Northern (RNA) blot analysis of the Bacillus subtilis biotin operon, bioWAFDBIorf2, detected at least two steady-state polycistronic transcripts initiated from a putative vegetative (Pbio) promoter that precedes the operon, i.e., a full-length 7.2-kb transcript covering the entire operon and a more abundant 5.1-kb transcript covering just the first five genes of the operon. Biotin and the B. subtilis birA gene product regulated synthesis of the transcripts. Moreover, replacing the putative Pbio promoter and regulatory sequence with a constitutive SP01 phage promoter resulted in higher-level constitutive synthesis. Removal of a rho-independent terminator-like sequence located between the fifth (bioB) and sixth (bioI) genes prevented accumulation of the 5.1-kb transcript, suggesting that the putative terminator functions to limit expression of bioI, which is thought to be involved in an early step in biotin synthesis.
High-resolution mapping of the 11q13 amplicon and identification of a gene, TAOS1, that is amplified and overexpressed in oral cancer cells

PubMed Central

Huang, Xin; Gollin, Susanne M.; Raja, Siva; Godfrey, Tony E.

2002-01-01

Amplification of chromosomal band 11q13 is a common event in human cancer. It has been reported in about 45% of head and neck carcinomas and in other cancers including esophageal, breast, liver, lung, and bladder cancer. To understand the mechanism of 11q13 amplification and to identify the potential oncogene(s) driving it, we have fine-mapped the structure of the amplicon in oral squamous cell carcinoma cell lines and localized the proximal and distal breakpoints. A 5-Mb physical map of the region has been prepared from which sequence is available. We quantified copy number of sequence-tagged site markers at 42–550 kb intervals along the length of the amplicon and defined the amplicon core and breakpoints by using TaqMan-based quantitative microsatellite analysis. The core of the amplicon maps to a 1.5-Mb region. The proximal breakpoint localizes to two intervals between sequence-tagged site markers, 550 kb and 160 kb in size, and the distal breakpoint maps to a 250 kb interval. The cyclin D1 gene maps to the amplicon core, as do two new expressed sequence tag clusters. We have analyzed one of these expressed sequence tag clusters and now report that it contains a previously uncharacterized gene, TAOS1 (tumor amplified and overexpressed sequence 1), which is both amplified and overexpressed in oral cancer cells. The data suggest that TAOS1 may be an amplification-dependent candidate oncogene with a role in the development and/or progression of human tumors, including oral squamous cell carcinomas. The approach described here should be useful for characterizing amplified genomic regions in a wide variety of tumors. PMID:12172009
SfiI genomic cleavage map of Escherichia coli K-12 strain MG1655.

PubMed Central

Perkins, J D; Heath, J D; Sharma, B R; Weinstock, G M

1992-01-01

An SfiI restriction map of Escherichia coli K-12 strain MG1655 is presented. The map contains thirty-one cleavage sites separating fragments ranging in size from 407 kb to 3.7 kb. Several techniques were used in the construction of this map, including CHEF pulsed field gel electrophoresis; physical analysis of a set of twenty-six auxotrophic transposon insertions; correlation with the restriction map of Kohara and coworkers using the commercially available E. coli Gene Mapping Membranes; analysis of publicly available sequence information; and correlation of the above data with the combined genetic and physical map developed by Rudd, et al. The combination of these techniques has yielded a map in which all but one site can be localized within a range of +/- 2 kb, and over half the sites can be localized precisely by sequence data. Two sites present in the EcoSeq5 sequence database are not cleaved in MG1655 and four sites are noted to be sensitive to methylation by the dcm methylase. This map, combined with the NotI physical map of MG1655, can aid in the rapid, precise mapping of several different types of genetic alterations, including transposon mediated mutations and other insertions, inversions, deletions and duplications. PMID:1312707
Isolation and characterization of two overlapping cosmid clones from the 4q35 region, near the facioscapulohumeral muscular dystrophy locus

DOE Office of Scientific and Technical Information (OSTI.GOV)

Deidda, G.; Grisanti, P.; Vigneti, E.

1994-09-01

The gene for facioscapulohumeral muscular dystrophy (FSHD) has been localized by linkage analysis to the 4q35 region. The most telomeric p13E-11 probe has been shown to detect 4q35 DNA rearrangements in both sporadic and familial cases of the disease. With the aim of constructing a detailed physical map of the 4q35 region and searching for the mutant gene, we used p13E-11 probe to isolate cosmid clones from a human genomic library in a pCos-EMBL 2 vector. Two positive clones were isolated, clones 3 and 5, which partially overlap and carry human genomic inserts of 42 and 45 kb, respectively. The cosmids share a common region containing the p13E-11 region and a stretch of KpnI units consisting of 3.2 kb tandemly repeated sequences (about 10). The restriction maps were constructed using the following enzymes: BamHI, BglII, EcoRI, EcoRV, KpnI and SfiI. Clone 3 extends 4 kb upstream of C5 and stops within the Kpn repeats. Clone 5 extends 4 kb downstream from the Kpn repeats and it presents an additional EcoRI site. Clone 5 contains a stretch of Kpn sequences of nearly 32 kb, corresponding to 10 Kpn repeats; clone 3 contains a stretch of 29 kb corresponding to 9 Kpn repeats, as determined by PFGE analysis of partial digestion of the clones. Clone 5 seems to contain the entire EcoRI region prone to rearrangements in FSHD patients. From clone 5 several subclones were obtained, from the Kpn region and from the region spanning from the last Kpn repeat to the cloning site. No single copy sequences were detected. Subclones from the 3′ end region contain beta-satellite or Sau3A-like sequences. In situ hybridization with the whole C5 cosmid shows hybridization signals at the tip of chromosome 4 (4q35) and chromosome 10 (10q26), in the pericentromeric region of chromosome 1 (1q12) and in the p12 region of the acrocentric chromosomes (chr. 21, 22, 13, 14, 15).
Comparison of the nucleotide and amino acid sequences of the RsrI and EcoRI restriction endonucleases.

PubMed

Stephenson, F H; Ballard, B T; Boyer, H W; Rosenberg, J M; Greene, P J

1989-12-21

The RsrI endonuclease, a type-II restriction endonuclease (ENase) found in Rhodobacter sphaeroides, is an isoschizomer of the EcoRI ENase. A clone containing an 11-kb BamHI fragment was isolated from an R. sphaeroides genomic DNA library by hybridization with synthetic oligodeoxyribonucleotide probes based on the N-terminal amino acid (aa) sequence of RsrI. Extracts of E. coli containing a subclone of the 11-kb fragment display RsrI activity. Nucleotide sequence analysis reveals an 831-bp open reading frame encoding a polypeptide of 277 aa. A 50% identity exists within a 266-aa overlap between the deduced aa sequences of RsrI and EcoRI. Regions of 75-100% aa sequence identity correspond to key structural and functional regions of EcoRI. The type-II ENases have many common properties, and a common origin might have been expected. Nevertheless, this is the first demonstration of aa sequence similarity between ENases produced by different organisms.
Bacteriophage prevalence in the genus Azospirillum and analysis of the first genome sequence of an Azospirillum brasilense integrative phage.

PubMed

Boyer, Mickaël; Haurat, Jacqueline; Samain, Sylvie; Segurens, Béatrice; Gavory, Frédérick; González, Víctor; Mavingui, Patrick; Rohr, René; Bally, René; Wisniewski-Dyé, Florence

2008-02-01

The prevalence of bacteriophages was investigated in 24 strains of four species of plant growth-promoting rhizobacteria belonging to the genus Azospirillum. Upon induction by mitomycin C, the release of phage particles was observed in 11 strains from three species. Transmission electron microscopy revealed two distinct sizes of particles, depending on the identity of the Azospirillum species, typical of the Siphoviridae family. Pulsed-field gel electrophoresis and hybridization experiments carried out on phage-encapsidated DNAs revealed that all phages isolated from A. lipoferum and A. doebereinerae strains had a size of about 10 kb whereas all phages isolated from A. brasilense strains displayed genome sizes ranging from 62 to 65 kb. Strong DNA hybridizing signals were shown for most phages hosted by the same species whereas no homology was found between phages harbored by different species. Moreover, the complete sequence of the A. brasilense Cd bacteriophage (phiAb-Cd) genome was determined as a double-stranded DNA circular molecule of 62,337 bp that encodes 95 predicted proteins. Only 14 of the predicted proteins could be assigned functions, some of which were involved in DNA processing, phage morphogenesis, and bacterial lysis. In addition, the phiAb-Cd complete genome was mapped as a prophage on a 570-kb replicon of strain A. brasilense Cd, and a region of 27.3 kb of phiAb-Cd was found to be duplicated on the 130-kb pRhico plasmid previously sequenced from A. brasilense Sp7, the parental strain of A. brasilense Cd.
Bacteriophage Prevalence in the Genus Azospirillum and Analysis of the First Genome Sequence of an Azospirillum brasilense Integrative Phage

PubMed Central

Boyer, Mickaël; Haurat, Jacqueline; Samain, Sylvie; Segurens, Béatrice; Gavory, Frédérick; González, Víctor; Mavingui, Patrick; Rohr, René; Bally, René; Wisniewski-Dyé, Florence

2008-01-01

The prevalence of bacteriophages was investigated in 24 strains of four species of plant growth-promoting rhizobacteria belonging to the genus Azospirillum. Upon induction by mitomycin C, the release of phage particles was observed in 11 strains from three species. Transmission electron microscopy revealed two distinct sizes of particles, depending on the identity of the Azospirillum species, typical of the Siphoviridae family. Pulsed-field gel electrophoresis and hybridization experiments carried out on phage-encapsidated DNAs revealed that all phages isolated from A. lipoferum and A. doebereinerae strains had a size of about 10 kb whereas all phages isolated from A. brasilense strains displayed genome sizes ranging from 62 to 65 kb. Strong DNA hybridizing signals were shown for most phages hosted by the same species whereas no homology was found between phages harbored by different species. Moreover, the complete sequence of the A. brasilense Cd bacteriophage (ΦAb-Cd) genome was determined as a double-stranded DNA circular molecule of 62,337 bp that encodes 95 predicted proteins. Only 14 of the predicted proteins could be assigned functions, some of which were involved in DNA processing, phage morphogenesis, and bacterial lysis. In addition, the ΦAb-Cd complete genome was mapped as a prophage on a 570-kb replicon of strain A. brasilense Cd, and a region of 27.3 kb of ΦAb-Cd was found to be duplicated on the 130-kb pRhico plasmid previously sequenced from A. brasilense Sp7, the parental strain of A. brasilense Cd. PMID:18065619
Assessing the performance of the Oxford Nanopore Technologies MinION

PubMed Central

Laver, T.; Harrison, J.; O’Neill, P.A.; Moore, K.; Farbos, A.; Paszkiewicz, K.; Studholme, D.J.

2015-01-01

The Oxford Nanopore Technologies (ONT) MinION is a new sequencing technology that potentially offers read lengths of tens of kilobases (kb) limited only by the length of DNA molecules presented to it. The device has a low capital cost, is by far the most portable DNA sequencer available, and can produce data in real-time. It has numerous prospective applications including improving genome sequence assemblies and resolution of repeat-rich regions. Before such a technology is widely adopted, it is important to assess its performance and limitations in respect of throughput and accuracy. In this study we assessed the performance of the MinION by re-sequencing three bacterial genomes, with very different nucleotide compositions ranging from 28.6% to 70.7%; the high G + C strain was underrepresented in the sequencing reads. We estimate the error rate of the MinION (after base calling) to be 38.2%. Mean and median read lengths were 2 kb and 1 kb respectively, while the longest single read was 98 kb. The whole length of a 5 kb rRNA operon was covered by a single read. As the first nanopore-based single molecule sequencer available to researchers, the MinION is an exciting prospect; however, the current error rate limits its ability to compete with existing sequencing technologies, though we do show that MinION sequence reads can enhance contiguity of de novo assembly when used in conjunction with Illumina MiSeq data. PMID:26753127
Nucleotide Sequence and Genetic Structure of a Novel Carbaryl Hydrolase Gene (cehA) from Rhizobium sp. Strain AC100

PubMed Central

Hashimoto, Masayuki; Fukui, Mitsuru; Hayano, Kouichi; Hayatsu, Masahito

2002-01-01

Rhizobium sp. strain AC100, which is capable of degrading carbaryl (1-naphthyl-N-methylcarbamate), was isolated from soil treated with carbaryl. This bacterium hydrolyzed carbaryl to 1-naphthol and methylamine. Carbaryl hydrolase from the strain was purified to homogeneity, and its N-terminal sequence, molecular mass (82 kDa), and enzymatic properties were determined. The purified enzyme hydrolyzed 1-naphthyl acetate and 4-nitrophenyl acetate, indicating that the enzyme is an esterase. We then cloned the carbaryl hydrolase gene (cehA) from the plasmid DNA of the strain and determined the nucleotide sequence of the 10-kb region containing cehA. No homologous sequences were found by a database homology search using the nucleotide and deduced amino acid sequences of the cehA gene. Six open reading frames including the cehA gene were found in the 10-kb region, and sequencing analysis shows that the cehA gene is flanked by two copies of an insertion sequence-like sequence, suggesting that it forms part of a composite transposon. PMID:11872471
Analysis of the entire genomes of fifteen torque teno midi virus variants classifiable into a third group of genus Anellovirus.

PubMed

Ninomiya, M; Takahashi, M; Shimosegawa, T; Okamoto, H

2007-01-01

Recently, we identified a novel human virus with a circular DNA genome of 3.2 kb, tentatively designated as torque teno midi virus (TTMDV), with a genomic organization resembling those of torque teno virus (TTV) of 3.8-3.9 kb and torque teno mini virus (TTMV) of 2.8-2.9 kb. To investigate the extent of genomic variability of TTMDV genomes, the full-length sequence was determined for 15 TTMDV isolates obtained from viremic individuals in Japan. The 15 TTMDV isolates comprised 3175-3230 bases and shared 67.0-90.3% identities with each other, and were only 68.4-73.0% identical to the 3 reported TTMDV isolates over the entire genome. TTMDV possessed a genomic organization with four open reading frames (ORF1-ORF4) with characteristic sequence motifs and stem and loop structures with high GC content, similar to TTV and TTMV. The total of 18 TTMDV genomes differed by up to 60.7% from each other in the amino acid sequence of ORF1 (658-677 amino acids), but segregated phylogenetically into the same cluster, which was distantly related to the TTVs and TTMVs. These results indicate that TTMDV with a circular DNA genome of 3.2 kb, has an extremely high degree of genomic variability, and is classifiable into a third group in the genus Anellovirus.
Cloning and sequencing of the pheP gene, which encodes the phenylalanine-specific transport system of Escherichia coli.

PubMed Central

Pi, J; Wookey, P J; Pittard, A J

1991-01-01

The phenylalanine-specific permease gene (pheP) of Escherichia coli has been cloned and sequenced. The gene was isolated on a 6-kb Sau3AI fragment from a chromosomal library, and its presence was verified by complementation of a mutant lacking the functional phenylalanine-specific permease. Subcloning from this fragment localized the pheP gene on a 2.7-kb HindIII-HindII fragment. The nucleotide sequence of this 2.7-kb region was determined. An open reading frame was identified which extends from a putative start point of translation (GTG at position 636) to a termination signal (TAA at position 2010). The assignment of the GTG as the initiation codon was verified by site-directed mutagenesis of the initiation codon and by introducing a chain termination mutation into the pheP-lacZ fusion construct. A single initiation site of transcription 30 bp upstream of the start point of translation was identified by the primer extension analysis. The pheP structural gene consists of 1,374 nucleotides specifying a protein of 458 amino acid residues. The PheP protein is very hydrophobic (71% nonpolar residues). A topological model predicted from the sequence analysis defines 12 transmembrane segments. This protein is highly homologous with the AroP (general aromatic transport) system of E. coli (59.6% identity) and to a lesser extent with the yeast permeases CAN1 (arginine), PUT4 (proline), and HIP1 (histidine) of Saccharomyces cerevisiae. PMID:1711024
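The coordinates quoted in the pheP abstract can be cross-checked against the stated gene and protein lengths. The sketch below is illustrative only: it assumes that the quoted positions are 1-based and refer to the first base of the GTG and TAA codons, which the abstract does not state explicitly, and the function name is hypothetical.

```python
def orf_summary(start_codon_pos: int, stop_codon_pos: int) -> tuple[int, int]:
    """Return (coding nucleotides, amino acid residues) for an ORF.

    Assumes 1-based positions pointing at the first base of the start
    and stop codons; the stop codon itself is not counted as coding.
    """
    coding_nt = stop_codon_pos - start_codon_pos
    if coding_nt <= 0 or coding_nt % 3 != 0:
        raise ValueError("start and stop codons are not in the same frame")
    return coding_nt, coding_nt // 3


# Positions quoted in the abstract: GTG at 636, TAA at 2010.
nucleotides, residues = orf_summary(636, 2010)
print(nucleotides, residues)
```

Under that assumption the result agrees with the abstract's 1,374 nucleotides and 458 amino acid residues, and the frame check passes because 1,374 is divisible by three.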
Rose spring dwarf-associated virus has RNA structural and gene-expression features like those of Barley yellow dwarf virus

PubMed Central

Salem, Nida’ M.; Miller, W. Allen; Rowhani, Adib; Golino, Deborah A.; Moyne, Anne-Laure; Falk, Bryce W.

2015-01-01

We determined the complete nucleotide sequence of the Rose spring dwarf-associated virus (RSDaV) genomic RNA (GenBank accession no. EU024678) and compared its predicted RNA structural characteristics affecting gene expression. A cDNA library was derived from RSDaV double-stranded RNAs (dsRNAs) purified from infected tissue. Nucleotide sequence analysis of the cloned cDNAs, plus clones generated by 5′- and 3′-RACE, showed the RSDaV genomic RNA to be 5,808 nucleotides. The genomic RNA contains five major open reading frames (ORFs), and three small ORFs in the 3′-terminal 800 nucleotides, typical for viruses of genus Luteovirus in the family Luteoviridae. Northern blot hybridization analysis revealed the genomic RNA and two prominent subgenomic RNAs of approximately 3 kb and 1 kb. Putative 5′ ends of the sgRNAs were predicted by identification of conserved sequences and secondary structures which resembled the Barley yellow dwarf virus (BYDV) genomic RNA 5′ end and subgenomic RNA promoter sequences. Secondary structures of the BYDV-like ribosomal frameshift elements and cap-independent translation elements, including long-distance base pairing spanning four kb, were identified. These contain similarities but also informative differences with the BYDV structures, including a strikingly different structure predicted for the 3′ cap-independent translation element. These analyses of the RSDaV genomic RNA show more complexity for the RNA structural elements for members of the Luteoviridae. PMID:18329064
Rose spring dwarf-associated virus has RNA structural and gene-expression features like those of Barley yellow dwarf virus.

PubMed

Salem, Nida' M; Miller, W Allen; Rowhani, Adib; Golino, Deborah A; Moyne, Anne-Laure; Falk, Bryce W

2008-06-05

We determined the complete nucleotide sequence of the Rose spring dwarf-associated virus (RSDaV) genomic RNA (GenBank accession no. EU024678) and compared its predicted RNA structural characteristics affecting gene expression. A cDNA library was derived from RSDaV double-stranded RNAs (dsRNAs) purified from infected tissue. Nucleotide sequence analysis of the cloned cDNAs, plus clones generated by 5'- and 3'-RACE, showed the RSDaV genomic RNA to be 5808 nucleotides. The genomic RNA contains five major open reading frames (ORFs), and three small ORFs in the 3'-terminal 800 nucleotides, typical for viruses of genus Luteovirus in the family Luteoviridae. Northern blot hybridization analysis revealed the genomic RNA and two prominent subgenomic RNAs of approximately 3 kb and 1 kb. Putative 5' ends of the sgRNAs were predicted by identification of conserved sequences and secondary structures which resembled the Barley yellow dwarf virus (BYDV) genomic RNA 5' end and subgenomic RNA promoter sequences. Secondary structures of the BYDV-like ribosomal frameshift elements and cap-independent translation elements, including long-distance base pairing spanning four kb, were identified. These contain similarities but also informative differences with the BYDV structures, including a strikingly different structure predicted for the 3' cap-independent translation element. These analyses of the RSDaV genomic RNA show more complexity for the RNA structural elements for members of the Luteoviridae.
Emergence of Sequence Type 779 Methicillin-Resistant Staphylococcus aureus Harboring a Novel Pseudo Staphylococcal Cassette Chromosome mec (SCCmec)-SCC-SCCCRISPR Composite Element in Irish Hospitals

PubMed Central

Kinnevey, Peter M.; Shore, Anna C.; Brennan, Grainne I.; Sullivan, Derek J.; Ehricht, Ralf; Monecke, Stefan; Slickers, Peter

2013-01-01

Methicillin-resistant Staphylococcus aureus (MRSA) has been a major cause of nosocomial infection in Irish hospitals for 4 decades, and replacement of predominant MRSA clones has occurred several times. An MRSA isolate recovered in 2006 as part of a larger study of sporadic MRSA exhibited a rare spa (t878) and multilocus sequence (ST779) type and was nontypeable by PCR- and DNA microarray-based staphylococcal cassette chromosome mec (SCCmec) element typing. Whole-genome sequencing revealed the presence of a novel 51-kb composite island (CI) element with three distinct domains, each flanked by direct repeat and inverted repeat sequences, including (i) a pseudo SCCmec element (16.3 kb) carrying mecA with a novel mec class region, a fusidic acid resistance gene (fusC), and two copper resistance genes (copB and copC) but lacking ccr genes; (ii) an SCC element (17.5 kb) carrying a novel ccrAB4 allele; and (iii) an SCC element (17.4 kb) carrying a novel ccrC allele and a clustered regularly interspaced short palindromic repeat (CRISPR) region. The novel CI was subsequently identified by PCR in an additional 13 t878/ST779 MRSA isolates, six from bloodstream infections, recovered between 2006 and 2011 in 11 hospitals. Analysis of open reading frames (ORFs) carried by the CI showed amino acid sequence similarity of 44 to 100% to ORFs from S. aureus and coagulase-negative staphylococci (CoNS). These findings provide further evidence of genetic transfer between S. aureus and CoNS and show how this contributes to the emergence of novel SCCmec elements and MRSA strains. Ongoing surveillance of this MRSA strain is warranted and will require updating of currently used SCCmec typing methods. PMID:23147725

Sequences of two related multiple antibiotic resistance virulence plasmids sharing a unique IS26-related molecular signature isolated from different Escherichia coli pathotypes from different hosts.

PubMed

Venturini, Carola; Hassan, Karl A; Roy Chowdhury, Piklu; Paulsen, Ian T; Walker, Mark J; Djordjevic, Steven P

2013-01-01

Enterohemorrhagic Escherichia coli (EHEC) and atypical enteropathogenic E. coli (aEPEC) are important zoonotic pathogens that increasingly are becoming resistant to multiple antibiotics. Here we describe two plasmids, pO26-CRL125 (125 kb) from a human O26:H- EHEC, and pO111-CRL115 (115kb) from a bovine O111 aEPEC, that impart resistance to ampicillin, kanamycin, neomycin, streptomycin, sulfathiazole, trimethoprim and tetracycline and both contain atypical class 1 integrons with an identical IS26-mediated deletion in their 3´-conserved segment. Complete sequence analysis showed that pO26-CRL125 and pO111-CRL115 are essentially identical except for a 9.7 kb fragment, present in the backbone of pO26-CRL125 but absent in pO111-CRL115, and several indels. The 9.7 kb fragment encodes IncI-associated genes involved in plasmid stability during conjugation, a putative transposase gene and three imperfect repeats. Contiguous sequence identical to regions within these pO26-CRL125 imperfect repeats was identified in pO111-CRL115 precisely where the 9.7 kb fragment is missing, suggesting it may be mobile. Sequences shared between the plasmids include a complete IncZ replicon, a unique toxin/antitoxin system, IncI stability and maintenance genes, a novel putative serine protease autotransporter, and an IncI1 transfer system including a unique shufflon. Both plasmids carry a derivate Tn21 transposon with an atypical class 1 integron comprising a dfrA5 gene cassette encoding resistance to trimethoprim, and 24 bp of the 3´-conserved segment followed by Tn6026, which encodes resistance to ampicillin, kanymycin, neomycin, streptomycin and sulfathiazole. The Tn21-derivative transposon is linked to a truncated Tn1721, encoding resistance to tetracycline, via a region containing the IncP-1α oriV. 
Absence of the 5 bp direct repeats flanking Tn3-family transposons indicates that homologous recombination events played a key role in the formation of this complex antibiotic resistance gene locus. Comparative sequence analysis of these closely related plasmids reveals aspects of plasmid evolution in pathogenic E. coli from different hosts.
Molecular Analysis of VanA Outbreak of Enterococcus faecium in Two Warsaw Hospitals: The Importance of Mobile Genetic Elements

PubMed Central

Wardal, Ewa; Markowska, Katarzyna; Żabicka, Dorota; Wróblewska, Marta; Giemza, Małgorzata; Mik, Ewa; Połowniak-Pracka, Hanna; Woźniak, Agnieszka; Hryniewicz, Waleria; Sadowy, Ewa

2014-01-01

Vancomycin-resistant Enterococcus faecium represents a growing threat in hospital-acquired infections. Two outbreaks of this pathogen from neighboring Warsaw hospitals have been analyzed in this study. Pulsed-field gel electrophoresis (PFGE) of SmaI-digested DNA, multilocus VNTR analysis (MLVA), and multilocus sequence typing (MLST) revealed a clonal variability of isolates which belonged to three main lineages (17, 18, and 78) of nosocomial E. faecium. All isolates were multidrug resistant and carried several resistance, virulence, and plasmid-specific genes. Almost all isolates shared the same variant of Tn1546 transposon, characterized by the presence of insertion sequence ISEf1 and a point mutation in the vanA gene. In the majority of cases, this transposon was located on 50 kb or 100 kb pRUM-related plasmids, which lacked, however, the axe-txe toxin-antitoxin genes. The 100 kb plasmid was easily transferred by conjugation and was found in various clonal backgrounds in both institutions, while the 50 kb plasmid was not transferable and occurred solely in MT159/ST78 strains that disseminated clonally in one institution. Although molecular data indicated the spread of VRE between two institutions or a potential common source of this alert pathogen, epidemiological investigations did not reveal the possible route by which outbreak strains disseminated. PMID:25003118
CDC Vital Signs: Recipe for Food Safety

MedlinePlus

... KB] Building public health capacity for advanced genome sequencing and analysis, which will make it possible to ...
De Novo Assembly and Characterization of Four Anthozoan (Phylum Cnidaria) Transcriptomes.

PubMed

Kitchen, Sheila A; Crowder, Camerron M; Poole, Angela Z; Weis, Virginia M; Meyer, Eli

2015-09-17

Many nonmodel species exemplify important biological questions but lack the sequence resources required to study the genes and genomic regions underlying traits of interest. Reef-building corals are famously sensitive to rising seawater temperatures, motivating ongoing research into their stress responses and long-term prospects in a changing climate. A comprehensive understanding of these processes will require extending beyond the sequenced coral genome (Acropora digitifera) to encompass diverse coral species and related anthozoans. Toward that end, we have assembled and annotated reference transcriptomes to develop catalogs of gene sequences for three scleractinian corals (Fungia scutaria, Montastraea cavernosa, Seriatopora hystrix) and a temperate anemone (Anthopleura elegantissima). High-throughput sequencing of cDNA libraries produced ~20-30 million reads per sample, and de novo assembly of these reads produced ~75,000-110,000 transcripts from each sample with size distributions (mean ~1.4 kb, N50 ~2 kb), comparable to the distribution of gene models from the coral genome (mean ~1.7 kb, N50 ~2.2 kb). Each assembly includes matches for more than half the gene models from A. digitifera (54-67%) and many reasonably complete transcripts (~5300-6700) spanning nearly the entire gene (ortholog hit ratios ≥0.75). The catalogs of gene sequences developed in this study made it possible to identify hundreds to thousands of orthologs across diverse scleractinian species and related taxa. We used these sequences for phylogenetic inference, recovering known relationships and demonstrating superior performance over phylogenetic trees constructed using single mitochondrial loci. The resources developed in this study provide gene sequences and genetic markers for several anthozoan species. To enhance the utility of these resources for the research community, we developed searchable databases enabling researchers to rapidly recover sequences for genes of interest. 
Our analysis of de novo assembly quality highlights metrics that we expect will be useful for evaluating the relative quality of other de novo transcriptome assemblies. The identification of orthologous sequences and phylogenetic reconstruction demonstrates the feasibility of these methods for clarifying the substantial uncertainties in the existing scleractinian phylogeny. Copyright © 2015 Kitchen et al.
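The mean and N50 transcript lengths quoted in this abstract are standard summary statistics for an assembly. As a minimal sketch of how N50 is usually computed (the function name and the toy lengths are illustrative assumptions, not data or code from the study):

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L together
    contain at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical toy transcript lengths in bp (not data from the study).
toy_lengths = [2000, 1500, 1200, 800, 500]
print("mean:", sum(toy_lengths) / len(toy_lengths))
print("N50:", n50(toy_lengths))
```

Because N50 weights long sequences more heavily than the mean does, it is normally at least as large as the mean, which matches the pattern reported above (mean ~1.4 kb, N50 ~2 kb).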
Recombination hot spot in 3.2-kb region of the Charcot-Marie Tooth type 1A repeat sequences: New tools for molecular diagnosis of hereditary neuropathy with liability to pressure palsies and of Charcot-Marie-Tooth type 1A

DOE Office of Scientific and Technical Information (OSTI.GOV)

Lopes, J.; LeGuern, E.; Gouider, R.

1996-06-01

Charcot-Marie-Tooth type 1A (CMT1A) disease and hereditary neuropathy with liability to pressure palsies (HNPP) are autosomal dominant neuropathies, associated, respectively, with duplications and deletions of the same 1.5-Mb region on 17p11.2-p12. These two rearrangements are the reciprocal products of an unequal meiotic crossover between the two chromosome 17 homologues, caused by the misalignment of the CMT1A repeat sequences (CMT1A-REPs), the homologous sequences flanking the 1.5-Mb CMT1A/HNPP monomer unit. In order to map recombination breakpoints within the CMT1A-REPs, a 12.9-kb restriction map was constructed from cloned EcoRI fragments of the proximal and distal CMT1A-REPs. Only 3 of the 17 tested restriction sites were present in the proximal CMT1A-REP but absent in the distal CMT1A-REP, indicating a high degree of homology between these sequences. The rearrangements were mapped in four regions of the CMT1A-REPs by analysis of 76 CMT1A index cases and 38 HNPP patients, who were unrelated. A hot spot of crossover breakpoints located in a 3.2-kb region accounted for three-quarters of the rearrangements, detected after EcoRI/SacI digestion, by the presence of 3.2-kb and 7.8-kb junction fragments in CMT1A and HNPP patients, respectively. These junction fragments, which can be detected on classical Southern blots, permit molecular diagnosis. Other rearrangements can also be detected by gene dosage on the same Southern blots. 25 refs., 4 figs., 2 tabs.
Molecular characterization of the breakpoints of a 12-kb deletion in the NF1 gene in a family showing germ-line mosaicism.

PubMed Central

Lázaro, C; Gaona, A; Lynch, M; Kruyer, H; Ravella, A; Estivill, X

1995-01-01

Neurofibromatosis type 1 (NF1) is caused by deletions, insertions, translocations, and point mutations in the NF1 gene, which spans 350 kb on the long arm of human chromosome 17. Although several point mutations have been described, large molecular abnormalities have rarely been characterized in detail. We describe here the molecular breakpoints of a 12-kb deletion of the NF1 gene, which is responsible for the NF1 phenotype in a kindred with two children affected because of germline mosaicism in the unaffected father, who has the mutation in 10% of his spermatozoa. The mutation spans introns 31-39, removing 12,021 nt and inserting 30 bp, of which 19 bp are a direct repetition of a sequence located in intron 31, just 4 bp before the 5' breakpoint. The 5' and 3' breakpoints contain the sequence TATTTTA, which could be involved in the generation of the deletion. The most plausible explanation for the mechanism involved in the generation of this 12-kb deletion is homologous/nonhomologous recombination. Since sperm of the father does not contain the corresponding insertion of the 12-kb deleted sequence, this deletion could have occurred within the NF1 chromosome through loop formation. RNA from lymphocytes of one of the NF1 patients showed similar levels of the mutated and normal transcripts, suggesting that the NF1-mRNA from mutations causing frame shifts of the reading frame or stop codons in this gene is not degraded during its processing. The mutation was not detected in fresh lymphocytes from the unaffected father by PCR analysis, supporting the case for true germ-line mosaicism. PMID:7485153
Facile Recovery of Individual High-Molecular-Weight, Low-Copy-Number Natural Plasmids for Genomic Sequencing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Williams, L.E.; Detter, C.; Barrie, K.

2006-06-01

Sequencing of the large (>50 kb), low-copy-number (<5 per cell) plasmids that mediate horizontal gene transfer has been hindered by the difficulty and expense of isolating DNA from individual plasmids of this class. We report here that a kit method previously devised for purification of bacterial artificial chromosomes (BACs) can be adapted for effective preparation of individual plasmids up to 220 kb from wild gram-negative and gram-positive bacteria. Individual plasmid DNA recovered from less than 10 ml of Escherichia coli, Staphylococcus, and Corynebacterium cultures was of sufficient quantity and quality for construction of high-coverage libraries, as shown by sequencing five native plasmids ranging in size from 30 kb to 94 kb. We also report recommendations for vector screening to optimize plasmid sequence assembly, preliminary annotation of novel plasmid genomes, and insights on mobile genetic element biology derived from these sequences. Adaptation of this BAC method for large plasmid isolation removes one major technical hurdle to expanding our knowledge of the natural plasmid gene pool.
A candidate gene for X-linked Ocular Albinism (OA1)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bassi, M.T.; Schiaffino, V.; Rugarli, E.

1994-09-01

Ocular Albinism of the Nettleship-Fall type 1 (OA1) is the most common form of ocular albinism. It is transmitted as an X-linked recessive trait with affected males showing severe reduction of visual acuity, nystagmus, strabismus, and photophobia. Ophthalmologic examination reveals foveal hypoplasia, hypopigmentation of the retina and iris translucency. Microscopic examination of melanocytes suggests that the underlying defect in OA1 is an abnormality in melanosome formation. Recently we assembled a 350 kb cosmid contig spanning the entire critical region on Xp22.3, which measures approximately 110 kb. A minimum set of cosmids was used to identify transcribed sequences using both cDNA selection and exon amplification. Two putative exons recovered by the exon amplification strategy were found to be highly conserved throughout evolution and, therefore, they were used as probes for the screening of fetal and adult retina cDNA libraries. This led to the isolation of clones spanning a full-length cDNA which measures 7.6 kb. Sequence analysis revealed that the predicted protein product shows homology with syntrophines and a Xenopus laevis apical protein. The gene covers approximately 170 kb of DNA and spans the entire critical region for OA1, being deleted in two patients with contiguous gene deletion including OA1 and in one patient with isolated OA1. Therefore, this new gene represents a very strong candidate for involvement in OA1 (an alternative, but unlikely possibility to be considered is that the true OA1 gene lies within an intron of the former). Northern analysis revealed a very high level of expression in retina and melanoma. Unlike most Xp22.3 genes, this gene is conserved in the mouse. We are currently performing SSCP analysis and direct sequencing of exons on DNAs from approximately 60 unrelated patients with OA1 for mutation detection.
A 3.0-kb deletion including an erythroid cell-specific regulatory element in intron 1 of the ABO blood group gene in an individual with the Bm phenotype.

PubMed

Sano, R; Kuboya, E; Nakajima, T; Takahashi, Y; Takahashi, K; Kubo, R; Kominato, Y; Takeshita, H; Yamao, H; Kishida, T; Isa, K; Ogasawara, K; Uchikawa, M

2015-04-01

We developed a sequence-specific primer PCR (SSP-PCR) for detection of a 5.8-kb deletion (B(m) 5.8) involving an erythroid cell-specific regulatory element in intron 1 of the ABO blood group gene. Using this SSP-PCR, we performed genetic analysis of 382 individuals with Bm or ABm. The 5.8-kb deletion was found in 380 individuals, and disruption of the GATA motif in the regulatory element was found in one individual. Furthermore, a novel 3.0-kb deletion involving the element (B(m) 3.0) was demonstrated in the remaining individual. Comparisons of single-nucleotide polymorphisms and microsatellites in intron 1 between B(m) 5.8 and B(m) 3.0 suggested that these deletions occurred independently. © 2014 International Society of Blood Transfusion.
Structure and expression of the Xenopus retinoblastoma gene.

PubMed

Destrée, O H; Lam, K T; Peterson-Maduro, L J; Eizema, K; Diller, L; Gryka, M A; Frebourg, T; Shibuya, E; Friend, S H

1992-09-01

We have cloned a Xenopus homolog (XRb1) of the human retinoblastoma susceptibility gene. DNA sequence analysis shows that the XRb1 gene product is highly conserved in many regions. The leucine repeat motif and many of the potential cdc2 phosphorylation sites, as well as potential sites for other kinases, are retained. The region of the protein homologous to the SV40 T antigen binding site and the basic region directly C-terminal to the E1A binding site are all conserved. XRb1 gene expression at the RNA level was studied by Northern blot analysis. Transcripts of 4.2 and 10 kb are present as maternal RNA stores in the oocyte. While the 4.2-kb product is stable until at least the mid-blastula stage, the 10-kb transcript is selectively degraded. Between stages 11 and 13 the 10-kb transcript reappears and also a minor product of approximately 11 kb becomes apparent. Both the 4.2- and the 10-kb transcripts remain present until later stages of development and are also present in all adult tissues examined, although at differing levels. Antibodies raised against human p105Rb which recognize the protein product of the XRb1 gene, pXRb1, detect the Xenopus 99-kDa protein prior to the mid-blastula stage, but at lower levels than at later stages in development.
Chromosome arm-specific BAC end sequences permit comparative analysis of homoeologous chromosomes and genomes of polyploid wheat

PubMed Central

2012-01-01

Background Bread wheat, one of the world’s staple food crops, has the largest, highly repetitive and polyploid genome among the cereal crops. The wheat genome holds the key to crop genetic improvement against challenges such as climate change, environmental degradation, and water scarcity. To unravel the complex wheat genome, the International Wheat Genome Sequencing Consortium (IWGSC) is pursuing a chromosome- and chromosome arm-based approach to physical mapping and sequencing. Here we report on the use of a BAC library made from flow-sorted telosomic chromosome 3A short arm (t3AS) for marker development and analysis of sequence composition and comparative evolution of homoeologous genomes of hexaploid wheat. Results The end-sequencing of 9,984 random BACs from a chromosome arm 3AS-specific library (TaaCsp3AShA) generated 11,014,359 bp of high quality sequence from 17,591 BAC-ends with an average length of 626 bp. The sequence represents 3.2% of t3AS with an average DNA sequence read every 19 kb. Overall, 79% of the sequence consisted of repetitive elements, 1.38% as coding regions (estimated 2,850 genes) and another 19% of unknown origin. Comparative sequence analysis suggested that 70-77% of the genes present in both 3A and 3B were syntenic with model species. Among the transposable elements, gypsy/sabrina (12.4%) was the most abundant repeat and was significantly more frequent in 3A compared to homoeologous chromosome 3B. Twenty novel repetitive sequences were also identified using de novo repeat identification. BESs were screened to identify simple sequence repeats (SSR) and transposable element junctions. A total of 1,057 SSRs were identified with a density of one per 10.4 kb, and 7,928 junctions between transposable elements (TE) and other sequences were identified with a density of one per 1.39 kb. 
With the objective of enhancing the marker density of chromosome 3AS, oligonucleotide primers were successfully designed from 758 SSRs and 695 Insertion Site Based Polymorphisms (ISBPs). Of the 96 ISBP primer pairs tested, 28 (29%) were 3A-specific, compared to 17 (18%) of the 96 SSR primer pairs. Conclusion This work reports on the use of a wheat chromosome arm 3AS-specific BAC library for the targeted generation of sequence data from a particular region of the huge genome of wheat. A large quantity of sequences were generated from the A genome of hexaploid wheat for comparative genome analysis with homoeologous B and D genomes and other model grass genomes. Hundreds of molecular markers were developed from the 3AS arm-specific sequences; these and other sequences will be useful in gene discovery and physical mapping. PMID:22559868
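The average read length and feature densities quoted in this abstract follow directly from the reported totals. A short sketch of that arithmetic, using only figures stated in the abstract (the variable and function names are illustrative assumptions):

```python
# Figures quoted in the abstract above.
total_bp = 11_014_359      # high-quality BAC-end sequence, in bp
bac_ends = 17_591          # number of BAC-end reads
ssr_count = 1_057          # simple sequence repeats identified
te_junctions = 7_928       # TE junctions identified


def per_feature_kb(total_bases, n_features):
    """Average spacing in kb: total sequence divided by feature count."""
    return total_bases / n_features / 1000


mean_read_bp = total_bp / bac_ends
ssr_spacing_kb = per_feature_kb(total_bp, ssr_count)
junction_spacing_kb = per_feature_kb(total_bp, te_junctions)

print(round(mean_read_bp))             # average BAC-end read length, bp
print(round(ssr_spacing_kb, 1))        # kb of sequence per SSR
print(round(junction_spacing_kb, 2))   # kb of sequence per TE junction
```

The rounded results agree with the values given in the abstract (626 bp, one SSR per 10.4 kb, one junction per 1.39 kb).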
The genome of flax (Linum usitatissimum) assembled de novo from short shotgun sequence reads.

PubMed

Wang, Zhiwen; Hobson, Neil; Galindo, Leonardo; Zhu, Shilin; Shi, Daihu; McDill, Joshua; Yang, Linfeng; Hawkins, Simon; Neutelings, Godfrey; Datla, Raju; Lambert, Georgina; Galbraith, David W; Grassa, Christopher J; Geraldes, Armando; Cronk, Quentin C; Cullis, Christopher; Dash, Prasanta K; Kumar, Polumetla A; Cloutier, Sylvie; Sharpe, Andrew G; Wong, Gane K-S; Wang, Jun; Deyholos, Michael K

2012-11-01

Flax (Linum usitatissimum) is an ancient crop that is widely cultivated as a source of fiber, oil and medicinally relevant compounds. To accelerate crop improvement, we performed whole-genome shotgun sequencing of the nuclear genome of flax. Seven paired-end libraries ranging in size from 300 bp to 10 kb were sequenced using an Illumina genome analyzer. A de novo assembly, comprised exclusively of deep-coverage (approximately 94× raw, approximately 69× filtered) short-sequence reads (44-100 bp), produced a set of scaffolds with N(50) = 694 kb, including contigs with N(50) = 20.1 kb. The contig assembly contained 302 Mb of non-redundant sequence representing an estimated 81% genome coverage. Up to 96% of published flax ESTs aligned to the whole-genome shotgun scaffolds. However, comparisons with independently sequenced BACs and fosmids showed some mis-assembly of regions at the genome scale. A total of 43384 protein-coding genes were predicted in the whole-genome shotgun assembly, and up to 93% of published flax ESTs, and 86% of A. thaliana genes aligned to these predicted genes, indicating excellent coverage and accuracy at the gene level. Analysis of the synonymous substitution rates (K(s)) observed within duplicate gene pairs was consistent with a recent (5-9 MYA) whole-genome duplication in flax. Within the predicted proteome, we observed enrichment of many conserved domains (Pfam-A) that may contribute to the unique properties of this crop, including agglutinin proteins. Together these results show that de novo assembly, based solely on whole-genome shotgun short-sequence reads, is an efficient means of obtaining nearly complete genome sequence information for some plant species. © 2012 The Authors. The Plant Journal © 2012 Blackwell Publishing Ltd.
Recombination between Streptococcus suis ICESsu32457 and Streptococcus agalactiae ICESa2603 yields a hybrid ICE transferable to Streptococcus pyogenes.

PubMed

Marini, Emanuela; Palmieri, Claudio; Magi, Gloria; Facinelli, Bruna

2015-07-09

Integrative conjugative elements (ICEs) are mobile genetic elements that reside in the chromosome but retain the ability to undergo excision and to transfer by conjugation. Genes involved in drug resistance, virulence, or niche adaptation are often found among backbone genes as cargo DNA. We recently characterized in Streptococcus suis an ICE (ICESsu32457) carrying resistance genes [tet(O/W/32/O), tet(40), erm(B), aphA, and aadE] in the 15K unstable genetic element, which is flanked by two ∼1.3-kb direct repeats. Remarkably, ∼1.3-kb sequences are conserved in ICESa2603 of Streptococcus agalactiae 2603V/R, which carries heavy metal resistance genes cadC/cadA and mer. In matings between S. suis 32457 (donor) and S. agalactiae 2603V/R (recipient), transconjugants were obtained. PCR experiments, PFGE, and sequence analysis of transconjugants demonstrated a tandem array between ICESsu32457 and ICESa2603. Matings between tandem array-containing S. agalactiae 2603V/R (donor) and Streptococcus pyogenes RF12 (recipient) yielded a single transconjugant containing a hybrid ICE, here named ICESa2603/ICESsu32457. The hybrid formed by recombination of the left ∼1.3-kb sequence of ICESsu32457 and the ∼1.3-kb sequence of ICESa2603. Interestingly, the hybrid ICE was transferable between S. pyogenes strains, thus demonstrating that it behaves as a conventional ICE. These findings suggest that both tandem arrays and hybrid ICEs may contribute to the evolution of antibiotic resistance in streptococci, creating novel mobile elements capable of disseminating new combinations of antibiotic resistance genes. Copyright © 2015 Elsevier B.V. All rights reserved.
Sequence characterization of S100A8 gene reveals structural differences of protein and transcriptional factor binding sites in water buffalo and yak.

PubMed

Kathiravan, P; Goyal, S; Kataria, R S; Mishra, B P; Jayakumar, S; Joshi, B K

2011-01-01

The present study was undertaken to characterize the structure of S100A8 gene and its promoter in water buffalo and yak. Sequence data of 2.067 kb, 2.071 kb, and 2.052 kb with respect to complete S100A8 gene including 5' flanking region was generated in river buffalo, swamp buffalo, and yak, respectively. BLAST analysis of coding DNA sequences (CDS) of S100A8 gene revealed 95% homology of buffalo sequence with cattle, 85% with pig and horse, 83% with dog, 72-73% with murines, and around 79% with primates and humans. Phylogenetic analysis of predicted CDS revealed distinct clustering of murines, primates, and domestic animals with bovines and bubalines forming a subcluster among farm animals. In silico translation of predicted CDS revealed a sequence of 89 amino acids with 7 amino acid changes between cattle and buffalo and 2 changes between cattle and yak. The search for Pfam family revealed the N-terminal calcium binding domain and the noncanonical EF hand domain in the carboxy terminus, with more variations being observed in the N-terminal domain among different species. Two amino acid changes observed in carboxy terminal EF hand domain resulted in altered secondary structure of yak S100A8 protein. Analysis of S100A8 gene promoter revealed 14 putative motifs for transcriptional factor binding sites. Two putative motifs viz. C/EBP and v-Myb were found to be absent in swamp buffalo as compared to river buffalo and cattle. Differences in the structure of S100A8 protein and the transcriptional factor binding sites identified in the present study need to be analyzed further for their functional significance in yak and swamp buffalo respectively. Copyright © Taylor & Francis Group, LLC
Comparative Genome Sequence Analysis of the Bpa/Str Region in Mouse and Man

PubMed Central

Mallon, A.-M.; Platzer, M.; Bate, R.; Gloeckner, G.; Botcherby, M.R.M.; Nordsiek, G.; Strivens, M.A.; Kioschis, P.; Dangel, A.; Cunningham, D.; Straw, R.N.A.; Weston, P.; Gilbert, M.; Fernando, S.; Goodall, K.; Hunter, G.; Greystrong, J.S.; Clarke, D.; Kimberley, C.; Goerdes, M.; Blechschmidt, K.; Rump, A.; Hinzmann, B.; Mundy, C.R.; Miller, W.; Poustka, A.; Herman, G.E.; Rhodes, M.; Denny, P.; Rosenthal, A.; Brown, S.D.M.

2000-01-01

The progress of human and mouse genome sequencing programs presages the possibility of systematic cross-species comparison of the two genomes as a powerful tool for gene and regulatory element identification. As the opportunities to perform comparative sequence analysis emerge, it is important to develop parameters for such analyses and to examine the outcomes of cross-species comparison. Our analysis used gene prediction and a database search of 430 kb of genomic sequence covering the Bpa/Str region of the mouse X chromosome, and 745 kb of genomic sequence from the homologous human X chromosome region. We identified 11 genes in mouse and 13 genes and two pseudogenes in human. In addition, we compared the mouse and human sequences using pairwise alignment and searches for evolutionary conserved regions (ECRs) exceeding a defined threshold of sequence identity. This approach aided the identification of at least four further putative conserved genes in the region. Comparative sequencing revealed that this region is a mosaic in evolutionary terms, with considerably more rearrangement between the two species than realized previously from comparative mapping studies. Surprisingly, this region showed an extremely high LINE and low SINE content, low G+C content, and yet a relatively high gene density, in contrast to the low gene density usually associated with such regions. [The sequence data described in this paper have been submitted to EMBL under the following accession nos.: Mouse Genomic Sequence: Mouse contig A (AL021127), Mouse contig B (AL049866), BAC41M10 (AL136328), PAC303O11(AL136329). Human Genomic Sequence: Human contig 1 (U82671, U82670), Human contig 2 (U82695).] PMID:10854409
Cloning and sequence analysis of the meso-diaminopimelate decarboxylase gene from Bacillus methanolicus MGA3 and comparison to other decarboxylase genes.

PubMed Central

Mills, D A; Flickinger, M C

1993-01-01

The lysA gene of Bacillus methanolicus MGA3 was cloned by complementation of an auxotrophic Escherichia coli lysA22 mutant with a genomic library of B. methanolicus MGA3 chromosomal DNA. Subcloning localized the B. methanolicus MGA3 lysA gene into a 2.3-kb SmaI-SstI fragment. Sequence analysis of the 2.3-kb fragment indicated an open reading frame encoding a protein of 48,223 Da, which was similar to the meso-diaminopimelate (DAP) decarboxylase amino acid sequences of Bacillus subtilis (62%) and Corynebacterium glutamicum (40%). Amino acid sequence analysis indicated several regions of conservation among bacterial DAP decarboxylases, eukaryotic ornithine decarboxylases, and arginine decarboxylases, suggesting a common structural arrangement for positioning of substrate and the cofactor pyridoxal 5'-phosphate. The B. methanolicus MGA3 DAP decarboxylase was shown to be a dimer (M(r) 86,000) with a subunit molecular mass of approximately 50,000 Da. This decarboxylase is inhibited by lysine (Ki = 0.93 mM) with a Km of 0.8 mM for DAP. The inhibition pattern suggests that the activity of this enzyme in lysine-overproducing strains of B. methanolicus MGA3 may limit lysine synthesis. PMID:8215365
Cloning and sequence analysis of the meso-diaminopimelate decarboxylase gene from Bacillus methanolicus MGA3 and comparison to other decarboxylase genes.

PubMed

Mills, D A; Flickinger, M C

1993-09-01

The lysA gene of Bacillus methanolicus MGA3 was cloned by complementation of an auxotrophic Escherichia coli lysA22 mutant with a genomic library of B. methanolicus MGA3 chromosomal DNA. Subcloning localized the B. methanolicus MGA3 lysA gene into a 2.3-kb SmaI-SstI fragment. Sequence analysis of the 2.3-kb fragment indicated an open reading frame encoding a protein of 48,223 Da, which was similar to the meso-diaminopimelate (DAP) decarboxylase amino acid sequences of Bacillus subtilis (62%) and Corynebacterium glutamicum (40%). Amino acid sequence analysis indicated several regions of conservation among bacterial DAP decarboxylases, eukaryotic ornithine decarboxylases, and arginine decarboxylases, suggesting a common structural arrangement for positioning of substrate and the cofactor pyridoxal 5'-phosphate. The B. methanolicus MGA3 DAP decarboxylase was shown to be a dimer (M(r) 86,000) with a subunit molecular mass of approximately 50,000 Da. This decarboxylase is inhibited by lysine (Ki = 0.93 mM) with a Km of 0.8 mM for DAP. The inhibition pattern suggests that the activity of this enzyme in lysine-overproducing strains of B. methanolicus MGA3 may limit lysine synthesis.
Mouse scrapie responsive gene 1 (Scrg1): genomic organization, physical linkage to sap30, genetic mapping on chromosome 8, and expression in neuronal primary cell cultures.

PubMed

Dron, M; Tartare, X; Guillo, F; Haik, S; Barbin, G; Maury, C; Tovey, M; Dandoy-Dron, F

2000-11-15

We have previously reported a transcript of a novel mouse gene (Scrg1) with increased expression in transmissible spongiform encephalopathies and the cloning of the human mRNA analogue. In this paper, we present the genomic organization of the mouse and human SCRG1 loci, which exhibit a high degree of conservation. The genes are composed of three exons; the two downstream exons contain the protein coding region. The mouse gene is expressed in brain tissue essentially as a 0.7-kb message but also as a minor 2.6-kb mRNA. We have sequenced 20 kb of DNA at the mouse Scrg1 locus and found that the longer transcript is the prolongation of the 0.7-kb mRNA to a polyadenylation site located about 2 kb further downstream. Sequencing revealed that the mouse Scrg1 gene is physically linked to Sap30, a gene that encodes a protein of the histone deacetylase complex, and genetic linkage mapping assigned the localization of Scrg1 to chromosome 8 between Ant1 and Hmg2. Northern blot analysis showed that Scrg1 is under strict developmental control in mouse embryo and is expressed by cells of neuronal origin in vitro. Comparison of the rat, mouse, and human SCRG1 proteins identified a box of 35 identical contiguous amino acids and a characteristic cysteine distribution pattern defining a new protein signature. Copyright 2000 Academic Press.
PGen: large-scale genomic variations analysis workflow and browser in SoyKB.

PubMed

Liu, Yang; Khan, Saad M; Wang, Juexin; Rynge, Mats; Zhang, Yuanxun; Zeng, Shuai; Chen, Shiyuan; Maldonado Dos Santos, Joao V; Valliyodan, Babu; Calyam, Prasad P; Merchant, Nirav; Nguyen, Henry T; Xu, Dong; Joshi, Trupti

2016-10-06

With the advances in next-generation sequencing (NGS) technology and significant reductions in sequencing costs, it is now possible to sequence large collections of germplasm in crops for detecting genome-scale genetic variations and to apply the knowledge towards improvements in traits. To efficiently facilitate large-scale NGS resequencing data analysis of genomic variations, we have developed "PGen", an integrated and optimized workflow using the Extreme Science and Engineering Discovery Environment (XSEDE) high-performance computing (HPC) virtual system, iPlant cloud data storage resources and Pegasus workflow management system (Pegasus-WMS). The workflow allows users to identify single nucleotide polymorphisms (SNPs) and insertion-deletions (indels), perform SNP annotations and conduct copy number variation analyses on multiple resequencing datasets in a user-friendly and seamless way. We have developed both a Linux version in GitHub ( https://github.com/pegasus-isi/PGen-GenomicVariations-Workflow ) and a web-based implementation of the PGen workflow integrated within the Soybean Knowledge Base (SoyKB), ( http://soykb.org/Pegasus/index.php ). Using PGen, we identified 10,218,140 single-nucleotide polymorphisms (SNPs) and 1,398,982 indels from analysis of 106 soybean lines sequenced at 15X coverage. 297,245 non-synonymous SNPs and 3330 copy number variation (CNV) regions were identified from this analysis. SNPs identified using PGen from additional soybean resequencing projects adding to 500+ soybean germplasm lines in total have been integrated. These SNPs are being utilized for trait improvement using genotype to phenotype prediction approaches developed in-house. In order to browse and access NGS data easily, we have also developed an NGS resequencing data browser ( http://soykb.org/NGS_Resequence/NGS_index.php ) within SoyKB to provide easy access to SNP and downstream analysis results for soybean researchers. 
PGen workflow has been optimized for the most efficient analysis of soybean data using thorough testing and validation. This research serves as an example of best practices for development of genomics data analysis workflows by integrating remote HPC resources and efficient data management with ease of use for biological users. PGen workflow can also be easily customized for analysis of data in other species.
Cloning and expression of 130-kd mosquito-larvicidal delta-endotoxin gene of Bacillus thuringiensis var. Israelensis in Escherichia coli.

PubMed

Angsuthanasombat, C; Chungjatupornchai, W; Kertbundit, S; Luxananil, P; Settasatian, C; Wilairat, P; Panyim, S

1987-07-01

Five recombinant E. coli clones exhibiting toxicity to Aedes aegypti larvae were obtained from a library of 800 clones containing XbaI DNA fragments of the 110 kb plasmid from B. thuringiensis var. israelensis. All five clones (pMU 14/258/303/388/679) had the same 3.8-kb insert and encoded a major protein of 130 kDa which was highly toxic to A. aegypti larvae. Three clones (pMU 258/303/388) transcribed the 130 kDa gene in the same direction as that of the lacZ promoter of the pUC12 vector whereas the transcription of the other two (pMU 14/679) was in the opposite direction. A 1.9-kb fragment of the 3.8 kb insert coded for a protein of 65 kDa. Partial DNA sequence of the 3.8 kb insert, corresponding to the 5'-terminal of the 130 kDa gene, revealed a continuous reading frame, a Shine-Dalgarno sequence and a tentative 5'-regulatory region. These results demonstrated that the 3.8 kb insert is a minimal DNA fragment containing a regulatory region plus the coding sequence of the 130 kDa protein that is highly toxic to mosquito larvae.

Genetic and Molecular Characterization of the Caenorhabditis Elegans Spermatogenesis-Defective Gene Spe-17

PubMed Central

L'Hernault, S. W.; Benian, G. M.; Emmons, R. B.

1993-01-01

Two self-sterile mutations that define the spermatogenesis-defective gene spe-17 have been analyzed. These mutations affect unc-22 and fail to complement each other for both Unc-22 and spermatogenesis defects. Both of these mutations are deficiencies (hcDf1 and hDf13) that affect more than one transcription unit. Genomic DNA adjacent to and including the region deleted by the smaller deficiency (hcDf1) has been sequenced and four mRNAs (including unc-22) have been localized to this sequenced region. The three non unc-22 mRNAs are shown to be sex-specific: a 1.2-kb mRNA that can be detected in sperm-free hermaphrodites and 1.2- and 0.56-kb mRNAs found in males. hDf13 deletes at least 55 kb of chromosome IV, including all of unc-22, both male-specific mRNAs and at least part of the female-specific mRNA. hcDf1, which is approximately 15.6 kb, deletes only the 5' end of unc-22 and the gene that encodes the 0.56-kb male-specific mRNA. The common defect that apparently accounts for the defective sperm in hcDf1 and hDf13 homozygotes is deletion of the spe-17 gene, which encodes the 0.56-kb mRNA. Strains carrying two copies of either deletion are self-fertile when they are transgenic for any of four extrachromosomal arrays that include spe-17. We have sequenced two spe-17 cDNAs, and the deduced 142 amino acid protein sequence is highly charged and rich in serine and threonine, but shows no significant homology to any previously determined protein sequence. PMID:8349108
Cloning and expression of Bartonella henselae sucB gene encoding an immunogenic dihydrolipoamide succinyltransferase homologous protein.

PubMed

Kabeya, Hidenori; Maruyama, Soichi; Hirano, Kouji; Mikami, Takeshi

2003-01-01

Immunoscreening of a ZAP genomic library of Bartonella henselae strain Houston-1 expressed in Escherichia coli resulted in the isolation of a clone containing a 3.5 kb BamHI genomic DNA fragment. This 3.5 kb DNA fragment was found to contain a sequence of a gene encoding a protein with significant homology to the dihydrolipoamide succinyltransferase of Brucella melitensis (sucB). Subsequent cloning and DNA sequence analysis revealed that the deduced amino acid sequence from the cloned gene showed 66.5% identity to SucB protein of B. melitensis, and 43.4 and 47.2% identities to those of Coxiella burnetii and E. coli, respectively. The gene was expressed as a His-Nus A-tagged fusion protein. The recombinant SucB protein (rSucB) was shown to be an immunoreactive protein of about 115 kDa by Western blot analysis with sera from B. henselae-immunized mice. Therefore the rSucB may be a candidate antigen for a specific serological diagnosis of B. henselae infection.
Large-Scale Concatenation cDNA Sequencing

PubMed Central

Yu, Wei; Andersson, Björn; Worley, Kim C.; Muzny, Donna M.; Ding, Yan; Liu, Wen; Ricafrente, Jennifer Y.; Wentland, Meredith A.; Lennon, Greg; Gibbs, Richard A.

1997-01-01

A total of 100 kb of DNA derived from 69 individual human brain cDNA clones of 0.7–2.0 kb were sequenced by concatenated cDNA sequencing (CCS), whereby multiple individual DNA fragments are sequenced simultaneously in a single shotgun library. The method yielded accurate sequences and a similar efficiency compared with other shotgun libraries constructed from single DNA fragments (>20 kb). Computer analyses were carried out on 65 cDNA clone sequences and their corresponding end sequences to examine both nucleic acid and amino acid sequence similarities in the databases. Thirty-seven clones revealed no DNA database matches, 12 clones generated exact matches (≥98% identity), and 16 clones generated nonexact matches (57%–97% identity) to either known human or other species genes. Of those 28 matched clones, 8 had corresponding end sequences that failed to identify similarities. In a protein similarity search, 27 clone sequences displayed significant matches, whereas only 20 of the end sequences had matches to known protein sequences. Our data indicate that full-length cDNA insert sequences provide significantly more nucleic acid and protein sequence similarity matches than expressed sequence tags (ESTs) for database searching. [All 65 cDNA clone sequences described in this paper have been submitted to the GenBank data library under accession nos. U79240–U79304.] PMID:9110174
X-Prolyl Dipeptidyl Aminopeptidase Gene (pepX) Is Part of the glnRA Operon in Lactobacillus rhamnosus

PubMed Central

Varmanen, Pekka; Savijoki, Kirsi; Åvall, Silja; Palva, Airi; Tynkkynen, Soile

2000-01-01

A peptidase gene expressing X-prolyl dipeptidyl aminopeptidase (PepX) activity was cloned from Lactobacillus rhamnosus 1/6 by using the chromogenic substrate l-glycyl-l-prolyl-β-naphthylamide for screening of a genomic library in Escherichia coli. The nucleotide sequence of a 3.5-kb HindIII fragment expressing the peptidase activity revealed one complete open reading frame (ORF) of 2,391 nucleotides. The 797-amino-acid protein encoded by this ORF was shown to be 40, 39, and 36% identical with PepXs from Lactobacillus helveticus, Lactobacillus delbrueckii, and Lactococcus lactis, respectively. By Northern analysis with a pepX-specific probe, transcripts of 4.5 and 7.0 kb were detected, indicating that pepX is part of a polycistronic operon in L. rhamnosus. Cloning and sequencing of the upstream region of pepX revealed the presence of two ORFs of 360 and 1,338 bp that were shown to be able to encode proteins with high homology to GlnR and GlnA proteins, respectively. By multiple primer extension analyses, the only functional promoter in the pepX region was located 25 nucleotides upstream of glnR. Northern analysis with glnA- and pepX-specific probes indicated that transcription from glnR promoter results in a 2.0-kb dicistronic glnR-glnA transcript and also in a longer read-through polycistronic transcript of 7.0 kb that was detected with both probes in samples from cells in exponential growth phase. The glnA gene was disrupted by a single-crossover recombinant event using a nonreplicative plasmid carrying an internal part of glnA. In the disruption mutant, glnRA-specific transcription was derepressed 10-fold compared to the wild type, but the 7.0-kb transcript was no longer detectable with either the glnA- or pepX-specific probe, demonstrating that pepX is indeed part of glnRA operon in L. rhamnosus. Reverse transcription-PCR analysis further supported this operon structure. 
An extended stem-loop structure was identified immediately upstream of pepX in the glnA-pepX intergenic region, a sequence that showed homology to a 23S-5S intergenic spacer and to several other L. rhamnosus-related entries in data banks. PMID:10613874
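The ORF sizes quoted in the abstract above can be checked against the stated protein length with simple codon arithmetic. The following is a minimal sketch, not part of the original record; the helper name `codons_in_orf` is hypothetical, and whether each quoted length includes the stop codon is not stated in the abstract.

```python
def codons_in_orf(length_nt: int) -> int:
    """Return the number of codons in an ORF of the given nucleotide length.

    Hypothetical helper for illustration; raises if the length is not a
    whole number of codons.
    """
    if length_nt <= 0 or length_nt % 3 != 0:
        raise ValueError(f"{length_nt} nt is not a whole number of codons")
    return length_nt // 3


# ORF lengths quoted in the abstract (nucleotides / bp).
pepx_codons = codons_in_orf(2391)  # pepX ORF
glnr_codons = codons_in_orf(360)   # upstream ORF with homology to GlnR
glna_codons = codons_in_orf(1338)  # upstream ORF with homology to GlnA

# 2,391 nt corresponds to 797 codons, matching the 797-amino-acid PepX protein,
# which suggests that this quoted length does not count the stop codon.
print(pepx_codons, glnr_codons, glna_codons)
```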
X-prolyl dipeptidyl aminopeptidase gene (pepX) is part of the glnRA operon in Lactobacillus rhamnosus.

PubMed

Varmanen, P; Savijoki, K; Avall, S; Palva, A; Tynkkynen, S

2000-01-01

A peptidase gene expressing X-prolyl dipeptidyl aminopeptidase (PepX) activity was cloned from Lactobacillus rhamnosus 1/6 by using the chromogenic substrate L-glycyl-L-prolyl-beta-naphthylamide for screening of a genomic library in Escherichia coli. The nucleotide sequence of a 3.5-kb HindIII fragment expressing the peptidase activity revealed one complete open reading frame (ORF) of 2,391 nucleotides. The 797-amino-acid protein encoded by this ORF was shown to be 40, 39, and 36% identical with PepXs from Lactobacillus helveticus, Lactobacillus delbrueckii, and Lactococcus lactis, respectively. By Northern analysis with a pepX-specific probe, transcripts of 4.5 and 7.0 kb were detected, indicating that pepX is part of a polycistronic operon in L. rhamnosus. Cloning and sequencing of the upstream region of pepX revealed the presence of two ORFs of 360 and 1,338 bp that were shown to be able to encode proteins with high homology to GlnR and GlnA proteins, respectively. By multiple primer extension analyses, the only functional promoter in the pepX region was located 25 nucleotides upstream of glnR. Northern analysis with glnA- and pepX-specific probes indicated that transcription from glnR promoter results in a 2.0-kb dicistronic glnR-glnA transcript and also in a longer read-through polycistronic transcript of 7.0 kb that was detected with both probes in samples from cells in exponential growth phase. The glnA gene was disrupted by a single-crossover recombinant event using a nonreplicative plasmid carrying an internal part of glnA. In the disruption mutant, glnRA-specific transcription was derepressed 10-fold compared to the wild type, but the 7.0-kb transcript was no longer detectable with either the glnA- or pepX-specific probe, demonstrating that pepX is indeed part of glnRA operon in L. rhamnosus. Reverse transcription-PCR analysis further supported this operon structure. 
An extended stem-loop structure was identified immediately upstream of pepX in the glnA-pepX intergenic region, a sequence that showed homology to a 23S-5S intergenic spacer and to several other L. rhamnosus-related entries in data banks.
A rare case of 46, XX SRY-negative male with approximately 74-kb duplication in a region upstream of SOX9.

PubMed

Xiao, Bing; Ji, Xing; Xing, Ya; Chen, Ying-Wei; Tao, Jiong

2013-12-01

The 46, XX male disorder of sex development (DSD) is a rare genetic condition. Here, we report the case of a 46, XX SRY-negative male with complete masculinization. The coding region and exon/intron boundaries of the DAX1, SOX9 and RSPO1 genes were sequenced, and no mutations were detected. Using whole genome array analysis and real-time PCR, we identified an approximately 74-kb duplication in a region approximately 510-584 kb upstream of SOX9 (chr17:69,533,305-69,606,825, hg19). Combined with the results of previous studies, the minimum critical region associated with gonadal development is a 67-kb region located 584-517 kb upstream of SOX9. The amplification of this region might lead to SOX9 overexpression, causing female-to-male sex reversal. Gonadal-specific enhancers in the region upstream of SOX9 may activate the SOX9 expression through long-range regulation, thus triggering testicular differentiation. Copyright © 2013 Elsevier Masson SAS. All rights reserved.
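The reported duplication size can be reproduced from the hg19 coordinates given in the abstract above. This is a small illustrative sketch, assuming the coordinates are 1-based and inclusive (the usual display convention for positions written as chr:start-end); the function name `interval_length` is hypothetical.

```python
def interval_length(start: int, end: int, inclusive: bool = True) -> int:
    """Length in bp of a genomic interval.

    Assumes 1-based coordinates; with inclusive=True both endpoints count.
    """
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + (1 if inclusive else 0)


# Coordinates quoted in the abstract: chr17:69,533,305-69,606,825 (hg19).
dup_bp = interval_length(69_533_305, 69_606_825)

# Critical region quoted as lying 584-517 kb upstream of SOX9.
critical_kb = 584 - 517

print(dup_bp, round(dup_bp / 1000, 1), critical_kb)
```

Under that assumption the interval spans 73,521 bp, consistent with the "approximately 74-kb" figure, and the 584-517 kb window gives the stated 67-kb critical region.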
The Complete Sequence of a Human Parainfluenzavirus 4 Genome

PubMed Central

Yea, Carmen; Cheung, Rose; Collins, Carol; Adachi, Dena; Nishikawa, John; Tellier, Raymond

2009-01-01

Although the human parainfluenza virus 4 (HPIV4) has been known for a long time, its genome, alone among the human paramyxoviruses, has not been completely sequenced to date. In this study we obtained the first complete genomic sequence of HPIV4 from a clinical isolate named SKPIV4 obtained at the Hospital for Sick Children in Toronto (Ontario, Canada). The coding regions for the N, P/V, M, F and HN proteins show very high identities (95% to 97%) with previously available partial sequences for HPIV4B. The sequence for the L protein and the non-coding regions represent new information. A surprising feature of the genome is its length, more than 17 kb, making it the longest genome within the genus Rubulavirus, although the length is well within the known range of 15 kb to 19 kb for the subfamily Paramyxovirinae. The availability of a complete genomic sequence will facilitate investigations on a respiratory virus that is still not completely characterized. PMID:21994536
Cloning, sequencing, and expression of the gene coding for bile acid 7 alpha-hydroxysteroid dehydrogenase from Eubacterium sp. strain VPI 12708.

PubMed Central

Baron, S F; Franklund, C V; Hylemon, P B

1991-01-01

Southern blot analysis indicated that the gene encoding the constitutive, NADP-linked bile acid 7 alpha-hydroxysteroid dehydrogenase of Eubacterium sp. strain VPI 12708 was located on a 6.5-kb EcoRI fragment of the chromosomal DNA. This fragment was cloned into bacteriophage lambda gt11, and a 2.9-kb piece of this insert was subcloned into pUC19, yielding the recombinant plasmid pBH51. DNA sequence analysis of the 7 alpha-hydroxysteroid dehydrogenase gene in pBH51 revealed a 798-bp open reading frame, coding for a protein with a calculated molecular weight of 28,500. A putative promoter sequence and ribosome binding site were identified. The 7 alpha-hydroxysteroid dehydrogenase mRNA transcript in Eubacterium sp. strain VPI 12708 was about 0.94 kb in length, suggesting that it is monocistronic. An Escherichia coli DH5 alpha transformant harboring pBH51 had approximately 30-fold greater levels of 7 alpha-hydroxysteroid dehydrogenase mRNA, immunoreactive protein, and specific activity than Eubacterium sp. strain VPI 12708. The 7 alpha-hydroxysteroid dehydrogenase purified from the pBH51 transformant was similar in subunit molecular weight, specific activity, and kinetic properties to that from Eubacterium sp. strain VPI 12708, and it reacted with antiserum raised against the authentic enzyme on Western immunoblots. Alignment of the amino acid sequence of the 7 alpha-hydroxysteroid dehydrogenase with those of 10 other pyridine nucleotide-linked alcohol/polyol dehydrogenases revealed six conserved amino acid residues in the N-terminal regions thought to function in coenzyme binding. PMID:1856160
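As a rough plausibility check on the figures in the abstract above, the 798-bp open reading frame can be converted to an approximate protein mass. This is only a back-of-the-envelope sketch: the average residue mass of 110 Da is a common rule of thumb and an assumption here, not a value from the record, and the abstract does not say whether the 798 bp includes the stop codon, so both cases are shown. The name `estimate_mass_da` is hypothetical.

```python
# Rule-of-thumb average mass of one amino acid residue in a protein.
# This value is an assumption for illustration, not taken from the abstract.
AVG_RESIDUE_DA = 110.0


def estimate_mass_da(orf_bp: int, includes_stop: bool) -> float:
    """Approximate protein mass (Da) encoded by an ORF of orf_bp base pairs."""
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError(f"{orf_bp} bp is not a whole number of codons")
    residues = orf_bp // 3 - (1 if includes_stop else 0)
    return residues * AVG_RESIDUE_DA


REPORTED_MW = 28_500  # calculated molecular weight quoted in the abstract

for includes_stop in (True, False):
    mass = estimate_mass_da(798, includes_stop)
    # Ratio of the rough estimate to the reported value.
    print(includes_stop, mass, round(mass / REPORTED_MW, 3))
```

Either way the estimate lands near 29 kDa, within a few percent of the reported 28,500; the true mass depends on the actual amino acid composition.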
Cloning, Sequencing, and Characterization of the cgmB Gene of Sinorhizobium meliloti Involved in Cyclic β-Glucan Biosynthesis

PubMed Central

Wang, Ping; Ingram-Smith, Cheryl; Hadley, Jill A.; Miller, Karen J.

1999-01-01

Periplasmic cyclic β-glucans of Rhizobium species provide important functions during plant infection and hypo-osmotic adaptation. In Sinorhizobium meliloti (also known as Rhizobium meliloti), these molecules are highly modified with phosphoglycerol and succinyl substituents. We have previously identified an S. meliloti Tn5 insertion mutant, S9, which is specifically impaired in its ability to transfer phosphoglycerol substituents to the cyclic β-glucan backbone (M. W. Breedveld, J. A. Hadley, and K. J. Miller, J. Bacteriol. 177:6346–6351, 1995). In the present study, we have cloned, sequenced, and characterized this mutation at the molecular level. By using the Tn5 flanking sequences (amplified by inverse PCR) as a probe, an S. meliloti genomic library was screened, and two overlapping cosmid clones which functionally complement S9 were isolated. A 3.1-kb HindIII-EcoRI fragment found in both cosmids was shown to fully complement mutant S9. Furthermore, when a plasmid containing this 3.1-kb fragment was used to transform Rhizobium leguminosarum bv. trifolii TA-1JH, a strain which normally synthesizes only neutral cyclic β-glucans, anionic glucans containing phosphoglycerol substituents were produced, consistent with the functional expression of an S. meliloti phosphoglycerol transferase gene. Sequence analysis revealed the presence of two major, overlapping open reading frames within the 3.1-kb fragment. Primer extension analysis revealed that one of these open reading frames, ORF1, was transcribed and its transcription was osmotically regulated. This novel locus of S. meliloti is designated the cgm (cyclic glucan modification) locus, and the product encoded by ORF1 is referred to as CgmB. PMID:10419956
Loss of retrovirus production in JB/RH melanoma cells transfected with H-2Kb and TAP-1 genes.

PubMed

Li, M; Xu, F; Muller, J; Huang, X; Hearing, V J; Gorelik, E

1999-01-20

JB/RH1 melanoma cells, as well as other melanomas of C57BL/6 mice (B16 and JB/MS), express a common melanoma-associated antigen (MAA) encoded by an ecotropic melanoma-associated retrovirus (MelARV). JB/RH1 cells do not express the H-2Kb molecules due to down-regulation of the H-2Kb and TAP-1 genes. When JB/RH1 cells were transfected with the H-2Kb and cotransfected with the TAP-1 gene, it resulted in the appearance of H-2Kb molecules and an increase in their immunogenicity, albeit they lost expression of retrovirus-encoded MAA recognized by MM2-9B6 mAb. Loss of MAA was found to result from a complete and stable elimination of ecotropic MelARV production in the H-2Kb/TAP-1-transfected JB/RH1 cells. Northern blot analysis showed no differences in ecotropic retroviral messages in MelARV-producing and -nonproducing melanoma cells, suggesting that loss of MelARV production was not due to down-regulation of MelARV transcription. Southern blot analysis revealed several rearrangements in the proviral DNA of H-2Kb-positive JB/RH1 melanoma cells. Sequence analysis of the ecotropic proviral DNA from these cells showed numerous nucleotide substitutions, some of which resulted in the appearance of a novel intraviral PstI restriction site and the loss of a HindIII restriction site in the pol region. PCR amplification of the proviral DNAs indicates that an ecotropic provirus found in the H-2Kb-positive cells is novel and does not preexist in the parental H-2Kb-negative melanoma cells. Conversely, the ecotropic provirus of the parental JB/RH1 cells was not amplifiable from the H-2Kb-positive cells. Our data indicate that stable loss of retroviral production in the H-2Kb/TAP-1-transfected melanoma cells is probably due to the induction of recombination between a productive ecotropic MelARV and a defective nonecotropic provirus leading to the generation of a defective ecotropic provirus and the loss of MelARV production and expression of the retrovirus-encoded MAA.
Copyright 1999 Academic Press.
The Universal Protein Resource (UniProt): an expanding universe of protein information.

PubMed

Wu, Cathy H; Apweiler, Rolf; Bairoch, Amos; Natale, Darren A; Barker, Winona C; Boeckmann, Brigitte; Ferro, Serenella; Gasteiger, Elisabeth; Huang, Hongzhan; Lopez, Rodrigo; Magrane, Michele; Martin, Maria J; Mazumder, Raja; O'Donovan, Claire; Redaschi, Nicole; Suzek, Baris

2006-01-01

The Universal Protein Resource (UniProt) provides a central resource on protein sequences and functional annotation with three database components, each addressing a key need in protein bioinformatics. The UniProt Knowledgebase (UniProtKB), comprising the manually annotated UniProtKB/Swiss-Prot section and the automatically annotated UniProtKB/TrEMBL section, is the preeminent storehouse of protein annotation. The extensive cross-references, functional and feature annotations and literature-based evidence attribution enable scientists to analyse proteins and query across databases. The UniProt Reference Clusters (UniRef) speed similarity searches via sequence space compression by merging sequences that are 100% (UniRef100), 90% (UniRef90) or 50% (UniRef50) identical. Finally, the UniProt Archive (UniParc) stores all publicly available protein sequences, containing the history of sequence data with links to the source databases. UniProt databases continue to grow in size and in availability of information. Recent and upcoming changes to database contents, formats, controlled vocabularies and services are described. New download availability includes all major releases of UniProtKB, sequence collections by taxonomic division and complete proteomes. A bibliography mapping service has been added, and an ID mapping service will be available soon. UniProt databases can be accessed online at http://www.uniprot.org or downloaded at ftp://ftp.uniprot.org/pub/databases/.
The organisation and interviral homologies of genes at the 3' end of tobacco rattle virus RNA1

PubMed Central

Boccara, Martine; Hamilton, William D. O.; Baulcombe, David C.

1986-01-01

The RNA1 of tobacco rattle virus (TRV) has been cloned as cDNA and the nucleotide sequence determined of 2 kb from the 3'-terminal region. The sequence contains three long open reading frames. One of these starts 5' of the cDNA and probably corresponds to the carboxy-terminal sequence of a 170-K protein encoded on RNA1. The deduced protein sequence from this reading frame shows homology with the putative replicases of tobacco mosaic virus (TMV) and tricornaviruses. The location of the second open reading frame, which encodes a 29-K polypeptide, was shown by Northern blot analysis to coincide with a 1.6-kb subgenomic RNA. The validity of this reading frame was confirmed by showing that the cDNA extending over this region could be transcribed and translated in vitro to produce a polypeptide of the predicted size which co-migrates in electrophoresis with a translation product of authentic viral RNA. The sequence of this 29-K polypeptide showed homology with two regions in the 30-K protein of TMV. This homology includes positions in the TMV 30-K protein where mutations have been identified which affect the transport of virus between cells. The third open reading frame encodes a potential 16-K protein and was shown by Northern blot hybridisation to be contained within the region of a 0.7-kb subgenomic RNA which is found in cellular RNA of infected cells but not virus particles. The many similarities between TRV and TMV in viral morphology, gene organisation and sequence suggest that these two viral groups may share a common viral ancestor. PMID:16453668
Ornithine aminotransferase (OAT): recombination between an X-linked OAT sequence (7.5 kb) and the Norrie disease locus.

PubMed

Ngo, J T; Bateman, J B; Spence, M A; Cortessis, V; Sparkes, R S; Kivlin, J D; Mohandas, T; Inana, G

1990-01-01

A human ornithine aminotransferase (OAT) locus has been mapped to Xp11.2, as has the Norrie disease locus. We used a cDNA probe to investigate a 3-generation UCLA family with Norrie disease; a 4.2-kb RFLP was detected and a maximum lod score of 0.602 at zero recombination fraction was calculated. We used the same probe to study a second multigeneration family with Norrie disease from Utah. A different RFLP of 7.5 kb in size was identified and a recombinational event between the OAT locus represented by this RFLP and the disease locus was observed. Linkage analysis of these two loci in this family revealed a maximum lod score of 1.88 at a recombination fraction of 0.10. Although both families have affected members with the same disease, the lod scores are reported separately because the 4.2- and 7.5-kb RFLPs may represent two different loci for the X-linked OAT.
Comparative fine mapping of the Wax 1 (W1) locus in hexaploid wheat.

PubMed

Lu, Ping; Qin, Jinxia; Wang, Guoxin; Wang, Lili; Wang, Zhenzhong; Wu, Qiuhong; Xie, Jingzhong; Liang, Yong; Wang, Yong; Zhang, Deyun; Sun, Qixin; Liu, Zhiyong

2015-08-01

By applying comparative genomics analyses, a high-density genetic linkage map of the Wax 1 (W1) locus was constructed as a framework for map-based cloning. Glaucousness is described as the scattering effect of visible light from wax deposited on the cuticle of plant aerial organs. In wheat, the wax on leaves and stems is mainly controlled by two sets of genes: glaucousness loci (W1 and W2) and non-glaucousness loci (Iw1 and Iw2). Bulked segregant analysis (BSA) and simple sequence repeat (SSR) mapping showed that Wax1 (W1) is located on chromosome arm 2BS between markers Xgwm210 and Xbarc35. By applying comparative genomics analyses, collinear genomic regions of the W1 locus on wheat 2BS were identified in Brachypodium distachyon chromosome 5, rice chromosome 4 and sorghum chromosome 6. Four STS markers were developed using the Triticum aestivum cv. Chinese Spring 454 contig sequences and the International Wheat Genome Sequencing Consortium (IWGSC) survey sequences. W1 was mapped into a 0.93 cM genetic interval flanked by markers XWGGC3197 and XWGGC2484, which has synteny with genomic regions of 56.5 kb in Brachypodium, 390 kb in rice and 31.8 kb in sorghum. The fine genetic map can serve as a framework for chromosome landing, physical mapping and map-based cloning of W1 in wheat.
Analysis of sequences from field samples reveals the presence of the recently described pepper vein yellows virus (genus Polerovirus) in six additional countries.

PubMed

Knierim, Dennis; Tsai, Wen-Shi; Kenyon, Lawrence

2013-06-01

Polerovirus infection was detected by reverse transcription polymerase chain reaction (RT-PCR) in 29 pepper plants (Capsicum spp.) and one black nightshade plant (Solanum nigrum) sample collected from fields in India, Indonesia, Mali, Philippines, Thailand and Taiwan. At least two representative samples for each country were selected to generate a general polerovirus RT-PCR product of 1.4 kb length for sequencing. Sequence analysis of the partial genome sequences revealed the presence of pepper vein yellows virus (PeVYV) in all 13 samples. A 1990 Australian herbarium sample of pepper described by serological means as infected with capsicum yellows virus (CYV) was identified by sequence analysis of a partial CP sequence as probably infected with a potato leaf roll virus (PLRV) isolate.
Characterization of Cer-1 cis-regulatory region during early Xenopus development.

PubMed

Silva, Ana Cristina; Filipe, Mário; Steinbeisser, Herbert; Belo, José António

2011-05-01

Cerberus-related molecules are well-known Wnt, Nodal, and BMP inhibitors that have been implicated in different processes including anterior–posterior patterning and left–right asymmetry. In both mouse and frog, two Cerberus-related genes have been isolated, mCer-1 and mCer-2, and Xcer and Xcoco, respectively. Until now, little has been known about the mechanisms involved in their transcriptional regulation. Here, we report a heterologous analysis of the mouse Cerberus-1 gene upstream regulatory regions, responsible for its expression in the visceral endodermal cells. Our analysis showed that the consensus sequences for TATA, CAAT, or GC boxes were absent but a TGTGG sequence was present at position -172 to -168 bp, relative to the ATG. Using a series of deletion constructs and transient expression in Xenopus embryos, we found that a fragment of 1.4 kb of Cer-1 promoter sequence could reproduce the endogenous expression pattern of Xenopus cerberus. A 0.7-kb mcer-1 upstream region was able to drive reporter expression to the involuting mesendodermal cells, while further deletions abolished reporter gene expression. Our results suggest that although no sequence similarity was found between mouse and Xenopus cerberus cis-regulatory regions, the signaling cascades regulating cerberus expression during gastrulation are conserved.
Comparative Analysis of Vertebrate Dystrophin Loci Indicate Intron Gigantism as a Common Feature

PubMed Central

Pozzoli, Uberto; Elgar, Greg; Cagliani, Rachele; Riva, Laura; Comi, Giacomo P.; Bresolin, Nereo; Bardoni, Alessandra; Sironi, Manuela

2003-01-01

The human DMD gene is the largest known to date, spanning > 2000 kb on the X chromosome. The gene size is mainly accounted for by huge intronic regions. We sequenced 190 kb of Fugu rubripes (pufferfish) genomic DNA corresponding to the complete dystrophin gene (FrDMD) and provide the first report of gene structure and sequence comparison among dystrophin genomic sequences from different vertebrate organisms. Almost all intron positions and phases are conserved between FrDMD and its mammalian counterparts, and the predicted protein product of the Fugu gene displays 55% identity and 71% similarity to human dystrophin. In analogy to the human gene, FrDMD presents several-fold longer than average intronic regions. Analysis of intron sequences of the human and murine genes revealed that they are extremely conserved in size and that a similar fraction of total intron length is represented by repetitive elements; moreover, our data indicate that intron expansion through repeat accumulation in the two orthologs is the result of independent insertional events. The hypothesis that intron length might be functionally relevant to the DMD gene regulation is proposed and substantiated by the finding that dystrophin intron gigantism is common to the three vertebrate genes. [Supplemental material is available online at www.genome.org.] PMID:12727896
HOXBES2: a novel epididymal HOXB2 homeoprotein and its domain-specific association with spermatozoa.

PubMed

Prabagaran, E; Bandivdekar, A H; Dighe, V; Raghavan, V P

2007-02-01

The sperm from the testis acquires complete fertilizing ability and forward progressive motility following its transit through the epididymis. Acquisition of these characteristics results from the modification of the sperm proteome following interactions with epididymal secretions. In our attempts to identify epididymis-specific sperm plasma membrane proteins, a partial 2.83-kb clone was identified by immunoscreening a monkey epididymal cDNA library with an agglutinating monoclonal antibody raised against washed human spermatozoa. The sequence of the 2.83-kb clone exhibited homology to the region between 1 and 1097 bp of the homeobox gene, Hoxb2. This sequence was found to be species conserved, as revealed by RT-PCR analysis. To obtain a full-length clone of the sequence, 5' RACE-PCR (rapid amplification of cDNA ends PCR) was carried out using rat epididymal RNA as the template. It resulted in a full-length 1.657-kb cDNA encoding a 32.9-kDa putative protein. The protein designated HOXBES2 exhibited homology to the conserved 61-amino acid homeodomain region of the HOXB2 homeoprotein. However, characteristic differences were noted in its amino and carboxyl termini compared with HOXB2. A putative 30-kDa protein was detected in the tissue extracts from adult rat epididymis and caudal spermatozoa, and a 37-kDa protein was detected in the rat embryo when probed with a polyclonal antibody against HOXB2 protein. Multiple tissue Western blot and immunohistochemical analysis further indicated its expression in the cytoplasm of the principal and basal epithelial cells, with maximal expression in the distal epididymal segments. Northern blot analysis detected a single approximately 2.5-kb transcript from the adult epididymis. Indirect immunofluorescence localized the protein to the acrosome, midpiece, and equatorial segments of rat caudal and ejaculated human and monkey spermatozoa, respectively. 
In conclusion, we have identified and characterized a novel epididymal homeoprotein different from HOXB2 protein and hereafter referred to as HOXBES2, (HOXB2 homeodomain containing epididymis-specific sperm protein) with a probable role in fertilization.
Characterization of the 101-Kilobase-Pair Megaplasmid pKB1, Isolated from the Rubber-Degrading Bacterium Gordonia westfalica Kb1

PubMed Central

Bröker, Daniel; Arenskötter, Matthias; Legatzki, Antje; Nies, Dietrich H.; Steinbüchel, Alexander

2004-01-01

The complete sequence of the circular 101,016-bp megaplasmid pKB1 from the cis-1,4-polyisoprene-degrading bacterium Gordonia westfalica Kb1, which represents the first described extrachromosomal DNA of a member of this genus, was determined. Plasmid pKB1 harbors 105 open reading frames. The predicted products of 46 of these are significantly related to proteins of known function. Plasmid pKB1 is organized into three functional regions that are flanked by insertion sequence (IS) elements: (i) a replication and putative partitioning region, (ii) a putative metabolic region, and (iii) a large putative conjugative transfer region, which is interrupted by an additional IS element. Southern hybridization experiments revealed the presence of another copy of this conjugational transfer region on the bacterial chromosome. The origin of replication (oriV) of pKB1 was identified and used for construction of Escherichia coli-Gordonia shuttle vectors, which were also suitable for several other Gordonia species and related genera. The metabolic region included the heavy-metal resistance gene cadA, encoding a P-type ATPase. Expression of cadA in E. coli mediated resistance to cadmium, but not to zinc, and decreased the cellular content of cadmium in this host. When G. westfalica strain Kb1 was cured of plasmid pKB1, the resulting derivative strains exhibited slightly decreased cadmium resistance. Furthermore, they had lost the ability to use isoprene rubber as a sole source of carbon and energy, suggesting that genes essential for rubber degradation are encoded by pKB1. PMID:14679241
Construction of trypanosome artificial mini-chromosomes.

PubMed Central

Lee, M G; E, Y; Axelrod, N

1995-01-01

We report the preparation of two linear constructs which, when transformed into the procyclic form of Trypanosoma brucei, become stably inherited artificial mini-chromosomes. Both of the two constructs, one of 10 kb and the other of 13 kb, contain a T.brucei PARP promoter driving a chloramphenicol acetyltransferase (CAT) gene. In the 10 kb construct the CAT gene is followed by one hygromycin phosphotransferase (Hph) gene, and in the 13 kb construct the CAT gene is followed by three tandemly linked Hph genes. At each end of these linear molecules are telomere repeats and subtelomeric sequences. Electroporation of these linear DNA constructs into the procyclic form of T.brucei generated hygromycin-B resistant cell lines. In these cell lines, the input DNA remained linear and bounded by the telomere ends, but it increased in size. In the cell lines generated by the 10 kb construct, the input DNA increased in size to 20-50 kb. In the cell lines generated by the 13 kb constructs, two sizes of linear DNAs containing the input plasmid were detected: one of 40-50 kb and the other of 150 kb. The increase in size was not the result of in vivo tandem repetitions of the input plasmid, but represented the addition of new sequences. These Hph containing linear DNA molecules were maintained stably in cell lines for at least 20 generations in the absence of drug selection and were subsequently referred to as trypanosome artificial mini-chromosomes, or TACs. PMID:8532534

Chromosomal insertion and excision of a 30 kb unstable genetic element is responsible for phase variation of lipopolysaccharide and other virulence determinants in Legionella pneumophila.

PubMed

Lüneberg, E; Mayer, B; Daryab, N; Kooistra, O; Zähringer, U; Rohde, M; Swanson, J; Frosch, M

2001-03-01

We recently described the phase-variable expression of a virulence-associated lipopolysaccharide (LPS) epitope in Legionella pneumophila. In this study, the molecular mechanism for phase variation was investigated. We identified a 30 kb unstable genetic element as the molecular origin for LPS phase variation. Thirty putative genes were encoded on the 30 kb sequence, organized in two putative opposite transcription units. Some of the open reading frames (ORFs) shared homologies with bacteriophage genes, suggesting that the 30 kb element was of phage origin. In the virulent wild-type strain, the 30 kb element was located on the chromosome, whereas excision from the chromosome and replication as a high-copy plasmid resulted in the mutant phenotype, which is characterized by alteration of an LPS epitope and loss of virulence. Mapping and sequencing of the insertion site in the genome revealed that the chromosomal attachment site was located in an intergenic region flanked by genes of unknown function. As phage release could not be induced by mitomycin C, it is conceivable that the 30 kb element is a non-functional phage remnant. The protein encoded by ORF T on the 30 kb plasmid could be isolated by an outer membrane preparation, indicating that the genes encoded on the 30 kb element are expressed in the mutant phenotype. Therefore, it is conceivable that the phenotypic alterations seen in the mutant depend on high-copy replication of the 30 kb element and expression of the encoded genes. Excision of the 30 kb element from the chromosome was found to occur in a RecA-independent pathway, presumably by the involvement of RecE, RecT and RusA homologues that are encoded on the 30 kb element.
A new polymorphic and multicopy MHC gene family related to nonmammalian class I

DOE Office of Scientific and Technical Information (OSTI.GOV)

Leelayuwat, C.; Degli-Esposti, M.A.; Abraham, L.J.

1994-12-31

The authors have used genomic analysis to characterize a region of the central major histocompatibility complex (MHC) spanning approximately 300 kilobases (kb) between TNF and HLA-B. This region has been suggested to carry genetic factors relevant to the development of autoimmune diseases such as myasthenia gravis (MG) and insulin dependent diabetes mellitus (IDDM). Genomic sequence was analyzed for coding potential, using two neural network programs, GRAIL and GeneParser. A genomic probe, JAB, containing putative coding sequences (PERB11) located 60 kb centromeric of HLA-B, was used for northern analysis of human tissues. Multiple transcripts were detected. Southern analysis of genomic DNA and overlapping YAC clones, covering the region from BAT1 to HLA-F, indicated that there are at least five copies of PERB11, four of which are located within this region of the MHC. The partial cDNA sequence of PERB11 was obtained from poly-A RNA derived from skeletal muscle. The putative amino acid sequence of PERB11 shares approximately 30% identity to MHC class I molecules from various species, including reptiles, chickens, and frogs, as well as to other MHC class I-like molecules, such as the IgG FcR of the mouse and rat and the human Zn-α2-glycoprotein. From direct comparison of amino acid sequences, it is concluded that PERB11 is a distinct molecule more closely related to nonmammalian than known mammalian MHC class I molecules. Genomic sequence analysis of PERB11 from five MHC ancestral haplotypes (AH) indicated that the gene is polymorphic at both DNA and protein level. The results suggest that the authors have identified a novel polymorphic gene family with multiple copies within the MHC. 48 refs., 10 figs., 2 tabs.
An umbra-like virus of papaya discovered in Ecuador: detection, occurrence and phylogenetic relatedness

USDA-ARS's Scientific Manuscript database

Double-stranded RNA (dsRNA) extractions from papaya leaves infected with Papaya ringspot virus (PRSV) revealed the presence of an unusual 4kb band, in addition to the presumed PRSV-associated 10kb band. Partial sequence of RT-PCR products from the 4kb dsRNA revealed homology to genomes of several me...
Extensive gene conversion at the PMS2 DNA mismatch repair locus.

PubMed

Hayward, Bruce E; De Vos, Michel; Valleley, Elizabeth M A; Charlton, Ruth S; Taylor, Graham R; Sheridan, Eamonn; Bonthron, David T

2007-05-01

Mutations of the PMS2 DNA repair gene predispose to a characteristic range of malignancies, with either childhood onset (when both alleles are mutated) or a partially penetrant adult onset (if heterozygous). These mutations have been difficult to detect, due to interference from a family of pseudogenes located on chromosome 7. One of these, the PMS2CL pseudogene, lies within a 100-kb inverted duplication (inv dup), 700 kb centromeric to PMS2 itself on 7p22. Here, we show that the reference genomic sequences cannot be relied upon to distinguish PMS2 from PMS2CL, because of sequence transfer between the two loci. The 7p22 inv dup occurred prior to the divergence of modern ape species (15 million years ago [Mya]), but has undergone extensive sequence homogenization. This process appears to be ongoing, since there is considerable allelic diversity within the duplicated region, much of it derived from sequence exchange between PMS2 and PMS2CL. This sequence diversity can result in both false-positive and false-negative mutation analysis at this locus. Great caution is still needed in the design and interpretation of PMS2 mutation screens. 2007 Wiley-Liss, Inc.
Nanopore DNA Sequencing and Genome Assembly on the International Space Station.

PubMed

Castro-Wallace, Sarah L; Chiu, Charles Y; John, Kristen K; Stahl, Sarah E; Rubins, Kathleen H; McIntyre, Alexa B R; Dworkin, Jason P; Lupisella, Mark L; Smith, David J; Botkin, Douglas J; Stephenson, Timothy A; Juul, Sissel; Turner, Daniel J; Izquierdo, Fernando; Federman, Scot; Stryke, Doug; Somasekar, Sneha; Alexander, Noah; Yu, Guixia; Mason, Christopher E; Burton, Aaron S

2017-12-21

We evaluated the performance of the MinION DNA sequencer in-flight on the International Space Station (ISS), and benchmarked its performance off-Earth against the MinION, Illumina MiSeq, and PacBio RS II sequencing platforms in terrestrial laboratories. Samples contained equimolar mixtures of genomic DNA from lambda bacteriophage, Escherichia coli (strain K12, MG1655) and Mus musculus (female BALB/c mouse). Nine sequencing runs were performed aboard the ISS over a 6-month period, yielding a total of 276,882 reads with no apparent decrease in performance over time. From sequence data collected aboard the ISS, we constructed directed assemblies of the ~4.6 Mb E. coli genome, ~48.5 kb lambda genome, and a representative M. musculus sequence (the ~16.3 kb mitochondrial genome), at 100%, 100%, and 96.7% consensus pairwise identity, respectively; de novo assembly of the E. coli genome from raw reads yielded a single contig comprising 99.9% of the genome at 98.6% consensus pairwise identity. Simulated real-time analyses of in-flight sequence data using an automated bioinformatic pipeline and laptop-based genomic assembly demonstrated the feasibility of sequencing analysis and microbial identification aboard the ISS. These findings illustrate the potential for sequencing applications including disease diagnosis, environmental monitoring, and elucidating the molecular basis for how organisms respond to spaceflight.
Cloning and characterization of the mouse alpha1C/A-adrenergic receptor gene and analysis of an alpha1C promoter in cardiac myocytes: role of an MCAT element that binds transcriptional enhancer factor-1 (TEF-1).

PubMed

O'Connell, T D; Rokosh, D G; Simpson, P C

2001-05-01

alpha1-Adrenergic receptor (AR) subtypes in the heart are expressed by myocytes but not by fibroblasts, a feature that distinguishes alpha1-ARs from beta-ARs. Here we studied myocyte-specific expression of alpha1-ARs, focusing on the subtype alpha1C (also called alpha1A), a subtype implicated in cardiac hypertrophic signaling in rat models. We first cloned the mouse alpha1C-AR gene, which consisted of two exons with an 18 kb intron, similar to the alpha1B-AR gene. The receptor coding sequence was >90% homologous to that of rat and human. alpha1C-AR transcription in mouse heart was initiated from a single Inr consensus sequence at -588 from the ATG; this and a putative polyadenylation sequence 8.5 kb 3' could account for the predominant 11 kb alpha1C mRNA in mouse heart. A 5'-nontranscribed fragment of 4.4 kb was active as a promoter in cardiac myocytes but not in fibroblasts. Promoter activity in myocytes required a single muscle CAT (MCAT) element, and this MCAT bound in vitro to recombinant and endogenous transcriptional enhancer factor-1. Thus, alpha1C-AR transcription in cardiac myocytes shares MCAT dependence with other cardiac-specific genes, including the alpha- and beta-myosin heavy chains, skeletal alpha-actin, and brain natriuretic peptide. However, the mouse alpha1C gene was not transcribed in the neonatal heart and was not activated by alpha1-AR and other hypertrophic agonists in rat myocytes, and thus differed from other MCAT-dependent genes and the rat alpha1C gene.
Molecular cloning of the mouse gene coding for α2-macroglobulin and targeting of the gene in embryonic stem cells

DOE Office of Scientific and Technical Information (OSTI.GOV)

Umans, L.; Serneels, L.; Hilliker, C.

1994-08-01

The authors have cloned the mouse gene coding for α2-macroglobulin in overlapping λ clones and have analyzed its structure. The gene contains 36 exons, coding for the 4.8-kb cDNA that we cloned previously. Including putative control elements in the 5' flanking region, the gene covers about 45 kb. A region of 3.8 kb, stretching from 835 bases upstream of the cDNA start site to exon 4, including all intervening sequences, was sequenced completely. The analysis demonstrated that the putative promoter region of the mouse A2M gene differed considerably from the known promoter sequences of the human A2M gene and of the rat acute-phase A2M gene. Comparison of the exon-intron structure of all known genes of the A2M family confirmed that the rat acute phase A2M gene is more closely related to the human gene than to the mouse A2M gene. To generate mice with the A2M gene inactivated, an insertion type of construct containing 7.5 kb of genomic DNA of the mouse strain 129/J, encompassing exons 16 to 19, was synthesized. A hygromycin marker gene was embedded in intron 17. After electroporation, 198 hygromycin-resistant ES cell lines were isolated and analyzed by Southern blotting. Five ES cell lines were obtained with one allele of the mouse A2M gene targeted by this insertion construct, demonstrating that the position and the characteristics of the vector served the intended goal.
Molecular variation and horizontal gene transfer of the homocysteine methyltransferase gene mmuM and its distribution in clinical pathogens.

PubMed

Ying, Jianchao; Wang, Huifeng; Bao, Bokan; Zhang, Ying; Zhang, Jinfang; Zhang, Cheng; Li, Aifang; Lu, Junwan; Li, Peizhen; Ying, Jun; Liu, Qi; Xu, Teng; Yi, Huiguang; Li, Jinsong; Zhou, Li; Zhou, Tieli; Xu, Zuyuan; Ni, Liyan; Bao, Qiyu

2015-01-01

The homocysteine methyltransferase encoded by mmuM is widely distributed among microbial organisms. It is the key enzyme that catalyzes the last step in methionine biosynthesis and plays an important role in the metabolism process. It also enables the microbial organisms to tolerate high concentrations of selenium in the environment. In this research, 533 mmuM gene sequences covering 70 genera of the bacteria were selected from GenBank database. The distribution frequency of mmuM is different in the investigated genera of bacteria. The mapping results of 160 mmuM reference sequences showed that the mmuM genes were found in 7 species of pathogen genomes sequenced in this work. The polymerase chain reaction products of one mmuM genotype (NC_013951 as the reference) were sequenced and the sequencing results confirmed the mapping results. Furthermore, 144 representative sequences were chosen for phylogenetic analysis and some mmuM genes from totally different genera (such as the genes between Escherichia and Klebsiella and between Enterobacter and Kosakonia) shared closer phylogenetic relationship than those from the same genus. Comparative genomic analysis of the mmuM encoding regions on plasmids and bacterial chromosomes showed that pKF3-140 and pIP1206 plasmids shared a 21 kb homology region and a 4.9 kb fragment in this region was in fact originated from the Escherichia coli chromosome. These results further suggested that mmuM gene did go through the gene horizontal transfer among different species or genera of bacteria. High-throughput sequencing combined with comparative genomics analysis would explore distribution and dissemination of the mmuM gene among bacteria and its evolution at a molecular level.
De novo assembly of human genomes with massively parallel short read sequencing.

PubMed

Li, Ruiqiang; Zhu, Hongmei; Ruan, Jue; Qian, Wubin; Fang, Xiaodong; Shi, Zhongbin; Li, Yingrui; Li, Shengting; Shan, Gao; Kristiansen, Karsten; Li, Songgang; Yang, Huanming; Wang, Jian; Wang, Jun

2010-02-01

Next-generation massively parallel DNA sequencing technologies provide ultrahigh throughput at a substantially lower unit data cost; however, the data are very short read length sequences, making de novo assembly extremely challenging. Here, we describe a novel method for de novo assembly of large genomes from short read sequences. We successfully assembled both the Asian and African human genome sequences, achieving an N50 contig size of 7.4 and 5.9 kilobases (kb) and scaffold of 446.3 and 61.9 kb, respectively. The development of this de novo short read assembly method creates new opportunities for building reference sequences and carrying out accurate analyses of unexplored genomes in a cost-effective way.
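The N50 statistic quoted in this abstract can be illustrated with a short sketch. It uses the common definition (the length of the contig at which the cumulative length, counted from the longest contig downward, first reaches half of the total assembly length); the function name `n50` and the toy contig lengths are hypothetical and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths.

    Contigs are sorted from longest to shortest and accumulated until the
    running total reaches half of the summed length; the length of the
    contig that crosses that threshold is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("need at least one contig length")


# Hypothetical toy assembly (lengths in kb), not data from the abstract.
toy_contigs_kb = [2, 3, 4, 5, 6, 7, 8, 9, 10]
print(n50(toy_contigs_kb))  # total is 54 kb; 10 + 9 + 8 = 27 reaches half
```

Because the statistic is weighted by length, a few long contigs can raise the N50 even when most contigs are short, which is one reason contig and scaffold N50 are reported separately.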
Nucleotide sequences of Herpes Simplex Virus type 1 (HSV-1) affecting virus entry, cell fusion, and production of glycoprotein gB (VP7)

DOE Office of Scientific and Technical Information (OSTI.GOV)

DeLuca, N.; Bzik, D.J.; Bond, V.C.

1982-10-30

The tsB5 strain of Herpes Simplex Virus type 1 (HSV-1) contains at least two mutations; one mutation specifies the syncytial phenotype and the other confers temperature sensitivity for virus growth. These functions are known to be located between the prototypic map coordinates 0.30 and 0.42. In this study it was demonstrated that tsB5 enters human embryonic lung (HEL) cells more rapidly than KOS, another strain of HSV-1. The EcoRI restriction fragment F from the KOS strain (map coordinates 0.315 to 0.421) was mapped with eight restriction endonucleases, and 16 recombinant plasmids were constructed which contained varying portions of the KOS genome. Recombinant viruses were generated by marker-rescue and marker-transfer cotransfection procedures, using intact DNA from one strain and a recombinant plasmid containing DNA from the other strain. The region of the crossover between the two nonisogenic strains was inferred by the identification of restriction sites in the recombinants that were characteristic of the parental strains. The recombinants were subjected to phenotypic analysis. Syncytium formation, rate of virus entry, and the production of gB were all separable by the crossovers that produced the recombinants. The KOS sequences which rescue the syncytial phenotype of tsB5 were localized to 1.5 kb (map coordinates 0.345 to 0.355), and the temperature-sensitive mutation was localized to 1.2 kb (0.360 to 0.368), giving an average separation between the mutations of 2.5 kb on the 150-kb genome. DNA sequences that specify a functional domain for virus entry were localized to the nucleotide sequences between the two mutations. All three functions could be encoded by the virus gene specifying the gB glycoprotein.
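The interval sizes in this abstract follow from treating map coordinates as fractions of the genome length. The sketch below assumes the 150-kb genome size stated in the abstract and a simple linear mapping; `interval_kb` is a hypothetical helper name, not something from the study.

```python
GENOME_KB = 150  # genome size stated in the abstract


def interval_kb(start, end, genome_kb=GENOME_KB):
    """Convert a pair of fractional map coordinates into a length in kb."""
    return round((end - start) * genome_kb, 2)


# Intervals quoted in the abstract.
print(interval_kb(0.345, 0.355))  # region rescuing the syncytial phenotype
print(interval_kb(0.360, 0.368))  # region of the temperature-sensitive mutation
print(interval_kb(0.315, 0.421))  # EcoRI fragment F
```

Under this assumption the first two intervals come out at 1.5 kb and 1.2 kb, matching the sizes given in the abstract.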
Evolution and dynamics of megaplasmids with genome sizes larger than 100 kb in the Bacillus cereus group.

PubMed

Zheng, Jinshui; Peng, Donghai; Ruan, Lifang; Sun, Ming

2013-12-02

Plasmids play a crucial role in the evolution of bacterial genomes by mediating horizontal gene transfer. However, the origin and evolution of most plasmids remains unclear, especially for megaplasmids. Strains of the Bacillus cereus group contain up to 13 plasmids with genome sizes ranging from 2 kb to 600 kb, and thus can be used to study plasmid dynamics and evolution. This work studied the origin and evolution of 31 B. cereus group megaplasmids (>100 kb) focusing on the most conserved regions on plasmids, minireplicons. Sixty-five putative minireplicons were identified and classified to six types on the basis of proteins that are essential for replication. Twenty-nine of the 31 megaplasmids contained two or more minireplicons. Phylogenetic analysis of the protein sequences showed that different minireplicons on the same megaplasmid have different evolutionary histories. Therefore, we speculated that these megaplasmids are the results of fusion of smaller plasmids. All plasmids of a bacterial strain must be compatible. In megaplasmids of the B. cereus group, individual minireplicons of different megaplasmids in the same strain belong to different types or subtypes. Thus, the subtypes of each minireplicon they contain may determine the incompatibilities of megaplasmids. A broader analysis of all 1285 bacterial plasmids with putative known minireplicons whose complete genome sequences were available from GenBank revealed that 34% (443 plasmids) of the plasmids have two or more minireplicons. This indicates that plasmid fusion events are general among bacterial plasmids. Megaplasmids of B. cereus group are fusion of smaller plasmids, and the fusion of plasmids likely occurs frequently in the B. cereus group and in other bacterial taxa. Plasmid fusion may be one of the major mechanisms for formation of novel megaplasmids in the evolution of bacteria.
A high HIV-1 strain variability in London, UK, revealed by full-genome analysis: Results from the ICONIC project.

PubMed

Yebra, Gonzalo; Frampton, Dan; Gallo Cassarino, Tiziano; Raffle, Jade; Hubb, Jonathan; Ferns, R Bridget; Waters, Laura; Tong, C Y William; Kozlakidis, Zisis; Hayward, Andrew; Kellam, Paul; Pillay, Deenan; Clark, Duncan; Nastouli, Eleni; Leigh Brown, Andrew J

2018-01-01

The ICONIC project has developed an automated high-throughput pipeline to generate HIV nearly full-length genomes (NFLG, i.e. from gag to nef) from next-generation sequencing (NGS) data. The pipeline was applied to 420 HIV samples collected at University College London Hospitals NHS Trust and Barts Health NHS Trust (London) and sequenced using an Illumina MiSeq at the Wellcome Trust Sanger Institute (Cambridge). Consensus genomes were generated and subtyped using COMET, and unique recombinants were studied with jpHMM and SimPlot. Maximum-likelihood phylogenetic trees were constructed using RAxML to identify transmission networks using the Cluster Picker. The pipeline generated sequences of at least 1Kb of length (median = 7.46Kb, IQR = 4.01Kb) for 375 out of the 420 samples (89%), with 174 (46.4%) being NFLG. A total of 365 sequences (169 of them NFLG) corresponded to unique subjects and were included in the down-stream analyses. The most frequent HIV subtypes were B (n = 149, 40.8%) and C (n = 77, 21.1%) and the circulating recombinant form CRF02_AG (n = 32, 8.8%). We found 14 different CRFs (n = 66, 18.1%) and multiple URFs (n = 32, 8.8%) that involved recombination between 12 different subtypes/CRFs. The most frequent URFs were B/CRF01_AE (4 cases) and A1/D, B/C, and B/CRF02_AG (3 cases each). Most URFs (19/26, 73%) lacked breakpoints in the PR+RT pol region, rendering them undetectable if only that was sequenced. Twelve (37.5%) of the URFs could have emerged within the UK, whereas the rest were probably imported from sub-Saharan Africa, South East Asia and South America. For 2 URFs we found highly similar pol sequences circulating in the UK. We detected 31 phylogenetic clusters using the full dataset: 25 pairs (mostly subtypes B and C), 4 triplets and 2 quadruplets. Some of these were not consistent across different genes due to inter- and intra-subtype recombination. Clusters involved 70 sequences, 19.2% of the dataset. 
The initial analysis of genome sequences detected substantial hidden variability in the London HIV epidemic. Analysing full genome sequences, as opposed to only PR+RT, identified previously undetected recombinants. It provided a more reliable description of CRFs (that would be otherwise misclassified) and transmission clusters.
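The proportions reported in this abstract can be re-derived from the counts it gives. The sketch below is only an arithmetic check; the helper `pct` and the variable names are hypothetical, and each denominator (420 samples, 375 sequences, 365 unique subjects) is taken from the abstract text.

```python
def pct(part, whole, digits=1):
    """Percentage of `part` in `whole`, rounded to `digits` decimal places."""
    return round(100 * part / whole, digits)


SAMPLES, SEQUENCED, UNIQUE = 420, 375, 365

print(pct(SEQUENCED, SAMPLES, 0))  # samples yielding >= 1 kb of sequence
print(pct(174, SEQUENCED))         # near full-length genomes
print(pct(149, UNIQUE))            # subtype B
print(pct(77, UNIQUE))             # subtype C
print(pct(32, UNIQUE))             # CRF02_AG

# Cluster sizes reported: 25 pairs, 4 triplets, 2 quadruplets.
clusters = {2: 25, 3: 4, 4: 2}
clustered = sum(size * count for size, count in clusters.items())
print(sum(clusters.values()), clustered, pct(clustered, UNIQUE))
```

The cluster counts sum to 31 clusters and 70 sequences, consistent with the 19.2% of the dataset stated in the abstract.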
Scanning the Effects of Ethyl Methanesulfonate on the Whole Genome of Lotus japonicus Using Second-Generation Sequencing Analysis

PubMed Central

Mohd-Yusoff, Nur Fatihah; Ruperao, Pradeep; Tomoyoshi, Nurain Emylia; Edwards, David; Gresshoff, Peter M.; Biswas, Bandana; Batley, Jacqueline

2015-01-01

Genetic structure can be altered by chemical mutagenesis, which is a common method applied in molecular biology and genetics. Second-generation sequencing provides a platform to reveal base alterations occurring in the whole genome due to mutagenesis. A model legume, Lotus japonicus ecotype Miyakojima, was chemically mutated with alkylating ethyl methanesulfonate (EMS) for the scanning of DNA lesions throughout the genome. Using second-generation sequencing, two individually mutated third-generation progeny (M3, named AM and AS) were sequenced and analyzed to identify single nucleotide polymorphisms and reveal the effects of EMS on nucleotide sequences in these mutant genomes. Single-nucleotide polymorphisms were found once every 208 kb (AS) and 202 kb (AM), with a mutation bias toward G/C-to-A/T changes at a low percentage. Most mutations were intergenic. The mutation spectrum of the genomes was comparable in their individual chromosomes; however, each mutated genome has unique alterations, which are useful to identify causal mutations for their phenotypic changes. The data obtained demonstrate that whole genomic sequencing is applicable as a high-throughput tool to investigate genomic changes due to mutagenesis. The identification of these single-point mutations will facilitate the identification of phenotypically causative mutations in EMS-mutated germplasm. PMID:25660167
An exploration of the sequence of a 2.9-Mb region of the genome of Drosophila melanogaster: the Adh region.

PubMed Central

Ashburner, M; Misra, S; Roote, J; Lewis, S E; Blazej, R; Davis, T; Doyle, C; Galle, R; George, R; Harris, N; Hartzell, G; Harvey, D; Hong, L; Houston, K; Hoskins, R; Johnson, G; Martin, C; Moshrefi, A; Palazzolo, M; Reese, M G; Spradling, A; Tsang, G; Wan, K; Whitelaw, K; Celniker, S

1999-01-01

A contiguous sequence of nearly 3 Mb from the genome of Drosophila melanogaster has been sequenced from a series of overlapping P1 and BAC clones. This region covers 69 chromosome polytene bands on chromosome arm 2L, including the genetically well-characterized "Adh region." A computational analysis of the sequence predicts 218 protein-coding genes, 11 tRNAs, and 17 transposable element sequences. At least 38 of the protein-coding genes are arranged in clusters of from 2 to 6 closely related genes, suggesting extensive tandem duplication. The gene density is one protein-coding gene every 13 kb; the transposable element density is one element every 171 kb. Of 73 genes in this region identified by genetic analysis, 49 have been located on the sequence; P-element insertions have been mapped to 43 genes. Ninety-five (44%) of the known and predicted genes match a Drosophila EST, and 144 (66%) have clear similarities to proteins in other organisms. Genes known to have mutant phenotypes are more likely to be represented in cDNA libraries, and far more likely to have products similar to proteins of other organisms, than are genes with no known mutant phenotype. Over 650 chromosome aberration breakpoints map to this chromosome region, and their nonrandom distribution on the genetic map reflects variation in gene spacing on the DNA. This is the first large-scale analysis of the genome of D. melanogaster at the sequence level. In addition to the direct results obtained, this analysis has allowed us to develop and test methods that will be needed to interpret the complete sequence of the genome of this species. Before beginning a Hunt, it is wise to ask someone what you are looking for before you begin looking for it. Milne 1926 PMID:10471707
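The density figures in this abstract can be approximated from the numbers it reports. The sketch below assumes the region length is the 2.9 Mb given in the title and that the percentages use the 218 predicted protein-coding genes as the denominator; both are assumptions for illustration, and `mean_spacing_kb` is a hypothetical helper name.

```python
REGION_KB = 2900   # 2.9 Mb, from the record title (assumed exact here)
GENES = 218        # predicted protein-coding genes
TRANSPOSONS = 17   # transposable element sequences


def mean_spacing_kb(region_kb, feature_count):
    """Average number of kb of sequence per annotated feature."""
    return region_kb / feature_count


gene_spacing = mean_spacing_kb(REGION_KB, GENES)
te_spacing = mean_spacing_kb(REGION_KB, TRANSPOSONS)
print(round(gene_spacing), round(te_spacing))

# Share of genes with an EST match and with similarity to other organisms,
# assuming 218 genes is the denominator for both.
est_share = round(100 * 95 / GENES)
similarity_share = round(100 * 144 / GENES)
print(est_share, similarity_share)
```

With these assumptions the rounded values agree with the abstract: about one gene every 13 kb, one transposable element every 171 kb, and shares of 44% and 66%.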
Complete Genome Sequences of Salmonella enterica Serovars Anatum and Anatum var. 15+, Isolated from Retail Ground Turkey

PubMed Central

Marasini, Daya; Abo-Shama, Usama H.

2016-01-01

The complete genome sequences of two isolates of Salmonella enterica serovars Anatum and Anatum var. 15+ revealed the presence of two plasmids of 112 kb and 3 kb in size in each. The chromosome of Salmonella Anatum (4.83 Mb) was slightly smaller than that of Salmonella Anatum var. 15+ (4.88 Mb). PMID:26798111
Complementary DNA characterization and chromosomal localization of a human gene related to the poliovirus receptor-encoding gene.

PubMed

Lopez, M; Eberlé, F; Mattei, M G; Gabert, J; Birg, F; Bardin, F; Maroc, C; Dubreuil, P

1995-04-03

The human poliovirus (PV) receptor (PVR) is a member of the immunoglobulin (Ig) superfamily with unknown cellular function. We have isolated a human PVR-related (PRR) cDNA. The deduced amino acid (aa) sequence of PRR showed, in the extracellular region, 51.7 and 54.3% similarity with human PVR and with the murine PVR homolog, respectively. The cDNA coding sequence is 1.6-kb long and encodes a deduced 57-kDa protein; this protein has a structural organization analogous to that of PVR, that is, one V- and two C-set Ig domains, with a conserved number of aa. Northern blot analysis indicated that a major 5.9-kb transcript is present in all normal human tissues tested. In situ hybridization showed that the PRR gene is located at bands q23-q24 of human chromosome 11.
Cloning and Characterization of the Pyrrolomycin Biosynthetic Gene Clusters from Actinosporangium vitaminophilum ATCC 31673 and Streptomyces sp. Strain UC 11065

PubMed Central

Zhang, Xiujun; Parry, Ronald J.

2007-01-01

The pyrrolomycins are a family of polyketide antibiotics, some of which contain a nitro group. To gain insight into the nitration mechanism associated with the formation of these antibiotics, the pyrrolomycin biosynthetic gene cluster from Actinosporangium vitaminophilum was cloned. Sequencing of ca. 56 kb of A. vitaminophilum DNA revealed 35 open reading frames (ORFs). Sequence analysis revealed a clear relationship between some of these ORFs and the biosynthetic gene cluster for pyoluteorin, a structurally related antibiotic. Since a gene transfer system could not be devised for A. vitaminophilum, additional proof for the identity of the cloned gene cluster was sought by cloning the pyrrolomycin gene cluster from Streptomyces sp. strain UC 11065, a transformable pyrrolomycin producer. Sequencing of ca. 26 kb of UC 11065 DNA revealed the presence of 17 ORFs, 15 of which exhibit strong similarity to ORFs in the A. vitaminophilum cluster as well as a nearly identical organization. Single-crossover disruption of two genes in the UC 11065 cluster abolished pyrrolomycin production in both cases. These results confirm that the genetic locus cloned from UC 11065 is essential for pyrrolomycin production, and they also confirm that the highly similar locus in A. vitaminophilum encodes pyrrolomycin biosynthetic genes. Sequence analysis revealed that both clusters contain genes encoding the two components of an assimilatory nitrate reductase. This finding suggests that nitrite is required for the formation of the nitrated pyrrolomycins. However, sequence analysis did not provide additional insights into the nitration process, suggesting the operation of a novel nitration mechanism. PMID:17158935
Composite conserved promoter-terminator motifs (PeSLs) that mediate modular shuffling in the diverse T4-like myoviruses.

PubMed

Comeau, André M; Arbiol, Christine; Krisch, Henry M

2014-06-19

The diverse T4-like phages (Tquatrovirinae) infect a wide array of gram-negative bacterial hosts. The genome architecture of these phages is generally well conserved, most of the phylogenetically variable genes being grouped together in a series of hyperplastic regions (HPRs) that are interspersed among large blocks of conserved core genes. Recent evidence from a pair of closely related T4-like phages has suggested that small, composite terminator/promoter sequences (promoter-early stem loop [PeSLs]) were implicated in mediating the high levels of genetic plasticity by indels occurring within the HPRs. Here, we present the genome sequence analysis of two T4-like phages, PST (168 kb, 272 open reading frames [ORFs]) and nt-1 (248 kb, 405 ORFs). These two phages were chosen for comparative sequence analysis because, although they are closely related to phages that have been previously sequenced (T4 and KVP40, respectively), they have different host ranges. In each case, one member of the pair infects a bacterial strain that is a human pathogen, whereas the other phage's host is a nonpathogen. Despite belonging to phylogenetically distant branches of the T4-likes, these pairs of phage have diverged from each other in part by a mechanism apparently involving PeSL-mediated recombination. This analysis confirms a role of PeSL sequences in the generation of genomic diversity by serving as a point of genetic exchange between otherwise unrelated sequences within the HPRs. Finally, the palette of divergent genes swapped by PeSL-mediated homologous recombination is discussed in the context of the PeSLs' potentially important role in facilitating phage adaption to new hosts and environments. © The Author(s) 2014. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Comparative genomic analysis of the false killer whale (Pseudorca crassidens) LMBR1 locus.

PubMed

Kim, Dae-Won; Choi, Sang-Haeng; Kim, Ryong Nam; Kim, Sun-Hong; Paik, Sang-Gi; Nam, Seong-Hyeuk; Kim, Dong-Wook; Kim, Aeri; Kang, Aram; Park, Hong-Seog

2010-09-01

The sequencing and comparative genomic analysis of LMBR1 loci in mammals or other species, including human, would be very important in understanding evolutionary genetic changes underlying the evolution of limb development. In this regard, comparative genomic annotation of the false killer whale LMBR1 locus could shed new light on the evolution of limb development. We sequenced two false killer whale BAC clones, corresponding to 156 kb and 144 kb, respectively, harboring the tightly linked RNF32, LMBR1, and NOM1 genes. Our annotation of the false killer whale LMBR1 gene showed that it consists of 17 exons (1473 bp), in contrast to 18 exons (1596 bp) in human, and it displays 93.1% and 95.6% nucleotide and amino acid sequence similarity, respectively, compared with the human gene. In particular, we discovered that exon 10, deleted in the false killer whale LMBR1 gene, is present only in primates, and this fact strongly implies that exon 10 might be crucial in determining primate-specific limb development. ZRS and TFBS sequences have been well conserved across 11 species, suggesting that these regions could be involved in an important function of limb development and limb patterning. The neighboring gene RNF32 showed several lineage-conserved exons, such as exons 2 through 9 conserved in eutherian mammals, exons 3 through 9 conserved in mammals, and exons 5 through 9 conserved in vertebrates. The other neighboring gene, NOM1, had undergone a substitution (ATG→GTA) at the start codon, giving rise to a 36 bp shorter N-terminal sequence compared with the human sequence. Our comparative analysis of the false killer whale LMBR1 genomic locus provides important clues regarding the genetic regions that may play crucial roles in limb development and patterning.
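The lengths quoted in this abstract can be checked for reading-frame consistency with a few lines of Python. This is only a sketch: it assumes that the figures in parentheses (1473 bp and 1596 bp) are coding-sequence lengths, which the abstract does not state explicitly, and the helper name `codons` is illustrative.

```python
def codons(length_bp: int) -> int:
    """Return the number of whole codons in a sequence of the given length."""
    if length_bp % 3 != 0:
        raise ValueError(f"{length_bp} bp is not a whole number of codons")
    return length_bp // 3

# Lengths reported in the abstract (assumed here to be coding sequence).
whale_bp = 1473          # false killer whale LMBR1, 17 exons
human_bp = 1596          # human LMBR1, 18 exons
nom1_shortening_bp = 36  # reported N-terminal shortening of NOM1

whale_codons = codons(whale_bp)
human_codons = codons(human_bp)
difference_bp = human_bp - whale_bp
difference_codons = codons(difference_bp)
nom1_codons = codons(nom1_shortening_bp)

print(f"whale: {whale_codons} codons, human: {human_codons} codons")
print(f"difference: {difference_bp} bp = {difference_codons} codons")
print(f"NOM1 shortening: {nom1_codons} codons")
```

Under that assumption the 123 bp difference between the two genes is a whole number of codons, so a deletion of that size would leave the downstream reading frame intact; the same holds for the 36 bp shortening reported for NOM1.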
Analysis of the 9p21.3 sequence associated with coronary artery disease reveals a tendency for duplication in a CAD patient

PubMed Central

Kouprina, Natalay; Noskov, Vladimir N.; Waterfall, Joshua J.; Walker, Robert L.; Meltzer, Paul S.; Topol, Eric J.; Larionov, Vladimir

2018-01-01

Tandem segmental duplications (SDs) greater than 10 kb are widespread in complex genomes. They provide material for gene divergence and evolutionary adaptation, while formation of specific de novo SDs is a hallmark of cancer and some human diseases. Most SDs map to distinct genomic regions termed ‘duplication blocks’. SD organization within these blocks is often poorly characterized as they are mosaics of ancestral duplicons juxtaposed with younger duplicons arising from more recent duplication events. Structural and functional analysis of SDs is further hampered as long repetitive DNA structures are underrepresented in existing BAC and YAC libraries. We applied Transformation-Associated Recombination (TAR) cloning, a versatile technique for large DNA manipulation, to selectively isolate the coronary artery disease (CAD) interval sequence within the 9p21.3 chromosome locus from a patient with coronary artery disease and normal individuals. Four tandem head-to-tail duplicons, each ∼50 kb long, were recovered in the patient but not in normal individuals. Sequence analysis revealed that the repeats varied by 10-15 SNPs between each other and by 82 SNPs from the human genome sequence (version hg19). SNP polymorphism within the junctions between repeats allowed two junction types to be distinguished, Type 1 and Type 2, which were found at a 2:1 ratio. The junction sequences contained an Alu element, a sequence previously shown to play a role in duplication. Knowledge of structural variation in the CAD interval from more patients could help link this locus to cardiovascular disease susceptibility, and may be relevant to other cases of regional amplification, including cancer. PMID:29632643

Physical map location of the multicopy genes coding for ammonia monooxygenase and hydroxylamine oxidoreductase in the ammonia-oxidizing bacterium Nitrosomonas sp. strain ENI-11.

PubMed

Hirota, R; Yamagata, A; Kato, J; Kuroda, A; Ikeda, T; Takiguchi, N; Ohtake, H

2000-02-01

Pulsed-field gel electrophoresis of PmeI digests of the Nitrosomonas sp. strain ENI-11 chromosome produced four bands ranging from 1,200 to 480 kb in size. Southern hybridizations suggested that a 487-kb PmeI fragment contained two copies of the amoCAB genes, coding for ammonia monooxygenase (designated amoCAB(1) and amoCAB(2)), and three copies of the hao gene, coding for hydroxylamine oxidoreductase (hao(1), hao(2), and hao(3)). In this DNA fragment, amoCAB(1) and amoCAB(2) were about 390 kb apart, while hao(1), hao(2), and hao(3) were separated by at least about 100 kb from each other. Interestingly, hao(1) and hao(2) were located relatively close to amoCAB(1) and amoCAB(2), respectively. DNA sequence analysis revealed that hao(1) and hao(2) shared 160 identical nucleotides immediately upstream of each translation initiation codon. However, hao(3) showed only 30% nucleotide identity in the 160-bp corresponding region.
Physical Map Location of the Multicopy Genes Coding for Ammonia Monooxygenase and Hydroxylamine Oxidoreductase in the Ammonia-Oxidizing Bacterium Nitrosomonas sp. Strain ENI-11

PubMed Central

Hirota, Ryuichi; Yamagata, Akira; Kato, Junichi; Kuroda, Akio; Ikeda, Tsukasa; Takiguchi, Noboru; Ohtake, Hisao

2000-01-01

Pulsed-field gel electrophoresis of PmeI digests of the Nitrosomonas sp. strain ENI-11 chromosome produced four bands ranging from 1,200 to 480 kb in size. Southern hybridizations suggested that a 487-kb PmeI fragment contained two copies of the amoCAB genes, coding for ammonia monooxygenase (designated amoCAB1 and amoCAB2), and three copies of the hao gene, coding for hydroxylamine oxidoreductase (hao1, hao2, and hao3). In this DNA fragment, amoCAB1 and amoCAB2 were about 390 kb apart, while hao1, hao2, and hao3 were separated by at least about 100 kb from each other. Interestingly, hao1 and hao2 were located relatively close to amoCAB1 and amoCAB2, respectively. DNA sequence analysis revealed that hao1 and hao2 shared 160 identical nucleotides immediately upstream of each translation initiation codon. However, hao3 showed only 30% nucleotide identity in the 160-bp corresponding region. PMID:10633121
Genomic organization of the 260 kb surrounding the waxy locus in a Japonica rice

PubMed

Nagano; Wu; Kawasaki; Kishima; Sano

1999-12-01

The present study was carried out to characterize the molecular organization in the vicinity of the waxy locus in rice. To determine the structural organization of the region surrounding waxy, contiguous clones covering a total of 260 kb were constructed using a bacterial artificial chromosome (BAC) library from the Shimokita variety of Japonica rice. This map also contains 200 overlapping subclones, which allowed construction of a fine physical map with a total of 64 HindIII sites. During the course of constructing the map, we noticed the presence of some repeated regions which might be related to transposable elements. We divided the 260-kb region into 60 segments (average size of 5.7 kb) to use as probes to determine their genomic organization. Hybridization patterns obtained by probing with these segments were classified into four types: class 1, a single or a few bands without a smeared background; class 2, a single or a few bands with a smeared background; class 3, multiple discrete bands without a smeared background; and class 4, only a smeared background. These classes constituted 6.5%, 20.9%, 3.7%, and 68.9% of the 260-kb region, respectively. The distribution of each class revealed that repetitive sequences are a major component in this region, as expected, and that unique sequence regions were mostly no longer than 6 kb due to interruption by repetitive sequences. We discuss how the map constructed here might be a powerful tool for characterization and comparison of the genome structures and the genes around the waxy locus in the Oryza species.
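To make the class proportions in this abstract more concrete, the short Python sketch below converts each reported percentage of the 260-kb region into an approximate length in kb. The dictionary keys and variable names are illustrative; only the percentages and the region size come from the abstract.

```python
region_kb = 260.0

# Share of the 260-kb region assigned to each hybridization class (from the abstract).
class_share_percent = {
    "class 1": 6.5,   # single or few bands, no smear
    "class 2": 20.9,  # single or few bands, smeared background
    "class 3": 3.7,   # multiple discrete bands, no smear
    "class 4": 68.9,  # smeared background only
}

# Convert each percentage into kilobases of the region.
class_kb = {name: region_kb * pct / 100 for name, pct in class_share_percent.items()}
total_percent = sum(class_share_percent.values())

for name, kb in class_kb.items():
    print(f"{name}: {kb:.1f} kb")
print(f"total: {total_percent:.1f}%")
```

The four shares add up to 100%, and class 4 alone corresponds to roughly 179 kb, which is in line with the statement that repetitive sequences are the major component of the region.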
Characterization of a human X-linked gene from the DXS732E locus in the candidate region for the anhidrotic ectodermal dysplasia (EDA) gene (Xq13.1)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gault, J.; Zonana, J.; Zeltinger, J.

A conserved mouse genomic clone was used to identify a homologous human genomic clone (the DXS732E locus), which was subsequently employed to isolate cDNAs from a human fetal brain library. Nine unique overlapping cDNAs were isolated, and sequence analysis of 3.9 kb identified a putative 1 kb ORF. GRAIL analysis of the sequence supported the hypothesis that the putative ORF was coding sequence, and Prosite analysis of the putative ORF identified potential glycosylation and phosphorylation sites. The 5′ end of the gene maps within a CpG island, and comparison of cDNA sequences indicates that the gene is alternatively spliced at its 3′ end. Northern analysis and RT-PCR indicate that two messages of different sizes are expressed, with the gene expressed in human fetal kidney, intestine, brain, and muscle. The gene is expressed in 77-day human skin, a time when hair follicle formation occurs. Anhidrotic ectodermal dysplasia (EDA) results in the abnormal morphogenesis of hair, teeth and eccrine sweat glands. A positional cloning strategy towards cloning the EDA gene had been used, and deletion and X-autosome translocation patients have been useful in further delimiting the EDA region. The present gene at the DXS732E locus is partially deleted in one EDA patient who does not have other apparent abnormalities. No rearrangements of the gene have been detected in two female X-autosome translocation EDA patients, nor in four additional male patients with submicroscopic molecular deletions.
Disease-Causing 7.4 kb Cis-Regulatory Deletion Disrupting Conserved Non-Coding Sequences and Their Interaction with the FOXL2 Promotor: Implications for Mutation Screening

PubMed Central

Dostie, Josée; Lemire, Edmond; Bouchard, Philippe; Field, Michael; Jones, Kristie; Lorenz, Birgit; Menten, Björn; Buysse, Karen; Pattyn, Filip; Friedli, Marc; Ucla, Catherine; Rossier, Colette; Wyss, Carine; Speleman, Frank; De Paepe, Anne; Dekker, Job; Antonarakis, Stylianos E.; De Baere, Elfride

2009-01-01

To date, the contribution of disrupted potentially cis-regulatory conserved non-coding sequences (CNCs) to human disease is most likely underestimated, as no systematic screens for putative deleterious variations in CNCs have been conducted. As a model for monogenic disease we studied the involvement of genetic changes of CNCs in the cis-regulatory domain of FOXL2 in blepharophimosis syndrome (BPES). Fifty-seven molecularly unsolved BPES patients underwent high-resolution copy number screening and targeted sequencing of CNCs. Apart from three larger distant deletions, a de novo deletion as small as 7.4 kb was found at 283 kb 5′ to FOXL2. The deletion appeared to be triggered by an H-DNA-induced double-stranded break (DSB). In addition, it disrupts a novel long non-coding RNA (ncRNA) PISRT1 and 8 CNCs. The regulatory potential of the deleted CNCs was substantiated by in vitro luciferase assays. Interestingly, Chromosome Conformation Capture (3C) of a 625 kb region surrounding FOXL2 in expressing cellular systems revealed physical interactions of three upstream fragments and the FOXL2 core promoter. Importantly, one of these contains the 7.4 kb deleted fragment. Overall, this study revealed the smallest distant deletion causing monogenic disease and impacts upon the concept of mutation screening in human disease and developmental disorders in particular. PMID:19543368
Sonication-based isolation and enrichment of Chlorella protothecoides chloroplasts for Illumina genome sequencing

DOE Office of Scientific and Technical Information (OSTI.GOV)

Angelova, Angelina; Park, Sang-Hycuk; Kyndt, John

2013-09-01

With the increasing world demand for biofuel, a number of oleaginous algal species are being considered as renewable sources of oil. Chlorella protothecoides Krüger synthesizes triacylglycerols (TAGs) as storage compounds that can be converted into renewable fuel utilizing an anabolic pathway that is poorly understood. The paucity of algal chloroplast genome sequences has been an important constraint to chloroplast transformation and for studying gene expression in TAG pathways. In this study, intact chloroplasts were released from algal cells using sonication followed by sucrose gradient centrifugation, resulting in a 2.36-fold enrichment of chloroplasts from C. protothecoides, based on qPCR analysis. The C. protothecoides chloroplast genome (cpDNA) was determined using the Illumina HiSeq 2000 sequencing platform and found to be 84,576 bp (approximately 84.6 kb) in size, with a GC content of 30.8%. This is the first report of an optimized protocol that uses a sonication step, followed by sucrose gradient centrifugation, to release and enrich intact chloroplasts from a microalga (C. protothecoides) of sufficient quality to permit chloroplast genome sequencing with high coverage, while minimizing nuclear genome contamination. The approach is expected to guide chloroplast isolation from other oleaginous algal species for a variety of uses that benefit from enrichment of chloroplasts, ranging from biochemical analysis to genomics studies.
Structural and genetic analysis of a mutant of Rhodobacter sphaeroides WS8 deficient in hook length control.

PubMed Central

González-Pedrajo, B; Ballado, T; Campos, A; Sockett, R E; Camarena, L; Dreyfus, G

1997-01-01

Motility in the photosynthetic bacterium Rhodobacter sphaeroides is achieved by the unidirectional rotation of a single subpolar flagellum. In this study, transposon mutagenesis was used to obtain nonmotile flagellar mutants from this bacterium. We report here the isolation and characterization of a mutant that shows a polyhook phenotype. Morphological characterization of the mutant was done by electron microscopy. Polyhooks were obtained by shearing and were used to purify the hook protein monomer (FlgE). The apparent molecular mass of the hook protein was 50 kDa. N-terminal amino acid sequencing and comparisons with the hook proteins of other flagellated bacteria indicated that the Rhodobacter hook protein has consensus sequences common to axial flagellar components. A 25-kb fragment from an R. sphaeroides WS8 cosmid library restored wild-type flagellation and motility to the mutant. Using DNA adjacent to the inserted transposon as a probe, we identified a 4.6-kb SalI restriction fragment that contained the gene responsible for the polyhook phenotype. Nucleotide sequence analysis of this region revealed an open reading frame with a deduced amino acid sequence that was 23.4% identical to that of FliK of Salmonella typhimurium, the polypeptide responsible for hook length control in that enteric bacterium. The relevance of a gene homologous to fliK in the uniflagellated bacterium R. sphaeroides is discussed. PMID:9352903
Molecular and Genetic Characterization of the Drosophila Melanogaster 87e Actin Gene Region

PubMed Central

Manseau, L. J.; Ganetzky, B.; Craig, E. A.

1988-01-01

A combined molecular and genetic analysis of the 87E actin gene (Act87E) in Drosophila melanogaster was undertaken. A clone of Act87E was isolated and characterized. The Act87E transcription unit is 1.57 kb and includes a 556-base intervening sequence in the 5' leader of the gene. The protein-coding region is contiguous and encodes a protein that is >93% identical to the other Drosophila actins. By in situ hybridization with a series of deficiencies that break in 87E, Act87E was localized to a region encompassing one to three faint, polytene chromosome bands. The region between the deficiency endpoints that flank the actin gene was isolated and measures approximately 24-30 kb. The closest proximal deficiency endpoint lies 8-10 kb 5' to the actin gene; the closest distal deficiency endpoint lies 16-20 kb 3' to the actin gene. A single, recessive lethal complementation group lies between the deficiency endpoints that flank the actin gene. An EMS mutagenesis screen produced four additional members of this recessive lethal complementation group. Molecular analysis of the members of this complementation group indicated that two of the newly induced mutations have deletions of approximately 1 kb in a transcribed region 4-5 kb 3' (distal) to the actin gene. This result suggests that the recessive lethal complementation group represents a gene separate from and distal to the actin gene. The mutagenesis screen failed to identify additional recessive lethal complementation groups in the actin gene-containing region. The implications of the failure to identify recessive lethal mutations in the actin gene are discussed in reference to studies of other conserved multigene families and other muscle protein mutations. PMID:2840338
Isolation and characterization of Y chromosome sequences from the African malaria mosquito Anopheles gambiae.

PubMed Central

Krzywinski, Jaroslaw; Nusskern, Deborah R; Kern, Marcia K; Besansky, Nora J

2004-01-01

The karyotype of the African malaria mosquito Anopheles gambiae contains two pairs of autosomes and a pair of sex chromosomes. The Y chromosome, constituting approximately 10% of the genome, remains virtually unexplored, despite the recent completion of the A. gambiae genome project. Here we report the identification and characterization of Y chromosome sequences of total length approaching 150 kb. We developed 11 Y-specific PCR markers that consistently yielded male-specific products in specimens from both laboratory colony and natural populations. The markers are characterized by low sequence polymorphism in samples collected across Africa and by presence in more than one copy on the Y. Screening of the A. gambiae BAC library using these markers allowed detection of 90 Y-linked BAC clones. Analysis of the BAC sequences and other Y-derived fragments showed massive accumulation of a few transposable elements. Nevertheless, more complex sequences are apparently present on the Y; these include portions of an approximately 48-kb-long unmapped AAAB01008227 scaffold from the whole genome shotgun assembly. Anopheles Y appears not to harbor any of the genes identified in Drosophila Y. However, experiments suggest that one of the ORFs from the AAAB01008227 scaffold represents a fragment of a gene with male-specific expression. PMID:15082548
Characterization of the replication region of the Bacillus subtilis plasmid pLS20: a novel type of replicon.

PubMed Central

Meijer, W J; de Boer, A J; van Tongeren, S; Venema, G; Bron, S

1995-01-01

A 3.1 kb fragment of the large (approximately 55 kb) Bacillus subtilis plasmid pLS20 containing all the information for autonomous replication was cloned and sequenced. In contrast to the parental plasmid, derived minireplicons were unstably maintained. Using deletion analysis the fragment essential and sufficient for replication was delineated to 1.1 kb. This 1.1 kb fragment is located between two divergently transcribed genes, denoted orfA and orfB, neither of which is required for replication. orfA shows homology to the B. subtilis chromosomal genes rapA (spoOL, gsiA) and rapB (spoOP). The 1.1 kb fragment, which is characterized by the presence of several regions of dyad symmetry, contains no open reading frames of more than 85 codons and shows no similarity with other known plasmid replicons. The structural organization of the pLS20 minimal replicon is entirely different from that of typical rolling circle plasmids from Gram-positive bacteria. The pLS20 minireplicons replicate in polA5 and recA4 B. subtilis strains. Taken together, these results strongly suggest that pLS20 belongs to a new class of theta replicons. PMID:7667098
Human Hrs, a tyrosine kinase substrate in growth factor-stimulated cells: cDNA cloning and mapping of the gene to chromosome 17.

PubMed

Lu, L; Komada, M; Kitamura, N

1998-06-15

Hrs is a 115-kDa zinc finger protein which is rapidly tyrosine phosphorylated in cells stimulated with various growth factors. We previously purified the protein from a mouse cell line and cloned its cDNA. In the present study, we cloned a human Hrs cDNA from a human placenta cDNA library by cross-hybridization, using the mouse cDNA as a probe, and determined its nucleotide sequence. The human Hrs cDNA encoded a 777-amino-acid protein whose sequence was 93% identical to that of mouse Hrs. Northern blot analysis showed that the Hrs mRNA was about 3.0 kb long and was expressed in all the human adult and fetal tissues tested. In addition, we showed by genomic Southern blot analysis that the human Hrs gene was a single-copy gene with a size of about 20 kb. Furthermore, the human Hrs gene was mapped to chromosome 17 by Southern blotting of genomic DNAs from human/rodent somatic cell hybrids. Copyright 1998 Elsevier Science B.V. All rights reserved.
Molecular characterization of class 1 integrons from Irish thermophilic Campylobacter spp.

PubMed

O'Halloran, Fiona; Lucey, Brigid; Cryan, Bartley; Buckley, Tom; Fanning, Séamus

2004-06-01

In this study a large random collection (n = 378) of Irish thermophilic Campylobacter isolates was investigated for the presence of integrons, genetic elements associated with the dissemination of antimicrobial resistance. Purified genomic DNA from each isolate was analysed by PCR for the presence of class 1 integrons. Four gene cassette-associated amplicons were completely characterized. Sixty-two of the isolates possessed a complete class 1 integron with a recombined gene cassette located within a 1.0 kb amplicon containing an aadA2 gene. This cassette was present in both Campylobacter jejuni and Campylobacter coli isolates and following sequence analysis was shown to be similar to sequences recently reported in Salmonella enterica Hadar and on an 85 kb plasmid conferring quinolone resistance in Escherichia coli. Aminoglycoside aadA2-encoding class 1 integrons were identified among unrelated Campylobacter spp. Amino acid sequence comparisons revealed identical structures in both Salmonella and E. coli. The presence of class 1 integrons in Campylobacter spp. may be significant should these organisms enter the food chain and especially when antimicrobial treatment for severe infections is being considered.
Extracellular proteins of Vibrio cholerae: molecular cloning, nucleotide sequence and characterization of the deoxyribonuclease (DNase) together with its periplasmic localization in Escherichia coli K-12.

PubMed

Focareta, T; Manning, P A

1987-01-01

The gene encoding the extracellular DNase of Vibrio cholerae was cloned into Escherichia coli K-12. A maximal coding region of 1.2 kb and a minimal region of 0.6 kb were determined by transposon mutagenesis and deletion analysis. The nucleotide sequence of this region contained a single open reading frame of 690 bp corresponding to a protein of Mr 26,389 with a typical N-terminal signal sequence of 18 aa which, when removed, would give a mature protein of Mr 24,163. This is in good agreement with the size of 24 kDa, calculated directly by Coomassie blue staining following sodium dodecyl sulphate-polyacrylamide gel electrophoresis and indirectly via a DNA-hydrolysis assay. The protein is located in the periplasmic space of E. coli K-12 unlike in V. cholerae where it is excreted into the extracellular medium. The introduction of the DNase gene into a periplasmic (tolA) leaky mutant of E. coli K-12 facilitates the release of the protein, further confirming the periplasmic location.
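The sizes reported for the DNase can be cross-checked with simple arithmetic, sketched below in Python. One assumption is made that the abstract does not confirm: the 690 bp open reading frame is taken to include the stop codon, so the precursor is counted as one residue shorter than the number of codons. All variable names are illustrative.

```python
orf_bp = 690            # open reading frame length (from the abstract)
signal_peptide_aa = 18  # N-terminal signal sequence (from the abstract)
precursor_mr = 26389    # reported Mr of the precursor protein
mature_mr = 24163       # reported Mr of the mature protein

codon_count = orf_bp // 3
precursor_aa = codon_count - 1  # assumption: the 690 bp include the stop codon
mature_aa = precursor_aa - signal_peptide_aa

# Average mass per residue, as a rough plausibility check.
avg_precursor = precursor_mr / precursor_aa
avg_mature = mature_mr / mature_aa

print(f"{codon_count} codons, {precursor_aa} aa precursor, {mature_aa} aa mature")
print(f"average residue mass: {avg_precursor:.1f} Da (precursor), {avg_mature:.1f} Da (mature)")
```

Both averages come out close to 115 Da per residue, which is near the commonly used estimate of about 110 Da per amino acid, so the reported ORF length and the two Mr values are mutually consistent under the stated assumption.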
Nucleotide sequence of the Kaposi sarcoma-associated herpesvirus (HHV8)

PubMed Central

Russo, James J.; Bohenzky, Roy A.; Chien, Ming-Cheng; Chen, Jing; Yan, Ming; Maddalena, Dawn; Parry, J. Preston; Peruzzi, Daniela; Edelman, Isidore S.; Chang, Yuan; Moore, Patrick S.

1996-01-01

The genome of the Kaposi sarcoma-associated herpesvirus (KSHV or HHV8) was mapped with cosmid and phage genomic libraries from the BC-1 cell line. Its nucleotide sequence was determined except for a 3-kb region at the right end of the genome that was refractory to cloning. The BC-1 KSHV genome consists of a 140.5-kb-long unique coding region flanked by multiple G+C-rich 801-bp terminal repeat sequences. A genomic duplication that apparently arose in the parental tumor is present in this cell culture-derived strain. At least 81 ORFs, including 66 with homology to herpesvirus saimiri ORFs, and 5 internal repeat regions are present in the long unique region. The virus encodes homologs to complement-binding proteins, three cytokines (two macrophage inflammatory proteins and interleukin 6), dihydrofolate reductase, bcl-2, interferon regulatory factors, interleukin 8 receptor, neural cell adhesion molecule-like adhesin, and a D-type cyclin, as well as viral structural and metabolic proteins. Terminal repeat analysis of virus DNA from a KS lesion suggests a monoclonal expansion of KSHV in the KS tumor. PMID:8962146
Pea chloroplast tRNA(Lys) (UUU) gene: transcription and analysis of an intron-containing gene.

PubMed

Boyer, S K; Mullet, J E

1988-07-01

The pea chloroplast trnK gene which encodes tRNA(Lys) (UUU) was sequenced. TrnK is located 210 bp upstream from the promoter of psbA and immediately downstream from the 3'-end of rbcL. The gene is transcribed from the same DNA strand as psbA and rbcL. A 2447 bp intron with class II features is located in the trnK anticodon loop. The intron contains a 506 amino acid open reading frame which could encode an RNA maturase. The primary transcript of trnK is 2.9 kb long; its 5'-end was identified as a site of transcription initiation by in vitro transcription experiments. The 5'-terminus is adjacent to DNA sequences previously identified as transcription promoter elements. The most abundant trnK transcript is 2.5 kb long with termini corresponding to the 5' and 3' ends of the trnK exons. Intron specific RNAs were not detected. This suggests that RNA processing which produces tRNA(Lys) leads to rapid degradation of intron sequences.
Large-Scale Sequencing of Two Regions in Human Chromosome 7q22: Analysis of 650 kb of Genomic Sequence around the EPO and CUTL1 Loci Reveals 17 Genes

PubMed Central

Glöckner, Gernot; Scherer, Stephen; Schattevoy, Ruben; Boright, Andrew; Weber, Jacqueline; Tsui, Lap-Chee; Rosenthal, André

1998-01-01

We have sequenced and annotated two genomic regions located in the Giemsa negative band q22 of human chromosome 7. The first region defined by the erythropoietin (EPO) locus is 228 kb in length and contains 13 genes. Whereas 3 genes (GNB2, EPO, PCOLCE) were known previously on the mRNA level, we have been able to identify 10 novel genes using a newly developed automatic annotation tool RUMMAGE-DP, which comprises >26 different programs mainly for exon prediction, homology searches, and compositional and repeat analysis. For precise annotation we have also resequenced ESTs identified to the region and assembled them to build large cDNAs. In addition, we have investigated the differential splicing of genes. Using these tools we annotated 4 of the 10 genes as a zonadhesin, a transferrin homolog, a nucleoporin-like gene, and an actin gene. Two genes showed weak similarity to an insulin-like receptor and a neuronal protein with a leucine-rich amino-terminal domain. Four predicted genes (CDS1–CDS4) that have been confirmed on the mRNA level showed no similarity to known proteins, and a potential function could not be assigned. The second region in 7q22 defined by the CUTL1 (CCAAT displacement protein and its splice variant) locus is 416 kb in length and contains three known genes, including PMSL12, APS, CUTL1, and a novel gene (CDS5). The CUTL1 locus, consisting of two splice variants (CDP and CASP), occupies >300 kb. Based on the G,C profile an isochore switch can be defined between the CUTL1 gene and the APS and PMSL12 genes. [Clones 37G3, 164c7, and 235f8 are deposited in GenBank under accession no. AF053356; clone 123e15, accession no. AF024533; 186d2, accession no. AF024534; 46f6, accession no. AF006752; 50h2, accession no. AF047825; and 76h2, accession no. AF030453] PMID:9799793
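The totals in the title of this record (650 kb, 17 genes) can be reconciled with the per-region figures in the abstract using the small Python tally below. The data structure is illustrative; the count of four genes for the CUTL1 region is an interpretation of "three known genes ... and a novel gene (CDS5)".

```python
# Per-region figures taken from the abstract.
regions = {
    "EPO": {"length_kb": 228, "genes": 13},
    "CUTL1": {"length_kb": 416, "genes": 4},  # three known genes plus the novel CDS5
}

total_kb = sum(region["length_kb"] for region in regions.values())
total_genes = sum(region["genes"] for region in regions.values())

# Gene density per region, in genes per 100 kb.
density = {
    name: 100 * region["genes"] / region["length_kb"]
    for name, region in regions.items()
}

print(f"total sequence: {total_kb} kb, total genes: {total_genes}")
for name, value in density.items():
    print(f"{name}: {value:.1f} genes per 100 kb")
```

The two regions sum to 644 kb, which the title rounds to 650 kb, and to 17 genes. On these figures the EPO region is several times more gene-dense than the CUTL1 region.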
Molecular cloning of human protein 4.2: a major component of the erythrocyte membrane.

PubMed Central

Sung, L A; Chien, S; Chang, L S; Lambert, K; Bliss, S A; Bouhassira, E E; Nagel, R L; Schwartz, R S; Rybicki, A C

1990-01-01

Protein 4.2 (P4.2) comprises approximately 5% of the protein mass of human erythrocyte (RBC) membranes. Anemia occurs in patients with RBCs deficient in P4.2, suggesting a role for this protein in maintaining RBC stability and integrity. We now report the molecular cloning and characterization of human RBC P4.2 cDNAs. By immunoscreening a human reticulocyte cDNA library and by using the polymerase chain reaction, two cDNA sequences of 2.4 and 2.5 kilobases (kb) were obtained. These cDNAs differ only by a 90-base-pair insert in the longer isoform located three codons downstream from the putative initiation site. The 2.4- and 2.5-kb cDNAs predict proteins of approximately 77 and approximately 80 kDa, respectively, and the authenticity was confirmed by sequence identity with 46 amino acids of three cyanogen bromide-cleaved peptides of P4.2. Northern blot analysis detected a major 2.4-kb RNA species in reticulocytes. Isolation of two P4.2 cDNAs implies existence of specific regulation of P4.2 expression in human RBCs. Human RBC P4.2 has significant homology with human factor XIII subunit a and guinea pig liver transglutaminase. Sequence alignment of P4.2 with these two transglutaminases, however, revealed that P4.2 lacks the critical cysteine residue required for the enzymatic crosslinking of substrates. PMID:1689063
Genomic interval engineering of mice identified a novel modulator of triglyceride production

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhu, Y.; Jong, M.C.; Frazer, K.A.

1999-10-01

To accelerate the biological annotation of novel genes discovered in sequenced regions of mammalian genomes, we are creating large deletions in the mouse genome targeted to include clusters of such genes. Here we describe the targeted deletion of a 450 kb region on mouse chromosome 11 which, based on computational analysis of the deleted murine sequences and human 5q orthologous sequences, codes for nine putative genes. Mice homozygous for the deletion had a variety of abnormalities including severe hypertriglyceridemia, hepatic and cardiac enlargement, growth retardation and premature mortality. Analysis of triglyceride metabolism in these animals demonstrated a several-fold increase in hepatic very-low density lipoprotein (VLDL) triglyceride secretion, the most prevalent mechanism responsible for hypertriglyceridemia in humans. A series of mouse BAC and human YAC transgenes covering different intervals of the 450 kb deleted region were assessed for their ability to complement the deletion-induced abnormalities. These studies revealed that OCTN2, a gene recently shown to play a role in carnitine transport, was able to correct the triglyceride abnormalities. The discovery of this previously unappreciated relationship between OCTN2, carnitine and hepatic triglyceride production is of particular importance due to the clinical consequence of hypertriglyceridemia and the paucity of genes known to modulate triglyceride secretion.
Analysis of the putative regulatory region of the gastric inhibitory polypeptide receptor gene in food-dependent Cushing's syndrome.

PubMed

Antonini, S R; N'Diaye, N; Baldacchino, V; Hamet, P; Tremblay, J; Lacroix, A

2004-07-01

Gastric inhibitory polypeptide (GIP)-dependent Cushing's syndrome (CS) results from the ectopic expression of non-mutated GIP receptor (hGIPR) in the adrenal cortex. We evaluated whether mutations or polymorphisms in the regulatory region of the GIPR gene could lead to this aberrant expression. We studied 9.0 kb upstream and 1.3 kb downstream of the GIPR gene putative promoter (pProm) by sequencing leukocyte DNA from controls and from adrenal tissues of GIP- and non-GIP-dependent CS patients. The putative proximal promoter region (800 bp) and the first exon and intron of the hGIPR gene were sequenced on adrenal DNA from nine GIP-dependent CS, as well as on leukocyte DNA of nine normal controls. Three variations found in this region were found in all patients and controls; at position -4/-5, an insertion of a T was seen in four out of nine patients and in five out of nine controls. Transient transfection studies conducted in rat GC and mouse Y1 cells showed that the TT allele confers loss of 40% in the promoter activity. The analysis of the 8-kb distal pProm region revealed eight distal single nucleotide polymorphisms (SNPs) without probable association with the disease, since frequencies in patients and controls were very similar. In conclusion, mutations or SNPs in the regulatory region of the GIPR gene are unlikely to underlie GIP-dependent CS. Copyright 2004 Elsevier Ltd.
Comparative genomics of citric-acid-producing Aspergillus niger ATCC 1015 versus enzyme-producing CBS 513.88

PubMed Central

Andersen, Mikael R.; Salazar, Margarita P.; Schaap, Peter J.; van de Vondervoort, Peter J.I.; Culley, David; Thykaer, Jette; Frisvad, Jens C.; Nielsen, Kristian F.; Albang, Richard; Albermann, Kaj; Berka, Randy M.; Braus, Gerhard H.; Braus-Stromeyer, Susanna A.; Corrochano, Luis M.; Dai, Ziyu; van Dijck, Piet W.M.; Hofmann, Gerald; Lasure, Linda L.; Magnuson, Jon K.; Menke, Hildegard; Meijer, Martin; Meijer, Susan L.; Nielsen, Jakob B.; Nielsen, Michael L.; van Ooyen, Albert J.J.; Pel, Herman J.; Poulsen, Lars; Samson, Rob A.; Stam, Hein; Tsang, Adrian; van den Brink, Johannes M.; Atkins, Alex; Aerts, Andrea; Shapiro, Harris; Pangilinan, Jasmyn; Salamov, Asaf; Lou, Yigong; Lindquist, Erika; Lucas, Susan; Grimwood, Jane; Grigoriev, Igor V.; Kubicek, Christian P.; Martinez, Diego; van Peij, Noël N.M.E.; Roubos, Johannes A.; Nielsen, Jens; Baker, Scott E.

2011-01-01

The filamentous fungus Aspergillus niger exhibits great diversity in its phenotype. It is found globally, both as marine and terrestrial strains, produces both organic acids and hydrolytic enzymes in high amounts, and some isolates exhibit pathogenicity. Although the genome of an industrial enzyme-producing A. niger strain (CBS 513.88) has already been sequenced, the versatility and diversity of this species compel additional exploration. We therefore undertook whole-genome sequencing of the acidogenic A. niger wild-type strain (ATCC 1015) and produced a genome sequence of very high quality. Only 15 gaps are present in the sequence, and half the telomeric regions have been elucidated. Moreover, sequence information from ATCC 1015 was used to improve the genome sequence of CBS 513.88. Chromosome-level comparisons uncovered several genome rearrangements, deletions, a clear case of strain-specific horizontal gene transfer, and identification of 0.8 Mb of novel sequence. Single nucleotide polymorphisms per kilobase (SNPs/kb) between the two strains were found to be exceptionally high (average: 7.8, maximum: 160 SNPs/kb). High variation within the species was confirmed with exo-metabolite profiling and phylogenetics. Detailed lists of alleles were generated, and genotypic differences were observed to accumulate in metabolic pathways essential to acid production and protein synthesis. A transcriptome analysis supported up-regulation of genes associated with biosynthesis of amino acids that are abundant in glucoamylase A, tRNA-synthases, and protein transporters in the protein producing CBS 513.88 strain. Our results and data sets from this integrative systems biology analysis resulted in a snapshot of fungal evolution and will support further optimization of cell factories based on filamentous fungi. PMID:21543515
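
The SNPs/kb figures quoted above (an average and a maximum) are the kind of summary that comes from counting variants in fixed-size windows. The following is a minimal sketch of that bookkeeping, not the authors' pipeline: the function name `snps_per_kb`, the non-overlapping 1-kb windows, and the toy coordinates are all assumptions made for illustration.

```python
from collections import Counter


def snps_per_kb(snp_positions, genome_length, window=1000):
    """Count SNPs in consecutive, non-overlapping windows (default 1 kb).

    snp_positions are assumed to be 1-based coordinates. The last window
    may be shorter than `window`; its count is not rescaled here.
    """
    n_windows = -(-genome_length // window)  # ceiling division
    counts = Counter((pos - 1) // window for pos in snp_positions)
    return [counts.get(i, 0) for i in range(n_windows)]


# Hypothetical SNP coordinates on a 3-kb toy sequence (not real data).
positions = [10, 250, 999, 1000, 1001, 2500]
density = snps_per_kb(positions, genome_length=3000)
print(density)                                      # per-window counts
print(sum(density) / len(density), max(density))    # average and maximum
```

With real data the positions would come from a whole-genome alignment or a variant file, and the window size would be a design choice that affects the reported maximum.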

The Bologna Annotation Resource (BAR 3.0): improving protein functional annotation

PubMed Central

Casadio, Rita

2017-01-01

BAR 3.0 updates our server BAR (Bologna Annotation Resource) for predicting protein structural and functional features from sequence. We increase data volume, query capabilities and information conveyed to the user. The core of BAR 3.0 is a graph-based clustering procedure of UniProtKB sequences, following strict pairwise similarity criteria (sequence identity ≥40% with alignment coverage ≥90%). Each cluster contains the available annotation downloaded from UniProtKB, GO, PFAM and PDB. After statistical validation, GO terms and PFAM domains are cluster-specific and annotate new sequences entering the cluster after satisfying similarity constraints. BAR 3.0 includes 28 869 663 sequences in 1 361 773 clusters, of which 22.2% (22 241 661 sequences) and 47.4% (24 555 055 sequences) have at least one validated GO term and one PFAM domain, respectively. 1.4% of the clusters (36% of all sequences) include PDB structures and the cluster is associated to a hidden Markov model that allows building template-target alignment suitable for structural modeling. Some other 3 399 026 sequences are singletons. BAR 3.0 offers an improved search interface, allowing queries by UniProtKB-accession, Fasta sequence, GO-term, PFAM-domain, organism, PDB and ligand/s. When evaluated on the CAFA2 targets, BAR 3.0 largely outperforms our previous version and scores among state-of-the-art methods. BAR 3.0 is publicly available and accessible at http://bar.biocomp.unibo.it/bar3. PMID:28453653
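
The clustering criterion in this abstract (pairwise identity ≥40% with alignment coverage ≥90%) can be illustrated with a small sketch. The abstract does not say how the graph is partitioned, so the connected-components rule used below is an assumption, as are the function name `cluster_sequences`, the tuple layout of the pairwise hits, and the toy identifiers; it should not be read as the BAR 3.0 implementation.

```python
def cluster_sequences(ids, pairs, min_identity=0.40, min_coverage=0.90):
    """Group sequences into connected components of a similarity graph.

    ids   : iterable of sequence identifiers
    pairs : iterable of (id_a, id_b, identity, coverage), fractions in [0, 1]
    An edge is kept only if both thresholds are met (assumed reading of the
    criteria quoted in the abstract).
    """
    parent = {s: s for s in ids}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, identity, coverage in pairs:
        if identity >= min_identity and coverage >= min_coverage:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    clusters = {}
    for s in ids:
        clusters.setdefault(find(s), []).append(s)
    return sorted(sorted(members) for members in clusters.values())


# Hypothetical pairwise alignment results (not real UniProtKB data).
ids = ["P1", "P2", "P3", "P4"]
pairs = [
    ("P1", "P2", 0.55, 0.95),  # passes both thresholds
    ("P2", "P3", 0.42, 0.91),  # passes both thresholds
    ("P3", "P4", 0.80, 0.60),  # high identity but coverage too low
]
print(cluster_sequences(ids, pairs))
```

In this toy input P4 fails the coverage constraint and ends up alone, which corresponds to what the abstract calls a singleton.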
Novel variants of the 5S rRNA genes in Eruca sativa.

PubMed

Singh, K; Bhatia, S; Lakshmikumaran, M

1994-02-01

The 5S ribosomal RNA (rRNA) genes of Eruca sativa were cloned and characterized. They are organized into clusters of tandemly repeated units. Each repeat unit consists of a 119-bp coding region followed by a noncoding spacer region that separates it from the coding region of the next repeat unit. Our study reports novel gene variants of the 5S rRNA genes in plants. Two families of the 5S rDNA, the 0.5-kb size family and the 1-kb size family, coexist in the E. sativa genome. The 0.5-kb size family consists of the 5S rRNA genes (S4) that have coding regions similar to those of other reported plant 5S rDNA sequences, whereas the 1-kb size family consists of the 5S rRNA gene variants (S1) that exist as 1-kb BamHI tandem repeats. S1 is made up of two variant units (V1 and V2) of 5S rDNA where the BamHI site between the two units is mutated. Sequence heterogeneity among S4, V1, and V2 units exists throughout the sequence and is not limited to the noncoding spacer region only. The coding regions of V1 and V2 show approximately 20% dissimilarity to the coding regions of S4 and other reported plant 5S rDNA sequences. Such a large variation in the coding regions of the 5S rDNA units within the same plant species has been observed for the first time. Restriction site variation is observed between the two size classes of 5S rDNA in E. sativa.(ABSTRACT TRUNCATED AT 250 WORDS)
Cloning of an avilamycin biosynthetic gene cluster from Streptomyces viridochromogenes Tü57.

PubMed Central

Gaisser, S; Trefzer, A; Stockert, S; Kirschning, A; Bechthold, A

1997-01-01

A 65-kb region of DNA from Streptomyces viridochromogenes Tü57, containing genes encoding proteins involved in the biosynthesis of avilamycins, was isolated. The DNA sequence of a 6.4-kb fragment from this region revealed four open reading frames (ORF1 to ORF4), three of which are fully contained within the sequenced fragment. The deduced amino acid sequence of AviM, encoded by ORF2, shows 37% identity to a 6-methylsalicylic acid synthase from Penicillium patulum. Cultures of S. lividans TK24 and S. coelicolor CH999 containing plasmids with ORF2 on a 5.5-kb PstI fragment were able to produce orsellinic acid, an unreduced version of 6-methylsalicylic acid. The amino acid sequence encoded by ORF3 (AviD) is 62% identical to that of StrD, a dTDP-glucose synthase from S. griseus. The deduced amino acid sequence of AviE, encoded by ORF4, shows 55% identity to a dTDP-glucose dehydratase (StrE) from S. griseus. Gene insertional inactivation experiments of aviE abolished avilamycin production, indicating the involvement of aviE in the biosynthesis of avilamycins. PMID:9335272
Molecular Cloning and Sequence Analysis of the Sta58 Major Antigen Gene of Rickettsia tsutsugamushi: Sequence homology and Antigenic Comparison of Sta58 to the 60-Kilodalton Family of Stress Proteins

DTIC Science & Technology

1990-05-01

Sta58 antigen and the Sta56 strain- GroES, C. burnetii HtpA, Mycobacterium tuberculosis 12- specific major antigen of R. tsutsugamushi (strain Karp...kb HindIII fragment carrying the gene for the Sta58 tuberculosis, and Mycobacterium smegmatis (65-kDa anti- protein was subjected to DNA sequence...the Hsp60 and Hsp10 proteins. R. tsu., R. tsutsugamushi; M. lep., Mycobacterium leprae; C. bur., C. burnetii; Synech., Synechococcus strain 6301; T
Fine Mapping Suggests that the Goat Polled Intersex Syndrome and the Human Blepharophimosis Ptosis Epicanthus Syndrome Map to a 100-kb Homologous Region

PubMed Central

Schibler, Laurent; Cribiu, Edmond P.; Oustry-Vaiman, Anne; Furet, Jean-Pierre; Vaiman, Daniel

2000-01-01

To clone the goat Polled Intersex Syndrome (PIS) gene(s), a chromosome walk was performed from six entry points at 1q43. This enabled 91 BACs to be recovered from a recently constructed goat BAC library. Six BAC contigs of goat chromosome 1q43 (ICC1–ICC6) were thus constructed covering altogether 4.5 Mb. A total of 37 microsatellite sequences were isolated from this 4.5-Mb region (16 in this study), of which 33 were genotyped and mapped. ICC3 (1500 kb) was shown by genetic analysis to encompass the PIS locus in a ∼400-kb interval without recombinants detected in the resource families (293 informative meioses). A strong linkage disequilibrium was detected among unrelated animals with the two central markers of the region, suggesting a probable location for PIS in ∼100 kb. High-resolution comparative mapping with human data shows that this DNA segment is the homolog of the human region associated with Blepharophimosis Ptosis Epicanthus inversus Syndrome (BPES) gene located in 3q23. This finding suggests that homologous gene(s) could be responsible for the pathologies observed in humans and goats. [The sequence data, PCR primers and PCR conditions for STS and microsatellites described in this paper have been submitted to the GenBank data library under accession nos. AQ666547–AQ666579, AQ686084–AQ686129, AQ793920–793931, AQ810429–AQ810527, G41201–G41228, and G54270–G54286.] PMID:10720572
A high HIV-1 strain variability in London, UK, revealed by full-genome analysis: Results from the ICONIC project

PubMed Central

Frampton, Dan; Gallo Cassarino, Tiziano; Raffle, Jade; Hubb, Jonathan; Ferns, R. Bridget; Waters, Laura; Tong, C. Y. William; Kozlakidis, Zisis; Hayward, Andrew; Kellam, Paul; Pillay, Deenan; Clark, Duncan; Nastouli, Eleni; Leigh Brown, Andrew J.

2018-01-01

Background & methods The ICONIC project has developed an automated high-throughput pipeline to generate HIV nearly full-length genomes (NFLG, i.e. from gag to nef) from next-generation sequencing (NGS) data. The pipeline was applied to 420 HIV samples collected at University College London Hospitals NHS Trust and Barts Health NHS Trust (London) and sequenced using an Illumina MiSeq at the Wellcome Trust Sanger Institute (Cambridge). Consensus genomes were generated and subtyped using COMET, and unique recombinants were studied with jpHMM and SimPlot. Maximum-likelihood phylogenetic trees were constructed using RAxML to identify transmission networks using the Cluster Picker. Results The pipeline generated sequences of at least 1Kb of length (median = 7.46Kb, IQR = 4.01Kb) for 375 out of the 420 samples (89%), with 174 (46.4%) being NFLG. A total of 365 sequences (169 of them NFLG) corresponded to unique subjects and were included in the down-stream analyses. The most frequent HIV subtypes were B (n = 149, 40.8%) and C (n = 77, 21.1%) and the circulating recombinant form CRF02_AG (n = 32, 8.8%). We found 14 different CRFs (n = 66, 18.1%) and multiple URFs (n = 32, 8.8%) that involved recombination between 12 different subtypes/CRFs. The most frequent URFs were B/CRF01_AE (4 cases) and A1/D, B/C, and B/CRF02_AG (3 cases each). Most URFs (19/26, 73%) lacked breakpoints in the PR+RT pol region, rendering them undetectable if only that was sequenced. Twelve (37.5%) of the URFs could have emerged within the UK, whereas the rest were probably imported from sub-Saharan Africa, South East Asia and South America. For 2 URFs we found highly similar pol sequences circulating in the UK. We detected 31 phylogenetic clusters using the full dataset: 25 pairs (mostly subtypes B and C), 4 triplets and 2 quadruplets. Some of these were not consistent across different genes due to inter- and intra-subtype recombination. Clusters involved 70 sequences, 19.2% of the dataset. 
Conclusions The initial analysis of genome sequences detected substantial hidden variability in the London HIV epidemic. Analysing full genome sequences, as opposed to only PR+RT, identified previously undetected recombinants. It provided a more reliable description of CRFs (that would be otherwise misclassified) and transmission clusters. PMID:29389981
Cloning and characterization of an alternatively spliced gene in proximal Xq28 deleted in two patients with intersexual genitalia and myotubular myopathy

DOE Office of Scientific and Technical Information (OSTI.GOV)

Laporte, J.; Hu, Ling-Jia; Kretz, C.

1997-05-01

We have identified a novel human gene that is entirely deleted in two boys with abnormal genital development and myotubular myopathy (MTM1). The gene, F18, is located in proximal Xq28, approximately 80 kb centromeric to the recently isolated MTM1 gene. Northern analysis of mRNA showed a ubiquitous pattern and suggested high levels of expression in skeletal muscle, brain, and heart. A transcript of 4.6 kb was detected in a range of tissues, and additional alternate forms of 3.8 and 2.6 kb were present in placenta and pancreas, respectively. The gene extends over 100 kb and is composed of at least seven exons, of which two are non-coding. Sequence analysis of a 4.6-kb cDNA contig revealed two overlapping open reading frames (ORFs) that encode putative proteins of 701 and 424 amino acids, respectively. Two alternative spliced transcripts affecting the large open reading frame were identified that, together with the Northern blot results, suggest that distinct proteins are derived from the gene. No significant homology to other known proteins was detected, but segments of the first ORF encode polyglutamine tracts and proline-rich domains, which are frequently observed in DNA-binding proteins. The F18 gene is a strong candidate for being implicated in the intersexual genitalia present in the two MTM1-deleted patients. The gene also serves as a candidate for other disorders that map to proximal Xq28. 15 refs., 3 figs., 1 tab.
Beta 2 adrenergic receptor gene restriction fragment length polymorphism and bronchial asthma.

PubMed Central

Ohe, M.; Munakata, M.; Hizawa, N.; Itoh, A.; Doi, I.; Yamaguchi, E.; Homma, Y.; Kawakami, Y.

1995-01-01

BACKGROUND--Beta 2 adrenergic dysfunction may be one of the underlying mechanisms responsible for atopy and bronchial asthma. The gene encoding the human beta 2 adrenergic receptor (beta 2ADR) has recently been isolated and sequenced. In addition, a two allele polymorphism of this receptor gene has been identified in white people. A study was carried out to determine whether this polymorphism is functionally important and has any relation to airways responsiveness, atopy, or asthma. METHODS--The subjects studied were 58 family members of four patients with atopic asthma. Restriction fragment length polymorphism (RFLP) with Ban-I digestion of the beta 2ADR gene was detected by a specific DNA probe with Southern blot analysis. Airways responses to inhaled methacholine and the beta 2 agonist salbutamol, the skin prick test, and serum IgE levels were also examined and correlated to the beta 2ADR gene RFLP. In addition, measurements of cAMP responses to isoproterenol in peripheral mononuclear cells were performed in 22 healthy subjects whose genotype for beta 2ADR was known. RESULTS--A two allele polymorphism (2.3 kb and 2.1 kb) of the beta 2ADR gene was detected in the Japanese population. Family members without allele 2.3 kb (homozygote of allele 2.1 kb) had lower airways responses to inhaled salbutamol than those with allele 2.3 kb. The incidence of asthma was higher in those without allele 2.3 kb than in those with allele 2.3 kb. The beta 2ADR gene RFLP had no relation to airways responses to methacholine and atopic status. cAMP responses in peripheral mononuclear cells of the subjects without allele 2.3 kb tended to be lower than those of the subjects with allele 2.3 kb. CONCLUSIONS--These results suggest that Ban-I RFLP of the beta 2ADR gene may have some association with the airways responses to beta 2 agonists and the incidence of bronchial asthma. PMID:7785006
Hypervariability of ribosomal DNA at multiple chromosomal sites in lake trout (Salvelinus namaycush).

PubMed

Zhuo, L; Reed, K M; Phillips, R B

1995-06-01

Variation in the intergenic spacer (IGS) of the ribosomal DNA (rDNA) of lake trout (Salvelinus namaycush) was examined. Digestion of genomic DNA with restriction enzymes showed that almost every individual had a unique combination of length variants with most of this variation occurring within rather than between populations. Sequence analysis of a 2.3 kilobase (kb) EcoRI-DraI fragment spanning the 3' end of the 28S coding region and approximately 1.8 kb of the IGS revealed two blocks of repetitive DNA. Putative transcriptional termination sites were found approximately 220 bases (b) downstream from the end of the 28S coding region. Comparison of the 2.3-kb fragments with two longer (3.1 kb) fragments showed that the major difference in length resulted from variation in the number of short (89 b) repeats located 3' to the putative terminator. Repeat units within a single nucleolus organizer region (NOR) appeared relatively homogeneous and genetic analysis found variants to be stably inherited. A comparison of the number of spacer-length variants with the number of NORs found that the number of length variants per individual was always less than the number of NORs. Examination of spacer variants in five populations showed that populations with more NORs had more spacer variants, indicating that variants are present at different rDNA sites on nonhomologous chromosomes.
Identification, characterization and functional analysis of regulatory region of nanos gene from half-smooth tongue sole (Cynoglossus semilaevis).

PubMed

Huang, Jinqiang; Li, Yongjuan; Shao, Changwei; Wang, Na; Chen, Songlin

2017-06-20

The nanos gene encodes an RNA-binding zinc finger protein, which is required in the development and maintenance of germ cells. However, there is very limited information about nanos in flatfish, which impedes its application in fish breeding. In this study, we report the molecular cloning, characterization and functional analysis of the 3'-untranslated region of the nanos gene (Csnanos) from half-smooth tongue sole (Cynoglossus semilaevis), which is an economically important flatfish in China. The 1233-bp cDNA sequence, 1709-bp genomic sequence and flanking sequences (2.8-kb 5'- and 1.6-kb 3'-flanking regions) of Csnanos were cloned and characterized. Sequence analysis revealed that CsNanos shares low homology with Nanos in other species, but the zinc finger domain of CsNanos is highly similar. Phylogenetic analysis indicated that CsNanos belongs to the Nanos2 subfamily. Csnanos expression was widely detected in various tissues, but the expression level was higher in testis and ovary. During early development and sex differentiation, Csnanos expression exhibited a clear sexually dimorphic pattern, suggesting its different roles in the migration and differentiation of primordial germ cells (PGCs). Higher expression levels of Csnanos mRNA in normal females and males than in neomales indicated that the nanos gene may play key roles in maintaining the differentiation of gonad. Moreover, medaka PGCs were successfully labeled by the microinjection of synthesized mRNA consisting of green fluorescence protein and the 3'-untranslated region of Csnanos. These findings provide new insights into nanos gene expression and function, and lay the foundation for further study of PGC development and applications in tongue sole breeding. Copyright © 2017 Elsevier B.V. All rights reserved.
RIKEN Integrated Sequence Analysis (RISA) System—384-Format Sequencing Pipeline with 384 Multicapillary Sequencer

PubMed Central

Shibata, Kazuhiro; Itoh, Masayoshi; Aizawa, Katsunori; Nagaoka, Sumiharu; Sasaki, Nobuya; Carninci, Piero; Konno, Hideaki; Akiyama, Junichi; Nishi, Katsuo; Kitsunai, Tokuji; Tashiro, Hideo; Itoh, Mari; Sumi, Noriko; Ishii, Yoshiyuki; Nakamura, Shin; Hazama, Makoto; Nishine, Tsutomu; Harada, Akira; Yamamoto, Rintaro; Matsumoto, Hiroyuki; Sakaguchi, Sumito; Ikegami, Takashi; Kashiwagi, Katsuya; Fujiwake, Syuji; Inoue, Kouji; Togawa, Yoshiyuki; Izawa, Masaki; Ohara, Eiji; Watahiki, Masanori; Yoneda, Yuko; Ishikawa, Tomokazu; Ozawa, Kaori; Tanaka, Takumi; Matsuura, Shuji; Kawai, Jun; Okazaki, Yasushi; Muramatsu, Masami; Inoue, Yorinao; Kira, Akira; Hayashizaki, Yoshihide

2000-01-01

The RIKEN high-throughput 384-format sequencing pipeline (RISA system) including a 384-multicapillary sequencer (the so-called RISA sequencer) was developed for the RIKEN mouse encyclopedia project. The RISA system consists of colony picking, template preparation, sequencing reaction, and the sequencing process. A novel high-throughput 384-format capillary sequencer system (RISA sequencer system) was developed for the sequencing process. This system consists of a 384-multicapillary auto sequencer (RISA sequencer), a 384-multicapillary array assembler (CAS), and a 384-multicapillary casting device. The RISA sequencer can simultaneously analyze 384 independent sequencing products. The optical system is a scanning system chosen after careful comparison with an image detection system for the simultaneous detection of the 384-capillary array. This scanning system can be used with any fluorescent-labeled sequencing reaction (chain termination reaction), including transcriptional sequencing based on RNA polymerase, which was originally developed by us, and cycle sequencing based on thermostable DNA polymerase. For long-read sequencing, 380 out of 384 sequences (99.2%) were successfully analyzed and the average read length, with more than 99% accuracy, was 654.4 bp. A single RISA sequencer can analyze 216 kb with >99% accuracy in 2.7 h (90 kb/h). For short-read sequencing to cluster the 3′ end and 5′ end sequencing by reading 350 bp, 384 samples can be analyzed in 1.5 h. We have also developed a RISA inoculator, RISA filtrator and densitometer, RISA plasmid preparator which can handle throughput of 40,000 samples in 17.5 h, and a high-throughput RISA thermal cycler which has four 384-well sites. The combination of these technologies allowed us to construct the RISA system consisting of 16 RISA sequencers, which can process 50,000 DNA samples per day. 
One haploid genome shotgun sequence of a higher organism, such as human, mouse, rat, domestic animals, and plants, can be revealed by seven RISA systems within one month. PMID:11076861
Fine mapping suggests that the goat Polled Intersex Syndrome and the human Blepharophimosis Ptosis Epicanthus Syndrome map to a 100-kb homologous region.

PubMed

Schibler, L; Cribiu, E P; Oustry-Vaiman, A; Furet, J P; Vaiman, D

2000-03-01

To clone the goat Polled Intersex Syndrome (PIS) gene(s), a chromosome walk was performed from six entry points at 1q43. This enabled 91 BACs to be recovered from a recently constructed goat BAC library. Six BAC contigs of goat chromosome 1q43 (ICC1-ICC6) were thus constructed covering altogether 4.5 Mb. A total of 37 microsatellite sequences were isolated from this 4.5-Mb region (16 in this study), of which 33 were genotyped and mapped. ICC3 (1500 kb) was shown by genetic analysis to encompass the PIS locus in a approximately 400-kb interval without recombinants detected in the resource families (293 informative meioses). A strong linkage disequilibrium was detected among unrelated animals with the two central markers of the region, suggesting a probable location for PIS in approximately 100 kb. High-resolution comparative mapping with human data shows that this DNA segment is the homolog of the human region associated with Blepharophimosis Ptosis Epicanthus inversus Syndrome (BPES) gene located in 3q23. This finding suggests that homologous gene(s) could be responsible for the pathologies observed in humans and goats.
Multiplexed resequencing analysis to identify rare variants in pooled DNA with barcode indexing using next-generation sequencer.

PubMed

Mitsui, Jun; Fukuda, Yoko; Azuma, Kyo; Tozaki, Hirokazu; Ishiura, Hiroyuki; Takahashi, Yuji; Goto, Jun; Tsuji, Shoji

2010-07-01

We have recently found that multiple rare variants of the glucocerebrosidase gene (GBA) confer a robust risk for Parkinson disease, supporting the 'common disease-multiple rare variants' hypothesis. To develop an efficient method of identifying rare variants in a large number of samples, we applied multiplexed resequencing using a next-generation sequencer to identification of rare variants of GBA. Sixteen sets of pooled DNAs from six pooled DNA samples were prepared. Each set of pooled DNAs was subjected to polymerase chain reaction to amplify the target gene (GBA) covering 6.5 kb, pooled into one tube with barcode indexing, and then subjected to extensive sequence analysis using the SOLiD System. Individual samples were also subjected to direct nucleotide sequence analysis. With the optimization of data processing, we were able to extract all the variants from 96 samples with acceptable rates of false-positive single-nucleotide variants.
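
The barcode-indexing step described above, in which reads from several pooled amplicon sets are sequenced together and later separated, can be sketched generically. This is an illustration under stated assumptions, not the SOLiD System's actual read or barcode format: it assumes each read begins with a fixed-length barcode, requires an exact match, and uses hypothetical barcode sequences and pool names.

```python
def demultiplex(reads, barcodes):
    """Assign reads to pools by an exact-match barcode prefix.

    reads    : iterable of read sequences (strings)
    barcodes : dict mapping barcode sequence -> pool name; all barcodes are
               assumed to have the same length
    Returns (pools, unassigned), where pools maps pool name to the list of
    reads with the barcode trimmed off.
    """
    barcode_length = len(next(iter(barcodes)))
    pools = {name: [] for name in barcodes.values()}
    unassigned = []
    for read in reads:
        name = barcodes.get(read[:barcode_length])
        if name is None:
            unassigned.append(read)  # no known barcode at the read start
        else:
            pools[name].append(read[barcode_length:])
    return pools, unassigned


# Hypothetical barcodes and reads (not taken from the study).
barcodes = {"ACGT": "pool_01", "TGCA": "pool_02"}
reads = ["ACGTGGGCCC", "TGCAAAATTT", "NNNNGGGCCC", "ACGTTTTAAA"]
pools, unassigned = demultiplex(reads, barcodes)
print(pools)
print(unassigned)
```

A production workflow would usually tolerate one or more barcode mismatches and track per-pool read depth, since a rare variant in a pool of six samples is expected at a low allele fraction.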
Molecular Population Genetics of the Alcohol Dehydrogenase Gene Region of DROSOPHILA MELANOGASTER

PubMed Central

Aquadro, Charles F.; Desse, Susan F.; Bland, Molly M.; Langley, Charles H.; Laurie-Ahlberg, Cathy C.

1986-01-01

Variation in the DNA restriction map of a 13-kb region of chromosome II including the alcohol dehydrogenase structural gene (Adh) was examined in Drosophila melanogaster from natural populations. Detailed analysis of 48 D. melanogaster lines representing four eastern United States populations revealed extensive DNA sequence variation due to base substitutions, insertions and deletions. Cloning of this region from several lines allowed characterization of length variation as due to unique sequence insertions or deletions [nine sizes; 21–200 base pairs (bp)] or transposable element insertions (several sizes, 340 bp to 10.2 kb, representing four different elements). Despite this extensive variation in sequences flanking the Adh gene, only one length polymorphism is clearly associated with altered Adh expression (a copia element approximately 250 bp 5' to the distal transcript start site). Nonetheless, the frequency spectra of transposable elements within and between Drosophila species suggest they are slightly deleterious. Strong nonrandom associations are observed among Adh region sequence variants, ADH allozyme (Fast vs. Slow), ADH enzyme activity and the chromosome inversion In(2L)t. Phylogenetic analysis of restriction map haplotypes suggests that the major twofold component of ADH activity variation (high vs. low, typical of Fast and Slow allozymes, respectively) is due to sequence variation tightly linked to and possibly distinct from that underlying the allozyme difference. The patterns of nucleotide and haplotype variation for Fast and Slow allozyme lines are consistent with the recent increase in frequency and spread of the Fast haplotype associated with high ADH activity. These data emphasize the important role of evolutionary history and strong nonrandom associations among tightly linked sequence variation as determinants of the patterns of variation observed in natural populations. PMID:3026893
Dissemination of genetically related IMP-6-producing multidrug-resistant Pseudomonas aeruginosa ST235 in South Korea.

PubMed

Yoo, Jung Sik; Yang, Ji Woo; Kim, Hye Mee; Byeon, Jeongheum; Kim, Hwa Su; Yoo, Jae Il; Chung, Gyung Tae; Lee, Yeong Seon

2012-04-01

The present study aimed to describe the prevalence and molecular epidemiology of metallo-β-lactamase (MBL)-producing Pseudomonas aeruginosa isolates obtained from non-tertiary care hospitals and geriatric hospitals in South Korea. Of the 644 isolates, 224 were carbapenem-resistant, amongst which 41 (18.3%) were MBL-producers and the major MBL type was IMP-6 (35 isolates). IMP-6-producing isolates were multidrug-resistant and showed higher minimum inhibitory concentrations for meropenem than imipenem. All of the IMP-6-producing isolates had class 1 integrons with amplification sizes of 4.5 kb/5.5 kb (34 isolates) or 3.0 kb (1 isolate); 4.5 kb/5.5 kb integrons had bla(IMP-6)-qac-aacA4-bla(OXA-1)-aadA1 (5.5 kb) and aadB-cmlA-bla(OXA-10)-aadA1 (4.5 kb). Pulsed-field gel electrophoresis (PFGE) analysis indicated that all IMP-6-producing P. aeruginosa from various geographic areas had nearly identical patterns with >85% similarity. All IMP-6-producing isolates showed high genetic similarity to those obtained from tertiary care hospitals and had the same integron type, indicating the spread of these strains to the three types of hospitals nationwide. These data show the wide spreading of clonally related IMP-6-producing P. aeruginosa (sequence type 235) through tertiary, non-tertiary and geriatric hospitals in South Korea. Continuous monitoring and thorough infection control should be performed in all types of hospitals to prevent further spreading of MBL-producing P. aeruginosa. Copyright © 2012 Elsevier B.V. and the International Society of Chemotherapy. All rights reserved.
The Friedreich ataxia critical region spans a 150-kb interval on chromosome 9q13

DOE Office of Scientific and Technical Information (OSTI.GOV)

Montermini, L.; Zara, F.; Patel, P.I.

1995-11-01

By analysis of crossovers in key recombinant families and by homozygosity analysis of inbred families, the Friedreich ataxia (FRDA) locus was localized in a 300-kb interval between the X104 gene and the microsatellite marker FR8 (D9S888). By homology searches of the sequence databases, we identified X104 as the human tight junction protein ZO-2 gene. We generated a large-scale physical map of the FRDA region by pulsed-field gel electrophoresis analysis of genomic DNA and of three YAC clones derived from different libraries, and we constructed an uninterrupted cosmid contig spanning the FRDA locus. The cAMP-dependent protein kinase γ-catalytic subunit gene was identified within the critical FRDA interval, but it was excluded as candidate because of its biological properties and because of lack of mutations in FRDA patients. Six new polymorphic markers were isolated between FR2 (D9S886) and FR8 (D9S888), which were used for homozygosity analysis in a family in which parents of an affected child are distantly related. An ancient recombination involving the centromeric FRDA flanking markers had been previously demonstrated in this family. Homozygosity analysis indicated that the FRDA gene is localized in the telomeric 150 kb of the FR2-FR8 interval. 17 refs., 3 figs., 1 tab.
Cloning and restriction enzyme mapping of ribosomal DNA of Giardia duodenalis, Giardia ardeae and Giardia muris.

PubMed

van Keulen, H; Campbell, S R; Erlandsen, S L; Jarroll, E L

1991-06-01

In an attempt to study Giardia at the DNA sequence level, the rRNA genes of three species, Giardia duodenalis, Giardia ardeae and Giardia muris were cloned and restriction enzyme maps were constructed. The rDNA repeats of these Giardia show completely different restriction enzyme recognition patterns. The size of the rDNA repeat ranges from approximately 5.6 kb in G. duodenalis to 7.6 kb in both G. muris and G. ardeae. These size differences are mainly attributable to the variation in length of the spacer. Minor differences exist among these Giardia in the sizes of their small subunit rRNA and the internal transcribed spacer between small and large subunit rRNA. The genetic maps were constructed by sequence analysis of the DNA around the 5' and 3' ends of the mature rRNA genes and between the rRNA covering the 5.8S rRNA gene and internal transcribed spacer. Comparison of the 5.8S rDNA and 3' end of large subunit rDNA from these three Giardia species showed considerable sequence variation, but the rDNA sequences of G. duodenalis and G. ardeae appear more closely related to each other than to G. muris.
Complete sequences of IncHI1 plasmids carrying blaCTX-M-1 and qnrS1 in equine Escherichia coli provide new insights into plasmid evolution.

PubMed

Dolejska, Monika; Villa, Laura; Minoia, Marco; Guardabassi, Luca; Carattoli, Alessandra

2014-09-01

To determine the structure of two multidrug-resistant IncHI1 plasmids carrying blaCTX-M-1 in Escherichia coli isolates disseminated in an equine clinic in the Czech Republic. A complete nucleotide sequencing of 239 kb IncHI1 (pEQ1) and 287 kb IncHI1/X1 (pEQ2) plasmids was performed using the 454-Genome Sequencer FLX system. The sequences were compared using bioinformatic tools with other sequenced IncHI1 plasmids. A comparative analysis of pEQ1 and pEQ2 identified high nucleotide identity with the IncHI1 type 2 plasmids. A novel 24 kb module containing an operon involved in short-chain fructooligosaccharide uptake and metabolism was found in the pEQ backbones. The role of the pEQ plasmids in the metabolism of short-chain fructooligosaccharides was demonstrated by studying the growth of E. coli cells in the presence of these sugars. The module containing the blaCTX-M-1 gene was formed by a truncated macrolide resistance cluster and flanked by IS26 as previously observed in IncI1 and IncN plasmids. The IncHI1 plasmid changed size and gained the quinolone resistance gene qnrS1 as a result of IS26-mediated fusion with an IncX1 plasmid. Our data highlight the structure and evolution of IncHI1 from equine E. coli. A plasmid-mediated sugar metabolic element could play a key role in strain fitness, contributing to the successful dissemination and maintenance of these plasmids in the intestinal microflora of horses. © The Author 2014. Published by Oxford University Press on behalf of the British Society for Antimicrobial Chemotherapy. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
KB425796-A, a novel antifungal antibiotic produced by Paenibacillus sp. 530603.

PubMed

Kai, Hirohito; Yamashita, Midori; Takase, Shigehiro; Hashimoto, Michizane; Muramatsu, Hideyuki; Nakamura, Ikuko; Yoshikawa, Koji; Ezaki, Masami; Nitta, Kumiko; Watanabe, Masato; Inamura, Noriaki; Fujie, Akihiko

2013-08-01

The novel antifungal macrocyclic lipopeptidolactone, KB425796-A (1), was isolated from the fermentation broth of bacterial strain 530603, which was identified as a new Paenibacillus species based on morphological and physiological characteristics, and 16S rRNA sequences. KB425796-A (1) was isolated as white powder by solvent extraction, HP-20 and ODS-B column chromatography, and lyophilization, and was determined to have the molecular formula C79H115N19O18. KB425796-A (1) showed antifungal activities against Aspergillus fumigatus and the micafungin-resistant infectious fungi Trichosporon asahii, Rhizopus oryzae, Pseudallescheria boydii and Cryptococcus neoformans.
Identification of Small Exonic CNV from Whole-Exome Sequence Data and Application to Autism Spectrum Disorder

PubMed Central

Poultney, Christopher S.; Goldberg, Arthur P.; Drapeau, Elodie; Kou, Yan; Harony-Nicolas, Hala; Kajiwara, Yuji; De Rubeis, Silvia; Durand, Simon; Stevens, Christine; Rehnström, Karola; Palotie, Aarno; Daly, Mark J.; Ma’ayan, Avi; Fromer, Menachem; Buxbaum, Joseph D.

2013-01-01

Copy number variation (CNV) is an important determinant of human diversity and plays important roles in susceptibility to disease. Most studies of CNV carried out to date have made use of chromosome microarray and have had a lower size limit for detection of about 30 kilobases (kb). With the emergence of whole-exome sequencing studies, we asked whether such data could be used to reliably call rare exonic CNV in the size range of 1–30 kilobases (kb), making use of the eXome Hidden Markov Model (XHMM) program. By using both transmission information and validation by molecular methods, we confirmed that small CNV encompassing as few as three exons can be reliably called from whole-exome data. We applied this approach to an autism case-control sample (n = 811, mean per-target read depth = 161) and observed a significant increase in the burden of rare (MAF ≤1%) 1–30 kb CNV, 1–30 kb deletions, and 1–10 kb deletions in ASD. CNV in the 1–30 kb range frequently hit just a single gene, and we were therefore able to carry out enrichment and pathway analyses, where we observed enrichment for disruption of genes in cytoskeletal and autophagy pathways in ASD. In summary, our results showed that XHMM provided an effective means to assess small exonic CNV from whole-exome data, indicated that rare 1–30 kb exonic deletions could contribute to risk in up to 7% of individuals with ASD, and implicated a candidate pathway in developmental delay syndromes. PMID:24094742

Genomic insights from whole genome sequencing of four clonal outbreak Campylobacter jejuni assessed within the global C. jejuni population.

PubMed

Clark, Clifford G; Berry, Chrystal; Walker, Matthew; Petkau, Aaron; Barker, Dillon O R; Guan, Cai; Reimer, Aleisha; Taboada, Eduardo N

2016-12-03

Whole genome sequencing (WGS) is useful for determining clusters of human cases, investigating outbreaks, and defining the population genetics of bacteria. It also provides information about other aspects of bacterial biology, including classical typing results, virulence, and adaptive strategies of the organism. Cell culture invasion and protein expression patterns of four related multilocus sequence type 21 (ST21) C. jejuni isolates from a significant Canadian water-borne outbreak were previously associated with the presence of a CJIE1 prophage. Whole genome sequencing was used to examine the genetic diversity among these isolates and confirm that previous observations could be attributed to differential prophage carriage. Moreover, we sought to determine the presence of genome sequences that could be used as surrogate markers to delineate outbreak-associated isolates. Differential carriage of the CJIE1 prophage was identified as the major genetic difference among the four outbreak isolates. High quality single-nucleotide variant (hqSNV) and core genome multilocus sequence typing (cgMLST) clustered these isolates within expanded datasets consisting of additional C. jejuni strains. The number and location of homopolymeric tract regions was identical in all four outbreak isolates but differed from all other C. jejuni examined. Comparative genomics and PCR amplification enabled the identification of large chromosomal inversions of approximately 93 kb and 388 kb within the outbreak isolates associated with transducer-like proteins containing long nucleotide repeat sequences. The 93-kb inversion was characteristic of the outbreak-associated isolates, and the gene content of this inverted region displayed high synteny with the reference strain. The four outbreak isolates were clonally derived and differed mainly in the presence of the CJIE1 prophage, validating earlier findings linking the prophage to phenotypic differences in virulence assays and protein expression. 
The identification of large, genetically syntenous chromosomal inversions in the genomes of outbreak-associated isolates provided a unique method for discriminating outbreak isolates from the background population. Transducer-like proteins appear to be associated with the chromosomal inversions. CgMLST and hqSNV analysis also effectively delineated the outbreak isolates within the larger C. jejuni population structure.
The Bologna Annotation Resource (BAR 3.0): improving protein functional annotation.

PubMed

Profiti, Giuseppe; Martelli, Pier Luigi; Casadio, Rita

2017-07-03

BAR 3.0 updates our server BAR (Bologna Annotation Resource) for predicting protein structural and functional features from sequence. We increase data volume, query capabilities and information conveyed to the user. The core of BAR 3.0 is a graph-based clustering procedure of UniProtKB sequences, following strict pairwise similarity criteria (sequence identity ≥40% with alignment coverage ≥90%). Each cluster contains the available annotation downloaded from UniProtKB, GO, PFAM and PDB. After statistical validation, GO terms and PFAM domains are cluster-specific and annotate new sequences entering the cluster after satisfying similarity constraints. BAR 3.0 includes 28 869 663 sequences in 1 361 773 clusters, of which 22.2% (22 241 661 sequences) and 47.4% (24 555 055 sequences) have at least one validated GO term and one PFAM domain, respectively. 1.4% of the clusters (36% of all sequences) include PDB structures and the cluster is associated to a hidden Markov model that allows building template-target alignment suitable for structural modeling. Some other 3 399 026 sequences are singletons. BAR 3.0 offers an improved search interface, allowing queries by UniProtKB-accession, Fasta sequence, GO-term, PFAM-domain, organism, PDB and ligand/s. When evaluated on the CAFA2 targets, BAR 3.0 largely outperforms our previous version and scores among state-of-the-art methods. BAR 3.0 is publicly available and accessible at http://bar.biocomp.unibo.it/bar3. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
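The pairwise criteria quoted in this abstract (sequence identity ≥40% with alignment coverage ≥90%) can be illustrated with a small graph-clustering sketch. This is not the BAR 3.0 implementation: treating clusters as connected components, the `cluster_by_similarity` function, and the toy hit list are all assumptions made for illustration only.

```python
def cluster_by_similarity(sequences, hits, min_identity=0.40, min_coverage=0.90):
    """Group sequence IDs into connected components of a similarity graph.

    `hits` is an iterable of (id_a, id_b, identity, coverage) tuples, with
    identity and coverage given as fractions. An edge is kept only when both
    thresholds are met (assumed reading of the abstract's criteria).
    """
    parent = {s: s for s in sequences}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, identity, coverage in hits:
        if identity >= min_identity and coverage >= min_coverage:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_a] = root_b

    groups = {}
    for s in sequences:
        groups.setdefault(find(s), []).append(s)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy input: the C-D hit fails the coverage threshold,
# so D remains a singleton.
toy_hits = [
    ("A", "B", 0.55, 0.95),
    ("B", "C", 0.45, 0.92),
    ("C", "D", 0.80, 0.50),
]
clusters = cluster_by_similarity(["A", "B", "C", "D"], toy_hits)
```

In this toy case A, B and C end up in one cluster and D is left on its own, which is the kind of sequence the abstract calls a singleton.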
Haplotype estimation using sequencing reads.

PubMed

Delaneau, Olivier; Howie, Bryan; Cox, Anthony J; Zagury, Jean-François; Marchini, Jonathan

2013-10-03

High-throughput sequencing technologies produce short sequence reads that can contain phase information if they span two or more heterozygote genotypes. This information is not routinely used by current methods that infer haplotypes from genotype data. We have extended the SHAPEIT2 method to use phase-informative sequencing reads to improve phasing accuracy. Our model incorporates the read information in a probabilistic model through base quality scores within each read. The method is primarily designed for high-coverage sequence data or data sets that already have genotypes called. One important application is phasing of single samples sequenced at high coverage for use in medical sequencing and studies of rare diseases. Our method can also use existing panels of reference haplotypes. We tested the method by using a mother-father-child trio sequenced at high-coverage by Illumina together with the low-coverage sequence data from the 1000 Genomes Project (1000GP). We found that use of phase-informative reads increases the mean distance between switch errors by 22% from 274.4 kb to 328.6 kb. We also used male chromosome X haplotypes from the 1000GP samples to simulate sequencing reads with varying insert size, read length, and base error rate. When using short 100 bp paired-end reads, we found that using mixtures of insert sizes produced the best results. When using longer reads with high error rates (5-20 kb read with 4%-15% error per base), phasing performance was substantially improved. Copyright © 2013 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
HPVdb: a data mining system for knowledge discovery in human papillomavirus with applications in T cell immunology and vaccinology

PubMed Central

Zhang, Guang Lan; Riemer, Angelika B.; Keskin, Derin B.; Chitkushev, Lou; Reinherz, Ellis L.; Brusic, Vladimir

2014-01-01

High-risk human papillomaviruses (HPVs) are the causes of many cancers, including cervical, anal, vulvar, vaginal, penile and oropharyngeal. To facilitate diagnosis, prognosis and characterization of these cancers, it is necessary to make full use of the immunological data on HPV available through publications, technical reports and databases. These data vary in granularity, quality and complexity. The extraction of knowledge from the vast amount of immunological data using data mining techniques remains a challenging task. To support integration of data and knowledge in virology and vaccinology, we developed a framework called KB-builder to streamline the development and deployment of web-accessible immunological knowledge systems. The framework consists of seven major functional modules, each facilitating a specific aspect of the knowledgebase construction process. Using KB-builder, we constructed the Human Papillomavirus T cell Antigen Database (HPVdb). It contains 2781 curated antigen entries of antigenic proteins derived from 18 genotypes of high-risk HPV and 18 genotypes of low-risk HPV. The HPVdb also catalogs 191 verified T cell epitopes and 45 verified human leukocyte antigen (HLA) ligands. Primary amino acid sequences of HPV antigens were collected and annotated from the UniProtKB. T cell epitopes and HLA ligands were collected from data mining of scientific literature and databases. The data were subject to extensive quality control (redundancy elimination, error detection and vocabulary consolidation). A set of computational tools for an in-depth analysis, such as sequence comparison using BLAST search, multiple alignments of antigens, classification of HPV types based on cancer risk, T cell epitope/HLA ligand visualization, T cell epitope/HLA ligand conservation analysis and sequence variability analysis, has been integrated within the HPVdb. 
Predicted Class I and Class II HLA binding peptides for 15 common HLA alleles are included in this database as putative targets. HPVdb is a knowledge-based system that integrates curated data and information with tailored analysis tools to facilitate data mining for HPV vaccinology and immunology. To our best knowledge, HPVdb is a unique data source providing a comprehensive list of HPV antigens and peptides. Database URL: http://cvc.dfci.harvard.edu/hpv/ PMID:24705205
[Association of phytoplasma with Bermuda grass white-leaf disease].

PubMed

Tan, Weijun; Chen, Yong; Zhang, Wu; Han, Chengchou; Tan, Zhiyuan; Zhang, Juming

2008-10-01

Bermuda grass white leaf is an important disease of Bermuda grass all over the world. The aim of this research was to identify the pathogen that causes Bermuda grass white leaf on the Chinese mainland. PCR amplification, sequence analysis and Southern hybridization were used. A 1.3 kb fragment was amplified by PCR using phytoplasma universal primers, with total DNA extracted from diseased Bermuda grass as the template. Sequence analysis of the amplified fragment indicated that it clustered with Candidatus Phytoplasma cynodontis. Southern hybridization analysis showed differential bands. The results indicate that a phytoplasma is associated with Bermuda grass white leaf on the Chinese mainland, which provides a scientific basis for further identification, prevention and control of the disease.
Synteny of Prunus and other model plant species

PubMed Central

Jung, Sook; Jiwan, Derick; Cho, Ilhyung; Lee, Taein; Abbott, Albert; Sosinski, Bryon; Main, Dorrie

2009-01-01

Background: Fragmentary conservation of synteny has been reported between map-anchored Prunus sequences and Arabidopsis. With the availability of genome sequence for fellow rosid I members Populus and Medicago, we analyzed the synteny between Prunus and the three model genomes. Eight Prunus BAC sequences and map-anchored Prunus sequences were used in the comparison. Results: We found a well conserved synteny across the Prunus species – peach, plum, and apricot – and Populus using a set of homologous Prunus BACs. Conversely, we could not detect any synteny with Arabidopsis in this region. Other peach BACs also showed extensive synteny with Populus. The syntenic regions detected were up to 477 kb in Populus. Two syntenic regions between Arabidopsis and these BACs were much shorter, around 10 kb. We also found syntenic regions that are conserved between the Prunus BACs and Medicago. The array of synteny corresponded with the proposed whole genome duplication events in Populus and Medicago. Using map-anchored Prunus sequences, we detected many syntenic blocks with several gene pairs between Prunus and Populus or Arabidopsis. We observed a more complex network of synteny between Prunus-Arabidopsis, indicative of multiple genome duplication and subsequent gene loss in Arabidopsis. Conclusion: Our result shows the striking microsynteny between the Prunus BACs and the genome of Populus and Medicago. In macrosynteny analysis, more distinct Prunus regions were syntenic to Populus than to Arabidopsis. PMID:19208249
The histidine permease gene (HIP1) of Saccharomyces cerevisiae.

PubMed

Tanaka, J; Fink, G R

1985-01-01

The histidine-specific permease gene (HIP1) of Saccharomyces cerevisiae has been mapped, cloned, and sequenced. The HIP1 gene maps to the right arm of chromosome VII, approx. 11 cM distal to the ADE3 gene. The gene was isolated as an 8.6-kb BamHI-Sau3A fragment by complementation of the histidine-specific permease deficiency in recipient yeast cells. We sequenced a 2.4-kb subfragment of this BamHI-Sau3A fragment containing the HIP1 gene and identified a 1596-bp open reading frame (ORF). We confirmed the assignment of the 1596-bp ORF as the HIP1 coding sequence by sequencing a hip1 nonsense mutation. Analysis of the amino acid (aa) sequence of the HIP1 gene reveals several hydrophobic stretches, but shows no obvious N-terminal signal peptide. We have constructed a deletion of the HIP1 gene in vitro and replaced the wild-type copy of the gene with this deletion. The hip1 deletion mutant can grow when it is supplemented with 30 mM histidine, 50 times the amount required for the growth of HIP1 cells. Revertants of this deletion mutant able to grow on a normal level of histidine arise by mutation in unlinked genes. Both these observations suggest that there are additional, low-affinity pathways for histidine uptake.
Third International Meeting on Esterases Reacting with Organophosphorus Compounds

DTIC Science & Technology

1998-01-01

cassette for negative selection, 884 bp of ACHE including exon 1, 1.6 kb of a Neor gene cassette for positive selection, 5.2 kb of the ACHE Bam HI...fragment including exon 6, and 3 kb of Bluescript. Deletion of exons 2-5 removed 80% of the ACHE coding sequence. The gene targeting vector was...expression due to environmental influences on CYP3A4 and the presence or absence of CYP3A5 which may be under genetic control in man. Plasma
[Phylogenetic analysis of genomes of Vibrio cholerae strains isolated on the territory of Rostov region].

PubMed

Kuleshov, K V; Markelov, M L; Dedkov, V G; Vodop'ianov, A S; Kermanov, A V; Pisanov, R V; Kruglikov, V D; Mazrukho, A B; Maleev, V V; Shipulin, G A

2013-01-01

The aim was to determine the origin of 2 Vibrio cholerae strains isolated on the territory of Rostov region by using full genome sequencing data. Toxigenic strain 2011EL-301 V. cholerae O1 El Tor Inaba No. 301 (ctxAB+, tcpA+) and nontoxigenic strain V. cholerae O1 Ogawa P-18785 (ctxAB-, tcpA+) were studied. Sequencing was carried out on the MiSeq platform. Phylogenetic analysis of the genomes obtained was based on comparison of the conserved part of the studied genomes and 54 previously sequenced genomes. The 2011EL-301 strain genome was represented by 164 contigs with an average coverage of 100 and an N50 of 132 kb; the P-18785 strain genome was represented by 159 contigs with a coverage of 69 and an N50 of 83 kb. The contigs obtained for strain 2011EL-301 were deposited in the DDBJ/EMBL/GenBank databases under accession number AJFN02000000, and those for strain P-18785 under ANHS00000000. 716 protein-coding orthologous genes were detected. Based on phylogenetic analysis, strain P-18785 belongs to the PG-1 subgroup (a group of predecessor strains of the 7th pandemic). Strain 2011EL-301 belongs to the group of strains of the 7th pandemic and falls into the cluster with later isolates that are associated with cases of cholera in South Africa and cases of import of cholera to the USA from Pakistan. The data obtained allow phylogenetic connections to be established with V. cholerae strains isolated earlier.
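The N50 values reported in this abstract are a standard assembly contiguity statistic: the contig length at which contigs of that length or longer account for at least half of the total assembled bases. The following is a minimal sketch of that definition; the `n50` function name and the example contig lengths are hypothetical and are not taken from the study.

```python
def n50(contig_lengths):
    """Return the N50 of an assembly.

    Contigs are sorted from longest to shortest and their lengths are
    accumulated; N50 is the length of the contig at which the running
    total first reaches at least half of the total assembly size.
    """
    if not contig_lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(contig_lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths in kb (total 290, half 145):
# 80 + 70 = 150 reaches the halfway point, so N50 is 70.
example_n50 = n50([80, 70, 50, 40, 30, 20])
```

A higher N50 at a similar total size, as reported for the first strain relative to the second, generally reflects a less fragmented assembly.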
A Silent ABC Transporter Isolated from Streptomyces rochei F20 Induces Multidrug Resistance

PubMed Central

Fernández-Moreno, Miguel A.; Carbó, Lázaro; Cuesta, Trinidad; Vallín, Carlos; Malpartida, Francisco

1998-01-01

In the search for heterologous activators for actinorhodin production in Streptomyces lividans, 3.4 kb of DNA from Streptomyces rochei F20 (a streptothricin producer) were characterized. Subcloning experiments showed that the minimal DNA fragment required for activation was 0.4 kb in size. The activation is mediated by increasing the levels of transcription of the actII-ORF4 gene. Sequencing of the minimal activating fragment did not reveal any clues about its mechanism; nevertheless, it was shown to overlap the 3′ end of two convergent genes, one of whose translated products (ORF2) strongly resembles that of other genes belonging to the ABC transporter superfamily. Computer-assisted analysis of the 3.4-kb DNA sequence showed the 3′ terminus of an open reading frame (ORF), i.e., ORFA, and three complete ORFs (ORF1, ORF2, and ORFB). Searches in the databases with their respective gene products revealed similarities for ORF1 and ORF2 with ATP-binding proteins and transmembrane proteins, respectively, which are found in members of the ABC transporter superfamily. No similarities for ORFA and ORFB were found in the databases. Insertional inactivation of ORF1 and ORF2, their transcription analysis, and their cloning in heterologous hosts suggested that these genes were not expressed under our experimental conditions; however, cloning of ORF1 and ORF2 together (but not separately) under the control of an expressing promoter induced resistance to several chemically different drugs: oleandomycin, erythromycin, spiramycin, doxorubicin, and tetracycline. Thus, this genetic system, named msr, is a new bacterial multidrug ABC transporter. PMID:9696745
De-novo RNA Sequencing and Metabolite Profiling to Identify Genes Involved in Anthocyanin Biosynthesis in Korean Black Raspberry (Rubus coreanus Miquel)

PubMed Central

Rim, Yeonggil; Kumar, Ritesh; Han, Xiao; Lee, Sang Yeol; Lee, Choong Hwan; Kim, Jae-Yean

2014-01-01

The Korean black raspberry (Rubus coreanus Miquel, KB) on ripening is usually consumed as fresh fruit, whereas the unripe KB has been widely used as a source of traditional herbal medicine. Such a stage specific utilization of KB has been assumed due to the changing metabolite profile during fruit ripening process, but so far molecular and biochemical changes during its fruit maturation are poorly understood. To analyze biochemical changes during fruit ripening process at molecular level, firstly, we have sequenced, assembled, and annotated the transcriptome of KB fruits. Over 4.86 Gb of normalized cDNA prepared from fruits was sequenced using Illumina HiSeq™ 2000, and assembled into 43,723 unigenes. Secondly, we have reported that alterations in anthocyanins and proanthocyanidins are the major factors facilitating variations in these stages of fruits. In addition, up-regulation of F3′H1, DFR4 and LDOX1 resulted in the accumulation of cyanidin derivatives during the ripening process of KB, indicating the positive relationship between the expression of anthocyanin biosynthetic genes and the anthocyanin accumulation. Furthermore, the ability of RcMCHI2 (R. coreanus Miquel chalcone flavanone isomerase 2) gene to complement Arabidopsis transparent testa 5 mutant supported the feasibility of our transcriptome library to provide the gene resources for improving plant nutrition and pigmentation. Taken together, these datasets obtained from transcriptome library and metabolic profiling would be helpful to define the gene-metabolite relationships in this non-model plant. PMID:24505466
Concerted evolution of the tandem array encoding primate U2 snRNA occurs in situ, without changing the cytological context of the RNU2 locus.

PubMed Central

Pavelitz, T; Rusché, L; Matera, A G; Scharf, J M; Weiner, A M

1995-01-01

In primates, the tandemly repeated genes encoding U2 small nuclear RNA evolve concertedly, i.e. the sequence of the U2 repeat unit is essentially homogeneous within each species but differs somewhat between species. Using chromosome painting and the NGFR gene as an outside marker, we show that the U2 tandem array (RNU2) has remained at the same chromosomal locus (equivalent to human 17q21) through multiple speciation events over > 35 million years leading to the Old World monkey and hominoid lineages. The data suggest that the U2 tandem repeat, once established in the primate lineage, contained sequence elements favoring perpetuation and concerted evolution of the array in situ, despite a pericentric inversion in chimpanzee, a reciprocal translocation in gorilla and a paracentric inversion in orang utan. Comparison of the 11 kb U2 repeat unit found in baboon and other Old World monkeys with the 6 kb U2 repeat unit in humans and other hominids revealed that an ancestral U2 repeat unit was expanded by insertion of a 5 kb retrovirus bearing 1 kb long terminal repeats (LTRs). Subsequent excision of the provirus by homologous recombination between the LTRs generated a 6 kb U2 repeat unit containing a solo LTR. Remarkably, both junctions between the human U2 tandem array and flanking chromosomal DNA at 17q21 fall within the solo LTR sequence, suggesting a role for the LTR in the origin or maintenance of the primate U2 array. PMID:7828589
Content and organization of the human Ig VH locus: definition of three new VH families and linkage to the Ig CH locus.

PubMed Central

Berman, J E; Mellis, S J; Pollock, R; Smith, C L; Suh, H; Heinke, B; Kowal, C; Surti, U; Chess, L; Cantor, C R

1988-01-01

We present a detailed analysis of the content and organization of the human immunoglobulin VH locus. Human VH genes representing five distinct families were isolated, including novel members belonging to two out of three of the known VH gene families (VH1 and VH3) as well as members of three new families (VH4, VH5, and VH6). We report the nucleotide sequence of 21 novel human VH genes, many of which belong to the three new VH gene families. In addition, we provide a preliminary analysis of the organization of these gene segments over the full extent of the locus. We find that the five multi-segment families (VH1-5) have members interspersed over nearly the full 1500-2000 kb of the VH locus, and estimate that the entire heavy chain locus covers 2500 kb or less. Finally, we provide the first report of the physical linkage of the variable and constant loci of a human Ig gene family by demonstrating that the most proximal known human VH segments lie within 100 kb of the constant region locus. PMID:3396540
Norrie disease: linkage analysis using a 4.2-kb RFLP detected by a human ornithine aminotransferase cDNA probe.

PubMed

Ngo, J T; Bateman, J B; Cortessis, V; Sparkes, R S; Mohandas, T; Inana, G; Spence, M A

1989-05-01

A previous study has shown that the usual DNA marker for Norrie disease, the L1.28 probe which identifies the DXS7 locus, can recombine with the disease locus. In this study, we used a human ornithine aminotransferase (OAT) cDNA, which detects OAT-related DNA sequences mapped to the same region on the X chromosome as that of the L1.28 probe, to investigate the family with Norrie disease who exhibited the recombinational event. When genomic DNA from this family was digested with the PvuII restriction endonuclease, we found a restriction fragment length polymorphism (RFLP) of 4.2 kb in size. This fragment was absent in the affected males and cosegregated with the disease locus; we calculated a lod score of 0.602 at theta = 0.00. No deletion could be detected by chromosomal analysis or on Southern blots with other enzymes. These results suggest that one of the OAT-related sequences on the X chromosome may be in close proximity to the Norrie disease locus and represent the first report which indicates that the OAT cDNA may be useful for the identification of carrier status and/or prenatal diagnosis.
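For readers unfamiliar with lod scores, the figure quoted in this abstract can be reproduced with the standard two-point formula for phase-known meioses. The sketch below is illustrative only: the helper name `lod_score` is hypothetical, and the count of two informative meioses with no recombinants is an assumption chosen because it is consistent with 0.602 (log10 of 4); the abstract itself does not state the number of meioses scored.

```python
import math


def lod_score(n_meioses: int, n_recombinants: int, theta: float) -> float:
    """Two-point lod score for phase-known meioses.

    Z(theta) = log10( theta^k * (1 - theta)^(n - k) / 0.5^n )
    where n is the number of informative meioses and k the recombinants.
    """
    if not 0.0 <= theta <= 0.5:
        raise ValueError("theta must lie in [0, 0.5]")
    if theta == 0.0 and n_recombinants > 0:
        # Any recombinant is impossible under complete linkage.
        return float("-inf")
    linked = (theta ** n_recombinants) * ((1.0 - theta) ** (n_meioses - n_recombinants))
    unlinked = 0.5 ** n_meioses
    return math.log10(linked / unlinked)


# ASSUMPTION: two informative meioses, zero recombinants (not stated in the abstract).
z = lod_score(n_meioses=2, n_recombinants=0, theta=0.0)
print(f"lod at theta = 0: {z:.3f}")
```

A lod score of this size is far below the conventional threshold of 3 for declaring linkage, which is in keeping with the cautious wording ("may be in close proximity") of the abstract.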
Cloning, sequence analysis, and expression in Escherichia coli of a gene coding for a β-mannanase from the extremely thermophilic bacterium Caldocellum saccharolyticum

DOE Office of Scientific and Technical Information (OSTI.GOV)

Luethi, E.; Jasmat, N.B.; Grayling, R.A.

1991-03-01

A λ recombinant phage expressing β-mannanase activity in Escherichia coli has been isolated from a genomic library of the extremely thermophilic anaerobe Caldocellum saccharolyticum. The gene was cloned into pBR322 on a 5-kb BamHI fragment, and its location was obtained by deletion analysis. The sequence of a 2.1-kb fragment containing the mannanase gene has been determined. One open reading frame was found which could code for a protein of Mr 38,904. The mannanase gene (manA) was overexpressed in E. coli by cloning the gene downstream from the lacZ promoter of pUC18. The enzyme was most active at pH 6 and 80 °C and degraded locust bean gum, guar gum, Pinus radiata glucomannan, and konjak glucomannan. The noncoding region downstream from the mannanase gene showed strong homology to celB, a gene coding for a cellulase from the same organism, suggesting that the manA gene might have been inserted into its present position on the C. saccharolyticum genome by homologous recombination.
Identification and analysis of the bacterial endosymbiont specialized for production of the chemotherapeutic natural product ET-743

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schofield, Michael M.; Jain, Sunit; Porat, Daphne

Ecteinascidin 743 (ET-743, Yondelis) is a clinically approved chemotherapeutic natural product isolated from the Caribbean mangrove tunicate Ecteinascidia turbinata. Researchers have long suspected that a microorganism may be the true producer of the anti-cancer drug, but its genome has remained elusive due to our inability to culture the bacterium in the laboratory using standard techniques. Here, we sequenced and assembled the complete genome of the ET-743 producer, Candidatus Endoecteinascidia frumentensis, directly from metagenomic DNA isolated from the tunicate. Analysis of the ~631 kb microbial genome revealed strong evidence of an endosymbiotic lifestyle and extreme genome reduction. Phylogenetic analysis suggested that the producer of the anti-cancer drug is taxonomically distinct from other sequenced microorganisms and could represent a new family of Gammaproteobacteria. The complete genome has also greatly expanded our understanding of ET-743 production and revealed new biosynthetic genes dispersed across more than 173 kb of the small genome. The gene cluster’s architecture and its preservation demonstrate that the drug is likely essential to the interactions of the microorganism with its mangrove tunicate host. Taken together, these studies elucidate the lifestyle of a unique and pharmaceutically important microorganism and highlight the wide diversity of bacteria capable of making potent natural products.
Identification and analysis of the bacterial endosymbiont specialized for production of the chemotherapeutic natural product ET-743

DOE PAGES

Schofield, Michael M.; Jain, Sunit; Porat, Daphne; ...

2015-07-21

Ecteinascidin 743 (ET-743, Yondelis) is a clinically approved chemotherapeutic natural product isolated from the Caribbean mangrove tunicate Ecteinascidia turbinata. Researchers have long suspected that a microorganism may be the true producer of the anti-cancer drug, but its genome has remained elusive due to our inability to culture the bacterium in the laboratory using standard techniques. Here, we sequenced and assembled the complete genome of the ET-743 producer, Candidatus Endoecteinascidia frumentensis, directly from metagenomic DNA isolated from the tunicate. Analysis of the ~631 kb microbial genome revealed strong evidence of an endosymbiotic lifestyle and extreme genome reduction. Phylogenetic analysis suggested that the producer of the anti-cancer drug is taxonomically distinct from other sequenced microorganisms and could represent a new family of Gammaproteobacteria. The complete genome has also greatly expanded our understanding of ET-743 production and revealed new biosynthetic genes dispersed across more than 173 kb of the small genome. The gene cluster’s architecture and its preservation demonstrate that the drug is likely essential to the interactions of the microorganism with its mangrove tunicate host. Taken together, these studies elucidate the lifestyle of a unique and pharmaceutically important microorganism and highlight the wide diversity of bacteria capable of making potent natural products.
Fine genetic mapping of spot blotch resistance gene Sb3 in wheat (Triticum aestivum).

PubMed

Lu, Ping; Liang, Yong; Li, Delin; Wang, Zhengzhong; Li, Wenbin; Wang, Guoxin; Wang, Yong; Zhou, Shenghui; Wu, Qiuhong; Xie, Jingzhong; Zhang, Deyun; Chen, Yongxing; Li, Miaomiao; Zhang, Yan; Sun, Qixin; Han, Chenggui; Liu, Zhiyong

2016-03-01

Spot blotch disease resistance gene Sb3 was mapped to a 0.15 centimorgan (cM) genetic interval spanning a 602 kb physical genomic region on chromosome 3BS. Wheat spot blotch disease, caused by B. sorokiniana, is a devastating disease that can cause severe yield losses. Although inoculum levels can be reduced by planting disease-free seed, treatment of plants with fungicides and crop rotation, genetic resistance is likely to be a robust, economical and environmentally friendly tool in the control of spot blotch. The winter wheat line 621-7-1 confers immune resistance against B. sorokiniana. Genetic analysis indicates that the spot blotch resistance of 621-7-1 is controlled by a single dominant gene, provisionally designated Sb3. Bulked segregant analysis (BSA) and simple sequence repeat (SSR) mapping showed that Sb3 is located on chromosome arm 3BS linked with markers Xbarc133 and Xbarc147. Seven and twelve new polymorphic markers were developed from the Chinese Spring 3BS shotgun survey sequence contigs and 3BS reference sequences, respectively. Finally, Sb3 was mapped in a 0.15 cM genetic interval spanning a 602 kb physical genomic region of Chinese Spring chromosome 3BS. The genetic and physical maps of Sb3 provide a framework for map-based cloning and marker-assisted selection (MAS) of the spot blotch resistance.
Formation of a functional maize centromere after loss of centromeric sequences and gain of ectopic sequences.

PubMed

Zhang, Bing; Lv, Zhenling; Pang, Junling; Liu, Yalin; Guo, Xiang; Fu, Shulan; Li, Jun; Dong, Qianhua; Wu, Hua-Jun; Gao, Zhi; Wang, Xiu-Jie; Han, Fangpu

2013-06-01

The maize (Zea mays) B centromere is composed of B centromere-specific repeats (ZmBs), centromere-specific satellite repeats (CentC), and centromeric retrotransposons of maize (CRM). Here we describe a newly formed B centromere in maize, which has lost CentC sequences and has dramatically reduced CRM and ZmBs sequences, but still retains the molecular features of functional centromeres, such as CENH3, H2A phosphorylation at Thr-133, H3 phosphorylation at Ser-10, and Thr-3 immunostaining signals. This new centromere is stable and can be transmitted to offspring through meiosis. Anti-CENH3 chromatin immunoprecipitation sequencing revealed that a 723-kb region from the short arm of chromosome 9 (9S) was involved in the formation of the new centromere. The 723-kb region, which is gene poor and enriched for transposons, contains two abundant DNA motifs. Genes in the new centromere region are still transcribed. The original 723-kb region showed a higher DNA methylation level compared with native centromeres but was not significantly changed when it was involved in new centromere formation. Our results indicate that functional centromeres may be formed without the known centromere-specific sequences, yet the maintenance of a high DNA methylation level seems to be crucial for the proper function of a new centromere.

Formation of a Functional Maize Centromere after Loss of Centromeric Sequences and Gain of Ectopic Sequences

PubMed Central

Zhang, Bing; Lv, Zhenling; Pang, Junling; Liu, Yalin; Guo, Xiang; Fu, Shulan; Li, Jun; Dong, Qianhua; Wu, Hua-Jun; Gao, Zhi; Wang, Xiu-Jie; Han, Fangpu

2013-01-01

The maize (Zea mays) B centromere is composed of B centromere–specific repeats (ZmBs), centromere-specific satellite repeats (CentC), and centromeric retrotransposons of maize (CRM). Here we describe a newly formed B centromere in maize, which has lost CentC sequences and has dramatically reduced CRM and ZmBs sequences, but still retains the molecular features of functional centromeres, such as CENH3, H2A phosphorylation at Thr-133, H3 phosphorylation at Ser-10, and Thr-3 immunostaining signals. This new centromere is stable and can be transmitted to offspring through meiosis. Anti-CENH3 chromatin immunoprecipitation sequencing revealed that a 723-kb region from the short arm of chromosome 9 (9S) was involved in the formation of the new centromere. The 723-kb region, which is gene poor and enriched for transposons, contains two abundant DNA motifs. Genes in the new centromere region are still transcribed. The original 723-kb region showed a higher DNA methylation level compared with native centromeres but was not significantly changed when it was involved in new centromere formation. Our results indicate that functional centromeres may be formed without the known centromere-specific sequences, yet the maintenance of a high DNA methylation level seems to be crucial for the proper function of a new centromere. PMID:23771890
Dominant Sequences of Human Major Histocompatibility Complex Conserved Extended Haplotypes from HLA-DQA2 to DAXX

PubMed Central

Larsen, Charles E.; Alford, Dennis R.; Trautwein, Michael R.; Jalloh, Yanoh K.; Tarnacki, Jennifer L.; Kunnenkeri, Sushruta K.; Fici, Dolores A.; Yunis, Edmond J.; Awdeh, Zuheir L.; Alper, Chester A.

2014-01-01

We resequenced and phased 27 kb of DNA within 580 kb of the MHC class II region in 158 population chromosomes, most of which were conserved extended haplotypes (CEHs) of European descent or contained their centromeric fragments. We determined the single nucleotide polymorphism and deletion-insertion polymorphism alleles of the dominant sequences from HLA-DQA2 to DAXX for these CEHs. Nine of 13 CEHs remained sufficiently intact to possess a dominant sequence extending at least to DAXX, 230 kb centromeric to HLA-DPB1. We identified the regions centromeric to HLA-DQB1 within which single instances of eight “common” European MHC haplotypes previously sequenced by the MHC Haplotype Project (MHP) were representative of those dominant CEH sequences. Only two MHP haplotypes had a dominant CEH sequence throughout the centromeric and extended class II region and one MHP haplotype did not represent a known European CEH anywhere in the region. We identified the centromeric recombination transition points of other MHP sequences from CEH representation to non-representation. Several CEH pairs or groups shared sequence identity in small blocks but had significantly different (although still conserved for each separate CEH) sequences in surrounding regions. These patterns partly explain strong calculated linkage disequilibrium over only short (tens to hundreds of kilobases) distances in the context of a finite number of observed megabase-length CEHs comprising half a population's haplotypes. Our results provide a clearer picture of European CEH class II allelic structure and population haplotype architecture, improved regional CEH markers, and raise questions concerning regional recombination hotspots. PMID:25299700
UniProtKB/Swiss-Prot, the Manually Annotated Section of the UniProt KnowledgeBase: How to Use the Entry View.

PubMed

Boutet, Emmanuel; Lieberherr, Damien; Tognolli, Michael; Schneider, Michel; Bansal, Parit; Bridge, Alan J; Poux, Sylvain; Bougueleret, Lydie; Xenarios, Ioannis

2016-01-01

The Universal Protein Resource (UniProt, http://www.uniprot.org ) consortium is an initiative of the SIB Swiss Institute of Bioinformatics (SIB), the European Bioinformatics Institute (EBI) and the Protein Information Resource (PIR) to provide the scientific community with a central resource for protein sequences and functional information. The UniProt consortium maintains the UniProt KnowledgeBase (UniProtKB), updated every 4 weeks, and several supplementary databases including the UniProt Reference Clusters (UniRef) and the UniProt Archive (UniParc).The Swiss-Prot section of the UniProt KnowledgeBase (UniProtKB/Swiss-Prot) contains publicly available expertly manually annotated protein sequences obtained from a broad spectrum of organisms. Plant protein entries are produced in the frame of the Plant Proteome Annotation Program (PPAP), with an emphasis on characterized proteins of Arabidopsis thaliana and Oryza sativa. High level annotations provided by UniProtKB/Swiss-Prot are widely used to predict annotation of newly available proteins through automatic pipelines.The purpose of this chapter is to present a guided tour of a UniProtKB/Swiss-Prot entry. We will also present some of the tools and databases that are linked to each entry.
Interchromosomal recombination in Zea mays.

PubMed Central

Hu, W; Timmermans, M C; Messing, J

1998-01-01

A new allele of the 27-kD zein locus in maize has been generated by interchromosomal recombination between chromosomes of two different inbred lines. A continuous patch of at least 11,817 bp of inbred W64A, containing the previously characterized Ra allele of the 27-kD zein gene, has been inserted into the genome of A188 by a single crossover. While both junction sequences are conserved, sequences of the two homologs between these junctions differ considerably. W64A contains the 7313-bp-long retrotransposon, Zeon-1. A188 contains a second copy of the 27-kD zein gene and a 2-kb repetitive element. Therefore, recombination results in a 7.3-kb insertion and a 14-kb deletion compared to the original S+A188 allele. If nonpairing sequences are looped out, 206 single base changes, frequently clustered, are present. The structure of this allele may explain how a recently discovered example of somatic recombination occurred in an A188/W64A hybrid. This would indicate that despite these sequence differences, pairing between these alleles could occur early during plant development. Therefore, such a somatically derived chimeric chromosome can also be heritable and give rise to new alleles. PMID:9799274
A Comparison of the First Two Sequenced Chloroplast Genomes in Asteraceae: Lettuce and Sunflower

DOE Office of Scientific and Technical Information (OSTI.GOV)

Timme, Ruth E.; Kuehl, Jennifer V.; Boore, Jeffrey L.

2006-01-20

Asteraceae is the second largest family of plants, with over 20,000 species. For the past few decades, numerous phylogenetic studies have contributed to our understanding of the evolutionary relationships within this family, including comparisons of the fast evolving chloroplast gene, ndhF, rbcL, as well as non-coding DNA from the trnL intron plus the trnL-trnF intergenic spacer, matK, and, with lesser resolution, psbA-trnH. This culminated in a study by Panero and Funk in 2002 that used over 13,000 bp per taxon for the largest taxonomic revision of Asteraceae in over a hundred years. Still, some uncertainties remain, and it would be very useful to have more information on the relative rates of sequence evolution among various genes and on genome structure as a potential set of phylogenetic characters to help guide future phylogenetic structures. By way of contributing to this, we report the first two complete chloroplast genome sequences from members of the Asteraceae, those of Helianthus annuus and Lactuca sativa. These plants belong to two distantly related subfamilies, Asteroideae and Cichorioideae, respectively. In addition to these, there is only one other published chloroplast genome sequence for any plant within the larger group called Euasterids II, that of Panax ginseng (Araliaceae, 156,318 bps, AY582139). Early chloroplast genome mapping studies demonstrated that H. annuus and L. sativa share a 22 kb inversion relative to members of the subfamily Barnadesioideae. By comparison to outgroups, this inversion was shown to be derived, indicating that the Asteroideae and Cichorioideae are more closely related than either is to the Barnadesioideae. A later sequencing study found that taxa that share this 22 kb inversion also contain within this region a second, smaller, 3.3 kb inversion. These sequences also enable an analysis of patterns of shared repeats in the genomes at fine level and of RNA editing by comparison to available EST sequences. In addition, since both of these species are crop plants, their complete genome sequence will facilitate development of chloroplast genetic engineering technology, as in recent studies from Daniell's lab. Knowing the exact sequence from spacer regions is crucial for introducing transgenes into the chloroplast genome.
A Primary Assembly of a Bovine Haplotype Block Map Based on a 15,036-Single-Nucleotide Polymorphism Panel Genotyped in Holstein–Friesian Cattle

PubMed Central

Khatkar, Mehar S.; Zenger, Kyall R.; Hobbs, Matthew; Hawken, Rachel J.; Cavanagh, Julie A. L.; Barris, Wes; McClintock, Alexander E.; McClintock, Sara; Thomson, Peter C.; Tier, Bruce; Nicholas, Frank W.; Raadsma, Herman W.

2007-01-01

Analysis of data on 1000 Holstein–Friesian bulls genotyped for 15,036 single-nucleotide polymorphisms (SNPs) has enabled genomewide identification of haplotype blocks and tag SNPs. A final subset of 9195 SNPs in Hardy–Weinberg equilibrium and mapped on autosomes on the bovine sequence assembly (release Btau 3.1) was used in this study. The average intermarker spacing was 251.8 kb. The average minor allele frequency (MAF) was 0.29 (0.05–0.5). Following recent precedents in human HapMap studies, a haplotype block was defined where 95% of combinations of SNPs within a region are in very high linkage disequilibrium. A total of 727 haplotype blocks consisting of ≥3 SNPs were identified. The average block length was 69.7 ± 7.7 kb, which is ∼5–10 times larger than in humans. These blocks comprised a total of 2964 SNPs and covered 50,638 kb of the sequence map, which constitutes 2.18% of the length of all autosomes. A set of tag SNPs, which will be useful for further fine-mapping studies, has been identified. Overall, the results suggest that as many as 75,000–100,000 tag SNPs would be needed to track all important haplotype blocks in the bovine genome. This would require ∼250,000 SNPs in the discovery phase. PMID:17435229
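The block statistics in this abstract can be cross-checked against one another with a few lines of arithmetic. The sketch below uses only the numbers quoted above; the variable names are illustrative, and the implied autosome length is a back-calculation from the stated 2.18% coverage rather than a figure reported by the authors.

```python
# Figures quoted in the abstract.
n_blocks = 727              # haplotype blocks with >= 3 SNPs
mean_block_kb = 69.7        # average block length (kb)
covered_kb = 50_638         # sequence covered by blocks (kb)
coverage_fraction = 0.0218  # 2.18% of total autosome length
snps_in_blocks = 2964

# Blocks x mean length should roughly reproduce the covered sequence.
approx_covered_kb = n_blocks * mean_block_kb

# Back-calculated autosome length implied by the coverage percentage.
implied_autosome_kb = covered_kb / coverage_fraction

# Average number of SNPs per block.
snps_per_block = snps_in_blocks / n_blocks

print(f"blocks x mean length : {approx_covered_kb:,.0f} kb")
print(f"implied autosome size: {implied_autosome_kb / 1e6:.2f} Gb")
print(f"SNPs per block       : {snps_per_block:.2f}")
```

The product of block count and mean length agrees with the reported coverage to within about 0.1%, and the implied autosome length of roughly 2.3 Gb shows how sparse a 15,036-SNP panel is relative to the genome.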
Discovery of human inversion polymorphisms by comparative analysis of human and chimpanzee DNA sequence assemblies.

PubMed

Feuk, Lars; MacDonald, Jeffrey R; Tang, Terence; Carson, Andrew R; Li, Martin; Rao, Girish; Khaja, Razi; Scherer, Stephen W

2005-10-01

With a draft genome-sequence assembly for the chimpanzee available, it is now possible to perform genome-wide analyses to identify, at a submicroscopic level, structural rearrangements that have occurred between chimpanzees and humans. The goal of this study was to investigate chromosomal regions that are inverted between the chimpanzee and human genomes. Using the net alignments for the builds of the human and chimpanzee genome assemblies, we identified a total of 1,576 putative regions of inverted orientation, covering more than 154 mega-bases of DNA. The DNA segments are distributed throughout the genome and range from 23 base pairs to 62 mega-bases in length. For the 66 inversions more than 25 kilobases (kb) in length, 75% were flanked on one or both sides by (often unrelated) segmental duplications. Using PCR and fluorescence in situ hybridization we experimentally validated 23 of 27 (85%) semi-randomly chosen regions; the largest novel inversion confirmed was 4.3 mega-bases at human Chromosome 7p14. Gorilla was used as an out-group to assign ancestral status to the variants. All experimentally validated inversion regions were then assayed against a panel of human samples and three of the 23 (13%) regions were found to be polymorphic in the human genome. These polymorphic inversions include 730 kb (at 7p22), 13 kb (at 7q11), and 1 kb (at 16q24) fragments with a 5%, 30%, and 48% minor allele frequency, respectively. Our results suggest that inversions are an important source of variation in primate genome evolution. The finding of at least three novel inversion polymorphisms in humans indicates this type of structural variation may be a more common feature of our genome than previously realized.
The high-level expression of human tissue plasminogen activator in the milk of transgenic mice with hybrid gene locus strategy.

PubMed

Zhou, Yanrong; Lin, Yanli; Wu, Xiaojie; Xiong, Fuyin; Lv, Yuemeng; Zheng, Tao; Huang, Peitang; Chen, Hongxing

2012-02-01

Transgene expression for the mammary gland bioreactor aimed at producing recombinant proteins requires optimized expression vector construction. Previously we presented a hybrid gene locus strategy, which was originally tested with human lactoferrin (hLF) as the target transgene and achieved an extremely high expression level of rhLF, up to 29.8 g/l in mouse milk. Here, to demonstrate the broad application of this strategy, another hybrid gene locus, the 38.4-kb mWAP-htPA locus, was constructed, in which the 3-kb genomic coding sequence in the 24-kb mouse whey acidic protein (mWAP) gene locus was substituted by the 17.4-kb genomic coding sequence of human tissue plasminogen activator (htPA), exactly from the start codon to the end codon. Five corresponding transgenic mouse lines were generated, and the highest expression level of rhtPA in the milk reached 3.3 g/l. Our strategy will provide a universal way for the large-scale production of pharmaceutical proteins in the mammary gland of transgenic animals.
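The size of the hybrid locus follows directly from the substitution described in this abstract. A minimal check, using only the sizes quoted there (variable names are illustrative):

```python
# Sizes quoted in the abstract, in kilobases.
mwap_locus_kb = 24.0      # mouse whey acidic protein gene locus
mwap_coding_kb = 3.0      # genomic coding sequence removed from the locus
htpa_coding_kb = 17.4     # human tPA genomic coding sequence inserted

# Replace the mWAP coding sequence with the htPA coding sequence.
hybrid_locus_kb = mwap_locus_kb - mwap_coding_kb + htpa_coding_kb
print(f"hybrid locus: {hybrid_locus_kb:.1f} kb")
```

The result matches the 38.4 kb stated for the mWAP-htPA construct, with the 21 kb of retained mWAP flanking sequence accounting for more than half of the locus.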
Autonomous replication and addition of telomerelike sequences to DNA microinjected into Paramecium tetraurelia macronuclei.

PubMed Central

Gilley, D; Preer, J R; Aufderheide, K J; Polisky, B

1988-01-01

Paramecium tetraurelia can be transformed by microinjection of cloned serotype A gene sequences into the macronucleus. Transformants are detected by their ability to express serotype A surface antigen from the injected templates. After injection, the DNA is converted from a supercoiled form to a linear form by cleavage at nonrandom sites. The linear form appears to replicate autonomously as a unit-length molecule and is present in transformants at high copy number. The injected DNA is further processed by the addition of Paramecium-type telomeric sequences to the termini of the linear DNA. To examine the fate of injected linear DNA molecules, plasmid pSA14SB DNA containing the A gene was cleaved into two linear pieces, a 14-kilobase (kb) piece containing the A gene and flanking sequences and a 2.2-kb piece consisting of the procaryotic vector. In transformants expressing the A gene, we observed that two linear DNA species were present which correspond to the two species injected. Both species had Paramecium telomerelike sequences added to their termini. For the 2.2-kb DNA, we show that the site of addition of the telomerelike sequences is directly at one terminus and within one nucleotide of the other terminus. These results indicate that injected procaryotic DNA is capable of autonomous replication in Paramecium macronuclei and that telomeric addition in the macronucleus does not require specific recognition sequences. PMID:3211128
Molecular analysis of beta-globin gene mutations among Thai beta-thalassemia children: results from a single center study

PubMed Central

Boonyawat, Boonchai; Monsereenusorn, Chalinee; Traivaree, Chanchai

2014-01-01

Background: Beta-thalassemia is one of the most common genetic disorders in Thailand. Clinical phenotype ranges from silent carrier to clinically manifested conditions including severe beta-thalassemia major and mild beta-thalassemia intermedia. Objective: This study aimed to characterize the spectrum of beta-globin gene mutations in pediatric patients who were followed up in Phramongkutklao Hospital. Patients and methods: Eighty unrelated beta-thalassemia patients were enrolled in this study, including 57 with beta-thalassemia/hemoglobin E, eight with homozygous beta-thalassemia, and 15 with heterozygous beta-thalassemia. Mutation analysis was performed by multiplex amplification refractory mutation system (M-ARMS), direct DNA sequencing of the beta-globin gene, and gap polymerase chain reaction for 3.4 kb deletion detection. Results: A total of 13 different beta-thalassemia mutations were identified among 88 alleles. The most common mutation was codon 41/42 (-TCTT) (37.5%), followed by codon 17 (A>T) (26.1%), IVS-I-5 (G>C) (8%), IVS-II-654 (C>T) (6.8%), IVS-I-1 (G>T) (4.5%), and codon 71/72 (+A) (2.3%); all six of these common mutations (85.2% of alleles) were detected by M-ARMS. Six uncommon mutations (10.2%) were identified by DNA sequencing: codon 35 (C>A) at 4.5%, and the initiation codon mutation (ATG>AGG), codon 15 (G>A), codon 19 (A>G), codon 27/28 (+C), and codon 123/124/125 (-ACCCCACC) at 1.1% each. The 3.4 kb deletion was detected at 4.5%. The most common genotype of beta-thalassemia major patients was codon 41/42 (-TCTT)/codon 26 (G>A) or betaE, accounting for 40%. Conclusion: All of the beta-thalassemia alleles have been characterized by a combination of techniques including M-ARMS, DNA sequencing, and gap polymerase chain reaction for 3.4 kb deletion detection. Thirteen mutations account for 100% of the beta-thalassemia genes among the pediatric patients in our study. PMID:25525381
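The allele total and percentages in this abstract are mutually consistent, which a short tally makes explicit. The sketch below rests on two assumptions not spelled out in the abstract: each beta-thalassemia/hemoglobin E or heterozygous patient contributes one beta-thalassemia allele and each homozygous patient contributes two, and the per-mutation allele counts are back-calculated by rounding the reported percentages of 88 alleles.

```python
# Patients per group and ASSUMED beta-thalassemia alleles contributed per patient.
groups = {
    "beta-thalassemia/HbE": (57, 1),
    "homozygous beta-thalassemia": (8, 2),
    "heterozygous beta-thalassemia": (15, 1),
}
total_alleles = sum(n * per_patient for n, per_patient in groups.values())

# Reported percentages of alleles.
common_pct = [37.5, 26.1, 8.0, 6.8, 4.5, 2.3]  # six mutations found by M-ARMS
uncommon_pct = [4.5, 1.1, 1.1, 1.1, 1.1, 1.1]  # six mutations found by sequencing
deletion_pct = [4.5]                           # 3.4 kb deletion (gap PCR)


def implied_counts(percentages, n_alleles):
    """Back-calculate whole-allele counts from rounded percentages."""
    return [round(p * n_alleles / 100) for p in percentages]


common = implied_counts(common_pct, total_alleles)
uncommon = implied_counts(uncommon_pct, total_alleles)
deletion = implied_counts(deletion_pct, total_alleles)

print("total alleles:", total_alleles)
print("common:", sum(common), "uncommon:", sum(uncommon), "deletion:", sum(deletion))
print("mutations:", len(common_pct) + len(uncommon_pct) + len(deletion_pct))
```

Under these assumptions the three detection methods together account for every allele, and the mutation count comes to the 13 reported in the abstract.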
Prevotella paludivivens sp. nov., a novel strictly anaerobic, Gram-negative, hemicellulose-decomposing bacterium isolated from plant residue and rice roots in irrigated rice-field soil.

PubMed

Ueki, Atsuko; Akasaka, Hiroshi; Satoh, Atsuya; Suzuki, Daisuke; Ueki, Katsuji

2007-08-01

Two strictly anaerobic bacterial strains, KB7(T) and A42, were isolated from rice plant residue and living rice roots, respectively, from irrigated rice-field soil in Japan. These two strains were closely related to each other with 16S rRNA gene sequence similarity of 99.8 %. Both strains showed almost the same physiological properties. Cells were Gram-negative, non-motile, non-spore-forming rods. Growth was remarkably stimulated by the addition of haemin to the medium. The strains utilized various saccharides including xylan, xylose, pectin and carboxymethylcellulose and produced acetate and succinate with small amounts of formate and malate. The strains grew at 10-40 degrees C; optimum growth was observed at 30 degrees C and pH 5.7-6.7. Oxidase, catalase and nitrate-reducing activities were not detected. Aesculin was hydrolysed. The major cellular fatty acids were anteiso-C(15 : 0), iso-C(15 : 0), C(15 : 0) and iso-C(17 : 0) 3-OH. Menaquinones MK-11 and MK-11(H(2)) were the major respiratory quinones and the genomic DNA G+C content was 39.2 mol%. Phylogenetic analysis based on 16S rRNA gene sequences placed both strains in the phylum Bacteroidetes. 16S rRNA gene sequence analysis showed that the most related species to both strains was Prevotella oulorum (92.8-92.9 % similarity). Prevotella veroralis and Prevotella melaninogenica were the next most closely related known species with sequence similarities of 91.9-92.4 %. Based on differences in the phylogenetic, ecological, physiological and chemotaxonomic characteristics between the two isolates and related species, it is proposed that strains KB7(T) and A42 represent a novel species, Prevotella paludivivens sp. nov. This is the first described Prevotella species derived from a natural habitat; all other Prevotella species are from mammalian sources. The type strain of Prevotella paludivivens is KB7(T) (=JCM 13650(T)=DSM 17968(T)).
NcoI and TaqI RFLPs for human M creatine kinase (CKM)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Perryman, M.B.; Hejtmancik, J.F.; Ashizawa, Tetsuo

1988-09-12

Probe pHMCKUT contains a 135 bp cDNA fragment inserted into pGEM 3. The probe corresponds to nucleotides 1,201 to 1,336 located in the 3′ untranslated region of human M creatine kinase. The probe is specific for human M creatine kinase and does not hybridize to human B creatine kinase sequences. NcoI identifies a two-allele polymorphism of a band at either 2.5 kb or 3.6 kb. TaqI identifies a two-allele polymorphism at either 3.8 kb or 4.5 kb. Human M creatine kinase has been localized to chromosome 19q. Autosomal co-dominant inheritance was shown in six informative Caucasian families.
Direct molecular regulation of the myogenic determination gene Myf5 by Pax3, with modulation by Six1/4 factors, is exemplified by the -111 kb-Myf5 enhancer.

PubMed

Daubas, Philippe; Buckingham, Margaret E

2013-04-15

The Myf5 gene plays an important role in myogenic determination during mouse embryo development. Multiple genomic regions of the Mrf4-Myf5 locus have been characterised as enhancer sequences responsible for the complex spatiotemporal expression of the Myf5 gene at the onset of myogenesis. These include an enhancer sequence, located at -111 kb upstream of the Myf5 transcription start site, which is responsible for Myf5 activation in ventral somitic domains (Ribas et al., 2011. Dev. Biol. 355, 372-380). We show that the -111 kb-Myf5 enhancer also directs transgene expression in some limb muscles, and is active at foetal as well as embryonic stages. We have carried out further characterisation of the regulation of this enhancer and show that the paired-box Pax3 transcription factor binds to it in vitro as in vivo, and that Pax binding sites are essential for its activity. This requirement is independent of the previously reported regulation by TEAD transcription factors. Six1/4 which, like Pax3, are important upstream regulators of myogenesis, also bind in vivo to sites in the -111 kb-Myf5 enhancer and modulate its activity. The -111 kb-Myf5 enhancer therefore shares common functional characteristics with another Myf5 regulatory sequence, the hypaxial and limb 145 bp-Myf5 enhancer, both being directly regulated in vivo by Pax3 and Six1/4 proteins. However, in the case of the -111 kb-Myf5 enhancer, Six has less effect and we conclude that Pax regulation plays a major role in controlling this aspect of the Myf5 gene expression at the onset of myogenesis in the embryo. Copyright © 2013 Elsevier Inc. All rights reserved.
Identification of small exonic CNV from whole-exome sequence data and application to autism spectrum disorder.

PubMed

Poultney, Christopher S; Goldberg, Arthur P; Drapeau, Elodie; Kou, Yan; Harony-Nicolas, Hala; Kajiwara, Yuji; De Rubeis, Silvia; Durand, Simon; Stevens, Christine; Rehnström, Karola; Palotie, Aarno; Daly, Mark J; Ma'ayan, Avi; Fromer, Menachem; Buxbaum, Joseph D

2013-10-03

Copy number variation (CNV) is an important determinant of human diversity and plays important roles in susceptibility to disease. Most studies of CNV carried out to date have made use of chromosome microarray and have had a lower size limit for detection of about 30 kilobases (kb). With the emergence of whole-exome sequencing studies, we asked whether such data could be used to reliably call rare exonic CNV in the size range of 1-30 kilobases (kb), making use of the eXome Hidden Markov Model (XHMM) program. By using both transmission information and validation by molecular methods, we confirmed that small CNV encompassing as few as three exons can be reliably called from whole-exome data. We applied this approach to an autism case-control sample (n = 811, mean per-target read depth = 161) and observed a significant increase in the burden of rare (MAF ≤1%) 1-30 kb CNV, 1-30 kb deletions, and 1-10 kb deletions in ASD. CNV in the 1-30 kb range frequently hit just a single gene, and we were therefore able to carry out enrichment and pathway analyses, where we observed enrichment for disruption of genes in cytoskeletal and autophagy pathways in ASD. In summary, our results showed that XHMM provided an effective means to assess small exonic CNV from whole-exome data, indicated that rare 1-30 kb exonic deletions could contribute to risk in up to 7% of individuals with ASD, and implicated a candidate pathway in developmental delay syndromes. Copyright © 2013 The American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
A comparative genomics strategy for targeted discovery of single-nucleotide polymorphisms and conserved-noncoding sequences in orphan crops.

PubMed

Feltus, F A; Singh, H P; Lohithaswa, H C; Schulze, S R; Silva, T D; Paterson, A H

2006-04-01

Completed genome sequences provide templates for the design of genome analysis tools in orphan species lacking sequence information. To demonstrate this principle, we designed 384 PCR primer pairs to conserved exonic regions flanking introns, using Sorghum/Pennisetum expressed sequence tag alignments to the Oryza genome. Conserved-intron scanning primers (CISPs) amplified single-copy loci at 37% to 80% success rates in taxa that sample much of the approximately 50-million years of Poaceae divergence. While the conserved nature of exons fostered cross-taxon amplification, the lesser evolutionary constraints on introns enhanced single-nucleotide polymorphism detection. For example, in eight rice (Oryza sativa) genotypes, polymorphism averaged 12.1 per kb in introns but only 3.6 per kb in exons. Curiously, among 124 CISPs evaluated across Oryza, Sorghum, Pennisetum, Cynodon, Eragrostis, Zea, Triticum, and Hordeum, 23 (18.5%) seemed to be subject to rigid intron size constraints that were independent of per-nucleotide DNA sequence variation. Furthermore, we identified 487 conserved-noncoding sequence motifs in 129 CISP loci. A large CISP set (6,062 primer pairs, amplifying introns from 1,676 genes) designed using an automated pipeline showed generally higher abundance in recombinogenic than in nonrecombinogenic regions of the rice genome, thus providing relatively even distribution along genetic maps. CISPs are an effective means to explore poorly characterized genomes for both DNA polymorphism and noncoding sequence conservation on a genome-wide or candidate gene basis, and also provide anchor points for comparative genomics across a diverse range of species.
A Comparative Genomics Strategy for Targeted Discovery of Single-Nucleotide Polymorphisms and Conserved-Noncoding Sequences in Orphan Crops

PubMed Central

Feltus, F.A.; Singh, H.P.; Lohithaswa, H.C.; Schulze, S.R.; Silva, T.D.; Paterson, A.H.

2006-01-01

Completed genome sequences provide templates for the design of genome analysis tools in orphan species lacking sequence information. To demonstrate this principle, we designed 384 PCR primer pairs to conserved exonic regions flanking introns, using Sorghum/Pennisetum expressed sequence tag alignments to the Oryza genome. Conserved-intron scanning primers (CISPs) amplified single-copy loci at 37% to 80% success rates in taxa that sample much of the approximately 50-million years of Poaceae divergence. While the conserved nature of exons fostered cross-taxon amplification, the lesser evolutionary constraints on introns enhanced single-nucleotide polymorphism detection. For example, in eight rice (Oryza sativa) genotypes, polymorphism averaged 12.1 per kb in introns but only 3.6 per kb in exons. Curiously, among 124 CISPs evaluated across Oryza, Sorghum, Pennisetum, Cynodon, Eragrostis, Zea, Triticum, and Hordeum, 23 (18.5%) seemed to be subject to rigid intron size constraints that were independent of per-nucleotide DNA sequence variation. Furthermore, we identified 487 conserved-noncoding sequence motifs in 129 CISP loci. A large CISP set (6,062 primer pairs, amplifying introns from 1,676 genes) designed using an automated pipeline showed generally higher abundance in recombinogenic than in nonrecombinogenic regions of the rice genome, thus providing relatively even distribution along genetic maps. CISPs are an effective means to explore poorly characterized genomes for both DNA polymorphism and noncoding sequence conservation on a genome-wide or candidate gene basis, and also provide anchor points for comparative genomics across a diverse range of species. PMID:16607031
Comparative genomics of citric-acid producing Aspergillus niger ATCC 1015 versus enzyme-producing CBS 513.88

DOE Office of Scientific and Technical Information (OSTI.GOV)

Andersen, Mikael R.; Salazar, Margarita; Schaap, Peter

2011-06-01

The filamentous fungus Aspergillus niger exhibits great diversity in its phenotype. It is found globally, both as marine and terrestrial strains, produces both organic acids and hydrolytic enzymes in high amounts, and some isolates exhibit pathogenicity. Although the genome of an industrial enzyme-producing A. niger strain (CBS 513.88) has already been sequenced, the versatility and diversity of this species compels additional exploration. We therefore undertook whole genome sequencing of the acidogenic A. niger wild type strain (ATCC 1015), and produced a genome sequence of very high quality. Only 15 gaps are present in the sequence and half the telomeric regions have been elucidated. Moreover, sequence information from ATCC 1015 was utilized to improve the genome sequence of CBS 513.88. Chromosome-level comparisons uncovered several genome rearrangements, deletions, a clear case of strain-specific horizontal gene transfer, and identification of 0.8 megabase of novel sequence. Single nucleotide polymorphisms per kilobase (SNPs/kb) between the two strains were found to be exceptionally high (average: 7.8, maximum: 160 SNPs/kb). High variation within the species was confirmed with exo-metabolite profiling and phylogenetics. Detailed lists of alleles were generated, and genotypic differences were observed to accumulate in metabolic pathways essential to acid production and protein synthesis. A transcriptome analysis revealed up-regulation of the electron transport chain, specifically the alternative oxidative pathway in ATCC 1015, while CBS 513.88 showed significant up-regulation of genes associated with biosynthesis of amino acids that are abundant in glucoamylase A, tRNA-synthases and protein transporters.
Construction of a plant-transformation-competent BIBAC library and genome sequence analysis of polyploid Upland cotton (Gossypium hirsutum L.)

PubMed Central

2013-01-01

Background: Cotton, one of the world’s leading crops, is important to the world’s textile and energy industries, and is a model species for studies of plant polyploidization, cellulose biosynthesis and cell wall biogenesis. Here, we report the construction of a plant-transformation-competent binary bacterial artificial chromosome (BIBAC) library and comparative genome sequence analysis of polyploid Upland cotton (Gossypium hirsutum L.) with one of its diploid putative progenitor species, G. raimondii Ulbr. Results: We constructed the cotton BIBAC library in a vector competent for high-molecular-weight DNA transformation in different plant species through either Agrobacterium or particle bombardment. The library contains 76,800 clones with an average insert size of 135 kb, providing an approximate 99% probability of obtaining at least one positive clone from the library using a single-copy probe. The quality and utility of the library were verified by identifying BIBACs containing genes important for fiber development, fiber cellulose biosynthesis, seed fatty acid metabolism, cotton-nematode interaction, and bacterial blight resistance. In order to gain an insight into the Upland cotton genome and its relationship with G. raimondii, we sequenced nearly 10,000 BIBAC ends (BESs) randomly selected from the library, generating approximately one BES for every 250 kb along the Upland cotton genome. The retroelement Gypsy/DIRS1 family predominates in the Upland cotton genome, accounting for over 77% of all transposable elements. From the BESs, we identified 1,269 simple sequence repeats (SSRs), of which 1,006 were new, thus providing additional markers for cotton genome research. Surprisingly, comparative sequence analysis showed that Upland cotton is much more diverged from G. raimondii at the genomic sequence level than expected. There seems to be no significant difference between the relationships of the Upland cotton D- and A-subgenomes with the G. raimondii genome, even though G. raimondii contains a D genome (D5). Conclusions: The library represents the first BIBAC library in cotton and related species, thus providing tools useful for integrative physical mapping, large-scale genome sequencing and large-scale functional analysis of the Upland cotton genome. Comparative sequence analysis provides insights into the Upland cotton genome, and a possible mechanism underlying the divergence and evolution of polyploid Upland cotton from its diploid putative progenitor species, G. raimondii. PMID:23537070
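The 99% figure quoted for the BIBAC library can be checked with the standard Clarke-Carbon library-coverage formula, P = 1 - (1 - I/G)^N. The abstract does not say how the authors computed it, so the sketch below is only a plausibility check. The genome size is an assumption inferred from the abstract's own numbers (about 10,000 BESs at roughly one per 250 kb, giving about 2.5 Gb), and the function name is hypothetical.

```python
def clone_probability(n_clones: int, insert_bp: float, genome_bp: float) -> float:
    """Clarke-Carbon estimate: chance that at least one of n_clones random
    clones of size insert_bp contains a given single-copy locus."""
    fraction = insert_bp / genome_bp          # genome fraction per clone
    return 1.0 - (1.0 - fraction) ** n_clones


n_clones = 76_800            # from the abstract
insert_bp = 135_000          # average insert size, from the abstract
# ASSUMPTION: genome size inferred as ~10,000 BESs x 250 kb per BES.
genome_bp = 10_000 * 250_000

coverage = n_clones * insert_bp / genome_bp   # genome equivalents in the library
p = clone_probability(n_clones, insert_bp, genome_bp)

print(f"coverage ~ {coverage:.2f}x, P(at least one clone) ~ {p:.3f}")
```

Under this assumed genome size the library holds a little over four genome equivalents, and the estimate lands just below 0.99, in line with the "approximate 99%" stated in the abstract.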
Identification of an estrogen response element in the 3'-flanking region of the murine c-fos protooncogene.

PubMed

Hyder, S M; Stancel, G M; Nawaz, Z; McDonnell, D P; Loose-Mitchell, D S

1992-09-05

We have used transient transfection assays with reporter plasmids expressing chloramphenicol acetyltransferase, linked to regions of mouse c-fos, to identify a specific estrogen response element (ERE) in this protooncogene. This element is located in the untranslated 3'-flanking region of the c-fos gene, 5 kilobases (kb) downstream from the c-fos promoter and 1.5 kb downstream of the poly(A) signal. This element confers estrogen responsiveness to chloramphenicol acetyltransferase reporters linked to both the herpes simplex virus thymidine kinase promoter and the homologous c-fos promoter. Deletion analysis localized the response element to a 200-base pair fragment which contains the element GGTCACCACAGCC that resembles the consensus ERE sequence GGTCACAGTGACC originally identified in Xenopus vitellogenin A2 gene. A synthetic 36-base pair oligodeoxynucleotide containing this c-fos sequence conferred estrogen inducibility to the thymidine kinase promoter. The corresponding sequence also induced reporter activity when present in the c-fos gene fragment 3 kb from the thymidine kinase promoter. Gel-shift experiments demonstrated that synthetic oligonucleotides containing either the consensus ERE or the c-fos element bind human estrogen receptor obtained from a yeast expression system. However, the mobility of the shifted band is faster for the fos-ERE-complex than the consensus ERE complex suggesting that the three-dimensional structure of the protein-DNA complexes is different or that other factors are differentially involved in the two reactions. When the 5'-GGTCA sequence present in the c-fos ERE is mutated to 5'-TTTCA, transcriptional activation and receptor binding activities are both lost. Mutation of the CAGCC-3' element corresponding to the second half-site of the c-fos sequence also led to the loss of receptor binding activity, suggesting that both half-sites of this element are involved in this function. 
The estrogen induction mediated by either the c-fos or the consensus ERE was blunted by the antiestrogen tamoxifen. Based on these studies, we believe the 3'-fos ERE sequence we have identified may be a major cis-acting element involved in the physiological regulation of the gene by estrogens in vivo.
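The two 13-nucleotide elements quoted in this abstract can be compared position by position to see where the c-fos element departs from the consensus ERE. The following is a minimal sketch; the helper name `mismatch_positions` is introduced here for illustration and is not from the paper.

```python
def mismatch_positions(a: str, b: str) -> list[int]:
    """Return 1-based positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    return [i for i, (x, y) in enumerate(zip(a, b), start=1) if x != y]


C_FOS_ERE = "GGTCACCACAGCC"      # element reported in the c-fos 3'-flanking region
CONSENSUS_ERE = "GGTCACAGTGACC"  # consensus ERE (Xenopus vitellogenin A2)

diff = mismatch_positions(C_FOS_ERE, CONSENSUS_ERE)
print(f"{len(diff)} of {len(C_FOS_ERE)} positions differ: {diff}")
```

The first half-site (5'-GGTCA) is identical in both sequences; all differences fall in the spacer and the start of the second half-site, which matches the abstract's description of the c-fos element as one that "resembles" the consensus.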
Community analysis of a full-scale anaerobic bioreactor treating paper mill wastewater.

PubMed

Roest, Kees; Heilig, Hans G H J; Smidt, Hauke; de Vos, Willem M; Stams, Alfons J M; Akkermans, Antoon D L

2005-03-01

To get insight into the microbial community of an Upflow Anaerobic Sludge Blanket reactor treating paper mill wastewater, conventional microbiological methods were combined with 16S rRNA gene analyses. Particular attention was paid to microorganisms able to degrade propionate or butyrate in the presence or absence of sulphate. Serial enrichment dilutions allowed estimating the number of microorganisms per ml sludge that could use butyrate with or without sulphate (10^5), propionate without sulphate (10^6), or propionate and sulphate (10^8). Quantitative RNA dot-blot hybridisation indicated that Archaea were two times more abundant in the microbial community of anaerobic sludge than Bacteria. The microbial community composition was further characterised by 16S rRNA-gene-targeted Denaturing Gradient Gel Electrophoresis (DGGE) fingerprinting, and via cloning and sequencing of dominant amplicons from the bacterial and archaeal patterns. Most of the nearly full length (approximately 1.45 kb) bacterial 16S rRNA gene sequences showed less than 97% similarity to sequences present in public databases, in contrast to the archaeal clones (approximately 1.3 kb) that were highly similar to known sequences. While Methanosaeta was found as the most abundant genus, Crenarchaeote relatives were also identified. The microbial community was relatively stable over a period of 3 years (samples taken in July 1999, May 2001, March 2002 and June 2002) as indicated by the high similarity index calculated from DGGE profiles (81.9 ± 2.7% for Bacteria and 75.1 ± 3.1% for Archaea). 16S rRNA gene sequence analysis indicated the presence of unknown and yet uncultured microorganisms, but also showed that known sulphate-reducing bacteria and syntrophic fatty acid-oxidising microorganisms dominated the enrichments.

Evidence for Widespread Reticulate Evolution within Human Duplicons

PubMed Central

Jackson, Michael S.; Oliver, Karen; Loveland, Jane; Humphray, Sean; Dunham, Ian; Rocchi, Mariano; Viggiano, Luigi; Park, Jonathan P.; Hurles, Matthew E.; Santibanez-Koref, Mauro

2005-01-01

Approximately 5% of the human genome consists of segmental duplications that can cause genomic mutations and may play a role in gene innovation. Reticulate evolutionary processes, such as unequal crossing-over and gene conversion, are known to occur within specific duplicon families, but the broader contribution of these processes to the evolution of human duplications remains poorly characterized. Here, we use phylogenetic profiling to analyze multiple alignments of 24 human duplicon families that span >8 Mb of DNA. Our results indicate that none of them are evolving independently, with all alignments showing sharp discontinuities in phylogenetic signal consistent with reticulation. To analyze these results in more detail, we have developed a quartet method that estimates the relative contribution of nucleotide substitution and reticulate processes to sequence evolution. Our data indicate that most of the duplications show a highly significant excess of sites consistent with reticulate evolution, compared with the number expected by nucleotide substitution alone, with 15 of 30 alignments showing a >20-fold excess over that expected. Using permutation tests, we also show that at least 5% of the total sequence shares 100% sequence identity because of reticulation, a figure that includes 74 independent tracts of perfect identity >2 kb in length. Furthermore, analysis of a subset of alignments indicates that the density of reticulation events is as high as 1 every 4 kb. These results indicate that phylogenetic relationships within recently duplicated human DNA can be rapidly disrupted by reticulate evolution. This finding has important implications for efforts to finish the human genome sequence, complicates comparative sequence analysis of duplicon families, and could profoundly influence the tempo of gene-family evolution. PMID:16252241
Cloning and Characterization of the Scalloped Region of Drosophila Melanogaster

PubMed Central

Campbell, S. D.; Duttaroy, A.; Katzen, A. L.; Chovnick, A.

1991-01-01

Viable mutants of the scalloped gene (sd) of Drosophila melanogaster exhibit defects that can include gapping of the wing margin and ectopic bristle formation on the wing. Lethal sd alleles characterized in the present study now implicate this gene in a genetic function essential for normal development. In order to further characterize the developmental role of this gene, we have undertaken to clone and characterize the region where sd maps. A P[ry(+)] transposon insertion at 13F associated with sd([ry+2216]) served as the starting point for a 42-kb chromosomal walk. Molecular lesions associated with viable and lethal sd alleles were characterized by genomic hybridization analysis as a means of defining the extent of the gene. DNA rearrangements associated with 11 viable sd alleles map to a 2-kb interval which appears to be a "hot spot" for P element activity. Four of five recessive lethal sd mutations were mapped by denaturing gradient gel electrophoresis to a region 12-14 kb away from the region of viable lesions. In a sd(+) genotype, at least two structurally related and developmentally regulated transcripts hybridize to the genomic region where several sd lethal alleles have been localized. A viable mutation, sd(58), used for comparison in the transcript analysis, makes at least two slightly smaller transcripts that also hybridize to this region. Preliminary analysis of cDNA clones has identified three structurally related transcripts that hybridize to this genomic region. The 5' end of these transcripts extends into the 2-kb genomic region wherein DNA rearrangements were seen in the P element rearrangements. We favor the view that the transcripts represented by these cDNA clones are products of the sd gene. If this is true, the sd gene would include genomic sequences extending over at least 14 kb of the described chromosomal walk, and would appear to be subject to alternative splicing. PMID:1706292
Genome Partitioner: A web tool for multi-level partitioning of large-scale DNA constructs for synthetic biology applications.

PubMed

Christen, Matthias; Del Medico, Luca; Christen, Heinz; Christen, Beat

2017-01-01

Recent advances in lower-cost DNA synthesis techniques have enabled new innovations in the field of synthetic biology. Still, efficient design and higher-order assembly of genome-scale DNA constructs remains a labor-intensive process. Given the complexity, computer assisted design tools that fragment large DNA sequences into fabricable DNA blocks are needed to pave the way towards streamlined assembly of biological systems. Here, we present the Genome Partitioner software implemented as a web-based interface that permits multi-level partitioning of genome-scale DNA designs. Without the need for specialized computing skills, biologists can submit their DNA designs to a fully automated pipeline that generates the optimal retrosynthetic route for higher-order DNA assembly. To test the algorithm, we partitioned a 783 kb Caulobacter crescentus genome design. We validated the partitioning strategy by assembling a 20 kb test segment encompassing a difficult to synthesize DNA sequence. Successful assembly from 1 kb subblocks into the 20 kb segment highlights the effectiveness of the Genome Partitioner for reducing synthesis costs and timelines for higher-order DNA assembly. The Genome Partitioner is broadly applicable to translate DNA designs into ready to order sequences that can be assembled with standardized protocols, thus offering new opportunities to harness the diversity of microbial genomes for synthetic biology applications. The Genome Partitioner web tool can be accessed at https://christenlab.ethz.ch/GenomePartitioner.
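The idea of multi-level partitioning (genome design into segments, segments into synthesizable subblocks) can be illustrated with a deliberately simplified sketch. This is not the Genome Partitioner algorithm: the real tool presumably also handles assembly overlaps and sequence constraints, which are ignored here. The function names are hypothetical, and the design length is assumed to be exactly 783,000 bp although the abstract only gives a rounded 783 kb.

```python
def partition(length: int, block: int) -> list[tuple[int, int]]:
    """Split [0, length) into consecutive half-open intervals of at most `block` bp."""
    return [(start, min(start + block, length)) for start in range(0, length, block)]


def two_level_partition(length: int, segment_bp: int, subblock_bp: int):
    """Partition a design into segments, then each segment into subblocks.
    Returns a list of (segment_interval, [subblock_intervals]) in design coordinates."""
    plan = []
    for seg_start, seg_end in partition(length, segment_bp):
        subblocks = [(seg_start + a, seg_start + b)
                     for a, b in partition(seg_end - seg_start, subblock_bp)]
        plan.append(((seg_start, seg_end), subblocks))
    return plan


# ASSUMPTION: exact design length; the abstract reports ~783 kb.
plan = two_level_partition(783_000, segment_bp=20_000, subblock_bp=1_000)
n_subblocks = sum(len(subblocks) for _, subblocks in plan)
print(f"{len(plan)} segments, {n_subblocks} subblocks; last segment = {plan[-1][0]}")
```

With these sizes the last segment is shorter than the others, which is one reason a production tool would likely balance segment lengths rather than cut at fixed offsets.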
Narrow-Host-Range Bacteriophages That Infect Rhizobium etli Associate with Distinct Genomic Types

PubMed Central

Santamaría, Rosa Isela; Bustos, Patricia; Sepúlveda-Robles, Omar; Lozano, Luis; Rodríguez, César; Fernández, José Luis; Juárez, Soledad; Kameyama, Luis; Guarneros, Gabriel; Dávila, Guillermo

2014-01-01

In this work, we isolated and characterized 14 bacteriophages that infect Rhizobium etli. They were obtained from rhizosphere soil of bean plants from agricultural lands in Mexico using an enrichment method. The host range of these phages was narrow but variable within a collection of 48 R. etli strains. We obtained the complete genome sequence of nine phages. Four phages were resistant to several restriction enzymes and in vivo cloning, probably due to nucleotide modifications. The genome size of the sequenced phages varied from 43 kb to 115 kb, with a median size of ∼45 to 50 kb. A large proportion of open reading frames of these phage genomes (65 to 70%) consisted of hypothetical and orphan genes. The remainder encoded proteins needed for phage morphogenesis and DNA synthesis and processing, among other functions, and a minor percentage represented genes of bacterial origin. We classified these phages into four genomic types on the basis of their genomic similarity, gene content, and host range. Since there are no reports of similar sequences, we propose that these bacteriophages correspond to novel species. PMID:24185856
HAMAP in 2013, new developments in the protein family classification and annotation system

PubMed Central

Pedruzzi, Ivo; Rivoire, Catherine; Auchincloss, Andrea H.; Coudert, Elisabeth; Keller, Guillaume; de Castro, Edouard; Baratin, Delphine; Cuche, Béatrice A.; Bougueleret, Lydie; Poux, Sylvain; Redaschi, Nicole; Xenarios, Ioannis; Bridge, Alan

2013-01-01

HAMAP (High-quality Automated and Manual Annotation of Proteins—available at http://hamap.expasy.org/) is a system for the classification and annotation of protein sequences. It consists of a collection of manually curated family profiles for protein classification, and associated annotation rules that specify annotations that apply to family members. HAMAP was originally developed to support the manual curation of UniProtKB/Swiss-Prot records describing microbial proteins. Here we describe new developments in HAMAP, including the extension of HAMAP to eukaryotic proteins, the use of HAMAP in the automated annotation of UniProtKB/TrEMBL, providing high-quality annotation for millions of protein sequences, and the future integration of HAMAP into a unified system for UniProtKB annotation, UniRule. HAMAP is continuously updated by expert curators with new family profiles and annotation rules as new protein families are characterized. The collection of HAMAP family classification profiles and annotation rules can be browsed and viewed on the HAMAP website, which also provides an interface to scan user sequences against HAMAP profiles. PMID:23193261
Structure and expression strategy of the genome of Culex pipiens densovirus, a mosquito densovirus with an ambisense organization.

PubMed

Baquerizo-Audiot, Elizabeth; Abd-Alla, Adly; Jousset, Françoise-Xavière; Cousserans, François; Tijssen, Peter; Bergoin, Max

2009-07-01

The genome of all densoviruses (DNVs) so far isolated from mosquitoes or mosquito cell lines consists of a 4-kb single-stranded DNA molecule with a monosense organization (genus Brevidensovirus, subfamily Densovirinae). We previously reported the isolation of a Culex pipiens DNV (CpDNV) that differs significantly from brevidensoviruses by (i) having an approximately 6-kb genome, (ii) lacking sequence homology, and (iii) lacking antigenic cross-reactivity with Brevidensovirus capsid polypeptides. We report here the sequence organization and transcription map of this virus. The cloned genome of CpDNV is 5,759 nucleotides (nt) long, and it possesses an inverted terminal repeat (ITR) of 285 nt and an ambisense organization of its genes. The nonstructural (NS) proteins NS-1, NS-2, and NS-3 are located in the 5' half of one strand and are organized into five open reading frames (ORFs) due to the split of both NS-1 and NS-2 into two ORFs. The ORF encoding capsid polypeptides is located in the 5' half of the complementary strand. The expression of NS proteins is controlled by two promoters, P7 and P17, driving the transcription of a 2.4-kb mRNA encoding NS-3 and of a 1.8-kb mRNA encoding NS-1 and NS-2, respectively. The two NS mRNA species are spliced off a 53-nt sequence. Capsid proteins are translated from an unspliced 2.3-kb mRNA driven by the P88 promoter. CpDNV thus appears as a new type of mosquito DNV, and based on the overall organization and expression modalities of its genome, it may represent the prototype of a new genus of DNV.
SNPs in Entire Mitochondrial Genome Sequences (≈15.4 kb) and cox1 Sequences (≈486 bp) Resolve Body and Head Lice From Doubly Infected People From Ethiopia, China, Nepal, and Iran But Not France.

PubMed

Xiong, H; Campelo, D; Boutellis, A; Raoult, D; Alem, M; Ali, J; Bilcha, K; Shao, R; Pollack, R J; Barker, S C

2014-11-01

Some people host lice on the clothing as well as the head. Whether body lice and head lice are distinct species or merely variants of the same species remains contentious. We sought to ascertain the extent to which lice from these different habitats might interbreed on doubly infected people by comparing their entire mitochondrial genome sequences. Toward this end, we analyzed two sets of published genetic data from double-infections of body lice and head lice: 1) entire mitochondrial coding regions (≈15.4 kb) from body lice and head lice from seven doubly infected people from Ethiopia, China, and France; and 2) part of the cox1 gene (≈486 bp) from body lice and head lice from a further nine doubly infected people from China, Nepal, and Iran. These mitochondrial data, from 65 lice, revealed extraordinary variation in the number of single nucleotide polymorphisms between the individual body lice and individual head lice of double-infections: from 1.096 kb of 15.4 kb (7.6%) to 2 bps of 15.4 kb (0.01%). We detected coinfections of lice of Clades A and C on the scalp hair of three of the eight people from Nepal: one person of the two people from Kathmandu and two of the six people from Pokhara. Lice of Clades A and B coinfected the scalp hair of one person from Atherton, Far North Queensland, Australia. These findings argue for additional large-scale studies of the body lice and head lice of double-infected people. © 2014 Entomological Society of America.
De novo Assembly of a 40 Mb Eukaryotic Genome from Short Sequence Reads: Sordaria macrospora, a Model Organism for Fungal Morphogenesis

PubMed Central

Nowrousian, Minou; Stajich, Jason E.; Chu, Meiling; Engh, Ines; Espagne, Eric; Halliday, Karen; Kamerewerd, Jens; Kempken, Frank; Knab, Birgit; Kuo, Hsiao-Che; Osiewacz, Heinz D.; Pöggeler, Stefanie; Read, Nick D.; Seiler, Stephan; Smith, Kristina M.; Zickler, Denise; Kück, Ulrich; Freitag, Michael

2010-01-01

Filamentous fungi are of great importance in ecology, agriculture, medicine, and biotechnology. Thus, it is not surprising that genomes for more than 100 filamentous fungi have been sequenced, most of them by Sanger sequencing. While next-generation sequencing techniques have revolutionized genome resequencing, e.g. for strain comparisons, genetic mapping, or transcriptome and ChIP analyses, de novo assembly of eukaryotic genomes still presents significant hurdles, because of their large size and stretches of repetitive sequences. Filamentous fungi contain few repetitive regions in their 30–90 Mb genomes and thus are suitable candidates to test de novo genome assembly from short sequence reads. Here, we present a high-quality draft sequence of the Sordaria macrospora genome that was obtained by a combination of Illumina/Solexa and Roche/454 sequencing. Paired-end Solexa sequencing of genomic DNA to 85-fold coverage and an additional 10-fold coverage by single-end 454 sequencing resulted in ∼4 Gb of DNA sequence. Reads were assembled to a 40 Mb draft version (N50 of 117 kb) with the Velvet assembler. Comparative analysis with Neurospora genomes increased the N50 to 498 kb. The S. macrospora genome contains even fewer repeat regions than its closest sequenced relative, Neurospora crassa. Comparison with genomes of other fungi showed that S. macrospora, a model organism for morphogenesis and meiosis, harbors duplications of several genes involved in self/nonself-recognition. Furthermore, S. macrospora contains more polyketide biosynthesis genes than N. crassa. Phylogenetic analyses suggest that some of these genes may have been acquired by horizontal gene transfer from a distantly related ascomycete group. 
Our study shows that, for typical filamentous fungi, de novo assembly of genomes from short sequence reads alone is feasible, that a mixture of Solexa and 454 sequencing substantially improves the assembly, and that the resulting data can be used for comparative studies to address basic questions of fungal biology. PMID:20386741
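The N50 values quoted for this assembly (117 kb, rising to 498 kb) follow the usual definition: the contig or scaffold length at which the cumulative length of the sorted pieces first reaches half of the total assembly size. The following is a minimal sketch of that calculation with made-up toy lengths; the contig lengths of the actual S. macrospora assembly are not given in the abstract.

```python
def n50(lengths: list[int]) -> int:
    """Length L such that pieces of length >= L cover at least half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:   # reached half of the assembly
            return length
    raise ValueError("n50 is undefined for an empty list")


# Toy contig lengths (hypothetical, for illustration only).
toy_contigs = [100, 80, 60, 40, 20]
print(n50(toy_contigs))
```

Merging contigs into longer scaffolds, as the comparative analysis with Neurospora did, raises N50 without changing the total sequence length.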
De novo assembly of a 40 Mb eukaryotic genome from short sequence reads: Sordaria macrospora, a model organism for fungal morphogenesis.

PubMed

Nowrousian, Minou; Stajich, Jason E; Chu, Meiling; Engh, Ines; Espagne, Eric; Halliday, Karen; Kamerewerd, Jens; Kempken, Frank; Knab, Birgit; Kuo, Hsiao-Che; Osiewacz, Heinz D; Pöggeler, Stefanie; Read, Nick D; Seiler, Stephan; Smith, Kristina M; Zickler, Denise; Kück, Ulrich; Freitag, Michael

2010-04-08

Filamentous fungi are of great importance in ecology, agriculture, medicine, and biotechnology. Thus, it is not surprising that genomes for more than 100 filamentous fungi have been sequenced, most of them by Sanger sequencing. While next-generation sequencing techniques have revolutionized genome resequencing, e.g. for strain comparisons, genetic mapping, or transcriptome and ChIP analyses, de novo assembly of eukaryotic genomes still presents significant hurdles, because of their large size and stretches of repetitive sequences. Filamentous fungi contain few repetitive regions in their 30-90 Mb genomes and thus are suitable candidates to test de novo genome assembly from short sequence reads. Here, we present a high-quality draft sequence of the Sordaria macrospora genome that was obtained by a combination of Illumina/Solexa and Roche/454 sequencing. Paired-end Solexa sequencing of genomic DNA to 85-fold coverage and an additional 10-fold coverage by single-end 454 sequencing resulted in approximately 4 Gb of DNA sequence. Reads were assembled to a 40 Mb draft version (N50 of 117 kb) with the Velvet assembler. Comparative analysis with Neurospora genomes increased the N50 to 498 kb. The S. macrospora genome contains even fewer repeat regions than its closest sequenced relative, Neurospora crassa. Comparison with genomes of other fungi showed that S. macrospora, a model organism for morphogenesis and meiosis, harbors duplications of several genes involved in self/nonself-recognition. Furthermore, S. macrospora contains more polyketide biosynthesis genes than N. crassa. Phylogenetic analyses suggest that some of these genes may have been acquired by horizontal gene transfer from a distantly related ascomycete group. 
Our study shows that, for typical filamentous fungi, de novo assembly of genomes from short sequence reads alone is feasible, that a mixture of Solexa and 454 sequencing substantially improves the assembly, and that the resulting data can be used for comparative studies to address basic questions of fungal biology.
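The N50 figures quoted in this record (117 kb, then 498 kb after comparative analysis) summarize assembly contiguity. As a minimal sketch of how such a value is usually computed, the following assumes the common definition: the length of the shortest contig in the smallest set of longest contigs that together cover at least half of the assembly. The function name and the example contig lengths are hypothetical and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of contig (or scaffold) lengths.

    Assumed definition: sort contigs from longest to shortest and return
    the length of the contig at which the running total first reaches
    half of the summed assembly length.
    """
    if not lengths:
        raise ValueError("need at least one contig length")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths in kb, for illustration only.
example_contigs_kb = [400, 300, 150, 100, 50]
print(n50(example_contigs_kb))  # 400 + 300 = 700 >= 500, so N50 is 300
```

Merging contigs into longer scaffolds, as the comparative step in the record does, raises N50 without changing the total assembled length.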
The complete chloroplast genome sequence of the CAM epiphyte Spanish moss (Tillandsia usneoides, Bromeliaceae) and its comparative analysis.

PubMed

Poczai, Péter; Hyvönen, Jaakko

2017-01-01

Spanish moss (Tillandsia usneoides) is an epiphytic bromeliad widely distributed throughout tropical and warm temperate America. This plant is highly adapted to extreme environmental conditions. Striking features of this species include specialized trichomes (scales) covering the surface of its shoots aiding the absorption of water and nutrients directly from the atmosphere and a specific photosynthesis using crassulacean acid metabolism (CAM). Here we report the plastid genome of Spanish moss and present the comparison of genome organization and sequence evolution within Poales. The plastome of Spanish moss has a quadripartite structure consisting of a large single copy (LSC, 87,439 bp), two inverted regions (IRa and IRb, 26,803 bp) and short single copy (SSC, 18,612 bp) region. The plastid genome had 37.2% GC content and 134 genes with 88 being unique protein-coding genes and 20 of these are duplicated in the IR, similar to other reported bromeliads. Our study shows that early diverging lineages of Poales do not have high substitution rates as compared to grasses, and plastid genomes of bromeliads show structural features considered to be ancestral in graminids. These include the loss of the introns in the clpP and rpoC1 genes and the complete loss or partial degradation of accD and ycf genes in the Graminid clade. Further structural rearrangements appeared in the graminids lacking in Spanish moss, which include a 28-kb inversion between the trnG-UCC-rps14 region and 6-kb in the trnG-UCC-psbD, followed by a third <1kb inversion in the trnT sequence.
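The quadripartite structure described above implies a total plastome length of LSC + 2 × IR + SSC, because the inverted repeat is present in two copies. The sketch below only adds up the region sizes quoted in the abstract. The resulting total is derived here and is not a figure stated in this record, and the variable and function names are illustrative.

```python
# Region sizes in bp, as quoted in the abstract.
LSC_BP = 87_439   # large single copy
IR_BP = 26_803    # one inverted repeat copy (IRa and IRb are the same size)
SSC_BP = 18_612   # short single copy


def plastome_length(lsc, ir, ssc):
    """Total length of a quadripartite plastome: LSC + IRa + SSC + IRb."""
    return lsc + 2 * ir + ssc


total_bp = plastome_length(LSC_BP, IR_BP, SSC_BP)
ir_share = 2 * IR_BP / total_bp

print(f"implied plastome length: {total_bp} bp")
print(f"share of the genome in the two IR copies: {ir_share:.1%}")
```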
The complete chloroplast genome sequence of the CAM epiphyte Spanish moss (Tillandsia usneoides, Bromeliaceae) and its comparative analysis

PubMed Central

Hyvönen, Jaakko

2017-01-01

Spanish moss (Tillandsia usneoides) is an epiphytic bromeliad widely distributed throughout tropical and warm temperate America. This plant is highly adapted to extreme environmental conditions. Striking features of this species include specialized trichomes (scales) covering the surface of its shoots aiding the absorption of water and nutrients directly from the atmosphere and a specific photosynthesis using crassulacean acid metabolism (CAM). Here we report the plastid genome of Spanish moss and present the comparison of genome organization and sequence evolution within Poales. The plastome of Spanish moss has a quadripartite structure consisting of a large single copy (LSC, 87,439 bp), two inverted regions (IRa and IRb, 26,803 bp) and short single copy (SSC, 18,612 bp) region. The plastid genome had 37.2% GC content and 134 genes with 88 being unique protein-coding genes and 20 of these are duplicated in the IR, similar to other reported bromeliads. Our study shows that early diverging lineages of Poales do not have high substitution rates as compared to grasses, and plastid genomes of bromeliads show structural features considered to be ancestral in graminids. These include the loss of the introns in the clpP and rpoC1 genes and the complete loss or partial degradation of accD and ycf genes in the Graminid clade. Further structural rearrangements appeared in the graminids lacking in Spanish moss, which include a 28-kb inversion between the trnG-UCC–rps14 region and 6-kb in the trnG-UCC–psbD, followed by a third <1kb inversion in the trnT sequence. PMID:29095905
Evaluation of three read-depth based CNV detection tools using whole-exome sequencing data.

PubMed

Yao, Ruen; Zhang, Cheng; Yu, Tingting; Li, Niu; Hu, Xuyun; Wang, Xiumin; Wang, Jian; Shen, Yiping

2017-01-01

Whole exome sequencing (WES) has been widely accepted as a robust and cost-effective approach for clinical genetic testing of small sequence variants. Detection of copy number variants (CNV) within WES data has become possible through the development of various algorithms and software programs that utilize read-depth as the main information. The aim of this study was to evaluate three commonly used, WES read-depth based CNV detection programs using high-resolution chromosomal microarray analysis (CMA) as a standard. Paired CMA and WES data were acquired for 45 samples. A total of 219 CNVs (sizes ranging from 2.3 kb to 35 Mb) identified on three CMA platforms (Affymetrix, Agilent and Illumina) were used as standards. CNVs were called from WES data using XHMM, CoNIFER, and CNVnator with modified settings. All three software packages detected an elevated proportion of small variants (< 20 kb) compared to CMA. XHMM and CoNIFER had poor detection sensitivity (22.2 and 14.6%), which correlated with the number of capturing probes involved. CNVnator detected most variants and had better sensitivity (87.7%); however, it suffered from an overwhelming detection of small CNVs below 20 kb, which required further confirmation. Size estimation of variants was exaggerated by CNVnator and understated by XHMM and CoNIFER. Low concordance among the CNVs detected by the three read-depth based programs indicates the immature status of WES-based CNV detection. The low sensitivity and uncertain specificity of WES-based CNV detection in comparison with CMA-based CNV detection suggest that CMA will continue to play an important role in detecting clinical grade CNV in the NGS era, which is largely based on WES.
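Sensitivity in a comparison like this one is the fraction of reference (CMA) CNVs that a caller recovers. The abstract does not say how a WES call was matched to a CMA call, so the sketch below assumes a 50% reciprocal-overlap rule, which is a common convention and not necessarily the one the authors used. All function names and intervals are hypothetical.

```python
def reciprocal_overlap(a, b):
    """Smallest fraction of either interval covered by their overlap.

    Intervals are (start, end) pairs on the same chromosome, end exclusive.
    """
    overlap = min(a[1], b[1]) - max(a[0], b[0])
    if overlap <= 0:
        return 0.0
    return min(overlap / (a[1] - a[0]), overlap / (b[1] - b[0]))


def sensitivity(reference, calls, min_overlap=0.5):
    """Fraction of reference CNVs matched by at least one call.

    min_overlap is an assumed matching threshold, not a value from the study.
    """
    if not reference:
        raise ValueError("reference set is empty")
    detected = sum(
        1
        for ref in reference
        if any(reciprocal_overlap(ref, call) >= min_overlap for call in calls)
    )
    return detected / len(reference)


# Hypothetical intervals on one chromosome, for illustration only.
cma_cnvs = [(0, 100), (200, 300), (1000, 2000)]
wes_calls = [(10, 110), (250, 260), (900, 2100)]
print(sensitivity(cma_cnvs, wes_calls))  # two of three reference CNVs matched
```

A reciprocal rule penalizes both undersized and oversized calls, which matters here because the abstract reports that the tools misestimate CNV size in opposite directions.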
Multiplexed direct genomic selection (MDiGS): a pooled BAC capture approach for highly accurate CNV and SNP/INDEL detection.

PubMed

Alvarado, David M; Yang, Ping; Druley, Todd E; Lovett, Michael; Gurnett, Christina A

2014-06-01

Despite declining sequencing costs, few methods are available for cost-effective single-nucleotide polymorphism (SNP), insertion/deletion (INDEL) and copy number variation (CNV) discovery in a single assay. Commercially available methods require a high investment to a specific region and are only cost-effective for large samples. Here, we introduce a novel, flexible approach for multiplexed targeted sequencing and CNV analysis of large genomic regions called multiplexed direct genomic selection (MDiGS). MDiGS combines biotinylated bacterial artificial chromosome (BAC) capture and multiplexed pooled capture for SNP/INDEL and CNV detection of 96 multiplexed samples on a single MiSeq run. MDiGS is advantageous over other methods for CNV detection because pooled sample capture and hybridization to large contiguous BAC baits reduces sample and probe hybridization variability inherent in other methods. We performed MDiGS capture for three chromosomal regions consisting of ∼ 550 kb of coding and non-coding sequence with DNA from 253 patients with congenital lower limb disorders. PITX1 nonsense and HOXC11 S191F missense mutations were identified that segregate in clubfoot families. Using a novel pooled-capture reference strategy, we identified recurrent chromosome chr17q23.1q23.2 duplications and small HOXC 5' cluster deletions (51 kb and 12 kb). Given the current interest in coding and non-coding variants in human disease, MDiGS fulfills a niche for comprehensive and low-cost evaluation of CNVs, coding, and non-coding variants across candidate regions of interest. © The Author(s) 2014. Published by Oxford University Press on behalf of Nucleic Acids Research.
Cloning and sequence analysis of a cDNA encoding the alpha-subunit of mouse beta-N-acetylhexosaminidase and comparison with the human enzyme.

PubMed Central

Beccari, T; Hoade, J; Orlacchio, A; Stirling, J L

1992-01-01

cDNAs encoding the mouse beta-N-acetylhexosaminidase alpha-subunit were isolated from a mouse testis library. The longest of these (1.7 kb) was sequenced and showed 83% similarity with the human alpha-subunit cDNA sequence. The 5' end of the coding sequence was obtained from a genomic DNA clone. Alignment of the human and mouse sequences showed that all three putative N-glycosylation sites are conserved, but that the mouse alpha-subunit has an additional site towards the C-terminus. All eight cysteines in the human sequence are conserved in the mouse. There are an additional two cysteines in the mouse alpha-subunit signal peptide. All amino acids affected in Tay-Sachs-disease mutations are conserved in the mouse. PMID:1379046
Large-scale oscillation of structure-related DNA sequence features in human chromosome 21

NASA Astrophysics Data System (ADS)

Li, Wentian; Miramontes, Pedro

2006-08-01

Human chromosome 21 is the only chromosome in the human genome that exhibits oscillation of the (G+C) content with a cycle length of hundreds of kilobases (kb) (500 kb near the right telomere). We aim to establish the existence of a similar periodicity in structure-related sequence features in order to relate this (G+C)% oscillation to other biological phenomena. The following quantities are shown to oscillate with the same 500 kb periodicity in human chromosome 21: binding energy calculated by two sets of dinucleotide-based thermodynamic parameters, AA/TT and AAA/TTT bi- and tri-nucleotide density, 5'-TA-3' dinucleotide density, and signal for 10- or 11-base periodicity of AA/TT or AAA/TTT. These intrinsic quantities are related to structural features of the double helix of DNA molecules, such as base-pair binding, untwisting or unwinding, stiffness, and a putative tendency for nucleosome formation.
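Detecting an oscillation in (G+C) content starts from a windowed profile of the sequence. The sketch below shows only that first step, computing the G+C fraction in fixed-size windows. The window size, function names and toy sequence are assumptions for illustration. A real analysis of a 500 kb periodicity would use much larger windows and a spectral method on the resulting profile, neither of which is shown here.

```python
def gc_fraction(seq):
    """Fraction of G and C among unambiguous bases (A, C, G, T) in seq."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        return float("nan")  # window contains only N or other symbols
    return (seq.count("G") + seq.count("C")) / counted


def windowed_gc(seq, window, step=None):
    """Return (start, gc_fraction) for each full window along seq.

    step defaults to window, giving non-overlapping windows.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    step = step or window
    return [
        (start, gc_fraction(seq[start:start + window]))
        for start in range(0, len(seq) - window + 1, step)
    ]


# Toy sequence, for illustration only.
print(windowed_gc("GGCCAATT", window=4))  # [(0, 1.0), (4, 0.0)]
```

The same windowing can be reused for the other densities named in the abstract (for example AA/TT or 5'-TA-3') by swapping the per-window statistic.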
The nucleotide sequence and genome organization of Plasmopara halstedii virus.

PubMed

Heller-Dohmen, Marion; Göpfert, Jens C; Pfannstiel, Jens; Spring, Otmar

2011-03-17

Only very few viruses of Oomycetes have been studied in detail. Isometric virions were found in different isolates of the oomycete Plasmopara halstedii, the downy mildew pathogen of sunflower. However, complete nucleotide sequences and data on the genome organization were lacking. Viral RNA of different P. halstedii isolates was subjected to nucleotide sequencing and analysis of the viral genome. The N-terminal sequence of the viral coat protein was determined using Top-Down MALDI-TOF analysis. The complete nucleotide sequences of both single-stranded RNA segments (RNA1 and RNA2) were established. RNA1 consisted of 2793 nucleotides (nt), exclusive of its 3' poly(A) tract, and contained a single open reading frame (ORF1) of 2745 nt. ORF1 was framed by a 5' untranslated region (5' UTR) of 18 nt and a 3' untranslated region (3' UTR) of 30 nt. ORF1 contained motifs of RNA-dependent RNA polymerases (RdRp) and showed similarities to RdRp of Sclerophthora macrospora virus A (SmV A) and viruses within the Nodaviridae family. RNA2 consisted of 1526 nt, exclusive of its 3' poly(A) tract, and contained a second ORF (ORF2) of 1128 nt. ORF2 coded for the single viral coat protein (CP) and was framed by a 5' UTR of 164 nt and a 3' UTR of 234 nt. The deduced amino acid sequence of ORF2 was verified by nano-LC-ESI-MS/MS experiments. Top-Down MALDI-TOF analysis revealed the N-terminal sequence of the CP. The N-terminal sequence represented a region within ORF2, suggesting a proteolytic processing of the CP in vivo. The CP showed similarities to CP of SmV A and viruses within the Tombusviridae family. Fragments of RNA1 (ca. 1.9 kb) and RNA2 (ca. 1.4 kb) were used to analyze the nucleotide sequence variation of virions in different P. halstedii isolates. Viral sequence variation was 0.3% or less regardless of their host's pathotypes, the geographical origin and the sensitivity towards the fungicide metalaxyl. The results showed the presence of a single and new virus type in different P. halstedii isolates. 
Insignificant viral sequence variation indicated that the virus did not account for differences in pathogenicity of the oomycete P. halstedii.
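The segment sizes reported above can be cross-checked: for each RNA, the 5' UTR, ORF and 3' UTR should add up to the stated segment length, and each ORF length should be a multiple of three. The sketch below performs that bookkeeping with the numbers from the abstract. The dictionary layout and function name are illustrative, and the codon count assumes the quoted ORF length includes the stop codon, which the abstract does not state.

```python
# Lengths in nt, as quoted in the abstract (poly(A) tract excluded).
SEGMENTS = {
    "RNA1": {"utr5": 18, "orf": 2745, "utr3": 30, "total": 2793},
    "RNA2": {"utr5": 164, "orf": 1128, "utr3": 234, "total": 1526},
}


def check_segment(parts):
    """Return (parts_sum_matches_total, number_of_codons_in_orf)."""
    parts_sum = parts["utr5"] + parts["orf"] + parts["utr3"]
    if parts["orf"] % 3 != 0:
        raise ValueError("ORF length is not a multiple of three")
    return parts_sum == parts["total"], parts["orf"] // 3


for name, parts in SEGMENTS.items():
    consistent, codons = check_segment(parts)
    # Assumption: the stop codon is counted in the ORF length,
    # so the encoded protein would have codons - 1 residues.
    print(f"{name}: lengths consistent = {consistent}, codons = {codons}")
```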
The Pfam protein families database: towards a more sustainable future.

PubMed

Finn, Robert D; Coggill, Penelope; Eberhardt, Ruth Y; Eddy, Sean R; Mistry, Jaina; Mitchell, Alex L; Potter, Simon C; Punta, Marco; Qureshi, Matloob; Sangrador-Vegas, Amaia; Salazar, Gustavo A; Tate, John; Bateman, Alex

2016-01-04

In the last two years the Pfam database (http://pfam.xfam.org) has undergone a substantial reorganisation to reduce the effort involved in making a release, thereby permitting more frequent releases. Arguably the most significant of these changes is that Pfam is now primarily based on the UniProtKB reference proteomes, with the counts of matched sequences and species reported on the website restricted to this smaller set. Building families on reference proteomes sequences brings greater stability, which decreases the amount of manual curation required to maintain them. It also reduces the number of sequences displayed on the website, whilst still providing access to many important model organisms. Matches to the full UniProtKB database are, however, still available and Pfam annotations for individual UniProtKB sequences can still be retrieved. Some Pfam entries (1.6%) which have no matches to reference proteomes remain; we are working with UniProt to see if sequences from them can be incorporated into reference proteomes. Pfam-B, the automatically-generated supplement to Pfam, has been removed. The current release (Pfam 29.0) includes 16 295 entries and 559 clans. The facility to view the relationship between families within a clan has been improved by the introduction of a new tool. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Complete telomere-to-telomere de novo assembly of the Plasmodium falciparum genome through long-read (>11 kb), single molecule, real-time sequencing

PubMed Central

Vembar, Shruthi Sridhar; Seetin, Matthew; Lambert, Christine; Nattestad, Maria; Schatz, Michael C.; Baybayan, Primo; Scherf, Artur; Smith, Melissa Laird

2016-01-01

The application of next-generation sequencing to estimate genetic diversity of Plasmodium falciparum, the most lethal malaria parasite, has proved challenging due to the skewed AT-richness [∼80.6% (A + T)] of its genome and the lack of technology to assemble highly polymorphic subtelomeric regions that contain clonally variant, multigene virulence families (Ex: var and rifin). To address this, we performed amplification-free, single molecule, real-time sequencing of P. falciparum genomic DNA and generated reads of average length 12 kb, with 50% of the reads between 15.5 and 50 kb in length. Next, using the Hierarchical Genome Assembly Process, we assembled the P. falciparum genome de novo and successfully compiled all 14 nuclear chromosomes telomere-to-telomere. We also accurately resolved centromeres [∼90–99% (A + T)] and subtelomeric regions and identified large insertions and duplications that add extra var and rifin genes to the genome, along with smaller structural variants such as homopolymer tract expansions. Overall, we show that amplification-free, long-read sequencing combined with de novo assembly overcomes major challenges inherent to studying the P. falciparum genome. Indeed, this technology may not only identify the polymorphic and repetitive subtelomeric sequences of parasite populations from endemic areas but may also evaluate structural variation linked to virulence, drug resistance and disease transmission. PMID:27345719
Sequence Analysis of the Cryptic Plasmid pMG101 from Rhodopseudomonas palustris and Construction of Stable Cloning Vectors

PubMed Central

Inui, Masayuki; Roh, Jung Hyeob; Zahn, Kenneth; Yukawa, Hideaki

2000-01-01

A 15-kb cryptic plasmid was obtained from a natural isolate of Rhodopseudomonas palustris. The plasmid, designated pMG101, was able to replicate in R. palustris and in closely related strains of Bradyrhizobium japonicum and phototrophic Bradyrhizobium species. However, it was unable to replicate in the purple nonsulfur bacterium Rhodobacter sphaeroides and in Rhizobium species. The replication region of pMG101 was localized to a 3.0-kb SalI-XhoI fragment, and this fragment was stably maintained in R. palustris for over 100 generations in the absence of selection. The complete nucleotide sequence of this fragment revealed two open reading frames (ORFs), ORF1 and ORF2. The deduced amino acid sequence of ORF1 is similar to sequences of Par proteins, which mediate plasmid stability from certain plasmids, while ORF2 was identified as a putative rep gene, coding for an initiator of plasmid replication, based on homology with the Rep proteins of several other plasmids. The function of these sequences was studied by deletion mapping and gene disruptions of ORF1 and ORF2. pMG101-based Escherichia coli-R. palustris shuttle cloning vectors pMG103 and pMG105 were constructed and were stably maintained in R. palustris growing under nonselective conditions. The ability of plasmid pMG101 to replicate in R. palustris and its close phylogenetic relatives should enable broad application of these vectors within this group of α-proteobacteria. PMID:10618203
Identification, cloning, and sequencing of a fragment of Amsacta moorei entomopoxvirus DNA containing the spheroidin gene and three vaccinia virus-related open reading frames.

PubMed Central

Hall, R L; Moyer, R W

1991-01-01

Entomopoxvirus virions are frequently contained within crystalline occlusion bodies, which are composed of primarily a single protein, spheroidin, which is analogous to the polyhedrin protein of baculovirus. The spheroidin gene of Amsacta moorei entomopoxvirus was identified following the microsequencing of polypeptides generated from cyanogen bromide treatment of spheroidin and the subsequent synthesis of oligonucleotide hybridization probes. DNA sequencing of a 6.8-kb region of DNA containing the spheroidin gene showed that the spheroidin protein is derived from a 3.0-kb open reading frame potentially encoding a protein of 115 kDa. Three copies of the heptanucleotide, TTTTTNT, a sequence associated with early gene transcription in the vertebrate poxviruses, and four in-frame translational termination signals were found within 60 bp upstream of the putative spheroidin gene promoter (TAAATG). The spheroidin gene promoter region contains the sequence TAAATG, which is found in many late promoters of the vertebrate poxviruses and which serves as the site of transcriptional initiation, as shown by primer extension. Primer extension experiments also showed that spheroidin gene transcripts contain 5' poly(A) sequences typical of vertebrate poxvirus late transcripts. The 92 bases upstream of the initiating TAAATG are unusually A + T rich and contain only 7 G or C residues. An analysis of open reading frames around the spheroidin gene suggests that the colinear core of "essential genes" typical of the vertebrate poxviruses is absent in A. moorei entomopoxvirus. PMID:1942245
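Two of the sequence features mentioned above are easy to express in code: locating copies of the heptanucleotide TTTTTNT (N being any base) and measuring A + T richness. The sketch below uses a regular expression with a lookahead so that overlapping copies are also reported. The function names and test sequences are hypothetical and are not the actual spheroidin upstream sequence.

```python
import re

# TTTTTNT with N = any of A, C, G, T; the lookahead allows overlapping hits.
EARLY_MOTIF = re.compile(r"(?=(TTTTT[ACGT]T))")


def find_early_motif(seq):
    """Return 0-based start positions of every TTTTTNT occurrence in seq."""
    return [match.start() for match in EARLY_MOTIF.finditer(seq.upper())]


def at_fraction(seq):
    """Fraction of A and T among unambiguous bases (A, C, G, T) in seq."""
    seq = seq.upper()
    counted = sum(seq.count(base) for base in "ACGT")
    if counted == 0:
        raise ValueError("sequence has no unambiguous bases")
    return (seq.count("A") + seq.count("T")) / counted


# Hypothetical sequences, for illustration only.
print(find_early_motif("AATTTTTATCCTTTTTGTAA"))  # [2, 11]

# A 92-base stretch with only 7 G or C residues, as in the abstract,
# has an A + T fraction of 85/92.
toy_upstream = "A" * 85 + "G" * 7
print(f"{at_fraction(toy_upstream):.3f}")
```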

Comparative Analysis of the First Complete Enterococcus faecium Genome

PubMed Central

Lam, Margaret M. C.; Seemann, Torsten; Bulach, Dieter M.; Gladman, Simon L.; Chen, Honglei; Haring, Volker; Moore, Robert J.; Ballard, Susan; Grayson, M. Lindsay; Johnson, Paul D. R.; Howden, Benjamin P.

2012-01-01

Vancomycin-resistant enterococci (VRE) are one of the leading causes of nosocomial infections in health care facilities around the globe. In particular, infections caused by vancomycin-resistant Enterococcus faecium are becoming increasingly common. Comparative and functional genomic studies of E. faecium isolates have so far been limited owing to the lack of a fully assembled E. faecium genome sequence. Here we address this issue and report the complete 3.0-Mb genome sequence of the multilocus sequence type 17 vancomycin-resistant Enterococcus faecium strain Aus0004, isolated from the bloodstream of a patient in Melbourne, Australia, in 1998. The genome comprises a 2.9-Mb circular chromosome and three circular plasmids. The chromosome harbors putative E. faecium virulence factors such as enterococcal surface protein, hemolysin, and collagen-binding adhesin. Aus0004 has a very large accessory genome (38%) that includes three prophage and two genomic islands absent among 22 other E. faecium genomes. One of the prophage was present as inverted 50-kb repeats that appear to have facilitated a 683-kb chromosomal inversion across the replication terminus, resulting in a striking replichore imbalance. Other distinctive features include 76 insertion sequence elements and a single chromosomal copy of Tn1549 containing the vanB vancomycin resistance element. A complete E. faecium genome will be a useful resource to assist our understanding of this emerging nosocomial pathogen. PMID:22366422
Characterization of AFLAV, a Tf1/Sushi retrotransposon from Aspergillus flavus.

PubMed

Hua, Sui-Sheng T; Tarun, Alice S; Pandey, Sonal N; Chang, Leo; Chang, Perng-Kuang

2007-02-01

The plasmid pAF28, a genomic clone from Aspergillus flavus NRRL 6541, has been used as a hybridization probe to fingerprint A. flavus strains isolated in corn and peanut fields. The insert of pAF28 contains a 4.5 kb region which encodes a truncated retrotransposon (AfRTL-1). In search of a full-length, intact copy of the retrotransposon, we exploited a novel PCR cloning strategy by amplifying a 3.4 kb region from the genomic DNA of A. flavus NRRL 6541. The fragment was cloned into pCR 4-TOPO. Sequence analysis confirmed that this region encoded putative domains of partial reverse transcriptase, RNase H, and integrase of the predicted retrotransposon. The two flanking long terminal repeats (LTRs) and the sequence between them comprise a putative full-length LTR retrotransposon of 7799 bp in length. This intact retrotransposon sequence is named AFLAV (A. flavus Retrotransposon). The order of the predicted catalytic domains in the polyprotein (Pol) placed AFLAV in the Tf1/sushi subgroup of the Ty3/gypsy retrotransposon family. Primers derived from the AFLAV sequence were used to screen for this retrotransposon in other strains of A. flavus. More than fifty strains of A. flavus isolated from different geographical origins were surveyed, and the results show that many strains have extensive deletions in the regions encoding the capsid (Gag) and Pol.
Compound haplotypes at Xp11.23 and human population growth in Eurasia.

PubMed

Alonso, S; Armour, J A L

2004-09-01

To investigate patterns of diversity and the evolutionary history of Eurasians, we have sequenced a 2.8 kb region at Xp11.23 in a sample of African and Eurasian chromosomes. This region is in a long intron of CLCN5 and is immediately flanked by a highly variable minisatellite, DXS255, and a human-specific Ta0 LINE. Compared to Africans, Eurasians showed a marked reduction in sequence diversity. The main Euro-Asiatic haplotype seems to be the ancestral haplotype for the whole sample. Coalescent simulations, including recombination and exponential growth, indicate a median length of strong linkage disequilibrium, up to approximately 9 kb for this area. The Ka/Ks ratio between the coding sequence of human CLCN5 and its mouse orthologue is much less than 1. This implies that the region sequenced is unlikely to be under the strong influence of positive selective processes on CLCN5, mutations in which have been associated with disorders such as Dent's disease. In contrast, a scenario based on a population bottleneck and exponential growth seems a more likely explanation for the reduced diversity observed in Eurasians. Coalescent analysis and linked minisatellite diversity (which reaches a gene diversity value greater than 98% in Eurasians) suggest an estimated age of origin of the Euro-Asiatic diversity compatible with a recent out-of-Africa model for colonization of Eurasia by modern Homo sapiens.
A novel adenovirus of Western lowland gorillas (Gorilla gorilla gorilla)

PubMed Central

2010-01-01

Adenoviruses (AdV) broadly infect vertebrate hosts including a variety of primates. We identified a novel AdV in the feces of captive gorillas by isolation in cell culture, electron microscopy and PCR. From the supernatants of infected cultures we amplified DNA polymerase (DPOL), preterminal protein (pTP) and hexon gene sequences with generic pan primate AdV PCR assays. The sequences in-between were amplified by long-distance PCRs of 2 - 10 kb length, resulting in a final sequence of 15.6 kb. Phylogenetic analysis placed the novel gorilla AdV into a cluster of primate AdVs belonging to the species Human adenovirus B (HAdV-B). Depending on the analyzed gene, its position within the cluster was variable. To further elucidate its origin, feces samples of wild gorillas were analyzed. AdV hexon sequences were detected which are indicative for three distinct and novel gorilla HAdV-B viruses, among them a virus nearly identical to the novel AdV isolated from captive gorillas. This shows that the discovered virus is a member of a group of HAdV-B viruses that naturally infect gorillas. The mixed phylogenetic clusters of gorilla, chimpanzee, bonobo and human AdVs within the HAdV-B species indicate that host switches may have been a component of the evolution of human and non-human primate HAdV-B viruses. PMID:21054831
A novel adenovirus of Western lowland gorillas (Gorilla gorilla gorilla).

PubMed

Wevers, Diana; Leendertz, Fabian H; Scuda, Nelly; Boesch, Christophe; Robbins, Martha M; Head, Josephine; Ludwig, Carsten; Kühn, Joachim; Ehlers, Bernhard

2010-11-05

Adenoviruses (AdV) broadly infect vertebrate hosts including a variety of primates. We identified a novel AdV in the feces of captive gorillas by isolation in cell culture, electron microscopy and PCR. From the supernatants of infected cultures we amplified DNA polymerase (DPOL), preterminal protein (pTP) and hexon gene sequences with generic pan primate AdV PCR assays. The sequences in-between were amplified by long-distance PCRs of 2-10 kb length, resulting in a final sequence of 15.6 kb. Phylogenetic analysis placed the novel gorilla AdV into a cluster of primate AdVs belonging to the species Human adenovirus B (HAdV-B). Depending on the analyzed gene, its position within the cluster was variable. To further elucidate its origin, feces samples of wild gorillas were analyzed. AdV hexon sequences were detected which are indicative for three distinct and novel gorilla HAdV-B viruses, among them a virus nearly identical to the novel AdV isolated from captive gorillas. This shows that the discovered virus is a member of a group of HAdV-B viruses that naturally infect gorillas. The mixed phylogenetic clusters of gorilla, chimpanzee, bonobo and human AdVs within the HAdV-B species indicate that host switches may have been a component of the evolution of human and non-human primate HAdV-B viruses.
Canine Lat1: molecular structure, distribution and its expression in cancer samples.

PubMed

Ochiai, Hideharu; Morishita, Taiki; Onda, Ken; Sugiyama, Hiroki; Maruo, Takuya

2012-07-01

A full-length cDNA sequence of canine L-type amino acid transporter 1 (Lat1) was determined from a canine brain. The sequence was 1828 bp long and was predicted to encode a polypeptide of 485 amino acids. The deduced amino acid sequence of canine Lat1 showed 93.2% and 91.1% similarity to those of humans and rats, respectively. Northern blot analysis detected Lat1 expression in the cerebellum at 4 kb, and Western blot analysis showed a single band at 40 kDa. RT-PCR analysis revealed a distinct expression of Lat1 in the pancreas and testis in addition to the cerebrum and cerebellum. Notably, Lat1 expression was observed in the tissues of thyroid cancer, melanoma and hemangiopericytoma. Although the number of cancer samples examined was limited, Lat1 may serve as a useful biomarker of cancer cells in veterinary clinical practice.
Structure of the human type IV collagen COL4A6 gene, which is mutated in Alport syndrome-associated leiomyomatosis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Zhang, Xu; Zhou, Jing; Reeders, S.T.

1996-05-01

Basement membrane (type IV) collagen, a subfamily of the collagen protein family, is encoded by six distinct genes in mammals. Three of those, COL4A3, COL4A4, and COL4A5, are linked with Alport syndrome (hereditary nephritis). Patients with leiomyomatosis associated with Alport syndrome have been shown to have deletions in the 5' end of the COL4A6 gene, in addition to having deletions in COL4A5. The human COL4A6 gene is reported to be 425 kb as determined by mapping of overlapping YAC clones by probes for its 5' and 3' ends. In the present study we describe the complete exon/intron size pattern of the human COL4A6 gene. The 12 λ phage clones characterized in the study spanned a total of 110 kb, including 85 kb of the actual gene and 25 kb of flanking sequences. The overlapping clones contained all 46 exons of the gene and all introns, except for intron 2. Since the total size of the exons and all introns except for intron 2 is about 85 kb, intron 2 must be about 340 kb. All exons of the gene were assigned to EcoRI restriction fragments to facilitate analysis of the gene in patients with leiomyomatosis associated with Alport syndrome. The exon size pattern of COL4A6 is highly homologous with that of the human and mouse COL4A2 genes, with 27 of the 46 exons of COL4A6 being identical in size between the genes. 42 refs., 2 figs., 3 tabs.
Plasmid Characterization and Chromosome Analysis of Two netF+ Clostridium perfringens Isolates Associated with Foal and Canine Necrotizing Enteritis.

PubMed

Mehdizadeh Gohari, Iman; Kropinski, Andrew M; Weese, Scott J; Parreira, Valeria R; Whitehead, Ashley E; Boerlin, Patrick; Prescott, John F

2016-01-01

The recent discovery of a novel beta-pore-forming toxin, NetF, which is strongly associated with canine and foal necrotizing enteritis should improve our understanding of the role of type A Clostridium perfringens associated disease in these animals. The current study presents the complete genome sequence of two netF-positive strains, JFP55 and JFP838, which were recovered from cases of foal necrotizing enteritis and canine hemorrhagic gastroenteritis, respectively. Genome sequencing was done using Single Molecule, Real-Time (SMRT) technology-PacBio and Illumina Hiseq2000. The JFP55 and JFP838 genomes include a single 3.34 Mb and 3.53 Mb chromosome, respectively, and both genomes include five circular plasmids. Plasmid annotation revealed that three plasmids were shared by the two newly sequenced genomes, including a NetF/NetE toxins-encoding tcp-conjugative plasmid, a CPE/CPB2 toxins-encoding tcp-conjugative plasmid and a putative bacteriocin-encoding plasmid. The putative beta-pore-forming toxin genes, netF, netE and netG, were located in unique pathogenicity loci on tcp-conjugative plasmids. The C. perfringens JFP55 chromosome carries 2,825 protein-coding genes whereas the chromosome of JFP838 contains 3,014 protein-encoding genes. Comparison of these two chromosomes with three available reference C. perfringens chromosome sequences identified 48 (~247 kb) and 81 (~430 kb) regions unique to JFP55 and JFP838, respectively. Some of these divergent genomic regions in both chromosomes are phage- and plasmid-related segments. Sixteen of these unique chromosomal regions (~69 kb) were shared between the two isolates. Five of these shared regions formed a mosaic of plasmid-integrated segments, suggesting that these elements were acquired early in a clonal lineage of netF-positive C. perfringens strains. 
These results provide significant insight into the basis of canine and foal necrotizing enteritis and are the first to demonstrate that netF resides on a large and unique plasmid-encoded locus.
Complete genome sequence and phenotype microarray analysis of Cronobacter sakazakii SP291: a persistent isolate cultured from a powdered infant formula production facility

PubMed Central

Yan, Qiongqiong; Power, Karen A.; Cooney, Shane; Fox, Edward; Gopinath, Gopal R.; Grim, Christopher J.; Tall, Ben D.; McCusker, Matthew P.; Fanning, Séamus

2013-01-01

Outbreaks of human infection linked to the powdered infant formula (PIF) food chain and associated with the bacterium Cronobacter are of concern to public health. These bacteria are regarded as opportunistic pathogens linked to life-threatening infections, predominantly in neonates with an underdeveloped immune system. Monitoring the microbiological ecology of PIF production sites is an important step in attempting to limit the risk of contamination in the finished food product. Cronobacter species, like other microorganisms, can adapt to the production environment. These organisms are known for their desiccation tolerance, a phenotype that can aid their survival in the production site and in PIF itself. In evaluating the genome data currently available for Cronobacter species, no sequence information has been published describing a Cronobacter sakazakii isolate found to persist in a PIF production facility. Here we report the complete genome sequence of one such isolate, Cronobacter sakazakii SP291, along with its phenotypic characteristics. The genome of C. sakazakii SP291 consists of a 4.3-Mb chromosome (56.9% GC) and three plasmids, denoted pSP291-1 [118.1 kb (57.2% GC)], pSP291-2 [52.1 kb (49.2% GC)], and pSP291-3 [4.4 kb (54.0% GC)]. When C. sakazakii SP291 was compared to the reference C. sakazakii ATCC BAA-894, which is also of PIF origin, the annotated genome data identified two interesting functional categories, comprising genes related to the bacterial stress response and to resistance to antimicrobial and toxic compounds. Using a phenotypic microarray (PM), we provided a full metabolic profile comparing C. sakazakii SP291 and the previously sequenced C. sakazakii ATCC BAA-894. These data extend our understanding of the genome of this important neonatal pathogen and provide further insights into the genotypes associated with features that can contribute to its persistence in the PIF environment. PMID:24032028
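The replicon sizes and GC percentages quoted in this abstract are enough to estimate a genome-wide GC content as a length-weighted average. The sketch below is only an illustration of that arithmetic: the function name is hypothetical, and the chromosome is taken as 4,300 kb because the abstract gives its size only as "4.3-Mb".

```python
# Replicon sizes (kb) and GC percentages as quoted in the abstract above.
# Assumption: the chromosome is given only as "4.3-Mb", so 4300 kb is approximate.
REPLICONS = {
    "chromosome": (4300.0, 56.9),
    "pSP291-1": (118.1, 57.2),
    "pSP291-2": (52.1, 49.2),
    "pSP291-3": (4.4, 54.0),
}


def weighted_gc(replicons):
    """Return the length-weighted GC percentage across all replicons."""
    total_kb = sum(size for size, _ in replicons.values())
    gc_kb = sum(size * gc / 100.0 for size, gc in replicons.values())
    return 100.0 * gc_kb / total_kb


if __name__ == "__main__":
    print(f"Genome-wide GC estimate: {weighted_gc(REPLICONS):.1f}%")
```

Because the chromosome accounts for about 96% of the total length, the estimate stays close to the chromosomal value, at roughly 56.8%.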
Degradation of Substituted Phenylurea Herbicides by Arthrobacter globiformis Strain D47 and Characterization of a Plasmid-Associated Hydrolase Gene, puhA

PubMed Central

Turnbull, Gillian A.; Ousley, Margaret; Walker, Allan; Shaw, Eve; Morgan, J. Alun W.

2001-01-01

Arthrobacter globiformis D47 was shown to degrade a range of substituted phenylurea herbicides in soil. This strain contained two plasmids of approximately 47 kb (pHRIM620) and 34 kb (pHRIM621). Plasmid-curing experiments produced plasmid-free strains as well as strains containing either the 47- or the 34-kb plasmid. The strains were tested for their ability to degrade diuron, which demonstrated that the degradative genes were located on the 47-kb plasmid. Studies on the growth of these strains indicated that the ability to degrade diuron did not offer a selective advantage to A. globiformis D47 on minimal medium designed to contain the herbicide as a sole carbon source. The location of the genes on a plasmid and a lack of selection would explain why the degradative phenotype, as with many other pesticide-degrading bacteria, can be lost on subculture. A 22-kb EcoRI fragment of plasmid pHRIM620 was expressed in Escherichia coli and enabled cells to degrade diuron. Transposon mutagenesis of this fragment identified one open reading frame that was essential for enzyme activity. A smaller subclone of this gene (2.5 kb) expressed in E. coli coded for the protein that degraded diuron. This gene and its predicted protein sequence showed only a low level of protein identity (25% over ca. 440 amino acids) to other database sequences and was named after the enzyme it encoded, phenylurea hydrolase (puhA gene). PMID:11319111
Recombination-dependent replication and gene conversion homogenize repeat sequences and diversify plastid genome structure.

PubMed

Ruhlman, Tracey A; Zhang, Jin; Blazier, John C; Sabir, Jamal S M; Jansen, Robert K

2017-04-01

There is a misinterpretation in the literature regarding the variable orientation of the small single copy region of plastid genomes (plastomes). The common phenomenon of small and large single copy inversion, hypothesized to occur through intramolecular recombination between inverted repeats (IR) in a circular, single unit-genome, in fact more likely occurs through recombination-dependent replication (RDR) of linear plastome templates. If RDR can be primed through both intra- and intermolecular recombination, then this mechanism could create not only inversion isomers of so-called single copy regions but also an array of alternative sequence arrangements. We used Illumina paired-end and PacBio single-molecule real-time (SMRT) sequences to characterize repeat structure in the plastome of Monsonia emarginata (Geraniaceae). We used OrgConv and inspected nucleotide alignments to infer ancestral nucleotides and identify gene conversion among repeats, and mapped long (>1 kb) SMRT reads against the unit-genome assembly to identify alternative sequence arrangements. Although M. emarginata lacks the canonical IR, we found that large repeats (>1 kilobase; kb) represent ∼22% of the plastome nucleotide content. Among the largest repeats (>2 kb), we identified GC-biased gene conversion, and mapping filtered long SMRT reads to the M. emarginata unit-genome assembly revealed alternative, substoichiometric sequence arrangements. We offer a model based on RDR and gene conversion between long repeated sequences in the M. emarginata plastome and provide support that both intra- and intermolecular recombination between large repeats, particularly in repeat-rich plastomes, varies unit-genome structure while homogenizing the nucleotide sequence of repeats. © 2017 Botanical Society of America.
Non-contiguous genome sequence of Mycobacterium simiae strain DSM 44165(T.).

PubMed

Sassi, Mohamed; Robert, Catherine; Raoult, Didier; Drancourt, Michel

2013-01-01

Mycobacterium simiae is a non-tuberculosis mycobacterium causing pulmonary infections in both immunocompetent and immunocompromised patients. We announce the draft genome sequence of M. simiae DSM 44165(T). The 5,782,968-bp genome with 65.15% GC content (one chromosome, no plasmid) contains 5,727 open reading frames (33% with unknown function and 11 ORFs longer than 5,000 bp), three rRNA operons, 52 tRNAs, one 66-bp tmRNA matching tmRNA tags from Mycobacterium avium, Mycobacterium tuberculosis, Mycobacterium bovis, Mycobacterium microti, Mycobacterium marinum, and Mycobacterium africanum, and 389 DNA repetitive sequences. When ORFs and size distribution were compared between M. simiae and five other Mycobacterium species, M. simiae clustered with M. abscessus and M. smegmatis. A 40-kb prophage was predicted in addition to two prophage-like elements, 7 kb and 18 kb in size, but no mycobacteriophage was seen after the observation of 10^6 M. simiae cells. Fifteen putative CRISPRs were found. Three genes were predicted to encode resistance to aminoglycosides, beta-lactams and macrolide-lincosamide-streptogramin B. A total of 163 CAZymes were annotated. M. simiae contains ESX-1 to ESX-5 genes encoding a type-VII secretion system. Availability of the genome sequence may help depict the unique properties of this environmental, opportunistic pathogen.
Insights into the Evolution of Mitochondrial Genome Size from Complete Sequences of Citrullus lanatus and Cucurbita pepo (Cucurbitaceae)

PubMed Central

Alverson, Andrew J.; Wei, XiaoXin; Rice, Danny W.; Stern, David B.; Barry, Kerrie; Palmer, Jeffrey D.

2010-01-01

The mitochondrial genomes of seed plants are unusually large and vary in size by at least an order of magnitude. Much of this variation occurs within a single family, the Cucurbitaceae, whose genomes range from an estimated 390 to 2,900 kb in size. We sequenced the mitochondrial genomes of Citrullus lanatus (watermelon: 379,236 nt) and Cucurbita pepo (zucchini: 982,833 nt)—the two smallest characterized cucurbit mitochondrial genomes—and determined their RNA editing content. The relatively compact Citrullus mitochondrial genome actually contains more and longer genes and introns, longer segmental duplications, and more discernibly nuclear-derived DNA. The large size of the Cucurbita mitochondrial genome reflects the accumulation of unprecedented amounts of both chloroplast sequences (>113 kb) and short repeated sequences (>370 kb). A low mutation rate has been hypothesized to underlie increases in both genome size and RNA editing frequency in plant mitochondria. However, despite its much larger genome, Cucurbita has a significantly higher synonymous substitution rate (and presumably mutation rate) than Citrullus but comparable levels of RNA editing. The evolution of mutation rate, genome size, and RNA editing are apparently decoupled in Cucurbitaceae, reflecting either simple stochastic variation or governance by different factors. PMID:20118192
Characterization of an Equine α-S2-Casein Variant Due to a 1.3 kb Deletion Spanning Two Coding Exons

PubMed Central

Brinkmann, Julia; Koudelka, Tomas; Keppler, Julia K.; Tholey, Andreas; Schwarz, Karin; Thaller, Georg; Tetens, Jens

2015-01-01

The production and consumption of mare’s milk in Europe has gained importance, mainly based on positive health effects and a lower allergenic potential as compared to cows’ milk. The allergenicity of milk is to a certain extent affected by different genetic variants. In classical dairy species, much research has been conducted into the genetic variability of milk proteins, but the knowledge in horses is scarce. Here, we characterize two major forms of equine αS2-casein arising from a genomic 1.3 kb in-frame deletion involving two coding exons, one of which represents an equid-specific duplication. Findings at the DNA level have been verified by cDNA sequencing from horse milk of mares with different genotypes. At the protein level, we were able to show by SDS-PAGE and in-gel digestion with subsequent LC-MS analysis that both proteins are actually expressed. The comparison with published sequences of other equids revealed that the deletion probably occurred before the ancestor of present-day asses and zebras diverged from the horse lineage. PMID:26444874
Genomic organization and expression of the human MSH3 gene

DOE Office of Scientific and Technical Information (OSTI.GOV)

Watanabe, Atsushi; Ikejima, Miyoko; Suzuki, Noriko

1996-02-01

We have studied the expression and genomic organization of the human MSH3 gene, which encodes a human homologue of the bacterial DNA mismatch repair protein MutS. This gene is located upstream of the dihydrofolate reductase (DHFR) gene. Northern analysis has demonstrated that the hMSH3 gene is expressed in a variety of human tissues at low levels, like the DHFR gene. Characterization of cosmid clones has shown that the hMSH3 gene consists of 24 exons spanning at least 160 kb. All exon-intron junction sequences match the classical GT/AG rule, except that intron 6 has AT and AA at the ends. Two major transcripts of 5.0 and 3.8 kb have been shown to be derived from the differential use of two polyadenylation sites. Elucidation of the complete genomic organization and the nucleotide sequences of the introns of the hMSH3 gene should be useful for studying the function of this gene and the possible involvement of specific mutations of the hMSH3 gene in some diseases. 34 refs., 5 figs., 1 tab.
The complete genomic sequence of egg drop syndrome virus strain AAV-2.

PubMed

Jin, Q; Zeng, L; Yang, F; Li, M; Hou, Y

1999-12-01

To characterize the genome of the egg drop syndrome virus (EDSV-76) Chinese strain AAV-2, part of the restriction endonuclease physical map was analyzed and a complete genomic library was constructed. On this basis, the complete genome nucleotide sequence (32,838 bp in length, including terminal structures) was determined. Data analysis shows that, compared with other adenoviruses, strain AAV-2 differs considerably in genomic structure and in the distribution of open reading frames (ORFs). There are no clear E1, E3 and E4 regions in the AAV-2 genome. Two segments located at the two ends of the genome (1.1 kb and 8.3 kb in length, respectively) have no homology with other adenovirus genomes. In addition, the strain AAV-2 genome lacks ORFs encoding E1A, pV and pIX, which are common ORFs encoding early and late proteins in adenoviruses. This reveals, for the first time, differences between EDSV-76, the sole standard strain of group III avian adenoviruses, and the other avian adenoviruses. These findings will aid research on avian adenoviruses and on adenoviruses in general.
Saccharomyces cerevisiae sigma 1278b has novel genes of the N-acetyltransferase gene superfamily required for L-proline analogue resistance.

PubMed

Takagi, H; Shichiri, M; Takemura, M; Mohri, M; Nakamori, S

2000-08-01

On the chromosome of Saccharomyces cerevisiae Sigma 1278b, we discovered novel genes involved in resistance to the L-proline analogue L-azetidine-2-carboxylic acid that are not present in the standard laboratory strains. A 5.4-kb DNA fragment was cloned from the genomic library of an L-azetidine-2-carboxylic acid-resistant mutant derived from a cross between S. cerevisiae strains S288C and Sigma 1278b. The nucleotide sequence of a 4.5-kb segment exhibited no identity with the sequence in the genome project involving strain S288C. Deletion analysis indicated that one open reading frame encoding a predicted protein of 229 amino acids is indispensable for L-azetidine-2-carboxylic acid resistance. The protein sequence was found to be a member of the N-acetyltransferase superfamily. Genomic Southern analysis and gene disruption showed that two copies of the novel gene, with one amino acid change at position 85, required for L-azetidine-2-carboxylic acid resistance were present on chromosomes X and XIV of Sigma 1278b background strains. When this novel MPR1 or MPR2 gene (sigma 1278b gene for L-proline analogue resistance) was introduced into the other S. cerevisiae strains, all of the recombinants were resistant to L-azetidine-2-carboxylic acid, indicating that both MPR1 and MPR2 are expressed and have a global function in S. cerevisiae.
Construction, Characterization, and Preliminary BAC-End Sequence Analysis of a Bacterial Artificial Chromosome Library of the Tea Plant (Camellia sinensis)

PubMed Central

Lin, Jinke; Kudrna, Dave; Wing, Rod A.

2011-01-01

We describe the construction and characterization of a publicly available BAC library for the tea plant, Camellia sinensis. Using modified methods, the library was constructed with the aim of developing public molecular resources to advance tea plant genomics research. The library consists of a total of 401,280 clones with an average insert size of 135 kb, providing an approximate coverage of 13.5 haploid genome equivalents. No empty vector clones were observed in a random sampling of 576 BAC clones. Further analysis of 182 BAC-end sequences from randomly selected clones revealed a GC content of 40.35% and low chloroplast and mitochondrial contamination. Repetitive sequence analyses indicated that LTR retrotransposons were the most predominant sequence class (86.93%–87.24%), followed by DNA retrotransposons (11.16%–11.69%). Additionally, we found 25 simple sequence repeats (SSRs) that could potentially be used as genetic markers. PMID:21234344
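The coverage figure in this abstract follows from the standard relation coverage = (number of clones × average insert size) / haploid genome size. The abstract does not state the genome size it assumed, so the sketch below rearranges the relation to recover the implied value; the function names are hypothetical and the result is an inference from the quoted numbers, not a figure reported by the authors.

```python
def library_coverage(n_clones, mean_insert_kb, genome_size_kb):
    """Genome equivalents represented by a clone library."""
    return n_clones * mean_insert_kb / genome_size_kb


def implied_genome_size_kb(n_clones, mean_insert_kb, coverage):
    """Haploid genome size (kb) implied by a stated library coverage."""
    return n_clones * mean_insert_kb / coverage


if __name__ == "__main__":
    # Values quoted in the abstract: 401,280 clones, 135 kb inserts, 13.5x coverage.
    size_kb = implied_genome_size_kb(401_280, 135, 13.5)
    print(f"Implied haploid genome size: {size_kb / 1e6:.2f} Gb")
```

With the quoted values, the implied haploid genome size is about 4.0 Gb.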
Comparative sequence analysis of the X-inactivation center region in mouse, human, and bovine.

PubMed

Chureau, Corinne; Prissette, Marine; Bourdet, Agnès; Barbe, Valérie; Cattolico, Laurence; Jones, Louis; Eggen, André; Avner, Philip; Duret, Laurent

2002-06-01

We have sequenced to high levels of accuracy 714-kb and 233-kb regions of the mouse and bovine X-inactivation centers (Xic), respectively, centered on the Xist gene. This has provided the basis for a fully annotated comparative analysis of the mouse Xic with the 2.3-Mb orthologous region in human and has allowed a three-way species comparison of the core central region, including the Xist gene. These comparisons have revealed conserved genes, both coding and noncoding, conserved CpG islands and, more surprisingly, conserved pseudogenes. The distribution of repeated elements, especially LINE repeats, in the mouse Xic region when compared to the rest of the genome does not support the hypothesis of a role for these repeat elements in the spreading of X inactivation. Interestingly, an asymmetric distribution of LINE elements on the two DNA strands was observed in the three species, not only within introns but also in intergenic regions. This feature is suggestive of important transcriptional activity within these intergenic regions. In silico prediction followed by experimental analysis has allowed four new genes, Cnbp2, Ftx, Jpx, and Ppnx, to be identified and novel, widespread, complex, and apparently noncoding transcriptional activity to be characterized in a region 5' of Xist that was recently shown to attract histone modification early after the onset of X inactivation.

Allelic association of sequence variants in the herpes virus entry mediator-B gene (PVRL2) with the severity of multiple sclerosis.

PubMed

Schmidt, S; Pericak-Vance, M A; Sawcer, S; Barcellos, L F; Hart, J; Sims, J; Prokop, A M; van der Walt, J; DeLoa, C; Lincoln, R R; Oksenberg, J R; Compston, A; Hauser, S L; Haines, J L; Gregory, S G

2006-07-01

Discrepant findings have been reported regarding an association of the apolipoprotein E (APOE) gene with the clinical course of multiple sclerosis (MS). To resolve these discrepancies, we examined common sequence variation in six candidate genes residing in a 380-kb genomic region surrounding and including the APOE locus for an association with MS severity. We genotyped at least three polymorphisms in each of six candidate genes in 1,540 Caucasian MS families (729 single-case and multiple-case families from the United States, 811 single-case families from the UK). By applying the quantitative transmission/disequilibrium test to a recently proposed MS severity score, the only statistically significant (P=0.003) association with MS severity was found for an intronic variant in the Herpes Virus Entry Mediator-B Gene PVRL2. Additional genotyping extended the association to a 16.6 kb block spanning intron 1 to intron 2 of the gene. Sequencing of PVRL2 failed to identify variants with an obvious functional role. In conclusion, the analysis of a very large data set suggests that genetic polymorphisms in PVRL2 may influence MS severity and supports the possibility that viral factors may contribute to the clinical course of MS, consistent with previous reports.
[Analysis of cis-regulatory element distribution in gene promoters of Gossypium raimondii and Arabidopsis thaliana].

PubMed

Sun, Gao-Fei; He, Shou-Pu; Du, Xiong-Ming

2013-10-01

Cotton genomic studies have boomed since the release of the Gossypium raimondii draft genome. In this study, cis-regulatory elements (CREs) from the PLACE database (Plant cis-acting Regulatory DNA Elements) were scanned in the 1-kb sequences upstream of the 5' UTR of annotated genes in the Arabidopsis thaliana (At) and Gossypium raimondii (Gr) genomes. By the definition used in this study, 44 (12.3%) and 57 (15.5%) CREs showed a "peak-like" distribution in the selected 1-kb sequences of the two genomes, respectively. Thirty-four of them were peak-like in both genomes and could be further categorized into 4 types based on their core sequences. The coincidence of the TATABOX peak position with its actual position (about -30 bp) indicated that the position of a common CRE is conserved across different genes, which suggests that the peak positions of these CREs are the likely actual binding positions of transcription factors. The position of a common CRE also differed between the two genomes, owing to stronger length variation of the 5' UTR in Gr than in At. Furthermore, most of the peak-like CREs were located in the region from -110 bp to 0 bp, which suggests that a concentrated distribution might be conducive to interactions among transcription factors and thereby regulate expression of the downstream gene.
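A positional scan of the kind described in this abstract can be sketched as a histogram of motif start positions measured backwards from the 3' end of each upstream window. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the 10-bp bin width, and the example motif "TATAAA" are all hypothetical choices, and the abstract does not say how a "peak-like" distribution was scored.

```python
from collections import Counter


def motif_positions(upstream, motif):
    """Yield motif start positions relative to the 3' end of the window.

    Position -30 means the motif starts 30 bp before the end of the
    upstream sequence. Overlapping matches are reported.
    """
    length = len(upstream)
    start = upstream.find(motif)
    while start != -1:
        yield start - length
        start = upstream.find(motif, start + 1)


def position_histogram(upstreams, motif, bin_width=10):
    """Count motif occurrences per position bin across many upstream windows."""
    counts = Counter()
    for seq in upstreams:
        for pos in motif_positions(seq.upper(), motif):
            # Floor division keeps negative positions in left-aligned bins.
            counts[(pos // bin_width) * bin_width] += 1
    return counts


if __name__ == "__main__":
    # Two toy 100-bp windows with a hypothetical motif at -30 and -25.
    toy = ["G" * 70 + "TATAAA" + "G" * 24, "G" * 75 + "TATAAA" + "G" * 19]
    for bin_start, n in sorted(position_histogram(toy, "TATAAA").items()):
        print(bin_start, n)
```

A motif whose counts concentrate in one or two adjacent bins would be a candidate for the peak-like pattern; a real analysis would also need a background model to decide whether a peak is significant.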
Complete genome sequence of the phenanthrene-degrading soil bacterium Delftia acidovorans Cs1-4

DOE PAGES

Shetty, Ameesha R.; de Gannes, Vidya; Obi, Chioma C.; ...

2015-08-15

Polycyclic aromatic hydrocarbons (PAH) are ubiquitous environmental pollutants and microbial biodegradation is an important means of remediation of PAH-contaminated soil. Delftia acidovorans Cs1-4 (formerly Delftia sp. Cs1-4) was isolated by using phenanthrene as the sole carbon source from PAH-contaminated soil in Wisconsin. Its full genome sequence was determined to gain insights into the mechanisms underlying biodegradation of PAH. Three genomic libraries were constructed and sequenced: an Illumina GAii shotgun library (916,416,493 reads), a 454 Titanium standard library (770,171 reads) and one paired-end 454 library (average insert size of 8 kb, 508,092 reads). The initial assembly contained 40 contigs in two scaffolds. The 454 Titanium standard data and the 454 paired-end data were assembled together and the consensus sequences were computationally shredded into 2 kb overlapping shreds. Illumina sequencing data was assembled, and the consensus sequence was computationally shredded into 1.5 kb overlapping shreds. Gaps between contigs were closed by editing in Consed, by PCR and by Bubble PCR primer walks. A total of 182 additional reactions were needed to close gaps and to raise the quality of the finished sequence. The final assembly is based on 253.3 Mb of 454 draft data (averaging 38.4 X coverage) and 590.2 Mb of Illumina draft data (averaging 89.4 X coverage). The genome of strain Cs1-4 consists of a single circular chromosome of 6,685,842 bp (66.7 %G+C) containing 6,028 predicted genes; 5,931 of these genes were protein-encoding and 4,425 gene products were assigned to a putative function. Genes encoding phenanthrene degradation were localized to a 232 kb genomic island (termed the phn island), which contained near its 3’ end a bacteriophage P4-like integrase, an enzyme often associated with chromosomal integration of mobile genetic elements.
Other biodegradation pathways reconstructed from the genome sequence included: benzoate (by the acetyl-CoA pathway), styrene, nicotinic acid (by the maleamate pathway) and the pesticides Dicamba and Fenitrothion. Lastly, determination of the complete genome sequence of D. acidovorans Cs1-4 has provided new insights into the microbial mechanisms of PAH biodegradation that may shape the process in the environment.
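The "computational shredding" step mentioned in this abstract, cutting a consensus sequence into fixed-length overlapping pieces so that it can be fed into a second assembly as pseudo-reads, can be sketched as follows. The abstract gives the shred lengths (2 kb and 1.5 kb) but not the overlap, so the overlap here is a free parameter, and the function name is hypothetical rather than taken from any assembly toolkit.

```python
def shred(consensus, shred_len, overlap):
    """Cut a consensus sequence into overlapping fixed-length shreds.

    Returns a list of (start, fragment) tuples. The final shred may be
    shorter than shred_len when the sequence length is not a multiple
    of the step size.
    """
    if not 0 <= overlap < shred_len:
        raise ValueError("overlap must satisfy 0 <= overlap < shred_len")
    step = shred_len - overlap
    shreds = []
    for start in range(0, len(consensus), step):
        shreds.append((start, consensus[start:start + shred_len]))
        if start + shred_len >= len(consensus):
            break  # the end of the consensus has been covered
    return shreds


if __name__ == "__main__":
    # Toy example: 4-bp shreds with a 2-bp overlap.
    # For the 454 consensus in the abstract, shred_len would be 2000;
    # the overlap value is an assumption because it is not reported.
    for start, fragment in shred("ACGTACGTAC", shred_len=4, overlap=2):
        print(start, fragment)
```

Every base is covered by at least one shred, and interior bases are covered more than once whenever the overlap is non-zero, which is what lets the shreds be re-assembled alongside reads from another platform.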
PMS2 gene mutational analysis: direct cDNA sequencing to circumvent pseudogene interference.

PubMed

Wimmer, Katharina; Wernstedt, Annekatrin

2014-01-01

The presence of highly homologous pseudocopies can compromise the mutation analysis of a gene of interest. In particular, when using PCR-based strategies, pseudogene co-amplification has to be effectively prevented. This is often achieved by using primers designed to be parental gene specific according to the reference sequence and by applying stringent PCR conditions. However, there are cases in which this approach is of limited utility. For example, it has been shown that the PMS2 gene exchanges sequences with one of its pseudogenes, named PMS2CL. This results in functional PMS2 alleles containing pseudogene-derived sequences at their 3'-end and in nonfunctional PMS2CL pseudogene alleles that contain gene-derived sequences. Hence, the paralogues cannot be distinguished according to the reference sequence. This shortcoming can be effectively circumvented by using direct cDNA sequencing. This approach is based on the selective amplification of PMS2 transcripts in two overlapping 1.6-kb RT-PCR products. In addition to avoiding pseudogene co-amplification and allele dropout, this method also has the advantage that it allows effective identification of deletions, splice mutations, and de novo retrotransposon insertions that escape detection by most DNA-based mutation analysis protocols.
Evolutionary analysis of a large mtDNA translocation (numt) into the nuclear genome of the Panthera genus species

PubMed Central

Kim, Jae-Heup; Antunes, Agostinho; Luo, Shu-Jin; Menninger, Joan; Nash, William G.; O’Brien, Stephen J.; Johnson, Warren E.

2006-01-01

Translocation of cymtDNA into the nuclear genome, also referred to as numt, has been reported in many species, including several closely related to the domestic cat (Felis catus). We describe the recent transposition of 12,536 bp of the 17 kb mitochondrial genome into the nucleus of the common ancestor of the five Panthera genus species: tiger, P. tigris; snow leopard, P. uncia; jaguar, P. onca; leopard, P. pardus; and lion, P. leo. This nuclear integration, representing 74% of the mitochondrial genome, is one of the largest to be reported in eukaryotes. The Panthera genus numt differs from the numt previously described in the Felis genus in: (1) chromosomal location (F2 – telomeric region vs. D2 – centromeric region), (2) gene make up (from the ND5 to the ATP8 vs. from the CR to the COII), (3) size (12.5 kb vs. 7.9 kb), and (4) structure (single monomer vs. tandemly repeated in Felis). These distinctions indicate that the origin of this large numt fragment in the nuclear genome of the Panthera species is an independent insertion from that of the domestic cat lineage, which has been further supported by phylogenetic analyses. The tiger cymtDNA shared around 90% sequence identity with the homologous numt sequence, suggesting an origin for the Panthera numt at around 3.5 million years ago, prior to the radiation of the five extant Panthera species. PMID:16380222
ChimerDB 3.0: an enhanced database for fusion genes from cancer transcriptome and literature data mining.

PubMed

Lee, Myunggyo; Lee, Kyubum; Yu, Namhee; Jang, Insu; Choi, Ikjung; Kim, Pora; Jang, Ye Eun; Kim, Byounggun; Kim, Sunkyu; Lee, Byungwook; Kang, Jaewoo; Lee, Sanghyuk

2017-01-04

Fusion genes are an important class of therapeutic targets and prognostic markers in cancer. ChimerDB is a comprehensive database of fusion genes encompassing analysis of deep sequencing data and manual curation. In this update, the database coverage was enhanced considerably by adding two new modules: The Cancer Genome Atlas (TCGA) RNA-Seq analysis and PubMed abstract mining. ChimerDB 3.0 is composed of three modules: ChimerKB, ChimerPub and ChimerSeq. ChimerKB is a knowledgebase of 1066 manually curated fusion genes compiled from public resources of fusion genes with experimental evidence. ChimerPub includes 2767 fusion genes obtained from text mining of PubMed abstracts. The ChimerSeq module is designed to archive fusion candidates from deep sequencing data. Importantly, we have analyzed RNA-Seq data of the TCGA project covering 4569 patients in 23 cancer types using two reliable programs, FusionScan and TopHat-Fusion. The new user interface supports diverse search options and graphic representation of fusion gene structure. ChimerDB 3.0 is available at http://ercsb.ewha.ac.kr/fusiongene/. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.
Characterization of the human lineage-specific pericentric inversion that distinguishes human chromosome 1 from the homologous chromosomes of the great apes.

PubMed

Szamalek, Justyna M; Goidts, Violaine; Cooper, David N; Hameister, Horst; Kehrer-Sawatzki, Hildegard

2006-08-01

The human and chimpanzee genomes are distinguishable in terms of ten gross karyotypic differences including nine pericentric inversions and a chromosomal fusion. Seven of these large pericentric inversions are chimpanzee-specific whereas two of them, involving human chromosomes 1 and 18, were fixed in the human lineage after the divergence of humans and chimpanzees. We have performed detailed molecular and computational characterization of the breakpoint regions of the human-specific inversion of chromosome 1. FISH analysis and sequence comparisons together revealed that the pericentromeric region of HSA 1 contains numerous segmental duplications that display a high degree of sequence similarity between both chromosomal arms. Detailed analysis of these regions has allowed us to refine the p-arm breakpoint region to a 154.2 kb interval at 1p11.2 and the q-arm breakpoint region to a 562.6 kb interval at 1q21.1. Both breakpoint regions contain human-specific segmental duplications arranged in inverted orientation. We therefore propose that the pericentric inversion of HSA 1 was mediated by intra-chromosomal non-homologous recombination between these highly homologous segmental duplications that had themselves arisen only recently in the human lineage by duplicative transposition.
Genomic Investigation Reveals Highly Conserved, Mosaic, Recombination Events Associated with Capsular Switching among Invasive Neisseria meningitidis Serogroup W Sequence Type (ST)-11 Strains.

PubMed

Mustapha, Mustapha M; Marsh, Jane W; Krauland, Mary G; Fernandez, Jorge O; de Lemos, Ana Paula S; Dunning Hotopp, Julie C; Wang, Xin; Mayer, Leonard W; Lawrence, Jeffrey G; Hiller, N Luisa; Harrison, Lee H

2016-07-03

Neisseria meningitidis is an important cause of meningococcal disease globally. Sequence type (ST)-11 clonal complex (cc11) is a hypervirulent meningococcal lineage historically associated with serogroup C capsule and is believed to have acquired the W capsule through a C to W capsular switching event. We studied the sequence of the capsule gene cluster (cps) and adjoining genomic regions of 524 invasive W cc11 strains isolated globally. We identified recombination breakpoints corresponding to two distinct recombination events within W cc11: an 8.4-kb recombinant region, likely acquired from W cc22 and including the sialic acid/glycosyl-transferase gene csw, resulted in a C→W change in capsular phenotype; and a 13.7-kb recombinant segment, likely acquired from the Y cc23 lineage, includes 4.5 kb of cps genes and 8.2 kb downstream of the cps cluster, resulting in allelic changes in capsule translocation genes. A vast majority of W cc11 strains (497/524, 94.8%) retain both recombination events as evidenced by sharing identical or very closely related capsular allelic profiles. These data suggest that the W cc11 capsular switch involved two separate recombination events and that current global W cc11 meningococcal disease is caused by strains bearing this mosaic capsular switch. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Genome Partitioner: A web tool for multi-level partitioning of large-scale DNA constructs for synthetic biology applications

PubMed Central

Del Medico, Luca; Christen, Heinz; Christen, Beat

2017-01-01

Recent advances in lower-cost DNA synthesis techniques have enabled new innovations in the field of synthetic biology. Still, efficient design and higher-order assembly of genome-scale DNA constructs remains a labor-intensive process. Given the complexity, computer assisted design tools that fragment large DNA sequences into fabricable DNA blocks are needed to pave the way towards streamlined assembly of biological systems. Here, we present the Genome Partitioner software implemented as a web-based interface that permits multi-level partitioning of genome-scale DNA designs. Without the need for specialized computing skills, biologists can submit their DNA designs to a fully automated pipeline that generates the optimal retrosynthetic route for higher-order DNA assembly. To test the algorithm, we partitioned a 783 kb Caulobacter crescentus genome design. We validated the partitioning strategy by assembling a 20 kb test segment encompassing a difficult to synthesize DNA sequence. Successful assembly from 1 kb subblocks into the 20 kb segment highlights the effectiveness of the Genome Partitioner for reducing synthesis costs and timelines for higher-order DNA assembly. The Genome Partitioner is broadly applicable to translate DNA designs into ready to order sequences that can be assembled with standardized protocols, thus offering new opportunities to harness the diversity of microbial genomes for synthetic biology applications. The Genome Partitioner web tool can be accessed at https://christenlab.ethz.ch/GenomePartitioner. PMID:28531174
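The multi-level partitioning described in this abstract can be illustrated with a minimal sketch. It assumes plain fixed-size cuts (20 kb segments, each split into 1 kb subblocks) applied to a 783 kb design; the function names `partition` and `multilevel_partition` are hypothetical, and the sketch does not attempt the boundary optimization or synthesis constraints that the actual Genome Partitioner pipeline presumably handles.

```python
def partition(length, block_size):
    """Split the interval [0, length) into consecutive (start, end) blocks.

    Hypothetical helper for illustration: fixed-size cuts only, with a
    shorter final block when length is not a multiple of block_size.
    """
    return [(start, min(start + block_size, length))
            for start in range(0, length, block_size)]


def multilevel_partition(length, segment_size, subblock_size):
    """Two-level partition: segments, each divided into subblocks.

    Returns a list of (segment, subblocks) pairs in absolute coordinates.
    """
    levels = []
    for seg_start, seg_end in partition(length, segment_size):
        subblocks = [(seg_start + s, seg_start + e)
                     for s, e in partition(seg_end - seg_start, subblock_size)]
        levels.append(((seg_start, seg_end), subblocks))
    return levels


# Sizes taken from the abstract: 783 kb design, 20 kb segments, 1 kb subblocks.
design = multilevel_partition(783_000, 20_000, 1_000)
print(len(design))        # 40 segments (39 full segments plus one 3 kb remainder)
print(len(design[0][1]))  # 20 subblocks in the first segment
```

Under these assumptions the last segment is only 3 kb long; a real tool would likely rebalance block sizes rather than leave a short remainder.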
Colonization of heterochromatic genes by transposable elements in Drosophila.

PubMed

Dimitri, Patrizio; Junakovic, Nikolaj; Arcà, Bruno

2003-04-01

As a further step toward understanding transposable element-host genome interactions, we investigated the molecular anatomy of introns from five heterochromatic and 22 euchromatic protein-coding genes of Drosophila melanogaster. A total of 79 kb of intronic sequences from heterochromatic genes and 355 kb of intronic sequences from euchromatic genes have been used in Blast searches against Drosophila transposable elements (TEs). The results show that TE-homologous sequences belonging to 19 different families represent about 50% of intronic DNA from heterochromatic genes. In contrast, only 0.1% of the euchromatic intron DNA exhibits homology to known TEs. Intraspecific and interspecific size polymorphisms of introns were found, which are likely to be associated with changes in TE-related sequences. Together, the enrichment in TEs and the apparent dynamic state of heterochromatic introns suggest that TEs contribute significantly to the evolution of genes located in heterochromatin.
Bacterial Artificial Chromosome Libraries for Mouse Sequencing and Functional Analysis

PubMed Central

Osoegawa, Kazutoyo; Tateno, Minako; Woon, Peng Yeong; Frengen, Eirik; Mammoser, Aaron G.; Catanese, Joseph J.; Hayashizaki, Yoshihide; de Jong, Pieter J.

2000-01-01

Bacterial artificial chromosome (BAC) and P1-derived artificial chromosome (PAC) libraries providing a combined 33-fold representation of the murine genome have been constructed using two different restriction enzymes for genomic digestion. A large-insert PAC library was prepared from the 129S6/SvEvTac strain in a bacterial/mammalian shuttle vector to facilitate functional gene studies. For genome mapping and sequencing, we prepared BAC libraries from the 129S6/SvEvTac and the C57BL/6J strains. The average insert sizes for the three libraries range between 130 kb and 200 kb. Based on the numbers of clones and the observed average insert sizes, we estimate each library to have slightly in excess of 10-fold genome representation. The average number of clones found after hybridization screening with 28 probes was in the range of 9–14 clones per marker. To explore the fidelity of the genomic representation in the three libraries, we analyzed three contigs, each established after screening with a single unique marker. New markers were established from the end sequences and screened against all the contig members to determine if any of the BACs and PACs are chimeric or rearranged. Only one chimeric clone and six potential deletions have been observed after extensive analysis of 113 PAC and BAC clones. Seventy-one of the 113 clones were conclusively nonchimeric because both end markers or sequences were mapped to the other confirmed contig members. We could not exclude chimerism for the remaining 41 clones because one or both of the insert termini did not contain unique sequence to design markers. The low rate of chimerism, ∼1%, and the low level of detected rearrangements support the anticipated usefulness of the BAC libraries for genome research. [The sequence data described in this paper have been submitted to the GenBank data library under accession numbers AQ797173–AQ797398.] PMID:10645956
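The coverage estimate in this abstract (number of clones times average insert size, divided by genome size) can be sketched as follows. The abstract gives neither the clone counts nor the genome size, so the 2.7 Gb mouse genome size and the 150 kb insert size used below are assumptions for illustration, and the function names are hypothetical. The probability estimate uses the standard Poisson approximation for randomly distributed clones.

```python
import math


def fold_coverage(n_clones, avg_insert_bp, genome_bp):
    """Genome equivalents represented by a library."""
    return n_clones * avg_insert_bp / genome_bp


def clones_needed(target_coverage, avg_insert_bp, genome_bp):
    """Smallest clone count that reaches the target fold coverage."""
    return math.ceil(target_coverage * genome_bp / avg_insert_bp)


def prob_locus_present(coverage):
    """Poisson approximation: chance a given locus is in at least one clone."""
    return 1.0 - math.exp(-coverage)


GENOME_BP = 2_700_000_000  # assumed mouse genome size, not from the abstract
INSERT_BP = 150_000        # assumed value inside the reported 130-200 kb range

n = clones_needed(10, INSERT_BP, GENOME_BP)
print(n)                                       # clones for 10-fold coverage
print(fold_coverage(n, INSERT_BP, GENOME_BP))  # fold coverage achieved
print(prob_locus_present(10))                  # probability a locus is covered
```

At roughly 10-fold coverage, a single-copy probe would be expected to hit about 10 clones on average, which is in line with the 9 to 14 clones per marker reported above.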
Genetic analysis of biodegradation of tetralin by a Sphingomonas strain

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hernaez, M.J.; Santero, E.; Reineke, W.

Tetralin (1,2,3,4-tetrahydronaphthalene) is produced for industrial purposes from naphthalene by catalytic hydrogenation or from anthracene by cracking. A strain designated TFA which very efficiently utilizes tetralin has been isolated from the Rhine river. The strain has been identified as Sphingomonas macrogoltabidus, based on 16S rDNA sequence similarity. Genetic analysis of tetralin biodegradation has been performed by insertion mutagenesis and by physical analysis and analysis of complementation between the mutants. The genes involved in tetralin utilization are clustered in a region of 9 kb, comprising at least five genes grouped in two divergently transcribed operons.
An in vitro reprogrammable antiviral RISC with size-preferential ribonuclease activity.

PubMed

Omarov, Rustem T; Ciomperlik, Jessica; Scholthof, Herman B

2016-03-01

Infection of Nicotiana benthamiana plants with Tomato bushy stunt virus (TBSV) mutants compromised for silencing suppression induces formation of an antiviral RISC (vRISC) that can be isolated using chromatography procedures. The isolated vRISC sequence-specifically degrades TBSV RNA in vitro, its activity can be down-regulated by removing siRNAs, and re-stimulated by exogenous supply of siRNAs. vRISC is most effective at hydrolyzing the ~4.8kb genomic RNA, but less so for a ~2.2kb TBSV subgenomic mRNA (sgRNA1), while the 3' co-terminal sgRNA2 of ~0.9kb appears insensitive to vRISC cleavage. Moreover, experiments with in vitro generated 5' co-terminal viral transcripts show that RNAs of ~2.7kb are efficiently cleaved while those of ~1.1kb or shorter are unaffected. The isolated antiviral ribonuclease complex fails to degrade ~0.4kb defective interfering RNAs (DIs) in vitro, agreeing with findings that in plants DIs are not targeted by silencing. Copyright © 2016. Published by Elsevier Inc.
Evidence of protein-free homology recognition in magnetic bead force-extension experiments

NASA Astrophysics Data System (ADS)

O'Lee, D. J.; Danilowicz, C.; Rochester, C.; Kornyshev, A. A.; Prentiss, M.

2016-07-01

Earlier theoretical studies have proposed that the homology-dependent pairing of large tracts of dsDNA may be due to physical interactions between homologous regions. Such interactions could contribute to the sequence-dependent pairing of chromosome regions that may occur in the presence or the absence of double-strand breaks. Several experiments have indicated the recognition of homologous sequences in pure electrolytic solutions without proteins. Here, we report single-molecule force experiments with a designed 60 kb long dsDNA construct; one end attached to a solid surface and the other end to a magnetic bead. The 60 kb constructs contain two 10 kb long homologous tracts oriented head to head, so that their sequences match if the two tracts fold on each other. The distance between the bead and the surface is measured as a function of the force applied to the bead. At low forces, the construct molecules extend substantially less than normal, control dsDNA, indicating the existence of preferential interaction between the homologous regions. Increasing the force causes continuous rather than abrupt unfolding of the paired homologous regions. Simple semi-phenomenological models of the unfolding mechanics are proposed, and their predictions are compared with the data.
Stability of Tandem Repeats in the Drosophila Melanogaster HSR-Omega Nuclear RNA

PubMed Central

Hogan, N. C.; Slot, F.; Traverse, K. L.; Garbe, J. C.; Bendena, W. G.; Pardue, M. L.

1995-01-01

The Drosophila melanogaster Hsr-omega locus produces a nuclear RNA containing >5 kb of tandem repeat sequences. These repeats are unique to Hsr-omega and show concerted evolution similar to that seen with classical satellite DNAs. In D. melanogaster the monomer is ~280 bp. Sequences of 19 1/2 monomers differ by 8 ± 5% (mean ± SD), when all pairwise comparisons are considered. Differences are single nucleotide substitutions and 1-3 nucleotide deletions/insertions. Changes appear to be randomly distributed over the repeat unit. Outer repeats do not show the decrease in monomer homogeneity that might be expected if homogeneity is maintained by recombination. However, just outside the last complete repeat at each end, there are a few fragments of sequence similar to the monomer. The sequences in these flanking regions are not those predicted for sequences decaying in the absence of recombination. Instead, the fragmentation of the sequence homology suggests that flanking regions have undergone more severe disruptions, possibly during an insertion or amplification event. Hsr-omega alleles differing in the number of repeats are detected and appear to be stable over a few thousand generations; however, both increases and decreases in repeat numbers have been observed. The new alleles appear to be as stable as their predecessors. No alleles of less than ~5 kb nor more than ~16 kb of repeats were seen in any stocks examined. The evidence that there is a limit on the minimum number of repeats is consistent with the suggestion that these repeats are important in the function of the unusual Hsr-omega nuclear RNA. PMID:7540581
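The pairwise comparison statistic quoted in this abstract (percent difference, mean ± SD over all pairs of monomers) can be sketched as below. This is only an illustration under stated assumptions: the sequences are taken to be already aligned to equal length, a gap character counts as one difference per column, the sample standard deviation is used (the abstract does not say which SD was computed), and the toy sequences are invented rather than real Hsr-omega monomers.

```python
from itertools import combinations
from statistics import mean, stdev


def percent_difference(a, b):
    """Percent of aligned columns at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    mismatches = sum(x != y for x, y in zip(a, b))
    return 100.0 * mismatches / len(a)


def pairwise_summary(seqs):
    """Mean and sample SD of percent difference over all sequence pairs."""
    diffs = [percent_difference(a, b) for a, b in combinations(seqs, 2)]
    return mean(diffs), stdev(diffs)


# Invented toy alignment (10 columns); '-' marks a deletion.
toy = ["ACGTACGTAC", "ACGTACGTAT", "ACGAACGTAC", "ACGTAC-TAC"]
m, sd = pairwise_summary(toy)
print(f"{m:.1f} +/- {sd:.1f} %")
```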
Identification of a novel herpes simplex virus type 1 transcript and protein (AL3) expressed during latency.

PubMed

Jaber, Tareq; Henderson, Gail; Li, Sumin; Perng, Guey-Chuen; Carpenter, Dale; Wechsler, Steven L; Jones, Clinton

2009-10-01

The herpes simplex virus type 1 (HSV-1) latency-associated transcript (LAT) is abundantly expressed in latently infected sensory neurons. In small animal models of infection, expression of the first 1.5 kb of LAT coding sequences is necessary and sufficient for wild-type reactivation from latency. The ability of LAT to inhibit apoptosis is important for reactivation from latency. Within the first 1.5 kb of LAT coding sequences and LAT promoter sequences, additional transcripts have been identified. For example, the anti-sense to LAT transcript (AL) is expressed in the opposite direction to LAT from the 5' end of LAT and LAT promoter sequences. In addition, the upstream of LAT (UOL) transcript is expressed in the LAT direction from sequences in the LAT promoter. Further examination of the first 1.5 kb of LAT coding sequences revealed two small ORFs that are anti-sense with respect to LAT (AL2 and AL3). A transcript spanning AL3 was detected in productively infected cells, mouse neuroblastoma cells stably expressing LAT and trigeminal ganglia (TG) of latently infected mice. Peptide-specific IgG directed against AL3 specifically recognized a protein migrating near 15 kDa in cells stably transfected with LAT, mouse neuroblastoma cells transfected with a plasmid containing the AL3 ORF and TG of latently infected mice. The inability to detect the AL3 protein during productive infection may have been because the 5' terminus of the AL3 transcript was downstream of the first in-frame methionine of the AL3 ORF during productive infection.
Identification of a Divided Genome for VSH-1, the Prophage-Like Gene Transfer Agent of Brachyspira hyodysenteriae

USDA-ARS's Scientific Manuscript database

The Brachyspira hyodysenteriae B204 genome sequence revealed three VSH-1 tail genes hvp31, hvp60, and hvp37, in a 3.6 kb cluster. The location and transcription direction of these genes relative to the previously described VSH-1 16.3 kb gene operon indicate that the gene transfer agent VSH-1 has a ...
Horizontal transfer of DNA from the mitochondrial to the plastid genome and its subsequent evolution in milkweeds (Apocynaceae)

Treesearch

Shannon C.K. Straub; Richard C. Cronn; Christopher Edwards; Mark Fishbein; Aaron Liston

2013-01-01

Horizontal gene transfer (HGT) of DNA from the plastid to the nuclear and mitochondrial genomes of higher plants is a common phenomenon; however, plastid genomes (plastomes) are highly conserved and have generally been regarded as impervious to HGT. We sequenced the 158 kb plastome and the 690 kb mitochondrial genome of common milkweed (Asclepias syriaca [Apocynaceae...
A mutation in an alternative untranslated exon of hexokinase 1 associated with hereditary motor and sensory neuropathy -- Russe (HMSNR).

PubMed

Hantke, Janina; Chandler, David; King, Rosalind; Wanders, Ronald J A; Angelicheva, Dora; Tournev, Ivailo; McNamara, Elyshia; Kwa, Marcel; Guergueltcheva, Velina; Kaneva, Radka; Baas, Frank; Kalaydjieva, Luba

2009-12-01

Hereditary Motor and Sensory Neuropathy -- Russe (HMSNR) is a severe autosomal recessive disorder, identified in the Gypsy population. Our previous studies mapped the gene to 10q22-q23 and refined the gene region to approximately 70 kb. Here we report the comprehensive sequencing analysis and fine mapping of this region, reducing it to approximately 26 kb of fully characterised sequence spanning the upstream exons of Hexokinase 1 (HK1). We identified two sequence variants in complete linkage disequilibrium, a G>C in a novel alternative untranslated exon (AltT2) and a G>A in the adjacent intron, segregating with the disease in affected families and present in the heterozygote state in only 5/790 population controls. Sequence conservation of the AltT2 exon in 16 species with invariable preservation of the G allele at the mutated site, strongly favour the exonic change as the pathogenic mutation. Analysis of the Hk1 upstream region in mouse mRNA from testis and neural tissues showed an abundance of AltT2-containing transcripts generated by extensive, developmentally regulated alternative splicing. Expression is very low compared with ubiquitous Hk1 and all transcripts skip exon1, which encodes the protein domain responsible for binding to the outer mitochondrial membrane, and regulation of energy production and apoptosis. Hexokinase activity measurement and immunohistochemistry of the peripheral nerve showed no difference between patients and controls. The mutational mechanism and functional effects remain unknown and could involve disrupted translational regulation leading to increased anti-apoptotic activity (suggested by the profuse regenerative activity in affected nerves), or impairment of an unknown HK1 function in the peripheral nervous system (PNS).

A mutation in an alternative untranslated exon of hexokinase 1 associated with Hereditary Motor and Sensory Neuropathy – Russe (HMSNR)

PubMed Central

Hantke, Janina; Chandler, David; King, Rosalind; Wanders, Ronald JA; Angelicheva, Dora; Tournev, Ivailo; McNamara, Elyshia; Kwa, Marcel; Guergueltcheva, Velina; Kaneva, Radka; Baas, Frank; Kalaydjieva, Luba

2009-01-01

Hereditary Motor and Sensory Neuropathy – Russe (HMSNR) is a severe autosomal recessive disorder, identified in the Gypsy population. Our previous studies mapped the gene to 10q22-q23 and refined the gene region to ∼70 kb. Here we report the comprehensive sequencing analysis and fine mapping of this region, reducing it to ∼26 kb of fully characterised sequence spanning the upstream exons of Hexokinase 1 (HK1). We identified two sequence variants in complete linkage disequilibrium, a G>C in a novel alternative untranslated exon (AltT2) and a G>A in the adjacent intron, segregating with the disease in affected families and present in the heterozygote state in only 5/790 population controls. Sequence conservation of the AltT2 exon in 16 species with invariable preservation of the G allele at the mutated site, strongly favour the exonic change as the pathogenic mutation. Analysis of the Hk1 upstream region in mouse mRNA from testis and neural tissues showed an abundance of AltT2-containing transcripts generated by extensive, developmentally regulated alternative splicing. Expression is very low compared with ubiquitous Hk1 and all transcripts skip exon1, which encodes the protein domain responsible for binding to the outer mitochondrial membrane, and regulation of energy production and apoptosis. Hexokinase activity measurement and immunohistochemistry of the peripheral nerve showed no difference between patients and controls. The mutational mechanism and functional effects remain unknown and could involve disrupted translational regulation leading to increased anti-apoptotic activity (suggested by the profuse regenerative activity in affected nerves), or impairment of an unknown HK1 function in the peripheral nervous system (PNS). PMID:19536174
The nif Gene Operon of the Methanogenic Archaeon Methanococcus maripaludis

PubMed Central

Kessler, Peter S.; Blank, Carrine; Leigh, John A.

1998-01-01

Nitrogen fixation occurs in two domains, Archaea and Bacteria. We have characterized a nif (nitrogen fixation) gene cluster in the methanogenic archaeon Methanococcus maripaludis. Sequence analysis revealed eight genes, six with sequence similarity to known nif genes and two with sequence similarity to glnB. The gene order, nifH, ORF105 (similar to glnB), ORF121 (similar to glnB), nifD, nifK, nifE, nifN, and nifX, was the same as that found in part in other diazotrophic methanogens and except for the presence of the glnB-like genes, also resembled the order found in many members of the Bacteria. Using transposon insertion mutagenesis, we determined that an 8-kb region required for nitrogen fixation corresponded to the nif gene cluster. Northern analysis revealed the presence of either a single 7.6-kb nif mRNA transcript or 10 smaller mRNA species containing portions of the large transcript. Polar effects of transposon insertions demonstrated that all of these mRNAs arose from a single promoter region, where transcription initiated 80 bp 5′ to nifH. Distinctive features of the nif gene cluster include the presence of the six primary nif genes in a single operon, the placement of the two glnB-like genes within the cluster, the apparent physical separation of the cluster from any other nif genes that might be in the genome, the fragmentation pattern of the mRNA, and the regulation of expression by a repression mechanism described previously. Our study and others with methanogenic archaea reporting multiple mRNAs arising from gene clusters with only a single putative promoter sequence suggest that mRNA processing following transcription may be a common occurrence in methanogens. PMID:9515920
The human MCP-2 gene (SCYA8): Cloning, sequence analysis, tissue expression, and assignment to the CC chemokine gene contig on chromosome 17q11.2

DOE Office of Scientific and Technical Information (OSTI.GOV)

Van Coillie, E.; Fiten, P.; Van Damme, J.

1997-03-01

Monocyte chemotactic proteins (MCPs) form a subfamily of chemokines that recruit leukocytes to sites of inflammation and that may contribute to tumor-associated leukocyte infiltration and to the antiviral state against HIV infection. With the use of degenerate primers that were based on CC chemokine consensus sequences, the known MIP-1α/LD78α, MCP-1, and MCP-3 genes and the previously unidentified eotaxin and MCP-2 genes were isolated from a YAC contig from human chromosome 17q11.2. The amplified genomic MCP-2 fragment was used to isolate an MCP-2 cosmid from which the gene sequence was determined. The MCP-2 gene shares with the MCP-1 and MCP-3 genes a conserved intron-exon structure and a coding nucleotide sequence homology of 77%. By Northern blot analysis the 1.0-kb MCP-2 mRNA was predominantly detectable in the small intestine, peripheral blood, heart, placenta, lung, skeletal muscle, ovary, colon, spinal cord, pancreas, and thymus. Transcripts of 1.5 and 2.4 kb were found in the testis, the small intestine, and the colon. The isolation of the MCP-2 gene from the chemokine contig localized it on YAC clones of chromosome 17q11.2, which also contain the eotaxin, MCP-1, MCP-3, and NCC-1/MCP-4 genes. The combination of using degenerate primer PCR and YACs illustrates that novel genes can efficiently be isolated from gene cluster contigs with less redundancy and effort than the isolation of novel ESTs. 42 refs., 5 figs., 2 tabs.
SSMap: a new UniProt-PDB mapping resource for the curation of structural-related information in the UniProt/Swiss-Prot Knowledgebase.

PubMed

David, Fabrice P A; Yip, Yum L

2008-09-23

Sequences and structures provide valuable complementary information on protein features and functions. However, it is not always straightforward for users to gather information concurrently from the sequence and structure levels. The UniProt knowledgebase (UniProtKB) strives to help users on this undertaking by providing complete cross-references to Protein Data Bank (PDB) as well as coherent feature annotation using available structural information. In this study, SSMap - a new UniProt-PDB residue-residue level mapping - was generated. The primary objective of this mapping is not only to facilitate the two tasks mentioned above, but also to palliate a number of shortcomings of existent mappings. SSMap is the first isoform sequence-specific mapping resource and is up-to-date for UniProtKB annotation tasks. The method employed by SSMap differs from the other mapping resources in that it stresses on the correct reconstruction of the PDB sequence from structures, and on the correct attribution of a UniProtKB entry to each PDB chain by using a series of post-processing steps. SSMap was compared to other existing mapping resources in terms of the correctness of the attribution of PDB chains to UniProtKB entries, and of the quality of the pairwise alignments supporting the residue-residue mapping. It was found that SSMap shared about 80% of the mappings with other mapping sources. New and alternative mappings proposed by SSMap were mostly good as assessed by manual verification of data subsets. As for local pairwise alignments, it was shown that major discrepancies (both in terms of alignment lengths and boundaries), when present, were often due to differences in methodologies used for the mappings. SSMap provides an independent, good quality UniProt-PDB mapping. 
The systematic comparison conducted in this study allows the further identification of general problems in UniProt-PDB mappings so that both the coverage and the quality of the mappings can be systematically improved for the benefit of the scientific community. SSMap mapping is currently used to provide PDB cross-references in UniProtKB.
Novel conjugative plasmids from the natural isolate Lactococcus lactis subspecies cremoris DPC3758: a repository of genes for the potential improvement of dairy starters.

PubMed

Fallico, V; Ross, R P; Fitzgerald, G F; McAuliffe, O

2012-07-01

A collection of 17 natural lactococcal isolates from raw milk cheeses were studied in terms of their plasmid distribution, content, and diversity. All strains in the collection harbored an abundance of plasmids, including Lactococcus lactis ssp. cremoris DPC3758, whose 8-plasmid complement was selected for sequencing. The complete sequences of pAF22 (22,388 bp), pAF14 (14,419 bp), pAF12 (12,067 bp), pAF07 (7,435 bp), and pAF04 (3,801 bp) were obtained, whereas gene functions of technological interest were mapped to pAF65 (65 kb) and pAF45 (45 kb) by PCR. The plasmids of L. lactis DPC3758 were found to encode many genes with the potential to improve the technological properties of dairy starters. These included 3 anti-phage restriction/modification (R/M) systems (1 of type I and 2 of type II) and genes for immunity/resistance to nisin, lacticin 481, cadmium, and copper. Regions encoding conjugative/mobilization functions were present in 6 of the 8 plasmids, including those containing the R/M systems, thus enabling the food-grade transfer of these mechanisms to industrial strains. Using cadmium selection, the sequential stacking of the R/M plasmids into a plasmid-free host provided the recipient with increased protection against 936- and c2-type phages. The association of food-grade selectable markers and mobilization functions on L. lactis DPC3758 plasmids will facilitate their exploitation to obtain industrial strains with enhanced phage protection and robustness. These natural plasmids also provide another example of the major role of plasmids in contributing to host fitness and preservation within its ecological niche. Copyright © 2012 American Dairy Science Association. Published by Elsevier Inc. All rights reserved.
piggyBac transposons expressing full-length human dystrophin enable genetic correction of dystrophic mesoangioblasts

PubMed Central

Loperfido, Mariana; Jarmin, Susan; Dastidar, Sumitava; Di Matteo, Mario; Perini, Ilaria; Moore, Marc; Nair, Nisha; Samara-Kuko, Ermira; Athanasopoulos, Takis; Tedesco, Francesco Saverio; Dickson, George; Sampaolesi, Maurilio; VandenDriessche, Thierry; Chuah, Marinee K.

2016-01-01

Duchenne muscular dystrophy (DMD) is a genetic neuromuscular disorder caused by the absence of dystrophin. We developed a novel gene therapy approach based on the use of the piggyBac (PB) transposon system to deliver the coding DNA sequence (CDS) of either full-length human dystrophin (DYS: 11.1 kb) or truncated microdystrophins (MD1: 3.6 kb; MD2: 4 kb). PB transposons encoding microdystrophins were transfected in C2C12 myoblasts, yielding 65±2% MD1 and 66±2% MD2 expression in differentiated multinucleated myotubes. A hyperactive PB (hyPB) transposase was then deployed to enable transposition of the large-size PB transposon (17 kb) encoding the full-length DYS and green fluorescence protein (GFP). Stable GFP expression attaining 78±3% could be achieved in the C2C12 myoblasts that had undergone transposition. Western blot analysis demonstrated expression of the full-length human DYS protein in myotubes. Subsequently, dystrophic mesoangioblasts from a Golden Retriever muscular dystrophy dog were transfected with the large-size PB transposon resulting in 50±5% GFP-expressing cells after stable transposition. This was consistent with correction of the differentiated dystrophic mesoangioblasts following expression of full-length human DYS. These results pave the way toward a novel non-viral gene therapy approach for DMD using PB transposons underscoring their potential to deliver large therapeutic genes. PMID:26682797
Comparative analysis of the complete sequence of the plastid genome of Parthenium argentatum and identification of DNA barcodes to differentiate Parthenium species and lines

PubMed Central

2009-01-01

Background: Parthenium argentatum (guayule) is an industrial crop that produces latex, which was recently commercialized as a source of latex rubber safe for people with Type I latex allergy. The complete plastid genome of P. argentatum was sequenced. The sequence provides important information useful for genetic engineering strategies. Comparison to the sequences of plastid genomes from three other members of the Asteraceae, Lactuca sativa, Guizotia abyssinica and Helianthus annuus revealed details of the evolution of the four genomes. Chloroplast-specific DNA barcodes were developed for identification of Parthenium species and lines. Results: The complete plastid genome of P. argentatum is 152,803 bp. Based on the overall comparison of individual protein coding genes with those in L. sativa, G. abyssinica and H. annuus, we demonstrate that the P. argentatum chloroplast genome sequence is most closely related to that of H. annuus. Similar to chloroplast genomes in G. abyssinica, L. sativa and H. annuus, the plastid genome of P. argentatum has a large 23 kb inversion with a smaller 3.4 kb inversion, within the large inversion. Using the matK and psbA-trnH spacer chloroplast DNA barcodes, three of the four Parthenium species tested, P. tomentosum, P. hysterophorus and P. schottii, can be differentiated from P. argentatum. In addition, we identified lines within P. argentatum. Conclusion: The genome sequence of the P. argentatum chloroplast will enrich the sequence resources of plastid genomes in commercial crops. The availability of the complete plastid genome sequence may facilitate transformation efficiency by using the precise sequence of endogenous flanking sequences and regulatory elements in chloroplast transformation vectors. The DNA barcoding study forms the foundation for genetic identification of commercially significant lines of P. argentatum that are important for producing latex. PMID:19917140
Multiple determinants controlling activation of yeast replication origins late in S phase.

PubMed

Friedman, K L; Diller, J D; Ferguson, B M; Nyland, S V; Brewer, B J; Fangman, W L

1996-07-01

Analysis of a 131-kb segment of the left arm of yeast chromosome XIV beginning 157 kb from the telomere reveals four highly active origins of replication that initiate replication late in S phase. Previous work has shown that telomeres act as determinants for late origin activation. However, at least two of the chromosome XIV origins maintain their late activation time when located on large circular plasmids, indicating that late replication is independent of telomeres. Analysis of the replication time of plasmid derivatives containing varying amounts of chromosome XIV DNA shows that a minimum of three chromosomal elements, distinct from each tested origin, contribute to late activation time. These late determinants are functionally equivalent, because duplication of one set of contributing sequences can compensate for the removal of another set. Furthermore, insertion of an origin that is normally early activated into this domain results in a shift to late activation, suggesting that the chromosome XIV origins are not unique in their ability to respond to the late determinants.
Draft genome sequence of ramie, Boehmeria nivea (L.) Gaudich.

PubMed

Luan, Ming-Bao; Jian, Jian-Bo; Chen, Ping; Chen, Jun-Hui; Chen, Jian-Hua; Gao, Qiang; Gao, Gang; Zhou, Ju-Hong; Chen, Kun-Mei; Guang, Xuan-Min; Chen, Ji-Kang; Zhang, Qian-Qian; Wang, Xiao-Fei; Fang, Long; Sun, Zhi-Min; Bai, Ming-Zhou; Fang, Xiao-Dong; Zhao, Shan-Cen; Xiong, He-Ping; Yu, Chun-Ming; Zhu, Ai-Guo

2018-05-01

Ramie, Boehmeria nivea (L.) Gaudich, family Urticaceae, is a plant native to eastern Asia, and one of the world's oldest fibre crops. It is also used as animal feed and for the phytoremediation of heavy metal-contaminated farmlands. Thus, the genome sequence of ramie was determined to explore the molecular basis of its fibre quality, protein content and phytoremediation. To further characterize the ramie genome, different paired-end and mate-pair libraries were combined to generate 134.31 Gb of raw DNA sequences using the Illumina whole-genome shotgun sequencing approach. The highly heterozygous B. nivea genome was assembled using the Platanus Genome Assembler, which is an effective tool for the assembly of highly heterozygous genome sequences. The final length of the draft genome of this species was approximately 341.9 Mb (contig N50 = 22.62 kb, scaffold N50 = 1,126.36 kb). Based on ramie genome annotations, 30,237 protein-coding genes were predicted, and the repetitive element content was 46.3%. The completeness of the final assembly was evaluated by benchmarking universal single-copy orthologous genes (BUSCO); 90.5% of the 1,440 expected embryophytic genes were identified as complete, and 4.9% were identified as fragmented. Phylogenetic analysis based on single-copy gene families and one-to-one orthologous genes placed ramie with mulberry and cannabis, within the clade of urticalean rosids. Genome information of ramie will be a valuable resource for the conservation of endangered Boehmeria species and for future studies on the biogeography and characteristic evolution of members of Urticaceae. © 2018 John Wiley & Sons Ltd.
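The contig and scaffold N50 values quoted above are summary statistics of an assembly's length distribution. As a minimal sketch, N50 can be computed as the length of the sequence at which the running total of lengths, sorted from longest to shortest, first reaches half of the assembly size. The function name and the toy lengths below are illustrative assumptions and are not taken from the ramie assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length L such that sequences of length >= L together
    account for at least half of the total assembled length.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("no lengths given")


# Hypothetical contig lengths in kb, for illustration only.
toy_contigs_kb = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs_kb))
```

A larger N50 indicates that more of the assembly sits in long sequences, which is why the scaffold N50 reported above is far larger than the contig N50.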
Population structure and its effect on haplotype diversity and linkage disequilibrium surrounding the xa5 locus of rice (Oryza sativa L.).

PubMed Central

Garris, Amanda J; McCouch, Susan R; Kresovich, Stephen

2003-01-01

To assess the usefulness of linkage disequilibrium mapping in an autogamous, domesticated species, we have characterized linkage disequilibrium in the candidate region for xa5, a recessive gene conferring race-specific resistance to bacterial blight in rice. This trait and locus have good mapping information, a tractable phenotype, and available sequence data, but no cloned gene. We sampled 13 short segments from the 70-kb candidate region in 114 accessions of Oryza sativa. Five additional segments were sequenced from the adjacent 45-kb region in resistant accessions to estimate the distance at which linkage disequilibrium decays. The data show significant linkage disequilibrium between sites 100 kb apart. The presence of the xa5 resistant reaction in two ecotypes and in accessions with different haplotypes in the candidate region may indicate multiple origins or genetic heterogeneity for resistance. In addition, genetic differentiation between ecotypes emphasizes the need for controlling for population structure in the design of linkage disequilibrium studies in rice. PMID:14573486
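The abstract above does not say which linkage disequilibrium statistic was used. As a hedged illustration only, the sketch below computes the commonly used r² measure between two biallelic sites from phased haplotypes. The function name and the 0/1 allele coding are assumptions made for this example.

```python
def ld_r2(haplotypes):
    """Compute r^2 between two biallelic sites.

    haplotypes: list of (allele_at_site_1, allele_at_site_2) tuples,
    with alleles coded as 0 or 1 (an illustrative convention).
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n            # frequency of allele 1 at site 1
    p_b = sum(b for _, b in haplotypes) / n            # frequency of allele 1 at site 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                               # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    return d * d / denom


# Two sites in complete association versus two independent sites.
print(ld_r2([(1, 1), (1, 1), (0, 0), (0, 0)]))
print(ld_r2([(1, 1), (1, 0), (0, 1), (0, 0)]))
```

Estimating how far disequilibrium extends, as in the study, would involve computing such a statistic for pairs of sites at increasing physical distance.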
Construction of the BAC Library of Small Abalone (Haliotis diversicolor) for Gene Screening and Genome Characterization.

PubMed

Jiang, Likun; You, Weiwei; Zhang, Xiaojun; Xu, Jian; Jiang, Yanliang; Wang, Kai; Zhao, Zixia; Chen, Baohua; Zhao, Yunfeng; Mahboob, Shahid; Al-Ghanim, Khalid A; Ke, Caihuan; Xu, Peng

2016-02-01

The small abalone (Haliotis diversicolor) is one of the most important aquaculture species in East Asia. To facilitate gene cloning and characterization, genome analysis, and genetic breeding of this species, we constructed a large-insert bacterial artificial chromosome (BAC) library, which is an important genetic tool for advanced genetics and genomics research. The small abalone BAC library includes 92,610 clones with an average insert size of 120 Kb, equivalent to approximately 7.6× of the small abalone genome. We set up three-dimensional pools and super pools of 18,432 BAC clones for target gene screening using a PCR method. To assess the approach, we screened 12 target genes in these 18,432 BAC clones and identified 16 positive BAC clones. Eight positive BAC clones were then sequenced and assembled with the next generation sequencing platform. The assembled contigs representing these 8 BAC clones spanned 928 Kb of the small abalone genome, providing the first batch of genome sequences for genome evaluation and characterization. The average GC content of the small abalone genome was estimated as 40.33%. A total of 21 protein-coding genes, including 7 target genes, were annotated in the 8 BACs, which proved the feasibility of the PCR screening approach with three-dimensional pools in the small abalone BAC library. One hundred fifty microsatellite loci were also identified from the sequences for marker development in the future. The BAC library and clone pools provided valuable resources and tools for genetic breeding and conservation of H. diversicolor.
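The library coverage quoted above follows from simple arithmetic: total cloned DNA (number of clones times mean insert size) divided by genome size. The abstract does not state the genome size, so the sketch below works backwards from the reported 7.6× coverage to the genome size it implies. The variable names are illustrative, and the implied genome size is a derived estimate rather than a reported value.

```python
clones = 92_610          # clones in the library (from the abstract)
mean_insert_kb = 120     # average insert size in kb (from the abstract)
coverage = 7.6           # reported genome equivalents

total_insert_kb = clones * mean_insert_kb
implied_genome_mb = total_insert_kb / coverage / 1000

# Mean span of the eight sequenced BACs, which covered 928 kb in total.
mean_sequenced_bac_kb = 928 / 8

print(f"total cloned DNA: {total_insert_kb / 1e6:.2f} Gb")
print(f"implied genome size: {implied_genome_mb:.0f} Mb")
print(f"mean span per sequenced BAC: {mean_sequenced_bac_kb:.0f} kb")
```

The mean span of the sequenced clones is close to the stated 120 kb average insert size, so the two figures in the abstract are mutually consistent.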
Comparative Sequence Analysis of the X-Inactivation Center Region in Mouse, Human, and Bovine

PubMed Central

Chureau, Corinne; Prissette, Marine; Bourdet, Agnès; Barbe, Valérie; Cattolico, Laurence; Jones, Louis; Eggen, André; Avner, Philip; Duret, Laurent

2002-01-01

We have sequenced to high levels of accuracy 714-kb and 233-kb regions of the mouse and bovine X-inactivation centers (Xic), respectively, centered on the Xist gene. This has provided the basis for a fully annotated comparative analysis of the mouse Xic with the 2.3-Mb orthologous region in human and has allowed a three-way species comparison of the core central region, including the Xist gene. These comparisons have revealed conserved genes, both coding and noncoding, conserved CpG islands and, more surprisingly, conserved pseudogenes. The distribution of repeated elements, especially LINE repeats, in the mouse Xic region when compared to the rest of the genome does not support the hypothesis of a role for these repeat elements in the spreading of X inactivation. Interestingly, an asymmetric distribution of LINE elements on the two DNA strands was observed in the three species, not only within introns but also in intergenic regions. This feature is suggestive of important transcriptional activity within these intergenic regions. In silico prediction followed by experimental analysis has allowed four new genes, Cnbp2, Ftx, Jpx, and Ppnx, to be identified and novel, widespread, complex, and apparently noncoding transcriptional activity to be characterized in a region 5′ of Xist that was recently shown to attract histone modification early after the onset of X inactivation. [The sequence data described in this paper have been submitted to the EMBL data library under accession nos. AJ421478, AJ421479, AJ421480, and AJ421481. Online supplemental data are available at http://pbil.univ-lyon1.fr/datasets/Xic2002/data.html and www.genome.org.] PMID:12045143
Cloning and sequencing of a gene encoding a novel extracellular neutral proteinase from Streptomyces sp. strain C5 and expression of the gene in Streptomyces lividans 1326.

PubMed Central

Lampel, J S; Aphale, J S; Lampel, K A; Strohl, W R

1992-01-01

The gene encoding a novel milk protein-hydrolyzing proteinase was cloned on a 6.56-kb SstI fragment from Streptomyces sp. strain C5 genomic DNA into Streptomyces lividans 1326 by using the plasmid vector pIJ702. The gene encoding the small neutral proteinase (snpA) was located within a 2.6-kb BamHI-SstI restriction fragment that was partially sequenced. The molecular mass of the deduced amino acid sequence of the mature protein was determined to be 15,740, which corresponds very closely with the relative molecular mass of the purified protein (15,500) determined by sodium dodecyl sulfate-polyacrylamide gel electrophoresis. The N-terminal amino acid sequence of the purified neutral proteinase was determined, and the DNA encoding this sequence was found to be located within the sequenced DNA. The deduced amino acid sequence contains a conserved zinc binding site, although secondary ligand binding and active sites typical of thermolysinlike metalloproteinases are absent. The combination of its small size, deduced amino acid sequence, and substrate and inhibition profile indicate that snpA encodes a novel neutral proteinase. PMID:1569011
Studies on the expression of an H-2K/human growth hormone fusion gene in giant transgenic mice.

PubMed Central

Morello, D; Moore, G; Salmon, A M; Yaniv, M; Babinet, C

1986-01-01

Transgenic mice carrying the H-2K/human growth hormone (hGH) fusion gene were produced by microinjecting into the pronucleus of fertilized eggs DNA molecules containing 2 kb of the 5' flanking sequences (including promoter) of the class I H-2Kb gene joined to the coding sequences of the hGH gene. Thirteen transgenic mice were obtained which all contained detectable levels of hGH hormone in their blood. Nine grew larger than their control litter-mates. Endogenous H-2Kb and exogenous hGH mRNA levels were analysed by S1 nuclease digestion experiments. hGH transcripts were found in all the tissues examined and the pattern of expression paralleled that of endogenous H-2K gene expression, being high in liver and lymphoid organs and low in muscle and brain. Thus 2 kb of the 5' promoter/regulatory region of the H-2K gene are sufficient to ensure regulated expression of hGH in transgenic mice. This promoter may therefore be of use to target the expression of different exogenous genes in most tissues of transgenic mice and to study the biological role of the corresponding proteins in different cellular environments. PMID:3019667
Complete Chloroplast Genome Sequences of Mongolia Medicine Artemisia frigida and Phylogenetic Relationships with Other Plants

PubMed Central

Liu, Yue; Huo, Naxin; Dong, Lingli; Wang, Yi; Zhang, Shuixian; Young, Hugh A.; Feng, Xiaoxiao; Gu, Yong Qiang

2013-01-01

Background Artemisia frigida Willd. is an important Mongolian traditional medicinal plant with hemostatic and swelling-reducing pharmacological properties. However, there is little sequence and genomic information available for Artemisia frigida, which makes phylogenetic identification, evolutionary studies, and genetic improvement of its value very difficult. We report the complete chloroplast genome sequence of Artemisia frigida based on 454 pyrosequencing. Methodology/Principal Findings The complete chloroplast genome of Artemisia frigida is 151,076 bp including a large single copy (LSC) region of 82,740 bp, a small single copy (SSC) region of 18,394 bp and a pair of inverted repeats (IRs) of 24,971 bp. The genome contains 114 unique genes and 18 duplicated genes. The chloroplast genome of Artemisia frigida contains a small 3.4 kb inversion within a large 23 kb inversion in the LSC region, a unique feature in Asteraceae. The gene order in the SSC region of Artemisia frigida is inverted compared with the other 6 Asteraceae species with the chloroplast genomes sequenced. This inversion is likely caused by an intramolecular recombination event that occurred only in Artemisia frigida. The existence of rich SSR loci in the Artemisia frigida chloroplast genome provides a rare opportunity to study population genetics of this Mongolian medicinal plant. Phylogenetic analysis demonstrates a sister relationship between Artemisia frigida and four other species in Asteraceae, including Ageratina adenophora, Helianthus annuus, Guizotia abyssinica and Lactuca sativa, based on 61 protein-coding sequences. Furthermore, Artemisia frigida was placed in the tribe Anthemideae in the subfamily Asteroideae (Asteraceae) based on ndhF and trnL-F sequence comparisons. Conclusion The chloroplast genome sequence of Artemisia frigida was assembled and analyzed in this study, representing the first plastid genome sequenced in the Anthemideae tribe. This complete chloroplast genome sequence will be useful for molecular ecology and molecular phylogeny studies within Artemisia species and also within the Asteraceae family. PMID:23460871
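The region sizes reported for the Artemisia frigida chloroplast genome can be cross-checked against the total length. A chloroplast genome with the usual quadripartite layout consists of one LSC, one SSC and two copies of the inverted repeat, so the sketch below sums those parts. Reading "a pair of inverted repeats (IRs) of 24,971 bp" as 24,971 bp per copy is an interpretation, which the arithmetic supports.

```python
lsc_bp = 82_740   # large single copy region
ssc_bp = 18_394   # small single copy region
ir_bp = 24_971    # one inverted repeat copy (assumed to be the per-copy length)

total_bp = lsc_bp + ssc_bp + 2 * ir_bp
ir_fraction = 2 * ir_bp / total_bp

print(total_bp)                       # compare with the reported 151,076 bp
print(f"{ir_fraction:.1%} of the genome lies in the two IR copies")
```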
Characterization of a Major Cluster of nif, fix, and Associated Genes in a Sugarcane Endophyte, Acetobacter diazotrophicus

PubMed Central

Lee, Sunhee; Reth, Alexander; Meletzus, Dietmar; Sevilla, Myrna; Kennedy, Christina

2000-01-01

A major 30.5-kb cluster of nif and associated genes of Acetobacter diazotrophicus (syn. Gluconacetobacter diazotrophicus), a nitrogen-fixing endophyte of sugarcane, was sequenced and analyzed. This cluster represents the largest assembly of contiguous nif-fix and associated genes so far characterized in any diazotrophic bacterial species. Northern blots and promoter sequence analysis indicated that the genes are organized into eight transcriptional units. The overall arrangement of genes is most like that of the nif-fix cluster in Azospirillum brasilense, while the individual gene products are more similar to those in species of Rhizobiaceae or in Rhodobacter capsulatus. PMID:11092875
Analysis of the DNA sequence of a 15,500 bp fragment near the left telomere of chromosome XV from Saccharomyces cerevisiae reveals a putative sugar transporter, a carboxypeptidase homologue and two new open reading frames.

PubMed

Gamo, F J; Lafuente, M J; Casamayor, A; Ariño, J; Aldea, M; Casas, C; Herrero, E; Gancedo, C

1996-06-15

We report the sequence of a 15.5 kb DNA segment located near the left telomere of chromosome XV of Saccharomyces cerevisiae. The sequence contains nine open reading frames (ORFs) longer than 300 bp. Three of them are internal to other ones. One corresponds to the gene LGT3 that encodes a putative sugar transporter. Three adjacent ORFs were separated by two stop codons in frame. These ORFs presented homology with the gene CPS1 that encodes carboxypeptidase S. The stop codons were not found in the same sequence derived from another yeast strain. Two other ORFs without significant homology in databases were also found. One of them, O0420, is very rich in serine and threonine and presents a series of repeated or similar amino acid stretches along the sequence.
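The abstract above counts open reading frames longer than 300 bp but does not describe how they were called. The following is a minimal sketch of one common approach, scanning all six reading frames for ATG-to-stop stretches of at least a minimum length. The function names, the ATG-start convention and the coordinate convention are assumptions made for illustration.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_len=300):
    """Yield (strand, frame, start, end) for ATG..stop ORFs of >= min_len bp.

    Coordinates are 0-based and half-open on the scanned strand; the
    stop codon is included in the reported length.
    """
    seq = seq.upper()
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i                      # first ATG opens the ORF
                elif start is not None and codon in STOP_CODONS:
                    end = i + 3
                    if end - start >= min_len:
                        yield strand, frame, start, end
                    start = None                   # look for the next ORF


# Toy sequence with a single 18-bp ORF, scanned with a lowered threshold.
toy = "CC" + "ATG" + "GCT" * 4 + "TAA" + "G"
print(list(find_orfs(toy, min_len=18)))
```

Because each frame is scanned independently, an ORF lying inside a longer ORF in a different frame is reported as well, which is one way nested ORFs such as those mentioned in the abstract can arise.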
Whole genome sequence and comparative analysis of Borrelia burgdorferi MM1

PubMed Central

Jabbari, Neda; Reddy, Panga Jaipal; Hood, Leroy

2018-01-01

Lyme disease is caused by spirochaetes of the Borrelia burgdorferi sensu lato genospecies. Complete genome assemblies are available for fewer than ten strains of Borrelia burgdorferi sensu stricto, the primary cause of Lyme disease in North America. MM1 is a sensu stricto strain originally isolated in the midwestern United States. Aside from a small number of genes, the complete genome sequence of this strain has not been reported. Here we present the complete genome sequence of MM1 in relation to other sensu stricto strains and in terms of its Multi Locus Sequence Typing. Our results indicate that MM1 is a new sequence type which contains a conserved main chromosome and 15 plasmids. Our results include the first contiguous 28.5 kb assembly of lp28-8, a linear plasmid carrying the vls antigenic variation system, from a Borrelia burgdorferi sensu stricto strain. PMID:29889842
Cloning of a cDNA encoding bovine mitochondrial NADP(+)-specific isocitrate dehydrogenase and structural comparison with its isoenzymes from different species.

PubMed Central

Huh, T L; Ryu, J H; Huh, J W; Sung, H C; Oh, I U; Song, B J; Veech, R L

1993-01-01

Mitochondrial NADP(+)-specific isocitrate dehydrogenase (IDP) was co-purified with the pyruvate dehydrogenase complex from bovine kidney mitochondria. The determination of its N-terminal 16-amino-acid sequence revealed that it is highly similar to the IDP from yeast. A cDNA clone (1.8 kb long) encoding this protein was isolated from a bovine kidney lambda gt11 cDNA library using a synthetic oligodeoxynucleotide. The deduced protein sequence of this cDNA clone rendered a precursor protein of 452 amino-acid residues (50,830 Da) and a mature protein of 413 amino-acid residues (46,519 Da). It is 100% identical to the internal tryptic peptide sequences of the autologous form from pig heart and 62% similar to that from yeast. However, it shares little similarity with the mitochondrial NAD(+)-specific isoenzyme from yeast. Structural analyses of the deduced proteins of IDP isoenzymes from different species indicated that similarity exists in certain regions, which may represent the common domains for the active sites or coenzyme-binding sites. In Northern-blot analysis, one species of mRNA (about 2.2 kb for both bovine and human) was hybridized with a 32P-labelled cDNA probe. Southern-blot analysis of genomic DNAs verified simple patterns of hybridization with this cDNA. These results strongly indicate that the mitochondrial IDP may be derived from a single gene family which does not appear to be closely related to that of the NAD(+)-specific isoenzyme. PMID:8318002
Identification of a spliced gene from duck enteritis virus encoding a protein homologous to UL15 of herpes simplex virus 1.

PubMed

Zhu, Hongwei; Li, Huixin; Han, Zongxi; Shao, Yuhao; Wang, Yu; Kong, Xiangang

2011-04-06

In herpesviruses, UL15 homologue is a subunit of terminase complex responsible for cleavage and packaging of the viral genome into pre-assembled capsids. However, for duck enteritis virus (DEV), the causative agent of duck viral enteritis (DVE), the genomic sequence was not completely determined until most recently. There is limited information on this putative spliced gene and its encoded protein. DEV UL15 consists of two exons with a 3.5-kilobase (kb) intron and transcribes into two transcripts: the full-length UL15 and an N-terminally truncated UL15.5. The 2.9 kb UL15 transcript encodes a protein of 739 amino acids with an approximate molecular mass of 82 kiloDaltons (kDa), whereas the UL15.5 transcript is 1.3 kb in length, containing a putative 888 base pairs (bp) ORF that encodes a 32 kDa product. We also demonstrated that UL15 gene belonged to the late kinetic class as its expression was sensitive to cycloheximide and phosphonoacetic acid. UL15 is highly conserved within the Herpesviridae, and contains Walker A and B motifs homologous to the catalytic subunit of the bacteriophage terminase as revealed by sequence analysis. Phylogenetic tree constructed with the amino acid sequences of 23 herpesvirus UL15 homologues suggests a close relationship of DEV to the Mardivirus genus within the Alphaherpesvirinae. Further, the UL15 and UL15.5 proteins can be detected in the infected cell lysate but not in the sucrose density gradient-purified virion when reacting with the antiserum against UL15. Within the CEF cells, the UL15 and/or UL15.5 localize(s) in the cytoplasm at 6 h post infection (h p. i.) and mainly in the nucleus at 12 h p. i. and at 24 h p. i., while accumulate(s) in the cytoplasm in the absence of any other viral protein. DEV UL15 is a spliced gene that encodes two products encoded by 2.9 and 1.3 kb transcripts respectively. The UL15 is expressed late during infection. The coding sequences of DEV UL15 are very similar to those of alphaherpesviruses and most similar to the genus Mardivirus. The UL15 and/or UL15.5 accumulate(s) in the cytoplasm during early times post-infection and then are translocated to the nucleus at late times.
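The protein sizes quoted above can be sanity-checked with a back-of-the-envelope estimate. The sketch below uses the common rule of thumb of roughly 110 Da per amino acid residue, which is an approximation and not a value from the abstract. It also assumes the 888 bp ORF length includes the stop codon, which the abstract does not state.

```python
AVG_RESIDUE_DA = 110  # rule-of-thumb average residue mass (an approximation)


def residues_from_orf(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an ORF of the given length in bp."""
    if orf_bp % 3:
        raise ValueError("ORF length must be a multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


def approx_mass_kda(n_residues):
    """Rough protein mass in kDa from the residue count."""
    return n_residues * AVG_RESIDUE_DA / 1000


ul15_kda = approx_mass_kda(739)                       # full-length UL15
ul15_5_kda = approx_mass_kda(residues_from_orf(888))  # truncated UL15.5

print(f"UL15: ~{ul15_kda:.1f} kDa (reported: about 82 kDa)")
print(f"UL15.5: ~{ul15_5_kda:.1f} kDa (reported: 32 kDa)")
```

Both estimates fall within about 1 kDa of the reported masses, which is as close as this kind of approximation can be expected to get.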

Molecular Structure and Transformation of the Glucose Dehydrogenase Gene in Drosophila Melanogaster

PubMed Central

Whetten, R.; Organ, E.; Krasney, P.; Cox-Foster, D.; Cavener, D.

1988-01-01

We have precisely mapped and sequenced the three 5' exons of the Drosophila melanogaster Gld gene and have identified the start sites for transcription and translation. The first exon is composed of 335 nucleotides and does not contain any putative translation start codons. The second exon is separated from the first exon by 8 kb and contains the Gld translation start codon. The inferred amino acid sequence of the amino terminus contains two unusual features: three tandem repeats of serine-alanine, and a relatively high density of cysteine residues. P element-mediated transformation experiments demonstrated that a 17.5-kb genomic fragment contains the functional and regulatory components of the Gld gene. PMID:3143620
Molecular definition of 22q11 deletions in 151 velo-cardio-facial syndrome patients.

PubMed Central

Carlson, C; Sirotkin, H; Pandita, R; Goldberg, R; McKie, J; Wadey, R; Patanjali, S R; Weissman, S M; Anyane-Yeboa, K; Warburton, D; Scambler, P; Shprintzen, R; Kucherlapati, R; Morrow, B E

1997-01-01

Velo-cardio-facial syndrome (VCFS) is a relatively common developmental disorder characterized by craniofacial anomalies and conotruncal heart defects. Many VCFS patients have hemizygous deletions for a part of 22q11, suggesting that haploinsufficiency in this region is responsible for its etiology. Because most cases of VCFS are sporadic, portions of 22q11 may be prone to rearrangement. To understand the molecular basis for chromosomal deletions, we defined the extent of the deletion, by genotyping 151 VCFS patients and performing haplotype analysis on 105, using 15 consecutive polymorphic markers in 22q11. We found that 83% had a deletion and >90% of these had a similar approximately 3 Mb deletion, suggesting that sequences flanking the common breakpoints are susceptible to rearrangement. We found no correlation between the presence or size of the deletion and the phenotype. To further define the chromosomal breakpoints among the VCFS patients, we developed somatic hybrid cell lines from a set of VCFS patients. An 11-kb resolution physical map of a 1,080-kb region that includes deletion breakpoints was constructed, incorporating genes and expressed sequence tags (ESTs) isolated by the hybridization selection method. The ordered markers were used to examine the two separated copies of chromosome 22 in the somatic hybrid cell lines. In some cases, we were able to map the chromosome breakpoints within a single cosmid. A 480-kb critical region for VCFS has been delineated, including the genes for GSCL, CTP, CLTD, HIRA, and TMVCF, as well as a number of novel ordered ESTs. PMID:9326327
The PL6-Family Plasmids of Haloquadratum Are Virus-Related.

PubMed

Dyall-Smith, Mike; Pfeiffer, Friedhelm

2018-01-01

Plasmids PL6A and PL6B are both carried by the C23T strain of the square archaeon Haloquadratum walsbyi, and are closely related (76% nucleotide identity), circular, about 6 kb in size, and display the same gene synteny. They are unrelated to other known plasmids and all of the predicted proteins are cryptic in function. Here we describe two additional PL6-related plasmids, pBAJ9-6 and pLT53-7, each carried by distinct isolates of Haloquadratum walsbyi that were recovered from hypersaline waters in Australia. A third PL6-like plasmid, pLTMV-6, was assembled from metavirome data from Lake Tyrell, a salt-lake in Victoria, Australia. Comparison of all five plasmids revealed a distinct plasmid family with strong conservation of gene content and synteny, an average size of 6.2 kb (range 5.8-7.0 kb) and pairwise similarities between 61-79%. One protein (F3) was closely similar to a protein carried by betapleolipoviruses while another (R6) was similar to a predicted AAA-ATPase of His 1 halovirus (His1V_gp16). Plasmid pLT53-7 carried a gene for a FkbM family methyltransferase that was not present in any of the other plasmids. Comparative analysis of all PL6-like plasmids provided better resolution of conserved sequences and coding regions, confirmed the strong link to haloviruses, and showed that their sequences are highly conserved among examples from Haloquadratum isolates and metagenomic data that collectively cover geographically distant locations, indicating that these genetic elements are widespread.
Determination of the melon chloroplast and mitochondrial genome sequences reveals that the largest reported mitochondrial genome in plants contains a significant amount of DNA having a nuclear origin

PubMed Central

2011-01-01

Background The melon belongs to the Cucurbitaceae family, whose economic importance among vegetable crops is second only to Solanaceae. The melon has a small genome size (454 Mb), which makes it suitable for molecular and genetic studies. Despite similar nuclear and chloroplast genome sizes, cucurbits show great variation when their mitochondrial genomes are compared. The melon possesses the largest plant mitochondrial genome, as much as eight times larger than that of other cucurbits. Results The nucleotide sequences of the melon chloroplast and mitochondrial genomes were determined. The chloroplast genome (156,017 bp) included 132 genes, with 98 single-copy genes dispersed between the small (SSC) and large (LSC) single-copy regions and 17 duplicated genes in the inverted repeat regions (IRa and IRb). A comparison of the cucumber and melon chloroplast genomes showed differences in only approximately 5% of nucleotides, mainly due to short indels and SNPs. Additionally, 2.74 Mb of mitochondrial sequence, accounting for 95% of the estimated mitochondrial genome size, were assembled into five scaffolds and four additional unscaffolded contigs. A total of 84% of the mitochondrial genome is contained in a single scaffold. The gene-coding region accounted for 1.7% (45,926 bp) of the total sequence, including 51 protein-coding genes, 4 conserved ORFs, 3 rRNA genes and 24 tRNA genes. Despite the differences observed in the mitochondrial genome sizes of cucurbit species, Citrullus lanatus (379 kb), Cucurbita pepo (983 kb) and Cucumis melo (2,740 kb) share 120 kb of sequence, including the predicted protein-coding regions. Nevertheless, melon contained a high number of repetitive sequences and a high content of DNA of nuclear origin, which represented 42% and 47% of the total sequence, respectively. Conclusions Whereas the size and gene organisation of chloroplast genomes are similar among the cucurbit species, mitochondrial genomes show a wide variety of sizes, with a non-conserved structure both in gene number and organisation, as well as in the features of the noncoding DNA. The transfer of nuclear DNA to the melon mitochondrial genome and the high proportion of repetitive DNA appear to explain the size of the largest mitochondrial genome reported so far. PMID:21854637
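Several of the proportions in the melon abstract can be recomputed from the figures it gives. The short sketch below does so. The variable names are illustrative, and the estimated full genome size is derived here from the stated 95% figure rather than reported directly.

```python
assembled_bp = 2_740_000   # 2.74 Mb of assembled mitochondrial sequence
coding_bp = 45_926         # gene-coding region

coding_pct = 100 * coding_bp / assembled_bp

# Mitochondrial genome sizes in kb, as listed in the abstract.
sizes_kb = {
    "Citrullus lanatus": 379,
    "Cucurbita pepo": 983,
    "Cucumis melo": 2740,
}
melon_kb = sizes_kb["Cucumis melo"]
fold_vs_melon = {
    species: melon_kb / kb for species, kb in sizes_kb.items() if species != "Cucumis melo"
}

# The assembly is stated to cover 95% of the estimated genome size.
estimated_total_mb = 2.74 / 0.95

print(f"coding fraction: {coding_pct:.2f}%")
for species, fold in fold_vs_melon.items():
    print(f"melon is {fold:.1f}x the size of {species}")
print(f"estimated full mitochondrial genome: {estimated_total_mb:.2f} Mb")
```

The recomputed coding fraction rounds to the reported 1.7%. The fold differences here cover only the two cucurbits named in the abstract.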
Molecular characterization of banana bunchy top virus isolate from Sri Lanka and its genetic relationship with other isolates.

PubMed

Wickramaarachchi, W A R T; Shankarappa, K S; Rangaswamy, K T; Maruthi, M N; Rajapakse, R G A S; Ghosh, Saptarshi

2016-06-01

Bunchy top disease of banana, caused by Banana bunchy top virus (BBTV, genus Babuvirus, family Nanoviridae), is one of the most important constraints on banana production in different parts of the world. Six genomic DNA components of a BBTV isolate from Kandy, Sri Lanka (BBTV-K) were amplified by polymerase chain reaction (PCR) with specific primers using total DNA extracted from banana tissues showing typical symptoms of bunchy top disease. The amplicons were of the expected size of 1.0-1.1 kb and were cloned and sequenced. Analysis of sequence data revealed the presence of six DNA components, DNA-R, DNA-U3, DNA-S, DNA-N, DNA-M and DNA-C, for the Sri Lanka isolate. Comparisons of sequence data of DNA components, followed by phylogenetic analysis, grouped the Sri Lanka (Kandy) isolate in the Pacific Indian Oceans (PIO) group. The Sri Lanka (Kandy) isolate of BBTV is classified as a new member of the PIO group based on analysis of six components of the virus.
Molecular characterization of the equine testis-specific protein 1 (TPX1) and acidic epididymal glycoprotein 2 (AEG2) genes encoding members of the cysteine-rich secretory protein (CRISP) family.

PubMed

Giese, Alexander; Jude, Rony; Kuiper, Heidi; Raudsepp, Terje; Piumi, Francois; Schambony, Alexandra; Guérin, Gérard; Chowdhary, Bhanu P; Distl, Ottmar; Töpfer-Petersen, Edda; Leeb, Tosso

2002-10-16

The cysteine-rich secretory protein (CRISP) family consists of three members called acidic epididymal glycoprotein 1 (AEG1), AEG2, and testis-specific protein 1 (TPX1), which share 16 conserved cysteine residues at their C-termini. The CRISP proteins are primarily expressed in different sections of the male genital tract and are thought to mediate cell-cell interactions of male germ cells with other cells during sperm maturation or during fertilization. Therefore, their genes are of interest as candidate genes for inherited male fertility dysfunctions and as putative quantitative trait loci for male fertility traits. In this report, the cloning and DNA sequence of 137 kb of horse genomic DNA from equine chromosome 20q22 containing the closely linked equine TPX1 and AEG2 genes are described. The equine TPX1 gene consists of ten exons spanning 18 kb while the AEG2 gene consists of eight exons that are spread over 24 kb. The expression of these two genes was investigated in several tissues by reverse transcription polymerase chain reaction analysis and Western blotting. Comparative genome analysis between horse, human, and mouse indicates that all three CRISP genes are clustered on one chromosomal location, which shows conserved synteny between these species.
Recombining overlapping BACs into a single larger BAC.

PubMed

Kotzamanis, George; Huxley, Clare

2004-01-06

BAC clones containing entire mammalian genes including all the transcribed region and long range controlling elements are very useful for functional analysis. Sequenced BACs are available for most of the human and mouse genomes and in many cases these contain intact genes. However, large genes often span more than one BAC, and single BACs covering the entire region of interest are not available. Here we describe a system for linking two or more overlapping BACs into a single clone by homologous recombination. The method was used to link a 61-kb insert carrying the final 5 exons of the human CFTR gene onto a 160-kb BAC carrying the first 22 exons. Two rounds of homologous recombination were carried out in the EL350 strain of bacteria which can be induced for the Red genes. In the first round, the inserts of the two overlapping BACs were subcloned into modified BAC vectors using homologous recombination. In the second round, the BAC to be added was linearised with the very rare-cutting enzyme I-PpoI and electroporated into recombination efficient EL350 bacteria carrying the other BAC. Recombined BACs were identified by antibiotic selection and PCR screening and 10% of clones contained the correctly recombined 220-kb BAC. The system can be used to link the inserts from any overlapping BAC or PAC clones. The original orientation of the inserts is not important and desired regions of the inserts can be selected. The size limit for the fragments recombined may be larger than the 61 kb used here and multiple BACs in a contig could be combined by alternating use of the two pBACLink vectors. This system should be of use to many investigators wishing to carry out functional analysis on large mammalian genes which are not available in single BAC clones.
Functional organization of a single nif cluster in the mesophilic archaeon Methanosarcina mazei strain Gö1

PubMed Central

Ehlers, Claudia; Veit, Katharina; Gottschalk, Gerhard; Schmitz, Ruth A.

2002-01-01

The mesophilic methanogenic archaeon Methanosarcina mazei strain Gö1 is able to utilize molecular nitrogen (N2) as its sole nitrogen source. We have identified and characterized a single nitrogen fixation (nif) gene cluster in M. mazei Gö1 with an approximate length of 9 kbp. Sequence analysis revealed seven genes with sequence similarities to nifH, nifI1, nifI2, nifD, nifK, nifE and nifN, similar to other diazotrophic methanogens and certain bacteria such as Clostridium acetobutylicum, with the two glnB-like genes (nifI1 and nifI2) located between nifH and nifD. Phylogenetic analysis of deduced amino acid sequences for the nitrogenase structural genes of M. mazei Gö1 showed that they are most closely related to Methanosarcina barkeri nif2 genes, and also closely resemble those for the corresponding nif products of the gram-positive bacterium C. acetobutylicum. Northern blot analysis and reverse transcription PCR analysis demonstrated that the M. mazei nif genes constitute an operon transcribed only under nitrogen starvation as a single 8 kb transcript. Sequence analysis revealed a palindromic sequence at the transcriptional start site in front of the M. mazei nifH gene, which may have a function in transcriptional regulation of the nif operon. PMID:15803652
T-cell receptor Vα and Cα alleles associated with multiple sclerosis and myasthenia gravis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Oksenberg, J.R.; Cavalli-Sforza, L.L.; Steinman, L.

1989-02-01

Polymorphic markers in genes encoding the α chain of the human T-cell receptor (TcR) have been detected by Southern blot analysis in Pss I digests. Polymorphic bands were observed at 6.3 and 2.0 kilobases (kb) with frequencies of 0.30 and 0.44, respectively, in the general population. Using the polymerase chain reaction (PCR) method, the authors amplified selected sequences derived from the full-length TcR α cDNA probe. These PCR products were used as specific probes to demonstrate that the 6.3-kb polymorphic fragment hybridizes to the variable (V)-region probe and the 2.0-kb fragment hybridizes to the constant (C)-region probe. Segregation of the polymorphic bands was analyzed in family studies. To look for associations between these markers and autoimmune diseases, the authors have studied the restriction fragment length polymorphism distribution of the Pss I markers in patients with multiple sclerosis, myasthenia gravis, and Graves disease. Significant differences in the frequency of the polymorphic Vα and Cα markers were identified between patients and healthy individuals.
Molecular cloning and characterization of the spaB gene of Streptococcus sobrinus.

PubMed

Holt, R G; Perry, S E

1990-07-01

A gene of Streptococcus sobrinus 6715 (serotype g) designated spaB and encoding a surface protein antigen was isolated from a cosmid gene bank. A 5.4 kb HindIII/AvaI DNA fragment containing the gene was inserted into plasmid pBR322 to yield plasmid pXI404. Analysis of plasmid-encoded gene products showed that the 5.4 kb fragment of pXI404 encoded a 195 kDa protein. Southern blot experiments revealed that the 5.4 kb chromosomal insert DNA had sequence similarity with genomic DNA of S. sobrinus 6715, S. sobrinus B13 (serotype d) and Streptococcus cricetus HS6 (serotype a). The recombinant SpaB protein (rSpaB) was purified and monospecific antiserum was prepared. With immunological techniques and the anti-rSpaB serum, we have shown: (1) that the rSpaB protein has physico-chemical and antigenic identity with the S. sobrinus SpaB protein, (2) the presence of cross-reactive proteins in the extracellular protein of serotypes a and d of the mutans group of streptococci and (3) that the SpaB protein is expressed on the surface of mutans streptococcal serotypes a, d and g.
Genomic structure and expression of STM2, the chromosome 1 familial Alzheimer disease gene.

PubMed

Levy-Lahad, E; Poorkaj, P; Wang, K; Fu, Y H; Oshima, J; Mulligan, J; Schellenberg, G D

1996-06-01

Mutations in the gene STM2 result in autosomal dominant familial Alzheimer disease. To screen for mutations and to identify regulatory elements for this gene, the genomic DNA sequence and intron-exon structure were determined. Twelve exons including 10 coding exons were identified in a genomic region spanning 23,737 bp. The first 2 exons encode the 5'-untranslated region. Expression analysis of STM2 indicates that two transcripts of 2.4 and 2.8 kb are found in skeletal muscle, pancreas, and heart. In addition, a splice variant of the 2.4-kb transcript was identified that is the result of the use of an alternative splice acceptor site located in exon 10. The use of this site results in a transcript lacking a single glutamate. The promoter for this gene and the alternatively spliced exons leading to the 2.8-kb form of the gene remain to be identified. Expression of STM2 was high in skeletal muscle and pancreas, with comparatively low levels observed in brain. This expression pattern is intriguing since in Alzheimer disease, pathology and degeneration are observed only in the central nervous system.
Genomic structure and expression of STM2, the chromosome 1 familial Alzheimer disease gene

DOE Office of Scientific and Technical Information (OSTI.GOV)

Levy-Lahad, E.; Wang, Kai; Fu, Ying Hui

1996-06-01

Mutations in the gene STM2 result in autosomal dominant familial Alzheimer disease. To screen for mutations and to identify regulatory elements for this gene, the genomic DNA sequence and intron-exon structure were determined. Twelve exons including 10 coding exons were identified in a genomic region spanning 23,737 bp. The first 2 exons encode the 5′-untranslated region. Expression analysis of STM2 indicates that two transcripts of 2.4 and 2.8 kb are found in skeletal muscle, pancreas, and heart. In addition, a splice variant of the 2.4-kb transcript was identified that is the result of the use of an alternative splice acceptor site located in exon 10. The use of this site results in a transcript lacking a single glutamate. The promoter for this gene and the alternatively spliced exons leading to the 2.8-kb form of the gene remain to be identified. Expression of STM2 was high in skeletal muscle and pancreas, with comparatively low levels observed in brain. This expression pattern is intriguing since in Alzheimer disease, pathology and degeneration are observed only in the central nervous system. 19 refs., 2 figs., 3 tabs.
Cloning and Characterization of the Lactococcal Plasmid-Encoded Type II Restriction/Modification System, LlaDII

PubMed Central

Madsen, Annette; Josephsen, Jytte

1998-01-01

The LlaDII restriction/modification (R/M) system was found on the naturally occurring 8.9-kb plasmid pHW393 in Lactococcus lactis subsp. cremoris W39. A 2.4-kb PstI-EcoRI fragment inserted into the Escherichia coli-L. lactis shuttle vector pCI3340 conferred to L. lactis LM2301 and L. lactis SMQ86 resistance against representatives of the three most common lactococcal phage species: 936, P335, and c2. The LlaDII endonuclease was partially purified and found to recognize and cleave the sequence 5′-GC↓NGC-3′, where the arrow indicates the cleavage site. It is thus an isoschizomer of the commercially available restriction endonuclease Fnu4HI. Sequencing of the 2.4-kb PstI-EcoRI fragment revealed two open reading frames arranged tandemly and separated by a 105-bp intergenic region. The endonuclease gene of 543 bp preceded the methylase gene of 954 bp. The deduced amino acid sequence of the LlaDII R/M system showed high homology to that of its only sequenced isoschizomer, Bsp6I from Bacillus sp. strain RFL6, with 41% identity between the endonucleases and 60% identity between the methylases. The genetic organizations of the LlaDII and Bsp6I R/M systems are identical. Both methylases have two recognition sites (5′-GCGGC-3′ and 5′-GCCGC-3′) forming a putative stem-loop structure spanning part of the presumed −35 sequence and part of the intervening region between the −35 and −10 sequences. Alignment of the LlaDII and Bsp6I methylases with other m5C methylases showed that the protein primary structures possessed the same organization. PMID:9647810
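The recognition sequence reported above, 5′-GC↓NGC-3′, lends itself to a small illustration. The following is a minimal Python sketch, not the authors' method: the function names and the example sequence are hypothetical, and it assumes a linear DNA molecule. Because GCNGC is palindromic, scanning one strand is sufficient. A lookahead pattern is used so that overlapping sites are also reported.

```python
import re

# Recognition sequence from the abstract: 5'-GC^NGC-3' (N = any base).
# The lookahead lets overlapping sites such as GCGGCGGC both match.
SITE = re.compile(r"(?=(GC[ACGT]GC))")
CUT_OFFSET = 2  # cleavage falls after the second base of the site


def find_cut_positions(seq):
    """Return 0-based indices of the base immediately 3' of each cut."""
    return [m.start() + CUT_OFFSET for m in SITE.finditer(seq.upper())]


def digest(seq):
    """Split a linear sequence into fragments at every cut position."""
    bounds = [0] + find_cut_positions(seq) + [len(seq)]
    return [seq[a:b] for a, b in zip(bounds, bounds[1:])]


if __name__ == "__main__":
    example = "AAGCGGCTTGCAGCAA"  # hypothetical sequence with two sites
    print(find_cut_positions(example))
    print(digest(example))
```

The same pattern would locate Fnu4HI and Bsp6I sites, since the abstract describes them as isoschizomers of LlaDII.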
Cloning and characterization of largemouth bass (Micropterus salmoides) myostatin encoding gene and its promoter

NASA Astrophysics Data System (ADS)

Li, Shengjie; Bai, Junjie; Wang, Lin

2008-08-01

Myostatin or GDF-8, a member of the transforming growth factor-β (TGF-β) superfamily, has been demonstrated to be a negative regulator of skeletal muscle mass in mammals. In the present study, we obtained a 5.64 kb sequence of the myostatin encoding gene and its promoter from largemouth bass (Micropterus salmoides). The myostatin encoding gene consisted of three exons (488 bp, 371 bp and 1779 bp, respectively) and two introns (390 bp and 855 bp, respectively). The intron-exon boundaries were conserved in comparison with those of mammalian myostatin encoding genes, whereas the introns were smaller than those of mammals. Sequence analysis of 1.569 kb of the largemouth bass myostatin gene promoter region revealed that it contained two TATA boxes, one CAAT box and nine putative E-boxes. Putative muscle growth response elements for myocyte enhancer factor 2 (MEF2), serum response factor (SRF), activator protein 1 (AP1), etc., and muscle-specific Mt binding site (MTBF) were also detected. Some of the transcription factor binding sites were conserved among five teleost species. This information will be useful for studying the transcriptional regulation of myostatin in fish.
Identification of a psoriasis susceptibility candidate gene by linkage disequilibrium mapping with a localized single nucleotide polymorphism map.

PubMed

Hewett, Duncan; Samuelsson, Lena; Polding, Joanne; Enlund, Fredrik; Smart, Devi; Cantone, Kathryn; See, Chee Gee; Chadha, Sapna; Inerot, Annica; Enerback, Charlotta; Montgomery, Doug; Christodolou, Chris; Robinson, Phil; Matthews, Paul; Plumpton, Mary; Wahlstrom, Jan; Swanbeck, Gunnar; Martinsson, Tommy; Roses, Allen; Riley, John; Purvis, Ian

2002-03-01

Psoriasis is a chronic inflammatory disease of the skin with both genetic and environmental risk factors. Here we describe the creation of a single-nucleotide polymorphism (SNP) map spanning 900-1200 kb of chromosome 3q21, which had been previously recognized as containing a psoriasis susceptibility locus, PSORS5. We genotyped 644 individuals, from 195 Swedish psoriatic families, for 19 polymorphisms. Linkage disequilibrium (LD) between marker and disease was assessed using the transmission/disequilibrium test (TDT). In the TDT analysis, alleles of three of these SNPs showed significant association with disease (P<0.05). A 160-kb interval encompassing these three SNPs was sequenced, and a coding sequence consisting of 13 exons was identified. The predicted protein shares 30-40% homology with the family of cation/chloride cotransporters. A five-marker haplotype spanning the 3' half of this gene is associated with psoriasis to a P value of 3.8 × 10(-5). We have called this gene SLC12A8, coding for a member of the solute carrier family 12 proteins. It belongs to a class of genes that were previously unrecognized as playing a role in psoriasis pathogenesis.
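For readers unfamiliar with the transmission/disequilibrium test mentioned above, the following is a minimal Python sketch of the standard biallelic TDT statistic. It is not taken from the study. The function name and the counts are hypothetical, and the abstract does not say which software or TDT variant the authors used. The statistic compares how often heterozygous parents transmit an allele to affected offspring (b) with how often they do not (c), and refers (b - c)^2 / (b + c) to a chi-square distribution with one degree of freedom.

```python
import math


def tdt(transmitted, untransmitted):
    """Classic biallelic TDT (McNemar-type) statistic.

    transmitted / untransmitted: counts of an allele passed on or not
    passed on by heterozygous parents to affected offspring.
    Returns (chi-square statistic, p-value with 1 degree of freedom).
    """
    b, c = transmitted, untransmitted
    if b + c == 0:
        raise ValueError("no informative (heterozygous) transmissions")
    chi2 = (b - c) ** 2 / (b + c)
    # Survival function of a chi-square variable with 1 d.f.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    chi2, p = tdt(60, 40)
    print(f"chi2 = {chi2:.2f}, p = {p:.4f}")
```

Only the standard library is used here; the closed form `erfc(sqrt(x / 2))` is specific to one degree of freedom.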
The genome of the Antarctic-endemic copepod, Tigriopus kingsejongensis.

PubMed

Kang, Seunghyun; Ahn, Do-Hwan; Lee, Jun Hyuck; Lee, Sung Gu; Shin, Seung Chul; Lee, Jungeun; Min, Gi-Sik; Lee, Hyoungseok; Kim, Hyun-Woo; Kim, Sanghee; Park, Hyun

2017-01-01

The Antarctic intertidal zone is continuously subjected to extremely fluctuating biotic and abiotic stressors. The West Antarctic Peninsula is the most rapidly warming region on Earth. Organisms living in Antarctic intertidal pools are therefore interesting for research into evolutionary adaptation to extreme environments and the effects of climate change. We report the whole genome sequence of the Antarctic-endemic harpacticoid copepod Tigriopus kingsejongensis. The 37 Gb raw DNA sequence was generated using the Illumina Miseq platform. Libraries were prepared with 65-fold coverage and a total length of 295 Mb. The final assembly consists of 48 368 contigs with an N50 contig length of 17.5 kb, and 27 823 scaffolds with an N50 contig length of 159.2 kb. A total of 12 772 coding genes were inferred using the MAKER annotation pipeline. Comparative genome analysis revealed that T. kingsejongensis-specific genes are enriched in transport and metabolism processes. Furthermore, rapidly evolving genes related to energy metabolism showed positive selection signatures. The T. kingsejongensis genome provides an interesting example of an evolutionary strategy for Antarctic cold adaptation, and offers new genetic insights into Antarctic intertidal biota. © The Author 2017. Published by Oxford University Press.
The genome of the Antarctic-endemic copepod, Tigriopus kingsejongensis

PubMed Central

Kang, Seunghyun; Ahn, Do-Hwan; Lee, Jun Hyuck; Lee, Sung Gu; Shin, Seung Chul; Lee, Jungeun; Min, Gi-Sik; Lee, Hyoungseok

2017-01-01

Abstract Background: The Antarctic intertidal zone is continuously subjected to extremely fluctuating biotic and abiotic stressors. The West Antarctic Peninsula is the most rapidly warming region on Earth. Organisms living in Antarctic intertidal pools are therefore interesting for research into evolutionary adaptation to extreme environments and the effects of climate change. Findings: We report the whole genome sequence of the Antarctic-endemic harpacticoid copepod Tigriopus kingsejongensis. The 37 Gb raw DNA sequence was generated using the Illumina Miseq platform. Libraries were prepared with 65-fold coverage and a total length of 295 Mb. The final assembly consists of 48 368 contigs with an N50 contig length of 17.5 kb, and 27 823 scaffolds with an N50 contig length of 159.2 kb. A total of 12 772 coding genes were inferred using the MAKER annotation pipeline. Comparative genome analysis revealed that T. kingsejongensis-specific genes are enriched in transport and metabolism processes. Furthermore, rapidly evolving genes related to energy metabolism showed positive selection signatures. Conclusions: The T. kingsejongensis genome provides an interesting example of an evolutionary strategy for Antarctic cold adaptation, and offers new genetic insights into Antarctic intertidal biota. PMID:28369352
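The N50 values quoted in the two records above follow a simple definition: it is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembly length. The following is a minimal Python sketch of that calculation. The function name and the toy lengths are hypothetical, and the abstract does not describe the tool the authors actually used.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    Sequences are visited from longest to shortest; the N50 is the
    length at which the running total first reaches half of the
    summed length of all sequences.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


if __name__ == "__main__":
    # Toy contig lengths in kb, for illustration only.
    print(n50([2, 3, 4, 5, 6]))
```

Contig N50 and scaffold N50 are computed the same way, just on different sets of sequences.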
Molecular and genetic characterization of the rhizopine catabolism (mocABRC) genes of Rhizobium meliloti L5-30.

PubMed

Rossbach, S; Kulpa, D A; Rossbach, U; de Bruijn, F J

1994-10-17

Rhizopine (L-3-O-methyl-scyllo-inosamine, 3-O-MSI) is a symbiosis-specific compound, which is synthesized in nitrogen-fixing nodules of Medicago sativa induced by Rhizobium meliloti strain L5-30. 3-O-MSI is thought to function as an unusual growth substrate for R. meliloti L5-30, which carries a locus (mos) responsible for its synthesis closely linked to a locus (moc) responsible for its degradation. Here, the essential moc genes were delimited by Tn5 mutagenesis and shown to be organized into two regions, separated by 3 kb of DNA. The DNA sequence of a 9-kb fragment spanning the two moc regions was determined, and four genes were identified that play an essential role in rhizopine catabolism (mocABC and mocR). The analysis of the DNA sequence and the amino acid sequence of the deduced protein products revealed that MocA resembles NADH-dependent dehydrogenases. MocB exhibits characteristic features of periplasmic-binding proteins that are components of high-affinity transport systems. MocC does not share significant homology with any protein in the database. MocR shows homology with the GntR class of bacterial regulator proteins. These results suggest that the mocABC genes are involved in the uptake and subsequent degradation of rhizopine, whereas mocR is likely to play a regulatory role.
Molecular Characterization of OXA-198 Carbapenemase-Producing Pseudomonas aeruginosa Clinical Isolates.

PubMed

Bonnin, Rémy A; Bogaerts, Pierre; Girlich, Delphine; Huang, Te-Din; Dortet, Laurent; Glupczynski, Youri; Naas, Thierry

2018-06-01

Carbapenemase-producing Pseudomonadaceae have increasingly been reported worldwide, with an ever-increasing heterogeneity of carbapenem resistance mechanisms, depending on the bacterial species and the geographical location. OXA-198 is a plasmid-encoded class D β-lactamase involved in carbapenem resistance in one Pseudomonas aeruginosa isolate from Belgium. In the setting of a multicenter survey of carbapenem resistance in P. aeruginosa strains in Belgian hospitals in 2013, three additional OXA-198-producing P. aeruginosa isolates originating from patients hospitalized in one hospital were detected. To reveal the molecular mechanism underlying the reduced susceptibility to carbapenems, MIC determinations, whole-genome sequencing, and PCR analyses to confirm the genetic organization were performed. The plasmid harboring the blaOXA-198 gene was characterized, along with the genetic relatedness of the four P. aeruginosa isolates. The blaOXA-198 gene was harbored on a class 1 integron carried by an ∼49-kb IncP-type plasmid proposed as IncP-11. The same plasmid was present in all four P. aeruginosa isolates. Multilocus sequence typing revealed that the isolates all belonged to sequence type 446, and single-nucleotide polymorphism analysis revealed only a few differences between the isolates. This report describes the structure of a 49-kb plasmid harboring the blaOXA-198 gene and presents the first description of OXA-198-producing P. aeruginosa isolates associated with a hospital-associated cluster episode. Copyright © 2018 American Society for Microbiology.
Isolation of a complementary DNA clone for thyroid microsomal antigen. Homology with the gene for thyroid peroxidase.

PubMed Central

Seto, P; Hirayu, H; Magnusson, R P; Gestautas, J; Portmann, L; DeGroot, L J; Rapoport, B

1987-01-01

The thyroid microsomal antigen (MSA) in autoimmune thyroid disease is a protein of approximately 107 kD. We screened a human thyroid cDNA library constructed in the expression vector lambda gt11 with anti-107-kD monoclonal antibodies. Of five clones obtained, the recombinant beta-galactosidase fusion protein from one clone (PM-5) was confirmed to react with the monoclonal antiserum. The complementary DNA (cDNA) insert from PM-5 (0.8 kb) was used as a probe on Northern blot analysis to estimate the size of the mRNA coding for the MSA. The 2.9-kb messenger RNA (mRNA) species observed was the same size as that coding for human thyroid peroxidase (TPO). The probe did not bind to human liver mRNA, indicating the thyroid-specific nature of the PM-5-related mRNA. The nucleotide sequence of PM-5 (842 bp) was determined and consisted of a single open reading frame. Comparison of the nucleotide sequence of PM-5 with that presently available for pig TPO indicates 84% homology. In conclusion, a cDNA clone representing part of the microsomal antigen has been isolated. Sequence homology with porcine TPO, as well as identity in the size of the mRNA species for both the microsomal antigen and TPO, indicate that the microsomal antigen is, at least in part, TPO. PMID:3654979

Structural and functional analysis of mouse Msx1 gene promoter: sequence conservation with human MSX1 promoter points at potential regulatory elements.

PubMed

Gonzalez, S M; Ferland, L H; Robert, B; Abdelhay, E

1998-06-01

Vertebrate Msx genes are related to one of the most divergent homeobox genes of Drosophila, the muscle segment homeobox (msh) gene, and are expressed in a well-defined pattern at sites of tissue interactions. This pattern of expression is conserved in vertebrates as diverse as quail, zebrafish, and mouse in a range of sites including neural crest, appendages, and craniofacial structures. In the present work, we performed structural and functional analyses in order to identify potential cis-acting elements that may be regulating Msx1 gene expression. To this end, a 4.9-kb segment of the 5'-flanking region was sequenced and analyzed for transcription-factor binding sites. Four regions showing a high concentration of these sites were identified. Transfection assays with fragments of regulatory sequences driving the expression of the bacterial lacZ reporter gene showed that a region of 4 kb upstream of the transcription start site contains positive and negative elements responsible for controlling gene expression. Interestingly, a fragment of 130 bp seems to contain the minimal elements necessary for gene expression, as its removal completely abolishes gene expression in cultured cells. These results are reinforced by comparison of this region with the human Msx1 gene promoter, which shows extensive conservation, including many consensus binding sites, suggesting a regulatory role for them.
Characterization of a Theta-Type Plasmid from Lactobacillus sakei: a Potential Basis for Low-Copy-Number Vectors in Lactobacilli

PubMed Central

Alpert, Carl-Alfred; Crutz-Le Coq, Anne-Marie; Malleret, Christine; Zagorec, Monique

2003-01-01

The complete nucleotide sequence of the 13-kb plasmid pRV500, isolated from Lactobacillus sakei RV332, was determined. Sequence analysis enabled the identification of genes coding for a putative type I restriction-modification system, two genes coding for putative recombinases of the integrase family, and a region likely involved in replication. The structural features of this region, comprising a putative ori segment containing 11- and 22-bp repeats and a repA gene coding for a putative initiator protein, indicated that pRV500 belongs to the pUCL287 subfamily of theta-type replicons. A 3.7-kb fragment encompassing this region was fused to an Escherichia coli replicon to produce the shuttle vector pRV566 and was observed to be functional in L. sakei for plasmid replication. The L. sakei replicon alone could not support replication in E. coli. Plasmid pRV500 and its derivative pRV566 were determined to be at very low copy numbers in L. sakei. pRV566 was maintained at a reasonable rate over 20 generations in several lactobacilli, such as Lactobacillus curvatus, Lactobacillus casei, and Lactobacillus plantarum, in addition to L. sakei, making it an interesting basis for developing vectors. Sequence relationships with other plasmids are described and discussed. PMID:12957947
SMRT sequencing of the Vitis vinifera cv. ‘Flame seedless’ genome using a SMRTbell-free library preparation from Swift Biosciences

USDA-ARS's Scientific Manuscript database

Single Molecule Real-Time (SMRT) sequencing provides advantages to the sequencing of complex genomes. The long reads generated are superior for resolving complex genomic regions and provide highly contiguous de novo assemblies. Current SMRTbell libraries generate average read lengths of 10-15kb. How...
Thermodynamically balanced inside-out (TBIO) PCR-based gene synthesis: a novel method of primer design for high-fidelity assembly of longer gene sequences

PubMed Central

Gao, Xinxin; Yo, Peggy; Keith, Andrew; Ragan, Timothy J.; Harris, Thomas K.

2003-01-01

A novel thermodynamically-balanced inside-out (TBIO) method of primer design was developed and compared with a thermodynamically-balanced conventional (TBC) method of primer design for PCR-based gene synthesis of codon-optimized gene sequences for the human protein kinase B-2 (PKB2; 1494 bp), p70 ribosomal S6 subunit protein kinase-1 (S6K1; 1622 bp) and phosphoinositide-dependent protein kinase-1 (PDK1; 1712 bp). Each of the 60mer TBIO primers coded for identical nucleotide regions that the 60mer TBC primers covered, except that half of the TBIO primers were reverse complement sequences. In addition, the TBIO and TBC primers contained identical regions of temperature-optimized primer overlaps. The TBC method was optimized to generate sequential overlapping fragments (∼0.4–0.5 kb) for each of the gene sequences, and simultaneous and sequential combinations of overlapping fragments were tested for their ability to be assembled under an array of PCR conditions. However, no fully synthesized gene sequences could be obtained by this approach. In contrast, the TBIO method generated an initial central fragment (∼0.4–0.5 kb), which could be gel purified and used for further inside-out bidirectional elongation by additional increments of 0.4–0.5 kb. By using the newly developed TBIO method of PCR-based gene synthesis, error-free synthetic genes for the human protein kinases PKB2, S6K1 and PDK1 were obtained with little or no corrective mutagenesis. PMID:14602936
Superimposed Code Theoretic Analysis of DNA Codes and DNA Computing

DTIC Science & Technology

2010-03-01

because only certain collections (partitioned by font type) of sequences are allowed to be in each position (e.g., Arial = position 0, Comic ...rigidity of short oligos and the shape of the polar charge. Oligo movement was modeled by a Brownian motion 3 dimensional random walk. The one...temperature, kB is Boltz he viscosity of the medium. The random walk motion is modeled by assuming the oligo is on a three dimensional lattice and may
YAC and cosmid contigs encompassing the Fukuyama-type congenital muscular dystrophy (FCMD) candidate region on 9q31

DOE Office of Scientific and Technical Information (OSTI.GOV)

Miyake, Masashi; Nakahori, Yutaka; Matsushita, Ikumi

1997-03-01

Fukuyama-type congenital muscular dystrophy (FCMD), the second most common form of childhood muscular dystrophy in Japan, is an autosomal recessive severe muscular dystrophy associated with an anomaly of the brain. We had mapped the FCMD gene to an approximately 5-cM interval between D9S127 and D9S2111 on 9q31-q33 and had also found evidence for linkage disequilibrium between FCMD and D9S306 in this candidate region. Through further analysis, we have defined another marker, D9S172, which showed stronger linkage disequilibrium than D9S306. A yeast artificial chromosome (YAC) contig spanning 3.5 Mb, which includes this D9S306-D9S172 interval on 9q31, has been constructed by a combination of sequence-tagged site, Alu-PCR, and restriction mapping. Also, cosmid clones subcloned from the YAC were assembled into three contigs, one of which contains D9S2107, which showed the strongest linkage disequilibrium with FCMD. These contigs also allowed us to order the markers as follows: cen-D9S127-(~800 kb)-D9S306 (identical to D9S53)-(~700 kb)-A107XF9-(~500 kb)-D9S172-(~30 kb)-D9S299 (identical to D9S774)-(~120 kb)-WI2269-tel. Thus, we have constructed the first high-resolution physical map of the FCMD candidate region. The YAC and cosmid contigs established here will be a crucial resource for identification of the FCMD gene and other genes in this region. 37 refs., 7 figs., 2 tabs.
Extremely low nucleotide polymorphism in Pinus krempfii Lecomte, a unique flat needle pine endemic to Vietnam

PubMed Central

Wang, Baosheng; Khalili Mahani, Marjan; Ng, Wei Lun; Kusumi, Junko; Phi, Hai Hong; Inomata, Nobuyuki; Wang, Xiao-Ru; Szmidt, Alfred E

2014-01-01

Pinus krempfii Lecomte is a morphologically and ecologically unique pine, endemic to Vietnam. It is regarded as a vulnerable species with distribution limited to just two provinces: Khanh Hoa and Lam Dong. Although a few phylogenetic studies have included this species, almost nothing is known about its genetic features. In particular, there are no studies addressing the levels and patterns of genetic variation in natural populations of P. krempfii. In this study, we sampled 57 individuals from six natural populations of P. krempfii and analyzed their sequence variation in ten nuclear gene regions (approximately 9 kb) and 14 mitochondrial (mt) DNA regions (approximately 10 kb). We also analyzed variation at seven chloroplast (cp) microsatellite (SSR) loci. We found very low haplotype and nucleotide diversity at nuclear loci compared with other pine species. Furthermore, all investigated populations were monomorphic across all mitochondrial DNA (mtDNA) regions included in our study, which are polymorphic in other pine species. Population differentiation at nuclear loci was low (5.2%) but significant. However, structure analysis of nuclear loci did not detect genetically differentiated groups of populations. Approximate Bayesian computation (ABC) using nuclear sequence data and mismatch distribution analysis for cpSSR loci suggested recent expansion of the species. The implications of these findings for the management and conservation of P. krempfii genetic resources are discussed. PMID:25360263
Nucleotide sequence of the coat protein gene of Lettuce big-vein virus.

PubMed

Sasaya, T; Ishikawa, K; Koganezawa, H

2001-06-01

A sequence of 1425 nt was established that included the complete coat protein (CP) gene of Lettuce big-vein virus (LBVV). The LBVV CP gene encodes a 397 amino acid protein with a predicted M(r) of 44486. Antisera raised against synthetic peptides corresponding to N-terminal or C-terminal parts of the LBVV CP reacted in Western blot analysis with a protein with an M(r) of about 48000. RNA extracted from purified particles of LBVV by using proteinase K, SDS and phenol migrated in gels as two single-stranded RNA species of approximately 7.3 kb (ss-1) and 6.6 kb (ss-2). After denaturation by heat and annealing at room temperature, the RNA migrated as four species, ss-1, ss-2 and two additional double-stranded RNAs (ds-1 and ds-2). The Northern blot hybridization analysis using riboprobes from a full-length clone of the LBVV CP gene indicated that ss-2 has a negative-sense nature and contains the LBVV CP gene. Moreover, ds-2 is a double-stranded form of ss-2. Database searches showed that the LBVV CP most resembled the nucleocapsid proteins of rhabdoviruses. These results indicate that it would be appropriate to classify LBVV as a negative-sense single-stranded RNA virus rather than as a double-stranded RNA virus.
pLS010 plasmid vector

DOEpatents

Lacks, Sanford A.; Balganesh, Tanjore S.

1988-01-01

Disclosed is recombinant plasmid pLS101, consisting essentially of a 2.0 kb malM gene fragment ligated to a 4.4 kb Tc^r DNA fragment, which is particularly useful for transforming Gram-positive bacteria. This plasmid contains at least four restriction sites suitable for inserting exogenous gene sequences. Also disclosed is a method for plasmid isolation by penicillin selection, as well as processes for enrichment of recombinant plasmids in Gram-positive bacterial systems.
Proliferation of group II introns in the chloroplast genome of the green alga Oedocladium carolinianum (Chlorophyceae).

PubMed

Brouard, Jean-Simon; Turmel, Monique; Otis, Christian; Lemieux, Claude

2016-01-01

The chloroplast genome sustained extensive changes in architecture during the evolution of the Chlorophyceae, a morphologically and ecologically diverse class of green algae belonging to the Chlorophyta; however, the forces driving these changes are poorly understood. The five orders recognized in the Chlorophyceae form two major clades: the CS clade consisting of the Chlamydomonadales and Sphaeropleales, and the OCC clade consisting of the Oedogoniales, Chaetophorales, and Chaetopeltidales. In the OCC clade, considerable variations in chloroplast DNA (cpDNA) structure, size, gene order, and intron content have been observed. The large inverted repeat (IR), an ancestral feature characteristic of most green plants, is present in Oedogonium cardiacum (Oedogoniales) but is lacking in the examined members of the Chaetophorales and Chaetopeltidales. Remarkably, the Oedogonium 35.5-kb IR houses genes that were putatively acquired through horizontal DNA transfer. To better understand the dynamics of chloroplast genome evolution in the Oedogoniales, we analyzed the cpDNA of a second representative of this order, Oedocladium carolinianum. The Oedocladium cpDNA was sequenced and annotated. The evolutionary distances separating Oedocladium and Oedogonium cpDNAs and two other pairs of chlorophycean cpDNAs were estimated using a 61-gene data set. Phylogenetic analysis of an alignment of group IIA introns from members of the OCC clade was performed. Secondary structures and insertion sites of oedogonialean group IIA introns were analyzed. The 204,438-bp Oedocladium genome is 7.9 kb larger than the Oedogonium genome, but its repertoire of conserved genes is remarkably similar and gene order differs by only one reversal. Although the 23.7-kb IR is missing the putative foreign genes found in Oedogonium, it contains sequences coding for a putative phage or bacterial DNA primase and a hypothetical protein.
Intergenic sequences are 1.5-fold longer and dispersed repeats are more abundant, but a smaller fraction of the Oedocladium genome is occupied by introns. Six additional group II introns are present, five of which lack ORFs and carry highly similar sequences to that of the ORF-less IIA intron shared with Oedogonium. Secondary structure analysis of the group IIA introns disclosed marked differences in the exon-binding sites; however, each intron showed perfect or nearly perfect base pairing interactions with its target site. Our results suggest that chloroplast genes rearrange more slowly in the Oedogoniales than in the Chaetophorales and raise questions as to what was the nature of the foreign coding sequences in the IR of the common ancestor of the Oedogoniales. They provide the first evidence for intragenomic proliferation of group IIA introns in the Viridiplantae, revealing that intron spread in the Oedocladium lineage likely occurred by retrohoming after sequence divergence of the exon-binding sites.
Evidence of protein-free homology recognition in magnetic bead force–extension experiments

PubMed Central

(O’) Lee, D. J.; Danilowicz, C.; Rochester, C.; Prentiss, M.

2016-01-01

Earlier theoretical studies have proposed that the homology-dependent pairing of large tracts of dsDNA may be due to physical interactions between homologous regions. Such interactions could contribute to the sequence-dependent pairing of chromosome regions that may occur in the presence or the absence of double-strand breaks. Several experiments have indicated the recognition of homologous sequences in pure electrolytic solutions without proteins. Here, we report single-molecule force experiments with a designed 60 kb long dsDNA construct, with one end attached to a solid surface and the other end to a magnetic bead. The 60 kb constructs contain two 10 kb long homologous tracts oriented head to head, so that their sequences match if the two tracts fold on each other. The distance between the bead and the surface is measured as a function of the force applied to the bead. At low forces, the construct molecules extend substantially less than normal control dsDNA, indicating the existence of preferential interaction between the homologous regions. Increasing the force causes continuous rather than abrupt unfolding of the paired homologous regions. Simple semi-phenomenological models of the unfolding mechanics are proposed, and their predictions are compared with the data. PMID:27493568
Non-random distribution and co-localization of purine/pyrimidine-encoded information and transcriptional regulatory domains.

PubMed

Povinelli, C M

1992-01-01

In order to detect sequence-based information predictive for the location of eukaryotic transcriptional regulatory domains, the frequencies and distributions of the 36 possible purine/pyrimidine reverse complement hexamer pairs were determined for test sets of real and random sequences. The distribution of one of the hexamer pairs (RRYYRR/YYRRYY, referred to as M1) was further examined in a larger set of sequences (> 32 genes, 230 kb). Predominant clusters of M1 and the locations of eukaryotic transcriptional regulatory domains were found to be associated and non-randomly distributed along the DNA, consistent with a periodicity of approximately 1.2 kb. In the context of higher-order chromatin this would align promoters, enhancers and the predominant clusters of M1 longitudinally along one face of a 30 nm fiber. Using only information about the distribution of the M1 motif, 50-70% of a sequence could be eliminated as being unlikely to contain transcriptional regulatory domains with an 87% recovery of the regulatory domains present.
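The count of 36 hexamer pairs in the record above can be checked by enumeration. The sketch below is an illustration only, not the author's software: it assumes the usual coding of A and G as purine (R) and C and T as pyrimidine (Y), and the function names are hypothetical. Of the 64 possible R/Y hexamers, 8 are their own reverse complement, which leaves (64 - 8) / 2 + 8 = 36 distinct pairs.

```python
from itertools import product

# Purine/pyrimidine alphabet: R pairs with Y, so complementing swaps them.
COMPLEMENT = {"R": "Y", "Y": "R"}
# Assumed base-to-class coding (A, G = purine; C, T = pyrimidine).
TO_RY = {"A": "R", "G": "R", "C": "Y", "T": "Y"}


def reverse_complement_ry(word):
    """Reverse complement of a word written in the R/Y alphabet."""
    return "".join(COMPLEMENT[c] for c in reversed(word))


def hexamer_pairs():
    """All distinct (hexamer, reverse complement) pairs over {R, Y}."""
    pairs = set()
    for letters in product("RY", repeat=6):
        word = "".join(letters)
        pairs.add(tuple(sorted((word, reverse_complement_ry(word)))))
    return sorted(pairs)


def to_ry(seq):
    """Recode a DNA sequence as R/Y; unknown symbols become N."""
    return "".join(TO_RY.get(base, "N") for base in seq.upper())


def motif_positions(seq, pair=("RRYYRR", "YYRRYY")):
    """Start positions (0-based) where either member of the pair occurs."""
    ry = to_ry(seq)
    return [i for i in range(len(ry) - 5) if ry[i:i + 6] in pair]


if __name__ == "__main__":
    print(len(hexamer_pairs()))          # number of distinct pairs
    print(motif_positions("AGCTGAC"))    # M1 hits in a toy sequence
```

Clustering or periodicity analysis of the returned positions would be a separate step and is not attempted here.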
Cloning, structure, and chromosome localization of the mouse glutaryl-CoA dehydrogenase gene

DOE Office of Scientific and Technical Information (OSTI.GOV)

Koeller, D.M.; DiGiulio, A.; Frerman, F.E.

Glutaryl-CoA dehydrogenase (GCDH) is a nuclear-encoded, mitochondrial matrix enzyme. In humans, deficiency of GCDH leads to glutaric acidemia type I, an inherited disorder of amino acid metabolism characterized by a progressive neurodegenerative disease. In this report we describe the cloning and structure of the mouse GCDH (Gcdh) gene and cDNA and its chromosomal localization. The mouse Gcdh cDNA is 1.75 kb long and contains an open reading frame of 438 amino acids. The amino acid sequences of mouse, human, and pig GCDH are highly conserved. The mouse Gcdh gene contains 11 exons and spans 7 kb of genomic DNA. Gcdh was mapped by backcross analysis to mouse chromosome 8 within a region that is homologous to a region of human chromosome 19, where the human gene was previously mapped. 14 refs., 3 figs.
The pituitary hormones arginine vasopressin-neurophysin II and oxytocin-neurophysin I show close linkage with interleukin-1 on mouse chromosome 2

DOE Office of Scientific and Technical Information (OSTI.GOV)

Marini, J.C.; Nelson, K.K.; Siracusa, L.D.

1993-01-01

Arginine vasopressin (AVP) and oxytocin (OXT) are posterior pituitary hormones. AVP is involved in fluid homeostasis, while OXT is involved in lactation and parturition. AVP is derived from a larger precursor, prepro-arginine-vasopressin-neurophysin II (prepro-AVP-NP II; AVP), and is physically linked to prepro-oxytocin-neurophysin I (prepro-OXT-NP I; OXT). The genes for AVP and OXT are separated by only 12 kb of DNA in humans, whereas in the mouse 3.5 kb of intergenic sequence lies between Avp and Oxt. Interspecific backcross analysis has now been used to map the Avp/Oxt complex to chromosome 2 in the mouse. This map position confirms and extends the known region of linkage conservation between mouse chromosome 2 and human chromosome 20. 16 refs., 2 figs., 1 tab.
Plasmid Characterization and Chromosome Analysis of Two netF+ Clostridium perfringens Isolates Associated with Foal and Canine Necrotizing Enteritis

PubMed Central

Mehdizadeh Gohari, Iman; Kropinski, Andrew M.; Weese, Scott J.; Parreira, Valeria R.; Whitehead, Ashley E.; Boerlin, Patrick; Prescott, John F.

2016-01-01

The recent discovery of a novel beta-pore-forming toxin, NetF, which is strongly associated with canine and foal necrotizing enteritis should improve our understanding of the role of type A Clostridium perfringens associated disease in these animals. The current study presents the complete genome sequence of two netF-positive strains, JFP55 and JFP838, which were recovered from cases of foal necrotizing enteritis and canine hemorrhagic gastroenteritis, respectively. Genome sequencing was done using Single Molecule, Real-Time (SMRT) technology-PacBio and Illumina Hiseq2000. The JFP55 and JFP838 genomes include a single 3.34 Mb and 3.53 Mb chromosome, respectively, and both genomes include five circular plasmids. Plasmid annotation revealed that three plasmids were shared by the two newly sequenced genomes, including a NetF/NetE toxins-encoding tcp-conjugative plasmid, a CPE/CPB2 toxins-encoding tcp-conjugative plasmid and a putative bacteriocin-encoding plasmid. The putative beta-pore-forming toxin genes, netF, netE and netG, were located in unique pathogenicity loci on tcp-conjugative plasmids. The C. perfringens JFP55 chromosome carries 2,825 protein-coding genes whereas the chromosome of JFP838 contains 3,014 protein-encoding genes. Comparison of these two chromosomes with three available reference C. perfringens chromosome sequences identified 48 (~247 kb) and 81 (~430 kb) regions unique to JFP55 and JFP838, respectively. Some of these divergent genomic regions in both chromosomes are phage- and plasmid-related segments. Sixteen of these unique chromosomal regions (~69 kb) were shared between the two isolates. Five of these shared regions formed a mosaic of plasmid-integrated segments, suggesting that these elements were acquired early in a clonal lineage of netF-positive C. perfringens strains. 
These results provide significant insight into the basis of canine and foal necrotizing enteritis and are the first to demonstrate that netF resides on a large and unique plasmid-encoded locus. PMID:26859667
Comprehensive analysis of MHC class I genes from the U-, S-, and Z-lineages in Atlantic salmon.

PubMed

Lukacs, Morten F; Harstad, Håvard; Bakke, Hege G; Beetz-Sargent, Marianne; McKinnel, Linda; Lubieniecki, Krzysztof P; Koop, Ben F; Grimholt, Unni

2010-03-05

We have previously sequenced more than 500 kb of the duplicated MHC class I regions in Atlantic salmon. In the IA region we identified the loci for the MHC class I gene Sasa-UBA in addition to a soluble MHC class I molecule, Sasa-ULA. A pseudolocus for Sasa-UCA was identified in the nonclassical IB region. Both regions contained genes for antigen presentation, as well as orthologues to other genes residing in the human MHC region. The genomic localisation of two MHC class I lineages (Z and S) has been resolved. Seven BACs were sequenced using a combination of standard Sanger and 454 sequencing. The new sequence data extended the IA region by 150 kb, identifying the location of one Z-lineage locus, ZAA. The IB region was extended by 350 kb, including three new Z-lineage loci, ZBA, ZCA and ZDA, in addition to a UGA locus. An allelic version of the IB region contained a functional UDA locus in addition to the UCA pseudolocus. Additionally, a BAC harbouring two MHC class I genes (UHA) was placed on linkage group 14, while a BAC containing the S-lineage locus SAA (previously known as UAA) was placed on LG10. Gene expression studies showed a limited expression range for all class I genes, with the exception of UBA being dominantly expressed in gut, spleen and gills, and ZAA with high expression in blood. Here we describe the genomic organization of MHC class I loci from the U-, Z-, and S-lineages in Atlantic salmon. Nine of the described class I genes are located in the extension of the duplicated IA and IB regions, while three class I genes are found on two separate linkage groups. The gene organization of the two regions indicates that the IB region is evolving at a different pace than the IA region. Expression profiling, polymorphic content, peptide binding properties and phylogenetic relationship show that Atlantic salmon has only one MHC class Ia gene (UBA), in addition to a multitude of nonclassical MHC class I genes from the U-, S- and Z-lineages.
[Construction of thr461 --> Asn461 and Ile462 --> Val462 mutation vector of P4501A1 gene].

PubMed

Wei, Qing; Liu, Yi-Min; Wang, Hui; Zhao, Xiao-Lin; Ren, Tie-ling; Xiao, Yong-mei

2006-09-01

To construct Thr461 --> Asn461 and Ile462 --> Val462 mutation vectors of the P4501A1 gene and to provide a scientific basis for further research on the function of the cytochrome P450 1A1 gene (CYP1A1) and the mechanism of carcinogenesis. According to the cDNA sequence of the human CYP1A1 gene, universal primers (Pm3/Pm4) and mutant primers (Pt15/Pt16 and Pt17/Pt18) containing a restriction enzyme site and the mutation site were designed. The first set of primers (Pm3/Pt16 and Pm3/Pt18) amplified a forward 1.5-kb fragment from the pGEM-T-CYP1A1 plasmid. The second set of primers (Pt15/Pm4 and Pt17/Pm4) amplified a reverse 177-bp fragment from 10 ng of pGEM-T-CYP1A1 plasmid. The third set of primers (Pm3/Pm4) amplified a 1.5-kb fragment from the products of the former PCR amplifications. The third PCR products were separated, purified and recovered from a 1% agarose gel, then inserted into the pMD-T vector. Subsequently the ligation products were transformed into E. coli strain DH5α; single clones were then screened, and plasmids extracted from these clones were verified by restriction endonuclease analysis and sequencing. The 1.5-kb fragment from the three rounds of PCR amplification was digested with restriction endonucleases (BamHI and SalI) and sequenced bidirectionally with universal primers (T7p and SP6). The results verified that the cloned fragment including the Asn461 and Val462 mutant sites had 99.9% homology with the human CYP1A1 cDNA in GenBank. The target fragment containing the Asn461 and Val462 mutant sites of the CYP1A1 cDNA has been successfully constructed in this experiment.
Identification and characterization of large DNA deletions affecting oil quality traits in soybean seeds through transcriptome sequencing analysis.

PubMed

Goettel, Wolfgang; Ramirez, Martha; Upchurch, Robert G; An, Yong-Qiang Charles

2016-08-01

Identification and characterization of a 254-kb genomic deletion on a duplicated chromosome segment that resulted in a low level of palmitic acid in soybean seeds using transcriptome sequencing. A large number of soybean genotypes varying in seed oil composition and content have been identified. Understanding the molecular mechanisms underlying these variations is important for breeders to effectively utilize them as a genetic resource. Through design and application of a bioinformatics approach, we identified nine co-regulated gene clusters by comparing seed transcriptomes of nine soybean genotypes varying in oil composition and content. We demonstrated that four gene clusters in the genotypes M23, Jack and N0304-303-3 coincided with large-scale genome rearrangements. The co-regulated gene clusters in M23 and Jack mapped to a previously described 164-kb deletion and a copy number amplification of the Rhg1 locus, respectively. The coordinately down-regulated gene clusters in N0304-303-3 were caused by a 254-kb deletion containing 19 genes including a fatty acyl-ACP thioesterase B gene (FATB1a). This deletion was associated with reduced palmitic acid content in seeds and was the molecular cause of a previously reported nonfunctional FATB1a allele, fap nc . The M23 and N0304-303-3 deletions were located in duplicated genome segments retained from the Glycine-specific whole genome duplication that occurred 13 million years ago. The homoeologous genes in these duplicated regions shared a strong similarity in both their encoded protein sequences and transcript accumulation levels, suggesting that they may have conserved and important functions in seeds. The functional conservation of homoeologous genes may result in genetic redundancy and gene dosage effects for their associated seed traits, explaining why the large deletion did not cause lethal effects or completely eliminate palmitic acid in N0304-303-3.
Complete Sequencing of pNDM-HK Encoding NDM-1 Carbapenemase from a Multidrug-Resistant Escherichia coli Strain Isolated in Hong Kong

PubMed Central

Ho, Pak Leung; Lo, Wai U.; Yeung, Man Kiu; Lin, Chi Ho; Chow, Kin Hung; Ang, Irene; Tong, Amy Hin Yan; Bao, Jessie Yun-Juan; Lok, Si; Lo, Janice Yee Chi

2011-01-01

Background The emergence of plasmid-mediated carbapenemases, such as NDM-1 in Enterobacteriaceae, is a major public health issue. Since they mediate resistance to virtually all β-lactam antibiotics and there is often co-resistance to other antibiotic classes, the therapeutic options for infections caused by these organisms are very limited. Methodology We characterized the first NDM-1 producing E. coli isolate recovered in Hong Kong. The plasmid encoding the metallo-β-lactamase gene was sequenced. Principal Findings The plasmid, pNDM-HK, readily transferred to E. coli J53 at high frequencies. It belongs to the broad host range IncL/M incompatibility group and is 88,803 bp in size. Sequence alignment showed that pNDM-HK has a 55 kb backbone which shared 97% homology with pEL60 originating from the plant pathogen Erwinia amylovora in Lebanon, and a 28.9 kb variable region. The plasmid backbone includes the mucAB genes mediating ultraviolet light resistance. The 28.9 kb region has a composite transposon-like structure which includes intact or truncated genes associated with resistance to β-lactams (bla TEM-1, bla NDM-1, Δbla DHA-1), aminoglycosides (aacC2, armA), sulphonamides (sul1) and macrolides (mel, mph2). It also harbors the following mobile elements: IS26, ISCR1, tnpU, tnpAcp2, tnpD, ΔtnpATn1 and insL. Certain blocks within the 28.9 kb variable region had homology with the corresponding sequences in the widely disseminated plasmids pCTX-M3, pMUR050 and pKP048, originating from bacteria in Poland in 1996, in Spain in 2002 and in China in 2006, respectively. Significance The genetic support of the NDM-1 gene suggests that it has evolved through complex pathways. The association with a broad host range plasmid and multiple mobile genetic elements explains its observed horizontal mobility in multiple bacterial taxa. PMID:21445317
Sequence Ready Characterization of the Pericentromeric Region of 19p12

DOE Office of Scientific and Technical Information (OSTI.GOV)

Evan E. Eichler

2006-08-31

Current mapping and sequencing strategies have been inadequate within the proximal portion of 19p12 due, in part, to the presence of a recently expanded ZNF (zinc-finger) gene family and the presence of large (25-50 kb) inverted beta-satellite repeat structures which bracket this tandemly duplicated gene family. The virtual absence of classically defined “unique” sequence within the region has hampered efforts to identify and characterize a suitable minimal tiling path of clones which can be used as templates required for finished sequencing of the region. The goal of this proposal is to develop and implement a novel sequence-anchor strategy to generate a contiguous BAC map of the most proximal portion of chromosome 19p12 for the purpose of complete sequence characterization. The target region will be an estimated 4.5 Mb of DNA extending from STS marker D19S450 (the beginning of the ZNF gene cluster) to the centromeric (alpha-satellite) junction of 19p11. The approach will entail 1) pre-selection of 19p12 BAC and cosmid clones (NIH approved library) utilizing both 19p12-unique and 19p12-specific repeat probes (Eichler et al., 1998); 2) the generation of a BAC/cosmid end-sequence map across the region with a density of one marker every 8 kb; 3) the development of a second generation of STS (sequence tagged sites) which will be used to identify and verify clonal overlap at the level of the sequence; 4) incorporation of these sequence-anchored overlapping clones into existing cosmid/BAC restriction maps developed at Livermore National Laboratory; and 5) validation of the organization of this region utilizing high-resolution FISH techniques (extended chromatin analysis) on monochromosomal 19 somatic cell hybrids and parental cell lines of source material. The data generated will be used in the selection of the most parsimonious tiling path of BAC clones to be sequenced as part of the JGI effort on chromosome 19 and should serve as a model for the sequence characterization of other difficult regions of the human genome.

New genomic resources for switchgrass: a BAC library and comparative analysis of homoeologous genomic regions harboring bioenergy traits

PubMed Central

2011-01-01

Background Switchgrass, a C4 species and a warm-season grass native to the prairies of North America, has been targeted for development into an herbaceous biomass fuel crop. Genetic improvement of switchgrass feedstock traits through marker-assisted breeding and biotechnology approaches calls for genomic tools development. Establishment of integrated physical and genetic maps for switchgrass will accelerate mapping of value added traits useful to breeding programs and to isolate important target genes using map based cloning. The reported polyploidy series in switchgrass ranges from diploid (2X = 18) to duodecaploid (12X = 108). Like in other large, repeat-rich plant genomes, this genomic complexity will hinder whole genome sequencing efforts. An extensive physical map providing enough information to resolve the homoeologous genomes would provide the necessary framework for accurate assembly of the switchgrass genome. Results A switchgrass BAC library constructed by partial digestion of nuclear DNA with EcoRI contains 147,456 clones covering the effective genome approximately 10 times based on a genome size of 3.2 Gigabases (~1.6 Gb effective). Restriction digestion and PFGE analysis of 234 randomly chosen BACs indicated that 95% of the clones contained inserts, ranging from 60 to 180 kb with an average of 120 kb. Comparative sequence analysis of two homoeologous genomic regions harboring orthologs of the rice OsBRI1 locus, a low-copy gene encoding a putative protein kinase and associated with biomass, revealed that orthologous clones from homoeologous chromosomes can be unambiguously distinguished from each other and correctly assembled to respective fingerprint contigs. Thus, the data obtained not only provide genomic resources for further analysis of switchgrass genome, but also improve efforts for an accurate genome sequencing strategy. 
Conclusions The construction of the first switchgrass BAC library and comparative analysis of homoeologous regions harboring OsBRI1 orthologs present a glimpse into the switchgrass genome structure and complexity. Data obtained demonstrate the feasibility of using HICF fingerprinting to resolve the homoeologous chromosomes of the two distinct genomes in switchgrass, providing a robust and accurate BAC-based physical platform for this species. The genomic resources and sequence data generated will lay the foundation for deciphering the switchgrass genome and lead the way for an accurate genome sequencing strategy. PMID:21767393
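The "approximately 10 times" coverage quoted in the record above follows from the figures in the abstract. The short sketch below reproduces that arithmetic; it assumes that only the 95% of clones found to carry inserts contribute, that the 120 kb average insert applies to all of them, and that the 1.6 Gb effective genome size is the denominator. The function name is hypothetical.

```python
def library_coverage(n_clones, mean_insert_bp, genome_bp, insert_fraction=1.0):
    """Genome equivalents represented by a clone library.

    coverage = (clones carrying inserts * mean insert size) / genome size
    """
    return n_clones * insert_fraction * mean_insert_bp / genome_bp


# Figures taken from the abstract: 147,456 clones, 120 kb mean insert,
# 95% of clones with inserts, 1.6 Gb effective genome size.
coverage = library_coverage(147_456, 120_000, 1.6e9, insert_fraction=0.95)
print(f"{coverage:.1f}x")  # about 10.5 genome equivalents
```

Treating every clone as insert-bearing (insert_fraction=1.0) gives about 11x, so the published "approximately 10 times" is consistent with either reading.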
New genomic resources for switchgrass: a BAC library and comparative analysis of homoeologous genomic regions harboring bioenergy traits.

PubMed

Saski, Christopher A; Li, Zhigang; Feltus, Frank A; Luo, Hong

2011-07-18

Switchgrass, a C4 species and a warm-season grass native to the prairies of North America, has been targeted for development into an herbaceous biomass fuel crop. Genetic improvement of switchgrass feedstock traits through marker-assisted breeding and biotechnology approaches calls for genomic tools development. Establishment of integrated physical and genetic maps for switchgrass will accelerate mapping of value added traits useful to breeding programs and to isolate important target genes using map based cloning. The reported polyploidy series in switchgrass ranges from diploid (2X = 18) to duodecaploid (12X = 108). Like in other large, repeat-rich plant genomes, this genomic complexity will hinder whole genome sequencing efforts. An extensive physical map providing enough information to resolve the homoeologous genomes would provide the necessary framework for accurate assembly of the switchgrass genome. A switchgrass BAC library constructed by partial digestion of nuclear DNA with EcoRI contains 147,456 clones covering the effective genome approximately 10 times based on a genome size of 3.2 Gigabases (~1.6 Gb effective). Restriction digestion and PFGE analysis of 234 randomly chosen BACs indicated that 95% of the clones contained inserts, ranging from 60 to 180 kb with an average of 120 kb. Comparative sequence analysis of two homoeologous genomic regions harboring orthologs of the rice OsBRI1 locus, a low-copy gene encoding a putative protein kinase and associated with biomass, revealed that orthologous clones from homoeologous chromosomes can be unambiguously distinguished from each other and correctly assembled to respective fingerprint contigs. Thus, the data obtained not only provide genomic resources for further analysis of switchgrass genome, but also improve efforts for an accurate genome sequencing strategy. 
The construction of the first switchgrass BAC library and comparative analysis of homoeologous regions harboring OsBRI1 orthologs present a glimpse into the switchgrass genome structure and complexity. Data obtained demonstrate the feasibility of using HICF fingerprinting to resolve the homoeologous chromosomes of the two distinct genomes in switchgrass, providing a robust and accurate BAC-based physical platform for this species. The genomic resources and sequence data generated will lay the foundation for deciphering the switchgrass genome and lead the way for an accurate genome sequencing strategy.
Evolution and selection of Rhg1, a copy-number variant nematode-resistance locus

PubMed Central

Lee, Tong Geon; Kumar, Indrajit; Diers, Brian W; Hudson, Matthew E

2015-01-01

The soybean cyst nematode (SCN) resistance locus Rhg1 is a tandem repeat of a 31.2 kb unit of the soybean genome. Each 31.2-kb unit contains four genes. One allele of Rhg1, Rhg1-b, is responsible for protecting most US soybean production from SCN. Whole-genome sequencing was performed, and PCR assays were developed to investigate allelic variation in sequence and copy number of the Rhg1 locus across a population of soybean germplasm accessions. Four distinct sequences of the 31.2-kb repeat unit were identified, and some Rhg1 alleles carry up to three different types of repeat unit. The total number of copies of the repeat varies from 1 to 10 per haploid genome. Both copy number and sequence of the repeat correlate with the resistance phenotype, and the Rhg1 locus shows strong signatures of selection. Significant linkage disequilibrium in the genome outside the boundaries of the repeat allowed the Rhg1 genotype to be inferred using high-density single nucleotide polymorphism genotyping of 15 996 accessions. Over 860 germplasm accessions were found likely to possess Rhg1 alleles. The regions surrounding the repeat show indications of non-neutral evolution and high genetic variability in populations from different geographic locations, but without evidence of fixation of the resistant genotype. A compelling explanation of these results is that balancing selection is in operation at Rhg1. PMID:25735447
Transcriptome Analysis and Development of SSR Molecular Markers in Glycyrrhiza uralensis Fisch.

PubMed Central

Liu, Yaling; Zhang, Pengfei; Song, Meiling; Hou, Junling; Qing, Mei; Wang, Wenquan; Liu, Chunsheng

2015-01-01

Licorice is an important traditional Chinese medicine with clinical and industrial applications. Genetic resources of licorice are insufficient for analysis of molecular biology and genetic functions; as such, transcriptome sequencing must be conducted for functional characterization and development of molecular markers. In this study, transcriptome sequencing on the Illumina HiSeq 2500 sequencing platform generated a total of 5.41 Gb clean data. De novo assembly yielded a total of 46,641 unigenes. Comparison analysis using BLAST showed that the annotations of 29,614 unigenes were conserved. Further study revealed 773 genes related to biosynthesis of secondary metabolites of licorice, 40 genes involved in biosynthesis of the terpenoid backbone, and 16 genes associated with biosynthesis of glycyrrhizic acid. Analysis of unigenes larger than 1 kb with a length of 11,702 nt presented 7,032 simple sequence repeats (SSR). Sixty-four of 69 randomly designed and synthesized SSR primer pairs were successfully amplified, and 33 pairs of primers were polymorphic in Glycyrrhiza uralensis Fisch., Glycyrrhiza inflata Bat., Glycyrrhiza glabra L. and Glycyrrhiza pallidiflora Maxim. This study not only presents the molecular biology data of licorice but also provides a basis for genetic diversity research and molecular marker-assisted breeding of licorice. PMID:26571372
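The abstract above does not say which tool or thresholds were used to call SSRs in the unigenes. As a rough illustration of how such a scan works, the sketch below finds perfect tandem repeats with a regular-expression backreference. The minimum repeat counts are assumptions chosen for the example, not the study's settings, and the function name is hypothetical.

```python
import re

# Assumed thresholds: motif length -> minimum number of tandem copies.
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(seq, min_repeats=MIN_REPEATS):
    """Return (start, motif, copies) for perfect SSRs in a DNA sequence."""
    seq = seq.upper()
    hits = []
    for unit_len, min_copies in sorted(min_repeats.items()):
        # One captured motif followed by at least (min_copies - 1) copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_copies - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip motifs that are themselves repeats of a shorter motif
            # (e.g. "ATAT" is reported as the dinucleotide "AT" instead).
            if any(unit == unit[:k] * (unit_len // k)
                   for k in range(1, unit_len) if unit_len % k == 0):
                continue
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return sorted(hits)


if __name__ == "__main__":
    toy = "GGC" + "AT" * 7 + "CCG"
    print(find_ssrs(toy))  # one dinucleotide repeat starting at position 3
```

Dedicated SSR finders also handle compound and imperfect repeats, which this sketch deliberately ignores.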
Influence of flanking sequences on variability in expression levels of an introduced gene in transgenic tobacco plants.

PubMed Central

Dean, C; Jones, J; Favreau, M; Dunsmuir, P; Bedbrook, J

1988-01-01

The petunia rbcS gene SSU301 was introduced into tobacco using Agrobacterium tumefaciens-mediated transformation. The time at which rbcS expression was maximal after transfer of the tobacco plants to the greenhouse was determined. The expression level of the SSU301 gene varied up to 9-fold between individual tobacco plants which had been standardized physiologically as much as possible. The presence of adjacent pUC plasmid sequences did not affect the expression of the SSU301 gene. In an attempt to reduce the between-transformant variability in expression, the SSU301 gene was introduced into tobacco surrounded by 10 kb of 5' and 13 kb of 3' DNA sequences which normally flank SSU301 in petunia. The longer flanking regions did not reduce the between-transformant variability of SSU301 gene expression. PMID:3174450
Influence of Bacillus spp. strains on seedling growth and physiological parameters of sorghum under moisture stress conditions.

PubMed

Grover, Minakshi; Madhubala, R; Ali, Sk Z; Yadav, S K; Venkateswarlu, B

2014-09-01

Microorganisms isolated from stressed ecosystems may prove ideal candidates for development of bio-inoculants for stressed agricultural production systems. In the present study, moisture stress tolerant rhizobacteria were isolated from the rhizosphere of sorghum, pigeonpea, and cowpea grown under semiarid conditions in India. Four isolates, KB122, KB129, KB133, and KB142, from sorghum rhizosphere exhibited plant growth promoting traits and tolerance to salinity, high temperature, and moisture stress. These isolates were identified as Bacillus spp. by 16S rDNA sequence analysis. The strains were evaluated for growth promotion of sorghum seedlings under two different moisture stress conditions (set-I, continuous 50% soil water holding capacity (WHC) throughout the experiment and set-II, 75% soil WHC for 27 days followed by no irrigation for 5 days) under greenhouse conditions. Plate count and scanning electron microscope studies indicated successful root surface colonization by inoculated bacteria. Plants inoculated with Bacillus spp. strains showed better growth in terms of shoot length and root biomass, with dark greenish leaves due to high chlorophyll content, while un-inoculated plants showed rolling of the leaves, stunted appearance, and wilting under both stress conditions. Inoculation also improved leaf relative water content and soil moisture content. However, variation in proline and sugar content in the different treatments under the two stress conditions indicated a differential effect of microbial treatments on plant physiological parameters under stress conditions.
Novel Compound Heterozygous CLCNKB Gene Mutations (c.1755A>G/ c.848_850delTCT) Cause Classic Bartter Syndrome.

PubMed

Wang, Chunli; Chen, Ying; Zheng, Bixia; Zhu, Mengshu; Fan, Jia; Wang, Juejin; Jia, Zhanjun; Huang, Songming; Zhang, Aihua

2018-02-14

Inactivated variants in the CLCNKB gene encoding the basolateral chloride channel ClC-Kb cause classic Bartter syndrome (cBS), characterized by hypokalemic metabolic alkalosis and hyperreninemic hyperaldosteronism. Here we identified two cBS siblings presenting hypokalemia in a Chinese family due to novel compound heterozygous CLCNKB mutations (c.848_850delTCT/c.1755A>G). Compound heterozygosity was confirmed by amplifying and sequencing the patient's genomic DNA. The synonymous mutation c.1755A>G (Thr585Thr) was located at +2 bp from the 5' splice donor site in exon 15; further transcript analysis demonstrated that this single nucleotide mutation causes exclusion of exon 15 in the cDNA from the proband and his mother. Furthermore, we investigated the changes in expression and protein trafficking caused by the c.848_850delTCT (TCT) and exon 15 deletion (E15) mutations in vitro. The E15 mutation markedly decreased the expression of ClC-Kb and resulted in a low-molecular-weight band (~55 kD) trapped in the endoplasmic reticulum, while the TCT mutant only decreased the total and plasma membrane ClC-Kb protein expression but did not affect the subcellular localization. Finally, we studied the physiological functions of the mutations by using whole-cell patch clamp and found that the E15 or TCT mutation decreased the current of the ClC-Kb/barttin channel. These results suggested that the compound defective mutations of the CLCNKB gene are the molecular mechanism of the disease in the two cBS siblings.
piggyBac transposons expressing full-length human dystrophin enable genetic correction of dystrophic mesoangioblasts.

PubMed

Loperfido, Mariana; Jarmin, Susan; Dastidar, Sumitava; Di Matteo, Mario; Perini, Ilaria; Moore, Marc; Nair, Nisha; Samara-Kuko, Ermira; Athanasopoulos, Takis; Tedesco, Francesco Saverio; Dickson, George; Sampaolesi, Maurilio; VandenDriessche, Thierry; Chuah, Marinee K

2016-01-29

Duchenne muscular dystrophy (DMD) is a genetic neuromuscular disorder caused by the absence of dystrophin. We developed a novel gene therapy approach based on the use of the piggyBac (PB) transposon system to deliver the coding DNA sequence (CDS) of either full-length human dystrophin (DYS: 11.1 kb) or truncated microdystrophins (MD1: 3.6 kb; MD2: 4 kb). PB transposons encoding microdystrophins were transfected in C2C12 myoblasts, yielding 65±2% MD1 and 66±2% MD2 expression in differentiated multinucleated myotubes. A hyperactive PB (hyPB) transposase was then deployed to enable transposition of the large-size PB transposon (17 kb) encoding the full-length DYS and green fluorescence protein (GFP). Stable GFP expression attaining 78±3% could be achieved in the C2C12 myoblasts that had undergone transposition. Western blot analysis demonstrated expression of the full-length human DYS protein in myotubes. Subsequently, dystrophic mesoangioblasts from a Golden Retriever muscular dystrophy dog were transfected with the large-size PB transposon resulting in 50±5% GFP-expressing cells after stable transposition. This was consistent with correction of the differentiated dystrophic mesoangioblasts following expression of full-length human DYS. These results pave the way toward a novel non-viral gene therapy approach for DMD using PB transposons underscoring their potential to deliver large therapeutic genes. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
Dissemination of NDM-1-Producing Enterobacteriaceae Mediated by the IncX3-Type Plasmid

PubMed Central

Fu, Ying; Du, Xiaoxing; Shen, Yuqin; Yu, Yunsong

2015-01-01

The emergence and spread of NDM-1-producing Enterobacteriaceae have resulted in a worldwide public health risk that has affected some provinces of China. China is an exceptionally large country, and there is a crucial need to investigate the epidemic of bla NDM-1-positive Enterobacteriaceae in our province. A total of 186 carbapenem-resistant Enterobacteriaceae isolates (CRE) were collected in a grade-3 hospital in Zhejiang province. Carbapenem-resistant genes, including bla KPC, bla IMP, bla VIM, bla OXA-48 and bla NDM-1, were screened and sequenced. Ninety isolates were identified as harboring the bla KPC-2 gene, and five bla NDM-1-positive isolates were uncovered. XbaI-PFGE revealed that three bla NDM-1-positive K. pneumoniae isolates belonged to two different clones. S1-PFGE and Southern blot suggested that the bla NDM-1 genes were located on IncX3-type plasmids with two different sizes ranging from 33.3 to 54.7 kb (n=4) and 104.5 to 138.9 kb (n=1), respectively, all of which could easily transfer to Escherichia coli by conjugation and electrotransformation. The high-throughput sequencing of two plasmids was performed, leading to the identification of a smaller 54-kb plasmid, which had high sequence similarity with a previously reported pCFNDM-CN, and a larger plasmid in which only a 7.8-kb sequence of a common gene environment around bla NDM-1 (bla NDM-1-trpF-dsbC-cutA1-groEL-ΔInsE) was detected. PCR mapping and sequencing demonstrated that four smaller bla NDM-1 plasmids contained a common gene environment around bla NDM-1 (IS5-bla NDM-1-trpF-dsbC-cutA1-groEL). We monitored the CRE epidemic in our hospital and determined that KPC-2 carbapenemase was a major risk to patient health and the IncX3-type plasmid played a vital role in the spread of the bla NDM-1 gene among the CRE. PMID:26047502
The 253-kb inversion and deep intronic mutations in UNC13D are present in North American patients with familial hemophagocytic lymphohistiocytosis 3.

PubMed

Qian, Yaping; Johnson, Judith A; Connor, Jessica A; Valencia, C Alexander; Barasa, Nathaniel; Schubert, Jeffery; Husami, Ammar; Kissell, Diane; Zhang, Ge; Weirauch, Matthew T; Filipovich, Alexandra H; Zhang, Kejian

2014-06-01

The mutations in UNC13D are responsible for familial hemophagocytic lymphohistiocytosis (FHL) type 3. A 253-kb inversion and two deep intronic mutations, c.118-308C > T and c.118-307G > A, in UNC13D were recently reported in European and Asian FHL3 patients. We sought to determine the prevalence of these three non-coding mutations in North American FHL patients and evaluate the significance of examining these new mutations in genetic testing. We performed DNA sequencing of UNC13D and targeted analysis of these three mutations in 1,709 North American patients with a suspected clinical diagnosis of hemophagocytic lymphohistiocytosis (HLH). The 253-kb inversion, intronic mutations c.118-308C > T and c.118-307G > A were found in 11, 15, and 4 patients, respectively, in which the genetic basis (bi-allelic mutations) explained 25 additional patients. Taken together with previously diagnosed FHL3 patients in our HLH patient registry, these three non-coding mutations were found in 31.6% (25/79) of the FHL3 patients. The 253-kb inversion, c.118-308C > T and c.118-307G > A accounted for 7.0%, 8.9%, and 1.3% of mutant alleles, respectively. Significantly, eight novel mutations in UNC13D are being reported in this study. To further evaluate the expression level of the newly reported intronic mutation c.118-307G > A, reverse transcription PCR and Western blot analysis revealed a significant reduction of both RNA and protein levels suggesting that the c.118-307G > A mutation affects transcription. These specified non-coding mutations were found in a significant number of North American patients and inclusion of them in mutation analysis will improve the molecular diagnosis of FHL3. © 2014 Wiley Periodicals, Inc.
'DNA Strider': a 'C' program for the fast analysis of DNA and protein sequences on the Apple Macintosh family of computers.

PubMed Central

Marck, C

1988-01-01

DNA Strider is a new integrated DNA and Protein sequence analysis program written in the C language for the Macintosh Plus, SE and II computers. It has been designed as an easy to learn and use program as well as a fast and efficient tool for the day-to-day sequence analysis work. The program consists of a multi-window sequence editor and of various DNA and Protein analysis functions. The editor may use 4 different types of sequences (DNA, degenerate DNA, RNA and one-letter coded protein) and can handle simultaneously 6 sequences of any type up to 32.5 kB each. Negative numbering of the bases is allowed for DNA sequences. All classical restriction and translation analysis functions are present and can be performed in any order on any open sequence or part of a sequence. The main feature of the program is that the same analysis function can be repeated several times on different sequences, thus generating multiple windows on the screen. Many graphic capabilities have been incorporated, such as a graphic restriction map, a hydrophobicity profile and the CAI plot (codon adaptation index according to Sharp and Li). The restriction sites search uses a newly designed fast hexamer look-ahead algorithm. Typical runtime for the search of all sites with a library of 130 restriction endonucleases is 1 second per 10,000 bases. The circular graphic restriction map of the pBR322 plasmid can therefore be computed from its sequence and displayed on the Macintosh Plus screen within 2 seconds and its multiline restriction map obtained in a scrolling window within 5 seconds. PMID:2832831
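The abstract above names a hexamer look-ahead algorithm but does not describe it, so the following Python sketch does not attempt to reproduce it. It only illustrates the underlying task, a plain substring scan for recognition sites on a linear or circular sequence. The function name `find_sites` and the three-enzyme table are illustrative assumptions, not part of DNA Strider.

```python
# Naive restriction-site scan (NOT the hexamer look-ahead algorithm of
# DNA Strider, which the abstract does not describe).
# Small illustrative table of well-known recognition sequences.
ENZYMES = {"EcoRI": "GAATTC", "BamHI": "GGATCC", "HindIII": "AAGCTT"}


def find_sites(sequence, enzymes=ENZYMES, circular=False):
    """Return {enzyme: [1-based start positions]} for each recognition site."""
    seq = sequence.upper()
    hits = {}
    for name, site in enzymes.items():
        k = len(site)
        # For a circular molecule, append the first k-1 bases so that
        # sites spanning the origin are also found.
        text = seq + seq[: k - 1] if circular else seq
        positions = []
        start = text.find(site)
        while start != -1:
            positions.append(start + 1)  # report 1-based coordinates
            start = text.find(site, start + 1)
        hits[name] = positions
    return hits


if __name__ == "__main__":
    print(find_sites("AAGAATTCGGATCCTT"))
```

A scan like this does one pass per enzyme; a look-ahead scheme would presumably avoid that repeated work, which matters for a 130-enzyme library.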
Enhancing genome assemblies by integrating non-sequence based data

PubMed Central

2011-01-01

Introduction: Many genome projects were underway before the advent of high-throughput sequencing and have thus been supported by a wealth of genome information from other technologies. Such information frequently takes the form of linkage and physical maps, both of which can provide a substantial amount of data useful in de novo sequencing projects. Furthermore, the recent abundance of genome resources enables the use of conserved synteny maps identified in related species to further enhance genome assemblies. Methods: The tammar wallaby (Macropus eugenii) is a model marsupial mammal with a low coverage genome. However, we have access to extensive comparative maps containing over 14,000 markers constructed through the physical mapping of conserved loci, chromosome painting and comprehensive linkage maps. Using a custom Bioperl pipeline, information from the maps was aligned to assembled tammar wallaby contigs using BLAT. This data was used to construct pseudo paired-end libraries with intervals ranging from 5-10 MB. We then used Bambus (a program designed to scaffold eukaryotic genomes by ordering and orienting contigs through the use of paired-end data) to scaffold our libraries. To determine how map data compares to sequence based approaches to enhance assemblies, we repeated the experiment using a 0.5× coverage of unique reads from 4 KB and 8 KB Illumina paired-end libraries. Finally, we combined both the sequence and non-sequence-based data to determine how a combined approach could further enhance the quality of the low coverage de novo reconstruction of the tammar wallaby genome. Results: Using the map data alone, we were able to order 2.2% of the initial contigs into scaffolds, and increase the N50 scaffold size to 39 KB (36 KB in the original assembly). Using only the 0.5× paired-end sequence based data, 53% of the initial contigs were assigned to scaffolds. Combining both data sets resulted in a further 2% increase in the number of initial contigs integrated into a scaffold (55% total) but a 35% increase in N50 scaffold size over the use of sequence-based data alone. Conclusions: We provide a relatively simple pipeline utilizing existing bioinformatics tools to integrate map data into a genome assembly which is available at http://www.mcb.uconn.edu/fac.php?name=paska. While the map data only contributed minimally to assigning the initial contigs to scaffolds in the new assembly, it greatly increased the N50 size. This process added structure to our low coverage assembly, greatly increasing its utility in further analyses. PMID:21554765
Enhancing genome assemblies by integrating non-sequence based data.

PubMed

Heider, Thomas N; Lindsay, James; Wang, Chenwei; O'Neill, Rachel J; Pask, Andrew J

2011-05-28

Many genome projects were underway before the advent of high-throughput sequencing and have thus been supported by a wealth of genome information from other technologies. Such information frequently takes the form of linkage and physical maps, both of which can provide a substantial amount of data useful in de novo sequencing projects. Furthermore, the recent abundance of genome resources enables the use of conserved synteny maps identified in related species to further enhance genome assemblies. The tammar wallaby (Macropus eugenii) is a model marsupial mammal with a low coverage genome. However, we have access to extensive comparative maps containing over 14,000 markers constructed through the physical mapping of conserved loci, chromosome painting and comprehensive linkage maps. Using a custom Bioperl pipeline, information from the maps was aligned to assembled tammar wallaby contigs using BLAT. This data was used to construct pseudo paired-end libraries with intervals ranging from 5-10 MB. We then used Bambus (a program designed to scaffold eukaryotic genomes by ordering and orienting contigs through the use of paired-end data) to scaffold our libraries. To determine how map data compares to sequence based approaches to enhance assemblies, we repeated the experiment using a 0.5× coverage of unique reads from 4 KB and 8 KB Illumina paired-end libraries. Finally, we combined both the sequence and non-sequence-based data to determine how a combined approach could further enhance the quality of the low coverage de novo reconstruction of the tammar wallaby genome. Using the map data alone, we were able to order 2.2% of the initial contigs into scaffolds, and increase the N50 scaffold size to 39 KB (36 KB in the original assembly). Using only the 0.5× paired-end sequence based data, 53% of the initial contigs were assigned to scaffolds. Combining both data sets resulted in a further 2% increase in the number of initial contigs integrated into a scaffold (55% total) but a 35% increase in N50 scaffold size over the use of sequence-based data alone. We provide a relatively simple pipeline utilizing existing bioinformatics tools to integrate map data into a genome assembly which is available at http://www.mcb.uconn.edu/fac.php?name=paska. While the map data only contributed minimally to assigning the initial contigs to scaffolds in the new assembly, it greatly increased the N50 size. This process added structure to our low coverage assembly, greatly increasing its utility in further analyses.
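The record above reports its results as N50 scaffold sizes. As a reminder of what that statistic measures, here is a minimal Python sketch of the usual definition: the length L such that scaffolds of length L or longer together cover at least half of the total assembled length. The function name `n50` and the example lengths are illustrative and are not taken from the authors' pipeline.

```python
def n50(lengths):
    """Return the N50 of a collection of contig or scaffold lengths.

    N50 is the length L such that pieces of length >= L together
    account for at least half of the summed length of all pieces.
    """
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("N50 is undefined for an empty assembly")
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


if __name__ == "__main__":
    # Hypothetical scaffold lengths in kb, for illustration only.
    print(n50([80, 70, 50, 40, 30, 20]))
```

Because N50 depends on how length is distributed rather than on how many contigs are placed, joining a few long scaffolds can raise it sharply even when the fraction of contigs assigned to scaffolds barely changes, which is the pattern the abstract describes.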
Strong Signature of Natural Selection within an FHIT Intron Implicated in Prostate Cancer Risk

PubMed Central

Ding, Yan; Larson, Garrett; Rivas, Guillermo; Lundberg, Cathryn; Geller, Louis; Ouyang, Ching; Weitzel, Jeffrey; Archambeau, John; Slater, Jerry; Daly, Mary B.; Benson, Al B.; Kirkwood, John M.; O'Dwyer, Peter J.; Sutphen, Rebecca; Stewart, James A.; Johnson, David; Nordborg, Magnus; Krontiris, Theodore G.

2008-01-01

Previously, a candidate gene linkage approach on brother pairs affected with prostate cancer identified a locus of prostate cancer susceptibility at D3S1234 within the fragile histidine triad gene (FHIT), a tumor suppressor that induces apoptosis. Subsequent association tests on 16 SNPs spanning approximately 381 kb surrounding D3S1234 in Americans of European descent revealed significant evidence of association for a single SNP within intron 5 of FHIT. In the current study, re-sequencing and genotyping within a 28.5 kb region surrounding this SNP further delineated the association with prostate cancer risk to a 15 kb region. Multiple SNPs in sequences under evolutionary constraint within intron 5 of FHIT defined several related haplotypes with an increased risk of prostate cancer in European-Americans. Strong associations were detected for a risk haplotype defined by SNPs 138543, 142413, and 152494 in all cases (Pearson's χ2 = 12.34, df 1, P = 0.00045) and for the homozygous risk haplotype defined by SNPs 144716, 142413, and 148444 in cases that shared 2 alleles identical by descent with their affected brothers (Pearson's χ2 = 11.50, df 1, P = 0.00070). In addition to highly conserved sequences encompassing SNPs 148444 and 152413, population studies revealed strong signatures of natural selection for a 1 kb window covering the SNP 144716 in two human populations, the European American (π = 0.0072, Tajima's D = 3.31, 14 SNPs) and the Japanese (π = 0.0049, Fay & Wu's H = 8.05, 14 SNPs), as well as in chimpanzees (Fay & Wu's H = 8.62, 12 SNPs). These results strongly support the involvement of the FHIT intronic region in an increased risk of prostate cancer. PMID:18953408
Identification of Novel Pathogenicity Loci in Clostridium perfringens Strains That Cause Avian Necrotic Enteritis

PubMed Central

Parreira, Valeria R.; Marri, Pradeep R.; Rosey, Everett L.; Gong, Joshua; Songer, J. Glenn; Vedantam, Gayatri; Prescott, John F.

2010-01-01

Type A Clostridium perfringens causes poultry necrotic enteritis (NE), an enteric disease of considerable economic importance, yet can also exist as a member of the normal intestinal microbiota. A recently discovered pore-forming toxin, NetB, is associated with pathogenesis in most, but not all, NE isolates. This finding suggested that NE-causing strains may possess other virulence gene(s) not present in commensal type A isolates. We used high-throughput sequencing (HTS) technologies to generate draft genome sequences of seven unrelated C. perfringens poultry NE isolates and one isolate from a healthy bird, and identified additional novel NE-associated genes by comparison with nine publicly available reference genomes. Thirty-one open reading frames (ORFs) were unique to all NE strains and formed the basis for three highly conserved NE-associated loci that we designated NELoc-1 (42 kb), NELoc-2 (11.2 kb) and NELoc-3 (5.6 kb). The largest locus, NELoc-1, consisted of netB and 36 additional genes, including those predicted to encode two leukocidins, an internalin-like protein and a ricin-domain protein. Pulsed-field gel electrophoresis (PFGE) and Southern blotting revealed that the NE strains each carried 2 to 5 large plasmids, and that NELoc-1 and -3 were localized on distinct plasmids of sizes ∼85 and ∼70 kb, respectively. Sequencing of the regions flanking these loci revealed similarity to previously characterized conjugative plasmids of C. perfringens. These results provide significant insight into the pathogenetic basis of poultry NE and are the first to demonstrate that netB resides in a large, plasmid-encoded locus. Our findings strongly suggest that poultry NE is caused by several novel virulence factors, whose genes are clustered on discrete pathogenicity loci, some of which are plasmid-borne. PMID:20532244
Identification of novel pathogenicity loci in Clostridium perfringens strains that cause avian necrotic enteritis.

PubMed

Lepp, Dion; Roxas, Bryan; Parreira, Valeria R; Marri, Pradeep R; Rosey, Everett L; Gong, Joshua; Songer, J Glenn; Vedantam, Gayatri; Prescott, John F

2010-05-24

Type A Clostridium perfringens causes poultry necrotic enteritis (NE), an enteric disease of considerable economic importance, yet can also exist as a member of the normal intestinal microbiota. A recently discovered pore-forming toxin, NetB, is associated with pathogenesis in most, but not all, NE isolates. This finding suggested that NE-causing strains may possess other virulence gene(s) not present in commensal type A isolates. We used high-throughput sequencing (HTS) technologies to generate draft genome sequences of seven unrelated C. perfringens poultry NE isolates and one isolate from a healthy bird, and identified additional novel NE-associated genes by comparison with nine publicly available reference genomes. Thirty-one open reading frames (ORFs) were unique to all NE strains and formed the basis for three highly conserved NE-associated loci that we designated NELoc-1 (42 kb), NELoc-2 (11.2 kb) and NELoc-3 (5.6 kb). The largest locus, NELoc-1, consisted of netB and 36 additional genes, including those predicted to encode two leukocidins, an internalin-like protein and a ricin-domain protein. Pulsed-field gel electrophoresis (PFGE) and Southern blotting revealed that the NE strains each carried 2 to 5 large plasmids, and that NELoc-1 and -3 were localized on distinct plasmids of sizes approximately 85 and approximately 70 kb, respectively. Sequencing of the regions flanking these loci revealed similarity to previously characterized conjugative plasmids of C. perfringens. These results provide significant insight into the pathogenetic basis of poultry NE and are the first to demonstrate that netB resides in a large, plasmid-encoded locus. Our findings strongly suggest that poultry NE is caused by several novel virulence factors, whose genes are clustered on discrete pathogenicity loci, some of which are plasmid-borne.
HMG-CoA lyase (HL) gene: Cloning and characterization of the 5′ end of the mouse gene, gene targeting in ES cells, and demonstration of large deletions in three HL-deficient patients

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wang, S.; Robert, M.F.; Mitchell, G.A.

1994-09-01

3-hydroxy-3-methylglutaryl CoA lyase (HL) is a mitochondrial matrix enzyme which catalyzes the last step of leucine catabolism and of ketogenesis. Autosomal recessive HL deficiency in humans results in episodes of hypoglycemia and coma. We are interested in the pathophysiology of HL deficiency as a model for both amino acid and fatty acid inborn errors. We have cloned the human and mouse HL genes. In order to analyze the 5′ nontranslated region of mouse HL gene, we cloned and sequenced a 1.8 kb fragment containing the 5′ extremity including exon 1 and about 1.6 kb of 5′ nontranslated sequence. The region surrounding exon 1 is CpG-rich (66.4%). Using the criteria of West, the Observed/Expected ratio for CpG dinucleotides is 0.7 (≥0.6 is consistent with a CpG island). We are carrying out primer extension and RNase protection experiments to determine the transcription initiation site. We constructed a gene targeting vector by introducing the neomycin resistance gene into exon 2 of a 7.5 kb genomic subclone of the mouse HL gene. Targeting was performed by electroporating 10 mg linearized vector into 10⁷ ES cells and selecting for 12 days with G418. 5/228 colonies (2.2%) had homologous recombination as shown by PCR screening and Southern analysis. We are microinjecting the 5 targeted clones into blastocysts to create an HL-deficient mouse. To date we have obtained two chimeras with contributions of 95% and 55% from 129, by coat color estimates. Three of 27 (11%) of the HL-deficient patients studied were suggested by genomic Southern analysis to be homozygous for large intragenic deletions. We confirmed this and defined the boundaries using exonic PCR.
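The abstract above quotes an Observed/Expected CpG ratio without giving the formula. The Python sketch below assumes the commonly used definition, (number of CpG × sequence length) / (number of C × number of G); whether this matches the "criteria of West" cited in the abstract is an assumption. It also reports G+C content, on the further assumption that the 66.4% figure refers to G+C content. The function name `cpg_stats` is illustrative.

```python
def cpg_stats(sequence):
    """Return (gc_fraction, cpg_observed_over_expected) for a DNA string.

    Assumed formula for the ratio:
        O/E = (count of "CG" * N) / (count of "C" * count of "G")
    where N is the sequence length.
    """
    seq = sequence.upper()
    n = len(seq)
    if n == 0:
        raise ValueError("empty sequence")
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n
    # With no C or no G the expected count is zero; report 0.0 by convention.
    ratio = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, ratio


if __name__ == "__main__":
    gc, oe = cpg_stats("ACGTCGCGGGCCATCG")
    print(f"G+C = {gc:.1%}, CpG O/E = {oe:.2f}")
```

Under this definition, a ratio of 0.7 means the region carries about 70% of the CpG dinucleotides expected from its base composition alone.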
Complete Genome Sequence and Comparative Analysis of the Fish Pathogen Lactococcus garvieae

PubMed Central

Oshima, Kenshiro; Yoshizaki, Mariko; Kawanishi, Michiko; Nakaya, Kohei; Suzuki, Takehito; Miyauchi, Eiji; Ishii, Yasuo; Tanabe, Soichi; Murakami, Masaru; Hattori, Masahira

2011-01-01

Lactococcus garvieae causes fatal haemorrhagic septicaemia in fish such as yellowtail. The comparative analysis of genomes of a virulent strain Lg2 and a non-virulent strain ATCC 49156 of L. garvieae revealed that the two strains shared a high degree of sequence identity, but Lg2 had a 16.5-kb capsule gene cluster that is absent in ATCC 49156. The capsule gene cluster was composed of 15 genes, of which eight genes are highly conserved with those in exopolysaccharide biosynthesis gene cluster often found in Lactococcus lactis strains. Sequence analysis of the capsule gene cluster in the less virulent strain L. garvieae Lg2-S, Lg2-derived strain, showed that two conserved genes were disrupted by a single base pair deletion, respectively. These results strongly suggest that the capsule is crucial for virulence of Lg2. The capsule gene cluster of Lg2 may be a genomic island from several features such as the presence of insertion sequences flanked on both ends, different GC content from the chromosomal average, integration into the locus syntenic to other lactococcal genome sequences, and distribution in human gut microbiomes. The analysis also predicted other potential virulence factors such as haemolysin. The present study provides new insights into understanding of the virulence mechanisms of L. garvieae in fish. PMID:21829716
Two open reading frames (ORF1 and ORF2) within the 2.0-kilobase latency-associated transcript of herpes simplex virus type 1 are not essential for reactivation from latency.

PubMed Central

Fareed, M U; Spivack, J G

1994-01-01

The herpes simplex virus type 1 (HSV-1) latency-associated transcripts (LATs) are dispensable for establishment and maintenance of latent infection. However, the LATs have been implicated in reactivation of the virus from its latent state. Since the reported LAT deletion and/or insertion variants that are reactivation impaired contain deletions in the putative LAT promoter, it is not known which LAT sequences are involved in reactivation. To examine the role of the 2.0-kb LAT in the process of reactivation and the functional importance of the putative open reading frames (ORF1 and ORF2) contained within the 2.0-kb LAT, we have constructed an HSV-1 variant that contains a precise deletion and insertion within the LAT-specific DNA sequences using site-directed mutagenesis. The HSV-1 variant FS1001K contains an 1,186-bp deletion starting precisely from the 5' end of the 2.0-kb LAT and, for identification, a XbaI restriction endonuclease site insertion. The FS1001K genome contains no other deletions and/or insertions as analyzed by a variety of restriction endonucleases. The deletion in FS1001K removes the entire 556-bp intron within the 2.0-kb LAT, the first 229 nucleotides of ORF1, and the first 159 nucleotides of ORF2 without having an effect on the RL2 (ICP0) gene. Explant cocultivation reactivation assays indicated that this deletion had a minimal effect on reactivation of the variant FS1001K compared with the parental wild-type virus using a mouse eye model. As expected, Northern (RNA) blot analyses have shown that the variant virus (FS1001K) does not produce the 2.0-kb LAT or the 1.45- to 1.5-kb LAT either in vitro or in vivo; however, FS1001K produces an intact RL2 transcript in tissue culture. These data suggest that the 2.0-kb LAT putative ORF1 and ORF2 (or the first 1,186 bp of the 2.0-kb LAT) are dispensable for explant reactivation of latent HSV-1. PMID:7966597
Simultaneous non-contiguous deletions using large synthetic DNA and site-specific recombinases

PubMed Central

Krishnakumar, Radha; Grose, Carissa; Haft, Daniel H.; Zaveri, Jayshree; Alperovich, Nina; Gibson, Daniel G.; Merryman, Chuck; Glass, John I.

2014-01-01

Toward achieving rapid and large scale genome modification directly in a target organism, we have developed a new genome engineering strategy that uses a combination of bioinformatics aided design, large synthetic DNA and site-specific recombinases. Using Cre recombinase we swapped a target 126-kb segment of the Escherichia coli genome with a 72-kb synthetic DNA cassette, thereby effectively eliminating over 54 kb of genomic DNA from three non-contiguous regions in a single recombination event. We observed complete replacement of the native sequence with the modified synthetic sequence through the action of the Cre recombinase and no competition from homologous recombination. Because of the versatility and high-efficiency of the Cre-lox system, this method can be used in any organism where this system is functional as well as adapted to use with other highly precise genome engineering systems. Compared to present-day iterative approaches in genome engineering, we anticipate this method will greatly speed up the creation of reduced, modularized and optimized genomes through the integration of deletion analyses data, transcriptomics, synthetic biology and site-specific recombination. PMID:24914053

pLS101 plasmid vector

DOEpatents

Lacks, S.A.; Balganesh, T.S.

1985-02-19

Disclosed is recombinant plasmid pLS101, consisting essentially of a 2.0 Kb malM gene fragment ligated to a 4.4 Kb Tcr DNA fragment, which is particularly useful for transforming Gram-positive bacteria. This plasmid contains at least four restriction sites suitable for inserting exogenous gene sequences. Also disclosed is a method for plasmid isolation by penicillin selection, as well as processes for enrichment of recombinant plasmids in Gram-positive bacterial systems. 5 figs., 2 tabs.
Comparative genomic and morphological analyses of Listeria phages isolated from farm environments.

PubMed

Denes, Thomas; Vongkamjan, Kitiya; Ackermann, Hans-Wolfgang; Moreno Switt, Andrea I; Wiedmann, Martin; den Bakker, Henk C

2014-08-01

The genus Listeria is ubiquitous in the environment and includes the globally important food-borne pathogen Listeria monocytogenes. While the genomic diversity of Listeria has been well studied, considerably less is known about the genomic and morphological diversity of Listeria bacteriophages. In this study, we sequenced and analyzed the genomes of 14 Listeria phages isolated mostly from New York dairy farm environments as well as one related Enterococcus faecalis phage to obtain information on genome characteristics and diversity. We also examined 12 of the phages by electron microscopy to characterize their morphology. These Listeria phages, based on gene orthology and morphology, together with previously sequenced Listeria phages could be classified into five orthoclusters, including one novel orthocluster. One orthocluster (orthocluster I) consists of large genome (~135-kb) myoviruses belonging to the genus “Twort-like viruses,” three orthoclusters (orthoclusters II to IV) contain small-genome (36- to 43-kb) siphoviruses with icosahedral heads, and the novel orthocluster V contains medium-sized-genome (~66-kb) siphoviruses with elongated heads. A novel orthocluster (orthocluster VI) of E. faecalis phages, with medium-sized genomes (~56 kb), was identified, which grouped together and shares morphological features with the novel Listeria phage orthocluster V. This new group of phages (i.e., orthoclusters V and VI) is composed of putative lytic phages that may prove to be useful in phage-based applications for biocontrol, detection, and therapeutic purposes.
Detection of two fungal biocontrol agents against root-knot nematodes by RAPD markers.

PubMed

Zhu, Ming Liang; Mo, Ming He; Xia, Zhen Yuan; Li, Yun Hua; Yang, Shu Jun; Li, Tian Fei; Zhang, Ke Qin

2006-05-01

The strain ZK7 of Pochonia chlamydosporia var. chlamydosporia and IPC of Paecilomyces lilacinus are highly effective in the biological control of root-knot nematodes infecting tobacco. When applied, they require a specific monitoring method to evaluate their colonization and dispersal in soil. In this work, the randomly amplified polymorphic DNA (RAPD) technique was used to differentiate between the two individual strains and 95 other isolates, including isolates of the same species and common soil fungi. This approach allowed the selection of fragments of 1.2 kb (Vc1200) and 2.0 kb (Vc2000) specific for ZK7, and of 1.4 kb (P1400) and 0.85 kb (P850) specific for IPC, using the random primers OPL-02, OPD-05, OPD-05 and OPC-11, respectively. These fragments were cloned, sequenced, and used to design sequence-characterized amplification region (SCAR) primers specific for the two strains. In classical polymerase chain reaction (PCR), with serial dilution of ZK7 and IPC pure culture template DNAs, the detection limits of these oligonucleotide SCAR-PCR primers were found to be 10, 1000, 500, 100 pg, respectively. In the dot blotting, digoxigenin (DIG)-labeled amplicons from these four primers specifically recognized the corresponding fragments in the template DNAs of these two strains. The detection limits of these amplicons were 0.2, 0.2, 0.5, 0.5 μg, respectively.
Genome Calligrapher: A Web Tool for Refactoring Bacterial Genome Sequences for de Novo DNA Synthesis.

PubMed

Christen, Matthias; Deutsch, Samuel; Christen, Beat

2015-08-21

Recent advances in synthetic biology have resulted in an increasing demand for the de novo synthesis of large-scale DNA constructs. Any process improvement that enables fast and cost-effective streamlining of digitized genetic information into fabricable DNA sequences holds great promise to study, mine, and engineer genomes. Here, we present Genome Calligrapher, a computer-aided design web tool intended for whole genome refactoring of bacterial chromosomes for de novo DNA synthesis. By applying a neutral recoding algorithm, Genome Calligrapher optimizes GC content and removes obstructive DNA features known to interfere with the synthesis of double-stranded DNA and the higher order assembly into large DNA constructs. Subsequent bioinformatics analysis revealed that synthesis constraints are prevalent among bacterial genomes. However, a low level of codon replacement is sufficient for refactoring bacterial genomes into easy-to-synthesize DNA sequences. To test the algorithm, 168 kb of synthetic DNA comprising approximately 20 percent of the synthetic essential genome of the cell-cycle bacterium Caulobacter crescentus was streamlined and then ordered from a commercial supplier of low-cost de novo DNA synthesis. The successful assembly into eight 20 kb segments indicates that Genome Calligrapher algorithm can be efficiently used to refactor difficult-to-synthesize DNA. Genome Calligrapher is broadly applicable to recode biosynthetic pathways, DNA sequences, and whole bacterial genomes, thus offering new opportunities to use synthetic biology tools to explore the functionality of microbial diversity. The Genome Calligrapher web tool can be accessed at https://christenlab.ethz.ch/GenomeCalligrapher.
High Quality Maize Centromere 10 Sequence Reveals Evidence of Frequent Recombination Events

PubMed Central

Wolfgruber, Thomas K.; Nakashima, Megan M.; Schneider, Kevin L.; Sharma, Anupma; Xie, Zidian; Albert, Patrice S.; Xu, Ronghui; Bilinski, Paul; Dawe, R. Kelly; Ross-Ibarra, Jeffrey; Birchler, James A.; Presting, Gernot G.

2016-01-01

The ancestral centromeres of maize contain long stretches of the tandemly arranged CentC repeat. The abundance of tandem DNA repeats and centromeric retrotransposons (CR) has presented a significant challenge to completely assembling centromeres using traditional sequencing methods. Here, we report a nearly complete assembly of the 1.85 Mb maize centromere 10 from inbred B73 using PacBio technology and BACs from the reference genome project. The error rates estimated from overlapping BAC sequences are 7 × 10⁻⁶ and 5 × 10⁻⁵ for mismatches and indels, respectively. The number of gaps in the region covered by the reassembly was reduced from 140 in the reference genome to three. Three expressed genes are located between 92 and 477 kb from the inferred ancestral CentC cluster, which lies within the region of highest centromeric repeat density. The improved assembly increased the count of full-length CR from 5 to 55 and revealed a 22.7 kb segmental duplication that occurred approximately 121,000 years ago. Our analysis provides evidence of frequent recombination events in the form of partial retrotransposons, deletions within retrotransposons, chimeric retrotransposons, segmental duplications including higher order CentC repeats, a deleted CentC monomer, centromere-proximal inversions, and insertion of mitochondrial sequences. Double-strand DNA break (DSB) repair is the most plausible mechanism for these events and may be the major driver of centromere repeat evolution and diversity. In many cases examined here, DSB repair appears to be mediated by microhomology, suggesting that tandem repeats may have evolved to efficiently repair frequent DSBs in centromeres. PMID:27047500
Cloning and characterization of the gene encoding the endopolygalacturonase-inhibiting protein (PGIP) of Phaseolus vulgaris L.

PubMed

Toubart, P; Desiderio, A; Salvi, G; Cervone, F; Daroda, L; De Lorenzo, G

1992-05-01

Polygalacturonase-inhibiting protein (PGIP) is a cell wall protein purified from hypocotyls of true bean (Phaseolus vulgaris L.). PGIP inhibits fungal endopolygalacturonases and is considered to be an important factor for plant resistance to phytopathogenic fungi (Albersheim and Anderson, 1971; Cervone et al., 1987). The amino acid sequences of the N-terminus and one internal tryptic peptide of the PGIP purified from P. vulgaris cv. Pinto were used to design redundant oligonucleotides that were successfully utilized as primers in a polymerase chain reaction (PCR) with total DNA of P. vulgaris as a template. A DNA band of 758 bp (a specific PCR amplification product of part of the gene coding for PGIP) was isolated and cloned. By using the 758-bp DNA as a hybridization probe, a lambda clone containing the PGIP gene was isolated from a genomic library of P. vulgaris cv. Saxa. The coding and immediate flanking regions of the PGIP gene, contained on a subcloned 3.3 kb SalI-SalI DNA fragment, were sequenced. A single, continuous ORF of 1026 nt (342 amino acids) was present in the genomic clone. The nucleotide and deduced amino acid sequences of the PGIP gene showed no significant similarity with any known databank sequence. Northern blotting analysis of poly(A)+ RNAs, isolated from various tissues of bean seedlings or from suspension-cultured bean cells, was also performed using the cloned PCR-generated DNA as a probe. A 1.2 kb transcript was detected in suspension-cultured cells and, to a lesser extent, in leaves, hypocotyls, and flowers.(ABSTRACT TRUNCATED AT 250 WORDS)
Peptide-based pharmacomodulation of a cancer-targeted optical imaging and photodynamic therapy agent

PubMed Central

Stefflova, Klara; Li, Hui; Chen, Juan; Zheng, Gang

2008-01-01

We designed and synthesized a folate receptor-targeted, water soluble, and pharmacomodulated photodynamic therapy (PDT) agent that selectively detects and destroys the targeted cancer cells while sparing normal tissue. This was achieved by minimizing the normal organ uptake (e.g., liver and spleen) and by discriminating between tumors with different levels of folate receptor (FR) expression. This construct (Pyro-peptide-Folate, PPF) is comprised of three components: 1) Pyropheophorbide a (Pyro) as an imaging and therapeutic agent, 2) peptide sequence as a stable linker and modulator improving the delivery efficiency, and 3) Folate as a homing molecule targeting FR-expressing cancer cells. We observed an enhanced accumulation of PPF in KB cancer cells (FR+) compared to HT 1080 cancer cells (FR-), resulting in a more effective post-PDT killing of KB cells over HT 1080 or normal CHO cells. The accumulation of PPF in KB cells can be up to 70% inhibited by an excess of free folic acid. The effect of Folate on preferential accumulation of PPF in KB tumors (KB vs. HT 1080 tumors 2.5:1) was also confirmed in vivo. In contrast to that, no significant difference between the KB and HT 1080 tumor was observed in case of the untargeted probe (Pyro-peptide, PP), eliminating the potential influence of Pyro’s own nonspecific affinity to cancer cells. More importantly, we found that incorporating a short peptide sequence considerably improved the delivery efficiency of the probe – a process we attributed to a possible peptide-based pharmacomodulation – as was demonstrated by a 50-fold reduction in PPF accumulation in liver and spleen when compared to a peptide-lacking probe (Pyro-K-Folate, PKF). This approach could potentially be generalized to improve the delivery efficiency of other targeted molecular imaging and photodynamic therapy agents. PMID:17298029
Construction of an 800-kb contig in the near-centromeric region of the rice blast resistance gene Pi-ta2 using a highly representative rice BAC library.

PubMed

Nakamura, S; Asakawa, S; Ohmido, N; Fukui, K; Shimizu, N; Kawasaki, S

1997-05-01

We constructed a rice Bacterial Artificial Chromosome (BAC) library from green leaf protoplasts of the cultivar Shimokita harboring the rice blast resistance gene Pi-ta. The average insert size of 155 kb and the library size of seven genome equivalents make it one of the most comprehensive BAC libraries available, and larger than many plant YAC libraries. The library clones were plated on seven high density membranes of microplate size, enabling efficient colony identification in colony hybridization experiments. Seven percent of clones carried chloroplast DNA. By probing with markers close to the blast resistance genes Pi-ta2 (closely linked to Pi-ta) and Pi-b, respectively located in the centromeric region of chromosome 12 and near the telomeric end of chromosome 2, on average 2.2 +/- 1.3 and 8.0 +/- 2.6 BAC clones/marker were isolated. Differences in chromosomal structures may contribute to this wide variation in yield. A contig of about 800 kb, consisting of 19 clones, was constructed in the Pi-ta2 region. This region had a high frequency of repetitive sequences. To circumvent this difficulty, we devised a "two-step walking" method. The contig spanned a 300 kb region between markers located at 0 cM and 0.3 cM from Pi-ta. The ratio of physical to genetic distances (> 1,000 kb/cM) was more than three times larger than the average of rice (300 kb/cM). The low recombination rate and high frequency of repetitive sequences may also be related to the near centromeric character of this region. Fluorescent in situ hybridization (FISH) with a BAC clone from the Pi-b region yielded very clear signals on the long arm of chromosome 2, while a clone from the Pi-ta2 region showed various cross-hybridizing signals near the centromeric regions of all chromosomes.
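The physical-to-genetic distance comparison quoted in this abstract is simple arithmetic, sketched here with the figures given (a 300 kb interval between markers at 0 cM and 0.3 cM, and a rice average of 300 kb/cM). The function and variable names are illustrative assumptions, not taken from the paper.

```python
def physical_to_genetic_ratio(physical_kb: float, genetic_cm: float) -> float:
    """Return kilobases of DNA per centimorgan for a mapped interval."""
    return physical_kb / genetic_cm


# Figures quoted in the abstract.
region_kb_per_cm = physical_to_genetic_ratio(physical_kb=300, genetic_cm=0.3)
rice_average_kb_per_cm = 300

# How many times larger the Pi-ta2 region ratio is than the rice average.
fold_over_average = region_kb_per_cm / rice_average_kb_per_cm
```

With these inputs the ratio comes out at about 1,000 kb/cM, roughly 3.3 times the quoted rice average, in line with the abstract's "more than three times larger".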
Fine mapping of the genic male-sterile ms1 gene in Capsicum annuum L.

PubMed

Jeong, Kyumi; Choi, Doil; Lee, Jundae

2018-01-01

The genomic region cosegregating with the genic male-sterile ms1 gene of Capsicum annuum L. was delimited to a region of 869.9 kb on chromosome 5 through fine mapping analysis. A strong candidate gene, CA05g06780, a homolog of the Arabidopsis MALE STERILITY 1 gene that controls pollen development, was identified in this region. Genic male sterility caused by the ms1 gene has been used for the economically efficient production of massive hybrid seeds in paprika (Capsicum annuum L.), a colored bell-type sweet pepper. Previously, a CAPS marker, PmsM1-CAPS, located about 2-3 cM from the ms1 locus, was reported. In this study, we constructed a fine map near the ms1 locus using high-resolution melting (HRM) markers in an F2 population consisting of 1118 individual plants, which segregated into 867 male-fertile and 251 male-sterile plants. A total of 12 HRM markers linked to the ms1 locus were developed from 53 primer sets targeting intraspecific SNPs derived by comparing genome-wide sequences obtained by next-generation resequencing analysis. Using this approach, we narrowed down the region cosegregating with the ms1 gene to 869.9 kb of sequence. Gene prediction analysis revealed 11 open reading frames in this region. A strong candidate gene, CA05g06780, was identified; this gene is a homolog of the Arabidopsis MALE STERILITY 1 (MS1) gene, which encodes a PHD-type transcription factor that regulates pollen and tapetum development. Sequence comparison analysis suggested that the CA05g06780 gene is the strongest candidate for the ms1 gene of paprika. To summarize, we developed a cosegregated marker, 32187928-HRM, for marker-assisted selection and identified a strong candidate for the ms1 gene.
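The segregation counts in this abstract (867 male-fertile, 251 male-sterile) can be compared with the 3:1 ratio expected for a single recessive gene in an F2 population using a chi-square goodness-of-fit statistic. The abstract itself does not report such a test, so the sketch below is only an illustration of the calculation. The function names are assumptions, and the p-value formula is the standard one for one degree of freedom.

```python
import math


def chi_square_goodness_of_fit(observed, expected_ratio):
    """Chi-square statistic for observed counts against an expected ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def p_value_one_df(chi2: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2))


# Counts quoted in the abstract; the 3:1 expectation is an assumption
# (single recessive gene in an F2), not a test reported by the authors.
chi2 = chi_square_goodness_of_fit(observed=[867, 251], expected_ratio=[3, 1])
p = p_value_one_df(chi2)
```

The resulting statistic and p-value should be read alongside the original paper rather than as a conclusion about the segregation data.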
Pipeline for large-scale microdroplet bisulfite PCR-based sequencing allows the tracking of hepitype evolution in tumors.

PubMed

Herrmann, Alexander; Haake, Andrea; Ammerpohl, Ole; Martin-Guerrero, Idoia; Szafranski, Karol; Stemshorn, Kathryn; Nothnagel, Michael; Kotsopoulos, Steve K; Richter, Julia; Warner, Jason; Olson, Jeff; Link, Darren R; Schreiber, Stefan; Krawczak, Michael; Platzer, Matthias; Nürnberg, Peter; Siebert, Reiner; Hampe, Jochen

2011-01-01

Cytosine methylation provides an epigenetic level of cellular plasticity that is important for development, differentiation and cancerogenesis. We adopted microdroplet PCR to bisulfite treated target DNA in combination with second generation sequencing to simultaneously assess DNA sequence and methylation. We show measurement of methylation status in a wide range of target sequences (total 34 kb) with an average coverage of 95% (median 100%) and good correlation to the opposite strand (rho = 0.96) and to pyrosequencing (rho = 0.87). Data from lymphoma and colorectal cancer samples for SNRPN (imprinted gene), FGF6 (demethylated in the cancer samples) and HS3ST2 (methylated in the cancer samples) serve as a proof of principle showing the integration of SNP data and phased DNA-methylation information into "hepitypes" and thus the analysis of DNA methylation phylogeny in the somatic evolution of cancer.
Nucleotide Sequence Diversity and Linkage Disequilibrium of Four Nuclear Loci in Foxtail Millet (Setaria italica).

PubMed

He, Shui-Lian; Yang, Yang; Morrell, Peter L; Yi, Ting-Shuang

2015-01-01

Foxtail millet (Setaria italica (L.) Beauv) is one of the earliest domesticated grains, which has been cultivated in northern China by 8,700 years before present (YBP) and across Eurasia by 4,000 YBP. Owing to a small genome and diploid nature, foxtail millet is a tractable model crop for studying functional genomics of millets and bioenergy grasses. In this study, we examined nucleotide sequence diversity, geographic structure, and levels of linkage disequilibrium at four nuclear loci (ADH1, G3PDH, IGS1 and TPI1) in representative samples of 311 landrace accessions across its cultivated range. Higher levels of nucleotide sequence and haplotype diversity were observed in samples from China relative to other sampled regions. Genetic assignment analysis classified the accessions into seven clusters based on nucleotide sequence polymorphisms. Intralocus LD decayed rapidly to half the initial value within ~1.2 kb or less.
Superimposed Code Theoretic Analysis of Deoxyribonucleic Acid (DNA) Codes and DNA Computing

DTIC Science & Technology

2010-01-01

partitioned by font type) of sequences are allowed to be in each position (e.g., Arial = position 0, Comic = position 1, etc. ) and within each collection...movement was modeled by a Brownian motion 3 dimensional random walk. The one dimensional diffusion coefficient D for the ellipsoid shape with 3...temperature, kB is Boltzmann’s constant, and η is the viscosity of the medium. The random walk motion is modeled by assuming the oligo is on a three
Analysis of Ethnic Admixture in Prostate Cancer

DTIC Science & Technology

2006-12-01

low penetrant genes have been identified as potential PCA susceptibility genes. These candidate genes include SRD5A2 (MIM 607306), CYP3A4 (MIM 124010...progression [13]. The CDH1 gene is located at 16q22.1 and consists of 16 exons spanning approximately 100 kb of genomic DNA. Several polymorphisms, germline and...upstream of the ATG start site and all 16 exons of CDH1 were screened for DNA sequence variation by denaturing high-performance liquid chromatography
Prevalence and genetic characterization of eimeriid coccidia from feces of black-necked cranes, Grus nigricollis.

PubMed

Liang, Yu; Zhao, ZiJiao; Hu, JunJie; Esch, Gerald W; Peng, MingChun; Liu, Qiong; Chen, JinQing

2018-03-01

Disseminated visceral coccidiosis (DVC) is a widely distributed intestinal and extraintestinal disease of cranes caused by eimeriid coccidia and has lethal pathogenicity to several crane species. Here, feces of 164 black-necked cranes collected in Dashanbao Black-necked Crane National Nature Reserve, China, were examined to determine the prevalence of coccidial oocysts. Of the 164 fecal samples, 76 (46.3%) were positive for oocysts of Eimeria, including E. gruis in 59 (35.9%), E. reichenowi in 52 (31.7%), and E. bosquei in 47 (28.7%) by microscopic observation. Sixty-eight (89.5%) of these positive samples included two or more morphologically identifiable species of Eimeria. The nearly full length 18S rRNA gene (18S rRNA; about 1.8 kb) and partial mitochondrial cytochrome c oxidase I gene (COX1; about 1.3 kb) from oocysts of each morphologically distinct species of Eimeria were amplified, sequenced, and analyzed. BLAST searches using these new 18S rRNA sequences for E. gruis, E. reichenowi, or E. bosquei showed the most similar sequences were those of E. gruis (98.7-99.7% identity), E. reichenowi (97.9-100% identity), or E. gruis (98.6-99.6% identity) isolated from different species of Grus. BLAST searches using the new COX1 sequences for the three species of Eimeria showed that no nucleotide sequences of Eimeria and Isospora coccidia in GenBank have more than 83.0% identity with these species. Identities among the new COX1 sequences were 91.8% for E. gruis and E. reichenowi, 94.5% for E. gruis and E. bosquei, and 91.3% for E. reichenowi and E. bosquei. Phylogenetic analysis based on 18S rRNA or COX1 sequences indicated that Eimeria spp. in black-necked cranes were clustered together with other previously identified Eimeria species from different cranes.
Organization, chromosomal localization and promoter analysis of the gene encoding human acidic fibroblast growth factor intracellular binding protein.

PubMed Central

Kolpakova, E; Frengen, E; Stokke, T; Olsnes, S

2000-01-01

Acidic fibroblast growth factor (aFGF) intracellular binding protein (FIBP) is a protein found mainly in the nucleus that might be involved in the intracellular function of aFGF. Here we present a comparative analysis of the deduced amino acid sequences of human, murine and Drosophila FIBP analogues and demonstrate that FIBP is an evolutionarily conserved protein. The human gene spans more than 5 kb, comprising ten exons and nine introns, and maps to chromosome 11q13.1. Two slightly different splice variants found in different tissues were isolated and characterized. Sequence analysis of the region surrounding the translation start revealed a CpG island, a classical feature of widely expressed genes. Functional studies of the promoter region with a luciferase reporter system suggested a strong transcriptional activity residing within 600 bp of the 5' flanking region. PMID:11104667
A framework for organizing cancer-related variations from existing databases, publications and NGS data using a High-performance Integrated Virtual Environment (HIVE).

PubMed

Wu, Tsung-Jung; Shamsaddini, Amirhossein; Pan, Yang; Smith, Krista; Crichton, Daniel J; Simonyan, Vahan; Mazumder, Raja

2014-01-01

Years of sequence feature curation by UniProtKB/Swiss-Prot, PIR-PSD, NCBI-CDD, RefSeq and other database biocurators has led to a rich repository of information on functional sites of genes and proteins. This information along with variation-related annotation can be used to scan human short sequence reads from next-generation sequencing (NGS) pipelines for presence of non-synonymous single-nucleotide variations (nsSNVs) that affect functional sites. This and similar workflows are becoming more important because thousands of NGS data sets are being made available through projects such as The Cancer Genome Atlas (TCGA), and researchers want to evaluate their biomarkers in genomic data. BioMuta, an integrated sequence feature database, provides a framework for automated and manual curation and integration of cancer-related sequence features so that they can be used in NGS analysis pipelines. Sequence feature information in BioMuta is collected from the Catalogue of Somatic Mutations in Cancer (COSMIC), ClinVar, UniProtKB and through biocuration of information available from publications. Additionally, nsSNVs identified through automated analysis of NGS data from TCGA are also included in the database. Because of the petabytes of data and information present in NGS primary repositories, a platform HIVE (High-performance Integrated Virtual Environment) for storing, analyzing, computing and curating NGS data and associated metadata has been developed. Using HIVE, 31 979 nsSNVs were identified in TCGA-derived NGS data from breast cancer patients. All variations identified through this process are stored in a Curated Short Read archive, and the nsSNVs from the tumor samples are included in BioMuta. Currently, BioMuta has 26 cancer types with 13 896 small-scale and 308 986 large-scale study-derived variations. Integration of variation data allows identifications of novel or common nsSNVs that can be prioritized in validation studies. 
Database URL: BioMuta: http://hive.biochemistry.gwu.edu/tools/biomuta/index.php; CSR: http://hive.biochemistry.gwu.edu/dna.cgi?cmd=csr; HIVE: http://hive.biochemistry.gwu.edu.
First Report of cfr-Carrying Plasmids in the Pandemic Sequence Type 22 Methicillin-Resistant Staphylococcus aureus Staphylococcal Cassette Chromosome mec Type IV Clone

PubMed Central

Shore, Anna C.; Lazaris, Alexandros; Kinnevey, Peter M.; Brennan, Orla M.; Brennan, Gráinne I.; O'Connell, Brian; Feßler, Andrea T.; Schwarz, Stefan

2016-01-01

Linezolid is often the drug of last resort for serious methicillin-resistant Staphylococcus aureus (MRSA) infections. Linezolid resistance is mediated by mutations in 23S rRNA and genes for ribosomal proteins; cfr, encoding phenicol, lincosamide, oxazolidinone, pleuromutilin, and streptogramin A (PhLOPSA) resistance; its homologue cfr(B); or optrA, conferring oxazolidinone and phenicol resistance. Linezolid resistance is rare in S. aureus, and cfr is even rarer. This study investigated the clonality and linezolid resistance mechanisms of two MRSA isolates from patients in separate Irish hospitals. Isolates were subjected to cfr PCR, PhLOPSA susceptibility testing, 23S rRNA PCR and sequencing, DNA microarray profiling, spa typing, pulsed-field gel electrophoresis (PFGE), plasmid curing, and conjugative transfer. Whole-genome sequencing was used for single-nucleotide variant (SNV) analysis, multilocus sequence typing, L protein mutation identification, cfr plasmid sequence analysis, and optrA and cfr(B) detection. Isolates M12/0145 and M13/0401 exhibited linezolid MICs of 64 and 16 mg/liter, respectively, and harbored identical 23S rRNA and L22 mutations, but M12/0145 exhibited the mutation in 2/6 23S rRNA alleles, compared to 1/5 in M13/0401. Both isolates were sequence type 22 MRSA staphylococcal cassette chromosome mec type IV (ST22-MRSA-IV)/spa type t032 isolates, harbored cfr, exhibited the PhLOPSA phenotype, and lacked optrA and cfr(B). They differed by five PFGE bands and 603 SNVs. Isolate M12/0145 harbored cfr and fexA on a 41-kb conjugative pSCFS3-type plasmid, whereas M13/0401 harbored cfr and lsa(B) on a novel 27-kb plasmid. This is the first report of cfr in the pandemic ST22-MRSA-IV clone. Different cfr plasmids and mutations associated with linezolid resistance in genotypically distinct ST22-MRSA-IV isolates highlight that prudent management of linezolid use is essential. PMID:26953212
Breed relationships facilitate fine-mapping studies: A 7.8-kb deletion cosegregates with Collie eye anomaly across multiple dog breeds

PubMed Central

Parker, Heidi G.; Kukekova, Anna V.; Akey, Dayna T.; Goldstein, Orly; Kirkness, Ewen F.; Baysac, Kathleen C.; Mosher, Dana S.; Aguirre, Gustavo D.; Acland, Gregory M.; Ostrander, Elaine A.

2007-01-01

The features of modern dog breeds that increase the ease of mapping common diseases, such as reduced heterogeneity and extensive linkage disequilibrium, may also increase the difficulty associated with fine mapping and identifying causative mutations. One way to address this problem is by combining data from multiple breeds segregating the same trait after initial linkage has been determined. The multibreed approach increases the number of potentially informative recombination events and reduces the size of the critical haplotype by taking advantage of shortened linkage disequilibrium distances found across breeds. In order to identify breeds that likely share a trait inherited from the same ancestral source, we have used cluster analysis to divide 132 breeds of dog into five primary breed groups. We then use the multibreed approach to fine-map Collie eye anomaly (cea), a complex disorder of ocular development that was initially mapped to a 3.9-cM region on canine chromosome 37. Combined genotypes from affected individuals from four breeds of a single breed group significantly narrowed the candidate gene region to a 103-kb interval spanning only four genes. Sequence analysis revealed that all affected dogs share a homozygous deletion of 7.8 kb in the NHEJ1 gene. This intronic deletion spans a highly conserved binding domain to which several developmentally important proteins bind. This work both establishes that the primary cea mutation arose as a single disease allele in a common ancestor of herding breeds as well as highlights the value of comparative population analysis for refining regions of linkage. PMID:17916641
The mitochondrial genome sequences of the round goby and the sand goby reveal patterns of recent evolution in gobiid fish.

PubMed

Adrian-Kalchhauser, Irene; Svensson, Ola; Kutschera, Verena E; Alm Rosenblad, Magnus; Pippel, Martin; Winkler, Sylke; Schloissnig, Siegfried; Blomberg, Anders; Burkhardt-Holm, Patricia

2017-02-16

Vertebrate mitochondrial genomes are optimized for fast replication and low cost of RNA expression. Accordingly, they are devoid of introns, are transcribed as polycistrons and contain very little intergenic sequences. Usually, vertebrate mitochondrial genomes measure between 16.5 and 17 kilobases (kb). During genome sequencing projects for two novel vertebrate models, the invasive round goby and the sand goby, we found that the sand goby genome is exceptionally small (16.4 kb), while the mitochondrial genome of the round goby is much larger than expected for a vertebrate. It is 19 kb in size and is thus one of the largest fish and even vertebrate mitochondrial genomes known to date. The expansion is attributable to a sequence insertion downstream of the putative transcriptional start site. This insertion carries traces of repeats from the control region, but is mostly novel. To get more information about this phenomenon, we gathered all available mitochondrial genomes of Gobiidae and of nine gobioid species, performed phylogenetic analyses, analysed gene arrangements, and compared gobiid mitochondrial genome sizes, ecological information and other species characteristics with respect to the mitochondrial phylogeny. This allowed us amongst others to identify a unique arrangement of tRNAs among Ponto-Caspian gobies. Our results indicate that the round goby mitochondrial genome may contain novel features. Since mitochondrial genome organisation is tightly linked to energy metabolism, these features may be linked to its invasion success. Also, the unique tRNA arrangement among Ponto-Caspian gobies may be helpful in studying the evolution of this highly adaptive and invasive species group. Finally, we find that the phylogeny of gobiids can be further refined by the use of longer stretches of linked DNA sequence.
Microfluidic droplet enrichment for targeted sequencing

PubMed Central

Eastburn, Dennis J.; Huang, Yong; Pellegrino, Maurizio; Sciambi, Adam; Ptáček, Louis J.; Abate, Adam R.

2015-01-01

Targeted sequence enrichment enables better identification of genetic variation by providing increased sequencing coverage for genomic regions of interest. Here, we report the development of a new target enrichment technology that is highly differentiated from other approaches currently in use. Our method, MESA (Microfluidic droplet Enrichment for Sequence Analysis), isolates genomic DNA fragments in microfluidic droplets and performs TaqMan PCR reactions to identify droplets containing a desired target sequence. The TaqMan positive droplets are subsequently recovered via dielectrophoretic sorting, and the TaqMan amplicons are removed enzymatically prior to sequencing. We demonstrated the utility of this approach by generating an average 31.6-fold sequence enrichment across 250 kb of targeted genomic DNA from five unique genomic loci. Significantly, this enrichment enabled a more comprehensive identification of genetic polymorphisms within the targeted loci. MESA requires low amounts of input DNA, minimal prior locus sequence information and enriches the target region without PCR bias or artifacts. These features make it well suited for the study of genetic variation in a number of research and diagnostic applications. PMID:25873629

Early evolutionary colocalization of the nuclear ribosomal 5S and 45S gene families in seed plants: evidence from the living fossil gymnosperm Ginkgo biloba.

PubMed

Galián, J A; Rosato, M; Rosselló, J A

2012-06-01

In seed plants, the colocalization of the 5S loci within the intergenic spacer (IGS) of the nuclear 45S tandem units is restricted to the phylogenetically derived Asteraceae family. However, fluorescent in situ hybridization (FISH) colocalization of both multigene families has also been observed in other unrelated seed plant lineages. Previous work has identified colocalization of 45S and 5S loci in Ginkgo biloba using FISH, but these observations have not been confirmed recently by sequencing a 1.8 kb IGS. In this work, we report the presence of the 45S-5S linkage in G. biloba, suggesting that in seed plants the molecular events leading to the restructuring of the ribosomal loci are much older than estimated previously. We obtained a 6.0 kb IGS fragment showing structural features of functional sequences, and a single copy of the 5S gene was inserted in the same direction of transcription as the ribosomal RNA genes. We also obtained a 1.8 kb IGS that was a truncate variant of the 6.0 kb IGS lacking the 5S gene. Several lines of evidence strongly suggest that the 1.8 kb variants are pseudogenes that are present exclusively on the satellite chromosomes bearing the 45S-5S genes. The presence of ribosomal IGS pseudogenes best reconciles contradictory results concerning the presence or absence of the 45S-5S linkage in Ginkgo. Our finding that both ribosomal gene families have been unified to a single 45S-5S unit in Ginkgo indicates that an accurate reassessment of the organization of rDNA genes in basal seed plants is necessary.
Early evolutionary colocalization of the nuclear ribosomal 5S and 45S gene families in seed plants: evidence from the living fossil gymnosperm Ginkgo biloba

PubMed Central

Galián, J A; Rosato, M; Rosselló, J A

2012-01-01

In seed plants, the colocalization of the 5S loci within the intergenic spacer (IGS) of the nuclear 45S tandem units is restricted to the phylogenetically derived Asteraceae family. However, fluorescent in situ hybridization (FISH) colocalization of both multigene families has also been observed in other unrelated seed plant lineages. Previous work has identified colocalization of 45S and 5S loci in Ginkgo biloba using FISH, but these observations have not been confirmed recently by sequencing a 1.8 kb IGS. In this work, we report the presence of the 45S–5S linkage in G. biloba, suggesting that in seed plants the molecular events leading to the restructuring of the ribosomal loci are much older than estimated previously. We obtained a 6.0 kb IGS fragment showing structural features of functional sequences, and a single copy of the 5S gene was inserted in the same direction of transcription as the ribosomal RNA genes. We also obtained a 1.8 kb IGS that was a truncate variant of the 6.0 kb IGS lacking the 5S gene. Several lines of evidence strongly suggest that the 1.8 kb variants are pseudogenes that are present exclusively on the satellite chromosomes bearing the 45S–5S genes. The presence of ribosomal IGS pseudogenes best reconciles contradictory results concerning the presence or absence of the 45S–5S linkage in Ginkgo. Our finding that both ribosomal gene families have been unified to a single 45S–5S unit in Ginkgo indicates that an accurate reassessment of the organization of rDNA genes in basal seed plants is necessary. PMID:22354111
Genetic Diversity of Crimean Congo Hemorrhagic Fever Virus Strains from Iran

PubMed Central

Chinikar, Sadegh; Bouzari, Saeid; Shokrgozar, Mohammad Ali; Mostafavi, Ehsan; Jalali, Tahmineh; Khakifirouz, Sahar; Nowotny, Norbert; Fooks, Anthony R.; Shah-Hosseini, Nariman

2016-01-01

Background: Crimean Congo hemorrhagic fever virus (CCHFV) is a member of the Bunyaviridae family and Nairovirus genus. It has a negative-sense, single-stranded RNA genome of approximately 19.2 kb, containing the Small, Medium, and Large segments. CCHFVs are relatively divergent in their genome sequence and grouped in seven distinct clades based on S-segment sequence analysis and six clades based on M-segment sequences. Our aim was to obtain new insights into the molecular epidemiology of CCHFV in Iran. Methods: We analyzed partial and complete nucleotide sequences of the S and M segments derived from 50 Iranian patients. The extracted RNA was amplified using one-step RT-PCR and then sequenced. The sequences were analyzed using Mega5 software. Results: Phylogenetic analysis of partial S segment sequences demonstrated that clade IV-(Asia 1), clade IV-(Asia 2) and clade V-(Europe) accounted for 80%, 4% and 14% of the circulating genomic variants of CCHFV in Iran, respectively. However, one of the Iranian strains (Iran-Kerman/22) was associated with none of the other sequences and formed a new clade (VII). The phylogenetic analysis of complete S-segment nucleotide sequences from selected Iranian CCHFV strains complemented with representative strains from GenBank revealed a topology similar to that of the partial sequences, with eight major clusters. A partial M segment phylogeny positioned the Iranian strains in association with either clade III (Asia-Africa) or clade V (Europe). Conclusion: The phylogenetic analysis revealed subtle links between distant geographic locations, which we propose might originate either from international livestock trade or from long-distance carriage of CCHFV by infected ticks via bird migration. PMID:27308271
Characterization of the human gene (TBXAS1) encoding thromboxane synthase.

PubMed

Miyata, A; Yokoyama, C; Ihara, H; Bandoh, S; Takeda, O; Takahashi, E; Tanabe, T

1994-09-01

The gene encoding human thromboxane synthase (TBXAS1) was isolated from a human EMBL3 genomic library using human platelet thromboxane synthase cDNA as a probe. Nucleotide sequencing revealed that the human thromboxane synthase gene spans more than 75 kb and consists of 13 exons and 12 introns, of which the splice donor and acceptor sites conform to the GT/AG rule. The exon-intron boundaries of the thromboxane synthase gene were similar to those of the human cytochrome P450 nifedipine oxidase gene (CYP3A4) except for introns 9 and 10, although the primary sequences of these enzymes exhibited only 35.8% identity to each other. The 1.2 kb of 5'-flanking sequence contained potential binding sites for several transcription factors (AP-1, AP-2, GATA-1, CCAAT box, xenobiotic-response element, PEA-3, LF-A1, myb, basic transcription element and cAMP-response element). Primer-extension analysis indicated multiple transcription-start sites, and the major start site was identified as an adenine residue located 142 bases upstream of the translation-initiation site. However, neither a typical TATA box nor a typical CAAT box is found within the 100 bases upstream of the translation-initiation site. Southern-blot analysis revealed the presence of one copy of the thromboxane synthase gene per haploid genome. Furthermore, a fluorescence in situ hybridization study revealed that the human gene for thromboxane synthase is localized to band q33-q34 of the long arm of chromosome 7. A tissue-distribution study demonstrated that thromboxane synthase mRNA is widely expressed in human tissues and is particularly abundant in peripheral blood leukocytes, spleen, lung and liver. Low but significant levels of mRNA were observed in kidney, placenta and thymus.
Transposon-like properties of the major, long repetitive sequence family in the genome of Physarum polycephalum

PubMed Central

Pearston, Douglas H.; Gordon, Mairi; Hardman, Norman

1985-01-01

A family of long, highly-repetitive sequences, referred to previously as `HpaII-repeats', dominates the genome of the eukaryotic slime mould Physarum polycephalum. These sequences are found exclusively in scrambled clusters. They account for about one-half of the total complement of repetitive DNA in Physarum, and represent the major sequence component found in hypermethylated, 20-50 kb segments of Physarum genomic DNA that fail to be cleaved using the restriction endonuclease HpaII. The structure of this abundant repetitive element was investigated by analysing cloned segments derived from the hypermethylated genomic DNA compartment. We show that the `HpaII-repeat' forms part of a larger repetitive DNA structure, ∼8.6 kb in length, with several structural features in common with recognised eukaryotic transposable genetic elements. Scrambled clusters of the sequence probably arise as a result of transposition-like events, during which the element preferentially recombines in either orientation with target sites located in other copies of the same repeated sequence. The target sites for transposition/recombination are not related in sequence but in all cases studied they are potentially capable of promoting the formation of small `cruciforms' or `Z-DNA' structures which might be recognised during the recombination process. PMID:16453652
Dmc1 of Schizosaccharomyces pombe plays a role in meiotic recombination.

PubMed

Fukushima, K; Tanaka, Y; Nabeshima, K; Yoneki, T; Tougan, T; Tanaka, S; Nojima, H

2000-07-15

We report here a Schizosaccharomyces pombe gene (dmc1(+)) that resembles budding yeast DMC1 in the region immediately upstream of the rad24(+) gene. We showed by northern and Southern blot analysis that dmc1(+) and rad24(+) are co-transcribed as a bicistronic mRNA of 2.8 kb with meiotic specificity, whereas rad24(+) itself is constitutively transcribed as a 1.0-kb mRNA species during meiosis. Induction of the bicistronic transcript is under the control of a meiosis-specific transcription factor, Ste11. Disruption of both dmc1(+) and rad24(+) had no effect on mitosis or spore formation, and dmc1Delta cells displayed no change in sensitivity to UV or gamma irradiation relative to the wild type. Tetrad analysis indicated that Dmc1 is involved in meiotic recombination. Analysis of gene conversion frequencies using single and double mutants of dmc1 and rhp51 indicated that both Dmc1 and Rhp51 function in meiotic gene conversion. These observations, together with a high level of sequence identity, indicate that the dmc1(+) gene of S. pombe is a structural homolog of budding yeast DMC1, sharing both similar and distinct functions in meiosis.
Identification of herpes simplex virus type 1 proteins encoded within the first 1.5 kb of the latency-associated transcript.

PubMed

Henderson, Gail; Jaber, Tareq; Carpenter, Dale; Wechsler, Steven L; Jones, Clinton

2009-09-01

Expression of the first 1.5 kb of the latency-associated transcript (LAT) that is encoded by herpes simplex virus type 1 (HSV-1) is sufficient for wild-type (wt) levels of reactivation from latency in small animal models. Peptide-specific immunoglobulin G (IgG) was generated against open reading frames (ORFs) that are located within the first 1.5 kb of LAT coding sequences. Cells stably transfected with LAT or trigeminal ganglionic neurons of mice infected with a LAT expressing virus appeared to express the L2 or L8 ORF. Only L2 ORF expression was readily detected in trigeminal ganglionic neurons of latently infected mice.
Interval mapping for red/green skin color in Asian pears using a modified QTL-seq method

PubMed Central

Xue, Huabai; Shi, Ting; Wang, Fangfang; Zhou, Huangkai; Yang, Jian; Wang, Long; Wang, Suke; Su, Yanli; Zhang, Zhen; Qiao, Yushan; Li, Xiugen

2017-01-01

Pears with red skin are attractive to consumers and provide additional health benefits. Identification of the gene(s) responsible for skin coloration can benefit cultivar selection and breeding. The use of QTL-seq, a bulked segregant analysis method, can be problematic when heterozygous parents are involved. The present study modified the QTL-seq method by introducing a |Δ(SNP-index)| parameter to improve the accuracy of mapping the red skin trait in a group of highly heterozygous Asian pears. The analyses were based on mixed DNA pools composed of 28 red-skinned and 27 green-skinned pear lines derived from a cross between the ‘Mantianhong’ and ‘Hongxiangsu’ red-skinned cultivars. The ‘Dangshansuli’ cultivar genome was used as reference for sequence alignment. An average single-nucleotide polymorphism (SNP) index was calculated using a sliding window approach (200-kb windows, 20-kb increments). Nine scaffolds within the candidate QTL interval were in the fifth linkage group from 111.9 to 177.1 cM. There was a significant linkage between the insertions/deletions and simple sequence repeat markers designed from the candidate intervals and the red/green skin (R/G) locus, which was in a 582.5-kb candidate interval that contained 81 predicted protein-coding gene models and was composed of two subintervals at the bottom of the fifth chromosome. The ZFRI 130-16, In2130-12 and In2130-16 markers located near the R/G locus could potentially be used to identify the red skin trait in Asian pear populations. This study provides new insights into the genetics controlling the red skin phenotype in this fruit. PMID:29118994
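The sliding-window averaging mentioned in this abstract (200-kb windows moved in 20-kb increments) can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the function name, the input layout (one position and one SNP-index per pool for each SNP on a single scaffold), and the choice to take the absolute difference per SNP before averaging are all assumptions made for illustration.

```python
from typing import List, Tuple


def sliding_window_abs_delta(
    positions: List[int],
    index_red: List[float],
    index_green: List[float],
    window: int = 200_000,  # 200-kb window, as stated in the abstract
    step: int = 20_000,     # 20-kb increment, as stated in the abstract
) -> List[Tuple[int, float]]:
    """Return (window_start, mean |delta SNP-index|) for every window
    that contains at least one SNP. Hypothetical helper, not from the paper."""
    results: List[Tuple[int, float]] = []
    if not positions:
        return results
    last = max(positions)
    start = 0
    while start <= last:
        end = start + window
        # Assumption: |red - green| is computed per SNP, then averaged.
        deltas = [
            abs(r - g)
            for p, r, g in zip(positions, index_red, index_green)
            if start <= p < end
        ]
        if deltas:
            results.append((start, sum(deltas) / len(deltas)))
        start += step
    return results


if __name__ == "__main__":
    # Toy data: three SNPs on one scaffold.
    pos = [10_000, 150_000, 210_000]
    red = [0.9, 0.8, 0.5]
    green = [0.1, 0.2, 0.5]
    for window_start, mean_delta in sliding_window_abs_delta(pos, red, green):
        print(window_start, round(mean_delta, 3))
```

Windows with a consistently high mean value would be the candidate intervals; any significance threshold would have to come from the study itself and is not modelled here.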
Interval mapping for red/green skin color in Asian pears using a modified QTL-seq method.

PubMed

Xue, Huabai; Shi, Ting; Wang, Fangfang; Zhou, Huangkai; Yang, Jian; Wang, Long; Wang, Suke; Su, Yanli; Zhang, Zhen; Qiao, Yushan; Li, Xiugen

2017-01-01

Pears with red skin are attractive to consumers and provide additional health benefits. Identification of the gene(s) responsible for skin coloration can benefit cultivar selection and breeding. The use of QTL-seq, a bulked segregant analysis method, can be problematic when heterozygous parents are involved. The present study modified the QTL-seq method by introducing a |Δ(SNP-index)| parameter to improve the accuracy of mapping the red skin trait in a group of highly heterozygous Asian pears. The analyses were based on mixed DNA pools composed of 28 red-skinned and 27 green-skinned pear lines derived from a cross between the 'Mantianhong' and 'Hongxiangsu' red-skinned cultivars. The 'Dangshansuli' cultivar genome was used as reference for sequence alignment. An average single-nucleotide polymorphism (SNP) index was calculated using a sliding window approach (200-kb windows, 20-kb increments). Nine scaffolds within the candidate QTL interval were in the fifth linkage group from 111.9 to 177.1 cM. There was a significant linkage between the insertions/deletions and simple sequence repeat markers designed from the candidate intervals and the red/green skin (R/G) locus, which was in a 582.5-kb candidate interval that contained 81 predicted protein-coding gene models and was composed of two subintervals at the bottom of the fifth chromosome. The ZFRI 130-16, In2130-12 and In2130-16 markers located near the R/G locus could potentially be used to identify the red skin trait in Asian pear populations. This study provides new insights into the genetics controlling the red skin phenotype in this fruit.
Potential Novel Mechanism for Axenfeld-Rieger Syndrome: Deletion of a Distant Region Containing Regulatory Elements of PITX2

PubMed Central

Volkmann, Bethany A.; Zinkevich, Natalya S.; Mustonen, Aki; Schilter, Kala F.; Bosenko, Dmitry V.; Reis, Linda M.; Broeckel, Ulrich; Link, Brian A.

2011-01-01

Purpose. Mutations in PITX2 are associated with Axenfeld-Rieger syndrome (ARS), which involves ocular, dental, and umbilical abnormalities. Identification of cis-regulatory elements of PITX2 is important to better understand the mechanisms of disease. Methods. Conserved noncoding elements surrounding PITX2/pitx2 were identified and examined through transgenic analysis in zebrafish; expression pattern was studied by in situ hybridization. Patient samples were screened for deletion/duplication of the PITX2 upstream region using arrays and probes. Results. Zebrafish pitx2 demonstrates conserved expression during ocular and craniofacial development. Thirteen conserved noncoding sequences positioned within a gene desert as far as 1.1 Mb upstream of the human PITX2 gene were identified; 11 have enhancer activities consistent with pitx2 expression. Ten elements mediated expression in the developing brain, four regions were active during eye formation, and two sequences were associated with craniofacial expression. One region, CE4, located approximately 111 kb upstream of PITX2, directed a complex pattern including expression in the developing eye and craniofacial region, the classic sites affected in ARS. Screening of ARS patients identified an approximately 7600-kb deletion that began 106 to 108 kb upstream of the PITX2 gene, leaving PITX2 intact while removing regulatory elements CE4 to CE13. Conclusions. These data suggest the presence of a complex distant regulatory matrix within the gene desert located upstream of PITX2 with an essential role in its activity and provide a possible mechanism for the previous reports of ARS in patients with balanced translocations involving the 4q25 region upstream of PITX2 and the current patient with an upstream deletion. PMID:20881290
The genomic organization of a human creatine transporter (CRTR) gene located in Xq28

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sandoval, N.; Bauer, D.; Brenner, V.

1996-07-15

During the course of a large-scale sequencing project in Xq28, a human creatine transporter (CRTR) gene was discovered. The gene is located approximately 36 kb centromeric to ALD. The gene contains 13 exons and spans about 8.5 kb of genomic DNA. Since the creatine transporter has a prominent function in muscular physiology, it is a candidate gene for Barth syndrome and infantile cardiomyopathy mapped to Xq28. 19 refs., 1 fig., 1 tab.
The mitochondrial genome of Moniliophthora roreri, the frosty pod rot pathogen of cacao.

PubMed

Costa, Gustavo G L; Cabrera, Odalys G; Tiburcio, Ricardo A; Medrano, Francisco J; Carazzolle, Marcelo F; Thomazella, Daniela P T; Schuster, Stephen C; Carlson, John E; Guiltinan, Mark J; Bailey, Bryan A; Mieczkowski, Piotr; Pereira, Gonçalo A G; Meinhardt, Lyndel W

2012-05-01

In this study, we report the sequence of the mitochondrial (mt) genome of the Basidiomycete fungus Moniliophthora roreri, which is the etiologic agent of frosty pod rot of cacao (Theobroma cacao L.). We also compare it to the mtDNA from the closely-related species Moniliophthora perniciosa, which causes witches' broom disease of cacao. The 94 Kb mtDNA genome of M. roreri has a circular topology and codes for the typical 14 mt genes involved in oxidative phosphorylation. It also codes for both rRNA genes, a ribosomal protein subunit, 13 intronic open reading frames (ORFs), and a full complement of 27 tRNA genes. The conserved genes of M. roreri mtDNA are completely syntenic with homologous genes of the 109 Kb mtDNA of M. perniciosa. As in M. perniciosa, M. roreri mtDNA contains a high number of hypothetical ORFs (28), a remarkable feature that makes Moniliophthoras the largest reservoir of hypothetical ORFs among sequenced fungal mtDNA. Additionally, the mt genome of M. roreri has three free invertron-like linear mt plasmids, one of which is very similar to that previously described as integrated into the main M. perniciosa mtDNA molecule. Moniliophthora roreri mtDNA also has a region of suspected plasmid origin containing 15 hypothetical ORFs distributed in both strands. One of these ORFs is similar to an ORF in the mtDNA gene encoding DNA polymerase in Pleurotus ostreatus. The comparison to M. perniciosa showed that the 15 Kb difference in mtDNA sizes is mainly attributed to a lower abundance of repetitive regions in M. roreri (5.8 Kb vs 20.7 Kb). The most notable differences between M. roreri and M. perniciosa mtDNA are attributed to repeats and regions of plasmid origin. These elements might have contributed to the rapid evolution of mtDNA. Since M. roreri is the second species of the genus Moniliophthora whose mtDNA genome has been sequenced, the data presented here contribute valuable information for understanding the evolution of fungal mt genomes among closely-related species. Crown Copyright © 2012. Published by Elsevier Ltd. All rights reserved.
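The size comparison in this abstract can be checked with a few lines of arithmetic. The Python sketch below uses only the rounded figures quoted above (94 Kb and 109 Kb genomes, 5.8 Kb and 20.7 Kb of repeats); the function name is hypothetical, and because the inputs are rounded the resulting fraction is only approximate.

```python
from typing import Tuple


def size_gap_explained(
    size_small_kb: float,
    size_large_kb: float,
    repeats_small_kb: float,
    repeats_large_kb: float,
) -> Tuple[float, float, float]:
    """Return (genome size gap, repeat content gap, repeat gap / size gap).
    Hypothetical helper for illustration only."""
    size_gap = size_large_kb - size_small_kb
    if size_gap == 0:
        raise ValueError("genomes have equal size; the fraction is undefined")
    repeat_gap = repeats_large_kb - repeats_small_kb
    return size_gap, repeat_gap, repeat_gap / size_gap


if __name__ == "__main__":
    # Figures quoted in the abstract: M. roreri first, M. perniciosa second.
    size_gap, repeat_gap, fraction = size_gap_explained(94, 109, 5.8, 20.7)
    print(f"size gap: {size_gap} Kb")
    print(f"repeat gap: {repeat_gap:.1f} Kb")
    print(f"fraction of size gap matched by repeats: {fraction:.2f}")
```

With these rounded inputs the repeat difference (about 14.9 Kb) accounts for nearly all of the 15 Kb size difference, which agrees with the abstract's statement that the gap is mainly attributed to repetitive regions.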
Breakpoint analysis of the pericentric inversion between chimpanzee chromosome 10 and the homologous chromosome 12 in humans.

PubMed

Kehrer-Sawatzki, H; Sandig, C A; Goidts, V; Hameister, H

2005-01-01

During this study, we analysed the pericentric inversion that distinguishes human chromosome 12 (HSA12) from the homologous chimpanzee chromosome (PTR10). Two large chimpanzee-specific duplications of 86 and 23 kb were observed in the breakpoint regions, which most probably arose in association with the inversion. The inversion break in PTR10p caused the disruption of the SLCO1B3 gene in exon 11. However, the 86-kb duplication includes the functional SLCO1B3 locus, which is thus retained in the chimpanzee, although inverted to PTR10q. The second duplication spans 23 kb and does not contain expressed sequences. Eleven genes map to a region of about 1 Mb around the breakpoints. Six of these eleven genes are not among the differentially expressed genes as determined previously by comparing the human and chimpanzee transcriptome of fibroblast cell lines, blood leukocytes, liver and brain samples. These findings imply that the inversion did not cause major expression differences of these genes. Comparative FISH analysis with BACs spanning the inversion breakpoints in PTR on metaphase chromosomes of gorilla (GGO) confirmed that the pericentric inversions of the chromosome 12 homologs in GGO and PTR have distinct breakpoints and that humans retain the ancestral arrangement. These findings coincide with the trend observed in hominoid karyotype evolution that humans have a karyotype close to an ancestral one, while African great apes present with more derived chromosome arrangements. Copyright (c) 2005 S. Karger AG, Basel.
A genome-wide screening of BEL-Pao like retrotransposons in Anopheles gambiae by the LTR_STRUC program.

PubMed

Marsano, Renè Massimiliano; Caizzi, Ruggiero

2005-09-12

The advanced state of assembly of the genomic sequence of the nematoceran Anopheles gambiae allowed us to perform a genome-wide analysis looking for the presence of Long Terminal Repeats (LTRs) in the range of 10 kb by means of the LTR_STRUC tool. More than three hundred sequences were retrieved and 210 were treated as putative complete retrotransposons that were individually analysed with respect to known retrotransposons of A. gambiae and D. melanogaster. The results show that the vast majority of the retrotransposons analysed belong to the Ty3/gypsy class and only 8% to the Ty1/copia class. In addition, phylogenetic analysis allowed us to characterize in more detail the relationship of a large BEL-Pao lineage in which a single family was shown to harbour an additional env gene.
Organization of nif gene cluster in Frankia sp. EuIK1 strain, a symbiont of Elaeagnus umbellata.

PubMed

Oh, Chang Jae; Kim, Ho Bang; Kim, Jitae; Kim, Won Jin; Lee, Hyoungseok; An, Chung Sun

2012-01-01

The nucleotide sequence of a 20.5-kb genomic region harboring nif genes was determined and analyzed. The fragment was obtained from Frankia sp. EuIK1 strain, an indigenous symbiont of Elaeagnus umbellata. A total of 20 ORFs including 12 nif genes were identified and subjected to comparative analysis with the genome sequences of 3 Frankia strains representing diverse host plant specificities. The nucleotide and deduced amino acid sequences showed highest levels of identity with orthologous genes from an Elaeagnus-infecting strain. The gene organization patterns around the nif gene clusters were well conserved among all 4 Frankia strains. However, characteristic features appeared in the location of the nifV gene for each Frankia strain, depending on the type of host plant. Sequence analysis was performed to determine the transcription units and suggested that there could be an independent operon starting from the nifW gene in the EuIK strain. Considering the organization patterns and their total extensions on the genome, we propose that the nif gene clusters remained stable despite genetic variations occurring in the Frankia genomes.
Analysis of SINE and LINE repeat content of Y chromosomes in the platypus, Ornithorhynchus anatinus.

PubMed

Kortschak, R Daniel; Tsend-Ayush, Enkhjargal; Grützner, Frank

2009-01-01

Monotremes feature an extraordinary sex-chromosome system that consists of five X and five Y chromosomes in males. These sex chromosomes share homology with bird sex chromosomes but no homology with the therian X. The genome of a female platypus was recently completed, providing unique insights into sequence and gene content of autosomes and X chromosomes, but no Y-specific sequence has so far been analysed. Here we report the isolation, sequencing and analysis of approximately 700 kb of sequence of the non-recombining regions of Y2, Y3 and Y5, which revealed differences in base composition and repeat content between autosomes and sex chromosomes, and within the sex chromosomes themselves. This provides the first insights into repeat content of Y chromosomes in platypus, which overall show similar patterns of repeat composition to Y chromosomes in other species. Interestingly, we also observed differences between the various Y chromosomes, and in combination with timing and activity patterns we provide an approach that can be used to examine the evolutionary history of the platypus sex-chromosome chain.
Cinnamate-4-hydroxylase expression in Arabidopsis. Regulation in response to development and the environment

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bell-Lelong, D.A.; Cusumano, J.C.; Meyer, K.

1997-03-01

Cinnamate-4-hydroxylase (C4H) is the first Cyt P450-dependent monooxygenase of the phenylpropanoid pathway. To study the expression of this gene in Arabidopsis thaliana, a C4H cDNA clone from the Arabidopsis expressed sequence tag database was identified and used to isolate its corresponding genomic clone. The entire C4H coding sequence plus 2.9 kb of its promoter were isolated on a 5.4-kb HindIII fragment of this cosmid. Inspection of the promoter sequence revealed the presence of a number of putative regulatory motifs previously identified in the promoters of other phenylpropanoid pathway genes. The expression of C4H was analyzed by RNA blot hybridization analysis and in transgenic Arabidopsis carrying a C4H-β-glucuronidase transcriptional fusion. C4H message accumulation was light-dependent, but was detectable even in dark-grown seedlings. Consistent with these data, C4H mRNA was accumulated to light-grown levels in etiolated det1-1 mutant seedlings. C4H is widely expressed in various Arabidopsis tissues, particularly in roots and cells undergoing lignification. The C4H-driven β-glucuronidase expression accurately reflected the tissue-specificity and wound-inducibility of the C4H promoter indicated by RNA blot hybridization analysis. A modest increase in C4H expression was observed in the tt8 mutant of Arabidopsis. 77 refs., 5 figs.
Assembly and analysis of a male sterile rubber tree mitochondrial genome reveals DNA rearrangement events and a novel transcript.

PubMed

Shearman, Jeremy R; Sangsrakru, Duangjai; Ruang-Areerate, Panthita; Sonthirod, Chutima; Uthaipaisanwong, Pichahpuk; Yoocha, Thippawan; Poopear, Supannee; Theerawattanasuk, Kanikar; Tragoonrung, Somvong; Tangphatsornruang, Sithichoke

2014-02-10

The rubber tree, Hevea brasiliensis, is an important plant species that is commercially grown to produce latex rubber in many countries. The rubber tree variety BPM 24 exhibits cytoplasmic male sterility, inherited from the variety GT 1. We constructed the rubber tree mitochondrial genome of a cytoplasmic male sterile variety, BPM 24, using 454 sequencing, including 8 kb paired-end libraries, plus Illumina paired-end sequencing. We annotated this mitochondrial genome with the aid of Illumina RNA-seq data and performed comparative analysis. We then compared the sequence of BPM 24 to the contigs of the published rubber tree, variety RRIM 600, and identified a rearrangement that is unique to BPM 24 resulting in a novel transcript containing a portion of atp9. The novel transcript is consistent with changes that cause cytoplasmic male sterility through a slight reduction to ATP production efficiency. The exhaustive nature of the search rules out alternative causes and supports previous findings of novel transcripts causing cytoplasmic male sterility.
Sequence and Analysis of the Tomato JOINTLESS Locus

PubMed Central

Mao, Long; Begum, Dilara; Goff, Stephen A.; Wing, Rod A.

2001-01-01

A 119-kb bacterial artificial chromosome from the JOINTLESS locus on the tomato (Lycopersicon esculentum) chromosome 11 contained 15 putative genes. Repetitive sequences in this region include one copia-like LTR retrotransposon, 13 simple sequence repeats, three copies of a novel type III foldback transposon, and four putative short DNA repeats. Database searches showed that the foldback transposon and the short DNA repeats seemed to be associated preferably with genes. The predicted tomato genes were compared with the complete Arabidopsis genome. Eleven out of 15 tomato open reading frames were found to be colinear with segments on five Arabidopsis bacterial artificial chromosome/P1-derived artificial chromosome clones. The synteny patterns, however, did not reveal duplicated segments in Arabidopsis, where over half of the genome is duplicated. Our analysis indicated that the microsynteny between the tomato and Arabidopsis genomes was still conserved at a very small scale but was complicated by the large number of gene families in the Arabidopsis genome. PMID:11457984
Identification of the electron transfer flavoprotein as an upregulated enzyme in the benzoate utilization of Desulfotignum balticum.

PubMed

Habe, Hiroshi; Kobuna, Akinori; Hosoda, Akifumi; Kosaka, Tomoyuki; Endoh, Takayuki; Tamura, Hiroto; Yamane, Hisakazu; Nojiri, Hideaki; Omori, Toshio; Watanabe, Kazuya

2009-07-01

Desulfotignum balticum utilizes benzoate coupled to sulfate reduction. Two-dimensional polyacrylamide gel electrophoresis (2D-PAGE) analysis was conducted to detect proteins that increased more after growth on benzoate than on butyrate. A comparison of proteins on 2D gels showed that at least six proteins were expressed. The N-terminal sequences of three proteins exhibited significant identities with the alpha and beta subunits of electron transfer flavoprotein (ETF) from anaerobic aromatic-degraders. By sequence analysis of the fosmid clone insert (37,590 bp) containing the genes encoding the ETF subunits, we identified three genes, whose deduced amino acid sequences showed 58%, 74%, and 62% identity with those of Gmet_2267 (Fe-S oxidoreductase), Gmet_2266 (ETF beta subunit), and Gmet_2265 (ETF alpha subunit) respectively, which exist within the 300-kb genomic island of aromatic-degradation genes from Geobacter metallireducens GS-15. The genes encoding ETF subunits found in this study were upregulated in benzoate utilization.

Sequencing and functional analysis of the nifENXorf1orf2 gene cluster of Herbaspirillum seropedicae.

PubMed

Klassen, G; Pedrosa, F O; Souza, E M; Yates, M G; Rigo, L U

1999-12-01

A 5.1-kb DNA fragment from the nifHDK region of H. seropedicae was isolated and sequenced. Sequence analysis showed the presence of nifENXorf1orf2 but nifTY were not present. No nif or consensus promoter was identified. Furthermore, orf1 expression occurred only under nitrogen-fixing conditions and no promoter activity was detected between nifK and nifE, suggesting that these genes are expressed from the upstream nifH promoter and are parts of a unique nif operon. Mutagenesis studies indicate that nifN was essential for nitrogenase activity whereas nifXorf1orf2 were not. High homology between the C-terminal region of the NifX and NifB proteins from H. seropedicae was observed. Since the NifX and NifY proteins are important for FeMo cofactor (FeMoco) synthesis, we propose that alternative proteins with similar activities exist in H. seropedicae.
Calibrating genomic and allelic coverage bias in single-cell sequencing.

PubMed

Zhang, Cheng-Zhong; Adalsteinsson, Viktor A; Francis, Joshua; Cornils, Hauke; Jung, Joonil; Maire, Cecile; Ligon, Keith L; Meyerson, Matthew; Love, J Christopher

2015-04-16

Artifacts introduced in whole-genome amplification (WGA) make it difficult to derive accurate genomic information from single-cell genomes and require different analytical strategies from bulk genome analysis. Here, we describe statistical methods to quantitatively assess the amplification bias resulting from whole-genome amplification of single-cell genomic DNA. Analysis of single-cell DNA libraries generated by different technologies revealed universal features of the genome coverage bias predominantly generated at the amplicon level (1-10 kb). The magnitude of coverage bias can be accurately calibrated from low-pass sequencing (∼0.1×) to predict the depth-of-coverage yield of single-cell DNA libraries sequenced at arbitrary depths. We further provide a benchmark comparison of single-cell libraries generated by multi-strand displacement amplification (MDA) and multiple annealing and looping-based amplification cycles (MALBAC). Finally, we develop statistical models to calibrate allelic bias in single-cell whole-genome amplification and demonstrate a census-based strategy for efficient and accurate variant detection from low-input biopsy samples.
Calibrating genomic and allelic coverage bias in single-cell sequencing

PubMed Central

Francis, Joshua; Cornils, Hauke; Jung, Joonil; Maire, Cecile; Ligon, Keith L.; Meyerson, Matthew; Love, J. Christopher

2016-01-01

Artifacts introduced in whole-genome amplification (WGA) make it difficult to derive accurate genomic information from single-cell genomes and require different analytical strategies from bulk genome analysis. Here, we describe statistical methods to quantitatively assess the amplification bias resulting from whole-genome amplification of single-cell genomic DNA. Analysis of single-cell DNA libraries generated by different technologies revealed universal features of the genome coverage bias predominantly generated at the amplicon level (1–10 kb). The magnitude of coverage bias can be accurately calibrated from low-pass sequencing (~0.1 ×) to predict the depth-of-coverage yield of single-cell DNA libraries sequenced at arbitrary depths. We further provide a benchmark comparison of single-cell libraries generated by multi-strand displacement amplification (MDA) and multiple annealing and looping-based amplification cycles (MALBAC). Finally, we develop statistical models to calibrate allelic bias in single-cell whole-genome amplification and demonstrate a census-based strategy for efficient and accurate variant detection from low-input biopsy samples. PMID:25879913
Confirmation of a novel siadenovirus species detected in raptors: partial sequence and phylogenetic analysis.

PubMed

Kovács, Endre R; Benko, Mária

2009-03-01

Partial genome characterisation of a novel adenovirus, found recently in organ samples of multiple species of dead birds of prey, was carried out by sequence analysis of PCR-amplified DNA fragments. The virus, named as raptor adenovirus 1 (RAdV-1), has originally been detected by a nested PCR method with consensus primers targeting the adenoviral DNA polymerase gene. Phylogenetic analysis with the deduced amino acid sequence of the small PCR product has implied a new siadenovirus type present in the samples. Since virus isolation attempts remained unsuccessful, further characterisation of this putative novel siadenovirus was carried out with the use of PCR on the infected organ samples. The DNA sequence of the central genome part of RAdV-1, encompassing nine full (pTP, 52K, pIIIa, III, pVII, pX, pVI, hexon, protease) and two partial (DNA polymerase and DBP) genes and exceeding 12 kb pairs in size, was determined. Phylogenetic tree reconstructions, based on several genes, unambiguously confirmed the preliminary classification of RAdV-1 as a new species within the genus Siadenovirus. Further study of RAdV-1 is of interest since it represents a rare adenovirus genus of yet undetermined host origin.
Genome Sequencing and Analysis of Yersinia pestis KIM D27, an Avirulent Strain Exempt from Select Agent Regulation

PubMed Central

Losada, Liliana; Varga, John J.; Hostetler, Jessica; Radune, Diana; Kim, Maria; Durkin, Scott; Schneewind, Olaf; Nierman, William C.

2011-01-01

Yersinia pestis is the causative agent of the plague. Y. pestis KIM 10+ strain was passaged and selected for loss of the 102 kb pgm locus, resulting in an attenuated strain, KIM D27. In this study, whole genome sequencing was performed on KIM D27 in order to identify any additional differences. Initial assemblies of 454 data were highly fragmented, and various bioinformatic tools detected between 15 and 465 SNPs and INDELs when comparing both strains, the vast majority associated with A or T homopolymer sequences. Consequently, Illumina sequencing was performed to improve the quality of the assembly. Hybrid sequence assemblies were performed and a total of 56 validated SNP/INDELs and 5 repeat differences were identified in the D27 strain relative to the published KIM 10+ sequence. However, further analysis showed that 55 of these SNP/INDELs and 3 repeats were errors in the KIM 10+ reference sequence. We conclude that both 454 and Illumina sequencing were required to obtain the most accurate and rapid sequence results for Y. pestis KIM D27. SNP and INDEL calls were most accurate when both Newbler and CLC Genomics Workbench were employed. For purposes of obtaining high quality genome sequence differences between strains, any identified differences should be verified in both the new and reference genomes. PMID:21559501
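The abstract notes that most of the initial SNP/INDEL calls fell in A or T homopolymer sequences. As a rough illustration of how such calls might be flagged for closer review, the Python sketch below tests whether a variant position lies inside an A or T run. It is not taken from the study, Newbler, or CLC Genomics Workbench: the function name, the 0-based coordinate convention, and the minimum run length of 4 are assumptions.

```python
def in_at_homopolymer(seq: str, pos: int, min_run: int = 4) -> bool:
    """Return True if 0-based position `pos` of `seq` lies inside a run of
    identical A or T bases at least `min_run` long.
    Hypothetical helper; the threshold is an assumed value."""
    if not 0 <= pos < len(seq):
        raise IndexError("position outside sequence")
    base = seq[pos].upper()
    if base not in ("A", "T"):
        return False
    # Extend left and right while the neighbouring base matches.
    left = pos
    while left > 0 and seq[left - 1].upper() == base:
        left -= 1
    right = pos
    while right < len(seq) - 1 and seq[right + 1].upper() == base:
        right += 1
    return right - left + 1 >= min_run


if __name__ == "__main__":
    reference = "GCAAAAAGTC"  # toy sequence with a five-base A run
    for position in (0, 3, 8):
        print(position, in_at_homopolymer(reference, position))
```

A flag of this kind would only mark calls for re-checking; as the abstract stresses, differences still need to be verified in both the new and the reference genome.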
Genome sequencing and analysis of Yersina pestis KIM D27, an avirulent strain exempt from select agent regulation.

PubMed

Losada, Liliana; Varga, John J; Hostetler, Jessica; Radune, Diana; Kim, Maria; Durkin, Scott; Schneewind, Olaf; Nierman, William C

2011-04-29

Yersinia pestis is the causative agent of the plague. Y. pestis KIM 10+ strain was passaged and selected for loss of the 102 kb pgm locus, resulting in an attenuated strain, KIM D27. In this study, whole genome sequencing was performed on KIM D27 in order to identify any additional differences. Initial assemblies of 454 data were highly fragmented, and various bioinformatic tools detected between 15 and 465 SNPs and INDELs when comparing both strains, the vast majority associated with A or T homopolymer sequences. Consequently, Illumina sequencing was performed to improve the quality of the assembly. Hybrid sequence assemblies were performed and a total of 56 validated SNP/INDELs and 5 repeat differences were identified in the D27 strain relative to published KIM 10+ sequence. However, further analysis showed that 55 of these SNP/INDELs and 3 repeats were errors in the KIM 10+ reference sequence. We conclude that both 454 and Illumina sequencing were required to obtain the most accurate and rapid sequence results for Y. pestis KIMD27. SNP and INDELS calls were most accurate when both Newbler and CLC Genomics Workbench were employed. For purposes of obtaining high quality genome sequence differences between strains, any identified differences should be verified in both the new and reference genomes.
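The abstract above reports that SNP and INDEL calls were most accurate when two callers were used together. A minimal Python sketch of one simple way to combine two call sets is shown here, assuming that "used together" means keeping only concordant calls; that reading, the function name `concordant_calls`, and the example variants are all hypothetical and not taken from the study.

```python
def concordant_calls(calls_a, calls_b):
    """Return variants reported by both callers, sorted by position.

    Each call is a (position, ref_allele, alt_allele) tuple. Requiring an
    exact match on all three fields is an assumption of this sketch.
    """
    return sorted(set(calls_a) & set(calls_b))


# Hypothetical call sets from two callers (not data from the study).
caller_1 = [(100, "A", "G"), (250, "T", "TA"), (900, "C", "T")]
caller_2 = [(100, "A", "G"), (900, "C", "A"), (250, "T", "TA")]

shared = concordant_calls(caller_1, caller_2)
print(shared)
```

Calls that appear in only one set (here the two conflicting calls at position 900) would be the natural candidates for manual verification against both the new and the reference genome.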
Genome Sequences of Akhmeta Virus, an Early Divergent Old World Orthopoxvirus.

PubMed

Gao, Jinxin; Gigante, Crystal; Khmaladze, Ekaterine; Liu, Pengbo; Tang, Shiyuyun; Wilkins, Kimberly; Zhao, Kun; Davidson, Whitni; Nakazawa, Yoshinori; Maghlakelidze, Giorgi; Geleishvili, Marika; Kokhreidze, Maka; Carroll, Darin S; Emerson, Ginny; Li, Yu

2018-05-12

Annotated whole genome sequences of three isolates of the Akhmeta virus (AKMV), a novel species of orthopoxvirus (OPXV), isolated from the Akhmeta and Vani regions of the country Georgia, are presented and discussed. The AKMV genome is similar in genomic content and structure to that of the cowpox virus (CPXV), but a lower sequence identity was found between AKMV and Old World OPXVs than between other known species of Old World OPXVs. Phylogenetic analysis showed that AKMV diverged prior to other Old World OPXV. AKMV isolates formed a monophyletic clade in the OPXV phylogeny, yet the sequence variability between AKMV isolates was higher than between the monkeypox virus strains in the Congo basin and West Africa. An AKMV isolate from Vani contained approximately six kb sequence in the left terminal region that shared a higher similarity with CPXV than with other AKMV isolates, whereas the rest of the genome was most similar to AKMV, suggesting recombination between AKMV and CPXV in a region containing several host range and virulence genes.
A novel tandem repeat sequence located on human chromosome 4p: isolation and characterization.

PubMed

Kogi, M; Fukushige, S; Lefevre, C; Hadano, S; Ikeda, J E

1997-06-01

In an effort to analyze the genomic region of the distal half of human chromosome 4p, to where Huntington disease and other diseases have been mapped, we have isolated the cosmid clone (CRS447) that was likely to contain a region with specific repeat sequences. Clone CRS447 was subjected to detailed analysis, including chromosome mapping, restriction mapping, and DNA sequencing. Chromosome mapping by both a human-CHO hybrid cell panel and FISH revealed that CRS447 was predominantly located in the 4p15.1-15.3 region. CRS447 was shown to consist of tandem repeats of 4.7-kb units present on chromosome 4p. A single EcoRI unit was subcloned (pRS447), and the complete sequence was determined as 4752 nucleotides. When pRS447 was used as a probe, the number of copies of this repeat per haploid genome was estimated to be 50-70. Sequence analysis revealed that it contained two internal CA repeats and one putative ORF. Database search established that this sequence was unreported. However, two homologous STS markers were found in the database. We concluded that CRS447/pRS447 is a novel tandem repeat sequence that is mainly specific to human chromosome 4p.
Complete Genome Sequence of the Quality Control Strain Staphylococcus aureus subsp. aureus ATCC 25923

PubMed Central

Treangen, Todd J.; Maybank, Rosslyn A.; Enke, Sana; Friss, Mary Beth; Diviak, Lynn F.; Karaolis, David K. R.; Koren, Sergey; Ondov, Brian; Phillippy, Adam M.; Bergman, Nicholas H.

2014-01-01

Staphylococcus aureus subsp. aureus ATCC 25923 is commonly used as a control strain for susceptibility testing to antibiotics and as a quality control strain for commercial products. We present the completed genome sequence for the strain, consisting of the chromosome and a 27.5-kb plasmid. PMID:25377701
Genetic mapping of the LOBED LEAF 1 (ClLL1) gene to a 127.6-kb region in watermelon (Citrullus lanatus L.)

PubMed Central

Wei, Chunhua; Chen, Xiner; Wang, Zhongyuan; Liu, Qiyan; Li, Hao; Zhang, Yong; Ma, Jianxiang; Yang, Jianqiang

2017-01-01

The lobed leaf character is a unique morphologic trait in crops, featuring many potential advantages for agricultural productivity. Although the majority of watermelon varieties feature lobed leaves, the genetic factors responsible for lobed leaf formation remain elusive. The F2:3 leaf shape segregating population offers the opportunity to study the underlying mechanism of lobed leaf formation in watermelon. Genetic analysis revealed that a single dominant allele (designated ClLL1) controlled the lobed leaf trait. A large-sized F3:4 population derived from F2:3 individuals was used to map ClLL1. A total of 5,966 reliable SNPs and indels were identified genome-wide via a combination of BSA and RNA-seq. Using the validated SNP and indel markers, the location of ClLL1 was narrowed down to a 127.6-kb region between markers W08314 and W07061, containing 23 putative ORFs. Expression analysis via qRT-PCR revealed differential expression patterns (fold-changes above 2-fold or below 0.5-fold) of three ORFs (ORF3, ORF11, and ORF18) between lobed and non-lobed leaf plants. Based on gene annotation and expression analysis, ORF18 (encoding an uncharacterized protein) and ORF22 (encoding a homeobox-leucine zipper-like protein) were considered as most likely candidate genes. Furthermore, sequence analysis revealed no polymorphisms in cDNA sequences of ORF18; however, two notable deletions were identified in ORF22. This study is the first report to map a leaf shape gene in watermelon and will facilitate cloning and functional characterization of ClLL1 in future studies. PMID:28704497
Genetic mapping of the LOBED LEAF 1 (ClLL1) gene to a 127.6-kb region in watermelon (Citrullus lanatus L.).

PubMed

Wei, Chunhua; Chen, Xiner; Wang, Zhongyuan; Liu, Qiyan; Li, Hao; Zhang, Yong; Ma, Jianxiang; Yang, Jianqiang; Zhang, Xian

2017-01-01

The lobed leaf character is a unique morphologic trait in crops, featuring many potential advantages for agricultural productivity. Although the majority of watermelon varieties feature lobed leaves, the genetic factors responsible for lobed leaf formation remain elusive. The F2:3 leaf shape segregating population offers the opportunity to study the underlying mechanism of lobed leaf formation in watermelon. Genetic analysis revealed that a single dominant allele (designated ClLL1) controlled the lobed leaf trait. A large-sized F3:4 population derived from F2:3 individuals was used to map ClLL1. A total of 5,966 reliable SNPs and indels were identified genome-wide via a combination of BSA and RNA-seq. Using the validated SNP and indel markers, the location of ClLL1 was narrowed down to a 127.6-kb region between markers W08314 and W07061, containing 23 putative ORFs. Expression analysis via qRT-PCR revealed differential expression patterns (fold-changes above 2-fold or below 0.5-fold) of three ORFs (ORF3, ORF11, and ORF18) between lobed and non-lobed leaf plants. Based on gene annotation and expression analysis, ORF18 (encoding an uncharacterized protein) and ORF22 (encoding a homeobox-leucine zipper-like protein) were considered as most likely candidate genes. Furthermore, sequence analysis revealed no polymorphisms in cDNA sequences of ORF18; however, two notable deletions were identified in ORF22. This study is the first report to map a leaf shape gene in watermelon and will facilitate cloning and functional characterization of ClLL1 in future studies.
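The expression criterion quoted in the abstract above (fold-changes above 2-fold or below 0.5-fold) can be sketched as a small classifier. This is only an illustration in Python: the function name, the ORF labels, and the fold-change values are hypothetical, and how the study derived fold-changes from the qRT-PCR data is not stated in the abstract.

```python
def classify_fold_change(fold_change, up=2.0, down=0.5):
    """Label a fold-change using the thresholds quoted in the abstract."""
    if fold_change <= 0:
        raise ValueError("fold-change must be positive")
    if fold_change > up:
        return "up"
    if fold_change < down:
        return "down"
    return "unchanged"


# Hypothetical fold-changes (lobed vs. non-lobed); not data from the study.
example = {"ORF_a": 3.1, "ORF_b": 0.4, "ORF_c": 1.2}
labels = {name: classify_fold_change(fc) for name, fc in example.items()}
print(labels)
```

Values exactly at a threshold are treated as unchanged here, since the abstract says "above" and "below".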
Genetic and functional properties of uncultivated thermophilic crenarchaeotes from a subsurface gold mine as revealed by analysis of genome fragments.

PubMed

Nunoura, Takuro; Hirayama, Hisako; Takami, Hideto; Oida, Hanako; Nishi, Shinro; Shimamura, Shigeru; Suzuki, Yohey; Inagaki, Fumio; Takai, Ken; Nealson, Kenneth H; Horikoshi, Koki

2005-12-01

Within the phylum Crenarchaeota, only some members of the hyperthermophilic class Thermoprotei have been cultivated and characterized. In this study, we have constructed a metagenomic library from a microbial mat formation in a subsurface hot water stream of the Hishikari gold mine, Japan, and sequenced genome fragments of two different phylogroups of uncultivated thermophilic Crenarchaeota: (i) hot water crenarchaeotic group (HWCG) I (41.2 kb), and (ii) HWCG III (49.3 kb). The genome fragment of HWCG I contained a 16S rRNA gene, two tRNA genes and 35 genes encoding proteins but no 23S rRNA gene. Among the genes encoding proteins, several genes for putative aerobic-type carbon monoxide dehydrogenase represented a potential clue with regard to the yet unknown metabolism of HWCG I Archaea. The genome fragment of HWCG III contained a 16S/23S rRNA operon and 44 genes encoding proteins. In the 23S rRNA gene, we detected a group I intron encoding a homing endonuclease, similar to those detected in hyperthermophilic Crenarchaeota and Bacteria, as well as eukaryotic organelles. The reconstructed phylogenetic tree based on the 23S rRNA gene sequence reinforced the intermediate phylogenetic affiliation of HWCG III bridging the hyperthermophilic and non-thermophilic uncultivated Crenarchaeota.
A 590 kb deletion caused by non-allelic homologous recombination between two LINE-1 elements in a patient with mesomelia-synostosis syndrome.

PubMed

Kohmoto, Tomohiro; Naruto, Takuya; Watanabe, Miki; Fujita, Yuji; Ujiro, Sae; Okamoto, Nana; Horikawa, Hideaki; Masuda, Kiyoshi; Imoto, Issei

2017-04-01

Mesomelia-synostoses syndrome (MSS) is a rare, autosomal-dominant, syndromal osteochondrodysplasia characterized by mesomelic limb shortening, acral synostoses, and multiple congenital malformations due to a non-recurrent deletion at 8q13 that always encompasses two coding genes, SULF1 and SLCO5A1. To date, five unrelated patients have been reported worldwide, and MSS was previously proposed not to be a genomic disorder associated with deletions recurring from non-allelic homologous recombination (NAHR) in at least two analyzed cases. We conducted targeted gene panel sequencing and subsequent array-based copy number analysis in an 11-year-old undiagnosed Japanese female patient with multiple congenital anomalies that included mesomelic limb shortening and detected a novel 590 kb deletion at 8q13 encompassing the same gene set as reported previously, resulting in the diagnosis of MSS. Breakpoint sequences of the deleted region in our case demonstrated the first LINE-1 (L1)-mediated unequal NAHR event utilizing two distant L1 elements as homology substrates in this disease, which may represent a novel causative mechanism of the 8q13 deletion, expanding the range of mechanisms involved in the chromosomal rearrangements responsible for MSS. © 2017 Wiley Periodicals, Inc.
Evidence for large inversion polymorphisms in the human genome from HapMap data

PubMed Central

Bansal, Vikas; Bashir, Ali; Bafna, Vineet

2007-01-01

Knowledge about structural variation in the human genome has grown tremendously in the past few years. However, inversions represent a class of structural variation that remains difficult to detect. We present a statistical method to identify large inversion polymorphisms using unusual Linkage Disequilibrium (LD) patterns from high-density SNP data. The method is designed to detect chromosomal segments that are inverted (in a majority of the chromosomes) in a population with respect to the reference human genome sequence. We demonstrate the power of this method to detect such inversion polymorphisms through simulations done using the HapMap data. Application of this method to the data from the first phase of the International HapMap project resulted in 176 candidate inversions ranging from 200 kb to several megabases in length. Our predicted inversions include an 800-kb polymorphic inversion at 7p22, a 1.1-Mb inversion at 16p12, and a novel 1.2-Mb inversion on chromosome 10 that is supported by the presence of two discordant fosmids. Analysis of the genomic sequence around inversion breakpoints showed that 11 predicted inversions are flanked by pairs of highly homologous repeats in the inverted orientation. In addition, for three candidate inversions, the inverted orientation is represented in the Celera genome assembly. Although the power of our method to detect inversions is restricted because of inherently noisy LD patterns in population data, inversions predicted by our method represent strong candidates for experimental validation and analysis. PMID:17185644
Comparative sequence analysis of the potato cyst nematode resistance locus H1 reveals a major lack of co-linearity between three haplotypes in potato (Solanum tuberosum ssp.).

PubMed

Finkers-Tomczak, Anna; Bakker, Erin; de Boer, Jan; van der Vossen, Edwin; Achenbach, Ute; Golas, Tomasz; Suryaningrat, Suwardi; Smant, Geert; Bakker, Jaap; Goverse, Aska

2011-02-01

The H1 locus confers resistance to the potato cyst nematode Globodera rostochiensis pathotypes 1 and 4. It is positioned at the distal end of chromosome V of the diploid Solanum tuberosum genotype SH83-92-488 (SH) on an introgression segment derived from S. tuberosum ssp. andigena. Markers from a high-resolution genetic map of the H1 locus (Bakker et al. in Theor Appl Genet 109:146-152, 2004) were used to screen a BAC library to construct a physical map covering a 341-kb region of the resistant haplotype coming from SH. For comparison, physical maps were also generated of the two haplotypes from the diploid susceptible genotype RH89-039-16 (S. tuberosum ssp. tuberosum/S. phureja), spanning syntenic regions of 700 and 319 kb. Gene predictions on the genomic segments resulted in the identification of a large cluster consisting of variable numbers of the CC-NB-LRR type of R genes for each haplotype. Furthermore, the regions were interspersed with numerous transposable elements and genes coding for an extensin-like protein and an amino acid transporter. Comparative analysis revealed a major lack of gene order conservation in the sequences of the three closely related haplotypes. Our data provide insight in the evolutionary mechanisms shaping the H1 locus and will facilitate the map-based cloning of the H1 resistance gene.
Complete genome sequence of a new bipartite begomovirus infecting fluted pumpkin (Telfairia occidentalis) plants in Cameroon.

PubMed

Leke, Walter N; Khatabi, Behnam; Fondong, Vincent N; Brown, Judith K

2016-08-01

The complete genome sequence was determined and characterized for a previously unreported bipartite begomovirus from fluted pumpkin (Telfairia occidentalis, family Cucurbitaceae) plants displaying mosaic symptoms in Cameroon. The DNA-A and DNA-B components were ~2.7 kb and ~2.6 kb in size, and the arrangement of viral coding regions on the genomic components was like those characteristic of other known bipartite begomoviruses originating in the Old World. While the DNA-A component was more closely related to that of chayote yellow mosaic virus (ChaYMV), at 78 %, the DNA-B component was more closely related to that of soybean chlorotic blotch virus (SbCBV), at 64 %. This newly discovered bipartite Old World virus is herein named telfairia mosaic virus (TelMV).
Tomato chocolate spot virus, a member of a new torradovirus species that causes a necrosis-associated disease of tomato in Guatemala.

PubMed

Batuman, O; Kuo, Y-W; Palmieri, M; Rojas, M R; Gilbertson, R L

2010-06-01

Tomatoes in Guatemala have been affected by a new disease, locally known as "mancha de chocolate" (chocolate spot). The disease is characterized by distinct necrotic spots on leaves, stems and petioles that eventually expand and cause a dieback of apical tissues. Samples from symptomatic plants tested negative for infection by tomato spotted wilt virus, tobacco streak virus, tobacco etch virus and other known tomato-infecting viruses. A virus-like agent was sap-transmitted from diseased tissue to Nicotiana benthamiana and, when graft-transmitted to tomato, this agent induced chocolate spot symptoms. This virus-like agent also was sap-transmitted to Datura stramonium and Nicotiana glutinosa, but not to a range of non-solanaceous indicator plants. Icosahedral virions approximately 28-30 nm in diameter were purified from symptomatic N. benthamiana plants. When rub-inoculated onto leaves of N. benthamiana plants, these virions induced symptoms indistinguishable from those in N. benthamiana plants infected with the sap-transmissible virus associated with chocolate spot disease. Tomatoes inoculated with sap or grafted with shoots from N. benthamiana plants infected with purified virions developed typical chocolate spot symptoms, consistent with this virus being the causal agent of the disease. Analysis of nucleic acids associated with purified virions of the chocolate-spot-associated virus, revealed a genome composed of two single-stranded RNAs of approximately 7.5 and approximately 5.1 kb. Sequence analysis of these RNAs revealed a genome organization similar to recently described torradoviruses, a new group of picorna-like viruses causing necrosis-associated diseases of tomatoes in Europe [tomato torrado virus (ToTV)] and Mexico [tomato apex necrosis virus (ToANV) and tomato marchitez virus (ToMarV)]. 
Thus, the approximately 7.5 kb and approximately 5.1 kb RNAs of the chocolate-spot-associated virus corresponded to the torradovirus RNA1 and RNA2, respectively; however, sequence comparisons revealed 64-83% identities with RNA1 and RNA2 sequences of ToTV, ToANV and ToMarV. Together, these results indicate that the chocolate-spot-associated virus is a member of a distinct torradovirus species and, thus, another member of the recently established genus Torradovirus in the family Secoviridae. The name tomato chocolate spot virus is proposed.
Mechanisms of haplotype divergence at the RGA08 nucleotide-binding leucine-rich repeat gene locus in wild banana (Musa balbisiana).

PubMed

Baurens, Franc-Christophe; Bocs, Stéphanie; Rouard, Mathieu; Matsumoto, Takashi; Miller, Robert N G; Rodier-Goud, Marguerite; MBéguié-A-MBéguié, Didier; Yahiaoui, Nabila

2010-07-16

Comparative sequence analysis of complex loci such as resistance gene analog clusters allows estimating the degree of sequence conservation and mechanisms of divergence at the intraspecies level. In banana (Musa sp.), two diploid wild species Musa acuminata (A genome) and Musa balbisiana (B genome) contribute to the polyploid genome of many cultivars. The M. balbisiana species is associated with vigour and tolerance to pests and disease and little is known on the genome structure and haplotype diversity within this species. Here, we compare two genomic sequences of 253 and 223 kb corresponding to two haplotypes of the RGA08 resistance gene analog locus in M. balbisiana "Pisang Klutuk Wulung" (PKW). Sequence comparison revealed two regions of contrasting features. The first is a highly colinear gene-rich region where the two haplotypes diverge only by single nucleotide polymorphisms and two repetitive element insertions. The second corresponds to a large cluster of RGA08 genes, with 13 and 18 predicted RGA genes and pseudogenes spread over 131 and 152 kb respectively on each haplotype. The RGA08 cluster is enriched in repetitive element insertions, in duplicated non-coding intergenic sequences including low complexity regions and shows structural variations between haplotypes. Although some allelic relationships are retained, a large diversity of RGA08 genes occurs in this single M. balbisiana genotype, with several RGA08 paralogs specific to each haplotype. The RGA08 gene family has evolved by mechanisms of unequal recombination, intragenic sequence exchange and diversifying selection. An unequal recombination event taking place between duplicated non-coding intergenic sequences resulted in a different RGA08 gene content between haplotypes pointing out the role of such duplicated regions in the evolution of RGA clusters. Based on the synonymous substitution rate in coding sequences, we estimated a 1 million year divergence time for these M. balbisiana haplotypes. 
A large RGA08 gene cluster identified in wild banana corresponds to a highly variable genomic region between haplotypes surrounded by conserved flanking regions. High level of sequence identity (70 to 99%) of the genic and intergenic regions suggests a recent and rapid evolution of this cluster in M. balbisiana.
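The divergence-time estimate in the abstract above is based on the synonymous substitution rate. A minimal Python sketch of the standard molecular-clock relation, T = Ks / (2r), is given here for orientation. The abstract does not state the exact formula or the values used, so the function name, the Ks value, and the rate below are assumptions chosen only to illustrate the arithmetic.

```python
def divergence_time_years(ks, rate_per_site_per_year):
    """Molecular-clock estimate T = Ks / (2 * r).

    ks: synonymous substitutions per synonymous site between two sequences.
    rate_per_site_per_year: assumed synonymous substitution rate r.
    The factor 2 reflects that both lineages accumulate substitutions
    independently after the split.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    if ks < 0:
        raise ValueError("Ks cannot be negative")
    return ks / (2.0 * rate_per_site_per_year)


# Hypothetical inputs for illustration only (not values from the abstract).
t = divergence_time_years(ks=0.009, rate_per_site_per_year=4.5e-9)
print(f"{t:.3g} years")
```

With these illustrative inputs the estimate is on the order of one million years; a different assumed rate scales the result inversely.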
DNA sequence analysis of the photosynthesis region of Rhodobacter sphaeroides 2.4.1.

PubMed

Choudhary, M; Kaplan, S

2000-02-15

This paper describes the DNA sequence of the photosynthesis region of Rhodobacter sphaeroides 2.4.1 (T). The photosynthesis gene cluster is located within an approximately 73 kb Ase I genomic DNA fragment containing the puf, puhA, cycA and puc operons. A total of 65 open reading frames (ORFs) have been identified, of which 61 showed significant similarity to genes/proteins of other organisms while only four did not reveal any significant sequence similarity to any gene/protein sequences in the database. The data were compared with the corresponding genes/ORFs from a different strain of R. sphaeroides and Rhodobacter capsulatus, a close relative of R. sphaeroides. A detailed analysis of the gene organization in the photosynthesis region revealed a similar gene order in both species with some notable differences located in the pucBAC-cycA region. In addition, photosynthesis gene regulatory protein (PpsR, FNR, IHF) binding motifs in upstream sequences of a number of photosynthesis genes have been identified and shown to differ between these two species. The difference in gene organization relative to pucBAC and cycA suggests that this region originated independently of the photosynthesis gene cluster of R. sphaeroides.
Comparative Genomic and Morphological Analyses of Listeria Phages Isolated from Farm Environments

PubMed Central

Denes, Thomas; Ackermann, Hans-Wolfgang; Moreno Switt, Andrea I.; Wiedmann, Martin; den Bakker, Henk C.

2014-01-01

The genus Listeria is ubiquitous in the environment and includes the globally important food-borne pathogen Listeria monocytogenes. While the genomic diversity of Listeria has been well studied, considerably less is known about the genomic and morphological diversity of Listeria bacteriophages. In this study, we sequenced and analyzed the genomes of 14 Listeria phages isolated mostly from New York dairy farm environments as well as one related Enterococcus faecalis phage to obtain information on genome characteristics and diversity. We also examined 12 of the phages by electron microscopy to characterize their morphology. These Listeria phages, based on gene orthology and morphology, together with previously sequenced Listeria phages could be classified into five orthoclusters, including one novel orthocluster. One orthocluster (orthocluster I) consists of large-genome (∼135-kb) myoviruses belonging to the genus “Twort-like viruses,” three orthoclusters (orthoclusters II to IV) contain small-genome (36- to 43-kb) siphoviruses with icosahedral heads, and the novel orthocluster V contains medium-sized-genome (∼66-kb) siphoviruses with elongated heads. A novel orthocluster (orthocluster VI) of E. faecalis phages, with medium-sized genomes (∼56 kb), was identified, which grouped together and shares morphological features with the novel Listeria phage orthocluster V. This new group of phages (i.e., orthoclusters V and VI) is composed of putative lytic phages that may prove to be useful in phage-based applications for biocontrol, detection, and therapeutic purposes. PMID:24837381

Evaluation of Linkage Disequilibrium Pattern and Association Study on Seed Oil Content in Brassica napus Using ddRAD Sequencing.

PubMed

Wu, Zhikun; Wang, Bo; Chen, Xun; Wu, Jiangsheng; King, Graham J; Xiao, Yingjie; Liu, Kede

2016-01-01

High-density genetic markers are the prerequisite for understanding linkage disequilibrium (LD) and genome-wide association studies (GWASs) of complex traits in crops. To evaluate the LD pattern in oilseed rape, we sequenced a previous association panel containing 189 B. napus inbred lines using double-digested restriction-site associated DNA (ddRAD) and genotyped 19,327 RAD tags. A total of 15,921 RAD tags were assigned to a published genetic linkage map and the majority (71.1%) of these tags were uniquely mapped to the draft reference genome "Darmor-bzh." The distance of LD decay was 1,214 kb across the genome at the background level (r² = 0.26), with the distances of LD decay being 405 kb and 2,111 kb in the A and C subgenomes, respectively. A total of 361 haplotype blocks with length > 100 kb were identified in the entire genome. The association panel could be classified into two groups, P1 and P2, which are essentially consistent with the geographical origins of varieties. A large number of group-specific haplotypes were identified, reflecting that varieties in the P1 and P2 groups experienced distinct selection in breeding programs to adapt to their different growth habitats. GWAS repeatedly detected two loci significantly associated with oil content of seeds based on the developed SNPs, suggesting that the high-density SNPs were useful for understanding the genetic determinants of complex traits in GWAS.
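The LD statistic quoted in the abstract above (r squared between marker pairs) has a standard definition for two biallelic loci: D = p_AB - p_A * p_B and r² = D² / (p_A (1 - p_A) p_B (1 - p_B)). A minimal Python sketch follows, assuming phased haplotypes with alleles coded 0/1, which is a reasonable simplification for inbred lines. The function name and the toy haplotypes are hypothetical, and the study's own pipeline is not described in the abstract.

```python
def ld_r2(haplotypes):
    """Squared allele-frequency correlation (r^2) between two biallelic loci.

    haplotypes: list of (a, b) tuples, alleles coded 0 or 1 at locus A and B.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("no haplotypes given")
    p_a = sum(a for a, _ in haplotypes) / n
    p_b = sum(b for _, b in haplotypes) / n
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b  # coefficient of linkage disequilibrium
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a locus is monomorphic")
    return d * d / denom


# Toy haplotypes for illustration only.
complete_ld = [(1, 1), (1, 1), (0, 0), (0, 0)]
no_ld = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(ld_r2(complete_ld), ld_r2(no_ld))
```

An LD decay distance of the kind reported above would then be read off as the physical distance at which pairwise r² falls to the chosen background level.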
The complete sequence and structural analysis of human apolipoprotein B-100: relationship between apoB-100 and apoB-48 forms.

PubMed Central

Cladaras, C; Hadzopoulou-Cladaras, M; Nolte, R T; Atkinson, D; Zannis, V I

1986-01-01

We have isolated and sequenced overlapping cDNA clones covering the entire sequence of human apolipoprotein B-100 (apoB-100). DNA sequence analysis and determination of the mRNA transcription initiation site by S1 nuclease mapping showed that the apoB mRNA consists of 14,112 nucleotides including the 5' and 3' untranslated regions which are 128 and 301 nucleotides respectively. The DNA-derived protein sequence shows that apoB-100 is 513,000 daltons and contains 4560 amino acids including a 24-amino-acid-long signal peptide. The mol. wt of apoB-100 implies that there is one apoB molecule per LDL particle. Computer analysis of the predicted secondary structure of the protein showed that some of the potential alpha helical and beta sheet structures are amphipathic, whereas others have non-amphipathic neutral to apolar character. These latter regions may contribute to the formation of the lipid-binding domains of apoB-100. The protein contains 25 cysteines and 20 potential N-glycosylation sites. The majority of cysteines are distributed in the amino terminal portion of the protein. Four of the potential glycosylation sites are in predicted beta turn structures and may represent true glycosylation positions. ApoB lacks the tandem repeats which are characteristic of other apolipoproteins. The mean hydrophobicity ⟨H⟩ and helical hydrophobic moment ⟨μH⟩ profiles of apoB showed the presence of several potential helical regions with strong polar character and high hydrophobic moment. The region with the highest hydrophobic moment, between amino acid residues 3352 and 3369, contains five closely spaced, positively charged residues, and has sequence homology to the LDL receptor binding site of apoE. This region is flanked by three neighbouring regions with positively charged amino acids and high hydrophobic moment that are located between residues 3174 and 3681. 
One or more of these closely spaced apoB sequences may be involved in the formation of the LDL receptor-binding domain of apoB-100. Blotting analysis of intestinal RNA and hybridization of the blots with carboxy apoB cDNA probes produced a single 15-kb hybridization band whereas hybridization with amino terminal probes produced two hybridization bands of 15 and 8 kb. Our data indicate that both forms of apoB mRNA contain common sequences which extend from the amino terminal of apoB-100 to the vicinity of nucleotide residue 6300. These two messages may have resulted from differential splicing of the same primary apoB mRNA transcript. PMID:3030729
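The helical hydrophobic moment profile mentioned in the abstract above can be illustrated with the widely used definition in which each residue contributes a vector of length equal to its hydrophobicity, rotated by 100 degrees per residue for an alpha helix. The Python sketch below assumes that definition; whether the authors used exactly this form, and which hydrophobicity scale they applied, is not stated in the abstract, so the function names and all numeric inputs here are hypothetical.

```python
import math


def hydrophobic_moment(hydrophobicities, delta_deg=100.0):
    """Magnitude of the helical hydrophobic moment for one sequence window.

    hydrophobicities: per-residue hydrophobicity values (any scale).
    delta_deg: angular step between successive residues, 100 degrees
    for an ideal alpha helix.
    """
    delta = math.radians(delta_deg)
    sin_sum = sum(h * math.sin(i * delta) for i, h in enumerate(hydrophobicities))
    cos_sum = sum(h * math.cos(i * delta) for i, h in enumerate(hydrophobicities))
    return math.hypot(sin_sum, cos_sum)


def mean_hydrophobic_moment(hydrophobicities, delta_deg=100.0):
    """Hydrophobic moment divided by window length."""
    return hydrophobic_moment(hydrophobicities, delta_deg) / len(hydrophobicities)


# Toy windows (hypothetical values, not a real hydrophobicity scale).
uniform = [1.0] * 18                   # same value all around the helix
one_sided = [1.0, -1.0, -1.0, 1.0,     # hydrophobic residues recurring
             1.0, -1.0, -1.0, 1.0,     # roughly every 3-4 positions
             1.0, -1.0, -1.0, 1.0]
print(mean_hydrophobic_moment(uniform), mean_hydrophobic_moment(one_sided))
```

A window with identical hydrophobicity on every face of the helix gives a moment near zero, while a window whose hydrophobic residues cluster on one face gives a larger value; sliding such a window along a sequence yields a profile.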
The first complete chloroplast genome of the Genistoid legume Lupinus luteus: evidence for a novel major lineage-specific rearrangement and new insights regarding plastome evolution in the legume family

PubMed Central

Martin, Guillaume E.; Rousseau-Gueutin, Mathieu; Cordonnier, Solenn; Lima, Oscar; Michon-Coudouel, Sophie; Naquin, Delphine; de Carvalho, Julie Ferreira; Aïnouche, Malika; Salmon, Armel; Aïnouche, Abdelkader

2014-01-01

Background and Aims To date chloroplast genomes are available only for members of the non-protein amino acid-accumulating clade (NPAAA) Papilionoid lineages in the legume family (i.e. Millettioids, Robinoids and the ‘inverted repeat-lacking clade’, IRLC). It is thus very important to sequence plastomes from other lineages in order to better understand the unusual evolution observed in this model flowering plant family. To this end, the plastome of a lupine species, Lupinus luteus, was sequenced to represent the Genistoid lineage, a noteworthy but poorly studied legume group. Methods The plastome of L. luteus was reconstructed using Roche-454 and Illumina next-generation sequencing. Its structure, repetitive sequences, gene content and sequence divergence were compared with those of other Fabaceae plastomes. PCR screening and sequencing were performed in other allied legumes in order to determine the origin of a large inversion identified in L. luteus. Key Results The first sequenced Genistoid plastome (L. luteus: 155 894 bp) resulted in the discovery of a 36-kb inversion, embedded within the already known 50-kb inversion in the large single-copy (LSC) region of the Papilionoideae. This inversion occurs at the base or soon after the Genistoid emergence, and most probably resulted from a flip–flop recombination between identical 29-bp inverted repeats within two trnS genes. Comparative analyses of the chloroplast gene content of L. luteus vs. Fabaceae and extra-Fabales plastomes revealed the loss of the plastid rpl22 gene, and its functional relocation to the nucleus was verified using lupine transcriptomic data. An investigation into the evolutionary rate of coding and non-coding sequences among legume plastomes resulted in the identification of remarkably variable regions. Conclusions This study resulted in the discovery of a novel, major 36-kb inversion, specific to the Genistoids. 
Chloroplast mutational hotspots were also identified, which contain novel and potentially informative regions for molecular evolutionary studies at various taxonomic levels in the legumes. Taken together, the results provide new insights into the evolutionary landscape of the legume plastome. PMID:24769537
Comparative genomics of citric-acid producing Aspergillus niger ATCC 1015 versus enzyme-producing CBS 513.88

DOE Office of Scientific and Technical Information (OSTI.GOV)

Grigoriev, Igor V.; Baker, Scott E.; Andersen, Mikael R.

2011-04-28

The filamentous fungus Aspergillus niger exhibits great diversity in its phenotype. It is found globally, both as marine and terrestrial strains, produces both organic acids and hydrolytic enzymes in high amounts, and some isolates exhibit pathogenicity. Although the genome of an industrial enzyme-producing A. niger strain (CBS 513.88) has already been sequenced, the versatility and diversity of this species compels additional exploration. We therefore undertook whole genome sequencing of the acidogenic A. niger wild type strain (ATCC 1015), and produced a genome sequence of very high quality. Only 15 gaps are present in the sequence and half the telomeric regions have been elucidated. Moreover, sequence information from ATCC 1015 was utilized to improve the genome sequence of CBS 513.88. Chromosome-level comparisons uncovered several genome rearrangements, deletions, a clear case of strain-specific horizontal gene transfer, and identification of 0.8 megabase of novel sequence. Single nucleotide polymorphisms per kilobase (SNPs/kb) between the two strains were found to be exceptionally high (average: 7.8, maximum: 160 SNPs/kb). High variation within the species was confirmed with exo-metabolite profiling and phylogenetics. Detailed lists of alleles were generated, and genotypic differences were observed to accumulate in metabolic pathways essential to acid production and protein synthesis. A transcriptome analysis revealed up-regulation of the electron transport chain, specifically the alternative oxidative pathway in ATCC 1015, while CBS 513.88 showed significant up-regulation of genes relevant to glucoamylase A production, such as tRNA-synthases and protein transporters.
Our results and datasets from this integrative systems biology analysis resulted in a snapshot of fungal evolution and will support further optimization of cell factories based on filamentous fungi. [Supplemental materials (10 figures, three text documents and 16 tables) have been made available. The whole genome sequence for A. niger ATCC 1015 is available from NCBI under acc. no ACJE00000000. The updated sequence for A. niger CBS 513.88 is available from EMBL under acc. no AM269948-AM270415. The sequence data from the phylogeny study has been submitted to NCBI (GU296686-296739). Microarray data from this study is submitted to GEO as series GSE10983. Accession for reviewers is possible through: http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi token GSE10983] The dsmM_ANIGERa_coll511030F library and platform information is deposited at GEO under number GPL6758.
Targeted deletion of the 9p21 noncoding coronary artery disease risk interval in mice

DOE Office of Scientific and Technical Information (OSTI.GOV)

Visel, Axel; Zhu, Yiwen; May, Dalit

2010-01-01

Sequence polymorphisms in a 58-kb interval on chromosome 9p21 confer a markedly increased risk for coronary artery disease (CAD), the leading cause of death worldwide. The variants have a substantial impact on the epidemiology of CAD and other life-threatening vascular conditions since nearly a quarter of Caucasians are homozygous for risk alleles. However, the risk interval is devoid of protein-coding genes and the mechanism linking the region to CAD risk has remained enigmatic. Here we show that deletion of the orthologous 70-kb noncoding interval on mouse chromosome 4 affects cardiac expression of neighboring genes, as well as proliferation properties of vascular cells. Chr4delta70kb/delta70kb mice are viable, but show increased mortality both during development and as adults. Cardiac expression of two genes near the noncoding interval, Cdkn2a and Cdkn2b, is severely reduced in chr4delta70kb/delta70kb mice, indicating that distant-acting gene regulatory functions are located in the noncoding CAD risk interval. Allele-specific expression of Cdkn2b transcripts in heterozygous mice revealed that the deletion affects expression through a cis-acting mechanism. Primary cultures of chr4delta70kb/delta70kb aortic smooth muscle cells exhibited excessive proliferation and diminished senescence, a cellular phenotype consistent with accelerated CAD pathogenesis. Taken together, our results provide direct evidence that the CAD risk interval plays a pivotal role in regulation of cardiac Cdkn2a/b expression and suggest that this region affects CAD progression by altering the dynamics of vascular cell proliferation.
A report on extensive lateral genetic reciprocation between arsenic resistant Bacillus subtilis and Bacillus pumilus strains analyzed using RAPD-PCR.

PubMed

Khowal, Sapna; Siddiqui, Md Zulquarnain; Ali, Shadab; Khan, Mohd Taha; Khan, Mather Ali; Naqvi, Samar Husain; Wajid, Saima

2017-02-01

The study involves isolation of arsenic resistant bacteria from soil samples. The characterization of bacterial isolates was based on 16S rRNA gene sequences. The phylogenetic consanguinity among isolates was studied employing rpoB and gltX gene sequences. The RAPD-PCR technique was used to analyze genetic similarity between arsenic resistant isolates. According to the results, Bacillus subtilis and Bacillus pumilus strains may exhibit extensive horizontal gene transfer. Arsenic resistant potency in Bacillus sonorensis and high arsenite tolerance in Bacillus pumilus strains were identified. The RAPD-PCR primer OPO-02 amplified a 0.5-kb DNA band specific to the B. pumilus 3ZZZ strain and a 0.75-kb DNA band specific to B. subtilis 3PP. These unique DNA bands may have potential use as SCAR (Sequence Characterized Amplified Region) molecular markers for identification of arsenic resistant B. pumilus and B. subtilis strains. Copyright © 2016 Elsevier Inc. All rights reserved.
Chromosomal arrangement of leghemoglobin genes in soybean.

PubMed Central

Lee, J S; Brown, G G; Verma, D P

1983-01-01

A cluster of four different leghemoglobin (Lb) genes was isolated from AluI-HaeIII and EcoRI genomic libraries of soybean in a set of overlapping clones which together include 45 kilobases (kb) of contiguous DNA. These four genes, including a pseudogene, are present in the same orientation and are arranged in the order: 5'-Lba-Lbc1-Lb psi-Lbc3-3'. The intergenic regions average 2.5 kb. In addition to this main Lb locus, there are other Lb genes which do not appear to be contiguous to this locus. A sequence probably common to the 3' region of Lb loci was found flanking the Lbc3 gene. The 3' flanking region of the main Lb locus also contains a sequence that appears to be expressed more abundantly in root tissue. Another sequence which is primarily expressed in root and leaf is found 5' to two Lb loci. Overall, the main leghemoglobin locus is similar in structure to the mammalian globin gene loci. PMID:6310504
A 21.7 kb DNA segment on the left arm of yeast chromosome XIV carries WHI3, GCR2, SPX18, SPX19, an homologue to the heat shock gene SSB1 and 8 new open reading frames of unknown function.

PubMed

Jonniaux, J L; Coster, F; Purnelle, B; Goffeau, A

1994-12-01

We report the amino acid sequence of 13 open reading frames (ORF > 299 bp) located on a 21.7 kb DNA segment from the left arm of chromosome XIV of Saccharomyces cerevisiae. Five open reading frames had been entirely or partially sequenced previously: WHI3, GCR2, SPX19, SPX18 and a heat shock gene similar to SSB1. The products of 8 other ORFs are new putative proteins among which N1394 is probably a membrane protein. N1346 contains a leucine zipper pattern and the corresponding ORF presents an HAP (global regulator of respiratory genes) upstream activating sequence in the promoter region. N1386 shares homologies with the DNA structure-specific recognition protein family SSRPs and the corresponding ORF is preceded by an MCB (MluI cell cycle box) upstream activating factor.
Identification of the WBSCR9 gene, encoding a novel transcriptional regulator, in the Williams-Beuren syndrome deletion at 7q11.23.

PubMed

Peoples, R J; Cisco, M J; Kaplan, P; Francke, U

1998-01-01

We have identified a novel gene (WBSCR9) within the common Williams-Beuren syndrome (WBS) deletion by interspecies sequence conservation. The WBSCR9 gene encodes a roughly 7-kb transcript with an open reading frame of 1483 amino acids and a predicted protein product size of 170.8 kDa. WBSCR9 is comprised of at least 20 exons extending over 60 kb. The transcript is expressed ubiquitously throughout development and is subject to alternative splicing. Functional motifs identified by sequence homology searches include a bromodomain; a PHD, or C4HC3, finger; several putative nuclear localization signals; four nuclear receptor binding motifs; a polyglutamate stretch and two PEST sequences. Bromodomains, PHD motifs and nuclear receptor binding motifs are cardinal features of proteins that are involved in chromatin remodeling and modulation of transcription. Haploinsufficiency for WBSCR9 gene products may contribute to the complex phenotype of WBS by interacting with tissue-specific regulatory factors during development.
A structural variant in the 5’-flanking region of the TWIST2 gene affects melanocyte development in belted cattle

PubMed Central

Drögemüller, Cord; Jagannathan, Vidhya; Keller, Irene; Wüthrich, Daniel; Bruggmann, Rémy; Schütz, Ekkehard; Demmel, Steffi; Moser, Simon; Signer-Hasler, Heidi; Pieńkowska-Schelling, Aldona; Schelling, Claude; Sande, Marcos; Rongen, Ronald

2017-01-01

Belted cattle have a circular belt of unpigmented hair and skin around their midsection. The belt is inherited as a monogenic autosomal dominant trait. We mapped the causative variant to a 37 kb segment on bovine chromosome 3. Whole genome sequence data of 2 belted and 130 control cattle yielded only one private genetic variant in the critical interval in the two belted animals. The belt-associated variant was a copy number variant (CNV) involving the quadruplication of a 6 kb non-coding sequence located approximately 16 kb upstream of the TWIST2 gene. Increased copy numbers at this CNV were strongly associated with the belt phenotype in a cohort of 333 cases and 1322 controls. We hypothesized that the CNV causes aberrant expression of TWIST2 during neural crest development, which might negatively affect melanoblasts. Functional studies showed that ectopic expression of bovine TWIST2 in neural crest in transgenic zebrafish led to a decrease in melanocyte numbers. Our results thus implicate an unsuspected involvement of TWIST2 in regulating pigmentation and reveal a non-coding CNV underlying a captivating Mendelian character. PMID:28658273
Genome-wide patterns of copy number variation in the diversified chicken genomes using next-generation sequencing.

PubMed

Yi, Guoqiang; Qu, Lujiang; Liu, Jianfeng; Yan, Yiyuan; Xu, Guiyun; Yang, Ning

2014-11-07

Copy number variation (CNV) is important and widespread in the genome, and is a major cause of disease and phenotypic diversity. Herein, we performed a genome-wide CNV analysis in 12 diversified chicken genomes based on whole genome sequencing. A total of 8,840 CNV regions (CNVRs) covering 98.2 Mb and representing 9.4% of the chicken genome were identified, ranging in size from 1.1 to 268.8 kb with an average of 11.1 kb. Sequencing-based predictions were confirmed at a high validation rate by two independent approaches, including array comparative genomic hybridization (aCGH) and quantitative PCR (qPCR). The Pearson's correlation coefficients between sequencing and aCGH results ranged from 0.435 to 0.755, and qPCR experiments revealed a positive validation rate of 91.71% and a false negative rate of 22.43%. In total, 2,214 (25.0%) predicted CNVRs span 2,216 (36.4%) RefSeq genes associated with specific biological functions. Besides two previously reported copy number variable genes EDN3 and PRLR, we also found some promising genes with potential in phenotypic variation. Two genes, FZD6 and LIMS1, related to disease susceptibility/resistance are covered by CNVRs. The highly duplicated SOCS2 may lead to higher bone mineral density. Entire or partial duplication of some genes like POPDC3 may have great economic importance in poultry breeding. Our results based on extensive genetic diversity provide a more refined chicken CNV map and genome-wide gene copy number estimates, and warrant future CNV association studies for important traits in chickens.
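The summary figures in this record can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers quoted in the abstract (8,840 CNVRs, 98.2 Mb, 9.4% of the genome, 2,214 gene-spanning CNVRs); the variable names are illustrative, and the implied genome size is a back-calculation from rounded values, not a figure reported by the authors.

```python
# Figures quoted in the abstract; variable names are illustrative.
cnvr_count = 8840            # number of CNV regions (CNVRs)
cnvr_total_bp = 98.2e6       # total sequence covered by CNVRs, in bp
genome_fraction = 0.094      # share of the genome covered (9.4%)
cnvrs_with_genes = 2214      # CNVRs that span RefSeq genes

# Mean CNVR size in kb: total covered sequence divided by region count.
mean_cnvr_kb = cnvr_total_bp / cnvr_count / 1e3

# Genome size implied by "98.2 Mb is 9.4% of the genome" (back-calculated).
implied_genome_mb = cnvr_total_bp / genome_fraction / 1e6

# Share of CNVRs that overlap genes, as a percentage.
pct_cnvrs_with_genes = 100 * cnvrs_with_genes / cnvr_count

print(f"mean CNVR size: {mean_cnvr_kb:.1f} kb")
print(f"implied genome size: {implied_genome_mb:.0f} Mb")
print(f"CNVRs spanning genes: {pct_cnvrs_with_genes:.1f}%")
```

The computed mean size and gene-spanning share agree with the 11.1 kb and 25.0% stated in the abstract.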
Living with Genome Instability: the Adaptation of Phytoplasmas to Diverse Environments of Their Insect and Plant Hosts

PubMed Central

Bai, Xiaodong; Zhang, Jianhua; Ewing, Adam; Miller, Sally A.; Jancso Radek, Agnes; Shevchenko, Dmitriy V.; Tsukerman, Kiryl; Walunas, Theresa; Lapidus, Alla; Campbell, John W.; Hogenhout, Saskia A.

2006-01-01

Phytoplasmas (“Candidatus Phytoplasma,” class Mollicutes) cause disease in hundreds of economically important plants and are obligately transmitted by sap-feeding insects of the order Hemiptera, mainly leafhoppers and psyllids. The 706,569-bp chromosome and four plasmids of aster yellows phytoplasma strain witches' broom (AY-WB) were sequenced and compared to the onion yellows phytoplasma strain M (OY-M) genome. The phytoplasmas have small repeat-rich genomes. This comparative analysis revealed that the repeated DNAs are organized into large clusters of potential mobile units (PMUs), which contain tra5 insertion sequences (ISs) and genes for specialized sigma factors and membrane proteins. So far, these PMUs appear to be unique to phytoplasmas. Compared to mycoplasmas, phytoplasmas lack several recombination and DNA modification functions, and therefore, phytoplasmas may use different mechanisms of recombination, likely involving PMUs, for the creation of variability, allowing phytoplasmas to adjust to the diverse environments of plants and insects. The irregular GC skews and the presence of ISs and large repeated sequences in the AY-WB and OY-M genomes are indicative of high genomic plasticity. Nevertheless, segments of ∼250 kb located between the lplA and glnQ genes are syntenic between the two phytoplasmas and contain the majority of the metabolic genes and no ISs. AY-WB appears to be further along in the reductive evolution process than OY-M. The AY-WB genome is ∼154 kb smaller than the OY-M genome, primarily as a result of fewer multicopy sequences, including PMUs. Furthermore, AY-WB lacks genes that are truncated and are part of incomplete pathways in OY-M. PMID:16672622
Fine mapping of Restorer-of-fertility in pepper (Capsicum annuum L.) identified a candidate gene encoding a pentatricopeptide repeat (PPR)-containing protein.

PubMed

Jo, Yeong Deuk; Ha, Yeaseong; Lee, Joung-Ho; Park, Minkyu; Bergsma, Alex C; Choi, Hong-Il; Goritschnig, Sandra; Kloosterman, Bjorn; van Dijk, Peter J; Choi, Doil; Kang, Byoung-Cheorl

2016-10-01

Using fine mapping techniques, the genomic region co-segregating with Restorer-of-fertility (Rf) in pepper was delimited to a region of 821 kb in length. A PPR gene in this region, CaPPR6, was identified as a strong candidate for Rf based on expression pattern and characteristics of encoding sequence. Cytoplasmic-genic male sterility (CGMS) has been used for the efficient production of hybrid seeds in peppers (Capsicum annuum L.). Although the mitochondrial candidate genes that might be responsible for cytoplasmic male sterility (CMS) have been identified, the nuclear Restorer-of-fertility (Rf) gene has not been isolated. To identify the genomic region co-segregating with Rf in pepper, we performed fine mapping using an Rf-segregating population consisting of 1068 F2 individuals, based on BSA-AFLP and a comparative mapping approach. Through six cycles of chromosome walking, the co-segregating region harboring the Rf locus was delimited to be within 821 kb of sequence. Prediction of expressed genes in this region based on transcription analysis revealed four candidate genes. Among these, CaPPR6 encodes a pentatricopeptide repeat (PPR) protein with PPR motifs that are repeated 14 times. Characterization of the CaPPR6 protein sequence, based on alignment with other homologs, showed that CaPPR6 is a typical Rf-like (RFL) gene reported to have undergone diversifying selection during evolution. A marker developed from a sequence near CaPPR6 showed a higher prediction rate of the Rf phenotype than those of previously developed markers when applied to a panel of breeding lines of diverse origin. These results suggest that CaPPR6 is a strong candidate for the Rf gene in pepper.
Genome sequence of the Japanese oak silk moth, Antheraea yamamai: the first draft genome in the family Saturniidae

PubMed Central

Kim, Seong-Ryul; Kwak, Woori; Kim, Hyaekang; Kim, Kee-Young; Kim, Su-Bae; Choi, Kwang-Ho; Kim, Seong-Wan; Hwang, Jae-Sam; Kim, Minjee; Kim, Iksoo; Goo, Tae-Won

2018-01-01

Background: Antheraea yamamai, also known as the Japanese oak silk moth, is a wild species of silk moth. Silk produced by A. yamamai, referred to as tensan silk, shows different characteristics such as thickness, compressive elasticity, and chemical resistance compared with common silk produced from the domesticated silkworm, Bombyx mori. Its unique characteristics have led to its use in many research fields including biotechnology and medical science, and the scientific as well as economic importance of the wild silk moth continues to gradually increase. However, no genomic information for the wild silk moth, including A. yamamai, is currently available. Findings: In order to construct the A. yamamai genome, a total of 147 Gb of sequence was generated using Illumina and PacBio sequencing platforms, providing 210-fold coverage based on the 700-Mb estimated genome size of A. yamamai. The assembled genome of A. yamamai was 656 Mb (>2 kb) with 3675 scaffolds, and the N50 length of the assembly was 739 kb with a 34.07% GC ratio. Identified repeat elements covered 37.33% of the total genome, and the completeness of the constructed genome assembly was estimated to be 96.7% by Benchmarking Universal Single-Copy Orthologs v2 analysis. A total of 15 481 genes were identified using Evidence Modeler based on the gene prediction results obtained from 3 different methods (ab initio, RNA-seq-based, known-gene-based) and manual curation. Conclusions: Here we present the genome sequence of A. yamamai, the first genome sequence of the wild silk moth. These results provide valuable genomic information, which will help enrich our understanding of the molecular mechanisms relating to not only specific phenotypes such as wild silk itself but also the genomic evolution of Saturniidae. PMID:29186418
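Two assembly statistics in this record, fold coverage and N50, are easy to state in code. The sketch below assumes the usual definitions (coverage as sequenced bases divided by genome size; N50 as the length of the scaffold at which the running total of lengths, sorted longest first, reaches half the assembly). The function names and the toy scaffold lengths are invented for illustration; only the 147 Gb and 700 Mb figures come from the abstract.

```python
def fold_coverage(total_bases: float, genome_size: float) -> float:
    """Sequencing depth: total sequenced bases divided by genome size."""
    return total_bases / genome_size


def n50(lengths: list[int]) -> int:
    """Length L such that scaffolds of length >= L hold half the assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise ValueError("n50 is undefined for an empty assembly")


# Figures from the abstract: 147 Gb of reads, 700 Mb estimated genome size.
print(fold_coverage(147e9, 700e6))

# Hypothetical scaffold lengths, used only to show how N50 behaves.
toy_scaffolds = [2, 2, 2, 3, 3, 4, 8, 8]
print(n50(toy_scaffolds))
```

With the abstract's figures the coverage works out to the stated 210-fold. N50 weights long scaffolds heavily, so a handful of large scaffolds can dominate it.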
Detailed transcriptome description of the neglected cestode Taenia multiceps.

PubMed

Wu, Xuhang; Fu, Yan; Yang, Deying; Zhang, Runhui; Zheng, Wanpeng; Nie, Huaming; Xie, Yue; Yan, Ning; Hao, Guiying; Gu, Xiaobin; Wang, Shuxian; Peng, Xuerong; Yang, Guangyou

2012-01-01

The larval stage of Taenia multiceps, a global cestode, encysts in the central nervous system (CNS) of sheep and other livestock. This frequently leads to their death and huge socioeconomic losses, especially in developing countries. This parasite can also cause zoonotic infections in humans, but has been largely neglected due to a lack of diagnostic techniques and studies. Recent developments in next-generation sequencing provide an opportunity to explore the transcriptome of T. multiceps. We obtained a total of 31,282 unigenes (mean length 920 bp) using Illumina paired-end sequencing technology and a new Trinity de novo assembler without a referenced genome. Individual transcription molecules were determined by sequence-based annotations and/or domain-based annotations against public databases (Nr, UniprotKB/Swiss-Prot, COG, KEGG, UniProtKB/TrEMBL, InterPro and Pfam). We identified 26,110 (83.47%) unigenes and inferred 20,896 (66.8%) coding sequences (CDS). Further comparative transcripts analysis with other cestodes (Taenia pisiformis, Taenia solium, Echinococcus granulosus and Echinococcus multilocularis) and intestinal parasites (Trichinella spiralis, Ancylostoma caninum and Ascaris suum) showed that 5,100 common genes were shared among three Taenia tapeworms, 261 conserved genes were detected among five Taeniidae cestodes, and 109 common genes were found in four zoonotic intestinal parasites. Some of the common genes were genes required for parasite survival, involved in parasite-host interactions. In addition, we amplified two full-length CDS of unigenes from the common genes using RT-PCR. This study provides an extensive transcriptome of the adult stage of T. multiceps, and demonstrates that comparative transcriptomic investigations deserve to be further studied. This transcriptome dataset forms a substantial public information platform to achieve a fundamental understanding of the biology of T. multiceps, and helps in the identification of drug targets and parasite-host interaction studies.
Living with genome instability: the adaptation of phytoplasmas to diverse environments of their insect and plant hosts

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bai, Xiaodong; Zhang, Jianhua; Ewing, Adam

Phytoplasmas (Candidatus Phytoplasma, Class Mollicutes) cause disease in hundreds of economically important plants, and are obligately transmitted by sap-feeding insects of the order Hemiptera, mainly leafhoppers and psyllids. The 706,569-bp chromosome and four plasmids of aster yellows phytoplasma strain witches' broom (AY-WB) were sequenced and compared to the onion yellows phytoplasma strain M (OY-M) genome. The phytoplasmas have small repeat-rich genomes. The repeated DNAs are organized into large clusters, potential mobile units (PMUs), which contain tra5 insertion sequences (ISs), and specialized sigma factors and membrane proteins. So far, PMUs are unique to phytoplasmas. Compared to mycoplasmas, phytoplasmas lack several recombination and DNA modification functions, and therefore phytoplasmas probably use different mechanisms of recombination, likely involving PMUs, for the creation of variability, allowing phytoplasmas to adjust to the diverse environments of plants and insects. The irregular GC skews and presence of ISs and large repeated sequences in the AY-WB and OY-M genomes are indicative of high genomic plasticity. Nevertheless, segments of ∼250 kb, located between genes lplA and glnQ, are syntenic between the two phytoplasmas, and contain the majority of the metabolic genes and no ISs. AY-WB is further along in the reductive evolution process than OY-M. The AY-WB genome is ∼154 kb smaller than the OY-M genome, primarily as a result of fewer multicopy sequences, including PMUs. Further, AY-WB lacks genes that are truncated and are part of incomplete pathways in OY-M. This is the first comparative phytoplasma genome analysis and report of the existence of PMUs in phytoplasma genomes.
Spectroscopic studies of the binding of Cu(II) complexes of oxicam NSAIDs to alternating G-C and homopolymeric G-C sequences

NASA Astrophysics Data System (ADS)

Chakraborty, Sreeja; Bose, Madhuparna; Sarkar, Munna

2014-03-01

Drugs belonging to the non-steroidal anti-inflammatory drug (NSAID) group are not only used as anti-inflammatory, analgesic and anti-pyretic agents, but also show anti-cancer effects. Complexing them with a bioactive metal like copper enhances their anti-cancer effects compared to the bare drugs, by a mechanism that is not yet fully understood. For the first time, it was shown by our group that Cu(II)-NSAIDs can directly bind to the DNA backbone. The ability of the copper complexes of the NSAIDs meloxicam and piroxicam to bind to the DNA backbone could be a possible molecular mechanism behind their enhanced anticancer effects. Elucidating the base sequence-specific interaction of Cu(II)-NSAIDs with DNA will provide information on their possible binding sites in the genome sequence. In this work, we present how these complexes respond to differences in structure and hydration pattern of GC-rich sequences. For this, binding studies of Cu(II) complexes of piroxicam [Cu(II)-(Px)2 (L)2] and meloxicam [Cu(II)-(Mx)2 (L)] with alternating GC (polydG-dC) and homopolymeric GC (polydG-polydC) sequences were carried out using a combination of spectroscopic techniques that include UV-Vis absorption, fluorescence and circular dichroism (CD) spectroscopy. The Cu(II)-NSAIDs show strong binding affinity to both polydG-dC and polydG-polydC. The role reversal of Cu(II)-meloxicam from a strong binder of polydG-dC (Kb = 11.5 × 10³ M⁻¹) to a weak binder of polydG-polydC (Kb = 5.02 × 10³ M⁻¹), while Cu(II)-piroxicam changes from a strong binder of polydG-polydC (Kb = 8.18 × 10³ M⁻¹) to a weak one of polydG-dC (Kb = 2.18 × 10³ M⁻¹), points to the sensitivity of these complexes to changes in the backbone structures/hydration. Changes in the profiles of the UV absorption band and CD difference spectra upon complex binding to polynucleotides, and the results of a competitive binding assay using ethidium bromide (EtBr) fluorescence, indicate different binding modes in each case.
Genetic high throughput screening in Retinitis Pigmentosa based on high resolution melting (HRM) analysis.

PubMed

Anasagasti, Ander; Barandika, Olatz; Irigoyen, Cristina; Benitez, Bruno A; Cooper, Breanna; Cruchaga, Carlos; López de Munain, Adolfo; Ruiz-Ederra, Javier

2013-11-01

Retinitis Pigmentosa (RP) involves a group of genetically determined retinal diseases caused by a large number of mutations that result in rod photoreceptor cell death followed by gradual death of cone cells. Most cases of RP are monogenic, with more than 80 associated genes identified so far. The high number of genes and variants involved in RP, among other factors, is making the molecular characterization of RP a real challenge for many patients. Although HRM has been used for the analysis of isolated variants or single RP genes, to our knowledge this is the first study that uses HRM analysis for a high-throughput screening of several RP genes. Our main goal was to test the suitability of HRM analysis as a genetic screening technique in RP, and to compare its performance with two of the most widely used NGS platforms, Illumina and PGM-Ion Torrent technologies. RP patients (n = 96) were clinically diagnosed at the Ophthalmology Department of Donostia University Hospital, Spain. We analyzed a total of 16 RP genes that meet the following inclusion criteria: 1) size: genes with transcripts of less than 4 kb; 2) number of exons: genes with up to 22 exons; and 3) prevalence: genes reported to account for at least 0.4% of total RP cases worldwide. For comparison purposes, the RHO gene was also sequenced with Illumina (GAII; Illumina), Ion semiconductor technologies (PGM; Life Technologies) and Sanger sequencing (ABI 3130xl platform; Applied Biosystems). Detected variants were confirmed in all cases by Sanger sequencing and tested for co-segregation in the family of affected probands. We identified a total of 65 genetic variants, 15 of which (23%) were novel, in 49 out of 96 patients. Among them, 14 (4 novel) are probable disease-causing genetic variants in 7 RP genes, affecting 15 patients. Our HRM analysis-based study proved to be a cost-effective and rapid method that provides an accurate identification of genetic RP variants.
This approach is effective for medium sized (<4 kb transcript) RP genes, which constitute over 80% of the total of known RP genes.
Genetic high-throughput screening in retinitis pigmentosa based on high resolution melting (HRM) analysis.

PubMed

Anasagasti, Ander; Barandika, Olatz; Irigoyen, Cristina; Benitez, Bruno A; Cooper, Breanna; Cruchaga, Carlos; López de Munain, Adolfo; Ruiz-Ederra, Javier

2013-10-24

Retinitis Pigmentosa (RP) involves a group of genetically determined retinal diseases caused by a large number of mutations that result in rod photoreceptor cell death followed by gradual death of cone cells. Most cases of RP are monogenic, with more than 80 associated genes identified so far. The high number of genes and variants involved in RP, among other factors, is making the molecular characterization of RP a real challenge for many patients. Although HRM has been used for the analysis of isolated variants or single RP genes, to our knowledge this is the first study that uses HRM analysis for a high-throughput screening of several RP genes. Our main goal was to test the suitability of HRM analysis as a genetic screening technique in RP, and to compare its performance with two of the most widely used NGS platforms, Illumina and PGM-Ion Torrent technologies. RP patients (n = 96) were clinically diagnosed at the Ophthalmology Department of Donostia University Hospital, Spain. We analyzed a total of 16 RP genes that meet the following inclusion criteria: 1) size: genes with transcripts of less than 4 kb; 2) number of exons: genes with up to 22 exons; and 3) prevalence: genes reported to account for at least 0.4% of total RP cases worldwide. For comparison purposes, the RHO gene was also sequenced with Illumina (GAII; Illumina), Ion semiconductor technologies (PGM; Life Technologies) and Sanger sequencing (ABI 3130xl platform; Applied Biosystems). Detected variants were confirmed in all cases by Sanger sequencing and tested for co-segregation in the family of affected probands. We identified a total of 65 genetic variants, 15 of which (23%) were novel, in 49 out of 96 patients. Among them, 14 (4 novel) are probable disease-causing genetic variants in 7 RP genes, affecting 15 patients. Our HRM analysis-based study proved to be a cost-effective and rapid method that provides an accurate identification of genetic RP variants.
This approach is effective for medium sized (<4 kb transcript) RP genes, which constitute over 80% of the total of known RP genes. © 2013 Published by Elsevier Ltd.
Complete Genome Sequence of the Quality Control Strain Staphylococcus aureus subsp. aureus ATCC 25923.

PubMed

Treangen, Todd J; Maybank, Rosslyn A; Enke, Sana; Friss, Mary Beth; Diviak, Lynn F; Karaolis, David K R; Koren, Sergey; Ondov, Brian; Phillippy, Adam M; Bergman, Nicholas H; Rosovitz, M J

2014-11-06

Staphylococcus aureus subsp. aureus ATCC 25923 is commonly used as a control strain for susceptibility testing to antibiotics and as a quality control strain for commercial products. We present the completed genome sequence for the strain, consisting of the chromosome and a 27.5-kb plasmid. Copyright © 2014 Treangen et al.

Analysis of the entire genomes of torque teno midi virus variants in chimpanzees: infrequent cross-species infection between humans and chimpanzees.

PubMed

Ninomiya, Masashi; Takahashi, Masaharu; Hoshino, Yu; Ichiyama, Koji; Simmonds, Peter; Okamoto, Hiroaki

2009-02-01

Humans are frequently infected with three anelloviruses which have circular DNA genomes of 3.6-3.9 kb [Torque teno virus (TTV)], 2.8-2.9 kb [Torque teno mini virus (TTMV)] and 3.2 kb [a recently discovered anellovirus named Torque teno midi virus (TTMDV)]. Unexpectedly, human TTMDV DNA was not detectable in any of 74 chimpanzees tested, although all but one tested positive for both human TTV and TTMV DNA. Using universal primers for anelloviruses, novel variants of TTMDV that are phylogenetically clearly separate from human TTMDV were identified from chimpanzees, and over the entire genome, three chimpanzee TTMDV variants differed by 17.9-20.3 % from each other and by 40.4-43.6 % from all 18 reported human TTMDVs. A newly developed PCR assay that uses chimpanzee TTMDV-specific primers revealed the high prevalence of chimpanzee TTMDV in chimpanzees (63/74, 85 %) but low prevalence in humans (1/100). While variants of TTV and TTMV from chimpanzees and humans were phylogenetically interspersed, those of TTMDV were monophyletic for each species, with sequence diversity of <33 and <20 % within the 18 human and three chimpanzee TTMDV variants, respectively. Maximum within-group divergence values for TTV and TTMV were 51 and 57 %, respectively; both of these values were substantially greater than the maximum divergence among TTMDV variants (44 %), consistent with a later evolutionary emergence of TTMDV. However, substantiation of this hypothesis will require further analysis of genetic diversity using an expanded dataset of TTMDV variants in humans and chimpanzees. Similarly, the underlying mechanism of observed infrequent cross-species infection of TTMDV between humans and chimpanzees deserves further analysis.
Characterization of a 3.3-kb plasmid of Escherichia coli O157:H7 and evaluation of stability of genetically engineered derivatives of this plasmid expressing green fluorescence.

PubMed

Sharma, Vijay K; Stanton, Thaddeus B

2008-12-10

Enterohemorrhagic Escherichia coli (EHEC) O157:H7 (strain 86-24) harbors a 3.3-kb plasmid (pSP70) that does not encode a selectable phenotype. A 1.1-kb fragment of DNA encoding kanamycin resistance (Kan(r)) was inserted by in vitro transposon mutagenesis at a random location on pSP70 to construct pSP70-Kan(r) that conferred Kan(r) to the host E. coli strain. Oligonucleotides complementary to 5' and 3' ends of the fragment encoding Kan(r) were used for initiating nucleotide sequencing from the plus and minus strands of pSP70, and thereafter primer walking was used to determine nucleotide sequence of pSP70. Analysis of nucleotide sequence revealed that pSP70 contained 3306 base pairs in its genome and that the genome was almost 100% identical to nucleotide sequences of small plasmids identified in EHEC O157:H7 isolates from Germany and Japan. A DNA cassette encoding a green fluorescent protein (GFP), ampicillin resistance (Amp(r)), and a double transcriptional terminator (DT) was cloned in pSP70 either at the BamHI site (created by deletion of mobA by PCR) or at the NsiI site located downstream of mobA to generate pSP70 ΔmobA-GFP/Amp(r)/DT (pSM431) and pSP70-GFP/Amp(r)/DT (pSM433), respectively. Introduction of pSM431 or pSM433 into EHEC O157:H7 yielded ampicillin-resistant colonies that glowed green under UV illumination. Consecutive subcultures of EHEC O157:H7, carrying pSM431 or pSM433 under conditions simulating the environment of bovine intestine (no selective antibiotic, incubation temperature of 39 °C, with or without oxygen), demonstrated that these plasmids were highly stable as greater than 95% of the isolates recovered from these subcultures were positive for green fluorescence. These findings indicate that EHEC O157:H7 carrying pSM431 or pSM433 would be useful for studying persistence and shedding of this important food-borne pathogen in cattle.
Avian sarcoma virus 17 carries the jun oncogene.

PubMed Central

Maki, Y; Bos, T J; Davis, C; Starbuck, M; Vogt, P K

1987-01-01

Biologically active molecular clones of avian sarcoma virus 17 (ASV 17) contain a replication-defective proviral genome of 3.5 kilobases (kb). The genome retains partial gag and env sequences, which flank a cell-derived putative oncogene of 0.93 kb, termed jun. The jun gene lacks preserved coding domains of tyrosine-specific protein kinases. It also shows no significant nucleic acid homology with other known oncogenes. The probable transformation-specific protein in ASV 17-transformed cells is a 55-kDa gag-jun fusion product. PMID:3033666
Identification of a novel 15.5 kb SHOX deletion associated with marked intrafamilial phenotypic variability and analysis of its molecular origin.

PubMed

Alexandrou, Angelos; Papaevripidou, Ioannis; Tsangaras, Kyriakos; Alexandrou, Ioanna; Tryfonidis, Marios; Christophidou-Anastasiadou, Violetta; Zamba-Papanicolaou, Eleni; Koumbaris, George; Neocleous, Vassos; Phylactou, Leonidas A; Skordis, Nicos; Tanteles, George A; Sismani, Carolina

2016-12-01

Haploinsufficiency of the short stature homeobox-containing (SHOX) gene has been shown to result in a spectrum of phenotypes ranging from Leri-Weill dyschondrosteosis (LWD) at the more severe end to SHOX-related short stature at the milder end. Most alterations are whole gene deletions, point mutations within the coding region, or microdeletions in its flanking sequences. Here, we present the clinical and molecular data as well as the potential molecular mechanism underlying a novel microdeletion, causing a variable SHOX-related haploinsufficiency disorder in a three-generation family. The phenotype resembles that of LWD in females; in males, however, the phenotypic expression is milder. The 15523-bp SHOX intragenic deletion, encompassing exons 3-6, was initially detected by array-CGH, followed by MLPA analysis. Sequencing of the breakpoints indicated an Alu recombination-mediated deletion (ARMD) as the potential causative mechanism.
Identification and characterization of Serpulina hyodysenteriae by restriction enzyme analysis and Southern blot analysis.

PubMed Central

Sotiropoulos, C; Coloe, P J; Smith, S C

1994-01-01

Chromosomal DNA restriction enzyme analysis and Southern blot hybridization were used to characterize Serpulina hyodysenteriae strains. When chromosomal DNAs from selected strains (reference serotypes) of S. hyodysenteriae were digested with the restriction endonuclease Sau3A and hybridized with a 1.1-kb S. hyodysenteriae-specific DNA probe, a common 3-kb band was always detected in S. hyodysenteriae strains but was absent from Serpulina innocens strains. When the chromosomal DNA was digested with the restriction endonuclease Asp 700 and hybridized with two S. hyodysenteriae-specific DNA probes (0.75 and 1.1 kb of DNA), distinct hybridization patterns for each S. hyodysenteriae reference strain and the Australian isolate S. hyodysenteriae 5380 were detected. Neither the 1.1-kb nor the 0.75-kb DNA probe hybridized with Asp 700- or Sau3A-digested S. innocens chromosomal DNA. The presence of the 3-kb Sau3A DNA fragment in S. hyodysenteriae reference strains from diverse geographical locations shows that this fragment is conserved among S. hyodysenteriae strains and can be used as a species-specific marker. Restriction endonuclease analysis and Southern blot hybridization with these well-defined DNA probes are reliable and accurate methods for species-specific and strain-specific identification of S. hyodysenteriae. PMID:7914209
De novo assembly of a haplotype-resolved human genome.

PubMed

Cao, Hongzhi; Wu, Honglong; Luo, Ruibang; Huang, Shujia; Sun, Yuhui; Tong, Xin; Xie, Yinlong; Liu, Binghang; Yang, Hailong; Zheng, Hancheng; Li, Jian; Li, Bo; Wang, Yu; Yang, Fang; Sun, Peng; Liu, Siyang; Gao, Peng; Huang, Haodong; Sun, Jing; Chen, Dan; He, Guangzhu; Huang, Weihua; Huang, Zheng; Li, Yue; Tellier, Laurent C A M; Liu, Xiao; Feng, Qiang; Xu, Xun; Zhang, Xiuqing; Bolund, Lars; Krogh, Anders; Kristiansen, Karsten; Drmanac, Radoje; Drmanac, Snezana; Nielsen, Rasmus; Li, Songgang; Wang, Jian; Yang, Huanming; Li, Yingrui; Wong, Gane Ka-Shu; Wang, Jun

2015-06-01

The human genome is diploid, and knowledge of the variants on each chromosome is important for the interpretation of genomic information. Here we report the assembly of a haplotype-resolved diploid genome without using a reference genome. Our pipeline relies on fosmid pooling together with whole-genome shotgun strategies, based solely on next-generation sequencing and hierarchical assembly methods. We applied our sequencing method to the genome of an Asian individual and generated a 5.15-Gb assembled genome with a haplotype N50 of 484 kb. Our analysis identified previously undetected indels and 7.49 Mb of novel coding sequences that could not be aligned to the human reference genome, which include at least six predicted genes. This haplotype-resolved genome represents the most complete de novo human genome assembly to date. Application of our approach to identify individual haplotype differences should aid in translating genotypes to phenotypes for the development of personalized medicine.
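The N50 figure quoted above is a length-weighted median: the length L such that pieces of length L or longer account for at least half of the total. A minimal Python sketch of that calculation follows. It assumes the same rule is applied to haplotype block lengths as to contigs or scaffolds; the function name and the toy lengths are illustrative and do not come from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence or block lengths.

    Pieces are sorted from longest to shortest and accumulated until the
    running total reaches half of the overall length; the length of the
    piece that crosses that threshold is the N50.
    """
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    raise ValueError("lengths must be non-empty")


# Toy example (made-up lengths in kb, not data from the paper).
toy_lengths_kb = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_lengths_kb))
```

Because the statistic is weighted by length, a few long blocks dominate it, which is why N50 is usually reported alongside the total assembled size.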
The presence of the ancestral insect telomeric motif in kissing bugs (Triatominae) rules out the hypothesis of its loss in evolutionarily advanced Heteroptera (Cimicomorpha)

PubMed Central

Pita, Sebastián; Panzera, Francisco; Mora, Pablo; Vela, Jesús; Palomeque, Teresa; Lorite, Pedro

2016-01-01

Next-generation sequencing data analysis on Triatoma infestans Klug, 1834 (Heteroptera, Cimicomorpha, Reduviidae) revealed the presence of the ancestral insect (TTAGG)n telomeric motif in its genome. Fluorescence in situ hybridization confirms that chromosomes bear this telomeric sequence at their chromosomal ends. Furthermore, the motif was estimated to make up about 0.03% of the total genome, so that the average telomere length at each chromosomal end is almost 18 kb. We also detected the (TTAGG)n telomeric repeat in mitotic and meiotic chromosomes of three other species of Triatominae: Triatoma dimidiata Latreille, 1811, Dipetalogaster maxima Uhler, 1894, and Rhodnius prolixus Ståhl, 1859. This is the first report of the (TTAGG)n telomeric repeat in the infraorder Cimicomorpha, contradicting the currently accepted hypothesis that evolutionarily recent heteropterans lack this ancestral insect telomeric sequence. PMID:27830050
Insight into stereochemistry of a new IMP allelic variant (IMP-55) metallo-β-lactamase identified in a clinical strain of Acinetobacter baumannii.

PubMed

Shakibaie, Mohammad Reza; Azizi, Omid; Shahcheraghi, Fereshteh

2017-07-01

Metallo-β-lactamases (MBLs) such as IMPs are broad-spectrum β-lactamases that inactivate virtually all β-lactam antibiotics including carbapenems. In this study, we investigated the hydrolytic activity, phylogenetic relationship, and three-dimensional (3D) structure, including the zinc binding motif, of a new IMP variant (IMP-55) identified in a clinical strain of Acinetobacter baumannii (AB). AB strain 56 was isolated from an adult ICU of a teaching hospital in Kerman, Iran. It exhibited an MIC of 32 μg/ml to imipenem and showed MBL activity. The hydrolytic property of the MBL enzyme was measured phenotypically. The presence of the bla IMP gene encoded by class 1 integrons was detected by PCR-sequencing. A phylogenetic tree of the IMP protein was constructed using the Unweighted Pair Group Method with Arithmetic Mean (UPGMA), and a 3D model including the zinc binding motif was predicted by bioinformatics software. Analysis of the IMP sequence led to the identification of a novel IMP type designated IMP-55 (GenBank: KU299753.1; UniprotKB: A0A0S2MTX2). Its hydrolytic activity, compared with that of the closest variants, suggested efficient imipenem hydrolysis by this enzyme. Evolutionary distance matrix assessment indicated that the IMP-55 protein is not closely related to other A. baumannii IMPs; however, it shared 98% homology with Escherichia coli IMP-30 (UniprotKB: A0A0C5PJR0) and Pseudomonas aeruginosa IMP-1 (UniprotKB: Q19KT1). It consisted of five α-helices, ten β-sheets and six loops. A monovalent zinc ion attached to the core of the enzyme via His95, His97, His157 and Cys176. Multiple amino acid sequence alignments and mutational trajectory with reported IMPs showed 4 amino acid substitutions at positions 12(Phe→Ile), 31(Asp→Glu), 172(Leu→Phe) and 185(Asn→Lys). We suggest that the pleiotropic effect of mutations due to frequent administration of imipenem is responsible for the emergence of the new IMP variant in our hospitals. Copyright © 2017 Elsevier B.V. All rights reserved.
Proteus genomic island 1 (PGI1), a new resistance genomic island from two Proteus mirabilis French clinical isolates.

PubMed

Siebor, Eliane; Neuwirth, Catherine

2014-12-01

To analyse the genetic environment of the antibiotic resistance genes in two clinical Proteus mirabilis isolates resistant to multiple antibiotics. PCR, gene walking and whole-genome sequencing were used to determine the sequence of the resistance regions, the surrounding genetic structure and the flanking chromosomal regions. A genomic island of 81.1 kb named Proteus genomic island 1 (PGI1) located at the 3'-end of trmE (formerly known as thdF) was characterized. The large MDR region of PGI1 (55.4 kb) included a class 1 integron (aadB and aadA2) and regions deriving from several transposons: Tn2 (blaTEM-135), Tn21, Tn6020-like transposon (aphA1b), a hybrid Tn502/Tn5053 transposon, Tn501, a hybrid Tn1696/Tn1721 transposon [tetA(A)] carrying a class 1 integron (aadA1) and Tn5393 (strA and strB). Several ISs were also present (IS4321, IS1R and IS26). The PGI1 backbone (25.7 kb) was identical to that identified in Salmonella Heidelberg SL476 and shared some identity with the Salmonella genomic island 1 (SGI1) backbone. An IS26-mediated recombination event caused the division of the MDR region into two parts separated by a large chromosomal DNA fragment of 197 kb, the right end of PGI1 and this chromosomal sequence being in inverse orientation. PGI1 is a new resistance genomic island from P. mirabilis belonging to the same island family as SGI1. The role of PGI1 in the spread of antimicrobial resistance genes among Enterobacteriaceae of medical importance needs to be evaluated. © The Author 2014. Published by Oxford University Press on behalf of the British Society for Antimicrobial Chemotherapy. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
Two novel types of contiguous gene deletion of the AVPR2 and ARHGAP4 genes in unrelated Japanese kindreds with nephrogenic diabetes insipidus.

PubMed

Demura, Masashi; Takeda, Yoshiyu; Yoneda, Takashi; Furukawa, Kenji; Usukura, Mikiya; Itoh, Yuji; Mabuchi, Hiroshi

2002-01-01

Study of two families containing individuals with nephrogenic diabetes insipidus (NDI) indicated different types of 21.3 kb and 26.3 kb deletions involving the AVPR2 and ARHGAP4 (RhoGAP C1) genes. In the case of the 21.3 kb deletion, the deletion consensus motif (5'-TGAAGG-3') and polypurine runs, known as the arrest site of polymerase alpha, were detected in the vicinity of the deletion junction. Inverted repeats (7/8 matches), believed to potentiate DNA loop formation, flank the deletion breakpoint. We propose this deletion to be the result of slipped mispairing during DNA replication. In the case of the 26.3 kb deletion, the 12,945 bp inverted region with the 10,003 bp internal deletion was accompanied with the 2,509 bp deletion in the 5'-side and the 13,785 bp deletion in the 3'-side. We defined three deletion junctions in this rearrangement (DJ1, DJ2, and DJ3) from the 5'-side. The surrounding sequence of DJ1 (5'-CCC-3') closely resembled that of DJ3 (5'-AGGG-3') (DJ1; 5'-cCCCgaggg-3', DJ3; 5'-ccccAGGG-3'), and DJ1 was located in the 5'-side of DJ3 without any overlapping in sequence. The immunoglobulin class switch (ICS) motif (5'-TGGGG-3') was found around the complementary sequence of DJ3. There was a 10-base palindrome (5'-aGACAtgtct-3') in the alignment of the DJ2 (5'-GACA-3') region. From these findings, we propose a novel mutation process with the rearrangement probably resulting from stem-loop induced non-homologous recombination in an ICS-like fashion. Both patients, despite lacking ARHGAP4, had no morphological, clinical, or laboratory abnormalities except for those usually found in patients with NDI. Copyright 2001 Wiley-Liss, Inc.
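The figures in this abstract can be checked with a few lines of arithmetic. The Python sketch below assumes that the three deleted segments (2,509 bp, 10,003 bp and 13,785 bp) together make up the 26.3 kb deletion, and it tests whether the 10-base sequence quoted near DJ2 is a palindrome in the DNA sense, meaning that it equals its own reverse complement. The helper name is illustrative and is not taken from the study.

```python
# Complement table for DNA bases, covering upper- and lower-case letters.
COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


# Deleted segments reported for the 26.3 kb rearrangement (bp).
deleted_bp = [2509, 10003, 13785]
total_deleted = sum(deleted_bp)

# The 10-base palindrome near DJ2, written as in the abstract.
palindrome = "aGACAtgtct"
is_palindrome = reverse_complement(palindrome).upper() == palindrome.upper()

print(total_deleted)   # total in bp; compare with the quoted 26.3 kb
print(is_palindrome)
```

The summed segments come to 26,297 bp, which rounds to the 26.3 kb stated in the abstract.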
Testing for Archaic Hominin Admixture on the X Chromosome: Model Likelihoods for the Modern Human RRM2P4 Region From Summaries of Genealogical Topology Under the Structured Coalescent

PubMed Central

Cox, Murray P.; Mendez, Fernando L.; Karafet, Tatiana M.; Pilkington, Maya Metni; Kingan, Sarah B.; Destro-Bisol, Giovanni; Strassmann, Beverly I.; Hammer, Michael F.

2008-01-01

A 2.4-kb stretch within the RRM2P4 region of the X chromosome, previously sequenced in a sample of 41 globally distributed humans, displayed both an ancient time to the most recent common ancestor (e.g., a TMRCA of ∼2 million years) and a basal clade composed entirely of Asian sequences. This pattern was interpreted to reflect a history of introgressive hybridization from archaic hominins (most likely Asian Homo erectus) into the anatomically modern human genome. Here, we address this hypothesis by resequencing the 2.4-kb RRM2P4 region in 131 African and 122 non-African individuals and by extending the length of sequence in a window of 16.5 kb encompassing the RRM2P4 pseudogene in a subset of 90 individuals. We find that both the ancient TMRCA and the skew in non-African representation in one of the basal clades are essentially limited to the central 2.4-kb region. We define a new summary statistic called the minimum clade proportion (pmc), which quantifies the proportion of individuals from a specified geographic region in each of the two basal clades of a binary gene tree, and then employ coalescent simulations to assess the likelihood of the observed central RRM2P4 genealogy under two alternative views of human evolutionary history: recent African replacement (RAR) and archaic admixture (AA). A molecular-clock-based TMRCA estimate of 2.33 million years is a statistical outlier under the RAR model; however, the large variance associated with this estimate makes it difficult to distinguish the predictions of the human origins models tested here. The pmc summary statistic, which has improved power with larger samples of chromosomes, yields values that are significantly unlikely under the RAR model and fit expectations better under a range of archaic admixture scenarios. PMID:18202385
The Pea light-independent photomorphogenesis1 Mutant Results from Partial Duplication of COP1 Generating an Internal Promoter and Producing Two Distinct Transcripts

PubMed Central

Sullivan, James A.; Gray, John C.

2000-01-01

The pea lip1 (light-independent photomorphogenesis1) mutant shows many of the characteristics of light-grown development when grown in continuous darkness. To investigate the identity of LIP1, cDNAs encoding the pea homolog of COP1, a repressor of photomorphogenesis identified in Arabidopsis, were isolated from wild-type and lip1 pea seedlings. lip1 seedlings contained a wild-type COP1 transcript as well as a larger COP1′ transcript that contained an internal in-frame duplication of 894 bp. The COP1′ transcript segregated with the lip1 phenotype in F2 seedlings and could be translated in vitro to produce a protein of ∼100 kD. The COP1 gene in lip1 peas contained a 7.5-kb duplication, consisting of exons 1 to 7 of the wild-type sequence, located 2.5 kb upstream of a region of genomic DNA identical to the wild-type COP1 DNA sequence. Transcription and splicing of the mutant COP1 gene was predicted to produce the COP1′ transcript, whereas transcription from an internal promoter in the 2.5-kb region of DNA located between the duplicated regions of COP1 would produce the wild-type COP1 transcript. The presence of small quantities of wild-type COP1 transcripts may reduce the severity of the phenotype produced by the mutated COP1′ protein. The genomic DNA sequences of the COP1 gene from wild-type and lip1 peas and the cDNA sequences of COP1 and COP1′ transcripts have been submitted to the EMBL database under the EMBL accession numbers AJ276591, AJ276592, AJ289773, and AJ289774, respectively. PMID:11041887
Molecular and bioinformatic analysis of the FB-NOF transposable element.

PubMed

Badal, Martí; Portela, Anna; Xamena, Noel; Cabré, Oriol

2006-04-12

The Drosophila melanogaster transposable element FB-NOF is known to play a role in genome plasticity through the generation of all sorts of genomic rearrangements. Moreover, several insertional mutants due to FB mobilizations have been reported. Its structure and sequence, however, have been poorly studied, mainly as a consequence of the long, complex and repetitive sequence of FB inverted repeats. This repetitive region is composed of several 154 bp blocks, each with five almost identical repeats. In this paper, we report the sequencing, with high precision and reliability, of the 2 kb long FB inverted repeats of a complete FB-NOF element. This achievement was made possible by a new map of the FB repetitive region, which unambiguously identifies each repeat using new features that can serve as landmarks. With this new view of the element, a list of FB-NOF elements in the D. melanogaster genomic clones has been compiled, improving on previous works that used only bioinformatic algorithms. The availability of many FB and FB-NOF sequences allowed an analysis of the FB insertion sequences that showed no sequence specificity, but a preference for A/T rich sequences. The position of NOF within FB is also studied, revealing that it is always located after a second repeat in a random block. With the results of this analysis, we propose a model of transposition in which NOF jumps from FB to FB, using an unidentified transposase enzyme that should specifically recognize the second repeat end of the FB blocks.
A genome-wide BAC-end sequence survey provides first insights into sweetpotato (Ipomoea batatas (L.) Lam.) genome composition.

PubMed

Si, Zengzhi; Du, Bing; Huo, Jinxi; He, Shaozhen; Liu, Qingchang; Zhai, Hong

2016-11-21

Sweetpotato, Ipomoea batatas (L.) Lam., is an important food crop widely grown in the world. However, little is known about the genome of this species because it is a highly heterozygous hexaploid. Gaining a more in-depth knowledge of the sweetpotato genome is therefore necessary and imperative. In this study, the first bacterial artificial chromosome (BAC) library of sweetpotato was constructed. Clones from the BAC library were end-sequenced and analyzed to provide genome-wide information about this species. The BAC library contained 240,384 clones with an average insert size of 101 kb and had a 7.93-10.82 × coverage of the genome, and the probability of isolating any single-copy DNA sequence from the library was more than 99%. Both ends of 8310 BAC clones randomly selected from the library were sequenced to generate 11,542 high-quality BAC-end sequences (BESs), with a cumulative length of 7,595,261 bp and an average length of 658 bp. Analysis of the BESs revealed that 12.17% of the sweetpotato genome was known repetitive DNA, including 7.37% long terminal repeat (LTR) retrotransposons, 1.15% non-LTR retrotransposons and 1.42% Class II DNA transposons; 18.31% of the genome was identified as sweetpotato-unique repetitive DNA and 10.00% of the genome was predicted to be coding regions. In total, 3,846 simple sequence repeats (SSRs) were identified, with a density of one SSR per 1.93 kb, from which 288 SSR primers were designed and tested for length polymorphism using 20 sweetpotato accessions; 173 (60.07%) of them produced polymorphic bands. Sweetpotato BESs had significant hits to the genome sequences of I. trifida and more matches to the whole-genome sequences of Solanum lycopersicum than those of Vitis vinifera, Theobroma cacao and Arabidopsis thaliana. The first BAC library for sweetpotato has been successfully constructed.
The high quality BESs provide first insights into sweetpotato genome composition, and have significant hits to the genome sequences of I. trifida and more matches to the whole-genome sequences of Solanum lycopersicum. These resources as a robust platform will be used in high-resolution mapping, gene cloning, assembly of genome sequences, comparative genomics and evolution for sweetpotato.
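The library statistics in this record can be related to each other with the standard Clarke and Carbon calculation, P = 1 - (1 - f)^N, where f is the fraction of the genome carried by one clone and N is the number of clones. The Python sketch below is only an illustration. The abstract does not state a genome size, so the code back-calculates one from the quoted coverage range, which is an assumption made here and not a figure from the study. The function names are hypothetical.

```python
# Figures quoted in the abstract.
n_clones = 240_384
insert_bp = 101_000                      # average insert size, 101 kb
total_cloned_bp = n_clones * insert_bp   # total DNA held in the library


def genome_size_from_coverage(coverage: float) -> float:
    """Genome size (bp) implied by a given fold coverage (an inference)."""
    return total_cloned_bp / coverage


def recovery_probability(genome_bp: float) -> float:
    """Clarke-Carbon probability that a given single-copy sequence is present."""
    fraction_per_clone = insert_bp / genome_bp
    return 1.0 - (1.0 - fraction_per_clone) ** n_clones


# Lower and upper coverage bounds from the abstract.
for coverage in (7.93, 10.82):
    genome_bp = genome_size_from_coverage(coverage)
    print(coverage, round(genome_bp / 1e9, 2), recovery_probability(genome_bp))

# Simple consistency checks on the BES statistics.
mean_bes_length = 7_595_261 / 11_542      # average BES length in bp
polymorphic_percent = 100 * 173 / 288     # share of SSR primers that were polymorphic
print(round(mean_bes_length), round(polymorphic_percent, 2))
```

Even at the lower coverage bound the computed probability exceeds 0.99, in line with the "more than 99%" stated in the abstract.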
Development, characterization and cross species amplification of polymorphic microsatellite markers from expressed sequence tags of turmeric (Curcuma longa L.).

PubMed

Siju, S; Dhanya, K; Syamkumar, S; Sasikumar, B; Sheeja, T E; Bhat, A I; Parthasarathy, V A

2010-02-01

Expressed sequence tags (ESTs) from turmeric (Curcuma longa L.) were used for the screening of type and frequency of Class I (hypervariable) simple sequence repeats (SSRs). A total of 231 microsatellite repeats were detected from 12,593 EST sequences of turmeric after redundancy elimination. The average density of Class I SSRs amounts to one SSR per 17.96 kb of EST. Mononucleotides were the most abundant class of microsatellite repeat in turmeric ESTs, followed by trinucleotides. A robust set of 17 polymorphic EST-SSRs was developed and used for evaluating 20 turmeric accessions. The number of alleles detected ranged from 3 to 8 per locus. The developed markers were also evaluated in 13 related species of C. longa, confirming a high rate (100%) of cross-species transferability. The polymorphic microsatellite markers generated from this study could be used for genetic diversity analysis and resolving the taxonomic confusion prevailing in the genus.
Genetic analysis and fine mapping of LH1 and LH2, a set of complementary genes controlling late heading in rice (Oryza sativa L.)

PubMed Central

Liu, Shuang; Wang, Feng; Gao, Li Jun; Li, Jin Hua; Li, Rong Bai; Gao, Han Liang; Deng, Guo Fu; Yang, Jin Shui; Luo, Xiao Jin

2012-01-01

Heading date in rice (Oryza sativa L.) is a critical agronomic trait with a complex inheritance. To investigate the genetic basis and mechanism of gene interaction in heading date, we conducted genetic analysis on segregation populations derived from crosses among the indica cultivars Bo B, Yuefeng B and Baoxuan 2. A set of dominant complementary genes controlling late heading, designated LH1 and LH2, were detected by molecular marker mapping. Genetic analysis revealed that Baoxuan 2 contains both dominant genes, while Bo B and Yuefeng B each possess either LH1 or LH2. Using larger populations with segregant ratios of 3 : 1, we fine-mapped LH1 to a 63-kb region near the centromere of chromosome 7 flanked by markers RM5436 and RM8034, and LH2 to a 177-kb region on the short arm of chromosome 8 between flanking markers Indel22468-3 and RM25. Some candidate genes were identified through sequencing of Bo B and Yuefeng B in these target regions. Our work provides a solid foundation for further study on gene interaction in heading date and has application in marker-assisted breeding of photosensitive hybrid rice in China. PMID:23341744
Genetic analysis and fine mapping of LH1 and LH2, a set of complementary genes controlling late heading in rice (Oryza sativa L.).

PubMed

Liu, Shuang; Wang, Feng; Gao, Li Jun; Li, Jin Hua; Li, Rong Bai; Gao, Han Liang; Deng, Guo Fu; Yang, Jin Shui; Luo, Xiao Jin

2012-12-01

Heading date in rice (Oryza sativa L.) is a critical agronomic trait with a complex inheritance. To investigate the genetic basis and mechanism of gene interaction in heading date, we conducted genetic analysis on segregation populations derived from crosses among the indica cultivars Bo B, Yuefeng B and Baoxuan 2. A set of dominant complementary genes controlling late heading, designated LH1 and LH2, were detected by molecular marker mapping. Genetic analysis revealed that Baoxuan 2 contains both dominant genes, while Bo B and Yuefeng B each possess either LH1 or LH2. Using larger populations with segregant ratios of 3 : 1, we fine-mapped LH1 to a 63-kb region near the centromere of chromosome 7 flanked by markers RM5436 and RM8034, and LH2 to a 177-kb region on the short arm of chromosome 8 between flanking markers Indel22468-3 and RM25. Some candidate genes were identified through sequencing of Bo B and Yuefeng B in these target regions. Our work provides a solid foundation for further study on gene interaction in heading date and has application in marker-assisted breeding of photosensitive hybrid rice in China.
Draft genome of the leopard gecko, Eublepharis macularius.

PubMed

Xiong, Zijun; Li, Fang; Li, Qiye; Zhou, Long; Gamble, Tony; Zheng, Jiao; Kui, Ling; Li, Cai; Li, Shengbin; Yang, Huanming; Zhang, Guojie

2016-10-26

Geckos are among the most species-rich reptile groups and the sister clade to all other lizards and snakes. Geckos possess a suite of distinctive characteristics, including adhesive digits, nocturnal activity, hard, calcareous eggshells, and a lack of eyelids. However, one gecko clade, the Eublepharidae, appears to be the exception to most of these 'rules' and lacks adhesive toe pads, has eyelids, and lays eggs with soft, leathery eggshells. These differences make eublepharids an important component of any investigation into the underlying genomic innovations contributing to the distinctive phenotypes in 'typical' geckos. We report high-depth genome sequencing, assembly, and annotation for a male leopard gecko, Eublepharis macularius (Eublepharidae). Illumina sequence data were generated from seven insert libraries (ranging from 170 bp to 20 kb), representing a raw sequencing depth of 136X from 303 Gb of data, reduced to 84X and 187 Gb after filtering. The assembled genome of 2.02 Gb was close to the 2.23 Gb estimated by k-mer analysis. Scaffold and contig N50 sizes of 664 and 20 kb, respectively, were comparable to the previously published Gekko japonicus genome. Repetitive elements accounted for 42 % of the genome. Gene annotation yielded 24,755 protein-coding genes, of which 93 % were functionally annotated. CEGMA and BUSCO assessment showed that our assembly captured 91 % (225 of 248) of the core eukaryotic genes, and 76 % of vertebrate universal single-copy orthologs. Assembly of the leopard gecko genome provides a valuable resource for future comparative genomic studies of geckos and other squamate reptiles.
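Sequencing depth is total sequenced bases divided by genome size, so the depth and data-volume figures above imply a genome size that can be compared with the k-mer estimate. The short Python sketch below makes that check using only numbers quoted in the abstract. The function name is illustrative, and the reading that depth was computed against the k-mer estimate is an assumption.

```python
def implied_genome_size_gb(total_gb: float, depth: float) -> float:
    """Genome size (Gb) implied by a data volume (Gb) and a fold depth."""
    return total_gb / depth


raw_estimate = implied_genome_size_gb(303, 136)      # raw data
filtered_estimate = implied_genome_size_gb(187, 84)  # after filtering

# Fraction of CEGMA core eukaryotic genes recovered (225 of 248).
cegma_percent = 100 * 225 / 248

print(round(raw_estimate, 2), round(filtered_estimate, 2), round(cegma_percent))
```

Both data volumes imply a genome of roughly 2.23 Gb, matching the k-mer estimate, and 225 of 248 rounds to the reported 91 %.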
Design and verification of a pangenome microarray oligonucleotide probe set for Dehalococcoides spp.

PubMed

Hug, Laura A; Salehi, Maryam; Nuin, Paulo; Tillier, Elisabeth R; Edwards, Elizabeth A

2011-08-01

Dehalococcoides spp. are an industrially relevant group of Chloroflexi bacteria capable of reductively dechlorinating contaminants in groundwater environments. Existing Dehalococcoides genomes revealed a high level of sequence identity within this group, including 98 to 100% 16S rRNA sequence identity between strains with diverse substrate specificities. Common molecular techniques for identification of microbial populations are often not applicable for distinguishing Dehalococcoides strains. Here we describe an oligonucleotide microarray probe set designed based on clustered Dehalococcoides genes from five different sources (strain DET195, CBDB1, BAV1, and VS genomes and the KB-1 metagenome). This "pangenome" probe set provides coverage of core Dehalococcoides genes as well as strain-specific genes while optimizing the potential for hybridization to closely related, previously unknown Dehalococcoides strains. The pangenome probe set was compared to probe sets designed independently for each of the five Dehalococcoides strains. The pangenome probe set demonstrated better predictability and higher detection of Dehalococcoides genes than strain-specific probe sets on nontarget strains with <99% average nucleotide identity. An in silico analysis of the expected probe hybridization against the recently released Dehalococcoides strain GT genome and additional KB-1 metagenome sequence data indicated that the pangenome probe set performs more robustly than the combined strain-specific probe sets in the detection of genes not included in the original design. The pangenome probe set represents a highly specific, universal tool for the detection and characterization of Dehalococcoides from contaminated sites. It has the potential to become a common platform for Dehalococcoides-focused research, allowing meaningful comparisons between microarray experiments regardless of the strain examined.
DNA sequence responsible for the amplification of adjacent genes.

PubMed

Pasion, S G; Hartigan, J A; Kumar, V; Biswas, D K

1987-10-01

A 10.3-kb DNA fragment in the 5'-flanking region of the rat prolactin (rPRL) gene was isolated from F1BGH(1)2C1, a strain of rat pituitary tumor cells (GH cells) that produces prolactin in response to 5-bromodeoxyuridine (BrdU). Following transfection and integration into genomic DNA of recipient mouse L cells, this DNA induced amplification of the adjacent thymidine kinase gene from Herpes simplex virus type 1 (HSV1TK). We confirmed the ability of this "Amplicon" sequence to induce amplification of other linked or unlinked genes in DNA-mediated gene transfer studies. When transferred into the mouse L cells with the 10.3-5'rPRL gene sequence of BrdU-responsive cells, both the human growth hormone and the HSV1TK genes are amplified in response to 5-bromodeoxyuridine. This observation is substantiated by BrdU-induced amplification of the cotransferred bacterial Neo gene. Cotransfection studies reveal that the BrdU-induced amplification capability is associated with a 4-kb DNA sequence in the 5'-flanking region of the rPRL gene of BrdU-responsive cells. These results demonstrate that genes of heterologous origin, linked or unlinked, and selected or unselected, can be coamplified when located within the amplification boundary of the Amplicon sequence.

AFEAP cloning: a precise and efficient method for large DNA sequence assembly.

PubMed

Zeng, Fanli; Zang, Jinping; Zhang, Suhua; Hao, Zhimin; Dong, Jingao; Lin, Yibin

2017-11-14

Recent development of DNA assembly technologies has spurred myriad advances in synthetic biology, but new tools are always required for complicated scenarios. Here, we have developed an alternative DNA assembly method named AFEAP cloning (Assembly of Fragment Ends After PCR), which allows scarless, modular, and reliable construction of biological pathways and circuits from basic genetic parts. The AFEAP method requires two rounds of PCR followed by ligation of the sticky ends of DNA fragments. The first PCR yields linear DNA fragments and is followed by a second asymmetric (one primer) PCR and subsequent annealing that inserts overlapping overhangs at both sides of each DNA fragment. The overlapping overhangs of the neighboring DNA fragments are annealed and the nick is sealed by T4 DNA ligase, followed by bacterial transformation to yield the desired plasmids. We characterized the capability and limitations of the newly developed AFEAP cloning and demonstrated its application to assembling DNA in varying scenarios. Under the optimized conditions, AFEAP cloning allows assembly of an 8 kb plasmid from 1-13 fragments with high accuracy (between 80 and 100%), and 8.0, 11.6, 19.6, 28, and 35.6 kb plasmids from five fragments at 91.67, 91.67, 88.33, 86.33, and 81.67% fidelity, respectively. AFEAP cloning is also capable of constructing a bacterial artificial chromosome (BAC, 200 kb) with a fidelity of 46.7%. AFEAP cloning provides a powerful, efficient, seamless, and sequence-independent DNA assembly tool for multiple fragments up to 13 and large DNA up to 200 kb that expands the synthetic biologist's toolbox.
Whole-Genome Sequence of Coxiella burnetii Nine Mile RSA439 (Phase II, Clone 4), a Laboratory Workhorse Strain

PubMed Central

Beare, Paul A.; Moses, Abraham S.; Martens, Craig A.; Heinzen, Robert A.

2017-01-01

Here, we report the whole-genome sequence of Coxiella burnetii Nine Mile RSA439 (phase II, clone 4), a laboratory strain used extensively to investigate the biology of this intracellular bacterial pathogen. The genome consists of a 1.97-Mb chromosome and a 37.32-kb plasmid. PMID:28596399
Whole-Genome Sequence of Coxiella burnetii Nine Mile RSA439 (Phase II, Clone 4), a Laboratory Workhorse Strain.

PubMed

Millar, Jess A; Beare, Paul A; Moses, Abraham S; Martens, Craig A; Heinzen, Robert A; Raghavan, Rahul

2017-06-08

Here, we report the whole-genome sequence of Coxiella burnetii Nine Mile RSA439 (phase II, clone 4), a laboratory strain used extensively to investigate the biology of this intracellular bacterial pathogen. The genome consists of a 1.97-Mb chromosome and a 37.32-kb plasmid. Copyright © 2017 Millar et al.
Finished Genome Sequence of Bacillus cereus Strain 03BB87, a Clinical Isolate with B. anthracis Virulence Genes

DOE PAGES

Johnson, Shannon L.; Minogue, Timothy D.; Teshima, Hazuki; ...

2015-01-15

Bacillus cereus strain 03BB87, a blood culture isolate, originated in a 56-year-old male muller operator with a fatal case of pneumonia in 2003. Here we present the finished genome sequence of that pathogen, including a 5.46-Mb chromosome and two plasmids (209 and 52 kb, respectively).
Characterization of the synthesis and expression of the GTA-kinase from transformed and normal rodent cells.

PubMed

Kerr, M; Fischer, J E; Purushotham, K R; Gao, D; Nakagawa, Y; Maeda, N; Ghanta, V; Hiramoto, R; Chegini, N; Humphreys-Beher, M G

1994-08-02

The murine transformed cell line YC-8 and beta-adrenergic receptor agonist (isoproterenol) treated rat and mouse parotid gland acinar cells ectopically express cell surface beta 1-4 galactosyltransferase during active proliferation. This activity is dependent upon the expression of the GTA-kinase (p58) in these cells. Using total RNA, cDNA clones for the protein coding region of the kinase were isolated by reverse transcriptase-PCR cloning. DNA sequence analysis failed to show sequence differences with the normal homolog from mouse cells although Southern blot analysis of YC-8, and a second cell line KI81, indicated changes in the restriction enzyme digestion profile relative to murine cell lines which do not express cell surface galactosyltransferase. The rat cDNA clone from isoproterenol-treated salivary glands showed a high degree of protein and nucleic acid sequence homology to the GTA-kinase from both murine and human sources. Northern blot analysis of YC-8 and a control cell line LSTRA revealed the synthesis of a major 3.0 kb mRNA from both cell lines plus the unique expression of a 4.5 kb mRNA in the YC-8 cells. Reverse transcriptase-PCR of LSTRA and YC-8 confirmed the increased steady state levels of the GTA-kinase mRNA in YC-8. In the mouse, induction of cell proliferation by isoproterenol resulted in a 50-fold increase in steady state mRNA levels for the kinase over the low level of expression in quiescent cells. Expression of the rat 3' untranslated region in rat parotid cells in vitro led to an increased rate of DNA synthesis, cell number and ectopic expression of cell surface galactosyltransferase in the sense orientation. Antisense expression or vector alone did not alter growth characteristics of acinar cells. A polyclonal antibody monospecific to a murine amino terminal peptide sequence revealed a uniform distribution of GTA-kinase over the cytoplasm of acinar and duct cells of control mouse parotid glands. However, upon growth stimulation, kinase was detected primarily in a perinuclear and nuclear immunostaining pattern. Western blot analysis confirmed a translocation from a cytoplasmic localization in both LSTRA and quiescent salivary cells to a membrane-associated localization in YC-8 and proliferating salivary cells.
Genetics of bacteria that utilize one carbon compounds: Final report, March 1, 1982-February 29, 1988

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hanson, R.S.

Broad host range plasmid vectors useful for cloning genes from bacteria that grow on methane and methanol were constructed. We have cloned and mapped nineteen genes required for the growth of Methylobacterium organophilum strain XX on methanol. Nineteen genes were found in seven linkage groups on the M. organophilum genome and were separated by 40 kb or more. Eleven genes were required for the synthesis of methanol dehydrogenase (MDH) and were located in three unlinked gene clusters. The MDH structural gene was localized on a 2.5 kb DNA fragment. The gene was sequenced and contains a 175 bp untranslated leader sequence, a signal sequence and the structural gene. MDH messenger RNA (mRNA) has a half life of approximately 20 min. and is present at approximately 2% of the cellular mRNA. The structural gene for the γ subunit of methane monooxygenase has been cloned from Methylosporovibrio. Methane monooxygenase subunits have been purified by Prof. J. Lipscomb's laboratory and are being sequenced to construct DNA probes to identify cloned subunit genes. New facultative methylotrophic bacteria were isolated and characterized. Several amino acid auxotrophs have been isolated. 11 refs.
IS30-related transposon mediated insertional inactivation of bile salt hydrolase (bsh1) gene of Lactobacillus plantarum strain Lp20.

PubMed

Kumar, Rajesh; Grover, Sunita; Kaushik, Jai K; Batish, Virender Kumar

2014-01-01

Lactobacillus plantarum is a flexible and versatile microorganism that inhabits a variety of niches, and its genome may express up to four bsh genes to maximize its survival in the mammalian gut. However, the ecological significance of multiple bsh genes in L. plantarum is still not clearly understood. Hence, this study demonstrated the disruption of the bile salt hydrolase (bsh1) gene due to the insertion of a transposable element in L. plantarum Lp20, a wild strain of human fecal origin. Surprisingly, L. plantarum strain Lp20 produced a ∼2.0 kb bsh1 amplicon against the normal size (∼1.0 kb) bsh1 amplicon of Bsh(+) L. plantarum Lp21. Strain Lp20 exhibited minimal Bsh activity in spite of having intact bsh2, bsh3 and bsh4 genes in its genome and hence had a Bsh(-) phenotype. Cloning and sequence characterization of the Lp20 bsh1 gene predicted four individual open reading frames (ORFs) within this region. BLAST analysis of ORF1 and ORF2 revealed significant sequence similarity to the L. plantarum bsh1 gene while ORF3 and ORF4 showed high sequence homology to IS30-family transposases. Since the IS30-related transposon element was inserted within the Lp20 bsh1 gene in reverse orientation (3'-5'), it introduced several stop codons and disrupted the protein reading frames of both Bsh1 and the transposase. Inverted terminal repeats (GGCAGATTG) of the transposon mediated its insertion at the 255-263 nt and 1301-1309 nt positions of the Lp20 bsh1 gene. In conclusion, insertion of the IS30-related transposon within the bsh1 gene sequence of L. plantarum strain Lp20 destroyed the integrity and functionality of the Bsh1 enzyme. Additionally, this transposon DNA sequence remains active among various Lactobacillus spp. and hence harbors the potential to be explored in the development of an efficient insertion mutagenesis system. Copyright © 2013 Elsevier GmbH. All rights reserved.
Mapping-by-sequencing of Ligon-lintless-1 (Li1) reveals a cluster of neighboring genes with correlated expression in developing fibers of Upland cotton (Gossypium hirsutum L.).

PubMed

Thyssen, Gregory N; Fang, David D; Turley, Rickie B; Florane, Christopher; Li, Ping; Naoumkina, Marina

2015-09-01

Mapping-by-sequencing and SNP marker analysis were used to fine map the Ligon-lintless-1 (Li1) short fiber mutation in tetraploid cotton to a 255-kb region that contains 16 annotated proteins. The Ligon-lintless-1 (Li1) mutant of cotton (Gossypium hirsutum L.) has been studied as a model for cotton fiber development since its identification in 1929; however, the causative mutation has not been identified yet. Here we report the fine genetic mapping of the mutation to a 255-kb region that contains only 16 annotated genes in the reference Gossypium raimondii genome. We took advantage of the incompletely dominant dwarf vegetative phenotype to identify 100 mutants (Li1/Li1) and 100 wild-type (li1/li1) homozygotes from a mapping population of 2567 F2 plants, which we bulked and deep sequenced. Since only homozygotes were sequenced, we were able to use a high stringency in SNP calling to rapidly narrow down the region harboring the Li1 locus, and designed subgenome-specific SNP markers to test the population. We characterized the expression of all sixteen genes in the region by RNA sequencing of elongating fibers and by RT-qPCR at seven time points spanning fiber development. One of the most highly expressed genes found in this interval in wild-type fiber cells is 40-fold under-expressed at the day of anthesis (DOA) in the mutant fiber cells. This gene is a major facilitator superfamily protein, part of the large family of proteins that includes auxin and sugar transporters. Interestingly, nearly all genes in this region were most highly expressed at DOA and showed a high degree of co-expression. Further characterization is required to determine if transport of hormones or carbohydrates is involved in both the dwarf and lintless phenotypes of Li1 plants.
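The bulked-homozygote logic in this abstract can be illustrated with a short sketch. It is not the authors' pipeline: the function names, the read counts, and the 0.9 threshold are all hypothetical, chosen only to show why sequencing two homozygous bulks permits stringent SNP filtering. At a SNP tightly linked to the locus, one bulk should be nearly fixed for one allele and the other bulk for the alternative, so the difference in allele frequency approaches 1; at unlinked SNPs both bulks sit near 0.5.

```python
def allele_frequency(alt_reads, total_reads):
    """Fraction of reads carrying the alternate allele at one SNP."""
    if total_reads == 0:
        raise ValueError("no read coverage at this site")
    return alt_reads / total_reads


def delta_snp_index(mutant_counts, wildtype_counts):
    """Allele-frequency difference between the two bulks.

    Each argument is an (alt_reads, total_reads) pair for one bulk.
    """
    return allele_frequency(*mutant_counts) - allele_frequency(*wildtype_counts)


def candidate_sites(sites, threshold=0.9):
    """Keep positions where the bulks are almost fixed for opposite alleles.

    `threshold` is an illustrative cutoff, not a value from the abstract.
    """
    return [
        pos
        for pos, mutant, wildtype in sites
        if abs(delta_snp_index(mutant, wildtype)) >= threshold
    ]


# Hypothetical read counts: (position, mutant bulk, wild-type bulk).
sites = [
    (1000, (48, 50), (1, 50)),   # near-fixed difference, likely linked
    (2000, (26, 50), (24, 50)),  # about 0.5 in both bulks, likely unlinked
    (3000, (50, 50), (0, 40)),   # fixed difference, likely linked
]
print(candidate_sites(sites))
```

In a real analysis the positions passing the filter would be examined for clustering along the reference genome, which is how a candidate interval such as the 255-kb region is narrowed down.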
Pseudomonas syringae pv. actinidiae Draft Genomes Comparison Reveal Strain-Specific Features Involved in Adaptation and Virulence to Actinidia Species

PubMed Central

Marcelletti, Simone; Ferrante, Patrizia; Petriccione, Milena; Firrao, Giuseppe; Scortichini, Marco

2011-01-01

A recent re-emerging bacterial canker disease incited by Pseudomonas syringae pv. actinidiae (Psa) is causing severe economic losses to Actinidia chinensis and A. deliciosa cultivations in southern Europe, New Zealand, Chile and South Korea. Little is known about the genetic features of this pathovar. We generated genome-wide Illumina sequence data from two Psa strains causing outbreaks of bacterial canker on the A. deliciosa cv. Hayward in Japan (J-Psa, type-strain of the pathovar) and in Italy (I-Psa) in 1984 and 1992, respectively as well as from a Psa strain (I2-Psa) isolated at the beginning of the recent epidemic on A. chinensis cv. Hort16A in Italy. All strains were isolated from typical leaf spot symptoms. The phylogenetic relationships revealed that Psa is more closely related to P. s. pv. theae than to P. avellanae within genomospecies 8. Comparative genomic analyses revealed both relevant intrapathovar variations and putative pathovar-specific genomic regions in Psa. The genomic sequences of J-Psa and I-Psa were very similar. Conversely, the I2-Psa genome encodes four additional effector protein genes, lacks a 50 kb plasmid and the phaseolotoxin gene cluster, argK-tox but has acquired a 160 kb plasmid and putative prophage sequences. Several lines of evidence from the analysis of the genome sequences support the hypothesis that this strain did not evolve from the Psa population that caused the epidemics in 1984–1992 in Japan and Italy but rather is the product of a recent independent evolution of the pathovar actinidiae for infecting Actinidia spp. All Psa strains share the genetic potential for copper resistance, antibiotic detoxification, high affinity iron acquisition and detoxification of nitric oxide of plant origin. Similar to other sequenced phytopathogenic pseudomonads associated with woody plant species, the Psa strains isolated from leaves also display a set of genes involved in the catabolism of plant-derived aromatic compounds. PMID:22132095
Generation of sequence signatures from DNA amplification fingerprints with mini-hairpin and microsatellite primers.

PubMed

Caetano-Anollés, G; Gresshoff, P M

1996-06-01

DNA amplification fingerprinting (DAF) with mini-hairpins harboring arbitrary "core" sequences at their 3' termini was used to fingerprint a variety of templates, including PCR products and whole genomes, to establish genetic relationships between plant taxa at the interspecific and intraspecific level, and to identify closely related fungal isolates and plant accessions. No correlation was observed between the sequence of the arbitrary core, the stability of the mini-hairpin structure and DAF efficiency. Mini-hairpin primers with short arbitrary cores and primers complementary to simple sequence repeats present in microsatellites were also used to generate arbitrary signatures from amplification profiles (ASAP). The ASAP strategy is a dual-step amplification procedure that uses at least one primer in each fingerprinting stage. ASAP was able to reproducibly amplify DAF products (representing about 10-15 kb of sequence) following careful optimization of amplification parameters such as primer and template concentration. Avoidance of primer sequences partially complementary to DAF product termini was necessary in order to produce distinct fingerprints. This allowed the combinatorial use of oligomers in nucleic acid screening, with numerous ASAP fingerprinting reactions based on a limited number of primer sequences. Mini-hairpin primers and ASAP analysis significantly increased detection of polymorphic DNA, separating closely related bermudagrass (Cynodon) cultivars and detecting putatively linked markers in bulked segregant analysis of the soybean (Glycine max) supernodulation (nitrate-tolerant symbiosis) locus.
Molecular Characterization of Transgene Integration by Next-Generation Sequencing in Transgenic Cattle

PubMed Central

Zhang, Ran; Yin, Yinliang; Zhang, Yujun; Li, Kexin; Zhu, Hongxia; Gong, Qin; Wang, Jianwu; Hu, Xiaoxiang; Li, Ning

2012-01-01

As the number of transgenic livestock increases, reliable detection and molecular characterization of transgene integration sites and copy number are crucial not only for interpreting the relationship between the integration site and the specific phenotype but also for commercial and economic demands. However, the ability of conventional PCR techniques to detect incomplete and multiple integration events is limited, making it technically challenging to characterize transgenes. Next-generation sequencing has enabled cost-effective, routine and widespread high-throughput genomic analysis. Here, we demonstrate the use of next-generation sequencing to extensively characterize cattle harboring a 150-kb human lactoferrin transgene that was initially analyzed by chromosome walking without success. Using this approach, the sites upstream and downstream of the target gene integration site in the host genome were identified at the single nucleotide level. The sequencing result was verified by event-specific PCR for the integration sites and FISH for the chromosomal location. Sequencing depth analysis revealed that multiple copies of the incomplete target gene and the vector backbone were present in the host genome. Upon integration, complex recombination was also observed between the target gene and the vector backbone. These findings indicate that next-generation sequencing is a reliable and accurate approach for the molecular characterization of the transgene sequence, integration sites and copy number in transgenic species. PMID:23185606
Optical mapping and sequencing of the Escherichia coli KO11 genome reveal extensive chromosomal rearrangements, and multiple tandem copies of the Zymomonas mobilis pdc and adhB genes.

PubMed

Turner, Peter C; Yomano, Lorraine P; Jarboe, Laura R; York, Sean W; Baggett, Christy L; Moritz, Brélan E; Zentz, Emily B; Shanmugam, K T; Ingram, Lonnie O

2012-04-01

Escherichia coli KO11 (ATCC 55124) was engineered in 1990 to produce ethanol by chromosomal insertion of the Zymomonas mobilis pdc and adhB genes into E. coli W (ATCC 9637). KO11FL, our current laboratory version of KO11, and its parent E. coli W were sequenced, and contigs assembled into genomic sequences using optical NcoI restriction maps as templates. E. coli W contained plasmids pRK1 (102.5 kb) and pRK2 (5.4 kb), but KO11FL only contained pRK2. KO11FL optical maps made with AflII and with BamHI showed a tandem repeat region, consisting of at least 20 copies of a 10-kb unit. The repeat region was located at the insertion site for the pdc, adhB, and chloramphenicol-resistance genes. Sequence coverage of these genes was about 25-fold higher than average, consistent with amplification of the foreign genes that were inserted as circularized DNA. Selection for higher levels of chloramphenicol resistance originally produced strains with higher pdc and adhB expression, and hence improved fermentation performance, by increasing the gene copy number. Sequence data for an earlier version of KO11, ATCC 55124, indicated that multiple copies of pdc adhB were present. Comparison of the W and KO11FL genomes showed large inversions and deletions in KO11FL, mostly enabled by IS10, which is absent from W but present at 30 sites in KO11FL. The early KO11 strain ATCC 55124 had no rearrangements, contained only one IS10, and lacked most accumulated single nucleotide polymorphisms (SNPs) present in KO11FL. Despite rearrangements and SNPs in KO11FL, fermentation performance was equal to that of ATCC 55124.
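The relationship between sequence coverage and copy number reported above reduces to simple arithmetic, sketched below. The depth values are hypothetical placeholders; only the roughly 25-fold coverage ratio, the 10-kb repeat unit, and the lower bound of 20 copies come from the abstract. The sketch assumes coverage scales linearly with copy number, which ignores biases such as GC content or mapping ambiguity.

```python
def estimate_copy_number(region_depth, genome_depth):
    """Approximate copy number as region depth over genome-wide mean depth."""
    if genome_depth <= 0:
        raise ValueError("genome-wide depth must be positive")
    return region_depth / genome_depth


def repeat_span_kb(copies, unit_kb):
    """Total length of a tandem array made of `copies` units of `unit_kb` kb."""
    return copies * unit_kb


# Hypothetical depths giving the ~25-fold excess described in the abstract.
genome_depth = 40.0
region_depth = 1000.0

copies = estimate_copy_number(region_depth, genome_depth)
print(copies)                        # coverage-based estimate
print(repeat_span_kb(20, 10))        # optical-map lower bound: 20 copies x 10 kb
print(repeat_span_kb(copies, 10))    # span implied by the coverage estimate
```

Under these assumptions the coverage estimate (about 25 copies) is compatible with the optical-map observation of at least 20 copies of the 10-kb unit.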
Combined pituitary hormone deficiency due to gross deletions in the POU1F1 (PIT-1) and PROP1 genes.

PubMed

Bertko, Eleonore; Klammt, Jürgen; Dusatkova, Petra; Bahceci, Mithat; Gonc, Nazli; Ten Have, Louise; Kandemir, Nurgun; Mansmann, Georg; Obermannova, Barbora; Oostdijk, Wilma; Pfäffle, Heike; Rockstroh-Lippold, Denise; Schlicke, Marina; Tuzcu, Alpaslan Kemal; Pfäffle, Roland

2017-08-01

Pituitary development depends on a complex cascade of interacting transcription factors and signaling molecules. Lesions in this cascade lead to isolated or combined pituitary hormone deficiency (CPHD). The aim of this study was to identify copy number variants (CNVs) in genes known to cause CPHD and to determine their structure. We analyzed 70 CPHD patients from 64 families. Deletions were found in three Turkish families and one family from northern Iraq. In one family we identified a 4.96 kb deletion that comprises the first two exons of POU1F1. In three families a homozygous 15.9 kb deletion including complete PROP1 was discovered. Breakpoints map within highly homologous AluY sequences. Haplotype analysis revealed a shared haplotype of 350 kb among PROP1 deletion carriers. For the first time we were able to assign the boundaries of a previously reported PROP1 deletion. This gross deletion shows strong evidence to originate from a common ancestor in patients with Kurdish descent. No CNVs within LHX3, LHX4, HESX1, GH1 and GHRHR were found. Our data prove multiplex ligation-dependent probe amplification to be a valuable tool for the detection of CNVs as cause of pituitary insufficiencies and should be considered as an analytical method particularly in Kurdish patients.
Combined pituitary hormone deficiency due to gross deletions in the POU1F1 (PIT-1) and PROP1 genes

PubMed Central

Bertko, Eleonore; Klammt, Jürgen; Dusatkova, Petra; Bahceci, Mithat; Gonc, Nazli; ten Have, Louise; Kandemir, Nurgun; Mansmann, Georg; Obermannova, Barbora; Oostdijk, Wilma; Pfäffle, Heike; Rockstroh-Lippold, Denise; Schlicke, Marina; Tuzcu, Alpaslan Kemal; Pfäffle, Roland

2017-01-01

Pituitary development depends on a complex cascade of interacting transcription factors and signaling molecules. Lesions in this cascade lead to isolated or combined pituitary hormone deficiency (CPHD). The aim of this study was to identify copy number variants (CNVs) in genes known to cause CPHD and to determine their structure. We analyzed 70 CPHD patients from 64 families. Deletions were found in three Turkish families and one family from northern Iraq. In one family we identified a 4.96 kb deletion that comprises the first two exons of POU1F1. In three families a homozygous 15.9 kb deletion including complete PROP1 was discovered. Breakpoints map within highly homologous AluY sequences. Haplotype analysis revealed a shared haplotype of 350 kb among PROP1 deletion carriers. For the first time we were able to assign the boundaries of a previously reported PROP1 deletion. This gross deletion shows strong evidence to originate from a common ancestor in patients with Kurdish descent. No CNVs within LHX3, LHX4, HESX1, GH1 and GHRHR were found. Our data prove multiplex ligation-dependent probe amplification to be a valuable tool for the detection of CNVs as cause of pituitary insufficiencies and should be considered as an analytical method particularly in Kurdish patients. PMID:28356564
Population Based Assessment of MHC Class 1 Antigens Down Regulation as Marker in Increased Risk for Development and Progression of Breast Cancer From Benign Breast Lesions

DTIC Science & Technology

2006-01-01

isolated using a routine salting-out method (DNA E-Z Prepkit, Orchid Diagnostics Europe, St Katelijne Waver, Belgium). Sequence based typing In...electrophoresis using ethidium bromide to show the single 2 kb band before sequencing. Next, sequencing reactions were performed separately for exons 2, 3...Multiplex reverse transcription-polymerase chain reaction for simultaneous screening of 29 translocations and chromosomal aberrations in acute
Novel chaperonins are prevalent in the virioplankton and demonstrate links to viral biology and ecology

PubMed Central

Marine, Rachel L; Nasko, Daniel J; Wray, Jeffrey; Polson, Shawn W; Wommack, K Eric

2017-01-01

Chaperonins are protein-folding machinery found in all cellular life. Chaperonin genes have been documented within a few viruses, yet, surprisingly, analysis of metagenome sequence data indicated that chaperonin-carrying viruses are common and geographically widespread in marine ecosystems. Also unexpected was the discovery of viral chaperonin sequences related to thermosome proteins of archaea, indicating the presence of virioplankton populations infecting marine archaeal hosts. Virioplankton large subunit chaperonin sequences (GroELs) were divergent from bacterial sequences, indicating that viruses have carried this gene over long evolutionary time. Analysis of viral metagenome contigs indicated that: the order of large and small subunit genes was linked to the phylogeny of GroEL; both lytic and temperate phages may carry group I chaperonin genes; and viruses carrying a GroEL gene likely have large double-stranded DNA (dsDNA) genomes (>70 kb). Given these connections, it is likely that chaperonins are critical to the biology and ecology of virioplankton populations that carry these genes. Moreover, these discoveries raise the intriguing possibility that viral chaperonins may more broadly alter the structure and function of viral and cellular proteins in infected host cells. PMID:28731469
Novel chaperonins are prevalent in the virioplankton and demonstrate links to viral biology and ecology.

PubMed

Marine, Rachel L; Nasko, Daniel J; Wray, Jeffrey; Polson, Shawn W; Wommack, K Eric

2017-11-01

Chaperonins are protein-folding machinery found in all cellular life. Chaperonin genes have been documented within a few viruses, yet, surprisingly, analysis of metagenome sequence data indicated that chaperonin-carrying viruses are common and geographically widespread in marine ecosystems. Also unexpected was the discovery of viral chaperonin sequences related to thermosome proteins of archaea, indicating the presence of virioplankton populations infecting marine archaeal hosts. Virioplankton large subunit chaperonin sequences (GroELs) were divergent from bacterial sequences, indicating that viruses have carried this gene over long evolutionary time. Analysis of viral metagenome contigs indicated that: the order of large and small subunit genes was linked to the phylogeny of GroEL; both lytic and temperate phages may carry group I chaperonin genes; and viruses carrying a GroEL gene likely have large double-stranded DNA (dsDNA) genomes (>70 kb). Given these connections, it is likely that chaperonins are critical to the biology and ecology of virioplankton populations that carry these genes. Moreover, these discoveries raise the intriguing possibility that viral chaperonins may more broadly alter the structure and function of viral and cellular proteins in infected host cells.
Multiple bidirectional initiations and terminations of transcription in the Marek's disease virus long repeat regions.

PubMed Central

Chen, X B; Velicer, L F

1991-01-01

Marek's disease is an oncogenic disease of chickens caused by a herpesvirus, Marek's disease virus (MDV). Serial in vitro passage of pathogenic MDV results in amplification of a 132-bp direct repeat in the MDV genome's TRL and IRL repeat regions and loss of tumorigenicity. This led to the hypothesis that upon such expansion, one or more tumor-inducing genes fail to be expressed. In this report a group of cDNAs mapping in the expanded regions were isolated from a pathogenic MDV strain in which the 132-bp direct repeat number was found to range between one and seven. Partial cDNA sequencing and S1 nuclease protection analysis revealed that the corresponding transcripts are either initiated or terminated within or near the expanded regions at multiple sites in both rightward and leftward directions. Furthermore, each 132-bp repeat contains one TATA box and two polyadenylation consensus sequences in each direction. These RNAs contain a partial copy or one or more full copies of the 132-bp direct repeat at either their 5' or 3' end. Northern (RNA) blot analysis showed that the majority of transcripts are 1.8 kb in size, while the minor species range in size from 0.67 to 3.1 kb. Together, these data raise the possibility that the 132-bp direct repeat, and indirectly its copy number, may be involved in the regulation of transcriptional initiation and termination and therefore in the generation of four groups of transcripts from the TRL and IRL, although this remains to be demonstrated. PMID:1850022
ESTuber db: an online database for Tuber borchii EST sequences.

PubMed

Lazzari, Barbara; Caprera, Andrea; Cosentino, Cristian; Stella, Alessandra; Milanesi, Luciano; Viotti, Angelo

2007-03-08

The ESTuber database (http://www.itb.cnr.it/estuber) includes 3,271 Tuber borchii expressed sequence tags (EST). The dataset consists of 2,389 sequences from an in-house prepared cDNA library from truffle vegetative hyphae, and 882 sequences downloaded from GenBank and representing four libraries from white truffle mycelia and ascocarps at different developmental stages. An automated pipeline was prepared to process EST sequences using public software integrated by in-house developed Perl scripts. Data were collected in a MySQL database, which can be queried via a php-based web interface. Sequences included in the ESTuber db were clustered and annotated against three databases: the GenBank nr database, the UniProtKB database and a third in-house prepared database of fungi genomic sequences. An algorithm was implemented to infer statistical classification among Gene Ontology categories from the ontology occurrences deduced from the annotation procedure against the UniProtKB database. Ontologies were also deduced from the annotation of more than 130,000 EST sequences from five filamentous fungi, for intra-species comparison purposes. Further analyses were performed on the ESTuber db dataset, including tandem repeats search and comparison of the putative protein dataset inferred from the EST sequences to the PROSITE database for protein patterns identification. All the analyses were performed both on the complete sequence dataset and on the contig consensus sequences generated by the EST assembly procedure. The resulting web site is a resource of data and links related to truffle expressed genes. The Sequence Report and Contig Report pages are the web interface core structures which, together with the Text search utility and the Blast utility, allow easy access to the data stored in the database.
Accurate phylogenetic classification of DNA fragments based on sequence composition

DOE Office of Scientific and Technical Information (OSTI.GOV)

McHardy, Alice C.; Garcia Martin, Hector; Tsirigos, Aristotelis

2006-05-01

Metagenome studies have retrieved vast amounts of sequence out of a variety of environments, leading to novel discoveries and great insights into the uncultured microbial world. Except for very simple communities, diversity makes sequence assembly and analysis a very challenging problem. To understand the structure and function of microbial communities, a taxonomic characterization of the obtained sequence fragments is highly desirable, yet currently limited mostly to those sequences that contain phylogenetic marker genes. We show that for clades at the rank of domain down to genus, sequence composition allows the very accurate phylogenetic characterization of genomic sequence. We developed a composition-based classifier, PhyloPythia, for de novo phylogenetic sequence characterization and have trained it on a data set of 340 genomes. By extensive evaluation experiments we show that the method is accurate across all taxonomic ranks considered, even for sequences that originate from novel organisms and are as short as 1 kb. Application to two metagenome datasets obtained from samples of phosphorus-removing sludge showed that the method allows the accurate classification at genus level of most sequence fragments from the dominant populations, while at the same time correctly characterizing even larger parts of the samples at higher taxonomic levels.
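As an illustration of what "sequence composition" can mean as a classifier input, the sketch below turns a DNA fragment into a vector of normalized k-mer frequencies. It is not the PhyloPythia implementation: the function name `kmer_composition`, the choice of k, and the handling of ambiguous bases are assumptions made for this example, and the abstract does not specify the features or the learning algorithm used.

```python
from collections import Counter
from itertools import product


def kmer_composition(seq, k=2):
    """Return normalized k-mer frequencies in fixed lexicographic order.

    Windows containing characters other than A, C, G, T (for example N)
    are ignored, so they do not contribute to the total.
    """
    seq = seq.upper()
    kmers = ["".join(p) for p in product("ACGT", repeat=k)]
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    total = sum(counts[km] for km in kmers)
    if total == 0:
        return [0.0] * len(kmers)
    return [counts[km] / total for km in kmers]


# Hypothetical fragment; a real input would be a read or contig of ~1 kb or more.
fragment = "ACGTACGTNNACGT"
vector = kmer_composition(fragment, k=2)
print(len(vector))   # 4**k features
print(sum(vector))   # frequencies sum to 1 when any valid k-mer is present
```

Vectors of this kind, computed for fragments of known origin, could then be passed to any supervised classifier; short fragments yield noisier vectors, which is one reason accuracy on 1-kb sequences is a notable claim.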

Genome Dynamics and Evolution of the Mla (Powdery Mildew) Resistance Locus in Barley

PubMed Central

Wei, Fusheng; Wing, Rod A.; Wise, Roger P.

2002-01-01

Genes that confer defense against pathogens often are clustered in the genome and evolve via diverse mechanisms. To evaluate the organization and content of a major defense gene complex in cereals, we determined the complete sequence of a 261-kb BAC contig from barley cv Morex that spans the Mla (powdery mildew) resistance locus. Among the 32 predicted genes on this contig, 15 are associated with plant defense responses; 6 of these are associated with defense responses to powdery mildew disease but function in different signaling pathways. The Mla region is organized as three gene-rich islands separated by two nested complexes of transposable elements and a 45-kb gene-poor region. A heterochromatic-like region is positioned directly proximal to Mla and is composed of a gene-poor core with 17 families of diverse tandem repeats that overlap a hypermethylated, but transcriptionally active, gene-dense island. Paleontology analysis of long terminal repeat retrotransposons indicates that the present Mla region evolved over a period of >7 million years through a variety of duplication, inversion, and transposon-insertion events. Sequence-based recombination estimates indicate that R genes positioned adjacent to nested long terminal repeat retrotransposons, such as Mla, do not favor recombination as a means of diversification. We present a model for the evolution of the Mla region that encompasses several emerging features of large cereal genomes. PMID:12172030
A robust method to analyze copy number alterations of less than 100 kb in single cells using oligonucleotide array CGH.

PubMed

Möhlendick, Birte; Bartenhagen, Christoph; Behrens, Bianca; Honisch, Ellen; Raba, Katharina; Knoefel, Wolfram T; Stoecklein, Nikolas H

2013-01-01

Comprehensive genome-wide analyses of single cells have become increasingly important in cancer research, but remain a technically challenging task. Here, we provide a protocol for array comparative genomic hybridization (aCGH) of single cells. The protocol is based on an established adapter-linker PCR (WGAM) and allowed us to detect copy number alterations as small as 56 kb in single cells. In addition, we report on factors influencing the success of single cell aCGH downstream of the amplification method, including the characteristics of the reference DNA, the labeling technique, the amount of input DNA, reamplification, the aCGH resolution, and data analysis. In comparison with two other commercially available non-linear single cell amplification methods, WGAM showed a very good performance in aCGH experiments. Finally, we demonstrate that cancer cells that were processed and identified by the CellSearch® System and that were subsequently isolated from the CellSearch® cartridge as single cells by fluorescence activated cell sorting (FACS) could be successfully analyzed using our WGAM-aCGH protocol. We believe that even in the era of next-generation sequencing, our single cell aCGH protocol will be a useful and (cost-) effective approach to study copy number alterations in single cells at a resolution comparable to those reported currently for single cell digital karyotyping based on next generation sequencing data.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Schriner, J.E.; Yi, W.; Hofmann, S.L.

Palmitoyl-protein thioesterase (PPT) is a small glycoprotein that removes palmitate groups from cysteine residues in lipid-modified proteins. We recently reported mutations in PPT in patients with infantile neuronal ceroid lipofuscinosis (INCL), a severe neurodegenerative disorder. INCL is characterized by the accumulation of proteolipid storage material in brain and other tissues, suggesting that the disease is a consequence of abnormal catabolism of acylated proteins. In the current paper, we report the sequence of the human PPT cDNA and the structure of the human PPT gene. The cDNA predicts a protein of 306 amino acids that contains a 25-amino-acid signal peptide, three N-linked glycosylation sites, and consensus motifs characteristic of thioesterases. Northern analysis of a human tissue blot revealed ubiquitous expression of a single 2.5-kb mRNA, with highest expression in lung, brain, and heart. The human PPT gene spans 25 kb and is composed of seven coding exons and a large eighth exon, containing the entire 3′-untranslated region of 1388 bp. An Alu repeat and promoter elements corresponding to putative binding sites for several general transcription factors were identified in the 1060 nucleotides upstream of the transcription start site. The human PPT cDNA sequence and gene structure will provide the means for the identification of further causative mutations in INCL and facilitate genetic screening in selected high-risk populations. 31 refs., 5 figs., 1 tab.
Identification and Functional Analysis of the Nocardithiocin Gene Cluster in Nocardia pseudobrasiliensis

PubMed Central

Sakai, Kanae; Komaki, Hisayuki; Gonoi, Tohru

2015-01-01

Nocardithiocin is a thiopeptide compound isolated from the opportunistic pathogen Nocardia pseudobrasiliensis. It shows a strong activity against acid-fast bacteria and is also active against rifampicin-resistant Mycobacterium tuberculosis. Here, we report the identification of the nocardithiocin gene cluster in N. pseudobrasiliensis IFM 0761 based on conserved thiopeptide biosynthesis gene sequence and the whole genome sequence. The predicted gene cluster was confirmed by gene disruption and complementation. As expected, strains containing the disrupted gene did not produce nocardithiocin while gene complementation restored nocardithiocin production in these strains. The predicted cluster was further analyzed using RNA-seq which showed that the nocardithiocin gene cluster contains 12 genes within a 15.2-kb region. This finding will promote the improvement of nocardithiocin productivity and its derivatives production. PMID:26588225
Euglena gracilis chloroplast DNA: analysis of a 1.6 kb intron of the psb C gene containing an open reading frame of 458 codons.

PubMed

Montandon, P E; Vasserot, A; Stutz, E

1986-01-01

We retrieved a 1.6 kbp intron separating two exons of the psb C gene which codes for the 44 kDa reaction center protein of photosystem II. This intron is 3 to 4 times the size of all previously sequenced Euglena gracilis chloroplast introns. It contains an open reading frame of 458 codons potentially coding for a basic protein of 54 kDa of yet unknown function. The intron boundaries follow consensus sequences established for chloroplast introns related to class II and nuclear pre-mRNA introns. Its 3'-terminal segment has structural features similar to class II mitochondrial introns with an invariant base A as possible branch point for lariat formation.
Characterization of a highly polymorphic region 5′ to JH in the human immunoglobulin heavy chain

PubMed Central

Silva, Alcino J.; Johnson, John P.; White, Raymond L.

1987-01-01

A cloned DNA segment 1.25 kilobases (kb) upstream from the joining segments of the human heavy chain immunoglobulin gene revealed extensive polymorphic variation at this locus, and the polymorphic pattern was stably transmitted to the next generation. Genomic restriction analysis showed that the polymorphism was caused by insertions/deletions within an MspI/BamHI fragment. Sequencing of one allele, 848 base pairs (bp) long, revealed eleven 50-base-pair tandem repeats. A second allele, 648 bp long, was cloned from a human genomic cosmid library, sequenced, and found to contain four fewer repeats than the first allele. A survey of 186 chromosomes from unrelated individuals of primarily northern European descent revealed at least six alleles. PMID:2884636
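
The allele sizes quoted above can be checked with simple arithmetic. The sketch below assumes every repeat unit is exactly 50 bp and that the two alleles differ only in repeat copy number; both are simplifying assumptions for illustration, and the function name is hypothetical.

```python
REPEAT_LEN = 50      # bp per tandem repeat, as stated in the abstract
LONG_ALLELE = 848    # bp, eleven repeats
SHORT_ALLELE = 648   # bp, four fewer repeats than the long allele


def flanking_length(allele_len, n_repeats, repeat_len=REPEAT_LEN):
    """Non-repeat sequence left after removing n_repeats repeat units."""
    return allele_len - n_repeats * repeat_len


flank_long = flanking_length(LONG_ALLELE, 11)        # 848 - 550
flank_short = flanking_length(SHORT_ALLELE, 11 - 4)  # 648 - 350
```

Under these assumptions both alleles leave the same amount of non-repeat sequence, which is consistent with the 200 bp size difference being accounted for by four 50-bp repeats.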
Fastidian gum: the Xylella fastidiosa exopolysaccharide possibly involved in bacterial pathogenicity.

PubMed

da Silva, F R; Vettore, A L; Kemper, E L; Leite, A; Arruda, P

2001-09-25

The Gram-negative bacterium Xylella fastidiosa was the first plant pathogen to be completely sequenced. This species causes several economically important plant diseases, including citrus variegated chlorosis (CVC). Analysis of the genomic sequence of X. fastidiosa revealed a 12 kb DNA fragment containing an operon closely related to the gum operon of Xanthomonas campestris. The presence of all genes involved in the synthesis of sugar precursors, existence of exopolysaccharide (EPS) production regulators in the genome, and the absence of three of the X. campestris gum genes suggested that X. fastidiosa is able to synthesize an EPS different from that of xanthan gum. This novel EPS probably consists of polymerized tetrasaccharide repeating units assembled by the sequential addition of glucose-1-phosphate, glucose, mannose and glucuronic acid on a polyprenol phosphate carrier.
Assessment of the utility of the tomato fruit-specific E8 promoter for driving vaccine antigen expression.

PubMed

He, Zhu-Mei; Jiang, Xiao-Ling; Qi, Yu; Luo, Di-Qing

2008-06-01

To assess the utility of the tomato fruit-specific E8 gene's promoter for driving vaccine antigen expression in plant, the 2.2 kb and 1.1 kb E8 promoters were isolated and sequenced from Lycopersicon esculentum cv. Jinfeng #1. The 1.1 kb promoter was fused to vaccine antigen HBsAg M gene for the transfer to Nicotiana tabacum, and the CaMV 35S promoter was used for comparison. Cholera toxin B (ctb) gene under the control of the 1.1 kb promoter was transformed into both N. tabacum and L. esculentum. Southern blot hybridization confirmed the stable integration of the target genes into the tomato and tobacco genomes. ELISA assay showed that the expression product of HBsAg M gene under the control of the 1.1 kb E8 promoter could not be detected in transgenic tobacco tissues such as leaves, flowers, and seeds. In contrast, the expression of HBsAg M gene driven by CaMV 35S promoter could be detected in transgenic tobacco. ELISA assay for CTB proved that the 1.1 kb E8 promoter was able to direct the expression of exotic gene in ripe fruits of transgenic tomato, but expression was absent in leaf, flower, and unripe fruit of tomato, and CTB protein was not detected in transgenic tobacco tissues such as leaves, flowers, and seeds when the gene was under the control of the 1.1 kb E8 promoter. The results indicated that the E8 promoter acted not only in an organ-specific, but also in a species-specific fashion in plant transformation.
The genome of the Erwinia amylovora phage PhiEaH1 reveals greater diversity and broadens the applicability of phages for the treatment of fire blight.

PubMed

Meczker, Katalin; Dömötör, Dóra; Vass, János; Rákhely, Gábor; Schneider, György; Kovács, Tamás

2014-01-01

The enterobacterium Erwinia amylovora is the causal agent of fire blight. This study presents the analysis of the complete genome of phage PhiEaH1, isolated from the soil surrounding an E. amylovora-infected apple tree in Hungary. Its genome is 218 kb in size, containing 244 ORFs. PhiEaH1 is the second E. amylovora infecting phage from the Siphoviridae family whose complete genome sequence was determined. Beside PhiEaH2, PhiEaH1 is the other active component of Erwiphage, the first bacteriophage-based pesticide on the market against E. amylovora. Comparative genome analysis in this study has revealed that PhiEaH1 not only differs from the 10 formerly sequenced E. amylovora bacteriophages belonging to other phage families, but also from PhiEaH2. Sequencing of more Siphoviridae phage genomes might reveal further diversity, providing opportunities for the development of even more effective biological control agents, phage cocktails against Erwinia fire blight disease of commercial fruit crops.
Assembly and analysis of a male sterile rubber tree mitochondrial genome reveals DNA rearrangement events and a novel transcript

PubMed Central

2014-01-01

Background The rubber tree, Hevea brasiliensis, is an important plant species that is commercially grown to produce latex rubber in many countries. The rubber tree variety BPM 24 exhibits cytoplasmic male sterility, inherited from the variety GT 1. Results We constructed the rubber tree mitochondrial genome of a cytoplasmic male sterile variety, BPM 24, using 454 sequencing, including 8 kb paired-end libraries, plus Illumina paired-end sequencing. We annotated this mitochondrial genome with the aid of Illumina RNA-seq data and performed comparative analysis. We then compared the sequence of BPM 24 to the contigs of the published rubber tree, variety RRIM 600, and identified a rearrangement that is unique to BPM 24 resulting in a novel transcript containing a portion of atp9. Conclusions The novel transcript is consistent with changes that cause cytoplasmic male sterility through a slight reduction to ATP production efficiency. The exhaustive nature of the search rules out alternative causes and supports previous findings of novel transcripts causing cytoplasmic male sterility. PMID:24512148
PISMA: A Visual Representation of Motif Distribution in DNA Sequences.

PubMed

Alcántara-Silva, Rogelio; Alvarado-Hermida, Moisés; Díaz-Contreras, Gibrán; Sánchez-Barrios, Martha; Carrera, Samantha; Galván, Silvia Carolina

2017-01-01

Because the graphical presentation and analysis of motif distribution can provide insights for experimental hypothesis, PISMA aims at identifying motifs on DNA sequences, counting and showing them graphically. The motif length ranges from 2 to 10 bases, and the DNA sequences range up to 10 kb. The motif distribution is shown as a bar-code-like, as a gene-map-like, and as a transcript scheme. We obtained graphical schemes of the CpG site distribution from 91 human papillomavirus genomes. Also, we present 2 analyses: one of DNA motifs associated with either methylation-resistant or methylation-sensitive CpG islands and another analysis of motifs associated with exosome RNA secretion. PISMA is developed in Java; it is executable in any type of hardware and in diverse operating systems. PISMA is freely available to noncommercial users. The English version and the User Manual are provided in Supplementary Files 1 and 2, and a Spanish version is available at www.biomedicas.unam.mx/wp-content/software/pisma.zip and www.biomedicas.unam.mx/wp-content/pdf/manual/pisma.pdf.
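
The motif counting and bar-code-like display described above can be sketched in a few lines. This is an illustrative Python example, not PISMA itself (which the abstract says is written in Java); the function names and the text rendering are assumptions.

```python
def motif_positions(seq, motif):
    """Return 0-based start positions of motif in seq, overlaps included."""
    seq, motif = seq.upper(), motif.upper()
    positions = []
    start = seq.find(motif)
    while start != -1:
        positions.append(start)
        # Advance by one base so overlapping occurrences are also found.
        start = seq.find(motif, start + 1)
    return positions


def barcode(seq, motif):
    """Render motif start sites as '|' on a line of '-' of the sequence length."""
    marks = ["-"] * len(seq)
    for p in motif_positions(seq, motif):
        marks[p] = "|"
    return "".join(marks)


# Example: CpG sites in a toy sequence.
cpg_sites = motif_positions("ACGCGTTCG", "CG")
cpg_track = barcode("ACGCGTTCG", "CG")
```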
Overview of recurrent chromosomal losses in retinoblastoma detected by low coverage next generation sequencing

PubMed Central

García-Chequer, A.J.; Méndez-Tenorio, A.; Olguín-Ruiz, G.; Sánchez-Vallejo, C.; Isa, P.; Arias, C.F.; Torres, J.; Hernández-Angeles, A.; Ramírez-Ortiz, M.A.; Lara, C.; Cabrera-Muñoz, M.L.; Sadowinski-Pine, S.; Bravo-Ortiz, J.C.; Ramón-García, G.; Diegopérez-Ramírez, J.; Ramírez-Reyes, G.; Casarrubias-Islas, R.; Ramírez, J.; Orjuela, M.A.; Ponce-Castañeda, M.V.

2016-01-01

Genes are frequently lost or gained in malignant tumors and the analysis of these changes can be informative about the underlying tumor biology. Retinoblastoma is a pediatric intraocular malignancy, and since deletions in chromosome 13 have been described in this tumor, we performed genome-wide sequencing with the Illumina platform to test whether recurrent losses could be detected in low coverage data from DNA pools of Rb cases. An in silico reference profile for each pool was created from the human genome sequence GRCh37p5; a chromosome integrity score and a graphical 40 kb window analysis approach allowed us to identify, with high resolution, previously reported non-random recurrent losses in all chromosomes of these tumors. We also found a pattern of gains and losses associated with clear and dark cytogenetic bands, respectively. We further analyzed a pool of medulloblastoma and found a more stable genomic profile and previously reported losses in this tumor. This approach facilitates identification of recurrent deletions from many patients that may be biologically relevant for tumor development. PMID:26883451
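
A window-based comparison of the kind mentioned above can be sketched as follows. This is a generic illustration under stated assumptions, not the authors' pipeline: read start positions are binned into 40 kb windows, and each window's share of reads is compared with a reference profile on a log2 scale. The function names and the pseudocount of 0.5 are hypothetical choices.

```python
import math
from collections import Counter


def window_counts(read_starts, window=40_000):
    """Count reads per fixed-size window, keyed by window index."""
    return Counter(pos // window for pos in read_starts)


def log2_ratios(sample, reference, pseudocount=0.5):
    """log2(sample/reference) per window after scaling by total read count.

    The pseudocount (an assumption) keeps windows with zero sample reads,
    i.e. candidate complete losses, in the output.
    """
    s_total, r_total = sum(sample.values()), sum(reference.values())
    ratios = {}
    for w, r in reference.items():
        s = sample.get(w, 0)
        ratios[w] = math.log2(
            ((s + pseudocount) / s_total) / ((r + pseudocount) / r_total)
        )
    return ratios


# Toy data: window 1 has half the reference coverage in the sample.
sample = Counter({0: 10, 1: 5})
reference = Counter({0: 10, 1: 10})
ratios = log2_ratios(sample, reference)
```

Windows with clearly negative ratios would be candidates for losses; any threshold for calling them is left open here.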
PISMA: A Visual Representation of Motif Distribution in DNA Sequences

PubMed Central

Alcántara-Silva, Rogelio; Alvarado-Hermida, Moisés; Díaz-Contreras, Gibrán; Sánchez-Barrios, Martha; Carrera, Samantha; Galván, Silvia Carolina

2017-01-01

Background: Because the graphical presentation and analysis of motif distribution can provide insights for experimental hypothesis, PISMA aims at identifying motifs on DNA sequences, counting and showing them graphically. The motif length ranges from 2 to 10 bases, and the DNA sequences range up to 10 kb. The motif distribution is shown as a bar-code–like, as a gene-map–like, and as a transcript scheme. Results: We obtained graphical schemes of the CpG site distribution from 91 human papillomavirus genomes. Also, we present 2 analyses: one of DNA motifs associated with either methylation-resistant or methylation-sensitive CpG islands and another analysis of motifs associated with exosome RNA secretion. Availability and Implementation: PISMA is developed in Java; it is executable in any type of hardware and in diverse operating systems. PISMA is freely available to noncommercial users. The English version and the User Manual are provided in Supplementary Files 1 and 2, and a Spanish version is available at www.biomedicas.unam.mx/wp-content/software/pisma.zip and www.biomedicas.unam.mx/wp-content/pdf/manual/pisma.pdf. PMID:28469418
DOE Office of Scientific and Technical Information (OSTI.GOV)

Kerr, J.M.; Fisher, L.W.; Termine, J.D.

The authors have isolated and partially sequenced the human bone sialoprotein gene (IBSP). IBSP has been sublocalized by in situ hybridization to chromosome 4q38-q31 and is composed of six small exons (51 to 159 bp) and 1 large exon (~2.6 kb). The intron/exon junctions defined by sequence analysis are of class O, retaining an intact coding triplet. Sequence analysis of the 5′ upstream region revealed a TATAA (nucleotides -30 to -25 from the transcriptional start point) and a CCAAT (nucleotides -56 to -52) box, both in the reverse orientation. Intron 1 contains interesting structural elements composed of polypyrimidine repeats followed by a poly(AC)n tract. Both types of structural elements have been detected in promoter regions of other genes and have been implicated in transcriptional regulation. Several differences between the previously published cDNA sequence and the authors' sequence have been identified, most of which are contained within the untranslated exon 1. Three base revisions in the coding region include a G to T (Gly to Val, amino acid 195), T to C (Val to Ala, amino acid 268), and T to A (Glu to Asp, amino acid 270). In conclusion, the genomic organization and potential regulatory elements of human IBSP have been elucidated. 42 refs., 4 figs., 1 tab.
DOE Office of Scientific and Technical Information (OSTI.GOV)

Lagrimini, L.M.

Since this manuscript was submitted we have conducted a more thorough physiological analysis of water relations in wild-type and peroxidase overproducing plants. These experiments include pressure bomb, plasmolysis, and membrane integrity analysis. We are also in the process of analyzing other phenotypes in peroxidase overproducer plants such as excessive browning of tissue, the rapid death of tissue in culture, and poor germination of seed. Transformed plants of Nicotiana tabacum and Nicotiana sylvestris were obtained which have peroxidase activity 3- to 7-fold lower than wild-type plants. This was done by introducing a chimeric gene composed of the CaMV 35S promoter and the 5' half of the tobacco anionic peroxidase cDNA in the antisense RNA configuration. A manuscript which describes this work is being written, and will be submitted for publication in January 1990. The anionic peroxidase gene has been cloned by hybridization to the cloned cDNA. The entire gene is contained on an 8.7 kb fragment within a lambda phage clone. Several smaller DNA fragments have been subcloned, and some have been sequenced. One exon within the coding sequence has been sequenced, along with the partial sequence of two introns. Further sequencing is being carried out to identify the promoter, which will later be joined to a reporter gene. 6 figs.
Comparative sequence analysis of a region on human chromosome 13q14, frequently deleted in B-cell chronic lymphocytic leukemia, and its homologous region on mouse chromosome 14.

PubMed

Kapanadze, B; Makeeva, N; Corcoran, M; Jareborg, N; Hammarsund, M; Baranova, A; Zabarovsky, E; Vorontsova, O; Merup, M; Gahrton, G; Jansson, M; Yankovsky, N; Einhorn, S; Oscier, D; Grandér, D; Sangfelt, O

2000-12-15

Previous studies have indicated the presence of a putative tumor suppressor gene on human chromosome 13q14, commonly deleted in patients with B-cell chronic lymphocytic leukemia (B-CLL). We have recently identified a minimally deleted region encompassing parts of two adjacent genes, termed LEU1 and LEU2 (leukemia-associated genes 1 and 2), and several additional transcripts. In addition, 50 kb centromeric to this region we have identified another gene, LEU5/RFP2. To elucidate further the complex genomic organization of this region, we have identified, mapped, and sequenced the homologous region in the mouse. Fluorescence in situ hybridization analysis demonstrated that the region maps to mouse chromosome 14. The overall organization and gene order in this region were found to be highly conserved in the mouse. Sequence comparison between the human deletion hotspot region and its homologous mouse region revealed a high degree of sequence conservation with an overall score of 74%. However, our data also show that in terms of transcribed sequences, only two of those, human LEU2 and LEU5/RFP2, are clearly conserved, strengthening the case for these genes as putative candidate B-CLL tumor suppressor genes.
Analysis of the aac(3)-VIa gene encoding a novel 3-N-acetyltransferase.

PubMed Central

Rather, P N; Mann, P A; Mierzwa, R; Hare, R S; Miller, G H; Shaw, K J

1993-01-01

Biochemical analysis (G. A. Papanicolaou, R. S. Hare, R. Mierzwa, and G. H. Miller, abstr. 152, Program Abstr. 29th Intersci. Conf. Antimicrob. Agents Chemother., 1989) demonstrated the presence of a novel 3-N-acetyltransferase in Enterobacter cloacae 88020217. This organism was resistant to gentamicin, and the MIC of 2'-N-ethylnetilmicin for it was fourfold lower than that of 6'-N-ethylnetilmicin, a resistance pattern which suggested 2'-acetylating activity. However, high-pressure liquid chromatography analysis demonstrated that the enzyme acetylated sisomicin in the 3 position. We have cloned the structural gene for this enzyme from a large (> 70-kb) conjugative plasmid present in E. cloacae. Subcloning experiments have localized the aac(3)-VIa gene to a 2.1-kb Sau3A fragment. The deduced AAC(3)-VIa protein showed 48% amino acid identity to the AAC(3)-IIa protein and 39% identity to the AAC(3)-VII protein. Examination of the 5'-flanking sequences demonstrated that the aac(3)-VIa gene was located 167 bp downstream of the aadA1 gene and was present in an integron. In addition, the aac(3)-VIa gene is also downstream of a 59-base element often seen in an integron environment. Primer extension analysis has identified a promoter for the aac(3)-VIa gene downstream of both the aadA1 gene and a 59-base element. PMID:8257126
Gordonia westfalica sp. nov., a novel rubber-degrading actinomycete.

PubMed

Linos, Alexandros; Berekaa, Mahmoud M; Steinbüchel, Alexander; Kim, Kwang Kyu; Sproer, Cathrin; Kroppenstedt, Reiner M

2002-07-01

A cis-1,4-polyisoprene-degrading bacterium (strain Kb2T) was isolated from foul water taken from the inside of a deteriorated automobile tyre found on a farmer's field in Westfalia, Germany. The strain was aerobic, Gram-positive, exhibited orange smooth and rough colonies on complex nutrient agar, produced elementary branching hyphae that fragmented into rod/coccus-like elements and showed chemotaxonomic markers which were consistent with its classification within the genus Gordonia, i.e. the presence of mesodiaminopimelic acid, arabinose and galactose in whole-cell hydrolysates (cell-wall chemotype IV), N-glycolylmuramic acid in the peptidoglycan wall, a fatty-acid pattern composed of unbranched saturated and monounsaturated fatty acids plus tuberculostearic acid, mycolic acids comprising 56-60 carbon atoms and MK-9(H2) as the only menaquinone. The 16S rDNA sequence of strain Kb2T was found to be most similar to the 16S rDNA sequences of the type strains of Gordonia alkanivorans (DSM 44369T) and Gordonia nitida (KCTC 0605BPT). However, DNA-DNA relatedness data showed that strain Kb2T ( =DSM 44215T NRRL B-24152T) could be distinguished from these two species and represented a new species within the genus Gordonia, for which the name Gordonia westfalica is proposed.
The Mitochondrial Genome and a 60-kb Nuclear DNA Segment from Naegleria fowleri, the Causative Agent of Primary Amoebic Meningoencephalitis

PubMed Central

Herman, Emily K.; Greninger, Alexander L.; Visvesvara, Govinda S.; Marciano-Cabral, Francine; Dacks, Joel B.; Chiu, Charles Y.

2013-01-01

Naegleria fowleri is a unicellular eukaryote causing primary amoebic meningoencephalitis, a neuropathic disease killing 99% of those infected, usually within 7–14 days. N. fowleri is found globally in regions including the US and Australia. The genome of the related non-pathogenic species Naegleria gruberi has been sequenced, but the genetic basis for N. fowleri pathogenicity is unclear. To generate such insight, we sequenced and assembled the mitochondrial genome and a 60-kb segment of nuclear genome from N. fowleri. The mitochondrial genome is highly similar to its counterpart in N. gruberi in gene complement and organization, while distinct lack of synteny is observed for the nuclear segments. Even in this short (60-kb) segment, we identified examples of potential factors for pathogenesis, including ten novel N. fowleri-specific genes. We also identified a homologue of cathepsin B; proteases proposed to be involved in the pathogenesis of diverse eukaryotic pathogens, including N. fowleri. Finally, we demonstrate a likely case of horizontal gene transfer between N. fowleri and two unrelated amoebae, one of which causes granulomatous amoebic encephalitis. This initial look into the N. fowleri nuclear genome has revealed several examples of potential pathogenesis factors, improving our understanding of a neglected pathogen of increasing global importance. PMID:23360210
Use of deep whole-genome sequencing data to identify structure risk variants in breast cancer susceptibility genes.

PubMed

Guo, Xingyi; Shi, Jiajun; Cai, Qiuyin; Shu, Xiao-Ou; He, Jing; Wen, Wanqing; Allen, Jamie; Pharoah, Paul; Dunning, Alison; Hunter, David J; Kraft, Peter; Easton, Douglas F; Zheng, Wei; Long, Jirong

2018-03-01

Functional disruptions of susceptibility genes by large genomic structure variant (SV) deletions in germlines are known to be associated with cancer risk. However, few studies have been conducted to systematically search for SV deletions in breast cancer susceptibility genes. We analysed deep (> 30x) whole-genome sequencing (WGS) data generated in blood samples from 128 breast cancer patients of Asian and European descent with either a strong family history of breast cancer or early cancer onset disease. To identify SV deletions in known or suspected breast cancer susceptibility genes, we used multiple SV calling tools including Genome STRiP, Delly, Manta, BreakDancer and Pindel. SV deletions were detected by at least three of these bioinformatics tools in five genes. Specifically, we identified heterozygous deletions covering a fraction of the coding regions of the BRCA1 (approximately 80 kb, in two patients) and TP53 (∼1.6 kb, in two patients) genes, and of intronic regions (∼1 kb) of the PALB2 (one patient), PTEN (three patients) and RAD51C (one patient) genes. We confirmed the presence of these deletions using real-time quantitative PCR (qPCR). Our study identified novel SV deletions in breast cancer susceptibility genes and the identification of such SV deletions may improve clinical testing.
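
The "detected by at least three tools" criterion above can be illustrated with a small consensus filter. The abstract does not say how calls from different tools were matched, so the 50% reciprocal-overlap rule, the function names, and all coordinates below are hypothetical; no API of the named SV callers is used, and the tool names appear only as dictionary keys.

```python
def reciprocal_overlap(a, b):
    """Overlap of two (chrom, start, end) intervals as a fraction of the longer one."""
    if a[0] != b[0]:
        return 0.0
    inter = min(a[2], b[2]) - max(a[1], b[1])
    if inter <= 0:
        return 0.0
    return inter / max(a[2] - a[1], b[2] - b[1])


def supported_calls(calls_by_tool, min_tools=3, min_overlap=0.5):
    """Return (call, supporting_tools) for each call matched by >= min_tools tools.

    The calling tool counts toward its own support; matching calls from
    different tools are reported separately rather than merged.
    """
    kept = []
    for tool, calls in calls_by_tool.items():
        for call in calls:
            support = {tool}
            for other, other_calls in calls_by_tool.items():
                if other != tool and any(
                    reciprocal_overlap(call, c) >= min_overlap for c in other_calls
                ):
                    support.add(other)
            if len(support) >= min_tools:
                kept.append((call, sorted(support)))
    return kept


# Made-up deletion calls (not real coordinates) from four callers.
calls = {
    "delly": [("chr17", 1_000, 81_000)],
    "manta": [("chr17", 1_200, 80_500)],
    "pindel": [("chr17", 900, 82_000)],
    "breakdancer": [("chr13", 5_000, 6_000)],
}
consensus = supported_calls(calls)
```

In this toy input the three chr17 calls support one another, while the single chr13 call has no support from other tools and is dropped.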

The Large Mitochondrial Genome of Symbiodinium minutum Reveals Conserved Noncoding Sequences between Dinoflagellates and Apicomplexans

PubMed Central

Shoguchi, Eiichi; Shinzato, Chuya; Hisata, Kanako; Satoh, Nori; Mungpakdee, Sutada

2015-01-01

Even though mitochondrial genomes, which characterize eukaryotic cells, were first discovered more than 50 years ago, mitochondrial genomics remains an important topic in molecular biology and genome sciences. The Phylum Alveolata comprises three major groups (ciliates, apicomplexans, and dinoflagellates), the mitochondrial genomes of which have diverged widely. Even though the gene content of dinoflagellate mitochondrial genomes is reportedly comparable to that of apicomplexans, the highly fragmented and rearranged genome structures of dinoflagellates have frustrated whole genomic analysis. Consequently, noncoding sequences and gene arrangements of dinoflagellate mitochondrial genomes have not been well characterized. Here we report that the continuous assembled genome (∼326 kb) of the dinoflagellate, Symbiodinium minutum, is AT-rich (∼64.3%) and that it contains three protein-coding genes. Based upon in silico analysis, the remaining 99% of the genome comprises transcriptomic noncoding sequences. RNA edited sites and unique, possible start and stop codons clarify conserved regions among dinoflagellates. Our massive transcriptome analysis shows that almost all regions of the genome are transcribed, including 27 possible fragmented ribosomal RNA genes and 12 uncharacterized small RNAs that are similar to mitochondrial RNA genes of the malarial parasite, Plasmodium falciparum. Gene map comparisons show that gene order is only slightly conserved between S. minutum and P. falciparum. However, small RNAs and intergenic sequences share sequence similarities with P. falciparum, suggesting that the function of noncoding sequences has been preserved despite development of very different genome structures. PMID:26199191
Hungarian Marfan family with large FBN1 deletion calls attention to copy number variation detection in the current NGS era

PubMed Central

Ágg, Bence; Meienberg, Janine; Kopps, Anna M.; Fattorini, Nathalie; Stengl, Roland; Daradics, Noémi; Pólos, Miklós; Bors, András; Radovits, Tamás; Merkely, Béla; De Backer, Julie; Szabolcs, Zoltán; Mátyás, Gábor

2018-01-01

Copy number variations (CNVs) comprise about 10% of reported disease-causing mutations in Mendelian disorders. Nevertheless, pathogenic CNVs may have been under-detected due to the lack or insufficient use of appropriate detection methods. In this report, on the example of the diagnostic odyssey of a patient with Marfan syndrome (MFS) harboring a hitherto unreported 32-kb FBN1 deletion, we highlight the need for and the feasibility of testing for CNVs (>1 kb) in Mendelian disorders in the current next-generation sequencing (NGS) era. PMID:29850152
A sharp lower bound for the sum of a sine series with convex coefficients

NASA Astrophysics Data System (ADS)

Solodov, A. P.

2016-12-01

The sum of a sine series $g(\mathbf b,x)=\sum_{k=1}^{\infty} b_k\sin kx$ with coefficients forming a convex sequence $\mathbf b$ is known to be positive on the interval $(0,\pi)$. Its values near zero are conventionally evaluated using the Salem function $v(\mathbf b,x)=x\sum_{k=1}^{m(x)} kb_k$, $m(x)=[\pi/x]$. In this paper it is proved that $2\pi^{-2}v(\mathbf b,x)$ is not a minorant for $g(\mathbf b,x)$. The modified Salem function $v_0(\mathbf b,x)=x\bigl(\sum_{k=1}^{m(x)-1} kb_k+(1/2)m(x)b_{m(x)}\bigr)$ is shown to satisfy the lower bound $g(\mathbf b,x)>2\pi^{-2}v_0(\mathbf b,x)$ in some right neighbourhood of zero. This estimate is shown to be sharp on the class of convex sequences $\mathbf b$. Moreover, the upper bound for $g(\mathbf b,x)$ is refined on the class of monotone sequences $\mathbf b$. Bibliography: 11 titles.
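
The lower bound stated above can be checked numerically for one concrete convex sequence. The sketch below takes $b_k = 1/k$, chosen here purely for illustration because the series then has the closed form $(\pi - x)/2$ on $(0, 2\pi)$. It assumes $[\pi/x]$ denotes the integer part and restricts $x$ to $(0, \pi]$ so that $m(x) \ge 1$. It illustrates the inequality at a few points and proves nothing.

```python
import math


def salem_v0(b, x):
    """Modified Salem function v_0(b, x) for coefficients given by b(k); 0 < x <= pi."""
    m = math.floor(math.pi / x)                 # m(x) = [pi / x], assumed integer part
    head = sum(k * b(k) for k in range(1, m))   # k = 1 .. m(x) - 1
    return x * (head + 0.5 * m * b(m))


def g_harmonic(x):
    """Closed form of sum_{k>=1} sin(kx)/k, valid for 0 < x < 2*pi."""
    return (math.pi - x) / 2


def b(k):
    """Example convex sequence b_k = 1/k (an illustrative choice)."""
    return 1.0 / k


# Pairs (g(b, x), 2 * pi**-2 * v_0(b, x)) at a few points near zero.
checks = [(g_harmonic(x), 2 / math.pi**2 * salem_v0(b, x)) for x in (0.5, 0.1, 0.01)]
```

For this sequence $v_0(\mathbf b,x)=x\,(m(x)-1/2)$, so the bound stays below $2/\pi$ while $g$ approaches $\pi/2$ as $x \to 0$.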
Whole Genome Sequencing Identifies a 78 kb Insertion from Chromosome 8 as the Cause of Charcot-Marie-Tooth Neuropathy CMTX3

PubMed Central

Brewer, Megan H.; Chaudhry, Rabia; Qi, Jessica; Kidambi, Aditi; Drew, Alexander P.; Ryan, Monique M.; Subramanian, Gopinath M.; Young, Helen K.; Zuchner, Stephan; Reddel, Stephen W.; Nicholson, Garth A.; Kennerson, Marina L.

2016-01-01

With the advent of whole exome sequencing, cases where no pathogenic coding mutations can be found are increasingly being observed in many diseases. In two large, distantly-related families that mapped to the Charcot-Marie-Tooth neuropathy CMTX3 locus at chromosome Xq26.3-q27.3, all coding mutations were excluded. Using whole genome sequencing we found a large DNA interchromosomal insertion within the CMTX3 locus. The 78 kb insertion originates from chromosome 8q24.3, segregates fully with the disease in the two families, and is absent from the general population as well as 627 neurologically normal chromosomes from in-house controls. Large insertions into chromosome Xq27.1 are known to cause a range of diseases and this is the first neuropathy phenotype caused by an interchromosomal insertion at this locus. The CMTX3 insertion represents an understudied pathogenic structural variation mechanism for inherited peripheral neuropathies. Our finding highlights the importance of considering all structural variation types when studying unsolved inherited peripheral neuropathy cases with no pathogenic coding mutations. PMID:27438001
Three genes in the human MHC class III region near the junction with the class II: Gene for receptor of advanced glycosylation end products, PBX2 homeobox gene and a notch homolog, human counterpart of mouse mammary tumor gene int-3

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sugaya, K.; Fukagawa, T.; Matsumoto, K.

Cosmid walking of about 250 kb from MHC class III gene CYP21 to class II was conducted. The gene for receptor of advanced glycosylation end products of proteins (RAGE, a member of immunoglobulin super-family molecules), the PBX2 homeobox gene designated HOX12, and the human counterpart of the mouse mammary tumor gene int-3 were found. The contiguous RAGE and HOX12 genes were completely sequenced, and the human int-3 counterpart was partially sequenced and assigned to a Notch homolog. This human Notch homolog, designated NOTCH3, showed both the intracellular portion present in the mouse int-3 sequence and the extracellular portion absent in the int-3. It thus corresponds to the intact form of a Notch-type transmembrane protein. About 20 kb of dense Alu clustering was found just centromeric to the NOTCH3. 48 refs., 9 figs., 2 tabs.
The first complete chloroplast genome of the Genistoid legume Lupinus luteus: evidence for a novel major lineage-specific rearrangement and new insights regarding plastome evolution in the legume family.

PubMed

Martin, Guillaume E; Rousseau-Gueutin, Mathieu; Cordonnier, Solenn; Lima, Oscar; Michon-Coudouel, Sophie; Naquin, Delphine; de Carvalho, Julie Ferreira; Aïnouche, Malika; Salmon, Armel; Aïnouche, Abdelkader

2014-06-01

To date chloroplast genomes are available only for members of the non-protein amino acid-accumulating clade (NPAAA) Papilionoid lineages in the legume family (i.e. Millettioids, Robinoids and the 'inverted repeat-lacking clade', IRLC). It is thus very important to sequence plastomes from other lineages in order to better understand the unusual evolution observed in this model flowering plant family. To this end, the plastome of a lupine species, Lupinus luteus, was sequenced to represent the Genistoid lineage, a noteworthy but poorly studied legume group. The plastome of L. luteus was reconstructed using Roche-454 and Illumina next-generation sequencing. Its structure, repetitive sequences, gene content and sequence divergence were compared with those of other Fabaceae plastomes. PCR screening and sequencing were performed in other allied legumes in order to determine the origin of a large inversion identified in L. luteus. The first sequenced Genistoid plastome (L. luteus: 155 894 bp) resulted in the discovery of a 36-kb inversion, embedded within the already known 50-kb inversion in the large single-copy (LSC) region of the Papilionoideae. This inversion occurs at the base or soon after the Genistoid emergence, and most probably resulted from a flip-flop recombination between identical 29-bp inverted repeats within two trnS genes. Comparative analyses of the chloroplast gene content of L. luteus vs. Fabaceae and extra-Fabales plastomes revealed the loss of the plastid rpl22 gene, and its functional relocation to the nucleus was verified using lupine transcriptomic data. An investigation into the evolutionary rate of coding and non-coding sequences among legume plastomes resulted in the identification of remarkably variable regions. This study resulted in the discovery of a novel, major 36-kb inversion, specific to the Genistoids. 
Chloroplast mutational hotspots were also identified, which contain novel and potentially informative regions for molecular evolutionary studies at various taxonomic levels in the legumes. Taken together, the results provide new insights into the evolutionary landscape of the legume plastome.
Transcriptional insulation of the human keratin 18 gene in transgenic mice.

PubMed Central

Neznanov, N; Thorey, I S; Ceceña, G; Oshima, R G

1993-01-01

Expression of the 10-kb human keratin 18 (K18) gene in transgenic mice results in efficient and appropriate tissue-specific expression in a variety of internal epithelial organs, including liver, lung, intestine, kidney, and the ependymal epithelium of brain, but not in spleen, heart, or skeletal muscle. Expression at the RNA level is directly proportional to the number of integrated K18 transgenes. These results indicate that the K18 gene is able to insulate itself both from the commonly observed cis-acting effects of the sites of integration and from the potential complications of duplicated copies of the gene arranged in head-to-tail fashion. To begin to identify the K18 gene sequences responsible for this property of transcriptional insulation, additional transgenic mouse lines containing deletions of either the 5' or 3' distal end of the K18 gene have been characterized. Deletion of 1.5 kb of the distal 5' flanking sequence has no effect upon either the tissue specificity or the copy number-dependent behavior of the transgene. In contrast, deletion of the 3.5-kb 3' flanking sequence of the gene results in the loss of the copy number-dependent behavior of the gene in liver and intestine. However, expression in kidney, lung, and brain remains efficient and copy number dependent in these transgenic mice. Furthermore, herpes simplex virus thymidine kinase gene expression is copy number dependent in transgenic mice when the gene is located between the distal 5'- and 3'-flanking sequences of the K18 gene. Each adult transgenic male expressed the thymidine kinase gene in testes and brain and proportionally to the number of integrated transgenes. We conclude that the characteristic of copy number-dependent expression of the K18 gene is tissue specific because the sequence requirements for transcriptional insulation in adult liver and intestine are different from those for lung and kidney. 
In addition, the behavior of the transgenic thymidine kinase gene in testes and brain suggests that the property of transcriptional insulation of the K18 gene may be conferred by the distal flanking sequences of the K18 gene and, additionally, may function for other genes. PMID:7681143
Mechanisms of haplotype divergence at the RGA08 nucleotide-binding leucine-rich repeat gene locus in wild banana (Musa balbisiana)

PubMed Central

2010-01-01

Background: Comparative sequence analysis of complex loci such as resistance gene analog clusters allows estimating the degree of sequence conservation and mechanisms of divergence at the intraspecies level. In banana (Musa sp.), two diploid wild species Musa acuminata (A genome) and Musa balbisiana (B genome) contribute to the polyploid genome of many cultivars. The M. balbisiana species is associated with vigour and tolerance to pests and disease, and little is known about the genome structure and haplotype diversity within this species. Here, we compare two genomic sequences of 253 and 223 kb corresponding to two haplotypes of the RGA08 resistance gene analog locus in M. balbisiana "Pisang Klutuk Wulung" (PKW). Results: Sequence comparison revealed two regions of contrasting features. The first is a highly colinear gene-rich region where the two haplotypes diverge only by single nucleotide polymorphisms and two repetitive element insertions. The second corresponds to a large cluster of RGA08 genes, with 13 and 18 predicted RGA genes and pseudogenes spread over 131 and 152 kb respectively on each haplotype. The RGA08 cluster is enriched in repetitive element insertions, in duplicated non-coding intergenic sequences including low complexity regions and shows structural variations between haplotypes. Although some allelic relationships are retained, a large diversity of RGA08 genes occurs in this single M. balbisiana genotype, with several RGA08 paralogs specific to each haplotype. The RGA08 gene family has evolved by mechanisms of unequal recombination, intragenic sequence exchange and diversifying selection. An unequal recombination event taking place between duplicated non-coding intergenic sequences resulted in a different RGA08 gene content between haplotypes pointing out the role of such duplicated regions in the evolution of RGA clusters. Based on the synonymous substitution rate in coding sequences, we estimated a 1 million year divergence time for these M. balbisiana haplotypes. Conclusions: A large RGA08 gene cluster identified in wild banana corresponds to a highly variable genomic region between haplotypes surrounded by conserved flanking regions. High level of sequence identity (70 to 99%) of the genic and intergenic regions suggests a recent and rapid evolution of this cluster in M. balbisiana. PMID:20637079
Method for identifying mutagenic agents which induce large, multilocus deletions in DNA

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bradley, W.E.C.; Belouchi, A.; Dewyse, P.

1993-07-13

A method is described for identifying a mutagenic agent which induces large, multilocus deletions in DNA in mammalian cells, comprising: (i) exposing a class III heterozygous CHO cell line to a potential mutagenic agent under investigation, and allowing any mutation of the cell line to proceed, said cell line being characterized in that on mutation it becomes resistant to 2,6-diaminopurine and in that a restriction fragment length variation exists in the DNA sequence adjacent to the two alleles of the APRT gene such that the DNA sequence adjacent to one of the two alleles can be digested with the enzyme BclI but the DNA sequence variation adjacent to the other of the two alleles cannot be digested with BclI, (ii) isolating induced mutations of the cell line deficient in APRT function, (iii) isolating DNA from the induced mutants, (iv) digesting the isolated DNA with BclI enzyme to produce digested fragments including a 19 kb fragment and any 2 kb fragment, which fragments hybridize with the labeled probe derived from DNA fragment PDI, (v) separating any digested fragments, (vi) transferring the separated fragments of (v) to a solid support, (vii) hybridizing the supported separated fragments with a labeled probe derived from the clone DNA fragment PD 1, (viii) determining fragments having undergone loss of the 2 kb band identified by the probe, as an identification of parent mutants in which the loss occurred, and (ix) evaluating the mutating ability of the potential mutagenic agent.
Prion gene haplotypes of U.S. cattle

PubMed Central

Clawson, Michael L; Heaton, Michael P; Keele, John W; Smith, Timothy PL; Harhay, Gregory P; Laegreid, William W

2006-01-01

Background: Bovine spongiform encephalopathy (BSE) is a fatal neurological disorder characterized by abnormal deposits of a protease-resistant isoform of the prion protein. Characterizing linkage disequilibrium (LD) and haplotype networks within the bovine prion gene (PRNP) is important for 1) testing rare or common PRNP variation for an association with BSE and 2) interpreting any association of PRNP alleles with BSE susceptibility. The objective of this study was to identify polymorphisms and haplotypes within PRNP from the promoter region through the 3'UTR in a diverse sample of U.S. cattle genomes. Results: A 25.2-kb genomic region containing PRNP was sequenced from 192 diverse U.S. beef and dairy cattle. Sequence analyses identified 388 total polymorphisms, of which 287 have not previously been reported. The polymorphism alleles define PRNP by regions of high and low LD. High LD is present between alleles in the promoter region through exon 2 (6.7 kb). PRNP alleles within the majority of intron 2, the entire coding sequence and the untranslated region of exon 3 are in low LD (18.0 kb). Two haplotype networks, one representing the region of high LD and the other the region of low LD, yielded nineteen different combinations that represent haplotypes spanning PRNP. The haplotype combinations are tagged by 19 polymorphisms (htSNPs) which characterize variation within and across PRNP. Conclusion: The number of polymorphisms in the prion gene region of U.S. cattle is nearly four times greater than previously described. These polymorphisms define PRNP haplotypes that may influence BSE susceptibility in cattle. PMID:17092337
PMS2 inactivation by a complex rearrangement involving an HERV retroelement and the inverted 100-kb duplicon on 7p22.1.

PubMed

Vogt, Julia; Wernstedt, Annekatrin; Ripperger, Tim; Pabst, Brigitte; Zschocke, Johannes; Kratz, Christian; Wimmer, Katharina

2016-11-01

Biallelic PMS2 mutations are responsible for more than half of all cases of constitutional mismatch repair deficiency (CMMRD), a recessively inherited childhood cancer predisposition syndrome. The mismatch repair gene PMS2 is partly embedded within one copy of an inverted 100-kb low-copy repeat (LCR) on 7p22.1. In an individual with CMMRD syndrome, PMS2 was found to be homozygously inactivated by a complex chromosomal rearrangement, which separates the 5'-part from the 3'-part of the gene. The rearrangement involves sequences of the inverted 100-kb LCR and a human endogenous retrovirus element and may be associated with an inversion that is indistinguishable from the known inversion polymorphism affecting the ~0.7-Mb sequence intervening the LCR. Its formation is best explained by a replication-based mechanism (RBM) such as fork stalling and template switching/microhomology-mediated break-induced replication (FoSTeS/MMBIR). This finding supports the hypothesis that the inverted LCR can not only facilitate the formation of the non-allelic homologous recombination-mediated inversion polymorphism but it also promotes the occurrence of more complex rearrangements that can be associated with a large inversion, as well, but are mediated by a RBM. This further suggests that among the inversion polymorphism on 7p22.1, more complex rearrangements might be hidden. Furthermore, as the locus is embedded in a common fragile site (CFS) region, this rearrangement also supports the recently raised hypothesis that CFS sequence motifs may facilitate replication-based rearrangement mechanisms.
PMS2 inactivation by a complex rearrangement involving an HERV retroelement and the inverted 100-kb duplicon on 7p22.1

PubMed Central

Vogt, Julia; Wernstedt, Annekatrin; Ripperger, Tim; Pabst, Brigitte; Zschocke, Johannes; Kratz, Christian; Wimmer, Katharina

2016-01-01

Biallelic PMS2 mutations are responsible for more than half of all cases of constitutional mismatch repair deficiency (CMMRD), a recessively inherited childhood cancer predisposition syndrome. The mismatch repair gene PMS2 is partly embedded within one copy of an inverted 100-kb low-copy repeat (LCR) on 7p22.1. In an individual with CMMRD syndrome, PMS2 was found to be homozygously inactivated by a complex chromosomal rearrangement, which separates the 5′-part from the 3′-part of the gene. The rearrangement involves sequences of the inverted 100-kb LCR and a human endogenous retrovirus element and may be associated with an inversion that is indistinguishable from the known inversion polymorphism affecting the ~0.7-Mb sequence intervening the LCR. Its formation is best explained by a replication-based mechanism (RBM) such as fork stalling and template switching/microhomology-mediated break-induced replication (FoSTeS/MMBIR). This finding supports the hypothesis that the inverted LCR can not only facilitate the formation of the non-allelic homologous recombination-mediated inversion polymorphism but it also promotes the occurrence of more complex rearrangements that can be associated with a large inversion, as well, but are mediated by a RBM. This further suggests that among the inversion polymorphism on 7p22.1, more complex rearrangements might be hidden. Furthermore, as the locus is embedded in a common fragile site (CFS) region, this rearrangement also supports the recently raised hypothesis that CFS sequence motifs may facilitate replication-based rearrangement mechanisms. PMID:27329736
Cloning and sequencing of a gene encoding a 21-kilodalton outer membrane protein from Bordetella avium and expression of the gene in Salmonella typhimurium.

PubMed Central

Gentry-Weeks, C R; Hultsch, A L; Kelly, S M; Keith, J M; Curtiss, R

1992-01-01

Three gene libraries of Bordetella avium 197 DNA were prepared in Escherichia coli LE392 by using the cosmid vectors pCP13 and pYA2329, a derivative of pCP13 specifying spectinomycin resistance. The cosmid libraries were screened with convalescent-phase anti-B. avium turkey sera and polyclonal rabbit antisera against B. avium 197 outer membrane proteins. One E. coli recombinant clone produced a 56-kDa protein which reacted with convalescent-phase serum from a turkey infected with B. avium 197. In addition, five E. coli recombinant clones were identified which produced B. avium outer membrane proteins with molecular masses of 21, 38, 40, 43, and 48 kDa. At least one of these E. coli clones, which encoded the 21-kDa protein, reacted with both convalescent-phase turkey sera and antibody against B. avium 197 outer membrane proteins. The gene for the 21-kDa outer membrane protein was localized by Tn5seq1 mutagenesis, and the nucleotide sequence was determined by dideoxy sequencing. DNA sequence analysis of the 21-kDa protein revealed an open reading frame of 582 bases that resulted in a predicted protein of 194 amino acids. Comparison of the predicted amino acid sequence of the gene encoding the 21-kDa outer membrane protein with protein sequences in the National Biomedical Research Foundation protein sequence data base indicated significant homology to the OmpA proteins of Shigella dysenteriae, Enterobacter aerogenes, E. coli, and Salmonella typhimurium and to Neisseria gonorrhoeae outer membrane protein III, Haemophilus influenzae protein P6, and Pseudomonas aeruginosa porin protein F. The gene (ompA) encoding the B. avium 21-kDa protein hybridized with 4.1-kb DNA fragments from EcoRI-digested, chromosomal DNA of Bordetella pertussis and Bordetella bronchiseptica and with 6.0- and 3.2-kb DNA fragments from EcoRI-digested, chromosomal DNA of B. avium and B. avium-like DNA, respectively. A 6.75-kb DNA fragment encoding the B. avium 21-kDa protein was subcloned into the Asd+ vector pYA292, and the construct was introduced into the avirulent delta cya delta crp delta asd S. typhimurium chi 3987 for oral immunization of birds. The gene encoding the 21-kDa protein was expressed equivalently in B. avium 197, delta asd E. coli chi 6097, and S. typhimurium chi 3987 and was localized primarily in the cytoplasmic membrane and outer membrane. In preliminary studies on oral inoculation of turkey poults with S. typhimurium chi 3987 expressing the gene encoding the B. avium 21-kDa protein, it was determined that a single dose of the recombinant Salmonella vaccine failed to elicit serum antibodies against the 21-kDa protein and challenge with wild-type B. avium 197 resulted in colonization of the trachea and thymus with B. avium 197. PMID:1447140
Control of photosynthetic membrane assembly in Rhodobacter sphaeroides mediated by puhA and flanking sequences.

PubMed Central

Sockett, R E; Donohue, T J; Varga, A R; Kaplan, S

1989-01-01

A reaction center H- strain (RCH-) of Rhodobacter sphaeroides, PUHA1, was made by in vitro deletion of an XhoI restriction endonuclease fragment from the puhA gene coupled with insertion of a kanamycin resistance gene cartridge. The resulting construct was delivered to R. sphaeroides wild-type 2.4.1, with the defective puhA gene replacing the wild-type copy by recombination, followed by selection for kanamycin resistance. When grown under conditions known to induce intracytoplasmic membrane development, PUHA1 synthesized a pigmented intracytoplasmic membrane. Spectral analysis of this membrane showed that it was deficient in B875 spectral complexes as well as functional reaction centers and that the level of B800-850 spectral complexes was greater than in the wild type. The RCH- strain was photosynthetically incompetent, but photosynthetic growth was restored by complementation with a 1.45-kilobase (kb) BamHI restriction endonuclease fragment containing the puhA gene carried in trans on plasmid pRK404. B875 spectral complexes were not restored by complementation with the 1.45-kb BamHI restriction endonuclease fragment containing the puhA gene but were restored along with photosynthetic competence by complementation with DNA from a cosmid carrying the puhA gene, as well as a flanking DNA sequence. Interestingly, B875 spectral complexes, but not photosynthetic competence, were restored to PUHA1 by introduction in trans of a 13-kb BamHI restriction endonuclease fragment carrying genes encoding the puf operon region of the DNA. The effect of the puhA deletion was further investigated by an examination of the levels of specific mRNA species derived from the puf and puc operons, as well as by determinations of the relative abundances of polypeptides associated with various spectral complexes by immunological methods. The roles of puhA and other genetic components in photosynthetic gene expression and membrane assembly are discussed. PMID:2644200
Molecular Analysis of the Locus Responsible for Production of Plantaricin S, a Two-Peptide Bacteriocin Produced by Lactobacillus plantarum LPCO10

PubMed Central

Stephens, Sarah K.; Floriano, Belén; Cathcart, Declan P.; Bayley, Susan A.; Witt, Valerie F.; Jiménez-Díaz, Rufino; Warner, Philip J.; Ruiz-Barba, José Luis

1998-01-01

A 4.5-kb region of chromosomal DNA carrying the locus responsible for the production of plantaricin S, a two-peptide bacteriocin produced by Lactobacillus plantarum LPCO10 (R. Jiménez-Díaz, J. L. Ruiz-Barba, D. P. Cathcart, H. Holo, I. F. Nes, K. H. Sletten, and P. J. Warner, Appl. Environ. Microbiol. 61:4459–4463, 1995), has been cloned, and the nucleotide sequence has been elucidated. Two genes, designated plsA and plsB and encoding peptides α and β, respectively, of plantaricin S, plus an open reading frame (ORF), ORF2, were found to be organized in an operon. Northern blot analysis showed that these genes are cotranscribed, giving a ca. 0.7-kb mRNA, whose transcription start point was determined by primer extension. Nucleotide sequences of plsA and plsB revealed that both genes are translated as bacteriocin precursors which include N-terminal leader sequences of the double-glycine type. The role of ORF2 is unknown at the moment, although it might be expected to encode an immunity protein of the type described for other bacteriocin operons. In addition, several other potential ORFs have been found, including some which may be responsible for the regulation of bacteriocin production. Two of them, ORF8 and ORF14, show strong homology with histidine protein kinase and response regulator genes, respectively, which have been found to be involved in the regulation of the production of other bacteriocins from lactic acid bacteria. A third ORF, ORF5, shows homology with gene agrB from Staphylococcus aureus, which is involved in the mechanism of regulation of the virulence phenotype in this species. Thus, an agr-like regulatory system for the production of plantaricin S is postulated. PMID:9572965
Isolation and characterization of the human CDX1 gene: A candidate gene for diastrophic dysplasia

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bonner, C.; Loftus, S.; Wasmuth, J.J.

1994-09-01

Diastrophic dysplasia is an autosomal recessive disorder characterized by short stature, dislocation of the joints, spinal deformities and malformation of the hands and feet. Multipoint linkage analysis places the diastrophic dysplasia (DTD) locus in 5q31-5q34. Linkage disequilibrium mapping places the DTD locus near CSF1R in the direction of PDGFRB (which is tandem to CSF1R). This same study tentatively placed PDGFRB and DTD proximal to CSF1R. Our results, as well as recently reported work from other laboratories, suggest that PDGFRB (and possibly DTD) is distal rather than proximal to CSF1R. We have constructed a cosmid contig covering approximately 200 kb of the region containing CSF1R. Several exons have been "trapped" from these cosmids using exon amplification. One of these exons was trapped from a cosmid isolated from a walk from PDGFRB, approximately 80 kb from CSF1R. This exon was sequenced and was determined to be 89% identical to the nucleotide sequence of exon two of the murine CDX1 gene (100% amino acid identity). The exon was used to isolate the human CDX gene. Sequence analysis of the human CDX1 gene indicates a very high degree of homology to the murine gene. CDX1 is a caudal type homeobox gene expressed during gastrulation. In the mouse, expression during gastrulation begins in the primitive streak and subsequently localizes to the ectodermal and mesodermal cells of the primitive streak, neural tube, somites, and limb buds. Later in gastrulation, CDX1 expression becomes most prominent in the mesoderm of the forelimbs, and, to a lesser extent, the hindlimbs. CDX1 is an intriguing candidate gene for diastrophic dysplasia. We are currently screening DNA from affected individuals and hope to shortly determine whether CDX1 is involved in this disorder.
Cloning and expression of the hypoxanthine-guanine phosphoribosyltransferase gene from Trypanosoma brucei.

PubMed Central

Allen, T E; Ullman, B

1993-01-01

The hypoxanthine-guanine phosphoribosyltransferase (HGPRT) enzyme of Trypanosoma brucei and related parasites provides a rational target for the treatment of African sleeping sickness and several other parasitic diseases. To characterize the T. brucei HGPRT enzyme in detail, the T. brucei hgprt was isolated within a 4.2 kb SalI-KpnI genomic insert and sequenced. Nucleotide sequence analysis revealed an open reading frame of 630 bp that encoded a protein of 210 amino acids with an Mr of 23.4 kDa. After gap alignment, the T. brucei HGPRT exhibited 21-23% amino acid sequence identity, mostly in three clustered regions, with the HGPRTs from human, S. mansoni, and P. falciparum, indicating that the trypanosome enzyme was the most divergent of the group. Surprisingly, the T. brucei HGPRT was more homologous to the hypoxanthine phosphoribosyltransferase (HPRT) from the prokaryote V. harveyi than to the eukaryotic HGPRTs. Northern blot analysis revealed two trypanosome transcripts of 1.4 and 1.9 kb, each expressed to equivalent degrees in insect vector and mammalian forms of the parasite. The T. brucei hgprt was inserted into an expression plasmid and transformed into S phi 606 E. coli that are deficient in both HPRT and xanthine-guanine phosphoribosyltransferase activities. Soluble, enzymatically active recombinant T. brucei HGPRT was expressed to high levels and purified to homogeneity by GTP-agarose affinity chromatography. The purified recombinant enzyme recognized hypoxanthine, guanine, and allopurinol, but not xanthine or adenine, as substrates and was inhibited by a variety of nucleotide effectors. The availability of a molecular clone encoding the T. brucei hgprt and large quantities of homogeneous recombinant HGPRT enzyme provides an experimentally manipulable molecular and biochemical system for the rational design of novel therapeutic agents for the treatment of African sleeping sickness and other diseases of parasitic origin. PMID:8265360
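The length and mass figures in the HGPRT abstract above follow from simple arithmetic. The sketch below is illustrative only, with hypothetical function names. It divides the reported open reading frame length by three, as the abstract's figures imply, and makes no assumption about whether the 630 bp include a stop codon, which the abstract does not state.

```python
def codons_in_orf(orf_length_bp: int) -> int:
    """Number of codons in an open reading frame of the given length."""
    if orf_length_bp <= 0 or orf_length_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    return orf_length_bp // 3


def mean_residue_mass_da(protein_mass_kda: float, residues: int) -> float:
    """Average mass per amino acid residue, in daltons."""
    return protein_mass_kda * 1000.0 / residues


codons = codons_in_orf(630)                       # ORF length from the abstract
mean_mass = mean_residue_mass_da(23.4, codons)    # reported mass of 23.4 kDa
print(codons, round(mean_mass, 1))
```

An average near 110 Da per residue is typical for proteins, so the reported mass and length are mutually consistent.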
Sex chromosome differentiation and the W- and Z-specific loci in Xenopus laevis.

PubMed

Mawaribuchi, Shuuji; Takahashi, Shuji; Wada, Mikako; Uno, Yoshinobu; Matsuda, Yoichi; Kondo, Mariko; Fukui, Akimasa; Takamatsu, Nobuhiko; Taira, Masanori; Ito, Michihiko

2017-06-15

Genetic sex-determining systems in vertebrates include two basic types of heterogamety; XX (female)/XY (male) and ZZ (male)/ZW (female) types. The African clawed frog Xenopus laevis has a ZZ/ZW-type sex-determining system. In this species, we previously identified a W-specific sex (female)-determining gene dmw, and specified W and Z chromosomes, which could be morphologically indistinguishable (homomorphic). In addition to dmw, we most recently discovered two genes, named scanw and ccdc69w, and one gene, named capn5z in the W- and Z-specific regions, respectively. In this study, we revealed the detailed structures of the W/Z-specific loci and genes. Sequence analysis indicated that there is almost no sequence similarity between 278-kb W-specific and 83-kb Z-specific sequences on chromosome 2Lq32-33, where both the transposable elements are abundant. Synteny and phylogenic analyses indicated that all the W/Z-specific genes might have emerged independently. Expression analysis demonstrated that scanw and ccdc69w or capn5z are expressed in early differentiating ZW gonads or testes, thereby suggesting possible roles in female or male development, respectively. Importantly, the sex-determining gene (SDG) dmw might have been generated after allotetraploidization, thereby indicating the construction of the new sex-determining system by dmw after species hybridization. Furthermore, by direct genotyping, we confirmed that diploid WW embryos developed into normal female frogs, which indicates that the Z-specific region is not essential for female development. Overall, these findings indicate that sex chromosome differentiation has started, although no heteromorphic sex chromosomes are evident yet, in X. laevis. Homologous recombination suppression might have promoted the accumulation of mutations and transposable elements, and enlarged the W/Z-specific regions, thereby resulting in differentiation of the W/Z chromosomes.
Genome survey of pistachio (Pistacia vera L.) by next generation sequencing: Development of novel SSR markers and genetic diversity in Pistacia species.

PubMed

Ziya Motalebipour, Elmira; Kafkas, Salih; Khodaeiaminjan, Mortaza; Çoban, Nergiz; Gözel, Hatice

2016-12-07

Pistachio (Pistacia vera L.) is one of the most important nut crops in the world. There are about 11 wild species in the genus Pistacia, and they have importance as rootstock seed sources for cultivated P. vera and forest trees. Published information on the pistachio genome is limited. Therefore, a genome survey is necessary to obtain knowledge on the genome structure of pistachio by next generation sequencing. Simple sequence repeat (SSR) markers are useful tools for germplasm characterization, genetic diversity analysis, and genetic linkage mapping, and may help to elucidate genetic relationships among pistachio cultivars and species. To explore the genome structure of pistachio, a genome survey was performed using the Illumina platform at approximately 40× coverage depth in the P. vera cv. Siirt. The K-mer analysis indicated that pistachio has a genome that is about 600 Mb in size and is highly heterozygous. The assembly of 26.77 Gb Illumina data produced 27,069 scaffolds at N50 = 3.4 kb with a total of 513.5 Mb. A total of 59,280 SSR motifs were detected, at a frequency of one per 8.67 kb. A total of 206 SSRs were used to characterize 24 P. vera cultivars and 20 wild Pistacia genotypes (four genotypes from each of five wild Pistacia species) belonging to P. atlantica, P. integerrima, P. chinensis, P. terebinthus, and P. lentiscus. Overall, 135 SSR loci amplified in all 44 cultivars and genotypes, and 41 were polymorphic in six Pistacia species. The novel SSR loci developed from cultivated pistachio were highly transferable to wild Pistacia species. The results from a genome survey of pistachio suggest that the genome size of pistachio is about 600 Mb with a high heterozygosity rate. This information will help to design whole genome sequencing strategies for pistachio. The newly developed novel polymorphic SSRs in this study may help germplasm characterization, genetic diversity, and genetic linkage mapping studies in the genus Pistacia.
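Two of the assembly statistics quoted in the pistachio abstract above, N50 and SSR frequency, can be illustrated with a short sketch. The function names and the toy scaffold lengths are hypothetical, and the N50 definition used here (the length of the shortest scaffold in the smallest set of longest scaffolds covering at least half the assembly) is the common convention, assumed rather than confirmed by the abstract.

```python
def n50(lengths):
    """N50: largest length L such that scaffolds of length >= L cover at least half the total."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no scaffold lengths given")


def kb_per_ssr(assembly_mb: float, ssr_count: int) -> float:
    """Average spacing between SSR motifs, in kb per motif."""
    return assembly_mb * 1000.0 / ssr_count


# Toy scaffold lengths (hypothetical, in kb): half of the 20 kb total is reached by the first scaffold.
print(n50([10, 5, 3, 2]))

# Figures from the abstract: 513.5 Mb assembled, 59,280 SSR motifs.
print(round(kb_per_ssr(513.5, 59_280), 2))
```

The spacing computed from the rounded totals comes out within about 0.01 kb of the reported 8.67 kb, which supports reading the reported frequency as one SSR per 8.67 kb.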
Repeated sequence sets in mitochondrial DNA molecules of root knot nematodes (Meloidogyne): nucleotide sequences, genome location and potential for host-race identification.

PubMed Central

Okimoto, R; Chamberlin, H M; Macfarlane, J L; Wolstenholme, D R

1991-01-01

Within a 7 kb segment of the mtDNA molecule of the root knot nematode, Meloidogyne javanica, that lacks standard mitochondrial genes, are three sets of strictly tandemly arranged, direct repeat sequences: approximately 36 copies of a 102 ntp sequence that contains a TaqI site; 11 copies of a 63 ntp sequence, and 5 copies of an 8 ntp sequence. The 7 kb repeat-containing segment is bounded by putative tRNAasp and tRNAf-met genes and the arrangement of sequences within this segment is: the tRNAasp gene; a unique 1,528 ntp segment that contains two highly stable hairpin-forming sequences; the 102 ntp repeat set; the 8 ntp repeat set; a unique 1,068 ntp segment; the 63 ntp repeat set; and the tRNAf-met gene. The nucleotide sequences of the 102 ntp copies and the 63 ntp copies have been conserved among the species examined. Data from Southern hybridization experiments indicate that 102 ntp and 63 ntp repeats occur in the mtDNAs of three, two and two races of M.incognita, M.hapla and M.arenaria, respectively. Nucleotide sequences of the M.incognita Race-3 102 ntp repeat were found to be either identical or highly similar to those of the M.javanica 102 ntp repeat. Differences in migration distance and number of 102 ntp repeat-containing bands seen in Southern hybridization autoradiographs of restriction-digested mtDNAs of M.javanica and the different host races of M.incognita, M.hapla and M.arenaria are sufficient to distinguish the different host races of each species. PMID:2027769

Sequence-based analysis of pQBR103; a representative of a unique, transfer-proficient mega plasmid resident in the microbial community of sugar beet

PubMed Central

Tett, Adrian; Spiers, Andrew J; Crossman, Lisa C; Ager, Duane; Ciric, Lena; Dow, J Maxwell; Fry, John C; Harris, David; Lilley, Andrew; Oliver, Anna; Parkhill, Julian; Quail, Michael A; Rainey, Paul B; Saunders, Nigel J; Seeger, Kathy; Snyder, Lori AS; Squares, Rob; Thomas, Christopher M; Turner, Sarah L; Zhang, Xue-Xian; Field, Dawn; Bailey, Mark J

2009-01-01

The plasmid pQBR103 was found within Pseudomonas populations colonizing the leaf and root surfaces of sugar beet plants growing at Wytham, Oxfordshire, UK. At 425 kb it is the largest self-transmissible plasmid yet sequenced from the phytosphere. It is known to enhance the competitive fitness of its host, and parts of the plasmid are known to be actively transcribed in the plant environment. Analysis of the complete sequence of this plasmid predicts a coding sequence (CDS)-rich genome containing 478 CDSs and an exceptional degree of genetic novelty; 80% of predicted coding sequences cannot be ascribed a function and 60% are orphans. Of those to which function could be assigned, 40% bore greatest similarity to sequences from Pseudomonas spp, and the majority of the remainder showed similarity to other γ-proteobacterial genera and plasmids. pQBR103 has identifiable regions presumed responsible for replication and partitioning, but despite being tra+ lacks the full complement of any previously described conjugal transfer functions. The DNA sequence provided few insights into the functional significance of plant-induced transcriptional regions, but suggests that 14% of CDSs may be expressed (11 CDSs with functional annotation and 54 without), further highlighting the ecological importance of these novel CDSs. Comparative analysis indicates that pQBR103 shares significant regions of sequence with other plasmids isolated from sugar beet plants grown at the same geographic location. These plasmid sequences indicate there is more novelty in the mobile DNA pool accessible to phytosphere pseudomonas than is currently appreciated or understood. PMID:18043644
High-resolution community profiling of arbuscular mycorrhizal fungi.

PubMed

Schlaeppi, Klaus; Bender, S Franz; Mascher, Fabio; Russo, Giancarlo; Patrignani, Andrea; Camenzind, Tessa; Hempel, Stefan; Rillig, Matthias C; van der Heijden, Marcel G A

2016-11-01

Community analyses of arbuscular mycorrhizal fungi (AMF) using ribosomal small subunit (SSU) or internal transcribed spacer (ITS) DNA sequences often suffer from low resolution or coverage. We developed a novel sequencing based approach for a highly resolving and specific profiling of AMF communities. We took advantage of previously established AMF-specific PCR primers that amplify a c. 1.5-kb long fragment covering parts of SSU, ITS and parts of the large ribosomal subunit (LSU), and we sequenced the resulting amplicons with single molecule real-time (SMRT) sequencing. The method was applicable to soil and root samples, detected all major AMF families and successfully discriminated closely related AMF species, which would not be discernible using SSU sequences. In inoculation tests we could trace the introduced AMF inoculum at the molecular level. One of the introduced strains almost replaced the local strain(s), revealing that AMF inoculation can have a profound impact on the native community. The methodology presented offers researchers a powerful new tool for AMF community analysis because it unifies improved specificity and enhanced resolution, whereas the drawback of medium sequencing throughput appears of lesser importance for low-diversity groups such as AMF. © 2016 The Authors. New Phytologist © 2016 New Phytologist Trust.
Analysis for complete genomic sequence of HLA-B and HLA-C alleles in the Chinese Han population.

PubMed

Zhu, F; He, Y; Zhang, W; He, J; He, J; Xu, X; Lv, H; Yan, L

2011-08-01

In the present study, we determined the complete genomic sequences and analysed the intron polymorphism of some HLA-B and HLA-C alleles in the Chinese Han population. DNA fragments of over 3.0 kb from the HLA-B and HLA-C loci, spanning from the partial 5' untranslated region to the 3' noncoding region, were amplified by polymerase chain reaction, and the amplified products were then sequenced. Full-length nucleotide sequences of 14 HLA-B alleles and 10 HLA-C alleles were obtained and have been submitted to GenBank and the IMGT/HLA database. Two novel alleles, HLA-B*52:01:01:02 and HLA-B*59:01:01:02, were identified, and the complete genomic sequence of HLA-B*52:01:01:01 is reported for the first time. In total, 157 and 167 polymorphic positions were found in the full-length genomic sequences of the HLA-B and HLA-C loci, respectively. Our results suggest that many single nucleotide polymorphisms exist in the exon and intron regions, and the data can provide useful information for understanding the evolution of HLA-B and HLA-C alleles. © 2011 Blackwell Publishing Ltd.
Analysis of conserved noncoding DNA in Drosophila reveals similar constraints in intergenic and intronic sequences.

PubMed

Bergman, C M; Kreitman, M

2001-08-01

Comparative genomic approaches to gene and cis-regulatory prediction are based on the principle that differential DNA sequence conservation reflects variation in functional constraint. Using this principle, we analyze noncoding sequence conservation in Drosophila for 40 loci with known or suspected cis-regulatory function encompassing >100 kb of DNA. We estimate the fraction of noncoding DNA conserved in both intergenic and intronic regions and describe the length distribution of ungapped conserved noncoding blocks. On average, 22%-26% of noncoding sequences surveyed are conserved in Drosophila, with median block length approximately 19 bp. We show that point substitution in conserved noncoding blocks exhibits transition bias as well as lineage effects in base composition, and occurs more than an order of magnitude more frequently than insertion/deletion (indel) substitution. Overall, patterns of noncoding DNA structure and evolution differ remarkably little between intergenic and intronic conserved blocks, suggesting that the effects of transcription per se contribute minimally to the constraints operating on these sequences. The results of this study have implications for the development of alignment and prediction algorithms specific to noncoding DNA, as well as for models of cis-regulatory DNA sequence evolution.
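The block statistics described above (fraction of noncoding sequence conserved, and the length distribution of ungapped conserved blocks) can be sketched on a pairwise alignment. This is only an illustration under a simplifying assumption: a "conserved block" is taken here to be a maximal run of identical, gap-free alignment columns, which is not necessarily the criterion the authors applied. The function name and the toy alignment are hypothetical.

```python
from statistics import median


def conserved_blocks(seq_a, seq_b, min_len=1):
    """Lengths of maximal runs of identical, ungapped alignment columns.

    seq_a and seq_b are two rows of a pairwise alignment, with '-' as gap.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    blocks, run = [], 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b and a != "-":
            run += 1              # extend the current identical run
        else:
            if run >= min_len:
                blocks.append(run)
            run = 0               # mismatch or gap ends the block
    if run >= min_len:
        blocks.append(run)        # close a block that reaches the end
    return blocks


# Toy alignment (hypothetical, 16 columns).
aln_a = "ACGTACGT--ACGTTT"
aln_b = "ACGTTCGTAAACG-TT"

blocks = conserved_blocks(aln_a, aln_b)
print(blocks)                      # block lengths
print(median(blocks))              # median block length
print(sum(blocks) / len(aln_a))    # fraction of columns in conserved blocks
```

Raising `min_len` would discard short chance matches, which matters in practice because isolated identical columns are expected even between unconstrained sequences.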
Characterization of a sterol carrier protein 2/3-oxoacyl-CoA thiolase from the cotton leafworm (Spodoptera littoralis): a lepidopteran mechanism closer to that in mammals than that in dipterans

PubMed Central

2004-01-01

Numerous invertebrate species belonging to several phyla cannot synthesize sterols de novo and rely on a dietary source of the compound. SCPx (sterol carrier protein 2/3-oxoacyl-CoA thiolase) is a protein involved in the trafficking of sterols and oxidation of branched-chain fatty acids. We have isolated SCPx protein from Spodoptera littoralis (cotton leafworm) and have subjected it to limited amino acid sequencing. A reverse-transcriptase PCR-based approach has been used to clone the cDNA (1.9 kb), which encodes a 57 kDa protein. Northern blotting detected two mRNA transcripts, one of 1.9 kb, encoding SCPx, and one of 0.95 kb, presumably encoding SCP2 (sterol carrier protein 2). The former mRNA was highly expressed in midgut and Malpighian tubules during the last larval instar. Furthermore, constitutive expression of the gene was detected in the prothoracic glands, which are the main tissue producing the insect moulting hormone. There was no significant change in the 1.9 kb mRNA in midgut throughout development, but slightly higher expression in the early stages. Conceptual translation of the cDNA and a database search revealed that the gene includes the SCP2 sequence and a putative peroxisomal targeting signal in the C-terminal region. Also a cysteine residue at the putative active site for the 3-oxoacyl-CoA thiolase is conserved. Southern blotting showed that SCPx is likely to be encoded by a single-copy gene. The mRNA expression pattern and the gene structure suggest that SCPx from S. littoralis (a lepidopteran) is evolutionarily closer to that of mammals than to that of dipterans. PMID:15149283
Distribution of CFTR mutations in Eastern Hungarians: relevance to genetic testing and to the introduction of newborn screening for cystic fibrosis.

PubMed

Ivady, Gergely; Madar, Laszlo; Nagy, Bela; Gonczi, Ferenc; Ajzner, Eva; Dzsudzsak, Erika; Dvořáková, Lenka; Gombos, Eva; Kappelmayer, Janos; Macek, Milan; Balogh, Istvan

2011-05-01

The aim of this study was characterization of an updated distribution of CFTR mutations in a representative cohort of 40 CF patients with the classical form of the disease drawn from Eastern Hungary. Due to the homogeneity of the Hungarian population our data are generally applicable to other regions of the country, including the sizeable diaspora. We utilized the recommended "cascade" CFTR mutation screening approach, initially using a commercial assay, followed by examination of the common "Slavic" deletion CFTRdele2,3(21kb). Subsequently, the entire coding region of the CFTR gene was sequenced in patients with yet unidentified mutations. The Elucigene CF29™ v2 assay detected 81.25% of all CF causing mutations. Adding the CFTRdele2,3(21kb) test increased the mutation detection rate to 86.25%. DNA sequencing enabled us to identify mutations on 79/80 CF alleles. Two mutations [CFTRdele2,3(21kb) and p.Gln685ThrfsX4 (2184insA)] were found at an unusually high frequency, each comprising 5.00% of all CF alleles. We have identified common CF causing mutations in the Hungarian population with the most common mutations (p.Phe508del, p.Asn1303Lys, CFTRdele2,3(21kb), 2184insA, p.Gly542X, and p.Leu101X), comprising over 93.75% of all CF alleles. Obtained data are applicable to the improvement of DNA diagnostics in Hungary and beyond, and are the necessary prerequisite for the introduction of a nationwide "two tier" CF newborn screening program. Copyright © 2011 European Cystic Fibrosis Society. Published by Elsevier B.V. All rights reserved.
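The detection rates in this abstract are percentages of CF alleles, not of patients: 40 patients carry 80 alleles. A short sketch, using only figures stated in the abstract, converts the reported rates back to allele counts. The helper name `allele_count` is hypothetical.

```python
PATIENTS = 40
ALLELES = 2 * PATIENTS  # two CFTR alleles per patient, 80 in total


def allele_count(rate_percent, total=ALLELES):
    """Convert a reported rate (percent of alleles) to a whole allele count."""
    return round(rate_percent / 100 * total)


commercial = allele_count(81.25)      # detected by the commercial assay
with_deletion = allele_count(86.25)   # after adding the 21-kb deletion test

print(commercial, with_deletion)      # allele counts at each screening step
print(with_deletion - commercial)     # alleles gained from the deletion test
print(allele_count(5.00))             # alleles carrying each high-frequency mutation
print(100 * 79 / ALLELES)             # detection rate after sequencing, percent
```

The gain from the deletion test (4 alleles) matches the 5.00% frequency reported for CFTRdele2,3(21kb), so the figures in the abstract are internally consistent.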
The BioHub Knowledge Base: Ontology and Repository for Sustainable Biosourcing.

PubMed

Read, Warren J; Demetriou, George; Nenadic, Goran; Ruddock, Noel; Stevens, Robert; Winter, Jerry

2016-06-01

The motivation for the BioHub project is to create an Integrated Knowledge Management System (IKMS) that will enable chemists to source ingredients from bio-renewables, rather than from non-sustainable sources such as fossil oil and its derivatives. The BioHubKB is the data repository of the IKMS; it employs Semantic Web technologies, especially OWL, to host data about chemical transformations, bio-renewable feedstocks, co-product streams and their chemical components. Access to this knowledge base is provided to other modules within the IKMS through a set of RESTful web services, driven by SPARQL queries to a Sesame back-end. The BioHubKB re-uses several bio-ontologies and bespoke extensions, primarily for chemical feedstocks and products, to form its knowledge organisation schema. Parts of plants form feedstocks, while various processes generate co-product streams that contain certain chemicals. Both chemicals and transformations are associated with certain qualities, which the BioHubKB also attempts to capture. Of immediate commercial and industrial importance is to estimate the cost of particular sets of chemical transformations (leading to candidate surfactants) performed in sequence, and these costs too are captured. Data are sourced from companies' internal knowledge and document stores, and from the publicly available literature. Both text analytics and manual curation play their part in populating the ontology. We describe the prototype IKMS, the BioHubKB and the services that it supports for the IKMS. The BioHubKB can be found via http://biohub.cs.manchester.ac.uk/ontology/biohub-kb.owl .
Organizational differences between cytoplasmic male sterile and male fertile Brassica mitochondrial genomes are confined to a single transposed locus.

PubMed Central

L'Homme, Y; Brown, G G

1993-01-01

Comparison of the physical maps of male fertile (cam) and male sterile (pol) mitochondrial genomes of Brassica napus indicates that structural differences between the two mtDNAs are confined to a region immediately upstream of the atp6 gene. Relative to cam mtDNA, pol mtDNA possesses a 4.5 kb segment at this locus that includes a chimeric gene that is cotranscribed with atp6 and lacks an approximately 1 kb region located upstream of the cam atp6 gene. The 4.5 kb pol segment is present and similarly organized in the mitochondrial genome of the common nap B.napus cytoplasm; however, the nap and pol DNA regions flanking this segment are different and the nap sequences are not expressed. The 4.5 kb CMS-associated pol segment has thus apparently undergone transposition during the evolution of the nap and pol cytoplasms and has been lost in the cam genome subsequent to the pol-cam divergence. This 4.5 kb segment comprises the single DNA region that is expressed differently in fertile, pol CMS and fertility restored pol cytoplasm plants. The finding that this locus is part of the single mtDNA region organized differently in the fertile and male sterile mitochondrial genomes provides strong support for the view that it specifies the pol CMS trait. PMID:8388101
Duplicated Enhancer Region Increases Expression of CTSB and Segregates with Keratolytic Winter Erythema in South African and Norwegian Families.

PubMed

Ngcungcu, Thandiswa; Oti, Martin; Sitek, Jan C; Haukanes, Bjørn I; Linghu, Bolan; Bruccoleri, Robert; Stokowy, Tomasz; Oakeley, Edward J; Yang, Fan; Zhu, Jiang; Sultan, Marc; Schalkwijk, Joost; van Vlijmen-Willems, Ivonne M J J; von der Lippe, Charlotte; Brunner, Han G; Ersland, Kari M; Grayson, Wayne; Buechmann-Moller, Stine; Sundnes, Olav; Nirmala, Nanguneri; Morgan, Thomas M; van Bokhoven, Hans; Steen, Vidar M; Hull, Peter R; Szustakowski, Joseph; Staedtler, Frank; Zhou, Huiqing; Fiskerstrand, Torunn; Ramsay, Michele

2017-05-04

Keratolytic winter erythema (KWE) is a rare autosomal-dominant skin disorder characterized by recurrent episodes of palmoplantar erythema and epidermal peeling. KWE was previously mapped to 8p23.1-p22 (KWE critical region) in South African families. Using targeted resequencing of the KWE critical region in five South African families and SNP array and whole-genome sequencing in two Norwegian families, we identified two overlapping tandem duplications of 7.67 kb (South Africans) and 15.93 kb (Norwegians). The duplications segregated with the disease and were located upstream of CTSB, a gene encoding cathepsin B, a cysteine protease involved in keratinocyte homeostasis. Included in the 2.62 kb overlapping region of these duplications is an enhancer element that is active in epidermal keratinocytes. The activity of this enhancer correlated with CTSB expression in normal differentiating keratinocytes and other cell lines, but not with FDFT1 or NEIL2 expression. Gene expression (qPCR) analysis and immunohistochemistry of the palmar epidermis demonstrated significantly increased expression of CTSB, as well as stronger staining of cathepsin B in the stratum granulosum of affected individuals than in that of control individuals. Analysis of higher-order chromatin structure data and RNA polymerase II ChIA-PET data from MCF-7 cells did not suggest remote effects of the enhancer. In conclusion, KWE in South African and Norwegian families is caused by tandem duplications in a non-coding genomic region containing an active enhancer element for CTSB, resulting in upregulation of this gene in affected individuals. Copyright © 2017 American Society of Human Genetics. Published by Elsevier Inc. All rights reserved.
Cloning and identification of a cDNA that encodes a novel human protein with thrombospondin type I repeat domain, hPWTSR.

PubMed

Chen, Jin-Zhong; Wang, Shu; Tang, Rong; Yang, Quan-Sheng; Zhao, Enpeng; Chao, Yaoqiong; Ying, Kang; Xie, Yi; Mao, Yu-Min

2002-09-01

A cDNA was isolated from a fetal brain cDNA library by high-throughput cDNA sequencing. The 2390 bp cDNA, with an open reading frame (ORF) of 816 bp, encodes a putative protein of 272 amino acids with a thrombospondin type I repeat (TSR) domain and a cysteine-rich region at the N-terminus, so it was named hPWTSR. Northern blotting detected two bands of about 3 kb and 4 kb, which were expressed in human adult tissues at different intensities. The expression pattern was verified by RT-PCR, which showed that the transcripts were also expressed ubiquitously in fetal tissues and human tumor tissues. However, the transcript was detected neither in ovarian carcinoma GI-102 nor in lung carcinoma LX-1. BLAST analysis against the NCBI database revealed that the new gene contains at least 5 exons and is located on human chromosome 6q22.33. Our results demonstrate that the gene is a novel member of the TSR supergene family.
Woot, an Active Gypsy-Class Retrotransposon in the Flour Beetle, Tribolium Castaneum, Is Associated with a Recent Mutation

PubMed Central

Beeman, R. W.; Thomson, M. S.; Clark, J. M.; DeCamillis, M. A.; Brown, S. J.; Denell, R. E.

1996-01-01

A recently isolated, lethal mutation of the homeotic Abdominal gene of the red flour beetle Tribolium castaneum is associated with an insertion of a novel retrotransposon into an intron. Sequence analysis indicates that this retrotransposon, named Woot, is a member of the gypsy family of mobile elements. Most strains of T. castaneum appear to harbor ~25-35 copies of Woot per genome. Woot is composed of long terminal repeats of unprecedented length (3.6 kb each), flanking an internal coding region 5.0 kb in length. For most copies of Woot, the internal region includes two open reading frames (ORFs) that correspond to the gag and pol genes of previously described retrotransposons and retroviruses. The copy of Woot inserted into Abdominal bears an apparent single frameshift mutation that separates the normal second ORF into two. Woot does not appear to generate infectious virions, by the criterion that no envelope gene is discernible. The association of Woot with a recent mutation suggests that this retroelement is currently transpositionally active in at least some strains. PMID:8722793
SequenceLDhot: detecting recombination hotspots.

PubMed

Fearnhead, Paul

2006-12-15

There is much local variation in recombination rates across the human genome, with the majority of recombination occurring in recombination hotspots: short regions of approximately 2 kb in length that have much higher recombination rates than neighbouring regions. Knowledge of this local variation is important, e.g. in the design and analysis of association studies for disease genes. Population genetic data, such as that generated by the HapMap project, can be used to infer the location of these hotspots. We present a new, efficient and powerful method for detecting recombination hotspots from population data. We compare our method with four current methods for detecting hotspots. It is orders of magnitude quicker, and has greater power, than two related approaches. It appears to be more powerful than HotspotFisher, though less accurate at inferring the precise positions of the hotspot. It was also more powerful than LDhot in some situations: particularly for weaker hotspots (10-40 times the background rate) when SNP density is lower (< 1/kb). Program, data sets, and full details of results are available at: http://www.maths.lancs.ac.uk/~fearnhea/Hotspot.
Comparative Sequence and X-Inactivation Analyses of a Domain of Escape in Human Xp11.2 and the Conserved Segment in Mouse

PubMed Central

Tsuchiya, Karen D.; Greally, John M.; Yi, Yajun; Noel, Kevin P.; Truong, Jean-Pierre; Disteche, Christine M.

2004-01-01

We have performed X-inactivation and sequence analyses on 350 kb of sequence from human Xp11.2, a region shown previously to contain a cluster of genes that escape X inactivation, and we compared this region with the region of conserved synteny in mouse. We identified several new transcripts from this region in human and in mouse, which defined the full extent of the domain escaping X inactivation in both species. In human, escape from X inactivation involves an uninterrupted 235-kb domain of multiple genes. Despite highly conserved gene content and order between the two species, Smcx is the only mouse gene from the conserved segment that escapes inactivation. As repetitive sequences are believed to facilitate spreading of X inactivation along the chromosome, we compared the repetitive sequence composition of this region between the two species. We found that long terminal repeats (LTRs) were decreased in the human domain of escape, but not in the majority of the conserved mouse region adjacent to Smcx in which genes were subject to X inactivation, suggesting that these repeats might be excluded from escape domains to prevent spreading of silencing. Our findings indicate that genomic context, as well as gene-specific regulatory elements, interact to determine expression of a gene from the inactive X-chromosome. PMID:15197169
A prokaryotic viral sequence is expressed and conserved in mammalian brain.

PubMed

Yeh, Yang-Hui; Gunasekharan, Vignesh; Manuelidis, Laura

2017-07-03

A natural and permanent transfer of prokaryotic viral sequences to mammals has not been reported by others. Circular "SPHINX" DNAs <5 kb were previously isolated from nuclease-protected cytoplasmic particles in rodent neuronal cell lines and brain. Two of these DNAs were sequenced after Φ29 polymerase amplification, and they revealed significant but imperfect homology to segments of commensal Acinetobacter phage viruses. These findings were surprising because the brain is isolated from environmental microorganisms. The 1.76-kb DNA sequence (SPHINX 1.8), with an iteron before its ORF, was evaluated here for its expression in neural cells and brain. A rabbit affinity purified antibody generated against a peptide without homology to mammalian sequences labeled a nonglycosylated ∼41-kDa protein (spx1) on Western blots, and the signal was efficiently blocked by the competing peptide. Spx1 was resistant to limited proteinase K digestion, but was unrelated to the expression of host prion protein or its pathologic amyloid form. Remarkably, spx1 concentrated in selected brain synapses, such as those on anterior motor horn neurons that integrate many complex neural inputs. SPHINX 1.8 appears to be involved in tissue-specific differentiation, including essential functions that preserve its propagation during mammalian evolution, possibly via maternal inheritance. The data here indicate that mammals can share and exchange a larger world of prokaryotic viruses than previously envisioned.
Recombinational hotspot specific to female meiosis in the mouse major histocompatibility complex.

PubMed

Shiroishi, T; Hanzawa, N; Sagai, T; Ishiura, M; Gojobori, T; Steinmetz, M; Moriwaki, K

1990-01-01

The wm7 haplotype of the major histocompatibility complex (MHC), derived from the Japanese wild mouse Mus musculus molossinus, enhances recombination specific to female meiosis in the K/A beta interval of the MHC. We have mapped crossover points of fifteen independent recombinants from genetic crosses of the wm7 and laboratory haplotypes. Most of them were confined to a short segment of approximately 1 kilobase (kb) of DNA between the A beta 3 and A beta 2 genes, indicating the presence of a female-specific recombinational hotspot. Its location overlaps with a sex-independent hotspot previously identified in the Mus musculus castaneus CAS3 haplotype. We have cloned and sequenced DNA fragments surrounding the hotspot from the wm7 haplotype and the corresponding regions from the hotspot-negative B10.A and C57BL/10 strains. There is no significant difference between the sequences of these three strains, or between these and the published sequences of the CAS3 and C57BL/6 strains. However, a comparison of this A beta 3/A beta 2 hotspot with a previously characterized hotspot in the E beta gene revealed that they have a very similar molecular organization. Each hotspot consists of two elements, the consensus sequence of the mouse middle repetitive MT family and the tetrameric repeated sequences, which are separated by 1 kb of DNA.
Small rare recurrent deletions and reciprocal duplications in 2q21.1, including brain-specific ARHGEF4 and GPR148

PubMed Central

Dharmadhikari, Avinash V.; Kang, Sung-Hae L.; Szafranski, Przemyslaw; Person, Richard E.; Sampath, Srirangan; Prakash, Siddharth K.; Bader, Patricia I.; Phillips, John A.; Hannig, Vickie; Williams, Misti; Vinson, Sherry S.; Wilfong, Angus A.; Reimschisel, Tyler E.; Craigen, William J.; Patel, Ankita; Bi, Weimin; Lupski, James R.; Belmont, John; Cheung, Sau Wai; Stankiewicz, Pawel

2012-01-01

We have identified a rare small (∼450 kb unique sequence) recurrent deletion in a previously linked attention-deficit hyperactivity disorder (ADHD) locus at 2q21.1 in five unrelated families with developmental delay (DD)/intellectual disability (ID), ADHD, epilepsy and other neurobehavioral abnormalities from 17 035 samples referred for clinical chromosomal microarray analysis. Additionally, a DECIPHER (http://decipher.sanger.ac.uk) patient 2311 was found to have the same deletion and presented with aggressive behavior. The deletion was found neither in six control groups consisting of 13 999 healthy individuals nor in the DGV database. We have also identified reciprocal duplications in five unrelated families with autism, developmental delay (DD), seizures and ADHD. This genomic region is flanked by large, complex low-copy repeats (LCRs) with directly oriented subunits of ∼109 kb in size that have 97.7% DNA sequence identity. We sequenced the deletion breakpoints within the directly oriented paralogous subunits of the flanking LCR clusters, demonstrating non-allelic homologous recombination as a mechanism of formation. The rearranged segment harbors five genes: GPR148, FAM123C, ARHGEF4, FAM168B and PLEKHB2. Expression of ARHGEF4 (Rho guanine nucleotide exchange factor 4) is restricted to the brain and may regulate the actin cytoskeletal network, cell morphology and migration, and neuronal function. GPR148 encodes a G-protein-coupled receptor protein expressed in the brain and testes. We suggest that the small rare recurrent deletion of 2q21.1 is pathogenic for DD/ID, ADHD, epilepsy and other neurobehavioral abnormalities and that, because of its small size, low frequency and more severe phenotype, it might have been missed in previous genome-wide screening studies using single-nucleotide polymorphism analyses. PMID:22543972
Human-Specific Duplication and Mosaic Transcripts: The Recent Paralogous Structure of Chromosome 22

PubMed Central

Bailey, Jeffrey A.; Yavor, Amy M.; Viggiano, Luigi; Misceo, Doriana; Horvath, Juliann E.; Archidiacono, Nicoletta; Schwartz, Stuart; Rocchi, Mariano; Eichler, Evan E.

2002-01-01

In recent decades, comparative chromosomal banding, chromosome painting, and gene-order studies have shown strong conservation of gross chromosome structure and gene order in mammals. However, findings from the human genome sequence suggest an unprecedented degree of recent (<35 million years ago) segmental duplication. This dynamism of segmental duplications has important implications in disease and evolution. Here we present a chromosome-wide view of the structure and evolution of the most highly homologous duplications (⩾1 kb and ⩾90%) on chromosome 22. Overall, 10.8% (3.7/33.8 Mb) of chromosome 22 is duplicated, with an average sequence identity of 95.4%. To organize the duplications into tractable units, intron-exon structure and well-defined duplication boundaries were used to define 78 duplicated modules (minimally shared evolutionary segments) with 157 copies on chromosome 22. Analysis of these modules provides evidence for the creation or modification of 11 novel transcripts. Comparative FISH analyses of human, chimpanzee, gorilla, orangutan, and macaque reveal qualitative and quantitative differences in the distribution of these duplications—consistent with their recent origin. Several duplications appear to be human specific, including a ∼400-kb duplication (99.4%–99.8% sequence identity) that transposed from chromosome 14 to the most proximal pericentromeric region of chromosome 22. Experimental and in silico data further support a pericentromeric gradient of duplications where the most recent duplications transpose adjacent to the centromere. Taken together, these data suggest that segmental duplications have been an ongoing process of primate genome evolution, contributing to recent gene innovation and the dynamic transformation of genome architecture within and among closely related species. PMID:11731936
Linezolid-Resistant Staphylococcus aureus Strain 1128105, the First Known Clinical Isolate Possessing the cfr Multidrug Resistance Gene

PubMed Central

Zuill, Douglas E.; Scharn, Caitlyn R.; Deane, Jennifer; Sahm, Daniel F.; Denys, Gerald A.; Goering, Richard V.; Shaw, Karen J.

2014-01-01

The Cfr methyltransferase confers resistance to six classes of drugs which target the peptidyl transferase center of the 50S ribosomal subunit, including some oxazolidinones, such as linezolid (LZD). The mobile cfr gene was identified in European veterinary isolates from the late 1990s, although the earliest report of a clinical cfr-positive strain was the 2005 Colombian methicillin-resistant Staphylococcus aureus (MRSA) isolate CM05. Here, through retrospective analysis of LZDr clinical strains from a U.S. surveillance program, we identified a cfr-positive MRSA isolate, 1128105, from January 2005, predating CM05 by 5 months. Molecular typing of 1128105 revealed a unique pulsed-field gel electrophoresis (PFGE) profile most similar to that of USA100, spa type t002, and multilocus sequence type 5 (ST5). In addition to cfr, LZD resistance in 1128105 is partially attributed to the presence of a single copy of the 23S rRNA gene mutation T2500A. Transformation of the ∼37-kb conjugative p1128105 cfr-bearing plasmid from 1128105 into S. aureus ATCC 29213 background strains was successful in recapitulating the Cfr antibiogram, as well as resistance to aminoglycosides and trimethoprim. A 7-kb cfr-containing region of p1128105 possessed sequence nearly identical to that found in the Chinese veterinary Proteus vulgaris isolate PV-01 and in U.S. clinical S. aureus isolate 1900, although the presence of IS431-like sequences is unique to p1128105. The cfr gene environment in this early clinical cfr-positive isolate has now been identified in Gram-positive and Gram-negative strains of clinical and veterinary origin and has been associated with multiple mobile elements, highlighting the versatility of this multidrug resistance gene and its potential for further dissemination. PMID:25155597
Draft Genome of the Pearl Oyster Pinctada fucata: A Platform for Understanding Bivalve Biology

PubMed Central

Takeuchi, Takeshi; Kawashima, Takeshi; Koyanagi, Ryo; Gyoja, Fuki; Tanaka, Makiko; Ikuta, Tetsuro; Shoguchi, Eiichi; Fujiwara, Mayuki; Shinzato, Chuya; Hisata, Kanako; Fujie, Manabu; Usami, Takeshi; Nagai, Kiyohito; Maeyama, Kaoru; Okamoto, Kikuhiko; Aoki, Hideo; Ishikawa, Takashi; Masaoka, Tetsuji; Fujiwara, Atushi; Endo, Kazuyoshi; Endo, Hirotoshi; Nagasawa, Hiromichi; Kinoshita, Shigeharu; Asakawa, Shuichi; Watabe, Shugo; Satoh, Nori

2012-01-01

The study of the pearl oyster Pinctada fucata is key to increasing our understanding of the molecular mechanisms involved in pearl biosynthesis and the biology of bivalve molluscs. We sequenced its ∼1150-Mb genome at ∼40-fold coverage using the Roche 454 GS-FLX and Illumina GAIIx sequencers. The sequences were assembled into contigs with N50 = 1.6 kb (the total contig assembly reached 1024 Mb) and scaffolds with N50 = 14.5 kb. The pearl oyster genome is AT-rich, with a GC content of 34%. DNA transposons, retrotransposons, and tandem repeat elements occupied 0.4, 1.5, and 7.9% of the genome, respectively (a total of 9.8%). Version 1.0 of the P. fucata draft genome contains 23 257 complete gene models, 70% of which are supported by the corresponding expressed sequence tags. The genes include those reported to have an association with bio-mineralization. Genes encoding transcription factors and signal transduction molecules are present in numbers comparable with genomes of other metazoans. Genome-wide molecular phylogeny suggests that lophotrochozoans represent a clade distinct from ecdysozoans. Our draft genome of the pearl oyster thus provides a platform for the identification of selection markers and genes for calcification, knowledge of which will be important in the pearl industry. PMID:22315334
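The N50 values quoted in the abstract above are a standard assembly contiguity statistic: the length L such that sequences of length at least L together contain at least half of the assembled bases. A minimal sketch of that calculation in Python follows. The function name `n50` and the sample contig lengths are hypothetical and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths.

    N50 is the length L such that sequences of length >= L
    together account for at least half of the total bases.
    """
    if not lengths:
        raise ValueError("need at least one sequence length")
    ordered = sorted(lengths, reverse=True)  # longest first
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Hypothetical contig lengths in bp (illustrative only).
contigs = [500, 400, 300, 200, 100]
print(n50(contigs))
```

The same function applies to scaffold lengths; only the input list changes.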
Exploitation of the diverse insertion sequence element content of dairy Lactobacillus helveticus starters as a rapid method to identify different strains.

PubMed

Kaleta, Pawel; Callanan, Michael J; O'Callaghan, John; Fitzgerald, Gerald F; Beresford, Thomas P; Ross, R Paul

2009-10-01

The species Lactobacillus helveticus is a commonly used thermophilic starter and/or adjunct culture for Swiss and Cheddar cheese manufacture. Its use is normally associated with flavour improvement, which is linked to culture traits such as rapid autolysis and high proteolytic activity. The genome of the commercial strain, DPC4571, was recently sequenced and found to contain IS sequences that are both abundant (213 intact) and diverse (21 types). Given this unique diversity for a lactic acid bacterium, we investigated whether PCR-based IS fingerprinting could be used as a discriminatory tool to distinguish between different strains of Lb. helveticus. A set of ten primers targeting five of the most numerous groups (ISL1201, ISLhe65, ISLhe2, ISLhe15 and ISL2) of IS elements was designed. Multiplex-PCR with all primers resulted in 1-12 discrete amplicons for each strain tested. The resultant fingerprints (in the 0.5 kb-3 kb range) were found to be strain specific and reproducible. This approach thus provides a valuable method to distinguish between Lb. helveticus strains while giving some indication of the relative abundance of IS sequences in each strain.
