candidate dna sequences: Topics by Science.gov

Sample records for candidate dna sequences

HLA genotyping by next-generation sequencing of complementary DNA.

PubMed

Segawa, Hidenobu; Kukita, Yoji; Kato, Kikuya

2017-11-28

Genotyping of the human leucocyte antigen (HLA) is indispensable for various medical treatments. However, unambiguous genotyping is technically challenging due to high polymorphism of the corresponding genomic region. Next-generation sequencing is changing the landscape of genotyping. In addition to high throughput of data, its additional advantage is that DNA templates are derived from single molecules, which is a strong merit for the phasing problem. Although most currently developed technologies use genomic DNA, use of cDNA could enable genotyping with reduced costs in data production and analysis. We thus developed an HLA genotyping system based on next-generation sequencing of cDNA. Each HLA gene was divided into 3 or 4 target regions subjected to PCR amplification and subsequent sequencing with Ion Torrent PGM. The sequence data were then subjected to an automated analysis. The principle of the analysis was to construct candidate sequences generated from all possible combinations of variable bases and arrange them in decreasing order of the number of reads. Upon collecting candidate sequences from all target regions, 2 haplotypes were usually assigned. Cases not assigned 2 haplotypes were forwarded to 4 additional processes: selection of candidate sequences applying more stringent criteria, removal of artificial haplotypes, selection of candidate sequences with a relaxed threshold for sequence matching, and countermeasure for incomplete sequences in the HLA database. The genotyping system was evaluated using 30 samples; the overall accuracy was 97.0% at the field 3 level and 98.3% at the G group level. With one sample, genotyping of DPB1 was not completed due to short read size. We then developed a method for complete sequencing of individual molecules of the DPB1 gene, using the molecular barcode technology. The performance of the automatic genotyping system was comparable to that of systems developed in previous studies. 
Thus, next-generation sequencing of cDNA is a viable option for HLA genotyping.
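The candidate-sequence principle described in this abstract (build every combination of variable bases, then order the candidates by read support) can be illustrated with a minimal sketch. It is not the authors' pipeline: the function name, the toy reads, and the assumption that all reads are already aligned and of equal length are hypothetical.

```python
from collections import Counter
from itertools import product


def rank_candidates(reads, reference, variable_positions):
    """Enumerate all combinations of bases seen at the variable positions
    and rank the resulting candidate sequences by supporting read count."""
    # Bases observed at each variable position across all reads.
    observed = [sorted({r[p] for r in reads}) for p in variable_positions]
    # Number of reads carrying each exact combination of variable bases.
    support = Counter(tuple(r[p] for p in variable_positions) for r in reads)
    candidates = []
    for combo in product(*observed):
        seq = list(reference)
        for pos, base in zip(variable_positions, combo):
            seq[pos] = base
        candidates.append(("".join(seq), support.get(combo, 0)))
    # Decreasing read count; ties broken alphabetically for determinism.
    candidates.sort(key=lambda c: (-c[1], c[0]))
    return candidates


# Toy data (hypothetical): aligned reads of equal length, variable at 1 and 4.
reads = ["ACGTA", "ACGTA", "ACGTA", "ATGTC", "ATGTC", "ACGTC"]
ranked = rank_candidates(reads, "ACGTA", [1, 4])
for seq, n in ranked:
    print(seq, n)
```

In this toy case the two best-supported candidates would be taken as the two haplotypes, while the low-count combinations would be treated as likely artifacts.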
Pediatric Glioblastoma Therapies Based on Patient-Derived Stem Cell Resources

DTIC Science & Technology

2014-11-01

genomic DNA and then subjected to Illumina high-throughput sequencing. In this analysis, shRNAs lost in the GSC population represent candidate gene...PRISM 7900 Sequence Detection System (Genomics Resource, FHCRC). Relative transcript abundance was analyzed using the 2^(−ΔΔCt) method. TRIzol (Invitrogen
DNA tetrominoes: the construction of DNA nanostructures using self-organised heterogeneous deoxyribonucleic acids shapes.

PubMed

Ong, Hui San; Rahim, Mohd Syafiq; Firdaus-Raih, Mohd; Ramlan, Effirul Ikhwan

2015-01-01

The unique programmability of nucleic acids offers an alternative for constructing excitable and functional nanostructures. This work introduces an autonomous protocol to construct DNA Tetris shapes (L-Shape, B-Shape, T-Shape and I-Shape) using modular DNA blocks. The protocol exploits the large number of sequence combinations available from the nucleic acid alphabet, thus allowing for diversity in the design of various DNA nanostructures. Instead of a deterministic set of sequences corresponding to a particular design, the protocol promotes a large pool of DNA shapes that can assemble to conform to any desired structure. By utilising evolutionary programming in the design stage, DNA blocks are subjected to processes such as sequence insertion, deletion and base shifting in order to enrich the diversity of the resulting shapes based on a set of cascading filters. The optimisation algorithm allows mutation to be applied indefinitely to the candidate sequences until they comply with all four fitness criteria. Candidates generated by the protocol are in agreement with the filter cascades and thermodynamic simulation. Further validation using gel electrophoresis indicated the formation of the designed shapes. These results support the plausibility of constructing DNA nanostructures in a more hierarchical, modular, and interchangeable manner.
Using Drosophila melanogaster as a Model for Genotoxic Chemical Mutational Studies with a New Program, SnpSift

PubMed Central

Cingolani, Pablo; Patel, Viral M.; Coon, Melissa; Nguyen, Tung; Land, Susan J.; Ruden, Douglas M.; Lu, Xiangyi

2012-01-01

This paper describes a new program SnpSift for filtering differential DNA sequence variants between two or more experimental genomes after genotoxic chemical exposure. Here, we illustrate how SnpSift can be used to identify candidate phenotype-relevant variants including single nucleotide polymorphisms, multiple nucleotide polymorphisms, insertions, and deletions (InDels) in mutant strains isolated from genome-wide chemical mutagenesis of Drosophila melanogaster. First, the genomes of two independently isolated mutant fly strains that are allelic for a novel recessive male-sterile locus generated by genotoxic chemical exposure were sequenced using the Illumina next-generation DNA sequencer to obtain 20- to 29-fold coverage of the euchromatic sequences. The sequencing reads were processed and variants were called using standard bioinformatic tools. Next, SnpEff was used to annotate all sequence variants and their potential mutational effects on associated genes. Then, SnpSift was used to filter and select differential variants that potentially disrupt a common gene in the two allelic mutant strains. The potential causative DNA lesions were partially validated by capillary sequencing of polymerase chain reaction-amplified DNA in the genetic interval as defined by meiotic mapping and deletions that remove defined regions of the chromosome. Of the five candidate genes located in the genetic interval, the Pka-like gene CG12069 was found to carry a separate pre-mature stop codon mutation in each of the two allelic mutants whereas the other four candidate genes within the interval have wild-type sequences. The Pka-like gene is therefore a strong candidate gene for the male-sterile locus. These results demonstrate that combining SnpEff and SnpSift can expedite the identification of candidate phenotype-causative mutations in chemically mutagenized Drosophila strains. This technique can also be used to characterize the variety of mutations generated by genotoxic chemicals. 
PMID:22435069
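The filtering step described above (keep only genes that are potentially disrupted in both allelic mutant strains) can be sketched in plain Python. This does not use or reproduce the SnpSift command-line interface; the record layout, the function name, the positions, and the genes other than CG12069 are hypothetical, and the "HIGH" label is only loosely modelled on the impact categories that SnpEff annotations carry.

```python
def genes_disrupted_in_all(variants_by_strain, disruptive=("HIGH",)):
    """Return genes carrying at least one disruptive variant in every strain."""
    gene_sets = [
        {v["gene"] for v in variants if v["impact"] in disruptive}
        for variants in variants_by_strain.values()
    ]
    return set.intersection(*gene_sets) if gene_sets else set()


# Toy records: positions and the genes other than CG12069 are invented.
variants_by_strain = {
    "mutant_1": [
        {"gene": "CG12069", "pos": 1001, "effect": "stop_gained", "impact": "HIGH"},
        {"gene": "geneA", "pos": 2050, "effect": "missense_variant", "impact": "MODERATE"},
    ],
    "mutant_2": [
        {"gene": "CG12069", "pos": 1340, "effect": "stop_gained", "impact": "HIGH"},
        {"gene": "geneB", "pos": 3100, "effect": "frameshift_variant", "impact": "HIGH"},
    ],
}

shared = genes_disrupted_in_all(variants_by_strain)
print(shared)
```

Intersecting at the gene level rather than the variant level matters here, because the two independently isolated mutants carry separate lesions in the same gene.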
Human Chromosome 7: DNA Sequence and Biology

PubMed Central

Scherer, Stephen W.; Cheung, Joseph; MacDonald, Jeffrey R.; Osborne, Lucy R.; Nakabayashi, Kazuhiko; Herbrick, Jo-Anne; Carson, Andrew R.; Parker-Katiraee, Layla; Skaug, Jennifer; Khaja, Razi; Zhang, Junjun; Hudek, Alexander K.; Li, Martin; Haddad, May; Duggan, Gavin E.; Fernandez, Bridget A.; Kanematsu, Emiko; Gentles, Simone; Christopoulos, Constantine C.; Choufani, Sanaa; Kwasnicka, Dorota; Zheng, Xiangqun H.; Lai, Zhongwu; Nusskern, Deborah; Zhang, Qing; Gu, Zhiping; Lu, Fu; Zeesman, Susan; Nowaczyk, Malgorzata J.; Teshima, Ikuko; Chitayat, David; Shuman, Cheryl; Weksberg, Rosanna; Zackai, Elaine H.; Grebe, Theresa A.; Cox, Sarah R.; Kirkpatrick, Susan J.; Rahman, Nazneen; Friedman, Jan M.; Heng, Henry H. Q.; Pelicci, Pier Giuseppe; Lo-Coco, Francesco; Belloni, Elena; Shaffer, Lisa G.; Pober, Barbara; Morton, Cynthia C.; Gusella, James F.; Bruns, Gail A. P.; Korf, Bruce R.; Quade, Bradley J.; Ligon, Azra H.; Ferguson, Heather; Higgins, Anne W.; Leach, Natalia T.; Herrick, Steven R.; Lemyre, Emmanuelle; Farra, Chantal G.; Kim, Hyung-Goo; Summers, Anne M.; Gripp, Karen W.; Roberts, Wendy; Szatmari, Peter; Winsor, Elizabeth J. T.; Grzeschik, Karl-Heinz; Teebi, Ahmed; Minassian, Berge A.; Kere, Juha; Armengol, Lluis; Pujana, Miguel Angel; Estivill, Xavier; Wilson, Michael D.; Koop, Ben F.; Tosi, Sabrina; Moore, Gudrun E.; Boright, Andrew P.; Zlotorynski, Eitan; Kerem, Batsheva; Kroisel, Peter M.; Petek, Erwin; Oscier, David G.; Mould, Sarah J.; Döhner, Hartmut; Döhner, Konstanze; Rommens, Johanna M.; Vincent, John B.; Venter, J. Craig; Li, Peter W.; Mural, Richard J.; Adams, Mark D.; Tsui, Lap-Chee

2010-01-01

DNA sequence and annotation of the entire human chromosome 7, encompassing nearly 158 million nucleotides of DNA and 1917 gene structures, are presented. To generate a higher order description, additional structural features such as imprinted genes, fragile sites, and segmental duplications were integrated at the level of the DNA sequence with medical genetic data, including 440 chromosome rearrangement breakpoints associated with disease. This approach enabled the discovery of candidate genes for developmental diseases including autism. PMID:12690205
Constructing DNA Barcode Sets Based on Particle Swarm Optimization.

PubMed

Wang, Bin; Zheng, Xuedong; Zhou, Shihua; Zhou, Changjun; Wei, Xiaopeng; Zhang, Qiang; Wei, Ziqi

2018-01-01

Following the completion of the human genome project, a large amount of high-throughput bio-data was generated. To analyze these data, massively parallel sequencing, namely next-generation sequencing, was rapidly developed. DNA barcodes are used to identify the ownership between sequences and samples when they are attached at the beginning or end of sequencing reads. Constructing DNA barcode sets provides the candidate DNA barcodes for this application. To increase the accuracy of DNA barcode sets, a particle swarm optimization (PSO) algorithm has been modified and used to construct the DNA barcode sets in this paper. Compared with the extant results, some lower bounds of DNA barcode sets are improved. The results show that the proposed algorithm is effective in constructing DNA barcode sets.
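To make the barcode-set construction problem concrete, here is a minimal sketch that builds a set in which every pair of barcodes differs in at least a given number of positions. It deliberately swaps in a simple greedy search, not the modified particle swarm optimization described in the abstract, and the minimum Hamming distance constraint is an assumption about the kind of criterion such sets must satisfy.

```python
from itertools import combinations, product


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def greedy_barcode_set(length, min_distance):
    """Scan all sequences in lexicographic order and keep each one that is
    at least min_distance away from every barcode already chosen."""
    chosen = []
    for cand in product("ACGT", repeat=length):
        seq = "".join(cand)
        if all(hamming(seq, c) >= min_distance for c in chosen):
            chosen.append(seq)
    return chosen


barcodes = greedy_barcode_set(length=4, min_distance=3)
print(len(barcodes), barcodes[:4])
```

The size of the set returned by any such construction is a lower bound on the largest possible set for that length and distance, which is the quantity a better search heuristic tries to push upward.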
GENESUS: a two-step sequence design program for DNA nanostructure self-assembly.

PubMed

Tsutsumi, Takanobu; Asakawa, Takeshi; Kanegami, Akemi; Okada, Takao; Tahira, Tomoko; Hayashi, Kenshi

2014-01-01

DNA has been recognized as an ideal material for bottom-up construction of nanometer scale structures by self-assembly. The generation of sequences optimized for unique self-assembly (GENESUS) program reported here is a straightforward method for generating sets of strand sequences optimized for self-assembly of arbitrarily designed DNA nanostructures by a generate-candidates-and-choose-the-best strategy. A scalable procedure to prepare single-stranded DNA having arbitrary sequences is also presented. Strands for the assembly of various structures were designed and successfully constructed, validating both the program and the procedure.
Recent patents of nanopore DNA sequencing technology: progress and challenges.

PubMed

Zhou, Jianfeng; Xu, Bingqian

2010-11-01

DNA sequencing techniques have developed rapidly in recent decades, driven primarily by the Human Genome Project. Among the newly proposed techniques, nanopore sequencing has been considered a suitable candidate for sequencing single DNA molecules at ultrahigh speed and very low cost. Several fabrication and modification techniques have been developed to produce robust and well-defined nanopore devices. Considerable effort has also gone into applying nanopores to analyze the properties of DNA molecules. Compared with traditional sequencing techniques, nanopores have demonstrated distinct advantages in the main practical issues, such as sample preparation, sequencing speed, cost-effectiveness, and read length. Although challenges remain, recent research on improving the capabilities of nanopores has brought the approach closer to its ultimate goal: sequencing individual DNA strands at single-nucleotide resolution. This patent review briefly highlights recent developments and technological achievements in DNA analysis and sequencing at the single-molecule level, focusing on nanopore-based methods.
Exon trapping: a genetic screen to identify candidate transcribed sequences in cloned mammalian genomic DNA.

PubMed

Duyk, G M; Kim, S W; Myers, R M; Cox, D R

1990-11-01

Identification and recovery of transcribed sequences from cloned mammalian genomic DNA remains an important problem in isolating genes on the basis of their chromosomal location. We have developed a strategy that facilitates the recovery of exons from random pieces of cloned genomic DNA. The basis of this "exon trapping" strategy is that, during a retroviral life cycle, genomic sequences of nonviral origin are correctly spliced and may be recovered as a cDNA copy of the introduced segment. By using this genetic assay for cis-acting sequences required for RNA splicing, we have screened approximately 20 kilobase pairs of cloned genomic DNA and have recovered all four predicted exons.
Exon trapping: a genetic screen to identify candidate transcribed sequences in cloned mammalian genomic DNA.

PubMed Central

Duyk, G M; Kim, S W; Myers, R M; Cox, D R

1990-01-01

Identification and recovery of transcribed sequences from cloned mammalian genomic DNA remains an important problem in isolating genes on the basis of their chromosomal location. We have developed a strategy that facilitates the recovery of exons from random pieces of cloned genomic DNA. The basis of this "exon trapping" strategy is that, during a retroviral life cycle, genomic sequences of nonviral origin are correctly spliced and may be recovered as a cDNA copy of the introduced segment. By using this genetic assay for cis-acting sequences required for RNA splicing, we have screened approximately 20 kilobase pairs of cloned genomic DNA and have recovered all four predicted exons. PMID:2247475
Methylation analysis of plasma cell-free DNA for breast cancer early detection using bisulfite next-generation sequencing.

PubMed

Li, Zibo; Guo, Xinwu; Tang, Lili; Peng, Limin; Chen, Ming; Luo, Xipeng; Wang, Shouman; Xiao, Zhi; Deng, Zhongping; Dai, Lizhong; Xia, Kun; Wang, Jun

2016-10-01

Circulating cell-free DNA (cfDNA) has been considered a potential biomarker for non-invasive cancer detection. To evaluate the methylation levels of six candidate genes (EGFR, GREM1, PDGFRB, PPM1E, SOX17, and WRN) in plasma cfDNA as biomarkers for early detection of breast cancer, quantitative analysis of the promoter methylation of these genes in 86 breast cancer patients and 67 healthy controls was performed using microfluidic-PCR-based target enrichment and next-generation bisulfite sequencing technology. The predictive performance of different logistic models based on the methylation status of candidate genes was investigated by means of the area under the ROC curve (AUC) and odds ratio (OR) analysis. Results revealed that EGFR, PPM1E, and 8 gene-specific CpG sites showed significant hypermethylation in cancer patients' plasma and were significantly associated with breast cancer (OR ranging from 2.51 to 9.88). The AUC values for these biomarkers ranged from 0.66 to 0.75. Combinations of multiple hypermethylated genes or CpG sites substantially improved the predictive performance for breast cancer detection. Our study demonstrated the feasibility of quantitative measurement of candidate gene methylation in cfDNA using microfluidic-PCR-based target enrichment and bisulfite next-generation sequencing, which is worthy of further validation and could benefit a broad range of applications in clinical oncology practice. Quantitative analysis of the methylation pattern of plasma cfDNA by next-generation sequencing might be a valuable non-invasive tool for early detection of breast cancer.
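The two performance measures named in this abstract can be computed from first principles, as in the minimal sketch below. The counts and scores are invented toy values, not data from the study, and the function names are hypothetical. The odds ratio comes from a 2x2 table, and the AUC is computed as the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half.

```python
def odds_ratio(cases_pos, cases_neg, controls_pos, controls_neg):
    """Odds ratio from a 2x2 table of marker status by disease status."""
    return (cases_pos * controls_neg) / (cases_neg * controls_pos)


def auc(case_scores, control_scores):
    """Area under the ROC curve via pairwise comparison of scores."""
    wins = 0.0
    for c in case_scores:
        for k in control_scores:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(case_scores) * len(control_scores))


# Toy numbers only: 20 of 30 cases and 5 of 20 controls are hypermethylated.
toy_or = odds_ratio(cases_pos=20, cases_neg=10, controls_pos=5, controls_neg=15)
# Toy methylation scores for three cases and three controls.
toy_auc = auc([0.9, 0.8, 0.4], [0.5, 0.3, 0.4])
print(toy_or, round(toy_auc, 3))
```

An AUC of 0.5 corresponds to a marker that does no better than chance, which is the baseline against which the reported range of 0.66 to 0.75 should be read.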
Integrating De Novo Transcriptome Assembly and Cloning to Obtain Chicken Ovocleidin-17 Full-Length cDNA

PubMed Central

Ning, ZhongHua; Hincke, Maxwell T.; Yang, Ning; Hou, ZhuoCheng

2014-01-01

Efficiently obtaining full-length cDNA for a target gene is the key step for functional studies and probing genetic variations. However, almost all sequenced domestic animal genomes are not ‘finished’. Many functionally important genes are located in these gapped regions. It can be difficult to obtain full-length cDNA for which only partial amino acid/EST sequences exist. In this study we report a general pipeline to obtain full-length cDNA, and illustrate this approach for one important gene (Ovocleidin-17, OC-17) that is associated with chicken eggshell biomineralization. Chicken OC-17 is one of the best candidates to control and regulate the deposition of calcium carbonate in the calcified eggshell layer. OC-17 protein has been purified, sequenced, and has had its three-dimensional structure solved. However, researchers still cannot conduct OC-17 mRNA related studies because the mRNA sequence is unknown and the gene is absent from the current chicken genome. We used RNA-Seq to obtain the entire transcriptome of the adult hen uterus, and then conducted de novo transcriptome assembling with bioinformatics analysis to obtain candidate OC-17 transcripts. Based on this sequence, we used RACE and PCR cloning methods to successfully obtain the full-length OC-17 cDNA. Temporal and spatial OC-17 mRNA expression analyses were also performed to demonstrate that OC-17 is predominantly expressed in the adult hen uterus during the laying cycle and barely at immature developmental stages. Differential uterine expression of OC-17 was observed in hens laying eggs with weak versus strong eggshell, confirming its important role in the regulation of eggshell mineralization and providing a new tool for genetic selection for eggshell quality parameters. This study is the first one to report the full-length OC-17 cDNA sequence, and builds a foundation for OC-17 mRNA related studies. 
We provide a general method for biologists experiencing difficulty in obtaining candidate gene full-length cDNA sequences. PMID:24676480
Integrating de novo transcriptome assembly and cloning to obtain chicken Ovocleidin-17 full-length cDNA.

PubMed

Zhang, Quan; Liu, Long; Zhu, Feng; Ning, ZhongHua; Hincke, Maxwell T; Yang, Ning; Hou, ZhuoCheng

2014-01-01

Efficiently obtaining full-length cDNA for a target gene is the key step for functional studies and probing genetic variations. However, almost all sequenced domestic animal genomes are not 'finished'. Many functionally important genes are located in these gapped regions. It can be difficult to obtain full-length cDNA for which only partial amino acid/EST sequences exist. In this study we report a general pipeline to obtain full-length cDNA, and illustrate this approach for one important gene (Ovocleidin-17, OC-17) that is associated with chicken eggshell biomineralization. Chicken OC-17 is one of the best candidates to control and regulate the deposition of calcium carbonate in the calcified eggshell layer. OC-17 protein has been purified, sequenced, and has had its three-dimensional structure solved. However, researchers still cannot conduct OC-17 mRNA related studies because the mRNA sequence is unknown and the gene is absent from the current chicken genome. We used RNA-Seq to obtain the entire transcriptome of the adult hen uterus, and then conducted de novo transcriptome assembling with bioinformatics analysis to obtain candidate OC-17 transcripts. Based on this sequence, we used RACE and PCR cloning methods to successfully obtain the full-length OC-17 cDNA. Temporal and spatial OC-17 mRNA expression analyses were also performed to demonstrate that OC-17 is predominantly expressed in the adult hen uterus during the laying cycle and barely at immature developmental stages. Differential uterine expression of OC-17 was observed in hens laying eggs with weak versus strong eggshell, confirming its important role in the regulation of eggshell mineralization and providing a new tool for genetic selection for eggshell quality parameters. This study is the first one to report the full-length OC-17 cDNA sequence, and builds a foundation for OC-17 mRNA related studies. 
We provide a general method for biologists experiencing difficulty in obtaining candidate gene full-length cDNA sequences.
Comparative analysis of mitochondrial genomes between a wheat K-type cytoplasmic male sterility (CMS) line and its maintainer line.

PubMed

Liu, Huitao; Cui, Peng; Zhan, Kehui; Lin, Qiang; Zhuo, Guoyin; Guo, Xiaoli; Ding, Feng; Yang, Wenlong; Liu, Dongcheng; Hu, Songnian; Yu, Jun; Zhang, Aimin

2011-03-29

Plant mitochondria, semiautonomous organelles that function as manufacturers of cellular ATP, have their own genome that has a slow rate of evolution and rapid rearrangement. Cytoplasmic male sterility (CMS), a common phenotype in higher plants, is closely associated with rearrangements in mitochondrial DNA (mtDNA), and is widely used to produce F1 hybrid seeds in a variety of valuable crop species. Novel chimeric genes deduced from mtDNA rearrangements causing CMS have been identified in several plants, such as rice, sunflower, pepper, and rapeseed, but there are very few reports about mtDNA rearrangements in wheat. In the present work, we describe the mitochondrial genome of a wheat K-type CMS line and compare it with its maintainer line. The complete mtDNA sequence of a wheat K-type (with cytoplasm of Aegilops kotschyi) CMS line, Ks3, was assembled into a master circle (MC) molecule of 647,559 bp and found to harbor 34 known protein-coding genes, three rRNAs (18 S, 26 S, and 5 S rRNAs), and 16 different tRNAs. Compared to our previously published sequence of a K-type maintainer line, Km3, we detected Ks3-specific mtDNA (> 100 bp, 11.38%) and repeats (> 100 bp, 29 units) as well as genes that are unique to each line: rpl5 was missing in Ks3 and trnH was absent from Km3. We also defined 32 single nucleotide polymorphisms (SNPs) in 13 protein-coding, albeit functionally irrelevant, genes, and predicted 22 unique ORFs in Ks3, representing potential candidates for K-type CMS. All these sequence variations are candidates for involvement in CMS. A comparative analysis of the mtDNA of several angiosperms, including those from Ks3, Km3, rice, maize, Arabidopsis thaliana, and rapeseed, showed that non-coding sequences of higher plants had mostly divergent multiple reorganizations during the mtDNA evolution of higher plants. 
The complete mitochondrial genome of the wheat K-type CMS line Ks3 is very different from that of its maintainer line Km3, especially in non-coding sequences. Sequence rearrangement has produced novel chimeric ORFs, which may be candidate genes for CMS. Comparative analysis of several angiosperm mtDNAs indicated that non-coding sequences are the most frequently reorganized during mtDNA evolution in higher plants.
Highly parallel single-molecule amplification approach based on agarose droplet polymerase chain reaction for efficient and cost-effective aptamer selection.

PubMed

Zhang, Wei Yun; Zhang, Wenhua; Liu, Zhiyuan; Li, Cong; Zhu, Zhi; Yang, Chaoyong James

2012-01-03

We have developed a novel method for efficiently screening affinity ligands (aptamers) from a complex single-stranded DNA (ssDNA) library by employing single-molecule emulsion polymerase chain reaction (PCR) based on the agarose droplet microfluidic technology. In a typical systematic evolution of ligands by exponential enrichment (SELEX) process, the enriched library is sequenced first, and tens to hundreds of aptamer candidates are analyzed via a bioinformatic approach. Possible candidates are then chemically synthesized, and their binding affinities are measured individually. Such a process is time-consuming, labor-intensive, inefficient, and expensive. To address these problems, we have developed a highly efficient single-molecule approach for aptamer screening using our agarose droplet microfluidic technology. Statistically diluted ssDNA of the pre-enriched library evolved through conventional SELEX against cancer biomarker Shp2 protein was encapsulated into individual uniform agarose droplets for droplet PCR to generate clonal agarose beads. The binding capacity of amplified ssDNA from each clonal bead was then screened via high-throughput fluorescence cytometry. DNA clones with high binding capacity and low K(d) were chosen as the aptamer and can be directly used for downstream biomedical applications. We have identified an ssDNA aptamer that selectively recognizes Shp2 with a K(d) of 24.9 nM. Compared to a conventional sequencing-chemical synthesis-screening work flow, our approach avoids large-scale DNA sequencing and expensive, time-consuming DNA synthesis of large populations of DNA candidates. The agarose droplet microfluidic approach is thus highly efficient and cost-effective for molecular evolution approaches and will find wide application in molecular evolution technologies, including mRNA display, phage display, and so on. © 2011 American Chemical Society
Simple and efficient identification of rare recessive pathologically important sequence variants from next generation exome sequence data.

PubMed

Carr, Ian M; Morgan, Joanne; Watson, Christopher; Melnik, Svitlana; Diggle, Christine P; Logan, Clare V; Harrison, Sally M; Taylor, Graham R; Pena, Sergio D J; Markham, Alexander F; Alkuraya, Fowzan S; Black, Graeme C M; Ali, Manir; Bonthron, David T

2013-07-01

Massively parallel ("next generation") DNA sequencing (NGS) has quickly become the method of choice for seeking pathogenic mutations in rare uncharacterized monogenic diseases. Typically, before DNA sequencing, protein-coding regions are enriched from patient genomic DNA, representing either the entire genome ("exome sequencing") or selected mapped candidate loci. Sequence variants, identified as differences between the patient's and the human genome reference sequences, are then filtered according to various quality parameters. Changes are screened against datasets of known polymorphisms, such as dbSNP and the 1000 Genomes Project, in the effort to narrow the list of candidate causative variants. An increasing number of commercial services now offer to both generate and align NGS data to a reference genome. This potentially allows small groups with limited computing infrastructure and informatics skills to utilize this technology. However, the capability to effectively filter and assess sequence variants is still an important bottleneck in the identification of deleterious sequence variants in both research and diagnostic settings. We have developed an approach to this problem comprising a user-friendly suite of programs that can interactively analyze, filter and screen data from enrichment-capture NGS data. These programs ("Agile Suite") are particularly suitable for small-scale gene discovery or for diagnostic analysis. © 2013 WILEY PERIODICALS, INC.
Mitochondrial DNA variant at HVI region as a candidate of genetic markers of type 2 diabetes

NASA Astrophysics Data System (ADS)

Gumilar, Gun Gun; Purnamasari, Yunita; Setiadi, Rahmat

2016-02-01

Mitochondrial DNA (mtDNA) is maternally inherited, and mtDNA mutations can contribute to the excess maternal inheritance of type 2 diabetes. Because of its high mutation rate, the hypervariable region I (HVI) is one of the mtDNA regions most often associated with the disease. This study was therefore conducted to determine the genetic variants of human mtDNA HVI related to type 2 diabetes in four samples taken from four generations of one lineage. The steps were lysis of hair follicles, amplification of the mtDNA HVI fragment by polymerase chain reaction (PCR), detection of PCR products by agarose gel electrophoresis, measurement of mtDNA concentration with a UV-Vis spectrophotometer, determination of the nucleotide sequence by direct sequencing, and analysis of the sequencing results with the SeqMan DNASTAR program. Comparison of the sample nucleotide sequences with the revised Cambridge Reference Sequence (rCRS) revealed six mutations shared by the samples: C16147T, T16189C, C16193del, T16127C, A16235G, and A16293C. After comparing these data with secondary data from Mitomap and NCBI, two mutations, T16189C and T16217C, were found to be candidate genetic markers of type 2 diabetes, even though the mutations were also found in generations not diagnosed with type 2 diabetes. The results of this study are expected to contribute to the human mtDNA database of genetic variants associated with metabolic diseases, so that in the future they can be used in various fields, especially medicine.
SNPServer: a real-time SNP discovery tool.

PubMed

Savage, David; Batley, Jacqueline; Erwin, Tim; Logan, Erica; Love, Christopher G; Lim, Geraldine A C; Mongin, Emmanuel; Barker, Gary; Spangenberg, German C; Edwards, David

2005-07-01

SNPServer is a real-time flexible tool for the discovery of SNPs (single nucleotide polymorphisms) within DNA sequence data. The program uses BLAST, to identify related sequences, and CAP3, to cluster and align these sequences. The alignments are parsed to the SNP discovery software autoSNP, a program that detects SNPs and insertion/deletion polymorphisms (indels). Alternatively, lists of related sequences or pre-assembled sequences may be entered for SNP discovery. SNPServer and autoSNP use redundancy to differentiate between candidate SNPs and sequence errors. For each candidate SNP, two measures of confidence are calculated, the redundancy of the polymorphism at a SNP locus and the co-segregation of the candidate SNP with other SNPs in the alignment. SNPServer is available at http://hornbill.cspp.latrobe.edu.au/snpdiscovery.html.
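The idea of using redundancy to separate candidate SNPs from sequencing errors can be sketched as follows. This is not the autoSNP implementation: the function name, the threshold, and the toy alignment are hypothetical, and redundancy is taken here to mean the number of aligned sequences supporting each allele at a column.

```python
from collections import Counter


def candidate_snps(alignment, min_redundancy=2):
    """Report alignment columns where at least two different bases are each
    supported by min_redundancy or more sequences."""
    snps = []
    for col in range(len(alignment[0])):
        counts = Counter(row[col] for row in alignment if row[col] in "ACGT")
        supported = {base: n for base, n in counts.items() if n >= min_redundancy}
        if len(supported) >= 2:
            snps.append((col, supported))
    return snps


# Toy alignment (hypothetical): column 2 is a well-supported polymorphism,
# column 7 has a single divergent base that is treated as a likely error.
alignment = [
    "ACGTACGT",
    "ACGTACGT",
    "ACCTACGT",
    "ACCTACGA",
    "ACGTACGT",
]
snps = candidate_snps(alignment)
print(snps)
```

A base seen in only one sequence is indistinguishable from a read error, so requiring each allele to recur is what gives the candidate call its confidence.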
Sequencing Needs for Viral Diagnostics

DOE Office of Scientific and Technical Information (OSTI.GOV)

Gardner, S N; Lam, M; Mulakken, N J

2004-01-26

We built a system to guide decisions regarding the amount of genomic sequencing required to develop diagnostic DNA signatures, which are short sequences that are sufficient to uniquely identify a viral species. We used our existing DNA diagnostic signature prediction pipeline, which selects regions of a target species genome that are conserved among strains of the target (for reliability, to prevent false negatives) and unique relative to other species (for specificity, to avoid false positives). We performed simulations, based on existing sequence data, to assess the number of genome sequences of a target species and of close phylogenetic relatives (''near neighbors'') that are required to predict diagnostic signature regions that are conserved among strains of the target species and unique relative to other bacterial and viral species. For DNA viruses such as variola (smallpox), three target genomes provide sufficient guidance for selecting species-wide signatures. Three near neighbor genomes are critical for species specificity. In contrast, most RNA viruses require four target genomes and no near neighbor genomes, since lack of conservation among strains is more limiting than uniqueness. SARS and Ebola Zaire are exceptional, as additional target genomes currently do not improve predictions, but near neighbor sequences are urgently needed. Our results also indicate that double stranded DNA viruses are more conserved among strains than are RNA viruses, since in most cases there was at least one conserved signature candidate for the DNA viruses and zero conserved signature candidates for the RNA viruses.
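The two selection criteria in this abstract (conserved among target strains, unique relative to other species) can be illustrated with a minimal k-mer sketch. It is not the authors' signature prediction pipeline: exact k-mer matching, the function names, and the toy genomes are all assumptions made for illustration.

```python
def kmers(seq, k):
    """All substrings of length k in seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def signature_candidates(target_genomes, near_neighbors, k):
    """k-mers present in every target genome and absent from all neighbors."""
    if not target_genomes:
        return set()
    # Conservation: keep only k-mers shared by all target strains.
    conserved = set.intersection(*(kmers(g, k) for g in target_genomes))
    # Uniqueness: drop any k-mer that also occurs in a near neighbor.
    for genome in near_neighbors:
        conserved -= kmers(genome, k)
    return conserved


# Toy sequences (hypothetical), far shorter than real genomes.
targets = ["AAACGTTT", "CCACGTGG"]
without_close_relative = signature_candidates(targets, ["TTACGAAA"], k=4)
with_close_relative = signature_candidates(targets, ["GGACGTCC"], k=4)
print(without_close_relative, with_close_relative)
```

The second call shows why near-neighbor genomes matter for specificity: a region that looks like a valid signature is discarded once a close relative carrying the same sequence is added to the comparison.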
MARTA: a suite of Java-based tools for assigning taxonomic status to DNA sequences.

PubMed

Horton, Matthew; Bodenhausen, Natacha; Bergelson, Joy

2010-02-15

We have created a suite of Java-based software to better provide taxonomic assignments to DNA sequences. We anticipate that the program will be useful for protistologists, virologists, mycologists and other microbial ecologists. The program relies on NCBI utilities including the BLAST software and Taxonomy database and is easily manipulated at the command-line to specify a BLAST candidate's query-coverage or percent identity requirements; other options include the ability to set minimal consensus requirements (%) for each of the eight major taxonomic ranks (Domain, Kingdom, Phylum, ...) and whether to consider lower scoring candidates when the top-hit lacks taxonomic classification.

Designing universal primers for the isolation of DNA sequences encoding Proanthocyanidins biosynthetic enzymes in Crataegus aronia

PubMed Central

2012-01-01

Background Hawthorn is the common name of all plant species in the genus Crataegus, which belongs to the Rosaceae family. Crataegus are considered useful medicinal plants because of their high content of proanthocyanidins (PAs) and other related compounds. To improve PAs production in Crataegus tissues, the sequences of genes encoding PAs biosynthetic enzymes are required. Findings Different bioinformatics tools, including BLAST, multiple sequence alignment and alignment PCR analysis were used to design primers suitable for the amplification of DNA fragments from 10 candidate genes encoding enzymes involved in PAs biosynthesis in C. aronia. DNA sequencing results proved the utility of the designed primers. The primers were used successfully to amplify DNA fragments of different PAs biosynthesis genes in different Rosaceae plants. Conclusion To the best of our knowledge, this is the first use of the alignment PCR approach to isolate DNA sequences encoding PAs biosynthetic enzymes in Rosaceae plants. PMID:22883984
Designing universal primers for the isolation of DNA sequences encoding Proanthocyanidins biosynthetic enzymes in Crataegus aronia.

PubMed

Zuiter, Afnan Saeid; Sawwan, Jammal; Al Abdallat, Ayed

2012-08-10

Hawthorn is the common name of all plant species in the genus Crataegus, which belongs to the Rosaceae family. Crataegus are considered useful medicinal plants because of their high content of proanthocyanidins (PAs) and other related compounds. To improve PAs production in Crataegus tissues, the sequences of genes encoding PAs biosynthetic enzymes are required. Different bioinformatics tools, including BLAST, multiple sequence alignment and alignment PCR analysis were used to design primers suitable for the amplification of DNA fragments from 10 candidate genes encoding enzymes involved in PAs biosynthesis in C. aronia. DNA sequencing results proved the utility of the designed primers. The primers were used successfully to amplify DNA fragments of different PAs biosynthesis genes in different Rosaceae plants. To the best of our knowledge, this is the first use of the alignment PCR approach to isolate DNA sequences encoding PAs biosynthetic enzymes in Rosaceae plants.
Islander: A database of precisely mapped genomic islands in tRNA and tmRNA genes

DOE PAGES

Hudson, Corey M.; Lau, Britney Y.; Williams, Kelly P.

2014-11-05

Genomic islands are mobile DNAs that are major agents of bacterial and archaeal evolution. Integration into prokaryotic chromosomes usually occurs site-specifically at tRNA or tmRNA gene (together, tDNA) targets, catalyzed by tyrosine integrases. This splits the target gene, yet sequences within the island restore the disrupted gene; the regenerated target and its displaced fragment precisely mark the endpoints of the island. We applied this principle to search for islands in genomic DNA sequences. Our algorithm identifies tDNAs, finds fragments of those tDNAs in the same replicon and removes unlikely candidate islands through a series of filters. A search for islands in 2168 whole prokaryotic genomes produced 3919 candidates. The website Islander (recently moved to http://bioinformatics.sandia.gov/islander/) presents these precisely mapped candidate islands, the gene content and the island sequence. The algorithm further insists that each island encode an integrase, and attachment site sequence identity is carefully noted; therefore, the database also serves in the study of integrase site-specificity and its evolution.
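The core search principle described here (an intact tDNA plus a displaced fragment of it elsewhere in the same replicon marking the two ends of an island) can be sketched in Python. This is a toy illustration and not the Islander algorithm: it uses exact forward-strand string matching, applies only a maximum-length filter, and does not check for an integrase. The function name, parameters, and sequences are assumptions.

```python
def find_candidate_islands(replicon, tdna_start, tdna_end, frag_len, max_island_len):
    """Report (start, end) spans between an intact tDNA and a displaced copy
    of one of its fragments found elsewhere in the replicon.

    Coordinates are 0-based and end-exclusive; only the forward strand is searched.
    """
    tdna = replicon[tdna_start:tdna_end]
    candidates = set()
    for offset in range(len(tdna) - frag_len + 1):
        fragment = tdna[offset:offset + frag_len]
        pos = replicon.find(fragment)
        while pos != -1:
            # Ignore matches that overlap the tDNA itself.
            outside_tdna = pos + frag_len <= tdna_start or pos >= tdna_end
            if outside_tdna:
                start = min(tdna_start, pos)
                end = max(tdna_end, pos + frag_len)
                if end - start <= max_island_len:  # crude length filter
                    candidates.add((start, end))
            pos = replicon.find(fragment, pos + 1)
    return sorted(candidates)


# Toy replicon (hypothetical): flank + tDNA + island body + displaced fragment + flank.
tdna = "GGCTACGTTAGC"
replicon = "AAAA" + tdna + "TTTTTTTTTT" + tdna[-6:] + "CCCC"
islands = find_candidate_islands(replicon, 4, 16, frag_len=6, max_island_len=50)
print(islands)  # [(4, 32)]
```

The reported span runs from the start of the tDNA to the end of the displaced fragment, which is a simplification of how island endpoints would be defined in practice.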
Selection of a DNA barcode for Nectriaceae from fungal whole-genomes.

PubMed

Zeng, Zhaoqing; Zhao, Peng; Luo, Jing; Zhuang, Wenying; Yu, Zhihe

2012-01-01

A DNA barcode is a short segment of sequence that is able to distinguish species. A barcode must ideally contain enough variation to distinguish every individual species and be easily obtained. Fungi of Nectriaceae are economically important and show high species diversity. To establish a standard DNA barcode for this group of fungi, the genomes of Neurospora crassa and 30 other filamentous fungi were compared. The expect value was treated as a criterion to recognize homologous sequences. Four candidate markers, Hsp90, AAC, CDC48, and EF3, were tested for their feasibility as barcodes in the identification of 34 well-established species belonging to 13 genera of Nectriaceae. Two hundred and fifteen sequences were analyzed. Intra- and inter-specific variations and the success rate of PCR amplification and sequencing were considered as important criteria for estimation of the candidate markers. Ultimately, the partial EF3 gene met the requirements for a good DNA barcode: No overlap was found between the intra- and inter-specific pairwise distances. The smallest inter-specific distance of EF3 gene was 3.19%, while the largest intra-specific distance was 1.79%. In addition, there was a high success rate in PCR and sequencing for this gene (96.3%). CDC48 showed sufficiently high sequence variation among species, but the PCR and sequencing success rate was 84% using a single pair of primers. Although the Hsp90 and AAC genes had higher PCR and sequencing success rates (96.3% and 97.5%, respectively), overlapping occurred between the intra- and inter-specific variations, which could lead to misidentification. Therefore, we propose the EF3 gene as a possible DNA barcode for the nectriaceous fungi.
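The "no overlap" criterion used to select EF3 (the smallest inter-specific distance exceeds the largest intra-specific distance) can be sketched in Python. The abstract does not state which distance model was used, so the uncorrected p-distance below is an assumption, as are the function names and the toy alignment.

```python
from itertools import combinations


def p_distance(a, b):
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcode_gap(records):
    """records: list of (species, aligned_sequence) pairs.

    Returns (largest intra-specific distance, smallest inter-specific distance).
    A barcode gap exists when the second value exceeds the first.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    return max(intra), min(inter)


# Toy alignment (hypothetical): two sequences of species A, one of species B.
records = [
    ("A", "ACGTACGTAC"),
    ("A", "ACGTACGTAT"),
    ("B", "ACGAACCTGC"),
]
max_intra, min_inter = barcode_gap(records)
print(max_intra, min_inter, min_inter > max_intra)  # 0.1 0.3 True
```

With the values reported for EF3, the same comparison holds: the smallest inter-specific distance (3.19%) is larger than the largest intra-specific distance (1.79%).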
Universal Readers Based on Hydrogen Bonding or π-π Stacking for Identification of DNA Nucleotides in Electron Tunnel Junctions.

PubMed

Biswas, Sovan; Sen, Suman; Im, JongOne; Biswas, Sudipta; Krstic, Predrag; Ashcroft, Brian; Borges, Chad; Zhao, Yanan; Lindsay, Stuart; Zhang, Peiming

2016-12-27

A reader molecule that recognizes all the naturally occurring nucleobases in an electron tunnel junction, referred to as a universal reader, is required for sequencing DNA by a recognition tunneling (RT) technique. In the present study, we have designed a series of heterocyclic carboxamides based on hydrogen bonding and a large-sized pyrene ring based on a π-π stacking interaction as universal reader candidates. Each of these compounds was synthesized to bear a thiolated linker for attachment to metal electrodes and examined for their interactions with naturally occurring DNA nucleosides and nucleotides by 1H NMR, ESI-MS, computational calculations, and surface plasmon resonance. RT measurements were carried out in a scanning tunnel microscope. All of these molecules generated electrical signals with DNA nucleotides in tunneling junctions under physiological conditions (phosphate buffered aqueous solution, pH 7.4). Using a support vector machine as a tool for data analysis, we found that these candidates distinguished among naturally occurring DNA nucleotides with the accuracy of pyrene (by π-π stacking interactions) > azole carboxamides (by hydrogen-bonding interactions). In addition, the pyrene reader operated efficiently in a larger tunnel junction. However, the azole carboxamide could read abasic (AP) monophosphate, a product from spontaneous base hydrolysis or an intermediate of base excision repair. Thus, we envision that sequencing DNA using both π-π stacking and hydrogen-bonding-based universal readers in parallel should generate more comprehensive genome sequences than sequencing based on either reader molecule alone.
Identification of Prostate Cancer-Specific microDNAs

DTIC Science & Technology

2014-12-01

displacement amplification (MDA). 2 adopted multiple displacement amplification (MDA) with random primers for enriched circular DNA by rolling circle ... amplification (RCA) (Fig. 1) and then amplified DNA fragments were subject to deep sequencing. Sequence NO of Reads seq 1 184 seq 2 133 seq 3 2407 seq...prostate cancer cells through multiple displacement amplification .  Clone #7 is the top candidate which has been cloned in an expression vector and it
Ariadne: a database search engine for identification and chemical analysis of RNA using tandem mass spectrometry data.

PubMed

Nakayama, Hiroshi; Akiyama, Misaki; Taoka, Masato; Yamauchi, Yoshio; Nobe, Yuko; Ishikawa, Hideaki; Takahashi, Nobuhiro; Isobe, Toshiaki

2009-04-01

We present here a method to correlate tandem mass spectra of sample RNA nucleolytic fragments with an RNA nucleotide sequence in a DNA/RNA sequence database, thereby allowing tandem mass spectrometry (MS/MS)-based identification of RNA in biological samples. Ariadne, a unique web-based database search engine, identifies RNA by two probability-based evaluation steps of MS/MS data. In the first step, the software evaluates the matches between the masses of product ions generated by MS/MS of an RNase digest of sample RNA and those calculated from a candidate nucleotide sequence in a DNA/RNA sequence database, which then predicts the nucleotide sequences of these RNase fragments. In the second step, the candidate sequences are mapped for all RNA entries in the database, and each entry is scored for a function of occurrences of the candidate sequences to identify a particular RNA. Ariadne can also predict post-transcriptional modifications of RNA, such as methylation of nucleotide bases and/or ribose, by estimating mass shifts from the theoretical mass values. The method was validated with MS/MS data of RNase T1 digests of in vitro transcripts. It was applied successfully to identify an unknown RNA component in a tRNA mixture and to analyze post-transcriptional modification in yeast tRNA(Phe-1).
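The second evaluation step described here (mapping candidate fragment sequences onto database entries and scoring each entry by occurrences) can be sketched in Python. This is not Ariadne's scoring, which is probability-based and driven by product ion masses; the sketch works at the sequence level only, counts matches instead of computing probabilities, and relies on the fact that RNase T1 cleaves on the 3' side of guanosine. The function names and toy database are assumptions.

```python
def rnase_t1_digest(rna):
    """Split an RNA sequence after every G, as RNase T1 cleaves 3' of guanosine."""
    fragments, current = [], ""
    for base in rna:
        current += base
        if base == "G":
            fragments.append(current)
            current = ""
    if current:  # trailing fragment without a terminal G
        fragments.append(current)
    return fragments


def score_entries(candidate_fragments, database):
    """Count, for each database entry, how many distinct candidate fragments
    appear among its predicted RNase T1 products; best-scoring entry first."""
    scores = {}
    for name, sequence in database.items():
        products = set(rnase_t1_digest(sequence))
        scores[name] = sum(1 for frag in set(candidate_fragments) if frag in products)
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy database and candidate fragments (hypothetical).
database = {"rna_A": "AUCGUUAGCCG", "rna_B": "UUGACGCAUG"}
candidates = ["AUCG", "CCG", "ACG"]
ranking = score_entries(candidates, database)
print(ranking)  # [('rna_A', 2), ('rna_B', 1)]
```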
Assessing the utility of the Oxford Nanopore MinION for snake venom gland cDNA sequencing.

PubMed

Hargreaves, Adam D; Mulley, John F

2015-01-01

Portable DNA sequencers such as the Oxford Nanopore MinION device have the potential to be truly disruptive technologies, facilitating new approaches and analyses and, in some cases, taking sequencing out of the lab and into the field. However, the capabilities of these technologies are still being revealed. Here we show that single-molecule cDNA sequencing using the MinION accurately characterises venom toxin-encoding genes in the painted saw-scaled viper, Echis coloratus. We find the raw sequencing error rate to be around 12%, improved to 0-2% with hybrid error correction and 3% with de novo error correction. Our corrected data provides full coding sequences and 5' and 3' UTRs for 29 of 33 candidate venom toxins detected, far superior to Illumina data (13/40 complete) and Sanger-based ESTs (15/29). We suggest that, should the current pace of improvement continue, the MinION will become the default approach for cDNA sequencing in a variety of species.
Assessing the utility of the Oxford Nanopore MinION for snake venom gland cDNA sequencing

PubMed Central

Hargreaves, Adam D.

2015-01-01

Portable DNA sequencers such as the Oxford Nanopore MinION device have the potential to be truly disruptive technologies, facilitating new approaches and analyses and, in some cases, taking sequencing out of the lab and into the field. However, the capabilities of these technologies are still being revealed. Here we show that single-molecule cDNA sequencing using the MinION accurately characterises venom toxin-encoding genes in the painted saw-scaled viper, Echis coloratus. We find the raw sequencing error rate to be around 12%, improved to 0–2% with hybrid error correction and 3% with de novo error correction. Our corrected data provides full coding sequences and 5′ and 3′ UTRs for 29 of 33 candidate venom toxins detected, far superior to Illumina data (13/40 complete) and Sanger-based ESTs (15/29). We suggest that, should the current pace of improvement continue, the MinION will become the default approach for cDNA sequencing in a variety of species. PMID:26623194
Ordered shotgun sequencing of a 135 kb Xq25 YAC containing ANT2 and four possible genes, including three confirmed by EST matches.

PubMed Central

Chen, C N; Su, Y; Baybayan, P; Siruno, A; Nagaraja, R; Mazzarella, R; Schlessinger, D; Chen, E

1996-01-01

Ordered shotgun sequencing (OSS) has been successfully carried out with an Xq25 YAC substrate. yWXD703 DNA was subcloned into lambda phage and sequences of insert ends of the lambda subclones were used to generate a map to select a minimum tiling path of clones to be completely sequenced. The sequence of 135 038 nt contains the entire ANT2 cDNA as well as four other candidates suggested by computer-assisted analyses. One of the putative genes is homologous to a gene implicated in Graves' disease and it, ANT2 and two others are confirmed by EST matches. The results suggest that OSS can be applied to YACs in accord with earlier simulations and further indicate that the sequence of the YAC accurately reflects the sequence of uncloned human DNA. PMID:8918809
Fourteen-Genome Comparison Identifies DNA Markers for Severe-Disease-Associated Strains of Clostridium difficile

PubMed Central

Forgetta, Vincenzo; Oughton, Matthew T.; Marquis, Pascale; Brukner, Ivan; Blanchette, Ruth; Haub, Kevin; Magrini, Vince; Mardis, Elaine R.; Gerding, Dale N.; Loo, Vivian G.; Miller, Mark A.; Mulvey, Michael R.; Rupnik, Maja; Dascal, Andre; Dewar, Ken

2011-01-01

Clostridium difficile is a common cause of infectious diarrhea in hospitalized patients. A severe and increased incidence of C. difficile infection (CDI) is associated predominantly with the NAP1 strain; however, the existence of other severe-disease-associated (SDA) strains and the extensive genetic diversity across C. difficile complicate reliable detection and diagnosis. Comparative genome analysis of 14 sequenced genomes, including those of a subset of NAP1 isolates, allowed the assessment of genetic diversity within and between strain types to identify DNA markers that are associated with severe disease. Comparative genome analysis of 14 isolates, including five publicly available strains, revealed that C. difficile has a core genome of 3.4 Mb, comprising ∼3,000 genes. Analysis of the core genome identified candidate DNA markers that were subsequently evaluated using a multistrain panel of 177 isolates, representing more than 50 pulsovars and 8 toxinotypes. A subset of 117 isolates from the panel had associated patient data that allowed assessment of an association between the DNA markers and severe CDI. We identified 20 candidate DNA markers for species-wide detection and 10,683 single nucleotide polymorphisms (SNPs) associated with the predominant SDA strain (NAP1). A species-wide detection candidate marker, the sspA gene, was found to be the same across 177 sequenced isolates and lacked significant similarity to those of other species. Candidate SNPs in genes CD1269 and CD1265 were found to associate more closely with disease severity than currently used diagnostic markers, as they were also present in the toxin A-negative and B-positive (A-B+) strain types. The genetic markers identified illustrate the potential of comparative genomics for the discovery of diagnostic DNA-based targets that are species specific or associated with multiple SDA strains. PMID:21508155
A DNA barcode for land plants.

PubMed

2009-08-04

DNA barcoding involves sequencing a standard region of DNA as a tool for species identification. However, there has been no agreement on which region(s) should be used for barcoding land plants. To provide a community recommendation on a standard plant barcode, we have compared the performance of 7 leading candidate plastid DNA regions (atpF-atpH spacer, matK gene, rbcL gene, rpoB gene, rpoC1 gene, psbK-psbI spacer, and trnH-psbA spacer). Based on assessments of recoverability, sequence quality, and levels of species discrimination, we recommend the 2-locus combination of rbcL+matK as the plant barcode. This core 2-locus barcode will provide a universal framework for the routine use of DNA sequence data to identify specimens and contribute toward the discovery of overlooked species of land plants.
A DNA barcode for land plants

PubMed Central

Hollingsworth, Peter M.; Forrest, Laura L.; Spouge, John L.; Hajibabaei, Mehrdad; Ratnasingham, Sujeevan; van der Bank, Michelle; Chase, Mark W.; Cowan, Robyn S.; Erickson, David L.; Fazekas, Aron J.; Graham, Sean W.; James, Karen E.; Kim, Ki-Joong; Kress, W. John; Schneider, Harald; van AlphenStahl, Jonathan; Barrett, Spencer C.H.; van den Berg, Cassio; Bogarin, Diego; Burgess, Kevin S.; Cameron, Kenneth M.; Carine, Mark; Chacón, Juliana; Clark, Alexandra; Clarkson, James J.; Conrad, Ferozah; Devey, Dion S.; Ford, Caroline S.; Hedderson, Terry A.J.; Hollingsworth, Michelle L.; Husband, Brian C.; Kelly, Laura J.; Kesanakurti, Prasad R.; Kim, Jung Sung; Kim, Young-Dong; Lahaye, Renaud; Lee, Hae-Lim; Long, David G.; Madriñán, Santiago; Maurin, Olivier; Meusnier, Isabelle; Newmaster, Steven G.; Park, Chong-Wook; Percy, Diana M.; Petersen, Gitte; Richardson, James E.; Salazar, Gerardo A.; Savolainen, Vincent; Seberg, Ole; Wilkinson, Michael J.; Yi, Dong-Keun; Little, Damon P.

2009-01-01

DNA barcoding involves sequencing a standard region of DNA as a tool for species identification. However, there has been no agreement on which region(s) should be used for barcoding land plants. To provide a community recommendation on a standard plant barcode, we have compared the performance of 7 leading candidate plastid DNA regions (atpF–atpH spacer, matK gene, rbcL gene, rpoB gene, rpoC1 gene, psbK–psbI spacer, and trnH–psbA spacer). Based on assessments of recoverability, sequence quality, and levels of species discrimination, we recommend the 2-locus combination of rbcL+matK as the plant barcode. This core 2-locus barcode will provide a universal framework for the routine use of DNA sequence data to identify specimens and contribute toward the discovery of overlooked species of land plants. PMID:19666622
Probiotic Candidates from Fish Pond Water in Central Java Indonesia

NASA Astrophysics Data System (ADS)

Harjuno Condro Haditomo, Alfabetian; Desrina; Sarjito; Budi Prayitno, S.

2018-02-01

Aeromonas hydrophila is a major bacterial pathogen of intensive freshwater fish culture in Indonesia. An alternative method to control the pathogen is the use of probiotics. Probiotics usually consist of live microorganisms which, when administered in adequate amounts, confer a health benefit on the host. The aim of this research was to determine probiotic candidates against A. hydrophila, which were identified based on their 16S rDNA gene sequences. The research started with a field survey to obtain the probiotic candidates and continued with laboratory experiments. Probiotic candidates were isolated from fish pond water located in Boyolali and Banjarnegara Regency, Central Java, Indonesia. A total of 133 bacterial isolates were isolated and cultured on TSA, TSB and GSP media. Of the 133 isolates, only 30 showed inhibition of A. hydrophila activity. Three promising isolates were identified by PCR using primers for 16S rDNA. Based on 16S rDNA sequence analysis, all three isolates belonged to the genus Bacillus. Isolates CKlA21, CKlA28, and CBA14 were closely related to Bacillus sp. 13843 (GenBank accession no. JN874760.1; 100% homology), Bacillus subtilis strain H13 (GenBank accession no. KT907045.1; 99% homology), and Bacillus sp. strain 22-4 (GenBank accession no. KX816417.1; 97% homology), respectively.
Identification of a p53-response element in the promoter of the proline oxidase gene

DOE Office of Scientific and Technical Information (OSTI.GOV)

Maxwell, Steve A.; Kochevar, Gerald J.

2008-05-02

Proline oxidase (POX) is a p53-induced proapoptotic gene. We investigated whether p53 could bind directly to the POX gene promoter. Chromatin immunoprecipitation (ChIP) assays detected p53 bound to POX upstream gene sequences. In support of the ChIP results, sequence analysis of the POX gene and its 5' flanking sequences revealed a potential p53-binding site, GGGCTTGTCTTCGTGTGACTTCTGTCT, located at 1161 base pairs (bp) upstream of the transcriptional start site. A 711-bp DNA fragment containing the candidate p53-binding site exhibited reporter gene activity that was induced by p53. In contrast, the same DNA region lacking the candidate p53-binding site did not show significant p53-response activity. Electrophoretic mobility shift assay (EMSA) in ACHN renal carcinoma cell nuclear lysates confirmed that p53 could bind to the 711-bp POX DNA fragment. We concluded from these experiments that a p53-binding site is positioned at -1161 to -1188 bp upstream of the POX transcriptional start site.
Vaginal microbial flora analysis by next generation sequencing and microarrays; can microbes indicate vaginal origin in a forensic context?

PubMed

Benschop, Corina C G; Quaak, Frederike C A; Boon, Mathilde E; Sijen, Titia; Kuiper, Irene

2012-03-01

Forensic analysis of biological traces generally encompasses the investigation of both the person who contributed to the trace and the body site(s) from which the trace originates. For instance, for sexual assault cases, it can be beneficial to distinguish vaginal samples from skin or saliva samples. In this study, we explored the use of microbial flora to indicate vaginal origin. First, we explored the vaginal microbiome for a large set of clinical vaginal samples (n = 240) by next generation sequencing (n = 338,184 sequence reads) and found 1,619 different sequences. Next, we selected 389 candidate probes targeting genera or species and designed a microarray, with which we analysed a diverse set of samples; 43 DNA extracts from vaginal samples and 25 DNA extracts from samples from other body sites, including sites in close proximity of or in contact with the vagina. Finally, we used the microarray results and next generation sequencing dataset to assess the potential for a future approach that uses microbial markers to indicate vaginal origin. Since no candidate genera/species were found to positively identify all vaginal DNA extracts on their own, while excluding all non-vaginal DNA extracts, we deduce that a reliable statement about the cellular origin of a biological trace should be based on the detection of multiple species within various genera. Microarray analysis of a sample will then render a microbial flora pattern that is probably best analysed in a probabilistic approach.
DNA/RNA transverse current sequencing: intrinsic structural noise from neighboring bases

PubMed Central

Alvarez, Jose R.; Skachkov, Dmitry; Massey, Steven E.; Kalitsov, Alan; Velev, Julian P.

2015-01-01

Nanopore DNA sequencing via transverse current has emerged as a promising candidate for third-generation sequencing technology. It produces long read lengths which could alleviate problems with assembly errors inherent in current technologies. However, the high error rates of nanopore sequencing have to be addressed. A very important source of the error is the intrinsic noise in the current arising from carrier dispersion along the chain of the molecule, i.e., from the influence of neighboring bases. In this work we perform calculations of the transverse current within an effective multi-orbital tight-binding model derived from first-principles calculations of the DNA/RNA molecules, to study the effect of this structural noise on the error rates in DNA/RNA sequencing via transverse current in nanopores. We demonstrate that a statistical technique, utilizing not only the currents through the nucleotides but also the correlations in the currents, can in principle reduce the error rate below any desired precision. PMID:26150827
Automated design of genomic Southern blot probes

PubMed Central

2010-01-01

Background Southern blotting is a DNA analysis technique that has found widespread application in molecular biology. It has been used for gene discovery and mapping and has diagnostic and forensic applications, including mutation detection in patient samples and DNA fingerprinting in criminal investigations. Southern blotting has been employed as the definitive method for detecting transgene integration and successful homologous recombination in gene targeting experiments. The technique employs a labeled DNA probe to detect a specific DNA sequence in a complex DNA sample that has been separated by restriction digest and gel electrophoresis. Critically, for the technique to succeed, the probe must be unique to the target locus so as not to cross-hybridize to other endogenous DNA within the sample. Investigators routinely employ a manual approach to probe design. A genome browser is used to extract DNA sequence from the locus of interest, which is searched against the target genome using a BLAST-like tool. Ideally a single perfect match is obtained to the target, with little cross-reactivity caused by homologous DNA sequence present in the genome and/or repetitive and low-complexity elements in the candidate probe. This is a labor-intensive process often requiring several attempts to find a suitable probe for laboratory testing. Results We have written an informatic pipeline to automatically design genomic Southern blot probes that specifically attempts to optimize the resultant probe, employing a brute-force strategy of generating many candidate probes of acceptable length in the user-specified design window, searching all against the target genome, then scoring and ranking the candidates by uniqueness and repetitive DNA element content. Using these in silico measures we can automatically design probes that we predict to perform as well as, or better than, our previous manual designs, while considerably reducing design time. 
We went on to experimentally validate a number of these automated designs by Southern blotting. The majority of probes we tested performed well confirming our in silico prediction methodology and the general usefulness of the software for automated genomic Southern probe design. Conclusions Software and supplementary information are freely available at: http://www.genes2cognition.org/software/southern_blot PMID:20113467
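The brute-force strategy described in this abstract (enumerate candidate probes in a design window, search each against the genome, then rank by uniqueness and repetitive content) can be sketched in Python. This is not the authors' pipeline: exact forward-strand matching stands in for the BLAST-like search, and the longest homopolymer run is a crude stand-in for repetitive or low-complexity content. All function names and the toy genome are assumptions.

```python
def count_occurrences(genome, probe):
    """Count exact matches of probe in genome, including overlapping ones."""
    count, pos = 0, genome.find(probe)
    while pos != -1:
        count += 1
        pos = genome.find(probe, pos + 1)
    return count


def longest_homopolymer(seq):
    """Length of the longest run of a single repeated base (seq must be non-empty)."""
    best = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        best = max(best, run)
    return best


def rank_probes(genome, window_start, window_end, probe_len):
    """Generate every probe of probe_len inside the design window and rank by
    (genome-wide exact matches, longest homopolymer run), lowest first."""
    candidates = []
    for start in range(window_start, window_end - probe_len + 1):
        probe = genome[start:start + probe_len]
        candidates.append((count_occurrences(genome, probe),
                           longest_homopolymer(probe), start, probe))
    return sorted(candidates)


# Toy genome (hypothetical): a duplicated segment and a poly-A stretch.
genome = "ACGTAC" + "GATTCA" + "AAAAAA" + "ACGTAC"
ranked = rank_probes(genome, window_start=0, window_end=18, probe_len=6)
print(ranked[0])   # (1, 1, 1, 'CGTACG')
print(ranked[-1])  # (3, 6, 12, 'AAAAAA')
```

The best-ranked probe is unique in the toy genome and free of base runs, while the duplicated ACGTAC segment and the poly-A windows sink to the bottom of the ranking.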
Rational design of DNA sequences for nanotechnology, microarrays and molecular computers using Eulerian graphs.

PubMed

Pancoska, Petr; Moravek, Zdenek; Moll, Ute M

2004-01-01

Nucleic acids are molecules of choice for both established and emerging nanoscale technologies. These technologies benefit from large functional densities of 'DNA processing elements' that can be readily manufactured. To achieve the desired functionality, polynucleotide sequences are currently designed by a process that involves tedious and laborious filtering of potential candidates against a series of requirements and parameters. Here, we present a complete novel methodology for the rapid rational design of large sets of DNA sequences. This method allows for the direct implementation of very complex and detailed requirements for the generated sequences, thus avoiding 'brute force' filtering. At the same time, these sequences have narrow distributions of melting temperatures. The molecular part of the design process can be done without computer assistance, using an efficient 'human engineering' approach by drawing a single blueprint graph that represents all generated sequences. Moreover, the method eliminates the necessity for extensive thermodynamic calculations. Melting temperature can be calculated only once (or not at all). In addition, the isostability of the sequences is independent of the selection of a particular set of thermodynamic parameters. Applications are presented for DNA sequence designs for microarrays, universal microarray zip sequences and electron transfer experiments.
Hybridization-based antibody cDNA recovery for the production of recombinant antibodies identified by repertoire sequencing.

PubMed

Valdés-Alemán, Javier; Téllez-Sosa, Juan; Ovilla-Muñoz, Marbella; Godoy-Lozano, Elizabeth; Velázquez-Ramírez, Daniel; Valdovinos-Torres, Humberto; Gómez-Barreto, Rosa E; Martinez-Barnetche, Jesús

2014-01-01

High-throughput sequencing of the antibody repertoire is enabling a thorough analysis of B cell diversity and clonal selection, which may improve the novel antibody discovery process. Theoretically, an adequate bioinformatic analysis could allow identification of candidate antigen-specific antibodies, requiring their recombinant production for experimental validation of their specificity. Gene synthesis is commonly used for the generation of recombinant antibodies identified in silico. Novel strategies that bypass gene synthesis could offer more accessible antibody identification and validation alternatives. We developed a hybridization-based recovery strategy that targets the complementarity-determining region 3 (CDRH3) for the enrichment of cDNA of candidate antigen-specific antibody sequences. Ten clonal groups of interest were identified through bioinformatic analysis of the heavy chain antibody repertoire of mice immunized with hen egg white lysozyme (HEL). cDNA from eight of the targeted clonal groups was recovered efficiently, leading to the generation of recombinant antibodies. One representative heavy chain sequence from each clonal group recovered was paired with previously reported anti-HEL light chains to generate full antibodies, later tested for HEL-binding capacity. The recovery process proposed represents a simple and scalable molecular strategy that could enhance antibody identification and specificity assessment, enabling a more cost-efficient generation of recombinant antibodies.

Identification and analysis of pig chimeric mRNAs using RNA sequencing data

PubMed Central

2012-01-01

Background Gene fusion is ubiquitous over the course of evolution. It is expected to increase the diversity and complexity of transcriptomes and proteomes through chimeric sequence segments or altered regulation. However, chimeric mRNAs in pigs remain poorly characterized. Here we identified chimeric mRNAs in pigs and analyzed their expression across individuals and breeds using RNA-sequencing data. Results The present study identified 669 putative chimeric mRNAs in pigs, of which 251 chimeric candidates were detected in a set of RNA-sequencing data. A total of 618 candidates had clear trans-splicing sites, 537 of which obeyed the canonical GU-AG splice rule. Only two putative pig chimera variants whose fusion junction overlapped with that of a known human chimeric mRNA were found. A set of unique chimeric events showed moderate variance in expression across individuals and breeds and non-significant variance between sexes. Furthermore, the genomic region of the 5′ partner gene shares a similar DNA sequence with that of the 3′ partner gene for 458 putative chimeric mRNAs. Eighty-one of those shared DNA sequences significantly matched the known DNA-binding motifs in the JASPAR CORE database. Four DNA motifs shared in parental genomic regions had significant similarity with known human CTCF binding sites. Conclusions The present study provided detailed information on some pig chimeric mRNAs. We proposed a model in which trans-acting factors, such as CTCF, induce the spatial organisation of parental genes into the same transcriptional factory so that the parental genes are coordinately transcribed to give rise to chimeric mRNAs. PMID:22925561
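The canonical GU-AG check mentioned in this abstract can be sketched in Python. The rule itself is standard (the spliced-out sequence begins with GU and ends with AG in RNA, or GT and AG in DNA), but how the authors extracted and tested the junction sequences is not stated, so the function names and example sequences are assumptions.

```python
def obeys_gu_ag(spliced_out):
    """True if the spliced-out sequence starts with GT/GU and ends with AG."""
    seq = spliced_out.upper().replace("U", "T")  # accept RNA or DNA letters
    return len(seq) >= 4 and seq.startswith("GT") and seq.endswith("AG")


def canonical_fraction(sequences):
    """Fraction of sequences that follow the GU-AG rule."""
    return sum(obeys_gu_ag(s) for s in sequences) / len(sequences)


# Toy examples (hypothetical): two canonical sequences and one non-canonical.
examples = ["GTAAGTCCAG", "GUAAGUCCAG", "ATAAGTCCAC"]
print([obeys_gu_ag(s) for s in examples])  # [True, True, False]

# Proportion reported in the abstract: 537 of 618 candidates.
print(round(537 / 618, 3))  # 0.869
```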
Targeted next-generation sequencing identification of mutations in disease resistance gene analogs (RGAs) in wild and cultivated beets

USDA-ARS's Scientific Manuscript database

Resistance gene analogs (RGAs) were searched bioinformatically in the sugar beet (Beta vulgaris L.) genome as potential candidates for improving resistance against different diseases. In the present study, Ion Torrent sequencing technology was used to identify mutations in 21 RGAs. The DNA samples o...
Neuropeptidomics of the Mosquito Aedes Aegypti

DTIC Science & Technology

2010-01-01

translational processing ( pyroglutamate formation) was detected for AST-C and CAPA-PVK-2. For the first time in insects, we succeeded in the direct...hormones, trace DNA sequences generated by TIGR and the Broad Institute were first searched by TBLASTN24 using amino acid sequences of candidate peptides...previously described.1 TBLASTN searches, using the amino acid sequences of putative Ae. aegypti neuropeptide and peptide hormone orthologs identified in
Screening the sequence selectivity of DNA-binding molecules using a gold nanoparticle-based colorimetric approach.

PubMed

Hurst, Sarah J; Han, Min Su; Lytton-Jean, Abigail K R; Mirkin, Chad A

2007-09-15

We have developed a novel competition assay that uses a gold nanoparticle (Au NP)-based, high-throughput colorimetric approach to screen the sequence selectivity of DNA-binding molecules. This assay hinges on the observation that the melting behavior of DNA-functionalized Au NP aggregates is sensitive to the concentration of the DNA-binding molecule in solution. When short, oligomeric hairpin DNA sequences are added to a reaction solution consisting of DNA-functionalized Au NP aggregates and DNA-binding molecules, these molecules may bind either to the Au NP aggregate interconnects or to the hairpin stems, depending on their relative affinity for each. This relative affinity can be measured as a change in the melting temperature (Tm) of the DNA-modified Au NP aggregates in solution. As a proof of concept, we evaluated the selectivity of 4',6-diamidino-2-phenylindole (an AT-specific binder), ethidium bromide (a nonspecific binder), and chromomycin A (a GC-specific binder) for six sequences of hairpin DNA having different numbers of AT pairs in a five-base pair variable stem region. Our assay accurately and easily confirmed the known trends in selectivity for the DNA binders in question without the use of complicated instrumentation. This novel assay will be useful in assessing large libraries of potential drug candidates that work by binding DNA to form a drug/DNA complex.
From genomes to vaccines: Leishmania as a model.

PubMed Central

Almeida, Renata; Norrish, Alan; Levick, Mark; Vetrie, David; Freeman, Tom; Vilo, Jaak; Ivens, Alasdair; Lange, Uta; Stober, Carmel; McCann, Sharon; Blackwell, Jenefer M

2002-01-01

The 35 Mb genome of Leishmania should be sequenced by late 2002. It contains approximately 8500 genes that will probably translate into more than 10 000 proteins. In the laboratory we have been piloting strategies to try to harness the power of the genome-proteome for rapid screening of new vaccine candidates. To this end, microarray analysis of 1094 unique genes identified using an EST analysis of 2091 cDNA clones from spliced leader libraries prepared from different developmental stages of Leishmania has been employed. The plan was to identify amastigote-expressed genes that could be used in high-throughput DNA-vaccine screens to identify potential new vaccine candidates. Despite the lack of transcriptional regulation that polycistronic transcription in Leishmania dictates, the data provide evidence for a high level of post-transcriptional regulation of RNA abundance during the developmental cycle of promastigotes in culture and in lesion-derived amastigotes of Leishmania major. This has provided 147 candidates from the 1094 unique genes that are specifically upregulated in amastigotes and are being used in vaccine studies. Using DNA vaccination, it was demonstrated that pooling strategies can work to identify protective vaccines, but it was found that some potentially protective antigens are masked by other disease-exacerbatory antigens in the pool. A total of 100 new vaccine candidates are currently being tested separately and in pools to extend this analysis, and to facilitate retrospective bioinformatic analysis to develop predictive algorithms for sequences that constitute potentially protective antigens. We are also working with other members of the Leishmania Genome Network to determine whether RNA expression determined by microarray analyses parallels expression at the protein level. We believe we are making good progress in developing strategies that will allow rapid translation of the sequence of Leishmania into potential interventions for disease control in humans. PMID:11839176
XX/XY System of Sex Determination in the Geophilomorph Centipede Strigamia maritima

PubMed Central

Green, Jack E.; Dalíková, Martina; Sahara, Ken; Marec, František; Akam, Michael

2016-01-01

We show that the geophilomorph centipede Strigamia maritima possesses an XX/XY system of sex chromosomes, with males being the heterogametic sex. This is, to our knowledge, the first report of sex chromosomes in any geophilomorph centipede. Using the recently assembled Strigamia genome sequence, we identified a set of scaffolds differentially represented in male and female DNA sequence. Using quantitative real-time PCR, we confirmed that three candidate X chromosome-derived scaffolds are present at approximately twice the copy number in females as in males. Furthermore, we confirmed that six candidate Y chromosome-derived scaffolds contain male-specific sequences. Finally, using this molecular information, we designed an X chromosome-specific DNA probe and performed fluorescent in situ hybridization against mitotic and meiotic chromosome spreads to identify the Strigamia XY sex-chromosome pair cytologically. We found that the X and Y chromosomes are recognizably different in size during the early pachytene stage of meiosis, and exhibit incomplete and delayed pairing. PMID:26919730
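A twofold difference in copy number between females and males, as reported for the candidate X-derived scaffolds, can be read off qPCR threshold cycles. The sketch below uses the common 2^-ΔΔCt calculation as an assumption: the abstract does not state which quantification method was used, the calculation presumes near-100% amplification efficiency, and the Ct values are hypothetical.

```python
def relative_copy_number(ct_target_a: float, ct_ref_a: float,
                         ct_target_b: float, ct_ref_b: float) -> float:
    """Copy number of a target in sample A relative to sample B (2^-ddCt).

    Each target Ct is first normalized to a reference (e.g. autosomal) locus
    measured in the same sample; perfect doubling per cycle is assumed.
    """
    delta_a = ct_target_a - ct_ref_a
    delta_b = ct_target_b - ct_ref_b
    return 2.0 ** -(delta_a - delta_b)


# Hypothetical Ct values: the target crosses threshold one cycle earlier in
# female DNA than in male DNA once both are normalized to the reference.
ratio = relative_copy_number(ct_target_a=24.0, ct_ref_a=22.0,   # female
                             ct_target_b=25.0, ct_ref_b=22.0)   # male
print(ratio)
```

A one-cycle difference after normalization corresponds to a twofold difference in template, which is the pattern expected for an X-linked scaffold in an XX/XY system.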
Molecular Analysis of Dehalococcoides 16S Ribosomal DNA from Chloroethene-Contaminated Sites throughout North America and Europe

PubMed Central

Hendrickson, Edwin R.; Payne, Jo Ann; Young, Roslyn M.; Starr, Mark G.; Perry, Michael P.; Fahnestock, Stephen; Ellis, David E.; Ebersole, Richard C.

2002-01-01

The environmental distribution of Dehalococcoides group organisms and their association with chloroethene-contaminated sites were examined. Samples from 24 chloroethene-dechlorinating sites scattered throughout North America and Europe were tested for the presence of members of the Dehalococcoides group by using a PCR assay developed to detect Dehalococcoides 16S rRNA gene (rDNA) sequences. Sequences identified by sequence analysis as sequences of members of the Dehalococcoides group were detected at 21 sites. Full dechlorination of chloroethenes to ethene occurred at these sites. Dehalococcoides sequences were not detected in samples from three sites at which partial dechlorination of chloroethenes occurred, where dechlorination appeared to stop at 1,2-cis-dichloroethene. Phylogenetic analysis of the 16S rDNA amplicons confirmed that Dehalococcoides sequences formed a unique 16S rDNA group. These 16S rDNA sequences were divided into three subgroups based on specific base substitution patterns in variable regions 2 and 6 of the Dehalococcoides 16S rDNA sequence. Analyses also demonstrated that specific base substitution patterns were signature patterns. The specific base substitutions distinguished the three sequence subgroups phylogenetically. These results demonstrated that members of the Dehalococcoides group are widely distributed in nature and can be found in a variety of geological formations and in different climatic zones. Furthermore, the association of these organisms with full dechlorination of chloroethenes suggests that they are promising candidates for engineered bioremediation and may be important contributors to natural attenuation of chloroethenes. PMID:11823182
Bioinformatics analysis and detection of gelatinase encoded gene in Lysinibacillus sphaericus

NASA Astrophysics Data System (ADS)

Repin, Rul Aisyah Mat; Mutalib, Sahilah Abdul; Shahimi, Safiyyah; Khalid, Rozida Mohd.; Ayob, Mohd. Khan; Bakar, Mohd. Faizal Abu; Isa, Mohd Noor Mat

2016-11-01

In this study, we performed bioinformatics analysis of the genome sequence of Lysinibacillus sphaericus (L. sphaericus) to identify genes encoding gelatinase. L. sphaericus was isolated from soil and is a gelatinase-producing bacterium that is species-specific to porcine and bovine gelatin. This bacterium offers the possibility of producing enzymes specific to each of these two species of meat. The main focus of this research was to identify the gelatinase-encoding gene of L. sphaericus using bioinformatics analysis of the partially sequenced genome. Three candidate genes were identified: gelatinase candidate gene 1 (P1), NODE_71_length_93919_cov_158.931839_21, which is 1563 base pairs (bp) in size and encodes a 520-amino-acid sequence; gelatinase candidate gene 2 (P2), NODE_23_length_52851_cov_190.061386_17, which is 1776 bp in size and encodes 591 amino acids; and gelatinase candidate gene 3 (P3), NODE_106_length_32943_cov_169.147919_8, which is 1701 bp in size and encodes 566 amino acids. Three pairs of oligonucleotide primers, named F1/R1, F2/R2 and F3/R3, were designed to target short sequences of cDNA by PCR. The amplicons were reliably 1563 bp in size for candidate gene P1 and 1701 bp for candidate gene P3. Thus, bioinformatics analysis of L. sphaericus identified genes encoding gelatinase.
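The reported gene sizes and protein lengths can be cross-checked with simple codon arithmetic. The sketch below assumes that each reported size is a complete open reading frame that includes one terminal stop codon; that assumption is not stated in the abstract, and the function name is illustrative.

```python
def protein_length(orf_bp: int) -> int:
    """Amino acids encoded by an ORF of orf_bp bases, stop codon included."""
    if orf_bp < 3 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    return orf_bp // 3 - 1  # one codon is the stop and encodes no residue


# Sizes as reported for the three candidate genes.
reported = {"P1": (1563, 520), "P2": (1776, 591), "P3": (1701, 566)}
for name, (bp, aa) in reported.items():
    print(name, bp, protein_length(bp), protein_length(bp) == aa)
```

Under that assumption, all three reported pairs are internally consistent.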
Studies of a biochemical factory: tomato trichome deep expressed sequence tag sequencing and proteomics.

PubMed

Schilmiller, Anthony L; Miner, Dennis P; Larson, Matthew; McDowell, Eric; Gang, David R; Wilkerson, Curtis; Last, Robert L

2010-07-01

Shotgun proteomics analysis allows hundreds of proteins to be identified and quantified from a single sample at relatively low cost. Extensive DNA sequence information is a prerequisite for shotgun proteomics, and it is ideal to have sequence for the organism being studied rather than from related species or accessions. While this requirement has limited the set of organisms that are candidates for this approach, next generation sequencing technologies make it feasible to obtain deep DNA sequence coverage from any organism. As part of our studies of specialized (secondary) metabolism in tomato (Solanum lycopersicum) trichomes, 454 sequencing of cDNA was combined with shotgun proteomics analyses to obtain in-depth profiles of genes and proteins expressed in leaf and stem glandular trichomes of 3-week-old plants. The expressed sequence tag and proteomics data sets combined with metabolite analysis led to the discovery and characterization of a sesquiterpene synthase that produces beta-caryophyllene and alpha-humulene from E,E-farnesyl diphosphate in trichomes of leaf but not of stem. This analysis demonstrates the utility of combining high-throughput cDNA sequencing with proteomics experiments in a target tissue. These data can be used for dissection of other biochemical processes in these specialized epidermal cells.
Studies of a Biochemical Factory: Tomato Trichome Deep Expressed Sequence Tag Sequencing and Proteomics

PubMed Central

Schilmiller, Anthony L.; Miner, Dennis P.; Larson, Matthew; McDowell, Eric; Gang, David R.; Wilkerson, Curtis; Last, Robert L.

2010-01-01

Shotgun proteomics analysis allows hundreds of proteins to be identified and quantified from a single sample at relatively low cost. Extensive DNA sequence information is a prerequisite for shotgun proteomics, and it is ideal to have sequence for the organism being studied rather than from related species or accessions. While this requirement has limited the set of organisms that are candidates for this approach, next generation sequencing technologies make it feasible to obtain deep DNA sequence coverage from any organism. As part of our studies of specialized (secondary) metabolism in tomato (Solanum lycopersicum) trichomes, 454 sequencing of cDNA was combined with shotgun proteomics analyses to obtain in-depth profiles of genes and proteins expressed in leaf and stem glandular trichomes of 3-week-old plants. The expressed sequence tag and proteomics data sets combined with metabolite analysis led to the discovery and characterization of a sesquiterpene synthase that produces β-caryophyllene and α-humulene from E,E-farnesyl diphosphate in trichomes of leaf but not of stem. This analysis demonstrates the utility of combining high-throughput cDNA sequencing with proteomics experiments in a target tissue. These data can be used for dissection of other biochemical processes in these specialized epidermal cells. PMID:20431087
Resistance gene candidates identified by PCR with degenerate oligonucleotide primers map to clusters of resistance genes in lettuce.

PubMed

Shen, K A; Meyers, B C; Islam-Faridi, M N; Chin, D B; Stelly, D M; Michelmore, R W

1998-08-01

The recent cloning of genes for resistance against diverse pathogens from a variety of plants has revealed that many share conserved sequence motifs. This provides the possibility of isolating numerous additional resistance genes by polymerase chain reaction (PCR) with degenerate oligonucleotide primers. We amplified resistance gene candidates (RGCs) from lettuce with multiple combinations of primers with low degeneracy designed from motifs in the nucleotide binding sites (NBSs) of RPS2 of Arabidopsis thaliana and N of tobacco. Genomic DNA, cDNA, and bacterial artificial chromosome (BAC) clones were successfully used as templates. Four families of sequences were identified that had the same similarity to each other as to resistance genes from other species. The relationship of the amplified products to resistance genes was evaluated by several sequence and genetic criteria. The amplified products contained open reading frames with additional sequences characteristic of NBSs. Hybridization of RGCs to genomic DNA and to BAC clones revealed large numbers of related sequences. Genetic analysis demonstrated the existence of clustered multigene families for each of the four RGC sequences. This parallels classical genetic data on clustering of disease resistance genes. Two of the four families mapped to known clusters of resistance genes; these two families were therefore studied in greater detail. Additional evidence that these RGCs could be resistance genes was gained by the identification of leucine-rich repeat (LRR) regions in sequences adjoining the NBS similar to those in RPM1 and RPS2 of A. thaliana. Fluorescent in situ hybridization confirmed the clustered genomic distribution of these sequences. The use of PCR with degenerate oligonucleotide primers is therefore an efficient method to identify numerous RGCs in plants.
A calmodulin-like protein (LCALA) is a new Leishmania amazonensis candidate for telomere end-binding protein.

PubMed

Morea, Edna G O; Viviescas, Maria Alejandra; Fernandes, Carlos A H; Matioli, Fabio F; Lira, Cristina B B; Fernandez, Maribel F; Moraes, Barbara S; da Silva, Marcelo S; Storti, Camila B; Fontes, Marcos R M; Cano, Maria Isabel N

2017-11-01

Leishmania spp. telomeres are composed of 5'-TTAGGG-3' repeats associated with proteins. We have previously identified LaRbp38 and LaRPA-1 as proteins that bind the G-rich telomeric strand. At that time, we had also partially characterized a protein:DNA complex, named LaGT1, but we could not identify its protein component. Using protein-DNA interaction and competition assays, we confirmed that LaGT1 is highly specific to the G-rich telomeric single-stranded DNA. Three protein bands with LaGT1 activity were isolated from affinity-purified protein extracts, in-gel digested, and sequenced de novo using mass spectrometry analysis. In silico analysis of the digested peptides identified them as a putative calmodulin with sequences identical to the T. cruzi calmodulin. In the Leishmania genome, the calmodulin ortholog is present in three identical copies. We cloned and sequenced one of the gene copies, named it LCalA, and obtained the recombinant protein. Multiple sequence alignment and molecular modeling showed that LCalA shares homology with most eukaryotic calmodulins. In addition, we demonstrated that LCalA is nuclear, partially co-localizes with telomeres and binds the G-rich telomeric strand in vivo. Recombinant LCalA can bind specifically and with relative affinity to the G-rich telomeric single-strand and to a 3'G-overhang, and DNA binding is calcium dependent. We have described a novel candidate component of Leishmania telomeres, LCalA, a nuclear calmodulin that binds the G-rich telomeric strand with high specificity and relative affinity, in a calcium-dependent manner. LCalA is the first reported calmodulin that binds telomeric DNA in vivo. Copyright © 2017 Elsevier B.V. All rights reserved.
DNA pooling: a comprehensive, multi-stage association analysis of ACSL6 and SIRT5 polymorphisms in schizophrenia.

PubMed

Chowdari, K V; Northup, A; Pless, L; Wood, J; Joo, Y H; Mirnics, K; Lewis, D A; Levitt, P R; Bacanu, S-A; Nimgaonkar, V L

2007-04-01

Many candidate gene association studies have evaluated incomplete, unrepresentative sets of single nucleotide polymorphisms (SNPs), producing non-significant results that are difficult to interpret. Using a rapid, efficient strategy designed to investigate all common SNPs, we tested associations between schizophrenia and two positional candidate genes: ACSL6 (Acyl-Coenzyme A synthetase long-chain family member 6) and SIRT5 (silent mating type information regulation 2 homologue 5). We initially evaluated the utility of DNA sequencing traces to estimate SNP allele frequencies in pooled DNA samples. The mean variances for the DNA sequencing estimates were acceptable and were comparable to other published methods (mean variance: 0.0008, range 0-0.0119). Using pooled DNA samples from cases with schizophrenia/schizoaffective disorder (Diagnostic and Statistical Manual of Mental Disorders edition IV criteria) and controls (n=200, each group), we next sequenced all exons, introns and flanking upstream/downstream sequences for ACSL6 and SIRT5. Among 69 identified SNPs, case-control allele frequency comparisons revealed nine suggestive associations (P<0.2). Each of these SNPs was next genotyped in the individual samples composing the pools. A suggestive association with rs11743803 at ACSL6 remained (allele-wise P=0.02), with diminished evidence in an extended sample (448 cases, 554 controls, P=0.062). In conclusion, we propose a multi-stage method for comprehensive, rapid, efficient and economical genetic association analysis that enables simultaneous SNP detection and allele frequency estimation in large samples. This strategy may be particularly useful for research groups lacking access to high throughput genotyping facilities. Our analyses did not yield convincing evidence for associations of schizophrenia with ACSL6 or SIRT5.
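An allele-wise case-control comparison of the kind described above can be sketched as a Pearson chi-square test on a 2x2 table of allele counts. This is one common choice and is an assumption here; the abstract does not name the test used, and the allele counts below are hypothetical (200 individuals per group gives 400 alleles per group).

```python
import math


def allele_chi_square(case_a: int, case_b: int, ctrl_a: int, ctrl_b: int):
    """Pearson chi-square (1 df, no continuity correction) on allele counts.

    Returns (chi2, p_value). Counts are alleles, not individuals.
    """
    row_case, row_ctrl = case_a + case_b, ctrl_a + ctrl_b
    col_a, col_b = case_a + ctrl_a, case_b + ctrl_b
    if 0 in (row_case, row_ctrl, col_a, col_b):
        raise ValueError("every row and column total must be non-zero")
    n = row_case + row_ctrl
    chi2 = n * (case_a * ctrl_b - case_b * ctrl_a) ** 2 / (
        row_case * row_ctrl * col_a * col_b)
    # For 1 degree of freedom the upper-tail probability is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical allele counts for one SNP.
chi2, p = allele_chi_square(case_a=150, case_b=250, ctrl_a=120, ctrl_b=280)
print(round(chi2, 3), round(p, 4))
```

In a multi-stage design like the one described, a lenient threshold on the pooled estimates (such as P<0.2) only selects SNPs for individual genotyping; the follow-up genotypes supply the actual evidence for association.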
The Status, Quality, and Expansion of the NIH Full-Length cDNA Project: The Mammalian Gene Collection (MGC)

PubMed Central

2004-01-01

The National Institutes of Health's Mammalian Gene Collection (MGC) project was designed to generate and sequence a publicly accessible cDNA resource containing a complete open reading frame (ORF) for every human and mouse gene. The project initially used a random strategy to select clones from a large number of cDNA libraries from diverse tissues. Candidate clones were chosen based on 5′-EST sequences, and then fully sequenced to high accuracy and analyzed by algorithms developed for this project. Currently, more than 11,000 human and 10,000 mouse genes are represented in MGC by at least one clone with a full ORF. The random selection approach is now reaching a saturation point, and a transition to protocols targeted at the missing transcripts is now required to complete the mouse and human collections. Comparison of the sequence of the MGC clones to reference genome sequences reveals that most cDNA clones are of very high sequence quality, although it is likely that some cDNAs may carry missense variants as a consequence of experimental artifact, such as PCR, cloning, or reverse transcriptase errors. Recently, a rat cDNA component was added to the project, and ongoing frog (Xenopus) and zebrafish (Danio) cDNA projects were expanded to take advantage of the high-throughput MGC pipeline. PMID:15489334
Extensive structural variations between mitochondrial genomes of CMS and normal peppers (Capsicum annuum L.) revealed by complete nucleotide sequencing.

PubMed

Jo, Yeong Deuk; Choi, Yoomi; Kim, Dong-Hwan; Kim, Byung-Dong; Kang, Byoung-Cheorl

2014-07-04

Cytoplasmic male sterility (CMS) is an inability to produce functional pollen that is caused by mutation of the mitochondrial genome. Comparative analyses of mitochondrial genomes of lines with and without CMS in several species have revealed structural differences between genomes, including extensive rearrangements caused by recombination. However, the mitochondrial genome structure and the DNA rearrangements that may be related to CMS have not been characterized in Capsicum spp. We obtained the complete mitochondrial genome sequences of the pepper CMS line FS4401 (507,452 bp) and the fertile line Jeju (511,530 bp). Comparative analysis between the mitochondrial genomes of pepper and tobacco, both members of the Solanaceae, revealed extensive DNA rearrangements and poor conservation in non-coding DNA. In the comparison between pepper lines, FS4401 and Jeju mitochondrial DNAs contained the same complement of protein-coding genes except for one additional copy of an atp6 gene (ψatp6-2) in FS4401. In terms of genome structure, we found eighteen syntenic blocks in the two mitochondrial genomes, which have been rearranged in each genome. By contrast, sequences between syntenic blocks, which were specific to each line, accounted for 30,380 and 17,847 bp in FS4401 and Jeju, respectively. The previously reported CMS candidate genes, orf507 and ψatp6-2, were located on the edges of the largest sequence segments that were specific to FS4401. In this region, a large number of small sequence segments that were absent from or found at different locations in the Jeju mitochondrial genome were combined together. The incorporation of repeats and the overlapping of connected sequence segments by a few nucleotides implied that extensive rearrangements by homologous recombination might be involved in the evolution of this region. Further analysis using mtDNA pairs from other plant species revealed common features of DNA regions around CMS-associated genes.
Although a large portion of sequence context was shared by the mitochondrial genomes of CMS and male-fertile pepper lines, extensive genome rearrangements were detected. CMS candidate genes were located on the edges of highly rearranged CMS-specific DNA regions and near repeat sequences. These characteristics were also detected among CMS-associated genes in other species, implying that a common mechanism might be involved in the evolution of CMS-associated genes.
Looking into flowering time in almond (Prunus dulcis (Mill) D. A. Webb): the candidate gene approach.

PubMed

Silva, C; Garcia-Mas, J; Sánchez, A M; Arús, P; Oliveira, M M

2005-03-01

Blooming time is one of the most important agronomic traits in almond. Biochemical and molecular events underlying flowering regulation must be understood before methods to stimulate late flowering can be developed. Attempts to elucidate the genetic control of this process have led to the identification of a major gene (Lb) and quantitative trait loci (QTLs) linked to observed phenotypic differences, but although this gene and these QTLs have been placed on the Prunus reference genetic map, their sequences and specific functions remain unknown. The aim of our investigation was to associate these loci with known genes using a candidate gene approach. Two almond cDNAs and eight Prunus expressed sequence tags were selected as candidate genes (CGs) since their sequences were highly identical to those of flowering regulatory genes characterized in other species. The CGs were amplified from both parental lines of the mapping population using specific primers. Sequence comparison revealed DNA polymorphisms between the parental lines, mainly of the single nucleotide type. Polymorphisms were used to develop co-dominant cleaved amplified polymorphic sequence markers or length polymorphisms based on insertion/deletion events for mapping the candidate genes on the Prunus reference map. Ten candidate genes were assigned to six linkage groups in the Prunus genome. The positions of two of these were compatible with the regions where two QTLs for blooming time were detected. One additional candidate was localized close to the position of the Evergrowing gene, which determines a non-deciduous behaviour in peach.
The molecular genetic makeup of acute lymphoblastic leukemia | Office of Cancer Genomics

Cancer.gov

Abstract: Genomic profiling has transformed our understanding of the genetic basis of acute lymphoblastic leukemia (ALL). Recent years have seen a shift from microarray analysis and candidate gene sequencing to next-generation sequencing. Together, these approaches have shown that many ALL subtypes are characterized by constellations of structural rearrangements, submicroscopic DNA copy number alterations, and sequence mutations, several of which have clear implications for risk stratification and targeted therapeutic intervention.
Transcriptome sequencing analysis of novel sRNAs of Kineococcus radiotolerans in response to ionizing radiation.

PubMed

Chen, Zhouwei; Li, Lufeng; Shan, Zhan; Huang, Hannian; Chen, Huan; Ding, Xianfeng; Guo, Jiangfeng; Liu, Lili

2016-11-01

Kineococcus radiotolerans is a Gram-positive, radio-resistant bacterium isolated from a radioactive environment. The small noncoding RNAs (sRNAs) in bacteria are reported to play roles in the immediate response to stress and/or the recovery from stress. The analysis of K. radiotolerans transcriptome sequencing results can identify these sRNAs in a genome-wide detection, using RNA sequencing (RNA-seq) by the deep sequencing technique. In this study, the raw data of radiation-exposed samples (RS) and control samples (CS) were acquired separately from the sequencing platform. There were 217 common sRNA candidates in the two samples screened in the genome-wide scale by bioinformatics analysis. There were 43 differentially expressed sRNA candidates, including 28 up-regulated and 15 down-regulated ones. The down-regulated sRNAs were selected for the sRNA target prediction, of which 12 sRNAs that may modulate the genes related to the transcription regulation and DNA repair were considered as the candidates involved in the radio-resistance regulation system. Copyright © 2016 Elsevier GmbH. All rights reserved.
Strawberry disease lesions in rainbow trout from southern Idaho are associated with DNA from a Rickettsia-like organism.

PubMed

Lloyd, Sonja J; LaPatra, Scott E; Snekvik, Kevin R; St-Hilaire, Sophie; Cain, Kenneth D; Call, Douglas R

2008-11-20

Strawberry disease (SD) in the USA is a skin disorder of unknown etiology that occurs in rainbow trout Oncorhynchus mykiss and is characterized by bright red inflammatory lesions. To identify a candidate bacterial agent responsible for SD, we constructed 16S rDNA libraries from 7 SD lesion samples and 2 apparently healthy skin samples from SD-affected fish. A 16S rDNA sequence highly similar to members of the order Rickettsiales was present in 3 lesion libraries at 1%, 32% and 54% prevalence, but this sequence was not found in either healthy tissue library. Based on phylogenetic analysis, this Rickettsia-like organism (RLO) sequence is most closely related to 16S rDNA sequences of bacteria that may form a novel lineage within the Rickettsiales. We used nested PCR assays to screen 25 SD-affected fish for RLO or Flavobacterium psychrophilum DNA. Sixteen lesion samples were positive for the RLO sequence and 4 of the matched healthy samples were positive, resulting in a significant association between SD lesions and the presence of RLO DNA. While F. psychrophilum is reportedly associated with 'cold water strawberry disease' in the UK, we found no significant association between SD lesions and the presence of F. psychrophilum DNA. The statistical association between SD lesions and the presence of RLO DNA is not proof of etiology, but these data suggest that RLO may play a role in SD in southern Idaho, USA.
Identification of Dendrobium species by a candidate DNA barcode sequence: the chloroplast psbA-trnH intergenic region.

PubMed

Yao, Hui; Song, Jing-Yuan; Ma, Xin-Ye; Liu, Chang; Li, Ying; Xu, Hong-Xi; Han, Jian-Ping; Duan, Li-Sheng; Chen, Shi-Lin

2009-05-01

DNA barcoding is a novel technology that uses a standard DNA sequence to facilitate species identification. Although a consensus has not been reached regarding which DNA sequences can be used as the best plant barcodes, the psbA-trnH spacer region has been tested extensively in recent years. In this study, we hypothesize that the psbA-trnH spacer regions are also effective barcodes for Dendrobium species. We have sequenced the chloroplast psbA-trnH intergenic spacers of 17 Dendrobium species to test this hypothesis. The sequences were found to be significantly different from those of other species, with percentages of variation ranging from 0.3 % to 2.3 % and an average of 1.2 %. In contrast, the intraspecific variation among the Dendrobium species studied ranged from 0 % to 0.1 %. The sequence difference between the psbA-trnH sequences of 17 Dendrobium species and one Bulbophyllum odoratissimum ranged from 2.0 % to 3.1 %, with an average of 2.5 %. Our results support the notion that the psbA-trnH intergenic spacer region could be used as a barcode to distinguish various Dendrobium species and to differentiate Dendrobium species from other adulterating species. Copyright Georg Thieme Verlag KG Stuttgart. New York.
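The percentages of variation quoted above can be illustrated with an uncorrected pairwise distance between two aligned barcode sequences. This is a minimal sketch under stated assumptions: the sequences are already aligned to equal length, gapped columns are skipped, and no substitution model is applied. The abstract does not say which distance measure the authors used, and the example sequences are hypothetical.

```python
def percent_variation(seq_a: str, seq_b: str) -> float:
    """Percentage of differing sites between two pre-aligned sequences.

    Columns where either sequence has a gap ('-') are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * differences / compared


# Hypothetical aligned fragments, for illustration only.
print(percent_variation("ACGTTGCAAT-GCTAGCTAA", "ACGTTGCTATAGCTAGCTAA"))
```

A barcode is useful when distances between species are consistently larger than distances within a species, which is the pattern the abstract reports.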

Clinical Utility of Circulating Tumor DNA for Molecular Assessment and Precision Medicine in Pancreatic Cancer.

PubMed

Takai, Erina; Totoki, Yasushi; Nakamura, Hiromi; Kato, Mamoru; Shibata, Tatsuhiro; Yachida, Shinichi

2016-01-01

Pancreatic ductal adenocarcinoma (PDAC) remains one of the most lethal malignancies. The genomic landscape of the PDAC genome features four frequently mutated genes (KRAS, CDKN2A, TP53, and SMAD4) and dozens of candidate driver genes altered at low frequency, including potential clinical targets. Circulating cell-free DNA (cfDNA) is a promising resource to detect molecular characteristics of tumors, supporting the concept of "liquid biopsy". We determined the mutational status of KRAS in plasma cfDNA using multiplex droplet digital PCR in 259 patients with PDAC, retrospectively. Furthermore, we constructed a novel modified SureSelect-KAPA-Illumina platform and an original panel of 60 genes. We then performed targeted deep sequencing of cfDNA in 48 patients who had ≥1 % mutant allele frequencies of KRAS in plasma cfDNA. Droplet digital PCR detected KRAS mutations in plasma cfDNA in 63 of 107 (58.9 %) patients with inoperable tumors. Importantly, potentially targetable somatic mutations were identified in 14 of 48 patients (29.2 %) examined by cfDNA sequencing. Our two-step approach with plasma cfDNA, combining droplet digital PCR and targeted deep sequencing, is a feasible clinical approach. Assessment of mutations in plasma cfDNA may provide a new diagnostic tool, assisting decisions for optimal therapeutic strategies for PDAC patients.
Finding functional features in Saccharomyces genomes by phylogenetic footprinting.

PubMed

Cliften, Paul; Sudarsanam, Priya; Desikan, Ashwin; Fulton, Lucinda; Fulton, Bob; Majors, John; Waterston, Robert; Cohen, Barak A; Johnston, Mark

2003-07-04

The sifting and winnowing of DNA sequence that occur during evolution cause nonfunctional sequences to diverge, leaving phylogenetic footprints of functional sequence elements in comparisons of genome sequences. We searched for such footprints among the genome sequences of six Saccharomyces species and identified potentially functional sequences. Comparison of these sequences allowed us to revise the catalog of yeast genes and identify sequence motifs that may be targets of transcriptional regulatory proteins. Some of these conserved sequence motifs reside upstream of genes with similar functional annotations or similar expression patterns or those bound by the same transcription factor and are thus good candidates for functional regulatory sequences.
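The idea of a phylogenetic footprint can be sketched as a search for runs of alignment columns that are identical in every species. This is a deliberately simplified illustration rather than the method used in the study: it assumes a gapped multiple alignment is already available, requires perfect identity, and uses hypothetical sequences and a hypothetical minimum run length.

```python
def conserved_runs(aligned, min_len=6):
    """Return (start, end) half-open column ranges conserved in all sequences.

    A column counts as conserved when every sequence carries the same
    non-gap character there; runs shorter than min_len are discarded.
    """
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(s) != length for s in aligned):
        raise ValueError("all sequences must have the same aligned length")
    runs, start = [], None
    for i in range(length + 1):  # one extra step closes a run at the end
        conserved = (i < length and aligned[0][i] != "-"
                     and all(s[i] == aligned[0][i] for s in aligned))
        if conserved and start is None:
            start = i
        elif not conserved and start is not None:
            if i - start >= min_len:
                runs.append((start, i))
            start = None
    return runs


# Hypothetical upstream regions from three species, already aligned.
alignment = ["AAACGCGTTT", "CTACGCGTAG", "GGACGCGTCC"]
for begin, end in conserved_runs(alignment):
    print(begin, end, alignment[0][begin:end])
```

Real footprinting has to tolerate partial divergence and weigh conservation against the evolutionary distance between species, so exact-match runs like these are only a starting point for candidate motifs.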
Mechanistically Distinct Pathways of Divergent Regulatory DNA Creation Contribute to Evolution of Human-Specific Genomic Regulatory Networks Driving Phenotypic Divergence of Homo sapiens

PubMed Central

Glinsky, Gennadi V.

2016-01-01

Thousands of candidate human-specific regulatory sequences (HSRS) have been identified, supporting the hypothesis that unique-to-human phenotypes result from human-specific alterations of genomic regulatory networks. Collectively, a compendium of multiple diverse families of HSRS that are functionally and structurally divergent from Great Apes could be defined as the backbone of human-specific genomic regulatory networks. Here, the conservation patterns analysis of 18,364 candidate HSRS was carried out requiring that 100% of bases must remap during the alignments of human, chimpanzee, and bonobo sequences. A total of 5,535 candidate HSRS were identified that are: (i) highly conserved in Great Apes; (ii) evolved by the exaptation of highly conserved ancestral DNA; (iii) defined by either the acceleration of mutation rates on the human lineage or the functional divergence from non-human primates. The exaptation of highly conserved ancestral DNA pathway seems mechanistically distinct from the evolution of regulatory DNA segments driven by the species-specific expansion of transposable elements. Genome-wide proximity placement analysis of HSRS revealed that a small fraction of topologically associating domains (TADs) contain more than half of HSRS from four distinct families. TADs that are enriched for HSRS and termed rapidly evolving in humans TADs (revTADs) comprise 0.8–10.3% of 3,127 TADs in the hESC genome. RevTADs manifest distinct correlation patterns between placements of human accelerated regions, human-specific transcription factor-binding sites, and recombination rates. There is a significant enrichment within revTAD boundaries of hESC-enhancers, primate-specific CTCF-binding sites, human-specific RNAPII-binding sites, hCONDELs, and H3K4me3 peaks with human-specific enrichment at TSS in prefrontal cortex neurons (P < 0.0001 in all instances).
Present analysis supports the idea that phenotypic divergence of Homo sapiens is driven by the evolution of human-specific genomic regulatory networks via at least two mechanistically distinct pathways of creation of divergent sequences of regulatory DNA: (i) recombination-associated exaptation of the highly conserved ancestral regulatory DNA segments; (ii) human-specific insertions of transposable elements. PMID:27503290
Identification of Streptococcus mitis321A vaccine antigens based on reverse vaccinology

PubMed Central

Zhang, Qiao; Lin, Kexiong; Wang, Changzheng; Xu, Zhi; Yang, Li; Ma, Qianli

2018-01-01

Streptococcus mitis (S. mitis) may transform into highly pathogenic bacteria. The aim of the present study was to identify potential antigen targets for designing an effective vaccine against the pathogenic S. mitis321A. The genome of S. mitis321A was sequenced using an Illumina Hiseq2000 instrument. Subsequently, Glimmer 3.02 and Tandem Repeat Finder (TRF) 4.04 were used to predict genes and tandem repeats, respectively, with DNA sequence function analysis using the Basic Local Alignment Search Tool (BLAST) in the Kyoto Encyclopedia of Genes and Genomes (KEGG) and Cluster of Orthologous Groups of proteins (COG) databases. Putative gene antigen candidates were screened with BLAST ahead of phylogenetic tree analysis. The DNA sequence assembly size was 2,110,680 bp with 40.12% GC, 6 scaffolds and 9 contigs. Consequently, 1,944 genes were predicted, and 119 TRF, 56 microsatellite DNA, 10 minisatellite DNA and 154 transposons were acquired. The predicted genes were associated with various pathways and functions concerning membrane transport and energy metabolism. Multiple putative genes encoding surface proteins, secreted proteins and virulence factors, as well as essential genes were determined. The majority of essential genes belonged to a phylogenetic lineage, while 321AGL000129 and 321AGL000299 were on the same branch. The current study provided useful information regarding the biological function of the S. mitis321A genome and recommends putative antigen candidates for developing a potent vaccine against S. mitis. PMID:29620181
Clinical utility of circulating tumor DNA for molecular assessment in pancreatic cancer.

PubMed

Takai, Erina; Totoki, Yasushi; Nakamura, Hiromi; Morizane, Chigusa; Nara, Satoshi; Hama, Natsuko; Suzuki, Masami; Furukawa, Eisaku; Kato, Mamoru; Hayashi, Hideyuki; Kohno, Takashi; Ueno, Hideki; Shimada, Kazuaki; Okusaka, Takuji; Nakagama, Hitoshi; Shibata, Tatsuhiro; Yachida, Shinichi

2015-12-16

Pancreatic ductal adenocarcinoma (PDAC) remains one of the most lethal malignancies. The genomic landscape of the PDAC genome features four frequently mutated genes (KRAS, CDKN2A, TP53, and SMAD4) and dozens of candidate driver genes altered at low frequency, including potential clinical targets. Circulating cell-free DNA (cfDNA) is a promising resource to detect and monitor molecular characteristics of tumors. In the present study, we determined the mutational status of KRAS in plasma cfDNA using multiplex picoliter-droplet digital PCR in 259 patients with PDAC. We constructed a novel modified SureSelect-KAPA-Illumina platform and an original panel of 60 genes. We then performed targeted deep sequencing of cfDNA and matched germline DNA samples in 48 patients who had ≥1% mutant allele frequencies of KRAS in plasma cfDNA. Importantly, potentially targetable somatic mutations were identified in 14 of 48 patients (29.2%) examined by targeted deep sequencing of cfDNA. We also analyzed somatic copy number alterations based on the targeted sequencing data using our in-house algorithm, and potentially targetable amplifications were detected. Assessment of mutations and copy number alterations in plasma cfDNA may provide a prognostic and diagnostic tool to assist decisions regarding optimal therapeutic strategies for PDAC patients.
High-throughput, pooled sequencing identifies mutations in NUBPL and FOXRED1 in human complex I deficiency

PubMed Central

Calvo, Sarah E; Tucker, Elena J; Compton, Alison G; Kirby, Denise M; Crawford, Gabriel; Burtt, Noel P; Rivas, Manuel A; Guiducci, Candace; Bruno, Damien L; Goldberger, Olga A; Redman, Michelle C; Wiltshire, Esko; Wilson, Callum J; Altshuler, David; Gabriel, Stacey B; Daly, Mark J; Thorburn, David R; Mootha, Vamsi K

2010-01-01

Discovering the molecular basis of mitochondrial respiratory chain disease is challenging given the large number of both mitochondrial and nuclear genes involved. We report a strategy of focused candidate gene prediction, high-throughput sequencing, and experimental validation to uncover the molecular basis of mitochondrial complex I (CI) disorders. We created five pools of DNA from a cohort of 103 patients and then performed deep sequencing of 103 candidate genes to spotlight 151 rare variants predicted to impact protein function. We used confirmatory experiments to establish genetic diagnoses in 22% of previously unsolved cases, and discovered that defects in NUBPL and FOXRED1 can cause CI deficiency. Our study illustrates how large-scale sequencing, coupled with functional prediction and experimental validation, can reveal novel disease-causing mutations in individual patients. PMID:20818383
Genome-Wide Characterization of RNA Editing in Chicken Embryos Reveals Common Features among Vertebrates.

PubMed

Frésard, Laure; Leroux, Sophie; Roux, Pierre-François; Klopp, Christophe; Fabre, Stéphane; Esquerré, Diane; Dehais, Patrice; Djari, Anis; Gourichon, David; Lagarrigue, Sandrine; Pitel, Frédérique

2015-01-01

RNA editing results in a post-transcriptional nucleotide change in the RNA sequence that creates an alternative nucleotide not present in the DNA sequence. This leads to a diversification of transcription products with potential functional consequences. Two nucleotide substitutions are mainly described in animals, from adenosine to inosine (A-to-I) and from cytidine to uridine (C-to-U). This phenomenon is described in more detail in mammals, notably since the availability of next generation sequencing technologies allowing whole genome screening of RNA-DNA differences. The number of studies recording RNA editing in other vertebrates like chicken is still limited. We chose to use high throughput sequencing technologies to search for RNA editing in chicken, and to extend the knowledge of its conservation among vertebrates. We performed sequencing of RNA and DNA from 8 embryos. Being aware of common pitfalls inherent to sequence analyses that lead to false positive discovery, we stringently filtered our datasets and found fewer than 40 reliable candidates. Conservation of particular sites of RNA editing was attested by the presence of 3 edited sites previously detected in mammals. We then characterized editing levels for selected candidates in several tissues and at different time points, from 4.5 days of embryonic development to adults, and observed a clear tissue-specificity and a gradual increase of editing level with time. By characterizing the RNA editing landscape in chicken, our results highlight the extent of evolutionary conservation of this phenomenon within vertebrates, attest to its tissue and stage specificity and support the absence of non A-to-I events from the chicken transcriptome.
Genome-Wide Characterization of RNA Editing in Chicken Embryos Reveals Common Features among Vertebrates

PubMed Central

Frésard, Laure; Leroux, Sophie; Roux, Pierre-François; Klopp, Christophe; Fabre, Stéphane; Esquerré, Diane; Dehais, Patrice; Djari, Anis; Gourichon, David

2015-01-01

RNA editing results in a post-transcriptional nucleotide change in the RNA sequence that creates an alternative nucleotide not present in the DNA sequence. This leads to a diversification of transcription products with potential functional consequences. Two nucleotide substitutions are mainly described in animals, from adenosine to inosine (A-to-I) and from cytidine to uridine (C-to-U). This phenomenon is described in more detail in mammals, notably since the availability of next generation sequencing technologies allowing whole genome screening of RNA-DNA differences. The number of studies recording RNA editing in other vertebrates like chicken is still limited. We chose to use high throughput sequencing technologies to search for RNA editing in chicken, and to extend the knowledge of its conservation among vertebrates. We performed sequencing of RNA and DNA from 8 embryos. Being aware of common pitfalls inherent to sequence analyses that lead to false positive discovery, we stringently filtered our datasets and found fewer than 40 reliable candidates. Conservation of particular sites of RNA editing was attested by the presence of 3 edited sites previously detected in mammals. We then characterized editing levels for selected candidates in several tissues and at different time points, from 4.5 days of embryonic development to adults, and observed a clear tissue-specificity and a gradual increase of editing level with time. By characterizing the RNA editing landscape in chicken, our results highlight the extent of evolutionary conservation of this phenomenon within vertebrates, attest to its tissue and stage specificity and support the absence of non A-to-I events from the chicken transcriptome. PMID:26024316
Identification of GATC- and CCGG- recognizing Type II REases and their putative specificity-determining positions using Scan2S—a novel motif scan algorithm with optional secondary structure constraints

PubMed Central

Niv, Masha Y.; Skrabanek, Lucy; Roberts, Richard J.; Scheraga, Harold A.; Weinstein, Harel

2008-01-01

Restriction endonucleases (REases) are DNA-cleaving enzymes that have become indispensable tools in molecular biology. Type II REases are highly divergent in sequence despite their common structural core, function and, in some cases, common specificities towards DNA sequences. This makes it difficult to identify and classify them functionally based on sequence, and has hampered the efforts of specificity-engineering. Here, we define novel REase sequence motifs, which extend beyond the PD-(D/E)XK hallmark, and incorporate secondary structure information. The automated search using these motifs is carried out with a newly developed fast regular expression matching algorithm that accommodates long patterns with optional secondary structure constraints. Using this new tool, named Scan2S, motifs derived from REases with specificity towards GATC- and CCGG-containing DNA sequences successfully identify REases of the same specificity. Notably, some of these sequences are not identified by standard sequence detection tools. The new motifs highlight potential specificity-determining positions that do not fully overlap for the GATC- and the CCGG-recognizing REases and are candidates for specificity re-engineering. PMID:17972284
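As a rough illustration of regular-expression motif scanning over protein sequences, the following minimal Python sketch searches for a PD-(D/E)XK-like pattern. It is not the Scan2S algorithm and does not model secondary structure constraints; the spacer length, the function name scan_motif and the toy sequence are all assumptions made for this example.

```python
import re

# Hypothetical pattern for illustration only: "PD", a variable spacer,
# then (D or E), any residue, K. The 5-30 residue spacer is an assumption,
# not a value taken from the abstract.
PD_DEXK = re.compile(r"PD.{5,30}?[DE].K")

def scan_motif(seq, pattern=PD_DEXK):
    """Return (start, matched_text) for each non-overlapping match in seq."""
    return [(m.start(), m.group()) for m in pattern.finditer(seq.upper())]

# Toy sequence (invented): PD, eight alanines, GG, then E-V-K.
toy = "MKTAPD" + "A" * 8 + "GGEVKLLL"
hits = scan_motif(toy)
print(hits)
```

A plain regular expression like this only captures the primary-sequence part of such a motif; any structural filter would have to be layered on top.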
Identification of GATC- and CCGG-recognizing Type II REases and their putative specificity-determining positions using Scan2S--a novel motif scan algorithm with optional secondary structure constraints.

PubMed

Niv, Masha Y; Skrabanek, Lucy; Roberts, Richard J; Scheraga, Harold A; Weinstein, Harel

2008-05-01

Restriction endonucleases (REases) are DNA-cleaving enzymes that have become indispensable tools in molecular biology. Type II REases are highly divergent in sequence despite their common structural core, function and, in some cases, common specificities towards DNA sequences. This makes it difficult to identify and classify them functionally based on sequence, and has hampered the efforts of specificity-engineering. Here, we define novel REase sequence motifs, which extend beyond the PD-(D/E)XK hallmark, and incorporate secondary structure information. The automated search using these motifs is carried out with a newly developed fast regular expression matching algorithm that accommodates long patterns with optional secondary structure constraints. Using this new tool, named Scan2S, motifs derived from REases with specificity towards GATC- and CCGG-containing DNA sequences successfully identify REases of the same specificity. Notably, some of these sequences are not identified by standard sequence detection tools. The new motifs highlight potential specificity-determining positions that do not fully overlap for the GATC- and the CCGG-recognizing REases and are candidates for specificity re-engineering.
Molecular characterization of the canine mitochondrial DNA control region for forensic applications.

PubMed

Eichmann, Cordula; Parson, Walther

2007-09-01

The canine mitochondrial DNA (mtDNA) control region of 133 dogs living in the area around Innsbruck, Austria was sequenced. A total of 40 polymorphic sites were observed in the first hypervariable segment and 15 in the second, which resulted in the differentiation of 40 distinct haplotypes. We observed five nucleotide positions that were highly polymorphic within different haplogroups, and they represent good candidates for mtDNA screening. We found five point heteroplasmic positions; all located in HVS-I and a polythymine region in HVS-II, the latter often being associated with length heteroplasmy. In contrast to human mtDNA, the canine control region contains a hypervariable 10 nucleotide repeat region, which is located between the two hypervariable regions. In our population sample, we observed eight different repeat types, which we characterized by direct sequencing and fragment length analysis. The discrimination power of the canine mtDNA control region was 0.93, not taking the polymorphic repeat region into consideration.
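The abstract does not state how the discrimination power was computed. One commonly used definition is one minus the sum of squared haplotype frequencies, and the minimal Python sketch below assumes that definition. The function name and the toy haplotype labels are hypothetical and are not data from the study.

```python
from collections import Counter

def discrimination_power(haplotypes):
    """Return 1 - sum(p_i^2), where p_i is the frequency of haplotype i.

    Assumes the simple (uncorrected) definition; some studies apply a
    sample-size correction instead.
    """
    counts = Counter(haplotypes)
    n = sum(counts.values())
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

# Toy sample of four individuals carrying three haplotypes (invented labels).
toy_dp = discrimination_power(["H1", "H1", "H2", "H3"])
print(toy_dp)
```

Under this definition the value approaches 1 as haplotypes become more numerous and more evenly distributed in the sample.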
p53 Specifically Binds Triplex DNA In Vitro and in Cells

PubMed Central

Brázdová, Marie; Tichý, Vlastimil; Helma, Robert; Bažantová, Pavla; Polášková, Alena; Krejčí, Aneta; Petr, Marek; Navrátilová, Lucie; Tichá, Olga; Nejedlý, Karel; Bennink, Martin L.; Subramaniam, Vinod; Bábková, Zuzana; Martínek, Tomáš; Lexa, Matej; Adámik, Matej

2016-01-01

Triplex DNA is implicated in a wide range of biological activities, including regulation of gene expression and genomic instability leading to cancer. The tumor suppressor p53 is a central regulator of cell fate in response to different types of insults. Sequence and structure specific modes of DNA recognition are core attributes of the p53 protein. The focus of this work is the structure-specific binding of p53 to DNA containing triplex-forming sequences in vitro and in cells and the effect on p53-driven transcription. This is the first DNA binding study of full-length p53 and its deletion variants to both intermolecular and intramolecular T.A.T triplexes. We demonstrate that the interaction of p53 with intermolecular T.A.T triplex is comparable to the recognition of CTG-hairpin non-B DNA structure. Using deletion mutants we determined the C-terminal DNA binding domain of p53 to be crucial for triplex recognition. Furthermore, strong p53 recognition of intramolecular T.A.T triplexes (H-DNA), stabilized by negative superhelicity in plasmid DNA, was detected by competition and immunoprecipitation experiments, and visualized by AFM. Moreover, chromatin immunoprecipitation revealed p53 binding T.A.T forming sequence in vivo. Enhanced reporter transactivation by p53 on insertion of triplex forming sequence into plasmid with p53 consensus sequence was observed by luciferase reporter assays. In-silico scan of human regulatory regions for the simultaneous presence of both consensus sequence and T.A.T motifs identified a set of candidate p53 target genes and p53-dependent activation of several of them (ABCG5, ENOX1, INSR, MCC, NFAT5) was confirmed by RT-qPCR. Our results show that T.A.T triplex comprises a new class of p53 binding sites targeted by p53 in a DNA structure-dependent mode in vitro and in cells. The contribution of p53 DNA structure-dependent binding to the regulation of transcription is discussed. PMID:27907175
DNA Barcoding of the Endangered Aquilaria (Thymelaeaceae) and Its Application in Species Authentication of Agarwood Products Traded in the Market

PubMed Central

Lee, Shiou Yih; Ng, Wei Lun; Mahat, Mohd Noor; Nazre, Mohd; Mohamed, Rozi

2016-01-01

The identification of Aquilaria species from their resinous non-wood product, the agarwood, is challenging as conventional techniques alone are unable to ascertain the species origin. Aquilaria is a highly protected species due to the excessive exploitation of its precious agarwood. Here, we applied the DNA barcoding technique to generate barcode sequences for Aquilaria species and later applied the barcodes to identify the source species of agarwood found in the market. We developed a reference DNA barcode library using eight candidate barcode loci (matK, rbcL, rpoB, rpoC1, psbA-trnH, trnL-trnF, ITS, and ITS2) amplified from 24 leaf accessions of seven Aquilaria species obtained from living trees. Our results indicated that all single barcodes can be easily amplified and sequenced with the selected primers. The combination of trnL-trnF+ITS and trnL-trnF+ITS2 yielded the greatest species resolution using the least number of loci combination, while matK+trnL-trnF+ITS showed potential in detecting the geographical origins of Aquilaria species. We propose trnL-trnF+ITS2 as the best candidate barcode for Aquilaria as ITS2 has a shorter sequence length compared to ITS, which eases PCR amplification especially when using degraded DNA samples such as those extracted from processed agarwood products. A blind test conducted on eight agarwood samples in different forms using the proposed barcode combination proved successful in their identification up to the species level. Such potential of DNA barcoding in identifying the source species of agarwood will contribute to the international timber trade control, by providing an effective method for species identification and product authentication. PMID:27128309
Isolation of candidate genes of Friedreich's ataxia on chromosome 9q13

DOE Office of Scientific and Technical Information (OSTI.GOV)

Montermini, L.; Zara, F.; Pandolfo, M.

1994-09-01

Friedreich's ataxia (FRDA) is an autosomal recessive degenerative disease involving the central and peripheral nervous system and the heart. The mutated gene in FRDA has recently been localized within a 450 Kb interval on chromosome 9q13 between the markers D9S202/FR1/FR8. We have been able to confirm such localization for the disease gene by analysis of extended haplotype in consanguineous families. Cases of loss of marker homozygosity, which are likely to be due to ancient recombinations, have been found to involve D9S110, D9S15, and D9S111 on the telomeric side, and FR5 on the centromeric side, while homozygosity was always found for a core haplotype including D9S5, FD1, and D9S202. We constructed a YAC contig spanning the region between the telomeric markers and FR5, and cosmids have been obtained from the YACs. In order to isolate transcribed sequences from the FRDA candidate region we are utilizing a combination of approaches, including hybridization of YACs and cosmids to an arrayed human heart cDNA library, cDNA direct selection, and exon amplification. A transcribed sequence near the telomeric end of the region has been isolated by cDNA direct selection using pooled cosmids as genomic template and primary human heart, muscle, brain, liver and placenta cDNAs as cDNA source. We have shown this sequence to be the human equivalent of ZO-2, a tight junction protein previously described in the dog. No mutations of this gene have been found in FRDA subjects. Additional cDNA have recently been isolated and they are currently being evaluated.
Nanofluidic Device with Embedded Nanopore

NASA Astrophysics Data System (ADS)

Zhang, Yuning; Reisner, Walter

2014-03-01

Nanofluidic-based devices are robust methods for biomolecular sensing and single DNA manipulation. Nanopore-based DNA sensing has attractive features that make it a leading candidate as a single-molecule DNA sequencing technology. Nanochannel-based extension of DNA, combined with enzymatic or denaturation-based barcoding schemes, is already a powerful approach for genome analysis. We believe that there is revolutionary potential in devices that combine nanochannels with nanopore detectors. In particular, due to the fast translocation of a DNA molecule through a standard nanopore configuration, there is an unfavorable trade-off between signal and sequence resolution. With a combined nanochannel-nanopore device, based on embedding a nanopore inside a nanochannel, we can in principle gain independent control over both DNA translocation speed and sensing signal, solving the key drawback of the standard nanopore configuration. We demonstrate that we can detect - using fluorescent microscopy - successful translocation of DNA from the nanochannel out through the nanopore, a possible method to 'select' a given barcode for further analysis. We also show that in equilibrium DNA will not escape through an embedded sub-persistence length nanopore until a certain voltage bias is added.
Genotyping of 25 leukemia-associated genes in a single work flow by next-generation sequencing technology with low amounts of input template DNA.

PubMed

Rinke, Jenny; Schäfer, Vivien; Schmidt, Mathias; Ziermann, Janine; Kohlmann, Alexander; Hochhaus, Andreas; Ernst, Thomas

2013-08-01

We sought to establish a convenient, sensitive next-generation sequencing (NGS) method for genotyping the 26 most commonly mutated leukemia-associated genes in a single work flow and to optimize this method for low amounts of input template DNA. We designed 184 PCR amplicons that cover all of the candidate genes. NGS was performed with genomic DNA (gDNA) from a cohort of 10 individuals with chronic myelomonocytic leukemia. The results were compared with NGS data obtained from sequencing of DNA generated by whole-genome amplification (WGA) of 20 ng template gDNA. Differences between gDNA and WGA samples in variant frequencies were determined for 2 different WGA kits. For gDNA samples, 25 of 26 genes were successfully sequenced with a sensitivity of 5%, which was achieved by a median coverage of 492 reads (range, 308-636 reads) per amplicon. We identified 24 distinct mutations in 11 genes. With WGA samples, we reliably detected all mutations above 5% sensitivity with a median coverage of 506 reads (range, 256-653 reads) per amplicon. With all variants included in the analysis, WGA amplification by the 2 kits tested yielded differences in variant frequencies that ranged from -28.19% to +9.94% [mean (SD) difference, -0.2% (4.08%)] and from -35.03% to +18.67% [mean difference, -0.75% (5.12%)]. Our method permits simultaneous analysis of a wide range of leukemia-associated target genes in a single sequencing run. NGS can be performed after WGA of template DNA for reliable detection of variants without introducing appreciable bias.
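To make the relationship between read coverage and a 5% detection threshold concrete, here is a minimal Python sketch. It assumes that variant frequency is simply the fraction of reads carrying the variant and that a call requires this fraction to reach the threshold; the abstract does not describe the actual calling procedure, and the function names and read counts are hypothetical.

```python
def variant_frequency(alt_reads, total_reads):
    """Fraction of reads at a position that carry the variant allele."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads

def is_detectable(alt_reads, total_reads, threshold=0.05):
    """True if the variant frequency reaches the assumed threshold (5% by default)."""
    return variant_frequency(alt_reads, total_reads) >= threshold

# At the reported median coverage of 492 reads, a 5% threshold corresponds
# to roughly 25 variant-carrying reads (0.05 * 492 = 24.6).
print(is_detectable(25, 492))
print(is_detectable(24, 492))
```

Lower coverage shrinks the number of supporting reads behind a call at the same threshold, which is one reason per-amplicon coverage is reported alongside sensitivity.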
A Novel Locomotion-based Validation Assay for Candidate Drugs Using Drosophila DYT1 Disease Model

DTIC Science & Technology

2013-11-01

the genome using the same parental fly line, minimizing the effect of surrounding sequences and genetic variations on the ...locomotion and GTP cyclohydrolase protein levels; (3) supplementation of dopamine can partially rescue the locomotion defects of Drosophila larvae...’-GCGAACAACCAAAAAATCATTGAGATAATAAACTCCTCCATTAG-3’) to make dtorsin cDNA that lacks GAC (D307) (Fig. 1) respectively. After confirming mutated sequences, the insert was again
Whole-Exome Sequencing to Decipher the Genetic Heterogeneity of Hearing Loss in a Chinese Family with Deaf by Deaf Mating

PubMed Central

Qing, Jie; Yan, Denise; Zhou, Yuan; Liu, Qiong; Wu, Weijing; Xiao, Zian; Liu, Yuyuan; Liu, Jia; Du, Lilin; Xie, Dinghua; Liu, Xue Zhong

2014-01-01

Inherited deafness has been shown to have high genetic heterogeneity. For many decades, linkage analysis and candidate gene approaches have been the main tools to elucidate the genetics of hearing loss. However, this associated study design is costly, time-consuming, and unsuitable for small families. This is mainly due to the inadequate numbers of available affected individuals, locus heterogeneity, and assortative mating. Exome sequencing has now become technically feasible and a cost-effective method for detection of disease variants underlying Mendelian disorders due to the recent advances in next-generation sequencing (NGS) technologies. In the present study, we have combined both the Deafness Gene Mutation Detection Array and exome sequencing to identify deafness causative variants in a large Chinese composite family with deaf by deaf mating. The simultaneous screening of the 9 common deafness mutations using the allele-specific PCR based universal array, resulted in the identification of the 1555A>G in the mitochondrial DNA (mtDNA) 12S rRNA in affected individuals in one branch of the family. We then subjected the mutation-negative cases to exome sequencing and identified novel causative variants in the MYH14 and WFS1 genes. This report confirms the effective use of a NGS technique to detect pathogenic mutations in affected individuals who were not candidates for classical genetic studies. PMID:25289672
Investigation of the mechanism of meiotic DNA cleavage by VMA1-derived endonuclease uncovers a meiotic alteration in chromatin structure around the target site.

PubMed

Fukuda, Tomoyuki; Ohta, Kunihiro; Ohya, Yoshikazu

2006-06-01

VMA1-derived endonuclease (VDE), a homing endonuclease in Saccharomyces cerevisiae, is encoded by the mobile intein-coding sequence within the nuclear VMA1 gene. VDE recognizes and cleaves DNA at the 31-bp VDE recognition sequence (VRS) in the VMA1 gene lacking the intein-coding sequence during meiosis to insert a copy of the intein-coding sequence at the cleaved site. The mechanism underlying the meiosis specificity of VMA1 intein-coding sequence homing remains unclear. We studied various factors that might influence the cleavage activity in vivo and found that VDE binding to the VRS can be detected only when DNA cleavage by VDE takes place, implying that meiosis-specific DNA cleavage is regulated by the accessibility of VDE to its target site. As a possible candidate for the determinant of this accessibility, we analyzed chromatin structure around the VRS and revealed that local chromatin structure near the VRS is altered during meiosis. Although the meiotic chromatin alteration exhibits correlations with DNA binding and cleavage by VDE at the VMA1 locus, such a chromatin alteration is not necessarily observed when the VRS is embedded in ectopic gene loci. This suggests that nucleosome positioning or occupancy around the VRS by itself is not the sole mechanism for the regulation of meiosis-specific DNA cleavage by VDE and that other mechanisms are involved in the regulation.
Investigation of the Mechanism of Meiotic DNA Cleavage by VMA1-Derived Endonuclease Uncovers a Meiotic Alteration in Chromatin Structure around the Target Site

PubMed Central

Fukuda, Tomoyuki; Ohta, Kunihiro; Ohya, Yoshikazu

2006-01-01

VMA1-derived endonuclease (VDE), a homing endonuclease in Saccharomyces cerevisiae, is encoded by the mobile intein-coding sequence within the nuclear VMA1 gene. VDE recognizes and cleaves DNA at the 31-bp VDE recognition sequence (VRS) in the VMA1 gene lacking the intein-coding sequence during meiosis to insert a copy of the intein-coding sequence at the cleaved site. The mechanism underlying the meiosis specificity of VMA1 intein-coding sequence homing remains unclear. We studied various factors that might influence the cleavage activity in vivo and found that VDE binding to the VRS can be detected only when DNA cleavage by VDE takes place, implying that meiosis-specific DNA cleavage is regulated by the accessibility of VDE to its target site. As a possible candidate for the determinant of this accessibility, we analyzed chromatin structure around the VRS and revealed that local chromatin structure near the VRS is altered during meiosis. Although the meiotic chromatin alteration exhibits correlations with DNA binding and cleavage by VDE at the VMA1 locus, such a chromatin alteration is not necessarily observed when the VRS is embedded in ectopic gene loci. This suggests that nucleosome positioning or occupancy around the VRS by itself is not the sole mechanism for the regulation of meiosis-specific DNA cleavage by VDE and that other mechanisms are involved in the regulation. PMID:16757746

Proteome-wide Identification of Novel Ceramide-binding Proteins by Yeast Surface cDNA Display and Deep Sequencing.

PubMed

Bidlingmaier, Scott; Ha, Kevin; Lee, Nam-Kyung; Su, Yang; Liu, Bin

2016-04-01

Although the bioactive sphingolipid ceramide is an important cell signaling molecule, relatively few direct ceramide-interacting proteins are known. We used an approach combining yeast surface cDNA display and deep sequencing technology to identify novel proteins binding directly to ceramide. We identified 234 candidate ceramide-binding protein fragments and validated binding for 20. Most (17) bound selectively to ceramide, although a few (3) bound to other lipids as well. Several novel ceramide-binding domains were discovered, including the EF-hand calcium-binding motif, the heat shock chaperonin-binding motif STI1, the SCP2 sterol-binding domain, and the tetratricopeptide repeat region motif. Interestingly, four of the verified ceramide-binding proteins (HPCA, HPCAL1, NCS1, and VSNL1) and an additional three candidate ceramide-binding proteins (NCALD, HPCAL4, and KCNIP3) belong to the neuronal calcium sensor family of EF hand-containing proteins. We used mutagenesis to map the ceramide-binding site in HPCA and to create a mutant HPCA that does not bind to ceramide. We demonstrated selective binding to ceramide by mammalian cell-produced wild type but not mutant HPCA. Intriguingly, we also identified a fragment from prostaglandin D2 synthase that binds preferentially to ceramide 1-phosphate. The wide variety of proteins and domains capable of binding to ceramide suggests that many of the signaling functions of ceramide may be regulated by direct binding to these proteins. Based on the deep sequencing data, we estimate that our yeast surface cDNA display library covers ∼60% of the human proteome and our selection/deep sequencing protocol can identify target-interacting protein fragments that are present at extremely low frequency in the starting library. Thus, the yeast surface cDNA display/deep sequencing approach is a rapid, comprehensive, and flexible method for the analysis of protein-ligand interactions, particularly for the study of non-protein ligands.
© 2016 by The American Society for Biochemistry and Molecular Biology, Inc.
The Comprehensive Phytopathogen Genomics Resource: a web-based resource for data-mining plant pathogen genomes.

PubMed

Hamilton, John P; Neeno-Eckwall, Eric C; Adhikari, Bishwo N; Perna, Nicole T; Tisserat, Ned; Leach, Jan E; Lévesque, C André; Buell, C Robin

2011-01-01

The Comprehensive Phytopathogen Genomics Resource (CPGR) provides a web-based portal for plant pathologists and diagnosticians to view the genome and transcriptome sequence status of 806 bacterial, fungal, oomycete, nematode, viral and viroid plant pathogens. Tools are available to search and analyze annotated genome sequences of 74 bacterial, fungal and oomycete pathogens. Oomycete and fungal genomes are obtained directly from GenBank, whereas bacterial genome sequences are downloaded from the ASAP (A Systematic Annotation Package) database, which provides curation of genomes using comparative approaches. Curated lists of bacterial genes relevant to pathogenicity and avirulence are also provided. The Plant Pathogen Transcript Assemblies Database provides annotated assemblies of the transcribed regions of 82 eukaryotic genomes from publicly available single pass Expressed Sequence Tags. Data-mining tools are provided along with tools to create candidate diagnostic markers, an emerging use for genomic sequence data in plant pathology. The Plant Pathogen Ribosomal DNA (rDNA) database is a resource for pathogens that lack genome or transcriptome data sets and contains 131 755 rDNA sequences from GenBank for 17 613 species identified as plant pathogens and related genera. Database URL: http://cpgr.plantbiology.msu.edu.
An efficient approach to finding Siraitia grosvenorii triterpene biosynthetic genes by RNA-seq and digital gene expression analysis.

PubMed

Tang, Qi; Ma, Xiaojun; Mo, Changming; Wilson, Iain W; Song, Cai; Zhao, Huan; Yang, Yanfang; Fu, Wei; Qiu, Deyou

2011-07-05

Siraitia grosvenorii (Luohanguo) is an herbaceous perennial plant native to southern China and most prevalent in Guilin city. Its fruit contains a sweet, fleshy, edible pulp that is widely used in traditional Chinese medicine. The major bioactive constituents in the fruit extract are the cucurbitane-type triterpene saponins known as mogrosides. Among them, mogroside V is nearly 300 times sweeter than sucrose. However, little is known about mogroside biosynthesis in S. grosvenorii, especially the late steps of the pathway. In this study, a cDNA library generated from equal amounts of RNA taken from S. grosvenorii fruit at 50 days after flowering (DAF) and 70 DAF was sequenced using the Illumina/Solexa platform. More than 48,755,516 high-quality reads were generated from the cDNA library. De novo assembly and gap-filling generated 43,891 unigenes with an average sequence length of 668 base pairs. A total of 26,308 (59.9%) unique sequences were annotated and 11,476 of the unique sequences were assigned to specific metabolic pathways by the Kyoto Encyclopedia of Genes and Genomes. cDNA sequences for all of the known enzymes involved in mogroside backbone synthesis were identified from our library. Additionally, a total of eighty-five cytochrome P450 (CYP450) and ninety UDP-glucosyltransferase (UDPG) unigenes were identified, some of which appear to encode enzymes responsible for the conversion of the mogroside backbone into the various mogrosides. Digital gene expression profile (DGE) analysis using Solexa sequencing was performed on three important stages of fruit development, and based on their expression pattern, seven CYP450s and five UDPGs were selected as the candidates most likely to be involved in mogroside biosynthesis. 
A combination of RNA-seq and DGE analysis based on the next generation sequencing technology was shown to be a powerful method for identifying candidate genes encoding enzymes responsible for the biosynthesis of novel secondary metabolites in a non-model plant. Seven CYP450s and five UDPGs were selected as potential candidates involved in mogrosides biosynthesis. The transcriptome data from this study provides an important resource for understanding the formation of major bioactive constituents in the fruit extract from S. grosvenorii.
Origins of domestication and polyploidy in oca (Oxalis tuberosa : Oxalidaceae): nrDNA ITS data.

PubMed

Emshwiller, E; Doyle, J

1998-07-01

As part of a study aimed at elucidating the origins of the octoploid tuber crop "oca," Oxalis tuberosa, DNA sequences of the internal transcribed spacer of nuclear ribosomal DNA (nrDNA ITS) were determined for oca and several wild Oxalis species, mostly from Bolivia. Phylogenetic analysis of these data supports a group of these species as being close relatives of oca, in agreement with morphology and cytology, but at odds with traditional infrageneric taxonomy. Variation in ITS sequences within this group is quite low (0-7 substitutions in the entire ITS region), contrasting with the highly divergent (unalignable in some cases) sequences within the genus overall. Some groups of morphologically differentiated species were found to have identical sequences, notably a group that includes oca, wild populations of Oxalis that bear small tubers, and several other clearly distinct species. The presence of a second, minor sequence type in at least some oca accessions suggests a possible contribution from a second genome donor, also from within this same species group. ITS data lack sufficient variation to elucidate the origins of oca precisely, but have identified a pool of candidate species and so can be used as a tool to screen yet unsampled species for possible progenitors.
Germline transformation of the butterfly Bicyclus anynana.

PubMed

Marcus, Jeffrey M; Ramos, Diane M; Monteiro, Antónia

2004-08-07

Ecological and evolutionary theory has frequently been inspired by the diversity of colour patterns on the wings of butterflies. More recently, these varied patterns have also become model systems for studying the evolution of developmental mechanisms. A technique that will facilitate our understanding of butterfly colour-pattern development is germline transformation. Germline transformation permits functional tests of candidate gene products and of cis-regulatory regions, and provides a means of generating new colour-pattern mutants by insertional mutagenesis. We report the successful transformation of the African satyrid butterfly Bicyclus anynana with two different transposable element vectors, Hermes and piggyBac, each carrying EGFP coding sequences driven by the 3XP3 synthetic enhancer that drives gene expression in the eyes. Candidate lines identified by screening for EGFP in adult eyes were later confirmed by PCR amplification of a fragment of the EGFP coding sequence from genomic DNA. Flanking DNA surrounding the insertions was amplified by inverse PCR and sequenced. Transformation rates were 5% for piggyBac and 10.2% for Hermes. Ultimately, the new data generated by these techniques may permit an integrated understanding of the developmental genetics of colour-pattern formation and of the ecological and evolutionary processes in which these patterns play a role.
Current sequencing technology makes microhaplotypes a powerful new type of genetic marker for forensics.

PubMed

Kidd, Kenneth K; Pakstis, Andrew J; Speed, William C; Lagacé, Robert; Chang, Joseph; Wootton, Sharon; Haigh, Eva; Kidd, Judith R

2014-09-01

SNPs that are molecularly very close (<10kb) will generally have extremely low recombination rates, much less than 10^-4. Multiple haplotypes will often exist because of the history of the origins of the variants at the different sites, rare recombinants, and the vagaries of random genetic drift and/or selection. Such multiallelic haplotype loci are potentially important in forensic work for individual identification, for defining ancestry, and for identifying familial relationships. The new DNA sequencing capabilities currently available make possible continuous runs of a few hundred base pairs so that we can now determine the allelic combination of multiple SNPs on each chromosome of an individual, i.e., the phase, for multiple SNPs within a small segment of DNA. Therefore, we have begun to identify regions, encompassing two to four SNPs with an extent of <200bp that define multiallelic haplotype loci. We have identified candidate regions and have collected pilot data on many candidate microhaplotype loci. Here we present 31 microhaplotype loci that have at least three alleles, have high heterozygosity, are globally informative, and are statistically independent at the population level. This study of microhaplotype loci (microhaps) provides proof of principle that such markers exist and validates their usefulness for ancestry inference, lineage-clan-family inference, and individual identification. The true value of microhaplotypes will come with sequencing methods that can establish alleles unambiguously, including disentangling of mixtures, because a single sequencing run on a single strand of DNA will encompass all of the SNPs. Copyright © 2014 The Authors. Published by Elsevier Ireland Ltd. All rights reserved.
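The selection criteria named in this abstract (at least three alleles, high heterozygosity) can be illustrated with a minimal sketch. It assumes phased haplotypes are already available as strings and uses the standard expected-heterozygosity formula 1 - sum(p_i^2); the function name and the sample data are hypothetical and are not taken from the study.

```python
from collections import Counter

def expected_heterozygosity(haplotypes):
    """Expected heterozygosity, 1 - sum(p_i^2), over observed haplotype alleles."""
    counts = Counter(haplotypes)
    n = sum(counts.values())
    return 1.0 - sum((c / n) ** 2 for c in counts.values())

# Hypothetical phased haplotypes at one three-SNP microhaplotype locus
# (illustrative values only, not data from the abstract).
sample = ["ACG", "ACG", "ATG", "GTG", "ATG", "GTG", "ACG", "GTG"]

h = expected_heterozygosity(sample)
n_alleles = len(set(sample))

# A locus with three or more alleles meets the abstract's allele-count criterion.
is_multiallelic = n_alleles >= 3
print(f"alleles={n_alleles}, expected heterozygosity={h:.3f}")
```

A locus with more alleles at more even frequencies yields a higher value, which is why multiallelic haplotype loci can be more informative than single biallelic SNPs.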
Novel division level bacterial diversity in a Yellowstone hot spring.

PubMed

Hugenholtz, P; Pitulle, C; Hershberger, K L; Pace, N R

1998-01-01

A culture-independent molecular phylogenetic survey was carried out for the bacterial community in Obsidian Pool (OP), a Yellowstone National Park hot spring previously shown to contain remarkable archaeal diversity (S. M. Barns, R. E. Fundyga, M. W. Jeffries, and N. R. Pace, Proc. Natl. Acad. Sci. USA 91:1609-1613, 1994). Small-subunit rRNA genes (rDNA) were amplified directly from OP sediment DNA by PCR with universally conserved or Bacteria-specific rDNA primers and cloned. Unique rDNA types among >300 clones were identified by restriction fragment length polymorphism, and 122 representative rDNA sequences were determined. These were found to represent 54 distinct bacterial sequence types or clusters (≥98% identity) of sequences. A majority (70%) of the sequence types were affiliated with 14 previously recognized bacterial divisions (main phyla; kingdoms); 30% were unaffiliated with recognized bacterial divisions. The unaffiliated sequence types (represented by 38 sequences) nominally comprise 12 novel, division level lineages termed candidate divisions. Several OP sequences were nearly identical to those of cultivated chemolithotrophic thermophiles, including the hydrogen-oxidizing Calderobacterium and the sulfate reducers Thermodesulfovibrio and Thermodesulfobacterium, or belonged to monophyletic assemblages recognized for a particular type of metabolism, such as the hydrogen-oxidizing Aquificales and the sulfate-reducing delta-Proteobacteria. The occurrence of such organisms is consistent with the chemical composition of OP (high in reduced iron and sulfur) and suggests a lithotrophic base for primary productivity in this hot spring, through hydrogen oxidation and sulfate reduction. Unexpectedly, no archaeal sequences were encountered in OP clone libraries made with universal primers. Hybridization analysis of amplified OP DNA with domain-specific probes confirmed that the analyzed community rDNA from OP sediment was predominantly bacterial. 
These results expand substantially our knowledge of the extent of bacterial diversity and call into question the commonly held notion that Archaea dominate hydrothermal environments. Finally, the currently known extent of division level bacterial phylogenetic diversity is collated and summarized.
Mapping DNA methylation by transverse current sequencing: Reduction of noise from neighboring nucleotides

NASA Astrophysics Data System (ADS)

Alvarez, Jose; Massey, Steven; Kalitsov, Alan; Velev, Julian

Nanopore sequencing via transverse current has emerged as a competitive candidate for mapping DNA methylation without the need for bisulfite treatment, fluorescent tags, or PCR amplification. By eliminating the error-producing amplification step, long read lengths become feasible, which greatly simplifies the assembly process and reduces the time and the cost inherent in current technologies. However, due to the large error rates of nanopore sequencing, single base resolution has not been reached. A very important source of noise is the intrinsic structural noise in the electric signature of the nucleotide arising from the influence of neighboring nucleotides. In this work we perform calculations of the tunneling current through DNA molecules in nanopores using the non-equilibrium electron transport method within an effective multi-orbital tight-binding model derived from first-principles calculations. We develop a base-calling algorithm accounting for the correlations of the current through neighboring bases, which in principle can reduce the error rate below any desired precision. Using this method we show that we can clearly distinguish DNA methylation and other base modifications based on the reading of the tunneling current.
Mechanistically Distinct Pathways of Divergent Regulatory DNA Creation Contribute to Evolution of Human-Specific Genomic Regulatory Networks Driving Phenotypic Divergence of Homo sapiens.

PubMed

Glinsky, Gennadi V

2016-09-19

Thousands of candidate human-specific regulatory sequences (HSRS) have been identified, supporting the hypothesis that unique to human phenotypes result from human-specific alterations of genomic regulatory networks. Collectively, a compendium of multiple diverse families of HSRS that are functionally and structurally divergent from Great Apes could be defined as the backbone of human-specific genomic regulatory networks. Here, the conservation patterns analysis of 18,364 candidate HSRS was carried out requiring that 100% of bases must remap during the alignments of human, chimpanzee, and bonobo sequences. A total of 5,535 candidate HSRS were identified that are: (i) highly conserved in Great Apes; (ii) evolved by the exaptation of highly conserved ancestral DNA; (iii) defined by either the acceleration of mutation rates on the human lineage or the functional divergence from non-human primates. The exaptation of highly conserved ancestral DNA pathway seems mechanistically distinct from the evolution of regulatory DNA segments driven by the species-specific expansion of transposable elements. Genome-wide proximity placement analysis of HSRS revealed that a small fraction of topologically associating domains (TADs) contain more than half of HSRS from four distinct families. TADs that are enriched for HSRS and termed rapidly evolving in humans TADs (revTADs) comprise 0.8-10.3% of 3,127 TADs in the hESC genome. RevTADs manifest distinct correlation patterns between placements of human accelerated regions, human-specific transcription factor-binding sites, and recombination rates. There is a significant enrichment within revTAD boundaries of hESC-enhancers, primate-specific CTCF-binding sites, human-specific RNAPII-binding sites, hCONDELs, and H3K4me3 peaks with human-specific enrichment at TSS in prefrontal cortex neurons (P < 0.0001 in all instances). 
Present analysis supports the idea that phenotypic divergence of Homo sapiens is driven by the evolution of human-specific genomic regulatory networks via at least two mechanistically distinct pathways of creation of divergent sequences of regulatory DNA: (i) recombination-associated exaptation of the highly conserved ancestral regulatory DNA segments; (ii) human-specific insertions of transposable elements. © The Author 2016. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
On the path to genetic novelties: insights from programmed DNA elimination and RNA splicing.

PubMed

Catania, Francesco; Schmitz, Jürgen

2015-01-01

Understanding how genetic novelties arise is a central goal of evolutionary biology. To this end, programmed DNA elimination and RNA splicing deserve special consideration. While programmed DNA elimination reshapes genomes by eliminating chromatin during organismal development, RNA splicing rearranges genetic messages by removing intronic regions during transcription. Small RNAs help to mediate this class of sequence reorganization, which is not error-free. It is this imperfection that makes programmed DNA elimination and RNA splicing excellent candidates for generating evolutionary novelties. Leveraging a number of these two processes' mechanistic and evolutionary properties, which have been uncovered over the past years, we present recently proposed models and empirical evidence for how splicing can shape the structure of protein-coding genes in eukaryotes. We also chronicle a number of intriguing similarities between the processes of programmed DNA elimination and RNA splicing, and highlight the role that the variation in the population-genetic environment may play in shaping their target sequences. © 2015 Wiley Periodicals, Inc.
Canis mtDNA HV1 database: a web-based tool for collecting and surveying Canis mtDNA HV1 haplotype in public database.

PubMed

Thai, Quan Ke; Chung, Dung Anh; Tran, Hoang-Dung

2017-06-26

Canine and wolf mitochondrial DNA haplotypes, which can be used for forensic or phylogenetic analyses, have been defined in various schemes depending on the region analyzed. In recent studies, the 582 bp fragment of the HV1 region is most commonly used. 317 different canine HV1 haplotypes have been reported in the rapidly growing public database GenBank. These reported haplotypes contain several inconsistencies in their haplotype information. To overcome this issue, we have developed a Canis mtDNA HV1 database. This database collects data on the HV1 582 bp region in dog mitochondrial DNA from GenBank to screen and correct the inconsistencies. It also supports users in the detection of novel mutation profiles and the assignment of new haplotypes. The Canis mtDNA HV1 database (CHD) contains 5567 nucleotide entries originating from 15 subspecies in the species Canis lupus. Of these entries, 3646 were haplotypes and grouped into 804 distinct sequences. 319 sequences were recognized as previously assigned haplotypes, while the remaining 485 sequences had new mutation profiles and were marked as new haplotype candidates awaiting further analysis for haplotype assignment. Of the 3646 nucleotide entries, only 414 were annotated with correct haplotype information, while 3232 had insufficient or lacked haplotype information and were corrected or modified before storing in the CHD. The CHD can be accessed at http://chd.vnbiology.com . It provides sequences, haplotype information, and a web-based tool for mtDNA HV1 haplotyping. The CHD is updated monthly and supplies all data for download. The Canis mtDNA HV1 database contains information about canine mitochondrial DNA HV1 sequences with reconciled annotation. It serves as a tool for detection of inconsistencies in GenBank and helps identify new HV1 haplotypes. Thus, it supports the scientific community in naming new HV1 haplotypes and in reconciling existing annotation of HV1 582 bp sequences.
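The grouping step described in this abstract (entries collapsed into distinct sequences, then split into previously assigned haplotypes and new candidates) can be sketched as follows. This is an illustrative sketch only, not the CHD's actual pipeline: the function name, accession labels, haplotype name, and toy sequences are hypothetical, and real entries would be 582 bp HV1 fragments.

```python
from collections import defaultdict

def group_haplotypes(entries, known):
    """Collapse entries into distinct sequences and split by known status.

    entries: iterable of (accession, sequence) pairs.
    known:   dict mapping a sequence to its already assigned haplotype name.
    Returns (assigned, candidates): haplotype name -> accessions, and
    unmatched sequence -> accessions.
    """
    groups = defaultdict(list)
    for accession, seq in entries:
        groups[seq.upper()].append(accession)  # identical sequences share a group
    assigned = {known[s]: accs for s, accs in groups.items() if s in known}
    candidates = {s: accs for s, accs in groups.items() if s not in known}
    return assigned, candidates

# Hypothetical toy data; labels and sequences are placeholders.
entries = [("X1", "acgt"), ("X2", "ACGT"), ("X3", "ACGA")]
known = {"ACGT": "A11"}

assigned, candidates = group_haplotypes(entries, known)
print(assigned)
print(candidates)
```

Exact string matching is the simplest possible criterion; a production tool would also have to handle alignment, ambiguous bases, and partial sequences.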
Using NMR and molecular dynamics to link structure and dynamics effects of the universal base 8-aza, 7-deaza, N8 linked adenosine analog

PubMed Central

Spring-Connell, Alexander M.; Evich, Marina G.; Debelak, Harald; Seela, Frank; Germann, Markus W.

2016-01-01

A truly universal nucleobase enables a host of novel applications such as simplified templates for PCR primers, randomized sequencing and DNA based devices. A universal base must pair indiscriminately to each of the canonical bases with little or preferably no destabilization of the overall duplex. In reality, many candidates either destabilize the duplex or do not base pair indiscriminately. The novel base 8-aza-7-deazaadenine (pyrazolo[3,4-d]pyrimidin-4-amine) N8-(2′-deoxyribonucleoside), a deoxyadenosine analog (UB), pairs with each of the natural DNA bases with little sequence preference. We have utilized NMR complemented with molecular dynamic calculations to characterize the structure and dynamics of a UB incorporated into a DNA duplex. The UB participates in base stacking with little to no perturbation of the local structure yet forms an unusual base pair that samples multiple conformations. These local dynamics result in the complete disappearance of a single UB proton resonance under native conditions. Accommodation of the UB is additionally stabilized via heightened backbone conformational sampling. NMR combined with various computational techniques has allowed for a comprehensive characterization of both structural and dynamic effects of the UB in a DNA duplex and underlines the UB as a strong candidate for universal base applications. PMID:27566150
The intrinsic combinatorial organization and information theoretic content of a sequence are correlated to the DNA encoded nucleosome organization of eukaryotic genomes.

PubMed

Utro, Filippo; Di Benedetto, Valeria; Corona, Davide F V; Giancarlo, Raffaele

2016-03-15

Thanks to research spanning nearly 30 years, two major models have emerged that account for nucleosome organization in chromatin: statistical and sequence specific. The first is based on elegant, easy to compute, closed-form mathematical formulas that make no assumptions of the physical and chemical properties of the underlying DNA sequence. Moreover, they need no training on the data for their computation. The latter is based on some sequence regularities but, as opposed to the statistical model, it lacks the same type of closed-form formulas that, in this case, should be based on the DNA sequence only. We contribute to closing this important methodological gap between the two models by providing three very simple formulas for the sequence specific one. They are all based on well-known formulas in Computer Science and Bioinformatics, and they give different quantifications of how complex a sequence is. In view of how remarkably well they perform, it is very surprising that measures of sequence complexity have not even been considered as candidates to close the mentioned gap. We provide experimental evidence that the intrinsic level of combinatorial organization and information-theoretic content of subsequences within a genome are strongly correlated to the level of DNA encoded nucleosome organization discovered by Kaplan et al. Our results establish an important connection between the intrinsic complexity of subsequences in a genome and the intrinsic, i.e. DNA encoded, nucleosome organization of eukaryotic genomes. It is a first step towards a mathematical characterization of this latter 'encoding'. Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press. All rights reserved. For Permissions, please e-mail: journals.permissions@oup.com.
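The abstract does not state its three formulas, so the following is only a generic illustration of what a closed-form, training-free sequence complexity measure can look like. It computes the Shannon entropy of the k-mer distribution of a sequence; the choice of measure, the function name, and the example sequences are assumptions, and this should not be read as one of the authors' formulas.

```python
import math
from collections import Counter

def kmer_entropy(seq, k=2):
    """Shannon entropy (bits) of the overlapping k-mer distribution of seq."""
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    counts = Counter(kmers)
    total = len(kmers)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# Hypothetical sequences: a homopolymer and a more varied stretch.
low = kmer_entropy("AAAAAAAAAA")
high = kmer_entropy("ACGTTGCAAG")

print(f"homopolymer: {low:.3f} bits, varied sequence: {high:.3f} bits")
```

A measure of this kind depends on the DNA sequence alone and needs no fitted parameters, which is the property the abstract emphasizes for closed-form formulas.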
Bacterial community composition in different sediments from the Eastern Mediterranean Sea: a comparison of four 16S ribosomal DNA clone libraries.

PubMed

Polymenakou, Paraskevi N; Bertilsson, Stefan; Tselepides, Anastasios; Stephanou, Euripides G

2005-10-01

The regional variability of sediment bacterial community composition and diversity was studied by comparative analysis of four large 16S ribosomal DNA (rDNA) clone libraries from sediments in different regions of the Eastern Mediterranean Sea (Thermaikos Gulf, Cretan Sea, and South Ionian Sea). Amplified rDNA restriction analysis of 664 clones from the libraries indicate that the rDNA richness and evenness was high: for example, a near-1:1 relationship among screened clones and number of unique restriction patterns when up to 190 clones were screened for each library. Phylogenetic analysis of 207 bacterial 16S rDNA sequences from the sediment libraries demonstrated that Gamma-, Delta-, and Alphaproteobacteria, Holophaga/Acidobacteria, Planctomycetales, Actinobacteria, Bacteroidetes, and Verrucomicrobia were represented in all four libraries. A few clones also grouped with the Betaproteobacteria, Nitrospirae, Spirochaetales, Chlamydiae, Firmicutes, and candidate division OP11. The abundance of sequences affiliated with Gammaproteobacteria was higher in libraries from shallow sediments in the Thermaikos Gulf (30 m) and the Cretan Sea (100 m) compared to the deeper South Ionian station (2790 m). Most sequences in the four sediment libraries clustered with uncultured 16S rDNA phylotypes from marine habitats, and many of the closest matches were clones from hydrocarbon seeps, benzene-mineralizing consortia, sulfate reducers, sulfur oxidizers, and ammonia oxidizers. LIBSHUFF statistics of 16S rDNA gene sequences from the four libraries revealed major differences, indicating either a very high richness in the sediment bacterial communities or considerable variability in bacterial community composition among regions, or both.
Identification of an expressed gene in Dipylidium caninum.

PubMed

Miranda, Rodrigo R C; Costa-Júnior, Livio M; Campos, Artur K; Santos, Hudson A; Rabelo, Elida M L

2004-10-01

Recombinant DNA studies have been focused on developing vaccines to different cestodes. But few studies involving Dipylidium caninum molecular biology and genes have been done. Only partial sequences of mitochondrial DNA and ribosomal RNA gene are available in databases. Any molecular work with this parasite, including epidemiology, study of drug-resistant strains, and vaccine development, is hampered by the lack of knowledge of its genome. Thus, the knowledge of specific genes of different developmental stages of D. caninum is crucial to locate potential targets to be used as candidates to develop a vaccine and/or new drugs against this parasite. Here we report, for the first time, the sequencing of a fragment of a D. caninum expressed gene.
Jonquetella anthropi gen. nov., sp. nov., the first member of the candidate phylum 'Synergistetes' isolated from man.

PubMed

Jumas-Bilak, Estelle; Carlier, Jean-Philippe; Jean-Pierre, Hélène; Citron, Diane; Bernard, Kathryn; Damay, Audrey; Gay, Bernard; Teyssier, Corinne; Campos, Josiane; Marchandin, Hélène

2007-12-01

Six clinical isolates of a hitherto unknown, strictly anaerobic, Gram-negative rod showing fastidious growth were subjected to a polyphasic taxonomic study, including phenotypic, genomic and phylogenetic feature analyses. 16S rRNA gene sequence-based phylogeny revealed that the novel strains represent a homogeneous group distant from any recognized species in the candidate phylum 'Synergistetes'. The novel isolates were most closely related to species of the genus Dethiosulfovibrio, with 88.2-88.7 % 16S rRNA gene sequence similarity. Large-scale chromosome structure and DNA G+C content also differentiated the novel strains from members of the genus Dethiosulfovibrio. The novel strains were asaccharolytic. Major metabolic end products in trypticase/glucose/yeast extract broth were acetic, lactic, succinic and isovaleric acids and the major cellular fatty acids were iso-C15:0 and C16:0. Based on the data presented here, a new genus, Jonquetella gen. nov., is proposed with one novel species, Jonquetella anthropi sp. nov. J. anthropi is the first characterized species of the candidate phylum 'Synergistetes' that includes human isolates. The G+C content of the DNA of the type strain of J. anthropi ADV 126(T) (=AIP 136.05(T)=CIP 109408(T)=CCUG 53819(T)) is 59.4 mol%.
Synthesis and Properties of Size-expanded DNAs: Toward Designed, Functional Genetic Systems

PubMed Central

Krueger, Andrew T.; Lu, Haige; Lee, Alex H. F.; Kool, Eric T.

2008-01-01

We describe the design, synthesis, and properties of DNA-like molecules in which the base pairs are expanded by benzo homologation. The resulting size-expanded genetic helices are called xDNA (“expanded DNA”) and yDNA (“wide DNA”). The large component bases are fluorescent, and they display high stacking affinity. When singly substituted into natural DNA, they are destabilizing because the benzo-expanded base pair size is too large for the natural helix. However, when all base pairs are expanded, xDNA and yDNA form highly stable, sequence-selective double helices. The size-expanded DNAs are candidates for components of new, functioning genetic systems. In addition, the fluorescence of expanded DNA bases makes them potentially useful in probing nucleic acids. PMID:17309194
Identification of Two Candidate Tumor Suppressor Genes on Chromosome 17p13.3: Assessment of Their Roles in Breast and Ovarian Carcinogenesis

DTIC Science & Technology

1997-07-01

minimum region of allelic loss on chromosome 17p 13.3, between polymorphic markers D17S5 and D17S28, in genomic DNA from breast and ovarian tumors (Figure 1...encode proteins of 443 and 227 amino acids, with no known functional motifs. Comparison of genomic and cDNA sequences showed that the genes overlap...is tissue specific (Figure 4). When zoo blots comprised of EcoRI fragments of genomic DNA from various species were probed with the unique exon 1 of
Complete Genome Sequence of Sporisorium scitamineum and Biotrophic Interaction Transcriptome with Sugarcane

PubMed Central

Benevenuto, Juliana; Peters, Leila P.; Carvalho, Giselle; Palhares, Alessandra; Quecine, Maria C.; Nunes, Filipe R. S.; Kmit, Maria C. P.; Wai, Alvan; Hausner, Georg; Aitken, Karen S.; Berkman, Paul J.; Fraser, James A.; Moolhuijzen, Paula M.; Coutinho, Luiz L.; Creste, Silvana; Vieira, Maria L. C.; Kitajima, João P.; Monteiro-Vitorello, Claudia B.

2015-01-01

Sporisorium scitamineum is a biotrophic fungus responsible for the sugarcane smut, a worldwide spread disease. This study provides the complete sequence of individual chromosomes of S. scitamineum from telomere to telomere achieved by a combination of PacBio long reads and Illumina short reads sequence data, as well as a draft sequence of a second fungal strain. Comparative analysis to previous available sequences of another strain detected few polymorphisms among the three genomes. The novel complete sequence described herein allowed us to identify and annotate extended subtelomeric regions, repetitive elements and the mitochondrial DNA sequence. The genome comprises 19,979,571 bases, 6,677 genes encoding proteins, 111 tRNAs and 3 assembled copies of rDNA, out of our estimated number of copies as 130. Chromosomal reorganizations were detected when comparing to sequences of S. reilianum, the closest smut relative, potentially influenced by repeats of transposable elements. Repetitive elements may have also directed the linkage of the two mating-type loci. The fungal transcriptome profiling from in vitro and from interaction with sugarcane at two time points (early infection and whip emergence) revealed that 13.5% of the genes were differentially expressed in planta and particular to each developmental stage. Among them are plant cell wall degrading enzymes, proteases, lipases, chitin modification and lignin degradation enzymes, sugar transporters and transcriptional factors. The fungus also modulates transcription of genes related to surviving against reactive oxygen species and other toxic metabolites produced by the plant. Previously described effectors in smut/plant interactions were detected but some new candidates are proposed. Ten genomic islands harboring some of the candidate genes unique to S. scitamineum were expressed only in planta. RNAseq data was also used to reassure gene predictions. PMID:26065709
Many human accelerated regions are developmental enhancers

PubMed Central

Capra, John A.; Erwin, Genevieve D.; McKinsey, Gabriel; Rubenstein, John L. R.; Pollard, Katherine S.

2013-01-01

The genetic changes underlying the dramatic differences in form and function between humans and other primates are largely unknown, although it is clear that gene regulatory changes play an important role. To identify regulatory sequences with potentially human-specific functions, we and others used comparative genomics to find non-coding regions conserved across mammals that have acquired many sequence changes in humans since divergence from chimpanzees. These regions are good candidates for performing human-specific regulatory functions. Here, we analysed the DNA sequence, evolutionary history, histone modifications, chromatin state and transcription factor (TF) binding sites of a combined set of 2649 non-coding human accelerated regions (ncHARs) and predicted that at least 30% of them function as developmental enhancers. We prioritized the predicted ncHAR enhancers using analysis of TF binding site gain and loss, along with the functional annotations and expression patterns of nearby genes. We then tested both the human and chimpanzee sequence for 29 ncHARs in transgenic mice, and found 24 novel developmental enhancers active in both species, 17 of which had very consistent patterns of activity in specific embryonic tissues. Of these ncHAR enhancers, five drove expression patterns suggestive of different activity for the human and chimpanzee sequence at embryonic day 11.5. The changes to human non-coding DNA in these ncHAR enhancers may modify the complex patterns of gene expression necessary for proper development in a human-specific manner and are thus promising candidates for understanding the genetic basis of human-specific biology. PMID:24218637

Investigating the genetics of Bti resistance using mRNA tag sequencing: application on laboratory strains and natural populations of the dengue vector Aedes aegypti

PubMed Central

Paris, Margot; Marcombe, Sebastien; Coissac, Eric; Corbel, Vincent; David, Jean-Philippe; Després, Laurence

2013-01-01

Mosquito control is often the main method used to reduce mosquito-transmitted diseases. In order to investigate the genetic basis of resistance to the bio-insecticide Bacillus thuringiensis subsp. israelensis (Bti), we used information on polymorphism obtained from cDNA tag sequences from pooled larvae of laboratory Bti-resistant and susceptible Aedes aegypti mosquito strains to identify and analyse 1520 single nucleotide polymorphisms (SNPs). Of the 372 SNPs tested, 99.2% were validated using DNA Illumina GoldenGate® array, with a strong correlation between the allelic frequencies inferred from the pooled and individual data (r = 0.85). A total of 11 genomic regions and five candidate genes were detected using a genome scan approach. One of these candidate genes showed significant departures from neutrality in the resistant strain at sequence level. Six natural populations from Martinique Island were sequenced for the 372 tested SNPs with a high transferability (87%), and association mapping analyses detected 14 loci associated with Bti resistance, including one located in a putative receptor for Cry11 toxins. Three of these loci were also significantly differentiated between the laboratory strains, suggesting that most of the genes associated with resistance might differ between the two environments. It also suggests that common selected regions might harbour key genes for Bti resistance. PMID:24187584
Idiopathic slow transit constipation and megacolon are not associated with neurturin mutations.

PubMed

Chen, B; Knowles, C H; Scott, M; Anand, P; Williams, N S; Milbrandt, J; Tam, P K H

2002-10-01

Chronic idiopathic slow-transit constipation (ISTC) and idiopathic megacolon (IMC) are early-onset gastrointestinal motility disorders of unknown aetiology. The gene encoding the neurotrophic factor neurturin may be a candidate for these disorders, as neurturin-deficient mice have a similar enteric phenotype. In the present study, we tested this hypothesis. Genomic DNA from 26 cases of chronic idiopathic STC [with a family history of constipation in 15 (58%) and Hirschsprung's disease in two (8%)], and five cases of IMC [two familial (40%)] was screened by direct DNA sequencing using the fluorescent dideoxy terminator method. Results were compared with published sequence data and 24 control DNAs. Our results revealed several previously unreported common sequence polymorphisms, but overall frequencies were comparable between patients and controls. We conclude that mutation of neurturin is not a frequent cause of ISTC or IMC.
An XRCC4 Splice Mutation Associated With Severe Short Stature, Gonadal Failure, and Early-Onset Metabolic Syndrome

PubMed Central

de Bruin, Christiaan; Mericq, Verónica; Andrew, Shayne F.; van Duyvenvoorde, Hermine A.; Verkaik, Nicole S.; Losekoot, Monique; Porollo, Aleksey; Garcia, Hernán; Kuang, Yi; Hanson, Dan; Clayton, Peter; van Gent, Dik C.; Wit, Jan M.; Hwa, Vivian

2015-01-01

Context: Severe short stature can be caused by defects in numerous biological processes including defects in IGF-1 signaling, centromere function, cell cycle control, and DNA damage repair. Many syndromic causes of short stature are associated with medical comorbidities including hypogonadism and microcephaly. Objective: To identify an underlying genetic etiology in two siblings with severe short stature and gonadal failure. Design: Clinical phenotyping, genetic analysis, complemented by in vitro functional studies of the candidate gene. Setting: An academic pediatric endocrinology clinic. Patients or Other Participants: Two adult siblings (male patient [P1] and female patient 2 [P2]) presented with a history of severe postnatal growth failure (adult heights: P1, −6.8 SD score; P2, −4 SD score), microcephaly, primary gonadal failure, and early-onset metabolic syndrome in late adolescence. In addition, P2 developed a malignant gastrointestinal stromal tumor at age 28. Intervention(s): Single nucleotide polymorphism microarray and exome sequencing. Results: Combined microarray analysis and whole exome sequencing of the two affected siblings and one unaffected sister identified a homozygous variant in XRCC4 as the probable candidate variant. Sanger sequencing and mRNA studies revealed a splice variant resulting in an in-frame deletion of 23 amino acids. Primary fibroblasts (P1) showed a DNA damage repair defect. Conclusions: In this study we have identified a novel pathogenic variant in XRCC4, a gene that plays a critical role in non-homologous end-joining DNA repair. This finding expands the spectrum of DNA damage repair syndromes to include XRCC4 deficiency causing severe postnatal growth failure, microcephaly, gonadal failure, metabolic syndrome, and possibly tumor predisposition. PMID:25742519
G-rich telomeric and ribosomal DNA sequences from the fission yeast genome form stable G-quadruplex DNA structures in vitro and are unwound by the Pfh1 DNA helicase

PubMed Central

Wallgren, Marcus; Mohammad, Jani B.; Yan, Kok-Phen; Pourbozorgi-Langroudi, Parham; Ebrahimi, Mahsa; Sabouri, Nasim

2016-01-01

Certain guanine-rich sequences have an inherent propensity to form G-quadruplex (G4) structures. G4 structures are e.g. involved in telomere protection and gene regulation. However, they also constitute obstacles during replication if they remain unresolved. To overcome these threats to genome integrity, organisms harbor specialized G4 unwinding helicases. In Schizosaccharomyces pombe, one such candidate helicase is Pfh1, an evolutionarily conserved Pif1 homolog. Here, we addressed whether putative G4 sequences in S. pombe can adopt G4 structures and, if so, whether Pfh1 can resolve them. We tested two G4 sequences, derived from S. pombe ribosomal and telomeric DNA regions, and demonstrated that they form inter- and intramolecular G4 structures, respectively. Also, Pfh1 was enriched in vivo at the ribosomal G4 DNA and telomeric sites. The nuclear isoform of Pfh1 (nPfh1) unwound both types of structure, and although the G4-stabilizing compound Phen-DC3 significantly enhanced their stability, nPfh1 still resolved them efficiently. However, stable G4 structures significantly inhibited adenosine triphosphate hydrolysis by nPfh1. Because ribosomal and telomeric DNA contain putative G4 regions conserved from yeasts to humans, our studies support the important role of G4 structure formation in these regions and provide further evidence for a conserved role for Pif1 helicases in resolving G4 structures. PMID:27185885
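
As an illustrative sketch only (the abstract does not say how putative G4 sequences were chosen), a common heuristic scans a strand for four runs of at least three guanines separated by loops of 1-7 nucleotides. The pattern, the loop limits and the function name `find_putative_g4` are assumptions introduced here, not the authors' method.

```python
import re

# Assumed heuristic: G{3+} N{1-7} G{3+} N{1-7} G{3+} N{1-7} G{3+}.
# This motif definition is an illustration, not the criterion used in the study.
G4_PATTERN = re.compile(r"G{3,}(?:[ACGT]{1,7}?G{3,}){3}")


def find_putative_g4(seq):
    """Return (start, end, motif) tuples for non-overlapping matches on one strand."""
    seq = seq.upper()
    return [(m.start(), m.end(), m.group()) for m in G4_PATTERN.finditer(seq)]
```

A match only marks a candidate; whether the sequence actually folds has to be tested experimentally, as the study did in vitro. To cover the complementary strand, the same scan would be run on the reverse complement.
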
Analysis of Ethnic Admixture in Prostate Cancer

DTIC Science & Technology

2006-12-01

low penetrant genes have been identified as potential PCA susceptibility genes. These candidate genes include SRD5A2 (MIM 607306), CYP3A4 (MIM 124010...progression [13]. The CDH1 gene is located at 16q22.1 and consists of 16 exons spanning approximately 100 kb of genomic DNA. Several polymorphisms, germline and...upstream of the ATG start site and all 16 exons of CDH1 were screened for DNA sequence variation by denaturing high-performance liquid chromatography
Population-scale whole genome sequencing identifies 271 highly polymorphic short tandem repeats from Japanese population.

PubMed

Hirata, Satoshi; Kojima, Kaname; Misawa, Kazuharu; Gervais, Olivier; Kawai, Yosuke; Nagasaki, Masao

2018-05-01

Forensic DNA typing is widely used to identify missing persons and plays a central role in forensic profiling. DNA typing usually uses capillary electrophoresis fragment analysis of PCR amplification products to detect the length of short tandem repeat (STR) markers. Here, we analyzed whole genome data from 1,070 Japanese individuals generated using massively parallel short-read sequencing of 162 paired-end bases. We have analyzed 843,473 STR loci with two to six basepair repeat units and cataloged highly polymorphic STR loci in the Japanese population. To evaluate the performance of the cataloged STR loci, we compared 23 STR loci, widely used in forensic DNA typing, with capillary electrophoresis based STR genotyping results in the Japanese population. Seventeen loci had high correlations and high call rates. The other six loci had low call rates or low correlations due to either the limitations of short-read sequencing technology, the bioinformatics tool used, or the complexity of repeat patterns. With these analyses, we have also purified the suitable 218 STR loci with four basepair repeat units and 53 loci with five basepair repeat units both for short read sequencing and PCR based technologies, which would be candidates to the actual forensic DNA typing in Japanese population.
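
To make the idea of an STR locus with a two to six basepair repeat unit concrete, here is a minimal sketch that finds perfect tandem repeats in a sequence string. It is not the bioinformatics tool used in the study (the abstract does not name it); the function name `find_strs`, the `min_copies` cutoff and the homopolymer filter are assumptions for illustration.

```python
import re


def find_strs(seq, min_unit=2, max_unit=6, min_copies=4):
    """Find perfect tandem repeats whose unit is min_unit..max_unit bases long.

    Returns (start, unit, copies) tuples. Interrupted or imperfect repeats,
    which real STR callers must handle, are deliberately ignored here.
    """
    seq = seq.upper()
    # Lazy unit match so the shortest repeating unit is preferred.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_copies - 1)
    )
    hits = []
    for m in pattern.finditer(seq):
        unit = m.group(1)
        if len(set(unit)) == 1:
            continue  # skip homopolymer runs such as AAAA...
        hits.append((m.start(), unit, len(m.group(0)) // len(unit)))
    return hits
```

Genotyping from short reads additionally requires that a read spans the whole repeat plus flanking sequence, which is one reason some loci in the study had low call rates.
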
DNA motifs associated with aberrant CpG island methylation.

PubMed

Feltus, F Alex; Lee, Eva K; Costello, Joseph F; Plass, Christoph; Vertino, Paula M

2006-05-01

Epigenetic silencing involving the aberrant methylation of promoter region CpG islands is widely recognized as a tumor suppressor silencing mechanism in cancer. However, the molecular pathways underlying aberrant DNA methylation remain elusive. Recently we showed that, on a genome-wide level, CpG island loci differ in their intrinsic susceptibility to aberrant methylation and that this susceptibility can be predicted based on underlying sequence context. These data suggest that there are sequence/structural features that contribute to the protection from or susceptibility to aberrant methylation. Here we use motif elicitation coupled with classification techniques to identify DNA sequence motifs that selectively define methylation-prone or methylation-resistant CpG islands. Motifs common to 28 methylation-prone or 47 methylation-resistant CpG island-containing genomic fragments were determined using the MEME and MAST algorithms. The five most discriminatory motifs derived from methylation-prone sequences were found to be associated with CpG islands in general and were nonrandomly distributed throughout the genome. In contrast, the eight most discriminatory motifs derived from the methylation-resistant CpG islands were randomly distributed throughout the genome. Interestingly, this latter group tended to associate with Alu and other repetitive sequences. Used together, the frequency of occurrence of these motifs successfully discriminated methylation-prone and methylation-resistant CpG island groups with an accuracy of 87% after 10-fold cross-validation. The motifs identified here are candidate methylation-targeting or methylation-protection DNA sequences.
Saturation analysis of ChIP-seq data for reproducible identification of binding peaks

PubMed Central

Hansen, Peter; Hecht, Jochen; Ibrahim, Daniel M.; Krannich, Alexander; Truss, Matthias; Robinson, Peter N.

2015-01-01

Chromatin immunoprecipitation coupled with next-generation sequencing (ChIP-seq) is a powerful technology to identify the genome-wide locations of transcription factors and other DNA binding proteins. Computational ChIP-seq peak calling infers the location of protein–DNA interactions based on various measures of enrichment of sequence reads. In this work, we introduce an algorithm, Q, that uses an assessment of the quadratic enrichment of reads to center candidate peaks followed by statistical analysis of saturation of candidate peaks by 5′ ends of reads. We show that our method not only is substantially faster than several competing methods but also demonstrates statistically significant advantages with respect to reproducibility of results and in its ability to identify peaks with reproducible binding site motifs. We show that Q has superior performance in the delineation of double RNAPII and H3K4me3 peaks surrounding transcription start sites related to a better ability to resolve individual peaks. The method is implemented in C++ and is freely available under an open source license. PMID:26163319
Tunable graphene quantum point contact transistor for DNA detection and characterization

PubMed Central

Girdhar, Anuj; Sathe, Chaitanya; Schulten, Klaus; Leburton, Jean-Pierre

2015-01-01

A graphene membrane conductor containing a nanopore in a quantum point contact (QPC) geometry is a promising candidate to sense, and potentially sequence, DNA molecules translocating through the nanopore. Within this geometry, the shape, size, and position of the nanopore as well as the edge configuration influences the membrane conductance caused by the electrostatic interaction between the DNA nucleotides and the nanopore edge. It is shown that the graphene conductance variations resulting from DNA translocation can be enhanced by choosing a particular geometry as well as by modulating the graphene Fermi energy, which demonstrates the ability to detect conformational transformations of a double-stranded DNA, as well as the passage of individual base pairs of a single-stranded DNA molecule through the nanopore. PMID:25765702
Novel candidate genes may be possible predisposing factors revealed by whole exome sequencing in familial esophageal squamous cell carcinoma.

PubMed

Forouzanfar, Narjes; Baranova, Ancha; Milanizadeh, Saman; Heravi-Moussavi, Alireza; Jebelli, Amir; Abbaszadegan, Mohammad Reza

2017-05-01

Esophageal squamous cell carcinoma is one of the deadliest of all the cancers. Its metastatic properties portend poor prognosis and high rate of recurrence. A more advanced method to identify new molecular biomarkers predicting disease prognosis can be whole exome sequencing. Here, we report the most effective genetic variants of the Notch signaling pathway in esophageal squamous cell carcinoma susceptibility by whole exome sequencing. We analyzed nine probands in unrelated familial esophageal squamous cell carcinoma pedigrees to identify candidate genes. Genomic DNA was extracted and whole exome sequencing performed to generate information about genetic variants in the coding regions. Bioinformatics software applications were utilized to exploit statistical algorithms to demonstrate protein structure and variants conservation. Polymorphic regions were excluded by false-positive investigations. Gene-gene interactions were analyzed for Notch signaling pathway candidates. We identified novel and damaging variants of the Notch signaling pathway through extensive pathway-oriented filtering and functional predictions, which led to the study of 27 candidate novel mutations in all nine patients. Detection of the trinucleotide repeat containing 6B gene mutation (a splice site alteration) in five of the nine probands, but not in any of the healthy samples, suggested that it may be a susceptibility factor for familial esophageal squamous cell carcinoma. Noticeably, 8 of 27 novel candidate gene mutations (e.g. epidermal growth factor, signal transducer and activator of transcription 3, MET) act in a cascade leading to cell survival and proliferation. Our results suggest that the trinucleotide repeat containing 6B mutation may be a candidate predisposing gene in esophageal squamous cell carcinoma. In addition, some of the Notch signaling pathway genetic mutations may act as key contributors to esophageal squamous cell carcinoma.
Association of Amine-Receptor DNA Sequence Variants with Associative Learning in the Honeybee.

PubMed

Lagisz, Malgorzata; Mercer, Alison R; de Mouzon, Charlotte; Santos, Luana L S; Nakagawa, Shinichi

2016-03-01

Octopamine- and dopamine-based neuromodulatory systems play a critical role in learning and learning-related behaviour in insects. To further our understanding of these systems and resulting phenotypes, we quantified DNA sequence variations at six loci coding octopamine-and dopamine-receptors and their association with aversive and appetitive learning traits in a population of honeybees. We identified 79 polymorphic sequence markers (mostly SNPs and a few insertions/deletions) located within or close to six candidate genes. Intriguingly, we found that levels of sequence variation in the protein-coding regions studied were low, indicating that sequence variation in the coding regions of receptor genes critical to learning and memory is strongly selected against. Non-coding and upstream regions of the same genes, however, were less conserved and sequence variations in these regions were weakly associated with between-individual differences in learning-related traits. While these associations do not directly imply a specific molecular mechanism, they suggest that the cross-talk between dopamine and octopamine signalling pathways may influence olfactory learning and memory in the honeybee.
Nuclease-mediated double-strand break (DSB) enhancement of small fragment homologous recombination (SFHR) gene modification in human-induced pluripotent stem cells (hiPSCs).

PubMed

Sargent, R Geoffrey; Suzuki, Shingo; Gruenert, Dieter C

2014-01-01

Recent developments in methods to specifically modify genomic DNA using sequence-specific endonucleases and donor DNA have opened the door to a new therapeutic paradigm for cell and gene therapy of inherited diseases. Sequence-specific endonucleases, in particular transcription activator-like (TAL) effector nucleases (TALENs), have been coupled with polynucleotide small/short DNA fragments (SDFs) to correct the most common mutation in the cystic fibrosis (CF) transmembrane conductance regulator (CFTR) gene, a 3-base-pair deletion at codon 508 (delF508), in induced pluripotent stem (iPS) cells. The studies presented here describe the generation of candidate TALENs and their co-transfection with wild-type (wt) CFTR-SDFs into CF-iPS cells homozygous for the delF508 mutation. Using an allele-specific PCR (AS-PCR)-based cyclic enrichment protocol, clonal populations of corrected CF-iPS cells were isolated and expanded.
DISSECTING THE GENETICS OF HUMAN HIGH MYOPIA: A MOLECULAR BIOLOGIC APPROACH

PubMed Central

Young, Terri L

2004-01-01

Purpose: Despite the plethora of experimental myopia animal studies that demonstrate biochemical factor changes in various eye tissues, and limited human studies utilizing pharmacologic agents to thwart axial elongation, we have little knowledge of the basic physiology that drives myopic development. Identifying the implicated genes for myopia susceptibility will provide a fundamental molecular understanding of how myopia occurs and may lead to directed physiologic (ie, pharmacologic, gene therapy) interventions. The purpose of this proposal is to describe the results of positional candidate gene screening of selected genes within the autosomal dominant high-grade myopia-2 locus (MYP2) on chromosome 18p11.31. Methods: A physical map of a contracted MYP2 interval was compiled, and gene expression studies in ocular tissues using complementary DNA library screens, microarray matches, and reverse-transcription techniques aided in prioritizing gene selection for screening. The TGIF, EMLIN-2, MLCB, and CLUL1 genes were screened in DNA samples from unrelated controls and in high-myopia affected and unaffected family members from the original seven MYP2 pedigrees. All candidate genes were screened by direct base pair sequence analysis. Results: Consistent segregation of a gene sequence alteration (polymorphism) with myopia was not demonstrated in any of the seven families. Novel single nucleotide polymorphisms were found. Conclusion: The positional candidate genes TGIF, EMLIN-2, MLCB, and CLUL1 are not associated with MYP2-linked high-grade myopia. Base change polymorphisms discovered with base sequence screening of these genes were submitted to an Internet database. Other genes that also map within the interval are currently undergoing mutation screening. PMID:15747770
'FloraArray' for screening of specific DNA probes representing the characteristics of a certain microbial community.

PubMed

Yokoi, Takahide; Kaku, Yoshiko; Suzuki, Hiroyuki; Ohta, Masayuki; Ikuta, Hajime; Isaka, Kazuichi; Sumino, Tatsuo; Wagatsuma, Masako

2007-08-01

To investigate uncharacterized microbial communities, a custom DNA microarray named 'FloraArray' was developed for screening specific probes that would represent the characteristics of a microbial community. The array was prepared by spotting 2000 plasmid DNAs from a genomic shotgun library of a sludge sample on a DNA microarray. By comparative hybridization of the array with two different samples of genomic DNA, one from the activated sludge and the other from a nonactivated sludge sample of an anaerobic ammonium oxidation (anammox) bacterial community, specific spots were visualized as a definite fluctuating profile in an MA (differential intensity ratio vs. spot intensity) plot. About 300 spots of the array accounted for the candidate probes to represent anammox reaction of the activated sludge. After sequence analysis of the probes and examination of the results of blastn searches against the reported anammox reference sequence, complete matches were found for 161 probes (58.3%) and >90% matches were found for 242 probes (87.1%). These results demonstrate that 'FloraArray' could be a useful tool for screening specific DNA molecules of unknown microbial communities.
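
The MA plot mentioned in the abstract pairs, for each spot, a log intensity ratio (M) with an average log intensity (A). The sketch below uses the standard definitions M = log2(I1) - log2(I2) and A = (log2(I1) + log2(I2)) / 2; the function names and the `m_cutoff` value are hypothetical and are not taken from the study.

```python
import math


def ma_values(intensity_1, intensity_2):
    """Return (M, A) for one spot from two positive hybridization intensities."""
    log_1 = math.log2(intensity_1)
    log_2 = math.log2(intensity_2)
    m = log_1 - log_2            # differential intensity ratio (log2 scale)
    a = 0.5 * (log_1 + log_2)    # mean log2 spot intensity
    return m, a


def candidate_spots(spots, m_cutoff=1.0):
    """Return IDs of spots whose |M| meets an assumed cutoff.

    `spots` maps a spot ID to (intensity_1, intensity_2). The cutoff is an
    illustrative assumption; the abstract does not report one.
    """
    return [
        spot_id
        for spot_id, (i1, i2) in spots.items()
        if abs(ma_values(i1, i2)[0]) >= m_cutoff
    ]
```

In practice intensities are usually background-corrected and normalized before M and A are computed, a step omitted here.
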
Species identification of medicinal pteridophytes by a DNA barcode marker, the chloroplast psbA-trnH intergenic region.

PubMed

Ma, Xin-Ye; Xie, Cai-Xiang; Liu, Chang; Song, Jing-Yuan; Yao, Hui; Luo, Kun; Zhu, Ying-Jie; Gao, Ting; Pang, Xiao-Hui; Qian, Jun; Chen, Shi-Lin

2010-01-01

Medicinal pteridophytes are an important group used in traditional Chinese medicine; however, there is no simple and universal way to differentiate various species of this group by morphological traits. A novel technology termed "DNA barcoding" could discriminate species by a standard DNA sequence with universal primers and sufficient variation. To determine whether DNA barcoding would be effective for differentiating pteridophyte species, we first analyzed five DNA sequence markers (psbA-trnH intergenic region, rbcL, rpoB, rpoC1, and matK) using six chloroplast genomic sequences from GenBank and found the psbA-trnH intergenic region to be the best candidate for availability of universal primers. Next, we amplified the psbA-trnH region from 79 samples of medicinal pteridophyte plants. These samples represented 51 species from 24 families, including all the authentic pteridophyte species listed in the Chinese pharmacopoeia (2005 version) and some commonly used adulterants. We found that the sequence of the psbA-trnH intergenic region can be determined with both high polymerase chain reaction (PCR) amplification efficiency (94.1%) and high direct sequencing success rate (81.3%). Combined with GenBank data (54 species across 12 pteridophyte families), species discriminative power analysis showed that 90.2% of species could be separated/identified successfully by the TaxonGap method in conjunction with the Basic Local Alignment Search Tool 1 (BLAST1) method. The TaxonGap method results further showed that, for 37 out of 39 separable species with at least two samples each, between-species variation was higher than the relevant within-species variation. Thus, the psbA-trnH intergenic region is a suitable DNA marker for species identification in medicinal pteridophytes.
The D1-D2 region of the large subunit ribosomal DNA as barcode for ciliates.

PubMed

Stoeck, T; Przybos, E; Dunthorn, M

2014-05-01

Ciliates are a major evolutionary lineage within the alveolates, which are distributed in nearly all habitats on our planet and are an essential component for ecosystem function, processes and stability. Accurate identification of these unicellular eukaryotes through, for example, microscopy or mating type reactions is reserved to few specialists. To satisfy the demand for a DNA barcode for ciliates, which meets the standard criteria for DNA barcodes defined by the Consortium for the Barcode of Life (CBOL), we here evaluated the D1-D2 region of the ribosomal DNA large subunit (LSU-rDNA). Primer universality for the phylum Ciliophora was tested in silico with available database sequences as well as in the laboratory with 73 ciliate species, which represented nine of 12 ciliate classes. Primers tested in this study were successful for all tested classes. To test the ability of the D1-D2 region to resolve conspecific and congeneric sequence divergence, 63 Paramecium strains were sampled from 24 mating species. The average conspecific D1-D2 variation was 0.18%, whereas congeneric sequence divergence averaged 4.83%. In pairwise genetic distance analyses, we identified a D1-D2 sequence divergence of <0.6% as an ideal threshold to discriminate Paramecium species. Using this definition, only 3.8% of all conspecific and 3.9% of all congeneric sequence comparisons had the potential of false assignments. Neighbour-joining analyses inferred monophyly for all taxa but for two Paramecium octaurelia strains. Here, we present a protocol for easy DNA amplification of single cells and voucher deposition. In conclusion, the presented data pinpoint the D1-D2 region as an excellent candidate for an official CBOL barcode for ciliated protists. © 2013 John Wiley & Sons Ltd.
Integrative taxonomy supports new candidate fish species in a poorly studied neotropical region: the Jequitinhonha River Basin.

PubMed

Pugedo, Marina Lages; de Andrade Neto, Francisco Ricardo; Pessali, Tiago Casarim; Birindelli, José Luís Olivan; Carvalho, Daniel Cardoso

2016-06-01

Molecular identification through DNA barcoding has been proposed as a way to standardize a global biodiversity identification system using a partial sequence of the mitochondrial COI gene. We applied an integrative approach using DNA barcoding and traditional morphology-based bioassessment to identify fish from a neotropical region possessing low taxonomic knowledge: the Jequitinhonha River Basin (Southeastern Brazil). The Jequitinhonha River Basin (JRB) has a high rate of endemism and is considered an area of high priority for fish conservation, with estimates indicating the presence of around 110 native and non-indigenous species. DNA barcodes were obtained from 260 individuals belonging to 52 species distributed among 35 genera, 21 families and 6 orders, including threatened and rare species such as Rhamdia jequitinhonha and Steindachneridion amblyurum. The mean Kimura two-parameter genetic distances within species, genera and families were: 0.44, 12.16 and 20.58 %, respectively. Mean intraspecific genetic variation ranged from 0 to 11.43 %, and high values (>2 %) were recovered for five species. Species with a deep intraspecific distance, possibly flagging overlooked taxa, were detected within the genus Pimelodella. Fifteen species, only identified to the genus level, had unique BINs, with a nearest neighbor distance over 2 % and therefore, potential new candidate species supported by DNA barcoding. The integrative taxonomy approach using DNA barcoding and traditional taxonomy may be a remedy to taxonomy impediment, accelerating species identification by flagging potential new candidate species and to adequately conserve the megadiverse neotropical ichthyofauna.
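
For readers unfamiliar with the Kimura two-parameter (K2P) distance reported above, the following is a minimal sketch for two aligned sequences. With P the proportion of transitions and Q the proportion of transversions, the distance is d = -(1/2) ln[(1 - 2P - Q) * sqrt(1 - 2Q)]. The function name `k2p_distance` and the choice to skip gap or ambiguous positions are assumptions, not details from the study.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura two-parameter distance between two aligned DNA sequences.

    Positions where either base is not A, C, G or T are skipped (an assumed
    convention). math.log raises ValueError if the sequences are too
    divergent for the formula to be defined.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1   # A<->G or C<->T
        else:
            transversions += 1  # purine<->pyrimidine
    if compared == 0:
        raise ValueError("no comparable positions")
    p = transitions / compared
    q = transversions / compared
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

A returned value of 0.02 corresponds to the 2 % level that the abstract uses when discussing high intraspecific variation and nearest neighbor distances.
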
Management of familial cancer: sequencing, surveillance and society.

PubMed

Samuel, Nardin; Villani, Anita; Fernandez, Conrad V; Malkin, David

2014-12-01

The clinical management of familial cancer begins with recognition of patterns of cancer occurrence suggestive of genetic susceptibility in a proband or pedigree, to enable subsequent investigation of the underlying DNA mutations. In this regard, next-generation sequencing of DNA continues to transform cancer diagnostics, by enabling screening for cancer-susceptibility genes in the context of known and emerging familial cancer syndromes. Increasingly, not only are candidate cancer genes sequenced, but also entire 'healthy' genomes are mapped in children with cancer and their family members. Although large-scale genomic analysis is considered intrinsic to the success of cancer research and discovery, a number of accompanying ethical and technical issues must be addressed before this approach can be adopted widely in personalized therapy. In this Perspectives article, we describe our views on how the emergence of new sequencing technologies and cancer surveillance strategies is altering the framework for the clinical management of hereditary cancer. Genetic counselling and disclosure issues are discussed, and strategies for approaching ethical dilemmas are proposed.
Does CTCF mediate between nuclear organization and gene expression?

PubMed

Ohlsson, Rolf; Lobanenkov, Victor; Klenova, Elena

2010-01-01

The multifunctional zinc-finger protein CCCTC-binding factor (CTCF) is a very strong candidate for the role of coordinating the expression level of coding sequences with their three-dimensional position in the nucleus, apparently responding to a "code" in the DNA itself. Dynamic interactions between chromatin fibers in the context of nuclear architecture have been implicated in various aspects of genome functions. However, the molecular basis of these interactions still remains elusive and is a subject of intense debate. Here we discuss the nature of CTCF-DNA interactions, the CTCF-binding specificity to its binding sites and the relationship between CTCF and chromatin, and we examine data linking CTCF with gene regulation in the three-dimensional nuclear space. We discuss why these features render CTCF a very strong candidate for the role and propose a unifying model, the "CTCF code," explaining the mechanistic basis of how the information encrypted in DNA may be interpreted by CTCF into diverse nuclear functions.
Construction of a yeast artificial chromosome contig encompassing the chromosome 14 Alzheimer's disease locus

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sharma, V.; Bonnycastle, L.; Poorkai, P.

1994-09-01

We have constructed a yeast artificial chromosome (YAC) contig of chromosome 14q24.3 which encompasses the chromosome 14 Alzheimer's disease locus (AD3). Determined by linkage analysis of early-onset Alzheimer's disease kindreds, this interval is bounded by the genetic markers D14S61-D14S63 and spans approximately 15 centimorgans. The contig consists of 29 markers and 74 YACs of which 57 are defined by one or more sequence tagged sites (STSs). The STS markers comprise 5 genes, 16 short tandem repeat polymorphisms and 8 cDNA clones. An additional number of genes, expressed sequence tags and cDNA fragments have been identified and localized to the contig by hybridization and sequence analysis of anonymous clones isolated by cDNA direct selection techniques. A minimal contig of about 15 YACs averaging 0.5-1.5 megabase in length will span this interval and is, at first approximation, in rough agreement with the genetic map. For two regions of the contig, our coverage has relied on L1/THE fingerprint and Alu-PCR hybridization data of YACs provided by CEPH/Genethon. We are currently developing sequence tagged sites from these to confirm the overlaps revealed by the fingerprint data. Among the genes which map to the contig are transforming growth factor beta 3, c-fos, and heat shock protein 2A (HSPA2). C-fos is not a candidate gene for AD3 based on the sequence analysis of affected and unaffected individuals. HSPA2 maps to the proximal edge of the contig and Calmodulin 1, a candidate gene from 4q24.3, maps outside of the region. The YAC contig is a framework physical map from which cosmid or P1 clone contigs can be constructed. As more genes and cDNAs are mapped, a highly resolved transcription map will emerge, a necessary step towards positionally cloning the AD3 gene.

Porcine transcriptome analysis based on 97 non-normalized cDNA libraries and assembly of 1,021,891 expressed sequence tags

PubMed Central

Gorodkin, Jan; Cirera, Susanna; Hedegaard, Jakob; Gilchrist, Michael J; Panitz, Frank; Jørgensen, Claus; Scheibye-Knudsen, Karsten; Arvin, Troels; Lumholdt, Steen; Sawera, Milena; Green, Trine; Nielsen, Bente J; Havgaard, Jakob H; Rosenkilde, Carina; Wang, Jun; Li, Heng; Li, Ruiqiang; Liu, Bin; Hu, Songnian; Dong, Wei; Li, Wei; Yu, Jun; Wang, Jian; Stærfeldt, Hans-Henrik; Wernersson, Rasmus; Madsen, Lone B; Thomsen, Bo; Hornshøj, Henrik; Bujie, Zhan; Wang, Xuegang; Wang, Xuefei; Bolund, Lars; Brunak, Søren; Yang, Huanming; Bendixen, Christian; Fredholm, Merete

2007-01-01

Background Knowledge of the structure of gene expression is essential for mammalian transcriptomics research. We analyzed a collection of more than one million porcine expressed sequence tags (ESTs), of which two-thirds were generated in the Sino-Danish Pig Genome Project and one-third are from public databases. The Sino-Danish ESTs were generated from one normalized and 97 non-normalized cDNA libraries representing 35 different tissues and three developmental stages. Results Using the Distiller package, the ESTs were assembled to roughly 48,000 contigs and 73,000 singletons, of which approximately 25% have a high confidence match to UniProt. Approximately 6,000 new porcine gene clusters were identified. Expression analysis based on the non-normalized libraries resulted in the following findings. The distribution of cluster sizes is scaling invariant. Brain and testes are among the tissues with the greatest number of different expressed genes, whereas tissues with more specialized function, such as developing liver, have fewer expressed genes. There are at least 65 high confidence housekeeping gene candidates and 876 cDNA library-specific gene candidates. We identified differential expression of genes between different tissues, in particular brain/spinal cord, and found patterns of correlation between genes that share expression in pairs of libraries. Finally, there was remarkable agreement in expression between specialized tissues according to Gene Ontology categories. Conclusion This EST collection, the largest to date in pig, represents an essential resource for annotation, comparative genomics, assembly of the pig genome sequence, and further porcine transcription studies. PMID:17407547
Nanochannel Device with Embedded Nanopore: a New Approach for Single-Molecule DNA Analysis and Manipulation

NASA Astrophysics Data System (ADS)

Zhang, Yuning; Reisner, Walter

2013-03-01

Nanopore and nanochannel based devices are robust methods for biomolecular sensing and single DNA manipulation. Nanopore-based DNA sensing has attractive features that make it a leading candidate as a single-molecule DNA sequencing technology. Nanochannel based extension of DNA, combined with enzymatic or denaturation-based barcoding schemes, is already a powerful approach for genome analysis. We believe that there is revolutionary potential in devices that combine nanochannels with embedded pore detectors. In particular, due to the fast translocation of a DNA molecule through a standard nanopore configuration, there is an unfavorable trade-off between signal and sequence resolution. With a combined nanochannel-nanopore device, based on embedding a pore inside a nanochannel, we can in principle gain independent control over both DNA translocation speed and sensing signal, solving the key draw-back of the standard nanopore configuration. We demonstrate that we can optically detect successful translocation of DNA from the nanochannel out through the nanopore, a possible method to 'select' a given barcode for further analysis. In particular, we show that in equilibrium DNA will not escape through an embedded sub-persistence length nanopore, suggesting that the pore could be used as a nanoscale window through which to interrogate a nanochannel extended DNA molecule. Furthermore, electrical measurements through the nanopore are performed, indicating that DNA sensing is feasible using the nanochannel-nanopore device.
Transcript map of the Ovum mutant (Om) locus: isolation by exon trapping of new candidate genes for the DDK syndrome.

PubMed

Le Bras, Stéphanie; Cohen-Tannoudji, Michel; Guyot, Valérie; Vandormael-Pournin, Sandrine; Coumailleau, Franck; Babinet, Charles; Baldacci, Patricia

2002-08-21

The DDK syndrome is defined as the embryonic lethality of F1 mouse embryos from crosses between DDK females and males from other strains (named hereafter as non-DDK strains). Genetically controlled by the Ovum mutant (Om) locus, it is due to a deleterious interaction between a maternal factor present in DDK oocytes and the non-DDK paternal pronucleus. Therefore, the DDK syndrome constitutes a unique genetic tool to study the crucial interactions that take place between the parental genomes and the egg cytoplasm during mammalian development. In this paper, we present an extensive analysis performed by exon trapping on the Om region. Twenty-seven trapped sequences were from genes in the databases: beta-adaptin, CCT zeta2, DNA LigaseIII, Notchless, Rad51l3 and Scya1. Twenty-eight other sequences presented similarities with expressed sequence tags and genomic sequences whereas 57 did not. The pattern of expression of 37 of these markers was established. Importantly, five of them are expressed in DDK oocytes and are candidate genes for the maternal factor, and 20 are candidate genes for the paternal factor since they are expressed in testis. This data is an important step towards identifying the genes responsible for the DDK syndrome.
Phylogenetic selection of target species in Amaryllidaceae tribe Haemantheae for acetylcholinesterase inhibition and affinity to the serotonin reuptake transport protein

USDA-ARS's Scientific Manuscript database

We present phylogenetic analyses of 37 taxa of Amaryllidaceae, tribe Haemantheae and Amaryllis belladonna L. as an outgroup, in order to provide a phylogenetic framework for the selection of candidate plants for lead discoveries in relation to Alzheimer's disease and depression. DNA sequences from t...
Cloning and expression of Bartonella henselae sucB gene encoding an immunogenic dihydrolipoamide succinyltransferase homologous protein.

PubMed

Kabeya, Hidenori; Maruyama, Soichi; Hirano, Kouji; Mikami, Takeshi

2003-01-01

Immunoscreening of a ZAP genomic library of Bartonella henselae strain Houston-1 expressed in Escherichia coli resulted in the isolation of a clone containing 3.5 kb BamHI genomic DNA fragment. This 3.5 kb DNA fragment was found to contain a sequence of a gene encoding a protein with significant homology to the dihydrolipoamide succinyltransferase of Brucella melitensis (sucB). Subsequent cloning and DNA sequence analysis revealed that the deduced amino acid sequence from the cloned gene showed 66.5% identity to SucB protein of B. melitensis, and 43.4 and 47.2% identities to those of Coxiella burnetii and E. coli, respectively. The gene was expressed as a His-Nus A-tagged fusion protein. The recombinant SucB protein (rSucB) was shown to be an immunoreactive protein of about 115 kDa by Western blot analysis with sera from B. henselae-immunized mice. Therefore the rSucB may be a candidate antigen for a specific serological diagnosis of B. henselae infection.
Design and characterization of plasmids encoding antigenic peptides of Aha1 from Aeromonas hydrophila as prospective fish vaccines.

PubMed

Rauta, Pradipta R; Nayak, Bismita; Monteiro, Gabriel A; Mateus, Marília

2017-01-10

The current investigation aimed at designing DNA vaccines against Aeromonas hydrophila infections. The DNA vaccine candidates were designed to express two antigenic outer membrane protein (Aha1) peptides and to be delivered by a nanoparticle-based delivery system. Gene sequences of conserved regions of antigenic Aha1 [aha1(211-381), aha1(211-381)opt, aha1(703-999) and aha1(703-999)opt] were cloned into the pVAX-GFP expression vector. The selected DNA vaccine candidates were purified from E. coli DH5α and transfected into Chinese hamster ovary (CHO) cells. The expression of the antigenic peptides was measured in cells over post-transfection time through the fluorescence intensity of the reporter GFP. The lipofection efficiency of aha-pVAX-GFP was highest after 24 h of incubation. Formulated PLGA-chitosan nanoparticle/plasmid DNA complexes were characterized in terms of size, size distribution and zeta potential. Nanocomplexes with average diameters in the range of 150-170 nm, transfected in a similar fashion into CHO cells, showed transfection efficiency comparable to that of lipofection. DNA entrapment and subsequent DNase digestion assays demonstrated that the nanoparticles protect pDNA against enzymatic digestion.
DNA methylation Landscape of body size variation in sheep.

PubMed

Cao, Jiaxue; Wei, Caihong; Liu, Dongming; Wang, Huihua; Wu, Mingming; Xie, Zhiyuan; Capellini, Terence D; Zhang, Li; Zhao, Fuping; Li, Li; Zhong, Tao; Wang, Linjie; Lu, Jian; Liu, Ruizao; Zhang, Shifang; Du, Yongfei; Zhang, Hongping; Du, Lixin

2015-10-16

Sub-populations of Chinese Mongolian sheep exhibit significant variance in body mass. In the present study, we profiled whole-genome DNA methylation in these breeds by methylated DNA immunoprecipitation sequencing (MeDIP-seq) to test whether DNA methylation plays a role in determining the body mass of sheep. A high quality methylation map of Chinese Mongolian sheep was obtained in this study. We identified 399 differentially methylated regions located in 93 human orthologs, which were previously reported as body size related genes in human genome-wide association studies. We tested three regions in LTBP1, and DNA methylation of two CpG sites showed significant correlation with its RNA expression. Additionally, a particular set of differentially methylated windows enriched in the "development process" (GO: 0032502) was identified as potential candidates for association with body mass variation. Next, we validated a small part of these windows in 5 genes; DNA methylation of SMAD1, TSC1 and AKT1 showed significant differences across breeds, and six CpG sites were significantly correlated with RNA expression. Interestingly, two CpG sites showed significant correlation with TSC1 protein expression. This study provides a thorough understanding of body size variation in sheep from an epigenetic perspective.
Evaluating the feasibility of using candidate DNA barcodes in discriminating species of the large Asteraceae family.

PubMed

Gao, Ting; Yao, Hui; Song, Jingyuan; Zhu, Yingjie; Liu, Chang; Chen, Shilin

2010-10-26

Five DNA regions, namely, rbcL, matK, ITS, ITS2, and psbA-trnH, have been recommended as primary DNA barcodes for plants. Studies evaluating these regions for species identification in large plant taxa, which include a large number of closely related species, have rarely been reported. The feasibility of using the five proposed DNA regions was tested for discriminating plant species within Asteraceae, the largest family of flowering plants. Among these markers, ITS2 was the most useful in terms of universality, sequence variation, and identification capability in the Asteraceae family. The species discriminating power of ITS2 was also explored in a large pool of 3,490 Asteraceae sequences that represent 2,315 species belonging to 494 different genera. The results show that ITS2 correctly identified 76.4% and 97.4% of plant samples at the species and genus levels, respectively. In addition, ITS2 displayed a variable ability to discriminate related species within different genera. ITS2 is the best DNA barcode for the Asteraceae family. This approach significantly broadens the application of DNA barcoding to resolve classification problems in the family Asteraceae at the genus and species levels.
Genome Sequencing Technologies and Nursing: What Are the Roles of Nurses and Nurse Scientists?

PubMed

Taylor, Jacquelyn Y; Wright, Michelle L; Hickey, Kathleen T; Housman, David E

Advances in DNA sequencing technology have resulted in an abundance of personalized data with challenging clinical utility and meaning for clinicians. This wealth of data has potential to dramatically impact the quality of healthcare. Nurses are at the focal point in educating patients regarding relevant healthcare needs; therefore, an understanding of sequencing technology and utilizing these data are critical. The objective of this study was to explicate the role of nurses and nurse scientists as integral members of healthcare teams in improving understanding of DNA sequencing data and translational genomics for patients. A history of the nurse role in newborn screening is used as an exemplar. This study serves as an exemplar on how genome sequencing has been utilized in nursing science and incorporates linkages of other omics approaches used by nurses that are included in this special issue. This special issue showcased nurse scientists conducting multi-omic research from various methods, including targeted candidate genes, pharmacogenomics, proteomics, epigenomics, and the microbiome. From this vantage point, we provide an overview of the roles of nurse scientists in genome sequencing research and provide recommendations for the best utilization of nurses and nurse scientists related to genome sequencing.
A Novel Locomotion-based Validation Assay for Candidate Drugs Using Drosophila DYT1 Disease Model

DTIC Science & Technology

2014-06-01

rescue the locomotion defects of Drosophila larvae caused by the expression of human torsinAΔE. These results demonstrated that human torsinA can... Drosophila dtorsin∆D transgenic lines dtorsin∆E and dtorsin∆D cDNA constructs were made from the wild type dtorsin cDNA using QuikChange II XL Site... After confirming mutated sequences, the insert was again cut out with EcoRI and NotI and inserted between EcoRI and NotI sites of pUAST [2] to produce
Nanochannel Device with Embedded Nanopore: a New Approach for Single-Molecule DNA Analysis and Manipulation

NASA Astrophysics Data System (ADS)

Zhang, Yuning; Reisner, Walter

2012-02-01

Nanopore and nanochannel based devices are robust methods for biomolecular sensing and single DNA manipulation. Nanopore-based DNA sensing has attractive features that make it a leading candidate as a single-molecule DNA sequencing technology. Nanochannel based extension of DNA, combined with enzymatic or denaturation-based barcoding schemes, is already a powerful approach for genome analysis. We believe that there is revolutionary potential in devices that combine nanochannels with nanopore detectors. In particular, due to the fast translocation of a DNA molecule through a standard nanopore configuration, there is an unfavorable trade-off between signal and sequence resolution. With a combined nanochannel-nanopore device, based on embedding a nanopore inside a nanochannel, we can in principle gain independent control over both DNA translocation speed and sensing signal, solving the key drawback of the standard nanopore configuration. We will discuss our recent progress on device fabrication and characterization. In particular, we demonstrate that we can detect, using fluorescence microscopy, successful translocation of DNA from the nanochannel out through the nanopore, a possible method to 'select' a given barcode for further analysis. We also show that in equilibrium DNA will not escape through an embedded sub-persistence length nanopore, suggesting that the embedded pore could be used as a nanoscale window through which to interrogate a nanochannel extended DNA molecule.
Genome-wide characterization of centromeric satellites from multiple mammalian genomes.

PubMed

Alkan, Can; Cardone, Maria Francesca; Catacchio, Claudia Rita; Antonacci, Francesca; O'Brien, Stephen J; Ryder, Oliver A; Purgato, Stefania; Zoli, Monica; Della Valle, Giuliano; Eichler, Evan E; Ventura, Mario

2011-01-01

Despite its importance in cell biology and evolution, the centromere has remained the final frontier in genome assembly and annotation due to its complex repeat structure. However, isolation and characterization of the centromeric repeats from newly sequenced species are necessary for a complete understanding of genome evolution and function. In recent years, various genomes have been sequenced, but the characterization of the corresponding centromeric DNA has lagged behind. Here, we present a computational method (RepeatNet) to systematically identify higher-order repeat structures from unassembled whole-genome shotgun sequence and test whether these sequence elements correspond to functional centromeric sequences. We analyzed genome datasets from six species of mammals representing the diversity of the mammalian lineage, namely, horse, dog, elephant, armadillo, opossum, and platypus. We define candidate monomer satellite repeats and demonstrate centromeric localization for five of the six genomes. Our analysis revealed the greatest diversity of centromeric sequences in horse and dog in contrast to elephant and armadillo, which showed high-centromeric sequence homogeneity. We could not isolate centromeric sequences within the platypus genome, suggesting that centromeres in platypus are not enriched in satellite DNA. Our method can be applied to the characterization of thousands of other vertebrate genomes anticipated for sequencing in the near future, providing an important tool for annotation of centromeres.
Current and future developments in patents for quantitative trait loci in dairy cattle.

PubMed

Weller, Joel I

2007-01-01

Many studies have proposed that rates of genetic gain in dairy cattle can be increased by direct selection on the individual quantitative loci responsible for the genetic variation in these traits, or selection on linked genetic markers. The development of DNA-level genetic markers has made detection of QTL nearly routine in all major livestock species. The studies that attempted to detect genes affecting quantitative traits can be divided into two categories: analysis of candidate genes, and genome scans based on within-family genetic linkage. To date, 12 patent cooperative treaty (PCT) and US patents have been registered for DNA sequences claimed to be associated with effects on economic traits in dairy cattle. All claim effects on milk production, but other traits are also included in some of the claims. Most of the sequences found by the candidate gene approach are of dubious validity, and have been repeated in only very few independent studies. The two missense mutations on chromosomes 6 and 14 affecting milk concentration derived from genome scans are more solidly based, but the claims are also disputed. A few PCT in dairy cattle are commercialized as genetic tests where commercial dairy farmers are the target market.
DNA Barcoding for Efficient Species- and Pathovar-Level Identification of the Quarantine Plant Pathogen Xanthomonas

PubMed Central

Tian, Qian; Zhao, Wenjun; Lu, Songyu; Zhu, Shuifang; Li, Shidong

2016-01-01

Genus Xanthomonas comprises many economically important plant pathogens that affect a wide range of hosts. Indeed, fourteen Xanthomonas species/pathovars have been regarded as official quarantine bacteria for imports in China. To date, however, a rapid and accurate method capable of identifying all of the quarantine species/pathovars has yet to be developed. In this study, we therefore evaluated the capacity of DNA barcoding as a digital identification method for discriminating quarantine species/pathovars of Xanthomonas. For these analyses, 327 isolates, representing 45 Xanthomonas species/pathovars, as well as five additional species/pathovars from GenBank (50 species/pathovars total), were utilized to test the efficacy of four DNA barcode candidate genes (16S rRNA gene, cpn60, gyrB, and avrBs2). Of these candidate genes, cpn60 displayed the highest rate of PCR amplification and sequencing success. The tree-building (Neighbor-joining), ‘best close match’, and barcode gap methods were subsequently employed to assess the species- and pathovar-level resolution of each gene. Notably, all isolates of each quarantine species/pathovars formed a monophyletic group in the neighbor-joining tree constructed using the cpn60 sequences. Moreover, cpn60 also demonstrated the most satisfactory results in both barcoding gap analysis and the ‘best close match’ test. Thus, compared with the other markers tested, cpn60 proved to be a powerful DNA barcode, providing a reliable and effective means for the species- and pathovar-level identification of the quarantine plant pathogen Xanthomonas. PMID:27861494
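
The two distance-based criteria named in this abstract, the barcode gap and the 'best close match' test, can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes pre-aligned sequences of equal length, uses a simple p-distance (fraction of differing sites), and the 3% threshold and the function names are hypothetical choices made only for illustration.

```python
def p_distance(a, b):
    """Fraction of aligned positions at which two sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for x, y in zip(a, b) if x != y) / len(a)


def barcode_gap(records):
    """records: list of (taxon, aligned_sequence) pairs.

    Returns (largest within-taxon distance, smallest between-taxon distance).
    A marker shows a barcode gap when the second value exceeds the first.
    """
    intra, inter = [], []
    for i in range(len(records)):
        for j in range(i + 1, len(records)):
            d = p_distance(records[i][1], records[j][1])
            (intra if records[i][0] == records[j][0] else inter).append(d)
    return (max(intra) if intra else None, min(inter) if inter else None)


def best_close_match(query, references, threshold=0.03):
    """Assign the query to the taxon of its nearest reference sequence.

    Returns None when the nearest reference is farther than the (assumed)
    threshold, or when equally near references belong to different taxa.
    """
    scored = [(p_distance(query, seq), taxon) for taxon, seq in references]
    best = min(d for d, _ in scored)
    if best > threshold:
        return None
    taxa = {taxon for d, taxon in scored if d == best}
    return taxa.pop() if len(taxa) == 1 else None
```

In practice the threshold would be derived from the observed within-taxon distances of the marker being tested rather than fixed in advance.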
A knowledge engineering approach to recognizing and extracting sequences of nucleic acids from scientific literature.

PubMed

García-Remesal, Miguel; Maojo, Victor; Crespo, José

2010-01-01

In this paper we present a knowledge engineering approach to automatically recognize and extract genetic sequences from scientific articles. To carry out this task, we use a preliminary recognizer based on a finite state machine to extract all candidate DNA/RNA sequences. The candidates are then fed into a knowledge-based system that automatically discards false positives and refines noisy and incorrectly merged sequences. We created the knowledge base by manually analyzing different manuscripts containing genetic sequences. Our approach was evaluated using a test set of 211 full-text articles in PDF format containing 3134 genetic sequences. On this set, we achieved 87.76% precision and 97.70% recall. This method can facilitate different research tasks, including text mining, information extraction, and information retrieval research dealing with large collections of documents containing genetic sequences.
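
A preliminary recognizer of the kind described can be sketched as a two-state scanner. The sketch below is an assumption-laden illustration, not the authors' system: the function names, the set of tolerated characters, and the minimum length of 10 bases are all hypothetical, and the knowledge-based refinement stage is not reproduced. The second helper only states how precision and recall are defined.

```python
NUCLEOTIDES = set("ACGTUacgtu")
# Spaces and digits are often interleaved with printed sequences (assumed tolerated set).
IGNORABLE = set(" 0123456789")


def find_candidate_sequences(text, min_length=10):
    """Scan text with two states: outside a run, or inside a run of nucleotide letters.

    Ignorable characters do not end a run; any other character does.
    Runs shorter than min_length (a hypothetical threshold) are dropped.
    """
    candidates = []
    current = []  # non-empty means the scanner is inside a run
    for ch in text:
        if ch in NUCLEOTIDES:
            current.append(ch.upper())
        elif ch in IGNORABLE and current:
            continue  # stay inside the run
        else:
            if len(current) >= min_length:
                candidates.append("".join(current))
            current = []
    if len(current) >= min_length:
        candidates.append("".join(current))
    return candidates


def precision_recall(true_positives, false_positives, false_negatives):
    """Precision = TP / (TP + FP); recall = TP / (TP + FN)."""
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return precision, recall
```

A permissive scanner like this favors recall over precision, which is why a second filtering stage is needed to discard ordinary words and merged runs.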
Association, intrinsic shape, and molecular recognition: Elucidating DNA biophysics through coarse-grained simulation

NASA Astrophysics Data System (ADS)

Freeman, Gordon Samuel

DNA is of central importance in biology as it is responsible for carrying, copying, and translating the genetic code into the building blocks that comprise life. In order to accomplish these tasks, the DNA molecule must be versatile and robust. Indeed, the underlying molecular interactions that allow DNA to execute these tasks are complex and their origins are only beginning to be understood. While experiments are able to elucidate many key biophysical phenomena, there remain many unanswered questions. Molecular simulation is able to shed light on phenomena at the molecular scale and provide information that is missing from experimental views of DNA behavior. In this dissertation I use state-of-the-art coarse-grained DNA models to address two key problems. In the first, metadynamics calculations are employed to uncover the free energy surface of two complementary DNA strands. This free energy surface takes on the appearance of a hybridization funnel and reveals candidates for intermediate states in the hybridization of short DNA oligomers. Such short oligomers are important building blocks for DNA-driven self-assembly and the mechanism of hybridization in this regime is not well understood. The second problem is that of nucleosome formation. Nucleosomes are the fundamental subunit of genome compaction in the nucleus of a cell. As such, nucleosomes are a key epigenetic factor and affect gene expression and the ability of DNA-binding proteins to locate and bind to the appropriate position in the genome. However, the factors that drive nucleosome positioning are not well understood. While DNA sequence is known to affect nucleosome formation, the mechanism by which it does so has not been established and a number of hypotheses explaining this sequence-dependence exist in the literature. I demonstrate that DNA shape dominates this process with contributions arising from both intrinsic DNA curvature as well as DNA-protein interactions driven by sequence-dependent variations in minor groove dimensions.
Generation and analysis of expressed sequence tags from a cDNA library of the fruiting body of Ganoderma lucidum

PubMed Central

2010-01-01

Background Little genomic or transcriptomic information on Ganoderma lucidum (Lingzhi) is known. This study aims to discover the transcripts involved in secondary metabolite biosynthesis and developmental regulation of G. lucidum using an expressed sequence tag (EST) library. Methods A cDNA library was constructed from the G. lucidum fruiting body. Its high-quality ESTs were assembled into unique sequences with contigs and singletons. The unique sequences were annotated according to sequence similarities to genes or proteins available in public databases. The detection of simple sequence repeats (SSRs) was performed by online analysis. Results A total of 1,023 clones were randomly selected from the G. lucidum library and sequenced, yielding 879 high-quality ESTs. These ESTs showed similarities to a diverse range of genes. The sequences encoding squalene epoxidase (SE) and farnesyl-diphosphate synthase (FPS) were identified in this EST collection. Several candidate genes, such as hydrophobin, MOB2, profilin and PHO84, were detected for the first time in G. lucidum. Thirteen potential SSR-motif microsatellite loci were also identified. Conclusion The present study demonstrates a successful application of EST analysis in the discovery of transcripts involved in the secondary metabolite biosynthesis and the developmental regulation of G. lucidum. PMID:20230644
ProbeDesigner: for the design of probesets for branched DNA (bDNA) signal amplification assays.

PubMed

Bushnell, S; Budde, J; Catino, T; Cole, J; Derti, A; Kelso, R; Collins, M L; Molino, G; Sheridan, P; Monahan, J; Urdea, M

1999-05-01

The sensitivity and specificity of branched DNA (bDNA) assays are derived in part through the judicious design of the capture and label extender probes. To minimize non-specific hybridization (NSH) events, which elevate assay background, candidate probes must be computer screened for complementarity with generic sequences present in the assay. We present a software application which allows for rapid and flexible design of bDNA probesets for novel targets. It includes an algorithm for estimating the magnitude of NSH contribution to background, a mechanism for removing probes with elevated contributions, a methodology for the simultaneous design of probesets for multiple targets, and a graphical user interface which guides the user through the design steps. The program is available as a commercial package through the Pharmaceutical Drug Discovery program at Chiron Diagnostics.
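
The screening step, checking candidate probes for complementarity with generic assay sequences, can be illustrated with a minimal sketch. This is not the ProbeDesigner algorithm, whose scoring of NSH contributions is not described here. The sketch assumes that the longest contiguous stretch of Watson-Crick complementarity is a usable proxy, and the function names and the cutoff of 6 bases are hypothetical.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def longest_complementary_run(probe, other):
    """Length of the longest contiguous stretch of probe that can base-pair with other.

    Computed as the longest common substring between the reverse complement
    of the probe and the other sequence (dynamic programming, row by row).
    """
    target = reverse_complement(probe)
    other = other.upper()
    best = 0
    prev = [0] * (len(other) + 1)
    for i in range(1, len(target) + 1):
        curr = [0] * (len(other) + 1)
        for j in range(1, len(other) + 1):
            if target[i - 1] == other[j - 1]:
                curr[j] = prev[j - 1] + 1
                best = max(best, curr[j])
        prev = curr
    return best


def screen_probes(probes, generic_sequences, max_run=6):
    """Keep probes whose worst-case complementary run does not exceed max_run (assumed cutoff)."""
    kept = []
    for probe in probes:
        worst = max(
            (longest_complementary_run(probe, g) for g in generic_sequences),
            default=0,
        )
        if worst <= max_run:
            kept.append(probe)
    return kept
```

A production tool would more plausibly weight such matches by predicted hybridization strength than apply a hard length cutoff.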
Preparation and biomedical applications of programmable and multifunctional DNA nanoflowers

PubMed Central

Lv, Yifan; Hu, Rong; Zhu, Guizhi; Zhang, Xiaobing; Mei, Lei; Liu, Qiaoling; Qiu, Liping; Wu, Cuichen; Tan, Weihong

2016-01-01

We describe a comprehensive protocol for the preparation of multifunctional DNA nanostructures termed nanoflowers (NFs), which are self-assembled from long DNA building blocks generated via rolling-circle replication (RCR) of a designed template. NF assembly is driven by liquid crystallization and dense packaging of building blocks, which eliminates the need for conventional Watson-Crick base pairing. As a result of dense DNA packaging, NFs are resistant to nuclease degradation, denaturation or dissociation at extremely low concentrations. By manually changing the template sequence, many different functional moieties including aptamers, bioimaging agents and drug-loading sites could be easily integrated into NF particles, making NFs ideal candidates for a variety of applications in biomedicine. In this protocol, the preparation of multifunctional DNA NFs with highly tunable sizes is described for applications in cell targeting, intracellular imaging and drug delivery. Preparation and characterization of functional DNA NFs takes ~5 d; the following biomedical applications take ~10 d. PMID:26357007
An Improved Method for TAL Effectors DNA-Binding Sites Prediction Reveals Functional Convergence in TAL Repertoires of Xanthomonas oryzae Strains

PubMed Central

Pérez-Quintero, Alvaro L.; Rodriguez-R, Luis M.; Dereeper, Alexis; López, Camilo; Koebnik, Ralf; Szurek, Boris; Cunnac, Sebastien

2013-01-01

Transcription Activators-Like Effectors (TALEs) belong to a family of virulence proteins from the Xanthomonas genus of bacterial plant pathogens that are translocated into the plant cell. In the nucleus, TALEs act as transcription factors inducing the expression of susceptibility genes. A code for TALE-DNA binding specificity and high-resolution three-dimensional structures of TALE-DNA complexes were recently reported. Accurate prediction of TAL Effector Binding Elements (EBEs) is essential to elucidate the biological functions of the many sequenced TALEs as well as for robust design of artificial TALE DNA-binding domains in biotechnological applications. In this work a program with improved EBE prediction performances was developed using an updated specificity matrix and a position weight correction function to account for the matching pattern observed in a validation set of TALE-DNA interactions. To gain a systems perspective on the large TALE repertoires from X. oryzae strains, this program was used to predict rice gene targets for 99 sequenced family members. Integrating predictions and available expression data in a TALE-gene network revealed multiple candidate transcriptional targets for many TALEs as well as several possible instances of functional convergence among TALEs. PMID:23869221
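
The general idea of scoring candidate binding sites with a specificity matrix and a position-dependent weight can be sketched as follows. This is an illustration under stated assumptions, not the program described above: the probabilities in `RVD_SPECIFICITY` are invented placeholder values (only the commonly cited preferences NI for A, HD for C, NG for T and NN for G or A are reflected), the linear decay in `position_weight` is a hypothetical stand-in for the published correction function, and all names are made up for this example.

```python
import math

# Placeholder base preferences per repeat-variable diresidue (RVD); values are illustrative only.
RVD_SPECIFICITY = {
    "NI": {"A": 0.85, "C": 0.05, "G": 0.05, "T": 0.05},
    "HD": {"A": 0.05, "C": 0.85, "G": 0.05, "T": 0.05},
    "NG": {"A": 0.05, "C": 0.05, "G": 0.05, "T": 0.85},
    "NN": {"A": 0.30, "C": 0.05, "G": 0.60, "T": 0.05},
}


def position_weight(index, length):
    """Assumed linear decay from 1.0 at the first repeat to 0.5 at the last."""
    return 1.0 - 0.5 * index / max(length - 1, 1)


def score_site(rvds, site):
    """Weighted log-probability that the RVD array matches the site (higher is better)."""
    if len(rvds) != len(site):
        raise ValueError("site length must equal the number of RVDs")
    return sum(
        position_weight(i, len(rvds)) * math.log(RVD_SPECIFICITY[rvd][base])
        for i, (rvd, base) in enumerate(zip(rvds, site.upper()))
    )


def best_sites(rvds, promoter, top=1):
    """Score every window of the promoter and return the top (score, start) pairs."""
    n = len(rvds)
    scored = [
        (score_site(rvds, promoter[start:start + n]), start)
        for start in range(len(promoter) - n + 1)
    ]
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:top]
```

With these placeholder weights, a mismatch at the first repeat lowers the score more than the same mismatch at the last repeat, which is the kind of positional effect a weight correction function is meant to capture.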

Evaluation of DNA barcode candidates for the discrimination of Artemisia L.

PubMed

Liu, Geyu; Ning, Huixia; Ayidaerhan, Nurbolati; Aisa, Haji Akber

2017-11-01

Because of the very similar morphologies and wide diversity of Artemisia L. varieties, they are difficult to identify, and there have been many arguments about the systematic classification of Artemisia L., especially concerning the division of species. DNA barcode technology is used to rapidly identify species based on standard short DNA sequences. We evaluated seven candidate DNA barcodes (ITS, ITS2, psbA-trnH, rbcL, matK, rpoB, and rpoC1) for their ability to identify closely related species of the Artemisia genus in Xinjiang. The corresponding PCR amplification efficiency, detectable genetic divergence, identification efficiency and phylogenetic tree were assessed. We found that the internal transcribed spacer (ITS) region exhibited the highest interspecific divergence, which was significantly higher than the observed intraspecific variation, and showed the highest identification efficiency, followed by ITS2, psbA-trnH and, finally, rpoB. matK, rbcL, and rpoC1 performed poorly in this evaluation. ITS, ITS2, and psbA-trnH were able to perfectly identify the tested species Artemisia annua, A. absinthium, A. rupestris, A. tournefortiana, A. austriaca, A. dracunculus, A. vulgaris, and A. macrocephala. Therefore, we propose the ITS, ITS2, and psbA-trnH regions as promising DNA barcodes for the closely related species of Artemisia L. in Xinjiang.
A candidate gene for X-linked Ocular Albinism (OA1)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Bassi, M.T.; Schiaffino, V.; Rugarli, E.

1994-09-01

Ocular Albinism of the Nettleship-Falls type 1 (OA1) is the most common form of ocular albinism. It is transmitted as an X-linked recessive trait with affected males showing severe reduction of visual acuity, nystagmus, strabismus, and photophobia. Ophthalmologic examination reveals foveal hypoplasia, hypopigmentation of the retina and iris translucency. Microscopic examination of melanocytes suggests that the underlying defect in OA1 is an abnormality in melanosome formation. Recently we assembled a 350 kb cosmid contig spanning the entire critical region on Xp22.3, which measures approximately 110 kb. A minimum set of cosmids was used to identify transcribed sequences using both cDNA selection and exon amplification. Two putative exons recovered by the exon amplification strategy were found to be highly conserved throughout evolution and, therefore, they were used as probes for the screening of fetal and adult retina cDNA libraries. This led to the isolation of clones spanning a full-length cDNA which measures 7.6 kb. Sequence analysis revealed that the predicted protein product shows homology with syntrophines and a Xenopus laevis apical protein. The gene covers approximately 170 kb of DNA and spans the entire critical region for OA1, being deleted in two patients with contiguous gene deletion including OA1 and in one patient with isolated OA1. Therefore, this new gene represents a very strong candidate for involvement in OA1 (an alternative, but unlikely, possibility to be considered is that the true OA1 gene lies within an intron of the former). Northern analysis revealed a very high level of expression in retina and melanoma. Unlike most Xp22.3 genes, this gene is conserved in the mouse. We are currently performing SSCP analysis and direct sequencing of exons on DNAs from approximately 60 unrelated patients with OA1 for mutation detection.
Genome-wide methylation sequencing of paired primary and metastatic cell lines identifies common DNA methylation changes and a role for EBF3 as a candidate epigenetic driver of melanoma metastasis

PubMed Central

Chatterjee, Aniruddha; Stockwell, Peter A; Ahn, Antonio; Rodger, Euan J; Leichter, Anna L; Eccles, Michael R

2017-01-01

Epigenetic alterations are increasingly implicated in metastasis, whereas very few genetic mutations have been identified as authentic drivers of cancer metastasis. Yet, to date, few studies have identified metastasis-related epigenetic drivers, in part because a framework for identifying driver epigenetic changes in metastasis has not been established. Using reduced representation bisulfite sequencing (RRBS), we mapped genome-wide DNA methylation patterns in three cutaneous primary and metastatic melanoma cell line pairs to identify metastasis-related epigenetic drivers. Globally, metastatic melanoma cell lines were hypomethylated compared to the matched primary melanoma cell lines. Using whole genome RRBS we identified 75 shared (10 hyper- and 65 hypomethylated) differentially methylated fragments (DMFs), which were associated with 68 genes showing significant methylation differences. One gene, Early B Cell Factor 3 (EBF3), exhibited promoter hypermethylation in metastatic cell lines, and was validated with bisulfite sequencing and in two publicly available independent melanoma cohorts (n = 40 and 458 melanomas, respectively). We found that hypermethylation of the EBF3 promoter was associated with increased EBF3 mRNA levels in metastatic melanomas and subsequent inhibition of DNA methylation reduced EBF3 expression. RNAi-mediated knockdown of EBF3 mRNA levels decreased proliferation, migration and invasion in primary and metastatic melanoma cell lines. Overall, we have identified numerous epigenetic changes characterising metastatic melanoma cell lines, including EBF3-induced aggressive phenotypic behaviour with elevated EBF3 expression in metastatic melanoma, suggesting that EBF3 promoter hypermethylation may be a candidate epigenetic driver of metastasis. PMID:28030832
Genome-wide association study identifies phospholipase C zeta 1 (PLCz1) as a stallion fertility locus in Hanoverian warmblood horses.

PubMed

Schrimpf, Rahel; Dierks, Claudia; Martinsson, Gunilla; Sieme, Harald; Distl, Ottmar

2014-01-01

A consistently high level of stallion fertility plays an economically important role in modern horse breeding. We performed a genome-wide association study for estimated breeding values of the paternal component of the pregnancy rate per estrus cycle (EBV-PAT) in Hanoverian stallions. A total of 228 Hanoverian stallions were genotyped using the Equine SNP50 Beadchip. The most significant association was found on horse chromosome 6 for a single nucleotide polymorphism (SNP) within phospholipase C zeta 1 (PLCz1). In the close neighbourhood to PLCz1 is located CAPZA3 (capping protein (actin filament) muscle Z-line, alpha 3). The gene PLCz1 encodes a protein essential for spermatogenesis and oocyte activation through sperm induced Ca2+-oscillation during fertilization. We derived equine gene models for PLCz1 and CAPZA3 based on cDNA and genomic DNA sequences. The equine PLCz1 had four different transcripts of which two contained a premature termination codon. Sequencing all exons and their flanking sequences using genomic DNA samples from 19 Hanoverian stallions revealed 47 polymorphisms within PLCz1 and one SNP within CAPZA3. Validation of these 48 polymorphisms in 237 Hanoverian stallions identified three intronic SNPs within PLCz1 as significantly associated with EBV-PAT. Bioinformatic analysis suggested regulatory effects for these SNPs via transcription factor binding sites or microRNAs. In conclusion, non-coding polymorphisms within PLCz1 were identified as conferring stallion fertility and PLCz1 as candidate locus for male fertility in Hanoverian warmblood. CAPZA3 could be eliminated as candidate gene for fertility in Hanoverian stallions.
Genome-wide prediction of cis-regulatory regions using supervised deep learning methods.

PubMed

Li, Yifeng; Shi, Wenqiang; Wasserman, Wyeth W

2018-05-31

In the human genome, 98% of DNA sequences are non-protein-coding regions that were previously disregarded as junk DNA. In fact, non-coding regions host a variety of cis-regulatory regions which precisely control the expression of genes. Thus, identifying active cis-regulatory regions in the human genome is critical for understanding gene regulation and assessing the impact of genetic variation on phenotype. The developments of high-throughput sequencing and machine learning technologies make it possible to predict cis-regulatory regions genome wide. Based on rich data resources such as the Encyclopedia of DNA Elements (ENCODE) and the Functional Annotation of the Mammalian Genome (FANTOM) projects, we introduce DECRES based on supervised deep learning approaches for the identification of enhancer and promoter regions in the human genome. Due to their ability to discover patterns in large and complex data, the introduction of deep learning methods enables a significant advance in our knowledge of the genomic locations of cis-regulatory regions. Using models for well-characterized cell lines, we identify key experimental features that contribute to the predictive performance. Applying DECRES, we delineate locations of 300,000 candidate enhancers genome wide (6.8% of the genome, of which 40,000 are supported by bidirectional transcription data), and 26,000 candidate promoters (0.6% of the genome). The predicted annotations of cis-regulatory regions will provide broad utility for genome interpretation from functional genomics to clinical applications. The DECRES model demonstrates potentials of deep learning technologies when combined with high-throughput sequencing data, and inspires the development of other advanced neural network models for further improvement of genome annotations.
Genome-Wide Association Study Identifies Phospholipase C zeta 1 (PLCz1) as a Stallion Fertility Locus in Hanoverian Warmblood Horses

PubMed Central

Schrimpf, Rahel; Dierks, Claudia; Martinsson, Gunilla; Sieme, Harald; Distl, Ottmar

2014-01-01

A consistently high level of stallion fertility plays an economically important role in modern horse breeding. We performed a genome-wide association study for estimated breeding values of the paternal component of the pregnancy rate per estrus cycle (EBV-PAT) in Hanoverian stallions. A total of 228 Hanoverian stallions were genotyped using the Equine SNP50 Beadchip. The most significant association was found on horse chromosome 6 for a single nucleotide polymorphism (SNP) within phospholipase C zeta 1 (PLCz1). In the close neighbourhood to PLCz1 is located CAPZA3 (capping protein (actin filament) muscle Z-line, alpha 3). The gene PLCz1 encodes a protein essential for spermatogenesis and oocyte activation through sperm induced Ca2+-oscillation during fertilization. We derived equine gene models for PLCz1 and CAPZA3 based on cDNA and genomic DNA sequences. The equine PLCz1 had four different transcripts of which two contained a premature termination codon. Sequencing all exons and their flanking sequences using genomic DNA samples from 19 Hanoverian stallions revealed 47 polymorphisms within PLCz1 and one SNP within CAPZA3. Validation of these 48 polymorphisms in 237 Hanoverian stallions identified three intronic SNPs within PLCz1 as significantly associated with EBV-PAT. Bioinformatic analysis suggested regulatory effects for these SNPs via transcription factor binding sites or microRNAs. In conclusion, non-coding polymorphisms within PLCz1 were identified as conferring stallion fertility and PLCz1 as candidate locus for male fertility in Hanoverian warmblood. CAPZA3 could be eliminated as candidate gene for fertility in Hanoverian stallions. PMID:25354211
DNA barcoding reveals species level divergence between populations of the microhylid frog genus Arcovomer (Anura: Microhylidae) in the Atlantic Rainforest of southeastern Brazil.

PubMed

Jennings, W Bryan; Wogel, Henrique; Bilate, Marcos; Salles, Rodrigo de O L; Buckup, Paulo A

2016-09-01

The microhylid frogs belonging to the genus Arcovomer have been reported from lowland Atlantic Rainforest in the Brazilian states of Espírito Santo, Rio de Janeiro, and São Paulo. Here, we use DNA barcoding to assess levels of genetic divergence between apparently isolated populations in Espírito Santo and Rio de Janeiro. Our mtDNA data consisting of cytochrome oxidase subunit I (COI) nucleotide sequences reveal 13.2% uncorrected and 30.4% TIM2 + I + Γ corrected genetic divergences between these two populations. This level of divergence exceeds the suggested 10% uncorrected divergence threshold for elevating amphibian populations to candidate species using this marker, which implies that the Espírito Santo population is a species distinct from Arcovomer passarellii. Calibration of our model-corrected sequence divergence estimates suggests that the time of population divergence falls between 12 and 29 million years ago.
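The uncorrected divergence quoted in this abstract is conventionally a p-distance: the fraction of aligned sites at which two sequences differ. The sketch below is only an illustration of that calculation and of the 10% threshold comparison; the function names and the handling of gaps and ambiguous bases (skipped) are assumptions, not the authors' pipeline, and the model-corrected (TIM2 + I + Γ) distance is not attempted here.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected p-distance: differing sites / compared sites.

    Assumes the two sequences are already aligned to the same length.
    Sites where either sequence has a gap or a non-ACGT symbol are skipped
    (an illustrative choice, not taken from the abstract).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# The 10% uncorrected COI threshold cited in the abstract.
CANDIDATE_SPECIES_THRESHOLD = 0.10


def exceeds_threshold(seq_a: str, seq_b: str,
                      threshold: float = CANDIDATE_SPECIES_THRESHOLD) -> bool:
    """True when the uncorrected divergence is strictly above the threshold."""
    return p_distance(seq_a, seq_b) > threshold
```

With real data the inputs would be full-length aligned COI sequences; a reported divergence of 13.2% would exceed the 0.10 threshold used here.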
PCR-based methods for identification of potentially zoonotic ascaridoid parasites of the dog, fox and cat.

PubMed

Jacobs, D E; Zhu, X; Gasser, R B; Chilton, N B

1997-11-01

Genomic DNA was extracted from ascaridoid nematodes collected from dogs, foxes and cats. A region spanning the second internal transcribed spacer (ITS-2) of the ribosomal DNA of each sample was amplified by PCR. Representative ITS-2 products for each nematode species (Toxocara canis, Toxocara cati and Toxascaris leonina) were sequenced. Restriction sites were identified for use as genetic markers in a PCR-linked RFLP assay. The three species could be differentiated from each other and from other ascaridoids that may be found in human tissues by use of two endonucleases, HinfI and RsaI. Primers were designed to unique regions of the ITS-2 sequences of the three species for use in diagnostic PCR procedures and primer sets evaluated against panels of homologous and heterologous DNA samples. Results suggest that both methods are good candidates for further development for the detection and/or identification of ascaridoid larvae in human tissues.
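The PCR-linked RFLP assay described above depends on where HinfI and RsaI cut each ITS-2 amplicon. As a rough illustration, the sketch below digests a sequence in silico and returns fragment lengths. It relies on the standard recognition sites (HinfI cuts G^ANTC, RsaI cuts GT^AC) and considers the top strand of a linear fragment only; the example sequence used in testing is made up and is not an actual ITS-2 sequence from the study.

```python
import re

# Recognition site (as a lookahead so every site is found) and the cut
# offset measured from the first base of the site.
# HinfI: G^ANTC  -> cut after the first base
# RsaI:  GT^AC   -> blunt cut after the second base
ENZYMES = {
    "HinfI": (re.compile(r"(?=GA[ACGT]TC)"), 1),
    "RsaI": (re.compile(r"(?=GTAC)"), 2),
}


def fragment_lengths(sequence: str, enzyme: str) -> list:
    """Return top-strand fragment lengths after digesting a linear sequence."""
    pattern, offset = ENZYMES[enzyme]
    seq = sequence.upper()
    cuts = [match.start() + offset for match in pattern.finditer(seq)]
    bounds = [0] + cuts + [len(seq)]
    return [end - start for start, end in zip(bounds, bounds[1:])]
```

Species would then be told apart by comparing the fragment-length patterns each enzyme produces on the respective amplicons.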
Genetic Influences on Preterm Birth in Argentina

PubMed Central

Mann, Paul C.; Cooper, Margaret E.; Ryckman, Kelli K.; Comas, Belén; Gili, Juan; Crumley, Suzanne; Bream, Elise N.A.; Byers, Heather M.; Piester, Travis; Schaefer, Amanda; Christine, Paul J.; Lawrence, Amy; Schaa, Kendra L.; Kelsey, Keegan J.P.; Berends, Susan K.; Gadow, Enrique; Cosentino, Viviana; Castilla, Eduardo E.; Camelo, Jorge López; Saleme, Cesar; Day, Lori J.; England, Sarah K.; Marazita, Mary L.; Dagle, John M.; Murray, Jeffrey C.

2013-01-01

Objective To investigate genetic etiologies of preterm birth (PTB) in Argentina through evaluation of single-nucleotide polymorphisms (SNP) in candidate genes and population genetic admixture. Study Design Genotyping was performed in 389 families. Maternal, paternal, and fetal effects were studied separately. Mitochondrial DNA (mtDNA) was sequenced in 50 males and 50 females. Y-chromosome anthropological markers were evaluated in 50 males. Results Fetal association with PTB was found in the progesterone receptor (PGR, rs1942836; p = 0.004). Maternal association with PTB was found in small conductance calcium activated potassium channel isoform 3 (KCNN3, rs883319; p = 0.01). Gestational age associated with PTB in PGR rs1942836 at 32–36 weeks (p = 0.0004). MtDNA sequencing determined 88 individuals had Amerindian consistent haplogroups. Two individuals had Amerindian Y-chromosome consistent haplotypes. Conclusions This study replicates single locus fetal associations with PTB in PGR, maternal association in KCNN3, and demonstrates possible effects for divergent racial admixture on PTB. PMID:23018797
DNA markers in molecular diagnostics for hepatocellular carcinoma

PubMed Central

Su, Ying-Hsiu; Lin, Selena Y; Song, Wei; Jain, Surbhi

2015-01-01

Hepatocellular carcinoma (HCC) is one of the leading causes of cancer mortality in the world, mainly due to the difficulty of early detection and limited therapeutic options. The implementation of HCC surveillance programs in well-defined, high-risk populations was only able to detect about 40–50% of HCC at curative stages (Barcelona Clinic Liver Cancer stages 0 & 1) due to the low sensitivities of the current screening methods. The advance of sequencing technologies has identified numerous modifications as potential candidate DNA markers for diagnosis/surveillance. Here we aim to provide an overview of the DNA alterations that result in activation of cancer pathways known to potentially drive HCC carcinogenesis and to summarize performance characteristics of each DNA marker in the periphery (blood or urine) for HCC screening. PMID:25098554
Amino acid sequence of bovine muzzle epithelial desmocollin derived from cloned cDNA: a novel subtype of desmosomal cadherins.

PubMed

Koch, P J; Goldschmidt, M D; Walsh, M J; Zimbelmann, R; Schmelz, M; Franke, W W

1991-05-01

Desmosomes are cell-type-specific intercellular junctions found in epithelium, myocardium and certain other tissues. They consist of assemblies of molecules involved in the adhesion of specific cell types and in the anchorage of cell-type-specific cytoskeletal elements, the intermediate-size filaments, to the plasma membrane. To explore the individual desmosomal components and their functions we have isolated DNA clones encoding the desmosomal glycoprotein, desmocollin, using antibodies and a cDNA expression library from bovine muzzle epithelium. The cDNA-deduced amino-acid sequence of desmocollin (presently we cannot decide to which of the two desmocollins, DC I or DC II, this clone relates) defines a polypeptide with a calculated molecular weight of 85,000, with a single candidate sequence of 24 amino acids sufficiently long for a transmembrane arrangement, and an extracellular aminoterminal portion of 561 amino acid residues, compared to a cytoplasmic part of only 176 amino acids. Amino acid sequence comparisons have revealed that desmocollin is highly homologous to members of the cadherin family of cell adhesion molecules, including the previously sequenced desmoglein, another desmosome-specific cadherin. Using riboprobes derived from cDNAs for Northern-blot analyses, we have identified an mRNA of approximately 6 kb in stratified epithelia such as muzzle epithelium and tongue mucosa but not in two epithelial cell culture lines containing desmosomes and desmoplakins. The difference may indicate drastic differences in mRNA concentration or the existence of cell-type-specific desmocollin subforms. The molecular topology of desmocollin(s) is discussed in relation to possible functions of the individual molecular domains.
Analysis of the mitochondrial genome of cheetahs (Acinonyx jubatus) with neurodegenerative disease.

PubMed

Burger, Pamela A; Steinborn, Ralf; Walzer, Christian; Petit, Thierry; Mueller, Mathias; Schwarzenberger, Franz

2004-08-18

The complete mitochondrial genome of Acinonyx jubatus was sequenced and mitochondrial DNA (mtDNA) regions were screened for polymorphisms as candidates for the cause of a neurodegenerative demyelinating disease affecting captive cheetahs. The mtDNA reference sequences were established on the basis of the complete sequences of two diseased and two nondiseased animals as well as partial sequences of 26 further individuals. The A. jubatus mitochondrial genome is 17,047-bp long and shows a high sequence similarity (91%) to the domestic cat. Based on single nucleotide polymorphisms (SNPs) in the control region (CR) and pedigree information, the 18 myelopathic and 12 non-myelopathic cheetahs included in this study were classified into haplotypes I, II and III. In view of the phenotypic comparability of the neurodegenerative disease observed in cheetahs and human mtDNA-associated diseases, specific coding regions including the tRNAs leucine UUR, lysine, serine UCN, and partial complex I and V sequences were screened. We identified a heteroplasmic and a homoplasmic SNP at codon 507 in the subunit 5 (MTND5) of complex I. The heteroplasmic haplotype I-specific valine to methionine substitution represents a nonconservative amino acid change and was found in 11 myelopathic and eight non-myelopathic cheetahs with levels ranging from 29% to 79%. The homoplasmic conservative amino acid substitution valine to alanine was identified in two myelopathic animals of haplotype II. In addition, a synonymous SNP in the codon 76 of the MTND4L gene was found in the single haplotype III animal. The amino acid exchanges in the MTND5 gene were not associated with the occurrence of neurodegenerative disease in captive cheetahs.
Characterization and mapping of the human rhodopsin kinase gene and screening of the gene for mutations in patients with retinitis pigmentosa

DOE Office of Scientific and Technical Information (OSTI.GOV)

Khani, S.C.; Lin, D.; Magovcevic, I.

1994-09-01

Rhodopsin kinase (RK) is a cytosolic enzyme in rod photoreceptors that initiates the deactivation of the phototransduction cascade by phosphorylating photoactivated rhodopsin. Although the cDNA sequence of bovine RK has been determined previously, no human cDNA or genomic sequence has thus far been available for genetic studies. In order to investigate the possible role of this candidate gene in retinitis pigmentosa (RP) and allied diseases, we have isolated and characterized human cDNA and genomic clones derived from the RK locus. The coding sequence of the human gene is 1692 nucleotides in length and is split into seven exons. The human and the bovine sequence show 84% identity at the nucleotide level and 92% identity at the amino acid level. Thus far, the intronic sequences flanking each exon except for one have been determined. We have also mapped the human RK gene to chromosome 13q34 using fluorescence in situ hybridization. To our knowledge, no RP gene has as yet been linked to this region. However, since the substrate for RK (rhodopsin) and other members of the phototransduction cascade have been implicated in the pathogenesis of RP, it is conceivable that defects in RK can also cause some forms of this disease. We are evaluating this possibility by screening DNA from 173 patients with autosomal recessive RP and 190 patients with autosomal dominant RP. So far, we have found 11 patients with variant bands. In one patient with autosomal dominant RP we discovered the missense change Ser536Leu. Cosegregation studies and further sequencing of the variant bands are currently underway.
Identification of genes from the Treacher Collins candidate region

DOE Office of Scientific and Technical Information (OSTI.GOV)

Dixon, M.; Dixon, J.; Edwards, S.

Treacher Collins syndrome (TCOF1) is an autosomal dominant disorder of craniofacial development. The TCOF1 locus has previously been mapped to chromosome 5q32-33. The candidate gene region has been defined as being between two flanking markers, ribosomal protein S14 (RPS14) and Annexin 6 (ANX6), by analyzing recombination events in affected individuals. It is estimated that the distance between these flanking markers is 500 kb by three separate analysis methods: (1) radiation hybrid mapping; (2) genetic linkage; and (3) YAC contig analysis. A cosmid contig which spans the candidate gene region for TCOF1 has been constructed by screening the Los Alamos National Laboratory flow-sorted chromosome 5 cosmid library. Cosmids were obtained by using a combination of probes generated from YAC end clones, Alu-PCR fragments from YACs, and asymmetric PCR fragments from both T7 and T3 cosmid ends. Exon amplification, the selection of genomic coding sequences based upon the presence of functional splice acceptor and donor sites, was used to identify potential exon sequences. Sequences found to be conserved between species were then used to screen cDNA libraries in order to identify candidate genes. To date, four different cDNAs have been isolated from this region and are being analyzed as potential candidate genes for TCOF1. These include the genes encoding plasma glutathione peroxidase (GPX3), heparin sulfate sulfotransferase (HSST), a gene with homology to the ETS family of proteins and one which shows no homology to any known genes. Work is also in progress to identify and characterize additional cDNAs from the candidate gene region.
The Fanconi anemia DNA damage repair pathway in the spotlight for germline predisposition to colorectal cancer.

PubMed

Esteban-Jurado, Clara; Franch-Expósito, Sebastià; Muñoz, Jenifer; Ocaña, Teresa; Carballal, Sabela; López-Cerón, Maria; Cuatrecasas, Miriam; Vila-Casadesús, Maria; Lozano, Juan José; Serra, Enric; Beltran, Sergi; Brea-Fernández, Alejandro; Ruiz-Ponte, Clara; Castells, Antoni; Bujanda, Luis; Garre, Pilar; Caldés, Trinidad; Cubiella, Joaquín; Balaguer, Francesc; Castellví-Bel, Sergi

2016-10-01

Colorectal cancer (CRC) is one of the most common neoplasms in the world. Fanconi anemia (FA) is a very rare genetic disease causing bone marrow failure, congenital growth abnormalities and cancer predisposition. The comprehensive FA DNA damage repair pathway requires the collaboration of 53 proteins and it is necessary to restore genome integrity by efficiently repairing damaged DNA. A link between FA genes in breast and ovarian cancer germline predisposition has been previously suggested. We selected 74 CRC patients from 40 unrelated Spanish families with strong CRC aggregation compatible with an autosomal dominant pattern of inheritance and without mutations in known hereditary CRC genes and performed germline DNA whole-exome sequencing with the aim of finding new candidate germline predisposition variants. After sequencing and data analysis, variant prioritization selected only those very rare alterations, producing a putative loss of function and located in genes with a role compatible with cancer. We detected an enrichment for variants in FA DNA damage repair pathway genes in our familial CRC cohort as 6 families carried heterozygous, rare, potentially pathogenic variants located in BRCA2/FANCD1, BRIP1/FANCJ, FANCC, FANCE and REV3L/POLZ. In conclusion, the FA DNA damage repair pathway may play an important role in the inherited predisposition to CRC.
Evaluating the feasibility of using candidate DNA barcodes in discriminating species of the large Asteraceae family

PubMed Central

2010-01-01

Background Five DNA regions, namely, rbcL, matK, ITS, ITS2, and psbA-trnH, have been recommended as primary DNA barcodes for plants. Studies evaluating these regions for species identification in the large plant taxon, which includes a large number of closely related species, have rarely been reported. Results The feasibility of using the five proposed DNA regions was tested for discriminating plant species within Asteraceae, the largest family of flowering plants. Among these markers, ITS2 was the most useful in terms of universality, sequence variation, and identification capability in the Asteraceae family. The species discriminating power of ITS2 was also explored in a large pool of 3,490 Asteraceae sequences that represent 2,315 species belonging to 494 different genera. The result shows that ITS2 correctly identified 76.4% and 97.4% of plant samples at the species and genus levels, respectively. In addition, ITS2 displayed a variable ability to discriminate related species within different genera. Conclusions ITS2 is the best DNA barcode for the Asteraceae family. This approach significantly broadens the application of DNA barcoding to resolve classification problems in the family Asteraceae at the genera and species levels. PMID:20977734
Polymorphism discovery and allele frequency estimation using high-throughput DNA sequencing of target-enriched pooled DNA samples

PubMed Central

2012-01-01

Background The central role of the somatotrophic axis in animal post-natal growth, development and fertility is well established. Therefore, the identification of genetic variants affecting quantitative traits within this axis is an attractive goal. However, large sample numbers are a pre-requisite for the identification of genetic variants underlying complex traits and although technologies are improving rapidly, high-throughput sequencing of large numbers of complete individual genomes remains prohibitively expensive. Therefore using a pooled DNA approach coupled with target enrichment and high-throughput sequencing, the aim of this study was to identify polymorphisms and estimate allele frequency differences across 83 candidate genes of the somatotrophic axis, in 150 Holstein-Friesian dairy bulls divided into two groups divergent for genetic merit for fertility. Results In total, 4,135 SNPs and 893 indels were identified during the resequencing of the 83 candidate genes. Nineteen percent (n = 952) of variants were located within 5' and 3' UTRs. Seventy-two percent (n = 3,612) were intronic and 9% (n = 464) were exonic, including 65 indels and 236 SNPs resulting in non-synonymous substitutions (NSS). Significant (P < 0.01) mean allele frequency differentials between the low and high fertility groups were observed for 720 SNPs (58 NSS). Allele frequencies for 43 of the SNPs were also determined by genotyping the 150 individual animals (Sequenom® MassARRAY). No significant differences (P > 0.1) were observed between the two methods for any of the 43 SNPs across both pools (i.e., 86 tests in total). Conclusions The results of the current study support previous findings of the use of DNA sample pooling and high-throughput sequencing as a viable strategy for polymorphism discovery and allele frequency estimation. 
Using this approach we have characterised the genetic variation within genes of the somatotrophic axis and related pathways, central to mammalian post-natal growth and development and subsequent lactogenesis and fertility. We have identified a large number of variants segregating at significantly different frequencies between cattle groups divergent for calving interval plausibly harbouring causative variants contributing to heritable variation. To our knowledge, this is the first report describing sequencing of targeted genomic regions in any livestock species using groups with divergent phenotypes for an economically important trait. PMID:22235840
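The variant counts reported in this abstract are internally consistent, and the allele frequency comparison between pools can be illustrated with simple read-count arithmetic. The sketch below first checks the reported tallies (4,135 SNPs plus 893 indels against the UTR, intronic and exonic breakdown) and then computes a raw allele frequency difference between two pools. The read counts in any usage are hypothetical, and the plain frequency difference stands in for, and is not, the significance test the authors applied.

```python
# Counts taken directly from the abstract.
SNPS, INDELS = 4135, 893
BY_LOCATION = {"UTR": 952, "intronic": 3612, "exonic": 464}

TOTAL_VARIANTS = SNPS + INDELS  # expected to match the location breakdown
LOCATION_PERCENT = {
    name: round(100 * count / TOTAL_VARIANTS)
    for name, count in BY_LOCATION.items()
}


def allele_frequency(alt_reads: int, total_reads: int) -> float:
    """Estimate a pooled allele frequency as alt reads / total reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def frequency_differential(low_pool: tuple, high_pool: tuple) -> float:
    """Absolute allele frequency difference between two pools.

    Each pool is an (alt_reads, total_reads) pair; the values are whatever
    the caller supplies, not data from the study.
    """
    return abs(allele_frequency(*low_pool) - allele_frequency(*high_pool))
```

In a pooled design, read-count frequencies of this kind are what would be compared against individual genotyping, as the abstract reports doing for 43 SNPs.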
Comparison of the Genome-Wide DNA Methylation Profiles between Fast-Growing and Slow-Growing Broilers

PubMed Central

Li, Zhenhui; Zheng, Xuejuan; Jia, Xinzheng; Nie, Qinghua; Zhang, Xiquan

2013-01-01

Introduction Growth traits are important in poultry production; however, little is known about their regulatory mechanisms at the epigenetic level. Therefore, in this study, we aim to compare DNA methylation profiles between fast- and slow-growing broilers in order to identify candidate genes for chicken growth. Methylated DNA immunoprecipitation-sequencing (MeDIP-seq) was used to investigate the genome-wide DNA methylation pattern in high and low tails of Recessive White Rock (WRRh; WRRl) and that of Xinhua Chickens (XHh; XHl) at 7 weeks of age. The results showed that the average methylation density was the lowest in CGIs followed by promoters. Within the gene body, the methylation density of introns was higher than that of UTRs and exons. Moreover, different methylation levels were observed in different repeat types with the highest in LINE/CR1. Methylated CGIs were prominently distributed in the intergenic regions and were enriched in the size range of 200–300 bp. In total 13,294 methylated genes were found in four samples, including 4,085 differentially methylated genes of WRRh vs. WRRl, 5,599 of XHh vs. XHl, 4,204 of WRRh vs. XHh, as well as 7,301 of WRRl vs. XHl. Moreover, 132 differentially methylated genes related to growth and metabolism were observed in both inner contrasts (WRRh vs. WRRl and XHh vs. XHl), whereas 129 differentially methylated genes related to growth and metabolism were found in both across-breed contrasts (WRRh vs. XHh and WRRl vs. XHl). Further analysis showed that overall 75 genes exhibited altered DNA methylation in all four contrasts, which included some well-known growth factors of IGF1R, FGF12, FGF14, FGF18, FGFR2, and FGFR3. In addition, we validated the MeDIP-seq results by bisulfite sequencing in some regions. Conclusions This study revealed the global DNA methylation pattern of chicken muscle, and identified candidate genes that potentially regulate muscle development at 7 weeks of age at methylation level. PMID:23441189
Comparative analysis of bacteria associated with different mosses by 16S rRNA and 16S rDNA sequencing.

PubMed

Tian, Yang; Li, Yan Hong

2017-01-01

To understand the differences of the bacteria associated with different mosses, a phylogenetic study of bacterial communities in three mosses was carried out based on 16S rDNA and 16S rRNA sequencing. The mosses used were Hygroamblystegium noterophilum, Entodon compressus and Grimmia montana, representing hygrophyte, shady plant and xerophyte, respectively. In total, the operational taxonomic units (OTUs), richness and diversity were different regardless of the moss species and the library level. All the examined 1183 clones were assigned to 248 OTUs, 56 genera were assigned in rDNA libraries and 23 genera were determined at the rRNA level. Proteobacteria and Bacteroidetes were considered as the most dominant phyla in all the libraries, whereas abundant Actinobacteria and Acidobacteria were detected in the rDNA library of Entodon compressus and approximately 24.7% clones were assigned to Candidate division TM7 in Grimmia montana at rRNA level. The heatmap showed the bacterial profiles derived from rRNA and rDNA were partly overlapping. However, the principal component analysis of all the profiles derived from rDNA showed sharper differences between the different mosses than that of rRNA-based profiles. This suggests that the metabolically active bacterial compositions in different mosses were more phylogenetically similar and the differences of the bacteria associated with different mosses were mainly detected at the rDNA level. The results clearly demonstrate that the combination of 16S rDNA and 16S rRNA sequencing is the preferred approach for understanding the constitution of the microbial communities in mosses. © 2016 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.
Human retina-specific amine oxidase (RAO): cDNA cloning, tissue expression, and chromosomal mapping

DOE Office of Scientific and Technical Information (OSTI.GOV)

Imamura, Yutaka; Kubota, Ryo; Wang, Yimin

In search of candidate genes for hereditary retinal disease, we have employed a subtractive and differential cDNA cloning strategy and isolated a novel retina-specific cDNA. Nucleotide sequence analysis revealed an open reading frame of 2187 bp, which encodes a 729-amino-acid protein with a calculated molecular mass of 80,644 Da. The putative protein contained a conserved domain of copper amine oxidase, which is found in various species from bacteria to mammals. It showed the highest homology to bovine serum amine oxidase, which is believed to control the level of serum biogenic amines. Northern blot analysis of human adult and fetal tissues revealed that the protein is expressed abundantly and specifically in retina as a 2.7-kb transcript. Thus, we considered this protein a human retina-specific amine oxidase (RAO). The RAO gene (AOC2) was mapped by fluorescence in situ hybridization to human chromosome 17q21. We propose that AOC2 may be a candidate gene for hereditary ocular diseases. 38 refs., 4 figs.

Supervised DNA Barcodes species classification: analysis, comparisons and results

PubMed Central

2014-01-01

Background Specific fragments, coming from short portions of DNA (e.g., mitochondrial, nuclear, and plastid sequences), have been defined as DNA Barcode and can be used as markers for organisms of the main life kingdoms. Species classification with DNA Barcode sequences has been proven effective on different organisms. Indeed, specific gene regions have been identified as Barcode: COI in animals, rbcL and matK in plants, and ITS in fungi. The classification problem assigns an unknown specimen to a known species by analyzing its Barcode. This task has to be supported with reliable methods and algorithms. Methods In this work the efficacy of supervised machine learning methods to classify species with DNA Barcode sequences is shown. The Weka software suite, which includes a collection of supervised classification methods, is adopted to address the task of DNA Barcode analysis. Classifier families are tested on synthetic and empirical datasets belonging to the animal, fungus, and plant kingdoms. In particular, the function-based method Support Vector Machines (SVM), the rule-based RIPPER, the decision tree C4.5, and the Naïve Bayes method are considered. Additionally, the classification results are compared with respect to ad-hoc and well-established DNA Barcode classification methods. Results A software that converts the DNA Barcode FASTA sequences to the Weka format is released, to adapt different input formats and to allow the execution of the classification procedure. The analysis of results on synthetic and real datasets shows that SVM and Naïve Bayes outperform on average the other considered classifiers, although they do not provide a human interpretable classification model. Rule-based methods have slightly inferior classification performances, but deliver the species specific positions and nucleotide assignments. 
On synthetic data the supervised machine learning methods obtain superior classification performances with respect to the traditional DNA Barcode classification methods. On empirical data their classification performances are at a comparable level to the other methods. Conclusions The classification analysis shows that supervised machine learning methods are promising candidates for handling with success the DNA Barcoding species classification problem, obtaining excellent performances. To conclude, a powerful tool to perform species identification is now available to the DNA Barcoding community. PMID:24721333
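The FASTA-to-Weka conversion mentioned in the abstract above can be sketched as follows. This is not the authors' released software. It is a minimal illustration that assumes pre-aligned, equal-length sequences and a hypothetical header convention of `>id|Species_name`; the function names and the attribute names `pos1`, `pos2`, ... are also assumptions.

```python
def parse_fasta(text):
    """Return a list of (header, sequence) pairs from FASTA text."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        else:
            chunks.append(line.upper())
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


def fasta_to_arff(text, relation="barcodes"):
    """Encode each alignment column as a nominal ARFF attribute."""
    records = parse_fasta(text)
    lengths = {len(seq) for _, seq in records}
    if len(lengths) != 1:
        raise ValueError("sequences must be aligned to one common length")
    n_sites = lengths.pop()
    # Assumed header layout: ">id|Species_name" (no spaces in the label).
    labels = [header.split("|")[1] for header, _ in records]
    lines = [f"@relation {relation}"]
    for i in range(1, n_sites + 1):
        lines.append(f"@attribute pos{i} {{A,C,G,T,N,-}}")
    lines.append("@attribute species {" + ",".join(sorted(set(labels))) + "}")
    lines.append("@data")
    for (_, seq), label in zip(records, labels):
        # Any symbol outside A/C/G/T/- is collapsed to N.
        cleaned = [c if c in "ACGT-" else "N" for c in seq]
        lines.append(",".join(cleaned) + "," + label)
    return "\n".join(lines)


example = ">s1|Homo_sapiens\nACGT\n>s2|Pan_troglodytes\nACGA\n"
print(fasta_to_arff(example))
```

Treating each aligned position as a nominal attribute is what lets rule-based classifiers report species-specific positions and nucleotide assignments, as the abstract notes.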
Genome Sequencing Technologies and Nursing: What Are the Roles of Nurses and Nurse Scientists?

PubMed Central

Taylor, Jacquelyn Y.; Wright, Michelle L.; Hickey, Kathleen T.; Housman, David

2016-01-01

Background Advances in DNA sequencing technology have resulted in an abundance of personalized data with challenging clinical utility and meaning for clinicians. This wealth of data has potential to dramatically impact the quality of healthcare. Nurses are at the focal point in educating patients regarding relevant healthcare needs; therefore, an understanding of sequencing technology and utilizing these data are critical. Aim The objective of this paper is to explicate the role of nurses and nurse scientists as integral members of healthcare teams in improving understanding of DNA sequencing data and translational genomics for patients. Approach A history of the nurse role in newborn screening is used as an exemplar. Discussion This paper serves as an exemplar on how genome sequencing has been utilized in nursing science and incorporates linkages of other omics approaches used by nurses that are included in this special issue. This special issue showcased nurse scientists conducting multi-omic research from various methods, including targeted candidate genes, pharmacogenomics, proteomics, epigenomics and the microbiome. From this vantage point, we provide an overview of the roles of nurse scientists in genome sequencing research and provide recommendations for the best utilization of nurses and nurse scientists related to genome sequencing. PMID:28252579
Analysis of protein-coding genetic variation in 60,706 humans.

PubMed

Lek, Monkol; Karczewski, Konrad J; Minikel, Eric V; Samocha, Kaitlin E; Banks, Eric; Fennell, Timothy; O'Donnell-Luria, Anne H; Ware, James S; Hill, Andrew J; Cummings, Beryl B; Tukiainen, Taru; Birnbaum, Daniel P; Kosmicki, Jack A; Duncan, Laramie E; Estrada, Karol; Zhao, Fengmei; Zou, James; Pierce-Hoffman, Emma; Berghout, Joanne; Cooper, David N; Deflaux, Nicole; DePristo, Mark; Do, Ron; Flannick, Jason; Fromer, Menachem; Gauthier, Laura; Goldstein, Jackie; Gupta, Namrata; Howrigan, Daniel; Kiezun, Adam; Kurki, Mitja I; Moonshine, Ami Levy; Natarajan, Pradeep; Orozco, Lorena; Peloso, Gina M; Poplin, Ryan; Rivas, Manuel A; Ruano-Rubio, Valentin; Rose, Samuel A; Ruderfer, Douglas M; Shakir, Khalid; Stenson, Peter D; Stevens, Christine; Thomas, Brett P; Tiao, Grace; Tusie-Luna, Maria T; Weisburd, Ben; Won, Hong-Hee; Yu, Dongmei; Altshuler, David M; Ardissino, Diego; Boehnke, Michael; Danesh, John; Donnelly, Stacey; Elosua, Roberto; Florez, Jose C; Gabriel, Stacey B; Getz, Gad; Glatt, Stephen J; Hultman, Christina M; Kathiresan, Sekar; Laakso, Markku; McCarroll, Steven; McCarthy, Mark I; McGovern, Dermot; McPherson, Ruth; Neale, Benjamin M; Palotie, Aarno; Purcell, Shaun M; Saleheen, Danish; Scharf, Jeremiah M; Sklar, Pamela; Sullivan, Patrick F; Tuomilehto, Jaakko; Tsuang, Ming T; Watkins, Hugh C; Wilson, James G; Daly, Mark J; MacArthur, Daniel G

2016-08-18

Large-scale reference data sets of human genetic variation are critical for the medical and functional interpretation of DNA sequence changes. Here we describe the aggregation and analysis of high-quality exome (protein-coding region) DNA sequence data for 60,706 individuals of diverse ancestries generated as part of the Exome Aggregation Consortium (ExAC). This catalogue of human genetic diversity contains an average of one variant every eight bases of the exome, and provides direct evidence for the presence of widespread mutational recurrence. We have used this catalogue to calculate objective metrics of pathogenicity for sequence variants, and to identify genes subject to strong selection against various classes of mutation; identifying 3,230 genes with near-complete depletion of predicted protein-truncating variants, with 72% of these genes having no currently established human disease phenotype. Finally, we demonstrate that these data can be used for the efficient filtering of candidate disease-causing variants, and for the discovery of human 'knockout' variants in protein-coding genes.
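Filtering candidate disease-causing variants against a reference panel, as described in the abstract above, usually amounts to an allele-frequency cutoff. The sketch below is a hedged illustration, not the ExAC pipeline: the `Variant` class, the gene names, the counts, and the `max_af` threshold are all hypothetical. The only figure taken from the abstract is the panel size (60,706 individuals, so up to 121,412 autosomal chromosomes).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    change: str
    allele_count: int    # copies of the alternate allele seen in the panel
    allele_number: int   # chromosomes successfully genotyped at this site


def allele_frequency(v: Variant) -> float:
    return v.allele_count / v.allele_number if v.allele_number else 0.0


def filter_rare(variants, max_af=1e-4):
    """Keep variants rare enough to remain plausible for a rare disease."""
    return [v for v in variants if allele_frequency(v) <= max_af]


PANEL_CHROMOSOMES = 2 * 60706  # diploid autosomes

candidates = [
    Variant("GENE_A", "p.Arg10Trp", 1, PANEL_CHROMOSOMES),      # very rare
    Variant("GENE_B", "p.Gly20Ser", 5000, PANEL_CHROMOSOMES),   # common
    Variant("GENE_C", "p.Leu30Pro", 0, PANEL_CHROMOSOMES),      # unseen
]
for v in filter_rare(candidates):
    print(v.gene, v.change)
```

The appropriate cutoff depends on disease prevalence and inheritance mode, so the default used here should be read as a placeholder.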
Detection of Somatic Mutations by High-Resolution DNA Melting (HRM) Analysis in Multiple Cancers

PubMed Central

Gonzalez-Bosquet, Jesus; Calcei, Jacob; Wei, Jun S.; Garcia-Closas, Montserrat; Sherman, Mark E.; Hewitt, Stephen; Vockley, Joseph; Lissowska, Jolanta; Yang, Hannah P.; Khan, Javed; Chanock, Stephen

2011-01-01

Identification of somatic mutations in cancer is a major goal for understanding and monitoring the events related to cancer initiation and progression. High resolution melting (HRM) curve analysis represents a fast, post-PCR high-throughput method for scanning somatic sequence alterations in target genes. The aim of this study was to assess the sensitivity and specificity of HRM analysis for tumor mutation screening in a range of tumor samples, which included 216 frozen pediatric small rounded blue-cell tumors as well as 180 paraffin-embedded tumors from breast, endometrial and ovarian cancers (60 of each). HRM analysis was performed in exons of the following candidate genes known to harbor established commonly observed mutations: PIK3CA, ERBB2, KRAS, TP53, EGFR, BRAF, GATA3, and FGFR3. Bi-directional sequencing analysis was used to determine the accuracy of the HRM analysis. For the 39 mutations observed in frozen samples, the sensitivity and specificity of HRM analysis were 97% and 87%, respectively. There were 67 mutation/variants in the paraffin-embedded samples, and the sensitivity and specificity for the HRM analysis were 88% and 80%, respectively. Paraffin-embedded samples require higher quantity of purified DNA for high performance. In summary, HRM analysis is a promising moderate-throughput screening test for mutations among known candidate genomic regions. Although the overall accuracy appears to be better in frozen specimens, somatic alterations were detected in DNA extracted from paraffin-embedded samples. PMID:21264207
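Sensitivity and specificity, the two metrics reported in the abstract above, are computed from counts of calls compared against a reference method (here, bi-directional sequencing). The abstract does not give the underlying counts, so the numbers below are purely hypothetical and the function names are illustrative.

```python
def sensitivity(tp: int, fn: int) -> float:
    """Fraction of true mutations that the screen flags."""
    return tp / (tp + fn)


def specificity(tn: int, fp: int) -> float:
    """Fraction of mutation-free samples that the screen clears."""
    return tn / (tn + fp)


# Hypothetical counts, NOT taken from the study.
tp, fn = 9, 1     # mutated amplicons: detected / missed
tn, fp = 80, 20   # wild-type amplicons: cleared / falsely flagged

print(f"sensitivity = {sensitivity(tp, fn):.0%}")   # 90%
print(f"specificity = {specificity(tn, fp):.0%}")   # 80%
```

For a screening test that is followed by confirmatory sequencing, a lower specificity mainly costs extra sequencing, whereas a lower sensitivity means missed mutations.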
Candidate Luminal B Breast Cancer Genes Identified by Genome, Gene Expression and DNA Methylation Profiling

PubMed Central

Addou-Klouche, Lynda; Finetti, Pascal; Saade, Marie-Rose; Manai, Marwa; Carbuccia, Nadine; Bekhouche, Ismahane; Letessier, Anne; Charafe-Jauffret, Emmanuelle; Jacquemier, Jocelyne; Spicuglia, Salvatore; de The, Hugues; Viens, Patrice; Bertucci, François; Birnbaum, Daniel; Chaffanet, Max

2014-01-01

Breast cancers (BCs) of the luminal B subtype are estrogen receptor-positive (ER+), highly proliferative, resistant to standard therapies and have a poor prognosis. To better understand this subtype we compared DNA copy number aberrations (CNAs), DNA promoter methylation, gene expression profiles, and somatic mutations in nine selected genes, in 32 luminal B tumors with those observed in 156 BCs of the other molecular subtypes. Frequent CNAs included 8p11-p12 and 11q13.1-q13.2 amplifications, 7q11.22-q34, 8q21.12-q24.23, 12p12.3-p13.1, 12q13.11-q24.11, 14q21.1-q23.1, 17q11.1-q25.1, 20q11.23-q13.33 gains and 6q14.1-q24.2, 9p21.3-p24.3, 9q21.2, 18p11.31-p11.32 losses. A total of 237 and 101 luminal B-specific candidate oncogenes and tumor suppressor genes (TSGs) presented a deregulated expression in relation with their CNAs, including 11 genes previously reported associated with endocrine resistance. Interestingly, 88% of the potential TSGs are located within chromosome arm 6q, and seven candidate oncogenes are potential therapeutic targets. A total of 100 candidate oncogenes were validated in a public series of 5,765 BCs and the overexpression of 67 of these was associated with poor survival in luminal tumors. Twenty-four genes presented a deregulated expression in relation with a high DNA methylation level. FOXO3, PIK3CA and TP53 were the most frequent mutated genes among the nine tested. In a meta-analysis of next-generation sequencing data in 875 BCs, KCNB2 mutations were associated with luminal B cases while candidate TSGs MDN1 (6q15) and UTRN (6q24), were mutated in this subtype. In conclusion, we have reported luminal B candidate genes that may play a role in the development and/or hormone resistance of this aggressive subtype. PMID:24416132
Feasibility study of molecular memory device based on DNA using methylation to store information

NASA Astrophysics Data System (ADS)

Jiang, Liming; Qiu, Wanzhi; Al-Dirini, Feras; Hossain, Faruque M.; Evans, Robin; Skafidas, Efstratios

2016-07-01

DNA, because of its robustness and dense information storage capability, has been proposed as a potential candidate for next-generation storage media. However, encoding information into the DNA sequence requires molecular synthesis technology, which to date is costly and prone to synthesis errors. Reading the DNA strand information is also complex. Ideally, DNA storage will provide methods for modifying stored information. Here, we conduct a feasibility study investigating the use of the DNA 5-methylcytosine (5mC) methylation state as a molecular memory to store information. We propose a new 1-bit memory device and study, based on the density functional theory and non-equilibrium Green's function method, the feasibility of electrically reading the information. Our results show that changes to methylation states lead to changes in the peak of negative differential resistance which can be used to interrogate memory state. Our work demonstrates a new memory concept based on methylation state which can be beneficial in the design of next generation DNA based molecular electronic memory devices.
Structural characterization of copia-type retrotransposons leads to insights into the marker development in a biofuel crop, Jatropha curcas L.

PubMed Central

2013-01-01

Background Recently, Jatropha curcas L. has attracted worldwide attention for its potential as a source of biodiesel. However, most DNA markers have demonstrated high levels of genetic similarity among and within jatropha populations around the globe. Despite promising features of copia-type retrotransposons as ideal genetic tools for gene tagging, mutagenesis, and marker-assisted selection, they have not been characterized in the jatropha genome yet. Here, we examined the diversity, evolution, and genome-wide organization of copia-type retrotransposons in the Asian, African, and Mesoamerican accessions of jatropha, then introduced a retrotransposon-based marker for this biofuel crop. Results In total, 157 PCR fragments that were amplified using the degenerate primers for the reverse transcriptase (RT) domain of copia-type retroelements were sequenced and aligned to construct the neighbor-joining tree. Phylogenetic analysis demonstrated that isolated copia RT sequences were classified into ten families, which were then grouped into three lineages. An in-depth study of the jatropha genome for the RT sequences of each family led to the characterization of full consensus sequences of the jatropha copia-type families. Estimated copy numbers of target sequences were largely different among families, as was presence of genes within 5 kb flanking regions for each family. Five copia-type families were appealing candidates for the development of DNA marker systems. A candidate marker from family Jc7 was particularly capable of detecting genetic variation among different jatropha accessions. Fluorescence in situ hybridization (FISH) to metaphase chromosomes reveals that copia-type retrotransposons are scattered across chromosomes and are mainly located in the distal regions.
Conclusion This is the first report on genome-wide analysis and the cytogenetic mapping of copia-type retrotransposons of jatropha, leading to the discovery of families bearing high potential as DNA markers. Distinct dynamics of individual copia-type families, feasibility of a retrotransposon-based insertion polymorphism marker system in examining genetic variability, and approaches for the development of breeding strategies in jatropha using copia-type retrotransposons are discussed. PMID:24020916
Synthesis of mouse centromere-targeted polyamides and physico-chemical studies of their interaction with the target double-stranded DNA.

PubMed

Nozeret, Karine; Bonan, Marc; Yarmoluk, Serguiy M; Novopashina, Darya S; Boutorine, Alexandre S

2015-09-01

Synthetic minor groove-binding pyrrole-imidazole polyamides labeled by fluorophores are promising candidates for fluorescence imaging of double-stranded DNA in isolated chromosomes or fixed and living cells. We synthesized nine hairpin and two head-to-head tandem polyamides targeting repeated sequences from mouse major satellites. Their interaction with synthetic target dsDNA has been studied by physico-chemical methods in vitro before and after coupling to various fluorophores. The great variability in affinities and fluorescence properties indicates that these properties depend not only on recognition rules, but also on other known and unknown structural factors. Individual testing of each probe is needed before cellular applications. Copyright © 2015 Elsevier Ltd. All rights reserved.
The genetic basis of adaptive pigmentation variation in Drosophila melanogaster.

PubMed

Pool, John E; Aquadro, Charles F

2007-07-01

In a broad survey of Drosophila melanogaster population samples, levels of abdominal pigmentation were found to be highly variable and geographically differentiated. A strong positive correlation was found between dark pigmentation and high altitude, suggesting adaptation to specific environments. DNA sequence polymorphism at the candidate gene ebony revealed a clear association with the pigmentation of homozygous third chromosome lines. The darkest lines sequenced had nearly identical haplotypes spanning 14.5 kb upstream of the protein-coding exons of ebony. Thus, natural selection may have elevated the frequency of an allele that confers dark abdominal pigmentation by influencing the regulation of ebony.
NEIBank: Genomics and bioinformatics resources for vision research

PubMed Central

Peterson, Katherine; Gao, James; Buchoff, Patee; Jaworski, Cynthia; Bowes-Rickman, Catherine; Ebright, Jessica N.; Hauser, Michael A.; Hoover, David

2008-01-01

NEIBank is an integrated resource for genomics and bioinformatics in vision research. It includes expressed sequence tag (EST) data and sequence-verified cDNA clones for multiple eye tissues of several species, web-based access to human eye-specific SAGE data through EyeSAGE, and comprehensive, annotated databases of known human eye disease genes and candidate disease gene loci. All expression- and disease-related data are integrated in EyeBrowse, an eye-centric genome browser. NEIBank provides a comprehensive overview of current knowledge of the transcriptional repertoires of eye tissues and their relation to pathology. PMID:18648525
Characterization of X-OCRL, a Xenopus laevis homologue of OCRL-1, the Lowe oculocerebrorenal syndrome candidate gene

DOE Office of Scientific and Technical Information (OSTI.GOV)

Reilly, D.S.; Nussbaum, R.L.

1994-09-01

The Lowe oculocerebrorenal syndrome (OCRL) is an X-linked disease characterized by congenital cataract, mental retardation, and renal tubular dysfunction. A candidate cDNA, OCRL-1, was identified by positional cloning and mutations in OCRL-1 have been detected in patients with Lowe syndrome. The OCRL-1 nucleotide sequence encodes a predicted protein of 968 amino acids and shares 51% amino acid identity with a human inositol polyphosphate-5-phosphatase. This suggests that the underlying defect in OCRL may be due to a defect in inositol phosphate metabolism. The isolation of OCRL-1 provides the opportunity to investigate its function through the use of animal model systems. We have isolated a partial cDNA clone encoding an OCRL-1 homologue, X-OCRL, from the South African clawed frog, Xenopus laevis. We used a portion of the human cDNA to screen a Xenopus laevis embryo cDNA library and isolated four positive clones. One clone, 42-5A, is a 650 bp insert with over 75% amino acid identity to the corresponding region of the human OCRL-1 sequence. 42-5A detects messenger RNA in adult Xenopus brain, stomach, small intestine, skin, muscle, lung, blood, and oviduct. X-OCRL messenger RNA is first detected during late gastrula and continues to be expressed throughout Xenopus development. In situ hybridization studies are underway to identify the cellular localization of X-OCRL expression in Xenopus embryos and adult tissues. We are especially interested in characterizing X-OCRL expression during formation of the amphibian lens since congenital cataracts are a constant feature of the human disease.
Bioinformatics analysis of single and multi-hybrid epitopes of GRA-1, GRA-4, GRA-6 and GRA-7 proteins to improve DNA vaccine design against Toxoplasma gondii.

PubMed

Shaddel, Minoo; Ebrahimi, Mansour; Tabandeh, Mohammad Reza

2018-06-01

Toxoplasma gondii is a causative agent of morbidity and mortality in immunocompromised and congenitally infected individuals. Attempts to construct DNA vaccines against T. gondii using surface proteins are increasing. The dense granule antigens are highly expressed in the acute and chronic phases of T. gondii infection and are considered suitable DNA vaccine candidates to control toxoplasmosis. In the present study, bioinformatics tools and online software were used to predict, analyze and compare the structural, physical and chemical characteristics and immunogenicity of the GRA-1, GRA-4, GRA-6 and GRA-7 proteins. Sequence alignment results indicated that the GRA-1, GRA-4, GRA-6 and GRA-7 proteins had low similarity. The secondary structure prediction demonstrated that among the four proteins, GRA-1 and GRA-6 had similar secondary structures apart from minor discrepancies. Hydrophilicity/hydrophobicity analysis showed multiple hydrophilic regions and some classical highly hydrophilic domains for each protein sequence. Immunogenic epitope prediction results demonstrated that the GRA-1 and GRA-4 epitopes were stable and GRA-4 showed the highest degree of antigenicity. Although the GRA-7 epitope had the highest immunogenicity score, this epitope was unstable and had the lowest degree of antigenicity and the shortest half-life in eukaryotic cells. Also, the results indicated that the GRA4-GRA7 and GRA6-GRA7 epitopes had the highest degree of antigenicity and immunogenicity among multi-hybrid epitopes, respectively. Overall, in the present study, single epitopes showed the highest degree of antigenicity compared with multi-hybrid epitopes. Given the results, it can be concluded that GRA-4 and GRA-7 can be powerful DNA vaccine candidates against T. gondii.
A Rapid and Improved Method to Generate Recombinant Dengue Virus Vaccine Candidates

PubMed Central

Govindarajan, Dhanasekaran; Guan, Liming; Meschino, Steven; Fridman, Arthur; Bagchi, Ansu; Pak, Irene; ter Meulen, Jan; Casimiro, Danilo R.; Bett, Andrew J.

2016-01-01

Dengue is one of the most important mosquito-borne infections accounting for severe morbidity and mortality worldwide. Recently, the tetravalent chimeric live attenuated Dengue vaccine Dengvaxia® was approved for use in several dengue endemic countries. In general, live attenuated vaccines (LAV) are very efficacious and offer long-lasting immunity against virus-induced disease. Rationally designed LAVs can be generated through reverse genetics technology, a method of generating infectious recombinant viruses from full length cDNA contained in bacterial plasmids. In vitro transcribed (IVT) viral RNA from these infectious clones is transfected into susceptible cells to generate recombinant virus. However, the generation of full-length dengue virus cDNA clones can be difficult due to the genetic instability of viral sequences in bacterial plasmids. To circumvent the need for a single plasmid containing a full length cDNA, in vitro ligation of two or three cDNA fragments contained in separate plasmids can be used to generate a full-length dengue viral cDNA template. However, in vitro ligation of multiple fragments often yields low quality template for IVT reactions, resulting in inconsistent low yield RNA. These technical difficulties make recombinant virus recovery less efficient. In this study, we describe a simple, rapid and efficient method of using LONG-PCR to recover recombinant chimeric Yellow fever dengue (CYD) viruses as potential dengue vaccine candidates. Using this method, we were able to efficiently generate several viable recombinant viruses without introducing any artificial mutations into the viral genomes. We believe that the techniques reported here will enable rapid and efficient recovery of recombinant flaviviruses for evaluation as vaccine candidates and, be applicable to the recovery of other RNA viruses. PMID:27008550
Global DNA methylation analysis reveals miR-214-3p contributes to cisplatin resistance in pediatric intracranial nongerminomatous malignant germ cell tumors.

PubMed

Hsieh, Tsung-Han; Liu, Yun-Ru; Chang, Ting-Yu; Liang, Muh-Lii; Chen, Hsin-Hung; Wang, Hsei-Wei; Yen, Yun; Wong, Tai-Tong

2018-03-27

Pediatric central nervous system germ cell tumors (CNSGCTs) are rare and heterogeneous neoplasms, which can be divided into germinomas and nongerminomatous germ cell tumors (NGGCTs). NGGCTs are further subdivided into mature teratomas and nongerminomatous malignant GCTs (NGMGCTs). Clinical outcomes suggest that NGMGCTs have poor prognosis and survival and that they require more extensive radiotherapy and adjuvant chemotherapy. However, the mechanisms underlying this difference are still unclear. DNA methylation alteration is generally acknowledged to cause therapeutic resistance in cancers. We hypothesized that the pediatric NGMGCTs exhibit a different genome-wide DNA methylation pattern, which is involved in the mechanism of its therapeutic resistance. We performed methylation and hydroxymethylation DNA immunoprecipitation sequencing, mRNA expression microarray, and small RNA sequencing (smRNA-seq) to determine methylation-regulated genes, including microRNAs (miRNAs). The expression levels of 97 genes and 8 miRNAs were correlated with promoter DNA methylation and hydroxymethylation status, such as the miR-199/-214 cluster, and treatment with DNA demethylating agent 5-aza-2'-deoxycytidine elevated its expression level. Furthermore, smRNA-seq analysis showed 27 novel miRNA candidates with differential expression between germinomas and NGMGCTs. Overexpression of miR-214-3p in NCCIT cells leads to reduced expression of the pro-apoptotic protein BCL2-like 11 and induces cisplatin resistance. We interrogated the differential DNA methylation patterns between germinomas and NGMGCTs and proposed a mechanism for chemoresistance in NGMGCTs. In addition, our sequencing data provide a roadmap for further pediatric CNSGCT research and potential targets for the development of new therapeutic strategies.
Development of Cross-Assembly Phage PCR-Based Methods ...

EPA Pesticide Factsheets

Technologies that can characterize human fecal pollution in environmental waters offer many advantages over traditional general indicator approaches. However, many human-associated methods cross-react with non-human animal sources and lack suitable sensitivity for fecal source identification applications. The genome of a newly discovered bacteriophage (~97 kbp), the Cross-Assembly phage or “crAssphage”, assembled from a human gut metagenome DNA sequence library is predicted to be both highly abundant and predominately occur in human feces suggesting that this double stranded DNA virus may be an ideal human fecal pollution indicator. We report the development of two human-associated crAssphage endpoint PCR methods (crAss056 and crAss064). A shotgun strategy was employed where 384 candidate primers were designed to cover ~41 kbp of the crAssphage genome deemed favorable for method development based on a series of bioinformatics analyses. Candidate primers were subjected to three rounds of testing to evaluate assay optimization, specificity, limit of detection (LOD95), geographic variability, and performance in environmental water samples. The top two performing candidate primer sets exhibited 100% specificity (n = 70 individual samples from 8 different animal species), >90% sensitivity (n = 10 raw sewage samples from different geographic locations), LOD95 of 0.01 ng/µL of total DNA per reaction, and successfully detected human fecal pollution in impaired envi
Congenital hypothyroidism with goiter in Tenterfield terriers.

PubMed

Dodgson, S E; Day, R; Fyfe, J C

2012-01-01

A cluster of cases of congenital hypothyroidism with goiter (CHG) in Tenterfield Terriers was identified and hypothesized to be dyshormonogenesis of genetic etiology with autosomal recessive inheritance. To describe the phenotype, thyroid histopathology, biochemistry, mode of inheritance, and causal mutation of CHG in Tenterfield Terriers. Thyroid tissue from 1 CHG-affected Tenterfield Terrier, 2 affected Toy Fox Terriers, and 7 normal control dogs. Genomic DNA from blood or buccal brushings of 114 additional Tenterfield Terriers. Biochemical and genetic segregation analysis of functional gene candidates in a Tenterfield Terrier kindred. Thyroid peroxidase (TPO) iodide oxidation activity was measured, and TPO protein and SDS-resistant thyroglobulin aggregation were assessed on western blots. TPO cDNA was amplified from thyroid RNA and sequenced. Exons and flanking splice sites were amplified from genomic DNA and sequenced. Variant TPO allele segregation was assessed by restriction enzyme digestion of PCR products. Thyroid from an affected pup had lesions consistent with dyshormonogenesis. TPO activity was absent, but normal sized immunocrossreactive TPO protein was present. Affected dog cDNA and genomic sequences revealed a homozygous TPO missense mutation in exon 9 (R593W) that was heterozygous in all obligate carriers and in 31% of other clinically normal Tenterfield Terriers. The mutation underlying CHG in Tenterfield Terriers was identified, and a convenient carrier test made available for screening Tenterfield Terriers used for breeding. Copyright © 2012 by the American College of Veterinary Internal Medicine.
Automated Antibody De Novo Sequencing and Its Utility in Biopharmaceutical Discovery

NASA Astrophysics Data System (ADS)

Sen, K. Ilker; Tang, Wilfred H.; Nayak, Shruti; Kil, Yong J.; Bern, Marshall; Ozoglu, Berk; Ueberheide, Beatrix; Davis, Darryl; Becker, Christopher

2017-05-01

Applications of antibody de novo sequencing in the biopharmaceutical industry range from the discovery of new antibody drug candidates to identifying reagents for research and determining the primary structure of innovator products for biosimilar development. When murine, phage display, or patient-derived monoclonal antibodies against a target of interest are available, but the cDNA or the original cell line is not, de novo protein sequencing is required to humanize and recombinantly express these antibodies, followed by in vitro and in vivo testing for functional validation. Availability of fully automated software tools for monoclonal antibody de novo sequencing enables efficient and routine analysis. Here, we present a novel method to automatically de novo sequence antibodies using mass spectrometry and the Supernovo software. The robustness of the algorithm is demonstrated through a series of stress tests.
Accurate Typing of Human Leukocyte Antigen Class I Genes by Oxford Nanopore Sequencing.

PubMed

Liu, Chang; Xiao, Fangzhou; Hoisington-Lopez, Jessica; Lang, Kathrin; Quenzel, Philipp; Duffy, Brian; Mitra, Robi David

2018-04-03

Oxford Nanopore Technologies' MinION has expanded the current DNA sequencing toolkit by delivering long read lengths and extreme portability. The MinION has the potential to enable expedited point-of-care human leukocyte antigen (HLA) typing, an assay routinely used to assess the immunologic compatibility between organ donors and recipients, but the platform's high error rate makes it challenging to type alleles with accuracy. We developed and validated accurate typing of HLA by Oxford nanopore (Athlon), a bioinformatic pipeline that i) maps nanopore reads to a database of known HLA alleles, ii) identifies candidate alleles with the highest read coverage at different resolution levels that are represented as branching nodes and leaves of a tree structure, iii) generates consensus sequences by remapping the reads to the candidate alleles, and iv) calls the final diploid genotype by blasting consensus sequences against the reference database. Using two independent data sets generated on the R9.4 flow cell chemistry, Athlon achieved a 100% accuracy in class I HLA typing at the two-field resolution. Copyright © 2018 American Society for Investigative Pathology and the Association for Molecular Pathology. Published by Elsevier Inc. All rights reserved.
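Step ii of the pipeline described above (ranking candidate alleles by read coverage at a chosen resolution) can be illustrated with a minimal sketch. This is not Athlon's code: the function names, the toy read hits, and the choice to keep the two best-covered alleles are assumptions made only for illustration.

```python
from collections import Counter

def truncate_allele(name, fields):
    """Trim an HLA allele name such as 'A*02:01:01:01' to its first `fields` fields."""
    gene, _, rest = name.partition("*")
    return gene + "*" + ":".join(rest.split(":")[:fields])

def top_candidates(read_hits, fields=2, n=2):
    """Count reads per allele at the given field resolution and return the n best covered."""
    counts = Counter(truncate_allele(allele, fields) for allele in read_hits)
    return [allele for allele, _ in counts.most_common(n)]

# Hypothetical best-hit allele for each mapped read (illustrative data only).
hits = [
    "A*02:01:01:01", "A*02:01:01:02", "A*24:02:01:01",
    "A*02:01:01:01", "A*24:02:01:01", "A*01:01:01:01",
]
candidates = top_candidates(hits)
print(candidates)
```

Grouping by truncated name mimics walking down a tree of resolution levels: the one-field node `A*02` collects every read assigned to any of its two-field leaves.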

Candidate Cancer Allele cDNA Collection | Office of Cancer Genomics

Cancer.gov

CTD2 researchers at the Broad Institute/DFCI have developed a collection of plasmids including mutant alleles found in sequencing studies of cancer. It includes somatic variants found in lung adenocarcinoma and across other cancer types. The clones enable researchers to characterize the function of the cancer variants in high-throughput experiments. These plasmids are collectively called the “Broad Target Accelerator Plasmid Collections”.
Characterization of the canine desmin (DES) gene and evaluation as a candidate gene for dilated cardiomyopathy in the Dobermann.

PubMed

Stabej, Polona; Imholz, Sandra; Versteeg, Serge A; Zijlstra, Carla; Stokhof, Arnold A; Domanjko-Petric, Aleksandra; Leegwater, Peter A J; van Oost, Bernard A

2004-10-13

Dilated cardiomyopathy (DCM) in dogs is a disease of the myocardium associated with dilatation and impaired contraction of the ventricles and is suspected to have a genetic cause. A missense mutation in the desmin gene (DES) causes DCM in a human family. Human DCM closely resembles the canine disease. In the present study, we evaluated whether DES gene mutations are responsible for DCM in Dobermann dogs. We have isolated bacterial artificial chromosome clones (BACs) containing the canine DES gene and determined the chromosomal location by fluorescence in situ hybridization (FISH). Using data deposited in the NCBI trace archive and GenBank, the canine DES gene DNA sequence was assembled and seven single nucleotide polymorphisms (SNPs) were identified. From the canine DES gene BAC clones, a polymorphic microsatellite marker was isolated. The microsatellite marker and four informative desmin SNPs were typed in a Dobermann family with frequent DCM occurrence, but the disease phenotype did not associate with a desmin haplotype. We concluded that mutations in the DES gene do not play a role in Dobermann DCM. Availability of the microsatellite marker, SNPs and DNA sequence reported in this study enables fast evaluation of the DES gene as a DCM candidate gene in other dog breeds with DCM occurrence.
An optimized and low-cost FPGA-based DNA sequence alignment--a step towards personal genomics.

PubMed

Shah, Hurmat Ali; Hasan, Laiq; Ahmad, Nasir

2013-01-01

DNA sequence alignment is a cardinal process in computational biology, but it is computationally expensive when performed on traditional platforms such as CPUs. Of the many off-the-shelf platforms explored for speeding up the computation, the FPGA stands out as the best candidate because of its performance per dollar spent and performance per watt. These two advantages make the FPGA the most appropriate choice for realizing the aim of personal genomics. Previous implementations of DNA sequence alignment did not take into consideration the price of the device on which optimization was performed. This paper presents optimizations over a previous FPGA implementation that increase the overall speed-up achieved while also taking into account the price of the platform being optimized. The optimizations are: (1) the array of processing elements is made to run on a change in input value rather than on the clock, eliminating the need for tight clock synchronization; (2) the implementation is unrestrained by the size of the sequences to be aligned; (3) the waiting time required for the sequences to load onto the FPGA is reduced to the minimum possible; and (4) an efficient method is devised to store the output matrix that makes it possible to save the diagonal elements for use in the next pass, in parallel with the computation of the output matrix. Implemented on a Spartan3 FPGA, this design achieved a 20-fold performance improvement in terms of CUPS over a GPP implementation.
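For readers unfamiliar with the CUPS metric (cell updates per second), the following software sketch fills a dynamic-programming alignment matrix and converts a cell count and a run time into CUPS. The abstract does not name the alignment algorithm, so Smith-Waterman local alignment is assumed here, and the scoring values (match +2, mismatch -1, gap -1) are hypothetical. It is a plain CPU reference, not the FPGA design.

```python
def sw_score(a, b, match=2, mismatch=-1, gap=-1):
    """Best Smith-Waterman local alignment score with a linear gap penalty.

    Only two rows of the matrix are kept in memory at any time.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            curr[j] = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            best = max(best, curr[j])
        prev = curr
    return best

def cups(cells, seconds):
    """Cell updates per second: matrix cells computed divided by elapsed time."""
    return cells / seconds

print(sw_score("ACGT", "ACT"))
print(cups(len("ACGT") * len("ACT"), 0.5))
```

Each cell depends only on its left, upper, and upper-left neighbours, so all cells on one anti-diagonal are independent of each other. That independence is what an array of processing elements exploits in hardware.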
Prediction of constitutive A-to-I editing sites from human transcriptomes in the absence of genomic sequences

PubMed Central

2013-01-01

Background Adenosine-to-inosine (A-to-I) RNA editing is recognized as a cellular mechanism for generating both RNA and protein diversity. Inosine base pairs with cytidine during reverse transcription and therefore appears as guanosine during sequencing of cDNA. Current approaches of RNA editing identification largely depend on the comparison between transcriptomes and genomic DNA (gDNA) sequencing datasets from the same individuals, and it has been challenging to identify editing candidates from transcriptomes in the absence of gDNA information. Results We have developed a new strategy to accurately predict constitutive RNA editing sites from publicly available human RNA-seq datasets in the absence of relevant genomic sequences. Our approach establishes new parameters to increase the ability to map mismatches and to minimize sequencing/mapping errors and unreported genome variations. We identified 695 novel constitutive A-to-I editing sites that appear in clusters (named “editing boxes”) in multiple samples and which exhibit spatial and dynamic regulation across human tissues. Some of these editing boxes are enriched in non-repetitive regions lacking inverted repeat structures and contain an extremely high conversion frequency of As to Is. We validated a number of editing boxes in multiple human cell lines and confirmed that ADAR1 is responsible for the observed promiscuous editing events in non-repetitive regions, further expanding our knowledge of the catalytic substrate of A-to-I RNA editing by ADAR enzymes. Conclusions The approach we present here provides a novel way of identifying A-to-I RNA editing events by analyzing only RNA-seq datasets. This method has allowed us to gain new insights into RNA editing and should also aid in the identification of more constitutive A-to-I editing sites from additional transcriptomes. PMID:23537002
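Because inosine is read as guanosine in cDNA, an edited site shows up as an A-to-G mismatch between the reference and the read. The sketch below collects such mismatches and groups nearby ones into clusters, loosely in the spirit of the "editing boxes" mentioned above. The function names, the gap threshold, and the minimum cluster size are assumptions for illustration and are not the parameters used in the study.

```python
def a_to_g_sites(reference, read, offset=0):
    """Positions where the reference has A and the aligned read has G."""
    return [
        offset + i
        for i, (ref_base, read_base) in enumerate(zip(reference, read))
        if ref_base == "A" and read_base == "G"
    ]

def cluster_sites(sites, max_gap=10, min_sites=3):
    """Group sorted positions into clusters whose neighbours lie within max_gap."""
    clusters, current = [], []
    for pos in sorted(sites):
        if current and pos - current[-1] > max_gap:
            if len(current) >= min_sites:
                clusters.append(current)
            current = []
        current.append(pos)
    if len(current) >= min_sites:
        clusters.append(current)
    return clusters

# Toy, gap-free alignment (illustrative sequences only).
sites = a_to_g_sites("AACATAGGA", "GACGTGGGA")
print(sites, cluster_sites(sites + [100]))
```

A real analysis would also have to handle transcripts from the reverse strand (where the same edit appears as T-to-C) and filter out known genomic variants, which this sketch ignores.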
Molecular dynamics simulations of DNA-polycation complexes

NASA Astrophysics Data System (ADS)

Ziebarth, Jesse; Wang, Yongmei

2008-03-01

A necessary step in the preparation of DNA for use in gene therapy is the packaging of DNA with a vector that can condense DNA and provide protection from degrading enzymes. Because of the immunoresponses caused by viral vectors, there has been interest in developing synthetic gene therapy vectors, with polycations emerging as promising candidates. Molecular dynamics simulations of the DNA duplex CGCGAATTCGCG in the presence of 20-monomer-long sequences of the polycations poly-L-lysine (PLL) and polyethyleneimine (PEI), with explicit counterions and TIP3P water, are performed to provide insight into the structure and formation of DNA polyplexes. After an initial separation of approximately 50 Å, the DNA and polycation come together and form a stable complex within 10 ns. The DNA does not undergo any major structural changes upon complexation and remains in the B-form. In the formed complex, the charged amine groups of the polycation mainly interact with DNA phosphate groups, and rarely occupy electronegative sites in either the major or minor grooves. Differences between complexation with PEI and PLL will be discussed.
Exome Sequence Analysis of 14 Families With High Myopia.

PubMed

Kloss, Bethany A; Tompson, Stuart W; Whisenhunt, Kristina N; Quow, Krystina L; Huang, Samuel J; Pavelec, Derek M; Rosenberg, Thomas; Young, Terri L

2017-04-01

To identify causal gene mutations in 14 families with autosomal dominant (AD) high myopia using exome sequencing. Select individuals from 14 large Caucasian families with high myopia were exome sequenced. Gene variants were filtered to identify potential pathogenic changes. Sanger sequencing was used to confirm variants in original DNA, and to test for disease cosegregation in additional family members. Candidate genes and chromosomal loci previously associated with myopic refractive error and its endophenotypes were comprehensively screened. In 14 high myopia families, we identified 73 rare and 31 novel gene variants as candidates for pathogenicity. In seven of these families, two of the novel and eight of the rare variants were within known myopia loci. A total of 104 heterozygous nonsynonymous rare variants in 104 genes were identified in 10 out of 14 probands. Each variant cosegregated with affection status. No rare variants were identified in genes known to cause myopia or in genes closest to published genome-wide association study association signals for refractive error or its endophenotypes. Whole exome sequencing was performed to determine gene variants implicated in the pathogenesis of AD high myopia. This study provides new genes for consideration in the pathogenesis of high myopia, and may aid in the development of genetic profiling of those at greatest risk for attendant ocular morbidities of this disorder.
DNA detection and single nucleotide mutation identification using SERS for molecular diagnostics and global health

NASA Astrophysics Data System (ADS)

Ngo, Hoan T.; Gandra, Naveen; Fales, Andrew M.; Taylor, Steve M.; Vo-Dinh, Tuan

2017-02-01

Nucleic acid-based molecular diagnostics at the point-of-care (POC) and in resource-limited settings is still a challenge. We present a sensitive yet simple DNA detection method with single nucleotide polymorphism (SNP) identification capability. The detection scheme involves sandwich hybridization of magnetic beads conjugated with capture probes, target sequences, and ultrabright surface-enhanced Raman Scattering (SERS) nanorattles conjugated with reporter probes. Upon hybridization, the sandwich probes are concentrated at the detection focus controlled by a magnetic system for SERS measurements. The ultrabright SERS nanorattles, consisting of a core and a shell with resonance Raman reporters loaded in the gap space between the core and the shell, serve as SERS tags for ultrasensitive signal detection. Specific DNA sequences of the malaria parasite Plasmodium falciparum and dengue virus 1 (DENV1) were used as the model marker system. Detection limit of approximately 100 attomoles was achieved. Single nucleotide polymorphism (SNP) discrimination of wild type malaria DNA and mutant malaria DNA, which confers resistance to artemisinin drugs, was also demonstrated. The results demonstrate the molecular diagnostic potential of the nanorattle-based method to both detect and genotype infectious pathogens. The method's simplicity makes it a suitable candidate for molecular diagnosis at the POC and in resource-limited settings.
A conserved region of leptospiral immunoglobulin-like A and B proteins as a DNA vaccine elicits a prophylactic immune response against leptospirosis.

PubMed

Forster, Karine M; Hartwig, Daiane D; Seixas, Fabiana K; Bacelo, Kátia L; Amaral, Marta; Hartleben, Cláudia P; Dellagostin, Odir A

2013-05-01

The leptospiral immunoglobulin-like (Lig) proteins LigA and LigB possess immunoglobulin-like domains with 90-amino-acid repeats and are adhesion molecules involved in pathogenicity. They are conserved in pathogenic Leptospira spp. and thus are of interest for use as serodiagnostic antigens and in recombinant vaccine formulations. The N-terminal amino acid sequences of the LigA and LigB proteins are identical, but the C-terminal sequences vary. In this study, we evaluated the protective potential of five truncated forms of LigA and LigB proteins from Leptospira interrogans serovar Canicola as DNA vaccines using the pTARGET mammalian expression vector. Hamsters immunized with the DNA vaccines were subjected to a heterologous challenge with L. interrogans serovar Copenhageni strain Spool via the intraperitoneal route. Immunization with a DNA vaccine encoding LigBrep resulted in the survival of 5/8 (62.5%) hamsters against lethal infection (P < 0.05). None of the control hamsters or animals immunized with the other vaccine preparations survived. The vaccine induced an IgG antibody response and, additionally, conferred sterilizing immunity in 80% of the surviving animals. Our results indicate that the LigBrep DNA vaccine is a promising candidate for inclusion in a protective leptospiral vaccine.
A Conserved Region of Leptospiral Immunoglobulin-Like A and B Proteins as a DNA Vaccine Elicits a Prophylactic Immune Response against Leptospirosis

PubMed Central

Forster, Karine M.; Hartwig, Daiane D.; Seixas, Fabiana K.; Bacelo, Kátia L.; Amaral, Marta; Hartleben, Cláudia P.

2013-01-01

The leptospiral immunoglobulin-like (Lig) proteins LigA and LigB possess immunoglobulin-like domains with 90-amino-acid repeats and are adhesion molecules involved in pathogenicity. They are conserved in pathogenic Leptospira spp. and thus are of interest for use as serodiagnostic antigens and in recombinant vaccine formulations. The N-terminal amino acid sequences of the LigA and LigB proteins are identical, but the C-terminal sequences vary. In this study, we evaluated the protective potential of five truncated forms of LigA and LigB proteins from Leptospira interrogans serovar Canicola as DNA vaccines using the pTARGET mammalian expression vector. Hamsters immunized with the DNA vaccines were subjected to a heterologous challenge with L. interrogans serovar Copenhageni strain Spool via the intraperitoneal route. Immunization with a DNA vaccine encoding LigBrep resulted in the survival of 5/8 (62.5%) hamsters against lethal infection (P < 0.05). None of the control hamsters or animals immunized with the other vaccine preparations survived. The vaccine induced an IgG antibody response and, additionally, conferred sterilizing immunity in 80% of the surviving animals. Our results indicate that the LigBrep DNA vaccine is a promising candidate for inclusion in a protective leptospiral vaccine. PMID:23486420
Direct and long-term detection of gene doping in conventional blood samples.

PubMed

Beiter, T; Zimmermann, M; Fragasso, A; Hudemann, J; Niess, A M; Bitzer, M; Lauer, U M; Simon, P

2011-03-01

The misuse of somatic gene therapy for the purpose of enhancing athletic performance is perceived as a coming threat to the world of sports and categorized as 'gene doping'. This article describes a direct detection approach for gene doping that gives a clear yes-or-no answer based on the presence or absence of transgenic DNA in peripheral blood samples. By exploiting a priming strategy to specifically amplify intronless DNA sequences, we developed PCR protocols allowing the detection of very small amounts of transgenic DNA in genomic DNA samples to screen for six prime candidate genes. Our detection strategy was verified in a mouse model, giving positive signals from minute amounts (20 μl) of blood samples for up to 56 days following intramuscular adeno-associated virus-mediated gene transfer, one of the most likely candidate vector systems to be misused for gene doping. To make our detection strategy amenable for routine testing, we implemented a robust sample preparation and processing protocol that allows cost-efficient analysis of small human blood volumes (200 μl) with high specificity and reproducibility. The practicability and reliability of our detection strategy was validated by a screening approach including 327 blood samples taken from professional and recreational athletes under field conditions.
Resolving species delimitation within the genus Bunopus Blanford, 1874 (Squamata: Gekkonidae) in Iran using DNA barcoding approach.

PubMed

Khosravani, Azar; Rastegar-Pouyani, Eskandar; Rastegar-Pouyani, Nasrullah; Oraie, Hamzeh; Papenfuss, Theodore J

2017-12-19

Mitochondrial COI sequences were used to investigate species delimitation within the genus Bunopus in Iran. A dataset with a final sequence length of 633 nucleotides, comprising 100 specimens from 31 geographically distant localities across Iran, was generated. The results demonstrated that two major clades with strong support can be identified within the genus Bunopus in Iran. Clade A includes Bunopus crassicaudus and two new entities, eastern populations (subclade A2,1) and Shahdad populations (subclade A2,2). The second clade comprises western and southwestern populations (subclade B1,1), Arabian populations (subclade B1,2) and south and southeast populations in Iran, to which Bunopus tuberculatus (subclade B2) is assigned. In addition to Bunopus crassicaudus and B. tuberculatus, three new candidate species in Iran can easily be identified based on the DNA barcoding approach.
EXONSAMPLER: a computer program for genome-wide and candidate gene exon sampling for targeted next-generation sequencing.

PubMed

Cosart, Ted; Beja-Pereira, Albano; Luikart, Gordon

2014-11-01

The computer program EXONSAMPLER automates the sampling of thousands of exon sequences from publicly available reference genome sequences and gene annotation databases. It was designed to provide exon sequences for the efficient, next-generation gene sequencing method called exon capture. The exon sequences can be sampled by a list of gene name abbreviations (e.g. IFNG, TLR1), or by sampling exons from genes spaced evenly across chromosomes. It provides a list of genomic coordinates (a bed file), as well as a set of sequences in fasta format. User-adjustable parameters for collecting exon sequences include a minimum and maximum acceptable exon length, maximum number of exonic base pairs (bp) to sample per gene, and maximum total bp for the entire collection. It allows for partial sampling of very large exons. It can preferentially sample upstream (5 prime) exons, downstream (3 prime) exons, both external exons, or all internal exons. It is written in the Python programming language using its free libraries. We describe the use of EXONSAMPLER to collect exon sequences from the domestic cow (Bos taurus) genome for the design of an exon-capture microarray to sequence exons from related species, including the zebu cow and wild bison. We collected ~10% of the exome (~3 million bp), including 155 candidate genes, and ~16,000 exons evenly spaced genomewide. We prioritized the collection of 5 prime exons to facilitate discovery and genotyping of SNPs near upstream gene regulatory DNA sequences, which control gene expression and are often under natural selection. © 2014 John Wiley & Sons Ltd.
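The user-adjustable limits described above (minimum and maximum exon length, a per-gene base-pair cap, and a cap on the whole collection) can be sketched as a simple greedy filter that emits BED-style lines. This is not EXONSAMPLER's actual interface: the function names, default values, and the toy exon coordinates are hypothetical.

```python
def sample_exons(exons, min_len=100, max_len=1000,
                 max_bp_per_gene=2000, max_total_bp=5000):
    """Greedily keep exons that satisfy the length, per-gene, and total limits.

    `exons` is an iterable of (gene, chrom, start, end) tuples using
    0-based, half-open coordinates as in the BED format.
    """
    per_gene, total, chosen = {}, 0, []
    for gene, chrom, start, end in exons:
        length = end - start
        if not (min_len <= length <= max_len):
            continue  # exon too short or too long
        if per_gene.get(gene, 0) + length > max_bp_per_gene:
            continue  # would exceed the per-gene budget
        if total + length > max_total_bp:
            continue  # would exceed the collection budget
        per_gene[gene] = per_gene.get(gene, 0) + length
        total += length
        chosen.append((gene, chrom, start, end))
    return chosen

def to_bed(chosen):
    """Format the selected exons as tab-separated BED lines (chrom, start, end, name)."""
    return "\n".join(f"{chrom}\t{start}\t{end}\t{gene}"
                     for gene, chrom, start, end in chosen)

# Made-up coordinates for two of the gene names mentioned in the abstract.
exons = [
    ("IFNG", "chr1", 100, 400),
    ("IFNG", "chr1", 1000, 1050),
    ("TLR1", "chr2", 0, 900),
    ("TLR1", "chr2", 2000, 2900),
]
selected = sample_exons(exons, max_bp_per_gene=1000)
print(to_bed(selected))
```

Preferential sampling of 5 prime or 3 prime exons could be layered on top by sorting each gene's exons before they reach the filter.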
Identification of 15 candidate structured noncoding RNA motifs in fungi by comparative genomics.

PubMed

Li, Sanshu; Breaker, Ronald R

2017-10-13

With the development of rapid and inexpensive DNA sequencing, the genome sequences of more than 100 fungal species have been made available. This dataset provides an excellent resource for comparative genomics analyses, which can be used to discover genetic elements, including noncoding RNAs (ncRNAs). Bioinformatics tools similar to those used to uncover novel ncRNAs in bacteria, likewise, should be useful for searching fungal genomic sequences, and the relative ease of genetic experiments with some model fungal species could facilitate experimental validation studies. We have adapted a bioinformatics pipeline for discovering bacterial ncRNAs to systematically analyze many fungal genomes. This comparative genomics pipeline integrates information on conserved RNA sequence and structural features with alternative splicing information to reveal fungal RNA motifs that are candidate regulatory domains, or that might have other possible functions. A total of 15 prominent classes of structured ncRNA candidates were identified, including variant HDV self-cleaving ribozyme representatives, atypical snoRNA candidates, and possible structured antisense RNA motifs. Candidate regulatory motifs were also found associated with genes for ribosomal proteins, S-adenosylmethionine decarboxylase (SDC), amidase, and HexA protein involved in Woronin body formation. We experimentally confirm that the variant HDV ribozymes undergo rapid self-cleavage, and we demonstrate that the SDC RNA motif reduces the expression of SAM decarboxylase by translational repression. Furthermore, we provide evidence that several other motifs discovered in this study are likely to be functional ncRNA elements. Systematic screening of fungal genomes using a computational discovery pipeline has revealed the existence of a variety of novel structured ncRNAs. 
Genome contexts and similarities to known ncRNA motifs provide strong evidence for the biological and biochemical functions of some newly found ncRNA motifs. Although initial examinations of several motifs provide evidence for their likely functions, other motifs will require more in-depth analysis to reveal their functions.
Identification of a novel mutation in a Chinese family with Nance-Horan syndrome by whole exome sequencing.

PubMed

Hong, Nan; Chen, Yan-hua; Xie, Chen; Xu, Bai-sheng; Huang, Hui; Li, Xin; Yang, Yue-qing; Huang, Ying-ping; Deng, Jian-lian; Qi, Ming; Gu, Yang-shun

2014-08-01

Nance-Horan syndrome (NHS) is a rare X-linked disorder characterized by congenital nuclear cataracts, dental anomalies, and craniofacial dysmorphisms. Mental retardation was present in about 30% of the reported cases. The purpose of this study was to investigate the genetic and clinical features of NHS in a Chinese family. Whole exome sequencing analysis was performed on DNA from an affected male to scan for candidate mutations on the X-chromosome. Sanger sequencing was used to verify these candidate mutations in the whole family. Clinical and ophthalmological examinations were performed on all members of the family. A combination of exome sequencing and Sanger sequencing revealed a nonsense mutation c.322G>T (E108X) in exon 1 of NHS gene, co-segregating with the disease in the family. The nonsense mutation led to the conversion of glutamic acid to a stop codon (E108X), resulting in truncation of the NHS protein. Multiple sequence alignments showed that codon 108, where the mutation (c.322G>T) occurred, was located within a phylogenetically conserved region. The clinical features in all affected males and female carriers are described in detail. We report a nonsense mutation c.322G>T (E108X) in a Chinese family with NHS. Our findings broaden the spectrum of NHS mutations and provide molecular insight into future NHS clinical genetic diagnosis.
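The relationship between the coding position c.322 and codon 108 can be checked with a few lines of arithmetic. The sketch below maps a 1-based coding position to its codon and then applies the G>T change to both standard glutamic acid codons. The abstract does not state which of the two codons is present in the NHS gene, so both are tried, and the helper names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
GLU_CODONS = {"GAA", "GAG"}  # standard genetic code

def codon_of(cdna_pos):
    """Return (codon number, position within codon) for a 1-based coding position."""
    return (cdna_pos - 1) // 3 + 1, (cdna_pos - 1) % 3 + 1

def substitute(codon, pos_in_codon, new_base):
    """Replace the base at a 1-based position within a codon."""
    return codon[:pos_in_codon - 1] + new_base + codon[pos_in_codon:]

codon_number, pos = codon_of(322)
mutants = {codon: substitute(codon, pos, "T") for codon in GLU_CODONS}
print(codon_number, pos, mutants)
```

Position 322 is the first base of codon 108, and a G>T change there turns either glutamic acid codon into a stop codon, consistent with the E108X annotation.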
Cell-Free Expression and In Situ Immobilization of Parasite Proteins from Clonorchis sinensis for Rapid Identification of Antigenic Candidates

PubMed Central

Ju, Jung Won; Kim, Ho-Cheol; Shin, Hyun-Il; Kim, Yu Jung; Kim, Dong-Myung

2015-01-01

Progress towards genetic sequencing of human parasites has provided the groundwork for a post-genomic approach to develop novel antigens for the diagnosis and treatment of parasite infections. To fully utilize the genomic data, however, high-throughput methodologies are required for functional analysis of the proteins encoded in the genomic sequences. In this study, we investigated cell-free expression and in situ immobilization of parasite proteins as a novel platform for the discovery of antigenic proteins. PCR-amplified parasite DNA was immobilized on microbeads that were also functionalized to capture synthesized proteins. When the microbeads were incubated in a reaction mixture for cell-free synthesis, proteins expressed from the microbead-immobilized DNA were instantly immobilized on the same microbeads, providing a physical linkage between the genetic information and encoded proteins. This approach of in situ expression and isolation enables streamlined recovery and analysis of cell-free synthesized proteins and also allows facile identification of the genes coding antigenic proteins through direct PCR of the microbead-bound DNA. PMID:26599101
De novo transcriptome sequencing of axolotl blastema for identification of differentially expressed genes during limb regeneration

PubMed Central

2013-01-01

Background Salamanders are unique among vertebrates in their ability to completely regenerate amputated limbs through the mediation of blastema cells located at the stump ends. This regeneration is nerve-dependent because blastema formation and regeneration do not occur after limb denervation. To obtain the genomic information of blastema tissues, de novo transcriptomes from both blastema tissues and denervated stump ends of Ambystoma mexicanum (axolotls) 14 days post-amputation were sequenced and compared using Solexa DNA sequencing. Results The sequencing done for this study produced 40,688,892 reads that were assembled into 307,345 transcribed sequences. The N50 of transcribed sequence length was 562 bases. A similarity search with known proteins identified 39,200 different genes to be expressed during limb regeneration with a cut-off E-value exceeding 10^-5. We annotated assembled sequences by using gene descriptions, gene ontology, and clusters of orthologous group terms. Targeted searches using these annotations showed that the majority of the genes were in the categories of essential metabolic pathways, transcription factors and conserved signaling pathways, and novel candidate genes for regenerative processes. We discovered and confirmed numerous sequences of the candidate genes by using quantitative polymerase chain reaction and in situ hybridization. Conclusion The results of this study demonstrate that de novo transcriptome sequencing allows gene expression analysis in a species lacking genome information and provides the most comprehensive mRNA sequence resources for axolotls. The characterization of the axolotl transcriptome can help elucidate the molecular mechanisms underlying blastema formation during limb regeneration. PMID:23815514
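The N50 statistic quoted above is the length of the shortest sequence in the smallest set of longest sequences that together cover at least half of the total assembled bases. A minimal sketch of the usual calculation follows; the function name and the toy lengths are illustrative and are not taken from the study.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths."""
    if not lengths:
        raise ValueError("n50 requires at least one sequence length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:  # reached at least half of all bases
            return length

example = [2, 3, 4, 5, 6]
print(n50(example))
```

For the toy lengths the total is 20 bases. The two longest sequences (6 and 5) already cover 11 bases, so the N50 is 5.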
Novel mutation in the replication focus targeting sequence domain of DNMT1 causes hereditary sensory and autonomic neuropathy IE.

PubMed

Yuan, Junhui; Higuchi, Yujiro; Nagado, Tatsui; Nozuma, Satoshi; Nakamura, Tomonori; Matsuura, Eiji; Hashiguchi, Akihiro; Sakiyama, Yusuke; Yoshimura, Akiko; Takashima, Hiroshi

2013-03-01

DNMT1, encoding DNA methyltransferase 1 (Dnmt1), is a critical enzyme which is mainly responsible for conversion of unmethylated DNA into hemimethylated DNA. To date, two phenotypes produced by DNMT1 mutations have been reported, including hereditary sensory and autonomic neuropathy (HSAN) type IE with mutations in exon 20, and autosomal dominant cerebellar ataxia, deafness, and narcolepsy caused by mutations in exon 21. We report a sporadic case in a Japanese patient with loss of pain and vibration sense, chronic osteomyelitis, autonomic system dysfunctions, hearing loss, and mild dementia, but without definite cerebellar ataxia. Electrophysiological studies revealed absent sensory nerve action potential with nearly normal motor nerve conduction studies. Brain magnetic resonance imaging revealed mild diffuse cerebral and cerebellar atrophy. Using a next-generation sequencing system, 16 candidate genes were analyzed and a novel missense mutation, c.1706A>G (p.His569Arg), was identified in exon 21 of DNMT1. Our findings suggest that mutation in exon 21 of DNMT1 may also produce a HSAN phenotype. Because all reported mutations of DNMT1 are concentrated in exons 20 and 21, which encode the replication focus targeting sequence (RFTS) domain of Dnmt1, the RFTS domain could be a mutation hot spot. © 2013 Peripheral Nerve Society.
Molecular cloning and characterization of novel phytocystatin gene from turmeric, Curcuma longa.

PubMed

Chan, Seow-Neng; Abu Bakar, Norliza; Mahmood, Maziah; Ho, Chai-Ling; Shaharuddin, Noor Azmi

2014-01-01

Phytocystatin, a type of protease inhibitor (PI), plays major roles in plant defense mechanisms and has been reported to show antipathogenic properties and plant stress tolerance. Recombinant plant PIs are gaining popularity as potential candidates in engineering of crop protection and in synthesizing medicine. It is therefore crucial to identify PI from novel sources like Curcuma longa as it is more effective in combating against pathogens due to its novelty. In this study, a novel cDNA fragment encoding phytocystatin was isolated using degenerate PCR primers, designed from consensus regions of phytocystatin from other plant species. A full-length cDNA of the phytocystatin gene, designated CypCl, was acquired using 5'/3' rapid amplification of cDNA ends method and it has been deposited in NCBI database (accession number KF545954.1). It has a 687 bp long open reading frame (ORF) which encodes 228 amino acids. BLAST result indicated that CypCl is similar to cystatin protease inhibitor from Cucumis sativus with 74% max identity. Sequence analysis showed that CypCl contains most of the motifs found in a cystatin, including a G residue, LARFAV-, QxVxG sequence, PW dipeptide, and SNSL sequence at C-terminal extension. Phylogenetic studies also showed that CypCl is related to phytocystatin from Elaeis guineensis.
Molecular Cloning and Characterization of Novel Phytocystatin Gene from Turmeric, Curcuma longa

PubMed Central

Chan, Seow-Neng; Abu Bakar, Norliza; Mahmood, Maziah; Ho, Chai-Ling

2014-01-01

Phytocystatin, a type of protease inhibitor (PI), plays major roles in plant defense mechanisms and has been reported to show antipathogenic properties and plant stress tolerance. Recombinant plant PIs are gaining popularity as potential candidates in engineering of crop protection and in synthesizing medicine. It is therefore crucial to identify PI from novel sources like Curcuma longa as it is more effective in combating against pathogens due to its novelty. In this study, a novel cDNA fragment encoding phytocystatin was isolated using degenerate PCR primers, designed from consensus regions of phytocystatin from other plant species. A full-length cDNA of the phytocystatin gene, designated CypCl, was acquired using 5′/3′ rapid amplification of cDNA ends method and it has been deposited in NCBI database (accession number KF545954.1). It has a 687 bp long open reading frame (ORF) which encodes 228 amino acids. BLAST result indicated that CypCl is similar to cystatin protease inhibitor from Cucumis sativus with 74% max identity. Sequence analysis showed that CypCl contains most of the motifs found in a cystatin, including a G residue, LARFAV-, QxVxG sequence, PW dipeptide, and SNSL sequence at C-terminal extension. Phylogenetic studies also showed that CypCl is related to phytocystatin from Elaeis guineensis. PMID:25853138
The meta-epigenomic structure of purified human stem cell populations is defined at cis-regulatory sequences

PubMed Central

Zhao, Yong Mei; Golden, Aaron; Mar, Jessica C.; Einstein, Francine H.; Greally, John M.

2014-01-01

The mechanism and significance of epigenetic variability in the same cell type between healthy individuals are not clear. Here, we purify human CD34+ hematopoietic stem and progenitor cells (HSPCs) from different individuals and find that there is increased variability of DNA methylation at loci with properties of promoters and enhancers. The variability is especially enriched at candidate enhancers near genes transitioning between silent and expressed states, and encoding proteins with leukocyte differentiation properties. Our findings of increased variability at loci with intermediate DNA methylation values, at candidate “poised” enhancers, and at genes involved in HSPC lineage commitment suggest that CD34+ cell subtype heterogeneity between individuals is a major mechanism for the variability observed. Epigenomic studies performed on cell populations, even when purified, are testing collections of epigenomes, or meta-epigenomes. Our findings show that meta-epigenomic approaches to data analysis can provide insights into cell subpopulation structure. PMID:25327398

Differences in Brain Transcriptomes of Closely Related Baikal Coregonid Species

PubMed Central

Bychenko, Oksana S.; Sukhanova, Lyubov V.; Azhikina, Tatyana L.; Skvortsov, Timofey A.; Belomestnykh, Tuyana V.; Sverdlov, Eugene D.

2014-01-01

The aim of this work was to get deeper insight into genetic factors involved in the adaptive divergence of closely related species, specifically two representatives of Baikal coregonids—Baikal whitefish (Coregonus baicalensis Dybowski) and Baikal omul (Coregonus migratorius Georgi)—that diverged from a common ancestor as recently as 10–20 thousand years ago. Using the Serial Analysis of Gene Expression method, we obtained libraries of short representative cDNA sequences (tags) from the brains of Baikal whitefish and omul. A comparative analysis of the libraries revealed quantitative differences among ~4% tags of the fishes under study. Based on the similarity of these tags with cDNA of known organisms, we identified candidate genes taking part in adaptive divergence. The most important candidate genes related to the adaptation of Baikal whitefish and Baikal omul, identified in this work, belong to the genes of cell metabolism, nervous and immune systems, protein synthesis, and regulatory genes as well as to DTSsa4 Tc1-like transposons which are widespread among fishes. PMID:24719892
Isolation and expression analysis of cDNAs that are associated with alternate bearing in Olea europaea L. cv. Ayvalık

PubMed Central

2013-01-01

Background Olive cDNA libraries were constructed and analyzed to isolate candidate genes that can help elucidate the molecular mechanism of periodicity and/or fruit production. For this purpose, cDNA libraries from the leaves of trees in “on year” and in “off year” in July (when fruits start to appear) and in November (harvest time) were constructed. One hundred randomly selected positive clones from each library were analyzed with respect to sequence and size. A fruit-flesh cDNA library was also constructed and characterized to confirm the reliability of each library’s temporal and spatial properties. Results Quantitative real-time RT-PCR (qRT-PCR) analyses of the cDNA libraries confirmed cDNA molecules that are associated with different developmental stages (e.g. “on year” leaves in July, “off year” leaves in July, leaves in November) and fruits. Hence, a number of candidate cDNAs associated with “on year” and “off year” were isolated. Comparison of the detected cDNAs to the current EST database of GenBank along with other non-redundant databases of NCBI revealed homologs of previously described genes along with several unknown cDNAs. Of around 500 screened cDNAs, 48 cDNA elements were obtained after eliminating ribosomal RNA sequences. These independent transcripts were analyzed using BLAST searches (cutoff E-value of 1.0E-5) against the KEGG and GenBank nucleotide databases and 37 putative transcripts corresponding to known gene functions were annotated with gene names and Gene Ontology (GO) terms. Transcripts in the biological process category were found to be related to metabolic process (27%), cellular process (23%), response to stimulus (17%), localization process (8.5%), multicellular organismal process (6.25%), developmental process (6.25%) and reproduction (4.2%). Conclusions A putative P450 monooxygenase was expressed fivefold more in “on year” leaves than in “off year” leaves in July. Two putative dehydrins were expressed significantly more in “on year” leaves than in “off year” leaves in November. Homologs of UDP-glucose epimerase, acyl-CoA binding protein, triose phosphate isomerase and a putative nuclear core anchor protein were significant in fruits only, while a homolog of an embryo binding protein/small GTPase regulator was detected in “on year” leaves only. One of the two unknown cDNAs was specific to leaves in July while the other was detected in all of the libraries except fruits. KEGG pathway analyses for the obtained sequences correlated with essential metabolic pathways such as galactose metabolism, amino sugar and nucleotide sugar metabolism, and photosynthesis. Detailed analysis of the results presents candidate cDNAs that can be used to dissect further the genetic basis of fruit production and/or alternate bearing, which causes significant economic loss for olive growers. PMID:23552171
Isolation and Characterization of Burkholderia rinojensis sp. nov., a Non-Burkholderia cepacia Complex Soil Bacterium with Insecticidal and Miticidal Activities

PubMed Central

Fernandez, Lorena E.; Koivunen, Marja; Yang, April; Flor-Weiler, Lina; Marrone, Pamela G.

2013-01-01

Isolate A396, a bacterium isolated from a Japanese soil sample demonstrated strong insecticidal and miticidal activities in laboratory bioassays. The isolate was characterized through biochemical methods, fatty acid methyl ester (FAME) analysis, sequencing of 16S rRNA, multilocus sequence typing and analysis, and DNA-DNA hybridization. FAME analysis matched A396 to Burkholderia cenocepacia, but this result was not confirmed by 16S rRNA or DNA-DNA hybridization. 16S rRNA sequencing indicated closest matches with B. glumae and B. plantarii. DNA-DNA hybridization experiments with B. plantarii, B. glumae, B. multivorans, and B. cenocepacia confirmed the low genetic similarity (11.5 to 37.4%) with known members of the genus. PCR-based screening showed that A396 lacks markers associated with members of the B. cepacia complex. Bioassay results indicated two mechanisms of action: through ingestion and contact. The isolate effectively controlled beet armyworms (Spodoptera exigua; BAW) and two-spotted spider mites (Tetranychus urticae; TSSM). In diet overlay bioassays with BAW, 1% to 4% (vol/vol) dilution of the whole-cell broth caused 97% to 100% mortality 4 days postexposure, and leaf disc treatment bioassays attained 75% ± 22% mortality 3 days postexposure. Contact bioassays led to 50% larval mortality, as well as discoloration, stunting, and failure to molt. TSSM mortality reached 93% in treated leaf discs. Activity was maintained in cell-free supernatants and after heat treatment (60°C for 2 h), indicating that a secondary metabolite or excreted thermostable enzyme might be responsible for the activity. Based on these results, we describe the novel species Burkholderia rinojensis, a good candidate for the development of a biocontrol product against insect and mite pests. PMID:24096416
Exome-wide DNA capture and next generation sequencing in domestic and wild species.

PubMed

Cosart, Ted; Beja-Pereira, Albano; Chen, Shanyuan; Ng, Sarah B; Shendure, Jay; Luikart, Gordon

2011-07-05

Gene-targeted and genome-wide markers are crucial to advance evolutionary biology, agriculture, and biodiversity conservation by improving our understanding of genetic processes underlying adaptation and speciation. Unfortunately, for eukaryotic species with large genomes it remains costly to obtain genome sequences and to develop genome resources such as genome-wide SNPs. A method is needed to allow gene-targeted, next-generation sequencing that is flexible enough to include any gene or number of genes, unlike transcriptome sequencing. Such a method would allow sequencing of many individuals, avoiding ascertainment bias in subsequent population genetic analyses. We demonstrate the usefulness of a recent technology, exon capture, for genome-wide, gene-targeted marker discovery in species with no genome resources. We use coding gene sequences from the domestic cow genome sequence (Bos taurus) to capture (enrich for), and subsequently sequence, thousands of exons of B. taurus, B. indicus, and Bison bison (wild bison). Our capture array has probes for 16,131 exons in 2,570 genes, including 203 candidate genes with known function and of interest for their association with disease and other fitness traits. We successfully sequenced and mapped exon sequences from across the 29 autosomes and X chromosome in the B. taurus genome sequence. Exon capture and high-throughput sequencing identified thousands of putative SNPs spread evenly across all reference chromosomes, in all three individuals, including hundreds of SNPs in our targeted candidate genes. This study shows exon capture can be customized for SNP discovery in many individuals and for non-model species without genomic resources. Our captured exome subset was small enough for affordable next-generation sequencing, and successfully captured exons from a divergent wild species using the domestic cow genome as reference.
A large homozygous deletion in the SAMHD1 gene causes atypical Aicardi–Goutiéres syndrome associated with mtDNA deletions

PubMed Central

Leshinsky-Silver, Esther; Malinger, Gustavo; Ben-Sira, Liat; Kidron, Dvora; Cohen, Sarit; Inbar, Shani; Bezaleli, Tali; Levine, Arie; Vinkler, Chana; Lev, Dorit; Lerman-Sagie, Tally

2011-01-01

Aicardi–Goutiéres syndrome (AGS) is a genetic neurodegenerative disorder with clinical symptoms mimicking a congenital viral infection. Five causative genes have been described: three prime repair exonuclease 1 (TREX1), ribonucleases H2A, B and C, and most recently SAM domain and HD domain 1 (SAMHD1). We performed a detailed clinical and molecular characterization of a family with an autosomal recessive neurodegenerative disorder showing white matter destruction and calcifications, presenting in utero and associated with multiple mtDNA deletions. A muscle biopsy was normal and did not show any evidence of respiratory chain dysfunction. Southern blot analysis of tissue from a living child and affected fetuses demonstrated multiple mtDNA deletions. Molecular analysis of genes involved in mtDNA synthesis and maintenance (POLGα, POLGβ, Twinkle, ANT1, TK2, SUCLA1 and DGOUK) revealed normal sequences. Sequencing of TREX1 and ribonucleases H2A, B and C failed to reveal any mutations. Whole-genome homozygosity mapping revealed a candidate region containing the SAMHD1 gene. Sequencing of the gene in the affected child and two affected fetuses revealed a large deletion (9 kb), spanning the promoter, exon 1 and intron 1. The parents were found to be heterozygous for this deletion. The identification of a homozygous large deletion in the SAMHD1 gene causing atypical AGS with multiple mtDNA deletions may add information regarding the involvement of mitochondria in self-activation of innate immunity by cell intrinsic components. PMID:21102625
Deciphering amphibian diversity through DNA barcoding: chances and challenges.

PubMed

Vences, Miguel; Thomas, Meike; Bonett, Ronald M; Vieites, David R

2005-10-29

Amphibians globally are in decline, yet there is still a tremendous amount of unrecognized diversity, calling for an acceleration of taxonomic exploration. This process will be greatly facilitated by a DNA barcoding system; however, the mitochondrial population structure of many amphibian species presents numerous challenges to such a standardized, single locus, approach. Here we analyse intra- and interspecific patterns of mitochondrial variation in two distantly related groups of amphibians, mantellid frogs and salamanders, to determine the promise of DNA barcoding with cytochrome oxidase subunit I (cox1) sequences in this taxon. High intraspecific cox1 divergences of 7-14% were observed (18% in one case) within the whole set of amphibian sequences analysed. These high values are not caused by particularly high substitution rates of this gene but by generally deep mitochondrial divergences within and among amphibian species. Despite these high divergences, cox1 sequences were able to correctly identify species including disparate geographic variants. The main problems with cox1 barcoding of amphibians are (i) the high variability of priming sites that hinder the application of universal primers to all species and (ii) the observed distinct overlap of intraspecific and interspecific divergence values, which implies difficulties in the definition of threshold values to identify candidate species. Common discordances between geographical signatures of mitochondrial and nuclear markers in amphibians indicate that a single-locus approach can be problematic when high accuracy of DNA barcoding is required. We suggest that a number of mitochondrial and nuclear genes may be used as DNA barcoding markers to complement cox1.
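
The overlap between intraspecific and interspecific divergence described above can be illustrated with a small sketch. It uses uncorrected p-distances on pre-aligned sequences; the abstract does not say which distance measure the authors used, and the species labels and 10 bp sequences below are invented purely for illustration.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Uncorrected proportion of differing sites between two aligned sequences.

    Sites where either sequence has a gap or ambiguity code are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a.upper(), b.upper())
             if x in "ACGT" and y in "ACGT"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def divergences(seqs_by_species: dict) -> tuple:
    """Split all pairwise distances into within-species and between-species lists."""
    labelled = [(sp, s) for sp, seqs in seqs_by_species.items() for s in seqs]
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(labelled, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    return intra, inter


def has_barcoding_gap(intra: list, inter: list) -> bool:
    """True only if every within-species distance is below every between-species one."""
    return max(intra) < min(inter)


# Hypothetical toy alignment (not real cox1 data).
toy = {
    "species_A": ["ACGTACGTAC", "ACGTACGTAT"],
    "species_B": ["ACGTACGAAC", "TCGAACGAAG"],
}
intra, inter = divergences(toy)
print(has_barcoding_gap(intra, inter))
```

When the two distributions overlap, as in this toy case, no single threshold separates conspecific from heterospecific pairs, which is the difficulty the abstract raises for defining candidate species.
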
Transcriptionally active PCR for antigen identification and vaccine development: in vitro genome-wide screening and in vivo immunogenicity

PubMed Central

Regis, David P.; Dobaño, Carlota; Quiñones-Olson, Paola; Liang, Xiaowu; Graber, Norma L.; Stefaniak, Maureen E.; Campo, Joseph J.; Carucci, Daniel J.; Roth, David A.; He, Huaping; Felgner, Philip L.; Doolan, Denise L.

2009-01-01

We have evaluated a technology called Transcriptionally Active PCR (TAP) for high throughput identification and prioritization of novel target antigens from genomic sequence data using the Plasmodium parasite, the causative agent of malaria, as a model. First, we adapted the TAP technology for the highly AT-rich Plasmodium genome, using well-characterized P. falciparum and P. yoelii antigens and a small panel of uncharacterized open reading frames from the P. falciparum genome sequence database. We demonstrated that TAP fragments encoding six well-characterized P. falciparum antigens and five well-characterized P. yoelii antigens could be amplified in an equivalent manner from both plasmid DNA and genomic DNA templates, and that uncharacterized open reading frames could also be amplified from genomic DNA template. Second, we showed that the in vitro expression of the TAP fragments was equivalent or superior to that of supercoiled plasmid DNA encoding the same antigen. Third, we evaluated the in vivo immunogenicity of TAP fragments encoding a subset of the model P. falciparum and P. yoelii antigens. We found that antigen-specific antibody and cellular immune responses induced by the TAP fragments in mice were equivalent or superior to those induced by the corresponding plasmid DNA vaccines. Finally, we developed and demonstrated proof-of-principle for an in vitro humoral immunoscreening assay for down-selection of novel target antigens. These data support the potential of a TAP approach for rapid high throughput functional screening and identification of potential candidate vaccine antigens from genomic sequence data. PMID:18164079
Transcriptionally active PCR for antigen identification and vaccine development: in vitro genome-wide screening and in vivo immunogenicity.

PubMed

Regis, David P; Dobaño, Carlota; Quiñones-Olson, Paola; Liang, Xiaowu; Graber, Norma L; Stefaniak, Maureen E; Campo, Joseph J; Carucci, Daniel J; Roth, David A; He, Huaping; Felgner, Philip L; Doolan, Denise L

2008-03-01

We have evaluated a technology called transcriptionally active PCR (TAP) for high throughput identification and prioritization of novel target antigens from genomic sequence data using the Plasmodium parasite, the causative agent of malaria, as a model. First, we adapted the TAP technology for the highly AT-rich Plasmodium genome, using well-characterized P. falciparum and P. yoelii antigens and a small panel of uncharacterized open reading frames from the P. falciparum genome sequence database. We demonstrated that TAP fragments encoding six well-characterized P. falciparum antigens and five well-characterized P. yoelii antigens could be amplified in an equivalent manner from both plasmid DNA and genomic DNA templates, and that uncharacterized open reading frames could also be amplified from genomic DNA template. Second, we showed that the in vitro expression of the TAP fragments was equivalent or superior to that of supercoiled plasmid DNA encoding the same antigen. Third, we evaluated the in vivo immunogenicity of TAP fragments encoding a subset of the model P. falciparum and P. yoelii antigens. We found that antigen-specific antibody and cellular immune responses induced by the TAP fragments in mice were equivalent or superior to those induced by the corresponding plasmid DNA vaccines. Finally, we developed and demonstrated proof-of-principle for an in vitro humoral immunoscreening assay for down-selection of novel target antigens. These data support the potential of a TAP approach for rapid high throughput functional screening and identification of potential candidate vaccine antigens from genomic sequence data.
Evolutionary conservation of regulatory elements in vertebrate HOX gene clusters

DOE Office of Scientific and Technical Information (OSTI.GOV)

Santini, Simona; Boore, Jeffrey L.; Meyer, Axel

2003-12-31

Because functional regions are highly conserved, comparisons of DNA sequences among evolutionarily distantly related genomes permit their identification in noncoding DNA. Hox genes are optimal candidate sequences for comparative genome analyses, because they are extremely conserved in vertebrates and occur in clusters. We aligned (Pipmaker) the nucleotide sequences of HoxA clusters of tilapia, pufferfish, striped bass, zebrafish, horn shark, human and mouse (over 500 million years of evolutionary distance). We identified several highly conserved intergenic sequences, likely to be important in gene regulation. Only a few of these putative regulatory elements have been previously described as being involved in the regulation of Hox genes, while several others are new elements that might have regulatory functions. The majority of these newly identified putative regulatory elements contain short fragments that are almost completely conserved and are identical to known binding sites for regulatory proteins (Transfac). The conserved intergenic regions located between the most rostrally expressed genes in the developing embryo are longer and better retained through evolution. We document that presumed regulatory sequences are retained differentially in either Aα or Aβ clusters resulting from a genome duplication in the fish lineage. This observation supports both the hypothesis that the conserved elements are involved in gene regulation and the Duplication-Deletion-Complementation model.
Bayesian clustering of DNA sequences using Markov chains and a stochastic partition model.

PubMed

Jääskinen, Väinö; Parkkinen, Ville; Cheng, Lu; Corander, Jukka

2014-02-01

In many biological applications it is necessary to cluster DNA sequences into groups that represent underlying organismal units, such as named species or genera. In metagenomics this grouping needs typically to be achieved on the basis of relatively short sequences which contain different types of errors, making the use of a statistical modeling approach desirable. Here we introduce a novel method for this purpose by developing a stochastic partition model that clusters Markov chains of a given order. The model is based on a Dirichlet process prior and we use conjugate priors for the Markov chain parameters which enables an analytical expression for comparing the marginal likelihoods of any two partitions. To find a good candidate for the posterior mode in the partition space, we use a hybrid computational approach which combines the EM-algorithm with a greedy search. This is demonstrated to be faster and yield highly accurate results compared to earlier suggested clustering methods for the metagenomics application. Our model is fairly generic and could also be used for clustering of other types of sequence data for which Markov chains provide a reasonable way to compress information, as illustrated by experiments on shotgun sequence type data from an Escherichia coli strain.
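
The role of conjugate priors in comparing partitions can be sketched as follows. With a symmetric Dirichlet prior on each row of a Markov chain's transition matrix, the marginal likelihood of a cluster has a closed form in its transition counts, so two candidate partitions can be compared directly. This is a simplified illustration and not the authors' implementation: it omits the Dirichlet process prior over partitions and the EM/greedy search, ignores the first `order` symbols of each sequence, and the names and the value `alpha=0.5` are assumptions.

```python
from collections import Counter
from math import lgamma, log

ALPHABET = "ACGT"  # assumption: sequences contain only these four symbols


def transition_counts(seqs, order=1):
    """Count (context, next_symbol) pairs over all sequences in a cluster."""
    counts = Counter()
    for s in seqs:
        for i in range(order, len(s)):
            counts[(s[i - order:i], s[i])] += 1
    return counts


def log_marginal(seqs, order=1, alpha=0.5):
    """Log marginal likelihood of a cluster under a Dirichlet-multinomial model.

    Each context (row of the transition matrix) gets an independent
    symmetric Dirichlet(alpha) prior, integrated out analytically.
    """
    counts = transition_counts(seqs, order)
    total = 0.0
    for context in {c for c, _ in counts}:
        row = [counts[(context, a)] for a in ALPHABET]
        k = len(ALPHABET)
        total += lgamma(k * alpha) - lgamma(k * alpha + sum(row))
        total += sum(lgamma(alpha + n) - lgamma(alpha) for n in row)
    return total


def log_merge_ratio(cluster_a, cluster_b, order=1, alpha=0.5):
    """Positive values favour merging the two clusters into one."""
    merged = log_marginal(cluster_a + cluster_b, order, alpha)
    return merged - log_marginal(cluster_a, order, alpha) \
                  - log_marginal(cluster_b, order, alpha)


# Toy sequences (hypothetical): identical composition vs. different composition.
similar = log_merge_ratio(["AAAAAAAAAA"], ["AAAAAAAAAA"])
different = log_merge_ratio(["AAAAAAAAAA"], ["ACACACACAC"])
print(similar > 0, different < 0)
```

In this toy case the ratio favours merging the two identical sequences and keeping the dissimilar ones apart, which is the kind of pairwise partition comparison the analytical expression makes cheap.
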
An Exploration into Fern Genome Space.

PubMed

Wolf, Paul G; Sessa, Emily B; Marchant, Daniel Blaine; Li, Fay-Wei; Rothfels, Carl J; Sigel, Erin M; Gitzendanner, Matthew A; Visger, Clayton J; Banks, Jo Ann; Soltis, Douglas E; Soltis, Pamela S; Pryer, Kathleen M; Der, Joshua P

2015-08-26

Ferns are one of the few remaining major clades of land plants for which a complete genome sequence is lacking. Knowledge of genome space in ferns will enable broad-scale comparative analyses of land plant genes and genomes, provide insights into genome evolution across green plants, and shed light on genetic and genomic features that characterize ferns, such as their high chromosome numbers and large genome sizes. As part of an initial exploration into fern genome space, we used a whole genome shotgun sequencing approach to obtain low-density coverage (∼0.4X to 2X) for six fern species from the Polypodiales (Ceratopteris, Pteridium, Polypodium, Cystopteris), Cyatheales (Plagiogyria), and Gleicheniales (Dipteris). We explore these data to characterize the proportion of the nuclear genome represented by repetitive sequences (including DNA transposons, retrotransposons, ribosomal DNA, and simple repeats) and protein-coding genes, and to extract chloroplast and mitochondrial genome sequences. Such initial sweeps of fern genomes can provide information useful for selecting a promising candidate fern species for whole genome sequencing. We also describe variation of genomic traits across our sample and highlight some differences and similarities in repeat structure between ferns and seed plants. © The Author(s) 2015. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Studies on the interaction of a synthetic nitro-flavone derivative with DNA: A multi-spectroscopic and molecular docking approach.

PubMed

Mitra, A; Saikh, F; Das, J; Ghosh, S; Ghosh, R

2018-05-22

Interaction of a ligand with DNA is often the basis of drug action of many molecules. Flavones are important in this regard as their structural features confer on them the ability to bind to DNA. 2-(4-Nitrophenyl)-4H-chromen-4-one (4NCO) is an important biologically active synthetic flavone derivative. We are therefore interested in studying its interaction with DNA. Absorption spectroscopy studies included standard and reverse titration, effect of ionic strength on titration, determination of stoichiometry of binding and thermal denaturation. Spectrofluorimetry techniques included fluorimetric titration, quenching studies and fluorescence displacement assay. Assessment of relative viscosity and estimation of thermodynamic parameters from CD spectral studies were also undertaken. Furthermore, molecular docking analyses were also done with different short DNA sequences. The fluorescent flavone 4NCO reversibly interacted with DNA through partial intercalation as well as minor-groove binding. The binding constant and the number of binding sites were of the order 10⁴ M⁻¹ and 1, respectively. The binding stoichiometry with DNA was found to be 1:1. The interaction of 4NCO with DNA was hydrophobic in nature, and the process of binding was spontaneous, endothermic and entropy-driven. The flavone also showed a preference for binding to GC-rich sequences. The study presents a profile for structural and thermodynamic parameters, for the binding of 4NCO with DNA. DNA is an important target for ligands that are effective against cell proliferative disorders. In this regard, the molecule 4NCO is important since it can exert its biological activity through its DNA binding ability and can be a potential drug candidate. Copyright © 2018 Elsevier B.V. All rights reserved.
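
The link between the reported binding constant and the description of binding as spontaneous and entropy-driven follows from the standard relations ΔG = −RT ln K and ΔG = ΔH − TΔS. The sketch below assumes a temperature of 298.15 K and uses a made-up positive ΔH; neither value is given in the abstract.

```python
from math import log

R = 8.314462618  # gas constant, J/(mol*K)


def delta_g(k_binding: float, temperature: float = 298.15) -> float:
    """Standard free energy of binding in J/mol from a binding constant (M^-1)."""
    return -R * temperature * log(k_binding)


def t_delta_s(delta_h: float, dg: float) -> float:
    """Entropic term T*dS in J/mol, from dG = dH - T*dS."""
    return delta_h - dg


dg = delta_g(1e4)            # K of the order 10^4 M^-1, as reported
dh = 10_000.0                # hypothetical endothermic dH (J/mol), for illustration
print(round(dg / 1000, 1))   # about -22.8 kJ/mol at the assumed temperature
print(t_delta_s(dh, dg) > dh)
```

A negative ΔG together with a positive ΔH forces TΔS to exceed ΔH, which is what "endothermic and entropy-driven" means in thermodynamic terms.
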
Next-generation DNA sequencing identifies novel gene variants and pathways involved in specific language impairment.

PubMed

Chen, Xiaowei Sylvia; Reader, Rose H; Hoischen, Alexander; Veltman, Joris A; Simpson, Nuala H; Francks, Clyde; Newbury, Dianne F; Fisher, Simon E

2017-04-25

A significant proportion of children have unexplained problems acquiring proficient linguistic skills despite adequate intelligence and opportunity. Developmental language disorders are highly heritable with substantial societal impact. Molecular studies have begun to identify candidate loci, but much of the underlying genetic architecture remains undetermined. We performed whole-exome sequencing of 43 unrelated probands affected by severe specific language impairment, followed by independent validations with Sanger sequencing, and analyses of segregation patterns in parents and siblings, to shed new light on aetiology. By first focusing on a pre-defined set of known candidates from the literature, we identified potentially pathogenic variants in genes already implicated in diverse language-related syndromes, including ERC1, GRIN2A, and SRPX2. Complementary analyses suggested novel putative candidates carrying validated variants which were predicted to have functional effects, such as OXR1, SCN9A and KMT2D. We also searched for potential "multiple-hit" cases; one proband carried a rare AUTS2 variant in combination with a rare inherited haplotype affecting STARD9, while another carried a novel nonsynonymous variant in SEMA6D together with a rare stop-gain in SYNPR. On broadening scope to all rare and novel variants throughout the exomes, we identified biological themes that were enriched for such variants, including microtubule transport and cytoskeletal regulation.
Next-generation DNA sequencing identifies novel gene variants and pathways involved in specific language impairment

PubMed Central

Chen, Xiaowei Sylvia; Reader, Rose H.; Hoischen, Alexander; Veltman, Joris A.; Simpson, Nuala H.; Francks, Clyde; Newbury, Dianne F.; Fisher, Simon E.

2017-01-01

A significant proportion of children have unexplained problems acquiring proficient linguistic skills despite adequate intelligence and opportunity. Developmental language disorders are highly heritable with substantial societal impact. Molecular studies have begun to identify candidate loci, but much of the underlying genetic architecture remains undetermined. We performed whole-exome sequencing of 43 unrelated probands affected by severe specific language impairment, followed by independent validations with Sanger sequencing, and analyses of segregation patterns in parents and siblings, to shed new light on aetiology. By first focusing on a pre-defined set of known candidates from the literature, we identified potentially pathogenic variants in genes already implicated in diverse language-related syndromes, including ERC1, GRIN2A, and SRPX2. Complementary analyses suggested novel putative candidates carrying validated variants which were predicted to have functional effects, such as OXR1, SCN9A and KMT2D. We also searched for potential “multiple-hit” cases; one proband carried a rare AUTS2 variant in combination with a rare inherited haplotype affecting STARD9, while another carried a novel nonsynonymous variant in SEMA6D together with a rare stop-gain in SYNPR. On broadening scope to all rare and novel variants throughout the exomes, we identified biological themes that were enriched for such variants, including microtubule transport and cytoskeletal regulation. PMID:28440294
Analysis of complex repeat sequences within the spinal muscular atrophy (SMA) candidate region in 5q13

DOE Office of Scientific and Technical Information (OSTI.GOV)

Davies, K.E.; Morrison, K.E.; Daniels, R.I.

1994-09-01

We previously reported that the 400 kb interval flanked by the polymorphic loci D5S435 and D5S557 contains blocks of a chromosome 5 specific repeat. This interval also defines the SMA candidate region by genetic analysis of recombinant families. A YAC contig of 2-3 Mb encompassing this area has been constructed and a 5.5 kb conserved fragment, isolated from a YAC end clone within the above interval, was used to obtain cDNAs from both fetal and adult brain libraries. We describe the identification of cDNAs with stretches of high DNA sequence homology to exons of β-glucuronidase on human chromosome 7. The cDNAs map both to the candidate region and to an area of 5p using FISH and deletion hybrid analysis. Hybridization to bacteriophage and cosmid clones from the YACs localizes the β-glucuronidase related sequences within the 400 kb region of the YAC contig. The cDNAs show a polymorphic pattern on hybridization to genomic BamHI fragments in the size range of 10-250 kb. YAC fragmentation vectors are being used to determine how these β-glucuronidase related cDNAs are distributed within 5q13. Dinucleotide repeats within the region are being investigated to determine linkage disequilibrium with the disease locus.
The genetic basis of adaptive pigmentation variation in Drosophila melanogaster

PubMed Central

Pool, John E.; Aquadro, Charles F.

2009-01-01

In a broad survey of Drosophila melanogaster population samples, levels of abdominal pigmentation were found to be highly variable and geographically differentiated. A strong positive correlation was found between dark pigmentation and high altitude, suggesting adaptation to specific environments. DNA sequence polymorphism at the candidate gene ebony revealed a clear association with the pigmentation of homozygous third chromosome lines. The darkest lines sequenced had nearly identical haplotypes spanning 14.5 kilobases upstream of the protein-coding exons of ebony. Thus, natural selection may have elevated the frequency of an allele that confers dark abdominal pigmentation by influencing the regulation of ebony. PMID:17614900
Characterization and Amplification of Gene-Based Simple Sequence Repeat (SSR) Markers in Date Palm.

PubMed

Zhao, Yongli; Keremane, Manjunath; Prakash, Channapatna S; He, Guohao

2017-01-01

The paucity of molecular markers limits the application of genetic and genomic research in date palm (Phoenix dactylifera L.). Availability of expressed sequence tag (EST) sequences in date palm may provide a good resource for developing gene-based markers. This study characterizes a substantial fraction of transcriptome sequences containing simple sequence repeats (SSRs) from the EST sequences in date palm. The EST sequences studied are mainly homologous to those of Elaeis guineensis and Musa acuminata. A total of 911 gene-based SSR markers, characterized with functional annotations, have provided a useful basis not only for discovering candidate genes and understanding genetic basis of traits of interest but also for developing genetic and genomic tools for molecular research in date palm, such as diversity study, quantitative trait locus (QTL) mapping, and molecular breeding. The procedures of DNA extraction, polymerase chain reaction (PCR) amplification of these gene-based SSR markers, and gel electrophoresis of PCR products are described in this chapter.
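
Locating SSRs in EST sequences, the first step toward the markers described above, can be sketched with a regular expression. The thresholds used here (motifs of 2 to 6 bp, at least five tandem copies) are illustrative assumptions and not the criteria used in the chapter; the example sequence is invented.

```python
import re


def find_ssrs(seq: str, min_repeats: int = 5):
    """Return (start, motif, copies) for perfect tandem repeats in a DNA sequence.

    Assumptions: motif length 2-6 bp, at least min_repeats tandem copies,
    and homopolymer runs (e.g. AAAA...) are not reported.
    """
    # Lazy {2,6}? tries the shortest motif first, so AT x 10 is reported
    # as motif "AT" rather than "ATAT".
    pattern = re.compile(r"([ACGT]{2,6}?)\1{%d,}" % (min_repeats - 1))
    hits = []
    for m in pattern.finditer(seq.upper()):
        motif = m.group(1)
        if len(set(motif)) == 1:  # skip homopolymers
            continue
        hits.append((m.start(), motif, len(m.group(0)) // len(motif)))
    return hits


# Hypothetical EST fragment containing (AT)5 and (GA)6.
est = "GGCATATATATATCCGAGAGAGAGAGAGTT"
print(find_ssrs(est))
```

Primers for PCR amplification would then be designed in the flanking sequence on either side of each reported repeat; that step is outside this sketch.
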
Application of whole genome re-sequencing data in the development of diagnostic DNA markers tightly linked to a disease-resistance locus for marker-assisted selection in lupin (Lupinus angustifolius).

PubMed

Yang, Huaan; Jian, Jianbo; Li, Xuan; Renshaw, Daniel; Clements, Jonathan; Sweetingham, Mark W; Tan, Cong; Li, Chengdao

2015-09-02

Molecular marker-assisted breeding provides an efficient tool to develop improved crop varieties. A major challenge for the broad application of markers in marker-assisted selection is that the marker phenotypes must match plant phenotypes in a wide range of breeding germplasm. In this study, we used the legume crop species Lupinus angustifolius (lupin) to demonstrate the utility of whole genome sequencing and re-sequencing in the development of diagnostic markers for molecular plant breeding. Nine lupin cultivars released in Australia from 1973 to 2007 were subjected to whole genome re-sequencing. The re-sequencing data together with the reference genome sequence data were used in marker development, which revealed 180,596 to 795,735 SNP markers from pairwise comparisons among the cultivars. A total of 207,887 markers were anchored on the lupin genetic linkage map. Marker mining obtained an average of 387 SNP markers and 87 InDel markers for each of the 24 genome sequence assembly scaffolds bearing markers linked to 11 genes of agronomic interest. Using the R gene PhtjR conferring resistance to phomopsis stem blight disease as a test case, we discovered 17 candidate diagnostic markers by genotyping and selecting markers on a genetic linkage map. A further 243 candidate diagnostic markers were discovered by marker mining on a scaffold bearing non-diagnostic markers linked to the PhtjR gene. Nine of the ten tested candidate diagnostic markers were confirmed as truly diagnostic on a broad range of commercial cultivars. Markers developed using these strategies meet the requirements for broad application in molecular plant breeding. We demonstrated that low-cost genome sequencing and re-sequencing data were sufficient and very effective in the development of diagnostic markers for marker-assisted selection. The strategies used in this study may be applied to any trait or plant species. Whole genome sequencing and re-sequencing provides a powerful tool to overcome current limitations in molecular plant breeding, which will enable plant breeders to precisely pyramid favourable genes to develop super crop varieties to meet future food demands.
Feasibility study of molecular memory device based on DNA using methylation to store information

DOE Office of Scientific and Technical Information (OSTI.GOV)

Jiang, Liming; Al-Dirini, Feras; Center for Neural Engineering

DNA, because of its robustness and dense information storage capability, has been proposed as a potential candidate for next-generation storage media. However, encoding information into the DNA sequence requires molecular synthesis technology, which to date is costly and prone to synthesis errors. Reading the DNA strand information is also complex. Ideally, DNA storage will provide methods for modifying stored information. Here, we conduct a feasibility study investigating the use of the DNA 5-methylcytosine (5mC) methylation state as a molecular memory to store information. We propose a new 1-bit memory device and study, based on the density functional theory and non-equilibrium Green's function method, the feasibility of electrically reading the information. Our results show that changes to methylation states lead to changes in the peak of negative differential resistance which can be used to interrogate memory state. Our work demonstrates a new memory concept based on methylation state which can be beneficial in the design of next generation DNA based molecular electronic memory devices.
Host-associated bacterial taxa from Chlorobi, Chloroflexi, GN02, Synergistetes, SR1, TM7, and WPS-2 Phyla/candidate divisions

PubMed Central

Camanocha, Anuj; Dewhirst, Floyd E.

2014-01-01

Background and objective In addition to the well-known phyla Firmicutes, Proteobacteria, Bacteroidetes, Actinobacteria, Spirochaetes, Fusobacteria, Tenericutes, and Chlamydiae, the oral microbiomes of mammals contain species from the lesser-known phyla or candidate divisions, including Synergistetes, TM7, Chlorobi, Chloroflexi, GN02, SR1, and WPS-2. The objectives of this study were to create phyla-selective 16S rDNA PCR primer pairs, create selective 16S rDNA clone libraries, identify novel oral taxa, and update canine and human oral microbiome databases. Design 16S rRNA gene sequences for members of the lesser-known phyla were downloaded from GenBank and Greengenes databases and aligned with sequences in our RNA databases. Primers with potential phylum level selectivity were designed heuristically with the goal of producing nearly full-length 16S rDNA amplicons. The specificity of primer pairs was examined by making clone libraries from PCR amplicons and determining phyla identity by BLASTN analysis. Results Phylum-selective primer pairs were identified that allowed construction of clone libraries with 96–100% specificity for each of the lesser-known phyla. From these clone libraries, seven human and two canine novel oral taxa were identified and added to their respective taxonomic databases. For each phylum, genome sequences closest to human oral taxa were identified and added to the Human Oral Microbiome Database to facilitate metagenomic, transcriptomic, and proteomic studies that involve tiling sequences to the most closely related taxon. While examining ribosomal operons in lesser-known phyla from single-cell genomes and metagenomes, we identified a novel rRNA operon order (23S-5S-16S) in three SR1 genomes and the splitting of the 23S rRNA gene by an I-CeuI-like homing endonuclease in a WPS-2 genome. Conclusions This study developed useful primer pairs for making phylum-selective 16S rRNA clone libraries. 
Phylum-specific libraries were shown to be useful for identifying previously unrecognized taxa in lesser-known phyla and would be useful for future environmental and host-associated studies. PMID:25317252

A remark on copy number variation detection methods.

PubMed

Li, Shuo; Dou, Xialiang; Gao, Ruiqi; Ge, Xinzhou; Qian, Minping; Wan, Lin

2018-01-01

Copy number variations (CNVs) are gain and loss of DNA sequence of a genome. High throughput platforms such as microarrays and next generation sequencing technologies (NGS) have been applied for genome wide copy number losses. Although progress has been made in both approaches, the accuracy and consistency of CNV calling from the two platforms remain in dispute. In this study, we perform a deep analysis on copy number losses on 254 human DNA samples, which have both SNP microarray data and NGS data publicly available from Hapmap Project and 1000 Genomes Project respectively. We show that the copy number losses reported from Hapmap Project and 1000 Genome Project only have < 30% overlap, while these reports are required to have cross-platform (e.g. PCR, microarray and high-throughput sequencing) experimental supporting by their corresponding projects, even though state-of-art calling methods were employed. On the other hand, copy number losses are found directly from HapMap microarray data by an accurate algorithm, i.e. CNVhac, almost all of which have lower read mapping depth in NGS data; furthermore, 88% of which can be supported by the sequences with breakpoint in NGS data. Our results suggest the ability of microarray calling CNVs and the possible introduction of false negatives from the unessential requirement of the additional cross-platform supporting. The inconsistency of CNV reports from Hapmap Project and 1000 Genomes Project might result from the inadequate information containing in microarray data, the inconsistent detection criteria, or the filtration effect of cross-platform supporting. The statistical test on CNVs called from CNVhac show that the microarray data can offer reliable CNV reports, and majority of CNV candidates can be confirmed by raw sequences. 
Therefore, the CNV candidates given by a good caller could be highly reliable without cross-platform supporting, so additional experimental information should be applied in need instead of necessarily.
A Short Interspersed Nuclear Element (SINE)-Based Real-Time PCR Approach to Detect and Quantify Porcine Component in Meat Products.

PubMed

Zhang, Chi; Fang, Xin; Qiu, Haopu; Li, Ning

2015-01-01

Real-time PCR amplification of mitochondria gene could not be used for DNA quantification, and that of single copy DNA did not allow an ideal sensitivity. Moreover, cross-reactions among similar species were commonly observed in the published methods amplifying repetitive sequence, which hindered their further application. The purpose of this study was to establish a short interspersed nuclear element (SINE)-based real-time PCR approach having high specificity for species detection that could be used in DNA quantification. After massive screening of candidate Sus scrofa SINEs, one optimal combination of primers and probe was selected, which had no cross-reaction with other common meat species. LOD of the method was 44 fg DNA/reaction. Further, quantification tests showed this approach was practical in DNA estimation without tissue variance. Thus, this study provided a new tool for qualitative detection of porcine component, which could be promising in the QC of meat products.
Yellow lupin (Lupinus luteus L.) transcriptome sequencing: molecular marker development and comparative studies

PubMed Central

2012-01-01

Background Yellow lupin (Lupinus luteus L.) is a minor legume crop characterized by its high seed protein content. Although grown in several temperate countries, its orphan condition has limited the generation of genomic tools to aid breeding efforts to improve yield and nutritional quality. In this study, we report the construction of 454-expressed sequence tag (EST) libraries, carried out comparative studies between L. luteus and model legume species, developed a comprehensive set of EST-simple sequence repeat (SSR) markers, and validated their utility on diversity studies and transferability to related species. Results Two runs of 454 pyrosequencing yielded 205 Mb and 530 Mb of sequence data for L1 (young leaves, buds and flowers) and L2 (immature seeds) EST libraries. A combined assembly (L1L2) yielded 71,655 contigs with an average contig length of 632 nucleotides. L1L2 contigs were clustered into 55,309 isotigs. 38,200 isotigs translated into proteins and 8,741 of them were full length. Around 57% of L. luteus sequences had significant similarity with at least one sequence of Medicago, Lotus, Arabidopsis, or Glycine, and 40.17% showed positive matches with all of these species. L. luteus isotigs were also screened for the presence of SSR sequences. A total of 2,572 isotigs contained at least one EST-SSR, with a frequency of one SSR per 17.75 kbp. Empirical evaluation of the EST-SSR candidate markers resulted in 222 polymorphic EST-SSRs. Two hundred and fifty four (65.7%) and 113 (30%) SSR primer pairs were able to amplify fragments from L. hispanicus and L. mutabilis DNA, respectively. Fifty polymorphic EST-SSRs were used to genotype a sample of 64 L. luteus accessions. Neighbor-joining distance analysis detected the existence of several clusters among L. luteus accessions, strongly suggesting the existence of population subdivisions. However, no clear clustering patterns followed the accession’s origin. Conclusion L. luteus deep transcriptome sequencing will facilitate the further development of genomic tools and lupin germplasm. Massive sequencing of cDNA libraries will continue to produce raw materials for gene discovery, identification of polymorphisms (SNPs, EST-SSRs, INDELs, etc.) for marker development, anchoring sequences for genome comparisons and putative gene candidates for QTL detection. PMID:22920992
Yellow lupin (Lupinus luteus L.) transcriptome sequencing: molecular marker development and comparative studies.

PubMed

Parra-González, Lorena B; Aravena-Abarzúa, Gabriela A; Navarro-Navarro, Cristell S; Udall, Joshua; Maughan, Jeff; Peterson, Louis M; Salvo-Garrido, Haroldo E; Maureira-Butler, Iván J

2012-08-24

Yellow lupin (Lupinus luteus L.) is a minor legume crop characterized by its high seed protein content. Although grown in several temperate countries, its orphan condition has limited the generation of genomic tools to aid breeding efforts to improve yield and nutritional quality. In this study, we report the construction of 454-expressed sequence tag (EST) libraries, carried out comparative studies between L. luteus and model legume species, developed a comprehensive set of EST-simple sequence repeat (SSR) markers, and validated their utility on diversity studies and transferability to related species. Two runs of 454 pyrosequencing yielded 205 Mb and 530 Mb of sequence data for L1 (young leaves, buds and flowers) and L2 (immature seeds) EST libraries. A combined assembly (L1L2) yielded 71,655 contigs with an average contig length of 632 nucleotides. L1L2 contigs were clustered into 55,309 isotigs. 38,200 isotigs translated into proteins and 8,741 of them were full length. Around 57% of L. luteus sequences had significant similarity with at least one sequence of Medicago, Lotus, Arabidopsis, or Glycine, and 40.17% showed positive matches with all of these species. L. luteus isotigs were also screened for the presence of SSR sequences. A total of 2,572 isotigs contained at least one EST-SSR, with a frequency of one SSR per 17.75 kbp. Empirical evaluation of the EST-SSR candidate markers resulted in 222 polymorphic EST-SSRs. Two hundred and fifty four (65.7%) and 113 (30%) SSR primer pairs were able to amplify fragments from L. hispanicus and L. mutabilis DNA, respectively. Fifty polymorphic EST-SSRs were used to genotype a sample of 64 L. luteus accessions. Neighbor-joining distance analysis detected the existence of several clusters among L. luteus accessions, strongly suggesting the existence of population subdivisions. However, no clear clustering patterns followed the accession's origin. L. luteus deep transcriptome sequencing will facilitate the further development of genomic tools and lupin germplasm. Massive sequencing of cDNA libraries will continue to produce raw materials for gene discovery, identification of polymorphisms (SNPs, EST-SSRs, INDELs, etc.) for marker development, anchoring sequences for genome comparisons and putative gene candidates for QTL detection.
Applying plant DNA barcodes to identify species of Parnassia (Parnassiaceae).

PubMed

Yang, Jun-Bo; Wang, Yi-Ping; Möller, Michael; Gao, Lian-Ming; Wu, Ding

2012-03-01

DNA barcoding is a technique to identify species by using standardized DNA sequences. In this study, a total of 105 samples, representing 30 Parnassia species, were collected to test the effectiveness of four proposed DNA barcodes (rbcL, matK, trnH-psbA and ITS) for species identification. Our results demonstrated that all four candidate DNA markers have a maximum level of primer universality and sequencing success. As a single DNA marker, the ITS region provided the highest species resolution with 86.7%, followed by trnH-psbA with 73.3%. The combination of the core barcode regions, matK+rbcL, gave the lowest species identification success (63.3%) among any combination of multiple markers and was found unsuitable as a DNA barcode for Parnassia. The combination of ITS+trnH-psbA achieved the highest species discrimination with 90.0% resolution (27 of 30 sampled species), equal to the four-marker combination and higher than any two- or three-marker combination including rbcL or matK. Therefore, matK and rbcL should not be used as DNA barcodes for the species identification of Parnassia. Based on the overall performance, the combination of ITS+trnH-psbA is proposed as the most suitable DNA barcode for identifying Parnassia species. DNA barcoding is a useful technique and provides a reliable and effective means for the discrimination of Parnassia species, and in combination with morphology-based taxonomy, will be a robust approach for tackling taxonomically complex groups. In the light of our findings, we found among the three species not identified a possible cryptic speciation event in Parnassia. © 2011 Blackwell Publishing Ltd.
DNA sequence variation and selection of tag single-nucleotide polymorphisms at candidate genes for drought-stress response in Pinus taeda L.

PubMed

González-Martínez, Santiago C; Ersoz, Elhan; Brown, Garth R; Wheeler, Nicholas C; Neale, David B

2006-03-01

Genetic association studies are rapidly becoming the experimental approach of choice to dissect complex traits, including tolerance to drought stress, which is the most common cause of mortality and yield losses in forest trees. Optimization of association mapping requires knowledge of the patterns of nucleotide diversity and linkage disequilibrium and the selection of suitable polymorphisms for genotyping. Moreover, standard neutrality tests applied to DNA sequence variation data can be used to select candidate genes or amino acid sites that are putatively under selection for association mapping. In this article, we study the pattern of polymorphism of 18 candidate genes for drought-stress response in Pinus taeda L., an important tree crop. Data analyses based on a set of 21 putatively neutral nuclear microsatellites did not show population genetic structure or genomewide departures from neutrality. Candidate genes had moderate average nucleotide diversity at silent sites (pi(sil) = 0.00853), varying 100-fold among single genes. The level of within-gene LD was low, with an average pairwise r2 of 0.30, decaying rapidly from approximately 0.50 to approximately 0.20 at 800 bp. No apparent LD among genes was found. A selective sweep may have occurred at the early-response-to-drought-3 (erd3) gene, although population expansion can also explain our results and evidence for selection was not conclusive. One other gene, ccoaomt-1, a methylating enzyme involved in lignification, showed dimorphism (i.e., two highly divergent haplotype lineages at equal frequency), which is commonly associated with the long-term action of balancing selection. Finally, a set of haplotype-tagging SNPs (htSNPs) was selected. Using htSNPs, a reduction of genotyping effort of approximately 30-40%, while sampling most common allelic variants, can be gained in our ongoing association studies for drought tolerance in pine.
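The pairwise r2 reported above is the standard squared-correlation measure of linkage disequilibrium between two biallelic sites. As an illustration only, a minimal Python sketch of that formula follows; the function name, the 0/1 allele encoding, and the toy haplotypes are assumptions made for this example and are not taken from the study.

```python
def ld_r_squared(haplotypes):
    """Compute r^2 between two biallelic sites.

    `haplotypes` is a list of (a, b) tuples, where a and b are 0 or 1
    and give the allele carried at site 1 and site 2 on one haplotype.
    (Hypothetical encoding chosen for this sketch.)
    """
    n = len(haplotypes)
    p_a = sum(a for a, _ in haplotypes) / n        # frequency of allele 1 at site 1
    p_b = sum(b for _, b in haplotypes) / n        # frequency of allele 1 at site 2
    p_ab = sum(1 for a, b in haplotypes if a == 1 and b == 1) / n
    d = p_ab - p_a * p_b                           # disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("both sites must be polymorphic")
    return d * d / denom


# Toy data: perfectly associated sites, then fully independent sites.
linked = [(1, 1), (1, 1), (0, 0), (0, 0)]
unlinked = [(1, 1), (1, 0), (0, 1), (0, 0)]
print(ld_r_squared(linked), ld_r_squared(unlinked))
```

Averaging this quantity over all site pairs within a gene, and plotting it against the distance between sites, gives the kind of within-gene decay summary the abstract describes.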
Comparative Genomics Reveals Chd1 as a Determinant of Nucleosome Spacing in Vivo.

PubMed

Hughes, Amanda L; Rando, Oliver J

2015-07-14

Packaging of genomic DNA into nucleosomes is nearly universally conserved in eukaryotes, and many features of the nucleosome landscape are quite conserved. Nonetheless, quantitative aspects of nucleosome packaging differ between species because, for example, the average length of linker DNA between nucleosomes can differ significantly even between closely related species. We recently showed that the difference in nucleosome spacing between two Hemiascomycete species, Saccharomyces cerevisiae and Kluyveromyces lactis, is established by trans-acting factors rather than being encoded in cis in the DNA sequence. Here, we generated several S. cerevisiae strains in which endogenous copies of candidate nucleosome spacing factors are deleted and replaced with the orthologous factors from K. lactis. We find no change in nucleosome spacing in such strains in which H1 or Isw1 complexes are swapped. In contrast, the K. lactis gene encoding the ATP-dependent remodeler Chd1 was found to direct longer internucleosomal spacing in S. cerevisiae, establishing that this remodeler is partially responsible for the relatively long internucleosomal spacing observed in K. lactis. By analyzing several chimeric proteins, we find that sequence differences that contribute to the spacing activity of this remodeler are dispersed throughout the coding sequence, but that the strongest spacing effect is linked to the understudied N-terminal end of Chd1. Taken together, our data find a role for sequence evolution of a chromatin remodeler in establishing quantitative aspects of the chromatin landscape in a species-specific manner. Copyright © 2015 Hughes and Rando.
Population-based rare variant detection via pooled exome or custom hybridization capture with or without individual indexing.

PubMed

Ramos, Enrique; Levinson, Benjamin T; Chasnoff, Sara; Hughes, Andrew; Young, Andrew L; Thornton, Katherine; Li, Allie; Vallania, Francesco L M; Province, Michael; Druley, Todd E

2012-12-06

Rare genetic variation in the human population is a major source of pathophysiological variability and has been implicated in a host of complex phenotypes and diseases. Finding disease-related genes harboring disparate functional rare variants requires sequencing of many individuals across many genomic regions and comparing against unaffected cohorts. However, despite persistent declines in sequencing costs, population-based rare variant detection across large genomic target regions remains cost prohibitive for most investigators. In addition, DNA samples are often precious and hybridization methods typically require large amounts of input DNA. Pooled sample DNA sequencing is a cost and time-efficient strategy for surveying populations of individuals for rare variants. We set out to 1) create a scalable, multiplexing method for custom capture with or without individual DNA indexing that was amenable to low amounts of input DNA and 2) expand the functionality of the SPLINTER algorithm for calling substitutions, insertions and deletions across either candidate genes or the entire exome by integrating the variant calling algorithm with the dynamic programming aligner, Novoalign. We report methodology for pooled hybridization capture with pre-enrichment, indexed multiplexing of up to 48 individuals or non-indexed pooled sequencing of up to 92 individuals with as little as 70 ng of DNA per person. Modified solid phase reversible immobilization bead purification strategies enable no sample transfers from sonication in 96-well plates through adapter ligation, resulting in 50% less library preparation reagent consumption. Custom Y-shaped adapters containing novel 7 base pair index sequences with a Hamming distance of ≥2 were directly ligated onto fragmented source DNA eliminating the need for PCR to incorporate indexes, and was followed by a custom blocking strategy using a single oligonucleotide regardless of index sequence. 
These results were obtained aligning raw reads against the entire genome using Novoalign followed by variant calling of non-indexed pools using SPLINTER or SAMtools for indexed samples. With these pipelines, we find sensitivity and specificity of 99.4% and 99.7% for pooled exome sequencing. Sensitivity, and to a lesser degree specificity, proved to be a function of coverage. For rare variants (≤2% minor allele frequency), we achieved sensitivity and specificity of ≥94.9% and ≥99.99% for custom capture of 2.5 Mb in multiplexed libraries of 22-48 individuals with only ≥5-fold coverage/chromosome, but these parameters improved to ≥98.7 and 100% with 20-fold coverage/chromosome. This highly scalable methodology enables accurate rare variant detection, with or without individual DNA sample indexing, while reducing the amount of required source DNA and total costs through less hybridization reagent consumption, multi-sample sonication in a standard PCR plate, multiplexed pre-enrichment pooling with a single hybridization and lesser sequencing coverage required to obtain high sensitivity.
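The abstract above specifies 7 base pair index sequences with a pairwise Hamming distance of at least 2, so that a single sequencing error cannot turn one index into another. A minimal Python sketch of how such a constraint could be checked is shown below; the function names and the four example indexes are hypothetical and are not the indexes used in the study.

```python
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def min_pairwise_hamming(indexes):
    """Smallest Hamming distance over all pairs in an index set."""
    return min(hamming(a, b) for a, b in combinations(indexes, 2))


# Hypothetical 7-bp indexes, for illustration only.
example_indexes = ["ACGTACG", "ACGTTGG", "TTGCACG", "GGATCCA"]
print(min_pairwise_hamming(example_indexes) >= 2)
```

A set passes the design rule when the minimum pairwise distance is at least 2; any candidate index that lowers the minimum below that threshold would be rejected.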
EFEMP1 as a novel DNA methylation marker for prostate cancer: array-based DNA methylation and expression profiling.

PubMed

Kim, Yong-June; Yoon, Hyung-Yoon; Kim, Seon-Kyu; Kim, Young-Won; Kim, Eun-Jung; Kim, Isaac Yi; Kim, Wun-Jae

2011-07-01

Abnormal DNA methylation is associated with many human cancers. The aim of the present study was to identify novel methylation markers in prostate cancer (PCa) by microarray analysis and to test whether these markers could discriminate normal and PCa cells. Microarray-based DNA methylation and gene expression profiling was carried out using a panel of PCa cell lines and a control normal prostate cell line. The methylation status of candidate genes in prostate cell lines was confirmed by real-time reverse transcriptase-PCR, bisulfite sequencing analysis, and treatment with a demethylation agent. DNA methylation and gene expression analysis in 203 human prostate specimens, including 106 PCa and 97 benign prostate hyperplasia (BPH), were carried out. Further validation using microarray gene expression data from the Gene Expression Omnibus (GEO) was carried out. Epidermal growth factor-containing fibulin-like extracellular matrix protein 1 (EFEMP1) was identified as a lead candidate methylation marker for PCa. The gene expression level of EFEMP1 was significantly higher in tissue samples from patients with BPH than in those with PCa (P < 0.001). The sensitivity and specificity of EFEMP1 methylation status in discriminating between PCa and BPH reached 95.3% (101 of 106) and 86.6% (84 of 97), respectively. From the GEO data set, we confirmed that the expression level of EFEMP1 was significantly different between PCa and BPH. Genome-wide characterization of DNA methylation profiles enabled the identification of EFEMP1 aberrant methylation patterns in PCa. EFEMP1 might be a useful indicator for the detection of PCa.
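The sensitivity and specificity quoted above follow directly from the reported counts (101 of 106 PCa specimens and 84 of 97 BPH specimens). The short Python sketch below reproduces that arithmetic; the helper name is an assumption made for this example.

```python
def percent(hits, total):
    """Return hits/total as a percentage."""
    return 100.0 * hits / total


# Counts as reported in the abstract.
sensitivity = percent(101, 106)  # PCa specimens correctly classified
specificity = percent(84, 97)    # BPH specimens correctly classified

print(round(sensitivity, 1), round(specificity, 1))
```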
GateKeeper: a new hardware architecture for accelerating pre-alignment in DNA short read mapping.

PubMed

Alser, Mohammed; Hassan, Hasan; Xin, Hongyi; Ergin, Oguz; Mutlu, Onur; Alkan, Can

2017-11-01

High throughput DNA sequencing (HTS) technologies generate an excessive number of small DNA segments, called short reads, that cause significant computational burden. To analyze the entire genome, each of the billions of short reads must be mapped to a reference genome based on the similarity between a read and 'candidate' locations in that reference genome. The similarity measurement, called alignment, formulated as an approximate string matching problem, is the computational bottleneck because: (i) it is implemented using quadratic-time dynamic programming algorithms and (ii) the majority of candidate locations in the reference genome do not align with a given read due to high dissimilarity. Calculating the alignment of such incorrect candidate locations consumes an overwhelming majority of a modern read mapper's execution time. Therefore, it is crucial to develop a fast and effective filter that can detect incorrect candidate locations and eliminate them before invoking computationally costly alignment algorithms. We propose GateKeeper, a new hardware accelerator that functions as a pre-alignment step that quickly filters out most incorrect candidate locations. GateKeeper is the first design to accelerate pre-alignment using Field-Programmable Gate Arrays (FPGAs), which can perform pre-alignment much faster than software. When implemented on a single FPGA chip, GateKeeper maintains high accuracy (on average >96%) while providing, on average, 90-fold and 130-fold speedup over the state-of-the-art software pre-alignment techniques, Adjacency Filter and Shifted Hamming Distance (SHD), respectively. The addition of GateKeeper as a pre-alignment step can reduce the verification time of the mrFAST mapper by a factor of 10. Availability: https://github.com/BilkentCompGen/GateKeeper. Contact: mohammedalser@bilkent.edu.tr or onur.mutlu@inf.ethz.ch or calkan@cs.bilkent.edu.tr. Supplementary data are available at Bioinformatics online. © The Author (2017). Published by Oxford University Press. All rights reserved. For Permissions, please email: journals.permissions@oup.com
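To illustrate the general idea of pre-alignment filtering described above, the following Python sketch implements a deliberately simplified, software-only shifted-comparison filter. It is not GateKeeper's hardware design and not the published SHD algorithm; the function names and the threshold parameter are assumptions made for this example. The filter counts read positions that match the reference under no shift within the allowed edit budget. Every such position must be consumed by an edit in any alignment, so the count is a lower bound on the edit distance, and a candidate whose count exceeds the budget can be discarded without running full alignment.

```python
def unmatched_positions(read, ref, max_edits):
    """Count read bases that match the reference under no shift in
    [-max_edits, +max_edits]. This is a lower bound on edit distance."""
    count = 0
    for i, base in enumerate(read):
        matched = False
        for shift in range(-max_edits, max_edits + 1):
            j = i + shift
            # Positions outside the reference window cannot match.
            if 0 <= j < len(ref) and ref[j] == base:
                matched = True
                break
        if not matched:
            count += 1
    return count


def passes_filter(read, ref, max_edits):
    """Keep a candidate location only if it might align within max_edits."""
    return unmatched_positions(read, ref, max_edits) <= max_edits


# One inserted base in the reference window: survives with a budget of 1 edit.
print(passes_filter("ACGTACGT", "ACGTTACG", 1))
# A highly dissimilar candidate location: rejected before alignment.
print(passes_filter("ACGTACGT", "TTTTTTTT", 1))
```

Because the count never exceeds the true edit distance, this sketch never rejects a correct candidate; it may still pass some incorrect ones, which the downstream aligner would then reject.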
Whole-Exome Sequencing Study of Thyrotropin-Secreting Pituitary Adenomas.

PubMed

Sapkota, Santosh; Horiguchi, Kazuhiko; Tosaka, Masahiko; Yamada, Syozo; Yamada, Masanobu

2017-02-01

Thyrotropin (TSH)-secreting pituitary adenomas (TSHomas) are a rare cause of hyperthyroidism, and the genetic aberrations responsible remain unknown. To identify somatic genetic abnormalities in TSHomas. A single-nucleotide polymorphism (SNP) array analysis was performed on 8 TSHomas. Four tumors with no allelic losses or limited loss of heterozygosity were selected, and whole-exome sequencing was performed, including their corresponding blood samples. Somatic variants were confirmed by Sanger sequencing. A set of 8 tumors was also assessed to validate candidate genes. Twelve patients with sporadic TSHomas were examined. The overall performance of whole-exome sequencing was good, with an average coverage of each base in the targeted region of 97.6%. Six DNA variants were confirmed as candidate driver mutations, with an average of 1.5 somatic mutations per tumor. No mutations were recurrent. Two of these mutations were found in genes with an established role in malignant tumorigenesis (SMOX and SYTL3), and 4 had unknown roles (ZSCAN23, ASTN2, R3HDM2, and CWH43). Similarly, an SNP array analysis revealed frequent chromosomal regions of copy number gains, including recurrent gains at loci harboring 4 of these 6 genes. Several candidate somatic mutations and changes in copy numbers for TSHomas were identified. The results showed no recurrence of mutations in the tumors studied but a low number of mutations, thereby highlighting their benign nature. Further studies on a larger cohort of TSHomas, along with the use of epigenetic and transcriptomic approaches, may reveal the underlying genetic lesions. Copyright © 2017 by the Endocrine Society
Gene Targeting in Rabbits: Single-Step Generation of Knock-out Rabbits by Microinjection of CRISPR/Cas9 Plasmids.

PubMed

Kawano, Yoshihiro; Honda, Arata

2017-01-01

The development of genome editing technology has allowed gene disruptions to be achieved in various animal species and has been beneficial to many mammals. Gene disruption using pluripotent stem cells is difficult to achieve in rabbits, but thanks to advances in genome editing technology, a number of gene disruptions have been conducted. This paper describes a simple and easy method for carrying out gene disruptions in rabbits using CRISPR/Cas9 in which the gene to be disrupted is marked, the presence or absence of off-target candidates is checked, and a plasmid allowing simultaneous expression of Cas9 and sgRNA is constructed. Next, the cleaving activity of candidate sequences is investigated, and assessments are carried out to determine whether the target sequences can be cut. Female rabbits subjected to superovulation treatment are mated with male rabbits and fertilized eggs are collected, and then pronuclear injection of plasmid DNA is performed. The next day, the two-cell stage embryos are transplanted into pseudopregnant rabbits, and offspring are born within approximately 29-30 days. The genomic DNA of the offspring is then examined to check what types of genetic modifications have occurred. With the advent of CRISPR/Cas9, the accessibility of gene disruptions in rabbits has improved remarkably. This paper summarizes specifically how to carry out gene disruptions in rabbits.
Cloning, characterization, and physical mapping of the canine Prop-1 gene (PROP1): exclusion as a candidate for combined pituitary hormone deficiency in German shepherd dogs.

PubMed

Lantinga-van Leeuwen, I S; Kooistra, H S; Mol, J A; Renier, C; Breen, M; van Oost, B A

2000-01-01

Abnormalities in the genes encoding Pit-1 and Prop-1 have been reported to cause combined pituitary hormone deficiency (CPHD) in mice and humans. In dogs, a similar phenotype has been described in the German shepherd breed. We have previously reported that the Pit-1 gene (POU1F1) is not mutated in affected German shepherd dogs. In this study, we report the isolation and mapping of the canine Prop-1 gene (PROP1), and we assessed the involvement of PROP1 in German shepherd dog dwarfism. The canine PROP1 gene was found to contain three exons, encoding a 226 amino acid protein. The deduced amino acid sequence was 79% and 84% homologous with the mouse and human Prop-1 protein, respectively. Using fluorescence in situ hybridization, PROP1 was mapped to canine chromosome 11. Further mapping with a canine radiation hybrid panel showed co-localization with the polymorphic DNA marker AHT137. Sequence analysis of genomic DNA from dwarf German shepherd dogs revealed no alterations in the PROP1 gene. Moreover, linkage analysis of AHT137 revealed no co-segregation between the PROP1 locus and the CPHD phenotype, excluding this gene as candidate for canine CPHD and providing a new spontaneous model of hypopituitarism. Copyright 2000 S. Karger AG, Basel
Testing DNA barcodes in closely related species of Curcuma (Zingiberaceae) from Myanmar and China.

PubMed

Chen, Juan; Zhao, Jietang; Erickson, David L; Xia, Nianhe; Kress, W John

2015-03-01

The genus Curcuma L. is commonly used as spices, medicines, dyes and ornamentals. Owing to its economic significance and lack of clear-cut morphological differences between species, this genus is an ideal case for developing DNA barcodes. In this study, four chloroplast DNA regions (matK, rbcL, trnH-psbA and trnL-F) and one nuclear region (ITS2) were generated for 44 Curcuma species and five species from closely related genera, represented by 96 samples. PCR amplification success rate, intra- and inter-specific genetic distance variation and the correct identification percentage were taken into account to assess candidate barcode regions. PCR and sequence success rate were high in matK (89.7%), rbcL (100%), trnH-psbA (100%), trnL-F (95.7%) and ITS2 (82.6%) regions. The results further showed that four candidate chloroplast barcoding regions (matK, rbcL, trnH-psbA and trnL-F) yield no barcode gaps, indicating that the genus Curcuma represents a challenging group for DNA barcoding. The ITS2 region presented large interspecific variation and provided the highest correct identification rates (46.7%) based on BLASTClust method among the five regions. However, the ITS2 only provided 7.9% based on NJ tree method. An increase in discriminatory power needs the development of more variable markers. © 2014 John Wiley & Sons Ltd.
OVA: integrating molecular and physical phenotype data from multiple biomedical domain ontologies with variant filtering for enhanced variant prioritization.

PubMed

Antanaviciute, Agne; Watson, Christopher M; Harrison, Sally M; Lascelles, Carolina; Crinnion, Laura; Markham, Alexander F; Bonthron, David T; Carr, Ian M

2015-12-01

Exome sequencing has become a de facto standard method for Mendelian disease gene discovery in recent years, yet identifying disease-causing mutations among thousands of candidate variants remains a non-trivial task. Here we describe a new variant prioritization tool, OVA (ontology variant analysis), in which user-provided phenotypic information is exploited to infer deeper biological context. OVA combines a knowledge-based approach with a variant-filtering framework. It reduces the number of candidate variants by considering genotype and predicted effect on protein sequence, and scores the remainder on biological relevance to the query phenotype. We take advantage of several ontologies in order to bridge knowledge across multiple biomedical domains and facilitate computational analysis of annotations pertaining to genes, diseases, phenotypes, tissues and pathways. In this way, OVA combines information regarding molecular and physical phenotypes and integrates both human and model organism data to effectively prioritize variants. By assessing performance on both known and novel disease mutations, we show that OVA performs biologically meaningful candidate variant prioritization and can be more accurate than another recently published candidate variant prioritization tool. OVA is freely accessible at http://dna2.leeds.ac.uk:8080/OVA/index.jsp. Supplementary data are available at Bioinformatics online. Contact: umaan@leeds.ac.uk. © The Author 2015. Published by Oxford University Press.
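The filter-then-score idea in this abstract can be sketched roughly as follows. This is not OVA's actual implementation or API: the `Variant` fields, the set of "damaging" effect labels, the gene names, and the per-gene relevance scores are all hypothetical placeholders standing in for the ontology-derived scores the abstract describes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    gene: str
    genotype: str  # hypothetical labels: "hom" or "het"
    effect: str    # hypothetical labels, e.g. "missense", "nonsense", "synonymous"


# Assumed set of effects predicted to alter the protein sequence.
DAMAGING = {"missense", "nonsense", "frameshift", "splice_site"}


def prioritize(variants, relevance, genotype="hom"):
    """Step 1: keep variants matching the genotype model and a damaging effect.
    Step 2: rank the remainder by a phenotype-relevance score per gene
    (genes without a score default to 0.0)."""
    kept = [v for v in variants if v.genotype == genotype and v.effect in DAMAGING]
    return sorted(kept, key=lambda v: relevance.get(v.gene, 0.0), reverse=True)


# Hypothetical input: four candidate variants and made-up relevance scores.
candidates = [
    Variant("GENE_A", "hom", "synonymous"),
    Variant("GENE_B", "hom", "missense"),
    Variant("GENE_C", "het", "nonsense"),
    Variant("GENE_D", "hom", "nonsense"),
]
scores = {"GENE_B": 0.2, "GENE_D": 0.9}
ranked_genes = [v.gene for v in prioritize(candidates, scores)]
```

Separating the hard filter from the ranking step keeps the scoring function free to change (for example, to combine several ontologies) without touching the genotype and effect criteria.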
Elevated transcription factor specificity protein 1 in autistic brains alters the expression of autism candidate genes.

PubMed

Thanseem, Ismail; Anitha, Ayyappan; Nakamura, Kazuhiko; Suda, Shiro; Iwata, Keiko; Matsuzaki, Hideo; Ohtsubo, Masafumi; Ueki, Takatoshi; Katayama, Taiichi; Iwata, Yasuhide; Suzuki, Katsuaki; Minoshima, Shinsei; Mori, Norio

2012-03-01

Profound changes in gene expression can result from abnormalities in the concentrations of sequence-specific transcription factors like specificity protein 1 (Sp1). Specificity protein 1 binding sites have been reported in the promoter regions of several genes implicated in autism. We hypothesize that dysfunction of Sp1 could affect the expression of multiple autism candidate genes, contributing to the heterogeneity of autism. We assessed any alterations in the expression of Sp1 and that of autism candidate genes in the postmortem brain (anterior cingulate gyrus [ACG], motor cortex, and thalamus) of autism patients (n = 8) compared with healthy control subjects (n = 13). Alterations in the expression of candidate genes upon Sp1/DNA binding inhibition with mithramycin and Sp1 silencing by RNAi were studied in SK-N-SH neuronal cells. We observed elevated expression of Sp1 in ACG of autism patients (p = .010). We also observed altered expression of several autism candidate genes. GABRB3, RELN, and HTR2A showed reduced expression, whereas CD38, ITGB3, MAOA, MECP2, OXTR, and PTEN showed elevated expression in autism. In SK-N-SH cells, OXTR, PTEN, and RELN showed reduced expression upon Sp1/DNA binding inhibition and Sp1 silencing. The RNA integrity number was not available for any of the samples. Transcription factor Sp1 is dysfunctional in the ACG of autistic brain. Consequently, the expression of potential autism candidate genes regulated by Sp1, especially OXTR and PTEN, could be affected. The diverse downstream pathways mediated by the Sp1-regulated genes, along with the environmental and intracellular signal-related regulation of Sp1, could explain the complex phenotypes associated with autism.
Csa-19, a radiation-responsive human gene, identified by an unbiased two-gel cDNA library screening method in human cancer cells

NASA Technical Reports Server (NTRS)

Balcer-Kubiczek, E. K.; Meltzer, S. J.; Han, L. H.; Zhang, X. F.; Shi, Z. M.; Harrison, G. H.; Abraham, J. M.

1997-01-01

A novel polymerase chain reaction (PCR)-based method was used to identify candidate genes whose expression is altered in cancer cells by ionizing radiation. Transcriptional induction of randomly selected genes in control versus irradiated human HL60 cells was compared. Among several complementary DNA (cDNA) clones recovered by this approach, one cDNA clone (CL68-5) was downregulated in X-irradiated HL60 cells but unaffected by 12-O-tetradecanoyl phorbol-13-acetate, forskolin, or cyclosporin-A. DNA sequencing of the CL68-5 cDNA revealed 100% nucleotide sequence homology to the reported human Csa-19 gene. Northern blot analysis of RNA from control and irradiated cells revealed the expression of a single 0.7-kilobase (kb) messenger RNA (mRNA) transcript. This 0.7-kb Csa-19 mRNA transcript was also expressed in a variety of human adult and corresponding fetal normal tissues. Moreover, when the effect of X- or fission neutron-irradiation on Csa-19 mRNA was compared in cultured human cells differing in p53 gene status (p53-/- versus p53+/+), downregulation of Csa-19 by X-rays or fission neutrons was similar in p53-wild type and p53-null cell lines. Our results provide the first known example of a radiation-responsive gene in human cancer cells whose expression is not associated with p53, adenylate cyclase or protein kinase C.
Sensitive DNA detection and SNP discrimination using ultrabright SERS nanorattles and magnetic beads for malaria diagnostics.

PubMed

Ngo, Hoan T; Gandra, Naveen; Fales, Andrew M; Taylor, Steve M; Vo-Dinh, Tuan

2016-07-15

One of the major obstacles to implementing nucleic acid-based molecular diagnostics at the point-of-care (POC) and in resource-limited settings is the lack of sensitive and practical DNA detection methods that can be seamlessly integrated into portable platforms. Herein we present a sensitive yet simple DNA detection method using a surface-enhanced Raman scattering (SERS) nanoplatform: the ultrabright SERS nanorattle. The method, referred to as the nanorattle-based method, involves sandwich hybridization of magnetic beads that are loaded with capture probes, target sequences, and ultrabright SERS nanorattles that are loaded with reporter probes. Upon hybridization, a magnet was applied to concentrate the hybridization sandwiches at a detection spot for SERS measurements. The ultrabright SERS nanorattles, composed of a core and a shell with resonance Raman reporters loaded in the gap space between the core and the shell, serve as SERS tags for signal detection. Using this method, a specific DNA sequence of the malaria parasite Plasmodium falciparum could be detected with a detection limit of approximately 100 attomoles. Single nucleotide polymorphism (SNP) discrimination of wild type malaria DNA and mutant malaria DNA, which confers resistance to artemisinin drugs, was also demonstrated. These test models demonstrate the molecular diagnostic potential of the nanorattle-based method to both detect and genotype infectious pathogens. Furthermore, the method's simplicity makes it a suitable candidate for integration into portable platforms for POC applications and for use in resource-limited settings. Copyright © 2016. Published by Elsevier B.V.
Leishmania genome analysis and high-throughput immunological screening identifies tuzin as a novel vaccine candidate against visceral leishmaniasis.

PubMed

Lakshmi, Bhavana Sethu; Wang, Ruobing; Madhubala, Rentala

2014-06-24

Leishmaniasis is a neglected tropical disease caused by Leishmania species. It is a major health concern affecting 88 countries and threatening 350 million people globally. Unfortunately, there are no vaccines and there are limitations associated with the current therapeutic regimens for leishmaniasis. The emerging cases of drug resistance further aggravate the situation, demanding rapid drug and vaccine development. The genome sequence of Leishmania provides access to novel genes that hold potential as chemotherapeutic targets or vaccine candidates. In this study, we selected 19 antigenic genes from about 8000 common Leishmania genes based on the Leishmania major and Leishmania infantum genome information available in the pathogen databases. Potential vaccine candidates thus identified were screened using an in vitro high-throughput immunological platform developed in the laboratory. Four candidate genes coding for tuzin, flagellar glycoprotein-like protein (FGP), phospholipase A1-like protein (PLA1) and potassium voltage-gated channel protein (K VOLT) showed a predominant protective Th1 response over the disease-exacerbating Th2 response. We report the immunogenic properties and protective efficacy of one of the four antigens, tuzin, as a DNA vaccine against Leishmania donovani challenge. Our results show that administration of tuzin DNA protected BALB/c mice against L. donovani challenge and that protective immunity was associated with higher levels of IFN-γ and IL-12 production in comparison to IL-4 and IL-10. Our study presents a simple approach to rapidly identify potential vaccine candidates using the exhaustive information stored in the genome and an in vitro high-throughput immunological platform. Copyright © 2014. Published by Elsevier Ltd.
Mining new crystal protein genes from Bacillus thuringiensis on the basis of mixed plasmid-enriched genome sequencing and a computational pipeline.

PubMed

Ye, Weixing; Zhu, Lei; Liu, Yingying; Crickmore, Neil; Peng, Donghai; Ruan, Lifang; Sun, Ming

2012-07-01

We have designed a high-throughput system for the identification of novel crystal protein genes (cry) from Bacillus thuringiensis strains. The system was developed with two goals: (i) to acquire the mixed plasmid-enriched genomic sequence of B. thuringiensis using next-generation sequencing biotechnology, and (ii) to identify cry genes with a computational pipeline (using BtToxin_scanner). In our pipeline method, we employed three different kinds of well-developed prediction methods, BLAST, hidden Markov model (HMM), and support vector machine (SVM), to predict the presence of Cry toxin genes. The pipeline proved to be fast (average speed, 1.02 Mb/min for proteins and open reading frames [ORFs] and 1.80 Mb/min for nucleotide sequences), sensitive (it detected 40% more protein toxin genes than a keyword extraction method using genomic sequences downloaded from GenBank), and highly specific. Twenty-one strains from our laboratory's collection were selected based on their plasmid pattern and/or crystal morphology. The plasmid-enriched genomic DNA was extracted from these strains and mixed for Illumina sequencing. The sequencing data were de novo assembled, and a total of 113 candidate cry sequences were identified using the computational pipeline. Twenty-seven candidate sequences were selected on the basis of their low level of sequence identity to known cry genes, and eight full-length genes were obtained with PCR. Finally, three new cry-type genes (primary ranks) and five cry holotypes, which were designated cry8Ac1, cry7Ha1, cry21Ca1, cry32Fa1, and cry21Da1 by the B. thuringiensis Toxin Nomenclature Committee, were identified. The system described here is both efficient and cost-effective and can greatly accelerate the discovery of novel cry genes.

Identification of a novel mutation in a Chinese family with Nance-Horan syndrome by whole exome sequencing

PubMed Central

Hong, Nan; Chen, Yan-hua; Xie, Chen; Xu, Bai-sheng; Huang, Hui; Li, Xin; Yang, Yue-qing; Huang, Ying-ping; Deng, Jian-lian; Qi, Ming; Gu, Yang-shun

2014-01-01

Objective: Nance-Horan syndrome (NHS) is a rare X-linked disorder characterized by congenital nuclear cataracts, dental anomalies, and craniofacial dysmorphisms. Mental retardation was present in about 30% of the reported cases. The purpose of this study was to investigate the genetic and clinical features of NHS in a Chinese family. Methods: Whole exome sequencing analysis was performed on DNA from an affected male to scan for candidate mutations on the X-chromosome. Sanger sequencing was used to verify these candidate mutations in the whole family. Clinical and ophthalmological examinations were performed on all members of the family. Results: A combination of exome sequencing and Sanger sequencing revealed a nonsense mutation c.322G>T (E108X) in exon 1 of NHS gene, co-segregating with the disease in the family. The nonsense mutation led to the conversion of glutamic acid to a stop codon (E108X), resulting in truncation of the NHS protein. Multiple sequence alignments showed that codon 108, where the mutation (c.322G>T) occurred, was located within a phylogenetically conserved region. The clinical features in all affected males and female carriers are described in detail. Conclusions: We report a nonsense mutation c.322G>T (E108X) in a Chinese family with NHS. Our findings broaden the spectrum of NHS mutations and provide molecular insight into future NHS clinical genetic diagnosis. PMID:25091991
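The relationship between the coding-DNA position and the protein change reported here (c.322G>T, E108X) can be checked with a short worked example. This is a sketch only: the helper names are hypothetical, and because the abstract does not state which glutamic acid codon (GAA or GAG) occupies position 108, both are tried.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
GLU_CODONS = ("GAA", "GAG")  # the two standard codons for glutamic acid (E)


def codon_of(cds_pos):
    """Map a 1-based coding-sequence position to (codon number, 0-based offset in codon)."""
    return (cds_pos - 1) // 3 + 1, (cds_pos - 1) % 3


def substitute(codon, offset, ref, alt):
    """Apply a single-base substitution to a codon, checking the reference base."""
    if codon[offset] != ref:
        raise ValueError(f"expected {ref} at offset {offset} of {codon}")
    return codon[:offset] + alt + codon[offset + 1:]


codon_number, offset = codon_of(322)  # position 322 is the first base of codon 108
mutated = [substitute(c, offset, "G", "T") for c in GLU_CODONS]  # TAA and TAG
all_stop = all(m in STOP_CODONS for m in mutated)
```

Whichever glutamic acid codon is present, a G>T change at its first base yields a stop codon, consistent with the nonsense mutation E108X described in the abstract.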
A Reevaluation of Rice Mitochondrial Evolution Based on the Complete Sequence of Male-Fertile and Male-Sterile Mitochondrial Genomes

PubMed Central

Bentolila, Stéphane; Stefanov, Stefan

2012-01-01

Plant mitochondrial genomes have features that distinguish them radically from their animal counterparts: a high rate of rearrangement, of uptake and loss of DNA sequences, and an extremely low point mutation rate. Perhaps the most unique structural feature of plant mitochondrial DNAs is the presence of large repeated sequences involved in intramolecular and intermolecular recombination. In addition, rare recombination events can occur across shorter repeats, creating rearrangements that result in aberrant phenotypes, including pollen abortion, which is known as cytoplasmic male sterility (CMS). Using next-generation sequencing, we pyrosequenced two rice (Oryza sativa) mitochondrial genomes that belong to the indica subspecies. One genome is normal, while the other carries the wild abortive-CMS. We find that numerous rearrangements in the rice mitochondrial genome occur even between close cytotypes during rice evolution. Unlike maize (Zea mays), a closely related species also belonging to the grass family, integration of plastid sequences did not play a role in the sequence divergence between rice cytotypes. This study also uncovered an excellent candidate for the wild abortive-CMS-encoding gene; like most of the CMS-associated open reading frames that are known in other species, this candidate was created via a rearrangement, is chimeric in structure, possesses predicted transmembrane domains, and coopted the promoter of a genuine mitochondrial gene. Our data give new insights into rice mitochondrial evolution, correcting previous reports. PMID:22128137
Integrative Annotation of 21,037 Human Genes Validated by Full-Length cDNA Clones

PubMed Central

Imanishi, Tadashi; Itoh, Takeshi; Suzuki, Yutaka; O'Donovan, Claire; Fukuchi, Satoshi; Koyanagi, Kanako O; Barrero, Roberto A; Tamura, Takuro; Yamaguchi-Kabata, Yumi; Tanino, Motohiko; Yura, Kei; Miyazaki, Satoru; Ikeo, Kazuho; Homma, Keiichi; Kasprzyk, Arek; Nishikawa, Tetsuo; Hirakawa, Mika; Thierry-Mieg, Jean; Thierry-Mieg, Danielle; Ashurst, Jennifer; Jia, Libin; Nakao, Mitsuteru; Thomas, Michael A; Mulder, Nicola; Karavidopoulou, Youla; Jin, Lihua; Kim, Sangsoo; Yasuda, Tomohiro; Lenhard, Boris; Eveno, Eric; Suzuki, Yoshiyuki; Yamasaki, Chisato; Takeda, Jun-ichi; Gough, Craig; Hilton, Phillip; Fujii, Yasuyuki; Sakai, Hiroaki; Tanaka, Susumu; Amid, Clara; Bellgard, Matthew; Bonaldo, Maria de Fatima; Bono, Hidemasa; Bromberg, Susan K; Brookes, Anthony J; Bruford, Elspeth; Carninci, Piero; Chelala, Claude; Couillault, Christine; de Souza, Sandro J.; Debily, Marie-Anne; Devignes, Marie-Dominique; Dubchak, Inna; Endo, Toshinori; Estreicher, Anne; Eyras, Eduardo; Fukami-Kobayashi, Kaoru; R. Gopinath, Gopal; Graudens, Esther; Hahn, Yoonsoo; Han, Michael; Han, Ze-Guang; Hanada, Kousuke; Hanaoka, Hideki; Harada, Erimi; Hashimoto, Katsuyuki; Hinz, Ursula; Hirai, Momoki; Hishiki, Teruyoshi; Hopkinson, Ian; Imbeaud, Sandrine; Inoko, Hidetoshi; Kanapin, Alexander; Kaneko, Yayoi; Kasukawa, Takeya; Kelso, Janet; Kersey, Paul; Kikuno, Reiko; Kimura, Kouichi; Korn, Bernhard; Kuryshev, Vladimir; Makalowska, Izabela; Makino, Takashi; Mano, Shuhei; Mariage-Samson, Regine; Mashima, Jun; Matsuda, Hideo; Mewes, Hans-Werner; Minoshima, Shinsei; Nagai, Keiichi; Nagasaki, Hideki; Nagata, Naoki; Nigam, Rajni; Ogasawara, Osamu; Ohara, Osamu; Ohtsubo, Masafumi; Okada, Norihiro; Okido, Toshihisa; Oota, Satoshi; Ota, Motonori; Ota, Toshio; Otsuki, Tetsuji; Piatier-Tonneau, Dominique; Poustka, Annemarie; Ren, Shuang-Xi; Saitou, Naruya; Sakai, Katsunaga; Sakamoto, Shigetaka; Sakate, Ryuichi; Schupp, Ingo; Servant, Florence; Sherry, Stephen; Shiba, Rie; Shimizu, Nobuyoshi; Shimoyama, Mary; Simpson, Andrew J; Soares, Bento; Steward, Charles; Suwa, Makiko; Suzuki, Mami; Takahashi, Aiko; Tamiya, Gen; Tanaka, Hiroshi; Taylor, Todd; Terwilliger, Joseph D; Unneberg, Per; Veeramachaneni, Vamsi; Watanabe, Shinya; Wilming, Laurens; Yasuda, Norikazu; Yoo, Hyang-Sook; Stodolsky, Marvin; Makalowski, Wojciech; Go, Mitiko; Nakai, Kenta; Takagi, Toshihisa; Kanehisa, Minoru; Sakaki, Yoshiyuki; Quackenbush, John; Okazaki, Yasushi; Hayashizaki, Yoshihide; Hide, Winston; Chakraborty, Ranajit; Nishikawa, Ken; Sugawara, Hideaki; Tateno, Yoshio; Chen, Zhu; Oishi, Michio; Tonellato, Peter; Apweiler, Rolf; Okubo, Kousaku; Wagner, Lukas; Wiemann, Stefan; Strausberg, Robert L; Isogai, Takao; Auffray, Charles; Nomura, Nobuo; Sugano, Sumio

2004-01-01

The human genome sequence defines our inherent biological potential; the realization of the biology encoded therein requires knowledge of the function of each gene. Currently, our knowledge in this area is still limited. Several lines of investigation have been used to elucidate the structure and function of the genes in the human genome. Even so, gene prediction remains a difficult task, as the varieties of transcripts of a gene may vary to a great extent. We thus performed an exhaustive integrative characterization of 41,118 full-length cDNAs that capture the gene transcripts as complete functional cassettes, providing an unequivocal report of structural and functional diversity at the gene level. Our international collaboration has validated 21,037 human gene candidates by analysis of high-quality full-length cDNA clones through curation using unified criteria. This led to the identification of 5,155 new gene candidates. It also manifested the most reliable way to control the quality of the cDNA clones. We have developed a human gene database, called the H-Invitational Database (H-InvDB; http://www.h-invitational.jp/). It provides the following: integrative annotation of human genes, description of gene structures, details of novel alternative splicing isoforms, non-protein-coding RNAs, functional domains, subcellular localizations, metabolic pathways, predictions of protein three-dimensional structure, mapping of known single nucleotide polymorphisms (SNPs), identification of polymorphic microsatellite repeats within human genes, and comparative results with mouse full-length cDNAs. The H-InvDB analysis has shown that up to 4% of the human genome sequence (National Center for Biotechnology Information build 34 assembly) may contain misassembled or missing regions. We found that 6.5% of the human gene candidates (1,377 loci) did not have a good protein-coding open reading frame, of which 296 loci are strong candidates for non-protein-coding RNA genes. 
In addition, among 72,027 uniquely mapped SNPs and insertions/deletions localized within human genes, 13,215 nonsynonymous SNPs, 315 nonsense SNPs, and 452 indels occurred in coding regions. Together with 25 polymorphic microsatellite repeats present in coding regions, they may alter protein structure, causing phenotypic effects or resulting in disease. The H-InvDB platform represents a substantial contribution to resources needed for the exploration of human biology and pathology. PMID:15103394
DNA methylation screening of primary prostate tumors identifies SRD5A2 and CYP11A1 as candidate markers for assessing risk of biochemical recurrence.

PubMed

Horning, Aaron M; Awe, Julius A; Wang, Chiou-Miin; Liu, Joseph; Lai, Zhao; Wang, Vickie Yao; Jadhav, Rohit R; Louie, Anna D; Lin, Chun-Lin; Kroczak, Tad; Chen, Yidong; Jin, Victor X; Abboud-Werner, Sherry L; Leach, Robin J; Hernandez, Javior; Thompson, Ian M; Saranchuk, Jeff; Drachenberg, Darrel; Chen, Chun-Liang; Mai, Sabine; Huang, Tim Hui-Ming

2015-11-01

Altered DNA methylation in CpG islands of gene promoters has been implicated in prostate cancer (PCa) progression and can be used to predict disease outcome. In this study, we determine whether methylation changes of androgen biosynthesis pathway (ABP)-related genes in patients' plasma cell-free DNA (cfDNA) can serve as prognostic markers for biochemical recurrence (BCR). Methyl-binding domain capture sequencing (MBDCap-seq) was used to identify differentially methylated regions (DMRs) in primary tumors of patients who subsequently did or did not develop BCR. Methylation pyrosequencing of candidate loci was validated in cfDNA samples of 86 PCa patients taken at and/or post-radical prostatectomy (RP) using univariate and multivariate prediction analyses. Putative DMRs in 13 of 30 ABP-related genes were found between tumors of BCR (n = 12) versus no evidence of disease (NED) (n = 15). In silico analysis of The Cancer Genome Atlas data confirmed increased DNA methylation of two loci, SRD5A2 and CYP11A1, which also correlated with their decreased expression, in tumors with subsequent BCR development. Their aberrant cfDNA methylation was also associated with detectable levels of PSA measured post-RP. Multivariate analysis of the change in cfDNA methylation at all CpG sites measured, along with the patient's treatment history, predicted whether a patient would develop BCR with 77.5% overall accuracy. Overall, increased DNA methylation of SRD5A2 and CYP11A1 related to androgen biosynthesis functions may play a role in BCR after RP. The correlation between aberrant cfDNA methylation and detectable PSA post-RP further suggests their utility as predictive markers for PCa recurrence. © 2015 Wiley Periodicals, Inc.
Hairpin stabilized fluorescent silver nanoclusters for quantitative detection of NAD+ and monitoring NAD+/NADH based enzymatic reactions.

PubMed

Jain, Priyamvada; Chakma, Babina; Patra, Sanjukta; Goswami, Pranab

2017-03-01

A set of 90-mer ssDNA candidates with different cytosine levels (C-levels; % and clusters) was analyzed for their function as suitable Ag-nanocluster (AgNC) nucleation scaffolds. The sequence (P4) with the highest C-level (42.2%) emerged as the only candidate supporting the nucleation process, as evident from its intense fluorescence peak at λ 660 nm. Shorter DNA subsets derived from P4 could support AgNC formation only when they had stable hairpin structures. The secondary hairpin structures were confirmed by PAGE and CD studies. The number of base pairs in the stem region also contributes to the stability of the hairpins. A shorter 29-mer sequence (Sub 3) (ΔG = -1.3 kcal/mol) with 3 bp in the stem of a 7-mer loop conferred highly stable AgNC. NAD+ strongly quenched the fluorescence of Sub 3-AgNC in a concentration-dependent manner. Time-resolved photoluminescence studies revealed that the quenching involves a combined static and dynamic interaction where the binding constant and number of binding sites for NAD+ were 0.201 L mol⁻¹ and 3.6, respectively. A dynamic NAD+ detection range of 50-500 μM with a limit of detection of 22.3 μM was discerned. The NAD+-mediated quenching of AgNC was not interfered with by NADH, NADP+, monovalent and divalent ions, or serum samples. The method was also used to follow alcohol dehydrogenase and lactate dehydrogenase catalyzed physiological reactions in a turn-on and turn-off assay, respectively. The proposed method with ssDNA-AgNC could therefore be extended to monitor other NAD+/NADH based enzyme catalyzed reactions in a turn-on/turn-off approach. Copyright © 2016 Elsevier B.V. All rights reserved.
Histological and transcript analyses of intact somatic embryos in an elite maize (Zea mays L.) inbred line Y423.

PubMed

Liu, Beibei; Su, Shengzhong; Wu, Ying; Li, Ying; Shan, Xiaohui; Li, Shipeng; Liu, Hongkui; Dong, Haixiao; Ding, Meiqi; Han, Junyou; Yuan, Yaping

2015-07-01

Intact somatic embryos were obtained from an elite maize inbred line Y423, bred in our laboratory. Using 13-day immature embryos after self-pollination as explants, and after 4-5 subcultures, a large number of somatic embryos were detected on the surface of the embryonic calli on the medium. The intact somatic embryos were transferred into the differential medium, where the plantlets regenerated with shoots and roots forming simultaneously. Histological analysis and scanning electron micrographs confirmed the different developmental stages of somatic embryogenesis, including globular-shaped embryo, pear-shaped embryo, scutiform embryo, and mature embryo. cDNA-amplified fragment length polymorphism (cDNA-AFLP) was used for comparative transcript profiling between embryogenic and non-embryogenic calli of a new elite maize inbred line Y423 during somatic embryogenesis. Differentially expressed genes were cloned and sequenced. Gene Ontology analysis of 117 candidate genes indicated their involvement in cellular component, biological process and molecular function. Nine of the candidate genes were selected. The changes in their expression levels during embryo induction and regeneration were analyzed in detail using quantitative real-time PCR. Two full-length cDNA sequences, encoding ZmSUF4 (suppressor of fir 4-like protein) and ZmDRP3A (dynamin-related protein), were cloned successfully from intact somatic embryos of the elite inbred maize line Y423. Here, a procedure for maize plant regeneration from somatic embryos is described. Additionally, the possible roles of some of these genes during somatic embryogenesis are discussed. This study is a systematic analysis of the cellular and molecular mechanisms during the formation of intact somatic embryos in maize. Copyright © 2015 Elsevier Masson SAS. All rights reserved.
Restriction site polymorphism-based candidate gene mapping for seedling drought tolerance in cowpea [Vigna unguiculata (L.) Walp.].

PubMed

Muchero, Wellington; Ehlers, Jeffrey D; Roberts, Philip A

2010-02-01

Quantitative trait loci (QTL) studies provide insight into the complexity of drought tolerance mechanisms. Molecular markers used in these studies also allow for marker-assisted selection (MAS) in breeding programs, enabling transfer of genetic factors between breeding lines without complete knowledge of their exact nature. However, the potential for recombination between markers and target genes limits the utility of MAS-based strategies. Candidate gene mapping offers an alternative solution to identify trait determinants underlying QTL of interest. Here, we used restriction site polymorphisms to investigate co-location of candidate genes with QTL for seedling drought stress-induced premature senescence identified previously in cowpea. Genomic DNA isolated from 113 F(2:8) RILs of drought-tolerant IT93K503-1 and drought-susceptible CB46 genotypes was digested with combinations of EcoRI and HpaII, MseI, or MspI restriction enzymes and amplified with primers designed from 13 drought-responsive cDNAs. JoinMap 3.0 and MapQTL 4.0 software were used to incorporate polymorphic markers onto the AFLP map and to analyze their association with the drought response QTL. Seven markers co-located with peaks of previously identified QTL. Isolation, sequencing, and BLAST analysis of these markers confirmed their significant homology with drought or other abiotic stress-induced expressed sequence tags (EST) from cowpea and other plant systems. Further, homology with coding sequences for a multidrug resistance protein 3 and a photosystem I assembly protein ycf3 was revealed in two of these candidates. These results provide a platform for the identification and characterization of genetic trait determinants underlying seedling drought tolerance in cowpea.
Purification and Characterization of Plantaricin ZJ5, a New Bacteriocin Produced by Lactobacillus plantarum ZJ5

PubMed Central

Song, Da-Feng; Zhu, Mu-Yuan; Gu, Qing

2014-01-01

The aim of this study is to investigate the antimicrobial potential of Lactobacillus plantarum ZJ5, a strain isolated from fermented mustard with a broad range of inhibitory activity against both Gram-positive and Gram-negative bacteria. Here we present the peptide plantaricin ZJ5 (PZJ5), which is stable under extreme pH and heat. However, it can be digested by pepsin and proteinase K. This peptide has strong activity against Staphylococcus aureus. PZJ5 has been purified using a multi-step process, including ammonium sulfate precipitation, cation-exchange chromatography, hydrophobic interactions and reverse-phase chromatography. The molecular mass of the peptide was found to be 2572.9 Da using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS). The primary structure of this peptide was determined using amino acid sequencing and DNA sequencing, and these analyses revealed that the DNA sequence translated as a 44-residue precursor containing a 22-amino-acid N-terminal extension that was of the double-glycine type. The bacteriocin sequence exhibited no homology with known bacteriocins when compared with those available in the database, indicating that it was a new class IId bacteriocin. PZJ5 from a food-borne strain may be useful as a promising probiotic candidate. PMID:25147943
Purification and characterization of Plantaricin ZJ5, a new bacteriocin produced by Lactobacillus plantarum ZJ5.

PubMed

Song, Da-Feng; Zhu, Mu-Yuan; Gu, Qing

2014-01-01

The aim of this study is to investigate the antimicrobial potential of Lactobacillus plantarum ZJ5, a strain isolated from fermented mustard with a broad range of inhibitory activity against both Gram-positive and Gram-negative bacteria. Here we present the peptide plantaricin ZJ5 (PZJ5), which is stable under extreme pH and heat. However, it can be digested by pepsin and proteinase K. This peptide has strong activity against Staphylococcus aureus. PZJ5 has been purified using a multi-step process, including ammonium sulfate precipitation, cation-exchange chromatography, hydrophobic interactions and reverse-phase chromatography. The molecular mass of the peptide was found to be 2572.9 Da using matrix-assisted laser desorption/ionization time-of-flight mass spectrometry (MALDI-TOF MS). The primary structure of this peptide was determined using amino acid sequencing and DNA sequencing, and these analyses revealed that the DNA sequence translated as a 44-residue precursor containing a 22-amino-acid N-terminal extension that was of the double-glycine type. The bacteriocin sequence exhibited no homology with known bacteriocins when compared with those available in the database, indicating that it was a new class IId bacteriocin. PZJ5 from a food-borne strain may be useful as a promising probiotic candidate.
Expressed sequence tag analysis of adult human lens for the NEIBank Project: over 2000 non-redundant transcripts, novel genes and splice variants.

PubMed

Wistow, Graeme; Bernstein, Steven L; Wyatt, M Keith; Behal, Amita; Touchman, Jeffrey W; Bouffard, Gerald; Smith, Don; Peterson, Katherine

2002-06-15

To explore the expression profile of the human lens and to provide a resource for microarray studies, expressed sequence tag (EST) analysis has been performed on cDNA libraries from adult lenses. A cDNA library was constructed from two adult (40 year old) human lenses. Over two thousand clones were sequenced from the unamplified, un-normalized library. The library was then normalized and a further 2200 sequences were obtained. All the data were analyzed using GRIST (GRouping and Identification of Sequence Tags), a procedure for gene identification and clustering. The lens library (by) contains a low percentage of non-mRNA contaminants and a high fraction (over 75%) of apparently full length cDNA clones. Approximately 2000 reads from the unamplified library yield 810 clusters, potentially representing individual genes expressed in the lens. After normalization, the content of crystallins and other abundant cDNAs is markedly reduced and a similar number of reads from this library (fs) yield 1455 unique groups, of which only two thirds correspond to named genes in GenBank. Among the most abundant cDNAs is one for a novel gene related to glutamine synthetase, which was designated "lengsin" (LGS). Analyses of ESTs also reveal examples of alternative transcripts, including a major alternative splice form for the lens specific membrane protein MP19. Variant forms for other transcripts, including those encoding the apoptosis inhibitor Livin and the armadillo repeat protein ARVCF, are also described. The lens cDNA libraries are a resource for gene discovery, full length cDNAs for functional studies and microarrays. The discovery of an abundant, novel transcript, lengsin, and a major novel splice form of MP19 reflects the utility of unamplified libraries constructed from dissected tissue. Many novel transcripts and splice forms are represented, some of which may be candidates for genetic diseases.
Molecular Authentication of the Traditional Medicinal Plant "Lakshman Booti" (Smithia conferta Sm.) and Its Adulterants through DNA Barcoding.

PubMed

Umdale, Suraj D; Kshirsagar, Parthraj R; Lekhak, Manoj M; Gaikwad, Nikhil B

2017-07-01

Smithia conferta Sm. is an annual herb widely used in Indian traditional medical practice and commonly known as "Lakshman booti" in Sanskrit. Morphological resemblance among the species of genus Smithia Aiton leads to inaccurate identification and adulteration. This causes inconsistent therapeutic effects and also affects the quality of herbal medicine. This study aimed to generate a potential barcode for authentication of S. conferta and its adulterants through the DNA barcoding technique. Genomic DNA extracted from S. conferta and its adulterants was used as the template for polymerase chain reaction amplification of the barcoding regions. The amplicons were submitted for sequencing, and species identification was conducted using BLASTn and unweighted pair-group method with arithmetic mean trees. In addition, the secondary structures of the internal transcribed spacer (ITS) 2 region were predicted. The nucleotide sequence of ITS provides species-specific single nucleotide polymorphisms and higher sequence divergence (22%) than the psbA-trnH (10.9%) and rbcL (3.1%) sequences. The ITS barcode indicates that S. conferta and Smithia sensitiva are closely related compared to other species. ITS is the most applicable barcode for molecular authentication of S. conferta, and further chloroplast barcodes should be tested for phylogenetic analysis of genus Smithia. The present investigation is the first effort to use a DNA barcode for molecular authentication of S. conferta and its adulterants. Also, this study expanded the application of the ITS2 sequence data in authentication. ITS proved to be a potential and reliable candidate barcode for the authentication of S. conferta.
Abbreviations used: BLASTn: Basic Local Alignment Search Tool for Nucleotide; MEGA: Molecular Evolutionary Genetics Analysis; EMBL: European Molecular Biology Laboratory; psbA-trnH: Photosystem II protein D1-structural RNA: His tRNA gene; rbcL: Ribulose 1,5-bisphosphate carboxylase/oxygenase large subunit gene.
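The divergence percentages quoted for ITS, psbA-trnH, and rbcL can be read as pairwise distances between aligned barcode sequences. The abstract does not state which distance model was used, so the following is only a minimal sketch assuming an uncorrected p-distance (the proportion of differing sites); the function name `p_distance` and the toy sequences are hypothetical.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences.

    Columns containing a gap ('-') or an ambiguous base ('N') in either
    sequence are skipped. This is an uncorrected p-distance, assumed here
    for illustration only.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue  # skip columns that cannot be compared
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites in the alignment")
    return differences / compared


if __name__ == "__main__":
    # Toy aligned fragments, not real Smithia barcode data.
    print(f"{100 * p_distance('ACGTACGTAC', 'ACGTTCGTAA'):.1f}% divergence")
```

A marker with a larger distance between species than within species is the more useful barcode, which is the sense in which the abstract ranks ITS above the two chloroplast regions.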
Structural analysis of the HLA-A/HLA-F subregion: Precise localization of two new multigene families closely associated with the HLA class I sequences

DOE Office of Scientific and Technical Information (OSTI.GOV)

Pichon, L.; Carn, G.; Bouric, P.

1996-03-01

Positional cloning strategies for the hemochromatosis gene have previously concentrated on a target area restricted to a maximum genomic expanse of 400 kb around the HLA-A and HLA-F loci. Recently, the candidate region has been extended to 2-3 Mb on the distal side of the MHC. In this study, 10 coding sequences [hemochromatosis candidate genes (HCG) I to X] were isolated by cDNA selection using YACs covering the HLA-A/HLA-F subregion. Two of these (HCG II and HCG IV) belong to multigene families, as well as other sequences already described in this region, i.e., P5, pMC 6.7, and HLA class I. Fingerprinting of the four YACs overlapping the region was performed and allowed partial localization of the different multigene family sequences on each YAC without defining their exact positions. Fingerprinting on cosmids isolated from the ICRF chromosome 6-specific cosmid library allowed more precise localization of the redundant sequences in all of the multigene families and revealed their apparent organization in clusters. Further examination of these intertwined sequences demonstrated that this structural organization resulted from a succession of complex phenomena, including duplications and contractions. This study presents a precise description of the structural organization of the HLA-A/HLA-F region and a determination of the sequences involved in the megabase size polymorphism observed among the A3, A24, and A31 haplotypes. 29 refs., 2 figs., 2 tabs.
The genomic organization of a human creatine transporter (CRTR) gene located in Xq28

DOE Office of Scientific and Technical Information (OSTI.GOV)

Sandoval, N.; Bauer, D.; Brenner, V.

1996-07-15

During the course of a large-scale sequencing project in Xq28, a human creatine transporter (CRTR) gene was discovered. The gene is located approximately 36 kb centromeric to ALD. The gene contains 13 exons and spans about 8.5 kb of genomic DNA. Since the creatine transporter has a prominent function in muscular physiology, it is a candidate gene for Barth syndrome and infantile cardiomyopathy mapped to Xq28. 19 refs., 1 fig., 1 tab.
Multiple-strand displacement and identification of single nucleotide polymorphisms as markers of genotypic variation of Pasteuria penetrans biotypes infecting root-knot nematodes.

PubMed

Nong, Guang; Chow, Virginia; Schmidt, Liesbeth M; Dickson, Don W; Preston, James F

2007-08-01

Pasteuria species are endospore-forming obligate bacterial parasites of soil-inhabiting nematodes and water-inhabiting cladocerans, e.g. water fleas, and are closely related to Bacillus spp. by 16S rRNA gene sequence. As naturally occurring bacteria, biotypes of Pasteuria penetrans are attractive candidates for the biocontrol of various Meloidogyne spp. (root-knot nematodes). Failure to culture these bacteria outside their hosts has prevented isolation of genomic DNA in quantities sufficient for identification of genes associated with host recognition and virulence. We have applied multiple-strand displacement amplification (MDA) to generate DNA for comparative genomics of biotypes exhibiting different host preferences. Using the genome of Bacillus subtilis as a paradigm, MDA allowed quantitative detection and sequencing of 12 marker genes from 2000 cells. Meloidogyne spp. infected with P. penetrans P20 or B4 contained single nucleotide polymorphisms (SNPs) in the spoIIAB gene that did not change the amino acid sequence, or that substituted amino acids with similar chemical properties. Individual nematodes infected with P. penetrans P20 or B4 contained SNPs in the spoIIAB gene sequenced in MDA-generated products. Detection of SNPs in the spoIIAB gene in a nematode indicates infection by more than one genotype, supporting the need to sequence genomes of Pasteuria spp. derived from single spore isolates.
Identification of an miRNA candidate reflects the possible significance of transcribed microsatellites in the hairpin precursors of black pepper.

PubMed

Joy, Nisha; Soniya, Eppurathu Vasudevan

2012-06-01

Plant miRNAs (18-24nt) are generated by the RNase III-type Dicer endonuclease from the endogenous hairpin precursors ('pre-miRNAs') with significant regulatory functions. The transcribed regions display a higher frequency of microsatellites, when compared to other regions of the genomic DNA. Simple sequence repeats (SSRs) resulting from replication slippage occurring in transcripts affect the expression of genes. The available experimental evidence for the incidence of SSRs in the miRNA precursors is limited. Considering the potential significance of SSRs in the miRNA genes, we carried out a preliminary analysis to verify the presence of SSRs in the pri-miRNAs of black pepper (Piper nigrum L.). We isolated a (CT) dinucleotide SSR bearing transcript using SMART strategy. The transcript was predicted to be a 'pri-miRNA candidate' with Dicer sites based on miRNA prediction tools and MFOLD structural predictions. The presence of this 'miRNA candidate' was confirmed by real-time TaqMan assays. The upstream sequence of the 'miRNA candidate' by genome walking when subjected to PlantCARE showed the presence of certain promoter elements, and the deduced amino acid showed significant similarity with NAP1 gene, which affects the transcription of many genes. Moreover the hairpin-like precursor overlapped the neighbouring NAP1 gene. In silico analysis revealed distinct putative functions for the 'miRNA candidate', of which majority were related to growth. Hence, we assume that this 'miRNA candidate' may get activated during transcription of NAP gene, thereby regulating the expression of many genes involved in developmental processes.
Identification of Small RNAs in Desulfovibrio vulgaris Hildenborough

DOE Office of Scientific and Technical Information (OSTI.GOV)

Burns, Andrew; Joachimiak, Marcin; Deutschbauer, Adam

2010-05-17

Desulfovibrio vulgaris is an anaerobic sulfate-reducing bacterium capable of facilitating the removal of toxic metals such as uranium from contaminated sites via reduction. As such, it is essential to understand the intricate regulatory cascades involved in how D. vulgaris and its relatives respond to stressors in such sites. One approach is the identification and analysis of small non-coding RNAs (sRNAs), molecules ranging in size from 20-200 nucleotides that predominantly affect gene regulation by binding to complementary mRNA in an anti-sense fashion and therefore provide an immediate regulatory response. To identify sRNAs in D. vulgaris, a bacterium that does not possess an annotated hfq gene, RNA was pooled from stationary and exponential phases, nitrate exposure, and biofilm conditions. The subsequent RNA was size fractionated, modified, and converted to cDNA for high throughput transcriptomic deep sequencing. A computational approach to identify sRNAs via the alignment of seven separate Desulfovibrio genomes was also performed. From the deep sequencing analysis, 2,296 reads between 20 and 250 nt were identified with expression above genome background. Analysis of those reads limited the number of candidates to ~87 intergenic, while ~140 appeared to be antisense to annotated open reading frames (ORFs). Further BLAST analysis of the intergenic candidates and other Desulfovibrio genomes indicated that eight candidates were likely portions of ORFs not previously annotated in the D. vulgaris genome. Comparison of the intergenic and antisense data sets to the bioinformatically predicted candidates resulted in ~54 common candidates. Current approaches using Northern analysis and qRT-PCR are being used to verify expression of the candidates and to further develop the role these sRNAs play in D. vulgaris regulation.
Identification of evolutionarily conserved DNA damage response genes that alter sensitivity to cisplatin

PubMed Central

Gaponova, Anna V.; Deneka, Alexander Y.; Beck, Tim N.; Liu, Hanqing; Andrianov, Gregory; Nikonova, Anna S.; Nicolas, Emmanuelle; Einarson, Margret B.; Golemis, Erica A.; Serebriiskii, Ilya G.

2017-01-01

Ovarian, head and neck, and other cancers are commonly treated with cisplatin and other DNA damaging cytotoxic agents. Altered DNA damage response (DDR) contributes to resistance of these tumors to chemotherapies, some targeted therapies, and radiation. DDR involves multiple protein complexes and signaling pathways, some of which are evolutionarily ancient and involve protein orthologs conserved from yeast to humans. To identify new regulators of cisplatin-resistance in human tumors, we integrated high throughput and curated datasets describing yeast genes that regulate sensitivity to cisplatin and/or ionizing radiation. Next, we clustered highly validated genes based on chemogenomic profiling, and then mapped orthologs of these genes in expanded genomic networks for multiple metazoans, including humans. This approach identified an enriched candidate set of genes involved in the regulation of resistance to radiation and/or cisplatin in humans. Direct functional assessment of selected candidate genes using RNA interference confirmed their activity in influencing cisplatin resistance, degree of γH2AX focus formation and ATR phosphorylation, in ovarian and head and neck cancer cell lines, suggesting impaired DDR signaling as the driving mechanism. This work enlarges the set of genes that may contribute to chemotherapy resistance and provides a new contextual resource for interpreting next generation sequencing (NGS) genomic profiling of tumors. PMID:27863405
DNA Barcoding Survey of Anurans across the Eastern Cordillera of Colombia and the Impact of the Andes on Cryptic Diversity

PubMed Central

Guarnizo, Carlos E.; Paz, Andrea; Muñoz-Ortiz, Astrid; Flechas, Sandra V.; Méndez-Narváez, Javier; Crawford, Andrew J.

2015-01-01

Colombia hosts the second highest amphibian species diversity on Earth, yet its fauna remains poorly studied, especially using molecular genetic techniques. We present the results of the first wide-scale DNA barcoding survey of anurans of Colombia, focusing on a transect across the Eastern Cordillera. We surveyed 10 sites between the Magdalena Valley to the west and the eastern foothills of the Eastern Cordillera, sequencing portions of the mitochondrial 16S ribosomal RNA and cytochrome oxidase subunit 1 (CO1) genes for 235 individuals from 52 nominal species. We applied two barcode algorithms, Automatic Barcode Gap Discovery and Refined Single Linkage Analysis, to estimate the number of clusters or “unconfirmed candidate species” supported by DNA barcode data. Our survey included ~7% of the anuran species known from Colombia. While barcoding algorithms differed slightly in the number of clusters identified, between three and ten nominal species may be obscuring candidate species (in some cases, more than one cryptic species per nominal species). Our data suggest that the high elevations of the Eastern Cordillera and the low elevations of the Chicamocha canyon acted as geographic barriers in at least seven nominal species, promoting strong genetic divergences between populations associated with the Eastern Cordillera. PMID:26000447
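Both barcode algorithms named above group sequences into clusters from pairwise genetic distances. As a rough illustration of that idea only, the sketch below performs plain single-linkage clustering at a fixed distance threshold using a union-find structure. It is not an implementation of Automatic Barcode Gap Discovery or Refined Single Linkage Analysis; the function name, the specimen labels, the distances, and the 3% threshold are all hypothetical.

```python
def single_linkage_clusters(names, distances, threshold):
    """Group specimens whose pairwise distance is <= threshold.

    `distances` maps frozenset({a, b}) to a distance. Clusters are formed
    transitively (single linkage): a and c end up together if a~b and b~c.
    """
    parent = {name: name for name in names}

    def find(x):
        # Follow parent links to the cluster root, compressing the path.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for i, a in enumerate(names):
        for b in names[i + 1:]:
            if distances[frozenset((a, b))] <= threshold:
                parent[find(a)] = find(b)  # merge the two clusters

    clusters = {}
    for name in names:
        clusters.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in clusters.values())


if __name__ == "__main__":
    specimens = ["frog1", "frog2", "frog3", "frog4"]
    toy_distances = {
        frozenset(("frog1", "frog2")): 0.01,
        frozenset(("frog2", "frog3")): 0.02,
        frozenset(("frog1", "frog3")): 0.05,
        frozenset(("frog1", "frog4")): 0.15,
        frozenset(("frog2", "frog4")): 0.14,
        frozenset(("frog3", "frog4")): 0.16,
    }
    print(single_linkage_clusters(specimens, toy_distances, threshold=0.03))
```

When one nominal species splits into more than one such cluster, the extra clusters correspond to what the abstract calls unconfirmed candidate species.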
Mitochondrial transcription factor A (Tfam) gene sequencing and mitochondrial evaluation in inherited retinal dysplasia in miniature schnauzer dogs

PubMed Central

Bauer, Bianca S.; Forsyth, George W.; Sandmeyer, Lynne S.; Grahn, Bruce H.

2011-01-01

Mitochondrial transcription factor A (Tfam) has been implicated in the pathogenesis of retinal dysplasia in miniature schnauzer dogs and it has been proposed that affected dogs have altered mitochondrial numbers, size, and morphology. To test these hypotheses the Tfam gene of affected and normal miniature schnauzer dogs with retinal dysplasia was sequenced and lymphocyte mitochondria were quantified, measured, and the morphology was compared in normal and affected dogs using transmission electron microscopy. For Tfam sequencing, retina, retinal pigment epithelium (RPE), and whole blood samples were collected. Total RNA was isolated from the retina and RPE and reverse transcribed to make cDNA. Genomic DNA was extracted from white blood cell pellets obtained from the whole blood samples. The Tfam coding sequence, 5′ promoter region, intron1 and the 3′ non-coding sequence of normal and affected dogs were amplified using polymerase chain reaction (PCR), cloned and sequenced. For electron microscopy, lymphocytes from affected and normal dogs were photographed and the mitochondria within each cross-section were identified, quantified, and the mitochondrial area (μm²) per lymphocyte cross-section was calculated. Lastly, using a masked technique, mitochondrial morphology was compared between the 2 groups. Sequencing of the miniature schnauzer Tfam gene revealed no functional sequence variation between affected and normal dogs. Lymphocyte and mitochondrial area, mitochondrial quantification, and morphology assessment also revealed no significant difference between the 2 groups. Further investigation into other candidate genes or factors causing retinal dysplasia in the miniature schnauzer is warranted. PMID:21731185
Mitochondrial transcription factor A (Tfam) gene sequencing and mitochondrial evaluation in inherited retinal dysplasia in miniature schnauzer dogs.

PubMed

Bauer, Bianca S; Forsyth, George W; Sandmeyer, Lynne S; Grahn, Bruce H

2011-04-01

Mitochondrial transcription factor A (Tfam) has been implicated in the pathogenesis of retinal dysplasia in miniature schnauzer dogs and it has been proposed that affected dogs have altered mitochondrial numbers, size, and morphology. To test these hypotheses the Tfam gene of affected and normal miniature schnauzer dogs with retinal dysplasia was sequenced and lymphocyte mitochondria were quantified, measured, and the morphology was compared in normal and affected dogs using transmission electron microscopy. For Tfam sequencing, retina, retinal pigment epithelium (RPE), and whole blood samples were collected. Total RNA was isolated from the retina and RPE and reverse transcribed to make cDNA. Genomic DNA was extracted from white blood cell pellets obtained from the whole blood samples. The Tfam coding sequence, 5' promoter region, intron1 and the 3' non-coding sequence of normal and affected dogs were amplified using polymerase chain reaction (PCR), cloned and sequenced. For electron microscopy, lymphocytes from affected and normal dogs were photographed and the mitochondria within each cross-section were identified, quantified, and the mitochondrial area (μm²) per lymphocyte cross-section was calculated. Lastly, using a masked technique, mitochondrial morphology was compared between the 2 groups. Sequencing of the miniature schnauzer Tfam gene revealed no functional sequence variation between affected and normal dogs. Lymphocyte and mitochondrial area, mitochondrial quantification, and morphology assessment also revealed no significant difference between the 2 groups. Further investigation into other candidate genes or factors causing retinal dysplasia in the miniature schnauzer is warranted.

Capturing the Biofuel Wellhead and Powerhouse: The Chloroplast and Mitochondrial Genomes of the Leguminous Feedstock Tree Pongamia pinnata

PubMed Central

Kazakoff, Stephen H.; Imelfort, Michael; Edwards, David; Koehorst, Jasper; Biswas, Bandana; Batley, Jacqueline; Scott, Paul T.; Gresshoff, Peter M.

2012-01-01

Pongamia pinnata (syn. Millettia pinnata) is a novel, fast-growing arboreal legume that bears prolific quantities of oil-rich seeds suitable for the production of biodiesel and aviation biofuel. Here, we have used Illumina® ‘Second Generation DNA Sequencing (2GS)’ and a new short-read de novo assembler, SaSSY, to assemble and annotate the Pongamia chloroplast (152,968 bp; cpDNA) and mitochondrial (425,718 bp; mtDNA) genomes. We also show that SaSSY can be used to accurately assemble 2GS data, by re-assembling the Lotus japonicus cpDNA and in the process assemble its mtDNA (380,861 bp). The Pongamia cpDNA contains 77 unique protein-coding genes and is almost 60% gene-dense. It contains a 50 kb inversion common to other legumes, as well as a novel 6.5 kb inversion that is responsible for the non-disruptive, re-orientation of five protein-coding genes. Additionally, two copies of an inverted repeat firmly place the species outside the subclade of the Fabaceae lacking the inverted repeat. The Pongamia and L. japonicus mtDNA contain just 33 and 31 unique protein-coding genes, respectively, and like other angiosperm mtDNA, have expanded intergenic and multiple repeat regions. Through comparative analysis with Vigna radiata we measured the average synonymous and non-synonymous divergence of all three legume mitochondrial (1.59% and 2.40%, respectively) and chloroplast (8.37% and 8.99%, respectively) protein-coding genes. Finally, we explored the relatedness of Pongamia within the Fabaceae and showed the utility of the organellar genome sequences by mapping transcriptomic data to identify up- and down-regulated stress-responsive gene candidates and confirm in silico predicted RNA editing sites. PMID:23272141
Capturing the biofuel wellhead and powerhouse: the chloroplast and mitochondrial genomes of the leguminous feedstock tree Pongamia pinnata.

PubMed

Kazakoff, Stephen H; Imelfort, Michael; Edwards, David; Koehorst, Jasper; Biswas, Bandana; Batley, Jacqueline; Scott, Paul T; Gresshoff, Peter M

2012-01-01

Pongamia pinnata (syn. Millettia pinnata) is a novel, fast-growing arboreal legume that bears prolific quantities of oil-rich seeds suitable for the production of biodiesel and aviation biofuel. Here, we have used Illumina® 'Second Generation DNA Sequencing (2GS)' and a new short-read de novo assembler, SaSSY, to assemble and annotate the Pongamia chloroplast (152,968 bp; cpDNA) and mitochondrial (425,718 bp; mtDNA) genomes. We also show that SaSSY can be used to accurately assemble 2GS data, by re-assembling the Lotus japonicus cpDNA and in the process assemble its mtDNA (380,861 bp). The Pongamia cpDNA contains 77 unique protein-coding genes and is almost 60% gene-dense. It contains a 50 kb inversion common to other legumes, as well as a novel 6.5 kb inversion that is responsible for the non-disruptive, re-orientation of five protein-coding genes. Additionally, two copies of an inverted repeat firmly place the species outside the subclade of the Fabaceae lacking the inverted repeat. The Pongamia and L. japonicus mtDNA contain just 33 and 31 unique protein-coding genes, respectively, and like other angiosperm mtDNA, have expanded intergenic and multiple repeat regions. Through comparative analysis with Vigna radiata we measured the average synonymous and non-synonymous divergence of all three legume mitochondrial (1.59% and 2.40%, respectively) and chloroplast (8.37% and 8.99%, respectively) protein-coding genes. Finally, we explored the relatedness of Pongamia within the Fabaceae and showed the utility of the organellar genome sequences by mapping transcriptomic data to identify up- and down-regulated stress-responsive gene candidates and confirm in silico predicted RNA editing sites.
Genomic diversity within the haloalkaliphilic genus Thioalkalivibrio

DOE Office of Scientific and Technical Information (OSTI.GOV)

Ahn, Anne-Catherine; Meier-Kolthoff, Jan P.; Overmars, Lex

Thioalkalivibrio is a genus of obligate chemolithoautotrophic haloalkaliphilic sulfur-oxidizing bacteria. Their habitats are soda lakes, which are dual extreme environments with a pH range from 9.5 to 11 and salt concentrations up to saturation. More than 100 strains of this genus have been isolated from various soda lakes all over the world, but only ten species have been effectively described yet. Therefore, the assignment of the remaining strains to either existing or novel species is important and will further elucidate their genomic diversity as well as give a better general understanding of this genus. Recently, the genomes of 76 Thioalkalivibrio strains were sequenced. On these, we applied different methods including (i) 16S rRNA gene sequence analysis, (ii) Multilocus Sequence Analysis (MLSA) based on eight housekeeping genes, (iii) Average Nucleotide Identity based on BLAST (ANIb) and MUMmer (ANIm), (iv) Tetranucleotide frequency correlation coefficients (TETRA), (v) digital DNA:DNA hybridization (dDDH) as well as (vi) nucleotide- and amino acid-based Genome BLAST Distance Phylogeny (GBDP) analyses. We detected a high genomic diversity by revealing 15 new "genomic" species and 16 new "genomic" subspecies in addition to the ten already described species. Phylogenetic and phylogenomic analyses showed that the genus is not monophyletic, because four strains were clearly separated from the other Thioalkalivibrio by type strains from other genera. Therefore, it is recommended to classify the latter group as a novel genus. The biogeographic distribution of Thioalkalivibrio suggested that the different "genomic" species can be classified as candidate disjunct or candidate endemic species. This study is a detailed genome-based classification and identification of members within the genus Thioalkalivibrio. However, future phenotypical and chemotaxonomical studies will be needed for a full species description of this genus.
Genomic diversity within the haloalkaliphilic genus Thioalkalivibrio

DOE PAGES

Ahn, Anne-Catherine; Meier-Kolthoff, Jan P.; Overmars, Lex; ...

2017-03-10

Thioalkalivibrio is a genus of obligate chemolithoautotrophic haloalkaliphilic sulfur-oxidizing bacteria. Their habitats are soda lakes, which are dual extreme environments with a pH range from 9.5 to 11 and salt concentrations up to saturation. More than 100 strains of this genus have been isolated from various soda lakes all over the world, but only ten species have been effectively described yet. Therefore, the assignment of the remaining strains to either existing or novel species is important and will further elucidate their genomic diversity as well as give a better general understanding of this genus. Recently, the genomes of 76 Thioalkalivibrio strains were sequenced. On these, we applied different methods including (i) 16S rRNA gene sequence analysis, (ii) Multilocus Sequence Analysis (MLSA) based on eight housekeeping genes, (iii) Average Nucleotide Identity based on BLAST (ANIb) and MUMmer (ANIm), (iv) Tetranucleotide frequency correlation coefficients (TETRA), (v) digital DNA:DNA hybridization (dDDH) as well as (vi) nucleotide- and amino acid-based Genome BLAST Distance Phylogeny (GBDP) analyses. We detected a high genomic diversity by revealing 15 new "genomic" species and 16 new "genomic" subspecies in addition to the ten already described species. Phylogenetic and phylogenomic analyses showed that the genus is not monophyletic, because four strains were clearly separated from the other Thioalkalivibrio by type strains from other genera. Therefore, it is recommended to classify the latter group as a novel genus. The biogeographic distribution of Thioalkalivibrio suggested that the different "genomic" species can be classified as candidate disjunct or candidate endemic species. This study is a detailed genome-based classification and identification of members within the genus Thioalkalivibrio. However, future phenotypical and chemotaxonomical studies will be needed for a full species description of this genus.
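Of the methods listed, the tetranucleotide frequency correlation is the simplest to illustrate. The sketch below is a deliberately simplified version: it computes the Pearson correlation between raw tetranucleotide frequency vectors of two sequences on one strand only. Published TETRA implementations typically correlate z-scores of observed versus expected frequencies, so this should be read as an assumption-laden toy, with hypothetical function names and toy input strings.

```python
from itertools import product
from math import sqrt

# All 256 possible tetranucleotides in a fixed order.
KMERS = ["".join(p) for p in product("ACGT", repeat=4)]


def tetra_frequencies(seq: str) -> list[float]:
    """Relative frequency of each tetranucleotide in `seq` (one strand only)."""
    seq = seq.upper()
    counts = dict.fromkeys(KMERS, 0)
    total = 0
    for i in range(len(seq) - 3):
        kmer = seq[i:i + 4]
        if kmer in counts:  # windows containing N or other symbols are skipped
            counts[kmer] += 1
            total += 1
    if total == 0:
        raise ValueError("sequence has no countable tetranucleotides")
    return [counts[k] / total for k in KMERS]


def pearson(x: list[float], y: list[float]) -> float:
    """Pearson correlation coefficient of two equal-length vectors."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for a constant vector")
    return cov / sqrt(var_x * var_y)


def tetra_correlation(seq_a: str, seq_b: str) -> float:
    """Simplified TETRA-style similarity between two sequences."""
    return pearson(tetra_frequencies(seq_a), tetra_frequencies(seq_b))


if __name__ == "__main__":
    genome_a = "ACGTTGCAAGGCTTAACCGGTA" * 3  # toy strings, not real genomes
    genome_b = "ACGTTGCATGGCTTAACCGGAA" * 3
    print(round(tetra_correlation(genome_a, genome_b), 3))
```

Values close to 1 indicate similar oligonucleotide usage, which is why such coefficients are used alongside ANI and dDDH when delineating genomic species.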
Triterpenoid Saponin Biosynthetic Pathway Profiling and Candidate Gene Mining of the Ilex asprella Root Using RNA-Seq

PubMed Central

Zheng, Xiasheng; Xu, Hui; Ma, Xinye; Zhan, Ruoting; Chen, Weiwen

2014-01-01

Ilex asprella, which contains abundant α-amyrin type triterpenoid saponins, is an anti-influenza herbal drug widely used in south China. In this work, we first analysed the transcriptome of the I. asprella root using RNA-Seq, which provided a dataset for functional gene mining. mRNA was isolated from the total RNA of the I. asprella root and reverse-transcribed into cDNA. Then, the cDNA library was sequenced using an Illumina HiSeq™ 2000, which generated 55,028,452 clean reads. De novo assembly of these reads generated 51,865 unigenes, in which 39,269 unigenes were annotated (75.71% yield). According to the structures of the triterpenoid saponins of I. asprella, a putative biosynthetic pathway downstream of 2,3-oxidosqualene was proposed and candidate unigenes in the transcriptome data that were potentially involved in the pathway were screened using homology-based BLAST and phylogenetic analysis. Further amplification and functional analysis of these putative unigenes will provide insight into the biosynthesis of Ilex triterpenoid saponins. PMID:24722569
Integrating evolutionary and functional approaches to infer adaptation at specific loci.

PubMed

Storz, Jay F; Wheat, Christopher W

2010-09-01

Inferences about adaptation at specific loci are often exclusively based on the static analysis of DNA sequence variation. Ideally, population-genetic evidence for positive selection serves as a stepping-off point for experimental studies to elucidate the functional significance of the putatively adaptive variation. We argue that inferences about adaptation at specific loci are best achieved by integrating the indirect, retrospective insights provided by population-genetic analyses with the more direct, mechanistic insights provided by functional experiments. Integrative studies of adaptive genetic variation may sometimes be motivated by experimental insights into molecular function, which then provide the impetus to perform population genetic tests to evaluate whether the functional variation is of adaptive significance. In other cases, studies may be initiated by genome scans of DNA variation to identify candidate loci for recent adaptation. Results of such analyses can then motivate experimental efforts to test whether the identified candidate loci do in fact contribute to functional variation in some fitness-related phenotype. Functional studies can provide corroborative evidence for positive selection at particular loci, and can potentially reveal specific molecular mechanisms of adaptation.
Reverse genetics of measles virus and resulting multivalent recombinant vaccines: applications of recombinant measles viruses.

PubMed

Billeter, M A; Naim, H Y; Udem, S A

2009-01-01

An overview is given on the development of technologies to allow reverse genetics of RNA viruses, i.e., the rescue of viruses from cDNA, with emphasis on nonsegmented negative-strand RNA viruses (Mononegavirales), as exemplified for measles virus (MV). Primarily, these technologies allowed site-directed mutagenesis, enabling important insights into a variety of aspects of the biology of these viruses. Concomitantly, foreign coding sequences were inserted to (a) allow localization of virus replication in vivo through marker gene expression, (b) develop candidate multivalent vaccines against measles and other pathogens, and (c) create candidate oncolytic viruses. The vector use of these viruses was experimentally encouraged by the pronounced genetic stability of the recombinants unexpected for RNA viruses, and by the high load of insertable genetic material, in excess of 6 kb. The known assets, such as the small genome size of the vector in comparison to DNA viruses proposed as vectors, the extensive clinical experience of attenuated MV as vaccine with a proven record of high safety and efficacy, and the low production cost per vaccination dose are thus favorably complemented.
Biochemical Characterization of Novel Retroviral Integrase Proteins

PubMed Central

Ballandras-Colas, Allison; Naraharisetty, Hema; Li, Xiang; Serrao, Erik; Engelman, Alan

2013-01-01

Integrase is an essential retroviral enzyme, catalyzing the stable integration of reverse transcribed DNA into cellular DNA. Several aspects of the integration mechanism, including the length of host DNA sequence duplication flanking the integrated provirus, which can be from 4 to 6 bp, and the nucleotide preferences at the site of integration, are thought to cluster among the different retroviral genera. To date only the spumavirus prototype foamy virus integrase has provided diffractable crystals of integrase-DNA complexes, revealing unprecedented details on the molecular mechanisms of DNA integration. Here, we characterize five previously unstudied integrase proteins, including those derived from the alpharetrovirus lymphoproliferative disease virus (LPDV), betaretroviruses Jaagsiekte sheep retrovirus (JSRV), and mouse mammary tumor virus (MMTV), epsilonretrovirus walleye dermal sarcoma virus (WDSV), and gammaretrovirus reticuloendotheliosis virus strain A (Rev-A) to identify potential novel structural biology candidates. Integrase expressed in bacterial cells was analyzed for solubility, stability during purification, and, once purified, 3′ processing and DNA strand transfer activities in vitro. We show that while we were unable to extract or purify accountable amounts of WDSV, JSRV, or LPDV integrase, purified MMTV and Rev-A integrase each preferentially support the concerted integration of two viral DNA ends into target DNA. The sequencing of concerted Rev-A integration products indicates high fidelity cleavage of target DNA strands separated by 5 bp during integration, which contrasts with the 4 bp duplication generated by a separate gammaretrovirus, the Moloney murine leukemia virus (MLV). By comparing Rev-A in vitro integration sites to those generated by MLV in cells, we concordantly conclude that the spacing of target DNA cleavage is more evolutionarily flexible than are the target DNA base contacts made by integrase during integration. Given their desirable concerted DNA integration profiles, Rev-A and MMTV integrase proteins have been earmarked for structural biology studies. PMID:24124581
Single nucleotide polymorphisms of DNA repair genes as predictors of radioresponse.

PubMed

Parliament, Matthew B; Murray, David

2010-10-01

Radiation therapy is a key modality in the treatment of cancer. Substantial progress has been made in unraveling the molecular events which underpin the responses of malignant and surrounding normal tissues to ionizing radiation. An understanding of the genes involved in processes such as DNA double-strand break repair, DNA damage response, cell-cycle control, apoptosis, cellular antioxidant defenses, and cytokine production, has evolved toward examination of how genetic variants, most often, single nucleotide polymorphisms (SNPs), may influence interindividual radioresponse. Experimental approaches, such as candidate SNP-association studies, genome-wide association studies, and massively parallel sequencing are being proposed to address these questions. We present a focused review of the evidence supporting an association between SNPs in DNA repair genes and radioresponse in normal tissues and tumors. Although preliminary results indicate possible associations, there are methodological weaknesses in many of the studies, and independent validation of SNPs as biomarkers of radioresponse in much larger cohorts will likely require research cooperation through international consortia. Copyright © 2010 Elsevier Inc. All rights reserved.
Proteogenomics: Integrating Next-Generation Sequencing and Mass Spectrometry to Characterize Human Proteomic Variation

NASA Astrophysics Data System (ADS)

Sheynkman, Gloria M.; Shortreed, Michael R.; Cesnik, Anthony J.; Smith, Lloyd M.

2016-06-01

Mass spectrometry-based proteomics has emerged as the leading method for detection, quantification, and characterization of proteins. Nearly all proteomic workflows rely on proteomic databases to identify peptides and proteins, but these databases typically contain a generic set of proteins that lack variations unique to a given sample, precluding their detection. Fortunately, proteogenomics enables the detection of such proteomic variations and can be defined, broadly, as the use of nucleotide sequences to generate candidate protein sequences for mass spectrometry database searching. Proteogenomics is experiencing heightened significance due to two developments: (a) advances in DNA sequencing technologies that have made complete sequencing of human genomes and transcriptomes routine, and (b) the unveiling of the tremendous complexity of the human proteome as expressed at the levels of genes, cells, tissues, individuals, and populations. We review here the field of human proteogenomics, with an emphasis on its history, current implementations, the types of proteomic variations it reveals, and several important applications.
Proteogenomics: Integrating Next-Generation Sequencing and Mass Spectrometry to Characterize Human Proteomic Variation

PubMed Central

Sheynkman, Gloria M.; Shortreed, Michael R.; Cesnik, Anthony J.; Smith, Lloyd M.

2016-01-01

Mass spectrometry–based proteomics has emerged as the leading method for detection, quantification, and characterization of proteins. Nearly all proteomic workflows rely on proteomic databases to identify peptides and proteins, but these databases typically contain a generic set of proteins that lack variations unique to a given sample, precluding their detection. Fortunately, proteogenomics enables the detection of such proteomic variations and can be defined, broadly, as the use of nucleotide sequences to generate candidate protein sequences for mass spectrometry database searching. Proteogenomics is experiencing heightened significance due to two developments: (a) advances in DNA sequencing technologies that have made complete sequencing of human genomes and transcriptomes routine, and (b) the unveiling of the tremendous complexity of the human proteome as expressed at the levels of genes, cells, tissues, individuals, and populations. We review here the field of human proteogenomics, with an emphasis on its history, current implementations, the types of proteomic variations it reveals, and several important applications. PMID:27049631
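The definition above (nucleotide sequences used to generate candidate protein sequences for database searching) can be illustrated with a minimal Python sketch. It is not taken from the review: the function names, the toy coding sequence, and the single-nucleotide variant are hypothetical, and only the standard genetic code is assumed.

```python
from itertools import product

# Standard genetic code, built from the conventional TCAG ordering of codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(cds):
    """Translate a coding sequence, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE[cds[i:i + 3].upper()]
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def apply_snv(cds, position, alt_base):
    """Return the sequence with one base substituted (0-based position)."""
    return cds[:position] + alt_base + cds[position + 1:]


def candidate_proteins(cds, snvs):
    """Reference protein plus one candidate protein per sample-specific SNV."""
    candidates = {translate(cds)}
    for position, alt_base in snvs:
        candidates.add(translate(apply_snv(cds, position, alt_base)))
    return sorted(candidates)


# Hypothetical toy input: a 12-nt coding sequence and one observed variant.
reference_cds = "ATGGCCAAGTGA"
sample_snvs = [(4, "T")]  # GCC (Ala) becomes GTC (Val)
search_database = candidate_proteins(reference_cds, sample_snvs)
```

In this toy case the search database would hold both the reference protein and the variant protein, so a spectrum matching the variant peptide is no longer excluded by a generic database.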
Illuminator, a desktop program for mutation detection using short-read clonal sequencing.

PubMed

Carr, Ian M; Morgan, Joanne E; Diggle, Christine P; Sheridan, Eamonn; Markham, Alexander F; Logan, Clare V; Inglehearn, Chris F; Taylor, Graham R; Bonthron, David T

2011-10-01

Current methods for sequencing clonal populations of DNA molecules yield several gigabases of data per day, typically comprising reads of < 100 nt. Such datasets permit widespread genome resequencing and transcriptome analysis or other quantitative tasks. However, this huge capacity can also be harnessed for the resequencing of smaller (gene-sized) target regions, through the simultaneous parallel analysis of multiple subjects, using sample "tagging" or "indexing". These methods promise to have a huge impact on diagnostic mutation analysis and candidate gene testing. Here we describe a software package developed for such studies, offering the ability to resolve pooled samples carrying barcode tags and to align reads to a reference sequence using a mutation-tolerant process. The program, Illuminator, can identify rare sequence variants, including insertions and deletions, and permits interactive data analysis on standard desktop computers. It facilitates the effective analysis of targeted clonal sequencer data without dedicated computational infrastructure or specialized training. Copyright © 2011 Elsevier Inc. All rights reserved.
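The idea of resolving pooled samples by their barcode tags can be sketched as follows. This is only an illustration in Python under simple assumptions (exact-match tags of equal length at the start of each read); it is not Illuminator's own algorithm, and the function name, sample names, and reads are hypothetical.

```python
from collections import defaultdict


def demultiplex(reads, barcodes):
    """Group reads by their leading barcode tag and trim the tag.

    reads    -- iterable of read sequences (strings)
    barcodes -- dict mapping tag sequence -> sample name; all tags are
                assumed to have the same length and to match exactly
    """
    tag_length = len(next(iter(barcodes)))
    by_sample = defaultdict(list)
    for read in reads:
        tag, insert = read[:tag_length], read[tag_length:]
        if tag in barcodes:
            by_sample[barcodes[tag]].append(insert)
        else:
            # Keep unrecognized reads intact so they can be inspected later.
            by_sample["unassigned"].append(read)
    return dict(by_sample)


# Hypothetical pooled run with two tagged subjects.
tags = {"ACGT": "subject_1", "TGCA": "subject_2"}
pooled = ["ACGTGGGATC", "TGCACCCTAG", "NNNNAAAA"]
per_sample = demultiplex(pooled, tags)
```

A production tool would additionally tolerate sequencing errors within the tag; exact matching is used here only to keep the sketch short.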
Novel Antigen Identification Method for Discovery of Protective Malaria Antigens by Rapid Testing of DNA Vaccines Encoding Exons from the Parasite Genome

PubMed Central

Haddad, Diana; Bilcikova, Erika; Witney, Adam A.; Carlton, Jane M.; White, Charles E.; Blair, Peter L.; Chattopadhyay, Rana; Russell, Joshua; Abot, Esteban; Charoenvit, Yupin; Aguiar, Joao C.; Carucci, Daniel J.; Weiss, Walter R.

2004-01-01

We describe a novel approach for identifying target antigens for preerythrocytic malaria vaccines. Our strategy is to rapidly test hundreds of DNA vaccines encoding exons from the Plasmodium yoelii yoelii genomic sequence. In this antigen identification method, we measure reduction in parasite burden in the liver after sporozoite challenge in mice. Orthologs of protective P. y. yoelii genes can then be identified in the genomic databases of Plasmodium falciparum and Plasmodium vivax and investigated as candidate antigens for a human vaccine. A pilot study to develop the antigen identification method approach used 192 P. y. yoelii exons from genes expressed during the sporozoite stage of the life cycle. A total of 182 (94%) exons were successfully cloned into a DNA immunization vector with the Gateway cloning technology. To assess immunization strategies, mice were vaccinated with 19 of the new DNA plasmids in addition to the well-characterized protective plasmid encoding P. y. yoelii circumsporozoite protein. Single plasmid immunization by gene gun identified a novel vaccine target antigen which decreased liver parasite burden by 95% and which has orthologs in P. vivax and P. knowlesi but not P. falciparum. Intramuscular injection of DNA plasmids produced a different pattern of protective responses from those seen with gene gun immunization. Intramuscular immunization with plasmid pools could reduce liver parasite burden in mice despite the fact that none of the plasmids was protective when given individually. We conclude that high-throughput cloning of exons into DNA vaccines and their screening is feasible and can rapidly identify new malaria vaccine candidate antigens. PMID:14977966
DNA-encoded libraries - an efficient small molecule discovery technology for the biomedical sciences.

PubMed

Kunig, Verena; Potowski, Marco; Gohla, Anne; Brunschweiger, Andreas

2018-06-27

DNA-encoded compound libraries are a highly attractive technology for the discovery of small molecule protein ligands. These compound collections consist of small molecules covalently connected to individual DNA sequences carrying readable information about the compound structure. DNA-tagging allows for efficient synthesis, handling and interrogation of vast numbers of chemically synthesized, drug-like compounds. They are screened on proteins by an efficient, generic assay based on Darwinian principles of selection. To date, selection of DNA-encoded libraries allowed for the identification of numerous bioactive compounds. Some of these compounds uncovered hitherto unknown allosteric binding sites on target proteins; several compounds proved their value as chemical biology probes unraveling complex biology; and the first examples of clinical candidates that trace their ancestry to a DNA-encoded library were reported. Thus, DNA-encoded libraries proved their value for the biomedical sciences as a generic technology for the identification of bioactive drug-like molecules numerous times. However, large scale experiments showed that even the selection of billions of compounds failed to deliver bioactive compounds for the majority of proteins in an unbiased panel of target proteins. This raises the question of compound library design.
Intact coding region of the serotonin transporter gene in obsessive-compulsive disorder

DOE Office of Scientific and Technical Information (OSTI.GOV)

Altemus, M.; Murphy, D.L.; Greenberg, B.

1996-07-26

Epidemiologic studies indicate that obsessive-compulsive disorder is genetically transmitted in some families, although no genetic abnormalities have been identified in individuals with this disorder. The selective response of obsessive-compulsive disorder to treatment with agents which block serotonin reuptake suggests the gene coding for the serotonin transporter as a candidate gene. The primary structure of the serotonin-transporter coding region was sequenced in 22 patients with obsessive-compulsive disorder, using direct PCR sequencing of cDNA synthesized from platelet serotonin-transporter mRNA. No variations in amino acid sequence were found among the obsessive-compulsive disorder patients or healthy controls. These results do not support a role for alteration in the primary structure of the coding region of the serotonin-transporter gene in the pathogenesis of obsessive-compulsive disorder. 27 refs.
Corresponding Mitochondrial DNA and Niche Divergence for Crested Newt Candidate Species

PubMed Central

Wielstra, Ben; Beukema, Wouter; Arntzen, Jan W.; Skidmore, Andrew K.; Toxopeus, Albertus G.; Raes, Niels

2012-01-01

Genetic divergence of mitochondrial DNA does not necessarily correspond to reproductive isolation. However, if mitochondrial DNA lineages occupy separate segments of environmental space, this supports the notion of their evolutionary independence. We explore niche differentiation among three candidate species of crested newt (characterized by distinct mitochondrial DNA lineages) and interpret the results in the light of differences observed for recognized crested newt species. We quantify niche differences among all crested newt (candidate) species and test hypotheses regarding niche evolution, employing two ordination techniques (PCA-env and ENFA). Niche equivalency is rejected: all (candidate) species are found to occupy significantly different segments of environmental space. Furthermore, niche overlap values for the three candidate species are not significantly higher than those for the recognized species. As the three candidate crested newt species are, not only in terms of mitochondrial DNA genetic divergence, but also ecologically speaking, as diverged as the recognized crested newt species, our findings are in line with the hypothesis that they represent cryptic species. We address potential pitfalls of our methodology. PMID:23029564
Development of novel low-copy nuclear markers for Hieraciinae (Asteraceae) and their perspective for other tribes.

PubMed

Krak, Karol; Alvarez, Inés; Caklová, Petra; Costa, Andrea; Chrtek, Jindrich; Fehrer, Judith

2012-02-01

The development of three low-copy nuclear markers for low taxonomic level phylogenies in Asteraceae with emphasis on the subtribe Hieraciinae is reported. Marker candidates were selected by comparing a Lactuca complementary DNA (cDNA) library with public DNA sequence databases. Interspecific variation and phylogenetic signal of the selected genes were investigated for diploid taxa from the subtribe Hieraciinae and compared to a reference phylogeny. Their ability to cross-amplify was assessed for other Asteraceae tribes. All three markers had higher variation (2.1-4.5 times) than the internal transcribed spacer (ITS) in Hieraciinae. Cross-amplification was successful in at least seven other tribes of the Asteraceae. Only three cases indicating the presence of paralogs or pseudogenes were detected. The results demonstrate the potential of these markers for phylogeny reconstruction in the Hieraciinae as well as in other Asteraceae tribes, especially for very closely related species.
LDsplit: screening for cis-regulatory motifs stimulating meiotic recombination hotspots by analysis of DNA sequence polymorphisms.

PubMed

Yang, Peng; Wu, Min; Guo, Jing; Kwoh, Chee Keong; Przytycka, Teresa M; Zheng, Jie

2014-02-17

As a fundamental genomic element, meiotic recombination hotspot plays important roles in life sciences. Thus uncovering its regulatory mechanisms has broad impact on biomedical research. Despite the recent identification of the zinc finger protein PRDM9 and its 13-mer binding motif as major regulators for meiotic recombination hotspots, other regulators remain to be discovered. Existing methods for finding DNA sequence motifs of recombination hotspots often rely on the enrichment of co-localizations between hotspots and short DNA patterns, which ignore the cross-individual variation of recombination rates and sequence polymorphisms in the population. Our objective in this paper is to capture signals encoded in genetic variations for the discovery of recombination-associated DNA motifs. Recently, an algorithm called "LDsplit" has been designed to detect the association between single nucleotide polymorphisms (SNPs) and proximal meiotic recombination hotspots. The association is measured by the difference of population recombination rates at a hotspot between two alleles of a candidate SNP. Here we present an open source software tool of LDsplit, with integrative data visualization for recombination hotspots and their proximal SNPs. Applying LDsplit on SNPs inside an established 7-mer motif bound by PRDM9 we observed that SNP alleles preserving the original motif tend to have higher recombination rates than the opposite alleles that disrupt the motif. Running on SNP windows around hotspots each containing an occurrence of the 7-mer motif, LDsplit is able to guide the established motif finding algorithm of MEME to recover the 7-mer motif. In contrast, without LDsplit the 7-mer motif could not be identified. LDsplit is a software tool for the discovery of cis-regulatory DNA sequence motifs stimulating meiotic recombination hotspots by screening and narrowing down to hotspot associated SNPs. 
It is the first computational method that utilizes the genetic variation of recombination hotspots among individuals, opening a new avenue for motif finding. Tested on an established motif and simulated datasets, LDsplit shows promise to discover novel DNA motifs for meiotic recombination hotspots.
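The association measure described above (the difference in population recombination rate at a hotspot between the two alleles of a candidate SNP) can be sketched in Python. This is not the LDsplit implementation: the function names are hypothetical, and the recombination-rate estimator is deliberately left as a caller-supplied function, since the abstract does not specify how the rates are estimated.

```python
def split_by_allele(haplotypes, snp_index):
    """Partition haplotypes by the allele they carry at the candidate SNP."""
    groups = {}
    for haplotype in haplotypes:
        groups.setdefault(haplotype[snp_index], []).append(haplotype)
    return groups


def rate_difference(haplotypes, snp_index, estimate_rate):
    """Difference in estimated recombination rate between the two alleles.

    estimate_rate is an assumed, caller-supplied function that maps a list
    of haplotypes to a population recombination rate for the hotspot.
    """
    groups = split_by_allele(haplotypes, snp_index)
    if len(groups) != 2:
        raise ValueError("candidate SNP must be biallelic in the sample")
    (allele_a, haps_a), (allele_b, haps_b) = sorted(groups.items())
    return allele_a, allele_b, estimate_rate(haps_a) - estimate_rate(haps_b)


# Toy stand-in for a real estimator: the number of distinct haplotypes.
def toy_estimator(haps):
    return len(set(haps))


toy_haplotypes = ["AAT", "AAC", "GAT", "GCC", "GCT"]
result = rate_difference(toy_haplotypes, 0, toy_estimator)
```

A large difference between the two allele groups would flag the SNP as hotspot-associated; in practice the significance of that difference would also need to be assessed, which this sketch does not attempt.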
LDsplit: screening for cis-regulatory motifs stimulating meiotic recombination hotspots by analysis of DNA sequence polymorphisms

PubMed Central

2014-01-01

Background As a fundamental genomic element, meiotic recombination hotspot plays important roles in life sciences. Thus uncovering its regulatory mechanisms has broad impact on biomedical research. Despite the recent identification of the zinc finger protein PRDM9 and its 13-mer binding motif as major regulators for meiotic recombination hotspots, other regulators remain to be discovered. Existing methods for finding DNA sequence motifs of recombination hotspots often rely on the enrichment of co-localizations between hotspots and short DNA patterns, which ignore the cross-individual variation of recombination rates and sequence polymorphisms in the population. Our objective in this paper is to capture signals encoded in genetic variations for the discovery of recombination-associated DNA motifs. Results Recently, an algorithm called “LDsplit” has been designed to detect the association between single nucleotide polymorphisms (SNPs) and proximal meiotic recombination hotspots. The association is measured by the difference of population recombination rates at a hotspot between two alleles of a candidate SNP. Here we present an open source software tool of LDsplit, with integrative data visualization for recombination hotspots and their proximal SNPs. Applying LDsplit on SNPs inside an established 7-mer motif bound by PRDM9 we observed that SNP alleles preserving the original motif tend to have higher recombination rates than the opposite alleles that disrupt the motif. Running on SNP windows around hotspots each containing an occurrence of the 7-mer motif, LDsplit is able to guide the established motif finding algorithm of MEME to recover the 7-mer motif. In contrast, without LDsplit the 7-mer motif could not be identified. Conclusions LDsplit is a software tool for the discovery of cis-regulatory DNA sequence motifs stimulating meiotic recombination hotspots by screening and narrowing down to hotspot associated SNPs. 
It is the first computational method that utilizes the genetic variation of recombination hotspots among individuals, opening a new avenue for motif finding. Tested on an established motif and simulated datasets, LDsplit shows promise to discover novel DNA motifs for meiotic recombination hotspots. PMID:24533858
GraphTeams: a method for discovering spatial gene clusters in Hi-C sequencing data.

PubMed

Schulz, Tizian; Stoye, Jens; Doerr, Daniel

2018-05-08

Hi-C sequencing offers novel, cost-effective means to study the spatial conformation of chromosomes. We use data obtained from Hi-C experiments to provide new evidence for the existence of spatial gene clusters. These are sets of genes with associated functionality that exhibit close proximity to each other in the spatial conformation of chromosomes across several related species. We present the first gene cluster model capable of handling spatial data. Our model generalizes a popular computational model for gene cluster prediction, called δ-teams, from sequences to graphs. Following previous lines of research, we subsequently extend our model to allow for several vertices being associated with the same label. The model, called δ-teams with families, is particularly suitable for our application as it enables handling of gene duplicates. We develop algorithmic solutions for both models. We implemented the algorithm for discovering δ-teams with families and integrated it into a fully automated workflow for discovering gene clusters in Hi-C data, called GraphTeams. We applied it to human and mouse data to find intra- and interchromosomal gene cluster candidates. The results include intrachromosomal clusters that seem to exhibit a closer proximity in space than on their chromosomal DNA sequence. We further discovered interchromosomal gene clusters that contain genes from different chromosomes within the human genome, but are located on a single chromosome in mouse. By identifying δ-teams with families, we provide a flexible model to discover gene cluster candidates in Hi-C data. Our analysis of Hi-C data from human and mouse reveals several known gene clusters (thus validating our approach), but also a few sparsely studied or possibly unknown gene cluster candidates that could be the source of further experimental investigations.

Ultra Deep Sequencing of Listeria monocytogenes sRNA Transcriptome Revealed New Antisense RNAs

PubMed Central

Behrens, Sebastian; Widder, Stefanie; Mannala, Gopala Krishna; Qing, Xiaoxing; Madhugiri, Ramakanth; Kefer, Nathalie; Mraheil, Mobarak Abu; Rattei, Thomas; Hain, Torsten

2014-01-01

Listeria monocytogenes, a gram-positive pathogen and the causative agent of listeriosis, has become a widely used model organism for intracellular infections. Recent studies have identified small non-coding RNAs (sRNAs) as important factors for regulating gene expression and pathogenicity of L. monocytogenes. Increased speed and reduced costs of high throughput sequencing (HTS) techniques have made RNA sequencing (RNA-Seq) the state-of-the-art method to study bacterial transcriptomes. We created a large transcriptome dataset of L. monocytogenes containing a total of 21 million reads, using the SOLiD sequencing technology. The dataset contained cDNA sequences generated from L. monocytogenes RNA collected under intracellular and extracellular conditions and was additionally size-fractionated into three ranges: <40 nt, 40–150 nt and >150 nt. We report here the identification of nine new sRNA candidates of L. monocytogenes and a reevaluation of known sRNAs of L. monocytogenes EGD-e. Automatic comparison to known sRNAs revealed a high recovery rate of 55%, which was increased to 90% by manual revision of the data. Moreover, thorough classification of known sRNAs shed further light on their possible biological functions. Interestingly, among the newly identified sRNA candidates are antisense RNAs (asRNAs) associated with the housekeeping genes purA, fumC and pgi and potentially their regulation, emphasizing the significance of sRNAs for metabolic adaptation in L. monocytogenes. PMID:24498259
Mapping the affinity landscape of Thrombin-binding aptamers on 2΄F-ANA/DNA chimeric G-Quadruplex microarrays

PubMed Central

Abou Assi, Hala; Gómez-Pinto, Irene; González, Carlos

2017-01-01

In situ fabricated nucleic acids microarrays are versatile and very high-throughput platforms for aptamer optimization and discovery, but the chemical space that can be probed against a given target has largely been confined to DNA, while RNA and non-natural nucleic acid microarrays are still an essentially uncharted territory. 2΄-Fluoroarabinonucleic acid (2΄F-ANA) is a prime candidate for such use in microarrays. Indeed, 2΄F-ANA chemistry is readily amenable to photolithographic microarray synthesis and its potential in high affinity aptamers has been recently discovered. We thus synthesized the first microarrays containing 2΄F-ANA and 2΄F-ANA/DNA chimeric sequences to fully map the binding affinity landscape of the TBA1 thrombin-binding G-quadruplex aptamer containing all 32 768 possible DNA-to-2΄F-ANA mutations. The resulting microarray was screened against thrombin to identify a series of promising 2΄F-ANA-modified aptamer candidates with Kds significantly lower than that of the unmodified control and which were found to adopt highly stable, antiparallel-folded G-quadruplex structures. The solution structure of the TBA1 aptamer modified with 2΄F-ANA at position T3 shows that fluorine substitution preorganizes the dinucleotide loop into the proper conformation for interaction with thrombin. Overall, our work strengthens the potential of 2΄F-ANA in aptamer research and further expands non-genomic applications of nucleic acids microarrays. PMID:28100695
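The figure of 32 768 variants follows from choosing DNA or 2΄F-ANA independently at each of 15 positions (2^15 = 32 768). The Python sketch below enumerates that space. It assumes the commonly cited 15-mer thrombin-binding aptamer sequence GGTTGGTGTGGTTGG for TBA1, which is not stated in the abstract, and it uses lowercase letters as a purely illustrative notation for 2΄F-ANA positions.

```python
from itertools import product

# Assumed 15-nt thrombin-binding aptamer sequence (not given in the abstract).
TBA = "GGTTGGTGTGGTTGG"


def chimeric_variants(sequence):
    """Yield every DNA/2'F-ANA chimera of the sequence.

    Each position is independently left as DNA (uppercase) or marked as
    2'F-ANA (lowercase), giving 2**len(sequence) variants in total.
    """
    for mask in product((False, True), repeat=len(sequence)):
        yield "".join(
            base.lower() if modified else base
            for base, modified in zip(sequence, mask)
        )


variant_count = sum(1 for _ in chimeric_variants(TBA))

# The variant modified only at position T3 (1-based), as discussed above.
t3_variant = TBA[:2] + TBA[2].lower() + TBA[3:]
```

Under these assumptions the enumeration yields one probe sequence per microarray feature, with the unmodified control as the first variant and the fully modified aptamer as the last.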
Molecular insight into the association between cartilage regeneration and ear wound healing in genetic mouse models: targeting new genes in regeneration.

PubMed

Rai, Muhammad Farooq; Schmidt, Eric J; McAlinden, Audrey; Cheverud, James M; Sandell, Linda J

2013-11-06

Tissue regeneration is a complex trait with few genetic models available. Mouse strains LG/J and MRL are exceptional healers. Using recombinant inbred strains from a large (LG/J, healer) and small (SM/J, nonhealer) intercross, we have previously shown a positive genetic correlation between ear wound healing, knee cartilage regeneration, and protection from osteoarthritis. We hypothesize that a common set of genes operates in tissue healing and articular cartilage regeneration. Taking advantage of archived histological sections from recombinant inbred strains, we analyzed expression of candidate genes through branched-chain DNA technology directly from tissue lysates. We determined broad-sense heritability of candidates, Pearson correlation of candidates with healing phenotypes, and Ward minimum variance cluster analysis for strains. A bioinformatic assessment of allelic polymorphisms within and near candidate genes was also performed. The expression of several candidates was significantly heritable among strains. Although several genes correlated with both ear wound healing and cartilage healing at a marginal level, the expression of four genes representing DNA repair (Xrcc2, Pcna) and Wnt signaling (Axin2, Wnt16) pathways was significantly positively correlated with both phenotypes. Cluster analysis accurately classified healers and nonhealers for seven out of eight strains based on gene expression. Specific sequence differences between LG/J and SM/J were identified as potential causal polymorphisms. Our study suggests a common genetic basis between tissue healing and osteoarthritis susceptibility. Mapping genetic variations causing differences in diverse healing responses in multiple tissues may reveal generic healing processes in pursuit of new therapeutic targets designed to induce or enhance regeneration and, potentially, protection from osteoarthritis.
A comparative genomics strategy for targeted discovery of single-nucleotide polymorphisms and conserved-noncoding sequences in orphan crops.

PubMed

Feltus, F A; Singh, H P; Lohithaswa, H C; Schulze, S R; Silva, T D; Paterson, A H

2006-04-01

Completed genome sequences provide templates for the design of genome analysis tools in orphan species lacking sequence information. To demonstrate this principle, we designed 384 PCR primer pairs to conserved exonic regions flanking introns, using Sorghum/Pennisetum expressed sequence tag alignments to the Oryza genome. Conserved-intron scanning primers (CISPs) amplified single-copy loci at 37% to 80% success rates in taxa that sample much of the approximately 50-million years of Poaceae divergence. While the conserved nature of exons fostered cross-taxon amplification, the lesser evolutionary constraints on introns enhanced single-nucleotide polymorphism detection. For example, in eight rice (Oryza sativa) genotypes, polymorphism averaged 12.1 per kb in introns but only 3.6 per kb in exons. Curiously, among 124 CISPs evaluated across Oryza, Sorghum, Pennisetum, Cynodon, Eragrostis, Zea, Triticum, and Hordeum, 23 (18.5%) seemed to be subject to rigid intron size constraints that were independent of per-nucleotide DNA sequence variation. Furthermore, we identified 487 conserved-noncoding sequence motifs in 129 CISP loci. A large CISP set (6,062 primer pairs, amplifying introns from 1,676 genes) designed using an automated pipeline showed generally higher abundance in recombinogenic than in nonrecombinogenic regions of the rice genome, thus providing relatively even distribution along genetic maps. CISPs are an effective means to explore poorly characterized genomes for both DNA polymorphism and noncoding sequence conservation on a genome-wide or candidate gene basis, and also provide anchor points for comparative genomics across a diverse range of species.
A Comparative Genomics Strategy for Targeted Discovery of Single-Nucleotide Polymorphisms and Conserved-Noncoding Sequences in Orphan Crops

PubMed Central

Feltus, F.A.; Singh, H.P.; Lohithaswa, H.C.; Schulze, S.R.; Silva, T.D.; Paterson, A.H.

2006-01-01

Completed genome sequences provide templates for the design of genome analysis tools in orphan species lacking sequence information. To demonstrate this principle, we designed 384 PCR primer pairs to conserved exonic regions flanking introns, using Sorghum/Pennisetum expressed sequence tag alignments to the Oryza genome. Conserved-intron scanning primers (CISPs) amplified single-copy loci at 37% to 80% success rates in taxa that sample much of the approximately 50-million years of Poaceae divergence. While the conserved nature of exons fostered cross-taxon amplification, the lesser evolutionary constraints on introns enhanced single-nucleotide polymorphism detection. For example, in eight rice (Oryza sativa) genotypes, polymorphism averaged 12.1 per kb in introns but only 3.6 per kb in exons. Curiously, among 124 CISPs evaluated across Oryza, Sorghum, Pennisetum, Cynodon, Eragrostis, Zea, Triticum, and Hordeum, 23 (18.5%) seemed to be subject to rigid intron size constraints that were independent of per-nucleotide DNA sequence variation. Furthermore, we identified 487 conserved-noncoding sequence motifs in 129 CISP loci. A large CISP set (6,062 primer pairs, amplifying introns from 1,676 genes) designed using an automated pipeline showed generally higher abundance in recombinogenic than in nonrecombinogenic regions of the rice genome, thus providing relatively even distribution along genetic maps. CISPs are an effective means to explore poorly characterized genomes for both DNA polymorphism and noncoding sequence conservation on a genome-wide or candidate gene basis, and also provide anchor points for comparative genomics across a diverse range of species. PMID:16607031
Ancient DNA clarifies the evolutionary history of American Late Pleistocene equids.

PubMed

Orlando, Ludovic; Male, Dean; Alberdi, Maria Teresa; Prado, Jose Luis; Prieto, Alfredo; Cooper, Alan; Hänni, Catherine

2008-05-01

Hippidions are past members of the equid lineage which appeared in the South American fossil record around 2.5 Ma but then became extinct during the great late Pleistocene megafaunal extinction. According to fossil records and numerous dental, cranial, and postcranial characters, Hippidion and Equus lineages were expected to cluster in two distinct phylogenetic groups that diverged at least 10 MY ago, long before the emergence of the first Equus. However, the first DNA sequence information retrieved from Hippidion fossils supported a strikingly different phylogeny, with hippidions nesting inside a paraphyletic group of Equus. This result indicated either that the currently accepted phylogenetic tree of equids was incorrect regarding the timing of the evolutionary split between Hippidion and Equus or that the taxonomic identification of the hippidion fossils used for DNA analysis needed to be reexamined (and attributed to another extinct South American member of the equid lineage). The most likely candidate for the latter explanation is Equus (Amerhippus) neogeus. Here, we show by retrieving new ancient mtDNA sequences that hippidions and Equus (Amerhippus) neogeus were members of two distinct lineages. Furthermore, using a rigorous phylogenetic approach, we demonstrate that while formerly the largest equid from Southern America, Equus (Amerhippus) was just a member of the species Equus caballus. These new data increase the known phenotypic plasticity of horses and consequently cast doubt on the taxonomic validity of the subgenus Equus (Amerhippus).
Mapping genes to human chromosome 19

DOE Office of Scientific and Technical Information (OSTI.GOV)

Connolly, Sarah

1996-05-01

For this project, 22 Expressed Sequence Tags (ESTs) were fine mapped to regions of human chromosome 19. An EST is a short DNA sequence that occurs once in the genome and corresponds to a single expressed gene. 32P-radiolabeled probes were made by polymerase chain reaction for each EST and hybridized to filters containing a chromosome 19-specific cosmid library. The location of the ESTs on the chromosome was determined by the location of the ordered cosmid to which the EST hybridized. Of the 22 ESTs that were sublocalized, 6 correspond to known genes, and 16 correspond to anonymous genes. These localized ESTs may serve as potential candidates for disease genes, as well as markers for future physical mapping.
De novo Assembly of a 40 Mb Eukaryotic Genome from Short Sequence Reads: Sordaria macrospora, a Model Organism for Fungal Morphogenesis

PubMed Central

Nowrousian, Minou; Stajich, Jason E.; Chu, Meiling; Engh, Ines; Espagne, Eric; Halliday, Karen; Kamerewerd, Jens; Kempken, Frank; Knab, Birgit; Kuo, Hsiao-Che; Osiewacz, Heinz D.; Pöggeler, Stefanie; Read, Nick D.; Seiler, Stephan; Smith, Kristina M.; Zickler, Denise; Kück, Ulrich; Freitag, Michael

2010-01-01

Filamentous fungi are of great importance in ecology, agriculture, medicine, and biotechnology. Thus, it is not surprising that genomes for more than 100 filamentous fungi have been sequenced, most of them by Sanger sequencing. While next-generation sequencing techniques have revolutionized genome resequencing, e.g. for strain comparisons, genetic mapping, or transcriptome and ChIP analyses, de novo assembly of eukaryotic genomes still presents significant hurdles, because of their large size and stretches of repetitive sequences. Filamentous fungi contain few repetitive regions in their 30–90 Mb genomes and thus are suitable candidates to test de novo genome assembly from short sequence reads. Here, we present a high-quality draft sequence of the Sordaria macrospora genome that was obtained by a combination of Illumina/Solexa and Roche/454 sequencing. Paired-end Solexa sequencing of genomic DNA to 85-fold coverage and an additional 10-fold coverage by single-end 454 sequencing resulted in ∼4 Gb of DNA sequence. Reads were assembled to a 40 Mb draft version (N50 of 117 kb) with the Velvet assembler. Comparative analysis with Neurospora genomes increased the N50 to 498 kb. The S. macrospora genome contains even fewer repeat regions than its closest sequenced relative, Neurospora crassa. Comparison with genomes of other fungi showed that S. macrospora, a model organism for morphogenesis and meiosis, harbors duplications of several genes involved in self/nonself-recognition. Furthermore, S. macrospora contains more polyketide biosynthesis genes than N. crassa. Phylogenetic analyses suggest that some of these genes may have been acquired by horizontal gene transfer from a distantly related ascomycete group. 
Our study shows that, for typical filamentous fungi, de novo assembly of genomes from short sequence reads alone is feasible, that a mixture of Solexa and 454 sequencing substantially improves the assembly, and that the resulting data can be used for comparative studies to address basic questions of fungal biology. PMID:20386741
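The N50 values quoted in this record (117 kb, rising to 498 kb) summarize assembly contiguity. As a minimal sketch of how such a figure is usually computed, the function below returns the length of the contig at which the running total of contig lengths, taken from longest to shortest, first reaches half of the assembly size. The function name `n50` and the toy contig lengths are assumptions made for illustration; they are not taken from the study or from the Velvet assembler.

```python
def n50(lengths):
    """Return the N50 of a collection of contig lengths.

    Contigs are visited from longest to shortest; the N50 is the length
    of the contig at which the cumulative length first reaches at least
    half of the total assembly length.
    """
    if not lengths:
        raise ValueError("no contig lengths given")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Hypothetical toy assembly (arbitrary units), not data from the study.
toy_contigs = [80, 70, 50, 40, 30, 20, 10]
print(n50(toy_contigs))
```

As a rough consistency check on the figures in the abstract, (85 + 10)-fold coverage of a genome of about 40 Mb corresponds to about 3.8 Gb of sequence, in line with the reported ∼4 Gb.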
De novo assembly of a 40 Mb eukaryotic genome from short sequence reads: Sordaria macrospora, a model organism for fungal morphogenesis.

PubMed

Nowrousian, Minou; Stajich, Jason E; Chu, Meiling; Engh, Ines; Espagne, Eric; Halliday, Karen; Kamerewerd, Jens; Kempken, Frank; Knab, Birgit; Kuo, Hsiao-Che; Osiewacz, Heinz D; Pöggeler, Stefanie; Read, Nick D; Seiler, Stephan; Smith, Kristina M; Zickler, Denise; Kück, Ulrich; Freitag, Michael

2010-04-08

Filamentous fungi are of great importance in ecology, agriculture, medicine, and biotechnology. Thus, it is not surprising that genomes for more than 100 filamentous fungi have been sequenced, most of them by Sanger sequencing. While next-generation sequencing techniques have revolutionized genome resequencing, e.g. for strain comparisons, genetic mapping, or transcriptome and ChIP analyses, de novo assembly of eukaryotic genomes still presents significant hurdles, because of their large size and stretches of repetitive sequences. Filamentous fungi contain few repetitive regions in their 30-90 Mb genomes and thus are suitable candidates to test de novo genome assembly from short sequence reads. Here, we present a high-quality draft sequence of the Sordaria macrospora genome that was obtained by a combination of Illumina/Solexa and Roche/454 sequencing. Paired-end Solexa sequencing of genomic DNA to 85-fold coverage and an additional 10-fold coverage by single-end 454 sequencing resulted in approximately 4 Gb of DNA sequence. Reads were assembled to a 40 Mb draft version (N50 of 117 kb) with the Velvet assembler. Comparative analysis with Neurospora genomes increased the N50 to 498 kb. The S. macrospora genome contains even fewer repeat regions than its closest sequenced relative, Neurospora crassa. Comparison with genomes of other fungi showed that S. macrospora, a model organism for morphogenesis and meiosis, harbors duplications of several genes involved in self/nonself-recognition. Furthermore, S. macrospora contains more polyketide biosynthesis genes than N. crassa. Phylogenetic analyses suggest that some of these genes may have been acquired by horizontal gene transfer from a distantly related ascomycete group. 
Our study shows that, for typical filamentous fungi, de novo assembly of genomes from short sequence reads alone is feasible, that a mixture of Solexa and 454 sequencing substantially improves the assembly, and that the resulting data can be used for comparative studies to address basic questions of fungal biology.
Ten years of barcoding at the African Centre for DNA Barcoding.

PubMed

Bezeng, B S; Davies, T J; Daru, B H; Kabongo, R M; Maurin, O; Yessoufou, K; van der Bank, H; van der Bank, M

2017-07-01

The African Centre for DNA Barcoding (ACDB) was established in 2005 as part of a global initiative to accurately and rapidly survey biodiversity using short DNA sequences. The mitochondrial cytochrome c oxidase 1 gene (CO1) was rapidly adopted as the de facto barcode for animals. Following the evaluation of several candidate loci for plants, the Plant Working Group of the Consortium for the Barcoding of Life in 2009 recommended that two plastid genes, rbcLa and matK, be adopted as core DNA barcodes for terrestrial plants. To date, numerous studies continue to test the discriminatory power of these markers across various plant lineages. Over the past decade, we at the ACDB have used these core DNA barcodes to generate a barcode library for southern Africa. To date, the ACDB has contributed more than 21 000 plant barcodes and over 3000 CO1 barcodes for animals to the Barcode of Life Database (BOLD). Building upon this effort, we at the ACDB have addressed questions related to community assembly, biogeography, phylogenetic diversification, and invasion biology. Collectively, our work demonstrates the diverse applications of DNA barcoding in ecology, systematics, evolutionary biology, and conservation.
How many novel eukaryotic 'kingdoms'? Pitfalls and limitations of environmental DNA surveys

PubMed Central

Berney, Cédric; Fahrni, José; Pawlowski, Jan

2004-01-01

Background Over the past few years, the use of molecular techniques to detect cultivation-independent, eukaryotic diversity has proven to be a powerful approach. Based on small-subunit ribosomal RNA (SSU rRNA) gene analyses, these studies have revealed the existence of an unexpected variety of new phylotypes. Some of them represent novel diversity in known eukaryotic groups, mainly stramenopiles and alveolates. Others do not seem to be related to any molecularly described lineage, and have been proposed to represent novel eukaryotic kingdoms. In order to review the evolutionary importance of this novel high-level eukaryotic diversity critically, and to test the potential technical and analytical pitfalls and limitations of eukaryotic environmental DNA surveys (EES), we analysed 484 environmental SSU rRNA gene sequences, including 81 new sequences from sediments of the small river, the Seymaz (Geneva, Switzerland). Results Based on a detailed screening of an exhaustive alignment of eukaryotic SSU rRNA gene sequences and the phylogenetic re-analysis of previously published environmental sequences using Bayesian methods, our results suggest that the number of novel higher-level taxa revealed by previously published EES was overestimated. Three main sources of errors are responsible for this situation: (1) the presence of undetected chimeric sequences; (2) the misplacement of several fast-evolving sequences; and (3) the incomplete sampling of described, but yet unsequenced eukaryotes. Additionally, EES give a biased view of the diversity present in a given biotope because of the difficult amplification of SSU rRNA genes in some taxonomic groups. Conclusions Environmental DNA surveys undoubtedly contribute to reveal many novel eukaryotic lineages, but there is no clear evidence for a spectacular increase of the diversity at the kingdom level. 
After re-analysis of previously published data, we found only five candidate lineages of possible novel high-level eukaryotic taxa, two of which comprise several phylotypes that were found independently in different studies. To ascertain their taxonomic status, however, the organisms themselves have now to be identified. PMID:15176975
Differential Gene Expression Reveals Candidate Genes for Drought Stress Response in Abies alba (Pinaceae)

PubMed Central

Ziegenhagen, Birgit; Liepelt, Sascha

2015-01-01

Increasing drought periods as a result of global climate change pose a threat to many tree species by possibly outpacing their adaptive capabilities. Revealing the genetic basis of drought stress response is therefore instrumental for future conservation strategies and risk assessment. Access to informative genomic regions is however challenging, especially for conifers, partially due to their large genomes, which puts constraints on the feasibility of whole genome scans. Candidate genes offer a valuable tool to reduce the complexity of the analysis and the amount of sequencing work and costs. For this study we combined an improved drought stress phenotyping of needles via a novel terahertz water monitoring technique with Massive Analysis of cDNA Ends to identify candidate genes for drought stress response in European silver fir (Abies alba Mill.). A pooled cDNA library was constructed from the cotyledons of six drought stressed and six well-watered silver fir seedlings, respectively. Differential expression analyses of these libraries revealed 296 candidate genes for drought stress response in silver fir (247 up- and 49 down-regulated) of which a subset was validated by RT-qPCR of the twelve individual cotyledons. A majority of these genes code for currently uncharacterized proteins and hint at new genomic resources to be explored in conifers. Furthermore, we could show that some traditional reference genes from model plant species (GAPDH and eIF4A2) are not suitable for differential analysis and we propose a new reference gene, TPC1, for drought stress expression profiling in needles of conifer seedlings. PMID:25924061
Genealogy-based methods for inference of historical recombination and gene flow and their application in Saccharomyces cerevisiae.

PubMed

Jenkins, Paul A; Song, Yun S; Brem, Rachel B

2012-01-01

Genetic exchange between isolated populations, or introgression between species, serves as a key source of novel genetic material on which natural selection can act. While detecting historical gene flow from DNA sequence data is of much interest, many existing methods can be limited by requirements for deep population genomic sampling. In this paper, we develop a scalable genealogy-based method to detect candidate signatures of gene flow into a given population when the source of the alleles is unknown. Our method does not require sequenced samples from the source population, provided that the alleles have not reached fixation in the sampled recipient population. The method utilizes recent advances in algorithms for the efficient reconstruction of ancestral recombination graphs, which encode genealogical histories of DNA sequence data at each site, and is capable of detecting the signatures of gene flow whose footprints are of length up to single genes. Further, we employ a theoretical framework based on coalescent theory to test for statistical significance of certain recombination patterns consistent with gene flow from divergent sources. Implementing these methods for application to whole-genome sequences of environmental yeast isolates, we illustrate the power of our approach to highlight loci with unusual recombination histories. By developing innovative theory and methods to analyze signatures of gene flow from population sequence data, our work establishes a foundation for the continued study of introgression and its evolutionary relevance.
Genealogy-Based Methods for Inference of Historical Recombination and Gene Flow and Their Application in Saccharomyces cerevisiae

PubMed Central

Jenkins, Paul A.; Song, Yun S.; Brem, Rachel B.

2012-01-01

Genetic exchange between isolated populations, or introgression between species, serves as a key source of novel genetic material on which natural selection can act. While detecting historical gene flow from DNA sequence data is of much interest, many existing methods can be limited by requirements for deep population genomic sampling. In this paper, we develop a scalable genealogy-based method to detect candidate signatures of gene flow into a given population when the source of the alleles is unknown. Our method does not require sequenced samples from the source population, provided that the alleles have not reached fixation in the sampled recipient population. The method utilizes recent advances in algorithms for the efficient reconstruction of ancestral recombination graphs, which encode genealogical histories of DNA sequence data at each site, and is capable of detecting the signatures of gene flow whose footprints are of length up to single genes. Further, we employ a theoretical framework based on coalescent theory to test for statistical significance of certain recombination patterns consistent with gene flow from divergent sources. Implementing these methods for application to whole-genome sequences of environmental yeast isolates, we illustrate the power of our approach to highlight loci with unusual recombination histories. By developing innovative theory and methods to analyze signatures of gene flow from population sequence data, our work establishes a foundation for the continued study of introgression and its evolutionary relevance. PMID:23226196
Cloning of the anhidrotic ectodermal dysplasia gene: Identification of cDNAs associated with CpG islands mapped near translocation breakpoint in two female patients

DOE Office of Scientific and Technical Information (OSTI.GOV)

Srivastava, A.K.; Schlessinger, D.; Kere, J.

1994-09-01

The gene for the X chromosomal developmental disorder anhidrotic ectodermal dysplasia (EDA) has been mapped to Xq12-q13 by linkage analysis and is expressed in a few females with chromosomal translocations involving band Xq12-q13. A yeast artificial chromosome (YAC) contig (2.0 Mb) spanning two translocation breakpoints has been assembled by sequence-tagged site (STS)-based chromosomal walking. The two translocation breakpoints (X:autosome translocations from the affected female patients) have been mapped less than 60 kb apart within a YAC contig. Unique probes and intragenic STSs (mapped between the two translocations) have been developed and a somatic cell hybrid carrying the translocated X chromosome from the AK patient has been analyzed by isolating unique probes that span the breakpoint. Several STSs made from intragenic sequences have been found to be conserved in mouse, hamster and monkey, but we have detected no mRNAs in a number of tissues tested. However, a probe and STS developed from the DNA spanning the AK breakpoint is conserved in mouse, hamster and monkey, and we have detected expressed sequences in skin cells and cDNA libraries. In addition, unique sequences have been obtained from two CpG islands in the region that maps proximal to the breakpoints. cDNAs containing these sequences are being studied as candidates for the gene affected in the etiology of EDA.
Sturgeon conservation genomics: SNP discovery and validation using RAD sequencing.

PubMed

Ogden, R; Gharbi, K; Mugue, N; Martinsohn, J; Senn, H; Davey, J W; Pourkazemi, M; McEwing, R; Eland, C; Vidotto, M; Sergeev, A; Congiu, L

2013-06-01

Caviar-producing sturgeons belonging to the genus Acipenser are considered to be one of the most endangered species groups in the world. Continued overfishing in spite of increasing legislation, zero catch quotas and extensive aquaculture production have led to the collapse of wild stocks across Europe and Asia. The evolutionary relationships among Adriatic, Russian, Persian and Siberian sturgeons are complex because of past introgression events and remain poorly understood. Conservation management, traceability and enforcement suffer a lack of appropriate DNA markers for the genetic identification of sturgeon at the species, population and individual level. This study employed RAD sequencing to discover and characterize single nucleotide polymorphism (SNP) DNA markers for use in sturgeon conservation in these four tetraploid species over three biological levels, using a single sequencing lane. Four population meta-samples and eight individual samples from one family were barcoded separately before sequencing. Analysis of 14.4 Gb of paired-end RAD data focused on the identification of SNPs in the paired-end contig, with subsequent in silico and empirical validation of candidate markers. Thousands of putatively informative markers were identified including, for the first time, SNPs that show population-wide differentiation between Russian and Persian sturgeons, representing an important advance in our ability to manage these cryptic species. The results highlight the challenges of genotyping-by-sequencing in polyploid taxa, while establishing the potential genetic resources for developing a new range of caviar traceability and enforcement tools. © 2013 John Wiley & Sons Ltd.
Scar-less multi-part DNA assembly design automation

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hillson, Nathan J.

The present invention provides a method of designing an implementation of a DNA assembly. In an exemplary embodiment, the method includes (1) receiving a list of DNA sequence fragments to be assembled together and an order in which to assemble the DNA sequence fragments, (2) designing DNA oligonucleotides (oligos) for each of the DNA sequence fragments, and (3) creating a plan for adding flanking homology sequences to each of the DNA oligos. In an exemplary embodiment, the method includes (1) receiving a list of DNA sequence fragments to be assembled together and an order in which to assemble the DNA sequence fragments, (2) designing DNA oligonucleotides (oligos) for each of the DNA sequence fragments, and (3) creating a plan for adding optimized overhang sequences to each of the DNA oligos.
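The three steps listed in this record can be illustrated with a very small sketch. It is not the method claimed in the record: it only shows one simple way to plan flanking homology, assuming that each fragment borrows the last few bases of its predecessor and the first few bases of its successor in the given assembly order. The function name `plan_flanking_homology`, the `overlap` and `circular` parameters, and the toy sequences are all hypothetical.

```python
def plan_flanking_homology(fragments, overlap=4, circular=False):
    """Plan flanking homology for an ordered list of DNA fragments.

    Each fragment receives a left flank copied from the end of the
    previous fragment and a right flank copied from the start of the
    next fragment. For a linear assembly the outer ends get no flank;
    for a circular assembly the first and last fragments are neighbours.
    """
    n = len(fragments)
    plan = []
    for i, fragment in enumerate(fragments):
        has_prev = circular or i > 0
        has_next = circular or i < n - 1
        left = fragments[(i - 1) % n][-overlap:] if has_prev else ""
        right = fragments[(i + 1) % n][:overlap] if has_next else ""
        plan.append({
            "fragment": fragment,
            "left_flank": left,
            "right_flank": right,
            "extended": left + fragment + right,
        })
    return plan


# Hypothetical toy fragments; the 4-base overlap is chosen only to keep
# the example readable.
parts = ["AAAACCCC", "GGGGTTTT", "ACGTACGT"]
for entry in plan_flanking_homology(parts):
    print(entry["left_flank"], entry["fragment"], entry["right_flank"])
```

Keeping the plan as a list of records, one per fragment, leaves oligo design and any optimization of overhang sequences as separate steps that can consume the same structure.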
A Ribeiroia spp. (Class: Trematoda) - Specific PCR-based diagnostic

USGS Publications Warehouse

Reinitz, David M.; Yoshino, T.P.; Cole, Rebecca A.

2007-01-01

Increased reporting of amphibian malformations in North America has been noted with concern in light of reports that amphibian numbers and species are declining worldwide. Ribeiroia ondatrae has been shown to cause a variety of types of malformations in amphibians. However, little is known about the prevalence of R. ondatrae in North America. To aid in conducting field studies of Ribeiroia spp., we have developed a polymerase chain reaction (PCR)-based diagnostic. Herein, we describe the development of an accurate, rapid, simple, and cost-effective diagnostic for detection of Ribeiroia spp. infection in snails (Planorbella trivolvis). Candidate oligonucleotide primers for PCR were designed via DNA sequence analyses of multiple ribosomal internal transcribed spacer-2 regions from Ribeiroia spp. and Echinostoma spp. Comparison of consensus sequences determined from both genera identified areas of sequence potentially unique to Ribeiroia spp. The PCR reliably produced a diagnostic 290-base pair (bp) product in the presence of a wide concentration range of snail or frog DNA. Sensitivity was examined with DNA extracted from single R. ondatrae cercaria. The single-tube PCR could routinely detect less than 1 cercariae equivalent, because DNA isolated from a single cercaria could be diluted at least 1:50 and still yield a positive result via gel electrophoresis. An even more sensitive nested PCR also was developed that routinely detected 100 fg of the 290-bp fragment. The assay did not detect furcocercous cercariae of certain Schistosomatidae, Echinostoma sp., or Sphaeridiotrema globulus nor adults of Clinostomum sp. or Cyathocotyle bushiensis. Field testing of 137 P. trivolvis identified 3 positives with no overt environmental cross-reactivity, and results concurred with microscopic examinations in all cases. © American Society of Parasitologists 2007.
Cloning of the Gene Encoding a 22-Kilodalton Cell Surface Antigen of Mycobacterium bovis BCG and Analysis of Its Potential for DNA Vaccination against Tuberculosis

PubMed Central

Lefèvre, Philippe; Denis, Olivier; De Wit, Lucas; Tanghe, Audrey; Vandenbussche, Paul; Content, Jean; Huygen, Kris

2000-01-01

Using spleen cells from mice vaccinated with live Mycobacterium bovis BCG, we previously generated three monoclonal antibodies reactive against a 22-kDa protein present in mycobacterial culture filtrate (CF) (K. Huygen et al., Infect. Immun. 61:2687–2693, 1993). These monoclonal antibodies were used to screen an M. bovis BCG genomic library made in phage λgt11. The gene encoding a 233-amino-acid (aa) protein, including a putative 26-aa signal sequence, was isolated, and sequence analysis indicated that the protein was 98% identical with the M. tuberculosis Lppx protein and that it contained a sequence 94% identical with the M. leprae 38-mer polypeptide 13B3 recognized by T cells from killed M. leprae-immunized subjects. Flow cytometry and cell fractionation demonstrated that the 22-kDa CF protein is also highly expressed in the bacterial cell wall and membrane compartment but not in the cytosol. C57BL/6, C3H, and BALB/c mice were vaccinated with plasmid DNA encoding the 22-kDa protein and analyzed for immune response and protection against intravenous M. tuberculosis challenge. Whereas DNA vaccination induced elevated antibody responses in C57BL/6 and particularly in C3H mice, Th1-type cytokine response, as measured by interleukin-2 and gamma interferon secretion, was only modest, and no protection against intravenous M. tuberculosis challenge was observed in any of the three mouse strains tested. Therefore, the 22-kDa antigen seems to have little potential for a DNA vaccine against tuberculosis, but it may be a good candidate for a mycobacterial antigen detection test. PMID:10678905
Sequencing of Seven Haloarchaeal Genomes Reveals Patterns of Genomic Flux

PubMed Central

Lynch, Erin A.; Langille, Morgan G. I.; Darling, Aaron; Wilbanks, Elizabeth G.; Haltiner, Caitlin; Shao, Katie S. Y.; Starr, Michael O.; Teiling, Clotilde; Harkins, Timothy T.; Edwards, Robert A.; Eisen, Jonathan A.; Facciotti, Marc T.

2012-01-01

We report the sequencing of seven genomes from two haloarchaeal genera, Haloferax and Haloarcula. Ease of cultivation and the existence of well-developed genetic and biochemical tools for several diverse haloarchaeal species make haloarchaea a model group for the study of archaeal biology. The unique physiological properties of these organisms also make them good candidates for novel enzyme discovery for biotechnological applications. Seven genomes were sequenced to ∼20× coverage and assembled to an average of 50 contigs (range 5 scaffolds - 168 contigs). Comparisons of protein-coding gene complements revealed large-scale differences in COG functional group enrichment between these genera. Analysis of genes encoding machinery for DNA metabolism reveals genera-specific expansions of the general transcription factor TATA binding protein as well as a history of extensive duplication and horizontal transfer of the proliferating cell nuclear antigen. Insights gained from this study emphasize the importance of haloarchaea for investigation of archaeal biology. PMID:22848480

Identification of genetic aberrations on chromosome 22 outside the NF2 locus in schwannomatosis and neurofibromatosis type 2.

PubMed

Buckley, Patrick G; Mantripragada, Kiran K; Díaz de Ståhl, Teresita; Piotrowski, Arkadiusz; Hansson, Caisa M; Kiss, Hajnalka; Vetrie, David; Ernberg, Ingemar T; Nordenskjöld, Magnus; Bolund, Lars; Sainio, Markku; Rouleau, Guy A; Niimura, Michihito; Wallace, Andrew J; Evans, D Gareth R; Grigelionis, Gintautas; Menzel, Uwe; Dumanski, Jan P

2005-12-01

Schwannomatosis is characterized by multiple peripheral and cranial nerve schwannomas that occur in the absence of bilateral 8th cranial nerve schwannomas. The latter is the main diagnostic criterion of neurofibromatosis type 2 (NF2), which is a related but distinct disorder. The genetic factors underlying the differences between schwannomatosis and NF2 are poorly understood, although available evidence implicates chromosome 22 as the primary location of the gene(s) of interest. To investigate this, we comprehensively profiled the DNA copy number in samples from sporadic and familial schwannomatosis, NF2, and a large cohort of normal controls. Using a tiling-path chromosome 22 genomic array, we identified two candidate regions of copy number variation, which were further characterized by a PCR-based array with higher resolution. The latter approach allows the detection of minute alterations in total genomic DNA, with as little as 1.5 kb per measurement point of nonredundant sequence on the array. In DNA derived from peripheral blood from a schwannomatosis patient and a sporadic schwannoma sample, we detected rearrangements of the immunoglobulin lambda (IGL) locus, which is unlikely to be due to a B-cell specific somatic recombination of IGL. Analysis of normal controls indicated that these IGL rearrangements were restricted to schwannomatosis/schwannoma samples. In the second candidate region spanning GSTT1 and CABIN1 genes, we observed a frequent copy number polymorphism at the GSTT1 locus. We further describe missense mutations in the CABIN1 gene that are specific to samples from schwannomatosis and NF2 and make this gene a plausible candidate for contributing to the pathogenesis of these disorders. Copyright 2005 Wiley-Liss, Inc.
A new leaf-tailed gecko of the Uroplatus ebenaui group (Squamata: Gekkonidae) from Madagascar's central eastern rainforests.

PubMed

Ratsoavina, Fanomezana Mihaja; Ranjanaharisoa, Fiadanantsoa Andrianja; Glaw, Frank; Raselimanana, Achille P; Miralles, Aurélien; Vences, Miguel

2015-08-21

We describe a new leaf-tailed gecko species of the Uroplatus ebenaui group from the eastern central rainforests of Madagascar, which had previously been considered as a confirmed candidate species. Our description of Uroplatus fiera sp. nov. relies on integrating evidence from molecular and morphological characters and is based on newly collected material from two localities. A phylogenetic analysis based on multiple mitochondrial DNA fragments places the new species as sister to a lineage of uncertain status (Uroplatus ebenaui [Ca8]), and the clade consisting of these two lineages is sister to a further undescribed candidate species (U. ebenaui [Ca1]). This entire clade is sister to U. phantasticus plus another candidate species. The new species differs from these close relatives, and all other congenerics, by strong differences in DNA sequences of mitochondrial genes (>8.5% uncorrected p-distance in 16S rDNA to all nominal species of the genus) and lacks shared alleles with any of the nominal species in the nuclear CMOS gene. From its closest relatives the new species further differs in its much smaller tail size (relative to U. phantasticus), and a narrower tail, fewer supralabials, and more toe lamellae (relative to U. ebenaui [Ca1]). Morphologically the new species is most similar to U. ebenaui but differs in its larger body size and unpigmented oral mucosa. Given its distribution in central eastern Madagascar, with records from near Fierenana and Ambatovy, its range overlaps with that of U. phantasticus. Based on examination of the U. phantasticus holotype, we confirm that this latter has a blackish pigmented oral mucosa as do those specimens typically attributed to this nomen, thereby confirming its distinctness from U. fiera sp. nov., in which the mucosa is unpigmented.
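The uncorrected p-distance cited in this record (>8.5% in 16S rDNA) is the proportion of aligned sites at which two sequences differ, with no correction for multiple substitutions. The sketch below shows that calculation for two pre-aligned sequences. Skipping sites that contain a gap (`-`) or an ambiguous base (`N`) is an assumption made here for illustration, since the abstract does not say how such sites were treated, and the function name `p_distance` and the toy sequences are likewise hypothetical.

```python
def p_distance(seq_a, seq_b):
    """Uncorrected p-distance between two aligned sequences.

    Returns the fraction of compared sites that differ. Sites where
    either sequence has a gap ('-') or an 'N' are skipped (an assumed
    convention, not one stated in the source).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in "-N" or b in "-N":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Hypothetical toy alignment: 2 differences over 10 compared sites.
print(p_distance("ACGTACGTAC", "ACGTTCGTAA"))
```

Multiplying the result by 100 gives the percentage form used in the abstract.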
Coral Reef Genomics: Developing tools for functional genomics of coral symbiosis

DOE Office of Scientific and Technical Information (OSTI.GOV)

Schwarz, Jodi; Brokstein, Peter; Manohar, Chitra

Symbioses between cnidarians and dinoflagellates in the genus Symbiodinium are widespread in the marine environment. The importance of this symbiosis to reef-building corals and reef nutrient and carbon cycles is well documented, but little is known about the mechanisms by which the partners establish and regulate the symbiosis. Because the dinoflagellate symbionts live inside the cells of their host coral, the interactions between the partners occur on cellular and molecular levels, as each partner alters the expression of genes and proteins to facilitate the partnership. These interactions can be examined using high-throughput techniques that allow thousands of genes to be examined simultaneously. We are developing the groundwork so that we can use DNA microarray profiling to identify genes involved in the Montastraea faveolata and Acropora palmata symbioses. Here we report results from the initial steps in this microarray initiative, that is, the construction of cDNA libraries from 4 of 16 target stages, sequencing of 3450 cDNA clones to generate Expressed Sequence Tags (ESTs), and annotation of the ESTs to identify candidate genes to include in the microarrays. An understanding of how the coral-dinoflagellate symbiosis is regulated will have implications for atmospheric and ocean sciences, conservation biology, the study and diagnosis of coral bleaching and disease, and comparative studies of animal-protist interactions.
Identification of Abundantly Expressed Novel and Conserved Genes from the Infective Larval Stage of Toxocara canis by an Expressed Sequence Tag Strategy

PubMed Central

Tetteh, Kevin K. A.; Loukas, Alex; Tripp, Cindy; Maizels, Rick M.

1999-01-01

Larvae of Toxocara canis, a nematode parasite of dogs, infect humans, causing visceral and ocular larva migrans. In noncanid hosts, larvae neither grow nor differentiate but endure in a state of arrested development. Reasoning that parasite protein production is orientated to immune evasion, we undertook a random sequencing project from a larval cDNA library to characterize the most highly expressed transcripts. In all, 266 clones were sequenced, most from both 3′ and 5′ ends, and similarity searches against GenBank protein and dbEST nucleotide databases were conducted. Cluster analyses showed that 128 distinct gene products had been found, all but 3 of which represented newly identified genes. Ninety-five genes were represented by a single clone, but seven transcripts were present at high frequencies, each composing >2% of all clones sequenced. These high-abundance transcripts include a mucin and a C-type lectin, which are both major excretory-secretory antigens released by parasites. Four highly expressed novel gene transcripts, termed ant (abundant novel transcript) genes, were found. Together, these four genes comprised 18% of all cDNA clones isolated, but no similar sequences occur in the Caenorhabditis elegans genome. While the coding regions of the four genes are dissimilar, their 3′ untranslated tracts have significant homology in nucleotide sequence. The discovery of these abundant, parasite-specific genes of newly identified lectins and mucins, as well as a range of conserved and novel proteins, provides defined candidates for future analysis of the molecular basis of immune evasion by T. canis. PMID:10456930
Colon Cancer-Upregulated Long Non-Coding RNA lincDUSP Regulates Cell Cycle Genes and Potentiates Resistance to Apoptosis.

PubMed

Forrest, Megan E; Saiakhova, Alina; Beard, Lydia; Buchner, David A; Scacheri, Peter C; LaFramboise, Thomas; Markowitz, Sanford; Khalil, Ahmad M

2018-05-09

Long non-coding RNAs (lncRNAs) are frequently dysregulated in many human cancers. We sought to identify candidate oncogenic lncRNAs in human colon tumors by utilizing RNA sequencing data from 22 colon tumors and 22 adjacent normal colon samples from The Cancer Genome Atlas (TCGA). The analysis led to the identification of ~200 differentially expressed lncRNAs. Validation in an independent cohort of normal colon and patient-derived colon cancer cell lines identified a novel lncRNA, lincDUSP, as a potential candidate oncogene. Knockdown of lincDUSP in patient-derived colon tumor cell lines resulted in significantly decreased cell proliferation and clonogenic potential, and increased susceptibility to apoptosis. The knockdown of lincDUSP affects the expression of ~800 genes, and NCI pathway analysis showed enrichment of DNA damage response and cell cycle control pathways. Further, identification of lincDUSP chromatin occupancy sites by ChIRP-Seq demonstrated association with genes involved in the replication-associated DNA damage response and cell cycle control. Consistent with these findings, lincDUSP knockdown in colon tumor cell lines increased both the accumulation of cells in early S-phase and γH2AX foci formation, indicating increased DNA damage response induction. Taken together, these results demonstrate a key role of lincDUSP in the regulation of important pathways in colon cancer.
MToolBox: a highly automated pipeline for heteroplasmy annotation and prioritization analysis of human mitochondrial variants in high-throughput sequencing

PubMed Central

Diroma, Maria Angela; Santorsola, Mariangela; Guttà, Cristiano; Gasparre, Giuseppe; Picardi, Ernesto; Pesole, Graziano; Attimonelli, Marcella

2014-01-01

Motivation: The increasing availability of mitochondria-targeted and off-target sequencing data in whole-exome and whole-genome sequencing studies (WXS and WGS) has raised the demand for effective pipelines to accurately measure heteroplasmy and to easily recognize the most functionally important mitochondrial variants among a huge number of candidates. To this end, we developed MToolBox, a highly automated pipeline to reconstruct and analyze human mitochondrial DNA from high-throughput sequencing data. Results: MToolBox implements an effective computational strategy for mitochondrial genome assembly and haplogroup assignment, also including a prioritization analysis of detected variants. MToolBox provides a Variant Call Format file featuring, for the first time, allele-specific heteroplasmy and annotation files with prioritized variants. MToolBox was tested on simulated samples and applied to 1000 Genomes WXS datasets. Availability and implementation: MToolBox package is available at https://sourceforge.net/projects/mtoolbox/. Contact: marcella.attimonelli@uniba.it Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25028726
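As a rough illustration of what "allele-specific heteroplasmy" means at a single mtDNA position, the sketch below computes the fraction of reads supporting each allele. It is a simplified assumption for the reader's benefit and not MToolBox's actual implementation; the function name `allele_heteroplasmy` and the input format (a plain sequence of base calls) are hypothetical.

```python
from collections import Counter

VALID_BASES = {"A", "C", "G", "T"}

def allele_heteroplasmy(base_calls):
    """Return {allele: fraction of reads} for one position.

    base_calls: iterable of single-character base calls from the reads
    covering the position (hypothetical input format). Calls other than
    A/C/G/T, such as N, are ignored.
    """
    counts = Counter(b.upper() for b in base_calls if b.upper() in VALID_BASES)
    depth = sum(counts.values())
    if depth == 0:
        return {}  # no usable coverage at this position
    return {allele: n / depth for allele, n in counts.items()}
```

A real pipeline would typically also filter on base and mapping quality and report a confidence interval; those steps are omitted here.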
Machine learning classifier for identification of damaging missense mutations exclusive to human mitochondrial DNA-encoded polypeptides.

PubMed

Martín-Navarro, Antonio; Gaudioso-Simón, Andrés; Álvarez-Jarreta, Jorge; Montoya, Julio; Mayordomo, Elvira; Ruiz-Pesini, Eduardo

2017-03-07

Several methods have been developed to predict the pathogenicity of missense mutations, but none has been specifically designed for classification of variants in mtDNA-encoded polypeptides. Moreover, there is no curated dataset of neutral and damaging mtDNA missense variants available to test the accuracy of predictors. Because mtDNA sequencing of patients suffering from mitochondrial diseases is revealing many missense mutations, candidate substitutions need to be prioritized for further confirmation. Predictors can be useful as screening tools, but their performance must be improved. We have developed an SVM classifier (Mitoclass.1) specific for mtDNA missense variants. Training and validation of the model were executed with 2,835 mtDNA damaging and neutral amino acid substitutions, previously curated by a set of rigorous pathogenicity criteria with high specificity. Each instance is described by a set of three attributes based on evolutionary conservation in Eukaryota of wildtype and mutant amino acids as well as coevolution and a novel evolutionary analysis of specific substitutions belonging to the same domain of mitochondrial polypeptides. Our classifier has performed better than other web-available tested predictors. We checked the performance of three broadly used predictors with the total mutations of our curated dataset. PolyPhen-2 showed the best results for screening purposes, with good sensitivity. Nevertheless, the number of false positive predictions was too high. Our method has an improved sensitivity and better specificity in relation to PolyPhen-2. We also publish predictions for the complete set of 24,201 possible missense variants in the 13 human mtDNA-encoded polypeptides. Mitoclass.1 allows a better selection of candidate damaging missense variants from mtDNA.
A careful search of discriminatory attributes and a training step based on a curated dataset of amino acid substitutions belonging exclusively to human mtDNA genes allows an improved performance. Mitoclass.1 accuracy could be improved in the future when more mtDNA missense substitutions become available for updating the attributes and retraining the model.
Demonstration of GTG as an endogenous initiation codon for a human mRNA transcript revealed by molecular cloning of the serpin endopin 2B.

PubMed

Hwang, Shin-Rong; Garza, Christina Z; Wegrzyn, Jill; Hook, Vivian Y H

2004-08-16

This study demonstrates utilization of the novel GTG initiation codon for translation of a human mRNA transcript that encodes the serpin endopin 2B, a protease inhibitor. Molecular cloning revealed the nucleotide sequence of the human endopin 2B cDNA. Its deduced primary sequence shows high homology to bovine endopin 2A that possesses cross-class protease inhibition of elastase and papain. Notably, the human endopin 2B cDNA sequence revealed GTG as the predicted translation initiation codon; the predicted translation product of 46 kDa endopin 2B was produced by in vitro translation of 35S-endopin 2B with mammalian (rabbit) protein translation components. Importantly, bioinformatic studies demonstrated the presence of the entire human endopin 2B cDNA sequence with GTG as initiation codon within the human genome on chromosome 14. Further evidence for GTG as a functional initiation codon was illustrated by GTG-mediated in vitro translation of the heterologous protein EGFP, and by GTG-mediated expression of EGFP in mammalian PC12 cells. Mutagenesis of GTG to GTC resulted in the absence of EGFP expression in PC12 cells, indicating the function of GTG as an initiation codon. In addition, it was apparent that the GTG initiation codon produces lower levels of translated protein compared to ATG as initiation codon. Significantly, GTG-mediated translation of endopin 2B demonstrates a functional human gene product not previously predicted from initial analyses of the human genome. Further analyses based on GTG as an alternative initiation codon may predict new candidate genes of the human genome.
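The closing suggestion, that allowing GTG as an alternative initiation codon may predict new candidate genes, can be illustrated with a minimal open-reading-frame scan. This is a sketch under stated assumptions, not the authors' bioinformatic procedure: the function `find_orfs`, its parameters, and the toy sequence are hypothetical, and only the three forward-strand frames are scanned.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, start_codons=("ATG",), min_codons=2):
    """Return (start, end, start_codon) tuples for forward-strand ORFs.

    start is the 0-based index of the initiation codon, end is the index
    just past the stop codon. min_codons counts codons before the stop.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in start_codons:
                start = i  # first in-frame initiation codon opens the ORF
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, seq[start:start + 3]))
                start = None
    return orfs

# Toy sequence (made up for illustration): contains GTG but no ATG.
toy = "CCGTGAAACCCTAGCC"
atg_only = find_orfs(toy)
with_gtg = find_orfs(toy, start_codons=("ATG", "GTG"))
```

With the default ATG-only setting the toy sequence yields no ORF, whereas permitting GTG reports one; a genome-scale scan of this kind would of course produce many false candidates that need further evidence.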
Twenty-seven nonoverlapping zinc finger cDNAs from human T cells map to nine different chromosomes with apparent clustering.

PubMed Central

Huebner, K; Druck, T; Croce, C M; Thiesen, H J

1991-01-01

cDNA clones encoding zinc finger structures were isolated by screening Molt4 and Jurkat cDNA libraries with zinc finger consensus sequences. Candidate clones were partially sequenced to verify the presence of zinc finger-encoding regions; nonoverlapping cDNA clones were chosen on the basis of sequences and genomic hybridization pattern. Zinc finger structure-encoding clones, which were designated by the term "Kox" and a number from 1 to 32 and which were apparently unique (i.e., distinct from each other and distinct from those isolated by other laboratories), were chosen for mapping in the human genome. DNAs from rodent-human somatic cell hybrids retaining defined complements of human chromosomes were analyzed for the presence of each of the Kox genes. Correlation between the presence of specific human chromosome regions and specific Kox genes established the chromosomal locations. Multiple Kox loci were mapped to 7q (Kox 18 and 25 and a locus detected by both Kox 8 cDNA and Kox 27 cDNA), 8q24 5' to the myc locus (Kox 9 and 32), 10cen→q24 (Kox 2, 15, 19, 21, 30, and 31), 12q13-qter (Kox 1 and 20), 17p13 (Kox 11 and 26), and 19q (Kox 5, 6, 10, 22, 24, and 28). Single Kox loci were mapped to 7p22 (Kox 3), 18q12 (Kox 17), 19p (Kox 13), 22q11 between IG lambda and BCR-1 (locus detected by both Kox 8 cDNA and Kox 27 cDNA), and Xp (Kox 14). Several of the Kox loci map to regions in which other zinc finger structure-encoding loci have already been localized, indicating possible zinc finger gene clusters. In addition, Kox genes at 8q24, 17p13, and 22q11 (and perhaps other Kox genes) are located near recurrent chromosomal translocation breakpoints. Others, such as those on 7p and 7q, may be near regions specifically active in T cells. PMID:2014798
Germline Mutations in PALB2, BRCA1, and RAD51C, Which Regulate DNA Recombination Repair, in Patients with Gastric Cancer

PubMed Central

Sahasrabudhe, Ruta; Lott, Paul; Bohorquez, Mabel; Toal, Ted; Estrada, Ana P.; Suarez, John J.; Brea-Fernández, Alejandro; Cameselle-Teijeiro, José; Pinto, Carla; Ramos, Irma; Mantilla, Alejandra; Prieto, Rodrigo; Corvalan, Alejandro; Norero, Enrique; Alvarez, Carolina; Tapia, Teresa; Carvallo, Pilar; Gonzalez, Luz M.; Cock-Rada, Alicia; Solano, Angela; Neffa, Florencia; Valle, Adriana Della; Yau, Chris; Soares, Gabriela; Borowsky, Alexander; Hu, Nan; He, Li-Ji; Han, Xiao-You; Taylor, Philip R.; Goldstein, Alisa M.; Torres, Javier; Echeverry, Magdalena; Ruiz-Ponte, Clara; Teixeira, Manuel R.; Carvajal Carmona, Luis G.

2016-01-01

Up to 10% of cases of gastric cancer are familial, but so far, only mutations in CDH1 have been associated with gastric cancer risk. To identify genetic variants that affect risk for gastric cancer, we collected blood samples from 28 patients with hereditary diffuse gastric cancer (HDGC) not associated with mutations in CDH1 and performed whole-exome sequence analysis. We then analyzed sequences of candidate genes in 333 independent HDGC and non-HDGC cases. We identified 11 cases with mutations in PALB2, BRCA1, or RAD51C genes, which regulate homologous DNA recombination. We found these mutations in 2 of 31 patients with HDGC (6.5%) and 9 of 331 patients with sporadic gastric cancer (2.8%). Most of these mutations had been previously associated with other types of tumors and partially co-segregated with gastric cancer in our study. Tumors that developed in patients with these mutations had a mutation signature associated with somatic homologous recombination deficiency. Our findings indicate that defects in homologous recombination increase risk for gastric cancer. PMID:28024868
Demonstration of retrotransposition of the Tf1 element in fission yeast.

PubMed

Levin, H L; Boeke, J D

1992-03-01

Tf1, a retrotransposon from fission yeast, has LTRs and coding sequences resembling the protease, reverse transcriptase and integrase domains of retroviral pol genes. A unique aspect of Tf1 is that it contains a single open reading frame whereas other retroviruses and retrotransposons usually possess two or more open reading frames. To determine whether Tf1 can transpose, we overproduced Tf1 transcripts encoded by a plasmid copy of the element marked with a neo gene. Approximately 0.1-4.0% of the cell population acquired chromosomally inherited resistance to G418. DNA blot analysis demonstrated that such strains had acquired both Tf1 and neo specific sequences within a restriction fragment of the same size; the size of this restriction fragment varied between different isolates. Structural analysis of the cloned DNA flanking the Tf1-neo element of two transposition candidates with the same regions in the parent strain showed that the ability to grow on G418 was due to transposition of Tf1-neo and not other types of recombination events.
Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions

DOEpatents

Gardner, Shea N; Mariella, Jr., Raymond P; Christian, Allen T; Young, Jennifer A; Clague, David S

2013-06-25

A method of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths.
Sequential addition of short DNA oligos in DNA-polymerase-based synthesis reactions

DOEpatents

Gardner, Shea N [San Leandro, CA]; Mariella, Jr., Raymond P.; Christian, Allen T [Tracy, CA]; Young, Jennifer A [Berkeley, CA]; Clague, David S [Livermore, CA]

2011-01-18

A method of fabricating a DNA molecule of user-defined sequence. The method comprises the steps of preselecting a multiplicity of DNA sequence segments that will comprise the DNA molecule of user-defined sequence, separating the DNA sequence segments temporally, and combining the multiplicity of DNA sequence segments with at least one polymerase enzyme wherein the multiplicity of DNA sequence segments join to produce the DNA molecule of user-defined sequence. Sequence segments may be of length n, where n is an even or odd integer. In one embodiment the length of desired hybridizing overlap is specified by the user and the sequences and the protocol for combining them are guided by computational (bioinformatics) predictions. In one embodiment sequence segments are combined from multiple reading frames to span the same region of a sequence, so that multiple desired hybridizations may occur with different overlap lengths. In one embodiment starting sequence fragments are of different lengths, n, n+1, n+2, etc.
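The idea of preselecting segments of length n with a user-specified hybridizing overlap can be sketched as a simple tiling of the target sequence. This is only an illustrative assumption and not the patented protocol: the helper names `tile_segments` and `reassemble` are hypothetical, all segments are written on one strand (a real polymerase-based assembly would use alternating strands), and the temporal separation of segments is not modeled.

```python
def tile_segments(target, seg_len, overlap):
    """Split target into segments of seg_len that share `overlap` bases
    with their neighbours. The final segment may be shorter."""
    if not 0 < overlap < seg_len:
        raise ValueError("overlap must be between 1 and seg_len - 1")
    step = seg_len - overlap
    segments = []
    start = 0
    while True:
        segments.append(target[start:start + seg_len])
        if start + seg_len >= len(target):
            break  # this segment reaches the end of the target
        start += step
    return segments

def reassemble(segments, overlap):
    """Join segments in order, checking that each overlap matches."""
    seq = segments[0]
    for seg in segments[1:]:
        if seq[-overlap:] != seg[:overlap]:
            raise ValueError("adjacent segments do not overlap as expected")
        seq += seg[overlap:]
    return seq

# Made-up 20-base target for illustration.
target = "ATGCGTACGTTAGCCTAGGA"
segments = tile_segments(target, seg_len=8, overlap=3)
```

In practice the overlap length would be chosen from predicted hybridization properties (for example melting temperature) rather than fixed in advance, as the abstract indicates.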
DNA methylation profiling for a confirmatory test for blood, saliva, semen, vaginal fluid and menstrual blood.

PubMed

Lee, Hwan Young; Jung, Sang-Eun; Lee, Eun Hee; Yang, Woo Ick; Shin, Kyoung-Jin

2016-09-01

The ability to predict the type of tissues or cells from molecular profiles of crime scene samples has important practical implications in forensics. A previously reported multiplex assay using DNA methylation markers could only discriminate between 4 types of body fluids: blood, saliva, semen, and body fluid originating from the female reproductive organs. In the present study, we selected 15 menstrual blood-specific CpG marker candidates based on analysis of 12 genome-wide DNA methylation profiles of vaginal fluid and menstrual blood. The menstrual blood-specificity of the candidate markers was confirmed by comparison with HumanMethylation450 BeadChip array data obtained for 58 samples including 12 blood, 12 saliva, 12 semen, 3 vaginal fluid, and 19 skin epidermis samples. Among the 15 CpG marker candidates, 3 were located in the promoter region of the SLC26A10 gene, and 2 of them (cg09696411 and cg18069290) showed high menstrual blood specificity. DNA methylation at the 2 CpG markers was further tested by targeted bisulfite sequencing of 461 additional samples including 49 blood, 52 saliva, 34 semen, 125 vaginal fluid, and 201 menstrual blood. Because the 2 markers showed menstrual blood-specific methylation patterns, we modified our previous multiplex methylation SNaPshot reaction to include these 2 markers. In addition, a blood marker cg01543184 with cross reactivity to semen was replaced with cg08792630, and a semen-specific unmethylation marker cg17621389 was removed. The resultant multiplex methylation SNaPshot allowed positive identification of blood, saliva, semen, vaginal fluid and menstrual blood using the 9 CpG markers which show a methylation signal only in the target body fluids. Because of the complexity in cell composition, menstrual blood samples produced DNA methylation profiles that vary with menstrual cycle and sample collection methods, which are expected to provide more insight into forensic menstrual blood testing.
Moreover, because the developed multiplex methylation SNaPshot reaction includes the 4 CpG markers whose specificities have been confirmed by multiple studies, it will facilitate confirmatory tests for body fluids that are frequently observed in forensic casework. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
Identification of a duplication within the GDF9 gene and novel candidate genes for primary ovarian insufficiency (POI) by a customized high-resolution array comparative genomic hybridization platform.

PubMed

Norling, A; Hirschberg, A L; Rodriguez-Wallberg, K A; Iwarsson, E; Wedell, A; Barbaro, M

2014-08-01

Can high-resolution array comparative genomic hybridization (CGH) analysis of DNA samples from women with primary ovarian insufficiency (POI) improve the diagnosis of the condition and identify novel candidate genes for POI? A mutation affecting the regulatory region of growth differentiation factor 9 (GDF9) was identified for the first time together with several novel candidate genes for POI. Most patients with POI do not receive a molecular diagnosis despite a significant genetic component in the pathogenesis. We performed a case-control study. Twenty-six patients were analyzed by array CGH for identification of copy number variants. Novel changes were investigated in 95 controls and in a separate population of 28 additional patients with POI. The experimental procedures were performed during a 1-year period. DNA samples from 26 patients with POI were analyzed by a customized 1M array-CGH platform with whole genome coverage and probe enrichment targeting 78 genes in sex development. By PCR amplification and sequencing, the breakpoint of an identified partial GDF9 gene duplication was characterized. A multiplex ligation-dependent probe amplification (MLPA) probe set for specific identification of deletions/duplications affecting GDF9 was developed. An MLPA probe set for the identification of additional cases or controls carrying novel candidate regions identified by array-CGH was developed. Sequencing of three candidate genes was performed. Eleven unique copy number changes were identified in a total of 11 patients, including a tandem duplication of 475 bp, containing part of the GDF9 gene promoter region. The duplicated region contains three NOBOX-binding elements and an E-box, important for GDF9 gene regulation. This aberration is likely causative of POI. Fifty-four patients were investigated for copy number changes within GDF9, but no additional cases were found. 
Ten aberrations constituting novel candidate regions were detected, including a second DNAH6 deletion in a patient with POI. Other identified candidate genes were TSPYL6, SMARCC1, CSPG5 and ZFR2. This is a descriptive study and no functional experiments were performed. The study illustrates the importance of analyzing small copy number changes in addition to sequence alterations in the genetic investigation of patients with POI. Also, promoter regions should be included in the investigation. The study was supported by grants from the Swedish Research council (project no 12198 to A.W. and project no 20324 to A.L.H.), Stockholm County Council (E.I., A.W. and K.R.W.), Foundation Frimurare Barnhuset (A.N., A.W. and M.B.), Karolinska Institutet (A.N., A.L.H., E.I., A.W. and M.B.), Novo Nordic Foundation (A.W.) and Svenska Läkaresällskapet (M.B.). The funding sources had no involvement in the design or analysis of the study. The authors have no competing interests to declare. © The Author 2014. Published by Oxford University Press on behalf of the European Society of Human Reproduction and Embryology.
Global Transcriptional Start Site Mapping Using Differential RNA Sequencing Reveals Novel Antisense RNAs in Escherichia coli

PubMed Central

Thomason, Maureen K.; Bischler, Thorsten; Eisenbart, Sara K.; Förstner, Konrad U.; Zhang, Aixia; Herbig, Alexander; Nieselt, Kay

2014-01-01

While the model organism Escherichia coli has been the subject of intense study for decades, the full complement of its RNAs is only now being examined. Here we describe a survey of the E. coli transcriptome carried out using a differential RNA sequencing (dRNA-seq) approach, which can distinguish between primary and processed transcripts, and an automated prediction algorithm for transcriptional start sites (TSS). With the criterion of expression under at least one of three growth conditions examined, we predicted 14,868 TSS candidates, including 5,574 internal to annotated genes (iTSS) and 5,495 TSS corresponding to potential antisense RNAs (asRNAs). We examined expression of 14 candidate asRNAs by Northern analysis using RNA from wild-type E. coli and from strains defective for RNases III and E, two RNases reported to be involved in asRNA processing. Interestingly, nine asRNAs detected as distinct bands by Northern analysis were differentially affected by the rnc and rne mutations. We also compared our asRNA candidates with previously published asRNA annotations from RNA-seq data and discuss the challenges associated with these cross-comparisons. Our global transcriptional start site map represents a valuable resource for identification of transcription start sites, promoters, and novel transcripts in E. coli and is easily accessible, together with the cDNA coverage plots, in an online genome browser. PMID:25266388
Beyond the binding site: in vivo identification of tbx2, smarca5 and wnt5b as molecular targets of CNBP during embryonic development.

PubMed

Armas, Pablo; Margarit, Ezequiel; Mouguelar, Valeria S; Allende, Miguel L; Calcaterra, Nora B

2013-01-01

CNBP is a nucleic acid chaperone implicated in vertebrate craniofacial development, as well as in myotonic dystrophy type 2 (DM2) and sporadic inclusion body myositis (sIBM) human muscle diseases. CNBP is highly conserved among vertebrates and has been implicated in transcriptional regulation; however, its DNA binding sites and molecular targets remain elusive. The main goal of this work was to identify CNBP DNA binding sites that might reveal target genes involved in vertebrate embryonic development. To accomplish this, we used a recently described yeast one-hybrid assay to identify DNA sequences bound in vivo by CNBP. Bioinformatic analyses revealed that these sequences are G-enriched and show high frequency of putative G-quadruplex DNA secondary structure. Moreover, an in silico approach enabled us to establish the CNBP DNA-binding site and to predict CNBP putative targets based on gene ontology terms and synexpression with CNBP. The direct interaction between CNBP and candidate genes was proved by EMSA and ChIP assays. Besides, the role of CNBP upon the identified genes was validated in loss-of-function experiments in developing zebrafish. We successfully confirmed that CNBP up-regulates tbx2b and smarca5, and down-regulates wnt5b gene expression. The highly stringent strategy used in this work allowed us to identify new CNBP target genes functionally important in different contexts of vertebrate embryonic development. Furthermore, it represents a novel approach toward understanding the biological function and regulatory networks involving CNBP in the biology of vertebrates.
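The observation that the bound sequences are G-enriched and frequently contain putative G-quadruplex structure can be checked, in a first approximation, with a G-content calculation and a pattern search. The sketch below uses the widely used folding-rule pattern (four runs of at least three G separated by loops of 1 to 7 bases); whether the authors used this exact rule is an assumption, and the function names are hypothetical. Only the given strand is searched.

```python
import re

# Four G-runs of >= 3 bases separated by loops of 1-7 bases.
G4_PATTERN = re.compile(
    r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}"
)

def g_fraction(seq):
    """Fraction of bases in seq that are G (0.0 for an empty sequence)."""
    seq = seq.upper()
    return seq.count("G") / len(seq) if seq else 0.0

def putative_g4_sites(seq):
    """Return (start, end) spans of non-overlapping putative G4 motifs."""
    return [(m.start(), m.end()) for m in G4_PATTERN.finditer(seq.upper())]

# Made-up example sequence for illustration.
example = "TTGGGAGGGTGGGAGGGTT"
sites = putative_g4_sites(example)
```

To cover motifs on the complementary strand, the same search can be run on the reverse complement of each sequence.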
Beyond the Binding Site: In Vivo Identification of tbx2, smarca5 and wnt5b as Molecular Targets of CNBP during Embryonic Development

PubMed Central

Mouguelar, Valeria S.; Allende, Miguel L.; Calcaterra, Nora B.

2013-01-01

CNBP is a nucleic acid chaperone implicated in vertebrate craniofacial development, as well as in myotonic dystrophy type 2 (DM2) and sporadic inclusion body myositis (sIBM) human muscle diseases. CNBP is highly conserved among vertebrates and has been implicated in transcriptional regulation; however, its DNA binding sites and molecular targets remain elusive. The main goal of this work was to identify CNBP DNA binding sites that might reveal target genes involved in vertebrate embryonic development. To accomplish this, we used a recently described yeast one-hybrid assay to identify DNA sequences bound in vivo by CNBP. Bioinformatic analyses revealed that these sequences are G-enriched and show high frequency of putative G-quadruplex DNA secondary structure. Moreover, an in silico approach enabled us to establish the CNBP DNA-binding site and to predict CNBP putative targets based on gene ontology terms and synexpression with CNBP. The direct interaction between CNBP and candidate genes was proved by EMSA and ChIP assays. Besides, the role of CNBP upon the identified genes was validated in loss-of-function experiments in developing zebrafish. We successfully confirmed that CNBP up-regulates tbx2b and smarca5, and down-regulates wnt5b gene expression. The highly stringent strategy used in this work allowed us to identify new CNBP target genes functionally important in different contexts of vertebrate embryonic development. Furthermore, it represents a novel approach toward understanding the biological function and regulatory networks involving CNBP in the biology of vertebrates. PMID:23667590
[Prokaryote diversity in water environment of land-ocean ecotone of Zhuhai City].

PubMed

Huang, Xiao-Lan; Chen, Jian-Yao; Zhou, Shi-Ning; Xie, Li-Chun; Fu, Cong-Sheng

2010-02-01

By constructing a 16S rDNA clone library with PCR-RFLP, the prokaryote diversity in the seawater and groundwater of the land-ocean ecotone of Zhuhai City was investigated, and similarity and cluster analyses were performed against the sequence database in GenBank. In the seawater, Proteobacteria was dominant, followed by Archaeon, Gemmatimonadetes, Candidate division OP3 and OP8, and Planctomycetes, etc.; while in the groundwater, Archaeon was dominant, followed by Proteobacteria, Sphingobacteria, Candidate division OP3, Actinobacterium, and Pseudomonas. The dominant taxa in the groundwater had high similarity to the unculturable groups of marine microorganisms. A large number of bacteria capable of degrading organic matter and purifying the water body existed in the groundwater, suggesting that after long-term evolution, the land-ocean ecotone of Zhuhai City had the characteristics of both land and ocean.
Flow cytometric purification of Colletotrichum higginsianum biotrophic hyphae from Arabidopsis leaves for stage-specific transcriptome analysis.

PubMed

Takahara, Hiroyuki; Dolf, Andreas; Endl, Elmar; O'Connell, Richard

2009-08-01

Generation of stage-specific cDNA libraries is a powerful approach to identify pathogen genes that are differentially expressed during plant infection. Biotrophic pathogens develop specialized infection structures inside living plant cells, but sampling the transcriptome of these structures is problematic due to the low ratio of fungal to plant RNA, and the lack of efficient methods to isolate them from infected plants. Here we established a method, based on fluorescence-activated cell sorting (FACS), to purify the intracellular biotrophic hyphae of Colletotrichum higginsianum from homogenates of infected Arabidopsis leaves. Specific selection of viable hyphae using a fluorescent vital marker provided intact RNA for cDNA library construction. Pilot-scale sequencing showed that the library was enriched with plant-induced and pathogenicity-related fungal genes, including some encoding small, soluble secreted proteins that represent candidate fungal effectors. The high purity of the hyphae (94%) prevented contamination of the library by sequences derived from host cells or other fungal cell types. RT-PCR confirmed that genes identified in the FACS-purified hyphae were also expressed in planta. The method has wide applicability for isolating the infection structures of other plant pathogens, and will facilitate cell-specific transcriptome analysis via deep sequencing and microarray hybridization, as well as proteomic analyses.

Impacts of Genome-Wide Analyses on Our Understanding of Human Herpesvirus Diversity and Evolution.

PubMed

Renner, Daniel W; Szpara, Moriah L

2018-01-01

Until fairly recently, genome-wide evolutionary dynamics and within-host diversity were more commonly examined in the context of small viruses than in the context of large double-stranded DNA viruses such as herpesviruses. The high mutation rates and more compact genomes of RNA viruses have inspired the investigation of population dynamics for these species, and recent data now suggest that herpesviruses might also be considered candidates for population modeling. High-throughput sequencing (HTS) and bioinformatics have expanded our understanding of herpesviruses through genome-wide comparisons of sequence diversity, recombination, allele frequency, and selective pressures. Here we discuss recent data on the mechanisms that generate herpesvirus genomic diversity and underlie the evolution of these virus families. We focus on human herpesviruses, with key insights drawn from veterinary herpesviruses and other large DNA virus families. We consider the impacts of cell culture on herpesvirus genomes and how to accurately describe the viral populations under study. The need for a strong foundation of high-quality genomes is also discussed, since it underlies all secondary genomic analyses such as RNA sequencing (RNA-Seq), chromatin immunoprecipitation, and ribosome profiling. Areas where we foresee future progress, such as the linking of viral genetic differences to phenotypic or clinical outcomes, are highlighted as well. Copyright © 2017 Renner and Szpara.
Systems genetics: a paradigm to improve discovery of candidate genes and mechanisms underlying complex traits.

PubMed

Feltus, F Alex

2014-06-01

Understanding the control of any trait optimally requires the detection of causal genes, gene interaction, and mechanism of action to discover and model the biochemical pathways underlying the expressed phenotype. Functional genomics techniques, including RNA expression profiling via microarray and high-throughput DNA sequencing, allow for the precise genome localization of biological information. Powerful genetic approaches, including quantitative trait locus (QTL) and genome-wide association study mapping, link phenotype with genome positions, yet genetics is less precise in localizing the relevant mechanistic information encoded in DNA. The coupling of salient functional genomic signals with genetically mapped positions is an appealing approach to discover meaningful gene-phenotype relationships. Techniques used to define this genetic-genomic convergence comprise the field of systems genetics. This short review will address an application of systems genetics where RNA profiles are associated with genetically mapped genome positions of individual genes (eQTL mapping) or as gene sets (co-expression network modules). Both approaches can be applied for knowledge independent selection of candidate genes (and possible control mechanisms) underlying complex traits where multiple, likely unlinked, genomic regions might control specific complex traits. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
Exome sequencing of Pakistani consanguineous families identifies 30 novel candidate genes for recessive intellectual disability

PubMed Central

Riazuddin, S; Hussain, M; Razzaq, A; Iqbal, Z; Shahzad, M; Polla, D L; Song, Y; van Beusekom, E; Khan, A A; Tomas-Roca, L; Rashid, M; Zahoor, M Y; Wissink-Lindhout, W M; Basra, M A R; Ansar, M; Agha, Z; van Heeswijk, K; Rasheed, F; Van de Vorst, M; Veltman, J A; Gilissen, C; Akram, J; Kleefstra, T; Assir, M Z; Grozeva, D; Carss, K; Raymond, F L; O'Connor, T D; Riazuddin, S A; Khan, S N; Ahmed, Z M; de Brouwer, A P M; van Bokhoven, H; Riazuddin, S

2017-01-01

Intellectual disability (ID) is a clinically and genetically heterogeneous disorder, affecting 1–3% of the general population. Although research into the genetic causes of ID has recently gained momentum, identification of pathogenic mutations that cause autosomal recessive ID (ARID) has lagged behind, predominantly due to non-availability of sizeable families. Here we present the results of exome sequencing in 121 large consanguineous Pakistani ID families. In 60 families, we identified homozygous or compound heterozygous DNA variants in a single gene, 30 affecting reported ID genes and 30 affecting novel candidate ID genes. Potential pathogenicity of these alleles was supported by co-segregation with the phenotype, low frequency in control populations and the application of stringent bioinformatics analyses. In another eight families segregation of multiple pathogenic variants was observed, affecting 19 genes that were either known or are novel candidates for ID. Transcriptome profiles of normal human brain tissues showed that the novel candidate ID genes formed a network significantly enriched for transcriptional co-expression (P<0.0001) in the frontal cortex during fetal development and in the temporal–parietal and sub-cortex during infancy through adulthood. In addition, proteins encoded by 12 novel ID genes directly interact with previously reported ID proteins in six known pathways essential for cognitive function (P<0.0001). These results suggest that disruptions of temporal parietal and sub-cortical neurogenesis during infancy are critical to the pathophysiology of ID. These findings further expand the existing repertoire of genes involved in ARID, and provide new insights into the molecular mechanisms and the transcriptome map of ID. PMID:27457812
Sequence and gene content of a large fragment of a lizard sex chromosome and evaluation of candidate sex differentiating gene R-spondin 1

PubMed Central

2013-01-01

Background: Scant genomic information from non-avian reptile sex chromosomes is available, and only for a few lizards, several snakes and one turtle species; it represents only a small fraction of the total sex chromosome sequences in these species. Results: We report 352 kb of contiguous sequence from the sex chromosome of a squamate reptile, Pogona vitticeps, with a ZZ/ZW sex microchromosome system. This contig contains five protein coding genes (oprd1, rcc1, znf91, znf131, znf180), and major families of repetitive sequences with a high number of copies of LTR and non-LTR retrotransposons, including the CR1 and Bov-B LINEs. Two of these genes, oprd1 and rcc1, are part of a homologous syntenic block that is conserved among amniotes. While oprd1 and rcc1 have no known function in sex determination or differentiation in amniotes, this homologous syntenic block in mammals and chicken also contains R-spondin 1 (rspo1), the ovarian differentiating gene in mammals. To explore the possibility that rspo1 is sex determining in dragon lizards, genomic BAC and cDNA clones were mapped using fluorescence in situ hybridisation. Their location on an autosomal microchromosome pair, not on the ZW sex microchromosomes, eliminates rspo1 as a candidate sex determining gene in P. vitticeps. Conclusion: Our study has characterized the largest contiguous stretch of physically mapped sex chromosome sequence (352 kb) from a ZZ/ZW lizard species. Although this region represents only a small fraction of the sex chromosomes of P. vitticeps, it has revealed several features typically associated with sex chromosomes including the accumulation of large blocks of repetitive sequences. PMID:24344927
Assessment of mangroves from Goa, west coast India using DNA barcode.

PubMed

Saddhe, Ankush Ashok; Jamdade, Rahul Arvind; Kumar, Kundan

2016-01-01

Mangroves are salt-tolerant forest ecosystems of tropical and subtropical intertidal regions. They are among the most productive, diverse and biologically important ecosystems, and they tend toward threatened status. Identification of mangrove species is of critical importance in conserving and utilizing biodiversity, but it is apparently hindered by a lack of taxonomic expertise. In recent years, DNA barcoding using the plastid markers rbcL and matK has been suggested as an effective method to complement traditional taxonomic expertise for rapid species identification and biodiversity inventories. In the present study, we assessed 14 available mangrove species of Goa, west coast India, based on the core DNA barcode markers rbcL and matK. PCR amplification success rate, intra- and inter-specific genetic distance variation and the correct identification percentage were taken into account to assess candidate barcode regions. PCR and sequencing success rates were high in the rbcL (97.7 %) and matK (95.5 %) regions. The two candidate chloroplast barcoding regions (rbcL, matK) yielded barcode gaps. Our results clearly demonstrated that the matK locus gave the highest correct identification rate (72.09 %) based on TaxonDNA Best Match criteria. The concatenated rbcL + matK loci were able to adequately discriminate all mangrove genera and species to some extent except those in Rhizophora, Sonneratia and Avicennia. Our study provides the first endorsement of species resolution among mangroves using plastid genes, with few exceptions. Our future work will focus on evaluating other barcode markers to achieve complete resolution of mangrove species and identification of putative hybrids.
Spatial heterogeneity in the Mediterranean Biodiversity Hotspot affects barcoding accuracy of its freshwater fishes.

PubMed

Geiger, M F; Herder, F; Monaghan, M T; Almada, V; Barbieri, R; Bariche, M; Berrebi, P; Bohlen, J; Casal-Lopez, M; Delmastro, G B; Denys, G P J; Dettai, A; Doadrio, I; Kalogianni, E; Kärst, H; Kottelat, M; Kovačić, M; Laporte, M; Lorenzoni, M; Marčić, Z; Özuluğ, M; Perdices, A; Perea, S; Persat, H; Porcelotti, S; Puzzi, C; Robalo, J; Šanda, R; Schneider, M; Šlechtová, V; Stoumboudi, M; Walter, S; Freyhof, J

2014-11-01

Incomplete knowledge of biodiversity remains a stumbling block for conservation planning and even occurs within globally important Biodiversity Hotspots (BH). Although technical advances have boosted the power of molecular biodiversity assessments, the link between DNA sequences and species and the analytics to discriminate entities remain crucial. Here, we present an analysis of the first DNA barcode library for the freshwater fish fauna of the Mediterranean BH (526 spp.), with virtually complete species coverage (498 spp., 98% extant species). In order to build an identification system supporting conservation, we compared species determination by taxonomists to multiple clustering analyses of DNA barcodes for 3165 specimens. The congruence of barcode clusters with morphological determination was strongly dependent on the method of cluster delineation, but was highest with the general mixed Yule-coalescent (GMYC) model-based approach (83% of all species recovered as GMYC entity). Overall, genetic morphological discontinuities suggest the existence of up to 64 previously unrecognized candidate species. We found reduced identification accuracy when using the entire DNA-barcode database, compared with analyses on databases for individual river catchments. This scale effect has important implications for barcoding assessments and suggests that fairly simple identification pipelines provide sufficient resolution in local applications. We calculated Evolutionarily Distinct and Globally Endangered scores in order to identify candidate species for conservation priority and argue that the evolutionary content of barcode data can be used to detect priority species for future IUCN assessments. We show that large-scale barcoding inventories of complex biotas are feasible and contribute directly to the evaluation of conservation priorities. © 2014 John Wiley & Sons Ltd.
The role of Cas8 in type I CRISPR interference.

PubMed

Cass, Simon D B; Haas, Karina A; Stoll, Britta; Alkhnbashi, Omer S; Sharma, Kundan; Urlaub, Henning; Backofen, Rolf; Marchfelder, Anita; Bolt, Edward L

2015-05-05

CRISPR (clustered regularly interspaced short palindromic repeat) systems provide bacteria and archaea with adaptive immunity to repel invasive genetic elements. Type I systems use 'cascade' [CRISPR-associated (Cas) complex for antiviral defence] ribonucleoprotein complexes to target invader DNA, by base pairing CRISPR RNA (crRNA) to protospacers. Cascade identifies PAMs (protospacer adjacent motifs) on invader DNA, triggering R-loop formation and subsequent DNA degradation by Cas3. Cas8 is a candidate PAM recognition factor in some cascades. We analysed Cas8 homologues from type IB CRISPR systems in archaea Haloferax volcanii (Hvo) and Methanothermobacter thermautotrophicus (Mth). Cas8 was essential for CRISPR interference in Hvo and purified Mth Cas8 protein responded to PAM sequence when binding to nucleic acids. Cas8 interacted physically with Cas5-Cas7-crRNA complex, stimulating binding to PAM containing substrates. Mutation of conserved Cas8 amino acid residues abolished interference in vivo and altered catalytic activity of Cas8 protein in vitro. This is experimental evidence that Cas8 is important for targeting Cascade to invader DNA. © 2015 Authors.
Application of Quaternion in improving the quality of global sequence alignment scores for an ambiguous sequence target in Streptococcus pneumoniae DNA

NASA Astrophysics Data System (ADS)

Lestari, D.; Bustamam, A.; Novianti, T.; Ardaneswari, G.

2017-07-01

A DNA sequence can be defined as a succession of letters representing the order of nucleotides within DNA, using the four DNA base codes adenine (A), guanine (G), cytosine (C), and thymine (T). The precise code of a sequence is determined using DNA sequencing methods and technologies, which have been developed since the 1970s and have since become advanced, high-throughput sequencing technologies. DNA sequencing has greatly accelerated biological and medical research and discovery. However, in some cases DNA sequencing produces ambiguous results, making it difficult to determine whether a given code is A, T, G, or C. To solve this problem, in this study we introduce another representation of DNA codes, namely the Quaternion Q = (PA, PT, PG, PC), where PA, PT, PG, PC are the probabilities that the A, T, G, C bases appear in Q, and PA + PT + PG + PC = 1. Using Quaternion representations, we are able to construct an improved scoring matrix for global sequence alignment by applying a dot product method. This scoring matrix produces better-quality match and mismatch scores between two DNA base codes. In the implementation, we applied the Needleman-Wunsch global sequence alignment algorithm using Octave to analyze our target sequence, which contains some ambiguous sequence data. The subject sequences are DNA sequences of Streptococcus pneumoniae families obtained from GenBank, while the target DNA sequence was received from our collaborator's database. We found that the Quaternion representations improve the quality of the sequence alignment score, and we conclude that the target DNA sequence has maximum similarity with Streptococcus pneumoniae.
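The idea in this abstract can be illustrated with a minimal Python sketch (the authors used Octave; their code is not shown in the record). Each base is a probability vector (PA, PT, PG, PC), and the substitution score between two positions is derived from the dot product of their vectors. The mapping from dot product to score (`dot_score`), the gap penalty, and the `UNIT` table including the uniform `N` code are assumptions made here for illustration, not the authors' published scoring matrix.

```python
from typing import Dict, List, Tuple

Base = Tuple[float, float, float, float]  # (P_A, P_T, P_G, P_C), sums to 1

# Hypothetical encoding table: unambiguous bases are unit vectors,
# "N" is assumed to be uniformly ambiguous.
UNIT: Dict[str, Base] = {
    "A": (1.0, 0.0, 0.0, 0.0),
    "T": (0.0, 1.0, 0.0, 0.0),
    "G": (0.0, 0.0, 1.0, 0.0),
    "C": (0.0, 0.0, 0.0, 1.0),
    "N": (0.25, 0.25, 0.25, 0.25),
}


def encode(seq: str) -> List[Base]:
    """Turn a string of base codes into a list of probability vectors."""
    return [UNIT[c] for c in seq.upper()]


def dot_score(p: Base, q: Base, match: float = 1.0, mismatch: float = -1.0) -> float:
    """Expected match/mismatch score; the dot product is the chance both bases agree."""
    same = sum(a * b for a, b in zip(p, q))
    return match * same + mismatch * (1.0 - same)


def needleman_wunsch(x: List[Base], y: List[Base], gap: float = -1.0) -> float:
    """Return the optimal global alignment score (linear gap penalty)."""
    n, m = len(x), len(y)
    score = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            score[i][j] = max(
                score[i - 1][j - 1] + dot_score(x[i - 1], y[j - 1]),  # align two bases
                score[i - 1][j] + gap,                                # gap in y
                score[i][j - 1] + gap,                                # gap in x
            )
    return score[n][m]
```

Under these assumptions an ambiguous position contributes a partial score instead of a full mismatch, which is one way a probabilistic encoding can raise alignment scores for targets containing uncertain base calls.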
DNA barcoding of Rhodiola (Crassulaceae): a case study on a group of recently diversified medicinal plants from the Qinghai-Tibetan Plateau.

PubMed

Zhang, Jian-Qiang; Meng, Shi-Yong; Wen, Jun; Rao, Guang-Yuan

2015-01-01

DNA barcoding, the identification of species using one or a few short standardized DNA sequences, is an important complement to traditional taxonomy. However, there are particular challenges for barcoding plants, especially for species with complex evolutionary histories. We herein evaluated the utility of five candidate sequences - rbcL, matK, trnH-psbA, trnL-F and the internal transcribed spacer (ITS) - for barcoding Rhodiola species, a group of high-altitude plants frequently used as adaptogens, hemostatics and tonics in traditional Tibetan medicine. Rhodiola was suggested to have diversified rapidly recently. The genus is thus a good model for testing DNA barcoding strategies for recently diversified medicinal plants. This study analyzed 189 accessions, representing 47 of the 55 recognized Rhodiola species in the Flora of China treatment. Based on intraspecific and interspecific divergence and degree of monophyly statistics, ITS was the best single-locus barcode, resolving 66% of the Rhodiola species. The core combination rbcL+matK resolved only 40.4% of them. Unsurprisingly, the combined use of all five loci provided the highest discrimination power, resolving 80.9% of the species. However, this is weaker than the discrimination power generally reported in barcoding studies of other plant taxa. The observed complications may be due to the recent diversification, incomplete lineage sorting and reticulate evolution of the genus. These processes are common features of numerous plant groups in the high-altitude regions of the Qinghai-Tibetan Plateau.
Metagenomic discovery of biomass-degrading genes and genomes from cow rumen.

PubMed

Hess, Matthias; Sczyrba, Alexander; Egan, Rob; Kim, Tae-Wan; Chokhawala, Harshal; Schroth, Gary; Luo, Shujun; Clark, Douglas S; Chen, Feng; Zhang, Tao; Mackie, Roderick I; Pennacchio, Len A; Tringe, Susannah G; Visel, Axel; Woyke, Tanja; Wang, Zhong; Rubin, Edward M

2011-01-28

The paucity of enzymes that efficiently deconstruct plant polysaccharides represents a major bottleneck for industrial-scale conversion of cellulosic biomass into biofuels. Cow rumen microbes specialize in degradation of cellulosic plant material, but most members of this complex community resist cultivation. To characterize biomass-degrading genes and genomes, we sequenced and analyzed 268 gigabases of metagenomic DNA from microbes adherent to plant fiber incubated in cow rumen. From these data, we identified 27,755 putative carbohydrate-active genes and expressed 90 candidate proteins, of which 57% were enzymatically active against cellulosic substrates. We also assembled 15 uncultured microbial genomes, which were validated by complementary methods including single-cell genome sequencing. These data sets provide a substantially expanded catalog of genes and genomes participating in the deconstruction of cellulosic biomass.
Utility of DNA barcoding for rapid and accurate assessment of bat diversity in Malaysia in the absence of formally described species.

PubMed

Wilson, J-J; Sing, K-W; Halim, M R A; Ramli, R; Hashim, R; Sofian-Azirun, M

2014-02-19

Bats are important flagship species for biodiversity research; however, diversity in Southeast Asia is considerably underestimated in the current checklists and field guides. Incorporation of DNA barcoding into surveys has revealed numerous species-level taxa overlooked by conventional methods. Inclusion of these taxa in inventories provides a more informative record of diversity, but is problematic as these species lack formal description. We investigated how frequently documented, but undescribed, bat taxa are encountered in Peninsular Malaysia. We discuss whether a barcode library provides a means of recognizing and recording these taxa across biodiversity inventories. Tissue was sampled from bats trapped at Pasir Raja, Dungun Terengganu, Peninsular Malaysia. The DNA was extracted and the COI barcode region amplified and sequenced. We identified 9 species-level taxa within our samples, based on analysis of the DNA barcodes. Six specimens matched to four previously documented taxa considered candidate species but currently lacking formal taxonomic status. This study confirms the high diversity of bats within Peninsular Malaysia (9 species in 13 samples) and demonstrates how DNA barcoding allows for inventory and documentation of known taxa lacking formal taxonomic status.
MotifMark: Finding regulatory motifs in DNA sequences.

PubMed

Hassanzadeh, Hamid Reza; Kolhe, Pushkar; Isbell, Charles L; Wang, May D

2017-07-01

The interaction between proteins and DNA is a key driving force in a significant number of biological processes such as transcriptional regulation, repair, recombination, splicing, and DNA modification. The identification of DNA-binding sites and the specificity of target proteins in binding to these regions are two important steps in understanding the mechanisms of these biological activities. A number of high-throughput technologies have recently emerged that try to quantify the affinity between proteins and DNA motifs. Despite their success, these technologies have their own limitations and fall short in precise characterization of motifs, and as a result, require further downstream analysis to extract useful and interpretable information from a haystack of noisy and inaccurate data. Here we propose MotifMark, a new algorithm based on graph theory and machine learning, that can find binding sites on candidate probes and rank their specificity in regard to the underlying transcription factor. We developed a pipeline to analyze experimental data derived from compact universal protein binding microarrays and benchmarked it against two of the most accurate motif search methods. Our results indicate that MotifMark can be a viable alternative technique for prediction of motif from protein binding microarrays and possibly other related high-throughput techniques.
Design, synthesis, and functional testing of recombinant cell penetrating peptides

NASA Astrophysics Data System (ADS)

Widyaningtyas, S. T.; Soebandrio, A.; Ibrahim, F.; Bela, B.

2017-08-01

Cell penetrating peptides (CPP) are one of the most attractive DNA delivery systems currently in development. In this research, in silico CPP development was performed based on a literature study to look for peptides that induce endosome escape, have the ability to bind DNA, and pass through cell membranes and/or nuclear membranes with a final goal of creating a new CPP to be used as a DNA delivery system. We report herein the successful isolation of three candidate CPP molecules, which have all been successfully expressed and purified by NiNTA. One of the determinants of CPP success as a DNA carrier is the ability of the CPP to bind and protect DNA from the effects of nucleases. The DNA binding test results show that all three CPPs can bind to DNA and protect it from the effects of serum nucleases. These three CPP candidates designed in silico and synthesized in the prokaryote system are eligible candidates for further testing of their ability to deliver DNA in vitro and in vivo.
Large-Scale Concatenation cDNA Sequencing

PubMed Central

Yu, Wei; Andersson, Björn; Worley, Kim C.; Muzny, Donna M.; Ding, Yan; Liu, Wen; Ricafrente, Jennifer Y.; Wentland, Meredith A.; Lennon, Greg; Gibbs, Richard A.

1997-01-01

A total of 100 kb of DNA derived from 69 individual human brain cDNA clones of 0.7–2.0 kb were sequenced by concatenated cDNA sequencing (CCS), whereby multiple individual DNA fragments are sequenced simultaneously in a single shotgun library. The method yielded accurate sequences and a similar efficiency compared with other shotgun libraries constructed from single DNA fragments (>20 kb). Computer analyses were carried out on 65 cDNA clone sequences and their corresponding end sequences to examine both nucleic acid and amino acid sequence similarities in the databases. Thirty-seven clones revealed no DNA database matches, 12 clones generated exact matches (≥98% identity), and 16 clones generated nonexact matches (57%–97% identity) to either known human or other species genes. Of those 28 matched clones, 8 had corresponding end sequences that failed to identify similarities. In a protein similarity search, 27 clone sequences displayed significant matches, whereas only 20 of the end sequences had matches to known protein sequences. Our data indicate that full-length cDNA insert sequences provide significantly more nucleic acid and protein sequence similarity matches than expressed sequence tags (ESTs) for database searching. [All 65 cDNA clone sequences described in this paper have been submitted to the GenBank data library under accession nos. U79240–U79304.] PMID:9110174
Using RNA-Seq for gene identification, polymorphism detection and transcript profiling in two alfalfa genotypes with divergent cell wall composition in stems

PubMed Central

2011-01-01

Background: Alfalfa [Medicago sativa (L.) sativa], a widely grown perennial forage, has potential for development as a cellulosic ethanol feedstock. However, the genomics of alfalfa, a non-model species, is still in its infancy. The recent advent of RNA-Seq, a massively parallel sequencing method for transcriptome analysis, provides an opportunity to expand the identification of alfalfa genes and polymorphisms, and conduct in-depth transcript profiling. Results: Cell walls in stems of alfalfa genotype 708 have higher cellulose and lower lignin concentrations compared to cell walls in stems of genotype 773. Using the Illumina GA-II platform, a total of 198,861,304 expressed sequence tags (ESTs, 76 bp in length) were generated from cDNA libraries derived from elongating stem (ES) and post-elongation stem (PES) internodes of 708 and 773. In addition, 341,984 ESTs were generated from ES and PES internodes of genotype 773 using the GS FLX Titanium platform. The first alfalfa (Medicago sativa) gene index (MSGI 1.0) was assembled using the Sanger ESTs available from GenBank, the GS FLX Titanium EST sequences, and the de novo assembled Illumina sequences. MSGI 1.0 contains 124,025 unique sequences including 22,729 tentative consensus sequences (TCs), 22,315 singletons and 78,981 pseudo-singletons. We identified a total of 1,294 simple sequence repeats (SSR) among the sequences in MSGI 1.0. In addition, a total of 10,826 single nucleotide polymorphisms (SNPs) were predicted between the two genotypes. Out of 55 SNPs randomly selected for experimental validation, 47 (85%) were polymorphic between the two genotypes. We also identified numerous allelic variations within each genotype. Digital gene expression analysis identified numerous candidate genes that may play a role in stem development as well as candidate genes that may contribute to the differences in cell wall composition in stems of the two genotypes. Conclusions: Our results demonstrate that RNA-Seq can be successfully used for gene identification, polymorphism detection and transcript profiling in alfalfa, a non-model, allogamous, autotetraploid species. The alfalfa gene index assembled in this study, and the SNPs, SSRs and candidate genes identified can be used to improve alfalfa as a forage crop and cellulosic feedstock. PMID:21504589
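The counts reported in this abstract are internally consistent, which a short Python check can confirm. The variable names are illustrative only; the numbers are taken directly from the abstract.

```python
# Composition of MSGI 1.0 as reported in the abstract.
tentative_consensus = 22_729
singletons = 22_315
pseudo_singletons = 78_981
total_unique = tentative_consensus + singletons + pseudo_singletons

# SNP validation: 47 of 55 randomly selected SNPs were polymorphic.
validated, tested = 47, 55
validation_percent = round(100 * validated / tested)

print(total_unique)        # matches the 124,025 unique sequences reported
print(validation_percent)  # matches the reported 85%
```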
Synthesis of DNA

DOEpatents

Mariella, Jr., Raymond P.

2008-11-18

A method of synthesizing a desired double-stranded DNA of a predetermined length and of a predetermined sequence. Preselected sequence segments that will complete the desired double-stranded DNA are determined. Preselected segment sequences of DNA that will be used to complete the desired double-stranded DNA are provided. The preselected segment sequences of DNA are assembled to produce the desired double-stranded DNA.
MeDIP-seq and nCpG analyses illuminate sexually dimorphic methylation of gonadal development genes with high historic methylation in turtle hatchlings with temperature-dependent sex determination.

PubMed

Radhakrishnan, Srihari; Literman, Robert; Mizoguchi, Beatriz; Valenzuela, Nicole

2017-01-01

DNA methylation alters gene expression but not DNA sequence and mediates some cases of phenotypic plasticity. Temperature-dependent sex determination (TSD) epitomizes phenotypic plasticity where environmental temperature drives embryonic sexual fate, as occurs commonly in turtles. Importantly, the temperature-specific transcription of two genes underlying gonadal differentiation is known to be induced by differential methylation in TSD fish, turtle and alligator. Yet how extensive the link between DNA methylation and TSD is remains unclear. Here we test for broad differences in genome-wide DNA methylation between male and female hatchling gonads of the TSD painted turtle Chrysemys picta using methyl DNA immunoprecipitation sequencing, to identify differentially methylated candidates for future study. We also examine the genome-wide nCpG distribution (which affects DNA methylation) in painted turtles and test for historic methylation in genes regulating vertebrate gonadogenesis. Turtle global methylation was consistent with other vertebrates (57% of the genome, 78% of all CpG dinucleotides). Numerous genes predicted to regulate turtle gonadogenesis exhibited sex-specific methylation and were proximal to methylated repeats. nCpG distribution predicted actual turtle DNA methylation and was bimodal in gene promoters (as other vertebrates) and introns (unlike other vertebrates). Differentially methylated genes, including regulators of sexual development, had lower nCpG content indicative of higher historic methylation. Ours is the first evidence suggesting that sexually dimorphic DNA methylation is pervasive in turtle gonads (perhaps mediated by repeat methylation) and that it targets numerous regulators of gonadal development, consistent with the hypothesis that it may regulate thermosensitive transcription in TSD vertebrates. However, further research during embryogenesis will help test this hypothesis and the alternative that most differential methylation observed in hatchlings is instead the by-product of sexual differentiation and not its cause.
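The abstract does not define nCpG. A common reading is normalized CpG content, the observed-to-expected CpG ratio, which drops in regions that were historically methylated because methylated cytosines tend to mutate away over evolutionary time. The Python sketch below computes that ratio with the widely used formula (CpG count × length) / (C count × G count). It is an assumption that this matches the authors' exact definition, and the function name `cpg_obs_exp` is hypothetical.

```python
def cpg_obs_exp(seq: str) -> float:
    """Observed/expected CpG ratio for one sequence (0.0 if C or G is absent)."""
    s = seq.upper()
    n = len(s)
    c_count = s.count("C")
    g_count = s.count("G")
    if c_count == 0 or g_count == 0:
        return 0.0
    cpg_count = s.count("CG")  # "CG" cannot overlap itself, so count() is exact
    return cpg_count * n / (c_count * g_count)
```

Applied per promoter or per gene, a low ratio would point to CpG depletion and therefore, under this interpretation, to higher historic methylation.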
Nanopore Technology: A Simple, Inexpensive, Futuristic Technology for DNA Sequencing.

PubMed

Gupta, P D

2016-10-01

In health care, the importance of DNA sequencing has been fully established. Sanger's capillary electrophoresis DNA sequencing methodology is time consuming and cumbersome, and hence expensive. Lately, because of its versatility, DNA sequencing has become a household name, and there is therefore an urgent need for a simple, fast, inexpensive DNA sequencing technology. Efforts were made at the beginning of this century, and nanopore DNA sequencing technology was developed; it is still in its infancy, but it is nevertheless the technology of the future.
The genome-wide DNA sequence specificity of the anti-tumour drug bleomycin in human cells.

PubMed

Murray, Vincent; Chen, Jon K; Tanaka, Mark M

2016-07-01

The cancer chemotherapeutic agent, bleomycin, cleaves DNA at specific sites. For the first time, the genome-wide DNA sequence specificity of bleomycin breakage was determined in human cells. Utilising Illumina next-generation DNA sequencing techniques, over 200 million bleomycin cleavage sites were examined to elucidate the bleomycin genome-wide DNA selectivity. The genome-wide bleomycin cleavage data were analysed by four different methods to determine the cellular DNA sequence specificity of bleomycin strand breakage. For the most highly cleaved DNA sequences, the preferred site of bleomycin breakage was at 5'-GT* dinucleotide sequences (where the asterisk indicates the bleomycin cleavage site), with lesser cleavage at 5'-GC* dinucleotides. This investigation also determined longer bleomycin cleavage sequences, with preferred cleavage at 5'-GT*A and 5'- TGT* trinucleotide sequences, and 5'-TGT*A tetranucleotides. For cellular DNA, the hexanucleotide DNA sequence 5'-RTGT*AY (where R is a purine and Y is a pyrimidine) was the most highly cleaved DNA sequence. It was striking that alternating purine-pyrimidine sequences were highly cleaved by bleomycin. The highest intensity cleavage sites in cellular and purified DNA were very similar although there were some minor differences. Statistical nucleotide frequency analysis indicated a G nucleotide was present at the -3 position (relative to the cleavage site) in cellular DNA but was absent in purified DNA.
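As an illustration of the consensus reported above, the following Python sketch scans a sequence for the hexanucleotide 5'-RTGT*AY (R = A or G, Y = C or T) and returns the 0-based index of the nucleotide marked by the asterisk. It is a simple motif search on one strand only, not the statistical analysis used in the study, and the function name `preferred_sites` is hypothetical.

```python
import re
from typing import List

# Lookahead so that overlapping occurrences (e.g. ATGTATGTAC) are all reported.
RTGTAY = re.compile(r"(?=[AG]TGTA[CT])")


def preferred_sites(seq: str) -> List[int]:
    """Indices of the T marked by '*' in each 5'-RTGT*AY occurrence."""
    return [m.start() + 3 for m in RTGTAY.finditer(seq.upper())]
```

For example, `preferred_sites("ATGTATGTAC")` reports two overlapping sites. Scanning the reverse complement as well would be needed to cover both strands.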

Completely monodisperse, highly repetitive proteins for bioconjugate capillary electrophoresis: Development and characterization

PubMed Central

Lin, Jennifer S.; Albrecht, Jennifer Coyne; Meagher, Robert J.; Wang, Xiaoxiao; Barron, Annelise E.

2011-01-01

Protein-based polymers are increasingly being used in biomaterial applications due to their ease of customization and potential monodispersity. These advantages make protein polymers excellent candidates for bioanalytical applications. Here we describe improved methods for producing drag-tags for Free-Solution Conjugate Electrophoresis (FSCE). FSCE utilizes a pure, monodisperse recombinant protein, tethered end-on to a ssDNA molecule, to enable DNA size separation in aqueous buffer. FSCE also provides a highly sensitive method to evaluate the polydispersity of a protein drag-tag and thus its suitability for bioanalytical uses. This method is able to detect slight differences in drag-tag charge or mass. We have devised an improved cloning, expression, and purification strategy that enables us to generate, for the first time, a truly monodisperse 20 kDa protein polymer and a nearly monodisperse 38 kDa protein. These newly produced proteins can be used as drag-tags to enable longer read DNA sequencing by free-solution microchannel electrophoresis. PMID:21553840
Molecular inversion probe assay for allelic quantitation

PubMed Central

Ji, Hanlee; Welch, Katrina

2010-01-01

Molecular inversion probe (MIP) technology has been demonstrated to be a robust platform for large-scale dual genotyping and copy number analysis. Applications in human genomic and genetic studies include the possibility of running dual germline genotyping and combined copy number variation ascertainment. MIPs analyze large numbers of specific genetic target sequences in parallel, relying on interrogation of a barcode tag, rather than direct hybridization of genomic DNA to an array. The MIP approach does not replace, but is complementary to many of the copy number technologies being performed today. Some specific advantages of MIP technology include: Less DNA required (37 ng vs. 250 ng), DNA quality less important, more dynamic range (amplifications detected up to copy number 60), allele specific information “cleaner” (less SNP crosstalk/contamination), and quality of markers better (fewer individual MIPs versus SNPs needed to identify copy number changes). MIPs can be considered a candidate gene (targeted whole genome) approach and can find specific areas of interest that otherwise may be missed with other methods. PMID:19488872
Sequence and Structure Dependent DNA-DNA Interactions

NASA Astrophysics Data System (ADS)

Kopchick, Benjamin; Qiu, Xiangyun

Molecular forces between dsDNA strands are largely dominated by electrostatics and have been extensively studied. Quantitative knowledge has been accumulated on how DNA-DNA interactions are modulated by varied biological constituents such as ions, cationic ligands, and proteins. Despite its central role in biology, the sequence of DNA has not received substantial attention and “random” DNA sequences are typically used in biophysical studies. However, ~50% of the human genome is composed of non-random-sequence DNAs, particularly repetitive sequences. Furthermore, covalent modifications of DNA such as methylation play key roles in gene functions. Such DNAs with specific sequences or modifications often take on structures other than the canonical B-form. Here we present a series of quantitative measurements of the DNA-DNA forces with the osmotic stress method on different DNA sequences, from short repeats to the most frequent sequences in the genome, and to modifications such as bromination and methylation. We observe peculiar behaviors that appear to be strongly correlated with the incurred structural changes. We speculate on the causalities in terms of the differences in hydration shell and DNA surface structures.
Association between hMLH1 hypermethylation and JC virus (JCV) infection in human colorectal cancer (CRC).

PubMed

Vilkin, Alex; Niv, Yaron

2011-04-01

Incorporation of viral DNA may interfere with the normal sequence of human DNA bases on the genetic level or cause secondary epigenetic changes such as gene promoter methylation or histone acetylation. Colorectal cancer (CRC) is the second leading cause of cancer mortality in the USA. Chromosomal instability (CIN) was established as the key mechanism in cancer development. Later, it was found that CRC results not only from the progressive accumulation of genetic alterations but also from epigenetic changes. JC virus (JCV) is a candidate etiologic factor in sporadic CRC. It may act by stabilizing β-catenin, facilitating its entrance to the cell nucleus, initiating proliferation and cancer development. Diploid CRC cell lines transfected with JCV-containing plasmids developed CIN. This result provides direct experimental evidence for the ability of JCV T-Ag to induce CIN in the genome of colonic epithelial cells. The association of CRC hMLH1 methylation and tumor positivity for JCV was recently documented. JC virus T-Ag DNA sequences were found in 77% of CRCs and are associated with promoter methylation of multiple genes. hMLH1 was methylated in 25 out of 80 CRC patients positive for T-Ag (31%) in comparison with only one out of 11 T-Ag negative cases (9%). Thus, JCV can mediate both CIN and aberrant methylation in CRC. Like other viruses, chronic infection with JCV may induce CRC by different mechanisms which should be further investigated. Thus, gene promoter methylation induced by JCV may be an important process in CRC and the polyp-carcinoma sequence.
Large-scale collection of full-length cDNA and transcriptome analysis in Hevea brasiliensis.

PubMed

Makita, Yuko; Ng, Kiaw Kiaw; Veera Singham, G; Kawashima, Mika; Hirakawa, Hideki; Sato, Shusei; Othman, Ahmad Sofiman; Matsui, Minami

2017-04-01

Natural rubber has unique physical properties that cannot be replaced by products from other latex-producing plants or petrochemically produced synthetic rubbers. Rubber from Hevea brasiliensis is the main commercial source for this natural rubber that has a cis-polyisoprene configuration. For sustainable production of enough rubber to meet demand, elucidation of the molecular mechanisms involved in the production of latex is vital. To this end, we first constructed rubber full-length cDNA libraries of the RRIM 600 cultivar and sequenced around 20,000 clones by the Sanger method and over 15,000 contigs by Illumina sequencer. With these data, we updated around 5,500 gene structures and newly annotated around 9,500 transcription start sites. Second, to elucidate the rubber biosynthetic pathways and their transcriptional regulation, we carried out tissue- and cultivar-specific RNA-Seq analysis. By using our recently published genome sequence, we confirmed the expression patterns of the rubber biosynthetic genes. Our data suggest that the cytoplasmic mevalonate (MVA) pathway is the main route for isoprenoid biosynthesis in latex production. In addition to the well-studied polymerization factors, we suggest that rubber elongation factor 8 (REF8) is a candidate factor in cis-polyisoprene biosynthesis. We have also identified 39 transcription factors that may be key regulators in latex production. Expression profile analysis using two additional cultivars, RRIM 901 and PB 350, via an RNA-Seq approach revealed possible expression differences between a high latex-yielding cultivar and a disease-resistant cultivar. © The Author 2017. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Cross-referencing yeast genetics and mammalian genomes

DOE Office of Scientific and Technical Information (OSTI.GOV)

Hieter, P.; Basset, D.; Boguski, M.

1994-09-01

We have initiated a project that will systematically transfer information about yeast genes onto the genetic maps of mice and human beings. Rapidly expanding human EST data will serve as a source of candidate human homologs that will be repeatedly searched using yeast protein sequence queries. Search results will be automatically reported to participating labs. Human cDNA sequences from which the ESTs are derived will be mapped at high resolution in the human and mouse genomes. The comparative mapping information cross-references the genomic position of novel human cDNAs with functional information known about the cognate yeast genes. This should facilitate the initial identification of genes responsible for mammalian mutant phenotypes, including human disease. In addition, the identification of mammalian homologs of yeast genes provides reagents for determining evolutionary conservation and for performing direct experiments in multicellular eukaryotes to enhance study of the yeast protein's function. For example, ESTs homologous to CDC27 and CDC16 were identified, and the corresponding cDNA clones were obtained from ATCC, completely sequenced, and mapped on human and mouse chromosomes. In addition, the CDC27Hs cDNA has been used to raise antisera to the CDC27Hs protein and used in subcellular localization experiments and junctional studies in mammalian cells. We have received funding from the National Center for Human Genome Research to provide a community resource which will establish comprehensive cross-referencing among yeast, human, and mouse loci. The project is set up as a service and information on how to communicate with this effort will be provided.
Detection of Gastric Cancer with Novel Methylated DNA Markers: Discovery, Tissue Validation, and Pilot Testing in Plasma.

PubMed

Anderson, Bradley W; Suh, Yun-Suhk; Choi, Boram; Lee, Hyuk-Joon; Yab, Tracy C; Taylor, William; Dukek, Brian A; Berger, Calise K; Cao, Xiaoming; Foote, Patrick H; Devens, Mary E; Boardman, Lisa A; Kisiel, John B; Mahoney, Douglas W; Slettedahl, Seth W; Allawi, Hatim T; Lidgard, Graham P; Smyrk, Thomas C; Yang, Han-Kwang; Ahlquist, David A

2018-05-29

Gastric adenocarcinoma (GAC) is the third most common cause of cancer mortality worldwide. Accurate and affordable non-invasive detection methods have potential value for screening and surveillance. Herein, we identify novel methylated DNA markers (MDMs) for GAC, validate their discrimination for GAC in tissues from geographically separate cohorts, explore marker acquisition through the oncogenic cascade, and describe distributions of candidate MDMs in plasma from GAC cases and normal controls. Following discovery by unbiased whole methylome sequencing, candidate MDMs were validated by blinded methylation-specific PCR in archival case-control tissues from U.S. and South Korean patients. Top MDMs were then assayed by an analytically sensitive method (quantitative real-time allele-specific target and signal amplification) in a blinded pilot study on archival plasma from GAC cases and normal controls. Whole methylome discovery yielded novel and highly discriminant candidate MDMs. In tissue, a panel of candidate MDMs detected GAC in 92-100% of U.S. and S. Korean cohorts at 100% specificity. Levels of most MDMs increased progressively from normal mucosa through metaplasia, adenoma, and GAC with variation in points of greatest marker acquisition. In plasma, a 3-marker panel (ELMO1, ZNF569, C13orf18) detected 86% (95% CI 71-95%) of GACs at 95% specificity. Novel MDMs appear to accurately discriminate GAC from normal controls in both tissue and plasma. The point of aberrant methylation during oncogenesis varies by MDM, which may have relevance to marker selection in clinical applications. Further exploration of these MDMs for GAC screening and surveillance is warranted. Copyright ©2018, American Association for Cancer Research.
Phytoremediation of chromium using Salix species: cloning ESTs and candidate genes involved in the Cr response.

PubMed

Quaggiotti, Silvia; Barcaccia, Gianni; Schiavon, Michela; Nicolé, Silvia; Galla, Giulio; Rossignolo, Virginia; Soattin, Marica; Malagoli, Mario

2007-11-01

In this research a differential display based on the detection of cDNA-AFLP markers was used to identify candidate genes potentially involved in the regulation of the response to chromium in four different willow species (Salix alba, Salix eleagnos, Salix fragilis and Salix matsudana) chosen on the basis of their suitability in phytoremediation techniques. Our approach enabled the assay of a large set of mRNA-related fragments and increased the reliability of amplification-based transcriptome analysis. The vast majority of transcript-derived fragments were shared among samples within species and thus attributable to constitutively expressed genes. However, a number of differentially expressed mRNAs were scored in each species and a total of 68 transcripts displaying an altered expression in response to Cr were isolated and sequenced. Public database querying revealed that 44.1% and 4.4% of the cloned ESTs score significant similarity with genes encoding proteins having known or putative function, or with genes coding for unknown proteins, respectively, whereas the remaining 51.5% did not retrieve any homology. Semi-quantitative RT-PCR analysis of seven candidate genes fully confirmed the expression patterns obtained by cDNA-AFLP. Our results indicate the existence of common mechanisms of gene regulation in response to Cr, pathogen attack and senescence-mediated programmed cell death, and suggest a role for the genes isolated in the cross-talk of the signaling pathways governing the adaptation to biotic and abiotic stresses.
Molecular cloning of the potato Gro1-4 gene conferring resistance to pathotype Ro1 of the root cyst nematode Globodera rostochiensis, based on a candidate gene approach.

PubMed

Paal, Jürgen; Henselewski, Heike; Muth, Jost; Meksem, Khalid; Menéndez, Cristina M; Salamini, Francesco; Ballvora, Agim; Gebhardt, Christiane

2004-04-01

The endoparasitic root cyst nematode Globodera rostochiensis causes considerable damage in potato cultivation. In the past, major genes for nematode resistance have been introgressed from related potato species into cultivars. Elucidating the molecular basis of resistance will contribute to the understanding of nematode-plant interactions and assist in breeding nematode-resistant cultivars. The Gro1 resistance locus to G. rostochiensis on potato chromosome VII co-localized with a resistance-gene-like (RGL) DNA marker. This marker was used to isolate from genomic libraries 15 members of a closely related candidate gene family. Analysis of inheritance, linkage mapping, and sequencing reduced the number of candidate genes to three. Complementation analysis by stable potato transformation showed that the gene Gro1-4 conferred resistance to G. rostochiensis pathotype Ro1. Gro1-4 encodes a protein of 1136 amino acids that contains Toll-interleukin 1 receptor (TIR), nucleotide-binding (NB), leucine-rich repeat (LRR) homology domains and a C-terminal domain with unknown function. The deduced Gro1-4 protein differed by 29 amino acid changes from susceptible members of the Gro1 gene family. Sequence characterization of 13 members of the Gro1 gene family revealed putative regulatory elements and a variable microsatellite in the promoter region, insertion of a retrotransposon-like element in the first intron, and a stop codon in the NB coding region of some genes. Sequence analysis of RT-PCR products showed that Gro1-4 is expressed, among other members of the family including putative pseudogenes, in non-infected roots of nematode-resistant plants. RT-PCR also demonstrated that members of the Gro1 gene family are expressed in most potato tissues.
NOVEL EPIGENETIC CHANGES IN CDKN2A ARE ASSOCIATED WITH PROGRESSION OF CERVICAL INTRAEPITHELIAL NEOPLASIA

PubMed Central

Wijetunga, N. Ari; Belbin, Thomas J.; Burk, Robert D.; Whitney, Kathleen; Abadi, Maria; Greally, John M.; Einstein, Mark H.; Schlecht, Nicolas F.

2016-01-01

Objective: To conduct a comprehensive mapping of the genomic DNA methylation in CDKN2A, which codes for the p16INK4A and p14ARF proteins, and 14 of the most promising DNA methylation marker candidates previously reported to be associated with progression of low-grade cervical intraepithelial neoplasia (CIN1) to cervical cancer. Methods: We analyzed DNA methylation in 68 HIV-seropositive and negative women with incident CIN1, CIN2, CIN3 and invasive cervical cancer, assaying 120 CpG dinucleotide sites spanning APC, CDH1, CDH13, CDKN2A, CDKN2B, DAPK1, FHIT, GSTP1, HIC1, MGMT, MLH1, RARB, RASSF1, TERT and TIMP3 using the Illumina Infinium array. Validation was performed using high resolution mapping of the target genes with HELP-tagging for 286 CpGs, followed by fine mapping of candidate genes with targeted bisulfite sequencing. We assessed for statistical differences in DNA methylation levels for each CpG locus assayed using univariate and multivariate methods correcting for multiple comparisons. Results: In our discovery sample set, we identified dose dependent differences in DNA methylation with grade of disease in CDKN2A, APC, MGMT, MLH1 and HIC1, whereas single CpG locus differences between CIN2/3 and cancer groups were seen for CDH13, DAPK1 and TERT. Only those CpGs in the gene body of CDKN2A showed a monotonic increase in methylation between persistent CIN1, CIN2, CIN3 and cancers. Conclusion: Our data suggest a novel link between early cervical disease progression and DNA methylation in a region downstream of the CDKN2A transcription start site that may lead to increased p16INK4A/p14ARF expression prior to development of malignant disease. PMID:27401842
Novel epigenetic changes in CDKN2A are associated with progression of cervical intraepithelial neoplasia.

PubMed

Wijetunga, N Ari; Belbin, Thomas J; Burk, Robert D; Whitney, Kathleen; Abadi, Maria; Greally, John M; Einstein, Mark H; Schlecht, Nicolas F

2016-09-01

To conduct a comprehensive mapping of the genomic DNA methylation in CDKN2A, which codes for the p16(INK4A) and p14(ARF) proteins, and 14 of the most promising DNA methylation marker candidates previously reported to be associated with progression of low-grade cervical intraepithelial neoplasia (CIN1) to cervical cancer. We analyzed DNA methylation in 68 HIV-seropositive and negative women with incident CIN1, CIN2, CIN3 and invasive cervical cancer, assaying 120 CpG dinucleotide sites spanning APC, CDH1, CDH13, CDKN2A, CDKN2B, DAPK1, FHIT, GSTP1, HIC1, MGMT, MLH1, RARB, RASSF1, TERT and TIMP3 using the Illumina Infinium array. Validation was performed using high resolution mapping of the target genes with HELP-tagging for 286 CpGs, followed by fine mapping of candidate genes with targeted bisulfite sequencing. We assessed for statistical differences in DNA methylation levels for each CpG locus assayed using univariate and multivariate methods correcting for multiple comparisons. In our discovery sample set, we identified dose dependent differences in DNA methylation with grade of disease in CDKN2A, APC, MGMT, MLH1 and HIC1, whereas single CpG locus differences between CIN2/3 and cancer groups were seen for CDH13, DAPK1 and TERT. Only those CpGs in the gene body of CDKN2A showed a monotonic increase in methylation between persistent CIN1, CIN2, CIN3 and cancers. Our data suggest a novel link between early cervical disease progression and DNA methylation in a region downstream of the CDKN2A transcription start site that may lead to increased p16(INK4A)/p14(ARF) expression prior to development of malignant disease. Copyright © 2016 Elsevier Inc. All rights reserved.
Interest of proviral HIV-1 DNA genotypic resistance testing in virologically suppressed patients candidate for maintenance therapy.

PubMed

Allavena, C; Rodallec, A; Leplat, A; Hall, N; Luco, C; Le Guen, L; Bernaud, C; Bouchez, S; André-Garnier, E; Boutoille, D; Ferré, V; Raffi, F

2018-01-01

Switching antiretroviral therapy in virologically suppressed HIV-infected patients is frequent, whether to prevent toxicities or for simplification or convenience. Pretherapeutic genotypic resistance testing on RNA can be lacking in some patients, which could enhance the risk of virologic failure if resistance-associated mutations affecting the new regimen are not taken into account. Proviral DNA resistance testing in 69 virologically suppressed patients on antiretroviral treatment with no history of virological failure was compared pairwise with pre-ART plasma RNA resistance testing. The median time between plasma (RNA testing) and whole blood (proviral DNA testing) was 47 months (IQR 29-63). A stop codon was evidenced in 23% (16/69) of proviral DNA sequences; these strains were considered defective and non-replicative, and were not taken into consideration. Within the non-defective strains, the concordance rate between plasma RNA and non-defective proviral DNA was high on both the protease (194/220 concordant resistance-associated mutations=88%) and reverse transcriptase (28/37 concordant resistance-associated mutations=76%) genes. This study supports that proviral DNA testing might be an informative tool before switching antiretrovirals in virologically suppressed patients with no history of virological failure, but the interpretation should be restricted to non-defective viruses. Copyright © 2017 Elsevier B.V. All rights reserved.
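The percentages quoted in this abstract follow directly from the reported counts. A minimal Python check, using only the numbers given above (the labels and the helper name are illustrative), is:

```python
def percent(part, total):
    """Percentage of `part` in `total`, rounded to the nearest integer."""
    return round(100 * part / total)


# Counts as reported in the abstract.
counts = {
    "defective proviral sequences": (16, 69),
    "protease concordance": (194, 220),
    "reverse transcriptase concordance": (28, 37),
}

for label, (part, total) in counts.items():
    print(f"{label}: {part}/{total} = {percent(part, total)}%")
```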
Identification of the PLA2G6 c.1579G>A Missense Mutation in Papillon Dog Neuroaxonal Dystrophy Using Whole Exome Sequencing Analysis

PubMed Central

Tsuboi, Masaya; Watanabe, Manabu; Nibe, Kazumi; Yoshimi, Natsuko; Kato, Akihisa; Sakaguchi, Masahiro; Yamato, Osamu; Tanaka, Miyuu; Kuwamura, Mitsuru; Kushida, Kazuya; Harada, Tomoyuki; Chambers, James Kenn; Sugano, Sumio; Uchida, Kazuyuki; Nakayama, Hiroyuki

2017-01-01

Whole exome sequencing (WES) has become a common tool for identifying genetic causes of human inherited disorders, and it has also recently been applied to canine genome research. We conducted WES analysis of neuroaxonal dystrophy (NAD), a neurodegenerative disease that sporadically occurs worldwide in Papillon dogs. The disease is considered an autosomal recessive monogenic disease, which is histopathologically characterized by severe axonal swelling, known as “spheroids,” throughout the nervous system. By sequencing all eleven DNA samples from one NAD-affected Papillon dog and her parents, two unrelated NAD-affected Papillon dogs, and six unaffected control Papillon dogs, we identified 10 candidate mutations. Among them, three candidates were determined to be “deleterious” by in silico pathogenesis evaluation. By subsequent massive screening by TaqMan genotyping analysis, only the PLA2G6 c.1579G>A mutation had an association with the presence or absence of the disease, suggesting that it may be a causal mutation of canine NAD. As a human homologue of this gene is a causative gene for infantile neuroaxonal dystrophy, this canine phenotype may serve as a good animal model for human disease. The results of this study also indicate that WES analysis is a powerful tool for exploring canine hereditary diseases, especially in rare monogenic hereditary diseases. PMID:28107443
Genomewide investigation of adaptation to harmful algal blooms in common bottlenose dolphins (Tursiops truncatus).

PubMed

Cammen, Kristina M; Schultz, Thomas F; Rosel, Patricia E; Wells, Randall S; Read, Andrew J

2015-09-01

Harmful algal blooms (HABs), which can be lethal in marine species and cause illness in humans, are increasing worldwide. In the Gulf of Mexico, HABs of Karenia brevis produce neurotoxic brevetoxins that cause large-scale marine mortality events. The long history of such blooms, combined with the potentially severe effects of exposure, may have produced a strong selective pressure for evolved resistance. Advances in next-generation sequencing, in particular genotyping-by-sequencing, greatly enable the genomic study of such adaptation in natural populations. We used restriction site-associated DNA (RAD) sequencing to investigate brevetoxicosis resistance in common bottlenose dolphins (Tursiops truncatus). To improve our understanding of the epidemiology and aetiology of brevetoxicosis and the potential for evolved resistance in an upper trophic level predator, we sequenced pools of genomic DNA from dolphins sampled from both coastal and estuarine populations in Florida and during multiple HAB-associated mortality events. We sequenced 129 594 RAD loci and analysed 7431 single nucleotide polymorphisms (SNPs). The allele frequencies of many of these polymorphic loci differed significantly between live and dead dolphins. Some loci associated with survival showed patterns suggesting a common genetic-based mechanism of resistance to brevetoxins in bottlenose dolphins along the Gulf coast of Florida, but others suggested regionally specific mechanisms of resistance or reflected differences among HABs. We identified candidate genes that may be the evolutionary target for brevetoxin resistance by searching the dolphin genome for genes adjacent to survival-associated SNPs. © 2015 John Wiley & Sons Ltd.
Sox2 regulatory region 2 sequence works as a DNA nuclear targeting sequence enhancing the efficiency of an exogenous gene expression in ES cells

DOE Office of Scientific and Technical Information (OSTI.GOV)

Funabashi, Hisakage; Takatsu, Makoto; Saito, Mikako

2010-10-01

Research highlights: → SV40-DTS worked as a DTS in ES cells as well as other types of cells. → Sox2 regulatory region 2 worked as a DTS in ES cells and thus was termed SRR2-DTS. → SRR2-DTS was suggested as an ES cell-specific DTS. Abstract: In this report, the effects of two DNA nuclear targeting sequence (DTS) candidates on the gene expression efficiency in ES cells were investigated. Reporter plasmids containing the simian virus 40 (SV40) promoter/enhancer sequence (SV40-DTS), a DTS for various types of cells but not yet reported for ES cells, and the 81 base pairs of Sox2 regulatory region 2 (SRR2) where two transcriptional factors in ES cells, Oct3/4 and Sox2, are bound (SRR2-DTS), were introduced into the cytoplasm of living cells by femtoinjection. The gene expression efficiencies of each plasmid in mouse insulinoma cell line MIN6 cells and mouse ES cells were then evaluated. Plasmids including SV40-DTS and SRR2-DTS exhibited higher gene expression efficiency compared to plasmids without these DTSs, and thus it was concluded that both sequences work as a DTS in ES cells. In addition, it was suggested that SRR2-DTS works as an ES cell-specific DTS. To the best of our knowledge, this is the first report to confirm the function of DTSs in ES cells.
Genetic structure of the mating-type locus of Chlamydomonas reinhardtii.

PubMed Central

Ferris, Patrick J; Armbrust, E Virginia; Goodenough, Ursula W

2002-01-01

Portions of the cloned mating-type (MT) loci (mt(+) and mt(-)) of Chlamydomonas reinhardtii, defined as the approximately 1-Mb domains of linkage group VI that are under recombinational suppression, were subjected to Northern analysis to elucidate their coding capacity. The four central rearranged segments of the loci were found to contain both housekeeping genes (expressed during several life-cycle stages) and mating-related genes, while the sequences unique to mt(+) or mt(-) carried genes expressed only in the gametic or zygotic phases of the life cycle. One of these genes, Mtd1, is a candidate participant in gametic cell fusion; two others, Mta1 and Ezy2, are candidate participants in the uniparental inheritance of chloroplast DNA. The identified housekeeping genes include Pdk, encoding pyruvate dehydrogenase kinase, and GdcH, encoding glycine decarboxylase complex subunit H. Unusual genetic configurations include three genes whose sequences overlap, one gene that has inserted into the coding region of another, several genes that have been inactivated by rearrangements in the region, and genes that have undergone tandem duplication. This report extends our original conclusion that the MT locus has incurred high levels of mutational change. PMID:11805055
Genes for seed longevity in barley identified by genomic analysis on Near Isogenic Lines.

PubMed

Wozny, Dorothee; Kramer, Katharina; Finkemeier, Iris; Acosta, Ivan F; Koornneef, Maarten

2018-05-09

Genes controlling differences in seed longevity between two barley (Hordeum vulgare) accessions were identified by combining quantitative genetics 'omics' technologies in Near Isogenic Lines (NILs). The NILs were derived from crosses between the spring barley landraces L94 from Ethiopia and Cebada Capa from Argentina. A combined transcriptome and proteome analysis on mature, non-aged seeds of the two parental lines and the L94 NILs by RNA-sequencing and total seed proteomic profiling identified the UDP-glycosyltransferase MLOC_11661.1 as candidate gene for the QTL on 2H, and the NADP-dependent malic enzyme (NADP-ME) MLOC_35785.1 as possible downstream target gene. To validate these candidates, they were expressed in Arabidopsis under the control of constitutive promoters to attempt complementing the T-DNA knock-out line nadp-me1. Both the NADP-ME MLOC_35785.1 and the UDP-glycosyltransferase MLOC_11661.1 were able to rescue the nadp-me1 seed longevity phenotype. In the case of the UDP-glycosyltransferase, with high accumulation in NILs, only the coding sequence of Cebada Capa had a rescue effect. This article is protected by copyright. All rights reserved.
Differentially Private Frequent Sequence Mining via Sampling-based Candidate Pruning

PubMed Central

Xu, Shengzhi; Cheng, Xiang; Li, Zhengyi; Xiong, Li

2016-01-01

In this paper, we study the problem of mining frequent sequences under the rigorous differential privacy model. We explore the possibility of designing a differentially private frequent sequence mining (FSM) algorithm which can achieve both high data utility and a high degree of privacy. We found, in differentially private FSM, the amount of required noise is proportionate to the number of candidate sequences. If we could effectively reduce the number of unpromising candidate sequences, the utility and privacy tradeoff can be significantly improved. To this end, by leveraging a sampling-based candidate pruning technique, we propose a novel differentially private FSM algorithm, which is referred to as PFS2. The core of our algorithm is to utilize sample databases to further prune the candidate sequences generated based on the downward closure property. In particular, we use the noisy local support of candidate sequences in the sample databases to estimate which sequences are potentially frequent. To improve the accuracy of such private estimations, a sequence shrinking method is proposed to enforce the length constraint on the sample databases. Moreover, to decrease the probability of misestimating frequent sequences as infrequent, a threshold relaxation method is proposed to relax the user-specified threshold for the sample databases. Through formal privacy analysis, we show that our PFS2 algorithm is ε-differentially private. Extensive experiments on real datasets illustrate that our PFS2 algorithm can privately find frequent sequences with high accuracy. PMID:26973430
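Two ideas from this abstract lend themselves to a short illustration: candidate generation under the downward closure property, and noise that grows with the number of candidates. The Python sketch below is not the PFS2 algorithm (it has no sampling, sequence shrinking, or threshold relaxation); all function names are hypothetical, and it assumes that one record can change each candidate's support by at most one, so the Laplace scale is taken as the number of candidates divided by ε.

```python
import random


def is_subsequence(pattern, sequence):
    """True if `pattern` occurs in `sequence` in order (gaps allowed)."""
    it = iter(sequence)
    return all(item in it for item in pattern)


def support(pattern, database):
    """Number of database sequences that contain `pattern`."""
    return sum(is_subsequence(pattern, seq) for seq in database)


def generate_candidates(frequent_prev):
    """Join frequent (k-1)-sequences into k-sequences, keeping only those
    whose every (k-1)-subsequence is frequent (downward closure)."""
    prev = set(frequent_prev)
    joined = {a + b[-1:] for a in prev for b in prev if a[1:] == b[:-1]}
    return sorted(c for c in joined
                  if all(c[:i] + c[i + 1:] in prev for i in range(len(c))))


def laplace(scale, rng):
    """Laplace(0, scale) sample: difference of two i.i.d. exponentials."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def noisy_supports(candidates, database, epsilon, rng):
    """Perturb each support; the noise scale grows with the candidate count
    (assumed sensitivity: one unit per candidate)."""
    scale = len(candidates) / epsilon
    return {c: support(c, database) + laplace(scale, rng) for c in candidates}


# Toy data, invented for illustration only.
database = [("a", "b", "c"), ("a", "c"), ("b", "a")]
candidates = generate_candidates([("a", "b"), ("b", "c"), ("a", "c")])
print(candidates)
print(noisy_supports(candidates, database, epsilon=1.0, rng=random.Random(0)))
```

Because the noise scale is proportional to the number of candidates, pruning unpromising candidates before perturbation reduces the noise added to every remaining support, which is the trade-off the abstract describes.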
DNA methylome profiling identifies novel methylated genes in African American patients with colorectal neoplasia.

PubMed

Ashktorab, Hassan; Daremipouran, M; Goel, Ajay; Varma, Sudhir; Leavitt, R; Sun, Xueguang; Brim, Hassan

2014-04-01

The identification of genes that are differentially methylated in colorectal cancer (CRC) has potential value for both diagnostic and therapeutic interventions specifically in high-risk populations such as African Americans (AAs). However, DNA methylation patterns in CRC, especially in AAs, have not been systematically explored and remain poorly understood. Here, we performed DNA methylome profiling to identify the methylation status of CpG islands within candidate genes involved in critical pathways important in the initiation and development of CRC. We used reduced representation bisulfite sequencing (RRBS) in colorectal cancer and adenoma tissues that were compared with DNA methylome from a healthy AA subject's colon tissue and peripheral blood DNA. The identified methylation markers were validated in fresh frozen CRC tissues and corresponding normal tissues from AA patients diagnosed with CRC at Howard University Hospital. We identified and validated the methylation status of 355 CpG sites located within 16 gene promoter regions associated with CpG islands. Fifty CpG sites located within CpG islands, in genes ATXN7L1 (2), BMP3 (7), EID3 (15), GAS7 (1), GPR75 (24), and TNFAIP2 (1), were significantly hypermethylated in tumor vs. normal tissues (P<0.05). The methylation status of BMP3, EID3, GAS7, and GPR75 was confirmed in an independent validation cohort. Ingenuity pathway analysis mapped three of these markers (GAS7, BMP3 and GPR) in the insulin and TGF-β1 network, the two key pathways in CRC. In addition to hypermethylated genes, our analysis also revealed that LINE-1 repeat elements were progressively hypomethylated in the normal-adenoma-cancer sequence. We conclude that DNA methylome profiling based on RRBS is an effective method for screening aberrantly methylated genes in CRC. While previous studies focused on the limited identification of hypermethylated genes, ours is the first study to systematically and comprehensively identify novel hypermethylated genes, as well as hypomethylated LINE-1 sequences, which may serve as potential biomarkers for CRC in African Americans. Our discovered biomarkers were intimately linked to the insulin/TGF-β1 pathway, further strengthening the association of diabetic disorders with colon oncogenic transformation.
Intra- and inter-isolate variation of ribosomal and protein-coding genes in Pleurotus: implications for molecular identification and phylogeny on fungal groups.

PubMed

He, Xiao-Lan; Li, Qian; Peng, Wei-Hong; Zhou, Jie; Cao, Xue-Lian; Wang, Di; Huang, Zhong-Qian; Tan, Wei; Li, Yu; Gan, Bing-Cheng

2017-06-26

The internal transcribed spacer (ITS), RNA polymerase II second largest subunit (RPB2), and elongation factor 1-alpha (EF1α) are often used in fungal taxonomy and phylogenetic analysis. An ideal molecular marker for molecular identification and phylogenetic studies is homogeneous within species, with interspecific variation exceeding intraspecific variation. However, while sequencing ITS, RPB2, and EF1α in Pleurotus spp., we found that intra-isolate sequence polymorphism might be present in these genes because direct sequencing of PCR products failed in some isolates. Therefore, in this study we examined intra- and inter-isolate variation of the three genes in Pleurotus by polymerase chain reaction amplification and cloning. Results showed that intra-isolate variation of ITS was not uncommon but the polymorphic level in each isolate was relatively low in Pleurotus; intra-isolate variations of EF1α and RPB2 sequences were present in an unexpectedly high amount. The polymorphism level differed significantly between ITS, RPB2, and EF1α in the same individual, and the intra-isolate heterogeneity level of each gene varied between isolates within the same species. Intra-isolate and intraspecific variation of ITS in the tested isolates was less than interspecific variation, and intra-isolate and intraspecific variation of RPB2 was probably equal to interspecific divergence. Meanwhile, intra-isolate and intraspecific variation of EF1α could exceed interspecific divergence. These findings suggested that RPB2 and EF1α are not desirable barcoding candidates for Pleurotus. We also discussed why rDNA and protein-coding genes showed variants within a single isolate in Pleurotus, but this question must be addressed in further research. Our study demonstrated that intra-isolate variation of ribosomal and protein-coding genes is likely widespread in fungi. 
This has implications for studies on fungal evolution, taxonomy, phylogenetics, and population genetics. More extensive sampling of these genes and other candidates will be required to ensure reliability as phylogenetic markers and DNA barcodes.

A High-Throughput Process for the Solid-Phase Purification of Synthetic DNA Sequences

PubMed Central

Grajkowski, Andrzej; Cieślak, Jacek; Beaucage, Serge L.

2017-01-01

An efficient process for the purification of synthetic phosphorothioate and native DNA sequences is presented. The process is based on the use of an aminopropylated silica gel support functionalized with aminooxyalkyl functions to enable capture of DNA sequences through an oximation reaction with the keto function of a linker conjugated to the 5′-terminus of DNA sequences. Deoxyribonucleoside phosphoramidites carrying this linker, as a 5′-hydroxyl protecting group, have been synthesized for incorporation into DNA sequences during the last coupling step of a standard solid-phase synthesis protocol executed on a controlled pore glass (CPG) support. Solid-phase capture of the nucleobase- and phosphate-deprotected DNA sequences released from the CPG support is demonstrated to proceed near quantitatively. Shorter than full-length DNA sequences are first washed away from the capture support; the solid-phase purified DNA sequences are then released from this support upon reaction with tetra-n-butylammonium fluoride in dry dimethylsulfoxide (DMSO) and precipitated in tetrahydrofuran (THF). The purity of solid-phase-purified DNA sequences exceeds 98%. The simulated high-throughput and scalability features of the solid-phase purification process are demonstrated without sacrificing purity of the DNA sequences. PMID:28628204
Using microarrays to identify positional candidate genes for QTL: the case study of ACTH response in pigs.

PubMed

Jouffe, Vincent; Rowe, Suzanne; Liaubet, Laurence; Buitenhuis, Bart; Hornshøj, Henrik; SanCristobal, Magali; Mormède, Pierre; de Koning, D J

2009-07-16

Microarray studies can supplement QTL studies by suggesting potential candidate genes in the QTL regions, which by themselves are too large to provide a limited selection of candidate genes. Here we provide a case study where we explore ways to integrate QTL data and microarray data for the pig, which has only a partial genome sequence. We outline various procedures to localize differentially expressed genes on the pig genome and link this with information on published QTL. The starting point is a set of 237 differentially expressed cDNA clones in adrenal tissue from two pig breeds, before and after treatment with adrenocorticotropic hormone (ACTH). Different approaches to localize the differentially expressed (DE) genes to the pig genome showed different levels of success and a clear lack of concordance for some genes between the various approaches. For a focused analysis on 12 genes, overlapping QTL from the public domain were presented. Also, differentially expressed genes underlying QTL for ACTH response were described. Using the latest version of the draft sequence, the differentially expressed genes were mapped to the pig genome. This enabled co-location of DE genes and previously studied QTL regions, but the draft genome sequence is still incomplete and will contain many errors. A further step to explore links between DE genes and QTL at the pathway level was largely unsuccessful due to the lack of annotation of the pig genome. This could be improved by further comparative mapping analyses but this would be time consuming. This paper provides a case study for the integration of QTL data and microarray data for a species with limited genome sequence information and annotation. The results illustrate the challenges that must be addressed but also provide a roadmap for future work that is applicable to other non-model species.
The genome sequence and effector complement of the flax rust pathogen Melampsora lini.

PubMed

Nemri, Adnane; Saunders, Diane G O; Anderson, Claire; Upadhyaya, Narayana M; Win, Joe; Lawrence, Gregory J; Jones, David A; Kamoun, Sophien; Ellis, Jeffrey G; Dodds, Peter N

2014-01-01

Rust fungi cause serious yield reductions on crops, including wheat, barley, soybean and coffee, and represent real threats to global food security. Of these fungi, the flax rust pathogen Melampsora lini has been developed most extensively over the past 80 years as a model to understand the molecular mechanisms that underpin pathogenesis. During infection, M. lini secretes virulence effectors to promote disease. The number of these effectors, their function and their degree of conservation across rust fungal species are unknown. To assess this, we sequenced and assembled de novo the genome of M. lini isolate CH5 into 21,130 scaffolds spanning 189 Mbp (scaffold N50 of 31 kbp). Global analysis of the DNA sequence revealed that repetitive elements, primarily retrotransposons, make up at least 45% of the genome. Using ab initio predictions, transcriptome data and homology searches, we identified 16,271 putative protein-coding genes. An analysis pipeline was then implemented to predict the effector complement of M. lini and compare it to that of the poplar rust, wheat stem rust and wheat stripe rust pathogens to identify conserved and species-specific effector candidates. Previous knowledge of four cloned M. lini avirulence effector proteins and two basidiomycete effectors was used to optimize parameters of the effector prediction pipeline. Markov clustering based on sequence similarity was performed to group effector candidates from all four rust pathogens. Clusters containing at least one member from M. lini were further analyzed and prioritized based on features including expression in isolated haustoria and infected leaf tissue and conservation across rust species. Herein, we describe 200 of 940 clusters that ranked highest on our priority list, representing 725 flax rust candidate effectors. 
Our findings on this important model rust species provide insight into how effectors of rust fungi are conserved across species and how they may act to promote infection on their hosts.
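The scaffold N50 quoted in the abstract above is a standard assembly statistic: the length L such that scaffolds of length L or longer together contain at least half of the assembled bases. A minimal sketch of that calculation follows; the function name and the toy scaffold lengths are illustrative assumptions and are not taken from the M. lini assembly.

```python
def n50(lengths):
    """Return the N50 of a collection of scaffold lengths (in bp)."""
    if not lengths:
        raise ValueError("need at least one scaffold length")
    half_total = sum(lengths) / 2
    running = 0
    # Walk from the longest scaffold down until half of all bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running >= half_total:
            return length


# Hypothetical toy assembly, not real data.
toy_scaffolds = [80, 70, 50, 40, 30, 20]
print(n50(toy_scaffolds))
```

With these toy values the total is 290 bp, so the walk stops at the second-longest scaffold, whose cumulative sum (150 bp) first reaches half of the total.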
A novel deletion/insertion mutation in the mRNA transcribed from one α1(I) collagen allele in a family with dominant type III OI and germline mosaicism

DOE Office of Scientific and Technical Information (OSTI.GOV)

Wang, O.; Masters, C.; Lewis, M.B.

1994-09-01

In an 8-year-old girl and her father, both of whom have severe type III OI, we have previously used RNA/RNA hybrid analysis to demonstrate a mismatch in the region of α1(I) mRNA coding for aa 558-861. We used SSCP to further localize the abnormality to a subregion coding for aa 579-679. This region was subcloned and sequenced. Each patient's cDNA has a deletion of the sequences coding for the last residue of exon 34, and all of exons 35 and 36 (aa 604-639), followed by an insertion of 156 nt from the 3′-end of intron 36. PCR amplification of leukocyte DNA from the patients and the clinically normal paternal grandmother yielded two fragments: a 1007 bp fragment predicted from normal genomic sequences and a 445 bp fragment. Subcloning and sequencing of the shorter genomic PCR product confirmed the presence of a 565 bp genomic deletion from the end of exon 34 to the middle of intron 36. The abnormal protein is apparently synthesized and incorporated into helix. The inserted nucleotides are in frame with the collagenous sequence and contain no stop codons. They encode a 52 aa non-collagenous region. The fibroblast procollagen of the patients has both normal and electrophoretically delayed proα(I) bands. The electrophoretically delayed procollagen is very sensitive to pepsin or trypsin digestion, as predicted by its non-collagenous sequence, and cannot be visualized as collagen. This unique OI collagen mutation is an excellent candidate for molecular targeting to “turn off” a dominant mutant allele.
DNA Sequence Polymorphism of the Lactate Dehydrogenase Gene from Iranian Plasmodium vivax and Plasmodium falciparum Isolates.

PubMed

Getacher Feleke, Daniel; Nateghpour, Mehdi; Motevalli Haghi, Afsaneh; Hajjaran, Homa; Farivar, Leila; Mohebali, Mehdi; Raoofian, Reza

2015-01-01

Parasite lactate dehydrogenase (pLDH) is extensively employed in malaria rapid diagnostic tests (RDTs). Moreover, it is a well-known drug target candidate. However, the genetic diversity of this gene might influence performance of RDT kits and its drug target candidacy. This study aimed to determine polymorphism of pLDH gene from Iranian isolates of P. vivax and P. falciparum. Genomic DNA was extracted from whole blood of microscopically confirmed P. vivax and P. falciparum infected patients. pLDH gene of P. falciparum and P. vivax was amplified using conventional PCR from 43 symptomatic malaria patients from Sistan and Baluchistan Province, Southeast Iran from 2012 to 2013. Sequence analysis of 15 P. vivax LDH sequences showed fourteen had 100% identity with P. vivax Sal-1 and Belem strains. Two nucleotide substitutions were detected, only one of which resulted in an amino acid change. Analysis of P. falciparum LDH sequences showed six of the seven sequences had 100% homology with P. falciparum 3D7 and Mzr-1. Moreover, PfLDH displayed three nucleotide changes that resulted in changing only one amino acid. PvLDH and PfLDH showed 75%-76% nucleotide and 90.4%-90.76% amino acid homology. pLDH gene from Iranian P. falciparum and P. vivax isolates displayed 98.8-100% homology with 1-3 nucleotide substitutions. This indicated this gene was relatively conserved. Additional studies are needed to determine whether this genetic variation can influence the performance of pLDH-based RDTs.
An improved model for whole genome phylogenetic analysis by Fourier transform.

PubMed

Yin, Changchuan; Yau, Stephen S-T

2015-10-07

DNA sequence similarity comparison is one of the major steps in computational phylogenetic studies. The sequence comparison of closely related DNA sequences and genomes is usually performed by multiple sequence alignments (MSA). While the MSA method is accurate for some types of sequences, it may produce incorrect results when DNA sequences have undergone rearrangements, as in many bacterial and viral genomes. It is also limited by its computational complexity for comparing large volumes of data. Previously, we proposed an alignment-free method that exploits the full information contents of DNA sequences by Discrete Fourier Transform (DFT), but still with some limitations. Here, we present a significantly improved method for the similarity comparison of DNA sequences by DFT. In this method, we map DNA sequences into 2-dimensional (2D) numerical sequences and then apply DFT to transform the 2D numerical sequences into frequency domain. In the 2D mapping, the nucleotide composition of a DNA sequence is a determinant factor and the 2D mapping reduces the nucleotide composition bias in distance measure, thus improving the similarity measure of DNA sequences. To compare the DFT power spectra of DNA sequences with different lengths, we propose an improved even scaling algorithm to extend shorter DFT power spectra to the longest length of the underlying sequences. After the DFT power spectra are evenly scaled, the spectra are in the same dimensionality of the Fourier frequency space, and the Euclidean distances of full Fourier power spectra of the DNA sequences are then used as the dissimilarity metrics. The improved DFT method, with increased computational performance by 2D numerical representation, can be applicable to any DNA sequences of different length ranges. We assess the accuracy of the improved DFT similarity measure in hierarchical clustering of different DNA sequences including simulated and real datasets. 
The method yields accurate and reliable phylogenetic trees and demonstrates that the improved DFT dissimilarity measure is an efficient and effective similarity measure of DNA sequences. Due to its high efficiency and accuracy, the proposed DFT similarity measure is successfully applied on phylogenetic analysis for individual genes and large whole bacterial genomes. Copyright © 2015 Elsevier Ltd. All rights reserved.
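The pipeline described in the abstract above (numerical mapping, DFT power spectrum, scaling of shorter spectra, Euclidean distance) can be sketched roughly as follows. This is only an illustration under stated assumptions: the complex-valued mapping in `MAPPING` is a hypothetical 2D encoding, and plain linear interpolation is used as a stand-in for the authors' even scaling algorithm, whose exact form is not given in the abstract.

```python
import cmath
import math

# Hypothetical 2D mapping (one axis per complex component); not the authors' mapping.
MAPPING = {"A": 1 + 0j, "T": -1 + 0j, "C": 1j, "G": -1j}


def power_spectrum(seq):
    """Naive O(n^2) DFT power spectrum of the mapped sequence."""
    x = [MAPPING[base] for base in seq]
    n = len(x)
    spectrum = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        spectrum.append(abs(s) ** 2)
    return spectrum


def stretch(spectrum, m):
    """Stretch a spectrum to length m by linear interpolation (assumed stand-in)."""
    n = len(spectrum)
    if n == m:
        return list(spectrum)
    out = []
    for k in range(m):
        q = k * (n - 1) / (m - 1)          # fractional position in the short spectrum
        lo = int(math.floor(q))
        hi = min(lo + 1, n - 1)
        out.append(spectrum[lo] + (q - lo) * (spectrum[hi] - spectrum[lo]))
    return out


def dft_distance(a, b):
    """Euclidean distance between length-matched power spectra."""
    pa, pb = power_spectrum(a), power_spectrum(b)
    m = max(len(pa), len(pb))
    pa, pb = stretch(pa, m), stretch(pb, m)
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(pa, pb)))


print(dft_distance("ACGTACGT", "AAAAAAAA"))
print(dft_distance("ACGT", "ACGTAC"))
```

A pairwise matrix of such distances could then be passed to any hierarchical clustering routine; for genome-scale input the naive DFT loop would need to be replaced by an FFT.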
TARGET Research Goals

Cancer.gov

TARGET researchers use various sequencing and array-based methods to examine the genomes, transcriptomes, and for some diseases epigenomes of select childhood cancers. This “multi-omic” approach generates a comprehensive profile of molecular alterations for each cancer type. Alterations are changes in DNA or RNA, such as rearrangements in chromosome structure or variations in gene expression, respectively. Through computational analyses and assays to validate biological function, TARGET researchers predict which alterations disrupt the function of a gene or pathway and promote cancer growth, progression, and/or survival. Researchers identify candidate therapeutic targets and/or prognostic markers from the cancer-associated alterations.
Design of nucleic acid sequences for DNA computing based on a thermodynamic approach

PubMed Central

Tanaka, Fumiaki; Kameda, Atsushi; Yamamoto, Masahito; Ohuchi, Azuma

2005-01-01

We have developed an algorithm for designing multiple sequences of nucleic acids that have a uniform melting temperature between the sequence and its complement and that do not hybridize non-specifically with each other based on the minimum free energy (ΔGmin). Sequences that satisfy these constraints can be utilized in computations, various engineering applications such as microarrays, and nano-fabrications. Our algorithm is a random generate-and-test algorithm: it generates a candidate sequence randomly and tests whether the sequence satisfies the constraints. The novelty of our algorithm is that the filtering method uses a greedy search to calculate ΔGmin. This effectively excludes inappropriate sequences before ΔGmin is calculated, thereby reducing computation time drastically when compared with an algorithm without the filtering. Experimental results in silico showed the superiority of the greedy search over the traditional approach based on the Hamming distance. In addition, experimental results in vitro demonstrated that the experimental free energy (ΔGexp) of 126 sequences correlated better with ΔGmin (|R| = 0.90) than with the Hamming distance (|R| = 0.80). These results validate the rationality of a thermodynamic approach. We implemented our algorithm in a graphic user interface-based program written in Java. PMID:15701762
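The random generate-and-test loop described in the abstract above can be illustrated with a small sketch. Note the substitutions, which are assumptions made for brevity: the melting temperature is estimated with the simple Wallace rule (2 °C per A/T, 4 °C per G/C) rather than a thermodynamic model, and cross-hybridization is screened with the Hamming distance, which is the traditional baseline the authors argue against, not their ΔGmin filter. The function and parameter names are hypothetical.

```python
import random

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    return "".join(COMPLEMENT[b] for b in reversed(seq))


def wallace_tm(seq):
    """Rough melting temperature in degrees C by the Wallace rule."""
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def hamming(a, b):
    """Number of mismatching positions between two equal-length sequences."""
    return sum(x != y for x, y in zip(a, b))


def design(n_seqs, length, tm_target, tm_tol, min_dist, seed=0, max_tries=100000):
    """Generate random candidates and keep those that pass both tests."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(max_tries):
        if len(accepted) == n_seqs:
            break
        cand = "".join(rng.choice("ACGT") for _ in range(length))
        # Test 1: uniform melting temperature.
        if abs(wallace_tm(cand) - tm_target) > tm_tol:
            continue
        # Test 2: far enough from every accepted sequence and its reverse complement.
        if all(
            hamming(cand, s) >= min_dist
            and hamming(cand, reverse_complement(s)) >= min_dist
            for s in accepted
        ):
            accepted.append(cand)
    return accepted


for s in design(n_seqs=5, length=20, tm_target=60, tm_tol=2, min_dist=8):
    print(s, wallace_tm(s))
```

Swapping the Hamming test for a free-energy calculation would bring the sketch closer to the thermodynamic approach the abstract describes, at a higher cost per candidate.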
Fine-Mapping the Branching Habit Trait in Cultivated Peanut by Combining Bulked Segregant Analysis and High-Throughput Sequencing

PubMed Central

Kayam, Galya; Brand, Yael; Faigenboim-Doron, Adi; Patil, Abhinandan; Hedvat, Ilan; Hovav, Ran

2017-01-01

The growth habit of lateral shoots (also termed “branching habit”) is an important descriptive and agronomic character of peanut. Yet, both the inheritance of branching habit and the genetic mechanism that controls it in this crop remain unclear. In addition, the low degree of polymorphism among cultivated peanut varieties hinders fine-mapping of this and other traits in non-homozygous genetic structures. Here, we combined high-throughput sequencing with a well-defined genetic system to study these issues in peanut. Initially, segregating F2 populations derived from a reciprocal cross between very closely related Virginia-type peanut cultivars with spreading and bunch growth habits were examined. The spreading/bunch trait was shown to be controlled by a single gene with no cytoplasmic effect. That gene was named Bunch1 and was significantly correlated with pod yield per plant, time to maturation and the ratio of “dead-end” pods. Subsequently, bulked segregant analysis was performed on 52 completely bunch, and 47 completely spreading F3 families. In order to facilitate the process of SNP detection and candidate-gene analysis, the transcriptome was used instead of genomic DNA. Young leaves were sampled and bulked. Reads from Illumina sequencing were aligned against the peanut reference transcriptome and the diploid genomes. Inter-varietal SNPs were detected, scored and quality-filtered. Thirty-four candidate SNPs were found to have a bulk frequency ratio value >10 and 6 of those SNPs were found to be located in the genomic region of linkage group B5. Three best hits from that over-represented region were further analyzed in the segregating population. The trait locus was found to be located in a ~1.1 Mbp segment between markers M875 (B5:145,553,897; 1.9 cM) and M255 (B5:146,649,943; 2.25 cM). The method was validated using a population of recombinant inbred lines of the same cross and a new DNA SNP-array. 
This study demonstrates the relatively straightforward utilization of bulked segregant analysis for trait fine-mapping in the low-polymorphism, heterozygous germplasm of cultivated peanut and provides a baseline for candidate gene discovery and map-based cloning of Bunch1. PMID:28421098
Generation and Analysis of a Large-Scale Expressed Sequence Tag Database from a Full-Length Enriched cDNA Library of Developing Leaves of Gossypium hirsutum L

PubMed Central

Pang, Chaoyou; Fan, Shuli; Song, Meizhen; Yu, Shuxun

2013-01-01

Background: Cotton (Gossypium hirsutum L.) is one of the world’s most economically important crops. However, its entire genome has not been sequenced, and limited resources are available in GenBank for understanding the molecular mechanisms underlying leaf development and senescence. Methodology/Principal Findings: In this study, 9,874 high-quality ESTs were generated from a normalized, full-length cDNA library derived from pooled RNA isolated from throughout leaf development during the plant blooming stage. After clustering and assembly of these ESTs, 5,191 unique sequences, representing 1,652 contigs and 3,539 singletons, were obtained. The average unique sequence length was 682 bp. Annotation of these unique sequences revealed that 84.4% showed significant homology to sequences in the NCBI non-redundant protein database, and 57.3% had significant hits to known proteins in the Swiss-Prot database. Comparative analysis indicated that our library added 2,400 ESTs and 991 unique sequences to those known for cotton. The unigenes were functionally characterized by gene ontology annotation. We identified 1,339 and 200 unigenes as potential leaf senescence-related genes and transcription factors, respectively. Moreover, nine genes related to leaf senescence and eleven MYB transcription factors were randomly selected for quantitative real-time PCR (qRT-PCR), which revealed that these genes were regulated differentially during senescence. The qRT-PCR for three GhYLSs revealed that these genes are expressed preferentially in senescent leaves. Conclusions/Significance: These EST resources will provide valuable sequence information for gene expression profiling analyses and functional genomics studies to elucidate their roles, as well as for studying the mechanisms of leaf development and senescence in cotton and discovering candidate genes related to important agronomic traits of cotton. 
These data will also facilitate future whole-genome sequence assembly and annotation in G. hirsutum and comparative genomics among Gossypium species. PMID:24146870
Ribosomal RNA Genes Contribute to the Formation of Pseudogenes and Junk DNA in the Human Genome.

PubMed

Robicheau, Brent M; Susko, Edward; Harrigan, Amye M; Snyder, Marlene

2017-02-01

Approximately 35% of the human genome can be identified as sequence devoid of a selected-effect function, and not derived from transposable elements or repeated sequences. We provide evidence supporting a known origin for a fraction of this sequence. We show that: 1) highly degraded, but near full length, ribosomal DNA (rDNA) units, including both 45S and Intergenic Spacer (IGS), can be found at multiple sites in the human genome on chromosomes without rDNA arrays, 2) that these rDNA sequences have a propensity for being centromere proximal, and 3) that sequence at all human functional rDNA array ends is divergent from canonical rDNA to the point that it is pseudogenic. We also show that small sequence strings of rDNA (from 45S + IGS) can be found distributed throughout the genome and are identifiable as an "rDNA-like signal", representing 0.26% of the q-arm of HSA21 and ∼2% of the total sequence of other regions tested. The size of sequence strings found in the rDNA-like signal intergrade into the size of sequence strings that make up the full-length degrading rDNA units found scattered throughout the genome. We conclude that the displaced and degrading rDNA sequences are likely of a similar origin but represent different stages in their evolution towards random sequence. Collectively, our data suggests that over vast evolutionary time, rDNA arrays contribute to the production of junk DNA. The concept that the production of rDNA pseudogenes is a by-product of concerted evolution represents a previously under-appreciated process; we demonstrate here its importance. © The Author(s) 2017. Published by Oxford University Press on behalf of the Society for Molecular Biology and Evolution.
Comprehensive evaluation of genome-wide 5-hydroxymethylcytosine profiling approaches in human DNA.

PubMed

Skvortsova, Ksenia; Zotenko, Elena; Luu, Phuc-Loi; Gould, Cathryn M; Nair, Shalima S; Clark, Susan J; Stirzaker, Clare

2017-01-01

The discovery that 5-methylcytosine (5mC) can be oxidized to 5-hydroxymethylcytosine (5hmC) by the ten-eleven translocation (TET) proteins has prompted wide interest in the potential role of 5hmC in reshaping the mammalian DNA methylation landscape. The gold-standard bisulphite conversion technologies to study DNA methylation do not distinguish between 5mC and 5hmC. However, new approaches to mapping 5hmC genome-wide have advanced rapidly, although it is unclear how the different methods compare in accurately calling 5hmC. In this study, we provide a comparative analysis on brain DNA using three 5hmC genome-wide approaches, namely whole-genome bisulphite/oxidative bisulphite sequencing (WG Bis/OxBis-seq), Infinium HumanMethylation450 BeadChip arrays coupled with oxidative bisulphite (HM450K Bis/OxBis) and antibody-based immunoprecipitation and sequencing of hydroxymethylated DNA (hMeDIP-seq). We also perform loci-specific TET-assisted bisulphite sequencing (TAB-seq) for validation of candidate regions. We show that whole-genome single-base resolution approaches are advantaged in providing precise 5hmC values but require high sequencing depth to accurately measure 5hmC, as this modification is commonly in low abundance in mammalian cells. HM450K arrays coupled with oxidative bisulphite provide a cost-effective representation of 5hmC distribution, at CpG sites with 5hmC levels >~10%. However, 5hmC analysis is restricted to the genomic location of the probes, which is an important consideration as 5hmC modification is commonly enriched at enhancer elements. Finally, we show that the widely used hMeDIP-seq method provides an efficient genome-wide profile of 5hmC and shows high correlation with WG Bis/OxBis-seq 5hmC distribution in brain DNA. However, in cell line DNA with low levels of 5hmC, hMeDIP-seq-enriched regions are not detected by WG Bis/OxBis or HM450K, either suggesting misinterpretation of 5hmC calls by hMeDIP or lack of sensitivity of the latter methods. 
We highlight both the advantages and caveats of three commonly used genome-wide 5hmC profiling technologies and show that interpretation of 5hmC data can be significantly influenced by the sensitivity of methods used, especially as the levels of 5hmC are low and vary in different cell types and different genomic locations.
Exclusion of known gene for enamel development in two Brazilian families with amelogenesis imperfecta

PubMed Central

Santos, Maria CLG; Hart, P Suzanne; Ramaswami, Mukundhan; Kanno, Cláudia M; Hart, Thomas C; Line, Sergio RP

2007-01-01

Amelogenesis imperfecta (AI) is a genetically heterogeneous group of diseases that result in defective development of tooth enamel. Mutations in several enamel proteins and proteinases have been associated with AI. The object of this study was to evaluate evidence of etiology for the six major candidate gene loci in two Brazilian families with AI. Genomic DNA was obtained from family members and all exons and exon-intron boundaries of the ENAM, AMBN, AMELX, MMP20, KLK4 and Amelotin gene were amplified and sequenced. Each family was also evaluated for linkage to chromosome regions known to contain genes important in enamel development. The present study indicates that the AI in these two families is not caused by any of the known loci for AI or any of the major candidate genes proposed in the literature. These findings indicate extensive genetic heterogeneity for non-syndromic AI. PMID:17266769
Exclusion of known gene for enamel development in two Brazilian families with amelogenesis imperfecta.

PubMed

Santos, Maria C L G; Hart, P Suzanne; Ramaswami, Mukundhan; Kanno, Cláudia M; Hart, Thomas C; Line, Sergio R P

2007-01-31

Amelogenesis imperfecta (AI) is a genetically heterogeneous group of diseases that result in defective development of tooth enamel. Mutations in several enamel proteins and proteinases have been associated with AI. The object of this study was to evaluate evidence of etiology for the six major candidate gene loci in two Brazilian families with AI. Genomic DNA was obtained from family members and all exons and exon-intron boundaries of the ENAM, AMBN, AMELX, MMP20, KLK4 and Amelotin gene were amplified and sequenced. Each family was also evaluated for linkage to chromosome regions known to contain genes important in enamel development. The present study indicates that the AI in these two families is not caused by any of the known loci for AI or any of the major candidate genes proposed in the literature. These findings indicate extensive genetic heterogeneity for non-syndromic AI.
Representation of DNA sequences in genetic codon context with applications in exon and intron prediction.

PubMed

Yin, Changchuan

2015-04-01

To apply digital signal processing (DSP) methods to analyze DNA sequences, the sequences first must be specially mapped into numerical sequences. Thus, effective numerical mappings of DNA sequences play key roles in the effectiveness of DSP-based methods such as exon prediction. Despite numerous mappings of symbolic DNA sequences to numerical series, the existing mapping methods do not include the genetic coding features of DNA sequences. We present a novel numerical representation of DNA sequences using genetic codon context (GCC) in which the numerical values are optimized by simulated annealing to maximize the 3-periodicity signal to noise ratio (SNR). The optimized GCC representation is then applied in exon and intron prediction by Short-Time Fourier Transform (STFT) approach. The results show the GCC method enhances the SNR values of exon sequences and thus increases the accuracy of predicting protein coding regions in genomes compared with the commonly used 4D binary representation. In addition, this study offers a novel way to reveal specific features of DNA sequences by optimizing numerical mappings of symbolic DNA sequences.
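The 3-periodicity signal that the abstract above relies on can be illustrated with the 4D binary (indicator) representation it mentions as the common baseline; the GCC mapping itself is not specified in the abstract and is not reproduced here. In this sketch the SNR is assumed to be the total power at frequency N/3 divided by the mean power over all non-zero frequencies, and the whole sequence is transformed at once rather than in short-time windows. Function names are hypothetical.

```python
import cmath
import math


def power_at(seq, k):
    """Total DFT power at frequency index k, summed over four indicator sequences."""
    n = len(seq)
    total = 0.0
    for base in "ACGT":
        # The indicator sequence is 1 where seq[t] == base and 0 elsewhere.
        s = sum(
            cmath.exp(-2j * math.pi * k * t / n)
            for t, b in enumerate(seq)
            if b == base
        )
        total += abs(s) ** 2
    return total


def snr_3periodicity(seq):
    """Power at k = N/3 relative to the mean power over k = 1..N-1 (assumed definition)."""
    n = len(seq)
    if n < 3 or n % 3 != 0:
        raise ValueError("sequence length must be a positive multiple of 3")
    peak = power_at(seq, n // 3)
    mean = sum(power_at(seq, k) for k in range(1, n)) / (n - 1)
    if mean < 1e-12:  # homopolymer: no power outside the zero frequency
        return 0.0
    return peak / mean


print(snr_3periodicity("ATG" * 10))   # strongly codon-periodic toy sequence
print(snr_3periodicity("ACGT" * 9))   # period-4 toy sequence, no 3-periodicity
```

A sliding-window version of `snr_3periodicity` along a genomic sequence would give a crude exon indicator in the spirit of the STFT approach the abstract describes.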
Single-cell genomic sequencing using Multiple Displacement Amplification.

PubMed

Lasken, Roger S

2007-10-01

Single microbial cells can now be sequenced using DNA amplified by the Multiple Displacement Amplification (MDA) reaction. The few femtograms of DNA in a bacterium are amplified into micrograms of high molecular weight DNA suitable for DNA library construction and Sanger sequencing. The MDA-generated DNA also performs well when used directly as template for pyrosequencing by the 454 Life Sciences method. While MDA from single cells loses some of the genomic sequence, this approach will greatly accelerate the pace of sequencing from uncultured microbes. The genetically linked sequences from single cells are also a powerful tool to be used in guiding genomic assembly of shotgun sequences of multiple organisms from environmental DNA extracts (metagenomic sequences).
Diversity of the strongly rheophilous tadpoles of Malagasy tree frogs, genus Boophis (Anura, Mantellidae), and identification of new candidate species via larval DNA sequence and morphology

PubMed Central

Randrianiaina, Roger Daniel; Strauß, Axel; Glos, Julian; Vences, Miguel

2012-01-01

This study provides detailed morphological descriptions of previously unknown tadpoles of the treefrog genus Boophis Tschudi and analyses of habitat preferences of several of these tadpoles in Ranomafana National Park. A total of twenty-two tadpoles determined via DNA barcoding are characterized morphologically herein, fourteen of them for the first time. Twelve of these tadpoles belong to taxonomically undescribed candidate species which in several cases are so far only known from their larval stages. Our data show that the larvae of some of these candidate species occur syntopically yet maintain a clearly correlated genetic and morphological identity, suggesting that they indeed are true biological and evolutionary species. Tadpoles considered to belong to the “adherent” ecomorphological guild inhabit fast-running waters and their oral disc is commonly to continuously attached to the rocky substrate, supposedly to keep their position in the water current. Some of these species are characterized by the presence of a dorsal gap of papillae and the absence of an upper jaw sheath. This guild includes the tadpoles of the Boophis albipunctatus group (Boophis ankaratra, Boophis schuboeae, Boophis albipunctatus, Boophis sibilans, Boophis luciae), and of the Boophis mandraka group (Boophis sambirano and six candidate species related to this species and to Boophis mandraka). Tadpoles considered to belong to the “suctorial” guild inhabit fast-running waters where they frequently use their oral disc to attach to the substrate. They have an enlarged oral disc without any dorsal gap, including two nominal species (Boophis marojezensis, Boophis vittatus), and five candidate species related to Boophis marojezensis. 
An ecological analysis of the tadpoles of Boophis luciae, Boophis schuboeae and Boophis marojezensis [Ca51 JQ518198] from Ranomafana National Park did not provide evidence for a clear preference of these tadpoles to the fast flowing microhabitat sections of the stream, although the tadpoles discussed in this study are typically caught in this habitat. PMID:22539880
Characterization of a gene cluster responsible for the biosynthesis of anticancer agent FK228 in Chromobacterium violaceum No. 968.

PubMed

Cheng, Yi-Qiang; Yang, Min; Matter, Andrea M

2007-06-01

A gene cluster responsible for the biosynthesis of anticancer agent FK228 has been identified, cloned, and partially characterized in Chromobacterium violaceum no. 968. First, a genome-scanning approach was applied to identify three distinctive C. violaceum no. 968 genomic DNA clones that code for portions of nonribosomal peptide synthetase and polyketide synthase. Next, a gene replacement system developed originally for Pseudomonas aeruginosa was adapted to inactivate the genomic DNA-associated candidate natural product biosynthetic genes in vivo with high efficiency. Inactivation of a nonribosomal peptide synthetase-encoding gene completely abolished FK228 production in mutant strains. Subsequently, the entire FK228 biosynthetic gene cluster was cloned and sequenced. This gene cluster is predicted to encompass a 36.4-kb DNA region that includes 14 genes. The products of nine biosynthetic genes are proposed to constitute an unusual hybrid nonribosomal peptide synthetase-polyketide synthase-nonribosomal peptide synthetase assembly line including accessory activities for the biosynthesis of FK228. In particular, a putative flavin adenine dinucleotide-dependent pyridine nucleotide-disulfide oxidoreductase is proposed to catalyze disulfide bond formation between two sulfhydryl groups of cysteine residues as the final step in FK228 biosynthesis. Acquisition of the FK228 biosynthetic gene cluster and acclimation of an efficient genetic system should enable genetic engineering of the FK228 biosynthetic pathway in C. violaceum no. 968 for the generation of structural analogs as anticancer drug candidates.
Short interspersed CAN SINE elements as prognostic markers in canine mammary neoplasia.

PubMed

Gelaleti, Gabriela B; Granzotto, Adriana; Leonel, Camila; Jardim, Bruna V; Moschetta, Marina G; Carareto, Claudia M A; Zuccari, Debora Ap P C

2014-01-01

The genome of mammals is characterized by a large number of non-LTR retrotransposons, and among them, the CAN SINEs are characteristic of the canine species. Small amounts of DNA freely circulate in normal blood serum and high amounts are found in human patients with cancer, characterizing it as a candidate tumor-biomarker. The aim of this study was to estimate, through its absolute expression, the number of copies of CAN SINE sequences present in free circulating DNA of female dogs with mammary cancer, in order to correlate it with the clinical and pathological characteristics and the follow-up period. The copy number of CAN SINE sequences was estimated by qPCR in 28 female dogs with mammary neoplasia. The univariate analysis showed an increased number of copies in female dogs >10 years old (p=0.02) and in those with a tumor time >18 months (p<0.05). The Kaplan-Meier test demonstrated a negative correlation between an increased number of copies and survival time (p=0.03). High amounts of CAN SINE fragments can be good markers for the detection of tumor DNA in blood and may characterize it as a marker of poor prognosis, being related to female dogs with shorter survival times. This estimate can be used as a prognostic marker in non-invasive breast cancer research and is useful in predicting tumor progression and patient monitoring.
Transcriptome analysis of Bupleurum chinense focusing on genes involved in the biosynthesis of saikosaponins

PubMed Central

2011-01-01

Abstract Background Bupleurum chinense DC. is a widely used traditional Chinese medicinal plant. Saikosaponins are the major bioactive constituents of B. chinense, but relatively little is known about saikosaponin biosynthesis. The 454 pyrosequencing technology provides a promising opportunity for finding novel genes that participate in plant metabolism. Consequently, this technology may help to identify the candidate genes involved in the saikosaponin biosynthetic pathway. Results One-quarter of the 454 pyrosequencing runs produced a total of 195,088 high-quality reads, with an average read length of 356 bases (NCBI SRA accession SRA039388). A de novo assembly generated 24,037 unique sequences (22,748 contigs and 1,289 singletons), 12,649 (52.6%) of which were annotated against three public protein databases using a basic local alignment search tool (E-value ≤1e-10). All unique sequences were compared with NCBI expressed sequence tags (ESTs) (237) and encoding sequences (44) from the Bupleurum genus, and with a Sanger-sequenced EST dataset (3,111). The 23,173 (96.4%) unique sequences obtained in the present study represent novel Bupleurum genes. The ESTs of genes related to saikosaponin biosynthesis were found to encode known enzymes that catalyze the formation of the saikosaponin backbone; 246 cytochrome P450 (P450s) and 102 glycosyltransferases (GTs) unique sequences were also found in the 454 dataset. Full length cDNAs of 7 P450s and 7 uridine diphosphate GTs (UGTs) were verified by reverse transcriptase polymerase chain reaction or by cloning using 5' and/or 3' rapid amplification of cDNA ends. Two P450s and three UGTs were identified as the most likely candidates involved in saikosaponin biosynthesis. This finding was based on the coordinate up-regulation of their expression with β-AS in methyl jasmonate-treated adventitious roots and on their similar expression patterns with β-AS in various B. chinense tissues.
Conclusions A collection of high-quality ESTs for B. chinense obtained by 454 pyrosequencing is provided here for the first time. These data should aid further research on the functional genomics of B. chinense and other Bupleurum species. The candidate genes for enzymes involved in saikosaponin biosynthesis, especially the P450s and UGTs, that were revealed provide a substantial foundation for follow-up research on the metabolism and regulation of the saikosaponins. PMID:22047182
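The assembly figures quoted in this abstract can be cross-checked with simple arithmetic. The sketch below is not part of the study; it only recomputes the totals and percentages from the numbers given above, and the variable and function names are illustrative.

```python
# Figures quoted in the abstract (variable names are illustrative).
contigs = 22_748
singletons = 1_289
unique_sequences = 24_037
annotated = 12_649
novel = 23_173


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Contigs plus singletons should account for every unique sequence.
total_from_parts = contigs + singletons

print(total_from_parts == unique_sequences)   # True
print(percent(annotated, unique_sequences))   # 52.6
print(percent(novel, unique_sequences))       # 96.4
```

Both percentages match the values reported in the abstract, which suggests they were computed against the 24,037 unique sequences.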

Methionine increases BDNF DNA methylation and improves memory in epilepsy.

PubMed

Parrish, R Ryley; Buckingham, Susan C; Mascia, Katherine L; Johnson, Jarvis J; Matyjasik, Michal M; Lockhart, Roxanne M; Lubin, Farah D

2015-04-01

Temporal lobe epilepsy (TLE) patients exhibit signs of memory impairments even when seizures are pharmacologically controlled. Surprisingly, the underlying molecular mechanisms involved in TLE-associated memory impairments remain elusive. Memory consolidation requires epigenetic transcriptional regulation of genes in the hippocampus; therefore, we aimed to determine how epigenetic DNA methylation mechanisms affect learning-induced transcription of memory-permissive genes in the epileptic hippocampus. Using the kainate rodent model of TLE and focusing on the brain-derived neurotrophic factor (Bdnf) gene as a candidate of DNA methylation-mediated transcription, we analyzed DNA methylation levels in epileptic rats following learning. After detection of aberrant DNA methylation at the Bdnf gene, we investigated functional effects of altered DNA methylation on hippocampus-dependent memory formation in our TLE rodent model. We found that behaviorally driven Bdnf DNA methylation was associated with hippocampus-dependent memory deficits. Bisulfite sequencing revealed that decreased Bdnf DNA methylation levels strongly correlated with abnormally high levels of Bdnf mRNA in the epileptic hippocampus during memory consolidation. Methyl supplementation via methionine (Met) increased Bdnf DNA methylation and reduced Bdnf mRNA levels in the epileptic hippocampus during memory consolidation. Met administration reduced interictal spike activity, increased theta rhythm power, and reversed memory deficits in epileptic animals. The rescue effect of Met treatment on learning-induced Bdnf DNA methylation, Bdnf gene expression, and hippocampus-dependent memory was attenuated by DNA methyltransferase blockade. Our findings suggest that manipulation of DNA methylation in the epileptic hippocampus should be considered as a viable treatment option to ameliorate memory impairments associated with TLE.
VEZF1 Elements Mediate Protection from DNA Methylation

PubMed Central

Strogantsev, Ruslan; Gaszner, Miklos; Hair, Alan; Felsenfeld, Gary; West, Adam G.

2010-01-01

There is growing consensus that genome organization and long-range gene regulation involves partitioning of the genome into domains of distinct epigenetic chromatin states. Chromatin insulator or barrier elements are key components of these processes as they can establish boundaries between chromatin states. The ability of elements such as the paradigm β-globin HS4 insulator to block the range of enhancers or the spread of repressive histone modifications is well established. Here we have addressed the hypothesis that a barrier element in vertebrates should be capable of defending a gene from silencing by DNA methylation. Using an established stable reporter gene system, we find that HS4 acts specifically to protect a gene promoter from de novo DNA methylation. Notably, protection from methylation can occur in the absence of histone acetylation or transcription. There is a division of labor at HS4; the sequences that mediate protection from methylation are separable from those that mediate CTCF-dependent enhancer blocking and USF-dependent histone modification recruitment. The zinc finger protein VEZF1 was purified as the factor that specifically interacts with the methylation protection elements. VEZF1 is a candidate CpG island protection factor as the G-rich sequences bound by VEZF1 are frequently found at CpG island promoters. Indeed, we show that VEZF1 elements are sufficient to mediate demethylation and protection of the APRT CpG island promoter from DNA methylation. We propose that many barrier elements in vertebrates will prevent DNA methylation in addition to blocking the propagation of repressive histone modifications, as either process is sufficient to direct the establishment of an epigenetically stable silent chromatin state. PMID:20062523
Rapid extraction from and direct identification in clinical samples of methicillin-resistant staphylococci using the PCR.

PubMed

Jaffe, R I; Lane, J D; Albury, S V; Niemeyer, D M

2000-09-01

Methicillin-resistant staphylococci (MRS) are one of the most common causes of nosocomial infections and bacteremia. Standard bacterial identification and susceptibility testing frequently require as long as 72 h to report results, and there may be difficulty in rapidly and accurately identifying methicillin resistance. The use of the PCR is a rapid and simple process for the amplification of target DNA sequences, which can be used to identify and test bacteria for antimicrobial resistance. However, many sample preparation methods are unsuitable for PCR utilization in the clinical laboratory because they either are not cost-effective, take too long to perform, or do not provide a satisfactory DNA template for PCR. Our goal was to provide same-day results to facilitate rapid diagnosis and therapy. In this report, we describe a rapid method for extraction of bacterial DNA directly from blood culture bottles that gave quality DNA for PCR in as little as 20 min. We compared this extraction method to the standard QIAGEN method for turnaround time (TAT), cost, purity, and use of template in PCR. Specific identification of MRS was determined using intragenic primer sets for bacterial and Staphylococcus 16S rRNA and mecA gene sequences. The PCR primer sets were validated with 416 isolates of staphylococci, including methicillin-resistant Staphylococcus aureus (n = 106), methicillin-sensitive S. aureus (n = 134), and coagulase-negative Staphylococcus (n = 176). The total supply cost of our extraction method and PCR was $2.15 per sample with a result TAT of less than 4 h. The methods described herein represent a rapid and accurate DNA extraction and PCR-based identification system, which makes the system an ideal candidate for use under austere field conditions and one that may have utility in the clinical laboratory.
Genome-wide DNA methylation sequencing reveals miR-663a is a novel epimutation candidate in CIMP-high endometrial cancer

PubMed Central

Yanokura, Megumi; Banno, Kouji; Adachi, Masataka; Aoki, Daisuke; Abe, Kuniya

2017-01-01

Aberrant DNA methylation is widely observed in many cancers. Concurrent DNA methylation of multiple genes occurs in endometrial cancer and is referred to as the CpG island methylator phenotype (CIMP). However, the features and causes of CIMP-positive endometrial cancer are not well understood. To investigate DNA methylation features characteristic of CIMP-positive endometrial cancer, we first classified samples from 25 patients with endometrial cancer based on the methylation status of three genes, i.e. MLH1, CDH1 (E-cadherin) and APC: CIMP-high (CIMP-H, 2/25, 8.0%), CIMP-low (CIMP-L, 7/25, 28.0%) and CIMP-negative (CIMP(-), 16/25, 64.0%). We then selected two samples each from the CIMP-H and CIMP(-) classes, and analyzed the DNA methylation status of both normal (peripheral blood cells: PBCs) and cancer tissues by genome-wide, targeted bisulfite sequencing. Genomes of the CIMP-H cancer tissues were significantly hypermethylated compared to those of the CIMP(-). Surprisingly, in normal tissues of the CIMP-H patients, the promoter region of the miR-663a locus is hypermethylated relative to CIMP(-) samples. Consistent with this finding, miR-663a expression was lower in the CIMP-H PBCs than in the CIMP(-) PBCs. The same region of the miR-663a locus is found to be highly methylated in cancer tissues of both CIMP-H and CIMP(-) cases. This is the first report showing that aberrant DNA methylation of the miR-663a promoter can occur in normal tissue of cancer patients, suggesting a possible link between this epigenetic abnormality and endometrial cancer. This raises the possibility that the hypermethylation of the miR-663a promoter represents an epimutation associated with the CIMP-H endometrial cancers. Based on these findings, the relationship between the aberrant DNA methylation and the CIMP-H phenotype is discussed. PMID:28440489
A transcription map of the regions surrounding the CSF1R locus on human chromosome 5q31: Candidate genes for diastrophic dysplasia

DOE Office of Scientific and Technical Information (OSTI.GOV)

Clines, G.; Lovett, M.

1994-09-01

Diastrophic dysplasia (DTD) is an autosomal recessive disorder of unknown pathogenesis that is characterized by abnormal skeletal and cartilage growth. Phenotypic characteristics of the disorder include short stature, scoliosis, and deformation of the first metacarpal. The diastrophic dysplasia gene has been localized to chromosome 5q31-33, within ~60 kb of the colony stimulating factor 1 receptor gene (CSF1R). We have used direct cDNA selection to build a transcription map across ~250 kb surrounding and including the CSF1R locus. cDNA pools from human placenta, activated T cells, cerebellum, HeLa cells, fetal brain, chondrocytes, chondrosarcomas and osteosarcomas were multiplexed in these selections. After two rounds of selection, an analysis revealed that ~70% of the selected cDNAs were contained within the contig. DNA sequencing and cosmid mapping data from a collection of 310 clones revealed the presence of three new genes in this region that show no appreciable homologies on sequence database searches, as well as cDNA clones from the CSF1R and the PDGFRB loci (another of the known genes in the region). An additional cDNA was found with 100% homology to the gene encoding human ribosomal protein L7 (RPL7). This cDNA comprised ~25% of all selected clones. However, further analysis of the genomic contig revealed the presence of an RPL7 processed pseudogene in very close proximity to the CSF1R and PDGFRB genes. The selection of processed pseudogenes is one previously anticipated artifact of selection methodologies, but has not been previously observed. Mutational analysis of the three new genes is underway in diastrophic dysplasia families, as is derivation of full length cDNA clones and the expansion of this detailed transcription map into a larger genomic contig.
Acquisition of New DNA Sequences After Infection of Chicken Cells with Avian Myeloblastosis Virus

PubMed Central

Shoyab, M.; Baluda, M. A.; Evans, R.

1974-01-01

DNA-RNA hybridization studies between 70S RNA from avian myeloblastosis virus (AMV) and an excess of DNA from (i) AMV-induced leukemic chicken myeloblasts or (ii) a mixture of normal and of congenitally infected K-137 chicken embryos producing avian leukosis viruses revealed the presence of fast- and slow-hybridizing virus-specific DNA sequences. However, the leukemic cells contained twice the level of AMV-specific DNA sequences observed in normal chicken embryonic cells. The fast-reacting sequences were two to three times more numerous in leukemic DNA than in DNA from the mixed embryos. The slow-reacting sequences had a reiteration frequency of approximately 9 and 6, in the two respective systems. Both the fast- and the slow-reacting DNA sequences in leukemic cells exhibited a higher Tm (2 C) than the respective DNA sequences in normal cells. In normal and leukemic cells the slow hybrid sequences appeared to have a Tm which was 2 C higher than that of the fast hybrid sequences. Individual non-virus-producing chicken embryos, either group-specific antigen positive or negative, contained 40 to 100 copies of the fast sequences and 2 to 6 copies of the slowly hybridizing sequences per cell genome. Normal rat cells did not contain DNA that hybridized with AMV RNA, whereas non-virus-producing rat cells transformed by B-77 avian sarcoma virus contained only the slowly reacting sequences. The results demonstrate that leukemic cells transformed by AMV contain new AMV-specific DNA sequences which were not present before infection. PMID:16789139
Homogeneity of the 16S rDNA sequence among geographically disparate isolates of Taylorella equigenitalis

PubMed Central

Matsuda, M; Tazumi, A; Kagawa, S; Sekizuka, T; Murayama, O; Moore, JE; Millar, BC

2006-01-01

Background At present, six accessible sequences of 16S rDNA from Taylorella equigenitalis (T. equigenitalis) are available, and these differ from one another at a few nucleotide positions. Thus it is important to determine these sequences from additional strains in other countries, if possible, in order to clarify any anomalies regarding 16S rDNA sequence heterogeneity. Here, we clone and sequence the approximate full-length 16S rDNA from additional strains of T. equigenitalis isolated in Japan, Australia and France and compare these sequences to the existing published sequences. Results Clarification of any anomalies regarding 16S rDNA sequence heterogeneity of T. equigenitalis was carried out. Cloning, sequencing and comparison of the approximate full-length 16S rDNA from 17 strains of T. equigenitalis isolated in Japan, Australia and France demonstrated nucleotide sequence differences at six loci in the 1,469-nucleotide sequence. Moreover, 12 polymorphic sites occurred among 23 sequences of the 16S rDNA, including the six reference sequences. Conclusion High sequence similarity (99.5% or more) was observed throughout, except from nucleotide positions 138 to 501 where substitutions and deletions were noted. PMID:16398935
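The comparison described in this abstract (counting polymorphic sites across aligned 16S rDNA sequences and reporting pairwise similarity) can be illustrated with a short sketch. This is not the authors' pipeline: the function names are hypothetical, the input is assumed to be already aligned to equal length, and the toy sequences are invented for illustration rather than taken from T. equigenitalis.

```python
def polymorphic_sites(aligned):
    """Return 1-based alignment columns where the sequences are not all identical."""
    if not aligned:
        return []
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must be aligned to the same length")
    return [i + 1 for i, column in enumerate(zip(*aligned)) if len(set(column)) > 1]


def percent_identity(a, b):
    """Percentage of alignment columns at which two aligned sequences agree."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and equally long")
    matches = sum(x == y for x, y in zip(a, b))
    return 100 * matches / len(a)


# Made-up toy alignment ("-" marks a deletion); not real 16S rDNA data.
toy = ["ACGTACGT", "ACGTACGA", "ACG-ACGT"]
print(polymorphic_sites(toy))             # [4, 8]
print(percent_identity(toy[0], toy[1]))   # 87.5
```

In this sketch a gap counts as a difference, so deletions are reported as polymorphic columns alongside substitutions.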
Epigenetic discrimination of identical twins from blood under the forensic scenario.

PubMed

Vidaki, Athina; Díez López, Celia; Carnero-Montoro, Elena; Ralf, Arwin; Ward, Kirsten; Spector, Timothy; Bell, Jordana T; Kayser, Manfred

2017-11-01

Monozygotic (MZ) twins share the same STR profile, demonstrating a practical problem in forensic casework. DNA methylation has provided a suitable resource for MZ twin differentiation; however, studies addressing the forensic feasibility are lacking. Here, we investigated epigenetic MZ twin differentiation from blood under the forensic scenario comprising i) the discovery of candidate markers in reference-type blood DNA via genome-wide analysis, ii) the technical validation of candidate markers in reference-type blood DNA using a suitable targeted method, and iii) the analysis of the validated markers in trace-type DNA. Genome-wide methylation analysis in blood DNA from 10 MZ twin pairs resulted in 19-111 twin-differentially methylated sites (tDMSs) per pair with >0.3 twin-to-twin differences. Considering all top three candidate tDMSs across all pairs in the technical validation based on methylation-specific qPCR, 67.85% generated >0.1 twin-to-twin differences. Of the validated tDMSs, 68.4% showed >0.1 twin-to-twin differences with qPCR in trace-type DNA across 8 pairs. Using an updated marker selection strategy, 8 additional candidate tDMSs were obtained for an example MZ pair, of which 7 showed >0.1 twin-to-twin differences in both reference- and trace-type DNA. Lastly, we introduce a high-resolution melting curve analysis of the entire fragment that can complement the proposed approach. Overall, our study demonstrates the general feasibility of epigenetic twin differentiation in the forensic context and highlights that the number of informative tDMSs in the final trace DNA analysis is crucial, as some candidate markers identified in reference DNA were shown not informative in the trace DNA due to various, including technical, reasons. Future studies will need to address the optimal number of epigenetic markers required for reliable identification of MZ twin individuals including statistical considerations. Copyright © 2017 Elsevier B.V. All rights reserved.
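The marker-discovery step in this abstract keeps CpG sites whose methylation differs between co-twins by more than a threshold (>0.3 in the genome-wide screen, >0.1 in the qPCR validation). A minimal sketch of that filtering idea follows. It assumes methylation levels are expressed on a 0 to 1 scale; the function name, CpG identifiers and values are hypothetical and do not come from the study.

```python
def twin_differential_sites(twin_a, twin_b, threshold=0.3):
    """Return CpG identifiers whose absolute methylation difference exceeds threshold.

    twin_a and twin_b map CpG identifiers to methylation levels in [0, 1].
    Only sites measured in both twins are compared.
    """
    shared = twin_a.keys() & twin_b.keys()
    diffs = {site: abs(twin_a[site] - twin_b[site]) for site in shared}
    # Sort by decreasing difference so the strongest candidates come first.
    ranked = sorted(diffs, key=lambda site: (-diffs[site], site))
    return [site for site in ranked if diffs[site] > threshold]


# Hypothetical methylation levels for one twin pair (illustrative only).
twin_a = {"cg_a": 0.85, "cg_b": 0.40, "cg_c": 0.10, "cg_d": 0.55}
twin_b = {"cg_a": 0.30, "cg_b": 0.45, "cg_c": 0.50, "cg_e": 0.90}

print(twin_differential_sites(twin_a, twin_b))        # ['cg_a', 'cg_c']
print(twin_differential_sites(twin_a, twin_b, 0.5))   # ['cg_a']
```

Ranking by the size of the difference mirrors the abstract's use of the "top three candidate tDMSs" per pair, although how the authors ranked their sites is not stated there.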
Detection and quantitation of single nucleotide polymorphisms, DNA sequence variations, DNA mutations, DNA damage and DNA mismatches

DOEpatents

McCutchen-Maloney, Sandra L.

2002-01-01

DNA mutation binding proteins alone and as chimeric proteins with nucleases are used with solid supports to detect DNA sequence variations, DNA mutations and single nucleotide polymorphisms. The solid supports may be flow cytometry beads, DNA chips, glass slides or DNA dipsticks. DNA molecules are coupled to solid supports to form DNA-support complexes. Labeled DNA is used with unlabeled DNA mutation binding proteins such as TthMutS to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by binding which gives an increase in signal. Unlabeled DNA is utilized with labeled chimeras to detect DNA sequence variations, DNA mutations and single nucleotide length polymorphisms by nuclease activity of the chimera which gives a decrease in signal.
Screening for microsatellite instability target genes in colorectal cancers

PubMed Central

Vilkki, S; Launonen, V; Karhu, A; Sistonen, P; Vastrik, I; Aaltonen, L

2002-01-01

Background: Defects in the DNA repair system lead to genetic instability because replication errors are not corrected. This type of genetic instability is a key event in the malignant progression of HNPCC and a subset of sporadic colon cancers and mutation rates are particularly high at short repetitive sequences. Somatic deletions of coding mononucleotide repeats have been detected, for example, in the TGFβRII and BAX genes, and recently many novel target genes for microsatellite instability (MSI) have been proposed. Novel target genes are likely to be discovered in the future. More data should be created on background mutation rates in MSI tumours to evaluate mutation rates observed in the candidate target genes. Methods: Mutation rates in 14 neutral intronic repeats were evaluated in MSI tumours. Bioinformatic searches combined with keywords related to cancer and tumour suppressor or CRC related gene homology were used to find new candidate MSI target genes. By comparison of mutation frequencies observed in intronic mononucleotide repeats versus exonic coding repeats of potential MSI target genes, the significance of the exonic mutations was estimated. Results: As expected, the length of an intronic mononucleotide repeat correlated positively with the number of slippages for both G/C and A/T repeats (p=0.0020 and p=0.0012, respectively). BRCA1, CtBP1, and Rb1 associated CtIP and other candidates were found in a bioinformatic search combined with keywords related to cancer. Sequencing showed a significantly increased mutation rate in the exonic A9 repeat of CtIP (25/109=22.9%) as compared with similar intronic repeats (p≤0.001). Conclusions: We propose a new candidate MSI target gene CtIP to be evaluated in further studies. PMID:12414815
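The comparison of an exonic mutation frequency against a neutral intronic background can be illustrated as follows. The abstract does not name the statistical test used, so the one-sided Fisher exact test here is an assumption, and the intronic counts are HYPOTHETICAL placeholders; only the 25/109 exonic figure comes from the abstract.

```python
from math import comb


def mutation_frequency(mutated, total):
    """Return the mutated fraction as a percentage rounded to one decimal."""
    return round(100 * mutated / total, 1)


def fisher_one_sided(a, b, c, d):
    """One-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]].

    Gives the probability of seeing at least `a` mutated cases in the first
    row if both rows shared the same underlying mutation rate.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    favourable = sum(
        comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, upper + 1)
    )
    return favourable / comb(n, col1)


# Reported in the abstract: 25 of 109 tumours mutated at the exonic A9 repeat.
exonic_mutated, exonic_total = 25, 109
# HYPOTHETICAL background: 5 of 100 mutated at a comparable intronic repeat.
intronic_mutated, intronic_total = 5, 100

print(mutation_frequency(exonic_mutated, exonic_total))  # 22.9
p = fisher_one_sided(
    exonic_mutated, exonic_total - exonic_mutated,
    intronic_mutated, intronic_total - intronic_mutated,
)
print(p < 0.01)  # True for these illustrative counts
```

With real data, the second row would hold the observed counts for intronic repeats of similar length, since the abstract notes that slippage frequency rises with repeat length.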
Identification of candidate genes affecting Δ9-tetrahydrocannabinol biosynthesis in Cannabis sativa

PubMed Central

Marks, M. David; Tian, Li; Wenger, Jonathan P.; Omburo, Stephanie N.; Soto-Fuentes, Wilfredo; He, Ji; Gang, David R.; Weiblen, George D.; Dixon, Richard A.

2009-01-01

RNA isolated from the glands of a Δ9-tetrahydrocannabinolic acid (THCA)-producing strain of Cannabis sativa was used to generate a cDNA library containing over 100 000 expressed sequence tags (ESTs). Sequencing of over 2000 clones from the library resulted in the identification of over 1000 unigenes. Candidate genes for almost every step in the biochemical pathways leading from primary metabolites to THCA were identified. Quantitative PCR analysis suggested that many of the pathway genes are preferentially expressed in the glands. Hexanoyl-CoA, one of the metabolites required for THCA synthesis, could be made via either de novo fatty acids synthesis or via the breakdown of existing lipids. qPCR analysis supported the de novo pathway. Many of the ESTs encode transcription factors and two putative MYB genes were identified that were preferentially expressed in glands. Given the similarity of the Cannabis MYB genes to those in other species with known functions, these Cannabis MYBs may play roles in regulating gland development and THCA synthesis. Three candidates for the polyketide synthase (PKS) gene responsible for the first committed step in the pathway to THCA were characterized in more detail. One of these was identical to a previously reported chalcone synthase (CHS) and was found to have CHS activity. All three could use malonyl-CoA and hexanoyl-CoA as substrates, including the CHS, but reaction conditions were not identified that allowed for the production of olivetolic acid (the proposed product of the PKS activity needed for THCA synthesis). One of the PKS candidates was highly and specifically expressed in glands (relative to whole leaves) and, on the basis of these expression data, it is proposed to be the most likely PKS responsible for olivetolic acid synthesis in Cannabis glands. PMID:19581347
Rapid identification of candidate genes for resistance to tomato late blight disease using next-generation sequencing technologies

PubMed Central

Arafa, Ramadan A.; Rakha, Mohamed T.; Kamel, Said M.

2017-01-01

Tomato late blight caused by Phytophthora infestans (Mont.) de Bary, also known as the Irish famine pathogen, is one of the most destructive plant diseases. Wild relatives of tomato possess useful resistance genes against this disease, and could therefore be used in breeding to improve cultivated varieties. In the genome of a wild relative of tomato, Solanum habrochaites accession LA1777, we identified a new quantitative trait locus for resistance against blight caused by an aggressive Egyptian isolate of P. infestans. Using double-digest restriction site–associated DNA sequencing (ddRAD-Seq) technology, we determined 6,514 genome-wide SNP genotypes of an F2 population derived from an interspecific cross. Subsequent association analysis of genotypes and phenotypes of the mapping population revealed that a 6.8 Mb genome region on chromosome 6 was a candidate locus for disease resistance. Whole-genome resequencing analysis revealed that 298 genes in this region potentially had functional differences between the parental lines. Among them, two genes with missense mutations, Solyc06g071810.1 and Solyc06g083640.3, were considered to be potential candidates for disease resistance. SNP and SSR markers linked to this region can be used in marker-assisted selection in future breeding programs for late blight disease, including introgression of new genetic loci from wild species. In addition, the approach developed in this study provides a model for identification of other genes for attractive agronomical traits. PMID:29253902
Global transcriptional start site mapping using differential RNA sequencing reveals novel antisense RNAs in Escherichia coli.

PubMed

Thomason, Maureen K; Bischler, Thorsten; Eisenbart, Sara K; Förstner, Konrad U; Zhang, Aixia; Herbig, Alexander; Nieselt, Kay; Sharma, Cynthia M; Storz, Gisela

2015-01-01

While the model organism Escherichia coli has been the subject of intense study for decades, the full complement of its RNAs is only now being examined. Here we describe a survey of the E. coli transcriptome carried out using a differential RNA sequencing (dRNA-seq) approach, which can distinguish between primary and processed transcripts, and an automated prediction algorithm for transcriptional start sites (TSS). With the criterion of expression under at least one of three growth conditions examined, we predicted 14,868 TSS candidates, including 5,574 internal to annotated genes (iTSS) and 5,495 TSS corresponding to potential antisense RNAs (asRNAs). We examined expression of 14 candidate asRNAs by Northern analysis using RNA from wild-type E. coli and from strains defective for RNases III and E, two RNases reported to be involved in asRNA processing. Interestingly, nine asRNAs detected as distinct bands by Northern analysis were differentially affected by the rnc and rne mutations. We also compared our asRNA candidates with previously published asRNA annotations from RNA-seq data and discuss the challenges associated with these cross-comparisons. Our global transcriptional start site map represents a valuable resource for identification of transcription start sites, promoters, and novel transcripts in E. coli and is easily accessible, together with the cDNA coverage plots, in an online genome browser. Copyright © 2015, American Society for Microbiology. All Rights Reserved.
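The abstract distinguishes TSS candidates that are internal to annotated genes (iTSS) from those that may drive antisense RNAs (asRNAs). The sketch below shows one simplified way such labels could be assigned from coordinates and strand. It is an assumption-laden illustration, not the prediction algorithm used in the study: the `Gene` type, the `classify_tss` function, the classification rule and the example coordinates are all hypothetical.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    start: int   # 1-based, inclusive
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


def classify_tss(position, strand, genes):
    """Label a TSS as 'internal', 'antisense' or 'other' (simplified rule)."""
    overlapping = [g for g in genes if g.start <= position <= g.end]
    if any(g.strand == strand for g in overlapping):
        return "internal"   # inside a gene on the same strand (iTSS)
    if overlapping:
        return "antisense"  # inside a gene on the opposite strand (asRNA candidate)
    return "other"          # outside every annotated gene


# Hypothetical annotation; coordinates are invented for illustration.
genes = [Gene("geneA", 100, 500, "+"), Gene("geneB", 800, 1200, "-")]

print(classify_tss(250, "+", genes))  # internal
print(classify_tss(250, "-", genes))  # antisense
print(classify_tss(600, "+", genes))  # other
```

A fuller classifier would also need categories for primary and secondary TSS upstream of genes and a rule for flanking regions, neither of which is specified in the abstract.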
Methylation patterns of repetitive DNA sequences in germ cells of Mus musculus.

PubMed

Sanford, J; Forrester, L; Chapman, V; Chandley, A; Hastie, N

1984-03-26

The major and the minor satellite sequences of Mus musculus were undermethylated in both sperm and oocyte DNAs relative to the amount of undermethylation observed in adult somatic tissue DNA. This hypomethylation was specific for satellite sequences in sperm DNA. Dispersed repetitive and low copy sequences show a high degree of methylation in sperm DNA; however, a dispersed repetitive sequence was undermethylated in oocyte DNA. This finding suggests a difference in the amount of total genomic DNA methylation between sperm and oocyte DNA. The methylation levels of the minor satellite sequences did not change during spermiogenesis, and were not associated with the onset of meiosis or a specific stage in sperm development.
Process of labeling specific chromosomes using recombinant repetitive DNA

DOEpatents

Moyzis, R.K.; Meyne, J.

1988-02-12

Chromosome preferential nucleotide sequences are first determined from a library of recombinant DNA clones having families of repetitive sequences. Library clones are identified with a low homology with a sequence of repetitive DNA families to which the first clones respectively belong and variant sequences are then identified by selecting clones having a pattern of hybridization with genomic DNA dissimilar to the hybridization pattern shown by the respective families. In another embodiment, variant sequences are selected from a sequence of a known repetitive DNA family. The selected variant sequence is classified as chromosome specific, chromosome preferential, or chromosome nonspecific. Sequences which are classified as chromosome preferential are further sequenced and regions are identified having a low homology with other regions of the chromosome preferential sequence or with known sequences of other family members and consensus sequences of the repetitive DNA families for the chromosome preferential sequences. The selected low homology regions are then hybridized with chromosomes to determine those low homology regions hybridized with a specific chromosome under normal stringency conditions.
Transposable Elements and DNA Methylation Create in Embryonic Stem Cells Human-Specific Regulatory Sequences Associated with Distal Enhancers and Noncoding RNAs

PubMed Central

Glinsky, Gennadi V.

2015-01-01

Despite significant progress in the structural and functional characterization of the human genome, understanding of the mechanisms underlying the genetic basis of human phenotypic uniqueness remains limited. Here, I report that transposable element-derived sequences, most notably LTR7/HERV-H, LTR5_Hs, and L1HS, harbor 99.8% of the candidate human-specific regulatory loci (HSRL) with putative transcription factor-binding sites in the genome of human embryonic stem cells (hESC). A total of 4,094 candidate HSRL display selective and site-specific binding of critical regulators (NANOG [Nanog homeobox], POU5F1 [POU class 5 homeobox 1], CCCTC-binding factor [CTCF], Lamin B1), and are preferentially located within the matrix of transcriptionally active DNA segments that are hypermethylated in hESC. hESC-specific NANOG-binding sites are enriched near the protein-coding genes regulating brain size, pluripotency long noncoding RNAs, hESC enhancers, and 5-hydroxymethylcytosine-harboring regions immediately adjacent to binding sites. Sequences of only 4.3% of hESC-specific NANOG-binding sites are present in Neanderthals’ genome, suggesting that a majority of these regulatory elements emerged in Modern Humans. Comparisons of estimated creation rates of novel TF-binding sites revealed that there was 49.7-fold acceleration of creation rates of NANOG-binding sites in genomes of Chimpanzees compared with the mouse genomes and further 5.7-fold acceleration in genomes of Modern Humans compared with the Chimpanzees genomes. Preliminary estimates suggest that emergence of one novel NANOG-binding site detectable in hESC required 466 years of evolution. 
Pathway analysis of coding genes that have hESC-specific NANOG-binding sites within gene bodies or near gene boundaries revealed their association with physiological development and functions of nervous and cardiovascular systems, embryonic development, behavior, as well as development of a diverse spectrum of pathological conditions such as cancer, diseases of cardiovascular and reproductive systems, metabolic diseases, multiple neurological and psychological disorders. A proximity placement model is proposed explaining how a 33–47% excess of NANOG, CTCF, and POU5F1 proteins immobilized on a DNA scaffold may play a functional role at distal regulatory elements. PMID:25956794
Extensive sequencing of seven human genomes to characterize benchmark reference materials

PubMed Central

Zook, Justin M.; Catoe, David; McDaniel, Jennifer; Vang, Lindsay; Spies, Noah; Sidow, Arend; Weng, Ziming; Liu, Yuling; Mason, Christopher E.; Alexander, Noah; Henaff, Elizabeth; McIntyre, Alexa B.R.; Chandramohan, Dhruva; Chen, Feng; Jaeger, Erich; Moshrefi, Ali; Pham, Khoa; Stedman, William; Liang, Tiffany; Saghbini, Michael; Dzakula, Zeljko; Hastie, Alex; Cao, Han; Deikus, Gintaras; Schadt, Eric; Sebra, Robert; Bashir, Ali; Truty, Rebecca M.; Chang, Christopher C.; Gulbahce, Natali; Zhao, Keyan; Ghosh, Srinka; Hyland, Fiona; Fu, Yutao; Chaisson, Mark; Xiao, Chunlin; Trow, Jonathan; Sherry, Stephen T.; Zaranek, Alexander W.; Ball, Madeleine; Bobe, Jason; Estep, Preston; Church, George M.; Marks, Patrick; Kyriazopoulou-Panagiotopoulou, Sofia; Zheng, Grace X.Y.; Schnall-Levin, Michael; Ordonez, Heather S.; Mudivarti, Patrice A.; Giorda, Kristina; Sheng, Ying; Rypdal, Karoline Bjarnesdatter; Salit, Marc

2016-01-01

The Genome in a Bottle Consortium, hosted by the National Institute of Standards and Technology (NIST) is creating reference materials and data for human genome sequencing, as well as methods for genome comparison and benchmarking. Here, we describe a large, diverse set of sequencing data for seven human genomes; five are current or candidate NIST Reference Materials. The pilot genome, NA12878, has been released as NIST RM 8398. We also describe data from two Personal Genome Project trios, one of Ashkenazim Jewish ancestry and one of Chinese ancestry. The data come from 12 technologies: BioNano Genomics, Complete Genomics paired-end and LFR, Ion Proton exome, Oxford Nanopore, Pacific Biosciences, SOLiD, 10X Genomics GemCode WGS, and Illumina exome and WGS paired-end, mate-pair, and synthetic long reads. Cell lines, DNA, and data from these individuals are publicly available. Therefore, we expect these data to be useful for revealing novel information about the human genome and improving sequencing technologies, SNP, indel, and structural variant calling, and de novo assembly. PMID:27271295
Sequence-based prediction of protein-binding sites in DNA: comparative study of two SVM models.

PubMed

Park, Byungkyu; Im, Jinyong; Tuvshinjargal, Narankhuu; Lee, Wook; Han, Kyungsook

2014-11-01

As many structures of protein-DNA complexes have become known in recent years, several computational methods have been developed to predict DNA-binding sites in proteins. However, the inverse problem (i.e., predicting protein-binding sites in DNA) has received much less attention. One of the reasons is that the differences between the interaction propensities of nucleotides are much smaller than those between amino acids. Another reason is that DNA exhibits less diverse sequence patterns than protein. Therefore, predicting protein-binding DNA nucleotides is much harder than predicting DNA-binding amino acids. We computed the interaction propensity (IP) of nucleotide triplets with amino acids using an extensive dataset of protein-DNA complexes, and developed two support vector machine (SVM) models that predict protein-binding nucleotides from sequence data alone. One SVM model predicts protein-binding nucleotides using DNA sequence data alone, and the other SVM model predicts protein-binding nucleotides using both DNA and protein sequences. In a 10-fold cross-validation with 1519 DNA sequences, the SVM model that uses DNA sequence data only predicted protein-binding nucleotides with an accuracy of 67.0%, an F-measure of 67.1%, and a Matthews correlation coefficient (MCC) of 0.340. With an independent dataset of 181 DNAs that were not used in training, it achieved an accuracy of 66.2%, an F-measure of 66.3% and an MCC of 0.324. Another SVM model that uses both DNA and protein sequences achieved an accuracy of 69.6%, an F-measure of 69.6%, and an MCC of 0.383 in a 10-fold cross-validation with 1519 DNA sequences and 859 protein sequences. With an independent dataset of 181 DNAs and 143 proteins, it showed an accuracy of 67.3%, an F-measure of 66.5% and an MCC of 0.329. Both in cross-validation and independent testing, the second SVM model that used both DNA and protein sequence data showed better performance than the first model that used DNA sequence data. 
To the best of our knowledge, this is the first attempt to predict protein-binding nucleotides in a given DNA sequence from the sequence data alone. Copyright © 2014 Elsevier Ireland Ltd. All rights reserved.
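As a reading aid for the three metrics quoted in this record, the following is a minimal sketch of how accuracy, F-measure and MCC are usually computed from a binary confusion matrix. The function name and the counts are illustrative assumptions, not data or code from the study; the counts were chosen only so that the output is of the same order as the reported values.

```python
import math

def binary_metrics(tp, fp, tn, fn):
    """Return (accuracy, F-measure, MCC) for a binary confusion matrix."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f_measure = 2 * precision * recall / (precision + recall)
    else:
        f_measure = 0.0
    # MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0
    return accuracy, f_measure, mcc

# Hypothetical, balanced counts (assumption for illustration only).
accuracy, f_measure, mcc = binary_metrics(tp=67, fp=33, tn=67, fn=33)
print(f"accuracy={accuracy:.3f} F={f_measure:.3f} MCC={mcc:.3f}")
```

On a balanced test set, an accuracy near 67% corresponds to an MCC near 0.34, which is why the two figures in the record move together.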
Novel Single-Base Deletional Mutation in Major Intrinsic Protein (MIP) in Autosomal Dominant Cataract

PubMed Central

Geyer, David D.; Spence, M. Anne; Johannes, Meriam; Flodman, Pamela; Clancy, Kevin P.; Berry, Rebecca; Sparkes, Robert S.; Jonsen, Matthew D.; Isenberg, Sherwin J.; Bateman, J. Bronwyn

2006-01-01

PURPOSE To further elucidate the cataract phenotype, and identify the gene and mutation for autosomal dominant cataract (ADC) in an American family of European descent (ADC2) by sequencing the major intrinsic protein gene (MIP), a candidate based on linkage to chromosome 12q13. DESIGN Observational case series and laboratory experimental study. METHODS We examined two at-risk individuals in ADC2. We PCR-amplified and sequenced all four exons and all intron-exon boundaries of the MIP gene from genomic and cloned DNA in affected members to confirm one variant as the putative mutation. RESULTS We found a novel single deletion of nucleotide (nt) 3223 (within codon 235) in exon four, causing a frameshift that alters 41 of 45 subsequent amino acids and creates a premature stop codon. CONCLUSIONS We identified a novel single base pair deletion in the MIP gene and conclude that it is a pathogenic sequence alteration. PMID:16564824

In vivo insertion pool sequencing identifies virulence factors in a complex fungal–host interaction

PubMed Central

Uhse, Simon; Pflug, Florian G.; Stirnberg, Alexandra; Ehrlinger, Klaus; von Haeseler, Arndt

2018-01-01

Large-scale insertional mutagenesis screens can be powerful genome-wide tools if they are streamlined with efficient downstream analysis, which is a serious bottleneck in complex biological systems. A major impediment to the success of next-generation sequencing (NGS)-based screens for virulence factors is that the genetic material of pathogens is often underrepresented within the eukaryotic host, making detection extremely challenging. We therefore established insertion Pool-Sequencing (iPool-Seq) on maize infected with the biotrophic fungus U. maydis. iPool-Seq features tagmentation, unique molecular barcodes, and affinity purification of pathogen insertion mutant DNA from in vivo-infected tissues. In a proof of concept using iPool-Seq, we identified 28 virulence factors, including 23 that were previously uncharacterized, from an initial pool of 195 candidate effector mutants. Because of its sensitivity and quantitative nature, iPool-Seq can be applied to any insertional mutagenesis library and is especially suitable for genetically complex setups like pooled infections of eukaryotic hosts. PMID:29684023
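The role of unique molecular barcodes in a pooled screen can be illustrated with a small sketch: reads that share both a mutant identifier and a barcode are treated as copies of one original molecule and counted once. This is an assumed, simplified model for illustration only. It is not the iPool-Seq pipeline, the names `count_unique_molecules`, `mutA` and `mutB` are hypothetical, and real analyses would also need to tolerate sequencing errors in the barcodes.

```python
from collections import defaultdict

def count_unique_molecules(reads):
    """Count distinct barcodes per mutant from (mutant_id, barcode) pairs."""
    barcodes = defaultdict(set)
    for mutant_id, barcode in reads:
        barcodes[mutant_id].add(barcode)  # PCR duplicates collapse here
    return {mutant_id: len(seen) for mutant_id, seen in barcodes.items()}

# Hypothetical reads: mutA has one duplicated molecule and one unique one.
reads = [
    ("mutA", "AACG"),
    ("mutA", "AACG"),
    ("mutA", "TTGC"),
    ("mutB", "GGAT"),
]
counts = count_unique_molecules(reads)
print(counts)
```

Comparing such per-mutant molecule counts between the input pool and infected tissue is one way a depleted mutant could be flagged, under the assumptions stated above.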
Convergence among cave catfishes: long-branch attraction and a Bayesian relative rates test.

PubMed

Wilcox, T P; García de León, F J; Hendrickson, D A; Hillis, D M

2004-06-01

Convergence has long been of interest to evolutionary biologists. Cave organisms appear to be ideal candidates for studying convergence in morphological, physiological, and developmental traits. Here we report apparent convergence in two cave-catfishes that were described on morphological grounds as congeners: Prietella phreatophila and Prietella lundbergi. We collected mitochondrial DNA sequence data from 10 species of catfishes, representing five of the seven genera in Ictaluridae, as well as seven species from a broad range of siluriform outgroups. Analysis of the sequence data under parsimony supports a monophyletic Prietella. However, both maximum-likelihood and Bayesian analyses support polyphyly of the genus, with P. lundbergi sister to Ictalurus and P. phreatophila sister to Ameiurus. The topological difference between parsimony and the other methods appears to result from long-branch attraction between the Prietella species. Similarly, the sequence data do not support several other relationships within Ictaluridae supported by morphology. We develop a new Bayesian method for examining variation in molecular rates of evolution across a phylogeny.
Enlightenment of Yeast Mitochondrial Homoplasmy: Diversified Roles of Gene Conversion

PubMed Central

Ling, Feng; Mikawa, Tsutomu; Shibata, Takehiko

2011-01-01

Mitochondria have their own genomic DNA. Unlike the nuclear genome, each cell contains hundreds to thousands of copies of mitochondrial DNA (mtDNA). The copies of mtDNA tend to have heterogeneous sequences, due to the high frequency of mutagenesis, but are quickly homogenized within a cell (“homoplasmy”) during vegetative cell growth or through a few sexual generations. Heteroplasmy is strongly associated with mitochondrial diseases, diabetes and aging. Recent studies revealed that the yeast cell has the machinery to homogenize mtDNA, using a common DNA processing pathway with gene conversion; i.e., both genetic events are initiated by a double-stranded break, which is processed into 3′ single-stranded tails. One of the tails is base-paired with the complementary sequence of the recipient double-stranded DNA to form a D-loop (homologous pairing), in which repair DNA synthesis is initiated to restore the sequence lost by the breakage. Gene conversion generates sequence diversity, depending on the divergence between the donor and recipient sequences, especially when it occurs among a number of copies of a DNA sequence family with some sequence variations, such as in immunoglobulin diversification in chicken. MtDNA can be regarded as a sequence family, in which the members tend to be diversified by a high frequency of spontaneous mutagenesis. Thus, it would be interesting to determine why and how double-stranded breakage and D-loop formation induce sequence homogenization in mitochondria and sequence diversification in nuclear DNA. We will review the mechanisms and roles of mtDNA homoplasmy, in contrast to nuclear gene conversion, which diversifies gene and genome sequences, to provide clues toward understanding how the common DNA processing pathway results in such divergent outcomes. PMID:24710143
Microbial Diversity in a Hydrocarbon- and Chlorinated-Solvent-Contaminated Aquifer Undergoing Intrinsic Bioremediation

PubMed Central

Dojka, Michael A.; Hugenholtz, Philip; Haack, Sheridan K.; Pace, Norman R.

1998-01-01

A culture-independent molecular phylogenetic approach was used to survey constituents of microbial communities associated with an aquifer contaminated with hydrocarbons (mainly jet fuel) and chlorinated solvents undergoing intrinsic bioremediation. Samples were obtained from three redox zones: methanogenic, methanogenic-sulfate reducing, and iron or sulfate reducing. Small-subunit rRNA genes were amplified directly from aquifer material DNA by PCR with universally conserved or Bacteria- or Archaea-specific primers and were cloned. A total of 812 clones were screened by restriction fragment length polymorphisms (RFLP), approximately 50% of which were unique. All RFLP types that occurred more than once in the libraries, as well as many of the unique types, were sequenced. A total of 104 (94 bacterial and 10 archaeal) sequence types were determined. Of the 94 bacterial sequence types, 10 have no phylogenetic association with known taxonomic divisions and are phylogenetically grouped in six novel division level groups (candidate divisions WS1 to WS6); 21 belong to four recently described candidate divisions with no cultivated representatives (OP5, OP8, OP10, and OP11); and 63 are phylogenetically associated with 10 well-recognized divisions. The physiology of two particularly abundant sequence types obtained from the methanogenic zone could be inferred from their phylogenetic association with groups of microorganisms with a consistent phenotype. One of these sequence types is associated with the genus Syntrophus; Syntrophus spp. produce energy from the anaerobic oxidation of organic acids, with the production of acetate and hydrogen. The organism represented by the other sequence type is closely related to Methanosaeta spp., which are known to be capable of energy generation only through aceticlastic methanogenesis. 
We hypothesize, therefore, that the terminal step of hydrocarbon degradation in the methanogenic zone of the aquifer is aceticlastic methanogenesis and that the microorganisms represented by these two sequence types occur in syntrophic association. PMID:9758812
Microbial diversity in a hydrocarbon- and chlorinated-solvent-contaminated aquifer undergoing intrinsic bioremediation

USGS Publications Warehouse

Dojka, M.A.; Hugenholtz, P.; Haack, S.K.; Pace, N.R.

1998-01-01

A culture-independent molecular phylogenetic approach was used to survey constituents of microbial communities associated with an aquifer contaminated with hydrocarbons (mainly jet fuel) and chlorinated solvents undergoing intrinsic bioremediation. Samples were obtained from three redox zones: methanogenic, methanogenic-sulfate reducing, and iron or sulfate reducing. Small-subunit rRNA genes were amplified directly from aquifer material DNA by PCR with universally conserved or Bacteria- or Archaea-specific primers and were cloned. A total of 812 clones were screened by restriction fragment length polymorphisms (RFLP), approximately 50% of which were unique. All RFLP types that occurred more than once in the libraries, as well as many of the unique types, were sequenced. A total of 104 (94 bacterial and 10 archaeal) sequence types were determined. Of the 94 bacterial sequence types, 10 have no phylogenetic association with known taxonomic divisions and are phylogenetically grouped in six novel division level groups (candidate divisions WS1 to WS6); 21 belong to four recently described candidate divisions with no cultivated representatives (OP5, OP8, OP10, and OP11); and 63 are phylogenetically associated with 10 well-recognized divisions. The physiology of two particularly abundant sequence types obtained from the methanogenic zone could be inferred from their phylogenetic association with groups of microorganisms with a consistent phenotype. One of these sequence types is associated with the genus Syntrophus; Syntrophus spp. produce energy from the anaerobic oxidation of organic acids, with the production of acetate and hydrogen. The organism represented by the other sequence type is closely related to Methanosaeta spp., which are known to be capable of energy generation only through aceticlastic methanogenesis. 
We hypothesize, therefore, that the terminal step of hydrocarbon degradation in the methanogenic zone of the aquifer is aceticlastic methanogenesis and that the microorganisms represented by these two sequence types occur in syntrophic association.
"First generation" automated DNA sequencing technology.

PubMed

Slatko, Barton E; Kieleczawa, Jan; Ju, Jingyue; Gardner, Andrew F; Hendrickson, Cynthia L; Ausubel, Frederick M

2011-10-01

Beginning in the 1980s, automation of DNA sequencing has greatly increased throughput, reduced costs, and enabled large projects to be completed more easily. The development of automation technology paralleled the development of other aspects of DNA sequencing: better enzymes and chemistry, separation and imaging technology, sequencing protocols, robotics, and computational advancements (including base-calling algorithms with quality scores, database developments, and sequence analysis programs). Despite the emergence of high-throughput sequencing platforms, automated Sanger sequencing technology remains useful for many applications. This unit provides background and a description of the "First-Generation" automated DNA sequencing technology. It also includes protocols for using the current Applied Biosystems (ABI) automated DNA sequencing machines. © 2011 by John Wiley & Sons, Inc.
GM2 Gangliosidosis in Shiba Inu Dogs with an In-Frame Deletion in HEXB.

PubMed

Kolicheski, A; Johnson, G S; Villani, N A; O'Brien, D P; Mhlanga-Mutangadura, T; Wenger, D A; Mikoloski, K; Eagleson, J S; Taylor, J F; Schnabel, R D; Katz, M L

2017-09-01

Consistent with a tentative diagnosis of neuronal ceroid lipofuscinosis (NCL), autofluorescent cytoplasmic storage bodies were found in neurons from the brains of 2 related Shiba Inu dogs with a young-adult onset, progressive neurodegenerative disease. Unexpectedly, no potentially causal NCL-related variants were identified in a whole-genome sequence generated with DNA from 1 of the affected dogs. Instead, the whole-genome sequence contained a homozygous 3 base pair (bp) deletion in a coding region of HEXB. The other affected dog also was homozygous for this 3-bp deletion. Mutations in the human HEXB ortholog cause Sandhoff disease, a type of GM2 gangliosidosis. Thin-layer chromatography confirmed that GM2 ganglioside had accumulated in an affected Shiba Inu brain. Enzymatic analysis confirmed that the GM2 gangliosidosis resulted from a deficiency in the HEXB encoded protein and not from a deficiency in products from HEXA or GM2A, which are known alternative causes of GM2 gangliosidosis. We conclude that the homozygous 3-bp deletion in HEXB is the likely cause of the Shiba Inu neurodegenerative disease and that whole-genome sequencing can lead to the early identification of potentially disease-causing DNA variants thereby refocusing subsequent diagnostic analyses toward confirming or refuting candidate variant causality. Copyright © 2017 The Authors. Journal of Veterinary Internal Medicine published by Wiley Periodicals, Inc. on behalf of the American College of Veterinary Internal Medicine.
Application of hidden Markov models to biological data mining: a case study

NASA Astrophysics Data System (ADS)

Yin, Michael M.; Wang, Jason T.

2000-04-01

In this paper we present an example of biological data mining: the detection of splicing junction acceptors in eukaryotic genes. Identification or prediction of transcribed sequences from within genomic DNA has been a major rate-limiting step in the pursuit of genes. Programs currently available are far from being powerful enough to elucidate the gene structure completely. Here we develop a hidden Markov model (HMM) to represent the degeneracy features of splicing junction acceptor sites in eukaryotic genes. The HMM system is fully trained using an expectation maximization (EM) algorithm and the system performance is evaluated using the 10-way cross-validation method. Experimental results show that our HMM system can correctly classify more than 94% of the candidate sequences (including true and false acceptor sites) into the right categories. About 90% of the true acceptor sites and 96% of the false acceptor sites in the test data are classified correctly. These results are very promising considering that only the local information in DNA is used. The proposed model will be a very important component of an effective and accurate gene structure detection system currently being developed in our lab.
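The 10-way cross-validation mentioned in this record can be sketched as follows: the candidate sequences are partitioned into ten folds, and each fold serves once as the test set while the other nine are used for training. The record does not say how the folds were formed, so the interleaved assignment below and the function name `k_fold_indices` are assumptions made for illustration.

```python
def k_fold_indices(n_items, k=10):
    """Yield (train, test) index lists so that each item is tested exactly once."""
    # Assumed fold assignment: item i goes to fold i % k.
    folds = [list(range(start, n_items, k)) for start in range(k)]
    for held_out in range(k):
        test = folds[held_out]
        train = [
            idx
            for fold_no, fold in enumerate(folds)
            if fold_no != held_out
            for idx in fold
        ]
        yield train, test

# Example with 25 hypothetical candidate sequences.
splits = list(k_fold_indices(25, k=10))
for train, test in splits[:2]:
    print(len(train), "training items; test indices:", test)
```

The per-fold accuracies obtained this way are typically averaged to give a single figure such as the one reported in the abstract.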
Germline Mutations in PALB2, BRCA1, and RAD51C, Which Regulate DNA Recombination Repair, in Patients With Gastric Cancer.

PubMed

Sahasrabudhe, Ruta; Lott, Paul; Bohorquez, Mabel; Toal, Ted; Estrada, Ana P; Suarez, John J; Brea-Fernández, Alejandro; Cameselle-Teijeiro, José; Pinto, Carla; Ramos, Irma; Mantilla, Alejandra; Prieto, Rodrigo; Corvalan, Alejandro; Norero, Enrique; Alvarez, Carolina; Tapia, Teresa; Carvallo, Pilar; Gonzalez, Luz M; Cock-Rada, Alicia; Solano, Angela; Neffa, Florencia; Della Valle, Adriana; Yau, Chris; Soares, Gabriela; Borowsky, Alexander; Hu, Nan; He, Li-Ji; Han, Xiao-You; Taylor, Philip R; Goldstein, Alisa M; Torres, Javier; Echeverry, Magdalena; Ruiz-Ponte, Clara; Teixeira, Manuel R; Carvajal-Carmona, Luis G

2017-04-01

Up to 10% of cases of gastric cancer are familial, but so far, only mutations in CDH1 have been associated with gastric cancer risk. To identify genetic variants that affect risk for gastric cancer, we collected blood samples from 28 patients with hereditary diffuse gastric cancer (HDGC) not associated with mutations in CDH1 and performed whole-exome sequence analysis. We then analyzed sequences of candidate genes in 333 independent HDGC and non-HDGC cases. We identified 11 cases with mutations in PALB2, BRCA1, or RAD51C genes, which regulate homologous DNA recombination. We found these mutations in 2 of 31 patients with HDGC (6.5%) and 9 of 331 patients with sporadic gastric cancer (2.8%). Most of these mutations had been previously associated with other types of tumors and partially co-segregated with gastric cancer in our study. Tumors that developed in patients with these mutations had a mutation signature associated with somatic homologous recombination deficiency. Our findings indicate that defects in homologous recombination increase risk for gastric cancer. Copyright © 2017 AGA Institute. Published by Elsevier Inc. All rights reserved.
Sequence Alignment to Predict Across Species Susceptibility ...

EPA Pesticide Factsheets

Conservation of a molecular target across species can be used as a line-of-evidence to predict the likelihood of chemical susceptibility. The web-based Sequence Alignment to Predict Across Species Susceptibility (SeqAPASS) tool was developed to simplify, streamline, and quantitatively assess protein sequence/structural similarity across taxonomic groups as a means to predict relative intrinsic susceptibility. The intent of the tool is to allow for evaluation of any potential protein target, so it is amenable to variable degrees of protein characterization, depending on available information about the chemical/protein interaction and the molecular target itself. To allow for flexibility in the analysis, a layered strategy was adopted for the tool. The first level of the SeqAPASS analysis compares primary amino acid sequences to a query sequence, calculating a metric for sequence similarity (including detection of candidate orthologs), the second level evaluates sequence similarity within selected domains (e.g., ligand-binding domain, DNA binding domain), and the third level of analysis compares individual amino acid residue positions identified as being of importance for protein conformation and/or ligand binding upon chemical perturbation. Each level of the SeqAPASS analysis provides increasing evidence to apply toward rapid, screening-level assessments of probable cross species susceptibility. Such analyses can support prioritization of chemicals for further ev
Origins of domestication and polyploidy in oca (Oxalis Tuberosa: Oxalidaceae). 2. Chloroplast-expressed glutamine synthetase data.

PubMed

Emshwiller, Eve; Doyle, Jeff J

2002-07-01

In continuing study of the origins of the octoploid tuber crop oca, Oxalis tuberosa Molina, we used phylogenetic analysis of DNA sequences of the chloroplast-active (nuclear encoded) isozyme of glutamine synthetase (ncpGS) from cultivated oca, its allies in the "Oxalis tuberosa alliance," and other Andean Oxalis. Multiple ncpGS sequences found within individuals of both the cultigen and a yet unnamed wild tuber-bearing taxon of Bolivia were separated by molecular cloning, but some cloned sequences appeared to be artifacts of polymerase chain reaction (PCR) recombination and/or Taq error. Nonetheless, three classes of nonrecombinant sequences each joined a different part of the O. tuberosa alliance clade on the ncpGS gene tree. Octoploid oca shares two sequence classes with the Bolivian tuber-bearing taxon (of unknown ploidy level). Fixed heterozygosity of these two sequence classes in all ocas sampled suggests that they represent homeologous loci and that oca is allopolyploid. A third sequence class, found in eight of nine oca plants sampled, might represent a third homeologous locus, suggesting that oca may be autoallopolyploid, and is shared with another wild tuber-bearing species, tetraploid O. picchensis of southern Peru. Thus, ncpGS data identify these two taxa as the best candidates as progenitors of cultivated oca.
Influence of DNA sequence on the structure of minicircles under torsional stress

PubMed Central

Wang, Qian; Irobalieva, Rossitza N.; Chiu, Wah; Schmid, Michael F.; Fogg, Jonathan M.; Zechiedrich, Lynn

2017-01-01

The sequence dependence of the conformational distribution of DNA under various levels of torsional stress is an important unsolved problem. Combining theory and coarse-grained simulations shows that the DNA sequence and a structural correlation due to topology constraints of a circle are the main factors that dictate the 3D structure of a 336 bp DNA minicircle under torsional stress. We found that DNA minicircle topoisomers can have multiple bend locations under high torsional stress and that the positions of these sharp bends are determined by the sequence, and by a positive mechanical correlation along the sequence. We showed that simulations and theory are able to provide sequence-specific information about individual DNA minicircles observed by cryo-electron tomography (cryo-ET). We provided a sequence-specific cryo-ET tomogram fitting of DNA minicircles, registering the sequence within the geometric features. Our results indicate that the conformational distribution of minicircles under torsional stress can be designed, which has important implications for using minicircle DNA for gene therapy. PMID:28609782
Analysis of DNA Sequences by an Optical Time-Integrating Correlator: Proof-of-Concept Experiments.

DTIC Science & Technology

1992-05-01

DNA ANALYSIS STRATEGY 4 2.1 Representation of DNA Bases 4 2.2 DNA Analysis Strategy 6 3.0 CUSTOM GENERATORS FOR DNA SEQUENCES 10 3.1 Hardware Design 10...of the DNA bases where each base is represented by a 7-bits long pseudorandom sequence. 5 Figure 4: Coarse analysis of a DNA sequence. 7 Figure 5: Fine...a 20-bases long database. 32 xiii LIST OF TABLES PAGE Table 1: Short representations of the DNA bases where each base is represented by 7-bits long
An SNP resource for rice genetics and breeding based on subspecies indica and japonica genome alignments.

PubMed

Feltus, F Alex; Wan, Jun; Schulze, Stefan R; Estill, James C; Jiang, Ning; Paterson, Andrew H

2004-09-01

Dense coverage of the rice genome with polymorphic DNA markers is an invaluable tool for DNA marker-assisted breeding, positional cloning, and a wide range of evolutionary studies. We have aligned drafts of two rice subspecies, indica and japonica, and analyzed levels and patterns of genetic diversity. After filtering multiple copy and low quality sequence, 408,898 candidate DNA polymorphisms (SNPs/INDELs) were discerned between the two subspecies. These filters have the consequence that our data set includes only a subset of the available SNPs (in particular excluding large numbers of SNPs that may occur between repetitive DNA alleles) but increase the likelihood that this subset is useful: Direct sequencing suggests that 79.8% +/- 7.5% of the in silico SNPs are real. The SNP sample in our database is not randomly distributed across the genome. In fact, 566 rice genomic regions had unusually high (328 contigs/48.6 Mb/13.6% of genome) or low (237 contigs/64.7 Mb/18.1% of genome) polymorphism rates. Many SNP-poor regions were substantially longer than most SNP-rich regions, covering up to 4 Mb, and possibly reflecting introgression between the respective gene pools that may have occurred hundreds of years ago. Although 46.2% +/- 8.3% of the SNPs differentiate other pairs of japonica and indica genotypes, SNP rates in rice were not predictive of evolutionary rates for corresponding genes in another grass species, sorghum. The data set is freely available at http://www.plantgenome.uga.edu/snp.
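The two "Mb / % of genome" pairs quoted in this record can be cross-checked with simple arithmetic: dividing each span by its stated percentage should give roughly the same total genome length. The sketch below uses only the figures from the abstract; the function name `implied_total_mb` is a hypothetical helper, and the resulting total is an implied value, not a number stated in the source.

```python
def implied_total_mb(span_mb, percent_of_genome):
    """Total length implied by a span and the share of the genome it covers."""
    return span_mb / (percent_of_genome / 100.0)

# Figures as quoted in the abstract.
snp_rich_total = implied_total_mb(48.6, 13.6)  # high-polymorphism regions
snp_poor_total = implied_total_mb(64.7, 18.1)  # low-polymorphism regions

print(f"implied genome length: {snp_rich_total:.1f} Mb vs {snp_poor_total:.1f} Mb")
```

Both pairs imply a total of roughly 357 Mb, so the reported spans and percentages are mutually consistent.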
An SNP Resource for Rice Genetics and Breeding Based on Subspecies Indica and Japonica Genome Alignments

PubMed Central

Feltus, F. Alex; Wan, Jun; Schulze, Stefan R.; Estill, James C.; Jiang, Ning; Paterson, Andrew H.

2004-01-01

Dense coverage of the rice genome with polymorphic DNA markers is an invaluable tool for DNA marker-assisted breeding, positional cloning, and a wide range of evolutionary studies. We have aligned drafts of two rice subspecies, indica and japonica, and analyzed levels and patterns of genetic diversity. After filtering multiple copy and low quality sequence, 408,898 candidate DNA polymorphisms (SNPs/INDELs) were discerned between the two subspecies. These filters have the consequence that our data set includes only a subset of the available SNPs (in particular excluding large numbers of SNPs that may occur between repetitive DNA alleles) but increase the likelihood that this subset is useful: Direct sequencing suggests that 79.8% ± 7.5% of the in silico SNPs are real. The SNP sample in our database is not randomly distributed across the genome. In fact, 566 rice genomic regions had unusually high (328 contigs/48.6 Mb/13.6% of genome) or low (237 contigs/64.7 Mb/18.1% of genome) polymorphism rates. Many SNP-poor regions were substantially longer than most SNP-rich regions, covering up to 4 Mb, and possibly reflecting introgression between the respective gene pools that may have occurred hundreds of years ago. Although 46.2% ± 8.3% of the SNPs differentiate other pairs of japonica and indica genotypes, SNP rates in rice were not predictive of evolutionary rates for corresponding genes in another grass species, sorghum. The data set is freely available at http://www.plantgenome.uga.edu/snp. PMID:15342564
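As an illustration of the in silico step this abstract describes (discerning candidate SNPs/INDELs between two aligned sequences), the following is a minimal sketch and not the authors' pipeline. The function name `candidate_polymorphisms`, the gap character and the toy sequences are assumptions made for the example; columns with ambiguous bases are simply skipped, standing in loosely for the low-quality filter mentioned above.

```python
def candidate_polymorphisms(ref_aln, alt_aln, gap="-"):
    """Scan two equal-length aligned sequences column by column.

    Returns (snps, indels): snps is a list of (column, ref_base, alt_base),
    indels is a list of columns where exactly one sequence has a gap.
    Hypothetical helper for illustration only.
    """
    if len(ref_aln) != len(alt_aln):
        raise ValueError("aligned sequences must have equal length")
    snps, indels = [], []
    for col, (a, b) in enumerate(zip(ref_aln.upper(), alt_aln.upper())):
        if a == b:
            continue  # identical column (including gap/gap): nothing to report
        if a == gap or b == gap:
            indels.append(col)  # gap in one sequence only: candidate INDEL
        elif a in "ACGT" and b in "ACGT":
            snps.append((col, a, b))  # two different unambiguous bases: candidate SNP
        # anything else (e.g. N) is treated as low quality and ignored
    return snps, indels


# Toy alignment (made-up data, not rice sequence)
indica = "ACGTTAGC-TA"
japonica = "ACCTTAGCGTN"
snps, indels = candidate_polymorphisms(indica, japonica)
print(snps, indels)
```

Candidates found this way would still need the kind of validation the abstract reports, where direct sequencing confirmed only a fraction of the in silico SNPs.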
DNA Barcoding of Rhodiola (Crassulaceae): A Case Study on a Group of Recently Diversified Medicinal Plants from the Qinghai-Tibetan Plateau

PubMed Central

Zhang, Jian-Qiang; Meng, Shi-Yong; Wen, Jun; Rao, Guang-Yuan

2015-01-01

DNA barcoding, the identification of species using one or a few short standardized DNA sequences, is an important complement to traditional taxonomy. However, there are particular challenges for barcoding plants, especially for species with complex evolutionary histories. We herein evaluated the utility of five candidate sequences — rbcL, matK, trnH-psbA, trnL-F and the internal transcribed spacer (ITS) — for barcoding Rhodiola species, a group of high-altitude plants frequently used as adaptogens, hemostatics and tonics in traditional Tibetan medicine. Rhodiola was suggested to have diversified rapidly recently. The genus is thus a good model for testing DNA barcoding strategies for recently diversified medicinal plants. This study analyzed 189 accessions, representing 47 of the 55 recognized Rhodiola species in the Flora of China treatment. Based on intraspecific and interspecific divergence and degree of monophyly statistics, ITS was the best single-locus barcode, resolving 66% of the Rhodiola species. The core combination rbcL+matK resolved only 40.4% of them. Unsurprisingly, the combined use of all five loci provided the highest discrimination power, resolving 80.9% of the species. However, this is weaker than the discrimination power generally reported in barcoding studies of other plant taxa. The observed complications may be due to the recent diversification, incomplete lineage sorting and reticulate evolution of the genus. These processes are common features of numerous plant groups in the high-altitude regions of the Qinghai-Tibetan Plateau. PMID:25774915
Housekeeping while brain's storming: Validation of normalizing factors for gene expression studies in a murine model of traumatic brain injury

PubMed Central

Rhinn, Hervé; Marchand-Leroux, Catherine; Croci, Nicole; Plotkine, Michel; Scherman, Daniel; Escriou, Virginie

2008-01-01

Background: Traumatic brain injury models are widely studied, especially through gene expression, either to further understand the biological mechanisms involved or to assess the efficiency of potential therapies. A large number of biological pathways are affected in brain trauma models, whose elucidation might greatly benefit from transcriptomic studies. However, the suitability of the reference genes needed for quantitative RT-PCR experiments has not been established for these models. Results: We have compared five potential reference genes as well as total cDNA level monitored using Oligreen reagent in order to determine the best normalizing factors for quantitative RT-PCR expression studies in the early phase (0–48 h post-trauma (PT)) of a murine model of diffuse brain injury. The levels of 18S rRNA, and of transcripts of β-actin, glyceraldehyde-3P-dehydrogenase (GAPDH), β-microtubulin and S100β were determined in the injured brain region of traumatized mice sacrificed at 30 min, 3 h, 6 h, 12 h, 24 h and 48 h post-trauma. The stability of the reference gene candidates and of total cDNA was evaluated by three different methods, leading to the following rankings as normalization factors, from the most to the least suitable: by using the geNorm VBA applet, we obtained the following sequence: cDNA(Oligreen); GAPDH > 18S rRNA > S100β > β-microtubulin > β-actin; by using the NormFinder Excel spreadsheet, we obtained the following sequence: GAPDH > cDNA(Oligreen) > S100β > 18S rRNA > β-actin > β-microtubulin; by using a confidence-interval calculation, we obtained the following sequence: cDNA(Oligreen) > 18S rRNA; GAPDH > S100β > β-microtubulin > β-actin. Conclusion: This work suggests that Oligreen cDNA measurements, 18S rRNA and GAPDH or a combination of them may be used to efficiently normalize qRT-PCR gene expression in mouse brain trauma injury, and that β-actin and β-microtubulin should be avoided. The potential of total cDNA as measured by Oligreen as a first-intention normalizing factor with a broad field of applications is highlighted. Pros and cons of the three methods of normalization factors selection are discussed. A generic time- and cost-effective procedure for normalization factor validation is proposed. PMID:18611280
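To make the idea of a stability ranking concrete, here is a minimal sketch of a geNorm-style stability measure M: for each candidate, take the standard deviation of its log2 expression ratio against every other candidate across samples, then average those values; a lower M means a more stable candidate. This is an illustration and not the geNorm applet itself. The function name `genorm_m`, the use of the sample standard deviation and the expression values are assumptions made for the example, not data from the study.

```python
import math
from statistics import mean, stdev


def genorm_m(expr):
    """Compute a geNorm-style stability value M for each candidate.

    expr maps a candidate name to its relative quantities on a linear
    scale, one value per sample, in the same sample order for all candidates.
    Hypothetical helper for illustration only.
    """
    genes = list(expr)
    m_values = {}
    for j in genes:
        pairwise_variation = []
        for k in genes:
            if k == j:
                continue
            # log2 ratio of candidate j to candidate k in every sample
            log_ratios = [math.log2(a / b) for a, b in zip(expr[j], expr[k])]
            # pairwise variation: spread of that ratio across samples
            pairwise_variation.append(stdev(log_ratios))
        m_values[j] = mean(pairwise_variation)
    return m_values


# Made-up quantities for three hypothetical candidates over four samples
expression = {
    "gene_a": [1.0, 2.0, 4.0, 8.0],
    "gene_b": [2.0, 4.0, 8.0, 16.0],  # tracks gene_a at a constant ratio
    "gene_c": [1.0, 8.0, 1.0, 8.0],   # fluctuates independently
}
m = genorm_m(expression)
ranking = sorted(m, key=m.get)  # most stable candidate first
print(ranking, m)
```

In this toy data gene_a and gene_b share the lowest M because their ratio never changes, while gene_c ranks last.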
Identification and characterization of a new multigene family in the human MHC: A candidate autoimmune disease susceptibility element (3.8-1)

DOE Office of Scientific and Technical Information (OSTI.GOV)

Harris, J.M.; Venditti, C.P.; Chorney, M.J.

1994-09-01

An association between idiopathic hemochromatosis (HFE) and the HLA-A3 locus has been previously well-established. In an attempt to identify potential HFE candidate genes, a genomic DNA fragment distal to the HLA-A9 breakpoint was used to screen a B cell cDNA library; a member (3.8-1) of a new multigene family, composed of five distinct genomic cross-reactive fragments, was identified. Clone 3.8-1 represents the 3′ end of a 9.6 kb transcript which is expressed in multiple tissues including the spleen, thymus, lung and kidney. Sequencing and genome database analysis indicate that 3.8-1 is unique, with no homology to any known entries. The genomic residence of 3.8-1, defined by polymorphism analysis and physical mapping using YAC clones, appears to be absent from the genomes of higher primates, although four other cross-reactivities are maintained. The absence of this gene as well as other probes which map in the TNF to HLA-B interval suggests that this portion of the human MHC, located between the Class I and Class III regions, arose in humans as the result of a post-speciation insertional event. The large size of the 3.8-1 gene and the possible categorization of 3.8-1 as a human-specific gene are significant given the genetic data that place an autoimmune susceptibility element for IDDM and myasthenia gravis in the precise region where this gene resides. In an attempt to isolate the 5′ end of this large transcript, we have constructed a cosmid contig which encompasses the genomic locus of this gene and are progressively isolating coding sequences by exon trapping.
Fine-scale genetic mapping of a hybrid sterility factor between Drosophila simulans and D. mauritiana: the varied and elusive functions of "speciation genes".

PubMed

Araripe, Luciana O; Montenegro, Horácio; Lemos, Bernardo; Hartl, Daniel L

2010-12-14

Hybrid male sterility (HMS) is a usual outcome of hybridization between closely related animal species. It arises because interactions between alleles that are functional within one species may be disrupted in hybrids. The identification of genes leading to hybrid sterility is of great interest for understanding the evolutionary process of speciation. In the current work we used marked P-element insertions as dominant markers to efficiently locate one genetic factor causing a severe reduction in fertility in hybrid males of Drosophila simulans and D. mauritiana. Our mapping effort identified a region of 9 kb on chromosome 3, containing three complete and one partial coding sequences. Within this region, two annotated genes are suggested as candidates for the HMS factor, based on the comparative molecular characterization and public-source information. Gene Taf1 is only partially contained in the region, yet shows high polymorphism with four fixed non-synonymous substitutions between the two species. Its molecular functions involve sequence-specific DNA binding and transcription factor activity. Gene agt is a small, intronless gene, whose molecular function is annotated as methylated-DNA-protein-cysteine S-methyltransferase activity. High polymorphism and one fixed non-synonymous substitution suggest this is a fast evolving gene. The gene trees of both genes perfectly separate D. simulans and D. mauritiana into monophyletic groups. Analysis of gene expression using microarray revealed trends that were similar to those previously found in comparisons between whole-genome hybrids and parental species. Identification and subsequent confirmation of the HMS candidate gene will add another case study toward understanding the evolutionary process of hybrid incompatibility.
De novo Analysis of the Epiphytic Transcriptome of the Cucurbit Powdery Mildew Fungus Podosphaera xanthii and Identification of Candidate Secreted Effector Proteins

PubMed Central

Vela-Corcía, David; Bautista, Rocío; de Vicente, Antonio; Spanu, Pietro D.; Pérez-García, Alejandro

2016-01-01

The cucurbit powdery mildew fungus Podosphaera xanthii is a major limiting factor for cucurbit production worldwide. Despite the fungus’s agronomic and economic importance, very little is known about fundamental aspects of P. xanthii biology, such as obligate biotrophy or pathogenesis. To design more durable control strategies, genomic information about P. xanthii is needed. Powdery mildews are fungal pathogens with large genomes compared with those of other fungi, which contain vast amounts of repetitive DNA sequences, much of which is composed of retrotransposons. To reduce genome complexity, in this work we aimed to obtain and analyse the epiphytic transcriptome of P. xanthii as a starting point for genomic research. Total RNA was isolated from epiphytic fungal material, and the corresponding cDNA library was sequenced using a 454 GS FLX platform. Over 676,562 reads were obtained and assembled into 37,241 contigs. Annotation data identified 8,798 putative genes with different orthologues. As described for other powdery mildew fungi, a similar set of missing core ascomycete genes was found, which may explain obligate biotrophy. To gain insight into the plant-pathogen relationships, special attention was focused on the analysis of the secretome. After this analysis, 137 putative secreted proteins were identified, including 53 candidate secreted effector proteins (CSEPs). Consistent with a putative role in pathogenesis, the expression profile observed for some of these CSEPs showed expression maxima at the beginning of the infection process at 24 h after inoculation, when the primary appressoria are mostly formed. Our data mark the onset of genomics research into this very important pathogen of cucurbits and shed some light on the intimate relationship between this pathogen and its host plant. PMID:27711117

Laser mass spectrometry for DNA sequencing, disease diagnosis, and fingerprinting

NASA Astrophysics Data System (ADS)

Chen, C. H. Winston; Taranenko, N. I.; Zhu, Y. F.; Chung, C. N.; Allman, S. L.

1997-05-01

Since laser mass spectrometry has the potential for achieving very fast DNA analysis, we recently applied it to DNA sequencing, DNA typing for fingerprinting, and DNA screening for disease diagnosis. Two different approaches for sequencing DNA have been successfully demonstrated. One is to sequence DNA with DNA ladders produced from Sanger's enzymatic method. The other is to do direct sequencing without DNA ladders. The need for quick DNA typing for identification purposes is critical for forensic application. Our preliminary results indicate laser mass spectrometry can possibly be used for rapid DNA fingerprinting applications at a much lower cost than gel electrophoresis. Population screening for certain genetic diseases can be a very efficient step to reducing medical costs through prevention. Since laser mass spectrometry can provide very fast DNA analysis, we applied laser mass spectrometry to disease diagnosis. Clinical samples with both base deletion and point mutation have been tested with complete success.
Micronuclear DNA of Oxytricha nova contains sequences with autonomously replicating activity in Saccharomyces cerevisiae.

PubMed Central

Colombo, M M; Swanton, M T; Donini, P; Prescott, D M

1984-01-01

Oxytricha nova is a hypotrichous ciliate with micronuclei and macronuclei. Micronuclei, which contain large, chromosomal-sized DNA, are genetically inert but undergo meiosis and exchange during cell mating. Macronuclei, which contain only small, gene-sized DNA molecules, provide all of the nuclear RNA needed to run the cell. After cell mating the macronucleus is derived from a micronucleus, a derivation that includes excision of the genes from chromosomes and elimination of the remaining DNA. The eliminated DNA includes all of the repetitious sequences and approximately 95% of the unique sequences. We cloned large restriction fragments from the micronucleus that confer replication ability on a replication-deficient plasmid in Saccharomyces cerevisiae. Sequences that confer replication ability are called autonomously replicating sequences. The frequency and effectiveness of autonomously replicating sequences in micronuclear DNA are similar to those reported for DNAs of other organisms introduced into yeast cells. Of the 12 micronuclear fragments with autonomously replicating sequence activity, 9 also showed homology to macronuclear DNA, indicating that they contain a macronuclear gene sequence. We conclude from this that autonomously replicating sequence activity is nonrandomly distributed throughout micronuclear DNA and is preferentially associated with those regions of micronuclear DNA that contain genes. PMID:6092934
DNA sequence-dependent mechanics and protein-assisted bending in repressor-mediated loop formation

PubMed Central

Boedicker, James Q.; Garcia, Hernan G.; Johnson, Stephanie; Phillips, Rob

2014-01-01

As the chief informational molecule of life, DNA is subject to extensive physical manipulations. The energy required to deform double-helical DNA depends on sequence, and this mechanical code of DNA influences gene regulation, such as through nucleosome positioning. Here we examine the sequence-dependent flexibility of DNA in bacterial transcription factor-mediated looping, a context for which the role of sequence remains poorly understood. Using a suite of synthetic constructs repressed by the Lac repressor and two well-known sequences that show large flexibility differences in vitro, we make precise statistical mechanical predictions as to how DNA sequence influences loop formation and test these predictions using in vivo transcription and in vitro single-molecule assays. Surprisingly, sequence-dependent flexibility does not affect in vivo gene regulation. By theoretically and experimentally quantifying the relative contributions of sequence and the DNA-bending protein HU to DNA mechanical properties, we reveal that bending by HU dominates DNA mechanics and masks intrinsic sequence-dependent flexibility. Such a quantitative understanding of how mechanical regulatory information is encoded in the genome will be a key step towards a predictive understanding of gene regulation at single-base pair resolution. PMID:24231252
Divergent nuclear 18S rDNA paralogs in a turkey coccidium, Eimeria meleagrimitis, complicate molecular systematics and identification.

PubMed

El-Sherry, Shiem; Ogedengbe, Mosun E; Hafeez, Mian A; Barta, John R

2013-07-01

Multiple 18S rDNA sequences were obtained from two single-oocyst-derived lines of each of Eimeria meleagrimitis and Eimeria adenoeides. After analysing the 15 new 18S rDNA sequences from two lines of E. meleagrimitis and 17 new sequences from two lines of E. adenoeides, there were clear indications that divergent, paralogous 18S rDNA copies existed within the nuclear genome of E. meleagrimitis. In contrast, mitochondrial cytochrome c oxidase subunit I (COI) partial sequences from all lines of a particular Eimeria sp. were identical and, in phylogenetic analyses, COI sequences clustered unambiguously in monophyletic and highly-supported clades specific to individual Eimeria sp. Phylogenetic analysis of the new 18S rDNA sequences from E. meleagrimitis showed that they formed two distinct clades: Type A with four new sequences; and Type B with nine new sequences; both Types A and B sequences were obtained from each of the single-oocyst-derived lines of E. meleagrimitis. Together these rDNA types formed a well-supported E. meleagrimitis clade. Types A and B 18S rDNA sequences from E. meleagrimitis had a mean sequence identity of only 97.4% whereas mean sequence identity within types was 99.1-99.3%. The observed intraspecific sequence divergence among E. meleagrimitis 18S rDNA sequence types was even higher (approximately 2.6%) than the interspecific sequence divergence present between some well-recognized species such as Eimeria tenella and Eimeria necatrix (1.1%). Our observations suggest that, unlike COI sequences, 18S rDNA sequences are not reliable molecular markers to be used alone for species identification with coccidia, although 18S rDNA sequences have clear utility for phylogenetic reconstruction of apicomplexan parasites at the genus and higher taxonomic ranks. Copyright © 2013. Published by Elsevier Ltd.
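The identity and divergence figures in this abstract are two views of the same quantity: a mean identity of 97.4% between Types A and B corresponds to a divergence of 100 − 97.4 = 2.6%. The sketch below shows one common way to compute percent identity from a pairwise alignment. The function name `percent_identity`, the choice to skip columns containing a gap, and the toy sequences are assumptions for illustration; the abstract does not state how gaps were handled.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over aligned columns in which neither sequence has a gap.

    Hypothetical helper for illustration only; other gap conventions exist.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # assumption: gapped columns are excluded from the comparison
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy aligned fragments (made-up data, not Eimeria sequence)
type_a = "ACGTACGTAC"
type_b = "ACGTACGTAT"
identity = percent_identity(type_a, type_b)
divergence = 100.0 - identity
print(identity, divergence)
```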
Identification of Smoking-Associated Differentially Methylated Regions Using Reduced Representation Bisulfite Sequencing and Cell type-Specific Enhancer Activation and Gene Expression.

PubMed

Wan, Ma; Bennett, Brian D; Pittman, Gary S; Campbell, Michelle R; Reynolds, Lindsay M; Porter, Devin K; Crowl, Christopher L; Wang, Xuting; Su, Dan; Englert, Neal A; Thompson, Isabel J; Liu, Yongmei; Bell, Douglas A

2018-04-27

Cigarette smoke is a causal factor in cancers and cardiovascular disease. Smoking-associated differentially methylated regions (SM-DMRs) have been observed in disease studies, but the causal link between altered DNA methylation and transcriptional change is obscure. Our objectives were to finely resolve SM-DMRs and to interrogate the mechanistic link between SM-DMRs and altered transcription of enhancer noncoding RNA (eRNA) and mRNA in human circulating monocytes. We integrated SM-DMRs identified by reduced representation bisulfite sequencing (RRBS) of circulating CD14+ monocyte DNA collected from two independent human studies [n = 38 from Clinical Research Unit (CRU) and n = 55 from the Multi-Ethnic Study of Atherosclerosis (MESA), about half of whom were active smokers] with gene expression for protein-coding genes and noncoding RNAs measured by RT-PCR or RNA sequencing. Candidate SM-DMRs were compared with RRBS of purified CD4+ T cells, CD8+ T cells, CD15+ granulocytes, CD19+ B cells, and CD56+ NK cells (n = 19 females, CRU). DMRs were validated using pyrosequencing or bisulfite amplicon sequencing in up to 85 CRU volunteers, who also provided saliva DNA. RRBS identified monocyte SM-DMRs frequently located in putative gene regulatory regions. The most significant monocyte DMR occurred at a poised enhancer in the aryl-hydrocarbon receptor repressor gene (AHRR) and it was also detected in both granulocytes and saliva DNA. To our knowledge, we identify for the first time that SM-DMRs in or near AHRR, C5orf55-EXOC-AS, and SASH1 were associated with increased noncoding eRNA as well as mRNA in monocytes. Functionally, the AHRR SM-DMR appeared to up-regulate AHRR mRNA through activating the AHRR enhancer, as suggested by increased eRNA in the monocytes, but not granulocytes, from smokers compared with nonsmokers. Our findings suggest that AHRR SM-DMR up-regulates AHRR mRNA in a monocyte-specific manner by activating the AHRR enhancer. Cell type-specific activation of enhancers at SM-DMRs may represent a mechanism driving smoking-related disease. https://doi.org/10.1289/EHP2395.
Affordable hands-on DNA sequencing and genotyping: an exercise for teaching DNA analysis to undergraduates.

PubMed

Shah, Kushani; Thomas, Shelby; Stein, Arnold

2013-01-01

In this report, we describe a 5-week laboratory exercise for undergraduate biology and biochemistry students in which students learn to sequence DNA and to genotype their DNA for selected single nucleotide polymorphisms (SNPs). Students use miniaturized DNA sequencing gels that require approximately 8 min to run. The students perform G, A, T, C Sanger sequencing reactions. They prepare and run the gels, perform Southern blots (which require only 10 min), and detect sequencing ladders using a colorimetric detection system. Students enlarge their sequencing ladders from digital images of their small nylon membranes, and read the sequence manually. They compare their reads with the actual DNA sequence using BLAST2. After mastering the DNA sequencing system, students prepare their own DNA from a cheek swab, polymerase chain reaction-amplify a region of their DNA that encompasses a SNP of interest, and perform sequencing to determine their genotype at the SNP position. A family pedigree can also be constructed. The SNP chosen by the instructor was rs17822931, which is in the ABCC11 gene and is the determinant of human earwax type. Genotypes at the rs17822931 site vary in different ethnic populations. © 2013 by The International Union of Biochemistry and Molecular Biology.
Phylogenetic characterization of a biogas plant microbial community integrating clone library 16S-rDNA sequences and metagenome sequence data obtained by 454-pyrosequencing.

PubMed

Kröber, Magdalena; Bekel, Thomas; Diaz, Naryttza N; Goesmann, Alexander; Jaenicke, Sebastian; Krause, Lutz; Miller, Dimitri; Runte, Kai J; Viehöver, Prisca; Pühler, Alfred; Schlüter, Andreas

2009-06-01

The phylogenetic structure of the microbial community residing in a fermentation sample from a production-scale biogas plant fed with maize silage, green rye and liquid manure was analysed by an integrated approach using clone library sequences and metagenome sequence data obtained by 454-pyrosequencing. Sequencing of 109 clones from a bacterial and an archaeal 16S-rDNA amplicon library revealed that the obtained nucleotide sequences are similar but not identical to 16S-rDNA database sequences derived from different anaerobic environments including digestors and bioreactors. Most of the bacterial 16S-rDNA sequences could be assigned to the phylum Firmicutes with the most abundant class Clostridia and to the class Bacteroidetes, whereas most archaeal 16S-rDNA sequences cluster close to the methanogen Methanoculleus bourgensis. Further sequences of the archaeal library most probably represent so far non-characterised species within the genus Methanoculleus. A similar result derived from phylogenetic analysis of mcrA clone sequences. The mcrA gene product encodes the alpha-subunit of methyl-coenzyme-M reductase involved in the final step of methanogenesis. BLASTn analysis applying stringent settings resulted in assignment of 16S-rDNA metagenome sequence reads to 62 16S-rDNA amplicon sequences thus enabling frequency of abundance estimations for 16S-rDNA clone library sequences. Ribosomal Database Project (RDP) Classifier processing of metagenome 16S-rDNA reads revealed abundance of the phyla Firmicutes, Bacteroidetes and Euryarchaeota and the orders Clostridiales, Bacteroidales and Methanomicrobiales. Moreover, a large fraction of 16S-rDNA metagenome reads could not be assigned to lower taxonomic ranks, demonstrating that numerous microorganisms in the analysed fermentation sample of the biogas plant are still unclassified or unknown.
Isolation and characterization of the promoter sequence of a cassava gene coding for Pt2L4, a glutamic acid-rich protein differentially expressed in storage roots.

PubMed

de Souza, C R; Aragão, F J; Moreira, E C O; Costa, C N M; Nascimento, S B; Carvalho, L J

2009-03-24

Cassava is one of the most important tropical food crops for more than 600 million people worldwide. Transgenic technologies can be useful for increasing its nutritional value and its resistance to viral diseases and insect pests. However, tissue-specific promoters that guarantee correct expression of transgenes would be necessary. We used inverse polymerase chain reaction to isolate a promoter sequence of the Mec1 gene coding for Pt2L4, a glutamic acid-rich protein differentially expressed in cassava storage roots. In silico analysis revealed putative cis-acting regulatory elements within this promoter sequence, including root-specific elements that may be required for its expression in vascular tissues. Transient expression experiments showed that the Mec1 promoter is functional, since this sequence was able to drive GUS expression in bean embryonic axes. Results from our computational analysis can serve as a guide for functional experiments to identify regions with tissue-specific Mec1 promoter activity. The DNA sequence that we identified is a new promoter that could be a candidate for genetic engineering of cassava roots.
Clinical sequencing in leukemia with the assistance of artificial intelligence.

PubMed

Tojo, Arinobu

2017-01-01

Next generation sequencing (NGS) of cancer genomes is now becoming a prerequisite for accurate diagnosis and proper treatment in clinical oncology. Because the genomic regions for NGS expand from a certain set of genes to the whole exome or whole genome, the resulting sequence data becomes incredibly enormous and makes it quite laborious to translate the genomic data into medicine, so-called annotation and curation. We organized a clinical sequencing team and established a bidirectional (bed-to-bench and bench-to-bed) system to integrate clinical and genomic data for hematological malignancies. We also started a collaborative research project with IBM Japan to adopt the artificial intelligence Watson for Genomics (WfG) to the pipeline of medical informatics. Genomic DNA was prepared from malignant as well as normal tissues in each patient and subjected to NGS. Sequence data was analyzed using an in-house semi-automated pipeline in combination with WfG, which was used to identify candidate driver mutations and relevant pathways from which applicable drug information was deduced. Currently, we have analyzed more than 150 patients with hematological disorders, including AML and ALL, and obtained many informative findings. In this presentation, I will introduce some of the achievements we have made so far.
An Integrated Encyclopedia of DNA Elements in the Human Genome

PubMed Central

2012-01-01

Summary: The human genome encodes the blueprint of life, but the function of the vast majority of its nearly three billion bases is unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure, and histone modification. These data enabled us to assign biochemical functions for 80% of the genome, in particular outside of the well-studied protein-coding regions. Many discovered candidate regulatory elements are physically associated with one another and with expressed genes, providing new insights into the mechanisms of gene regulation. The newly identified elements also show a statistical correspondence to sequence variants linked to human disease, and can thereby guide interpretation of this variation. Overall the project provides new insights into the organization and regulation of our genes and genome, and an expansive resource of functional annotations for biomedical research. PMID:22955616
Analysis of eye lens-specific genes in congenital hereditary cataracts and microphthalmia of the miniature schnauzer dog.

PubMed

Zhang, R L; Samuelson, D A; Zhang, Z G; Reddy, V N; Shastry, B S

1991-08-01

The congenital hereditary cataracts and microphthalmia in the miniature schnauzer dog are inherited by an autosomal recessive mode. To understand the genetic basis of these diseases, the authors purified and analyzed leukocyte deoxyribonucleic acid (DNA) from affected and normal animals using a candidate gene approach. Because the genes that encode the lens-specific proteins, specifically, alpha, beta, and gamma crystallins and the membrane protein (MP26), are known to maintain the structure and function of the lens, the authors used complementary DNA (cDNA) fragments that corresponded to the above genes to search for mutations at their loci in the affected animals. They found no evidence of gene deletion or rearrangement in any of the five loci. In addition, the hybridizable sequences of the dog DNA to the specific probes for the human chromosome 4 and 18 loci, which are reported to be involved in the abnormality of the human eye, seem to be unaffected. These data support the notion that the hereditary cataracts and microphthalmia in the dog may be associated with genes other than those reported for several animal systems.
E6 and E7 gene silencing results in decreased methylation of tumor suppressor genes and induces phenotype transformation of human cervical carcinoma cell lines

PubMed Central

Long, Jia; Shen, Danbei; Zhou, Wuqing; Zhou, Qiyan; Yang, Jia; Jiang, Mingjun

2015-01-01

In SiHa and CaSki cells, E6- and E7-targeting shRNA specifically and effectively knocked down human papillomavirus (HPV) 16 E6 and E7 at the transcriptional level, reducing the E6 and E7 mRNA levels by more than 80% compared with control cells that expressed a scrambled-sequence shRNA. E6 and E7 repression resulted in down-regulation of DNA methyltransferase mRNA and protein expression, decreased DNA methylation, and increased mRNA expression levels of tumor suppressor genes; it also induced a degree of apoptosis and inhibited proliferation in E6 and E7 shRNA-infected SiHa and CaSki cells compared with the uninfected cells. Repression of E6 and E7 oncogenes resulted in restoration of DNA methyltransferase suppressor pathways and induced apoptosis in HPV16-positive cervical carcinoma cell lines. Our findings suggest that HPV16 may promote the development of cervical cancer by influencing the DNA methylation pathway, and that E6 and E7 silencing may be a candidate therapeutic strategy for cervical and other HPV-associated cancers. PMID:26329329
Control of DEMETER DNA demethylase gene transcription in male and female gamete companion cells in Arabidopsis thaliana.

PubMed

Park, Jin-Sup; Frost, Jennifer M; Park, Kyunghyuk; Ohr, Hyonhwa; Park, Guen Tae; Kim, Seohyun; Eom, Hyunjoo; Lee, Ilha; Brooks, Janie S; Fischer, Robert L; Choi, Yeonhee

2017-02-21

The DEMETER (DME) DNA glycosylase initiates active DNA demethylation via the base-excision repair pathway and is vital for reproduction in Arabidopsis thaliana. DME-mediated DNA demethylation is preferentially targeted to small, AT-rich, and nucleosome-depleted euchromatic transposable elements, influencing expression of adjacent genes and leading to imprinting in the endosperm. In the female gametophyte, DME expression and subsequent genome-wide DNA demethylation are confined to the companion cell of the egg, the central cell. Here, we show that, in the male gametophyte, DME expression is limited to the companion cell of sperm, the vegetative cell, and to a narrow window of time: immediately after separation of the companion cell lineage from the germline. We define transcriptional regulatory elements of DME using reporter genes, showing that a small region, which surprisingly lies within the DME gene, controls its expression in male and female companion cells. DME expression from this minimal promoter is sufficient to rescue seed abortion and the aberrant DNA methylome associated with the null dme-2 mutation. Within this minimal promoter, we found short, conserved enhancer sequences necessary for the transcriptional activities of DME and combined predicted binding motifs with published transcription factor binding coordinates to produce a list of candidate upstream pathway members in the genetic circuitry controlling DNA demethylation in gamete companion cells. These data show how DNA demethylation is regulated to facilitate endosperm gene imprinting and potential transgenerational epigenetic regulation, without subjecting the germline to potentially deleterious transposable element demethylation.
Cloning, characterization, and expression of Cytochrome b ( Cytb)—a key mitochondrial gene from Prorocentrum donghaiense

NASA Astrophysics Data System (ADS)

Zhao, Liyuan; Mi, Tiezhu; Zhen, Yu; Yu, Zhigang

2012-05-01

Mitochondrial cytochrome b (Cytb), one of the few proteins encoded by the mitochondrial DNA, plays an important role in transferring electrons. As a mitochondrial gene, it has been widely used for phylogenetic analysis. Previously, a 949-bp fragment of the coding gene and mRNA editing were characterized from Prorocentrum donghaiense, which might prove useful for resolving P. donghaiense from closely related species. However, the full-length coding region has not been characterized. In this study, we used rapid amplification of cDNA ends (RACE) to obtain the full-length, 1,124-bp cDNA. The Cytb transcript contained a standard initiation codon, ATG, but did not have a recognizable stop codon. Homology comparison showed that the P. donghaiense Cytb had a high sequence identity to Cytb sequences from other dinoflagellate species. Phylogenetic analysis placed Cytb from P. donghaiense in the clade of dinoflagellates, and it clustered together strongly with that from P. minimum. Based on the full-length sequence, we inferred 32 editing events at different positions, accounting for 2.93% of the Cytb gene. 34.4% (11) of the changes were A to G, 25% (8) were T to C, and 25% (8) were C to U, with smaller proportions of G to C and G to A edits (9.4% (3) and 6.2% (2), respectively). The expression level of the Cytb transcript was quantified by real-time PCR with a TaqMan probe at different times during the whole growth phase. On average, the Cytb transcript was present at 39.27±7.46 copies of cDNA per cell during the whole growth cycle, and the expression of Cytb was relatively stable over the different phases. These results deepen our understanding of the structure and characteristics of Cytb in P. donghaiense, and confirm that Cytb in P. donghaiense is a candidate reference gene for studying the expression of other genes.
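As a quick arithmetic check on the editing proportions quoted in this abstract, the following minimal Python sketch recomputes each share from the reported counts. The names EDIT_COUNTS and edit_percentages are illustrative only and do not come from the paper.

```python
# Counts of inferred editing events by substitution type, as reported in the abstract.
EDIT_COUNTS = {"A>G": 11, "T>C": 8, "C>U": 8, "G>C": 3, "G>A": 2}


def edit_percentages(counts):
    """Return each substitution type's share of all editing events, in percent."""
    total = sum(counts.values())
    return {kind: 100.0 * n / total for kind, n in counts.items()}


if __name__ == "__main__":
    print("total events:", sum(EDIT_COUNTS.values()))
    for kind, pct in edit_percentages(EDIT_COUNTS).items():
        print(f"{kind}: {pct:.1f}%")
```

The five counts sum to the 32 events stated in the abstract, so the listed categories account for every inferred edit.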
Genetic resources offer efficient tools for rice functional genomics research.

PubMed

Lo, Shuen-Fang; Fan, Ming-Jen; Hsing, Yue-Ie; Chen, Liang-Jwu; Chen, Shu; Wen, Ien-Chie; Liu, Yi-Lun; Chen, Ku-Ting; Jiang, Mirng-Jier; Lin, Ming-Kuang; Rao, Meng-Yen; Yu, Lin-Chih; Ho, Tuan-Hua David; Yu, Su-May

2016-05-01

Rice is an important crop and major model plant for monocot functional genomics studies. With the establishment of various genetic resources for rice genomics, the next challenge is to systematically assign functions to predicted genes in the rice genome. Compared with the robustness of genome sequencing and bioinformatics techniques, progress in understanding the function of rice genes has lagged, hampering the utilization of rice genes for cereal crop improvement. The use of transfer DNA (T-DNA) insertional mutagenesis offers the advantage of uniform distribution throughout the rice genome, but preferentially in gene-rich regions, resulting in direct gene knockout or activation of genes within 20-30 kb up- and downstream of the T-DNA insertion site and high gene tagging efficiency. Here, we summarize the recent progress in functional genomics using the T-DNA-tagged rice mutant population. We also discuss important features of T-DNA activation- and knockout-tagging and promoter-trapping of the rice genome in relation to mutant and candidate gene characterizations and how to more efficiently utilize rice mutant populations and datasets for high-throughput functional genomics and phenomics studies by forward and reverse genetics approaches. These studies may facilitate the translation of rice functional genomics research to improvements of rice and other cereal crops. © 2015 John Wiley & Sons Ltd.
Evaluation of the DNA barcodes in Dendrobium (Orchidaceae) from mainland Asia.

PubMed

Xu, Songzhi; Li, Dezhu; Li, Jianwu; Xiang, Xiaoguo; Jin, Weitao; Huang, Weichang; Jin, Xiaohua; Huang, Luqi

2015-01-01

DNA barcoding has been proposed to be one of the most promising tools for accurate and rapid identification of taxa. However, few publications have evaluated the efficiency of DNA barcoding for the large genera of flowering plants. Dendrobium, one of the largest genera of flowering plants, contains many species that are important in horticulture, medicine and biodiversity conservation. Besides, Dendrobium is a notoriously difficult group to identify. DNA barcoding was expected to be a supplementary means for species identification, conservation and future studies in Dendrobium. We assessed the power of 11 candidate barcodes on the basis of 1,698 accessions of 184 Dendrobium species obtained primarily from mainland Asia. Our results indicated that five single barcodes, i.e., ITS, ITS2, matK, rbcL and trnH-psbA, can be easily amplified and sequenced with the currently established primers. Four barcodes, ITS, ITS2, ITS+matK, and ITS2+matK, have distinct barcoding gaps. ITS+matK was the optimal barcode based on all evaluation methods. Furthermore, the efficiency of ITS+matK was verified in four other large genera including Ficus, Lysimachia, Paphiopedilum, and Pedicularis in this study. Therefore, we tentatively recommend the combination of ITS+matK as a core DNA barcode for large flowering plant genera.
Evaluation of the DNA Barcodes in Dendrobium (Orchidaceae) from Mainland Asia

PubMed Central

Xu, Songzhi; Li, Dezhu; Li, Jianwu; Xiang, Xiaoguo; Jin, Weitao; Huang, Weichang; Jin, Xiaohua; Huang, Luqi

2015-01-01

DNA barcoding has been proposed to be one of the most promising tools for accurate and rapid identification of taxa. However, few publications have evaluated the efficiency of DNA barcoding for the large genera of flowering plants. Dendrobium, one of the largest genera of flowering plants, contains many species that are important in horticulture, medicine and biodiversity conservation. Besides, Dendrobium is a notoriously difficult group to identify. DNA barcoding was expected to be a supplementary means for species identification, conservation and future studies in Dendrobium. We assessed the power of 11 candidate barcodes on the basis of 1,698 accessions of 184 Dendrobium species obtained primarily from mainland Asia. Our results indicated that five single barcodes, i.e., ITS, ITS2, matK, rbcL and trnH-psbA, can be easily amplified and sequenced with the currently established primers. Four barcodes, ITS, ITS2, ITS+matK, and ITS2+matK, have distinct barcoding gaps. ITS+matK was the optimal barcode based on all evaluation methods. Furthermore, the efficiency of ITS+matK was verified in four other large genera including Ficus, Lysimachia, Paphiopedilum, and Pedicularis in this study. Therefore, we tentatively recommend the combination of ITS+matK as a core DNA barcode for large flowering plant genera. PMID:25602282
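The notion of a "barcoding gap" used in this abstract can be illustrated with a short Python sketch. It assumes the common definition (the smallest between-species distance exceeds the largest within-species distance) and uses a simple uncorrected p-distance on pre-aligned, equal-length sequences; the abstract does not say which distance model or software the authors used, and the function names and toy records here are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Uncorrected p-distance: fraction of differing sites between two aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned, non-empty, and of equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcoding_gap(records):
    """records: list of (species, aligned_sequence) pairs.

    Returns (max_intraspecific, min_interspecific, has_gap).
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        (intra if sp1 == sp2 else inter).append(p_distance(s1, s2))
    if not intra or not inter:
        raise ValueError("need at least one within-species and one between-species pair")
    max_intra, min_inter = max(intra), min(inter)
    return max_intra, min_inter, min_inter > max_intra


if __name__ == "__main__":
    # Toy data, invented purely for illustration.
    toy = [
        ("sp1", "AAAAAAAAAA"),
        ("sp1", "AAAAAAAAAT"),
        ("sp2", "CCCCCAAAAA"),
        ("sp2", "CCCCCAAAAT"),
    ]
    print(barcoding_gap(toy))
```

A marker or marker combination with a clear gap under this criterion separates species without overlap between within-species and between-species variation.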
DNA capture and next-generation sequencing can recover whole mitochondrial genomes from highly degraded samples for human identification

PubMed Central

2013-01-01

Background Mitochondrial DNA (mtDNA) typing can be a useful aid for identifying people from compromised samples when nuclear DNA is too damaged, degraded or below detection thresholds for routine short tandem repeat (STR)-based analysis. Standard mtDNA typing, focused on PCR amplicon sequencing of the control region (HVS I and HVS II), is limited by the resolving power of this short sequence, which misses up to 70% of the variation present in the mtDNA genome. Methods We used in-solution hybridisation-based DNA capture (using DNA capture probes prepared from modern human mtDNA) to recover mtDNA from post-mortem human remains in which the majority of DNA is both highly fragmented (<100 base pairs in length) and chemically damaged. The method ‘immortalises’ the finite quantities of DNA in valuable extracts as DNA libraries, which is followed by the targeted enrichment of endogenous mtDNA sequences and characterisation by next-generation sequencing (NGS). Results We sequenced whole mitochondrial genomes for human identification from samples where standard nuclear STR typing produced only partial profiles or demonstrably failed and/or where standard mtDNA hypervariable region sequences lacked resolving power. Multiple rounds of enrichment can substantially improve coverage and sequencing depth of mtDNA genomes from highly degraded samples. The application of this method has led to the reliable mitochondrial sequencing of human skeletal remains from unidentified World War Two (WWII) casualties approximately 70 years old and from archaeological remains (up to 2,500 years old). Conclusions This approach has potential applications in forensic science, historical human identification cases, archived medical samples, kinship analysis and population studies. In particular the methodology can be applied to any case, involving human or non-human species, where whole mitochondrial genome sequences are required to provide the highest level of maternal lineage discrimination. 
Multiple rounds of in-solution hybridisation-based DNA capture can retrieve whole mitochondrial genome sequences from even the most challenging samples. PMID:24289217
RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis.

PubMed

Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab

2012-01-01

RDNAnalyzer is a computer-based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate a DNA sequence, or users can upload sequences of their own interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various applications for sequence analysis. It predicts the DNA secondary structure and base pairings. It also provides tools for sequence analyses routinely performed by biological scientists, such as DNA replication, reverse complement generation, transcription, translation, and sequence-specific information such as the total number of nucleotide bases and the ATGC base contents along with their respective percentages, as well as a sequence cleaner. RDNAnalyzer was developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation and provides a user-friendly environment for sequence analysis. It is freely available. http://www.cemb.edu.pk/sw.html Abbreviations: RDNAnalyzer - Random DNA Analyser, GUI - Graphical user interface, XAML - Extensible Application Markup Language.
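For readers unfamiliar with the Nussinov algorithm mentioned in this abstract, here is a minimal Python sketch of the textbook recurrence, which maximizes the number of nested base pairs. It is not RDNAnalyzer's code and does not reproduce the extensions the authors describe; the restriction to Watson-Crick A-T and G-C pairs and the min_loop parameter are assumptions made for illustration.

```python
# Allowed pairings (assumption: Watson-Crick pairs only, for single-stranded DNA).
PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def nussinov_max_pairs(seq, min_loop=3):
    """Return the maximum number of nested base pairs in seq.

    min_loop is the minimum number of unpaired bases enclosed by any pair.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0
    # dp[i][j] holds the best pair count for the subsequence seq[i..j].
    dp = [[0] * n for _ in range(n)]
    for span in range(min_loop + 1, n):
        for i in range(n - span):
            j = i + span
            best = dp[i][j - 1]  # case 1: base j is left unpaired
            # case 2: base j pairs with some k, splitting the interval in two
            for k in range(i, j - min_loop):
                if (seq[k], seq[j]) in PAIRS:
                    left = dp[i][k - 1] if k > i else 0
                    best = max(best, left + 1 + dp[k + 1][j - 1])
            dp[i][j] = best
    return dp[0][n - 1]


if __name__ == "__main__":
    print(nussinov_max_pairs("GGGAAACCC"))
```

Recovering the actual pairing (rather than just the count) would require a traceback through the table, which is omitted here for brevity.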

Direct Detection and Sequencing of Damaged DNA Bases

PubMed Central

2011-01-01

Products of various forms of DNA damage have been implicated in a variety of important biological processes, such as aging, neurodegenerative diseases, and cancer. Therefore, there exists great interest to develop methods for interrogating damaged DNA in the context of sequencing. Here, we demonstrate that single-molecule, real-time (SMRT®) DNA sequencing can directly detect damaged DNA bases in the DNA template - as a by-product of the sequencing method - through an analysis of the DNA polymerase kinetics that are altered by the presence of a modified base. We demonstrate the sequencing of several DNA templates containing products of DNA damage, including 8-oxoguanine, 8-oxoadenine, O6-methylguanine, 1-methyladenine, O4-methylthymine, 5-hydroxycytosine, 5-hydroxyuracil, 5-hydroxymethyluracil, or thymine dimers, and show that these base modifications can be readily detected with single-modification resolution and DNA strand specificity. We characterize the distinct kinetic signatures generated by these DNA base modifications. PMID:22185597
Direct detection and sequencing of damaged DNA bases.

PubMed

Clark, Tyson A; Spittle, Kristi E; Turner, Stephen W; Korlach, Jonas

2011-12-20

Products of various forms of DNA damage have been implicated in a variety of important biological processes, such as aging, neurodegenerative diseases, and cancer. Therefore, there exists great interest to develop methods for interrogating damaged DNA in the context of sequencing. Here, we demonstrate that single-molecule, real-time (SMRT®) DNA sequencing can directly detect damaged DNA bases in the DNA template - as a by-product of the sequencing method - through an analysis of the DNA polymerase kinetics that are altered by the presence of a modified base. We demonstrate the sequencing of several DNA templates containing products of DNA damage, including 8-oxoguanine, 8-oxoadenine, O6-methylguanine, 1-methyladenine, O4-methylthymine, 5-hydroxycytosine, 5-hydroxyuracil, 5-hydroxymethyluracil, or thymine dimers, and show that these base modifications can be readily detected with single-modification resolution and DNA strand specificity. We characterize the distinct kinetic signatures generated by these DNA base modifications.
Screening and Characterization of Lactic Acid Bacteria Strains with Anti-inflammatory Activities through in vitro and Caenorhabditis elegans Model Testing

PubMed Central

Park, Mi Ri; Kim, Younghoon; Lee, Myung-Ki

2015-01-01

The present study was conducted to screen candidate probiotic strains for anti-inflammatory activity. Initially, a nitric oxide (NO) assay was used to test selected candidate probiotic strains for anti-inflammatory activity in cultures of the murine macrophage cell line, RAW 264.7. Then, the in vitro probiotic properties of the strains, including bile tolerance, acid resistance, and growth in skim milk media, were investigated. We also performed an in vitro hydrophobicity test and an intestinal adhesion assay using Caenorhabditis elegans as a surrogate in vivo model. From our screening, we obtained 4 probiotic candidate lactic acid bacteria (LAB) strains based on their anti-inflammatory activity in lipopolysaccharide (LPS)-stimulated RAW 264.7 cell cultures and the results of the in vitro and in vivo probiotic property assessments. Molecular characterization using 16S rDNA sequencing analysis identified the 4 LAB strains as Lactobacillus plantarum. The selected L. plantarum strains (CAU1054, CAU1055, CAU1064, and CAU1106) were found to possess desirable in vitro and in vivo probiotic properties, and these strains are good candidates for further investigations in animal models and human clinical studies to elucidate the mechanisms underlying their anti-inflammatory activities. PMID:26761805
A comprehensive list of cloned human DNA sequences

PubMed Central

Schmidtke, Jörg; Cooper, David N.

1987-01-01

A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:3575113
A comprehensive list of cloned human DNA sequences

PubMed Central

Schmidtke, Jörg; Cooper, David N.

1990-01-01

A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:2333227
A comprehensive list of cloned human DNA sequences

PubMed Central

Schmidtke, Jörg; Cooper, David N.

1988-01-01

A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:3368330
A comprehensive list of cloned human DNA sequences

PubMed Central

Schmidtke, Jörg; Cooper, David N.

1989-01-01

A list of DNA sequences cloned from the human genome is presented. Intended as a guide to clone availability, this list includes published reports of cDNA, genomic and synthetic clones comprising gene and pseudogene sequences, uncharacterised DNA segments and repetitive DNA elements. PMID:2654889
Kilo-sequencing: an ordered strategy for rapid DNA sequence data acquisition.

PubMed Central

Barnes, W M; Bevan, M

1983-01-01

A strategy for rapid DNA sequence acquisition in an ordered, nonrandom manner, while retaining all of the conveniences of the dideoxy method with M13 transducing phage DNA template, is described. Target DNA 3 to 14 kb in size can be stably carried by our M13 vectors. Suitable targets are stretches of DNA which lack an enzyme recognition site which is unique on our cloning vectors and adjacent to the sequencing primer; current sites that are so useful when lacking are Pst, Xba, HindIII, BglII, EcoRI. By an in vitro procedure, we cut RF DNA once randomly and once specifically, to create thousands of deletions which start at the unique restriction site adjacent to the dideoxy sequencing primer and extend various distances across the target DNA. Phage carrying a desired size of deletions, whose DNA as template will give rise to DNA sequence data in a desired location along the target DNA, may be purified by electrophoresis alive on agarose gels. Phage running in the same location on the agarose gel thus conveniently give rise to nucleotide sequence data from the same kilobase of target DNA. PMID:6298723
A Multilocus Species Delimitation Reveals a Striking Number of Species of Coralline Algae Forming Maerl in the OSPAR Maritime Area

PubMed Central

Pardo, Cristina; Lopez, Lua; Peña, Viviana; Hernández-Kantún, Jazmin; Le Gall, Line; Bárbara, Ignacio; Barreiro, Rodolfo

2014-01-01

Maerl beds are sensitive biogenic habitats built by an accumulation of loose-lying, non-geniculate coralline algae. While these habitats are considered hot-spots of marine biodiversity, the number and distribution of maerl-forming species is uncertain because homoplasy and plasticity of morphological characters are common. As a result, species discrimination based on morphological features is notoriously challenging, making these coralline algae the ideal candidates for a DNA barcoding study. Here, mitochondrial (COI-5P DNA barcode fragment) and plastidial (psbA gene) sequence data were used in a two-step approach to delimit species in 224 collections of maerl sampled from Svalbard (78°96’N) to the Canary Islands (28°64’N) that represented 10 morphospecies from four genera and two families. First, the COI-5P dataset was analyzed with two methods based on distinct criteria (ABGD and GMYC) to delineate 16 primary species hypotheses (PSHs) arranged into four major lineages. Second, chloroplast (psbA) sequence data served to consolidate these PSHs into 13 secondary species hypotheses (SSHs) that showed biologically plausible ranges. Using several lines of evidence (e.g. morphological characters, known species distributions, sequences from type and topotype material), six SSHs were assigned to available species names that included the geographically widespread Phymatolithon calcareum, Lithothamnion corallioides, and L. glaciale; possible identities of other SSHs are discussed. Concordance between SSHs and morphospecies was minimal, highlighting the convenience of DNA barcoding for an accurate identification of maerl specimens. Our survey indicated that a majority of maerl forming species have small distribution ranges and revealed a gradual replacement of species with latitude. PMID:25111057
Novel USH2A compound heterozygous mutations cause RP/USH2 in a Chinese family.

PubMed

Liu, Xiaowen; Tang, Zhaohui; Li, Chang; Yang, Kangjuan; Gan, Guanqi; Zhang, Zibo; Liu, Jingyu; Jiang, Fagang; Wang, Qing; Liu, Mugen

2010-03-17

To identify the disease-causing gene in a four-generation Chinese family affected with retinitis pigmentosa (RP). Linkage analysis was performed with a panel of microsatellite markers flanking the candidate genetic loci of RP. These loci included 38 known RP genes. The complete coding region and exon-intron boundaries of Usher syndrome 2A (USH2A) were sequenced with the proband DNA to screen the disease-causing gene mutation. Restriction fragment length polymorphism (RFLP) analysis and direct DNA sequence analysis were done to demonstrate co-segregation of the USH2A mutations with the family disease. One hundred normal controls were used without the mutations. The disease-causing gene in this Chinese family was linked to the USH2A locus on chromosome 1q41. Direct DNA sequence analysis of USH2A identified two novel mutations in the patients: one missense mutation p.G1734R in exon 26 and a splice site mutation, IVS32+1G>A, which was found in the donor site of intron 32 of USH2A. Neither the p.G1734R nor the IVS32+1G>A mutation was found in the unaffected family members or the 100 normal controls. One patient with a homozygous mutation displayed only RP symptoms until now, while three patients with compound heterozygous mutations in the family of study showed both RP and hearing impairment. This study identified two novel mutations: p.G1734R and IVS32+1G>A of USH2A in a four-generation Chinese RP family. In this study, the heterozygous mutation and the homozygous mutation in USH2A may cause Usher syndrome Type II or RP, respectively. These two mutations expand the mutant spectrum of USH2A.
Isolation of a candidate human telomerase catalytic subunit gene, which reveals complex splicing patterns in different cell types.

PubMed

Kilian, A; Bowtell, D D; Abud, H E; Hime, G R; Venter, D J; Keese, P K; Duncan, E L; Reddel, R R; Jefferson, R A

1997-11-01

Telomerase is a multicomponent reverse transcriptase enzyme that adds DNA repeats to the ends of chromosomes using its RNA component as a template for synthesis. Telomerase activity is detected in the germline as well as the majority of tumors and immortal cell lines, and at low levels in several types of normal cells. We have cloned a human gene homologous to a protein from Saccharomyces cerevisiae and Euplotes aediculatus that has reverse transcriptase motifs and is thought to be the catalytic subunit of telomerase in those species. This gene is present in the human genome as a single copy sequence with a dominant transcript of approximately 4 kb in a human colon cancer cell line, LIM1215. The cDNA sequence was determined using clones from a LIM1215 cDNA library and by RT-PCR, cRACE and 3'RACE on mRNA from the same source. We show that the gene is expressed in several normal tissues, telomerase-positive post-crisis (immortal) cell lines and various tumors but is not expressed in the majority of normal tissues analyzed, pre-crisis (non-immortal) cells and telomerase-negative immortal (ALT) cell lines. Multiple products were identified by RT-PCR using primers within the reverse transcriptase domain. Sequencing of these products suggests that they arise by alternative splicing. Strikingly, various tumors, cell lines and even normal tissues (colonic crypt and testis) showed considerable differences in the splicing patterns. Alternative splicing of the telomerase catalytic subunit transcript may be important for the regulation of telomerase activity and may give rise to proteins with different biochemical functions.
Silicene nanoribbon as a new DNA sequencing device

NASA Astrophysics Data System (ADS)

Alesheikh, Sara; Shahtahmassebi, Nasser; Roknabadi, Mahmood Rezaee; Pilevar Shahri, Raheleh

2018-02-01

The importance of DNA sequencing in many fields has motivated the search for fast and inexpensive methods. Nanotechnology contributes to this development by introducing nanostructures that can be used for DNA sequencing. In this work, we study the interaction between a zigzag silicene nanoribbon and DNA nucleobases using DFT and the non-equilibrium Green's function approach, to investigate the possibility of using zigzag silicene nanoribbons as a biosensor for DNA sequencing.
Isolation and characterization of target sequences of the chicken CdxA homeobox gene.

PubMed Central

Margalit, Y; Yarus, S; Shapira, E; Gruenbaum, Y; Fainsod, A

1993-01-01

The DNA binding specificity of the chicken homeodomain protein CDXA was studied. Using a CDXA-glutathione-S-transferase fusion protein, DNA fragments containing the binding site for this protein were isolated. The sources of DNA were oligonucleotides with random sequence and chicken genomic DNA. The DNA fragments isolated were sequenced and tested in DNA binding assays. Sequencing revealed that most DNA fragments are AT rich, which is a common feature of homeodomain binding sites. By electrophoretic mobility shift assays it was shown that the different target sequences isolated bind to the CDXA protein with different affinities. The specific sequences bound by the CDXA protein in the genomic fragments isolated were determined by DNase I footprinting. From the footprinted sequences, the CDXA consensus binding site was determined. The CDXA protein binds the consensus sequence A, A/T, T, A/T, A, T, A/G. The CAUDAL binding site in the ftz promoter is also included in this consensus sequence. When tested, some of the genomic target sequences were capable of enhancing the transcriptional activity of reporter plasmids when introduced into CDXA expressing cells. This study determined the DNA sequence specificity of the CDXA protein and also shows that this protein can further activate transcription in cells in culture. PMID:7909943
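The consensus site reported in this abstract (A, A/T, T, A/T, A, T, A/G) translates directly into a pattern that can be scanned against a sequence. The Python sketch below is only an illustration of that idea: the names are hypothetical, it searches the forward strand only, and an exact consensus match says nothing about the differing binding affinities the authors measured.

```python
import re

# Consensus A, A/T, T, A/T, A, T, A/G written as a regular expression.
# The lookahead lets overlapping sites be reported.
CDXA_CONSENSUS = re.compile(r"(?=(A[AT]T[AT]AT[AG]))")


def find_consensus_sites(seq):
    """Return (0-based start, matched 7-mer) for each consensus match on the given strand."""
    return [(m.start(), m.group(1)) for m in CDXA_CONSENSUS.finditer(seq.upper())]


if __name__ == "__main__":
    print(find_consensus_sites("GGATTTATGCC"))
```

To cover both strands, the same function could be applied to the reverse complement of the input sequence.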
Sequence periodicity in nucleosomal DNA and intrinsic curvature.

PubMed

Nair, T Murlidharan

2010-05-17

Most eukaryotic DNA contained in the nucleus is packaged by wrapping DNA around histone octamers. Histones are ubiquitous and bind most regions of chromosomal DNA. In order to achieve smooth wrapping of the DNA around the histone octamer, the DNA duplex should be able to deform and should possess intrinsic curvature. The deformability of DNA is a result of the non-parallelness of base pair stacks. The stacking interaction between base pairs is sequence dependent. The higher the stacking energy, the more rigid the DNA helix; thus it is natural to expect that sequences that are involved in wrapping around the histone octamer should be unstacked and possess intrinsic curvature. Intrinsic curvature has been shown to be dictated by the periodic recurrence of certain dinucleotides. Several genome-wide studies directed towards mapping of nucleosome positions have revealed periodicity associated with certain stretches of sequences. In the current study, these sequences have been analyzed with a view to understanding their sequence-dependent structures. Higher order DNA structures and the distribution of molecular bend loci associated with 146-base nucleosome core DNA sequences from C. elegans and chicken have been analyzed using the theoretical model for DNA curvature. The curvature dispersion calculated by cyclically permuting the sequences revealed that the molecular bend loci were delocalized throughout the nucleosome core region and had varying degrees of intrinsic curvature. The higher order structures associated with nucleosomes of C. elegans and chicken calculated from the sequences revealed heterogeneity with respect to the deviation of the DNA axis. The results point to the possibility that context-dependent curvature of varying degrees is associated with nucleosomal DNA.
Contig Maps and Genomic Sequencing Identify Candidate Genes in the Usher 1C Locus

PubMed Central

Higgins, Michael J.; Day, Colleen D.; Smilinich, Nancy J.; Ni, L.; Cooper, Paul R.; Nowak, Norma J.; Davies, Chris; de Jong, Pieter J.; Hejtmancik, Fielding; Evans, Glen A.; Smith, Richard J.H.; Shows, Thomas B.

1998-01-01

Usher syndrome 1C (USH1C) is a congenital condition manifesting profound hearing loss, the absence of vestibular function, and eventual retinal degeneration. The USH1C locus has been mapped genetically to a 2- to 3-cM interval in 11p14–15.1 between D11S899 and D11S861. In an effort to identify the USH1C disease gene we have isolated the region between these markers in yeast artificial chromosomes (YACs) using a combination of STS content mapping and Alu–PCR hybridization. The YAC contig is ∼3.5 Mb and has located several other loci within this interval, resulting in the order CEN-LDHA-SAA1-TPH-D11S1310-(D11S1888/KCNC1)-MYOD1-D11S902-D11S921-D11S1890-TEL. Subsequent haplotyping and homozygosity analysis refined the location of the disease gene to a 400-kb interval between D11S902 and D11S1890 with all affected individuals being homozygous for the internal marker D11S921. To facilitate gene identification, the critical region has been converted into P1 artificial chromosome (PAC) clones using sequence-tagged sites (STSs) mapped to the YAC contig, Alu–PCR products generated from the YACs, and PAC end probes. A contig of >50 PAC clones has been assembled between D11S1310 and D11S1890, confirming the order of markers used in haplotyping. Three PAC clones representing nearly two-thirds of the USH1C critical region have been sequenced. PowerBLAST analysis identified six clusters of expressed sequence tags (ESTs), two known genes (BIR, SUR1) mapped previously to this region, and a previously characterized but unmapped gene NEFA (DNA binding/EF hand/acidic amino-acid-rich). GRAIL analysis identified 11 CpG islands and 73 exons of excellent quality. These data allowed the construction of a transcription map for the USH1C critical region, consisting of three known genes and six or more novel transcripts. Based on their map location, these loci represent candidate disease loci for USH1C.
The NEFA gene was assessed as the USH1C locus by the sequencing of an amplified NEFA cDNA from an USH1C patient; however, no mutations were detected. [The sequence data described in this paper have been submitted to GenBank under accession numbers AC000406–AC000407.] PMID:9445488
Assessing the Fidelity of Ancient DNA Sequences Amplified From Nuclear Genes

PubMed Central

Binladen, Jonas; Wiuf, Carsten; Gilbert, M. Thomas P.; Bunce, Michael; Barnett, Ross; Larson, Greger; Greenwood, Alex D.; Haile, James; Ho, Simon Y. W.; Hansen, Anders J.; Willerslev, Eske

2006-01-01

To date, the field of ancient DNA has relied almost exclusively on mitochondrial DNA (mtDNA) sequences. However, a number of recent studies have reported the successful recovery of ancient nuclear DNA (nuDNA) sequences, thereby allowing the characterization of genetic loci directly involved in phenotypic traits of extinct taxa. It is well documented that postmortem damage in ancient mtDNA can lead to the generation of artifactual sequences. However, as yet no one has thoroughly investigated the damage spectrum in ancient nuDNA. By comparing clone sequences from 23 fossil specimens, recovered from environments ranging from permafrost to desert, we demonstrate the presence of miscoding lesion damage in both the mtDNA and nuDNA, resulting in insertion of erroneous bases during amplification. Interestingly, no significant differences in the frequency of miscoding lesion damage are recorded between mtDNA and nuDNA despite great differences in cellular copy numbers. For both mtDNA and nuDNA, we find significant positive correlations between total sequence heterogeneity and the rates of type 1 transitions (adenine → guanine and thymine → cytosine) and type 2 transitions (cytosine → thymine and guanine → adenine), respectively. Type 2 transitions are by far the most dominant and increase relative to those of type 1 with damage load. The results suggest that the deamination of cytosine (and 5-methyl cytosine) to uracil (and thymine) is the main cause of miscoding lesions in both ancient mtDNA and nuDNA sequences. We argue that the problems presented by postmortem damage, as well as problems with contamination from exogenous sources of conserved nuclear genes, allelic variation, and the reliance on single nucleotide polymorphisms, call for great caution in studies relying on ancient nuDNA sequences. PMID:16299392
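The type 1 / type 2 classification described above can be illustrated with a short sketch. This is not the authors' pipeline: it assumes a reference sequence and a clone sequence that are already aligned without gaps, and the function name `classify_mismatches` and the toy sequences are hypothetical.

```python
from collections import Counter

# Transition classes as defined in the abstract (reference base -> clone base).
TYPE1 = {("A", "G"), ("T", "C")}
TYPE2 = {("C", "T"), ("G", "A")}


def classify_mismatches(reference, clone):
    """Count type 1 transitions, type 2 transitions, and other mismatches.

    Assumes both sequences are pre-aligned and of equal length (no gaps).
    """
    if len(reference) != len(clone):
        raise ValueError("sequences must be aligned to the same length")
    counts = Counter({"type1": 0, "type2": 0, "other": 0})
    for ref_base, obs_base in zip(reference.upper(), clone.upper()):
        if ref_base == obs_base:
            continue
        pair = (ref_base, obs_base)
        if pair in TYPE1:
            counts["type1"] += 1
        elif pair in TYPE2:
            counts["type2"] += 1
        else:
            counts["other"] += 1
    return dict(counts)


# Toy example (invented sequences, not data from the study).
print(classify_mismatches("ACGTCCGA", "ATGTCTGG"))
```

In the toy pair, two C→T changes fall into type 2 and one A→G change into type 1, mirroring the kind of tally the abstract correlates with total sequence heterogeneity.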
[Current applications of high-throughput DNA sequencing technology in antibody drug research].

PubMed

Yu, Xin; Liu, Qi-Gang; Wang, Ming-Rong

2012-03-01

Since a high-throughput DNA sequencing technology based on PCR carried out in oil emulsions was published in 2005, high-throughput DNA sequencing platforms have evolved into a robust technology for sequencing genomes and diverse DNA libraries. Antibody libraries with vast numbers of members currently serve as a foundation for discovering novel antibody drugs, and high-throughput DNA sequencing technology makes it possible to rapidly identify functional antibody variants with desired properties. Herein we present a review of current applications of high-throughput DNA sequencing technology in the analysis of antibody library diversity, sequencing of CDR3 regions, identification of potent antibodies based on sequence frequency, discovery of functional genes, and combination with various display technologies, so as to provide an alternative approach to the discovery and development of antibody drugs.
DNA fingerprinting, DNA barcoding, and next generation sequencing technology in plants.

PubMed

Sucher, Nikolaus J; Hennell, James R; Carles, Maria C

2012-01-01

DNA fingerprinting of plants has become an invaluable tool in forensic, scientific, and industrial laboratories all over the world. PCR has become part of virtually every variation of the plethora of approaches used for DNA fingerprinting today. DNA sequencing is increasingly used either in combination with or as a replacement for traditional DNA fingerprinting techniques. A prime example is the use of short, standardized regions of the genome as taxon barcodes for biological identification of plants. Rapid advances in "next generation sequencing" (NGS) technology are driving down the cost of sequencing and bringing large-scale sequencing projects into the reach of individual investigators. We present an overview of recent publications that demonstrate the use of "NGS" technology for DNA fingerprinting and DNA barcoding applications.
Mammalian DNA enriched for replication origins is enriched for snap-back sequences.

PubMed

Zannis-Hadjopoulos, M; Kaufmann, G; Martin, R G

1984-11-15

Using the instability of replication loops as a method for the isolation of double-stranded nascent DNA, extruded DNA enriched for replication origins was obtained and denatured. Snap-back DNA, single-stranded DNA with inverted repeats (palindromic sequences), reassociates rapidly into stem-loop structures with zero-order kinetics when conditions are changed from denaturing to renaturing, and can be assayed by chromatography on hydroxyapatite. Origin-enriched nascent DNA strands from mouse, rat and monkey cells growing either synchronously or asynchronously were purified and assayed for the presence of snap-back sequences. The results show that origin-enriched DNA is also enriched for snap-back sequences, implying that some origins for mammalian DNA replication contain or lie near palindromic sequences.
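As a computational counterpart to the snap-back assay, inverted repeats can be located directly in a sequence. The sketch below is only an illustration under stated assumptions: exact-match stems of a fixed length, a bounded loop size, and hypothetical names (`find_inverted_repeats`, `stem_len`, `max_loop`). It says nothing about the hydroxyapatite method used in the study.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A/C/G/T only)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeats(seq, stem_len=4, max_loop=6):
    """Find stems whose reverse complement occurs just downstream.

    Returns (left_start, right_start, stem) tuples. A hit means the single
    strand could fold back on itself into a stem-loop, with 0..max_loop
    unpaired bases between the two arms.
    """
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - 2 * stem_len + 1):
        stem = seq[i:i + stem_len]
        target = reverse_complement(stem)
        for loop in range(max_loop + 1):
            j = i + stem_len + loop
            if j + stem_len > len(seq):
                break
            if seq[j:j + stem_len] == target:
                hits.append((i, j, stem))
    return hits


# Toy sequence (invented): GCTC ... GAGC can pair across a TTTT loop.
print(find_inverted_repeats("AAGCTCTTTTGAGCAA"))
```

Real palindrome searches usually tolerate mismatches and variable stem lengths; the exact-match version is kept here only for brevity.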
Large-scale transcriptome characterization and mass discovery of SNPs in globe artichoke and its related taxa.

PubMed

Scaglione, Davide; Lanteri, Sergio; Acquadro, Alberto; Lai, Zhao; Knapp, Steven J; Rieseberg, Loren; Portis, Ezio

2012-10-01

Cynara cardunculus (2n = 2× = 34) is a member of the Asteraceae family that contributes significantly to the agricultural economy of the Mediterranean basin. The species includes two cultivated varieties, globe artichoke and cardoon, which are grown mainly for food. Cynara cardunculus is an orphan crop species whose genome/transcriptome has been relatively unexplored, especially in comparison to other Asteraceae crops. Hence, there is a significant need to improve its genomic resources through the identification of novel genes and sequence-based markers, to design new breeding schemes aimed at increasing quality and crop productivity. We report the outcome of cDNA sequencing and assembly for eleven accessions of C. cardunculus. Sequencing of three mapping parental genotypes using Roche 454-Titanium technology generated 1.7 × 10⁶ reads, which were assembled into 38,726 reference transcripts covering 32 Mbp. Putative enzyme-encoding genes were annotated using the KEGG-database. Transcription factors and candidate resistance genes were surveyed as well. Paired-end sequencing was done for cDNA libraries of eight other representative C. cardunculus accessions on an Illumina Genome Analyzer IIx, generating 46 × 10⁶ reads. Alignment of the IGA and 454 reads to reference transcripts led to the identification of 195,400 SNPs with a Bayesian probability exceeding 95%; a validation rate of 90% was obtained by Sanger-sequencing of a subset of contigs. These results demonstrate that the integration of data from different NGS platforms enables large-scale transcriptome characterization, along with massive SNP discovery. This information will contribute to the dissection of key agricultural traits in C. cardunculus and facilitate the implementation of marker-assisted selection programs. © 2012 The Authors. Plant Biotechnology Journal © 2012 Society for Experimental Biology, Association of Applied Biologists and Blackwell Publishing Ltd.

CRAWview: for viewing splicing variation, gene families, and polymorphism in clusters of ESTs and full-length sequences.

PubMed

Chou, A; Burke, J

1999-05-01

DNA sequence clustering has become a valuable method in support of gene discovery and gene expression analysis. Our interest lies in leveraging the sequence diversity within clusters of expressed sequence tags (ESTs) to model gene structure for the study of gene variants that arise from, among other things, alternative mRNA splicing, polymorphism, and divergence after gene duplication, fusion, and translocation events. In previous work, CRAW was developed to discover gene variants from assembled clusters of ESTs. Most importantly, novel gene features (the differing units between gene variants, for example alternative exons, polymorphisms, transposable elements, etc.) that are specialized to tissue, disease, population, or developmental states can be identified when these tools collate DNA source information with gene variant discrimination. While the goal is complete automation of novel feature and gene variant detection, current methods are far from perfect, and hence the development of effective tools for visualization and exploratory data analysis is of paramount importance in the process of sifting through candidate genes and validating targets. We present CRAWview, a Java-based visualization extension to CRAW. Features that vary between gene forms are displayed using an automatically generated color-coded index. The reporting format of CRAWview gives a brief, high-level summary report to display overlap and divergence within clusters of sequences as well as the ability to 'drill down' and see detailed information concerning regions of interest. Additionally, the alignment viewing and editing capabilities of CRAWview make it possible to interactively correct frame-shifts and otherwise edit cluster assemblies. We have implemented CRAWview as a Java application across Windows NT/95 and UNIX platforms. A beta version of CRAWview will be freely available to academic users from Pangea Systems (http://www.pangeasystems.com).
Development of a High-Throughput Resequencing Array for the Detection of Pathogenic Mutations in Osteogenesis Imperfecta

PubMed Central

Wang, Yao; Cui, Yazhou; Zhou, Xiaoyan; Han, Jinxiang

2015-01-01

Objective: Osteogenesis imperfecta (OI) is a rare inherited skeletal disease, characterized by bone fragility and low bone density. The mutations in this disorder have been widely reported to be on various exonic hotspots of the candidate genes, including COL1A1, COL1A2, CRTAP, LEPRE1, and FKBP10, thus creating a great demand for precise genetic tests. However, large genome sizes make the process daunting and the analyses inefficient and expensive. Therefore, we aimed to develop a fast, accurate, efficient, and cheaper sequencing platform for OI diagnosis; to this end, use of an advanced array-based technique was proposed. Method: A CustomSeq Affymetrix Resequencing Array was established for high-throughput sequencing of five genes simultaneously. Genomic DNA extraction from 13 OI patients and 85 normal controls and amplification using long-range PCR (LR-PCR) were followed by DNA fragmentation and chip hybridization, according to standard Affymetrix protocols. Hybridization signals were determined using GeneChip Sequence Analysis Software (GSEQ). To examine feasibility, the outcome of the new resequencing approach was validated by the conventional capillary sequencing method. Result: The overall call rate of the resequencing array was 96–98%, and the agreement between microarray and capillary sequencing was 99.99%. Eleven of the 13 OI patients with pathogenic mutations were successfully detected by the chip analysis without adjustment, and one further mutation could be identified by manual visual inspection. Conclusion: A high-throughput resequencing array was developed that detects the disease-associated mutations in OI, providing a potential tool to facilitate large-scale genetic screening for OI patients. Through this method, a novel mutation was also found. PMID:25742658
Validation of SmartRank: A likelihood ratio software for searching national DNA databases with complex DNA profiles.

PubMed

Benschop, Corina C G; van de Merwe, Linda; de Jong, Jeroen; Vanvooren, Vanessa; Kempenaers, Morgane; Kees van der Beek, C P; Barni, Filippo; Reyes, Eusebio López; Moulin, Léa; Pene, Laurent; Haned, Hinda; Sijen, Titia

2017-07-01

Searching a national DNA database with complex and incomplete profiles usually yields very large numbers of possible matches that can present many candidate suspects to be further investigated by the forensic scientist and/or police. Current practice in most forensic laboratories consists of ordering these 'hits' based on the number of matching alleles with the searched profile. Thus, candidate profiles that share the same number of matching alleles are not differentiated, and because the candidate list lacks other ranking criteria it may be difficult to discern a true match from the false positives or to notice that all candidates are in fact false positives. SmartRank was developed to put forward only relevant candidates and rank them accordingly. The SmartRank software computes a likelihood ratio (LR) for the searched profile and each profile in the DNA database and ranks database entries above a defined LR threshold according to the calculated LR. In this study, we examined for mixed DNA profiles of variable complexity whether the true donors are retrieved, what the number of false positives above an LR threshold is, and the ranking position of the true donors. Using 343 mixed DNA profiles, over 750 SmartRank searches were performed. In addition, the performance of SmartRank and CODIS was compared regarding DNA database searches, and SmartRank was found to be complementary to CODIS. We also describe the applicable domain of SmartRank and provide guidelines. The SmartRank software is open-source and freely available. Using the best practice guidelines, SmartRank enables obtaining investigative leads in criminal cases lacking a suspect. Copyright © 2017 Elsevier B.V. All rights reserved.
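The ranking step can be pictured with a minimal sketch. It does not reproduce SmartRank's likelihood ratio model or its interface: the LR values are assumed to have been computed elsewhere, and the `Candidate` class, the `rank_by_lr` function, and all numbers are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    profile_id: str
    matching_alleles: int
    lr: float  # likelihood ratio, assumed to be precomputed elsewhere


def rank_by_lr(candidates, lr_threshold):
    """Keep candidates whose LR exceeds the threshold, highest LR first."""
    kept = [c for c in candidates if c.lr > lr_threshold]
    return sorted(kept, key=lambda c: c.lr, reverse=True)


# Invented database hits: P1 and P2 tie on matching alleles, so an
# allele-count ordering cannot separate them, but their LRs differ widely.
hits = [
    Candidate("P1", matching_alleles=12, lr=5.0e3),
    Candidate("P2", matching_alleles=12, lr=0.4),
    Candidate("P3", matching_alleles=11, lr=2.0e6),
]
for c in rank_by_lr(hits, lr_threshold=1000):
    print(c.profile_id, c.lr)
```

The point of the toy data is only that an LR-based list can both reorder and shorten a list built on allele counts.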
DNA sequence determinants controlling affinity, stability and shape of DNA complexes bound by the nucleoid protein Fis

DOE PAGES

Hancock, Stephen P.; Stella, Stefano; Cascio, Duilio; ...

2016-03-09

The abundant Fis nucleoid protein selectively binds poorly related DNA sequences with high affinities to regulate diverse DNA reactions. Fis binds DNA primarily through DNA backbone contacts and selects target sites by reading conformational properties of DNA sequences, most prominently intrinsic minor groove widths. High-affinity binding requires Fis-stabilized DNA conformational changes that vary depending on DNA sequence. In order to better understand the molecular basis for high affinity site recognition, we analyzed the effects of DNA sequence within and flanking the core Fis binding site on binding affinity and DNA structure. X-ray crystal structures of Fis-DNA complexes containing variable sequences in the noncontacted center of the binding site or variations within the major groove interfaces show that the DNA can adapt to the Fis dimer surface asymmetrically. We show that the presence and position of pyrimidine-purine base steps within the major groove interfaces affect both local DNA bending and minor groove compression to modulate affinities and lifetimes of Fis-DNA complexes. Sequences flanking the core binding site also modulate complex affinities, lifetimes, and the degree of local and global Fis-induced DNA bending. In particular, a G immediately upstream of the 15 bp core sequence inhibits binding and bending, and A-tracts within the flanking base pairs increase both complex lifetimes and global DNA curvatures. Taken together, our observations support a revised DNA motif specifying high-affinity Fis binding and highlight the range of conformations that Fis-bound DNA can adopt. Lastly, the affinities and DNA conformations of individual Fis-DNA complexes are likely to be tailored to their context-specific biological functions.
Specific minor groove solvation is a crucial determinant of DNA binding site recognition

PubMed Central

Harris, Lydia-Ann; Williams, Loren Dean; Koudelka, Gerald B.

2014-01-01

The DNA sequence preferences of nearly all sequence specific DNA binding proteins are influenced by the identities of bases that are not directly contacted by protein. Discrimination between non-contacted base sequences is commonly based on the differential abilities of DNA sequences to allow narrowing of the DNA minor groove. However, the factors that govern the propensity of minor groove narrowing are not completely understood. Here we show that the differential abilities of various DNA sequences to support formation of a highly ordered and stable minor groove solvation network are a key determinant of non-contacted base recognition by a sequence-specific binding protein. In addition, disrupting the solvent network in the non-contacted region of the binding site alters the protein's ability to recognize contacted base sequences at positions 5–6 bases away. This observation suggests that DNA solvent interactions link contacted and non-contacted base recognition by the protein. PMID:25429976
A Method for Preparing DNA Sequencing Templates Using a DNA-Binding Microplate

PubMed Central

Yang, Yu; Hebron, Haroun R.; Hang, Jun

2009-01-01

A DNA-binding matrix was immobilized on the surface of a 96-well microplate and used for plasmid DNA preparation for DNA sequencing. The same DNA-binding plate was used for bacterial growth, cell lysis, DNA purification, and storage. In a single step using one buffer, bacterial cells were lysed by enzymes, and released DNA was captured on the plate simultaneously. After two wash steps, DNA was eluted and stored in the same plate. Inclusion of phosphates in the culture medium was found to enhance the yield of plasmid significantly. Purified DNA samples were used successfully in DNA sequencing with high consistency and reproducibility. Eleven vectors and nine libraries were tested using this method. In 10 μl sequencing reactions using 3 μl sample and 0.25 μl BigDye Terminator v3.1, the results from a 3730xl sequencer gave a success rate of 90–95% and read-lengths of 700 bases or more. The method is fully automatable and convenient for manual operation as well. It enables reproducible, high-throughput, rapid production of DNA with purity and yields sufficient for high-quality DNA sequencing at a substantially reduced cost. PMID:19568455
Dendritic Cell-Based Immunotherapy of Breast Cancer: Modulation by CpG DNA

DTIC Science & Technology

2005-09-01

tumor-associated antigens and bacterial DNA oligodeoxynucleotides containing unmethylated CpG sequences (CpG DNA) further augment the immune priming...associated antigens by cytotoxic T lymphocytes, and bacterial DNA oligodeoxy- nucleotides containing unmethylated CpG sequences (CpG DNA) can further...further amplify their immunostimulatory capacity and bacterial DNA oligodeoxynucleotides (ODN) containing unmethylated CpG sequences (CpG DNA) provide such
A rapid and cost-effective method for sequencing pooled cDNA clones by using a combination of transposon insertion and Gateway technology.

PubMed

Morozumi, Takeya; Toki, Daisuke; Eguchi-Ogawa, Tomoko; Uenishi, Hirohide

2011-09-01

Large-scale cDNA-sequencing projects require an efficient strategy for mass sequencing. Here we describe a method for sequencing pooled cDNA clones using a combination of transposon insertion and Gateway technology. Our method reduces the number of shotgun clones that are unsuitable for reconstruction of cDNA sequences, and has the advantage of reducing the total costs of the sequencing project.
Biological sequence compression algorithms.

PubMed

Matsumoto, T; Sadakane, K; Imai, H

2000-01-01

Today, more and more DNA sequences are becoming available. Information about DNA sequences is stored in molecular biology databases. The size and importance of these databases will keep growing, so this information must be stored and communicated efficiently. Furthermore, sequence compression can be used to define similarities between biological sequences. Standard compression algorithms such as gzip or compress cannot compress DNA sequences, but only expand them in size. On the other hand, CTW (Context Tree Weighting Method) can compress DNA sequences to less than two bits per symbol. These algorithms do not use special structures of biological sequences. Two characteristic structures of DNA sequences are known: palindromes (reverse complements) and approximate repeats. Several DNA-specific algorithms that use these structures can compress sequences to less than two bits per symbol. In this paper, we improve CTW so that it can exploit these characteristic structures. Before encoding the next symbol, the algorithm searches for an approximate repeat and a palindrome using hashing and dynamic programming. If a palindrome or approximate repeat of sufficient length is found, our algorithm represents it by its length and distance. With this preprocessing, the new program achieves a slightly higher compression ratio than existing DNA-oriented compression algorithms. We also describe a new compression algorithm for protein sequences.
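The "two bits per symbol" figure is the cost of the trivial encoding in which each of the four bases gets a fixed 2-bit code; it is the baseline that DNA compressors are measured against. A minimal sketch of that baseline follows. It is not the CTW-based method of the paper, and the `pack` and `unpack` names are hypothetical.

```python
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}
BASES = "ACGT"


def pack(seq):
    """Pack a DNA string (A/C/G/T only) into bytes, four bases per byte."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | CODE[base]
        # Left-align a final partial chunk so unpacking stays uniform.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def unpack(data, length):
    """Recover the first `length` bases from packed bytes."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BASES[(byte >> shift) & 3])
    return "".join(bases[:length])


seq = "ACGTTGCAAC"
packed = pack(seq)
print(len(seq), "bases ->", len(packed), "bytes")
print(unpack(packed, len(seq)) == seq)
```

The original length must be stored alongside the packed bytes, since the last byte may carry padding. A compressor only beats this baseline by exploiting structure such as the repeats and reverse-complement palindromes mentioned above.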
Detection of DNA Methylation by Whole-Genome Bisulfite Sequencing.

PubMed

Li, Qing; Hermanson, Peter J; Springer, Nathan M

2018-01-01

DNA methylation plays an important role in the regulation of the expression of transposons and genes. Various methods have been developed to assay DNA methylation levels. Bisulfite sequencing is considered to be the "gold standard" for single-base resolution measurement of DNA methylation levels. Coupled with next-generation sequencing, whole-genome bisulfite sequencing (WGBS) allows DNA methylation to be evaluated at a genome-wide scale. Here, we describe a protocol for WGBS in plant species with large genomes. This protocol has been successfully applied to assay genome-wide DNA methylation levels in maize and barley. It has also been successfully coupled with sequence capture technology to assay DNA methylation levels in a targeted set of genomic regions.
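The principle behind bisulfite sequencing is that unmethylated cytosines are converted and read as T, while methylated cytosines remain C, so the methylation level at a site can be estimated as the fraction of reads that still show C. The sketch below illustrates only that principle on the forward strand; it is not the protocol described here, and the function names and numbers are hypothetical.

```python
def bisulfite_convert(seq, methylated_positions):
    """Simulate bisulfite conversion of the forward strand.

    Unmethylated C is read as T; C at a methylated position stays C.
    Positions are 0-based indices into `seq`.
    """
    methylated = set(methylated_positions)
    return "".join(
        "T" if base == "C" and i not in methylated else base
        for i, base in enumerate(seq.upper())
    )


def methylation_level(c_reads, t_reads):
    """Fraction of reads supporting methylation at one cytosine.

    Returns None when the site has no coverage.
    """
    total = c_reads + t_reads
    if total == 0:
        return None
    return c_reads / total


# Toy example: only the C at index 1 is methylated.
print(bisulfite_convert("ACGCCT", {1}))
# A site covered by 8 C reads and 2 T reads.
print(methylation_level(8, 2))
```

A real analysis also has to handle the reverse strand (G to A changes), incomplete conversion, and sequence context, all of which are omitted from this sketch.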
Single-Molecule Electrical Random Resequencing of DNA and RNA

NASA Astrophysics Data System (ADS)

Ohshiro, Takahito; Matsubara, Kazuki; Tsutsui, Makusu; Furuhashi, Masayuki; Taniguchi, Masateru; Kawai, Tomoji

2012-07-01

Two paradigm shifts in DNA sequencing technologies--from bulk to single molecules and from optical to electrical detection--are expected to realize label-free, low-cost DNA sequencing that does not require PCR amplification. It will lead to development of high-throughput third-generation sequencing technologies for personalized medicine. Although nanopore devices have been proposed as third-generation DNA-sequencing devices, a significant milestone in these technologies has been attained by demonstrating a novel technique for resequencing DNA using electrical signals. Here we report single-molecule electrical resequencing of DNA and RNA using a hybrid method of identifying single-base molecules via tunneling currents and random sequencing. Our method reads sequences of nine types of DNA oligomers. The complete sequence of 5'-UGAGGUA-3' from the let-7 microRNA family was also identified by creating a composite of overlapping fragment sequences, which was randomly determined using tunneling current conducted by single-base molecules as they passed between a pair of nanoelectrodes.
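Building a composite from overlapping fragment reads can be sketched with a simple greedy merge. This is a swapped-in textbook technique, not the authors' reconstruction procedure; the fragment set below is invented, and the names `overlap` and `greedy_assemble` are hypothetical.

```python
def overlap(a, b, min_len=1):
    """Length of the longest suffix of `a` that equals a prefix of `b`."""
    for k in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(fragments, min_len=2):
    """Repeatedly merge the pair of fragments with the largest overlap."""
    frags = list(dict.fromkeys(fragments))  # drop exact duplicates
    # Drop fragments fully contained in another fragment.
    frags = [f for f in frags if not any(f != g and f in g for g in frags)]
    while len(frags) > 1:
        best = (0, None, None)
        for i, a in enumerate(frags):
            for j, b in enumerate(frags):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best[0]:
                        best = (k, i, j)
        k, i, j = best
        if k == 0:
            break  # no remaining overlaps; return the pieces as they are
        merged = frags[i] + frags[j][k:]
        frags = [f for idx, f in enumerate(frags) if idx not in (i, j)]
        frags.append(merged)
    return frags


# Invented short reads covering the let-7 fragment named in the abstract.
print(greedy_assemble(["GGUA", "UGAG", "AGGU", "GAGG"]))
```

Greedy merging can be misled by repeats and read errors, so it should be taken as an illustration of the overlap idea rather than a robust assembler.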
Characterization and isolation of a T-DNA tagged banana promoter active during in vitro culture and low temperature stress.

PubMed

Santos, Efrén; Remy, Serge; Thiry, Els; Windelinckx, Saskia; Swennen, Rony; Sági, László

2009-06-24

Next-generation transgenic plants will require a more precise regulation of transgene expression, preferably under the control of native promoters. A genome-wide T-DNA tagging strategy was therefore performed for the identification and characterization of novel banana promoters. Embryogenic cell suspensions of a plantain-type banana were transformed with a promoterless, codon-optimized luciferase (luc+) gene and low temperature-responsive luciferase activation was monitored in real time. Around 16,000 transgenic cell colonies were screened for baseline luciferase activity at room temperature 2 months after transformation. After discarding positive colonies, cultures were re-screened in real time at 26 °C followed by a gradual decrease to 8 °C. The baseline activation frequency was 0.98%, while the frequency of low temperature-responsive luciferase activity was 0.61% in the same population of cell cultures. Transgenic colonies with luciferase activity responsive to low temperature were regenerated to plantlets and luciferase expression patterns monitored during different regeneration stages. Twenty-four banana DNA sequences flanking the right T-DNA borders in seven independent lines were cloned via PCR walking. RT-PCR analysis in one line containing five inserts allowed the identification of the sequence that had activated luciferase expression under low temperature stress in a developmentally regulated manner. This activating sequence was fused to the uidA reporter gene and back-transformed into a commercial dessert banana cultivar, in which its original expression pattern was confirmed. This promoter tagging and real-time screening platform proved valuable for the identification of novel promoters and genes in banana and for monitoring expression patterns throughout in vitro development and low temperature treatment. A combination of PCR walking techniques was efficient for the isolation of candidate promoters even in a multicopy T-DNA line. Qualitative and quantitative GUS expression analyses of one tagged promoter in a commercial cultivar demonstrated a reproducible promoter activity pattern during in vitro culture. Thus, this promoter could be used during in vitro selection and generation of commercial transgenic plants.
RNomics in Drosophila melanogaster: identification of 66 candidates for novel non-messenger RNAs

PubMed Central

Yuan, Guozhong; Klämbt, Christian; Bachellerie, Jean-Pierre; Brosius, Jürgen; Hüttenhofer, Alexander

2003-01-01

By generating a specialised cDNA library from four different developmental stages of Drosophila melanogaster, we have identified 66 candidates for small non-messenger RNAs (snmRNAs) and have confirmed their expression by northern blot analysis. Thirteen of them were expressed only at certain stages of D. melanogaster development. Thirty-five species belong to the class of small nucleolar RNAs (snoRNAs), divided into 15 members from the C/D subclass and 20 members from the H/ACA subclass, which mostly guide 2′-O-methylation and pseudouridylation, respectively, of rRNA and snRNAs. These also include two outstanding C/D snoRNAs, U3 and U14, both functioning as pre-rRNA chaperones. Surprisingly, the sequence of the Drosophila U14 snoRNA reflects a major change of function of this snoRNA in Diptera relative to yeast and vertebrates. Among the 22 snmRNAs lacking known sequence and structure motifs, five were located in intergenic regions, two in introns, five in untranslated regions of mRNAs, eight were derived from open reading frames, and two were transcribed opposite to an intron. Interestingly, detection of two RNA species from this group implies that certain snmRNA species are processed from alternatively spliced pre-mRNAs. Surprisingly, a few snmRNA sequences could not be found on the published D. melanogaster genome, which might suggest that more snmRNA genes (as well as mRNAs) are hidden in unsequenced regions of the genome. PMID:12736298
The Fecal Viral Flora of Wild Rodents

PubMed Central

Phan, Tung G.; Kapusinszky, Beatrix; Wang, Chunlin; Rose, Robert K.; Lipton, Howard L.; Delwart, Eric L.

2011-01-01

The frequent interactions of rodents with humans make them a common source of zoonotic infections. To obtain an initial unbiased measure of the viral diversity in the enteric tract of wild rodents we sequenced partially purified, randomly amplified viral RNA and DNA in the feces of 105 wild rodents (mouse, vole, and rat) collected in California and Virginia. We identified, in decreasing frequency, sequences related to the mammalian virus families Circoviridae, Picobirnaviridae, Picornaviridae, Astroviridae, Parvoviridae, Papillomaviridae, Adenoviridae, and Coronaviridae. Seventeen small circular DNA genomes containing one or two replicase genes distantly related to the Circoviridae, representing several potentially new viral families, were characterized. In the Picornaviridae family two new candidate genera as well as a close genetic relative of the human pathogen Aichi virus were characterized. Fragments of the first mouse sapelovirus and picobirnaviruses were identified and the first murine astrovirus genome was characterized. A mouse papillomavirus genome and fragments of a novel adenovirus and adenovirus-associated virus were also sequenced. The next largest fraction of the rodent fecal virome was related to insect viruses of the Densoviridae, Iridoviridae, Polydnaviridae, Dicistroviridae, Bromoviridae, and Virgaviridae families, followed by plant virus-related sequences in the Nanoviridae, Geminiviridae, Phycodnaviridae, Secoviridae, Partitiviridae, Tymoviridae, Alphaflexiviridae, and Tombusviridae families, reflecting the largely insect and plant rodent diet. Phylogenetic analyses of full and partial viral genomes therefore revealed many previously unreported viral species, genera, and families. The close genetic similarities noted between some rodent and human viruses might reflect past zoonoses. This study increases our understanding of the viral diversity in wild rodents and highlights the large number of still uncharacterized viruses in mammals. PMID:21909269
Generation and analysis of expression sequence tags from haustoria of the wheat stripe rust fungus Puccinia striiformis f. sp. Tritici

PubMed Central

2009-01-01

Background Stripe rust, caused by Puccinia striiformis f. sp. tritici (Pst), is one of the most destructive diseases of wheat (Triticum aestivum L.) worldwide. In spite of its agricultural importance, the genomics and genetics of the pathogen are poorly characterized. Pst transcripts from urediniospores and germinated urediniospores have been examined previously, but little is known about genes expressed during host infection. Some genes involved in virulence in other rust fungi have been found to be specifically expressed in haustoria. Therefore, the objective of this study was to generate a cDNA library to characterize genes expressed in haustoria of Pst. Results A total of 5,126 EST sequences of high quality were generated from haustoria of Pst, from which 287 contigs and 847 singletons were derived. Approximately 10% and 26% of the 1,134 unique sequences were homologous to proteins with known functions and hypothetical proteins, respectively. The remaining 64% of the unique sequences had no significant similarities in GenBank. Fifteen genes were predicted to be proteins secreted from Pst haustoria. Analysis of ten genes, including six secreted protein genes, using quantitative RT-PCR revealed changes in transcript levels in different developmental and infection stages of the pathogen. Conclusions The haustorial cDNA library was useful in identifying genes of the stripe rust fungus expressed during the infection process. From the library, we identified 15 genes encoding putative secreted proteins and six genes induced during the infection process. These genes are candidates for further studies to determine their functions in wheat-Pst interactions. PMID:20028560
DNA/RNA hybrid substrates modulate the catalytic activity of purified AID.

PubMed

Abdouni, Hala S; King, Justin J; Ghorbani, Atefeh; Fifield, Heather; Berghuis, Lesley; Larijani, Mani

2018-01-01

Activation-induced cytidine deaminase (AID) converts cytidine to uridine at Immunoglobulin (Ig) loci, initiating somatic hypermutation and class switching of antibodies. In vitro, AID acts on single stranded DNA (ssDNA), but neither double-stranded DNA (dsDNA) oligonucleotides nor RNA, and it is believed that transcription is the in vivo generator of ssDNA targeted by AID. It is also known that the Ig loci, particularly the switch (S) regions targeted by AID are rich in transcription-generated DNA/RNA hybrids. Here, we examined the binding and catalytic behavior of purified AID on DNA/RNA hybrid substrates bearing either random sequences or GC-rich sequences simulating Ig S regions. If substrates were made up of a random sequence, AID preferred substrates composed entirely of DNA over DNA/RNA hybrids. In contrast, if substrates were composed of S region sequences, AID preferred to mutate DNA/RNA hybrids over substrates composed entirely of DNA. Accordingly, AID exhibited a significantly higher affinity for binding DNA/RNA hybrid substrates composed specifically of S region sequences, than any other substrates composed of DNA. Thus, in the absence of any other cellular processes or factors, AID itself favors binding and mutating DNA/RNA hybrids composed of S region sequences. AID:DNA/RNA complex formation and supporting mutational analyses suggest that recognition of DNA/RNA hybrids is an inherent structural property of AID. Copyright © 2017 Elsevier Ltd. All rights reserved.
Characterization of the repetitive DNA elements in the genome of fish lymphocystis disease viruses.

PubMed

Schnitzler, P; Darai, G

1989-09-01

The complete DNA nucleotide sequence of the repetitive DNA elements in the genome of fish lymphocystis disease virus (FLDV) isolated from two different species (flounder and dab) was determined. The size of these repetitive DNA elements was found to be 1413 bp which corresponds to the DNA sequences of the 5' terminus of the EcoRI DNA fragment B (0.034 to 0.052 m.u.) and to the EcoRI DNA fragment M (0.718 to 0.736 m.u.) of the FLDV genome causing lymphocystis disease in flounder and plaice. The degree of DNA nucleotide homology between both regions was found to be 99%. The repetitive DNA element in the genome of FLDV isolated from other fish species (dab) was identified and is located within the EcoRI DNA fragment B and J of the viral genome. The DNA nucleotide sequence of one duplicate of this repetition (EcoRI DNA fragment J) was determined (1410 bp) and compared to the DNA nucleotide sequences of the repetitive DNA elements of the genome of FLDV isolated from flounder. It was found that the repetitive DNA elements of the genome of FLDV derived from two different fish species are highly conserved and possess a degree of DNA sequence homology of 94%. The DNA sequences of each strand of the individual repetitive element possess one open reading frame.
Long-range correlations and charge transport properties of DNA sequences

NASA Astrophysics Data System (ADS)

Liu, Xiao-liang; Ren, Yi; Xie, Qiong-tao; Deng, Chao-sheng; Xu, Hui

2010-04-01

By using Hurst's analysis and transfer approach, the rescaled range functions and Hurst exponents of human chromosome 22 and enterobacteria phage lambda DNA sequences are investigated and the transmission coefficients, Landauer resistances and Lyapunov coefficients of finite segments based on above genomic DNA sequences are calculated. In a comparison with quasiperiodic and random artificial DNA sequences, we find that λ-DNA exhibits anticorrelation behavior characterized by a Hurst exponent 0.5
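The rescaled-range (R/S) analysis named in this abstract can be illustrated with a minimal Python sketch. The purine/pyrimidine encoding, the window sizes, and all function names below are assumptions made for illustration; the abstract does not say how the authors mapped bases to numbers or which windows they used. An estimate near 0.5 suggests an uncorrelated sequence, and lower values suggest anticorrelation.

```python
import math
import random

def dna_to_steps(seq):
    """Map purines (A, G) to +1 and pyrimidines (C, T) to -1 (an assumed encoding)."""
    return [1.0 if base in "AG" else -1.0 for base in seq.upper() if base in "ACGT"]

def rescaled_range(values):
    """R/S for one window: range of cumulative mean-deviations over the standard deviation."""
    n = len(values)
    mean = sum(values) / n
    cumulative, running = [], 0.0
    for v in values:
        running += v - mean
        cumulative.append(running)
    spread = max(cumulative) - min(cumulative)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return spread / std if std > 0 else None  # None for a constant window

def hurst_exponent(values, window_sizes=(16, 32, 64, 128, 256)):
    """Estimate H as the least-squares slope of log(mean R/S) against log(window size)."""
    xs, ys = [], []
    for size in window_sizes:
        ratios = []
        # Non-overlapping windows of the given size.
        for start in range(0, len(values) - size + 1, size):
            rs = rescaled_range(values[start:start + size])
            if rs is not None:
                ratios.append(rs)
        if ratios:
            xs.append(math.log(size))
            ys.append(math.log(sum(ratios) / len(ratios)))
    if len(xs) < 2:
        raise ValueError("need at least two usable window sizes")
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator

if __name__ == "__main__":
    rng = random.Random(0)
    random_seq = "".join(rng.choice("ACGT") for _ in range(4096))
    print("random sequence:", hurst_exponent(dna_to_steps(random_seq)))
    print("alternating sequence:", hurst_exponent(dna_to_steps("AC" * 2048)))
```

On short windows the R/S estimate for a random sequence is known to sit somewhat above 0.5, so small deviations from 0.5 should not be over-interpreted.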
Whole genomes redefine the mutational landscape of pancreatic cancer

PubMed Central

Waddell, Nicola; Pajic, Marina; Patch, Ann-Marie; Chang, David K.; Kassahn, Karin S.; Bailey, Peter; Johns, Amber L.; Miller, David; Nones, Katia; Quek, Kelly; Quinn, Michael C. J.; Robertson, Alan J.; Fadlullah, Muhammad Z. H.; Bruxner, Tim J. C.; Christ, Angelika N.; Harliwong, Ivon; Idrisoglu, Senel; Manning, Suzanne; Nourse, Craig; Nourbakhsh, Ehsan; Wani, Shivangi; Wilson, Peter J; Markham, Emma; Cloonan, Nicole; Anderson, Matthew J.; Fink, J. Lynn; Holmes, Oliver; Kazakoff, Stephen H.; Leonard, Conrad; Newell, Felicity; Poudel, Barsha; Song, Sarah; Taylor, Darrin; Waddell, Nick; Wood, Scott; Xu, Qinying; Wu, Jianmin; Pinese, Mark; Cowley, Mark J.; Lee, Hong C.; Jones, Marc D.; Nagrial, Adnan M.; Humphris, Jeremy; Chantrill, Lorraine A.; Chin, Venessa; Steinmann, Angela M.; Mawson, Amanda; Humphrey, Emily S.; Colvin, Emily K.; Chou, Angela; Scarlett, Christopher J.; Pinho, Andreia V.; Giry-Laterriere, Marc; Rooman, Ilse; Samra, Jaswinder S.; Kench, James G.; Pettitt, Jessica A.; Merrett, Neil D.; Toon, Christopher; Epari, Krishna; Nguyen, Nam Q.; Barbour, Andrew; Zeps, Nikolajs; Jamieson, Nigel B.; Graham, Janet S.; Niclou, Simone P.; Bjerkvig, Rolf; Grützmann, Robert; Aust, Daniela; Hruban, Ralph H.; Maitra, Anirban; Iacobuzio-Donahue, Christine A.; Wolfgang, Christopher L.; Morgan, Richard A.; Lawlor, Rita T.; Corbo, Vincenzo; Bassi, Claudio; Falconi, Massimo; Zamboni, Giuseppe; Tortora, Giampaolo; Tempero, Margaret A.; Gill, Anthony J.; Eshleman, James R.; Pilarsky, Christian; Scarpa, Aldo; Musgrove, Elizabeth A.; Pearson, John V.; Biankin, Andrew V.; Grimmond, Sean M.

2015-01-01

Pancreatic cancer remains one of the most lethal of malignancies and a major health burden. We performed whole-genome sequencing and copy number variation (CNV) analysis of 100 pancreatic ductal adenocarcinomas (PDACs). Chromosomal rearrangements leading to gene disruption were prevalent, affecting genes known to be important in pancreatic cancer (TP53, SMAD4, CDKN2A, ARID1A and ROBO2) and new candidate drivers of pancreatic carcinogenesis (KDM6A and PREX2). Patterns of structural variation (variation in chromosomal structure) classified PDACs into 4 subtypes with potential clinical utility: the subtypes were termed stable, locally rearranged, scattered and unstable. A significant proportion harboured focal amplifications, many of which contained druggable oncogenes (ERBB2, MET, FGFR1, CDK6, PIK3R3 and PIK3CA), but at low individual patient prevalence. Genomic instability co-segregated with inactivation of DNA maintenance genes (BRCA1, BRCA2 or PALB2) and a mutational signature of DNA damage repair deficiency. Of 8 patients who received platinum therapy, 4 of 5 individuals with these measures of defective DNA maintenance responded. PMID:25719666

Whole genomes redefine the mutational landscape of pancreatic cancer.

PubMed

Waddell, Nicola; Pajic, Marina; Patch, Ann-Marie; Chang, David K; Kassahn, Karin S; Bailey, Peter; Johns, Amber L; Miller, David; Nones, Katia; Quek, Kelly; Quinn, Michael C J; Robertson, Alan J; Fadlullah, Muhammad Z H; Bruxner, Tim J C; Christ, Angelika N; Harliwong, Ivon; Idrisoglu, Senel; Manning, Suzanne; Nourse, Craig; Nourbakhsh, Ehsan; Wani, Shivangi; Wilson, Peter J; Markham, Emma; Cloonan, Nicole; Anderson, Matthew J; Fink, J Lynn; Holmes, Oliver; Kazakoff, Stephen H; Leonard, Conrad; Newell, Felicity; Poudel, Barsha; Song, Sarah; Taylor, Darrin; Waddell, Nick; Wood, Scott; Xu, Qinying; Wu, Jianmin; Pinese, Mark; Cowley, Mark J; Lee, Hong C; Jones, Marc D; Nagrial, Adnan M; Humphris, Jeremy; Chantrill, Lorraine A; Chin, Venessa; Steinmann, Angela M; Mawson, Amanda; Humphrey, Emily S; Colvin, Emily K; Chou, Angela; Scarlett, Christopher J; Pinho, Andreia V; Giry-Laterriere, Marc; Rooman, Ilse; Samra, Jaswinder S; Kench, James G; Pettitt, Jessica A; Merrett, Neil D; Toon, Christopher; Epari, Krishna; Nguyen, Nam Q; Barbour, Andrew; Zeps, Nikolajs; Jamieson, Nigel B; Graham, Janet S; Niclou, Simone P; Bjerkvig, Rolf; Grützmann, Robert; Aust, Daniela; Hruban, Ralph H; Maitra, Anirban; Iacobuzio-Donahue, Christine A; Wolfgang, Christopher L; Morgan, Richard A; Lawlor, Rita T; Corbo, Vincenzo; Bassi, Claudio; Falconi, Massimo; Zamboni, Giuseppe; Tortora, Giampaolo; Tempero, Margaret A; Gill, Anthony J; Eshleman, James R; Pilarsky, Christian; Scarpa, Aldo; Musgrove, Elizabeth A; Pearson, John V; Biankin, Andrew V; Grimmond, Sean M

2015-02-26

Pancreatic cancer remains one of the most lethal of malignancies and a major health burden. We performed whole-genome sequencing and copy number variation (CNV) analysis of 100 pancreatic ductal adenocarcinomas (PDACs). Chromosomal rearrangements leading to gene disruption were prevalent, affecting genes known to be important in pancreatic cancer (TP53, SMAD4, CDKN2A, ARID1A and ROBO2) and new candidate drivers of pancreatic carcinogenesis (KDM6A and PREX2). Patterns of structural variation (variation in chromosomal structure) classified PDACs into 4 subtypes with potential clinical utility: the subtypes were termed stable, locally rearranged, scattered and unstable. A significant proportion harboured focal amplifications, many of which contained druggable oncogenes (ERBB2, MET, FGFR1, CDK6, PIK3R3 and PIK3CA), but at low individual patient prevalence. Genomic instability co-segregated with inactivation of DNA maintenance genes (BRCA1, BRCA2 or PALB2) and a mutational signature of DNA damage repair deficiency. Of 8 patients who received platinum therapy, 4 of 5 individuals with these measures of defective DNA maintenance responded.
A High Quality Draft Consensus Sequence of the Genome of a Heterozygous Grapevine Variety

PubMed Central

Cartwright, Dustin A.; Cestaro, Alessandro; Pruss, Dmitry; Pindo, Massimo; FitzGerald, Lisa M.; Vezzulli, Silvia; Reid, Julia; Malacarne, Giulia; Iliev, Diana; Coppola, Giuseppina; Wardell, Bryan; Micheletti, Diego; Macalma, Teresita; Facci, Marco; Mitchell, Jeff T.; Perazzolli, Michele; Eldredge, Glenn; Gatto, Pamela; Oyzerski, Rozan; Moretto, Marco; Gutin, Natalia; Stefanini, Marco; Chen, Yang; Segala, Cinzia; Davenport, Christine; Demattè, Lorenzo; Mraz, Amy; Battilana, Juri; Stormo, Keith; Costa, Fabrizio; Tao, Quanzhou; Si-Ammour, Azeddine; Harkins, Tim; Lackey, Angie; Perbost, Clotilde; Taillon, Bruce; Stella, Alessandra; Solovyev, Victor; Fawcett, Jeffrey A.; Sterck, Lieven; Vandepoele, Klaas; Grando, Stella M.; Toppo, Stefano; Moser, Claudio; Lanchbury, Jerry; Bogden, Robert; Skolnick, Mark; Sgaramella, Vittorio; Bhatnagar, Satish K.; Fontana, Paolo; Gutin, Alexander; Van de Peer, Yves; Salamini, Francesco; Viola, Roberto

2007-01-01

Background Worldwide, grapes and their derived products have a large market. The cultivated grape species Vitis vinifera has potential to become a model for fruit tree genetics. Like many plant species, it is highly heterozygous, which is an additional challenge to modern whole genome shotgun sequencing. In this paper a high quality draft genome sequence of a cultivated clone of V. vinifera Pinot Noir is presented. Principal Findings We estimate the genome size of V. vinifera to be 504.6 Mb. Genomic sequences corresponding to 477.1 Mb were assembled in 2,093 metacontigs and 435.1 Mb were anchored to the 19 linkage groups (LGs). The number of predicted genes is 29,585, of which 96.1% were assigned to LGs. This assembly of the grape genome provides candidate genes implicated in traits relevant to grapevine cultivation, such as those influencing wine quality, via secondary metabolites, and those connected with the extreme susceptibility of grape to pathogens. Single nucleotide polymorphism (SNP) distribution was consistent with a diffuse haplotype structure across the genome. Of around 2,000,000 SNPs, 1,751,176 were mapped to chromosomes and one or more of them were identified in 86.7% of anchored genes. The relative age of grape duplicated genes was estimated and this made it possible to reveal a relatively recent Vitis-specific large scale duplication event concerning at least 10 chromosomes (duplication not reported before). Conclusions Sanger shotgun sequencing and highly efficient sequencing by synthesis (SBS), together with dedicated assembly programs, resolved a complex heterozygous genome. A consensus sequence of the genome and a set of mapped marker loci were generated. Homologous chromosomes of Pinot Noir differ by 11.2% of their DNA (hemizygous DNA plus chromosomal gaps). SNP markers are offered as a tool with the potential of introducing a new era in the molecular breeding of grape. PMID:18094749
[Whole Genome Sequencing of Human mtDNA Based on Ion Torrent PGM™ Platform].

PubMed

Cao, Y; Zou, K N; Huang, J P; Ma, K; Ping, Y

2017-08-01

To analyze the whole genome sequence of human mitochondrial DNA (mtDNA) on the Ion Torrent PGM™ platform and to study differences in mtDNA sequence among tissues. Samples of chest blood, hair, costal cartilage, nail, skeletal muscle and oral epithelium were collected from 6 unrelated individuals at forensic postmortem examination. The whole mtDNA genome was amplified with 4 primer pairs. Libraries were constructed with the Ion Shear™ Plus Reagents kit and the Ion Plus Fragment Library kit. Whole genome sequencing of mtDNA was performed on the Ion Torrent PGM™ platform. Sanger sequencing was used to determine the heteroplasmy positions and the mutation positions in the HVⅠ region. The whole mtDNA genome was amplified successfully from all samples. The six unrelated individuals belonged to 6 different haplotypes. Different tissues from one individual showed differences in heteroplasmy. The heteroplasmy positions and the mutation positions in the HVⅠ region were verified by Sanger sequencing. A consistency check by the Kappa method showed that the mtDNA sequencing results were highly consistent across tissues. The method used in the present study for whole genome sequencing of human mtDNA can detect heteroplasmy differences among tissues, and its results show good consistency. The results provide guidance for further applications of mtDNA in forensic science. Copyright © by the Editorial Department of Journal of Forensic Medicine
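The "Kappa method" consistency check mentioned in this abstract is not specified further. A minimal Python sketch follows, assuming it refers to Cohen's kappa computed on per-position calls from two tissues of the same individual. The function name and the toy calls are hypothetical and are not data from the study.

```python
from collections import Counter

def cohen_kappa(calls_a, calls_b):
    """Cohen's kappa for two equally long lists of categorical calls."""
    if len(calls_a) != len(calls_b) or not calls_a:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(calls_a)
    # Observed agreement: fraction of positions with identical calls.
    observed = sum(a == b for a, b in zip(calls_a, calls_b)) / n
    # Chance agreement from the marginal frequencies of each category.
    freq_a, freq_b = Counter(calls_a), Counter(calls_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both lists contain one identical category only
    return (observed - expected) / (1.0 - expected)

if __name__ == "__main__":
    # Hypothetical calls at six mtDNA positions in two tissues.
    blood = ["ref", "ref", "var", "var", "ref", "var"]
    hair = ["ref", "ref", "var", "ref", "ref", "var"]
    print(cohen_kappa(blood, hair))
```

Kappa corrects raw agreement for the agreement expected by chance, which matters when most positions carry the reference base in both tissues.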
Germline pathogenic variants in PALB2 and other cancer-predisposing genes in families with hereditary diffuse gastric cancer without CDH1 mutation: a whole-exome sequencing study.

PubMed

Fewings, Eleanor; Larionov, Alexey; Redman, James; Goldgraben, Mae A; Scarth, James; Richardson, Susan; Brewer, Carole; Davidson, Rosemarie; Ellis, Ian; Evans, D Gareth; Halliday, Dorothy; Izatt, Louise; Marks, Peter; McConnell, Vivienne; Verbist, Louis; Mayes, Rebecca; Clark, Graeme R; Hadfield, James; Chin, Suet-Feung; Teixeira, Manuel R; Giger, Olivier T; Hardwick, Richard; di Pietro, Massimiliano; O'Donovan, Maria; Pharoah, Paul; Caldas, Carlos; Fitzgerald, Rebecca C; Tischkowitz, Marc

2018-04-26

Germline pathogenic variants in the E-cadherin gene (CDH1) are strongly associated with the development of hereditary diffuse gastric cancer. There is a paucity of data to guide risk assessment and management of families with hereditary diffuse gastric cancer that do not carry a CDH1 pathogenic variant, making it difficult to make informed decisions about surveillance and risk-reducing surgery. We aimed to identify new candidate genes associated with predisposition to hereditary diffuse gastric cancer in affected families without pathogenic CDH1 variants. We did whole-exome sequencing on DNA extracted from the blood of 39 individuals (28 individuals diagnosed with hereditary diffuse gastric cancer and 11 unaffected first-degree relatives) in 22 families without pathogenic CDH1 variants. Genes with loss-of-function variants were prioritised using gene-interaction analysis to identify clusters of genes that could be involved in predisposition to hereditary diffuse gastric cancer. Protein-affecting germline variants were identified in probands from six families with hereditary diffuse gastric cancer; variants were found in genes known to predispose to cancer and in lesser-studied DNA repair genes. A frameshift deletion in PALB2 was found in one member of a family with a history of gastric and breast cancer. Two different MSH2 variants were identified in two unrelated affected individuals, including one frameshift insertion and one previously described start-codon loss. One family had a unique combination of variants in the DNA repair genes ATR and NBN. Two variants in the DNA repair gene RECQL5 were identified in two unrelated families: one missense variant and a splice-acceptor variant. The results of this study suggest a role for the known cancer predisposition gene PALB2 in families with hereditary diffuse gastric cancer and no detected pathogenic CDH1 variants. We also identified new candidate genes associated with disease risk in these families. 
Funding: UK Medical Research Council (Sackler programme), European Research Council under the European Union's Seventh Framework Programme (2007-13), National Institute for Health Research Cambridge Biomedical Research Centre, Experimental Cancer Medicine Centres, and Cancer Research UK. Copyright © 2018 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY 4.0 license.
Survey of endosymbionts in the Diaphorina citri metagenome and assembly of a Wolbachia wDi draft genome.

PubMed

Saha, Surya; Hunter, Wayne B; Reese, Justin; Morgan, J Kent; Marutani-Hert, Mizuri; Huang, Hong; Lindeberg, Magdalen

2012-01-01

Diaphorina citri (Hemiptera: Psyllidae), the Asian citrus psyllid, is the insect vector of Ca. Liberibacter asiaticus, the causal agent of citrus greening disease. Sequencing of the D. citri metagenome has been initiated to gain better understanding of the biology of this organism and the potential roles of its bacterial endosymbionts. To corroborate candidate endosymbionts previously identified by rDNA amplification, raw reads from the D. citri metagenome sequence were mapped to reference genome sequences. Results of the read mapping provided the most support for Wolbachia and an enteric bacterium most similar to Salmonella. Wolbachia-derived reads were extracted using the complete genome sequences for four Wolbachia strains. Reads were assembled into a draft genome sequence, and the annotation assessed for the presence of features potentially involved in host interaction. Genome alignment with the complete sequences reveals membership of Wolbachia wDi in supergroup B, further supported by phylogenetic analysis of FtsZ. FtsZ and Wsp phylogenies additionally indicate that the Wolbachia strain in the Florida D. citri isolate falls into a sub-clade of supergroup B, distinct from Wolbachia present in Chinese D. citri isolates, supporting the hypothesis that the D. citri introduced into Florida did not originate from China.
Survey of Endosymbionts in the Diaphorina citri Metagenome and Assembly of a Wolbachia wDi Draft Genome

PubMed Central

Saha, Surya; Hunter, Wayne B.; Reese, Justin; Morgan, J. Kent; Marutani-Hert, Mizuri; Huang, Hong; Lindeberg, Magdalen

2012-01-01

Diaphorina citri (Hemiptera: Psyllidae), the Asian citrus psyllid, is the insect vector of Ca. Liberibacter asiaticus, the causal agent of citrus greening disease. Sequencing of the D. citri metagenome has been initiated to gain better understanding of the biology of this organism and the potential roles of its bacterial endosymbionts. To corroborate candidate endosymbionts previously identified by rDNA amplification, raw reads from the D. citri metagenome sequence were mapped to reference genome sequences. Results of the read mapping provided the most support for Wolbachia and an enteric bacterium most similar to Salmonella. Wolbachia-derived reads were extracted using the complete genome sequences for four Wolbachia strains. Reads were assembled into a draft genome sequence, and the annotation assessed for the presence of features potentially involved in host interaction. Genome alignment with the complete sequences reveals membership of Wolbachia wDi in supergroup B, further supported by phylogenetic analysis of FtsZ. FtsZ and Wsp phylogenies additionally indicate that the Wolbachia strain in the Florida D. citri isolate falls into a sub-clade of supergroup B, distinct from Wolbachia present in Chinese D. citri isolates, supporting the hypothesis that the D. citri introduced into Florida did not originate from China. PMID:23166822
Sequence periodicity in nucleosomal DNA and intrinsic curvature

PubMed Central

2010-01-01

Background Most eukaryotic DNA contained in the nucleus is packaged by wrapping DNA around histone octamers. Histones are ubiquitous and bind most regions of chromosomal DNA. In order to achieve smooth wrapping of the DNA around the histone octamer, the DNA duplex should be able to deform and should possess intrinsic curvature. The deformability of DNA is a result of the non-parallelness of base pair stacks. The stacking interaction between base pairs is sequence dependent. The higher the stacking energy the more rigid the DNA helix, thus it is natural to expect that sequences that are involved in wrapping around the histone octamer should be unstacked and possess intrinsic curvature. Intrinsic curvature has been shown to be dictated by the periodic recurrence of certain dinucleotides. Several genome-wide studies directed towards mapping of nucleosome positions have revealed periodicity associated with certain stretches of sequences. In the current study, these sequences have been analyzed with a view to understanding their sequence-dependent structures. Results Higher order DNA structures and the distribution of molecular bend loci associated with 146 base nucleosome core DNA sequence from C. elegans and chicken have been analyzed using the theoretical model for DNA curvature. The curvature dispersion calculated by cyclically permuting the sequences revealed that the molecular bend loci were delocalized throughout the nucleosome core region and had varying degrees of intrinsic curvature. Conclusions The higher order structures associated with nucleosomes of C. elegans and chicken calculated from the sequences revealed heterogeneity with respect to the deviation of the DNA axis. The results point to the possibility of context dependent curvature of varying degrees to be associated with nucleosomal DNA. PMID:20487515
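The "periodic recurrence of certain dinucleotides" described in this abstract can be made concrete with a small Python sketch that tallies the spacing between occurrences of chosen dinucleotides. This is only a counting illustration and not the curvature model used in the study. The choice of AA/TT, the lag limit, and the synthetic test sequence are assumptions.

```python
from collections import Counter

def dinucleotide_positions(seq, dinucleotides=("AA", "TT")):
    """Return start positions at which any of the given dinucleotides occurs."""
    seq = seq.upper()
    return [i for i in range(len(seq) - 1) if seq[i:i + 2] in dinucleotides]

def spacing_histogram(positions, max_lag=40):
    """Count distances (in bases) between all pairs of positions up to max_lag."""
    counts = Counter()
    for i, p in enumerate(positions):
        for q in positions[i + 1:]:
            lag = q - p
            if lag > max_lag:
                break  # positions are sorted, so later ones are farther away
            counts[lag] += 1
    return counts

if __name__ == "__main__":
    # Synthetic 140-base sequence with AA placed every 10 bases (illustrative only).
    synthetic = "AAGCGCGCGC" * 14
    histogram = spacing_histogram(dinucleotide_positions(synthetic))
    for lag, count in sorted(histogram.items()):
        print(lag, count)
```

Peaks at multiples of a fixed lag in such a histogram indicate a periodic signal; real nucleosomal sequences would give a much noisier profile than this synthetic one.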
A survey of the sequence-specific interaction of damaging agents with DNA: emphasis on antitumor agents.

PubMed

Murray, V

1999-01-01

This article reviews the literature concerning the sequence specificity of DNA-damaging agents. DNA-damaging agents are widely used in cancer chemotherapy. It is important to understand fully the determinants of DNA sequence specificity so that more effective DNA-damaging agents can be developed as antitumor drugs. There are five main methods of DNA sequence specificity analysis: cleavage of end-labeled fragments, linear amplification with Taq DNA polymerase, ligation-mediated polymerase chain reaction (PCR), single-strand ligation PCR, and footprinting. The DNA sequence specificity in purified DNA and in intact mammalian cells is reviewed for several classes of DNA-damaging agent. These include agents that form covalent adducts with DNA, free radical generators, topoisomerase inhibitors, intercalators and minor groove binders, enzymes, and electromagnetic radiation. The main sites of adduct formation are at the N-7 of guanine in the major groove of DNA and the N-3 of adenine in the minor groove, whereas free radical generators abstract hydrogen from the deoxyribose sugar and topoisomerase inhibitors cause enzyme-DNA cross-links to form. Several issues involved in the determination of the DNA sequence specificity are discussed. The future directions of the field, with respect to cancer chemotherapy, are also examined.
Deciphering the genomic targets of alkylating polyamide conjugates using high-throughput sequencing

PubMed Central

Chandran, Anandhakumar; Syed, Junetha; Taylor, Rhys D.; Kashiwazaki, Gengo; Sato, Shinsuke; Hashiya, Kaori; Bando, Toshikazu; Sugiyama, Hiroshi

2016-01-01

Chemically engineered small molecules targeting specific genomic sequences play an important role in drug development research. Pyrrole-imidazole polyamides (PIPs) are a group of molecules that can bind to the DNA minor-groove and can be engineered to target specific sequences. Their biological effects rely primarily on their selective DNA binding. However, the binding mechanism of PIPs at the chromatinized genome level is poorly understood. Herein, we report a method using high-throughput sequencing to identify the DNA-alkylating sites of PIP-indole-seco-CBI conjugates. High-throughput sequencing analysis of conjugate 2 showed highly similar DNA-alkylating sites on synthetic oligos (histone-free DNA) and on human genomes (chromatinized DNA context). To our knowledge, this is the first report identifying alkylation sites across genomic DNA by alkylating PIP conjugates using high-throughput sequencing. PMID:27098039
Manipulation of oligonucleotides immobilized on solid supports - DNA computations on surfaces

NASA Astrophysics Data System (ADS)

Liu, Qinghua

The manipulation of DNA oligonucleotides immobilized on various solid supports has been studied intensively, especially in the area of surface hybridization. Recently, surface-based biotechnology has been applied to the area of molecular computing. These surface-based methods have advantages with regard to ease of handling, facile purification, and less interference when compared to solution methodologies. This dissertation describes the investigation of molecular approaches to DNA computing. The feasibility of encoding a bit (0 or 1) of information for DNA-based computations at the single nucleotide level was studied, particularly with regard to the efficiency and specificity of hybridization discrimination. Both gold and glass surfaces, with addressed arrays of 32 oligonucleotides, were employed with similar hybridization results. Although single-base discrimination may be achieved in the system, it is at the cost of a severe decrease in the efficiency of hybridization to perfectly matched sequences. This compromises the utility of single nucleotide encoding for DNA computing applications in the absence of some additional mechanism for increasing specificity. Several methods are suggested including a multiple-base encoding strategy. The multiple-base encoding strategy was employed to develop a prototype DNA computer. The approach was demonstrated by solving a small example of the Satisfiability (SAT) problem, an NP-complete problem in Boolean logic. 16 distinct DNA oligonucleotides, encoding all candidate solutions to the 4-variable-4-clause-3-SAT problem, were immobilized on a gold surface in the non-addressed format. Four cycles of MARK (hybridization), DESTROY (enzymatic destruction) and UNMARK (denaturation) were performed, which identified and eliminated members of the set which were not solutions to the problem. 
Determination of the answer was accomplished in the READOUT (sequence identification) operation by PCR amplification of the remaining molecules and hybridization to an addressed array. Four answers were determined and the S/N ratio between correct and incorrect solutions ranged from 10 to 777, making discrimination between correct and incorrect solutions to the problem straightforward. Additionally, studies of enzymatic manipulations of DNA molecules on surfaces suggested the use of E. coli Exonuclease I (Exo I) and perhaps EarI in the DESTROY operation.
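The cycle of MARK, DESTROY and UNMARK operations described in this abstract can be mimicked in software as successive filtering of a pool of candidate assignments. The Python sketch below is an analogy only. The clauses are a hypothetical 4-variable, 4-clause formula chosen for illustration, the variable names are invented, and treating MARK as "tag the strands that satisfy the current clause" is an assumed reading of the abstract rather than a description of the laboratory protocol.

```python
from itertools import product

VARIABLES = ("w", "x", "y", "z")

# Hypothetical formula: each clause is a list of (variable, required value) literals.
CLAUSES = [
    [("w", True), ("x", True), ("y", True)],
    [("w", True), ("y", False), ("z", True)],
    [("x", False), ("y", True)],
    [("w", False), ("y", False)],
]

def satisfies(assignment, clause):
    """A clause is satisfied when at least one of its literals holds."""
    return any(assignment[var] == value for var, value in clause)

def surface_computation(clauses):
    """Run one MARK/DESTROY/UNMARK cycle per clause over all candidate assignments."""
    # One "strand" per candidate: 2**4 = 16 assignments in total.
    pool = [dict(zip(VARIABLES, bits))
            for bits in product((False, True), repeat=len(VARIABLES))]
    for clause in clauses:
        marked = [strand for strand in pool if satisfies(strand, clause)]  # MARK
        pool = marked  # DESTROY the unmarked strands; UNMARK is implicit here
    return pool

def readout(pool):
    """READOUT: report surviving assignments as bit strings in w, x, y, z order."""
    return sorted("".join("1" if strand[v] else "0" for v in VARIABLES)
                  for strand in pool)

if __name__ == "__main__":
    print(readout(surface_computation(CLAUSES)))
```

Each cycle removes every candidate that violates one clause, so after as many cycles as there are clauses only full solutions remain in the pool.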
A Case Study into Microbial Genome Assembly Gap Sequences and Finishing Strategies.

PubMed

Utturkar, Sagar M; Klingeman, Dawn M; Hurt, Richard A; Brown, Steven D

2017-01-01

This study characterized regions of DNA which remained unassembled by either PacBio or Illumina sequencing technologies for seven bacterial genomes. Two genomes were manually finished using bioinformatics and PCR/Sanger sequencing approaches and regions not assembled by automated software were analyzed. Gaps present within Illumina assemblies mostly correspond to repetitive DNA regions such as multiple rRNA operon sequences. PacBio gap sequences were evaluated for several properties such as GC content, read coverage, gap length, ability to form strong secondary structures, and corresponding annotations. Our hypothesis that strong secondary DNA structures blocked DNA polymerases and contributed to gap sequences was not accepted. PacBio assemblies had few limitations overall and gaps were explained as the cumulative effect of lower than average sequence coverage and repetitive sequences at contig termini. An important aspect of the present study is the compilation of biological features that interfered with assembly and included active transposons, multiple plasmid sequences, phage DNA integration, and large sequence duplication. Our targeted genome finishing approach and systematic evaluation of the unassembled DNA will be useful for others looking to close, finish, and polish microbial genome sequences.
Genetic diversity of the merozoite surface protein-3 gene in Plasmodium falciparum populations in Thailand.

PubMed

Pattaradilokrat, Sittiporn; Sawaswong, Vorthon; Simpalipan, Phumin; Kaewthamasorn, Morakot; Siripoon, Napaporn; Harnyuttanakorn, Pongchai

2016-10-21

An effective malaria vaccine is an urgently needed tool to fight against human malaria, the most deadly parasitic disease of humans. One promising candidate is the merozoite surface protein-3 (MSP-3) of Plasmodium falciparum. This antigenic protein, encoded by the merozoite surface protein (msp-3) gene, is polymorphic and classified according to size into the two allelic types of K1 and 3D7. A recent study revealed that both the K1 and 3D7 alleles co-circulated within P. falciparum populations in Thailand, but the extent of the sequence diversity and variation within each allelic type remains largely unknown. The msp-3 gene was sequenced from 59 P. falciparum samples collected from five endemic areas (Mae Hong Son, Kanchanaburi, Ranong, Trat and Ubon Ratchathani) in Thailand and analysed for nucleotide sequence diversity, haplotype diversity and deduced amino acid sequence diversity. The gene was also subject to population genetic analysis (Fst) and neutrality tests (Tajima's D, Fu and Li's D* and Fu and Li's F* tests) to determine any signature of selection. The sequence analyses revealed eight unique DNA haplotypes and seven amino acid sequence variants, with a haplotype and nucleotide diversity of 0.828 and 0.049, respectively. Neutrality tests indicated that the polymorphism detected in the alanine heptad repeat region of MSP-3 was maintained by positive diversifying selection, suggesting its role as a potential target of protective immune responses and supporting its role as a vaccine candidate. Comparison of MSP-3 variants among parasite populations in Thailand, India and Nigeria also inferred a close genetic relationship between P. falciparum populations in Asia. This study revealed the extent of the msp-3 gene diversity in P. falciparum in Thailand, providing the fundamental basis for the better design of future blood stage malaria vaccines against P. falciparum.
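The haplotype diversity and nucleotide diversity reported in this abstract are standard summary statistics. A minimal Python sketch of the usual estimators follows, assuming aligned sequences of equal length with no gaps. The function names and the four toy sequences are hypothetical and unrelated to the msp-3 data, and the abstract does not state which software or estimator variants the authors used.

```python
from collections import Counter
from itertools import combinations

def haplotype_diversity(sequences):
    """Hd = n/(n-1) * (1 - sum of squared haplotype frequencies)."""
    n = len(sequences)
    if n < 2:
        raise ValueError("need at least two sequences")
    counts = Counter(sequences)
    homozygosity = sum((c / n) ** 2 for c in counts.values())
    return n / (n - 1) * (1.0 - homozygosity)

def nucleotide_diversity(sequences):
    """Average number of pairwise differences per site across all sequence pairs."""
    n = len(sequences)
    if n < 2:
        raise ValueError("need at least two sequences")
    length = len(sequences[0])
    if length == 0 or any(len(s) != length for s in sequences):
        raise ValueError("sequences must be aligned, non-empty and of equal length")
    total_differences = sum(
        sum(a != b for a, b in zip(first, second))
        for first, second in combinations(sequences, 2)
    )
    pair_count = n * (n - 1) / 2
    return total_differences / pair_count / length

if __name__ == "__main__":
    # Toy alignment: three distinct haplotypes among four sequences.
    toy = ["ACGT", "ACGT", "ACGA", "TCGA"]
    print(haplotype_diversity(toy), nucleotide_diversity(toy))
```

Haplotype diversity depends only on how many distinct sequences occur and how often, whereas nucleotide diversity also reflects how different those sequences are from one another.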
Whole genome sequencing of turbot (Scophthalmus maximus; Pleuronectiformes): a fish adapted to demersal life.

PubMed

Figueras, Antonio; Robledo, Diego; Corvelo, André; Hermida, Miguel; Pereiro, Patricia; Rubiolo, Juan A; Gómez-Garrido, Jèssica; Carreté, Laia; Bello, Xabier; Gut, Marta; Gut, Ivo Glynne; Marcet-Houben, Marina; Forn-Cuní, Gabriel; Galán, Beatriz; García, José Luis; Abal-Fabeiro, José Luis; Pardo, Belen G; Taboada, Xoana; Fernández, Carlos; Vlasova, Anna; Hermoso-Pulido, Antonio; Guigó, Roderic; Álvarez-Dios, José Antonio; Gómez-Tato, Antonio; Viñas, Ana; Maside, Xulio; Gabaldón, Toni; Novoa, Beatriz; Bouza, Carmen; Alioto, Tyler; Martínez, Paulino

2016-06-01

The turbot is a flatfish (Pleuronectiformes) with increasing commercial value, which has prompted active genomic research aimed at more efficient selection. Here we present the sequence and annotation of the turbot genome, which represents a milestone for both boosting breeding programmes and ascertaining the origin and diversification of flatfish. We compare the turbot genome with model fish genomes to investigate teleost chromosome evolution. We observe a conserved macrosyntenic pattern within Percomorpha and identify large syntenic blocks within the turbot genome related to the teleost genome duplication. We identify gene family expansions and positive selection of genes associated with vision and metabolism of membrane lipids, which suggests adaptation to demersal lifestyle and to cold temperatures, respectively. Our data indicate a quick evolution and diversification of flatfish to adapt to benthic life and provide clues for understanding their controversial origin. Moreover, we investigate the genomic architecture of growth, sex determination and disease resistance, key traits for understanding local adaptation and boosting turbot production, by mapping candidate genes and previously reported quantitative trait loci. The genomic architecture of these productive traits has allowed the identification of candidate genes and enriched pathways that may represent useful information for future marker-assisted selection in turbot. © The Author 2016. Published by Oxford University Press on behalf of Kazusa DNA Research Institute.
Scalable whole-exome sequencing of cell-free DNA reveals high concordance with metastatic tumors.

PubMed

Adalsteinsson, Viktor A; Ha, Gavin; Freeman, Samuel S; Choudhury, Atish D; Stover, Daniel G; Parsons, Heather A; Gydush, Gregory; Reed, Sarah C; Rotem, Denisse; Rhoades, Justin; Loginov, Denis; Livitz, Dimitri; Rosebrock, Daniel; Leshchiner, Ignaty; Kim, Jaegil; Stewart, Chip; Rosenberg, Mara; Francis, Joshua M; Zhang, Cheng-Zhong; Cohen, Ofir; Oh, Coyin; Ding, Huiming; Polak, Paz; Lloyd, Max; Mahmud, Sairah; Helvie, Karla; Merrill, Margaret S; Santiago, Rebecca A; O'Connor, Edward P; Jeong, Seong H; Leeson, Rachel; Barry, Rachel M; Kramkowski, Joseph F; Zhang, Zhenwei; Polacek, Laura; Lohr, Jens G; Schleicher, Molly; Lipscomb, Emily; Saltzman, Andrea; Oliver, Nelly M; Marini, Lori; Waks, Adrienne G; Harshman, Lauren C; Tolaney, Sara M; Van Allen, Eliezer M; Winer, Eric P; Lin, Nancy U; Nakabayashi, Mari; Taplin, Mary-Ellen; Johannessen, Cory M; Garraway, Levi A; Golub, Todd R; Boehm, Jesse S; Wagle, Nikhil; Getz, Gad; Love, J Christopher; Meyerson, Matthew

2017-11-06

Whole-exome sequencing of cell-free DNA (cfDNA) could enable comprehensive profiling of tumors from blood but the genome-wide concordance between cfDNA and tumor biopsies is uncertain. Here we report ichorCNA, software that quantifies tumor content in cfDNA from 0.1× coverage whole-genome sequencing data without prior knowledge of tumor mutations. We apply ichorCNA to 1439 blood samples from 520 patients with metastatic prostate or breast cancers. In the earliest tested sample for each patient, 34% of patients have ≥10% tumor-derived cfDNA, sufficient for standard coverage whole-exome sequencing. Using whole-exome sequencing, we validate the concordance of clonal somatic mutations (88%), copy number alterations (80%), mutational signatures, and neoantigens between cfDNA and matched tumor biopsies from 41 patients with ≥10% cfDNA tumor content. In summary, we provide methods to identify patients eligible for comprehensive cfDNA profiling, revealing its applicability to many patients, and demonstrate high concordance of cfDNA and metastatic tumor whole-exome sequencing.
Genetics of intellectual disability in consanguineous families.

PubMed

Hu, Hao; Kahrizi, Kimia; Musante, Luciana; Fattahi, Zohreh; Herwig, Ralf; Hosseini, Masoumeh; Oppitz, Cornelia; Abedini, Seyedeh Sedigheh; Suckow, Vanessa; Larti, Farzaneh; Beheshtian, Maryam; Lipkowitz, Bettina; Akhtarkhavari, Tara; Mehvari, Sepideh; Otto, Sabine; Mohseni, Marzieh; Arzhangi, Sanaz; Jamali, Payman; Mojahedi, Faezeh; Taghdiri, Maryam; Papari, Elaheh; Soltani Banavandi, Mohammad Javad; Akbari, Saeide; Tonekaboni, Seyed Hassan; Dehghani, Hossein; Ebrahimpour, Mohammad Reza; Bader, Ingrid; Davarnia, Behzad; Cohen, Monika; Khodaei, Hossein; Albrecht, Beate; Azimi, Sarah; Zirn, Birgit; Bastami, Milad; Wieczorek, Dagmar; Bahrami, Gholamreza; Keleman, Krystyna; Vahid, Leila Nouri; Tzschach, Andreas; Gärtner, Jutta; Gillessen-Kaesbach, Gabriele; Varaghchi, Jamileh Rezazadeh; Timmermann, Bernd; Pourfatemi, Fatemeh; Jankhah, Aria; Chen, Wei; Nikuei, Pooneh; Kalscheuer, Vera M; Oladnabi, Morteza; Wienker, Thomas F; Ropers, Hans-Hilger; Najmabadi, Hossein

2018-01-04

Autosomal recessive (AR) gene defects are the leading genetic cause of intellectual disability (ID) in countries with frequent parental consanguinity, which account for about 1/7th of the world population. Yet, compared to autosomal dominant de novo mutations, which are the predominant cause of ID in Western countries, the identification of AR-ID genes has lagged behind. Here, we report on whole exome and whole genome sequencing in 404 consanguineous predominantly Iranian families with two or more affected offspring. In 219 of these, we found likely causative variants, involving 77 known and 77 novel AR-ID (candidate) genes, 21 X-linked genes, as well as 9 genes previously implicated in diseases other than ID. This study, the largest of its kind published to date, illustrates that high-throughput DNA sequencing in consanguineous families is a superior strategy for elucidating the thousands of hitherto unknown gene defects underlying AR-ID, and it sheds light on their prevalence.
An evolution based biosensor receptor DNA sequence generation algorithm.

PubMed

Kim, Eungyeong; Lee, Malrey; Gatton, Thomas M; Lee, Jaewan; Zang, Yupeng

2010-01-01

A biosensor is composed of a bioreceptor, an associated recognition molecule, and a signal transducer that can selectively detect target substances for analysis. DNA based biosensors utilize receptor molecules that allow hybridization with the target analyte. However, most DNA biosensor research uses oligonucleotides as the target analytes and does not address the potential problems of real samples. The identification of recognition molecules suitable for real target analyte samples is an important step towards further development of DNA biosensors. This study examines the characteristics of DNA used as bioreceptors and proposes a hybrid evolution-based DNA sequence generating algorithm, based on DNA computing, to identify suitable DNA bioreceptor recognition molecules for stable hybridization with real target substances. The Traveling Salesman Problem (TSP) approach is applied in the proposed algorithm to evaluate the safety and fitness of the generated DNA sequences. This approach improves efficiency and stability for enhanced and variable-length DNA sequence generation and allows extension to generation of variable-length DNA sequences with diverse receptor recognition requirements.
RDNAnalyzer: A tool for DNA secondary structure prediction and sequence analysis

PubMed Central

Afzal, Muhammad; Shahid, Ahmad Ali; Shehzadi, Abida; Nadeem, Shahid; Husnain, Tayyab

2012-01-01

RDNAnalyzer is a computer-based tool designed for DNA secondary structure prediction and sequence analysis. It can randomly generate a DNA sequence, or the user can upload sequences of interest in RAW format. It uses and extends the Nussinov dynamic programming algorithm and has various applications in sequence analysis. It predicts the DNA secondary structure and base pairings. It also provides tools for sequence analyses routinely performed by biological scientists, such as DNA replication, reverse complement generation, transcription, translation, sequence-specific information (total number of nucleotide bases, and A, T, G and C base contents with their respective percentages) and a sequence cleaner. RDNAnalyzer was developed in Microsoft Visual Studio 2008 using Microsoft Visual C# and Windows Presentation Foundation and provides a user-friendly environment for sequence analysis. It is freely available. Availability: http://www.cemb.edu.pk/sw.html Abbreviations: RDNAnalyzer, Random DNA Analyser; GUI, graphical user interface; XAML, Extensible Application Markup Language. PMID:23055611
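For readers unfamiliar with the Nussinov algorithm named in this abstract, a minimal sketch follows. It is the textbook base-pair-maximization recurrence applied to DNA with Watson-Crick pairs only; it is not RDNAnalyzer's code (the tool is written in C# and extends the algorithm in ways the abstract does not describe). The function name and the `min_loop` parameter are assumptions made for illustration.

```python
# Watson-Crick pairs allowed in this sketch (an assumption; no G-T wobble pairs).
PAIRS = {("A", "T"), ("T", "A"), ("G", "C"), ("C", "G")}


def nussinov(seq, min_loop=3):
    """Return (max_pairs, dot_bracket) for one strand using the Nussinov recurrence.

    min_loop is the minimum number of unpaired bases enclosed by any pair.
    """
    seq = seq.upper()
    n = len(seq)
    if n == 0:
        return 0, ""
    dp = [[0] * n for _ in range(n)]

    def score(i, j):
        # Empty or out-of-range intervals contribute no pairs.
        return dp[i][j] if 0 <= i <= j < n else 0

    # Fill the table by increasing interval length.
    for span in range(1, n):
        for i in range(n - span):
            j = i + span
            best = score(i, j - 1)  # case 1: base j is unpaired
            for k in range(i, j - min_loop):  # case 2: base j pairs with base k
                if (seq[k], seq[j]) in PAIRS:
                    best = max(best, score(i, k - 1) + 1 + score(k + 1, j - 1))
            dp[i][j] = best

    # Trace back one optimal structure.
    structure = ["."] * n
    stack = [(0, n - 1)]
    while stack:
        i, j = stack.pop()
        if i >= j:
            continue
        if dp[i][j] == score(i, j - 1):
            stack.append((i, j - 1))
            continue
        for k in range(i, j - min_loop):
            if (seq[k], seq[j]) in PAIRS and \
                    dp[i][j] == score(i, k - 1) + 1 + score(k + 1, j - 1):
                structure[k], structure[j] = "(", ")"
                stack.append((i, k - 1))
                stack.append((k + 1, j - 1))
                break
    return dp[0][n - 1], "".join(structure)


if __name__ == "__main__":
    print(nussinov("GGGAAAACCC"))
```

The recurrence maximizes the number of pairs rather than minimizing free energy, so it is best read as a teaching model of hairpin formation, not a thermodynamic predictor.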
Structural and Thermodynamic Signatures of DNA Recognition by Mycobacterium tuberculosis DnaA

DOE Office of Scientific and Technical Information (OSTI.GOV)

Tsodikov, Oleg V.; Biswas, Tapan

An essential protein, DnaA, binds to 9-bp DNA sites within the origin of replication oriC. These binding events are prerequisite to forming an enigmatic nucleoprotein scaffold that initiates replication. The number, sequences, positions, and orientations of these short DNA sites, or DnaA boxes, within the oriCs of different bacteria vary considerably. To investigate features of DnaA boxes that are important for binding Mycobacterium tuberculosis DnaA (MtDnaA), we have determined the crystal structures of the DNA binding domain (DBD) of MtDnaA bound to a cognate MtDnaA-box (at 2.0 Å resolution) and to a consensus Escherichia coli DnaA-box (at 2.3 Å). These structures, complemented by calorimetric equilibrium binding studies of MtDnaA DBD in a series of DnaA-box variants, reveal the main determinants of DNA recognition and establish the [T/C][T/A][G/A]TCCACA sequence as a high-affinity MtDnaA-box. Bioinformatic and calorimetric analyses indicate that DnaA-box sequences in mycobacterial oriCs generally differ from the optimal binding sequence. This sequence variation occurs commonly at the first 2 bp, making an in vivo mycobacterial DnaA-box effectively a 7-mer and not a 9-mer. We demonstrate that the decrease in the affinity of these MtDnaA-box variants for MtDnaA DBD relative to that of the highest-affinity box TTGTCCACA is less than 10-fold. The understanding of DnaA-box recognition by MtDnaA and E. coli DnaA enables one to map DnaA-box sequences in the genomes of M. tuberculosis and other eubacteria.
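The high-affinity consensus reported in this abstract, [T/C][T/A][G/A]TCCACA, translates directly into a regular expression, which gives one simple way to map candidate DnaA boxes in a sequence. The sketch below is an illustration under that assumption only: the function names and the test sequence are invented, and the authors' own bioinformatic procedure is not described in the abstract.

```python
import re

# Consensus from the abstract, written as a regex. The lookahead lets
# overlapping sites be reported as separate hits.
MTDNAA_BOX = re.compile(r"(?=([TC][TA][GA]TCCACA))")
BOX_LENGTH = 9
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase A/C/G/T string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_dnaa_boxes(seq):
    """Return sorted (start, strand, site) hits on both strands.

    start is the 0-based position of the leftmost base of the site on the
    forward strand; site is the matched 9-mer read 5'->3' on its own strand.
    """
    seq = seq.upper()
    n = len(seq)
    hits = [(m.start(), "+", m.group(1)) for m in MTDNAA_BOX.finditer(seq)]
    for m in MTDNAA_BOX.finditer(reverse_complement(seq)):
        # Convert the reverse-strand offset back to forward coordinates.
        hits.append((n - (m.start() + BOX_LENGTH), "-", m.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical sequence containing the highest-affinity box TTGTCCACA.
    print(find_dnaa_boxes("GGTTGTCCACAGG"))
```

Because the abstract notes that native mycobacterial boxes often vary at the first 2 bp, a looser scan could replace the first two character classes with `[ACGT]`, at the cost of more false positives.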
Prospect of Bioflavonoid Fisetin as a Quadruplex DNA Ligand: A Biophysical Approach

PubMed Central

Sengupta, Bidisha; Pahari, Biswapathik; Blackmon, Laura; Sengupta, Pradeep K.

2013-01-01

Quadruplex (G4) forming sequences in telomeric DNA and c-myc promoter regions of human DNA are associated with tumorigenesis. Ligands that facilitate the formation of G4 and increase its stability can prevent tumor cell proliferation and have been regarded as potential anti-cancer drugs. In the present study, steady state and time-resolved fluorescence measurements provide important structural and dynamical insights into the free and bound states of the therapeutically potent plant flavonoid fisetin (3,3′,4′,7-tetrahydroxyflavone) in a G4 DNA matrix. The excited state intra-molecular proton transfer (ESPT) of fisetin plays an important role in observing and understanding the binding of fisetin with the G4 DNA. Differential absorption spectra, thermal melting, and circular dichroism spectroscopic studies provide evidence for the formation of G4 DNA, and size exclusion chromatography (SEC) proves the binding and 1:1 stoichiometry of fisetin in the DNA matrix. Comparative analysis of binding in the presence of EtBr proves that fisetin favors binding at the face of the G-quartet, mostly along the diagonal loop. Time resolved fluorescence anisotropy decay analysis indicates an increase in the restriction of motion from free to bound fisetin. We have also investigated the fingerprints of the binding of fisetin in the antiparallel quadruplex using Raman spectroscopy. Preliminary results indicate fisetin to be a prospective candidate as a G4 ligand. PMID:23785423
Redox/methylation mediated abnormal DNA methylation as regulators of ambient fine particulate matter-induced neurodevelopment related impairment in human neuronal cells

NASA Astrophysics Data System (ADS)

Wei, Hongying; Liang, Fan; Meng, Ge; Nie, Zhiqing; Zhou, Ren; Cheng, Wei; Wu, Xiaomeng; Feng, Yan; Wang, Yan

2016-09-01

Fine particulate matter (PM2.5) has been implicated as a risk factor for neurodevelopmental disorders including autism in children. However, the underlying biological mechanism remains unclear. DNA methylation is suggested to be a fundamental mechanism for the neuronal responses to environmental cues. We prepared whole particle of PM2.5 (PM2.5), water-soluble extracts (Pw), organic extracts (Po) and carbon core component (Pc) and characterized their chemical constituents. We found that PM2.5 induced significant redox imbalance, decreased the levels of intercellular methyl donor S-adenosylmethionine and caused global DNA hypomethylation. Furthermore, PM2.5 exposure triggered gene-specific promoter DNA hypo- or hypermethylation and abnormal mRNA expression of autism candidate genes. PM2.5-induced DNA hypermethylation in promoter regions of synapse-related genes was associated with decreases in their mRNA and protein expression. The inhibiting effects of antioxidative reagents, a methylation-supporting agent and a DNA methyltransferase inhibitor demonstrated the involvement of a redox/methylation mechanism in PM2.5-induced abnormal DNA methylation patterns and synaptic protein expression. The biological effects above generally followed a sequence of PM2.5 ≥ Pwo > Po > Pw > Pc. Our results implicate a novel epigenetic mechanism in the neurodevelopmental toxicity of particulate air pollution and suggest that eliminating the chemical components could mitigate the neurotoxicity of PM2.5.
