Sample records for public DNA databases

  1. The Genographic Project public participation mitochondrial DNA database.

    PubMed

    Behar, Doron M; Rosset, Saharon; Blue-Smith, Jason; Balanovsky, Oleg; Tzur, Shay; Comas, David; Mitchell, R John; Quintana-Murci, Lluis; Tyler-Smith, Chris; Wells, R Spencer

    2007-06-01

    The Genographic Project is studying the genetic signatures of ancient human migrations and creating an open-source research database. It allows members of the public to participate in a real-time anthropological genetics study by submitting personal samples for analysis and donating the genetic results to the database. We report our experience from the first 18 months of public participation in the Genographic Project, during which we have created the largest standardized human mitochondrial DNA (mtDNA) database ever collected, comprising 78,590 genotypes. Here, we detail our genotyping and quality assurance protocols including direct sequencing of the mtDNA HVS-I, genotyping of 22 coding-region SNPs, and a series of computational quality checks based on phylogenetic principles. This database is very informative with respect to mtDNA phylogeny and mutational dynamics, and its size allows us to develop a nearest neighbor-based methodology for mtDNA haplogroup prediction based on HVS-I motifs that is superior to classic rule-based approaches. We make available to the scientific community and general public two new resources: a periodically updated database comprising all data donated by participants, and the nearest neighbor haplogroup prediction tool. PMID:17604454
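
The nearest neighbor idea mentioned in this abstract can be illustrated with a minimal sketch. It assumes each HVS-I motif is represented as a set of variant positions and that distance is the count of positions present in only one of the two motifs. The reference panel, the labels (`hgA`, `hgB`), the distance measure, and the function names are all hypothetical; the abstract does not describe the authors' actual procedure at this level of detail.

```python
from collections import Counter


def motif_distance(a, b):
    """Count variant positions that occur in exactly one of the two motifs."""
    return len(a ^ b)


def predict_haplogroup(query, reference, k=1):
    """Return the majority label among the k reference motifs closest to query."""
    ranked = sorted(reference, key=lambda rec: motif_distance(query, rec[0]))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy reference panel: (set of variant positions, label).
# Positions and labels are invented for illustration only.
REFERENCE = [
    (frozenset({10, 20, 30}), "hgA"),
    (frozenset({10, 20}), "hgA"),
    (frozenset({40, 50}), "hgB"),
    (frozenset({40, 50, 60}), "hgB"),
]

print(predict_haplogroup(frozenset({10, 30}), REFERENCE, k=3))
```

A rule-based classifier would instead require specific diagnostic positions to be present; the neighbor-based version tolerates a missing or extra variant as long as the overall motif stays close to labeled examples.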

  2. The Genographic Project Public Participation Mitochondrial DNA Database

    PubMed Central

    Behar, Doron M; Rosset, Saharon; Blue-Smith, Jason; Balanovsky, Oleg; Tzur, Shay; Comas, David; Mitchell, R. John; Quintana-Murci, Lluis; Tyler-Smith, Chris; Wells, R. Spencer

    2007-01-01

    The Genographic Project is studying the genetic signatures of ancient human migrations and creating an open-source research database. It allows members of the public to participate in a real-time anthropological genetics study by submitting personal samples for analysis and donating the genetic results to the database. We report our experience from the first 18 months of public participation in the Genographic Project, during which we have created the largest standardized human mitochondrial DNA (mtDNA) database ever collected, comprising 78,590 genotypes. Here, we detail our genotyping and quality assurance protocols including direct sequencing of the mtDNA HVS-I, genotyping of 22 coding-region SNPs, and a series of computational quality checks based on phylogenetic principles. This database is very informative with respect to mtDNA phylogeny and mutational dynamics, and its size allows us to develop a nearest neighbor–based methodology for mtDNA haplogroup prediction based on HVS-I motifs that is superior to classic rule-based approaches. We make available to the scientific community and general public two new resources: a periodically updated database comprising all data donated by participants, and the nearest neighbor haplogroup prediction tool. PMID:17604454

  3. The Genographic Project Public Participation Mitochondrial DNA Database

    Microsoft Academic Search

    Doron M Behar; Saharon Rosset; Jason Blue-Smith; Oleg Balanovsky; Shay Tzur; David Comas; R. John Mitchell; Lluis Quintana-Murci; Chris Tyler-Smith; R. Spencer Wells

    2007-01-01

    The Genographic Project is studying the genetic signatures of ancient human migrations and creating an open-source research database. It allows members of the public to participate in a real-time anthropological genetics study by submitting personal samples for analysis and donating the genetic results to the database. We report our experience from the first 18 months of public participation in the

  4. About DNA databasing and investigative genetic analysis of externally visible characteristics: A public survey.

    PubMed

    Zieger, Martin; Utz, Silvia

    2015-07-01

    During the last decade, DNA profiling and the use of DNA databases have become two of the most employed instruments of police investigations. This very rapid establishment of forensic genetics is yet far from being complete. In the last few years novel types of analyses have been presented to describe phenotypically a possible perpetrator. We conducted the present study among German speaking Swiss residents for two main reasons: firstly, we aimed at getting an impression of the public awareness and acceptance of the Swiss DNA database and the perception of a hypothetical DNA database containing all Swiss residents. Secondly, we wanted to get a broader picture of how people that are not working in the field of forensic genetics think about legal permission to establish phenotypic descriptions of alleged criminals by genetic means. Even though a significant number of study participants did not even know about the existence of the Swiss DNA database, its acceptance appears to be very high. Generally our results suggest that the current forensic use of DNA profiling is considered highly trustworthy. However, the acceptance of a hypothetical universal database would be only as low as about 30% among the 284 respondents to our study, mostly because people are concerned about the security of their genetic data, their privacy or a possible risk of abuse of such a database. Concerning the genetic analysis of externally visible characteristics and biogeographical ancestry, we discover a high degree of acceptance. The acceptance decreases slightly when precise characteristics are presented to the participants in detail. About half of the respondents would be in favor of the moderate use of physical traits analyses only for serious crimes threatening life, health or sexual integrity. The possible risk of discrimination and reinforcement of racism, as discussed by scholars from anthropology, bioethics, law, philosophy and sociology, is mentioned less frequently by the study participants than we would have expected. A national DNA database and the widespread use of DNA analyses for police and justice have an impact on the entire society. Therefore the concerns of lay persons from the respective population should be heard and considered. The aims of this study were to draw a broader picture of the public opinion on DNA databasing and to contribute to the debate about the possible future use of genetics to reveal phenotypic characteristics. Our data might provide an additional perspective for experts involved in regulatory or legislative processes. PMID:26004189

  5. On the reliability of DNA sequences of Ophiocordyceps sinensis in public databases.

    PubMed

    Zhang, Shu; Zhang, Yong-Jie; Liu, Xing-Zhong; Zhang, Hong; Liu, Dian-Sheng

    2013-04-01

    Some DNA sequences in the International Nucleotide Sequence Databases (INSD) are erroneously annotated, which has led to misleading conclusions in publications. Ophiocordyceps sinensis (syn. Cordyceps sinensis) is a fungus endemic to the Tibetan Plateau, and more than 100 populations covering almost its entire distribution area have been examined by us over recent years. In this study, using the data from authentic materials, we have evaluated the reliability of nucleotide sequences annotated as O. sinensis in the INSD. As of October 15, 2012, the INSD contained 874 records annotated as O. sinensis, including 555 records representing nuclear ribosomal DNA (63.5 %), 197 representing protein-coding genes (22.5 %), 92 representing random markers with unknown functions (10.5 %), and 30 representing microsatellite loci (3.5 %). Our analysis indicated that 39 of the 397 internal transcribed spacer entries, 27 of the 105 small subunit entries, and five of the 53 large subunit entries were incorrectly annotated as belonging to O. sinensis. For protein-coding sequences, all records of serine protease genes, the mating-type gene MAT1-2-1, the DNA lyase gene, the two largest subunits of RNA polymerase II, and elongation factor-1α gene were correct, while 14 of the 73 β-tubulin entries were indeterminate. Genetic diversity analyses using those sequences correctly identified as O. sinensis revealed significant genetic differentiation in the fungus although the extent of genetic differentiation varied with the gene. The relationship between O. sinensis and some other related fungal taxa is also discussed. PMID:23397071

  6. Spanish public awareness regarding DNA profile databases in forensic genetics: what type of DNA profiles should be included?

    PubMed Central

    Gamero, Joaquín J; Romero, Jose-Luis; Peralta, Juan-Luis; Carvalho, Mónica; Corte-Real, Francisco

    2007-01-01

    The importance of non-codifying DNA polymorphism for the administration of justice is now well known. In Spain, however, this type of test has given rise to questions in recent years: (a) Should consent be obtained before biological samples are taken from an individual for DNA analysis? (b) Does society perceive these techniques and methods of analysis as being reliable? (c) There appears to be lack of knowledge concerning the basic norms that regulate databases containing private or personal information and the protection that information of this type must be given. This opinion survey and the subsequent analysis of the results in ethical terms may serve to reveal the criteria and the degree of information that society has with regard to DNA databases. In the study, 73.20% (SE 1.12%) of the population surveyed was in favour of specific legislation for computer files in which DNA analysis results for forensic purposes are stored. PMID:17906059

  7. The National DNA Database

    Microsoft Academic Search

    David J Werrett

    1997-01-01

    Over the last two years the Forensic Science Service (FSS) has developed and put into operation a National DNA Database that has analysed samples from individuals suspected of crime and stains from scenes of crime. It has provided more than 2200 links between individuals and scenes and 1200 links between scenes of crime. It uses an STR SGM (second generation

  8. NCCDPHP PUBLICATION DATABASE

    EPA Science Inventory

    This database provides bibliographic citations and abstracts of publications produced by the CDC's National Center for Chronic Disease Prevention and Health Promotion (NCCDPHP) including journal articles, monographs, book chapters, reports, policy documents, and fact sheets. Full...

  9. Navigating Public Microarray Databases

    PubMed Central

    Bähler, Jürg

    2004-01-01

    With the ever-escalating amount of data being produced by genome-wide microarray studies, it is of increasing importance that these data are captured in public databases so that researchers can use this information to complement and enhance their own studies. Many groups have set up databases of expression data, ranging from large repositories, which are designed to comprehensively capture all published data, through to more specialized databases. The public repositories, such as ArrayExpress at the European Bioinformatics Institute contain complete datasets in raw format in addition to processed data, whilst the specialist databases tend to provide downstream analysis of normalized data from more focused studies and data sources. Here we provide a guide to the use of these public microarray resources. PMID:18629145

  10. Enhancing the DNA Patent Database

    SciTech Connect

    Walters, LeRoy B.

    2008-02-18

    Final Report on Award No. DE-FG0201ER63171. Principal Investigator: LeRoy B. Walters. February 18, 2008. This project successfully completed its goal of surveying and reporting on the DNA patenting and licensing policies at 30 major U.S. academic institutions. The report of survey results was published in the January 2006 issue of Nature Biotechnology under the title “The Licensing of DNA Patents by US Academic Institutions: An Empirical Survey.” Lori Pressman was the lead author on this feature article. A PDF reprint of the article will be submitted to our Program Officer under separate cover. The project team has continued to update the DNA Patent Database on a weekly basis since the conclusion of the project. The database can be accessed at dnapatents.georgetown.edu. This database provides a valuable research tool for academic researchers, policymakers, and citizens. A report entitled Reaping the Benefits of Genomic and Proteomic Research: Intellectual Property Rights, Innovation, and Public Health was published in 2006 by the Committee on Intellectual Property Rights in Genomic and Protein Research and Innovation, Board on Science, Technology, and Economic Policy at the National Academies. The report was edited by Stephen A. Merrill and Anne-Marie Mazza. This report employed and then adapted the methodology developed by our research project and quoted our findings at several points. (The full report can be viewed online at the following URL: http://www.nap.edu/openbook.php?record_id=11487&page=R1). My colleagues and I are grateful for the research support of the ELSI program at the U.S. Department of Energy.

  11. Is the American Public Ready to Embrace DNA as a Crime-Fighting Tool? A Survey Assessing Support for DNA Databases

    Microsoft Academic Search

    Lauren Dundes

    2001-01-01

    States began passing legislation mandating the collection of genetic material from certain convicted offenders in 1988. By 1998, all 50 states had passed laws allowing DNA databases for convicted sexual offenders, and some states collected DNA from all those convicted of a felony. A survey of 416 persons in Maryland revealed wide support for the inclusion of convicted violent offenders

  12. Public database aids drug researchers

    Cancer.gov

    Researchers at the Broad Institute of Harvard and MIT have released ChemBank 2.0, a major upgrade to ChemBank, a publicly available database poised to enhance scientists' capabilities in drug discovery.

  13. Citation analysis of database publications

    Microsoft Academic Search

    Erhard Rahm; Andreas Thor

    2005-01-01

    We analyze citation frequencies for two main database conferences (SIGMOD, VLDB) and three database journals (TODS, VLDB Journal, SIGMOD Record) over 10 years. The citation data is obtained by integrating and cleaning data from DBLP and Google Scholar. Our analysis considers different comparative metrics per publication venue, in particular the total and average number of citations as

  14. AIDS PUBLIC INFORMATION DATABASE

    EPA Science Inventory

    The AIDS Public Information Data Set is computer software designed to run on a Microsoft Windows microcomputer, and contains information abstracted from acquired immunodeficiency syndrome (AIDS) cases reported in the United States. The data set is created by the Division of HIV/A...

  15. Short Tandem Repeat DNA Internet Database

    National Institute of Standards and Technology Data Gateway

    SRD 130 Short Tandem Repeat DNA Internet Database (Web, free access). Short Tandem Repeat DNA Internet Database is intended to benefit research and application of short tandem repeat DNA markers for human identity testing. Facts and sequence information on each STR system, population data, commonly used multiplex STR systems, PCR primers and conditions, and a review of various technologies for analysis of STR alleles have been included.

  16. The Protein-DNA Interface database

    PubMed Central

    2010-01-01

    The Protein-DNA Interface database (PDIdb) is a repository containing relevant structural information of Protein-DNA complexes solved by X-ray crystallography and available at the Protein Data Bank. The database includes a simple functional classification of the protein-DNA complexes that consists of three hierarchical levels: Class, Type and Subtype. This classification has been defined and manually curated by humans based on the information gathered from several sources that include PDB, PubMed, CATH, SCOP and COPS. The current version of the database contains only structures with resolution of 2.5 Å or higher, accounting for a total of 922 entries. The major aim of this database is to contribute to the understanding of the main rules that underlie the molecular recognition process between DNA and proteins. To this end, the database is focused on each specific atomic interface rather than on the separated binding partners. Therefore, each entry in this database consists of a single and independent protein-DNA interface. We hope that PDIdb will be useful to many researchers working in fields such as the prediction of transcription factor binding sites in DNA, the study of specificity determinants that mediate enzyme recognition events, engineering and design of new DNA binding proteins with distinct binding specificity and affinity, among others. Finally, due to its friendly and easy-to-use web interface, we hope that PDIdb will also serve educational and teaching purposes. PMID:20482798

  17. Forensic DNA Profiling and Database

    PubMed Central

    Panneerchelvam, S.; Norazmi, M.N.

    2003-01-01

    The incredible power of DNA technology as an identification tool has brought a tremendous change in criminal justice. The DNA database is an information resource for the forensic DNA typing community with details on commonly used short tandem repeat (STR) DNA markers. This article discusses the essential steps in compilation of the Combined DNA Index System (CODIS) on validated polymerase chain amplified STRs and their use in crime detection. PMID:23386793

  18. Plant rDNA database: update and new features

    PubMed Central

    Garcia, Sònia; Gálvez, Francisco; Gras, Airy; Kovařík, Aleš; Garnatje, Teresa

    2014-01-01

    The Plant rDNA database (www.plantrdnadatabase.com) is an open access online resource providing detailed information on numbers, structures and positions of 5S and 18S-5.8S-26S (35S) ribosomal DNA loci. The data have been obtained from >600 publications on plant molecular cytogenetics, mostly based on fluorescent in situ hybridization (FISH). This edition of the database contains information on 1609 species derived from 2839 records, which means an expansion of 55.76 and 94.45%, respectively. It holds the data for angiosperms, gymnosperms, bryophytes and pteridophytes available as of June 2013. Information from publications reporting data for a single rDNA (either 5S or 35S alone) and annotation regarding transcriptional activity of 35S loci now appears in the database. Preliminary analyses suggest greater variability in the number of rDNA loci in gymnosperms than in angiosperms. New applications provide ideograms of the species showing the positions of rDNA loci as well as a visual representation of their genome sizes. We have also introduced other features to boost the usability of the Web interface, such as an application for convenient data export and a new section with rDNA–FISH-related information (mostly detailing protocols and reagents). In addition, we upgraded and/or proofread tabs and links and modified the website for a more dynamic appearance. This manuscript provides a synopsis of these changes and developments. Database URL: http://www.plantrdnadatabase.com PMID:24980131

  19. Ethical-legal problems of DNA databases in criminal investigation

    PubMed Central

    Guillen, M.; Lareu, M. V.; Pestoni, C.; Salas, A.; Carracedo, A.

    2000-01-01

    Advances in DNA technology and the discovery of DNA polymorphisms have permitted the creation of DNA databases of individuals for the purpose of criminal investigation. Many ethical and legal problems arise in the preparation of a DNA database, and these problems are especially important when one analyses the legal regulations on the subject. In this paper three main groups of possibilities, three systems, are analysed in relation to databases. The first system is based on a general analysis of the population; the second one is based on the taking of samples for a particular list of crimes, and a third is based only on the specific analysis of each case. The advantages and disadvantages of each system are compared and controversial issues are then examined. We found the second system to be the best choice for Spain and other European countries with a similar tradition when we weighed the rights of an individual against the public's interest in the prosecution of a crime. Key Words: DNA databases • forensic genetics • ethics PMID:10951922

  20. Publications of Australian LIS Academics in Databases

    ERIC Educational Resources Information Center

    Wilson, Concepcion S.; Boell, Sebastian K.; Kennan, Mary Anne; Willard, Patricia

    2011-01-01

    This paper examines aspects of journal articles published from 1967 to 2008, located in eight databases, and authored or co-authored by academics serving for at least two years in Australian LIS programs from 1959 to 2008. These aspects are: inclusion of publications in databases, publications in journals, authorship characteristics of…

  1. Publication Kind Codes in STN Patent Databases

    E-print Network

    Hoffmann, Armin

    Publication Kind Codes in STN Patent Databases (Dokumentenart-Codes in STN-Patentdatenbanken). Chemical Abstracts (CAPLUS), Derwent World Patents Index (DWPI), and INPADOCDB international patent databases, together with a short

  2. Ethical-legal problems of DNA databases in criminal investigation.

    PubMed

    Guillén, M; Lareu, M V; Pestoni, C; Salas, A; Carracedo, A

    2000-08-01

    Advances in DNA technology and the discovery of DNA polymorphisms have permitted the creation of DNA databases of individuals for the purpose of criminal investigation. Many ethical and legal problems arise in the preparation of a DNA database, and these problems are especially important when one analyses the legal regulations on the subject. In this paper three main groups of possibilities, three systems, are analysed in relation to databases. The first system is based on a general analysis of the population; the second one is based on the taking of samples for a particular list of crimes, and a third is based only on the specific analysis of each case. The advantages and disadvantages of each system are compared and controversial issues are then examined. We found the second system to be the best choice for Spain and other European countries with a similar tradition when we weighed the rights of an individual against the public's interest in the prosecution of a crime. PMID:10951922

  3. Angiosperm DNA C-Values Database

    NSDL National Science Digital Library

    The 1992 Global Convention on Biological Diversity (Rio de Janeiro) specified the need to make biodiversity data available "despite imperfections, rather than holding back information until more polished products are completed." Few organizations have done so. This Royal Botanic Gardens (Kew, UK) genome biodiversity database is one valuable exception. Founded in 1759, the Royal Botanic Gardens, Kew has built its unique collections which now include 6 million dried plant specimens - covering 90% of the world's plant species; 40,000 living plant taxa - estimated as 10% of the world's flora; and 80,000 fungi and artifacts of plant origin. Known best among botanists as a global resource for definitively identifying, classifying, and naming plants and fungi, Kew also maintains this database on DNA C-values. To access this free, searchable database, the user must provide an email address as well as the genus of interest; search results include Taxon, Family, 4C DNA amount (pg), and entry number/reference citation, listed separately for each species.

  4. DNA algorithms of implementing biomolecular databases on a biological computer.

    PubMed

    Chang, Weng-Long; Vasilakos, Athanasios V

    2015-01-01

    In this paper, DNA algorithms are proposed to perform eight operations of relational algebra (calculus), which include Cartesian product, union, set difference, selection, projection, intersection, join, and division, on biomolecular relational databases. PMID:25343766
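
For readers unfamiliar with the eight operations named in this abstract, the sketch below shows their conventional set-based semantics in ordinary Python. It does not reproduce the molecular encoding the paper proposes. Relations are assumed to be sets of tuples with positional columns, and all function names are illustrative.

```python
from itertools import product


def union(r, s):
    return r | s


def difference(r, s):
    return r - s


def intersection(r, s):
    return r & s


def cartesian(r, s):
    """Pair every tuple of r with every tuple of s."""
    return {a + b for a, b in product(r, s)}


def select(r, pred):
    """Keep the tuples that satisfy the predicate."""
    return {t for t in r if pred(t)}


def project(r, cols):
    """Keep only the listed column positions (duplicates collapse)."""
    return {tuple(t[i] for i in cols) for t in r}


def join(r, s, i, j):
    """Equi-join: concatenate tuples where r column i equals s column j."""
    return {a + b for a in r for b in s if a[i] == b[j]}


def divide(r, s):
    """r has columns (x, y), s has column (y,): x values paired with every y in s."""
    return {x for x in project(r, [0]) if all((x[0], y[0]) in r for y in s)}


# Toy relations, invented for illustration.
takes = {("ann", "db"), ("ann", "os"), ("bob", "db")}
required = {("db",), ("os",)}

print(divide(takes, required))  # students who take every required course
```

Division is the least familiar of the eight: it answers "which x is related to all y in s", here the students enrolled in every required course.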

  5. The Availability of Faculty Publication Databases from Library Web Pages

    ERIC Educational Resources Information Center

    Blummer, Barbara A.

    2007-01-01

    Faculty publication databases or author bibliographies offer libraries an opportunity to provide services to users. Initially, these databases remained initiatives of special libraries in the health-sciences fields. Librarians used the publication information derived from these databases to compile lists for annual reports. However, the advent of…

  6. Data publication: towards a database of everything.

    PubMed

    Smith, Vincent S

    2009-01-01

    The fabric of science is changing, driven by a revolution in digital technologies that facilitate the acquisition and communication of massive amounts of data. This is changing the nature of collaboration and expanding opportunities to participate in science. If digital technologies are the engine of this revolution, digital data are its fuel. But for many scientific disciplines, this fuel is in short supply. The publication of primary data is not a universal or mandatory part of science, and despite policies and proclamations to the contrary, calls to make data publicly available have largely gone unheeded. In this short essay I consider why, and explore some of the challenges that lie ahead, as we work toward a database of everything. PMID:19552813

  7. Short Tandem Repeat DNA Internet Database

    NSDL National Science Digital Library

    This website contains comprehensive information relating to forensic DNA analysis. It has material from an introductory to an advanced level on forensic DNA technology. The material provides general information on DNA markers that are of interest to human identification. The site contains both introductory and in-depth discussions of short tandem repeats (STRs) and other DNA markers currently used by the forensic community. Powerpoint and PDF presentations on STR training material are available and can be readily downloaded.

  8. MethHC: a database of DNA methylation and gene expression in human cancer

    PubMed Central

    Huang, Wei-Yun; Hsu, Sheng-Da; Huang, Hsi-Yuan; Sun, Yi-Ming; Chou, Chih-Hung; Weng, Shun-Long; Huang, Hsien-Da

    2015-01-01

    We present MethHC (http://MethHC.mbc.nctu.edu.tw), a database comprising a systematic integration of a large collection of DNA methylation data and mRNA/microRNA expression profiles in human cancer. DNA methylation is an important epigenetic regulator of gene transcription, and genes with high levels of DNA methylation in their promoter regions are transcriptionally silent. Increasing numbers of DNA methylation and mRNA/microRNA expression profiles are being published in different public repositories. These data can help researchers to identify epigenetic patterns that are important for carcinogenesis. MethHC integrates data such as DNA methylation, mRNA expression, DNA methylation of microRNA gene and microRNA expression to identify correlations between DNA methylation and mRNA/microRNA expression from TCGA (The Cancer Genome Atlas), which includes 18 human cancers in more than 6000 samples, 6548 microarrays and 12 567 RNA sequencing data. PMID:25398901

  9. MethHC: a database of DNA methylation and gene expression in human cancer.

    PubMed

    Huang, Wei-Yun; Hsu, Sheng-Da; Huang, Hsi-Yuan; Sun, Yi-Ming; Chou, Chih-Hung; Weng, Shun-Long; Huang, Hsien-Da

    2015-01-01

    We present MethHC (http://MethHC.mbc.nctu.edu.tw), a database comprising a systematic integration of a large collection of DNA methylation data and mRNA/microRNA expression profiles in human cancer. DNA methylation is an important epigenetic regulator of gene transcription, and genes with high levels of DNA methylation in their promoter regions are transcriptionally silent. Increasing numbers of DNA methylation and mRNA/microRNA expression profiles are being published in different public repositories. These data can help researchers to identify epigenetic patterns that are important for carcinogenesis. MethHC integrates data such as DNA methylation, mRNA expression, DNA methylation of microRNA gene and microRNA expression to identify correlations between DNA methylation and mRNA/microRNA expression from TCGA (The Cancer Genome Atlas), which includes 18 human cancers in more than 6000 samples, 6548 microarrays and 12 567 RNA sequencing data. PMID:25398901

  10. Exploration of the Chemical Space of Public Genomic Databases

    EPA Science Inventory

    The current project aims to chemically index the content of public genomic databases to make these data accessible in relation to other publicly available, chemically-indexed toxicological information. ...

  11. EMPOP—the EDNAP mtDNA population database concept for a new generation, high-quality mtDNA database

    Microsoft Academic Search

    Walther Parson; Anita Brandstätter; Martin Pircher; Martin Steinlechner; Richard Scheithauer

    2004-01-01

    The European DNA Profiling Group (EDNAP) MtDNA Population Database (EMPOP) is an international collaborative project between DNA laboratories performing mtDNA analysis and the DNA laboratory of the Institute of Legal Medicine (GMI) in Innsbruck, Austria. The goal is to set up a directly accessible mtDNA population database, which can be used in routine forensic casework for frequency investigations. Most forensic laboratories

  12. DNAVaxDB: the first web-based DNA vaccine database and its data analysis

    PubMed Central

    2014-01-01

    Since the first DNA vaccine studies were done in the 1990s, thousands more studies have followed. Here we report the development and analysis of DNAVaxDB (http://www.violinet.org/dnavaxdb), the first publicly available web-based DNA vaccine database that curates, stores, and analyzes experimentally verified DNA vaccines, DNA vaccine plasmid vectors, and protective antigens used in DNA vaccines. All data in DNAVaxDB are annotated from reliable resources, particularly peer-reviewed articles. Among over 140 DNA vaccine plasmids, some plasmids were more frequently used in one type of pathogen than others; for example, pCMVi-UB for G- bacterial DNA vaccines, and pCAGGS for viral DNA vaccines. Presently, over 400 DNA vaccines containing over 370 protective antigens from over 90 infectious and non-infectious diseases have been curated in DNAVaxDB. While extracellular and bacterial cell surface proteins and adhesin proteins were frequently used for DNA vaccine development, the majority of protective antigens used in Chlamydophila DNA vaccines are localized to the inner portion of the cell. The DNA vaccine priming, other vaccine boosting vaccination regimen has been widely used to induce protection against infection of different pathogens such as HIV. Parasitic and cancer DNA vaccines were also systematically analyzed. User-friendly web query and visualization interfaces are available in DNAVaxDB for interactive data search. To support data exchange, the information of DNA vaccines, plasmids, and protective antigens is stored in the Vaccine Ontology (VO). DNAVaxDB is targeted to become a timely and vital source of DNA vaccines and related data and facilitate advanced DNA vaccine research and development. PMID:25104313

  13. CORE: A Phylogenetically-Curated 16S rDNA Database of the Core Oral Microbiome

    Microsoft Academic Search

    Ann L. Griffen; Clifford J. Beall; Noah D. Firestone; Erin L. Gross; James M. DiFranco; Jori H. Hardman; Bastienne Vriesendorp; Russell A. Faust; Daniel A. Janies; Eugene J. Leys

    2011-01-01

    Comparing bacterial 16S rDNA sequences to GenBank and other large public databases via BLAST often provides results of little use for identification and taxonomic assignment of the organisms of interest. The human microbiome, and in particular the oral microbiome, includes many taxa, and accurate identification of sequence data is essential for studies of these communities. For this purpose, a phylogenetically

  14. Public database aids drug researchers | Office of Cancer Genomics

    Cancer.gov

    Researchers at the Broad Institute of Harvard and MIT have released ChemBank 2.0, a major upgrade to ChemBank, a publicly available database poised to enhance scientists' capabilities in drug discovery.

  15. Choice of population database for forensic DNA profile analysis.

    PubMed

    Steele, Christopher D; Balding, David J

    2014-12-01

    When evaluating the weight of evidence (WoE) for an individual to be a contributor to a DNA sample, an allele frequency database is required. The allele frequencies are needed to inform about genotype probabilities for unknown contributors of DNA to the sample. Typically databases are available from several populations, and a common practice is to evaluate the WoE using each available database for each unknown contributor. Often the most conservative WoE (most favourable to the defence) is the one reported to the court. However the number of human populations that could be considered is essentially unlimited and the number of contributors to a sample can be large, making it impractical to perform every possible WoE calculation, particularly for complex crime scene profiles. We propose instead the use of only the database that best matches the ancestry of the queried contributor, together with a substantial FST adjustment. To investigate the degree of conservativeness of this approach, we performed extensive simulations of one- and two-contributor crime scene profiles, in the latter case with, and without, the profile of the second contributor available for the analysis. The genotypes were simulated using five population databases, which were also available for the analysis, and evaluations of WoE using our heuristic rule were compared with several alternative calculations using different databases. Using FST=0.03, we found that our heuristic gave WoE more favourable to the defence than alternative calculations in well over 99% of the comparisons we considered; on average the difference in WoE was just under 0.2 bans (orders of magnitude) per locus. The degree of conservativeness of the heuristic rule can be adjusted through the FST value. We propose the use of this heuristic for DNA profile WoE calculations, due to its ease of implementation, and efficient use of the evidence while allowing a flexible degree of conservativeness. PMID:25498938
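
    The FST adjustment described in this abstract can be illustrated with a minimal sketch. It assumes the simplest setting only: a single-contributor profile, an unrelated alternative donor, and the widely used Balding-Nichols conditional genotype probabilities. It is not the authors' simulation code, and the allele frequencies and function names are hypothetical. WoE is expressed in bans, i.e. log10 of the likelihood ratio.

```python
import math


def match_probability(p_a, p_b=None, fst=0.03):
    """Balding-Nichols probability that an unrelated person has the queried
    genotype, given that the queried contributor has it.

    p_a, p_b: allele frequencies from the chosen population database
              (hypothetical values in this sketch). Omit p_b for a homozygote.
    fst:      coancestry adjustment; fst=0 gives Hardy-Weinberg proportions.
    """
    denom = (1 + fst) * (1 + 2 * fst)
    if p_b is None:
        # Homozygote AA
        return (2 * fst + (1 - fst) * p_a) * (3 * fst + (1 - fst) * p_a) / denom
    # Heterozygote AB
    return 2 * (fst + (1 - fst) * p_a) * (fst + (1 - fst) * p_b) / denom


def woe_bans(match_prob):
    """Single-locus weight of evidence in bans (log10 of the likelihood ratio)."""
    return math.log10(1.0 / match_prob)


if __name__ == "__main__":
    for fst in (0.0, 0.03):
        mp = match_probability(0.1, 0.2, fst=fst)
        print(f"FST={fst}: match probability={mp:.4f}, WoE={woe_bans(mp):.2f} bans")
```

    In this sketch, raising FST increases the match probability for rare alleles and therefore lowers the WoE, which is the sense in which a larger FST is more favourable to the defence.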

  16. Choice of population database for forensic DNA profile analysis

    PubMed Central

    Steele, Christopher D.; Balding, David J.

    2014-01-01

    When evaluating the weight of evidence (WoE) for an individual to be a contributor to a DNA sample, an allele frequency database is required. The allele frequencies are needed to inform about genotype probabilities for unknown contributors of DNA to the sample. Typically databases are available from several populations, and a common practice is to evaluate the WoE using each available database for each unknown contributor. Often the most conservative WoE (most favourable to the defence) is the one reported to the court. However the number of human populations that could be considered is essentially unlimited and the number of contributors to a sample can be large, making it impractical to perform every possible WoE calculation, particularly for complex crime scene profiles. We propose instead the use of only the database that best matches the ancestry of the queried contributor, together with a substantial FST adjustment. To investigate the degree of conservativeness of this approach, we performed extensive simulations of one- and two-contributor crime scene profiles, in the latter case with, and without, the profile of the second contributor available for the analysis. The genotypes were simulated using five population databases, which were also available for the analysis, and evaluations of WoE using our heuristic rule were compared with several alternative calculations using different databases. Using FST = 0.03, we found that our heuristic gave WoE more favourable to the defence than alternative calculations in well over 99% of the comparisons we considered; on average the difference in WoE was just under 0.2 bans (orders of magnitude) per locus. The degree of conservativeness of the heuristic rule can be adjusted through the FST value. We propose the use of this heuristic for DNA profile WoE calculations, due to its ease of implementation, and efficient use of the evidence while allowing a flexible degree of conservativeness. PMID:25498938

  17. [Privacy and public benefit in using large scale health databases].

    PubMed

    Yamamoto, Ryuichi

    2014-01-01

    In Japan, large-scale health databases have been constructed within a few years, such as the National Claim insurance and health checkup database (NDB) and the Japanese Sentinel project. However, there are legal issues in striking an adequate balance between privacy and public benefit when using such databases. The NDB is operated under the act for elderly persons' health care, but this act says nothing about using the database for general public benefit. Researchers who use this database are therefore forced to devote so much attention to anonymization and information security that the research work itself may be disturbed. The Japanese Sentinel project is a national project for detecting adverse drug reactions using large-scale distributed clinical databases of large hospitals. Although patients give consent in advance for such general purposes for the public good, the use of insufficiently anonymized data is still under discussion. Generally speaking, researchers conducting studies for public benefit will not infringe patients' privacy, but vague and complex requirements of legislation on personal data protection may disturb research. Medical science does not progress without using clinical information; therefore, adequate legislation that is simple and clear for both researchers and patients is strongly required. In Japan, a specific act for balancing privacy and public benefit is now under discussion. The author recommends that researchers, including those in the field of pharmacology, pay attention to, participate in the discussion of, and make suggestions for such acts or regulations. PMID:24790041

  18. Genetics and Forensics: Making the National DNA Database.

    PubMed

    Johnson, Paul; Williams, Robin; Martin, Paul

    2003-01-01

    This paper is based on a current study of the growing police use of the epistemic authority of molecular biology for the identification of criminal suspects in support of crime investigation. It discusses the development of DNA profiling and the establishment and development of the UK National DNA Database (NDNAD) as an instance of the 'scientification of police work' (Ericson and Shearing 1986) in which the police uses of science and technology have a recursive effect on their future development. The NDNAD, owned by the Association of Chief Police Officers of England and Wales, is the first of its kind in the world and currently contains the genetic profiles of more than 2 million people. The paper provides a framework for the examination of this socio-technical innovation, begins to tease out the dense and compact history of the database and accounts for the way in which changes and developments across disparate scientific, governmental and policing contexts, have all contributed to the range of uses to which it is put. PMID:16467921

  19. Toward Privacy in Public Databases

    E-print Network

    Chawla, Shuchi

    Toward Privacy in Public Databases. Shuchi Chawla, Cynthia Dwork, Frank McSherry, Adam Smith, and H. Wee. Affiliations listed include Microsoft, the Weizmann Institute of Science (adam.smith@weizmann.ac.il), and the University of California, Berkeley. ARO Grant DAAD19-00-1-0177. ... sanitization

  20. 76 FR 1137 - Publicly Available Consumer Product Safety Information Database: Notice of Public Web Conferences

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-01-07

    The Consumer Product Safety Commission (``Commission,'' ``CPSC,'' or ``we'') is announcing two Web conferences to demonstrate to interested stakeholders the incident reporting form, industry registration and comment features, and the search function of the publicly available consumer product safety information database (``Database''). The Web conferences will be webcast live from the Commission's......

  1. A publication database for optical long baseline interferometry

    NASA Astrophysics Data System (ADS)

    Malbet, Fabien; Mella, Guillaume; Lawson, Peter; Taillifet, Esther; Lafrasse, Sylvain

    2010-07-01

    Optical long baseline interferometry is a technique that has generated almost 850 refereed papers to date. The targets span a large variety of objects from planetary systems to extragalactic studies and all branches of stellar physics. We have created a database hosted by the JMMC and connected to the Optical Long Baseline Interferometry Newsletter (OLBIN) web site using MySQL and a collection of XML or PHP scripts in order to store and classify these publications. Each entry is defined by its ADS bibcode, includes basic ADS informations and metadata. The metadata are specified by tags sorted in categories: interferometric facilities, instrumentation, wavelength of operation, spectral resolution, type of measurement, target type, and paper category, for example. The whole OLBIN publication list has been processed and we present how the database is organized and can be accessed. We use this tool to generate statistical plots of interest for the community in optical long baseline interferometry.

  2. Forensic DNA databases in Western Balkan region: retrospectives, perspectives, and initiatives

    PubMed Central

    Marjanović, Damir; Konjhodžić, Rijad; Butorac, Sara Sanela; Drobnič, Katja; Merkaš, Siniša; Lauc, Gordan; Primorac, Damir; Anđelinović, Šimun; Milosavljević, Mladen; Karan, Željko; Vidović, Stojko; Stojković, Oliver; Panić, Bojana; Vučetić Dragović, Anđelka; Kovačević, Sandra; Jakovski, Zlatko; Asplen, Chris; Primorac, Dragan

    2011-01-01

    The European Network of Forensic Science Institutes (ENFSI) recommended the establishment of forensic DNA databases and specific implementation and management legislation for all EU/ENFSI members. Therefore, forensic institutions from Bosnia and Herzegovina, Serbia, Montenegro, and Macedonia launched a wide set of activities to support these recommendations. To assess the current state, a regional expert team completed detailed screening and investigation of the existing forensic DNA data repositories and associated legislation in these countries. The scope also included relevant concurrent projects and a wide spectrum of different activities in relation to forensic DNA use. The state of forensic DNA analysis was also determined in neighboring Slovenia and Croatia, which already have functional national DNA databases. There is a need for a ‘regional supplement’ to the current documentation and standards pertaining to forensic application of DNA databases, which should include region-specific preliminary aims and recommendations. PMID:21674821

  3. UniPROBE: an online database of protein binding microarray data on protein–DNA interactions

    E-print Network

    Bulyk, Martha L.

    The UniPROBE (Universal PBM Resource for Oligonucleotide Binding Evaluation) database hosts data generated by universal protein binding microarray (PBM) technology on the in vitro DNA-binding specificities of proteins. ...

  4. DSSTOX WEBSITE LAUNCH: IMPROVING PUBLIC ACCESS TO DATABASES FOR BUILDING STRUCTURE-TOXICITY PREDICTION MODELS

    EPA Science Inventory

    DSSTox Website Launch: Improving Public Access to Databases for Building Structure-Toxicity Prediction Models Ann M. Richard US Environmental Protection Agency, Research Triangle Park, NC, USA Distributed: Decentralized set of standardized, field-delimited databases,...

  5. Accessing the public MIMIC-II intensive care relational database for clinical research

    E-print Network

    Scott, Daniel

    Background: The Multiparameter Intelligent Monitoring in Intensive Care II (MIMIC-II) database is a free, public resource for intensive care research. The database was officially released in 2006, and has attracted a growing ...

  6. DNA Fingerprint Database for Crapemyrtle Cultivar Identification, Hybrid Verification, and Parentage Analysis

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The objective of this study was to create DNA fingerprints for the Razzle Dazzle® crape myrtle series using simple sequence repeat (SSR) markers, and compare them with the DNA fingerprints of a database made up of over 50 popular crape myrtle cultivars currently available in the trade. Data consiste...

  7. Exploring public databases to characterize urban flood risks in Amsterdam

    NASA Astrophysics Data System (ADS)

    Gaitan, Santiago; ten Veldhuis, Marie-claire; van de Giesen, Nick

    2015-04-01

    Cities worldwide are challenged by increasing urban flood risks. Precise and realistic measures are required to decide upon investment to reduce their impacts. Obvious factors affecting flood risk include sewer system performance and urban topography. However, currently implemented sewer and topographic models do not provide realistic predictions of local flooding occurrence during heavy rain events. Assessing other factors such as spatially distributed rainfall and socioeconomic characteristics may help to explain the probability and impacts of urban flooding. Several public databases were analyzed: complaints about flooding made by citizens; rainfall depths (15 min and 100 ha spatio-temporal resolution); grids describing number of inhabitants, income, and housing price (1 ha and 25 ha resolution); and building age. Data analysis was done using Python and GIS programming, and included spatial indexing of data, cluster analysis, and multivariate regression on the complaints. Complaints were used as a proxy to characterize flooding impacts. The cluster analysis, run for all the variables except the complaints, grouped part of the grid cells of central Amsterdam into a highly differentiated group, covering 10% of the analyzed area and accounting for 25% of registered complaints. The configuration of the analyzed variables in central Amsterdam coincides with a high complaint count. The remaining complaints were evenly dispersed among the other groups. An adjusted R2 of 0.38 in the multivariate regression suggests that explanatory power can improve if additional variables are considered. While rainfall intensity explained 4% of the incidence of complaints, population density and building age each significantly explained around 20%. Data mining of public databases proved to be a valuable tool to identify factors explaining variability in the occurrence of urban pluvial flooding, though additional variables must be considered to fully explain flood risk variability.
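
    The abstract reports an adjusted R2 for its multivariate regression. As a minimal sketch of what that statistic measures, the following shows the standard definitions of R2 and adjusted R2. The function names and the numbers in the usage lines are hypothetical and are not taken from the study, whose sample size and predictor count are not stated in the abstract.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n_obs, n_predictors):
    """Penalize R2 for the number of predictors in the regression.

    Requires n_obs > n_predictors + 1.
    """
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


if __name__ == "__main__":
    # Hypothetical complaint counts per grid cell and fitted values.
    observed = [3, 0, 5, 2, 8, 1]
    predicted = [2.5, 0.5, 4.0, 2.5, 7.0, 2.5]
    r2 = r_squared(observed, predicted)
    print(round(r2, 3), round(adjusted_r_squared(r2, n_obs=6, n_predictors=2), 3))
```

    Because the adjustment grows with the number of predictors, adding variables raises adjusted R2 only when they explain enough extra variance to outweigh the penalty.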

  8. The Moroccan Genetic Disease Database (MGDD): a database for DNA variations related to inherited disorders and disease susceptibility.

    PubMed

    Charoute, Hicham; Nahili, Halima; Abidi, Omar; Gabi, Khalid; Rouba, Hassan; Fakiri, Malika; Barakat, Abdelhamid

    2014-03-01

    National and ethnic mutation databases provide comprehensive information about genetic variations reported in a population or an ethnic group. In this paper, we present the Moroccan Genetic Disease Database (MGDD), a catalogue of genetic data related to diseases identified in the Moroccan population. We used the PubMed, Web of Science and Google Scholar databases to identify available articles published until April 2013. The database is designed and implemented on a three-tier model using the MySQL relational database and the PHP programming language. To date, the database contains 425 mutations and 208 polymorphisms found in 301 genes and 259 diseases. Most Mendelian diseases in the Moroccan population follow an autosomal recessive mode of inheritance (74.17%) and affect endocrine, nutritional and metabolic physiology. The MGDD database provides reference information for researchers, clinicians and health professionals through a user-friendly Web interface. Its content should be useful for improving research in human molecular genetics, disease diagnosis and the design of association studies. MGDD can be publicly accessed at http://mgdd.pasteur.ma. PMID:23860041

  9. DNA -- Intimate Information or Trash for Public Consumption?

    E-print Network

    Wilson, Melanie D.

    2008-01-01

    This essay discusses the increasingly popular police practice of covertly collecting DNA samples from people who inadvertently leave saliva, hair or other biological matter in public places. The essay contends that although the United States Supreme...

  10. DNA Paternity Testing: Public Perceptions And The Influence Of Gender

    Microsoft Academic Search

    Michael Gilding; Christine Critchley; Penelope Shields; Lisa Bakacs; Kerrie-Anne Butler

    This article reports on the findings of the Swinburne National Technology and Society Monitor in relation to public perceptions of DNA paternity testing, with particular reference to the effects of gender. The Monitor included a large-scale random survey and focus groups. Taken together, the survey and focus groups suggest that most Australians are 'comfortable' with DNA paternity testing in a

  11. Academic Impact of a Public Electronic Health Database: Bibliometric Analysis of Studies Using the General Practice Research Database

    PubMed Central

    Chen, Yu-Chun; Wu, Jau-Ching; Haschler, Ingo; Majeed, Azeem; Chen, Tzeng-Ji; Wetter, Thomas

    2011-01-01

    Background Studies that use electronic health databases as research material are becoming popular, but the influence of a single electronic health database has not yet been well investigated. The United Kingdom's General Practice Research Database (GPRD) is one of the few electronic health databases publicly available to academic researchers. This study analyzed studies that used GPRD to demonstrate the scientific production and academic impact of a single public health database. Methodology and Findings A total of 749 studies published between 1995 and 2009 with ‘General Practice Research Database’ as their topics, defined as GPRD studies, were extracted from Web of Science. By the end of 2009, the GPRD had attracted 1251 authors from 22 countries and been used extensively in 749 studies published in 193 journals across 58 study fields. Each GPRD study was cited 2.7 times by successive studies. Moreover, the total number of GPRD studies increased rapidly, and it is expected to reach 1500 by 2015, twice the number accumulated till the end of 2009. Since 17 of the most prolific authors (1.4% of all authors) contributed nearly half (47.9%) of GPRD studies, success in conducting GPRD studies may accumulate. The GPRD was used mainly in, but not limited to, the three study fields of “Pharmacology and Pharmacy”, “General and Internal Medicine”, and “Public, Environmental and Occupational Health”. The UK and United States were the two most active regions of GPRD studies. One-third of GPRD studies were internationally co-authored. Conclusions A public electronic health database such as the GPRD will promote scientific production in many ways. Data owners of electronic health databases at a national level should consider how to reduce access barriers and to make data more available for research. PMID:21731733

  12. A brief history of the formation of DNA databases in forensic science within Europe.

    PubMed

    Martin, P D; Schmitter, H; Schneider, P M

    2001-06-15

    The introduction of DNA analysis to forensic science brought with it a number of choices for analysis, not all of which were compatible. As laboratories throughout Europe were eager to use the new technology, different systems became routine in different laboratories and, consequently, there was no basis for the exchange of results. A period of co-operation then started in which a nucleus of forensic scientists agreed on a uniform system. This collaboration spread to incorporate most of the established forensic science laboratories in Europe and continued through two major changes in the technology. At each step agreement was reached on which systems to use. From the beginning it was realised that DNA databases would provide the criminal justice systems with an efficient way of crime solving and consequently some local databases were created. It was not until the introduction of the amplification technology linked to the analysis of short tandem repeats that a sufficiently sensitive and robust system was available for the formation of efficient and effective DNA databases. Comprehensive legislation enacted in the UK in 1995 enabled forensic scientists to set up the first national DNA database, which would hold personal DNA profiles together with results obtained from crime scenes. Other countries quickly followed, but in some the legislation has severely restricted the amount and type of data which can be retained and, therefore, the effectiveness of the databases is limited. The widespread use of commercially produced multiplex kits has produced a situation in which nearly all European laboratories are using compatible systems and there is, therefore, the potential for the introduction of a pan-European DNA database. However, the exchange of results between countries is hampered by the various legislations which currently exist. PMID:11376988

  13. The public Human Genome Project's DNA donors, Eric Lander. Site: DNA Interactive (www.dnai.org)

    NSDL National Science Digital Library

    2008-10-06

    Interviewee: Eric Lander DNAi Location: Genome>The Project>players>Public consortium The public's DNA donors Eric Lander, director of the Whitehead Institute Center for Genome Research, explains where the DNA donors for the first reference sequence came from.

  14. Hungarian mtDNA population databases from Budapest and the Baranya county Roma

    Microsoft Academic Search

    Jodi Irwin; Balazs Egyed; Jessica Saunier; Gabriella Szamosi; Jennifer O’Callaghan; Zsolt Padar; Thomas J. Parsons

    2007-01-01

    To facilitate forensic mtDNA testing in Hungary, we have generated control region databases for two Hungarian populations: 211 individuals were sampled from the urban Budapest population and 208 individuals were sampled from a Romani (“gypsy”) population in Baranya county. Sequences were generated using a highly redundant approach to minimize potential database errors. The Budapest population had high sequence diversity with

  15. Novel Statistical Tools for Management of Public Databases Facilitate Community-Wide Replicability and Control of False

    E-print Network

    Shamir, Ron

    ... to public database management that we term Quality Preserving Database (QPD). It enables perpetual use ... in managing current and future biological databases will significantly enhance the community's ability to make

  16. CD-ROM REVIEW: The ICRP Database of Dose Coefficients: Workers and Members of the Public

    Microsoft Academic Search

    Alan Bunker

    1999-01-01

    This CD contains the ICRP database of dose coefficients for workers and members of the public prepared by the DOCAL Task Group of ICRP Committee 2, and adopted by the Commission in October 1998. It is essentially an electronic version of ICRP Publications 68 and 72 for workers and members of the public, respectively. However, it contains far more data

  17. Toward a mtDNA Locus-Specific Mutation Database Using the LOVD Platform

    PubMed Central

    Elson, Joanna L.; Sweeney, Mary G.; Procaccio, Vincent; Yarham, John W.; Salas, Antonio; Kong, Qing-Peng; van der Westhuizen, Francois H.; Pitceathly, Robert D.S.; Thorburn, David R.; Lott, Marie T.; Wallace, Douglas C.; Taylor, Robert W.; McFarland, Robert

    2015-01-01

    The Human Variome Project (HVP) is a global effort to collect and curate all human genetic variation affecting health. Mutations of mitochondrial DNA (mtDNA) are an important cause of neurogenetic disease in humans; however, identification of the pathogenic mutations responsible can be problematic. In this article, we provide explanations as to why and suggest how such difficulties might be overcome. We put forward a case in support of a new Locus Specific Mutation Database (LSDB) implemented using the Leiden Open-source Variation Database (LOVD) system that will not only list primary mutations, but also present the evidence supporting their role in disease. Critically, we feel that this new database should have the capacity to store information on the observed phenotypes alongside the genetic variation, thereby facilitating our understanding of the complex and variable presentation of mtDNA disease. LOVD supports fast queries of both seen and hidden data and allows storage of sequence variants from high-throughput sequence analysis. The LOVD platform will allow construction of a secure mtDNA database; one that can fully utilize currently available data, as well as that being generated by high-throughput sequencing, to link genotype with phenotype enhancing our understanding of mitochondrial disease, with a view to providing better prognostic information. PMID:22581690

  18. An Internet-Accessible DNA Sequence Database for Identifying Fusaria from Human and Animal Infections

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Because less than one-third of clinically relevant fusaria can be accurately identified to species level using phenotypic data (i.e., morphological species recognition), we constructed a three-locus DNA sequence database to facilitate molecular identification of the 69 Fusarium species associated wi...

  19. An assessment of whether SNPs will replace STRs in national DNA databases Joint considerations of the

    E-print Network

    Working Group on DNA Analysis Methods (SWGDAM) Sir: It is unlikely that SNPs will replace STRs as the preferred method of testing of forensic samples and database samples in the near to medium future throughput, this research is carried out primarily for the pharmaceutical industry for drug discovery

  20. Prisoners' expectations of the national forensic DNA database: surveillance and reconfiguration of individual rights.

    PubMed

    Machado, Helena; Santos, Filipe; Silva, Susana

    2011-07-15

    In this paper we aim to discuss what Portuguese prisoners know and how they feel about surveillance mechanisms related to the inclusion and deletion of the DNA profiles of convicted criminals in the national forensic database. Through a set of interviews with individuals currently imprisoned we focus on the ways this group perceives forensic DNA technologies. While the institutional and political discourses maintain that the restricted use and application of DNA profiles within the national forensic database protects individuals' rights, the prisoners claim that police misuse of such technologies potentially makes it difficult to escape from surveillance and acts as a means of reinforcing the stigma of delinquency. The prisoners also argue that additional intensive and extensive use of surveillance devices might be more protective of their own individual rights and might possibly increase potential for exoneration. PMID:21414735

  1. End-Users/Public Access. Reprints from the Best of "ONLINE" [and]"DATABASE."

    ERIC Educational Resources Information Center

    Online, Inc., Weston, CT.

    Reprints of 20 articles pertaining to the topics of end-users and public access appear in this volume, which is one in a series of volumes of reprints from "ONLINE" and "DATABASE" magazines. Edited for information professionals who use electronically distributed databases, these articles address such topics as: (1) managing a compact disc…

  2. ArrayExpress - a public database of microarray experiments and gene expression profiles

    Microsoft Academic Search

    Helen E. Parkinson; Misha Kapushesky; Mohammadreza Shojatalab; Niran Abeygunawardena; R. Coulson; Anna Farne; Ele Holloway; N. Kolesnykov; P. Lilja; M. Lukk; R. Mani; Tim Rayner; Anjan Sharma; E. William; Ugis Sarkans; Alvis Brazma

    2007-01-01

    ArrayExpress is a public database for high throughput functional genomics data. ArrayExpress consists of two parts—the ArrayExpress Repository, which is a MIAME supportive public archive of microarray data, and the ArrayExpress Data Warehouse, which is a database of gene expression profiles selected from the repository and consistently re-annotated. Archived experiments can be queried by experiment attributes,

  3. HUNT: launch of a full-length cDNA database from the Helix Research Institute

    PubMed Central

    Yudate, Henrik T.; Suwa, Makiko; Irie, Ryotaro; Matsui, Hiroshi; Nishikawa, Tetsuo; Nakamura, Yoshitaka; Yamaguchi, Daisuke; Peng, Zhang Zhi; Yamamoto, Tomoyuki; Nagai, Keiichi; Hayashi, Koji; Otsuki, Tetsuji; Sugiyama, Tomoyasu; Ota, Toshio; Suzuki, Yutaka; Sugano, Sumio; Isogai, Takao; Masuho, Yasuhiko

    2001-01-01

    The Helix Research Institute (HRI) in Japan is releasing 4356 HUman Novel Transcripts and related information in the newly established HUNT database. The institute is a joint research project principally funded by the Japanese Ministry of International Trade and Industry, and the clones were sequenced in the governmental New Energy and Industrial Technology Development Organization (NEDO) Human cDNA Sequencing Project. The HUNT database contains an extensive amount of annotation from advanced analysis and represents an essential bioinformatics contribution towards understanding of the gene function. The HRI human cDNA clones were obtained from full-length enriched cDNA libraries constructed with the oligo-capping method and have resulted in novel full-length cDNA sequences. A large fraction has little similarity to any proteins of known function and to obtain clues about possible function we have developed original analysis procedures. Any putative function deduced here can be validated or refuted by complementary analysis results. The user can also extract information from specific categories like PROSITE patterns, PFAM domains, PSORT localization, transmembrane helices and clones with GENIUS structure assignments. The HUNT database can be accessed at http://www.hri.co.jp/HUNT. PMID:11125086

  4. DNA variant databases improve test accuracy and phenotype prediction in Alport syndrome.

    PubMed

    Savige, Judy; Ars, Elisabet; Cotton, Richard G H; Crockett, David; Dagher, Hayat; Deltas, Constantinos; Ding, Jie; Flinter, Frances; Pont-Kingdon, Genevieve; Smaoui, Nizar; Torra, Roser; Storey, Helen

    2014-06-01

    X-linked Alport syndrome is a form of progressive renal failure caused by pathogenic variants in the COL4A5 gene. More than 700 variants have been described and a further 400 are estimated to be known to individual laboratories but are unpublished. The major genetic testing laboratories for X-linked Alport syndrome worldwide have established a Web-based database for published and unpublished COL4A5 variants ( https://grenada.lumc.nl/LOVD2/COL4A/home.php?select_db=COL4A5 ). This conforms with the recommendations of the Human Variome Project: it uses the Leiden Open Variation Database (LOVD) format, describes variants according to the human reference sequence with standardized nomenclature, indicates likely pathogenicity and associated clinical features, and credits the submitting laboratory. The database includes non-pathogenic and recurrent variants, and is linked to another COL4A5 mutation database and relevant bioinformatics sites. Access is free. Increasing the number of COL4A5 variants in the public domain helps patients, diagnostic laboratories, clinicians, and researchers. The database improves the accuracy and efficiency of genetic testing because its variants are already categorized for pathogenicity. The description of further COL4A5 variants and clinical associations will improve our ability to predict phenotype and our understanding of collagen IV biochemistry. The database for X-linked Alport syndrome represents a model for databases in other inherited renal diseases. PMID:23720012

  5. Using the ADS Database to Study Trends in Astronomical Publication

    Microsoft Academic Search

    E. Schulman; A. L. Powell; J. C. French; G. Eichhorn; M. J. Kurtz; S. S. Murray

    1996-01-01

    The sociology of astronomical publication has traditionally been studied by looking for publication trends using every paper published in a few selected journals within a few selected years. For example, Abt (1981, PASP, 93, 269) examined the papers published in ApJ, ApJS, AJ, and PASP during the first year of each decade from 1910 to 1980. By analyzing the NASA

  6. DNA Identification of Mountain Lions Involved in Livestock Predation and Public Safety Incidents and Investigations

    E-print Network

    Ernest, Holly

    1 DNA Identification of Mountain Lions Involved in Livestock Predation and Public Safety Incidents concolor, bobcat, forensic, genetics, DNA techniques, noninvasive sampling, fecal DNA, prey swab DNA ABSTRACT Using three case studies, we demonstrated the utility of techniques to analyze DNA from trace

  7. FastStats: a public health statistics database.

    PubMed

    Vardell, Emily

    2014-01-01

    FastStats is a site that provides quick and easy access to public health statistics. The freely available website is maintained by the Centers for Disease Control and Prevention's National Center for Health Statistics. Users can browse alphabetically by topic and state/territory or search across the National Center for Health Statistics site. A description of the browsing capabilities and sample searches are presented. PMID:24735268

  8. Development of Energy Consumption Database Management System of Existing Large Public Buildings 

    E-print Network

    Li, Y.; Zhang, J.; Sun, D.

    2006-01-01

    ICEBO2006, Shenzhen, China Policy for Energy Efficiency and Comfort, Vol.VII-3-1 Development of Energy Consumption Database Management System of Existing Large Public Buildings1 Yunhua Li Jili Zhang Dexing...-conditioning System in Some Public Building. FULID MACHINERY. 2003, 31(3):42~46.(In Chinese) [3] Li Yuyun, Zhang Chunzhi, Zeng Shengzhi. Analysis of air-conditioning energy consumption in commercial buildings of Wuhan.HV&AC. 2002,32(4):85~87.(In Chinese) [4...

  9. Molecular scaffold analysis of natural products databases in the public domain.

    PubMed

    Yongye, Austin B; Waddell, Jacob; Medina-Franco, José L

    2012-11-01

    Natural products represent important sources of bioactive compounds in drug discovery efforts. In this work, we compiled five natural products databases available in the public domain and performed a comprehensive chemoinformatic analysis focused on the content and diversity of the scaffolds with an overview of the diversity based on molecular fingerprints. The natural products databases were compared with each other and with a set of molecules obtained from in-house combinatorial libraries, and with a general screening commercial library. It was found that publicly available natural products databases have different scaffold diversity. In contrast to the common concept that larger libraries have the largest scaffold diversity, the largest natural products collection analyzed in this work was not the most diverse. The general screening library showed, overall, the highest scaffold diversity. However, considering the most frequent scaffolds, the general reference library was the least diverse. In general, natural products databases in the public domain showed low molecule overlap. In addition to benzene and acyclic compounds, flavones, coumarins, and flavanones were identified as the most frequent molecular scaffolds across the different natural products collections. The results of this work have direct implications in the computational and experimental screening of natural product databases for drug discovery. PMID:22863071

  10. Governing Software: Networks, Databases and Algorithmic Power in the Digital Governance of Public Education

    ERIC Educational Resources Information Center

    Williamson, Ben

    2015-01-01

    This article examines the emergence of "digital governance" in public education in England. Drawing on and combining concepts from software studies, policy and political studies, it identifies some specific approaches to digital governance facilitated by network-based communications and database-driven information processing software…

  11. HEDS - EPA DATABASE SYSTEM FOR PUBLIC ACCESS TO HUMAN EXPOSURE DATA

    EPA Science Inventory

    Human Exposure Database System (HEDS) is an Internet-based system developed to provide public access to human-exposure-related data from studies conducted by EPA's National Exposure Research Laboratory (NERL). HEDS was designed to work with the EPA Office of Research and Devel...

  12. Touchscreen field specification for public access database queries: let your fingers do the walking

    Microsoft Academic Search

    Andrew Sears; Yoram Kochavy; Ben Shneiderman

    1990-01-01

    Database query is becoming a common task in public access systems; touchscreens can provide an appealing interface for such a system. This paper explores three interfaces for constructing queries on alphabetic field values with a touchscreen interface, including a QWERTY keyboard, an Alphabetic keyboard, and a Reduced Input Data Entry (RIDE) interface. The RIDE interface allows field values to be

  13. Harp: a distributed query system for legacy public libraries and structured databases

    Microsoft Academic Search

    Ee-Peng Lim; Ying Lu

    1999-01-01

    The main purpose of a digital library is to give users easy access to enormous amounts of globally networked information. Typically, this information includes preexisting public library catalog data, digitized document collections, and other databases. In this article, we describe the distributed query system of a digital library prototype system known as HARP. In the HARP project, we have designed

  14. LBVS: an online platform for ligand-based virtual screening using publicly accessible databases.

    PubMed

    Zheng, Minghao; Liu, Zhihong; Yan, Xin; Ding, Qianzhi; Gu, Qiong; Xu, Jun

    2014-11-01

    Abundant data on compound bioactivity and publicly accessible chemical databases increase opportunities for ligand-based drug discovery. In order to make full use of the data, an online platform for ligand-based virtual screening (LBVS) using publicly accessible databases has been developed. LBVS adopts a Bayesian learning approach to create virtual screening models because of its noise tolerance, speed, and efficiency in extracting knowledge from data. LBVS currently includes data derived from BindingDB and ChEMBL. Three validation approaches have been employed to evaluate the virtual screening models created from LBVS. The tenfold cross-validation results of twenty different LBVS models demonstrate that LBVS achieves an average AUC value of 0.86. Our internal and external testing results indicate that LBVS is predictive for lead identification. LBVS can be publicly accessed at http://rcdd.sysu.edu.cn/lbvs. PMID:25182364
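
    The average AUC reported above is a mean over cross-validation folds. The following is a minimal sketch of how such a figure can be computed, assuming each fold yields binary activity labels (1 = active, 0 = inactive) and model scores. The function names and toy values are hypothetical and are not taken from LBVS.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a random active outscores a random inactive.

    Ties between an active and an inactive score count as 0.5.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one active and one inactive compound")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_auc(folds):
    """Average the AUC over a list of (labels, scores) folds."""
    return sum(roc_auc(labels, scores) for labels, scores in folds) / len(folds)


# Two made-up folds standing in for the ten folds of a real cross-validation.
folds = [
    ([1, 1, 0, 1, 0, 0], [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]),
    ([1, 0], [0.3, 0.1]),
]
average = mean_auc(folds)
```

    The pairwise-comparison form is quadratic in the number of compounds; it is used here only because it makes the definition of AUC explicit.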

  15. Information Technologies in Public Health Management: A Database on Biocides to Improve Quality of Life

    PubMed Central

    Roman, C; Scripcariu, L; Diaconescu, RM; Grigoriu, A

    2012-01-01

    Background: Biocides for prolonging the shelf life of a large variety of materials have been extensively used over the last decades. It has been estimated that worldwide biocide consumption was about 12.4 billion dollars in 2011, and it is expected to increase in 2012. As biocides are substances we come into contact with in our everyday lives, access to this type of information is of paramount importance in order to ensure an appropriate living environment. Consequently, a database where information may be quickly processed, sorted, and easily accessed, according to different search criteria, is the most desirable solution. The main aim of this work was to design and implement a relational database with complete information about biocides used in public health management to improve the quality of life. Methods: Design and implementation of a relational database for biocides, using the software “phpMyAdmin”. Results: A database that allows for efficient collection, storage, and management of information, including chemical properties and applications of a large number of biocides, as well as its adequate dissemination into the public health environment. Conclusion: The information contained in the database presented here promotes an adequate use of biocides by means of information technologies, which in turn may help achieve an important improvement in our quality of life. PMID:23113190

  16. A Public HTLV-1 Molecular Epidemiology Database for Sequence Management and Data Mining

    PubMed Central

    Libin, Pieter; Deforche, Koen; Edwards, Dustin; de Albuquerque-Junior, Antonio Eduardo; Vandamme, Anne-Mieke; Galvao-Castro, Bernardo; Alcantara, Luiz Carlos Junior

    2012-01-01

    Background It is estimated that 15 to 20 million people are infected with the human T-cell lymphotropic virus type 1 (HTLV-1). At present, there are more than 2,000 unique HTLV-1 isolate sequences published. A central database to aggregate sequence information from a range of epidemiological aspects including HTLV-1 infections, pathogenesis, origins, and evolutionary dynamics would be useful to scientists and physicians worldwide. Here we describe a database that collects and annotates sequence data and can be accessed through a user-friendly search interface. The HTLV-1 Molecular Epidemiology Database website is available at http://htlv1db.bahia.fiocruz.br/. Methodology/Principal Findings All data were obtained from publications available at GenBank or through contact with the authors. The database was developed using Apache Webserver 2.1.6 and SGBD MySQL. The webpage interfaces were developed in HTML, and the server-side scripting was written in PHP. The HTLV-1 Molecular Epidemiology Database is hosted on the Gonçalo Moniz/FIOCRUZ Research Center server. There are currently 2,457 registered sequences with 2,024 (82.37%) of those sequences representing unique isolates. Of these sequences, 803 (39.67%) contain information about clinical status (TSP/HAM, 17.19%; ATL, 7.41%; asymptomatic, 12.89%; other diseases, 2.17%; and no information, 60.32%). Further, 7.26% of sequences contain information on patient gender while 5.23% of sequences provide the age of the patient. Conclusions/Significance The HTLV-1 Molecular Epidemiology Database retrieves and stores annotated HTLV-1 proviral sequences from clinical, epidemiological, and geographical studies. The collected sequences and related information are now accessible on a publicly available and user-friendly website. This open-access database will support clinical research and vaccine development related to viral genotype. PMID:22970114

  17. SITVITWEB--a publicly available international multimarker database for studying Mycobacterium tuberculosis genetic diversity and molecular epidemiology.

    PubMed

    Demay, Christophe; Liens, Benjamin; Burguière, Thomas; Hill, Véronique; Couvin, David; Millet, Julie; Mokrousov, Igor; Sola, Christophe; Zozio, Thierry; Rastogi, Nalin

    2012-06-01

    Among various genotyping methods to study Mycobacterium tuberculosis complex (MTC) genotypic polymorphism, spoligotyping and mycobacterial interspersed repetitive units-variable number of DNA tandem repeats (MIRU-VNTRs) have recently gained international approval as robust, fast, and reproducible typing methods generating data in a portable format. Spoligotyping constituted the backbone of a publicly available database SpolDB4 released in 2006; nonetheless this method possesses a low discriminatory power when used alone and should be ideally used in conjunction with a second typing method such as MIRU-VNTRs for high-resolution epidemiological studies. We hereby describe a publicly available international database named SITVITWEB which incorporates such multimarker data, providing a global view of MTC genetic diversity worldwide based on 62,582 clinical isolates corresponding to 153 countries of patient origin (105 countries of isolation). We report a total of 7105 spoligotype patterns (corresponding to 58,180 clinical isolates) - grouped into 2740 shared-types or spoligotype international types (SIT) containing 53,816 clinical isolates and 4364 orphan patterns. Interestingly, only 7% of the MTC isolates worldwide were orphans whereas more than half of SITed isolates (n=27,059) were restricted to only 24 most prevalent SITs. The database also contains a total of 2379 MIRU patterns (from 8161 clinical isolates) from 87 countries of patient origin (35 countries of isolation); these were grouped in 847 shared-types or MIRU international types (MIT) containing 6626 isolates and 1533 orphan patterns. Lastly, data on 5-locus exact tandem repeats (ETRs) were available on 4626 isolates from 59 countries of patient origin (22 countries of isolation); a total of 458 different VNTR patterns were observed - split into 245 shared-types or VNTR International Types (VIT) containing 4413 isolates and 213 orphan patterns. Data mining of SITVITWEB further made it possible to update the rules defining MTC genotypic lineages as well as to gain new insight into MTC population structure and worldwide distribution at country, sub-regional and continental levels. At the evolutionary level, the data compiled may be useful to distinguish the occasional convergent evolution of genotypes versus specific evolution of sublineages essentially influenced by adaptation to the host. This database is publicly available at: http://www.pasteur-guadeloupe.fr:8081/SITVIT_ONLINE. PMID:22365971

  18. Genomics and Public Health Research: Can the State Allow Access to Genomic Databases?

    PubMed Central

    Cousineau, J; Girard, N; Monardes, C; Leroux, T; Jean, M Stanton

    2012-01-01

    Because many diseases are multifactorial disorders, the scientific progress in genomics and genetics should be taken into consideration in public health research. In this context, genomic databases will constitute an important source of information. Consequently, it is important to identify and characterize the State’s role and authority on matters related to public health, in order to verify whether it has access to such databases while engaging in public health genomic research. We first consider the evolution of the concept of public health, as well as its core functions, using a comparative approach (e.g. WHO, PAHO, CDC and the Canadian province of Quebec). Following an analysis of relevant Quebec legislation, the precautionary principle is examined as a possible avenue to justify State access to and use of genomic databases for research purposes. Finally, we consider the influenza pandemic plans developed by WHO, Canada, and Quebec as examples of key tools framing the public health decision-making process. We observed that State powers in public health are not, in Quebec, well adapted to the expansion of genomics research. We propose that the scope of the concept of research in public health should be clear and include the following characteristics: a commitment to the health and well-being of the population and to their determinants; the inclusion of both applied research and basic research; and an appropriate model of governance (authorization, follow-up, consent, etc.). We also suggest that the strategic approach version of the precautionary principle could guide collective choices in these matters. PMID:23113174

  19. Development and expansion of high-quality control region databases to improve forensic mtDNA evidence interpretation

    Microsoft Academic Search

    Jodi A. Irwin; Jessica L. Saunier; Katharine M. Strouss; Kimberly A. Sturk; Toni M. Diegoli; Rebecca S. Just; Michael D. Coble; Walther Parson; Thomas J. Parsons

    2007-01-01

    In an effort to increase the quantity, breadth and availability of mtDNA databases suitable for forensic comparisons, we have developed a high-throughput process to generate approximately 5000 control region sequences per year from regional US populations, global populations from which the current US population is derived and global populations currently under-represented in available forensic databases. The system utilizes robotic instrumentation

  20. Does an English appeal court ruling increase the risks of miscarriages of justice when complex DNA profiles are searched against the national DNA database?

    PubMed

    Gill, P; Bleka, Ø; Egeland, T

    2014-11-01

    Likelihood ratio (LR) methods to interpret multi-contributor, low template, complex DNA mixtures are becoming standard practice. The next major development will be to introduce search engines based on the new methods to interrogate very large national DNA databases, such as those held by China, the USA and the UK. Here we describe a rapid method that was used to assign an LR to each individual member of a database of 5 million genotypes, which can then be ranked in order. Previous authors have only considered database trawls in the context of binary match or non-match criteria. However, the concept of match/non-match no longer applies within the new paradigm introduced, since the distribution of resultant LRs is continuous for practical purposes. An English appeal court decision allows scientists to routinely report complex DNA profiles using nothing more than their subjective personal 'experience of casework' and 'observations' in order to apply an expression of the rarity of an evidential sample. This ruling must be considered in context of a recent high profile English case, where an individual was extracted from a database and wrongly accused of a serious crime. In this case the DNA evidence was used to negate the overwhelming exculpatory (non-DNA) evidence. Demonstrable confirmation bias, also known as the 'CSI-effect', seriously affected the investigation. The case demonstrated that in practice, databases could be used to select and prosecute an individual, simply because he ranked high in the list of possible matches. We have identified this phenomenon as a cognitive error which we term: 'the naïve investigator effect'. We take the opportunity to test the performance of database extraction strategies either by using a simple matching allele count (MAC) method or LR. The example heard by the appeal court is used as the exemplar case. It is demonstrated that the LR search-method offers substantial benefits compared to searches based on simple matching allele count (MAC) methods. PMID:25151459
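
    As a rough illustration of the matching allele count (MAC) idea named in the abstract above, the sketch below scores each database profile by how many of its alleles also appear in the evidence profile, then ranks candidates by that score. The data structures, function names, and toy profiles are assumptions made for illustration. The abstract does not specify the authors' implementation, and no LR computation is attempted here.

```python
def matching_allele_count(evidence, candidate):
    """Count candidate alleles that also appear in the evidence profile.

    evidence:  dict mapping locus -> set of alleles observed in the mixture
    candidate: dict mapping locus -> tuple of the candidate's two alleles
    A homozygous candidate contributes two counts if its allele is observed.
    """
    count = 0
    for locus, alleles in candidate.items():
        observed = evidence.get(locus, set())
        count += sum(1 for allele in alleles if allele in observed)
    return count


def rank_database(evidence, database):
    """Return (name, MAC) pairs sorted from highest to lowest MAC."""
    scores = [(name, matching_allele_count(evidence, profile))
              for name, profile in database.items()]
    return sorted(scores, key=lambda item: (-item[1], item[0]))


# Hypothetical two-locus example; all allele values are made up.
evidence = {"D3S1358": {14, 15, 16}, "TH01": {6, 9.3}}
database = {
    "A": {"D3S1358": (14, 15), "TH01": (6, 9.3)},
    "B": {"D3S1358": (14, 17), "TH01": (7, 8)},
    "C": {"D3S1358": (16, 16), "TH01": (6, 7)},
}
ranking = rank_database(evidence, database)
```

    In line with the abstract's warning about the 'naïve investigator effect', a high rank in such a list only marks a candidate for further evaluation.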

  1. The public sequencing process, John Sulston. Site: DNA Interactive (www.dnai.org)

    NSDL National Science Digital Library

    2008-10-06

    Interviewee: John Sulston. Nobel Laureate John Sulston, a key figure in the UK sequencing effort, talks about breaking DNA apart so that the sequence can be reassembled.

  2. Accessing the public MIMIC-II intensive care relational database for clinical research

    PubMed Central

    2013-01-01

    Background The Multiparameter Intelligent Monitoring in Intensive Care II (MIMIC-II) database is a free, public resource for intensive care research. The database was officially released in 2006, and has attracted a growing number of researchers in academia and industry. We present the two major software tools that facilitate accessing the relational database: the web-based QueryBuilder and a downloadable virtual machine (VM) image. Results QueryBuilder and the MIMIC-II VM have been developed successfully and are freely available to MIMIC-II users. Simple example SQL queries and the resulting data are presented. Clinical studies pertaining to acute kidney injury and prediction of fluid requirements in the intensive care unit are shown as typical examples of research performed with MIMIC-II. In addition, MIMIC-II has also provided data for annual PhysioNet/Computing in Cardiology Challenges, including the 2012 Challenge “Predicting mortality of ICU Patients”. Conclusions QueryBuilder is a web-based tool that provides easy access to MIMIC-II. For more computationally intensive queries, one can locally install a complete copy of MIMIC-II in a VM. Both publicly available tools provide the MIMIC-II research community with convenient querying interfaces and complement the value of the MIMIC-II relational database. PMID:23302652
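
    To give a flavor of the kind of simple SQL query mentioned above, here is a minimal sketch that runs against an in-memory SQLite table. The table name, column names, and values are hypothetical stand-ins. They are not the MIMIC-II schema, and neither QueryBuilder nor the VM is involved.

```python
import sqlite3

# Build a tiny in-memory table; names and values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE icu_stays (stay_id INTEGER, age_years REAL, los_days REAL)"
)
conn.executemany(
    "INSERT INTO icu_stays VALUES (?, ?, ?)",
    [(1, 67.0, 2.5), (2, 45.0, 1.0), (3, 81.0, 6.5), (4, 72.0, 3.0)],
)

# Count stays of patients aged 65 or older and average their length of stay.
rows = conn.execute(
    "SELECT COUNT(*), AVG(los_days) FROM icu_stays WHERE age_years >= ?",
    (65,),
).fetchall()
conn.close()
```

    Passing the age threshold as a bound parameter rather than formatting it into the query string keeps the query safe to reuse with user-supplied values.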

  3. Applying Knowledge Discovery in Databases in Public Health Data Set: Challenges and Concerns

    PubMed Central

    Volrathongchia, Kanittha

    2003-01-01

    In attempting to apply Knowledge Discovery in Databases (KDD) to generate a predictive model from a health care dataset that is currently available to the public, the first step is to pre-process the data to overcome the challenges of missing data, redundant observations, and records containing inaccurate data. This study will demonstrate how to use simple pre-processing methods to improve the quality of input data. PMID:14728545
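
    The three pre-processing problems named above (missing data, redundant observations, and inaccurate records) can be sketched as a single filtering pass. This is a minimal illustration under stated assumptions: the field names, plausibility bounds, and the choice to drop rather than impute incomplete records are all hypothetical and not taken from the study.

```python
def preprocess(records, required, valid_ranges):
    """Drop duplicate, incomplete, and out-of-range records.

    records:      list of dicts with hashable values
    required:     field names that must be present and not None
    valid_ranges: dict mapping field -> (low, high) inclusive bounds
    """
    seen = set()
    cleaned = []
    for rec in records:
        key = tuple(sorted(rec.items()))
        if key in seen:
            continue  # redundant observation
        seen.add(key)
        if any(rec.get(field) is None for field in required):
            continue  # missing data
        if any(not (low <= rec[field] <= high)
               for field, (low, high) in valid_ranges.items()
               if rec.get(field) is not None):
            continue  # implausible, probably inaccurate value
        cleaned.append(rec)
    return cleaned


# Made-up records: one duplicate, one missing age, one implausible age.
records = [
    {"id": 1, "age": 34, "sbp": 120},
    {"id": 1, "age": 34, "sbp": 120},
    {"id": 2, "age": None, "sbp": 130},
    {"id": 3, "age": 250, "sbp": 110},
    {"id": 4, "age": 61, "sbp": 145},
]
cleaned = preprocess(
    records,
    required=["age", "sbp"],
    valid_ranges={"age": (0, 120), "sbp": (50, 300)},
)
```

    Dropping records is the simplest option; where too many rows would be lost, imputing missing values is a common alternative.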

  4. The barley EST DNA Replication and Repair Database (bEST-DRRD) as a tool for the identification of the genes involved in DNA replication and repair

    PubMed Central

    2012-01-01

    Background The high level of conservation of genes that regulate DNA replication and repair indicates that they may serve as a source of information on the origin and evolution of the species and makes them a reliable system for the identification of cross-species homologs. Studies conducted to date have shed light on the processes of DNA replication and repair in bacteria, yeast and mammals. However, there is still much to be learned about the process of DNA damage repair in plants. Description These studies, which were conducted mainly using bioinformatics tools, enabled the list of genes that participate in various pathways of DNA repair in Arabidopsis thaliana (L.) Heynh to be outlined; however, information regarding these mechanisms in crop plants is still very limited. A similar, functional approach is particularly difficult for a species whose complete genomic sequences are still unavailable. One of the solutions is to apply ESTs (Expressed Sequence Tags) as the basis for gene identification. For the construction of the barley EST DNA Replication and Repair Database (bEST-DRRD), presented here, the Arabidopsis nucleotide and protein sequences involved in DNA replication and repair were used to browse for and retrieve the deposited sequences, derived from four barley (Hordeum vulgare L.) sequence databases, including the “Barley Genome version 0.05” database (encompassing ca. 90% of barley coding sequences) and from two databases covering the complete genomes of two monocot models: Oryza sativa L. and Brachypodium distachyon L. in order to identify homologous genes. Sequences of the categorised Arabidopsis queries are used for browsing the repositories, which are located on the ViroBLAST platform. The bEST-DRRD is currently used in our project during the identification and validation of the barley genes involved in DNA repair. Conclusions The presented database provides information about the Arabidopsis genes involved in DNA replication and repair, their expression patterns and models of protein interactions. It was designed and established to provide an open-access tool for the identification of monocot homologs of known Arabidopsis genes that are responsible for DNA-related processes. The barley genes identified in the project are currently being analysed to validate their function. PMID:22697361

  5. Similarity landscapes: An improved method for scientific visualization of information from protein and DNA database searches

    SciTech Connect

    Dogget, N.; Myers, G. [Los Alamos National Lab., NM (United States)]; Wills, C.J. [Univ. of California, San Diego, CA (United States)]

    1998-12-01

    This is the final report of a three-year, Laboratory Directed Research and Development (LDRD) project at the Los Alamos National Laboratory (LANL). The authors have used computer simulations and examination of a variety of databases to address a wide range of evolutionary questions. The authors have found that there is a clear distinction in the evolution of HIV-1 and HIV-2, with the former and more virulent virus evolving more rapidly at a functional level. The authors have discovered highly non-random patterns in the evolution of HIV-1 that can be attributed to a variety of selective pressures. In the course of examination of microsatellite DNA (short repeat regions) in microorganisms, the authors have found clear differences between prokaryotes and eukaryotes in their distribution, differences that can be tied to different selective pressures. They have developed a new method (topiary pruning) for enhancing the phylogenetic information contained in DNA sequences. Most recently, the authors have discovered effects in complex rainforest ecosystems that indicate strong frequency-dependent interactions between host species and their parasites, leading to the maintenance of ecosystem variability.

  6. Addressing database mismatch in forensic speaker recognition with Ahumada III: a public real-casework database in Spanish

    Microsoft Academic Search

    Daniel Ramos; Joaquin Gonzalez-Rodriguez; Javier Gonzalez-Dominguez; Jose Juan Lucena-Molina

    2008-01-01

    This paper presents and describes Ahumada III, a speech database in Spanish collected from real forensic cases. In its current release, the database presents male speakers recorded using the systems and procedures followed by Spanish Guardia Civil police force. The paper also explores the usefulness of such a corpus for facing the important problem of database mismatch in

  7. 76 FR 77533 - Notice of Order: Revisions to Enterprise Public Use Database Incorporating High-Cost Single...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-12-13

    ...AGENCY [No. 2011-N-13] Notice of Order: Revisions to Enterprise Public Use Database Incorporating High-Cost Single-Family...Washington, DC 20006. mailto: Ian.Keith@fhfa.gov. For legal questions, contact: Sharon Like, Managing Associate...

  8. Quantitative assessment of the expanding complementarity between public and commercial databases of bioactive compounds

    PubMed Central

    2009-01-01

    Background Since 2004 public cheminformatic databases and their collective functionality for exploring relationships between compounds, protein sequences, literature and assay data have advanced dramatically. In parallel, commercial sources that extract and curate such relationships from journals and patents have also been expanding. This work updates a previous comparative study of databases chosen because of their bioactive content, availability of downloads and facility to select informative subsets. Results Where they could be calculated, extracted compounds-per-journal article were in the range of 12 to 19 but compound-per-protein counts increased with document numbers. Chemical structure filtration to facilitate standardised comparisons typically reduced source counts by between 5% and 30%. The pair-wise overlaps between 23 databases and subsets were determined, as well as changes between 2006 and 2008. While all compound sets have increased, PubChem has doubled to 14.2 million. The 2008 comparison matrix shows not only overlap but also unique content across all sources. Many of the detailed differences could be attributed to individual strategies for data selection and extraction. While there was a big increase in patent-derived structures entering PubChem since 2006, GVKBIO contains over 0.8 million unique structures from this source. Venn diagrams showed extensive overlap between compounds extracted by independent expert curation from journals by GVKBIO, WOMBAT (both commercial) and BindingDB (public) but each included unique content. In contrast, the approved drug collections from GVKBIO, MDDR (commercial) and DrugBank (public) showed surprisingly low overlap. Aggregating all commercial sources established that while 1 million compounds overlapped with PubChem, 1.2 million did not. Conclusion On the basis of chemical structure content per se public sources have covered an increasing proportion of commercial databases over the last two years. However, commercial products included in this study provide links between compounds and information from patents and journals at a larger scale than current public efforts. They also continue to capture a significant proportion of unique content. Our results thus demonstrate not only an encouraging overall expansion of data-supported bioactive chemical space but also that both commercial and public sources are complementary for its exploration. PMID:20298516
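
    The pair-wise overlap and unique-content comparisons described above reduce to set operations once each source is represented as a set of standardized compound identifiers. The sketch below assumes the structures have already been filtered and canonicalized. The source names and identifiers are made up, and the functions are not the authors' tooling.

```python
from itertools import combinations


def pairwise_overlap(sources):
    """Return {(a, b): number of shared identifiers} for every pair of sources."""
    return {(a, b): len(sources[a] & sources[b])
            for a, b in combinations(sorted(sources), 2)}


def unique_content(sources):
    """Return {name: identifiers found in that source and in no other}."""
    result = {}
    for name, ids in sources.items():
        others = set().union(*(s for n, s in sources.items() if n != name))
        result[name] = ids - others
    return result


# Hypothetical sources holding already-standardized compound identifiers.
sources = {
    "public_a": {"c1", "c2", "c3"},
    "commercial_b": {"c2", "c3", "c4"},
    "commercial_c": {"c3", "c5"},
}
overlaps = pairwise_overlap(sources)
unique = unique_content(sources)
```

    Sorting the source names before pairing gives each pair a single, predictable key, which makes the resulting comparison matrix easy to tabulate.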

  9. MSiMass list: a public database of identifications for protein MALDI MS imaging.

    PubMed

    McDonnell, Liam A; Walch, Axel; Stoeckli, Markus; Corthals, Garry L

    2014-02-01

    The clinical application of mass spectrometry imaging has developed into a sizable subdiscipline of proteomics and metabolomics because its seamless integration with pathology enables biomarkers and biomarker profiles to be determined that can aid patient and disease stratification (diagnosis, prognosis, and response to therapy). Confident identification of the discriminating peaks remains a challenge owing to the presence of nontryptic protein fragments, large mass-to-charge ratio ions that are not efficiently fragmented via tandem mass spectrometry or a high density of isobaric species. A public database of identifications has been initiated to aid the clinical development and implementation of mass spectrometry imaging. The MSiMass list database ( www.maldi-msi.org/mass ) enables users to assign identities to the peaks observed in their experiments and provides the methods by which the identifications were obtained. In contrast with existing protein databases, this list is designed as a community effort without a formal review panel. In this concept, authors can freely enter data and can comment on existing entries. As such, the database itself is an experiment in sharing knowledge, and its ability to rapidly provide quality data will be evaluated in the future. PMID:24313301

  10. The Government Finance Database: A Common Resource for Quantitative Research in Public Financial Analysis

    PubMed Central

    Pierson, Kawika; Hand, Michael L.; Thompson, Fred

    2015-01-01

    Quantitative public financial management research focused on local governments is limited by the absence of a common database for empirical analysis. While the U.S. Census Bureau distributes government finance data that some scholars have utilized, the arduous process of collecting, interpreting, and organizing the data has made its adoption prohibitively difficult and inconsistent. In this article we offer a single, coherent resource that contains all of the government financial data from 1967-2012, uses easy-to-understand natural-language variable names, and will be extended when new data are available. PMID:26107821

  11. Bayesian approach to transforming public gene expression repositories into disease diagnosis databases.

    PubMed

    Huang, Haiyan; Liu, Chun-Chi; Zhou, Xianghong Jasmine

    2010-04-13

    The rapid accumulation of gene expression data has offered unprecedented opportunities to study human diseases. The National Center for Biotechnology Information Gene Expression Omnibus is currently the largest database that systematically documents the genome-wide molecular basis of diseases. However, thus far, this resource has been far from fully utilized. This paper describes the first study to transform public gene expression repositories into an automated disease diagnosis database. Particularly, we have developed a systematic framework, including a two-stage Bayesian learning approach, to achieve the diagnosis of one or multiple diseases for a query expression profile along a hierarchical disease taxonomy. Our approach, including standardizing cross-platform gene expression data and heterogeneous disease annotations, allows analyzing both sources of information in a unified probabilistic system. A high level of overall diagnostic accuracy was shown by cross validation. It was also demonstrated that the power of our method can increase significantly with the continued growth of public gene expression repositories. Finally, we showed how our disease diagnosis system can be used to characterize complex phenotypes and to construct a disease-drug connectivity map. PMID:20360561

  12. Covariation of the Incidence of Type 1 Diabetes with Country Characteristics Available in Public Databases

    PubMed Central

    Diaz-Valencia, Paula Andrea; Bougnères, Pierre; Valleron, Alain-Jacques

    2015-01-01

    Background: The incidence of Type 1 Diabetes (T1D) in children varies dramatically between countries. Part of the explanation must be sought in environmental factors. Increasingly, public databases provide information on country-to-country environmental differences. Methods: Information on the incidence of T1D and country characteristics was searched for in the 194 World Health Organization (WHO) member countries. T1D incidence was extracted from a systematic literature review of all papers published between 1975 and 2014, including the 2013 update from the International Diabetes Federation. The information on country characteristics was searched in public databases. We considered all indicators with a plausible relation with T1D and those previously reported as correlated with T1D, and for which there were fewer than 5% missing values. This yielded 77 indicators. Four domains were explored: Climate and environment, Demography, Economy, and Health Conditions. Bonferroni correction to correct false discovery rate (FDR) was used in bivariate analyses. Stepwise multiple regressions served to identify independent predictors of the geographical variation of T1D. Findings: T1D incidence was estimated for 80 WHO countries. Forty-one significant correlations between T1D and the selected indicators were found. Stepwise Multiple Linear Regressions performed in the four explored domains indicated that the percentages of variance explained by the indicators were respectively 35% for Climate and environment, 33% for Demography, 45% for Economy, and 46% for Health conditions, and 51% in the Final model, where all variables selected by domain were considered. Significant environmental predictors of the country-to-country variation of T1D incidence included UV radiation, number of mobile cellular subscriptions in the country, health expenditure per capita, hepatitis B immunization and mean body mass index (BMI). Conclusions: The increasing availability of public databases providing information in all global environmental domains should allow new analyses to identify further geographical, behavioral, social and economic factors, or indicators that point to latent causal factors of T1D. PMID:25706995
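
    The Bonferroni step mentioned in this abstract can be illustrated with a minimal sketch. The count of 77 indicators comes from the abstract; the significance level of 0.05, the function names, and the example p-values are assumptions made here for illustration, not details reported by the authors. Note that Bonferroni correction, strictly speaking, controls the family-wise error rate, which is a more conservative criterion than the false discovery rate.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


def bonferroni_significant(p_values, alpha=0.05):
    """Flag each p-value that remains significant after correction.

    The number of tests is taken to be len(p_values).
    """
    threshold = bonferroni_threshold(alpha, len(p_values))
    return [p <= threshold for p in p_values]


# With 77 indicators (from the abstract) and an assumed alpha of 0.05,
# each bivariate test would need p <= 0.05 / 77 to count as significant.
per_test = bonferroni_threshold(0.05, 77)

# Hypothetical p-values, for illustration only.
flags = bonferroni_significant([0.0001, 0.001, 0.04])
```

    The design is deliberately simple: dividing the significance level by the number of tests is the whole correction, so the only inputs are the level and the test count.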

  13. Protecting DNA Sequence Anonymity with Generalization Lattices

    E-print Network

    . The method is tested and evaluated with several publicly available human population datasets. Keywords that may be inferred from a DNA sequence. In this paper we introduce a computational method for anonymizing a collection of person-specific DNA database sequences. The method is termed DNA lattice

  14. DISTRIBUTED STRUCTURE-SEARCHABLE TOXICITY (DSSTOX) DATABASE NETWORK: MAKING PUBLIC TOXICITY DATA RESOURCES MORE ACCESSIBLE AND USABLE FOR DATA EXPLORATION AND SAR DEVELOPMENT

    EPA Science Inventory

    Distributed Structure-Searchable Toxicity (DSSTox) Database Network: Making Public Toxicity Data Resources More Accessible and Usable for Data Exploration and SAR Development Many sources of public toxicity data are not currently linked to chemical structure, are not ...

  15. IEEE JOURNAL ON SELECTED TOPICS IN SIGNAL PROCESSING, VOL. 6, NO. 6, OCTOBER 2012 1 Analysis of Public Image and Video Databases

    E-print Network

    Winkler, Stefan

    of Public Image and Video Databases for Quality Assessment Stefan Winkler Abstract--Databases of images and discusses areas for database improvement and future work. Section V concludes the paper. S. Winkler) in Singapore. E-mail: stefan.winkler@adsc.com.sg. Manuscript received submitted November 09, 2011; revised May

  16. The TTSMI database: a catalog of triplex target DNA sites associated with genes and regulatory elements in the human genome

    PubMed Central

    Jenjaroenpun, Piroon; Chew, Chee Siang; Yong, Tai Pang; Choowongkomon, Kiattawee; Thammasorn, Wimada; Kuznetsov, Vladimir A.

    2015-01-01

    A triplex target DNA site (TTS), a stretch of DNA that is composed of polypurines, is able to form a triple-helix (triplex) structure with triplex-forming oligonucleotides (TFOs) and is able to influence the site-specific modulation of gene expression and/or the modification of genomic DNA. The co-localization of a genomic TTS with gene regulatory signals and functional genome structures suggests that TFOs could potentially be exploited in antigene strategies for the therapy of cancers and other genetic diseases. Here, we present the TTS Mapping and Integration (TTSMI; http://ttsmi.bii.a-star.edu.sg) database, which provides a catalog of unique TTS locations in the human genome and tools for analyzing the co-localization of TTSs with genomic regulatory sequences and signals that were identified using next-generation sequencing techniques and/or predicted by computational models. TTSMI was designed as a user-friendly tool that facilitates (i) fast searching/filtering of TTSs using several search terms and criteria associated with sequence stability and specificity, (ii) interactive filtering of TTSs that co-localize with gene regulatory signals and non-B DNA structures, (iii) exploration of dynamic combinations of the biological signals of specific TTSs and (iv) visualization of a TTS simultaneously with diverse annotation tracks via the UCSC genome browser. PMID:25324314

  17. A Human Mitochondrial Genome Database

    NSDL National Science Digital Library

    Brown, M.D.

    1996-01-01

    The Center for Molecular Medicine at Emory University maintains this human mitochondrial genome database, which offers information on Mitochondrial DNA Function Locations and Polypeptide Assignments as well as the relevant publication references. The database is initially searchable by gene, disease, and enzyme. Users can then refine their search by function, polymorphisms, or references (author, title, journal, year, or keyword). Users can also search the references directly via an Advanced Search. An additional resource at the site is a reference guide to mitomap tables featuring searchable (by keyword) information on specific mitochondrial DNA function locations and references. An opportunity to add publications to this database is available, if users find that pertinent papers have not been cited.

  18. DNA DNA DNA (d)DNA DNA DNA

    E-print Network

    Hagiya, Masami

    DNA DNA DNA DNA DNA DNA DNA DNA [ 2008] (d)DNA DNA DNA DNA 2 3 DNA DNA DNA DNA DNA DNA DNA (a) (c) (b) (d) DNA DNA DNA DNA DNA DNA DNA DNA (b) DNA [Tanaka et al.2008] DNA DNA DNA DNA DNA DNA DNA iGEM MIT MIT

  19. The Genome Sequence DataBase (GSDB): meeting the challenge of genomic sequencing

    Microsoft Academic Search

    Gifford Keen; Jillian Burton; David Crowley; Emily Dickinson; Ada Espinosa-lujan; Ed Franks; Carol Harger; Mo Manning; Shelley March; Mia Mcleod; John O'neill; Alicia Power; Maria Pumilia; Rhonda Reinert; David Rider; John Rohrlich; Jolene Schwertfeger; Linda Smyth; Nina Thayer; Charles Troup; Chris A. Fields

    1996-01-01

    The genome sequence database (GSDB) is a complete, publicly available relational database of DNA sequences and annotation maintained by the National Center for Genome Resources (NCGR) under a Cooperative Agreement with the US Department of Energy (DOE). GSDB provides direct, client-server access to the database for data contributions, community annotation and SQL queries. The GSDB Annotator, a

  20. Construction of a public CHO cell line transcript database using versatile bioinformatics analysis pipelines.

    PubMed

    Rupp, Oliver; Becker, Jennifer; Brinkrolf, Karina; Timmermann, Christina; Borth, Nicole; Pühler, Alfred; Noll, Thomas; Goesmann, Alexander

    2014-01-01

    Chinese hamster ovary (CHO) cell lines represent the most commonly used mammalian expression system for the production of therapeutic proteins. In this context, detailed knowledge of the CHO cell transcriptome might help to improve biotechnological processes conducted by specific cell lines. Nevertheless, very few assembled cDNA sequences of CHO cells were publicly released until recently, which puts a severe limitation on biotechnological research. Two extended annotation systems and web-based tools, one for browsing eukaryotic genomes (GenDBE) and one for viewing eukaryotic transcriptomes (SAMS), were established as the first step towards a publicly usable CHO cell genome/transcriptome analysis platform. This is complemented by the development of a new strategy to assemble the ca. 100 million reads, sequenced from a broad range of diverse transcripts, to a high quality CHO cell transcript set. The cDNA libraries were constructed from different CHO cell lines grown under various culture conditions and sequenced using Roche/454 and Illumina sequencing technologies in addition to sequencing reads from a previous study. Two pipelines to extend and improve the CHO cell line transcripts were established. First, de novo assemblies were carried out with the Trinity and Oases assemblers, using varying k-mer sizes. The resulting contigs were screened for potential CDS using ESTScan. Redundant contigs were filtered out using cd-hit-est. The remaining CDS contigs were re-assembled with CAP3. Second, a reference-based assembly with the TopHat/Cufflinks pipeline was performed, using the recently published draft genome sequence of CHO-K1 as reference. Additionally, the de novo contigs were mapped to the reference genome using GMAP and merged with the Cufflinks assembly using the cuffmerge software. With this approach 28,874 transcripts located on 16,492 gene loci could be assembled. 
Combining the results of both approaches, 65,561 transcripts were identified for CHO cell lines, which could be clustered by sequence identity into 17,598 gene clusters. PMID:24427317

  1. Harmonizing Databases? Using a Quasi-Experimental Design to Evaluate a Public Mental Health Re-entry Program

    PubMed Central

    Deng, Xiaogang; Fisher, William; Fulwiler, Carl; Sambamoorthi, Usha; Johnson, Craig; Pinals, Debra A.; Sampson, Lisa; Siegfriedt, Julianne

    2012-01-01

    Our study is the first-ever initiative to merge administrative databases in Massachusetts to evaluate an important public mental health program. It examines post-incarceration outcomes of adults with serious mental illness (SMI) enrolled in the Massachusetts Department of Mental Health (DMH) Forensic Transition Team (FTT) program. The program began in 1998 with the goal of transitioning offenders with SMI released from state and local correctional facilities utilizing a core set of transition activities. In this study we evaluate the program’s effectiveness using merged administrative data from various state agencies for the years 2007 – 2011, comparing FTT clients to released prisoners who, despite having serious mental health disorders, did not meet the criterion for DMH services. By systematically describing our original study design and the barriers we encountered, this report will inform future efforts to evaluate public programs using merged administrative databases and electronic health records. PMID:22436598

  2. The ICRP database of dose coefficients: Workers and members of the public, version 1.0, an extension of ICRP publications 68 and 72

    SciTech Connect

    Vargo, G.J.

    2000-03-01

    A CDROM database that gives dose coefficients for inhalation and ingestion of over 800 radionuclides of 91 elements. Inhalation dose coefficients are provided for ten aerosol sizes from 0.001 µm to 10 µm AMAD. Effective dose and equivalent dose coefficients are given for ten integration periods from 1 d to 70 y. Extensive help files summarizing the biokinetic models in ICRP Reports 68 and 72 are provided. The greatest value added in this CDROM, however, is the powerful help utility. These files contain text from ICRP Publications 68 and 72, diagrams of the biokinetic models, and information concerning the database. The text of Publication 68 provides a very useful summary of the ICRP respiratory tract model. Hypertext links are provided for important definitions and tables, making the CDROM much more user-friendly than the original publication. Another help file compiles all of the most recent biokinetic models for systemic activity. This is particularly useful since some have been revised recently (e.g., cesium appears in Publication 56) while others have remained unchanged since Publication 30. Consolidating these models in a single place should relieve the problem of crosschecking to assure that the most current model is being used in a dose assessment.

  3. PepBank - a database of peptides based on sequence text mining and public peptide data sources

    Microsoft Academic Search

    Timur Shtatland; Daniel Guettler; Misha Kossodo; Misha Pivovarov; Ralph Weissleder

    2007-01-01

    Background: Peptides are important molecules with diverse biological functions and biomedical uses. To date, there does not exist a single, searchable archive for peptide sequences or associated biological data. Rather, peptide sequences still have to be mined from abstracts and full-length articles, and/or obtained from the fragmented public sources. Description: We have constructed a new database (PepBank), which at the

  4. Errors in the interpretation of copy number variations due to the use of public databases as a reference.

    PubMed

    Bastida-Lertxundi, Nerea; López-López, Elixabet; Piñán, M Angeles; Puiggros, Anna; Navajas, Aurora; Solé, Francesc; García-Orad, Africa

    2014-04-01

    The identification of new cryptic deletions and duplications can be used to improve prognostic classification in cancer. To obtain accurate results, it is necessary to discriminate between somatic alterations in the tumor cell and germline polymorphisms. For this purpose, copy number variation (CNV) public databases have been used as a reference. Nevertheless, the use of these databases may lead to erroneous results. Our main goal was to explore the limitations of the use of CNV databases, such as the Database of Genomic Variants (DGV), as the reference. To that end, we used pediatric acute lymphoblastic leukemia (ALL) as a model. We analyzed the genome-wide copy number profile of 23 ALL patients and conducted a comparison of the results obtained using the DGV with those obtained using the normal sample from the patient as the reference. Using only the DGV, 19% of alterations and 41% of polymorphisms were erroneously catalogued. Our results support the hypothesis that with the use of databases such as the DGV as the reference, a high percentage of the variations can be erroneously classified. PMID:24767712
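
    The comparison described in this abstract, classifying each tumor CNV against a public database and again against the patient's own normal sample, can be sketched as follows. This is only an illustration under stated assumptions: the any-overlap rule, the function names, and the coordinates are hypothetical and are not taken from the study, which does not state its matching criteria in the abstract.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) intervals share at least one base."""
    return a[0] == b[0] and a[1] <= b[2] and b[1] <= a[2]


def classify(tumor_cnvs, reference_cnvs):
    """Label a tumor CNV 'polymorphism' if it overlaps any reference CNV,
    otherwise 'somatic'. The any-overlap rule is an assumption."""
    return {
        cnv: "polymorphism"
        if any(overlaps(cnv, ref) for ref in reference_cnvs)
        else "somatic"
        for cnv in tumor_cnvs
    }


def discordant_calls(tumor_cnvs, database_cnvs, matched_normal_cnvs):
    """Return the tumor CNVs whose label differs between the two references."""
    by_database = classify(tumor_cnvs, database_cnvs)
    by_normal = classify(tumor_cnvs, matched_normal_cnvs)
    return [cnv for cnv in tumor_cnvs if by_database[cnv] != by_normal[cnv]]


# Hypothetical coordinates, for illustration only.
tumor = [("chr1", 100, 200), ("chr2", 500, 900), ("chr3", 10, 20)]
database = [("chr1", 150, 300)]       # stands in for a public CNV catalog
matched_normal = [("chr2", 400, 600)]  # stands in for the patient's normal sample

mismatches = discordant_calls(tumor, database, matched_normal)
```

    In this toy input, the first CNV is labeled a polymorphism only by the database and the second only by the matched normal, so both are reported as discordant, while the third is somatic under either reference.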

  5. The Human Transcript Database: A Catalogue of Full Length cDNA Inserts

    SciTech Connect

    Bouckk John; Michael McLeod; Kim Worley; Richard Gibbs

    1999-09-10

    The BCM Search Launcher provided improved access to web-based sequence analysis services during the granting period and beyond. The Search Launcher web site grouped analysis procedures by function and provided default parameters that provided reasonable search results for most applications. For instance, most queries were automatically masked for repeat sequences prior to sequence database searches to avoid spurious matches. In addition to the web-based access and arrangements that made using the functions easier, the BCM Search Launcher provided unique value-added applications like the BEAUTY sequence database search tool that combined information about protein domains and sequence database search results to give an enhanced, more complete picture of the reliability and relative value of the information reported. This enhanced search tool made evaluating search results more straightforward and consistent. Some of the favorite features of the web site are the sequence utilities and the batch client functionality that allows processing of multiple samples from the command line interface. One measure of the success of the BCM Search Launcher is the number of sites that have adopted the models first developed on the site. The graphic display on the BLAST search from the NCBI web site is one such outgrowth, as is the display of protein domain search results within BLAST search results, and the design of the Biology Workbench application. The logs of usage and comments from users confirm the great utility of this resource.

  6. Selecting Communications and Public Relations Database Media: Online, CD-ROM, and Paper

    Microsoft Academic Search

    William J. Buchholz

    Communication managers in the nineties confront a bewildering array of possibilities in accessing, maintaining, and promulgating information. Planners in charge of developing and accessing communication databases must appraise their needs across all the information storage media currently available, but especially in paper and electronic modes. Many of today's communication databases are in fact evolving into complex interactive systems that incorporate

  7. The MOSART Database: Linking the SART CORS Clinical Database to the Population-Based Massachusetts PELL Reproductive Public Health Data System

    PubMed Central

    Hoang, Lan; Stern, Judy E.; Diop, Hafsatou; Belanoff, Candice; Declercq, Eugene

    2015-01-01

    Although Assisted Reproductive Technology (ART) births make up 1.6 % of births in the US, the impact of ART on subsequent infant and maternal health is not well understood. Clinical ART treatment records linked to population data would be a powerful tool to study long term outcomes among those treated or not by ART. This paper describes the development of a database intended to accomplish this task. We constructed the Massachusetts Outcomes Study of Assisted Reproductive Technology (MOSART) database by linking the Society of Assisted Reproductive Technologies Clinical Outcomes Reporting System (SART CORS) and the Massachusetts (MA) Pregnancy to Early Life Longitudinal (PELL) data systems for children born to MA resident women at MA hospitals between July 2004 and December 2008. PELL data representing 282,971 individual women and their 334,152 deliveries and 342,035 total births were linked with 48,578 cycles of ART treatment in SART CORS delivered to MA residents or women receiving treatment in MA clinics, representing 18,439 eligible women of whom 9,326 had 10,138 deliveries in this time period. A deterministic five phase linkage algorithm methodology was employed. Linkage results, accuracy, and concordance analyses were examined. We linked 9,092 (89.7 %) SART CORS outcome records to PELL delivery records overall, including 95.0 % among known MA residents treated in MA clinics; 70.8 % with full exact matches. There were minimal differences between matched and unmatched delivery records, except for unknown residency and out-of-state ART site. There was very low concordance of reported use of ART treatment between SART CORS and PELL (birth certificate) data. A total of 3.4 % of MA children (11,729) were identified from ART assisted pregnancies (6,556 singletons; 5,173 multiples). 
The MOSART linked database provides a strong basis for further longitudinal ART outcomes studies and supports the continued development of potentially powerful linked clinical-public health databases. PMID:24623195
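
    A deterministic, multi-phase linkage of the kind named in this abstract can be sketched as below. The abstract does not list the five phases or the matching fields, so the two phases, the field names (`mother_dob`, `delivery_date`, `zip`), and the records here are hypothetical. Each phase requires exact agreement on its fields, and later phases relax the key for records that are still unlinked.

```python
def link_records(source, target, phases):
    """Deterministically link source records to target records.

    phases: list of tuples of field names, strictest first.
    Returns {source_id: (target_id, phase_number)}.
    """
    links = {}
    used_targets = set()
    for phase_no, fields in enumerate(phases, start=1):
        # Index the still-unlinked target records by this phase's key.
        index = {}
        for rec in target:
            if rec["id"] in used_targets:
                continue
            key = tuple(rec.get(f) for f in fields)
            if None in key:
                continue  # a missing field cannot match exactly
            index.setdefault(key, []).append(rec["id"])
        for rec in source:
            if rec["id"] in links:
                continue
            key = tuple(rec.get(f) for f in fields)
            candidates = [t for t in index.get(key, []) if t not in used_targets]
            # Accept only an unambiguous single candidate.
            if len(candidates) == 1:
                links[rec["id"]] = (candidates[0], phase_no)
                used_targets.add(candidates[0])
    return links


# Hypothetical records and phases, for illustration only.
clinic = [
    {"id": "s1", "mother_dob": "1975-03-02", "delivery_date": "2006-05-01", "zip": "02139"},
    {"id": "s2", "mother_dob": "1980-01-01", "delivery_date": "2007-07-07", "zip": None},
]
births = [
    {"id": "t1", "mother_dob": "1975-03-02", "delivery_date": "2006-05-01", "zip": "02139"},
    {"id": "t2", "mother_dob": "1980-01-01", "delivery_date": "2007-07-07", "zip": "02115"},
]
phases = [("mother_dob", "delivery_date", "zip"), ("mother_dob", "delivery_date")]

linked = link_records(clinic, births, phases)
```

    Recording the phase number alongside each link makes it possible to report how many matches were full exact matches versus matches on a relaxed key, which is the kind of breakdown the abstract gives.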

  8. UnoViS: the MedIT public unobtrusive vital signs database.

    PubMed

    Wartzek, Tobias; Czaplik, Michael; Antink, Christoph Hoog; Eilebrecht, Benjamin; Walocha, Rafael; Leonhardt, Steffen

    2015-01-01

    While PhysioNet is a large database for standard clinical vital signs measurements, such a database does not exist for unobtrusively measured signals. This inhibits progress in the vital area of signal processing for unobtrusive medical monitoring as not everybody owns the specific measurement systems to acquire signals. Furthermore, if no common database exists, a comparison between different signal processing approaches is not possible. This gap will be closed by our UnoViS database. It contains different recordings in various scenarios ranging from a clinical study to measurements obtained while driving a car. Currently, 145 records with a total of 16.2 h of measurement data are available, which are provided as MATLAB files or in the PhysioNet WFDB file format. In its initial state, only (multichannel) capacitive ECG and unobtrusive PPG signals are, together with a reference ECG, included. All ECG signals contain annotations by a peak detector and by a medical expert. A dataset from a clinical study contains further clinical annotations. Additionally, supplementary functions are provided, which simplify the usage of the database and thus the development and evaluation of new algorithms. The development of urgently needed methods for very robust parameter extraction or robust signal fusion in view of frequent severe motion artifacts in unobtrusive monitoring is now possible with the database. PMID:26038690

  9. Amplification volume reduction on DNA database samples using FTA™ Classic Cards.

    PubMed

    Wong, Hang Yee; Lim, Eng Seng Simon; Tan-Siew, Wai Fun

    2012-03-01

    The DNA forensic community always strives towards improvements in aspects such as sensitivity, robustness, and efficacy balanced with cost efficiency. Therefore our laboratory decided to study the feasibility of PCR amplification volume reduction using DNA entrapped in FTA™ Classic Card and to bring cost savings to the laboratory. There were a few concerns the laboratory needed to address. First, the kinetics of the amplification reaction could be significantly altered. Second, an increase in sensitivity might affect interpretation due to increased stochastic effects even though they were pristine samples. Third, static electricity might cause an FTA punch to jump out of its allocated well into another, thus causing sample-to-sample contamination. Fourth, the size of the punches might be too small for visual inspection. Last, there would be a limit to the extent of volume reduction due to evaporation and the possible need of re-injection of samples for capillary electrophoresis. The laboratory had successfully optimized a reduced amplification volume of 10 µL for FTA samples. PMID:21543276

  10. A framework for organizing cancer-related variations from existing databases, publications and NGS data using a High-performance Integrated Virtual Environment (HIVE).

    PubMed

    Wu, Tsung-Jung; Shamsaddini, Amirhossein; Pan, Yang; Smith, Krista; Crichton, Daniel J; Simonyan, Vahan; Mazumder, Raja

    2014-01-01

    Years of sequence feature curation by UniProtKB/Swiss-Prot, PIR-PSD, NCBI-CDD, RefSeq and other database biocurators has led to a rich repository of information on functional sites of genes and proteins. This information along with variation-related annotation can be used to scan human short sequence reads from next-generation sequencing (NGS) pipelines for presence of non-synonymous single-nucleotide variations (nsSNVs) that affect functional sites. This and similar workflows are becoming more important because thousands of NGS data sets are being made available through projects such as The Cancer Genome Atlas (TCGA), and researchers want to evaluate their biomarkers in genomic data. BioMuta, an integrated sequence feature database, provides a framework for automated and manual curation and integration of cancer-related sequence features so that they can be used in NGS analysis pipelines. Sequence feature information in BioMuta is collected from the Catalogue of Somatic Mutations in Cancer (COSMIC), ClinVar, UniProtKB and through biocuration of information available from publications. Additionally, nsSNVs identified through automated analysis of NGS data from TCGA are also included in the database. Because of the petabytes of data and information present in NGS primary repositories, a platform HIVE (High-performance Integrated Virtual Environment) for storing, analyzing, computing and curating NGS data and associated metadata has been developed. Using HIVE, 31 979 nsSNVs were identified in TCGA-derived NGS data from breast cancer patients. All variations identified through this process are stored in a Curated Short Read archive, and the nsSNVs from the tumor samples are included in BioMuta. Currently, BioMuta has 26 cancer types with 13 896 small-scale and 308 986 large-scale study-derived variations. Integration of variation data allows identifications of novel or common nsSNVs that can be prioritized in validation studies. 
Database URL: BioMuta: http://hive.biochemistry.gwu.edu/tools/biomuta/index.php; CSR: http://hive.biochemistry.gwu.edu/dna.cgi?cmd=csr; HIVE: http://hive.biochemistry.gwu.edu. PMID:24667251

  11. A framework for organizing cancer-related variations from existing databases, publications and NGS data using a High-performance Integrated Virtual Environment (HIVE)

    PubMed Central

    Wu, Tsung-Jung; Shamsaddini, Amirhossein; Pan, Yang; Smith, Krista; Crichton, Daniel J.; Simonyan, Vahan; Mazumder, Raja

    2014-01-01

    Years of sequence feature curation by UniProtKB/Swiss-Prot, PIR-PSD, NCBI-CDD, RefSeq and other database biocurators has led to a rich repository of information on functional sites of genes and proteins. This information along with variation-related annotation can be used to scan human short sequence reads from next-generation sequencing (NGS) pipelines for presence of non-synonymous single-nucleotide variations (nsSNVs) that affect functional sites. This and similar workflows are becoming more important because thousands of NGS data sets are being made available through projects such as The Cancer Genome Atlas (TCGA), and researchers want to evaluate their biomarkers in genomic data. BioMuta, an integrated sequence feature database, provides a framework for automated and manual curation and integration of cancer-related sequence features so that they can be used in NGS analysis pipelines. Sequence feature information in BioMuta is collected from the Catalogue of Somatic Mutations in Cancer (COSMIC), ClinVar, UniProtKB and through biocuration of information available from publications. Additionally, nsSNVs identified through automated analysis of NGS data from TCGA are also included in the database. Because of the petabytes of data and information present in NGS primary repositories, a platform HIVE (High-performance Integrated Virtual Environment) for storing, analyzing, computing and curating NGS data and associated metadata has been developed. Using HIVE, 31 979 nsSNVs were identified in TCGA-derived NGS data from breast cancer patients. All variations identified through this process are stored in a Curated Short Read archive, and the nsSNVs from the tumor samples are included in BioMuta. Currently, BioMuta has 26 cancer types with 13 896 small-scale and 308 986 large-scale study-derived variations. Integration of variation data allows identifications of novel or common nsSNVs that can be prioritized in validation studies. 
Database URL: BioMuta: http://hive.biochemistry.gwu.edu/tools/biomuta/index.php; CSR: http://hive.biochemistry.gwu.edu/dna.cgi?cmd=csr; HIVE: http://hive.biochemistry.gwu.edu PMID:24667251

  12. CARL Corporation to Market Knight Ridder DIALOG Databases to the Academic and Public Library Market.

    ERIC Educational Resources Information Center

    Machovec, George S.

    1996-01-01

    With the advent of CD-ROMs, libraries began to limit online searching via DIALOG. To increase DIALOG's market share, Colorado Alliance of Research Libraries (CARL) Corporation is developing graphical user interfaces using World Wide Web and Windows technology and has reached agreements with Knight Ridder Information and with most of their database

  13. Identification and quantification of glycerolipids in cotton fibers: reconciliation with metabolic pathway predictions from DNA databases.

    PubMed

    Wanjie, Sylvia W; Welti, Ruth; Moreau, Robert A; Chapman, Kent D

    2005-08-01

    The lipid profiles of cotton fiber cells were determined from total lipid extracts of elongating and maturing cotton fiber cells to see whether the membrane lipid composition changed during the phases of rapid cell elongation or secondary cell wall thickening. Total FA content was highest or increased during elongation and was lower or decreased thereafter, likely reflecting the assembly of the expanding cell membranes during elongation and the shift to membrane maintenance (and increase in secondary cell wall content) in maturing fibers. Analysis of lipid extracts by electrospray ionization and tandem MS (ESI-MS/MS) revealed that in elongating fiber cells (7-10 d post-anthesis), the polar lipids (PC, PE, PI, PA, phosphatidylglycerol, monogalactosyldiacylglycerol, digalactosyldiacylglycerol, and phosphatidylglycerol) were most abundant. These same glycerolipids were found in similar proportions in maturing fiber cells (21 dpa). Detailed molecular species profiles were determined by ESI-MS/MS for all glycerolipid classes, and ESI-MS/MS results were consistent with lipid profiles determined by HPLC and ELSD. The predominant molecular species of PC, PE, PI, and PA was 34:3 (16:0, 18:3), but 36:6 (18:3,18:3) also was prevalent. Total FA analysis of cotton lipids confirmed that indeed linolenic (18:3) and palmitic (16:0) acids were the most abundant FA in these cell types. Bioinformatics data were mined from cotton fiber expressed sequence tag databases in an attempt to reconcile expression of lipid metabolic enzymes with lipid metabolite data. Together, these data form a foundation for future studies of the functional contribution of lipid metabolism to the development of this unusual and economically important cell type. PMID:16296396

  14. FULL-malaria: a database for a full-length enriched cDNA library from human malaria parasite, Plasmodium falciparum

    PubMed Central

    Watanabe, Junichi; Sasaki, Masahide; Suzuki, Yutaka; Sugano, Sumio

    2001-01-01

    FULL-malaria is a database for a full-length-enriched cDNA library from the human malaria parasite Plasmodium falciparum (http://133.11.149.55/). Because of its medical importance, this organism is the first target for genome sequencing of a eukaryotic pathogen; the sequences of two of its 14 chromosomes have already been determined. However, for the full exploitation of this rapidly accumulating information, correct identification of the genes and study of their expression are essential. Using the oligo-capping method, we have produced a full-length-enriched cDNA library from erythrocytic stage parasites and performed one-pass reading. The database consists of nucleotide sequences of 2490 random clones that include 390 (16%) known malaria genes according to BLASTN analysis of the nr-nt database in GenBank; these represent 98 genes, and the clones for 48 of these genes contain the complete protein-coding sequence (49%). On the other hand, comparisons with the complete chromosome 2 sequence revealed that 35 of 210 predicted genes are expressed, and in addition led to detection of three new gene candidates that were not previously known. In total, 19 of these 38 clones (50%) were full-length. From these observations, it is expected that the database contains ~1000 genes, including 500 full-length clones. It should be an invaluable resource for the development of vaccines and novel drugs. PMID:11125052

  15. Assessing the impact of paediatric oncology publications using three citation databases.

    PubMed

    Arora, Ramandeep S; Eden, Tim O B

    2011-01-01

    Despite some reported limitations, Web of Science has been the standard source to assess the impact of individual articles, and consequently journals. By analysing the citations to articles published in the field of paediatric oncology, we demonstrate that Scopus and Google Scholar, the two new citation databases, retrieve more citations than Web of Science. The strength of Scopus lies in identifying non-English literature from Western and Eastern Europe, while Google Scholar is proficient at identifying English and non-English literature from Africa, Asia and Central and South America. These findings have implications for researchers, journals and health libraries. PMID:20922764

  16. Production of Arrayed and Rearrayed cDNA Libraries for Public Use

    SciTech Connect

    Rasmussen, K

    2005-08-29

    Researchers studying genes and their protein products need an easily available source for that gene. The I.M.A.G.E. Consortium at Lawrence Livermore National Laboratory is an important source of such genes in the form of arrayed cDNA libraries. The arrayed clones and associated data are available to the public, free of restriction. Libraries are transformed and titered into 384-well master plates, from which 2-8 copies are made. One copy plate is stored by LLNL while others are sent to sequencing groups, plate distributors, and to the group which contributed the library. Clones found to be unique and/or full-length are rearrayed and also made publicly available. Bioinformatics tools supporting the use of I.M.A.G.E. clones are accessible via the World Wide Web.

  17. ENVIRONMENTAL RESIDUE EFFECTS DATABASE (ERED)

    EPA Science Inventory

    US Army Corps of Engineers public web site for the "Environmental Residue Effects Database", a searchable database of adverse biological effects associated with tissue concentrations of various contaminants....

  18. A bioinformatics tool for linking gene expression profiling results with public databases of microRNA target predictions.

    PubMed

    Creighton, Chad J; Nagaraja, Ankur K; Hanash, Samir M; Matzuk, Martin M; Gunaratne, Preethi H

    2008-11-01

    MicroRNAs are short (approximately 22 nucleotides) noncoding RNAs that regulate the stability and translation of mRNA targets. A number of computational algorithms have been developed to help predict which microRNAs are likely to regulate which genes. Gene expression profiling of biological systems where microRNAs might be active can yield hundreds of differentially expressed genes. The commonly used public microRNA target prediction databases facilitate gene-by-gene searches. However, integration of microRNA-mRNA target predictions with gene expression data on a large scale using these databases is currently cumbersome and time-consuming for many researchers. We have developed a desktop software application which, for a given target prediction database, retrieves all microRNA:mRNA functional pairs represented by an experimentally derived set of genes. Furthermore, for each microRNA, the software computes an enrichment statistic for overrepresentation of predicted targets within the gene set, which could help to implicate roles for specific microRNAs and microRNA-regulated genes in the system under study. Currently, the software supports searching of results from PicTar, TargetScan, and miRanda algorithms. In addition, the software can accept any user-defined set of gene-to-class associations for searching, which can include the results of other target prediction algorithms, as well as gene annotation or gene-to-pathway associations. A search (using our software) of genes transcriptionally regulated in vitro by estrogen in breast cancer uncovered numerous targeting associations for specific microRNAs (above what could be observed in randomly generated gene lists), suggesting a role for microRNAs in mediating the estrogen response. The software and Excel VBA source code are freely available at http://sigterms.sourceforge.net. PMID:18812437
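
    The abstract above does not say which enrichment statistic the software computes. As a minimal sketch, assuming a one-sided hypergeometric test (a common choice for overrepresentation, not confirmed by the source), the p-value for one microRNA could be computed as follows. The function name and parameter names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, targets_in_population, sample, targets_in_sample):
    """One-sided hypergeometric tail probability P(X >= targets_in_sample).

    Assumption (not from the abstract): overrepresentation is scored with a
    hypergeometric test.
      population            -- all genes considered (e.g. genes on the array)
      targets_in_population -- genes predicted as targets of one microRNA
      sample                -- size of the experimentally derived gene set
      targets_in_sample     -- predicted targets found inside that gene set
    """
    upper = min(sample, targets_in_population)
    total = comb(population, sample)
    tail = sum(
        comb(targets_in_population, k)
        * comb(population - targets_in_population, sample - k)
        for k in range(targets_in_sample, upper + 1)
    )
    return tail / total
```

    A small p-value would suggest that the predicted targets of that microRNA occur in the gene set more often than expected by chance; repeating the test for many microRNAs would call for a multiple-testing correction.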

  19. ArrayExpress: a public database of gene expression data at EBI

    Microsoft Academic Search

    Philippe Rocca-Serra; Alvis Brazma; Helen Parkinson; Ugis Sarkans; Mohammadreza Shojatalab; Sergio Contrino; Jaak Vilo; Niran Abeygunawardena; Gaurab Mukherjee; Ele Holloway; Misha Kapushesky; Patrick Kemmeren; Gonzalo Garcia Lara; Ahmet Oezcimen; Susanna-Assunta Sansone

    2003-01-01

    ArrayExpress is a public repository for microarray-based gene expression data, resulting from the implementation of the MAGE object model to ensure accurate data structuring and the MIAME standard, which defines the annotation requirements. ArrayExpress accepts data as MAGE–ML files for direct submissions or data from MIAMExpress, the MIAME compliant web-based annotation and submission tool of EBI. A team of curators

  20. Complementary Value of Databases for Discovery of Scholarly Literature A User Survey of On-Line Searching for Publications in Art History

    Microsoft Academic Search

    Erik Nemeth

    Discovery of academic literature through Web search engines challenges the traditional role of specialized research databases. Creation of literature outside of academic presses and peer- reviewed publications expands the content for scholarly research within a particular field. The resulting body of literature raises the question of whether scholars prefer the perceived broader access of Web search engines or opt for

  1. BASIC DEFINITIONS TO ASSIST WITH UNDERSTANDING THE NIH PUBLIC ACCESS POLICY A database of citations and abstracts for biomedical literature from MEDLINE and additional life science journals. Links are provided

    E-print Network

    Grishok, Alla

    BASIC DEFINITIONS TO ASSIST WITH UNDERSTANDING THE NIH PUBLIC ACCESS POLICY PubMed A database number used when citing papers falling under the NIH Public Access Policy on applications, proposals in a journal. This number does NOT indicate compliance with the NIH Public Access Policy. NIH Public Access

  2. The DNA Binding Properties of Saccharomyces cerevisiae Rad51 (Received for publication, June 12, 1998, and in revised form, November 16, 1998)

    E-print Network

    Kowalczykowski, Stephen C.

    The DNA Binding Properties of Saccharomyces cerevisiae Rad51 Protein* (Received for publication Rad51 protein is the paradigm for eukaryotic ATP-dependent DNA strand exchange proteins. To explain some of the unique characteristics of DNA strand exchange promoted by Rad51 protein, when

  3. A DNA Pairing-enhanced Conformation of Bacterial RecA Proteins* Received for publication, August 4, 2003, and in revised form, October 3, 2003

    E-print Network

    Cox, Michael M.

    A DNA Pairing-enhanced Conformation of Bacterial RecA Proteins* Received for publication, August 4A proteins of Escherichia coli (Ec) and Deinococcus radiodurans (Dr) both promote a DNA strand exchange- stranded DNA-binding protein (SSB). In the absence of SSB, the initiation of strand exchange is greatly en

  4. E-SovTox: An online database of the main publicly-available sources of toxicity data concerning REACH-relevant chemicals published in the Russian language.

    PubMed

    Sihtmäe, Mariliis; Blinova, Irina; Aruoja, Villem; Dubourguier, Henri-Charles; Legrand, Nicolas; Kahru, Anne

    2010-08-01

    A new open-access online database, E-SovTox, is presented. E-SovTox provides toxicological data for substances relevant to the EU Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH) system, from publicly-available Russian language data sources. The database contains information selected mainly from scientific journals published during the Soviet Union era. The main information source for this database - the journal, Gigiena Truda i Professional'nye Zabolevania [Industrial Hygiene and Occupational Diseases], published between 1957 and 1992 - features acute, but also chronic, toxicity data for numerous industrial chemicals, e.g. for rats, mice, guinea-pigs and rabbits. The main goal of the abovementioned toxicity studies was to derive the maximum allowable concentration limits for industrial chemicals in the occupational health settings of the former Soviet Union. Thus, articles featured in the database include mostly data on LD50 values, skin and eye irritation, skin sensitisation and cumulative properties. Currently, the E-SovTox database contains toxicity data selected from more than 500 papers covering more than 600 chemicals. The user is provided with the main toxicity information, as well as abstracts of these papers in Russian and in English (given as provided in the original publication). The search engine allows cross-searching of the database by the name or CAS number of the compound, and the author of the paper. The E-SovTox database can be used as a decision-support tool by researchers and regulators for the hazard assessment of chemical substances. PMID:20822322

  5. Drinking Water Treatability Database (Database)

    EPA Science Inventory

    The drinking Water Treatability Database (TDB) will provide data taken from the literature on the control of contaminants in drinking water, and will be housed on an interactive, publicly-available USEPA web site. It can be used for identifying effective treatment processes, rec...

  6. De-identifying a public use microdata file from the Canadian national discharge abstract database

    PubMed Central

    2011-01-01

    Abstract Background The Canadian Institute for Health Information (CIHI) collects hospital discharge abstract data (DAD) from Canadian provinces and territories. There are many demands for the disclosure of this data for research and analysis to inform policy making. To expedite the disclosure of data for some of these purposes, the construction of a DAD public use microdata file (PUMF) was considered. Such purposes include: confirming some published results, providing broader feedback to CIHI to improve data quality, training students and fellows, providing an easily accessible data set for researchers to prepare for analyses on the full DAD data set, and serve as a large health data set for computer scientists and statisticians to evaluate analysis and data mining techniques. The objective of this study was to measure the probability of re-identification for records in a PUMF, and to de-identify a national DAD PUMF consisting of 10% of records. Methods Plausible attacks on a PUMF were evaluated. Based on these attacks, the 2008-2009 national DAD was de-identified. A new algorithm was developed to minimize the amount of suppression while maximizing the precision of the data. The acceptable threshold for the probability of correct re-identification of a record was set at between 0.04 and 0.05. Information loss was measured in terms of the extent of suppression and entropy. Results Two different PUMF files were produced, one with geographic information, and one with no geographic information but more clinical information. At a threshold of 0.05, the maximum proportion of records with the diagnosis code suppressed was 20%, but these suppressions represented only 8-9% of all values in the DAD. Our suppression algorithm has less information loss than a more traditional approach to suppression. Smaller regions, patients with longer stays, and age groups that are infrequently admitted to hospitals tend to be the ones with the highest rates of suppression. 
Conclusions The strategies we used to maximize data utility and minimize information loss can result in a PUMF that would be useful for the specific purposes noted earlier. However, to create a more detailed file with less information loss suitable for more complex health services research, the risk would need to be mitigated by requiring the data recipient to commit to a data sharing agreement. PMID:21861894
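
    The abstract above gives a re-identification threshold (0.04 to 0.05) but not the formula behind it. As a minimal sketch, assuming risk is estimated as 1/k, where k is the number of records sharing the same quasi-identifier values (an equivalence-class model that the source does not confirm), records exceeding the threshold could be flagged as follows. The function names and field names are hypothetical.

```python
from collections import Counter


def reidentification_risk(records, quasi_identifiers):
    """Return per-record risk 1/k, with k the size of the record's equivalence class.

    Assumption (not from the abstract): an attacker who knows the
    quasi-identifier values picks uniformly among matching records.
    """
    keys = [tuple(record[q] for q in quasi_identifiers) for record in records]
    class_sizes = Counter(keys)
    return [1 / class_sizes[key] for key in keys]


def records_over_threshold(records, quasi_identifiers, threshold=0.05):
    """Indices of records whose estimated risk exceeds the threshold."""
    risks = reidentification_risk(records, quasi_identifiers)
    return [i for i, risk in enumerate(risks) if risk > threshold]
```

    Under this model a threshold of 0.05 corresponds to requiring at least 20 records per equivalence class; flagged records would be candidates for suppression or generalization.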

  7. DNA.

    ERIC Educational Resources Information Center

    Felsenfeld, Gary

    1985-01-01

    Structural form, bonding scheme, and chromatin structure of and gene-modification experiments with deoxyribonucleic acid (DNA) are described. Indicates that DNA's double helix is variable and also flexible as it interacts with regulatory and other molecules to transfer hereditary messages. (DH)

  8. A New Model for Providing Cell-Free DNA and Risk Assessment for Chromosome Abnormalities in a Public Hospital Setting

    PubMed Central

    Wallerstein, Robert; Jelks, Andrea; Garabedian, Matthew J.

    2014-01-01

    Objective. Cell-free DNA (cfDNA) offers highly accurate noninvasive screening for Down syndrome. Incorporating it into routine care is complicated. We present our experience implementing a novel program for cfDNA screening, emphasizing patient education, genetic counseling, and resource management. Study Design. Beginning in January 2013, we initiated a new patient care model in which high-risk patients for aneuploidy received genetic counseling at 12 weeks of gestation. Patients were presented with four pathways for aneuploidy risk assessment and diagnosis: (1) cfDNA; (2) integrated screening; (3) direct-to-invasive testing (chorionic villus sampling or amniocentesis); or (4) no first trimester diagnostic testing/screening. Patients underwent follow-up genetic counseling and detailed ultrasound at 18–20 weeks to review first trimester testing and finalize decision for amniocentesis. Results. Counseling and second trimester detailed ultrasound were provided to 163 women. Most selected cfDNA screening (69%) over integrated screening (0.6%), direct-to-invasive testing (14.1%), or no screening (16.6%). Amniocentesis rates decreased following implementation of cfDNA screening (19.0% versus 13.0%, P < 0.05). Conclusion. When counseled about screening options, women often chose cfDNA over integrated screening. This program is a model for patient-directed, efficient delivery of a newly available high-level technology in a public health setting. Genetic counseling is an integral part of patient education and determination of plan of care. PMID:25101177

  9. Non-B DB v2.0: a database of predicted non-B DNA-forming motifs and its associated tools

    PubMed Central

    Cer, Regina Z.; Donohue, Duncan E.; Mudunuri, Uma S.; Temiz, Nuri A.; Loss, Michael A.; Starner, Nathan J.; Halusa, Goran N.; Volfovsky, Natalia; Yi, Ming; Luke, Brian T.; Bacolla, Albino; Collins, Jack R.; Stephens, Robert M.

    2013-01-01

    The non-B DB, available at http://nonb.abcc.ncifcrf.gov, catalogs predicted non-B DNA-forming sequence motifs, including Z-DNA, G-quadruplex, A-phased repeats, inverted repeats, mirror repeats, direct repeats and their corresponding subsets: cruciforms, triplexes and slipped structures, in several genomes. Version 2.0 of the database revises and re-implements the motif discovery algorithms to better align with accepted definitions and thresholds for motifs, expands the non-B DNA-forming motifs coverage by including short tandem repeats and adds key visualization tools to compare motif locations relative to other genomic annotations. Non-B DB v2.0 extends the ability for comparative genomics by including re-annotation of the five organisms reported in non-B DB v1.0, human, chimpanzee, dog, macaque and mouse, and adds seven additional organisms: orangutan, rat, cow, pig, horse, platypus and Arabidopsis thaliana. Additionally, the non-B DB v2.0 provides an overall improved graphical user interface and faster query performance. PMID:23125372

  10. Development of a DNA microarray to detect antimicrobial resistance genes identified in the national center for biotechnology information database

    Technology Transfer Automated Retrieval System (TEKTRAN)

    High density genotyping techniques are needed for investigating antimicrobial resistance especially in the case of multi-drug resistant (MDR) isolates. To achieve this all antimicrobial resistance genes in the NCBI Genbank database were identified by key word searches of sequence annotations and the...

  11. GEOBASE: Israel Regional Database

    NSDL National Science Digital Library

    GEOBASE is a searchable database that contains data extracted from Israel's Central Bureau of Statistics, statistical publications, public service records, local databases, and summaries from individual level datasets. GEOBASE provides regularly updated annual and quarterly series on numerous topics, including economic activities, labor and wages, population, transportation, tourism, housing, and education.

  12. Maize microarray annotation database

    PubMed Central

    2011-01-01

    Background Microarray technology has matured over the past fifteen years into a cost-effective solution with established data analysis protocols for global gene expression profiling. The Agilent-016047 maize 44 K microarray was custom-designed from EST sequences, but only reporter sequences with EST accession numbers are publicly available. The following information is lacking: (a) reporter - gene model match, (b) number of reporters per gene model, (c) potential for cross hybridization, (d) sense/antisense orientation of reporters, (e) position of reporter on B73 genome sequence (for eQTL studies), and (f) functional annotations of genes represented by reporters. To address this, we developed a strategy to annotate the Agilent-016047 maize microarray, and built a publicly accessible annotation database. Description Genomic annotation of the 42,034 reporters on the Agilent-016047 maize microarray was based on BLASTN results of the 60-mer reporter sequences and their corresponding ESTs against the maize B73 RefGen v2 "Working Gene Set" (WGS) predicted transcripts and the genome sequence. The agreement between the EST, WGS transcript and gDNA BLASTN results were used to assign the reporters into six genomic annotation groups. These annotation groups were: (i) "annotation by sense gene model" (23,668 reporters), (ii) "annotation by antisense gene model" (4,330); (iii) "annotation by gDNA" without a WGS transcript hit (1,549); (iv) "annotation by EST", in which case the EST from which the reporter was designed, but not the reporter itself, has a WGS transcript hit (3,390); (v) "ambiguous annotation" (2,608); and (vi) "inconclusive annotation" (6,489). Functional annotations of reporters were obtained by BLASTX and Blast2GO analysis of corresponding WGS transcripts against GenBank. 
The annotations are available in the Maize Microarray Annotation Database http://MaizeArrayAnnot.bi.up.ac.za/, as well as through a GBrowse annotation file that can be uploaded to the MaizeGDB genome browser as a custom track. The database was used to re-annotate lists of differentially expressed genes reported in case studies of published work using the Agilent-016047 maize microarray. Up to 85% of reporters in each list could be annotated with confidence by a single gene model, however up to 10% of reporters had ambiguous annotations. Overall, more than 57% of reporters gave a measurable signal in tissues as diverse as anthers and leaves. Conclusions The Maize Microarray Annotation Database will assist users of the Agilent-016047 maize microarray in (i) refining gene lists for global expression analysis, and (ii) confirming the annotation of candidate genes before functional studies. PMID:21961731
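
    As a quick consistency check on the figures quoted above, the six annotation groups can be tallied against the stated total of 42,034 reporters. This is only an arithmetic sketch; the dictionary keys are shortened labels, and the counts are copied from the abstract.

```python
# Reporter counts per genomic annotation group, as quoted in the abstract.
annotation_groups = {
    "sense gene model": 23668,
    "antisense gene model": 4330,
    "gDNA without WGS transcript hit": 1549,
    "EST": 3390,
    "ambiguous": 2608,
    "inconclusive": 6489,
}

total_reporters = sum(annotation_groups.values())

# Share of each group as a percentage of all reporters.
group_shares = {
    name: 100 * count / total_reporters
    for name, count in annotation_groups.items()
}

for name, share in group_shares.items():
    print(f"{name}: {share:.1f}%")
```

    The counts sum to 42,034, matching the number of reporters stated for the array.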

  13. Biological Biochemical Image Database

    NSDL National Science Digital Library

    The National Institute on Aging -- one of the National Institutes of Health -- provides the Biological Biochemical Image Database, "a searchable database of images of putative biological pathways, macromolecular structures, gene families, and cellular relationships." The database is intended for researchers "working with large sets of genes or proteins using cDNA arrays, functional genomics, or proteomics." The database may be searched by gene name, pathway, cell or tissue type, disease name, biological level, etc. Database users are invited to submit additional diagrams, suggestions, and comments. The Web site also includes convenient lists of gene names and keywords, as well as links to biological/biochemical pathway Web resources.

  14. RecA Filament Dynamics during DNA Strand Exchange Reactions* (Received for publication, November 27, 1996)

    E-print Network

    Cox, Michael M.

    of the RecA redistribution idea. When ATP is hydrolyzed, DNA strand exchange is accompanied by a Rec is observed, and significant ATP is hydrolyzed, even though DNA strand exchange is entirely blocked bound to ssDNA (dATP is hydrolyzed at rates about 20% higher). ATP is hydrolyzed uniformly throughout

  15. RETRIEVAL ACCURACY OF VERY LARGE DNA-BASED DATABASES OF DIGITAL Sotirios A. Tsaftaris, and Aggelos K. Katsaggelos

    E-print Network

    Tsaftaris, Sotirios

    hybridizing is a function of concentrations, thermodynamic strength of their chemical bond, temperature and salt concentration [4]. Therefore, it is critical to quantify the percentage of fluorescent output and noise the fluorescent response of undesired ones. Simulation frameworks to model DNA hybridization in

  16. Significant variance in genetic diversity among populations of Schistosoma haematobium detected using microsatellite DNA loci from a genome-wide database

    PubMed Central

    2013-01-01

    Background Urogenital schistosomiasis caused by Schistosoma haematobium is widely distributed across Africa and is increasingly being targeted for control. Genome sequences and population genetic parameters can give insight into the potential for population- or species-level drug resistance. Microsatellite DNA loci are genetic markers in wide use by Schistosoma researchers, but there are few primers available for S. haematobium. Methods We sequenced 1,058,114 random DNA fragments from clonal cercariae collected from a snail infected with a single Schistosoma haematobium miracidium. We assembled and aligned the S. haematobium sequences to the genomes of S. mansoni and S. japonicum, identifying microsatellite DNA loci across all three species and designing primers to amplify the loci in S. haematobium. To validate our primers, we screened 32 randomly selected primer pairs with population samples of S. haematobium. Results We designed >13,790 primer pairs to amplify unique microsatellite loci in S. haematobium (available at http://www.cebio.org/projetos/schistosoma-haematobium-genome). The three Schistosoma genomes contained similar overall frequencies of microsatellites, but the frequency and length distributions of specific motifs differed among species. We identified 15 primer pairs that amplified consistently and were easily scored. We genotyped these 15 loci in S. haematobium individuals from six locations: Zanzibar had the highest levels of diversity; Malawi, Mauritius, Nigeria, and Senegal were nearly as diverse; but the sample from South Africa was much less diverse. Conclusions About half of the primers in the database of Schistosoma haematobium microsatellite DNA loci should yield amplifiable and easily scored polymorphic markers, thus providing thousands of potential markers. Sequence conservation among S. haematobium, S. japonicum, and S. mansoni is relatively high, thus it should now be possible to identify markers that are universal among Schistosoma species (i.e., using DNA sequences conserved among species), as well as other markers that are specific to species or species-groups (i.e., using DNA sequences that differ among species). Full genome-sequencing of additional species and specimens of S. haematobium, S. japonicum, and S. mansoni is desirable to better characterize differences within and among these species, to develop additional genetic markers, and to examine genes as well as conserved non-coding elements associated with drug resistance. PMID:24499537

  17. Morchella MLST database

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Welcome to the Morchella MLST database. This dedicated database was set up at the CBS-KNAW Biodiversity Center by Vincent Robert in February 2012, using BioloMICS software (Robert et al., 2011), to facilitate DNA sequence-based identifications of Morchella species via the Internet. The current datab...

  18. Development of simple sequence repeat DNA markers and their integration into a barley linkage map

    Microsoft Academic Search

    Z.-W. Liu; R. M. Biyashev; M. A. Saghai Maroof

    1996-01-01

    Simple sequence repeats (SSRs), or microsatellites, are a new class of PCR-based DNA markers for genetic mapping. The objectives of the present study were to develop SSR markers for barley and to integrate them into an existing barley linkage map. DNA sequences containing SSRs were isolated from a barley genomic library and from public databases. It is estimated that the

  19. Improving the accuracy of NMR structures of DNA by means of a database potential of mean force describing base-base positional interactions.

    PubMed

    Kuszewski, J; Schwieters, C; Clore, G M

    2001-05-01

    NMR structure determination of nucleic acids presents an intrinsically difficult problem since the density of short interproton distance contacts is relatively low and limited to adjacent base pairs. Although residual dipolar couplings provide orientational information that is clearly helpful, they do not provide translational information of either a short-range (with the exception of proton-proton dipolar couplings) or long-range nature. As a consequence, the description of the nonbonded contacts has a major impact on the structures of nucleic acids generated from NMR data. In this paper, we describe the derivation of a potential of mean force derived from all high-resolution (2 Å or better) DNA crystal structures available in the Nucleic Acid Database (NDB) as of May 2000 that provides a statistical description, in simple geometric terms, of the relative positions of pairs of neighboring bases (both intra- and interstrand) in Cartesian space. The purpose of this pseudopotential, which we term a DELPHIC base-base positioning potential, is to bias sampling during simulated annealing refinement to physically reasonable regions of conformational space within the range of possibilities that are consistent with the experimental NMR restraints. We illustrate the application of the DELPHIC base-base positioning potential to the structure refinement of a DNA dodecamer, d(CGCGAATTCGCG)(2), for which NOE and dipolar coupling data have been measured in solution and for which crystal structures have been determined. 
We demonstrate by cross-validation against independent NMR observables (that is, both residual dipolar couplings and NOE-derived interproton distance restraints) that the DELPHIC base-base positioning potential results in a significant increase in accuracy and obviates artifactual distortions in the structures arising from the limitations of conventional descriptions of the nonbonded contacts in terms of either Lennard-Jones van der Waals and electrostatic potentials or a simple van der Waals repulsion potential. We also demonstrate, using experimental NMR data for a complex of the male sex determining factor SRY with a duplex DNA 14mer, which includes a region of highly unusual and distorted DNA, that the DELPHIC base-base positioning potential does not in any way hinder unusual interactions and conformations from being satisfactorily sampled and reproduced. We expect that the methodology described in this paper for DNA can be equally applied to RNA, as well as side chain-side chain interactions in proteins and protein-protein complexes, and side chain-nucleic acid interactions in protein-nucleic acid complexes. Further, this approach should be useful not only for NMR structure determination but also for refinement of low-resolution (3-3.5 Å) X-ray data. PMID:11457140

  20. Database of Gene Co-Regulation (dGCR): A Web Tool for Analysing Patterns of Gene Co-regulation across Publicly Available Expression Data

    PubMed Central

    Williams, Gareth

    2015-01-01

    The database of Gene Co-Regulation (dGCR) is a web tool for the analysis of gene relationships based on correlated patterns of gene expression over publicly available transcriptional data. The motivation behind dGCR is that genes whose expression patterns correlate across many experiments tend to be co-regulated and hence share biological function. In addition to revealing functional connections between individual gene pairs, extended sets of co-regulated genes can also be assessed for enrichment of gene ontology classes and interaction pathways. This functionality provides an insight into the biological function of the query gene itself. The dGCR web tool extends the range of expression data curated by existing co-regulation databases and provides additional insights into gene function through the analysis of pathways, gene ontology classes and co-regulation modules. PMID:25628763
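
    The abstract above describes ranking genes by correlated expression across experiments without naming the correlation measure. As a minimal sketch, assuming Pearson correlation over a small in-memory expression table (both are assumptions, and the function names are hypothetical), co-regulated partners of a query gene could be ranked as follows.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles.

    Raises ZeroDivisionError if either profile has zero variance.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def rank_coregulated(query, expression):
    """Rank all other genes by correlation with the query gene, highest first.

    `expression` maps gene name -> list of expression values, one per experiment.
    """
    profile = expression[query]
    scores = {
        gene: pearson(profile, values)
        for gene, values in expression.items()
        if gene != query
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)
```

    The top of the ranked list would give candidate co-regulated genes, which could then be checked for enrichment of gene ontology classes or pathways as the abstract describes.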

  1. Contamination of sequence databases with adaptor sequences

    SciTech Connect

    Yoshikawa, Takeo; Sanders, A.R.; Detera-Wadleigh, S.D. [National Institute of Mental Health, Bethesda, MD (United States)]

    1997-02-01

    Because of the exponential increase in the amount of DNA sequences being added to the public databases on a daily basis, it has become imperative to identify sources of contamination rapidly. Previously, contaminations of sequence databases have been reported to alert the scientific community to the problem. These contaminations can be divided into two categories. The first category comprises host sequences that have been difficult for submitters to manage or control. Examples include anomalous sequences derived from Escherichia coli, which are inserted into the chromosomes (and plasmids) of the bacterial hosts. Insertion sequences are highly mobile and are capable of transposing themselves into plasmids during cloning manipulation. Another example of the first category is the infection with yeast genomic DNA or with bacterial DNA of some commercially available cDNA libraries from Clontech. The second category of database contamination is due to the inadvertent inclusion of nonhost sequences. This category includes incorporation of cloning-vector sequences and multicloning sites in the database submission. M13-derived artifacts have been common, since M13-based vectors have been widely used for subcloning DNA fragments. Recognizing this problem, the National Center for Biotechnology Information (NCBI) started to screen, in April 1994, all sequences directly submitted to GenBank, against a set of vector data retrieved from GenBank by use of key-word searches, such as "vector." In this report, we present evidence for another sequence artifact that is widespread but that, to our knowledge, has not yet been reported. 11 refs., 1 tab.

  2. The Zoonoses Database-A HyperCard™ Stack on the Macintosh™ Computer for Teaching Veterinary Epidemiology and Public Health

    PubMed Central

    Hungerford, Laura L.; Smith, Ronald P.; Smith, Ronald D.

    1990-01-01

    For the past 2 years second-year veterinary students have used a HyperCard-based Zoonoses Database to abstract, store and retrieve information on zoonotic diseases. Each student was required to review the veterinary literature for information about one zoonotic disease and enter the information into a HyperCard template which summarized epidemiologic aspects of the disease. A total of 99 diseases are included in the Zoonoses Database. To use this resource in problem-solving, students formed mock 2 to 6 person veterinary practices. Student “practice groups” were then assigned one of 24 case histories and asked to form a rule-out list, select the most probable disease, and provide advice to the client regarding the zoonotic potential of the disease in question. We will continue to expand the database and link life cycles to it.

  3. The public Human Genome Project: mapping the genome, sequencing, and reassembly, 3D animation Site: DNA Interactive (www.dnai.org)

    NSDL National Science Digital Library

    2008-10-06

    DNAi location: Genome>The Project>putting it together>Animations>Hierarchical shotgun (public) Mapping the genome AND Sequencing and assembly This animation shows how the human genome was sequenced using the 'hierarchical shotgun' method of the public Human Genome Project. All the base pairs in our DNA are represented as letters on pieces of paper.

  4. The ICRP Database of Dose Coefficients: Workers and Members of the Public, version 1.0 - an extension of ICRP Publications 68 and 72.

    SciTech Connect

    Vargo, George J. [BATTELLE (PACIFIC NW LAB)]

    2000-03-01

    A CD-ROM database that gives dose coefficients for inhalation and ingestion of over 800 radionuclides of 91 elements. Inhalation dose coefficients are provided for ten aerosol sizes from 0.001 µm to 10 µm AMAD. Effective dose and equivalent dose coefficients are given for ten integration periods from 1 d to 70 y.

  5. A Reference Methylome Database and Analysis Pipeline to Facilitate Integrative and Comparative Epigenomics

    PubMed Central

    Song, Qiang; Decato, Benjamin; Hong, Elizabeth E.; Zhou, Meng; Fang, Fang; Qu, Jianghan; Garvin, Tyler; Kessler, Michael; Zhou, Jun; Smith, Andrew D.

    2013-01-01

    DNA methylation is implicated in a surprising diversity of regulatory, evolutionary processes and diseases in eukaryotes. The introduction of whole-genome bisulfite sequencing has enabled the study of DNA methylation at a single-base resolution, revealing many new aspects of DNA methylation and highlighting the usefulness of methylome data in understanding a variety of genomic phenomena. As the number of publicly available whole-genome bisulfite sequencing studies reaches into the hundreds, reliable and convenient tools for comparing and analyzing methylomes become increasingly important. We present MethPipe, a pipeline for both low and high-level methylome analysis, and MethBase, an accompanying database of annotated methylomes from the public domain. Together these resources enable researchers to extract interesting features from methylomes and compare them with those identified in public methylomes in our database. PMID:24324667

  6. Psychiatric inpatient expenditures and public health insurance programmes: analysis of a national database covering the entire South Korean population

    Microsoft Academic Search

    Woojin Chung

    2010-01-01

    BACKGROUND: Medical spending on psychiatric hospitalization has been reported to impose a tremendous socio-economic burden on many developed countries with public health insurance programmes. However, there has been no in-depth study of the factors affecting psychiatric inpatient medical expenditures and differentiated these factors across different types of public health insurance programmes. In view of this, this study attempted to explore

  7. DNA and the revolutions of molecular evolution, computational biology, and bioinformatics.

    PubMed

    Golding, G Brian

    2003-12-01

    The discovery of the structure of DNA was a necessary prerequisite for determining the sequence of DNA molecules. Technological advances have now made it possible to sequence DNA rapidly and have resulted in public databases with over 30 billion nucleotides of known sequence. The analysis of these data has led to new fields of science and to amazing advances in our understanding of evolution. PMID:14663507

  8. Ionic Liquids Database- (ILThermo)

    National Institute of Standards and Technology Data Gateway

    SRD 147 Ionic Liquids Database- (ILThermo) (Web, free access)   IUPAC Ionic Liquids Database, ILThermo, is a free web research tool that allows users worldwide to access an up-to-date data collection from the publications on experimental investigations of thermodynamic and transport properties of ionic liquids as well as binary and ternary mixtures containing ionic liquids.

  9. AIDSinfo Drug Database

    MedlinePLUS


  10. Using data from the public project, Craig Venter. Site: DNA Interactive (www.dnai.org)

    NSDL National Science Digital Library

    2008-10-06

    Interviewee: Craig Venter. Craig Venter, leader of the private effort at Celera Genomics, speaks about his company's reliance on the public data for reassembly of the Celera sequence.

  11. The public Human Genome Project, Craig Venter. Site: DNA Interactive (www.dnai.org)

    NSDL National Science Digital Library

    2008-10-06

    Interviewee: Craig Venter. Threatening their funding: Craig Venter speaks about the public sector's reaction to his plans to sequence the genome at a private company, Celera Genomics.

  12. Submitted as a book chapter to: Advanced Topics in Database Research -Volume 5 Authors until publication. August 29, 2005

    E-print Network

    Arpinar, I. Budak

    Semantic Analytics in Intelligence: Applying Semantic Association Discovery to determine Relevance of Heterogeneous Documents ... proprietary, trusted, and open-source information, including intranets, the deep Web and the open Web ...

  13. Function2Gene: A gene selection tool to increase the power of genetic association studies by utilizing public databases and expert knowledge

    PubMed Central

    Armstrong, Don L; Jacob, Chaim O; Zidovetzki, Raphael

    2008-01-01

    Background Many common disorders have multiple genetic components which convey increased susceptibility. SNPs have been used to identify genetic components which are associated with a disease. Unfortunately, many studies using these methods suffer from low reproducibility due to lack of power. Results We present a set of programs which implement a novel method for searching for disease-associated genes using prior information to select and order genes from publicly available databases by their prior likelihood of association with the disease. These programs were used in a published study of childhood-onset SLE which yielded novel associations with modest sample size. Conclusion Using prior information to decrease the size of the problem space to an amount commensurate with available samples and resources while maintaining appropriate power enables researchers to increase their likelihood of discovering reproducible associations. PMID:18631403

  14. Publications

    Cancer.gov

    An NCI database that contains summaries of the latest cancer information. Summaries cover treatment, screening, prevention, genetics, supportive care, and complementary and alternative medicine. Also includes a searchable listing of cancer clinical trials.

  15. Psychiatric inpatient expenditures and public health insurance programmes: analysis of a national database covering the entire South Korean population

    PubMed Central

    2010-01-01

    Background Medical spending on psychiatric hospitalization has been reported to impose a tremendous socio-economic burden on many developed countries with public health insurance programmes. However, there has been no in-depth study of the factors affecting psychiatric inpatient medical expenditures and differentiated these factors across different types of public health insurance programmes. In view of this, this study attempted to explore factors affecting medical expenditures for psychiatric inpatients between two public health insurance programmes covering the entire South Korean population: National Health Insurance (NHI) and National Medical Care Aid (AID). Methods This retrospective, cross-sectional study used a nationwide, population-based reimbursement claims dataset consisting of 1,131,346 claims of all 160,465 citizens institutionalized due to psychiatric diagnosis between January 2005 and June 2006 in South Korea. To adjust for possible correlation of patient characteristics within the same medical institution and a non-linearity structure, a Box-Cox transformed, multilevel regression analysis was performed. Results Compared with inpatients 19 years old or younger, the medical expenditures of inpatients between 50 and 64 years old were 10% higher among NHI beneficiaries but 40% higher among AID beneficiaries. Males showed higher medical expenditures than did females. Expenditures on inpatients with schizophrenia as compared to expenditures on those with neurotic disorders were 120% higher among NHI beneficiaries but 83% higher among AID beneficiaries. Expenditures on inpatients of psychiatric hospitals were greater on average than expenditures on inpatients of general hospitals. Among AID beneficiaries, institutions owned by private groups treated inpatients with 32% higher costs than did government institutions. Among NHI beneficiaries, inpatient medical expenditures were positively associated with the proportion of patients diagnosed into dementia or schizophrenia categories. However, for AID beneficiaries, inpatient medical expenditures were positively associated with the proportion of all patients with a psychiatric diagnosis that were AID beneficiaries in a medical institution. Conclusions This study provides evidence that patient and institutional factors are associated with psychiatric inpatient medical expenditures, and that they may have different effects for beneficiaries of different public health insurance programmes. Policy efforts to reduce psychiatric inpatient medical expenditures should be made differently across the different types of public health insurance programmes. PMID:20819235

  16. GDB: the Human Genome Database

    Microsoft Academic Search

    Stanley Letovsky; Robert W. Cottingham; Christopher J. Porter; Peter W. D. Li

    1998-01-01

    The Genome Database (GDB, http://www.gdb.org) is a public repository of data on human genes, clones, STSs, polymorphisms and maps. GDB entries are highly cross-linked to each other, to literature citations and to entries in other databases, including the sequence databases, OMIM, and the Mouse Genome Database. Mapping data from large genome centers and smaller mapping efforts are added to

  17. Nanotechnology Database

    NSDL National Science Digital Library

    Sponsored by the National Science Foundation and housed at the Loyola College in Maryland's International Technology Research Institute, the Nanotechnology Database is a source of online information on major research centers, funding agencies, major reports, and books dealing with nanotechnology. The resources listed here are carefully selected and reviewed. The site is expected to grow with the continued support and updates from organizations and individuals in the field of nanotechnology. The list of resources is divided into the following categories: Academic, Industry, Government Laboratories, Government Agencies, Professional Societies, Non-Profit Organizations, Books, Periodicals, Reports, and Conferences. Each listing provides a brief summary (taken from that Website) and hyperlink to the resource (note: the book list links mostly take users to online booksellers). A submission form allows users to add a relevant organization or publication.

  18. HS3D, A Dataset of Homo Sapiens Splice Regions, and its Extraction Procedure from a Major Public Database

    NASA Astrophysics Data System (ADS)

    Pollastro, Pasquale; Rampone, Salvatore

    The aim of this work is to describe a cleaning procedure of GenBank data, producing material to train and to assess the prediction accuracy of computational approaches for gene characterization. A procedure (GenBank2HS3D) has been defined, producing a dataset (HS3D - Homo Sapiens Splice Sites Dataset) of Homo Sapiens Splice regions extracted from GenBank (Rel.123 at this time). It selects, from the complete GenBank Primate Division, entries of Human Nuclear DNA according to several assessed criteria; then it extracts exons and introns from these entries (currently 4523 + 3802). Donor and acceptor sites are then extracted as windows of 140 nucleotides around each splice site (3799 + 3799). After discarding windows not including canonical GT-AG junctions (65 + 74), including insufficient data (not enough material for a 140 nucleotide window) (686 + 589), including non-AGCT bases (29 + 30), and redundant (218 + 226), the remaining windows (2796 + 2880) are reported in the dataset. Finally, windows of false splice sites are selected by searching canonical GT-AG pairs in non-splicing positions (271,937 + 332,296). The false sites in a range +/- 60 from a true splice site are marked as proximal. HS3D, release 1.2 at this time, is available at the Web server of the University of Sannio: http://www.sci.unisannio.it/docenti/rampone/.
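
    The window extraction and filtering steps described in this abstract can be illustrated with a minimal sketch. It is not the GenBank2HS3D procedure itself: the function name `clean_windows`, the 0-based coordinates, and the choice to center the 140-nucleotide window on the exon/intron boundary are all assumptions made for illustration.

```python
from typing import List

WINDOW = 140          # window length stated in the abstract
HALF = WINDOW // 2    # assumption: the window is centered on the boundary


def clean_windows(seq: str, sites: List[int], dinucleotide: str,
                  offset: int, half: int = HALF) -> List[str]:
    """Return the filtered windows around the given boundary positions.

    sites        -- 0-based boundary positions (hypothetical convention):
                    intron start for donors, intron end (exclusive) for acceptors
    dinucleotide -- canonical signal expected at seq[pos+offset : pos+offset+2]
    offset       -- 0 for donors ("GT"), -2 for acceptors ("AG")
    """
    kept: List[str] = []
    seen = set()
    for pos in sites:
        # 1. discard sites without the canonical dinucleotide
        if seq[pos + offset:pos + offset + 2] != dinucleotide:
            continue
        # 2. discard sites without enough flanking sequence for a full window
        if pos - half < 0 or pos + half > len(seq):
            continue
        window = seq[pos - half:pos + half]
        # 3. discard windows containing anything other than A, C, G, T
        if set(window) - set("ACGT"):
            continue
        # 4. discard redundant (duplicate) windows
        if window in seen:
            continue
        seen.add(window)
        kept.append(window)
    return kept


# Toy sequence with one intron occupying positions 80..179
seq = "AC" * 40 + "GT" + "C" * 96 + "AG" + "CA" * 40
donors = clean_windows(seq, [80], "GT", 0)
acceptors = clean_windows(seq, [180], "AG", -2)
```

    The filters are applied in the same order in which the abstract lists them, so per-filter discard counts could be tallied in the same way if needed.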

  19. Hydrocarbon Impacts Database

    NSDL National Science Digital Library

    The Hydrocarbon Impacts (HI) database is a subset of the University of Calgary's Arctic Institute of North America's Arctic Science and Technology Information System database. More than 5,100 records describe "publications and research projects about the environmental impacts, socio-economic effects and regulation of hydrocarbon exploration, development and transportation in northern Canada." Users can search by record type, keyword, subject code, geographic code, author, and year, as well as an advanced search feature to locate the information. Well designed and easy to use, the database provides those interested in this narrow subject field a helpful resource.

  20. Publications

    NSDL National Science Digital Library

    The Nitrogen and Phosphorus Knowledge Web page is offered by Iowa State University Extension and the College of Agriculture. The publications page contains links to various newsletters, articles, publications, PowerPoint presentations, links to governmental publications, and more. For example, visitors will find articles written on phosphorus within the Integrated Crop Management Newsletter, PowerPoint presentations on Nitrogen Management and Carbon Sequestration, and links to other Iowa State University publications on various subjects such as nutrient management. Other links on the home page of the site contain soil temperature data, research highlights, and other similarly relevant information for those in similar fields.

  1. ProtNA-ASA: Protein-nucleic acid structural database with information on accessible surface area

    NASA Astrophysics Data System (ADS)

    Tkachenko, M. Y.; Boryskina, O. P.; Shestopalova, A. V.; Tolstorukov, M. Y.

    The article describes a new database (ProtNA-ASA), which combines the data on conformational parameters of nucleic acids and calculations of the accessible surface area (ASA) of nucleic acid atoms in protein-DNA/RNA complexes. As of October 2008, the database contains 214 DNA-protein and 28 RNA-protein non-homologous complexes. The database provides structural parameters that describe local geometry of base pairs and base-pair steps as well as backbone torsion angles. Additionally, total ASA of DNA/RNA atoms and the accessible area of atoms in the minor and major grooves are calculated. ProtNA-ASA database facilitates studying the relationship between the DNA/RNA conformation and availability of atoms for contact with proteins either in major or in minor groove for different nucleotides. Such an analysis is important for understanding the principles of molecular recognition including indirect sequence readout. The database is publicly available for use at http://www.protna.bio-page.org.

  2. International Society of Human and Animal Mycology (ISHAM)-ITS reference DNA barcoding database--the quality controlled standard tool for routine identification of human and animal pathogenic fungi.

    PubMed

    Irinyi, Laszlo; Serena, Carolina; Garcia-Hermoso, Dea; Arabatzis, Michael; Desnos-Ollivier, Marie; Vu, Duong; Cardinali, Gianluigi; Arthur, Ian; Normand, Anne-Cécile; Giraldo, Alejandra; da Cunha, Keith Cassia; Sandoval-Denis, Marcelo; Hendrickx, Marijke; Nishikaku, Angela Satie; de Azevedo Melo, Analy Salles; Merseguel, Karina Bellinghausen; Khan, Aziza; Parente Rocha, Juliana Alves; Sampaio, Paula; da Silva Briones, Marcelo Ribeiro; e Ferreira, Renata Carmona; de Medeiros Muniz, Mauro; Castañón-Olivares, Laura Rosio; Estrada-Barcenas, Daniel; Cassagne, Carole; Mary, Charles; Duan, Shu Yao; Kong, Fanrong; Sun, Annie Ying; Zeng, Xianyu; Zhao, Zuotao; Gantois, Nausicaa; Botterel, Françoise; Robbertse, Barbara; Schoch, Conrad; Gams, Walter; Ellis, David; Halliday, Catriona; Chen, Sharon; Sorrell, Tania C; Piarroux, Renaud; Colombo, Arnaldo L; Pais, Célia; de Hoog, Sybren; Zancopé-Oliveira, Rosely Maria; Taylor, Maria Lucia; Toriello, Conchita; de Almeida Soares, Célia Maria; Delhaes, Laurence; Stubbe, Dirk; Dromer, Françoise; Ranque, Stéphane; Guarro, Josep; Cano-Lira, Jose F; Robert, Vincent; Velegraki, Aristea; Meyer, Wieland

    2015-05-01

    Human and animal fungal pathogens are a growing threat worldwide leading to emerging infections and creating new risks for established ones. There is a growing need for a rapid and accurate identification of pathogens to enable early diagnosis and targeted antifungal therapy. Morphological and biochemical identification methods are time-consuming and require trained experts. Alternatively, molecular methods, such as DNA barcoding, a powerful and easy tool for rapid monophasic identification, offer a practical approach for species identification and are less demanding in terms of taxonomical expertise. However, their widespread use is still limited by a lack of quality-controlled reference databases and the evolving recognition and definition of new fungal species/complexes. An international consortium of medical mycology laboratories was formed aiming to establish a quality controlled ITS database under the umbrella of the ISHAM working group on "DNA barcoding of human and animal pathogenic fungi." A new database, containing 2800 ITS sequences representing 421 fungal species, providing the medical community with a freely accessible tool at http://www.isham.org/ and http://its.mycologylab.org/ to rapidly and reliably identify most agents of mycoses, was established. The generated sequences included in the new database were used to evaluate the variation and overall utility of the ITS region for the identification of pathogenic fungi at intra- and interspecies level. The average intraspecies variation ranged from 0 to 2.25%. This highlighted selected pathogenic fungal species, such as the dermatophytes and emerging yeast, for which additional molecular methods/genetic markers are required for their reliable identification from clinical and veterinary specimens. PMID:25802363
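
    The abstract reports average intraspecies variation as a percentage but does not say how it was computed. One common way to express such a figure is the mean uncorrected pairwise distance (p-distance) among aligned sequences of a single species. The sketch below assumes that definition, assumes the sequences are already aligned to equal length, and uses hypothetical function names.

```python
from itertools import combinations
from typing import Sequence


def p_distance(a: str, b: str) -> float:
    """Fraction of differing positions between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped
    (an assumption; other gap treatments are possible).
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable columns")
    return sum(x != y for x, y in pairs) / len(pairs)


def mean_intraspecies_variation(aligned: Sequence[str]) -> float:
    """Mean pairwise p-distance within one species, as a percentage."""
    distances = [p_distance(a, b) for a, b in combinations(aligned, 2)]
    if not distances:          # a single sequence has no pairwise variation
        return 0.0
    return 100.0 * sum(distances) / len(distances)


# Toy aligned sequences standing in for one species
variation = mean_intraspecies_variation(["ACGT", "ACGA"])
```

    Under this definition, a species whose sequences are all identical scores 0%, which matches the lower end of the range quoted in the abstract.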

  3. Marmal-aid – a database for Infinium HumanMethylation450

    PubMed Central

    2013-01-01

    Background DNA methylation is indispensable for normal human genome function. Currently there is an increasingly large number of DNA methylomic data being released in the public domain allowing for an opportunity to investigate the relationships between the DNA methylome, genome function, and human phenotypes. The Illumina450K is one of the most popular platforms for assessing DNA methylation with over 10,000 samples available in the public domain. However, accessing all this data requires downloading each individual experiment and due to inconsistent annotation, accessing the right data can be a challenge. Description Here we introduce ‘Marmal-aid’, the first standardised database for DNA methylation (freely available at http://marmal-aid.org). In Marmal-aid, the majority of publicly available Illumina HumanMethylation450 data is incorporated into a single repository allowing for re-processing of data including normalisation and imputation of missing values. The database is accessible in two ways: (1) Using an R package to allow for incorporation into existing analysis pipelines which can then be easily queried to gain insight into the functionality of certain CpG sites. This is aimed at a bioinformatician with experience in R. (2) Using a graphical interface allowing general biologists to query a pre-defined set of tissues (currently 15) providing a reference database of the methylation state in these tissues for the 450,000 CpG sites profiled by the Illumina HumanMethylation450. Conclusion Marmal-aid is the largest publicly available Illumina HumanMethylation450 methylation database combining Illumina HumanMethylation450 data from a number of sources into a single location with a single common annotation format. This allows for automated extraction using the R package and inclusion into existing analysis pipelines. Marmal-aid also provides an easy-to-use GUI to visualise methylation data in user defined genomic regions for various reference tissues. PMID:24330312

  4. Transboundary Freshwater Dispute Database

    NSDL National Science Digital Library

    This collection of databases is intended to aid in the assessment of the process of water conflict prevention and resolution. The searchable collections include data such as case studies, freshwater treaties from 1820 to 2001, events concerning historical water relations from 1948 to 1999, a register of international river basins, and information on interstate water compacts in the United States. There is also spatial data on transboundary freshwater indicator variables, international river basins, and a map and image gallery. Other materials include links to the organization's publications and research projects, and a set of links to other databases and publications on freshwater conflict issues.

  5. GlycomeDB – integration of open-access carbohydrate structure databases

    PubMed Central

    Ranzinger, René; Herget, Stephan; Wetter, Thomas; von der Lieth, Claus-Wilhelm

    2008-01-01

    Background Although carbohydrates are the third major class of biological macromolecules, after proteins and DNA, there is neither a comprehensive database for carbohydrate structures nor an established universal structure encoding scheme for computational purposes. Funding for further development of the Complex Carbohydrate Structure Database (CCSD or CarbBank) ceased in 1997, and since then several initiatives have developed independent databases with partially overlapping foci. For each database, different encoding schemes for residues and sequence topology were designed. Therefore, it is virtually impossible to obtain an overview of all deposited structures or to compare the contents of the various databases. Results We have implemented procedures which download the structures contained in the seven major databases, e.g. GLYCOSCIENCES.de, the Consortium for Functional Glycomics (CFG), the Kyoto Encyclopedia of Genes and Genomes (KEGG) and the Bacterial Carbohydrate Structure Database (BCSDB). We have created a new database called GlycomeDB, containing all structures, their taxonomic annotations and references (IDs) for the original databases. More than 100000 datasets were imported, resulting in more than 33000 unique sequences now encoded in GlycomeDB using the universal format GlycoCT. Inconsistencies were found in all public databases, which were discussed and corrected in multiple feedback rounds with the responsible curators. Conclusion GlycomeDB is a new, publicly available database for carbohydrate sequences with a unified, all-encompassing structure encoding format and NCBI taxonomic referencing. The database is updated weekly and can be downloaded free of charge. The JAVA application GlycoUpdateDB is also available for establishing and updating a local installation of GlycomeDB. With the advent of GlycomeDB, the distributed islands of knowledge in glycomics are now bridged to form a single resource. PMID:18803830

  6. Annual Review of Database Development: 1992.

    ERIC Educational Resources Information Center

    Basch, Reva

    1992-01-01

    Reviews recent trends in databases and online systems. Topics discussed include new access points for established databases; acquisitions, consolidations, and competition between vendors; European coverage; international services; online reference materials, including telephone directories; political and legal materials and public records;…

  7. USDA NATIONAL NUTRIENT DATABASE FOR STANDARD REFERENCE

    EPA Science Inventory

    The USDA Nutrient Database for Standard Reference (SR) is the major source of food composition data in the United States. It provides the foundation for most food composition databases in the public and private sectors....

  8. Biofuel Database

    National Institute of Standards and Technology Data Gateway

    Biofuel Database (Web, free access)   This database brings together structural, biological, and thermodynamic data for enzymes that are either in current use or are being considered for use in the production of biofuels.

  9. FACILITY DATABASE

    Cancer.gov

    LASP Administrative Use Only Data Entry Start Date _______________ July 2007 LASP FACILITY Database Form 1.000 FACILITY DATABASE Principal Investigator – Data Entry Requirements This form is used to identify the level of data that each investigator

  10. Database Administrator

    ERIC Educational Resources Information Center

    Moore, Pam

    2010-01-01

    The Internet and electronic commerce (e-commerce) generate lots of data. Data must be stored, organized, and managed. Database administrators, or DBAs, work with database software to find ways to do this. They identify user needs, set up computer databases, and test systems. They ensure that systems perform as they should and add people to the…

  11. The Histone Database: an integrated resource for histones and histone fold-containing proteins.

    PubMed

    Mariño-Ramírez, Leonardo; Levine, Kevin M; Morales, Mario; Zhang, Suiyuan; Moreland, R Travis; Baxevanis, Andreas D; Landsman, David

    2011-01-01

    Eukaryotic chromatin is composed of DNA and protein components (core histones) that act to compactly pack the DNA into nucleosomes, the fundamental building blocks of chromatin. These nucleosomes are connected to adjacent nucleosomes by linker histones. Nucleosomes are highly dynamic and, through various core histone post-translational modifications and incorporation of diverse histone variants, can serve as epigenetic marks to control processes such as gene expression and recombination. The Histone Sequence Database is a curated collection of sequences and structures of histones and non-histone proteins containing histone folds, assembled from major public databases. Here, we report a substantial increase in the number of sequences and taxonomic coverage for histone and histone fold-containing proteins available in the database. Additionally, the database now contains an expanded dataset that includes archaeal histone sequences. The database also provides comprehensive multiple sequence alignments for each of the four core histones (H2A, H2B, H3 and H4), the linker histones (H1/H5) and the archaeal histones. The database also includes current information on solved histone fold-containing structures. The Histone Sequence Database is an inclusive resource for the analysis of chromatin structure and function focused on histones and histone fold-containing proteins. PMID:22025671

  12. The EMBL Nucleotide Sequence Database

    Microsoft Academic Search

    Tamara Kulikova; Philippe Aldebert; Nicola Althorpe; Wendy Baker; Kirsty Bates; Paul Browne; Alexandra Van Den Broek; Guy Cochrane; Karyn Duggan; Ruth Eberhardt; Nadeem Faruque; Maria Garcia-pastor; Nicola Harte; Carola Kanz; Rasko Leinonen; Quan Lin; Vincent Lombard; Rodrigo Lopez; Renato Mancuso; Michelle Mchale; Francesco Nardone; Ville Silventoinen; Peter Stoehr; Guenter Stoesser; Mary Ann Tuli; Katerina Tzouvara; Robert Vaughan; Dan Wu; Weimin Zhu; Rolf Apweiler

    2004-01-01

    The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/), maintained at the European Bioinformatics Institute (EBI), incorporates, organizes and distributes nucleotide sequences from public sources. The database is a part of an international collaboration with DDBJ (Japan) and GenBank (USA). Data are exchanged between the collaborating databases on a daily basis to achieve optimal synchrony. The web-based tool, Webin, is

  13. The National Ag Safety Database

    NSDL National Science Digital Library

    The University of Florida originally made this directory, a subset of its National Agricultural Safety Database CD-ROM, available on the web. Since then the database has undergone several updates. The Directory "contains contact information on safety professionals and organizations throughout the US," and "health and safety publications from 32 states, 4 federal agencies and 5 national organizations" and can be browsed or searched. The creators of the database have gone to great lengths to improve this site over the years.

  14. National Ambient Radiation Database

    SciTech Connect

    Dziuban, J.; Sears, R.

    2003-02-25

    The U.S. Environmental Protection Agency (EPA) recently developed a searchable database and website for the Environmental Radiation Ambient Monitoring System (ERAMS) data. This site contains nationwide radiation monitoring data for air particulates, precipitation, drinking water, surface water and pasteurized milk. This site provides location-specific as well as national information on environmental radioactivity across several media. It provides high quality data for assessing public exposure and environmental impacts resulting from nuclear emergencies and provides baseline data during routine conditions. The database and website are accessible at www.epa.gov/enviro/. This site contains (1) a query for the general public that is easy to use and limits the amount of information provided but includes the ability to graph the data with risk benchmarks, (2) a query for more technical users that allows access to all of the data in the database, and (3) background information on ERAMS.

  15. The Stanford Microarray Database

    Microsoft Academic Search

    Gavin Sherlock; Tina Hernandez-boussard; Andrew Kasarskis; Gail Binkley; John C. Matese; Selina S. Dwight; Miroslava Kaloper; Shuai Weng; Heng Jin; Catherine A. Ball; Michael B. Eisen; Paul T. Spellman; Patrick O. Brown; David Botstein; J. Michael Cherry

    2001-01-01

    The Stanford Microarray Database (SMD) stores raw and normalized data from microarray experiments, and provides web interfaces for researchers to retrieve, analyze and visualize their data. The two immediate goals for SMD are to serve as a storage site for microarray data from ongoing research at Stanford University, and to facilitate the public dissemination of that data once published, or

  16. Improving the Accuracy of NMR Structures of DNA by Means of a Database Potential of Mean Force Describing Base-Base Positional

    E-print Network

    Clore, G. Marius

    a major impact on the structures of nucleic acids generated from NMR data. In this paper, we describe NMR data for a complex of the male sex determining factor SRY with a duplex DNA 14mer, which includes

  17. Addition of a breeding database in the Genome Database for Rosaceae

    PubMed Central

    Evans, Kate; Jung, Sook; Lee, Taein; Brutcher, Lisa; Cho, Ilhyung; Peace, Cameron; Main, Dorrie

    2013-01-01

    Breeding programs produce large datasets that require efficient management systems to keep track of performance, pedigree, geographical and image-based data. With the development of DNA-based screening technologies, more breeding programs perform genotyping in addition to phenotyping for performance evaluation. The integration of breeding data with other genomic and genetic data is instrumental for the refinement of marker-assisted breeding tools, enhances genetic understanding of important crop traits and maximizes access and utility by crop breeders and allied scientists. Development of new infrastructure in the Genome Database for Rosaceae (GDR) was designed and implemented to enable secure and efficient storage, management and analysis of large datasets from the Washington State University apple breeding program and subsequently expanded to fit datasets from other Rosaceae breeders. The infrastructure was built using the software Chado and Drupal, making use of the Natural Diversity module to accommodate large-scale phenotypic and genotypic data. Breeders can search accessions within the GDR to identify individuals with specific trait combinations. Results from Search by Parentage list individuals with parents in common and results from Individual Variety pages link to all data available on each chosen individual including pedigree, phenotypic and genotypic information. Genotypic data are searchable by markers and alleles; results are linked to other pages in the GDR to enable the user to access tools such as GBrowse and CMap. This breeding database provides users with the opportunity to search datasets in a fully targeted manner and retrieve and compare performance data from multiple selections, years and sites, and to output the data needed for variety release publications and patent applications. The breeding database facilitates efficient program management. Storing publicly available breeding data in a database together with genomic and genetic data will further accelerate the cross-utilization of diverse data types by researchers from various disciplines. Database URL: http://www.rosaceae.org/breeders_toolbox PMID:24247530

  18. Identification of RNA editing sites in the SNP database

    PubMed Central

    Eisenberg, Eli; Adamsky, Konstantin; Cohen, Lital; Amariglio, Ninette; Hirshberg, Abraham; Rechavi, Gideon; Levanon, Erez Y.

    2005-01-01

    The relationship between human inherited genomic variations and phenotypic differences has been the focus of much research effort in recent years. These studies benefit from millions of single-nucleotide polymorphism (SNP) records available in public databases, such as dbSNP. The importance of identifying false dbSNP records increases with the growing role played by SNPs in linkage analysis for disease traits. In particular, the emerging understanding of the abundance of DNA and RNA editing calls for a careful distinction between inherited SNPs and somatic DNA and RNA modifications. In order to demonstrate that some of the SNP database records are actually somatic modifications, we focus on one type of these modifications, namely A-to-I RNA editing, and present evidence for hundreds of dbSNP records that are actually editing sites. We provide a list of 102 RNA editing sites previously annotated in the dbSNP database as SNPs, and experimentally validate seven of these. Interestingly, we show how dbSNP can serve as a starting point to look for new editing sites. Our results, for this particular type of RNA editing, demonstrate the need for a careful analysis of SNP databases in light of the increasing recognition of the significance of somatic sequence modifications. PMID:16100382
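
    As a rough illustration of using SNP records as a starting point for finding editing sites, the sketch below flags records whose alleles match the A-to-G signature that A-to-I editing leaves in sequenced cDNA (inosine is read as guanosine). The `SnpRecord` type, its fields, and the record identifiers are hypothetical, and this simple filter is an assumption made for illustration, not the authors' pipeline. Any record it flags is only a candidate that would still need experimental validation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SnpRecord:
    """Hypothetical, simplified SNP record (not the dbSNP schema)."""
    rs_id: str
    ref: str                # reference allele on the genomic plus strand
    alt: str                # alternate allele on the genomic plus strand
    transcript_strand: str  # '+' or '-': strand of the overlapping transcript


def is_editing_candidate(rec: SnpRecord) -> bool:
    """Return True if the alleles are consistent with A-to-I editing."""
    if rec.transcript_strand == "+":
        return (rec.ref, rec.alt) == ("A", "G")
    # For a minus-strand transcript, A>G appears as T>C on the plus strand.
    return (rec.ref, rec.alt) == ("T", "C")


# Made-up example records, for illustration only.
records = [
    SnpRecord("rs_example_1", "A", "G", "+"),
    SnpRecord("rs_example_2", "C", "T", "+"),
    SnpRecord("rs_example_3", "T", "C", "-"),
]
candidates = [r.rs_id for r in records if is_editing_candidate(r)]
print(candidates)
```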

  19. IOPI Database of Plant Databases

    NSDL National Science Digital Library

    The International Organization for Plant Information (IOPI), a Commission of the International Union of Biological Sciences (IUBS), manages the Database of Plant Databases (DPD). The DPD is a global list of plant databases including Taxonomic databases ("with systematic information on families or genera, or for Flora projects"); Collection catalogs (usually of herbaria); and DELTA datasets (DELTA is "the Description Language for Taxonomy, a data format for character data, used for identification, key construction and the generation of descriptions."). The DPD may be searched using numerous specified fields, or it may be viewed in its entirety -- by Database Name, Host Name, or Host Country. Though bare bones in appearance, this extensive database contains a gold mine of information, with hundreds of hyperlinks to valuable plant databases.

  20. Toward Privacy in Public Databases

    Microsoft Academic Search

    Shuchi Chawla; Cynthia Dwork; Frank Mcsherry; Adam Smith; Hoeteck Wee

    2005-01-01

    We initiate a theoretical study of the census problem. Informally, in a census individual respondents give private information to a trusted party (the census bureau), who publishes a sanitized version of the data. There are two fundamentally conflicting requirements: privacy for the respondents and utility of the sanitized data. Unlike in the study of secure function evaluation, in which privacy is preserved

  1. Genomic BLAST: custom-defined virtual databases for complete and unfinished genomes.

    PubMed

    Cummings, Leda; Riley, Leigh; Black, Lori; Souvorov, Alexander; Resenchuk, Sergei; Dondoshansky, Ilya; Tatusova, Tatiana

    2002-11-01

    BLAST (Basic Local Alignment Search Tool) searches against DNA and protein sequence databases have become an indispensable tool for biomedical research. The proliferation of the genome sequencing projects is steadily increasing the fraction of genome-derived sequences in the public databases and their importance as a public resource. We report here the availability of Genomic BLAST, a novel graphical tool for simplifying BLAST searches against complete and unfinished genome sequences. This tool allows the user to compare the query sequence against a virtual database of DNA and/or protein sequences from a selected group of organisms with finished or unfinished genomes. The organisms for such a database can be selected using either a graphic taxonomy-based tree or an alphabetical list of organism-specific sequences. The first option is designed to help explore the evolutionary relationships among organisms within a certain taxonomy group when performing BLAST searches. The use of an alphabetical list allows the user to perform a more elaborate set of selections, assembling any given number of organism-specific databases from unfinished or complete genomes. This tool, available at the NCBI web site http://www.ncbi.nlm.nih.gov/cgi-bin/Entrez/genom_table_cgi, currently provides access to over 170 bacterial and archaeal genomes and over 40 eukaryotic genomes. PMID:12435493

  2. Evaluation of the effectiveness of posters to provide information to patients about a DNA database and their opportunity to opt out

    Microsoft Academic Search

    Jill M. Pulley; Margaret Brace; Gordon R. Bernard; Dan Masys

    2007-01-01

    Objective: Vanderbilt University Medical Center is implementing a DNA Databank to facilitate genomic research. This study describes the use of informational posters to communicate to patients about the Databank and their option to not participate. Methods: Informational posters were displayed in two phlebotomy areas prior to the implementation of the DNA Databank project. Patients leaving the phlebotomy areas were interviewed by non-medical

  3. Cancer Control Publications 1998-2011: FAQ

    Cancer.gov

    CC Publications is a searchable database developed by DCCPS that includes staff, contract investigators, and grantee publications. This database demonstrates the depth and breadth of research publications in cancer control and population sciences funded by NCI.

  4. Maize databases

    Technology Transfer Automated Retrieval System (TEKTRAN)

    This chapter is a succinct overview of maize data held in the species-specific database MaizeGDB (the Maize Genomics and Genetics Database), and selected multi-species data repositories, such as Gramene/Ensembl Plants, Phytozome, UniProt and the National Center for Biotechnology Information (NCBI), ...

  5. Image Databases.

    ERIC Educational Resources Information Center

    Pettersson, Rune

    Different kinds of pictorial databases are described with respect to aims, user groups, search possibilities, storage, and distribution. Some specific examples are given for databases used for the following purposes: (1) labor markets for artists; (2) document management; (3) telling a story; (4) preservation (archives and museums); (5) research;…

  6. NIOSHTIC DATABASE

    EPA Science Inventory

    NIOSHTIC Database is a bibliographic database of literature in the field of occupational safety and health. English language technical journals provide approximately 35 percent of the additions to NIOSHTIC annually. Retrospective information, some of which is from the 19th centu...

  7. NSFC Databases

    NSDL National Science Digital Library

    The National Environmental Services Center (NESC) is based at West Virginia University and "serves as a clearinghouse for information about drinking water, wastewater, environmental training, and solid waste management in communities serving fewer than 10,000 individuals." As part of the larger NSFC Web site, the Databases page offers three online databases that can be accessed free after an initial registration. The Regulations Database contains copies of regulations for onsite wastewater treatment systems in 48 states, the Bibliographic Database stores thousands of articles dealing with onsite and small community wastewater issues, and the Manufacturers and Consultants Database houses a list of industry contacts for wastewater products and consulting services. Much more is available within the larger NSFC site and readers are encouraged to take a look through its contents.

  8. Annual Review of Database Developments 1991.

    ERIC Educational Resources Information Center

    Basch, Reva

    1991-01-01

    Review of developments in databases highlights a new emphasis on accessibility. Topics discussed include the internationalization of databases; databases that deal with finance, drugs, and toxic waste; access to public records, both personal and corporate; media online; reducing large files of data to smaller, more manageable files; and…

  9. A Transaction Mechanism for Engineering Design Databases

    Microsoft Academic Search

    Won Kim; Raymond A. Lorie; Dan Mcnabb; Wil Plouffe

    1984-01-01

    One primary difference between transactions in an engineering design environment and those in conventional business applications is that an engineering transaction typically lasts a much longer time. Existing proposals for supporting the long-lived engineering transactions are all based on the public/private database architecture, in which a transaction checks out design objects from the public database, modifies them, and checks

  10. Quality control of EUVE databases

    NASA Technical Reports Server (NTRS)

    John, L. M.; Drake, J.

    1992-01-01

    The publicly accessible databases for the Extreme Ultraviolet Explorer include: the EUVE Archive mailserver; the CEA ftp site; the EUVE Guest Observer Mailserver; and the Astronomical Data System node. The EUVE Performance Assurance team is responsible for verifying that these public EUVE databases are working properly, and that the public availability of EUVE data contained therein does not infringe any data rights which may have been assigned. In this poster, we describe the Quality Assurance (QA) procedures we have developed from the approach of QA as a service organization, thus reflecting the overall EUVE philosophy of Quality Assurance integrated into normal operating procedures, rather than imposed as an external, post facto, control mechanism.

  11. Hospital Records Database

    NSDL National Science Digital Library

    This new joint project from the Wellcome Trust and the UK Public Record Office helps researchers locate records of hospitals all over the UK. The database currently contains over 2,800 entries and may be searched by hospital or town name. Information contained in the database includes administrative details of the hospitals, location and covering dates of administrative and clinical records, and the existence of lists, catalogs or other finding aids. A sample search for "royal" under hospital name returned 210 records, and one for "Manchester" under town name produced 124 returns. While the target audience of this database -- researchers in British medical history -- is rather specialized, this new resource will prove extremely useful for these scholars and their students.

  12. Alcohol Industry & Policy Database

    NSDL National Science Digital Library

    Marin Institute for the Prevention of Alcohol and Other Drug Problems.

    The Marin Institute for the Prevention of Alcohol and Other Drug Problems maintains the Alcohol Industry & Policy Database, which contains bibliographic citations and abstracts for more than 13,000 articles and news stories on the alcohol beverage industry, alcohol policy, and the prevention of alcohol-related problems. The citations in the database span from 1991 to the present and are updated monthly. Users may conduct cross-field queries of the database by keywords, subject headings, company name, and publication date. The search facility includes Word Wheels, which are interactive Java applets that help users to identify indexed terms quickly, thereby "eliminat[ing] trial-and-error searching [and] produc[ing] more accurate searches."

  13. ARTI Refrigerant Database

    SciTech Connect

    Calm, J.M.

    1992-04-30

    The Refrigerant Database consolidates and facilitates access to information to assist industry in developing equipment using alternative refrigerants. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The complete documents are not included, though some may be added at a later date. The database identifies sources of specific information on R-32, R-123, R-124, R-125, R-134a, R-141b, R-142b, R-143a, R-152a, R-290 (propane), R-717 (ammonia), ethers, and others as well as azeotropic and zeotropic blends of these fluids. It addresses polyalkylene glycol (PAG), ester, and other lubricants. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits.

  14. u-Genome: a database on genome design in unicellular genomes.

    PubMed

    Sakharkar, Kishore Ramaji; Chaturvedi, Iti; Chow, Vincent T K; Kwoh, Chee Keong; Kangueane, Pandjassarame; Sakharkar, Meena Kishore

    2005-01-01

    Unicellular eukaryotes were among the first ones to be selected for complete genome sequencing because of the small size of their genomes and their interactions with humans and a broad range of animals and plants. Currently, ten completely sequenced unicellular genome sequences have been publicly released and as the number of available unicellular genomes increases, comparative genomics analysis within this group of organisms becomes more and more instructive. However, such an analysis is difficult to carry out without a suitable platform gathering not only the original annotations but also relevant information available in public databases or obtained by applying common bioinformatics methods. With the aim of solving these difficulties, we have developed a web-accessible database named u-Genome, the unicellular genome design database. The database is unique in featuring three datasets namely (1) orthologous proteins (2) paralogous proteins and (3) statistical distributions on exons, introns, intergenic DNA and correlations between them. A tool, Uniview, designed to visualize the gene structures for individual genes in the genome is also integrated. This database is of importance in understanding unicellular genome design and architecture and evolution related studies. The database is available through a web interface at http://sege.ntu.edu.sg/wester/ugenome. PMID:16610139

  15. The Porcelain Crab Transcriptome and PCAD, the Porcelain Crab Microarray and Sequence Database

    PubMed Central

    Tagmount, Abderrahmane; Wang, Mei; Lindquist, Erika; Tanaka, Yoshihiro; Teranishi, Kristen S.; Sunagawa, Shinichi; Wong, Mike; Stillman, Jonathon H.

    2010-01-01

    Background: With the emergence of a completed genome sequence of the freshwater crustacean Daphnia pulex, construction of genomic-scale sequence databases for additional crustacean sequences is important for comparative genomics and annotation. Porcelain crabs, genus Petrolisthes, have been powerful crustacean models for environmental and evolutionary physiology with respect to thermal adaptation and understanding responses of marine organisms to climate change. Here, we present a large-scale EST sequencing and cDNA microarray database project for the porcelain crab Petrolisthes cinctipes. Methodology/Principal Findings: A set of ~30K unique sequences (UniSeqs) representing ~19K clusters were generated from ~98K high quality ESTs from a set of tissue specific non-normalized and mixed-tissue normalized cDNA libraries from the porcelain crab Petrolisthes cinctipes. Homology for each UniSeq was assessed using BLAST, InterProScan, GO and KEGG database searches. Approximately 66% of the UniSeqs had homology in at least one of the databases. All EST and UniSeq sequences along with annotation results and coordinated cDNA microarray datasets have been made publicly accessible at the Porcelain Crab Array Database (PCAD), a feature-enriched version of the Stanford and Longhorn Array Databases. Conclusions/Significance: The EST project presented here represents the third largest sequencing effort for any crustacean, and the largest effort for any crab species. Our assembly and clustering results suggest that our porcelain crab EST data set is equally diverse to the much larger EST set generated in the Daphnia pulex genome sequencing project, and thus will be an important resource to the Daphnia research community. Our homology results support the pancrustacea hypothesis and suggest that Malacostraca may be ancestral to Branchiopoda and Hexapoda. Our results also suggest that our cDNA microarrays cover as much of the transcriptome as can reasonably be captured in EST library sequencing approaches, and thus represent a rich resource for studies of environmental genomics. PMID:20174471

  16. The Porcelain Crab Transcriptome and PCAD, the Porcelain Crab Microarray and Sequence Database

    SciTech Connect

    Tagmount, Abderrahmane; Wang, Mei; Lindquist, Erika; Tanaka, Yoshihiro; Teranishi, Kristen S.; Sunagawa, Shinichi; Wong, Mike; Stillman, Jonathon H.

    2010-01-27

    Background: With the emergence of a completed genome sequence of the freshwater crustacean Daphnia pulex, construction of genomic-scale sequence databases for additional crustacean sequences is important for comparative genomics and annotation. Porcelain crabs, genus Petrolisthes, have been powerful crustacean models for environmental and evolutionary physiology with respect to thermal adaptation and understanding responses of marine organisms to climate change. Here, we present a large-scale EST sequencing and cDNA microarray database project for the porcelain crab Petrolisthes cinctipes. Methodology/Principal Findings: A set of ~30K unique sequences (UniSeqs) representing ~19K clusters were generated from ~98K high quality ESTs from a set of tissue specific non-normalized and mixed-tissue normalized cDNA libraries from the porcelain crab Petrolisthes cinctipes. Homology for each UniSeq was assessed using BLAST, InterProScan, GO and KEGG database searches. Approximately 66% of the UniSeqs had homology in at least one of the databases. All EST and UniSeq sequences along with annotation results and coordinated cDNA microarray datasets have been made publicly accessible at the Porcelain Crab Array Database (PCAD), a feature-enriched version of the Stanford and Longhorn Array Databases. Conclusions/Significance: The EST project presented here represents the third largest sequencing effort for any crustacean, and the largest effort for any crab species. Our assembly and clustering results suggest that our porcelain crab EST data set is equally diverse to the much larger EST set generated in the Daphnia pulex genome sequencing project, and thus will be an important resource to the Daphnia research community. Our homology results support the pancrustacea hypothesis and suggest that Malacostraca may be ancestral to Branchiopoda and Hexapoda. Our results also suggest that our cDNA microarrays cover as much of the transcriptome as can reasonably be captured in EST library sequencing approaches, and thus represent a rich resource for studies of environmental genomics.

  17. The Molecular Biology Database Collection: 2005 update

    PubMed Central

    Galperin, Michael Y.

    2005-01-01

    The Nucleic Acids Research Molecular Biology Database Collection is a public online resource that lists the databases described in this and previous issues of Nucleic Acids Research together with other databases of value to the biologist and available throughout the world. All databases included in this Collection are freely available to the public. The 2005 update includes 719 databases, 171 more than the 2004 one. The databases are organized in a hierarchical classification that simplifies the process of finding the right database for any given task. The growing number of databases related to immunology, plant and organelle research have been accommodated by separating them into three new categories. The database summaries provide brief descriptions of the databases, contact details, appropriate references and acknowledgements. The online summaries also serve as a venue for the maintainers of each database to introduce database updates and other improvements in the scope and tools. These updates are particularly important for those databases that have not been described in print in the recent past. The database list and summaries are available online at the Nucleic Acids Research web site, http://nar.oupjournals.org/. PMID:15608247

  18. Centre for Pacific Studies Literature Database Search

    NSDL National Science Digital Library

    The Centre for Pacific Studies (CPS) at the University of Nijmegen, the Netherlands, constructed this searchable bibliographic database of publications related to Oceania. The comprehensive database includes books from academic publishers as well as articles from 113 academic journals. The database is compiled from citations that have appeared in the last six years of the Oceania Newsletter, a CPS serial that covers the areas of Polynesia, Micronesia, Melanesia, and Australia. Users may query the database by keyword, author, title, and year.

  19. BIOMARKERS DATABASE

    EPA Science Inventory

    This database was developed by assembling and evaluating the literature relevant to human biomarkers. It catalogues and evaluates the usefulness of biomarkers of exposure, susceptibility and effect which may be relevant for a longitudinal cohort study. In addition to describing ...

  20. ECOTOX DATABASE

    EPA Science Inventory

    ECOTOX is a comprehensive ecotoxicology database and is therefore essential to the Agency, providing and supporting high quality models needed to estimate population effects of toxic chemicals across a wide range of species....

  1. Tsunami Database

    NSDL National Science Digital Library

    The Tsunami Database is a global digital database containing information on more than 2000 tsunamis maintained by the National Geophysical Data Center. This is an interactive site; the user is asked to enter search parameters such as date, latitude and longitude, cause of the tsunami - earthquake, landslide, volcano, or all combined - magnitude, and death. Information is then generated on tsunamis that match that data. The National Geophysical Data Center also maintains an historic slide set collection of tsunami damage.

  2. Multimedia Databases

    Microsoft Academic Search

    Arcot Desai Narasimhalu

    1996-01-01

    The rapidly growing interest in building multimedia tools and applications has created a need for the development of multimedia database management systems (MMDBMSs) as a tool for efficient organization, storage and retrieval of multimedia objects. We begin with a word about traditional database management systems (DBMSs). Then we present an overview of the MMDBMS research issues, challenges, methods, models, and

  3. National Tourism Database

    NSDL National Science Digital Library

    Developed by the Michigan State University Extension Tourism Area of Expertise and the National Tourism Education Design Team, this site contains information on numerous resources related to tourism education, including bulletins, research reports, videos, and training programs. Nearly 100 of the documents featured are full-text. Users can browse the database by topic or browse or search by keyword. A separate list of the full-text publications is also provided. A useful site for students and professionals in the tourism industry.

  4. A comprehensive DNA barcode database for Central European beetles with a focus on Germany: adding more than 3500 identified species to BOLD.

    PubMed

    Hendrich, Lars; Morinière, Jérôme; Haszprunar, Gerhard; Hebert, Paul D N; Hausmann, Axel; Köhler, Frank; Balke, Michael

    2015-07-01

    Beetles are the most diverse group of animals and are crucial for ecosystem functioning. In many countries, they are well established for environmental impact assessment, but even in the well-studied Central European fauna, species identification can be very difficult. A comprehensive and taxonomically well-curated DNA barcode library could remedy this deficit and could also link hundreds of years of traditional knowledge with next generation sequencing technology. However, such a beetle library is missing to date. This study provides the globally largest DNA barcode reference library for Coleoptera for 15 948 individuals belonging to 3514 well-identified species (53% of the German fauna) with representatives from 97 of 103 families (94%). This study is the first comprehensive regional test of the efficiency of DNA barcoding for beetles with a focus on Germany. Sequences ≥500 bp were recovered from 63% of the specimens analysed (15 948 of 25 294) with short sequences from another 997 specimens. Whereas most specimens (92.2%) could be unambiguously assigned to a single known species by sequence diversity at CO1, 1089 specimens (6.8%) were assigned to more than one Barcode Index Number (BIN), creating 395 BINs which need further study to ascertain if they represent cryptic species, mitochondrial introgression, or simply regional variation in widespread species. We found 409 specimens (2.6%) that shared a BIN assignment with another species, most involving a pair of closely allied species as 43 BINs were involved. Most of these taxa were separated by barcodes although sequence divergences were low. Only 155 specimens (0.97%) show identical or overlapping clusters. PMID:25469559
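
    Several percentages in this abstract can be re-derived from the specimen and family counts it quotes. The short check below is an added illustration rather than part of the record. The helper name `pct` is hypothetical, and the figures are copied from the abstract.

```python
# Counts quoted in the abstract above.
BARCODED = 15948   # specimens with barcode sequences, used as the denominator below
ANALYSED = 25294   # specimens analysed in total
FAMILIES_COVERED, FAMILIES_TOTAL = 97, 103


def pct(part: int, whole: int, digits: int = 0) -> float:
    """Percentage of `part` in `whole`, rounded to `digits` decimal places."""
    return round(100 * part / whole, digits)


recovery_rate = pct(BARCODED, ANALYSED)                 # abstract: 63%
family_coverage = pct(FAMILIES_COVERED, FAMILIES_TOTAL) # abstract: 94%
multi_bin = pct(1089, BARCODED, 1)                      # abstract: 6.8%
shared_bin = pct(409, BARCODED, 1)                      # abstract: 2.6%
overlapping = pct(155, BARCODED, 2)                     # abstract: 0.97%

print(recovery_rate, family_coverage, multi_bin, shared_bin, overlapping)
```

    Each value agrees with the abstract when the per-specimen percentages are taken over the 15 948 barcoded specimens rather than the 25 294 analysed.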

  5. Wrong Sequences in Databases: Whose Fault?

    Microsoft Academic Search

    Devi Lal; Rup Lal

    With the development of newer sequencing techniques the databases have been flooded with an enormous amount of sequencing data. This saw the emergence of yet another field in biology, "bioinformatics", which is principally based on sequence information from DNA, RNA or protein molecules. Sequence databases provide massive data-sets that serve as basic raw material for any bioinformatics research. Novel biological finding are

  6. Experiment Databases

    NASA Astrophysics Data System (ADS)

    Vanschoren, Joaquin; Blockeel, Hendrik

    Next to running machine learning algorithms based on inductive queries, much can be learned by immediately querying the combined results of many prior studies. Indeed, all around the globe, thousands of machine learning experiments are being executed on a daily basis, generating a constant stream of empirical information on machine learning techniques. While the information contained in these experiments might have many uses beyond their original intent, results are typically described very concisely in papers and discarded afterwards. If we properly store and organize these results in central databases, they can be immediately reused for further analysis, thus boosting future research. In this chapter, we propose the use of experiment databases: databases designed to collect all the necessary details of these experiments, and to intelligently organize them in online repositories to enable fast and thorough analysis of a myriad of collected results. They constitute an additional, queriable source of empirical meta-data based on principled descriptions of algorithm executions, without reimplementing the algorithms in an inductive database. As such, they engender a very dynamic, collaborative approach to experimentation, in which experiments can be freely shared, linked together, and immediately reused by researchers all over the world. They can be set up for personal use, to share results within a lab or to create open, community-wide repositories. Here, we provide a high-level overview of their design, and use an existing experiment database to answer various interesting research questions about machine learning algorithms and to verify a number of recent studies.

  7. Bioinformatics and database resources in hepatology.

    PubMed

    Teufel, Andreas

    2015-03-01

    Lately, advances in high-throughput technologies in biomedical research have led to a dramatic increase in the accessibility of molecular insights at multiple biological levels in hepatology. Much of this information is available in publications, but an increasing number of large-scale analyses are currently being stored in databases. Scopes of these databases are very divergent and may range from large, general databases collecting information on almost every known disease, to very specialized databases covering only a specific liver disease or aspect of hepatology. Over recent years, these bioinformatics data repositories have rapidly evolved into an essential aid for molecular hepatology. However, although publicly available through the internet, many of these databases are only known to a few experts. To facilitate access to these resources, the publicly available databases supporting research on liver diseases are summarized in this review. PMID:25450718

  8. American Mineralogist Crystal Structure Database

    NSDL National Science Digital Library

    R. T. Downs

    This database provides access to information on every crystal structure published in the American Mineralogist, the Canadian Mineralogist, European Journal of Mineralogy, and Physics and Chemistry of Minerals, as well as selected datasets from other journals. The data are searchable by mineral name, author, chemistry, cell parameters and symmetry, diffraction pattern, and a general search. There are also lists of minerals represented in the database and authors of publications cited.

  9. Avibase: The World Bird Database

    NSDL National Science Digital Library

    Denis Lepage

    This database provides information on all birds of the world, featuring information on thousands of species and subspecies of birds such as taxonomy, names and synonyms in various languages, photos, distribution maps, and links to additional information from other websites. The database is searchable by keyword or term, exact name, language, year of publication, and other parameters. There is also a search by taxonomic family, a set of checklists by geographic region, and a blog for ornithological discussions.

  10. INVADERS Database

    NSDL National Science Digital Library

    Based at the University of Montana and directed by Dr. Peter Rice, the INVADERS Database is "a comprehensive database of exotic plant names and weed distribution records for five states in the northwestern United States." Designed for use by land management and weed regulatory agencies, INVADERS uses a query interface (plant name or location) to sort and display information. Data are updated regularly so as to increase the chance of detecting and halting the rapid spread of alien weeds. Highlights of the site include the noxious weed listings for all US states and six Canadian provinces, historic distribution records against which to compare current plant distributions, and summary statistics such as the number of invasive species detected per state or a summary of the 120 year invasion, among others. The INVADERS database will prove both interesting and useful to managers and academics, alike.

  11. Molecular Identification and Databases in Fusarium

    Technology Transfer Automated Retrieval System (TEKTRAN)

    DNA sequence-based methods for identifying pathogenic and mycotoxigenic Fusarium isolates have become the gold standard worldwide. Moreover, fusarial DNA sequence data are increasing rapidly in several web-accessible databases for comparative purposes. Unfortunately, the use of Basic Alignment Sea...

  12. FishMicrosat: a microsatellite database of commercially important fishes and shellfishes of the Indian subcontinent

    PubMed Central

    2013-01-01

    Background: Microsatellite DNA is one of many powerful genetic markers used for the construction of genetic linkage maps and the study of population genetics. The biological databases in the public domain hold vast numbers of microsatellite sequences for many organisms, including fishes. The microsatellite data available in these data sources were extracted and organized into a database that facilitates sequence analysis and browsing of relevant information. The system also helps to design primer sequences for flanking regions of repeat loci for PCR identification of polymorphism within populations. Description: FishMicrosat is a database of microsatellite sequences of fishes and shellfishes that includes important aquaculture species such as Lates calcarifer, Ctenopharyngodon idella, Hypophthalmichthys molitrix, Penaeus monodon, Labeo rohita, Oreochromis niloticus, Fenneropenaeus indicus and Macrobrachium rosenbergii. The database contains 4398 microsatellite sequences of 41 species belonging to 15 families from the Indian subcontinent. GenBank of NCBI was used as the prime data source for developing the database. The database presents information about simple and compound microsatellites, their clusters and locus orientation within sequences. The database has been integrated with different tools in a web interface, such as primer designing, locus finding, mapping repeats, detecting similarities among sequences across species, and searching using motifs and keywords. In addition, the database allows browsing of information on the top 10 families and the top 10 species through a record overview. Conclusions: The FishMicrosat database is a useful resource for fish and shellfish microsatellite analyses and locus identification across species, which has important applications in population genetics, evolutionary studies and genetic relatedness among species. The database can be expanded further to include the microsatellite data of fishes and shellfishes from other regions and available information on genome sequencing projects of species of aquaculture importance. PMID:24047532
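
    The record above mentions locating simple microsatellites and mapping repeats within sequences. As a rough illustration only, the following Python sketch finds perfect tandem repeats with a regular expression. It is not FishMicrosat's own method: the function name, the motif-length range and the minimum repeat count are assumptions chosen for the example, and compound or imperfect microsatellites are not handled.

```python
import re


def find_simple_microsatellites(seq, min_motif=2, max_motif=6, min_repeats=4):
    """Return (start, motif, repeat_count) for perfect tandem repeats.

    Hypothetical helper for illustration; thresholds are assumptions,
    not values taken from the FishMicrosat paper.
    """
    seq = seq.upper()
    # Lazy motif group so the shortest repeating unit is preferred,
    # followed by at least (min_repeats - 1) further copies of it.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        motif = match.group(1)
        # Skip homopolymer runs such as "AA" repeated many times.
        if len(set(motif)) == 1:
            continue
        repeats = len(match.group(0)) // len(motif)
        hits.append((match.start(), motif, repeats))
    return hits


if __name__ == "__main__":
    example = "GGATTATTATTATTGGACACACACACTT"
    for start, motif, repeats in find_simple_microsatellites(example):
        print(start, motif, repeats)
```

    Because the search scans left to right, a repeat is reported in the phase at which it is first encountered; a production tool would normally also merge adjacent hits into compound loci.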

  13. Solubility Database

    National Institute of Standards and Technology Data Gateway

    SRD 106 IUPAC-NIST Solubility Database (Web, free access)   These solubilities are compiled from 18 volumes of the International Union of Pure and Applied Chemistry (IUPAC)-NIST Solubility Data Series. The database includes liquid-liquid, solid-liquid, and gas-liquid systems. Typical solvents and solutes include water, seawater, heavy water, inorganic compounds, and a variety of organic compounds such as hydrocarbons, halogenated hydrocarbons, alcohols, acids, esters and nitrogen compounds. There are over 67,500 solubility measurements and over 1800 references.

  14. HUMHOT: a database of human meiotic recombination hot spots.

    PubMed

    Nishant, K T; Kumar, Chetan; Rao, M R S

    2006-01-01

    Meiotic recombination occurs preferentially at certain regions in the genome referred to as hot spots. The number of hot spots known in humans has increased manifold in recent years. The identification of these hot spots in humans is of great interest to population and medical geneticists since they influence the structure of Linkage Disequilibrium and Haplotype blocks in human populations, whose patterns have applications in mapping disease genes. HUMHOT is a web-based database of Human Meiotic Recombination Hot Spots. The database comprises DNA sequences corresponding to the hot spot regions from the literature that have been mapped to a high resolution (<4 kb) in humans. It also provides flanking sequence information for the hot spot region along with references describing the hot spot. The database can be queried based on hot spot identity, chromosome position or by homology to user-defined sequences. It is also updated with new hot spot sequences as they are discovered and provides hyperlinks to commonly used tools for estimating recombination rates, performing genetic analysis and new advances in our understanding of meiotic hot spots. Public access to the HUMHOT database is available at http://www.jncasr.ac.in/humhot. PMID:16381857

  15. Database of recent tsunami deposits

    USGS Publications Warehouse

    Peters, Robert; Jaffe, Bruce E.

    2010-01-01

    This report describes a database of sedimentary characteristics of tsunami deposits derived from published accounts of tsunami deposit investigations conducted shortly after the occurrence of a tsunami. The database contains 228 entries, each entry containing data from up to 71 categories. It includes data from 51 publications covering 15 tsunamis distributed between 16 countries. The database encompasses a wide range of depositional settings including tropical islands, beaches, coastal plains, river banks, agricultural fields, and urban environments. It includes data from both local tsunamis and teletsunamis. The data are valuable for interpreting prehistorical, historical, and modern tsunami deposits, and for the development of criteria to identify tsunami deposits in the geologic record.

  16. Generation and Analysis of a Large-Scale Expressed Sequence Tag Database from a Full-Length Enriched cDNA Library of Developing Leaves of Gossypium hirsutum L

    PubMed Central

    Pang, Chaoyou; Fan, Shuli; Song, Meizhen; Yu, Shuxun

    2013-01-01

    Background: Cotton (Gossypium hirsutum L.) is one of the world’s most economically-important crops. However, its entire genome has not been sequenced, and limited resources are available in GenBank for understanding the molecular mechanisms underlying leaf development and senescence. Methodology/Principal Findings: In this study, 9,874 high-quality ESTs were generated from a normalized, full-length cDNA library derived from pooled RNA isolated throughout leaf development during the plant blooming stage. After clustering and assembly of these ESTs, 5,191 unique sequences, representing 1,652 contigs and 3,539 singletons, were obtained. The average unique sequence length was 682 bp. Annotation of these unique sequences revealed that 84.4% showed significant homology to sequences in the NCBI non-redundant protein database, and 57.3% had significant hits to known proteins in the Swiss-Prot database. Comparative analysis indicated that our library added 2,400 ESTs and 991 unique sequences to those known for cotton. The unigenes were functionally characterized by gene ontology annotation. We identified 1,339 and 200 unigenes as potential leaf senescence-related genes and transcription factors, respectively. Moreover, nine genes related to leaf senescence and eleven MYB transcription factors were randomly selected for quantitative real-time PCR (qRT-PCR), which revealed that these genes were regulated differentially during senescence. The qRT-PCR for three GhYLSs revealed that these genes are expressed preferentially in senescent leaves. Conclusions/Significance: These EST resources will provide valuable sequence information for gene expression profiling analyses and functional genomics studies to elucidate their roles, as well as for studying the mechanisms of leaf development and senescence in cotton and discovering candidate genes related to important agronomic traits of cotton. These data will also facilitate future whole-genome sequence assembly and annotation in G. hirsutum and comparative genomics among Gossypium species. PMID:24146870

  17. Working on the public Human Genome Project, Craig Venter. Site: DNA Interactive (www.dnai.org)

    NSDL National Science Digital Library

    2008-10-06

    Interviewee: Craig Venter. DNAi Location: Genome > The project > players > money. Threatening their funding: Craig Venter speaks about the public sector's reaction to his plans to sequence the genome at a private company, Celera Genomics.

  18. The urologic epithelial stem cell database (UESC) – a web tool for cell type-specific gene expression and immunohistochemistry images of the prostate and bladder

    PubMed Central

    Pascal, Laura E; Deutsch, Eric W; Campbell, David S; Korb, Martin; True, Lawrence D; Liu, Alvin Y

    2007-01-01

    Background: Public databases are crucial for analysis of high-dimensional gene and protein expression data. The Urologic Epithelial Stem Cells (UESC) database is a public database that contains gene and protein information for the major cell types of the prostate, prostate cancer cell lines, and a cancer cell type isolated from a primary tumor. Similarly, such information is available for urinary bladder cell types. Description: Two major data types were archived in the database: protein abundance localization data from immunohistochemistry images, and transcript abundance data principally from DNA microarray analysis. Data results were organized in modules that were made to operate independently but built upon a core functionality. Gene array data and immunostaining images for human and mouse prostate and bladder were made available for interrogation. Data analysis capabilities include: (1) CD (cluster designation) cell surface protein data. For each cluster designation molecule, a data summary allows easy retrieval of images (at multiple magnifications). (2) Microarray data. Single gene or batch search can be initiated with Affymetrix Probeset ID, Gene Name, or Accession Number together with options of coalescing probesets and/or replicates. Conclusion: Databases are invaluable for biomedical research, and their utility depends on data quality and user friendliness. UESC provides for database queries and tools to examine cell type-specific gene expression (normal vs. cancer), whereas most other databases contain only whole tissue expression datasets. The UESC database provides a valuable tool in the analysis of differential gene expression in prostate cancer genes in cancer progression. PMID:18072977

  19. FACILITY DATABASE

    Cancer.gov

    LASP Administrative Use Only. Data Entry Start Date: _______________ Investigator Data Requirements, July 2007. LASP FACILITY Database Form 1.000. This form is used to identify the level of data that each investigator [and his/her staff] will require for entry

  20. NBRP databases: databases of biological resources in Japan.

    PubMed

    Yamazaki, Yukiko; Akashi, Ryo; Banno, Yutaka; Endo, Takashi; Ezura, Hiroshi; Fukami-Kobayashi, Kaoru; Inaba, Kazuo; Isa, Tadashi; Kamei, Katsuhiko; Kasai, Fumie; Kobayashi, Masatomo; Kurata, Nori; Kusaba, Makoto; Matuzawa, Tetsuro; Mitani, Shohei; Nakamura, Taro; Nakamura, Yukio; Nakatsuji, Norio; Naruse, Kiyoshi; Niki, Hironori; Nitasaka, Eiji; Obata, Yuichi; Okamoto, Hitoshi; Okuma, Moriya; Sato, Kazuhiro; Serikawa, Tadao; Shiroishi, Toshihiko; Sugawara, Hideaki; Urushibara, Hideko; Yamamoto, Masatoshi; Yaoita, Yoshio; Yoshiki, Atsushi; Kohara, Yuji

    2010-01-01

    The National BioResource Project (NBRP) is a Japanese project that aims to establish a system for collecting, preserving and providing bioresources for use as experimental materials for life science research. It is promoted by 27 core resource facilities, each concerned with a particular group of organisms, and by one information center. The NBRP database is a product of this project. Thirty databases and an integrated database-retrieval system (BioResource World: BRW) have been created and made available through the NBRP home page (http://www.nbrp.jp). The 30 independent databases have individual features which directly reflect the data maintained by each resource facility. The BRW is designed for users who need to search across several resources without moving from one database to another. BRW provides access to a collection of 4.5-million records on bioresources including wild species, inbred lines, mutants, genetically engineered lines, DNA clones and so on. BRW supports summary browsing, keyword searching, and searching by DNA sequences or gene ontology. The results of searches provide links to online requests for distribution of research materials. A circulation system allows users to submit details of papers published on research conducted using NBRP resources. PMID:19934255

  1. Human cancer databases (Review)

    PubMed Central

    PAVLOPOULOU, ATHANASIA; SPANDIDOS, DEMETRIOS A.; MICHALOPOULOS, IOANNIS

    2015-01-01

    Cancer is one of the four major non-communicable diseases (NCD), responsible for ~14.6% of all human deaths. Currently, there are >100 different known types of cancer and >500 genes involved in cancer. Ongoing research efforts have been focused on cancer etiology and therapy. As a result, there is an exponential growth of cancer-associated data from diverse resources, such as scientific publications, genome-wide association studies, gene expression experiments, gene-gene or protein-protein interaction data, enzymatic assays, epigenomics, immunomics and cytogenetics, stored in relevant repositories. These data are complex and heterogeneous, ranging from unprocessed, unstructured data in the form of raw sequences and polymorphisms to well-annotated, structured data. Consequently, the storage, mining, retrieval and analysis of these data in an efficient and meaningful manner pose a major challenge to biomedical investigators. In the current review, we present the central, publicly accessible databases that contain data pertinent to cancer, the resources available for delivering and analyzing information from these databases, as well as databases dedicated to specific types of cancer. Examples for this wealth of cancer-related information and bioinformatic tools have also been provided. PMID:25369839

  2. Human cancer databases (review).

    PubMed

    Pavlopoulou, Athanasia; Spandidos, Demetrios A; Michalopoulos, Ioannis

    2015-01-01

    Cancer is one of the four major non-communicable diseases (NCD), responsible for ~14.6% of all human deaths. Currently, there are >100 different known types of cancer and >500 genes involved in cancer. Ongoing research efforts have been focused on cancer etiology and therapy. As a result, there is an exponential growth of cancer-associated data from diverse resources, such as scientific publications, genome-wide association studies, gene expression experiments, gene-gene or protein-protein interaction data, enzymatic assays, epigenomics, immunomics and cytogenetics, stored in relevant repositories. These data are complex and heterogeneous, ranging from unprocessed, unstructured data in the form of raw sequences and polymorphisms to well-annotated, structured data. Consequently, the storage, mining, retrieval and analysis of these data in an efficient and meaningful manner pose a major challenge to biomedical investigators. In the current review, we present the central, publicly accessible databases that contain data pertinent to cancer, the resources available for delivering and analyzing information from these databases, as well as databases dedicated to specific types of cancer. Examples for this wealth of cancer-related information and bioinformatic tools have also been provided. PMID:25369839

  3. Databases for T-cell epitopes.

    PubMed

    Tung, Chun-Wei

    2014-01-01

    Modern immunology and vaccinology incorporate immunoinformatics techniques to give insights into immune systems and accelerate vaccine design. Databases managing epitope data in a structured form with immune-related annotations including sequences, alleles, source organisms, structures, and diseases could be the most crucial part of immunoinformatics, offering data sources for the analysis of immune systems and development of prediction methods. This chapter provides an overview of publicly available databases of T-cell epitopes including general databases, pathogen- and tumor-specific databases, and 3D structure databases. PMID:25048121

  4. Ecology in the age of DNA barcoding: the resource, the promise and the challenges ahead.

    PubMed

    Joly, Simon; Davies, T Jonathan; Archambault, Annie; Bruneau, Anne; Derry, Alison; Kembel, Steven W; Peres-Neto, Pedro; Vamosi, Jana; Wheeler, Terry A

    2014-03-01

    Ten years after DNA barcoding was initially suggested as a tool to identify species, millions of barcode sequences from more than 1100 species are available in public databases. While several studies have reviewed the methods and potential applications of DNA barcoding, most have focused on species identification and discovery, and relatively few have addressed applications of DNA barcoding data to ecology. These data, and the associated information on the evolutionary histories of taxa that they can provide, offer great opportunities for ecologists to investigate questions that were previously difficult or impossible to address. We present an overview of potential uses of DNA barcoding relevant in the age of ecoinformatics, including applications in community ecology, species invasion, macroevolution, trait evolution, food webs and trophic interactions, metacommunities, and spatial ecology. We also outline some of the challenges and potential advances in DNA barcoding that lie ahead. PMID:24118947

  5. Kazusa Marker DataBase: a database for genomics, genetics, and molecular breeding in plants.

    PubMed

    Shirasawa, Kenta; Isobe, Sachiko; Tabata, Satoshi; Hirakawa, Hideki

    2014-09-01

    In order to provide useful genomic information for agronomical plants, we have established a database, the Kazusa Marker DataBase (http://marker.kazusa.or.jp). This database includes information on DNA markers, e.g., SSR and SNP markers, genetic linkage maps, and physical maps, that were developed at the Kazusa DNA Research Institute. Keyword searches for the markers, sequence data used for marker development, and experimental conditions are also available through this database. Currently, 10 plant species have been targeted: tomato (Solanum lycopersicum), pepper (Capsicum annuum), strawberry (Fragaria × ananassa), radish (Raphanus sativus), Lotus japonicus, soybean (Glycine max), peanut (Arachis hypogaea), red clover (Trifolium pratense), white clover (Trifolium repens), and eucalyptus (Eucalyptus camaldulensis). In addition, the number of plant species registered in this database will be increased as our research progresses. The Kazusa Marker DataBase will be a useful tool for both basic and applied sciences, such as genomics, genetics, and molecular breeding in crops. PMID:25320561

  6. A Novel Approach: Chemical Relational Databases, and the Role of the ISSCAN Database on Assessing Chemical Carcinogenicity

    EPA Science Inventory

    Mutagenicity and carcinogenicity databases are crucial resources for toxicologists and regulators involved in chemicals risk assessment. Until recently, existing public toxicity databases have been constructed primarily as "look-up-tables" of existing data, and most often did no...

  7. ARTI Refrigerant Database

    SciTech Connect

    Calm, J.M.

    1992-11-09

    The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The database identifies sources of specific information on R-32, R-123, R-124, R-125, R-134, R-134a, R-141b, R-142b, R-143a, R-152a, R-245ca, R-290 (propane), R-717 (ammonia), ethers, and others as well as azeotropic and zeotropic blends of these fluids. It addresses lubricants including alkylbenzene, polyalkylene glycol, ester, and other synthetics as well as mineral oils. It also references documents on compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. A computerized version is available that includes retrieval software.

  8. Open Geoscience Database

    NASA Astrophysics Data System (ADS)

    Bashev, A.

    2012-04-01

    A very large number of geoscience databases now exist. Unfortunately, most of them are used only by their developers. There are several reasons for this, including incompatibility and the specificity of their tasks and objects. The main obstacles to wider use, however, are complexity for developers and complication for users. Architectural complexity leads to high costs that block public access, and complication prevents users from understanding when and how to use a database. Only databases associated with Google Maps avoid these drawbacks, but they can hardly be called "geoscience" databases. Nevertheless, an open and simple geoscience database is needed, at least for educational purposes (see our abstract for ESSI20/EOS12). We developed a database and a web interface for working with it, now accessible at maps.sch192.ru. In this database, a result is the value of a parameter (of any kind) at a station with a known position, together with metadata: the date on which the result was obtained, the type of station (lake, soil, etc.), and the contributor who sent the result. Each contributor has a profile, which makes it possible to estimate the reliability of the data. Results can be displayed on Google Maps satellite imagery as points at the corresponding positions, coloured according to the value of the parameter. Default colour scales are provided, and each registered user can create their own. Results can also be exported to a *.csv file. For both types of representation, data can be selected by date, object type, parameter type, area and contributor. Data are uploaded in *.csv format: Name of the station; Latitude (dd.dddddd); Longitude (ddd.dddddd); Station type; Parameter type; Parameter value; Date (yyyy-mm-dd). The contributor is identified when logging in. This is the minimal set of features required to connect a parameter value with a position and view the results. Any more complicated data processing can be carried out in other programs after the filtered data have been exported to a *.csv file. This keeps the database understandable for non-experts. The database uses an open data format (*.csv) and widespread tools: PHP as the programming language, MySQL as the database management system, JavaScript for interaction with Google Maps, and jQuery UI for building the user interface. The database is multilingual: it uses association tables that connect with the elements of the database. In total, development took about 150 hours. The database still has several problems. The main one is the reliability of the data. It really needs an expert system for estimating reliability, but developing such a system would take more resources than the database itself. The second is the problem of stream selection: how to select stations that are connected with each other (for example, stations on the same water stream) and indicate their sequence. The interface is currently available in English and Russian, but it can easily be translated into other languages. Some problems have already been solved. One is "the problem of the same station" (sometimes the distance between stations is smaller than the positional error): when a new station is added, the application automatically finds stations near that location. Another is the problem of object and parameter types (how to treat "EC" and "electrical conductivity" as the same parameter), which has been solved using "associative tables". If you would like to see the interface in your language, just contact us and we will send you the list of terms and phrases to translate. The main advantage of the database is that it is completely open: anyone can view and extract the data and use them for non-commercial purposes free of charge. Registered users can contribute to the database on an unpaid basis. We hope that it will be widely used, primarily for educational purposes, but professional scientists could use it as well.
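
    The upload format, the "same station" check and the parameter-name mapping described in this abstract can be sketched in a few lines. The Python below is only an illustration under stated assumptions, not the authors' PHP/MySQL implementation: the semicolon delimiter, the alias dictionary standing in for the "associative tables", the 50 m tolerance and all function and class names are hypothetical.

```python
import csv
import io
import math
from dataclasses import dataclass
from datetime import date

# Hypothetical stand-in for the "associative tables": maps alternative
# parameter names onto one canonical name.
PARAMETER_ALIASES = {"ec": "electrical conductivity"}


@dataclass
class Result:
    station: str
    lat: float
    lon: float
    station_type: str
    parameter: str
    value: float
    measured_on: date


def canonical_parameter(name):
    """Normalise a parameter name so that synonyms compare equal."""
    key = name.strip().lower()
    return PARAMETER_ALIASES.get(key, key)


def parse_upload(text, delimiter=";"):
    """Parse rows of: name; lat; lon; station type; parameter; value; date.

    The delimiter is an assumption; the abstract only lists the fields.
    """
    results = []
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        if not row:
            continue  # skip blank lines
        name, lat, lon, stype, ptype, value, day = (f.strip() for f in row)
        results.append(Result(name, float(lat), float(lon), stype,
                              canonical_parameter(ptype), float(value),
                              date.fromisoformat(day)))
    return results


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle (haversine) distance in metres on a spherical Earth."""
    radius = 6371000.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius * math.asin(math.sqrt(a))


def nearby_stations(new, known, tolerance_m=50.0):
    """Return known results whose station lies within tolerance_m of `new`."""
    return [k for k in known
            if distance_m(new.lat, new.lon, k.lat, k.lon) <= tolerance_m]
```

    In this sketch a newly uploaded station would be compared against existing ones with nearby_stations before it is stored, so that two entries closer together than the positional error can be offered to the contributor as the same station.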

  9. DISTRIBUTED DATABASES INTRODUCTION

    E-print Network

    Liu, Chengfei

    DISTRIBUTED DATABASES INTRODUCTION The development of network and data communication technology distributed database management. Naturally, the decentralized approach reflects the distributed aspects in the definition of a distributed database exist. First, a distributed database is distributed

  10. 16S rDNA library-based analysis of ruminal bacterial diversity

    Microsoft Academic Search

    Joan E. Edwards; Neil R. McEwan; Anthony J. Travis; R. John Wallace

    2004-01-01

    Bacterial 16S rDNA sequence data, incorporating sequences > 1 kb, were retrieved from published rumen library studies and public databases, then were combined and analysed to assess the diversity of the rumen microbial ecosystem as indicated by the pooled data. Low G+C Gram positive bacteria (54%) and the Cytophaga-Flexibacter-Bacteroides (40%) phyla were most abundantly represented. The diversity inferred by combining

  11. A catalog of human cDNA expression clones and its application to structural genomics

    Microsoft Academic Search

    Konrad Büssow; Claudia Quedenau; Volker Sievert; Janett Tischer; Christoph Scheich; Harald Seitz; Brigitte Hieke; Frank H Niesen; Frank Götz; Ulrich Harttig; Hans Lehrach

    2004-01-01

    We describe here a systematic approach to the identification of human proteins and protein fragments that can be expressed as soluble proteins in Escherichia coli. A cDNA expression library of 10,825 clones was screened by small-scale expression and purification and 2,746 clones were identified. Sequence and protein-expression data were entered into a public database. A set of 163 clones was

  12. Hooley and Sweeney Survey of Publicly Available State Health Databases http://thedatamap/1075-1.pdf v0.3 1

    E-print Network

    Chen, Yiling

    http://thedatamap/1075-1.pdf v0.3 1 Survey of Publicly Available State Health be matched back to the patient because it contains diagnoses that may include drug and alcohol dependency are the same. It is important to understand that sharing data beyond the patient encounter offers many worthy

  13. REFEREE: BIBLIOGRAPHIC DATABASE MANAGER, DOCUMENTATION

    EPA Science Inventory

    The publication is the user's manual for 3.xx releases of REFEREE, a general-purpose bibliographic database management program for IBM-compatible microcomputers. The REFEREE software also is available from NTIS. The manual has two main sections--Quick Tour and References Guide--a...

  14. DATABASE AUTHENTICATION BY DISTORTION FREE WATERMARKING

    E-print Network

    Cortesi, Tino

    DATABASE AUTHENTICATION BY DISTORTION FREE WATERMARKING Sukriti Bhattacharya and Agostino Cortesi@dsi.unive.it, cortesi@unive.it Keywords: Database watermarking, ZAW, Public key watermark, Abstract interpretation. Abstract: In this paper we introduce a distortion free watermarking technique that strengthen

  15. Mineralogy Database

    NSDL National Science Digital Library

    This reference database contains information for over 4,000 individual mineral species. Introductory material includes an overview of 'what is a mineral', definitions of what constitutes a mineral taken from several different sources and arranged by year, and a 'what's new' page listing additions to the site. Mineral data for individual species are arranged in tables by crystallography, crystal structure, X-Ray powder diffraction, chemical composition, and physical and optical properties. There are also listings for Strunz and New Dana classifications, an alphabetical directory, an image gallery, and links to help topics, news articles, and to other sites on mineralogy.

  16. Publications Publications

    E-print Network

    Seybold, Steven J.

    Society, 56(3): 229-394. 1983 Karban, Richard. Induced responses of cherry trees to periodical cicada Richard Karban 1977 Karban, Richard. Growth form and interleaf shading by Costus lima in a Costa Rican rainforest. Biotropica

  17. SPODOBASE : an EST database for the lepidopteran crop pest Spodoptera

    PubMed Central

    Nègre, Vincent; Hôtelier, Thierry; Volkoff, Anne-Nathalie; Gimenez, Sylvie; Cousserans, François; Mita, Kazuei; Sabau, Xavier; Rocher, Janick; López-Ferber, Miguel; d'Alençon, Emmanuelle; Audant, Pascaline; Sabourault, Cécile; Bidegainberry, Vincent; Hilliou, Frédérique; Fournier, Philippe

    2006-01-01

    Background: The lepidopteran Spodoptera frugiperda is a pest which causes widespread economic damage on a variety of crop plants. It is also well known through its famous Sf9 cell line, which is used for numerous heterologous protein productions. Species of the Spodoptera genus are used as models for pesticide resistance and to study virus-host interactions. A genomic approach is now a critical step for further new developments in the biology and pathology of these insects, and the results of EST sequencing efforts need to be structured into databases providing an integrated set of tools and information. Description: The ESTs from five independent cDNA libraries, prepared from three different S. frugiperda tissues (hemocytes, midgut and fat body) and from the Sf9 cell line, are deposited in the database. These tissues were chosen because of their importance in biological processes such as immune response, development and plant/insect interaction. So far, the SPODOBASE contains 29,325 ESTs, which are cleaned and clustered into non-redundant sets (2294 clusters and 6103 singletons). The SPODOBASE is constructed in such a way that other ESTs from S. frugiperda or other species may be added. Users can retrieve information using text searches, pre-formatted queries, a query assistant or BLAST searches. Annotation is provided against NCBI, UNIPROT or Bombyx mori EST databases, and with GO-Slim vocabulary. Conclusion: The SPODOBASE database provides integrated access to expressed sequence tags (EST) from the lepidopteran insect Spodoptera frugiperda. It is a publicly available structured database with insect pest sequences which will allow identification of a number of genes and comprehensive cloning of gene families of interest for the scientific community. SPODOBASE is available from URL: PMID:16796757

  18. The Molecular Biology Database Collection: 2004 update

    PubMed Central

    Galperin, Michael Y.

    2004-01-01

    The Molecular Biology Database Collection is a public resource listing key databases of value to the biologist, including those featured in this issue of Nucleic Acids Research, and other high-quality databases. All databases included in this Collection are freely available to the public. This listing aims to serve as a convenient starting point for searching the web for reliable information on various aspects of molecular biology, biochemistry and genetics. This year’s update includes 548 databases, 162 more than the previous one. The databases are organized in a hierarchical classification that should simplify finding the right database for each given task. Each database in the list comes with a recently updated brief description. The database list and the database descriptions can be accessed online at the Nucleic Acids Research web site http://nar.oupjournals.org/. "The great challenge in biological research today is how to turn data into knowledge. I have met people who think data is knowledge but these people are then striving for a means of turning knowledge into understanding." Sydney Brenner, The Scientist 16[6]:12, March 18, 2002. PMID:14681349

  19. The peptaibiotics database - a comprehensive online resource.

    PubMed

    Neumann, Nora K N; Stoppacher, Norbert; Zeilinger, Susanne; Degenkolb, Thomas; Brückner, Hans; Schuhmacher, Rainer

    2015-05-01

    In this work, we present the 'Peptaibiotics Database' (PDB), a comprehensive online resource, which intends to cover all Aib-containing non-ribosomal fungal peptides currently described in scientific literature. This database shall extend and update the recently published 'Comprehensive Peptaibiotics Database' and currently consists of 1,297 peptaibiotic sequences. In a literature survey, a total of 235 peptaibiotic sequences published between January 2013 and June 2014 have been compiled, and added to the list of 1,062 peptides in the recently published 'Comprehensive Peptaibiotics Database'. The presented database is intended as a public resource freely accessible to the scientific community at peptaibiotics-database.boku.ac.at. The search options of the previously published repository and the presentation of sequence motif searches have been extended significantly. All of the available search options can be combined to create complex database queries. As a public repository, the presented database enables the easy upload of new peptaibiotic sequences or the correction of existing information. In addition, an administrative interface for maintenance of the content of the database has been implemented, and the design of the database can be easily extended to store additional information to accommodate future needs of the 'peptaibiomics community'. PMID:26010663

  20. ARTI Refrigerant Database

    NASA Astrophysics Data System (ADS)

    Calm, J. M.

    1993-04-01

    The Refrigerant Database consolidates and facilitates access to information to assist industry in developing equipment using alternative refrigerants. The underlying purpose is to accelerate the phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in the research and design of air-conditioning and refrigeration equipment. The complete documents are not included. The database identifies sources of specific information on R-32, R-123, R-124, R-125, R-134, R-134a, R-141b, R-142b, R-143a, R-152a, R-245ca, R-290 (propane), R-717 (ammonia), ethers, and others as well as azeotropic and zeotropic blends of these fluids. It addresses lubricants including alkylbenzene, polyalkylene glycol, ester, and other synthetics as well as mineral oils. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. Incomplete citations or abstracts are provided for some documents to accelerate availability of the information and will be completed or replaced in future updates.

  1. ARTI Refrigerant Database

    NASA Astrophysics Data System (ADS)

    Calm, J. M.

    1993-11-01

    The Refrigerant Database consolidates and facilitates access to information to assist industry in developing equipment using alternative refrigerants. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The complete documents are not included, though some may be added at a later date. The database identifies sources of specific information on R-32, R-123, R-124, R-125, R-134, R-134a, R-141b, R-142b, R-143a, R-152a, R-227ea, R-245ca, R-290 (propane), R-717 (ammonia), ethers, and others as well as azeotropic and zeotropic blends of these fluids. It addresses lubricants including alkylbenzene, polyalkylene glycol, polyol ester, and other synthetics as well as mineral oils. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. Incomplete citations or abstracts are provided for some documents. They are included to accelerate availability of the information and will be completed or replaced in future updates.

  2. ARTI Refrigerant Database

    SciTech Connect

    Calm, J.M. [Calm (James M.), Great Falls, VA (United States)]

    1993-04-30

    The Refrigerant Database consolidates and facilitates access to information to assist industry in developing equipment using alternative refrigerants. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The complete documents are not included. The database identifies sources of specific information on R-32, R-123, R-124, R-125, R-134, R-134a, R-141b, R-142b, R-143a, R-152a, R-245ca, R-290 (propane), R-717 (ammonia), ethers, and others as well as azeotropic and zeotropic blends of these fluids. It addresses lubricants including alkylbenzene, polyalkylene glycol, ester, and other synthetics as well as mineral oils. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. Incomplete citations or abstracts are provided for some documents to accelerate availability of the information and will be completed or replaced in future updates.

  3. The Molecule Pages database.

    PubMed

    Saunders, Brian; Lyon, Stephen; Day, Matthew; Riley, Brenda; Chenette, Emily; Subramaniam, Shankar; Vadivelu, Ilango

    2008-01-01

    The UCSD-Nature Signaling Gateway Molecule Pages (http://www.signaling-gateway.org/molecule) provides essential information on more than 3800 mammalian proteins involved in cellular signaling. The Molecule Pages contain expert-authored and peer-reviewed information based on the published literature, complemented by regularly updated information derived from public data source references and sequence analysis. The expert-authored data includes both a full-text review about the molecule, with citations, and highly structured data for bioinformatics interrogation, including information on protein interactions and states, transitions between states and protein function. The expert-authored pages are anonymously peer reviewed by the Nature Publishing Group. The Molecule Pages data is present in an object-relational database format and is freely accessible to the authors, the reviewers and the public from a web browser that serves as a presentation layer. The Molecule Pages are supported by several applications that along with the database and the interfaces form a multi-tier architecture. The Molecule Pages and the Signaling Gateway are routinely accessed by a very large research community. PMID:17965093

  4. DNA Barcoding for Species Assignment: The Case of Mediterranean Marine Fishes

    PubMed Central

    Landi, Monica; Dimech, Mark; Arculeo, Marco; Biondo, Girolama; Martins, Rogelia; Carneiro, Miguel; Carvalho, Gary Robert; Brutto, Sabrina Lo; Costa, Filipe O.

    2014-01-01

    Background: DNA barcoding enhances the prospects for species-level identifications globally using a standardized and authenticated DNA-based approach. Reference libraries comprising validated DNA barcodes (COI) constitute robust datasets for testing query sequences, providing considerable utility to identify marine fish and other organisms. Here we test the feasibility of using DNA barcoding to assign species to tissue samples from fish collected in the central Mediterranean Sea, a major contributor to the European marine ichthyofaunal diversity. Methodology/Principal Findings: A dataset of 1278 DNA barcodes, representing 218 marine fish species, was used to test the utility of DNA barcodes to assign species from query sequences. We tested query sequences against 1) a reference library of ranked DNA barcodes from the neighbouring North East Atlantic, and 2) the public databases BOLD and GenBank. In the first case, a reference library comprising DNA barcodes with reliability grades for 146 fish species was used as diagnostic dataset to screen 486 query DNA sequences from fish specimens collected in the central basin of the Mediterranean Sea. Of all query sequences suitable for comparisons 98% were unambiguously confirmed through complete match with reference DNA barcodes. In the second case, it was possible to assign species to 83% (BOLD-IDS) and 72% (GenBank) of the sequences from the Mediterranean. Relatively high intraspecific genetic distances were found in 7 species (2.2%–18.74%), most of them of high commercial relevance, suggesting possible cryptic species. Conclusion/Significance: We emphasize the discriminatory power of COI barcodes and their application to cases requiring species level resolution starting from query sequences. Results highlight the value of public reference libraries of reliability grade-annotated DNA barcodes, to identify species from different geographical origins. The ability to assign species with high precision from DNA samples of disparate quality and origin has major utility in several fields, from fisheries and conservation programs to control of fish products authenticity. PMID:25222272
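
    The matching step described in the abstract above (screening a query barcode against a reference library) can be illustrated with a minimal sketch. It is not the authors' pipeline, which relied on BOLD-IDS and GenBank searches; it assumes pre-aligned sequences of equal length, uses the uncorrected p-distance, and the function names, the toy sequences and the 2% cutoff are all hypothetical choices made for illustration.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and aligned to the same length")
    mismatches = sum(1 for x, y in zip(a, b) if x != y)
    return mismatches / len(a)


def assign_species(query: str, reference: dict, max_distance: float = 0.02):
    """Return (species, distance) for the closest reference barcode.

    The species is None when even the closest barcode is farther away
    than max_distance (a hypothetical cutoff, not taken from the paper).
    """
    best_species, best_dist = None, float("inf")
    for species, barcode in reference.items():
        d = p_distance(query, barcode)
        if d < best_dist:
            best_species, best_dist = species, d
    if best_dist <= max_distance:
        return best_species, best_dist
    return None, best_dist


# Toy reference library; real COI barcodes are roughly 650 bp long.
reference = {
    "species_A": "ACGTACGTACGTACGTACGT",
    "species_B": "ACGTTCGTACGAACGTACCT",
}

print(assign_species("ACGTACGTACGTACGTACGT", reference))
print(assign_species("ACGTACGTACGTACGTACGA", reference))
```

    The first query matches species_A exactly; the second differs from its closest reference at 1 of 20 sites (5%), which exceeds the assumed cutoff, so no species is assigned.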

  5. Pathways database system.

    PubMed

    Ozsoyoglu, Z Meral; Nadeau, Joseph H; Ozsoyoglu, G

    2003-01-01

    During the next phase of the Human Genome Project, research will focus on functional studies of attributing functions to genes, their regulatory elements, and other DNA sequences. To facilitate the use of genomic information in such studies, a new modeling perspective is needed to examine and study genome sequences in the context of many kinds of biological information. Pathways are the logical format for modeling and presenting such information in a manner that is familiar to biological researchers. In this paper, we introduce an integrated system, called "Pathways Database System," with a set of software tools for modeling, storing, analyzing, visualizing, and querying biological pathways data at different levels of genetic, molecular, biochemical and organismal detail. PMID:12831573

  6. Algaline Database

    NSDL National Science Digital Library

    Maintained by the Finnish Institute of Marine Research and several other institutions, the Algaline Database offers updated reports on the conditions of phytoplankton and related parameters in the Baltic Sea. The reports, which vary in length and detail (though most are brief), summarize measurements of Oxygen, Salinity, Temperature, Nutrients, Harmful substances, Plankton, Zooplankton, Benthic Animals, Flow, and Other measurements. In addition, the Maps and Figures section offers numerous color images (including satellite) of Baltic Sea conditions and marine organisms. To access reports by geographic subregion of the Baltic, head to the Reports section. Finally, the Latest News section keeps researchers abreast of changing conditions (e.g., algal blooms) and research cruises in the Baltic. For researchers or anyone else wanting in-depth information on a host of ecological parameters for the Baltic Sea, this is an excellent reference site.

  7. ChloroplastDB: the Chloroplast Genome Database

    Microsoft Academic Search

    Liying Cui; Narayanan Veeraraghavan; Alexander Richter; P. Kerr Wall; Robert K. Jansen; James Leebens-mack; Izabela Makalowska; Claude W. Depamphilis

    2006-01-01

    The Chloroplast Genome Database (ChloroplastDB) is an interactive, web-based database for fully sequenced plastid genomes, containing genomic, protein, DNA and RNA sequences, gene locations, RNA-editing sites, putative protein families and alignments (http://chloroplast.cbio.psu.edu/). With recent technical advances, the rate of generating new organelle genomes has increased dramatically. However, the established ontology for chloroplast genes and gene features has not

  8. Environment Australia's Online Image Database

    NSDL National Science Digital Library

    Environment Australia -- Australia's Department of Environment and Heritage -- has made its extensive collection of photographs freely available for non-commercial use. Researchers and students in the environmental sciences may find this collection of well-composed, high-quality images a useful resource for presentations and publications. Users may easily search the database by keyword, general subject, and/or geographic area. Search results yield a table of thumbnail photos together with summary information for each image. Before publishing an image from the database, users must first contact Environment Australia (via provided Web form).

  9. Quantifying the Consistency of Scientific Databases

    PubMed Central

    Šubelj, Lovro; Bajec, Marko; Mileva Boshkoska, Biljana; Kastrin, Andrej; Levnajić, Zoran

    2015-01-01

    Science is a social process with far-reaching impact on our modern society. In recent years, for the first time we are able to scientifically study the science itself. This is enabled by massive amounts of data on scientific publications that is increasingly becoming available. The data is contained in several databases such as Web of Science or PubMed, maintained by various public and private entities. Unfortunately, these databases are not always consistent, which considerably hinders this study. Relying on the powerful framework of complex networks, we conduct a systematic analysis of the consistency among six major scientific databases. We found that identifying a single "best" database is far from easy. Nevertheless, our results indicate appreciable differences in mutual consistency of different databases, which we interpret as recipes for future bibliometric studies. PMID:25984946

  10. WEBrary: Putting Your In-House Databases on the Web.

    ERIC Educational Resources Information Center

    Justie, Kevin M.

    1999-01-01

    The WEBrary(R) databases at the Morton Grove Public Library (Illinois) provide patron-accessible searchable databases, easily available over the library's Web site. Database offerings include the locally maintained Song Collection Index, Obituary Index, Continuations Listings, On-Order files, topical and personalized New Acquisitions files, and…

  11. The taming of an impossible child: a standardized all-in approach to the phylogeny of Hymenoptera using public database sequences

    PubMed Central

    2011-01-01

    Background: Enormous molecular sequence data have been accumulated over the past several years and are still exponentially growing with the use of faster and cheaper sequencing techniques. There is high and widespread interest in using these data for phylogenetic analyses. However, the amount of data that one can retrieve from public sequence repositories is virtually impossible to tame without dedicated software that automates processes. Here we present a novel bioinformatics pipeline for downloading, formatting, filtering and analyzing public sequence data deposited in GenBank. It combines some well-established programs with numerous newly developed software tools (available at http://software.zfmk.de/). Results: We used the bioinformatics pipeline to investigate the phylogeny of the megadiverse insect order Hymenoptera (sawflies, bees, wasps and ants) by retrieving and processing more than 120,000 sequences and by selecting subsets under the criteria of compositional homogeneity and defined levels of density and overlap. Tree reconstruction was done with a partitioned maximum likelihood analysis from a supermatrix with more than 80,000 sites and more than 1,100 species. In the inferred tree, consistent with previous studies, "Symphyta" is paraphyletic. Within Apocrita, our analysis suggests a topology of Stephanoidea + (Ichneumonoidea + (Proctotrupomorpha + (Evanioidea + Aculeata))). Despite the huge amount of data, we identified several persistent problems in the Hymenoptera tree. Data coverage is still extremely low, and additional data have to be collected to reliably infer the phylogeny of Hymenoptera. Conclusions: While we applied our bioinformatics pipeline to Hymenoptera, we designed the approach to be as general as possible. With this pipeline, it is possible to produce phylogenetic trees for any taxonomic group and to monitor new data and tree robustness in a taxon of interest. It therefore has great potential to meet the challenges of the phylogenomic era and to deepen our understanding of the tree of life. PMID:21851592

  12. Mining of public sequencing databases supports a non-dietary origin for putative foreign miRNAs: underestimated effects of contamination in NGS

    PubMed Central

    Tosar, Juan Pablo; Rovira, Carlos; Naya, Hugo; Cayota, Alfonso

    2014-01-01

    The report that exogenous plant miRNAs are able to cross the mammalian gastrointestinal tract and exert gene-regulation mechanism in mammalian tissues has yielded a lot of controversy, both in the public press and the scientific literature. Despite the initial enthusiasm, reproducibility of these results was recently questioned by several authors. To analyze the causes of this unease, we searched for diet-derived miRNAs in deep-sequencing libraries performed by ourselves and others. We found variable amounts of plant miRNAs in publicly available small RNA-seq data sets of human tissues. In human spermatozoa, exogenous RNAs reached extreme, biologically meaningless levels. On the contrary, plant miRNAs were not detected in our sequencing of human sperm cells, which was performed in the absence of any known sources of plant contamination. We designed an experiment to show that cross-contamination during library preparation is a source of exogenous RNAs. These contamination-derived exogenous sequences even resisted oxidation with sodium periodate. To test the assumption that diet-derived miRNAs were actually contamination-derived, we sought in the literature for previous sequencing reports performed by the same group which reported the initial finding. We analyzed the spectra of plant miRNAs in a small RNA sequencing study performed in amphioxus by this group in 2009 and we found a very strong correlation with the plant miRNAs which they later reported in human sera. Even though contamination with exogenous sequences may be easy to detect, cross-contamination between samples from the same organism can go completely unnoticed, possibly affecting conclusions derived from NGS transcriptomics. PMID:24729469

  13. Mining of public sequencing databases supports a non-dietary origin for putative foreign miRNAs: underestimated effects of contamination in NGS.

    PubMed

    Tosar, Juan Pablo; Rovira, Carlos; Naya, Hugo; Cayota, Alfonso

    2014-06-01

    The report that exogenous plant miRNAs are able to cross the mammalian gastrointestinal tract and exert gene-regulation mechanism in mammalian tissues has yielded a lot of controversy, both in the public press and the scientific literature. Despite the initial enthusiasm, reproducibility of these results was recently questioned by several authors. To analyze the causes of this unease, we searched for diet-derived miRNAs in deep-sequencing libraries performed by ourselves and others. We found variable amounts of plant miRNAs in publicly available small RNA-seq data sets of human tissues. In human spermatozoa, exogenous RNAs reached extreme, biologically meaningless levels. On the contrary, plant miRNAs were not detected in our sequencing of human sperm cells, which was performed in the absence of any known sources of plant contamination. We designed an experiment to show that cross-contamination during library preparation is a source of exogenous RNAs. These contamination-derived exogenous sequences even resisted oxidation with sodium periodate. To test the assumption that diet-derived miRNAs were actually contamination-derived, we sought in the literature for previous sequencing reports performed by the same group which reported the initial finding. We analyzed the spectra of plant miRNAs in a small RNA sequencing study performed in amphioxus by this group in 2009 and we found a very strong correlation with the plant miRNAs which they later reported in human sera. Even though contamination with exogenous sequences may be easy to detect, cross-contamination between samples from the same organism can go completely unnoticed, possibly affecting conclusions derived from NGS transcriptomics. PMID:24729469

  14. Village Green Project: Web-accessible Database

    EPA Science Inventory

    The purpose of this web-accessible database is for the public to be able to view instantaneous readings from a solar-powered air monitoring station located in a public location (prototype pilot test is outside of a library in Durham County, NC). The data are wirelessly transmitte...

  15. Searching NCBI Databases Using Entrez.

    PubMed

    Gibney, Gretchen; Baxevanis, Andreas D

    2011-10-01

    One of the most widely used interfaces for the retrieval of information from biological databases is the NCBI Entrez system. Entrez capitalizes on the fact that there are pre-existing, logical relationships between the individual entries found in numerous public databases. The existence of such natural connections, mostly biological in nature, argued for the development of a method through which all the information about a particular biological entity could be found without having to sequentially visit and query disparate databases. Two basic protocols describe simple, text-based searches, illustrating the types of information that can be retrieved through the Entrez system. An alternate protocol builds upon the first basic protocol, using additional, built-in features of the Entrez system, and providing alternative ways to issue the initial query. The support protocol reviews how to save frequently issued queries. Finally, Cn3D, a structure visualization tool, is also discussed. PMID:21975942
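
    The protocols summarized above use the Entrez web interface. As a complement, the same kind of text search can be issued programmatically through NCBI's E-utilities, which the abstract does not cover. The sketch below only builds an ESearch request URL and makes no network call; the helper name build_esearch_url and the example query are hypothetical.

```python
from urllib.parse import urlencode

# Base address of the NCBI E-utilities service.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(db: str, term: str, retmax: int = 20) -> str:
    """Build an ESearch URL for a text query against one Entrez database."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


url = build_esearch_url("pubmed", "mitochondrial DNA database")
print(url)
```

    Fetching the URL (for example with urllib.request.urlopen) should return the identifiers of matching records. NCBI asks heavy users to supply an API key and to respect its request-rate limits, so those would need to be added before real use.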

  16. National Residential Efficiency Measures Database

    DOE Data Explorer

    The National Residential Efficiency Measures Database is a publicly available, centralized resource of residential building retrofit measures and costs for the U.S. building industry. With support from the U.S. Department of Energy, NREL developed this tool to help users determine the most cost-effective retrofit measures for improving energy efficiency of existing homes. Software developers who require residential retrofit performance and cost data for applications that evaluate residential efficiency measures are the primary audience for this database. In addition, home performance contractors and manufacturers of residential materials and equipment may find this information useful. The database offers the following types of retrofit measures: 1) Appliances, 2) Domestic Hot Water, 3) Enclosure, 4) Heating, Ventilating, and Air Conditioning (HVAC), 5) Lighting, 6) Miscellaneous.

  17. LOTUS-DB: an integrative and interactive database for Nelumbo nucifera study

    PubMed Central

    Wang, Kun; Deng, Jiao; Damaris, Rebecca Njeri; Yang, Mei; Xu, Liming; Yang, Pingfang

    2015-01-01

    Besides its important significance in plant taxonomy and phylogeny, sacred lotus (Nelumbo nucifera Gaertn.) might also hold the key to the secrets of aging, which attracts growing attention from researchers all over the world. The genetic or molecular studies on this species depend on its genome information. In 2013, two publications reported the sequencing of its full genome, based on which we constructed a database named LOTUS-DB. It will provide comprehensive information on the annotation, gene function and expression for the sacred lotus. The information will enable users to efficiently query and browse genes, graphically visualize the genome and download a variety of complex data on genome DNA, coding sequence (CDS), transcripts or peptide sequences, promoters and markers. It will accelerate research on gene cloning and functional identification in sacred lotus, and hence promote studies on this species and plant genomics as well. Database URL: http://lotus-db.wbgcas.cn. PMID:25819075

  18. LOTUS-DB: an integrative and interactive database for Nelumbo nucifera study.

    PubMed

    Wang, Kun; Deng, Jiao; Damaris, Rebecca Njeri; Yang, Mei; Xu, Liming; Yang, Pingfang

    2015-01-01

    Besides its important significance in plant taxonomy and phylogeny, sacred lotus (Nelumbo nucifera Gaertn.) might also hold the key to the secrets of aging, which attracts growing attention from researchers all over the world. The genetic or molecular studies on this species depend on its genome information. In 2013, two publications reported the sequencing of its full genome, based on which we constructed a database named LOTUS-DB. It will provide comprehensive information on the annotation, gene function and expression for the sacred lotus. The information will enable users to efficiently query and browse genes, graphically visualize the genome and download a variety of complex data on genome DNA, coding sequence (CDS), transcripts or peptide sequences, promoters and markers. It will accelerate research on gene cloning and functional identification in sacred lotus, and hence promote studies on this species and plant genomics as well. Database URL: http://lotus-db.wbgcas.cn PMID:25819075

  19. The UCSC Genome Browser database: 2015 update

    PubMed Central

    Rosenbloom, Kate R.; Armstrong, Joel; Barber, Galt P.; Casper, Jonathan; Clawson, Hiram; Diekhans, Mark; Dreszer, Timothy R.; Fujita, Pauline A.; Guruvadoo, Luvina; Haeussler, Maximilian; Harte, Rachel A.; Heitner, Steve; Hickey, Glenn; Hinrichs, Angie S.; Hubley, Robert; Karolchik, Donna; Learned, Katrina; Lee, Brian T.; Li, Chin H.; Miga, Karen H.; Nguyen, Ngan; Paten, Benedict; Raney, Brian J.; Smit, Arian F. A.; Speir, Matthew L.; Zweig, Ann S.; Haussler, David; Kuhn, Robert M.; Kent, W. James

    2015-01-01

    Launched in 2001 to showcase the draft human genome assembly, the UCSC Genome Browser database (http://genome.ucsc.edu) and associated tools continue to grow, providing a comprehensive resource of genome assemblies and annotations to scientists and students worldwide. Highlights of the past year include the release of a browser for the first new human genome reference assembly in 4 years in December 2013 (GRCh38, UCSC hg38), a watershed comparative genomics annotation (100-species multiple alignment and conservation) and a novel distribution mechanism for the browser (GBiB: Genome Browser in a Box). We created browsers for new species (Chinese hamster, elephant shark, minke whale), ‘mined the web’ for DNA sequences and expanded the browser display with stacked color graphs and region highlighting. As our user community increasingly adopts the UCSC track hub and assembly hub representations for sharing large-scale genomic annotation data sets and genome sequencing projects, our menu of public data hubs has tripled. PMID:25428374

  20. The UCSC Genome Browser database: 2015 update.

    PubMed

    Rosenbloom, Kate R; Armstrong, Joel; Barber, Galt P; Casper, Jonathan; Clawson, Hiram; Diekhans, Mark; Dreszer, Timothy R; Fujita, Pauline A; Guruvadoo, Luvina; Haeussler, Maximilian; Harte, Rachel A; Heitner, Steve; Hickey, Glenn; Hinrichs, Angie S; Hubley, Robert; Karolchik, Donna; Learned, Katrina; Lee, Brian T; Li, Chin H; Miga, Karen H; Nguyen, Ngan; Paten, Benedict; Raney, Brian J; Smit, Arian F A; Speir, Matthew L; Zweig, Ann S; Haussler, David; Kuhn, Robert M; Kent, W James

    2015-01-01

    Launched in 2001 to showcase the draft human genome assembly, the UCSC Genome Browser database (http://genome.ucsc.edu) and associated tools continue to grow, providing a comprehensive resource of genome assemblies and annotations to scientists and students worldwide. Highlights of the past year include the release of a browser for the first new human genome reference assembly in 4 years in December 2013 (GRCh38, UCSC hg38), a watershed comparative genomics annotation (100-species multiple alignment and conservation) and a novel distribution mechanism for the browser (GBiB: Genome Browser in a Box). We created browsers for new species (Chinese hamster, elephant shark, minke whale), 'mined the web' for DNA sequences and expanded the browser display with stacked color graphs and region highlighting. As our user community increasingly adopts the UCSC track hub and assembly hub representations for sharing large-scale genomic annotation data sets and genome sequencing projects, our menu of public data hubs has tripled. PMID:25428374

  1. World Database of Happiness

    NSDL National Science Digital Library

    The World Database of Happiness, maintained by Professor Ruut Veenhoven of Erasmus University Rotterdam, is a "continuous register of scientific research on subjective appreciation of life." This site contains the Bibliography of Happiness, a collection of over 3,000 scientific publications accessible by author or subject; the Catalog of Happiness in Nations, providing responses from over 1,500 national happiness surveys taken in 93 different countries between 1946 and 1996; the Catalog of Happiness Correlates, which presents the abstracts of correlational research findings from 662 studies worldwide; and finally, the Directory of Happiness Investigators, an international listing of more than 3,300 happiness researchers. Users may freely download the Bibliography or the Catalog of Happiness in Nations as compressed MS Access files (.zip), and download the full text of the Catalog of Correlates in compressed RTF format (.zip).

  2. Alcohol Studies Database

    NSDL National Science Digital Library

    Since 1987, staff members at the Rutgers University Center of Alcohol Studies have been collecting citations of documents related to alcohol. Today, they have over 80,000 citations and much of the material is related to research and professional materials that deal with the subject. Additionally, the database contains a small collection of educational and prevention materials designed for use by educators, parents, and public health workers. The site is maintained by the Scholarly Communication Center, the Center of Alcohol Studies, and the Rutgers University Libraries. Visitors to the site can search by subject, or perform a more advanced search as well. The site also includes a "Help" area, which includes information on limiting searches, links to full text, and suggestions on using Boolean techniques.

  3. CMAP: Complement Map Database

    PubMed Central

    Yang, Kun; Dinasarapu, Ashok R.; Reis, Edimara S.; DeAngelis, Robert A.; Ricklin, Daniel; Subramaniam, Shankar; Lambris, John D.

    2013-01-01

    Summary: The human complement system is increasingly perceived as an intricate protein network of effectors, inhibitors and regulators that drives critical processes in health and disease and extensively communicates with associated physiological pathways ranging from immunity and inflammation to homeostasis and development. A steady stream of experimental data reveals new fascinating connections at a rapid pace; although opening unique opportunities for research discoveries, the comprehensiveness and large diversity of experimental methods, nomenclatures and publication sources renders it highly challenging to keep up with the essential findings. With the Complement Map Database (CMAP), we have created a novel and easily accessible research tool to assist the complement community and scientists from related disciplines in exploring the complement network and discovering new connections. Availability: http://www.complement.us/cmap. Contact: lambris@upenn.edu Supplementary information: Supplementary data are available at Bioinformatics online. PMID:23661693

  4. National Geologic Map Database

    NSDL National Science Digital Library

    1997-01-01

    The National Geologic Map Database (NGMDB) is an Internet-based system for query and retrieval of earth-science map information, created as a collaborative effort between the USGS and the Association of American State Geologists. Its functions include providing a catalog of available map information; a data repository; and a source for general information on the nature and intended uses of the various types of earth-science information. The map catalog is a comprehensive, searchable catalog of all geoscience maps of the United States, in paper or digital format. It includes maps published in geological survey formal series and open-file series, maps in books, theses and dissertations, maps published by park associations, scientific societies, and other agencies, as well as publications that do not contain a map but instead provide a geological description of an area (for example, a state park). The geologic-names lexicon (GEOLEX) is a search tool for lithologic and geochronologic unit names. It now contains roughly 90% of the geologic names found in the most recent listing of USGS-approved geologic names. Current mapping activities at 1:24,000- and 1:100,000-scale are listed in the Geologic Mapping in Progress Database. Information on how to find topographic maps and list of geology-related links is also available.

  5. The ITPA disruption database

    NASA Astrophysics Data System (ADS)

    Eidietis, N. W.; Gerhardt, S. P.; Granetz, R. S.; Kawano, Y.; Lehnen, M.; Lister, J. B.; Pautasso, G.; Riccardo, V.; Tanna, R. L.; Thornton, A. J.; ITPA Disruption Database Participants, The

    2015-06-01

    A multi-device database of disruption characteristics has been developed under the auspices of the International Tokamak Physics Activity magneto-hydrodynamics topical group. The purpose of this ITPA disruption database (IDDB) is to find the commonalities between the disruption and disruption mitigation characteristics in a wide variety of tokamaks in order to elucidate the physics underlying tokamak disruptions and to extrapolate toward much larger devices, such as ITER and future burning plasma devices. In contrast to previous smaller disruption data collation efforts, the IDDB aims to provide significant context for each shot provided, allowing exploration of a wide array of relationships between pre-disruption and disruption parameters. The IDDB presently includes contributions from nine tokamaks, including both conventional aspect ratio and spherical tokamaks. An initial parametric analysis of the available data is presented. This analysis includes current quench rates, halo current fraction and peaking, and the effectiveness of massive impurity injection. The IDDB is publicly available, with instruction for access provided herein.

  6. Exploring DNA

    NSDL National Science Digital Library

    Mrs. Flitton

    2008-08-13

    Get ready to learn and explore DNA, genes and proteins. By moving through the different topics, you will hopefully gain greater understanding of how DNA, genes, and proteins are all related. DNA to Protein Module You will zoom into the human body to see and read more about DNA. The Journey Into DNA DNA Workshop Activity- You try it! More DNA and Protein Synthesis ...

  7. CD-ROM-aided Databases

    NASA Astrophysics Data System (ADS)

    Kitamura, Masami

    Nichigai Associates Inc. has begun information services to publish text databases on CD-ROM. Chapter 2 outlines these services and the publication plan for this fiscal year. Chapter 3 describes the CD-ROM logical file format common to these services, software that generates files conforming to the format, and software for retrieving CD-ROM files on personal computers.

  8. Curation accuracy of model organism databases

    PubMed Central

    Keseler, Ingrid M.; Skrzypek, Marek; Weerasinghe, Deepika; Chen, Albert Y.; Fulcher, Carol; Li, Gene-Wei; Lemmer, Kimberly C.; Mladinich, Katherine M.; Chow, Edmond D.; Sherlock, Gavin; Karp, Peter D.

    2014-01-01

    Manual extraction of information from the biomedical literature—or biocuration—is the central methodology used to construct many biological databases. For example, the UniProt protein database, the EcoCyc Escherichia coli database and the Candida Genome Database (CGD) are all based on biocuration. Biological databases are used extensively by life science researchers, as online encyclopedias, as aids in the interpretation of new experimental data and as golden standards for the development of new bioinformatics algorithms. Although manual curation has been assumed to be highly accurate, we are aware of only one previous study of biocuration accuracy. We assessed the accuracy of EcoCyc and CGD by manually selecting curated assertions within randomly chosen EcoCyc and CGD gene pages and by then validating that the data found in the referenced publications supported those assertions. A database assertion is considered to be in error if that assertion could not be found in the publication cited for that assertion. We identified 10 errors in the 633 facts that we validated across the two databases, for an overall error rate of 1.58%, and individual error rates of 1.82% for CGD and 1.40% for EcoCyc. These data suggest that manual curation of the experimental literature by Ph.D-level scientists is highly accurate. Database URL: http://ecocyc.org/, http://www.candidagenome.org// PMID:24923819
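
    The overall error rate quoted in this record follows directly from the reported counts (10 errors in 633 validated facts). The Python sketch below reproduces that figure. The Wilson score interval is an illustrative addition that is not part of the study, and the function names are hypothetical.

```python
import math


def error_rate(errors: int, total: int) -> float:
    """Fraction of validated facts that were found to be in error."""
    if total <= 0:
        raise ValueError("total must be positive")
    return errors / total


def wilson_interval(errors: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion.

    Not taken from the record: added only to show how much uncertainty
    a rate estimated from a few hundred facts carries.
    """
    p = error_rate(errors, total)
    denom = 1 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


# Counts reported in the abstract: 10 errors among 633 validated facts.
overall = error_rate(10, 633)
low, high = wilson_interval(10, 633)
print(f"overall error rate: {overall:.2%}")
print(f"approximate 95% interval: {low:.2%} to {high:.2%}")
```

    The per-database rates (1.82% for CGD, 1.40% for EcoCyc) cannot be recomputed this way, because the abstract does not give the per-database counts.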

  9. Complete Genomic DNA Sequence of the East Asian Spotted Fever Disease Agent Rickettsia japonica

    PubMed Central

    Matsutani, Minenosuke; Ogawa, Motohiko; Takaoka, Naohisa; Hanaoka, Nozomu; Toh, Hidehiro; Yamashita, Atsushi; Oshima, Kenshiro; Hirakawa, Hideki; Kuhara, Satoru; Suzuki, Harumi; Hattori, Masahira; Kishimoto, Toshio; Ando, Shuji; Azuma, Yoshinao; Shirai, Mutsunori

    2013-01-01

    Rickettsia japonica is an obligate intracellular alphaproteobacteria that causes tick-borne Japanese spotted fever, which has spread throughout East Asia. We determined the complete genomic DNA sequence of R. japonica type strain YH (VR-1363), which consists of 1,283,087 base pairs (bp) and 971 protein-coding genes. Comparison of the genomic DNA sequence of R. japonica with other rickettsiae in the public databases showed that 2 regions (4,323 and 216 bp) were conserved in a very narrow range of Rickettsia species, and the shorter one was inserted in, and disrupted, a preexisting open reading frame (ORF). While it is unknown how the DNA sequences were acquired in R. japonica genomes, it may be a useful signature for the diagnosis of Rickettsia species. Instead of the species-specific inserted DNA sequences, rickettsial genomes contain Rickettsia-specific palindromic elements (RPEs), which are also capable of locating in preexisting ORFs. Precise alignments of protein and DNA sequences involving RPEs showed that when a gene contains an inserted DNA sequence, each rickettsial ortholog carried an inserted DNA sequence at the same locus. The sequence, ATGAC, was shown to be highly frequent and thus characteristic in certain RPEs (RPE-4, RPE-6, and RPE-7). This finding implies that RPE-4, RPE-6, and RPE-7 were derived from a common inserted DNA sequence. PMID:24039725

  10. Genome databases worry about yeast (and other) infections

    Microsoft Academic Search

    C. Anderson

    1993-01-01

    A group of 2000 DNA sequences representing human genes was published in October 1992 by Genethon. Since then, more than half (possibly up to 85%) of the sequences appear to be from yeast and unidentified bacteria. This represents the problem of researchers submitting sequences directly to databases without peer review or serious error checking. While database managers rely on scientists

  11. Incorporating Metric Access Methods for Similarity Searching on Oracle Database

    Microsoft Academic Search

    Daniel S. Kaster; Pedro Henrique Bugatti; Agma J. M. Traina; Caetano Traina Jr.

    2009-01-01

    The volume of multimedia and complex data (images, videos, audio, time series, DNA sequences, and others) has been growing at a very fast pace. Thus, it is necessary to store in databases many types of data which are not naturally handled by Database Management Systems (DBMSs). Complex data are well-suited to be queried by similarity. Many works addressed techniques

  12. Nanotechnology with DNA DNA Nanodevices

    E-print Network

    Ludwig-Maximilians-Universität, München

    Nanotechnology with DNA DNA Nanodevices Friedrich C. Simmel* and Wendy U. Dittmer A DNA actuator. Introduction 2. Overview: DNA Nanotechnology 3. Prototypes of Nanomechanical DNA overview of DNA nanotechnology as a whole is given. The most important properties of DNA molecules

  13. Databases: Beyond the Basics.

    ERIC Educational Resources Information Center

    Whittaker, Robert

    This presented paper offers an elementary description of database characteristics and then provides a survey of databases that may be useful to the teacher and researcher in Slavic and East European languages and literatures. The survey focuses on commercial databases that are available, usable, and needed. Individual databases discussed include:…

  14. Making database systems usable

    Microsoft Academic Search

    H. V. Jagadish; Adriane Chapman; Aaron Elkiss; Magesh Jayapandian; Yunyao Li; Arnab Nandi; Cong Yu

    2007-01-01

    Database researchers have striven to improve the capability of a database in terms of both performance and functionality. We assert that the usability of a database is as important as its capability. In this paper, we study why database systems today are so difficult to use. We identify a set of five pain points and propose a research

  15. Human Mitochondrial Protein Database

    National Institute of Standards and Technology Data Gateway

    SRD 131 Human Mitochondrial Protein Database (Web, free access)   The Human Mitochondrial Protein Database (HMPDb) provides comprehensive data on mitochondrial and human nuclear encoded proteins involved in mitochondrial biogenesis and function. This database consolidates information from SwissProt, LocusLink, Protein Data Bank (PDB), GenBank, Genome Database (GDB), Online Mendelian Inheritance in Man (OMIM), Human Mitochondrial Genome Database (mtDB), MITOMAP, Neuromuscular Disease Center and Human 2-D PAGE Databases. This database is intended as a tool not only to aid in studying the mitochondrion but in studying the associated diseases.

  16. DNA microarray (spot) .

    E-print Network

    1. DNA microarray DNA (spot) . DNA probe , probe (hybridization) . DNA microarray cDNA oligonucleotide oligonucleotide cDNA probe . oligonucleotide microarray , DNA , probe . oligonucleotide microarray probe

  17. LncRNADisease: a database for long-non-coding RNA-associated diseases

    PubMed Central

    Chen, Geng; Wang, Ziyun; Wang, Dongqing; Qiu, Chengxiang; Liu, Mingxi; Chen, Xing; Zhang, Qipeng; Yan, Guiying; Cui, Qinghua

    2013-01-01

    In this article, we describe a long-non-coding RNA (lncRNA) and disease association database (LncRNADisease), which is publicly accessible at http://cmbi.bjmu.edu.cn/lncrnadisease. In recent years, a large number of lncRNAs have been identified and increasing evidence shows that lncRNAs play critical roles in various biological processes. Therefore, the dysfunctions of lncRNAs are associated with a wide range of diseases. It thus becomes important to understand lncRNAs’ roles in diseases and to identify candidate lncRNAs for disease diagnosis, treatment and prognosis. For this purpose, a high-quality lncRNA–disease association database would be extremely beneficial. Here, we describe the LncRNADisease database that collected and curated approximately 480 entries of experimentally supported lncRNA–disease associations, including 166 diseases. LncRNADisease also curated 478 entries of lncRNA interacting partners at various molecular levels, including protein, RNA, miRNA and DNA. Moreover, we annotated lncRNA–disease associations with genomic information, sequences, references and species. We normalized the disease name and the type of lncRNA dysfunction and provided a detailed description for each entry. Finally, we developed a bioinformatic method to predict novel lncRNA–disease associations and integrated the method and the predicted associated diseases of 1564 human lncRNAs into the database. PMID:23175614

  18. Challenges in the association of human single nucleotide polymorphism mentions with unique database identifiers

    PubMed Central

    2011-01-01

    Background Most information on genomic variations and their associations with phenotypes is covered exclusively in scientific publications rather than in structured databases. These texts commonly describe variations using natural language; database identifiers are seldom mentioned. This complicates the retrieval of variations, associated articles, as well as information extraction, e.g. the search for biological implications. To overcome these challenges, procedures to map textual mentions of variations to database identifiers need to be developed. Results This article describes a workflow for normalization of variation mentions, i.e. the association of them to unique database identifiers. Common pitfalls in the interpretation of single nucleotide polymorphism (SNP) mentions are highlighted and discussed. The developed normalization procedure achieves a precision of 98.1% and a recall of 67.5% for unambiguous association of variation mentions with dbSNP identifiers on a text corpus based on 296 MEDLINE abstracts containing 527 mentions of SNPs. The annotated corpus is freely available at http://www.scai.fraunhofer.de/snp-normalization-corpus.html. Conclusions Comparable approaches usually focus on variations mentioned on the protein sequence and neglect problems for other SNP mentions. The results presented here indicate that normalizing SNPs described on DNA level is more difficult than the normalization of SNPs described on protein level. The challenges associated with normalization are exemplified with ambiguities and errors, which occur in this corpus. PMID:21992066
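
    As a reading aid for the precision and recall figures in this record, the Python sketch below states the usual definitions and combines the two reported values into an F1 score. The F1 value is derived here and is not reported in the abstract. The function names and the counts in the usage lines are hypothetical, since the abstract gives no true-positive, false-positive or false-negative counts.

```python
def precision(true_pos: int, false_pos: int) -> float:
    """Share of the proposed identifier mappings that are correct."""
    return true_pos / (true_pos + false_pos)


def recall(true_pos: int, false_neg: int) -> float:
    """Share of the annotated mentions that received a correct mapping."""
    return true_pos / (true_pos + false_neg)


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    if p + r == 0:
        return 0.0
    return 2 * p * r / (p + r)


# Values reported in the abstract: precision 98.1%, recall 67.5%.
print(f"F1 from reported values: {f1_score(0.981, 0.675):.3f}")

# Hypothetical counts, only to show how the definitions are applied.
print(precision(true_pos=8, false_pos=2), recall(true_pos=8, false_neg=8))
```

    The gap between the two values reads as a conservative procedure: nearly every mapping it makes is right, but about a third of the mentions are left unmapped.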

  19. Geminivirus database (GVDB): first database of family Geminiviridae and its genera Begomovirus.

    PubMed

    Prajapat, Rajneesh; Marwal, Avinash; Shaikh, Zuber; Gaur, Rajarshi Kumar

    2012-07-15

    Geminivirus Database (GVDB) is an online interactive database of the Geminiviridae family. GVDB comprises partial and complete nucleotide sequences along with duly annotated expressed genes of isolated Begomovirus species. The in silico homology modeling, docking and recombination results obtained for different begomoviral sequences are also included. The database holds comprehensive information about Geminivirus members that cause infection in various plant species in India, ranging from crops and ornamental plants to common weeds. The home page of this database offers various links associated with current research projects and also the publications related to molecular and in silico study of Begomovirus infection. The main feature of GVDB is a flexible database design based on PHP that allows easy retrieval of the information. The database is made available at www.wikigeminivirus.org. PMID:24171254

  20. Silicon Valley Companies Database (SV150)

    NSDL National Science Digital Library

    Created by Mercury Center, the online service of the San Jose Mercury News, this database offers financial information and company background for the 150 largest publicly traded companies in Silicon Valley. Silicon Valley is defined as the cities of Santa Cruz and Santa Clara, as well as the southern sections of San Mateo and Alameda counties. The database is searchable by company name, stock symbol, 1997 sales, industry type, product, and location. Clear, detailed instructions will help users best use the database. The search results link to company homepages and charted stock prices.

  1. An occurence records database of French Guiana harvestmen (Arachnida, Opiliones)

    PubMed Central

    Solbès, Pierre; Grosso, Bernadette

    2014-01-01

    Abstract This dataset provides information on specimens of harvestmen (Arthropoda, Arachnida, Opiliones) collected in French Guiana. Field collections have been initiated in 2012 within the framework of the CEnter for the Study of Biodiversity in Amazonia (CEBA: www.labex-ceba.fr/en/). This dataset is a work in progress.  Occurrences are recorded in an online database stored at the EDB laboratory after each collecting trip and the dataset is updated on a monthly basis. Voucher specimens and associated DNA are also stored at the EDB laboratory until deposition in natural history Museums. The latest version of the dataset is publicly and freely accessible through our Integrated Publication Toolkit at http://130.120.204.55:8080/ipt/resource.do?r=harvestmen_of_french_guiana or through the Global Biodiversity Information Facility data portal at http://www.gbif.org/dataset/3c9e2297-bf20-4827-928e-7c7eefd9432c. PMID:25589875

  2. RefSeq microbial genomes database: new representation and annotation strategy.

    PubMed

    Tatusova, Tatiana; Ciufo, Stacy; Fedorov, Boris; O'Neill, Kathleen; Tolstoy, Igor

    2014-01-01

    The source of the microbial genomic sequences in the RefSeq collection is the set of primary sequence records submitted to the International Nucleotide Sequence Database public archives. These can be accessed through the Entrez search and retrieval system at http://www.ncbi.nlm.nih.gov/genome. Next-generation sequencing has enabled researchers to perform genomic sequencing at rates that were unimaginable in the past. Microbial genomes can now be sequenced in a matter of hours, which has led to a significant increase in the number of assembled genomes deposited in the public archives. This huge increase in DNA sequence data presents new challenges for the annotation, analysis and visualization bioinformatics tools. New strategies have been developed for the annotation and representation of reference genomes and sequence variations derived from population studies and clinical outbreaks. PMID:24316578

  3. Porcine transcriptome analysis based on 97 non-normalized cDNA libraries and assembly of 1,021,891 expressed sequence tags

    Microsoft Academic Search

    Jan Gorodkin; Susanna Cirera; Jakob Hedegaard; Michael J Gilchrist; Frank Panitz; Claus Jørgensen; Karsten Scheibye-Knudsen; Troels Arvin; Steen Lumholdt; Milena Sawera; Trine Green; Bente J Nielsen; Jakob H Havgaard; Carina Rosenkilde; Jun Wang; Heng Li; Ruiqiang Li; Bin Liu; Songnian Hu; Wei Dong; Wei Li; Jun Yu; Jian Wang; Hans-Henrik Stærfeldt; Rasmus Wernersson; Lone B Madsen; Bo Thomsen; Henrik Hornshøj; Zhan Bujie; Xuegang Wang; Xuefei Wang; Lars Bolund; Søren Brunak; Huanming Yang; Christian Bendixen; Merete Fredholm

    2007-01-01

    BACKGROUND: Knowledge of the structure of gene expression is essential for mammalian transcriptomics research. We analyzed a collection of more than one million porcine expressed sequence tags (ESTs), of which two-thirds were generated in the Sino-Danish Pig Genome Project and one-third are from public databases. The Sino-Danish ESTs were generated from one normalized and 97 non-normalized cDNA libraries representing

  4. Pregnancy-associated plasma protein-E (PAPPE). The cDNA and protein data reported here have been deposited in the EMBL Nucleotide Sequence Database under accession number AJ278348, HSA278348.

    Microsoft Academic Search

    Martin Farr; Jörg Strübe; Harald-Gerhard Geppert; Andreas Kocourek; Martina Mahne; Harald Tschesche

    2000-01-01

    A full-length cDNA encoding a novel human protein was cloned from placenta cDNA. The corresponding 1542 amino acid protein sequence was termed ‘pregnancy-associated plasma protein-E’ (PAPP-E) as it shows a 62% homology to the human pregnancy-associated plasma protein-A (PAPP-A) that is a diagnostic marker for trisomies, especially Down syndrome. The conserved domain structure contains five motifs related to the short

  5. PROGRESS REPORT ON THE DSSTOX DATABASE NETWORK: NEWLY LAUNCHED WEBSITE, APPLICATIONS, FUTURE PLANS

    EPA Science Inventory

    Progress will be reported on development of the Distributed Structure-Searchable Toxicity (DSSTox) Database Network and the newly launched public website that coordinates and...

  6. WLN's Database: New Directions.

    ERIC Educational Resources Information Center

    Ziegman, Bruce N.

    1988-01-01

    Describes features of the Western Library Network's database, including the database structure, authority control, contents, quality control, and distribution methods. The discussion covers changes in distribution necessitated by increasing telecommunications costs and the development of optical data disk products. (CLB)

  7. The DNA Binding Site(s) of the Escherichia coli RecA Protein* (Received for publication, August 23, 1995, and in revised form, March 10, 1996)

    E-print Network

    Kowalczykowski, Stephen C.

    in the Escherichia coli RecA protein that are proximal to and may directly mediate binding of DNA. Ultraviolet of covalent linkages between nucleotide bases and amino acids through the action of ultraviolet irradiation

  8. DOLOP: A Database of Bacterial Lipoproteins

    NSDL National Science Digital Library

    M. Maden Babu (MRC-Laboratory of Molecular Biology, Cambridge; )

    2001-09-15

    Bacteria rely on protein-lipid combinations known as lipoproteins to glom onto surfaces, sense their surroundings, slurp up nutrients, shuttle DNA to other cells, and perform other life tasks. Researchers can analyze more than 270 of the molecules at DOLOP, a database from the Medical Research Council Laboratory of Molecular Biology in Cambridge, U.K. Entries describe each protein, indicate its size and function, and provide links to the Swiss-Prot database, where you can parse the molecule's sequence and structural features. The site also explains the synthesis of lipoproteins and describes the lipobox, a characteristic amino acid string to which lipids attach.

  9. Comprehensive Thematic T-matrix Reference Database: a 2013-2014 Update

    NASA Technical Reports Server (NTRS)

    Mishchenko, Michael I.; Zakharova, Nadezhda T.; Khlebtsov, Nikolai G.; Wriedt, Thomas; Videen, Gorden

    2014-01-01

    This paper is the sixth update to the comprehensive thematic database of peer-reviewed T-matrix publications initiated by us in 2004 and includes relevant publications that have appeared since 2013. It also lists several earlier publications not incorporated in the original database and previous updates.

  10. Statistical database design

    Microsoft Academic Search

    Francis Y. L. Chin; Gultekin Ozsoyoglu

    1981-01-01

    The security problem of a statistical database is to limit the use of the database so that no sequence of statistical queries is sufficient to deduce confidential or private information. In this paper it is suggested that the problem be investigated at the conceptual data model level. The design of a statistical database should utilize a statistical security management facility

  11. ORACLE CERTIFICATION Oracle Database

    E-print Network

    Loudon, Catherine

    ORACLE CERTIFICATION Oracle Database Administration Certificate Program Train with the best. Get your Oracle Database Administration education from the number-one provider* of Oracle training-on, lab-based understanding of Oracle, the world's leading database platform, and long the product

  12. Full-Text Databases.

    ERIC Educational Resources Information Center

    Siddiqui, Moid A.

    1991-01-01

    This review of the literature on full-text databases provides information on search strategy, performance measurement, and the benefits and limitations of full text compared to bibliographic database searching. Various use studies and uses of full-text databases are also listed. (21 references) (LAE)

  13. Decision Points for Databases.

    ERIC Educational Resources Information Center

    Basch, Reva

    1992-01-01

    Argues that the selection of a database is a significant factor in the success and cost effectiveness of an online search, and provides guidelines for determining whether the content of a database is relevant for a particular search and whether the database is accessible, affordable, and suitable for the search. (LAE)

  14. General Purpose Database Summarization

    Microsoft Academic Search

    Régis Saint-Paul; Guillaume Raschia; Noureddine Mouaddib

    2005-01-01

    In this paper, a message-oriented architecture for large database summarization is presented. The summarization system takes a database table as input and produces a reduced version of this table through both a rewriting and a generalization process. The resulting table provides tuples with less precision than the original but yet are very informative of the actual content of the database.

  15. DNA Computing Hamiltonian path

    E-print Network

    Hagiya, Masami

    DNA Computing (2014): Feynman; Adleman; DNASIMD; ... Timeline 1995-2010: Hamiltonian path; DNA tweezers; DNA tile; DNA origami; DNA box; Sierpinski DNA tile self assembly; DNA logic gates; Whiplash PCR; DNA automaton; DNA spider; MAYA

  16. GOLD: The Genomes Online Database

    DOE Data Explorer

    Kyrpides, Nikos; Liolios, Dinos; Chen, Amy; Tavernarakis, Nektarios; Hugenholtz, Philip; Markowitz, Victor; Bernal, Alex

    Since its inception in 1997, GOLD has continuously monitored genome sequencing projects worldwide and has provided the community with a unique centralized resource that integrates diverse information related to Archaea, Bacteria, Eukaryotic and, more recently, Metagenomic sequencing projects. As of September 2007, GOLD recorded 639 completed genome projects. These projects have their complete sequence deposited into the public archival sequence databases such as GenBank, EMBL, and DDBJ. From the total of 639 complete and published genome projects as of 9/2007, 527 were bacterial, 47 were archaeal and 65 were eukaryotic. In addition to the complete projects, there were 2158 ongoing sequencing projects. 1328 of those were bacterial, 59 archaeal and 771 eukaryotic projects. Two types of metadata are provided by GOLD: (i) project metadata and (ii) organism/environment metadata. GOLD CARD pages for every project are available from the link of every GOLD_STAMP ID. The information in every one of these pages is organized into three tables: (a) Organism information, (b) Genome project information and (c) External links. [The Genomes On Line Database (GOLD) in 2007: Status of genomic and metagenomic projects and their associated metadata, Konstantinos Liolios, Konstantinos Mavromatis, Nektarios Tavernarakis and Nikos C. Kyrpides, Nucleic Acids Research Advance Access published online on November 2, 2007, Nucleic Acids Research, doi:10.1093/nar/gkm884]
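
    The per-domain project counts quoted in this record can be checked against the stated totals with a short tally. The Python sketch below uses only the numbers given in the abstract. The dictionary layout and function names are hypothetical and do not reflect how GOLD stores its data.

```python
# Counts as stated in the record (status as of September 2007).
completed = {"bacterial": 527, "archaeal": 47, "eukaryotic": 65}
ongoing = {"bacterial": 1328, "archaeal": 59, "eukaryotic": 771}


def matches_total(by_domain: dict[str, int], reported_total: int) -> bool:
    """True when the per-domain counts add up to the reported total."""
    return sum(by_domain.values()) == reported_total


def shares(by_domain: dict[str, int]) -> dict[str, float]:
    """Fraction of projects contributed by each domain."""
    total = sum(by_domain.values())
    return {domain: count / total for domain, count in by_domain.items()}


print("completed projects consistent:", matches_total(completed, 639))
print("ongoing projects consistent:", matches_total(ongoing, 2158))
for domain, share in shares(completed).items():
    print(f"{domain}: {share:.1%} of completed projects")
```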

    The basic tables in the GOLD database that can be browsed or searched include the following information:

    • Gold Stamp ID
    • Organism name
    • Domain
    • Links to information sources
    • Size and link to a map, when available
    • Chromosome number, plasmid number, and GC content
    • A link for downloading the actual genome data
    • Institution that did the sequencing
    • Funding source
    • Database where information resides
    • Publication status and information

    (Specialized Interface)

  17. Which specimens from a museum collection will yield DNA barcodes? A time series study of spiders in alcohol

    PubMed Central

    Miller, Jeremy A.; Beentjes, Kevin K.; van Helsdingen, Peter; IJland, Steven

    2013-01-01

    Abstract We report initial results from an ongoing effort to build a library of DNA barcode sequences for Dutch spiders and investigate the utility of museum collections as a source of specimens for barcoding spiders. Source material for the library comes from a combination of specimens freshly collected in the field specifically for this project and museum specimens collected in the past. For the museum specimens, we focus on 31 species that have been frequently collected over the past several decades. A series of progressively older specimens representing these 31 species were selected for DNA barcoding. Based on the pattern of sequencing successes and failures, we find that smaller-bodied species expire before larger-bodied species as tissue sources for single-PCR standard DNA barcoding. Body size and age of oldest successful DNA barcode are significantly correlated after factoring out phylogenetic effects using independent contrasts analysis. We found some evidence that extracted DNA concentration is correlated with body size and inversely correlated with time since collection, but these relationships are neither strong nor consistent. DNA was extracted from all specimens using standard destructive techniques involving the removal and grinding of tissue. A subset of specimens was selected to evaluate nondestructive extraction. Nondestructive extractions significantly extended the DNA barcoding shelf life of museum specimens, especially small-bodied species, and yielded higher DNA concentrations compared to destructive extractions. All primary data are publicly available through a Dryad archive and the Barcode of Life database. PMID:24453561

  18. A web-database of mammalian morphology and a reanalysis of placental phylogeny

    PubMed Central

    Asher, Robert J

    2007-01-01

    Background Recent publications concerning the interordinal phylogeny of placental mammals have converged on a common signal, consisting of four major radiations with some ambiguity regarding the placental root. The DNA data with which these relationships have been reconstructed are easily accessible from public databases; access to morphological characters is much more difficult. Here, I present a graphical web-database of morphological characters focusing on placental mammals, in tandem with a combined-data phylogenetic analysis of placental mammal phylogeny. Results The results reinforce the growing consensus regarding the extant placental mammal clades of Afrotheria, Xenarthra, Euarchontoglires, and Laurasiatheria. Unweighted parsimony applied to all DNA sequences and insertion-deletion (indel) characters of extant taxa alone supports a placental root at murid rodents; combined with morphology this shifts to Afrotheria. Bayesian analyses of morphology, indels, and DNA support both a basal position for Afrotheria and the position of Cretaceous eutherians outside of crown Placentalia. Depending on treatment of third codon positions, the affinities of several fossils (Leptictis, Paleoparadoxia, Plesiorycteropus and Zalambdalestes) vary, highlighting the potential effect of sequence data on fossils for which such data are missing. Conclusion The combined dataset supports the location of the placental mammal root at Afrotheria or Xenarthra, not at Erinaceus or rodents. Even a small morphological dataset can have a marked influence on the location of the root in a combined-data analysis. Additional morphological data are desirable to better reconstruct the position of several fossil taxa; and the graphic-rich, web-based morphology data matrix presented here will make it easier to incorporate more taxa into a larger data matrix. PMID:17608930

  19. ETC Spills Technology Databases: Oil Properties Database

    NSDL National Science Digital Library

    Fieldhouse, B.

    The Environmental Technology Center of Environment Canada provides a database which contains various properties of crude oils and petroleum products. In addition to the listing of oils, the database reports properties "which will likely determine the environmental behavior and effects of spilled oil." The user may select an oil from a pull-down menu that lists 412 oils. The data are organized into tables in the following areas: Origin, API Gravity, Density, Pour Point, Dynamic Viscosity, Hydrocarbon Groups, and Distillation.

  20. Cancer Control Publications 1998-2011: About CC Publications

    Cancer.gov

    The Division of Cancer Control and Population Sciences (DCCPS) funds over 900 grants annually. CC Publications is a searchable database that highlights publications resulting from DCCPS-funded research and staff research findings. These publications have been retrieved through PubMed, grants final reports, and staff reporting.

  1. Women in Politics: Bibliographic Database

    NSDL National Science Digital Library

    This bibliographic database currently holds 650 titles of recent works concerned with women in politics. A new addition to the Inter-Parliamentary Union's "Democracy through Partnership between Men and Women in Politics" site, "it provides bibliographic references to books, reports and journal articles on all aspects of women's participation in political life worldwide." The search mechanism allows users to specify type of document, geographic region, publishing organization, subject matter, author, title of periodical, and year of publication. Alternatively, there is also a subject keyword search. For more information about the Inter-Parliamentary Union Website, see the December 12, 1997 Scout Report.

  2. The EMBL Nucleotide Sequence Database: major new developments

    Microsoft Academic Search

    Guenter Stoesser; Wendy Baker; Alexandra Van Den Broek; Maria Garcia-pastor; Carola Kanz; Tamara Kulikova; Rasko Leinonen; Quan Lin; Vincent Lombard; Rodrigo Lopez; Renato Mancuso; Francesco Nardone; Peter Stoehr; Mary Ann Tuli; Katerina Tzouvara; Robert Vaughan

    2003-01-01

    The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/) incorporates, organizes and distributes nucleotide sequences from all available public sources. The database is located and maintained at the European Bioinformatics Institute (EBI) near Cambridge, UK. In an international collaboration with DDBJ (Japan) and GenBank (USA), data are exchanged amongst the collaborating databases on a daily basis to achieve optimal synchronization.

  3. The Combined Health Information Database (CHID) Online

    NSDL National Science Digital Library

    1997-01-01

    A joint project of the National Institutes of Health and the Centers for Disease Control and Prevention, this unified database of bibliographic records has been available to the public since 1985, and now sports a clean new interface. There are sixteen separately maintained databases that can be searched individually or at once, ranging from AIDS Education and Alzheimers Disease to Cancer Prevention and Weight Control. The simple search interface offers a single box into which keywords are entered. The detailed search interface allows the user to specify date of publication, media type, and language, and provides multiple query boxes that may be linked together by Boolean operators. Searches return lists of matches, from which individual bibliographic records (including abstracts) may be viewed. Reprint ordering procedures are also listed. Users may also browse information on the scope and coverage of each of the sixteen databases.

  4. DNA Restriction

    NSDL National Science Digital Library

    The discovery of enzymes that could cut and paste DNA made genetic engineering possible. Restriction enzymes, found naturally in bacteria, can be used to cut DNA at specific sequences, while another enzyme, DNA ligase, can attach or rejoin DNA fragments with complementary ends. This animation from Cold Spring Harbor Laboratory's Dolan DNA Learning Center presents DNA restriction through a series of illustrations of the processes involved.

  5. DNA sequence evolution through Integral Value Transformations.

    PubMed

    Hassan, Sk Sarif; Choudhury, Pabitra Pal; Guha, Ranita; Chakraborty, Shantanav; Goswami, Arunava

    2012-06-01

    Cellular Automata (CA) play a significant role in deciphering DNA structure, evolution and function. DNA can be thought of as a one-dimensional multi-state CA, more precisely a four-state CA whose states A, T, C, and G can be taken as the numerals 0, 1, 2 and 3. Earlier, Sirakoulis et al. (2003) modeled DNA structure, evolution and function through a quaternary-logic one-dimensional CA and simulated DNA evolution with the help of only four linear CA rules. We observe, however, that the DNA sequences produced through these CA evolutions do not exist in the established databases of various genomes, although the initial seed (the initial global state of the CA) was taken from the database. This problem motivated us to study DNA evolution from a more fundamental point of view. Parallel to the CA paradigm, we have devised an enriched set of discrete transformations named Integral Value Transformations (IVT). Interestingly, on applying the IVT systematically, we have been able to show that each DNA sequence at the various discrete time instances of an IVT evolution can be directly mapped to a specific DNA sequence existing in the database. This has been possible through our efforts to obtain quantitative mathematical parameters of the DNA sequences involving fractals. Thus we have at our disposal a transformational mechanism from one DNA sequence to another. PMID:22843235
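
    The encoding described in this abstract (A, T, C, G taken as 0, 1, 2, 3, with the sequence treated as a one-dimensional four-state CA) can be illustrated with a minimal Python sketch. The update rule used here, each cell becoming the sum of its two neighbours modulo 4 with periodic boundaries, is a hypothetical linear rule chosen only for illustration; it is not one of the four rules of Sirakoulis et al., nor an Integral Value Transformation, and the function names are assumptions.

```python
# Base-to-digit mapping as stated in the abstract: A, T, C, G -> 0, 1, 2, 3.
BASE_TO_DIGIT = {"A": 0, "T": 1, "C": 2, "G": 3}
DIGIT_TO_BASE = {digit: base for base, digit in BASE_TO_DIGIT.items()}


def encode(seq):
    """Turn a DNA string into a list of quaternary digits."""
    return [BASE_TO_DIGIT[base] for base in seq.upper()]


def decode(cells):
    """Turn a list of quaternary digits back into a DNA string."""
    return "".join(DIGIT_TO_BASE[cell] for cell in cells)


def ca_step(cells):
    """Apply one synchronous update of a hypothetical linear rule.

    Each cell becomes (left neighbour + right neighbour) mod 4,
    with periodic (wrap-around) boundaries. Illustrative only.
    """
    n = len(cells)
    return [(cells[(i - 1) % n] + cells[(i + 1) % n]) % 4 for i in range(n)]


def evolve(seq, steps):
    """Return the seed sequence followed by `steps` successive CA states."""
    cells = encode(seq)
    history = [decode(cells)]
    for _ in range(steps):
        cells = ca_step(cells)
        history.append(decode(cells))
    return history


if __name__ == "__main__":
    for generation, state in enumerate(evolve("ATCG", 3)):
        print(generation, state)
```

    Under this illustrative rule the seed ATCG becomes ACAC after one step. Checking whether such evolved sequences occur in a genome database, which is the question the abstract raises, would be a separate lookup that this sketch does not attempt.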

  6. Final Report on Atomic Database Project

    Microsoft Academic Search

    Yuan; Moses

    2006-01-01

    Atomic physics in hot dense plasmas is essential for understanding the radiative properties of plasmas either produced terrestrially such as in fusion energy research or in space such as the study of the core of the sun. Various kinds of atomic data are needed for spectrum analysis or for radiation hydrodynamics simulations. There are many atomic databases accessible publicly through

  7. The HITRAN 2008 Molecular Spectroscopic Database

    NASA Technical Reports Server (NTRS)

    Rothman, Laurence S.; Gordon, Iouli E.; Barbe, Alain; Benner, D. Chris; Bernath, Peter F.; Birk, Manfred; Boudon, V.; Brown, Linda R.; Campargue, Alain; Champion, J.-P.; Chance, Kelly V.; Coudert, L. H.; Sung, K.; Toth, R. A.

    2009-01-01

    This paper describes the status of the 2008 edition of the HITRAN molecular spectroscopic database. The new edition is the first official public release since the 2004 edition, although a number of crucial updates had been made available online since 2004. The HITRAN compilation consists of several components that serve as input for radiative-transfer calculation codes: individual line parameters for the microwave through visible spectra of molecules in the gas phase; absorption cross-sections for molecules having dense spectral features, i.e., spectra in which the individual lines are not resolved; individual line parameters and absorption cross sections for bands in the ultra-violet; refractive indices of aerosols, tables and files of general properties associated with the database; and database management software. The line-by-line portion of the database contains spectroscopic parameters for forty-two molecules including many of their isotopologues.

  8. GOVERNING GENETIC DATABASES: COLLECTION, STORAGE AND USE.

    PubMed

    Gibbons, Susan M C; Kaye, Jane

    2007-01-01

    This paper provides an introduction to a collection of five papers, published as a special symposium journal issue, under the title: "Governing Genetic Databases: Collection, Storage and Use". It begins by setting the scene, to provide a backdrop and context for the papers. It describes the evolving scientific landscape around genetic databases and genomic research, particularly within the biomedical and criminal forensic investigation fields. It notes the lack of any clear, coherent or coordinated legal governance regime, either at the national or international level. It then identifies and reflects on key cross-cutting issues and themes that emerge from the five papers, in particular: terminology and definitions; consent; special concerns around population genetic databases (biobanks) and forensic databases; international harmonisation; data protection; data access; boundary-setting; governance; and issues around balancing individual interests against public good values. PMID:18841252

  9. Performance Evaluation of a Parallel Cascade Semijoin Algorithm for Computing Path Expressions in Object Database Systems

    Microsoft Academic Search

    Guoren Wang; Ge Yu

    2002-01-01

    With the emergence of new applications, especially on the Web, such as E-Commerce, Digital Library and DNA Bank, object database systems show stronger functions than other kinds of database systems due to their powerful ability to represent complex semantics and relationships. One distinguishing feature of object database systems is the path expression, and most queries on an object database are based

  10. A catalog of new eclipsing binaries in the Kepler database

    E-print Network

    Kotson, Michael Christopher

    2012-01-01

    In this thesis, we present a catalog of binary stars discovered in the publicly available Kepler database, none of which were included in previous such catalogs published by the Kepler science team. A brief review of other ...

  11. NASA scientific and technical publications: A catalog of special publications, reference publications, conference publications, and technical papers, 1989

    NASA Technical Reports Server (NTRS)

    1990-01-01

    This catalog lists 190 citations of all NASA Special Publications, NASA Reference Publications, NASA Conference Publications, and NASA Technical Papers that were entered into the NASA scientific and technical information database during accession year 1989. The entries are grouped by subject category. Indexes of subject terms, personal authors, and NASA report numbers are provided.

  12. NASA scientific and technical publications: A catalog of Special Publications, Reference Publications, Conference Publications, and Technical Papers, 1987

    NASA Technical Reports Server (NTRS)

    1988-01-01

    This catalog lists 239 citations of all NASA Special Publications, NASA Reference Publications, NASA Conference Publications, and NASA Technical Papers that were entered in the NASA scientific and technical information database during accession year 1987. The entries are grouped by subject category. Indexes of subject terms, personal authors, and NASA report numbers are provided.

  13. NASA scientific and technical publications: A catalog of special publications, reference publications, conference publications, and technical papers, 1987-1990

    NASA Technical Reports Server (NTRS)

    1991-01-01

    This catalog lists 783 citations of all NASA Special Publications, NASA Reference Publications, NASA Conference Publications, and NASA Technical Papers that were entered into the NASA Scientific and Technical Information Database during the years 1987 through 1990. The entries are grouped by subject category. Indexes of subject terms, personal authors, and NASA report numbers are provided.

  14. Texas 3D Face Recognition Database Shalini Gupta

    E-print Network

    Bovik, Alan

    -computer interaction. Three dimensional face recognition technology also has advantages over two dimensional (2D) face recognition technology in that 3D facial images are more robust to facial pose variations and ambient Recognition Database. publicly available database of 3D facial images acquired using a stereo imaging system

  15. MitoProteome: mitochondrial protein sequence database and annotation system

    Microsoft Academic Search

    Dawn Cotter; Purnima Guda; Eoin Fahy; Shankar Subramaniam

    2004-01-01

    MitoProteome is an object-relational mitochondrial protein sequence database and annotation system. The initial release contains 847 human mitochondrial protein sequences, derived from public sequence databases and mass spectrometric analysis of highly purified human heart mitochondria. Each sequence is manually annotated with primary function, subfunction and subcellular location, and extensively annotated in an automated process with data extracted from

  16. 108 New Variable Stars in the NSVS Database

    Microsoft Academic Search

    Maxim Usatov; Artem Nosulchik

    2008-01-01

    In this paper we present 105 SR+L, 1 Orion T Tau and 2 RS CVn type variable stars found in the Northern Sky Variability Survey (NSVS) database. This work is intended to complement and finalize our previous publication of the Extended Catalog of Red AGB Variable Stars found in the NSVS database, and is primarily intended to find SR+L stars.

  17. Enhancing student learning in database courses with large data sets

    Microsoft Academic Search

    Venkat N Gudivada; Jagadeesh Nandigam; Yonglei Tao

    2007-01-01

    Rapidly increasing storage device capacities at ever decreasing costs have resulted in a mushrooming of publicly available large data sets on the Web. In this paper, we describe a novel approach to teaching a relational database course by using such data repositories. We demonstrate our approach using the Amazon.com product database, though the approach is generic and is applicable to other data

  18. Foodline®: International Food Market, Technology and Regulatory Databases

    Microsoft Academic Search

    Peter Sidney

    1996-01-01

    Foodline® is a trio of databases from U.K.-based Leatherhead Food Research Association providing international coverage of food marketing, technical and regulatory information. Foodline®: International Food Market Data is a bibliographic database of global market information abstracted from some 250 food and beverage business and trade journals, statistical publications and market studies. Foodline®: Food Science and Technology consists of citations and

  19. Information Technology Road Maps: a bibliographic database pilot project

    Microsoft Academic Search

    Gerry McKiernan

    2000-01-01

    In an effort to facilitate the identification and use of highly-relevant publications and resources relating to the social and economic implications of information, computation, and communication technologies, the National Science Foundation (NSF) recently funded a pilot project to create a Web-based bibliographic database of significant materials. Within the framework of this database, users are able to browse citations to relevant

  20. The Vocational Guidance Research Database: A Scientometric Approach

    ERIC Educational Resources Information Center

    Flores-Buils, Raquel; Gil-Beltran, Jose Manuel; Caballer-Miedes, Antonio; Martinez-Martinez, Miguel Angel

    2012-01-01

    The scientometric study of scientific output through publications in specialized journals cannot be undertaken exclusively with the databases available today. For this reason, the objective of this article is to introduce the "Base de Datos de Investigacion en Orientacion Vocacional" [Vocational Guidance Research Database], based on the use of…

  1. The PROSITE database, its status in 2002

    Microsoft Academic Search

    Laurent Falquet; Marco Pagni; Philipp Bucher; Nicolas Hulo; Christian J. A. Sigrist; Kay Hofmann; Amos Bairoch

    2002-01-01

    PROSITE (Bairoch and Bucher (1994) Nucleic Acids Res., 22, 3583-3589; Hofmann et al. (1999) Nucleic Acids Res., 27, 215-219) is a method of identifying the functions of uncharacterized proteins translated from genomic or cDNA sequences. The PROSITE database (http://www.expasy.org/prosite/) consists of biologically significant patterns and profiles designed in such a way that with appropriate computational tools it can rapidly

  2. CMD: a Cotton Microsatellite Database resource for Gossypium genomics

    Microsoft Academic Search

    Anna Blenda; Jodi Scheffler; Brian Scheffler; Michael Palmer; Jean-Marc Lacape; John Z Yu; Christopher Jesudurai; Sook Jung; Sriram Muthukumar; Preetham Yellambalase; Stephen Ficklin; Margaret Staton; Robert Eshelman; Mauricio Ulloa; Sukumar Saha; Ben Burr; Shaolin Liu; Tianzhen Zhang; Deqiu Fang; Alan Pepper; Siva Kumpatla; John Jacobs; Jeff Tomkins; Roy Cantrell; Dorrie Main

    2006-01-01

    BACKGROUND: The Cotton Microsatellite Database (CMD) http://www.cottonssr.org is a curated and integrated web-based relational database providing centralized access to publicly available cotton microsatellites, an invaluable resource for basic and applied research in cotton breeding. DESCRIPTION: At present CMD contains publication, sequence, primer, mapping and homology data for nine major cotton microsatellite projects, collectively representing 5,484 microsatellites. In addition, CMD displays

  3. Space medicine research publications: 1983-1984

    NASA Technical Reports Server (NTRS)

    Solberg, J. L.; Pleasant, L. G.

    1984-01-01

    A list of publications supported by the Space Medicine Program, Office of Space Science and Applications is given. Included are publications entered into the Life Sciences Bibliographic Database by The George Washington University as of October 1, 1984.

  4. Long Valley caldera GIS Database

    NASA Astrophysics Data System (ADS)

    Williams, M. J.; Battaglia, M.; Hill, D.; Langbein, J.; Segall, P.

    2002-12-01

    In May of 1980, a strong earthquake swarm that included four magnitude 6 earthquakes struck the southern margin of Long Valley Caldera associated with a 25-cm, dome-shaped uplift of the caldera floor. These events marked the onset of the latest period of caldera unrest that continues to this day. This ongoing unrest includes recurring earthquake swarms and continued dome-shaped uplift of the central section of the caldera (the resurgent dome) accompanied by changes in thermal springs and gas emissions. Analysis of combined gravity and geodetic data confirms the intrusion of silicic magma beneath Long Valley caldera. In 1982, the U.S. Geological Survey under the Volcano Hazards Program began an intensive effort to monitor and study geologic unrest in Long Valley Caldera. This database provides an overview of the studies being conducted by the Long Valley Observatory in Eastern California from 1975 to 2000. The database includes geological, monitoring and topographic datasets related to the Long Valley Caldera, plus a number of USGS publications on Long Valley (e.g., fact-sheets, references). Datasets are available as text files or ArcView shapefiles. Database CD-ROM Table of Contents: - Geological data (digital geologic map) - Monitoring data: Deformation (EDM, GPS, Leveling); Earthquakes; Gravity; Hydrologic; CO2 - Topographic data: DEM, DRG, Landsat 7, Rivers, Roads, Water Bodies - ArcView Project File

  5. Compressive genomics for protein databases

    PubMed Central

    Daniels, Noah M.; Gallant, Andrew; Peng, Jian; Cowen, Lenore J.; Baym, Michael; Berger, Bonnie

    2013-01-01

    Motivation: The exponential growth of protein sequence databases has increasingly made the fundamental question of searching for homologs a computational bottleneck. The amount of unique data, however, is not growing nearly as fast; we can exploit this fact to greatly accelerate homology search. Acceleration of programs in the popular PSI/DELTA-BLAST family of tools will not only speed-up homology search directly but also the huge collection of other current programs that primarily interact with large protein databases via precisely these tools. Results: We introduce a suite of homology search tools, powered by compressively accelerated protein BLAST (CaBLASTP), which are significantly faster than and comparably accurate with all known state-of-the-art tools, including HHblits, DELTA-BLAST and PSI-BLAST. Further, our tools are implemented in a manner that allows direct substitution into existing analysis pipelines. The key idea is that we introduce a local similarity-based compression scheme that allows us to operate directly on the compressed data. Importantly, CaBLASTP’s runtime scales almost linearly in the amount of unique data, as opposed to current BLASTP variants, which scale linearly in the size of the full protein database being searched. Our compressive algorithms will speed-up many tasks, such as protein structure prediction and orthology mapping, which rely heavily on homology search. Availability: CaBLASTP is available under the GNU Public License at http://cablastp.csail.mit.edu/ Contact: bab@mit.edu PMID:23812995

  6. Federal Register document image database

    NASA Astrophysics Data System (ADS)

    Garris, Michael D.; Janet, Stanley A.; Klein, William W.

    1999-01-01

    A new, fully-automated process has been developed at NIST to derive ground truth for document images. The method involves matching optical character recognition (OCR) results from a page with typesetting files for an entire book. Public domain software used to derive the ground truth is provided in the form of Perl scripts and C source code, and includes new, more efficient string alignment technology and a word-level scoring package. With this ground truthing technology, it is now feasible to produce much larger data sets, at much lower cost, than was ever possible with previous labor-intensive, manual data collection projects. Using this method, NIST has produced a new document image database for evaluating Document Analysis and Recognition technologies and Information Retrieval systems. The database produced contains scanned images, SGML-tagged ground truth text, commercial OCR results, and image quality assessment results for pages published in the 1994 Federal Register. These data files are useful in a wide variety of experiments and research. There were roughly 250 issues, comprised of nearly 69,000 pages, published in the Federal Register in 1994. This volume of the database contains the pages of 20 books published in January of that year. In all, there are 4711 page images provided, with 4519 of them having corresponding ground truth. This volume is distributed on two ISO-9660 CD-ROMs. Future volumes may be released, depending on the level of interest.

  7. ANALYSIS OF DNA MICROARRAY DATA

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Analysis of DNA microarrays involves the extraction of fluorescent intensity from raw image files generated by the scanner, storing the extracted data in a database, normalizing the data, conducting statistical analysis and finally querying the analyzed data to find biologically meaningful results. ...

  8. ITS-90 Thermocouple Database

    National Institute of Standards and Technology Data Gateway

    SRD 60 NIST ITS-90 Thermocouple Database (Web, free access)   Web version of Standard Reference Database 60 and NIST Monograph 175. The database gives temperature -- electromotive force (emf) reference functions and tables for the letter-designated thermocouple types B, E, J, K, N, R, S and T. These reference functions have been adopted as standards by the American Society for Testing and Materials (ASTM) and the International Electrotechnical Commission (IEC).

  9. Toward unification of taxonomy databases in a distributed computer environment

    SciTech Connect

    Kitakami, Hajime [Hiroshima City Univ. (Japan); Tateno, Yoshio; Gojobori, Takashi [National Institute of Genetics, Shizuoka-Ken (Japan)

    1994-12-31

    All the taxonomy databases constructed with the DNA databases of the international DNA data banks are powerful electronic dictionaries which aid in biological research by computer. The taxonomy databases are, however, not consistently unified with a relational format. If we can achieve consistent unification of the taxonomy databases, it will be useful in comparing many research results, and investigating future research directions from existing research results. In particular, it will be useful in comparing relationships between phylogenetic trees inferred from molecular data and those constructed from morphological data. The goal of the present study is to unify the existing taxonomy databases and eliminate inconsistencies (errors) that are present in them. Inconsistencies occur particularly in the restructuring of the existing taxonomy databases, since classification rules for constructing the taxonomy have rapidly changed with biological advancements. A repair system is needed to remove inconsistencies in each data bank and mismatches among data banks. This paper describes a new methodology for removing both inconsistencies and mismatches from the databases in a distributed computer environment. The methodology is implemented in a relational database management system, SYBASE.

  10. Extracting DNA

    NSDL National Science Digital Library

    Science Netlinks

    2002-03-28

    This lesson for students in grades 9-12 introduces DNA, genes, chromosomes, and the chemicals that make up DNA. After the basic information, students will do an experiment in which they will separate out DNA from peas. Knowing that DNA can be separated will give them a base of understanding for future lessons in biology, evolution, biotechnology, and health technology.

  11. Databases for Microbiologists.

    PubMed

    Zhulin, Igor B

    2015-08-01

    Databases play an increasingly important role in biology. They archive, store, maintain, and share information on genes, genomes, expression data, protein sequences and structures, metabolites and reactions, interactions, and pathways. All these data are critically important to microbiologists. Furthermore, microbiology has its own databases that deal with model microorganisms, microbial diversity, physiology, and pathogenesis. Thousands of biological databases are currently available, and it becomes increasingly difficult to keep up with their development. The purpose of this minireview is to provide a brief survey of current databases that are of interest to microbiologists. PMID:26013493

  12. Databases for LDEF results

    NASA Technical Reports Server (NTRS)

    Bohnhoff-Hlavacek, Gail

    1992-01-01

    One of the objectives of the team supporting the LDEF Systems and Materials Special Investigative Groups is to develop databases of experimental findings. These databases identify the hardware flown, summarize results and conclusions, and provide a system for acknowledging investigators, tracing sources of data, and recording future design suggestions. To date, databases covering the optical experiments and thermal control materials (chromic acid anodized aluminum, silverized Teflon blankets, and paints) have been developed at Boeing. We used the FileMaker Pro software, the database manager for the Macintosh computer produced by the Claris Corporation. It is a flat, text-retrievable database that provides access to the data via an intuitive user interface, without tedious programming. Though this software is available only for the Macintosh computer at this time, copies of the databases can be saved to a format that is readable on a personal computer as well. Further, the data can be exported to more powerful relational databases. The paper covers the capabilities and use of the LDEF databases and describes how to get copies of the database for your own research.

  13. AtlasT4SS: A curated database for type IV secretion systems

    PubMed Central

    2012-01-01

    Background: The type IV secretion system (T4SS) can be classified as a large family of macromolecule transporter systems, divided into three recognized sub-families according to their well-known functions. The major sub-family is the conjugation system, which allows transfer of genetic material, such as a nucleoprotein, via cell contact among bacteria. The conjugation system can also transfer genetic material from bacteria to eukaryotic cells; such is the case with the T-DNA transfer of Agrobacterium tumefaciens to host plant cells. The system of effector protein transport constitutes the second sub-family, and the third one corresponds to the DNA uptake/release system. Genome analyses have revealed numerous T4SS in Bacteria and Archaea. The purpose of this work was to organize, classify, and integrate the T4SS data into a single database, called AtlasT4SS - the first public database devoted exclusively to this prokaryotic secretion system. Description: AtlasT4SS is a manually curated database that describes a large number of proteins related to the type IV secretion system reported so far in Gram-negative and Gram-positive bacteria, as well as in Archaea. The database was created using the RDBMS MySQL and the Catalyst Framework, which is based on the Perl programming language and uses the Model-View-Controller (MVC) design pattern for the Web. The current version holds a comprehensive collection of 1,617 T4SS proteins from 58 Bacteria (49 Gram-negative and 9 Gram-positive), one Archaea and 11 plasmids. By applying the bi-directional best hit (BBH) relationship in pairwise genome comparison, it was possible to obtain a core set of 134 clusters of orthologous genes encoding T4SS proteins. Conclusions: In our database we present one way of classifying orthologous groups of T4SSs in a hierarchical classification scheme with three levels. The first level comprises four classes that are based on the organization of genetic determinants, shared homologies, and evolutionary relationships: (i) F-T4SS, (ii) P-T4SS, (iii) I-T4SS, and (iv) GI-T4SS. The second level designates either a specific well-known protein family or an otherwise uncharacterized protein family. Finally, in the third level, each protein of an ortholog cluster is classified according to its involvement in a specific cellular process. The AtlasT4SS database is open access and is available at http://www.t4ss.lncc.br. PMID:22876890
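
    The bi-directional best hit (BBH) relationship mentioned in this abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the AtlasT4SS pipeline: the input format (a dictionary of pairwise similarity scores), the function names and the gene identifiers are all hypothetical, and real pipelines typically add score or coverage thresholds that are omitted here.

```python
def best_hits(scores):
    """Return the highest-scoring subject for each query.

    `scores` maps (query_id, subject_id) -> similarity score
    (hypothetical input format; higher means more similar).
    """
    best = {}
    for (query, subject), score in scores.items():
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs that are each other's best hit."""
    forward = best_hits(a_vs_b)   # genome A genes searched against genome B
    reverse = best_hits(b_vs_a)   # genome B genes searched against genome A
    return sorted((a, b) for a, b in forward.items() if reverse.get(b) == a)


if __name__ == "__main__":
    # Made-up scores for illustration only.
    a_vs_b = {("a1", "b1"): 200, ("a1", "b2"): 50,
              ("a2", "b2"): 120, ("a3", "b1"): 90}
    b_vs_a = {("b1", "a1"): 198, ("b2", "a2"): 118, ("b2", "a1"): 40}
    print(bidirectional_best_hits(a_vs_b, b_vs_a))
```

    In the made-up data, a3 has b1 as its best hit, but b1's best hit is a1, so only the pairs (a1, b1) and (a2, b2) qualify. Pairs found this way across many genome comparisons are the kind of evidence from which clusters of orthologous genes can be assembled.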

  14. DNA DNA [1]. 1994

    E-print Network

    , . , , DNA Abstract The Monkey and Banana Problem is an example commonly used for illustrating simple problem, the Monkey and Banana Problem can be solved effectively without weakening the fundamental aims above and Banana Problem, which was implemented from the conventional point of view, gives us only one optimal

  15. The RECONS 25 Parsec Database

    NASA Astrophysics Data System (ADS)

    Henry, Todd J.; Jao, Wei-Chun; Pewett, Tiffany; Riedel, Adric R.; Silverstein, Michele L.; Slatten, Kenneth J.; Winters, Jennifer G.; Recons Team

    2015-01-01

    The REsearch Consortium On Nearby Stars (RECONS, www.recons.org) Team has been mapping the solar neighborhood since 1994. Nearby stars provide the fundamental framework upon which all of stellar astronomy is based, both for individual stars and stellar populations. The nearest stars are also the primary targets for extrasolar planet searches, and will undoubtedly play key roles in understanding the prevalence and structure of solar systems, and ultimately, in our search for life elsewhere. We have built the RECONS 25 Parsec Database to encourage and enable exploration of the Sun's nearest neighbors. The Database, slated for public release in 2015, contains 3088 stars, brown dwarfs, and exoplanets in 2184 systems as of October 1, 2014. All of these systems have accurate trigonometric parallaxes in the refereed literature placing them closer than 25.0 parsecs, i.e., parallaxes greater than 40 mas with errors less than 10 mas. Carefully vetted astrometric, photometric, and spectroscopic data are incorporated into the Database from reliable sources, including significant original data collected by members of the RECONS Team. Current exploration of the solar neighborhood by RECONS, enabled by the Database, focuses on the ubiquitous red dwarfs, including: assessing the stellar companion population of ~1200 red dwarfs (Winters), investigating the astrophysical causes that spread red dwarfs of similar temperatures by a factor of 16 in luminosity (Pewett), and canvassing ~3000 red dwarfs for excess emission due to unseen companions and dust (Silverstein). In addition, a decade long astrometric survey of ~500 red dwarfs in the southern sky has begun, in an effort to understand the stellar, brown dwarf, and planetary companion populations for the stars that make up at least 75% of all stars in the Universe. This effort has been supported by the NSF through grants AST-0908402, AST-1109445, and AST-1412026, and via observations made possible by the SMARTS Consortium.
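
    The equivalence stated in this abstract between a 25.0 parsec limit and a 40 mas parallax follows from the standard relation d (pc) = 1000 / p (mas). The short Python sketch below works through it; the function names are hypothetical, the inclusion test simply restates the two thresholds quoted in the abstract, and the example values are made up rather than taken from the Database.

```python
def distance_pc(parallax_mas):
    """Convert a trigonometric parallax in milliarcseconds to parsecs."""
    if parallax_mas <= 0:
        raise ValueError("parallax must be positive")
    return 1000.0 / parallax_mas


def meets_25pc_criteria(parallax_mas, error_mas):
    """Restate the abstract's thresholds: parallax > 40 mas, error < 10 mas."""
    return parallax_mas > 40.0 and error_mas < 10.0


if __name__ == "__main__":
    # Illustrative (parallax, error) pairs in mas; not real catalog entries.
    for parallax, error in [(40.0, 1.0), (50.0, 2.0), (100.0, 12.0)]:
        print(parallax, distance_pc(parallax), meets_25pc_criteria(parallax, error))
```

    A parallax of exactly 40 mas corresponds to 25 pc and so falls on the boundary rather than inside it, which is why the test uses a strict inequality.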

  16. Illinois State Archives: Database of Illinois Civil War Veterans

    NSDL National Science Digital Library

    This database from the Illinois State Archives "indexes the first eight volumes of the nine volume publication, Report of the Adjutant General of the State of Illinois." The publication is drawn from the original rosters maintained during the Civil War by the Adjutant General. In addition to the names of approximately 250,000 men organized into 175 regiments, this searchable database also provides histories of the Illinois units and regiments. The database was created and donated to the Illinois State Archives by amateur genealogist Fred Delap of Kansas, Illinois.

  17. Identity Database 1. CyberCIEGE Identity Database

    E-print Network

    Identity Database 1. CyberCIEGE Identity Database CyberCIEGE is an information assurance (IA-motivated professionals. The Identity Database scenario requires players to protect an identity database that is used.1 Preparation From the "Campaign Player", select the "Identity Database" campaign as seen in figure 1

  18. Pfam: the protein families database.

    PubMed

    Finn, Robert D; Bateman, Alex; Clements, Jody; Coggill, Penelope; Eberhardt, Ruth Y; Eddy, Sean R; Heger, Andreas; Hetherington, Kirstie; Holm, Liisa; Mistry, Jaina; Sonnhammer, Erik L L; Tate, John; Punta, Marco

    2014-01-01

    Pfam, available via servers in the UK (http://pfam.sanger.ac.uk/) and the USA (http://pfam.janelia.org/), is a widely used database of protein families, containing 14 831 manually curated entries in the current release, version 27.0. Since the last update article 2 years ago, we have generated 1182 new families and maintained sequence coverage of the UniProt Knowledgebase (UniProtKB) at nearly 80%, despite a 50% increase in the size of the underlying sequence database. Since our 2012 article describing Pfam, we have also undertaken a comprehensive review of the features that are provided by Pfam over and above the basic family data. For each feature, we determined the relevance, computational burden, usage statistics and the functionality of the feature in a website context. As a consequence of this review, we have removed some features, enhanced others and developed new ones to meet the changing demands of computational biology. Here, we describe the changes to Pfam content. Notably, we now provide family alignments based on four different representative proteome sequence data sets and a new interactive DNA search interface. We also discuss the mapping between Pfam and known 3D structures. PMID:24288371

  19. The International Nucleotide Sequence Database Collaboration.

    PubMed

    Karsch-Mizrachi, Ilene; Nakamura, Yasukazu; Cochrane, Guy

    2012-01-01

    The members of the International Nucleotide Sequence Database Collaboration (INSDC; http://www.insdc.org) set out to capture, preserve and present globally comprehensive public domain nucleotide sequence information. The work of the long-standing collaboration includes the provision of data formats, annotation conventions and routine global data exchange. Among the many developments to INSDC resources in 2011 are the newly launched BioProject database and improved handling of assembly information. In this article, we outline INSDC services and update the reader on developments in 2011. PMID:22080546

  20. SAbDab: the structural antibody database

    PubMed Central

    Dunbar, James; Krawczyk, Konrad; Leem, Jinwoo; Baker, Terry; Fuchs, Angelika; Georges, Guy; Shi, Jiye; Deane, Charlotte M.

    2014-01-01

    Structural antibody database (SAbDab; http://opig.stats.ox.ac.uk/webapps/sabdab) is an online resource containing all the publicly available antibody structures annotated and presented in a consistent fashion. The data are annotated with several properties including experimental information, gene details, correct heavy and light chain pairings, antigen details and, where available, antibody–antigen binding affinity. The user can select structures according to these attributes, as well as by structural properties such as complementarity determining region loop conformation and variable domain orientation. Individual structures, datasets and the complete database can be downloaded. PMID:24214988

  1. NASA aerospace database subject scope: An overview

    NASA Technical Reports Server (NTRS)

    1993-01-01

    Outlined here is the subject scope of the NASA Aerospace Database, a publicly available subset of the NASA Scientific and Technical Information (STI) Database. Topics of interest to NASA are outlined and placed within the framework of the following broad aerospace subject categories: aeronautics, astronautics, chemistry and materials, engineering, geosciences, life sciences, mathematical and computer sciences, physics, social sciences, space sciences, and general. A brief discussion of the subject scope is given for each broad area, followed by a similar explanation of each of the narrower subject fields that follow. The subject category code is listed for each entry.

  2. Physical database design for relational databases

    Microsoft Academic Search

    Sheldon J. Finkelstein; Mario Schkolnick; Paolo Tiberio

    1988-01-01

    This paper describes the concepts used in the implementation of DBDSGN, an experimental physical design tool for relational databases developed at the IBM San Jose Research Laboratory. Given a workload for System R (consisting of a set of SQL statements and their execution frequencies), DBDSGN suggests physical configurations for efficient performance. Each configuration consists of a set of indices and

  3. Steam Properties Database

    National Institute of Standards and Technology Data Gateway

    SRD 10 NIST/ASME Steam Properties Database (PC database for purchase)   Based upon the International Association for the Properties of Water and Steam (IAPWS) 1995 formulation for the thermodynamic properties of water and the most recent IAPWS formulations for transport and other properties, this updated version provides water properties over a wide range of conditions according to the accepted international standards.

  4. Build Your Own Database.

    ERIC Educational Resources Information Center

    Jacso, Peter; Lancaster, F. W.

    This book is intended to help librarians and others to produce databases of better value and quality, especially if they have had little previous experience in database construction. Drawing upon almost 40 years of experience in the field of information retrieval, this book emphasizes basic principles and approaches rather than in-depth and…

  5. Oracle Database SQL Reference

    E-print Network

    Shahabi, Cyrus

    Oracle® Database SQL Reference 10g Release 1 (10.1) Part No. B10759-01 December 2003 Oracle Database SQL Reference 10g Release 1 (10.1) Part No. B10759-01 Copyright © 1996, 2003 Oracle Corporation include both the software and documentation) contain proprietary information of Oracle Corporation

  6. The intelligent database machine

    NASA Technical Reports Server (NTRS)

    Yancey, K. E.

    1985-01-01

    The IDM database was compared with the Oracle database to determine whether the IDM 500 would better serve the needs of the MSFC database management system than Oracle. The two were compared and the performance of the IDM was studied. The implementations that work best on each database are indicated. The choice is left to the database administrator.

  7. World Database of Crystallographers

    NSDL National Science Digital Library

    The World Database of Crystallographers and of Other Scientists Employing Crystallographic Methods is offered by the International Union of Crystallography. The simple database can be searched by family name, title, interests, address, and various other criteria. Results include basic information such as full name, position, institution address, degrees held, key interests, and contact information. Those seeking such specific information will appreciate this unique resource.

  8. HIV Structural Database

    National Institute of Standards and Technology Data Gateway

    SRD 102 HIV Structural Database (Web, free access)   The HIV Protease Structural Database is an archive of experimentally determined 3-D structures of Human Immunodeficiency Virus 1 (HIV-1), Human Immunodeficiency Virus 2 (HIV-2) and Simian Immunodeficiency Virus (SIV) Proteases and their complexes with inhibitors or products of substrate cleavage.

  9. Biological Macromolecule Crystallization Database

    National Institute of Standards and Technology Data Gateway

    SRD 21 Biological Macromolecule Crystallization Database (Web, free access)   The Biological Macromolecule Crystallization Database and NASA Archive for Protein Crystal Growth Data (BMCD) contains the conditions reported for the crystallization of proteins and nucleic acids used in X-ray structure determinations and archives the results of microgravity macromolecule crystallization studies.

  10. Dictionary as Database.

    ERIC Educational Resources Information Center

    Painter, Derrick

    1996-01-01

    Discussion of dictionaries as databases focuses on the digitizing of the Oxford English Dictionary (OED) and the use of Standard Generalized Markup Language (SGML). Topics include the creation of a consortium to digitize the OED, document structure, relational databases, text forms, sequence, and discourse. (LRW)

  11. Online Database Searching Workbook.

    ERIC Educational Resources Information Center

    Littlejohn, Alice C.; Parker, Joan M.

    Designed primarily for use by first-time searchers, this workbook provides an overview of online searching. Following a brief introduction which defines online searching, databases, and database producers, five steps in carrying out a successful search are described: (1) identifying the main concepts of the search statement; (2) selecting a…

  12. Noncoding regulatory RNAs database

    Microsoft Academic Search

    Maciej Szymanski; Volker A. Erdmann; Jan Barciszewski

    2003-01-01

    The noncoding RNAs database is a collection of currently available sequence data on RNAs, which have no protein-coding capacity and have been implicated in regulation of cellular processes. The RNAs included in the database form a very heterogeneous group of molecules that act on different levels of information transmission in the cell. It includes RNAs acting on the level of

  13. PAN Pesticide Database

    NSDL National Science Digital Library

    The Pesticide Action Network (PAN) Pesticide Database is your one-stop location for toxicity and regulatory information for pesticides. This is a comprehensive, search-enabled database of pesticide chemicals and trade names. An easy-to-navigate sidebar takes you through toxicity, uses, registration, company, and distributor. Other links take you to less toxic alternatives, and to a pesticide tutorial and references.

  14. Household Products Database

    Microsoft Academic Search

    Aimee Haley

    2005-01-01

    This column is an overview of the Household Products Database, a free resource available online from the National Library of Medicine (NLM). Search strategies, overall layout, and supportive information are detailed. The database covers products commonly found in or around the home, including cosmetics, cleaning substances, and other maintenance items.

  15. Knowledge Discovery in Databases.

    ERIC Educational Resources Information Center

    Norton, M. Jay

    1999-01-01

    Knowledge discovery in databases (KDD) revolves around the investigation and creation of knowledge, processes, algorithms, and mechanisms for retrieving knowledge from data collections. The article is an introductory overview of KDD. The rationale and environment of its development and applications are discussed. Issues related to database design…

  16. Bachelorscriptie Database Schema Integratie

    E-print Network

    Lucas, Peter

    ... 7.2 ERP Database ... 7.3 Order ... organizations keep their data in databases, such as those used for ERP and CRM systems. A great many. This involves a great many different systems, such as Oracle, SAP R/3, Microsoft SQL, Microsoft Access

  17. Database of Mechanical Properties of Textile Composites

    NASA Technical Reports Server (NTRS)

    Delbrey, Jerry

    1996-01-01

    This report describes the approach followed to develop a database for mechanical properties of textile composites. The data in this database are assembled from NASA Advanced Composites Technology (ACT) programs and from data in the public domain. This database meets the data documentation requirements of MIL-HDBK-17, Section 8.1.2, which describes in detail the type and amount of information needed to completely document composite material properties. The database focuses on mechanical properties of textile composites. Properties are available for a range of parameters such as direction, fiber architecture, materials, environmental condition, and failure mode. The composite materials in the database contain innovative textile architectures such as the braided, woven, and knitted materials evaluated under the NASA ACT programs. In summary, the database contains results for approximately 3500 coupon level tests, for ten different fiber/resin combinations, and seven different textile architectures. It also includes a limited amount of prepreg tape composites data from ACT programs where side-by-side comparisons were made.

  18. Complex Carbohydrate Research Center Spectral Databases

    NSDL National Science Digital Library

    York, William

    Dr. William York of the Complex Carbohydrate Research Center has created these two databases with scientific input from others at the University of Georgia. The Xyloglucan NMR Database consists of a searchable table of the ¹H-NMR chemical shifts of xyloglucan oligoglycosyl alditols. Xyloglucans are highly branched polymers with a cellulosic backbone (i.e., consisting of β-(1,4)-linked D-glucosyl residues). The basis for the most commonly used nomenclature for xyloglucan structures comes from the linear array of glycosyl side chains that many of the backbone residues bear. The Partially Methylated Alditol Acetate (PMAA) Database shows the molecular structures of PMAAs derived from Hexopyranosyl, Pentopyranosyl, and Pentofuranosyl Residues. Users view the structures by gliding the mouse over a table. The PMAA Database also says that electron-impact mass spectra are available, but at the time of publication, these links weren't working. The Xyloglucan database comes with overviews, search guides, and nomenclature information. The PMAA database has a help page. Both require a free login.

  19. DNA-based prediction of human externally visible characteristics in forensics: Motivations, scientific challenges, and ethical considerations

    Microsoft Academic Search

    Manfred Kayser; Peter M. Schneider

    2009-01-01

    There will always be criminal cases, where the evidence DNA sample will not match either a suspect's DNA profile, or any in a criminal DNA database. In the absence of DNA-based mass intelligence screenings, including familial searching (both of which may be restricted by legislation), there is only one option to potentially avoid or retrospectively solve “cold cases”: the DNA-based

  20. Cascadia Tsunami Deposit Database

    USGS Publications Warehouse

    Peters, Robert; Jaffe, Bruce; Gelfenbaum, Guy; Peterson, Curt

    2003-01-01

    The Cascadia Tsunami Deposit Database contains data on the location and sedimentological properties of tsunami deposits found along the Cascadia margin. Data have been compiled from 52 studies, documenting 59 sites from northern California to Vancouver Island, British Columbia that contain known or potential tsunami deposits. Bibliographical references are provided for all sites included in the database. Cascadia tsunami deposits are usually seen as anomalous sand layers in coastal marsh or lake sediments. The studies cited in the database use numerous criteria based on sedimentary characteristics to distinguish tsunami deposits from sand layers deposited by other processes, such as river flooding and storm surges. Several studies cited in the database contain evidence for more than one tsunami at a site. Data categories include age, thickness, layering, grainsize, and other sedimentological characteristics of Cascadia tsunami deposits. The database documents the variability observed in tsunami deposits found along the Cascadia margin.

  1. Public Speaking

    NSDL National Science Digital Library

    Ms. Maxwell

    2011-10-27

    In this lesson students will evaluate famous speeches in order to identify the ways in which the speech was effective or ineffective. We will explore several websites, which present themselves as databases holding hundreds of speeches. Students will also view a video identifying ways to overcome fear of public speaking. Students will look over the websites and choose one video-recorded speech from each website to go back and watch. They will then view the last website, which will remind students of what they should be looking for in each type of speech. After being reminded of the qualities of each type of speech, students will be given a chart to fill out. Students will then go back to each website and view the videos which they have previously chosen. As students watch the speeches, they will note the ways in which the speaker meets the requirements for a quality speech, i.e. eye contact, number of vocal fillers used, vocal pitch, appearance, etc. They will record this information on their charts. First let's learn about American Rhetoric. Now let's take a look at this film on conquering fear of public speaking: Conquering Fear of Public Speaking. Now let's take a look at Speaking. Now let's take a look at Speech Club. Now let's take a look at Study Great Speaches. Now let's take a look at Study Sphere Public Speaking ...

  2. Scientific Communication of Geochemical Data and the Use of Computer Databases.

    ERIC Educational Resources Information Center

    Le Bas, M. J.; Durham, J.

    1989-01-01

    Describes a scheme in the United Kingdom that coordinates geochemistry publications with a computerized geochemistry database. The database comprises not only data published in the journals but also the remainder of the pertinent data set. The discussion covers the database design; collection, storage and retrieval of data; and plans for future…

  3. Release of ToxCastDB and ExpoCastDB databases

    EPA Science Inventory

    EPA has released two databases - the Toxicity Forecaster database (ToxCastDB) and a database of chemical exposure studies (ExpoCastDB) - that scientists and the public can use to access chemical toxicity and exposure data. ToxCastDB users can search and download data from over 50...

  4. A Study of the Impact of Statewide Database Licensing on Information Provision in Washington State

    Microsoft Academic Search

    Efthimis N. Efthimiadis; Harry Bruce

    The Statewide Database Licensing (SDL) Project brought ProQuest, full-text periodicals and newspapers databases from Bell & Howell, to nearly every library in Washington State. The research presented here investigates the impact of statewide database licensing on the users of public, school and community college libraries. The study was conducted in two stages. At first, transaction log data from Bell &

  5. The magnet components database system

    Microsoft Academic Search

    M. J. Baggett; R. Leedy; C. Saltmarsh; J. C. Tompkins

    1990-01-01

    The philosophy, structure, and usage of MagCom, the SSC magnet components database, are described. The database has been implemented in Sybase (a powerful relational database management system) on a UNIX-based workstation at the Superconducting Super Collider Laboratory (SSCL); magnet project collaborators can access the database via network connections. The database was designed to contain the specifications and measured values of important

  6. Enchytraeus albidus Microarray: Enrichment, Design, Annotation and Database (EnchyBASE)

    PubMed Central

    Novais, Sara C.; Arrais, Joel; Lopes, Pedro; Vandenbrouck, Tine; De Coen, Wim; Roelofs, Dick; Soares, Amadeu M. V. M.; Amorim, Mónica J. B.

    2012-01-01

    Enchytraeus albidus (Oligochaeta) is an ecologically relevant species used as a standard test organism for risk assessment. Effects of stressors in this species are commonly determined at the population level using reproduction and survival as endpoints. The assessment of transcriptomic responses can be very useful, e.g. to understand underlying mechanisms of toxicity with gene expression fingerprinting. The present paper addresses the following: 1) development of suppressive subtractive hybridization (SSH) libraries enriched for differentially expressed genes after metal and pesticide exposures; 2) sequencing and characterization of all generated cDNA inserts; 3) development of a publicly available genomic database on E. albidus. A total of 2100 Expressed Sequence Tags (ESTs) were isolated, sequenced and assembled into 1124 clusters (947 singletons and 177 contigs). From these sequences, 41% matched known proteins in GenBank (BLASTX, e-value ≤ 10⁻⁵) and 37% had at least one Gene Ontology (GO) term assigned. In total, 5.5% of the sequences were assigned to a metabolic pathway, based on KEGG. With this new sequencing information, an Agilent custom oligonucleotide microarray was designed, representing a potential tool for transcriptomic studies. EnchyBASE (http://bioinformatics.ua.pt/enchybase/) was developed as a freely available web database containing genomic information on E. albidus and will be further extended in the near future for other enchytraeid species. The database so far includes all ESTs generated for E. albidus from three cDNA libraries. This information can be downloaded and applied in functional genomics and transcription studies. PMID:22558086

  7. DNA Interactive

    NSDL National Science Digital Library

    2004-01-05

    DNA Interactive is an educational site celebrating the 50th anniversary of the discovery of the double-helical structure of DNA by James Watson and Francis Crick. The web site features interactive modules about the history of DNA science; discovering and reading the DNA code; manipulating the code to create tailored molecules; studying the human genome; applications of DNA research; and a chronicle of the eugenics movement. These modules feature rare video interviews with scientists, 3D animations, and narrative text to present and explain DNA science. Other materials include a teacher's guide with downloadable, printable lessons, an online teaching community, and information on further resources.

  8. Carbon Capture and Storage Database (CCS) from DOE's National Energy Technology Laboratory (NETL)

    DOE Data Explorer

    NETL's Carbon Capture and Storage (CCS) Database includes active, proposed, canceled, and terminated CCS projects worldwide. Information in the database regarding technologies being developed for capture, evaluation of sites for carbon dioxide (CO2) storage, estimation of project costs, and anticipated dates of completion is sourced from publicly available information. The CCS Database provides the public with information regarding efforts by various industries, public groups, and governments towards development and eventual deployment of CCS technology. The database contains more than 260 CCS projects worldwide in more than 30 countries across 6 continents. Access to the database requires use of Google Earth, as the NETL CCS database is a layer in Google Earth. Alternatively, users can download a copy of the database in MS-Excel directly from the NETL website.

  9. Creating a Database for Demographic Research: A Case Study.

    ERIC Educational Resources Information Center

    Gates, William A.; Witt, Barbara

    The PUS801000 (Public Use Samples) database was created as a subsample of the Census Bureau's 1980 Public Use Microdata Sample (PUMS) for the purpose of meeting the needs of demographics research. PUMS needed to be reorganized along relational lines by identifying the variables most widely used by researchers and constructing proper relations…

  10. Efficient Compression of non-repetitive DNA sequences using Dynamic Programming

    Microsoft Academic Search

    K. G. Srinivasa; M. Jagadish; K. R. Venugopal; L. M. Patnaik

    2006-01-01

    DNA compression has been a subject of great interest since the availability of genomic databases. Although only two bits are sufficient to encode the four bases of DNA (namely A, G, T and C), the massive size of DNA sequences compels the need for efficient compression. General text compression methods do not make use of characteristics specific to DNA sequences.
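
    The two-bits-per-base baseline mentioned above can be illustrated with a short sketch. This is only the naive fixed-width packing, not the dynamic-programming method of the paper; the bit assignment and the function names `pack` and `unpack` are assumptions made for illustration.

```python
# Hypothetical bit assignment; any one-to-one mapping of the four bases works.
BASE_TO_BITS = {"A": 0b00, "C": 0b01, "G": 0b10, "T": 0b11}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def pack(seq: str) -> bytes:
    """Pack a DNA string into bytes, four bases per byte (2 bits each)."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | BASE_TO_BITS[base]
        # Left-align a short final chunk so every byte is decoded the same way.
        byte <<= 2 * (4 - len(chunk))
        out.append(byte)
    return bytes(out)


def unpack(data: bytes, length: int) -> str:
    """Recover the first `length` bases from packed bytes."""
    bases = []
    for byte in data:
        for shift in (6, 4, 2, 0):
            bases.append(BITS_TO_BASE[(byte >> shift) & 0b11])
    # The original length is stored separately because padding bits decode as 'A'.
    return "".join(bases[:length])


packed = pack("GATTACA")           # 7 bases fit in 2 bytes instead of 7
restored = unpack(packed, 7)
```

    This gives a fixed 4:1 reduction over one-byte-per-base text; dedicated DNA compressors aim to do better by exploiting sequence-specific structure.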

  11. Genomic tools and cDNA derived markers for butterflies.

    PubMed

    Papanicolaou, Alexie; Joron, Mathieu; McMillan, W Owen; Blaxter, Mark L; Jiggins, Chris D

    2005-08-01

    The Lepidoptera have long been used as examples in the study of evolution, but some questions remain difficult to resolve due to a lack of molecular genetic data. However, as technology improves, genomic tools are becoming increasingly available to tackle unanswered evolutionary questions. Here we have used expressed sequence tags (ESTs) to develop genetic markers for two Müllerian mimic species, Heliconius melpomene and Heliconius erato. In total 1363 ESTs were generated, representing 330 gene objects in H. melpomene and 431 in H. erato. User-friendly bioinformatic tools were used to construct a nonredundant database of these putative genes (available at http://www.heliconius.org), and annotate them with blast similarity searches, InterPro matches and Gene Ontology terms. This database will be continually updated with EST sequences for the Papilionideae as they become publicly available, providing a tool for gene finding in the butterflies. Alignments of the Heliconius sequences with putative homologues derived from Bombyx mori or other public data sets were used to identify conserved PCR priming sites, and develop 55 markers that can be amplified from genomic DNA in both H. erato and H. melpomene. These markers will be used for comparative linkage mapping in Heliconius and will have applications in other phylogenetic and genomic studies in the Lepidoptera. PMID:16029486

  12. 32 CFR 338.1 - Ordering DNA issuances.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ...2013-07-01 2013-07-01 false Ordering DNA issuances. 338.1 Section 338.1 ...AVAILABILITY TO THE PUBLIC OF DEFENSE NUCLEAR AGENCY (DNA) INSTRUCTIONS AND CHANGES THERETO § 338.1 Ordering DNA issuances. (a) The DNA issuances...

  13. 32 CFR 338.1 - Ordering DNA issuances.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ...2012-07-01 2012-07-01 false Ordering DNA issuances. 338.1 Section 338.1 ...AVAILABILITY TO THE PUBLIC OF DEFENSE NUCLEAR AGENCY (DNA) INSTRUCTIONS AND CHANGES THERETO § 338.1 Ordering DNA issuances. (a) The DNA issuances...

  14. 32 CFR 338.1 - Ordering DNA issuances.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ...2014-07-01 2014-07-01 false Ordering DNA issuances. 338.1 Section 338.1 ...AVAILABILITY TO THE PUBLIC OF DEFENSE NUCLEAR AGENCY (DNA) INSTRUCTIONS AND CHANGES THERETO § 338.1 Ordering DNA issuances. (a) The DNA issuances...

  15. Publishing Your Database on CD-ROM for Profit: The FISHLIT and NISC Experience.

    ERIC Educational Resources Information Center

    Crampton, Margaret

    1995-01-01

    Details the development of the FISHLIT bibliographic database at the JLB Smith Institute of Ichthyology Library at Rhodes University (South Africa), and the subsequent CD-ROM publication of the database by NISC (National Information Services Corporation). Discusses the advantages of CD-ROM publication, costs and information service provision,…

  16. DNA Replication

    NSDL National Science Digital Library

    American Society For Microbiology

    2002-01-01

    This animation, which shows DNA replication and the interactions of the various enzymes, can be used to illustrate to students the order of events in DNA replication, as well as emphasize which enzymes are involved in the process.

  17. DNA Detectives

    NSDL National Science Digital Library

    Black, Suzanne (Inglemoor High School)

    1995-06-30

    Many of the revolutionary changes that have occurred in biology since 1970 can be attributed directly to the ability to manipulate DNA in defined ways. The principal tools for this recombinant DNA technology are enzymes that can "cut" and "paste" DNA. Restriction enzymes are the "chemical scissors" of the molecular biologist; these enzymes cut DNA at specific nucleotide sequences. A sample of someone's DNA, incubated with restriction enzymes, is reduced to millions of DNA fragments of varying sizes. A DNA sample from a different person would have a different nucleotide sequence and would thus be enzymatically "chopped up" into a very different collection of fragments. We have been asked to apply DNA fingerprinting to determine which suspect should be charged with a crime perpetrated in our city.
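
    The idea that two slightly different sequences yield different fragment collections can be sketched in a few lines. This is a toy model under stated assumptions: a single-stranded linear sequence, the EcoRI recognition site GAATTC cut after the first G, and two made-up sample sequences that differ by one base; the function name `digest` is hypothetical.

```python
def digest(dna: str, site: str = "GAATTC", cut_offset: int = 1) -> list:
    """Return fragment lengths after cutting a linear sequence at every
    occurrence of `site`, `cut_offset` bases into the site."""
    cuts = []
    start = 0
    while True:
        idx = dna.find(site, start)
        if idx == -1:
            break
        cuts.append(idx + cut_offset)
        start = idx + 1
    bounds = [0] + cuts + [len(dna)]
    return [end - begin for begin, end in zip(bounds, bounds[1:])]


# Made-up sequences for illustration; sample_b has one base changed,
# which destroys the second recognition site.
sample_a = "AAGAATTCTTTTGAATTCGG"
sample_b = "AAGAATTCTTTTGACTTCGG"

fragments_a = digest(sample_a)   # two sites, three fragments
fragments_b = digest(sample_b)   # one site, two fragments
```

    Comparing the two fragment-length lists is the in-silico analogue of comparing band patterns between samples.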

  18. DNA Copyright

    E-print Network

    Torrance, Andrew W.

    2011-01-01

    of architecture and computer software. Sequences of DNA should also be acknowledged as eligible for copyright protection. Unaltered genomic DNA sequences would seem poor candidates for copyright protection. The case is stronger for copyright protection...

  19. Household Products Database

    NSDL National Science Digital Library

    Users will find important and possibly life-saving information on over 4,000 household products in this online database from the National Library of Medicine's Specialized Information Services. The database allows users to find out what a product contains, potential health effects, and safety and handling information. Users can quickly and easily search the database by product name, ingredients, or symptom. The products search seems to be the most user-friendly, as it is organized alphabetically and by general category, e.g. home maintenance, personal care/use, auto products, and so on.

  20. World Biodiversity Database

    NSDL National Science Digital Library

    The World Biodiversity Database, provided by the Expert Center for Taxonomic Identification (ETI), seeks to "document all presently known species (about 1.7 million) and to make this important biological information worldwide accessible." This continually growing database "provides taxonomic information, species names, synonyms, descriptions, illustrations and literature references when available" on 200,000 taxa. The searchable database can be explored using an expandable tree of the five taxonomic kingdoms or by typing in a common or scientific name. Both educators and students should find this site easy to navigate, informative, and useful.

  1. Year 2000 Reports Database

    NSDL National Science Digital Library

    1999-01-01

    The US Securities and Exchange Commission (SEC) has recently released this searchable database of Y2K readiness reports in order to disclose the preparatory efforts of the securities industry. The database contains more than 13,000 reports from the broker-dealers, transfer agents, investment advisors, and mutual funds required to file with the SEC. Reports include descriptions of the company's, agent's, or fund's state of Y2K readiness, costs to address the Y2K problem, Y2K risks, and contingency plans. Complete database documentation, search instructions, and contact information are provided at the Important Information page.

  2. JICST Factual Database(3)

    NASA Astrophysics Data System (ADS)

    Shimura, Kazuki; Abe, Atsushi

    This paper describes the system outline, characteristics and use of the JICST Thermophysical and Thermochemical Properties Database, whose service was launched as part of the JICST Factual Database System. The system can store data on more than 60 kinds of physical or chemical thermal properties. It covers elements, pure substances of inorganic and low molecular organic compounds, and two- or three-component systems of these compounds. The system is designed to handle floating-point numerical data while identifying significant figures, to provide versatile search support, and to link its searches to other databases. Examples of actual use and some points requiring care are also described.

  3. The PEDANT genome database in 2005.

    PubMed

    Riley, M Louise; Schmidt, Thorsten; Wagner, Christian; Mewes, Hans-Werner; Frishman, Dmitrij

    2005-01-01

    The PEDANT genome database (http://pedant.gsf.de) contains pre-computed bioinformatics analyses of publicly available genomes. Its main mission is to provide robust automatic annotation of the vast majority of amino acid sequences, which have not been subjected to in-depth manual curation by human experts in high-quality protein sequence databases. By design PEDANT annotation is genome-oriented, making it possible to explore genomic context of gene products, and evaluate functional and structural content of genomes using a category-based query mechanism. At present, the PEDANT database contains exhaustive annotation of over 1,240,000 proteins from 270 eubacterial, 23 archaeal and 41 eukaryotic genomes. PMID:15608204

  4. The future of forensic DNA analysis.

    PubMed

    Butler, John M

    2015-08-01

    The author's thoughts and opinions on where the field of forensic DNA testing is headed for the next decade are provided in the context of where the field has come over the past 30 years. Similar to the Olympic motto of 'faster, higher, stronger', forensic DNA protocols can be expected to become more rapid and sensitive and provide stronger investigative potential. New short tandem repeat (STR) loci have expanded the core set of genetic markers used for human identification in Europe and the USA. Rapid DNA testing is on the verge of enabling new applications. Next-generation sequencing has the potential to provide greater depth of coverage for information on STR alleles. Familial DNA searching has expanded capabilities of DNA databases in parts of the world where it is allowed. Challenges and opportunities that will impact the future of forensic DNA are explored including the need for education and training to improve interpretation of complex DNA profiles. PMID:26101278

  5. DNA Nanotechnology

    NSDL National Science Digital Library

    2014-06-10

    In this activity, learners explore deoxyribonucleic acid (DNA), a nanoscale structure that occurs in nature. Learners extract a sample of DNA from split peas and put the sample in an Eppendorf tube to take home. Learners discover that nanoscientists study DNA to understand its biological function and use it to make other nanoscale materials and devices.

  6. 2058 Expressed sequence tags (ESTs) from a human fetal lung cDNA library

    SciTech Connect

    Kazunori, Sudo [Cancer Institute, Tokyo (Japan)]|[Mitsubishi Yuka BCL Co. Ltd., Tokyo (Japan)]; Katsuya Chinen; Yusuke Nakamura [Mitsubishi Yuka BCL Co. Ltd., Tokyo (Japan)]

    1994-11-15

    ESTs (expressed sequence tags) provide complementary resources for structural and functional analyses of the human genome. The authors have performed single-pass sequencing of 2058 randomly selected, directionally cloned cDNAs isolated from a fetal-lung cDNA library constructed with oligo(dT) primers. Computer analyses of the 5′-end sequences revealed that 60.4% of the clones were considered to be identical to previously reported human genes or ESTs; 9.0% of them showed significant homology to known genes in human, other mammals, or lower organisms; 30.6% showed no homology to any genes or DNA sequences in the public database. These data and reagents will be useful for future investigations of gene expression during prenatal development of human lung. 11 refs., 1 fig., 2 tabs.

  7. Algorithmic Self-Assembly DNA (Layered

    E-print Network

    Hagiya, Masami

    [Abstract largely lost in extraction; the surviving fragments refer to algorithmic self-assembly of DNA, the Layered Tile Model (LTM), DNA origami, RecA, ATP, SEM, and an AFM image of triple-strand DNA (Fig. 4).]

  8. PMRD: plant microRNA database

    PubMed Central

    Zhang, Zhenhai; Yu, Jingyin; Li, Daofeng; Zhang, Zuyong; Liu, Fengxia; Zhou, Xin; Wang, Tao; Ling, Yi; Su, Zhen

    2010-01-01

    MicroRNAs (miRNA) are ~21 nucleotide-long non-coding small RNAs, which function as post-transcriptional regulators in eukaryotes. miRNAs play essential roles in regulating plant growth and development. In recent years, research into the mechanism and consequences of miRNA action has made great progress. With whole genome sequence available in such plants as Arabidopsis thaliana, Oryza sativa, Populus trichocarpa, Glycine max, etc., it is desirable to develop a plant miRNA database through the integration of large amounts of information about publicly deposited miRNA data. The plant miRNA database (PMRD) integrates available plant miRNA data deposited in public databases, gleaned from the recent literature, and data generated in-house. This database contains sequence information, secondary structure, target genes, expression profiles and a genome browser. In total, there are 8433 miRNAs collected from 121 plant species in PMRD, including model plants and major crops such as Arabidopsis, rice, wheat, soybean, maize, sorghum, barley, etc. For Arabidopsis, rice, poplar, soybean, cotton, medicago and maize, we included the possible target genes for each miRNA with a predicted interaction site in the database. Furthermore, we provided miRNA expression profiles in the PMRD, including our local rice oxidative stress related microarray data (LC Sciences miRPlants_10.1) and the recently published microarray data for poplar, Arabidopsis, tomato, maize and rice. The PMRD database was constructed by open source technology utilizing a user-friendly web interface, and multiple search tools. The PMRD is freely available at http://bioinformatics.cau.edu.cn/PMRD. We expect PMRD to be a useful tool for scientists in the miRNA field in order to study the function of miRNAs and their target genes, especially in model plants and major crops. PMID:19808935

  9. Widespread Horizontal Gene Transfer from Circular Single-stranded DNA Viruses to Eukaryotic Genomes

    PubMed Central

    2011-01-01

    Background: In addition to vertical transmission, organisms can also acquire genes from other distantly related species or from their extra-chromosomal elements (plasmids and viruses) via horizontal gene transfer (HGT). It has been suggested that phages represent substantial forces in prokaryotic evolution. In eukaryotes, retroviruses, which can integrate into host genome as an obligate step in their replication strategy, comprise approximately 8% of the human genome. Unlike retroviruses, few members of other virus families are known to transfer genes to host genomes. Results: Here we performed a systematic search for sequences related to circular single-stranded DNA (ssDNA) viruses in publicly available eukaryotic genome databases followed by comprehensive phylogenetic analysis. We conclude that the replication initiation protein (Rep)-related sequences of geminiviruses, nanoviruses and circoviruses have been frequently transferred to a broad range of eukaryotic species, including plants, fungi, animals and protists. Some of the transferred viral genes were conserved and expressed, suggesting that these genes have been coopted to assume cellular functions in the host genomes. We also identified geminivirus-like and parvovirus-like transposable elements in genomes of fungi and lower animals, respectively, and thereby provide direct evidence that eukaryotic transposons could derive from ssDNA viruses. Conclusions: Our discovery extends the host range of circular ssDNA viruses and sheds light on the origin and evolution of these viruses. It also suggests that ssDNA viruses act as an unforeseen source of genetic innovation in their hosts. PMID:21943216

  10. DNA Barcoding of Catfish: Species Authentication and Phylogenetic Assessment

    PubMed Central

    Wong, Li Lian; Peatman, Eric; Lu, Jianguo; Kucuktas, Huseyin; He, Shunping; Zhou, Chuanjiang; Na-nakorn, Uthairat; Liu, Zhanjiang

    2011-01-01

    As the global market for fisheries and aquaculture products expands, mislabeling of these products has become a growing concern in the food safety arena. Molecular species identification techniques hold the potential for rapid, accurate assessment of proper labeling. Here we developed and evaluated DNA barcodes for use in differentiating United States domestic and imported catfish species. First, we sequenced 651 base-pair barcodes from the cytochrome oxidase I (COI) gene from individuals of 9 species (and an Ictalurid hybrid) of domestic and imported catfish in accordance with standard DNA barcoding protocols. These included domestic Ictalurid catfish, and representative imported species from the families of Clariidae and Pangasiidae. Alignment of individual sequences from within a given species revealed highly consistent barcodes (98% similarity on average). These alignments allowed the development and analyses of consensus barcode sequences for each species and comparison with limited sequences in public databases (GenBank and Barcode of Life Data Systems). Validation tests carried out in blinded studies and with commercially purchased catfish samples (both frozen and fresh) revealed the reliability of DNA barcoding for differentiating between these catfish species. The developed protocols and consensus barcodes are valuable resources as increasing market and governmental scrutiny is placed on catfish and other fisheries and aquaculture products labeling in the United States. PMID:21423623
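
    The within-species consistency check and the consensus barcodes mentioned above can be illustrated with a minimal Python sketch. It assumes the sequences are already aligned to equal length; the function names and the three toy sequences are hypothetical and are not data from the study, and the majority-vote consensus is one simple choice rather than the authors' documented procedure.

```python
from collections import Counter
from itertools import combinations


def percent_identity(a: str, b: str) -> float:
    """Percent of aligned columns at which two equal-length sequences agree."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def mean_pairwise_identity(seqs: list[str]) -> float:
    """Average percent identity over all unordered pairs of sequences."""
    pairs = list(combinations(seqs, 2))
    return sum(percent_identity(a, b) for a, b in pairs) / len(pairs)


def consensus(seqs: list[str]) -> str:
    """Majority-vote consensus: the most common base in each alignment column."""
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*seqs))


# Hypothetical 10-base toy "barcodes" for one species (real barcodes here are 651 bp).
toy_barcodes = [
    "ACGTACGTAC",
    "ACGTACGTAC",
    "ACGTACGAAC",  # differs from the others at one position
]

if __name__ == "__main__":
    print(f"mean pairwise identity: {mean_pairwise_identity(toy_barcodes):.1f}%")
    print("consensus:", consensus(toy_barcodes))
```

    A consensus built this way can then be compared against reference sequences from public databases using the same identity function.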

  11. Environmental Security Database

    NSDL National Science Digital Library

    Maintained by the Peace & Conflict Studies Program at the University of Toronto, this database has potential as a powerful research tool for those studying the relationships between environmental stress and violent conflict in developing countries. The database contains information on (but not the text of) over 20,000 items, including books, journal articles, papers, and newspaper clippings. Users may conduct searches using keywords, names, titles, or a special coding system developed to permit "complex Boolean searches of the Database to produce subsets of items relating to specific issues." Typical search returns include author, title, publisher, date, and comments which vary in length. The authors stress that most items in the database can be found through local research libraries; however, they do offer limited assistance in locating and reproducing some materials.

  12. DDI and Relational Databases

    E-print Network

    Amin, Alerk; Barkow, Ingo

    2013-04-04

    Although the DDI standard is expressed in XML, many institutions have a requirement or preference to use relational databases (e.g., Access, MySQL, Oracle, Postgres) in their applications. This may be because of integration with existing applications...

  13. Nuclear Science References Database

    NASA Astrophysics Data System (ADS)

    Pritychenko, B.; Běták, E.; Singh, B.; Totans, J.

    2014-06-01

    The Nuclear Science References (NSR) database, together with its associated Web interface, is the world's only comprehensive source of easily accessible low- and intermediate-energy nuclear physics bibliographic information for more than 210,000 articles since the beginning of nuclear science. The weekly-updated NSR database provides essential support for nuclear data evaluation, compilation and research activities. The principles of the database and Web application development and maintenance are described. Examples of nuclear structure, reaction and decay applications are specifically included. The complete NSR database is freely available at the websites of the National Nuclear Data Center http://www.nndc.bnl.gov/nsr and the International Atomic Energy Agency http://www-nds.iaea.org/nsr.

  14. The Gymnosperm Database

    NSDL National Science Digital Library

    This database is a fabulous resource for students and researchers of conifers, cycads, and their allies; it was created by Dr. Christopher J. Earle, a dendrochronologist who holds a doctorate from the University of Washington. The user enters the taxonomic database at the highest level, Order and Family, and can then navigate to Species (or sometimes Variety) levels. At each level, information on the taxon is provided, along with bibliographic citations. In addition to the gymnosperm database, the site provides a links page with pointers to bibliographic databases, virtual image galleries and other resources. Special sections on gymnosperms of Alta California, gymnosperms of Sichuan, tree age determination, paleobotany of Australia and New Zealand conifers, and podocarp forests of south-west New Zealand further complement the site.

  15. Spanky Fractal Database

    NSDL National Science Digital Library

    Spanky Fractal Database: fractal images, programs, documents, papers, code examples, and other fractal related material. Submitted by contributors or hunted down from various nooks and crannies on the net. Enjoy and discover.

  16. The Dinosaur Database

    NSDL National Science Digital Library

    This database contains detailed information on hundreds of dinosaurs and dinosaur related topics. It features a dinosaur dictionary, dinosaur clip art and flex-art, and links to lesson plans and dinosaur experiments for teachers.

  17. ARTI Refrigerant Database

    SciTech Connect

    Calm, J.M. [Calm (James M.), Great Falls, VA (United States)]

    1994-05-27

    The Refrigerant Database consolidates and facilitates access to information to assist industry in developing equipment using alternative refrigerants. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern.

  18. CAL Learning Strategies Database

    NSDL National Science Digital Library

    The Learning Strategies Database, developed by the Center for Advancement of Learning (CAL) at Muskingum College, organizes information about learning strategies into four major sections: Introduction to the CAL Learning Strategies Database, General-Purpose Learning Strategies, Content-Specific Learning Strategies, and Bibliography of Learning Strategies Resources. Each section is further divided into more specific subsections, creating a hierarchical database structure. For example, the general-purpose section contains sixteen subsections related to general learning, such as Memory, Test Preparation, and Notetaking; whereas, the content-specific section has 27 disciplinary subsections, covering subjects in the natural sciences, social sciences, humanities, and arts. The information in the database allows students of all ages and their instructors to assess current learning styles, and to identify and implement methods for effective education adapted to the learning strengths and weaknesses of individual students.

  19. THE CTEPP DATABASE

    EPA Science Inventory

    The CTEPP (Children's Total Exposure to Persistent Pesticides and Other Persistent Organic Pollutants) database contains a wealth of data on children's aggregate exposures to pollutants in their everyday surroundings. Chemical analysis data for the environmental media and ques...

  20. New primers for DNA barcoding of digeneans and cestodes (Platyhelminthes).

    PubMed

    Van Steenkiste, Niels; Locke, Sean A; Castelin, Magalie; Marcogliese, David J; Abbott, Cathryn L

    2015-07-01

    Digeneans and cestodes are species-rich taxa and can seriously impact human health, fisheries, aqua- and agriculture, and wildlife conservation and management. DNA barcoding using the COI Folmer region could be applied for species detection and identification, but both 'universal' and taxon-specific COI primers fail to amplify in many flatworm taxa. We found that high levels of nucleotide variation at priming sites made it unrealistic to design primers targeting all flatworms. We developed new degenerate primers that enabled acquisition of the COI barcode region from 100% of specimens tested (n = 46), representing 23 families of digeneans and 6 orders of cestodes. This high success rate represents an improvement over existing methods. Primers and methods provided here are critical pieces towards redressing the current paucity of COI barcodes for these taxa in public databases. PMID:25490869

  1. The DARE Database: UNESCO

    NSDL National Science Digital Library

    The DARE Database, maintained by UNESCO, includes an international directory of over 11,000 references to social science research and training institutes, social science specialists, social science documentation and information services, and social science periodicals. The directory also provides listings of peace, human rights, and international law institutions. Users may search the directory database by type of institution, country name, personal name, geographical coverage, periodical title, language of periodical, ISSN, or keyword.

  2. Database computing in HEP

    NASA Technical Reports Server (NTRS)

    Day, C. T.; Loken, S.; Macfarlane, J. F.; May, E.; Lifka, D.; Lusk, E.; Price, L. E.; Baden, A.; Grossman, R.; Qin, X.

    1992-01-01

    The major SSC experiments are expected to produce up to 1 Petabyte of data per year each. Once the primary reconstruction is completed by farms of inexpensive processors, I/O becomes a major factor in further analysis of the data. We believe that the application of database techniques can significantly reduce the I/O performed in these analyses. We present examples of such I/O reductions in prototypes based on relational and object-oriented databases of CDF data samples.

  3. Ecomp Executive Compensation Database

    NSDL National Science Digital Library

    The Ecomp Executive Compensation Database allows users to research the compensation and net-worth of executives. Users may search the database by company name or ticker symbol, as well as by state, sector, and industry pull-down menus. Search returns list compensation summaries for the top executives, including salary, bonus, and total compensation. Clicking on the executive's name will give a more detailed summary, including restricted stock, LTIP payouts, and value realized for options exercised. All numbers are for 1999.

  4. MEROPS: The Peptidase Database

    NSDL National Science Digital Library

    Funded by the Medical Research Council (UK) and the Biotechnology and Biological Sciences Research Council, Neil Rawlings and Alan Barrett of the Babraham Institute have created MEROPS, a Peptidase Database (Version 3.1). Indexes included in the database are Peptidases (Organisms & Peptidases and MEROPS Identifiers), Families, and Clans. Other resources at the site -- in the Documents section -- such as BioMedical Aspects, Unsequenced Peptidases, Statistics, and Distribution of Families, provide a wealth of additional information.

  5. ECOTOX Database System

    NSDL National Science Digital Library

    The US Environmental Protection Agency (EPA) provides this database of chemical toxicity. Three individual EPA databases are combined to provide information on chemical-specific toxicity values for aquatic and terrestrial plants and animals. Users can search for research reports by chemical name, species name, or environmental effect. The site has informative help files and browse features. This Web site is useful for evaluating industrial chemicals or for environmental assessment research.

  6. West Indian Orchidaceae Database

    NSDL National Science Digital Library

    The New York Botanical Garden has recently placed online this searchable database of West Indian Orchids. Containing approximately 5,200 specimen records for the family Orchidaceae (from the New York Botanical Garden's collection), the database may be searched by Family, Collector, Country, Taxon, State/Province, and other select fields. Typical returns provide information on Specimen name (scientific name), Location, Collector, Description, and Habitat.

  7. TCC Trade Agreements Database

    NSDL National Science Digital Library

    Developed by the Department of Commerce to aid US exporters, the Trade Compliance Center (TCC) provides access to more than 200 trade documents via the Trade Related Agreements Database (TARA). US Trade Agreements may be browsed by country signatory, issue, or agreement title, or searched by keyword with the benefit of an online thesaurus. In addition, the TCC provides a related database of Market Access Information with searchable country commercial guides, reports on economic policy and trade practices, and national trade estimate reports.

  8. Atlas Florae Europaeae Database

    NSDL National Science Digital Library

    Initiated in 1992, the primary goal of the Atlas Florae Europaeae (AFE) Database project is to make plant distribution data available in digital format. Currently, AFE includes "preliminary maps for all European vascular plants" (based on the time period 1972-1996), examples of distribution statistics (colorful summary maps), and Biogeographical analyses; digital data are expected in 1999. PC users can download an evaluation copy of the current database (which will be available for sale in 1999) on-site.

  9. SSME environment database development

    NASA Technical Reports Server (NTRS)

    Reardon, John

    1987-01-01

    The internal environment of the Space Shuttle Main Engine (SSME) is being determined from hot firings of the prototype engines and from model tests using either air or water as the test fluid. The objectives are to develop a database system to facilitate management and analysis of test measurements and results, to enter available data into the database, and to analyze available data to establish conventions and procedures to provide consistency in data normalization and configuration geometry references.

  10. Database computing in HEP

    SciTech Connect

    Day, C.T.; Loken, S.; MacFarlane, J.F. (Lawrence Berkeley Lab., CA (United States)); May, E.; Lifka, D.; Lusk, E.; Price, L.E. (Argonne National Lab., IL (United States)); Baden, A. (Maryland Univ., College Park, MD (United States). Dept. of Physics); Grossman, R.; Qin, X. (Illinois Univ., Chicago, IL (United States). Dept. of Mathematics, Statistics and Computer Science); Cormell, L.; Leibold, P.; Liu, D

    1992-01-01

    The major SSC experiments are expected to produce up to 1 Petabyte of data per year each. Once the primary reconstruction is completed by farms of inexpensive processors, I/O becomes a major factor in further analysis of the data. We believe that the application of database techniques can significantly reduce the I/O performed in these analyses. We present examples of such I/O reductions in prototypes based on relational and object-oriented databases of CDF data samples.

  11. DNA demethylation by DNA repair

    E-print Network

    Gehring, Mary

    Active DNA demethylation underlies key facets of reproduction in flowering plants and mammals and serves a general genome housekeeping function in plants. A family of 5-methylcytosine DNA glycosylases catalyzes plant ...

  12. Catalog of databases and reports

    SciTech Connect

    Burtis, M.D. [comp.]

    1997-04-01

    This catalog provides information about the many reports and materials made available by the US Department of Energy's (DOE's) Global Change Research Program (GCRP) and the Carbon Dioxide Information Analysis Center (CDIAC). The catalog is divided into nine sections plus the author and title indexes: Section A--US Department of Energy Global Change Research Program Research Plans and Summaries; Section B--US Department of Energy Global Change Research Program Technical Reports; Section C--US Department of Energy Atmospheric Radiation Measurement (ARM) Program Reports; Section D--Other US Department of Energy Reports; Section E--CDIAC Reports; Section F--CDIAC Numeric Data and Computer Model Distribution; Section G--Other Databases Distributed by CDIAC; Section H--US Department of Agriculture Reports on Response of Vegetation to Carbon Dioxide; and Section I--Other Publications.

  13. MPW: the Metabolic Pathways Database.

    PubMed

    Selkov, E; Grechkin, Y; Mikhailova, N; Selkov, E

    1998-01-01

    The Metabolic Pathways Database (MPW) (www.biobase.com/emphome.html/homepage. html.pags/pathways.html) a derivative of EMP (www.biobase.com/EMP) plays a fundamental role in the technology of metabolic reconstructions from sequenced genomes under the PUMA (www.mcs.anl.gov/home/compbio/PUMA/Production/ ReconstructedMetabolism/reconstruction.html), WIT (www.mcs.anl.gov/home/compbio/WIT/wit.html ) and WIT2 (beauty.isdn.msc.anl.gov/WIT2.pub/CGI/user.cgi) systems. In October 1997, it included some 2800 pathway diagrams covering primary and secondary metabolism, membrane transport, signal transduction pathways, intracellular traffic, translation and transcription. In the current public release of MPW (beauty.isdn.mcs.anl.gov/MPW), the encoding is based on the logical structure of the pathways and is represented by the objects commonly used in electronic circuit design. This facilitates drawing and editing the diagrams and makes possible automation of the basic simulation operations such as deriving stoichiometric matrices, rate laws, and, ultimately, dynamic models of metabolic pathways. Individual pathway diagrams, automatically derived from the original ASCII records, are stored as SGML instances supplemented by relational indices. An auxiliary database of compound names and structures, encoded in the SMILES format, is maintained to unambiguously connect the pathways to the chemical structures of their intermediates. PMID:9407141
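
    As a rough illustration of what "deriving stoichiometric matrices" from encoded pathways involves, the Python sketch below builds a matrix with one row per compound and one column per reaction. The dictionary encoding and the function name are assumptions made for this example and do not reflect MPW's actual record format; the two reactions are the textbook hexokinase and phosphoglucose isomerase steps, not entries taken from MPW.

```python
def stoichiometric_matrix(reactions):
    """Build a stoichiometric matrix from a list of reactions.

    Each reaction is a dict mapping compound name -> signed coefficient
    (negative for consumed, positive for produced). Returns the sorted
    compound list and a matrix with rows = compounds, columns = reactions.
    """
    compounds = sorted({name for rxn in reactions for name in rxn})
    matrix = [[rxn.get(name, 0) for rxn in reactions] for name in compounds]
    return compounds, matrix


# Hypothetical encoding of two glycolysis steps (illustrative only).
toy_pathway = [
    {"glucose": -1, "ATP": -1, "G6P": 1, "ADP": 1},  # hexokinase
    {"G6P": -1, "F6P": 1},                           # phosphoglucose isomerase
]

if __name__ == "__main__":
    names, matrix = stoichiometric_matrix(toy_pathway)
    for name, row in zip(names, matrix):
        print(f"{name:>8}: {row}")
```

    Each column sums the mass flow of one reaction, so a matrix of this kind is the usual starting point for the rate laws and dynamic models the abstract mentions.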

  14. MPW : the metabolic pathways database.

    SciTech Connect

    Selkov, E., Jr.; Grechkin, Y.; Mikhailova, N.; Selkov, E.; Mathematics and Computer Science; Russian Academy of Sciences

    1998-01-01

    The Metabolic Pathways Database (MPW) (www.biobase.com/emphome.html/homepage. html.pags/pathways.html) a derivative of EMP (www.biobase.com/EMP) plays a fundamental role in the technology of metabolic reconstructions from sequenced genomes under the PUMA (www.mcs.anl.gov/home/compbio/PUMA/Production/ ReconstructedMetabolism/reconstruction.html), WIT (www.mcs.anl.gov/home/compbio/WIT/wit.html ) and WIT2 (beauty.isdn.msc.anl.gov/WIT2.pub/CGI/user.cgi) systems. In October 1997, it included some 2800 pathway diagrams covering primary and secondary metabolism, membrane transport, signal transduction pathways, intracellular traffic, translation and transcription. In the current public release of MPW (beauty.isdn.mcs.anl.gov/MPW), the encoding is based on the logical structure of the pathways and is represented by the objects commonly used in electronic circuit design. This facilitates drawing and editing the diagrams and makes possible automation of the basic simulation operations such as deriving stoichiometric matrices, rate laws, and, ultimately, dynamic models of metabolic pathways. Individual pathway diagrams, automatically derived from the original ASCII records, are stored as SGML instances supplemented by relational indices. An auxiliary database of compound names and structures, encoded in the SMILES format, is maintained to unambiguously connect the pathways to the chemical structures of their intermediates.

  15. Drinking Water Database

    NASA Technical Reports Server (NTRS)

    Murray, ShaTerea R.

    2004-01-01

    This summer I had the opportunity to work in the Environmental Management Office (EMO) under the Chemical Sampling and Analysis Team (CS&AT). This team's mission is to support Glenn Research Center (GRC) and EMO by providing chemical sampling and analysis services and expert consulting. Services include sampling and chemical analysis of water, soil, fuels, oils, paint, insulation materials, etc. One of this team's major projects is the Drinking Water Project, which tests Glenn's water coolers and ten percent of its sinks every two years. For the past two summers an intern had been putting together a database for this team to record the tests they had performed. She had successfully created a database but hadn't worked out all the quirks. So this summer William Wilder (an intern from Cleveland State University) and I worked together to perfect her database. We began by finding out exactly what every member of the team thought about the database and what, if anything, they would change. After collecting this data we both had to take some courses in Microsoft Access in order to fix the problems. Next we began looking at exactly how the database worked from the outside inward. Then we began trying to change the database, but we quickly found out that this would be virtually impossible.

  16. TaxMan: a taxonomic database manager

    Microsoft Academic Search

    Martin Jones; Mark Blaxter

    2006-01-01

    Background: Phylogenetic analysis of large, multiple-gene datasets, assembled from public sequence databases, is rapidly becoming a popular way to approach difficult phylogenetic problems. Supermatrices (concatenated multiple sequence alignments of multiple genes) can yield more phylogenetic signal than individual genes. However, manually assembling such datasets for a large taxonomic group is time-consuming and error-prone. Additionally, sequence curation, alignment and assessment of

  17. The UCSC Genome Browser Database: update 2009

    Microsoft Academic Search

    Robert M. Kuhn; Donna Karolchik; Ann S. Zweig; T. Wang; Kayla E. Smith; Kate R. Rosenbloom; Brooke L. Rhead; Brian J. Raney; Andy Pohl; M. Pheasant; L. Meyer; Fan Hsu; Angela S. Hinrichs; Rachel A. Harte; Belinda Giardine; P. Fujita; Mark Diekhans; T. Dreszer; Hiram Clawson; Galt P. Barber; David Haussler; W. James Kent

    2009-01-01

    The UCSC Genome Browser Database (GBD, http://genome.ucsc.edu) is a publicly available collection of genome assembly sequence data and integrated annotations for a large number of organisms, including extensive comparative-genomic resources. In the past year, 13 new genome assemblies have been added, including two important primate species, orangutan and marmoset, bringing the total to 46 assemblies

  18. A Database Interface for Clustering in Large Spatial Databases

    Microsoft Academic Search

    Martin Ester; Hans-Peter Kriegel; Xiaowei Xu

    1995-01-01

    Both the number and the size of spatial databases are rapidly growing because of the large amount of data obtained from satellite images, X-ray crystallography or other scientific equipment. Therefore, automated knowledge discovery becomes more and more important in spatial databases. So far, most of the methods for knowledge discovery in databases (KDD) have been based on relational database

  19. National Environmental Publications Internet Site

    NSDL National Science Digital Library

    National Environmental Publications Internet Site is maintained by the Environmental Protection Agency (EPA) and contains a database of over 9000 documents that have been published by the EPA. Searches can be attempted by keyword or by publication title, which should help make finding a particular document easy. Once found, the documents can be viewed, printed freely, or ordered directly from the EPA.

  20. Establishment of a universal size standard strain for use with the PulseNet standardized pulsed-field gel electrophoresis protocols: converting the national databases to the new size standard.

    PubMed

    Hunter, Susan B; Vauterin, Paul; Lambert-Fair, Mary Ann; Van Duyne, M Susan; Kubota, Kristy; Graves, Lewis; Wrigley, Donna; Barrett, Timothy; Ribot, Efrain

    2005-03-01

    The PulseNet National Database, established by the Centers for Disease Control and Prevention in 1996, consists of pulsed-field gel electrophoresis (PFGE) patterns obtained from isolates of food-borne pathogens (currently Escherichia coli O157:H7, Salmonella, Shigella, and Listeria) and textual information about the isolates. Electronic images and accompanying text are submitted from over 60 U.S. public health and food regulatory agency laboratories. The PFGE patterns are generated according to highly standardized PFGE protocols. Normalization and accurate comparison of gel images require the use of a well-characterized size standard in at least three lanes of each gel. Originally, a well-characterized strain of each organism was chosen as the reference standard for that particular database. The increasing number of databases, difficulty in identifying an organism-specific standard for each database, the increased range of band sizes generated by the use of additional restriction endonucleases, and the maintenance of many different organism-specific strains encouraged us to search for a more versatile and universal DNA size marker. A Salmonella serotype Braenderup strain (H9812) was chosen as the universal size standard. This strain was subjected to rigorous testing in our laboratories to ensure that it met the desired criteria, including coverage of a wide range of DNA fragment sizes, even distribution of bands, and stability of the PFGE pattern. The strategy used to convert and compare data generated by the new and old reference standards is described. PMID:15750058

  1. ORIGINAL PAPER An EST database for Liriodendron tulipifera L. floral buds

    E-print Network

    dePamphilis, Claude

    ORIGINAL PAPER An EST database for Liriodendron tulipifera L. floral buds: the first EST resource for identification of new genes related to floral diversity in basal angiosperms. A large, non-normalized cDNA library was constructed from premeiotic and meiotic floral buds and sequenced to generate a database

  2. Synthesized Population Databases: A Geospatial Database of US Poultry Farms.

    PubMed

    Bruhn, Mark C; Munoz, Breda; Cajka, James; Smith, Gary; Curry, Ross J; Wagener, Diane K; Wheaton, William D

    2012-01-01

    The pervasive and potentially severe economic, social, and public health consequences of infectious disease in farmed animals require that plans be in place for a rapid response. Increasingly, agent-based models are being used to analyze the spread of animal-borne infectious disease outbreaks and derive policy alternatives to control future outbreaks. Although the locations, types, and sizes of animal farms are essential model inputs, no public domain nationwide geospatial database of actual farm locations and characteristics currently exists in the United States. This report describes a novel method to develop a synthetic dataset that replicates the spatial distribution of poultry farms, as well as the type and number of birds raised on them. It combines county-aggregated poultry farm counts, land use/land cover, transportation, business, and topographic data to generate locations in the conterminous United States where poultry farms are likely to be found. Simulation approaches used to evaluate the accuracy of this method when compared to that of a random placement alternative found this method to be superior. The results suggest the viability of adapting this method to simulate other livestock farms of interest to infectious disease researchers. PMID:25364787

  3. DataBase on Demand

    NASA Astrophysics Data System (ADS)

    Gaspar Aparicio, R.; Gomez, D.; Coterillo Coz, I.; Wojcik, D.

    2012-12-01

    At CERN a number of key database applications are running on user-managed MySQL database services. The Database on Demand project was born out of an idea to provide the CERN user community with an environment to develop and run database services outside of the actual centralised Oracle-based database services. Database on Demand (DBoD) empowers users to perform certain actions that had traditionally been done by database administrators (DBAs), providing an enterprise platform for database applications. It also allows the CERN user community to run different database engines, e.g. presently the open community version of MySQL and a single-instance Oracle database server. This article describes a technology approach to face this challenge, the service level agreement (SLA) that the project provides, and an evolution of possible scenarios.

  4. ADANS database specification

    SciTech Connect

    NONE

    1997-01-16

    The purpose of the Air Mobility Command (AMC) Deployment Analysis System (ADANS) Database Specification (DS) is to describe the database organization and storage allocation and to provide the detailed data model of the physical design and information necessary for the construction of the parts of the database (e.g., tables, indexes, rules, defaults). The DS includes entity relationship diagrams, table and field definitions, reports on other database objects, and a description of the ADANS data dictionary. ADANS is the automated system used by Headquarters AMC and the Tanker Airlift Control Center (TACC) for airlift planning and scheduling of peacetime and contingency operations as well as for deliberate planning. ADANS also supports planning and scheduling of Air Refueling Events by the TACC and the unit-level tanker schedulers. ADANS receives input in the form of movement requirements and air refueling requests. It provides a suite of tools for planners to manipulate these requirements/requests against mobility assets and to develop, analyze, and distribute schedules. Analysis tools are provided for assessing the products of the scheduling subsystems, and editing capabilities support the refinement of schedules. A reporting capability provides formatted screen, print, and/or file outputs of various standard reports. An interface subsystem handles message traffic to and from external systems. The database is an integral part of the functionality summarized above.

  5. Organism - Specific Genome Information and Databases: Prokaryotes

    NSDL National Science Digital Library

    Smith, Christopher M.

    Genome sequencing reached a milestone in 1997 when the whole genomes of Escherichia coli and Bacillus subtilis were sequenced. The San Diego Supercomputer Center's Organism-Specific Genome Information and Databases site contains links to the sequences of both microbes (listed under prokaryotes). This site covers one of the top ten scientific breakthroughs of 1997, compiled in the December 19, 1997 issue of Science. The top scientific breakthrough of 1997 was the cloning of a sheep, resulting in a lamb named Dolly. The nine runners-up were: the Pathfinder mission to Mars, synchrotrons, biological clock genes, gamma ray bursts, Neandertal DNA, nanotubes, Europa's ocean, whole genome sequencing, and neurons.

  6. REDfly: a Regulatory Element Database for Drosophila.

    PubMed

    Gallo, Steven M; Li, Long; Hu, Zihua; Halfon, Marc S

    2006-02-01

    Bioinformatics studies of transcriptional regulation in the metazoa are significantly hindered by the absence of readily available data on large numbers of transcriptional cis-regulatory modules (CRMs). Even the richly annotated Drosophila melanogaster genome lacks extensive CRM information. We therefore present here a database of Drosophila CRMs curated from the literature complete with both DNA sequence and a searchable description of the gene expression pattern regulated by each CRM. This resource should greatly facilitate the development of computational approaches to CRM discovery as well as bioinformatics analyses of regulatory sequence properties and evolution. PMID:16303794

  7. The WiscDsLox T-DNA collection: an arabidopsis community resource generated by using an improved high-throughput T-DNA sequencing pipeline.

    PubMed

    Woody, Scott T; Austin-Phillips, Sandra; Amasino, Richard M; Krysan, Patrick J

    2007-01-01

    We have developed a new community resource, called the WiscDsLox collection, for performing reverse-genetic analysis in arabidopsis. This resource is composed of 10,459 T-DNA lines generated using the Arabidopsis thaliana ecotype Columbia. The flanking sequence tag for each T-DNA insertion has been deposited in public databases, and seed for each line is currently available from the Arabidopsis Biological Resource Center. The pDsLox vector used to create this new population contains a Ds transposon and Cre/Lox recombination sites. Each WiscDsLox line therefore has the potential to serve as a launch-pad for performing local saturation mutagenesis by mobilization of the Ds element. In addition, Cre-Lox recombination between the T-DNA and a transposed Ds element should enable targeted deletion of specific genomic regions. We generated the WiscDsLox collection using an improved high-throughput pipeline that streamlines analysis of large numbers of independent Arabidopsis thaliana (L.) Heynh. lines. In this paper we describe the details of this novel method and also provide potential users of WiscDsLox T-DNA lines with useful background information about this collection. Experiments to characterize the utility of the Ds transposon and Cre/Lox elements present in the WiscDsLox lines are in progress and will be reported in the future. PMID:17186119

  8. National Spill Test Technology Database

    DOE Data Explorer

    Sheesley, David [Western Research Institute]

    Western Research Institute established, and ACRC continues to maintain, the National Spill Technology database to provide support to the Liquified Gaseous Fuels Spill Test Facility (now called the National HAZMAT Spill Center) as directed by Congress in Section 118(n) of the Superfund Amendments and Reauthorization Act of 1986 (SARA). The Albany County Research Corporation (ACRC) was established to make publicly funded data developed from research projects available to benefit public safety. The founders since 1987 have been investigating the behavior of toxic chemicals that are deliberately or accidentally spilled, educating emergency response organizations, and maintaining funding to conduct the research at the DOE's HAZMAT Spill Center (HSC) located on the Nevada Test Site. ACRC also supports DOE in collaborative research and development efforts mandated by Congress in the Clean Air Act Amendments. The data files are results of spill tests conducted at various times by the Silicones Environmental Health and Safety Council (SEHSC) and DOE, ANSUL, Dow Chemical, the Center for Chemical Process Safety (CCPS) and DOE, Lawrence Livermore National Laboratory (LLNL), OSHA, and DOT; DuPont, and the Western Research Institute (WRI), Desert Research Institute (DRI), and EPA. Each test data page contains one executable file for each test in the test series as well as a file named DOC.EXE that contains information documenting the test series. These executable files are actually self-extracting zip files that, when executed, create one or more comma separated value (CSV) text files containing the actual test data or other test information.

  9. Second-Tier Database for Ecosystem Focus, 1999-2000 Annual Report.

    SciTech Connect

    Van Holmes, Chris; Muongchanh, Christine; Anderson, James J. (University of Washington, School of Aquatic and Fishery Sciences, Seattle, WA)

    2000-11-01

    The Second-Tier Database for Ecosystem Focus (Contract 19601900) provides direct and timely public access to Columbia Basin environmental, operational, fishery and riverine data resources for federal, state, public and private entities. The Second-Tier Database known as Data Access in Realtime (DART) does not duplicate services provided by other government entities in the region. Rather, it integrates public data for effective access, consideration and application.

  10. The Stanford Microarray Database: data access and quality assessment tools

    Microsoft Academic Search

    Jeremy Gollub; Catherine A. Ball; Gail Binkley; Janos Demeter; David B. Finkelstein; Joan M. Hebert; Tina Hernandez-boussard; Heng Jin; Miroslava Kaloper; John C. Matese; Mark Schroeder; Patrick O. Brown; David Botstein; Gavin Sherlock

    2003-01-01

    The Stanford Microarray Database (SMD; http://genome-www.stanford.edu/microarray/) serves as a microarray research database for Stanford investigators and their collaborators. In addition, SMD functions as a resource for the entire scientific community, by making freely available all of its source code and providing full public access to data published by SMD users, along with many tools to explore and analyze

  11. Indian Renewable Energy and Energy Efficiency Policy Database (Fact Sheet)

    SciTech Connect

    Bushe, S.

    2013-09-01

    This fact sheet provides an overview of the Indian Renewable Energy and Energy Efficiency Policy Database (IREEED) developed in collaboration by the United States Department of Energy and India's Ministry of New and Renewable Energy. IREEED provides succinct summaries of India's central and state government policies and incentives related to renewable energy and energy efficiency. The online, public database was developed under the U.S.-India Energy Dialogue and the Clean Energy Solution Center.

  12. International Architecture Database

    NSDL National Science Digital Library

    Drawing on the contributions from persons across much of Europe, the International Architecture Database website has served as a valuable clearinghouse for thousands of architectural projects (both built and unrealized) since 1996. Currently, the database contains information on more than 13,000 projects, most from the 20th and 21st centuries. Visitors can begin by browsing the database by name, location, or keyword. Looking at a single record, visitors will be presented with a host of information, such as building type, primary architect, location, years of construction, and in certain cases with external links, photographs, and plans. Looking through the lists of keywords can actually be quite useful, as each keyword is linked to examples that are demonstrative of the idea suggested by the keyword, such as early Gothic or elementary school. Overall, this is a fine resource for those persons who wish to learn a bit more about architecture or for those looking for information on different architectural projects.

  13. Scottish Emigration Database

    NSDL National Science Digital Library

    Scotland has given the world a great many things, and during the 19th century, many Scots set sail to seek their fortune in other parts of the world. Social historians and others will be glad to know that the University of Aberdeen's Centre for Irish and Scottish Studies has created this online database of Scottish emigrants. Currently, visitors can examine the records of over 21,000 passengers who embarked at Glasgow and Greenock for other ports. While the database only covers a small time period, the database is well-designed for general use. First-time visitors should take a look at the "User Guide", which includes details about the different fields used in each record, such as "occupation", "urban district/village", and "destination port".

  14. Transboundary Freshwater Dispute Database

    NSDL National Science Digital Library

    Created and maintained by Dr. Aaron T. Wolf of the Department of Geosciences at Oregon State University, this site is designed to help researchers and students explore water disputes and negotiations in the 20th century. To that end, it offers a searchable database containing the summaries and full text of 150 international water-related treaties and another similar database of 39 interstate compacts within the US. Treaties within the databases may be selected by nation or state, main and treaty basins, focus, and beginning and ending dates. Additional resources include a digitized inventory of international watersheds. In the future, Wolf plans to add descriptions of indigenous/traditional methods for the resolution of water disputes, news files and bibliographic entries of acute water conflicts, and an annotated bibliography of the state of the art of Transboundary Freshwater Dispute Resolution.

  15. Enhancing medical database semantics.

    PubMed Central

    Leão, B. de F.; Pavan, A.

    1995-01-01

    Medical databases deal with dynamic, heterogeneous and fuzzy data. The modeling of such a complex domain demands powerful semantic data modeling methodologies. This paper describes GSM-Explorer, a CASE tool that allows for the creation of relational databases using semantic data modeling techniques. GSM-Explorer fully incorporates the Generic Semantic Data Model (GSM), enabling knowledge engineers to model the application domain with the abstraction mechanisms of generalization/specialization, association and aggregation. The tool generates a structure that implements persistent database objects through the automatic generation of customized ANSI SQL scripts that sustain the semantics defined at the higher level. This paper emphasizes the system architecture and the mapping of the semantic model into relational tables. The present status of the project and its further developments are discussed in the Conclusions. PMID:8563288
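
    As a rough illustration of what mapping generalization/specialization and association onto relational tables can look like, the sketch below uses Python's built-in sqlite3 module. It is not GSM-Explorer's actual output: the entities (person, patient, encounter), the column names, and the one-table-per-entity strategy with a shared primary key are all hypothetical choices made for this example.

```python
import sqlite3

# Hypothetical schema: "patient" specializes "person" by sharing its primary
# key; "encounter" is associated with a patient through a foreign key.
SCHEMA = """
CREATE TABLE person (
    person_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE patient (
    person_id     INTEGER PRIMARY KEY REFERENCES person(person_id),
    record_number TEXT NOT NULL UNIQUE
);
CREATE TABLE encounter (
    encounter_id INTEGER PRIMARY KEY,
    patient_id   INTEGER NOT NULL REFERENCES patient(person_id),
    visit_date   TEXT NOT NULL
);
"""


def build_demo_db():
    """Create an in-memory database with one person who is also a patient."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # enforce the REFERENCES clauses
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO person VALUES (1, 'A. Example')")
    conn.execute("INSERT INTO patient VALUES (1, 'MR-0001')")
    conn.execute("INSERT INTO encounter VALUES (10, 1, '1995-01-01')")
    conn.commit()
    return conn


def patient_view(conn):
    """Rebuild the specialized 'patient' object by joining it to its parent."""
    return conn.execute(
        "SELECT p.name, pt.record_number, COUNT(e.encounter_id) "
        "FROM person p "
        "JOIN patient pt ON pt.person_id = p.person_id "
        "LEFT JOIN encounter e ON e.patient_id = pt.person_id "
        "GROUP BY p.person_id, p.name, pt.record_number"
    ).fetchall()
```

    Sharing the parent's key in the child table is only one common way to represent specialization; a single table with a type column is another, and the abstract does not say which strategy the tool uses.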

  16. Fungus 2000 Database

    NSDL National Science Digital Library

    Launched by the British Mycological Society, the Fungus 2000 Database initiative was established "to record at least 2000 species of fungi from the British Isles in the year 2000 and, equally as important, to produce a millennium collection of one dried voucher specimen for each species recorded." The Fungus 2000 Database provides details on the first collections of species made during the year 2000, listed in alphabetical order (scientific name only). Each data entry describes the species name (scientific name only), associated organism(s), location of specimen, date of collection, reference data, and (in some cases) a distribution map for the species. As of early May, 2000, nearly 800 specimens have been included in the database.

  17. Cities and Buildings Database

    NSDL National Science Digital Library

    The University of Washington Libraries Digital Collection online projects and archives are well-regarded, and this database proves to be no exception to that highly positive trend. Started in 1995, the Cities and Buildings Database contains over 10,000 digitized images of buildings and cities culled from all historical periods and from all over the world. Visitors may wish to start with a simple keyword search, or, if they are interested in merely browsing by country, they may do so from the homepage. Visitors may also perform detailed searches for buildings by city, style, title, architect, and date of construction. Just to give prospective visitors some sense of the depth and breadth of the collection, the database contains everything from conceptual sketches of Frank Gehry's Experience Music Project to photographs of the monastery of St. Keghard in Armenia.

  18. DNA Chips

    NSDL National Science Digital Library

    Science Netlinks

    2003-02-23

    In this lesson from Science NetLinks, students will conduct activities from a module called "DNA Chips: A Genetics Lab in the Palm of Your Hand." This module is part of the National Institutes of Health Snapshots series, which focuses on a single area of biomedical research to help students understand how science, people, ethics, and history all fit together. The module for this lesson is about the DNA microarray, also known as a DNA chip.

  19. Database development in toxicogenomics: issues and efforts.

    PubMed Central

    Mattes, William B; Pettit, Syril D; Sansone, Susanna-Assunta; Bushel, Pierre R; Waters, Michael D

    2004-01-01

    The marriage of toxicology and genomics has created not only opportunities but also novel informatics challenges. As with the larger field of gene expression analysis, toxicogenomics faces the problems of probe annotation and data comparison across different array platforms. Toxicogenomics studies are generally built on standard toxicology studies generating biological end point data, and as such, one goal of toxicogenomics is to detect relationships between changes in gene expression and in those biological parameters. These challenges are best addressed through data collection into a well-designed toxicogenomics database. A successful publicly accessible toxicogenomics database will serve as a repository for data sharing and as a resource for analysis, data mining, and discussion. It will offer a vehicle for harmonizing nomenclature and analytical approaches and serve as a reference for regulatory organizations to evaluate toxicogenomics data submitted as part of registrations. Such a database would capture the experimental context of in vivo studies with great fidelity such that the dynamics of the dose response could be probed statistically with confidence. This review presents the collaborative efforts between the European Molecular Biology Laboratory-European Bioinformatics Institute ArrayExpress, the International Life Sciences Institute Health and Environmental Science Institute, and the National Institute of Environmental Health Sciences National Center for Toxicogenomics Chemical Effects in Biological Systems knowledge base. The goal of this collaboration is to establish public infrastructure on an international scale and examine other developments aimed at establishing toxicogenomics databases.
    In this review we discuss several issues common to such databases: the requirement for identifying minimal descriptors to represent the experiment, the demand for standardizing data storage and exchange formats, the challenge of creating standardized nomenclature and ontologies to describe biological data, the technical problems involved in data upload, the necessity of defining parameters that assess and record data quality, and the development of standardized analytical approaches. PMID:15033600

  20. The PEP-II project-wide database

    SciTech Connect

    Chan, A.; Calish, S.; Crane, G.; MacGregor, I.; Meyer, S.; Wong, J.

    1995-05-01

    The PEP-II Project Database is a tool for monitoring the technical and documentation aspects of this accelerator construction. It holds the PEP-II design specifications, fabrication and installation data in one integrated system. Key pieces of the database include the machine parameter list, magnet and vacuum fabrication data, CAD drawings, publications and documentation, survey and alignment data, and property control. The database can be extended to contain information required for the operations phase of the accelerator and detector. Features such as viewing CAD drawing graphics from the database will be implemented in the future. This central Oracle database on a UNIX server is built using Oracle CASE tools. Users at the three collaborating laboratories (SLAC, LBL, LLNL) can access the data remotely, using various desktop computer platforms and graphical interfaces.

  1. NBER Macrohistory Database

    NSDL National Science Digital Library

    1998-01-01

    The National Bureau of Economic Research (NBER) (discussed in the September 22, 1995 Scout Report) offers a Macrohistory Database of 3500 monthly, quarterly, and annual economic time series on pre-WWI and interwar economies in addition to their other historical data sets and working papers. Fifteen Macrohistory chapters cover United States production, construction, employment, money, prices, asset market transactions, foreign trade, and government activity, with some coverage of the United Kingdom, France, and Germany. Although the database is searchable by keyword, an analytical index (.pdf format) created by the original NBER compilers gives more detailed information on data series creation and organization.

  2. Botanical Image Database

    NSDL National Science Digital Library

    2001-09-06

    This database, published by the University of Basel, Switzerland, archives thousands of images (photos and woodcuts) of plants. The collection is arranged alphabetically by species, genus, family, or order, and can be searched by keyword. The advanced search allows users to specify taxonomic groups, physical properties of plants (morphology, growth form, ecology, and others), presentation format, and other parameters. There are also specialized collections from certain geographic regions, and collections of specific types (woody plants, vegetation, pollinators). Each image is accompanied by genus and species, family, a brief description, and location information. The database is available in English and German versions.

  3. Earth Impact Database

    NSDL National Science Digital Library

    This database contains information on confirmed impact structures around the world. A series of interactive maps, one for each continent, shows the locations of impact structures. Users can roll their mouse over the locations to see structure names and click on them to access physical information such as location (latitude and longitude), diameter, age, maps, photos, cross-sections (where available), and a list of references. The database can also be sorted by age, diameter, or name. There is also a list of the principal criteria for determining if a geological feature is an impact structure, an essay on impact cratering on Earth, and a frequently-asked-questions page.

  4. Databases for plant phosphoproteomics.

    PubMed

    Schulze, Waltraud X; Yao, Qiuming; Xu, Dong

    2015-01-01

    Phosphorylation is the most studied posttranslational modification involved in signal transduction in stress responses, development, and growth. In recent years, large-scale phosphoproteomic studies were carried out using various model plants and several growth and stress conditions. Here we present an overview of online resources for plant phosphoproteomic databases: PhosPhAt as a resource for Arabidopsis phosphoproteins, P3DB as a resource expanding to crop plants, and Medicago PhosphoProtein Database as a resource for the model plant Medicago truncatula. PMID:25930705

  5. Universal Chalcidoidea Database

    NSDL National Science Digital Library

    Noyes, John S.

    Chalcidoid wasps now have an excellent database all their own thanks to John Noyes of London's Natural History Museum. The Universal Chalcidoidea Database contains an extensive set of taxonomic and bibliographic records, as well as nearly 400 photos of living chalcidoids. Users will also find a key to chalcidoid families, information on collecting and preserving specimens, and a brief overview of this extremely diverse yet poorly understood group of wee wasps, which includes the world's smallest adult insect -- measuring in at a mere 0.11 mm.

  6. Organic Compounds Database

    NSDL National Science Digital Library

    Bell, Harold M.

    2000-01-01

    The Colby College Department of Chemistry offers the Organic Compounds Database, which was compiled by Harold Bell of the Virginia Polytechnic Institute. Visitors can search by a compound's melting point, boiling point, index of refraction, molecular weight, formula, absorption wavelength, mass spectral peak, chemical type, and by partial name. Once a query is entered, results are returned with basically the same type of information that can be searched, plus any other critical information. References are provided for the close to 2500 organic compounds included in the database; yet, because the site was last modified in 1995, verifying the data against other sources may be required to fully authenticate its accuracy.

  7. Building the GEM Faulted Earth database

    NASA Astrophysics Data System (ADS)

    Litchfield, N. J.; Berryman, K. R.; Christophersen, A.; Thomas, R. F.; Wyss, B.; Tarter, J.; Pagani, M.; Stein, R. S.; Costa, C. H.; Sieh, K. E.

    2011-12-01

    The GEM Faulted Earth project is aiming to build a global active fault and seismic source database with a common set of strategies, standards, and formats, to be placed in the public domain. Faulted Earth is one of five hazard global components of the Global Earthquake Model (GEM) project. A key early phase of the GEM Faulted Earth project is to build a database which is flexible enough to capture existing and variable (e.g., from slow interplate faults to fast subduction interfaces) global data, and yet is not too onerous to enter new data from areas where existing databases are not available. The purpose of this talk is to give an update on progress building the GEM Faulted Earth database. The database design conceptually has two layers, (1) active faults and folds, and (2) fault sources, and automated processes are being defined to generate fault sources. These include the calculation of moment magnitude using a user-selected magnitude-length or magnitude-area scaling relation, and the calculation of recurrence interval from displacement divided by slip rate, where displacement is calculated from moment and moment magnitude. The fault-based earthquake sources defined by the Faulted Earth project will then be rationalised with those defined by the other GEM global components. A web based tool is being developed for entering individual faults and folds, and fault sources, and includes capture of additional information collected at individual sites, as well as descriptions of the data sources. GIS shapefiles of individual faults and folds, and fault sources will also be able to be uploaded. A data dictionary explaining the database design rationale, definitions of the attributes and formats, and a tool user guide is also being developed. Existing national databases will be uploaded outside of the fault compilation tool, through a process of mapping common attributes between the databases. 
    Regional workshops are planned for compilation in areas where existing databases are not available, or require further population, and will include training on using the fault compilation tool. The tool is also envisaged as an important legacy of the GEM Faulted Earth project, to be available for use beyond the end of the 2-year project.
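
    The fault-source calculations named in the abstract above (moment magnitude from a magnitude-length scaling relation, and recurrence interval as displacement divided by slip rate) can be sketched as follows. This is a minimal illustration, not the GEM tool's implementation. The default coefficients are those commonly quoted for the Wells and Coppersmith (1994) all-slip-type magnitude-length regression, the moment conversion is the standard Hanks-Kanamori relation, and the rigidity, rupture width, and function names are assumptions introduced for this example.

```python
import math


def moment_magnitude_from_length(length_km, a=5.08, b=1.16):
    """Magnitude-length scaling, Mw = a + b * log10(L in km).

    The default a and b are illustrative; a real tool would let the user
    select the scaling relation.
    """
    return a + b * math.log10(length_km)


def seismic_moment(mw):
    """Seismic moment in N*m from moment magnitude (Hanks-Kanamori)."""
    return 10 ** (1.5 * mw + 9.05)


def average_displacement(moment_nm, length_km, width_km, rigidity_pa=3.0e10):
    """Average slip in metres from M0 = rigidity * area * displacement.

    The rigidity of 3e10 Pa is an assumed typical crustal value.
    """
    area_m2 = (length_km * 1.0e3) * (width_km * 1.0e3)
    return moment_nm / (rigidity_pa * area_m2)


def recurrence_interval_years(displacement_m, slip_rate_mm_per_yr):
    """Recurrence interval = displacement per event / long-term slip rate."""
    return displacement_m / (slip_rate_mm_per_yr * 1.0e-3)


if __name__ == "__main__":
    # Hypothetical fault: 100 km long, 15 km wide, slipping at 5 mm/yr.
    mw = moment_magnitude_from_length(100.0)
    slip = average_displacement(seismic_moment(mw), 100.0, 15.0)
    print(mw, slip, recurrence_interval_years(slip, 5.0))
```

    Keeping each step as a separate function mirrors the abstract's description of a user-selected scaling relation: swapping in a magnitude-area relation would only replace the first function.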

  8. Protein databases on the internet.

    PubMed

    Xu, Dong; Xu, Ying

    2004-11-01

    Protein databases have become a crucial part of modern biology. Huge amounts of data for protein structures, functions, and particularly sequences are being generated. Searching databases is often the first step in the study of a new protein. Comparison between proteins and between protein families in databases provides information about the relationship between proteins within a genome or across different species, and hence offers much more information than can be obtained by studying only an isolated protein. In addition, secondary databases derived from experimental databases are also widely available. These databases reorganize and annotate the data or provide predictions. The use of multiple databases often helps researchers understand the structure and function of proteins. Although some protein databases are widely known, they are far from being fully utilized in the protein science community. This unit provides a starting point for readers to explore the potential of protein databases on the Internet. PMID:18265344

  9. Protein databases on the internet.

    PubMed

    Xu, Dong; Xu, Ying

    2004-11-01

    Protein databases have become a crucial part of modern biology. Huge amounts of data for protein structures, functions, and particularly sequences are being generated. Searching databases is often the first step in the study of a new protein. Comparison between proteins and between protein families in databases provides information about the relationship between proteins within a genome or across different species, and hence offers much more information than can be obtained by studying only an isolated protein. In addition, secondary databases derived from experimental databases are also widely available. These databases reorganize and annotate the data or provide predictions. The use of multiple databases often helps researchers understand the structure and function of proteins. Although some protein databases are widely known, they are far from being fully utilized in the protein science community. This unit provides a starting point for readers to explore the potential of protein databases on the Internet. PMID:18429255

  10. Radiation Genes: a database devoted to microarrays screenings revealing transcriptome alterations induced by ionizing radiation in mammalian cells

    PubMed Central

    Chiani, Francesco; Iannone, Camilla; Negri, Rodolfo; Paoletti, Daniele; D’Antonio, Mattia; De Meo, Paolo D’onorio; Castrignanò, Tiziana

    2009-01-01

    The analysis of the large volume of data generated by DNA microarray technologies has shown that the transcriptional response to radiation can be considerably different depending on the quality, the dose range and dose rate of radiation, as well as the timing selected for the analysis. At present, it is very difficult to integrate data obtained under several experimental conditions in different biological systems to reach overall conclusions or build regulatory models which may be tested and validated. In fact, most available data is buried in different websites, public or private, in general or local repositories or in files included in published papers; it is often in various formats, which makes a wide comparison even more difficult. The Radiation Genes Database (http://www.caspur.it/RadiationGenes) collects microarray data from various local and public repositories or from published papers and supplementary materials. The database classifies the data in terms of significant variables, such as radiation quality, dose, dose rate and sampling timing, so as to provide user-friendly tools to facilitate data integration and comparison. PMID:20157480

  11. Cancer Control Publications 1998-2011

    Cancer.gov

    CC Publications demonstrates the depth and breadth of research in cancer control and population sciences at NCI. This searchable database includes publications from the Division of Cancer Control and Population Sciences (DCCPS) staff and DCCPS-funded research. Learn more about CC Publications.

  12. ORNL Publications External Publication

    E-print Network

    Pennycook, Steve

    ORNL Publications External Publication Job Posting Title Postdoctoral Research Associate - Physical Sciences Directorate (PSD) at Oak Ridge National Laboratory (ORNL). Major Duties in CNMS. Interfaces with administrative staff, managers and visitors to ORNL. Measures of Effectiveness

  13. Bibliographical database of radiation biological dosimetry and risk assessment: Part 1, through June 1988

    SciTech Connect

    Straume, T.; Ricker, Y.; Thut, M.

    1988-08-29

    This database was constructed to support research in radiation biological dosimetry and risk assessment. Relevant publications were identified through detailed searches of national and international electronic databases and through our personal knowledge of the subject. Publications were numbered and keyworded, and referenced in an electronic data-retrieval system that permits quick access through computerized searches on publication number, authors, key words, title, year, and journal name. Photocopies of all publications contained in the database are maintained in a file that is numerically arranged by citation number. This report of the database is provided as a useful reference and overview. It should be emphasized that the database will grow as new citations are added to it. With that in mind, we arranged this report in order of ascending citation number so that follow-up reports will simply extend this document. The database cites 1212 publications. Publications are from 119 different scientific journals; 27 of these journals are cited at least 5 times. It also contains references to 42 books and published symposia, and 129 reports. Information relevant to radiation biological dosimetry and risk assessment is widely distributed among the scientific literature, although a few journals clearly dominate. The four journals publishing the largest number of relevant papers are Health Physics, Mutation Research, Radiation Research, and International Journal of Radiation Biology. Publications in Health Physics make up almost 10% of the current database.

  14. The AMMA database

    Microsoft Academic Search

    Jean-Luc Boichard; Guillaume Brissebrat; Sophie Cloche; Laurence Eymard; Laurence Fleury; Laurence Mastrorillo; Oumarou Moulaye; Karim Ramage

    2010-01-01

    The AMMA project includes aircraft, ground-based and ocean measurements, an intensive use of satellite data and diverse modelling studies. Therefore, the AMMA database aims at storing a great amount and a large variety of data, and at providing the data as rapidly and safely as possible to the AMMA research community. In order to stimulate the exchange of information and

  15. Triatomic Spectral Database

    National Institute of Standards and Technology Data Gateway

    SRD 117 Triatomic Spectral Database (Web, free access)   All of the rotational spectral lines observed and reported in the open literature for 55 triatomic molecules have been tabulated. The isotopic molecular species, assigned quantum numbers, observed frequency, estimated measurement uncertainty and reference are given for each transition reported.

  16. Enzyme Database - BRENDA

    NSDL National Science Digital Library

    Institute of Biochemistry and Bioinformatics at the Technical University of Braunschweig, Germany

    BRENDA is the main collection of enzyme functional data available to the scientific community. It is available free of charge via the internet (www.brenda-enzymes.info) and as an in-house database for commercial users (requests to our distributor Biobase).

  17. Query Nuclear Explosions Database

    NSDL National Science Digital Library

    2002-01-01

    NUCEXP, National Geoscience Database, provided by the Australian Geological Survey Organization (AGSO), contains entries on nuclear explosions around the world since 1945, with the location, time and size of explosions. To view the records, users must select site and country conducting the test and beginning/end dates.

  18. Cotton Marker Database

    Technology Transfer Automated Retrieval System (TEKTRAN)

    To address the lack of available molecular markers for cotton, Cotton Incorporated has spearheaded an initiative to create the Cotton Microsatellite Database (CMD), and several groups are actively involved in projects to generate, screen and map cotton molecular markers. CMD is a centralized databas...

  19. Hydrocarbon Spectral Database

    National Institute of Standards and Technology Data Gateway

    SRD 115 Hydrocarbon Spectral Database (Web, free access)   All of the rotational spectral lines observed and reported in the open literature for 91 hydrocarbon molecules have been tabulated. The isotopic molecular species, assigned quantum numbers, observed frequency, estimated measurement uncertainty and reference are given for each transition reported.

  20. NATIONAL ASSESSMENT DATABASE (NAD)

    EPA Science Inventory

    Resource Purpose: The National Assessment Database stores State water quality assessments that are reported under Section 305(b) of the Clean Water Act. The data are stored by individual water quality assessments. Threatened, partially and not supporting waters also have da...

  1. SGD: Saccharomyces Genome Database

    Microsoft Academic Search

    J. Michael Cherry; Caroline Adler; Catherine A. Ball; Stephen A. Chervitz; Selina S. Dwight; Erich T. Hester; Yankai Jia; Gail Juvik; Taiyun Roe; Mark Schroeder; Shuai Weng; David Botstein

    1998-01-01

    The Saccharomyces Genome Database (SGD) provides Internet access to the complete Saccharomyces cerevisiae genomic sequence, its genes and their products, the phenotypes of its mutants, and the literature supporting these data. The amount of information and the number of features provided by SGD have i

  2. Databases and data mining

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Over the course of the past decade, the breadth of information that is made available through online resources for plant biology has increased astronomically, as have the interconnectedness among databases, online tools, and methods of data acquisition and analysis. For maize researchers, the numbe...

  3. High Performance Buildings Database

    DOE Data Explorer

    The High Performance Buildings Database is a shared resource for the building industry, a unique central repository of in-depth information and data on high-performance, green building projects across the United States and abroad. The database includes information on the energy use, environmental performance, design process, finances, and other aspects of each project. Members of the design and construction teams are listed, as are sources for additional information. In total, up to twelve screens of detailed information are provided for each project profile. Projects range in size from small single-family homes or tenant fit-outs within buildings to large commercial and institutional buildings and even entire campuses. The database is a data repository as well. A series of Web-based data-entry templates allows anyone to enter information about a building project into the database. Once a project has been submitted, each of the partner organizations can review the entry and choose whether or not to publish that particular project on its own Web site.

  4. ENVIRONMENTAL FATE DATABASE (ENVIROFATE)

    EPA Science Inventory

    The Environmental Fate Database contains more than 13,000 records of information on the environmental fate or behavior (i.e., transport and degradation) of approximately 800 chemicals released into the environment. Chemicals selected for inclusion are produced in quantities exceed...

  5. Diatomic Spectral Database

    National Institute of Standards and Technology Data Gateway

    SRD 114 Diatomic Spectral Database (Web, free access)   All of the rotational spectral lines observed and reported in the open literature for 121 diatomic molecules have been tabulated. The isotopic molecular species, assigned quantum numbers, observed frequency, estimated measurement uncertainty, and reference are given for each transition reported.

  6. NATIONAL CONTAMINANT OCCURRENCE DATABASE

    EPA Science Inventory

    Resource Purpose: Under the 1996 Safe Drinking Water Act Amendments, EPA is to assemble a National Drinking Water Occurrence Database (NCOD) by August 1999. The NCOD is a collection of data of documented quality on unregulated and regulated chemical, radiological, microbia...

  7. The Peptaibol Database

    NSDL National Science Digital Library

    Chugh, Jasveen.

    This online database from the Crystallography Department at Birkbeck College, London "deals primarily with naturally occurring peptides, generally these have a fungal origin." The search function is easy to use; data can be queried by name, family, or residue motif. Another useful feature is the Peptaibol Picture Gallery, which includes images of several peptaibols.

  8. Agricultural Market Access Database

    NSDL National Science Digital Library

    Founded in early 1999, the Agricultural Market Access Database (AMAD) is a joint effort by Agriculture and AgriFood Canada, EU Commission - Agriculture Directorate-General, Food and Agriculture Organisation of the United Nations, Organisation for Economic Co-operation and Development, The World Bank, United Nations Conference on Trade and Development, and United States Department of Agriculture - Economic Research Service. AMAD was created in order to "provide a common data set on tariffs, TRQs and imports, as well as the tools for researchers, policymakers, and others to use in analyzing levels of tariff protection in agriculture among WTO Members." Users begin by selecting a region on a map; from there they can narrow their search by country, and the database will generate a data set on that country. AMAD also provides a 30-page User's Guide which helps explain the purpose and uses of the database, as well as helping to decipher some of the information contained in AMAD. At present, not all countries worldwide can be accessed through this database. AMAD, however, promises to continue to expand.

  9. Lecture Notes Database Management

    E-print Network

    O'Neil, Patrick

    SQL Programs. Embedded SQL means SQL statements embedded in host language (C in our case). The original idea was for end-users to access a database through SQL. Called casual users. But this is not a good idea. Takes too much concentration. Can you picture airline reservation clerk doing job with SQL
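
    The idea in these notes, SQL statements issued from inside a host-language program rather than typed by the end user, can be sketched as follows. The notes use C as the host language; this sketch swaps in Python's standard sqlite3 module, and the flights table, its columns, and both function names are hypothetical, chosen only to echo the airline reservation example.

```python
import sqlite3


def seats_left(conn, flight_no):
    """Return the number of free seats, or None if the flight is unknown."""
    row = conn.execute(
        "SELECT capacity - booked FROM flights WHERE flight_no = ?",
        (flight_no,),  # host-language value bound into the SQL statement
    ).fetchone()
    return None if row is None else row[0]


def book_seat(conn, flight_no):
    """Book one seat if any is free; return True on success."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE flights SET booked = booked + 1 "
            "WHERE flight_no = ? AND booked < capacity",
            (flight_no,),
        )
        return cur.rowcount == 1


def demo_connection():
    """Build a tiny in-memory database with one hypothetical flight."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE flights ("
        "flight_no TEXT PRIMARY KEY, capacity INTEGER, booked INTEGER)"
    )
    conn.execute("INSERT INTO flights VALUES ('XX100', 2, 1)")
    return conn
```

    A reservation clerk would then work through a form or menu that calls book_seat, and would never need to see the SQL text itself.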

  10. GENERAL PERMITS DATABASE

    EPA Science Inventory

    Resource Purpose: This database was used to provide permit writers with a library of examples for writing general permits. It has not been maintained and is outdated and will be removed. Water Permits Division is trying to determine whether or not to recreate this databas...

  11. NATIONAL NUTRIENTS DATABASE

    EPA Science Inventory

    Resource Purpose: The Nutrient Criteria Program has initiated development of a National relational database application that will be used to store and analyze nutrient data. The ultimate use of these data will be to derive ecoregion- and waterbody-specific numeric nutrient...

  12. Reception of Texts Database

    NSDL National Science Digital Library

    Created by the Reception of Texts Project at the Open University, this pilot database is designed to help practitioners of reception studies "address issues of performance with the same degree of rigour and attention to evidence which is expected in textual studies and to develop ways of documenting performance which recognise its cross disciplinary and creative dimensions." To that end, academics and students in classical studies, literature, theater studies, and related fields can use this database to search for information on the performances of Greek plays in the original and in adaptations, versions and translations in English from c.1970 to the present, and in the future, poetry in English which draws on Greek texts, myths, and images. The database offers nine search categories, each with a slightly different search format, some offering only a simple keyword search, others with multiple modifiers, and others with pull-down menus for browsing. With the exception of the Critical Works category, searches ultimately return a Production Details page which generally includes modern and original title, year, theater, dates of performance, company, and music, design, and general notes. A useful feature throughout the database is a Missing Information form, which allows users to submit additional or missing information about specific entries.

  13. The SIMBAD astronomical database.

    NASA Astrophysics Data System (ADS)

    Egret, D.; Wenger, M.; Dubois, P.

    SIMBAD is the astronomical data base produced and maintained by the Centre de Données astronomiques de Strasbourg (CDS), at the Observatoire de Strasbourg, France. The acronym SIMBAD stands for Set of Identifications, Measurements, and Bibliography for Astronomical Data. The authors describe here the present status of the database, the features related to access, usage and updating, and finally describe the expected future developments.

  14. Molecular systematic analysis of the order Proteocephalidea (Eucestoda) based on mitochondrial and nuclear rDNA sequences 1 Note: Nucleotide sequence data reported in this paper are available in the EMBL, GenBank ™ and DDBJ databases under the accession numbers AJ 238826 to 238829, AJ 23881 to 238832, AJ 238834 to 238837, AJ 388590 to 388638, AJ 389477 to 389524. Alignments are available from the following URL: http://www.herbaria.harvard.edu/treebase. Under the study accession number 5389 and the matrix accession numbers M543 and M544. This is part of the PhD thesis of the first author. 1

    Microsoft Academic Search

    M. P Zehnder; J Mariaux

    1999-01-01

    Two ribosomal DNA sequences were used to infer phylogenetic relationships among the Eucestoda order Proteocephalidea. A 437bp portion of the 16S mitochondrial and a 1149bp 5′ portion of the nuclear large sub-unit rRNA molecule were sequenced for 53 proteocephalidean cestodes (representing nine subfamilies and 22 genera) and for one outgroup species. Parsimony and distance-based analyses of the two databases, alone

  15. JDD, Inc. Database

    NASA Technical Reports Server (NTRS)

    Miller, David A., Jr.

    2004-01-01

    JDD, Inc. is a maintenance and custodial contracting company whose mission is to provide its clients in the private and government sectors "quality construction, construction management and cleaning services in the most efficient and cost effective manners" (JDD, Inc. Mission Statement). This company provides facilities support for Fort Riley in Fort Riley, Kansas and the NASA John H. Glenn Research Center at Lewis Field here in Cleveland, Ohio. JDD, Inc. is owned and operated by James Vaughn, who started as a painter at NASA Glenn and has been working here for the past seventeen years. This summer I worked under Devan Anderson, who is the safety manager for JDD, Inc. in the Logistics and Technical Information Division (LTID) at Glenn Research Center. The LTID provides all transportation, secretarial, and security needs and contract management of these various services for the center. As a safety manager, my mentor provides Occupational Safety and Health Administration (OSHA) compliance to all JDD, Inc. employees and handles all other issues (Environmental Protection Agency issues, workers' compensation, safety and health training) relating to job safety. My summer assignment was not considered "groundbreaking research" like the work many other summer interns have done in the past, but it is just as important and beneficial to JDD, Inc. I initially created a database using Microsoft Excel to classify and categorize data pertaining to numerous safety training certification courses taught by our safety manager during the fiscal year. This early portion of the database consisted only of data from the training certification courses (training field index, and which employees were present at or absent from these courses). Once I completed this phase of the database, I decided to expand the database and add as many dimensions to it as possible. Throughout the last seven weeks, I have been compiling more data from day-to-day operations and adding the information to the database. It now consists of seven different categories of data (carpet cleaning, forms, NASA Event Schedules, training certifications, wall and vent cleaning, work schedules, and miscellaneous). I also did some field inspecting with the supervisors around the site and was present at all of the training certification courses that have been scheduled since June 2004. My future outlook for the JDD, Inc. database is to have all of the company's information, from future contract proposals and weekly inventory to employee timesheets, in this same database.

  16. DNA nanomachines

    Microsoft Academic Search

    Jonathan Bath; Andrew J. Turberfield

    2007-01-01

    We are learning to build synthetic molecular machinery from DNA. This research is inspired by biological systems in which individual molecules act, singly and in concert, as specialized machines: our ambition is to create new technologies to perform tasks that are currently beyond our reach. DNA nanomachines are made by self-assembly, using techniques that rely on the sequence-specific interactions that

  17. DNA Pendant

    E-print Network

    Hacker, Randi; Tsutsui, William

    2007-11-14

    Broadcast Transcript: It's a symbol of commitment. It's a memento mori. It's the DNA pendant offered by Japan's Eiwa Industry and it's two, two, two things in one. Using genetic extraction, Eiwa removes the DNA from, say, a strand of hair or a...

  18. DNA Tutorial

    NSDL National Science Digital Library

    Yvette

    This site is an excellent resource on the structure and function of DNA as well as its role in genes and chromosomes. It also covers DNA replication, RNA structure and function, RNA synthesis, the genetic code, and protein synthesis. The site includes a tutorial that tests comprehension of the covered subjects.

  19. NATIVE HEALTH DATABASES: NATIVE HEALTH HISTORY DATABASE (NHHD)

    EPA Science Inventory

    The Native Health Databases contain bibliographic information and abstracts of health-related articles, reports, surveys, and other resource documents pertaining to the health and health care of American Indians, Alaska Natives, and Canadian First Nations. The databases provide i...

  20. NATIVE HEALTH DATABASES: NATIVE HEALTH RESEARCH DATABASE (NHRD)

    EPA Science Inventory

    The Native Health Databases contain bibliographic information and abstracts of health-related articles, reports, surveys, and other resource documents pertaining to the health and health care of American Indians, Alaska Natives, and Canadian First Nations. The databases provide i...

  1. Phyto diab care: Phytoremedial database for antidiabetics

    PubMed Central

    Luhach, Shruti; Goel, Anshita; Taj, Gohar; Goyal, Peyush; Kumar, Anil

    2013-01-01

    Diabetes, a chronic disease debilitating to a normal healthy lifestyle, arises from insufficient insulin production or ineffective utilization of the insulin produced. Although pharmaceutical research has brought up remedial drugs and numerous candidates in various phases of clinical trials, off-target effects and unwanted physiological actions are a constant source of concern and a contraindication in diabetic patients. Here we present a phytoremedial database, Phyto Diab Care, broadly applicable to any known anti-diabetic medicinal plant and phytochemicals sourced from them. Utilization of traditional medicine knowledge for combating diabetes without creating unwanted physiological actions is our major emphasis. Data collected from peer-reviewed publications and phytochemicals were added to the customizable database by means of an extended relational design. The strength of this resource is in providing rapid retrieval of data from large volumes of text at a high degree of accuracy. An enhanced web interface allows multi-criteria based information filtering. Furthermore, the availability of 2D and 3D structures from molecular docking studies with any efficacy on the insulin signaling pathway makes the resource searchable and comparable in an intuitive manner. The Phyto Diab Care compendium is publicly available online. Availability: http://www.gbpuat-cbsh.ac.in/departments/bi/database/phytodiabcare/HOME%20PAGE/Home%20page.html PMID:23750083

  2. Leveraging a Critical Care Database

    PubMed Central

    Ghassemi, Marzyeh; Marshall, John; Singh, Nakul; Stone, David J.

    2014-01-01

    Background: Observational studies have found an increased risk of adverse effects such as hemorrhage, stroke, and increased mortality in patients taking selective serotonin reuptake inhibitors (SSRIs). The impact of prior use of these medications on outcomes in critically ill patients has not been previously examined. We performed a retrospective study to determine if preadmission use of SSRIs or serotonin norepinephrine reuptake inhibitors (SNRIs) is associated with mortality differences in patients admitted to the ICU. Methods: The retrospective study used a modifiable data mining technique applied to the publicly available Multiparameter Intelligent Monitoring in Intensive Care (MIMIC) 2.6 database. A total of 14,709 patient records, consisting of 2,471 in the SSRI/SNRI group and 12,238 control subjects, were analyzed. The study outcome was in-hospital mortality. Results: After adjustment for age, Simplified Acute Physiology Score, vasopressor use, ventilator use, and combined Elixhauser score, SSRI/SNRI use was associated with significantly increased in-hospital mortality (OR, 1.19; 95% CI, 1.02-1.40; P = .026). Among patient subgroups, risk was highest in patients with acute coronary syndrome (OR, 1.95; 95% CI, 1.21-3.13; P = .006) and patients admitted to the cardiac surgery recovery unit (OR, 1.51; 95% CI, 1.11-2.04; P = .008). Mortality appeared to vary by specific SSRI, with higher mortalities associated with higher levels of serotonin inhibition. Conclusions: We found significant increases in hospital stay mortality among those patients in the ICU taking SSRI/SNRIs prior to admission as compared with control subjects. Mortality was higher in patients receiving SSRI/SNRI agents that produce greater degrees of serotonin reuptake inhibition. The study serves to demonstrate the potential for the future application of advanced data examination techniques upon detailed (and growing) clinical databases being made available by the digitization of medicine. PMID:24371841
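
    The figures reported in this abstract can be checked for internal consistency with a few lines of arithmetic. The sketch below assumes, and the abstract does not state this, that the 95% intervals were computed symmetrically on the log-odds scale, as is usual for logistic regression; under that assumption each odds ratio should sit near the geometric mean of its interval bounds. All names in the code are illustrative, and only numbers quoted in the abstract are used.

```python
import math

# (odds ratio, lower bound, upper bound) exactly as quoted in the abstract.
REPORTED = {
    "overall": (1.19, 1.02, 1.40),
    "acute coronary syndrome": (1.95, 1.21, 3.13),
    "cardiac surgery recovery unit": (1.51, 1.11, 2.04),
}

GROUP_SIZES = (2471, 12238)  # SSRI/SNRI group, control subjects
TOTAL_RECORDS = 14709


def geometric_midpoint(lower, upper):
    """Centre of an interval that is symmetric on the log scale."""
    return math.sqrt(lower * upper)


def consistency_report(tolerance=0.01):
    """Map each estimate to True if it lies within tolerance of the midpoint."""
    return {
        name: abs(geometric_midpoint(lo, hi) - odds_ratio) <= tolerance
        for name, (odds_ratio, lo, hi) in REPORTED.items()
    }


def group_sizes_add_up():
    """Check that the two study groups sum to the reported total."""
    return sum(GROUP_SIZES) == TOTAL_RECORDS
```

    The tolerance of 0.01 allows for the two-decimal rounding in the abstract; a failed check would point to a transcription error or an interval built some other way, not necessarily to a flaw in the study.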

  3. Overview of the HUPO Plasma Proteome Project: Results from the pilot phase with 35 collaborating laboratories and multiple analytical groups, generating a core dataset of 3020 proteins and a publicly-available database

    Microsoft Academic Search

    Marcin Adamski; Rajasree Menon; Henning Hermjakob; Rolf Apweiler; Arie Admon; Ruedi Aebersold; Helmut Meyer; Young-Ki Paik; Jong-Shin Yoo; Peipei Ping; Joel G. Pounds; Joshua N. Adkins; Xiaohong Qian; Valerie Wasinger; Chi Yue Wu; Xiaohang Zhao; Rong Zeng; Alexander Archakov; Akira Tsugita; Ilan Beer; Akhilesh Pandey; Michael Pisano; Philip Andrews; Harald Tammen

    2005-01-01

    HUPO initiated the Plasma Proteome Project (PPP) in 2002. Its pilot phase has (1) evaluated advantages and limitations of many depletion, fractionation, and MS technology platforms; (2) compared PPP reference specimens of human serum and EDTA, heparin, and citrate-anticoagulated plasma; and (3) created a publicly-available knowledge base (www.bioinformatics.med.umich.edu/hupo/ppp; www.ebi.ac.uk/pride). Thirty-five participating laboratories in 13 countries submitted datasets. Working groups

  4. Native Pig and Chicken Breed Database: NPCDB.

    PubMed

    Jeong, Hyeon-Soo; Kim, Dae-Won; Chun, Se-Yoon; Sung, Samsun; Kim, Hyeon-Jeong; Cho, Seoae; Kim, Heebal; Oh, Sung-Jong

    2014-10-01

    Indigenous (native) breeds of livestock have higher disease resistance and adaptation to the environment due to high genetic diversity. Even though their extinction rate is accelerated due to the increase of commercial breeds, natural disaster, and civil war, there is a lack of well-established databases for the native breeds. Thus, we constructed the native pig and chicken breed database (NPCDB) which integrates available information on the breeds from around the world. It is a nonprofit public database aimed to provide information on the genetic resources of indigenous pig and chicken breeds for their conservation. The NPCDB (http://npcdb.snu.ac.kr/) provides the phenotypic information and population size of each breed as well as its specific habitat. In addition, it provides information on the distribution of genetic resources across the country. The database will contribute to understanding of the breed's characteristics such as disease resistance and adaptation to environmental changes as well as the conservation of indigenous genetic resources. PMID:25178289

  5. Native Pig and Chicken Breed Database: NPCDB

    PubMed Central

    Jeong, Hyeon-Soo; Kim, Dae-Won; Chun, Se-Yoon; Sung, Samsun; Kim, Hyeon-Jeong; Cho, Seoae; Kim, Heebal; Oh, Sung-Jong

    2014-01-01

    Indigenous (native) breeds of livestock have higher disease resistance and adaptation to the environment due to high genetic diversity. Even though their extinction rate is accelerated due to the increase of commercial breeds, natural disaster, and civil war, there is a lack of well-established databases for the native breeds. Thus, we constructed the native pig and chicken breed database (NPCDB) which integrates available information on the breeds from around the world. It is a nonprofit public database aimed to provide information on the genetic resources of indigenous pig and chicken breeds for their conservation. The NPCDB (http://npcdb.snu.ac.kr/) provides the phenotypic information and population size of each breed as well as its specific habitat. In addition, it provides information on the distribution of genetic resources across the country. The database will contribute to understanding of the breed’s characteristics such as disease resistance and adaptation to environmental changes as well as the conservation of indigenous genetic resources. PMID:25178289

  6. 1 Public Policy and Public Administration PUBLIC POLICY AND PUBLIC

    E-print Network

    Vertes, Akos

    Through its Trachtenberg School of Public Policy and Public Administration, Columbian College of Arts and Sciences offers the Master of Public Policy, Master of Public Administration, and the Doctor of Philosophy in the field

  7. GLIDA: GPCR--ligand database for chemical genomics drug discovery--database and tools update.

    PubMed

    Okuno, Yasushi; Tamon, Akiko; Yabuuchi, Hiroaki; Niijima, Satoshi; Minowa, Yohsuke; Tonomura, Koichiro; Kunimoto, Ryo; Feng, Chunlai

    2008-01-01

    G-protein coupled receptors (GPCRs) represent one of the most important families of drug targets in pharmaceutical development. GLIDA is a public GPCR-related Chemical Genomics database that is primarily focused on the integration of information between GPCRs and their ligands. It provides interaction data between GPCRs and their ligands, along with chemical information on the ligands, as well as biological information regarding GPCRs. These data are connected with each other in a relational database, allowing users in the field of Chemical Genomics research to easily retrieve such information from either biological or chemical starting points. GLIDA includes a variety of similarity search functions for the GPCRs and for their ligands. Thus, GLIDA can provide correlation maps linking the searched homologous GPCRs (or ligands) with their ligands (or GPCRs). By analyzing the correlation patterns between GPCRs and ligands, we can gain more detailed knowledge about their conserved molecular recognition patterns and improve drug design efforts by focusing on inferred candidates for GPCR-specific drugs. This article provides a summary of the GLIDA database and user facilities, and describes recent improvements to database design, data contents, ligand classification programs, similarity search options and graphical interfaces. GLIDA is publicly available at http://pharminfo.pharm.kyoto-u.ac.jp/services/glida/. We hope that it will prove very useful for Chemical Genomics research and GPCR-related drug discovery. PMID:17986454

  8. GLIDA: GPCR—ligand database for chemical genomics drug discovery—database and tools update

    PubMed Central

    Okuno, Yasushi; Tamon, Akiko; Yabuuchi, Hiroaki; Niijima, Satoshi; Minowa, Yohsuke; Tonomura, Koichiro; Kunimoto, Ryo; Feng, Chunlai

    2008-01-01

    G-protein coupled receptors (GPCRs) represent one of the most important families of drug targets in pharmaceutical development. GLIDA is a public GPCR-related Chemical Genomics database that is primarily focused on the integration of information between GPCRs and their ligands. It provides interaction data between GPCRs and their ligands, along with chemical information on the ligands, as well as biological information regarding GPCRs. These data are connected with each other in a relational database, allowing users in the field of Chemical Genomics research to easily retrieve such information from either biological or chemical starting points. GLIDA includes a variety of similarity search functions for the GPCRs and for their ligands. Thus, GLIDA can provide correlation maps linking the searched homologous GPCRs (or ligands) with their ligands (or GPCRs). By analyzing the correlation patterns between GPCRs and ligands, we can gain more detailed knowledge about their conserved molecular recognition patterns and improve drug design efforts by focusing on inferred candidates for GPCR-specific drugs. This article provides a summary of the GLIDA database and user facilities, and describes recent improvements to database design, data contents, ligand classification programs, similarity search options and graphical interfaces. GLIDA is publicly available at http://pharminfo.pharm.kyoto-u.ac.jp/services/glida/. We hope that it will prove very useful for Chemical Genomics research and GPCR-related drug discovery. PMID:17986454

  9. Federated database systems for managing distributed, heterogeneous, and autonomous databases

    Microsoft Academic Search

    Amit P. Sheth; James A. Larson

    1990-01-01

    A federated database system (FDBS) is a collection of cooperating database systems that are autonomous and possibly heterogeneous. In this paper, we define a reference architecture for distributed database management systems from system and schema viewpoints and show how various FDBS architectures can be developed. We then define a methodology for developing one of the popular architectures of an FDBS.
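
    The notion of cooperating but autonomous and heterogeneous component databases can be illustrated with a small sketch. It is not the reference architecture the paper defines; it only shows two independent SQLite databases with different local schemas answering one federated query through per-component mappings. Every class, table, and column name here is hypothetical.

```python
import sqlite3


class Component:
    """One autonomous database plus the query that exports its rows."""

    def __init__(self, conn, export_sql):
        self.conn = conn
        self.export_sql = export_sql  # must yield (title, year) rows

    def export_rows(self):
        return self.conn.execute(self.export_sql).fetchall()


class Federation:
    """Answers queries over the shared (title, year) view of all components."""

    def __init__(self, components):
        self.components = components

    def records_since(self, year):
        rows = []
        for component in self.components:
            rows.extend(r for r in component.export_rows() if r[1] >= year)
        return sorted(rows)


def build_demo():
    """Two component databases whose local schemas disagree."""
    a = sqlite3.connect(":memory:")
    a.execute("CREATE TABLE papers (title TEXT, pub_year INTEGER)")
    a.executemany("INSERT INTO papers VALUES (?, ?)",
                  [("A1", 1988), ("A2", 1995)])

    b = sqlite3.connect(":memory:")
    b.execute("CREATE TABLE docs (name TEXT, published TEXT)")  # year as text
    b.executemany("INSERT INTO docs VALUES (?, ?)",
                  [("B1", "1990"), ("B2", "1985")])

    return Federation([
        Component(a, "SELECT title, pub_year FROM papers"),
        Component(b, "SELECT name, CAST(published AS INTEGER) FROM docs"),
    ])
```

    Each component keeps its own schema and connection, so it can still be used on its own; only the export query needs to know about the shared view.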

  10. The Intelligent Database Interface: Integrating AI and Database Systems

    Microsoft Academic Search

    Donald P. Mckay; Timothy W. Finin; Anthony B. O'hare

    1990-01-01

    The Intelligent Database Interface (IDI) is a cache-based interface that is designed to provide Artificial Intelligence systems with efficient access to one or more databases on one or more remote database management systems (DBMSs). It can be used to interface with a wide variety of different DBMSs with little or no modification since SQL is used to communicate with remote
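
    A cache sitting between a client program and an SQL database can be sketched roughly as below. This is an illustration of the general caching idea only, not the IDI design itself, which the abstract does not detail; the class, its methods, the genes table, and the use of an in-memory SQLite database in place of a remote DBMS are all assumptions.

```python
import sqlite3


class CachedInterface:
    """Keeps results of SQL queries so repeated requests skip the DBMS."""

    def __init__(self, conn):
        self.conn = conn
        self.cache = {}
        self.dbms_calls = 0  # how many queries actually reached the database

    def query(self, sql, params=()):
        key = (sql, tuple(params))
        if key not in self.cache:
            self.dbms_calls += 1
            self.cache[key] = self.conn.execute(sql, params).fetchall()
        return self.cache[key]

    def invalidate(self):
        """Drop cached results, for example after the database is updated."""
        self.cache.clear()


def build_demo():
    """Wrap a small in-memory database standing in for a remote DBMS."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE genes (name TEXT, organism TEXT)")
    conn.executemany("INSERT INTO genes VALUES (?, ?)",
                     [("g1", "yeast"), ("g2", "yeast"), ("g3", "maize")])
    return CachedInterface(conn)
```

    Because the only contact with the database is a plain SQL string, the same wrapper would work unchanged over any connection object exposing a compatible execute method, which mirrors the portability argument made in the abstract.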

  11. The GLIMS Glacier Database

    NASA Astrophysics Data System (ADS)

    Raup, B. H.; Khalsa, S. S.; Armstrong, R.

    2007-12-01

    The Global Land Ice Measurements from Space (GLIMS) project has built a geospatial and temporal database of glacier data, composed of glacier outlines and various scalar attributes. These data are being derived primarily from satellite imagery, such as from ASTER and Landsat. Each "snapshot" of a glacier is from a specific time, and the database is designed to store multiple snapshots representative of different times. We have implemented two web-based interfaces to the database; one enables exploration of the data via interactive maps (web map server), while the other allows searches based on text-field constraints. The web map server is an Open Geospatial Consortium (OGC) compliant Web Map Server (WMS) and Web Feature Server (WFS). This means that other web sites can display glacier layers from our site over the Internet, or retrieve glacier features in vector format. All components of the system are implemented using Open Source software: Linux, PostgreSQL, PostGIS (geospatial extensions to the database), MapServer (WMS and WFS), and several supporting components such as Proj.4 (a geographic projection library) and PHP. These tools are robust and provide a flexible and powerful framework for web mapping applications. As a service to the GLIMS community, the database contains metadata on all ASTER imagery acquired over glacierized terrain. Reduced-resolution versions of the images (browse imagery) can be viewed either as a layer in the MapServer application, or overlaid on the virtual globe within Google Earth. The interactive map application allows the user to constrain by time what data appear on the map. For example, ASTER images or glacier outlines from 2002 only, or from Autumn in any year, can be displayed. The system allows users to download their selected glacier data in a choice of formats. The results of a query based on spatial selection (using a mouse) or text-field constraints can be downloaded in any of these formats: ESRI shapefiles, KML (Google Earth), MapInfo, GML (Geography Markup Language) and GMT (Generic Mapping Tools). This "clip-and-ship" function allows users to download only the data they are interested in. Our flexible web interfaces to the database, which include various support layers (e.g., a layer to help collaborators identify satellite imagery over their region of expertise), will facilitate enhanced analysis of glacier systems, their distribution, and their impacts on other Earth systems.
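    As an illustration of what OGC compliance means for a remote client, the following minimal Python sketch assembles a WMS 1.1.1 GetMap request URL from the standard query parameters. The endpoint (https://example.org/wms), the layer name (glacier_outlines), and the bounding box are hypothetical placeholders, not the actual GLIMS service address or layer names.

```python
from urllib.parse import urlencode


def build_getmap_url(base_url, layer, bbox, width=800, height=600):
    """Assemble a WMS 1.1.1 GetMap URL.

    bbox is (min_lon, min_lat, max_lon, max_lat) in EPSG:4326.
    base_url and layer are supplied by the caller; the values used
    below are placeholders, not real GLIMS identifiers.
    """
    params = {
        "SERVICE": "WMS",
        "VERSION": "1.1.1",
        "REQUEST": "GetMap",
        "LAYERS": layer,
        "STYLES": "",          # empty string requests the default style
        "SRS": "EPSG:4326",    # WMS 1.1.1 uses SRS (1.3.0 uses CRS)
        "BBOX": ",".join(str(v) for v in bbox),
        "WIDTH": width,
        "HEIGHT": height,
        "FORMAT": "image/png",
    }
    return base_url + "?" + urlencode(params)


# Hypothetical endpoint and layer name, for illustration only.
url = build_getmap_url(
    "https://example.org/wms",
    "glacier_outlines",
    (7.5, 45.8, 8.2, 46.3),
)
print(url)
```

    Any WMS-aware client can issue a request of this shape, which is how a third-party site could display map layers served by another site.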

  12. Astronomical databases of Nikolaev Observatory

    NASA Astrophysics Data System (ADS)

    Protsyuk, Y.; Mazhaev, A.

    2008-07-01

    Several astronomical databases have been created at Nikolaev Observatory in recent years. The databases are built using the MySQL database engine and PHP scripts. They are available on the NAO website http://www.mao.nikolaev.ua.

  13. PARALLEL DATABASE MACHINES Kjell Bratbergsengen

    E-print Network

    and database servers for "new" data types, notably film and video. THE TRAUMATIC HISTORY OF DATABASE COMPUTERS and later, European Community supported developments. Also the massively parallel search system based

  14. A user-friendly phytoremediation database: creating the searchable database, the users, and the broader implications.

    PubMed

    Famulari, Stevie; Witz, Kyla

    2015-01-01

    Designers, students, teachers, gardeners, farmers, landscape architects, architects, engineers, homeowners, and others have uses for the practice of phytoremediation. This research looks at the creation of a phytoremediation database which is designed for ease of use for a non-scientific user, as well as for students in an educational setting (http://www.steviefamulari.net/phytoremediation). During 2012, Environmental Artist & Professor of Landscape Architecture Stevie Famulari, with assistance from Kyla Witz, a landscape architecture student, created an online searchable database designed for high public accessibility. The database is a record of research on plant species that aid in the uptake of contaminants, including metals, organic materials, biodiesels & oils, and radionuclides. The database consists of multiple interconnected indexes categorized into common and scientific plant name, contaminant name, and contaminant type. It includes photographs, hardiness zones, specific plant qualities, full citations to the original research, and other relevant information intended to help those designing with phytoremediation search for potential plants that may be used to address their site's needs. The objective of the terminology section is to remove uncertainty for less experienced users, and to clarify terms for a more user-friendly experience. Implications of the work, including education and ease of browsing, as well as use of the database in teaching, are discussed. PMID:26030361
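    One way to picture "multiple interconnected indexes" is a set of lookup tables that all point back to the same records. The Python sketch below is an assumption about how such a structure could work, not the database's actual implementation; the field names and the three records are placeholders and do not come from the real database.

```python
from collections import defaultdict

# Placeholder records (hypothetical; not entries from the actual database).
RECORDS = [
    {"common": "plant a", "scientific": "Genus alpha",
     "contaminant": "contaminant x", "type": "metals"},
    {"common": "plant b", "scientific": "Genus beta",
     "contaminant": "contaminant y", "type": "radionuclides"},
    {"common": "plant a", "scientific": "Genus alpha",
     "contaminant": "contaminant z", "type": "metals"},
]

FIELDS = ("common", "scientific", "contaminant", "type")


def build_indexes(records, fields):
    """Build one index per field, mapping a lowercased value to record positions."""
    indexes = {field: defaultdict(list) for field in fields}
    for position, record in enumerate(records):
        for field in fields:
            indexes[field][record[field].lower()].append(position)
    return indexes


def lookup(indexes, records, field, value):
    """Return every record whose given field matches value (case-insensitive)."""
    return [records[i] for i in indexes[field].get(value.lower(), [])]


indexes = build_indexes(RECORDS, FIELDS)
for record in lookup(indexes, RECORDS, "type", "Metals"):
    print(record["scientific"], "->", record["contaminant"])
```

    Because every index refers to the same record list, a user can start from a plant name, a contaminant name, or a contaminant type and arrive at the same entries.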

  15. An EST database from saffron stigmas

    PubMed Central

    D'Agostino, Nunzio; Pizzichini, Daniele; Chiusano, Maria Luisa; Giuliano, Giovanni

    2007-01-01

    Background Saffron (Crocus sativus L., Iridaceae) flowers have been used as a spice and medicinal plant ever since the Greek-Minoan civilization. The edible part – the stigmas – are commonly considered the most expensive spice in the world and are the site of a peculiar secondary metabolism, responsible for the characteristic color and flavor of saffron. Results We produced 6,603 high quality Expressed Sequence Tags (ESTs) from a saffron stigma cDNA library. This collection is accessible and searchable through the Saffron Genes database http://www.saffrongenes.org. The ESTs have been grouped into 1,893 Clusters, each corresponding to a different expressed gene, and annotated. The complete set of raw EST sequences, as well as their electropherograms, is maintained in the database, allowing users to investigate sequence qualities and EST structural features (vector contamination, repeat regions). The saffron stigma transcriptome contains a series of interesting sequences (putative sex determination genes, lipid and carotenoid metabolism enzymes, transcription factors). Conclusion The Saffron Genes database represents the first reference collection for the genomics of Iridaceae, for the molecular biology of stigma biogenesis, as well as for the metabolic pathways underlying saffron secondary metabolism. PMID:17925031

  16. Encoded evidence: DNA in forensic analysis.

    PubMed

    Jobling, Mark A; Gill, Peter

    2004-10-01

    Sherlock Holmes said "it has long been an axiom of mine that the little things are infinitely the most important", but never imagined that such a little thing, the DNA molecule, could become perhaps the most powerful single tool in the multifaceted fight against crime. Twenty years after the development of DNA fingerprinting, forensic DNA analysis is key to the conviction or exoneration of suspects and the identification of victims of crimes, accidents and disasters, driving the development of innovative methods in molecular genetics, statistics and the use of massive intelligence databases. PMID:15510165

  17. Database of Nordic Neo-Latin Literature

    NSDL National Science Digital Library

    Originating from a research project that involved Latinists from all five Nordic countries (Denmark, Finland, Iceland, Norway, and Sweden), this database is currently maintained and edited by professors Lars Boje Mortensen and Karen Skovgaard-Petersen, Department of Greek and Latin, University of Bergen, Norway, and Peter Zeeberg, Institut for Graesk og Latin, University of Copenhagen, Denmark. It lists selected Latin texts, written between the Reformation (c. 1530) and 1800, that pertain to Nordic people or locations. Scholars can search the database by keyword or by author, place of publication, language, and dedicatee. Visitors can also browse a list of current Neo-Latin scholars, consult a bibliography, view a historical map of Scandinavia, and read a brief note on the historical background of the region.

  18. Stanford University Medical Center: Ovarian Kaleidoscope Database

    NSDL National Science Digital Library

    The Ovarian Kaleidoscope Database (OKDB) was developed by the Hsueh Lab in the Department of Gynecology & Obstetrics at Stanford University Medical Center. The OKDB "provides information regarding the biological function, expression pattern and regulation of genes expressed in the ovary. It also contains information on gene sequences, chromosomal localization, human and murine mutation phenotypes and biomedical publication links." Database users can conduct a Gene Search, or browse an extensive Alphabetical List of Ovarian Genes. After registering with OKDB, site users can access Submit and Update options as well. The site also contains an interactive diagram of Ovarian Gene Mutations Associated with Infertility or Sub-Fertility, information about Ovarian Gene Maps, and a selection of Useful Links.

  19. Hydrogen Leak Detection Sensor Database

    NASA Technical Reports Server (NTRS)

    Baker, Barton D.

    2010-01-01

    This slide presentation reviews the characteristics of the Hydrogen Sensor database. The database is the result of NASA's continuing interest in and improvement of its ability to detect and assess gas leaks in space applications. The database specifics and a snapshot of an entry in the database are reviewed. Attempts were made to determine the applicability of each of the 65 sensors for ground and/or vehicle use.

  20. Great Lakes Shipping Database

    NSDL National Science Digital Library

    Provided by the University of Detroit Mercy Libraries/Media Services, this site is a great resource for anyone interested in the history of shipping on the Great Lakes. The database indexes information on a large number of ships that have worked these waters, offering information such as registry number, year built, final disposition, company, physical measurements, name of shipbuilders, and additional remarks, among other categories. Both company name and shipbuilder are cross-referenced to additional ships owned or built. Most of the entries also include some excellent historical photos, though these did not load correctly in Netscape (they worked fine with IE). The entry for the Edmund Fitzgerald, for instance, contained ten photos. The database may be searched by keyword with multiple modifiers.