Note: This page contains sample records for the topic public dna databases from Science.gov.
While these samples are representative of the content of Science.gov,
they are not comprehensive nor are they the most current set.
We encourage you to perform a real-time search of Science.gov
to obtain the most current and comprehensive results.
Last update: August 15, 2014.
1

The Genographic Project Public Participation Mitochondrial DNA Database  

PubMed Central

The Genographic Project is studying the genetic signatures of ancient human migrations and creating an open-source research database. It allows members of the public to participate in a real-time anthropological genetics study by submitting personal samples for analysis and donating the genetic results to the database. We report our experience from the first 18 months of public participation in the Genographic Project, during which we have created the largest standardized human mitochondrial DNA (mtDNA) database ever collected, comprising 78,590 genotypes. Here, we detail our genotyping and quality assurance protocols including direct sequencing of the mtDNA HVS-I, genotyping of 22 coding-region SNPs, and a series of computational quality checks based on phylogenetic principles. This database is very informative with respect to mtDNA phylogeny and mutational dynamics, and its size allows us to develop a nearest neighbor–based methodology for mtDNA haplogroup prediction based on HVS-I motifs that is superior to classic rule-based approaches. We make available to the scientific community and general public two new resources: a periodically updated database comprising all data donated by participants, and the nearest neighbor haplogroup prediction tool.

Behar, Doron M; Rosset, Saharon; Blue-Smith, Jason; Balanovsky, Oleg; Tzur, Shay; Comas, David; Mitchell, R. John; Quintana-Murci, Lluis; Tyler-Smith, Chris; Wells, R. Spencer

2007-01-01
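
The nearest neighbor idea in the abstract above can be illustrated with a small sketch. It assumes each HVS-I haplotype is stored as a set of variant labels relative to a reference sequence, and it assigns a query the haplogroup of the closest panel entry by symmetric difference. The panel, the labels, and the distance measure are toy assumptions for illustration, not the Genographic Project's data or its actual algorithm.

```python
from typing import FrozenSet, List, Tuple

# Hypothetical reference panel: (HVS-I variant set, haplogroup label).
# Entries are simplified toy motifs, not Genographic Project records.
REFERENCE_PANEL: List[Tuple[FrozenSet[str], str]] = [
    (frozenset({"16223T", "16278T"}), "X"),
    (frozenset({"16126C", "16294T"}), "T"),
    (frozenset({"16224C", "16311C"}), "K"),
]


def motif_distance(a: FrozenSet[str], b: FrozenSet[str]) -> int:
    """Count variants present in exactly one of the two haplotypes."""
    return len(a ^ b)


def predict_haplogroup(query: FrozenSet[str], panel=REFERENCE_PANEL) -> str:
    """Return the label of the panel haplotype closest to the query."""
    _, label = min(panel, key=lambda entry: motif_distance(query, entry[0]))
    return label


# A query carrying one private variant on top of a panel motif.
query = frozenset({"16126C", "16294T", "16296T"})
predicted = predict_haplogroup(query)
```

A rule-based classifier would require an exact motif match; the nearest neighbor variant tolerates extra or missing variants, which is the property the abstract highlights.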

2

On the reliability of DNA sequences of Ophiocordyceps sinensis in public databases.  

PubMed

Some DNA sequences in the International Nucleotide Sequence Databases (INSD) are erroneously annotated, which has led to misleading conclusions in publications. Ophiocordyceps sinensis (syn. Cordyceps sinensis) is a fungus endemic to the Tibetan Plateau, and more than 100 populations covering almost its entire distribution area have been examined by us over recent years. In this study, using the data from authentic materials, we have evaluated the reliability of nucleotide sequences annotated as O. sinensis in the INSD. As of October 15, 2012, the INSD contained 874 records annotated as O. sinensis, including 555 records representing nuclear ribosomal DNA (63.5%), 197 representing protein-coding genes (22.5%), 92 representing random markers with unknown functions (10.5%), and 30 representing microsatellite loci (3.5%). Our analysis indicated that 39 of the 397 internal transcribed spacer entries, 27 of the 105 small subunit entries, and five of the 53 large subunit entries were incorrectly annotated as belonging to O. sinensis. For protein-coding sequences, all records of serine protease genes, the mating-type gene MAT1-2-1, the DNA lyase gene, the two largest subunits of RNA polymerase II, and the elongation factor-1α gene were correct, while 14 of the 73 β-tubulin entries were indeterminate. Genetic diversity analyses using those sequences correctly identified as O. sinensis revealed significant genetic differentiation in the fungus, although the extent of genetic differentiation varied with the gene. The relationship between O. sinensis and some other related fungal taxa is also discussed. PMID:23397071

Zhang, Shu; Zhang, Yong-Jie; Liu, Xing-Zhong; Zhang, Hong; Liu, Dian-Sheng

2013-04-01

3

NCCDPHP PUBLICATION DATABASE  

EPA Science Inventory

This database provides bibliographic citations and abstracts of publications produced by the CDC's National Center for Chronic Disease Prevention and Health Promotion (NCCDPHP) including journal articles, monographs, book chapters, reports, policy documents, and fact sheets. Full...

4

The National DNA Database  

Microsoft Academic Search

Over the last two years the Forensic Science Service (FSS) has developed and put into operation a National DNA Database that has analysed samples from individuals suspected of crime and stains from scenes of crime. It has provided more than 2200 links between individuals and scenes and 1200 links between scenes of crime. It uses an STR SGM (second generation

David J Werrett

1997-01-01

5

Citation analysis of database publications  

Microsoft Academic Search

We analyze citation frequencies for two main database conferences (SIGMOD, VLDB) and three database journals (TODS, VLDB Journal, SIGMOD Record) over 10 years. The citation data is obtained by integrating and cleaning data from DBLP and Google Scholar. Our analysis considers different comparative metrics per publication venue, in particular the total and average number of citations as

Erhard Rahm; Andreas Thor

2005-01-01

6

Enhancing the DNA Patent Database  

SciTech Connect

Final Report on Award No. DE-FG0201ER63171 Principal Investigator: LeRoy B. Walters February 18, 2008 This project successfully completed its goal of surveying and reporting on the DNA patenting and licensing policies at 30 major U.S. academic institutions. The report of survey results was published in the January 2006 issue of Nature Biotechnology under the title “The Licensing of DNA Patents by US Academic Institutions: An Empirical Survey.” Lori Pressman was the lead author on this feature article. A PDF reprint of the article will be submitted to our Program Officer under separate cover. The project team has continued to update the DNA Patent Database on a weekly basis since the conclusion of the project. The database can be accessed at dnapatents.georgetown.edu. This database provides a valuable research tool for academic researchers, policymakers, and citizens. A report entitled Reaping the Benefits of Genomic and Proteomic Research: Intellectual Property Rights, Innovation, and Public Health was published in 2006 by the Committee on Intellectual Property Rights in Genomic and Protein Research and Innovation, Board on Science, Technology, and Economic Policy at the National Academies. The report was edited by Stephen A. Merrill and Anne-Marie Mazza. This report employed and then adapted the methodology developed by our research project and quoted our findings at several points. (The full report can be viewed online at the following URL: http://www.nap.edu/openbook.php?record_id=11487&page=R1). My colleagues and I are grateful for the research support of the ELSI program at the U.S. Department of Energy.

Walters, LeRoy B.

2008-02-18

7

Database Support for Research in Public Administration  

ERIC Educational Resources Information Center

This study examines the extent to which databases support student and faculty research in the area of public administration. A list of journals in public administration, public policy, political science, public budgeting and finance, and other related areas was compared to the journal content list of six business databases. These databases

Tucker, James Cory

2005-01-01

8

Compressing DNA sequence databases with coil  

PubMed Central

Background Publicly available DNA sequence databases such as GenBank are large, and are growing at an exponential rate. The sheer volume of data being dealt with presents serious storage and data communications problems. Currently, sequence data is usually kept in large "flat files," which are then compressed using standard Lempel-Ziv (gzip) compression – an approach which rarely achieves good compression ratios. While much research has been done on compressing individual DNA sequences, surprisingly little has focused on the compression of entire databases of such sequences. In this study we introduce the sequence database compression software coil. Results We have designed and implemented a portable software package, coil, for compressing and decompressing DNA sequence databases based on the idea of edit-tree coding. coil is geared towards achieving high compression ratios at the expense of execution time and memory usage during compression – the compression time represents a "one-off investment" whose cost is quickly amortised if the resulting compressed file is transmitted many times. Decompression requires little memory and is extremely fast. We demonstrate a 5% improvement in compression ratio over state-of-the-art general-purpose compression tools for a large GenBank database file containing Expressed Sequence Tag (EST) data. Finally, coil can efficiently encode incremental additions to a sequence database. Conclusion coil presents a compelling alternative to conventional compression of flat files for the storage and distribution of DNA sequence databases having a narrow distribution of sequence lengths, such as EST data. Increasing compression levels for databases having a wide distribution of sequence lengths is a direction for future work.

White, W Timothy J; Hendy, Michael D

2008-01-01
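
The baseline that the abstract above argues against, gzip applied to flat sequence files, is easy to measure. The sketch below computes a gzip compression ratio for a synthetic sequence and compares it with naive 2-bit packing of the four bases. It does not implement coil or edit-tree coding; the random sequence and the helper names are assumptions made only for illustration.

```python
import gzip
import random


def gzip_ratio(data: bytes) -> float:
    """Original size divided by gzip-compressed size."""
    return len(data) / len(gzip.compress(data))


def pack_2bit(seq: str) -> bytes:
    """Pack an A/C/G/T string into 2 bits per base (no ambiguity codes)."""
    code = {"A": 0, "C": 1, "G": 2, "T": 3}
    out = bytearray()
    for i in range(0, len(seq), 4):
        chunk = seq[i:i + 4]
        byte = 0
        for base in chunk:
            byte = (byte << 2) | code[base]
        byte <<= 2 * (4 - len(chunk))  # left-align a short final chunk
        out.append(byte)
    return bytes(out)


# Synthetic stand-in for a flat file; real EST data is far more redundant.
random.seed(0)
sequence = "".join(random.choice("ACGT") for _ in range(10_000))
flat = sequence.encode("ascii")
packed = pack_2bit(sequence)

print(f"gzip ratio on ASCII text: {gzip_ratio(flat):.2f}")
print(f"2-bit packing ratio:      {len(flat) / len(packed):.2f}")
```

On uniformly random bases, 2-bit packing is close to the best any compressor can do; the gains reported for database-level methods come from redundancy between related sequences, which this toy input lacks.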

9

Developing a DNA variant database.  

PubMed

Disease- and locus-specific variant databases have been a valuable resource to clinical and research geneticists. With the recent rapid developments in technologies, the number of DNA variants detected in a typical molecular genetics laboratory easily exceeds 1,000. To keep track of the growing inventory of DNA variants, many laboratories employ information technology to store the data as well as distributing the data and its associated information to clinicians and researchers via the Web. While it is a valuable resource, the hosting of a web-accessible database requires collaboration between bioinformaticians and biologists and careful planning to ensure its usability and availability. In this chapter, a series of tutorials on building a local DNA variant database out of a sample dataset will be provided. However, this tutorial will not include programming details on building a web interface and on constructing the web application necessary for web hosting. Instead, an introduction to the two commonly used methods for hosting web-accessible variant databases will be described. Apart from the tutorials, this chapter will also consider the resources and planning required for making a variant database project successful. PMID:18453092

Fung, David C Y

2008-01-01
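
A local variant database of the kind the chapter above describes could be prototyped with Python's built-in sqlite3 module. This is a minimal sketch under stated assumptions: the table layout, column names, and sample rows are hypothetical and are not taken from the chapter's tutorials.

```python
import sqlite3

# Hypothetical sample rows: (gene symbol, variant description, classification).
SAMPLE_VARIANTS = [
    ("GENE1", "c.100A>G", "benign"),
    ("GENE1", "c.250del", "pathogenic"),
    ("GENE1", "c.100A>G", "benign"),  # duplicate submission, ignored on insert
]


def build_variant_db(rows):
    """Create an in-memory variant table and load rows, skipping duplicates."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """
        CREATE TABLE variant (
            id INTEGER PRIMARY KEY,
            gene TEXT NOT NULL,
            description TEXT NOT NULL,
            classification TEXT,
            UNIQUE (gene, description)
        )
        """
    )
    conn.executemany(
        "INSERT OR IGNORE INTO variant (gene, description, classification) "
        "VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return conn


conn = build_variant_db(SAMPLE_VARIANTS)
variant_count = conn.execute("SELECT COUNT(*) FROM variant").fetchone()[0]
```

The uniqueness constraint on gene and description is one simple way to keep repeated laboratory submissions from inflating the inventory; a production schema would need more fields and curation rules.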

10

An annotated mtDNA database.  

PubMed

We have compiled a database of mitochondrial DNA (mtDNA) control region, hypervariable regions 1 (HVR1) and 2 (HVR2) sequences of a total of 14,138 individuals compiled from 103 mtDNA publications before 1 January 2000, 13 data sets published in 2000 and 2001 and 2 unpublished data sets of Iraqi Kurds and Indians from Kerala. By contacting the authors and by other means, we have confirmed and corrected sequence errors, eliminated duplications and harmonised the sequence format. These changes affected all but 26 of the 116 publications. Furthermore, we have implemented a geographic information system ("mtradius") which searches for closest matches to a given mtDNA control region sequence and displays them on a geographic map. A potential application is to estimate a chance matching probability when a forensic stain and a suspect have an identical mtDNA sequence: we suggest that the geographic area with the highest frequency of closely related mtDNA sequence types may be used to define a reference population to give the suspect the maximum benefit of doubt in accordance with the ceiling principle. PMID:11599766

Röhl, A; Brinkmann, B; Forster, L; Forster, P

2001-08-01

11

Publications of Australian LIS Academics in Databases  

ERIC Educational Resources Information Center

This paper examines aspects of journal articles published from 1967 to 2008, located in eight databases, and authored or co-authored by academics serving for at least two years in Australian LIS programs from 1959 to 2008. These aspects are: inclusion of publications in databases, publications in journals, authorship characteristics of…

Wilson, Concepcion S.; Boell, Sebastian K.; Kennan, Mary Anne; Willard, Patricia

2011-01-01

12

Short Tandem Repeat DNA Internet Database  

National Institute of Standards and Technology Data Gateway

SRD 130 Short Tandem Repeat DNA Internet Database (Web, free access)   Short Tandem Repeat DNA Internet Database is intended to benefit research and application of short tandem repeat DNA markers for human identity testing. Facts and sequence information on each STR system, population data, commonly used multiplex STR systems, PCR primers and conditions, and a review of various technologies for analysis of STR alleles have been included.

13

Forensic DNA Profiling and Database  

PubMed Central

The incredible power of DNA technology as an identification tool has brought a tremendous change in criminal justice. The DNA database is an information resource for the forensic DNA typing community with details on commonly used short tandem repeat (STR) DNA markers. This article discusses the essential steps in the compilation of the Combined DNA Index System (CODIS) on validated polymerase chain reaction-amplified STRs and their use in crime detection.

Panneerchelvam, S.; Norazmi, M.N.

2003-01-01

14

Plant rDNA database: update and new features  

PubMed Central

The Plant rDNA database (www.plantrdnadatabase.com) is an open access online resource providing detailed information on numbers, structures and positions of 5S and 18S-5.8S-26S (35S) ribosomal DNA loci. The data have been obtained from >600 publications on plant molecular cytogenetics, mostly based on fluorescent in situ hybridization (FISH). This edition of the database contains information on 1609 species derived from 2839 records, which means an expansion of 55.76 and 94.45%, respectively. It holds the data for angiosperms, gymnosperms, bryophytes and pteridophytes available as of June 2013. Information from publications reporting data for a single rDNA (either 5S or 35S alone) and annotation regarding transcriptional activity of 35S loci now appears in the database. Preliminary analyses suggest greater variability in the number of rDNA loci in gymnosperms than in angiosperms. New applications provide ideograms of the species showing the positions of rDNA loci as well as a visual representation of their genome sizes. We have also introduced other features to boost the usability of the Web interface, such as an application for convenient data export and a new section with rDNA–FISH-related information (mostly detailing protocols and reagents). In addition, we upgraded and/or proofread tabs and links and modified the website for a more dynamic appearance. This manuscript provides a synopsis of these changes and developments. Database URL: http://www.plantrdnadatabase.com

Garcia, Sonia; Galvez, Francisco; Gras, Airy; Kovarik, Ales; Garnatje, Teresa

2014-01-01

15

Analysis of commercial and public bioactivity databases.  

PubMed

Activity data for small molecules are invaluable in chemoinformatics. Various bioactivity databases exist containing detailed information of target proteins and quantitative binding data for small molecules extracted from journals and patents. In the current work, we have merged several public and commercial bioactivity databases into one bioactivity metabase. The molecular presentation, target information, and activity data of the vendor databases were standardized. The main motivation of the work was to create a single relational database which allows fast and simple data retrieval by in-house scientists. Second, we wanted to know the amount of overlap between databases by commercial and public vendors to see whether the former contain data complementing the latter. Third, we quantified the degree of inconsistency between data sources by comparing data points derived from the same scientific article cited by more than one vendor. We found that each data source contains unique data which is due to different scientific articles cited by the vendors. When comparing data derived from the same article we found that inconsistencies between the vendors are common. In conclusion, using databases of different vendors is still useful since the data overlap is not complete. It should be noted that this can be partially explained by the inconsistencies and errors in the source data. PMID:22145975

Tiikkainen, Pekka; Franke, Lutz

2012-02-27
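
The overlap and inconsistency measurements described in the abstract above can be sketched as set operations on keyed data points. The example assumes each source maps an (article, compound, target) key to a single activity value; the keys, values, and tolerance are invented for illustration and do not come from the study.

```python
import math

# Hypothetical data points: (article id, compound id, target id) -> activity.
SOURCE_A = {
    ("article1", "cmpdA", "targetX"): 7.2,
    ("article1", "cmpdB", "targetX"): 6.0,
    ("article2", "cmpdC", "targetY"): 5.5,
}
SOURCE_B = {
    ("article1", "cmpdA", "targetX"): 7.2,
    ("article1", "cmpdB", "targetX"): 6.9,  # same data point, different value
    ("article3", "cmpdD", "targetZ"): 8.1,
}


def compare_sources(a, b, tol=1e-6):
    """Count unique, shared, and inconsistent data points between sources."""
    shared = a.keys() & b.keys()
    inconsistent = {
        key for key in shared
        if not math.isclose(a[key], b[key], abs_tol=tol)
    }
    return {
        "only_a": len(a.keys() - b.keys()),
        "only_b": len(b.keys() - a.keys()),
        "shared": len(shared),
        "inconsistent": len(inconsistent),
    }


summary = compare_sources(SOURCE_A, SOURCE_B)
```

Keying on the source article is what makes the inconsistency count meaningful: two vendors reporting different values for the same article, compound, and target cannot both have transcribed it correctly.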

16

The Availability of Faculty Publication Databases from Library Web Pages  

Microsoft Academic Search

Faculty publication databases or author bibliographies offer libraries an opportunity to provide services to users. Initially, these databases remained initiatives of special libraries in the health-sciences fields. Librarians used the publication information derived from these databases to compile lists for annual reports. However, the advent of new technologies, especially the Web, prompted numerous libraries to develop faculty publication databases for

Barbara A. Blummer

2007-01-01

17

MitoBreak: the mitochondrial DNA breakpoints database.  

PubMed

Mitochondrial DNA (mtDNA) rearrangements are key events in the development of many diseases. Investigations of mtDNA regions affected by rearrangements (i.e. breakpoints) can lead to important discoveries about rearrangement mechanisms and can offer important clues about the causes of mitochondrial diseases. Here, we present the mitochondrial DNA breakpoints database (MitoBreak; http://mitobreak.portugene.com), a free, web-accessible comprehensive list of breakpoints from three classes of somatic mtDNA rearrangements: circular deleted (deletions), circular partially duplicated (duplications) and linear mtDNAs. Currently, MitoBreak contains >1400 mtDNA rearrangements from seven species (Homo sapiens, Mus musculus, Rattus norvegicus, Macaca mulatta, Drosophila melanogaster, Caenorhabditis elegans and Podospora anserina) and their associated phenotypic information collected from nearly 400 publications. The database allows researchers to perform multiple types of data analyses through user-friendly interfaces with full or partial datasets. It also permits the download of curated data and the submission of new mtDNA rearrangements. For each reported case, MitoBreak also documents the precise breakpoint positions, junction sequences, disease or associated symptoms and links to the related publications, providing a useful resource to study the causes and consequences of mtDNA structural alterations. PMID:24170808

Damas, Joana; Carneiro, João; Amorim, António; Pereira, Filipe

2014-01-01

18

24 CFR 81.72 - Public-use database and public information.  

Code of Federal Regulations, 2013 CFR

...2013-04-01 false Public-use database and public information. 81.72 ...Information § 81.72 Public-use database and public information. (a) General...available for public use, a public-use database containing public data as defined...

2013-04-01

19

24 CFR 81.72 - Public-use database and public information.  

Code of Federal Regulations, 2010 CFR

...2009-04-01 false Public-use database and public information. 81.72 ...Information § 81.72 Public-use database and public information. (a) General...available for public use, a public-use database containing public data as defined...

2009-04-01

20

24 CFR 81.72 - Public-use database and public information.  

Code of Federal Regulations, 2010 CFR

...2010-04-01 false Public-use database and public information. 81.72 ...Information § 81.72 Public-use database and public information. (a) General...available for public use, a public-use database containing public data as defined...

2010-04-01

21

Information Access through Electronic Databases for Rural Public Libraries.  

ERIC Educational Resources Information Center

To compile a list of recommended electronic databases for rural libraries, public library patron questions received by the Arizona State Reference Center were searched on ten databases. The results indicated Books in Print, Magazine Database, ABI/INFORM, Public Affairs Information System (PAIS), and Government Printing Office (GPO) Publications

Canepi, Kitti

1997-01-01

22

Data publication: towards a database of everything  

PubMed Central

The fabric of science is changing, driven by a revolution in digital technologies that facilitate the acquisition and communication of massive amounts of data. This is changing the nature of collaboration and expanding opportunities to participate in science. If digital technologies are the engine of this revolution, digital data are its fuel. But for many scientific disciplines, this fuel is in short supply. The publication of primary data is not a universal or mandatory part of science, and despite policies and proclamations to the contrary, calls to make data publicly available have largely gone unheeded. In this short essay I consider why, and explore some of the challenges that lie ahead, as we work toward a database of everything.

Smith, Vincent S

2009-01-01

23

The Availability of Faculty Publication Databases from Library Web Pages  

ERIC Educational Resources Information Center

Faculty publication databases or author bibliographies offer libraries an opportunity to provide services to users. Initially, these databases remained initiatives of special libraries in the health-sciences fields. Librarians used the publication information derived from these databases to compile lists for annual reports. However, the advent of…

Blummer, Barbara A.

2007-01-01

24

An integrated web medicinal materials DNA database: MMDBD (Medicinal Materials DNA Barcode Database)  

PubMed Central

Background Thousands of plants and animals possess pharmacological properties and there is an increased interest in using these materials for therapy and health maintenance. The efficacy of the application is critically dependent on the use of genuine materials. From time to time, life-threatening poisoning occurs because a toxic adulterant or substitute is administered. DNA barcoding provides a definitive means of authentication and for conducting molecular systematics studies. Owing to the reduced cost of DNA authentication, the volume of DNA barcodes produced for medicinal materials is on the rise and necessitates the development of an integrated DNA database. Description We have developed an integrated DNA barcode multimedia information platform, the Medicinal Materials DNA Barcode Database (MMDBD), for data retrieval and similarity search. MMDBD contains over 1000 species of medicinal materials listed in the Chinese Pharmacopoeia and American Herbal Pharmacopoeia. MMDBD also contains useful information on the medicinal materials, including resources, adulterant information, medicinal parts, photographs, primers used for obtaining the barcodes and key references. MMDBD can be accessed at http://www.cuhk.edu.hk/icm/mmdbd.htm. Conclusions This work provides a centralized medicinal materials DNA barcode database and bioinformatics tools for data storage, analysis and exchange for promoting the identification of medicinal materials. MMDBD has the largest collection of DNA barcodes of medicinal materials and is a useful resource for researchers in conservation, systematic study, forensics and the herbal industry.

2010-01-01

25

Exploration of the Chemical Space of Public Genomic Databases  

EPA Science Inventory

The current project aims to chemically index the content of public genomic databases to make these data accessible in relation to other publicly available, chemically-indexed toxicological information. ...

26

AN EXPERT SYSTEM FOR SCORING DNA DATABASE PROFILES  

Microsoft Academic Search

Forensic DNA databases are becoming an increasingly valuable law enforcement tool for convicting repeat offenders and exonerating the innocent. However, constructing such databases is quite laborious. After generating STR profiles in the lab, people expend even greater effort visually reviewing the data before it enters the database. All artifacts must be detected, and no error can be tolerated. With millions

Mark W. Perlin

2000-01-01

27

[Privacy and public benefit in using large scale health databases].  

PubMed

In Japan, large-scale health databases have been constructed within a few years, such as the National Claim insurance and health checkup database (NDB) and the Japanese Sentinel project. However, there are legal issues in striking an adequate balance between privacy and public benefit when using such databases. The NDB is operated under the act for elderly persons' health care, but this act says nothing about using the database for general public benefit. Researchers who use this database are therefore forced to pay so much attention to anonymization and information security that the research work itself may be disturbed. The Japanese Sentinel project is a national project for detecting adverse drug reactions using large-scale distributed clinical databases of large hospitals. Although patients give consent in advance to such general use for the public good, the use of insufficiently anonymized data is still under discussion. Generally speaking, researchers conducting studies for public benefit will not infringe patients' privacy, but vague and complex requirements in personal data protection legislation may disturb research. Medical science does not progress without the use of clinical information; adequate legislation that is simple and clear for both researchers and patients is therefore strongly required. In Japan, a specific act for balancing privacy and public benefit is now under discussion. The author recommends that researchers, including those in the field of pharmacology, pay attention to, participate in the discussion of, and make suggestions on such acts or regulations. PMID:24790041

Yamamoto, Ryuichi

2014-01-01

28

Short Tandem Repeat DNA Internet Database  

NSDL National Science Digital Library

This website contains comprehensive information relating to forensic DNA analysis. It has material from an introductory to an advanced level on forensic DNA technology. The material provides general information on DNA markers that are of interest to human identification. The site contains both introductory and in-depth discussions of short tandem repeats (STRs) and other DNA markers currently used by the forensic community. Powerpoint and PDF presentations on STR training material are available and can be readily downloaded.

2011-04-08

29

DNA variant databases: current state and future directions.  

PubMed

In this chapter we aim to provide an overview of DNA variant databases, commonly known as Locus-Specific Databases (LSDBs), or Gene-Disease Specific Databases (GDSDBs), but the term variant database will be used for simplicity. We restrict this overview to germ-line variants, particularly as related to Mendelian diseases, which are diseases caused by a variant in a single gene. Common difficulties associated with variant databases and some proposed solutions are reviewed. Finally, systems where technical solutions have been implemented are discussed. This work will be useful for anyone wishing to establish their own variant database, or to learn about the global picture of variant databases, and the technical challenges to be overcome. PMID:24870141

Plazzer, John-Paul; Macrae, Finlay

2014-01-01

30

Building a Faculty Publications Database: A Case Study  

ERIC Educational Resources Information Center

This case study shares the experience of building an in-house faculty publications database that was spearheaded by the Touro College and University System library in 2010. The project began with the intention of contributing to the college by collecting the research accomplishments of our faculty and staff, thereby also increasing library…

Tabaei, Sara; Schaffer, Yitzchak; McMurray, Gregory; Simon, Bashe

2013-01-01

31

Digital Equipment Corporation's CDROM Software and Database Publications.  

ERIC Educational Resources Information Center

Acquaints information professionals with Digital Equipment Corporation's compact optical disk read-only-memory (CDROM) search and retrieval software and growing library of CDROM database publications (COMPENDEX, Chemical Abstracts Services). Highlights include MicroBASIS, boolean operators, range operators, word and phrase searching, proximity…

Adams, Michael Q.

1986-01-01

32

CORE: A Phylogenetically-Curated 16S rDNA Database of the Core Oral Microbiome  

PubMed Central

Comparing bacterial 16S rDNA sequences to GenBank and other large public databases via BLAST often provides results of little use for identification and taxonomic assignment of the organisms of interest. The human microbiome, and in particular the oral microbiome, includes many taxa, and accurate identification of sequence data is essential for studies of these communities. For this purpose, a phylogenetically curated 16S rDNA database of the core oral microbiome, CORE, was developed. The goal was to include a comprehensive and minimally redundant representation of the bacteria that regularly reside in the human oral cavity with computationally robust classification at the level of species and genus. Clades of cultivated and uncultivated taxa were formed based on sequence analyses using multiple criteria, including maximum-likelihood-based topology and bootstrap support, genetic distance, and previous naming. A number of classification inconsistencies for previously named species, especially at the level of genus, were resolved. The performance of the CORE database for identifying clinical sequences was compared to that of three publicly available databases, GenBank nr/nt, RDP and HOMD, using a set of sequencing reads that had not been used in creation of the database. CORE offered improved performance compared to other public databases for identification of human oral bacterial 16S sequences by a number of criteria. In addition, the CORE database and phylogenetic tree provide a framework for measures of community divergence, and the focused size of the database offers advantages of efficiency for BLAST searching of large datasets. The CORE database is available as a searchable interface and for download at http://microbiome.osu.edu.

Griffen, Ann L.; Beall, Clifford J.; Firestone, Noah D.; Gross, Erin L.; DiFranco, James M.; Hardman, Jori H.; Vriesendorp, Bastienne; Faust, Russell A.; Janies, Daniel A.; Leys, Eugene J.

2011-01-01

33

Cryptographic Approaches to Privacy in Forensic DNA Databases  

Microsoft Academic Search

Advances in DNA sequencing technology and human genetics are leading to the availability of inexpensive genetic tests, notably tests for individual predisposition to certain diseases. While such information is often valuable, its availability has raised serious concerns over the privacy of genetic information. These concerns are further heightened when genetic information is gathered into databases. We study access control for one class of such

Philip Bohannon; Markus Jakobsson; Sukamol Srikwan

2000-01-01

34

Knowing your DNA database: Issues with determining ancestral Y haplotypes in a Y-Filer database  

Microsoft Academic Search

Y-chromosome STR profiles are being increasingly used in forensic investigations. Differences between populations in male-specific DNA markers can be large (relative to autosomal ones) and this knowledge has implications for appropriate database construction. Also, in cosmopolitan populations, such as Australia, asymmetrical admixture between groups can have important effects. This paper reports on these issues as they relate to the South

Duncan A. Taylor; Robert J. Mitchell; Roland van Oorschot; Nano Nagle; Julianne M. Henry

2009-01-01

35

Comparisons of familial DNA database searching strategies.  

PubMed

The current familial searching strategies are generally based on either Identity-By-State (IBS) (i.e., number of shared alleles) or likelihood ratio (i.e., kinship index [KI]) assessments. In this study, the expected IBS match probabilities given relationships and the logic of the likelihood ratio method were addressed. Further, the false-positive and false-negative rates of the strategies were compared analytically or by simulations using Caucasian population data of the 13 CODIS Short Tandem Repeat (STR). IBS ≥ 15, IBS ≥ 16, KI ≥ 1000, or KI ≥ 10,000 were found to be good thresholds for balancing false-positive and false-negative rates. IBS ≥ 17 and/or KI ≥ 1,000,000 can exclude the majority of candidate profiles in the database, either related or not, and may be an initial screening option if a small candidate list is desired. Policies combining both IBS and KI can provide higher accuracy. Typing additional STRs can provide better searching performance, and lineage markers can be extremely useful for reducing false rates. PMID:21827463

Ge, Jianye; Chakraborty, Ranajit; Eisenberg, Arthur; Budowle, Bruce

2011-11-01
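
As a rough illustration of the IBS screening described in the record above, the following Python sketch counts the alleles two STR profiles share identical-by-state and compares the total with a threshold. The profile layout (a dict mapping a locus name to a pair of repeat numbers), the function names, and the example genotypes are assumptions made for illustration and do not come from the paper. The kinship index (KI), which requires allele frequencies, is not shown.

```python
from collections import Counter


def ibs_at_locus(genotype_a, genotype_b):
    """Return the number of alleles (0, 1 or 2) shared at one locus.

    Each genotype is a pair of allele labels, e.g. (15, 16). A multiset
    intersection is used so that (12, 12) vs (12, 13) counts as 1, not 2.
    """
    shared = Counter(genotype_a) & Counter(genotype_b)
    return sum(shared.values())


def ibs_score(profile_a, profile_b):
    """Sum shared alleles over the loci typed in both profiles."""
    common_loci = profile_a.keys() & profile_b.keys()
    return sum(ibs_at_locus(profile_a[locus], profile_b[locus])
               for locus in common_loci)


def passes_ibs_screen(profile_a, profile_b, threshold=15):
    """Hypothetical screening rule: keep a candidate if IBS >= threshold.

    The default of 15 assumes the threshold quoted in the abstract is a
    minimum score over the 13 CODIS loci (maximum possible score: 26).
    """
    return ibs_score(profile_a, profile_b) >= threshold


# Illustrative three-locus profiles (made-up genotypes).
query = {"D3S1358": (15, 16), "vWA": (17, 17), "FGA": (21, 24)}
candidate = {"D3S1358": (15, 15), "vWA": (17, 17), "FGA": (22, 23)}
score = ibs_score(query, candidate)
```

With only three loci the score cannot approach a 13-locus threshold; the example is meant to show the counting rule, not a realistic search.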

36

CORE: A Phylogenetically-Curated 16S rDNA Database of the Core Oral Microbiome  

Microsoft Academic Search

Comparing bacterial 16S rDNA sequences to GenBank and other large public databases via BLAST often provides results of little use for identification and taxonomic assignment of the organisms of interest. The human microbiome, and in particular the oral microbiome, includes many taxa, and accurate identification of sequence data is essential for studies of these communities. For this purpose, a phylogenetically

Ann L. Griffen; Clifford J. Beall; Noah D. Firestone; Erin L. Gross; James M. DiFranco; Jori H. Hardman; Bastienne Vriesendorp; Russell A. Faust; Daniel A. Janies; Eugene J. Leys

2011-01-01

37

DNAVaxDB: the first web-based DNA vaccine database and its data analysis  

PubMed Central

Since the first DNA vaccine studies were done in the 1990s, thousands more studies have followed. Here we report the development and analysis of DNAVaxDB (http://www.violinet.org/dnavaxdb), the first publicly available web-based DNA vaccine database that curates, stores, and analyzes experimentally verified DNA vaccines, DNA vaccine plasmid vectors, and protective antigens used in DNA vaccines. All data in DNAVaxDB are annotated from reliable resources, particularly peer-reviewed articles. Among over 140 DNA vaccine plasmids, some plasmids were more frequently used in one type of pathogen than others; for example, pCMVi-UB for Gram-negative bacterial DNA vaccines, and pCAGGS for viral DNA vaccines. Presently, over 400 DNA vaccines containing over 370 protective antigens from over 90 infectious and non-infectious diseases have been curated in DNAVaxDB. While extracellular and bacterial cell surface proteins and adhesin proteins were frequently used for DNA vaccine development, the majority of protective antigens used in Chlamydophila DNA vaccines are localized to the inner portion of the cell. The regimen of priming with a DNA vaccine and boosting with another vaccine has been widely used to induce protection against infection by different pathogens such as HIV. Parasitic and cancer DNA vaccines were also systematically analyzed. User-friendly web query and visualization interfaces are available in DNAVaxDB for interactive data search. To support data exchange, the information on DNA vaccines, plasmids, and protective antigens is stored in the Vaccine Ontology (VO). DNAVaxDB is intended to become a timely and vital source of DNA vaccines and related data and to facilitate advanced DNA vaccine research and development.

2014-01-01

38

75 FR 41180 - Notice of Order: Revisions to Enterprise Public Use Database  

Federal Register 2010, 2011, 2012, 2013

...Revisions to Enterprise Public Use Database AGENCY: Federal Housing Finance Agency...responsibility to maintain a public use database (PUDB) for such mortgage data was transferred...purpose of loan data field in these two databases. 4. Single-family Data Field 27...

2010-07-15

39

DSSTOX WEBSITE LAUNCH: IMPROVING PUBLIC ACCESS TO DATABASES FOR BUILDING STRUCTURE-TOXICITY PREDICTION MODELS  

EPA Science Inventory

DSSTox Website Launch: Improving Public Access to Databases for Building Structure-Toxicity Prediction Models Ann M. Richard US Environmental Protection Agency, Research Triangle Park, NC, USA Distributed: Decentralized set of standardized, field-delimited databases,...

40

MmtDB: a Metazoa mitochondrial DNA variants database.  

PubMed Central

The present paper describes the structure of MmtDB, a specialized database designed to collect Metazoa mitochondrial DNA variants. Priority in the data collection is given to the Metazoa species for which a large number of variants is available, as is the case for human variants. Starting from the sequences available in the Nucleotide Sequence Databases, the redundant sequences are removed and new sequences from other sources are added. Value-added information is associated with each variant sequence, e.g. analysed region, experimental method, tissue and cell lines, population data, sex, age, family code and information about the variation events (nucleotide position, involved gene, restriction site's gain or loss). Cross-references are introduced to the EMBL Data Library, as well as an internal cross-referencing among MmtDB entries according to their tissue, heteroplasmic, familial and haplotypical correlation. MmtDB can be accessed through the World Wide Web at URL [see text].

Calo, D; De Pascali, A; Sasanelli, D; Tanzariello, F; Tommaseo Ponzetta, M; Saccone, C; Attimonelli, M

1997-01-01

41

A distributed system for DNA/protein database similarity searches.  

PubMed

A distributed system for exhaustive alignment similarity searches on DNA/protein databases is presented. The system makes it possible to share the computational burden on diverse computers, provided they are interconnected by a network supporting TCP/IP communication. It consists of three separate programs: a user-interface module, a similarity score calculation module (multiple instances of which may be running simultaneously on different machines), and a control module. This last acts as the 'hub' of the system. The three programs have been successfully tested under the AIX, SunOS, VAX/VMS and MS/DOS operating systems. PMID:7922683

Penotti, F E

1994-06-01

42

SSAHA: A Fast Search Method for Large DNA Databases  

PubMed Central

We describe an algorithm, SSAHA (Sequence Search and Alignment by Hashing Algorithm), for performing fast searches on databases containing multiple gigabases of DNA. Sequences in the database are preprocessed by breaking them into consecutive k-tuples of k contiguous bases and then using a hash table to store the position of each occurrence of each k-tuple. Searching for a query sequence in the database is done by obtaining from the hash table the “hits” for each k-tuple in the query sequence and then performing a sort on the results. We discuss the effect of the tuple length k on the search speed, memory usage, and sensitivity of the algorithm and present the results of computational experiments which show that SSAHA can be three to four orders of magnitude faster than BLAST or FASTA, while requiring less memory than suffix tree methods. The SSAHA algorithm is used for high-throughput single nucleotide polymorphism (SNP) detection and very large scale sequence assembly. Also, it provides Web-based sequence search facilities for Ensembl projects.

Ning, Zemin; Cox, Anthony J.; Mullikin, James C.

2001-01-01
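
The indexing and lookup steps summarized in the SSAHA record above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' implementation. It assumes that the database is cut into non-overlapping k-tuples while every overlapping k-tuple of the query is looked up, and that a hit is recorded as (sequence index, shift, position). The function names and the toy sequences are hypothetical.

```python
from collections import defaultdict


def build_index(sequences, k):
    """Map each k-tuple to the (sequence index, position) pairs where it occurs.

    Assumption: the database is sampled with a stride of k, i.e. as
    consecutive, non-overlapping k-tuples.
    """
    index = defaultdict(list)
    for seq_id, seq in enumerate(sequences):
        for pos in range(0, len(seq) - k + 1, k):
            index[seq[pos:pos + k]].append((seq_id, pos))
    return index


def find_hits(index, query, k):
    """Look up every overlapping k-tuple of the query and sort the hits.

    Each hit is (sequence index, shift, position), where shift is the
    database position minus the query offset. After sorting, hits that
    share a sequence index and shift sit next to each other and mark a
    candidate gap-free match.
    """
    hits = []
    for offset in range(len(query) - k + 1):
        for seq_id, pos in index.get(query[offset:offset + k], ()):
            hits.append((seq_id, pos - offset, pos))
    hits.sort()
    return hits


# Toy example with k = 2.
database = ["AACCGGTT"]
index = build_index(database, k=2)
hits = find_hits(index, "CCGG", k=2)
```

In this toy case both query tuples that are found share the same shift, which is the signal that the query aligns to the database sequence starting at position 2. The choice of k trades index size against sensitivity, as the abstract notes.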

43

Development and Evaluation of a Quality-Controlled Ribosomal Sequence Database for 16S Ribosomal DNA-Based Identification of Staphylococcus Species  

Microsoft Academic Search

To establish an improved ribosomal gene sequence database as part of the Ribosomal Differentiation of Microorganisms (RIDOM) project and to overcome the drawbacks of phenotypic identification systems and publicly accessible sequence databases, both strands of the 5′ end of the 16S ribosomal DNA (rDNA) of 81 type and reference strains comprising all validly described staphylococcal (sub)species were sequenced. Assuming a

Karsten Becker; Dag Harmsen; Alexander Mellmann; Christian Meier; Peter Schumann; Georg Peters; Christof von Eiff

2004-01-01

44

Annotation Error in Public Databases: Misannotation of Molecular Function in Enzyme Superfamilies  

Microsoft Academic Search

Due to the rapid release of new data from genome sequencing projects, the majority of protein sequences in public databases have not been experimentally characterized; rather, sequences are annotated using computational analysis. The level of misannotation and the types of misannotation in large public databases are currently unknown and have not been analyzed in depth. We have investigated the misannotation

Alexandra M. Schnoes; Shoshana D. Brown; Igor Dodevski; Patricia C. Babbitt

2009-01-01

45

Toward public volume database management: a case study of NOVA, the National Online Volumetric Archive  

Microsoft Academic Search

Public databases today can be constructed with a wide variety of authoring and management structures. The widespread appeal of Internet search engines suggests that public information be made open and available to common search strategies, making accessible information that would otherwise be hidden by the infrastructure and software interfaces of a traditional database management system. We present the construction and

Alex Fletcher; Terry S. Yoo

2004-01-01

46

The EpiSLI Database: A Publicly Available Database on Speech and Language  

ERIC Educational Resources Information Center

Purpose: This article describes a database that was created in the process of conducting a large-scale epidemiologic study of specific language impairment (SLI). As such, this database will be referred to as the EpiSLI database. Children with SLI have unexpected and unexplained difficulties learning and using spoken language. Although there is no…

Tomblin, J. Bruce

2010-01-01

47

Abundant human DNA contamination identified in non-primate genome databases.  

PubMed

During routine screens of the NCBI databases using human repetitive elements we discovered an unlikely level of nucleotide identity across a broad range of phyla. To ascertain whether databases containing DNA sequences, genome assemblies and trace archive reads were contaminated with human sequences, we performed an in-depth search for sequences of human origin in non-human species. Using a primate-specific SINE, AluY, we screened 2,749 non-primate public databases from NCBI, Ensembl, JGI, and UCSC and found 492 to be contaminated with human sequence. These represent species ranging from bacteria (B. cereus) to plants (Z. mays) to fish (D. rerio), with examples found from most phyla. The identification of such extensive contamination of human sequence across databases and sequence types warrants caution among the sequencing community in future sequencing efforts, such as human re-sequencing. We discuss issues this may raise and present data that give insight into how this may be occurring. PMID:21358816

Longo, Mark S; O'Neill, Michael J; O'Neill, Rachel J

2011-01-01

48

Generation and analysis of a 29,745 unique Expressed Sequence Tags from the Pacific oyster (Crassostrea gigas) assembled into a publicly accessible database: the GigasDatabase  

PubMed Central

Background: Although bivalves are among the most-studied marine organisms because of their ecological role and economic importance, very little information is available on the genome sequences of oyster species. This report documents three large-scale cDNA sequencing projects for the Pacific oyster Crassostrea gigas initiated to provide a large number of expressed sequence tags that were subsequently compiled in a publicly accessible database. This resource allowed for the identification of a large number of transcripts and provides valuable information for ongoing investigations of tissue-specific and stimulus-dependent gene expression patterns. These data are crucial for constructing comprehensive DNA microarrays, identifying single nucleotide polymorphisms and microsatellites in coding regions, and for identifying genes when the entire genome sequence of C. gigas becomes available. Description: In the present paper, we report the production of 40,845 high-quality ESTs that identify 29,745 unique transcribed sequences consisting of 7,940 contigs and 21,805 singletons. All of these new sequences, together with existing public sequence data, have been compiled into a publicly available website, http://public-contigbrowser.sigenae.org:9090/Crassostrea_gigas/index.html. Approximately 43% of the unique ESTs had significant matches against the SwissProt database and 27% were annotated using Gene Ontology terms. In addition, we identified a total of 208 in silico microsatellites from the ESTs, with 173 having sufficient flanking sequence for primer design. We also identified a total of 7,530 putative in silico single-nucleotide polymorphisms using existing and newly generated EST resources for the Pacific oyster. Conclusion: A publicly available database has been populated with 29,745 unique sequences for the Pacific oyster Crassostrea gigas. The database provides many tools to search cleaned and assembled ESTs. The user may input and submit several filters, such as protein or nucleotide hits, to select and download relevant elements. This database constitutes one of the most developed genomic resources accessible among Lophotrochozoans, an orphan clade of bilateral animals. These data will accelerate the development of both genomics and genetics in a commercially important species with the highest annual commercial production of any aquatic organism.

2009-01-01

49

TFBSshape: a motif database for DNA shape features of transcription factor binding sites.  

PubMed

Transcription factor binding sites (TFBSs) are most commonly characterized by the nucleotide preferences at each position of the DNA target. Whereas these sequence motifs are quite accurate descriptions of DNA binding specificities of transcription factors (TFs), proteins recognize DNA as a three-dimensional object. DNA structural features refine the description of TF binding specificities and provide mechanistic insights into protein-DNA recognition. Existing motif databases contain extensive nucleotide sequences identified in binding experiments based on their selection by a TF. To utilize DNA shape information when analysing the DNA binding specificities of TFs, we developed a new tool, the TFBSshape database (available at http://rohslab.cmb.usc.edu/TFBSshape/), for calculating DNA structural features from nucleotide sequences provided by motif databases. The TFBSshape database can be used to generate heat maps and quantitative data for DNA structural features (i.e., minor groove width, roll, propeller twist and helix twist) for 739 TF datasets from 23 different species derived from the motif databases JASPAR and UniPROBE. As demonstrated for the basic helix-loop-helix and homeodomain TF families, our TFBSshape database can be used to compare, qualitatively and quantitatively, the DNA binding specificities of closely related TFs and, thus, uncover differential DNA binding specificities that are not apparent from nucleotide sequence alone. PMID:24214955

Yang, Lin; Zhou, Tianyin; Dror, Iris; Mathelier, Anthony; Wasserman, Wyeth W; Gordân, Raluca; Rohs, Remo

2014-01-01

50

Database Changes (Post-Publication). ERIC Processing Manual, Section X.  

ERIC Educational Resources Information Center

The purpose of this section is to specify the procedure for making changes to the ERIC database after the data involved have been announced in the abstract journals RIE or CIJE. As a matter of general ERIC policy, a document or journal article is not re-announced or re-entered into the database as a new accession for the purpose of accomplishing a…

Brandhorst, Ted, Ed.

51

76 FR 53912 - FDA's Public Database of Products With Orphan-Drug Designation: Replacing Non-Informative Code...  

Federal Register 2010, 2011, 2012, 2013

...FDA-2011-N-0607] FDA's Public Database of Products With Orphan-Drug Designation...descriptive identifiers on its public database of products that have received orphan-drug...designation were published on our public database with non-informative code names....

2011-08-30

52

Design of a Database on Institutions, Projects, and Publications in the Field of Technology Assessment (TA).  

National Technical Information Service (NTIS)

In cooperation with FIZ Karlsruhe, the Information Center on Energy, Physics, Mathematics, the Department for Applied Systems Analysis of the Nuclear Research Center Karlsruhe designed a database on institutions, projects, and publications in the field of...

I. Berg; H. Buecker-Gaertner

1988-01-01

53

A Public-Use, Full-Screen Interface for SPIRES Databases.  

ERIC Educational Resources Information Center

This paper describes the techniques for implementing a full-screen, custom SPIRES interface for a public-use library database. The database-independent protocol that controls the system is described in detail. Source code for an entire working application using this interface is included. The protocol, with less than 170 lines of procedural code,…

Kriz, Harry M.

54

Evaluating DNA profile evidence when the suspect is identified through a database search  

Microsoft Academic Search

The paper is concerned with the strength of DNA evidence when a suspect is identified via a search through a database of the DNA profiles of known individuals. Consideration of the appropriate likelihood ratio shows that in this setting the DNA evidence is (slightly) stronger than when a suspect is identified by other means, subsequently profiled, and found to match. The

Dj Balding; P Donnelly

1996-01-01

55

Characterization and compilation of polymorphic simple sequence repeat (SSR) markers of peanut from public database  

PubMed Central

Background: There are several reports describing thousands of SSR markers in the peanut (Arachis hypogaea L.) genome. There is a need to integrate various research reports of peanut DNA polymorphism into a single platform. Further, because of a lack of uniformity in the labeling of these markers across the publications, there is some confusion about the identities of many markers. We describe below an effort to develop a central comprehensive database of polymorphic SSR markers in peanut. Findings: Of a total of 9,274 markers, we compiled 1,343 (14.5%) that detect polymorphism. Amongst all polymorphic SSRs examined, we found that the AG motif (36.5%) was the most abundant, followed by AAG (12.1%), AAT (10.9%), and AT (10.3%). The mean length of SSR repeats in dinucleotide SSRs was significantly longer than that in trinucleotide SSRs. Dinucleotide SSRs showed higher polymorphism frequency for genomic SSRs when compared to trinucleotide SSRs, while for EST-SSRs, the frequency of polymorphic SSRs was higher in trinucleotide SSRs than in dinucleotide SSRs. The correlation of the length of SSR and the frequency of polymorphism revealed that the frequency of polymorphism decreased as motif repeat number increased. Conclusions: The assembled polymorphic SSRs would enhance the density of the existing genetic maps of peanut, and could also be a useful source of DNA markers suitable for high-throughput QTL mapping and marker-assisted selection in peanut improvement, and thus would be of value to breeders.

2012-01-01
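
To make the motif terminology in the record above concrete, here is a small Python sketch that locates perfect di- or trinucleotide repeats in a sequence with a regular expression. It is an illustration under stated assumptions and not the procedure used in the study: the function name, the minimum repeat count, and the example sequence are hypothetical, and imperfect or compound repeats are ignored.

```python
import re


def find_ssrs(sequence, motif_len, min_repeats):
    """Return (start, motif, repeat count) for perfect tandem repeats.

    motif_len is 2 for dinucleotide and 3 for trinucleotide SSRs.
    Runs of a single base (e.g. AAAAAA) are skipped because they are
    homopolymers rather than di- or trinucleotide repeats.
    """
    # Capture one motif, then require it to repeat at least
    # (min_repeats - 1) more times immediately afterwards.
    pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (motif_len, min_repeats - 1))
    results = []
    for match in pattern.finditer(sequence):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue
        repeats = len(match.group(0)) // motif_len
        results.append((match.start(), motif, repeats))
    return results


# Made-up sequence containing an (AG)5 repeat starting at index 2.
ssrs = find_ssrs("TTAGAGAGAGAGCC", motif_len=2, min_repeats=4)
```

Counting the motifs returned by such a scan over a marker set would give the kind of motif frequency table the abstract reports, although the published percentages were compiled from the literature and cannot be reproduced from this sketch.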

56

Strengthening the United States' database protection laws: balancing public access and private control.  

PubMed

This paper develops three arguments for increasing the strength of database protection under U.S. law. First, stronger protections would encourage private investment in database development, and private databases have many potential benefits for science and industry. Second, stronger protections would discourage extensive use of private licenses to protect databases and would allow for greater public control over database laws and policies. Third, stronger database protections in the U.S. would harmonize U.S. and E.U. laws and would thus enhance international trade, commerce, and research. The U.S. should therefore follow the European example and develop two tiers of protection for databases: 1) protection for creative databases under copyright law; 2) protection for non-creative databases through a special type of sui generis protection. In order to balance private control of data and public access to data, sui generis protections should define a "fair use" exemption that permits some unauthorized extraction of data for private, educational, and research purposes, provided that such extraction does not adversely impact the economic value of the database. PMID:12971291

Resnik, David B

2003-07-01

57

DOE's Public Database for Green Building Case Studies: Preprint  

SciTech Connect

To help capture valuable information on "green building" case studies, the U.S. Department of Energy has created an online database for collecting, standardizing, and disseminating information about high-performance, green projects. The types of information collected include green features, design processes, energy performance, and comparison to other high-performance, green buildings.

Torcellini, P. A.; Crawley, D. B.

2003-11-01

58

Forensic DNA databases in Western Balkan region: retrospectives, perspectives, and initiatives  

PubMed Central

The European Network of Forensic Science Institutes (ENFSI) recommended the establishment of forensic DNA databases and specific implementation and management legislations for all EU/ENFSI members. Therefore, forensic institutions from Bosnia and Herzegovina, Serbia, Montenegro, and Macedonia launched a wide set of activities to support these recommendations. To assess the current state, a regional expert team completed detailed screening and investigation of the existing forensic DNA data repositories and associated legislation in these countries. The scope also included relevant concurrent projects and a wide spectrum of different activities in relation to forensics DNA use. The state of forensic DNA analysis was also determined in the neighboring Slovenia and Croatia, which already have functional national DNA databases. There is a need for a ‘regional supplement’ to the current documentation and standards pertaining to forensic application of DNA databases, which should include regional-specific preliminary aims and recommendations.

Marjanovic, Damir; Konjhodzic, Rijad; Butorac, Sara Sanela; Drobnic, Katja; Merkas, Sinisa; Lauc, Gordan; Primorac, Damir; Anđelinovic, Simun; Milosavljevic, Mladen; Karan, Zeljko; Vidovic, Stojko; Stojkovic, Oliver; Panic, Bojana; Vucetic Dragovic, Anđelka; Kovacevic, Sandra; Jakovski, Zlatko; Asplen, Chris; Primorac, Dragan

2011-01-01

59

Literature curation of protein interactions: measuring agreement across major public databases  

PubMed Central

Literature curation of protein interaction data faces a number of challenges. Although curators increasingly adhere to standard data representations, the data that various databases actually record from the same published information may differ significantly. Some of the reasons underlying these differences are well known, but their global impact on the interactions collectively curated by major public databases has not been evaluated. Here we quantify the agreement between curated interactions from 15,471 publications shared across nine major public databases. Results show that on average, two databases fully agree on 42% of the interactions and 62% of the proteins curated from the same publication. Furthermore, a sizable fraction of the measured differences can be attributed to divergent assignments of organism or splice isoforms, different organism focus and alternative representations of multi-protein complexes. Our findings highlight the impact of divergent curation policies across databases, and should be relevant to both curators and data consumers interested in analyzing protein-interaction data generated by the scientific community. Database URL: http://wodaklab.org/iRefWeb

Turinsky, Andrei L.; Razick, Sabry; Turner, Brian; Wodak, Shoshana J.

2010-01-01

60

Complementarity between public and commercial databases: new opportunities in medicinal chemistry informatics  

Microsoft Academic Search

The last two years have seen a dramatic expansion in public cheminformatics, as exemplified by the approximate five-fold growth of PubChem from over 50 contributing data sources. Consequently, medicinal chemists who were hitherto limited to commercial databases now also have access to public sources that they can download and/or query directly over the Web. The range of public sources, particularly

Christopher Southan; Péter Várkonyi; Sorel Muresan

2007-01-01

61

MisPred: a resource for identification of erroneous protein sequences in public databases  

PubMed Central

Correct prediction of the structure of protein-coding genes of higher eukaryotes is still a difficult task; therefore, public databases are heavily contaminated with mispredicted sequences. The high rate of misprediction has serious consequences because it significantly affects the conclusions that may be drawn from genome-scale sequence analyses of eukaryotic genomes. Here we present the MisPred database and computational pipeline that provide efficient means for the identification of erroneous sequences in public databases. The MisPred database contains a collection of abnormal, incomplete and mispredicted protein sequences from 19 metazoan species identified as erroneous by MisPred quality control tools in the UniProtKB/Swiss-Prot, UniProtKB/TrEMBL, NCBI/RefSeq and EnsEMBL databases. Major releases of the database are automatically generated and updated regularly. The database (http://www.mispred.com) is easily accessible through a simple web interface coupled to a powerful query engine and a standard web service. The content is completely or partially downloadable in a variety of formats. Database URL: http://www.mispred.com

Nagy, Alinda; Patthy, Laszlo

2013-01-01

62

FastStats: A Public Health Statistics Database.  

PubMed

FastStats is a site that provides quick and easy access to public health statistics. The freely available website is maintained by the Centers for Disease Control and Prevention's National Center for Health Statistics. Users can browse alphabetically by topic and state/territory or search across the National Center for Health Statistics site. A description of the browsing capabilities and sample searches are presented. PMID:24735268

Vardell, Emily

2014-01-01

63

The annotation and the usage of scientific databases could be improved with public issue tracker software  

PubMed Central

Since the publication of their longtime predecessor The Atlas of Protein Sequences and Structures in 1965 by Margaret Dayhoff, scientific databases have become a key factor in the organization of modern science. All the information and knowledge described in the novel scientific literature is translated into entries in many different scientific databases, making it possible to obtain very accurate information on a biological entity like genes or proteins without having to manually review the literature on it. However, even for the databases with the finest annotation procedures, errors or unclear parts sometimes appear in the publicly released version and influence the research of unaware scientists using them. The researcher who finds an error in a database is often left in an uncertain state, and often abandons the effort of reporting it because there is no standard procedure for doing so. In the present work, we propose that the simple adoption of a public error tracker application, as in many open software projects, could improve the quality of the annotations in many databases and encourage feedback from the scientific community on the data annotated publicly. In order to illustrate the situation, we describe a series of errors that we found and helped solve on the genes of a very well-known pathway in various biomedically relevant databases. We would like to show that, even if a majority of the most important scientific databases have procedures for reporting errors, these are usually not publicly visible, making the process of reporting errors time-consuming and not useful. Also, the effort made by the user who reports the error often goes unacknowledged, putting them in a discouraging position.

Dall'Olio, Giovanni Marco; Bertranpetit, Jaume; Laayouni, Hafid

2010-01-01

64

The annotation and the usage of scientific databases could be improved with public issue tracker software.  

PubMed

Since the publication of their longtime predecessor The Atlas of Protein Sequences and Structures in 1965 by Margaret Dayhoff, scientific databases have become a key factor in the organization of modern science. All the information and knowledge described in the novel scientific literature is translated into entries in many different scientific databases, making it possible to obtain very accurate information on a biological entity like genes or proteins without having to manually review the literature on it. However, even for the databases with the finest annotation procedures, errors or unclear parts sometimes appear in the publicly released version and influence the research of unaware scientists using them. The researcher who finds an error in a database is often left in an uncertain state, and often abandons the effort of reporting it because there is no standard procedure for doing so. In the present work, we propose that the simple adoption of a public error tracker application, as in many open software projects, could improve the quality of the annotations in many databases and encourage feedback from the scientific community on the data annotated publicly. In order to illustrate the situation, we describe a series of errors that we found and helped solve on the genes of a very well-known pathway in various biomedically relevant databases. We would like to show that, even if a majority of the most important scientific databases have procedures for reporting errors, these are usually not publicly visible, making the process of reporting errors time-consuming and not useful. Also, the effort made by the user who reports the error often goes unacknowledged, putting them in a discouraging position. PMID:21186182

Dall'Olio, Giovanni Marco; Bertranpetit, Jaume; Laayouni, Hafid

2010-01-01

65

The relative flexibility of B-DNA and A-RNA duplexes: database analysis  

PubMed Central

An extensive analysis of structural databases is carried out to investigate the relative flexibility of B-DNA and A-RNA duplexes in crystal form. Our results show that the general anisotropic concept of flexibility is not very useful to compare the deformability of B-DNA and A-RNA duplexes, since the flexibility patterns of B-DNA and A-RNA are quite different. In other words, ‘flexibility’ is a dangerous word for describing macromolecules, unless it is clearly defined. A few soft essential movements explain most of the natural flexibility of A-RNA, whereas many are necessary for B-DNA. Essential movements occurring in naked B-DNAs are identical to those necessary to deform DNA in DNA–protein complexes, which suggests that evolution has designed DNA–protein complexes so that B-DNA is deformed according to its natural tendency. DNA is generally more flexible, but for some distortions A-RNA is easier to deform. Local stiffness constants obtained for naked B-DNAs and DNA complexes are very close, demonstrating that global distortions in DNA necessary for binding to proteins are the result of the addition of small concerted deformations at the base-pair level. Finally, it is worth noting that in general the picture of the relative deformability of A-RNA and DNA derived from database analysis agrees very well with that derived from state-of-the-art molecular dynamics (MD) simulations.

Perez, Alberto; Noy, Agnes; Lankas, Filip; Luque, F. Javier; Orozco, Modesto

2004-01-01

66

The relative flexibility of B-DNA and A-RNA duplexes: database analysis  

Microsoft Academic Search

An extensive analysis of structural databases is carried out to investigate the relative flexibility of B-DNA and A-RNA duplexes in crystal form. Our results show that the general anisotropic concept of flexibility is not very useful to compare the deformability of B-DNA and A-RNA duplexes, since the flexibility patterns of B-DNA and A-RNA are quite different. In other

Alberto Perez; Agnes Noy; Filip Lankas; F. Javier Luque; Modesto Orozco

2004-01-01

67

HEDS - EPA DATABASE SYSTEM FOR PUBLIC ACCESS TO HUMAN EXPOSURE DATA  

EPA Science Inventory

Human Exposure Database System (HEDS) is an Internet-based system developed to provide public access to human-exposure-related data from studies conducted by EPA's National Exposure Research Laboratory (NERL). HEDS was designed to work with the EPA Office of Research and Devel...

68

Harp: a distributed query system for legacy public libraries and structured databases  

Microsoft Academic Search

The main purpose of a digital library is to give users easy access to an enormous amount of globally networked information. Typically, this information includes preexisting public library catalog data, digitized document collections, and other databases. In this article, we describe the distributed query system of a digital library prototype system known as HARP. In the HARP project, we have designed

Ee-Peng Lim; Ying Lu

1999-01-01

69

The Neutron Monitor database as a tool for space weather, education, and public outreach  

NASA Astrophysics Data System (ADS)

The Neutron Monitor database (NMDB) was created to make measurements from ground-based Neutron Monitors easily accessible. Data from more than 40 stations is available in the database and can be plotted via a webpage and downloaded as ASCII tables for further processing. Real-time applications, like the GLE Alert, can access the database directly. The NMDB project has also hosted training sessions and created extensive public outreach and training material that has been translated into 11 languages. This material is openly available on the NMDB website and is frequently used in high school and university courses. While the availability of data from currently operating stations is nearing completion, the availability of historical data, especially from stations no longer operating, is still limited. We are currently trying to fill these gaps. As a first step, a project to make NMDB compatible with the database of relativistic solar particle events (GLEs) is starting this year.

Steigies, Christian T.; Klein, Karl-Ludwig; Bütikofer, Rolf

2014-05-01

70

Hungarian mtDNA population databases from Budapest and the Baranya county Roma  

Microsoft Academic Search

To facilitate forensic mtDNA testing in Hungary, we have generated control region databases for two Hungarian populations: 211 individuals were sampled from the urban Budapest population and 208 individuals were sampled from a Romani (“gypsy”) population in Baranya county. Sequences were generated using a highly redundant approach to minimize potential database errors. The Budapest population had high sequence diversity with

Jodi Irwin; Balazs Egyed; Jessica Saunier; Gabriella Szamosi; Jennifer O’Callaghan; Zsolt Padar; Thomas J. Parsons

2007-01-01

71

Information Technologies in Public Health Management: A Database on Biocides to Improve Quality of Life  

PubMed Central

Background Biocides for prolonging the shelf life of a large variety of materials have been extensively used over the last decades. Worldwide biocide consumption has been estimated at about 12.4 billion dollars in 2011, and is expected to increase in 2012. As biocides are substances we come into contact with in our everyday lives, access to this type of information is of paramount importance in order to ensure an appropriate living environment. Consequently, a database where information may be quickly processed, sorted, and easily accessed, according to different search criteria, is the most desirable solution. The main aim of this work was to design and implement a relational database with complete information about biocides used in public health management to improve the quality of life. Methods: Design and implementation of a relational database for biocides, using the software “phpMyAdmin”. Results: A database that allows for the efficient collection, storage, and management of information, including chemical properties and applications of a large number of biocides, as well as its adequate dissemination into the public health environment. Conclusion: The information contained in the database presented here promotes an adequate use of biocides, by means of information technologies, which in consequence may help achieve important improvement in our quality of life.

Roman, C; Scripcariu, L; Diaconescu, RM; Grigoriu, A

2012-01-01

72

76 FR 77533 - Notice of Order: Revisions to Enterprise Public Use Database Incorporating High-Cost Single...  

Federal Register 2010, 2011, 2012, 2013

...2011-N-13] Notice of Order: Revisions to Enterprise Public Use Database Incorporating High-Cost Single-Family Securitized Loan...FHFA's adoption of an Order revising FHFA's Public Use Database matrices to include certain data fields for high-cost...

2011-12-13

73

A Public HTLV-1 Molecular Epidemiology Database for Sequence Management and Data Mining  

PubMed Central

Background It is estimated that 15 to 20 million people are infected with the human T-cell lymphotropic virus type 1 (HTLV-1). At present, there are more than 2,000 unique HTLV-1 isolate sequences published. A central database to aggregate sequence information from a range of epidemiological aspects including HTLV-1 infections, pathogenesis, origins, and evolutionary dynamics would be useful to scientists and physicians worldwide. Here we describe a database that collects and annotates sequence data and can be accessed through a user-friendly search interface. The HTLV-1 Molecular Epidemiology Database website is available at http://htlv1db.bahia.fiocruz.br/. Methodology/Principal Findings All data were obtained from publications available at GenBank or through contact with the authors. The database was developed using Apache Webserver 2.1.6 and the MySQL database management system. The webpage interfaces were developed in HTML, with server-side scripting written in PHP. The HTLV-1 Molecular Epidemiology Database is hosted on the Gonçalo Moniz/FIOCRUZ Research Center server. There are currently 2,457 registered sequences with 2,024 (82.37%) of those sequences representing unique isolates. Of these sequences, 803 (39.67%) contain information about clinical status (TSP/HAM, 17.19%; ATL, 7.41%; asymptomatic, 12.89%; other diseases, 2.17%; and no information, 60.32%). Further, 7.26% of sequences contain information on patient gender while 5.23% of sequences provide the age of the patient. Conclusions/Significance The HTLV-1 Molecular Epidemiology Database retrieves and stores annotated HTLV-1 proviral sequences from clinical, epidemiological, and geographical studies. The collected sequences and related information are now accessible on a publicly available and user-friendly website. This open-access database will support clinical research and vaccine development related to viral genotype.

Libin, Pieter; Deforche, Koen; Edwards, Dustin; de Albuquerque-Junior, Antonio Eduardo; Vandamme, Anne-Mieke; Galvao-Castro, Bernardo; Alcantara, Luiz Carlos Junior

2012-01-01

74

Genomics and Public Health Research: Can the State Allow Access to Genomic Databases?  

PubMed Central

Because many diseases are multifactorial disorders, the scientific progress in genomics and genetics should be taken into consideration in public health research. In this context, genomic databases will constitute an important source of information. Consequently, it is important to identify and characterize the State’s role and authority on matters related to public health, in order to verify whether it has access to such databases while engaging in public health genomic research. We first consider the evolution of the concept of public health, as well as its core functions, using a comparative approach (e.g. WHO, PAHO, CDC and the Canadian province of Quebec). Following an analysis of relevant Quebec legislation, the precautionary principle is examined as a possible avenue to justify State access to and use of genomic databases for research purposes. Finally, we consider the influenza pandemic plans developed by WHO, Canada, and Quebec as examples of key tools framing the public health decision-making process. We observed that State powers in public health are not, in Quebec, well adapted to the expansion of genomics research. We propose that the scope of the concept of research in public health should be clear and include the following characteristics: a commitment to the health and well-being of the population and to their determinants; the inclusion of both applied research and basic research; and an appropriate model of governance (authorization, follow-up, consent, etc.). We also suggest that the strategic approach version of the precautionary principle could guide collective choices in these matters.

Cousineau, J; Girard, N; Monardes, C; Leroux, T; Jean, M Stanton

2012-01-01

75

DNAtraffic--a new database for systems biology of DNA dynamics during the cell life  

PubMed Central

DNAtraffic (http://dnatraffic.ibb.waw.pl/) is intended to be a unique, comprehensive and richly annotated database of genome dynamics during the cell life. It contains extensive data on the nomenclature, ontology, structure and function of proteins related to the DNA integrity mechanisms such as chromatin remodeling, histone modifications, DNA repair and damage response from eight organisms: Homo sapiens, Mus musculus, Drosophila melanogaster, Caenorhabditis elegans, Saccharomyces cerevisiae, Schizosaccharomyces pombe, Escherichia coli and Arabidopsis thaliana. DNAtraffic contains comprehensive information on the diseases related to the assembled human proteins. DNAtraffic is richly annotated with systemic information on the nomenclature, chemistry and structure of DNA damage and its sources, including environmental agents or commonly used drugs targeting nucleic acids and/or proteins involved in the maintenance of genome stability. One aim of the DNAtraffic database is to create the first platform for analyzing the combinatorial complexity of the DNA network. The database includes illustrations of pathways, damage, proteins and drugs. Since DNAtraffic is designed to cover a broad spectrum of scientific disciplines, it has to be extensively linked to numerous external data sources. Our database represents the result of manual annotation work aimed at making DNAtraffic much more useful for a wide range of systems biology applications.

Kuchta, Krzysztof; Barszcz, Daniela; Grzesiuk, Elzbieta; Pomorski, Pawel; Krwawicz, Joanna

2012-01-01

76

Public Databases for Radio Astronomy: a joint venture between East and West  

NASA Astrophysics Data System (ADS)

Existing astronomical data centers and databases of celestial objects show a notable lack of published information on sources of radio emission. Two western and two eastern research teams have in the past worked independently both on systematically collecting old and newly published source lists in electronic form and on building search tools to query them. This project joins the efforts of all teams and proposes to establish the first reasonably complete and publicly accessible database of radio sources. The organization of the database of astrophysical catalogs is described. This “CATalog supporting System” (CATS) runs under the UNIX/LINUX operating system and permits external users to process requests on the basis of various net protocols and via electronic mail.

Trushkin, S. A.; Andernach, H.; Micol, A.; Gubanov, A. G.; Verkhodanov, O.; Titov, V. B.; Chernenkov, V. N.

77

Accessing the public MIMIC-II intensive care relational database for clinical research  

PubMed Central

Background The Multiparameter Intelligent Monitoring in Intensive Care II (MIMIC-II) database is a free, public resource for intensive care research. The database was officially released in 2006, and has attracted a growing number of researchers in academia and industry. We present the two major software tools that facilitate accessing the relational database: the web-based QueryBuilder and a downloadable virtual machine (VM) image. Results QueryBuilder and the MIMIC-II VM have been developed successfully and are freely available to MIMIC-II users. Simple example SQL queries and the resulting data are presented. Clinical studies pertaining to acute kidney injury and prediction of fluid requirements in the intensive care unit are shown as typical examples of research performed with MIMIC-II. In addition, MIMIC-II has also provided data for annual PhysioNet/Computing in Cardiology Challenges, including the 2012 Challenge “Predicting mortality of ICU Patients”. Conclusions QueryBuilder is a web-based tool that provides easy access to MIMIC-II. For more computationally intensive queries, one can locally install a complete copy of MIMIC-II in a VM. Both publicly available tools provide the MIMIC-II research community with convenient querying interfaces and complement the value of the MIMIC-II relational database.

2013-01-01

78

ScafBank: a public comprehensive Scaffold database to support molecular hopping  

Microsoft Academic Search

Aim: The search for molecules whose bioactivities are similar to those of given compounds, or for ways to optimize the initial lead compounds from high throughput screening, has attracted increasing interest in recent years. Our goal is to provide a publicly searchable database of scaffolds derived from a large collection of existing chemical molecules. Results: Although a number of in silico methods have emerged to

Bi-bo Yan; Meng-zhu Xue; Bing Xiong; Ke Liu; Ding-yu Hu; Jing-kang Shen

2009-01-01

79

Bioethical Biobanks: Three Concerns in Designing and Using Law Enforcement DNA Identification Databases  

SciTech Connect

Federal and state law enforcement authorities have amassed large collections of DNA samples and the identifying profiles derived from them. These databases help to identify the guilty and to exonerate the innocent, but as the databanks grow, so do fears about civil liberties. The research reported here discusses three legal and social policy issues that have been raised in regard to these biobanks—the choice of loci to type for identifying individuals, the indefinite retention of DNA samples, and the use of the DNA samples or the identifying profiles for research purposes. It also considers the possible value of the databases for research into the genetics of human behavior and the ethics of using them for this purpose. It rejects the broad claim that such research is inherently unethical but proposes procedures for ensuring that the value of the proposed research justifies any psychosocial or other risks to the subjects of the research.

D.H. Kaye

2006-10-19

80

Prisoners' expectations of the national forensic DNA database: surveillance and reconfiguration of individual rights.  

PubMed

In this paper we discuss what Portuguese prisoners know and how they feel about surveillance mechanisms related to the inclusion and deletion of the DNA profiles of convicted criminals in the national forensic database. Through a set of interviews with individuals currently imprisoned, we focus on the ways this group perceives forensic DNA technologies. While the institutional and political discourses maintain that the restricted use and application of DNA profiles within the national forensic database protects individuals' rights, the prisoners claim that police misuse of such technologies potentially makes it difficult to escape from surveillance and acts as a means of reinforcing the stigma of delinquency. The prisoners also argue that additional intensive and extensive use of surveillance devices might be more protective of their own individual rights and might possibly increase potential for exoneration. PMID:21414735

Machado, Helena; Santos, Filipe; Silva, Susana

2011-07-15

81

SITVITWEB--a publicly available international multimarker database for studying Mycobacterium tuberculosis genetic diversity and molecular epidemiology.  

PubMed

Among various genotyping methods to study Mycobacterium tuberculosis complex (MTC) genotypic polymorphism, spoligotyping and mycobacterial interspersed repetitive units-variable number of DNA tandem repeats (MIRU-VNTRs) have recently gained international approval as robust, fast, and reproducible typing methods generating data in a portable format. Spoligotyping constituted the backbone of a publicly available database SpolDB4 released in 2006; nonetheless this method possesses a low discriminatory power when used alone and should be ideally used in conjunction with a second typing method such as MIRU-VNTRs for high-resolution epidemiological studies. We hereby describe a publicly available international database named SITVITWEB which incorporates such multimarker data, providing a global view of MTC genetic diversity worldwide based on 62,582 clinical isolates corresponding to 153 countries of patient origin (105 countries of isolation). We report a total of 7105 spoligotype patterns (corresponding to 58,180 clinical isolates) - grouped into 2740 shared-types or spoligotype international types (SIT) containing 53,816 clinical isolates and 4364 orphan patterns. Interestingly, only 7% of the MTC isolates worldwide were orphans whereas more than half of SITed isolates (n=27,059) were restricted to only 24 most prevalent SITs. The database also contains a total of 2379 MIRU patterns (from 8161 clinical isolates) from 87 countries of patient origin (35 countries of isolation); these were grouped in 847 shared-types or MIRU international types (MIT) containing 6626 isolates and 1533 orphan patterns. Lastly, data on 5-locus exact tandem repeats (ETRs) were available on 4626 isolates from 59 countries of patient origin (22 countries of isolation); a total of 458 different VNTR patterns were observed - split into 245 shared-types or VNTR International Types (VIT) containing 4413 isolates and 213 orphan patterns. Data mining of SITVITWEB further allowed us to update the rules defining MTC genotypic lineages as well as to gain new insight into MTC population structure and worldwide distribution at country, sub-regional and continental levels. At the evolutionary level, the data compiled may be useful to distinguish the occasional convergent evolution of genotypes versus specific evolution of sublineages essentially influenced by adaptation to the host. This database is publicly available at: http://www.pasteur-guadeloupe.fr:8081/SITVIT_ONLINE. PMID:22365971

Demay, Christophe; Liens, Benjamin; Burguière, Thomas; Hill, Véronique; Couvin, David; Millet, Julie; Mokrousov, Igor; Sola, Christophe; Zozio, Thierry; Rastogi, Nalin

2012-06-01
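The SITVITWEB abstract above groups patterns into shared types and orphans. A minimal sketch of that bookkeeping follows. It assumes the usual convention that a shared type is a pattern seen in two or more isolates and an orphan is a pattern seen in exactly one; the abstract does not spell this out. The function name and the toy patterns are hypothetical, not SITVITWEB data or code. The last lines check that the 5-locus ETR counts quoted in the abstract add up.

```python
from collections import Counter


def split_shared_and_orphans(isolate_patterns):
    """Group isolates by pattern: shared types (>= 2 isolates) versus orphans (1 isolate).

    Assumption: "shared type" means a pattern observed in two or more isolates.
    """
    counts = Counter(isolate_patterns)
    shared_types = {pattern: n for pattern, n in counts.items() if n >= 2}
    orphans = sorted(pattern for pattern, n in counts.items() if n == 1)
    return shared_types, orphans


# Hypothetical toy patterns, one entry per isolate (not real SITVITWEB records).
toy_isolates = ["P1", "P1", "P2", "P3", "P3", "P3", "P4"]
shared_types, orphans = split_shared_and_orphans(toy_isolates)
print(shared_types)  # {'P1': 2, 'P3': 3}
print(orphans)       # ['P2', 'P4']

# Consistency check on the 5-locus ETR figures quoted in the abstract.
vit_types, vit_isolates, etr_orphans = 245, 4413, 213
print(vit_types + etr_orphans)     # 458 distinct VNTR patterns
print(vit_isolates + etr_orphans)  # 4626 isolates
```

Because each orphan pattern corresponds to a single isolate, the orphan count contributes equally to the pattern total and to the isolate total, which is why both sums can be checked from the same three numbers.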

82

DNA variant databases improve test accuracy and phenotype prediction in Alport syndrome.  

PubMed

X-linked Alport syndrome is a form of progressive renal failure caused by pathogenic variants in the COL4A5 gene. More than 700 variants have been described and a further 400 are estimated to be known to individual laboratories but are unpublished. The major genetic testing laboratories for X-linked Alport syndrome worldwide have established a Web-based database for published and unpublished COL4A5 variants ( https://grenada.lumc.nl/LOVD2/COL4A/home.php?select_db=COL4A5 ). This conforms with the recommendations of the Human Variome Project: it uses the Leiden Open Variation Database (LOVD) format, describes variants according to the human reference sequence with standardized nomenclature, indicates likely pathogenicity and associated clinical features, and credits the submitting laboratory. The database includes non-pathogenic and recurrent variants, and is linked to another COL4A5 mutation database and relevant bioinformatics sites. Access is free. Increasing the number of COL4A5 variants in the public domain helps patients, diagnostic laboratories, clinicians, and researchers. The database improves the accuracy and efficiency of genetic testing because its variants are already categorized for pathogenicity. The description of further COL4A5 variants and clinical associations will improve our ability to predict phenotype and our understanding of collagen IV biochemistry. The database for X-linked Alport syndrome represents a model for databases in other inherited renal diseases. PMID:23720012

Savige, Judy; Ars, Elisabet; Cotton, Richard G H; Crockett, David; Dagher, Hayat; Deltas, Constantinos; Ding, Jie; Flinter, Frances; Pont-Kingdon, Genevieve; Smaoui, Nizar; Torra, Roser; Storey, Helen

2014-06-01

83

Privacy protection and public goods: building a genetic database for health research in Newfoundland and Labrador  

PubMed Central

Objective To provide a legal and ethical analysis of some of the implementation challenges faced by the Population Therapeutics Research Group (PTRG) at Memorial University (Canada), in using genealogical information offered by individuals for its genetics research database. Materials and methods This paper describes the unique historical and genetic characteristics of the Newfoundland and Labrador founder population, which gave rise to the opportunity for PTRG to build the Newfoundland Genealogy Database containing digitized records of all pre-confederation (1949) census records of the Newfoundland founder population. In addition to building the database, PTRG has developed the Heritability Analytics Infrastructure, a data management structure that stores genotype, phenotype, and pedigree information in a single database, and custom linkage software (KINNECT) to perform pedigree linkages on the genealogy database. Discussion A newly adopted legal regimen in Newfoundland and Labrador is discussed. It incorporates health privacy legislation with a unique research ethics statute governing the composition and activities of research ethics boards and, for the first time in Canada, elevating the status of national research ethics guidelines into law. The discussion looks at this integration of legal and ethical principles which provides a flexible and seamless framework for balancing the privacy rights and welfare interests of individuals, families, and larger societies in the creation and use of research data infrastructures as public goods. Conclusion The complementary legal and ethical frameworks that now coexist in Newfoundland and Labrador provide the legislative authority, ethical legitimacy, and practical flexibility needed to find a workable balance between privacy interests and public goods. Such an approach may also be instructive for other jurisdictions as they seek to construct and use biobanks and related research platforms for genetic research.

Pullman, Daryl; Perrot-Daley, Astrid; Hodgkinson, Kathy; Street, Catherine; Rahman, Proton

2013-01-01

84

Development of a 20-locus fluorescent multiplex system as a valuable tool for national DNA database.  

PubMed

The multiplex system allows the detection of 19 autosomal short tandem repeat (STR) loci [including all Combined DNA Index System (CODIS) STR loci as well as D2S1338, D6S1043, D12S391, D19S433, Penta D and Penta E] plus the sex-determining locus Amelogenin in a single reaction, comprising all STR loci in various commercial kits used in the China national DNA database (NDNAD). Primers are designed so that the amplicons range from 90 base pairs (bp) to 450 bp within a five-dye fluorescent design, with the fifth dye reserved for the internal size standard. With 30 cycles, 125 pg to 2 ng of DNA template gave optimal profiling results, while robust profiles could also be achieved for template amounts outside that range by adjusting the cycle number. Mixture studies showed that 83% and 87% of minor alleles were detected at 9:1 and 1:9 ratios, respectively. When 4 ng of degraded DNA was digested by 2-min DNase and 1 ng undegraded DNA was added to 400 µM haematin, the complete profiles were still observed. Polymerase chain reaction (PCR)-based procedures were examined and optimized, including the concentrations of the primer set, magnesium and the Taq polymerase as well as volume, cycle number and annealing temperature. In addition, the system has been validated with 3000 bloodstain samples and 35 common case samples in line with the Chinese National Standards and Scientific Working Group on DNA Analysis Methods (SWGDAM) guidelines. The total probability of identity (TPI) can reach 8×10^-24, so the system can support a DNA database of 10 million profiles or more, because the expected number of matches (4×10^-10) is far below one person and is negligible. Further, the system performs well on case samples and will be an ideal tool for forensic DNA typing and databasing. PMID:23266302

Jiang, Xianhua; Guo, Fei; Jia, Fei; Jin, Ping; Sun, Zhu

2013-02-01
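The 20-locus multiplex abstract above quotes a total probability of identity of 8×10^-24 and an expected number of matches of 4×10^-10 for a database of 10 million profiles. The abstract does not say how the second figure was derived. One reading that reproduces it is to multiply the probability of identity by the number of distinct profile pairs in the database (an all-against-all comparison). The sketch below works through that arithmetic under this assumption; the function name is hypothetical.

```python
from math import comb


def expected_pairwise_matches(n_profiles: int, match_probability: float) -> float:
    """Expected number of coincidentally matching pairs among n unrelated profiles.

    Assumption: every pair of profiles is compared once, and each pair
    matches independently with the given probability.
    """
    return comb(n_profiles, 2) * match_probability


tpi = 8e-24              # total probability of identity, from the abstract
n_profiles = 10_000_000  # database size, from the abstract

expected = expected_pairwise_matches(n_profiles, tpi)
print(f"{expected:.1e}")  # 4.0e-10
```

With about 5×10^13 pairs among 10 million profiles, the product is about 4×10^-10, which agrees with the figure in the abstract.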

85

Novel statistical tools for management of public databases facilitate community-wide replicability and control of false discovery.  

PubMed

Issues of publication bias, lack of replicability, and false discovery have long plagued the genetics community. Proper utilization of public and shared data resources presents an opportunity to ameliorate these problems. We present an approach to public database management that we term Quality Preserving Database (QPD). It enables perpetual use of the database for testing statistical hypotheses while controlling false discovery and avoiding publication bias on the one hand, and maintaining testing power on the other hand. We demonstrate it on a use case of a replication server for GWAS findings, underlining its practical utility. We argue that a shift to using QPD in managing current and future biological databases will significantly enhance the community's ability to make efficient and statistically sound use of the available data resources. PMID:24706571

Rosset, Saharon; Aharoni, Ehud; Neuvirth, Hani

2014-07-01

86

Development and Evaluation of a Quality-Controlled Ribosomal Sequence Database for 16S Ribosomal DNA-Based Identification of Staphylococcus Species  

PubMed Central

To establish an improved ribosomal gene sequence database as part of the Ribosomal Differentiation of Microorganisms (RIDOM) project and to overcome the drawbacks of phenotypic identification systems and publicly accessible sequence databases, both strands of the 5′ end of the 16S ribosomal DNA (rDNA) of 81 type and reference strains comprising all validly described staphylococcal (sub)species were sequenced. Assuming a normal distribution for pairwise distances of all unique staphylococcal sequences and choosing a reporting criterion of ≥98.7% similarity for a “distinct species,” a statistical error probability of 1.0% was calculated. To evaluate this database, a 16S rDNA fragment (corresponding to Escherichia coli positions 54 to 510) of 55 clinical Staphylococcus isolates (including those of the small-colony variant phenotype) was sequenced and analyzed by the RIDOM approach. Of these isolates, 54 (98.2%) had a similarity score above the proposed threshold using RIDOM; 48 (87.3%) of the sequences gave a perfect match, whereas 83.6% were found by searching National Center for Biotechnology Information (NCBI) database entries. In contrast to RIDOM, which showed four ambiguities at the species level (mainly concerning Staphylococcus intermedius versus Staphylococcus delphini), the NCBI database search yielded 18 taxon-related ambiguities and showed numerous matches exhibiting redundant or unspecified entries. Comparing molecular results with those of biochemical procedures, ID 32 Staph (bioMérieux, Marcy l'Etoile, France) and VITEK 2 (bioMérieux) failed to identify 13 (23.6%) and 19 (34.5%) isolates, respectively, due to incorrect identification and/or categorization below acceptable values. In contrast to phenotypic methods and the NCBI database, the novel high-quality RIDOM sequence database provides excellent identification of staphylococci, including rarely isolated species and phenotypic variants.

Becker, Karsten; Harmsen, Dag; Mellmann, Alexander; Meier, Christian; Schumann, Peter; Peters, Georg; von Eiff, Christof

2004-01-01
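The RIDOM abstract above reports isolates by comparing a similarity score against a 98.7% threshold. The sketch below shows one simple way such a threshold could be applied. It is not the RIDOM algorithm: it assumes the two sequences are already aligned to equal length, scores similarity as plain percent identity, and treats 98.7% as an inclusive lower bound. The fragment length of 457 bases assumes that E. coli positions 54 to 510 are counted inclusively. All function names and the toy sequences are hypothetical.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of positions at which two pre-aligned, equal-length sequences agree."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def passes_threshold(query: str, reference: str, threshold: float = 98.7) -> bool:
    """True when the similarity score reaches the reporting threshold (inclusive)."""
    return percent_identity(query, reference) >= threshold


def with_mismatches(seq: str, k: int) -> str:
    """Return a copy of seq whose first k bases are substituted (toy mutation)."""
    swapped = "".join("T" if base != "T" else "A" for base in seq[:k])
    return swapped + seq[k:]


# Hypothetical 457-base reference fragment (positions 54 to 510, inclusive).
reference = ("ACGT" * 115)[:457]

five_off = with_mismatches(reference, 5)  # 452/457 identical, about 98.9%
six_off = with_mismatches(reference, 6)   # 451/457 identical, about 98.69%

print(passes_threshold(five_off, reference))  # True
print(passes_threshold(six_off, reference))   # False
```

On a fragment of this length, the threshold therefore tolerates at most five mismatched positions under the stated assumptions.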

87

The Internet Brain Volume Database: A Public Resource for Storage and Retrieval of Volumetric Data  

PubMed Central

Every month, numerous publications appear that include neuroanatomic volumetric observations. The current and past literature that includes volumetric measurements is vast, but variable with respect to specific species, structures, and subject characteristics (such as gender, age, pathology, etc.). In this report we introduce the Internet Brain Volume Database (IBVD), www.nitrc.org/projects/ibvd, a site devoted to facilitating access to and utilization of neuroanatomic volumetric observations as published in the literature. We review the design and functionality of the site. The IBVD is the first database dedicated to integrating, exposing and sharing brain volumetric observations across species and disease. It offers valuable functionality for quality assurance assessment of results as well as support for meta-analysis across large segments of the published literature that are obscured from traditional text-based search engines.

Hodge, Steven M.; Gao, Yong; Frazier, Jean A.; Haselgrove, Christian

2012-01-01

88

Internet-Accessible DNA Sequence Database for Identifying Fusaria from Human and Animal Infections  

PubMed Central

Because less than one-third of clinically relevant fusaria can be accurately identified to species level using phenotypic data (i.e., morphological species recognition), we constructed a three-locus DNA sequence database to facilitate molecular identification of the 69 Fusarium species associated with human or animal mycoses encountered in clinical microbiology laboratories. The database comprises partial sequences from three nuclear genes: translation elongation factor 1α (EF-1α), the largest subunit of RNA polymerase (RPB1), and the second largest subunit of RNA polymerase (RPB2). These three gene fragments can be amplified by PCR and sequenced using primers that are conserved across the phylogenetic breadth of Fusarium. Phylogenetic analyses of the combined data set reveal that, with the exception of two monotypic lineages, all clinically relevant fusaria are nested in one of eight variously sized and strongly supported species complexes. The monophyletic lineages have been named informally to facilitate communication of an isolate's clade membership and genetic diversity. To identify isolates to the species included within the database, partial DNA sequence data from one or more of the three genes can be used as a BLAST query against the database which is Web accessible at FUSARIUM-ID (http://isolate.fusariumdb.org) and the Centraalbureau voor Schimmelcultures (CBS-KNAW) Fungal Biodiversity Center (http://www.cbs.knaw.nl/fusarium). Alternatively, isolates can be identified via phylogenetic analysis by adding sequences of unknowns to the DNA sequence alignment, which can be downloaded from the two aforementioned websites. The utility of this database should increase significantly as members of the clinical microbiology community deposit in internationally accessible culture collections (e.g., CBS-KNAW or the Fusarium Research Center) cultures of novel mycosis-associated fusaria, along with associated, corrected sequence chromatograms and data, so that the sequence results can be verified and isolates are made available for future study.

O'Donnell, Kerry; Sutton, Deanna A.; Rinaldi, Michael G.; Sarver, Brice A. J.; Balajee, S. Arunmozhi; Schroers, Hans-Josef; Summerbell, Richard C.; Robert, Vincent A. R. G.; Crous, Pedro W.; Zhang, Ning; Aoki, Takayuki; Jung, Kyongyong; Park, Jongsun; Lee, Yong-Hwan; Kang, Seogchan; Park, Bongsoo; Geiser, David M.

2010-01-01

89

Use of DNA profiles for investigation using a simulated national DNA database: Part I. Partial SGM Plus profiles.  

PubMed

In traditional criminal investigation, uncertainties are often dealt with using a combination of common sense, practical considerations and experience, but rarely with tailored statistical models. For example, in some countries, in order to search for a given profile in the national DNA database, it must have allelic information for six or more of the ten SGM Plus loci for a simple trace. If the profile does not have this amount of information then it cannot be searched in the national DNA database (NDNAD). This requirement (of a result at six or more loci) is not based on a statistical approach, but rather on the feeling that six or more would be sufficient. A statistical approach, however, could be more rigorous and objective and would take into consideration factors such as the probability of adventitious matches relative to the actual database size and/or investigator's requirements in a sensible way. Therefore, this research was undertaken to establish scientific foundations pertaining to the use of partial SGM Plus loci profiles (or similar) for investigation. PMID:20457051

Hicks, T; Taroni, F; Curran, J; Buckleton, J; Ribaux, O; Castella, V

2010-07-01

90

"Would you accept having your DNA profile inserted in the National Forensic DNA database? Why?" Results of a questionnaire applied in Portugal.  

PubMed

The creation and expansion of forensic DNA databases might involve potential threats to the protection of a range of human rights. At the same time, such databases have social benefits. Based on data collected through an online questionnaire applied to 628 individuals in Portugal, this paper aims to analyze the citizens' willingness to voluntarily donate a sample for profiling and inclusion in the National Forensic DNA Database and the views underpinning such a decision. Nearly one-quarter of the respondents would indicate 'no', and this negative response increased significantly with age and education. The overriding willingness to accept the inclusion of the individual genetic profile indicates an acknowledgement of the investigative potential of forensic DNA technologies and a relegation of civil liberties and human rights to the background, owing to the perceived benefits of protecting both society and the individual from crime. This rationale is mostly expressed by the idea that all citizens should contribute to the expansion of the National Forensic DNA Database for reasons that range from the more abstract assumption that donating a sample for profiling would be helpful in fighting crime to the more concrete suggestion that everyone (criminals and non-criminals) should be in the database. The concerns with the risks of accepting the donation of a sample for genetic profiling and inclusion in the National Forensic DNA Database are mostly related to lack of control and insufficient or unclear regulations concerning safeguarding individuals' data and supervising the access and uses of genetic data. By providing an empirically-grounded understanding of the attitudes regarding willingness to voluntarily donate a sample for profiling and inclusion in a National Forensic DNA Database, this study also considers the citizens' perceived benefits and risks of operating forensic DNA databases. 
These collective views might be useful for the formation of international common ethical standards for the development and governance of DNA databases in a framework in which the citizens' perspectives are taken into consideration. PMID:24315600

Machado, Helena; Silva, Susana

2014-01-01

91

SELEX_DB: an activated database on selected randomized DNA/RNA sequences addressed to genomic sequence annotation  

Microsoft Academic Search

SELEX_DB is a novel curated database on selected randomized DNA/RNA sequences designed for accumulation of experimental data on functional site sequences obtained by using SELEX and SELEX-like technologies from the pools of random sequences. This database also contains the programs for DNA/RNA functional site recognition within arbitrary nucleotide sequences. The first release of SELEX_DB has been installed under SRS and

Julia V. Ponomarenko; Galina Orlova; Mikhail P. Ponomarenko; Sergey V. Lavryushev; Anatoly S. Frolov; Svetlana V. Zybova; Nikolay A. Kolchanov

2000-01-01

92

A comparison of adjustment methods to test the robustness of an STR DNA database comprised of 24 European populations  

Microsoft Academic Search

An aim of the European Network of Forensic Science Institutes (ENFSI) is to produce a DNA database of second generation multiplex (SGM) STR profiles that is representative of the resident cosmopolitan populations. To achieve this, data were collected from 24 different populations. All of the data were combined to form one database of 5700 profiles from which allele proportions were

Peter Gill; Lindsey Foreman; John S Buckleton; Christopher M Triggs; Heather Allen

2003-01-01

93

Feline Non-repetitive Mitochondrial DNA Control Region Database for Forensic Evidence  

PubMed Central

The domestic cat is one of the most popular pets throughout the world. A by-product of owning, interacting with, or being in a household with a cat is the transfer of shed fur to clothing or personal objects. As trace evidence, transferred cat fur is a relatively untapped resource for forensic scientists. Both phenotypic and genotypic characteristics can be obtained from cat fur, but databases exist for neither aspect. Because cats incessantly groom, cat fur may have nucleated cells, not only in the hair bulb, but also as epithelial cells on the hair shaft deposited during the grooming process, thereby generally providing material for DNA profiling. To effectively exploit cat hair as a resource, representative databases must be established. This study evaluates 402 bp of the mtDNA control region (CR) from 1,394 cats, including cats from 25 distinct worldwide populations and 26 breeds. Eighty-three percent of the cats are represented by 12 major mitotypes. An additional 8.0% are clearly derived from the major mitotypes. Unique sequences were found in 7.5% of the cats. The overall genetic diversity for this data set was 0.8813 ± 0.0046 with a random match probability of 11.8%. This region of the cat mtDNA has discriminatory power suitable for forensic application worldwide.
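The two summary statistics reported above are commonly computed from haplotype (mitotype) counts: the random match probability as the sum of squared haplotype frequencies, and genetic diversity with Nei's unbiased estimator. The sketch below assumes those standard definitions, which the abstract does not spell out; the function names and the mitotype labels in the sample are hypothetical and not taken from the study's data.

```python
from collections import Counter


def random_match_probability(haplotypes):
    """Sum of squared haplotype frequencies in the sample."""
    n = len(haplotypes)
    counts = Counter(haplotypes)
    return sum((c / n) ** 2 for c in counts.values())


def genetic_diversity(haplotypes):
    """Nei's unbiased haplotype diversity: n/(n-1) * (1 - sum of p_i^2)."""
    n = len(haplotypes)
    return (n / (n - 1)) * (1.0 - random_match_probability(haplotypes))


# Hypothetical sample of ten cats carrying three mitotypes.
sample = ["A"] * 5 + ["B"] * 3 + ["C"] * 2
print(f"random match probability: {random_match_probability(sample):.4f}")
print(f"genetic diversity: {genetic_diversity(sample):.4f}")
```

A sample dominated by a few common mitotypes gives a high random match probability, which is why mtDNA evidence of this kind tends to be more useful for exclusion than for individual identification.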

Grahn, R. A.; Kurushima, J. D.; Billings, N. C.; Grahn, J.C.; Halverson, J. L.; Hammer, E.; Ho, C.K.; Kun, T. J.; Levy, J.K.; Lipinski, M. J.; Mwenda, J.M.; Ozpinar, H.; Schuster, R.K; Shoorijeh, S.J.; Tarditi, C. R.; Waly, N.E.; Wictum, E. J.; Lyons, L. A.

2010-01-01

94

Development of an Integrated Suite of Software in Analysing of Large DNA Databases  

PubMed Central

The work showed that the integrated suite of software tools for detecting criminals using DNA databases achieved its overall objective of providing a working platform for sequence analysis. It also demonstrated that, by integrating BLAST and FASTA (two widely used and freely available algorithms) with an additional implementation of PSA (custom-built pairwise sequence alignment algorithms), TR analysis tools (for detecting tandem repeats) and the supporting utilities developed (database and file management), it is entirely possible to have an initial working version of a software tool for criminal DNA analysis and detection work. The integrated software tool has great potential, and the results obtained during the tests were satisfactory. The recent South Asia tsunami has renewed the need to establish a quick and reliable system for DNA matching and comparison. This work may also contribute towards the quick identification of victims in many disasters. Future work will further enhance the existing tools by adding more options and controls, improve the visualisation display, and build a robust software architecture to better manage system load. Enhancing the system's fault tolerance is one of the key areas that can further help to make the entire application efficient, robust and reliable.

Kong, K.S; Ng, E.Y.K

2008-01-01

95

Estimating haplotype frequencies by combining data from large DNA pools with database information.  

PubMed

We assume that allele frequency data have been extracted from several large DNA pools, each containing genetic material of up to hundreds of sampled individuals. Our goal is to estimate the haplotype frequencies among the sampled individuals by combining the pooled allele frequency data with prior knowledge about the set of possible haplotypes. Such prior information can be obtained, for example, from a database such as HapMap. We present a Bayesian haplotyping method for pooled DNA based on a continuous approximation of the multinomial distribution. The proposed method is applicable when the sizes of the DNA pools and/or the number of considered loci exceed the limits of several earlier methods. In the example analyses, the proposed model clearly outperforms a deterministic greedy algorithm on real data from the HapMap database. With a small number of loci, the performance of the proposed method is similar to that of an EM-algorithm, which uses a multinormal approximation for the pooled allele frequencies, but which does not utilize prior information about the haplotypes. The method has been implemented using Matlab and the code is available upon request from the authors. PMID:21071795

Gasbarra, Dario; Kulathinal, Sangita; Pirinen, Matti; Sillanpää, Mikko J

2011-01-01

96

SABRE2: a database connecting plant EST/full-length cDNA clones with Arabidopsis information.  

PubMed

The SABRE (Systematic consolidation of Arabidopsis and other Botanical REsources) database cross-searches plant genetic resources through publicly available Arabidopsis information. In SABRE, plant expressed sequence tag (EST)/cDNA clones are related to TAIR (The Arabidopsis Information Resource) gene models and their annotations through sequence similarity. By entering a keyword, SABRE searches and retrieves TAIR gene models and annotations, together with homologous gene clones from various plant species. SABRE thus facilitates using TAIR annotations of Arabidopsis genes for research on homologous genes from other model plants. To expand the application range of SABRE to crop breeding, we have recently upgraded SABRE to SABRE2 (http://sabre.epd.brc.riken.jp/SABRE2.html), by newly adding six model plants (including the major crops barley, soybean, tomato and wheat), and by improving the retrieval interface. The present version has integrated information on >1.5 million plant EST/cDNA clones from the National BioResource Project (NBRP) of Japan. All clones are actual experimental resources from 14 plant species (Arabidopsis, barley, cassava, Chinese cabbage, lotus, morning glory, poplar, Physcomitrella patens, Striga hermonthica, soybean, Thellungiella halophila, tobacco, tomato and wheat), and are available from the core facilities of the NBRP. SABRE2 is thus a useful tool that can contribute towards the improvement of important crop breeds by connecting basic research and crop breeding. PMID:24323624

Fukami-Kobayashi, Kaoru; Nakamura, Yasukazu; Tamura, Takuro; Kobayashi, Masatomo

2014-01-01

97

Loading the Database of Topographic Objects Bdot10k with the Data from Public Registers, Supported by Digital Generalization  

NASA Astrophysics Data System (ADS)

The article presents the legal and technological conditions for creating and maintaining the database of topographic objects (baza danych obiektów topograficznych - BDOT10k). Particular attention is paid to the adjustment (harmonization) of the new database with data from already existing public registers. The harmonization stage considers the selection and definition of proper operators of quantitative and qualitative generalization for individual objects and their attributes. Technologies for updating the topographic database across the whole country were also taken into account.

Zieliński, Jerzy

2013-12-01

98

Italian mitochondrial DNA database: results of a collaborative exercise and proficiency testing.  

PubMed

This work is a review of a collaborative exercise on mtDNA analysis undertaken by the Italian working group (Ge.F.I.). A total of 593 samples from 11 forensic genetic laboratories were subjected to hypervariable region (HVS-I/HVS-II) sequence analysis. The raw lane data were sent to MtDNA Population Database (EMPOP) for an independent evaluation. For the inclusion of data for the Italian database, quality assurance procedures were applied to the control region profiles. Only eight laboratories with a final population sample of 395 subjects passed the quality conformance test. Control region haplogroup (hg) assignments were confirmed by restriction fragment length polymorphism (RFLP) typing of the most common European hg-diagnostic sites. A total of 306 unique haplotypes derived from the combined analysis of control and coding region polymorphisms were found; the most common haplotype--CRS, 263, 309.1C, 315.1C/ not7025 AluI--was shared by 20 subjects. The majority of mtDNAs detected in the Italian population fell into the most common west Eurasian hgs: R0a (0.76%), HV (4.81%), H (38.99%), HV0 (3.55%), J (7.85%), T (13.42%), U (11.65%), K (10.13%), I (1.52%), X (2.78%), and W (1.01%). PMID:17952451

Turchi, Chiara; Buscemi, Loredana; Previderè, Carlo; Grignani, Pierangela; Brandstätter, Anita; Achilli, Alessandro; Parson, Walther; Tagliabracci, Adriano

2008-05-01

99

Non-B DB: a database of predicted non-B DNA-forming motifs in mammalian genomes  

PubMed Central

Although the capability of DNA to form a variety of non-canonical (non-B) structures has long been recognized, the overall significance of these alternate conformations in biology has only recently become accepted en masse. In order to provide access to genome-wide locations of these classes of predicted structures, we have developed non-B DB, a database integrating annotations and analysis of non-B DNA-forming sequence motifs. The database provides the most complete list of alternative DNA structure predictions available, including Z-DNA motifs, quadruplex-forming motifs, inverted repeats, mirror repeats and direct repeats and their associated subsets of cruciforms, triplex and slipped structures, respectively. The database also contains motifs predicted to form static DNA bends, short tandem repeats and homo(purine•pyrimidine) tracts that have been associated with disease. The database has been built using the latest releases of the human, chimp, dog, macaque and mouse genomes, so that the results can be compared directly with other data sources. In order to make the data interpretable in a genomic context, features such as genes, single-nucleotide polymorphisms and repetitive elements (SINE, LINE, etc.) have also been incorporated. The database is accessed through query pages that produce results with links to the UCSC browser and a GBrowse-based genomic viewer. It is freely accessible at http://nonb.abcc.ncifcrf.gov.

Cer, Regina Z.; Bruce, Kevin H.; Mudunuri, Uma S.; Yi, Ming; Volfovsky, Natalia; Luke, Brian T.; Bacolla, Albino; Collins, Jack R.; Stephens, Robert M.

2011-01-01

100

A Large Population Genetic Study of 15 Autosomal Short Tandem Repeat Loci for Establishment of Korean DNA Profile Database  

PubMed Central

Genotyping of highly polymorphic short tandem repeat (STR) markers is widely used for the genetic identification of individuals in forensic DNA analyses and in paternity disputes. The National DNA Profile Databank recently established by the DNA Identification Act in Korea contains the computerized STR DNA profiles of individuals convicted of crimes. For the establishment of a large autosomal STR loci population database, 1805 samples were obtained at random from Korean individuals and 15 autosomal STR markers were analyzed using the AmpFlSTR Identifiler PCR Amplification kit. For the 15 autosomal STR markers, no deviations from the Hardy-Weinberg equilibrium were observed. The most informative locus in our data set was the D2S1338 with a discrimination power of 0.9699. The combined matching probability was 1.521 × 10⁻¹⁷. This large STR profile dataset including atypical alleles will be important for the establishment of the Korean DNA database and for forensic applications.
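To make the relationship between the per-locus and combined figures concrete, the sketch below uses the conventional definitions: a locus's matching probability is the sum of squared genotype frequencies, its power of discrimination is one minus that value, and the combined matching probability is the product across loci, assuming the loci are independent. The abstract does not state its formulas, so these definitions are an assumption, and the function names and genotype data are hypothetical.

```python
import math
from collections import Counter


def locus_matching_probability(genotypes):
    """Sum of squared genotype frequencies at one locus.

    Each genotype is a pair of alleles; allele order is ignored.
    """
    n = len(genotypes)
    counts = Counter(tuple(sorted(g)) for g in genotypes)
    return sum((c / n) ** 2 for c in counts.values())


def power_of_discrimination(genotypes):
    """One minus the locus matching probability."""
    return 1.0 - locus_matching_probability(genotypes)


def combined_matching_probability(per_locus_probs):
    """Product of per-locus matching probabilities (assumes independent loci)."""
    return math.prod(per_locus_probs)


# Hypothetical genotypes observed at a single STR locus in four individuals.
locus = [(17, 19), (19, 17), (17, 17), (20, 23)]
pm = locus_matching_probability(locus)
print(f"matching probability: {pm:.4f}")
print(f"power of discrimination: {power_of_discrimination(locus):.4f}")
print(f"combined over two loci: {combined_matching_probability([pm, 0.1]):.4f}")
```

Because the combined value is a product over loci, fifteen loci that are each only moderately informative can still yield a very small combined matching probability.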

Yoo, Seong Yeon; Cho, Nam Soo; Park, Myung Jin; Seong, Ki Min; Hwang, Jung Ho; Song, Seok Bean; Han, Myun Soo; Lee, Won Tae; Chung, Ki Wha

2011-01-01

101

The public sequencing process, John Sulston. Site: DNA Interactive (www.dnai.org)  

NSDL National Science Digital Library

Interviewee: John Sulston. DNAi Location: Genome > The project > players > Public. The public sequencing process: Nobel Laureate John Sulston, a key figure in the UK sequencing effort, talks about breaking DNA apart so that the sequence can be reassembled.

2008-10-06

102

DNA perfume and improving public knowledge, Francis Collins. Site: DNA Interactive (www.dnai.org)  

NSDL National Science Digital Library

Interviewee: Francis Collins. DNAi Location: Timeline > 1990s. I arrived here April the 4th, 1993, an auspicious day because that very day I, as I opened the newspaper I realized that that was probably going to be remembered in the annals of DNA, not because of me, because that didn't seem like that was going to be so significant, but because that was the day that DNA perfume was issued, all kinds of advertising all over the place about, go out and buy this perfume. And it was a bit jarring because the perfume bottle was in the shape of a helix but it was actually a triple helix, making it very clear we had an educational challenge as well as a scientific challenge if this project was ever going to achieve the public benefit that we all hoped for.

2008-03-26

103

Defining new criteria for selection of cell-based intestinal models using publicly available databases  

PubMed Central

Background: The criteria for choosing relevant cell lines among a vast panel of available intestinal-derived lines exhibiting a wide range of functional properties are still ill-defined. The objective of this study was, therefore, to establish objective criteria for choosing relevant cell lines to assess their appropriateness as tumor models as well as for drug absorption studies. Results: We made use of publicly available expression signatures and cell based functional assays to delineate differences between various intestinal colon carcinoma cell lines and normal intestinal epithelium. We have compared a panel of intestinal cell lines with patient-derived normal and tumor epithelium and classified them according to traits relating to oncogenic pathway activity, epithelial-mesenchymal transition (EMT) and stemness, migratory properties, proliferative activity, transporter expression profiles and chemosensitivity. For example, SW480 represent an EMT-high, migratory phenotype and scored highest in terms of signatures associated to worse overall survival and higher risk of recurrence based on patient derived databases. On the other hand, differentiated HT29 and T84 cells showed gene expression patterns closest to tumor bulk derived cells. Regarding drug absorption, we confirmed that differentiated Caco-2 cells are the model of choice for active uptake studies in the small intestine. Regarding chemosensitivity we were unable to confirm a recently proposed association of chemo-resistance with EMT traits. However, a novel signature was identified through mining of NCI60 GI50 values that allowed to rank the panel of intestinal cell lines according to their drug responsiveness to commonly used chemotherapeutics. Conclusions: This study presents a straightforward strategy to exploit publicly available gene expression data to guide the choice of cell-based models. 
While this approach does not overcome the major limitations of such models, introducing a rank order of selected features may allow selecting model cell lines that are more adapted and pertinent to the addressed biological question.

2012-01-01

104

First searchable database for DNA profiles of human cell lines: sequential use of fingerprint techniques for authentication.  

PubMed

The authenticity and freedom from cross-contaminants of a cell line are important prerequisites for any research, development or production programs involving cell lines. Mini- and microsatellites in the human genome harboring variable-numbers of tandem repeat (VNTR) DNA markers allow individualization at the DNA level and are of practical value for genetic linkage mapping, forensic legal medicine, paternity testing, monitoring of bone marrow transplants, and individualization of established cell lines. We have validated fingerprint techniques of different single- and multiple-locus VNTRs enabling the establishment of a searchable database of DNA profiles. As a result, multiplexed polymerase chain reaction amplification fragment length polymorphism (AmpFLP) of four prominent and highly polymorphic minisatellite VNTR loci was proven as the best tool for screening the uniqueness of DNA profiles in a fingerprint database. In order to avoid false positivity, identical or similar DNA profiles based on AmpFLP VNTR were tested further using a multi-locus fingerprint system. Our data demonstrate that misidentification remains a chronic problem among human continuous cell lines (detailed information at URL http://www.dsmz.de). The combination of rapidly generated DNA profiles based on single-locus VNTR loci, their authentication by screening the fingerprint database, and confirmation of duplicate banding patterns using multilocus fingerprints constitute a highly reliable and robust method, which enables high fidelity and quality of maintenance independent from the quantity of individual cell lines. PMID:10541480

Dirks, W; MacLeod, R A; Jäger, K; Milch, H; Drexler, H G

1999-09-01

105

First whole genome based microsatellite DNA marker database of tomato for mapping and variety identification  

PubMed Central

Background: The cultivated tomato is the second most consumed vegetable in the world and is an important part of a diverse and balanced diet as a rich source of vitamins, minerals, phenolic antioxidants and the antioxidant lycopene, which has anti-cancer properties. To reap the benefits of the genomics of the domestic tomato (Solanum lycopersicum L.) unravelled by the Tomato Genome Consortium (The Tomato Genome Consortium, 2012), bulk mining of its markers in totality is imperative. Solgenomics has a limited number of microsatellite DNA markers (2867) pertaining to the Solanaceae family. Because these markers come from a linkage map with relative distances, markers cannot be chosen on the basis of absolute distance, as on a physical map. Only a limited number of microsatellite markers, with limitations, have been reported for variety identification; thus there is a need for more markers to supplement the DUS test and to support product traceability in the global market. Description: We present here the first whole genome based microsatellite DNA marker database of tomato, TomSatDB (Tomato MicroSatellite Database), with more than 1.4 million markers mined in silico using the MIcroSAtellite (MISA) tool. To cater to the customized needs of the wet lab, an automated primer designing tool has been added as a novel feature. TomSatDB (http://cabindb.iasri.res.in/tomsatdb), a user-friendly and freely accessible tool, offers chromosome-wise as well as location-wise search of primers. It is an online relational database based on a “three-tier architecture” that catalogues microsatellite information in MySQL, with a user-friendly interface developed using PHP (Hypertext Preprocessor). Conclusion: Besides abiotic stress, tomato is subject to biotic stress owing to its susceptibility to over 200 diseases caused by pathogenic fungi, bacteria, viruses and nematodes. 
These markers are expected to pave the way for germplasm management under abiotic and biotic stress as well as improvement through molecular breeding, leading to increased tomato productivity in India as well as other parts of the world. In the era of IPR, a new variety can be identified based on allelic variation among varieties, supplementing the DUS test and product traceability.
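For readers unfamiliar with in-silico microsatellite mining, the sketch below shows the general idea with a regular expression that finds perfect di- and trinucleotide repeats in a sequence. It is not the MISA tool or its algorithm; the function name, the minimum repeat thresholds and the test sequence are hypothetical choices made for illustration.

```python
import re


def find_ssrs(seq, min_repeats=None):
    """Find perfect simple sequence repeats (SSRs).

    min_repeats maps motif length to the minimum number of tandem copies.
    Returns a sorted list of (start index, motif, copy number) tuples.
    """
    if min_repeats is None:
        min_repeats = {2: 6, 3: 5}  # hypothetical thresholds
    seq = seq.upper()
    hits = []
    for k, m in min_repeats.items():
        # One motif of length k followed by at least m - 1 further copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (k, m - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            if len(set(motif)) == 1:
                # Skip homopolymer runs that would otherwise be reported
                # as di- or trinucleotide repeats.
                continue
            hits.append((match.start(), motif, len(match.group(0)) // k))
    return sorted(hits)


# Hypothetical sequence with an (AT)7 and a (GAA)5 repeat.
example = "GGC" + "AT" * 7 + "CCG" + "GAA" * 5 + "T"
for start, motif, copies in find_ssrs(example):
    print(start, motif, copies)
```

A production pipeline would also handle compound and imperfect repeats and would pass the flanking sequence of each hit to a primer design step, as the database described above does.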

2013-01-01

106

Low template STR typing: effect of replicate number and consensus method on genotyping reliability and DNA database search results.  

PubMed

To analyze DNA samples with very low DNA concentrations, various methods have been developed that sensitize short tandem repeat (STR) typing. Sensitized DNA typing is accompanied by stochastic amplification effects, such as allele drop-outs and drop-ins. Therefore low template (LT) DNA profiles are interpreted with care. One can either try to infer the genotype by a consensus method that uses alleles confirmed in replicate analyses, or one can use a statistical model to evaluate the strength of the evidence in a direct comparison with a known DNA profile. In this study we focused on the first strategy and we show that the procedure by which the consensus profile is assembled will affect genotyping reliability. In order to gain insight in the roles of replicate number and requested level of reproducibility, we generated six independent amplifications of samples of known donors. The LT methods included both increased cycling and enhanced capillary electrophoresis (CE) injection [1]. Consensus profiles were assembled from two to six of the replications using four methods: composite (include all alleles), n-1 (include alleles detected in all but one replicate), n/2 (include alleles detected in at least half of the replicates) and 2× (include alleles detected twice). We compared the consensus DNA profiles with the DNA profile of the known donor, studied the stochastic amplification effects and examined the effect of the consensus procedure on DNA database search results. From all these analyses we conclude that the accuracy of LT DNA typing and the efficiency of database searching improve when the number of replicates is increased and the consensus method is n/2. The most functional number of replicates within this n/2 method is four (although a replicate number of three suffices for samples showing >25% of the alleles in standard STR typing). 
This approach was also the optimal strategy for the analysis of 2-person mixtures, although modified search strategies may be needed to retrieve the minor component in database searches. From the database searches follows the recommendation to specifically mark LT DNA profiles when entering them into the DNA database. PMID:20655289
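The four consensus rules described above can be sketched for a single locus as follows. This is a minimal reading of the abstract's definitions, not the authors' implementation: "n-1" is taken to mean detected in at least n-1 replicates, "n/2" in at least half, and "2×" (spelled "2x" in the code) in at least two. The function name and the replicate data are hypothetical.

```python
from collections import Counter


def consensus_alleles(replicates, method):
    """Build a consensus allele set for one locus from replicate typings.

    replicates: list of allele collections, one per amplification.
    method: "composite", "n-1", "n/2" or "2x".
    """
    n = len(replicates)
    # Count in how many replicates each allele was detected.
    counts = Counter(allele for rep in replicates for allele in set(rep))
    if method == "composite":
        keep = lambda c: c >= 1        # every allele seen at least once
    elif method == "n-1":
        keep = lambda c: c >= n - 1    # seen in all but one replicate, or all
    elif method == "n/2":
        keep = lambda c: 2 * c >= n    # seen in at least half the replicates
    elif method == "2x":
        keep = lambda c: c >= 2        # seen at least twice
    else:
        raise ValueError(f"unknown consensus method: {method}")
    return {allele for allele, c in counts.items() if keep(c)}


# Hypothetical results of four replicate amplifications at one locus.
reps = [{"12", "14", "16"}, {"12", "16"}, {"12", "14", "15"}, {"12", "14"}]
for method in ("composite", "n-1", "n/2", "2x"):
    print(method, sorted(consensus_alleles(reps, method)))
```

With four replicates the n/2 and 2× rules coincide, since both require detection in two replicates; they diverge for larger replicate numbers, for example six, where n/2 requires three detections.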

Benschop, Corina C G; van der Beek, Cornelis P; Meiland, Hugo C; van Gorp, Ankie G M; Westen, Antoinette A; Sijen, Titia

2011-08-01

107

Public Perceptions and Expectations of the Forensic Use of DNA: Results of a Preliminary Study  

ERIC Educational Resources Information Center

The forensic use of Deoxyribonucleic Acid (DNA) is demonstrating significant success as a crime-solving tool. However, numerous concerns have been raised regarding the potential for DNA use to contravene cultural, ethical, and legal codes. In this article the expectations and level of knowledge of the New Zealand public of the DNA data-bank and…

Curtis, Cate

2009-01-01

108

Dfam: a database of repetitive DNA based on profile hidden Markov models  

PubMed Central

We present a database of repetitive DNA elements, called Dfam (http://dfam.janelia.org). Many genomes contain a large fraction of repetitive DNA, much of which is made up of remnants of transposable elements (TEs). Accurate annotation of TEs enables research into their biology and can shed light on the evolutionary processes that shape genomes. Identification and masking of TEs can also greatly simplify many downstream genome annotation and sequence analysis tasks. The commonly used TE annotation tools RepeatMasker and Censor depend on sequence homology search tools such as cross_match and BLAST variants, as well as Repbase, a collection of known TE families each represented by a single consensus sequence. Dfam contains entries corresponding to all Repbase TE entries for which instances have been found in the human genome. Each Dfam entry is represented by a profile hidden Markov model, built from alignments generated using RepeatMasker and Repbase. When used in conjunction with the hidden Markov model search tool nhmmer, Dfam produces a 2.9% increase in coverage over consensus sequence search methods on a large human benchmark, while maintaining low false discovery rates, and coverage of the full human genome is 54.5%. The website provides a collection of tools and data views to support improved TE curation and annotation efforts. Dfam is also available for download in flat file format or in the form of MySQL table dumps.

Wheeler, Travis J.; Clements, Jody; Eddy, Sean R.; Hubley, Robert; Jones, Thomas A.; Jurka, Jerzy; Smit, Arian F. A.; Finn, Robert D.

2013-01-01

109

Construction of a public CHO cell line transcript database using versatile bioinformatics analysis pipelines.  

PubMed

Chinese hamster ovary (CHO) cell lines represent the most commonly used mammalian expression system for the production of therapeutic proteins. In this context, detailed knowledge of the CHO cell transcriptome might help to improve biotechnological processes conducted by specific cell lines. Nevertheless, very few assembled cDNA sequences of CHO cells were publicly released until recently, which puts a severe limitation on biotechnological research. Two extended annotation systems and web-based tools, one for browsing eukaryotic genomes (GenDBE) and one for viewing eukaryotic transcriptomes (SAMS), were established as the first step towards a publicly usable CHO cell genome/transcriptome analysis platform. This is complemented by the development of a new strategy to assemble the ca. 100 million reads, sequenced from a broad range of diverse transcripts, to a high quality CHO cell transcript set. The cDNA libraries were constructed from different CHO cell lines grown under various culture conditions and sequenced using Roche/454 and Illumina sequencing technologies in addition to sequencing reads from a previous study. Two pipelines to extend and improve the CHO cell line transcripts were established. First, de novo assemblies were carried out with the Trinity and Oases assemblers, using varying k-mer sizes. The resulting contigs were screened for potential CDS using ESTScan. Redundant contigs were filtered out using cd-hit-est. The remaining CDS contigs were re-assembled with CAP3. Second, a reference-based assembly with the TopHat/Cufflinks pipeline was performed, using the recently published draft genome sequence of CHO-K1 as reference. Additionally, the de novo contigs were mapped to the reference genome using GMAP and merged with the Cufflinks assembly using the cuffmerge software. With this approach 28,874 transcripts located on 16,492 gene loci could be assembled. 
Combining the results of both approaches, 65,561 transcripts were identified for CHO cell lines, which could be clustered by sequence identity into 17,598 gene clusters. PMID:24427317

Rupp, Oliver; Becker, Jennifer; Brinkrolf, Karina; Timmermann, Christina; Borth, Nicole; Pühler, Alfred; Noll, Thomas; Goesmann, Alexander

2014-01-01

110

PepBank - a database of peptides based on sequence text mining and public peptide data sources  

Microsoft Academic Search

Background: Peptides are important molecules with diverse biological functions and biomedical uses. To date, there does not exist a single, searchable archive for peptide sequences or associated biological data. Rather, peptide sequences still have to be mined from abstracts and full-length articles, and/or obtained from the fragmented public sources. Description: We have constructed a new database (PepBank), which at the

Timur Shtatland; Daniel Guettler; Misha Kossodo; Misha Pivovarov; Ralph Weissleder

2007-01-01

111

Public-key system using DNA as a one-way function for key distribution.  

PubMed

A novel public-key system using DNA has been developed. To solve the key distribution problem, a public-key cryptography system based on a one-way function was developed. Message-encoded DNA hidden among dummies can be restored by PCR amplification followed by sequencing. We used these operations as a one-way function and constructed a novel method for key distribution based on a public-key system using DNA. We show how a key can be held in common between just two specific persons. PMID:15917125

Tanaka, Kazuo; Okamoto, Akimitsu; Saito, Isao

2005-07-01

112

Distributing the ERIC Database on Compact Disc: A Case History of Private Sector Involvement in the Distribution of Public Sector Data.  

ERIC Educational Resources Information Center

Describes the partnership between the public and private sectors in developing and marketing the ERIC database in CD-ROM format. Particular emphasis is given to the marketing research and protocols of partnership that were developed. (Author/CLB)

Brandhorst, Ted

1987-01-01

113

MGDB: crossing the marker genes of a user microarray with a database of public-microarrays marker genes  

PubMed Central

Summary: The number of microarray experiments performed by scientific teams is growing exponentially. These microarray data could be useful for researchers around the world, but unfortunately they are underused. To fully exploit these data, it is necessary (i) to extract these data from a repository of high-throughput gene expression data such as Gene Expression Omnibus (GEO) and (ii) to make the data from different microarrays comparable with tools that are easy for scientists to use. We have developed these two solutions in our server, implementing a database of microarray marker genes (Marker Genes Data Base). This database contains the marker genes of all GEO microarray datasets and it is updated monthly with the new microarrays from GEO. Thus, researchers can see whether the marker genes of their microarray are marker genes in other microarrays in the database, expanding the analysis of their microarray to the rest of the public microarrays. This solution helps not only to corroborate the conclusions regarding a researcher's microarray but also to identify the phenotype of different subsets of individuals under investigation, to frame the results with microarray experiments from other species, pathologies or tissues, to search for drugs that promote the transition between the studied phenotypes, to detect undesirable side effects of the treatment applied, etc. Thus, the researcher can quickly add relevant information to his/her studies from all of the previous analyses performed in other studies as long as they have been deposited in public repositories. Availability: Marker-gene database tool: http://ibb.uab.es/mgdb Contact: jcedano@unorte.edu.uy

Huerta, Mario; Munyi, Marc; Exposito, David; Querol, Enric; Cedano, Juan

2014-01-01
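
Editorial illustration for the MGDB record above: the "crossing" of a user's marker genes with the marker genes of public datasets can be pictured as a set intersection. The sketch below is not the MGDB server's implementation. The dataset identifiers, gene lists and the `cross_marker_genes` helper are hypothetical, and ranking by the Jaccard index is a choice made here purely for illustration.

```python
def cross_marker_genes(user_markers, public_markers):
    """Rank public datasets by marker-gene overlap with the user's list.

    Returns (dataset_id, shared_genes, jaccard_index) tuples, best match first.
    """
    user = set(user_markers)
    hits = []
    for dataset_id, genes in public_markers.items():
        genes = set(genes)
        shared = user & genes
        if not shared:
            continue  # no marker genes in common, nothing to report
        jaccard = len(shared) / len(user | genes)
        hits.append((dataset_id, sorted(shared), jaccard))
    # Highest overlap first; ties broken by dataset identifier.
    hits.sort(key=lambda hit: (-hit[2], hit[0]))
    return hits


# Hypothetical public marker-gene sets keyed by made-up dataset identifiers.
PUBLIC = {
    "DATASET_A": {"TP53", "MYC", "EGFR"},
    "DATASET_B": {"MYC", "BRCA1"},
    "DATASET_C": {"ACTB"},
}
RESULT = cross_marker_genes(["MYC", "TP53", "KRAS"], PUBLIC)
for dataset_id, shared, score in RESULT:
    print(dataset_id, shared, round(score, 2))
```

A real service would also need to reconcile gene identifiers across platforms and species before any comparison, which this sketch leaves out.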

114

Familial searching: a specialist forensic DNA profiling service utilising the National DNA Database to identify unknown offenders via their relatives--the UK experience.  

PubMed

The National DNA Database (NDNAD) of England and Wales was established on April 10th 1995. The NDNAD is governed by a variety of legislative instruments that mean that DNA samples can be taken if an individual is arrested and detained in a police station. The biological samples and the DNA profiles derived from them can be used for purposes related to the prevention and detection of crime, the investigation of an offence and for the conduct of a prosecution. Following the South East Asian Tsunami of December 2004, the legislation was amended to allow the use of the NDNAD to assist in the identification of a deceased person or of a body part where death has occurred from natural causes or from a natural disaster. The UK NDNAD now contains the DNA profiles of approximately 6 million individuals representing 9.6% of the UK population. As the science of DNA profiling advanced, the National DNA Database provided a potential resource for increased intelligence beyond the direct matching for which it was originally created. The familial searching service offered to the police by several UK forensic science providers exploits the size and geographic coverage of the NDNAD and the fact that close relatives of an offender may share a significant proportion of that offender's DNA profile and will often reside in close geographic proximity to him or her. Between 2002 and 2011 Forensic Science Service Ltd. (FSS) provided familial search services to support 188 police investigations, 70 of which are still active cases. This technique, which may be used in serious crime cases or in 'cold case' reviews when there are few or no investigative leads, has led to the identification of 41 perpetrators or suspects. In this paper we discuss the processes, utility, and governance of the familial search service in which the NDNAD is searched for close genetic relatives of an offender who has left DNA evidence at a crime scene, but whose DNA profile is not represented within the NDNAD. 
We discuss the scientific basis of the familial search approach, other DNA-based methods for eliminating individuals from the candidate lists generated by these NDNAD searches, the value of filtering these lists by age, ethnic appearance and geography and the governance required by the NDNAD Strategy Board when a police force commissions a familial search. We present the FSS data in relation to the utility of the familial searching service and demonstrate the power of the technique by reference to casework examples. We comment on the uptake of familial searching of DNA databases in the USA, the Netherlands, Australia, and New Zealand. Finally, following the adverse ruling by the European Court of Human Rights against the UK in regard to the S & Marper cases and the consequent introduction of the Protection of Freedoms Act (2012), we discuss the impact that changes to regulations concerning the storage of DNA samples will have on the continuing provision of familial searching of the National DNA Database in England and Wales. PMID:24315582

Maguire, C N; McCallum, L A; Storey, C; Whitaker, J P

2014-01-01

115

A Human Mitochondrial Genome Database  

NSDL National Science Digital Library

The Center for Molecular Medicine at Emory University maintains this human mitochondrial genome database, which offers information on Mitochondrial DNA Function Locations and Polypeptide Assignments as well as the relevant publication references. The database is initially searchable by gene, disease, and enzyme. Users can then refine their search by function, polymorphisms, or references (author, title, journal, year, or keyword). Users can also search the references directly via an Advanced Search. An additional resource at the site is a reference guide to mitomap tables featuring searchable (by keyword) information on specific mitochondrial DNA function locations and references. An opportunity to add publications to this database is available, if users find that pertinent papers have not been cited.

Brown, M. D.; Huoponen, K.; Katdare, G. C.; Kogelnik, A. M.; Lott, M. T.; Navathe, S. B.

1996-01-01

116

SIDDBASE: a database containing the stress-induced DNA duplex destabilization (SIDD) profiles of complete microbial genomes  

PubMed Central

Prokaryotic genomic DNA is generally negatively supercoiled in vivo. Many regulatory processes, including the initiation of transcription, are known to depend on the superhelical state of the DNA substrate. The stresses induced within DNA by negative superhelicity can destabilize the DNA duplex at specific sites. Various experiments have either shown or suggested that stress-induced DNA duplex destabilization (SIDD) is involved in specific regulatory mechanisms governing a variety of biological processes. We have developed methods to evaluate the SIDD properties of DNA sequences, including complete chromosomes. This analysis predicts the locations where the duplex becomes destabilized under superhelical stress. Previous studies have shown that the SIDD-susceptible sites predicted in this way occur at rates much higher than expected at random in transcriptional regulatory regions, and much lower than expected in coding regions. Analysis of the SIDD profiles of 42 bacterial genomes chosen for their diversity confirms this pattern. Predictions of SIDD sites have been used to identify potential genomic regulatory regions, and suggest both possible regulatory mechanisms involving stress-induced destabilization and experimental tests of these mechanisms. Here we describe the SIDDBASE database which enables users to retrieve and visualize the results of SIDD analyses of completely sequenced prokaryotic and archaeal genomes, together with their annotations. SIDDBASE is available at .

Wang, Huiquan; Kaloper, Miroslava; Benham, Craig J.

2006-01-01

117

DNA banking and DNA databanking: Legal, ethical, and public policy issues.  

National Technical Information Service (NTIS)

The purpose of this research was to provide support to enable the authors to: (1) perform legal and empirical research and critically analyze DNA banking and DNA databanking as those activities are conducted by state forensic laboratories, the military, a...

P. R. Reilly J. E. McEwen J. D. Lawyer D. Small

1997-01-01

118

A framework for organizing cancer-related variations from existing databases, publications and NGS data using a High-performance Integrated Virtual Environment (HIVE).  

PubMed

Years of sequence feature curation by UniProtKB/Swiss-Prot, PIR-PSD, NCBI-CDD, RefSeq and other database biocurators has led to a rich repository of information on functional sites of genes and proteins. This information along with variation-related annotation can be used to scan human short sequence reads from next-generation sequencing (NGS) pipelines for presence of non-synonymous single-nucleotide variations (nsSNVs) that affect functional sites. This and similar workflows are becoming more important because thousands of NGS data sets are being made available through projects such as The Cancer Genome Atlas (TCGA), and researchers want to evaluate their biomarkers in genomic data. BioMuta, an integrated sequence feature database, provides a framework for automated and manual curation and integration of cancer-related sequence features so that they can be used in NGS analysis pipelines. Sequence feature information in BioMuta is collected from the Catalogue of Somatic Mutations in Cancer (COSMIC), ClinVar, UniProtKB and through biocuration of information available from publications. Additionally, nsSNVs identified through automated analysis of NGS data from TCGA are also included in the database. Because of the petabytes of data and information present in NGS primary repositories, a platform HIVE (High-performance Integrated Virtual Environment) for storing, analyzing, computing and curating NGS data and associated metadata has been developed. Using HIVE, 31 979 nsSNVs were identified in TCGA-derived NGS data from breast cancer patients. All variations identified through this process are stored in a Curated Short Read archive, and the nsSNVs from the tumor samples are included in BioMuta. Currently, BioMuta has 26 cancer types with 13 896 small-scale and 308 986 large-scale study-derived variations. Integration of variation data allows identifications of novel or common nsSNVs that can be prioritized in validation studies. 
Database URL: BioMuta: http://hive.biochemistry.gwu.edu/tools/biomuta/index.php; CSR: http://hive.biochemistry.gwu.edu/dna.cgi?cmd=csr; HIVE: http://hive.biochemistry.gwu.edu. PMID:24667251

Wu, Tsung-Jung; Shamsaddini, Amirhossein; Pan, Yang; Smith, Krista; Crichton, Daniel J; Simonyan, Vahan; Mazumder, Raja

2014-01-01
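
Editorial illustration for the BioMuta/HIVE record above: a single-nucleotide variation is non-synonymous (an nsSNV) when it changes the encoded amino acid. The sketch below shows that test with the standard genetic code. It is not the HIVE pipeline; the `classify_snv` helper and the toy coding sequence are hypothetical, and the position is assumed to be a 0-based offset into an in-frame coding sequence.

```python
from itertools import product

# Standard genetic code, codons enumerated in TCAG order (third base fastest).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snv(cds, pos, alt):
    """Classify a substitution at 0-based position `pos` of coding sequence `cds`.

    Returns (label, reference_amino_acid, alternate_amino_acid).
    """
    cds, alt = cds.upper(), alt.upper()
    if cds[pos] == alt:
        raise ValueError("alternate base equals reference base")
    start = pos - pos % 3            # first base of the affected codon
    offset = pos % 3                 # position of the change inside the codon
    ref_codon = cds[start:start + 3]
    alt_codon = ref_codon[:offset] + alt + ref_codon[offset + 1:]
    ref_aa, alt_aa = CODON_TABLE[ref_codon], CODON_TABLE[alt_codon]
    label = "synonymous" if ref_aa == alt_aa else "non-synonymous"
    return label, ref_aa, alt_aa


# Toy coding sequence: ATG GCT TAA (Met, Ala, stop).
print(classify_snv("ATGGCTTAA", 5, "C"))  # GCT -> GCC, third codon position
print(classify_snv("ATGGCTTAA", 3, "C"))  # GCT -> CCT, first codon position
```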

119

Expanding the functional human mitochondrial DNA database by the establishment of primate xenomitochondrial cybrids  

PubMed Central

The nuclear and mitochondrial genomes coevolve to optimize approximately 100 different interactions necessary for an efficient ATP-generating system. This coevolution led to a species-specific compatibility between these genomes. We introduced mitochondrial DNA (mtDNA) from different primates into mtDNA-less human cells and selected for growth of cells with a functional oxidative phosphorylation system. mtDNA from common chimpanzee, pigmy chimpanzee, and gorilla were able to restore oxidative phosphorylation in the context of a human nuclear background, whereas mtDNA from orangutan, and species representative of Old-World monkeys, New-World monkeys, and lemurs were not. Oxygen consumption, a sensitive index of respiratory function, showed that mtDNA from chimpanzee, pigmy chimpanzee, and gorilla replaced the human mtDNA and restored respiration to essentially normal levels. Mitochondrial protein synthesis was also unaltered in successful “xenomitochondrial cybrids.” The abrupt failure of mtDNA from primate species that diverged from humans as recently as 8–18 million years ago to functionally replace human mtDNA suggests the presence of one or a few mutations affecting critical nuclear–mitochondrial genome interactions between these species. These cellular systems provide a demonstration of intergenus mtDNA transfer, expand more than 20-fold the number of mtDNA polymorphisms that can be analyzed in a human nuclear background, and provide a novel model for the study of nuclear–mitochondrial interactions.

Kenyon, Lesley; Moraes, Carlos T.

1997-01-01

120

BrassicaTED - a public database for utilization of miniature transposable elements in Brassica species  

PubMed Central

Background: MITE, TRIM and SINEs are miniature form transposable elements (mTEs) that are ubiquitous and dispersed throughout entire plant genomes. Tens of thousands of members cause insertion polymorphism at both the inter- and intra-species level. Therefore, mTEs are valuable targets and resources for development of markers that can be utilized for breeding, genetic diversity and genome evolution studies. Taking advantage of the completely sequenced genomes of Brassica rapa and B. oleracea, characterization of mTEs and building a curated database are prerequisite to extending their utilization for genomics and applied fields in Brassica crops. Findings: We have developed BrassicaTED as a unique web portal containing detailed characterization information for mTEs of Brassica species. At present, BrassicaTED has datasets for 41 mTE families, including 5894 and 6026 members from 20 MITE families, 1393 and 1639 members from 5 TRIM families, 1270 and 2364 members from 16 SINE families in B. rapa and B. oleracea, respectively. BrassicaTED offers different sections to browse structural and positional characteristics for every mTE family. In addition, we have added data on 289 MITE insertion polymorphisms from a survey of seven Brassica relatives. Genes with internal mTE insertions are shown with detailed gene annotation and microarray-based comparative gene expression data in comparison with their paralogs in the triplicated B. rapa genome. This database also includes a novel tool, K BLAST (Karyotype BLAST), for clear visualization of the locations for each member in the B. rapa and B. oleracea pseudo-genome sequences. Conclusions: BrassicaTED is a newly developed database of information regarding the characteristics and potential utility of mTEs including MITE, TRIM and SINEs in B. rapa and B. oleracea. The database will promote the development of desirable mTE-based markers, which can be utilized for genomics and breeding in Brassica species.
BrassicaTED will be a valuable repository for scientists and breeders, promoting efficient research on Brassica species. BrassicaTED can be accessed at http://im-crop.snu.ac.kr/BrassicaTED/index.php.

2014-01-01

121

Evaluation and utilization as a public health tool of a national molecular epidemiological tuberculosis outbreak database within the United Kingdom from 1997 to 2001.  

PubMed

The aim of this study was to develop a national model and analyze the value of a molecular epidemiological Mycobacterium tuberculosis DNA fingerprint-outbreak database. Incidents were investigated by the United Kingdom PHLS Mycobacterium Reference Unit (MRU) from June 1997 to December 2001, inclusive. A total of 124 incidents involving 972 tuberculosis cases, including 520 patient cultures from referred incidents and 452 patient cultures related to two population studies, were examined by using restriction fragment length polymorphism IS6110 fingerprinting and rapid epidemiological typing. Investigations were divided into the following three categories, reflecting different operational strategies: retrospective passive analysis, retrospective active analysis, and retrospective prospective analysis. The majority of incidents were in the retrospective passive analysis category, i.e., the individual submitting isolates has a suspicion they may be linked. Outbreaks were examined in schools, hospitals, farms, prisons, and public houses, and laboratory cross-contamination events and unusual clinical presentations were investigated. Retrospective active analysis involved a major outbreak centered on a high school. Contact tracing of a teenager with smear-positive pulmonary tuberculosis matched 14 individuals, including members of his class, and another 60 cases were identified in schools clinically and radiologically and by skin testing. Retrospective prospective analysis involved an outbreak of 94 isoniazid-resistant tuberculosis cases in London, United Kingdom, that began after cases were identified at one hospital in January 2000. Contact tracing and comparison with MRU databases indicated that the earliest matched case had occurred in 1995. Subsequently, the MRU changed to an active prospective analysis targeting linked isoniazid-monoresistant isolates for follow up. 
The patients were multiethnic, born mainly in the United Kingdom, and included professionals, individuals from the music industry, intravenous drug abusers, and prisoners. PMID:12734218

Drobniewski, F A; Gibson, A; Ruddy, M; Yates, M D

2003-05-01

122

Searching for first-degree familial relationships in California's offender DNA database: validation of a likelihood ratio-based approach.  

PubMed

A validation study was performed to measure the effectiveness of using a likelihood ratio-based approach to search for possible first-degree familial relationships (full-sibling and parent-child) by comparing an evidence autosomal short tandem repeat (STR) profile to California's ~1,000,000-profile State DNA Index System (SDIS) database. Test searches used autosomal STR and Y-STR profiles generated for 100 artificial test families. When the test sample and the first-degree relative in the database were characterized at the 15 Identifiler® (Applied Biosystems®, Foster City, CA) STR loci, the search procedure included 96% of the fathers and 72% of the full-siblings. When the relative profile was limited to the 13 Combined DNA Index System (CODIS) core loci, the search procedure included 93% of the fathers and 61% of the full-siblings. These results, combined with those of functional tests using three real families, support the effectiveness of this tool. Based upon these results, the validated approach was implemented as a key, pragmatic and demonstrably practical component of the California Department of Justice's Familial Search Program. An investigative lead created through this process recently led to an arrest in the Los Angeles Grim Sleeper serial murders. PMID:21056023

Myers, Steven P; Timken, Mark D; Piucci, Matthew L; Sims, Gary A; Greenwald, Michael A; Weigand, James J; Konzak, Kenneth C; Buoncristiani, Martin R

2011-11-01
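
Editorial illustration for the familial-search record above: a likelihood ratio for a first-degree relationship can be sketched per STR locus from identity-by-descent (IBD) probabilities. This is the textbook formulation, which may differ from the procedure validated in the study. It assumes Hardy-Weinberg proportions, no population substructure and no mutation; the allele frequencies, locus names and function names are hypothetical.

```python
from math import prod

# Probabilities of sharing 0, 1 or 2 alleles identical by descent.
RELATIONSHIPS = {
    "parent-child": (0.0, 1.0, 0.0),
    "full-sibling": (0.25, 0.5, 0.25),
}


def genotype_prob(genotype, freq):
    """Population probability of a genotype under Hardy-Weinberg proportions."""
    a, b = genotype
    return freq[a] ** 2 if a == b else 2 * freq[a] * freq[b]


def prob_given_one_ibd(g_db, g_ev, freq):
    """P(database genotype | evidence genotype, exactly one allele IBD)."""
    x, y = g_db
    total = 0.0
    for shared in g_ev:  # each evidence allele is the shared one with prob. 1/2
        if x == y:
            other = freq[x] if shared == x else 0.0
        elif shared == x:
            other = freq[y]
        elif shared == y:
            other = freq[x]
        else:
            other = 0.0
        total += 0.5 * other
    return total


def locus_lr(g_ev, g_db, freq, relationship):
    """LR of 'related as stated' versus 'unrelated' at a single locus."""
    k0, k1, k2 = RELATIONSHIPS[relationship]
    p_unrelated = genotype_prob(g_db, freq)
    p_ibd1 = prob_given_one_ibd(g_db, g_ev, freq)
    p_ibd2 = 1.0 if sorted(g_ev) == sorted(g_db) else 0.0
    return k0 + (k1 * p_ibd1 + k2 * p_ibd2) / p_unrelated


def profile_lr(evidence, candidate, freqs, relationship):
    """Multiply per-locus LRs, treating loci as independent."""
    return prod(
        locus_lr(evidence[locus], candidate[locus], freqs[locus], relationship)
        for locus in evidence
    )


# Hypothetical allele frequencies and two-locus profiles.
FREQ = {"12": 0.1, "13": 0.2, "14": 0.7}
FREQS = {"locus_1": FREQ, "locus_2": FREQ}
EVIDENCE = {"locus_1": ("12", "13"), "locus_2": ("12", "12")}
CANDIDATE = {"locus_1": ("12", "14"), "locus_2": ("12", "12")}
print(profile_lr(EVIDENCE, CANDIDATE, FREQS, "parent-child"))
```

In a database search, every stored profile would be scored this way and the highest-ranking candidates passed on for further testing, for example the Y-STR comparison mentioned in the abstract.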

123

Arylamine N-acetyltransferases in prokaryotic and eukaryotic genomes: a survey of public databases.  

PubMed

Arylamine N-acetyltransferases (NATs) are xenobiotic metabolizing enzymes found in prokaryotes and eukaryotes. NATs have been characterized in bacteria (Bacilli, Mycobacteria, Salmonella etc.), laboratory animals (chicken, rabbit, rodents etc.) and humans, where the NAT loci occupy 230 kilobases on chromosome 8p22. Our previous comprehensive search for NAT genes involved 416 genomes (340 prokaryotic, 76 eukaryotic) and identified NAT homologues in several taxa, while also reporting on taxa that appeared to lack NAT genes [Boukouvala, S. and Fakis, G. (2005) Drug Metab. Rev. 37(3), 511-564]. Here, we present an update of this genomic search, covering 2138 genomes (1674 prokaryotic, 464 eukaryotic), of which 1167 (986 prokaryotic, 181 eukaryotic) were accessible using the advanced search algorithm tBLASTn. We have reconstructed the full-length open reading frames for putative proteins with sequence homology and features characteristic of NAT from 274 bacterial genomes (31 actinobacteria, 6 bacteroidetes/chlorobi, 2 cyanobacteria, 65 firmicutes and 170 proteobacteria) and 27 animals (1 sea-urchin, 5 fishes, 1 lizard, 1 bird and 19 mammals). Partial NAT sequences were recovered from several other organisms, including fungi, where NAT genes were found in 30 ascomycetes and 2 basidiomycetes. No NATs were found in archaea, plants and lower invertebrates (insects and worms), while it is also uncertain whether NAT genes exist in protista. We present comparative genomic and phylogenetic analyses of the identified NAT homologues and announce a new database that will maintain information on non-human NATs and will provide recommendations for a standardized nomenclature, along the lines of the NAT Gene Nomenclature Committee. PMID:18781915

Vagena, Eirini; Fakis, Giannoulis; Boukouvala, Sotiria

2008-09-01

124

Use of DNA profiles for investigation using a simulated national DNA database: Part II. Statistical and ethical considerations on familial searching.  

PubMed

Familial searching consists of searching for a full profile left at a crime scene in a National DNA Database (NDNAD). In this paper we are interested in the circumstance where no full match is returned, but a partial match is found between a database member's profile and the crime stain. Because close relatives share more of their DNA than unrelated persons, this partial match may indicate that the crime stain was left by a close relative of the person with whom the partial match was found. This approach has successfully solved important crimes in the UK and the USA. In a previous paper, a model, which takes into account substructure and siblings, was used to simulate a NDNAD. In this paper, we have used this model to test the usefulness of familial searching and offer guidelines for pre-assessment of the cases based on the likelihood ratio. Siblings of "persons" present in the simulated Swiss NDNAD were created. These profiles (N=10,000) were used as traces and were then compared to the whole database (N=100,000). The statistical results obtained show that the technique has great potential confirming the findings of previous studies. However, effectiveness of the technique is only one part of the story. Familial searching has juridical and ethical aspects that should not be ignored. In Switzerland for example, there are no specific guidelines to the legality or otherwise of familial searching. This article both presents statistical results, and addresses criminological and civil liberties aspects to take into account risks and benefits of familial searching. PMID:20457057

Hicks, T; Taroni, F; Curran, J; Buckleton, J; Castella, V; Ribaux, O

2010-10-01

125

Contamination of cDNA libraries and expressed sequence-tags databases  

SciTech Connect

Partially sequenced cDNAs, or expressed sequence tags (ESTs), are claimed to represent an efficient strategy for characterizing an organism's genes. By necessity, these sequences are incompletely characterized, and examples of contamination of cDNA libraries with sequences from other species have been described. It has been suggested that a Human T-cell cDNA library (Clontech HL1963g) is contaminated by sequences from yeast (Saccharomyces cerevisiae) and an unknown bacterium. We are characterizing human ESTs that represent new members of the ATP-binding cassette transporter super-family. In examining human ESTs generated from the T-cell library, we have encountered one gene that was in fact a yeast sequence (Genbank Z15214 = SSH2 locus) and several genes that do not hybridize to human DNA or RNA. PCR primers from these sequences failed to amplify a product from human, yeast, or Escherichia coli DNA but did produce a product from a Clontech kidney cDNA library (HL1123a). To determine the source of the contamination, we amplified a conserved segment of the 16S rDNA (following a suggestion from Dr. C. Savakis) from the kidney library. The sequence of this product was nearly identical to that of the bacterium Leuconostoc lactis (300 of 304 bp). Leuconostoc species are commonly found in dairy products, fruits, vegetables, and wine and are nonpathogenic to humans. 6 refs., 1 fig.

Dean, M.; Allikmets, R. [Frederick Cancer Research and Development Center, MD (United States)]

1995-11-01

126

Does language matter? A case study of epidemiological and public health journals, databases and professional education in French, German and Italian.  

PubMed

Epidemiology and public health are usually context-specific. Journals published in different languages and countries play a role both as sources of data and as channels through which evidence is incorporated into local public health practice. Databases in these languages facilitate access to relevant journals, and professional education in these languages facilitates the growth of native expertise in epidemiology and public health. However, as English has become the lingua franca of scientific communication in the era of globalisation, many journals published in non-English languages face the difficult dilemma of either switching to English and competing internationally, or sticking to the native tongue and having a restricted circulation among a local readership. This paper discusses the historical development of epidemiology and the current scene of epidemiological and public health journals, databases and professional education in three Western European languages: French, German and Italian, and examines the dynamics and struggles they have today. PMID:18826570

Baussano, Iacopo; Brzoska, Patrick; Fedeli, Ugo; Larouche, Claudia; Razum, Oliver; Fung, Isaac C-H

2008-01-01

127

Does language matter? A case study of epidemiological and public health journals, databases and professional education in French, German and Italian  

PubMed Central

Epidemiology and public health are usually context-specific. Journals published in different languages and countries play a role both as sources of data and as channels through which evidence is incorporated into local public health practice. Databases in these languages facilitate access to relevant journals, and professional education in these languages facilitates the growth of native expertise in epidemiology and public health. However, as English has become the lingua franca of scientific communication in the era of globalisation, many journals published in non-English languages face the difficult dilemma of either switching to English and competing internationally, or sticking to the native tongue and having a restricted circulation among a local readership. This paper discusses the historical development of epidemiology and the current scene of epidemiological and public health journals, databases and professional education in three Western European languages: French, German and Italian, and examines the dynamics and struggles they have today.

Baussano, Iacopo; Brzoska, Patrick; Fedeli, Ugo; Larouche, Claudia; Razum, Oliver; Fung, Isaac C-H

2008-01-01

128

De-identifying a public use microdata file from the Canadian national discharge abstract database  

PubMed Central

Background: The Canadian Institute for Health Information (CIHI) collects hospital discharge abstract data (DAD) from Canadian provinces and territories. There are many demands for the disclosure of this data for research and analysis to inform policy making. To expedite the disclosure of data for some of these purposes, the construction of a DAD public use microdata file (PUMF) was considered. Such purposes include: confirming some published results, providing broader feedback to CIHI to improve data quality, training students and fellows, providing an easily accessible data set for researchers to prepare for analyses on the full DAD data set, and serving as a large health data set for computer scientists and statisticians to evaluate analysis and data mining techniques. The objective of this study was to measure the probability of re-identification for records in a PUMF, and to de-identify a national DAD PUMF consisting of 10% of records. Methods: Plausible attacks on a PUMF were evaluated. Based on these attacks, the 2008-2009 national DAD was de-identified. A new algorithm was developed to minimize the amount of suppression while maximizing the precision of the data. The acceptable threshold for the probability of correct re-identification of a record was set at between 0.04 and 0.05. Information loss was measured in terms of the extent of suppression and entropy. Results: Two different PUMF files were produced, one with geographic information, and one with no geographic information but more clinical information. At a threshold of 0.05, the maximum proportion of records with the diagnosis code suppressed was 20%, but these suppressions represented only 8-9% of all values in the DAD. Our suppression algorithm has less information loss than a more traditional approach to suppression. Smaller regions, patients with longer stays, and age groups that are infrequently admitted to hospitals tend to be the ones with the highest rates of suppression.
Conclusions: The strategies we used to maximize data utility and minimize information loss can result in a PUMF that would be useful for the specific purposes noted earlier. However, to create a more detailed file with less information loss suitable for more complex health services research, the risk would need to be mitigated by requiring the data recipient to commit to a data sharing agreement.

2011-01-01
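
Editorial illustration for the de-identification record above: a common way to express re-identification risk is one divided by the number of records that share the same combination of quasi-identifiers. The sketch below applies that measure with the 0.05 threshold quoted in the abstract. Whether the study used exactly this measure is an assumption, and the field names, records and helper functions are hypothetical; the study's own suppression algorithm is not reproduced here.

```python
from collections import Counter


def reidentification_risk(records, quasi_identifiers):
    """Per-record risk: 1 / size of the record's equivalence class."""
    keys = [tuple(record[q] for q in quasi_identifiers) for record in records]
    class_sizes = Counter(keys)
    return [1.0 / class_sizes[key] for key in keys]


def flag_for_suppression(records, quasi_identifiers, threshold=0.05):
    """Indices of records whose risk exceeds the acceptable threshold."""
    risks = reidentification_risk(records, quasi_identifiers)
    return [i for i, risk in enumerate(risks) if risk > threshold]


# Hypothetical discharge records: 25 share one profile, 2 share a rare one.
RECORDS = (
    [{"age_group": "40-49", "region": "R1"} for _ in range(25)]
    + [{"age_group": "90+", "region": "R2"} for _ in range(2)]
)
FLAGGED = flag_for_suppression(RECORDS, ["age_group", "region"])
print(FLAGGED)
```

With a threshold of 0.05, any combination of quasi-identifiers shared by fewer than 20 records is flagged. This is consistent with the abstract's observation that smaller regions and rarely admitted age groups attract the most suppression.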

129

Mass spectrometric base composition profiling: Implications for forensic mtDNA databasing.  

PubMed

In forensic genetics mitochondrial DNA (mtDNA) is usually analyzed by direct Sanger-type sequencing (STS). This method is known to be laborious and sometimes prone to human error. Alternative methods have been proposed that lead to faster results. Among these are methods that involve mass-spectrometry resulting in base composition profiles that are, by definition, less informative than the full nucleotide sequence. Here, we applied a highly automated electrospray ionization mass spectrometry (ESI-MS) system (PLEX-ID) to an mtDNA population study to compare its performance with respect to throughput and concordance to STS. We found that the loss of information power was relatively low compared to the gain in speed and analytical standardization. The detection of point and length heteroplasmy turned out to be roughly comparable between the technologies with some individual differences related to the processes. We confirm that ESI-MS provides a valuable platform for analyzing mtDNA variation that can also be applied in the forensic context. PMID:24054029

Eduardoff, Mayra; Huber, Gabriela; Bayer, Birgit; Schmid, Dagmar; Anslinger, Katja; Göbel, Tanja; Zimmermann, Bettina; Schneider, Peter M; Röck, Alexander W; Parson, Walther

2013-12-01
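
Editorial illustration for the base composition record above: a base composition profile records only how many of each nucleotide an amplicon contains, so distinct sequences can yield the same profile. The short sketch below makes that loss of information concrete. The sequences are invented and the `base_composition` helper is hypothetical, not part of the PLEX-ID system.

```python
from collections import Counter


def base_composition(sequence):
    """Return the (A, C, G, T) counts of a nucleotide sequence."""
    counts = Counter(sequence.upper())
    return tuple(counts.get(base, 0) for base in "ACGT")


# Two different invented sequences with the same composition.
SEQ_1 = "ACGTTA"
SEQ_2 = "TAGCAT"
# A single substitution relative to SEQ_1 (T -> C at the fifth base).
SEQ_3 = "ACGTCA"

print(base_composition(SEQ_1), base_composition(SEQ_2), base_composition(SEQ_3))
```

A substitution always changes the profile, whereas rearranged or compensating differences within one amplicon can go undetected, which is why the profile is less informative than the full sequence.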

130

A brief history of the formation of DNA databases in forensic science within Europe  

Microsoft Academic Search

The introduction of DNA analysis to forensic science brought with it a number of choices for analysis, not all of which were compatible. As laboratories throughout Europe were eager to use the new technology different systems became routine in different laboratories and consequently, there was no basis for the exchange of results. A period of co-operation then started in which

Peter D. Martin; Hermann Schmitter; Peter M. Schneider

2001-01-01

131

Mass spectrometric base composition profiling: Implications for forensic mtDNA databasing  

PubMed Central

In forensic genetics mitochondrial DNA (mtDNA) is usually analyzed by direct Sanger-type sequencing (STS). This method is known to be laborious and sometimes prone to human error. Alternative methods have been proposed that lead to faster results. Among these are methods that involve mass-spectrometry resulting in base composition profiles that are, by definition, less informative than the full nucleotide sequence. Here, we applied a highly automated electrospray ionization mass spectrometry (ESI-MS) system (PLEX-ID) to an mtDNA population study to compare its performance with respect to throughput and concordance to STS. We found that the loss of information power was relatively low compared to the gain in speed and analytical standardization. The detection of point and length heteroplasmy turned out to be roughly comparable between the technologies with some individual differences related to the processes. We confirm that ESI-MS provides a valuable platform for analyzing mtDNA variation that can also be applied in the forensic context.

Eduardoff, Mayra; Huber, Gabriela; Bayer, Birgit; Schmid, Dagmar; Anslinger, Katja; Göbel, Tanja; Zimmermann, Bettina; Schneider, Peter M.; Röck, Alexander W.; Parson, Walther

2013-01-01
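
The abstract's remark that base composition profiles are, by definition, less informative than the full nucleotide sequence can be illustrated with a minimal Python sketch. The two fragments are hypothetical examples, not data from the study, and `base_composition` is an illustrative helper rather than part of any ESI-MS software.

```python
from collections import Counter

def base_composition(seq):
    """Return counts of A, C, G and T in a DNA sequence (case-insensitive)."""
    counts = Counter(seq.upper())
    return {base: counts.get(base, 0) for base in "ACGT"}

# Two hypothetical fragments that differ in sequence but not in composition.
frag_a = "ACGTTGCA"
frag_b = "TGCAACGT"

# The profiles are identical, so composition alone cannot tell them apart.
same_profile = base_composition(frag_a) == base_composition(frag_b)
```

Sequencing would distinguish `frag_a` from `frag_b`; a composition profile would not, which is the information loss the abstract weighs against the gain in speed.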

132

Discovery of Possible Gene Relationships through the Application of Self-Organizing Maps to DNA Microarray Databases  

PubMed Central

DNA microarrays and cell cycle synchronization experiments have made possible the study of the mechanisms of cell cycle regulation of Saccharomyces cerevisiae by simultaneously monitoring the expression levels of thousands of genes at specific time points. On the other hand, pattern recognition techniques can contribute to the analysis of such massive measurements, providing a model of gene expression level evolution through the cell cycle process. In this paper, we propose the use of one such technique, an unsupervised artificial neural network called a Self-Organizing Map (SOM), which has been successfully applied to processes involving very noisy signals, classifying and organizing them, and assisting in the discovery of behavior patterns without requiring prior knowledge about the process under analysis. As a test bed for the use of SOMs in finding possible relationships among genes and their possible contribution in some biological processes, we selected 282 S. cerevisiae genes that have been shown through biological experiments to have an activity during the cell cycle. The expression level of these genes was analyzed in five of the most cited time series DNA microarray databases used in the study of the cell cycle of this organism. With the use of SOM, it was possible to find clusters of genes with similar behavior in the five databases along two cell cycles. This result suggested that some of these genes might be biologically related or might have a regulatory relationship, as was corroborated by comparing some of the clusters obtained with SOMs against a previously reported regulatory network that was generated using biological knowledge, such as protein-protein interactions, gene expression levels, metabolism dynamics, promoter binding, and modification, regulation and transport of proteins. The methodology described in this paper could be applied to the study of gene relationships of other biological processes in different organisms.

Chavez-Alvarez, Rocio; Chavoya, Arturo; Mendez-Vazquez, Andres

2014-01-01
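
The self-organizing map named in the entry above can be sketched in a few lines of Python. This is a generic one-dimensional SOM under assumed settings (linear decay of learning rate and neighbourhood radius, Gaussian neighbourhood, initialisation from input profiles); the abstract does not give the authors' map size or training schedule, and the function names are illustrative.

```python
import math
import random

def best_matching_unit(weights, x):
    """Index of the node whose weight vector is closest to x (squared Euclidean)."""
    return min(range(len(weights)),
               key=lambda j: sum((wk - xk) ** 2 for wk, xk in zip(weights[j], x)))

def train_som(profiles, n_nodes=4, epochs=200, lr0=0.5, seed=0):
    """Train a 1-D self-organizing map on equal-length expression profiles."""
    rng = random.Random(seed)
    dim = len(profiles[0])
    # Initialise node weights as copies of randomly chosen input profiles.
    weights = [list(rng.choice(profiles)) for _ in range(n_nodes)]
    radius0 = n_nodes / 2.0
    for epoch in range(epochs):
        frac = epoch / epochs
        lr = lr0 * (1.0 - frac)                     # learning rate decays linearly
        radius = max(radius0 * (1.0 - frac), 0.5)   # neighbourhood shrinks, floor 0.5
        for x in rng.sample(profiles, len(profiles)):  # shuffled pass over the data
            bmu = best_matching_unit(weights, x)
            for j, w in enumerate(weights):
                # Gaussian neighbourhood on the 1-D grid of nodes.
                h = math.exp(-((j - bmu) ** 2) / (2.0 * radius ** 2))
                for k in range(dim):
                    w[k] += lr * h * (x[k] - w[k])
    return weights
```

After training, genes whose profiles share a best-matching unit form one cluster: `best_matching_unit(weights, profile)` gives the cluster index for each gene.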

133

ENVIRONMENTAL RESIDUE EFFECTS DATABASE (ERED)  

EPA Science Inventory

US Army Corps of Engineers public web site for the "Environmental Residue Effects Database", a searchable database of adverse biological effects associated with tissue concentrations of various contaminants....

134

Eukaryotic genome size databases  

Microsoft Academic Search

Three independent databases of eukaryotic genome size information have been launched or re-released in updated form since 2005: the Plant DNA C-values Database (www.kew.org/genomesize/homepage.html), the Animal Genome Size Database (www.genomesize.com) and the Fungal Genome Size Database (www.zbi.ee/fungal-genomesize/). In total, these databases provide freely accessible genome size data for >10 000 species of eukaryotes assembled from more than

T. Ryan Gregory; James A. Nicol; Heidi Tamm; Bellis Kullman; Kaur Kullman; Ilia J. Leitch; Brian G. Murray; Donald F. Kapraun; Johann Greilhuber; Michael D. Bennett

2007-01-01

135

Drinking Water Treatability Database (Database)  

EPA Science Inventory

The Drinking Water Treatability Database (TDB) will provide data taken from the literature on the control of contaminants in drinking water, and will be housed on an interactive, publicly available USEPA web site. It can be used for identifying effective treatment processes, rec...

136

Production of Arrayed and Rearrayed cDNA Libraries for Public Use  

SciTech Connect

Researchers studying genes and their protein products need an easily available source for that gene. The I.M.A.G.E. Consortium at Lawrence Livermore National Laboratory is an important source of such genes in the form of arrayed cDNA libraries. The arrayed clones and associated data are available to the public, free of restriction. Libraries are transformed and titered into 384-well master plates, from which 2-8 copies are made. One copy plate is stored by LLNL while others are sent to sequencing groups, plate distributors, and to the group which contributed the library. Clones found to be unique and/or full-length are rearrayed and also made publicly available. Bioinformatics tools supporting the use of I.M.A.G.E. clones are accessible via the World Wide Web.

Rasmussen, K

2005-08-29

137

Conservation of taxonomic and biological trait diversity of European stream macroinvertebrate communities: a case for a collective public database  

Microsoft Academic Search

The use of databases for the conservation of biodiversity is increasing. During the last decade, such a database has been created for European stream macroinvertebrates. Today, it includes 527 sites that are the least human-impacted representatives of many stream types across many European regions. It includes data on the abundance of 312 invertebrate genera, several environmental site characteristics, collection methods,

Bernhard Statzner; Núria Bonada; Sylvain Dolédec

2007-01-01

138

DNA Microarray for Detection of Macrolide Resistance Genes  

Microsoft Academic Search

A DNA microarray was developed to detect bacterial genes conferring resistance to macrolides and related antibiotics. A database containing 65 nonredundant genes selected from publicly available DNA sequences was constructed and used to design 100 oligonucleotide probes that could specifically detect and discriminate all 65 genes. Probes were spotted on a glass slide, and the array was reacted with DNA

Marco Cassone; Marco M. D'Andrea; Francesco Iannelli; Marco R. Oggioni; Gian Maria Rossolini; Gianni Pozzi

2006-01-01

139

BioModels Database  

NSDL National Science Digital Library

Hosted by the European Bioinformatics Institute, the BioModels Database is a collaborative, "new effort to develop a data resource that will allow biologist to store, search and retrieve published mathematical models of biological interests. The models in the BioModels Database are annotated and linked to relevant data resources, such as publications, databases of compounds and pathways, controlled vocabularies, etc." The website allows visitors to browse and search the Database for models. The site also provides information about submitting models for the Database. It should be noted that submitted models must undergo tests conducted by BioModels Database curators before they are incorporated. [NL

140

The creation of a public database of precision phantoms to facilitate the evaluation and standardization of advanced visualization and quantification software  

NASA Astrophysics Data System (ADS)

In order to validate CT imaging as a biomarker, it is important to ascertain the variability and artifacts associated with various forms of advanced visualization and quantification software. The purpose of the paper is to describe the rationale behind the creation of a free, public resource that contains phantom datasets for CT designed to facilitate testing, development and standardization of advanced visualization and quantification software. For our research, three phantoms were scanned at multiple kVp and mAs settings utilizing a 64-channel MDCT scanner at a collimation of 0.75 mm. Images were reconstructed at a slice thickness of 0.75 mm and archived in DICOM format. The phantoms consisted of precision spheres, balls of different materials and sizes, and slabs of Last-A-Foam® at varying densities. The database of scans is stored in an archive utilizing software developed for the National Cancer Imaging Archive and is publicly available. The scans were completed successfully and the datasets are available for free and unrestricted download. The CT images can be accessed in DICOM format via HTTP or FTP or utilizing caGRID. A DICOM database of phantom data was successfully created and made available to the public. We anticipate that this database will be useful as a reference for physicists for quality control purposes, for developers of advanced visualization and quantification software, and for others who need to test the performance of their systems against a known "gold" standard. We plan to add more phantom images in the future and expand to other imaging modalities.

Chen, Joseph J.; Saenz, Naomi J.; Siegel, Eliot L.

2009-02-01

141

The ESA microgravity database.  

PubMed

To facilitate and optimise access to the results of previous microgravity experiments, ESA is establishing a Microgravity Database (MGDB). The database will give scientists access to descriptions of all experiments related to microgravity research carried out on ESA and NASA missions since the 1960s. MGDB incorporates both a local database available for PCs and Apple Macintoshes, and a remote database which can be accessed via the ESRIN Data Dissemination Network (DDN) and public X.25 networks. PMID:11541525

Jeanne, P; Scholz, J

1993-11-01

142

The Institute of Public Administration's Document Center: From Paper to Electronic Records--A Full Image Government Documents Database.  

ERIC Educational Resources Information Center

Since its establishment in 1960, the Institute of Public Administration (IPA) in Riyadh, Saudi Arabia has had responsibility for documenting Saudi administrative literature, the official publications of Saudi Arabia, and the literature of regional and international organizations through establishment of the Document Center in 1961. This paper…

Al-Zahrani, Rashed S.

143

Psychiatric inpatient expenditures and public health insurance programmes: analysis of a national database covering the entire South Korean population  

Microsoft Academic Search

BACKGROUND: Medical spending on psychiatric hospitalization has been reported to impose a tremendous socio-economic burden on many developed countries with public health insurance programmes. However, there has been no in-depth study that has examined the factors affecting psychiatric inpatient medical expenditures and differentiated these factors across different types of public health insurance programmes. In view of this, this study attempted to explore

Woojin Chung

2010-01-01

144

DNA banking and DNA databanking: Legal, ethical, and public policy issues. Progress report, (April 1, 1993--March 31, 1994).  

National Technical Information Service (NTIS)

The purpose of the grant was to provide support to enable us to: (1) perform legal and empirical research and critically analyze DNA banking and DNA databanking as those activities are conducted by state forensic laboratories, the military, academic resea...

P. R. Reilly; J. E. McEwen; D. Small

1994-01-01

145

Human Mitochondrial Protein Database  

NSDL National Science Digital Library

The Human Mitochondrial Protein Database (HMPDb) conveniently consolidates information from a number of other databases, including GenBank, Online Mendelian Inheritance in Man, and the Human Mitochondrial Genome Database. HMPDb "is intended as a tool not only to aid in studying the mitochondrion but in studying the associated diseases" as well. Features include a general database search, a graphical tool for visualizing the mitochondrial DNA sequences, and 3D structures for mitochondrial proteins. Users are welcome to contact the National Institute of Standards and Technology with corrections or other information relating to the database.

146

The ribosomal database project (RDP-II): introducing myRDP space and quality controlled public data  

PubMed Central

Substantial new features have been implemented at the Ribosomal Database Project in response to the increased importance of high-throughput rRNA sequence analysis in microbial ecology and related disciplines. The most important changes include quality analysis, including chimera detection, for all available rRNA sequences and the introduction of myRDP Space, a new web component designed to help researchers place their own data in context with the RDP's data. In addition, new video tutorials describe how to use RDP features. Details about RDP data and analytical functions can be found at the RDP-II website.

Cole, J. R.; Chai, B.; Farris, R. J.; Wang, Q.; Kulam-Syed-Mohideen, A. S.; McGarrell, D. M.; Bandela, A. M.; Cardenas, E.; Garrity, G. M.; Tiedje, J. M.

2007-01-01

147

Who owns what? Private ownership and the public interest in recombinant DNA technology in the 1970s.  

PubMed

This essay analyzes how academic institutions, government agencies, and the nascent biotech industry contested the legal ownership of recombinant DNA technology in the name of the public interest. It reconstructs the way a small but influential group of government officials and university research administrators introduced a new framework for the commercialization of academic research in the context of a national debate over scientific research's contributions to American economic prosperity and public health. They claimed that private ownership of inventions arising from public support would provide a powerful means to liberate biomedical discoveries for public benefit. This articulation of the causal link between private ownership and the public interest, it is argued, justified a new set of expectations about the use of research results arising from government or public support, in which commercialization became a new public obligation for academic researchers. By highlighting the broader economic and legal shifts that prompted the reconfiguration of the ownership of public knowledge in late twentieth-century American capitalism, the essay examines the threads of policy-informed legal ideas that came together to affirm private ownership of biomedical knowledge as germane to the public interest in the coming of age of biotechnology and genetic medicine. PMID:22073770

Yi, Doogab

2011-09-01

148

Publications  

NSDL National Science Digital Library

The Nitrogen and Phosphorus Knowledge Web page is offered by Iowa State University Extension and the College of Agriculture. The publications page contains links to various newsletters, articles, publications, PowerPoint presentations, links to governmental publications, and more. For example, visitors will find articles written on phosphorus within the Integrated Crop Management Newsletter, PowerPoint presentations on Nitrogen Management and Carbon Sequestration, and links to other Iowa State University publications on various subjects such as nutrient management. Other links on the home page of the site contain soil temperature data, research highlights, and other similarly relevant information for those in similar fields.

149

Identification of new eukaryotic tRNA genes in genomic DNA databases by a multistep weight matrix analysis of transcriptional control regions.  

PubMed Central

A linear method for the search of eukaryotic nuclear tRNA genes in DNA databases is described. Based on a modified version of the general weight matrix procedure, our algorithm relies on the recognition of two intragenic control regions known as A and B boxes, a transcription termination signal, and on the evaluation of the spacing between these elements. The scanning of the eukaryotic nuclear DNA database using this search algorithm correctly identified 933 of the 940 known tRNA genes (0.74% of false negatives). Thirty new potential tRNA genes were identified, and the transcriptional activity of two of them was directly verified by in vitro transcription. The total false positive rate of the algorithm was 0.014%. Structurally unusual tRNA genes, like those coding for selenocysteine tRNAs, could also be recognized using a set of rules concerning their specific properties, and one human gene coding for such tRNA was identified. Some of the newly identified tRNA genes were found in rather uncommon genomic positions: 2 in centromeric regions and 3 within introns. Furthermore, the presence of extragenically located B boxes in tRNA genes from various organisms could be detected through a specific subroutine of the standard search program.

Pavesi, A; Conterio, F; Bolchi, A; Dieci, G; Ottonello, S

1994-01-01
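
The general weight matrix idea behind the tRNA gene search above (score candidate A and B boxes, then check their spacing) can be sketched as follows. This is a generic log-odds matrix with a uniform background and a pseudocount, not the authors' modified procedure; the function names, threshold and gap limits are hypothetical, and input sequences are assumed to contain only A, C, G and T.

```python
import math

BASES = "ACGT"

def build_weight_matrix(sites, pseudocount=1.0, background=0.25):
    """Log-odds weight matrix from equal-length aligned example sites."""
    matrix = []
    for column in zip(*sites):
        total = len(column) + 4 * pseudocount
        matrix.append({
            b: math.log(((column.count(b) + pseudocount) / total) / background)
            for b in BASES
        })
    return matrix

def score(matrix, window):
    """Sum of per-position log-odds weights for one window."""
    return sum(col[base] for col, base in zip(matrix, window))

def scan(matrix, seq, threshold):
    """Start positions of windows scoring at least `threshold`."""
    width = len(matrix)
    return [i for i in range(len(seq) - width + 1)
            if score(matrix, seq[i:i + width]) >= threshold]

def paired_hits(a_hits, b_hits, min_gap, max_gap):
    """(A, B) start-position pairs whose spacing falls in the allowed range."""
    return [(a, b) for a in a_hits for b in b_hits
            if min_gap <= b - a <= max_gap]
```

A candidate gene would be reported only where an A-box hit and a B-box hit occur at a plausible distance, which is what `paired_hits` filters for. The 0.74% false-negative figure in the abstract follows from the 7 missed genes out of 940 (7 / 940 ≈ 0.0074).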

150

The PNNL Quantitative Infrared Database for Gas-Phase Sensing: A spectral Library for Environmental, Hazmat, and Public Safety Standoff Detection  

SciTech Connect

Pacific Northwest National Laboratory (PNNL) continues to expand its library of quantitative infrared reference spectra for remote sensing. The gas-phase data are recorded at 0.1 cm-1 resolution, with nitrogen pressure broadening to one atmosphere to emulate spectra recorded in the field. It is planned that the PNNL library will consist of approximately 500 vapor-phase spectra associated with the U.S. Department of Energy's environmental, energy, and public safety missions. At present, the database is comprised of approximately 300 infrared spectra, many of which represent highly reactive or toxic species. For the 298 K data, each reported spectrum is in fact a composite spectrum generated by a Beer's law plot (at each wavelength) to typically 12 measured spectra. Recent additions to the database include the vapors of several semi-volatile and non-volatile liquids using an improved dissemination technique for vaporizing the liquid into the nitrogen carrier gas. Experimental and analytical methods are used to remove several known and new artifacts associated with FTIR gas-phase spectroscopy. Details concerning sample preparation and composite spectrum generation are discussed.

Johnson, Timothy J.; Sams, Robert L.; Sharpe, Steven W.; Arthur J. Sedlacek III, Richard Colton, Tuan Vo-Dinh

2004-03-25
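
The composite spectrum described above, a Beer's law fit at each wavelength across the measured spectra, can be sketched as a per-wavelength least-squares slope. The sketch assumes an unweighted fit through the origin, which the abstract does not specify; `composite_spectrum` and its argument names are illustrative.

```python
def composite_spectrum(burdens, spectra):
    """Fit A = k * burden at each wavelength; return the list of slopes k.

    burdens: concentration-path-length products, one per measured spectrum.
    spectra: absorbance lists, one per measured spectrum, all the same length.
    """
    denom = sum(b * b for b in burdens)
    n_points = len(spectra[0])
    # Least-squares slope through the origin: sum(b * A) / sum(b * b).
    return [
        sum(b * s[i] for b, s in zip(burdens, spectra)) / denom
        for i in range(n_points)
    ]
```

With about 12 measured spectra per compound, as the abstract mentions, `burdens` would hold 12 values and `spectra` 12 absorbance lists; the returned slopes form the single composite spectrum.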

151

The PNNL quantitative infrared database for gas-phase sensing: a spectral library for environmental, hazmat, and public safety standoff detection  

NASA Astrophysics Data System (ADS)

Pacific Northwest National Laboratory (PNNL) continues to expand its library of quantitative infrared reference spectra for remote sensing. The gas-phase data are recorded at 0.1 cm-1 resolution, with nitrogen pressure broadening to one atmosphere to emulate spectra recorded in the field. It is planned that the PNNL library will consist of approximately 500 vapor-phase spectra associated with the U.S. Department of Energy's environmental, energy, and public safety missions. At present, the database is comprised of approximately 300 infrared spectra, many of which represent highly reactive or toxic species. For the 298 K data, each reported spectrum is in fact a composite spectrum generated by a Beer's law plot (at each wavelength) to typically 12 measured spectra. Recent additions to the database include the vapors of several semi-volatile and non-volatile liquids using an improved dissemination technique for vaporizing the liquid into the nitrogen carrier gas. Experimental and analytical methods are used to remove several known and new artifacts associated with FTIR gas-phase spectroscopy. Details concerning sample preparation and composite spectrum generation are discussed.

Johnson, Timothy J.; Sams, Robert L.; Sharpe, Steven W.

2004-03-01

152

Expanding the forensic German mitochondrial DNA control region database: genetic diversity as a function of sample size and microgeography  

Microsoft Academic Search

Mitochondrial DNA control region sequences were determined in 109 unrelated German Caucasoid individuals from north west Germany for both hypervariable regions 1 (HV1) and 2 (HV2) and 100 polymorphic nucleotide positions (nps) were found, 63 in HV1 and 37 in HV2. A total of 100 different mtDNA lineages was revealed, of which 7 were shared by 2 individuals and 1

H. Pfeiffer; B. Brinkmann; J. Hühne; B. Rolf; A. A. Morris; R. Steighner; M. M. Holland; P. Forster

1999-01-01

153

Ionic Liquids Database- (ILThermo)  

National Institute of Standards and Technology Data Gateway

SRD 147 Ionic Liquids Database- (ILThermo) (Web, free access)   IUPAC Ionic Liquids Database, ILThermo, is a free web research tool that allows users worldwide to access an up-to-date data collection from the publications on experimental investigations of thermodynamic and transport properties of ionic liquids as well as binary and ternary mixtures containing ionic liquids.

154

Payment Systems Research Database  

NSDL National Science Digital Library

The Federal Reserve Bank of Chicago provides this banking industry database for the "convenience of financial services industry researchers and executives." The Payment Systems Research Database is both browseable and searchable by general reference, payment instrument, public policy, or research topic. Each entry includes a brief bibliographic citation with links to full-text articles where available.

155

The Molecular Biology Database Collection: 2006 update  

Microsoft Academic Search

The NAR Molecular Biology Database Collection is a public online resource that contains links to all databases described in this issue of Nucleic Acids Research. In addition, this collection lists databases that have been featured in previous issues of NAR, as well as selected other databases that are freely available to the public and may be useful to the molecular

M. Y. Galperin

2006-01-01

156

GCOD - GeneChip Oncology Database  

PubMed Central

Background: DNA microarrays have become a nearly ubiquitous tool for the study of human disease, and nowhere is this more true than in cancer. With hundreds of studies and thousands of expression profiles representing the majority of human cancers completed and in public databases, the challenge has been effectively accessing and using this wealth of data. Description: To address this issue we have collected published human cancer gene expression datasets generated on the Affymetrix GeneChip platform, and carefully annotated those studies with a focus on providing accurate sample annotation. To facilitate comparison between datasets, we implemented a consistent data normalization and transformation protocol and then applied stringent quality control procedures to flag low-quality assays. Conclusion: The resulting resource, the GeneChip Oncology Database, is available through a publicly accessible website that provides several query options and analytical tools through an intuitive interface.

2011-01-01

157

Olfactory Receptor Database: a database of the largest eukaryotic gene family.  

PubMed Central

The Olfactory Receptor Database (ORDB) is a WWW-accessible database that stores data on Olfactory Receptor-like molecules (ORs) and has been open to the public since June 1996. It contains a public and a private area. The public area includes published DNA and protein sequence data for ORs, links to OR models and data on their expression, chromosomal localization and source organism, as well as (i) links to bibliography through PubMed and (ii) interactive WWW-based tools, such as BLAST homology searching. The private area functions as a service to laboratories that are actively cloning receptors. Source laboratories enter the sequences of the receptor clones they have characterized to the private database and can search for identical or near identical OR sequences in both public and private databases. If another laboratory has cloned and deposited an identical or closely matching sequence there are means for communication between the laboratories to help avoid duplication of work. ORDB is available via the WWW at http://crepe.med.yale.edu/ORDB/HTML

Skoufos, E; Healy, M D; Singer, M S; Nadkarni, P M; Miller, P L; Shepherd, G M

1999-01-01

158

Olfactory Receptor Database: a database of the largest eukaryotic gene family.  

PubMed

The Olfactory Receptor Database (ORDB) is a WWW-accessible database that stores data on Olfactory Receptor-like molecules (ORs) and has been open to the public since June 1996. It contains a public and a private area. The public area includes published DNA and protein sequence data for ORs, links to OR models and data on their expression, chromosomal localization and source organism, as well as (i) links to bibliography through PubMed and (ii) interactive WWW-based tools, such as BLAST homology searching. The private area functions as a service to laboratories that are actively cloning receptors. Source laboratories enter the sequences of the receptor clones they have characterized to the private database and can search for identical or near identical OR sequences in both public and private databases. If another laboratory has cloned and deposited an identical or closely matching sequence there are means for communication between the laboratories to help avoid duplication of work. ORDB is available via the WWW at http://crepe.med.yale.edu/ORDB/HTML PMID:9847223

Skoufos, E; Healy, M D; Singer, M S; Nadkarni, P M; Miller, P L; Shepherd, G M

1999-01-01

159

Evaluating Statistical Methods Using Plasmode Data Sets in the Age of Massive Public Databases: An Illustration Using False Discovery Rates  

PubMed Central

Plasmode is a term coined several years ago to describe data sets that are derived from real data but for which some truth is known. Omic techniques, most especially microarray and genomewide association studies, have catalyzed a new zeitgeist of data sharing that is making data and data sets publicly available on an unprecedented scale. Coupling such data resources with a science of plasmode use would allow statistical methodologists to vet proposed techniques empirically (as opposed to only theoretically) and with data that are by definition realistic and representative. We illustrate the technique of empirical statistics by consideration of a common task when analyzing high dimensional data: the simultaneous testing of hundreds or thousands of hypotheses to determine which, if any, show statistical significance warranting follow-on research. The now-common practice of multiple testing in high dimensional experiment (HDE) settings has generated new methods for detecting statistically significant results. Although such methods have heretofore been subject to comparative performance analysis using simulated data, simulating data that realistically reflect data from an actual HDE remains a challenge. We describe a simulation procedure using actual data from an HDE where some truth regarding parameters of interest is known. We use the procedure to compare estimates for the proportion of true null hypotheses, the false discovery rate (FDR), and a local version of FDR obtained from 15 different statistical methods.

Gadbury, Gary L.; Xiang, Qinfang; Yang, Lin; Barnes, Stephen; Page, Grier P.; Allison, David B.

2008-01-01
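
For readers unfamiliar with false discovery rate control in the multiple-testing setting described above, the Benjamini-Hochberg step-up procedure is the standard baseline. The Python sketch below implements that procedure; whether it is among the 15 methods the authors compared is not stated in the abstract, so it is offered only as an illustration of the task.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return indices of hypotheses rejected at FDR level q (BH step-up)."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= k * q / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank
    # Reject every hypothesis ranked at or below k_max.
    return sorted(order[:k_max])
```

On a plasmode data set, where the true status of some hypotheses is known, the list returned here could be compared against that known truth to estimate how well a method controls the FDR in realistic data.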

160

First insight into the human liver proteome from PROTEOME(SKY)-LIVER(Hu) 1.0, a publicly available database.  

PubMed

Herein, we report proteome and transcriptome profiles of the human adult liver and present an initial analysis. Overall, the human liver proteome (HLP) data set comprises 6788 identified proteins with at least two peptide matches at 95% confidence, including 3721 proteins newly identified in liver. The human liver transcriptome (HLT) data set consists of 11 205 expressed genes. The HLP is the largest proteome data set for a human organ and is the first direct association between a proteome and its transcriptome derived from the same sample. Although it is hard to approach complete coverage of the HLP currently, several conclusions based on this data set are clearly reached: (1) The 5816 protein-encoding genes (PEGs) represented by the HLP and the 11 104 PEGs represented in the HLT have been identified from 20 070 PEGs in IPI Human v3.07 and 19 478 PEGs in the integrated human transcriptome database, respectively. (2) The patterns of chromosomal distribution of the genes corresponding to the HLP are highly consistent with those of the HLT. Some chromosomal regions, such as 16p13.3, 19q13.31, 19q13.42, and Xq28, exhibit particularly high densities of liver-specific genes, which perform important functions related to normal physiology and/or pathology in this organ. (3) The HLP spans 6 orders of magnitude in relative protein abundance and 78% of the proteins fall in the middle of this range. Of newly identified liver proteins, 82.5% are of low abundance. (4) Proteins involved in metabolism, transport, and coagulation and those containing active domains for metabolism, transport, and biosynthesis are significantly enriched in liver. (5) All 94 metabolic pathways in KEGG are covered to varying extents. For 48 of these pathways, particularly those involved in metabolism of carbohydrates and amino acids, more than 80% of the component proteins have been detected. The liver-specific pathways, such as those participating in metabolism of bile acid and bilirubin and in biotransformation, are identified with remarkably high coverage. A total of 31 members of the cytochrome P450 family are identified, four of which have been observed for the first time in human liver. (6) Transport proteins involved in energy metabolism and secretion of both protein and bile acid are highly abundant. Three ion channels are described for the first time in liver. (7) The 800 proteins related to signal transduction and primarily involved in cellular recognition, localization, communication, and inflammation are present in the HLP data set. Insulin and adipocytokine pathways, which are involved in the regulation of glucose and fatty acids, are highly covered. (8) Transcription factors (309 in total) have been recognized at relatively low detection rates and abundance; however, transcription factors regulating gene expression related to transport, metabolism, and biosynthesis are detected at relatively higher coverage and the protein products of their target genes (100 in total), such as metabolic enzymes and plasma proteins, are also identified. (9) The overlap between the human liver and plasma proteomes is particularly noteworthy in the coagulation/anticoagulation/fibrinolysis and complement system. There is a significantly positive linear correlation between the abundance of coagulator proteins in liver and plasma. PMID:19653699

2010-01-01

161

Contamination of sequence databases with adaptor sequences  

SciTech Connect

Because of the exponential increase in the amount of DNA sequences being added to the public databases on a daily basis, it has become imperative to identify sources of contamination rapidly. Previously, contaminations of sequence databases have been reported to alert the scientific community to the problem. These contaminations can be divided into two categories. The first category comprises host sequences that have been difficult for submitters to manage or control. Examples include anomalous sequences derived from Escherichia coli, which are inserted into the chromosomes (and plasmids) of the bacterial hosts. Insertion sequences are highly mobile and are capable of transposing themselves into plasmids during cloning manipulation. Another example of the first category is the infection with yeast genomic DNA or with bacterial DNA of some commercially available cDNA libraries from Clontech. The second category of database contamination is due to the inadvertent inclusion of nonhost sequences. This category includes incorporation of cloning-vector sequences and multicloning sites in the database submission. M13-derived artifacts have been common, since M13-based vectors have been widely used for subcloning DNA fragments. Recognizing this problem, the National Center for Biotechnology Information (NCBI) started to screen, in April 1994, all sequences directly submitted to GenBank, against a set of vector data retrieved from GenBank by use of key-word searches, such as "vector." In this report, we present evidence for another sequence artifact that is widespread but that, to our knowledge, has not yet been reported. 11 refs., 1 tab.

Yoshikawa, Takeo; Sanders, A.R.; Detera-Wadleigh, S.D. [National Institute of Mental Health, Bethesda, MD (United States)]

1997-02-01
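
The kind of screening described in the abstract above can be illustrated with a minimal sketch. It assumes exact substring matching on both strands, which is a deliberate simplification: real screens, including the NCBI one mentioned, compare submissions against vector collections with alignment tools. The function names and the fragment `adaptor_X` are hypothetical placeholders, not a real adaptor sequence or any tool's API.

```python
# Illustrative only: exact-substring screen for contaminant fragments.
# The fragment below is an arbitrary placeholder, not a real adaptor sequence.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def screen(seq, contaminants):
    """Return (name, 0-based position, strand) for every exact hit in seq."""
    seq = seq.upper()
    hits = []
    for name, fragment in contaminants.items():
        fragment = fragment.upper()
        probes = (("+", fragment), ("-", reverse_complement(fragment)))
        for strand, probe in probes:
            start = seq.find(probe)
            while start != -1:
                hits.append((name, start, strand))
                start = seq.find(probe, start + 1)
    return hits


HYPOTHETICAL_ADAPTORS = {"adaptor_X": "ACGTTGCAAGGCTA"}

if __name__ == "__main__":
    submission = "TTTT" + "ACGTTGCAAGGCTA" + "CCCC" + "TAGCCTTGCAACGT"
    for hit in screen(submission, HYPOTHETICAL_ADAPTORS):
        print(hit)
```

A fragment that equals its own reverse complement would be reported once per strand at the same position; a production screen would also need to tolerate mismatches and partial matches at sequence ends.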

162

The Génolevures database.  

PubMed

The Génolevures online database (URL: http://www.genolevures.org) stores and provides the data and results obtained by the Génolevures Consortium through several campaigns of genome annotation of the yeasts in the Saccharomycotina subphylum (hemiascomycetes). This database is dedicated to large-scale comparison of these genomes, storing not only the different chromosomal elements detected in the sequences, but also the logical relations between them. The database is divided into a public part, accessible to anyone through the Internet, and a private part where the Consortium members make genome annotations with our Magus annotation system; this system is used to annotate several related genomes in parallel. The public database is widely consulted and offers structured data, organized using a REST web site architecture that allows for automated requests. The implementation of the database, as well as its associated tools and methods, is evolving to cope with the influx of genome sequences produced by Next Generation Sequencing (NGS). PMID:21819938

Martin, Tiphaine; Sherman, David J; Durrens, Pascal

2011-01-01

163

Significant variance in genetic diversity among populations of Schistosoma haematobium detected using microsatellite DNA loci from a genome-wide database  

PubMed Central

Background Urogenital schistosomiasis caused by Schistosoma haematobium is widely distributed across Africa and is increasingly being targeted for control. Genome sequences and population genetic parameters can give insight into the potential for population- or species-level drug resistance. Microsatellite DNA loci are genetic markers in wide use by Schistosoma researchers, but there are few primers available for S. haematobium. Methods We sequenced 1,058,114 random DNA fragments from clonal cercariae collected from a snail infected with a single Schistosoma haematobium miracidium. We assembled and aligned the S. haematobium sequences to the genomes of S. mansoni and S. japonicum, identifying microsatellite DNA loci across all three species and designing primers to amplify the loci in S. haematobium. To validate our primers, we screened 32 randomly selected primer pairs with population samples of S. haematobium. Results We designed >13,790 primer pairs to amplify unique microsatellite loci in S. haematobium (available at http://www.cebio.org/projetos/schistosoma-haematobium-genome). The three Schistosoma genomes contained similar overall frequencies of microsatellites, but the frequency and length distributions of specific motifs differed among species. We identified 15 primer pairs that amplified consistently and were easily scored. We genotyped these 15 loci in S. haematobium individuals from six locations: Zanzibar had the highest levels of diversity; Malawi, Mauritius, Nigeria, and Senegal were nearly as diverse; but the sample from South Africa was much less diverse. Conclusions About half of the primers in the database of Schistosoma haematobium microsatellite DNA loci should yield amplifiable and easily scored polymorphic markers, thus providing thousands of potential markers. Sequence conservation among S. haematobium, S. japonicum, and S. mansoni is relatively high, thus it should now be possible to identify markers that are universal among Schistosoma species (i.e., using DNA sequences conserved among species), as well as other markers that are specific to species or species-groups (i.e., using DNA sequences that differ among species). Full genome-sequencing of additional species and specimens of S. haematobium, S. japonicum, and S. mansoni is desirable to better characterize differences within and among these species, to develop additional genetic markers, and to examine genes as well as conserved non-coding elements associated with drug resistance.

2013-01-01

164

Audit of Compliance With Standards Governing Combined DNA Index System Activities at the Texas Department of Public Safety Lubbock Criminal Laboratory, Lubbock, Texas.  

National Technical Information Service (NTIS)

The Department of Justice Office of the Inspector General (OIG), Audit Division, has completed an audit of compliance with standards governing Combined DNA Index System (CODIS) activities at the Texas Department of Public Safety Lubbock Criminal Laborator...

2011-01-01

165

DSSTOX STRUCTURE-SEARCHABLE PUBLIC TOXICITY DATABASE NETWORK: CURRENT PROGRESS AND NEW INITIATIVES TO IMPROVE CHEMO-BIOINFORMATICS CAPABILITIES  

EPA Science Inventory

The EPA DSSTox website (http://www.epa.gov/nheerl/dsstox) publishes standardized, structure-annotated toxicity databases, covering a broad range of toxicity disciplines. Each DSSTox database features documentation written in collaboration with the source authors and toxicity expe...

166

Prospects of using DNA barcoding for species identification and evaluation of the accuracy of sequence databases for ticks (Acari: Ixodida).  

PubMed

Ticks are important vectors of disease and parasites of livestock. Species identification of ticks has been traditionally based on morphological characters, which is usually limited by the condition of samples and little variation among specimens, so a rapid and reliable identification method is needed. DNA barcoding uses a standard fragment of the mitochondrial gene cytochrome oxidase c subunit I (COI) to identify species and has been successfully used in many taxa. In this study, we applied DNA barcoding to tick species. K2P distances showed that most interspecific divergences exceed 8%, while intraspecific distances were usually lower than 2%. However, intraspecific distances of 12 species were unexpectedly high. ABGD grouping results demonstrated that sequences of these species should be divided into 2 or more groups. And some exceptional clustering occurred among sequences of Hyalomma marginatum, Hy. truncatum, and Hy. dromedarii, Amblyomma testudinarium and A. pattoni, Rhipicephalus sanguineus and R. pumilio, Haemaphysalis parva and Ha. concinna, Ixodes asanumai and I. nipponensis. Additionally, 226 unnamed sequences were assigned to known species or constituted different groups, and K2P distances of all these groups were less than 2%. In conclusion, our study demonstrated that DNA barcoding is a useful tool for the identification of tick species, and further work is needed to reveal ambiguous species delimitation in some problematic genera. PMID:24656809

Zhang, Rui L; Zhang, Bin

2014-04-01
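
The K2P (Kimura 2-parameter) distances quoted in the abstract above follow the standard formula d = -1/2 ln(1 - 2P - Q) - 1/4 ln(1 - 2Q), where P and Q are the proportions of sites that differ by a transition and by a transversion. The sketch below is only an illustration of that formula, not the authors' pipeline: the function name is hypothetical, the two sequences are assumed to be already aligned, and sites containing gaps or ambiguity codes are assumed to be skipped.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1, seq2):
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Sites where either sequence has a gap or an ambiguous base are ignored.
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine
    if compared == 0:
        raise ValueError("no comparable sites")
    p = transitions / compared
    q = transversions / compared
    # math.log raises ValueError when the sequences are too divergent
    # for the model (1 - 2P - Q <= 0 or 1 - 2Q <= 0).
    return -0.5 * math.log(1 - 2 * p - q) - 0.25 * math.log(1 - 2 * q)
```

Under the thresholds reported in the abstract, a pairwise value below roughly 0.02 would be typical within a species and a value above roughly 0.08 typical between species; those cut-offs describe the tick data set and are not built into the function.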

167

Rainfall Simulation Database  

NSDL National Science Digital Library

The British Geomorphological Research Group maintains this rainfall simulation database site. The database includes information about rainfall simulators, as well as their set-up and use. Currently, the database is divided into laboratory, field, and dual-purpose (lab and field) rainfall simulators. For each simulator in the database, users will find information on: "Simulator identification; Contact information; Specification of simulator; performance of simulator; Table of intensities and rainfall characteristics; Costs; Summary of research conducted using the simulator; Publications; and Additional comments." A section on simulation calibration is planned, and a form is provided for those wishing to contribute simulators to the database. A lengthy selection of related references adds to the utility of the database.

168

Nanotechnology Database  

NSDL National Science Digital Library

Sponsored by the National Science Foundation and housed at the Loyola College in Maryland's International Technology Research Institute, the Nanotechnology Database is a source of online information on major research centers, funding agencies, major reports, and books dealing with nanotechnology. The resources listed here are carefully selected and reviewed. The site is expected to grow with the continued support and updates from organizations and individuals in the field of nanotechnology. The list of resources is divided into the following categories: Academic, Industry, Government Laboratories, Government Agencies, Professional Societies, Non-Profit Organizations, Books, Periodicals, Reports, and Conferences. Each listing provides a brief summary (taken from that Website) and hyperlink to the resource (note: the book list links mostly take users to online booksellers). A submission form allows users to add a relevant organization or publication.

2001-01-01

169

How Frequently Do the Results from Completed US Clinical Trials Enter the Public Domain? - A Statistical Analysis of the ClinicalTrials.gov Database  

PubMed Central

Background Achieving transparency in clinical trials, through either publishing results in a journal or posting results to the ClinicalTrials.gov (CTG) web site, is an essential public health good. However, it remains unknown what proportion of completed studies achieve public disclosure of results (PDOR), or what factors explain these differences. Methods We analyzed data from 400 randomly selected studies within the CTG database that had been listed as ‘completed’ and had at least four years in which to disclose results. Using Kaplan-Meier curves, we calculated times from completion to PDOR (defined as publishing the primary outcomes in a journal and/or posting results to CTG), and identified explanatory variables predicting these outcomes using Cox proportional hazards models. Findings Among the 400 clinical trials, 118 (29.5%) failed to achieve PDOR within four years of completion. The median time from study completion to PDOR among the 282 studies (70.5%) that achieved PDOR was 602 days (mean 647 days, SD 454 days). Studies were less likely to achieve PDOR if at earlier stages (phase 2 vs. phase 3/4, adjusted HR 0.60, 95% CI 0.47–0.78), if they only included adult subjects (adjusted HR 0.61, 95% CI 0.45–0.83), involved randomization (adjusted HR 0.62, 95% CI 0.46–0.83), or had smaller sample sizes (≤50 subjects vs. >50, adjusted HR 0.60, 95% CI 0.44–0.83). Industry-funded studies were significantly less likely to be published than non-industry or blended studies (adjusted HR 0.49, 95% CI 0.36–0.66). Conclusions A significant proportion of completed studies did not achieve PDOR within the four years of follow-up, particularly smaller studies at earlier stages of development with industry funding. This constitutes reporting bias and threatens the validity of the clinical research literature in the US.

Saito, Hiroki; Gill, Christopher J.

2014-01-01

170

Method and system for normalizing biometric variations to authenticate users from a public database and that ensures individual biometric data privacy  

DOEpatents

A password system comprises a set of codewords spaced apart from one another by a Hamming distance (HD) that exceeds twice the variability that can be projected for a series of biometric measurements for a particular individual and that is less than the HD that can be encountered between two individuals. To enroll an individual, a biometric measurement is taken and exclusive-ORed with a random codeword to produce a "reference value." To verify the individual later, a biometric measurement is taken and exclusive-ORed with the reference value to reproduce the original random codeword or its approximation. If the reproduced value is not a codeword, the nearest codeword to it is found, and the bits that were corrected to produce the codeword are also toggled in the biometric measurement taken and the codeword generated during enrollment. The correction scheme can be implemented by any conventional error correction code such as Reed-Muller code R(m,n). In the implementation using a hand geometry device an R(2,5) code has been used in this invention. Such codeword and biometric measurement can then be used to see if the individual is an authorized user. Conventional Diffie-Hellman public key encryption schemes and hashing procedures can then be used to secure the communications lines carrying the biometric information and to secure the database of authorized users.

Strait, Robert S. (Oakland, CA); Pearson, Peter K. (Livermore, CA); Sengupta, Sailes K. (Livermore, CA)

2000-01-01
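
The enroll and verify steps described in the abstract above can be sketched in a few lines. This is a toy illustration under stated assumptions, not the patented implementation: a 5-fold repetition code stands in for the Reed-Muller code named in the abstract, bit strings are plain Python lists, and every function name is hypothetical.

```python
import secrets

REPEAT = 5  # each key bit is repeated 5 times, so up to 2 flips per group are corrected


def encode(key_bits):
    """Expand key bits into a repetition-code codeword."""
    return [bit for bit in key_bits for _ in range(REPEAT)]


def decode(bits):
    """Map a bit string to the key of its nearest codeword (majority vote per group)."""
    groups = [bits[i:i + REPEAT] for i in range(0, len(bits), REPEAT)]
    return [1 if sum(group) > REPEAT // 2 else 0 for group in groups]


def xor(a, b):
    """Bitwise exclusive-OR of two equal-length bit lists."""
    return [x ^ y for x, y in zip(a, b)]


def enroll(biometric, key_bits):
    """Reference value = biometric XOR random codeword."""
    return xor(biometric, encode(key_bits))


def verify(measurement, reference):
    """Fresh measurement XOR reference gives a noisy codeword; decode it."""
    return decode(xor(measurement, reference))


if __name__ == "__main__":
    key = [secrets.randbelow(2) for _ in range(4)]
    biometric = [secrets.randbelow(2) for _ in range(4 * REPEAT)]
    reference = enroll(biometric, key)

    noisy = list(biometric)
    noisy[3] ^= 1    # simulate measurement variability
    noisy[12] ^= 1
    print(verify(noisy, reference) == key)
```

Because XOR cancels the shared biometric bits, only the bits that differ between the enrollment and verification measurements reach the decoder, which is why the code's correction capacity has to exceed the expected within-person variability.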

171

Microsatellite characterization and marker development from public EST and WGS databases in the reef-building coral Acropora millepora (Cnidaria, Anthozoa, Scleractinia).  

PubMed

Mining for microsatellites (also called simple sequence repeats [SSRs]) in public sequence databases of a common Indo-Pacific coral Acropora millepora identified 191 SSRs from 10 258 expressed sequence tag (EST) and 618 SSRs from 14 625 whole-genome shotgun (WGS) sequences. In contrast to other animals, trinucleotide repeats, rather than dinucleotide repeats, are dominant in the WGS-SSRs, and AAT is the most frequent trinucleotide motif in EST-SSRs. We successfully developed 40 polymorphic markers from EST-SSRs and WGS-SSRs. Both EST- and WGS-SSRs show high levels of polymorphism within corals from the same reef patch. Interestingly, markers WGS079 and WGS227 revealed SSR duplications in a few individuals, suggesting recent duplication events. Genotypic linkage disequilibrium was identified in 5 pairs of SSR markers, which will be invaluable for high-resolution studies of genetic admixture in natural populations of A. millepora. Transferability analysis showed that 25 of these markers can be successfully amplified in one of the most ubiquitous Indo-Pacific corals Acropora hyacinthus. The marker collection reported here is the largest ever developed for any reef-building coral. It holds great potential for addressing coral reef connectivity across the Indo-Pacific with an unprecedented precision, especially taking into account the cross-species transferability of a substantial number of markers. PMID:19043068

Wang, Shi; Zhang, Lingling; Matz, Mikhail

2009-01-01
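
Mining sequences for microsatellites, as in the abstract above, is often done with a back-referencing regular expression. The following is a minimal sketch of that idea, not the authors' method: the function name is hypothetical, and the motif-length range (2 to 6 bp) and the minimum of 5 repeats are assumed thresholds, since the abstract does not state the criteria used.

```python
import re


def find_ssrs(seq, min_motif=2, max_motif=6, min_repeats=5):
    """Return (0-based start, motif, repeat count) for perfect tandem repeats.

    The lazy quantifier makes the regex try the shortest motif first, so
    ACACACACAC is reported as (AC)5 rather than as a longer compound motif.
    """
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_motif, max_motif, min_repeats - 1)
    )
    results = []
    for match in pattern.finditer(seq.upper()):
        motif = match.group(1)
        if len(set(motif)) == 1:
            continue  # skip homopolymer runs such as AAAAAAAAAA
        results.append((match.start(), motif, len(match.group(0)) // len(motif)))
    return results


if __name__ == "__main__":
    example = "GGC" + "AAT" * 6 + "CCGT" + "AC" * 7 + "G"
    for start, motif, count in find_ssrs(example):
        print(start, motif, count)
```

Tallying the motif lengths returned over a whole EST or WGS collection would give the kind of di- versus trinucleotide comparison the abstract reports; imperfect or interrupted repeats are not detected by this pattern.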

172

Computational tools and resources for metabolism-related property predictions. 1. Overview of publicly available (free and commercial) databases and software  

PubMed Central

Metabolism has been identified as a defining factor in drug development success or failure because of its impact on many aspects of drug pharmacology, including bioavailability, half-life and toxicity. In this article, we provide an outline and descriptions of the resources for metabolism-related property predictions that are currently either freely or commercially available to the public. These resources include databases with data on, and software for prediction of, several end points: metabolite formation, sites of metabolic transformation, binding to metabolizing enzymes and metabolic stability. We attempt to place each tool in historical context and describe, wherever possible, the data it was based on. For predictions of interactions with metabolizing enzymes, we show a typical set of results for a small test set of compounds. Our aim is to give a clear overview of the areas and aspects of metabolism prediction in which the currently available resources are useful and accurate, and the areas in which they are inadequate or missing entirely.

Peach, Megan L; Zakharov, Alexey V; Liu, Ruifeng; Pugliese, Angelo; Tawa, Gregory; Wallqvist, Anders; Nicklaus, Marc C

2014-01-01

173

Overview of the HUPO Plasma Proteome Project: Results from the pilot phase with 35 collaborating laboratories and multiple analytical groups, generating a core dataset of 3020 proteins and a publicly-available database  

SciTech Connect

HUPO initiated the Plasma Proteome Project (PPP) in 2002. Its pilot phase has (1) evaluated advantages and limitations of many depletion, fractionation, and MS technology platforms; (2) compared PPP reference specimens of human serum and EDTA, heparin, and citrate-anticoagulated plasma; and (3) created a publicly-available knowledge base (www.bioinformatics.med.umich.edu/hupo/ppp; www.ebi.ac.uk/pride). Thirty-five participating laboratories in 13 countries submitted datasets. Working groups addressed (a) specimen stability and protein concentrations; (b) protein identifications from 18 MS/MS datasets; (c) independent analyses from raw MS-MS spectra; (d) search engine performance, subproteome analyses, and biological insights; (e) antibody arrays; and (f) direct MS/SELDI analyses. MS-MS datasets had 15 710 different International Protein Index (IPI) protein IDs; our integration algorithm applied to multiple matches of peptide sequences yielded 9504 IPI proteins identified with one or more peptides and 3020 proteins identified with two or more peptides (the Core Dataset). These proteins have been characterized with Gene Ontology, InterPro, Novartis Atlas, OMIM, and immunoassay based concentration determinations. The database permits examination of many other subsets, such as 1274 proteins identified with three or more peptides. Reverse protein to DNA matching identified proteins for 118 previously unidentified ORFs. We recommend use of plasma instead of serum, with EDTA (or citrate) for anticoagulation. To improve resolution, sensitivity and reproducibility of peptide identifications and protein matches, we recommend combinations of depletion, fractionation, and MS/MS technologies, with explicit criteria for evaluation of spectra, use of search algorithms, and integration of homologous protein matches.
This Special Issue of PROTEOMICS presents papers integral to the collaborative analysis plus many reports of supplementary work on various aspects of the PPP workplan. These PPP results on complexity, dynamic range, incomplete sampling, false-positive matches, and integration of diverse datasets for plasma and serum proteins lay a foundation for development and validation of circulating protein biomarkers in health and disease.

Omenn, Gilbert; States, David J.; Adamski, Marcin; Blackwell, Thomas W.; Menon, Rajasree; Hermjakob, Henning; Apweiler, Rolf; Haab, Brian B.; Simpson, Richard; Eddes, James; Kapp, Eugene; Moritz, Rod; Chan, Daniel W.; Rai, Alex J.; Admon, Arie; Aebersold, Ruedi; Eng, Jimmy K.; Hancock, William S.; Hefta, Stanley A.; Meyer, Helmut; Paik, Young-Ki; Yoo, Jong-Shin; Ping, Peipei; Pounds, Joel G.; Adkins, Joshua N.; Qian, Xiaohong; Wang, Rong; Wasinger, Valerie; Wu, Chi Yue; Zhao, Xiaohang; Zeng, Rong; Archakov, Alexander; Tsugita, Akira; Beer, Ilan; Pandey, Akhilesh; Pisano, Michael; Andrews, Philip; Tammen, Harald; Speicher, David W.; Hanash, Samir M.

2005-08-13

174

Buffmap Database  

NSDL National Science Digital Library

The Institut National de Recherche Agronomique (France) provides online databases of genomes for several organisms, including buffalo. The database is updated as more data become available, and provides records on loci, genes, microsatellites, polymorphisms, probes, enzymes, homologue loci, alleles, primers, references, and links to external databases. Instructions for submitting data to the databases are provided on-site.

175

The Database of PCR Primers for Phytopathogenic Fungi  

Microsoft Academic Search

We present the first on-line searchable database of primer sets useful for the detection of plant pathogenic fungi. This web resource is implemented entirely with open-source software (PHP, MySQL). Primer set details can be retrieved by organism name, primer name, nucleotide sequence comparison, target DNA, PCR technique, author name, journal and year of publication. Each record is linked directly to

Stefano Ghignone; Quirico Migheli

2005-01-01

176

Public Databases Supporting Computational Toxicology  

EPA Science Inventory

A major goal of the emerging field of computational toxicology is the development of screening-level models that predict potential toxicity of chemicals from a combination of mechanistic in vitro assay data and chemical structure descriptors. In order to build these models, resea...

177

Hawaii bibliographic database  

NASA Astrophysics Data System (ADS)

The Hawaii bibliographic database has been created to contain all of the literature, from 1779 to the present, pertinent to the volcanological history of the Hawaiian-Emperor volcanic chain. References are entered in a PC- and Macintosh-compatible EndNote Plus bibliographic database with keywords and abstracts or (if no abstract) with annotations as to content. Keywords emphasize location, discipline, process, identification of new chemical data or age determinations, and type of publication. The database is updated approximately three times a year and is available to upload from an ftp site. The bibliography contained 8460 references at the time this paper was submitted for publication. Use of the database greatly enhances the power and completeness of library searches for anyone interested in Hawaiian volcanism.

Wright, Thomas L.; Takahashi, Taeko Jane

178

Public concern over genetic manipulation, Alexander Capron. Site: DNA Interactive (www.dnai.org)  

NSDL National Science Digital Library

The Frankenstein factor: Alexander Capron, a lawyer and specialist in bioethics, talks about how fear of Frankenstein captured the public fancy.

2008-10-06

179

Mycobacteriophage Genome Database  

PubMed Central

The Mycobacteriophage Genome Database (MGDB) is an exclusive repository of the 64 completely sequenced mycobacteriophages with annotated information. It is a comprehensive compilation of the various gene parameters captured from several databases pooled together to empower mycobacteriophage researchers. The MGDB (Version No.1.0) comprises 6086 genes from 64 mycobacteriophages classified into 72 families based on the ACLAME database. Manual curation was aided by information available from public databases, which was enriched further by analysis. Its web interface allows browsing as well as querying the classification. The main objective is to collect and organize the complexity inherent to mycobacteriophage protein classification in a rational way. The other objective is to browse the existing and new genomes and describe their functional annotation. Availability The database is available for free at http://mpgdb.ibioinformatics.org/mpgdb.php

Joseph, Jerrine; Rajendran, Vasanthi; Hassan, Sameer; Kumar, Vanaja

2011-01-01

180

Database of Zeolite Structures  

NSDL National Science Digital Library

The Database of Zeolite Structures is provided by the Structure Commission of the International Zeolite Association. Links include an Atlas of Zeolite Framework Types, Collection of Simulated XRD Powder Patterns for Zeolites, Catalog of Disordered Zeolite Structures, Schemes for Building Zeolite Framework Models, and Zeolite Structure References, as well as various publications. The database can be searched or browsed, and contains several useful tools such as the "input your data" link, which allows the user to enter crystallographic data not available in the database and generate the diffraction pattern.

2000-01-01

181

Human variation databases  

PubMed Central

More than 100 000 human genetic variations have been described in various genes that are associated with a wide variety of diseases. Such data provides invaluable information for both clinical medicine and basic science. A number of locus-specific databases have been developed to exploit this huge amount of data. However, the scope, format and content of these databases differ strongly and as no standard for variation databases has yet been adopted, the way data is presented varies enormously. This review aims to give an overview of current resources for human variation data in public and commercial resources.

Kuntzer, Jan; Eggle, Daniela; Klostermann, Stefan; Burtscher, Helmut

2010-01-01

182

FACILITY DATABASE  

Cancer.gov

January 2008 LASP FACILITY Database Form 5.000 Issue Reporting Form This form is used to report data and/or program related issues regarding the FACILITY database, Supplemental, or the LASP Online Access System. Before submitting this form,

183

Russian Regional Database  

NSDL National Science Digital Library

The Centre for the Study of Public Policy (CSPP) at the University of Strathclyde, Glasgow has created a new online database that presents a comparative overview of eleven Russian regions: Central, Central Black Earth, East Siberia, Far East, North Caucasus, Northern, North-west, Urals, Volga, Volgo-Vyatka, and West Siberia. The database provides data garnered from official government statistics and public opinion surveys to compare indicators that measure economic competitiveness; labor market change; and social cohesion, exclusion, and stability. The data are accompanied by analysis written by Professor Richard Rose.

1998-01-01

184

The EMBL nucleotide sequence database  

Microsoft Academic Search

The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/) is maintained at the European Bioinformatics Institute (EBI) in an international collaboration with the DNA Data Bank of Japan (DDBJ) and GenBank at the NCBI (USA). Data is exchanged amongst the collaborating databases on a daily basis. The major contributors to the EMBL database are individual authors and genome project groups. Webin is

Guenter Stoesser; Wendy Baker; Alexandra Van Den Broek; Evelyn Camon; Maria Garcia-pastor; Carola Kanz; Tamara Kulikova; Vincent Lombard; Rodrigo Lopez; Helen E. Parkinson; Nicole Redaschi; Peter Sterk; Peter Stoehr; Mary Ann Tuli

2001-01-01

185

Database of Mechanical Properties of Textile Composites.  

National Technical Information Service (NTIS)

This report describes the approach followed to develop a database for mechanical properties of textile composites. The data in this database is assembled from NASA Advanced Composites Technology (ACT) programs and from data in the public domain. This data...

J. Delbrey

1996-01-01

186

Annual Review of Database Development: 1992.  

ERIC Educational Resources Information Center

Reviews recent trends in databases and online systems. Topics discussed include new access points for established databases; acquisitions, consolidations, and competition between vendors; European coverage; international services; online reference materials, including telephone directories; political and legal materials and public records;…

Basch, Reva

1992-01-01

187

USDA NATIONAL NUTRIENT DATABASE FOR STANDARD REFERENCE  

EPA Science Inventory

The USDA Nutrient Database for Standard Reference (SR) is the major source of food composition data in the United States. It provides the foundation for most food composition databases in the public and private sectors....

188

Data recovery and integration from public databases uncovers transformation-specific transcriptional downregulation of cAMP-PKA pathway-encoding genes  

PubMed Central

Background The integration of data from multiple genome-wide assays is essential for understanding dynamic spatio-temporal interactions within cells. Such integration, which leads to a more complete view of cellular processes, offers the opportunity to rationalize better the high amount of "omics" data freely available in several public databases. In particular, integration of microarray-derived transcriptome data with other high-throughput analyses (genomic and mutational analysis, promoter analysis) may allow us to unravel transcriptional regulatory networks under a variety of physio-pathological situations, such as the alteration in the cross-talk between signal transduction pathways in transformed cells. Results Here we sequentially apply web-based and statistical tools to a case study: the role of oncogenic activation of different signal transduction pathways in the transcriptional regulation of genes encoding proteins involved in the cAMP-PKA pathway. To this end, we first re-analyzed available genome-wide expression data for genes encoding proteins of the downstream branch of the PKA pathway in normal tissues and human tumor cell lines. Then, in order to identify mutation-dependent transcriptional signatures, we classified cancer cells as a function of their mutational state. The results of such procedure were used as a starting point to analyze the structure of PKA pathway-encoding genes promoters, leading to identification of specific combinations of transcription factor binding sites, which are neatly consistent with available experimental data and help to clarify the relation between gene expression, transcriptional factors and oncogenes in our case study. Conclusions Genome-wide, large-scale "omics" experimental technologies give different, complementary perspectives on the structure and regulatory properties of complex systems. 
Even the relatively simple, integrated workflow presented here offers opportunities not only for filtering data noise intrinsic in high throughput data, but also to progressively extract novel information that would have remained hidden otherwise. In fact we have been able to detect a strong transcriptional repression of genes encoding proteins of cAMP/PKA pathway in cancer cells of different genetic origins. The basic workflow presented herein may be easily extended by incorporating other tools and can be applied even by researchers with poor bioinformatics skills.

Balestrieri, Chiara; Alberghina, Lilia; Vanoni, Marco; Chiaradonna, Ferdinando

2009-01-01

189

IOPI Database of Plant Databases  

NSDL National Science Digital Library

The International Organization for Plant Information (IOPI), a Commission of the International Union of Biological Sciences (IUBS), manages the Database of Plant Databases (DPD). The DPD is a global list of plant databases including Taxonomic databases ("with systematic information on families or genera, or for Flora projects"); Collection catalogs (usually of herbaria); and DELTA datasets (DELTA is "the Description Language for Taxonomy, a data format for character data, used for identification, key construction and the generation of descriptions."). The DPD may be searched using numerous specified fields, or it may be viewed in its entirety -- by Database Name, Host Name, or Host Country. Though bare bones in appearance, this extensive database contains a gold mine of information, with hundreds of hyperlinks to valuable plant databases.

190

Neogene Mammal Database  

NSDL National Science Digital Library

There are plenty of mammal databases that deal with existing taxonomies and such, but where can one go to find information about land mammal taxa from the Eurasian Miocene to Pleistocene periods? They need look no further than this fine website, which provides access to such material, courtesy of an international advisory board, headed by Mikael Fortelius of the University of Helsinki. First-time visitors may want to look over the data conventions section first. Here they can learn about the proper use of taxonomic fields, and also view such taxonomic examples from the database. Users may also want to look over the browsing section, as it serves as a good orientation to effectively using the database. For those engaged in scholarly endeavors related to the field, there is also a publications section that offers access to recent works that draw on this rather voluminous database.

191

Ovarian Kaleidoscope Database: Ten Years and Beyond  

PubMed Central

Ovarian Kaleidoscope database (OKdb) is an online, searchable, public database containing text-based and DNA microarray data to facilitate research by ovarian researchers. Using key words and predetermined categories, users can search ovarian gene information based on gene function, cell type of expression, cellular localization, hormonal regulation, mutant phenotypes, chromosomal location, ligand-receptor relationship, and other criteria, either alone or in combination. For individual genes, users can access more than 10 extensive DNA microarray datasets to interrogate gene expression patterns in a development-specific and cell type-specific manner. All ligand and receptor genes expressed in the ovary are matched to facilitate investigation of paracrine/autocrine signaling. More than 3500 ovarian genes in the database are matched to 185 gene pathways in the Kyoto Encyclopedia of Genes and Genomes to allow for elucidation of gene interactions and relationships. In addition to >400 genes with infertility or subfertility phenotypes when mutated in mice or humans, the OKdb also lists ~50 and ~40 genes associated with polycystic ovarian syndrome and primary ovarian insufficiency, respectively. The expanding OKdb is updated weekly and allows submission of new genes by ovarian researchers to allow instant access to DNA microarray datasets for newly submitted genes. The present database is a virtual community for ovarian researchers and allows users to instantaneously provide their comments for individual gene pages based on an automated Web-discussion system. In the coming years, we will continue to add new features to serve the ovarian research community.

Hsueh, Aaron J.; Rauch, Rami

2012-01-01

192

The National Ag Safety Database  

NSDL National Science Digital Library

The University of Florida originally made this directory, a subset of its National Agricultural Safety Database CD-ROM, available on the web. Since then the database has undergone several updates. The Directory "contains contact information on safety professionals and organizations throughout the US," and "health and safety publications from 32 states, 4 federal agencies and 5 national organizations" and can be browsed or searched. The creators of the database have gone to great lengths to improve this site over the years.

1995-01-01

193

The EMBL Nucleotide Sequence Database  

Microsoft Academic Search

The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/), maintained at the European Bioinformatics Institute (EBI), incorporates, organizes and distributes nucleotide sequences from public sources. The database is a part of an international collaboration with DDBJ (Japan) and GenBank (USA). Data are exchanged between the collaborating databases on a daily basis to achieve optimal synchrony. The web-based tool, Webin, is

Tamara Kulikova; Philippe Aldebert; Nicola Althorpe; Wendy Baker; Kirsty Bates; Paul Browne; Alexandra Van Den Broek; Guy Cochrane; Karyn Duggan; Ruth Eberhardt; Nadeem Faruque; Maria Garcia-pastor; Nicola Harte; Carola Kanz; Rasko Leinonen; Quan Lin; Vincent Lombard; Rodrigo Lopez; Renato Mancuso; Michelle Mchale; Francesco Nardone; Ville Silventoinen; Peter Stoehr; Guenter Stoesser; Mary Ann Tuli; Katerina Tzouvara; Robert Vaughan; Dan Wu; Weimin Zhu; Rolf Apweiler

2004-01-01

194

Early Stage Development of the Jetset Database  

NASA Astrophysics Data System (ADS)

We are developing an online database. Similar databases exist, such as The Youngest Protostars by Dirk Froebrich, from which we can take some ideas. That database features separate tables for each object, a few data for each observation and a link to the corresponding publication. Our current goals are to have user-entered data, a queryable database and separate tables for each object (source and outflow). The fields of the database are defined by Work Package managers.

Rammos, Periklis; Whelan, Emma T.; Gracia, Josié; Dudzinski, Stephane; Grange, Philippe

195

SoyDB: a knowledge database of soybean transcription factors  

PubMed Central

Background Transcription factors play the crucial role of regulating gene expression and influence almost all biological processes. Systematically identifying and annotating transcription factors can greatly aid further understanding of their functions and mechanisms. In this article, we present SoyDB, a user-friendly database containing comprehensive knowledge of soybean transcription factors. Description The soybean genome was recently sequenced by the Department of Energy-Joint Genome Institute (DOE-JGI) and is publicly available. Mining of this sequence identified 5,671 soybean genes as putative transcription factors. These genes were comprehensively annotated as an aid to the soybean research community. We developed SoyDB - a knowledge database for all the transcription factors in the soybean genome. The database contains protein sequences, predicted tertiary structures, putative DNA binding sites, domains, homologous templates in the Protein Data Bank (PDB), protein family classifications, multiple sequence alignments, consensus protein sequence motifs, web logo of each family, and web links to the soybean transcription factor database PlantTFDB, known EST sequences, and other general protein databases including Swiss-Prot, Gene Ontology, KEGG, EMBL, TAIR, InterPro, SMART, PROSITE, NCBI, and Pfam. The database can be accessed via an interactive and convenient web server, which supports full-text search, PSI-BLAST sequence search, database browsing by protein family, and automatic classification of a new protein sequence into one of 64 annotated transcription factor families by hidden Markov models. Conclusions A comprehensive soybean transcription factor database was constructed and made publicly accessible at http://casp.rnet.missouri.edu/soydb/.

2010-01-01

196

ASC Publications  

NSDL National Science Digital Library

The University of Wisconsin's Applied Superconductivity Center (ASC) has a publications page with a good selection of online papers on superconductivity. The page contains a searchable publications database, a list of all ASC papers online, and a list of papers supported by the National Science Foundation.

197

GaAs Reliability Database  

NASA Technical Reports Server (NTRS)

The database consists of two main sections, the data references and the device reliability records. The reference section contains 8 fields: reference number, date of publication, authors, article title, publisher, volume, and page numbers.

Sacco, T.; Gonzalez, S.; Kayali, S.

1993-01-01

198

Database Licensing: A Future View.  

ERIC Educational Resources Information Center

Access to database information in libraries will increase as licenses for tape loading of data onto public access catalogs become more widespread. Institutions with adequate storage capacity will have full text databases, and the adoption of the Z39.50 standard, which allows differing computer systems to interface with each other, will increase…

Flanagan, Michael

1993-01-01

199

Glycoproteomic and glycomic databases.  

PubMed

Protein glycosylation serves critical roles in the cellular and biological processes of many organisms. Aberrant glycosylation has been associated with many illnesses such as hereditary and chronic diseases like cancer, cardiovascular diseases, neurological disorders, and immunological disorders. Emerging mass spectrometry (MS) technologies that enable the high-throughput identification of glycoproteins and glycans have accelerated the analysis and made possible the creation of dynamic and expanding databases. Although glycosylation-related databases have been established by many laboratories and institutions, they are not yet widely known in the community. Our study reviews 15 different publicly available databases and identifies their key elements so that users can identify the most applicable platform for their analytical needs. These databases include biological information on the experimentally identified glycans and glycopeptides from various cells and organisms such as human, rat, mouse, fly and zebrafish. The features of these databases (7 for glycoproteomic data, 6 for glycomic data, and 2 for glycan binding proteins) are summarized, including the enrichment techniques that are used for glycoproteome and glycan identification. Furthermore, databases such as Unipep, GlycoFly, and GlycoFish, recently established by our group, are introduced. The unique features of each database, such as the analytical methods used and bioinformatical tools available, are summarized. This information will be a valuable resource for the glycobiology community as it presents the analytical methods and glycosylation-related databases together in one compendium. It will also represent a step towards the desired long-term goal of integrating the different databases of glycosylation in order to characterize and categorize glycoproteins and glycans better for biomedical research. PMID:24725457

Baycin Hizal, Deniz; Wolozny, Daniel; Colao, Joseph; Jacobson, Elena; Tian, Yuan; Krag, Sharon S; Betenbaugh, Michael J; Zhang, Hui

2014-01-01

200

Glycoproteomic and glycomic databases  

PubMed Central

Protein glycosylation serves critical roles in the cellular and biological processes of many organisms. Aberrant glycosylation has been associated with many illnesses such as hereditary and chronic diseases like cancer, cardiovascular diseases, neurological disorders, and immunological disorders. Emerging mass spectrometry (MS) technologies that enable the high-throughput identification of glycoproteins and glycans have accelerated the analysis and made possible the creation of dynamic and expanding databases. Although glycosylation-related databases have been established by many laboratories and institutions, they are not yet widely known in the community. Our study reviews 15 different publicly available databases and identifies their key elements so that users can identify the most applicable platform for their analytical needs. These databases include biological information on the experimentally identified glycans and glycopeptides from various cells and organisms such as human, rat, mouse, fly and zebrafish. The features of these databases (7 for glycoproteomic data, 6 for glycomic data, and 2 for glycan binding proteins) are summarized, including the enrichment techniques that are used for glycoproteome and glycan identification. Furthermore, databases such as Unipep, GlycoFly, and GlycoFish, recently established by our group, are introduced. The unique features of each database, such as the analytical methods used and bioinformatical tools available, are summarized. This information will be a valuable resource for the glycobiology community as it presents the analytical methods and glycosylation-related databases together in one compendium. It will also represent a step towards the desired long-term goal of integrating the different databases of glycosylation in order to characterize and categorize glycoproteins and glycans better for biomedical research.

2014-01-01

201

Database Manager  

ERIC Educational Resources Information Center

It is normal practice today for organizations to store large quantities of records of related information as computer-based files or databases. Purposeful information is retrieved by performing queries on the data sets. The purpose of DATABASE MANAGER is to communicate to students the method by which the computer performs these queries. This…

Martin, Andrew

2010-01-01

202

NSFC Databases  

NSDL National Science Digital Library

The National Environmental Services Center (NESC) is based at West Virginia University and "serves as a clearinghouse for information about drinking water, wastewater, environmental training, and solid waste management in communities serving fewer than 10,000 individuals." As part of the larger NSFC Web site, the Databases page offers three online databases that can be accessed free after an initial registration. The Regulations Database contains copies of regulations for onsite wastewater treatment systems in 48 states, the Bibliographic Database stores thousands of articles dealing with onsite and small community wastewater issues, and the Manufacturers and Consultants Database houses a list of industry contacts for wastewater products and consulting services. Much more is available within the larger NSFC site, and readers are encouraged to take a look through its contents.

2001-01-01

203

The Histone Database.  

PubMed

Histone proteins are often noted for their high degree of sequence conservation. It is less often recognized that the histones are a heterogeneous protein family. Furthermore, several classes of non-histone proteins containing the histone fold motif exist. Novel histone and histone fold protein sequences continue to be added to public databases every year. The Histone Database (http://genome.nhgri.nih.gov/histones/) is a searchable, periodically updated collection of histone fold-containing sequences derived from sequence-similarity searches of public databases. Sequence sets are presented in redundant and non-redundant FASTA form, hotlinked to GenBank sequence files. Partial sequences are also now included in the database, which has considerably augmented its taxonomic coverage. Annotated alignments of full-length non-redundant sets of sequences are now available in both web-viewable (HTML) and downloadable (PDF) formats. The database also provides summaries of current information on solved histone fold structures, post-translational modifications of histones, and the human histone gene complement. PMID:11752331

Sullivan, Steven; Sink, Daniel W; Trout, Kenneth L; Makalowska, Izabela; Taylor, Patrick M; Baxevanis, Andreas D; Landsman, David

2002-01-01
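
The abstract above mentions sequence sets offered in both redundant and non-redundant FASTA form. As a rough illustration only, the sketch below derives a non-redundant set by dropping exact duplicate sequences. The exact-match criterion, the function names, and the example entries are all assumptions made for illustration; the Histone Database's own redundancy criterion is not described in the abstract.

```python
def parse_fasta(text):
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def non_redundant(records):
    """Keep the first record for each distinct sequence (case-insensitive).

    Assumption: "redundant" means an exact sequence match.
    """
    seen = set()
    unique = []
    for header, seq in records:
        key = seq.upper()
        if key not in seen:
            seen.add(key)
            unique.append((header, seq))
    return unique


# Hypothetical entries, invented for this example.
EXAMPLE = """>seq1 hypothetical entry
ARTKQTARKS
TGGKAPRKQL
>seq2 hypothetical duplicate of seq1
ARTKQTARKSTGGKAPRKQL
>seq3 hypothetical distinct entry
SGRGKGGKGL
"""

records = list(parse_fasta(EXAMPLE))
unique = non_redundant(records)
for header, seq in unique:
    print(f">{header}\n{seq}")
```

Real collections often cluster by similarity threshold rather than exact identity, so this should be read as the simplest possible variant of the idea.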

204

Fit-for-purpose curated database application in mass spectrometry-based targeted protein identification and validation  

PubMed Central

Background Mass spectrometry (MS) is a very sensitive and specific method for protein identification, biomarker discovery, and biomarker validation. Protein identification is commonly carried out by comparing MS data with public databases. However, with the development of high throughput and accurate genomic sequencing technology, public databases are being overwhelmed with new entries from different species every day. The application of these databases can also be problematic due to factors such as size, specificity, and unharmonized annotation of the molecules of interest. Current databases representing liquid chromatography-tandem mass spectrometry (LC-MS/MS)-based searches focus on enzyme digestion patterns and sequence information and consequently, important functional information can be missed within the search output. Protein variants displaying similar sequence homology can interfere with database identification when only certain homologues are examined. In addition, recombinant DNA technology can result in products that may not be accurately annotated in public databases. Curated databases, which focus on the molecule of interest with clearer functional annotation and sequence information, are necessary for accurate protein identification and validation. Here, four cases of curated database application have been explored and summarized. Findings The four presented curated databases were constructed with clear goals regarding application and have proven very useful for targeted protein identification and biomarker application in different fields. They include a sheeppox virus database created for accurate identification of proteins with strong antigenicity, a custom database containing clearly annotated protein variants such as tau transcript variant 2 for accurate biomarker identification, a sheep-hamster chimeric prion protein (PrP) database constructed for assay development of prion diseases, and a custom Escherichia coli (E. coli) flagella (H antigen) database produced for MS-H, a new H-typing technique. Clearly annotating the proteins of interest was essential for highly accurate, specific, and sensitive sequence identification, and searching against public databases resulted in inaccurate identification of the sequence of interest, while combining the curated database with a public database reduced both the confidence and sequence coverage of the protein search. Conclusion Curated protein sequence databases incorporating clear annotations are very useful for accurate protein identification and fit-for-purpose application through MS-based biomarker validation.

2014-01-01

205

ESTuber db: an online database for Tuber borchii EST sequences  

PubMed Central

Background The ESTuber database () includes 3,271 Tuber borchii expressed sequence tags (EST). The dataset consists of 2,389 sequences from an in-house prepared cDNA library from truffle vegetative hyphae, and 882 sequences downloaded from GenBank and representing four libraries from white truffle mycelia and ascocarps at different developmental stages. An automated pipeline was prepared to process EST sequences using public software integrated by in-house developed Perl scripts. Data were collected in a MySQL database, which can be queried via a php-based web interface. Results Sequences included in the ESTuber db were clustered and annotated against three databases: the GenBank nr database, the UniProtKB database and a third in-house prepared database of fungi genomic sequences. An algorithm was implemented to infer statistical classification among Gene Ontology categories from the ontology occurrences deduced from the annotation procedure against the UniProtKB database. Ontologies were also deduced from the annotation of more than 130,000 EST sequences from five filamentous fungi, for intra-species comparison purposes. Further analyses were performed on the ESTuber db dataset, including tandem repeats search and comparison of the putative protein dataset inferred from the EST sequences to the PROSITE database for protein patterns identification. All the analyses were performed both on the complete sequence dataset and on the contig consensus sequences generated by the EST assembly procedure. Conclusion The resulting web site is a resource of data and links related to truffle expressed genes. The Sequence Report and Contig Report pages are the web interface core structures which, together with the Text search utility and the Blast utility, allow easy access to the data stored in the database.

Lazzari, Barbara; Caprera, Andrea; Cosentino, Cristian; Stella, Alessandra; Milanesi, Luciano; Viotti, Angelo

2007-01-01

206

DFCI Gene Index Project: Genomic Databases for Plants, Animals, Protist, and Fungi from the Dana-Farber Cancer Institute  

DOE Data Explorer

The DFCI Gene Index Project creates databases for specific organisms. The goal for these databases is to provide an analysis of publicly available Expressed Sequence Tag (EST) and gene sequence data to identify transcripts. ESTs are fragments of genes that were, at some time, copied from DNA to RNA. The databases are in zipped files and free for download. The website also provides software and tools for use with the data, along with instructions on how to link to background resources. The Gene Indices are organized into four categories: Animals, Plants, Protist, and Fungi.

207

ProtNA-ASA: Protein-nucleic acid structural database with information on accessible surface area  

NASA Astrophysics Data System (ADS)

The article describes a new database (ProtNA-ASA), which combines the data on conformational parameters of nucleic acids and calculations of the accessible surface area (ASA) of nucleic acid atoms in protein-DNA/RNA complexes. As of October 2008, the database contains 214 DNA-protein and 28 RNA-protein non-homologous complexes. The database provides structural parameters that describe local geometry of base pairs and base-pair steps as well as backbone torsion angles. Additionally, total ASA of DNA/RNA atoms and the accessible area of atoms in the minor and major grooves are calculated. The ProtNA-ASA database facilitates studying the relationship between the DNA/RNA conformation and availability of atoms for contact with proteins either in major or in minor groove for different nucleotides. Such an analysis is important for understanding the principles of molecular recognition including indirect sequence readout. The database is publicly available for use at http://www.protna.bio-page.org.

Tkachenko, M. Y.; Boryskina, O. P.; Shestopalova, A. V.; Tolstorukov, M. Y.
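
The abstract above lists backbone torsion angles among the structural parameters the database provides. As a minimal sketch of what such a parameter is, the code below computes a signed torsion (dihedral) angle from four atom positions using the standard atan2 formula. The function name and the coordinates are hypothetical, and nothing here is taken from ProtNA-ASA's own software, which the abstract does not describe.

```python
import math


def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def _cross(a, b):
    return (
        a[1] * b[2] - a[2] * b[1],
        a[2] * b[0] - a[0] * b[2],
        a[0] * b[1] - a[1] * b[0],
    )


def torsion_angle(p0, p1, p2, p3):
    """Signed torsion angle in degrees about the p1-p2 bond.

    0 degrees is the cis arrangement and 180 degrees is trans; the sign is
    positive when the p2-p3 bond is rotated clockwise relative to the
    p1-p0 bond, looking from p1 towards p2.
    """
    b0 = _sub(p1, p0)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    n1 = _cross(b0, b1)  # normal to the plane through p0, p1, p2
    n2 = _cross(b1, b2)  # normal to the plane through p1, p2, p3
    b1_len = math.sqrt(_dot(b1, b1))
    y = b1_len * _dot(b0, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))


# Hypothetical coordinates chosen so the answers are easy to check by eye.
cis = torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (1, 0, 1))
gauche = torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (0, 1, 1))
trans = torsion_angle((1, 0, 0), (0, 0, 0), (0, 0, 1), (-1, 0, 1))
print(cis, gauche, trans)
```

For a nucleic acid backbone, the four points would be consecutive backbone atoms read from a structure file; choosing which atoms define each named torsion is outside the scope of this sketch.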

208

Marmal-aid - a database for Infinium HumanMethylation450  

PubMed Central

Background DNA methylation is indispensable for normal human genome function. Currently an increasingly large amount of DNA methylomic data is being released in the public domain, allowing for an opportunity to investigate the relationships between the DNA methylome, genome function, and human phenotypes. The Illumina450K is one of the most popular platforms for assessing DNA methylation, with over 10,000 samples available in the public domain. However, accessing all this data requires downloading each individual experiment and, due to inconsistent annotation, accessing the right data can be a challenge. Description Here we introduce ‘Marmal-aid’, the first standardised database for DNA methylation (freely available at http://marmal-aid.org). In Marmal-aid, the majority of publicly available Illumina HumanMethylation450 data is incorporated into a single repository allowing for re-processing of data including normalisation and imputation of missing values. The database is accessible in two ways: (1) Using an R package to allow for incorporation into existing analysis pipelines which can then be easily queried to gain insight into the functionality of certain CpG sites. This is aimed at a bioinformatician with experience in R. (2) Using a graphical interface allowing general biologists to query a pre-defined set of tissues (currently 15) providing a reference database of the methylation state in these tissues for the 450,000 CpG sites profiled by the Illumina HumanMethylation450. Conclusion Marmal-aid is the largest publicly available Illumina HumanMethylation450 methylation database, combining Illumina HumanMethylation450 data from a number of sources into a single location with a single common annotation format. This allows for automated extraction using the R package and inclusion into existing analysis pipelines. Marmal-aid also provides an easy-to-use GUI to visualise methylation data in user-defined genomic regions for various reference tissues.

2013-01-01
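
The abstract above mentions imputation of missing values during re-processing. The sketch below shows one very simple way such a step could look: filling each missing methylation value with the mean of the observed values for the same probe. This per-probe mean is a stand-in chosen for illustration; the abstract does not say which imputation method Marmal-aid uses, and the function name and data are hypothetical.

```python
def impute_row_means(beta_matrix):
    """Fill None entries in each probe row with that row's observed mean.

    Rows are probes (CpG sites) and columns are samples. A row with no
    observed values is returned unchanged, since there is nothing to
    estimate from.
    """
    imputed = []
    for row in beta_matrix:
        observed = [v for v in row if v is not None]
        if not observed:
            imputed.append(list(row))
            continue
        mean = sum(observed) / len(observed)
        imputed.append([mean if v is None else v for v in row])
    return imputed


# Hypothetical methylation (beta) values: 3 probes x 3 samples.
betas = [
    [0.10, None, 0.30],
    [0.80, 0.90, None],
    [None, None, None],
]
filled = impute_row_means(betas)
print(filled)
```

More careful approaches (for example, nearest-neighbour imputation across similar probes or samples) trade this simplicity for better estimates.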

209

BIOMARKERS DATABASE  

EPA Science Inventory

This database was developed by assembling and evaluating the literature relevant to human biomarkers. It catalogues and evaluates the usefulness of biomarkers of exposure, susceptibility and effect which may be relevant for a longitudinal cohort study. In addition to describing ...

210

Ant Database  

NSDL National Science Digital Library

Research entomologist Ted Schultz from the Smithsonian Institution maintains this impressive work in progress. This online database represents the Smithsonian's identified ant collection, including 4,580 valid named species or subspecies. The taxonomy is current with Bolton's 1995 catalog and includes reported holdings through June 1998. The database may be queried by Subfamily, Tribe, Genus, Subgenus, Species, Subspecies, Author, or Types, and typical returns give concise taxonomic information, total specimens (workers, females, and males), author, and year.

Schultz, Ted.

2000-01-01

211

Beyond the cold hit: measuring the impact of the national DNA data bank on public safety at the city and county level.  

PubMed

Over the past decade, the Combined DNA Index System (CODIS) has increased solvability of violent crimes by linking evidence DNA profiles to known offenders. At present, an in-depth analysis of the United States National DNA Data Bank effort has not assessed the success of this national public safety endeavor. Critics of this effort often focus on laboratory and police investigators unable to provide timely investigative support as a root cause(s) of CODIS' failure to increase public safety. By studying a group of nearly 200 DNA cold hits obtained in SFPD criminal investigations from 2001-2006, three key performance metrics (Significance of Cold Hits, Case Progression & Judicial Resolution, and Potential Reduction of Future Criminal Activity) provide a proper context in which to define the impact of CODIS at the City and County level. Further, the analysis of a recidivist group of cold hit offenders and their past interaction with law enforcement established five noteworthy criminal case resolution trends; these trends signify challenges to CODIS in achieving meaningful case resolutions. CODIS' effectiveness and critical activities to support case resolutions are the responsibility of all criminal justice partners in order to achieve long-lasting public safety within the United States. PMID:20579236

Gabriel, Matthew; Boland, Cherisse; Holt, Cydne

2010-01-01

212

Annual Review of Database Developments 1991.  

ERIC Educational Resources Information Center

Review of developments in databases highlights a new emphasis on accessibility. Topics discussed include the internationalization of databases; databases that deal with finance, drugs, and toxic waste; access to public records, both personal and corporate; media online; reducing large files of data to smaller, more manageable files; and…

Basch, Reva

1991-01-01

213

BIND: A Biomedical INteroperable Database System  

Microsoft Academic Search

We describe a Biomedical INteroperable Database System (BIND), which is part of a larger research project, the Biomedical Equipment Assessment and Management (BEAM) system. BEAM is an AIM (Advanced Informatics in Medicine) awarded project and it is funded by the European Community. It integrates a number of heterogeneous database systems which store Biomedical information, over public communications networks. The databases

Catherine E. Houstis; Theodore S. Papatheodorou; Vassilios S. Verykios; Aris Floratos; Ahmed K. Elmagarmid

1994-01-01

214

Hospital Records Database  

NSDL National Science Digital Library

This new joint project from the Wellcome Trust and the UK Public Record Office helps researchers locate records of hospitals all over the UK. The database currently contains over 2,800 entries and may be searched by hospital or town name. Information contained in the database includes administrative details of the hospitals, location and covering dates of administrative and clinical records, and the existence of lists, catalogs or other finding aids. A sample search for "royal" under hospital name returned 210 records, and one for "Manchester" under town name produced 124 returns. While the target audience of this database -- researchers in British medical history -- is rather specialized, this new resource will prove extremely useful for these scholars and their students.

215

Fast decision tree-based method to index large DNA-protein sequence databases using hybrid distributed-shared memory programming model.  

PubMed

In recent times, biological databases have grown significantly, along with the number of users and the rate of queries, and some databases have reached terabyte size. There is therefore an increasing need to access databases as quickly as possible. In this paper, the decision tree indexing model (PDTIM) was parallelised using a hybrid of distributed and shared memory on a resident database, with horizontal and vertical growth through the Message Passing Interface (MPI) and POSIX Threads (PThreads), to reduce the index building time. The PDTIM was implemented using 1, 2, 4 and 5 processors on 1, 2, 3 and 4 threads respectively. The results show that the hybrid technique improved the speedup compared to a sequential version. It can be concluded from the results that the proposed PDTIM is appropriate for large data sets in terms of index building time. PMID:24794073

Jaber, Khalid Mohammad; Abdullah, Rosni; Rashid, Nur'aini Abdul

2014-01-01
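
The abstract above reports that the hybrid technique improved speedup over a sequential version. For readers unfamiliar with the metric, the sketch below computes speedup and parallel efficiency under their conventional definitions, treating each process-thread combination as one worker. The timings are hypothetical numbers invented for the example and are not results from the paper.

```python
def speedup(sequential_seconds, parallel_seconds):
    """Conventional speedup: sequential run time divided by parallel run time."""
    if sequential_seconds <= 0 or parallel_seconds <= 0:
        raise ValueError("timings must be positive")
    return sequential_seconds / parallel_seconds


def efficiency(sequential_seconds, parallel_seconds, processes, threads_per_process):
    """Speedup per worker, where workers = processes * threads per process."""
    workers = processes * threads_per_process
    if workers <= 0:
        raise ValueError("need at least one worker")
    return speedup(sequential_seconds, parallel_seconds) / workers


# Hypothetical timings: 120 s sequential, 20 s on 4 processes x 2 threads.
s = speedup(120.0, 20.0)
e = efficiency(120.0, 20.0, processes=4, threads_per_process=2)
print(f"speedup = {s:.2f}, efficiency = {e:.2f}")
```

An efficiency below 1 is expected in practice, since communication between processes and synchronisation between threads both cost time.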

216

Identification of RNA editing sites in the SNP database.  

PubMed

The relationship between human inherited genomic variations and phenotypic differences has been the focus of much research effort in recent years. These studies benefit from millions of single-nucleotide polymorphism (SNP) records available in public databases, such as dbSNP. The importance of identifying false dbSNP records increases with the growing role played by SNPs in linkage analysis for disease traits. In particular, the emerging understanding of the abundance of DNA and RNA editing calls for a careful distinction between inherited SNPs and somatic DNA and RNA modifications. In order to demonstrate that some of the SNP database records are actually somatic modifications, we focus on one type of these modifications, namely A-to-I RNA editing, and present evidence for hundreds of dbSNP records that are actually editing sites. We provide a list of 102 RNA editing sites previously annotated in the dbSNP database as SNPs, and experimentally validate seven of these. Interestingly, we show how dbSNP can serve as a starting point to look for new editing sites. Our results, for this particular type of RNA editing, demonstrate the need for a careful analysis of SNP databases in light of the increasing recognition of the significance of somatic sequence modifications. PMID:16100382

Eisenberg, Eli; Adamsky, Konstantin; Cohen, Lital; Amariglio, Ninette; Hirshberg, Abraham; Rechavi, Gideon; Levanon, Erez Y

2005-01-01

217

National Tourism Database  

NSDL National Science Digital Library

Developed by the Michigan State University Extension Tourism Area of Expertise and the National Tourism Education Design Team, this site contains information on numerous resources related to tourism education, including bulletins, research reports, videos, and training programs. Nearly 100 of the documents featured are full-text. Users can browse the database by topic or browse or search by keyword. A separate list of the full-text publications is also provided. A useful site for students and professionals in the tourism industry.

218

The Molecular Biology Database Collection: 2005 update  

PubMed Central

The Nucleic Acids Research Molecular Biology Database Collection is a public online resource that lists the databases described in this and previous issues of Nucleic Acids Research together with other databases of value to the biologist and available throughout the world. All databases included in this Collection are freely available to the public. The 2005 update includes 719 databases, 171 more than the 2004 one. The databases are organized in a hierarchical classification that simplifies the process of finding the right database for any given task. The growing number of databases related to immunology, plant and organelle research have been accommodated by separating them into three new categories. The database summaries provide brief descriptions of the databases, contact details, appropriate references and acknowledgements. The online summaries also serve as a venue for the maintainers of each database to introduce database updates and other improvements in the scope and tools. These updates are particularly important for those databases that have not been described in print in the recent past. The database list and summaries are available online at the Nucleic Acids Research web site, http://nar.oupjournals.org/.

Galperin, Michael Y.

2005-01-01

219

INVADERS Database  

NSDL National Science Digital Library

Based at the University of Montana and directed by Dr. Peter Rice, the INVADERS Database is "a comprehensive database of exotic plant names and weed distribution records for five states in the northwestern United States." Designed for use by land management and weed regulatory agencies, INVADERS uses a query interface (plant name or location) to sort and display information. Data are updated regularly so as to increase the chance of detecting and halting the rapid spread of alien weeds. Highlights of the site include the noxious weed listings for all US states and six Canadian provinces, historic distribution records against which to compare current plant distributions, and summary statistics such as the number of invasive species detected per state or a summary of the 120-year invasion, among others. The INVADERS database will prove both interesting and useful to managers and academics alike.

1997-01-01

220

Health service use in families where children enter public care: a nested case control study using the General Practice Research Database  

PubMed Central

Background At least 3% of children spend some of their childhood in public care and, as a group, have poor outcomes across a range of education, employment, health and social care outcomes. Research, using social care or government datasets, has identified a number of risk factors associated with children entering public care but the utility of risk factors in clinical practice is not established. This paper uses routine primary health care data to see if risk factors for children entering public care can be identified in clinical practice. Methods A nested case control methodology using routine primary care data from the United Kingdom. Health service use data were extracted for the 12 months before the case child entered public care and compared with 12 months of data for four control mother child pairs per case pair, matched on the age and sex of the child and the general practice. Exposures of interest were developed from a systematic review of the literature on risk factors associated with children entering public care. Results Conditional logistic regression was used to investigate the combined effect of more than one exposure of interest. Maternal mental illness (OR 2.51, 95% CI 1.55-4.05), maternal age at birth of the child, socio-economic status (5th quintile vs. 1st quintile OR 7.14, 95% CI 2.92-17.4), maternal drug use (OR 28.8, 95% CI 2.29-363), non attendance at appointments (OR 2.42, 95% CI 1.42-4.14), child mental illness (OR 2.65, 95% CI 1.42-4.96) and child admission to hospital (OR 3.31, 95% CI 1.21-9.02) were all significantly associated with children entering public care. Maternal use of primary care contraception services was negatively associated with children entering public care (OR 0.52, 95% CI 0.31-0.87). Conclusions Differences in health service use can be identified from routine primary care data in mother child pairs where children enter public care after controlling for maternal age and socio-economic status. 
The interaction between different risk factors needs testing in a cumulative risk model using longitudinal datasets.
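The odds ratios and intervals quoted in this abstract can be sanity-checked with a little arithmetic. The sketch below is not the authors' analysis; it assumes the reported 95% intervals are Wald-type (symmetric on the log scale), which the abstract does not state. Under that assumption, the geometric midpoint of each interval should sit close to the reported OR, and the interval width gives the implied standard error of log(OR). The function names are hypothetical.

```python
import math

def se_from_ci(lower, upper, z=1.96):
    """Standard error of log(OR) implied by a 95% CI (assumes a Wald-type interval)."""
    return (math.log(upper) - math.log(lower)) / (2 * z)

def ci_midpoint(lower, upper):
    """Geometric midpoint of the interval; near the OR if the CI is symmetric in logs."""
    return math.sqrt(lower * upper)

# (reported OR, lower bound, upper bound), copied from the abstract above
reported = {
    "maternal mental illness": (2.51, 1.55, 4.05),
    "maternal drug use": (28.8, 2.29, 363),
    "contraception services": (0.52, 0.31, 0.87),
}

for name, (odds_ratio, lo, hi) in reported.items():
    print(f"{name}: OR={odds_ratio}, "
          f"midpoint={ci_midpoint(lo, hi):.2f}, "
          f"SE(log OR)={se_from_ci(lo, hi):.2f}")
```

The very wide interval for maternal drug use translates into a much larger implied standard error than for the other exposures, which suggests that estimate rests on few exposed pairs.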

2012-01-01

221

Solubility Database  

National Institute of Standards and Technology Data Gateway

SRD 106 IUPAC-NIST Solubility Database (Web, free access)   These solubilities are compiled from 18 volumes of the International Union of Pure and Applied Chemistry (IUPAC)-NIST Solubility Data Series. The database includes liquid-liquid, solid-liquid, and gas-liquid systems. Typical solvents and solutes include water, seawater, heavy water, inorganic compounds, and a variety of organic compounds such as hydrocarbons, halogenated hydrocarbons, alcohols, acids, esters and nitrogen compounds. There are over 67,500 solubility measurements and over 1800 references.

222

MetaBase--the wiki-database of biological databases.  

PubMed

Biology is generating more data than ever. As a result, there is an ever increasing number of publicly available databases that analyse, integrate and summarize the available data, providing an invaluable resource for the biological community. As this trend continues, there is a pressing need to organize, catalogue and rate these resources, so that the information they contain can be most effectively exploited. MetaBase (MB) (http://MetaDatabase.Org) is a community-curated database containing more than 2000 commonly used biological databases. Each entry is structured using templates and can carry various user comments and annotations. Entries can be searched, listed, browsed or queried. The database was created using the same MediaWiki technology that powers Wikipedia, allowing users to contribute on many different levels. The initial release of MB was derived from the content of the 2007 Nucleic Acids Research (NAR) Database Issue. Since then, approximately 100 databases have been manually collected from the literature, and users have added information for over 240 databases. MB is synchronized annually with the static Molecular Biology Database Collection provided by NAR. To date, there have been 19 significant contributors to the project; each one is listed as an author here to highlight the community aspect of the project. PMID:22139927

Bolser, Dan M; Chibon, Pierre-Yves; Palopoli, Nicolas; Gong, Sungsam; Jacob, Daniel; Del Angel, Victoria Dominguez; Swan, Dan; Bassi, Sebastian; González, Virginia; Suravajhala, Prashanth; Hwang, Seungwoo; Romano, Paolo; Edwards, Rob; Bishop, Bryan; Eargle, John; Shtatland, Timur; Provart, Nicholas J; Clements, Dave; Renfro, Daniel P; Bhak, Daeui; Bhak, Jong

2012-01-01

223

NASA Scientific and Technical Publications: A Catalog of Special Publications, Reference Publications, Conference Publications, and Technical Papers, 1991-1992.  

National Technical Information Service (NTIS)

This catalog lists 458 citations of all NASA Special Publications, NASA Reference Publications, NASA Conference Publications, and NASA Technical Papers that were entered into the NASA Scientific and Technical Information database during accession year 199...

1993-01-01

224

NASA Scientific and Technical Publications: A Catalog of Special Publications, Reference Publications, Conference Publications, and Technical Papers, 1989.  

National Technical Information Service (NTIS)

This catalog lists 190 citations of all NASA Special Publications, NASA Reference Publications, NASA Conference Publications, and NASA Technical Papers that were entered into the NASA scientific and technical information database during accession year 198...

1990-01-01

225

Publication Bias in Antipsychotic Trials: An Analysis of Efficacy Comparing the Published Literature to the US Food and Drug Administration Database  

PubMed Central

Background Publication bias compromises the validity of evidence-based medicine, yet a growing body of research shows that this problem is widespread. Efficacy data from drug regulatory agencies, e.g., the US Food and Drug Administration (FDA), can serve as a benchmark or control against which data in journal articles can be checked. Thus one may determine whether publication bias is present and quantify the extent to which it inflates apparent drug efficacy. Methods and Findings FDA Drug Approval Packages for eight second-generation antipsychotics—aripiprazole, iloperidone, olanzapine, paliperidone, quetiapine, risperidone, risperidone long-acting injection (risperidone LAI), and ziprasidone—were used to identify a cohort of 24 FDA-registered premarketing trials. The results of these trials according to the FDA were compared with the results conveyed in corresponding journal articles. The relationship between study outcome and publication status was examined, and effect sizes derived from the two data sources were compared. Among the 24 FDA-registered trials, four (17%) were unpublished. Of these, three failed to show that the study drug had a statistical advantage over placebo, and one showed the study drug was statistically inferior to the active comparator. Among the 20 published trials, the five that were not positive, according to the FDA, showed some evidence of outcome reporting bias. However, the association between trial outcome and publication status did not reach statistical significance. Further, the apparent increase in the effect size point estimate due to publication bias was modest (8%) and not statistically significant. On the other hand, the effect size for unpublished trials (0.23, 95% confidence interval 0.07 to 0.39) was less than half that for the published trials (0.47, 95% confidence interval 0.40 to 0.54), a difference that was significant. 
Conclusions The magnitude of publication bias found for antipsychotics was less than that found previously for antidepressants, possibly because antipsychotics demonstrate superiority to placebo more consistently. Without increased access to regulatory agency data, publication bias will continue to blur distinctions between effective and ineffective drugs.
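The abstract reports that the effect size for unpublished trials (0.23, 95% CI 0.07 to 0.39) differed significantly from that for published trials (0.47, 95% CI 0.40 to 0.54). The sketch below is a rough illustration of how such a comparison can be checked from the two intervals alone. It is not the authors' method: it assumes the two estimates are independent and approximately normal, and it recovers standard errors from the interval widths. The function names are hypothetical.

```python
import math

Z95 = 1.96  # normal quantile for a two-sided 95% interval

def se_from_ci(lower, upper, z=Z95):
    """Standard error implied by a symmetric 95% confidence interval."""
    return (upper - lower) / (2 * z)

def z_for_difference(est_a, ci_a, est_b, ci_b):
    """z statistic for est_a - est_b, assuming independent normal estimates."""
    se_a = se_from_ci(*ci_a)
    se_b = se_from_ci(*ci_b)
    return (est_a - est_b) / math.sqrt(se_a ** 2 + se_b ** 2)

# Published vs. unpublished effect sizes, as quoted in the abstract
z = z_for_difference(0.47, (0.40, 0.54), 0.23, (0.07, 0.39))
print(f"z = {z:.2f}; exceeds 1.96: {abs(z) > Z95}")
```

Under these assumptions the z statistic exceeds 1.96, in line with the abstract's statement that the difference was significant.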

Turner, Erick H.; Knoepflmacher, Daniel; Shapley, Lee

2012-01-01

226

The Human Intermediate Filament Database: comprehensive information on a gene family involved in many human diseases.  

PubMed

We describe a revised and expanded database on human intermediate filament proteins, a major component of the eukaryotic cytoskeleton. The family of 70 intermediate filament genes (including those encoding keratins, desmins, and lamins) is now known to be associated with a wide range of diseases, comprising at least 72 distinct human pathologies, including skin blistering, muscular dystrophy, cardiomyopathy, premature aging syndromes, neurodegenerative disorders, and cataract. To date, the database catalogs 1,274 manually-curated pathogenic sequence variants and 170 allelic variants in intermediate filament genes from over 459 peer-reviewed research articles. Unrelated cases were collected from all of the six sequence homology groups and the sequence variations were described at cDNA and protein levels with links to the related diseases and reference articles. The mutations and polymorphisms are presented in parallel with data on protein structure, gene, and chromosomal location and basic information on associated diseases. Detailed statistics relating to the variant records in the database are displayed by homology group, mutation type, affected domain, associated diseases, and nucleic and amino acid substitutions. Multiple sequence alignment algorithms can be run from queries to determine DNA or protein sequence conservation. Literature sources can be interrogated within the database and external links are provided to public databases. The database is freely and publicly accessible online at www.interfil.org (last accessed 13 September 2007). Users can query the database by various keywords and the search results can be downloaded. It is anticipated that the Human Intermediate Filament Database (HIFD) will provide a useful resource to study human genome variations for basic scientists, clinicians, and students alike. PMID:18033728

Szeverenyi, Ildiko; Cassidy, Andrew J; Chung, Cheuk Wang; Lee, Bernett T K; Common, John E A; Ogg, Stephen C; Chen, Huijia; Sim, Shu Yin; Goh, Walter L P; Ng, Kee Woei; Simpson, John A; Chee, Li Lian; Eng, Goi Hui; Li, Bin; Lunny, Declan P; Chuon, Danny; Venkatesh, Aparna; Khoo, Kian Hoe; McLean, W H Irwin; Lim, Yun Ping; Lane, E Birgitte

2008-03-01

227

Lightweight Databases  

Microsoft Academic Search

Current World Wide Web technologies concentrate on presenting documents to human readers. Although HTML identifies structures within a document, it does not allow the semantic content of document sections to be specified explicitly. We investigate a small extension to HTML which allows parts of a document to be mapped onto an underlying database schema. This allows automatic identification and extraction

Simon A. Dobson; Victoria A. Burrill

1995-01-01

228

Avibase: The World Bird Database  

NSDL National Science Digital Library

This database provides information on all birds of the world, featuring information on thousands of species and subspecies of birds such as taxonomy, names and synonyms in various languages, photos, distribution maps, and links to additional information from other websites. The database is searchable by keyword or term, exact name, language, year of publication, and other parameters. There is also a search by taxonomic family, a set of checklists by geographic region, and a blog for ornithological discussions.

Lepage, Denis

229

American Mineralogist Crystal Structure Database  

NSDL National Science Digital Library

This database provides access to information on every crystal structure published in the American Mineralogist, the Canadian Mineralogist, European Journal of Mineralogy, and Physics and Chemistry of Minerals, as well as selected datasets from other journals. The data are searchable by mineral name, author, chemistry, cell parameters and symmetry, diffraction pattern, and a general search. There are also lists of minerals represented in the database and authors of publications cited.

Downs, R. T.; Hall-Wallace, M.

230

The EMBL Nucleotide Sequence Database  

Microsoft Academic Search

The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl), maintained at the European Bioinformatics Institute (EBI) near Cambridge, UK, is a comprehensive collection of nucleotide sequences and annotation from available public sources. The database is part of an international collaboration with DDBJ (Japan) and GenBank (USA). Data are exchanged daily between the collaborating institutes to achieve swift synchrony. Webin is the preferred tool for individual submissions of nucleotide sequences, including

Carola Kanz; Philippe Aldebert; Nicola Althorpe; Wendy Baker; Alastair Baldwin; Kirsty Bates; Paul Browne; Alexandra Van Den Broek; Matias Castro; Guy Cochrane; Karyn Duggan; Ruth Eberhardt; Nadeem Faruque; John Gamble; Federico Garcia Diez; Nicola Harte; Tamara Kulikova; Quan Lin; Vincent Lombard; Rodrigo Lopez; Renato Mancuso; Michelle Mchale; Francesco Nardone; Ville Silventoinen; Siamak Sobhany; Peter Stoehr; Mary Ann Tuli; Katerina Tzouvara; Robert Vaughan; Dan Wu; Weimin Zhu; Rolf Apweiler

2005-01-01

231

RiceFOX: A Database of Arabidopsis Mutant Lines Overexpressing Rice Full-Length cDNA that Contains a Wide Range of Trait Information to Facilitate Analysis of Gene Function  

PubMed Central

Identification of gene function is important not only for basic research but also for applied science, especially with regard to improvements in crop production. For rapid and efficient elucidation of useful traits, we developed a system named FOX hunting (Full-length cDNA Over-eXpressor gene hunting) using full-length cDNAs (fl-cDNAs). A heterologous expression approach provides a solution for the high-throughput characterization of gene functions in agricultural plant species. Since fl-cDNAs contain all the information of functional mRNAs and proteins, we introduced rice fl-cDNAs into Arabidopsis plants for systematic gain-of-function mutation. We generated >30,000 independent Arabidopsis transgenic lines expressing rice fl-cDNAs (rice FOX Arabidopsis mutant lines). These rice FOX Arabidopsis lines were screened systematically for various criteria such as morphology, photosynthesis, UV resistance, element composition, plant hormone profile, metabolite profile/fingerprinting, bacterial resistance, and heat and salt tolerance. The information obtained from these screenings was compiled into a database named ‘RiceFOX’. This database contains around 18,000 records of rice FOX Arabidopsis lines and allows users to search against all the observed results, ranging from morphological to invisible traits. The number of searchable items is approximately 100; moreover, the rice FOX Arabidopsis lines can be searched by rice and Arabidopsis gene/protein identifiers, sequence similarity to the introduced rice fl-cDNA and traits. The RiceFOX database is available at http://ricefox.psc.riken.jp/.

Sakurai, Tetsuya; Kondou, Youichi; Akiyama, Kenji; Kurotani, Atsushi; Higuchi, Mieko; Ichikawa, Takanari; Kuroda, Hirofumi; Kusano, Miyako; Mori, Masaki; Saitou, Tsutomu; Sakakibara, Hitoshi; Sugano, Shoji; Suzuki, Makoto; Takahashi, Hideki; Takahashi, Shinya; Takatsuji, Hiroshi; Yokotani, Naoki; Yoshizumi, Takeshi; Saito, Kazuki; Shinozaki, Kazuo; Oda, Kenji; Hirochika, Hirohiko; Matsui, Minami

2011-01-01

232

Comparing the public and private sequencing projects, Craig Venter. Site: DNA Interactive (www.dnai.org)  

NSDL National Science Digital Library

Interviewee: Craig Venter. DNAi Location: Genome > the project > players > private project. Comparing public and private: Craig Venter, the leader of the private genome effort at Celera Genomics, talks about the differences between the public and private approaches.

2008-10-06

233

NASA STI Database, Aerospace Database and ARIN coverage of 'space law'  

NASA Technical Reports Server (NTRS)

The space-law coverage provided by the NASA STI Database, the Aerospace Database, and ARIN is briefly described. Particular attention is given to the space law content of the two databases and of ARIN, the NASA Thesaurus space law terminology, space law publication forms, and the availability of the space law literature.

Buchan, Ronald L.

1992-01-01

234

Trail Database  

NSDL National Science Digital Library

Something of a veteran by Internet standards, the Trail Database has been around since 1997 and, as such, now bills itself as the "world's largest hiking trail database." "Henk," the Dutch hiker responsible for this compendium, updates the site regularly and has included a wide variety of links to helpful material here. Users can search or browse the resources, which are arranged both under general topics, such as Knots or Equipment, and by country. Those planning European hikes will find the links off the front page to foot and mouth disease-related hiking restrictions useful (though we found some of these links to be broken). In all, an impressive collection of material. The site is available in Dutch or English.

1997-01-01

235

Audit of Compliance with Standards Governing Combined DNA Index System Activities at the Texas Department of Public Safety McAllen Criminal Laboratory McAllen, Texas. Executive Summary.  

National Technical Information Service (NTIS)

The Department of Justice, Office of the Inspector General (OIG), Audit Division, has completed an audit of compliance with standards governing Combined DNA Index System (CODIS) activities at the Texas Department of Public Safety McAllen Criminal Laborato...

2011-01-01

236

Scientists Report New Online Cancer Research Databases  

Cancer.gov

Like the old saying "art for art's sake," most medical researchers once believed that creating large computer databases of known genes or DNA sequences was a case of gathering information for the sake of gathering information.

237

Modeling of Database Constraints in Active Databases.  

National Technical Information Service (NTIS)

This report deals with the problems of specification, modeling, and enforcement of constraints in databases. The active database area is emerging as a viable alternative for implementation of large scale database applications particularly those involving ...

S. B. Navathe; A. K. Tanaka

1993-01-01

238

A Chronostratigraphic Relational Database Ontology  

NASA Astrophysics Data System (ADS)

A chronostratigraphic research database was donated by British Petroleum to the Stratigraphy Group at the Energy and Geoscience Institute (EGI), University of Utah. These data consist of over 2,000 measured sections representing over three decades of research into the application of the graphic correlation method. The data are global and include both microfossil (foraminifera, calcareous nannoplankton, spores, pollen, dinoflagellate cysts, etc.) and macrofossil data. The objective of the donation was to make the research data available to the public in order to encourage additional chronostratigraphy studies, specifically regarding graphic correlation. As part of the National Science Foundation's Cyberinfrastructure for the Geosciences (GEON) initiative, these data have been made available to the public at http://css.egi.utah.edu. To encourage further research using the graphic correlation method, EGI has developed a software package, StrataPlot, that will soon be publicly available from the GEON website as a standalone software download. The EGI chronostratigraphy research database, although relatively large, has many data holes relative to some paleontological disciplines and geographical areas, so the challenge becomes how to expand the data available for chronostratigraphic studies using graphic correlation. There are several public or soon-to-be public databases available to chronostratigraphic research, but they have their own data structures and modes of presentation. The heterogeneous nature of these database schemas hinders their integration and makes it difficult for the user to retrieve and consolidate potentially valuable chronostratigraphic data. The integration of these data sources would facilitate rapid and comprehensive data searches, thus helping advance studies in chronostratigraphy. The GEON project will host a number of databases within the geology domain, some of which contain biostratigraphic data. 
Ontologies are being developed to provide an integrated query system for searching across GEON's biostratigraphy databases, as well as databases available in the public domain. Although creating an ontology directly from the existing database metadata would have been effective and straightforward, our effort was directed towards creating a more efficient representation of our database, as well as a general representation of the biostratigraphic domain.

Platon, E.; Gary, A.; Sikora, P.

2005-12-01

239

The taming of an impossible child: a standardized all-in approach to the phylogeny of Hymenoptera using public database sequences  

PubMed Central

Background Enormous molecular sequence data have been accumulated over the past several years and are still exponentially growing with the use of faster and cheaper sequencing techniques. There is high and widespread interest in using these data for phylogenetic analyses. However, the amount of data that one can retrieve from public sequence repositories is virtually impossible to tame without dedicated software that automates processes. Here we present a novel bioinformatics pipeline for downloading, formatting, filtering and analyzing public sequence data deposited in GenBank. It combines some well-established programs with numerous newly developed software tools (available at http://software.zfmk.de/). Results We used the bioinformatics pipeline to investigate the phylogeny of the megadiverse insect order Hymenoptera (sawflies, bees, wasps and ants) by retrieving and processing more than 120,000 sequences and by selecting subsets under the criteria of compositional homogeneity and defined levels of density and overlap. Tree reconstruction was done with a partitioned maximum likelihood analysis from a supermatrix with more than 80,000 sites and more than 1,100 species. In the inferred tree, consistent with previous studies, "Symphyta" is paraphyletic. Within Apocrita, our analysis suggests a topology of Stephanoidea + (Ichneumonoidea + (Proctotrupomorpha + (Evanioidea + Aculeata))). Despite the huge amount of data, we identified several persistent problems in the Hymenoptera tree. Data coverage is still extremely low, and additional data have to be collected to reliably infer the phylogeny of Hymenoptera. Conclusions While we applied our bioinformatics pipeline to Hymenoptera, we designed the approach to be as general as possible. With this pipeline, it is possible to produce phylogenetic trees for any taxonomic group and to monitor new data and tree robustness in a taxon of interest. 
It therefore has great potential to meet the challenges of the phylogenomic era and to deepen our understanding of the tree of life.

2011-01-01

240

Mining of public sequencing databases supports a non-dietary origin for putative foreign miRNAs: underestimated effects of contamination in NGS.  

PubMed

The report that exogenous plant miRNAs are able to cross the mammalian gastrointestinal tract and regulate genes in mammalian tissues has generated considerable controversy, both in the public press and the scientific literature. Despite the initial enthusiasm, reproducibility of these results was recently questioned by several authors. To analyze the causes of this unease, we searched for diet-derived miRNAs in deep-sequencing libraries prepared by ourselves and others. We found variable amounts of plant miRNAs in publicly available small RNA-seq data sets of human tissues. In human spermatozoa, exogenous RNAs reached extreme, biologically meaningless levels. In contrast, plant miRNAs were not detected in our sequencing of human sperm cells, which was performed in the absence of any known sources of plant contamination. We designed an experiment to show that cross-contamination during library preparation is a source of exogenous RNAs. These contamination-derived exogenous sequences even resisted oxidation with sodium periodate. To test the assumption that diet-derived miRNAs were actually contamination-derived, we searched the literature for previous sequencing reports by the same group that reported the initial finding. We analyzed the spectra of plant miRNAs in a small RNA sequencing study performed in amphioxus by this group in 2009 and we found a very strong correlation with the plant miRNAs which they later reported in human sera. Even though contamination with exogenous sequences may be easy to detect, cross-contamination between samples from the same organism can go completely unnoticed, possibly affecting conclusions derived from NGS transcriptomics. PMID:24729469

Tosar, Juan Pablo; Rovira, Carlos; Naya, Hugo; Cayota, Alfonso

2014-06-01

241

The 2013 Nucleic Acids Research Database Issue and the online Molecular Biology Database Collection  

PubMed Central

The 20th annual Database Issue of Nucleic Acids Research includes 176 articles, half of which describe new online molecular biology databases and the other half provide updates on the databases previously featured in NAR and other journals. This year’s highlights include two databases of DNA repeat elements; several databases of transcriptional factors and transcriptional factor-binding sites; databases on various aspects of protein structure and protein–protein interactions; databases for metagenomic and rRNA sequence analysis; and four databases specifically dedicated to Escherichia coli. The increased emphasis on using the genome data to improve human health is reflected in the development of the databases of genomic structural variation (NCBI’s dbVar and EBI’s DGVa), the NIH Genetic Testing Registry and several other databases centered on the genetic basis of human disease, potential drugs, their targets and the mechanisms of protein–ligand binding. Two new databases present genomic and RNAseq data for monkeys, providing a wealth of data on our closest relatives for comparative genomics purposes. The NAR online Molecular Biology Database Collection, available at http://www.oxfordjournals.org/nar/database/a/, has been updated and currently lists 1512 online databases. The full content of the Database Issue is freely available online on the Nucleic Acids Research website (http://nar.oxfordjournals.org/).

Fernandez-Suarez, Xose M.; Galperin, Michael Y.

2013-01-01

242

The Porcelain Crab Transcriptome and PCAD, the Porcelain Crab Microarray and Sequence Database  

PubMed Central

Background With the emergence of a completed genome sequence of the freshwater crustacean Daphnia pulex, construction of genomic-scale sequence databases for additional crustacean sequences is important for comparative genomics and annotation. Porcelain crabs, genus Petrolisthes, have been powerful crustacean models for environmental and evolutionary physiology with respect to thermal adaptation and understanding responses of marine organisms to climate change. Here, we present a large-scale EST sequencing and cDNA microarray database project for the porcelain crab Petrolisthes cinctipes. Methodology/Principal Findings A set of ~30K unique sequences (UniSeqs) representing ~19K clusters was generated from ~98K high quality ESTs from a set of tissue specific non-normalized and mixed-tissue normalized cDNA libraries from the porcelain crab Petrolisthes cinctipes. Homology for each UniSeq was assessed using BLAST, InterProScan, GO and KEGG database searches. Approximately 66% of the UniSeqs had homology in at least one of the databases. All EST and UniSeq sequences along with annotation results and coordinated cDNA microarray datasets have been made publicly accessible at the Porcelain Crab Array Database (PCAD), a feature-enriched version of the Stanford and Longhorn Array Databases. Conclusions/Significance The EST project presented here represents the third largest sequencing effort for any crustacean, and the largest effort for any crab species. Our assembly and clustering results suggest that our porcelain crab EST data set is equally diverse to the much larger EST set generated in the Daphnia pulex genome sequencing project, and thus will be an important resource to the Daphnia research community. Our homology results support the pancrustacea hypothesis and suggest that Malacostraca may be ancestral to Branchiopoda and Hexapoda. 
Our results also suggest that our cDNA microarrays cover as much of the transcriptome as can reasonably be captured in EST library sequencing approaches, and thus represent a rich resource for studies of environmental genomics.

Tagmount, Abderrahmane; Wang, Mei; Lindquist, Erika; Tanaka, Yoshihiro; Teranishi, Kristen S.; Sunagawa, Shinichi; Wong, Mike; Stillman, Jonathon H.

2010-01-01

243

Transboundary Freshwater Dispute Database  

NSDL National Science Digital Library

The database includes 150 water related treaties and 39 US compacts. The International Treaties and Compacts section is searchable by nation, basin, issues, conflict resolution mechanisms, non-water linkages, and date. US Compacts are searchable by state, basin, focus and date. For all treaties or compacts there is a summary of the agreement, the parties involved, allocations if applicable and other information pertinent to a specific agreement. There is also information available on indigenous approaches to water conflict resolution, a digitized inventory of the world's river basins with data about those basins, and a bibliography featuring publications about transboundary freshwater dispute resolution.

244

Cryogenic Treatment Database  

NSDL National Science Digital Library

The Cryogenics Database is a website created by the Cryogenic Society of America that contains "scientific and informational articles pertaining to the cryogenic treatment industry", and is updated quarterly. Visitors curious about cryogenics, but only aware of the "misinformation about cryogenic treatment of materials in the public domain" can become familiar with cryogenics by checking out the "Resources" tab, near the top of any page. The "Cryo Central: Cryogenic Treatment of Materials" document gives examples of some of the types of raw materials that are treated with cryogenic processing, and their applications in the everyday world. Some of the examples given are brakes, racing cars, stereos, industrial tooling, and sporting goods.

245

Atomic Databases  

NASA Astrophysics Data System (ADS)

Atomic and molecular data are required in a variety of fields ranging from the traditional astronomy, atmospherics and fusion research to fast growing technologies such as lasers, lighting, low-temperature plasmas, plasma assisted etching and radiotherapy. In this context, there are some research groups, both theoretical and experimental, scattered round the world that attend to most of this data demand, but the implementation of atomic databases has grown independently out of sheer necessity. In some cases the latter has been associated with the data production process or with data centers involved in data collection and evaluation; but sometimes it has been the result of individual initiatives that have been quite successful. In any case, the development and maintenance of atomic databases call for a number of skills and an entrepreneurial spirit that are not usually associated with most physics researchers. In the present report we present some of the highlights in this area in the past five years and discuss what we think are some of the main issues that have to be addressed.

Mendoza, Claudio

2000-10-01

246

FishMicrosat: a microsatellite database of commercially important fishes and shellfishes of the Indian subcontinent  

PubMed Central

Background Microsatellite DNA is one of many powerful genetic markers used for the construction of genetic linkage maps and the study of population genetics. The biological databases in the public domain hold vast numbers of microsatellite sequences for many organisms including fishes. The microsatellite data available in these data sources were extracted and organized into a database that facilitates sequence analysis and browsing of relevant information. The system also helps to design primer sequences for flanking regions of repeat loci for PCR identification of polymorphism within populations. Description FishMicrosat is a database of microsatellite sequences of fishes and shellfishes that includes important aquaculture species such as Lates calcarifer, Ctenopharyngodon idella, Hypophthalmichthys molitrix, Penaeus monodon, Labeo rohita, Oreochromis niloticus, Fenneropenaeus indicus and Macrobrachium rosenbergii. The database contains 4398 microsatellite sequences of 41 species belonging to 15 families from the Indian subcontinent. GenBank of NCBI was used as a prime data source for developing the database. The database presents information about simple and compound microsatellites, their clusters and locus orientation within sequences. The database has been integrated with different tools in a web interface such as primer designing, locus finding, mapping repeats, detecting similarities among sequences across species, and searching using motifs and keywords. In addition, the database can be browsed for information on the top 10 families and the top 10 species through the record overview. Conclusions FishMicrosat database is a useful resource for fish and shellfish microsatellite analyses and locus identification across species, which has important applications in population genetics, evolutionary studies and genetic relatedness among species. 
The database can be expanded further to include the microsatellite data of fishes and shellfishes from other regions and available information on genome sequencing project of species of aquaculture importance.

2013-01-01
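
The FishMicrosat abstract above mentions locating simple microsatellite repeats within sequences. The following is a minimal sketch of that general idea, not the FishMicrosat implementation: the function name, the regular-expression approach, and the thresholds (`max_unit`, `min_repeats`) are all assumptions chosen for illustration.

```python
import re

def find_microsatellites(seq, min_unit=1, max_unit=6, min_repeats=4):
    """Return sorted (start, unit, repeat_count) tuples for simple tandem repeats.

    Illustrative only: thresholds are assumptions, not FishMicrosat settings.
    """
    seq = seq.upper()
    hits = []
    for unit_len in range(min_unit, max_unit + 1):
        # One unit of unit_len bases followed by at least min_repeats - 1 more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit_len, min_repeats - 1))
        for match in pattern.finditer(seq):
            unit = match.group(1)
            # Skip units that are themselves built from a shorter motif
            # (e.g. "CACA" is just "CA" twice); the shorter motif reports them.
            if any(
                unit == unit[:k] * (unit_len // k)
                for k in range(1, unit_len)
                if unit_len % k == 0
            ):
                continue
            hits.append((match.start(), unit, len(match.group(0)) // unit_len))
    return sorted(hits)

if __name__ == "__main__":
    # Made-up example sequence: a (CA)6 repeat and a (GATA)4 repeat.
    example = "GGT" + "CA" * 6 + "TTG" + "GATA" * 4 + "C"
    for start, unit, count in find_microsatellites(example):
        print(start, unit, count)
```

A primer-design step would then take the bases on either side of each reported start position as candidate flanking regions; that step is omitted here.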

247

NIST Scientific and Technical Databases  

NSDL National Science Digital Library

NIST has long been developing and compiling benchmark data for the properties of important substances, classes of substances and systems. Its data collections, data prediction methods and models meet high priority industrial and national needs. This list of databases contributes to U.S. industry's productivity and competitiveness and improves public health, safety and environmental quality.

248

REFEREE: BIBLIOGRAPHIC DATABASE MANAGER, DOCUMENTATION  

EPA Science Inventory

The publication is the user's manual for 3.xx releases of REFEREE, a general-purpose bibliographic database management program for IBM-compatible microcomputers. The REFEREE software also is available from NTIS. The manual has two main sections--Quick Tour and References Guide--a...

249

Marginal public health gain of screening for colorectal cancer: modelling study, based on WHO and national databases in the Nordic countries  

PubMed Central

Aims To estimate the potential gain of national screening programmes for colorectal cancer (CRC) by stool occult blood testing in the Nordic countries, with comparative reference to the burden of other causes of premature death. Methods Implementation of national screening programmes for CRC was modelled among people 55–74 years in accordance with the 2011 Cochrane review of biannual screening, using the faecal occult blood test (FOBT) for 10 years, resulting in 15% relative risk reduction in CRC deaths among all those invited [intention-to-treat; relative risk 0.85; confidence interval (CI) 0.78 to 0.92]. Our calculations are based on the World Health Organization and national databanks on death causes (ICD-10) and the mid-year number of inhabitants in the target group. For Finland, Denmark, Norway and Sweden, we used data for 2009. For Iceland, due to the population's small size, we calculated mean mortality for the period 2005–2009. Results Invitation to a CRC screening programme for 10 years could influence 0.5–0.9% (95% CI 0.4–1.2) of all deaths in the age group 55–74 years. Among the remaining 99% of premature deaths, around 50% were caused by lung cancer, other lung diseases, cardiovascular diseases and accidents, with some national variations. Conclusions and implications Establishment of a screening programme for CRC for people aged 55–74 can be expected to affect only a minor proportion of all premature deaths in the Nordic setting. From a public health perspective, prioritizing preventive strategies targeting more prevalent causes of premature death may be a superior approach.

Sigurdsson, Johann A; Getz, Linn; Sjonell, Goran; Vainiomaki, Paula; Brodersen, John

2013-01-01
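
The arithmetic behind the colorectal cancer screening record above can be sketched as follows. This is one plausible reading of the calculation, not the authors' model: the share of all deaths that screening could influence is taken as the relative risk reduction multiplied by the share of deaths caused by CRC. The relative risk values come from the abstract; the CRC share of deaths (`CRC_SHARE`) is a hypothetical figure chosen only for illustration.

```python
def share_of_all_deaths_influenced(crc_share_of_deaths, relative_risk):
    """Fraction of all deaths in the target group that screening could avert.

    (1 - relative_risk) is the relative risk reduction in CRC deaths;
    scaling it by CRC's share of deaths gives the share of all deaths.
    """
    return (1.0 - relative_risk) * crc_share_of_deaths

# Relative risk and confidence limits as reported in the abstract.
RR_POINT, RR_UPPER_BENEFIT, RR_LOWER_BENEFIT = 0.85, 0.78, 0.92

# HYPOTHETICAL: CRC assumed to cause 4% of deaths in the target age group.
CRC_SHARE = 0.04

if __name__ == "__main__":
    for label, rr in (
        ("point estimate", RR_POINT),
        ("largest plausible benefit", RR_UPPER_BENEFIT),
        ("smallest plausible benefit", RR_LOWER_BENEFIT),
    ):
        share = share_of_all_deaths_influenced(CRC_SHARE, rr)
        print(f"{label}: {100 * share:.2f}% of all deaths")
```

Under that assumed 4% share, the point estimate works out to 0.15 × 0.04 = 0.6% of all deaths, which falls within the range the abstract reports.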

250

Stackfile Database  

NASA Technical Reports Server (NTRS)

This software provides storage, retrieval, and analysis functionality for managing satellite altimetry data. It improves the efficiency and analysis capabilities of existing database software with improved flexibility and documentation. It offers flexibility in the type of data that can be stored. There is efficient retrieval either across the spatial domain or the time domain. Built-in analysis tools are provided for frequently performed altimetry tasks. This software package is used for storing and manipulating satellite measurement data. It was developed with a focus on handling the requirements of repeat-track altimetry missions such as Topex and Jason. It was, however, designed to work with a wide variety of satellite measurement data (e.g., the Gravity Recovery And Climate Experiment, GRACE). The software consists of several command-line tools for importing, retrieving, and analyzing satellite measurement data.

deVarvalho, Robert; Desai, Shailen D.; Haines, Bruce J.; Kruizinga, Gerhard L.; Gilmer, Christopher

2013-01-01

251

Algaline Database  

NSDL National Science Digital Library

Maintained by the Finnish Institute of Marine Research and several other institutions, the Algaline Database offers updated reports on the conditions of phytoplankton and related parameters in the Baltic Sea. The reports, which vary in length and detail (though most are brief), summarize measurements of Oxygen, Salinity, Temperature, Nutrients, Harmful substances, Plankton, Zooplankton, Benthic Animals, Flow, and Other measurements. In addition, the Maps and Figures section offers numerous color images (including satellite) of Baltic Sea conditions and marine organisms. To access reports by geographic subregion of the Baltic, head to the Reports section. Finally, the Latest News section keeps researchers abreast of changing conditions (e.g., algal blooms) and research cruises in the Baltic. For researchers or anyone else wanting in-depth information on a host of ecological parameters for the Baltic Sea, this is an excellent reference site.

252

USDA Forest Service Southern Research Station: Publications  

NSDL National Science Digital Library

This database contains publications in support of this group's mission to create the science and technology needed to sustain and enhance southern forest ecosystems and the benefits they provide. The database can be searched by author, title, keyword, date range, and publication origin; publication number; and by the most recent publications added to the database. There are also online versions of Compass, a publication catalog of research products from scientists at SRS.

2007-02-17

253

The PIR-International Protein Sequence Database.  

PubMed Central

PIR-International is an association of macromolecular sequence data collection centers dedicated to fostering international cooperation as an essential element in the development of scientific databases. A major objective of PIR-International is to continue the development of the Protein Sequence Database as an essential public resource for protein sequence information. This paper briefly describes the architecture of the Protein Sequence Database and how it and associated data sets are distributed and can be accessed electronically.

George, D G; Barker, W C; Mewes, H W; Pfeiffer, F; Tsugita, A

1994-01-01

254

ARTI refrigerant database  

SciTech Connect

The Refrigerant Database is an information system on alternative refrigerants, associated lubricants, and their use in air conditioning and refrigeration. It consolidates and facilitates access to property, compatibility, environmental, safety, application and other information. It provides corresponding information on older refrigerants, to assist manufacturers and those using alternative refrigerants, to make comparisons and determine differences. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The complete documents are not included, though some may be added at a later date. The database identifies sources of specific information on refrigerants. It addresses lubricants including alkylbenzene, polyalkylene glycol, polyolester, and other synthetics as well as mineral oils. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. Incomplete citations or abstracts are provided for some documents. They are included to accelerate availability of the information and will be completed or replaced in future updates. Citations in this report are divided into the following topics: thermophysical properties; materials compatibility; lubricants and tribology; application data; safety; test and analysis methods; impacts; regulatory actions; substitute refrigerants; identification; absorption and adsorption; research programs; and miscellaneous documents. Information is also presented on ordering instructions for the computerized version.

Calm, J.M. [Calm (James M.), Great Falls, VA (United States)]

1996-04-15

255

Lost Art Internet Database  

NSDL National Science Digital Library

Launched on April 10, the Lost Art Internet Database is a collaborative project of the Federal Government of Germany and its Federal States ("Lander"). The site "registers cultural goods which were transported or stolen because of persecution, especially of Jewish citizens, as a result of World War II or as a result of National Socialism." At the site users can currently conduct simple or advanced searches of two areas of the database: Public Losses and Remaining Stock CCP. Alternatively, users can search both simultaneously via the Global Search function. Search returns for the former include title, type of object, whether or not it was a war loss, destroyed, or restituted, and a contact for further information. Those for the latter include title, artist, material, measures, purchase, borrower, and inventory number. Background and additional information on the project as a whole and the currently available content are provided. Please note that the site is still under extensive construction and more content is forthcoming. Additions planned for the near future include a "Fremdbesitz" area of the database that will list objects, regardless of their location, that are not in the owner's hands or are without provenance, and an expanded Other Countries section that may include data from other countries.

256

ARTI refrigerant database  

SciTech Connect

The Refrigerant Database is an information system on alternative refrigerants, associated lubricants, and their use in air conditioning and refrigeration. It consolidates and facilitates access to property, compatibility, environmental, safety, application and other information. It provides corresponding information on older refrigerants, to assist manufacturers and those using alternative refrigerants, to make comparisons and determine differences. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The complete documents are not included, though some may be added at a later date. The database identifies sources of specific information on many refrigerants including propane, ammonia, water, carbon dioxide, propylene, ethers, and others as well as azeotropic and zeotropic blends of these fluids. It addresses lubricants including alkylbenzene, polyalkylene glycol, polyolester, and other synthetics as well as mineral oils. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. Incomplete citations or abstracts are provided for some documents. They are included to accelerate availability of the information and will be completed or replaced in future updates.

Calm, J.M. [Calm (James M.), Great Falls, VA (United States)]

1998-08-01

257

Correlates of Access to Business Research Databases  

ERIC Educational Resources Information Center

This study examines potential correlates of business research database access through academic libraries serving top business programs in the United States. Results indicate that greater access to research databases is related to enrollment in graduate business programs, but not to overall enrollment or status as a public or private institution.…

Gottfried, John C.

2010-01-01

258

Mouse Phenome Database (MPD)  

PubMed Central

The Mouse Phenome Project was launched a decade ago to complement mouse genome sequencing efforts by promoting new phenotyping initiatives under standardized conditions and collecting the data in a central public database, the Mouse Phenome Database (MPD; http://phenome.jax.org). MPD houses a wealth of strain characteristics data to facilitate the use of the laboratory mouse in translational research for human health and disease, helping alleviate problems involving experimentation in humans that cannot be done practically or ethically. Data sets are voluntarily contributed by researchers from a variety of institutions and settings, or in some cases, retrieved by MPD staff from public sources. MPD maintains a growing collection of standardized reference data that assists investigators in selecting mouse strains for research applications; houses treatment/control data for drug studies and other interventions; offers a standardized platform for discovering genotype–phenotype relationships; and provides tools for hypothesis testing. MPD improvements and updates since our last NAR report are presented, including the addition of new tools and features to facilitate navigation and data mining as well as the acquisition of new data (phenotypic, genotypic and gene expression).

Maddatu, Terry P.; Grubb, Stephen C.; Bult, Carol J.; Bogue, Molly A.

2012-01-01

259

The Molecule Pages database  

PubMed Central

The UCSD-Nature Signaling Gateway Molecule Pages (http://www.signaling-gateway.org/molecule) provides essential information on more than 3800 mammalian proteins involved in cellular signaling. The Molecule Pages contain expert-authored and peer-reviewed information based on the published literature, complemented by regularly updated information derived from public data source references and sequence analysis. The expert-authored data includes both a full-text review about the molecule, with citations, and highly structured data for bioinformatics interrogation, including information on protein interactions and states, transitions between states and protein function. The expert-authored pages are anonymously peer reviewed by the Nature Publishing Group. The Molecule Pages data is present in an object-relational database format and is freely accessible to the authors, the reviewers and the public from a web browser that serves as a presentation layer. The Molecule Pages are supported by several applications that along with the database and the interfaces form a multi-tier architecture. The Molecule Pages and the Signaling Gateway are routinely accessed by a very large research community.

Saunders, Brian; Lyon, Stephen; Day, Matthew; Riley, Brenda; Chenette, Emily; Subramaniam, Shankar

2008-01-01

260

DNA polymerases drive DNA sequencing-by-synthesis technologies: both past and present  

PubMed Central

Next-generation sequencing (NGS) technologies have revolutionized modern biological and biomedical research. The engines responsible for this innovation are DNA polymerases; they catalyze the biochemical reaction for deriving template sequence information. In fact, DNA polymerase has been a cornerstone of DNA sequencing from the very beginning. Escherichia coli DNA polymerase I proteolytic (Klenow) fragment was originally utilized in Sanger’s dideoxy chain-terminating DNA sequencing chemistry. From these humble beginnings followed an explosion of organism-specific, genome sequence information accessible via public database. Family A/B DNA polymerases from mesophilic/thermophilic bacteria/archaea were modified and tested in today’s standard capillary electrophoresis (CE) and NGS sequencing platforms. These enzymes were selected for their efficient incorporation of bulky dye-terminator and reversible dye-terminator nucleotides respectively. Third generation, real-time single molecule sequencing platform requires slightly different enzyme properties. Enterobacterial phage φ29 DNA polymerase copies long stretches of DNA and possesses a unique capability to efficiently incorporate terminal phosphate-labeled nucleoside polyphosphates. Furthermore, φ29 enzyme has also been utilized in emerging DNA sequencing technologies including nanopore-, and protein-transistor-based sequencing. DNA polymerase is, and will continue to be, a crucial component of sequencing technologies.

Chen, Cheng-Yao

2014-01-01

261

The HUGO Gene Nomenclature Database, 2006 updates  

Microsoft Academic Search

The HUGO Gene Nomenclature Committee (HGNC) aims to give every human gene a unique and ideally meaningful name and symbol. The HGNC database, previously known as Genew, contains over 22000 public records with approved human gene nomenclature and associated information. The database has undergone major improvements throughout the last year and is publicly available for online searching at http://www.gene.ucl.ac.uk/cgi-bin/nomenclature/searchgenes.pl

Tina A. Eyre; Fabrice Ducluzeau; Tam P. Sneddon; Sue Povey; Elspeth A. Bruford; Michael J. Lush

2006-01-01

262

The urologic epithelial stem cell database (UESC) - a web tool for cell type-specific gene expression and immunohistochemistry images of the prostate and bladder  

PubMed Central

Background Public databases are crucial for analysis of high-dimensional gene and protein expression data. The Urologic Epithelial Stem Cells (UESC) database is a public database that contains gene and protein information for the major cell types of the prostate, prostate cancer cell lines, and a cancer cell type isolated from a primary tumor. Similarly, such information is available for urinary bladder cell types. Description Two major data types were archived in the database, protein abundance localization data from immunohistochemistry images, and transcript abundance data principally from DNA microarray analysis. Data results were organized in modules that were made to operate independently but built upon a core functionality. Gene array data and immunostaining images for human and mouse prostate and bladder were made available for interrogation. Data analysis capabilities include: (1) CD (cluster designation) cell surface protein data. For each cluster designation molecule, a data summary allows easy retrieval of images (at multiple magnifications). (2) Microarray data. Single gene or batch search can be initiated with Affymetrix Probeset ID, Gene Name, or Accession Number together with options of coalescing probesets and/or replicates. Conclusion Databases are invaluable for biomedical research, and their utility depends on data quality and user friendliness. UESC provides for database queries and tools to examine cell type-specific gene expression (normal vs. cancer), whereas most other databases contain only whole tissue expression datasets. The UESC database provides a valuable tool in the analysis of differential gene expression in prostate cancer genes in cancer progression.

Pascal, Laura E; Deutsch, Eric W; Campbell, David S; Korb, Martin; True, Lawrence D; Liu, Alvin Y

2007-01-01

263

Quality Control of EUVE Databases  

NASA Astrophysics Data System (ADS)

The publicly accessible databases for the Extreme Ultraviolet Explorer (EUVE) include: the EUVE Archive Mailserver, the Center for EUV Astrophysics ftp site, the EUVE Guest Observer Mailserver, and the Astronomical Data System node. The EUVE Performance Assurance team is responsible for verifying that these public databases are working properly and that the public availability of EUVE data contained therein does not infringe any data rights which may have been assigned. In this paper, we describe the quality assurance (QA) procedures we have developed from approaching QA as a service organization; this approach reflects the overall EUVE philosophy of QA integrated into normal operating procedures, rather than imposed as an external, post-facto, control mechanism.

John, Linda M.

1993-01-01

264

The Molecular Biology Database Collection: 2004 update  

PubMed Central

The Molecular Biology Database Collection is a public resource listing key databases of value to the biologist, including those featured in this issue of Nucleic Acids Research, and other high-quality databases. All databases included in this Collection are freely available to the public. This listing aims to serve as a convenient starting point for searching the web for reliable information on various aspects of molecular biology, biochemistry and genetics. This year’s update includes 548 databases, 162 more than the previous one. The databases are organized in a hierarchical classification that should simplify finding the right database for each given task. Each database in the list comes with a recently updated brief description. The database list and the database descriptions can be accessed online at the Nucleic Acids Research web site http://nar.oupjournals.org/. "The great challenge in biological research today is how to turn data into knowledge. I have met people who think data is knowledge but these people are then striving for a means of turning knowledge into understanding." Sydney Brenner, The Scientist 16[6]:12, March 18, 2002

Galperin, Michael Y.

2004-01-01

265

MITOMAP: a human mitochondrial genome database  

Microsoft Academic Search

We have developed a comprehensive database (MITOMAP) for the human mitochondrial DNA (mtDNA), the first component of the human genome to be completely sequenced [Anderson et al. (1981) Nature 290, 457–465]. MITOMAP uses the mtDNA sequence as the unifying element for bringing together information on mitochondrial genome structure and function, pathogenic mutations and their clinical characteristics, population associated variation, and gene–gene interactions.

Andreas M. Kogelnik; Marie T. Lott; Michael D. Brown; Shamkant B. Navathe; Douglas C. Wallace

1996-01-01

266

NASA Scientific and Technical Publications: A Catalog of Special Publications, Reference Publications, Conference Publications, and Technical Papers 1987-1990.  

National Technical Information Service (NTIS)

This catalog lists 783 citations of all NASA Special Publications, NASA Reference Publications, and NASA Technical Papers that were entered into the NASA Scientific and Technical Information Database during the accession years 1987 through 1990. The entri...

1991-01-01

267

State and Local Government Publications.  

ERIC Educational Resources Information Center

Reviews trends in library programs for state and local government publications and documents the increased interest in microforms and databases. Discussion focuses on publication distribution and control, and efforts to support interstate networking. There are 28 references. (RAA)

Nakata, Yuri; Kopec, Karen

1980-01-01

268

Overlap in Bibliographic Databases.  

ERIC Educational Resources Information Center

Examines the topic of Fuzzy Set Theory to determine the overlap of coverage in bibliographic databases. Highlights include examples of comparisons of database coverage; frequency distribution of the degree of overlap; records with maximum overlap; records unique to one database; intra-database duplicates; and overlap in the top ten databases.…

Hood, William W.; Wilson, Concepcion S.

2003-01-01
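
The overlap measures listed in the record above (degree of overlap, its frequency distribution, records unique to one database, and intra-database duplicates) can be illustrated with ordinary set counting. This sketch does not reproduce the Fuzzy Set Theory method the study examines; it uses crisp sets, and the function name, database names, and record identifiers are all hypothetical.

```python
from collections import Counter

def overlap_summary(databases):
    """Summarise coverage overlap.

    databases: dict mapping a database name to a list of record identifiers.
    Returns (degree, distribution, unique, duplicates).
    """
    # Intra-database duplicates: identifiers listed more than once in one database.
    duplicates = {
        name: sorted(rec for rec, n in Counter(records).items() if n > 1)
        for name, records in databases.items()
    }
    # Degree of overlap: number of distinct databases that hold each record.
    degree = Counter()
    for records in databases.values():
        degree.update(set(records))
    # Frequency distribution: how many records have each degree of overlap.
    distribution = Counter(degree.values())
    # Records found in exactly one database.
    unique = sorted(rec for rec, d in degree.items() if d == 1)
    return degree, distribution, unique, duplicates

if __name__ == "__main__":
    # Hypothetical databases and record identifiers.
    sample = {
        "DB_A": ["r1", "r2", "r2", "r3"],
        "DB_B": ["r2", "r3", "r4"],
        "DB_C": ["r3", "r5"],
    }
    degree, distribution, unique, duplicates = overlap_summary(sample)
    print("max overlap:", max(degree.values()))
    print("distribution:", dict(distribution))
    print("unique:", unique)
    print("duplicates:", duplicates)
```

In practice the hard part is deciding when two records in different databases describe the same publication; here that matching is assumed to be already done and encoded in the identifiers.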

269

Human gene mutation database-a biomedical information and research resource.  

PubMed

Although 20 years have elapsed since the first single basepair substitution underlying an inherited disease in humans was characterised at the DNA level, the initiative has only recently been taken to establish central database resources for pathological genetic variants. Disease-associated gene lesions are currently collected and publicised by the Human Gene Mutation Database (HGMD) in Cardiff, locus-specific mutation databases, and to some extent also by the Genome Database (GDB) and Online Mendelian Inheritance in Man (OMIM). To date, HGMD represents the only comprehensive and publicly available database of gene lesions underlying human inherited disease. By July 1999, HGMD contained over 18,000 different mutations from some 900 human genes, the majority being single basepair substitutions. In addition to its potential as an information resource for clinicians and genetic counsellors, HGMD has allowed molecular geneticists to address a variety of biological questions through meta-analysis of the collated data. HGMD also promises to assist research workers in optimising mutation search strategies for a given gene. A questionnaire sent out to, and answered by, the editors of 20 key journals revealed that human genetics journals are increasingly reluctant to publish mutation reports. Electronic data submission and publication facilities are therefore urgently required. The World Wide Web (WWW) provides an excellent medium within which to combine the centralised management of basic mutation data, including rigorous quality control, with the possibility of publishing additional mutation-related information. In response to these needs, HGMD has both instituted a collaboration with Springer-Verlag GmbH, Heidelberg, to potentiate free online submission and electronic publication of human gene mutation data and developed links with the curators of locus-specific mutation databases. PMID:10612821

Krawczak, M; Ball, E V; Fenton, I; Stenson, P D; Abeysinghe, S; Thomas, N; Cooper, D N

2000-01-01

270

Database Designer's Workbench.  

National Technical Information Service (NTIS)

The Database Design and Evaluation Workbench is a graphics-oriented decision support system for both centralized and distributed database design. A prototype system called the Database Designer's Workbench (DDW) has been implemented on MULTICS and is curr...

R. E. Cobb; T. J. Teorey

1987-01-01

271

NASA (National Aeronautics and Space Administration) Scientific and Technical Publications: A Catalog of Special Publications, Reference Publications, Conference Publications, and Technical Papers, 1987.  

National Technical Information Service (NTIS)

This catalog lists 239 citations of all NASA Special Publications, NASA Reference Publications, NASA Conference Publications, and NASA Technical Papers that were entered in the NASA scientific and technical information database during accession year 1987....

1988-01-01

272

National Residential Efficiency Measures Database  

DOE Data Explorer

The National Residential Efficiency Measures Database is a publicly available, centralized resource of residential building retrofit measures and costs for the U.S. building industry. With support from the U.S. Department of Energy, NREL developed this tool to help users determine the most cost-effective retrofit measures for improving energy efficiency of existing homes. Software developers who require residential retrofit performance and cost data for applications that evaluate residential efficiency measures are the primary audience for this database. In addition, home performance contractors and manufacturers of residential materials and equipment may find this information useful. The database offers the following types of retrofit measures: 1) Appliances, 2) Domestic Hot Water, 3) Enclosure, 4) Heating, Ventilating, and Air Conditioning (HVAC), 5) Lighting, 6) Miscellaneous.

273

IPD--the Immuno Polymorphism Database  

PubMed Central

The Immuno Polymorphism Database (IPD; http://www.ebi.ac.uk/ipd/) is a set of specialist databases related to the study of polymorphic genes in the immune system. The IPD project works with specialist groups or nomenclature committees who provide and curate individual sections before they are submitted to IPD for online publication. The IPD project stores all the data in a set of related databases. IPD currently consists of four databases: IPD-KIR, which contains the allelic sequences of killer-cell immunoglobulin-like receptors; IPD-MHC, a database of sequences of the major histocompatibility complex of different species; IPD-HPA, alloantigens expressed only on platelets; and IPD-ESTDAB, which provides access to the European Searchable Tumour Cell-Line Database, a cell bank of immunologically characterized melanoma cell lines. The data is currently available online from the website and FTP directory. This article describes the latest updates and additional tools added to the IPD project.

Robinson, James; Halliwell, Jason A.; McWilliam, Hamish; Lopez, Rodrigo; Marsh, Steven G. E.

2013-01-01

274

Approaching Analysis of EU IST Projects Database  

Microsoft Academic Search

We present the first results of the analysis of publicly available EU IST project descriptions. The database was automatically built from the publicly available information on the Web and organized to enable further analysis. We have used Text Mining methods to group the projects according to their content and the institutions participating in the projects. Two systems were developed, the

Marko Grobelnik

275

National Geologic Map Database  

NSDL National Science Digital Library

The National Geologic Map Database (NGMDB) is an Internet-based system for query and retrieval of earth-science map information, created as a collaborative effort between the USGS and the Association of American State Geologists. Its functions include providing a catalog of available map information; a data repository; and a source for general information on the nature and intended uses of the various types of earth-science information. The map catalog is a comprehensive, searchable catalog of all geoscience maps of the United States, in paper or digital format. It includes maps published in geological survey formal series and open-file series, maps in books, theses and dissertations, maps published by park associations, scientific societies, and other agencies, as well as publications that do not contain a map but instead provide a geological description of an area (for example, a state park). The geologic-names lexicon (GEOLEX) is a search tool for lithologic and geochronologic unit names. It now contains roughly 90% of the geologic names found in the most recent listing of USGS-approved geologic names. Current mapping activities at 1:24,000- and 1:100,000-scale are listed in the Geologic Mapping in Progress Database. Information on how to find topographic maps and a list of geology-related links are also available.

1997-01-01

276

Ribosomal Database Project II  

DOE Data Explorer

The Ribosomal Database Project (RDP) provides ribosome related data and services to the scientific community, including online data analysis and aligned and annotated Bacterial small-subunit 16S rRNA sequences. As of March 2008, RDP Release 10 is available and currently (August 2009) contains 1,074,075 aligned 16S rRNA sequences. Data that can be downloaded include zipped GenBank and FASTA alignment files, a histogram (in Excel) of the number of RDP sequences spanning each base position, data in the Functional Gene Pipeline Repository, and various user submitted data. The RDP-II website also provides analysis tools such as: • myRDP - Align and Classify your 16S rRNA sequences. Use the RDP Pipeline to process sequence libraries from raw sequencer output to analysis • Tree Builder - Create a phylogenetic tree • Hierarchy Browser - Browse a phylogenetic hierarchy and compile a list of 16S rRNA sequences for download or use. Also browse by Publication or Genome • Classifier - Assign 16S rRNA sequences to our taxonomical hierarchy • Library Compare - Compare two sequence libraries using the RDP Classifier • Sequence Match - Upload your sequence and search for its nearest neighbors • Probe Match - See what your probe targets in our database [From the RDP-II home page at http://rdp.cme.msu.edu/index.jsp]
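
The Sequence Match tool described above searches for a query sequence's nearest neighbors. A generic way to do that is to rank reference sequences by the fraction of short subsequences (k-mers) they share with the query. The sketch below illustrates that general technique only; it is not RDP's code, and the function names, the scoring formula, and the default `k` are assumptions.

```python
def kmers(seq, k):
    """Set of all distinct length-k substrings of seq."""
    seq = seq.upper()
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}

def shared_kmer_score(a, b, k=7):
    """Shared distinct k-mers divided by the smaller of the two k-mer sets."""
    ka, kb = kmers(a, k), kmers(b, k)
    if not ka or not kb:
        return 0.0  # one sequence is shorter than k
    return len(ka & kb) / min(len(ka), len(kb))

def nearest_neighbors(query, reference, k=7, top=3):
    """Rank reference sequences (dict of name -> sequence) by score, best first."""
    scored = [(shared_kmer_score(query, seq, k), name) for name, seq in reference.items()]
    scored.sort(key=lambda item: (-item[0], item[1]))  # ties broken by name
    return [(name, score) for score, name in scored[:top]]

if __name__ == "__main__":
    # Toy reference set with made-up sequences; real 16S sequences are ~1500 bases.
    refs = {"x": "ACGTACGT", "y": "ACGTTTTT", "z": "GGGGGGGG"}
    for name, score in nearest_neighbors("ACGTACGT", refs, k=3, top=2):
        print(name, score)
```

Because the score needs no alignment, it is cheap to compute against a large reference collection, which is why k-mer sharing is a common first-pass filter before more expensive comparison.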

277

Interactive bibliographical database on color  

NASA Astrophysics Data System (ADS)

The paper describes the methodology and results of a project under development, aimed at the elaboration of an interactive bibliographical database on color in all fields of application: philosophy, psychology, semiotics, education, anthropology, physical and natural sciences, biology, medicine, technology, industry, architecture and design, arts, linguistics, geography, history. The project is initially based upon an already developed bibliography, published in different journals, updated on various occasions, and now available on the Internet, with more than 2,000 entries. The interactive database will amplify that bibliography, incorporating hyperlinks and contents (indexes, abstracts, keywords, introductions, or eventually the complete document), and devising mechanisms for information retrieval. The sources to be included are: books, doctoral dissertations, multimedia publications, reference works. The main arrangement will be chronological, but the design of the database will allow rearrangements or selections by different fields: subject, Decimal Classification System, author, language, country, publisher, etc. A further project is to develop another database, including color-specialized journals or newsletters, and articles on color published in international journals, arranged in this case by journal name and date of publication, but allowing also rearrangements or selections by author, subject and keywords.

Caivano, Jose L.

2002-06-01

278

ChloroplastDB: the Chloroplast Genome Database  

Microsoft Academic Search

The Chloroplast Genome Database (ChloroplastDB) is an interactive, web-based database for fully sequenced plastid genomes, containing genomic, protein, DNA and RNA sequences, gene locations, RNA-editing sites, putative protein families and alignments (http://chloroplast.cbio.psu.edu/). With recent technical advances, the rate of generating new organelle genomes has increased dramatically. However, the established ontology for chloroplast genes and gene features has not

Liying Cui; Narayanan Veeraraghavan; Alexander Richter; P. Kerr Wall; Robert K. Jansen; James Leebens-mack; Izabela Makalowska; Claude W. Depamphilis

2006-01-01

279

CD-ROM-aided Databases  

NASA Astrophysics Data System (ADS)

Nichigai Associates Inc. has begun an information service that publishes text databases on CD-ROM. Chapter 2 outlines these services and the publication plan for the current fiscal year. Chapter 3 describes the logical CD-ROM file format common to these services, the software that generates files conforming to the format, and the software for retrieving CD-ROM files on personal computers.

Kitamura, Masami

280

Evaluation of the database on mutant frequencies and DNA sequence alterations of vermilion mutations induced in germ cells of Drosophila shows the importance of a neutral mutation detection system.  

PubMed

The vermilion gene in Drosophila has extensively been used for the molecular analysis of mutations induced by chemicals in germ cells in vivo. The gene is located on the X-chromosome and is a useful target for the study of mutagenesis since all types of mutations are generated. We have critically evaluated this system with respect to sensitivity for mutation induction and selectivity for different types of mutations, using a database of more than 600 vermilion mutants induced in postmeiotic male germ cells by 18 mutagens. From most of these mutants the mutation has been analysed. These data showed 336 base substitutions, 96 intra-locus DNA rearrangements and 78 multi-locus deletions (MLD). Mutants containing a MLD were either heterozygous sterile or homozygous and hemizygous lethal. The distribution of both basepair (bp) changes and intra-locus rearrangements over the coding region of the vermilion gene was uniform with no preferences concerning 5' or 3' regions, certain exons, splice sites, specific amino acid changes or nonsense mutations. Possible hotspots for base substitutions seem to be related to the type of DNA damage rather than to the vermilion system. Gene mutations other than bp changes were examined on sequence characteristics flanking the deletion breakpoints. Induction frequencies of vermilion mosaic mutants were, in general, higher than those of vermilion complete mutants, suggesting that persistent lesions are the main contributors to the molecular spectra. Comparison of induction frequencies of vermilion mutants and sex-linked recessive lethal (SLRL) mutants for the 18 mutagens showed that the sensitivity of the vermilion gene against a mutagenic insult is representative for genes located on the X-chromosome. The effect of nucleotide excision repair (NER) on the formation of SLRL mutants correlated with an increase of transversions in the vermilion spectra under NER deficient conditions. 
Furthermore, the clastogenic potency of the mutagens, i.e., the efficiency to induce chromosomal-losses vs. SLRL forward mutations, shows a positive correlation with the percentage of DNA deletions in the molecular spectra of vermilion mutants. PMID:10656485

Nivard, M J; Aguirrezabalaga, I; Ballering, L A; Pastink, A; Sierra, L M; Vogel, E W

1999-12-16

281

Understanding Database Design Expertise  

Microsoft Academic Search

Database design is a complex and time-consuming process. In order to automate database design, an understanding of the nature of expertise that goes into the design process is needed. Although a number of expert systems have been developed to assist or replace a database designer, database design expertise has not been examined in any detail. This paper proposes a conceptual

Veda C. Storey; Cheryl Bagley Thompson; Sudha Ram

1995-01-01

282

Relay database design  

Microsoft Academic Search

Many electric utilities use computer databases to store their relay data. The database naturally gives access to many users, coordinates modifications, and generates the necessary reports for different purposes. The relay database must accommodate a wide variety of protective devices. Users of a relay database include relay engineers, relay technicians, system operators, and other field personnel. Information wanted by these

J. McClain; S. M. Chan; D. Choe

1995-01-01

283

Curation accuracy of model organism databases.  

PubMed

Manual extraction of information from the biomedical literature, or biocuration, is the central methodology used to construct many biological databases. For example, the UniProt protein database, the EcoCyc Escherichia coli database and the Candida Genome Database (CGD) are all based on biocuration. Biological databases are used extensively by life science researchers, as online encyclopedias, as aids in the interpretation of new experimental data and as gold standards for the development of new bioinformatics algorithms. Although manual curation has been assumed to be highly accurate, we are aware of only one previous study of biocuration accuracy. We assessed the accuracy of EcoCyc and CGD by manually selecting curated assertions within randomly chosen EcoCyc and CGD gene pages and by then validating that the data found in the referenced publications supported those assertions. A database assertion is considered to be in error if that assertion could not be found in the publication cited for that assertion. We identified 10 errors in the 633 facts that we validated across the two databases, for an overall error rate of 1.58%, and individual error rates of 1.82% for CGD and 1.40% for EcoCyc. These data suggest that manual curation of the experimental literature by Ph.D.-level scientists is highly accurate. Database URL: http://ecocyc.org/, http://www.candidagenome.org/ PMID:24923819

Keseler, Ingrid M; Skrzypek, Marek; Weerasinghe, Deepika; Chen, Albert Y; Fulcher, Carol; Li, Gene-Wei; Lemmer, Kimberly C; Mladinich, Katherine M; Chow, Edmond D; Sherlock, Gavin; Karp, Peter D

2014-01-01
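
The overall error rate quoted in the abstract above (10 errors in 633 validated facts) can be checked with a few lines of arithmetic. The following is a minimal sketch, not code from the study: the function names are hypothetical, and the Wilson score interval is an added illustration of the uncertainty around such a small count, which the abstract itself does not report.

```python
import math


def error_rate_percent(errors: int, total: int) -> float:
    """Return the error rate as a percentage of validated facts."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * errors / total


def wilson_interval(errors: int, total: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a proportion (an illustrative addition,
    not a statistic reported in the abstract). z = 1.96 gives roughly 95%."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = errors / total
    denom = 1.0 + z * z / total
    center = (p + z * z / (2 * total)) / denom
    half = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / denom
    return center - half, center + half


overall = error_rate_percent(10, 633)
print(f"{overall:.2f}%")  # 1.58%, matching the figure in the abstract

low, high = wilson_interval(10, 633)
print(f"approx. 95% interval: {100 * low:.2f}% to {100 * high:.2f}%")
```

The per-database rates (1.82% for CGD, 1.40% for EcoCyc) cannot be recomputed this way because the abstract does not give the per-database counts.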

284

Collection, mapping, and annotation of over 28,000 cDNA clones from japonica rice.  

PubMed

We collected and completely sequenced 28,469 full-length complementary DNA clones from Oryza sativa L. ssp. japonica cv. Nipponbare. Through homology searches of publicly available sequence data, we assigned tentative protein functions to 21,596 clones (75.86%). Mapping of the cDNA clones to genomic DNA revealed that there are 19,000 to 20,500 transcription units in the rice genome. Protein informatics analysis against the InterPro database revealed the existence of proteins present in rice but not in Arabidopsis. Sixty-four percent of our cDNAs are homologous to Arabidopsis proteins. PMID:12869764

Kikuchi, Shoshi; Satoh, Kouji; Nagata, Toshifumi; Kawagashira, Nobuyuki; Doi, Koji; Kishimoto, Naoki; Yazaki, Junshi; Ishikawa, Masahiro; Yamada, Hitomi; Ooka, Hisako; Hotta, Isamu; Kojima, Keiichi; Namiki, Takahiro; Ohneda, Eisuke; Yahagi, Wataru; Suzuki, Kohji; Li, Chao Jie; Ohtsuki, Kenji; Shishiki, Toru; Otomo, Yasuhiro; Murakami, Kazuo; Iida, Yoshiharu; Sugano, Sumio; Fujimura, Tatsuto; Suzuki, Yutaka; Tsunoda, Yuki; Kurosaki, Takashi; Kodama, Takeko; Masuda, Hiromi; Kobayashi, Michie; Xie, Quihong; Lu, Min; Narikawa, Ryuya; Sugiyama, Akio; Mizuno, Kouichi; Yokomizo, Satoko; Niikura, Junko; Ikeda, Rieko; Ishibiki, Junya; Kawamata, Midori; Yoshimura, Akemi; Miura, Junichirou; Kusumegi, Takahiro; Oka, Mitsuru; Ryu, Risa; Ueda, Mariko; Matsubara, Kenichi; Kawai, Jun; Carninci, Piero; Adachi, Jun; Aizawa, Katsunori; Arakawa, Takahiro; Fukuda, Shiro; Hara, Ayako; Hashizume, Wataru; Hayatsu, Norihito; Imotani, Koichi; Ishii, Yoshiyuki; Itoh, Masayoshi; Kagawa, Ikuko; Kondo, Shinji; Konno, Hideaki; Miyazaki, Ai; Osato, Naoki; Ota, Yoshimi; Saito, Rintaro; Sasaki, Daisuke; Sato, Kenjiro; Shibata, Kazuhiro; Shinagawa, Akira; Shiraki, Toshiyuki; Yoshino, Masayasu; Hayashizaki, Yoshihide; Yasunishi, Ayako

2003-07-18

285

DNA Barcoding  

NSDL National Science Digital Library

This is a two-part animation. "DNA Barcoding, Part 1," provides an overview of how DNA barcoding of animals can be used to identify an unknown sample or discover a new species. Cytochrome c oxidase subunit 1 (COI) is found in the mitochondria as part of the electron transport chain. The COI gene is used for DNA barcoding. Just like a barcode on an item in a grocery store identifies a product, a DNA barcode (determined by DNA sequencing) is used to identify species. Part 1 run time: 1 minute, 40 seconds. "DNA Barcoding, Part 2" details how small tissue samples are used for DNA barcoding, including a review of the laboratory and bioinformatics steps used in barcoding: DNA purification, polymerase chain reaction (PCR), agarose gel electrophoresis, DNA sequencing and analysis, and DNA sequence identification using the Basic Local Alignment Search Tool (BLAST) or the Barcode of Life Database (BOLD). Part 2 run time: 4 minutes, 15 seconds. Animation is closed captioned.

2012-10-22

286

Comparison of DNA Sequences with Protein Sequences  

Microsoft Academic Search

The FASTA package of sequence comparison programs has been expanded to include FASTX and FASTY, which compare a DNA sequence to a protein sequence database, translating the DNA sequence in three frames and aligning the translated DNA sequence to each sequence in the protein database, allowing gaps and frameshifts. Also new are TFASTX and TFASTY, which compare a protein sequence

William R. Pearson; Todd Wood; Zheng Zhang; Webb Miller

1997-01-01
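
The abstract above describes translating a DNA query in three frames before aligning it to protein sequences. The sketch below illustrates only that translation step, using the standard genetic code. It is not the FASTX or FASTY implementation: the function names are hypothetical, and the gapped, frameshift-aware alignment that the programs perform is not attempted here.

```python
from itertools import product

# Standard genetic code, laid out in the conventional TCAG codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str, frame: int) -> str:
    """Translate one forward reading frame (0, 1 or 2).

    Incomplete trailing codons are dropped; codons containing characters
    other than A, C, G, T are rendered as 'X'. Stop codons appear as '*'.
    """
    if frame not in (0, 1, 2):
        raise ValueError("frame must be 0, 1 or 2")
    seq = dna.upper()
    codons = (seq[i:i + 3] for i in range(frame, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def three_frame_translation(dna: str) -> list[str]:
    """Return the translations of the three forward reading frames."""
    return [translate(dna, frame) for frame in range(3)]


print(three_frame_translation("ATGGCCTAA"))  # ['MA*', 'WP', 'GL']
```

Each of the three protein strings could then be compared against a protein database; handling the reverse strand would require translating the reverse complement in the same way.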

287

PRIDE: the proteomics identifications database.  

PubMed

The advent of high-throughput proteomics has enabled the identification of ever increasing numbers of proteins. Correspondingly, the number of publications centered on these protein identifications has increased dramatically. With the first results of the HUPO Plasma Proteome Project being analyzed and many other large-scale proteomics projects about to disseminate their data, this trend is not likely to flatten out any time soon. However, the publication mechanism of these identified proteins has lagged behind in technical terms. Often very long lists of identifications are either published directly with the article, resulting in both a voluminous and rather tedious read, or are included on the publisher's website as supplementary information. In either case, these lists are typically only provided as portable document format documents with a custom-made layout, making it practically impossible for computer programs to interpret them, let alone efficiently query them. Here we propose the proteomics identifications (PRIDE) database (http://www.ebi.ac.uk/pride) as a means to finally turn publicly available data into publicly accessible data. PRIDE offers a web-based query interface, a user-friendly data upload facility, and a documented application programming interface for direct computational access. The complete PRIDE database, source code, data, and support tools are freely available for web access or download and local installation. PMID:16041671

Martens, Lennart; Hermjakob, Henning; Jones, Philip; Adamski, Marcin; Taylor, Chris; States, David; Gevaert, Kris; Vandekerckhove, Joël; Apweiler, Rolf

2005-08-01

288

MitoZoa 2.0: a database resource and search tools for comparative and evolutionary analyses of mitochondrial genomes in Metazoa  

PubMed Central

The MITOchondrial genome database of metaZOAns (MitoZoa) is a public resource for comparative analyses of metazoan mitochondrial genomes (mtDNA) at both the sequence and genomic organizational levels. The main characteristics of the MitoZoa database are the careful revision of mtDNA entry annotations and the possibility of retrieving gene order and non-coding region (NCR) data in appropriate formats. The MitoZoa retrieval system enables basic and complex queries at various taxonomic levels using different search menus. MitoZoa 2.0 has been enhanced in several aspects, including: a re-annotation pipeline to check the correctness of protein-coding gene predictions; a standardized annotation of introns and of precursor ORFs whose functionality is post-transcriptionally recovered by RNA editing or programmed translational frameshifting; updates of taxon-related fields and a BLAST sequence similarity search tool. Database novelties and the definition of standard mtDNA annotation rules, together with the user-friendly retrieval system and the BLAST service, make MitoZoa a valuable resource for comparative and evolutionary analyses as well as a reference database to assist in the annotation of novel mtDNA sequences. MitoZoa is freely accessible at http://www.caspur.it/mitozoa.

D'Onorio de Meo, Paolo; D'Antonio, Mattia; Griggio, Francesca; Lupi, Renato; Borsani, Massimiliano; Pavesi, Giulio; Castrignano, Tiziana; Pesole, Graziano; Gissi, Carmela

2012-01-01

289

The Organelle Genome Database Project (GOBASE).  

PubMed Central

The taxonomically broad organelle genome database (GOBASE) organizes and integrates diverse data related to organelles (mitochondria and chloroplasts). The current version of GOBASE focuses on the mitochondrial subset of data and contains molecular sequences, RNA secondary structures and genetic maps, as well as taxonomic information for all eukaryotic species represented. The database has been designed so that complex biological queries, especially ones posed in a comparative genomics context, are supported. GOBASE has been implemented as a relational database with a web-based user interface (http://megasun.bch.umontreal.ca/gobase/gobase.html). Custom software tools have been written in house to assist in the population of the database, data validation, nomenclature standardization and front-end design. The database is fully operational and publicly accessible via the World Wide Web, allowing interactive browsing, sophisticated searching and easy downloading of data.

Korab-Laskowska, M; Rioux, P; Brossard, N; Littlejohn, T G; Gray, M W; Lang, B F; Burger, G

1998-01-01

290

Genetics and Public Policy Center  

MedlinePLUS

... Genomic Healthcare Challenges New Grant on Clinical WGS Sequencing Policy 04/04/12 Public wants genetic research ... genetic researchers who collect and study banked human DNA. more news releases... Surreptitious DNA testing The combination ...

291

YMDB: the Yeast Metabolome Database.  

PubMed

The Yeast Metabolome Database (YMDB, http://www.ymdb.ca) is a richly annotated 'metabolomic' database containing detailed information about the metabolome of Saccharomyces cerevisiae. Modeled closely after the Human Metabolome Database, the YMDB contains >2000 metabolites with links to 995 different genes/proteins, including enzymes and transporters. The information in YMDB has been gathered from hundreds of books, journal articles and electronic databases. In addition to its comprehensive literature-derived data, the YMDB also contains an extensive collection of experimental intracellular and extracellular metabolite concentration data compiled from detailed Mass Spectrometry (MS) and Nuclear Magnetic Resonance (NMR) metabolomic analyses performed in our lab. This is further supplemented with thousands of NMR and MS spectra collected on pure, reference yeast metabolites. Each metabolite entry in the YMDB contains an average of 80 separate data fields including comprehensive compound description, names and synonyms, structural information, physico-chemical data, reference NMR and MS spectra, intracellular/extracellular concentrations, growth conditions and substrates, pathway information, enzyme data, gene/protein sequence data, as well as numerous hyperlinks to images, references and other public databases. Extensive searching, relational querying and data browsing tools are also provided that support text, chemical structure, spectral, molecular weight and gene/protein sequence queries. Because of S. cerevisiae's importance as a model organism for biologists and as a biofactory for industry, we believe this kind of database could have considerable appeal not only to metabolomics researchers, but also to yeast biologists, systems biologists, the industrial fermentation industry, as well as the beer, wine and spirit industry. PMID:22064855

Jewison, Timothy; Knox, Craig; Neveu, Vanessa; Djoumbou, Yannick; Guo, An Chi; Lee, Jacqueline; Liu, Philip; Mandal, Rupasri; Krishnamurthy, Ram; Sinelnikov, Igor; Wilson, Michael; Wishart, David S

2012-01-01

292

University of Washington: Database Research Group  

NSDL National Science Digital Library

The University of Washington's Database Research Group "is focused on broadening the scope of database and data management techniques beyond their traditional scope." Its work focuses on databases and the Web, XML, data management for ubiquitous computing, data integration, and data mining. For example, the Mangrove project seeks to facilitate the move to the semantic Web by creating "an environment in which users are motivated to create semantic content because of the existence of useful semantic services." The various projects are described on this website along with related publications, which are available to download.

293

DNA BARCODING IN LAND PLANTS: DEVELOPING STANDARDS TO QUANTIFY AND MAXIMIZE SUCCESS.  

PubMed

The selection of a DNA barcode in plants has been impeded in part due to the relatively low rates of nucleotide substitution observed at the most accessible plastid markers. However, the absence of consensus also reflects a lack of standards for comparing potential barcode markers. While many publications have suggested a host of plant DNA barcodes, the studies cannot be readily compared with each other through any quantitative or statistical parameter, partly because they put forward no single compelling rationale relevant to the adoption of a DNA barcode in plants. Here, we argue that the efficacy of any particular plant DNA barcode selection should reflect the anticipated performance of the resulting barcode database in assignment of a query sequence to species. While legitimate scientific disagreement exists over the criteria relevant to "database performance", the notion gives a unifying rationale for prioritizing selection criteria. Accordingly, we suggest a measure of barcode efficacy based on the rationale of database performance, "the probability of correct identification" (PCI). Moreover, the definition of PCI is left flexible enough to handle most of the scientific disagreement over how to best evaluate DNA barcodes. Finally, we consider how different types of barcodes might require different methods of analysis and database design and indicate how the analysis might affect the selection of the most broadly effective barcode for land plants. PMID:19779570

Erickson, David L; Spouge, John; Resch, Alissa; Weigt, Lee A; Kress, W John

2008-11-01
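
The abstract above proposes "the probability of correct identification" (PCI) as a measure of barcode efficacy and deliberately leaves its definition flexible. The sketch below shows one possible instantiation and is an assumption throughout, not the authors' definition: each barcode is held out in turn, its nearest neighbours are found by Hamming distance over pre-aligned, equal-length sequences, and the identification counts as correct only if every nearest neighbour belongs to the query's own species. The function names and the toy records are hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of mismatched positions between two aligned sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b))


def probability_of_correct_identification(records: list[tuple[str, str]]) -> float:
    """Leave-one-out PCI estimate for (species, aligned_sequence) records.

    A held-out query is 'correctly identified' when all of its closest
    database sequences carry the query's species label. This criterion
    is one choice among many that a flexible PCI definition permits.
    """
    correct = 0
    scored = 0
    for i, (species, seq) in enumerate(records):
        others = [
            (hamming(seq, other_seq), other_species)
            for j, (other_species, other_seq) in enumerate(records)
            if j != i
        ]
        if not others:
            continue
        best = min(distance for distance, _ in others)
        nearest_species = {sp for distance, sp in others if distance == best}
        scored += 1
        if nearest_species == {species}:
            correct += 1
    if scored == 0:
        raise ValueError("need at least two records")
    return correct / scored


# Toy, made-up barcodes: species C has no conspecific in the database,
# and one species A sequence is equally close to A and C.
toy = [
    ("A", "AAAA"),
    ("A", "AAAT"),
    ("B", "GGGG"),
    ("B", "GGGC"),
    ("C", "AATT"),
]
print(probability_of_correct_identification(toy))  # 0.6
```

Comparing candidate markers would then amount to computing this value on the same set of specimens for each marker, under whichever correctness criterion is agreed on.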

294

Search the PAQ Database  

Cancer.gov

Use this interface to search through all the physical activity-related questions in our database. See below for some tips on how to search. If you have any comments concerning this database, please e-mail David Berrigan.

295

Structural Ceramics Database  

National Institute of Standards and Technology Data Gateway

SRD 30 NIST Structural Ceramics Database (Web, free access)   The NIST Structural Ceramics Database (WebSCD) provides evaluated materials property data for a wide range of advanced ceramics known variously as structural ceramics, engineering ceramics, and fine ceramics.

296

NIOSH Power Tools Database  

MedlinePLUS

... Search Tool Type: Manufacturer: Technical Specifications: Power Tools Database Choose one of the tools below to view sound specifications. Disclaimer: The NIOSH database does not include a comprehensive list of manufacturers ...

297

Household Products Database: Pesticides  

MedlinePLUS

... Names Types of Products Manufacturers Ingredients About the Database FAQ Product Recalls Help Glossary Contact Us More ... holders. Information is extracted from Consumer Product Information Database ©2001-2013 by DeLima Associates. All rights reserved. ...

298

The Database Index.  

ERIC Educational Resources Information Center

Describes the characteristics of System Development Corporation's (SDC) online index to its database, "Database Index," and presents the results of various test searches to demonstrate the limitations and benefits of the new online file. (JD)

Antony, Arthur

1979-01-01

299

Kentucky Geotechnical Database.  

National Technical Information Service (NTIS)

Development of a comprehensive dynamic, geotechnical database is described. This report is the fourth of four, recently completed, research studies. It summarizes all studies and describes the integration of major components of the database. Components in...

L. Sun; T. C. Hopkins; T. L. Beckham

2005-01-01

300

The Chloroplast Function Database II: a comprehensive collection of homozygous mutants and their phenotypic/genotypic traits for nuclear-encoded chloroplast proteins.  

PubMed

The Chloroplast Function Database has so far offered phenotype information on mutants of the nuclear-encoded chloroplast proteins in Arabidopsis that pertains to >200 phenotypic data sets that were obtained from 1,722 transposon- or T-DNA-tagged lines. Here, we present the development of the second version of the database, which is named the Chloroplast Function Database II and was redesigned to increase the number of mutant characters and new user-friendly tools for data mining and integration. The upgraded database offers information on genome-wide mutant screens for any visible phenotype against 2,495 tagged lines to create a comprehensive homozygous mutant collection. The collection consists of 147 lines with seedling phenotypes and 185 lines for which we could not obtain homozygotes, as well as 1,740 homozygotes with wild-type phenotypes. Besides providing basic information about primer lists that were used for the PCR genotyping of T-DNA-tagged lines and explanations about the preparation of homozygous mutants and phenotype screening, the database includes access to a link between the gene locus and existing publicly available databases. This gives users access to a combined pool of data, enabling them to gain valuable insights into biological processes. In addition, high-resolution images of plastid morphologies of mutants with seedling-specific chloroplast defects as observed with transmission electron microscopy (TEM) are available in the current database. This database is used to compare the phenotypes of visually identifiable mutants with their plastid ultrastructures and to evaluate their potential significance from characteristic patterns of plastid morphology in vivo. Thus, the Chloroplast Function Database II is a useful and comprehensive information resource that can help researchers to connect individual Arabidopsis genes to plastid functions on the basis of phenotype analysis of our tagged mutant collection. 
It can be freely accessed at http://rarge.psc.riken.jp/chloroplast/. PMID:23230006

Myouga, Fumiyoshi; Akiyama, Kenji; Tomonaga, Yumi; Kato, Aya; Sato, Yuka; Kobayashi, Megumi; Nagata, Noriko; Sakurai, Tetsuya; Shinozaki, Kazuo

2013-02-01

301

The Mouse Tumor Biology database  

PubMed Central

The laboratory mouse has long been an important tool in the study of the biology and genetics of human cancer. With the advent of genetic engineering techniques, DNA microarray analyses, tissue arrays, and other large-scale, high-throughput data generating methods, the amount of data available for mouse models of cancer is growing exponentially. Tools to integrate, locate and visualize these data are crucial to aid researchers in their investigations. The Mouse Tumor Biology database (http://tumor.informatics.jax.org) seeks to address that need.

Krupke, Debra M.; Begley, Dale A.; Sundberg, John P.; Bult, Carol J.; Eppig, Janan T.

2008-01-01

302

Estimating Haplotype Frequency and Coverage of Databases  

PubMed Central

A variety of forensic, population, and disease studies are based on haploid DNA (e.g. mitochondrial DNA or Y-chromosome data). For any set of genetic markers, databases of conventional size will normally contain only a fraction of all haplotypes. For several applications, reliable estimates of haplotype frequencies, the total number of haplotypes and coverage of the database (the probability that the next random haplotype is contained in the database) will be useful. We propose different approaches to the problem based on classical methods as well as new applications of Principal Component Analysis (PCA). We also discuss previous proposals based on saturation curves. Several conclusions can be inferred from simulated and real data. First, classical estimates of the fraction of unseen haplotypes can be seriously biased. Second, there is no obvious way to decide on required sample size based on traditional approaches. Methods based on testing of hypotheses or length of confidence intervals may appear artificial since no single test or parameter stands out as particularly relevant. Rather, the coverage may be more relevant since it indicates the percentage of different haplotypes that are contained in a database; if the coverage is low, there is a considerable chance that the next haplotype to be observed does not appear in the database, and this indicates that the database needs to be expanded. Finally, freeware and example data sets accompany the methods discussed in this paper: http://folk.uio.no/thoree/nhap/.

Egeland, Thore; Salas, Antonio

2008-01-01
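
The abstract above defines database coverage as the probability that the next random haplotype is already in the database. One classical way to estimate such a quantity is the Good-Turing estimator, which takes one minus the share of observations that are singletons. The sketch below uses that estimator purely as an illustration: the abstract does not say which classical methods the authors use, it warns that classical estimates can be biased, and the function name and sample data here are hypothetical.

```python
from collections import Counter


def good_turing_coverage(haplotypes: list[str]) -> float:
    """Estimate coverage as 1 - n1 / N.

    N is the number of sampled individuals and n1 is the number of
    haplotypes observed exactly once. This is the Good-Turing estimate,
    chosen here as an example of a classical approach.
    """
    n = len(haplotypes)
    if n == 0:
        raise ValueError("sample must not be empty")
    counts = Counter(haplotypes)
    singletons = sum(1 for count in counts.values() if count == 1)
    return 1.0 - singletons / n


# Made-up sample: ten individuals, two of whom carry a unique haplotype.
sample = ["H1"] * 5 + ["H2"] * 3 + ["H3", "H4"]
print(good_turing_coverage(sample))  # 0.8
```

A low value suggests, in the abstract's terms, that the next observed haplotype has a considerable chance of being absent and that the database needs to be expanded.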

303

A Prolog Database System  

SciTech Connect

A Prolog Database System has direct relevance to the Japanese Fifth Generation Computer Project in which both Prolog and relational databases are essential. It can be read as an integrated whole, showing the sorts of systems and methods which will become common, or it can be read as a detached study of Prolog programming, query methods for relational databases or the application of associative methods to relational databases.

Heriot, D.L.I.

1984-01-01

304

NIST Special Database 10  

Microsoft Academic Search

This document describes the NIST fingerprint database, NIST Special Database 10. The database provides a large sample of patterns for transitional fingerprint classes and classes with a low natural frequency of occurrence in NIST Special Database 9. The 552 fingerprint cards in NIST Special Database 10 are non-mated cards archived on a set of three CD-ROMs with the first CD-ROM containing 2160 fingerprint images

C. I. Watson

1993-01-01

305

JICST Mass Spectral Database  

NASA Astrophysics Data System (ADS)

The Japan Information Center of Science and Technology (JICST) started an online mass spectral database service in March 1987. The database is composed of the NBS-NIH-EPA-MSDC Mass Spectral Database and original data compiled by the Mass Spectroscopy Society of Japan in cooperation with JICST. The authors outline the database system, data items and search commands. An example of a retrieval session is presented.

Onodera, Natsuo; Abe, Atsushi; Atago, Ryuji

306

NMNH Entomology Database Library  

NSDL National Science Digital Library

The Smithsonian's Department of Entomology database library offers 25 databases and bibliographies related to insects and entomology, collectively containing many thousands of records. The main page lists the databases available and describes most of them. The number of records indexed and available search options vary for each database or bibliography. Additional information on the Department of Entomology and its collections is available from the department's homepage.


307

NASA scientific and technical publications: A catalog of special publications, reference publications, conference publications, and technical papers, 1989  

NASA Technical Reports Server (NTRS)

This catalog lists 190 citations of all NASA Special Publications, NASA Reference Publications, NASA Conference Publications, and NASA Technical Papers that were entered into the NASA scientific and technical information database during accession year 1989. The entries are grouped by subject category. Indexes of subject terms, personal authors, and NASA report numbers are provided.

1990-01-01

308

NASA scientific and technical publications: A catalog of special publications, reference publications, conference publications, and technical papers, 1991-1992  

NASA Technical Reports Server (NTRS)

This catalog lists 458 citations of all NASA Special Publications, NASA Reference Publications, NASA Conference Publications, and NASA Technical Papers that were entered into the NASA Scientific and Technical Information database during accession years 1991 through 1992. The entries are grouped by subject category. Indexes of subject terms, personal authors, and NASA report numbers are provided.

1993-01-01

309

IEEE Conference Publications in Libraries.  

ERIC Educational Resources Information Center

Conclusions of surveys (63 libraries, OCLC database, University of Rhode Island users) assessing handling of Institute of Electrical and Electronics Engineers (IEEE) conference publications indicate that most libraries fully catalog these publications using LC cataloging, and library patrons frequently require series access to publications. Eight…

Johnson, Karl E.

1984-01-01

310

NASA scientific and technical publications: A catalog of special publications, reference publications, conference publications, and technical papers, 1987-1990  

NASA Technical Reports Server (NTRS)

This catalog lists 783 citations of all NASA Special Publications, NASA Reference Publications, NASA Conference Publications, and NASA Technical Papers that were entered into the NASA Scientific and Technical Information Database during the years 1987 through 1990. The entries are grouped by subject category. Indexes of subject terms, personal authors, and NASA report numbers are provided.

1991-01-01

311

Mission and Assets Database  

NASA Technical Reports Server (NTRS)

Mission and Assets Database (MADB) Version 1.0 is an SQL database system with a Web user interface to centralize information. The database stores flight project support resource requirements, view periods, antenna information, schedule, and forecast results for use in mid-range and long-term planning of Deep Space Network (DSN) assets.

Baldwin, John; Zendejas, Silvino; Gutheinz, Sandy; Borden, Chester; Wang, Yeou-Fang

2009-01-01

312

Decision Points for Databases.  

ERIC Educational Resources Information Center

Argues that the selection of a database is a significant factor in the success and cost effectiveness of an online search, and provides guidelines for determining whether the content of a database is relevant for a particular search and whether the database is accessible, affordable, and suitable for the search. (LAE)

Basch, Reva

1992-01-01

313

Full-Text Databases.  

ERIC Educational Resources Information Center

This review of the literature on full-text databases provides information on search strategy, performance measurement, and the benefits and limitations of full text compared to bibliographic database searching. Various use studies and uses of full-text databases are also listed. (21 references) (LAE)

Siddiqui, Moid A.

1991-01-01

314

Statistical database design  

Microsoft Academic Search

The security problem of a statistical database is to limit the use of the database so that no sequence of statistical queries is sufficient to deduce confidential or private information. In this paper it is suggested that the problem be investigated at the conceptual data model level. The design of a statistical database should utilize a statistical security management facility

Francis Y. L. Chin; Gultekin Ozsoyoglu

1981-01-01

315

Temporal database system implementations  

Microsoft Academic Search

Although research on temporal database systems has been active for about 20 years, implementations have not appeared until recently. This is one reason why current commercial database systems provide only limited temporal functionality. This paper summarizes the extant state of the art of temporal database implementations. Rather than being very specific about each system we have attempted to provide an indication

Michael H. Böhlen

1995-01-01

316

The Columbia grasp database  

Microsoft Academic Search

Collecting grasp data for learning and benchmarking purposes is very expensive. It would be helpful to have a standard database of graspable objects, along with a set of stable grasps for each object, but no such database exists. In this work we show how to automate the construction of a database consisting of several hands, thousands of objects, and

Corey Goldfeder; Matei T. Ciocarlie; Hao Dang; Peter K. Allen

2009-01-01

317

DHQ Database Revision History  

Cancer.gov

The following details all additions and revisions made to the DHQ nutrient and food database. This revision history is provided as a reference for investigators who may have performed analyses with a previous release of the database. The contents of the current database are listed in Nutrient and Dietary Constituents and Pyramid Food Groups.

318

[Utilization of clinical database for quality improvement in health care].  

PubMed

Measurement and feedback of clinical performance is one of the important activities of a clinical database. In addition, clinical databases are used for developing medical guidelines, for public policy research and clinical research, and for providing various kinds of information to the public. On the other hand, risk adjustment in data analysis is a key issue for clinical databases. From these viewpoints, this article describes quality improvement activities, examples of research, and the planning of clinical research using a clinical database. PMID:24008645

Tomotaki, Ai; Miyata, Hiroaki; Okubo, Suguru; Motomura, Noboru; Murakami, Arata; Ono, Minoru; Iwanaka, Tadashi

2013-09-01

319

GOLD: The Genomes Online Database  

DOE Data Explorer

Since its inception in 1997, GOLD has continuously monitored genome sequencing projects worldwide and has provided the community with a unique centralized resource that integrates diverse information related to Archaea, Bacteria, Eukaryotic and more recently Metagenomic sequencing projects. As of September 2007, GOLD recorded 639 completed genome projects. These projects have their complete sequence deposited into the public archival sequence databases such as GenBank, EMBL, and DDBJ. From the total of 639 complete and published genome projects as of 9/2007, 527 were bacterial, 47 were archaeal and 65 were eukaryotic. In addition to the complete projects, there were 2158 ongoing sequencing projects. 1328 of those were bacterial, 59 archaeal and 771 eukaryotic projects. Two types of metadata are provided by GOLD: (i) project metadata and (ii) organism/environment metadata. GOLD CARD pages for every project are available from the link of every GOLD_STAMP ID. The information in every one of these pages is organized into three tables: (a) Organism information, (b) Genome project information and (c) External links. [The Genomes On Line Database (GOLD) in 2007: Status of genomic and metagenomic projects and their associated metadata, Konstantinos Liolios, Konstantinos Mavromatis, Nektarios Tavernarakis and Nikos C. Kyrpides, Nucleic Acids Research Advance Access published online on November 2, 2007, Nucleic Acids Research, doi:10.1093/nar/gkm884]

The basic tables in the GOLD database that can be browsed or searched include the following information:

  • Gold Stamp ID
  • Organism name
  • Domain
  • Links to information sources
  • Size and link to a map, when available
  • Chromosome number, plasmid number, and GC content
  • A link for downloading the actual genome data
  • Institution that did the sequencing
  • Funding source
  • Database where information resides
  • Publication status and information

(Specialized Interface)

Kyrpides, Nikos; Liolios, Dinos; Chen, Amy; Tavernarakis, Nektarios; Hugenholtz, Philip; Markowitz, Victor; Bernal, Alex

320

Secure wildcard searchable database  

US Patent & Trademark Office Database

A system and method for providing access to data stored in encrypted form in a physically non-secure database without compromising security of the data in the physically non-secure database is disclosed. A representation of at least some of the data from the database in unencrypted form is stored in volatile memory associated with the server. The wildcard search is performed on the representation. Search results are displayed to the user to allow the user to select database contents to be retrieved. The user's selection is retrieved from the database and decrypted. Finally, the unencrypted selection results are provided to the user.

2013-10-01

321

ETC Spills Technology Databases: Oil Properties Database  

NSDL National Science Digital Library

The Environmental Technology Center of Environment Canada provides a database which contains various properties of crude oils and petroleum products. In addition to the listing of oils, the database reports properties "which will likely determine the environmental behavior and effects of spilled oil." The user may select an oil from a pull-down menu that lists 412 oils. The data are organized into tables in the following areas: Origin, API Gravity, Density, Pour Point, Dynamic Viscosity, Hydrocarbon Groups, and Distillation.

Fieldhouse, B.; Fingas, M.; Jokuty, P.; Lambert, P.; Mullin, J.; Wang, Z.

322

An Introduction to Database Structure and Database Machines.  

ERIC Educational Resources Information Center

Enumerates principal management objectives of database management systems (data independence, quality, security, multiuser access, central control) and criteria for comparison (response time, size, flexibility, other features). Conventional database management systems, relational databases, and database machines used for backend processing are…

Detweiler, Karen

1984-01-01

323

Raintree: Tropical Plant Database  

NSDL National Science Digital Library

Hosted by Raintree, the Tropical Plant Database is authored and maintained by Board Certified Naturopath Ms. Leslie Taylor to provide accurate information about rainforest plants and to help promote rainforest conservation. Well organized and including over 300 pages of documentation on rainforest plants, the Tropical Plant Database lists plants by Common name, Botanical name, Ethnic uses, and Action/disorder. The Database File for each plant includes an illustration and information about family, genus, species, common names, plant description, and more. Visitors can link to illustrations and photos as well as web resources for each plant, including Medline Abstracts, W3 TROPICOS Database, Ethnobotany Database, and Phytochem Database, among others. Plant Database File pages include references as well.

2007-06-15

324

Testing systems database specification  

SciTech Connect

The Pacific Northwest Laboratory (PNL) has been contracted by the Navy Personnel Research and Development Center (NPRDC) to provide software for enhancing the NPRDC CAMP Facility. The objectives of this Database Specification are to provide the Database Administrator, maintenance programmers, and users with the following: a reference to the database design and data tables in the database, a description of the data flow when loaded into the database, a list of storage requirements for the database and support programs, and a discussion of the security issues involved. This paper describes the database: its name, physical system, availability, storage requirements, organization, data loading logic, data modification, support programs, and security. Also defined are the data tables and elements that compose the TSD.

Caplinger, J.T.; Corrigan, A.L.; Olander, M.

1989-05-01

325

Space medicine research publications: 1983-1984  

NASA Technical Reports Server (NTRS)

A list of publications supported by the Space Medicine Program, Office of Space Science and Applications is given. Included are publications entered into the Life Sciences Bibliographic Database by The George Washington University as of October 1, 1984.

Solberg, J. L.; Pleasant, L. G.

1984-01-01

326

Database description with SDM: a semantic database model  

Microsoft Academic Search

SDM is a high-level semantics-based database description and structuring formalism (database model) for databases. This database model is designed to capture more of the meaning of an application environment than is possible with contemporary database models. An SDM specification describes a database in terms of the kinds of entities that exist in the application environment, the classifications and groupings of

Michael M. Hammer; Dennis Mc Leod

1981-01-01

327

NBC update: The addition of viral and fungal databases to the Naïve Bayes classification tool  

PubMed Central

Background Classifying the fungal and viral content of a sample is an important component of analyzing microbial communities in environmental media. Therefore, a method to classify any fragment from these organisms' DNA should be implemented. Results We update the naïve Bayes classification (NBC) tool to classify reads originating from viral and fungal organisms. NBC classifies a fungal dataset similarly to Basic Local Alignment Search Tool (BLAST) and the Ribosomal Database Project (RDP) classifier. We also show NBC's similarities and differences to RDP on a fungal large subunit (LSU) ribosomal DNA dataset. For viruses in the training database, strain classification accuracy is 98%, while for those reads originating from sequences not in the database, the order-level accuracy is 78%, where order indicates the taxonomic level in the tree of life. Conclusions In addition to being competitive with other available classifiers, NBC has the potential to handle reads originating from any location in the genome. We recommend using the Bacteria/Archaea, Fungal, and Virus databases separately due to algorithmic biases towards long genomes. The tool is publicly available at: http://nbc.ece.drexel.edu.

2012-01-01
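
To make the classification idea in the NBC record above concrete, the following is a minimal sketch of naïve Bayes classification over k-mer counts. It is not the NBC tool's implementation: the function names, the toy sequences, the choice of k = 3, the uniform prior over organisms, and the add-one smoothing are all assumptions made for illustration.

```python
import math
from collections import Counter


def kmers(seq, k):
    """Return every overlapping k-mer in seq."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


def train(genomes, k):
    """Count k-mers for each labelled training sequence."""
    model = {}
    for label, seq in genomes.items():
        counts = Counter(kmers(seq, k))
        model[label] = (counts, sum(counts.values()))
    return model


def classify(read, model, k):
    """Return the label with the highest log-likelihood (uniform prior assumed)."""
    vocab = 4 ** k  # number of possible DNA k-mers, used for add-one smoothing
    best_label, best_score = None, -math.inf
    for label, (counts, total) in model.items():
        score = sum(
            math.log((counts[km] + 1) / (total + vocab)) for km in kmers(read, k)
        )
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training sequences, invented for illustration; not real genomes.
toy_genomes = {
    "organism_A": "ATATATATATATATATATAT",
    "organism_B": "GCGCGCGCGCGCGCGCGCGC",
}
model = train(toy_genomes, k=3)
```

Calling `classify("ATATAT", model, 3)` scores the read against each organism's k-mer profile and returns the best-scoring label. Because the score is a sum over the read's own k-mers, a read from any location in a genome can be scored the same way.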

328

Women in Politics: Bibliographic Database  

NSDL National Science Digital Library

This bibliographic database currently holds 650 titles of recent works concerned with women in politics. A new addition to the Inter-Parliamentary Union's "Democracy through Partnership between Men and Women in Politics" site, "it provides bibliographic references to books, reports and journal articles on all aspects of women's participation in political life worldwide." The search mechanism allows users to specify type of document, geographic region, publishing organization, subject matter, author, title of periodical, and year of publication. Alternatively, there is also a subject keyword search. For more information about the Inter-Parliamentary Union Website, see the December 12, 1997 Scout Report.

329

Tomato Expression Database (TED): a suite of data presentation and analysis tools  

PubMed Central

The Tomato Expression Database (TED) includes three integrated components. The Tomato Microarray Data Warehouse serves as a central repository for raw gene expression data derived from the public tomato cDNA microarray. In addition to expression data, TED stores experimental design and array information in compliance with the MIAME guidelines and provides web interfaces for researchers to retrieve data for their own analysis and use. The Tomato Microarray Expression Database contains normalized and processed microarray data for ten time points with nine pair-wise comparisons during fruit development and ripening in a normal tomato variety and nearly isogenic single gene mutants impacting fruit development and ripening. Finally, the Tomato Digital Expression Database contains raw and normalized digital expression (EST abundance) data derived from analysis of the complete public tomato EST collection containing >150 000 ESTs derived from 27 different non-normalized EST libraries. This last component also includes tools for the comparison of tomato and Arabidopsis digital expression data. A set of query interfaces and analysis and visualization tools has been developed and incorporated into TED to aid users in identifying and deciphering biologically important information from our datasets. TED can be accessed at .

Fei, Zhangjun; Tang, Xuemei; Alba, Rob; Giovannoni, James

2006-01-01

330

Open access intrapartum CTG database  

PubMed Central

Background Cardiotocography (CTG) is the monitoring of fetal heart rate and uterine contractions. Since 1960 it has been routinely used by obstetricians to assess fetal well-being. Many attempts to introduce methods of automatic signal processing and evaluation have appeared during the last 20 years; however, no significant progress comparable to that in the domain of adult heart rate variability, where open access databases are available (e.g. MIT-BIH), is yet visible. Based on a thorough review of the relevant publications, presented in this paper, the shortcomings of the current state are obvious. A lack of common ground for clinicians and technicians in the field hinders clinically usable progress. Our open access database of digital intrapartum cardiotocographic recordings aims to change that. Description The intrapartum CTG database consists in total of 552 intrapartum recordings, which were acquired between April 2010 and August 2012 at the obstetrics ward of the University Hospital in Brno, Czech Republic. All recordings were stored in electronic form in the OB TraceVue® system. The recordings were selected from 9164 intrapartum recordings with clinical as well as technical considerations in mind. All recordings are at most 90 minutes long and start a maximum of 90 minutes before delivery. The time relation of CTG to delivery is known, as is the length of the second stage of labor, which does not exceed 30 minutes. The majority of recordings (all but 46 cesarean sections) are, on purpose, from vaginal deliveries. All recordings have available biochemical markers as well as some more general clinical features. A full description of the database and the reasoning behind the selection of the parameters is presented in the paper. Conclusion A new open-access CTG database is introduced which should give the research community common ground for comparison of results on a reasonably large database. We anticipate that after reading the paper, the reader will understand the context of the field from clinical and technical perspectives, which will enable him/her to use the database and also understand its limitations.

2014-01-01

331

Challenges in the association of human single nucleotide polymorphism mentions with unique database identifiers  

PubMed Central

Background Most information on genomic variations and their associations with phenotypes is covered exclusively in scientific publications rather than in structured databases. These texts commonly describe variations using natural language; database identifiers are seldom mentioned. This complicates the retrieval of variations and associated articles, as well as information extraction, e.g. the search for biological implications. To overcome these challenges, procedures to map textual mentions of variations to database identifiers need to be developed. Results This article describes a workflow for normalization of variation mentions, i.e. the association of them to unique database identifiers. Common pitfalls in the interpretation of single nucleotide polymorphism (SNP) mentions are highlighted and discussed. The developed normalization procedure achieves a precision of 98.1% and a recall of 67.5% for unambiguous association of variation mentions with dbSNP identifiers on a text corpus based on 296 MEDLINE abstracts containing 527 mentions of SNPs. The annotated corpus is freely available at http://www.scai.fraunhofer.de/snp-normalization-corpus.html. Conclusions Comparable approaches usually focus on variations mentioned on the protein sequence and neglect problems for other SNP mentions. The results presented here indicate that normalizing SNPs described on DNA level is more difficult than the normalization of SNPs described on protein level. The challenges associated with normalization are exemplified with ambiguities and errors, which occur in this corpus.

2011-01-01

332

LncRNADisease: a database for long-non-coding RNA-associated diseases  

PubMed Central

In this article, we describe a long-non-coding RNA (lncRNA) and disease association database (LncRNADisease), which is publicly accessible at http://cmbi.bjmu.edu.cn/lncrnadisease. In recent years, a large number of lncRNAs have been identified and increasing evidence shows that lncRNAs play critical roles in various biological processes. Therefore, the dysfunctions of lncRNAs are associated with a wide range of diseases. It thus becomes important to understand lncRNAs’ roles in diseases and to identify candidate lncRNAs for disease diagnosis, treatment and prognosis. For this purpose, a high-quality lncRNA–disease association database would be extremely beneficial. Here, we describe the LncRNADisease database that collected and curated approximately 480 entries of experimentally supported lncRNA–disease associations, including 166 diseases. LncRNADisease also curated 478 entries of lncRNA interacting partners at various molecular levels, including protein, RNA, miRNA and DNA. Moreover, we annotated lncRNA–disease associations with genomic information, sequences, references and species. We normalized the disease name and the type of lncRNA dysfunction and provided a detailed description for each entry. Finally, we developed a bioinformatic method to predict novel lncRNA–disease associations and integrated the method and the predicted associated diseases of 1564 human lncRNAs into the database.

Chen, Geng; Wang, Ziyun; Wang, Dongqing; Qiu, Chengxiang; Liu, Mingxi; Chen, Xing; Zhang, Qipeng; Yan, Guiying; Cui, Qinghua

2013-01-01

333

Ecology in the age of DNA barcoding: the resource, the promise and the challenges ahead.  

PubMed

Ten years after DNA barcoding was initially suggested as a tool to identify species, millions of barcode sequences from more than 1100 species are available in public databases. While several studies have reviewed the methods and potential applications of DNA barcoding, most have focused on species identification and discovery, and relatively few have addressed applications of DNA barcoding data to ecology. These data, and the associated information on the evolutionary histories of taxa that they can provide, offer great opportunities for ecologists to investigate questions that were previously difficult or impossible to address. We present an overview of potential uses of DNA barcoding relevant in the age of ecoinformatics, including applications in community ecology, species invasion, macroevolution, trait evolution, food webs and trophic interactions, metacommunities, and spatial ecology. We also outline some of the challenges and potential advances in DNA barcoding that lie ahead. PMID:24118947

Joly, Simon; Davies, T Jonathan; Archambault, Annie; Bruneau, Anne; Derry, Alison; Kembel, Steven W; Peres-Neto, Pedro; Vamosi, Jana; Wheeler, Terry A

2014-03-01

334

Challenges of DNA profiling in mass disaster investigations.  

PubMed

In cases of mass disaster, there is often a need for managing, analyzing, and comparing large numbers of biological samples and DNA profiles. This requires the use of laboratory information management systems for large-scale sample logging and tracking, coupled with bioinformatic tools for DNA database searching according to different matching algorithms, and for the evaluation of the significance of each match by likelihood ratio calculations. There are many different interrelated factors and circumstances involved in each specific mass disaster scenario that may challenge the final DNA identification goal, such as: the number of victims, the mechanisms of body destruction, the extent of body fragmentation, the rate of DNA degradation, the body accessibility for sample collection, or the types of DNA reference samples available. In this paper, we examine the different steps of the DNA identification analysis (DNA sampling, DNA analysis and technology, DNA database searching, and concordance and kinship analysis), reviewing the "lessons learned" and the scientific progress made in some mass disaster cases described in the scientific literature. We will put special emphasis on the valuable scientific feedback that the forensic genetics community has received from the collaborative efforts of several public and private USA forensic laboratories in assisting with the more critical areas of the World Trade Center (WTC) mass fatality of September 11, 2001. The main challenges in identifying the victims of the recent South Asian Tsunami disaster, which has produced the steepest death count rise in history, will also be considered. We also present data from two recent mass fatality cases that involved Spanish victims: the Madrid terrorist attack of March 11, 2004, and the Yakovlev-42 aircraft accident in Trabzon, Turkey, of May 26, 2003. PMID:16100756

Alonso, Antonio; Martin, Pablo; Albarrán, Cristina; Garcia, Pilar; Fernandez de Simon, Lourdes; Jesús Iturralde, Maria; Fernández-Rodriguez, Amparo; Atienza, Inmaculada; Capilla, Javier; García-Hirschfeld, Julia; Martinez, Pilar; Vallejo, Gloria; García, Oscar; García, Emilio; Real, Pilar; Alvarez, David; León, Antonio; Sancho, Manuel

2005-08-01

335

THE DRINKING WATER TREATABILITY DATABASE (Conference Paper)  

EPA Science Inventory

The Drinking Water Treatability Database (TDB) assembles referenced data on the control of contaminants in drinking water, housed on an interactive, publicly-available, USEPA web site (www.epa.gov/tdb). The TDB is of use to drinking water utilities, treatment process design engin...

336

THE DRINKING WATER TREATABILITY DATABASE (Slides)  

EPA Science Inventory

The Drinking Water Treatability Database (TDB) assembles referenced data on the control of contaminants in drinking water, housed on an interactive, publicly-available, USEPA web site (www.epa.gov/tdb). The TDB is of use to drinking water utilities, treatment process design engin...

337

The Drinking Water Treatability Database (Poster)  

EPA Science Inventory

Drinking Water Treatability Database (TDB) will provide data taken from the literature on the control of contaminants in drinking water, and will be housed on an interactive, publicly-available USEPA web site. It can be used for identifying effective treatment processes, recogni...

338

A Database System for Course Administration.  

ERIC Educational Resources Information Center

Describes a computer-assisted testing system which produces multiple-choice examinations for a college course in business administration. The system uses SPIRES (Stanford Public Information REtrieval System) to manage a database of questions and related data, mark-sense cards for machine grading tests, and ACL (6) (Audit Command Language) to…

Benbasat, Izak; And Others

1982-01-01

339

GOVERNING GENETIC DATABASES: COLLECTION, STORAGE AND USE  

PubMed Central

This paper provides an introduction to a collection of five papers, published as a special symposium journal issue, under the title: “Governing Genetic Databases: Collection, Storage and Use”. It begins by setting the scene, to provide a backdrop and context for the papers. It describes the evolving scientific landscape around genetic databases and genomic research, particularly within the biomedical and criminal forensic investigation fields. It notes the lack of any clear, coherent or coordinated legal governance regime, either at the national or international level. It then identifies and reflects on key cross-cutting issues and themes that emerge from the five papers, in particular: terminology and definitions; consent; special concerns around population genetic databases (biobanks) and forensic databases; international harmonisation; data protection; data access; boundary-setting; governance; and issues around balancing individual interests against public good values.

Gibbons, Susan M.C.; Kaye, Jane

2008-01-01

340

Dial-up remote access image database  

NASA Astrophysics Data System (ADS)

In this paper, a prototype system for a dial-up remote access image database is proposed. As a videotex system, it includes an Information Customer, an Information Provider, a Communication Server, the Public Switched Telephone Network, and a database server containing an image database. Because color natural images are included in the database, a high-resolution visual medium is provided and many applications become possible. Currently, a color image with a resolution of 400 by 400 can be accessed in about 25 seconds by using JPEG compression and a high-speed modem. The system can be employed in many applications, such as home shopping and remote education. It can also serve as a pioneer system for providing teleservices over the Integrated Services Digital Network.

Ho, Chung-Ding; Lee, Su-Ming; Liao, Pen-Kung; Tsai, Ming-Houng; Shieh, Wern-Sheng; Chang, Horng-Ren; Ju, Rong-Hauh

1994-04-01

341

Analyses of porcine public SNPs in coding-gene regions by re-sequencing and phenotypic association studies  

Microsoft Academic Search

The Porcine SNP database has a huge number of SNPs, but these SNPs are mostly found by computer data-mining procedures and have not been well characterized. We re-sequenced 1,439 porcine public SNPs from four commercial pig breeds and one Korean domestic breed (Korean Native pig, KNP) by using two DNA pools from eight unrelated animals in each breed. These SNPs

Xiaoping Li; Sang-Wook Kim; Kyoung-Tag Do; You-Kyoung Ha; Yun-Mi Lee; Suk-Hee Yoon; Hee-Bal Kim; Jong-Joo Kim; Bong-Hwan Choi; Kwan-Suk Kim

2011-01-01

342

Acute Toxicity Database  

NSDL National Science Digital Library

This searchable database (by species or chemical) from the US Geological Survey was initially developed in 1986 using some 4,900 acute toxicity tests (since 1969) with 410 chemicals and 66 species of aquatic animals. Two databases may be viewed online (or downloaded): the ID Database (CSV format), including Chemical ID, Chemical Name, Units of Measure, Use of Chemical, and CAS number, and Acute Toxicity Test Results Database, a CSV file with acute toxicity data by Chemical ID. Also available onsite is the "Manual of Acute Toxicity: Interpretation and DataBase for 410 Chemicals and 66 Species of Freshwater Animals." The manual describes the database structure and contents, data quality, and metadata reference information.
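
Since the record above describes two CSV files linked by Chemical ID, here is a minimal sketch of how such files might be joined after download. The column names "Chemical ID", "Chemical Name", and "CAS number" come from the description above; the "Species" and "Result" columns, the function name, and every data row are placeholders invented for illustration and do not reflect the actual file layout or any real toxicity data.

```python
import csv
import io

# Placeholder rows, invented for illustration only; not real toxicity data.
id_csv = io.StringIO(
    "Chemical ID,Chemical Name,CAS number\n"
    "1,Example chemical A,000-00-0\n"
    "2,Example chemical B,111-11-1\n"
)
results_csv = io.StringIO(
    "Chemical ID,Species,Result\n"
    "1,Example species X,1.5\n"
    "1,Example species Y,2.5\n"
    "2,Example species X,0.7\n"
)


def join_on_chemical_id(id_file, results_file):
    """Attach the chemical name to each test result via the shared Chemical ID."""
    names = {
        row["Chemical ID"]: row["Chemical Name"] for row in csv.DictReader(id_file)
    }
    joined = []
    for row in csv.DictReader(results_file):
        # Results whose ID is missing from the ID file get an empty name.
        joined.append({**row, "Chemical Name": names.get(row["Chemical ID"], "")})
    return joined


rows = join_on_chemical_id(id_csv, results_csv)
```

With real downloads, the two `io.StringIO` objects would be replaced by opened files, and the column names should be checked against the manual that accompanies the database.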

343

RefSeq microbial genomes database: new representation and annotation strategy  

PubMed Central

The source of the microbial genomic sequences in the RefSeq collection is the set of primary sequence records submitted to the International Nucleotide Sequence Database public archives. These can be accessed through the Entrez search and retrieval system at http://www.ncbi.nlm.nih.gov/genome. Next-generation sequencing has enabled researchers to perform genomic sequencing at rates that were unimaginable in the past. Microbial genomes can now be sequenced in a matter of hours, which has led to a significant increase in the number of assembled genomes deposited in the public archives. This huge increase in DNA sequence data presents new challenges for the annotation, analysis and visualization bioinformatics tools. New strategies have been developed for the annotation and representation of reference genomes and sequence variations derived from population studies and clinical outbreaks.

Tatusova, Tatiana; Ciufo, Stacy; Fedorov, Boris; O'Neill, Kathleen; Tolstoy, Igor

2014-01-01

344

2010 Worldwide Gasification Database  

DOE Data Explorer

The 2010 Worldwide Gasification Database describes the current world gasification industry and identifies near-term planned capacity additions. The database lists gasification projects and includes information (e.g., plant location, number and type of gasifiers, syngas capacity, feedstock, and products). The database reveals that the worldwide gasification capacity has continued to grow for the past several decades and is now at 70,817 megawatts thermal (MWth) of syngas output at 144 operating plants with a total of 412 gasifiers.

345

Column oriented Database Systems  

Microsoft Academic Search

Column-oriented database systems (column-stores) have attracted a lot of attention in the past few years. Column-stores, in a nutshell, store each database table column separately, with attribute values belonging to the same column stored contiguously, compressed, and densely packed, as opposed to traditional database systems that store entire records (rows) one after the other. Reading a subset of a table's

Daniel J. Abadi; Peter A. Boncz; Stavros Harizopoulos

2009-01-01
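
The storage contrast described in the column-store record above can be illustrated with a small in-memory sketch. It only shows the difference in layout between row-wise and column-wise storage; it does not model compression, disk pages, or any real system, and the table contents and the helper name `to_columns` are assumptions made for illustration.

```python
# A tiny table in row layout: each record is stored whole, one after another.
rows = [
    {"id": 1, "name": "a", "price": 10.0},
    {"id": 2, "name": "b", "price": 20.0},
    {"id": 3, "name": "c", "price": 30.0},
]


def to_columns(records):
    """Pivot a list of row dicts into one contiguous list per column."""
    columns = {}
    for record in records:
        for key, value in record.items():
            columns.setdefault(key, []).append(value)
    return columns


columns = to_columns(rows)

# Row layout: every record is visited in full even though only one field is needed.
total_from_rows = sum(record["price"] for record in rows)

# Column layout: only the 'price' column is read.
total_from_columns = sum(columns["price"])
```

Both totals are equal; the difference is that the column layout touches only the values of the attribute being aggregated, which is the access pattern column-stores are built around.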

346

The EMBL Nucleotide Sequence Database: major new developments  

Microsoft Academic Search

The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl/) incorporates, organizes and distributes nucleotide sequences from all available public sources. The database is located and maintained at the European Bioinformatics Institute (EBI) near Cambridge, UK. In an international collaboration with DDBJ (Japan) and GenBank (USA), data are exchanged amongst the collaborating databases on a daily basis to achieve optimal synchronization.

Guenter Stoesser; Wendy Baker; Alexandra Van Den Broek; Maria Garcia-pastor; Carola Kanz; Tamara Kulikova; Rasko Leinonen; Quan Lin; Vincent Lombard; Rodrigo Lopez; Renato Mancuso; Francesco Nardone; Peter Stoehr; Mary Ann Tuli; Katerina Tzouvara; Robert Vaughan

2003-01-01

347

Compressive genomics for protein databases  

PubMed Central

Motivation: The exponential growth of protein sequence databases has increasingly made the fundamental question of searching for homologs a computational bottleneck. The amount of unique data, however, is not growing nearly as fast; we can exploit this fact to greatly accelerate homology search. Acceleration of programs in the popular PSI/DELTA-BLAST family of tools will not only speed-up homology search directly but also the huge collection of other current programs that primarily interact with large protein databases via precisely these tools. Results: We introduce a suite of homology search tools, powered by compressively accelerated protein BLAST (CaBLASTP), which are significantly faster than and comparably accurate with all known state-of-the-art tools, including HHblits, DELTA-BLAST and PSI-BLAST. Further, our tools are implemented in a manner that allows direct substitution into existing analysis pipelines. The key idea is that we introduce a local similarity-based compression scheme that allows us to operate directly on the compressed data. Importantly, CaBLASTP’s runtime scales almost linearly in the amount of unique data, as opposed to current BLASTP variants, which scale linearly in the size of the full protein database being searched. Our compressive algorithms will speed-up many tasks, such as protein structure prediction and orthology mapping, which rely heavily on homology search. Availability: CaBLASTP is available under the GNU Public License at http://cablastp.csail.mit.edu/ Contact: bab@mit.edu

Daniels, Noah M.; Gallant, Andrew; Peng, Jian; Cowen, Lenore J.; Baym, Michael; Berger, Bonnie

2013-01-01

348

The Comparative Toxicogenomics Database (CTD).  

PubMed Central

The Mount Desert Island Biological Laboratory in Salsbury Cove, Maine, USA, is developing the Comparative Toxicogenomics Database (CTD), a community-supported genomic resource devoted to genes and proteins of human toxicologic significance. CTD will be the first publicly available database to a) provide annotated associations among genes, proteins, references, and toxic agents, with a focus on annotating data from aquatic and mammalian organisms; b) include nucleotide and protein sequences from diverse species; c) offer a range of analysis tools for customized comparative studies; and d) provide information to investigators on available molecular reagents. This combination of features will facilitate cross-species comparisons of toxicologically significant genes and proteins. These comparisons will promote understanding of molecular evolution, the significance of conserved sequences, the genetic basis of variable sensitivity to environmental agents, and the complex interactions between the environment and human health. CTD is currently under development, and the planned scope and functions of the database are described herein. The intent of this report is to invite community participation in the development of CTD to ensure that it will be a valuable resource for environmental health, molecular biology, and toxicology research.

Mattingly, Carolyn J; Colby, Glenn T; Forrest, John N; Boyer, James L

2003-01-01

349

Federal Register document image database  

NASA Astrophysics Data System (ADS)

A new, fully-automated process has been developed at NIST to derive ground truth for document images. The method involves matching optical character recognition (OCR) results from a page with typesetting files for an entire book. Public domain software used to derive the ground truth is provided in the form of Perl scripts and C source code, and includes new, more efficient string alignment technology and a word-level scoring package. With this ground truthing technology, it is now feasible to produce much larger data sets, at much lower cost, than was ever possible with previous labor-intensive, manual data collection projects. Using this method, NIST has produced a new document image database for evaluating Document Analysis and Recognition technologies and Information Retrieval systems. The database produced contains scanned images, SGML-tagged ground truth text, commercial OCR results, and image quality assessment results for pages published in the 1994 Federal Register. These data files are useful in a wide variety of experiments and research. There were roughly 250 issues, comprised of nearly 69,000 pages, published in the Federal Register in 1994. This volume of the database contains the pages of 20 books published in January of that year. In all, there are 4711 page images provided, with 4519 of them having corresponding ground truth. This volume is distributed on two ISO-9660 CD-ROMs. Future volumes may be released, depending on the level of interest.

Garris, Michael D.; Janet, Stanley A.; Klein, William W.

1999-01-01
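
The record above mentions string alignment and word-level scoring of OCR output against ground truth. The NIST software is described as Perl scripts and C source code; the sketch below is not that software. It is a minimal Python stand-in that assumes whitespace tokenisation and uses the standard library's `difflib.SequenceMatcher` for the alignment. The function name `word_accuracy` and the sample strings are hypothetical.

```python
from difflib import SequenceMatcher


def word_accuracy(truth_text, ocr_text):
    """Fraction of ground-truth words that align to identical OCR words."""
    truth_words = truth_text.split()
    ocr_words = ocr_text.split()
    # autojunk=False keeps frequent words from being ignored on long pages.
    matcher = SequenceMatcher(a=truth_words, b=ocr_words, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(truth_words) if truth_words else 1.0


# Hypothetical page text: the OCR engine misreads "brown" as "brovvn".
truth = "the quick brown fox jumps over the lazy dog"
ocr = "the quick brovvn fox jumps over the lazy dog"
print(round(word_accuracy(truth, ocr), 3))
```

A production scorer would also need to handle hyphenation, punctuation and split or merged words, none of which this sketch attempts.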

350

Long Valley caldera GIS Database  

NASA Astrophysics Data System (ADS)

In May of 1980, a strong earthquake swarm that included four magnitude 6 earthquakes struck the southern margin of Long Valley Caldera associated with a 25-cm, dome-shaped uplift of the caldera floor. These events marked the onset of the latest period of caldera unrest that continues to this day. This ongoing unrest includes recurring earthquake swarms and continued dome-shaped uplift of the central section of the caldera (the resurgent dome) accompanied by changes in thermal springs and gas emissions. Analysis of combined gravity and geodetic data confirms the intrusion of silicic magma beneath Long Valley caldera. In 1982, the U.S. Geological Survey under the Volcano Hazards Program began an intensive effort to monitor and study geologic unrest in Long Valley Caldera. This database provides an overview of the studies being conducted by the Long Valley Observatory in Eastern California from 1975 to 2000. The database includes geological, monitoring and topographic datasets related to the Long Valley Caldera, plus a number of USGS publications on Long Valley (e.g., fact-sheets, references). Datasets are available as text files or ArcView shapefiles. Database CD-ROM Table of Contents: - Geological data (digital geologic map) - Monitoring data: Deformation (EDM, GPS, Leveling); Earthquakes; Gravity; Hydrologic; CO2 - Topographic data: DEM, DRG, Landsat 7, Rivers, Roads, Water Bodies - ArcView Project File

Williams, M. J.; Battaglia, M.; Hill, D.; Langbein, J.; Segall, P.

2002-12-01

351

Development of a GIS Snowstorm Database  

NASA Astrophysics Data System (ADS)

This paper describes the development of a GIS Snowstorm Database (GSDB) at NOAA’s National Climatic Data Center. The snowstorm database is a collection of GIS layers and tabular information for 471 snowstorms between 1900 and 2010. Each snowstorm has undergone automated and manual quality control. The beginning and ending date of each snowstorm is specified. The original purpose of this data was to serve as input for NCDC’s new Regional Snowfall Impact Scale (ReSIS). However, this data is being preserved and used to investigate the impacts of snowstorms on society. GSDB is used to summarize the impact of snowstorms on transportation (interstates) and various classes of facilities (roads, schools, hospitals, etc.). GSDB can also be linked to other sources of impacts such as insurance loss information and Storm Data. Thus the snowstorm database is suited for many different types of users including the general public, decision makers, and researchers. This paper summarizes quality control issues associated with using snowfall data, methods used to identify the starting and ending dates of a storm, and examples of the tables that combine snowfall and societal data.

Squires, M. F.

2010-12-01

352

DOLOP: A Database of Bacterial Lipoproteins  

NSDL National Science Digital Library

Bacteria rely on protein-lipid combinations known as lipoproteins to glom onto surfaces, sense their surroundings, slurp up nutrients, shuttle DNA to other cells, and perform other life tasks. Researchers can analyze more than 270 of the molecules at DOLOP, a database from the Medical Research Council Laboratory of Molecular Biology in Cambridge, U.K. Entries describe each protein, indicate its size and function, and provide links to the Swiss-Prot database, where you can parse the molecule's sequence and structural features. The site also explains the synthesis of lipoproteins and describes the lipobox, a characteristic amino acid string to which lipids attach.

M. Maden Babu (MRC-Laboratory of Molecular Biology, Cambridge); K. Sankaran (Centre for Biotechnology, Anna University, Chennai)

2001-09-15

353

Database extraction strategies for low-template evidence.  

PubMed

Often in forensic cases, the profile of at least one of the contributors to a DNA evidence sample is unknown and a database search is needed to discover possible perpetrators. In this article we consider two types of search strategies to extract suspects from a database using methods based on probability arguments. The performance of the proposed match scores is demonstrated by carrying out a study of each match score relative to the level of allele drop-out in the crime sample, simulating low-template DNA. The efficiency was measured by random man simulation and we compared the performance using the SGM Plus kit and the ESX 17 kit for the Norwegian population, demonstrating that the latter has greatly enhanced power to discover perpetrators of crime in large national DNA databases. The code for the database extraction strategies will be prepared for release in the R-package forensim. PMID:24528591

Bleka, Øyvind; Dørum, Guro; Haned, Hinda; Gill, Peter

2014-03-01

354

40 CFR 1400.13 - Read-only database.  

Code of Federal Regulations, 2010 CFR

...DISTRIBUTION OF OFF-SITE CONSEQUENCE ANALYSIS INFORMATION DISTRIBUTION OF OFF-SITE CONSEQUENCE ANALYSIS INFORMATION Other Provisions ...the public off-site consequence analysis information by means of a central database under...

2010-07-01

355

40 CFR 1400.13 - Read-only database.  

Code of Federal Regulations, 2010 CFR

...DISTRIBUTION OF OFF-SITE CONSEQUENCE ANALYSIS INFORMATION DISTRIBUTION OF OFF-SITE CONSEQUENCE ANALYSIS INFORMATION Other Provisions ...the public off-site consequence analysis information by means of a central database under...

2009-07-01

356

Foodline®: International Food Market, Technology and Regulatory Databases  

Microsoft Academic Search

Foodline® is a trio of databases from U.K.-based Leatherhead Food Research Association providing international coverage of food marketing, technical and regulatory information. Foodline®: International Food Market Data is a bibliographic database of global market information abstracted from some 250 food and beverage business and trade journals, statistical publications and market studies. Foodline®: Food Science and Technology consists of citations and

Peter Sidney

1996-01-01

357

Go Figure: Computer Database Adds the Personal Touch.  

ERIC Educational Resources Information Center

A database for recordkeeping for a summer reading club was developed for a public library system using an IBM PC and Microsoft Works. Use of the database resulted in more efficient program management, giving librarians more time to spend with patrons and enabling timely awarding of incentives. (LAE)

Gaffney, Jean; Crawford, Pat

1992-01-01

358

Small and Medium Enterprises across the Globe: A New Database  

Microsoft Academic Search

This paper describes a new cross-country database on the importance of small and medium enterprises (SMEs). This database is unique in that it presents consistent and comparable information on the contribution of the SME sector to total employment and GDP across different countries. The dataset improves on existing publicly available datasets on several grounds. First, it extends coverage to a

Meghana Ayyagari; Thorsten Beck

2003-01-01

359

Increased coverage of protein families with the Blocks Database servers  

PubMed Central

The Blocks Database WWW (http://blocks.fhcrc.org) and Email (blocks@blocks.fhcrc.org) servers provide tools to search DNA and protein queries against the Blocks+ Database of multiple alignments, which represent conserved protein regions. Blocks+ nearly doubles the number of protein families included in the database by adding families from the Pfam-A, ProDom and Domo databases to those from PROSITE and PRINTS. Other new features include improved Block Searcher statistics, searching with NCBI’s IMPALA program and 3D display of blocks on PDB structures.

Henikoff, Jorja G.; Greene, Elizabeth A.; Pietrokovski, Shmuel; Henikoff, Steven

2000-01-01

360

48 CFR 504.605-70 - Federal Procurement Data System-Public access to data.  

Code of Federal Regulations, 2013 CFR

...System-Public access to data. (a) The FPDS database. The General Services Administration awarded...operation of the Federal Procurement Data System (FPDS) database. That database includes information reported by departments...

2013-10-01

361

78 FR 49730 - Collection of Information; Proposed Extension of Approval; Comment Request-Publicly Available...  

Federal Register 2010, 2011, 2012, 2013

...Available Consumer Product Safety Information Database AGENCY: Consumer Product Safety Commission...Available Consumer Product Safety Information Database. The Commission will consider all comments...maintain a publicly available, searchable database on the safety of consumer products...

2013-08-15

362

MITOMAP: a human mitochondrial genome database--2004 update  

PubMed Central

MITOMAP (http://www.MITOMAP.org), a database for the human mitochondrial genome, has grown rapidly in data content over the past several years as interest in the role of mitochondrial DNA (mtDNA) variation in human origins, forensics, degenerative diseases, cancer and aging has increased dramatically. To accommodate this information explosion, MITOMAP has implemented a new relational database and an improved search engine, and all programs have been rewritten. System administrative changes have been made to improve security and efficiency, and to make MITOMAP compatible with a new automatic mtDNA sequence analyzer known as Mitomaster.

Brandon, Marty C.; Lott, Marie T.; Nguyen, Kevin Cuong; Spolim, Syawal; Navathe, Shamkant B.; Baldi, Pierre; Wallace, Douglas C.

2005-01-01

363

MITOMAP: a human mitochondrial genome database--2004 update.  

PubMed

MITOMAP (http://www.MITOMAP.org), a database for the human mitochondrial genome, has grown rapidly in data content over the past several years as interest in the role of mitochondrial DNA (mtDNA) variation in human origins, forensics, degenerative diseases, cancer and aging has increased dramatically. To accommodate this information explosion, MITOMAP has implemented a new relational database and an improved search engine, and all programs have been rewritten. System administrative changes have been made to improve security and efficiency, and to make MITOMAP compatible with a new automatic mtDNA sequence analyzer known as Mitomaster. PMID:15608272

Brandon, Marty C; Lott, Marie T; Nguyen, Kevin Cuong; Spolim, Syawal; Navathe, Shamkant B; Baldi, Pierre; Wallace, Douglas C

2005-01-01

364

CMD: a Cotton Microsatellite Database resource for Gossypium genomics  

Microsoft Academic Search

BACKGROUND: The Cotton Microsatellite Database (CMD) http://www.cottonssr.org is a curated and integrated web-based relational database providing centralized access to publicly available cotton microsatellites, an invaluable resource for basic and applied research in cotton breeding. DESCRIPTION: At present CMD contains publication, sequence, primer, mapping and homology data for nine major cotton microsatellite projects, collectively representing 5,484 microsatellites. In addition, CMD displays

Anna Blenda; Jodi Scheffler; Brian Scheffler; Michael Palmer; Jean-Marc Lacape; John Z Yu; Christopher Jesudurai; Sook Jung; Sriram Muthukumar; Preetham Yellambalase; Stephen Ficklin; Margaret Staton; Robert Eshelman; Mauricio Ulloa; Sukumar Saha; Ben Burr; Shaolin Liu; Tianzhen Zhang; Deqiu Fang; Alan Pepper; Siva Kumpatla; John Jacobs; Jeff Tomkins; Roy Cantrell; Dorrie Main

2006-01-01

365

FORMIDABEL: The Belgian Ants Database  

PubMed Central

FORMIDABEL is a database of Belgian ants containing more than 27,000 occurrence records. These records originate from collections, field sampling and literature. The database gives information on 76 native and 9 introduced ant species found in Belgium. The collection records originated mainly from the ants collection in the Royal Belgian Institute of Natural Sciences (RBINS), the ‘Gaspar’ Ants collection in Gembloux and the zoological collection of the University of Liège (ULG). The oldest occurrences date back to May 1866; the most recent refer to August 2012. FORMIDABEL is a work in progress and the database is updated twice a year. The latest version of the dataset is publicly and freely accessible through this URL: http://ipt.biodiversity.be/resource.do?r=formidabel. The dataset is also retrievable via the GBIF data portal through this link: http://data.gbif.org/datasets/resource/14697. A dedicated geo-portal, developed by the Belgian Biodiversity Platform, is accessible at: http://www.formicidae-atlas.be. Purpose: FORMIDABEL is a joint cooperation of the Flemish ants working group “Polyergus” (http://formicidae.be) and the Wallonian ants working group “FourmisWalBru” (http://fourmiswalbru.be). The original database was created in 2002 in the context of the preliminary red data book of Flemish Ants (Dekoninck et al. 2003). Later, in 2005, data from the southern part of Belgium (Wallonia and Brussels) were added. In 2012 this dataset was again updated for the creation of the first Belgian Ants Atlas (Figure 1) (Dekoninck et al. 2012). The main purpose of this atlas was to generate maps for all outdoor-living ant species in Belgium using an overlay of the standard Belgian ecoregions. By using this overlay for most species, we can discern a clear and often restricted distribution pattern in Belgium, mainly based on vegetation and soil types.

Brosens, Dimitri; Vankerkhoven, Francois; Ignace, David; Wegnez, Philippe; Noe, Nicolas; Heughebaert, Andre; Bortels, Jeannine; Dekoninck, Wouter

2013-01-01

366

Central Asia Active Fault Database  

NASA Astrophysics Data System (ADS)

The ongoing collision of the Indian subcontinent with Asia controls active tectonics and seismicity in Central Asia. This motion is accommodated by faults that have historically caused devastating earthquakes and continue to pose serious threats to the population at risk. Despite international and regional efforts to assess seismic hazards in Central Asia, little attention has been given to development of a comprehensive database for active faults in the region. To address this issue and to better understand the distribution and level of seismic hazard in Central Asia, we are developing a publicly available database for active faults of Central Asia (including but not limited to Afghanistan, Tajikistan, Kyrgyzstan, northern Pakistan and western China) using ArcGIS. The database is designed to allow users to store, map and query important fault parameters such as fault location, displacement history, rate of movement, and other data relevant to seismic hazard studies including fault trench locations, geochronology constraints, and seismic studies. Data sources integrated into the database include previously published maps and scientific investigations as well as strain rate measurements and historic and recent seismicity. In addition, high resolution Quickbird, Spot, and Aster imagery are used for selected features to locate and measure offset of landforms associated with Quaternary faulting. These features are individually digitized and linked to attribute tables that provide a description for each feature. Preliminary observations include inconsistent and sometimes inaccurate information for faults documented in different studies. For example, the Darvaz-Karakul fault, which roughly defines the western margin of the Pamir, has been mapped with differences in location of up to 12 kilometers. The sense of motion for this fault ranges from unknown to thrust and strike-slip in three different studies despite documented left-lateral displacements of Holocene and late Pleistocene landforms observed near the fault trace.

Mohadjer, Solmaz; Ehlers, Todd A.; Kakar, Najibullah

2014-05-01

367

FORMIDABEL: The Belgian Ants Database.  

PubMed

FORMIDABEL is a database of Belgian ants containing more than 27,000 occurrence records. These records originate from collections, field sampling and literature. The database gives information on 76 native and 9 introduced ant species found in Belgium. The collection records originated mainly from the ants collection in the Royal Belgian Institute of Natural Sciences (RBINS), the 'Gaspar' Ants collection in Gembloux and the zoological collection of the University of Liège (ULG). The oldest occurrences date back to May 1866; the most recent refer to August 2012. FORMIDABEL is a work in progress and the database is updated twice a year. The latest version of the dataset is publicly and freely accessible through this URL: http://ipt.biodiversity.be/resource.do?r=formidabel. The dataset is also retrievable via the GBIF data portal through this link: http://data.gbif.org/datasets/resource/14697. A dedicated geo-portal, developed by the Belgian Biodiversity Platform, is accessible at: http://www.formicidae-atlas.be. Purpose: FORMIDABEL is a joint cooperation of the Flemish ants working group "Polyergus" (http://formicidae.be) and the Wallonian ants working group "FourmisWalBru" (http://fourmiswalbru.be). The original database was created in 2002 in the context of the preliminary red data book of Flemish Ants (Dekoninck et al. 2003). Later, in 2005, data from the southern part of Belgium (Wallonia and Brussels) were added. In 2012 this dataset was again updated for the creation of the first Belgian Ants Atlas (Figure 1) (Dekoninck et al. 2012). The main purpose of this atlas was to generate maps for all outdoor-living ant species in Belgium using an overlay of the standard Belgian ecoregions. By using this overlay for most species, we can discern a clear and often restricted distribution pattern in Belgium, mainly based on vegetation and soil types. PMID:23794918

Brosens, Dimitri; Vankerkhoven, François; Ignace, David; Wegnez, Philippe; Noé, Nicolas; Heughebaert, André; Bortels, Jeannine; Dekoninck, Wouter

2013-01-01

368

Automating database management for distributed database systems  

NASA Astrophysics Data System (ADS)

The management of large databases can be a time-consuming and tedious task without tools to automate the process. In order to facilitate the process, a database design philosophy which is consistent with application objectives must be adopted. The Planetary Plasma Interactions (PPI) Node of the Planetary Data System (PDS) provides users with a software application which allows for rapid retrieval (for examination or extraction) of datasets which are distributed and which may exist in more than one location simultaneously. In addition, the application software, which interacts with both a local and a centralized database management system (dbms) via a client-server architecture, also is distributed. The basic problem lies in keeping all of the information current in a dynamic data environment so that the application software, regardless of its location, can access any data in the system with the greatest possible throughput. In addressing this problem the PDS/PPI Node has adopted a standardized architecture [2] and has chosen to package data management information with the data themselves as a set of detached relational flatfiles. At the PDS/PPI Node automated dbms tools are being developed which update both local and central management systems by using the information provided along with the datasets. This information can be stored in any relational database system or it can be accessed and manipulated outside of a dbms. Science data are not maintained under direct dbms control.

Joy, Steven P.; King, Todd A.; Walker, Raymond J.

1993-08-01

369

Physical database design for relational databases  

Microsoft Academic Search

This paper describes the concepts used in the implementation of DBDSGN, an experimental physical design tool for relational databases developed at the IBM San Jose Research Laboratory. Given a workload for System R (consisting of a set of SQL statements and their execution frequencies), DBDSGN suggests physical configurations for efficient performance. Each configuration consists of a set of indices and

Sheldon J. Finkelstein; Mario Schkolnick; Paolo Tiberio

1988-01-01

370

TaxMan: a taxonomic database manager  

PubMed Central

Background: Phylogenetic analysis of large, multiple-gene datasets, assembled from public sequence databases, is rapidly becoming a popular way to approach difficult phylogenetic problems. Supermatrices (concatenated multiple sequence alignments of multiple genes) can yield more phylogenetic signal than individual genes. However, manually assembling such datasets for a large taxonomic group is time-consuming and error-prone. Additionally, sequence curation, alignment and assessment of the results of phylogenetic analysis are made particularly difficult by the potential for a given gene in a given species to be unrepresented, or to be represented by multiple or partial sequences. We have developed a software package, TaxMan, that largely automates the processes of sequence acquisition, consensus building, alignment and taxon selection to facilitate this type of phylogenetic study. Results: TaxMan uses freely available tools to allow rapid assembly, storage and analysis of large, aligned DNA and protein sequence datasets for user-defined sets of species and genes. The user provides GenBank format files and a list of gene names and synonyms for the loci to analyse. Sequences are extracted from the GenBank files on the basis of annotation and sequence similarity. Consensus sequences are built automatically. Alignment is carried out (where possible, at the protein level) and aligned sequences are stored in a database. TaxMan can automatically determine the best subset of taxa to examine phylogeny at a given taxonomic level. By using the stored aligned sequences, large concatenated multiple sequence alignments can be generated rapidly for a subset and output in analysis-ready file formats. Trees resulting from phylogenetic analysis can be stored and compared with a reference taxonomy. Conclusion: TaxMan allows rapid automated assembly of multigene datasets of aligned sequences for large taxonomic groups. By extracting sequences on the basis of both annotation and BLAST similarity, it ensures that all available sequence data can be brought to bear on a phylogenetic problem, but remains fast enough to cope with many thousands of records. By automatically assisting in the selection of the best subset of taxa to address a particular phylogenetic problem, TaxMan greatly speeds up the process of generating multiple sequence alignments for phylogenetic analysis. Our results indicate that an automated phylogenetic workbench can be a useful tool when correctly guided by user knowledge.

Jones, Martin; Blaxter, Mark

2006-01-01
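
The TaxMan record above defines a supermatrix as concatenated multiple sequence alignments of multiple genes and notes that a gene may be unrepresented in some species. The sketch below shows one common way to build such a matrix, padding missing genes with gap characters. It is not TaxMan's code: the function name `build_supermatrix`, the gap-padding convention and the sample alignments are assumptions made for illustration.

```python
def build_supermatrix(alignments, gap="-"):
    """Concatenate per-gene alignments; pad taxa missing a gene with gaps."""
    taxa = sorted({taxon for gene in alignments.values() for taxon in gene})
    supermatrix = {taxon: "" for taxon in taxa}
    for gene_name in sorted(alignments):
        gene = alignments[gene_name]
        lengths = {len(seq) for seq in gene.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene_name}: sequences are not aligned")
        width = lengths.pop()
        for taxon in taxa:
            # A taxon without this gene contributes a run of gap characters.
            supermatrix[taxon] += gene.get(taxon, gap * width)
    return supermatrix


# Hypothetical aligned genes for three taxa; each gene lacks one taxon.
alignments = {
    "cox1": {"A": "ATG-C", "B": "ATGGC"},
    "rbcL": {"A": "GGT", "C": "GCT"},
}
matrix = build_supermatrix(alignments)
for taxon, row in matrix.items():
    print(taxon, row)
```

Genes are concatenated in sorted name order so that the column layout is reproducible between runs.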

371

Campus Database Issues.  

ERIC Educational Resources Information Center

Focuses on the link between electronic databases and institutes of higher education, particularly the rapid increase of commercial databases. Discusses the issues that digital and electronic resources present to colleges and universities, including the implications of recent Congressional legislation and the Feist case. (EV)

O'Neil, Robert M.

2000-01-01

372

Database Reviews: Legal Information.  

ERIC Educational Resources Information Center

Detailed reviews of two legal information databases--"Laborlaw I" and "Legal Resource Index"--are presented in this paper. Each database review begins with a bibliographic entry listing the title; producer; vendor; cost per hour contact time; offline print cost per citation; time period covered; frequency of updates; and size of file. A detailed…

Seiser, Virginia

373

Probabilistic Spatial Database Operations  

Microsoft Academic Search

Spatial databases typically assume that the positional attributes of spatial objects are precisely known. In practice, however, they are known only approximately, with the error depending on the nature of the measurement and the source of data. In this paper, we address the problem of how to perform spatial database operations in the presence of uncertainty. We first discuss a

Jinfeng Ni; Chinya V. Ravishankar; Bir Bhanu

2003-01-01
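
The record above concerns spatial operations when object positions are known only approximately. The abstract is truncated and does not state the error model, so the sketch below is an illustration under a stated assumption, not the paper's method: it assumes each coordinate has an independent Gaussian error with a common standard deviation and computes the probability that the true position lies inside an axis-aligned query rectangle. The function names and all numbers are hypothetical.

```python
import math


def normal_cdf(x, mean, sigma):
    """Cumulative distribution function of a 1-D normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mean) / (sigma * math.sqrt(2.0))))


def prob_in_rectangle(mean_x, mean_y, sigma, x_min, y_min, x_max, y_max):
    """Probability that an uncertain point falls inside an axis-aligned box."""
    # Assumption: x and y errors are independent, so the probabilities multiply.
    p_x = normal_cdf(x_max, mean_x, sigma) - normal_cdf(x_min, mean_x, sigma)
    p_y = normal_cdf(y_max, mean_y, sigma) - normal_cdf(y_min, mean_y, sigma)
    return p_x * p_y


# Hypothetical point measured at (0, 0) with a standard deviation of 1 unit.
print(prob_in_rectangle(0.0, 0.0, 1.0, -1.0, -1.0, 1.0, 1.0))
```

A range query could then return each object together with this probability, or keep only objects whose probability exceeds a chosen threshold.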

374

Steam Properties Database  

National Institute of Standards and Technology Data Gateway

SRD 10 NIST/ASME Steam Properties Database (PC database for purchase)   Based upon the International Association for the Properties of Water and Steam (IAPWS) 1995 formulation for the thermodynamic properties of water and the most recent IAPWS formulations for transport and other properties, this updated version provides water properties over a wide range of conditions according to the accepted international standards.

375

PAN Pesticide Database  

NSDL National Science Digital Library

The Pesticide Action Network (PAN) Pesticide Database is your one-stop location for toxicity and regulatory information for pesticides. This is a comprehensive search enabled database of pesticide chemicals and also trade names. An easy to navigate sidebar takes you through toxicity, uses, registration, company, and distributor. Other links take you to less toxic alternatives, and pesticide tutorial and references.

0002-11-30

376

Atomic Spectra Database (ASD)  

National Institute of Standards and Technology Data Gateway

SRD 78 NIST Atomic Spectra Database (ASD) (Web, free access)   This database provides access and search capability for NIST critically evaluated data on atomic energy levels, wavelengths, and transition probabilities that are reasonably up-to-date. The NIST Atomic Spectroscopy Data Center has carried out these critical compilations.

377

Web Database Development.  

National Technical Information Service (NTIS)

This thesis explores the concept of Web Database Development using Active Server Pages (ASP) and Java Server Pages (JSP). These are among the leading technologies in web database development. The focus of this thesis was to analyze and compare the ASP...

N. A. Tsardas

2001-01-01

378

World Database of Crystallographers  

NSDL National Science Digital Library

The World Database of Crystallographers and of Other Scientists Employing Crystallographic Methods is offered by the International Union of Crystallography. The simple database can be searched by family name, title, interests, address, and various other criteria. Results include basic information such as full name, position, institution address, degrees held, key interests, and contact information. Those seeking such specific information will appreciate this unique resource.

379

Biological Macromolecule Crystallization Database  

National Institute of Standards and Technology Data Gateway

SRD 21 Biological Macromolecule Crystallization Database (Web, free access)   The Biological Macromolecule Crystallization Database and NASA Archive for Protein Crystal Growth Data (BMCD) contains the conditions reported for the crystallization of proteins and nucleic acids used in X-ray structure determinations and archives the results of microgravity macromolecule crystallization studies.

380

ETD - Exoplanet Transit Database  

NASA Astrophysics Data System (ADS)

This contribution introduces the Exoplanet Transit Database (ETD), an internet portal that offers all the information needed for exoplanet transit observation: scheduling, a list of previously obtained data, and an online algorithm for photometric data processing that determines the mid-transit time, duration and depth of a transit. The Exoplanet Transit Database can be found at http://var.astro.cz/etd.

Brát, L.; Poddaný, S.; Pejcha, O.; Zejda, M.

2010-12-01

381

Sampling from Spatial Databases  

Microsoft Academic Search

Techniques for obtaining random point samples from spatial databases are described. Random points are sought from a continuous domain that satisfy a spatial predicate which is represented in the database as a collection of polygons. Several applications of spatial sampling are described. Sampling problems are characterized in terms of two key parameters: coverage (selectivity), and expected stabbing number (overlap). Two

Frank Olken; Doron Rotem

1993-01-01
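
The record above describes drawing random points from a region stored as a collection of polygons, with coverage as one of the key parameters. The abstract is truncated and does not list its algorithms, so the sketch below uses plain rejection sampling as an assumed baseline: draw a point uniformly in the bounding box and keep it if it falls inside any polygon. The function names, the ray-casting containment test and the sample polygons are illustrative choices.

```python
import random


def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside the polygon."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through y
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def sample_points(polygons, count, rng):
    """Rejection-sample uniform points from the union of the polygons."""
    xs = [x for poly in polygons for x, _ in poly]
    ys = [y for poly in polygons for _, y in poly]
    x_min, x_max, y_min, y_max = min(xs), max(xs), min(ys), max(ys)
    samples = []
    while len(samples) < count:
        x = rng.uniform(x_min, x_max)
        y = rng.uniform(y_min, y_max)
        if any(point_in_polygon(x, y, poly) for poly in polygons):
            samples.append((x, y))
    return samples


# Hypothetical region: two disjoint squares inside a 6 x 6 bounding box.
polygons = [
    [(0, 0), (2, 0), (2, 2), (0, 2)],
    [(5, 5), (6, 5), (6, 6), (5, 6)],
]
points = sample_points(polygons, 100, random.Random(0))
print(len(points), "points sampled")
```

When the polygons cover only a small fraction of the bounding box, most candidates are rejected, which is one reason coverage matters for the cost of sampling.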

382

Subscription Databases: Final Thoughts.  

ERIC Educational Resources Information Center

This eighth part in a series of articles on Web-based reference providers focuses on subscription databases. Topics include training needs; keeping teachers informed about new databases; costs; information retrieval techniques; free information from the Internet versus paying for use; information vendors; and the need for printed reference…

Safford, Barbara Ripp

2001-01-01

383

Database Design Recovery  

Microsoft Academic Search

The design of a software component, such as a database, is the trace of all the processes, products and reasonings that have led to the production of this artifact. Such a document is the very basis of system maintenance and evolution processes. Unfortunately, it does not exist in most situations. The paper describes how the design of a database or of a collection

Jean-luc Hainaut; Jean Henrard; Jean-marc Hick; Didier Roland; Vincent Englebert

1996-01-01

384

A Quality System Database  

NASA Technical Reports Server (NTRS)

A quality system database (QSD), and software to administer the database, were developed to support recording of administrative nonconformance activities that involve requirements for documentation of corrective and/or preventive actions, which can include ISO 9000 internal quality audits and customer complaints.

Snell, William H.; Turner, Anne M.; Gifford, Luther; Stites, William

2010-01-01

385

Dictionary as Database.  

ERIC Educational Resources Information Center

Discussion of dictionaries as databases focuses on the digitizing of The Oxford English Dictionary (OED) and the use of Standard Generalized Markup Language (SGML). Topics include the creation of a consortium to digitize the OED, document structure, relational databases, text forms, sequence, and discourse. (LRW)

Painter, Derrick

1996-01-01

386

MITOMAP: a human mitochondrial genome database--1998 update  

Microsoft Academic Search

We have continued to develop MITOMAP (http://www.gen.emory.edu/MITOMAP), a comprehensive database for the human mitochondrial DNA (mtDNA). MITOMAP uses the mtDNA sequence as the unifying element for bringing together information on mitochondrial genome structure and function, pathogenic mutations and their clinical characteristics, population associated variation, and gene-gene interactions. Over the past year we have increased the degree

Andreas M. Kogelnik; Marie T. Lott; Michael D. Brown; Shamkant B. Navathe; Douglas C. Wallace

1998-01-01

387

Three Decades of Recombinant DNA.  

ERIC Educational Resources Information Center

Discusses highlights in the development of genetic engineering, examining techniques with recombinant DNA, legal and ethical issues, GenBank (a national database of nucleic acid sequences), and other topics. (JN)

Palmer, Jackie

1985-01-01

388

Are convenience DNA samples significantly different?  

Microsoft Academic Search

In this paper, the issue of whether DNA databases collected by different convenience sampling methods are significantly different statistically is investigated. Testing the null hypothesis that the population probability or frequency distributions of DNA profiles under different sampling methods are the same is of interest in this investigation. Some statistical analyses are conducted on the single-locus VNTR databases collected from

Wing K. Fung

1996-01-01

389

The World Bacterial Biogeography and Biodiversity through Databases: A Case Study of NCBI Nucleotide Database and GBIF Database  

PubMed Central

Databases are an essential tool and resource within the field of bioinformatics. The primary aim of this study was to generate an overview of global bacterial biodiversity and biogeography using available data from the two largest public online databases, NCBI Nucleotide and GBIF. The secondary aim was to highlight the contribution each geographic area has to each database. The basis for data analysis of this study was the metadata provided by both databases, mainly, the taxonomy and the geographical area origin of isolation of the microorganism (record). These were directly obtained from GBIF through the online interface, while E-utilities and Python were used in combination with a programmatic web service access to obtain data from the NCBI Nucleotide Database. Results indicate that the American continent, and more specifically the USA, is the top contributor, while Africa and Antarctica are less well represented. This highlights the imbalance of exploration within these areas rather than any reduction in biodiversity. This study describes a novel approach to generating global scale patterns of bacterial biodiversity and biogeography and indicates that the Proteobacteria are the most abundant and widely distributed phylum within both databases.

James, Phillip; Nateche, Farida; Wellington, Elizabeth M. H.; Hacene, Hocine

2013-01-01
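
The abstract above mentions combining E-utilities and Python to obtain records from the NCBI Nucleotide database. As a rough illustration only, the sketch below builds an ESearch request URL with the standard library and makes no network call. The helper name `build_esearch_url` and the example query `Bacteria[Organism]` are assumptions chosen for illustration, not the authors' actual script or search terms.

```python
from urllib.parse import urlencode

# Public ESearch endpoint of the NCBI Entrez E-utilities.
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(term, db="nuccore", retmax=20):
    """Return an ESearch URL for `term` in the given Entrez database.

    `nuccore` is the Entrez name of the Nucleotide database; `retmax`
    limits how many record identifiers are returned.
    """
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return EUTILS_ESEARCH + "?" + urlencode(params)


# Hypothetical query: nucleotide records annotated with a bacterial organism.
url = build_esearch_url("Bacteria[Organism]")
print(url)
```

Fetching the URL (for example with `urllib.request.urlopen`) would return matching record identifiers; taxonomy and geographic metadata would then require follow-up requests per record, which this sketch does not attempt.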

390

MUBII-TB-DB: a database of mutations associated with antibiotic resistance in Mycobacterium tuberculosis  

PubMed Central

Background: Tuberculosis is an infectious bacterial disease caused by Mycobacterium tuberculosis. It remains a major health threat, killing over one million people every year worldwide. An early antibiotic therapy is the basis of the treatment, and the emergence and spread of multidrug and extensively drug-resistant mutant strains raise significant challenges. As these bacteria grow very slowly, drug resistance mutations are currently detected using molecular biology techniques. Resistance mutations are identified by sequencing the resistance-linked genes followed by a comparison with the literature data. The only online database is the TB Drug Resistance Mutation database (TBDReaM database); however, it requires mutation detection before use, and its interrogation is complex due to its loose syntax and grammar. Description: The MUBII-TB-DB database is a simple, highly structured text-based database that contains a set of Mycobacterium tuberculosis mutations (DNA and proteins) occurring at seven loci: rpoB, pncA, katG, mabA(fabG1)-inhA, gyrA, gyrB, and rrs. Resistance mutation data were extracted after the systematic review of MEDLINE referenced publications before March 2013. MUBII analyzes the query sequence obtained by PCR-sequencing using two parallel strategies: i) a BLAST search against a set of previously reconstructed mutated sequences and ii) the alignment of the query sequences (DNA and its protein translation) with the wild-type sequences. The post-treatment includes the extraction of the aligned sequences together with their descriptors (position and nature of mutations). The whole procedure is performed using the internet. The results are graphs (alignments) and text (description of the mutation, therapeutic significance). The system is quick and easy to use, even for technicians without bioinformatics training. Conclusion: MUBII-TB-DB is a structured database of the mutations occurring at seven loci of major therapeutic value in tuberculosis management. Moreover, the system provides interpretation of the mutations in biological and therapeutic terms and can evolve by the addition of newly described mutations. Its goal is to provide easy and comprehensive access through a client–server model over the Web to an up-to-date database of mutations that lead to the resistance of M. tuberculosis to antibiotics.

2014-01-01
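
The second strategy in the abstract above, aligning a query sequence with the wild-type sequence and extracting the position and nature of each mutation, can be pictured with a toy comparison. This is a simplified sketch, not MUBII's implementation: it assumes the two sequences are already aligned, of equal length, and free of gaps, and the function name `find_substitutions` and the short sequences are made up for illustration.

```python
def find_substitutions(wild_type, query):
    """List substitutions between two pre-aligned, equal-length sequences.

    Returns (position, wild_type_base, query_base) tuples with 1-based
    positions. Insertions and deletions are out of scope for this sketch.
    """
    if len(wild_type) != len(query):
        raise ValueError("sequences must be aligned to the same length")
    return [
        (i + 1, w, q)
        for i, (w, q) in enumerate(zip(wild_type, query))
        if w != q
    ]


# Toy sequences, not real M. tuberculosis loci.
mutations = find_substitutions("ATGGCC", "ATGACC")
print(mutations)
```

A real pipeline would first produce the alignment (for instance with BLAST, as the abstract notes) and would map each substitution to its codon to describe the protein-level change.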

391

NASA aerospace database subject scope: An overview  

NASA Technical Reports Server (NTRS)

Outlined here is the subject scope of the NASA Aerospace Database, a publicly available subset of the NASA Scientific and Technical (STI) Database. Topics of interest to NASA are outlined and placed within the framework of the following broad aerospace subject categories: aeronautics, astronautics, chemistry and materials, engineering, geosciences, life sciences, mathematical and computer sciences, physics, social sciences, space sciences, and general. A brief discussion of the subject scope is given for each broad area, followed by a similar explanation of each of the narrower subject fields that follow. The subject category code is listed for each entry.

1993-01-01

392

SyStemCell: A Database Populated with Multiple Levels of Experimental Data from Stem Cell Differentiation Research  

PubMed Central

Elucidation of the mechanisms of stem cell differentiation is of great scientific interest. Increasing evidence suggests that stem cell differentiation involves changes at multiple levels of biological regulation, which together orchestrate the complex differentiation process; many related studies have been performed to investigate the various levels of regulation. The resulting valuable data, however, remain scattered. Most of the current stem cell-relevant databases focus on a single level of regulation (mRNA expression) from limited stem cell types; thus, a unifying resource would be of great value to compile the multiple levels of research data available. Here we present a database for this purpose, SyStemCell, deposited with multi-level experimental data from stem cell research. The database currently covers seven levels of stem cell differentiation-associated regulatory mechanisms, including DNA CpG 5-hydroxymethylcytosine/methylation, histone modification, transcript products, microRNA-based regulation, protein products, phosphorylation proteins and transcription factor regulation, all of which have been curated from 285 peer-reviewed publications selected from PubMed. The database contains 43,434 genes, recorded as 942,221 gene entries, for four organisms (Homo sapiens, Mus musculus, Rattus norvegicus, and Macaca mulatta) and various stem cell sources (e.g., embryonic stem cells, neural stem cells and induced pluripotent stem cells). Data in SyStemCell can be queried by Entrez gene ID, symbol, alias, or browsed by specific stem cell type at each level of genetic regulation. An online analysis tool is integrated to assist researchers to mine potential relationships among different regulations, and the potential usage of the database is demonstrated by three case studies. SyStemCell is the first database to bridge multi-level experimental information of stem cell studies, which can become an important reference resource for stem cell researchers. 
The database is available at http://lifecenter.sgst.cn/SyStemCell/.

Zeng, Lingyao; Sun, Jiehuan; Li, Wei; Sun, Han; He, Ying; Li, Jing; Zhang, Guoqing; Wang, Chuan; Li, Yixue; Xie, Lu

2012-01-01

393

Illinois State Archives: Database of Illinois Civil War Veterans  

NSDL National Science Digital Library

This database from the Illinois State Archives "indexes the first eight volumes of the nine volume publication, Report of the Adjutant General of the State of Illinois." The publication is drawn from the original rosters maintained during the Civil War by the Adjutant General. In addition to the names of approximately 250,000 men organized into 175 regiments, this searchable database also provides histories of the Illinois units and regiments. The database was created and donated to the Illinois State Archives by amateur genealogist Fred Delap of Kansas, Illinois.

394

Olfactory receptor database: a sensory chemoreceptor resource.  

PubMed

The Olfactory Receptor Database (ORDB) is a WWW-accessible database that has been expanded from an olfactory receptor resource to a chemoreceptor resource. It stores data on six classes of G-protein-coupled sensory chemoreceptors: (i) olfactory receptor-like proteins, (ii) vomeronasal receptors, (iii) insect olfactory receptors, (iv) worm chemoreceptors, (v) taste papilla receptors and (vi) fungal pheromone receptors. A complementary database of the ligands of these receptors (OdorDB) has been constructed and is publicly available in a pilot mode. The database schema of ORDB has been changed from traditional relational to EAV/CR (Entity-Attribute-Value with Classes and Relationships), which allows the interoperability of ORDB with other related databases as well as the creation of intra-database associations among objects. This interoperability lets users follow information from odor molecule binding to its putative receptor, to the properties of the neuron expressing the receptor, to a computational model of activity of olfactory bulb neurons. In addition, tools and resources have been added allowing users to access interactive phylogenetic trees and alignments of sensory chemoreceptors. ORDB is available via the WWW at http://ycmi.med.yale.edu/senselab/ordb/. PMID: 10592268

Skoufos, E; Marenco, L; Nadkarni, P M; Miller, P L; Shepherd, G M

2000-01-01
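
The Entity-Attribute-Value layout named in the abstract above stores one row per fact instead of one column per attribute, so new attributes need no schema change. The sketch below shows only that core idea using an in-memory SQLite table. The table name `eav`, the helper functions, and the placeholder entity and attributes are hypothetical, and the "Classes and Relationships" part of EAV/CR is not modeled.

```python
import sqlite3


def make_eav_store():
    """Create an in-memory store with a single generic fact table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE eav (entity TEXT, attribute TEXT, value TEXT)")
    return conn


def set_attr(conn, entity, attribute, value):
    """Record one (entity, attribute, value) fact as a row."""
    conn.execute("INSERT INTO eav VALUES (?, ?, ?)", (entity, attribute, value))


def get_entity(conn, entity):
    """Pivot the rows for one entity back into an attribute -> value dict."""
    rows = conn.execute(
        "SELECT attribute, value FROM eav WHERE entity = ?", (entity,)
    )
    return dict(rows.fetchall())


# Placeholder data for illustration only, not taken from ORDB.
store = make_eav_store()
set_attr(store, "receptor_A", "class", "olfactory receptor-like")
set_attr(store, "receptor_A", "organism", "example species")
print(get_entity(store, "receptor_A"))
```

The trade-off is that queries must pivot rows back into records, and type or constraint checking moves out of the schema into metadata or application code.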

395

Complex Carbohydrate Research Center Spectral Databases  

NSDL National Science Digital Library

Dr. William York of the Complex Carbohydrate Research Center has created these two databases with scientific input from others at the University of Georgia. The Xyloglucan NMR Database consists of a searchable table of the ¹H-NMR chemical shifts of xyloglucan oligoglycosyl alditols. Xyloglucans are highly branched polymers with a cellulosic backbone (i.e., consisting of β-(1,4)-linked D-glucosyl residues). The basis for the most commonly used nomenclature for xyloglucan structures comes from the linear array of glycosyl side chains that many of the backbone residues bear. The Partially Methylated Alditol Acetate (PMAA) Database shows the molecular structures of PMAAs derived from Hexopyranosyl, Pentopyranosyl, and Pentofuranosyl Residues. Users view the structures by gliding the mouse over a table. The PMAA Database also says that electron-impact mass spectra are available, but at the time of publication, these links weren't working. The Xyloglucan database comes with overviews, search guides, and nomenclature information. The PMAA database has a help page. Both require a free login.

York, William ..

2001-01-01

396

Database of Mechanical Properties of Textile Composites  

NASA Technical Reports Server (NTRS)

This report describes the approach followed to develop a database for mechanical properties of textile composites. The data in this database is assembled from NASA Advanced Composites Technology (ACT) programs and from data in the public domain. This database meets the data documentation requirements of MIL-HDBK-17, Section 8.1.2, which describes in detail the type and amount of information needed to completely document composite material properties. The database focuses on mechanical properties of textile composites. Properties are available for a range of parameters such as direction, fiber architecture, materials, environmental condition, and failure mode. The composite materials in the database contain innovative textile architectures such as the braided, woven, and knitted materials evaluated under the NASA ACT programs. In summary, the database contains results for approximately 3500 coupon level tests, for ten different fiber/resin combinations, and seven different textile architectures. It also includes a limited amount of prepreg tape composites data from ACT programs where side-by-side comparisons were made.

Delbrey, Jerry

1996-01-01

397

Hazard Analysis Database Report  

SciTech Connect

The Hazard Analysis Database was developed in conjunction with the hazard analysis activities conducted in accordance with DOE-STD-3009-94, Preparation Guide for U.S. Department of Energy Nonreactor Nuclear Facility Safety Analysis Reports, for HNF-SD-WM-SAR-067, Tank Farms Final Safety Analysis Report (FSAR). The FSAR is part of the approved Authorization Basis (AB) for the River Protection Project (RPP). This document describes, identifies, and defines the contents and structure of the Tank Farms FSAR Hazard Analysis Database and documents the configuration control changes made to the database. The Hazard Analysis Database contains the collection of information generated during the initial hazard evaluations and the subsequent hazard and accident analysis activities. The Hazard Analysis Database supports the preparation of Chapters 3, 4, and 5 of the Tank Farms FSAR and the Unreviewed Safety Question (USQ) process and consists of two major, interrelated data sets: (1) Hazard Analysis Database: Data from the results of the hazard evaluations, and (2) Hazard Topography Database: Data from the system familiarization and hazard identification.

GRAMS, W.H.

2000-12-28

398

Hazard Analysis Database Report  

SciTech Connect

The Hazard Analysis Database was developed in conjunction with the hazard analysis activities conducted in accordance with DOE-STD-3009-94, Preparation Guide for US Department of Energy Nonreactor Nuclear Facility Safety Analysis Reports, for the Tank Waste Remediation System (TWRS) Final Safety Analysis Report (FSAR). The FSAR is part of the approved TWRS Authorization Basis (AB). This document describes, identifies, and defines the contents and structure of the TWRS FSAR Hazard Analysis Database and documents the configuration control changes made to the database. The TWRS Hazard Analysis Database contains the collection of information generated during the initial hazard evaluations and the subsequent hazard and accident analysis activities. The database supports the preparation of Chapters 3, 4, and 5 of the TWRS FSAR and the USQ process and consists of two major, interrelated data sets: (1) Hazard Evaluation Database--Data from the results of the hazard evaluations; and (2) Hazard Topography Database--Data from the system familiarization and hazard identification.

GAULT, G.W.

1999-10-13

399

The Gaia Parameter Database  

NASA Astrophysics Data System (ADS)

The parallel development of many aspects of a complex mission like Gaia, which includes numerous participants in ESA, industrial companies, and a large and active scientific collaboration throughout Europe, makes keeping track of the many design changes, instrument and operational complexities, and numerical values for the data analysis a very challenging problem. A comprehensive, easily-accessible, up-to-date, and definitive compilation of a large range of numerical quantities is required, and the Gaia parameter database has been established to satisfy these needs. The database is a centralised repository containing, besides mathematical, physical, and astronomical constants, many satellite and subsystem design parameters. At the end of 2004, more than 1600 parameters had been included. Version control has been implemented, providing, next to a `live' version with the most recent parameters, well-defined reference versions of the full database contents. The database can be queried or browsed using a regular Web browser (http://www.rssd.esa.int/Gaia/paramdb). Query results are formatted by default in HTML. Data can also be retrieved as Fortran-77, Fortran-90, Java, ANSI C, C++, or XML structures for direct inclusion into software codes in these languages. The idea is that all collaborating scientists can use the database parameters and values, once retrieved, directly linked to computational routines. An off-line access mode is also available, enabling users to automatically download the contents of the database. The database will be maintained actively, and significant extensions of the contents are planned. Consistent use in the future of the database by the Gaia community at large, including all industrial teams, will ensure correct numerical values throughout the complex software systems being built up as details of the Gaia design develop. The database is already being used for the telemetry simulation chain in ESTEC, and in the data simulations for GDAAS2.

de Bruijne, J. H. J.; Lammers, U.; Perryman, M. A. C.

2005-01-01

400

DDTRP: Database of Drug Targets for Resistant Pathogens  

PubMed Central

Emergence of drug resistance is a major threat to public health. Many pathogens have developed resistance to most of the existing antibiotics, and multidrug-resistant and extensively drug resistant strains are extremely difficult to treat. This has resulted in an urgent need for novel drugs. We describe a database called ‘Database of Drug Targets for Resistant Pathogens’ (DDTRP). The database contains information on drugs with reported resistance, their respective targets, metabolic pathways involving these targets, and a list of potential alternate targets for seven pathogens. The database can be accessed freely at http://bmi.icmr.org.in/DDTRP.

Sundaramurthi, Jagadish Chandrabose; Ramanandan, Prabhakaran; Brindha, Sridharan; Subhasree, Chelladurai Ramarathnam; Prasad, Abhimanyu; Kumaraswami, Vasanthapuram; Hanna, Luke Elizabeth

2011-01-01

401

National Environmental Publications Internet Site  

NSDL National Science Digital Library

National Environmental Publications Internet Site is maintained by the Environmental Protection Agency (EPA) and contains a database of over 9000 documents that have been published by the EPA. Searches can be attempted by keyword or by publication title, which should help make finding a particular document easy. Once found, the documents can be viewed, printed freely, or ordered directly from the EPA.

402

Phase Equilibria Diagrams Database  

National Institute of Standards and Technology Data Gateway

SRD 31 NIST/ACerS Phase Equilibria Diagrams Database (PC database for purchase)   The Phase Equilibria Diagrams Database contains commentaries and more than 21,000 diagrams for non-organic systems, including those published in all 21 hard-copy volumes produced as part of the ACerS-NIST Phase Equilibria Diagrams Program (formerly titled Phase Diagrams for Ceramists): Volumes I through XIV (blue books); Annuals 91, 92, 93; High Tc Superconductors I & II; Zirconium & Zirconia Systems; and Electronic Ceramics I. Materials covered include oxides as well as non-oxide systems such as chalcogenides and pnictides, phosphates, salt systems, and mixed systems of these classes.

403

International Comparisons Database  

National Institute of Standards and Technology Data Gateway

International Comparisons Database (Web, free access)   The International Comparisons Database (ICDB) serves the U.S. and the Inter-American System of Metrology (SIM) with information based on Appendices B (International Comparisons), C (Calibration and Measurement Capabilities) and D (List of Participating Countries) of the Comité International des Poids et Mesures (CIPM) Mutual Recognition Arrangement (MRA). The official source of the data is The BIPM key comparison database. The ICDB provides access to results of comparisons of measurements and standards organized by the consultative committees of the CIPM and the Regional Metrology Organizations.

404

NORTHWOODS Wildlife Habitat Database  

NSDL National Science Digital Library

Created through a joint effort of the USDA Forest Service's North Central Forest Experiment Station (NCFES) and seven national forests in the Upper Great Lakes Region, NORTHWOODS is a wildlife habitat database featuring "information about the habitat needs of 389 species of reptiles, amphibians, birds, and mammals in the Upper Great Lakes Region." The database compiles common and scientific names, species occurrences in 20 aquatic and terrestrial habitat types, species abundances and seasonal use in seven national forests, and species conservation status. The NORTHWOODS database is available in tab-delimited ASCII file format.

405

World Biodiversity Database  

NSDL National Science Digital Library

The World Biodiversity Database, provided by ETI BioInformatics, seeks to document all presently known species (about 1.7 million) and to make this important biological information accessible worldwide. This continually growing database provides "taxonomic information, species names, synonyms, descriptions, illustrations and literature references, as well as online identification keys and interactive geographical information systems." The searchable database can be explored using an expandable tree of taxonomic kingdoms or by typing in a common or scientific name. Both educators and students should find this site easy to navigate, informative, and useful.

2005-12-08

406

The Press Freedom Database  

NSDL National Science Digital Library

Developed by the Committee to Protect Journalists (CPJ), The Press Freedom Database contains case histories of documented press freedom violations against journalists and media organizations worldwide as well as annual CPJ reports on the violations in specific countries and regions. The case histories document the circumstances in which journalists have been attacked, killed, expelled, censored, harassed, threatened, and/or imprisoned. The database is searchable by journalist name, news organization, type of organization, date, country, or region. Currently, the database, which is updated weekly, contains over 2,000 reports of press freedom violations, spanning from 1993 to the present.

407

Hybrid Terrain Database  

NASA Technical Reports Server (NTRS)

A prototype hybrid terrain database is being developed in conjunction with other databases and with hardware and software that constitute subsystems of aerospace cockpit display systems (known in the art as synthetic vision systems) that generate images to increase pilots' situation awareness and eliminate poor visibility as a cause of aviation accidents. The basic idea is to provide a clear view of the world around an aircraft by displaying computer-generated imagery derived from an onboard database of terrain, obstacle, and airport information.

Arthur, Trey

2006-01-01

408

Simple re-instantiation of small databases using cloud computing  

PubMed Central

Background Small bioinformatics databases, unlike institutionally funded large databases, are vulnerable to discontinuation and many reported in publications are no longer accessible. This leads to irreproducible scientific work and redundant effort, impeding the pace of scientific progress. Results We describe a Web-accessible system, available online at http://biodb100.apbionet.org, for archival and future on demand re-instantiation of small databases within minutes. Depositors can rebuild their databases by downloading a Linux live operating system (http://www.bioslax.com), preinstalled with bioinformatics and UNIX tools. The database and its dependencies can be compressed into an ".lzm" file for deposition. End-users can search for archived databases and activate them on dynamically re-instantiated BioSlax instances, run as virtual machines over the two popular full virtualization standard cloud-computing platforms, Xen Hypervisor or vSphere. The system is adaptable to increasing demand for disk storage or computational load and allows database developers to use the re-instantiated databases for integration and development of new databases. Conclusions Herein, we demonstrate that a relatively inexpensive solution can be implemented for archival of bioinformatics databases and their rapid re-instantiation should the live databases disappear.

2013-01-01

409

Federal Pain Research Database Launched  

MedlinePLUS

... Release: Tuesday, May 27, 2014 Federal pain research database launched Multi-agency effort combines pain research information in easy-to-use database The Interagency Pain Research Portfolio (IPRP), a database ...

410

Lies, Damned Lies, and Databases.  

National Technical Information Service (NTIS)

A database is usually expected to give correct and complete answers to queries. However, some applications take confidentiality to an extreme and require the database to deceive some users by supplying incorrect answers. In most secure databases classifyi...

S. Wiseman

1991-01-01

411

A Logical Database Design Framework.  

National Technical Information Service (NTIS)

This report presents a methodology to be used for logical database design. The report covers the entire spectrum of the logical database design process. The logical database design process is broken down into five major tasks: requirements analysis, globa...

P. P. S. Chen; I. Chung; D. Perry

1982-01-01

412

CAL Learning Strategies Database  

NSDL National Science Digital Library

The Learning Strategies Database, developed by the Center for Advancement of Learning (CAL) at Muskingum College, organizes information about learning strategies into four major sections: Introduction to the CAL Learning Strategies Database, General-Purpose Learning Strategies, Content-Specific Learning Strategies, and Bibliography of Learning Strategies Resources. Each section is further divided into more specific subsections, creating a hierarchical database structure. For example, the general-purpose section contains sixteen subsections related to general learning, such as Memory, Test Preparation, and Notetaking; whereas, the content-specific section has 27 disciplinary subsections, covering subjects in the natural sciences, social sciences, humanities, and arts. The information in the database allows students of all ages and their instructors to assess current learning styles, and to identify and implement methods for effective education adapted to the learning strengths and weaknesses of individual students.

413

Chemical Kinetics Database  

National Institute of Standards and Technology Data Gateway

SRD 17 NIST Chemical Kinetics Database (Web, free access)   The NIST Chemical Kinetics Database includes essentially all reported kinetics results for thermal gas-phase chemical reactions. The database is designed to be searched for kinetics data based on the specific reactants involved, for reactions resulting in specified products, for all the reactions of a particular species, or for various combinations of these. In addition, the bibliography can be searched by author name or combination of names. The database contains in excess of 38,000 separate reaction records for over 11,700 distinct reactant pairs. These data have been abstracted from over 12,000 papers with literature coverage through early 2000.

414

Bishop Museum: Ichthyology Databases  

NSDL National Science Digital Library

This website offers access to three on-line fish database sites from the Bishop Museum in Hawaii. The first database, the Bishop Museum Hawaiian Fish Checklist, contains 980 records and offers search fields for Genus, Order, Species, Family, and more. The second site, the Fuji Fish Searchable Database, is under construction but is still operational and includes search criteria for Taxa (with or without specimens), Taxon, and Locality. The third database, the Bishop Museum Zooarchaeology Laboratory Fish Osteology Collection, "is searchable by order, family, genus, species, common and local names, locality and status (whether endemic, indigenous, or non-indigenous). Bishop Museum's comparative reference collection is available for use by students and researchers interested in identifying fish remains from archaeological sites in Hawaii and elsewhere in the Pacific."

415

Atomicity for XML Databases  

NASA Astrophysics Data System (ADS)

With more and more data stored into XML databases, there is a need to provide the same level of failure resilience and robustness that users have come to expect from relational database systems. In this work, we discuss strategies to provide the transactional aspect of atomicity to XML databases. The main contribution of this paper is to propose a novel approach for performing updates-in-place on XML databases, with the undo statements stored in the same high level language as the update statements. Finally, we give experimental results to study the performance/storage trade-off of the updates-in-place strategy (based on our undo proposal) against the deferred updates strategy to providing atomicity.

Biswas, Debmalya; Jiwane, Ashwin; Genest, Blaise
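
The updates-in-place strategy in the abstract above applies each change directly to the stored document and keeps an undo statement, expressed in the same language as the update, so that a failed transaction can be rolled back. The sketch below imitates that idea on an in-memory tree with `xml.etree.ElementTree`. It is an illustration under assumptions, not the paper's system: the operation format `("set_text", path, value)` and the function names are invented here, and only text replacement is supported.

```python
import xml.etree.ElementTree as ET


def apply_update(root, op):
    """Apply ('set_text', path, value) in place and return its inverse.

    The inverse has the same form as the update, so undoing is simply
    applying another update.
    """
    kind, path, value = op
    if kind != "set_text":
        raise ValueError(f"unsupported operation: {kind}")
    node = root.find(path)
    if node is None:
        raise KeyError(path)
    undo = ("set_text", path, node.text)
    node.text = value
    return undo


def run_transaction(root, ops):
    """Apply all ops or none: on any failure, replay the undo log in reverse."""
    undo_log = []
    try:
        for op in ops:
            undo_log.append(apply_update(root, op))
    except Exception:
        for undo in reversed(undo_log):
            apply_update(root, undo)
        return False
    return True


doc = ET.fromstring("<db><a>1</a><b>2</b></db>")
ok = run_transaction(doc, [("set_text", "a", "10"), ("set_text", "b", "20")])
failed = run_transaction(doc, [("set_text", "a", "99"), ("set_text", "missing", "x")])
print(ok, failed, doc.find("a").text)
```

A deferred-updates strategy would instead buffer the changes and apply them only at commit; the abstract's experiments compare the performance and storage costs of the two approaches.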

416

Database Delivers Contaminant Resources  

EPA Science Inventory

The US Environmental Protection Agency's Drinking Water Treatability Database presents referenced information on the control of contaminants in drinking water. The TDB allows a variety of drinking water professionals to access information gathered from thousands of literature so...

417

THE CTEPP DATABASE  

EPA Science Inventory

The CTEPP (Children's Total Exposure to Persistent Pesticides and Other Persistent Organic Pollutants) database contains a wealth of data on children's aggregate exposures to pollutants in their everyday surroundings. Chemical analysis data for the environmental media and ques...

418

Spatial Digital Database for the Geologic Map of Oregon. (Digital Database Version 3.0).  

National Technical Information Service (NTIS)

This report describes and makes available a geologic digital spatial database (orgeo) representing the geologic map of Oregon (Walker and MacLeod, 1991). The original paper publication was printed as a single map sheet at a scale of 1:500,000, accompanied...

G. W. Walker; N. S. MacLeod; R. J. Miller; G. L. Raines; K. A. Connors

2002-01-01

419

Multimedia support for databases  

Microsoft Academic Search

Next generation database systems will need to provide support for both textual data and other types of multimedia data (e.g., images, video, audio). These two types of data differ in their characteristics, and hence require different techniques for their organization and management. For example, continuous media data (e.g., video, audio) requires a guaranteed transfer rate. In this paper, we provide an overview of 1) how database systems

Banu Özden; Rajeev Rastogi; Abraham Silberschatz

1997-01-01

420

Database computing in HEP  

SciTech Connect

The major SSC experiments are expected to produce up to 1 Petabyte of data per year each. Once the primary reconstruction is completed by farms of inexpensive processors, I/O becomes a major factor in further analysis of the data. We believe that the application of database techniques can significantly reduce the I/O performed in these analyses. We present examples of such I/O reductions in prototypes based on relational and object-oriented databases of CDF data samples.

Day, C.T.; Loken, S.; MacFarlane, J.F. (Lawrence Berkeley Lab., CA (United States)); May, E.; Lifka, D.; Lusk, E.; Price, L.E. (Argonne National Lab., IL (United States)); Baden, A. (Maryland Univ., College Park, MD (United States). Dept. of Physics); Grossman, R.; Qin, X. (Illinois Univ., Chicago, IL (United States). Dept. of Mathematics, Statistics and Computer Science); Cormell, L.; Leibold, P.; Liu, D

1992-01-01

421

West Indian Orchidaceae Database  

NSDL National Science Digital Library

The New York Botanical Garden has recently placed online this searchable database of West Indian Orchids. Containing approximately 5,200 specimen records for the family Orchidaceae (from the New York Botanical Garden's collection), the database may be searched by Family, Collector, Country, Taxon, State/Province, and other select fields. Typical returns provide information on Specimen name (scientific name), Location, Collector, Description, and Habitat.

422

ICE Biological Inventories Databases  

NSDL National Science Digital Library

The Information Center for the Environment (ICE) is a cooperative effort between scientists at the University of California, Davis and collaborators from over thirty organizations involved in environmental protection. The ICE Biological Inventory Databases contain "documented, taxonomically standardized species inventories of plants and animals reported from the world's protected areas." New to the Scout Report, these online databases have been updated recently to include botanical inventories from protected areas in Costa Rica and South Africa.

2008-07-08

423

The Iris database system  

Microsoft Academic Search

Iris is an object-oriented database management system being developed at Hewlett-Packard Laboratories [1], [3]. This videotape provides an overview of the Iris data model and a summary of our experiences in converting a computer-integrated manufacturing application to Iris. An abstract of the videotape follows. Iris is intended to meet the needs of new and emerging database applications such as office and

Bill Kent; Peter Lyngback; Samir Mathur; Kevin Wilkinson

1990-01-01

424

JICST Factual Database(4)  

NASA Astrophysics Data System (ADS)

The JICST Mass Spectral Database, which started in March 1987, made a fresh start in January 1988 as JOIS-F (JICST Online Information System-Fact), which includes two other JICST Factual Databases. The renewal brings updated data, a new cross-retrieval function, and a small change to the commands. The data items, retrieval system, and availability are outlined again.

Okano, Keiko; Abe, Atsushi

425

Atlas Florae Europaeae Database  

NSDL National Science Digital Library

Initiated in 1992, the primary goal of the Atlas Florae Europaeae (AFE) Database project is to make plant distribution data available in digital format. Currently, AFE includes "preliminary maps for all European vascular plants" (based on the time period 1972-1996), examples of distribution statistics (colorful summary maps), and biogeographical analyses; digital data are expected in 1999. PC users can download an evaluation copy of the current database (which will be available for sale in 1999) on-site.

1992-01-01

426

Protein Structure Databases  

Microsoft Academic Search

Web-based protein structure databases come in a wide variety of types and levels of information content. Those having the most general interest are the various atlases that describe each experimentally determined protein structure and provide useful links, analyses and schematic diagrams relating to its 3D structure and biological function. Also of great interest are the databases that classify 3D structures

Roman A. Laskowski

2011-01-01

427

SSME environment database development  

NASA Technical Reports Server (NTRS)

The internal environment of the Space Shuttle Main Engine (SSME) is being determined from hot firings of the prototype engines and from model tests using either air or water as the test fluid. The objectives are to develop a database system to facilitate management and analysis of test measurements and results, to enter available data into the database, and to analyze available data to establish conventions and procedures to provide consistency in data normalization and configuration geometry references.

Reardon, John

1987-01-01

428

Chemical Synthesis Database  

NSDL National Science Digital Library

This database provides information on thousands of chemical compounds, including synthesis references and physical properties. The database is searchable by keyword and browseable by journal title. For each compound, the information includes molecular formula and weight, Chemical Abstracts Service (CAS) numbers, International Chemical Identifier (InChIKey), and Simplified Molecular Input Line Entry System (SMILES) notation. There is also information on synonyms, physical properties (boiling and melting points, density), an illustration of chemical structure, spectral data, and links to additional data.

429

ICE Biological Inventories Databases  

NSDL National Science Digital Library

The Information Center for the Environment (ICE) is a cooperative effort between scientists at the University of California, Davis and collaborators from over thirty organizations involved in environmental protection. The ICE Biological Inventory Databases contain "documented, taxonomically standardized species inventories of plants and animals reported from the world's protected areas." New to the Scout Report, these online databases have been updated recently to include botanical inventories from protected areas in Costa Rica and South Africa.

2000-01-01

430

The DARE Database: UNESCO  

NSDL National Science Digital Library

The DARE Database, maintained by UNESCO, includes an international directory of over 11,000 references to social science research and training institutes, social science specialists, social science documentation and information services, and social science periodicals. The directory also provides listings of peace, human rights, and international law institutions. Users may search the directory database by type of institution, country name, personal name, geographical coverage, periodical title, language of periodical, ISSN, or keyword.

431

Rational Design of Human DNA Ligase Inhibitors that Target Cellular DNA Replication and Repair  

Microsoft Academic Search

Based on the crystal structure of human DNA ligase I complexed with nicked DNA, computer-aided drug design was used to identify compounds in a database of 1.5 million commercially available low molecular weight chemicals that were predicted to bind to a DNA-binding pocket within the DNA-binding domain of DNA ligase I, thereby inhibiting DNA joining. Ten of 192 candidates specifically

Xi Chen; Shijun Zhong; Xiao Zhu; Barbara Dziegielewska; Tom Ellenberger; Gerald M. Wilson; Alexander D. MacKerell; Alan E. Tomkinson

2008-01-01

432

The transporter classification database.  

PubMed

The Transporter Classification Database (TCDB; http://www.tcdb.org) serves as a common reference point for transport protein research. The database contains more than 10,000 non-redundant proteins that represent all currently recognized families of transmembrane molecular transport systems. Proteins in TCDB are organized in a five level hierarchical system, where the first two levels are the class and subclass, the second two are the family and subfamily, and the last one is the transport system. Superfamilies that contain multiple families are included as hyperlinks to the five tier TC hierarchy. TCDB includes proteins from all types of living organisms and is the only transporter classification system that is both universal and recognized by the International Union of Biochemistry and Molecular Biology. It has been expanded by manual curation, contains extensive text descriptions providing structural, functional, mechanistic and evolutionary information, is supported by unique software and is interconnected to many other relevant databases. TCDB is of increasing usefulness to the international scientific community and can serve as a model for the expansion of database technologies. This manuscript describes an update of the database descriptions previously featured in NAR database issues. PMID:24225317

Saier, Milton H; Reddy, Vamsee S; Tamang, Dorjee G; Västermark, Ake

2014-01-01
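The five-level hierarchy described in the TCDB abstract above (class, subclass, family, subfamily, transport system) can be handled with a small parser. The sketch below assumes identifiers are written as five dot-separated components such as `2.A.1.1.1`; the abstract does not state this format, and the names `TCNumber`, `parse_tc_number`, and `family_id` are hypothetical, not part of any TCDB software.

```python
from typing import NamedTuple


class TCNumber(NamedTuple):
    """One identifier in the five-level hierarchy (assumed dotted format)."""
    tc_class: str   # level 1: class
    subclass: str   # level 2: subclass
    family: str     # level 3: family
    subfamily: str  # level 4: subfamily
    system: str     # level 5: transport system


def parse_tc_number(identifier: str) -> TCNumber:
    """Split an identifier such as '2.A.1.1.1' into its five levels."""
    parts = identifier.strip().split(".")
    if len(parts) != 5 or not all(parts):
        raise ValueError(f"expected five dot-separated levels, got {identifier!r}")
    return TCNumber(*parts)


def family_id(tc: TCNumber) -> str:
    """Return the family-level prefix (class.subclass.family)."""
    return ".".join(tc[:3])
```

Grouping entries by `family_id` is one simple way to collect all transport systems that belong to the same family.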

433

Neuroinformatics: From Bioinformatics to Databasing the Brain  

PubMed Central

Neuroinformatics seeks to create and maintain web-accessible databases of experimental and computational data, together with innovative software tools, essential for understanding the nervous system in its normal function and in neurological disorders. Neuroinformatics includes traditional bioinformatics of gene and protein sequences in the brain; atlases of brain anatomy and localization of genes and proteins; imaging of brain cells; brain imaging by positron emission tomography (PET), functional magnetic resonance imaging (fMRI), electroencephalography (EEG), magnetoencephalography (MEG) and other methods; many electrophysiological recording methods; and clinical neurological data, among others. Building neuroinformatics databases and tools presents difficult challenges because they span a wide range of spatial scales and types of data stored and analyzed. Traditional bioinformatics, by comparison, focuses primarily on genomic and proteomic data (which of course also presents difficult challenges). Much of bioinformatics analysis focuses on sequences (DNA, RNA, and protein molecules), as the type of data that are stored, compared, and sometimes modeled. Bioinformatics is undergoing explosive growth with the addition, for example, of databases that catalog interactions between proteins, of databases that track the evolution of genes, and of systems biology databases which contain models of all aspects of organisms. This commentary briefly reviews neuroinformatics with clarification of its relationship to traditional and modern bioinformatics.

Morse, Thomas M.

2008-01-01

434

Corruption of genomic databases with anomalous sequence.  

PubMed Central

We describe evidence that DNA sequences from vectors used for cloning and sequencing have been incorporated accidentally into eukaryotic entries in the GenBank database. These incorporations were not restricted to one type of vector or to a single mechanism. Many minor instances may have been the result of simple editing errors, but some entries contained large blocks of vector sequence that h