Science.gov

Sample records for public dna databases

  1. The Genographic Project Public Participation Mitochondrial DNA Database

    PubMed Central

    Behar, Doron M; Rosset, Saharon; Blue-Smith, Jason; Balanovsky, Oleg; Tzur, Shay; Comas, David; Mitchell, R. John; Quintana-Murci, Lluis; Tyler-Smith, Chris; Wells, R. Spencer

    2007-01-01

    The Genographic Project is studying the genetic signatures of ancient human migrations and creating an open-source research database. It allows members of the public to participate in a real-time anthropological genetics study by submitting personal samples for analysis and donating the genetic results to the database. We report our experience from the first 18 months of public participation in the Genographic Project, during which we have created the largest standardized human mitochondrial DNA (mtDNA) database ever collected, comprising 78,590 genotypes. Here, we detail our genotyping and quality assurance protocols including direct sequencing of the mtDNA HVS-I, genotyping of 22 coding-region SNPs, and a series of computational quality checks based on phylogenetic principles. This database is very informative with respect to mtDNA phylogeny and mutational dynamics, and its size allows us to develop a nearest neighbor–based methodology for mtDNA haplogroup prediction based on HVS-I motifs that is superior to classic rule-based approaches. We make available to the scientific community and general public two new resources: a periodically updated database comprising all data donated by participants, and the nearest neighbor haplogroup prediction tool. PMID:17604454
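
    Illustrative sketch (not taken from the record): the abstract names a nearest neighbor approach to haplogroup prediction from HVS-I motifs but gives no implementation details. The Python below shows only the general idea, under stated assumptions: each haplotype is a set of variant labels, distance is the size of the symmetric difference, and the reference entries and the labels "HG1" and "HG2" are hypothetical placeholders, not real haplogroup motifs or the project's actual tool.

```python
def predict_haplogroup(query, reference):
    """Return the label of the reference haplotype closest to `query`.

    `query` is a set of variant labels; `reference` is a list of
    (variant_set, haplogroup_label) pairs. All values are hypothetical.
    """
    def distance(a, b):
        # Assumed metric: number of variants present in one set but not the other.
        return len(a ^ b)

    _, best_label = min(reference, key=lambda item: distance(query, item[0]))
    return best_label


# Toy reference panel with placeholder variants and labels (not real data).
reference_panel = [
    (frozenset({"16100A", "16200C"}), "HG1"),
    (frozenset({"16300G", "16310T", "16320C"}), "HG2"),
]

predicted = predict_haplogroup(frozenset({"16300G", "16310T"}), reference_panel)
```

    A rule-based classifier would require an exact motif match; the nearest neighbor variant still returns a label when the query is missing or carries an extra variant.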

  2. Public participation in genetic databases: crossing the boundaries between biobanks and forensic DNA databases through the principle of solidarity

    PubMed Central

    Machado, Helena; Silva, Susana

    2015-01-01

    The ethical aspects of biobanks and forensic DNA databases are often treated as separate issues. As a reflection of this, public participation, or the involvement of citizens in genetic databases, has been approached differently in the fields of forensics and medicine. This paper aims to cross the boundaries between medicine and forensics by exploring the flows between the ethical issues presented in the two domains and the subsequent conceptualisation of public trust and legitimisation. We propose to introduce the concept of ‘solidarity’, traditionally applied only to medical and research biobanks, into a consideration of public engagement in medicine and forensics. Inclusion of a solidarity-based framework, in both medical biobanks and forensic DNA databases, raises new questions that should be included in the ethical debate, in relation to both health services/medical research and activities associated with the criminal justice system. PMID:26139851

  3. About DNA databasing and investigative genetic analysis of externally visible characteristics: A public survey.

    PubMed

    Zieger, Martin; Utz, Silvia

    2015-07-01

    During the last decade, DNA profiling and the use of DNA databases have become two of the most employed instruments of police investigations. This very rapid establishment of forensic genetics is yet far from being complete. In the last few years novel types of analyses have been presented to describe phenotypically a possible perpetrator. We conducted the present study among German speaking Swiss residents for two main reasons: firstly, we aimed at getting an impression of the public awareness and acceptance of the Swiss DNA database and the perception of a hypothetical DNA database containing all Swiss residents. Secondly, we wanted to get a broader picture of how people that are not working in the field of forensic genetics think about legal permission to establish phenotypic descriptions of alleged criminals by genetic means. Even though a significant number of study participants did not even know about the existence of the Swiss DNA database, its acceptance appears to be very high. Generally our results suggest that the current forensic use of DNA profiling is considered highly trustworthy. However, the acceptance of a hypothetical universal database would be only as low as about 30% among the 284 respondents to our study, mostly because people are concerned about the security of their genetic data, their privacy or a possible risk of abuse of such a database. Concerning the genetic analysis of externally visible characteristics and biogeographical ancestry, we discover a high degree of acceptance. The acceptance decreases slightly when precise characteristics are presented to the participants in detail. About half of the respondents would be in favor of the moderate use of physical traits analyses only for serious crimes threatening life, health or sexual integrity. The possible risk of discrimination and reinforcement of racism, as discussed by scholars from anthropology, bioethics, law, philosophy and sociology, is mentioned less frequently by the study

  5. Spanish public awareness regarding DNA profile databases in forensic genetics: what type of DNA profiles should be included?

    PubMed Central

    Gamero, Joaquín J; Romero, Jose‐Luis; Peralta, Juan‐Luis; Carvalho, Mónica; Corte‐Real, Francisco

    2007-01-01

    The importance of non‐codifying DNA polymorphism for the administration of justice is now well known. In Spain, however, this type of test has given rise to questions in recent years: (a) Should consent be obtained before biological samples are taken from an individual for DNA analysis? (b) Does society perceive these techniques and methods of analysis as being reliable? (c) There appears to be lack of knowledge concerning the basic norms that regulate databases containing private or personal information and the protection that information of this type must be given. This opinion survey and the subsequent analysis of the results in ethical terms may serve to reveal the criteria and the degree of information that society has with regard to DNA databases. In the study, 73.20% (SE 1.12%) of the population surveyed was in favour of specific legislation for computer files in which DNA analysis results for forensic purposes are stored. PMID:17906059

  6. Navigating public microarray databases.

    PubMed

    Penkett, Christopher J; Bähler, Jürg

    2004-01-01

    With the ever-escalating amount of data being produced by genome-wide microarray studies, it is of increasing importance that these data are captured in public databases so that researchers can use this information to complement and enhance their own studies. Many groups have set up databases of expression data, ranging from large repositories, which are designed to comprehensively capture all published data, through to more specialized databases. The public repositories, such as ArrayExpress at the European Bioinformatics Institute, contain complete datasets in raw format in addition to processed data, whilst the specialist databases tend to provide downstream analysis of normalized data from more focused studies and data sources. Here we provide a guide to the use of these public microarray resources.

  7. Enhancing the DNA Patent Database

    SciTech Connect

    Walters, LeRoy B.

    2008-02-18

    Final Report on Award No. DE-FG0201ER63171. Principal Investigator: LeRoy B. Walters. February 18, 2008. This project successfully completed its goal of surveying and reporting on the DNA patenting and licensing policies at 30 major U.S. academic institutions. The report of survey results was published in the January 2006 issue of Nature Biotechnology under the title “The Licensing of DNA Patents by US Academic Institutions: An Empirical Survey.” Lori Pressman was the lead author on this feature article. A PDF reprint of the article will be submitted to our Program Officer under separate cover. The project team has continued to update the DNA Patent Database on a weekly basis since the conclusion of the project. The database can be accessed at dnapatents.georgetown.edu. This database provides a valuable research tool for academic researchers, policymakers, and citizens. A report entitled Reaping the Benefits of Genomic and Proteomic Research: Intellectual Property Rights, Innovation, and Public Health was published in 2006 by the Committee on Intellectual Property Rights in Genomic and Protein Research and Innovation, Board on Science, Technology, and Economic Policy at the National Academies. The report was edited by Stephen A. Merrill and Anne-Marie Mazza. This report employed and then adapted the methodology developed by our research project and quoted our findings at several points. (The full report can be viewed online at the following URL: http://www.nap.edu/openbook.php?record_id=11487&page=R1). My colleagues and I are grateful for the research support of the ELSI program at the U.S. Department of Energy.

  8. Characterization of new Schistosoma mansoni microsatellite loci in sequences obtained from public DNA databases and microsatellite enriched genomic libraries.

    PubMed

    Rodrigues, N B; Loverde, P T; Romanha, A J; Oliveira, G

    2002-01-01

    In the last decade microsatellites have become one of the most useful genetic markers used in a large number of organisms due to their abundance and high level of polymorphism. Microsatellites have been used for individual identification, paternity tests, forensic studies and population genetics. Data on microsatellite abundance comes preferentially from microsatellite enriched libraries and DNA sequence databases. We have conducted a search in GenBank of more than 16,000 Schistosoma mansoni ESTs and 42,000 BAC sequences. In addition, we obtained 300 sequences from CA and AT microsatellite enriched genomic libraries. The sequences were searched for simple repeats using the RepeatMasker software. Of 16,022 ESTs, we detected 481 (3%) sequences that contained 622 microsatellites (434 perfect, 164 imperfect and 24 compounds). Of the 481 ESTs, 194 were grouped in 63 clusters containing 2 to 15 ESTs per cluster. Polymorphisms were observed in 16 clusters. The 287 remaining ESTs were orphan sequences. Of the 42,017 BAC end sequences, 1,598 (3.8%) contained microsatellites (2,335 perfect, 287 imperfect and 79 compounds). Of the 1,598 BAC end sequences, 80 were grouped into 17 clusters containing 3 to 17 BAC end sequences per cluster. Microsatellites were present in 67 out of 300 sequences from microsatellite enriched libraries (55 perfect, 38 imperfect and 15 compounds). From all of the observed loci 55 were selected for having the longest perfect repeats and flanking regions that allowed the design of primers for PCR amplification. Additionally we describe two new polymorphic microsatellite loci.
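
    Illustrative sketch (not taken from the record): the study searched for simple repeats with RepeatMasker. The Python below is not RepeatMasker and does not reproduce its behaviour; it is a minimal regular-expression stand-in that finds only perfect tandem repeats. The function name and the thresholds (motifs of 2 to 6 bases, at least 5 copies) are assumptions chosen for illustration.

```python
import re


def find_perfect_microsatellites(seq, min_unit=2, max_unit=6, min_repeats=5):
    """Return (start, unit, copies) for each perfect tandem repeat in `seq`.

    Thresholds are illustrative assumptions, not values from the study.
    """
    seq = seq.upper()
    # A motif of min_unit..max_unit bases (shortest tried first), followed by
    # at least (min_repeats - 1) further copies of the same motif.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        if len(set(unit)) == 1:
            # Skip single-base runs such as "AAAAAAAAAA".
            continue
        hits.append((match.start(), unit, len(match.group(0)) // len(unit)))
    return hits


# Toy sequence containing a (CA)8 and an (AT)6 repeat.
example = "GGG" + "CA" * 8 + "GGG" + "AT" * 6 + "GG"
repeats = find_perfect_microsatellites(example)
```

    Imperfect and compound repeats, which the abstract counts separately, would need a tolerance for mismatches and adjacent motifs that this sketch does not attempt.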

  9. Short Tandem Repeat DNA Internet Database

    National Institute of Standards and Technology Data Gateway

    SRD 130 Short Tandem Repeat DNA Internet Database (Web, free access)   Short Tandem Repeat DNA Internet Database is intended to benefit research and application of short tandem repeat DNA markers for human identity testing. Facts and sequence information on each STR system, population data, commonly used multiplex STR systems, PCR primers and conditions, and a review of various technologies for analysis of STR alleles have been included.

  10. Forensic DNA profiling and database.

    PubMed

    Panneerchelvam, S; Norazmi, M N

    2003-07-01

    The incredible power of DNA technology as an identification tool has brought a tremendous change in criminal justice. The DNA database is an information resource for the forensic DNA typing community, with details on commonly used short tandem repeat (STR) DNA markers. This article discusses the essential steps in the compilation of the COmbined DNA Index System (CODIS) on validated polymerase chain amplified STRs and their use in crime detection.

  11. Database Support for Research in Public Administration

    ERIC Educational Resources Information Center

    Tucker, James Cory

    2005-01-01

    This study examines the extent to which databases support student and faculty research in the area of public administration. A list of journals in public administration, public policy, political science, public budgeting and finance, and other related areas was compared to the journal content list of six business databases. These databases…

  12. Online Database Searching in Smaller Public Libraries.

    ERIC Educational Resources Information Center

    Roose, Tina

    1983-01-01

    Online database searching experiences of nine Illinois public libraries--Arlington Heights, Deerfield, Elk Grove Village, Evanston, Glenview, Northbrook, Schaumburg Township, Waukegan, Wilmette--are discussed, noting search costs, user charges, popular databases, library acquisition, interaction with users, and staff training. Three sources are…

  13. Influencing Database Use in Public Libraries.

    ERIC Educational Resources Information Center

    Tenopir, Carol

    1999-01-01

    Discusses results of a survey of factors influencing database use in public libraries. Highlights the importance of content; ease of use; and importance of instruction. Tabulates importance indications for number and location of workstations, library hours, availability of remote login, usefulness and quality of content, lack of other databases,…

  14. Ethical-legal problems of DNA databases in criminal investigation

    PubMed Central

    Guillen, M.; Lareu, M. V.; Pestoni, C.; Salas, A.; Carracedo, A.

    2000-01-01

    Advances in DNA technology and the discovery of DNA polymorphisms have permitted the creation of DNA databases of individuals for the purpose of criminal investigation. Many ethical and legal problems arise in the preparation of a DNA database, and these problems are especially important when one analyses the legal regulations on the subject. In this paper three main groups of possibilities, three systems, are analysed in relation to databases. The first system is based on a general analysis of the population; the second one is based on the taking of samples for a particular list of crimes, and a third is based only on the specific analysis of each case. The advantages and disadvantages of each system are compared and controversial issues are then examined. We found the second system to be the best choice for Spain and other European countries with a similar tradition when we weighed the rights of an individual against the public's interest in the prosecution of a crime. Key Words: DNA databases • forensic genetics • ethics PMID:10951922

  15. Public Opinion Poll Question Databases: An Evaluation

    ERIC Educational Resources Information Center

    Woods, Stephen

    2007-01-01

    This paper evaluates five polling resources: iPOLL, Polling the Nations, Gallup Brain, Public Opinion Poll Question Database, and Polls and Surveys. Content was evaluated on disclosure standards from major polling organizations, scope on a model for public opinion polls, and presentation on a flow chart discussing search limitations and usability.

  16. DDRprot: a database of DNA damage response-related proteins

    PubMed Central

    Andrés-León, Eduardo; Cases, Ildefonso; Arcas, Aida; Rojas, Ana M.

    2016-01-01

    The DNA Damage Response (DDR) signalling network is an essential system that protects the genome’s integrity. The DDRprot database presented here is a resource that integrates manually curated information on the human DDR network and its sub-pathways. For each particular DDR protein, we present detailed information about its function. If involved in post-translational modifications (PTMs) with each other, we depict the position of the modified residue/s in the three-dimensional structures, when resolved structures are available for the proteins. All this information is linked to the original publication from where it was obtained. Phylogenetic information is also shown, including time of emergence and conservation across 47 selected species, family trees and sequence alignments of homologues. The DDRprot database can be queried by different criteria: pathways, species, evolutionary age or involvement in (PTM). Sequence searches using hidden Markov models can be also used. Database URL: http://ddr.cbbio.es. PMID:27577567

  17. DDRprot: a database of DNA damage response-related proteins.

    PubMed

    Andrés-León, Eduardo; Cases, Ildefonso; Arcas, Aida; Rojas, Ana M

    2016-01-01

    The DNA Damage Response (DDR) signalling network is an essential system that protects the genome's integrity. The DDRprot database presented here is a resource that integrates manually curated information on the human DDR network and its sub-pathways. For each particular DDR protein, we present detailed information about its function. If involved in post-translational modifications (PTMs) with each other, we depict the position of the modified residue/s in the three-dimensional structures, when resolved structures are available for the proteins. All this information is linked to the original publication from where it was obtained. Phylogenetic information is also shown, including time of emergence and conservation across 47 selected species, family trees and sequence alignments of homologues. The DDRprot database can be queried by different criteria: pathways, species, evolutionary age or involvement in (PTM). Sequence searches using hidden Markov models can be also used. Database URL: http://ddr.cbbio.es. PMID:27577567

  18. The Dfam database of repetitive DNA families

    PubMed Central

    Hubley, Robert; Finn, Robert D.; Clements, Jody; Eddy, Sean R.; Jones, Thomas A.; Bao, Weidong; Smit, Arian F.A.; Wheeler, Travis J.

    2016-01-01

    Repetitive DNA, especially that due to transposable elements (TEs), makes up a large fraction of many genomes. Dfam is an open access database of families of repetitive DNA elements, in which each family is represented by a multiple sequence alignment and a profile hidden Markov model (HMM). The initial release of Dfam, featured in the 2013 NAR Database Issue, contained 1143 families of repetitive elements found in humans, and was used to produce more than 100 Mb of additional annotation of TE-derived regions in the human genome, with improved speed. Here, we describe recent advances, most notably expansion to 4150 total families including a comprehensive set of known repeat families from four new organisms (mouse, zebrafish, fly and nematode). We describe improvements to coverage, and to our methods for identifying and reducing false annotation. We also describe updates to the website interface. The Dfam website has moved to http://dfam.org. Seed alignments, profile HMMs, hit lists and other underlying data are available for download. PMID:26612867

  20. The Dfam database of repetitive DNA families.

    PubMed

    Hubley, Robert; Finn, Robert D; Clements, Jody; Eddy, Sean R; Jones, Thomas A; Bao, Weidong; Smit, Arian F A; Wheeler, Travis J

    2016-01-01

    Repetitive DNA, especially that due to transposable elements (TEs), makes up a large fraction of many genomes. Dfam is an open access database of families of repetitive DNA elements, in which each family is represented by a multiple sequence alignment and a profile hidden Markov model (HMM). The initial release of Dfam, featured in the 2013 NAR Database Issue, contained 1143 families of repetitive elements found in humans, and was used to produce more than 100 Mb of additional annotation of TE-derived regions in the human genome, with improved speed. Here, we describe recent advances, most notably expansion to 4150 total families including a comprehensive set of known repeat families from four new organisms (mouse, zebrafish, fly and nematode). We describe improvements to coverage, and to our methods for identifying and reducing false annotation. We also describe updates to the website interface. The Dfam website has moved to http://dfam.org. Seed alignments, profile HMMs, hit lists and other underlying data are available for download. PMID:26612867

  1. A compendium of human mitochondrial DNA control region: development of an international standard forensic database.

    PubMed

    Miller, K W; Budowle, B

    2001-06-01

    A compendium of human mitochondrial DNA (mtDNA) control region types has been constructed. This updated compilation indexes over 10,000 population-specific mtDNA nucleotide sequences in a standardized format. The sequences represent mtDNA types from the Scientific Working Group on DNA Analysis Methods (SWGDAM) mtDNA database and from the public literature. The SWGDAM data are considered to be of higher quality than the public data, particularly for counting the number of times a particular haplotype has been observed. PMID:11387646

  2. OriDB: a DNA replication origin database.

    PubMed

    Nieduszynski, Conrad A; Hiraga, Shin-ichiro; Ak, Prashanth; Benham, Craig J; Donaldson, Anne D

    2007-01-01

    Replication of eukaryotic chromosomes initiates at multiple sites called replication origins. Replication origins are best understood in the budding yeast Saccharomyces cerevisiae, where several complementary studies have mapped their locations genome-wide. We have collated these datasets, taking account of the resolution of each study, to generate a single list of distinct origin sites. OriDB provides a web-based catalogue of these confirmed and predicted S. cerevisiae DNA replication origin sites. Each proposed or confirmed origin site appears as a record in OriDB, with each record comprising seven pages. These pages provide, in text and graphical formats, the following information: genomic location and chromosome context of the origin site; time of origin replication; DNA sequence of proposed or experimentally confirmed origin elements; free energy required to open the DNA duplex (stress-induced DNA duplex destabilization or SIDD); and phylogenetic conservation of sequence elements. In addition, OriDB encourages community submission of additional information for each origin site through a User Notes facility. Origin sites are linked to several external resources, including the Saccharomyces Genome Database (SGD) and relevant publications at PubMed. Finally, a Chromosome Viewer utility allows users to interactively generate graphical representations of DNA replication data genome-wide. OriDB is available at www.oridb.org.

  3. 24 CFR 81.72 - Public-use database and public information.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 24 Housing and Urban Development 1 2011-04-01 2011-04-01 false Public-use database and public... Public-use database and public information. (a) General. Except as provided in paragraph (c) of this section, the Secretary shall establish and make available for public use, a public-use database...

  4. 24 CFR 81.72 - Public-use database and public information.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... 24 Housing and Urban Development 1 2013-04-01 2013-04-01 false Public-use database and public... Public-use database and public information. (a) General. Except as provided in paragraph (c) of this section, the Secretary shall establish and make available for public use, a public-use database...

  5. 24 CFR 81.72 - Public-use database and public information.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... 24 Housing and Urban Development 1 2012-04-01 2012-04-01 false Public-use database and public... Public-use database and public information. (a) General. Except as provided in paragraph (c) of this section, the Secretary shall establish and make available for public use, a public-use database...

  6. 24 CFR 81.72 - Public-use database and public information.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... 24 Housing and Urban Development 1 2014-04-01 2014-04-01 false Public-use database and public... Public-use database and public information. (a) General. Except as provided in paragraph (c) of this section, the Secretary shall establish and make available for public use, a public-use database...

  7. 24 CFR 81.72 - Public-use database and public information.

    Code of Federal Regulations, 2010 CFR

    2010-04-01

    ... 24 Housing and Urban Development 1 2010-04-01 2010-04-01 false Public-use database and public... Public-use database and public information. (a) General. Except as provided in paragraph (c) of this section, the Secretary shall establish and make available for public use, a public-use database...

  8. The Availability of Faculty Publication Databases from Library Web Pages

    ERIC Educational Resources Information Center

    Blummer, Barbara A.

    2007-01-01

    Faculty publication databases or author bibliographies offer libraries an opportunity to provide services to users. Initially, these databases remained initiatives of special libraries in the health-sciences fields. Librarians used the publication information derived from these databases to compile lists for annual reports. However, the advent of…

  9. Exploration of the Chemical Space of Public Genomic Databases

    EPA Science Inventory

    The current project aims to chemically index the content of public genomic databases to make these data accessible in relation to other publicly available, chemically-indexed toxicological information.

  10. [Privacy and public benefit in using large scale health databases].

    PubMed

    Yamamoto, Ryuichi

    2014-01-01

    In Japan, large-scale health databases, such as the National Claim insurance and health checkup database (NDB) and the Japanese Sentinel project, have been constructed within a few years. However, there are legal issues in striking an adequate balance between privacy and public benefit when using such databases. The NDB is operated under the act for elderly persons' health care, but this act says nothing about using the database for general public benefit. Researchers who use this database are therefore forced to pay so much attention to anonymization and information security that the research work itself may be disturbed. The Japanese Sentinel project is a national project to detect adverse drug reactions using large-scale distributed clinical databases of large hospitals. Although patients give future consent for such general purposes for the public good, the use of insufficiently anonymized data is still under discussion. Generally speaking, researchers working for public benefit will not infringe patients' privacy, but vague and complex requirements in legislation on personal data protection may hinder research. Medical science does not progress without using clinical information; therefore, adequate legislation that is simple and clear for both researchers and patients is strongly required. In Japan, a specific act for balancing privacy and public benefit is now under discussion. The author recommends that researchers, including those in the field of pharmacology, pay attention to, participate in the discussion of, and make suggestions on such acts or regulations.

  11. Enhancing thermal video using a public database of images

    NASA Astrophysics Data System (ADS)

    Qadir, Hemin; Kozaitis, S. P.; Ali, Ehsan

    2014-05-01

    We presented a system to display nighttime imagery with natural colors using a public database of images. We initially combined two spectral bands of images, thermal and visible, to enhance night vision imagery; however, the fused image gave an unnatural color appearance. Therefore, a color transfer based on a look-up table (LUT) was used to replace the false color appearance with a colormap derived from a daytime reference image obtained from a public database using the GPS coordinates of the vehicle. Because of the computational demand in deriving the colormap from the reference image, we created an additional local database of colormaps. Reference images from the public database were compared to a compact local database to retrieve one of a limited number of colormaps that represented several driving environments. Each colormap in the local database was stored with an image from which it was derived. To retrieve a colormap, we compared the histogram of the fused image with histograms of images in the local database. The colormap of the best match was then used for the fused image. Continuously selecting and applying colormaps using this approach offered a convenient way to color night vision imagery.
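
    Illustrative sketch (not taken from the record): the abstract says a colormap is retrieved by comparing the histogram of the fused image with histograms of images in a local database, but it does not state the similarity measure or the data layout. The Python below assumes 8-bit grayscale pixel lists, 8 histogram bins, and histogram intersection as the score; the function names, the dictionary keys, and the "urban" and "rural" colormap names are hypothetical.

```python
def histogram(pixels, bins=8, max_value=256):
    """Normalized intensity histogram of a non-empty list of 8-bit pixel values."""
    counts = [0] * bins
    for p in pixels:
        counts[p * bins // max_value] += 1
    total = float(len(pixels))
    return [c / total for c in counts]


def intersection(h1, h2):
    """Histogram intersection: 1.0 for identical histograms, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def retrieve_colormap(fused_pixels, local_db, bins=8):
    """Return the colormap stored with the best-matching local database image."""
    query = histogram(fused_pixels, bins)
    best = max(
        local_db,
        key=lambda entry: intersection(query, histogram(entry["pixels"], bins)),
    )
    return best["colormap"]


# Toy local database: each entry pairs an image (pixel list) with its colormap.
local_db = [
    {"pixels": [10, 20, 30, 40], "colormap": "urban"},
    {"pixels": [200, 210, 220, 250], "colormap": "rural"},
]

chosen = retrieve_colormap([205, 215, 225, 240], local_db)
```

    Precomputing the stored histograms once, instead of recomputing them per query as done here, would be the natural next step for per-frame use.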

  12. DNAVaxDB: the first web-based DNA vaccine database and its data analysis.

    PubMed

    Racz, Rebecca; Li, Xinna; Patel, Mukti; Xiang, Zuoshuang; He, Yongqun

    2014-01-01

    Since the first DNA vaccine studies were done in the 1990s, thousands more studies have followed. Here we report the development and analysis of DNAVaxDB (http://www.violinet.org/dnavaxdb), the first publicly available web-based DNA vaccine database that curates, stores, and analyzes experimentally verified DNA vaccines, DNA vaccine plasmid vectors, and protective antigens used in DNA vaccines. All data in DNAVaxDB are annotated from reliable resources, particularly peer-reviewed articles. Among over 140 DNA vaccine plasmids, some plasmids were more frequently used in one type of pathogen than others; for example, pCMVi-UB for G- bacterial DNA vaccines, and pCAGGS for viral DNA vaccines. Presently, over 400 DNA vaccines containing over 370 protective antigens from over 90 infectious and non-infectious diseases have been curated in DNAVaxDB. While extracellular and bacterial cell surface proteins and adhesin proteins were frequently used for DNA vaccine development, the majority of protective antigens used in Chlamydophila DNA vaccines are localized to the inner portion of the cell. The vaccination regimen of priming with a DNA vaccine and boosting with another vaccine has been widely used to induce protection against infection by different pathogens such as HIV. Parasitic and cancer DNA vaccines were also systematically analyzed. User-friendly web query and visualization interfaces are available in DNAVaxDB for interactive data search. To support data exchange, the information on DNA vaccines, plasmids, and protective antigens is stored in the Vaccine Ontology (VO). DNAVaxDB is intended to become a timely and vital source of DNA vaccines and related data and to facilitate advanced DNA vaccine research and development.

  13. The Mouse SAGE Site: database of public mouse SAGE libraries.

    PubMed

    Divina, Petr; Forejt, Jirí

    2004-01-01

    The Mouse SAGE Site is a web-based database of all available public libraries generated by the Serial Analysis of Gene Expression (SAGE) from various mouse tissues and cell lines. The database contains mouse SAGE libraries organized in a uniform way and provides web-based tools for browsing, comparing and searching SAGE data with reliable tag-to-gene identification. A modified approach based on the SAGEmap database is used for reliable tag identification. The Mouse SAGE Site is maintained on an ongoing basis at the Institute of Molecular Genetics, Academy of Sciences of the Czech Republic and is accessible at the internet address http://mouse.biomed.cas.cz/sage/.

  14. Building a Faculty Publications Database: A Case Study

    ERIC Educational Resources Information Center

    Tabaei, Sara; Schaffer, Yitzchak; McMurray, Gregory; Simon, Bashe

    2013-01-01

    This case study shares the experience of building an in-house faculty publications database that was spearheaded by the Touro College and University System library in 2010. The project began with the intention of contributing to the college by collecting the research accomplishments of our faculty and staff, thereby also increasing library…

  15. Digital Equipment Corporation's CDROM Software and Database Publications.

    ERIC Educational Resources Information Center

    Adams, Michael Q.

    1986-01-01

    Acquaints information professionals with Digital Equipment Corporation's compact optical disk read-only-memory (CDROM) search and retrieval software and growing library of CDROM database publications (COMPENDEX, Chemical Abstracts Services). Highlights include MicroBASIS, boolean operators, range operators, word and phrase searching, proximity…

  16. Prototype Food and Nutrient Database for Dietary Studies: Branded Food Products Database for Public Health Proof of Concept

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The Prototype Food and Nutrient Database for Dietary Studies (Prototype FNDDS) Branded Food Products Database for Public Health is a proof of concept database. The database contains a small selection of food products which is being used to exhibit the approach for incorporation of the Branded Food ...

  17. Genetics and Forensics: Making the National DNA Database

    PubMed Central

    Johnson, Paul; Williams, Robin; Martin, Paul

    2005-01-01

    This paper is based on a current study of the growing police use of the epistemic authority of molecular biology for the identification of criminal suspects in support of crime investigation. It discusses the development of DNA profiling and the establishment and development of the UK National DNA Database (NDNAD) as an instance of the ‘scientification of police work’ (Ericson and Shearing 1986) in which the police uses of science and technology have a recursive effect on their future development. The NDNAD, owned by the Association of Chief Police Officers of England and Wales, is the first of its kind in the world and currently contains the genetic profiles of more than 2 million people. The paper provides a framework for the examination of this socio-technical innovation, begins to tease out the dense and compact history of the database and accounts for the way in which changes and developments across disparate scientific, governmental and policing contexts, have all contributed to the range of uses to which it is put. PMID:16467921

  18. 76 FR 1137 - Publicly Available Consumer Product Safety Information Database: Notice of Public Web Conferences

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-01-07

    ... site. In the Federal Register of December 9, 2010 (75 FR 76832), we published a final rule to establish... COMMISSION Publicly Available Consumer Product Safety Information Database: Notice of Public Web Conferences... Commission (``Commission,'' ``CPSC,'' or ``we'') is announcing two Web conferences to demonstrate...

  19. The Biotinidase Gene Variants Registry: A Paradigm Public Database

    PubMed Central

    Procter, Melinda; Wolf, Barry; Crockett, David K.; Mao, Rong

    2013-01-01

    The BTD gene codes for production of biotinidase, the enzyme responsible for helping the body reuse and recycle the biotin found in foods. Biotinidase deficiency is an autosomal recessively inherited disorder resulting in the inability to recycle the vitamin biotin and affects approximately 1 in 60,000 newborns. If untreated, the depletion of intracellular biotin leads to impaired activities of the biotin-dependent carboxylases and can result in cutaneous and neurological abnormalities in individuals with the disorder. Mutations in the biotinidase gene (BTD) alter enzymatic function. To date, more than 165 mutations in BTD have been reported. Our group has developed a database that characterizes the known mutations and sequence variants in BTD (http://arup.utah.edu/database/BTD/BTD_welcome.php). All sequence variants have been verified for their positions within the BTD gene and designated according to standard nomenclature suggested by Human Genome Variation Society (HGVS). In addition, we describe the change in the protein, indicate whether the variant is a known or likely mutation vs. a benign polymorphism, and include the reference that first described the alteration. We also indicate whether the alteration is known to be clinically pathological based on an observation of a known symptomatic individual or predicted to be pathological based on enzymatic activity or putative disruption of the protein structure. We incorporated the published phenotype to help establish genotype-phenotype correlations and facilitate this process for those performing mutation analysis and/or interpreting results. Other features of this database include disease information, relevant links about biotinidase deficiency, reference sequences, ability to query by various criteria, and the process for submitting novel variations. This database is free to the public and will be updated quarterly. This database is a paradigm for formulating databases for other inherited metabolic disorders.

  20. Pathway Analysis for Drug Repositioning Based on Public Database Mining

    PubMed Central

    2015-01-01

    Sixteen FDA-approved drugs were investigated to elucidate their mechanisms of action (MOAs) and clinical functions by pathway analysis based on retrieved drug targets interacting with or affected by the investigated drugs. Protein and gene targets and associated pathways were obtained by data-mining of public databases including the MMDB, PubChem BioAssay, GEO DataSets, and the BioSystems databases. Entrez E-Utilities were applied, and in-house Ruby scripts were developed for data retrieval and pathway analysis to identify and evaluate relevant pathways common to the retrieved drug targets. Pathways pertinent to clinical uses or MOAs were obtained for most drugs. Interestingly, some drugs identified pathways responsible for other diseases than their current therapeutic uses, and these pathways were verified retrospectively by in vitro tests, in vivo tests, or clinical trials. The pathway enrichment analysis based on drug target information from public databases could provide a novel approach for elucidating drug MOAs and repositioning, therefore benefiting the discovery of new therapeutic treatments for diseases. PMID:24460210

  1. Using the ADS Database to Study Trends in Astronomical Publication

    NASA Astrophysics Data System (ADS)

    Schulman, E.; Powell, A. L.; French, J. C.; Eichhorn, G.; Kurtz, M. J.; Murray, S. S.

    1996-12-01

    The sociology of astronomical publication has traditionally been studied by looking for publication trends using every paper published in a few selected journals within a few selected years. For example, Abt (1981, PASP, 93, 269) examined the papers published in ApJ, ApJS, AJ, and PASP during the first year of each decade from 1910 to 1980. By analyzing the NASA Astrophysics Data System (ADS) database of astronomical abstracts we can study a large number of issues in the sociology of astronomical publication while including every paper published in a number of refereed astronomy journals during the past twenty years. Although there are articles from more than a thousand journals in the ADS database, seven journals together account for the majority of refereed astronomy and astrophysics papers published in the last two decades. We will be presenting results of a study of astronomical publication trends using papers published in A&A, A&AS, AJ, ApJ, ApJS, MNRAS, and PASP between 1975 and 1995. One of the most interesting trends is the rapid decrease in the fraction of papers with only one author:

            A&A    A&AS   AJ     ApJ    ApJS   MNRAS  PASP
    1975    39%    39%    49%    35%    67%    48%    54%
    1985    25%    31%    25%    21%    36%    28%    35%
    1995    14%    19%    14%    13%    19%    14%    28%

    We will also be presenting information about trends in the number of papers published, the length of papers, and the number of authors per paper, with particular emphasis on the recent phenomenon of astronomical papers with fifty or more authors.
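    To make the size of the single-author decline concrete, the percentages quoted in this abstract can be tabulated and compared directly. The sketch below only re-uses the figures given in the abstract; the dictionary name and the `decline` helper are illustrative and are not part of the ADS study.

```python
# Fraction of single-author papers (percent), as quoted in the abstract.
SINGLE_AUTHOR_PCT = {
    "A&A":   {1975: 39, 1985: 25, 1995: 14},
    "A&AS":  {1975: 39, 1985: 31, 1995: 19},
    "AJ":    {1975: 49, 1985: 25, 1995: 14},
    "ApJ":   {1975: 35, 1985: 21, 1995: 13},
    "ApJS":  {1975: 67, 1985: 36, 1995: 19},
    "MNRAS": {1975: 48, 1985: 28, 1995: 14},
    "PASP":  {1975: 54, 1985: 35, 1995: 28},
}


def decline(journal, start=1975, end=1995):
    """Return (percentage-point drop, relative drop) between two years."""
    first = SINGLE_AUTHOR_PCT[journal][start]
    last = SINGLE_AUTHOR_PCT[journal][end]
    return first - last, (first - last) / first


for name in SINGLE_AUTHOR_PCT:
    points, relative = decline(name)
    print(f"{name:6s} -{points:2d} points ({relative:.0%} relative)")

# Journal with the largest absolute drop over the two decades.
largest = max(SINGLE_AUTHOR_PCT, key=lambda j: decline(j)[0])
```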

  2. TFBSshape: a motif database for DNA shape features of transcription factor binding sites

    PubMed Central

    Yang, Lin; Zhou, Tianyin; Dror, Iris; Mathelier, Anthony; Wasserman, Wyeth W.; Gordân, Raluca; Rohs, Remo

    2014-01-01

    Transcription factor binding sites (TFBSs) are most commonly characterized by the nucleotide preferences at each position of the DNA target. Whereas these sequence motifs are quite accurate descriptions of DNA binding specificities of transcription factors (TFs), proteins recognize DNA as a three-dimensional object. DNA structural features refine the description of TF binding specificities and provide mechanistic insights into protein–DNA recognition. Existing motif databases contain extensive nucleotide sequences identified in binding experiments based on their selection by a TF. To utilize DNA shape information when analysing the DNA binding specificities of TFs, we developed a new tool, the TFBSshape database (available at http://rohslab.cmb.usc.edu/TFBSshape/), for calculating DNA structural features from nucleotide sequences provided by motif databases. The TFBSshape database can be used to generate heat maps and quantitative data for DNA structural features (i.e., minor groove width, roll, propeller twist and helix twist) for 739 TF datasets from 23 different species derived from the motif databases JASPAR and UniPROBE. As demonstrated for the basic helix-loop-helix and homeodomain TF families, our TFBSshape database can be used to compare, qualitatively and quantitatively, the DNA binding specificities of closely related TFs and, thus, uncover differential DNA binding specificities that are not apparent from nucleotide sequence alone. PMID:24214955

  3. Addressing the use of phylogenetics for identification of sequences in error in the SWGDAM mitochondrial DNA database.

    PubMed

    Budowle, Bruce; Polanskey, Deborah; Allard, Marc W; Chakraborty, Ranajit

    2004-11-01

    The SWGDAM mtDNA database is a publicly available reference source that is used for estimating the rarity of an evidence mtDNA profile. Because of the current processes for generating population data, it is unlikely that population databases are error free. The majority of the errors are due to human error and are transcriptional in nature. Phylogenetic analysis of data sets can identify some potential errors, and coupled with a review of the sequence data or alignment sheets can be a very useful tool. Seven sequences with errors have been identified by phylogenetic analysis. In addition, two samples were inadvertently modified when placed in the SWGDAM database. The corrected sequences are provided so that users can modify appropriately the current iteration of the SWGDAM database. From a practical perspective, upper bound estimates of the percentage of matching profiles obtained from a database search containing an incorrect sequence and those of a database containing the corrected sequence are not substantially different. Community wide access and review has enabled identification of errors in the SWGDAM data set and will continue to do so. The result of public accessibility is that the quality of the SWGDAM forensic dataset is always improving. PMID:15568698

  4. High-throughput STR analysis for DNA database using direct PCR.

    PubMed

    Sim, Jeong Eun; Park, Su Jeong; Lee, Han Chul; Kim, Se-Yong; Kim, Jong Yeol; Lee, Seung Hwan

    2013-07-01

    Since the Korean criminal DNA database was launched in 2010, we have focused on establishing an automated DNA database profiling system that analyzes short tandem repeat loci in a high-throughput and cost-effective manner. We established a DNA database profiling system without DNA purification using a direct PCR buffer system. The quality of direct PCR procedures was compared with that of the conventional PCR system under their respective optimized conditions. The results revealed not only perfect concordance but also an excellent PCR success rate, good electropherogram quality, and an optimal intra/inter-loci peak height ratio. In particular, the proportion of samples requiring DNA extraction because of direct PCR failure could be minimized to <3%. In conclusion, the newly developed direct PCR system can be adopted for automated DNA database profiling systems to replace or supplement the conventional PCR system in a time- and cost-saving manner.

  5. Exploring public databases to characterize urban flood risks in Amsterdam

    NASA Astrophysics Data System (ADS)

    Gaitan, Santiago; ten Veldhuis, Marie-claire; van de Giesen, Nick

    2015-04-01

    Cities worldwide are challenged by increasing urban flood risks. Precise and realistic measures are required to decide upon investment to reduce their impacts. Obvious flooding factors affecting flood risk include sewer systems performance and urban topography. However, currently implemented sewer and topographic models do not provide realistic predictions of local flooding occurrence during heavy rain events. Assessing other factors such as spatially distributed rainfall and socioeconomic characteristics may help to explain probability and impacts of urban flooding. Several public databases were analyzed: complaints about flooding made by citizens; rainfall depths (15 min and 100 ha spatio-temporal resolution); grids describing number of inhabitants, income, and housing price (1 ha and 25 ha resolution); and building age. Data analysis was done using Python and GIS programming, and included spatial indexing of data, cluster analysis, and multivariate regression on the complaints. Complaints were used as a proxy to characterize flooding impacts. The cluster analysis, run for all the variables except the complaints, grouped part of the grid-cells of central Amsterdam into a highly differentiated group, covering 10% of the analyzed area and accounting for 25% of registered complaints. The configuration of the analyzed variables in central Amsterdam coincides with a high complaint count. The remaining complaints were evenly dispersed among the other groups. An adjusted R² of 0.38 in the multivariate regression suggests that explanatory power could improve if additional variables were considered. While rainfall intensity explained 4% of the incidence of complaints, population density and building age significantly explained around 20% each. Data mining of public databases proved to be a valuable tool to identify factors explaining variability in occurrence of urban pluvial flooding, though additional variables must be considered to fully explain flood risk variability.
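    For readers unfamiliar with the adjusted R-squared figure quoted in this abstract, the following sketch shows the standard definition: the ordinary coefficient of determination penalized for the number of predictors. The abstract reports neither the number of grid cells nor the number of predictors, so every number used here is hypothetical and serves only to show the arithmetic.

```python
def r_squared(observed, predicted):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n_obs, n_predictors):
    """Penalize R-squared for model size: 1 - (1 - R2)(n - 1)/(n - p - 1)."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1 - (1 - r2) * (n_obs - 1) / (n_obs - n_predictors - 1)


# Hypothetical complaint counts per grid cell and model predictions.
observed = [1, 2, 3, 4, 5]
predicted = [1.5, 1.5, 3.0, 4.5, 4.5]
r2 = r_squared(observed, predicted)            # 1 - 1.0 / 10.0
adj = adjusted_r_squared(r2, n_obs=5, n_predictors=2)
print(round(r2, 3), round(adj, 3))
```

    Because the penalty grows with the number of predictors, adding variables only raises the adjusted value when they explain enough extra variance.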

  7. Tracking the violent criminal offender through DNA typing profiles--a national database system concept.

    PubMed

    Baechtel, F S; Monson, K L; Forsen, G E; Budowle, B; Kearney, J J

    1991-01-01

    Implementation of standard methods for the conduct of restriction fragment length polymorphism analysis into the protocols of United States crime laboratories offers an unprecedented opportunity for the establishment of a national computer database system to enable interchange of DNA typing information. The FBI Laboratory, in concert with crime laboratory representatives, has taken the initiative in planning and implementing such a database system. The Combined DNA Index System (CODIS) will be composed of three sub-indices: a statistical database, which will contain frequencies of DNA fragment alleles in various population groups; an investigative database which will enable linkage of violent crimes through a common subject; and a convicted felon database that will serve to maintain DNA typing profiles for comparison to profiles developed from violent crimes where the suspect may be unknown. PMID:1678357

  8. Forensic DNA databases in Western Balkan region: retrospectives, perspectives, and initiatives

    PubMed Central

    Marjanović, Damir; Konjhodžić, Rijad; Butorac, Sara Sanela; Drobnič, Katja; Merkaš, Siniša; Lauc, Gordan; Primorac, Damir; Anđelinović, Šimun; Milosavljević, Mladen; Karan, Željko; Vidović, Stojko; Stojković, Oliver; Panić, Bojana; Vučetić Dragović, Anđelka; Kovačević, Sandra; Jakovski, Zlatko; Asplen, Chris; Primorac, Dragan

    2011-01-01

    The European Network of Forensic Science Institutes (ENFSI) recommended the establishment of forensic DNA databases and specific implementation and management legislations for all EU/ENFSI members. Therefore, forensic institutions from Bosnia and Herzegovina, Serbia, Montenegro, and Macedonia launched a wide set of activities to support these recommendations. To assess the current state, a regional expert team completed detailed screening and investigation of the existing forensic DNA data repositories and associated legislation in these countries. The scope also included relevant concurrent projects and a wide spectrum of different activities in relation to forensic DNA use. The state of forensic DNA analysis was also determined in neighboring Slovenia and Croatia, which already have functional national DNA databases. There is a need for a ‘regional supplement’ to the current documentation and standards pertaining to forensic application of DNA databases, which should include regional-specific preliminary aims and recommendations. PMID:21674821

  9. MitBASE: a comprehensive and integrated mitochondrial DNA database. The present status

    PubMed Central

    Attimonelli, M.; Altamura, N.; Benne, R.; Brennicke, A.; Cooper, J. M.; D’Elia, D.; Montalvo, A. de; Pinto, B. de; De Robertis, M.; Golik, P.; Knoop, V.; Lanave, C.; Lazowska, J.; Licciulli, F.; Malladi, B. S.; Memeo, F.; Monnerot, M.; Pasimeni, R.; Pilbout, S.; Schapira, A. H. V.; Sloof, P.; Saccone, C.

    2000-01-01

    MitBASE is an integrated and comprehensive database of mitochondrial DNA data which collects, under a single interface, databases for Plant, Vertebrate, Invertebrate, Human, Protist and Fungal mtDNA and a Pilot database on nuclear genes involved in mitochondrial biogenesis in Saccharomyces cerevisiae. MitBASE reports all available information from different organisms and from intraspecies variants and mutants. Data have been drawn from the primary databases and from the literature; value-added information has been structured, e.g., editing information on protist mtDNA genomes, pathological information for human mtDNA variants, etc. The different databases, some of which are structured using commercial packages (Microsoft Access, File Maker Pro) while others use a flat-file format, have been integrated under ORACLE. Ad hoc retrieval systems have been devised for some of the above listed databases, taking into account their peculiarities. The database is resident at the EBI and is available at the following site: http://www3.ebi.ac.uk/Research/Mitbase/mitbase.pl. The project is intended to serve both basic and applied research. The study of mitochondrial genetic diseases and mitochondrial DNA intraspecies diversity are key topics in several biotechnological fields. The database has been funded within the EU Biotechnology programme. PMID:10592207

  10. The Web-Based DNA Vaccine Database DNAVaxDB and Its Usage for Rational DNA Vaccine Design.

    PubMed

    Racz, Rebecca; He, Yongqun

    2016-01-01

    A DNA vaccine is a vaccine that uses a mammalian expression vector to express one or more protein antigens and is administered in vivo to induce an adaptive immune response. Since the 1990s, a significant amount of research has been performed on DNA vaccines and the mechanisms behind them. To meet the needs of the DNA vaccine research community, we created DNAVaxDB (http://www.violinet.org/dnavaxdb), the first Web-based database and analysis resource of experimentally verified DNA vaccines. All the data in DNAVaxDB, which include plasmids, antigens, vaccines, and sources, are manually curated and experimentally verified. This chapter goes over the details of the DNAVaxDB system and shows how the DNA vaccine database, combined with the Vaxign vaccine design tool, can be used for rational design of a DNA vaccine against a pathogen, such as Mycobacterium bovis.

  11. Databases, quality control and interpretation of DNA profiling in the Home Office Forensic Science Service.

    PubMed

    Gill, P; Evett, I W; Woodroffe, S; Lygo, J E; Millican, E; Webster, M

    1991-01-01

    The history of DNA profiling in the Home Office Forensic Science Service began with the introduction of multilocus probes into casework in 1986. The use of single-locus probes was introduced in 1990, supported by databases of three ethnic groups; interpretation is backed up using a Bayesian approach. Databases were compiled using an image analysis computing system. Quality control systems are described, detailing requirements before a sample can be included in the database.

  12. GBshape: a genome browser database for DNA shape annotations.

    PubMed

    Chiu, Tsu-Pei; Yang, Lin; Zhou, Tianyin; Main, Bradley J; Parker, Stephen C J; Nuzhdin, Sergey V; Tullius, Thomas D; Rohs, Remo

    2015-01-01

    Many regulatory mechanisms require a high degree of specificity in protein-DNA binding. Nucleotide sequence does not provide an answer to the question of why a protein binds only to a small subset of the many putative binding sites in the genome that share the same core motif. Whereas higher-order effects, such as chromatin accessibility, cooperativity and cofactors, have been described, DNA shape recently gained attention as another feature that fine-tunes the DNA binding specificities of some transcription factor families. Our Genome Browser for DNA shape annotations (GBshape; freely available at http://rohslab.cmb.usc.edu/GBshape/) provides minor groove width, propeller twist, roll, helix twist and hydroxyl radical cleavage predictions for the entire genomes of 94 organisms. Additional genomes can easily be added using the GBshape framework. GBshape can be used to visualize DNA shape annotations qualitatively in a genome browser track format, and to download quantitative values of DNA shape features as a function of genomic position at nucleotide resolution. As biological applications, we illustrate the periodicity of DNA shape features that are present in nucleosome-occupied sequences from human, fly and worm, and we demonstrate structural similarities between transcription start sites in the genomes of four Drosophila species.

  13. The Moroccan Genetic Disease Database (MGDD): a database for DNA variations related to inherited disorders and disease susceptibility

    PubMed Central

    Charoute, Hicham; Nahili, Halima; Abidi, Omar; Gabi, Khalid; Rouba, Hassan; Fakiri, Malika; Barakat, Abdelhamid

    2014-01-01

    National and ethnic mutation databases provide comprehensive information about genetic variations reported in a population or an ethnic group. In this paper, we present the Moroccan Genetic Disease Database (MGDD), a catalogue of genetic data related to diseases identified in the Moroccan population. We used the PubMed, Web of Science and Google Scholar databases to identify available articles published until April 2013. The database is designed and implemented on a three-tier model using the MySQL relational database and the PHP programming language. To date, the database contains 425 mutations and 208 polymorphisms found in 301 genes and 259 diseases. Most Mendelian diseases in the Moroccan population follow an autosomal recessive mode of inheritance (74.17%) and affect endocrine, nutritional and metabolic physiology. The MGDD database provides reference information for researchers, clinicians and health professionals through a user-friendly Web interface. Its content should be useful for improving research in human molecular genetics, disease diagnosis and the design of association studies. MGDD can be publicly accessed at http://mgdd.pasteur.ma. PMID:23860041

  14. TabSQL: a MySQL tool to facilitate mapping user data to public databases

    PubMed Central

    2010-01-01

    Background: With advances in high-throughput genomics and proteomics, it is challenging for biologists to deal with large data files and to map their data to annotations in public databases. Results: We developed TabSQL, a MySQL-based application tool, for viewing, filtering and querying data files with large numbers of rows. TabSQL provides functions for downloading and installing table files from public databases including the Gene Ontology database (GO), the Ensembl databases, and genome databases from the UCSC genome bioinformatics site. Any other database that provides tab-delimited flat files can also be imported. The downloaded gene annotation tables can be queried together with users' data in TabSQL using either a graphic interface or command line. Conclusions: TabSQL allows queries across the user's data and public databases without programming. It is a convenient tool for biologists to annotate and enrich their data. PMID:20573251
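    The kind of query this abstract describes, loading tab-delimited tables and joining a user's data to a public annotation table, can be sketched as below. This is not TabSQL itself: the sketch substitutes Python's built-in sqlite3 module for MySQL so that it runs without a server, and the table names, column names, and sample rows are all made up for illustration.

```python
import csv
import io
import sqlite3


def load_tsv(conn, table, text):
    """Create `table` from tab-delimited text whose first row holds column names.

    Table and column names are interpolated directly, so they must come
    from a trusted source (they do in this sketch).
    """
    rows = [r for r in csv.reader(io.StringIO(text), delimiter="\t") if r]
    header, data = rows[0], rows[1:]
    columns = ", ".join(f'"{name}" TEXT' for name in header)
    conn.execute(f'CREATE TABLE "{table}" ({columns})')
    marks = ", ".join("?" for _ in header)
    conn.executemany(f'INSERT INTO "{table}" VALUES ({marks})', data)


# Hypothetical user measurements and a hypothetical annotation extract.
USER_DATA = "gene_id\tfold_change\nG1\t2.5\nG2\t0.4\nG3\t1.1\n"
ANNOTATION = "gene_id\tgo_term\nG1\tDNA repair\nG3\ttranslation\n"

conn = sqlite3.connect(":memory:")
load_tsv(conn, "user_data", USER_DATA)
load_tsv(conn, "annotation", ANNOTATION)

# LEFT JOIN keeps user rows that have no annotation (go_term is NULL/None).
annotated = conn.execute(
    "SELECT u.gene_id, u.fold_change, a.go_term "
    "FROM user_data AS u "
    "LEFT JOIN annotation AS a ON a.gene_id = u.gene_id "
    "ORDER BY u.gene_id"
).fetchall()

for row in annotated:
    print(row)
```

    A left join is used so that genes missing from the annotation table stay visible in the result instead of being silently dropped.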

  15. Generation and analysis of a 29,745 unique Expressed Sequence Tags from the Pacific oyster (Crassostrea gigas) assembled into a publicly accessible database: the GigasDatabase

    PubMed Central

    2009-01-01

    Background: Although bivalves are among the most-studied marine organisms because of their ecological role and economic importance, very little information is available on the genome sequences of oyster species. This report documents three large-scale cDNA sequencing projects for the Pacific oyster Crassostrea gigas initiated to provide a large number of expressed sequence tags that were subsequently compiled in a publicly accessible database. This resource allowed for the identification of a large number of transcripts and provides valuable information for ongoing investigations of tissue-specific and stimulus-dependent gene expression patterns. These data are crucial for constructing comprehensive DNA microarrays, identifying single nucleotide polymorphisms and microsatellites in coding regions, and for identifying genes when the entire genome sequence of C. gigas becomes available. Description: In the present paper, we report the production of 40,845 high-quality ESTs that identify 29,745 unique transcribed sequences consisting of 7,940 contigs and 21,805 singletons. All of these new sequences, together with existing public sequence data, have been compiled into a publicly available website, http://public-contigbrowser.sigenae.org:9090/Crassostrea_gigas/index.html. Approximately 43% of the unique ESTs had significant matches against the SwissProt database and 27% were annotated using Gene Ontology terms. In addition, we identified a total of 208 in silico microsatellites from the ESTs, with 173 having sufficient flanking sequence for primer design. We also identified a total of 7,530 putative in silico single-nucleotide polymorphisms using existing and newly generated EST resources for the Pacific oyster. Conclusion: A publicly available database has been populated with 29,745 unique sequences for the Pacific oyster Crassostrea gigas. The database provides many tools to search cleaned and assembled ESTs. The user may input and submit several filters, such as

  16. Vertebrate MitBASE: a specialised database on vertebrate mitochondrial DNA sequences.

    PubMed

    Carone, A; Malladi, S B; Attimonelli, M; Saccone, C

    1999-01-01

    Vertebrate MitBASE is a specialized database where all the vertebrate mitochondrial DNA entries from primary databases are collected, revised and integrated with new information emerging from the literature. Variant sequences are also analyzed, aligned and linked to reference sequences. Data related to the same species and fragment can be viewed over the WWW. The database has a flexible interface and a retrieval system to help non-expert users and contains information not currently available in the primary databases. Vertebrate MitBASE is now available through the MitBASE home page at URL: http://www.ebi.ac.uk/htbin/Mitbase/mitbase.pl. This work is part of a larger project, MitBASE, which is a network of databases covering the full panorama of knowledge on mitochondrial DNA from protists to human sequences.

  17. DNA patenting: implications for public health research.

    PubMed Central

    Dutfield, Graham

    2006-01-01

    I weigh the arguments for and against the patenting of functional DNA sequences including genes, and find the objections to be compelling. Is an outright ban on DNA patenting the right policy response? Not necessarily. Governments may wish to consider options ranging from patent law reforms to the creation of new rights. There are alternative ways to protect DNA sequences that industry may choose if DNA patenting is restricted or banned. Some of these alternatives may be more harmful than patents. Such unintended consequences of patent bans mean that we should think hard before concluding that prohibition is the only response to legitimate concerns about the appropriateness of patents in the field of human genomics. PMID:16710549

  18. A brief history of the formation of DNA databases in forensic science within Europe.

    PubMed

    Martin, P D; Schmitter, H; Schneider, P M

    2001-06-15

    The introduction of DNA analysis to forensic science brought with it a number of choices for analysis, not all of which were compatible. As laboratories throughout Europe were eager to use the new technology, different systems became routine in different laboratories and, consequently, there was no basis for the exchange of results. A period of co-operation then started in which a nucleus of forensic scientists agreed on a uniform system. This collaboration spread to incorporate most of the established forensic science laboratories in Europe and continued through two major changes in the technology. At each step agreement was reached on which systems to use. From the beginning it was realised that DNA databases would provide the criminal justice systems with an efficient way of crime solving and consequently some local databases were created. It was not until the introduction of the amplification technology linked to the analysis of short tandem repeats that a sufficiently sensitive and robust system was available for the formation of efficient and effective DNA databases. Comprehensive legislation enacted in the UK in 1995 enabled forensic scientists to set up the first national DNA database, which would hold personal DNA profiles together with results obtained from crime scenes. Other countries quickly followed, but in some the legislation has severely restricted the amount and type of data which can be retained and, therefore, the effectiveness of the databases is limited. The widespread use of commercially produced multiplex kits has produced a situation in which nearly all European laboratories are using compatible systems and there is, therefore, the potential for the introduction of a pan-European DNA database. However, the exchange of results between countries is hampered by the various legislations which currently exist.

  19. 75 FR 76831 - Publicly Available Consumer Product Safety Information Database

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-12-09

    ... available database. On May 24, 2010, we published a notice of proposed rulemaking at 75 FR 29156, which set... stakeholder input and comment, all of which were discussed in the preamble to the proposed rule at 75 FR 29156... (75 FR 29156, May 24, 2010) pertaining to each section. In addition to comments on each of...

  20. [Brazilian Food Composition Database (TBCA-USP): Data compilation to serve the public good].

    PubMed

    Lopes, Tássia do Vale Cardoso; Cyrillo, Denise Cavallini; Giuntini, Eliana Bistriche; Lajolo, Franco Maria; Menezes, Elizabete Wenzel De

    2015-09-01

    The article shows the evolution of the Brazilian Food Composition Database (TBCA-USP) from its creation to its next update. The article characterizes the TBCA-USP database as a public good and highlights the importance of food composition data compilation as a highly cost-effective activity. It reports the social relevance of information about food composition and the importance of this database in the national context. It also indicates extension and update strategies for the TBCA-USP.

  1. An improved FORTRAN 77 recombinant DNA database management system with graphic extensions in GKS.

    PubMed

    Van Rompuy, L L; Lesage, C; Vanderhaegen, M E; Telemans, M P; Zabeau, M F

    1986-12-01

    We have improved an existing clone database management system written in FORTRAN 77 and adapted it to our software environment. Improvements are that the database can be interrogated for any type of information, not just keywords. Also, recombinant DNA constructions can be represented in a simplified 'shorthand', whereafter a program assembles the full nucleotide sequence from the contributing fragments, which may be obtained from nucleotide sequence databases. Another improvement is the replacement of the database manager by programs, running in batch to maintain the databank and verify its consistency automatically. Finally, graphic extensions are written in Graphical Kernel System, to draw linear and circular restriction maps of recombinants. Besides restriction sites, recombinant features can be presented from the feature lines of recombinant database entries, or from the feature tables of nucleotide databases. The clone database management system is fully integrated into the sequence analysis software package from the Pasteur Institute, Paris, and is made accessible through the same menu. As a result, recombinant DNA sequences can directly be analysed by the sequence analysis programs.

  2. Big bad data: law, public health, and biomedical databases.

    PubMed

    Hoffman, Sharona; Podgurski, Andy

    2013-03-01

    The accelerating adoption of electronic health record (EHR) systems will have far-reaching implications for public health research and surveillance, which in turn could lead to changes in public policy, statutes, and regulations. The public health benefits of EHR use can be significant. However, researchers and analysts who rely on EHR data must proceed with caution and understand the potential limitations of EHRs. Because of clinicians' workloads, poor user-interface design, and other factors, EHR data can be erroneous, miscoded, fragmented, and incomplete. In addition, public health findings can be tainted by the problems of selection bias, confounding bias, and measurement bias. These flaws may become all the more troubling and important in an era of electronic "big data," in which a massive amount of information is processed automatically, without human checks. Thus, we conclude the paper by outlining several regulatory and other interventions to address data analysis difficulties that could result in invalid conclusions and unsound public health policies.

  3. DSSTOX WEBSITE LAUNCH: IMPROVING PUBLIC ACCESS TO DATABASES FOR BUILDING STRUCTURE-TOXICITY PREDICTION MODELS

    EPA Science Inventory

    DSSTox Website Launch: Improving Public Access to Databases for Building Structure-Toxicity Prediction Models
    Ann M. Richard
    US Environmental Protection Agency, Research Triangle Park, NC, USA

    Distributed: Decentralized set of standardized, field-delimited databases,...

  4. End-Users/Public Access. Reprints from the Best of "ONLINE" [and]"DATABASE."

    ERIC Educational Resources Information Center

    Online, Inc., Weston, CT.

    Reprints of 20 articles pertaining to the topics of end-users and public access appear in this volume, which is one in a series of volumes of reprints from "ONLINE" and "DATABASE" magazines. Edited for information professionals who use electronically distributed databases, these articles address such topics as: (1) managing a compact disc…

  5. Characterizing the genetic structure of a forensic DNA database using a latent variable approach.

    PubMed

    Kruijver, Maarten

    2016-07-01

    Several problems in forensic genetics require a representative model of a forensic DNA database. Obtaining an accurate representation of the offender database can be difficult, since databases typically contain groups of persons with unregistered ethnic origins in unknown proportions. We propose to estimate the allele frequencies of the subpopulations comprising the offender database and their proportions from the database itself using a latent variable approach. We present a model for which parameters can be estimated using the expectation maximization (EM) algorithm. This approach does not rely on relatively small and possibly unrepresentative population surveys, but is driven by the actual genetic composition of the database only. We fit the model to a snapshot of the Dutch offender database (2014), which contains close to 180,000 profiles, and find that three subpopulations suffice to describe a large fraction of the heterogeneity in the database. We demonstrate the utility and reliability of the approach with three applications. First, we use the model to predict the number of false leads obtained in database searches. We assess how well the model predicts the number of false leads obtained in mock searches in the Dutch offender database, both for the case of familial searching for first degree relatives of a donor and searching for contributors to three-person mixtures. Second, we study the degree of partial matching between all pairs of profiles in the Dutch database and compare this to what is predicted using the latent variable approach. Third, we use the model to provide evidence to support that the Dutch practice of estimating match probabilities using the Balding-Nichols formula with a native Dutch reference database and θ=0.03 is conservative.
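    The abstract above refers to the Balding-Nichols formula with θ=0.03 but does not spell it out. The following is a minimal sketch of the commonly cited single-locus form of that match probability, assuming the standard homozygote and heterozygote expressions; the function name `bn_match_probability` and the allele frequencies in the usage note are hypothetical and are not taken from the record.

```python
from typing import Optional


def bn_match_probability(p_i: float, p_j: Optional[float] = None,
                         theta: float = 0.03) -> float:
    """Single-locus match probability with a theta (coancestry) correction.

    Assumes the commonly cited Balding-Nichols expressions:
      homozygote  A_i A_i: [2t + (1-t)p_i][3t + (1-t)p_i] / [(1+t)(1+2t)]
      heterozygote A_i A_j: 2[t + (1-t)p_i][t + (1-t)p_j] / [(1+t)(1+2t)]
    Pass only p_i for a homozygous genotype.
    """
    denom = (1 + theta) * (1 + 2 * theta)
    if p_j is None:
        # Homozygous genotype
        return ((2 * theta + (1 - theta) * p_i)
                * (3 * theta + (1 - theta) * p_i)) / denom
    # Heterozygous genotype
    return 2 * ((theta + (1 - theta) * p_i)
                * (theta + (1 - theta) * p_j)) / denom


def profile_match_probability(genotypes, theta: float = 0.03) -> float:
    """Multiply per-locus probabilities, assuming independence between loci."""
    result = 1.0
    for p_i, p_j in genotypes:
        result *= bn_match_probability(p_i, p_j, theta)
    return result
```

    With θ=0 the expressions reduce to the Hardy-Weinberg values p² and 2·p_i·p_j. For example, `profile_match_probability([(0.1, None), (0.2, 0.3)])` multiplies one homozygous and one heterozygous locus using made-up frequencies; a larger θ yields a larger probability for rare alleles, which is the sense in which the abstract describes θ=0.03 as conservative.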

  6. 76 FR 53912 - FDA's Public Database of Products With Orphan-Drug Designation: Replacing Non-Informative Code...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-08-30

    ... HUMAN SERVICES Food and Drug Administration FDA's Public Database of Products With Orphan-Drug... its public database of products that have received orphan-drug designation. The Orphan Drug Act... received orphan designation were published on our public database with non-informative code names....

  7. An Internet-Accessible DNA Sequence Database for Identifying Fusaria from Human and Animal Infections

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Because less than one-third of clinically relevant fusaria can be accurately identified to species level using phenotypic data (i.e., morphological species recognition), we constructed a three-locus DNA sequence database to facilitate molecular identification of the 69 Fusarium species associated wi...

  8. Bioethical Biobanks: Three Concerns in Designing and Using Law Enforcement DNA Identification Databases

    SciTech Connect

    D.H. Kaye

    2006-10-19

    Federal and state law enforcement authorities have amassed large collections of DNA samples and the identifying profiles derived from them. These databases help to identify the guilty and to exonerate the innocent, but as the databanks grow, so do fears about civil liberties. The research reported here discusses three legal and social policy issues that have been raised in regard to these biobanks—the choice of loci to type for identifying individuals, the indefinite retention of DNA samples, and the use of the DNA samples or the identifying profiles for research purposes. It also considers the possible value of the databases for research into the genetics of human behavior and the ethics of using them for this purpose. It rejects the broad claim that such research is inherently unethical but proposes procedures for ensuring that the value of the proposed research justifies any psychosocial or other risks to the subjects of the research.

  9. Prisoners' expectations of the national forensic DNA database: surveillance and reconfiguration of individual rights.

    PubMed

    Machado, Helena; Santos, Filipe; Silva, Susana

    2011-07-15

    In this paper we aim to discuss what Portuguese prisoners know and how they feel about surveillance mechanisms related to the inclusion and deletion of the DNA profiles of convicted criminals in the national forensic database. Through a set of interviews with individuals currently imprisoned, we focus on the ways this group perceives forensic DNA technologies. While the institutional and political discourses maintain that the restricted use and application of DNA profiles within the national forensic database protects individuals' rights, the prisoners claim that police misuse of such technologies potentially makes it difficult to escape from surveillance and acts as a means of reinforcing the stigma of delinquency. The prisoners also argue that additional intensive and extensive use of surveillance devices might be more protective of their own individual rights and might possibly increase potential for exoneration.

  10. 75 FR 29155 - Publicly Available Consumer Product Safety Information Database

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-05-24

    ... information from the public through its Internet Web site through forms reporting on product-related injuries.... The proposal would describe four methods (internet, telephone, electronic mail, and paper) for.... 1102.10(b)(1) would explain that submitters using the Internet will use an electronic form...

  11. FastStats: a public health statistics database.

    PubMed

    Vardell, Emily

    2014-01-01

    FastStats is a site that provides quick and easy access to public health statistics. The freely available website is maintained by the Centers for Disease Control and Prevention's National Center for Health Statistics. Users can browse alphabetically by topic and state/territory or search across the National Center for Health Statistics site. A description of the browsing capabilities and sample searches are presented.

  13. The mitochondrial DNA makeup of Romanians: A forensic mtDNA control region database and phylogenetic characterization.

    PubMed

    Turchi, Chiara; Stanciu, Florin; Paselli, Giorgia; Buscemi, Loredana; Parson, Walther; Tagliabracci, Adriano

    2016-09-01

    To evaluate the pattern of the Romanian population from a mitochondrial perspective and to establish an appropriate mtDNA forensic database, we generated a high-quality mtDNA control region dataset from 407 Romanian subjects belonging to four major historical regions: Moldavia, Transylvania, Wallachia and Dobruja. The entire control region (CR) was analyzed by Sanger-type sequencing assays and the resulting 306 different haplotypes were classified into haplogroups according to the most updated mtDNA phylogeny. The Romanian gene pool is mainly composed of West Eurasian lineages H (31.7%), U (12.8%), J (10.8%), R (10.1%), T (9.1%), N (8.1%), HV (5.4%), K (3.7%), HV0 (4.2%), with exceptions of East Asian haplogroup M (3.4%) and African haplogroup L (0.7%). The pattern of mtDNA variation observed in this study indicates that the mitochondrial DNA pool is geographically homogeneous across Romania and that the haplogroup composition reveals signals of admixture of populations of different origin. The PCA scatterplot supported this scenario, with Romania located in the southeastern European area, close to Bulgaria and Hungary, and as a borderland with respect to the east Mediterranean and other eastern European countries. High haplotype diversity (0.993) and nucleotide diversity indices (0.00838±0.00426), together with low random match probability (0.0087), suggest the usefulness of this control region dataset as a forensic database in routine forensic mtDNA analysis and in the investigation of maternal genetic lineages in the Romanian population. PMID:27414754
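    The abstract reports a haplotype diversity and a random match probability without stating how they were computed. The sketch below assumes the definitions commonly used in forensic mtDNA studies: random match probability as the sum of squared haplotype frequencies, and haplotype diversity as Nei's unbiased estimator n/(n-1)·(1 - Σp²). The function name and the toy sample are hypothetical; the Romanian dataset itself is not reproduced here.

```python
from collections import Counter
from typing import Hashable, Iterable, Tuple


def haplotype_statistics(haplotypes: Iterable[Hashable]) -> Tuple[float, float]:
    """Return (random_match_probability, haplotype_diversity) for a sample.

    Assumed definitions (not confirmed by the abstract):
      RMP       = sum of squared haplotype frequencies
      diversity = n / (n - 1) * (1 - RMP)   (Nei's unbiased estimator)
    """
    sample = list(haplotypes)
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    counts = Counter(sample)
    # Sum of squared relative frequencies over distinct haplotypes
    sum_sq = sum((c / n) ** 2 for c in counts.values())
    diversity = n / (n - 1) * (1 - sum_sq)
    return sum_sq, diversity


# Toy sample with three distinct haplotypes (labels are placeholders)
rmp, diversity = haplotype_statistics(["h1", "h1", "h2", "h3"])
print(f"RMP={rmp:.4f} diversity={diversity:.4f}")
```

    In a real analysis each list element would be a full control-region haplotype string, so that identical strings are counted as the same haplotype.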

  14. Development of a 20-locus fluorescent multiplex system as a valuable tool for national DNA database.

    PubMed

    Jiang, Xianhua; Guo, Fei; Jia, Fei; Jin, Ping; Sun, Zhu

    2013-02-01

    The multiplex system allows the detection of 19 autosomal short tandem repeat (STR) loci [including all Combined DNA Index System (CODIS) STR loci as well as D2S1338, D6S1043, D12S391, D19S433, Penta D and Penta E] plus the sex-determining locus Amelogenin in a single reaction, comprising all STR loci in various commercial kits used in the China national DNA database (NDNAD). Primers are designed so that the amplicons range from 90 base pairs (bp) to 450 bp within a five-dye fluorescent design, with the fifth dye reserved for the internal size standard. With 30 cycles, 125 pg to 2 ng of DNA template gave optimal profiling results, while robust profiles could also be achieved by adjusting the cycle number for DNA template beyond that optimal input range. Mixture studies showed that 83% and 87% of minor alleles were detected at 9:1 and 1:9 ratios, respectively. When 4 ng of degraded DNA was digested by 2-min DNase and 1 ng undegraded DNA was added to 400 μM haematin, complete profiles were still observed. Polymerase chain reaction (PCR)-based procedures were examined and optimized, including the concentrations of primer set, magnesium and Taq polymerase as well as volume, cycle number and annealing temperature. In addition, the system has been validated with 3000 bloodstain samples and 35 common case samples in line with the Chinese National Standards and Scientific Working Group on DNA Analysis Methods (SWGDAM) guidelines. The total probability of identity (TPI) can reach 8×10⁻²⁴, so the DNA database can be scaled to 10 million DNA profiles or more, because the expected number of matches (4×10⁻¹⁰) is far below one person and is negligible. Further, our system also demonstrates good performance in case samples and will be an ideal tool for forensic DNA typing and databasing.
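    The abstract does not show how the expected number of matches follows from the total probability of identity. One reading, which is an assumption here, is that every pair of profiles in a database of N entries is compared, giving N(N-1)/2 comparisons, each with match probability equal to the TPI. The sketch below works through that arithmetic; the function name is hypothetical.

```python
def expected_adventitious_matches(n_profiles: int,
                                  probability_of_identity: float) -> float:
    """Expected number of chance matches among all pairs of profiles.

    Assumes independent pairwise comparisons, each matching by chance
    with probability `probability_of_identity`.
    """
    pairs = n_profiles * (n_profiles - 1) // 2  # number of unordered pairs
    return pairs * probability_of_identity


# Values quoted in the abstract: 10 million profiles, TPI = 8e-24
expected = expected_adventitious_matches(10_000_000, 8e-24)
print(f"{expected:.2e}")  # roughly 4e-10
```

    Under this assumption the result is about 4×10⁻¹⁰, which agrees with the figure quoted in the abstract.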

  15. An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system

    SciTech Connect

    AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide

    2015-11-19

    Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. Lastly, this database will facilitate the analysis of protein-DNA interactions and the

  16. An affinity-structure database of helix-turn-helix: DNA complexes with a universal coordinate system

    DOE PAGES

    AlQuraishi, Mohammed; Tang, Shengdong; Xia, Xide

    2015-11-19

    Molecular interactions between proteins and DNA molecules underlie many cellular processes, including transcriptional regulation, chromosome replication, and nucleosome positioning. Computational analyses of protein-DNA interactions rely on experimental data characterizing known protein-DNA interactions structurally and biochemically. While many databases exist that contain either structural or biochemical data, few integrate these two data sources in a unified fashion. Such integration is becoming increasingly critical with the rapid growth of structural and biochemical data, and the emergence of algorithms that rely on the synthesis of multiple data types to derive computational models of molecular interactions. We have developed an integrated affinity-structure database in which the experimental and quantitative DNA binding affinities of helix-turn-helix proteins are mapped onto the crystal structures of the corresponding protein-DNA complexes. This database provides access to: (i) protein-DNA structures, (ii) quantitative summaries of protein-DNA binding affinities using position weight matrices, and (iii) raw experimental data of protein-DNA binding instances. Critically, this database establishes a correspondence between experimental structural data and quantitative binding affinity data at the single basepair level. Furthermore, we present a novel alignment algorithm that structurally aligns the protein-DNA complexes in the database and creates a unified residue-level coordinate system for comparing the physico-chemical environments at the interface between complexes. Using this unified coordinate system, we compute the statistics of atomic interactions at the protein-DNA interface of helix-turn-helix proteins. We provide an interactive website for visualization, querying, and analyzing this database, and a downloadable version to facilitate programmatic analysis. Lastly, this database will facilitate the analysis of protein-DNA interactions and the

  17. DNA banking and DNA databanking: Legal, ethical, and public policy issues

    SciTech Connect

    Reilly, P.R.; McEwen, J.E.; Lawyer, J.D.; Small, D.

    1997-04-30

    The purpose of this research was to provide support to enable the authors to: (1) perform legal and empirical research and critically analyze DNA banking and DNA databanking as those activities are conducted by state forensic laboratories, the military, academic researchers, and commercial enterprises; and (2) develop a broadcast quality educational videotape for viewing by the general public about DNA technology and the privacy and related issues that it raises. The grant thus had both a research and analysis component and a public education component. This report outlines the work completed under the project.

  18. The annotation and the usage of scientific databases could be improved with public issue tracker software

    PubMed Central

    Dall'Olio, Giovanni Marco; Bertranpetit, Jaume; Laayouni, Hafid

    2010-01-01

    Since the publication of their longtime predecessor The Atlas of Protein Sequences and Structures in 1965 by Margaret Dayhoff, scientific databases have become a key factor in the organization of modern science. All the information and knowledge described in the novel scientific literature is translated into entries in many different scientific databases, making it possible to obtain very accurate information on a biological entity like genes or proteins without having to manually review the literature on it. However, even for the databases with the finest annotation procedures, errors or unclear parts sometimes appear in the publicly released version and influence the research of unaware scientists using them. The researcher who finds an error in a database is often left in an uncertain state, and often abandons the effort of reporting it because of a lack of a standard procedure to do so. In the present work, we propose that the simple adoption of a public error tracker application, as in many open software projects, could improve the quality of the annotations in many databases and encourage feedback from the scientific community on the data annotated publicly. In order to illustrate the situation, we describe a series of errors that we found and helped solve on the genes of a very well-known pathway in various biomedically relevant databases. We would like to show that, even if a majority of the most important scientific databases have procedures for reporting errors, these are usually not publicly visible, making the process of reporting errors time consuming and not useful. Also, the effort made by the user who reports the error often goes unacknowledged, putting them in a discouraging position. PMID:21186182

  19. Internet-accessible DNA sequence database for identifying fusaria from human and animal infections.

    PubMed

    O'Donnell, Kerry; Sutton, Deanna A; Rinaldi, Michael G; Sarver, Brice A J; Balajee, S Arunmozhi; Schroers, Hans-Josef; Summerbell, Richard C; Robert, Vincent A R G; Crous, Pedro W; Zhang, Ning; Aoki, Takayuki; Jung, Kyongyong; Park, Jongsun; Lee, Yong-Hwan; Kang, Seogchan; Park, Bongsoo; Geiser, David M

    2010-10-01

    Because less than one-third of clinically relevant fusaria can be accurately identified to species level using phenotypic data (i.e., morphological species recognition), we constructed a three-locus DNA sequence database to facilitate molecular identification of the 69 Fusarium species associated with human or animal mycoses encountered in clinical microbiology laboratories. The database comprises partial sequences from three nuclear genes: translation elongation factor 1α (EF-1α), the largest subunit of RNA polymerase (RPB1), and the second largest subunit of RNA polymerase (RPB2). These three gene fragments can be amplified by PCR and sequenced using primers that are conserved across the phylogenetic breadth of Fusarium. Phylogenetic analyses of the combined data set reveal that, with the exception of two monotypic lineages, all clinically relevant fusaria are nested in one of eight variously sized and strongly supported species complexes. The monophyletic lineages have been named informally to facilitate communication of an isolate's clade membership and genetic diversity. To identify isolates to the species included within the database, partial DNA sequence data from one or more of the three genes can be used as a BLAST query against the database which is Web accessible at FUSARIUM-ID (http://isolate.fusariumdb.org) and the Centraalbureau voor Schimmelcultures (CBS-KNAW) Fungal Biodiversity Center (http://www.cbs.knaw.nl/fusarium). Alternatively, isolates can be identified via phylogenetic analysis by adding sequences of unknowns to the DNA sequence alignment, which can be downloaded from the two aforementioned websites. The utility of this database should increase significantly as members of the clinical microbiology community deposit in internationally accessible culture collections (e.g., CBS-KNAW or the Fusarium Research Center) cultures of novel mycosis-associated fusaria, along with associated, corrected sequence chromatograms and data, so that the

  1. Average probability that a "cold hit" in a DNA database search results in an erroneous attribution.

    PubMed

    Song, Yun S; Patil, Anand; Murphy, Erin E; Slatkin, Montgomery

    2009-01-01

    We consider a hypothetical series of cases in which the DNA profile of a crime-scene sample is found to match a known profile in a DNA database (i.e., a "cold hit"), resulting in the identification of a suspect based only on genetic evidence. We show that the average probability that there is another person in the population whose profile matches the crime-scene sample but who is not in the database is approximately 2(N - d)p(A), where N is the number of individuals in the population, d is the number of profiles in the database, and p(A) is the average match probability (AMP) for the population. The AMP is estimated by computing the average of the probabilities that two individuals in the population have the same profile. We show further that if a priori each individual in the population is equally likely to have left the crime-scene sample, then the average probability that the database search attributes the crime-scene sample to a wrong person is (N - d)p(A).
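    The two approximations stated in the abstract, 2(N - d)p(A) and (N - d)p(A), can be evaluated directly. The sketch below simply codes those two expressions; the function names and the numerical inputs (population size, database size, average match probability) are hypothetical and chosen only for illustration. As approximations to probabilities they are meaningful only when the result is small.

```python
def prob_unsampled_match(population: int, database: int, amp: float) -> float:
    """Approximate average probability that someone outside the database
    also matches the crime-scene profile: 2 (N - d) p(A)."""
    return 2 * (population - database) * amp


def prob_wrong_attribution(population: int, database: int, amp: float) -> float:
    """Approximate average probability that the database search attributes
    the sample to the wrong person, under a uniform prior: (N - d) p(A)."""
    return (population - database) * amp


# Hypothetical inputs: N = 1,000,000 people, d = 100,000 profiles, p(A) = 1e-9
N, d, p_A = 1_000_000, 100_000, 1e-9
print(prob_unsampled_match(N, d, p_A))    # about 1.8e-3
print(prob_wrong_attribution(N, d, p_A))  # about 9e-4
```

    Both quantities shrink as the database covers more of the population (d approaches N) or as the average match probability falls.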

  2. "Would you accept having your DNA profile inserted in the National Forensic DNA database? Why?" Results of a questionnaire applied in Portugal.

    PubMed

    Machado, Helena; Silva, Susana

    2014-01-01

    The creation and expansion of forensic DNA databases might involve potential threats to the protection of a range of human rights. At the same time, such databases have social benefits. Based on data collected through an online questionnaire applied to 628 individuals in Portugal, this paper aims to analyze the citizens' willingness to voluntarily donate a sample for profiling and inclusion in the National Forensic DNA Database and the views underpinning such a decision. Nearly one-quarter of the respondents would indicate 'no', and this negative response increased significantly with age and education. The overriding willingness to accept the inclusion of the individual genetic profile indicates an acknowledgement of the investigative potential of forensic DNA technologies and a relegation of civil liberties and human rights to the background, owing to the perceived benefits of protecting both society and the individual from crime. This rationale is mostly expressed by the idea that all citizens should contribute to the expansion of the National Forensic DNA Database for reasons that range from the more abstract assumption that donating a sample for profiling would be helpful in fighting crime to the more concrete suggestion that everyone (criminals and non-criminals) should be in the database. The concerns with the risks of accepting the donation of a sample for genetic profiling and inclusion in the National Forensic DNA Database are mostly related to lack of control and insufficient or unclear regulations concerning safeguarding individuals' data and supervising the access and uses of genetic data. By providing an empirically-grounded understanding of the attitudes regarding willingness to voluntarily donate a sample for profiling and inclusion in a National Forensic DNA Database, this study also considers the citizens' perceived benefits and risks of operating forensic DNA databases. These collective views might be useful for the formation of international common

  3. Inspecting close maternal relatedness: Towards better mtDNA population samples in forensic databases

    PubMed Central

    Bodner, Martin; Irwin, Jodi A.; Coble, Michael D.; Parson, Walther

    2011-01-01

    Reliable data are crucial for all research fields applying mitochondrial DNA (mtDNA) as a genetic marker. Quality control measures have been introduced to ensure the highest standards in sequence data generation, validation and a posteriori inspection. A phylogenetic alignment strategy has been widely accepted as a prerequisite for data comparability and database searches, for forensic applications, for reconstructions of human migrations and for correct interpretation of mtDNA mutations in medical genetics. There is continuing effort to enhance the number of worldwide population samples in order to contribute to a better understanding of human mtDNA variation. This has often led to the analysis of convenience samples collected for other purposes, which might not meet the quality requirement of random sampling for mtDNA data sets. Here, we introduce an additional quality control means that deals with one aspect of this limitation: by combining autosomal short tandem repeat (STR) markers with mtDNA information, it helps to avoid the bias introduced by related individuals included in the same (small) sample. By STR analysis of individuals sharing their mitochondrial haplotype, pedigree construction and subsequent software-assisted calculation of likelihood ratios based on the allele frequencies found in the population, closely maternally related individuals can be identified and excluded. We also discuss scenarios that allow related individuals in the same set. An ideal population sample would be representative of its population: this new approach represents another contribution towards this goal. PMID:21067986

  4. Inspecting close maternal relatedness: Towards better mtDNA population samples in forensic databases.

    PubMed

    Bodner, Martin; Irwin, Jodi A; Coble, Michael D; Parson, Walther

    2011-03-01

    Reliable data are crucial for all research fields applying mitochondrial DNA (mtDNA) as a genetic marker. Quality control measures have been introduced to ensure the highest standards in sequence data generation, validation and a posteriori inspection. A phylogenetic alignment strategy has been widely accepted as a prerequisite for data comparability and database searches, for forensic applications, for reconstructions of human migrations and for correct interpretation of mtDNA mutations in medical genetics. There is continuing effort to enhance the number of worldwide population samples in order to contribute to a better understanding of human mtDNA variation. This has often led to the analysis of convenience samples collected for other purposes, which might not meet the quality requirement of random sampling for mtDNA data sets. Here, we introduce an additional quality control means that deals with one aspect of this limitation: by combining autosomal short tandem repeat (STR) markers with mtDNA information, it helps to avoid the bias introduced by related individuals included in the same (small) sample. By STR analysis of individuals sharing their mitochondrial haplotype, pedigree construction and subsequent software-assisted calculation of likelihood ratios based on the allele frequencies found in the population, closely maternally related individuals can be identified and excluded. We also discuss scenarios that allow related individuals in the same set. An ideal population sample would be representative of its population: this new approach represents another contribution towards this goal. PMID:21067986

  5. PUBLIC HEALTH AND EPIDEMIOLOGICAL DATABASES FOR THE ENHANCEMENT OF MEDICAL EDUCATION

    PubMed Central

    Jamal, Qazi Mohammad Sajid; Siddiqui, Mughees Uddin; Alzohairy, Mohammad Abdulrahman; Al Karaawi, Mohammed Abdullah

    2015-01-01

    The collaboration of public health education and information technology has made patient care safer and more reliable than before. Nurses and doctors use handheld computers to record a patient's medical history and check that they are administering the correct treatment. Public Health Informatics (PHI) is the intersecting point of technology and public health. Therefore, the inclusion of online medical and epidemiology databases in the course curriculum of budding medical professionals and postgraduate students would be beneficial in enhancing the quality of health care, extensive epidemiological research, health education, health policies, health planning and consumer satisfaction. The purpose of this article is to discuss and introduce various databases that hold large amounts of information and could be used to enhance public health education. PMID:26392847

  6. Molecular scaffold analysis of natural products databases in the public domain.

    PubMed

    Yongye, Austin B; Waddell, Jacob; Medina-Franco, José L

    2012-11-01

    Natural products represent important sources of bioactive compounds in drug discovery efforts. In this work, we compiled five natural products databases available in the public domain and performed a comprehensive chemoinformatic analysis focused on the content and diversity of the scaffolds with an overview of the diversity based on molecular fingerprints. The natural products databases were compared with each other and with a set of molecules obtained from in-house combinatorial libraries, and with a general screening commercial library. It was found that publicly available natural products databases have different scaffold diversity. In contrast to the common concept that larger libraries have the largest scaffold diversity, the largest natural products collection analyzed in this work was not the most diverse. The general screening library showed, overall, the highest scaffold diversity. However, considering the most frequent scaffolds, the general reference library was the least diverse. In general, natural products databases in the public domain showed low molecule overlap. In addition to benzene and acyclic compounds, flavones, coumarins, and flavanones were identified as the most frequent molecular scaffolds across the different natural products collections. The results of this work have direct implications in the computational and experimental screening of natural product databases for drug discovery.

  7. STANDARDIZATION AND STRUCTURAL ANNOTATION OF PUBLIC TOXICITY DATABASES: IMPROVING SAR CAPABILITIES AND LINKAGE TO 'OMICS DATA

    EPA Science Inventory

    Standardization and structural annotation of public toxicity databases: Improving SAR capabilities and linkage to 'omics data
    Ann M. Richard1, ClarLynda Williams1, Jamie Burch2
    1Nat Health & Environ Res Lab, US EPA, RTP, NC 27711; 2EPA/NC Central Univ Student COOP Trainee<...

  8. HEDS - EPA DATABASE SYSTEM FOR PUBLIC ACCESS TO HUMAN EXPOSURE DATA

    EPA Science Inventory

    Human Exposure Database System (HEDS) is an Internet-based system developed to provide public access to human-exposure-related data from studies conducted by EPA's National Exposure Research Laboratory (NERL). HEDS was designed to work with the EPA Office of Research and Devel...

  9. Governing Software: Networks, Databases and Algorithmic Power in the Digital Governance of Public Education

    ERIC Educational Resources Information Center

    Williamson, Ben

    2015-01-01

    This article examines the emergence of "digital governance" in public education in England. Drawing on and combining concepts from software studies, policy and political studies, it identifies some specific approaches to digital governance facilitated by network-based communications and database-driven information processing software…

  10. Feline Non-repetitive Mitochondrial DNA Control Region Database for Forensic Evidence

    PubMed Central

    Grahn, R. A.; Kurushima, J. D.; Billings, N. C.; Grahn, J.C.; Halverson, J. L.; Hammer, E.; Ho, C.K.; Kun, T. J.; Levy, J.K.; Lipinski, M. J.; Mwenda, J.M.; Ozpinar, H.; Schuster, R.K; Shoorijeh, S.J.; Tarditi, C. R.; Waly, N.E.; Wictum, E. J.; Lyons, L. A.

    2010-01-01

    The domestic cat is one of the most popular pets throughout the world. A by-product of owning, interacting with, or being in a household with a cat is the transfer of shed fur to clothing or personal objects. As trace evidence, transferred cat fur is a relatively untapped resource for forensic scientists. Both phenotypic and genotypic characteristics can be obtained from cat fur, but databases exist for neither aspect. Because cats incessantly groom, cat fur may have nucleated cells, not only in the hair bulb, but also as epithelial cells on the hair shaft deposited during the grooming process, thereby generally providing material for DNA profiling. To effectively exploit cat hair as a resource, representative databases must be established. This study evaluates 402 bp of the mtDNA control region (CR) from 1,394 cats, including cats from 25 distinct worldwide populations and 26 breeds. Eighty-three percent of the cats are represented by 12 major mitotypes. An additional 8.0% are clearly derived from the major mitotypes. Unique sequences were found in 7.5% of the cats. The overall genetic diversity for this data set was 0.8813 ± 0.0046 with a random match probability of 11.8%. This region of the cat mtDNA has discriminatory power suitable for forensic application worldwide. PMID:20457082
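
    The two summary statistics reported in this abstract are closely related, and a small sketch may make the link concrete. The code below computes a random match probability as the sum of squared mitotype frequencies and a genetic (haplotype) diversity using Nei's sample-size-corrected estimator. The choice of estimator is an assumption, since the abstract does not say which one was used, and the mitotype labels and counts are invented.

```python
from collections import Counter


def random_match_probability(haplotypes):
    """Probability that two draws (with replacement) from the observed
    frequency distribution carry the same haplotype: sum of p_i squared."""
    n = len(haplotypes)
    if n == 0:
        raise ValueError("need at least one haplotype")
    counts = Counter(haplotypes)
    return sum((c / n) ** 2 for c in counts.values())


def haplotype_diversity(haplotypes):
    """Nei's unbiased estimator: n / (n - 1) * (1 - sum of p_i squared)."""
    n = len(haplotypes)
    if n < 2:
        raise ValueError("need at least two haplotypes")
    return n / (n - 1) * (1 - random_match_probability(haplotypes))


if __name__ == "__main__":
    # Hypothetical sample: 5 cats with mitotype A, 3 with B, 2 with C.
    sample = ["A"] * 5 + ["B"] * 3 + ["C"] * 2
    print("random match probability:", round(random_match_probability(sample), 4))
    print("haplotype diversity:", round(haplotype_diversity(sample), 4))
```

    For large samples the correction factor n / (n - 1) is close to 1, so diversity is approximately one minus the random match probability.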

  11. MAGIC-SPP: a database-driven DNA sequence processing package with associated management tools

    PubMed Central

    Liang, Chun; Sun, Feng; Wang, Haiming; Qu, Junfeng; Freeman, Robert M; Pratt, Lee H; Cordonnier-Pratt, Marie-Michèle

    2006-01-01

    Background Processing raw DNA sequence data is an especially challenging task for relatively small laboratories and core facilities that produce as many as 5000 or more DNA sequences per week from multiple projects in widely differing species. To meet this challenge, we have developed the flexible, scalable, and automated sequence processing package described here. Results MAGIC-SPP is a DNA sequence processing package consisting of an Oracle 9i relational database, a Perl pipeline, and user interfaces implemented either as JavaServer Pages (JSP) or as a Java graphical user interface (GUI). The database not only serves as a data repository, but also controls processing of trace files. MAGIC-SPP includes an administrative interface, a laboratory information management system, and interfaces for exploring sequences, monitoring quality control, and troubleshooting problems related to sequencing activities. In the sequence trimming algorithm it employs new features designed to improve performance with respect to concerns such as concatenated linkers, identification of the expected start position of a vector insert, and extending the useful length of trimmed sequences by bridging short regions of low quality when the following high quality segment is sufficiently long to justify doing so. Conclusion MAGIC-SPP has been designed to minimize human error, while simultaneously being robust, versatile, flexible and automated. It offers a unique combination of features that permit administration by a biologist with little or no informatics background. It is well suited to both individual research programs and core facilities. PMID:16522212
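
    The bridging idea mentioned in this abstract (extending a trimmed read across a short low-quality stretch when the next high-quality segment is long enough) can be sketched as follows. This is not the MAGIC-SPP algorithm, which the abstract does not specify; the function names, the Phred threshold, and the `max_gap` and `min_next` parameters are all hypothetical.

```python
def high_quality_runs(quals, threshold):
    """Return [start, end) index pairs of maximal runs with quality >= threshold."""
    runs, start = [], None
    for i, q in enumerate(quals):
        if q >= threshold:
            if start is None:
                start = i
        elif start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(quals)))
    return runs


def trim_with_bridging(quals, threshold=20, max_gap=3, min_next=10):
    """Pick the longest high-quality run, then extend it to the right across
    low-quality gaps of at most max_gap bases, provided the following
    high-quality run has at least min_next bases. Returns (start, end) or None."""
    runs = high_quality_runs(quals, threshold)
    if not runs:
        return None
    best = max(range(len(runs)), key=lambda k: runs[k][1] - runs[k][0])
    start, end = runs[best]
    for nxt_start, nxt_end in runs[best + 1:]:
        gap = nxt_start - end
        if gap <= max_gap and (nxt_end - nxt_start) >= min_next:
            end = nxt_end  # bridge the short low-quality stretch
        else:
            break
    return start, end


if __name__ == "__main__":
    # Hypothetical per-base quality scores for one read.
    quals = [10] * 5 + [30] * 20 + [10] * 2 + [30] * 12 + [10] * 4 + [30] * 3
    print("with bridging:   ", trim_with_bridging(quals))
    print("without bridging:", trim_with_bridging(quals, max_gap=0))
```

    Requiring a minimum length for the following segment keeps the trimmer from bridging into an isolated handful of good bases near the noisy end of a read.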

  12. Development of an Integrated Suite of Software in Analysing of Large DNA Databases

    PubMed Central

    Kong, K.S; Ng, E.Y.K

    2008-01-01

    The work showed that the integrated suite of software tools for detecting criminals using DNA databases achieved its overall objective of providing a working platform for sequence analysis. The work also demonstrated that, by integrating BLAST and FASTA (two widely used and freely available algorithms) with an additional implementation of PSA (custom-built pairwise sequence alignment algorithms), TR analysis tools (for detecting tandem repeats) and the supporting utilities developed (database and file management), it is entirely possible to have an initial working version of the software tool for criminal DNA analysis and detection work. The integrated software tool has great potential, and the results obtained during the tests were satisfactory. The recent South Asia Tsunami incident has renewed the need to establish a quick and reliable system for DNA matching and comparison. This work may also contribute towards the quick identification of victims in many disasters. Future work will further enhance the existing tools by adding more options and controls, improve the visualisation display, and build a robust software architecture to better manage system load. Enhancing the fault tolerance of the system is one of the key areas that can further help to make the entire application efficient, robust and reliable. PMID:19415131

  13. Collecting, archiving and processing DNA from wildlife samples using FTA® databasing paper

    PubMed Central

    Smith, LM; Burgoyne, LA

    2004-01-01

    Background Methods involving the analysis of nucleic acids have become widespread in the fields of traditional biology and ecology; however, the storage and transport of samples collected in the field to the laboratory in a manner that allows purification of intact nucleic acids can prove problematic. Results FTA® databasing paper is widely used in human forensic analysis for the storage of biological samples and for purification of nucleic acids. The possible uses of FTA® databasing paper in the purification of DNA from samples of wildlife origin were examined, with particular reference to problems expected due to the nature of samples of wildlife origin. The processing of blood and tissue samples, the possibility of excess DNA in blood samples due to nucleated erythrocytes, and the analysis of degraded samples were all examined, as was the question of long term storage of blood samples on FTA® paper. Examples of the end use of the purified DNA are given for all protocols and the rationale behind the processing procedures is also explained to allow the end user to adjust the protocols as required. Conclusions FTA® paper is eminently suitable for collection of, and purification of nucleic acids from, biological samples from a wide range of wildlife species. This technology makes the collection and storage of such samples much simpler. PMID:15072582

  14. Information Technologies in Public Health Management: A Database on Biocides to Improve Quality of Life

    PubMed Central

    Roman, C; Scripcariu, L; Diaconescu, RM; Grigoriu, A

    2012-01-01

    Background: Biocides for prolonging the shelf life of a large variety of materials have been extensively used over the last decades. It has been estimated that worldwide biocide consumption was about 12.4 billion dollars in 2011, and it is expected to increase in 2012. As biocides are substances we come into contact with in our everyday lives, access to this type of information is of paramount importance in order to ensure an appropriate living environment. Consequently, a database where information may be quickly processed, sorted, and easily accessed, according to different search criteria, is the most desirable solution. The main aim of this work was to design and implement a relational database with complete information about biocides used in public health management to improve the quality of life. Methods: Design and implementation of a relational database for biocides, by using the software “phpMyAdmin”. Results: A database that allows for efficient collection, storage, and management of information, including chemical properties and applications of a large number of biocides, as well as its adequate dissemination into the public health environment. Conclusion: The information contained in the database presented here promotes an adequate use of biocides by means of information technologies, which in consequence may help achieve important improvements in our quality of life. PMID:23113190

  15. Publicly Available Database: Improved Spectral Line Measurements In SDSS DR7 Galaxies

    NASA Astrophysics Data System (ADS)

    Oh, Kyuseok; Sarzi, M.; Schawinski, K.; Yi, S. K.

    2012-01-01

    We present a new database of absorption and emission line measurements based on the Sloan Digital Sky Survey 7th data release for galaxies within a redshift of 0.2. Our work makes use of the publicly available penalized pixel-fitting (pPXF) and GANDALF codes, aiming to improve the existing measurements of stellar kinematics, the strength of various absorption-line features, and the flux and width of the emission from different species of ionized gas. The absorption line strengths measured by the SDSS pipeline are seriously contaminated by emission fill-in. We effectively separate emission lines from absorption lines. For instance, this work successfully extracts the [NI] doublet from Mgb, which leads to a more realistic estimate of alpha enhancement in late-type galaxies compared to the previous database. Besides accurately measuring line strengths, the database provides new parameters that are indicative of line strength measurement quality. Users can build a subset of the database optimal for their studies using specific cuts in the fitting quality parameters as well as empirical signal-to-noise. Applying these parameters, we found 'hidden' broad-line-region galaxies, which turned out to be Seyfert I nuclei that were not picked up as AGN by SDSS. The database is publicly available at http://gem.yonsei.ac.kr/ossy

  16. REBASE--a database for DNA restriction and modification: enzymes, genes and genomes.

    PubMed

    Roberts, Richard J; Vincze, Tamas; Posfai, Janos; Macelis, Dana

    2010-01-01

    REBASE is a comprehensive database of information about restriction enzymes, DNA methyltransferases and related proteins involved in the biological process of restriction-modification (R-M). It contains fully referenced information about recognition and cleavage sites, isoschizomers, neoschizomers, commercial availability, methylation sensitivity, crystal and sequence data. Experimentally characterized homing endonucleases are also included. The fastest growing segment of REBASE contains the putative R-M systems found in the sequence databases. Comprehensive descriptions of the R-M content of all fully sequenced genomes are available including summary schematics. The contents of REBASE may be browsed from the web (http://rebase.neb.com) and selected compilations can be downloaded by ftp (ftp.neb.com). Additionally, monthly updates can be requested via email.

  17. The evidential value in the DNA database search controversy and the two-stain problem.

    PubMed

    Meester, Ronald; Sjerps, Marjan

    2003-09-01

    Does the evidential strength of a DNA match depend on whether the suspect was identified through database search or through other evidence ("probable cause")? In Balding and Donnelly (1995, Journal of the Royal Statistical Society, Series A 158, 21-53) and elsewhere, it has been argued that the evidential strength is slightly larger in a database search case than in a probable cause case, while Stockmarr (1999, Biometrics 55, 671-677) reached the opposite conclusion. Both these approaches use likelihood ratios. By making an excursion to a similar problem, the two-stain problem, we argue in this article that there are certain fundamental difficulties with the use of a likelihood ratio, which can be avoided by concentrating on the posterior odds. This approach helps resolve the above-mentioned conflict.

  18. Development of a Publicly Available, Comprehensive Database of Fiber and Health Outcomes: Rationale and Methods

    PubMed Central

    Livingston, Kara A.; Chung, Mei; Sawicki, Caleigh M.; Lyle, Barbara J.; Wang, Ding Ding; Roberts, Susan B.; McKeown, Nicola M.

    2016-01-01

    Background Dietary fiber is a broad category of compounds historically defined as partially or completely indigestible plant-based carbohydrates and lignin with, more recently, the additional criteria that fibers incorporated into foods as additives should demonstrate functional human health outcomes to receive a fiber classification. Thousands of research studies have been published examining fibers and health outcomes. Objectives (1) Develop a database listing studies testing fiber and physiological health outcomes identified by experts at the Ninth Vahouny Conference; (2) Use evidence mapping methodology to summarize this body of literature. This paper summarizes the rationale, methodology, and resulting database. The database will help both scientists and policy-makers to evaluate evidence linking specific fibers with physiological health outcomes, and identify missing information. Methods To build this database, we conducted a systematic literature search for human intervention studies published in English from 1946 to May 2015. Our search strategy included a broad definition of fiber search terms, as well as search terms for nine physiological health outcomes identified at the Ninth Vahouny Fiber Symposium. Abstracts were screened using a priori defined eligibility criteria and a low threshold for inclusion to minimize the likelihood of rejecting articles of interest. Publications then were reviewed in full text, applying additional a priori defined exclusion criteria. The database was built and published on the Systematic Review Data Repository (SRDR™), a web-based, publicly available application. Conclusions A fiber database was created. This resource will reduce the unnecessary replication of effort in conducting systematic reviews by serving as both a central database archiving PICO (population, intervention, comparator, outcome) data on published studies and as a searchable tool through which this data can be extracted and updated. PMID:27348733

  19. Misguided phylogenetic comparisons using DGGE excised bands may contaminate public sequence databases.

    PubMed

    Pylro, Victor Satler; Morais, Daniel Kumazawa; Kalks, Karlos Henrique Martins; Roesch, Luiz Fernando Wurdig; Hirsch, Penny R; Tótola, Marcos Rogério; Yotoko, Karla

    2016-07-01

    Controversy surrounding bacterial phylogenies has become one of the most important challenges for microbial ecology. Comparative analyses with nucleotide databases and phylogenetic reconstruction of the amplified 16S rRNA genes from DGGE (Denaturing Gradient Gel Electrophoresis) excised bands have been used by several researchers for the identification of organisms in complex samples. Here, we individually analyzed DGGE-excised 16S rRNA gene bands from 10 certified bacterial strains of different species, and demonstrated that this kind of approach can deliver erroneous outcomes to researchers, besides causing/emphasizing errors in public databases. PMID:27109483

  20. REBASE--a database for DNA restriction and modification: enzymes, genes and genomes.

    PubMed

    Roberts, Richard J; Vincze, Tamas; Posfai, Janos; Macelis, Dana

    2015-01-01

    REBASE is a comprehensive and fully curated database of information about the components of restriction-modification (RM) systems. It contains fully referenced information about recognition and cleavage sites for both restriction enzymes and methyltransferases as well as commercial availability, methylation sensitivity, crystal and sequence data. All genomes that are completely sequenced are analyzed for RM system components, and with the advent of PacBio sequencing, the recognition sequences of DNA methyltransferases (MTases) are appearing rapidly. Thus, Type I and Type III systems can now be characterized in terms of recognition specificity merely by DNA sequencing. The contents of REBASE may be browsed from the web http://rebase.neb.com and selected compilations can be downloaded by FTP (ftp.neb.com). Monthly updates are also available via email.

  1. Local mitochondrial DNA haplotype databases needed for domestic dog populations that have experienced founder effect.

    PubMed

    Spadaro, Amanda; Ream, Kelsey; Braham, Caitlyn; Webb, Kristen M

    2015-03-01

    Biological material from pets is often collected as evidence from crime scenes. Due to sample type and quality, mitochondrial DNA (mtDNA) is frequently evaluated to identify the potential contributor. MtDNA has a lower discriminatory power than nuclear DNA with multiple individuals in a population potentially carrying the same mtDNA sequence, or haplotype. The frequency distribution of mtDNA haplotypes in a population must be known in order to determine the evidentiary value of a match between crime scene evidence and the potential contributor of the biological material. This is especially important in geographic areas that include remote and/or isolated populations where founder effect may have led to a decrease in genetic diversity and a non-random distribution of haplotypes relative to the population at large. Here we compared the haplotype diversity in dogs from the noncontiguous states of Alaska and Hawaii relative to the contiguous United States (US). We report a greater proportion of dogs carrying an A haplotype in Alaska relative to any other US population. Significant variation in the distribution of haplotype frequencies was discovered when comparing the haplotype diversity of dogs in Hawaii to that of the continental US. Each of these regions exhibits reduced genetic diversity relative to the contiguous US, likely due to founder effect. We recommend that specific databases be created to accurately represent the mitochondrial haplotype diversity in these remote areas. Furthermore, our work demonstrates the importance of local surveys for populations that may have experienced founder effect. PMID:25612881

  2. Government databases and public health research: facilitating access in the public interest.

    PubMed

    Adams, Carolyn; Allen, Judy

    2014-06-01

    Access to datasets of personal health information held by government agencies is essential to support public health research and to promote evidence-based public health policy development. Privacy legislation in Australia allows the use and disclosure of such information for public health research. However, access is not always forthcoming in a timely manner and the decision-making process undertaken by government data custodians is not always transparent. Given the public benefit in research using these health information datasets, this article suggests that it is time to recognise a right of access for approved research and that the decisions, and decision-making processes, of government data custodians should be subject to increased scrutiny. The article concludes that researchers should have an avenue of external review where access to information has been denied or unduly delayed.

  3. Effect of reference database on frequency estimates of polymerase chain reaction (PCR)-based DNA profiles.

    PubMed

    Monson, K L; Budowle, B

    1998-05-01

    A variety of general, regional, ancestral and ethnic databases is available for the polymerase chain reaction (PCR)-based loci LDLR, GYPA, HBGG, D7S8, Gc, DQA1, and D1S80. Generally, we observed greater differences in frequency estimations of DNA profiles between racial groups than between ethnic or geographic subgroups. Analysis revealed few forensically significant differences within ethnic subgroups, particularly within general United States groups, and multi-locus frequency estimates typically differ by less than a factor of ten. Using a database different from the one to which a target profile belongs tends to overestimate rarity. Implementation of the general correction of homozygote frequencies for a population substructure, advised by the 1996 National Research Council report, The Evaluation of Forensic DNA Evidence, has a minimal effect on profile frequencies. Even when it is known that both the suspect and all possible perpetrators must belong to the same isolated population, the special correction for inbreeding, which was proposed by the 1996 National Research Council report for this special case, has a relatively modest effect, typically a factor of two or less for 1% inbreeding. The effect becomes more substantial (exceeding a factor of ten) for inbreeding of 3% or more in multi-locus profiles rarer than about one in a million. PMID:9608687
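
    To give a feel for how an inbreeding correction changes a multi-locus profile frequency, the sketch below applies the conditional genotype formulas commonly associated with the 1996 National Research Council report for a suspect and perpetrator drawn from the same subpopulation. The abstract does not print these formulas, so their use here is an assumption, and the allele frequencies in the example profile are invented. Setting theta to 0 recovers the ordinary Hardy-Weinberg products p² and 2pq.

```python
def homozygote_probability(p, theta=0.0):
    """Conditional frequency of genotype A_i A_i with coancestry theta."""
    num = (2 * theta + (1 - theta) * p) * (3 * theta + (1 - theta) * p)
    return num / ((1 + theta) * (1 + 2 * theta))


def heterozygote_probability(p, q, theta=0.0):
    """Conditional frequency of genotype A_i A_j (i != j) with coancestry theta."""
    num = 2 * (theta + (1 - theta) * p) * (theta + (1 - theta) * q)
    return num / ((1 + theta) * (1 + 2 * theta))


def profile_probability(genotypes, theta=0.0):
    """Multiply per-locus frequencies, assuming independent loci.
    Each genotype is (p,) for a homozygote or (p, q) for a heterozygote."""
    freq = 1.0
    for alleles in genotypes:
        if len(alleles) == 1:
            freq *= homozygote_probability(alleles[0], theta)
        else:
            freq *= heterozygote_probability(alleles[0], alleles[1], theta)
    return freq


if __name__ == "__main__":
    # Hypothetical three-locus profile (allele frequencies are made up).
    profile = [(0.1,), (0.2, 0.05), (0.15, 0.3)]
    uncorrected = profile_probability(profile, theta=0.0)
    corrected = profile_probability(profile, theta=0.01)
    print("theta = 0:   ", uncorrected)
    print("theta = 0.01:", corrected)
    print("ratio:", corrected / uncorrected)
```

    Because the correction acts at every locus, its cumulative effect grows with the number of loci and with the rarity of the alleles involved.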

  4. Large-scale annotation of small-molecule libraries using public databases.

    PubMed

    Zhou, Yingyao; Zhou, Bin; Chen, Kaisheng; Yan, S Frank; King, Frederick J; Jiang, Shumei; Winzeler, Elizabeth A

    2007-01-01

    While many large publicly accessible databases provide excellent annotation for biological macromolecules, the same is not true for small chemical compounds. Commercial data sources also fail to encompass an annotation interface for large numbers of compounds and tend to be too costly to be widely available to biomedical researchers. Therefore, using annotation information for the selection of lead compounds from a modern day high-throughput screening (HTS) campaign presently occurs only on a very limited scale. The recent rapid expansion of the NIH PubChem database provides an opportunity to link existing biological databases with compound catalogs and provides relevant information that potentially could improve the information garnered from large-scale screening efforts. Using the 2.5 million compound collection at the Genomics Institute of the Novartis Research Foundation (GNF) as a model, we determined that approximately 4% of the library contained compounds with potential annotation in such databases as PubChem and the World Drug Index (WDI) as well as related databases such as the Kyoto Encyclopedia of Genes and Genomes (KEGG) and ChemIDplus. Furthermore, the exact structure match analysis showed 32% of GNF compounds can be linked to third party databases via PubChem. We also showed annotations such as MeSH (medical subject headings) terms can be applied to in-house HTS databases in identifying signature biological inhibition profiles of interest as well as expediting the assay validation process. The automated annotation of thousands of screening hits in batch is becoming feasible and has the potential to play an essential role in the hit-to-lead decision making process.

  5. SABRE2: a database connecting plant EST/full-length cDNA clones with Arabidopsis information.

    PubMed

    Fukami-Kobayashi, Kaoru; Nakamura, Yasukazu; Tamura, Takuro; Kobayashi, Masatomo

    2014-01-01

    The SABRE (Systematic consolidation of Arabidopsis and other Botanical REsources) database cross-searches plant genetic resources through publicly available Arabidopsis information. In SABRE, plant expressed sequence tag (EST)/cDNA clones are related to TAIR (The Arabidopsis Information Resource) gene models and their annotations through sequence similarity. By entering a keyword, SABRE searches and retrieves TAIR gene models and annotations, together with homologous gene clones from various plant species. SABRE thus facilitates using TAIR annotations of Arabidopsis genes for research on homologous genes from other model plants. To expand the application range of SABRE to crop breeding, we have recently upgraded SABRE to SABRE2 (http://sabre.epd.brc.riken.jp/SABRE2.html), by newly adding six model plants (including the major crops barley, soybean, tomato and wheat), and by improving the retrieval interface. The present version has integrated information on >1.5 million plant EST/cDNA clones from the National BioResource Project (NBRP) of Japan. All clones are actual experimental resources from 14 plant species (Arabidopsis, barley, cassava, Chinese cabbage, lotus, morning glory, poplar, Physcomitrella patens, Striga hermonthica, soybean, Thellungiella halophila, tobacco, tomato and wheat), and are available from the core facilities of the NBRP. SABRE2 is thus a useful tool that can contribute towards the improvement of important crop breeds by connecting basic research and crop breeding.

  6. Applying Knowledge Discovery in Databases in Public Health Data Set: Challenges and Concerns

    PubMed Central

    Volrathongchia, Kanittha

    2003-01-01

    In attempting to apply Knowledge Discovery in Databases (KDD) to generate a predictive model from a health care dataset that is currently available to the public, the first step is to pre-process the data to overcome the challenges of missing data, redundant observations, and records containing inaccurate data. This study will demonstrate how to use simple pre-processing methods to improve the quality of input data. PMID:14728545
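
    The three data-quality problems listed in this abstract (missing data, redundant observations, inaccurate records) can be illustrated with a short sketch. It is only one possible approach, under stated assumptions: the field names, plausibility ranges and the choice to drop rather than impute bad records are all hypothetical and are not taken from the study.

```python
def preprocess(records, required, valid_ranges):
    """Return records that are complete, plausible and not duplicated.

    records      -- list of dicts, one per observation
    required     -- field names that must be present and not None
    valid_ranges -- {field: (low, high)} plausibility bounds (assumed values)
    """
    seen = set()
    clean = []
    for rec in records:
        # 1. Missing data: skip records lacking any required field.
        if any(rec.get(field) is None for field in required):
            continue
        # 2. Inaccurate data: skip values outside the plausible range.
        out_of_range = False
        for field, (low, high) in valid_ranges.items():
            value = rec.get(field)
            if value is None or not (low <= value <= high):
                out_of_range = True
                break
        if out_of_range:
            continue
        # 3. Redundant observations: keep only the first copy.
        key = tuple(rec[field] for field in required)
        if key in seen:
            continue
        seen.add(key)
        clean.append(rec)
    return clean


# Hypothetical records; "sbp" stands for systolic blood pressure.
sample = [
    {"id": 1, "age": 34, "sbp": 120},
    {"id": 1, "age": 34, "sbp": 120},    # exact duplicate
    {"id": 2, "age": None, "sbp": 130},  # missing value
    {"id": 3, "age": 210, "sbp": 125},   # implausible age
    {"id": 4, "age": 58, "sbp": 140},
]

cleaned = preprocess(
    sample,
    required=("id", "age", "sbp"),
    valid_ranges={"age": (0, 120), "sbp": (50, 300)},
)

if __name__ == "__main__":
    print([rec["id"] for rec in cleaned])
```

    Dropping records is the simplest choice; imputing missing values or flagging suspect ones for review are alternatives when discarding data would bias the predictive model.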

  7. Toward public volume database management: a case study of NOVA, the National Online Volumetric Archive

    NASA Astrophysics Data System (ADS)

    Fletcher, Alex; Yoo, Terry S.

    2004-04-01

    Public databases today can be constructed with a wide variety of authoring and management structures. The widespread appeal of Internet search engines suggests that public information be made open and available to common search strategies, making accessible information that would otherwise be hidden by the infrastructure and software interfaces of a traditional database management system. We present the construction and organizational details for managing NOVA, the National Online Volumetric Archive. As an archival effort of the Visible Human Project for supporting medical visualization research, archiving 3D multimodal radiological teaching files, and enhancing medical education with volumetric data, our overall database structure is simplified; archives grow by accruing information, but seldom have to modify, delete, or overwrite stored records. NOVA is being constructed and populated so that it is transparent to the Internet; that is, much of its internal structure is mirrored in HTML allowing internet search engines to investigate, catalog, and link directly to the deep relational structure of the collection index. The key organizational concept for NOVA is the Image Content Group (ICG), an indexing strategy for cataloging incoming data as a set structure rather than by keyword management. These groups are managed through a series of XML files and authoring scripts. We cover the motivation for Image Content Groups, their overall construction, authorship, and management in XML, and the pilot results for creating public data repositories using this strategy.

  8. SITVITWEB--a publicly available international multimarker database for studying Mycobacterium tuberculosis genetic diversity and molecular epidemiology.

    PubMed

    Demay, Christophe; Liens, Benjamin; Burguière, Thomas; Hill, Véronique; Couvin, David; Millet, Julie; Mokrousov, Igor; Sola, Christophe; Zozio, Thierry; Rastogi, Nalin

    2012-06-01

    Among various genotyping methods to study Mycobacterium tuberculosis complex (MTC) genotypic polymorphism, spoligotyping and mycobacterial interspersed repetitive units-variable number of DNA tandem repeats (MIRU-VNTRs) have recently gained international approval as robust, fast, and reproducible typing methods generating data in a portable format. Spoligotyping constituted the backbone of a publicly available database SpolDB4 released in 2006; nonetheless this method possesses a low discriminatory power when used alone and should be ideally used in conjunction with a second typing method such as MIRU-VNTRs for high-resolution epidemiological studies. We hereby describe a publicly available international database named SITVITWEB which incorporates such multimarker data, giving a global view of MTC genetic diversity worldwide based on 62,582 clinical isolates corresponding to 153 countries of patient origin (105 countries of isolation). We report a total of 7105 spoligotype patterns (corresponding to 58,180 clinical isolates) - grouped into 2740 shared-types or spoligotype international types (SIT) containing 53,816 clinical isolates and 4364 orphan patterns. Interestingly, only 7% of the MTC isolates worldwide were orphans whereas more than half of SITed isolates (n=27,059) were restricted to only 24 most prevalent SITs. The database also contains a total of 2379 MIRU patterns (from 8161 clinical isolates) from 87 countries of patient origin (35 countries of isolation); these were grouped in 847 shared-types or MIRU international types (MIT) containing 6626 isolates and 1533 orphan patterns. Lastly, data on 5-locus exact tandem repeats (ETRs) were available on 4626 isolates from 59 countries of patient origin (22 countries of isolation); a total of 458 different VNTR patterns were observed - split into 245 shared-types or VNTR International Types (VIT) containing 4413 isolates and 213 orphan patterns. Datamining of SITVITWEB further allowed to update

  9. Generation and Analysis of End Sequence Database for T-DNA Tagging Lines in Rice

    PubMed Central

    An, Suyoung; Park, Sunhee; Jeong, Dong-Hoon; Lee, Dong-Yeon; Kang, Hong-Gyu; Yu, Jung-Hwa; Hur, Junghe; Kim, Sung-Ryul; Kim, Young-Hea; Lee, Miok; Han, Soonki; Kim, Soo-Jin; Yang, Jungwon; Kim, Eunjoo; Wi, Soo Jin; Chung, Hoo Sun; Hong, Jong-Pil; Choe, Vitnary; Lee, Hak-Kyung; Choi, Jung-Hee; Nam, Jongmin; Kim, Seong-Ryong; Park, Phun-Bum; Park, Ky Young; Kim, Woo Taek; Choe, Sunghwa; Lee, Chin-Bum; An, Gynheung

    2003-01-01

    We analyzed 6,749 lines tagged by the gene trap vector pGA2707. This resulted in the isolation of 3,793 genomic sequences flanking the T-DNA. Among the insertions, 1,846 T-DNAs were integrated into genic regions, and 1,864 were located in intergenic regions. Frequencies were also higher at the beginning and end of the coding regions and upstream near the ATG start codon. The overall GC content at the insertion sites was close to that measured from the entire rice (Oryza sativa) genome. Functional classification of these 1,846 tagged genes showed a distribution similar to that observed for all the genes in the rice chromosomes. This indicates that T-DNA insertion is not biased toward a particular class of genes. There were 764, 327, and 346 T-DNA insertions in chromosomes 1, 4 and 10, respectively. Insertions were not evenly distributed; frequencies were higher at the ends of the chromosomes and lower near the centromere. At certain sites, the frequency was higher than in the surrounding regions. This sequence database will be valuable in identifying knockout mutants for elucidating gene function in rice. This resource is available to the scientific community at http://www.postech.ac.kr/life/pfg/risd. PMID:14630961

  10. Privacy protection and public goods: building a genetic database for health research in Newfoundland and Labrador

    PubMed Central

    Pullman, Daryl; Perrot-Daley, Astrid; Hodgkinson, Kathy; Street, Catherine; Rahman, Proton

    2013-01-01

    Objective To provide a legal and ethical analysis of some of the implementation challenges faced by the Population Therapeutics Research Group (PTRG) at Memorial University (Canada), in using genealogical information offered by individuals for its genetics research database. Materials and methods This paper describes the unique historical and genetic characteristics of the Newfoundland and Labrador founder population, which gave rise to the opportunity for PTRG to build the Newfoundland Genealogy Database containing digitized records of all pre-confederation (1949) census records of the Newfoundland founder population. In addition to building the database, PTRG has developed the Heritability Analytics Infrastructure, a data management structure that stores genotype, phenotype, and pedigree information in a single database, and custom linkage software (KINNECT) to perform pedigree linkages on the genealogy database. Discussion A newly adopted legal regimen in Newfoundland and Labrador is discussed. It incorporates health privacy legislation with a unique research ethics statute governing the composition and activities of research ethics boards and, for the first time in Canada, elevating the status of national research ethics guidelines into law. The discussion looks at this integration of legal and ethical principles which provides a flexible and seamless framework for balancing the privacy rights and welfare interests of individuals, families, and larger societies in the creation and use of research data infrastructures as public goods. Conclusion The complementary legal and ethical frameworks that now coexist in Newfoundland and Labrador provide the legislative authority, ethical legitimacy, and practical flexibility needed to find a workable balance between privacy interests and public goods. Such an approach may also be instructive for other jurisdictions as they seek to construct and use biobanks and related research platforms for genetic research. PMID

  11. Recommendations of the DNA Commission of the International Society for Forensic Genetics (ISFG) on quality control of autosomal Short Tandem Repeat allele frequency databasing (STRidER).

    PubMed

    Bodner, Martin; Bastisch, Ingo; Butler, John M; Fimmers, Rolf; Gill, Peter; Gusmão, Leonor; Morling, Niels; Phillips, Christopher; Prinz, Mechthild; Schneider, Peter M; Parson, Walther

    2016-09-01

    The statistical evaluation of autosomal Short Tandem Repeat (STR) genotypes is based on allele frequencies. These are empirically determined from sets of randomly selected human samples, compiled into STR databases that have been established in the course of population genetic studies. There is currently no agreed procedure of performing quality control of STR allele frequency databases, and the reliability and accuracy of the data are largely based on the responsibility of the individual contributing research groups. It has been demonstrated with databases of haploid markers (EMPOP for mitochondrial mtDNA, and YHRD for Y-chromosomal loci) that centralized quality control and data curation is essential to minimize error. The concepts employed for quality control involve software-aided likelihood-of-genotype, phylogenetic, and population genetic checks that allow the researchers to compare novel data to established datasets and, thus, maintain the high quality required in forensic genetics. Here, we present STRidER (http://strider.online), a publicly available, centrally curated online allele frequency database and quality control platform for autosomal STRs. STRidER expands on the previously established ENFSI DNA WG STRbASE and applies standard concepts established for haploid and autosomal markers as well as novel tools to reduce error and increase the quality of autosomal STR data. The platform constitutes a significant improvement and innovation for the scientific community, offering autosomal STR data quality control and reliable STR genotype estimates. PMID:27352221

  12. Databases of publications and observations as a part of the Crimean Astronomical Virtual Observatory

    NASA Astrophysics Data System (ADS)

    Shlyapnikov, A.; Bondar', N.; Gorbunov, M.

    We describe the main principles of formation of databases (DBs) with information about astronomical objects and their physical characteristics derived from observations obtained at the Crimean Astrophysical Observatory (CrAO) and published in the "Izvestiya of the CrAO" and elsewhere. Emphasis is placed on the DBs missing from the most complete global library of catalogs and data tables, VizieR (supported by the Center of Astronomical Data, Strasbourg). We specially consider the problem of forming a digital archive of observational data obtained at the CrAO as an interactive DB related to database objects and publications. We present examples of all our DBs as elements integrated into the Crimean Astronomical Virtual Observatory. We illustrate the work with the CrAO DBs using tools of the International Virtual Observatory: Aladin, VOPlot, VOSpec, in conjunction with the VizieR and Simbad DBs.

  13. A Public Database of Memory and Naive B-Cell Receptor Sequences

    PubMed Central

    Sherwood, Anna M.; Vignali, Marissa; Carlson, Christopher S.; Greenberg, Philip D.; Duerkopp, Natalie; Emerson, Ryan O.; Robins, Harlan S.

    2016-01-01

    The vast diversity of B-cell receptors (BCR) and secreted antibodies enables the recognition of, and response to, a wide range of epitopes, but this diversity has also limited our understanding of humoral immunity. We present a public database of more than 37 million unique BCR sequences from three healthy adult donors that is many fold deeper than any existing resource, together with a set of online tools designed to facilitate the visualization and analysis of the annotated data. We estimate the clonal diversity of the naive and memory B-cell repertoires of healthy individuals, and provide a set of examples that illustrate the utility of the database, including several views of the basic properties of immunoglobulin heavy chain sequences, such as rearrangement length, subunit usage, and somatic hypermutation positions and dynamics. PMID:27513338

  14. A Public Database of Memory and Naive B-Cell Receptor Sequences.

    PubMed

    DeWitt, William S; Lindau, Paul; Snyder, Thomas M; Sherwood, Anna M; Vignali, Marissa; Carlson, Christopher S; Greenberg, Philip D; Duerkopp, Natalie; Emerson, Ryan O; Robins, Harlan S

    2016-01-01

    The vast diversity of B-cell receptors (BCR) and secreted antibodies enables the recognition of, and response to, a wide range of epitopes, but this diversity has also limited our understanding of humoral immunity. We present a public database of more than 37 million unique BCR sequences from three healthy adult donors that is many fold deeper than any existing resource, together with a set of online tools designed to facilitate the visualization and analysis of the annotated data. We estimate the clonal diversity of the naive and memory B-cell repertoires of healthy individuals, and provide a set of examples that illustrate the utility of the database, including several views of the basic properties of immunoglobulin heavy chain sequences, such as rearrangement length, subunit usage, and somatic hypermutation positions and dynamics. PMID:27513338

  15. A spatial national health facility database for public health sector planning in Kenya in 2008

    PubMed Central

    Noor, Abdisalan M; Alegana, Victor A; Gething, Peter W; Snow, Robert W

    2009-01-01

    Background Efforts to tackle the enormous burden of ill-health in low-income countries are hampered by weak health information infrastructures that do not support appropriate planning and resource allocation. For health information systems to function well, a reliable inventory of health service providers is critical. The spatial referencing of service providers to allow their representation in a geographic information system is vital if the full planning potential of such data is to be realized. Methods A disparate series of contemporary lists of health service providers were used to update a public health facility database of Kenya last compiled in 2003. These new lists were derived primarily through the national distribution of antimalarial and antiretroviral commodities since 2006. A combination of methods, including global positioning systems, was used to map service providers. These spatially-referenced data were combined with high-resolution population maps to analyze disparity in geographic access to public health care. Findings The updated 2008 database contained 5,334 public health facilities (67% ministry of health; 28% mission and nongovernmental organizations; 2% local authorities; and 3% employers and other ministries). This represented an overall increase of 1,862 facilities compared to 2003. Most of the additional facilities belonged to the ministry of health (79%) and the majority were dispensaries (91%). 93% of the health facilities were spatially referenced, 38% using global positioning systems compared to 21% in 2003. 89% of the population was within 5 km Euclidean distance to a public health facility in 2008 compared to 71% in 2003. Over 80% of the population outside 5 km of public health service providers was in the sparsely settled pastoralist areas of the country. Conclusion We have shown that, with concerted effort, a relatively complete inventory of mapped health services is possible with enormous potential for improving planning

  16. CoDNaS 2.0: a comprehensive database of protein conformational diversity in the native state

    PubMed Central

    Monzon, Alexander Miguel; Rohr, Cristian Oscar; Fornasari, María Silvina; Parisi, Gustavo

    2016-01-01

    CoDNaS (conformational diversity of the native state) is a protein conformational diversity database. Conformational diversity describes structural differences between conformers that define the native state of proteins. It is a key concept to understand protein function and biological processes related to protein functions. CoDNaS offers a well curated database that is experimentally driven, thoroughly linked, and annotated. CoDNaS facilitates the extraction of key information on small structural differences based on protein movements. CoDNaS enables users to easily relate the degree of conformational diversity with physical, chemical and biological properties derived from experiments on protein structure and biological characteristics. The new version of CoDNaS includes ∼70% of all available protein structures, and new tools have been added that run sequence searches, display structural flexibility profiles and allow users to browse the database for different structural classes. These tools facilitate the exploration of protein conformational diversity and its role in protein function. Database URL: http://ufq.unq.edu.ar/codnas PMID:27022160

  17. CoDNaS 2.0: a comprehensive database of protein conformational diversity in the native state.

    PubMed

    Monzon, Alexander Miguel; Rohr, Cristian Oscar; Fornasari, María Silvina; Parisi, Gustavo

    2016-01-01

    CoDNaS (conformational diversity of the native state) is a protein conformational diversity database. Conformational diversity describes structural differences between conformers that define the native state of proteins. It is a key concept to understand protein function and biological processes related to protein functions. CoDNaS offers a well curated database that is experimentally driven, thoroughly linked, and annotated. CoDNaS facilitates the extraction of key information on small structural differences based on protein movements. CoDNaS enables users to easily relate the degree of conformational diversity with physical, chemical and biological properties derived from experiments on protein structure and biological characteristics. The new version of CoDNaS includes ∼70% of all available protein structures, and new tools have been added that run sequence searches, display structural flexibility profiles and allow users to browse the database for different structural classes. These tools facilitate the exploration of protein conformational diversity and its role in protein function. Database URL:http://ufq.unq.edu.ar/codnas. PMID:27022160

  18. mirPub: a database for searching microRNA publications

    PubMed Central

    Vergoulis, Thanasis; Kanellos, Ilias; Kostoulas, Nikos; Georgakilas, Georgios; Sellis, Timos; Hatzigeorgiou, Artemis; Dalamagas, Theodore

    2015-01-01

    Summary: Identifying, amongst millions of publications available in MEDLINE, those that are relevant to specific microRNAs (miRNAs) of interest based on keyword search faces major obstacles. References to miRNA names in the literature often deviate from standard nomenclature for various reasons, since even the official nomenclature evolves. For instance, a single miRNA name may identify two completely different molecules or two different names may refer to the same molecule. mirPub is a database with a powerful and intuitive interface, which facilitates searching for miRNA literature, addressing the aforementioned issues. To provide effective search services, mirPub applies text mining techniques on MEDLINE, integrates data from several curated databases and exploits data from its user community following a crowdsourcing approach. Other key features include an interactive visualization service that illustrates intuitively the evolution of miRNA data, tag clouds summarizing the relevance of publications to particular diseases, cell types or tissues and access to TarBase 6.0 data to oversee genes related to miRNA publications. Availability and Implementation: mirPub is freely available at http://www.microrna.gr/mirpub/. Contact: vergoulis@imis.athena-innovation.gr or dalamag@imis.athena-innovation.gr Supplementary information: Supplementary data are available at Bioinformatics online. PMID:25527833

  19. Does an English appeal court ruling increase the risks of miscarriages of justice when complex DNA profiles are searched against the national DNA database?

    PubMed

    Gill, P; Bleka, Ø; Egeland, T

    2014-11-01

    Likelihood ratio (LR) methods to interpret multi-contributor, low template, complex DNA mixtures are becoming standard practice. The next major development will be to introduce search engines based on the new methods to interrogate very large national DNA databases, such as those held by China, the USA and the UK. Here we describe a rapid method that was used to assign a LR to each individual member of a database of 5 million genotypes which can be ranked in order. Previous authors have only considered database trawls in the context of binary match or non-match criteria. However, the concept of match/non-match no longer applies within the new paradigm introduced, since the distribution of resultant LRs is continuous for practical purposes. An English appeal court decision allows scientists to routinely report complex DNA profiles using nothing more than their subjective personal 'experience of casework' and 'observations' in order to apply an expression of the rarity of an evidential sample. This ruling must be considered in context of a recent high profile English case, where an individual was extracted from a database and wrongly accused of a serious crime. In this case the DNA evidence was used to negate the overwhelming exculpatory (non-DNA) evidence. Demonstrable confirmation bias, also known as the 'CSI-effect', seriously affected the investigation. The case demonstrated that in practice, databases could be used to select and prosecute an individual, simply because he ranked high in the list of possible matches. We have identified this phenomenon as a cognitive error which we term: 'the naïve investigator effect'. We take the opportunity to test the performance of database extraction strategies either by using a simple matching allele count (MAC) method or LR. The example heard by the appeal court is used as the exemplar case. It is demonstrated that the LR search-method offers substantial benefits compared to searches based on simple matching allele count (MAC

  20. Similarity landscapes: An improved method for scientific visualization of information from protein and DNA database searches

    SciTech Connect

    Dogget, N.; Myers, G.; Wills, C.J.

    1998-12-01

    This is the final report of a three-year, Laboratory Directed Research and Development (LDRD) project at the Los Alamos National Laboratory (LANL). The authors have used computer simulations and examination of a variety of databases to address a wide range of evolutionary questions. The authors have found that there is a clear distinction in the evolution of HIV-1 and HIV-2, with the former and more virulent virus evolving more rapidly at a functional level. The authors have discovered highly non-random patterns in the evolution of HIV-1 that can be attributed to a variety of selective pressures. In the course of examination of microsatellite DNA (short repeat regions) in microorganisms, the authors have found clear differences between prokaryotes and eukaryotes in their distribution, differences that can be tied to different selective pressures. They have developed a new method (topiary pruning) for enhancing the phylogenetic information contained in DNA sequences. Most recently, the authors have discovered effects in complex rainforest ecosystems that indicate strong frequency-dependent interactions between host species and their parasites, leading to the maintenance of ecosystem variability.

  1. The Government Finance Database: A Common Resource for Quantitative Research in Public Financial Analysis

    PubMed Central

    Pierson, Kawika; Hand, Michael L.; Thompson, Fred

    2015-01-01

    Quantitative public financial management research focused on local governments is limited by the absence of a common database for empirical analysis. While the U.S. Census Bureau distributes government finance data that some scholars have utilized, the arduous process of collecting, interpreting, and organizing the data has made its adoption prohibitive and inconsistent. In this article we offer a single, coherent resource that contains all of the government financial data from 1967-2012, uses easy-to-understand natural-language variable names, and will be extended when new data is available. PMID:26107821

  2. The Government Finance Database: A Common Resource for Quantitative Research in Public Financial Analysis.

    PubMed

    Pierson, Kawika; Hand, Michael L; Thompson, Fred

    2015-01-01

    Quantitative public financial management research focused on local governments is limited by the absence of a common database for empirical analysis. While the U.S. Census Bureau distributes government finance data that some scholars have utilized, the arduous process of collecting, interpreting, and organizing the data has made its adoption prohibitive and inconsistent. In this article we offer a single, coherent resource that contains all of the government financial data from 1967-2012, uses easy-to-understand natural-language variable names, and will be extended when new data is available.

  3. Fluorescence- and capillary electrophoresis (CE)-based SSR DNA fingerprinting and a molecular identity database for the Louisiana sugarcane industry

    Technology Transfer Automated Retrieval System (TEKTRAN)

    A database of Louisiana sugarcane molecular identity has been constructed and is being updated annually using FAM or HEX or NED fluorescence- and capillary electrophoresis (CE)-based microsatellite (SSR) fingerprinting information. The fingerprints are PCR-amplified from leaf DNA samples of current ...

  4. The University of Minnesota Biocatalysis/Biodegradation Database: improving public access.

    PubMed

    Gao, Junfeng; Ellis, Lynda B M; Wackett, Lawrence P

    2010-01-01

    The University of Minnesota Biocatalysis/Biodegradation Database (UM-BBD, http://umbbd.msi.umn.edu/) began in 1995 and now contains information on almost 1200 compounds, over 800 enzymes, almost 1300 reactions and almost 500 microorganism entries. Besides these data, it includes a Biochemical Periodic Table (UM-BPT) and a rule-based Pathway Prediction System (UM-PPS) (http://umbbd.msi.umn.edu/predict/) that predicts plausible pathways for microbial degradation of organic compounds. Currently, the UM-PPS contains 260 biotransformation rules derived from reactions found in the UM-BBD and scientific literature. Public access to UM-BBD data is increasing. UM-BBD compound data are now contributed to PubChem and ChemSpider, the public chemical databases. A new mirror website of the UM-BBD, UM-BPT and UM-PPS is being developed at ETH Zürich to improve speed and reliability of online access from anywhere in the world.

  5. Publicly available database for spectral line measurements of SDSS DR7 galaxies

    NASA Astrophysics Data System (ADS)

    Oh, Kyuseok; Sarzi, Marc; Schawinski, Kevin; Yi, Sukyoung K.

    2012-08-01

    We present a new database of absorption and emission-line measurements based on the Sloan Digital Sky Survey (SDSS) 7th data release of galaxies within a redshift of 0.2. Using the publicly available penalized pixel-fitting (pPXF) and gas and absorption line fitting (gandalf) codes, our work improves the existing measurements of stellar kinematics, the strength of various absorption line features, and the flux and width of the emissions from different species of ionised gas. Most notably, we provide the quality of the fit so that the reliability of the measurements can be assessed. The quality assessment can be highly effective for finding new classes of objects. For example, based on the quality assessment around the Hα and [NII] nebular lines, we found that approximately 1% of the SDSS spectra classified as galaxies by the SDSS pipeline are in fact type I Seyfert AGN. This paper presents a summary of the recent paper, Oh et al. (2011). The database is publicly available at http://gem.yonsei.ac.kr/ossy/.

  6. MitoAge: a database for comparative analysis of mitochondrial DNA, with a special focus on animal longevity

    PubMed Central

    Toren, Dmitri; Barzilay, Thomer; Tacutu, Robi; Lehmann, Gilad; Muradian, Khachik K.; Fraifeld, Vadim E.

    2016-01-01

    Mitochondria are the only organelles in the animal cells that have their own genome. Due to a key role in energy production, generation of damaging factors (ROS, heat), and apoptosis, mitochondria and mtDNA in particular have long been considered one of the major players in the mechanisms of aging, longevity and age-related diseases. The rapidly increasing number of species with fully sequenced mtDNA, together with accumulated data on longevity records, provides a new fascinating basis for comparative analysis of the links between mtDNA features and animal longevity. To facilitate such analyses and to support the scientific community in carrying these out, we developed the MitoAge database containing calculated mtDNA compositional features of the entire mitochondrial genome, mtDNA coding (tRNA, rRNA, protein-coding genes) and non-coding (D-loop) regions, and codon usage/amino acids frequency for each protein-coding gene. MitoAge includes 922 species with fully sequenced mtDNA and maximum lifespan records. The database is available through the MitoAge website (www.mitoage.org or www.mitoage.info), which provides the necessary tools for searching, browsing, comparing and downloading the data sets of interest for selected taxonomic groups across the Kingdom Animalia. The MitoAge website assists in statistical analysis of different features of the mtDNA and their correlative links to longevity. PMID:26590258
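
    The kinds of compositional features this abstract mentions (base composition and codon usage) are straightforward to compute from a sequence. The sketch below is a generic illustration with hypothetical function names and toy sequences; it is not the MitoAge pipeline, and it ignores details such as ambiguous bases beyond skipping them and the choice of mitochondrial genetic code.

```python
from collections import Counter


def base_composition(seq):
    """Fractions of A, C, G and T plus GC content; other symbols are skipped."""
    counts = Counter(base for base in seq.upper() if base in "ACGT")
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no A/C/G/T bases")
    freqs = {base: counts[base] / total for base in "ACGT"}
    freqs["GC"] = freqs["G"] + freqs["C"]
    return freqs


def codon_usage(cds):
    """Count codons in a coding sequence, reading in frame from position 0.

    Trailing bases that do not form a complete codon are dropped.
    """
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3
    return Counter(cds[i:i + 3] for i in range(0, usable, 3))


if __name__ == "__main__":
    toy_region = "ATGCGC"        # toy sequence, not real mtDNA
    toy_cds = "ATGAAAATGTA"      # last two bases are an incomplete codon
    print(base_composition(toy_region))
    print(codon_usage(toy_cds))
```

    Features computed this way per species could then be compared against maximum lifespan records, which is the type of correlative analysis the database is described as supporting.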

  7. The barley EST DNA Replication and Repair Database (bEST-DRRD) as a tool for the identification of the genes involved in DNA replication and repair

    PubMed Central

    2012-01-01

    Background The high level of conservation of genes that regulate DNA replication and repair indicates that they may serve as a source of information on the origin and evolution of the species and makes them a reliable system for the identification of cross-species homologs. Studies that had been conducted to date shed light on the processes of DNA replication and repair in bacteria, yeast and mammals. However, there is still much to be learned about the process of DNA damage repair in plants. Description These studies, which were conducted mainly using bioinformatics tools, enabled the list of genes that participate in various pathways of DNA repair in Arabidopsis thaliana (L.) Heynh to be outlined; however, information regarding these mechanisms in crop plants is still very limited. A similar, functional approach is particularly difficult for a species whose complete genomic sequences are still unavailable. One of the solutions is to apply ESTs (Expressed Sequence Tags) as the basis for gene identification. For the construction of the barley EST DNA Replication and Repair Database (bEST-DRRD), presented here, the Arabidopsis nucleotide and protein sequences involved in DNA replication and repair were used to browse for and retrieve the deposited sequences, derived from four barley (Hordeum vulgare L.) sequence databases, including the “Barley Genome version 0.05” database (encompassing ca. 90% of barley coding sequences) and from two databases covering the complete genomes of two monocot models: Oryza sativa L. and Brachypodium distachyon L. in order to identify homologous genes. Sequences of the categorised Arabidopsis queries are used for browsing the repositories, which are located on the ViroBLAST platform. The bEST-DRRD is currently used in our project during the identification and validation of the barley genes involved in DNA repair. Conclusions The presented database provides information about the Arabidopsis genes involved in DNA replication and

  8. A public image database to support research in computer aided diagnosis.

    PubMed

    Reeves, A P; Biancardi, A M; Yankelevitz, D; Fotin, S; Keller, B M; Jirapatnakul, A; Lee, J

    2009-01-01

    The Public Lung Database to address drug response (PLD) has been developed to support research in computer aided diagnosis (CAD). Originally established for applications involving the characterization of pulmonary nodules, the PLD has been augmented to provide initial datasets for CAD research of other diseases. In general, the best performance for a CAD system is achieved when it is trained with a large amount of well documented data. Such training databases are very expensive to create and their lack of general availability limits the targets that can be considered for CAD applications and hampers development of the CAD field. The approach taken with the PLD has been to make available small datasets together with both manual and automated documentation. Furthermore, datasets with special properties are provided either to span the range of task complexity or to provide small change repeat images for direct calibration and evaluation of CAD systems. This resource offers a starting point for other research groups wishing to pursue CAD research in new directions. It also provides an on-line reference for better defining the issues relating to specific CAD tasks.

  9. Diagnostic Interpretation of Array Data Using Public Databases and Internet Sources

    PubMed Central

    de Leeuw, Nicole; Dijkhuizen, Trijnie; Hehir-Kwa, Jayne Y.; Carter, Nigel P.; Feuk, Lars; Firth, Helen V.; Kuhn, Robert M.; Ledbetter, David H.; Martin, Christa Lese; van Ravenswaaij-Arts, Conny M. A.; Scherer, Steven W.; Shams, Soheil; Van Vooren, Steven; Sijmons, Rolf; Swertz, Morris; Hastings, Ros

    2016-01-01

    The range of commercially available array platforms and analysis software packages is expanding and their utility is improving, making reliable detection of copy-number variants (CNVs) relatively straightforward. Reliable interpretation of CNV data, however, is often difficult and requires expertise. With our knowledge of the human genome growing rapidly, applications for array testing continuously broadening, and the resolution of CNV detection increasing, this leads to great complexity in interpreting what can be daunting data. Correct CNV interpretation and optimal use of the genotype information provided by single-nucleotide polymorphism probes on an array depends largely on knowledge present in various resources. In addition to the availability of host laboratories’ own datasets and national registries, there are several public databases and Internet resources with genotype and phenotype information that can be used for array data interpretation. With so many resources now available, it is important to know which are fit-for-purpose in a diagnostic setting. We summarize the characteristics of the most commonly used Internet databases and resources, and propose a general data interpretation strategy that can be used for comparative hybridization, comparative intensity, and genotype-based array data. PMID:26285306

  10. DNA banking and DNA databanking: Legal, ethical, and public policy issues. Progress report, [April 1, 1993--March 31, 1994]

    SciTech Connect

    Reilly, P.R.; McEwen, J.E.; Small, D.

    1994-02-18

    The purpose of the grant was to provide support to enable us to: (1) perform legal and empirical research and critically analyze DNA banking and DNA databanking as those activities are conducted by state forensic laboratories, the military, academic researchers, and commercial enterprises; and (2) develop a broadcast quality educational videotape for viewing by the general public about DNA technology and the privacy and related issues that it raises. The grant thus has both a research and analysis component and a public education component. This report outlines the work completed since the inception of the project and describes the activities still in progress.

  11. Public Perceptions and Expectations of the Forensic Use of DNA: Results of a Preliminary Study

    ERIC Educational Resources Information Center

    Curtis, Cate

    2009-01-01

    The forensic use of Deoxyribonucleic Acid (DNA) is demonstrating significant success as a crime-solving tool. However, numerous concerns have been raised regarding the potential for DNA use to contravene cultural, ethical, and legal codes. In this article the expectations and level of knowledge of the New Zealand public of the DNA data-bank and…

  12. SBMDb: first whole genome putative microsatellite DNA marker database of sugarbeet for bioenergy and industrial applications.

    PubMed

    Iquebal, Mir Asif; Jaiswal, Sarika; Angadi, U B; Sablok, Gaurav; Arora, Vasu; Kumar, Sunil; Rai, Anil; Kumar, Dinesh

    2015-01-01

    DNA markers play an important role as valuable tools for increasing crop productivity by finding plausible answers to genetic variation and linking Quantitative Trait Loci (QTL) to beneficial traits. Prior approaches to the development of Short Tandem Repeat (STR) markers were time consuming and inefficient. Recent methods that develop STR markers from whole-genome or transcriptome data have gained wide importance, with immense potential for developing breeding and cultivar improvement approaches. The availability of whole genome sequences and in silico approaches has revolutionized bulk marker discovery. We report the world's first sugarbeet whole-genome marker discovery, with 145 K markers along with 5 K functional domain markers unified in a common platform, SBMDb, built using MySQL, Apache and PHP. Embedded markers and the corresponding location information can be selected for a desired chromosome or location/interval, and primers can be generated using the Primer3 core, integrated at the backend. Our analyses revealed an abundance of 'mono' repeats (76.82%) over 'di' repeats (13.68%). The highest density (671.05 markers/Mb) was found in chromosome 1 and the lowest density (341.27 markers/Mb) in chromosome 6. The current investigation of sugarbeet genome marker density has direct implications for increasing mapping marker density. This will enable refinement of the present linkage map, which has a marker distance of ∼2 cM, i.e. from 200 to 2.6 Kb, thus facilitating QTL/gene mapping. We also report e-PCR-based detection of 2027 polymorphic markers in a panel of five genotypes. These markers can be used for DUS testing in variety identification and for MAS/GAS in variety improvement programs. The present database offers a wide source of potential markers for developing and implementing new approaches to molecular breeding required to accelerate industrial use of this crop, especially for sugar, health care products, medicines and color dye. Identified markers will also help in improvement of bioenergy trait of

  14. SBMDb: first whole genome putative microsatellite DNA marker database of sugarbeet for bioenergy and industrial applications

    PubMed Central

    Iquebal, Mir Asif; Jaiswal, Sarika; Angadi, U.B.; Sablok, Gaurav; Arora, Vasu; Kumar, Sunil; Rai, Anil; Kumar, Dinesh

    2015-01-01

    DNA markers play an important role as valuable tools for increasing crop productivity by finding plausible answers to genetic variation and linking Quantitative Trait Loci (QTL) to beneficial traits. Prior approaches to the development of Short Tandem Repeat (STR) markers were time consuming and inefficient. Recent methods that develop STR markers from whole-genome or transcriptome data have gained wide importance, with immense potential for developing breeding and cultivar improvement approaches. The availability of whole genome sequences and in silico approaches has revolutionized bulk marker discovery. We report the world’s first sugarbeet whole-genome marker discovery, with 145 K markers along with 5 K functional domain markers unified in a common platform, SBMDb, built using MySQL, Apache and PHP. Embedded markers and the corresponding location information can be selected for a desired chromosome or location/interval, and primers can be generated using the Primer3 core, integrated at the backend. Our analyses revealed an abundance of ‘mono’ repeats (76.82%) over ‘di’ repeats (13.68%). The highest density (671.05 markers/Mb) was found in chromosome 1 and the lowest density (341.27 markers/Mb) in chromosome 6. The current investigation of sugarbeet genome marker density has direct implications for increasing mapping marker density. This will enable refinement of the present linkage map, which has a marker distance of ∼2 cM, i.e. from 200 to 2.6 Kb, thus facilitating QTL/gene mapping. We also report e-PCR-based detection of 2027 polymorphic markers in a panel of five genotypes. These markers can be used for DUS testing in variety identification and for MAS/GAS in variety improvement programs. The present database offers a wide source of potential markers for developing and implementing new approaches to molecular breeding required to accelerate industrial use of this crop, especially for sugar, health care products, medicines and color dye. Identified markers will also help in improvement of bioenergy trait
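
    The in silico discovery of 'mono' and 'di' repeat markers and the markers/Mb densities quoted in this abstract can be illustrated with a minimal Python sketch. It is not the SBMDb pipeline: the regular-expression approach, the function names and the minimum repeat thresholds (10 for mono, 6 for di) are assumptions chosen only for illustration.

```python
import re


def find_ssrs(seq, min_repeats=((1, 10), (2, 6))):
    """Find simple sequence repeats (microsatellites) in a DNA string.

    min_repeats holds (unit_length, minimum_copies) pairs; the defaults are
    illustrative assumptions. Returns sorted (start, unit, copies) tuples.
    """
    seq = seq.upper()
    hits = []
    for unit_len, min_copies in min_repeats:
        # One unit followed by at least (min_copies - 1) further copies of it.
        pattern = re.compile(
            r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_copies - 1)
        )
        for match in pattern.finditer(seq):
            unit = match.group(2)
            # Skip units such as "AA", which are really mono repeats.
            if unit_len > 1 and len(set(unit)) == 1:
                continue
            copies = len(match.group(1)) // unit_len
            hits.append((match.start(), unit, copies))
    return sorted(hits)


def density_per_mb(n_markers, length_bp):
    """Marker density expressed as markers per megabase."""
    return n_markers / (length_bp / 1e6)
```

    For example, `find_ssrs("GG" + "A" * 12 + "CC" + "AT" * 7 + "G")` reports one mono repeat and one di repeat. A production tool would also handle longer motifs, compound repeats and ambiguous bases.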

  15. Documentation for the U.S. Geological Survey Public-Supply Database (PSDB): a database of permitted public-supply wells, surface-water intakes, and systems in the United States

    USGS Publications Warehouse

    Price, Curtis V.; Maupin, Molly A.

    2014-01-01

    The purpose of this report is to document the PSDB and explain the methods used to populate and update the data from the SDWIS, State datasets, and map and geospatial imagery. This report describes 3 data tables and 11 domain tables, including field contents, data sources, and relations between tables. Although the PSDB database is not available to the general public, this information should be useful for others who are developing other database systems to store and analyze public-supply system and facility data.

  16. MethBank: a database integrating next-generation sequencing single-base-resolution DNA methylation programming data.

    PubMed

    Zou, Dong; Sun, Shixiang; Li, Rujiao; Liu, Jiang; Zhang, Jing; Zhang, Zhang

    2015-01-01

    DNA methylation plays crucial roles during embryonic development. Here we present MethBank (http://dnamethylome.org), a DNA methylome programming database that integrates the genome-wide single-base nucleotide methylomes of gametes and early embryos in different model organisms. Unlike extant relevant databases, MethBank incorporates the whole-genome single-base-resolution methylomes of gametes and early embryos at multiple developmental stages in zebrafish and mouse. MethBank allows users to retrieve methylation levels, differentially methylated regions, CpG islands, gene expression profiles and genetic polymorphisms for a specific gene or genomic region. Moreover, it offers a methylome browser that can visualize high-resolution DNA methylation profiles and other related data in an interactive manner, and is thus of great help to users investigating methylation patterns and changes in gametes and early embryos at different developmental stages. Ongoing efforts are focused on the incorporation of methylomes and related data from other organisms. Together, MethBank features integration and visualization of high-resolution DNA methylation data as well as other related data, enabling identification of potential DNA methylation signatures at different developmental stages and accordingly providing an important resource for epigenetic and developmental studies. PMID:25294826

  17. Familial searching: a specialist forensic DNA profiling service utilising the National DNA Database to identify unknown offenders via their relatives--the UK experience.

    PubMed

    Maguire, C N; McCallum, L A; Storey, C; Whitaker, J P

    2014-01-01

    The National DNA Database (NDNAD) of England and Wales was established on April 10th 1995. The NDNAD is governed by a variety of legislative instruments that mean that DNA samples can be taken if an individual is arrested and detained in a police station. The biological samples and the DNA profiles derived from them can be used for purposes related to the prevention and detection of crime, the investigation of an offence and for the conduct of a prosecution. Following the South East Asian Tsunami of December 2004, the legislation was amended to allow the use of the NDNAD to assist in the identification of a deceased person or of a body part where death has occurred from natural causes or from a natural disaster. The UK NDNAD now contains the DNA profiles of approximately 6 million individuals representing 9.6% of the UK population. As the science of DNA profiling advanced, the National DNA Database provided a potential resource for increased intelligence beyond the direct matching for which it was originally created. The familial searching service offered to the police by several UK forensic science providers exploits the size and geographic coverage of the NDNAD and the fact that close relatives of an offender may share a significant proportion of that offender's DNA profile and will often reside in close geographic proximity to him or her. Between 2002 and 2011 Forensic Science Service Ltd. (FSS) provided familial search services to support 188 police investigations, 70 of which are still active cases. This technique, which may be used in serious crime cases or in 'cold case' reviews when there are few or no investigative leads, has led to the identification of 41 perpetrators or suspects. In this paper we discuss the processes, utility, and governance of the familial search service in which the NDNAD is searched for close genetic relatives of an offender who has left DNA evidence at a crime scene, but whose DNA profile is not represented within the NDNAD. We
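
    The idea that close relatives share a significant proportion of an offender's DNA profile can be illustrated with a minimal allele-sharing count in Python. This is a sketch only and not the FSS familial search procedure, which the abstract does not specify; the function name and the example loci and allele values are hypothetical.

```python
from collections import Counter


def shared_alleles(profile_a, profile_b):
    """Count alleles shared between two STR profiles.

    Each profile maps a locus name to a pair of allele repeat numbers.
    Only loci present in both profiles are compared, and at each locus
    the multiset intersection of the two genotypes is counted (0, 1 or 2).
    """
    total = 0
    for locus in profile_a.keys() & profile_b.keys():
        overlap = Counter(profile_a[locus]) & Counter(profile_b[locus])
        total += sum(overlap.values())
    return total
```

    A candidate list could then be ranked by this count. A parent and child are expected to share at least one allele at every locus, barring mutation, which is why such counts carry information about relatedness; real services may rely on likelihood ratios and geographic filtering instead.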

  18. [Terminology used in publications of pharmacoepidemiological research in France using health insurance reimbursement databases: need for harmonisation].

    PubMed

    Martin-Latry, Karin; Cougnard, Audrey

    2010-01-01

    The reimbursement databases of the French health insurance systems are widely used for pharmacoepidemiological research. However, the terminology used to describe them in subsequent articles and abstracts varies greatly, which leads to identification problems during bibliographic research or during indexing in MEDLINE. In this article we have reviewed the terminology used and propose both a terminology and appropriate MeSH terms for future indexing. Fifty-six published studies were included. At least six different root terms were found to define the French health insurance system, 64.3% of the publications mentioned the term "database", and 30.4% mentioned the term "reimbursement". We propose that abstracts of future articles contain the three terms: database, reimbursement, and health insurance. We also propose to include in the keywords of an article the MeSH terms that are most appropriate to define these three concepts: Insurance, Health, Reimbursement and Databases, Factual.

  19. SkyDOT: a publicly accessible variability database, containing multiple sky surveys and real-time data

    SciTech Connect

    Starr, D. L.; Wozniak, P. R.; Vestrand, W. T.

    2002-01-01

    SkyDOT (Sky Database for Objects in Time-Domain) is a Virtual Observatory currently comprised of data from the RAPTOR, ROTSE I, and OGLE II survey projects. This makes it a very large time domain database. In addition, the RAPTOR project provides SkyDOT with real-time variability data as well as stereoscopic information. With its web interface, we believe SkyDOT will be a very useful tool for both astronomers and the public. Our main task has been to construct an efficient relational database containing all existing data, while handling a real-time inflow of data. We also provide a useful web interface allowing easy access for both astronomers and the public. Initially, this server will allow common searches, specific queries, and access to light curves. In the future we will include machine learning classification tools and access to spectral information.

  20. 76 FR 77533 - Notice of Order: Revisions to Enterprise Public Use Database Incorporating High-Cost Single...

    Federal Register 2010, 2011, 2012, 2013, 2014

    2011-12-13

    ... September 28, 2011 at 76 FR 60031, regarding FHFA's adoption of an Order revising FHFA's Public Use Database... Mac). The SUPPLEMENTARY INFORMATION in the Notice of Order stated that, based on data reported by Fannie Mae and Freddie Mac, in 2010, Freddie Mac did not purchase and securitize any first mortgages...

  1. Complementary Value of Databases for Discovery of Scholarly Literature: A User Survey of Online Searching for Publications in Art History

    ERIC Educational Resources Information Center

    Nemeth, Erik

    2010-01-01

    Discovery of academic literature through Web search engines challenges the traditional role of specialized research databases. Creation of literature outside academic presses and peer-reviewed publications expands the content for scholarly research within a particular field. The resulting body of literature raises the question of whether scholars…

  2. Public-database enabled analysis of Lagrangian dynamics of isotropic turbulence near the Vieillefosse tail

    NASA Astrophysics Data System (ADS)

    Yu, Huidan; Meneveau, Charles

    2010-11-01

    We study the Lagrangian time evolution of velocity gradient dynamics near the Vieillefosse tail. The data are obtained from fluid particle tracking through the $1024^4$ space-time DNS of forced isotropic turbulence at $Re_\lambda = 433$, using a web-based public database (http://turbulence.pha.jhu.edu). Examination of individual time-series of velocity gradient invariants R and Q shows that they are punctuated by strong peaks of negative Q and positive R. Most of these occur very close to the Vieillefosse tail along $Q = -(3/2^{2/3})\,R^{2/3}$. It is found there that the magnitude of the pressure Hessian has a positive Lagrangian time-derivative, meaning that it increases in order to resist the rapid growth. We also observe a "phase delay" of the pressure Hessian signals compared to those of R and Q, indicative of an "overshoot" of the controlling mechanism. We also examine the trajectories in the recently proposed 3-D extension of the R-Q plane (see Lüthi B, Holzner M, Tsinober A. 2009, J. Fluid Mech. 641, 497-507). Finally, Lagrangian models of the velocity gradient tensor are examined in the same light to identify similarities and differences with the observed dynamics. Such comparisons supply informative guidance to model improvements.
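
    The invariants R and Q and the Vieillefosse tail named in this abstract can be made concrete with a short Python sketch. It assumes the standard definitions for a traceless (incompressible) velocity gradient tensor A, namely Q = -tr(A^2)/2 and R = -tr(A^3)/3, for which the tail Q = -(3/2^(2/3)) R^(2/3) with R > 0 is the zero set of 27R^2/4 + Q^3. The function names are hypothetical and the code does not query the turbulence database.

```python
def matmul3(a, b):
    """Multiply two 3x3 matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
        for i in range(3)
    ]


def trace3(a):
    """Trace of a 3x3 matrix."""
    return a[0][0] + a[1][1] + a[2][2]


def invariants(grad_u):
    """Return (Q, R) for a traceless 3x3 velocity gradient tensor."""
    a2 = matmul3(grad_u, grad_u)
    a3 = matmul3(a2, grad_u)
    q = -0.5 * trace3(a2)
    r = -trace3(a3) / 3.0
    return q, r


def vieillefosse_q(r):
    """Q on the Vieillefosse tail for a given R >= 0."""
    if r < 0:
        raise ValueError("the tail discussed here is the R >= 0 branch")
    return -(3.0 / 2.0 ** (2.0 / 3.0)) * r ** (2.0 / 3.0)


def discriminant(q, r):
    """27 R^2 / 4 + Q^3, which vanishes on the Vieillefosse line."""
    return 27.0 / 4.0 * r * r + q ** 3
```

    As a check, the tensor diag(1, 1, -2) gives Q = -3 and R = 2, which lies exactly on the tail. Along a particle trajectory, the distance of (R, Q) from this curve could be monitored with `discriminant`.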

  3. BindingDB in 2015: A public database for medicinal chemistry, computational chemistry and systems pharmacology.

    PubMed

    Gilson, Michael K; Liu, Tiqing; Baitaluk, Michael; Nicola, George; Hwang, Linda; Chong, Jenny

    2016-01-01

    BindingDB, www.bindingdb.org, is a publicly accessible database of experimental protein-small molecule interaction data. Its collection of over a million data entries derives primarily from scientific articles and, increasingly, US patents. BindingDB provides many ways to browse and search for data of interest, including an advanced search tool, which can cross searches of multiple query types, including text, chemical structure, protein sequence and numerical affinities. The PDB and PubMed provide links to data in BindingDB, and vice versa; and BindingDB provides links to pathway information, the ZINC catalog of available compounds, and other resources. The BindingDB website offers specialized tools that take advantage of its large data collection, including ones to generate hypotheses for the protein targets bound by a bioactive compound, and for the compounds bound by a new protein of known sequence; and virtual compound screening by maximal chemical similarity, binary kernel discrimination, and support vector machine methods. Specialized data sets are also available, such as binding data for hundreds of congeneric series of ligands, drawn from BindingDB and organized for use in validating drug design methods. BindingDB offers several forms of programmatic access, and comes with extensive background material and documentation. Here, we provide the first update of BindingDB since 2007, focusing on new and unique features and highlighting directions of importance to the field as a whole.

  4. BindingDB in 2015: A public database for medicinal chemistry, computational chemistry and systems pharmacology

    PubMed Central

    Gilson, Michael K.; Liu, Tiqing; Baitaluk, Michael; Nicola, George; Hwang, Linda; Chong, Jenny

    2016-01-01

    BindingDB, www.bindingdb.org, is a publicly accessible database of experimental protein-small molecule interaction data. Its collection of over a million data entries derives primarily from scientific articles and, increasingly, US patents. BindingDB provides many ways to browse and search for data of interest, including an advanced search tool, which can cross searches of multiple query types, including text, chemical structure, protein sequence and numerical affinities. The PDB and PubMed provide links to data in BindingDB, and vice versa; and BindingDB provides links to pathway information, the ZINC catalog of available compounds, and other resources. The BindingDB website offers specialized tools that take advantage of its large data collection, including ones to generate hypotheses for the protein targets bound by a bioactive compound, and for the compounds bound by a new protein of known sequence; and virtual compound screening by maximal chemical similarity, binary kernel discrimination, and support vector machine methods. Specialized data sets are also available, such as binding data for hundreds of congeneric series of ligands, drawn from BindingDB and organized for use in validating drug design methods. BindingDB offers several forms of programmatic access, and comes with extensive background material and documentation. Here, we provide the first update of BindingDB since 2007, focusing on new and unique features and highlighting directions of importance to the field as a whole. PMID:26481362

  5. Potential translational targets revealed by linking mouse grooming behavioral phenotypes to gene expression using public databases.

    PubMed

    Roth, Andrew; Kyzar, Evan J; Cachat, Jonathan; Stewart, Adam Michael; Green, Jeremy; Gaikwad, Siddharth; O'Leary, Timothy P; Tabakoff, Boris; Brown, Richard E; Kalueff, Allan V

    2013-01-10

    Rodent self-grooming is an important, evolutionarily conserved behavior, highly sensitive to pharmacological and genetic manipulations. Mice with aberrant grooming phenotypes are currently used to model various human disorders. Therefore, it is critical to understand the biology of grooming behavior, and to assess its translational validity to humans. The present in-silico study used publicly available gene expression and behavioral data obtained from several inbred mouse strains in the open-field, light-dark box, elevated plus- and elevated zero-maze tests. As grooming duration differed between strains, our analysis revealed several candidate genes with significant correlations between gene expression in the brain and grooming duration. The Allen Brain Atlas, STRING, GoMiner and Mouse Genome Informatics databases were used to functionally map and analyze these candidate mouse genes against their human orthologs, assessing the strain ranking of their expression and the regional distribution of expression in the mouse brain. This allowed us to identify an interconnected network of candidate genes (which have expression levels that correlate with grooming behavior), display altered patterns of expression in key brain areas related to grooming, and underlie important functions in the brain. Collectively, our results demonstrate the utility of large-scale, high-throughput data-mining and in-silico modeling for linking genomic and behavioral data, as well as their potential to identify novel neural targets for complex neurobehavioral phenotypes, including grooming.

  6. Construction of a public CHO cell line transcript database using versatile bioinformatics analysis pipelines.

    PubMed

    Rupp, Oliver; Becker, Jennifer; Brinkrolf, Karina; Timmermann, Christina; Borth, Nicole; Pühler, Alfred; Noll, Thomas; Goesmann, Alexander

    2014-01-01

    Chinese hamster ovary (CHO) cell lines represent the most commonly used mammalian expression system for the production of therapeutic proteins. In this context, detailed knowledge of the CHO cell transcriptome might help to improve biotechnological processes conducted by specific cell lines. Nevertheless, very few assembled cDNA sequences of CHO cells were publicly released until recently, which puts a severe limitation on biotechnological research. Two extended annotation systems and web-based tools, one for browsing eukaryotic genomes (GenDBE) and one for viewing eukaryotic transcriptomes (SAMS), were established as the first step towards a publicly usable CHO cell genome/transcriptome analysis platform. This is complemented by the development of a new strategy to assemble the ca. 100 million reads, sequenced from a broad range of diverse transcripts, to a high quality CHO cell transcript set. The cDNA libraries were constructed from different CHO cell lines grown under various culture conditions and sequenced using Roche/454 and Illumina sequencing technologies in addition to sequencing reads from a previous study. Two pipelines to extend and improve the CHO cell line transcripts were established. First, de novo assemblies were carried out with the Trinity and Oases assemblers, using varying k-mer sizes. The resulting contigs were screened for potential CDS using ESTScan. Redundant contigs were filtered out using cd-hit-est. The remaining CDS contigs were re-assembled with CAP3. Second, a reference-based assembly with the TopHat/Cufflinks pipeline was performed, using the recently published draft genome sequence of CHO-K1 as reference. Additionally, the de novo contigs were mapped to the reference genome using GMAP and merged with the Cufflinks assembly using the cuffmerge software. With this approach 28,874 transcripts located on 16,492 gene loci could be assembled. Combining the results of both approaches, 65,561 transcripts were identified for CHO cell lines

  7. Osteoporosis prevention among chronic glucocorticoid users: results from a public health insurance database

    PubMed Central

    Trijau, Sophie; de Lamotte, Gaëlle; Pradel, Vincent; Natali, François; Allaria-Lapierre, Véronique; Coudert, Hervé; Pham, Thao; Sciortino, Vincent; Lafforgue, Pierre

    2016-01-01

    Introduction Long-term glucocorticoid therapy is the leading cause of secondary osteoporosis. The management of glucocorticoid-induced osteoporosis (GIOP) seems to be inadequate in many European countries. Objective To evaluate the rate of screening and treatment of GIOP. Design Information was collected from a national public health-insurance database in our geographic area of Provence-Alpes-Côte-d'Azur and in Corsica, from September 2009 through August 2011. Patients We identified participants aged 15 years and over starting glucocorticoid therapy (≥7.5 mg of prednisone equivalent per day during at least 90 days consecutive). This cohort was compared with an age-matched and sex-matched population that did not receive glucocorticoids. Main outcome measures Bone mass, prescription of bone antiresorptive medication and use of calcium and/or vitamin D treatment. Results We identified 32 812 patients who were prescribed glucocorticoid therapy, yielding 1% prevalence. Incidence of glucocorticoid therapy was 2.8/1000 inhabitants/year. Males represented 44%, the mean age was 58 years. The median prednisone-equivalent dose was 11 mg/day (IQR 9–18 mg/day). 8% underwent bone mass measurement. Calcium and/or vitamin D, and bisphosphonates were prescribed in 18% and 12%, respectively. Results were lower for the control population: 3% underwent bone mass measurement and 3% received bisphosphonate therapy. The rates of osteodensitometry and treatments were higher in women over 55 years of age than in men and women 55 years of age and younger, and also when glucocorticoid therapy was initiated by a rheumatologist versus other physician specialty. Conclusions The management of GIOP remains very inadequate, despite the availability of a statutory health insurance system. Targeted interventions are needed to improve the management of GIOP. PMID:27486526

  8. Towards standards for data exchange and integration and their impact on a public database such as CEBS (Chemical Effects in Biological Systems)

    SciTech Connect

    Fostel, Jennifer M.

    2008-11-15

    Integration, re-use and meta-analysis of high content study data, typical of DNA microarray studies, can increase its scientific utility. Access to study data and design parameters would enhance the mining of data integrated across studies. However, without standards for which data to include in exchange, and common exchange formats, publication of high content data is time-consuming and often prohibitive. The MGED Society (www.mged.org) was formed in response to the widespread publication of microarray data, and the recognition of the utility of data re-use for meta-analysis. The NIEHS has developed the Chemical Effects in Biological Systems (CEBS) database, which can manage and integrate study data and design from biological and biomedical studies. As community standards are developed for study data and metadata it will become increasingly straightforward to publish high content data in CEBS, where they will be available for meta-analysis. Different exchange formats for study data are being developed: Standard for Exchange of Nonclinical Data (SEND; www.cdisc.org); Tox-ML (www.Leadscope.com) and Simple Investigation Formatted Text (SIFT) from the NIEHS. Data integration can be done at the level of conclusions about responsive genes and phenotypes, and this workflow is supported by CEBS. CEBS also integrates raw and preprocessed data within a given platform. The utility of, and a method for, integrating data within and across DNA microarray studies are shown in an example analysis using DrugMatrix data deposited in CEBS by Iconix Pharmaceuticals.

  9. A two-locus DNA sequence database for identifying host-specific pathogens and phylogenetic diversity within the Fusarium oxysporum species complex

    Technology Transfer Automated Retrieval System (TEKTRAN)

    An electronically portable two-locus DNA sequence database, comprising partial sequences of the translation elongation factor gene (EF-1a, 634 bp alignment) and nearly complete sequences of the nuclear ribosomal intergenic spacer region (IGS rDNA, 2220 bp alignment) for 850 isolates spanning the phy...

  10. The TTSMI database: a catalog of triplex target DNA sites associated with genes and regulatory elements in the human genome

    PubMed Central

    Jenjaroenpun, Piroon; Chew, Chee Siang; Yong, Tai Pang; Choowongkomon, Kiattawee; Thammasorn, Wimada; Kuznetsov, Vladimir A.

    2015-01-01

    A triplex target DNA site (TTS), a stretch of DNA that is composed of polypurines, is able to form a triple-helix (triplex) structure with triplex-forming oligonucleotides (TFOs) and is able to influence the site-specific modulation of gene expression and/or the modification of genomic DNA. The co-localization of a genomic TTS with gene regulatory signals and functional genome structures suggests that TFOs could potentially be exploited in antigene strategies for the therapy of cancers and other genetic diseases. Here, we present the TTS Mapping and Integration (TTSMI; http://ttsmi.bii.a-star.edu.sg) database, which provides a catalog of unique TTS locations in the human genome and tools for analyzing the co-localization of TTSs with genomic regulatory sequences and signals that were identified using next-generation sequencing techniques and/or predicted by computational models. TTSMI was designed as a user-friendly tool that facilitates (i) fast searching/filtering of TTSs using several search terms and criteria associated with sequence stability and specificity, (ii) interactive filtering of TTSs that co-localize with gene regulatory signals and non-B DNA structures, (iii) exploration of dynamic combinations of the biological signals of specific TTSs and (iv) visualization of a TTS simultaneously with diverse annotation tracks via the UCSC genome browser. PMID:25324314
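
    The polypurine criterion described in the abstract can be illustrated with a short scan. This is a minimal sketch and not the TTSMI pipeline: the function name `find_polypurine_tracts` and the `min_len` cut-off of 15 bases are assumptions made for illustration, only the given strand is scanned, and the stability and specificity criteria mentioned in the abstract are not modelled.

```python
import re

def find_polypurine_tracts(seq, min_len=15):
    """Return (start, end, tract) for each run of purines (A/G) of at least min_len bases.

    Coordinates are 0-based and end-exclusive. min_len is a hypothetical
    cut-off chosen for this example, not a value taken from TTSMI.
    """
    pattern = re.compile(r"[AG]{%d,}" % min_len)
    return [(m.start(), m.end(), m.group()) for m in pattern.finditer(seq.upper())]

# Toy sequence: a 16-base purine run flanked by pyrimidines, plus a short 4-base run.
example = "CCTT" + "AGGAGAAGGGAGAAGA" + "TCTC" + "AGGA" + "CT"
tracts = find_polypurine_tracts(example, min_len=15)
```

    Only the 16-base run passes the assumed length filter; the 4-base run is ignored. Scanning the reverse complement as well would be needed to cover both strands.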

  11. The TTSMI database: a catalog of triplex target DNA sites associated with genes and regulatory elements in the human genome.

    PubMed

    Jenjaroenpun, Piroon; Chew, Chee Siang; Yong, Tai Pang; Choowongkomon, Kiattawee; Thammasorn, Wimada; Kuznetsov, Vladimir A

    2015-01-01

    A triplex target DNA site (TTS), a stretch of DNA that is composed of polypurines, is able to form a triple-helix (triplex) structure with triplex-forming oligonucleotides (TFOs) and is able to influence the site-specific modulation of gene expression and/or the modification of genomic DNA. The co-localization of a genomic TTS with gene regulatory signals and functional genome structures suggests that TFOs could potentially be exploited in antigene strategies for the therapy of cancers and other genetic diseases. Here, we present the TTS Mapping and Integration (TTSMI; http://ttsmi.bii.a-star.edu.sg) database, which provides a catalog of unique TTS locations in the human genome and tools for analyzing the co-localization of TTSs with genomic regulatory sequences and signals that were identified using next-generation sequencing techniques and/or predicted by computational models. TTSMI was designed as a user-friendly tool that facilitates (i) fast searching/filtering of TTSs using several search terms and criteria associated with sequence stability and specificity, (ii) interactive filtering of TTSs that co-localize with gene regulatory signals and non-B DNA structures, (iii) exploration of dynamic combinations of the biological signals of specific TTSs and (iv) visualization of a TTS simultaneously with diverse annotation tracks via the UCSC genome browser.

  12. Leading-edge forensic DNA analyses and the necessity of including crime scene investigators, police officers and technicians in a DNA elimination database.

    PubMed

    Lapointe, Martine; Rogic, Anita; Bourgoin, Sarah; Jolicoeur, Christine; Séguin, Diane

    2015-11-01

    In recent years, sophisticated technology has significantly increased the sensitivity and analytical power of genetic analyses so that very little starting material may now produce viable genetic profiles. This sensitivity, however, has also increased the risk of detecting unknown genetic profiles that are assumed to be that of the perpetrator yet originate from extraneous sources such as crime scene workers. These contaminants may mislead investigations, keeping criminal cases active and unresolved for long spans of time. Voluntary submission of DNA samples from crime scene workers is fairly low; we have therefore created a promotional method for our staff elimination database that has resulted in a significant increase in voluntary samples since 2011. Our database enforces privacy safeguards and allows for optional anonymity to all staff members. We also offer information sessions at various police precincts to advise crime scene workers of the importance and success of our staff elimination database. This study, a pioneer in its field, has obtained 327 voluntary submissions from crime scene workers to date, of which 46 individual profiles (14%) have been matched to 58 criminal cases. By implementing our methods and respect for individual privacy, forensic laboratories everywhere may see similar growth and success in explaining unidentified genetic profiles in stagnant criminal cases.

  13. Errors in the interpretation of copy number variations due to the use of public databases as a reference.

    PubMed

    Bastida-Lertxundi, Nerea; López-López, Elixabet; Piñán, M Angeles; Puiggros, Anna; Navajas, Aurora; Solé, Francesc; García-Orad, Africa

    2014-04-01

    The identification of new cryptic deletions and duplications can be used to improve prognostic classification in cancer. To obtain accurate results, it is necessary to discriminate between somatic alterations in the tumor cell and germline polymorphisms. For this purpose, copy number variation (CNV) public databases have been used as a reference. Nevertheless, the use of these databases may lead to erroneous results. Our main goal was to explore the limitations of the use of CNV databases, such as the Database of Genomic Variants (DGV), as the reference. To that end, we used pediatric acute lymphoblastic leukemia (ALL) as a model. We analyzed the genome-wide copy number profile of 23 ALL patients and conducted a comparison of the results obtained using the DGV with those obtained using the normal sample from the patient as the reference. Using only the DGV, 19% of alterations and 41% of polymorphisms were erroneously catalogued. Our results support the hypothesis that with the use of databases such as the DGV as the reference, a high percentage of the variations can be erroneously classified. PMID:24767712

  14. Searching for first-degree familial relationships in California's offender DNA database: validation of a likelihood ratio-based approach.

    PubMed

    Myers, Steven P; Timken, Mark D; Piucci, Matthew L; Sims, Gary A; Greenwald, Michael A; Weigand, James J; Konzak, Kenneth C; Buoncristiani, Martin R

    2011-11-01

    A validation study was performed to measure the effectiveness of using a likelihood ratio-based approach to search for possible first-degree familial relationships (full-sibling and parent-child) by comparing an evidence autosomal short tandem repeat (STR) profile to California's ∼1,000,000-profile State DNA Index System (SDIS) database. Test searches used autosomal STR and Y-STR profiles generated for 100 artificial test families. When the test sample and the first-degree relative in the database were characterized at the 15 Identifiler® (Applied Biosystems®, Foster City, CA) STR loci, the search procedure included 96% of the fathers and 72% of the full-siblings. When the relative profile was limited to the 13 Combined DNA Index System (CODIS) core loci, the search procedure included 93% of the fathers and 61% of the full-siblings. These results, combined with those of functional tests using three real families, support the effectiveness of this tool. Based upon these results, the validated approach was implemented as a key, pragmatic and demonstrably practical component of the California Department of Justice's Familial Search Program. An investigative lead created through this process recently led to an arrest in the Los Angeles Grim Sleeper serial murders.
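
    A likelihood ratio for a first-degree relationship can be sketched at a single STR locus using the standard identity-by-descent (IBD) coefficients: (0, 1, 0) for parent-child and (1/4, 1/2, 1/4) for full siblings. This is a textbook illustration under stated assumptions, not the validated California procedure: the function names, the toy allele frequencies, the absence of mutation and population-substructure corrections, and the independence of loci are all assumptions.

```python
PARENT_CHILD = (0.0, 1.0, 0.0)    # IBD coefficients (k0, k1, k2)
FULL_SIBLING = (0.25, 0.5, 0.25)

def locus_lr(g1, g2, freqs, k=PARENT_CHILD):
    """LR of 'related with IBD coefficients k' versus 'unrelated' at one locus.

    g1, g2 are two-allele genotypes; freqs maps allele -> population frequency.
    """
    k0, k1, k2 = k
    c, d = g2
    p_c, p_d = freqs[c], freqs[d]
    p_g2 = p_c * p_c if c == d else 2 * p_c * p_d   # Hardy-Weinberg genotype frequency

    def p_g2_given_shared(a):
        # Probability of g2 when one of its alleles is a copy of allele a
        # and the other is drawn at random from the population.
        if c == d:
            return p_c if a == c else 0.0
        return (p_d if a == c else 0.0) + (p_c if a == d else 0.0)

    p_ibd1 = 0.5 * p_g2_given_shared(g1[0]) + 0.5 * p_g2_given_shared(g1[1])
    same = 1.0 if sorted(g1) == sorted(g2) else 0.0
    return k0 + k1 * p_ibd1 / p_g2 + k2 * same / p_g2

def profile_lr(profile1, profile2, freqs_by_locus, k=PARENT_CHILD):
    """Multiply per-locus LRs, assuming the loci are independent."""
    lr = 1.0
    for locus, g1 in profile1.items():
        lr *= locus_lr(g1, profile2[locus], freqs_by_locus[locus], k)
    return lr

toy_freqs = {"a": 0.1, "b": 0.2, "c": 0.7}   # hypothetical allele frequencies
lr_parent = locus_lr(("a", "b"), ("a", "b"), toy_freqs, PARENT_CHILD)
lr_sibling = locus_lr(("a", "b"), ("a", "b"), toy_freqs, FULL_SIBLING)
```

    In a database search of the kind described, each stored profile would receive such a ratio against the evidence profile and the highest-ranking candidates would be examined further; how candidates are ranked and thresholded in practice is not specified here.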

  15. UnoViS: the MedIT public unobtrusive vital signs database.

    PubMed

    Wartzek, Tobias; Czaplik, Michael; Antink, Christoph Hoog; Eilebrecht, Benjamin; Walocha, Rafael; Leonhardt, Steffen

    2015-01-01

    While PhysioNet is a large database for standard clinical vital signs measurements, such a database does not exist for unobtrusively measured signals. This inhibits progress in the vital area of signal processing for unobtrusive medical monitoring as not everybody owns the specific measurement systems to acquire signals. Furthermore, if no common database exists, a comparison between different signal processing approaches is not possible. This gap will be closed by our UnoViS database. It contains different recordings in various scenarios ranging from a clinical study to measurements obtained while driving a car. Currently, 145 records with a total of 16.2 h of measurement data are available, which are provided as MATLAB files or in the PhysioNet WFDB file format. In its initial state, only (multichannel) capacitive ECG and unobtrusive PPG signals are included, together with a reference ECG. All ECG signals contain annotations by a peak detector and by a medical expert. A dataset from a clinical study contains further clinical annotations. Additionally, supplementary functions are provided, which simplify the usage of the database and thus the development and evaluation of new algorithms. The development of urgently needed methods for very robust parameter extraction or robust signal fusion in view of frequent severe motion artifacts in unobtrusive monitoring is now possible with the database.

  16. Data on publications, structural analyses, and queries used to build and utilize the AlloRep database.

    PubMed

    Sousa, Filipa L; Parente, Daniel J; Hessman, Jacob A; Chazelle, Allen; Teichmann, Sarah A; Swint-Kruse, Liskin

    2016-09-01

    The AlloRep database (www.AlloRep.org) (Sousa et al., 2016) [1] compiles extensive sequence, mutagenesis, and structural information for the LacI/GalR family of transcription regulators. Sequence alignments are presented for >3000 proteins in 45 paralog subfamilies and as a subsampled alignment of the whole family. Phenotypic and biochemical data on almost 6000 mutants have been compiled from an exhaustive search of the literature; citations for these data are included herein. These data include information about oligomerization state, stability, DNA binding and allosteric regulation. Protein structural data for 65 proteins are presented as easily-accessible, residue-contact networks. Finally, this article includes example queries to enable the use of the AlloRep database. See the related article, "AlloRep: a repository of sequence, structural and mutagenesis data for the LacI/GalR transcription regulators" (Sousa et al., 2016) [1]. PMID:27508249

  17. The low-template-DNA (stochastic) threshold--its determination relative to risk analysis for national DNA databases.

    PubMed

    Gill, Peter; Puch-Solis, Roberto; Curran, James

    2009-03-01

    Although the low-template or stochastic threshold is in widespread use and is typically set to 150-200 rfu peak height, there has been no consideration on its determination and meaning. In this paper we propose a definition that is based upon the specific risk of wrongful designation of a heterozygous genotype as a homozygote which could lead to a false exclusion. Conversely, it is possible that a homozygote {a,a} could be designated as {a,F} where 'F' is a 'wild card', and this could lead to increased risk of false inclusion. To determine these risk levels, we analysed an experimental dataset that exhibited extreme drop-out using logistic regression. The derived probabilities are employed in a graphical model to determine the relative risks of wrongful designations that may cause false inclusions and exclusions. The methods described in this paper provide a preliminary solution of risk evaluation for any DNA process that employs a stochastic threshold.
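
    The link between a logistic drop-out model and a threshold can be sketched as follows. This is an illustration only: the coefficients `B0` and `B1` are hypothetical and not taken from the paper, and the choice of the surviving allele's peak height (in rfu) as the explanatory variable is an assumption.

```python
import math

# Hypothetical coefficients for illustration only; real values must be
# fitted to a laboratory's own drop-out data.
B0 = 4.0     # intercept (assumed)
B1 = -0.04   # slope per rfu of peak height (assumed)

def dropout_probability(height_rfu, b0=B0, b1=B1):
    """Logistic model: P(drop-out) = 1 / (1 + exp(-(b0 + b1 * height)))."""
    return 1.0 / (1.0 + math.exp(-(b0 + b1 * height_rfu)))

def height_for_risk(risk, b0=B0, b1=B1):
    """Peak height at which the modelled drop-out probability equals `risk`."""
    if not 0.0 < risk < 1.0:
        raise ValueError("risk must be strictly between 0 and 1")
    logit = math.log(risk / (1.0 - risk))
    return (logit - b0) / b1

# Peak height at which the modelled drop-out risk falls to 1%.
threshold = height_for_risk(0.01)
```

    Under this sketch, choosing a threshold amounts to choosing an acceptable drop-out risk and inverting the fitted curve, which is one way to read the risk-based definition proposed in the abstract.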

  18. The Human Transcript Database: A Catalogue of Full Length cDNA Inserts

    SciTech Connect

    Bouck, John; McLeod, Michael; Worley, Kim; Gibbs, Richard

    1999-09-10

    The BCM Search Launcher provided improved access to web-based sequence analysis services during the granting period and beyond. The Search Launcher web site grouped analysis procedures by function and provided default parameters that gave reasonable search results for most applications. For instance, most queries were automatically masked for repeat sequences prior to sequence database searches to avoid spurious matches. In addition to the web-based access and arrangements that made using the functions easier, the BCM Search Launcher provided unique value-added applications like the BEAUTY sequence database search tool that combined information about protein domains and sequence database search results to give an enhanced, more complete picture of the reliability and relative value of the information reported. This enhanced search tool made evaluating search results more straightforward and consistent. Some of the favorite features of the web site are the sequence utilities and the batch client functionality that allows processing of multiple samples from the command line interface. One measure of the success of the BCM Search Launcher is the number of sites that have adopted the models first developed on the site. The graphic display on the BLAST search from the NCBI web site is one such outgrowth, as is the display of protein domain search results within BLAST search results, and the design of the Biology Workbench application. The logs of usage and comments from users confirm the great utility of this resource.

  19. Database and online map service on unstable rock slopes in Norway - From data perpetuation to public information

    NASA Astrophysics Data System (ADS)

    Oppikofer, Thierry; Nordahl, Bobo; Bunkholt, Halvor; Nicolaisen, Magnus; Jarna, Alexandra; Iversen, Sverre; Hermanns, Reginald L.; Böhme, Martina; Yugsi Molina, Freddy X.

    2015-11-01

    The unstable rock slope database is developed and maintained by the Geological Survey of Norway as part of the systematic mapping of unstable rock slopes in Norway. This mapping aims to detect catastrophic rock slope failures before they occur. More than 250 unstable slopes with post-glacial deformation have been detected to date. The main aims of the unstable rock slope database are (1) to serve as a national archive for unstable rock slopes in Norway; (2) to serve for data collection and storage during field mapping; (3) to provide decision-makers with hazard zones and other necessary information on unstable rock slopes for land-use planning and mitigation; and (4) to inform the public through an online map service. The database is organized hierarchically with a main point for each unstable rock slope to which several feature classes and tables are linked. This main point feature class includes several general attributes of the unstable rock slopes, such as site name, general and geological descriptions, executed works, recommendations, technical parameters (volume, lithology, mechanism and others), displacement rates, possible consequences, as well as hazard and risk classification. Feature classes and tables linked to the main feature class include different scenarios of an unstable rock slope, field observation points, sampling points for dating, displacement measurement stations, lineaments, unstable areas, run-out areas, areas affected by secondary effects, along with tables for hazard and risk classification and URL links to further documentation and references. The database on unstable rock slopes in Norway will be publicly consultable through an online map service. Factsheets with key information on unstable rock slopes can be automatically generated and downloaded for each site. Areas of possible rock avalanche run-out and their secondary effects displayed in the online map service, along with hazard and risk assessments, will become important tools for

  20. Towards a DNA Barcode Reference Database for Spiders and Harvestmen of Germany

    PubMed Central

    Astrin, Jonas J.; Höfer, Hubert; Spelda, Jörg; Holstein, Joachim; Bayer, Steffen; Hendrich, Lars; Huber, Bernhard A.; Kielhorn, Karl-Hinrich; Krammer, Hans-Joachim; Lemke, Martin; Monje, Juan Carlos; Morinière, Jérôme; Rulik, Björn; Petersen, Malte; Janssen, Hannah; Muster, Christoph

    2016-01-01

    As part of the German Barcode of Life campaign, over 3500 arachnid specimens have been collected and analyzed: ca. 3300 Araneae and 200 Opiliones, belonging to almost 600 species (median: 4 individuals/species). This covers about 60% of the spider fauna and more than 70% of the harvestmen fauna recorded for Germany. The overwhelming majority of species could be readily identified through DNA barcoding: median distances between closest species lay around 9% in spiders and 13% in harvestmen, while in 95% of the cases, intraspecific distances were below 2.5% and 8% respectively, with intraspecific medians at 0.3% and 0.2%. However, almost 20 spider species, most notably in the family Lycosidae, could not be separated through DNA barcoding (although many of them present discrete morphological differences). Conspicuously high interspecific distances were found in even more cases, hinting at cryptic species in some instances. A new program is presented: DiStats calculates the statistics needed to meet DNA barcode release criteria. Furthermore, new generic COI primers useful for a wide range of taxa (also other than arachnids) are introduced. PMID:27681175
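
    The intra- and interspecific distances reported above rest on pairwise comparisons of aligned barcode sequences. The sketch below uses the uncorrected p-distance as a stand-in; it is not the DiStats program, the study may have used a different distance measure, and the function names and toy sequences are assumptions for illustration.

```python
from itertools import combinations

def p_distance(seq_a, seq_b):
    """Uncorrected p-distance: fraction of differing sites among aligned sites
    where both sequences have an unambiguous base (A, C, G or T)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    compared = differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a in valid and b in valid:
            compared += 1
            if a != b:
                differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared

def distance_summary(records):
    """records: list of (species, aligned_sequence) pairs.

    Returns (max intraspecific distance, min interspecific distance);
    either value is None if no pair of that kind exists.
    """
    max_intra = None
    min_inter = None
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        d = p_distance(s1, s2)
        if sp1 == sp2:
            max_intra = d if max_intra is None else max(max_intra, d)
        else:
            min_inter = d if min_inter is None else min(min_inter, d)
    return max_intra, min_inter

# Toy 10-base "barcodes"; real COI barcodes are several hundred bases long.
toy_records = [
    ("sp_A", "ACGTACGTAC"),
    ("sp_A", "ACGTACGTAT"),
    ("sp_B", "ACGAACCTAT"),
]
max_intra, min_inter = distance_summary(toy_records)
```

    Species are separable by barcode when the smallest distance to another species exceeds the largest distance within the species, which is the pattern the abstract reports for most taxa.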

  1. Discovery of Possible Gene Relationships through the Application of Self-Organizing Maps to DNA Microarray Databases

    PubMed Central

    Chavez-Alvarez, Rocio; Chavoya, Arturo; Mendez-Vazquez, Andres

    2014-01-01

    DNA microarrays and cell cycle synchronization experiments have made possible the study of the mechanisms of cell cycle regulation of Saccharomyces cerevisiae by simultaneously monitoring the expression levels of thousands of genes at specific time points. On the other hand, pattern recognition techniques can contribute to the analysis of such massive measurements, providing a model of gene expression level evolution through the cell cycle process. In this paper, we propose the use of one such technique, an unsupervised artificial neural network called a Self-Organizing Map (SOM), which has been successfully applied to processes involving very noisy signals, classifying and organizing them, and assisting in the discovery of behavior patterns without requiring prior knowledge about the process under analysis. As a test bed for the use of SOMs in finding possible relationships among genes and their possible contribution in some biological processes, we selected 282 S. cerevisiae genes that have been shown through biological experiments to have an activity during the cell cycle. The expression level of these genes was analyzed in five of the most cited time series DNA microarray databases used in the study of the cell cycle of this organism. With the use of SOM, it was possible to find clusters of genes with similar behavior in the five databases along two cell cycles. This result suggested that some of these genes might be biologically related or might have a regulatory relationship, as was corroborated by comparing some of the clusters obtained with SOMs against a previously reported regulatory network that was generated using biological knowledge, such as protein-protein interactions, gene expression levels, metabolism dynamics, promoter binding, and modification, regulation and transport of proteins. The methodology described in this paper could be applied to the study of gene relationships of other biological processes in different organisms. PMID:24699245

  2. Demographic and experiential correlates of public attitudes towards cell-free fetal DNA screening

    PubMed Central

    Sayres, Lauren C.; Allyse, Megan; Goodspeed, Taylor A.; Cho, Mildred K.

    2014-01-01

    This study seeks to inform clinical application of cell-free fetal DNA (cffDNA) screening as a novel method for prenatal trisomy detection by investigating public attitudes towards this technology and demographic and experiential characteristics related to these attitudes. Two versions of a 25-item survey assessing interest in cffDNA and existing first-trimester combined screening for either trisomy 13 and 18 or trisomy 21 were distributed among 3,164 members of the United States public. Logistic regression was performed to determine variables predictive of interest in screening options. Approximately 47% of respondents expressed an interest in cffDNA screening for trisomy 13, 18, and 21, with a majority interested in cffDNA screening as a stand-alone technique. A significantly greater percent would consider termination of pregnancy following a diagnosis of trisomy 13 or 18 (52%) over one of trisomy 21 (44%). Willingness to consider abortion of an affected pregnancy was the strongest correlate to interest in both cffDNA and first-trimester combined screening, although markedly more respondents expressed an interest in some form of screening (69% and 71%, respectively) than would consider termination. Greater educational attainment, higher income, and insurance coverage predicted interest in cffDNA screening; stronger religious identification also corresponded to decreased interest. Prior experience with disability and genetic testing was associated with increased interest in cffDNA screening. Several of these factors, in addition to advanced age and Asian race, were, in turn, predictive of respondents’ increased willingness to consider post-diagnosis termination of pregnancy. In conclusion, divergent attitudes towards cffDNA screening, and prenatal options more generally, appear correlated with individual socioeconomic and religious backgrounds and experiences with disability and genetic testing. Clinical implementation and counseling for novel prenatal

  3. Insights from the DNA databases: approaches to the phylogenetic structure of Acanthamoeba.

    PubMed

    Fuerst, Paul A

    2014-11-01

    Species of Acanthamoeba have been traditionally described using morphology (primarily cyst structure), or cytology of nuclear division (used by Pussard and Pons, 1977). Twenty-plus putative species were proposed based on such criteria. Morphology, however, is often plastic, dependent upon culture conditions. DNA sequences of the nuclear small subunit (18S) rRNA that can be used for the study of the phylogeny of Acanthamoeba have increased from a single sequence in 1986 to more than 1800 in 2013. Some of the patterns of the sequence data for Acanthamoeba are reviewed, and some of the insights that this data illuminates are illustrated. In particular, the data suggest the existence of 20 or more genotypic types, a number not dissimilar to the number of named species of Acanthamoeba. However, molecular studies make clear that the relationship between phylogenetic relatedness and species names as we know them for Acanthamoeba is tenuous at best.

  4. Seabird databases and the new paradigm for scientific publication and attribution

    USGS Publications Warehouse

    Hatch, Scott A.

    2010-01-01

    For more than 300 years, the peer-reviewed journal article has been the principal medium for packaging and delivering scientific data. With new tools for managing digital data, a new paradigm is emerging—one that demands open and direct access to data and that enables and rewards a broad-based approach to scientific questions. Ground-breaking papers in the future will increasingly be those that creatively mine and synthesize vast stores of data available on the Internet. This is especially true for conservation science, in which essential data can be readily captured in standard record formats. For seabird professionals, a number of globally shared databases are in the offing, or should be. These databases will capture the salient results of inventories and monitoring, pelagic surveys, diet studies, and telemetry. A number of real or perceived barriers to data sharing exist, but none is insurmountable. Our discipline should take an important stride now by adopting a specially designed markup language for annotating and sharing seabird data.

  5. Automatic detection of lung nodules in computed tomography images: training and validation of algorithms using public research databases

    NASA Astrophysics Data System (ADS)

    Camarlinghi, Niccolò

    2013-09-01

    Lung cancer is one of the main public health issues in developed countries. Lung cancer typically manifests itself as non-calcified pulmonary nodules that can be detected by reading lung Computed Tomography (CT) images. To assist radiologists in reading images, researchers started, a decade ago, the development of Computer Aided Detection (CAD) methods capable of detecting lung nodules. In this work, a CAD composed of two CAD subprocedures is presented: one devoted to the identification of parenchymal nodules, and one devoted to the identification of the nodules attached to the pleural surface. Both CADs are an upgrade of two methods previously presented as the Voxel Based Neural Approach (VBNA) CAD. The novelty of this paper consists in the massive training using the public research Lung Image Database Consortium (LIDC) database and in the implementation of new features for classification with respect to the original VBNA method. Finally, the proposed CAD is blindly validated on the ANODE09 dataset. The result of the validation is a score of 0.393, which corresponds to the average sensitivity of the CAD computed at seven predefined false positive rates: 1/8, 1/4, 1/2, 1, 2, 4, and 8 FP/CT.
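
    The validation score described above is a plain average of sensitivities at seven false positive rates. The sketch below shows that arithmetic on invented operating points; the function names and the toy FROC points are hypothetical, and reading the sensitivity at each rate as the best value achieved at or below that rate is an assumption about how the curve is sampled, not the official ANODE09 evaluation code.

```python
# The seven predefined false positive rates (FP per CT scan) named in the abstract.
FP_RATES = (1/8, 1/4, 1/2, 1, 2, 4, 8)

def sensitivity_at(froc_points, fp_rate):
    """Highest sensitivity among operating points with FP/scan <= fp_rate.

    froc_points is a list of (fp_per_scan, sensitivity) pairs. Returns 0.0
    when no operating point is that strict (an assumed convention).
    """
    eligible = [sens for fp, sens in froc_points if fp <= fp_rate]
    return max(eligible, default=0.0)

def average_sensitivity_score(froc_points, fp_rates=FP_RATES):
    """Mean sensitivity over the predefined false positive rates."""
    return sum(sensitivity_at(froc_points, r) for r in fp_rates) / len(fp_rates)

# Invented operating points, for illustration only.
toy_froc = [(0.125, 0.10), (0.5, 0.30), (1.0, 0.40), (4.0, 0.60), (8.0, 0.70)]
score = average_sensitivity_score(toy_froc)
```

    Because the rates double at each step, the score weights performance at very low false positive rates as heavily as performance at 8 FP/CT.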

  6. [Public scientific knowledge distribution in health information, communication and information technology indexed in MEDLINE and LILACS databases].

    PubMed

    Packer, Abel Laerte; Tardelli, Adalberto Otranto; Castro, Regina Célia Figueiredo

    2007-01-01

    This study explores the distribution of international, regional and national scientific output in health information and communication, indexed in the MEDLINE and LILACS databases, between 1996 and 2005. A selection of articles was based on the hierarchical structure of Information Science in MeSH vocabulary. Four specific domains were determined: health information, medical informatics, scientific communications on healthcare and healthcare communications. The variables analyzed were: most-covered subjects and journals, author affiliation and publication countries and languages, in both databases. The Information Science category is represented in nearly 5% of MEDLINE and LILACS articles. The four domains under analysis showed a relative annual increase in MEDLINE. The Medical Informatics domain showed the highest number of records in MEDLINE, representing about half of all indexed articles. The importance of Information Science as a whole is more visible in publications from developed countries and the findings indicate the predominance of the United States, with significant growth in scientific output from China and South Korea and, to a lesser extent, Brazil.

  7. A framework for organizing cancer-related variations from existing databases, publications and NGS data using a High-performance Integrated Virtual Environment (HIVE).

    PubMed

    Wu, Tsung-Jung; Shamsaddini, Amirhossein; Pan, Yang; Smith, Krista; Crichton, Daniel J; Simonyan, Vahan; Mazumder, Raja

    2014-01-01

    Years of sequence feature curation by UniProtKB/Swiss-Prot, PIR-PSD, NCBI-CDD, RefSeq and other database biocurators have led to a rich repository of information on functional sites of genes and proteins. This information along with variation-related annotation can be used to scan human short sequence reads from next-generation sequencing (NGS) pipelines for presence of non-synonymous single-nucleotide variations (nsSNVs) that affect functional sites. This and similar workflows are becoming more important because thousands of NGS data sets are being made available through projects such as The Cancer Genome Atlas (TCGA), and researchers want to evaluate their biomarkers in genomic data. BioMuta, an integrated sequence feature database, provides a framework for automated and manual curation and integration of cancer-related sequence features so that they can be used in NGS analysis pipelines. Sequence feature information in BioMuta is collected from the Catalogue of Somatic Mutations in Cancer (COSMIC), ClinVar, UniProtKB and through biocuration of information available from publications. Additionally, nsSNVs identified through automated analysis of NGS data from TCGA are also included in the database. Because of the petabytes of data and information present in NGS primary repositories, a platform HIVE (High-performance Integrated Virtual Environment) for storing, analyzing, computing and curating NGS data and associated metadata has been developed. Using HIVE, 31 979 nsSNVs were identified in TCGA-derived NGS data from breast cancer patients. All variations identified through this process are stored in a Curated Short Read archive, and the nsSNVs from the tumor samples are included in BioMuta. Currently, BioMuta has 26 cancer types with 13 896 small-scale and 308 986 large-scale study-derived variations. Integration of variation data allows identification of novel or common nsSNVs that can be prioritized in validation studies. Database URL: BioMuta: http

  8. Estimating species diversity and distribution in the era of Big Data: to what extent can we trust public databases?

    PubMed Central

    Maldonado, Carla; Molina, Carlos I.; Zizka, Alexander; Persson, Claes; Taylor, Charlotte M.; Albán, Joaquina; Chilquillo, Eder; Antonelli, Alexandre

    2015-01-01

    Abstract Aim Massive digitalization of natural history collections is now leading to a steep accumulation of publicly available species distribution data. However, taxonomic errors and geographical uncertainty of species occurrence records are now acknowledged by the scientific community – putting into question to what extent such data can be used to unveil correct patterns of biodiversity and distribution. We explore this question through quantitative and qualitative analyses of uncleaned versus manually verified datasets of species distribution records across different spatial scales. Location The American tropics. Methods As test case we used the plant tribe Cinchoneae (Rubiaceae). We compiled four datasets of species occurrences: one created manually and verified through classical taxonomic work, and the rest derived from GBIF under different cleaning and filling schemes. We used new bioinformatic tools to code species into grids, ecoregions, and biomes following WWF's classification. We analysed species richness and altitudinal ranges of the species. Results Altitudinal ranges for species and genera were correctly inferred even without manual data cleaning and filling. However, erroneous records affected spatial patterns of species richness. They led to an overestimation of species richness in certain areas outside the centres of diversity in the clade. The location of many of these areas comprised the geographical midpoint of countries and political subdivisions, assigned long after the specimens had been collected. Main conclusion Open databases and integrative bioinformatic tools allow a rapid approximation of large‐scale patterns of biodiversity across space and altitudinal ranges. We found that geographic inaccuracy affects diversity patterns more than taxonomic uncertainties, often leading to false positives, i.e. overestimating species richness in relatively species poor regions. Public databases for species distribution are valuable and should be

  10. [Organic Law 10/2007, of October 8, regulating the police database on identifiers obtained from DNA: historic background and genetic view].

    PubMed

    García, Oscar

    2007-01-01

    Organic Law 10/2007 of 8 October, which regulates the police database of identifiers obtained from DNA, recently entered into effect. The author describes the process by which the law was approved and examines some of its aspects from a genetic perspective.

  11. Evaluation and Utilization as a Public Health Tool of a National Molecular Epidemiological Tuberculosis Outbreak Database within the United Kingdom from 1997 to 2001

    PubMed Central

    Drobniewski, F. A.; Gibson, A.; Ruddy, M.; Yates, M. D.

    2003-01-01

    The aim of this study was to develop a national model and analyze the value of a molecular epidemiological Mycobacterium tuberculosis DNA fingerprint-outbreak database. Incidents were investigated by the United Kingdom PHLS Mycobacterium Reference Unit (MRU) from June 1997 to December 2001, inclusive. A total of 124 incidents involving 972 tuberculosis cases, including 520 patient cultures from referred incidents and 452 patient cultures related to two population studies, were examined by using restriction fragment length polymorphism IS6110 fingerprinting and rapid epidemiological typing. Investigations were divided into the following three categories, reflecting different operational strategies: retrospective passive analysis, retrospective active analysis, and retrospective prospective analysis. The majority of incidents were in the retrospective passive analysis category, i.e., the individual submitting isolates has a suspicion they may be linked. Outbreaks were examined in schools, hospitals, farms, prisons, and public houses, and laboratory cross-contamination events and unusual clinical presentations were investigated. Retrospective active analysis involved a major outbreak centered on a high school. Contact tracing of a teenager with smear-positive pulmonary tuberculosis matched 14 individuals, including members of his class, and another 60 cases were identified in schools clinically and radiologically and by skin testing. Retrospective prospective analysis involved an outbreak of 94 isoniazid-resistant tuberculosis cases in London, United Kingdom, that began after cases were identified at one hospital in January 2000. Contact tracing and comparison with MRU databases indicated that the earliest matched case had occurred in 1995. Subsequently, the MRU changed to an active prospective analysis targeting linked isoniazid-monoresistant isolates for follow up. The patients were multiethnic, born mainly in the United Kingdom, and included professionals

  12. Genotyping and interpretation of STR-DNA: Low-template, mixtures and database matches-Twenty years of research and development.

    PubMed

    Gill, Peter; Haned, Hinda; Bleka, Oyvind; Hansson, Oskar; Dørum, Guro; Egeland, Thore

    2015-09-01

    The introduction of Short Tandem Repeat (STR) DNA was a revolution within a revolution that transformed forensic DNA profiling into a tool that could be used, for the first time, to create National DNA databases. This transformation would not have been possible without the concurrent development of fluorescent automated sequencers, combined with the ability to multiplex several loci together. Use of the polymerase chain reaction (PCR) increased the sensitivity of the method to enable the analysis of a handful of cells. The first multiplexes were simple: 'the quad', introduced by the defunct UK Forensic Science Service (FSS) in 1994, rapidly followed by a more discriminating 'six-plex' (Second Generation Multiplex) in 1995 that was used to create the world's first national DNA database. The success of the database rapidly outgrew the functionality of the original system - by the year 2000 a new multiplex of ten-loci was introduced to reduce the chance of adventitious matches. The technology was adopted world-wide, albeit with different loci. The political requirement to introduce pan-European databases encouraged standardisation - the development of European Standard Set (ESS) of markers comprising twelve-loci is the latest iteration. Although development has been impressive, the methods used to interpret evidence have lagged behind. For example, the theory to interpret complex DNA profiles (low-level mixtures), had been developed fifteen years ago, but only in the past year or so, are the concepts starting to be widely adopted. A plethora of different models (some commercial and others non-commercial) have appeared. This has led to a confusing 'debate' about the 'best' to use. The different models available are described along with their advantages and disadvantages. A section discusses the development of national DNA databases, along with details of an associated controversy to estimate the strength of evidence of matches. Current methodology is limited to

  13. DISTRIBUTED STRUCTURE-SEARCHABLE TOXICITY (DSSTOX) DATABASE NETWORK: MAKING PUBLIC TOXICITY DATA RESOURCES MORE ACCESSIBLE AND USABLE FOR DATA EXPLORATION AND SAR DEVELOPMENT

    EPA Science Inventory


    Distributed Structure-Searchable Toxicity (DSSTox) Database Network: Making Public Toxicity Data Resources More Accessible and Usable for Data Exploration and SAR Development

    Many sources of public toxicity data are not currently linked to chemical structure, are not ...

  14. UniPROBE, update 2011: expanded content and search tools in the online database of protein-binding microarray data on protein-DNA interactions.

    PubMed

    Robasky, Kimberly; Bulyk, Martha L

    2011-01-01

    The Universal PBM Resource for Oligonucleotide-Binding Evaluation (UniPROBE) database is a centralized repository of information on the DNA-binding preferences of proteins as determined by universal protein-binding microarray (PBM) technology. Each entry for a protein (or protein complex) in UniPROBE provides the quantitative preferences for all possible nucleotide sequence variants ('words') of length k ('k-mers'), as well as position weight matrix (PWM) and graphical sequence logo representations of the k-mer data. In this update, we describe >130% expansion of the database content, incorporation of a protein BLAST (blastp) tool for finding protein sequence matches in UniPROBE, the introduction of UniPROBE accession numbers and additional database enhancements. The UniPROBE database is available at http://uniprobe.org.
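
    The k-mer and position weight matrix (PWM) representations mentioned in this abstract can be illustrated with a minimal sketch. It is not UniProBE code: the function names, the pseudocount, and the example sites are assumptions made for illustration, and the matrix built here holds per-position base probabilities rather than the log-odds scores some tools report.

```python
from collections import Counter
from itertools import product

ALPHABET = "ACGT"


def all_kmers(k):
    """Enumerate every DNA word of length k (4**k words in total)."""
    return ["".join(p) for p in product(ALPHABET, repeat=k)]


def position_weight_matrix(sites, pseudocount=1.0):
    """Build per-position base probabilities from equal-length sites.

    Returns a list with one dict per position, mapping base -> probability.
    The pseudocount (an assumed smoothing choice) avoids zero probabilities.
    """
    length = len(sites[0])
    if any(len(s) != length for s in sites):
        raise ValueError("all sites must have the same length")
    total = len(sites) + pseudocount * len(ALPHABET)
    pwm = []
    for i in range(length):
        counts = Counter(s[i] for s in sites)
        pwm.append({b: (counts[b] + pseudocount) / total for b in ALPHABET})
    return pwm


# Hypothetical aligned binding sites, not taken from the database.
example_sites = ["ACGT", "ACGA", "ACGT", "TCGT"]
example_pwm = position_weight_matrix(example_sites)
```

    With k = 8, a common word length for PBM data, `all_kmers(8)` would return 65,536 words, which is why per-k-mer preferences are usually summarized as a PWM or sequence logo.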

  15. The Mission Accessible Near-Earth Object Survey Public Database Development Effort

    NASA Astrophysics Data System (ADS)

    Burt, Brian; Moskovitz, Nicholas; Putnam, Lowell

    2014-11-01

    The Mission Accessible Near-Earth Object Survey (MANOS) began in August 2013 as a multi-year physical characterization survey that was awarded large survey status by NOAO. MANOS will target several hundred mission-accessible NEOs across visible and near-infrared wavelengths, ultimately providing a comprehensive catalog of physical properties (astrometry, light curves, spectra). The MANOS project will provide a resource that not only helps to manage our survey in a fully transparent, publicly accessible forum, but will also help to coordinate minor planet characterization efforts and target prioritization across multiple research groups. Working towards that goal, we are developing a portal for rapid, up-to-date, public dissemination of our data. Migrating the Lowell Astorb dataset to a SQL framework is a major step towards the modernization of the system and will enable up-to-date deployment of data. This will further allow us to develop utilities of various complexity, such as a deltaV calculator, minor planet finder charts, and sophisticated ephemeris generation functions. We present the state of this effort and a preliminary timeline for functionality.

  16. Creating a data exchange strategy for radiotherapy research: towards federated databases and anonymised public datasets.

    PubMed

    Skripcak, Tomas; Belka, Claus; Bosch, Walter; Brink, Carsten; Brunner, Thomas; Budach, Volker; Büttner, Daniel; Debus, Jürgen; Dekker, Andre; Grau, Cai; Gulliford, Sarah; Hurkmans, Coen; Just, Uwe; Krause, Mechthild; Lambin, Philippe; Langendijk, Johannes A; Lewensohn, Rolf; Lühr, Armin; Maingon, Philippe; Masucci, Michele; Niyazi, Maximilian; Poortmans, Philip; Simon, Monique; Schmidberger, Heinz; Spezi, Emiliano; Stuschke, Martin; Valentini, Vincenzo; Verheij, Marcel; Whitfield, Gillian; Zackrisson, Björn; Zips, Daniel; Baumann, Michael

    2014-12-01

    Disconnected cancer research data management and lack of information exchange about planned and ongoing research are complicating the utilisation of internationally collected medical information for improving cancer patient care. Rapidly collecting/pooling data can accelerate translational research in radiation therapy and oncology. The exchange of study data is one of the fundamental principles behind data aggregation and data mining. The possibilities of reproducing the original study results, performing further analyses on existing research data to generate new hypotheses or developing computational models to support medical decisions (e.g. risk/benefit analysis of treatment options) represent just a fraction of the potential benefits of medical data-pooling. Distributed machine learning and knowledge exchange from federated databases can be considered as one beyond other attractive approaches for knowledge generation within "Big Data". Data interoperability between research institutions should be the major concern behind a wider collaboration. Information captured in electronic patient records (EPRs) and study case report forms (eCRFs), linked together with medical imaging and treatment planning data, are deemed to be fundamental elements for large multi-centre studies in the field of radiation therapy and oncology. To fully utilise the captured medical information, the study data have to be more than just an electronic version of a traditional (un-modifiable) paper CRF. Challenges that have to be addressed are data interoperability, utilisation of standards, data quality and privacy concerns, data ownership, rights to publish, data pooling architecture and storage. This paper discusses a framework for conceptual packages of ideas focused on a strategic development for international research data exchange in the field of radiation therapy and oncology. PMID:25458128

  18. Creating a data exchange strategy for radiotherapy research: Towards federated databases and anonymised public datasets

    PubMed Central

    Skripcak, Tomas; Belka, Claus; Bosch, Walter; Brink, Carsten; Brunner, Thomas; Budach, Volker; Büttner, Daniel; Debus, Jürgen; Dekker, Andre; Grau, Cai; Gulliford, Sarah; Hurkmans, Coen; Just, Uwe; Krause, Mechthild; Lambin, Philippe; Langendijk, Johannes A.; Lewensohn, Rolf; Lühr, Armin; Maingon, Philippe; Masucci, Michele; Niyazi, Maximilian; Poortmans, Philip; Simon, Monique; Schmidberger, Heinz; Spezi, Emiliano; Stuschke, Martin; Valentini, Vincenzo; Verheij, Marcel; Whitfield, Gillian; Zackrisson, Björn; Zips, Daniel; Baumann, Michael

    2015-01-01

    Disconnected cancer research data management and lack of information exchange about planned and ongoing research are complicating the utilisation of internationally collected medical information for improving cancer patient care. Rapidly collecting/pooling data can accelerate translational research in radiation therapy and oncology. The exchange of study data is one of the fundamental principles behind data aggregation and data mining. The possibilities of reproducing the original study results, performing further analyses on existing research data to generate new hypotheses or developing computational models to support medical decisions (e.g. risk/benefit analysis of treatment options) represent just a fraction of the potential benefits of medical data-pooling. Distributed machine learning and knowledge exchange from federated databases can be considered as one beyond other attractive approaches for knowledge generation within “Big Data”. Data interoperability between research institutions should be the major concern behind a wider collaboration. Information captured in electronic patient records (EPRs) and study case report forms (eCRFs), linked together with medical imaging and treatment planning data, are deemed to be fundamental elements for large multi-centre studies in the field of radiation therapy and oncology. To fully utilise the captured medical information, the study data have to be more than just an electronic version of a traditional (un-modifiable) paper CRF. Challenges that have to be addressed are data interoperability, utilisation of standards, data quality and privacy concerns, data ownership, rights to publish, data pooling architecture and storage. This paper discusses a framework for conceptual packages of ideas focused on a strategic development for international research data exchange in the field of radiation therapy and oncology. PMID:25458128

  19. Comparative study of multimodal intra-subject image registration methods on a publicly available database

    NASA Astrophysics Data System (ADS)

    Miri, Mohammad Saleh; Ghayoor, Ali; Johnson, Hans J.; Sonka, Milan

    2016-03-01

    This work reports on a comparative study between five manual and automated methods for intra-subject pair-wise registration of images from different modalities. The study includes a variety of inter-modal image registrations (MR-CT, PET-CT, PET-MR) utilizing different methods including two manual point-based techniques using rigid and similarity transformations, one automated point-based approach based on Iterative Closest Point (ICP) algorithm, and two automated intensity-based methods using mutual information (MI) and normalized mutual information (NMI). These techniques were employed for inter-modal registration of brain images of 9 subjects from a publicly available dataset, and the results were evaluated qualitatively via checkerboard images and quantitatively using root mean square error and MI criteria. In addition, for each inter-modal registration, a paired t-test was performed on the quantitative results in order to find any significant difference between the results of the studied registration techniques.
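
    The two intensity-based similarity measures named in this abstract, mutual information (MI) and normalized mutual information (NMI), can be sketched as follows. This is an illustrative sketch only, not the authors' implementation: it assumes the two images are already resampled onto the same grid, flattened, and binned into discrete intensity levels, and it uses the entropy-ratio form of NMI, (H(A) + H(B)) / H(A, B), which may differ from the normalization used in the study.

```python
import math
from collections import Counter


def entropy(counts, n):
    """Shannon entropy in bits of a histogram with n samples in total."""
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def mutual_information(a, b, normalized=False):
    """MI (or NMI) between two equal-length sequences of binned intensities."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(a)
    h_a = entropy(Counter(a), n)
    h_b = entropy(Counter(b), n)
    h_ab = entropy(Counter(zip(a, b)), n)  # joint histogram entropy
    if normalized:
        if h_ab == 0:
            raise ValueError("NMI is undefined when both images are constant")
        return (h_a + h_b) / h_ab
    return h_a + h_b - h_ab


# Toy "images": b is an exact relabeling of a, c is independent of a.
a = [0, 0, 1, 1]
b = [1, 1, 0, 0]
c = [0, 1, 0, 1]
```

    An intensity-based registration method would search over transformation parameters for the alignment that maximizes one of these scores; that optimization loop is omitted here.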

  20. Exploration of Preterm Birth Rates Using the Public Health Exposome Database and Computational Analysis Methods

    PubMed Central

    Kershenbaum, Anne D.; Langston, Michael A.; Levine, Robert S.; Saxton, Arnold M.; Oyana, Tonny J.; Kilbourne, Barbara J.; Rogers, Gary L.; Gittner, Lisaann S.; Baktash, Suzanne H.; Matthews-Juarez, Patricia; Juarez, Paul D.

    2014-01-01

    Recent advances in informatics technology have made it possible to integrate, manipulate, and analyze variables from a wide range of scientific disciplines, allowing for the examination of complex social problems such as health disparities. This study used 589 county-level variables to identify and compare geographical variation of high and low preterm birth rates. Data were collected from a number of publicly available sources, bringing together natality outcomes with attributes of the natural, built, social, and policy environments. The singleton early premature county birth rate in counties with a population of over 100,000 persons provided the dependent variable. Graph theoretical techniques were used to identify a wide range of predictor variables from various domains, including black proportion, obesity and diabetes, sexually transmitted infection rates, mother’s age, income, marriage rates, pollution and temperature among others. Dense subgraphs (paracliques) representing groups of highly correlated variables were resolved into latent factors, which were then used to build a regression model explaining prematurity (R-squared = 76.7%). Two lists of counties with large positive and large negative residuals, indicating unusual prematurity rates given their circumstances, may serve as a starting point for ways to intervene and reduce health disparities for preterm births. PMID:25464130

  1. Comparing subjective image quality measurement methods for the creation of public databases

    NASA Astrophysics Data System (ADS)

    Redi, Judith; Liu, Hantao; Alers, Hani; Zunino, Rodolfo; Heynderickx, Ingrid

    2010-01-01

    The Single Stimulus (SS) method is often chosen to collect subjective data for testing no-reference objective metrics, as it is straightforward to implement and well standardized. At the same time, it exhibits some drawbacks: spread between different assessors is relatively large, and the measured ratings depend on the quality range spanned by the test samples, hence the results from different experiments cannot easily be merged. The Quality Ruler (QR) method has been proposed to overcome these inconveniences. This paper compares the performance of the SS and QR methods for pictures impaired by Gaussian blur. The research goal is, on one hand, to analyze the advantages and disadvantages of both methods for quality assessment and, on the other, to make quality data of blur-impaired images publicly available. The obtained results show that the confidence intervals of the QR scores are narrower than those of the SS scores. This indicates that the QR method enhances consistency across assessors. Moreover, QR scores exhibit a higher linear correlation with the distortion applied. In summary, for the purpose of building datasets of subjective quality, the QR approach seems promising from the viewpoint of both consistency and repeatability.
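
    The two comparison criteria in this abstract, confidence-interval width across assessors and linear correlation with the applied distortion, can be sketched as below. The abstract does not state how the intervals were computed, so the normal-approximation 95% interval (z = 1.96) is an assumption, and the function names are hypothetical.

```python
import math
from statistics import mean, stdev


def ci_half_width(scores, z=1.96):
    """Half-width of an approximate 95% confidence interval for the mean score."""
    return z * stdev(scores) / math.sqrt(len(scores))


def pearson(x, y):
    """Pearson linear correlation coefficient between two equal-length lists."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

    Under these assumptions, a method with a smaller `ci_half_width` per image shows tighter agreement among assessors, and a `pearson` value closer to ±1 between mean scores and blur level indicates a more linear relation to the distortion.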

  2. De-identifying a public use microdata file from the Canadian national discharge abstract database

    PubMed Central

    2011-01-01

    Abstract Background The Canadian Institute for Health Information (CIHI) collects hospital discharge abstract data (DAD) from Canadian provinces and territories. There are many demands for the disclosure of this data for research and analysis to inform policy making. To expedite the disclosure of data for some of these purposes, the construction of a DAD public use microdata file (PUMF) was considered. Such purposes include: confirming some published results, providing broader feedback to CIHI to improve data quality, training students and fellows, providing an easily accessible data set for researchers to prepare for analyses on the full DAD data set, and serving as a large health data set for computer scientists and statisticians to evaluate analysis and data mining techniques. The objective of this study was to measure the probability of re-identification for records in a PUMF, and to de-identify a national DAD PUMF consisting of 10% of records. Methods Plausible attacks on a PUMF were evaluated. Based on these attacks, the 2008-2009 national DAD was de-identified. A new algorithm was developed to minimize the amount of suppression while maximizing the precision of the data. The acceptable threshold for the probability of correct re-identification of a record was set at between 0.04 and 0.05. Information loss was measured in terms of the extent of suppression and entropy. Results Two different PUMF files were produced, one with geographic information, and one with no geographic information but more clinical information. At a threshold of 0.05, the maximum proportion of records with the diagnosis code suppressed was 20%, but these suppressions represented only 8-9% of all values in the DAD. Our suppression algorithm has less information loss than a more traditional approach to suppression. Smaller regions, patients with longer stays, and age groups that are infrequently admitted to hospitals tend to be the ones with the highest rates of suppression. Conclusions The

  3. Comparative Analyses of Plant Transcription Factor Databases

    PubMed Central

    Ramirez, Silvia R; Basu, Chhandak

    2009-01-01

    Transcription factors (TFs) are proteinaceous complexes that bind to promoter regions in the DNA and affect transcription initiation. Plant TFs control gene expression, and genes control many physiological processes, which in turn trigger cascades of biochemical reactions in plant cells. The databases available for plant TFs are fairly abundant, but they convey different information in different formats. Some of the publicly available plant TF databases are narrow in scope, while others are broad. For example, some of the best TF databases are specific to a single plant species, while others cover up to 20 different plant species. In this review, plant TF databases ranging from a single species to many are assessed and described. The comparative analyses of all the databases and their advantages and disadvantages are also discussed. PMID:19721806

  4. UniPROBE, update 2015: new tools and content for the online database of protein-binding microarray data on protein-DNA interactions.

    PubMed

    Hume, Maxwell A; Barrera, Luis A; Gisselbrecht, Stephen S; Bulyk, Martha L

    2015-01-01

    The Universal PBM Resource for Oligonucleotide Binding Evaluation (UniPROBE) serves as a convenient source of information on published data generated using universal protein-binding microarray (PBM) technology, which provides in vitro data about the relative DNA-binding preferences of transcription factors for all possible sequence variants of a length k ('k-mers'). The database displays important information about the proteins and displays their DNA-binding specificity data in terms of k-mers, position weight matrices and graphical sequence logos. This update to the database documents the growth of UniPROBE since the last update 4 years ago, and introduces a variety of new features and tools, including a new streamlined pipeline that facilitates data deposition by universal PBM data generators in the research community, a tool that generates putative nonbinding (i.e. negative control) DNA sequences for one or more proteins and novel motifs obtained by analyzing the PBM data using the BEEML-PBM algorithm for motif inference. The UniPROBE database is available at http://uniprobe.org.

  5. Does language matter? A case study of epidemiological and public health journals, databases and professional education in French, German and Italian

    PubMed Central

    Baussano, Iacopo; Brzoska, Patrick; Fedeli, Ugo; Larouche, Claudia; Razum, Oliver; Fung, Isaac C-H

    2008-01-01

    Epidemiology and public health are usually context-specific. Journals published in different languages and countries play a role both as sources of data and as channels through which evidence is incorporated into local public health practice. Databases in these languages facilitate access to relevant journals, and professional education in these languages facilitates the growth of native expertise in epidemiology and public health. However, as English has become the lingua franca of scientific communication in the era of globalisation, many journals published in non-English languages face the difficult dilemma of either switching to English and competing internationally, or sticking to the native tongue and having a restricted circulation among a local readership. This paper discusses the historical development of epidemiology and the current scene of epidemiological and public health journals, databases and professional education in three Western European languages: French, German and Italian, and examines the dynamics and struggles they have today. PMID:18826570

  6. Does language matter? A case study of epidemiological and public health journals, databases and professional education in French, German and Italian.

    PubMed

    Baussano, Iacopo; Brzoska, Patrick; Fedeli, Ugo; Larouche, Claudia; Razum, Oliver; Fung, Isaac C-H

    2008-01-01

    Epidemiology and public health are usually context-specific. Journals published in different languages and countries play a role both as sources of data and as channels through which evidence is incorporated into local public health practice. Databases in these languages facilitate access to relevant journals, and professional education in these languages facilitates the growth of native expertise in epidemiology and public health. However, as English has become the lingua franca of scientific communication in the era of globalisation, many journals published in non-English languages face the difficult dilemma of either switching to English and competing internationally, or sticking to the native tongue and having a restricted circulation among a local readership. This paper discusses the historical development of epidemiology and the current scene of epidemiological and public health journals, databases and professional education in three Western European languages: French, German and Italian, and examines the dynamics and struggles they have today. PMID:18826570

  8. Segmentation of anatomical structures in chest radiographs using supervised methods: a comparative study on a public database.

    PubMed

    van Ginneken, Bram; Stegmann, Mikkel B; Loog, Marco

    2006-02-01

    The task of segmenting the lung fields, the heart, and the clavicles in standard posterior-anterior chest radiographs is considered. Three supervised segmentation methods are compared: active shape models, active appearance models and a multi-resolution pixel classification method that employs a multi-scale filter bank of Gaussian derivatives and a k-nearest-neighbors classifier. The methods have been tested on a publicly available database of 247 chest radiographs, in which all objects have been manually segmented by two human observers. A parameter optimization for active shape models is presented, and it is shown that this optimization improves performance significantly. It is demonstrated that the standard active appearance model scheme performs poorly, but large improvements can be obtained by including areas outside the objects into the model. For lung field segmentation, all methods perform well, with pixel classification giving the best results: a paired t-test showed no significant performance difference between pixel classification and an independent human observer. For heart segmentation, all methods perform comparably, but significantly worse than a human observer. Clavicle segmentation is a hard problem for all methods; best results are obtained with active shape models, but human performance is substantially better. In addition, several hybrid systems are investigated. For heart segmentation, where the separate systems perform comparably, significantly better performance can be obtained by combining the results with majority voting. As an application, the cardio-thoracic ratio is computed automatically from the segmentation results. Bland and Altman plots indicate that all methods perform well when compared to the gold standard, with confidence intervals from pixel classification and active appearance modeling very close to those of a human observer. All results, including the manual segmentations, have been made publicly available to facilitate

  9. Production of Arrayed and Rearrayed cDNA Libraries for Public Use

    SciTech Connect

    Rasmussen, K

    2005-08-29

    Researchers studying genes and their protein products need an easily available source for that gene. The I.M.A.G.E. Consortium at Lawrence Livermore National Laboratory is an important source of such genes in the form of arrayed cDNA libraries. The arrayed clones and associated data are available to the public, free of restriction. Libraries are transformed and titered into 384-well master plates, from which 2-8 copies are made. One copy plate is stored by LLNL while others are sent to sequencing groups, plate distributors, and to the group which contributed the library. Clones found to be unique and/or full-length are rearrayed and also made publicly available. Bioinformatics tools supporting the use of I.M.A.G.E. clones are accessible via the World Wide Web.

  10. Genome databases

    SciTech Connect

    Courteau, J.

    1991-10-11

    Since the Genome Project began several years ago, a plethora of databases have been developed or are in the works. They range from the massive Genome Data Base at Johns Hopkins University, the central repository of all gene mapping information, to small databases focusing on single chromosomes or organisms. Some are publicly available, others are essentially private electronic lab notebooks. Still others limit access to a consortium of researchers working on, say, a single human chromosome. An increasing number incorporate sophisticated search and analytical software, while others operate as little more than data lists. In consultation with numerous experts in the field, a list has been compiled of some key genome-related databases. The list was not limited to map and sequence databases but also included the tools investigators use to interpret and elucidate genetic data, such as protein sequence and protein structure databases. Because a major goal of the Genome Project is to map and sequence the genomes of several experimental animals, including E. coli, yeast, fruit fly, nematode, and mouse, the available databases for those organisms are listed as well. The author also includes several databases that are still under development - including some ambitious efforts that go beyond data compilation to create what are being called electronic research communities, enabling many users, rather than just one or a few curators, to add or edit the data and tag it as raw or confirmed.

  11. Comparative Study of Seven Commercial Kits for Human DNA Extraction from Urine Samples Suitable for DNA Biomarker-Based Public Health Studies

    PubMed Central

    El Bali, Latifa; Diman, Aurélie; Bernard, Alfred; Roosens, Nancy H. C.; De Keersmaecker, Sigrid C. J.

    2014-01-01

    Human genomic DNA extracted from urine could be an interesting tool for large-scale public health studies involving characterization of genetic variations or DNA biomarkers as a result of the simple and noninvasive collection method. These studies, involving many samples, require a rapid, easy, and standardized extraction protocol. Moreover, for practicability, there is a necessity to collect urine at a moment different from the first void and to store it appropriately until analysis. The present study compared seven commercial kits to select the most appropriate urinary human DNA extraction procedure for epidemiological studies. DNA yield has been determined using different quantification methods: two classical, i.e., NanoDrop and PicoGreen, and two species-specific real-time quantitative (q)PCR assays, as DNA extracted from urine contains, besides human, microbial DNA also, which largely contributes to the total DNA yield. In addition, the kits giving a good yield were also tested for the presence of PCR inhibitors. Further comparisons were performed regarding the sampling time and the storage conditions. Finally, as a proof-of-concept, an important gene related to smoking has been genotyped using the developed tools. We could select one well-performing kit for the human DNA extraction from urine suitable for molecular diagnostic real-time qPCR-based assays targeting genetic variations, applicable to large-scale studies. In addition, successful genotyping was possible using DNA extracted from urine stored at −20°C for several months, and an acceptable yield could also be obtained from urine collected at different moments during the day, which is particularly important for public health studies. PMID:25365790

  12. E-SovTox: An online database of the main publicly-available sources of toxicity data concerning REACH-relevant chemicals published in the Russian language.

    PubMed

    Sihtmäe, Mariliis; Blinova, Irina; Aruoja, Villem; Dubourguier, Henri-Charles; Legrand, Nicolas; Kahru, Anne

    2010-08-01

    A new open-access online database, E-SovTox, is presented. E-SovTox provides toxicological data for substances relevant to the EU Registration, Evaluation, Authorisation and Restriction of Chemicals (REACH) system, from publicly-available Russian language data sources. The database contains information selected mainly from scientific journals published during the Soviet Union era. The main information source for this database - the journal, Gigiena Truda i Professional'nye Zabolevania [Industrial Hygiene and Occupational Diseases], published between 1957 and 1992 - features acute, but also chronic, toxicity data for numerous industrial chemicals, e.g. for rats, mice, guinea-pigs and rabbits. The main goal of the abovementioned toxicity studies was to derive the maximum allowable concentration limits for industrial chemicals in the occupational health settings of the former Soviet Union. Thus, articles featured in the database include mostly data on LD50 values, skin and eye irritation, skin sensitisation and cumulative properties. Currently, the E-SovTox database contains toxicity data selected from more than 500 papers covering more than 600 chemicals. The user is provided with the main toxicity information, as well as abstracts of these papers in Russian and in English (given as provided in the original publication). The search engine allows cross-searching of the database by the name or CAS number of the compound, and the author of the paper. The E-SovTox database can be used as a decision-support tool by researchers and regulators for the hazard assessment of chemical substances. PMID:20822322

  13. Drinking Water Treatability Database (Database)

    EPA Science Inventory

    The drinking Water Treatability Database (TDB) will provide data taken from the literature on the control of contaminants in drinking water, and will be housed on an interactive, publicly-available USEPA web site. It can be used for identifying effective treatment processes, rec...

  14. Clinical and public health research using methylated DNA Immunoprecipitation (MeDIP): A comparison of commercially available kits to examine differential DNA methylation across the genome

    PubMed Central

    Brebi-Mieville, Priscilla; Ili-Gangas, Carmen; Leal-Rojas, Pamela; Noordhuis, Maartje; Soudry, Ethan; Perez, Jimena; Roa, Juan Carlos; Sidransky, David; Guerrero-Preston, Rafael

    2012-01-01

    The methylated DNA immunoprecipitation method (MeDIP) is a genome-wide, high-resolution approach that detects DNA methylation with oligonucleotide tiling arrays or high throughput sequencing platforms. A simplified high-throughput MeDIP assay will enable translational research studies in clinics and populations, which will greatly enhance our understanding of the human methylome. We compared three commercial kits, MagMeDIP Kit TM (Diagenode), Methylated-DNA IP Kit (Zymo Research) and Methylamp™ Methylated DNA Capture Kit (Epigentek), in order to identify which one has better reliability and sensitivity for genomic DNA enrichment. Each kit was used to enrich two samples, one from fresh tissue and one from a cell line, with two different DNA amounts. The enrichment efficiency of each kit was evaluated by agarose gel band intensity after Nco I digestion and by reaction yield of methylated DNA. A successful enrichment is expected to have a 1:4 to 10:1 conversion ratio and a yield of 80% or higher. We also evaluated the hybridization efficiency to genome-wide methylation arrays in a separate cohort of tissue samples. We observed that the MagMeDIP kit had the highest yield for the two DNA amounts and for both the tissue and cell line samples, as well as for the positive control. In addition, the DNA was successfully enriched from a 1:4 to 10:1 ratio. Therefore, the MagMeDIP kit is a useful research tool that will enable clinical and public health genome-wide DNA methylation studies. PMID:22207357

  15. Development of a DNA microarray to detect antimicrobial resistance genes identified in the national center for biotechnology information database

    Technology Transfer Automated Retrieval System (TEKTRAN)

    High density genotyping techniques are needed for investigating antimicrobial resistance especially in the case of multi-drug resistant (MDR) isolates. To achieve this all antimicrobial resistance genes in the NCBI Genbank database were identified by key word searches of sequence annotations and the...

  16. Forensic DNA and bioinformatics.

    PubMed

    Bianchi, Lucia; Liò, Pietro

    2007-03-01

    The field of forensic science is increasingly based on biomolecular data and many European countries are establishing forensic databases to store DNA profiles of crime scenes of known offenders and apply DNA testing. The field is boosted by statistical and technological advances such as DNA microarray sequencing, TFT biosensors, machine learning algorithms, in particular Bayesian networks, which provide an effective way of evidence organization and inference. The aim of this article is to discuss the state of art potentialities of bioinformatics in forensic DNA science. We also discuss how bioinformatics will address issues related to privacy rights such as those raised from large scale integration of crime, public health and population genetic susceptibility-to-diseases databases.

  17. PFR²: a curated database of planktonic foraminifera 18S ribosomal DNA as a resource for studies of plankton ecology, biogeography and evolution.

    PubMed

    Morard, Raphaël; Darling, Kate F; Mahé, Frédéric; Audic, Stéphane; Ujiié, Yurika; Weiner, Agnes K M; André, Aurore; Seears, Heidi A; Wade, Christopher M; Quillévéré, Frédéric; Douady, Christophe J; Escarguel, Gilles; de Garidel-Thoron, Thibault; Siccha, Michael; Kucera, Michal; de Vargas, Colomban

    2015-11-01

    Planktonic foraminifera (Rhizaria) are ubiquitous marine pelagic protists producing calcareous shells with conspicuous morphology. They play an important role in the marine carbon cycle, and their exceptional fossil record serves as the basis for biochronostratigraphy and past climate reconstructions. A major worldwide sampling effort over the last two decades has resulted in the establishment of multiple large collections of cryopreserved individual planktonic foraminifera samples. Thousands of 18S rDNA partial sequences have been generated, representing all major known morphological taxa across their worldwide oceanic range. This comprehensive data coverage provides an opportunity to assess patterns of molecular ecology and evolution in a holistic way for an entire group of planktonic protists. We combined all available published and unpublished genetic data to build PFR(2), the Planktonic foraminifera Ribosomal Reference database. The first version of the database includes 3322 reference 18S rDNA sequences belonging to 32 of the 47 known morphospecies of extant planktonic foraminifera, collected from 460 oceanic stations. All sequences have been rigorously taxonomically curated using a six-rank annotation system fully resolved to the morphological species level and linked to a series of metadata. The PFR(2) website, available at http://pfr2.sb-roscoff.fr, allows downloading the entire database or specific sections, as well as the identification of new planktonic foraminiferal sequences. Its novel, fully documented curation process integrates advances in morphological and molecular taxonomy. It allows for an increase in its taxonomic resolution and assures that integrity is maintained by including a complete contingency tracking of annotations and assuring that the annotations remain internally consistent. PMID:25828689

  19. An updated validation of Promega's PowerPlex 16 System: high throughput databasing under reduced PCR volume conditions on Applied Biosystem's 96 capillary 3730xl DNA Analyzer.

    PubMed

    Spathis, Rita; Lum, J Koji

    2008-11-01

    The PowerPlex 16 System from Promega Corporation allows single tube multiplex amplification of sixteen short tandem repeat (STR) loci including all 13 core combined DNA index system STRs. This report presents an updated validation of the PowerPlex 16 System on Applied Biosystem's 96 capillary 3730xl DNA Analyzer. The validation protocol developed in our laboratory allows for the analysis of 1536 loci (96 x 16) in c. 50 min. We have further optimized the assay by decreasing the reaction volume to one-quarter that recommended by the manufacturer thereby substantially reducing the total cost per sample without compromising reproducibility or specificity. This reduction in reaction volume has the ancillary benefit of dramatically increasing the sensitivity of the assay allowing for accurate analysis of lower quantities of DNA. Due to its substantially increased throughput capability, this extended validation of the PowerPlex 16 System should be useful in reducing the backlog of unanalyzed DNA samples currently facing public DNA forensic laboratories.

  20. Scientific publications about DNA structure-function and PCR technique in Costa Rica: a historic view (1953-2003).

    PubMed

    Albertazzi, Federico J

    2004-09-01

    The spreading of knowledge depends on the access to the information and its immediate use. Models are useful to explain specific phenomena. The scientific community accepts some models in Biology after a period of time, once it has evidence to support it. The model of the structure and function of the DNA proposed by Watson & Crick (1953) was not the exception, since a few years later the DNA model was finally accepted. In Costa Rica, DNA function was first mentioned in 1970, in the magazine Biologia Tropical (Tropical Biology Magazine), more than 15 years after its first publication in a scientific journal. An opposite situation occurs with technical innovations. If the efficiency of a new scientific technique is proved in a compelling way, then the acceptance by the community comes swiftly. This was the case of the polymerase chain reaction, or PCR. The first PCR machine in Costa Rica arrived in 1991, only three years after its publication.

  1. Aviation Safety Issues Database

    NASA Technical Reports Server (NTRS)

    Morello, Samuel A.; Ricks, Wendell R.

    2009-01-01

    The aviation safety issues database was instrumental in the refinement and substantiation of the National Aviation Safety Strategic Plan (NASSP). The issues database is a comprehensive set of issues from an extremely broad base of aviation functions, personnel, and vehicle categories, both nationally and internationally. Several aviation safety stakeholders such as the Commercial Aviation Safety Team (CAST) have already used the database. This broader interest was the genesis for making the database publicly accessible and writing this report.

  2. Establishment of a mitochondrial DNA sequence database for the identification of fish species commercially available in South Africa.

    PubMed

    Cawthorn, Donna-Mareè; Steinman, Harris Andrew; Witthuhn, R Corli

    2011-11-01

    The limitations intrinsic to morphology-based identification systems have created an urgent need for reliable genetic methods that enable the unequivocal recognition of fish species, particularly those that are prone to overexploitation and/or market substitution. The aim of this study was to develop a comprehensive reference library of DNA sequence data to allow the explicit identification of 53 commercially available fish species in South Africa, most of which were locally caught marine species. Sequences of approximately 655 base pairs were generated for all species from the cytochrome c oxidase I (COI) gene, the region widely adopted for DNA barcoding. Specimens of the genus Thunnus were examined in further detail, employing additional mitochondrial DNA control region sequencing. Cumulative analysis of the sequences from the COI region revealed mean conspecific, congeneric and confamilial Kimura 2-parameter distances of 0.10%, 4.58% and 15.43%, respectively. The results showed that the vast majority (98%) of fish species examined could be readily differentiated by their COI barcodes, but that supplementary control region sequencing was more useful for the discrimination of three Thunnus species. Additionally, the analysis of COI data raised the prospect that Thyrsites atun (snoek) could constitute a species pair. The present study has established the necessary genetic information to permit the unambiguous identification of 53 commonly marketed fish species in South Africa, the applications of which hold a plethora of benefits relating to ecology research, fisheries management and control of commercial practices.
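
    The Kimura 2-parameter (K2P) distances quoted in this abstract can be illustrated with a minimal sketch. It assumes two pre-aligned sequences of equal length and skips any site that is not an unambiguous A, C, G or T in both; the function name `k2p_distance` is hypothetical and this is not the authors' pipeline. It uses the standard formula d = -(1/2) ln[(1 - 2P - Q) sqrt(1 - 2Q)], where P and Q are the proportions of transitions and transversions.

```python
import math

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def k2p_distance(seq1: str, seq2: str) -> float:
    """Kimura 2-parameter distance between two aligned DNA sequences.

    Hypothetical helper for illustration: sites with gaps or ambiguity
    codes in either sequence are skipped (an assumption, not a rule
    taken from the study).
    """
    if len(seq1) != len(seq2):
        raise ValueError("sequences must be aligned to the same length")

    transitions = transversions = compared = 0
    for a, b in zip(seq1.upper(), seq2.upper()):
        if a not in "ACGT" or b not in "ACGT":
            continue  # skip gaps and ambiguous bases
        compared += 1
        if a == b:
            continue
        pair = {a, b}
        if pair <= PURINES or pair <= PYRIMIDINES:
            transitions += 1      # A<->G or C<->T
        else:
            transversions += 1    # purine <-> pyrimidine

    if compared == 0:
        raise ValueError("no comparable sites")

    p = transitions / compared
    q = transversions / compared
    # math.log / math.sqrt raise ValueError when the sequences are too
    # divergent for the K2P correction to be defined.
    return -0.5 * math.log((1 - 2 * p - q) * math.sqrt(1 - 2 * q))
```

    Averaging such pairwise distances within species, within genera and within families would give figures of the kind reported above (conspecific, congeneric and confamilial means), although the study's exact alignment and averaging procedure is not described here.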

  3. DNA.

    ERIC Educational Resources Information Center

    Felsenfeld, Gary

    1985-01-01

    Structural form, bonding scheme, and chromatin structure of and gene-modification experiments with deoxyribonucleic acid (DNA) are described. Indicates that DNA's double helix is variable and also flexible as it interacts with regulatory and other molecules to transfer hereditary messages. (DH)

  4. Prevalence of human cell material: DNA and RNA profiling of public and private objects and after activity scenarios.

    PubMed

    van den Berge, M; Ozcanhan, G; Zijlstra, S; Lindenbergh, A; Sijen, T

    2016-03-01

    Especially when minute evidentiary traces are analysed, background cell material unrelated to the crime may contribute to detectable levels in the genetic analyses. To gain understanding on the composition of human cell material residing on surfaces contributing to background traces, we performed DNA and mRNA profiling on samplings of various items. Samples were selected by considering events contributing to cell material deposits in exemplary activities (e.g. dragging a person by the trouser ankles), and can be grouped as public objects, private samples, transfer-related samples and washing machine experiments. Results show that high DNA yields do not necessarily relate to an increased number of contributors or to the detection of other cell types than skin. Background cellular material may be found on any type of public or private item. When a major contributor can be deduced in DNA profiles from private items, this can be a different person than the owner of the item. Also when a specific activity is performed and the areas of physical contact are analysed, the "perpetrator" does not necessarily represent the major contributor in the STR profile. Washing machine experiments show that transfer and persistence during laundry is limited for DNA and cell type dependent for RNA. Skin conditions such as the presence of sebum or sweat can promote DNA transfer. Results of this study, which encompasses 549 samples, increase our understanding regarding the prevalence of human cell material in background and activity scenarios.

  5. Morchella MLST database

    Technology Transfer Automated Retrieval System (TEKTRAN)

    Welcome to the Morchella MLST database. This dedicated database was set up at the CBS-KNAW Biodiversity Center by Vincent Robert in February 2012, using BioloMICS software (Robert et al., 2011), to facilitate DNA sequence-based identifications of Morchella species via the Internet. The current datab...

  6. The Institute of Public Administration's Document Center: From Paper to Electronic Records--A Full Image Government Documents Database.

    ERIC Educational Resources Information Center

    Al-Zahrani, Rashed S.

    Since its establishment in 1960, the Institute of Public Administration (IPA) in Riyadh, Saudi Arabia has had responsibility for documenting Saudi administrative literature, the official publications of Saudi Arabia, and the literature of regional and international organizations through establishment of the Document Center in 1961. This paper…

  7. DNA barcoding for plants.

    PubMed

    de Vere, Natasha; Rich, Tim C G; Trinder, Sarah A; Long, Charlotte

    2015-01-01

    DNA barcoding uses specific regions of DNA in order to identify species. Initiatives are taking place around the world to generate DNA barcodes for all groups of living organisms and to make these data publicly available in order to help understand, conserve, and utilize the world's biodiversity. For land plants the core DNA barcode markers are two sections of coding regions within the chloroplast, part of the genes, rbcL and matK. In order to create high quality databases, each plant that is DNA barcoded needs to have a herbarium voucher that accompanies the rbcL and matK DNA sequences. The quality of the DNA sequences, the primers used, and trace files should also be accessible to users of the data. Multiple individuals should be DNA barcoded for each species in order to check for errors and allow for intraspecific variation. The world's herbaria provide a rich resource of already preserved and identified material and these can be used for DNA barcoding as well as by collecting fresh samples from the wild. These protocols describe the whole DNA barcoding process, from the collection of plant material from the wild or from the herbarium, how to extract and amplify the DNA, and how to check the quality of the data after sequencing.

  8. Ovarian Kaleidoscope database: ten years and beyond.

    PubMed

    Hsueh, Aaron J; Rauch, Rami

    2012-06-01

    Ovarian Kaleidoscope database (OKdb) is an online, searchable, public database containing text-based and DNA microarray data to facilitate research by ovarian researchers. Using key words and predetermined categories, users can search ovarian gene information based on gene function, cell type of expression, cellular localization, hormonal regulation, mutant phenotypes, chromosomal location, ligand-receptor relationship, and other criteria, either alone or in combination. For individual genes, users can access more than 10 extensive DNA microarray datasets to interrogate gene expression patterns in a development-specific and cell type-specific manner. All ligand and receptor genes expressed in the ovary are matched to facilitate investigation of paracrine/autocrine signaling. More than 3500 ovarian genes in the database are matched to 185 gene pathways in the Kyoto Encyclopedia of Genes and Genomes to allow for elucidation of gene interactions and relationships. In addition to >400 genes with infertility or subfertility phenotypes when mutated in mice or humans, the OKdb also lists ~50 and ~40 genes associated with polycystic ovarian syndrome and primary ovarian insufficiency, respectively. The expanding OKdb is updated weekly and allows submission of new genes by ovarian researchers to allow instant access to DNA microarray datasets for newly submitted genes. The present database is a virtual community for ovarian researchers and allows users to instantaneously provide their comments for individual gene pages based on an automated Web-discussion system. In the coming years, we will continue to add new features to serve the ovarian research community. PMID:22441797

  9. DNA

    ERIC Educational Resources Information Center

    Stent, Gunther S.

    1970-01-01

    This history for molecular genetics and its explanation of DNA begins with an analysis of the Golden Jubilee essay papers, 1955. The paper ends stating that the higher nervous system is the one major frontier of biological inquiry which still offers some romance of research. (Author/VW)

  10. Publications

    Cancer.gov

    Information about NCI publications including PDQ cancer information for patients and health professionals, patient-education publications, fact sheets, dictionaries, NCI blogs and newsletters and major reports.

  11. Ionic Liquids Database- (ILThermo)

    National Institute of Standards and Technology Data Gateway

    SRD 147 Ionic Liquids Database- (ILThermo) (Web, free access)   IUPAC Ionic Liquids Database, ILThermo, is a free web research tool that allows users worldwide to access an up-to-date data collection from the publications on experimental investigations of thermodynamic, and transport properties of ionic liquids as well as binary and ternary mixtures containing ionic liquids.

  13. Validation of White-Matter Lesion Change Detection Methods on a Novel Publicly Available MRI Image Database.

    PubMed

    Lesjak, Žiga; Pernuš, Franjo; Likar, Boštjan; Špiclin, Žiga

    2016-10-01

    Changes of white-matter lesions (WMLs) are good predictors of the progression of neurodegenerative diseases like multiple sclerosis (MS). Based on longitudinal magnetic resonance (MR) imaging the changes can be monitored, while the need for their accurate and reliable quantification led to the development of several automated MR image analysis methods. However, an objective comparison of the methods is difficult, because the validation datasets with ground truth were not publicly available and different sets of performance metrics were used. In this study, we acquired longitudinal MR datasets of 20 MS patients, in which brain regions were extracted, spatially aligned and intensity normalized. Two expert raters then delineated and jointly revised the WML changes on subtracted baseline and follow-up MR images to obtain ground truth WML segmentations. The main contribution of this paper is an objective, quantitative and systematic evaluation of two unsupervised and one supervised intensity-based change detection method on the publicly available datasets with ground truth segmentations, using common pre- and post-processing steps and common evaluation metrics. In addition, different combinations of the two main steps of the studied change detection methods, i.e. dissimilarity map construction and its segmentation, were tested to identify the best performing combination. PMID:27207310

  14. The PNNL quantitative infrared database for gas-phase sensing: a spectral library for environmental, hazmat, and public safety standoff detection

    NASA Astrophysics Data System (ADS)

    Johnson, Timothy J.; Sams, Robert L.; Sharpe, Steven W.

    2004-03-01

    Pacific Northwest National Laboratory (PNNL) continues to expand its library of quantitative infrared reference spectra for remote sensing. The gas-phase data are recorded at 0.1 cm-1 resolution, with nitrogen pressure broadening to one atmosphere to emulate spectra recorded in the field. It is planned that the PNNL library will consist of approximately 500 vapor-phase spectra associated with the U.S. Department of Energy's environmental, energy, and public safety missions. At present, the database is comprised of approximately 300 infrared spectra, many of which represent highly reactive or toxic species. For the 298 K data, each reported spectrum is in fact a composite spectrum generated by a Beer's law plot (at each wavelength) to typically 12 measured spectra. Recent additions to the database include the vapors of several semi-volatile and non-volatile liquids using an improved dissemination technique for vaporizing the liquid into the nitrogen carrier gas. Experimental and analytical methods are used to remove several known and new artifacts associated with FTIR gas-phase spectroscopy. Details concerning sample preparation and composite spectrum generation are discussed.
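
    The composite-spectrum step described above can be sketched as follows. This is a minimal illustration rather than PNNL's actual processing: it assumes each measured spectrum is a list of absorbances on a shared wavelength grid, that each measurement has a known burden (concentration times path length), and that the Beer's law fit at each wavelength is an unweighted least-squares line through the origin. The function name composite_spectrum and the toy numbers are hypothetical.

```python
def composite_spectrum(burdens, spectra):
    """Fit absorbance = k * burden at each wavelength and return the slopes k.

    burdens: one burden value per measured spectrum (assumed known).
    spectra: one absorbance list per measured spectrum, all the same length.
    """
    if len(burdens) != len(spectra):
        raise ValueError("need exactly one burden per spectrum")
    denom = sum(b * b for b in burdens)
    n_points = len(spectra[0])
    # Least-squares slope of a line through the origin: sum(b*A) / sum(b*b)
    return [
        sum(b * s[k] for b, s in zip(burdens, spectra)) / denom
        for k in range(n_points)
    ]


# Toy data (made up): three noiseless measurements of a two-point spectrum.
true_k = [0.5, 0.1]
burdens = [1.0, 2.0, 4.0]
spectra = [[b * k for k in true_k] for b in burdens]
fitted = composite_spectrum(burdens, spectra)
```

    With noiseless inputs the fitted slopes reproduce the underlying values; with real data, a weighted fit or an intercept term may be more appropriate, which the abstract does not specify.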

  15. The PNNL Quantitative Infrared Database for Gas-Phase Sensing: A Spectral Library for Environmental, Hazmat, and Public Safety Standoff Detection

    SciTech Connect

    Johnson, Timothy J.; Sams, Robert L.; Sharpe, Steven W.; Sedlacek, Arthur J., III; Colton, Richard; Vo-Dinh, Tuan

    2004-03-25

    Pacific Northwest National Laboratory (PNNL) continues to expand its library of quantitative infrared reference spectra for remote sensing. The gas-phase data are recorded at 0.1 cm-1 resolution, with nitrogen pressure broadening to one atmosphere to emulate spectra recorded in the field. It is planned that the PNNL library will consist of approximately 500 vapor-phase spectra associated with the U.S. Department of Energy's environmental, energy, and public safety missions. At present, the database is comprised of approximately 300 infrared spectra, many of which represent highly reactive or toxic species. For the 298 K data, each reported spectrum is in fact a composite spectrum generated by a Beer's law plot (at each wavelength) to typically 12 measured spectra. Recent additions to the database include the vapors of several semi-volatile and non-volatile liquids using an improved dissemination technique for vaporizing the liquid into the nitrogen carrier gas. Experimental and analytical methods are used to remove several known and new artifacts associated with FTIR gas-phase spectroscopy. Details concerning sample preparation and composite spectrum generation are discussed.

  16. The PNNL Quantitative Infrared Database for Gas-Phase Sensing: A Spectral Library for Environmental, Hazmat and Public Safety Standoff Detection

    SciTech Connect

    Johnson, Timothy J.; Sams, Robert L.; Sharpe, Steven W.

    2004-01-01

    Pacific Northwest National Laboratory (PNNL) continues to expand its library of quantitative infrared reference spectra for remote sensing. The gas-phase data are recorded at 0.1 cm-1 resolution, with nitrogen pressure broadening to one atmosphere to emulate spectra recorded in the field. It is planned that the PNNL library will consist of approximately 500 vapor-phase spectra associated with DOE’s environmental, energy, and public safety missions. At present, the database is comprised of approximately 300 infrared spectra, many of which represent highly reactive or toxic species. For the 298 K data, each reported spectrum is in fact a composite spectrum generated by a Beer’s law plot (at each wavelength) to typically 12 measured spectra. Recent additions to the database include the vapors of several semi-volatile and non-volatile liquids using an improved dissemination technique for vaporizing the liquid into the nitrogen carrier gas. Experimental and analytical methods are used to remove several known and new artifacts associated with FTIR gas-phase spectroscopy. Details concerning sample preparation and composite spectrum generation are discussed.

  17. HS3D, A Dataset of Homo Sapiens Splice Regions, and its Extraction Procedure from a Major Public Database

    NASA Astrophysics Data System (ADS)

    Pollastro, Pasquale; Rampone, Salvatore

    The aim of this work is to describe a cleaning procedure for GenBank data that produces material for training computational approaches to gene characterization and for assessing their prediction accuracy. A procedure (GenBank2HS3D) has been defined, producing a dataset (HS3D - Homo Sapiens Splice Sites Dataset) of Homo sapiens splice regions extracted from GenBank (Rel. 123 at this time). It selects, from the complete GenBank Primate Division, entries of human nuclear DNA according to several assessed criteria; then it extracts exons and introns from these entries (currently 4523 + 3802). Donor and acceptor sites are then extracted as windows of 140 nucleotides around each splice site (3799 + 3799). After discarding windows that do not include canonical GT-AG junctions (65 + 74), that include insufficient data (not enough material for a 140-nucleotide window) (686 + 589), that include non-AGCT bases (29 + 30), or that are redundant (218 + 226), the remaining windows (2796 + 2880) are reported in the dataset. Finally, windows of false splice sites are selected by searching for canonical GT-AG pairs in non-splicing positions (271 937 + 332 296). The false sites in a range of +/- 60 from a true splice site are marked as proximal. HS3D, release 1.2 at this time, is available at the Web server of the University of Sannio: http://www.sci.unisannio.it/docenti/rampone/.
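
    The window extraction and filtering steps above can be sketched as follows. This is a minimal illustration, not the GenBank2HS3D code: the function names, the 0-based site convention (the site index is the first intron base for a donor and the first exon base for an acceptor), and the placement of the junction at the centre of the 140-nucleotide window are all assumptions made for the example.

```python
WINDOW = 140
HALF = WINDOW // 2
CANONICAL = {"donor": "GT", "acceptor": "AG"}


def extract_window(seq, site, kind):
    """Return the 140-nt window around a splice site, or None if it is rejected."""
    start, end = site - HALF, site + HALF
    if start < 0 or end > len(seq):
        return None  # insufficient data for a full window
    window = seq[start:end].upper()
    if set(window) - set("ACGT"):
        return None  # contains bases other than A, C, G, T
    # Donor: GT opens the intron; acceptor: AG closes it.
    dinuc = window[HALF:HALF + 2] if kind == "donor" else window[HALF - 2:HALF]
    if dinuc != CANONICAL[kind]:
        return None  # non-canonical junction
    return window


def build_dataset(candidates):
    """Keep accepted, non-redundant windows from (seq, site, kind) tuples."""
    seen, kept = set(), []
    for seq, site, kind in candidates:
        window = extract_window(seq, site, kind)
        if window is None or (kind, window) in seen:
            continue  # rejected or redundant
        seen.add((kind, window))
        kept.append((kind, window))
    return kept


# Made-up toy sequences, not GenBank data.
donor_seq = "A" * 70 + "GT" + "C" * 68
acceptor_seq = "C" * 68 + "AG" + "A" * 70
dataset = build_dataset([
    (donor_seq, 70, "donor"),
    (donor_seq, 70, "donor"),        # redundant copy, dropped
    (acceptor_seq, 70, "acceptor"),
    ("A" * 10 + "GT" + "A" * 10, 10, "donor"),  # too short, dropped
])
```

    Each filter here corresponds to one of the discard counts reported in the abstract; the real procedure may apply them in a different order or with additional criteria.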

  18. Contamination of sequence databases with adaptor sequences

    SciTech Connect

    Yoshikawa, Takeo; Sanders, A.R.; Detera-Wadleigh, S.D.

    1997-02-01

    Because of the exponential increase in the amount of DNA sequences being added to the public databases on a daily basis, it has become imperative to identify sources of contamination rapidly. Previously, contaminations of sequence databases have been reported to alert the scientific community to the problem. These contaminations can be divided into two categories. The first category comprises host sequences that have been difficult for submitters to manage or control. Examples include anomalous sequences derived from Escherichia coli, which are inserted into the chromosomes (and plasmids) of the bacterial hosts. Insertion sequences are highly mobile and are capable of transposing themselves into plasmids during cloning manipulation. Another example of the first category is the infection with yeast genomic DNA or with bacterial DNA of some commercially available cDNA libraries from Clontech. The second category of database contamination is due to the inadvertent inclusion of nonhost sequences. This category includes incorporation of cloning-vector sequences and multicloning sites in the database submission. M13-derived artifacts have been common, since M13-based vectors have been widely used for subcloning DNA fragments. Recognizing this problem, the National Center for Biotechnology Information (NCBI) started to screen, in April 1994, all sequences directly submitted to GenBank, against a set of vector data retrieved from GenBank by use of key-word searches, such as "vector." In this report, we present evidence for another sequence artifact that is widespread but that, to our knowledge, has not yet been reported. 11 refs., 1 tab.
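
    The idea of screening a submission against a set of known vector or adaptor sequences can be sketched as follows. This is a toy exact-match scan, not NCBI's screening procedure, which the abstract does not describe in detail; the function names are hypothetical and the motif "GATTACAGG" is a made-up placeholder rather than a real adaptor.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def screen(sequence, contaminants):
    """Report exact occurrences of each contaminant on either strand.

    Returns (name, 0-based position, strand) tuples; "-" marks a hit of the
    reverse complement.
    """
    sequence = sequence.upper()
    hits = []
    for name, motif in contaminants.items():
        probes = (("+", motif.upper()), ("-", reverse_complement(motif)))
        for strand, probe in probes:
            pos = sequence.find(probe)
            while pos != -1:
                hits.append((name, pos, strand))
                pos = sequence.find(probe, pos + 1)
    return hits


# Made-up submission containing the placeholder motif on both strands.
contaminants = {"toy_adaptor": "GATTACAGG"}
submission = "TTTT" + "GATTACAGG" + "CCCC" + "CCTGTAATC" + "AA"
hits = screen(submission, contaminants)
```

    A practical screen would use an alignment tool that tolerates mismatches and partial matches; exact matching is used here only to keep the sketch short.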

  19. GCOD - GeneChip Oncology Database

    PubMed Central

    2011-01-01

    Background: DNA microarrays have become a nearly ubiquitous tool for the study of human disease, and nowhere is this more true than in cancer. With hundreds of studies and thousands of expression profiles representing the majority of human cancers completed and in public databases, the challenge has been effectively accessing and using this wealth of data. Description: To address this issue we have collected published human cancer gene expression datasets generated on the Affymetrix GeneChip platform, and carefully annotated those studies with a focus on providing accurate sample annotation. To facilitate comparison between datasets, we implemented a consistent data normalization and transformation protocol and then applied stringent quality control procedures to flag low-quality assays. Conclusion: The resulting resource, the GeneChip Oncology Database, is available through a publicly accessible website that provides several query options and analytical tools through an intuitive interface. PMID:21291543

  20. The Molecular Biology Database Collection: 2008 update

    PubMed Central

    Galperin, Michael Y.

    2008-01-01

    The Nucleic Acids Research online Molecular Biology Database Collection is a public repository that lists more than 1000 databases described in this and previous Nucleic Acids Research annual database issues, as well as a selection of molecular biology databases described in other journals. All databases included in this Collection are freely available to the public. The 2008 update includes 1078 databases, 110 more than the previous one. The links to more than 80 databases have been updated and 25 obsolete databases have been removed from the list. The complete database list and summaries are available online at the Nucleic Acids Research web site, http://nar.oxfordjournals.org/. PMID:18025043

  1. Publications.

    ERIC Educational Resources Information Center

    Aviation/Space, 1980

    1980-01-01

    Presents a variety of publications available from government and nongovernment sources. The government publications are from the Federal Aviation Administration (FAA) and the National Aeronautics and Space Administration (NASA) and are designed for educators, students, and the public. (Author/SA)

  2. Bibliometric assessment of publication output of child and adolescent psychiatric/psychological affiliations between 2005 and 2010 based on the databases PubMed and Scopus.

    PubMed

    Albayrak, Ozgür; Föcker, Manuel; Wibker, Katrin; Hebebrand, Johannes

    2012-06-01

    We aimed to determine the quantitative scientific publication output of child and adolescent psychiatric/psychological affiliations during 2005-2010 by country based on both "PubMed" and "Scopus", and performed a bibliometric qualitative evaluation for 2009 using "PubMed". We performed our search by affiliation related to child and adolescent psychiatric/psychological institutions using "PubMed". For the quantitative analysis for 2005-2010, we counted the number of abstracts. For the qualitative analysis for 2009, we derived the impact factor of each abstract's journal from "Journal Citation Reports". We related total impact factor scores to the gross domestic product (GDP) and population size of each country. Additionally, we used "Scopus" to determine the number of abstracts for each country that was identified via "PubMed" for 2009 and compared the ranking of countries between the two databases. 61% of the publications between 2005 and 2010 originated from European countries and 26% from the USA. After adjustment for GDP and population size, the ranking positions changed in favor of smaller European countries with a population size of less than 20 million inhabitants. The ranking of countries for the count of articles in 2009 as derived from "Scopus" was similar to that identified via the "PubMed" search. The search revealed only minor differences between "Scopus" and "PubMed" in the ranking of countries. Our data indicate a sharp difference between countries with a high versus low GDP with regard to scientific publication output in child and adolescent psychiatry/psychology.

  3. DSSTOX STRUCTURE-SEARCHABLE PUBLIC TOXICITY DATABASE NETWORK: CURRENT PROGRESS AND NEW INITIATIVES TO IMPROVE CHEMO-BIOINFORMATICS CAPABILITIES

    EPA Science Inventory

    The EPA DSSTox website (http://www.epa.gov/nheerl/dsstox) publishes standardized, structure-annotated toxicity databases, covering a broad range of toxicity disciplines. Each DSSTox database features documentation written in collaboration with the source authors and toxicity expe...

  4. Biofuel Database

    National Institute of Standards and Technology Data Gateway

    Biofuel Database (Web, free access)   This database brings together structural, biological, and thermodynamic data for enzymes that are either in current use or are being considered for use in the production of biofuels.

  5. Database-assisted promoter analysis.

    PubMed

    Hehl, R; Wingender, E

    2001-06-01

    The analysis of regulatory sequences is greatly facilitated by database-assisted bioinformatic approaches. The TRANSFAC database contains information on transcription factors and their origins, functional properties and sequence-specific binding activities. Software tools enable us to screen the database with a given DNA sequence for interacting transcription factors. If a regulatory function is already attributed to this sequence then the database-assisted identification of binding sites for proteins or protein classes and subsequent experimental verification might establish functionally relevant sites within this sequence. The binding transcription factors and interacting factors might already be present in the database.

  6. Database Administrator

    ERIC Educational Resources Information Center

    Moore, Pam

    2010-01-01

    The Internet and electronic commerce (e-commerce) generate lots of data. Data must be stored, organized, and managed. Database administrators, or DBAs, work with database software to find ways to do this. They identify user needs, set up computer databases, and test systems. They ensure that systems perform as they should and add people to the…

  7. Overview of the HUPO Plasma Proteome Project: Results from the pilot phase with 35 collaborating laboratories and multiple analytical groups, generating a core dataset of 3020 proteins and a publicly-available database

    SciTech Connect

    Omenn, Gilbert; States, David J.; Adamski, Marcin; Blackwell, Thomas W.; Menon, Rajasree; Hermjakob, Henning; Apweiler, Rolf; Haab, Brian B.; Simpson, Richard; Eddes, James; Kapp, Eugene; Moritz, Rod; Chan, Daniel W.; Rai, Alex J.; Admon, Arie; Aebersold, Ruedi; Eng, Jimmy K.; Hancock, William S.; Hefta, Stanley A.; Meyer, Helmut; Paik, Young-Ki; Yoo, Jong-Shin; Ping, Peipei; Pounds, Joel G.; Adkins, Joshua N.; Qian, Xiaohong; Wang, Rong; Wasinger, Valerie; Wu, Chi Yue; Zhao, Xiaohang; Zeng, Rong; Archakov, Alexander; Tsugita, Akira; Beer, Ilan; Pandey, Akhilesh; Pisano, Michael; Andrews, Philip; Tammen, Harald; Speicher, David W.; Hanash, Samir M.

    2005-08-13

    HUPO initiated the Plasma Proteome Project (PPP) in 2002. Its pilot phase has (1) evaluated advantages and limitations of many depletion, fractionation, and MS technology platforms; (2) compared PPP reference specimens of human serum and EDTA, heparin, and citrate-anticoagulated plasma; and (3) created a publicly-available knowledge base (www.bioinformatics.med.umich.edu/hupo/ppp; www.ebi.ac.uk/pride). Thirty-five participating laboratories in 13 countries submitted datasets. Working groups addressed (a) specimen stability and protein concentrations; (b) protein identifications from 18 MS/MS datasets; (c) independent analyses from raw MS-MS spectra; (d) search engine performance, subproteome analyses, and biological insights; (e) antibody arrays; and (f) direct MS/SELDI analyses. MS-MS datasets had 15 710 different International Protein Index (IPI) protein IDs; our integration algorithm applied to multiple matches of peptide sequences yielded 9504 IPI proteins identified with one or more peptides and 3020 proteins identified with two or more peptides (the Core Dataset). These proteins have been characterized with Gene Ontology, InterPro, Novartis Atlas, OMIM, and immunoassay based concentration determinations. The database permits examination of many other subsets, such as 1274 proteins identified with three or more peptides. Reverse protein to DNA matching identified proteins for 118 previously unidentified ORFs. We recommend use of plasma instead of serum, with EDTA (or citrate) for anticoagulation. To improve resolution, sensitivity and reproducibility of peptide identifications and protein matches, we recommend combinations of depletion, fractionation, and MS/MS technologies, with explicit criteria for evaluation of spectra, use of search algorithms, and integration of homologous protein matches. This Special Issue of PROTEOMICS presents papers integral to the collaborative analysis plus many reports of supplementary work on various aspects of the PPP workplan.

  8. Method and system for normalizing biometric variations to authenticate users from a public database and that ensures individual biometric data privacy

    SciTech Connect

    Strait, R.S.; Pearson, P.K.; Sengupta, S.K.

    2000-03-14

    A password system comprises a set of codewords spaced apart from one another by a Hamming distance (HD) that exceeds twice the variability that can be projected for a series of biometric measurements for a particular individual and that is less than the HD that can be encountered between two individuals. To enroll an individual, a biometric measurement is taken and exclusive-ORed with a random codeword to produce a reference value. To verify the individual later, a biometric measurement is taken and exclusive-ORed with the reference value to reproduce the original random codeword or its approximation. If the reproduced value is not a codeword, the nearest codeword to it is found, and the bits that were corrected to produce the codeword are also toggled in the biometric measurement taken and the codeword generated during enrollment. The correction scheme can be implemented by any conventional error correction code such as Reed-Muller code R(m,n). In the implementation using a hand geometry device an R(2,5) code has been used in this invention. Such codeword and biometric measurement can then be used to see if the individual is an authorized user. Conventional Diffie-Hellman public key encryption schemes and hashing procedures can then be used to secure the communications lines carrying the biometric information and to secure the database of authorized users.

  9. Method and system for normalizing biometric variations to authenticate users from a public database and that ensures individual biometric data privacy

    DOEpatents

    Strait, Robert S.; Pearson, Peter K.; Sengupta, Sailes K.

    2000-01-01

    A password system comprises a set of codewords spaced apart from one another by a Hamming distance (HD) that exceeds twice the variability that can be projected for a series of biometric measurements for a particular individual and that is less than the HD that can be encountered between two individuals. To enroll an individual, a biometric measurement is taken and exclusive-ORed with a random codeword to produce a "reference value." To verify the individual later, a biometric measurement is taken and exclusive-ORed with the reference value to reproduce the original random codeword or its approximation. If the reproduced value is not a codeword, the nearest codeword to it is found, and the bits that were corrected to produce the codeword are also toggled in the biometric measurement taken and the codeword generated during enrollment. The correction scheme can be implemented by any conventional error correction code such as Reed-Muller code R(m,n). In the implementation using a hand geometry device an R(2,5) code has been used in this invention. Such codeword and biometric measurement can then be used to see if the individual is an authorized user. Conventional Diffie-Hellman public key encryption schemes and hashing procedures can then be used to secure the communications lines carrying the biometric information and to secure the database of authorized users.
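
    The enroll and verify steps described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: a small repetition code stands in for the Reed-Muller code, bit strings are held as Python integers, decoding is a brute-force nearest-codeword search, and all names (make_codebook, enroll, verify) are hypothetical.

```python
import secrets


def hamming(a, b):
    """Number of bit positions in which two integers differ."""
    return bin(a ^ b).count("1")


def make_codebook(msg_bits=3, repeat=5):
    """Toy repetition code: each message bit is repeated `repeat` times."""
    book = []
    for message in range(2 ** msg_bits):
        codeword = 0
        for i in range(msg_bits):
            bit = (message >> i) & 1
            for _ in range(repeat):
                codeword = (codeword << 1) | bit
        book.append(codeword)
    return book


CODEBOOK = make_codebook()  # 8 codewords of 15 bits, minimum distance 5


def enroll(biometric, codebook=CODEBOOK):
    """XOR the measurement with a random codeword to get the reference value."""
    codeword = secrets.choice(codebook)
    return biometric ^ codeword, codeword


def verify(biometric, reference, codebook=CODEBOOK):
    """XOR a fresh measurement with the reference and decode to a codeword."""
    noisy = biometric ^ reference
    return min(codebook, key=lambda c: hamming(c, noisy))


# Made-up 15-bit measurement.
biometric = 0b101100111000101
reference, codeword = enroll(biometric)
same_person = verify(biometric ^ 0b000010000000001, reference)  # 2 bits differ
other_person = verify(biometric ^ 0b111110000000000, reference)  # 5 bits differ
```

    Because the toy code has minimum distance 5, a re-measurement that differs in up to two bits still decodes to the enrolled codeword, while a measurement that differs by a full block decodes to a different one. A deployed system would presumably store only a hash of the codeword rather than the codeword itself.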

  10. Computational tools and resources for metabolism-related property predictions. 1. Overview of publicly available (free and commercial) databases and software

    PubMed Central

    Peach, Megan L; Zakharov, Alexey V; Liu, Ruifeng; Pugliese, Angelo; Tawa, Gregory; Wallqvist, Anders; Nicklaus, Marc C

    2014-01-01

    Metabolism has been identified as a defining factor in drug development success or failure because of its impact on many aspects of drug pharmacology, including bioavailability, half-life and toxicity. In this article, we provide an outline and descriptions of the resources for metabolism-related property predictions that are currently either freely or commercially available to the public. These resources include databases with data on, and software for prediction of, several end points: metabolite formation, sites of metabolic transformation, binding to metabolizing enzymes and metabolic stability. We attempt to place each tool in historical context and describe, wherever possible, the data it was based on. For predictions of interactions with metabolizing enzymes, we show a typical set of results for a small test set of compounds. Our aim is to give a clear overview of the areas and aspects of metabolism prediction in which the currently available resources are useful and accurate, and the areas in which they are inadequate or missing entirely. PMID:23088273

  11. Current research status, databases and application of single nucleotide polymorphism.

    PubMed

    Javed, R; Mukesh

    2010-07-01

    Single Nucleotide Polymorphisms (SNPs) are the most frequent form of DNA variation in the genome. SNPs are genetic markers which are bi-allelic in nature and grow at a very fast rate. Current genomic databases contain information on several million SNPs. More than 6 million SNPs have been identified and the information is publicly available through the efforts of the SNP Consortium and other databases. The NCBI plays a major role in facilitating the identification and cataloging of SNPs through the creation and maintenance of the public SNP database (dbSNP), which is used by the biomedical community worldwide and stimulates many areas of biological research, including the identification of the genetic components of disease. In this review article, we compile the existing SNP databases, their research status and their applications. PMID:21717869

  12. Public Databases Supporting Computational Toxicology

    EPA Science Inventory

    A major goal of the emerging field of computational toxicology is the development of screening-level models that predict potential toxicity of chemicals from a combination of mechanistic in vitro assay data and chemical structure descriptors. In order to build these models, resea...

  13. Hawaii bibliographic database

    USGS Publications Warehouse

    Wright, T.L.; Takahashi, T.J.

    1998-01-01

    The Hawaii bibliographic database has been created to contain all of the literature, from 1779 to the present, pertinent to the volcanological history of the Hawaiian-Emperor volcanic chain. References are entered in a PC- and Macintosh-compatible EndNote Plus bibliographic database with keywords and abstracts or (if no abstract) with annotations as to content. Keywords emphasize location, discipline, process, identification of new chemical data or age determinations, and type of publication. The database is updated approximately three times a year and is available to upload from an ftp site. The bibliography contained 8460 references at the time this paper was submitted for publication. Use of the database greatly enhances the power and completeness of library searches for anyone interested in Hawaiian volcanism.

  14. Fast decision tree-based method to index large DNA-protein sequence databases using hybrid distributed-shared memory programming model.

    PubMed

    Jaber, Khalid Mohammad; Abdullah, Rosni; Rashid, Nur'Aini Abdul

    2014-01-01

    In recent times, the size of biological databases has increased significantly, with continuous growth in the number of users and the rate of queries, such that some databases have reached terabyte size. There is, therefore, an increasing need to access databases at the fastest rates possible. In this paper, the decision tree indexing model (PDTIM) was parallelised using a hybrid of distributed and shared memory on a resident database, with horizontal and vertical growth through the Message Passing Interface (MPI) and POSIX Threads (PThread), to accelerate the index building time. The PDTIM was implemented using 1, 2, 4 and 5 processors on 1, 2, 3 and 4 threads respectively. The results show that the hybrid technique improved the speedup compared to a sequential version. It can be concluded from the results that the proposed PDTIM is appropriate for large data sets in terms of index building time.

  15. Reference databases for taxonomic assignment in metagenomics.

    PubMed

    Santamaria, Monica; Fosso, Bruno; Consiglio, Arianna; De Caro, Giorgio; Grillo, Giorgio; Licciulli, Flavio; Liuni, Sabino; Marzano, Marinella; Alonso-Alemany, Daniel; Valiente, Gabriel; Pesole, Graziano

    2012-11-01

    Metagenomics is providing unprecedented access to environmental microbial diversity. The amplicon-based metagenomics approach involves the PCR-targeted sequencing of a genetic locus that meets several requirements: it must be ubiquitous in the taxonomic range of interest, variable enough to discriminate between different species but flanked by highly conserved sequences, and of suitable size to be sequenced through next-generation platforms. The internal transcribed spacers 1 and 2 (ITS1 and ITS2) of the ribosomal DNA operon and one or more hyper-variable regions of the 16S ribosomal RNA gene are typically used to identify fungal and bacterial species, respectively. In this context, reliable reference databases and taxonomies are crucial to assign amplicon sequence reads to the correct phylogenetic ranks. Several resources provide consistent phylogenetic classification of publicly available 16S ribosomal DNA sequences, whereas the state of ribosomal internal transcribed spacers reference databases is notably less advanced. In this review, we aim to give an overview of existing reference resources for both types of markers, highlighting strengths and possible shortcomings of their use for metagenomics purposes. Moreover, we present a new database, ITSoneDB, of well annotated and phylogenetically classified ITS1 sequences to be used as a reference collection in metagenomic studies of environmental fungal communities. ITSoneDB is available for download and browsing at http://itsonedb.ba.itb.cnr.it/.

  16. International Society of Human and Animal Mycology (ISHAM)-ITS reference DNA barcoding database--the quality controlled standard tool for routine identification of human and animal pathogenic fungi.

    PubMed

    Irinyi, Laszlo; Serena, Carolina; Garcia-Hermoso, Dea; Arabatzis, Michael; Desnos-Ollivier, Marie; Vu, Duong; Cardinali, Gianluigi; Arthur, Ian; Normand, Anne-Cécile; Giraldo, Alejandra; da Cunha, Keith Cassia; Sandoval-Denis, Marcelo; Hendrickx, Marijke; Nishikaku, Angela Satie; de Azevedo Melo, Analy Salles; Merseguel, Karina Bellinghausen; Khan, Aziza; Parente Rocha, Juliana Alves; Sampaio, Paula; da Silva Briones, Marcelo Ribeiro; e Ferreira, Renata Carmona; de Medeiros Muniz, Mauro; Castañón-Olivares, Laura Rosio; Estrada-Barcenas, Daniel; Cassagne, Carole; Mary, Charles; Duan, Shu Yao; Kong, Fanrong; Sun, Annie Ying; Zeng, Xianyu; Zhao, Zuotao; Gantois, Nausicaa; Botterel, Françoise; Robbertse, Barbara; Schoch, Conrad; Gams, Walter; Ellis, David; Halliday, Catriona; Chen, Sharon; Sorrell, Tania C; Piarroux, Renaud; Colombo, Arnaldo L; Pais, Célia; de Hoog, Sybren; Zancopé-Oliveira, Rosely Maria; Taylor, Maria Lucia; Toriello, Conchita; de Almeida Soares, Célia Maria; Delhaes, Laurence; Stubbe, Dirk; Dromer, Françoise; Ranque, Stéphane; Guarro, Josep; Cano-Lira, Jose F; Robert, Vincent; Velegraki, Aristea; Meyer, Wieland

    2015-05-01

    Human and animal fungal pathogens are a growing threat worldwide leading to emerging infections and creating new risks for established ones. There is a growing need for a rapid and accurate identification of pathogens to enable early diagnosis and targeted antifungal therapy. Morphological and biochemical identification methods are time-consuming and require trained experts. Alternatively, molecular methods, such as DNA barcoding, a powerful and easy tool for rapid monophasic identification, offer a practical approach for species identification and are less demanding in terms of taxonomical expertise. However, its widespread use is still limited by a lack of quality-controlled reference databases and the evolving recognition and definition of new fungal species/complexes. An international consortium of medical mycology laboratories was formed aiming to establish a quality controlled ITS database under the umbrella of the ISHAM working group on "DNA barcoding of human and animal pathogenic fungi." A new database, containing 2800 ITS sequences representing 421 fungal species, providing the medical community with a freely accessible tool at http://www.isham.org/ and http://its.mycologylab.org/ to rapidly and reliably identify most agents of mycoses, was established. The generated sequences included in the new database were used to evaluate the variation and overall utility of the ITS region for the identification of pathogenic fungi at intra- and interspecies levels. The average intraspecies variation ranged from 0 to 2.25%. This highlighted selected pathogenic fungal species, such as the dermatophytes and emerging yeast, for which additional molecular methods/genetic markers are required for their reliable identification from clinical and veterinary specimens. PMID:25802363

  17. International Society of Human and Animal Mycology (ISHAM)-ITS reference DNA barcoding database--the quality controlled standard tool for routine identification of human and animal pathogenic fungi.

    PubMed

    Irinyi, Laszlo; Serena, Carolina; Garcia-Hermoso, Dea; Arabatzis, Michael; Desnos-Ollivier, Marie; Vu, Duong; Cardinali, Gianluigi; Arthur, Ian; Normand, Anne-Cécile; Giraldo, Alejandra; da Cunha, Keith Cassia; Sandoval-Denis, Marcelo; Hendrickx, Marijke; Nishikaku, Angela Satie; de Azevedo Melo, Analy Salles; Merseguel, Karina Bellinghausen; Khan, Aziza; Parente Rocha, Juliana Alves; Sampaio, Paula; da Silva Briones, Marcelo Ribeiro; e Ferreira, Renata Carmona; de Medeiros Muniz, Mauro; Castañón-Olivares, Laura Rosio; Estrada-Barcenas, Daniel; Cassagne, Carole; Mary, Charles; Duan, Shu Yao; Kong, Fanrong; Sun, Annie Ying; Zeng, Xianyu; Zhao, Zuotao; Gantois, Nausicaa; Botterel, Françoise; Robbertse, Barbara; Schoch, Conrad; Gams, Walter; Ellis, David; Halliday, Catriona; Chen, Sharon; Sorrell, Tania C; Piarroux, Renaud; Colombo, Arnaldo L; Pais, Célia; de Hoog, Sybren; Zancopé-Oliveira, Rosely Maria; Taylor, Maria Lucia; Toriello, Conchita; de Almeida Soares, Célia Maria; Delhaes, Laurence; Stubbe, Dirk; Dromer, Françoise; Ranque, Stéphane; Guarro, Josep; Cano-Lira, Jose F; Robert, Vincent; Velegraki, Aristea; Meyer, Wieland

    2015-05-01

    Human and animal fungal pathogens are a growing threat worldwide, leading to emerging infections and creating new risks for established ones. There is a growing need for rapid and accurate identification of pathogens to enable early diagnosis and targeted antifungal therapy. Morphological and biochemical identification methods are time-consuming and require trained experts. Alternatively, molecular methods such as DNA barcoding, a powerful and easy tool for rapid monophasic identification, offer a practical approach for species identification and are less demanding in terms of taxonomical expertise. However, the widespread use of DNA barcoding is still limited by a lack of quality-controlled reference databases and the evolving recognition and definition of new fungal species/complexes. An international consortium of medical mycology laboratories was formed aiming to establish a quality-controlled ITS database under the umbrella of the ISHAM working group on "DNA barcoding of human and animal pathogenic fungi." A new database containing 2800 ITS sequences representing 421 fungal species was established, providing the medical community with a freely accessible tool at http://www.isham.org/ and http://its.mycologylab.org/ to rapidly and reliably identify most agents of mycoses. The generated sequences included in the new database were used to evaluate the variation and overall utility of the ITS region for the identification of pathogenic fungi at intra- and interspecies level. The average intraspecies variation ranged from 0 to 2.25%. This highlighted selected pathogenic fungal species, such as the dermatophytes and emerging yeast, for which additional molecular methods/genetic markers are required for their reliable identification from clinical and veterinary specimens.
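
The abstract above reports an average intraspecies variation for ITS sequences but does not say how it was computed. As a rough illustration only, the sketch below assumes the uncorrected pairwise p-distance (share of differing sites between two aligned sequences, gaps skipped) averaged over all pairs within a species. The function names and the toy sequences are hypothetical and are not taken from the ISHAM-ITS database.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Uncorrected p-distance between two aligned sequences of equal length.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable (ungapped) columns")
    return differing / compared


def mean_intraspecies_variation(aligned_seqs: list[str]) -> float:
    """Mean pairwise p-distance within one species, as a percentage."""
    pairs = list(combinations(aligned_seqs, 2))
    if not pairs:
        return 0.0  # a single sequence has no measurable variation
    total = sum(p_distance(a, b) for a, b in pairs)
    return 100.0 * total / len(pairs)


# Hypothetical toy alignment for one species (not real ITS data).
toy_species = ["ACGTACGTAC", "ACGTACGTAC", "ACGTACGTAT"]
variation_percent = mean_intraspecies_variation(toy_species)
```

Under this assumption, a species whose mean value is high relative to its distance from sister species is one where a single marker would be expected to identify isolates less reliably.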

  18. Database Manager

    ERIC Educational Resources Information Center

    Martin, Andrew

    2010-01-01

    It is normal practice today for organizations to store large quantities of records of related information as computer-based files or databases. Purposeful information is retrieved by performing queries on the data sets. The purpose of DATABASE MANAGER is to communicate to students the method by which the computer performs these queries. This…

  19. Maize databases

    Technology Transfer Automated Retrieval System (TEKTRAN)

    This chapter is a succinct overview of maize data held in the species-specific database MaizeGDB (the Maize Genomics and Genetics Database), and selected multi-species data repositories, such as Gramene/Ensembl Plants, Phytozome, UniProt and the National Center for Biotechnology Information (NCBI), ...

  20. The Hawaiian Algal Database: a laboratory LIMS and online resource for biodiversity data

    PubMed Central

    Wang, Norman; Sherwood, Alison R; Kurihara, Akira; Conklin, Kimberly Y; Sauvage, Thomas; Presting, Gernot G

    2009-01-01

    Background: Organization and presentation of biodiversity data is greatly facilitated by databases that are specially designed to allow easy data entry and organized data display. Such databases also have the capacity to serve as Laboratory Information Management Systems (LIMS). The Hawaiian Algal Database was designed to showcase specimens collected from the Hawaiian Archipelago, enabling users around the world to compare their specimens with our photographs and DNA sequence data, and to provide lab personnel with an organizational tool for storing various biodiversity data types. Description: We describe the Hawaiian Algal Database, a comprehensive and searchable database containing photographs and micrographs, geo-referenced collecting information, taxonomic checklists and standardized DNA sequence data. All data for individual samples are linked through unique accession numbers. Users can search online for sample information by accession number, numerous levels of taxonomy, or collection site. At the present time the database contains data representing over 2,000 samples of marine, freshwater and terrestrial algae from the Hawaiian Archipelago. These samples are primarily red algae, although other taxa are being added. Conclusion: The Hawaiian Algal Database is a digital repository for Hawaiian algal samples and acts as a LIMS for the laboratory. Users can make use of the online search tool to view and download specimen photographs and micrographs, DNA sequences and relevant habitat data, including georeferenced collecting locations. It is publicly available at . PMID:19728892

  1. National Ambient Radiation Database

    SciTech Connect

    Dziuban, J.; Sears, R.

    2003-02-25

    The U.S. Environmental Protection Agency (EPA) recently developed a searchable database and website for the Environmental Radiation Ambient Monitoring System (ERAMS) data. This site contains nationwide radiation monitoring data for air particulates, precipitation, drinking water, surface water and pasteurized milk. This site provides location-specific as well as national information on environmental radioactivity across several media. It provides high quality data for assessing public exposure and environmental impacts resulting from nuclear emergencies and provides baseline data during routine conditions. The database and website are accessible at www.epa.gov/enviro/. This site contains (1) a query for the general public that is easy to use and limits the amount of information provided, but includes the ability to graph the data with risk benchmarks; (2) a query for a more technical user that allows access to all of the data in the database; and (3) background information on ERAMS.

  2. Ovarian Kaleidoscope Database: Ten Years and Beyond

    PubMed Central

    Hsueh, Aaron J.; Rauch, Rami

    2012-01-01

    Ovarian Kaleidoscope database (OKdb) is an online, searchable, public database containing text-based and DNA microarray data to facilitate research by ovarian researchers. Using key words and predetermined categories, users can search ovarian gene information based on gene function, cell type of expression, cellular localization, hormonal regulation, mutant phenotypes, chromosomal location, ligand-receptor relationship, and other criteria, either alone or in combination. For individual genes, users can access more than 10 extensive DNA microarray datasets to interrogate gene expression patterns in a development-specific and cell type-specific manner. All ligand and receptor genes expressed in the ovary are matched to facilitate investigation of paracrine/autocrine signaling. More than 3500 ovarian genes in the database are matched to 185 gene pathways in the Kyoto Encyclopedia of Genes and Genomes to allow for elucidation of gene interactions and relationships. In addition to >400 genes with infertility or subfertility phenotypes when mutated in mice or humans, the OKdb also lists ∼50 and ∼40 genes associated with polycystic ovarian syndrome and primary ovarian insufficiency, respectively. The expanding OKdb is updated weekly and allows submission of new genes by ovarian researchers to allow instant access to DNA microarray datasets for newly submitted genes. The present database is a virtual community for ovarian researchers and allows users to instantaneously provide their comments for individual gene pages based on an automated Web-discussion system. In the coming years, we will continue to add new features to serve the ovarian research community. PMID:22441797

  3. Beyond the cold hit: measuring the impact of the national DNA data bank on public safety at the city and county level.

    PubMed

    Gabriel, Matthew; Boland, Cherisse; Holt, Cydne

    2010-01-01

    Over the past decade, the Combined DNA Index System (CODIS) has increased solvability of violent crimes by linking evidence DNA profiles to known offenders. At present, an in-depth analysis of the United States National DNA Data Bank effort has not assessed the success of this national public safety endeavor. Critics of this effort often focus on laboratory and police investigators unable to provide timely investigative support as a root cause(s) of CODIS' failure to increase public safety. By studying a group of nearly 200 DNA cold hits obtained in SFPD criminal investigations from 2001-2006, three key performance metrics (Significance of Cold Hits, Case Progression & Judicial Resolution, and Potential Reduction of Future Criminal Activity) provide a proper context in which to define the impact of CODIS at the City and County level. Further, the analysis of a recidivist group of cold hit offenders and their past interaction with law enforcement established five noteworthy criminal case resolution trends; these trends signify challenges to CODIS in achieving meaningful case resolutions. CODIS' effectiveness and critical activities to support case resolutions are the responsibility of all criminal justice partners in order to achieve long-lasting public safety within the United States.

  4. Fun Databases: My Top Ten.

    ERIC Educational Resources Information Center

    O'Leary, Mick

    1992-01-01

    Provides reviews of 10 online databases: Consumer Reports; Public Opinion Online; Encyclopedia of Associations; Official Airline Guide Adventure Atlas and Events Calendar; CENDATA; Hollywood Hotline; Fearless Taster; Soap Opera Summaries; and Human Sexuality. (LRW)

  5. NUCLEAR DATABASES FOR REACTOR APPLICATIONS.

    SciTech Connect

    PRITYCHENKO, B.; ARCILLA, R.; BURROWS, T.; HERMAN, M.W.; MUGHABGHAB, S.; OBLOZINSKY, P.; ROCHMAN, D.; SONZOGNI, A.A.; TULI, J.; WINCHELL, D.F.

    2006-06-05

    The National Nuclear Data Center (NNDC): An overview of nuclear databases, related products, nuclear data Web services and publications. The NNDC collects, evaluates, and disseminates nuclear physics data for basic research and applied nuclear technologies. The NNDC maintains and contributes to the nuclear reaction (ENDF, CSISRS) and nuclear structure databases along with several other databases (CapGam, MIRD, IRDF-2002) and provides coordination for the Cross Section Evaluation Working Group (CSEWG) and the US Nuclear Data Program (USNDP). The Center produces several publications, such as the Atlas of Neutron Resonances and the Nuclear Wallet Cards booklets, and develops codes, such as the nuclear reaction model code Empire.

  6. Solubility Database

    National Institute of Standards and Technology Data Gateway

    SRD 106 IUPAC-NIST Solubility Database (Web, free access)   These solubilities are compiled from 18 volumes of the International Union of Pure and Applied Chemistry (IUPAC)-NIST Solubility Data Series. The database includes liquid-liquid, solid-liquid, and gas-liquid systems. Typical solvents and solutes include water, seawater, heavy water, inorganic compounds, and a variety of organic compounds such as hydrocarbons, halogenated hydrocarbons, alcohols, acids, esters and nitrogen compounds. There are over 67,500 solubility measurements and over 1800 references.

  7. Addition of a breeding database in the Genome Database for Rosaceae.

    PubMed

    Evans, Kate; Jung, Sook; Lee, Taein; Brutcher, Lisa; Cho, Ilhyung; Peace, Cameron; Main, Dorrie

    2013-01-01

    Breeding programs produce large datasets that require efficient management systems to keep track of performance, pedigree, geographical and image-based data. With the development of DNA-based screening technologies, more breeding programs perform genotyping in addition to phenotyping for performance evaluation. The integration of breeding data with other genomic and genetic data is instrumental for the refinement of marker-assisted breeding tools, enhances genetic understanding of important crop traits and maximizes access and utility by crop breeders and allied scientists. New infrastructure in the Genome Database for Rosaceae (GDR) was designed and implemented to enable secure and efficient storage, management and analysis of large datasets from the Washington State University apple breeding program and subsequently expanded to fit datasets from other Rosaceae breeders. The infrastructure was built using the software Chado and Drupal, making use of the Natural Diversity module to accommodate large-scale phenotypic and genotypic data. Breeders can search accessions within the GDR to identify individuals with specific trait combinations. Results from Search by Parentage list individuals with parents in common, and results from Individual Variety pages link to all data available on each chosen individual including pedigree, phenotypic and genotypic information. Genotypic data are searchable by markers and alleles; results are linked to other pages in the GDR to enable the user to access tools such as GBrowse and CMap. This breeding database provides users with the opportunity to search datasets in a fully targeted manner and retrieve and compare performance data from multiple selections, years and sites, and to output the data needed for variety release publications and patent applications. The breeding database facilitates efficient program management. Storing publicly available breeding data in a database together with genomic and genetic data will…

  9. ARTI Refrigerant Database

    SciTech Connect

    Calm, J.M.

    1992-04-30

    The Refrigerant Database consolidates and facilitates access to information to assist industry in developing equipment using alternative refrigerants. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The complete documents are not included, though some may be added at a later date. The database identifies sources of specific information on R-32, R-123, R-124, R-125, R-134a, R-141b, R-142b, R-143a, R-152a, R-290 (propane), R-717 (ammonia), ethers, and others as well as azeotropic and zeotropic blends of these fluids. It addresses polyalkylene glycol (PAG), ester, and other lubricants. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits.

  10. The Genopolis Microarray Database

    PubMed Central

    Splendiani, Andrea; Brandizi, Marco; Even, Gael; Beretta, Ottavio; Pavelka, Norman; Pelizzola, Mattia; Mayhaus, Manuel; Foti, Maria; Mauri, Giancarlo; Ricciardi-Castagnoli, Paola

    2007-01-01

    Background: Gene expression databases are key resources for microarray data management and analysis, and the importance of a proper annotation of their content is well understood. Public repositories as well as microarray database systems that can be implemented by single laboratories exist. However, there is not yet a tool that can easily support a collaborative environment where different users with different rights of access to data can interact to define a common highly coherent content. The scope of the Genopolis database is to provide a resource that allows different groups performing microarray experiments related to a common subject to create a common coherent knowledge base and to analyse it. The Genopolis database has been implemented as a dedicated system for the scientific community studying dendritic and macrophage cell functions and host-parasite interactions. Results: The Genopolis Database system allows the community to build an object-based, MIAME-compliant annotation of their experiments and to store images, raw and processed data from the Affymetrix GeneChip® platform. It supports dynamical definition of controlled vocabularies and provides automated and supervised steps to control the coherence of data and annotations. It allows a precise control of the visibility of the database content to different subgroups in the community and facilitates exports of its content to public repositories. It provides an interactive user interface for data analysis: this allows users to visualize data matrices based on functional lists and sample characterization, and to navigate to other data matrices defined by similarity of expression values as well as functional characterizations of genes involved. A collaborative environment is also provided for the definition and sharing of functional annotation by users. Conclusion: The Genopolis Database supports a community in building a common coherent knowledge base and analysing it. This fills a gap between a local

  11. The EMBL nucleotide sequence database.

    PubMed Central

    Stoesser, G; Moseley, M A; Sleep, J; McGowran, M; Garcia-Pastor, M; Sterk, P

    1998-01-01

    The EMBL Nucleotide Sequence Database (http://www.ebi.ac.uk/embl.html) constitutes Europe's primary nucleotide sequence resource. DNA and RNA sequences are directly submitted from researchers and genome sequencing groups and collected from the scientific literature and patent applications (Fig. 1). In collaboration with DDBJ and GenBank the database is produced, maintained and distributed at the European Bioinformatics Institute. Database releases are produced quarterly and are distributed on CD-ROM. EBI's network services allow access to the most up-to-date data collection via Internet and World Wide Web interface, providing database searching and sequence similarity facilities plus access to a large number of additional databases. PMID:9399791

  12. The Histone Database: an integrated resource for histones and histone fold-containing proteins.

    PubMed

    Mariño-Ramírez, Leonardo; Levine, Kevin M; Morales, Mario; Zhang, Suiyuan; Moreland, R Travis; Baxevanis, Andreas D; Landsman, David

    2011-01-01

    Eukaryotic chromatin is composed of DNA and protein components (core histones) that act to compactly pack the DNA into nucleosomes, the fundamental building blocks of chromatin. These nucleosomes are connected to adjacent nucleosomes by linker histones. Nucleosomes are highly dynamic and, through various core histone post-translational modifications and incorporation of diverse histone variants, can serve as epigenetic marks to control processes such as gene expression and recombination. The Histone Sequence Database is a curated collection of sequences and structures of histones and non-histone proteins containing histone folds, assembled from major public databases. Here, we report a substantial increase in the number of sequences and taxonomic coverage for histone and histone fold-containing proteins available in the database. Additionally, the database now contains an expanded dataset that includes archaeal histone sequences. The database also provides comprehensive multiple sequence alignments for each of the four core histones (H2A, H2B, H3 and H4), the linker histones (H1/H5) and the archaeal histones. The database also includes current information on solved histone fold-containing structures. The Histone Sequence Database is an inclusive resource for the analysis of chromatin structure and function focused on histones and histone fold-containing proteins.

  13. FishTraits Database

    USGS Publications Warehouse

    Angermeier, Paul L.; Frimpong, Emmanuel A.

    2009-01-01

    The need for integrated and widely accessible sources of species traits data to facilitate studies of ecology, conservation, and management has motivated development of traits databases for various taxa. In spite of the increasing number of traits-based analyses of freshwater fishes in the United States, no consolidated database of traits of this group exists publicly, and much useful information on these species is documented only in obscure sources. The largely inaccessible and unconsolidated traits information makes large-scale analysis involving many fishes and/or traits particularly challenging. FishTraits is a database of >100 traits for 809 (731 native and 78 exotic) fish species found in freshwaters of the conterminous United States, including 37 native families and 145 native genera. The database contains information on four major categories of traits: (1) trophic ecology, (2) body size and reproductive ecology (life history), (3) habitat associations, and (4) salinity and temperature tolerances. Information on geographic distribution and conservation status is also included. Together, we refer to the traits, distribution, and conservation status information as attributes. Descriptions of attributes are available here. Many sources were consulted to compile attributes, including state and regional species accounts and other databases.

  14. A comprehensive DNA barcode database for Central European beetles with a focus on Germany: adding more than 3500 identified species to BOLD.

    PubMed

    Hendrich, Lars; Morinière, Jérôme; Haszprunar, Gerhard; Hebert, Paul D N; Hausmann, Axel; Köhler, Frank; Balke, Michael

    2015-07-01

    Beetles are the most diverse group of animals and are crucial for ecosystem functioning. In many countries, they are well established for environmental impact assessment, but even in the well-studied Central European fauna, species identification can be very difficult. A comprehensive and taxonomically well-curated DNA barcode library could remedy this deficit and could also link hundreds of years of traditional knowledge with next generation sequencing technology. However, such a beetle library is missing to date. This study provides the globally largest DNA barcode reference library for Coleoptera for 15 948 individuals belonging to 3514 well-identified species (53% of the German fauna) with representatives from 97 of 103 families (94%). This study is the first comprehensive regional test of the efficiency of DNA barcoding for beetles with a focus on Germany. Sequences ≥500 bp were recovered from 63% of the specimens analysed (15 948 of 25 294) with short sequences from another 997 specimens. Whereas most specimens (92.2%) could be unambiguously assigned to a single known species by sequence diversity at CO1, 1089 specimens (6.8%) were assigned to more than one Barcode Index Number (BIN), creating 395 BINs which need further study to ascertain if they represent cryptic species, mitochondrial introgression, or simply regional variation in widespread species. We found 409 specimens (2.6%) that shared a BIN assignment with another species, most involving a pair of closely allied species as 43 BINs were involved. Most of these taxa were separated by barcodes although sequence divergences were low. Only 155 specimens (0.97%) show identical or overlapping clusters.
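
The percentages quoted in the abstract above follow directly from the specimen counts it gives. The short sketch below simply recomputes them as a consistency check; the variable and function names are illustrative choices, and only the counts come from the abstract.

```python
# Counts quoted in the abstract above.
SPECIMENS_ANALYSED = 25_294
SPECIMENS_WITH_BARCODE = 15_948  # sequences of at least 500 bp


def percent(part: int, whole: int, digits: int = 1) -> float:
    """Return part/whole as a percentage rounded to `digits` decimals."""
    return round(100.0 * part / whole, digits)


# Share of analysed specimens that yielded a sequence of >= 500 bp.
recovery = percent(SPECIMENS_WITH_BARCODE, SPECIMENS_ANALYSED, 0)
# Share of families represented (97 of 103).
families = percent(97, 103, 0)
# Specimens assigned to more than one Barcode Index Number (BIN).
multi_bin = percent(1089, SPECIMENS_WITH_BARCODE)
# Specimens sharing a BIN with another species.
shared_bin = percent(409, SPECIMENS_WITH_BARCODE)
# Specimens showing identical or overlapping clusters.
overlapping = percent(155, SPECIMENS_WITH_BARCODE, 2)

print(recovery, families, multi_bin, shared_bin, overlapping)
```

The recomputed values match the 63%, 94%, 6.8%, 2.6% and 0.97% reported in the abstract, which confirms that the last three figures are expressed relative to the 15 948 barcoded specimens rather than to all 25 294 analysed.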

  15. Annual Review of Database Developments 1991.

    ERIC Educational Resources Information Center

    Basch, Reva

    1991-01-01

    Review of developments in databases highlights a new emphasis on accessibility. Topics discussed include the internationalization of databases; databases that deal with finance, drugs, and toxic waste; access to public records, both personal and corporate; media online; reducing large files of data to smaller, more manageable files; and…

  16. ECOTOX database; new additions and future direction

    EPA Science Inventory

    The ECOTOXicology database (ECOTOX) is a comprehensive, publicly available knowledgebase developed and maintained by ORD/NHEERL. It is used for environmental toxicity data on aquatic life, terrestrial plants and wildlife. Publications are identified for potential applicability af...

  17. The RIKEN integrated database of mammals

    PubMed Central

    Masuya, Hiroshi; Makita, Yuko; Kobayashi, Norio; Nishikata, Koro; Yoshida, Yuko; Mochizuki, Yoshiki; Doi, Koji; Takatsuki, Terue; Waki, Kazunori; Tanaka, Nobuhiko; Ishii, Manabu; Matsushima, Akihiro; Takahashi, Satoshi; Hijikata, Atsushi; Kozaki, Kouji; Furuichi, Teiichi; Kawaji, Hideya; Wakana, Shigeharu; Nakamura, Yukio; Yoshiki, Atsushi; Murata, Takehide; Fukami-Kobayashi, Kaoru; Mohan, Sujatha; Ohara, Osamu; Hayashizaki, Yoshihide; Mizoguchi, Riichiro; Obata, Yuichi; Toyoda, Tetsuro

    2011-01-01

    The RIKEN integrated database of mammals (http://scinets.org/db/mammal) is the official undertaking to integrate its mammalian databases produced from multiple large-scale programs that have been promoted by the institute. The database integrates not only RIKEN’s original databases, such as FANTOM, the ENU mutagenesis program, the RIKEN Cerebellar Development Transcriptome Database and the Bioresource Database, but also imported data from public databases, such as Ensembl, MGI and biomedical ontologies. Our integrated database has been implemented on the infrastructure of publication medium for databases, termed SciNetS/SciNeS, or the Scientists’ Networking System, where the data and metadata are structured as a semantic web and are downloadable in various standardized formats. The top-level ontology-based implementation of mammal-related data directly integrates the representative knowledge and individual data records in existing databases to ensure advanced cross-database searches and reduced unevenness of the data management operations. Through the development of this database, we propose a novel methodology for the development of standardized comprehensive management of heterogeneous data sets in multiple databases to improve the sustainability, accessibility, utility and publicity of the data of biomedical information. PMID:21076152

  18. The RIKEN integrated database of mammals.

    PubMed

    Masuya, Hiroshi; Makita, Yuko; Kobayashi, Norio; Nishikata, Koro; Yoshida, Yuko; Mochizuki, Yoshiki; Doi, Koji; Takatsuki, Terue; Waki, Kazunori; Tanaka, Nobuhiko; Ishii, Manabu; Matsushima, Akihiro; Takahashi, Satoshi; Hijikata, Atsushi; Kozaki, Kouji; Furuichi, Teiichi; Kawaji, Hideya; Wakana, Shigeharu; Nakamura, Yukio; Yoshiki, Atsushi; Murata, Takehide; Fukami-Kobayashi, Kaoru; Mohan, Sujatha; Ohara, Osamu; Hayashizaki, Yoshihide; Mizoguchi, Riichiro; Obata, Yuichi; Toyoda, Tetsuro

    2011-01-01

    The RIKEN integrated database of mammals (http://scinets.org/db/mammal) is the official undertaking to integrate its mammalian databases produced from multiple large-scale programs that have been promoted by the institute. The database integrates not only RIKEN's original databases, such as FANTOM, the ENU mutagenesis program, the RIKEN Cerebellar Development Transcriptome Database and the Bioresource Database, but also imported data from public databases, such as Ensembl, MGI and biomedical ontologies. Our integrated database has been implemented on the infrastructure of publication medium for databases, termed SciNetS/SciNeS, or the Scientists' Networking System, where the data and metadata are structured as a semantic web and are downloadable in various standardized formats. The top-level ontology-based implementation of mammal-related data directly integrates the representative knowledge and individual data records in existing databases to ensure advanced cross-database searches and reduced unevenness of the data management operations. Through the development of this database, we propose a novel methodology for the development of standardized comprehensive management of heterogeneous data sets in multiple databases to improve the sustainability, accessibility, utility and publicity of the data of biomedical information.

  19. DNA data bank of Japan (DDBJ) progress report

    PubMed Central

    Mashima, Jun; Kodama, Yuichi; Kosuge, Takehide; Fujisawa, Takatomo; Katayama, Toshiaki; Nagasaki, Hideki; Okuda, Yoshihiro; Kaminuma, Eli; Ogasawara, Osamu; Okubo, Kousaku; Nakamura, Yasukazu; Takagi, Toshihisa

    2016-01-01

    The DNA Data Bank of Japan Center (DDBJ Center; http://www.ddbj.nig.ac.jp) maintains and provides public archival, retrieval and analytical services for biological information. The contents of the DDBJ databases are shared with the US National Center for Biotechnology Information (NCBI) and the European Bioinformatics Institute (EBI) within the framework of the International Nucleotide Sequence Database Collaboration (INSDC). Since 2013, the DDBJ Center has been operating the Japanese Genotype-phenotype Archive (JGA) in collaboration with the National Bioscience Database Center (NBDC) in Japan. In addition, the DDBJ Center develops semantic web technologies for data integration and sharing in collaboration with the Database Center for Life Science (DBCLS) in Japan. This paper briefly reports on the activities of the DDBJ Center over the past year including submissions to databases and improvements in our services for data retrieval, analysis, and integration. PMID:26578571

  20. SinEx DB: a database for single exon coding sequences in mammalian genomes.

    PubMed

    Jorquera, Roddy; Ortiz, Rodrigo; Ossandon, F; Cárdenas, Juan Pablo; Sepúlveda, Rene; González, Carolina; Holmes, David S

    2016-01-01

    Eukaryotic genes are typically interrupted by intragenic, noncoding sequences termed introns. However, some genes lack introns in their coding sequence (CDS) and are generally known as 'single exon genes' (SEGs). In this work, a SEG is defined as a nuclear, protein-coding gene that lacks introns in its CDS. Whereas many public databases of eukaryotic multi-exon genes are available, there are only two specialized databases for SEGs. The present work addresses the need for a more extensive and diverse database by creating SinEx DB, a publicly available, searchable database of predicted SEGs from 10 completely sequenced mammalian genomes including human. SinEx DB houses the DNA and protein sequence information of these SEGs and includes their functional predictions (KOG) and the relative distribution of these functions within species. The information is stored in a relational database built with MySQL Server 5.1.33 and the complete dataset of SEG sequences and their functional predictions are available for downloading. SinEx DB can be interrogated by: (i) a browsable phylogenetic schema, (ii) carrying out BLAST searches to the in-house SinEx DB of SEGs and (iii) via an advanced search mode in which the database can be searched by key words and any combination of searches by species and predicted functions. SinEx DB provides a rich source of information for advancing our understanding of the evolution and function of SEGs. Database URL: www.sinex.cl.

  1. SinEx DB: a database for single exon coding sequences in mammalian genomes

    PubMed Central

    Jorquera, Roddy; Ortiz, Rodrigo; Ossandon, F.; Cárdenas, Juan Pablo; Sepúlveda, Rene; González, Carolina; Holmes, David S.

    2016-01-01

    Eukaryotic genes are typically interrupted by intragenic, noncoding sequences termed introns. However, some genes lack introns in their coding sequence (CDS) and are generally known as ‘single exon genes’ (SEGs). In this work, a SEG is defined as a nuclear, protein-coding gene that lacks introns in its CDS. Whereas many public databases of eukaryotic multi-exon genes are available, there are only two specialized databases for SEGs. The present work addresses the need for a more extensive and diverse database by creating SinEx DB, a publicly available, searchable database of predicted SEGs from 10 completely sequenced mammalian genomes including human. SinEx DB houses the DNA and protein sequence information of these SEGs and includes their functional predictions (KOG) and the relative distribution of these functions within species. The information is stored in a relational database built with MySQL Server 5.1.33 and the complete dataset of SEG sequences and their functional predictions are available for downloading. SinEx DB can be interrogated by: (i) a browsable phylogenetic schema, (ii) carrying out BLAST searches to the in-house SinEx DB of SEGs and (iii) via an advanced search mode in which the database can be searched by key words and any combination of searches by species and predicted functions. SinEx DB provides a rich source of information for advancing our understanding of the evolution and function of SEGs. Database URL: www.sinex.cl PMID:27278816

  3. The UCSC Genome Browser database: 2015 update.

    PubMed

    Rosenbloom, Kate R; Armstrong, Joel; Barber, Galt P; Casper, Jonathan; Clawson, Hiram; Diekhans, Mark; Dreszer, Timothy R; Fujita, Pauline A; Guruvadoo, Luvina; Haeussler, Maximilian; Harte, Rachel A; Heitner, Steve; Hickey, Glenn; Hinrichs, Angie S; Hubley, Robert; Karolchik, Donna; Learned, Katrina; Lee, Brian T; Li, Chin H; Miga, Karen H; Nguyen, Ngan; Paten, Benedict; Raney, Brian J; Smit, Arian F A; Speir, Matthew L; Zweig, Ann S; Haussler, David; Kuhn, Robert M; Kent, W James

    2015-01-01

    Launched in 2001 to showcase the draft human genome assembly, the UCSC Genome Browser database (http://genome.ucsc.edu) and associated tools continue to grow, providing a comprehensive resource of genome assemblies and annotations to scientists and students worldwide. Highlights of the past year include the release of a browser for the first new human genome reference assembly in 4 years in December 2013 (GRCh38, UCSC hg38), a watershed comparative genomics annotation (100-species multiple alignment and conservation) and a novel distribution mechanism for the browser (GBiB: Genome Browser in a Box). We created browsers for new species (Chinese hamster, elephant shark, minke whale), 'mined the web' for DNA sequences and expanded the browser display with stacked color graphs and region highlighting. As our user community increasingly adopts the UCSC track hub and assembly hub representations for sharing large-scale genomic annotation data sets and genome sequencing projects, our menu of public data hubs has tripled.

  4. Progress towards a Spacecraft-Associated Microbial Meta-database (SAMM)

    NASA Astrophysics Data System (ADS)

    Mogul, Rakesh; Keagy, Laura; Nava, Argelia; Zerehi, Farah

    The microbial inventories within the assembly facilities for spacecraft represent the primary pool of forward contaminants that may compromise life-detection missions. Accordingly, we are constructing a meta-database of these microorganisms for the purpose of building a bioinformatic resource for planetary protection and astrobiology-related endeavors. Using student-led efforts, the meta-database is being constructed from literature reports and is inclusive of both isolated microorganisms and those solely detected through DNA-based techniques. The Spacecraft-Associated Microbial Meta-database (SAMM) currently includes over 800 entries that are organized using 32 meta-tags involving taxonomy, location of isolation (facility and component), category of characterization (culture and/or genetic), types of characterizations (e.g., culture, 16S rDNA, phylochip, FAME, and DNA hybridization), growth conditions, Gram stain, and general physiological traits (e.g., sporulation, extremotolerance, and respiration properties). Interrogations of the database show that the cleanrooms at Kennedy Space Center (KSC) are ~2-fold greater in diversity of bacterial genera when compared to the Jet Propulsion Laboratory (JPL), and that bacteria related to water, plant, and human environments are more often associated with the KSC-specific genera. These results parallel those reported in the literature, and hence serve as benchmarks demonstrating the bioinformatic potential of this meta-database. The ultimate plans for SAMM include public availability, expansion through crowdsourcing efforts, and potential use as a companion resource to the culture collections assembled by DSMZ and JPL.

  5. The AMMA database

    NASA Astrophysics Data System (ADS)

    Boichard, Jean-Luc; Brissebrat, Guillaume; Cloche, Sophie; Eymard, Laurence; Fleury, Laurence; Mastrorillo, Laurence; Moulaye, Oumarou; Ramage, Karim

    2010-05-01

    The AMMA project includes aircraft, ground-based and ocean measurements, an intensive use of satellite data and diverse modelling studies. Therefore, the AMMA database aims at storing a great amount and a large variety of data, and at providing the data as rapidly and safely as possible to the AMMA research community. In order to stimulate the exchange of information and collaboration between researchers from different disciplines or using different tools, the database provides a detailed description of the products and uses standardized formats. The AMMA database contains: AMMA field campaign datasets; historical data in West Africa from 1850 (operational networks and previous scientific programs); satellite products from past and future satellites, (re-)mapped on a regular latitude/longitude grid and stored in NetCDF format (CF Convention); and model outputs from atmosphere or ocean operational (re-)analysis and forecasts, and from research simulations. The outputs are processed as the satellite products are. Before accessing the data, any user has to sign the AMMA data and publication policy. This charter only covers the use of data in the framework of scientific objectives and categorically excludes the redistribution of data to third parties and the usage for commercial applications. Some collaboration between data producers and users, and the mention of the AMMA project in any publication, are also required. The AMMA database and the associated on-line tools have been fully developed and are managed by two teams in France (IPSL Database Centre, Paris and OMP, Toulouse). Users can access data of both data centres using a unique web portal. This website is composed of different modules: (1) Registration: forms to register, and to read and sign the data use charter when a user visits for the first time; (2) Data access interface: a user-friendly tool for building a data extraction request by selecting various criteria like location, time, parameters... The request can

  6. Publication Bias in Antipsychotic Trials: An Analysis of Efficacy Comparing the Published Literature to the US Food and Drug Administration Database

    PubMed Central

    Turner, Erick H.; Knoepflmacher, Daniel; Shapley, Lee

    2012-01-01

    Background Publication bias compromises the validity of evidence-based medicine, yet a growing body of research shows that this problem is widespread. Efficacy data from drug regulatory agencies, e.g., the US Food and Drug Administration (FDA), can serve as a benchmark or control against which data in journal articles can be checked. Thus one may determine whether publication bias is present and quantify the extent to which it inflates apparent drug efficacy. Methods and Findings FDA Drug Approval Packages for eight second-generation antipsychotics—aripiprazole, iloperidone, olanzapine, paliperidone, quetiapine, risperidone, risperidone long-acting injection (risperidone LAI), and ziprasidone—were used to identify a cohort of 24 FDA-registered premarketing trials. The results of these trials according to the FDA were compared with the results conveyed in corresponding journal articles. The relationship between study outcome and publication status was examined, and effect sizes derived from the two data sources were compared. Among the 24 FDA-registered trials, four (17%) were unpublished. Of these, three failed to show that the study drug had a statistical advantage over placebo, and one showed the study drug was statistically inferior to the active comparator. Among the 20 published trials, the five that were not positive, according to the FDA, showed some evidence of outcome reporting bias. However, the association between trial outcome and publication status did not reach statistical significance. Further, the apparent increase in the effect size point estimate due to publication bias was modest (8%) and not statistically significant. On the other hand, the effect size for unpublished trials (0.23, 95% confidence interval 0.07 to 0.39) was less than half that for the published trials (0.47, 95% confidence interval 0.40 to 0.54), a difference that was significant. Conclusions The magnitude of publication bias found for antipsychotics was less than that found

  7. VoSeq: a voucher and DNA sequence web application.

    PubMed

    Peña, Carlos; Malm, Tobias

    2012-01-01

    There is an ever growing number of molecular phylogenetic studies published, due to, in part, the advent of new techniques that allow cheap and quick DNA sequencing. Hence, the demand for relational databases with which to manage and annotate the amassing DNA sequences, genes, voucher specimens and associated biological data is increasing. In addition, a user-friendly interface is necessary for easy integration and management of the data stored in the database back-end. Available databases allow management of a wide variety of biological data. However, most database systems are not specifically constructed with the aim of being an organizational tool for researchers working in phylogenetic inference. We here report a new software facilitating easy management of voucher and sequence data, consisting of a relational database as back-end for a graphic user interface accessed via a web browser. The application, VoSeq, includes tools for creating molecular datasets of DNA or amino acid sequences ready to be used in commonly used phylogenetic software such as RAxML, TNT, MrBayes and PAUP, as well as for creating tables ready for publishing. It also has inbuilt BLAST capabilities against all DNA sequences stored in VoSeq as well as sequences in NCBI GenBank. By using mash-ups and calls to web services, VoSeq allows easy integration with public services such as Yahoo! Maps, Flickr, Encyclopedia of Life (EOL) and GBIF (by generating data-dumps that can be processed with GBIF's Integrated Publishing Toolkit).

  8. Comparative analysis of IRF6 variants in families with Van der Woude syndrome and popliteal pterygium syndrome using public whole-exome databases

    PubMed Central

    Leslie, Elizabeth J.; Standley, Jennifer; Compton, John; Bale, Sherri; Schutte, Brian C.; Murray, Jeffrey C.

    2013-01-01

    Purpose: Mutations in the transcription factor IRF6 cause allelic autosomal dominant clefting syndromes, Van der Woude syndrome, and popliteal pterygium syndrome. We compared the distribution of IRF6 coding and splice-site mutations from 549 families with Van der Woude syndrome or popliteal pterygium syndrome with that of variants from the 1000 Genomes and National Heart, Lung, and Blood Institute Exome Sequencing Projects. Methods: We compiled all published pathogenic IRF6 mutations and performed direct sequencing of IRF6 in families with Van der Woude syndrome or popliteal pterygium syndrome. Results: Although mutations causing Van der Woude syndrome or popliteal pterygium syndrome were nonrandomly distributed with significantly increased frequencies in the DNA-binding domain (P = 0.0001), variants found in controls were rare and evenly distributed in IRF6. Of 194 different missense or nonsense variants described as potentially pathogenic, we identified only two in more than 6,000 controls. PolyPhen and SIFT (sorting intolerant from tolerant) reported 5.9% of missense mutations in patients as benign, suggesting that use of current in silico prediction models to determine function can have significant false negatives. Conclusion: Mutation of IRF6 occurs infrequently in controls, suggesting that for IRF6 there is a high probability that disruption of the coding sequence, particularly the DNA-binding domain, will result in syndromic features. Prior associations of coding sequence variants in IRF6 with clefting syndromes have had few false positives. PMID:23154523

  9. Database of recent tsunami deposits

    USGS Publications Warehouse

    Peters, Robert; Jaffe, Bruce E.

    2010-01-01

    This report describes a database of sedimentary characteristics of tsunami deposits derived from published accounts of tsunami deposit investigations conducted shortly after the occurrence of a tsunami. The database contains 228 entries, each entry containing data from up to 71 categories. It includes data from 51 publications covering 15 tsunamis distributed between 16 countries. The database encompasses a wide range of depositional settings including tropical islands, beaches, coastal plains, river banks, agricultural fields, and urban environments. It includes data from both local tsunamis and teletsunamis. The data are valuable for interpreting prehistorical, historical, and modern tsunami deposits, and for the development of criteria to identify tsunami deposits in the geologic record.

  10. PAH Mutation Analysis Consortium Database: 1997. Prototype for relational locus-specific mutation databases.

    PubMed Central

    Nowacki, P M; Byck, S; Prevost, L; Scriver, C R

    1998-01-01

    PAHdb (http://www.mcgill.ca/pahdb) is a curated relational database (Fig. 1) of nucleotide variation in the human PAH cDNA (GenBank U49897). Among 328 different mutations by state (Fig. 2), the majority are rare mutations causing hyperphenylalaninemia (HPA) (OMIM 261600); the remainder are polymorphic variants without apparent effect on phenotype. PAHdb modules contain mutations, polymorphic haplotypes, genotype-phenotype correlations, expression analysis, sources of information and the reference sequence; the database also contains pages of clinical information and data on three ENU mouse orthologues of human HPA. Only six different mutations account for 60% of human HPA chromosomes worldwide; mutations stratify by population and geographic region, and the Oriental and Caucasian mutation sets are different (Fig. 3). PAHdb provides curated electronic publication and one third of its incoming reports are direct submissions. Each different mutation receives a systematic (nucleotide) name and a unique identifier (UID). Data are accessed both by a Newsletter and a search engine on the website; integrity of the database is ensured by keeping the curated template offline. There have been >6500 online interrogations of the website. PMID:9399840

  11. Trends in performance indicators of neuroimaging anatomy research publications: a bibliometric study of major neuroradiology journal output over four decades based on web of science database.

    PubMed

    Wing, Louise; Massoud, Tarik F

    2015-01-01

    Quantitative, qualitative, and innovative application of bibliometric research performance indicators to anatomy and radiology research and education can enhance cross-fertilization between the two disciplines. We aim to use these indicators to identify long-term trends in dissemination of publications in neuroimaging anatomy (including both productivity and citation rates), which has subjectively waned in prestige during recent years. We examined publications over the last 40 years in two neuroradiological journals, AJNR and Neuroradiology, and selected and categorized all neuroimaging anatomy research articles according to theme and type. We studied trends in their citation activity over time, and mathematically analyzed these trends for 1977, 1987, and 1997 publications. We created a novel metric, "citation half-life at 10 years postpublication" (CHL-10), and used this to examine trends in the skew of citation numbers for anatomy articles each year. We identified 367 anatomy articles amongst a total of 18,110 in these journals: 74.2% were original articles, with study of normal anatomy being the commonest theme (46.7%). We recorded a mean of 18.03 citations for each anatomy article, 35% higher than for general neuroradiology articles. Graphs summarizing the rise (upslope) in citation rates after publication revealed similar trends spanning two decades. CHL-10 trends demonstrated that more recently published anatomy articles were likely to take longer to reach peak citation rate. Bibliometric analysis suggests that anatomical research in neuroradiology is not languishing. This novel analytical approach can be applied to other aspects of neuroimaging research, and within other subspecialties in radiology and anatomy, and also to foster anatomical education.

  12. ARTI Refrigerant Database

    SciTech Connect

    Calm, J.M.

    1992-11-09

    The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The database identifies sources of specific information on R-32, R-123, R-124, R-125, R-134, R-134a, R-141b, R-142b, R-143a, R-152a, R-245ca, R-290 (propane), R-717 (ammonia), ethers, and others as well as azeotropic and zeotropic blends of these fluids. It addresses lubricants including alkylbenzene, polyalkylene glycol, ester, and other synthetics as well as mineral oils. It also references documents on compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. A computerized version is available that includes retrieval software.

  13. The apoptosis database.

    PubMed

    Doctor, K S; Reed, J C; Godzik, A; Bourne, P E

    2003-06-01

    The apoptosis database is a public resource for researchers and students interested in the molecular biology of apoptosis. The resource provides functional annotation, literature references, diagrams/images, and alternative nomenclatures on a set of proteins having 'apoptotic domains'. These are the distinctive domains that are often, if not exclusively, found in proteins involved in apoptosis. The initial choice of proteins to be included is defined by apoptosis experts and bioinformatics tools. Users can browse through the web accessible lists of domains, proteins containing these domains and their associated homologs. The database can also be searched by sequence homology using basic local alignment search tool, text word matches of the annotation, and identifiers for specific records. The resource is available at http://www.apoptosis-db.org and is updated on a regular basis.

  14. Generation and Analysis of a Large-Scale Expressed Sequence Tag Database from a Full-Length Enriched cDNA Library of Developing Leaves of Gossypium hirsutum L

    PubMed Central

    Pang, Chaoyou; Fan, Shuli; Song, Meizhen; Yu, Shuxun

    2013-01-01

    Background Cotton (Gossypium hirsutum L.) is one of the world’s most economically-important crops. However, its entire genome has not been sequenced, and limited resources are available in GenBank for understanding the molecular mechanisms underlying leaf development and senescence. Methodology/Principal Findings In this study, 9,874 high-quality ESTs were generated from a normalized, full-length cDNA library derived from pooled RNA isolated from throughout leaf development during the plant blooming stage. After clustering and assembly of these ESTs, 5,191 unique sequences, representing 1,652 contigs and 3,539 singletons, were obtained. The average unique sequence length was 682 bp. Annotation of these unique sequences revealed that 84.4% showed significant homology to sequences in the NCBI non-redundant protein database, and 57.3% had significant hits to known proteins in the Swiss-Prot database. Comparative analysis indicated that our library added 2,400 ESTs and 991 unique sequences to those known for cotton. The unigenes were functionally characterized by gene ontology annotation. We identified 1,339 and 200 unigenes as potential leaf senescence-related genes and transcription factors, respectively. Moreover, nine genes related to leaf senescence and eleven MYB transcription factors were randomly selected for quantitative real-time PCR (qRT-PCR), which revealed that these genes were regulated differentially during senescence. The qRT-PCR for three GhYLSs revealed that these genes are expressed preferentially in senescent leaves. Conclusions/Significance These EST resources will provide valuable sequence information for gene expression profiling analyses and functional genomics studies to elucidate their roles, as well as for studying the mechanisms of leaf development and senescence in cotton and discovering candidate genes related to important agronomic traits of cotton. These data will also facilitate future whole-genome sequence assembly and annotation.

  15. Stackfile Database

    NASA Technical Reports Server (NTRS)

    deVarvalho, Robert; Desai, Shailen D.; Haines, Bruce J.; Kruizinga, Gerhard L.; Gilmer, Christopher

    2013-01-01

    This software provides storage, retrieval, and analysis functionality for managing satellite altimetry data. It improves the efficiency and analysis capabilities of existing database software with improved flexibility and documentation. It offers flexibility in the type of data that can be stored. There is efficient retrieval either across the spatial domain or the time domain. Built-in analysis tools are provided for frequently performed altimetry tasks. This software package is used for storing and manipulating satellite measurement data. It was developed with a focus on handling the requirements of repeat-track altimetry missions such as Topex and Jason. It was, however, designed to work with a wide variety of satellite measurement data (e.g., Gravity Recovery And Climate Experiment -- GRACE). The software consists of several command-line tools for importing, retrieving, and analyzing satellite measurement data.

  16. Colombian forensic genetics as a form of public science: The role of race, nation and common sense in the stabilization of DNA populations.

    PubMed

    Schwartz-Marín, Ernesto; Wade, Peter; Cruz-Santiago, Arely; Cárdenas, Roosbelinda

    2015-12-01

    This article examines the role that vernacular notions of racialized-regional difference play in the constitution and stabilization of DNA populations in Colombian forensic science, in what we frame as a process of public science. In public science, the imaginations of the scientific world and common-sense public knowledge are integral to the production and circulation of science itself. We explore the origins and circulation of a scientific object--'La Tabla', published in Paredes et al. and used in genetic forensic identification procedures--among genetic research institutes, forensic genetics laboratories and courtrooms in Bogotá. We unveil the double life of this central object of forensic genetics. On the one hand, La Tabla enjoys an indisputable public place in the processing of forensic genetic evidence in Colombia (paternity cases, identification of bodies, etc.). On the other hand, the relations it establishes between 'race', geography and genetics are questioned among population geneticists in Colombia. Although forensic technicians are aware of the disputes among population geneticists, they use and endorse the relations established between genetics, 'race' and geography because these fit with common-sense notions of visible bodily difference and the regionalization of race in the Colombian nation.

  17. Colombian forensic genetics as a form of public science: The role of race, nation and common sense in the stabilization of DNA populations

    PubMed Central

    Schwartz-Marín, Ernesto; Wade, Peter; Cruz-Santiago, Arely; Cárdenas, Roosbelinda

    2015-01-01

    This article examines the role that vernacular notions of racialized-regional difference play in the constitution and stabilization of DNA populations in Colombian forensic science, in what we frame as a process of public science. In public science, the imaginations of the scientific world and common-sense public knowledge are integral to the production and circulation of science itself. We explore the origins and circulation of a scientific object – ‘La Tabla’, published in Paredes et al. and used in genetic forensic identification procedures – among genetic research institutes, forensic genetics laboratories and courtrooms in Bogotá. We unveil the double life of this central object of forensic genetics. On the one hand, La Tabla enjoys an indisputable public place in the processing of forensic genetic evidence in Colombia (paternity cases, identification of bodies, etc.). On the other hand, the relations it establishes between ‘race’, geography and genetics are questioned among population geneticists in Colombia. Although forensic technicians are aware of the disputes among population geneticists, they use and endorse the relations established between genetics, ‘race’ and geography because these fit with common-sense notions of visible bodily difference and the regionalization of race in the Colombian nation. PMID:27480000

  19. Open Geoscience Database

    NASA Astrophysics Data System (ADS)

    Bashev, A.

    2012-04-01

    Currently there is an enormous number of geoscience databases. Unfortunately, the only users of the majority of these databases are their developers. There are several reasons for that: incompatibility, specificity of tasks and objects, and so on. However, the main obstacles to wide usage of geoscience databases are complexity for developers and complication for users. The complexity of the architecture leads to high costs that block public access. The complication prevents users from understanding when and how to use the database. Only databases associated with GoogleMaps do not have these drawbacks, but they could hardly be called "geoscience" databases. Nevertheless, an open and simple geoscience database is necessary, at least for educational purposes (see our abstract for ESSI20/EOS12). We developed a database and a web interface to work with it, and it is now accessible at maps.sch192.ru. In this database a result is a value of a parameter (no matter which) at a station with a certain position, associated with metadata: the date when the result was obtained; the type of station (lake, soil, etc.); and the contributor that sent the result. Each contributor has its own profile, which allows the reliability of the data to be estimated. The results can be represented on a GoogleMaps space image as a point at a certain position, coloured according to the value of the parameter. There are default colour scales, and each registered user can create their own scale. The results can also be extracted as a *.csv file. For both types of representation one can select the data by date, object type, parameter type, area and contributor. The data are uploaded in *.csv format: Name of the station; Latitude (dd.dddddd); Longitude (ddd.dddddd); Station type; Parameter type; Parameter value; Date (yyyy-mm-dd). The contributor is recognised when logging in. This is the minimal set of features that is required to connect a value of a parameter with a position and see the results. All the complicated data
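
    The upload format listed in this abstract (station name, latitude, longitude, station type, parameter type, parameter value, date) can be illustrated with a minimal Python sketch. It assumes that fields are separated by semicolons, as they are written in the abstract, and that parameter values are numeric; neither is confirmed by the source. The names Result and parse_results and the sample row are hypothetical and are not part of the database's own software.

```python
import csv
import io
from dataclasses import dataclass
from datetime import date


@dataclass
class Result:
    """One uploaded value (hypothetical record type, named here for illustration)."""
    station: str
    latitude: float
    longitude: float
    station_type: str
    parameter: str
    value: float          # assumption: parameter values are numeric
    measured_on: date


def parse_results(text):
    """Parse rows in the field order given in the abstract.

    Assumption: fields are separated by semicolons and there is no header row.
    """
    results = []
    for row in csv.reader(io.StringIO(text), delimiter=";"):
        if not row:  # csv.reader yields an empty list for a blank line
            continue
        name, lat, lon, st_type, param, value, day = (f.strip() for f in row)
        results.append(
            Result(name, float(lat), float(lon), st_type, param,
                   float(value), date.fromisoformat(day))
        )
    return results


# Hypothetical sample row, invented for this sketch.
sample = "Lake A; 59.938630; 030.314130; lake; pH; 7.2; 2012-04-01\n"
for r in parse_results(sample):
    print(r.station, r.latitude, r.longitude, r.parameter, r.value, r.measured_on)
```

    A row with the wrong number of fields raises ValueError in this sketch, which is one simple way to reject malformed uploads; a real service might prefer to report such rows instead.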

  20. Targeted Sequencing for Discovery and Validation of DNA Methylation Markers of Colon Cancer Metastasis — EDRN Public Portal

    Cancer.gov

    Colon cancer is the second leading cause of cancer death in the United States. A key issue in treating colon cancer patients is the inability to accurately predict tumors that have metastatic potential and require adjuvant chemotherapy. This project will test the model that tumor metastases arise from intra-tumor heterogeneity generated by DNA methylation events, and that detecting these events can provide a predictive signature of tumors with poor outcome

  1. NASA STI Database, Aerospace Database and ARIN coverage of 'space law'

    NASA Technical Reports Server (NTRS)

    Buchan, Ronald L.

    1992-01-01

    The space-law coverage provided by the NASA STI Database, the Aerospace Database, and ARIN is briefly described. Particular attention is given to the space law content of the two Databases and of ARIN, the NASA Thesaurus space law terminology, space law publication forms, and the availability of the space law literature.

  2. Molecular Identification and Databases in Fusarium

    Technology Transfer Automated Retrieval System (TEKTRAN)

    DNA sequence-based methods for identifying pathogenic and mycotoxigenic Fusarium isolates have become the gold standard worldwide. Moreover, fusarial DNA sequence data are increasing rapidly in several web-accessible databases for comparative purposes. Unfortunately, the use of Basic Alignment Sea...

  3. REDIdb: the RNA editing database.

    PubMed

    Picardi, Ernesto; Regina, Teresa Maria Rosaria; Brennicke, Axel; Quagliariello, Carla

    2007-01-01

    The RNA Editing Database (REDIdb) is an interactive, web-based database created and designed with the aim to allocate RNA editing events such as substitutions, insertions and deletions occurring in a wide range of organisms. The database contains both fully and partially sequenced DNA molecules for which editing information is available either by experimental inspection (in vitro) or by computational detection (in silico). Each record of REDIdb is organized in a specific flat-file containing a description of the main characteristics of the entry, a feature table with the editing events and related details and a sequence zone with both the genomic sequence and the corresponding edited transcript. REDIdb is a relational database in which the browsing and identification of editing sites has been simplified by means of two facilities to either graphically display genomic or cDNA sequences or to show the corresponding alignment. In both cases, all editing sites are highlighted in colour and their relative positions are detailed by mousing over. New editing positions can be directly submitted to REDIdb after a user-specific registration to obtain authorized secure access. This first version of REDIdb database stores 9964 editing events and can be freely queried at http://biologia.unical.it/py_script/search.html.

  4. Database Marketplace 2002: The Database Universe.

    ERIC Educational Resources Information Center

    Tenopir, Carol; Baker, Gayle; Robinson, William

    2002-01-01

    Reviews the database industry over the past year, including new companies and services, company closures, popular database formats, popular access methods, and changes in existing products and services. Lists 33 firms and their database services; 33 firms and their database products; and 61 company profiles. (LRW)

  5. u-Genome: a database on genome design in unicellular genomes.

    PubMed

    Sakharkar, Kishore Ramaji; Chaturvedi, Iti; Chow, Vincent T K; Kwoh, Chee Keong; Kangueane, Pandjassarame; Sakharkar, Meena Kishore

    2005-01-01

    Unicellular eukaryotes were among the first ones to be selected for complete genome sequencing because of the small size of their genomes and their interactions with humans and a broad range of animals and plants. Currently, ten completely sequenced unicellular genome sequences have been publicly released and as the number of available unicellular genomes increases, comparative genomics analysis within this group of organisms becomes more and more instructive. However, such an analysis is difficult to carry out without a suitable platform gathering not only the original annotations but also relevant information available in public databases or obtained by applying common bioinformatics methods. With the aim of solving these difficulties, we have developed a web-accessible database named u-Genome, the unicellular genome design database. The database is unique in featuring three datasets namely (1) orthologous proteins (2) paralogous proteins and (3) statistical distributions on exons, introns, intergenic DNA and correlations between them. A tool, Uniview, designed to visualize the gene structures for individual genes in the genome is also integrated. This database is of importance in understanding unicellular genome design and architecture and evolution related studies. The database is available through a web interface at http://sege.ntu.edu.sg/wester/ugenome.

  6. A Chronostratigraphic Relational Database Ontology

    NASA Astrophysics Data System (ADS)

    Platon, E.; Gary, A.; Sikora, P.

    2005-12-01

    A chronostratigraphic research database was donated by British Petroleum to the Stratigraphy Group at the Energy and Geoscience Institute (EGI), University of Utah. These data consist of over 2,000 measured sections representing over three decades of research into the application of the graphic correlation method. The data are global and include both microfossil (foraminifera, calcareous nannoplankton, spores, pollen, dinoflagellate cysts, etc.) and macrofossil data. The objective of the donation was to make the research data available to the public in order to encourage additional chronostratigraphy studies, specifically regarding graphic correlation. As part of the National Science Foundation's Cyberinfrastructure for the Geosciences (GEON) initiative, these data have been made available to the public at http://css.egi.utah.edu. To encourage further research using the graphic correlation method, EGI has developed a software package, StrataPlot, that will soon be publicly available from the GEON website as a standalone software download. The EGI chronostratigraphy research database, although relatively large, has many data holes relative to some paleontological disciplines and geographical areas, so the challenge becomes how to expand the data available for chronostratigraphic studies using graphic correlation. There are several public or soon-to-be public databases available to chronostratigraphic research, but they have their own data structures and modes of presentation. The heterogeneous nature of these database schemas hinders their integration and makes it difficult for the user to retrieve and consolidate potentially valuable chronostratigraphic data. The integration of these data sources would facilitate rapid and comprehensive data searches, thus helping advance studies in chronostratigraphy. The GEON project will host a number of databases within the geology domain, some of which contain biostratigraphic data. Ontologies are being developed to provide

  7. The Porcelain Crab Transcriptome and PCAD, the Porcelain Crab Microarray and Sequence Database

    SciTech Connect

    Tagmount, Abderrahmane; Wang, Mei; Lindquist, Erika; Tanaka, Yoshihiro; Teranishi, Kristen S.; Sunagawa, Shinichi; Wong, Mike; Stillman, Jonathon H.

    2010-01-27

    Background: With the emergence of a completed genome sequence of the freshwater crustacean Daphnia pulex, construction of genomic-scale sequence databases for additional crustacean sequences is important for comparative genomics and annotation. Porcelain crabs, genus Petrolisthes, have been powerful crustacean models for environmental and evolutionary physiology with respect to thermal adaptation and understanding responses of marine organisms to climate change. Here, we present a large-scale EST sequencing and cDNA microarray database project for the porcelain crab Petrolisthes cinctipes. Methodology/Principal Findings: A set of ~30K unique sequences (UniSeqs) representing ~19K clusters was generated from ~98K high quality ESTs from a set of tissue specific non-normalized and mixed-tissue normalized cDNA libraries from the porcelain crab Petrolisthes cinctipes. Homology for each UniSeq was assessed using BLAST, InterProScan, GO and KEGG database searches. Approximately 66% of the UniSeqs had homology in at least one of the databases. All EST and UniSeq sequences along with annotation results and coordinated cDNA microarray datasets have been made publicly accessible at the Porcelain Crab Array Database (PCAD), a feature-enriched version of the Stanford and Longhorn Array Databases. Conclusions/Significance: The EST project presented here represents the third largest sequencing effort for any crustacean, and the largest effort for any crab species. Our assembly and clustering results suggest that our porcelain crab EST data set is equally diverse to the much larger EST set generated in the Daphnia pulex genome sequencing project, and thus will be an important resource to the Daphnia research community. Our homology results support the pancrustacea hypothesis and suggest that Malacostraca may be ancestral to Branchiopoda and Hexapoda. Our results also suggest that our cDNA microarrays cover as much of the transcriptome as can reasonably be captured in

  8. Database systems for knowledge-based discovery.

    PubMed

    Jagarlapudi, Sarma A R P; Kishan, K V Radha

    2009-01-01

    Several database systems have been developed to provide valuable information from the bench chemist to biologist, medical practitioner to pharmaceutical scientist in a structured format. The advent of information technology and computational power enhanced the ability to access large volumes of data in the form of a database where one could do compilation, searching, archiving, analysis, and finally knowledge derivation. Although data are of variable types, the tools used for database creation, searching and retrieval are similar. GVK BIO has been developing databases from publicly available scientific literature in specific areas like medicinal chemistry, clinical research, and mechanism-based toxicity so that the structured databases containing vast data could be used in several areas of research. These databases were classified as reference centric or compound centric depending on the way the database systems were designed. Integration of these databases with knowledge derivation tools would enhance the value of these systems toward better drug design and discovery.

  9. Database systems for knowledge-based discovery.

    PubMed

    Jagarlapudi, Sarma A R P; Kishan, K V Radha

    2009-01-01

    Several database systems have been developed to provide valuable information from the bench chemist to biologist, medical practitioner to pharmaceutical scientist in a structured format. The advent of information technology and computational power enhanced the ability to access large volumes of data in the form of a database where one could do compilation, searching, archiving, analysis, and finally knowledge derivation. Although data are of variable types, the tools used for database creation, searching and retrieval are similar. GVK BIO has been developing databases from publicly available scientific literature in specific areas like medicinal chemistry, clinical research, and mechanism-based toxicity so that the structured databases containing vast data could be used in several areas of research. These databases were classified as reference centric or compound centric depending on the way the database systems were designed. Integration of these databases with knowledge derivation tools would enhance the value of these systems toward better drug design and discovery. PMID:19727614

  10. Mining of public sequencing databases supports a non-dietary origin for putative foreign miRNAs: underestimated effects of contamination in NGS

    PubMed Central

    Tosar, Juan Pablo; Rovira, Carlos; Naya, Hugo; Cayota, Alfonso

    2014-01-01

    The report that exogenous plant miRNAs are able to cross the mammalian gastrointestinal tract and exert gene-regulation mechanism in mammalian tissues has yielded a lot of controversy, both in the public press and the scientific literature. Despite the initial enthusiasm, reproducibility of these results was recently questioned by several authors. To analyze the causes of this unease, we searched for diet-derived miRNAs in deep-sequencing libraries performed by ourselves and others. We found variable amounts of plant miRNAs in publicly available small RNA-seq data sets of human tissues. In human spermatozoa, exogenous RNAs reached extreme, biologically meaningless levels. On the contrary, plant miRNAs were not detected in our sequencing of human sperm cells, which was performed in the absence of any known sources of plant contamination. We designed an experiment to show that cross-contamination during library preparation is a source of exogenous RNAs. These contamination-derived exogenous sequences even resisted oxidation with sodium periodate. To test the assumption that diet-derived miRNAs were actually contamination-derived, we sought in the literature for previous sequencing reports performed by the same group which reported the initial finding. We analyzed the spectra of plant miRNAs in a small RNA sequencing study performed in amphioxus by this group in 2009 and we found a very strong correlation with the plant miRNAs which they later reported in human sera. Even though contamination with exogenous sequences may be easy to detect, cross-contamination between samples from the same organism can go completely unnoticed, possibly affecting conclusions derived from NGS transcriptomics. PMID:24729469

  11. The Hawaiian Freshwater Algal Database (HfwADB): a laboratory LIMS and online biodiversity resource

    PubMed Central

    2012-01-01

    Background: Biodiversity databases serve the important role of highlighting species-level diversity from defined geographical regions. Databases that are specially designed to accommodate the types of data gathered during regional surveys are valuable in allowing full data access and display to researchers not directly involved with the project, while serving as a Laboratory Information Management System (LIMS). The Hawaiian Freshwater Algal Database, or HfwADB, was modified from the Hawaiian Algal Database to showcase non-marine algal specimens collected from the Hawaiian Archipelago by accommodating the additional level of organization required for samples including multiple species. Description: The Hawaiian Freshwater Algal Database is a comprehensive and searchable database containing photographs and micrographs of samples and collection sites, geo-referenced collecting information, taxonomic data and standardized DNA sequence data. All data for individual samples are linked through unique 10-digit accession numbers (“Isolate Accession”), the first five of which correspond to the collection site (“Environmental Accession”). Users can search online for sample information by accession number, various levels of taxonomy, habitat or collection site. HfwADB is hosted at the University of Hawaii, and was made publicly accessible in October 2011. At the present time the database houses data for over 2,825 samples of non-marine algae from 1,786 collection sites from the Hawaiian Archipelago. These samples include cyanobacteria, red and green algae and diatoms, as well as lesser representation from some other algal lineages. Conclusions: HfwADB is a digital repository that acts as a Laboratory Information Management System for Hawaiian non-marine algal data. Users can interact with the repository through the web to view relevant habitat data (including geo-referenced collection locations) and download images of collection sites, specimen photographs and
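
    The accession scheme described in this abstract (a 10-digit Isolate Accession whose first five digits are the Environmental Accession of the collection site) can be illustrated with a short sketch. The function names and the sample accession numbers below are hypothetical, and the abstract does not say how the remaining five digits are assigned, so they are returned uninterpreted.

```python
from collections import defaultdict


def split_isolate_accession(accession):
    """Split a 10-digit Isolate Accession into (site, remainder).

    The first five digits are the Environmental Accession (collection
    site), as stated in the abstract; the last five are returned as-is.
    """
    if len(accession) != 10 or not (accession.isascii() and accession.isdigit()):
        raise ValueError(f"not a 10-digit accession: {accession!r}")
    return accession[:5], accession[5:]


def group_by_site(accessions):
    """Group Isolate Accessions by their Environmental Accession prefix."""
    groups = defaultdict(list)
    for accession in accessions:
        site, _ = split_isolate_accession(accession)
        groups[site].append(accession)
    return dict(groups)


# Made-up accession numbers, for illustration only.
samples = ["0012300001", "0012300002", "0045600001"]
for site, members in group_by_site(samples).items():
    print(site, members)
```

    Keeping accessions as strings rather than integers preserves any leading zeros, which matters when the prefix itself carries meaning.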

  12. The Cambridge Structural Database.

    PubMed

    Groom, Colin R; Bruno, Ian J; Lightfoot, Matthew P; Ward, Suzanna C

    2016-04-01

    The Cambridge Structural Database (CSD) contains a complete record of all published organic and metal-organic small-molecule crystal structures. The database has been in operation for over 50 years and continues to be the primary means of sharing structural chemistry data and knowledge across disciplines. As well as structures that are made public to support scientific articles, it includes many structures published directly as CSD Communications. All structures are processed both computationally and by expert structural chemistry editors prior to entering the database. A key component of this processing is the reliable association of the chemical identity of the structure studied with the experimental data. This important step helps ensure that data is widely discoverable and readily reusable. Content is further enriched through selective inclusion of additional experimental data. Entries are available to anyone through free CSD community web services. Linking services developed and maintained by the CCDC, combined with the use of standard identifiers, facilitate discovery from other resources. Data can also be accessed through CCDC and third party software applications and through an application programming interface.

  13. The Cambridge Structural Database

    PubMed Central

    Groom, Colin R.; Bruno, Ian J.; Lightfoot, Matthew P.; Ward, Suzanna C.

    2016-01-01

    The Cambridge Structural Database (CSD) contains a complete record of all published organic and metal–organic small-molecule crystal structures. The database has been in operation for over 50 years and continues to be the primary means of sharing structural chemistry data and knowledge across disciplines. As well as structures that are made public to support scientific articles, it includes many structures published directly as CSD Communications. All structures are processed both computationally and by expert structural chemistry editors prior to entering the database. A key component of this processing is the reliable association of the chemical identity of the structure studied with the experimental data. This important step helps ensure that data is widely discoverable and readily reusable. Content is further enriched through selective inclusion of additional experimental data. Entries are available to anyone through free CSD community web services. Linking services developed and maintained by the CCDC, combined with the use of standard identifiers, facilitate discovery from other resources. Data can also be accessed through CCDC and third party software applications and through an application programming interface. PMID:27048719

  14. ARTI Refrigerant Database

    SciTech Connect

    Calm, J.M. (Great Falls, VA)

    1993-04-30

    The Refrigerant Database consolidates and facilitates access to information to assist industry in developing equipment using alternative refrigerants. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The complete documents are not included. The database identifies sources of specific information on R-32, R-123, R-124, R-125, R-134, R-134a, R-141b, R-142b, R-143a, R-152a, R-245ca, R-290 (propane), R-717 (ammonia), ethers, and others as well as azeotropic and zeotropic blends of these fluids. It addresses lubricants including alkylbenzene, polyalkylene glycol, ester, and other synthetics as well as mineral oils. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. Incomplete citations or abstracts are provided for some documents to accelerate availability of the information and will be completed or replaced in future updates.

  15. ARTI refrigerant database

    SciTech Connect

    Calm, J.M.

    1996-04-15

    The Refrigerant Database is an information system on alternative refrigerants, associated lubricants, and their use in air conditioning and refrigeration. It consolidates and facilitates access to property, compatibility, environmental, safety, application and other information. It provides corresponding information on older refrigerants, to assist manufacturers and those using alternative refrigerants, to make comparisons and determine differences. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The complete documents are not included, though some may be added at a later date. The database identifies sources of specific information on refrigerants. It addresses lubricants including alkylbenzene, polyalkylene glycol, polyolester, and other synthetics as well as mineral oils. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. Incomplete citations or abstracts are provided for some documents. They are included to accelerate availability of the information and will be completed or replaced in future updates. Citations in this report are divided into the following topics: thermophysical properties; materials compatibility; lubricants and tribology; application data; safety; test and analysis methods; impacts; regulatory actions; substitute refrigerants; identification; absorption and adsorption; research programs; and miscellaneous documents. Information is also presented on ordering instructions for the computerized version.

  16. Curcumin Resource Database

    PubMed Central

    Kumar, Anil; Chetia, Hasnahana; Sharma, Swagata; Kabiraj, Debajyoti; Talukdar, Narayan Chandra; Bora, Utpal

    2015-01-01

    Curcumin is one of the most intensively studied diarylheptanoids, Curcuma longa being its principal producer. In addition, a class of promising curcumin analogs has been generated in laboratories, aptly named curcuminoids, which are showing huge potential in the fields of medicine, food technology, etc. The lack of a universal source of data on curcumin as well as curcuminoids has long been felt by the curcumin research community. Hence, in an attempt to address this stumbling block, we have developed the Curcumin Resource Database (CRDB), which aims to serve as a gateway-cum-repository to access all relevant data and related information on curcumin and its analogs. Currently, this database encompasses 1186 curcumin analogs, 195 molecular targets, 9075 peer reviewed publications, 489 patents and 176 varieties of C. longa obtained by extensive data mining and careful curation from numerous sources. Each data entry is identified by a unique CRDB ID (identifier). Furnished with a user-friendly web interface and in-built search engine, CRDB provides well-curated and cross-referenced information that is hyperlinked with external sources. CRDB is expected to be highly useful to researchers working on structure as well as ligand-based molecular design of curcumin analogs. Database URL: http://www.crdb.in PMID:26220923

  17. The Cambridge Structural Database.

    PubMed

    Groom, Colin R; Bruno, Ian J; Lightfoot, Matthew P; Ward, Suzanna C

    2016-04-01

    The Cambridge Structural Database (CSD) contains a complete record of all published organic and metal-organic small-molecule crystal structures. The database has been in operation for over 50 years and continues to be the primary means of sharing structural chemistry data and knowledge across disciplines. As well as structures that are made public to support scientific articles, it includes many structures published directly as CSD Communications. All structures are processed both computationally and by expert structural chemistry editors prior to entering the database. A key component of this processing is the reliable association of the chemical identity of the structure studied with the experimental data. This important step helps ensure that data is widely discoverable and readily reusable. Content is further enriched through selective inclusion of additional experimental data. Entries are available to anyone through free CSD community web services. Linking services developed and maintained by the CCDC, combined with the use of standard identifiers, facilitate discovery from other resources. Data can also be accessed through CCDC and third party software applications and through an application programming interface. PMID:27048719

  18. ARTI refrigerant database

    SciTech Connect

    Calm, J.M.

    1997-02-01

    The Refrigerant Database is an information system on alternative refrigerants, associated lubricants, and their use in air conditioning and refrigeration. It consolidates and facilitates access to property, compatibility, environmental, safety, application and other information. It provides corresponding information on older refrigerants, to assist manufacturers and those using alternative refrigerants, to make comparisons and determine differences. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The complete documents are not included, though some may be added at a later date. The database identifies sources of specific information on various refrigerants. It addresses lubricants including alkylbenzene, polyalkylene glycol, polyolester, and other synthetics as well as mineral oils. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. Incomplete citations or abstracts are provided for some documents. They are included to accelerate availability of the information and will be completed or replaced in future updates.

  19. ARTI refrigerant database

    SciTech Connect

    Calm, J.M.

    1998-08-01

    The Refrigerant Database is an information system on alternative refrigerants, associated lubricants, and their use in air conditioning and refrigeration. It consolidates and facilitates access to property, compatibility, environmental, safety, application and other information. It provides corresponding information on older refrigerants, to assist manufacturers and those using alternative refrigerants, to make comparisons and determine differences. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern. The database provides bibliographic citations and abstracts for publications that may be useful in research and design of air-conditioning and refrigeration equipment. The complete documents are not included, though some may be added at a later date. The database identifies sources of specific information on many refrigerants including propane, ammonia, water, carbon dioxide, propylene, ethers, and others as well as azeotropic and zeotropic blends of these fluids. It addresses lubricants including alkylbenzene, polyalkylene glycol, polyolester, and other synthetics as well as mineral oils. It also references documents addressing compatibility of refrigerants and lubricants with metals, plastics, elastomers, motor insulation, and other materials used in refrigerant circuits. Incomplete citations or abstracts are provided for some documents. They are included to accelerate availability of the information and will be completed or replaced in future updates.

  20. Kazusa Marker DataBase: a database for genomics, genetics, and molecular breeding in plants.

    PubMed

    Shirasawa, Kenta; Isobe, Sachiko; Tabata, Satoshi; Hirakawa, Hideki

    2014-09-01

    In order to provide useful genomic information for agronomical plants, we have established a database, the Kazusa Marker DataBase (http://marker.kazusa.or.jp). This database includes information on DNA markers, e.g., SSR and SNP markers, genetic linkage maps, and physical maps, that were developed at the Kazusa DNA Research Institute. Keyword searches for the markers, sequence data used for marker development, and experimental conditions are also available through this database. Currently, 10 plant species have been targeted: tomato (Solanum lycopersicum), pepper (Capsicum annuum), strawberry (Fragaria × ananassa), radish (Raphanus sativus), Lotus japonicus, soybean (Glycine max), peanut (Arachis hypogaea), red clover (Trifolium pratense), white clover (Trifolium repens), and eucalyptus (Eucalyptus camaldulensis). In addition, the number of plant species registered in this database will be increased as our research progresses. The Kazusa Marker DataBase will be a useful tool for both basic and applied sciences, such as genomics, genetics, and molecular breeding in crops. PMID:25320561

  1. Kazusa Marker DataBase: a database for genomics, genetics, and molecular breeding in plants.

    PubMed

    Shirasawa, Kenta; Isobe, Sachiko; Tabata, Satoshi; Hirakawa, Hideki

    2014-09-01

    In order to provide useful genomic information for agronomical plants, we have established a database, the Kazusa Marker DataBase (http://marker.kazusa.or.jp). This database includes information on DNA markers, e.g., SSR and SNP markers, genetic linkage maps, and physical maps, that were developed at the Kazusa DNA Research Institute. Keyword searches for the markers, sequence data used for marker development, and experimental conditions are also available through this database. Currently, 10 plant species have been targeted: tomato (Solanum lycopersicum), pepper (Capsicum annuum), strawberry (Fragaria × ananassa), radish (Raphanus sativus), Lotus japonicus, soybean (Glycine max), peanut (Arachis hypogaea), red clover (Trifolium pratense), white clover (Trifolium repens), and eucalyptus (Eucalyptus camaldulensis). In addition, the number of plant species registered in this database will be increased as our research progresses. The Kazusa Marker DataBase will be a useful tool for both basic and applied sciences, such as genomics, genetics, and molecular breeding in crops.

  2. Kazusa Marker DataBase: a database for genomics, genetics, and molecular breeding in plants

    PubMed Central

    Shirasawa, Kenta; Isobe, Sachiko; Tabata, Satoshi; Hirakawa, Hideki

    2014-01-01

    In order to provide useful genomic information for agronomical plants, we have established a database, the Kazusa Marker DataBase (http://marker.kazusa.or.jp). This database includes information on DNA markers, e.g., SSR and SNP markers, genetic linkage maps, and physical maps, that were developed at the Kazusa DNA Research Institute. Keyword searches for the markers, sequence data used for marker development, and experimental conditions are also available through this database. Currently, 10 plant species have been targeted: tomato (Solanum lycopersicum), pepper (Capsicum annuum), strawberry (Fragaria × ananassa), radish (Raphanus sativus), Lotus japonicus, soybean (Glycine max), peanut (Arachis hypogaea), red clover (Trifolium pratense), white clover (Trifolium repens), and eucalyptus (Eucalyptus camaldulensis). In addition, the number of plant species registered in this database will be increased as our research progresses. The Kazusa Marker DataBase will be a useful tool for both basic and applied sciences, such as genomics, genetics, and molecular breeding in crops. PMID:25320561

  3. The Giardia genome project database.

    PubMed

    McArthur, A G; Morrison, H G; Nixon, J E; Passamaneck, N Q; Kim, U; Hinkle, G; Crocker, M K; Holder, M E; Farr, R; Reich, C I; Olsen, G E; Aley, S B; Adam, R D; Gillin, F D; Sogin, M L

    2000-08-15

    The Giardia genome project database provides an online resource for Giardia lamblia (WB strain, clone C6) genome sequence information. The database includes edited single-pass reads, the results of BLASTX searches, and details of progress towards sequencing the entire 12 million-bp Giardia genome. Pre-sorted BLASTX results can be retrieved based on keyword searches and BLAST searches of the high throughput Giardia data can be initiated from the web site or through NCBI. Descriptions of the genomic DNA libraries, project protocols and summary statistics are also available. Although the Giardia genome project is ongoing, new sequences are made available on a bi-monthly basis to ensure that researchers have access to information that may assist them in the search for genes and their biological function. The current URL of the Giardia genome project database is www.mbl.edu/Giardia.

  4. Overlap in Bibliographic Databases.

    ERIC Educational Resources Information Center

    Hood, William W.; Wilson, Concepcion S.

    2003-01-01

    Examines the topic of Fuzzy Set Theory to determine the overlap of coverage in bibliographic databases. Highlights include examples of comparisons of database coverage; frequency distribution of the degree of overlap; records with maximum overlap; records unique to one database; intra-database duplicates; and overlap in the top ten databases.…

  5. Rapid and accurate identification of microorganisms contaminating cosmetic products based on DNA sequence homology.

    PubMed

    Fujita, Y; Shibayama, H; Suzuki, Y; Karita, S; Takamatsu, S

    2005-12-01

    The aim of this study was to develop rapid and accurate procedures to identify microorganisms contaminating cosmetic products, based on the identity of the nucleotide sequences of the internal transcribed spacer (ITS) region of the ribosomal RNA coding DNA (rDNA). Five types of microorganisms were isolated from the inner portion of lotion bottle caps, skin care lotions, and cleansing gels. The rDNA ITS region of microorganisms was amplified through the use of colony-direct PCR or ordinal PCR using DNA extracts as templates. The nucleotide sequences of the amplified DNA were determined and subjected to homology search of a publicly available DNA database. Thereby, we obtained DNA sequences possessing high similarity with the query sequences from the databases of all the five organisms analyzed. The traditional identification procedure requires expert skills, and a time period of approximately 1 month to identify the microorganisms. In contrast, 3-7 days were sufficient to complete all the procedures employed in the current method, including isolation and cultivation of organisms, DNA sequencing, and the database homology search. Moreover, it was possible to develop the skills necessary to perform the molecular techniques required for the identification procedures within 1 week. Consequently, the current method is useful for rapid and accurate identification of microorganisms contaminating cosmetics.

  6. Databases of the marine metagenomics.

    PubMed

    Mineta, Katsuhiko; Gojobori, Takashi

    2016-02-01

    The metagenomic data obtained from marine environments are highly useful for understanding marine microbial communities. In comparison with the conventional amplicon-based approach to metagenomics, the recent shotgun sequencing-based approach has become a powerful tool that provides an efficient way of grasping the diversity of the entire microbial community at a sampling point in the sea. However, this approach accelerates the accumulation of metagenome data as well as the increase in data complexity. Moreover, when the metagenomic approach is used to monitor changes over time in marine environments at multiple seawater locations, metagenomic data will accumulate at enormous speed. Because this situation has started to become a reality at many marine research institutions and stations all over the world, it seems obvious that data management and analysis will be confronted by so-called Big Data issues, such as how a database can be constructed efficiently and how useful knowledge should be extracted from a vast amount of data. In this review, we summarize all the major databases of marine metagenomes that are currently publicly available, noting that no database is devoted exclusively to marine metagenomes and that the number of metagenome databases including marine metagenome data is six, which is unexpectedly still small. We also extend our explanation to what we call reference databases, which will be useful for constructing a marine metagenome database as well as for complementing it with important information. Finally, we point out a number of challenges to be overcome in constructing a marine metagenome database.

  7. Databases and software for the analysis of mutations in the human p53 gene, human hprt gene and both the lacI and lacZ gene in transgenic rodents.

    PubMed

    Cariello, N F; Douglas, G R; Gorelick, N J; Hart, D W; Wilson, J D; Soussi, T

    1998-01-01

    We have created databases and software applications for the analysis of DNA mutations at the human p53 gene, the human hprt gene and both the rodent transgenic lacI and lacZ loci. The databases themselves are stand-alone dBASE files and the software for analysis of the databases runs on IBM-compatible computers with Microsoft Windows. Each database has a separate software analysis program. The software created for these databases permits the filtering, ordering, report generation and display of information in the database. In addition, a significant number of routines have been developed for the analysis of single base substitutions. One method of obtaining the databases and software is via the World Wide Web. Open the following home page with a Web Browser: http://sunsite.unc.edu/dnam/mainpage.html. Alternatively, the databases and programs are available via public FTP from: anonymous@sunsite.unc.edu. There is no password required to enter the system. The databases and software are found beneath the subdirectory: pub/academic/biology/dna-mutations. Two other programs are available at the site, a program for comparison of mutational spectra and a program for entry of mutational data into a relational database.

  8. A Novel Approach: Chemical Relational Databases, and the Role of the ISSCAN Database on Assessing Chemical Carcinogenicity

    EPA Science Inventory

    Mutagenicity and carcinogenicity databases are crucial resources for toxicologists and regulators involved in chemicals risk assessment. Until recently, existing public toxicity databases have been constructed primarily as "look-up-tables" of existing data, and most often did no...

  9. Annual Review of Database Development: 1992.

    ERIC Educational Resources Information Center

    Basch, Reva

    1992-01-01

    Reviews recent trends in databases and online systems. Topics discussed include new access points for established databases; acquisitions, consolidations, and competition between vendors; European coverage; international services; online reference materials, including telephone directories; political and legal materials and public records;…

  10. 24 CFR 902.24 - Database adjustment.

    Code of Federal Regulations, 2012 CFR

    2012-04-01

    ... 24 Housing and Urban Development 4 2012-04-01 2012-04-01 false Database adjustment. 902.24 Section 902.24 Housing and Urban Development REGULATIONS RELATING TO HOUSING AND URBAN DEVELOPMENT (CONTINUED... PUBLIC HOUSING ASSESSMENT SYSTEM Physical Condition Indicator § 902.24 Database adjustment....

  11. 24 CFR 902.24 - Database adjustment.

    Code of Federal Regulations, 2011 CFR

    2011-04-01

    ... 24 Housing and Urban Development 4 2011-04-01 2011-04-01 false Database adjustment. 902.24 Section 902.24 Housing and Urban Development REGULATIONS RELATING TO HOUSING AND URBAN DEVELOPMENT (CONTINUED... PUBLIC HOUSING ASSESSMENT SYSTEM Physical Condition Indicator § 902.24 Database adjustment....

  12. 24 CFR 902.24 - Database adjustment.

    Code of Federal Regulations, 2014 CFR

    2014-04-01

    ... 24 Housing and Urban Development 4 2014-04-01 2014-04-01 false Database adjustment. 902.24 Section 902.24 Housing and Urban Development REGULATIONS RELATING TO HOUSING AND URBAN DEVELOPMENT (CONTINUED... PUBLIC HOUSING ASSESSMENT SYSTEM Physical Condition Indicator § 902.24 Database adjustment....

  13. 24 CFR 902.24 - Database adjustment.

    Code of Federal Regulations, 2013 CFR

    2013-04-01

    ... 24 Housing and Urban Development 4 2013-04-01 2013-04-01 false Database adjustment. 902.24 Section 902.24 Housing and Urban Development REGULATIONS RELATING TO HOUSING AND URBAN DEVELOPMENT (CONTINUED... PUBLIC HOUSING ASSESSMENT SYSTEM Physical Condition Indicator § 902.24 Database adjustment....

  14. Correlates of Access to Business Research Databases

    ERIC Educational Resources Information Center

    Gottfried, John C.

    2010-01-01

    This study examines potential correlates of business research database access through academic libraries serving top business programs in the United States. Results indicate that greater access to research databases is related to enrollment in graduate business programs, but not to overall enrollment or status as a public or private institution.…

  15. BELDATA -- The Database of Belgrade Astronomical Observatory

    NASA Astrophysics Data System (ADS)

    Milovanovic, N.; Popovic, L. C.; Dimitrijevic, M. S.

    The Belgrade Astronomical Database (BELDATA) is an Internet-based database designed to contain Stark broadening parameters, spectra of active galactic nuclei, catalogs of observations done at the Belgrade Observatory and abstracts of papers published in the publications of the observatory.

  16. Library Instruction and Online Database Searching.

    ERIC Educational Resources Information Center

    Mercado, Heidi

    1999-01-01

    Reviews changes in online database searching in academic libraries. Topics include librarians conducting all searches; the advent of end-user searching and the need for user instruction; compact disk technology; online public catalogs; the Internet; full text databases; electronic information literacy; user education and the remote library user;…

  17. Introducing a High-Risk HPV DNA Test Into a Public Sector Screening Program in El Salvador

    PubMed Central

    Cremer, Miriam L.; Maza, Mauricio; Alfaro, Karla M.; Kim, Jane J.; Ditzian, Lauren R.; Villalta, Sofia; Alonzo, Todd A.; Felix, Juan C.; Castle, Philip E.; Gage, Julia C.

    2016-01-01

    Objective In a primary human papillomavirus (HPV) screening program, we compared the 6-month follow-up among colposcopy and noncolposcopy-based management strategies for screen-positive women. Materials and Methods Women aged 30 to 49 years were screened with HPV DNA tests using both self-collection and provider collection of samples. Women testing positive received either (1) colposcopy management (CM) consisting of colposcopy and management per local guidelines or (2) screen-and-treat (ST) management using visual inspection with acetic acid to determine cryotherapy eligibility, with eligible women undergoing immediate cryotherapy. One thousand women were recruited in each cohort. Of these, 368 (18.4%) of 2000 women were recruited using a more intensive outreach strategy. Demographics, HPV positivity, and treatment compliance were compared across recruitment and management strategies. Results More women in the ST cohort received treatment within 6 months compared with those in the CM cohort (117/119 [98.3%] vs 64/93 [68.8%]; p < .001). Women recruited through more intensive outreach were more likely to be HPV positive, lived in urban areas, were more educated, and had higher numbers of lifetime sexual partners and fewer children. Conclusions Women in the CM arm were less likely to complete care than women in the ST arm. Targeted outreach to underscreened women successfully identified women with higher prevalence of HPV and possibly higher disease burden. PMID:26890683
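The percentages quoted in the abstract above can be recomputed from the raw counts it reports. The snippet below is only an arithmetic check; the helper name `percent` is an illustrative assumption and not something taken from the study.

```python
def percent(numerator, denominator):
    """Return numerator/denominator as a percentage rounded to one decimal."""
    return round(100 * numerator / denominator, 1)


# Counts as quoted in the abstract.
st_treated = percent(117, 119)   # screen-and-treat (ST) cohort treated within 6 months
cm_treated = percent(64, 93)     # colposcopy management (CM) cohort treated within 6 months
intensive = percent(368, 2000)   # women recruited through more intensive outreach

print(st_treated, cm_treated, intensive)
```

The three values agree with the 98.3%, 68.8% and 18.4% stated in the abstract.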

  18. Mitochondrial DNA control region variation in Dubai, United Arab Emirates.

    PubMed

    Alshamali, Farida; Brandstätter, Anita; Zimmermann, Bettina; Parson, Walther

    2008-01-01

    249 entire mtDNA control region sequences were generated and analyzed in a population sample from Dubai, one of the seven United Arab Emirates. The control region was amplified in one piece and sequenced with different sequencing primers. Sequence evaluation was performed twice and validated by a third senior mtDNA scientist. Phylogenetic analyses were used for quality assurance purposes and for the determination of the haplogroup affiliation of the samples. Upon publication, the population data are going to be available in the EMPOP database (www.empop.org).

  19. CD-ROM-aided Databases

    NASA Astrophysics Data System (ADS)

    Masuyama, Keiichi

    CD-ROM has rapidly evolved as a new information medium with large capacity. In the U.S. it is predicted that it will become a two hundred billion yen market in three years, and thus CD-ROM is a strategic target of the database industry. Here in Japan the movement toward its commercialization has been active since this year. Will the CD-ROM business ever conquer the information market as an on-disk database or electronic publication? Referring to some cases of applications in the U.S., the author views the marketability and future trend of this new optical disk medium.

  20. The urologic epithelial stem cell database (UESC) – a web tool for cell type-specific gene expression and immunohistochemistry images of the prostate and bladder

    PubMed Central

    Pascal, Laura E; Deutsch, Eric W; Campbell, David S; Korb, Martin; True, Lawrence D; Liu, Alvin Y

    2007-01-01

    Background Public databases are crucial for analysis of high-dimensional gene and protein expression data. The Urologic Epithelial Stem Cells (UESC) database is a public database that contains gene and protein information for the major cell types of the prostate, prostate cancer cell lines, and a cancer cell type isolated from a primary tumor. Similarly, such information is available for urinary bladder cell types. Description Two major data types were archived in the database, protein abundance localization data from immunohistochemistry images, and transcript abundance data principally from DNA microarray analysis. Data results were organized in modules that were made to operate independently but built upon a core functionality. Gene array data and immunostaining images for human and mouse prostate and bladder were made available for interrogation. Data analysis capabilities include: (1) CD (cluster designation) cell surface protein data. For each cluster designation molecule, a data summary allows easy retrieval of images (at multiple magnifications). (2) Microarray data. Single gene or batch search can be initiated with Affymetrix Probeset ID, Gene Name, or Accession Number together with options of coalescing probesets and/or replicates. Conclusion Databases are invaluable for biomedical research, and their utility depends on data quality and user friendliness. UESC provides for database queries and tools to examine cell type-specific gene expression (normal vs. cancer), whereas most other databases contain only whole tissue expression datasets. The UESC database provides a valuable tool in the analysis of differential gene expression in prostate cancer genes in cancer progression. PMID:18072977

  1. Curcumin Resource Database.

    PubMed

    Kumar, Anil; Chetia, Hasnahana; Sharma, Swagata; Kabiraj, Debajyoti; Talukdar, Narayan Chandra; Bora, Utpal

    2015-01-01

    Curcumin is one of the most intensively studied diarylheptanoids, Curcuma longa being its principal producer. In addition, a class of promising curcumin analogs, aptly named curcuminoids, has been generated in laboratories and is showing huge potential in the fields of medicine, food technology, etc. The lack of a universal source of data on curcumin and curcuminoids has long been felt by the curcumin research community. Hence, in an attempt to address this stumbling block, we have developed the Curcumin Resource Database (CRDB), which aims to serve as a gateway-cum-repository for all relevant data and related information on curcumin and its analogs. Currently, this database encompasses 1186 curcumin analogs, 195 molecular targets, 9075 peer reviewed publications, 489 patents and 176 varieties of C. longa obtained by extensive data mining and careful curation from numerous sources. Each data entry is identified by a unique CRDB ID (identifier). Furnished with a user-friendly web interface and in-built search engine, CRDB provides well-curated and cross-referenced information that is hyperlinked with external sources. CRDB is expected to be highly useful to researchers working on structure as well as ligand-based molecular design of curcumin analogs.

  2. Curcumin Resource Database.

    PubMed

    Kumar, Anil; Chetia, Hasnahana; Sharma, Swagata; Kabiraj, Debajyoti; Talukdar, Narayan Chandra; Bora, Utpal

    2015-01-01

    Curcumin is one of the most intensively studied diarylheptanoids, Curcuma longa being its principal producer. In addition, a class of promising curcumin analogs, aptly named curcuminoids, has been generated in laboratories and is showing huge potential in the fields of medicine, food technology, etc. The lack of a universal source of data on curcumin and curcuminoids has long been felt by the curcumin research community. Hence, in an attempt to address this stumbling block, we have developed the Curcumin Resource Database (CRDB), which aims to serve as a gateway-cum-repository for all relevant data and related information on curcumin and its analogs. Currently, this database encompasses 1186 curcumin analogs, 195 molecular targets, 9075 peer reviewed publications, 489 patents and 176 varieties of C. longa obtained by extensive data mining and careful curation from numerous sources. Each data entry is identified by a unique CRDB ID (identifier). Furnished with a user-friendly web interface and in-built search engine, CRDB provides well-curated and cross-referenced information that is hyperlinked with external sources. CRDB is expected to be highly useful to researchers working on structure as well as ligand-based molecular design of curcumin analogs. PMID:26220923

  3. Searching and Indexing Genomic Databases via Kernelization

    PubMed Central

    Gagie, Travis; Puglisi, Simon J.

    2015-01-01

    The rapid advance of DNA sequencing technologies has yielded databases of thousands of genomes. To search and index these databases effectively, it is important that we take advantage of the similarity between those genomes. Several authors have recently suggested searching or indexing only one reference genome and the parts of the other genomes where they differ. In this paper, we survey the 20-year history of this idea and discuss its relation to kernelization in parameterized complexity. PMID:25710001
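The idea summarized in the abstract above, indexing one reference genome plus only the parts of the other genomes that differ from it, can be sketched as follows. This is a minimal illustration and not the authors' method: it assumes all genomes have the same length and differ from the reference only by substitutions, it uses plain substring search instead of a compressed index, and the names `build_kernel`, `find_all` and `search` are hypothetical.

```python
def find_all(text, pattern):
    """Return every start position of pattern in text (overlaps allowed)."""
    hits, i = [], text.find(pattern)
    while i != -1:
        hits.append(i)
        i = text.find(pattern, i + 1)
    return hits


def build_kernel(ref, others, max_len):
    """Keep, for each non-reference genome, its differing positions and a
    window of max_len - 1 characters of context on each side of each one."""
    diffs, windows = {}, []
    for gid, genome in enumerate(others, start=1):
        if len(genome) != len(ref):
            raise ValueError("sketch assumes equal-length genomes")
        diffs[gid] = [i for i, (a, b) in enumerate(zip(ref, genome)) if a != b]
        for pos in diffs[gid]:
            start = max(0, pos - (max_len - 1))
            windows.append((gid, start, genome[start:pos + max_len]))
    return {"max_len": max_len, "diffs": diffs, "windows": windows}


def search(ref, kernel, pattern):
    """Return sorted (genome_id, position) hits; genome 0 is the reference."""
    if len(pattern) > kernel["max_len"]:
        raise ValueError("pattern longer than the kernel was built for")
    ref_hits = find_all(ref, pattern)
    hits = {(0, p) for p in ref_hits}
    # A reference hit carries over to a genome unless it spans a difference.
    for gid, positions in kernel["diffs"].items():
        for p in ref_hits:
            if not any(p <= d < p + len(pattern) for d in positions):
                hits.add((gid, p))
    # Hits that span a difference lie entirely inside one stored window.
    for gid, start, text in kernel["windows"]:
        hits.update((gid, start + off) for off in find_all(text, pattern))
    return sorted(hits)


ref = "ACGTACGTAC"
others = ["ACGTTCGTAC"]          # one substitution at position 4
kernel = build_kernel(ref, others, max_len=3)
print(search(ref, kernel, "GTT"))
print(search(ref, kernel, "ACG"))
```

Only the reference and the short windows are ever scanned, so the work grows with the amount of variation rather than with the total size of all genomes.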

  4. Re-identification of DNA through an automated linkage process.

    PubMed Central

    Malin, B.; Sweeney, L.

    2001-01-01

    This work demonstrates how seemingly anonymous DNA database entries can be related to publicly available health information to uniquely and specifically identify the persons who are the subjects of the information even though the DNA information contains no accompanying explicit identifiers such as name, address, or Social Security number and contains no additional fields of personal information. The software program, REID (Re-Identification of DNA), iteratively uncovers unique occurrences in visit-disease patterns across data collections that reveal inferences about the identities of the patients who are the subject of the DNA. Using real-world data, REID established identifiable linkages in 33-100% of the 10,886 cases explicitly surveyed over 8 gene-based diseases. PMID:11825223

  5. Databases: Beyond the Basics.

    ERIC Educational Resources Information Center

    Whittaker, Robert

    This presented paper offers an elementary description of database characteristics and then provides a survey of databases that may be useful to the teacher and researcher in Slavic and East European languages and literatures. The survey focuses on commercial databases that are available, usable, and needed. Individual databases discussed include:…

  6. Reflective Database Access Control

    ERIC Educational Resources Information Center

    Olson, Lars E.

    2009-01-01

    "Reflective Database Access Control" (RDBAC) is a model in which a database privilege is expressed as a database query itself, rather than as a static privilege contained in an access control list. RDBAC aids the management of database access controls by improving the expressiveness of policies. However, such policies introduce new interactions…
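To make the contrast with a static access control list concrete, the sketch below stores a privilege as a query that is evaluated against the current database contents. It is an illustration under assumptions, not the RDBAC model as specified by the author: the schema, the `POLICIES` mapping and the `may_read` helper are all hypothetical, and SQLite is used only because it ships with Python.

```python
import sqlite3

# Hypothetical policy: a user may read `salaries` only if the database
# itself currently records that user as a manager.
POLICIES = {
    "salaries": "SELECT 1 FROM employees WHERE name = :user AND role = 'manager'",
}


def make_db():
    """Create a small in-memory database for the illustration."""
    db = sqlite3.connect(":memory:")
    db.executescript(
        """
        CREATE TABLE employees(name TEXT PRIMARY KEY, role TEXT);
        CREATE TABLE salaries(name TEXT, amount INTEGER);
        INSERT INTO employees VALUES ('alice', 'manager'), ('bob', 'clerk');
        INSERT INTO salaries VALUES ('alice', 100), ('bob', 60);
        """
    )
    return db


def may_read(db, user, table):
    """Evaluate the policy query; access is granted if it returns a row."""
    query = POLICIES.get(table)
    if query is None:
        return False  # no policy registered: deny by default
    return db.execute(query, {"user": user}).fetchone() is not None


db = make_db()
print(may_read(db, "alice", "salaries"), may_read(db, "bob", "salaries"))

# Changing the data changes the privilege, with no separate list to update.
db.execute("UPDATE employees SET role = 'manager' WHERE name = 'bob'")
print(may_read(db, "bob", "salaries"))
```

Because the privilege depends on data, an update to `employees` changes who may read `salaries`, which is the kind of interaction between policies and data that the abstract alludes to.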

  7. SPODOBASE : an EST database for the lepidopteran crop pest Spodoptera

    PubMed Central

    Nègre, Vincent; Hôtelier, Thierry; Volkoff, Anne-Nathalie; Gimenez, Sylvie; Cousserans, François; Mita, Kazuei; Sabau, Xavier; Rocher, Janick; López-Ferber, Miguel; d'Alençon, Emmanuelle; Audant, Pascaline; Sabourault, Cécile; Bidegainberry, Vincent; Hilliou, Frédérique; Fournier, Philippe

    2006-01-01

    Background The Lepidoptera Spodoptera frugiperda is a pest which causes widespread economic damage on a variety of crop plants. It is also well known through its famous Sf9 cell line which is used for numerous heterologous protein productions. Species of the Spodoptera genus are used as models for pesticide resistance and to study virus host interactions. A genomic approach is now a critical step for further new developments in biology and pathology of these insects, and the results of ESTs sequencing efforts need to be structured into databases providing an integrated set of tools and information. Description The ESTs from five independent cDNA libraries, prepared from three different S. frugiperda tissues (hemocytes, midgut and fat body) and from the Sf9 cell line, are deposited in the database. These tissues were chosen because of their importance in biological processes such as immune response, development and plant/insect interaction. So far, the SPODOBASE contains 29,325 ESTs, which are cleaned and clustered into non-redundant sets (2294 clusters and 6103 singletons). The SPODOBASE is constructed in such a way that other ESTs from S. frugiperda or other species may be added. Users can retrieve information using text searches, pre-formatted queries, query assistant or BLAST searches. Annotation is provided against NCBI, UNIPROT or Bombyx mori ESTs databases, and with GO-Slim vocabulary. Conclusion The SPODOBASE database provides integrated access to expressed sequence tags (EST) from the lepidopteran insect Spodoptera frugiperda. It is a publicly available structured database with insect pest sequences which will allow identification of a number of genes and comprehensive cloning of gene families of interest for the scientific community. SPODOBASE is available from URL: PMID:16796757

  8. Searching NCBI databases using Entrez.

    PubMed

    Gibney, Gretchen; Baxevanis, Andreas D

    2011-06-01

    One of the most widely used interfaces for the retrieval of information from biological databases is the NCBI Entrez system. Entrez capitalizes on the fact that there are pre-existing, logical relationships between the individual entries found in numerous public databases. The existence of such natural connections, mostly biological in nature, argued for the development of a method through which all the information about a particular biological entity could be found without having to sequentially visit and query disparate databases. Two basic protocols describe simple, text-based searches, illustrating the types of information that can be retrieved through the Entrez system. An alternate protocol builds upon the first basic protocol, using additional, built-in features of the Entrez system, and providing alternative ways to issue the initial query. The support protocol reviews how to save frequently issued queries. Finally, Cn3D, a structure visualization tool, is also discussed.
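The protocols summarized above use the Entrez web interface. A text-based Entrez search can also be expressed as a request to NCBI's E-utilities `esearch` endpoint, and the sketch below only builds such a request URL without sending it. The helper name `esearch_url`, the example query and the choice of JSON output are illustrative assumptions and are not taken from the protocols.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Base address of the NCBI E-utilities, the programmatic interface to Entrez.
EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def esearch_url(db, term, retmax=20):
    """Build an ESearch URL for a text query against one Entrez database."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


# Example query (hypothetical): PubMed records mentioning Giardia lamblia in the title.
url = esearch_url("pubmed", "Giardia lamblia[Title]", retmax=5)
print(url)

# Decoding the query string shows the parameters that would be sent.
print(parse_qs(urlsplit(url).query))
```

Fetching the URL (for example with `urllib.request.urlopen`) returns the matching record identifiers. NCBI asks users to keep request rates modest, so a real script should throttle its calls.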

  9. Searching NCBI databases using Entrez.

    PubMed

    Baxevanis, Andreas D

    2008-12-01

    One of the most widely used interfaces for the retrieval of information from biological databases is the NCBI Entrez system. Entrez capitalizes on the fact that there are pre-existing, logical relationships between the individual entries found in numerous public databases. The existence of such natural connections, mostly biological in nature, argued for the development of a method through which all the information about a particular biological entity could be found without having to sequentially visit and query disparate databases. Two Basic Protocols describe simple, text-based searches, illustrating the types of information that can be retrieved through the Entrez system. An Alternate Protocol builds upon the first Basic Protocol, using additional, built-in features of the Entrez system, and providing alternative ways to issue the initial query. The Support Protocol reviews how to save frequently issued queries. Finally, Cn3D, a structure visualization tool, is also discussed.

  10. National Residential Efficiency Measures Database

    DOE Data Explorer

    The National Residential Efficiency Measures Database is a publicly available, centralized resource of residential building retrofit measures and costs for the U.S. building industry. With support from the U.S. Department of Energy, NREL developed this tool to help users determine the most cost-effective retrofit measures for improving energy efficiency of existing homes. Software developers who require residential retrofit performance and cost data for applications that evaluate residential efficiency measures are the primary audience for this database. In addition, home performance contractors and manufacturers of residential materials and equipment may find this information useful. The database offers the following types of retrofit measures: 1) Appliances, 2) Domestic Hot Water, 3) Enclosure, 4) Heating, Ventilating, and Air Conditioning (HVAC), 5) Lighting, 6) Miscellaneous.

  11. Quantifying the Consistency of Scientific Databases

    PubMed Central

    Šubelj, Lovro; Bajec, Marko; Mileva Boshkoska, Biljana; Kastrin, Andrej; Levnajić, Zoran

    2015-01-01

    Science is a social process with far-reaching impact on our modern society. In recent years, for the first time we are able to scientifically study the science itself. This is enabled by massive amounts of data on scientific publications that is increasingly becoming available. The data is contained in several databases such as Web of Science or PubMed, maintained by various public and private entities. Unfortunately, these databases are not always consistent, which considerably hinders this study. Relying on the powerful framework of complex networks, we conduct a systematic analysis of the consistency among six major scientific databases. We found that identifying a single "best" database is far from easy. Nevertheless, our results indicate appreciable differences in mutual consistency of different databases, which we interpret as recipes for future bibliometric studies. PMID:25984946

  12. Human Mitochondrial Protein Database

    National Institute of Standards and Technology Data Gateway

    SRD 131 Human Mitochondrial Protein Database (Web, free access)   The Human Mitochondrial Protein Database (HMPDb) provides comprehensive data on mitochondrial and human nuclear encoded proteins involved in mitochondrial biogenesis and function. This database consolidates information from SwissProt, LocusLink, Protein Data Bank (PDB), GenBank, Genome Database (GDB), Online Mendelian Inheritance in Man (OMIM), Human Mitochondrial Genome Database (mtDB), MITOMAP, Neuromuscular Disease Center and Human 2-D PAGE Databases. This database is intended as a tool not only to aid in studying the mitochondrion but in studying the associated diseases.

  13. Establishment of Kawasaki disease database based on metadata standard

    PubMed Central

    Park, Yu Rang; Kim, Jae-Jung; Yoon, Young Jo; Yoon, Young-Kwang; Koo, Ha Yeong; Hong, Young Mi; Jang, Gi Young; Shin, Soo-Yong; Lee, Jong-Keuk

    2016-01-01

    Kawasaki disease (KD) is a rare disease that occurs predominantly in infants and young children. To identify KD susceptibility genes and to develop a diagnostic test, a specific therapy, or prevention method, collecting KD patients’ clinical and genomic data is one of the major issues. For this purpose, Kawasaki Disease Database (KDD) was developed based on the efforts of Korean Kawasaki Disease Genetics Consortium (KKDGC). KDD is a collection of 1292 clinical data and genomic samples of 1283 patients from 13 KKDGC-participating hospitals. Each sample contains the relevant clinical data, genomic DNA and plasma samples isolated from patients’ blood, omics data and KD-associated genotype data. Clinical data was collected and saved using the common data elements based on the ISO/IEC 11179 metadata standard. Two genome-wide association study data of total 482 samples and whole exome sequencing data of 12 samples were also collected. In addition, KDD includes the rare cases of KD (16 cases with family history, 46 cases with recurrence, 119 cases with intravenous immunoglobulin non-responsiveness, and 52 cases with coronary artery aneurysm). As the first public database for KD, KDD can significantly facilitate KD studies. All data in KDD can be searchable and downloadable. KDD was implemented in PHP, MySQL and Apache, with all major browsers supported. Database URL: http://www.kawasakidisease.kr PMID:27630202

  14. Establishment of Kawasaki disease database based on metadata standard

    PubMed Central

    Park, Yu Rang; Kim, Jae-Jung; Yoon, Young Jo; Yoon, Young-Kwang; Koo, Ha Yeong; Hong, Young Mi; Jang, Gi Young; Shin, Soo-Yong; Lee, Jong-Keuk

    2016-01-01

    Kawasaki disease (KD) is a rare disease that occurs predominantly in infants and young children. To identify KD susceptibility genes and to develop a diagnostic test, a specific therapy, or prevention method, collecting KD patients’ clinical and genomic data is one of the major issues. For this purpose, Kawasaki Disease Database (KDD) was developed based on the efforts of Korean Kawasaki Disease Genetics Consortium (KKDGC). KDD is a collection of 1292 clinical data and genomic samples of 1283 patients from 13 KKDGC-participating hospitals. Each sample contains the relevant clinical data, genomic DNA and plasma samples isolated from patients’ blood, omics data and KD-associated genotype data. Clinical data was collected and saved using the common data elements based on the ISO/IEC 11179 metadata standard. Two genome-wide association study data of total 482 samples and whole exome sequencing data of 12 samples were also collected. In addition, KDD includes the rare cases of KD (16 cases with family history, 46 cases with recurrence, 119 cases with intravenous immunoglobulin non-responsiveness, and 52 cases with coronary artery aneurysm). As the first public database for KD, KDD can significantly facilitate KD studies. All data in KDD can be searchable and downloadable. KDD was implemented in PHP, MySQL and Apache, with all major browsers supported. Database URL: http://www.kawasakidisease.kr

  15. Establishment of Kawasaki disease database based on metadata standard.

    PubMed

    Park, Yu Rang; Kim, Jae-Jung; Yoon, Young Jo; Yoon, Young-Kwang; Koo, Ha Yeong; Hong, Young Mi; Jang, Gi Young; Shin, Soo-Yong; Lee, Jong-Keuk

    2016-07-01

    Kawasaki disease (KD) is a rare disease that occurs predominantly in infants and young children. To identify KD susceptibility genes and to develop a diagnostic test, a specific therapy, or prevention method, collecting KD patients' clinical and genomic data is one of the major issues. For this purpose, Kawasaki Disease Database (KDD) was developed based on the efforts of Korean Kawasaki Disease Genetics Consortium (KKDGC). KDD is a collection of 1292 clinical data and genomic samples of 1283 patients from 13 KKDGC-participating hospitals. Each sample contains the relevant clinical data, genomic DNA and plasma samples isolated from patients' blood, omics data and KD-associated genotype data. Clinical data was collected and saved using the common data elements based on the ISO/IEC 11179 metadata standard. Two genome-wide association study data of total 482 samples and whole exome sequencing data of 12 samples were also collected. In addition, KDD includes the rare cases of KD (16 cases with family history, 46 cases with recurrence, 119 cases with intravenous immunoglobulin non-responsiveness, and 52 cases with coronary artery aneurysm). As the first public database for KD, KDD can significantly facilitate KD studies. All data in KDD can be searchable and downloadable. KDD was implemented in PHP, MySQL and Apache, with all major browsers supported. Database URL: http://www.kawasakidisease.kr PMID:27630202

  16. Interactive bibliographical database on color

    NASA Astrophysics Data System (ADS)

    Caivano, Jose L.

    2002-06-01

    The paper describes the methodology and results of a project under development, aimed at the elaboration of an interactive bibliographical database on color in all fields of application: philosophy, psychology, semiotics, education, anthropology, physical and natural sciences, biology, medicine, technology, industry, architecture and design, arts, linguistics, geography, history. The project is initially based upon an already developed bibliography, published in different journals, updated on various occasions, and now available on the Internet, with more than 2,000 entries. The interactive database will amplify that bibliography, incorporating hyperlinks and contents (indexes, abstracts, keywords, introductions, or eventually the complete document), and devising mechanisms for information retrieval. The sources to be included are: books, doctoral dissertations, multimedia publications, reference works. The main arrangement will be chronological, but the design of the database will allow rearrangements or selections by different fields: subject, Decimal Classification System, author, language, country, publisher, etc. A further project is to develop another database, including color-specialized journals or newsletters, and articles on color published in international journals, arranged in this case by journal name and date of publication, but allowing also rearrangements or selections by author, subject and keywords.

  17. YCRD: Yeast Combinatorial Regulation Database

    PubMed Central

    Wu, Wei-Sheng; Hsieh, Yen-Chen; Lai, Fu-Jou

    2016-01-01

    In eukaryotes, the precise transcriptional control of gene expression is typically achieved through combinatorial regulation using cooperative transcription factors (TFs). Therefore, a database which provides regulatory associations between cooperative TFs and their target genes is helpful for biologists to study the molecular mechanisms of transcriptional regulation of gene expression. Because no such database exists in the public domain, we constructed a database, called Yeast Combinatorial Regulation Database (YCRD), which stores 434,197 regulatory associations between 2535 cooperative TF pairs and 6243 genes. The comprehensive collection of more than 2500 cooperative TF pairs was retrieved from 17 existing algorithms in the literature. The target genes of a cooperative TF pair (e.g. TF1-TF2) are defined as the common target genes of TF1 and TF2, where a TF’s experimentally validated target genes were downloaded from the YEASTRACT database. In YCRD, users can (i) search the target genes of a cooperative TF pair of interest, (ii) search the cooperative TF pairs which regulate a gene of interest and (iii) identify important cooperative TF pairs which regulate a given set of genes. We believe that YCRD will be a valuable resource for yeast biologists to study combinatorial regulation of gene expression. YCRD is available at http://cosbi.ee.ncku.edu.tw/YCRD/ or http://cosbi2.ee.ncku.edu.tw/YCRD/. PMID:27392072
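
    The definition above, in which the targets of a cooperative pair are the genes common to both TFs, amounts to a set intersection. The following is a minimal Python sketch under that reading. The TF and gene names, the `tf_targets` table and both helper functions are hypothetical illustrations and are not taken from YCRD or YEASTRACT.

```python
# Hypothetical toy data: each TF maps to its set of validated target genes.
# Names are placeholders, not real YEASTRACT records.
tf_targets = {
    "TF1": {"geneA", "geneB", "geneC"},
    "TF2": {"geneB", "geneC", "geneD"},
}


def pair_targets(tf_a, tf_b, targets):
    """Return the common target genes of two TFs (set intersection)."""
    return targets[tf_a] & targets[tf_b]


def pairs_regulating(gene, pairs, targets):
    """Return the TF pairs whose common targets include the given gene."""
    return [p for p in pairs if gene in pair_targets(p[0], p[1], targets)]


common = pair_targets("TF1", "TF2", tf_targets)
print(sorted(common))  # prints ['geneB', 'geneC']
```

    The two helpers correspond loosely to searches (i) and (ii) in the abstract: looking up the targets of a pair, and looking up the pairs that regulate a gene.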

  18. YCRD: Yeast Combinatorial Regulation Database.

    PubMed

    Wu, Wei-Sheng; Hsieh, Yen-Chen; Lai, Fu-Jou

    2016-01-01

    In eukaryotes, the precise transcriptional control of gene expression is typically achieved through combinatorial regulation using cooperative transcription factors (TFs). Therefore, a database which provides regulatory associations between cooperative TFs and their target genes is helpful for biologists to study the molecular mechanisms of transcriptional regulation of gene expression. Because no such database exists in the public domain, we constructed a database, called Yeast Combinatorial Regulation Database (YCRD), which stores 434,197 regulatory associations between 2535 cooperative TF pairs and 6243 genes. The comprehensive collection of more than 2500 cooperative TF pairs was retrieved from 17 existing algorithms in the literature. The target genes of a cooperative TF pair (e.g. TF1-TF2) are defined as the common target genes of TF1 and TF2, where a TF's experimentally validated target genes were downloaded from the YEASTRACT database. In YCRD, users can (i) search the target genes of a cooperative TF pair of interest, (ii) search the cooperative TF pairs which regulate a gene of interest and (iii) identify important cooperative TF pairs which regulate a given set of genes. We believe that YCRD will be a valuable resource for yeast biologists to study combinatorial regulation of gene expression. YCRD is available at http://cosbi.ee.ncku.edu.tw/YCRD/ or http://cosbi2.ee.ncku.edu.tw/YCRD/. PMID:27392072

  19. DREMECELS: A Curated Database for Base Excision and Mismatch Repair Mechanisms Associated Human Malignancies.

    PubMed

    Shukla, Ankita; Moussa, Ahmed; Singh, Tiratha Raj

    2016-01-01

    DNA repair mechanisms act as a warrior combating various damaging processes that ensue critical malignancies. DREMECELS was designed considering the malignancies with frequent alterations in DNA repair pathways, that is, colorectal and endometrial cancers, associated with Lynch syndrome (also known as HNPCC). Since Lynch syndrome carries a high risk (~40-60%) for both cancers, we decided to cover all three diseases in this portal. Although a large population is presently affected by these malignancies, many resources are available for various cancer types but no database archives information on the genes specifically for only these cancers and disorders. The database contains 156 genes and two repair mechanisms, base excision repair (BER) and mismatch repair (MMR). Other parameters include some of the regulatory processes that have roles in these disease progressions due to incompetent repair mechanisms, specifically BER and MMR. However, our unique database mainly provides qualitative and quantitative information on these cancer types along with methylation, drug sensitivity, miRNAs, copy number variation (CNV) and somatic mutations data. This database would serve the scientific community by providing integrated information on these disease types, thus sustaining diagnostic and therapeutic processes. This repository would serve as an excellent accompaniment for researchers and biomedical professionals and facilitate in understanding such critical diseases. DREMECELS is publicly available at http://www.bioinfoindia.org/dremecels.

  20. DREMECELS: A Curated Database for Base Excision and Mismatch Repair Mechanisms Associated Human Malignancies.

    PubMed

    Shukla, Ankita; Moussa, Ahmed; Singh, Tiratha Raj

    2016-01-01

    DNA repair mechanisms act as a warrior combating various damaging processes that ensue critical malignancies. DREMECELS was designed considering the malignancies with frequent alterations in DNA repair pathways, that is, colorectal and endometrial cancers, associated with Lynch syndrome (also known as HNPCC). Since Lynch syndrome carries a high risk (~40-60%) for both cancers, we decided to cover all three diseases in this portal. Although a large population is presently affected by these malignancies, many resources are available for various cancer types but no database archives information on the genes specifically for only these cancers and disorders. The database contains 156 genes and two repair mechanisms, base excision repair (BER) and mismatch repair (MMR). Other parameters include some of the regulatory processes that have roles in these disease progressions due to incompetent repair mechanisms, specifically BER and MMR. However, our unique database mainly provides qualitative and quantitative information on these cancer types along with methylation, drug sensitivity, miRNAs, copy number variation (CNV) and somatic mutations data. This database would serve the scientific community by providing integrated information on these disease types, thus sustaining diagnostic and therapeutic processes. This repository would serve as an excellent accompaniment for researchers and biomedical professionals and facilitate in understanding such critical diseases. DREMECELS is publicly available at http://www.bioinfoindia.org/dremecels. PMID:27276067

  1. DREMECELS: A Curated Database for Base Excision and Mismatch Repair Mechanisms Associated Human Malignancies

    PubMed Central

    Shukla, Ankita; Singh, Tiratha Raj

    2016-01-01

    DNA repair mechanisms act as a warrior combating various damaging processes that ensue critical malignancies. DREMECELS was designed considering the malignancies with frequent alterations in DNA repair pathways, that is, colorectal and endometrial cancers, associated with Lynch syndrome (also known as HNPCC). Since Lynch syndrome carries a high risk (~40–60%) for both cancers, we decided to cover all three diseases in this portal. Although a large population is presently affected by these malignancies, many resources are available for various cancer types but no database archives information on the genes specifically for only these cancers and disorders. The database contains 156 genes and two repair mechanisms, base excision repair (BER) and mismatch repair (MMR). Other parameters include some of the regulatory processes that have roles in these disease progressions due to incompetent repair mechanisms, specifically BER and MMR. However, our unique database mainly provides qualitative and quantitative information on these cancer types along with methylation, drug sensitivity, miRNAs, copy number variation (CNV) and somatic mutations data. This database would serve the scientific community by providing integrated information on these disease types, thus sustaining diagnostic and therapeutic processes. This repository would serve as an excellent accompaniment for researchers and biomedical professionals and facilitate in understanding such critical diseases. DREMECELS is publicly available at http://www.bioinfoindia.org/dremecels. PMID:27276067

  2. CD-ROM-aided Databases

    NASA Astrophysics Data System (ADS)

    Kitamura, Masami

    Nichigai Associates Inc. has begun information services to publish text databases on CD-ROM. Chapter 2 outlines these services and the publication plan for this fiscal year. Chapter 3 describes the CD-ROM logical file format common to these services, the software that generates files conforming to the format, and the software that retrieves CD-ROM files on personal computers.

  3. Village Green Project: Web-accessible Database

    EPA Science Inventory

    The purpose of this web-accessible database is for the public to be able to view instantaneous readings from a solar-powered air monitoring station located in a public location (prototype pilot test is outside of a library in Durham County, NC). The data are wirelessly transmitte...

  4. On-Line Databases in Mexico.

    ERIC Educational Resources Information Center

    Molina, Enzo

    1986-01-01

    Use of online bibliographic databases in Mexico is provided through Servicio de Consulta a Bancos de Informacion, a public service that provides information retrieval, document delivery, translation, technical support, and training services. Technical infrastructure is based on a public packet-switching network and institutional users may receive…

  5. 42 CFR 455.436 - Federal database checks.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 42 Public Health 4 2011-10-01 2011-10-01 false Federal database checks. 455.436 Section 455.436....436 Federal database checks. The State Medicaid agency must do all of the following: (a) Confirm the... databases. (b) Check the Social Security Administration's Death Master File, the National Plan and...

  6. 42 CFR 455.436 - Federal database checks.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 42 Public Health 4 2012-10-01 2012-10-01 false Federal database checks. 455.436 Section 455.436....436 Federal database checks. The State Medicaid agency must do all of the following: (a) Confirm the... databases. (b) Check the Social Security Administration's Death Master File, the National Plan and...

  7. 42 CFR 455.436 - Federal database checks.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 42 Public Health 4 2014-10-01 2014-10-01 false Federal database checks. 455.436 Section 455.436....436 Federal database checks. The State Medicaid agency must do all of the following: (a) Confirm the... databases. (b) Check the Social Security Administration's Death Master File, the National Plan and...

  8. 40 CFR 1400.13 - Read-only database.

    Code of Federal Regulations, 2012 CFR

    2012-07-01

    ... 40 Protection of Environment 34 2012-07-01 2012-07-01 false Read-only database. 1400.13 Section... INFORMATION Other Provisions § 1400.13 Read-only database. The Administrator is authorized to establish... public off-site consequence analysis information by means of a central database under the control of...

  9. 40 CFR 1400.13 - Read-only database.

    Code of Federal Regulations, 2013 CFR

    2013-07-01

    ... 40 Protection of Environment 34 2013-07-01 2013-07-01 false Read-only database. 1400.13 Section... INFORMATION Other Provisions § 1400.13 Read-only database. The Administrator is authorized to establish... public off-site consequence analysis information by means of a central database under the control of...

  10. 40 CFR 1400.13 - Read-only database.

    Code of Federal Regulations, 2014 CFR

    2014-07-01

    ... 40 Protection of Environment 33 2014-07-01 2014-07-01 false Read-only database. 1400.13 Section... INFORMATION Other Provisions § 1400.13 Read-only database. The Administrator is authorized to establish... public off-site consequence analysis information by means of a central database under the control of...

  11. 42 CFR 455.436 - Federal database checks.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 42 Public Health 4 2013-10-01 2013-10-01 false Federal database checks. 455.436 Section 455.436....436 Federal database checks. The State Medicaid agency must do all of the following: (a) Confirm the... databases. (b) Check the Social Security Administration's Death Master File, the National Plan and...

  12. Curation accuracy of model organism databases.

    PubMed

    Keseler, Ingrid M; Skrzypek, Marek; Weerasinghe, Deepika; Chen, Albert Y; Fulcher, Carol; Li, Gene-Wei; Lemmer, Kimberly C; Mladinich, Katherine M; Chow, Edmond D; Sherlock, Gavin; Karp, Peter D

    2014-01-01

    Manual extraction of information from the biomedical literature, or biocuration, is the central methodology used to construct many biological databases. For example, the UniProt protein database, the EcoCyc Escherichia coli database and the Candida Genome Database (CGD) are all based on biocuration. Biological databases are used extensively by life science researchers, as online encyclopedias, as aids in the interpretation of new experimental data and as golden standards for the development of new bioinformatics algorithms. Although manual curation has been assumed to be highly accurate, we are aware of only one previous study of biocuration accuracy. We assessed the accuracy of EcoCyc and CGD by manually selecting curated assertions within randomly chosen EcoCyc and CGD gene pages and by then validating that the data found in the referenced publications supported those assertions. A database assertion is considered to be in error if that assertion could not be found in the publication cited for that assertion. We identified 10 errors in the 633 facts that we validated across the two databases, for an overall error rate of 1.58%, and individual error rates of 1.82% for CGD and 1.40% for EcoCyc. These data suggest that manual curation of the experimental literature by Ph.D-level scientists is highly accurate. Database URL: http://ecocyc.org/, http://www.candidagenome.org/
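
    The overall error rate quoted above follows directly from the reported counts (10 errors in 633 validated facts). A minimal Python check is sketched below. The function name `error_rate_percent` is an illustrative assumption. The per-database rates are not recomputed, because the abstract does not give the per-database counts.

```python
def error_rate_percent(errors: int, total: int) -> float:
    """Return the number of errors as a percentage of the facts checked."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * errors / total


# Counts taken from the abstract: 10 errors among 633 validated facts.
overall = error_rate_percent(10, 633)
print(f"{overall:.2f}%")  # prints 1.58%
```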

  13. The UCSC Genome Browser database: 2015 update

    PubMed Central

    Rosenbloom, Kate R.; Armstrong, Joel; Barber, Galt P.; Casper, Jonathan; Clawson, Hiram; Diekhans, Mark; Dreszer, Timothy R.; Fujita, Pauline A.; Guruvadoo, Luvina; Haeussler, Maximilian; Harte, Rachel A.; Heitner, Steve; Hickey, Glenn; Hinrichs, Angie S.; Hubley, Robert; Karolchik, Donna; Learned, Katrina; Lee, Brian T.; Li, Chin H.; Miga, Karen H.; Nguyen, Ngan; Paten, Benedict; Raney, Brian J.; Smit, Arian F. A.; Speir, Matthew L.; Zweig, Ann S.; Haussler, David; Kuhn, Robert M.; Kent, W. James

    2015-01-01

    Launched in 2001 to showcase the draft human genome assembly, the UCSC Genome Browser database (http://genome.ucsc.edu) and associated tools continue to grow, providing a comprehensive resource of genome assemblies and annotations to scientists and students worldwide. Highlights of the past year include the release of a browser for the first new human genome reference assembly in 4 years in December 2013 (GRCh38, UCSC hg38), a watershed comparative genomics annotation (100-species multiple alignment and conservation) and a novel distribution mechanism for the browser (GBiB: Genome Browser in a Box). We created browsers for new species (Chinese hamster, elephant shark, minke whale), ‘mined the web’ for DNA sequences and expanded the browser display with stacked color graphs and region highlighting. As our user community increasingly adopts the UCSC track hub and assembly hub representations for sharing large-scale genomic annotation data sets and genome sequencing projects, our menu of public data hubs has tripled. PMID:25428374

  14. Three Decades of Recombinant DNA.

    ERIC Educational Resources Information Center

    Palmer, Jackie

    1985-01-01

    Discusses highlights in the development of genetic engineering, examining techniques with recombinant DNA, legal and ethical issues, GenBank (a national database of nucleic acid sequences), and other topics. (JN)

  15. Physiological Information Database (PID)

    EPA Science Inventory

    EPA has developed a physiological information database (created using Microsoft ACCESS) intended to be used in PBPK modeling. The database contains physiological parameter values for humans from early childhood through senescence as well as similar data for laboratory animal spec...

  16. Network II Database

    1994-11-07

    The Oak Ridge National Laboratory (ORNL) Rail and Barge Network II Database is a representation of the rail and barge system of the United States. The network is derived from the Federal Rail Administration (FRA) rail database.

  17. THE ECOTOX DATABASE

    EPA Science Inventory

    The database provides chemical-specific toxicity information for aquatic life, terrestrial plants, and terrestrial wildlife. ECOTOX is a comprehensive ecotoxicology database and is therefore essential for providing and supporting high quality models needed to estimate population...

  18. Household Products Database: Pesticides

    MedlinePlus

    ... Names Types of Products Manufacturers Ingredients About the Database FAQ Product Recalls Help Glossary Contact Us More ... holders. Information is extracted from Consumer Product Information Database ©2001-2015 by DeLima Associates. All rights reserved. ...

  19. Scopus database: a review

    PubMed Central

    Burnham, Judy F

    2006-01-01

    The Scopus database provides access to STM journal articles and the references included in those articles, allowing the searcher to search both forward and backward in time. The database can be used for collection development as well as for research. This review provides information on the key points of the database and compares it to Web of Science. Neither database is inclusive, but they complement each other. If a library can afford only one, the choice must be based on institutional needs. PMID:16522216

  20. Scopus database: a review.

    PubMed

    Burnham, Judy F

    2006-03-08

    The Scopus database provides access to STM journal articles and the references included in those articles, allowing the searcher to search both forward and backward in time. The database can be used for collection development as well as for research. This review provides information on the key points of the database and compares it to Web of Science. Neither database is inclusive, but they complement each other. If a library can afford only one, the choice must be based on institutional needs.

  1. Mission and Assets Database

    NASA Technical Reports Server (NTRS)

    Baldwin, John; Zendejas, Silvino; Gutheinz, Sandy; Borden, Chester; Wang, Yeou-Fang

    2009-01-01

    Mission and Assets Database (MADB) Version 1.0 is an SQL database system with a Web user interface to centralize information. The database stores flight project support resource requirements, view periods, antenna information, schedule, and forecast results for use in mid-range and long-term planning of Deep Space Network (DSN) assets.

  2. Ecology in the age of DNA barcoding: the resource, the promise and the challenges ahead.

    PubMed

    Joly, Simon; Davies, T Jonathan; Archambault, Annie; Bruneau, Anne; Derry, Alison; Kembel, Steven W; Peres-Neto, Pedro; Vamosi, Jana; Wheeler, Terry A

    2014-03-01

    Ten years after DNA barcoding was initially suggested as a tool to identify species, millions of barcode sequences from more than 1100 species are available in public databases. While several studies have reviewed the methods and potential applications of DNA barcoding, most have focused on species identification and discovery, and relatively few have addressed applications of DNA barcoding data to ecology. These data, and the associated information on the evolutionary histories of taxa that they can provide, offer great opportunities for ecologists to investigate questions that were previously difficult or impossible to address. We present an overview of potential uses of DNA barcoding relevant in the age of ecoinformatics, including applications in community ecology, species invasion, macroevolution, trait evolution, food webs and trophic interactions, metacommunities, and spatial ecology. We also outline some of the challenges and potential advances in DNA barcoding that lie ahead.

  3. DNA barcoding in the media: does coverage of cool science reflect its social context?

    PubMed

    Geary, Janis; Camicioli, Emma; Bubela, Tania

    2016-09-01

    Paul Hebert and colleagues first described DNA barcoding in 2003, which led to international efforts to promote and coordinate its use. Since its inception, DNA barcoding has generated considerable media coverage. We analysed whether this coverage reflected both the scientific and social mandates of international barcoding organizations. We searched newspaper databases to identify 900 English-language articles from 2003 to 2013. Coverage of the science of DNA barcoding was highly positive but lacked context for key topics. Coverage omissions pose challenges for public understanding of the science and applications of DNA barcoding; these included coverage of governance structures and issues related to the sharing of genetic resources across national borders. Our analysis provided insight into how barcoding communication efforts have translated into media coverage; more targeted communication efforts may focus media attention on previously omitted, but important topics. Our analysis is timely as the DNA barcoding community works to establish the International Society for the Barcode of Life. PMID:27463361

  4. DNA barcoding in the media: does coverage of cool science reflect its social context?

    PubMed

    Geary, Janis; Camicioli, Emma; Bubela, Tania

    2016-09-01

    Paul Hebert and colleagues first described DNA barcoding in 2003, which led to international efforts to promote and coordinate its use. Since its inception, DNA barcoding has generated considerable media coverage. We analysed whether this coverage reflected both the scientific and social mandates of international barcoding organizations. We searched newspaper databases to identify 900 English-language articles from 2003 to 2013. Coverage of the science of DNA barcoding was highly positive but lacked context for key topics. Coverage omissions pose challenges for public understanding of the science and applications of DNA barcoding; these included coverage of governance structures and issues related to the sharing of genetic resources across national borders. Our analysis provided insight into how barcoding communication efforts have translated into media coverage; more targeted communication efforts may focus media attention on previously omitted, but important topics. Our analysis is timely as the DNA barcoding community works to establish the International Society for the Barcode of Life.

  5. MitoZoa 2.0: a database resource and search tools for comparative and evolutionary analyses of mitochondrial genomes in Metazoa

    PubMed Central

    D'Onorio de Meo, Paolo; D'Antonio, Mattia; Griggio, Francesca; Lupi, Renato; Borsani, Massimiliano; Pavesi, Giulio; Castrignanò, Tiziana; Pesole, Graziano; Gissi, Carmela

    2012-01-01

    The MITOchondrial genome database of metaZOAns (MitoZoa) is a public resource for comparative analyses of metazoan mitochondrial genomes (mtDNA) at both the sequence and genomic organizational levels. The main characteristics of the MitoZoa database are the careful revision of mtDNA entry annotations and the possibility of retrieving gene order and non-coding region (NCR) data in appropriate formats. The MitoZoa retrieval system enables basic and complex queries at various taxonomic levels using different search menus. MitoZoa 2.0 has been enhanced in several aspects, including: a re-annotation pipeline to check the correctness of protein-coding gene predictions; a standardized annotation of introns and of precursor ORFs whose functionality is post-transcriptionally recovered by RNA editing or programmed translational frameshifting; updates of taxon-related fields and a BLAST sequence similarity search tool. Database novelties and the definition of standard mtDNA annotation rules, together with the user-friendly retrieval system and the BLAST service, make MitoZoa a valuable resource for comparative and evolutionary analyses as well as a reference database to assist in the annotation of novel mtDNA sequences. MitoZoa is freely accessible at http://www.caspur.it/mitozoa. PMID:22123747

  6. MitoZoa 2.0: a database resource and search tools for comparative and evolutionary analyses of mitochondrial genomes in Metazoa.

    PubMed

    D'Onorio de Meo, Paolo; D'Antonio, Mattia; Griggio, Francesca; Lupi, Renato; Borsani, Massimiliano; Pavesi, Giulio; Castrignanò, Tiziana; Pesole, Graziano; Gissi, Carmela

    2012-01-01

    The MITOchondrial genome database of metaZOAns (MitoZoa) is a public resource for comparative analyses of metazoan mitochondrial genomes (mtDNA) at both the sequence and genomic organizational levels. The main characteristics of the MitoZoa database are the careful revision of mtDNA entry annotations and the possibility of retrieving gene order and non-coding region (NCR) data in appropriate formats. The MitoZoa retrieval system enables basic and complex queries at various taxonomic levels using different search menus. MitoZoa 2.0 has been enhanced in several aspects, including: a re-annotation pipeline to check the correctness of protein-coding gene predictions; a standardized annotation of introns and of precursor ORFs whose functionality is post-transcriptionally recovered by RNA editing or programmed translational frameshifting; updates of taxon-related fields and a BLAST sequence similarity search tool. Database novelties and the definition of standard mtDNA annotation rules, together with the user-friendly retrieval system and the BLAST service, make MitoZoa a valuable resource for comparative and evolutionary analyses as well as a reference database to assist in the annotation of novel mtDNA sequences. MitoZoa is freely accessible at http://www.caspur.it/mitozoa.

  7. The Organelle Genome Database Project (GOBASE).

    PubMed Central

    Korab-Laskowska, M; Rioux, P; Brossard, N; Littlejohn, T G; Gray, M W; Lang, B F; Burger, G

    1998-01-01

    The taxonomically broad organelle genome database (GOBASE) organizes and integrates diverse data related to organelles (mitochondria and chloroplasts). The current version of GOBASE focuses on the mitochondrial subset of data and contains molecular sequences, RNA secondary structures and genetic maps, as well as taxonomic information for all eukaryotic species represented. The database has been designed so that complex biological queries, especially ones posed in a comparative genomics context, are supported. GOBASE has been implemented as a relational database with a web-based user interface (http://megasun.bch.umontreal.ca/gobase/gobase.html). Custom software tools have been written in house to assist in the population of the database, data validation, nomenclature standardization and front-end design. The database is fully operational and publicly accessible via the World Wide Web, allowing interactive browsing, sophisticated searching and easy downloading of data. PMID:9399818

  8. The PEP-2 project-wide database

    NASA Astrophysics Data System (ADS)

    Chan, A.; Calish, S.; Crane, G.; MacGregor, I.; Meyer, S.; Wong, J.; Weinstein, A.

    1995-05-01

    The PEP-2 Project Database is a tool for monitoring the technical and documentation aspects of this accelerator construction. It holds the PEP-2 design specifications, fabrication and installation data in one integrated system. Key pieces of the database include the machine parameter list, magnet and vacuum fabrication data, CAD drawings, publications and documentation, survey and alignment data and property control. The database can be extended to contain information required for the operations phase of the accelerator and detector. Features such as viewing CAD drawing graphics from the database will be implemented in the future. This central Oracle database on a UNIX server is built using ORACLE Case tools. Users at the three collaborating laboratories (SLAC, LBL, LLNL) can access the data remotely, using various desktop computer platforms and graphical interfaces.

  9. A nation's genes for a cure to cancer: evolving ethical, social and legal issues regarding population genetic databases.

    PubMed

    Hsieh, Alice

    2004-01-01

    The advent of the human genome sequence has focused research on understanding underlying genetic links to complex diseases such as cancer, asthma and heart disease. In the past few years, individual countries, such as Iceland, Estonia, Singapore and the United Kingdom, have created national databases of their citizens' DNA for comparative research. Most recently, an international consortium including Nigeria, Japan, China and the United States launched a $100 million project called the International HapMap to map the human genome according to haplotypes, blocks of DNA that contain genetic variation. Such population genetic databases present challenging ethical, social and legal issues, yet regulation of genetic information has developed sporadically, from region to region, without a consistent international standard. Without a clear understanding of the consequences of genetic research in terms of individual and community-wide discrimination and stigmatization, genetic databases raise concerns about the protection of genetic information. This Note provides a survey of the evolving landscape of population genetic databases as a legislative and public policy tool for national and international regulators. It compares different approaches to regulating the collection and use of population genetic databases in order to understand what areas of consensus are formulating a foundation for an international standard. As the first population genetics project that will span multiple countries for the collection of DNA, the International HapMap has the potential to become an influential standard for the protection of population genetic information. This Note highlights issues among the national databases and the HapMap project that raise ethical, social and legal concerns for the future and recommends further protections for both individual donors and community interests.

  10. The NCBI Taxonomy database.

    PubMed

    Federhen, Scott

    2012-01-01

    The NCBI Taxonomy database (http://www.ncbi.nlm.nih.gov/taxonomy) is the standard nomenclature and classification repository for the International Nucleotide Sequence Database Collaboration (INSDC), comprising the GenBank, ENA (EMBL) and DDBJ databases. It includes organism names and taxonomic lineages for each of the sequences represented in the INSDC's nucleotide and protein sequence databases. The taxonomy database is manually curated by a small group of scientists at the NCBI who use the current taxonomic literature to maintain a phylogenetic taxonomy for the source organisms represented in the sequence databases. The taxonomy database is a central organizing hub for many of the resources at the NCBI, and provides a means for clustering elements within other domains of NCBI web site, for internal linking between domains of the Entrez system and for linking out to taxon-specific external resources on the web. Our primary purpose is to index the domain of sequences as conveniently as possible for our user community.

  11. The Human PAX6 Mutation Database.

    PubMed

    Brown, A; McKie, M; van Heyningen, V; Prosser, J

    1998-01-01

    The Human PAX6 Mutation Database contains details of 94 mutations of the PAX6 gene. A Microsoft Access program is used by the Curator to store, update and search the database entries. Mutations can be entered directly by the Curator, or imported from submissions made via the World Wide Web. The PAX6 Mutation Database web page at URL http://www.hgu.mrc.ac.uk/Softdata/PAX6/ provides information about PAX6, as well as a fill-in form through which new mutations can be submitted to the Curator. A search facility allows remote users to query the database. A plain text format file of the data can be downloaded via the World Wide Web. The Curation program contains prior knowledge of the genetic code and of the PAX6 gene including cDNA sequence, location of intron/exon boundaries, and protein domains, so that the minimum of information need be provided by the submitter or Curator.

  12. DNA Sequencing apparatus

    DOEpatents

    Tabor, Stanley; Richardson, Charles C.

    1992-01-01

    An automated DNA sequencing apparatus having a reactor for providing at least two series of DNA products formed from a single primer and a DNA strand, each DNA product of a series differing in molecular weight and having a chain terminating agent at one end; separating means for separating the DNA products to form a series of bands, the intensity of substantially all nearby bands in a different series being different, band reading means for determining the position an This invention was made with government support including a grant from the U.S. Public Health Service, contract number AI-06045. The U.S. government has certain rights in the invention.

  13. An Introduction to Database Structure and Database Machines.

    ERIC Educational Resources Information Center

    Detweiler, Karen

    1984-01-01

    Enumerates principal management objectives of database management systems (data independence, quality, security, multiuser access, central control) and criteria for comparison (response time, size, flexibility, other features). Conventional database management systems, relational databases, and database machines used for backend processing are…

  14. The National Land Cover Database

    USGS Publications Warehouse

    Homer, Collin H.; Fry, Joyce A.; Barnes, Christopher A.

    2012-01-01

    The National Land Cover Database (NLCD) serves as the definitive Landsat-based, 30-meter resolution, land cover database for the Nation. NLCD provides spatial reference and descriptive data for characteristics of the land surface such as thematic class (for example, urban, agriculture, and forest), percent impervious surface, and percent tree canopy cover. NLCD supports a wide variety of Federal, State, local, and nongovernmental applications that seek to assess ecosystem status and health, understand the spatial patterns of biodiversity, predict effects of climate change, and develop land management policy. NLCD products are created by the Multi-Resolution Land Characteristics (MRLC) Consortium, a partnership of Federal agencies led by the U.S. Geological Survey. All NLCD data products are available for download at no charge to the public from the MRLC Web site: http://www.mrlc.gov.

  15. LOTUS-DB: an integrative and interactive database for Nelumbo nucifera study

    PubMed Central

    Wang, Kun; Deng, Jiao; Damaris, Rebecca Njeri; Yang, Mei; Xu, Liming; Yang, Pingfang

    2015-01-01

    Besides its significance in plant taxonomy and phylogeny, sacred lotus (Nelumbo nucifera Gaertn.) might also hold the key to the secrets of aging, which is attracting growing attention from researchers all over the world. Genetic and molecular studies of this species depend on its genome information. In 2013, two publications reported the sequencing of its full genome, based on which we constructed a database named LOTUS-DB. It will provide comprehensive information on annotation, gene function and expression for the sacred lotus. The information will help users to efficiently query and browse genes, visualize the genome graphically and download a variety of complex data on genome DNA, coding sequence (CDS), transcripts or peptide sequences, promoters and markers. It will accelerate research on gene cloning and functional identification in sacred lotus, and hence promote studies of this species and of plant genomics as well. Database URL: http://lotus-db.wbgcas.cn. PMID:25819075

  16. GOLD: The Genomes Online Database

    DOE Data Explorer

    Kyrpides, Nikos; Liolios, Dinos; Chen, Amy; Tavernarakis, Nektarios; Hugenholtz, Philip; Markowitz, Victor; Bernal, Alex

    Since its inception in 1997, GOLD has continuously monitored genome sequencing projects worldwide and has provided the community with a unique centralized resource that integrates diverse information related to Archaea, Bacteria, Eukaryotic and more recently Metagenomic sequencing projects. As of September 2007, GOLD recorded 639 completed genome projects. These projects have their complete sequence deposited into the public archival sequence databases such as GenBank, EMBL, and DDBJ. Of the 639 complete and published genome projects as of 9/2007, 527 were bacterial, 47 were archaeal and 65 were eukaryotic. In addition to the complete projects, there were 2158 ongoing sequencing projects: 1328 bacterial, 59 archaeal and 771 eukaryotic. Two types of metadata are provided by GOLD: (i) project metadata and (ii) organism/environment metadata. GOLD CARD pages for every project are available from the link of every GOLD_STAMP ID. The information in every one of these pages is organized into three tables: (a) Organism information, (b) Genome project information and (c) External links. [The Genomes On Line Database (GOLD) in 2007: Status of genomic and metagenomic projects and their associated metadata, Konstantinos Liolios, Konstantinos Mavromatis, Nektarios Tavernarakis and Nikos C. Kyrpides, Nucleic Acids Research Advance Access published online on November 2, 2007, Nucleic Acids Research, doi:10.1093/nar/gkm884]

    The basic tables in the GOLD database that can be browsed or searched include the following information:

    • Gold Stamp ID
    • Organism name
    • Domain
    • Links to information sources
    • Size and link to a map, when available
    • Chromosome number, plasmid number, and GC content
    • A link for downloading the actual genome data
    • Institution that did the sequencing
    • Funding source
    • Database where information resides
    • Publication status and information
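
    As a rough illustration of the figures and fields above, the following Python sketch checks that the per-domain counts quoted in the abstract add up to the stated totals and tallies a few example entries by domain. The class name GoldEntry, its field names and the sample entries are hypothetical and chosen only for this example; they are not GOLD's actual schema or data.

```python
from collections import Counter
from dataclasses import dataclass

# Per-domain counts quoted in the abstract (status as of September 2007).
COMPLETED_2007 = {"bacterial": 527, "archaeal": 47, "eukaryotic": 65}
ONGOING_2007 = {"bacterial": 1328, "archaeal": 59, "eukaryotic": 771}


@dataclass(frozen=True)
class GoldEntry:
    """Hypothetical record holding a few of the listed fields (not GOLD's schema)."""
    gold_stamp_id: str
    organism_name: str
    domain: str
    publication_status: str


def total_projects(counts):
    """Sum the per-domain project counts."""
    return sum(counts.values())


def tally_by_domain(entries):
    """Count entries per domain, mirroring the per-domain breakdown in the abstract."""
    return dict(Counter(entry.domain for entry in entries))


if __name__ == "__main__":
    # The abstract states 639 completed and 2158 ongoing projects.
    print(total_projects(COMPLETED_2007))
    print(total_projects(ONGOING_2007))
```

    The two totals can be compared directly with the 639 completed and 2158 ongoing projects reported in the abstract.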

    • Human Contamination in Public Genome Assemblies

      PubMed Central

      Kryukov, Kirill; Imanishi, Tadashi

      2016-01-01

      Contamination in a genome assembly can lead to wrong or confusing results when such a genome is used as a reference in sequence comparison. Although bacterial contamination is well known, the problem of human-originated contamination has received little attention. In this study we surveyed 45,735 available genome assemblies for evidence of human contamination. We used lineage specificity to distinguish between contamination and conservation. We found that 154 genome assemblies contain fragments that with high confidence originate as contamination from human DNA. The majority of contaminating human sequences had been present in the reference human genome assembly for over a decade. We recommend that existing contaminated genomes should be revised to remove contaminated sequence, and that new assemblies should be thoroughly checked for the presence of human DNA before submitting them to public databases. PMID:27611326

    • The Chloroplast Function Database II: a comprehensive collection of homozygous mutants and their phenotypic/genotypic traits for nuclear-encoded chloroplast proteins.

      PubMed

      Myouga, Fumiyoshi; Akiyama, Kenji; Tomonaga, Yumi; Kato, Aya; Sato, Yuka; Kobayashi, Megumi; Nagata, Noriko; Sakurai, Tetsuya; Shinozaki, Kazuo

      2013-02-01

      The Chloroplast Function Database has so far offered phenotype information on mutants of the nuclear-encoded chloroplast proteins in Arabidopsis that pertains to >200 phenotypic data sets that were obtained from 1,722 transposon- or T-DNA-tagged lines. Here, we present the development of the second version of the database, which is named the Chloroplast Function Database II and was redesigned to increase the number of mutant characters and new user-friendly tools for data mining and integration. The upgraded database offers information on genome-wide mutant screens for any visible phenotype against 2,495 tagged lines to create a comprehensive homozygous mutant collection. The collection consists of 147 lines with seedling phenotypes and 185 lines for which we could not obtain homozygotes, as well as 1,740 homozygotes with wild-type phenotypes. Besides providing basic information about primer lists that were used for the PCR genotyping of T-DNA-tagged lines and explanations about the preparation of homozygous mutants and phenotype screening, the database includes access to a link between the gene locus and existing publicly available databases. This gives users access to a combined pool of data, enabling them to gain valuable insights into biological processes. In addition, high-resolution images of plastid morphologies of mutants with seedling-specific chloroplast defects as observed with transmission electron microscopy (TEM) are available in the current database. This database is used to compare the phenotypes of visually identifiable mutants with their plastid ultrastructures and to evaluate their potential significance from characteristic patterns of plastid morphology in vivo. Thus, the Chloroplast Function Database II is a useful and comprehensive information resource that can help researchers to connect individual Arabidopsis genes to plastid functions on the basis of phenotype analysis of our tagged mutant collection. It can be freely accessed at http://rarge.psc.riken.jp/chloroplast/.

    • PDS: A Performance Database Server

      DOE PAGES

      Berry, Michael W.; Dongarra, Jack J.; Larose, Brian H.; Letsche, Todd A.

      1994-01-01

      The process of gathering, archiving, and distributing computer benchmark data is a cumbersome task usually performed by computer users and vendors with little coordination. Most important, there is no publicly available central depository of performance data for all ranges of machines from personal computers to supercomputers. We present an Internet-accessible performance database server (PDS) that can be used to extract current benchmark data and literature. As an extension to the X-Windows-based user interface (Xnetlib) to the Netlib archival system, PDS provides an on-line catalog of public domain computer benchmarks such as the LINPACK benchmark, Perfect benchmarks, and the NAS parallel benchmarks. PDS does not reformat or present the benchmark data in any way that conflicts with the original methodology of any particular benchmark; it is thereby devoid of any subjective interpretations of machine performance. We believe that all branches (research laboratories, academia, and industry) of the general computing community can use this facility to archive performance metrics and make them readily available to the public. PDS can provide a more manageable approach to the development and support of a large dynamic database of published performance metrics.

  1. Direct-test PCR for detection of meningococcal DNA and its serogroup characterization: standardization and adaptation for use in a public health laboratory.

    PubMed

    Baethgen, L F; Moraes, C; Weidlich, L; Rios, S; Kmetzsch, C I; Silva, M S N; Rossetti, M L R; Zaha, A

    2003-09-01

    A direct PCR test (DT-PCR) was established to detect Neisseria meningitidis DNA in clinical samples from patients with suspected bacterial meningitis. Specific primers for the 16S rDNA of N. meningitidis were designed to amplify a 600 bp DNA fragment. One hundred and ninety-three clinical samples were analysed, corresponding to 114 samples from patients diagnosed as positive and 79 as negative for infection by N. meningitidis using conventional methods (culture, latex agglutination and counterimmunoelectrophoresis). These samples were submitted to PCR by two different clinical sample preparation approaches (with and without DNA extraction and purification) and submitted to different PCR protocols to improve the results. In agarose gel detection, the sensitivity value for DT-PCR was 88.5 % and, using dot-blot DNA detection, the sensitivity increased to 96.4 %. The detection limit for meningococcus in cerebrospinal fluid was 2 × 10^2 c.f.u. ml^-1. Serogroup prediction was done using a multiplex PCR protocol and the sensitivity was 83 % for agarose gel DNA detection and 96.4 % using dot-blot DNA detection. PMID:12909657

  2. ITS-90 Thermocouple Database

    National Institute of Standards and Technology Data Gateway

    SRD 60 NIST ITS-90 Thermocouple Database (Web, free access)   Web version of Standard Reference Database 60 and NIST Monograph 175. The database gives temperature -- electromotive force (emf) reference functions and tables for the letter-designated thermocouple types B, E, J, K, N, R, S and T. These reference functions have been adopted as standards by the American Society for Testing and Materials (ASTM) and the International Electrotechnical Commission (IEC).

  3. 2010 Worldwide Gasification Database

    DOE Data Explorer

    The 2010 Worldwide Gasification Database describes the current world gasification industry and identifies near-term planned capacity additions. The database lists gasification projects and includes information (e.g., plant location, number and type of gasifiers, syngas capacity, feedstock, and products). The database reveals that the worldwide gasification capacity has continued to grow for the past several decades and is now at 70,817 megawatts thermal (MWth) of syngas output at 144 operating plants with a total of 412 gasifiers.

  4. Databases for Microbiologists

    DOE PAGES

    Zhulin, Igor B.

    2015-05-26

    Databases play an increasingly important role in biology. They archive, store, maintain, and share information on genes, genomes, expression data, protein sequences and structures, metabolites and reactions, interactions, and pathways. All these data are critically important to microbiologists. Furthermore, microbiology has its own databases that deal with model microorganisms, microbial diversity, physiology, and pathogenesis. Thousands of biological databases are currently available, and it becomes increasingly difficult to keep up with their development. The purpose of this minireview is to provide a brief survey of current databases that are of interest to microbiologists.

  5. Veterans Administration Databases

    Cancer.gov

    The Veterans Administration Information Resource Center provides database and informatics experts, customer service, expert advice, information products, and web technology to VA researchers and others.

  6. Databases for Microbiologists

    PubMed Central

    2015-01-01

    Databases play an increasingly important role in biology. They archive, store, maintain, and share information on genes, genomes, expression data, protein sequences and structures, metabolites and reactions, interactions, and pathways. All these data are critically important to microbiologists. Furthermore, microbiology has its own databases that deal with model microorganisms, microbial diversity, physiology, and pathogenesis. Thousands of biological databases are currently available, and it becomes increasingly difficult to keep up with their development. The purpose of this minireview is to provide a brief survey of current databases that are of interest to microbiologists. PMID:26013493

  7. Databases for LDEF results

    NASA Technical Reports Server (NTRS)

    Bohnhoff-Hlavacek, Gail

    1992-01-01

    One of the objectives of the team supporting the LDEF Systems and Materials Special Investigative Groups is to develop databases of experimental findings. These databases identify the hardware flown, summarize results and conclusions, and provide a system for acknowledging investigators, tracing sources of data, and future design suggestions. To date, databases covering the optical experiments, and thermal control materials (chromic acid anodized aluminum, silverized Teflon blankets, and paints) have been developed at Boeing. We used the Filemaker Pro software, the database manager for the Macintosh computer produced by the Claris Corporation. It is a flat, text-retrievable database that provides access to the data via an intuitive user interface, without tedious programming. Though this software is available only for the Macintosh computer at this time, copies of the databases can be saved to a format that is readable on a personal computer as well. Further, the data can be exported to more powerful relational databases, capabilities, and use of the LDEF databases and describe how to get copies of the database for your own research.

  8. Open access intrapartum CTG database

    PubMed Central

    2014-01-01

    Background Cardiotocography (CTG) is the monitoring of fetal heart rate and uterine contractions. It has been routinely used by obstetricians since 1960 to assess fetal well-being. Many attempts to introduce methods of automatic signal processing and evaluation have appeared during the last 20 years; however, there is still no visible progress comparable to that in the domain of adult heart rate variability, where open access databases are available (e.g. MIT-BIH). Based on a thorough review of the relevant publications, presented in this paper, the shortcomings of the current state are obvious. A lack of common ground for clinicians and technicians in the field hinders clinically usable progress. Our open access database of digital intrapartum cardiotocographic recordings aims to change that. Description The intrapartum CTG database consists of 552 intrapartum recordings in total, which were acquired between April 2010 and August 2012 at the obstetrics ward of the University Hospital in Brno, Czech Republic. All recordings were stored in electronic form in the OB TraceVue® system. The recordings were selected from 9164 intrapartum recordings with clinical as well as technical considerations in mind. All recordings are at most 90 minutes long and start a maximum of 90 minutes before delivery. The time relation of CTG to delivery is known, as well as the length of the second stage of labor, which does not exceed 30 minutes. The majority of recordings (all but 46 cesarean sections) are, on purpose, from vaginal deliveries. All recordings have available biochemical markers as well as some more general clinical features. A full description of the database and the reasoning behind the selection of the parameters is presented in the paper. Conclusion A new open-access CTG database is introduced which should give the research community common ground for comparison of results on a reasonably large database. We anticipate that after reading the paper, the reader will understand the

  9. The Latin American Social Medicine database

    PubMed Central

    Eldredge, Jonathan D; Waitzkin, Howard; Buchanan, Holly S; Teal, Janis; Iriart, Celia; Wiley, Kevin; Tregear, Jonathan

    2004-01-01

    Background Public health practitioners and researchers for many years have been attempting to understand more clearly the links between social conditions and the health of populations. Until recently, most public health professionals in English-speaking countries were unaware that their colleagues in Latin America had developed an entire field of inquiry and practice devoted to making these links more clearly understood. The Latin American Social Medicine (LASM) database finally bridges this previous gap. Description This public health informatics case study describes the key features of a unique information resource intended to improve access to LASM literature and to augment understanding about the social determinants of health. This case study includes both quantitative and qualitative evaluation data. Currently the LASM database at The University of New Mexico brings important information, originally known mostly within professional networks located in Latin American countries to public health professionals worldwide via the Internet. The LASM database uses Spanish, Portuguese, and English language trilingual, structured abstracts to summarize classic and contemporary works. Conclusion This database provides helpful information for public health professionals on the social determinants of health and expands access to LASM. PMID:15627401

  10. BodyMap: a human and mouse gene expression database.

    PubMed

    Hishiki, T; Kawamoto, S; Morishita, S; Okubo, K

    2000-01-01

    BodyMap is a human and mouse gene expression database that has been maintained since 1993. It is based on site-directed 3'-ESTs collected from non-biased cDNA libraries constructed at Osaka University and contains >270 000 sequences from 60 human and 38 mouse tissues. The site-directed nature of the sequence tags allows unequivocal grouping of tags representing the same transcript and provides abundance information for each transcript in different parts of the body. Our collection of ESTs was compared periodically with other public databases for cross referencing. The histological resolution of source tissues and unique cloning strategy that minimized cloning bias enabled BodyMap to support three unique mRNA based experiments in silico. First, the recurrence information for clones in each library provides a rough estimate of the mRNA composition of each source tissue. Second, a user can search the entire data set with nucleotide sequences or keywords to assess expression patterns of particular genes. Third, and most important, BodyMap allows a user to select genes that have a desired expression pattern in humans and mice. BodyMap is accessible through the WWW at http://bodymap.ims.u-tokyo.ac.jp PMID:10592203

  11. DNA Barcoding for Species Assignment: The Case of Mediterranean Marine Fishes

    PubMed Central

    Landi, Monica; Dimech, Mark; Arculeo, Marco; Biondo, Girolama; Martins, Rogelia; Carneiro, Miguel; Carvalho, Gary Robert; Brutto, Sabrina Lo; Costa, Filipe O.

    2014-01-01

    Background DNA barcoding enhances the prospects for species-level identifications globally using a standardized and authenticated DNA-based approach. Reference libraries comprising validated DNA barcodes (COI) constitute robust datasets for testing query sequences, providing considerable utility to identify marine fish and other organisms. Here we test the feasibility of using DNA barcoding to assign species to tissue samples from fish collected in the central Mediterranean Sea, a major contributor to the European marine ichthyofaunal diversity. Methodology/Principal Findings A dataset of 1278 DNA barcodes, representing 218 marine fish species, was used to test the utility of DNA barcodes to assign species from query sequences. We tested query sequences against 1) a reference library of ranked DNA barcodes from the neighbouring North East Atlantic, and 2) the public databases BOLD and GenBank. In the first case, a reference library comprising DNA barcodes with reliability grades for 146 fish species was used as diagnostic dataset to screen 486 query DNA sequences from fish specimens collected in the central basin of the Mediterranean Sea. Of all query sequences suitable for comparisons 98% were unambiguously confirmed through complete match with reference DNA barcodes. In the second case, it was possible to assign species to 83% (BOLD-IDS) and 72% (GenBank) of the sequences from the Mediterranean. Relatively high intraspecific genetic distances were found in 7 species (2.2%–18.74%), most of them of high commercial relevance, suggesting possible cryptic species. Conclusion/Significance We emphasize the discriminatory power of COI barcodes and their application to cases requiring species level resolution starting from query sequences. Results highlight the value of public reference libraries of reliability grade-annotated DNA barcodes, to identify species from different geographical origins. The ability to assign species with high precision from DNA samples of

  12. Database in Artificial Intelligence.

    ERIC Educational Resources Information Center

    Wilkinson, Julia

    1986-01-01

    Describes a specialist bibliographic database of literature in the field of artificial intelligence created by the Turing Institute (Glasgow, Scotland) using the BRS/Search information retrieval software. The subscription method for end-users--i.e., annual fee entitles user to unlimited access to database, document provision, and printed awareness…

  13. BioImaging Database

    SciTech Connect

    David Nix, Lisa Simirenko

    2006-10-25

    The BioImaging Database (BID) is a relational database developed to store the data and meta-data for the 3D gene expression in early Drosophila embryo development on a cellular level. The schema was written to be used with the MySQL DBMS but with minor modifications can be used on any SQL compliant relational DBMS.

  14. Biological Macromolecule Crystallization Database

    National Institute of Standards and Technology Data Gateway

    SRD 21 Biological Macromolecule Crystallization Database (Web, free access)   The Biological Macromolecule Crystallization Database and NASA Archive for Protein Crystal Growth Data (BMCD) contains the conditions reported for the crystallization of proteins and nucleic acids used in X-ray structure determinations and archives the results of microgravity macromolecule crystallization studies.

  15. Online Database Searching Workbook.

    ERIC Educational Resources Information Center

    Littlejohn, Alice C.; Parker, Joan M.

    Designed primarily for use by first-time searchers, this workbook provides an overview of online searching. Following a brief introduction which defines online searching, databases, and database producers, five steps in carrying out a successful search are described: (1) identifying the main concepts of the search statement; (2) selecting a…

  16. HIV Structural Database

    National Institute of Standards and Technology Data Gateway

    SRD 102 HIV Structural Database (Web, free access)   The HIV Protease Structural Database is an archive of experimentally determined 3-D structures of Human Immunodeficiency Virus 1 (HIV-1), Human Immunodeficiency Virus 2 (HIV-2) and Simian Immunodeficiency Virus (SIV) Proteases and their complexes with inhibitors or products of substrate cleavage.

  17. Atomic Spectra Database (ASD)

    National Institute of Standards and Technology Data Gateway

    SRD 78 NIST Atomic Spectra Database (ASD) (Web, free access)   This database provides access and search capability for NIST critically evaluated data on atomic energy levels, wavelengths, and transition probabilities that are reasonably up-to-date. The NIST Atomic Spectroscopy Data Center has carried out these critical compilations.

  18. First Look: TRADEMARKSCAN Database.

    ERIC Educational Resources Information Center

    Fernald, Anne Conway; Davidson, Alan B.

    1984-01-01

    Describes database produced by Thomson and Thomson and available on Dialog which contains over 700,000 records representing all active federal trademark registrations and applications for registrations filed in United States Patent and Trademark Office. A typical record, special features, database applications, learning to use TRADEMARKSCAN, and…

  19. Dictionary as Database.

    ERIC Educational Resources Information Center

    Painter, Derrick

    1996-01-01

    Discussion of dictionaries as databases focuses on the digitizing of the Oxford English Dictionary (OED) and the use of Standard Generalized Markup Language (SGML). Topics include the creation of a consortium to digitize the OED, document structure, relational databases, text forms, sequence, and discourse. (LRW)

  20. Structural Ceramics Database

    National Institute of Standards and Technology Data Gateway

    SRD 30 NIST Structural Ceramics Database (Web, free access)   The NIST Structural Ceramics Database (WebSCD) provides evaluated materials property data for a wide range of advanced ceramics known variously as structural ceramics, engineering ceramics, and fine ceramics.

  1. Build Your Own Database.

    ERIC Educational Resources Information Center

    Jacso, Peter; Lancaster, F. W.

    This book is intended to help librarians and others to produce databases of better value and quality, especially if they have had little previous experience in database construction. Drawing upon almost 40 years of experience in the field of information retrieval, this book emphasizes basic principles and approaches rather than in-depth and…

  2. Knowledge Discovery in Databases.

    ERIC Educational Resources Information Center

    Norton, M. Jay

    1999-01-01

    Knowledge discovery in databases (KDD) revolves around the investigation and creation of knowledge, processes, algorithms, and mechanisms for retrieving knowledge from data collections. The article is an introductory overview of KDD. The rationale and environment of its development and applications are discussed. Issues related to database design…

  3. Database Searching by Managers.

    ERIC Educational Resources Information Center

    Arnold, Stephen E.

    Managers and executives need the easy and quick access to business and management information that online databases can provide, but many have difficulty articulating their search needs to an intermediary. One possible solution would be to encourage managers and their immediate support staff members to search textual databases directly as they now…

  4. A Quality System Database

    NASA Technical Reports Server (NTRS)

    Snell, William H.; Turner, Anne M.; Gifford, Luther; Stites, William

    2010-01-01

    A quality system database (QSD), and software to administer the database, were developed to support recording of administrative nonconformance activities that involve requirements for documentation of corrective and/or preventive actions, which can include ISO 9000 internal quality audits and customer complaints.

  5. Assignment to database industry

    NASA Astrophysics Data System (ADS)

    Abe, Kohichiroh

    Various kinds of databases are considered an essential part of future large-scale systems. Information provision by databases alone is also expected to grow as the market matures. This paper discusses how these circumstances arose and how they are likely to develop.

  6. HMDB: the Human Metabolome Database

    PubMed Central

    Wishart, David S.; Tzur, Dan; Knox, Craig; Eisner, Roman; Guo, An Chi; Young, Nelson; Cheng, Dean; Jewell, Kevin; Arndt, David; Sawhney, Summit; Fung, Chris; Nikolai, Lisa; Lewis, Mike; Coutouly, Marie-Aude; Forsythe, Ian; Tang, Peter; Shrivastava, Savita; Jeroncic, Kevin; Stothard, Paul; Amegbey, Godwin; Block, David; Hau, David. D.; Wagner, James; Miniaci, Jessica; Clements, Melisa; Gebremedhin, Mulu; Guo, Natalie; Zhang, Ying; Duggan, Gavin E.; MacInnis, Glen D.; Weljie, Alim M.; Dowlatabadi, Reza; Bamforth, Fiona; Clive, Derrick; Greiner, Russ; Li, Liang; Marrie, Tom; Sykes, Brian D.; Vogel, Hans J.; Querengesser, Lori

    2007-01-01

    The Human Metabolome Database (HMDB) is currently the most complete and comprehensive curated collection of human metabolite and human metabolism data in the world. It contains records for more than 2180 endogenous metabolites with information gathered from thousands of books, journal articles and electronic databases. In addition to its comprehensive literature-derived data, the HMDB also contains an extensive collection of experimental metabolite concentration data compiled from hundreds of mass spectra (MS) and Nuclear Magnetic resonance (NMR) metabolomic analyses performed on urine, blood and cerebrospinal fluid samples. This is further supplemented with thousands of NMR and MS spectra collected on purified, reference metabolites. Each metabolite entry in the HMDB contains an average of 90 separate data fields including a comprehensive compound description, names and synonyms, structural information, physico-chemical data, reference NMR and MS spectra, biofluid concentrations, disease associations, pathway information, enzyme data, gene sequence data, SNP and mutation data as well as extensive links to images, references and other public databases. Extensive searching, relational querying and data browsing tools are also provided. The HMDB is designed to address the broad needs of biochemists, clinical chemists, physicians, medical geneticists, nutritionists and members of the metabolomics community. The HMDB is available at: PMID:17202168

  7. 16 CFR 1102.28 - Publication of reports of harm.

    Code of Federal Regulations, 2011 CFR

    2011-01-01

    ... REGULATIONS PUBLICLY AVAILABLE CONSUMER PRODUCT SAFETY INFORMATION DATABASE (Eff. Jan. 10, 2011) Procedural..., the Commission will publish reports of harm that meet the requirements for publication in the Database...(d) in the Database beyond the 10-business-day time frame set forth in paragraph (a) of this...

  8. Increased coverage of protein families with the blocks database servers.

    PubMed

    Henikoff, J G; Greene, E A; Pietrokovski, S; Henikoff, S

    2000-01-01

    The Blocks Database WWW (http://blocks.fhcrc.org ) and Email (blocks@blocks.fhcrc.org ) servers provide tools to search DNA and protein queries against the Blocks+ Database of multiple alignments, which represent conserved protein regions. Blocks+ nearly doubles the number of protein families included in the database by adding families from the Pfam-A, ProDom and Domo databases to those from PROSITE and PRINTS. Other new features include improved Block Searcher statistics, searching with NCBI's IMPALA program and 3D display of blocks on PDB structures.

  9. Cascadia Tsunami Deposit Database

    USGS Publications Warehouse

    Peters, Robert; Jaffe, Bruce; Gelfenbaum, Guy; Peterson, Curt

    2003-01-01

    The Cascadia Tsunami Deposit Database contains data on the location and sedimentological properties of tsunami deposits found along the Cascadia margin. Data have been compiled from 52 studies, documenting 59 sites from northern California to Vancouver Island, British Columbia that contain known or potential tsunami deposits. Bibliographical references are provided for all sites included in the database. Cascadia tsunami deposits are usually seen as anomalous sand layers in coastal marsh or lake sediments. The studies cited in the database use numerous criteria based on sedimentary characteristics to distinguish tsunami deposits from sand layers deposited by other processes, such as river flooding and storm surges. Several studies cited in the database contain evidence for more than one tsunami at a site. Data categories include age, thickness, layering, grainsize, and other sedimentological characteristics of Cascadia tsunami deposits. The database documents the variability observed in tsunami deposits found along the Cascadia margin.

  10. TCM Database@Taiwan: the world's largest traditional Chinese medicine database for drug screening in silico.

    PubMed

    Chen, Calvin Yu-Chian

    2011-01-06

    Rapidly advancing computational technologies have greatly sped up the development of computer-aided drug design (CADD). Recently, pharmaceutical companies have increasingly shifted their attention toward traditional Chinese medicine (TCM) for novel lead compounds. Despite the growing number of studies on TCM, there is no free 3D small molecular structure database of TCM available for virtual screening or molecular simulation. To address this shortcoming, we have constructed TCM Database@Taiwan (http://tcm.cmu.edu.tw/) based on information collected from Chinese medical texts and scientific publications. TCM Database@Taiwan is currently the world's largest non-commercial TCM database. This web-based database contains more than 20,000 pure compounds isolated from 453 TCM ingredients. Both cdx (2D) and Tripos mol2 (3D) formats of each pure compound in the database are available for download and virtual screening. The TCM database includes both simple and advanced web-based query options that can specify search clauses, such as molecular properties, substructures, TCM ingredients, and TCM classification, based on intended drug actions. The TCM database can be easily accessed by all researchers conducting CADD. Over the last eight years, numerous volunteers have devoted their time to analyzing TCM ingredients from Chinese medical texts as well as to constructing structure files for each isolated compound. We believe that TCM Database@Taiwan will be a milestone on the path towards modernizing traditional Chinese medicine.

  11. SENTRA, a database of signal transduction proteins.

    SciTech Connect

    D'Souza, M.; Romine, M. F.; Maltsev, N.; Mathematics and Computer Science; PNNL

    2000-01-01

    SENTRA, available via URL http://wit.mcs.anl.gov/WIT2/Sentra/, is a database of proteins associated with microbial signal transduction. The database currently includes the classical two-component signal transduction pathway proteins and methyl-accepting chemotaxis proteins, but will be expanded to also include other classes of signal transduction systems that are modulated by phosphorylation or methylation reactions. Although the majority of database entries are from prokaryotic systems, eukaryotic proteins with bacterial-like signal transduction domains are also included. Currently SENTRA contains signal transduction proteins in 34 complete and almost completely sequenced prokaryotic genomes, as well as sequences from 243 organisms available in public databases (SWISS-PROT and EMBL). The analysis was carried out within the framework of the WIT2 system, which is designed and implemented to support genetic sequence analysis and comparative analysis of sequenced genomes.

  12. The HITRAN 2008 Molecular Spectroscopic Database

    NASA Technical Reports Server (NTRS)

    Rothman, Laurence S.; Gordon, Iouli E.; Barbe, Alain; Benner, D. Chris; Bernath, Peter F.; Birk, Manfred; Boudon, V.; Brown, Linda R.; Campargue, Alain; Champion, J.-P.; Chance, Kelly V.; Coudert, L. H.; Sung, K.; Toth, R. A.

    2009-01-01

    This paper describes the status of the 2008 edition of the HITRAN molecular spectroscopic database. The new edition is the first official public release since the 2004 edition, although a number of crucial updates had been made available online since 2004. The HITRAN compilation consists of several components that serve as input for radiative-transfer calculation codes: individual line parameters for the microwave through visible spectra of molecules in the gas phase; absorption cross-sections for molecules having dense spectral features, i.e., spectra in which the individual lines are not resolved; individual line parameters and absorption cross sections for bands in the ultra-violet; refractive indices of aerosols, tables and files of general properties associated with the database; and database management software. The line-by-line portion of the database contains spectroscopic parameters for forty-two molecules including many of their isotopologues.

  13. PubChem Substance and Compound databases.

    PubMed

    Kim, Sunghwan; Thiessen, Paul A; Bolton, Evan E; Chen, Jie; Fu, Gang; Gindulyte, Asta; Han, Lianyi; He, Jane; He, Siqian; Shoemaker, Benjamin A; Wang, Jiyao; Yu, Bo; Zhang, Jian; Bryant, Stephen H

    2016-01-01

    PubChem (https://pubchem.ncbi.nlm.nih.gov) is a public repository for information on chemical substances and their biological activities, launched in 2004 as a component of the Molecular Libraries Roadmap Initiatives of the US National Institutes of Health (NIH). For the past 11 years, PubChem has grown to a sizable system, serving as a chemical information resource for the scientific research community. PubChem consists of three inter-linked databases, Substance, Compound and BioAssay. The Substance database contains chemical information deposited by individual data contributors to PubChem, and the Compound database stores unique chemical structures extracted from the Substance database. Biological activity data of chemical substances tested in assay experiments are contained in the BioAssay database. This paper provides an overview of the PubChem Substance and Compound databases, including data sources and contents, data organization, data submission using PubChem Upload, chemical structure standardization, web-based interfaces for textual and non-textual searches, and programmatic access. It also gives a brief description of PubChem3D, a resource derived from theoretical three-dimensional structures of compounds in PubChem, as well as PubChemRDF, Resource Description Framework (RDF)-formatted PubChem data for data sharing, analysis and integration with information contained in other databases. PMID:26400175

  14. PubChem Substance and Compound databases

    PubMed Central

    Kim, Sunghwan; Thiessen, Paul A.; Bolton, Evan E.; Chen, Jie; Fu, Gang; Gindulyte, Asta; Han, Lianyi; He, Jane; He, Siqian; Shoemaker, Benjamin A.; Wang, Jiyao; Yu, Bo; Zhang, Jian; Bryant, Stephen H.

    2016-01-01

    PubChem (https://pubchem.ncbi.nlm.nih.gov) is a public repository for information on chemical substances and their biological activities, launched in 2004 as a component of the Molecular Libraries Roadmap Initiatives of the US National Institutes of Health (NIH). For the past 11 years, PubChem has grown to a sizable system, serving as a chemical information resource for the scientific research community. PubChem consists of three inter-linked databases, Substance, Compound and BioAssay. The Substance database contains chemical information deposited by individual data contributors to PubChem, and the Compound database stores unique chemical structures extracted from the Substance database. Biological activity data of chemical substances tested in assay experiments are contained in the BioAssay database. This paper provides an overview of the PubChem Substance and Compound databases, including data sources and contents, data organization, data submission using PubChem Upload, chemical structure standardization, web-based interfaces for textual and non-textual searches, and programmatic access. It also gives a brief description of PubChem3D, a resource derived from theoretical three-dimensional structures of compounds in PubChem, as well as PubChemRDF, Resource Description Framework (RDF)-formatted PubChem data for data sharing, analysis and integration with information contained in other databases. PMID:26400175

  15. DENdb: database of integrated human enhancers.

    PubMed

    Ashoor, Haitham; Kleftogiannis, Dimitrios; Radovanovic, Aleksandar; Bajic, Vladimir B

    2015-01-01

    Enhancers are cis-acting DNA regulatory regions that play a key role in distal control of transcriptional activities. Identification of enhancers, coupled with a comprehensive functional analysis of their properties, could improve our understanding of complex gene transcription mechanisms and gene regulation processes in general. We developed DENdb, a centralized on-line repository of predicted enhancers derived from multiple human cell-lines. DENdb integrates enhancers predicted by five different methods generating an enriched catalogue of putative enhancers for each of the analysed cell-lines. DENdb provides information about the overlap of enhancers with DNase I hypersensitive regions, ChIP-seq regions of a number of transcription factors and transcription factor binding motifs, means to explore enhancer interactions with DNA using several chromatin interaction assays and enhancer neighbouring genes. DENdb is designed as a relational database that facilitates fast and efficient searching, browsing and visualization of information. Database URL: http://www.cbrc.kaust.edu.sa/dendb/. PMID:26342387

  16. A Database System for Course Administration.

    ERIC Educational Resources Information Center

    Benbasat, Izak; And Others

    1982-01-01

    Describes a computer-assisted testing system which produces multiple-choice examinations for a college course in business administration. The system uses SPIRES (Stanford Public Information REtrieval System) to manage a database of questions and related data, mark-sense cards for machine grading tests, and ACL (6) (Audit Command Language) to…

  17. THE DRINKING WATER TREATABILITY DATABASE (Slides)

    EPA Science Inventory

    The Drinking Water Treatability Database (TDB) assembles referenced data on the control of contaminants in drinking water, housed on an interactive, publicly-available, USEPA web site (www.epa.gov/tdb). The TDB is of use to drinking water utilities, treatment process design engin...

  18. THE DRINKING WATER TREATABILITY DATABASE (Conference Paper)

    EPA Science Inventory

    The Drinking Water Treatability Database (TDB) assembles referenced data on the control of contaminants in drinking water, housed on an interactive, publicly-available, USEPA web site (www.epa.gov/tdb). The TDB is of use to drinking water utilities, treatment process design engin...

  19. Plant databases and data analysis tools

    Technology Transfer Automated Retrieval System (TEKTRAN)

    It is anticipated that the coming years will see the generation of large datasets including diagnostic markers in several plant species with emphasis on crop plants. To use these datasets effectively in any plant breeding program, it is essential to have the information available via public database...

  20. The RECONS 25 Parsec Database

    NASA Astrophysics Data System (ADS)

    Henry, Todd J.; Jao, Wei-Chun; Pewett, Tiffany; Riedel, Adric R.; Silverstein, Michele L.; Slatten, Kenneth J.; Winters, Jennifer G.; Recons Team

    2015-01-01

    The REsearch Consortium On Nearby Stars (RECONS, www.recons.org) Team has been mapping the solar neighborhood since 1994. Nearby stars provide the fundamental framework upon which all of stellar astronomy is based, both for individual stars and stellar populations. The nearest stars are also the primary targets for extrasolar planet searches, and will undoubtedly play key roles in understanding the prevalence and structure of solar systems, and ultimately, in our search for life elsewhere. We have built the RECONS 25 Parsec Database to encourage and enable exploration of the Sun's nearest neighbors. The Database, slated for public release in 2015, contains 3088 stars, brown dwarfs, and exoplanets in 2184 systems as of October 1, 2014. All of these systems have accurate trigonometric parallaxes in the refereed literature placing them closer than 25.0 parsecs, i.e., parallaxes greater than 40 mas with errors less than 10 mas. Carefully vetted astrometric, photometric, and spectroscopic data are incorporated into the Database from reliable sources, including significant original data collected by members of the RECONS Team. Current exploration of the solar neighborhood by RECONS, enabled by the Database, focuses on the ubiquitous red dwarfs, including: assessing the stellar companion population of ~1200 red dwarfs (Winters), investigating the astrophysical causes that spread red dwarfs of similar temperatures by a factor of 16 in luminosity (Pewett), and canvassing ~3000 red dwarfs for excess emission due to unseen companions and dust (Silverstein). In addition, a decade-long astrometric survey of ~500 red dwarfs in the southern sky has begun, in an effort to understand the stellar, brown dwarf, and planetary companion populations for the stars that make up at least 75% of all stars in the Universe. This effort has been supported by the NSF through grants AST-0908402, AST-1109445, and AST-1412026, and via observations made possible by the SMARTS Consortium.
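
The inclusion rule quoted in the abstract above (closer than 25.0 parsecs, i.e., parallax greater than 40 mas with error less than 10 mas) follows from the standard relation between distance in parsecs and parallax in arcseconds, d = 1/p. The following is a minimal sketch of that arithmetic, not the RECONS team's code; the function names and sample values are hypothetical and chosen only for illustration.

```python
def parallax_to_distance_pc(parallax_mas: float) -> float:
    """Convert a trigonometric parallax in milliarcseconds to a distance in parsecs.

    Uses d [pc] = 1000 / p [mas], which is d = 1 / p with p in arcseconds.
    """
    if parallax_mas <= 0:
        raise ValueError("parallax must be positive")
    return 1000.0 / parallax_mas


def meets_25_pc_criteria(parallax_mas: float, error_mas: float) -> bool:
    """Apply the two thresholds stated in the abstract (hypothetical helper).

    Parallax greater than 40 mas (closer than 25 pc) and error less than 10 mas.
    """
    return parallax_mas > 40.0 and error_mas < 10.0


if __name__ == "__main__":
    # Illustrative values only, not taken from the Database.
    for p, e in [(50.0, 2.0), (40.0, 1.0), (100.0, 12.0)]:
        print(p, e, parallax_to_distance_pc(p), meets_25_pc_criteria(p, e))
```

A parallax of exactly 40 mas corresponds to 25.0 pc and so falls on the boundary; the sketch treats it as excluded because the abstract says "greater than 40 mas".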

  1. Central Asia Active Fault Database

    NASA Astrophysics Data System (ADS)

    Mohadjer, Solmaz; Ehlers, Todd A.; Kakar, Najibullah

    2014-05-01

    The ongoing collision of the Indian subcontinent with Asia controls active tectonics and seismicity in Central Asia. This motion is accommodated by faults that have historically caused devastating earthquakes and continue to pose serious threats to the population at risk. Despite international and regional efforts to assess seismic hazards in Central Asia, little attention has been given to development of a comprehensive database for active faults in the region. To address this issue and to better understand the distribution and level of seismic hazard in Central Asia, we are developing a publicly available database for active faults of Central Asia (including but not limited to Afghanistan, Tajikistan, Kyrgyzstan, northern Pakistan and western China) using ArcGIS. The database is designed to allow users to store, map and query important fault parameters such as fault location, displacement history, rate of movement, and other data relevant to seismic hazard studies including fault trench locations, geochronology constraints, and seismic studies. Data sources integrated into the database include previously published maps and scientific investigations as well as strain rate measurements and historic and recent seismicity. In addition, high resolution Quickbird, Spot, and Aster imagery are used for selected features to locate and measure offset of landforms associated with Quaternary faulting. These features are individually digitized and linked to attribute tables that provide a description for each feature. Preliminary observations include inconsistent and sometimes inaccurate information for faults documented in different studies. For example, the Darvaz-Karakul fault, which roughly defines the western margin of the Pamir, has been mapped with differences in location of up to 12 kilometers. The sense of motion for this fault ranges from unknown to thrust and strike-slip in three different studies despite documented left-lateral displacements of Holocene and late

  2. FORMIDABEL: The Belgian Ants Database

    PubMed Central

    Brosens, Dimitri; Vankerkhoven, François; Ignace, David; Wegnez, Philippe; Noé, Nicolas; Heughebaert, André; Bortels, Jeannine; Dekoninck, Wouter

    2013-01-01

    FORMIDABEL is a database of Belgian Ants containing more than 27,000 occurrence records. These records originate from collections, field sampling and literature. The database gives information on 76 native and 9 introduced ant species found in Belgium. The collection records originated mainly from the ants collection in Royal Belgian Institute of Natural Sciences (RBINS), the ‘Gaspar’ Ants collection in Gembloux and the zoological collection of the University of Liège (ULG). The oldest occurrences date back to May 1866, the most recent refer to August 2012. FORMIDABEL is a work in progress and the database is updated twice a year. The latest version of the dataset is publicly and freely accessible through this URL: http://ipt.biodiversity.be/resource.do?r=formidabel. The dataset is also retrievable via the GBIF data portal through this link: http://data.gbif.org/datasets/resource/14697 A dedicated geo-portal, developed by the Belgian Biodiversity Platform, is accessible at: http://www.formicidae-atlas.be Purpose: FORMIDABEL is a joint cooperation of the Flemish ants working group “Polyergus” (http://formicidae.be) and the Wallonian ants working group “FourmisWalBru” (http://fourmiswalbru.be). The original database was created in 2002 in the context of the preliminary red data book of Flemish Ants (Dekoninck et al. 2003). Later, in 2005, data from the southern part of Belgium (Wallonia and Brussels) were added. In 2012 this dataset was again updated for the creation of the first Belgian Ants Atlas (Figure 1) (Dekoninck et al. 2012). The main purpose of this atlas was to generate maps for all outdoor-living ant species in Belgium using an overlay of the standard Belgian ecoregions. By using this overlay for most species, we can discern a clear and often restricted distribution pattern in Belgium, mainly based on vegetation and soil types. PMID:23794918

  3. Hazard Analysis Database Report

    SciTech Connect

    GRAMS, W.H.

    2000-12-28

    The Hazard Analysis Database was developed in conjunction with the hazard analysis activities conducted in accordance with DOE-STD-3009-94, Preparation Guide for U.S. Department of Energy Nonreactor Nuclear Facility Safety Analysis Reports, for HNF-SD-WM-SAR-067, Tank Farms Final Safety Analysis Report (FSAR). The FSAR is part of the approved Authorization Basis (AB) for the River Protection Project (RPP). This document describes, identifies, and defines the contents and structure of the Tank Farms FSAR Hazard Analysis Database and documents the configuration control changes made to the database. The Hazard Analysis Database contains the collection of information generated during the initial hazard evaluations and the subsequent hazard and accident analysis activities. The Hazard Analysis Database supports the preparation of Chapters 3, 4, and 5 of the Tank Farms FSAR and the Unreviewed Safety Question (USQ) process and consists of two major, interrelated data sets: (1) Hazard Analysis Database: Data from the results of the hazard evaluations, and (2) Hazard Topography Database: Data from the system familiarization and hazard identification.

  4. Hazard Analysis Database Report

    SciTech Connect

    GAULT, G.W.

    1999-10-13

    The Hazard Analysis Database was developed in conjunction with the hazard analysis activities conducted in accordance with DOE-STD-3009-94, Preparation Guide for US Department of Energy Nonreactor Nuclear Facility Safety Analysis Reports, for the Tank Waste Remediation System (TWRS) Final Safety Analysis Report (FSAR). The FSAR is part of the approved TWRS Authorization Basis (AB). This document describes, identifies, and defines the contents and structure of the TWRS FSAR Hazard Analysis Database and documents the configuration control changes made to the database. The TWRS Hazard Analysis Database contains the collection of information generated during the initial hazard evaluations and the subsequent hazard and accident analysis activities. The database supports the preparation of Chapters 3, 4, and 5 of the TWRS FSAR and the USQ process and consists of two major, interrelated data sets: (1) Hazard Evaluation Database--Data from the results of the hazard evaluations; and (2) Hazard Topography Database--Data from the system familiarization and hazard identification.

  5. Database for propagation models

    NASA Technical Reports Server (NTRS)

    Kantak, Anil V.

    1991-01-01

    A propagation researcher or a systems engineer who intends to use the results of a propagation experiment is generally faced with various database tasks such as the selection of the computer software, the hardware, and the writing of the programs to pass the data through the models of interest. This task is repeated every time a new experiment is conducted or the same experiment is carried out at a different location generating different data. Thus the users of this data have to spend a considerable portion of their time learning how to implement the computer hardware and the software towards the desired end. This situation may be facilitated considerably if an easily accessible propagation database is created that has all the accepted (standardized) propagation phenomena models approved by the propagation research community. Also, the handling of data will become easier for the user. Such a database construction can only stimulate the growth of the propagation research if it is available to all the researchers, so that the results of the experiment conducted by one researcher can be examined independently by another, without different hardware and software being used. The database may be made flexible so that the researchers need not be confined only to the contents of the database. Another way in which the database may help the researchers is by the fact that they will not have to document the software and hardware tools used in their research since the propagation research community will know the database already. The following sections show a possible database construction, as well as properties of the database for the propagation research.

  6. Pfam: the protein families database

    PubMed Central

    Finn, Robert D.; Bateman, Alex; Clements, Jody; Coggill, Penelope; Eberhardt, Ruth Y.; Eddy, Sean R.; Heger, Andreas; Hetherington, Kirstie; Holm, Liisa; Mistry, Jaina; Sonnhammer, Erik L. L.; Tate, John; Punta, Marco

    2014-01-01

    Pfam, available via servers in the UK (http://pfam.sanger.ac.uk/) and the USA (http://pfam.janelia.org/), is a widely used database of protein families, containing 14 831 manually curated entries in the current release, version 27.0. Since the last update article 2 years ago, we have generated 1182 new families and maintained sequence coverage of the UniProt Knowledgebase (UniProtKB) at nearly 80%, despite a 50% increase in the size of the underlying sequence database. Since our 2012 article describing Pfam, we have also undertaken a comprehensive review of the features that are provided by Pfam over and above the basic family data. For each feature, we determined the relevance, computational burden, usage statistics and the functionality of the feature in a website context. As a consequence of this review, we have removed some features, enhanced others and developed new ones to meet the changing demands of computational biology. Here, we describe the changes to Pfam content. Notably, we now provide family alignments based on four different representative proteome sequence data sets and a new interactive DNA search interface. We also discuss the mapping between Pfam and known 3D structures. PMID:24288371

  7. International Comparisons Database

    National Institute of Standards and Technology Data Gateway

    International Comparisons Database (Web, free access)   The International Comparisons Database (ICDB) serves the U.S. and the Inter-American System of Metrology (SIM) with information based on Appendices B (International Comparisons), C (Calibration and Measurement Capabilities) and D (List of Participating Countries) of the Comité International des Poids et Mesures (CIPM) Mutual Recognition Arrangement (MRA). The official source of the data is The BIPM key comparison database. The ICDB provides access to results of comparisons of measurements and standards organized by the consultative committees of the CIPM and the Regional Metrology Organizations.

  8. Phase Equilibria Diagrams Database

    National Institute of Standards and Technology Data Gateway

    SRD 31 NIST/ACerS Phase Equilibria Diagrams Database (PC database for purchase)   The Phase Equilibria Diagrams Database contains commentaries and more than 21,000 diagrams for non-organic systems, including those published in all 21 hard-copy volumes produced as part of the ACerS-NIST Phase Equilibria Diagrams Program (formerly titled Phase Diagrams for Ceramists): Volumes I through XIV (blue books); Annuals 91, 92, 93; High Tc Superconductors I & II; Zirconium & Zirconia Systems; and Electronic Ceramics I. Materials covered include oxides as well as non-oxide systems such as chalcogenides and pnictides, phosphates, salt systems, and mixed systems of these classes.

  9. JICST Factual Database

    NASA Astrophysics Data System (ADS)

    Suzuki, Kazuaki; Shimura, Kazuki; Monma, Yoshio; Sakamoto, Masao; Morishita, Hiroshi; Kanazawa, Kenji

    The Japan Information Center of Science and Technology (JICST) started on-line service of the JICST/NRIM Materials Strength Database for Engineering Steels and Alloys (JICST ME) in March 1990. The database was developed through joint research between JICST and the National Research Institute for Metals (NRIM). It provides material strength data (creep, fatigue, etc.) for engineering steels and alloys. Users can search and display data on-line, analyze the retrieved data statistically, and plot the results on a graphic display. The database system and the data in JICST ME are described.

  10. Hybrid Terrain Database

    NASA Technical Reports Server (NTRS)

    Arthur, Trey

    2006-01-01

    A prototype hybrid terrain database is being developed in conjunction with other databases and with hardware and software that constitute subsystems of aerospace cockpit display systems (known in the art as synthetic vision systems) that generate images to increase pilots' situation awareness and eliminate poor visibility as a cause of aviation accidents. The basic idea is to provide a clear view of the world around an aircraft by displaying computer-generated imagery derived from an onboard database of terrain, obstacle, and airport information.

  11. HAPLOFIND: a new method for high-throughput mtDNA haplogroup assignment.

    PubMed

    Vianello, Dario; Sevini, Federica; Castellani, Gastone; Lomartire, Laura; Capri, Miriam; Franceschi, Claudio

    2013-09-01

    Deep sequencing technologies are completely revolutionizing the approach to DNA analysis. Mitochondrial DNA (mtDNA) studies have entered the "postgenomic era": the burst in sequenced samples observed in nuclear genomics is also expected for mitochondria, a trend that can already be detected by checking the submission rate of complete mtDNA sequences to databases. Tools for the analysis of these data are available, but they fall short in throughput or ease of use. We present here a new pipeline based on previous algorithms, inherited from the "nuclear genomic toolbox," combined with a newly developed algorithm capable of efficiently and easily classifying new mtDNA sequences according to PhyloTree nomenclature. Detected mutations are also annotated using data collected from publicly available databases. By analyzing all freely available sequences with known haplogroup obtained from GenBank, we were able to produce a PhyloTree-based weighted tree that takes into account the conservation of each haplogroup's pattern. The combination of a highly efficient aligner with our algorithm and massive use of asynchronous parallel processing allowed us to build a high-throughput pipeline for the analysis of mtDNA sequences that can be quickly updated to follow the ever-changing nomenclature. HaploFind is freely accessible at the following Web address: https://haplofind.unibo.it.
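
    The abstract does not give the details of the weighted-tree classification, so the following is only a minimal sketch of the general idea, under stated assumptions: each haplogroup is described by a set of defining mutations, each mutation carries a conservation weight, and a sample is assigned to the haplogroup whose weighted pattern it matches best. The haplogroup labels (`HG1`, `HG2`), mutation names (`m1`, `m2`, `m3`), weights, and function names are all hypothetical and are not taken from HaploFind or PhyloTree.

```python
def weighted_match(sample_mutations, defining):
    """Weighted fraction of a haplogroup's defining mutations found in the sample.

    `defining` maps mutation name -> conservation weight (hypothetical values).
    """
    total = sum(defining.values())
    found = sum(w for m, w in defining.items() if m in sample_mutations)
    return found / total if total else 0.0


def assign_haplogroup(sample_mutations, tree):
    """Return the haplogroup label whose weighted pattern best matches the sample."""
    sample = set(sample_mutations)
    return max(tree, key=lambda hg: weighted_match(sample, tree[hg]))


# Toy tree with made-up labels and weights, for illustration only.
toy_tree = {
    "HG1": {"m1": 1.0, "m2": 0.5},
    "HG2": {"m1": 1.0, "m3": 0.9},
}
best = assign_haplogroup(["m1", "m3"], toy_tree)
```

    A real classifier would also have to walk the tree hierarchy and penalize unexpected mutations; the flat scoring above leaves both out for brevity.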

  12. ARTI Refrigerant Database

    SciTech Connect

    Calm, J.M.

    1994-05-27

    The Refrigerant Database consolidates and facilitates access to information to assist industry in developing equipment using alternative refrigerants. The underlying purpose is to accelerate phase out of chemical compounds of environmental concern.

  13. Nuclear Science References Database

    SciTech Connect

    Pritychenko, B.; Běták, E.; Singh, B.; Totans, J.

    2014-06-15

    The Nuclear Science References (NSR) database, together with its associated Web interface, is the world's only comprehensive source of easily accessible low- and intermediate-energy nuclear physics bibliographic information for more than 210,000 articles since the beginning of nuclear science. The weekly-updated NSR database provides essential support for nuclear data evaluation, compilation and research activities. The principles of the database and Web application development and maintenance are described. Examples of nuclear structure, reaction and decay applications are specifically included. The complete NSR database is freely available at the websites of the National Nuclear Data Center (http://www.nndc.bnl.gov/nsr) and the International Atomic Energy Agency (http://www-nds.iaea.org/nsr).

  14. Chemical Kinetics Database

    National Institute of Standards and Technology Data Gateway

    SRD 17 NIST Chemical Kinetics Database (Web, free access)   The NIST Chemical Kinetics Database includes essentially all reported kinetics results for thermal gas-phase chemical reactions. The database is designed to be searched for kinetics data based on the specific reactants involved, for reactions resulting in specified products, for all the reactions of a particular species, or for various combinations of these. In addition, the bibliography can be searched by author name or combination of names. The database contains in excess of 38,000 separate reaction records for over 11,700 distinct reactant pairs. These data have been abstracted from over 12,000 papers with literature coverage through early 2000.

  15. TREATABILITY DATABASE DESCRIPTION

    EPA Science Inventory

    The Drinking Water Treatability Database (TDB) presents referenced information on the control of contaminants in drinking water. It allows drinking water utilities, first responders to spills or emergencies, treatment process designers, research organizations, academics, regulato...

  16. THE CTEPP DATABASE

    EPA Science Inventory

    The CTEPP (Children's Total Exposure to Persistent Pesticides and Other Persistent Organic Pollutants) database contains a wealth of data on children's aggregate exposures to pollutants in their everyday surroundings. Chemical analysis data for the environmental media and ques...

  17. Requirements Management Database

    2009-08-13

    This application is a simplified and customized version of the RBA and CTS databases. It captures federal, site, and facility requirements and links them to the actions that must be performed to maintain compliance with contractual and other requirements.

  18. Steam Properties Database

    National Institute of Standards and Technology Data Gateway

    SRD 10 NIST/ASME Steam Properties Database (PC database for purchase)   Based upon the International Association for the Properties of Water and Steam (IAPWS) 1995 formulation for the thermodynamic properties of water and the most recent IAPWS formulations for transport and other properties, this updated version provides water properties over a wide range of conditions according to the accepted international standards.

  19. Database computing in HEP

    SciTech Connect

    Day, C.T.; Loken, S.; MacFarlane, J.F.; May, E.; Lifka, D.; Lusk, E.; Price, L.E.; Baden, A.; Grossman, R.; Qin, X.; Cormell, L.; Leibold, P.; Liu, D

    1992-01-01

    The major SSC experiments are expected to produce up to 1 Petabyte of data per year each. Once the primary reconstruction is completed by farms of inexpensive processors, I/O becomes a major factor in further analysis of the data. We believe that the application of database techniques can significantly reduce the I/O performed in these analyses. We present examples of such I/O reductions in prototypes based on relational and object-oriented databases of CDF data samples.

  20. Database computing in HEP

    NASA Technical Reports Server (NTRS)

    Day, C. T.; Loken, S.; Macfarlane, J. F.; May, E.; Lifka, D.; Lusk, E.; Price, L. E.; Baden, A.; Grossman, R.; Qin, X.

    1992-01-01

    The major SSC experiments are expected to produce up to 1 Petabyte of data per year each. Once the primary reconstruction is completed by farms of inexpensive processors, I/O becomes a major factor in further analysis of the data. We believe that the application of database techniques can significantly reduce the I/O performed in these analyses. We present examples of such I/O reductions in prototypes based on relational and object-oriented databases of CDF data samples.
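
    The abstract does not say which database techniques were used. As one illustration of how a database layout can cut I/O, the sketch below compares the bytes read by an analysis that needs only a few fields per event when every field must be scanned versus when only the needed fields are read (column projection). The event count, field counts, and 8-byte field size are assumptions chosen for the arithmetic, not figures from the paper.

```python
def bytes_scanned(n_events, fields_read, field_size=8):
    """Bytes read when `fields_read` fixed-size fields are fetched per event."""
    return n_events * fields_read * field_size


# Hypothetical sample: one million events, 100 fields each, 8 bytes per field.
N_EVENTS = 1_000_000
full_scan = bytes_scanned(N_EVENTS, fields_read=100)  # read every field
projected = bytes_scanned(N_EVENTS, fields_read=2)    # read only 2 needed fields

reduction_factor = full_scan // projected
```

    Under these assumed numbers the projected read touches 16 MB instead of 800 MB; indexing on selection criteria would reduce the volume further by skipping events entirely.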

  1. Querying genomic databases

    SciTech Connect

    Baehr, A.; Hagstrom, R.; Joerg, D.; Overbeek, R.

    1991-09-01

    A natural-language interface has been developed that retrieves genomic information by using a simple subset of English. The interface spares the biologist from the task of learning database-specific query languages and computer programming. Currently, the interface deals with the E. coli genome. It can, however, be readily extended and shows promise as a means of easy access to other sequenced genomic databases as well.

  2. The world bacterial biogeography and biodiversity through databases: a case study of NCBI Nucleotide Database and GBIF Database.

    PubMed

    Selama, Okba; James, Phillip; Nateche, Farida; Wellington, Elizabeth M H; Hacène, Hocine

    2013-01-01

    Databases are an essential tool and resource within the field of bioinformatics. The primary aim of this study was to generate an overview of global bacterial biodiversity and biogeography using available data from the two largest public online databases, NCBI Nucleotide and GBIF. The secondary aim was to highlight the contribution each geographic area has to each database. The basis for data analysis of this study was the metadata provided by both databases, mainly, the taxonomy and the geographical area origin of isolation of the microorganism (record). These were directly obtained from GBIF through the online interface, while E-utilities and Python were used in combination with a programmatic web service access to obtain data from the NCBI Nucleotide Database. Results indicate that the American continent, and more specifically the USA, is the top contributor, while Africa and Antarctica are less well represented. This highlights the imbalance of exploration within these areas rather than any reduction in biodiversity. This study describes a novel approach to generating global scale patterns of bacterial biodiversity and biogeography and indicates that the Proteobacteria are the most abundant and widely distributed phylum within both databases.
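
    The abstract mentions using E-utilities from Python but gives no code. The following is a minimal sketch of what such access might look like: it builds an ESearch request URL for the NCBI Nucleotide database and tallies records by country of origin. The helper names, the example search term, and the "Country: region" metadata format are assumptions for illustration; the authors' actual scripts are not described in the source.

```python
from collections import Counter
from urllib.parse import urlencode

EUTILS_BASE = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"


def build_esearch_url(term, db="nucleotide", retmax=20):
    """Return an ESearch URL that lists record IDs matching `term`."""
    params = {"db": db, "term": term, "retmax": retmax, "retmode": "json"}
    return f"{EUTILS_BASE}/esearch.fcgi?{urlencode(params)}"


def tally_countries(country_fields):
    """Count records per country, assuming 'Country: region' strings."""
    counts = Counter()
    for field in country_fields:
        country = field.split(":", 1)[0].strip()
        if country:  # skip records with no geographic metadata
            counts[country] += 1
    return counts


url = build_esearch_url("Proteobacteria[Organism]", retmax=5)
counts = tally_countries(["USA: Ohio", "USA", "Algeria: Sahara", ""])
```

    The URL can be fetched with any HTTP client; the sketch stops short of the network call so that it runs offline.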

  3. Drinking Water Database

    NASA Technical Reports Server (NTRS)

    Murray, ShaTerea R.

    2004-01-01

    This summer I had the opportunity to work in the Environmental Management Office (EMO) under the Chemical Sampling and Analysis Team, or CS&AT. This team's mission is to support Glenn Research Center (GRC) and EMO by providing chemical sampling and analysis services and expert consulting. Services include sampling and chemical analysis of water, soil, fuels, oils, paint, insulation materials, etc. One of this team's major projects is the Drinking Water Project, which is carried out on Glenn's water coolers and ten percent of its sinks every two years. For the past two summers an intern had been putting together a database for this team to record the tests they had performed. She had successfully created a database but hadn't worked out all the quirks. So this summer William Wilder (an intern from Cleveland State University) and I worked together to perfect her database. We began by finding out exactly what every member of the team thought about the database and what, if anything, they would change. After collecting this data we both had to take some courses in Microsoft Access in order to fix the problems. Next we began looking at exactly how the database worked from the outside inward. Then we began trying to change the database, but we quickly found out that this would be virtually impossible.

  4. The Halophile protein database.

    PubMed

    Sharma, Naveen; Farooqi, Mohammad Samir; Chaturvedi, Krishna Kumar; Lal, Shashi Bhushan; Grover, Monendra; Rai, Anil; Pandey, Pankaj

    2014-01-01

    Halophilic archaea/bacteria adapt to different salt concentrations, namely extreme, moderate and low. These types of adaptation may occur as a result of modification of protein structure and other changes in different cell organelles. Thus proteins may play an important role in the adaptation of halophilic archaea/bacteria to saline conditions. The Halophile protein database (HProtDB) is a systematic attempt to document the biochemical and biophysical properties of proteins from halophilic archaea/bacteria which may be involved in adaptation of these organisms to saline conditions. In this database, various physicochemical properties such as molecular weight, theoretical pI, amino acid composition, atomic composition, estimated half-life, instability index, aliphatic index and grand average of hydropathicity (Gravy) have been listed. These physicochemical properties play an important role in identifying the protein structure, bonding pattern and function of the specific proteins. The database is a comprehensive, manually curated, non-redundant catalogue of proteins. It currently contains the properties of 59 897 proteins extracted from 21 different strains of halophilic archaea/bacteria. Database URL: http://webapp.cabgrid.res.in/protein/

  5. Crude Oil Analysis Database

    DOE Data Explorer

    Shay, Johanna Y.

    The composition and physical properties of crude oil vary widely from one reservoir to another within an oil field, as well as from one field or region to another. Although all oils consist of hydrocarbons and their derivatives, the proportions of various types of compounds differ greatly. This makes some oils more suitable than others for specific refining processes and uses. To take advantage of this diversity, one needs access to information in a large database of crude oil analyses. The Crude Oil Analysis Database (COADB) currently satisfies this need by offering 9,056 crude oil analyses. Of these, 8,500 are United States domestic oils. The database contains results of analysis of the general properties and chemical composition, as well as the field, formation, and geographic location of the crude oil sample. [Taken from the Introduction to COAMDATA_DESC.pdf, part of the zipped software and database file at http://www.netl.doe.gov/technologies/oil-gas/Software/database.html] Save the zipped file to your PC. When opened, it will contain PDF documents and a large Excel spreadsheet. It will also contain the database in Microsoft Access 2002.

  6. Open systems and databases

    SciTech Connect

    Martire, G.S.; Nuttall, D.J.H.

    1993-05-01

    This paper is part of a series of papers invited by the IEEE POWER CONTROL CENTER WORKING GROUP concerning the changing designs of modern control centers. Papers invited by the Working Group discuss the following issues: "Benefits of Openness, Criteria for Evaluating Open EMS Systems, Hardware Design, Configuration Management, Security, Project Management, Databases, SCADA, Inter- and Intra-System Communications and Man-Machine Interfaces." The goal of this paper is to provide an introduction to the issues pertaining to "Open Systems and Databases." The intent is to assist understanding of some of the underlying factors that affect choices that must be made when selecting a database system for use in a control room environment. This paper describes and compares the major database information models which are in common use for database systems and provides an overview of SQL. A case for the control center community to follow the workings of the non-formal standards bodies is presented along with possible uses and the benefits of commercially available databases within the control center. The reasons behind the emergence of industry supported standards organizations such as the Open Software Foundation (OSF) and SQL Access are presented.

  7. The comprehensive peptaibiotics database.

    PubMed

    Stoppacher, Norbert; Neumann, Nora K N; Burgstaller, Lukas; Zeilinger, Susanne; Degenkolb, Thomas; Brückner, Hans; Schuhmacher, Rainer

    2013-05-01

    Peptaibiotics are nonribosomally biosynthesized peptides, which - according to definition - contain the marker amino acid α-aminoisobutyric acid (Aib) and possess antibiotic properties. Peptaibiotics have been known since 1958, and a constantly increasing number have been described and investigated, with a particular emphasis on hypocrealean fungi. Starting from the existing online 'Peptaibol Database', first published in 1997, an exhaustive literature survey of all known peptaibiotics was carried out and resulted in a list of 1043 peptaibiotics. The gathered information was compiled and used to create the new 'The Comprehensive Peptaibiotics Database', which is presented here. The database was devised as a software tool based on Microsoft (MS) Access. It is freely available from the internet at http://peptaibiotics-database.boku.ac.at and can easily be installed and operated on any computer offering a Windows XP/7 environment. It provides useful information on characteristic properties of the peptaibiotics included such as peptide category, group name of the microheterogeneous mixture to which the peptide belongs, amino acid sequence, sequence length, producing fungus, peptide subfamily, molecular formula, and monoisotopic mass. All these characteristics can be used and combined for automated search within the database, which makes The Comprehensive Peptaibiotics Database a versatile tool for the retrieval of valuable information about peptaibiotics. Sequence data published up to December 14, 2012 have been considered. PMID:23681723

  8. Specialist Bibliographic Databases

    PubMed Central

    2016-01-01

    Specialist bibliographic databases offer essential online tools for researchers and authors who work on specific subjects and perform comprehensive and systematic syntheses of evidence. This article presents examples of the established specialist databases, which may be of interest to those engaged in multidisciplinary science communication. Access to most specialist databases is through subscription schemes and membership in professional associations. Several aggregators of information and database vendors, such as EBSCOhost and ProQuest, facilitate advanced searches supported by specialist keyword thesauri. Searches of items through specialist databases are complementary to those through multidisciplinary research platforms, such as PubMed, Web of Science, and Google Scholar. Familiarizing with the functional characteristics of biomedical and nonbiomedical bibliographic search tools is mandatory for researchers, authors, editors, and publishers. The database users are offered updates of the indexed journal lists, abstracts, author profiles, and links to other metadata. Editors and publishers may find particularly useful source selection criteria and apply for coverage of their peer-reviewed journals and grey literature sources. These criteria are aimed at accepting relevant sources with established editorial policies and quality controls. PMID:27134485

  9. Specialist Bibliographic Databases.

    PubMed

    Gasparyan, Armen Yuri; Yessirkepov, Marlen; Voronov, Alexander A; Trukhachev, Vladimir I; Kostyukova, Elena I; Gerasimov, Alexey N; Kitas, George D

    2016-05-01

    Specialist bibliographic databases offer essential online tools for researchers and authors who work on specific subjects and perform comprehensive and systematic syntheses of evidence. This article presents examples of the established specialist databases, which may be of interest to those engaged in multidisciplinary science communication. Access to most specialist databases is through subscription schemes and membership in professional associations. Several aggregators of information and database vendors, such as EBSCOhost and ProQuest, facilitate advanced searches supported by specialist keyword thesauri. Searches of items through specialist databases are complementary to those through multidisciplinary research platforms, such as PubMed, Web of Science, and Google Scholar. Familiarizing with the functional characteristics of biomedical and nonbiomedical bibliographic search tools is mandatory for researchers, authors, editors, and publishers. The database users are offered updates of the indexed journal lists, abstracts, author profiles, and links to other metadata. Editors and publishers may find particularly useful source selection criteria and apply for coverage of their peer-reviewed journals and grey literature sources. These criteria are aimed at accepting relevant sources with established editorial policies and quality controls. PMID:27134485

  10. Specialist Bibliographic Databases.

    PubMed

    Gasparyan, Armen Yuri; Yessirkepov, Marlen; Voronov, Alexander A; Trukhachev, Vladimir I; Kostyukova, Elena I; Gerasimov, Alexey N; Kitas, George D

    2016-05-01

    Specialist bibliographic databases offer essential online tools for researchers and authors who work on specific subjects and perform comprehensive and systematic syntheses of evidence. This article presents examples of the established specialist databases, which may be of interest to those engaged in multidisciplinary science communication. Access to most specialist databases is through subscription schemes and membership in professional associations. Several aggregators of information and database vendors, such as EBSCOhost and ProQuest, facilitate advanced searches supported by specialist keyword thesauri. Searches of items through specialist databases are complementary to those through multidisciplinary research platforms, such as PubMed, Web of Science, and Google Scholar. Familiarizing with the functional characteristics of biomedical and nonbiomedical bibliographic search tools is mandatory for researchers, authors, editors, and publishers. The database users are offered updates of the indexed journal lists, abstracts, author profiles, and links to other metadata. Editors and publishers may find particularly useful source selection criteria and apply for coverage of their peer-reviewed journals and grey literature sources. These criteria are aimed at accepting relevant sources with established editorial policies and quality controls.

  11. Online Information. Selected Databases at the New York State Library.

    ERIC Educational Resources Information Center

    New York State Library, Albany. Database Services.

    This brochure describes the online information services at the New York State Library, which has online access to over 250 databases covering a broad range of subject areas, including current events, law, science, medicine, public affairs, grants, business, computer technology, education, social welfare, and humanities. Many of these databases are…

  12. The Vocational Guidance Research Database: A Scientometric Approach

    ERIC Educational Resources Information Center

    Flores-Buils, Raquel; Gil-Beltran, Jose Manuel; Caballer-Miedes, Antonio; Martinez-Martinez, Miguel Angel

    2012-01-01

    The scientometric study of scientific output through publications in specialized journals cannot be undertaken exclusively with the databases available today. For this reason, the objective of this article is to introduce the "Base de Datos de Investigacion en Orientacion Vocacional" [Vocational Guidance Research Database], based on the use of…

  13. Go Figure: Computer Database Adds the Personal Touch.

    ERIC Educational Resources Information Center

    Gaffney, Jean; Crawford, Pat

    1992-01-01

    A database for recordkeeping for a summer reading club was developed for a public library system using an IBM PC and Microsoft Works. Use of the database resulted in more efficient program management, giving librarians more time to spend with patrons and enabling timely awarding of incentives. (LAE)

  14. Hungry for Nutrient Data? Navigating the USDA Nutrient Database

    Technology Transfer Automated Retrieval System (TEKTRAN)

    The USDA National Nutrient Database for Standard Reference (SR) is the major source of food composition data in the United States, providing the foundation for most food composition databases in the public and private sectors. Most nutrition professionals are familiar with the basics of the SR onlin...

  15. The peptaibiotics database--a comprehensive online resource.

    PubMed

    Neumann, Nora K N; Stoppacher, Norbert; Zeilinger, Susanne; Degenkolb, Thomas; Brückner, Hans; Schuhmacher, Rainer

    2015-05-01

    In this work, we present the 'Peptaibiotics Database' (PDB), a comprehensive online resource, which is intended to cover all Aib-containing non-ribosomal fungal peptides currently described in scientific literature. This database extends and updates the recently published 'Comprehensive Peptaibiotics Database' and currently consists of 1,297 peptaibiotic sequences. In a literature survey, a total of 235 peptaibiotic sequences published between January 2013 and June 2014 have been compiled and added to the list of 1,062 peptides in the recently published 'Comprehensive Peptaibiotics Database'. The presented database is intended as a public resource freely accessible to the scientific community at peptaibiotics-database.boku.ac.at. The search options of the previously published repository and the presentation of sequence motif searches have been extended significantly. All of the available search options can be combined to create complex database queries. As a public repository, the presented database enables the easy upload of new peptaibiotic sequences or the correction of existing information. In addition, an administrative interface for maintenance of the content of the database has been implemented, and the design of the database can be easily extended to store additional information to accommodate future needs of the 'peptaibiomics community'.

  16. Comprehensive Thematic T-matrix Reference Database: a 2013-2014 Update

    NASA Technical Reports Server (NTRS)

    Mishchenko, Michael I.; Zakharova, Nadezhda T.; Khlebtsov, Nikolai G.; Wriedt, Thomas; Videen, Gorden

    2014-01-01

    This paper is the sixth update to the comprehensive thematic database of peer-reviewed T-matrix publications initiated by us in 2004 and includes relevant publications that have appeared since 2013. It also lists several earlier publications not incorporated in the original database and previous updates.

  17. DNA Profiling of Convicted Offender Samples for the Combined DNA Index System

    ERIC Educational Resources Information Center

    Millard, Julie T

    2011-01-01

    The cornerstone of forensic chemistry is that a perpetrator inevitably leaves trace evidence at a crime scene. One important type of evidence is DNA, which has been instrumental in both the implication and exoneration of thousands of suspects in a wide range of crimes. The Combined DNA Index System (CODIS), a network of DNA databases, provides…

  18. Corruption of genomic databases with anomalous sequence.

    PubMed Central

    Lamperti, E D; Kittelberger, J M; Smith, T F; Villa-Komaroff, L

    1992-01-01

    We describe evidence that DNA sequences from vectors used for cloning and sequencing have been incorporated accidentally into eukaryotic entries in the GenBank database. These incorporations were not restricted to one type of vector or to a single mechanism. Many minor instances may have been the result of simple editing errors, but some entries contained large blocks of vector sequence that had been incorporated by contamination or other accidents during cloning. Some cases involved unusual rearrangements and areas of vector distant from the normal insertion sites. Matches to vector were found in 0.23% of 20,000 sequences analyzed in GenBank Release 63. Although the possibility of anomalous sequence incorporation has been recognized since the inception of GenBank and should be easy to avoid, recent evidence suggests that this problem is increasing more quickly than the database itself. The presence of anomalous sequence may have serious consequences for the interpretation and use of database entries, and will have an impact on issues of database management. The incorporated vector fragments described here may also be useful for a crude estimate of the fidelity of sequence information in the database. In alignments with well-defined ends, the matching sequences showed 96.8% identity to vector; when poorer matches with arbitrary limits were included, the aggregate identity to vector sequence was 94.8%. PMID:1614861
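
    To make the figures in this abstract concrete: 0.23% of 20,000 sequences corresponds to 46 entries with vector matches, and "percent identity" is the share of aligned positions at which the database entry and the vector carry the same base. The sketch below shows that calculation for a pair of pre-aligned sequences. The function name, the gap convention (`-`), and the example sequences are illustrative assumptions; the paper's own alignment procedure is not described here.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns where both sequences have the same base.

    Inputs must already be aligned to equal length; '-' marks a gap and
    never counts as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("empty alignment")
    matches = sum(1 for x, y in zip(aligned_a, aligned_b) if x == y and x != "-")
    return 100.0 * matches / len(aligned_a)


# 0.23% of 20,000 analyzed sequences, in integer arithmetic (23 per 10,000).
entries_with_vector = 20_000 * 23 // 10_000

# Made-up 10-base example: one mismatch in the last column.
identity = percent_identity("ACGTACGTAC", "ACGTACGTAA")
```

    Whether gap columns count toward the denominator varies between tools; this sketch counts them, which gives the more conservative identity value.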

  19. Corruption of genomic databases with anomalous sequence.

    PubMed

    Lamperti, E D; Kittelberger, J M; Smith, T F; Villa-Komaroff, L

    1992-06-11

    We describe evidence that DNA sequences from vectors used for cloning and sequencing have been incorporated accidentally into eukaryotic entries in the GenBank database. These incorporations were not restricted to one type of vector or to a single mechanism. Many minor instances may have been the result of simple editing errors, but some entries contained large blocks of vector sequence that had been incorporated by contamination or other accidents during cloning. Some cases involved unusual rearrangements and areas of vector distant from the normal insertion sites. Matches to vector were found in 0.23% of 20,000 sequences analyzed in GenBank Release 63. Although the possibility of anomalous sequence incorporation has been recognized since the inception of GenBank and should be easy to avoid, recent evidence suggests that this problem is increasing more quickly than the database itself. The presence of anomalous sequence may have serious consequences for the interpretation and use of database entries, and will have an impact on issues of database management. The incorporated vector fragments described here may also be useful for a crude estimate of the fidelity of sequence information in the database. In alignments with well-defined ends, the matching sequences showed 96.8% identity to vector; when poorer matches with arbitrary limits were included, the aggregate identity to vector sequence was 94.8%.

  20. Neuroinformatics: from bioinformatics to databasing the brain.

    PubMed

    Morse, Thomas M

    2008-01-01

    Neuroinformatics seeks to create and maintain web-accessible databases of experimental and computational data, together with innovative software tools, essential for understanding the nervous system in its normal function and in neurological disorders. Neuroinformatics includes traditional bioinformatics of gene and protein sequences in the brain; atlases of brain anatomy and localization of genes and proteins; imaging of brain cells; brain imaging by positron emission tomography (PET), functional magnetic resonance imaging (fMRI), electroencephalography (EEG), magnetoencephalography (MEG) and other methods; many electrophysiological recording methods; and clinical neurological data, among others. Building neuroinformatics databases and tools presents difficult challenges because they span a wide range of spatial scales and types of data stored and analyzed. Traditional bioinformatics, by comparison, focuses primarily on genomic and proteomic data (which of course also presents difficult challenges). Much of bioinformatics analysis focuses on sequences (DNA, RNA, and protein molecules) as the type of data that are stored, compared, and sometimes modeled. Bioinformatics is undergoing explosive growth with the addition, for example, of databases that catalog interactions between proteins, of databases that track the evolution of genes, and of systems biology databases which contain models of all aspects of organisms. This commentary briefly reviews neuroinformatics with clarification of its relationship to traditional and modern bioinformatics.

  1. Database of Mechanical Properties of Textile Composites

    NASA Technical Reports Server (NTRS)

    Delbrey, Jerry

    1996-01-01

    This report describes the approach followed to develop a database for mechanical properties of textile composites. The data in this database is assembled from NASA Advanced Composites Technology (ACT) programs and from data in the public domain. This database meets the data documentation requirements of MIL-HDBK-17, Section 8.1.2, which describes in detail the type and amount of information needed to completely document composite material properties. The database focuses on mechanical properties of textile composites. Properties are available for a range of parameters such as direction, fiber architecture, materials, environmental condition, and failure mode. The composite materials in the database contain innovative textile architectures such as the braided, woven, and knitted materials evaluated under the NASA ACT programs. In summary, the database contains results for approximately 3500 coupon level tests, for ten different fiber/resin combinations, and seven different textile architectures. It also includes a limited amount of prepreg tape composites data from ACT programs where side-by-side comparisons were made.

  2. Proposal for a High Energy Nuclear Database

    SciTech Connect

    Brown, David A.; Vogt, Ramona

    2005-03-31

    We propose to develop a high-energy heavy-ion experimental database and make it accessible to the scientific community through an on-line interface. This database will be searchable and cross-indexed with relevant publications, including published detector descriptions. Since this database will be a community resource, it requires the high-energy nuclear physics community's financial and manpower support. This database should eventually contain all published data from Bevalac and AGS to RHIC to CERN-LHC energies, proton-proton to nucleus-nucleus collisions as well as other relevant systems, and all measured observables. Such a database would have tremendous scientific payoff as it makes systematic studies easier and allows simpler benchmarking of theoretical models to a broad range of old and new experiments. Furthermore, there is a growing need for compilations of high-energy nuclear data for applications including stockpile stewardship, technology development for inertial confinement fusion and target and source development for upcoming facilities such as the Next Linear Collider. To enhance the utility of this database, we propose periodically performing evaluations of the data and summarizing the results in topical reviews.

  3. ECMDB: the E. coli Metabolome Database.

    PubMed

    Guo, An Chi; Jewison, Timothy; Wilson, Michael; Liu, Yifeng; Knox, Craig; Djoumbou, Yannick; Lo, Patrick; Mandal, Rupasri; Krishnamurthy, Ram; Wishart, David S

    2013-01-01

    The Escherichia coli Metabolome Database (ECMDB, http://www.ecmdb.ca) is a comprehensively annotated metabolomic database containing detailed information about the metabolome of E. coli (K-12). Modelled closely on the Human and Yeast Metabolome Databases, the ECMDB contains >2600 metabolites with links to ∼1500 different genes and proteins, including enzymes and transporters. The information in the ECMDB has been collected from dozens of textbooks, journal articles and electronic databases. Each metabolite entry in the ECMDB contains an average of 75 separate data fields, including comprehensive compound descriptions, names and synonyms, chemical taxonomy, compound structural and physicochemical data, bacterial growth conditions and substrates, reactions, pathway information, enzyme data, gene/protein sequence data and numerous hyperlinks to images, references and other public databases. The ECMDB also includes an extensive collection of intracellular metabolite concentration data compiled from our own work as well as other published metabolomic studies. This information is further supplemented with thousands of fully assigned reference nuclear magnetic resonance and mass spectrometry spectra obtained from pure E. coli metabolites that we (and others) have collected. Extensive searching, relational querying and data browsing tools are also provided that support text, chemical structure, spectral, molecular weight and gene/protein sequence queries. Because of E. coli's importance as a model organism for biologists and as a biofactory for industry, we believe this kind of database could have considerable appeal not only to metabolomics researchers but also to molecular biologists, systems biologists and individuals in the biotechnology industry.

  4. Proposal for a High Energy Nuclear Database

    SciTech Connect

    Brown, D A; Vogt, R

    2005-03-31

    The authors propose to develop a high-energy heavy-ion experimental database and make it accessible to the scientific community through an on-line interface. This database will be searchable and cross-indexed with relevant publications, including published detector descriptions. Since this database will be a community resource, it requires the high-energy nuclear physics community's financial and manpower support. This database should eventually contain all published data from Bevalac, AGS and SPS to RHIC and CERN-LHC energies, proton-proton to nucleus-nucleus collisions as well as other relevant systems, and all measured observables. Such a database would have tremendous scientific payoff as it makes systematic studies easier and allows simpler benchmarking of theoretical models to a broad range of old and new experiments. Furthermore, there is a growing need for compilations of high-energy nuclear data for applications including stockpile stewardship, technology development for inertial confinement fusion and target and source development for upcoming facilities such as the Next Linear Collider. To enhance the utility of this database, they propose periodically performing evaluations of the data and summarizing the results in topical reviews.

  5. Great Basin paleontological database

    USGS Publications Warehouse

    Zhang, N.; Blodgett, R.B.; Hofstra, A.H.

    2008-01-01

    The U.S. Geological Survey has constructed a paleontological database for the Great Basin physiographic province that can be served over the World Wide Web for data entry, queries, displays, and retrievals. It is similar to the web-database solution that we constructed for Alaskan paleontological data (www.alaskafossil.org). The first phase of this effort, recently completed, was to compile a paleontological bibliography for Nevada and portions of adjacent states in the Great Basin. In addition, we are compiling paleontological reports (known as E&R reports) of the U.S. Geological Survey, which are another extensive source of legacy data for this region. Initial population of the database benefited from a recently published conodont data set and is otherwise focused on Devonian and Mississippian localities because strata of this age host important sedimentary exhalative (sedex) Au, Zn, and barite resources and enormous Carlin-type Au deposits. In addition, these strata are the most important petroleum source rocks in the region, and record the transition from extension to contraction associated with the Antler orogeny, the Alamo meteorite impact, and biotic crises associated with global oceanic anoxic events. The finished product will provide an invaluable tool for future geologic mapping, paleontological research, and mineral resource investigations in the Great Basin, making paleontological data acquired over nearly the past 150 yr readily available over the World Wide Web. A description of the structure of the database and the web interface developed for this effort are provided herein. This database is being used as a model for a National Paleontological Database (which we are currently developing for the U.S. Geological Survey) as well as for other paleontological databases now being developed in other parts of the globe. © 2008 Geological Society of America.

  6. Toward unification of taxonomy databases in a distributed computer environment

    SciTech Connect

    Kitakami, Hajime; Tateno, Yoshio; Gojobori, Takashi

    1994-12-31

    All the taxonomy databases constructed with the DNA databases of the international DNA data banks are powerful electronic dictionaries which aid in biological research by computer. The taxonomy databases are, however, not consistently unified with a relational format. If we can achieve consistent unification of the taxonomy databases, it will be useful in comparing many research results, and investigating future research directions from existing research results. In particular, it will be useful in comparing relationships between phylogenetic trees inferred from molecular data and those constructed from morphological data. The goal of the present study is to unify the existing taxonomy databases and eliminate inconsistencies (errors) that are present in them. Inconsistencies occur particularly in the restructuring of the existing taxonomy databases, since classification rules for constructing the taxonomy have rapidly changed with biological advancements. A repair system is needed to remove inconsistencies in each data bank and mismatches among data banks. This paper describes a new methodology for removing both inconsistencies and mismatches from the databases in a distributed computer environment. The methodology is implemented in a relational database management system, SYBASE.

  7. NASA aerospace database subject scope: An overview

    NASA Technical Reports Server (NTRS)

    1993-01-01

    Outlined here is the subject scope of the NASA Aerospace Database, a publicly available subset of the NASA Scientific and Technical (STI) Database. Topics of interest to NASA are outlined and placed within the framework of the following broad aerospace subject categories: aeronautics, astronautics, chemistry and materials, engineering, geosciences, life sciences, mathematical and computer sciences, physics, social sciences, space sciences, and general. A brief discussion of the subject scope is given for each broad area, followed by a similar explanation of each of the narrower subject fields that follow. The subject category code is listed for each entry.

  8. Novel circular DNA viruses identified in Procordulia grayi and Xanthocnemis zealandica larvae using metagenomic approaches.

    PubMed

    Dayaram, Anisha; Galatowitsch, Mark; Harding, Jon S; Argüello-Astorga, Gerardo R; Varsani, Arvind

    2014-03-01

    Recent advances in sequencing and metagenomics have enabled the discovery of many novel single stranded DNA (ssDNA) viruses from various environments. We have previously demonstrated that adult dragonflies, as predatory insects, are useful indicators of ssDNA viruses in terrestrial ecosystems. Here we recover and characterise 13 viral genomes which represent 10 novel and diverse circular replication associated protein (Rep)-encoding single stranded (CRESS) DNA viruses (1628-2668nt) from Procordulia grayi and Xanthocnemis zealandica dragonfly larvae collected from four high-country lakes in the South Island of New Zealand. The dragonfly larvae associated CRESS DNA viruses have different genome architectures; however, they all encode two major open reading frames (ORFs) which have either a bidirectional or a unidirectional arrangement. The 13 viral genomes have a conserved NAGTATTAC nonanucleotide motif and in their predicted Rep proteins we identified the rolling circle replication (RCR) motifs 1, 2 and 3, as well as superfamily 3 (SF3) helicase motifs. Maximum likelihood phylogenetic and pairwise identity analyses of the Rep amino acid sequences reveal that the dragonfly larvae novel CRESS DNA viruses share <63% pairwise amino acid identity to the Reps of other CRESS DNA viruses whose complete genomes have been determined and are available in public databases, and that these viruses are novel. CRESS DNA viruses are circulating in larval dragonfly populations; however, we are unable to ascertain whether these viruses are infecting the larvae directly or are transient within dragonflies via their diet. PMID:24462907
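
    As an illustration of how a conserved nonanucleotide such as NAGTATTAC might be located in a circular genome, the following is a minimal sketch and not the authors' pipeline. It assumes that N stands for any of A, C, G or T (the usual IUPAC reading), searches the forward strand only, and uses a hypothetical function name, find_motif_circular.

```python
import re

# Assumption: only the IUPAC codes needed for this motif are handled.
# N matches any base; A, C, G and T match themselves.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T", "N": "[ACGT]"}


def find_motif_circular(genome: str, motif: str = "NAGTATTAC") -> list[int]:
    """Return 0-based start positions of motif on the forward strand of a circular genome."""
    genome = genome.upper()
    # A lookahead lets overlapping occurrences be reported as well.
    pattern = re.compile("(?=" + "".join(IUPAC[base] for base in motif) + ")")
    # Append the first len(motif) - 1 bases so matches spanning the origin are found.
    extended = genome + genome[: len(motif) - 1]
    return [m.start() for m in pattern.finditer(extended) if m.start() < len(genome)]


# Toy sequences, not real viral genomes.
linear_hit = find_motif_circular("GGTAGTATTACGG")
wrapped_hit = find_motif_circular("TATTACGGGGCAG")
```

    In the second toy sequence the motif starts near the end of the string and continues past the origin, which is why the sequence is extended before searching.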

  9. ADANS database specification

    SciTech Connect

    1997-01-16

    The purpose of the Air Mobility Command (AMC) Deployment Analysis System (ADANS) Database Specification (DS) is to describe the database organization and storage allocation and to provide the detailed data model of the physical design and information necessary for the construction of the parts of the database (e.g., tables, indexes, rules, defaults). The DS includes entity relationship diagrams, table and field definitions, reports on other database objects, and a description of the ADANS data dictionary. ADANS is the automated system used by Headquarters AMC and the Tanker Airlift Control Center (TACC) for airlift planning and scheduling of peacetime and contingency operations as well as for deliberate planning. ADANS also supports planning and scheduling of Air Refueling Events by the TACC and the unit-level tanker schedulers. ADANS receives input in the form of movement requirements and air refueling requests. It provides a suite of tools for planners to manipulate these requirements/requests against mobility assets and to develop, analyze, and distribute schedules. Analysis tools are provided for assessing the products of the scheduling subsystems, and editing capabilities support the refinement of schedules. A reporting capability provides formatted screen, print, and/or file outputs of various standard reports. An interface subsystem handles message traffic to and from external systems. The database is an integral part of the functionality summarized above.

  10. Using the Reactome Database

    PubMed Central

    Haw, Robin

    2012-01-01

    There is considerable interest in the bioinformatics community in creating pathway databases. The Reactome project (a collaboration between the Ontario Institute for Cancer Research, Cold Spring Harbor Laboratory, New York University Medical Center and the European Bioinformatics Institute) is one such pathway database and collects structured information on all the biological pathways and processes in the human. It is an expert-authored and peer-reviewed, curated collection of well-documented molecular reactions that span the gamut from simple intermediate metabolism to signaling pathways and complex cellular events. This information is supplemented with likely orthologous molecular reactions in mouse, rat, zebrafish, worm and other model organisms. This unit describes how to use the Reactome database to learn the steps of a biological pathway; navigate and browse through the Reactome database; identify the pathways in which a molecule of interest is involved; use the Pathway and Expression analysis tools to search the database for and visualize possible connections within user-supplied experimental data set and Reactome pathways; and the Species Comparison tool to compare human and model organism pathways. PMID:22700314

  11. NASA Records Database

    NASA Technical Reports Server (NTRS)

    Callac, Christopher; Lunsford, Michelle

    2005-01-01

    The NASA Records Database, comprising a Web-based application program and a database, is used to administer an archive of paper records at Stennis Space Center. The system begins with an electronic form, into which a user enters information about records that the user is sending to the archive. The form is smart: it provides instructions for entering information correctly and prompts the user to enter all required information. Once complete, the form is digitally signed and submitted to the database. The system determines which storage locations are not in use, assigns the user's boxes of records to some of them, and enters these assignments in the database. Thereafter, the software tracks the boxes and can be used to locate them. By use of search capabilities of the software, specific records can be sought by box storage locations, accession numbers, record dates, submitting organizations, or details of the records themselves. Boxes can be marked with such statuses as checked out, lost, transferred, and destroyed. The system can generate reports showing boxes awaiting destruction or transfer. When boxes are transferred to the National Archives and Records Administration (NARA), the system can automatically fill out NARA records-transfer forms. Currently, several other NASA Centers are considering deploying the NASA Records Database to help automate their records archives.
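
    The storage-assignment step described in this record (find locations not in use, assign boxes to some of them, record the assignments) could be sketched roughly as follows. This is an illustrative assumption about the logic only; the function name assign_boxes, the location labels and the first-free-location policy are hypothetical and are not taken from the NASA software.

```python
def assign_boxes(locations, in_use, boxes):
    """Assign each box to a free storage location and mark that location as used.

    locations: ordered list of all storage location labels (hypothetical labels).
    in_use:    set of labels already occupied; updated in place.
    boxes:     list of box identifiers to place.
    """
    free = [loc for loc in locations if loc not in in_use]
    if len(boxes) > len(free):
        raise ValueError("not enough free storage locations")
    # Assumed policy: take free locations in their listed order.
    assignments = dict(zip(boxes, free))
    in_use.update(assignments.values())
    return assignments


occupied = {"A2"}
placed = assign_boxes(["A1", "A2", "A3", "A4"], occupied, ["box-1", "box-2"])
```

    After the call, the set of occupied locations also contains the newly assigned ones, so a later call would not hand out the same location twice.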

  12. Shuttle Hypervelocity Impact Database

    NASA Technical Reports Server (NTRS)

    Hyde, James L.; Christiansen, Eric L.; Lear, Dana M.

    2011-01-01

    With three missions outstanding, the Shuttle Hypervelocity Impact Database has nearly 3000 entries. The data is divided into tables for crew module windows, payload bay door radiators and thermal protection system regions, with window impacts comprising just over half the records. In general, the database provides dimensions of hypervelocity impact damage, a component level location (i.e., window number or radiator panel number) and the orbiter mission when the impact occurred. Additional detail on the type of particle that produced the damage site is provided when sampling data and definitive analysis results are available. Details and insights on the contents of the database including examples of descriptive statistics will be provided. Post flight impact damage inspection and sampling techniques that were employed during the different observation campaigns will also be discussed. Potential enhancements to the database structure and availability of the data for other researchers will be addressed in the Future Work section. A related database of returned surfaces from the International Space Station will also be introduced.

  13. Shuttle Hypervelocity Impact Database

    NASA Technical Reports Server (NTRS)

    Hyde, James L.; Christiansen, Eric L.; Lear, Dana M.

    2011-01-01

    With three flights remaining on the manifest, the shuttle impact hypervelocity database has over 2800 entries. The data is currently divided into tables for crew module windows, payload bay door radiators and thermal protection system regions, with window impacts comprising just over half the records. In general, the database provides dimensions of hypervelocity impact damage, a component level location (i.e., window number or radiator panel number) and the orbiter mission when the impact occurred. Additional detail on the type of particle that produced the damage site is provided when sampling data and definitive analysis results are available. The paper will provide details and insights on the contents of the database including examples of descriptive statistics using the impact data. A discussion of post flight impact damage inspection and sampling techniques that were employed during the different observation campaigns will be presented. Future work to be discussed will be possible enhancements to the database structure and availability of the data for other researchers. A related database of ISS returned surfaces that is under development will also be introduced.

  14. Computer Databases: A Survey; Part 1: General and News Databases.

    ERIC Educational Resources Information Center

    O'Leary, Mick

    1986-01-01

    Descriptions and evaluations of 13 databases devoted to computer information are presented by type under four headings: bibliographic databases; daily news services; online computer magazines; and specialized computer industry databases. Information on database producers, starting date of file, update frequency, vendors, and prices is summarized…

  15. Publicity and public relations

    NASA Technical Reports Server (NTRS)

    Fosha, Charles E.

    1990-01-01

    This paper addresses approaches to using publicity and public relations to meet the goals of the NASA Space Grant College. Methods universities and colleges can use to publicize space activities are presented.

  16. NASA scientific and technical publications: A catalog of Special Publications, Reference Publications, Conference Publications, and Technical Papers, 1987

    NASA Technical Reports Server (NTRS)

    1988-01-01

    This catalog lists 239 citations of all NASA Special Publications, NASA Reference Publications, NASA Conference Publications, and NASA Technical Papers that were entered in the NASA scientific and technical information database during accession year 1987. The entries are grouped by subject category. Indexes of subject terms, personal authors, and NASA report numbers are provided.

  17. IEEE Conference Publications in Libraries.

    ERIC Educational Resources Information Center

    Johnson, Karl E.

    1984-01-01

    Conclusions of surveys (63 libraries, OCLC database, University of Rhode Island users) assessing handling of Institute of Electrical and Electronics Engineers (IEEE) conference publications indicate that most libraries fully catalog these publications using LC cataloging, and library patrons frequently require series access to publications. Eight…

  18. NASA scientific and technical publications: A catalog of special publications, reference publications, conference publications, and technical papers, 1991-1992

    NASA Technical Reports Server (NTRS)

    1993-01-01

    This catalog lists 458 citations of all NASA Special Publications, NASA Reference Publications, NASA Conference Publications, and NASA Technical Papers that were entered into the NASA Scientific and Technical Information database during accession year 1991 through 1992. The entries are grouped by subject category. Indexes of subject terms, personal authors, and NASA report numbers are provided.

  19. NASA scientific and technical publications: A catalog of special publications, reference publications, conference publications, and technical papers, 1987-1990

    NASA Technical Reports Server (NTRS)

    1991-01-01

    This catalog lists 783 citations of all NASA Special Publications, NASA Reference Publications, NASA Conference Publications, and NASA Technical Papers that were entered into the NASA Scientific and Technical Information Database during the years 1987 through 1990. The entries are grouped by subject category. Indexes of subject terms, personal authors, and NASA report numbers are provided.

  20. NASA scientific and technical publications: A catalog of special publications, reference publications, conference publications, and technical papers, 1989

    NASA Technical Reports Server (NTRS)

    1990-01-01

    This catalog lists 190 citations of all NASA Special Publications, NASA Reference Publications, NASA Conference Publications, and NASA Technical Papers that were entered into the NASA scientific and technical information database during accession year 1989. The entries are grouped by subject category. Indexes of subject terms, personal authors, and NASA report numbers are provided.

  1. BNDb--Biomolecules Nucleus Database: an integrated proteomics and transcriptomics database.

    PubMed

    Faria-Campos, A C; Gomes, R R; Moratelli, F S; Rausch-Fernandes, H; Franco, G R; Campos, S V A

    2007-10-05

    Proteomics corresponds to the identification and quantitative analysis of proteins expressed in different conditions or life stages of a cell or organism. Methods used in proteomics analysis include mainly chromatography, two-dimensional electrophoresis and mass spectrometry. Data generated in proteomics analysis vary significantly, and to identify a protein it is often necessary to perform a series of experiments, comparing their results to those found in proteomics databases. Existing proteomics databases are usually related to only one type of experiment or represent processed results, not raw data. Therefore, proteomics researchers frequently have to resort to several data repositories in order to be able to perform the identification. In this paper, we propose an integrated proteomics and transcriptomics database that stores raw and processed data, which are indexed, allowing them to be retrieved together or individually. The proposed database, dubbed BNDb for Biomolecules Nucleus Database, is implemented using a MySQL server and is being used to store data from the parasite Schistosoma mansoni, the scorpion Tityus serrulatus and the spider Phoneutria nigriventer. The database construction uses a relational approach and data indexes. The proposed data model uses groups of tables for each data subtype, which store details regarding the experimental procedure as well as raw data, analysis results and associated publications. BNDb also stores publicly available transcriptomics data, which are associated with identifications performed on new samples. By using BNDb, we expect not only to contribute to proteomics research but also to provide a useful service for the scientific community.

  2. VIEWCACHE: An incremental database access method for autonomous interoperable databases

    NASA Technical Reports Server (NTRS)

    Roussopoulos, Nick; Sellis, Timoleon

    1991-01-01

    The objective is to illustrate the concept of incremental access to distributed databases. An experimental database management system, ADMS, which has been developed at the University of Maryland, in College Park, uses VIEWCACHE, a database access method based on incremental search. VIEWCACHE is a pointer-based access method that provides a uniform interface for accessing distributed databases and catalogues. The compactness of the pointer structures formed during database browsing and the incremental access method allow the user to search and do inter-database cross-referencing with no actual data movement between database sites. Once the search is complete, the set of collected pointers pointing to the desired data are dereferenced.
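
    The general idea described in this record (narrow a set of pointers step by step, then fetch the actual records only once the search is complete) can be illustrated with a small sketch. This is a toy model under stated assumptions, not the ADMS or VIEWCACHE implementation: the Pointer class, the SITES dictionary and the functions all_pointers, refine and dereference are hypothetical names, and the "sites" are plain in-memory dictionaries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Pointer:
    site: str  # which (hypothetical) database site holds the record
    key: int   # record identifier at that site


# Two toy sites, each mapping a record key to a record.
SITES = {
    "site_a": {1: {"name": "alpha", "year": 1990}, 2: {"name": "beta", "year": 1991}},
    "site_b": {7: {"name": "gamma", "year": 1991}},
}


def all_pointers(sites):
    """Start the search with one pointer per record; no record data is copied."""
    return [Pointer(site, key) for site, records in sites.items() for key in records]


def refine(pointers, sites, predicate):
    """Narrow a pointer set; records are inspected where they live, only pointers are kept."""
    return [p for p in pointers if predicate(sites[p.site][p.key])]


def dereference(pointers, sites):
    """Fetch the actual records once the search is finished."""
    return [sites[p.site][p.key] for p in pointers]


cache = refine(all_pointers(SITES), SITES, lambda rec: rec["year"] == 1991)
results = dereference(cache, SITES)
```

    Each refinement step works on the compact pointer list, so intermediate results stay small; record contents are only materialised by the final dereference call.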

  3. MPW : the metabolic pathways database.

    SciTech Connect

    Selkov, E., Jr.; Grechkin, Y.; Mikhailova, N.; Selkov, E.; Mathematics and Computer Science; Russian Academy of Sciences

    1998-01-01

    The Metabolic Pathways Database (MPW) (www.biobase.com/emphome.html/homepage.html.pags/pathways.html), a derivative of EMP (www.biobase.com/EMP), plays a fundamental role in the technology of metabolic reconstructions from sequenced genomes under the PUMA (www.mcs.anl.gov/home/compbio/PUMA/Production/ReconstructedMetabolism/reconstruction.html), WIT (www.mcs.anl.gov/home/compbio/WIT/wit.html) and WIT2 (beauty.isdn.msc.anl.gov/WIT2.pub/CGI/user.cgi) systems. In October 1997, it included some 2800 pathway diagrams covering primary and secondary metabolism, membrane transport, signal transduction pathways, intracellular traffic, translation and transcription. In the current public release of MPW (beauty.isdn.mcs.anl.gov/MPW), the encoding is based on the logical structure of the pathways and is represented by the objects commonly used in electronic circuit design. This facilitates drawing and editing the diagrams and makes possible automation of the basic simulation operations such as deriving stoichiometric matrices, rate laws, and, ultimately, dynamic models of metabolic pathways. Individual pathway diagrams, automatically derived from the original ASCII records, are stored as SGML instances supplemented by relational indices. An auxiliary database of compound names and structures, encoded in the SMILES format, is maintained to unambiguously connect the pathways to the chemical structures of their intermediates.
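
    To make "deriving stoichiometric matrices" concrete, here is a minimal sketch of the standard construction: one row per metabolite, one column per reaction, negative entries for consumed species and positive entries for produced ones. The input format, the function name stoichiometric_matrix and the two toy reactions are assumptions for illustration; they do not reproduce the MPW encoding.

```python
def stoichiometric_matrix(reactions):
    """Build a stoichiometric matrix from a reaction dictionary.

    reactions maps a reaction name to (substrates, products), where each is a
    dict of metabolite name -> stoichiometric coefficient (assumed input format).
    Returns (metabolites, reaction_names, matrix) with matrix[i][j] the net
    coefficient of metabolite i in reaction j.
    """
    names = list(reactions)
    metabolites = sorted(
        {m for subs, prods in reactions.values() for m in (*subs, *prods)}
    )
    row = {m: i for i, m in enumerate(metabolites)}
    matrix = [[0] * len(names) for _ in metabolites]
    for j, name in enumerate(names):
        subs, prods = reactions[name]
        for m, coeff in subs.items():
            matrix[row[m]][j] -= coeff  # consumed
        for m, coeff in prods.items():
            matrix[row[m]][j] += coeff  # produced
    return metabolites, names, matrix


# Toy two-step pathway: glucose + ATP -> G6P + ADP, then G6P -> F6P.
toy = {
    "hexokinase": ({"glucose": 1, "ATP": 1}, {"G6P": 1, "ADP": 1}),
    "isomerase": ({"G6P": 1}, {"F6P": 1}),
}
mets, rxns, S = stoichiometric_matrix(toy)
```

    The row for G6P ends up as [1, -1], reflecting that it is produced by the first reaction and consumed by the second.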

  4. The Gene Expression Omnibus database

    PubMed Central

    Clough, Emily; Barrett, Tanya

    2016-01-01

    The Gene Expression Omnibus (GEO) database is an international public repository that archives and freely distributes high-throughput gene expression and other functional genomics data sets. Created in 2000 as a worldwide resource for gene expression studies, GEO has evolved with rapidly changing technologies and now accepts high-throughput data for many other data applications, including those that examine genome methylation, chromatin structure, and genome–protein interactions. GEO supports community-derived reporting standards that specify provision of several critical study elements including raw data, processed data, and descriptive metadata. The database not only provides access to data for tens of thousands of studies, but also offers various Web-based tools and strategies that enable users to locate data relevant to their specific interests, as well as to visualize and analyze the data. This chapter includes detailed descriptions of methods to query and download GEO data and use the analysis and visualization tools. The GEO homepage is at http://www.ncbi.nlm.nih.gov/geo/. PMID:27008011

  5. The Gene Expression Omnibus Database.

    PubMed

    Clough, Emily; Barrett, Tanya

    2016-01-01

    The Gene Expression Omnibus (GEO) database is an international public repository that archives and freely distributes high-throughput gene expression and other functional genomics data sets. Created in 2000 as a worldwide resource for gene expression studies, GEO has evolved with rapidly changing technologies and now accepts high-throughput data for many other data applications, including those that examine genome methylation, chromatin structure, and genome-protein interactions. GEO supports community-derived reporting standards that specify provision of several critical study elements including raw data, processed data, and descriptive metadata. The database not only provides access to data for tens of thousands of studies, but also offers various Web-based tools and strategies that enable users to locate data relevant to their specific interests, as well as to visualize and analyze the data. This chapter includes detailed descriptions of methods to query and download GEO data and use the analysis and visualization tools. The GEO homepage is at http://www.ncbi.nlm.nih.gov/geo/. PMID:27008011

  6. Towards modeling DNA sequences as automata

    NASA Astrophysics Data System (ADS)

    Burks, Christian; Farmer, Doyne

    1984-01-01

    We seek to describe a starting point for modeling the evolution and role of DNA sequences within the framework of cellular automata by discussing the current understanding of genetic information storage in DNA sequences. This includes alternately viewing the role of DNA in living organisms as a simple scheme and as a complex scheme; a brief review of strategies for identifying and classifying patterns in DNA sequences; and finally, notes towards establishing DNA-like automata models, including a discussion of the extent of experimentally determined DNA sequence data present in the database at Los Alamos.

  7. Enhancing medical database semantics.

    PubMed Central

    Leão, B. de F.; Pavan, A.

    1995-01-01

    Medical databases deal with dynamic, heterogeneous and fuzzy data. The modeling of such a complex domain demands powerful semantic data modeling methodologies. This paper describes GSM-Explorer, a CASE tool that allows for the creation of relational databases using semantic data modeling techniques. GSM-Explorer fully incorporates the Generic Semantic Data Model (GSM), enabling knowledge engineers to model the application domain with the abstraction mechanisms of generalization/specialization, association and aggregation. The tool generates a structure that implements persistent database objects through the automatic generation of customized ANSI SQL scripts that sustain the semantics defined at the higher level. This paper emphasizes the system architecture and the mapping of the semantic model into relational tables. The present status of the project and its further developments are discussed in the Conclusions. PMID:8563288

  8. Protein Structure Databases.

    PubMed

    Laskowski, Roman A

    2016-01-01

    Web-based protein structure databases come in a wide variety of types and levels of information content. Those having the most general interest are the various atlases that describe each experimentally determined protein structure and provide useful links, analyses, and schematic diagrams relating to its 3D structure and biological function. Also of great interest are the databases that classify 3D structures by their folds, as these can reveal evolutionary relationships which may be hard to detect from sequence comparison alone. Related to these are the numerous servers that compare folds, which are particularly useful for newly solved structures, and especially those of unknown function. Beyond these are a vast number of databases for the more specialized user, dealing with specific families, diseases, structural features, and so on. PMID:27115626

  9. Mouse genome database 2016.

    PubMed

    Bult, Carol J; Eppig, Janan T; Blake, Judith A; Kadin, James A; Richardson, Joel E

    2016-01-01

    The Mouse Genome Database (MGD; http://www.informatics.jax.org) is the primary community model organism database for the laboratory mouse and serves as the source for key biological reference data related to mouse genes, gene functions, phenotypes and disease models with a strong emphasis on the relationship of these data to human biology and disease. As the cost of genome-scale sequencing continues to decrease and new technologies for genome editing become widely adopted, the laboratory mouse is more important than ever as a model system for understanding the biological significance of human genetic variation and for advancing the basic research needed to support the emergence of genome-guided precision medicine. Recent enhancements to MGD include new graphical summaries of biological annotations for mouse genes, support for mobile access to the database, tools to support the annotation and analysis of sets of genes, and expanded support for comparative biology through the expansion of homology data.

  10. Mouse genome database 2016

    PubMed Central

    Bult, Carol J.; Eppig, Janan T.; Blake, Judith A.; Kadin, James A.; Richardson, Joel E.

    2016-01-01

    The Mouse Genome Database (MGD; http://www.informatics.jax.org) is the primary community model organism database for the laboratory mouse and serves as the source for key biological reference data related to mouse genes, gene functions, phenotypes and disease models with a strong emphasis on the relationship of these data to human biology and disease. As the cost of genome-scale sequencing continues to decrease and new technologies for genome editing become widely adopted, the laboratory mouse is more important than ever as a model system for understanding the biological significance of human genetic variation and for advancing the basic research needed to support the emergence of genome-guided precision medicine. Recent enhancements to MGD include new graphical summaries of biological annotations for mouse genes, support for mobile access to the database, tools to support the annotation and analysis of sets of genes, and expanded support for comparative biology through the expansion of homology data. PMID:26578600

  11. Mouse genome database 2016.

    PubMed

    Bult, Carol J; Eppig, Janan T; Blake, Judith A; Kadin, James A; Richardson, Joel E

    2016-01-01

    The Mouse Genome Database (MGD; http://www.informatics.jax.org) is the primary community model organism database for the laboratory mouse and serves as the source for key biological reference data related to mouse genes, gene functions, phenotypes and disease models with a strong emphasis on the relationship of these data to human biology and disease. As the cost of genome-scale sequencing continues to decrease and new technologies for genome editing become widely adopted, the laboratory mouse is more important than ever as a model system for understanding the biological significance of human genetic variation and for advancing the basic research needed to support the emergence of genome-guided precision medicine. Recent enhancements to MGD include new graphical summaries of biological annotations for mouse genes, support for mobile access to the database, tools to support the annotation and analysis of sets of genes, and expanded support for comparative biology through the expansion of homology data. PMID:26578600

  12. Sandia Wind Turbine Loads Database

    DOE Data Explorer

    The Sandia Wind Turbine Loads Database is divided into six files, each corresponding to approximately 16 years of simulation. The files are text files with data in columnar format. The 424MB zipped file containing six data files can be downloaded by the public. The files simulate 10-minute maximum loads for the NREL 5MW wind turbine. The details of the loads simulations can be found in the paper: “Decades of Wind Turbine Loads Simulations”, M. Barone, J. Paquette, B. Resor, and L. Manuel, AIAA2012-1288 (3.69MB PDF). Note that the site-average wind speed is 10 m/s (class I-B), not the 8.5 m/s reported in the paper.
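
    A minimal sketch of reading such columnar text in Python follows. The record above does not describe the column layout, so the sketch assumes whitespace-separated numeric columns and skips any line that is not fully numeric; the helper name `column_maxima` and the sample values are made up for illustration.

```python
import io

def column_maxima(text_stream):
    """Return the maximum of each whitespace-separated numeric column."""
    maxima = None
    for line in text_stream:
        fields = line.split()
        if not fields:
            continue  # blank line
        try:
            values = [float(field) for field in fields]
        except ValueError:
            continue  # header or other non-numeric line
        if maxima is None:
            maxima = values
        else:
            maxima = [max(m, v) for m, v in zip(maxima, values)]
    return maxima

# Hypothetical two-column sample with a header line.
sample = io.StringIO("wind load\n10.0 1.5\n12.0 0.9\n")
sample_maxima = column_maxima(sample)
```

    Streaming line by line keeps memory use flat, which matters for files of this size; for a real file, pass an open file object instead of the `io.StringIO` sample.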

  13. Catalog of databases and reports

    SciTech Connect

    Burtis, M.D.

    1997-04-01

    This catalog provides information about the many reports and materials made available by the US Department of Energy's (DOE's) Global Change Research Program (GCRP) and the Carbon Dioxide Information Analysis Center (CDIAC). The catalog is divided into nine sections plus the author and title indexes: Section A--US Department of Energy Global Change Research Program Research Plans and Summaries; Section B--US Department of Energy Global Change Research Program Technical Reports; Section C--US Department of Energy Atmospheric Radiation Measurement (ARM) Program Reports; Section D--Other US Department of Energy Reports; Section E--CDIAC Reports; Section F--CDIAC Numeric Data and Computer Model Distribution; Section G--Other Databases Distributed by CDIAC; Section H--US Department of Agriculture Reports on Response of Vegetation to Carbon Dioxide; and Section I--Other Publications.

  14. CD-ROM-aided Databases

    NASA Astrophysics Data System (ADS)

    Fujiwara, Yuzuru

    CD-ROM is regarded as an epoch-making medium because of its advantages: large capacity, compact size, mass reproducibility, read-only storage and a good cost-performance ratio. Several large dictionaries and online databases have already been converted to CD-ROM versions, and more recently information on publications and machine parts has been converted as well. Moreover, various CD-ROM-aided products, such as support systems for R&D and decision making, are being released. Many problems remain, however, in the sophisticated use of CD-ROM and in the mechanisms for distributing information. The author reviews this mini-series and describes the prospects for the development of CD-ROM.

  15. Protein-protein interaction databases: keeping up with growing interactomes

    PubMed Central

    2009-01-01

    Over the past few years, the number of known protein-protein interactions has increased substantially. To make this information more readily available, a number of publicly available databases have set out to collect and store protein-protein interaction data. Protein-protein interactions have been retrieved from six major databases, integrated and the results compared. The six databases (the Biological General Repository for Interaction Datasets [BioGRID], the Molecular INTeraction database [MINT], the Biomolecular Interaction Network Database [BIND], the Database of Interacting Proteins [DIP], the IntAct molecular interaction database [IntAct] and the Human Protein Reference Database [HPRD]) differ in scope and content; integration of all datasets is non-trivial owing to differences in data annotation. With respect to human protein-protein interaction data, HPRD seems to be the most comprehensive. To obtain a complete dataset, however, interactions from all six databases have to be combined. To overcome this limitation, meta-databases such as the Agile Protein Interaction Database (APID) offer access to integrated protein-protein interaction datasets, although these also currently have certain restrictions. PMID:19403463
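
    The combination step described above can be sketched in Python as follows. Each interaction is treated as an unordered pair of protein identifiers, and the merge records which sources report each pair. The source names and identifiers are made-up toy data, and the sketch assumes all identifiers have already been mapped to one shared namespace, which is the annotation problem the abstract calls non-trivial.

```python
from collections import defaultdict

def merge_interactions(sources):
    """Map each unordered protein pair to the set of sources reporting it."""
    merged = defaultdict(set)
    for source_name, pairs in sources.items():
        for protein_a, protein_b in pairs:
            # frozenset makes (A, B) and (B, A) the same interaction
            merged[frozenset((protein_a, protein_b))].add(source_name)
    return dict(merged)

# Hypothetical toy data standing in for two interaction databases.
sources = {
    "db_one": [("P1", "P2"), ("P2", "P3")],
    "db_two": [("P2", "P1"), ("P3", "P4")],
}
merged = merge_interactions(sources)

# Interactions reported by more than one source.
shared = [pair for pair, names in merged.items() if len(names) > 1]
```

    Keeping the set of reporting sources per pair, rather than a plain union, makes it possible to compare coverage between databases after the merge.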

  16. Database Constraints Applied to Metabolic Pathway Reconstruction Tools

    PubMed Central

    Vilaplana, Jordi; Solsona, Francesc; Teixido, Ivan; Usié, Anabel; Karathia, Hiren; Alves, Rui; Mateo, Jordi

    2014-01-01

    Our group developed two biological applications, Biblio-MetReS and Homol-MetReS, accessing the same database of organisms with annotated genes. Biblio-MetReS is a data-mining application that facilitates the reconstruction of molecular networks based on automated text-mining analysis of published scientific literature. Homol-MetReS allows functional (re)annotation of proteomes, to properly identify both the individual proteins involved in the process(es) of interest and their function. It also enables the sets of proteins involved in the process(es) in different organisms to be compared directly. The efficiency of these biological applications is directly related to the design of the shared database. We classified and analyzed the different kinds of access to the database. Based on this study, we tried to adjust and tune the configurable parameters of the database server to reach the best performance of the communication data link to/from the database system. Different database technologies were analyzed. We started the study with a public relational SQL database, MySQL. Then, the same database was implemented by a MapReduce-based database named HBase. The results indicated that the standard configuration of MySQL gives an acceptable performance for low or medium size databases. Nevertheless, tuning database parameters can greatly improve the performance and lead to very competitive runtimes. PMID:25202745

  17. The Neotoma Paleoecology Database

    NASA Astrophysics Data System (ADS)

    Grimm, E. C.; Ashworth, A. C.; Barnosky, A. D.; Betancourt, J. L.; Bills, B.; Booth, R.; Blois, J.; Charles, D. F.; Graham, R. W.; Goring, S. J.; Hausmann, S.; Smith, A. J.; Williams, J. W.; Buckland, P.

    2015-12-01

    The Neotoma Paleoecology Database (www.neotomadb.org) is a multiproxy, open-access, relational database that includes fossil data for the past 5 million years (the late Neogene and Quaternary Periods). Modern distributional data for various organisms are also being made available for calibration and paleoecological analyses. The project is a collaborative effort among individuals from more than 20 institutions worldwide, including domain scientists representing a spectrum of Pliocene-Quaternary fossil data types, as well as experts in information technology. Working groups are active for diatoms, insects, ostracodes, pollen and plant macroscopic remains, testate amoebae, rodent middens, vertebrates, age models, geochemistry and taphonomy. Groups are also active in developing online tools for data analyses and for developing modules for teaching at different levels. A key design concept of NeotomaDB is that stewards for various data types are able to remotely upload and manage data. Cooperatives for different kinds of paleo data, or from different regions, can appoint their own stewards. Over the past year, much progress has been made on development of the steward software interface that will enable this capability. The steward interface uses web services that provide access to the database. More generally, these web services enable remote programmatic access to the database, which both desktop and web applications can use and which provide real-time access to the most current data. Use of these services can alleviate the need to download the entire database, which can be out of date as soon as new data are entered. In general, the Neotoma web services deliver data either from an entire table or from the results of a view. Upon request, new web services can be quickly generated. Future developments will likely expand the spatial and temporal dimensions of the database. NeotomaDB is open to receiving new datasets and stewards from the global Quaternary community.

  18. National Residential Efficiency Measures Database Aimed at Reducing Risk for Residential Retrofit Industry

    SciTech Connect

    David Roberts

    2012-01-01

    This technical highlight describes NREL research to develop a publicly available database of energy retrofit measures containing performance characteristics and cost estimates for nearly 3,000 measures.

  19. The Ribosomal Database Project.

    PubMed Central

    Maidak, B L; Larsen, N; McCaughey, M J; Overbeek, R; Olsen, G J; Fogel, K; Blandy, J; Woese, C R

    1994-01-01

    The Ribosomal Database Project (RDP) is a curated database that offers ribosome-related data, analysis services, and associated computer programs. The offerings include phylogenetically ordered alignments of ribosomal RNA (rRNA) sequences, derived phylogenetic trees, rRNA secondary structure diagrams, and various software for handling, analyzing and displaying alignments and trees. The data are available via anonymous ftp (rdp.life.uiuc.edu), electronic mail (server@rdp.life.uiuc.edu) and gopher (rdpgopher.life.uiuc.edu). The electronic mail server also provides ribosomal probe checking, approximate phylogenetic placement of user-submitted sequences, screening for chimeric nature of newly sequenced rRNAs, and automated alignment. PMID:7524021

  20. Database Management System

    NASA Technical Reports Server (NTRS)

    1990-01-01

    In 1981 Wayne Erickson founded Microrim, Inc., a company originally focused on marketing a microcomputer version of RIM (Relational Information Manager). Dennis Comfort joined the firm and is now vice president, development. The team developed an advanced spinoff from the NASA system they had originally created, a microcomputer database management system known as R:BASE 4000. Microrim added many enhancements and developed a series of R:BASE products for various environments. R:BASE is now the second-largest-selling line of microcomputer database management software in the world.