Sample records for open access databases

  1. Databases and Electronic Resources - Betty Petersen Memorial Library

    Science.gov Websites

    … of NOAA-Wide and Open Access Databases on the NOAA Central Library website. … Open Science Directory contains collections of Open Access Journals (e.g. Directory of Open Access Journals) and journals in the special programs (Hinari …)

  2. Crystallography Open Database – an open-access collection of crystal structures

    PubMed Central

    Gražulis, Saulius; Chateigner, Daniel; Downs, Robert T.; Yokochi, A. F. T.; Quirós, Miguel; Lutterotti, Luca; Manakova, Elena; Butkus, Justas; Moeck, Peter; Le Bail, Armel

    2009-01-01

    The Crystallography Open Database (COD), which is a project that aims to gather all available inorganic, metal–organic and small organic molecule structural data in one database, is described. The database adopts an open-access model. The COD currently contains ∼80 000 entries in crystallographic information file format, with nearly full coverage of the International Union of Crystallography publications, and is growing in size and quality. PMID:22477773
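
The COD entries described above are stored in crystallographic information file (CIF) format, a plain-text format of `_tag value` pairs. As a minimal illustration, the sketch below parses a hand-made toy record: the tags are real CIF data names, but the entry itself is invented and is not an actual COD file.

```python
# Toy CIF record (invented values, real tag names).
cif_text = """\
data_toy_entry
_cell_length_a    5.431
_cell_length_b    5.431
_cell_length_c    5.431
_symmetry_space_group_name_H-M  'F d -3 m'
"""

def parse_cif_scalars(text):
    """Collect simple `_tag value` pairs; single-quoted values keep their spaces."""
    fields = {}
    for line in text.splitlines():
        if line.startswith("_"):
            tag, _, value = line.partition(" ")
            fields[tag] = value.strip().strip("'")
    return fields

fields = parse_cif_scalars(cif_text)
print(fields["_cell_length_a"], fields["_symmetry_space_group_name_H-M"])
```

Real COD entries also contain `loop_` tables (atom sites, symmetry operations) that this toy parser ignores; full parsing needs a dedicated CIF library.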

  3. Open-access databases as unprecedented resources and drivers of cultural change in fisheries science

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McManamay, Ryan A; Utz, Ryan

    2014-01-01

    Open-access databases with utility in fisheries science have grown exponentially in quantity and scope over the past decade, with profound impacts on our discipline. The management, distillation, and sharing of an exponentially growing stream of open-access data represent several fundamental challenges in fisheries science. Many of the currently available open-access resources may not be universally known among fisheries scientists. We therefore introduce many national- and global-scale open-access databases with applications in fisheries science and provide an example of how they can be harnessed to perform valuable analyses without additional field efforts. We also discuss how the development, maintenance, and utilization of open-access data are likely to pose technical, financial, and educational challenges to fisheries scientists. The cultural implications that will coincide with the rapidly increasing availability of free data should compel the American Fisheries Society to actively address these problems now to help ease the forthcoming cultural transition.

  4. Crystallography Open Database (COD): an open-access collection of crystal structures and platform for world-wide collaboration

    PubMed Central

    Gražulis, Saulius; Daškevič, Adriana; Merkys, Andrius; Chateigner, Daniel; Lutterotti, Luca; Quirós, Miguel; Serebryanaya, Nadezhda R.; Moeck, Peter; Downs, Robert T.; Le Bail, Armel

    2012-01-01

    Using an open-access distribution model, the Crystallography Open Database (COD, http://www.crystallography.net) collects all known ‘small molecule / small to medium sized unit cell’ crystal structures and makes them available freely on the Internet. As of today, the COD has aggregated ∼150 000 structures, offering basic search capabilities and the possibility to download the whole database, or parts thereof, using a variety of standard open communication protocols. A newly developed website provides capabilities for all registered users to deposit published and so far unpublished structures as personal communications or pre-publication depositions. Such a setup enables extension of the COD database by many users simultaneously. This increases the possibilities for growth of the COD database, and is the first step towards establishing a worldwide Internet-based collaborative platform dedicated to the collection and curation of structural knowledge. PMID:22070882

  5. Reactome graph database: Efficient access to complex pathway data

    PubMed Central

    Korninger, Florian; Viteri, Guilherme; Marin-Garcia, Pablo; Ping, Peipei; Wu, Guanming; Stein, Lincoln; D’Eustachio, Peter

    2018-01-01

    Reactome is a free, open-source, open-data, curated and peer-reviewed knowledgebase of biomolecular pathways. One of its main priorities is to provide easy and efficient access to its high quality curated data. At present, biological pathway databases typically store their contents in relational databases. This limits access efficiency because there are performance issues associated with queries traversing highly interconnected data. The same data in a graph database can be queried more efficiently. Here we present the rationale behind the adoption of a graph database (Neo4j) as well as the new ContentService (REST API) that provides access to these data. The Neo4j graph database and its query language, Cypher, provide efficient access to the complex Reactome data model, facilitating easy traversal and knowledge discovery. The adoption of this technology greatly improved query efficiency, reducing the average query time by 93%. The web service built on top of the graph database provides programmatic access to Reactome data by object oriented queries, but also supports more complex queries that take advantage of the new underlying graph-based data storage. By adopting graph database technology we are providing a high performance pathway data resource to the community. The Reactome graph database use case shows the power of NoSQL database engines for complex biological data types. PMID:29377902
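
The efficiency argument above (each traversal step in a graph database is a neighbor lookup rather than a relational join) can be sketched in a few lines of plain Python. The miniature pathway graph below is invented for illustration; it is not Reactome's actual data model, and the node names are hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical miniature pathway graph: directed edges linking a pathway,
# its reactions, and the proteins they involve. In Neo4j these would be
# nodes and relationships; a plain adjacency map stands in here.
edges = [
    ("Pathway:Signaling", "Reaction:R1"),
    ("Reaction:R1", "Protein:A"),
    ("Protein:A", "Reaction:R2"),
    ("Reaction:R2", "Protein:B"),
]

graph = defaultdict(list)
for src, dst in edges:
    graph[src].append(dst)

def reachable(start):
    """Breadth-first traversal: the graph-database access pattern in
    miniature. Each hop is a constant-time neighbor lookup, whereas a
    relational schema would need one join per hop."""
    seen, queue = {start}, deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen

print(sorted(reachable("Pathway:Signaling")))
```

In Cypher, the same reachability question is a single variable-length pattern match rather than a chain of joins, which is the source of the query-time reduction reported above.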

  6. Reactome graph database: Efficient access to complex pathway data.

    PubMed

    Fabregat, Antonio; Korninger, Florian; Viteri, Guilherme; Sidiropoulos, Konstantinos; Marin-Garcia, Pablo; Ping, Peipei; Wu, Guanming; Stein, Lincoln; D'Eustachio, Peter; Hermjakob, Henning

    2018-01-01

    Reactome is a free, open-source, open-data, curated and peer-reviewed knowledgebase of biomolecular pathways. One of its main priorities is to provide easy and efficient access to its high quality curated data. At present, biological pathway databases typically store their contents in relational databases. This limits access efficiency because there are performance issues associated with queries traversing highly interconnected data. The same data in a graph database can be queried more efficiently. Here we present the rationale behind the adoption of a graph database (Neo4j) as well as the new ContentService (REST API) that provides access to these data. The Neo4j graph database and its query language, Cypher, provide efficient access to the complex Reactome data model, facilitating easy traversal and knowledge discovery. The adoption of this technology greatly improved query efficiency, reducing the average query time by 93%. The web service built on top of the graph database provides programmatic access to Reactome data by object oriented queries, but also supports more complex queries that take advantage of the new underlying graph-based data storage. By adopting graph database technology we are providing a high performance pathway data resource to the community. The Reactome graph database use case shows the power of NoSQL database engines for complex biological data types.

  7. Do open access biomedical journals benefit smaller countries? The Slovenian experience.

    PubMed

    Turk, Nana

    2011-06-01

    Scientists from smaller countries have problems gaining visibility for their research. Does open access publishing provide a solution? Slovenia is a small country with around 5000 medical doctors, 1300 dentists and 1000 pharmacists. A search of Slovenia's bibliographic database was carried out to identify all biomedical journals and those which are open access. Slovenia has 18 medical open access journals, but none has an impact factor and only 10 are indexed by Slovenian and international bibliographic databases. The visibility and quality of medical papers is poor. The solution might be to reduce the number of journals and encourage Slovenian scientists to publish their best articles in them. © 2011 The authors. Health Information and Libraries Journal © 2011 Health Libraries Group.

  8. SciELO, Scientific Electronic Library Online, a Database of Open Access Journals

    ERIC Educational Resources Information Center

    Meneghini, Rogerio

    2013-01-01

    This essay discusses SciELO, a scientific journal database operating in 14 countries. It covers over 1000 journals providing open access to full text and table sets of scientometrics data. In Brazil it is responsible for a collection of nearly 300 journals, selected along 15 years as the best Brazilian periodicals in natural and social sciences.…

  9. The Crisis in Scholarly Communication, Open Access, and Open Data Policies: The Libraries' Perspective

    NASA Astrophysics Data System (ADS)

    Besara, Rachel

    2015-03-01

    For years the cost of STEM databases has risen faster than inflation. Libraries have reallocated funds for years to continue to provide support to their scientific communities, but at many institutions they are reaching a point where they are no longer able to provide access to many databases considered standard to support research. A possible or partial alleviation of this problem is the federal open access mandate. However, this shift challenges the current model of publishing and data management in the sciences. This talk will discuss these topics from the perspective of research libraries supporting physics and the STEM disciplines.

  10. Drug interaction databases in medical literature: transparency of ownership, funding, classification algorithms, level of documentation, and staff qualifications. A systematic review.

    PubMed

    Kongsholm, Gertrud Gansmo; Nielsen, Anna Katrine Toft; Damkier, Per

    2015-11-01

    It is well documented that drug-drug interaction databases (DIDs) differ substantially with respect to classification of drug-drug interactions (DDIs). The aim of this study was to assess the online available transparency of ownership, funding, information, classifications, staff training, and underlying documentation of the five most commonly used open access and the three most commonly used subscription English-language online DIDs in the literature. We conducted a systematic literature search to identify the five most commonly used open access and the three most commonly used subscription DIDs in the medical literature. The following parameters were assessed for each of the databases: ownership, classification of interactions, primary information sources, and staff qualification. We compared the overall proportion of yes/no answers from open access databases and subscription databases by Fisher's exact test, both prior to and after requesting missing information. Among open access DIDs, 20/60 items could be verified from the webpage directly, compared to 24/36 for the subscription DIDs (p = 0.0028). Following personal request, these numbers rose to 22/60 and 30/36, respectively (p < 0.0001). For items within the "classification of interaction" domain, proportions were 3/25 versus 11/15 available from the webpage (p = 0.0001) and 3/25 versus 15/15 (p < 0.0001) available upon personal request. The available information on transparency of ownership, funding, information, classifications, staff training, and underlying documentation varies substantially among DIDs. Open access DIDs scored significantly lower on the parameters assessed.
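
The first comparison reported above (20 of 60 verifiable items for open access DIDs versus 24 of 36 for subscription DIDs) can be re-derived with a small stdlib-only implementation of Fisher's exact test. This is a generic sketch of the standard two-sided test, not the authors' code.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]:
    sum the hypergeometric probabilities of all tables no more likely
    than the observed one (with fixed margins)."""
    n = a + b + c + d
    row1, col1 = a + b, a + c
    def p_table(x):
        # Probability of a table whose top-left cell is x.
        return comb(row1, x) * comb(n - row1, col1 - x) / comb(n, col1)
    p_obs = p_table(a)
    lo, hi = max(0, col1 - (n - row1)), min(row1, col1)
    # Small tolerance guards against floating-point ties.
    return sum(p_table(x) for x in range(lo, hi + 1)
               if p_table(x) <= p_obs * (1 + 1e-9))

# 20/60 verifiable items (open access) vs 24/36 (subscription), as reported above.
p = fisher_exact_two_sided(20, 40, 24, 12)
print(f"p = {p:.4f}")  # the paper reports p = 0.0028
```

The same function reproduces the other comparisons in the abstract by substituting the respective counts.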

  11. [Open access to academic scholarship as a public policy resource: a study of the Capes database on Brazilian theses and dissertations].

    PubMed

    da Silva Rosa, Teresa; Carneiro, Maria José

    2010-12-01

    Access to scientific knowledge is a valuable resource that can inform and validate positions taken in formulating public policy. But access to this knowledge can be challenging, given the diversity and breadth of available scholarship. Communication between the fields of science and of politics requires the dissemination of scholarship and access to it. We conducted a study using an open-access search tool in order to map existent knowledge on a specific topic: agricultural contributions to the preservation of biodiversity. The present article offers a critical view of access to the information available through the Capes database on Brazilian theses and dissertations.

  12. For 481 biomedical open access journals, articles are not searchable in the Directory of Open Access Journals nor in conventional biomedical databases.

    PubMed

    Liljekvist, Mads Svane; Andresen, Kristoffer; Pommergaard, Hans-Christian; Rosenberg, Jacob

    2015-01-01

    Background. Open access (OA) journals allow access to research papers free of charge to the reader. Traditionally, biomedical researchers use databases like MEDLINE and EMBASE to discover new advances. However, biomedical OA journals might not fulfill such databases' criteria, hindering dissemination. The Directory of Open Access Journals (DOAJ) is a database exclusively listing OA journals. The aim of this study was to investigate DOAJ's coverage of biomedical OA journals compared with the conventional biomedical databases. Methods. Information on all journals listed in four conventional biomedical databases (MEDLINE, PubMed Central, EMBASE and SCOPUS) and DOAJ was gathered. Journals were included if they were (1) actively publishing, (2) full OA, (3) prospectively indexed in one or more database, and (4) of biomedical subject. Impact factor and journal language were also collected. DOAJ was compared with conventional databases regarding the proportion of journals covered, along with their impact factor and publishing language. The proportion of journals with articles indexed by DOAJ was determined. Results. In total, 3,236 biomedical OA journals were included in the study. Of the included journals, 86.7% were listed in DOAJ. Combined, the conventional biomedical databases listed 75.0% of the journals; 18.7% in MEDLINE; 36.5% in PubMed Central; 51.5% in SCOPUS and 50.6% in EMBASE. Of the journals in DOAJ, 88.7% published in English and 20.6% had received an impact factor for 2012, compared with 93.5% and 26.0%, respectively, for journals in the conventional biomedical databases. A subset of 51.1% and 48.5% of the journals in DOAJ had articles indexed from 2012 and 2013, respectively. Of journals exclusively listed in DOAJ, one journal had received an impact factor for 2012, and 59.6% of the journals had no content from 2013 indexed in DOAJ. Conclusions. 
DOAJ is the most complete registry of biomedical OA journals compared with five conventional biomedical databases. However, DOAJ only indexes articles for half of the biomedical journals listed, making it an incomplete source for biomedical research papers in general.

  13. For 481 biomedical open access journals, articles are not searchable in the Directory of Open Access Journals nor in conventional biomedical databases

    PubMed Central

    Andresen, Kristoffer; Pommergaard, Hans-Christian; Rosenberg, Jacob

    2015-01-01

    Background. Open access (OA) journals allow access to research papers free of charge to the reader. Traditionally, biomedical researchers use databases like MEDLINE and EMBASE to discover new advances. However, biomedical OA journals might not fulfill such databases’ criteria, hindering dissemination. The Directory of Open Access Journals (DOAJ) is a database exclusively listing OA journals. The aim of this study was to investigate DOAJ’s coverage of biomedical OA journals compared with the conventional biomedical databases. Methods. Information on all journals listed in four conventional biomedical databases (MEDLINE, PubMed Central, EMBASE and SCOPUS) and DOAJ was gathered. Journals were included if they were (1) actively publishing, (2) full OA, (3) prospectively indexed in one or more database, and (4) of biomedical subject. Impact factor and journal language were also collected. DOAJ was compared with conventional databases regarding the proportion of journals covered, along with their impact factor and publishing language. The proportion of journals with articles indexed by DOAJ was determined. Results. In total, 3,236 biomedical OA journals were included in the study. Of the included journals, 86.7% were listed in DOAJ. Combined, the conventional biomedical databases listed 75.0% of the journals; 18.7% in MEDLINE; 36.5% in PubMed Central; 51.5% in SCOPUS and 50.6% in EMBASE. Of the journals in DOAJ, 88.7% published in English and 20.6% had received an impact factor for 2012, compared with 93.5% and 26.0%, respectively, for journals in the conventional biomedical databases. A subset of 51.1% and 48.5% of the journals in DOAJ had articles indexed from 2012 and 2013, respectively. Of journals exclusively listed in DOAJ, one journal had received an impact factor for 2012, and 59.6% of the journals had no content from 2013 indexed in DOAJ. Conclusions. 
DOAJ is the most complete registry of biomedical OA journals compared with five conventional biomedical databases. However, DOAJ only indexes articles for half of the biomedical journals listed, making it an incomplete source for biomedical research papers in general. PMID:26038727

  14. ERMes: Open Source Simplicity for Your E-Resource Management

    ERIC Educational Resources Information Center

    Doering, William; Chilton, Galadriel

    2009-01-01

    ERMes, the latest version of an electronic resource management system (ERM), is a relational database; content in different tables connects to, and works with, content in other tables. ERMes requires Access 2007 (Windows) or Access 2008 (Mac) to operate, as the database utilizes functionality not available in previous versions of Microsoft Access. The…

  15. Virus taxonomy: the database of the International Committee on Taxonomy of Viruses (ICTV)

    PubMed Central

    Dempsey, Donald M; Hendrickson, Robert Curtis; Orton, Richard J; Siddell, Stuart G; Smith, Donald B

    2018-01-01

    The International Committee on Taxonomy of Viruses (ICTV) is charged with the task of developing, refining, and maintaining a universal virus taxonomy. This task encompasses the classification of virus species and higher-level taxa according to the genetic and biological properties of their members; naming virus taxa; maintaining a database detailing the currently approved taxonomy; and providing the database, supporting proposals, and other virus-related information from an open-access, public web site. The ICTV web site (http://ictv.global) provides access to the current taxonomy database in online and downloadable formats, and maintains a complete history of virus taxa back to the first release in 1971. The ICTV has also published the ICTV Report on Virus Taxonomy starting in 1971. This Report provides a comprehensive description of all virus taxa covering virus structure, genome structure, biology and phylogenetics. The ninth ICTV report, published in 2012, is available as an open-access online publication from the ICTV web site. The current, 10th report (http://ictv.global/report/), is being published online, and is replacing the previous hard-copy edition with a completely open access, continuously updated publication. No other database or resource exists that provides such a comprehensive, fully annotated compendium of information on virus taxa and taxonomy. PMID:29040670

  16. An Open-source Toolbox for Analysing and Processing PhysioNet Databases in MATLAB and Octave.

    PubMed

    Silva, Ikaro; Moody, George B

    The WaveForm DataBase (WFDB) Toolbox for MATLAB/Octave enables integrated access to PhysioNet's software and databases. Using the WFDB Toolbox for MATLAB/Octave, users have access to over 50 physiological databases in PhysioNet. The toolbox provides access to over 4 TB of biomedical signals including ECG, EEG, EMG, and PLETH. Additionally, most signals are accompanied by metadata such as medical annotations of clinical events: arrhythmias, sleep stages, seizures, hypotensive episodes, etc. Users of this toolbox should easily be able to reproduce, validate, and compare results published based on PhysioNet's software and databases.

  17. Toward an open-access global database for mapping, control, and surveillance of neglected tropical diseases.

    PubMed

    Hürlimann, Eveline; Schur, Nadine; Boutsika, Konstantina; Stensgaard, Anna-Sofie; Laserna de Himpsl, Maiti; Ziegelbauer, Kathrin; Laizer, Nassor; Camenzind, Lukas; Di Pasquale, Aurelio; Ekpo, Uwem F; Simoonga, Christopher; Mushinge, Gabriel; Saarnak, Christopher F L; Utzinger, Jürg; Kristensen, Thomas K; Vounatsou, Penelope

    2011-12-01

    After many years of general neglect, interest has grown and efforts are under way for the mapping, control, surveillance, and eventual elimination of neglected tropical diseases (NTDs). Disease risk estimates are a key feature to target control interventions, and serve as a benchmark for monitoring and evaluation. What is currently missing is a georeferenced global database for NTDs providing open access to the available survey data, constantly updated and usable by researchers and disease control managers to support other relevant stakeholders. We describe the steps taken toward the development of such a database that can be employed for spatial disease risk modeling and control of NTDs. With an emphasis on schistosomiasis in Africa, we systematically searched the literature (peer-reviewed journals and 'grey literature') and contacted Ministries of Health and research institutions in schistosomiasis-endemic countries for location-specific prevalence data and survey details (e.g., study population, year of survey and diagnostic techniques). The data were extracted, georeferenced, and stored in a MySQL database with a web interface allowing free database access and data management. At the beginning of 2011, our database contained more than 12,000 georeferenced schistosomiasis survey locations from 35 African countries, available under http://www.gntd.org. Currently, the database is being expanded into a global repository, including a host of other NTDs, e.g. soil-transmitted helminthiasis and leishmaniasis. An open-access, spatially explicit NTD database offers unique opportunities for disease risk modeling, targeting control interventions, disease monitoring, and surveillance. Moreover, it allows for detailed geostatistical analyses of disease distribution in space and time. 
With an initial focus on schistosomiasis in Africa, we demonstrate the proof-of-concept that the establishment and running of a global NTD database is feasible and should be expanded without delay.
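
As a rough sketch of the kind of georeferenced survey store described above, the snippet below uses Python's built-in sqlite3 in place of the MySQL backend; the table, column names, and example record are illustrative assumptions, not the actual http://www.gntd.org schema or data.

```python
import sqlite3

# In-memory stand-in for the georeferenced survey database.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE survey (
        id INTEGER PRIMARY KEY,
        country TEXT,
        latitude REAL,
        longitude REAL,
        year INTEGER,
        n_examined INTEGER,
        n_positive INTEGER
    )
""")

# One hypothetical location-specific survey record.
conn.execute(
    "INSERT INTO survey (country, latitude, longitude, year, n_examined, n_positive) "
    "VALUES (?, ?, ?, ?, ?, ?)",
    ("Kenya", -0.0236, 37.9062, 2009, 250, 61),
)

# Per-location prevalence: the quantity spatial risk models consume.
row = conn.execute(
    "SELECT country, 1.0 * n_positive / n_examined FROM survey"
).fetchone()
print(row)
```

Storing raw counts rather than precomputed prevalences, as sketched here, lets downstream geostatistical models weight locations by sample size.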

  18. Open access intrapartum CTG database.

    PubMed

    Chudáček, Václav; Spilka, Jiří; Burša, Miroslav; Janků, Petr; Hruban, Lukáš; Huptych, Michal; Lhotská, Lenka

    2014-01-13

    Cardiotocography (CTG) is the monitoring of fetal heart rate and uterine contractions. Since the 1960s it has been routinely used by obstetricians to assess fetal well-being. Many attempts to introduce methods of automatic signal processing and evaluation have appeared during the last 20 years; however, no progress comparable to that in the domain of adult heart rate variability, where open access databases (e.g. MIT-BIH) are available, is yet visible. Based on a thorough review of the relevant publications, presented in this paper, the shortcomings of the current state are obvious. A lack of common ground for clinicians and technicians in the field hinders clinically usable progress. Our open access database of digital intrapartum cardiotocographic recordings aims to change that. The intrapartum CTG database consists of 552 intrapartum recordings, acquired between April 2010 and August 2012 at the obstetrics ward of the University Hospital in Brno, Czech Republic. All recordings were stored in electronic form in the OB TraceVue® system. The recordings were selected from 9164 intrapartum recordings with clinical as well as technical considerations in mind. All recordings are at most 90 minutes long and start a maximum of 90 minutes before delivery. The time relation of CTG to delivery is known, as is the length of the second stage of labor, which does not exceed 30 minutes. The majority of recordings (all but 46 cesarean sections) are, on purpose, from vaginal deliveries. All recordings have available biochemical markers as well as some more general clinical features. A full description of the database and the reasoning behind selection of the parameters is presented in the paper. A new open-access CTG database is introduced which should give the research community common ground for comparison of results on a reasonably large database. 
We anticipate that after reading the paper, the reader will understand the context of the field from clinical and technical perspectives, which will enable them to use the database and also understand its limitations.

  19. Virus taxonomy: the database of the International Committee on Taxonomy of Viruses (ICTV).

    PubMed

    Lefkowitz, Elliot J; Dempsey, Donald M; Hendrickson, Robert Curtis; Orton, Richard J; Siddell, Stuart G; Smith, Donald B

    2018-01-04

    The International Committee on Taxonomy of Viruses (ICTV) is charged with the task of developing, refining, and maintaining a universal virus taxonomy. This task encompasses the classification of virus species and higher-level taxa according to the genetic and biological properties of their members; naming virus taxa; maintaining a database detailing the currently approved taxonomy; and providing the database, supporting proposals, and other virus-related information from an open-access, public web site. The ICTV web site (http://ictv.global) provides access to the current taxonomy database in online and downloadable formats, and maintains a complete history of virus taxa back to the first release in 1971. The ICTV has also published the ICTV Report on Virus Taxonomy starting in 1971. This Report provides a comprehensive description of all virus taxa covering virus structure, genome structure, biology and phylogenetics. The ninth ICTV report, published in 2012, is available as an open-access online publication from the ICTV web site. The current, 10th report (http://ictv.global/report/), is being published online, and is replacing the previous hard-copy edition with a completely open access, continuously updated publication. No other database or resource exists that provides such a comprehensive, fully annotated compendium of information on virus taxa and taxonomy. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  20. 49 CFR 1104.3 - Copies.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... fully evaluate evidence, all spreadsheets must be fully accessible and manipulable. Electronic databases... Microsoft Open Database Connectivity (ODBC) standard. ODBC is a Windows technology that allows a database software package to import data from a database created using a different software package. We currently...

  1. 49 CFR 1104.3 - Copies.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... fully evaluate evidence, all spreadsheets must be fully accessible and manipulable. Electronic databases... Microsoft Open Database Connectivity (ODBC) standard. ODBC is a Windows technology that allows a database software package to import data from a database created using a different software package. We currently...

  2. 49 CFR 1104.3 - Copies.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... fully evaluate evidence, all spreadsheets must be fully accessible and manipulable. Electronic databases... Microsoft Open Database Connectivity (ODBC) standard. ODBC is a Windows technology that allows a database software package to import data from a database created using a different software package. We currently...

  3. 49 CFR 1104.3 - Copies.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... fully evaluate evidence, all spreadsheets must be fully accessible and manipulable. Electronic databases... Microsoft Open Database Connectivity (ODBC) standard. ODBC is a Windows technology that allows a database software package to import data from a database created using a different software package. We currently...

  4. "XANSONS for COD": a new small BOINC project in crystallography

    NASA Astrophysics Data System (ADS)

    Neverov, Vladislav S.; Khrapov, Nikolay P.

    2018-04-01

    "XANSONS for COD" (http://xansons4cod.com) is a new BOINC project aimed at creating an open-access database of simulated x-ray and neutron powder diffraction patterns for the nanocrystalline phase of materials from the collection of the Crystallography Open Database (COD). The project uses the original open-source software XaNSoNS to simulate diffraction patterns on CPU and GPU. This paper describes the scientific problem this project solves, the project's internal structure, its operating principles, and the organization of the final database.

  5. ZeBase: an open-source relational database for zebrafish laboratories.

    PubMed

    Hensley, Monica R; Hassenplug, Eric; McPhail, Rodney; Leung, Yuk Fai

    2012-03-01

    ZeBase is an open-source relational database for zebrafish inventory. It is designed for the recording of genetic, breeding, and survival information of fish lines maintained in a single- or multi-laboratory environment. Users can easily access ZeBase through standard web-browsers anywhere on a network. Convenient search and reporting functions are available to facilitate routine inventory work; such functions can also be automated by simple scripting. Optional barcode generation and scanning are also built-in for easy access to the information related to any fish. Further information of the database and an example implementation can be found at http://zebase.bio.purdue.edu.

  6. ReprDB and panDB: minimalist databases with maximal microbial representation.

    PubMed

    Zhou, Wei; Gay, Nicole; Oh, Julia

    2018-01-18

    Profiling of shotgun metagenomic samples is hindered by a lack of unified microbial reference genome databases that (i) assemble genomic information from all open access microbial genomes, (ii) have relatively small sizes, and (iii) are compatible with various metagenomic read mapping tools. Moreover, computational tools to rapidly compile and update such databases to accommodate the rapid increase in new reference genomes do not exist. As a result, database-guided analyses often fail to profile a substantial fraction of metagenomic shotgun sequencing reads from complex microbiomes. We report pipelines that efficiently traverse all open access microbial genomes and assemble non-redundant genomic information. The pipelines result in two species-resolution microbial reference databases of relatively small sizes: reprDB, which assembles microbial representative or reference genomes, and panDB, for which we developed a novel iterative alignment algorithm to identify and assemble non-redundant genomic regions in multiple sequenced strains. With the databases, we managed to assign taxonomic labels and genome positions to the majority of metagenomic reads from human skin and gut microbiomes, demonstrating a significant improvement over a previous database-guided analysis on the same datasets. reprDB and panDB leverage the rapid increases in the number of open access microbial genomes to more fully profile metagenomic samples. Additionally, the databases exclude redundant sequence information to avoid inflated storage or memory space and indexing or analyzing time. Finally, the novel iterative alignment algorithm significantly increases efficiency in pan-genome identification and can be useful in comparative genomic analyses.
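
The non-redundancy idea behind panDB (keep only genomic regions not already covered by the growing pan-genome) can be illustrated with a generic interval merge. This toy sketch is not the authors' iterative alignment algorithm; it only shows the bookkeeping of adding novel coordinates to an existing coverage set.

```python
def add_nonredundant(covered, new_regions):
    """Merge new (start, end) regions into a sorted, non-overlapping
    coverage list; return the updated list and the number of novel
    bases contributed by the new regions."""
    old_total = sum(e - s for s, e in covered)
    result = []
    for start, end in sorted(covered + new_regions):
        if result and start <= result[-1][1]:
            # Overlaps the previous interval: extend it instead of appending.
            result[-1] = (result[-1][0], max(result[-1][1], end))
        else:
            result.append((start, end))
    new_total = sum(e - s for s, e in result)
    return result, new_total - old_total

# Hypothetical coverage from earlier strains, then regions from a new strain.
covered = [(0, 100), (200, 300)]
covered, novel = add_nonredundant(covered, [(90, 150), (400, 450)])
print(covered, novel)  # [(0, 150), (200, 300), (400, 450)] 100
```

Iterating this step over each newly sequenced strain is what keeps the database size bounded: already-seen sequence contributes nothing, only the novel bases are appended.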

  7. Reasons to temper enthusiasm about open access nursing journals.

    PubMed

    de Jong, Gideon

    2017-04-01

    Open access is a relatively new phenomenon within nursing science. Several papers from various nursing journals have recently been published on the disadvantages of the traditional model of purchasing proprietary fee-based databases to access scholarly information. Only a few nursing scholars are less optimistic about the possible benefits of open access nursing journals. This paper offers a critical reflection on the merits and pitfalls of open access journals, drawing on insights from the literature and personal opinion. Two arguments are discussed that justify tempering enthusiasm about open access journals. First, only research groups with sufficient financial resources can publish in open access journals. Second, open access has conflicting incentives, where the aim is to expand production at the expense of publishing quality articles; a business model that fits well into a neoliberal discourse. There are valid reasons to criticise traditional publishers for the excessive cost of a single article, which prevents the dissemination of scholarly nursing information. However, the business model of open access publishers is no less imbued with the neoliberal tendency of lining one's pockets.

  8. The GraVent DDT database

    NASA Astrophysics Data System (ADS)

    Boeck, Lorenz R.; Katzy, Peter; Hasslberger, Josef; Kink, Andreas; Sattelmayer, Thomas

    2016-09-01

    An open-access online platform containing data from experiments on deflagration-to-detonation transition conducted at the Institute of Thermodynamics, Technical University of Munich, has been developed and is accessible at http://www.td.mw.tum.de/ddt. The database provides researchers working on explosion dynamics with data for theoretical analyses and for the validation of numerical simulations.

  9. Toward an Open-Access Global Database for Mapping, Control, and Surveillance of Neglected Tropical Diseases

    PubMed Central

    Hürlimann, Eveline; Schur, Nadine; Boutsika, Konstantina; Stensgaard, Anna-Sofie; Laserna de Himpsl, Maiti; Ziegelbauer, Kathrin; Laizer, Nassor; Camenzind, Lukas; Di Pasquale, Aurelio; Ekpo, Uwem F.; Simoonga, Christopher; Mushinge, Gabriel; Saarnak, Christopher F. L.; Utzinger, Jürg; Kristensen, Thomas K.; Vounatsou, Penelope

    2011-01-01

    Background After many years of general neglect, interest has grown and efforts are under way for the mapping, control, surveillance, and eventual elimination of neglected tropical diseases (NTDs). Disease risk estimates are a key feature to target control interventions, and serve as a benchmark for monitoring and evaluation. What is currently missing is a georeferenced global database for NTDs providing open access to the available survey data that is constantly updated and can be utilized by researchers and disease control managers to support other relevant stakeholders. We describe the steps taken toward the development of such a database that can be employed for spatial disease risk modeling and control of NTDs. Methodology With an emphasis on schistosomiasis in Africa, we systematically searched the literature (peer-reviewed journals and ‘grey literature’), contacted Ministries of Health and research institutions in schistosomiasis-endemic countries for location-specific prevalence data and survey details (e.g., study population, year of survey and diagnostic techniques). The data were extracted, georeferenced, and stored in a MySQL database with a web interface allowing free database access and data management. Principal Findings At the beginning of 2011, our database contained more than 12,000 georeferenced schistosomiasis survey locations from 35 African countries available under http://www.gntd.org. Currently, the database is being expanded into a global repository, including a host of other NTDs, e.g. soil-transmitted helminthiasis and leishmaniasis. Conclusions An open-access, spatially explicit NTD database offers unique opportunities for disease risk modeling, targeting control interventions, disease monitoring, and surveillance. Moreover, it allows for detailed geostatistical analyses of disease distribution in space and time. 
With an initial focus on schistosomiasis in Africa, we demonstrate the proof-of-concept that the establishment and running of a global NTD database is feasible and should be expanded without delay. PMID:22180793
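    At its core, the repository described above is a georeferenced relational table keyed by survey location. The sketch below builds a tiny in-memory analogue of such a table and derives the location-level prevalence figure used for risk mapping; the table layout, column names, and the sample record are hypothetical illustrations, not taken from the GNTD schema.

```python
import sqlite3

# Hypothetical georeferenced survey table (not the actual GNTD schema).
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE survey (
        id INTEGER PRIMARY KEY,
        country TEXT,
        latitude REAL,
        longitude REAL,
        year INTEGER,
        diagnostic TEXT,   -- diagnostic technique used in the survey
        examined INTEGER,  -- number of people examined
        positive INTEGER   -- number testing positive
    )
""")
conn.execute(
    "INSERT INTO survey VALUES (1, 'Kenya', -1.29, 36.82, 2009, "
    "'Kato-Katz', 250, 40)"
)

# Prevalence per location: the key quantity for spatial risk modeling.
row = conn.execute(
    "SELECT country, 100.0 * positive / examined FROM survey"
).fetchone()
print(row)  # ('Kenya', 16.0)
```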

  10. [The Open Access Initiative (OAI) in the scientific literature].

    PubMed

    Sánchez-Martín, Francisco M; Millán Rodríguez, Félix; Villavicencio Mavrich, Humberto

    2009-01-01

    According to the Budapest declaration, the Open Access Initiative (OAI) is defined as an editorial model in which access to scientific journal literature, and its use, are free. The free flow of information allowed by the Internet has been the basis of this initiative. The Bethesda and Berlin declarations, supported by several international agencies, propose to require researchers to deposit copies of all published articles in a self-archive or an open access repository, and to encourage researchers to publish their research papers in open access journals. This paper reviews the key aspects of the OAI, with its strengths and controversial points, and discusses the position of databases, search engines, and repositories of biomedical information, as well as the attitudes of scientists, publishers, and journals. To date, the journal Actas Urológicas Españolas (Act Urol Esp) offers its contents open access online in Spanish and English.

  11. Open Access Internet Resources for Nano-Materials Physics Education

    NASA Astrophysics Data System (ADS)

    Moeck, Peter; Seipel, Bjoern; Upreti, Girish; Harvey, Morgan; Garrick, Will

    2006-05-01

    Because a great deal of nano-material science and engineering relies on crystalline materials, materials physicists have to provide their own specific contributions to the National Nanotechnology Initiative. Here we briefly review two freely accessible internet-based crystallographic databases, the Nano-Crystallography Database (http://nanocrystallography.research.pdx.edu) and the Crystallography Open Database (http://crystallography.net). Information on over 34,000 full structure determinations is stored in these two databases in the Crystallographic Information File format. The availability of such crystallographic data on the internet in a standardized format allows for all kinds of web-based crystallographic calculations and visualizations. Two examples dealt with in this paper are interactive crystal structure visualizations in three dimensions, and calculations of lattice-fringe fingerprints for the identification of unknown nanocrystals from their atomic-resolution transmission electron microscopy images.

  12. Development of an Open Global Oil and Gas Infrastructure Inventory and Geodatabase

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rose, Kelly

    This submission contains a technical report describing the development process and visual graphics for the Global Oil and Gas Infrastructure database. Access the GOGI database using the following link: https://edx.netl.doe.gov/dataset/global-oil-gas-features-database

  13. Montreal Archive of Sleep Studies: an open-access resource for instrument benchmarking and exploratory research.

    PubMed

    O'Reilly, Christian; Gosselin, Nadia; Carrier, Julie; Nielsen, Tore

    2014-12-01

    Manual processing of sleep recordings is extremely time-consuming. Efforts to automate this process have shown promising results, but automatic systems are generally evaluated on private databases, not allowing accurate cross-validation with other systems. In lacking a common benchmark, the relative performances of different systems are not compared easily and advances are compromised. To address this fundamental methodological impediment to sleep study, we propose an open-access database of polysomnographic biosignals. To build this database, whole-night recordings from 200 participants [97 males (aged 42.9 ± 19.8 years) and 103 females (aged 38.3 ± 18.9 years); age range: 18-76 years] were pooled from eight different research protocols performed in three different hospital-based sleep laboratories. All recordings feature a sampling frequency of 256 Hz and an electroencephalography (EEG) montage of 4-20 channels plus standard electro-oculography (EOG), electromyography (EMG), electrocardiography (ECG) and respiratory signals. Access to the database can be obtained through the Montreal Archive of Sleep Studies (MASS) website (http://www.ceams-carsm.ca/en/MASS), and requires only affiliation with a research institution and prior approval by the applicant's local ethical review board. Providing the research community with access to this free and open sleep database is expected to facilitate the development and cross-validation of sleep analysis automation systems. It is also expected that such a shared resource will be a catalyst for cross-centre collaborations on difficult topics such as improving inter-rater agreement on sleep stage scoring. © 2014 European Sleep Research Society.
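    Recordings at a fixed sampling rate like the 256 Hz used in MASS are conventionally scored in 30-second epochs. The following sketch shows the basic slicing arithmetic on a synthetic placeholder signal, not on MASS data:

```python
# Slicing one channel of a fixed-rate recording into 30-second epochs.
FS = 256                      # Hz, the sampling frequency used in MASS
EPOCH_SEC = 30                # standard sleep-scoring epoch length
signal = [0.0] * (FS * 3600)  # one synthetic hour of a single channel

spe = FS * EPOCH_SEC          # samples per epoch: 7680
epochs = [signal[i:i + spe] for i in range(0, len(signal), spe)]
print(len(epochs), len(epochs[0]))  # 120 7680
```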

  14. Global Oil & Gas Features Database

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kelly Rose; Jennifer Bauer; Vic Baker

    This submission contains a zip file with the developed Global Oil & Gas Features Database (as an ArcGIS geodatabase). Access the technical report describing how this database was produced using the following link: https://edx.netl.doe.gov/dataset/development-of-an-open-global-oil-and-gas-infrastructure-inventory-and-geodatabase

  15. Concierge: Personal Database Software for Managing Digital Research Resources

    PubMed Central

    Sakai, Hiroyuki; Aoyama, Toshihiro; Yamaji, Kazutsuna; Usui, Shiro

    2007-01-01

    This article introduces a desktop application, named Concierge, for managing personal digital research resources. Using simple operations, it enables storage of various types of files and indexes them based on content descriptions. A key feature of the software is a high level of extensibility. By installing optional plug-ins, users can customize and extend the usability of the software based on their needs. In this paper, we also introduce a few optional plug-ins: literature management, electronic laboratory notebook, and XooNlps client plug-ins. XooNIps is a content management system developed to share digital research resources among neuroscience communities. It has been adopted as the standard database system in Japanese neuroinformatics projects. Concierge, therefore, offers comprehensive support from management of personal digital research resources to their sharing in open-access neuroinformatics databases such as XooNIps. This interaction between personal and open-access neuroinformatics databases is expected to enhance the dissemination of digital research resources. Concierge is developed as an open source project; Mac OS X and Windows XP versions have been released at the official site (http://concierge.sourceforge.jp). PMID:18974800

  16. Earth science big data at users' fingertips: the EarthServer Science Gateway Mobile

    NASA Astrophysics Data System (ADS)

    Barbera, Roberto; Bruno, Riccardo; Calanducci, Antonio; Fargetta, Marco; Pappalardo, Marco; Rundo, Francesco

    2014-05-01

    The EarthServer project (www.earthserver.eu), funded by the European Commission under its Seventh Framework Program, aims at establishing open access and ad-hoc analytics on extreme-size Earth Science data, based on and extending leading-edge Array Database technology. The core idea is to use database query languages as client/server interface to achieve barrier-free "mix & match" access to multi-source, any-size, multi-dimensional space-time data -- in short: "Big Earth Data Analytics" - based on the open standards of the Open Geospatial Consortium Web Coverage Processing Service (OGC WCPS) and the W3C XQuery. EarthServer combines both, thereby achieving a tight data/metadata integration. Further, the rasdaman Array Database System (www.rasdaman.com) is extended with further space-time coverage data types. On server side, highly effective optimizations - such as parallel and distributed query processing - ensure scalability to Exabyte volumes. In this contribution we will report on the EarthServer Science Gateway Mobile, an app for both iOS and Android-based devices that allows users to seamlessly access some of the EarthServer applications using SAML-based federated authentication and fine-grained authorisation mechanisms.

  17. MetaboLights: An Open-Access Database Repository for Metabolomics Data.

    PubMed

    Kale, Namrata S; Haug, Kenneth; Conesa, Pablo; Jayseelan, Kalaivani; Moreno, Pablo; Rocca-Serra, Philippe; Nainala, Venkata Chandrasekhar; Spicer, Rachel A; Williams, Mark; Li, Xuefei; Salek, Reza M; Griffin, Julian L; Steinbeck, Christoph

    2016-03-24

    MetaboLights is the first general purpose, open-access database repository for cross-platform and cross-species metabolomics research at the European Bioinformatics Institute (EMBL-EBI). Based upon the open-source ISA framework, MetaboLights provides Metabolomics Standard Initiative (MSI) compliant metadata and raw experimental data associated with metabolomics experiments. Users can upload their study datasets into the MetaboLights Repository. These studies are then automatically assigned a stable and unique identifier (e.g., MTBLS1) that can be used for publication reference. The MetaboLights Reference Layer associates metabolites with metabolomics studies in the archive and is extensively annotated with data fields such as structural and chemical information, NMR and MS spectra, target species, metabolic pathways, and reactions. The database is manually curated with no specific release schedules. MetaboLights is also recommended by journals for metabolomics data deposition. This unit provides a guide to using MetaboLights, downloading experimental data, and depositing metabolomics datasets using user-friendly submission tools. Copyright © 2016 John Wiley & Sons, Inc.

  18. Wireless access to a pharmaceutical database: a demonstrator for data driven Wireless Application Protocol (WAP) applications in medical information processing.

    PubMed

    Schacht Hansen, M; Dørup, J

    2001-01-01

    The Wireless Application Protocol technology implemented in newer mobile phones has built-in facilities for handling much of the information processing needed in clinical work. To test a practical approach we ported a relational database of the Danish pharmaceutical catalogue to Wireless Application Protocol using open source freeware at all steps. We used Apache 1.3 web software on a Linux server. Data containing the Danish pharmaceutical catalogue were imported from an ASCII file into a MySQL 3.22.32 database using a Practical Extraction and Report Language script for easy update of the database. Data were distributed in 35 interrelated tables. Each pharmaceutical brand name was given its own card with links to general information about the drug, active substances, contraindications etc. Access was available through 1) browsing therapeutic groups and 2) searching for a brand name. The database interface was programmed in the server-side scripting language PHP3. A free, open source Wireless Application Protocol gateway to a pharmaceutical catalogue was established to allow dial-in access independent of commercial Wireless Application Protocol service providers. The application was tested on the Nokia 7110 and Ericsson R320s cellular phones. We have demonstrated that Wireless Application Protocol-based access to a dynamic clinical database can be established using open source freeware. The project opens perspectives for a further integration of Wireless Application Protocol phone functions in clinical information processing: Global System for Mobile communication telephony for bilateral communication, asynchronous unilateral communication via e-mail and Short Message Service, built-in calculator, calendar, personal organizer, phone number catalogue and Dictaphone function via answering machine technology. An independent Wireless Application Protocol gateway may be placed within hospital firewalls, which may be an advantage with respect to security. 
However, if Wireless Application Protocol phones are to become effective tools for physicians, special attention must be paid to the limitations of the devices. Input tools of Wireless Application Protocol phones should be improved, for instance by increased use of speech control.
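    The import step described above (an ASCII catalogue loaded into a relational database via a Perl script) can be sketched in miniature. The version below substitutes Python and SQLite for Perl and MySQL, and the field names and sample row are invented for illustration:

```python
import csv
import io
import sqlite3

# A two-line stand-in for the delimited ASCII catalogue file;
# the field names and the drug record are hypothetical.
catalogue = io.StringIO(
    "brand;substance;contraindication\n"
    "ExampleCaps;examplum;pregnancy\n"
)

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE drug (brand TEXT, substance TEXT, contraindication TEXT)"
)
# Bulk-load the parsed rows; DictReader yields one dict per record,
# matching the named placeholders in the INSERT statement.
reader = csv.DictReader(catalogue, delimiter=";")
conn.executemany(
    "INSERT INTO drug VALUES (:brand, :substance, :contraindication)",
    reader,
)

# Lookup by brand name, mirroring the per-brand WAP card.
row = conn.execute(
    "SELECT substance FROM drug WHERE brand = 'ExampleCaps'"
).fetchone()
print(row[0])  # examplum
```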

  19. Wireless access to a pharmaceutical database: A demonstrator for data driven Wireless Application Protocol applications in medical information processing

    PubMed Central

    Hansen, Michael Schacht

    2001-01-01

    Background The Wireless Application Protocol technology implemented in newer mobile phones has built-in facilities for handling much of the information processing needed in clinical work. Objectives To test a practical approach we ported a relational database of the Danish pharmaceutical catalogue to Wireless Application Protocol using open source freeware at all steps. Methods We used Apache 1.3 web software on a Linux server. Data containing the Danish pharmaceutical catalogue were imported from an ASCII file into a MySQL 3.22.32 database using a Practical Extraction and Report Language script for easy update of the database. Data were distributed in 35 interrelated tables. Each pharmaceutical brand name was given its own card with links to general information about the drug, active substances, contraindications etc. Access was available through 1) browsing therapeutic groups and 2) searching for a brand name. The database interface was programmed in the server-side scripting language PHP3. Results A free, open source Wireless Application Protocol gateway to a pharmaceutical catalogue was established to allow dial-in access independent of commercial Wireless Application Protocol service providers. The application was tested on the Nokia 7110 and Ericsson R320s cellular phones. Conclusions We have demonstrated that Wireless Application Protocol-based access to a dynamic clinical database can be established using open source freeware. The project opens perspectives for a further integration of Wireless Application Protocol phone functions in clinical information processing: Global System for Mobile communication telephony for bilateral communication, asynchronous unilateral communication via e-mail and Short Message Service, built-in calculator, calendar, personal organizer, phone number catalogue and Dictaphone function via answering machine technology. 
An independent Wireless Application Protocol gateway may be placed within hospital firewalls, which may be an advantage with respect to security. However, if Wireless Application Protocol phones are to become effective tools for physicians, special attention must be paid to the limitations of the devices. Input tools of Wireless Application Protocol phones should be improved, for instance by increased use of speech control. PMID:11720946

  20. Prevalence and Citation Advantage of Gold Open Access in the Subject Areas of the Scopus Database

    ERIC Educational Resources Information Center

    Dorta-González, Pablo; Santana-Jiménez, Yolanda

    2018-01-01

    The potential benefit of open access (OA) in relation to citation impact has been discussed in the literature in depth. The methodology used to test the OA citation advantage includes comparing OA vs. non-OA journal impact factors and citations of OA vs. non-OA articles published in the same non-OA journals. However, one problem with many studies…

  1. Overview of open resources to support automated structure verification and elucidation

    EPA Science Inventory

    Cheminformatics methods form an essential basis for providing analytical scientists with access to data, algorithms and workflows. There are an increasing number of free online databases (compound databases, spectral libraries, data repositories) and a rich collection of software...

  2. Open exchange of scientific knowledge and European copyright: The case of biodiversity information

    PubMed Central

    Egloff, Willi; Patterson, David J.; Agosti, Donat; Hagedorn, Gregor

    2014-01-01

    Abstract Background. The 7th Framework Programme for Research and Technological Development is helping the European Union to prepare for an integrative system for intelligent management of biodiversity knowledge. The infrastructure that is envisaged and that will be further developed within the Programme “Horizon 2020” aims to provide open and free access to taxonomic information to anyone with a requirement for biodiversity data, without the need for individual consent of other persons or institutions. Open and free access to information will foster the re-use and improve the quality of data, will accelerate research, and will promote new types of research. Progress towards the goal of free and open access to content is hampered by numerous technical, economic, sociological, legal, and other factors. The present article addresses barriers to the open exchange of biodiversity knowledge that arise from European laws, in particular European legislation on copyright and database protection rights. We present a legal point of view as to what will be needed to bring distributed information together and facilitate its re-use by data mining, integration into semantic knowledge systems, and similar techniques. We address exceptions and limitations of copyright or database protection within Europe, and we point to the importance of data use agreements. We illustrate how exceptions and limitations have been transformed into national legislations within some European states to create inconsistencies that impede access to biodiversity information. Conclusions. The legal situation within the EU is unsatisfactory because there are inconsistencies among states that hamper the deployment of an open biodiversity knowledge management system. Scientists within the EU who work with copyright protected works or with protected databases have to be aware of regulations that vary from country to country. 
This is a major stumbling block to international collaboration and is an impediment to the open exchange of biodiversity knowledge. Such differences should be removed by unifying exceptions and limitations for research purposes in a binding, Europe-wide regulation. PMID:25009418

  3. Java Web Simulation (JWS); a web based database of kinetic models.

    PubMed

    Snoep, J L; Olivier, B G

    2002-01-01

    Software to make a database of kinetic models accessible via the internet has been developed and a core database has been set up at http://jjj.biochem.sun.ac.za/. This repository of models, available to everyone with internet access, opens a whole new way in which we can make our models public. Via the database, a user can change enzyme parameters and run time simulations or steady state analyses. The interface is user friendly and no additional software is necessary. The database currently contains 10 models, but since the generation of the program code to include new models has largely been automated the addition of new models is straightforward and people are invited to submit their models to be included in the database.
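    The kind of time simulation a JWS model exposes through its web interface can be sketched as a single Michaelis-Menten reaction integrated with an explicit Euler step. The parameter values below are illustrative only, not taken from any model in the repository:

```python
# Hypothetical enzyme parameters of the kind a JWS user could edit.
Vmax, Km = 1.0, 0.5   # maximal rate and Michaelis constant
s, dt = 2.0, 0.001    # initial substrate concentration and time step

# Explicit Euler integration of dS/dt = -Vmax * S / (Km + S).
for _ in range(int(5.0 / dt)):  # simulate 5 time units
    v = Vmax * s / (Km + s)     # Michaelis-Menten rate law
    s = max(s - v * dt, 0.0)    # substrate cannot go negative

print(f"substrate remaining: {s:.4f}")
```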

  4. An open access thyroid ultrasound image database

    NASA Astrophysics Data System (ADS)

    Pedraza, Lina; Vargas, Carlos; Narváez, Fabián.; Durán, Oscar; Muñoz, Emma; Romero, Eduardo

    2015-01-01

    Computer aided diagnosis (CAD) systems have been developed to assist radiologists in the detection and diagnosis of abnormalities, and a large number of pattern recognition techniques have been proposed to obtain a second opinion. Most of these strategies have been evaluated using different datasets, making their performances incomparable. In this work, an open access database of thyroid ultrasound images is presented. The dataset consists of a set of B-mode ultrasound images, including a complete annotation and diagnostic description of suspicious thyroid lesions by expert radiologists. Several types of lesions, such as thyroiditis, cystic nodules, adenomas and thyroid cancers, were included, and an accurate lesion delineation is provided in XML format. The diagnostic description of malignant lesions was confirmed by biopsy. The proposed new database is expected to be a resource for the community to assess different CAD systems.

  5. Development of a data entry auditing protocol and quality assurance for a tissue bank database.

    PubMed

    Khushi, Matloob; Carpenter, Jane E; Balleine, Rosemary L; Clarke, Christine L

    2012-03-01

    Human transcription error is an acknowledged risk when extracting information from paper records for entry into a database. For a tissue bank, it is critical that accurate data are provided to researchers with approved access to tissue bank material. The challenges of tissue bank data collection include manual extraction of data from complex medical reports that are accessed from a number of sources and that differ in style and layout. As a quality assurance measure, the Breast Cancer Tissue Bank (http://www.abctb.org.au) has implemented an auditing protocol and, in order to execute the process efficiently, has developed an open source database plug-in tool (eAuditor) to assist in auditing of data held in our tissue bank database. Using eAuditor, we have identified that human entry errors range from 0.01% when entering donors' clinical follow-up details, to 0.53% when entering pathological details, highlighting the importance of an audit protocol tool such as eAuditor in a tissue bank database. eAuditor was developed and tested on the Caisis open source clinical-research database; however, it can be integrated in other databases where similar functionality is required.
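    The audit such a tool performs amounts to a field-by-field comparison of database entries against the source reports, summarised as an error rate like the 0.01%-0.53% figures reported above. A minimal sketch, with made-up records and field names rather than the actual tissue bank schema:

```python
# Database entries as transcribed by staff (hypothetical records).
entered = [
    {"er_status": "pos", "grade": "2", "tumour_mm": "14"},
    {"er_status": "neg", "grade": "3", "tumour_mm": "22"},
]
# The same fields re-read from the source pathology reports.
source = [
    {"er_status": "pos", "grade": "2", "tumour_mm": "14"},
    {"er_status": "neg", "grade": "3", "tumour_mm": "21"},  # one mismatch
]

checked = mismatches = 0
for db_rec, src_rec in zip(entered, source):
    for field, value in src_rec.items():
        checked += 1
        mismatches += db_rec[field] != value  # bool counts as 0 or 1

rate = 100.0 * mismatches / checked
print(f"errors: {mismatches}/{checked} fields ({rate:.2f}%)")
```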

  6. Cloud-Based Distributed Control of Unmanned Systems

    DTIC Science & Technology

    2015-04-01

    during mission execution. At best, the data is saved onto hard drives and is accessible only by the local team. Data history in a form available and...following open source technologies: GeoServer, OpenLayers, PostgreSQL, and PostGIS are chosen to implement the back-end database and server. A brief...geospatial map data. 3. PostgreSQL: An SQL-compliant object-relational database that easily scales to accommodate large amounts of data - upwards to

  7. Cloud storage based mobile assessment facility for patients with post-traumatic stress disorder using integrated signal processing algorithm

    NASA Astrophysics Data System (ADS)

    Balbin, Jessie R.; Pinugu, Jasmine Nadja J.; Basco, Abigail Joy S.; Cabanada, Myla B.; Gonzales, Patrisha Melrose V.; Marasigan, Juan Carlos C.

    2017-06-01

    The research aims to build a tool for assessing patients for post-traumatic stress disorder, or PTSD. The parameters used are heart rate, skin conductivity, and facial gestures. Facial gestures are recorded using OpenFace, an open-source face recognition program that uses facial action units to track facial movements. Heart rate and skin conductivity are measured through sensors operated using a Raspberry Pi. Results are stored in a database for easy and quick access. The databases are uploaded to a cloud platform so that doctors have direct access to the data. This research aims to analyze these parameters and give an accurate assessment of the patient.

  8. Pan European Phenological database (PEP725): a single point of access for European data.

    PubMed

    Templ, Barbara; Koch, Elisabeth; Bolmgren, Kjell; Ungersböck, Markus; Paul, Anita; Scheifinger, Helfried; Rutishauser, This; Busto, Montserrat; Chmielewski, Frank-M; Hájková, Lenka; Hodzić, Sabina; Kaspar, Frank; Pietragalla, Barbara; Romero-Fresneda, Ramiro; Tolvanen, Anne; Vučetič, Višnja; Zimmermann, Kirsten; Zust, Ana

    2018-06-01

    The Pan European Phenology (PEP) project is a European infrastructure to promote and facilitate phenological research, education, and environmental monitoring. The main objective is to maintain and develop a Pan European Phenological database (PEP725) with open, unrestricted data access for science and education. PEP725 is the successor of the database developed through the COST action 725, "Establishing a European phenological data platform for climatological applications", and serves as a single point of access for European-wide plant phenological data. So far, 32 European meteorological services and project partners from across Europe have joined and supplied data collected by volunteers from 1868 to the present for the PEP725 database. Most of the partners actively provide data on a regular basis. The database presently holds almost 12 million records, covering about 46 growth stages and 265 plant species (including cultivars), and can be accessed via http://www.pep725.eu/. Users of the PEP725 database have studied a diversity of topics ranging from climate change impact, plant physiological questions, phenological modeling, and remote sensing of vegetation to ecosystem productivity.

  9. Pan European Phenological database (PEP725): a single point of access for European data

    NASA Astrophysics Data System (ADS)

    Templ, Barbara; Koch, Elisabeth; Bolmgren, Kjell; Ungersböck, Markus; Paul, Anita; Scheifinger, Helfried; Rutishauser, This; Busto, Montserrat; Chmielewski, Frank-M.; Hájková, Lenka; Hodzić, Sabina; Kaspar, Frank; Pietragalla, Barbara; Romero-Fresneda, Ramiro; Tolvanen, Anne; Vučetič, Višnja; Zimmermann, Kirsten; Zust, Ana

    2018-02-01

    The Pan European Phenology (PEP) project is a European infrastructure to promote and facilitate phenological research, education, and environmental monitoring. The main objective is to maintain and develop a Pan European Phenological database (PEP725) with open, unrestricted data access for science and education. PEP725 is the successor of the database developed through the COST action 725, "Establishing a European phenological data platform for climatological applications", and serves as a single point of access for European-wide plant phenological data. So far, 32 European meteorological services and project partners from across Europe have joined and supplied data collected by volunteers from 1868 to the present for the PEP725 database. Most of the partners actively provide data on a regular basis. The database presently holds almost 12 million records, covering about 46 growth stages and 265 plant species (including cultivars), and can be accessed via http://www.pep725.eu/. Users of the PEP725 database have studied a diversity of topics ranging from climate change impact, plant physiological questions, phenological modeling, and remote sensing of vegetation to ecosystem productivity.

  10. ScienceCentral: open access full-text archive of scientific journals based on Journal Article Tag Suite regardless of their languages.

    PubMed

    Huh, Sun

    2013-01-01

    ScienceCentral, a free or open access full-text archive of scientific journal literature at the Korean Federation of Science and Technology Societies, was under test in September 2013. Since it is a Journal Article Tag Suite-based full-text database, extensible markup language files in any language can be presented, according to Unicode Transformation Format 8-bit (UTF-8) encoding. It is comparable to PubMed Central; however, there are two distinct differences. First, its scope comprises all fields of science; second, it accepts journals in all languages. Launching ScienceCentral is the first step for free access or open access academic scientific journals of all languages to leap to the world, including scientific journals from Croatia.

  11. Hydrocarbon Spectral Database

    National Institute of Standards and Technology Data Gateway

    SRD 115 Hydrocarbon Spectral Database (Web, free access)   All of the rotational spectral lines observed and reported in the open literature for 91 hydrocarbon molecules have been tabulated. The isotopic molecular species, assigned quantum numbers, observed frequency, estimated measurement uncertainty, and reference are given for each transition reported.

  12. Diatomic Spectral Database

    National Institute of Standards and Technology Data Gateway

    SRD 114 Diatomic Spectral Database (Web, free access)   All of the rotational spectral lines observed and reported in the open literature for 121 diatomic molecules have been tabulated. The isotopic molecular species, assigned quantum numbers, observed frequency, estimated measurement uncertainty, and reference are given for each transition reported.

  13. Triatomic Spectral Database

    National Institute of Standards and Technology Data Gateway

    SRD 117 Triatomic Spectral Database (Web, free access)   All of the rotational spectral lines observed and reported in the open literature for 55 triatomic molecules have been tabulated. The isotopic molecular species, assigned quantum numbers, observed frequency, estimated measurement uncertainty, and reference are given for each transition reported.
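    The three NIST spectral databases above share a common record structure (isotopic species, assigned quantum numbers, observed frequency, measurement uncertainty, reference). A minimal parsing sketch, assuming a hypothetical whitespace-delimited layout rather than the actual NIST file format:

```python
# Sketch: parsing one rotational-transition record of the kind the NIST
# spectral databases tabulate. The column layout here is a hypothetical
# illustration, not the actual NIST file format.
from dataclasses import dataclass

@dataclass
class Transition:
    species: str        # isotopic molecular species, e.g. "12C16O"
    qn_upper: int       # assigned rotational quantum number J' (upper state)
    qn_lower: int       # J'' (lower state)
    freq_mhz: float     # observed frequency in MHz
    unc_mhz: float      # estimated measurement uncertainty in MHz
    reference: str      # literature reference tag (hypothetical label)

def parse_transition(line: str) -> Transition:
    """Parse a whitespace-delimited record (assumed layout)."""
    species, ju, jl, freq, unc, ref = line.split()
    return Transition(species, int(ju), int(jl), float(freq), float(unc), ref)

# The CO J=1-0 line at 115271.2018 MHz, as such a record might appear:
rec = parse_transition("12C16O 1 0 115271.2018 0.0005 ref1")
```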

  14. Beyond Chemical Literature: Developing Skills for Chemical Research Literacy

    ERIC Educational Resources Information Center

    Jensen, Dell, Jr.; Narske, Richard; Ghinazzi, Connie

    2010-01-01

    With the growing availability of electronic databases, online journal publications, and open-access publishing, there is unprecedented access to research materials. Increasingly, these materials are being incorporated into chemistry curricula and being used by undergraduate students in literature research. Internet savvy students can effectively…

  15. Data Architecture in an Open Systems Environment.

    ERIC Educational Resources Information Center

    Bernbom, Gerald; Cromwell, Dennis

    1993-01-01

    The conceptual basis for structured data architecture, and its integration with open systems technology at Indiana University, are described. Key strategic goals guiding these efforts are discussed: commitment to improved data access; migration to relational database technology, and deployment of a high-speed, multiprotocol network; and…

  16. The CHARA Array Database

    NASA Astrophysics Data System (ADS)

    Jones, Jeremy; Schaefer, Gail; ten Brummelaar, Theo; Gies, Douglas; Farrington, Christopher

    2018-01-01

    We are building a searchable database for the CHARA Array data archive. The Array consists of six telescopes linked together as an interferometer, providing sub-milliarcsecond resolution in the optical and near-infrared. The Array enables a variety of scientific studies, including measuring stellar angular diameters, imaging stellar shapes and surface features, mapping the orbits of close binary companions, and resolving circumstellar environments. This database is one component of an NSF/MSIP funded program to provide open access to the CHARA Array to the broader astronomical community. This archive goes back to 2004 and covers all the beam combiners on the Array. We discuss the current status of and future plans for the public database, and give directions on how to access it.

  17. Harnessing the wealth of Chinese scientific literature: schistosomiasis research and control in China

    PubMed Central

    Liu, Qin; Tian, Li-Guang; Xiao, Shu-Hua; Qi, Zhen; Steinmann, Peter; Mak, Tippi K; Utzinger, Jürg; Zhou, Xiao-Nong

    2008-01-01

    The economy of China continues to boom, and so have its biomedical research and related publishing activities. Several so-called neglected tropical diseases that are most common in the developing world are still rampant, or even emerging, in some parts of China. The purpose of this article is to document the significant research potential of the Chinese biomedical bibliographic databases. The research contributions from China in the epidemiology and control of schistosomiasis provide an excellent illustration. We searched two widely used databases, namely China National Knowledge Infrastructure (CNKI) and VIP Information (VIP). Employing the keyword "Schistosoma" and covering the period 1990–2006, we obtained 10,244 hits in the CNKI database and 5,975 in VIP. We examined the 10 Chinese biomedical journals that published the highest number of original research articles on schistosomiasis, considering issues including language and open access. Although most of the journals are published in Chinese, English abstracts are usually available. Open access to full articles was available in China Tropical Medicine in 2005/2006 and has been granted by the Chinese Journal of Parasitology and Parasitic Diseases since 2003; none of the other journals examined offered open access. We reviewed (i) the discovery and development of antischistosomal drugs, (ii) the progress made with molluscicides and (iii) environmental management for schistosomiasis control in China over the past 20 years. In conclusion, significant research is published in the Chinese literature that is relevant for local control measures and global scientific knowledge. Open access should be encouraged and language barriers removed so that the wealth of Chinese research can be more fully appreciated by the scientific community. PMID:18826598

  18. Future mobile access for open-data platforms and the BBC-DaaS system

    NASA Astrophysics Data System (ADS)

    Edlich, Stefan; Singh, Sonam; Pfennigstorf, Ingo

    2013-03-01

    In this paper, we develop an open data platform on multimedia devices to act as a marketplace of data for information seekers and data providers. We explore the important aspects of a Data-as-a-Service (DaaS) offering in the cloud with a mobile access point. The basis of the DaaS service is to act as a marketplace for information, utilizing new technologies and recent scalable polyglot architectures based on NoSQL databases. Whereas open-data platforms are beginning to be widely accepted, their mobile use is not. We compare similar products, their approaches, and possible mobile usage. We discuss several approaches to mobile access, namely a native app, HTML5, and a mobile-first approach, together with several frontend presentation techniques. Big data visualization itself is in its early days, and we explore some possibilities for making big data and open data accessible to mobile users.

  19. ScienceCentral: open access full-text archive of scientific journals based on Journal Article Tag Suite regardless of their languages

    PubMed Central

    Huh, Sun

    2013-01-01

    ScienceCentral, a free, open-access, full-text archive of scientific journal literature at the Korean Federation of Science and Technology Societies, was under test in September 2013. Since it is a Journal Article Tag Suite-based full-text database, extensible markup language files in any language can be presented, using Unicode Transformation Format 8-bit (UTF-8) encoding. It is comparable to PubMed Central; however, there are two distinct differences: first, its scope comprises all science fields; second, it accepts journals in all languages. Launching ScienceCentral is a first step toward bringing free-access or open-access academic scientific journals of all languages, including scientific journals from Croatia, to the world. PMID:24266292

  20. Ginseng Genome Database: an open-access platform for genomics of Panax ginseng.

    PubMed

    Jayakodi, Murukarthick; Choi, Beom-Soon; Lee, Sang-Choon; Kim, Nam-Hoon; Park, Jee Young; Jang, Woojong; Lakshmanan, Meiyappan; Mohan, Shobhana V G; Lee, Dong-Yup; Yang, Tae-Jin

    2018-04-12

    Ginseng (Panax ginseng C.A. Meyer) is a perennial herbaceous plant that has been used in traditional oriental medicine for thousands of years. Ginsenosides, which have significant pharmacological effects on human health, are the foremost bioactive constituents of this plant. Given the importance of this plant to humans, an integrated omics resource is indispensable to facilitate genomic research, molecular breeding and pharmacological study of this herb. The first draft genome sequences of P. ginseng cultivar "Chunpoong" were reported recently. Here, using the draft genome, transcriptome, and functional annotation datasets of P. ginseng, we have constructed the Ginseng Genome Database, the first open-access platform to provide comprehensive genomic resources for P. ginseng. The current version of this database provides the most up-to-date draft genome sequence (approximately 3000 Mbp of scaffold sequences) along with structural and functional annotations for 59,352 genes and digital expression of genes based on transcriptome data from different tissues, growth stages and treatments. In addition, tools for visualization and the genomic data from various analyses are provided. All data in the database were manually curated and integrated within a user-friendly query page. This database provides valuable resources for a range of research fields related to P. ginseng and other species of the order Apiales, as well as for plant research communities in general. The Ginseng Genome Database can be accessed at http://ginsengdb.snu.ac.kr/.

  1. Creating Access to Data of Worldwide Volcanic Unrest

    NASA Astrophysics Data System (ADS)

    Venezky, D. Y.; Newhall, C. G.; Malone, S. D.

    2003-12-01

    We are creating a pilot database (WOVOdat - the World Organization of Volcano Observatories database) using an open-source database and content-generation software, allowing web access to data of worldwide volcanic seismicity, ground deformation, fumarolic activity, and other changes within or adjacent to a volcanic system. After three years of discussions with volcano observatories of the WOVO community and institutional databases such as IRIS, UNAVCO, and the Smithsonian's Global Volcanism Program about how to link global data of volcanic unrest for use during crisis situations and for research, we are now developing the pilot database. We have already created the core tables and have written simple queries that access some of the available data using pull-down menus on a website. Over the next year, we plan to complete schema realization, expand querying capabilities, and then open the pilot database for a multi-year data-loading process. Many of the challenges we are encountering are common to multidisciplinary projects and include determining standard data formats, choosing levels of data detail (raw vs. minimally processed data, summary intervals vs. continuous data, etc.), and organizing the extant but variable data into a usable schema. Additionally, we are working on how best to enter the varied data into the database (scripts for digital data and web-entry tools for non-digital data) and what standard sets of queries are most important. An essential query during an evolving volcanic crisis would be: "Has any volcano shown the behavior being observed here, and what happened?". We believe that with a systematic aggregation of all datasets on volcanic unrest, we should be able to find patterns that were previously inaccessible or unrecognized. The second WOVOdat workshop in 2002 provided a recent forum for discussion of data formats, database access, and schemas.
The formats and units for the discussed parameters can be viewed at http://www.wovo.org/WOVOdat/parameters.htm. Comments, suggestions, and participation in all aspects of the WOVOdat project are welcome and appreciated.
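    The "core tables" and crisis-time queries described above can be sketched in miniature with SQLite; the table and column names below are illustrative assumptions, not the actual WOVOdat schema:

```python
# Miniature sketch of a multi-observatory unrest database in the spirit of
# WOVOdat, using SQLite. Schema, names, and sample values are hypothetical.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE volcano (
    volcano_id INTEGER PRIMARY KEY,
    name       TEXT NOT NULL,
    latitude   REAL,
    longitude  REAL
);
CREATE TABLE seismic_event (
    event_id   INTEGER PRIMARY KEY,
    volcano_id INTEGER REFERENCES volcano(volcano_id),
    event_time TEXT,   -- ISO 8601; raw vs. summarized detail is a schema choice
    magnitude  REAL
);
""")
conn.execute("INSERT INTO volcano VALUES (1, 'Pinatubo', 15.13, 120.35)")
conn.execute("INSERT INTO seismic_event VALUES (1, 1, '1991-06-10T04:00:00', 2.4)")

# The kind of query the abstract motivates: which volcanoes have shown
# seismicity above a given magnitude, and how often?
rows = conn.execute("""
    SELECT v.name, COUNT(*)
    FROM seismic_event s
    JOIN volcano v USING (volcano_id)
    WHERE s.magnitude >= 2.0
    GROUP BY v.name
""").fetchall()
```

    A real schema would add tables for deformation, gas/fumarolic observations, and instrument metadata, which is exactly the multidisciplinary standardization challenge the abstract describes.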

  2. An open experimental database for exploring inorganic materials

    DOE PAGES

    Zakutayev, Andriy; Wunder, Nick; Schwarting, Marcus; ...

    2018-04-03

    The use of advanced machine learning algorithms in experimental materials science is limited by the lack of sufficiently large and diverse datasets amenable to data mining. If publicly open, such data resources would also enable materials research by scientists without access to expensive experimental equipment. Here, we report on our progress towards a publicly open High Throughput Experimental Materials (HTEM) Database (htem.nrel.gov). This database currently contains 140,000 sample entries, characterized by structural (100,000), synthetic (80,000), chemical (70,000), and optoelectronic (50,000) properties of inorganic thin film materials, grouped into >4,000 sample libraries across >100 materials systems; more than half of these data are publicly available. This article shows how the HTEM database may enable scientists to explore materials by browsing a web-based user interface and through an application programming interface. This paper also describes the high-throughput experimental (HTE) approach to generating materials data, and discusses the laboratory information management system (LIMS) that underpins the HTEM database. Finally, this manuscript illustrates how advanced machine learning algorithms can be applied to materials science problems using this open data resource.
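    Programmatic access of the kind the abstract describes might look as follows; the endpoint path and query parameters are illustrative assumptions, not the documented HTEM API, and only the URL construction is shown (no request is sent):

```python
# Sketch of building a query URL for a materials-database web API such as
# the one HTEM exposes. The "/api/samples" path and the parameter names are
# hypothetical placeholders, not the documented HTEM endpoints.
from urllib.parse import urlencode, urlunsplit

def build_query(host: str, path: str, **params) -> str:
    """Assemble an HTTPS query URL with sorted, percent-encoded parameters."""
    return urlunsplit(("https", host, path, urlencode(sorted(params.items())), ""))

url = build_query("htem.nrel.gov", "/api/samples",
                  element="Zn", property="optoelectronic", limit=50)
```

    A client would then fetch this URL with any HTTP library and page through the JSON results; sorting the parameters simply makes the URLs deterministic and cache-friendly.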

  3. An open experimental database for exploring inorganic materials.

    PubMed

    Zakutayev, Andriy; Wunder, Nick; Schwarting, Marcus; Perkins, John D; White, Robert; Munch, Kristin; Tumas, William; Phillips, Caleb

    2018-04-03

    The use of advanced machine learning algorithms in experimental materials science is limited by the lack of sufficiently large and diverse datasets amenable to data mining. If publicly open, such data resources would also enable materials research by scientists without access to expensive experimental equipment. Here, we report on our progress towards a publicly open High Throughput Experimental Materials (HTEM) Database (htem.nrel.gov). This database currently contains 140,000 sample entries, characterized by structural (100,000), synthetic (80,000), chemical (70,000), and optoelectronic (50,000) properties of inorganic thin film materials, grouped into >4,000 sample libraries across >100 materials systems; more than half of these data are publicly available. This article shows how the HTEM database may enable scientists to explore materials by browsing a web-based user interface and through an application programming interface. This paper also describes the high-throughput experimental (HTE) approach to generating materials data, and discusses the laboratory information management system (LIMS) that underpins the HTEM database. Finally, this manuscript illustrates how advanced machine learning algorithms can be applied to materials science problems using this open data resource.

  4. An open experimental database for exploring inorganic materials

    PubMed Central

    Zakutayev, Andriy; Wunder, Nick; Schwarting, Marcus; Perkins, John D.; White, Robert; Munch, Kristin; Tumas, William; Phillips, Caleb

    2018-01-01

    The use of advanced machine learning algorithms in experimental materials science is limited by the lack of sufficiently large and diverse datasets amenable to data mining. If publicly open, such data resources would also enable materials research by scientists without access to expensive experimental equipment. Here, we report on our progress towards a publicly open High Throughput Experimental Materials (HTEM) Database (htem.nrel.gov). This database currently contains 140,000 sample entries, characterized by structural (100,000), synthetic (80,000), chemical (70,000), and optoelectronic (50,000) properties of inorganic thin film materials, grouped into >4,000 sample libraries across >100 materials systems; more than half of these data are publicly available. This article shows how the HTEM database may enable scientists to explore materials by browsing a web-based user interface and through an application programming interface. This paper also describes the high-throughput experimental (HTE) approach to generating materials data, and discusses the laboratory information management system (LIMS) that underpins the HTEM database. Finally, this manuscript illustrates how advanced machine learning algorithms can be applied to materials science problems using this open data resource. PMID:29611842

  5. An open experimental database for exploring inorganic materials

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Zakutayev, Andriy; Wunder, Nick; Schwarting, Marcus

    The use of advanced machine learning algorithms in experimental materials science is limited by the lack of sufficiently large and diverse datasets amenable to data mining. If publicly open, such data resources would also enable materials research by scientists without access to expensive experimental equipment. Here, we report on our progress towards a publicly open High Throughput Experimental Materials (HTEM) Database (htem.nrel.gov). This database currently contains 140,000 sample entries, characterized by structural (100,000), synthetic (80,000), chemical (70,000), and optoelectronic (50,000) properties of inorganic thin film materials, grouped into >4,000 sample libraries across >100 materials systems; more than half of these data are publicly available. This article shows how the HTEM database may enable scientists to explore materials by browsing a web-based user interface and through an application programming interface. This paper also describes the high-throughput experimental (HTE) approach to generating materials data, and discusses the laboratory information management system (LIMS) that underpins the HTEM database. Finally, this manuscript illustrates how advanced machine learning algorithms can be applied to materials science problems using this open data resource.

  6. Making open data work for plant scientists.

    PubMed

    Leonelli, Sabina; Smirnoff, Nicholas; Moore, Jonathan; Cook, Charis; Bastow, Ruth

    2013-11-01

    Despite the clear demand for open data sharing, its implementation within plant science is still limited. This is, at least in part, because open data sharing raises several unanswered questions and challenges to current research practices. In this commentary, some of the challenges encountered by plant researchers at the bench when generating, interpreting, and attempting to disseminate their data have been highlighted. The difficulties involved in sharing sequencing, transcriptomics, proteomics, and metabolomics data are reviewed. The benefits and drawbacks of three data-sharing venues currently available to plant scientists are identified and assessed: (i) journal publication; (ii) university repositories; and (iii) community and project-specific databases. It is concluded that community and project-specific databases are the most useful to researchers interested in effective data sharing, since these databases are explicitly created to meet the researchers' needs, support extensive curation, and embody a heightened awareness of what it takes to make data reusable by others. Such bottom-up and community-driven approaches need to be valued by the research community, supported by publishers, and provided with long-term sustainable support by funding bodies and government. At the same time, these databases need to be linked to generic databases where possible, in order to be discoverable to the majority of researchers and thus promote effective and efficient data sharing. As we look forward to a future that embraces open access to data and publications, it is essential that data policies, data curation, data integration, data infrastructure, and data funding are linked together so as to foster data access and research productivity.

  7. Pan European Phenological database (PEP725): a single point of access for European data

    NASA Astrophysics Data System (ADS)

    Templ, Barbara; Koch, Elisabeth; Bolmgren, Kjell; Ungersböck, Markus; Paul, Anita; Scheifinger, Helfried; Rutishauser, This; Busto, Montserrat; Chmielewski, Frank-M.; Hájková, Lenka; Hodzić, Sabina; Kaspar, Frank; Pietragalla, Barbara; Romero-Fresneda, Ramiro; Tolvanen, Anne; Vučetič, Višnja; Zimmermann, Kirsten; Zust, Ana

    2018-06-01

    The Pan European Phenology (PEP) project is a European infrastructure to promote and facilitate phenological research, education, and environmental monitoring. Its main objective is to maintain and develop a Pan European Phenological database (PEP725) with open, unrestricted data access for science and education. PEP725 is the successor of the database developed through the COST action 725, "Establishing a European phenological data platform for climatological applications", which served as a single access point for European-wide plant phenological data. So far, 32 European meteorological services and project partners from across Europe have joined and supplied data, collected by volunteers from 1868 to the present, to the PEP725 database. Most of the partners actively provide data on a regular basis. The database presently holds almost 12 million records covering about 46 growth stages and 265 plant species (including cultivars), and can be accessed via http://www.pep725.eu/. Users of the PEP725 database have studied a diversity of topics ranging from climate change impacts and plant physiological questions to phenological modeling, remote sensing of vegetation, and ecosystem productivity.
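    Working with a PEP725-style record might look like the sketch below; the column layout (station, species, BBCH growth-stage code, observation date) is a simplified assumption rather than the actual download format:

```python
# Sketch: parsing a phenological observation record of the kind PEP725
# serves and converting the observation date to a day of year, the usual
# unit in phenological analyses. Column names and values are hypothetical.
import csv
import io
from datetime import date

sample = (
    "station_id;species;bbch;obs_date\n"
    "AT0001;Betula pendula;60;2017-04-21\n"   # BBCH 60 = beginning of flowering
)

def day_of_year(iso_date: str) -> int:
    """Day of year (1-366) for an ISO 8601 date string."""
    y, m, d = map(int, iso_date.split("-"))
    return date(y, m, d).timetuple().tm_yday

records = list(csv.DictReader(io.StringIO(sample), delimiter=";"))
doy = day_of_year(records[0]["obs_date"])   # flowering-onset day of year
```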

  8. Functionally Graded Materials Database

    NASA Astrophysics Data System (ADS)

    Kisara, Katsuto; Konno, Tomomi; Niino, Masayuki

    2008-02-01

    The Functionally Graded Materials Database (hereinafter referred to as the FGMs Database) was opened to the public via the Internet in October 2002, and since then it has been managed by the Japan Aerospace Exploration Agency (JAXA). As of October 2006, the database included 1,703 research information entries, along with data on 2,429 researchers, 509 institutions, and so on. Reading materials such as "Applicability of FGMs Technology to Space Plane" and "FGMs Application to Space Solar Power System (SSPS)" were prepared in FY 2004 and 2005, respectively. The English version of "FGMs Application to Space Solar Power System (SSPS)" is now under preparation. This paper explains the FGMs Database, describing the research information data, the sitemap, and how to use it. Based on access analysis, user access results and users' interests are discussed.

  9. Database citation in supplementary data linked to Europe PubMed Central full text biomedical articles.

    PubMed

    Kafkas, Şenay; Kim, Jee-Hyub; Pi, Xingjun; McEntyre, Johanna R

    2015-01-01

    In this study, we present an analysis of data citation practices in full text research articles and their corresponding supplementary data files, made available in the Open Access set of articles from Europe PubMed Central. Our aim is to investigate whether supplementary data files should be considered as a source of information for integrating the literature with biomolecular databases. Using text-mining methods to identify and extract a variety of core biological database accession numbers, we found that the supplemental data files contain many more database citations than the body of the article, and that those citations often take the form of a relatively small number of articles citing large collections of accession numbers in text-based files. Moreover, citation of value-added databases derived from submission databases (such as Pfam, UniProt or Ensembl) is common, demonstrating the reuse of these resources as datasets in themselves. All the database accession numbers extracted from the supplementary data are publicly accessible from http://dx.doi.org/10.5281/zenodo.11771. Our study suggests that supplementary data should be considered when linking articles with data, in curation pipelines, and in information retrieval tasks in order to make full use of the entire research article. These observations highlight the need to improve the management of supplemental data in general, in order to make this information more discoverable and useful.
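    The accession-number extraction described above can be sketched with a single regular expression; the pattern follows UniProt's published accession format, while real curation pipelines cover many more databases and disambiguate matches in context:

```python
# Minimal text-mining sketch of database-accession extraction from article
# text, for one database only. The regular expression follows the accession
# format UniProt documents; production pipelines handle many more resources
# (ENA, PDB, Pfam, Ensembl, ...) and filter false positives by context.
import re

UNIPROT_ACC = re.compile(
    r"\b(?:[OPQ][0-9][A-Z0-9]{3}[0-9]"
    r"|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2})\b"
)

text = "Domains were assigned with Pfam; the protein P12345 aligns to Q9H0H5."
hits = UNIPROT_ACC.findall(text)
```

    Running the same pattern over supplementary files rather than the article body is exactly where the study found most of the citations.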

  10. A System for Web-based Access to the HSOS Database

    NASA Astrophysics Data System (ADS)

    Lin, G.

    Huairou Solar Observing Station's (HSOS) magnetograms and dopplergrams come from world-class instruments, and access to these data has been opened to the world. Web-based access to the data will provide a powerful, convenient tool for data searching and solar physics research; now that the data are open to the world, it is necessary to provide them to users via the Web. In this presentation, the author describes the general design and programming construction of the system, which is built with PHP and MySQL. The author also introduces basic features of PHP and MySQL.

  11. National Utility Rate Database: Preprint

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ong, S.; McKeel, R.

    2012-08-01

    When modeling solar energy technologies and other distributed energy systems, access to high-quality, comprehensive electricity rate data is essential. The National Renewable Energy Laboratory (NREL) developed a utility rate platform for entering, storing, updating, and accessing a large collection of utility rates from around the United States. This utility rate platform lives on the Open Energy Information (OpenEI) website, OpenEI.org, allowing the data to be accessed programmatically from a web browser using an application programming interface (API). The semantic-based utility rate platform currently holds records of 1,885 utility rates and covers over 85% of the electricity consumption in the United States.

  12. Exploring Chemical Space for Drug Discovery Using the Chemical Universe Database

    PubMed Central

    2012-01-01

    Herein we review our recent efforts in searching for bioactive ligands by enumeration and virtual screening of the unknown chemical space of small molecules. Enumeration from first principles shows that almost all small molecules (>99.9%) have never been synthesized and are still available to be prepared and tested. We discuss open access sources of molecules, the classification and representation of chemical space using molecular quantum numbers (MQN), its exhaustive enumeration in form of the chemical universe generated databases (GDB), and examples of using these databases for prospective drug discovery. MQN-searchable GDB, PubChem, and DrugBank are freely accessible at www.gdb.unibe.ch. PMID:23019491
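    The MQN idea of placing molecules in a numeric descriptor space can be hinted at with something far simpler than the real 42 graph-derived descriptors, for example atom counts parsed from a molecular formula; this is only a loose illustration of descriptor-based classification, not the MQN definition:

```python
# Illustrative sketch of descriptor-based chemical-space classification in
# the spirit of Molecular Quantum Numbers (MQN). The actual 42 MQNs are
# computed from the molecular graph; here we only count atoms in a Hill-style
# molecular formula to show the idea of mapping molecules to numbers.
import re

def atom_counts(formula: str) -> dict:
    """Count atoms in a molecular formula such as 'C8H10N4O2'."""
    counts = {}
    for elem, num in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[elem] = counts.get(elem, 0) + (int(num) if num else 1)
    return counts

caffeine = atom_counts("C8H10N4O2")   # caffeine's molecular formula
```

    Molecules with similar descriptor vectors land near each other in such a space, which is what makes nearest-neighbor searching of enumerated databases like GDB feasible.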

  13. Urate levels predict survival in amyotrophic lateral sclerosis: Analysis of the expanded Pooled Resource Open-Access ALS clinical trials database.

    PubMed

    Paganoni, Sabrina; Nicholson, Katharine; Chan, James; Shui, Amy; Schoenfeld, David; Sherman, Alexander; Berry, James; Cudkowicz, Merit; Atassi, Nazem

    2018-03-01

    Urate has been identified as a predictor of amyotrophic lateral sclerosis (ALS) survival in some but not all studies. Here we leverage the recent expansion of the Pooled Resource Open-Access ALS Clinical Trials (PRO-ACT) database to study the association between urate levels and ALS survival. Pooled data of 1,736 ALS participants from the PRO-ACT database were analyzed. Cox proportional hazards regression models were used to evaluate associations between urate levels at trial entry and survival. After adjustment for potential confounders (i.e., creatinine and body mass index), there was an 11% reduction in risk of reaching a survival endpoint during the study with each 1-mg/dL increase in uric acid levels (adjusted hazard ratio 0.89, 95% confidence interval 0.82-0.97, P < 0.01). Our pooled analysis provides further support for urate as a prognostic factor for survival in ALS and confirms the utility of the PRO-ACT database as a powerful resource for ALS epidemiological research. Muscle Nerve 57: 430-434, 2018. © 2017 Wiley Periodicals, Inc.
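    The reported effect size can be unpacked with simple arithmetic: under the proportional-hazards model, the adjusted hazard ratio of 0.89 per 1-mg/dL increase compounds multiplicatively for larger increases. The numbers below come from the abstract:

```python
# Interpreting the adjusted hazard ratio reported in the PRO-ACT analysis.
# HR = 0.89 per 1-mg/dL increase in urate (95% CI 0.82-0.97).
hr_per_mgdl = 0.89

# An HR of 0.89 means an 11% reduction in the hazard per 1 mg/dL.
risk_reduction_pct = round((1 - hr_per_mgdl) * 100)

# Under proportional hazards, a 2-mg/dL increase compounds multiplicatively.
hr_per_2mgdl = round(hr_per_mgdl ** 2, 3)
```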

  14. Usage Trends of Open Access and Local Journals: A Korean Case Study.

    PubMed

    Seo, Jeong-Wook; Chung, Hosik; Yun, Jungmin; Park, Jin Young; Park, Eunsun; Ahn, Yuri

    2016-01-01

    Articles from open access and local journals are important resources for research in Korea, and their usage trends are important indicators for assessing current research practice. We analyzed an institutional collection of papers published from 1998 to 2014 by researchers from Seoul National University, and the references of those papers published between 1998 and 2011. The published papers were collected from Web of Science or Scopus and analyzed for the proportion of articles in open access journals. Their cited references, drawn from papers in Web of Science, were analyzed for the proportion from local (South Korean) or open access journals. The proportion of open access papers was relatively stable until 2006 (2.5-5.2% in Web of Science and 2.7-4.2% in Scopus), but then increased to 15.9% (Web of Science) or 18.5% (Scopus) in 2014. We analyzed 2,750,485 cited references from 52,295 published papers and found that the overall proportion of cited articles from local journals was 1.8%, and from open access journals 3.0%. Citations of open access articles have increased since 2006, reaching 4.1% in 2011, although this increase was smaller than the increase in open access publications. The proportion of citations from local journals was even lower. We propose the term "publishing/citing mismatch" to describe this difference, which is an issue at Seoul National University: the number of papers published in open access or local journals is increasing, but the number of citations to them is not. The cause of this discrepancy is multifactorial; governmental and institutional policies, social and cultural issues, and authors' citing behaviors all contribute to the mismatch. Additional measures are also necessary, such as the development of an institutional citation database and improved search capabilities for local and open access documents.

  15. Usage Trends of Open Access and Local Journals: A Korean Case Study

    PubMed Central

    Chung, Hosik; Yun, Jungmin; Park, Jin Young; Park, Eunsun; Ahn, Yuri

    2016-01-01

    Articles from open access and local journals are important resources for research in Korea, and their usage trends are important indicators for assessing current research practice. We analyzed an institutional collection of papers published from 1998 to 2014 by researchers from Seoul National University, and the references of those papers published between 1998 and 2011. The published papers were collected from Web of Science or Scopus and analyzed for the proportion of articles in open access journals. Their cited references, drawn from papers in Web of Science, were analyzed for the proportion from local (South Korean) or open access journals. The proportion of open access papers was relatively stable until 2006 (2.5-5.2% in Web of Science and 2.7-4.2% in Scopus), but then increased to 15.9% (Web of Science) or 18.5% (Scopus) in 2014. We analyzed 2,750,485 cited references from 52,295 published papers and found that the overall proportion of cited articles from local journals was 1.8%, and from open access journals 3.0%. Citations of open access articles have increased since 2006, reaching 4.1% in 2011, although this increase was smaller than the increase in open access publications. The proportion of citations from local journals was even lower. We propose the term "publishing/citing mismatch" to describe this difference, which is an issue at Seoul National University: the number of papers published in open access or local journals is increasing, but the number of citations to them is not. The cause of this discrepancy is multifactorial; governmental and institutional policies, social and cultural issues, and authors' citing behaviors all contribute to the mismatch. Additional measures are also necessary, such as the development of an institutional citation database and improved search capabilities for local and open access documents. PMID:27195948

  16. Why open drug discovery needs four simple rules for licensing data and models.

    PubMed

    Williams, Antony J; Wilbanks, John; Ekins, Sean

    2012-01-01

    When we look at the rapid growth of scientific databases on the Internet in the past decade, we tend to take the accessibility and provenance of the data for granted. As we see a future of increased database integration, the licensing of the data may be a hurdle that hampers progress and usability. We have formulated four rules for licensing data for open drug discovery, which we propose as a starting point for consideration by databases and for their ultimate adoption. This work could also be extended to the computational models derived from such data. We suggest that scientists in the future will need to consider data licensing before they embark upon re-using such content in databases they construct themselves.

  17. JASPAR 2010: the greatly expanded open-access database of transcription factor binding profiles

    PubMed Central

    Portales-Casamar, Elodie; Thongjuea, Supat; Kwon, Andrew T.; Arenillas, David; Zhao, Xiaobei; Valen, Eivind; Yusuf, Dimas; Lenhard, Boris; Wasserman, Wyeth W.; Sandelin, Albin

    2010-01-01

    JASPAR (http://jaspar.genereg.net) is the leading open-access database of matrix profiles describing the DNA-binding patterns of transcription factors (TFs) and other proteins interacting with DNA in a sequence-specific manner. Its fourth major release is the largest expansion of the core database to date: the database now holds 457 non-redundant, curated profiles. The new entries include the first batch of profiles derived from ChIP-seq and ChIP-chip whole-genome binding experiments, and 177 yeast TF binding profiles. The introduction of a yeast division brings the convenience of JASPAR to an active research community. As binding models are refined by newer data, the JASPAR database now uses versioning of matrices: in this release, 12% of the older models were updated to improved versions. Classification of TF families has been improved by adopting a new DNA-binding domain nomenclature. A curated catalog of mammalian TFs is provided, extending the use of the JASPAR profiles to additional TFs belonging to the same structural family. The changes in the database set the system ready for more rapid acquisition of new high-throughput data sources. Additionally, three new special collections provide matrix profile data produced by recent alternative high-throughput approaches. PMID:19906716

  18. The Neotoma Paleoecology Database

    NASA Astrophysics Data System (ADS)

    Grimm, E. C.; Ashworth, A. C.; Barnosky, A. D.; Betancourt, J. L.; Bills, B.; Booth, R.; Blois, J.; Charles, D. F.; Graham, R. W.; Goring, S. J.; Hausmann, S.; Smith, A. J.; Williams, J. W.; Buckland, P.

    2015-12-01

    The Neotoma Paleoecology Database (www.neotomadb.org) is a multiproxy, open-access, relational database that includes fossil data for the past 5 million years (the late Neogene and Quaternary Periods). Modern distributional data for various organisms are also being made available for calibration and paleoecological analyses. The project is a collaborative effort among individuals from more than 20 institutions worldwide, including domain scientists representing a spectrum of Pliocene-Quaternary fossil data types, as well as experts in information technology. Working groups are active for diatoms, insects, ostracodes, pollen and plant macroscopic remains, testate amoebae, rodent middens, vertebrates, age models, geochemistry and taphonomy. Groups are also active in developing online tools for data analyses and in developing modules for teaching at different levels. A key design concept of NeotomaDB is that stewards for various data types are able to remotely upload and manage data. Cooperatives for different kinds of paleo data, or from different regions, can appoint their own stewards. Over the past year, much progress has been made on development of the steward software interface that will enable this capability. The steward interface uses web services that provide access to the database. More generally, these web services enable remote programmatic access to the database, which both desktop and web applications can use and which provide real-time access to the most current data. Use of these services can alleviate the need to download the entire database, which can be out-of-date as soon as new data are entered. In general, the Neotoma web services deliver data either from an entire table or from the results of a view. Upon request, new web services can be quickly generated. Future developments will likely expand the spatial and temporal dimensions of the database. NeotomaDB is open to receiving new datasets and stewards from the global Quaternary community.
Research is supported by NSF EAR-0622349.
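The web services described above deliver data over HTTP for programmatic use. A minimal sketch of how a client might assemble a query URL and unpack a JSON response follows; the endpoint path, parameter names, and payload shape here are assumptions for illustration, not the documented Neotoma API, so a canned response stands in for a live call.

```python
import json
from urllib.parse import urlencode

# Hypothetical endpoint and parameters -- the real Neotoma API paths
# and field names may differ; this only illustrates the pattern of
# programmatic, real-time access the abstract describes.
BASE = "https://api.neotomadb.org/v2.0/data/sites"

def build_url(**params):
    """Assemble a query URL for a (hypothetical) sites endpoint."""
    return BASE + "?" + urlencode(sorted(params.items()))

# A canned response in an assumed shape, standing in for a live call.
canned = json.loads("""
{"status": "success",
 "data": [{"siteid": 1001, "sitename": "Example Bog", "altitude": 312}]}
""")

sites = [rec["sitename"] for rec in canned["data"]]
print(build_url(sitename="Example Bog", limit=5))
print(sites)
```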

  19. WebCN: A web-based computation tool for in situ-produced cosmogenic nuclides

    NASA Astrophysics Data System (ADS)

    Ma, Xiuzeng; Li, Yingkui; Bourgeois, Mike; Caffee, Marc; Elmore, David; Granger, Darryl; Muzikar, Paul; Smith, Preston

    2007-06-01

    Cosmogenic nuclide techniques are increasingly being utilized in geoscience research. For this it is critical to establish an effective, easily accessible and well defined tool for cosmogenic nuclide computations. We have been developing a web-based tool (WebCN) to calculate surface exposure ages and erosion rates based on nuclide concentrations measured by accelerator mass spectrometry. WebCN for 10Be and 26Al is complete and available at http://www.physics.purdue.edu/primelab/for_users/rockage.html. WebCN for 36Cl is under construction. WebCN is designed as a three-tier client/server model and uses the open-source PostgreSQL for database management and PHP for the interface design and calculations. On the client side, an internet browser and Microsoft Access are used as application interfaces to access the system. Open Database Connectivity is used to link PostgreSQL and Microsoft Access. WebCN accounts for both spatial and temporal distributions of the cosmic ray flux to calculate the production rates of in situ-produced cosmogenic nuclides at the Earth's surface.
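The core computation behind a tool like WebCN can be illustrated with its simplest case: a surface exposure age from a measured nuclide concentration, ignoring erosion, burial, and muogenic production. The decay constant and production rate below are representative textbook values, not WebCN's calibrated parameters, which account for the spatial and temporal flux variations the abstract mentions.

```python
import math

# Simplified exposure-age equation (zero erosion, no muons):
#   N = (P / lam) * (1 - exp(-lam * t))  =>  t = -ln(1 - N*lam/P) / lam
# Values below are representative, not WebCN's calibrated parameters.
LAMBDA_BE10 = math.log(2) / 1.387e6   # 10Be decay constant [1/yr]

def exposure_age(N, P, lam=LAMBDA_BE10):
    """Exposure age [yr] from concentration N [atoms/g] and
    local production rate P [atoms/(g*yr)]."""
    return -math.log(1.0 - N * lam / P) / lam

# Round trip: concentration accumulated over 10 kyr at P = 5 atoms/(g*yr)
t_true = 10_000.0
P = 5.0
N = (P / LAMBDA_BE10) * (1.0 - math.exp(-LAMBDA_BE10 * t_true))
print(round(exposure_age(N, P), 1))  # ~10000.0
```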

  20. An Investigation of Graduate Student Knowledge and Usage of Open-Access Journals

    ERIC Educational Resources Information Center

    Beard, Regina M.

    2016-01-01

    Graduate students lament the need to achieve the proficiency necessary to competently search multiple databases for their research assignments, regularly eschewing these sources in favor of Google Scholar or some other search engine. The author conducted an anonymous survey investigating graduate student knowledge or awareness of the open-access…

  1. Scientific Journal Publishing: Yearly Volume and Open Access Availability

    ERIC Educational Resources Information Center

    Bjork, Bo-Christer; Roos, Annikki; Lauri, Mari

    2009-01-01

    Introduction: We estimate the total yearly volume of peer-reviewed scientific journal articles published world-wide as well as the share of these articles available openly on the Web either directly or as copies in e-print repositories. Method: We rely on data from two commercial databases (ISI and Ulrich's Periodicals Directory) supplemented by…

  2. Ibmdbpy-spatial : An Open-source implementation of in-database geospatial analytics in Python

    NASA Astrophysics Data System (ADS)

    Roy, Avipsa; Fouché, Edouard; Rodriguez Morales, Rafael; Moehler, Gregor

    2017-04-01

    As the amount of spatial data acquired from several geodetic sources has grown over the years and as data infrastructure has become more powerful, the need for adoption of in-database analytic technology within the geosciences has grown rapidly. In-database analytics on spatial data stored in a traditional enterprise data warehouse enables much faster retrieval and analysis for making better predictions about risks and opportunities, identifying trends, and spotting anomalies. Although there are a number of open-source spatial analysis libraries like geopandas and shapely available today, most of them are restricted to manipulation and analysis of geometric objects, with a dependency on GEOS and similar libraries. We present an open-source software package, written in Python, to fill the gap between spatial analysis and in-database analytics. Ibmdbpy-spatial provides a geospatial extension to the ibmdbpy package, introduced in 2015. It provides an interface for spatial data manipulation and access to in-database algorithms in IBM dashDB, a data warehouse platform with a spatial extender that runs as a service on IBM's cloud platform, Bluemix. Working in-database reduces network overhead, as the complete dataset need not be replicated to the user's local system; only the required subset is fetched into memory at any one time. Ibmdbpy-spatial accelerates Python analytics by seamlessly pushing operations written in Python into the underlying database for execution using the dashDB spatial extender, thereby benefiting from in-database performance-enhancing features such as columnar storage and parallel processing. The package is currently supported on Python versions from 2.7 up to 3.4.
The basic architecture of the package consists of three main components: 1) a connection to dashDB, represented by an IdaDataBase instance, which uses a middleware API (pypyodbc or jaydebeapi) to establish the database connection via ODBC or JDBC respectively; 2) an IdaGeoDataFrame instance, which represents the spatial data stored in the database as a dataframe in Python, with a geometry attribute that recognises a planar geometry column in dashDB; and 3) Python wrappers for spatial functions such as within, distance, area, buffer and more, which dashDB currently supports, making the querying process from Python much simpler for users. The spatial functions translate well-known geopandas-like syntax into SQL queries, using the database connection to perform spatial operations in-database, and can operate on single geometries as well as on two geometries from different IdaGeoDataFrames. The in-database queries strictly follow the standards of the OpenGIS Implementation Specification for Geographic information - Simple feature access for SQL. The results can then be accessed dynamically via interactive Jupyter notebooks from any system that supports Python, without additional dependencies, and can be combined with other open-source libraries, such as matplotlib and folium, within Jupyter notebooks for visualization. We built a use case to analyse crime hotspots in New York City to validate our implementation and visualized the results as a choropleth map for each borough.
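The translation step described above, geopandas-like method calls rewritten as OGC simple-feature SQL, can be sketched as a small query builder. The function name, table and column names, and the exact SQL emitted are illustrative assumptions; ibmdbpy-spatial's real internals and dashDB's spatial function signatures may differ.

```python
# Illustrative sketch of translating a geopandas-style spatial call
# into an OGC simple-feature SQL query, as ibmdbpy-spatial is described
# to do. Table/column names and the ST_* spelling are assumptions.
def within_sql(table, geom_col, other_wkt, srid=1000):
    """Build a WHERE ST_Within(...) query against a spatial table."""
    return (
        f"SELECT * FROM {table} "
        f"WHERE ST_Within({geom_col}, "
        f"ST_Geometry('{other_wkt}', {srid})) = 1"
    )

sql = within_sql("NYC_CRIMES", "LOCATION",
                 "POLYGON((0 0, 0 1, 1 1, 1 0, 0 0))")
print(sql)
```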

  3. JASPAR 2014: an extensively expanded and updated open-access database of transcription factor binding profiles

    PubMed Central

    Mathelier, Anthony; Zhao, Xiaobei; Zhang, Allen W.; Parcy, François; Worsley-Hunt, Rebecca; Arenillas, David J.; Buchman, Sorana; Chen, Chih-yu; Chou, Alice; Ienasescu, Hans; Lim, Jonathan; Shyr, Casper; Tan, Ge; Zhou, Michelle; Lenhard, Boris; Sandelin, Albin; Wasserman, Wyeth W.

    2014-01-01

    JASPAR (http://jaspar.genereg.net) is the largest open-access database of matrix-based nucleotide profiles describing the binding preference of transcription factors from multiple species. The fifth major release greatly expands the heart of JASPAR—the JASPAR CORE subcollection, which contains curated, non-redundant profiles—with 135 new curated profiles (74 in vertebrates, 8 in Drosophila melanogaster, 10 in Caenorhabditis elegans and 43 in Arabidopsis thaliana; a 30% increase in total) and 43 older updated profiles (36 in vertebrates, 3 in D. melanogaster and 4 in A. thaliana; a 9% update in total). The new and updated profiles are mainly derived from published chromatin immunoprecipitation-seq experimental datasets. In addition, the web interface has been enhanced with advanced capabilities in browsing, searching and subsetting. Finally, the new JASPAR release is accompanied by a new BioPython package, a new R tool package and a new R/Bioconductor data package to facilitate access for both manual and automated methods. PMID:24194598

  4. JASPAR 2014: an extensively expanded and updated open-access database of transcription factor binding profiles.

    PubMed

    Mathelier, Anthony; Zhao, Xiaobei; Zhang, Allen W; Parcy, François; Worsley-Hunt, Rebecca; Arenillas, David J; Buchman, Sorana; Chen, Chih-yu; Chou, Alice; Ienasescu, Hans; Lim, Jonathan; Shyr, Casper; Tan, Ge; Zhou, Michelle; Lenhard, Boris; Sandelin, Albin; Wasserman, Wyeth W

    2014-01-01

    JASPAR (http://jaspar.genereg.net) is the largest open-access database of matrix-based nucleotide profiles describing the binding preference of transcription factors from multiple species. The fifth major release greatly expands the heart of JASPAR-the JASPAR CORE subcollection, which contains curated, non-redundant profiles-with 135 new curated profiles (74 in vertebrates, 8 in Drosophila melanogaster, 10 in Caenorhabditis elegans and 43 in Arabidopsis thaliana; a 30% increase in total) and 43 older updated profiles (36 in vertebrates, 3 in D. melanogaster and 4 in A. thaliana; a 9% update in total). The new and updated profiles are mainly derived from published chromatin immunoprecipitation-seq experimental datasets. In addition, the web interface has been enhanced with advanced capabilities in browsing, searching and subsetting. Finally, the new JASPAR release is accompanied by a new BioPython package, a new R tool package and a new R/Bioconductor data package to facilitate access for both manual and automated methods.

  5. Open Access: From Myth to Paradox

    ScienceCinema

    Ginsparg, Paul [Cornell University, Ithaca, New York, United States

    2018-04-19

    True open access to scientific publications not only gives readers the possibility to read articles without paying a subscription, but also makes the material available for automated ingestion and harvesting by third parties. Once articles and associated data become universally treatable as computable objects, openly available to third-party aggregators and value-added services, what new services can we expect, and how will they change the way that researchers interact with their scholarly communications infrastructure? I will discuss straightforward applications of existing ideas and services, including citation analysis, collaborative filtering, external database linkages, interoperability, and other forms of automated markup, and speculate on the sociology of the next generation of users.

  6. New mutations and an updated database for the patched-1 (PTCH1) gene.

    PubMed

    Reinders, Marie G; van Hout, Antonius F; Cosgun, Betûl; Paulussen, Aimée D; Leter, Edward M; Steijlen, Peter M; Mosterd, Klara; van Geel, Michel; Gille, Johan J

    2018-05-01

    Basal cell nevus syndrome (BCNS) is an autosomal dominant disorder characterized by multiple basal cell carcinomas (BCCs), maxillary keratocysts, and cerebral calcifications. BCNS is most commonly caused by a germline mutation in the patched-1 (PTCH1) gene. PTCH1 mutations are also described in patients with holoprosencephaly. We have established a locus-specific database for the PTCH1 gene using the Leiden Open Variation Database (LOVD). The database contains 331 previously published unique PTCH1 mutations, to which we added 117 new PTCH1 variations found in 141 patients with a positive PTCH1 mutation analysis in either the VU University Medical Centre (VUMC) or Maastricht University Medical Centre (MUMC) between 1995 and 2015. The database provides an open collection for both clinicians and researchers and is accessible online at http://www.lovd.nl/PTCH1. © 2018 The Authors. Molecular Genetics & Genomic Medicine published by Wiley Periodicals, Inc.

  7. Why Open Drug Discovery Needs Four Simple Rules for Licensing Data and Models

    PubMed Central

    Williams, Antony J.; Wilbanks, John; Ekins, Sean

    2012-01-01

    When we look at the rapid growth of scientific databases on the Internet in the past decade, we tend to take the accessibility and provenance of the data for granted. As we see a future of increased database integration, the licensing of the data may be a hurdle that hampers progress and usability. We have formulated four rules for licensing data for open drug discovery, which we propose as a starting point for consideration by databases and for their ultimate adoption. This work could also be extended to the computational models derived from such data. We suggest that scientists in the future will need to consider data licensing before they embark upon re-using such content in databases they construct themselves. PMID:23028298

  8. DianaHealth.com, an On-Line Database Containing Appraisals of the Clinical Value and Appropriateness of Healthcare Interventions: Database Development and Retrospective Analysis.

    PubMed

    Bonfill, Xavier; Osorio, Dimelza; Solà, Ivan; Pijoan, Jose Ignacio; Balasso, Valentina; Quintana, Maria Jesús; Puig, Teresa; Bolibar, Ignasi; Urrútia, Gerard; Zamora, Javier; Emparanza, José Ignacio; Gómez de la Cámara, Agustín; Ferreira-González, Ignacio

    2016-01-01

    To describe the development of a novel on-line database aimed at serving as a source of information on healthcare interventions appraised for their clinical value and appropriateness by several initiatives worldwide, and to present a retrospective analysis of the appraisals already included in the database. We performed database development and a retrospective analysis. The database DianaHealth.com is already on-line and it is regularly updated, independent, open access, and available in English and Spanish. Initiatives are identified in medical news, in article references, and by contacting experts in the field. We include appraisals in the form of clinical recommendations, expert analyses, conclusions from systematic reviews, and original research that label any health care intervention as low-value or inappropriate. We obtain the information necessary to classify the appraisals according to type of intervention, specialties involved, publication year, authoring initiative, and key words. The database is accessible through a search engine which retrieves a list of appraisals and a link to the website where they were published. DianaHealth.com also provides a brief description of the initiatives and a section where users can report new appraisals or suggest new initiatives. From January 2014 to July 2015, the on-line database included 2940 appraisals from 22 initiatives: eleven campaigns gathering clinical recommendations from scientific societies, five sets of conclusions from literature review, three sets of recommendations from guidelines, two collections of articles on low clinical value in medical journals, and an initiative of our own. We have developed an open access on-line database of appraisals about healthcare interventions considered of low clinical value or inappropriate. DianaHealth.com could help physicians and other stakeholders make better decisions concerning patient care and healthcare systems sustainability.
Future efforts should focus on assessing the impact of these appraisals on clinical practice.

  9. DianaHealth.com, an On-Line Database Containing Appraisals of the Clinical Value and Appropriateness of Healthcare Interventions: Database Development and Retrospective Analysis

    PubMed Central

    Bonfill, Xavier; Osorio, Dimelza; Solà, Ivan; Pijoan, Jose Ignacio; Balasso, Valentina; Quintana, Maria Jesús; Puig, Teresa; Bolibar, Ignasi; Urrútia, Gerard; Zamora, Javier; Emparanza, José Ignacio; Gómez de la Cámara, Agustín; Ferreira-González, Ignacio

    2016-01-01

    Objective: To describe the development of a novel on-line database aimed at serving as a source of information on healthcare interventions appraised for their clinical value and appropriateness by several initiatives worldwide, and to present a retrospective analysis of the appraisals already included in the database. Methods and Findings: Database development and a retrospective analysis. The database DianaHealth.com is already on-line and it is regularly updated, independent, open access, and available in English and Spanish. Initiatives are identified in medical news, in article references, and by contacting experts in the field. We include appraisals in the form of clinical recommendations, expert analyses, conclusions from systematic reviews, and original research that label any health care intervention as low-value or inappropriate. We obtain the information necessary to classify the appraisals according to type of intervention, specialties involved, publication year, authoring initiative, and key words. The database is accessible through a search engine which retrieves a list of appraisals and a link to the website where they were published. DianaHealth.com also provides a brief description of the initiatives and a section where users can report new appraisals or suggest new initiatives. From January 2014 to July 2015, the on-line database included 2940 appraisals from 22 initiatives: eleven campaigns gathering clinical recommendations from scientific societies, five sets of conclusions from literature review, three sets of recommendations from guidelines, two collections of articles on low clinical value in medical journals, and an initiative of our own. Conclusions: We have developed an open access on-line database of appraisals about healthcare interventions considered of low clinical value or inappropriate.
DianaHealth.com could help physicians and other stakeholders make better decisions concerning patient care and healthcare systems sustainability. Future efforts should focus on assessing the impact of these appraisals on clinical practice. PMID:26840451

  10. Detection of heart disease by open access echocardiography: a retrospective analysis of general practice referrals

    PubMed Central

    Chambers, John; Kabir, Saleha; Cajeat, Eric

    2014-01-01

    Background: Heart disease is difficult to detect clinically and it has been suggested that echocardiography should be available to all patients with possible cardiac symptoms or signs. Aim: To analyse the results of 2 years of open access echocardiography for the frequency of structural heart disease according to request. Design and setting: Retrospective database analysis in a teaching hospital open access echocardiography service. Method: Reports of all open access transthoracic echocardiograms between January 2011 and December 2012 were categorised as normal, having minor abnormalities, or significant abnormalities according to the indication. Results: There were 2343 open access echocardiograms performed and there were significant abnormalities in 29%, predominantly valve disease (n = 304, 13%), LV systolic dysfunction (n = 179, 8%), aortic dilatation (n = 80, 3%), or pulmonary hypertension (n = 91, 4%). If echocardiography had been targeted at a high-risk group, 267 with valve disease would have been detected (compared to 127 with murmur alone) and 139 with LV systolic dysfunction (compared to 91 with suspected heart failure alone). Most GP practices requested fewer than 10 studies, but 6 practices requested over 70 studies. Conclusion: Open access echocardiograms are often abnormal but structural disease may not be suspected from the clinical request. Uptake by individual practices is patchy. A targeted expansion of echocardiography in patients with a high likelihood of disease is therefore likely to increase the detection of clinically important pathology. PMID:24567615

  11. Detection of heart disease by open access echocardiography: a retrospective analysis of general practice referrals.

    PubMed

    Chambers, John; Kabir, Saleha; Cajeat, Eric

    2014-02-01

    Heart disease is difficult to detect clinically and it has been suggested that echocardiography should be available to all patients with possible cardiac symptoms or signs. To analyse the results of 2 years of open access echocardiography for the frequency of structural heart disease according to request. Retrospective database analysis in a teaching hospital open access echocardiography service. Reports of all open access transthoracic echocardiograms between January 2011 and December 2012 were categorised as normal, having minor abnormalities, or significant abnormalities according to the indication. There were 2343 open access echocardiograms performed and there were significant abnormalities in 29%, predominantly valve disease (n = 304, 13%), LV systolic dysfunction (n = 179, 8%), aortic dilatation (n = 80, 3%), or pulmonary hypertension (n = 91, 4%). If echocardiography had been targeted at a high-risk group, 267 with valve disease would have been detected (compared to 127 with murmur alone) and 139 with LV systolic dysfunction (compared to 91 with suspected heart failure alone). Most GP practices requested fewer than 10 studies, but 6 practices requested over 70 studies. Open access echocardiograms are often abnormal but structural disease may not be suspected from the clinical request. Uptake by individual practices is patchy. A targeted expansion of echocardiography in patients with a high likelihood of disease is therefore likely to increase the detection of clinically important pathology.

  12. Interacting with the National Database for Autism Research (NDAR) via the LONI Pipeline workflow environment.

    PubMed

    Torgerson, Carinna M; Quinn, Catherine; Dinov, Ivo; Liu, Zhizhong; Petrosyan, Petros; Pelphrey, Kevin; Haselgrove, Christian; Kennedy, David N; Toga, Arthur W; Van Horn, John Darrell

    2015-03-01

    Under the umbrella of the National Database for Clinical Trials (NDCT) related to mental illnesses, the National Database for Autism Research (NDAR) seeks to gather, curate, and make openly available neuroimaging data from NIH-funded studies of autism spectrum disorder (ASD). NDAR has recently made its database accessible through the LONI Pipeline workflow design and execution environment to enable large-scale analyses of cortical architecture and function via local, cluster, or "cloud"-based computing resources. This presents a unique opportunity to overcome many of the customary limitations to fostering biomedical neuroimaging as a science of discovery. Providing open access to primary neuroimaging data, workflow methods, and high-performance computing will increase uniformity in data collection protocols, encourage greater reliability of published data, results replication, and broaden the range of researchers now able to perform larger studies than ever before. To illustrate the use of NDAR and LONI Pipeline for performing several commonly performed neuroimaging processing steps and analyses, this paper presents example workflows useful for ASD neuroimaging researchers seeking to begin using this valuable combination of online data and computational resources. We discuss the utility of such database and workflow processing interactivity as a motivation for the sharing of additional primary data in ASD research and elsewhere.

  13. The Database Query Support Processor (QSP)

    NASA Technical Reports Server (NTRS)

    1993-01-01

    The number and diversity of databases available to users continues to increase dramatically. Currently, the trend is towards decentralized, client-server architectures that (on the surface) are less expensive to acquire, operate, and maintain than information architectures based on centralized, monolithic mainframes. The database query support processor (QSP) effort evaluates the performance of a network-level, heterogeneous database access capability. Air Force Materiel Command's Rome Laboratory has developed an approach, based on ANSI standard X3.138-1988, 'The Information Resource Dictionary System (IRDS)', to provide seamless access to heterogeneous databases through extensions to data dictionary technology. To successfully query a decentralized information system, users must know what data are available from which source, or have the knowledge and system privileges necessary to find out this information. Privacy and security considerations prohibit free and open access to every information system in every network. Even in completely open systems, the time required to locate relevant data (in systems of any appreciable size) would be better spent analyzing the data, assuming the original question was not forgotten. Extensions to data dictionary technology have the potential to more fully automate the search and retrieval of relevant data in a decentralized environment. Substantial amounts of time and money could be saved by not having to teach users what data reside in which systems and how to access each of those systems. Information describing data and how to get it could be removed from the application and placed in a dedicated repository where it belongs. The result is simplified applications that are less brittle and less expensive to build and maintain. Software technology providing the required functionality is off the shelf. The key difficulty is in defining the metadata required to support the process.
The database query support processor effort will provide quantitative data on the amount of effort required to implement an extended data dictionary at the network level, add new systems, adapt to changing user needs, and provide sound estimates on operations and maintenance costs and savings.
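The extended data dictionary described above amounts to a lookup from a logical data element to the system that hosts it and how to reach it, so that applications need not hard-code source locations. A toy sketch follows; all element, system, and table names are invented for illustration.

```python
# Toy extended data dictionary: maps a logical data-element name to the
# system that hosts it and the access method, so an application need not
# hard-code source locations. All entries are invented examples.
DICTIONARY = {
    "part_number": {"system": "LOGISTICS_DB", "table": "PARTS",
                    "access": "odbc"},
    "maintenance_record": {"system": "DEPOT_DB", "table": "MAINT_HIST",
                           "access": "jdbc"},
}

def resolve(element):
    """Return (system, table) for a logical element, or None."""
    entry = DICTIONARY.get(element)
    return (entry["system"], entry["table"]) if entry else None

print(resolve("part_number"))        # ('LOGISTICS_DB', 'PARTS')
print(resolve("unknown_element"))    # None
```

Adding a new source system then means adding dictionary entries, not rewriting applications, which is the maintenance saving the abstract describes.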

  14. Large-Scale 1:1 Computing Initiatives: An Open Access Database

    ERIC Educational Resources Information Center

    Richardson, Jayson W.; McLeod, Scott; Flora, Kevin; Sauers, Nick J.; Kannan, Sathiamoorthy; Sincar, Mehmet

    2013-01-01

    This article details the spread and scope of large-scale 1:1 computing initiatives around the world. What follows is a review of the existing literature around 1:1 programs followed by a description of the large-scale 1:1 database. Main findings include: 1) the XO and the Classmate PC dominate large-scale 1:1 initiatives; 2) if professional…

  15. BIRS - Bioterrorism Information Retrieval System.

    PubMed

    Tewari, Ashish Kumar; Rashi; Wadhwa, Gulshan; Sharma, Sanjeev Kumar; Jain, Chakresh Kumar

    2013-01-01

    Bioterrorism is the intended use of pathogenic strains of microbes to spread terror in a population. There is a definite need to promote research on the development of vaccines, therapeutics, and diagnostic methods as part of preparedness for any future bioterror attack. BIRS is an open-access database of collective information on the organisms related to bioterrorism. The architecture of the database utilizes current open-source technology, viz. PHP 5.3.19, MySQL, and an IIS server on the Windows platform. The database stores information on literature, general information, and unique pathways for about 10 microorganisms involved in bioterrorism. It may serve as a collective repository to accelerate drug discovery and vaccine design against such bioterrorist agents (microbes). The available data have been validated against various online resources and by literature mining in order to provide the user with a comprehensive information system. The database is freely available at http://www.bioterrorism.biowaves.org.

  16. WikiPathways: a multifaceted pathway database bridging metabolomics to other omics research.

    PubMed

    Slenter, Denise N; Kutmon, Martina; Hanspers, Kristina; Riutta, Anders; Windsor, Jacob; Nunes, Nuno; Mélius, Jonathan; Cirillo, Elisa; Coort, Susan L; Digles, Daniela; Ehrhart, Friederike; Giesbertz, Pieter; Kalafati, Marianthi; Martens, Marvin; Miller, Ryan; Nishida, Kozo; Rieswijk, Linda; Waagmeester, Andra; Eijssen, Lars M T; Evelo, Chris T; Pico, Alexander R; Willighagen, Egon L

    2018-01-04

    WikiPathways (wikipathways.org) captures the collective knowledge represented in biological pathways. By providing pathways in a curated, machine-readable form, it enables omics data analysis and visualization. WikiPathways and other pathway databases are used to analyze experimental data by research groups in many fields. Due to the open and collaborative nature of the WikiPathways platform, our content keeps growing and becoming more accurate, making WikiPathways a reliable and rich pathway database. Previously, however, the focus was primarily on genes and proteins, leaving many metabolites with only limited annotation. Recent curation efforts focused on improving the annotation of metabolism and metabolic pathways by associating unmapped metabolites with database identifiers and providing more detailed interaction knowledge. Here, we report the outcomes of these continued growth and curation efforts, such as a doubling of the number of annotated metabolite nodes in WikiPathways. Furthermore, we introduce OpenAPI documentation of our web services and FAIR (Findable, Accessible, Interoperable and Reusable) annotation of resources to increase the interoperability of the knowledge encoded in these pathways and experimental omics data. New search options, monthly downloads, more links to metabolite databases, and new portals make pathway knowledge more effortlessly accessible to individual researchers and research communities. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  17. Open Window: When Easily Identifiable Genomes and Traits Are in the Public Domain

    PubMed Central

    Angrist, Misha

    2014-01-01

    “One can't be of an enquiring and experimental nature, and still be very sensible.” - Charles Fort [1] As the costs of personal genetic testing (“self-quantification”) fall, publicly accessible databases housing people's genotypic and phenotypic information are gradually increasing in number and scope. The latest entrant is openSNP, which allows participants to upload their personal genetic/genomic and self-reported phenotypic data. I believe the emergence of such open repositories of human biological data is a natural reflection of inquisitive and digitally literate people's desire to make genomic and phenotypic information more easily available to a community beyond the research establishment. Such unfettered databases hold the promise of contributing mightily to science, science education and medicine. That said, in an age of increasingly widespread governmental and corporate surveillance, we would do well to be mindful that genomic DNA is uniquely identifying. Participants in open biological databases are engaged in a real-time experiment whose outcome is unknown. PMID:24647311

  18. The Cardiac Atlas Project--an imaging database for computational modeling and statistical atlases of the heart.

    PubMed

    Fonseca, Carissa G; Backhaus, Michael; Bluemke, David A; Britten, Randall D; Chung, Jae Do; Cowan, Brett R; Dinov, Ivo D; Finn, J Paul; Hunter, Peter J; Kadish, Alan H; Lee, Daniel C; Lima, Joao A C; Medrano-Gracia, Pau; Shivkumar, Kalyanam; Suinesiaputra, Avan; Tao, Wenchao; Young, Alistair A

    2011-08-15

    Integrative mathematical and statistical models of cardiac anatomy and physiology can play a vital role in understanding cardiac disease phenotype and planning therapeutic strategies. However, the accuracy and predictive power of such models are dependent upon the breadth and depth of noninvasive imaging datasets. The Cardiac Atlas Project (CAP) has established a large-scale database of cardiac imaging examinations and associated clinical data in order to develop a shareable, web-accessible, structural and functional atlas of the normal and pathological heart for clinical, research and educational purposes. A goal of CAP is to facilitate collaborative statistical analysis of regional heart shape and wall motion and to characterize cardiac function among and within population groups. Three main open-source software components were developed: (i) a database with web interface; (ii) a modeling client for 3D + time visualization and parametric description of shape and motion; and (iii) open data formats for semantic characterization of models and annotations. The database was implemented using a three-tier architecture utilizing MySQL, JBoss and Dcm4chee, in compliance with the DICOM standard to provide compatibility with existing clinical networks and devices. Parts of Dcm4chee were extended to access image-specific attributes as search parameters. To date, approximately 3000 de-identified cardiac imaging examinations are available in the database. All software components developed by the CAP are open source and freely available under the Mozilla Public License Version 1.1 (http://www.mozilla.org/MPL/MPL-1.1.txt). Availability: http://www.cardiacatlas.org. Contact: a.young@auckland.ac.nz. Supplementary data are available at Bioinformatics online.

  19. An event database for rotational seismology

    NASA Astrophysics Data System (ADS)

    Salvermoser, Johannes; Hadziioannou, Celine; Hable, Sarah; Chow, Bryant; Krischer, Lion; Wassermann, Joachim; Igel, Heiner

    2016-04-01

    The ring laser sensor (G-ring) located at Wettzell, Germany, has routinely observed earthquake-induced rotational ground motions around a vertical axis since its installation in 2003. Here we present results from a recently installed event database, the first to provide ring laser event data in an open-access format. Based on the GCMT event catalogue and a set of search criteria, seismograms from the ring laser and the collocated broadband seismometer are extracted and processed. The ObsPy-based processing scheme generates plots showing waveform fits between rotation rate and transverse acceleration, and extracts characteristic wavefield parameters such as peak ground motions, noise levels, Love wave phase velocities and waveform coherence. For each event, these parameters are stored in a text file (a JSON dictionary) that is easily readable and accessible on the website. The database contains >10,000 events starting in 2007 (Mw>4.5). It is updated daily and therefore provides recent events with a maximum lag of 24 hours. The user interface allows events to be filtered by epoch, magnitude, and source area, whereupon the events are displayed on a zoomable world map. We investigate how well the rotational motions are compatible with expectations from the surface wave magnitude scale. In addition, the website offers Python source code examples for downloading and processing the openly accessible waveforms.
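    Because the per-event parameter files are plain JSON dictionaries, they are straightforward to consume programmatically. A minimal sketch of loading and filtering one such event (the field names here are illustrative assumptions, not the database's actual schema):

```python
import json

# Hypothetical per-event dictionary, as the database's JSON text files
# might look (all field names are invented for illustration).
event_json = '''
{
  "event_id": "2015-09-16_offshore_chile",
  "magnitude_mw": 8.3,
  "peak_rotation_rate_nrad_s": 120.5,
  "love_phase_velocity_km_s": 4.1
}
'''

event = json.loads(event_json)

# Filter step analogous to the website's magnitude criterion (Mw > 4.5).
if event["magnitude_mw"] > 4.5:
    print(event["event_id"], "Mw", event["magnitude_mw"])
```

    The same pattern scales to the full catalogue: download each event's JSON file, parse it, and collect the parameters of interest into a table.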

  20. Open Access: From Myth to Paradox

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Ginsparg, Paul

    2009-05-06

    True open access to scientific publications not only gives readers the possibility to read articles without paying a subscription, but also makes the material available for automated ingestion and harvesting by third parties. Once articles and associated data become universally treatable as computable objects, openly available to third-party aggregators and value-added services, what new services can we expect, and how will they change the way that researchers interact with their scholarly communications infrastructure? I will discuss straightforward applications of existing ideas and services, including citation analysis, collaborative filtering, external database linkages, interoperability, and other forms of automated markup, and speculate on the sociology of the next generation of users.

  1. SORTEZ: a relational translator for NCBI's ASN.1 database.

    PubMed

    Hart, K W; Searls, D B; Overton, G C

    1994-07-01

    The National Center for Biotechnology Information (NCBI) has created a database collection that includes several protein and nucleic acid sequence databases, a biosequence-specific subset of MEDLINE, as well as value-added information such as links between similar sequences. Information in the NCBI database is modeled in Abstract Syntax Notation 1 (ASN.1), an Open Systems Interconnection protocol designed for exchanging structured data between software applications rather than as a data model for database systems. While the NCBI database is distributed with an easy-to-use information retrieval system, ENTREZ, the ASN.1 data model currently lacks an ad hoc query language for general-purpose data access. For that reason, we have developed a software package, SORTEZ, that transforms the ASN.1 database (or other databases with nested data structures) to a relational data model and subsequently to a relational database management system (Sybase), where information can be accessed through the relational query language SQL. Because the need to transform data from one data model and schema to another arises naturally in several important contexts, including efficient execution of specific applications, access to multiple databases, and adaptation to database evolution, this work also serves as a practical study of the issues involved in the various stages of database transformation. We show that transformation from the ASN.1 data model to a relational data model can be largely automated, but that schema transformation and data conversion require considerable domain expertise and would greatly benefit from additional support tools.
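    The core of such a transformation, turning nested structures into flat tables linked by foreign keys, can be illustrated with a generic sketch (this is not the SORTEZ code itself, which targets ASN.1 and Sybase; the record shape and table names are invented):

```python
# Generic nested-to-relational flattening sketch: each nested list in a
# record becomes a child table row keyed back to its parent record.
def flatten(record, seq_id, seq_rows, feature_rows):
    """Split one nested sequence record into two relational tables."""
    seq_rows.append({"seq_id": seq_id, "name": record["name"]})
    for i, feat in enumerate(record["features"]):
        feature_rows.append({
            "seq_id": seq_id,      # foreign key back to the parent table
            "feature_id": i,
            "type": feat["type"],
        })

seq_rows, feature_rows = [], []
nested = {"name": "exampleSeq",
          "features": [{"type": "CDS"}, {"type": "exon"}]}
flatten(nested, 1, seq_rows, feature_rows)

print(seq_rows)      # one parent row
print(feature_rows)  # two child rows carrying the foreign key
```

    As the abstract notes, this structural step automates well; deciding on good relational schemas and converting real data is where domain expertise comes in.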

  2. Kalium: a database of potassium channel toxins from scorpion venom.

    PubMed

    Kuzmenkov, Alexey I; Krylov, Nikolay A; Chugunov, Anton O; Grishin, Eugene V; Vassilevski, Alexander A

    2016-01-01

    Kalium (http://kaliumdb.org/) is a manually curated database that accumulates data on potassium channel toxins purified from scorpion venom (KTx). This database is an open-access resource, and provides easy access to pages of other databases of interest, such as UniProt, PDB, NCBI Taxonomy Browser, and PubMed. Key achievements of Kalium include strict yet straightforward regulation of KTx classification based on the unified nomenclature supported by researchers in the field; removal of peptides with only partial sequences and of entries supported by transcriptomic information alone; classification of β-family toxins; and addition of a novel λ-family. Molecules presented in the database can be processed by the Clustal Omega server using a one-click option. Molecular masses of mature peptides are calculated, and available activity data are compiled for all KTx. We believe that Kalium is not only of high interest to professional toxinologists, but also of general utility to the scientific community. Database URL: http://kaliumdb.org/. © The Author(s) 2016. Published by Oxford University Press.

  3. Open Access to Geophysical Data

    NASA Astrophysics Data System (ADS)

    Sergeyeva, Nataliya A.; Zabarinskaya, Ludmila P.

    2017-04-01

    The Russian World Data Centers for Solar-Terrestrial Physics & Solid Earth Physics, hosted by the Geophysical Center of the Russian Academy of Sciences, are Regular Members of the ICSU World Data System. Guided by the principles of the WDS Constitution and WDS Data Sharing Principles, the WDCs provide full and open access to data, long-term data stewardship, compliance with agreed-upon data standards and conventions, and mechanisms to facilitate and improve access to data. Historical and current geophysical data on different media, in the form of digital data sets, analog records, collections of maps, and descriptions, are collected and stored in the Centers. The WDCs regularly add new data to their repositories and databases and keep them up to date. The WDCs now focus on four new projects, aimed at: increasing the data available online through retrospective data collection and digital preservation; creating a modern system for registering and publishing data with digital object identifier (DOI) assignment, and promoting a culture of data citation; replacing the file system with databases for more convenient access to data; and participating in the WDS Metadata Catalogue and Data Portal by creating metadata for the WDCs' information resources.

  4. International Soil Carbon Network (ISCN) Database v3-1

    DOE Data Explorer

    Nave, Luke [University of Michigan] (ORCID:0000000182588335); Johnson, Kris [USDA-Forest Service; van Ingen, Catharine [Microsoft Research; Agarwal, Deborah [Lawrence Berkeley National Laboratory] (ORCID:0000000150452396); Humphrey, Marty [University of Virginia; Beekwilder, Norman [University of Virginia

    2016-01-01

    The ISCN is an international scientific community devoted to the advancement of soil carbon research. The ISCN manages an open-access, community-driven soil carbon database. This is version 3-1 of the ISCN Database, released in December 2015. It gathers 38 separate dataset contributions, totalling 67,112 sites with data from 71,198 soil profiles and 431,324 soil layers. For more information about the ISCN, its scientific community and resources, data policies and partner networks visit: http://iscn.fluxdata.org/.

  5. Alternative treatment technology information center computer database system

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sullivan, D.

    1995-10-01

    The Alternative Treatment Technology Information Center (ATTIC) computer database system was developed pursuant to the 1986 Superfund law amendments. It provides up-to-date information on innovative treatment technologies to clean up hazardous waste sites. ATTIC v2.0 provides access to several independent databases as well as a mechanism for retrieving full-text documents of key literature. It can be accessed with a personal computer and modem 24 hours a day, and there are no user fees. ATTIC provides "one-stop shopping" for information on alternative treatment options by accessing several databases: (1) the treatment technology database, which contains abstracts from the literature on all types of treatment technologies, including biological, chemical, physical, and thermal methods; the best literature as viewed by experts is highlighted; (2) the treatability study database, which provides performance information, derived from treatability studies, on technologies to remove contaminants from wastewaters and soils (this database is available through ATTIC or separately on a disk that can be mailed to you); (3) the underground storage tank database, which presents information on underground storage tank corrective actions, surface spills, emergency response, and remedial actions; and (4) the oil/chemical spill database, which provides abstracts on treatment and disposal of spilled oil and chemicals. In addition to these separate databases, ATTIC allows immediate access to other disk-based systems such as the Vendor Information System for Innovative Treatment Technologies (VISITT) and the Bioremediation in the Field Search System (BFSS). Users may download these programs to their own PC via a high-speed modem. Also via modem, users are able to download entire documents through the ATTIC system. Currently, about fifty publications are available, including Superfund Innovative Technology Evaluation (SITE) program documents.

  6. 75 FR 57544 - Defense Trade Advisory Group; Notice of Open Meeting

    Federal Register 2010, 2011, 2012, 2013, 2014

    2010-09-21

    .... Truman Building, Washington, DC. Entry and registration will begin at 12:30 p.m. Please use the building... Visitor Access Control System (VACS-D) database. Please see the Privacy Impact Assessment for VACS-D at...

  7. Geothermal NEPA Database on OpenEI (Poster)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Young, K. R.; Levine, A.

    2014-09-01

    The National Renewable Energy Laboratory (NREL) developed the Geothermal National Environmental Policy Act (NEPA) Database as a platform for government agencies and industry to access and maintain information related to geothermal NEPA documents. The data were collected to inform analyses of NEPA timelines, and the collected data were made publicly available via this tool in case others might find the data useful. NREL staff and contractors collected documents from agency websites, during visits to the two busiest Bureau of Land Management (BLM) field offices for geothermal development, and through email and phone call requests from other BLM field offices. They then entered the information into the database, hosted by Open Energy Information (http://en.openei.org/wiki/RAPID/NEPA). The long-term success of the project will depend on the willingness of federal agencies, industry, and others to populate the database with NEPA and related documents, and to use the data for their own analyses. As the information and capabilities of the database expand, developers and agencies can save time on new NEPA reports by accessing a single location to research related activities, their potential impacts, and previously proposed and imposed mitigation measures. NREL used a wiki platform to allow industry and agencies to maintain the content in the future so that it continues to provide relevant and accurate information to users.

  8. The successes and challenges of open-source biopharmaceutical innovation.

    PubMed

    Allarakhia, Minna

    2014-05-01

    Increasingly, open-source-based alliances seek to provide broad access to data, research-based tools, preclinical samples and downstream compounds. The challenge is how to create value from open-source biopharmaceutical innovation. This value creation may occur via transparency and usage of data across the biopharmaceutical value chain as stakeholders move dynamically between open source and open innovation. In this article, several examples are used to trace the evolution of biopharmaceutical open-source initiatives. The article specifically discusses the technological challenges associated with the integration and standardization of big data; the human capacity development challenges associated with skill development around big data usage; and the data-material access challenge associated with data and material access and usage rights, particularly as the boundary between open source and open innovation becomes more fluid. It is the author's opinion that assessing when and how value creation will occur through open-source biopharmaceutical innovation is paramount. The key is to determine the metrics of value creation and the necessary technological, educational and legal frameworks to support the downstream outcomes of what are now big-data-based open-source initiatives. Continued focus on early-stage value creation alone is not advisable. Instead, it would be more advisable to adopt an approach where stakeholders transform open-source initiatives into open-source discovery, crowdsourcing and open product development partnerships on the same platform.

  9. libChEBI: an API for accessing the ChEBI database.

    PubMed

    Swainston, Neil; Hastings, Janna; Dekker, Adriano; Muthukrishnan, Venkatesh; May, John; Steinbeck, Christoph; Mendes, Pedro

    2016-01-01

    ChEBI is a database and ontology of chemical entities of biological interest. It is widely used as a source of identifiers to facilitate unambiguous reference to chemical entities within biological models, databases, ontologies and literature. ChEBI contains a wealth of chemical data, covering over 46,500 distinct chemical entities, and related data such as chemical formula, charge, molecular mass, structure, synonyms and links to external databases. Furthermore, ChEBI is an ontology, and thus provides meaningful links between chemical entities. Unlike many other resources, ChEBI is fully human-curated, providing a reliable, non-redundant collection of chemical entities and related data. While ChEBI is supported by a web service for programmatic access and a number of download files, it does not have an API library to facilitate the use of ChEBI and its data in cheminformatics software. To provide this missing functionality, libChEBI, a comprehensive API library for accessing ChEBI data, is introduced. libChEBI is available in Java, Python and MATLAB versions from http://github.com/libChEBI, and provides full programmatic access to all data held within the ChEBI database through a simple and documented API. libChEBI is reliant upon the (automated) download and regular update of flat files that are held locally. As such, libChEBI can be embedded in both on- and off-line software applications. libChEBI allows better support of ChEBI and its data in the development of new cheminformatics software. Covering three key programming languages, it allows for the entirety of the ChEBI database to be accessed easily and quickly through a simple API. All code is open access and freely available.
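    The flat-file pattern the abstract describes, parsing locally held download files into an in-memory index so the library works both on- and off-line, can be sketched generically. Note that the tab-separated snippet and the lookup function below are illustrative stand-ins, not libChEBI's real files or API:

```python
import csv
import io

# Stand-in for a ChEBI-style flat file held locally (invented rows;
# the real ChEBI downloads have a different, richer layout).
flat_file = io.StringIO(
    "CHEBI_ID\tNAME\tFORMULA\tMASS\n"
    "CHEBI:15377\twater\tH2O\t18.01\n"
    "CHEBI:17234\tglucose\tC6H12O6\t180.16\n"
)

# Parse once into an in-memory index, as an offline-capable API
# library might do after its automated flat-file download step.
index = {row["CHEBI_ID"]: row
         for row in csv.DictReader(flat_file, delimiter="\t")}

def get_entity(chebi_id):
    """Hypothetical lookup helper (not libChEBI's actual signature)."""
    return index[chebi_id]

print(get_entity("CHEBI:15377")["FORMULA"])  # H2O
```

    Keeping the parsed index local is what lets such a library serve queries without a network round-trip per lookup.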

  10. [The opening of the French national health database: Opportunities and difficulties. The experience of the Gazel and Constances cohorts].

    PubMed

    Goldberg, M; Carton, M; Gourmelen, J; Genreau, M; Montourcy, M; Le Got, S; Zins, M

    2016-09-01

    In France, the national health database (SNIIRAM) is an administrative health database that collects data on hospitalizations and healthcare consumption for more than 60 million people. Although it does not record behavioral or environmental data, its data are of major interest for epidemiology, surveillance and public health. One of the most interesting uses of the SNIIRAM is its linkage with surveys collecting data directly from persons. Access to the SNIIRAM data is currently relatively limited, but in the near future regulatory changes will largely facilitate open access. However, it is a huge and complex database, and its volume and architecture pose important methodological and technical difficulties for its use. We are developing tools to facilitate the linkage of the Gazel and Constances cohorts to the SNIIRAM: interactive documentation on the SNIIRAM database; software for verifying the completeness and validity of the data received from the SNIIRAM; methods for constructing indicators from the raw data in order to flag the presence of certain events (a specific diagnosis, procedure, drug…); and standard queries for producing a set of variables on a specific area (drugs, diagnoses during a hospital stay…). Moreover, the recently established REDSIAM network aims to develop, evaluate and make available algorithms to identify pathologies in the SNIIRAM. In order to fully benefit from the exceptional potential of the SNIIRAM database, it is essential to develop tools to facilitate its use. Copyright © 2016 Elsevier Masson SAS. All rights reserved.
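    The indicator-construction step mentioned above, flagging whether a given event ever appears in a person's raw claims rows, reduces to a grouped "any" over person records. A minimal sketch (the codes and field names are invented for illustration, not SNIIRAM's actual schema):

```python
# Minimal sketch of building a presence indicator from raw claims rows
# (all codes and field names are invented for illustration).
claims = [
    {"person_id": 1, "drug_code": "A10BA02"},  # e.g. an antidiabetic
    {"person_id": 1, "drug_code": "C07AB07"},
    {"person_id": 2, "drug_code": "C07AB07"},
]

TARGET = "A10BA02"

# One boolean indicator per person: was the target drug ever dispensed?
indicator = {}
for row in claims:
    pid = row["person_id"]
    indicator[pid] = indicator.get(pid, False) or row["drug_code"] == TARGET

print(indicator)  # {1: True, 2: False}
```

    Real algorithms (as developed by networks like REDSIAM) combine several such flags, diagnoses, procedures and drugs, with temporal constraints, but the underlying pattern is the same.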

  11. jSPyDB, an open source database-independent tool for data management

    NASA Astrophysics Data System (ADS)

    Pierro, Giuseppe Antonio; Cavallari, Francesca; Di Guida, Salvatore; Innocente, Vincenzo

    2011-12-01

    Nowadays, an increasing number of commercial tools built on Java or .NET are available for accessing databases. However, many of these applications have several drawbacks: they are usually not open source, they provide interfaces only to a specific kind of database, and they are platform-dependent and very CPU- and memory-intensive. jSPyDB is a free web-based tool written in Python and JavaScript. It relies on jQuery and Python libraries, and is intended to provide a simple handler to different database technologies inside a local web browser. Exploiting fast-access libraries such as SQLAlchemy, the tool is easy to install and configure. Its design envisages three layers. The front-end client side in the local web browser communicates with a back-end server. Only the server is able to connect to the different databases to perform data definition and manipulation. The server makes the data available to the client, so that the user can display and handle them safely. Moreover, thanks to jQuery libraries, the tool supports export of data in different formats, such as XML and JSON. Finally, using a set of pre-defined functions, users can create customized views for better data visualization. In this way, we optimize the performance of database servers by avoiding short connections and concurrent sessions. In addition, security is enforced, since users are not given the possibility to directly execute any SQL statement.
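    The three-layer idea, where only the server process touches the database and hands the browser client serialized rows, can be sketched in a few lines. This simplified stand-in uses the standard library's sqlite3 in place of jSPyDB's SQLAlchemy-backed multi-database server, and the table and function names are invented:

```python
import json
import sqlite3

# Server-side layer: the only component allowed to touch the database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE runs (id INTEGER, status TEXT)")
conn.executemany("INSERT INTO runs VALUES (?, ?)", [(1, "ok"), (2, "failed")])

def serve_table(name):
    """Return a table as JSON, the payload a browser client would render.

    The query is built by the server itself; clients never submit raw
    SQL, mirroring the security constraint described in the abstract.
    """
    cur = conn.execute(f"SELECT id, status FROM {name}")
    cols = [d[0] for d in cur.description]
    return json.dumps([dict(zip(cols, row)) for row in cur.fetchall()])

# Client-side layer would receive and display this JSON payload.
print(serve_table("runs"))
```

    Holding one long-lived server-side connection and shipping JSON to the browser is also what avoids the short connections and concurrent sessions the abstract mentions.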

  12. SNPchiMp v.3: integrating and standardizing single nucleotide polymorphism data for livestock species.

    PubMed

    Nicolazzi, Ezequiel L; Caprera, Andrea; Nazzicari, Nelson; Cozzi, Paolo; Strozzi, Francesco; Lawley, Cindy; Pirani, Ali; Soans, Chandrasen; Brew, Fiona; Jorjani, Hossein; Evans, Gary; Simpson, Barry; Tosser-Klopp, Gwenola; Brauning, Rudiger; Williams, John L; Stella, Alessandra

    2015-04-10

    In recent years, the use of genomic information in livestock species for genetic improvement, association studies and many other fields has become routine. In order to accommodate different market requirements in terms of genotyping cost, manufacturers of single nucleotide polymorphism (SNP) arrays, private companies and international consortia have developed a large number of arrays with different content and different SNP density. The number of currently available SNP arrays differs among species, ranging from one for goats to more than ten for cattle, and the number of arrays available is increasing rapidly. However, there has been limited or no effort to standardize and integrate array-specific (e.g. SNP IDs, allele coding) and species-specific (i.e. past and current assemblies) SNP information. Here we present SNPchiMp v.3, a solution to these issues for the six major livestock species (cow, pig, horse, sheep, goat and chicken). Original data were collected directly from SNP array producers and specific international genome consortia, and stored in a MySQL database. The database was then linked to an open-access web tool and to public databases. SNPchiMp v.3 ensures fast access to the database (retrieving within/across SNP array data) and the possibility of annotating SNP array data in a user-friendly fashion. This platform allows easy integration and standardization, and it is aimed at both industry and research. It also enables users to easily link the information available from the array producer with data in public databases, without the need for additional bioinformatics tools or pipelines. In recognition of its open-access use of Ensembl resources, SNPchiMp v.3 was officially credited as an Ensembl E!mpowered tool. Available at http://bioinformatics.tecnoparco.org/SNPchimp.
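    The standardization problem SNPchiMp addresses, the same physical SNP carried under different probe IDs and allele codings on different arrays, amounts to joining array-specific records through a shared key. A toy sketch (all identifiers invented, and the shared rs ID is just one possible join key):

```python
# Toy cross-array SNP reconciliation (all identifiers are invented).
array_a = {"BovineHD0100000001": {"rs_id": "rs12345", "alleles": "A/B"}}
array_b = {"ARS-BFGL-NGS-1001": {"rs_id": "rs12345", "alleles": "T/C"}}

# Pivot both arrays onto the shared rs ID, so a single query can
# retrieve a SNP's probe IDs within or across arrays.
by_rs = {}
for array_name, content in [("array_a", array_a), ("array_b", array_b)]:
    for probe_id, info in content.items():
        by_rs.setdefault(info["rs_id"], {})[array_name] = probe_id

print(by_rs["rs12345"])
```

    A production system like SNPchiMp adds the harder parts on top of this join: reconciling allele codings (A/B vs. TOP/BOT vs. forward strand) and mapping positions across genome assemblies.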

  13. Design and Analysis of a Model Reconfigurable Cyber-Exercise Laboratory (RCEL) for Information Assurance Education

    DTIC Science & Technology

    2004-03-01

    with MySQL. This choice was made because MySQL is open source. Any significant database engine such as Oracle or MS-SQL or even MS Access can be used... Figure 6. The DoD vs. Commercial Life Cycle... necessarily be interested in SCADA network security... MySQL (Database server) – This station represents a typical data server for a web page

  14. LOVD: easy creation of a locus-specific sequence variation database using an "LSDB-in-a-box" approach.

    PubMed

    Fokkema, Ivo F A C; den Dunnen, Johan T; Taschner, Peter E M

    2005-08-01

    The completion of the human genome project has initiated, as well as provided the basis for, the collection and study of all sequence variation between individuals. Direct access to up-to-date information on sequence variation is currently provided most efficiently through web-based, gene-centered, locus-specific databases (LSDBs). We have developed the Leiden Open (source) Variation Database (LOVD) software approaching the "LSDB-in-a-Box" idea for the easy creation and maintenance of a fully web-based gene sequence variation database. LOVD is platform-independent and uses PHP and MySQL open source software only. The basic gene-centered and modular design of the database follows the recommendations of the Human Genome Variation Society (HGVS) and focuses on the collection and display of DNA sequence variations. With minimal effort, the LOVD platform is extendable with clinical data. The open set-up should both facilitate and promote functional extension with scripts written by the community. The LOVD software is freely available from the Leiden Muscular Dystrophy pages (www.DMD.nl/LOVD/). To promote the use of LOVD, we currently offer curators the possibility to set up an LSDB on our Leiden server. (c) 2005 Wiley-Liss, Inc.

  15. MIRO and IRbase: IT Tools for the Epidemiological Monitoring of Insecticide Resistance in Mosquito Disease Vectors

    PubMed Central

    Dialynas, Emmanuel; Topalis, Pantelis; Vontas, John; Louis, Christos

    2009-01-01

    Background Monitoring of insect vector populations with respect to their susceptibility to one or more insecticides is a crucial element of the strategies used for the control of arthropod-borne diseases. This management task can nowadays be achieved more efficiently when assisted by IT (Information Technology) tools, ranging from modern integrated databases to GIS (Geographic Information System). Here we describe an application ontology that we developed de novo, and a specially designed database that, based on this ontology, can be used for the purpose of controlling mosquitoes and, thus, the diseases that they transmit. Methodology/Principal Findings The ontology, named MIRO for Mosquito Insecticide Resistance Ontology and developed using the OBO-Edit software, describes all pertinent aspects of insecticide resistance, including specific methodology and mode of action. MIRO, then, forms the basis for the design and development of a dedicated database, IRbase, constructed using open-source software, which can be used to retrieve data on mosquito populations in a temporally and spatially resolved way, as well as to map the output using a Google Earth interface. The database's dependence on MIRO enables rational and efficient hierarchical searches. Conclusions/Significance The fact that the MIRO complies with the rules set forward by the OBO (Open Biomedical Ontologies) Foundry introduces cross-referencing with other biomedical ontologies, and, thus, both MIRO and IRbase are suitable as parts of future comprehensive surveillance tools and decision support systems that will be used for the control of vector-borne diseases. MIRO is downloadable from and IRbase is accessible at VectorBase, the NIAID-sponsored open access database for arthropod vectors of disease. PMID:19547750

  16. BioMart Central Portal: an open database network for the biological community

    PubMed Central

    Guberman, Jonathan M.; Ai, J.; Arnaiz, O.; Baran, Joachim; Blake, Andrew; Baldock, Richard; Chelala, Claude; Croft, David; Cros, Anthony; Cutts, Rosalind J.; Di Génova, A.; Forbes, Simon; Fujisawa, T.; Gadaleta, E.; Goodstein, D. M.; Gundem, Gunes; Haggarty, Bernard; Haider, Syed; Hall, Matthew; Harris, Todd; Haw, Robin; Hu, S.; Hubbard, Simon; Hsu, Jack; Iyer, Vivek; Jones, Philip; Katayama, Toshiaki; Kinsella, R.; Kong, Lei; Lawson, Daniel; Liang, Yong; Lopez-Bigas, Nuria; Luo, J.; Lush, Michael; Mason, Jeremy; Moreews, Francois; Ndegwa, Nelson; Oakley, Darren; Perez-Llamas, Christian; Primig, Michael; Rivkin, Elena; Rosanoff, S.; Shepherd, Rebecca; Simon, Reinhard; Skarnes, B.; Smedley, Damian; Sperling, Linda; Spooner, William; Stevenson, Peter; Stone, Kevin; Teague, J.; Wang, Jun; Wang, Jianxin; Whitty, Brett; Wong, D. T.; Wong-Erasmus, Marie; Yao, L.; Youens-Clark, Ken; Yung, Christina; Zhang, Junjun; Kasprzyk, Arek

    2011-01-01

    BioMart Central Portal is a first-of-its-kind, community-driven effort to provide unified access to dozens of biological databases spanning genomics, proteomics, model organisms, cancer data, ontology information and more. Anybody can contribute an independently maintained resource to the Central Portal, allowing it to be exposed to and shared with the research community, and linking it with the other resources in the portal. Users can take advantage of the common interface to quickly utilize different sources without learning a new system for each. The system also simplifies cross-database searches that might otherwise require several complicated steps. Several integrated tools streamline common tasks, such as converting between ID formats and retrieving sequences. The combination of a wide variety of databases, an easy-to-use interface, robust programmatic access and the array of tools makes Central Portal a one-stop shop for biological data querying. Here, we describe the structure of Central Portal and show example queries to demonstrate its capabilities. Database URL: http://central.biomart.org. PMID:21930507

  17. PAZAR: a framework for collection and dissemination of cis-regulatory sequence annotation

    PubMed Central

    Portales-Casamar, Elodie; Kirov, Stefan; Lim, Jonathan; Lithwick, Stuart; Swanson, Magdalena I; Ticoll, Amy; Snoddy, Jay; Wasserman, Wyeth W

    2007-01-01

    PAZAR is an open-access and open-source database of transcription factor and regulatory sequence annotation with associated web interface and programming tools for data submission and extraction. Curated boutique data collections can be maintained and disseminated through the unified schema of the mall-like PAZAR repository. The Pleiades Promoter Project collection of brain-linked regulatory sequences is introduced to demonstrate the depth of annotation possible within PAZAR. PAZAR, located at http://www.pazar.info, is open for business. PMID:17916232

  18. PAZAR: a framework for collection and dissemination of cis-regulatory sequence annotation.

    PubMed

    Portales-Casamar, Elodie; Kirov, Stefan; Lim, Jonathan; Lithwick, Stuart; Swanson, Magdalena I; Ticoll, Amy; Snoddy, Jay; Wasserman, Wyeth W

    2007-01-01

    PAZAR is an open-access and open-source database of transcription factor and regulatory sequence annotation with associated web interface and programming tools for data submission and extraction. Curated boutique data collections can be maintained and disseminated through the unified schema of the mall-like PAZAR repository. The Pleiades Promoter Project collection of brain-linked regulatory sequences is introduced to demonstrate the depth of annotation possible within PAZAR. PAZAR, located at http://www.pazar.info, is open for business.

  19. Filling in the GAPS: evaluating completeness and coverage of open-access biodiversity databases in the United States

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Troia, Matthew J.; McManamay, Ryan A.

    Primary biodiversity data constitute observations of particular species at given points in time and space. Open-access electronic databases provide unprecedented access to these data, but their usefulness in characterizing species distributions and patterns in biodiversity depends on how complete species inventories are at a given survey location and how uniformly distributed survey locations are along dimensions of time, space, and environment. Our aim was to compare completeness and coverage among three open-access databases representing ten taxonomic groups (amphibians, birds, freshwater bivalves, crayfish, freshwater fish, fungi, insects, mammals, plants, and reptiles) in the contiguous United States. We compiled occurrence records from the Global Biodiversity Information Facility (GBIF), the North American Breeding Bird Survey (BBS), and federally administered fish surveys (FFS). In this study, we aggregated occurrence records by 0.1° × 0.1° grid cells and computed three completeness metrics to classify each grid cell as well-surveyed or not. Next, we compared frequency distributions of surveyed grid cells to background environmental conditions in a GIS and performed Kolmogorov–Smirnov tests to quantify coverage through time, along two spatial gradients, and along eight environmental gradients. The three databases contributed >13.6 million reliable occurrence records distributed among >190,000 grid cells. The percent of well-surveyed grid cells was substantially lower for GBIF (5.2%) than for systematic surveys (BBS and FFS; 82.5%). Still, the large number of GBIF occurrence records produced at least 250 well-surveyed grid cells for six of nine taxonomic groups. Coverages of systematic surveys were less biased across spatial and environmental dimensions but were more biased in temporal coverage compared to GBIF data. GBIF coverages also varied among taxonomic groups, consistent with commonly recognized geographic, environmental, and institutional sampling biases. Lastly, this comprehensive assessment of biodiversity data across the contiguous United States provides a prioritization scheme to fill in the gaps by contributing existing occurrence records to the public domain and planning future surveys.
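
    The aggregate-then-test workflow this abstract describes can be sketched in a few lines. Everything below is a hypothetical stand-in: the occurrence records and elevation values are invented, and the paper's three completeness metrics are replaced by a single toy species-count rule.

```python
import math

def grid_cell(lat, lon, res=0.1):
    """Snap a coordinate to the lower-left corner of its res-degree cell."""
    return (round(math.floor(lat / res) * res, 6),
            round(math.floor(lon / res) * res, 6))

# Hypothetical occurrence records: (species, lat, lon)
records = [
    ("Lepomis macrochirus", 35.97, -84.27),
    ("Etheostoma rufilineatum", 35.97, -84.27),
    ("Lepomis macrochirus", 36.13, -83.49),
]

# Aggregate the species inventory observed in each 0.1-degree grid cell
cells = {}
for species, lat, lon in records:
    cells.setdefault(grid_cell(lat, lon), set()).add(species)

# Toy completeness rule: call a cell "well-surveyed" if >= 2 species recorded
well_surveyed = [cell for cell, spp in cells.items() if len(spp) >= 2]

def ks_statistic(a, b):
    """Two-sample Kolmogorov-Smirnov statistic (max distance between ECDFs)."""
    def ecdf(sample, x):
        return sum(v <= x for v in sample) / len(sample)
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in sorted(a + b))

# Compare an environmental gradient (elevation, m) at well-surveyed cells
# against background conditions, as the paper does with KS tests
surveyed_elev = [290.0, 310.0]
background_elev = [250.0, 300.0, 900.0, 1500.0, 2100.0]
ks_stat = ks_statistic(surveyed_elev, background_elev)
```

    A large KS statistic (here 0.6) indicates that surveyed locations are a biased sample of the available environmental conditions.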

  20. Filling in the GAPS: evaluating completeness and coverage of open-access biodiversity databases in the United States

    DOE PAGES

    Troia, Matthew J.; McManamay, Ryan A.

    2016-06-12

    Primary biodiversity data constitute observations of particular species at given points in time and space. Open-access electronic databases provide unprecedented access to these data, but their usefulness in characterizing species distributions and patterns in biodiversity depends on how complete species inventories are at a given survey location and how uniformly distributed survey locations are along dimensions of time, space, and environment. Our aim was to compare completeness and coverage among three open-access databases representing ten taxonomic groups (amphibians, birds, freshwater bivalves, crayfish, freshwater fish, fungi, insects, mammals, plants, and reptiles) in the contiguous United States. We compiled occurrence records from the Global Biodiversity Information Facility (GBIF), the North American Breeding Bird Survey (BBS), and federally administered fish surveys (FFS). In this study, we aggregated occurrence records by 0.1° × 0.1° grid cells and computed three completeness metrics to classify each grid cell as well-surveyed or not. Next, we compared frequency distributions of surveyed grid cells to background environmental conditions in a GIS and performed Kolmogorov–Smirnov tests to quantify coverage through time, along two spatial gradients, and along eight environmental gradients. The three databases contributed >13.6 million reliable occurrence records distributed among >190,000 grid cells. The percent of well-surveyed grid cells was substantially lower for GBIF (5.2%) than for systematic surveys (BBS and FFS; 82.5%). Still, the large number of GBIF occurrence records produced at least 250 well-surveyed grid cells for six of nine taxonomic groups. Coverages of systematic surveys were less biased across spatial and environmental dimensions but were more biased in temporal coverage compared to GBIF data. GBIF coverages also varied among taxonomic groups, consistent with commonly recognized geographic, environmental, and institutional sampling biases. Lastly, this comprehensive assessment of biodiversity data across the contiguous United States provides a prioritization scheme to fill in the gaps by contributing existing occurrence records to the public domain and planning future surveys.

  1. DABAM: an open-source database of X-ray mirrors metrology

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sanchez del Rio, Manuel; Bianchi, Davide; Cocco, Daniele

    2016-04-20

    An open-source database containing metrology data for X-ray mirrors is presented. It makes available metrology data (mirror heights and slopes profiles) that can be used with simulation tools for calculating the effects of optical surface errors in the performances of an optical instrument, such as a synchrotron beamline. A typical case is the degradation of the intensity profile at the focal position in a beamline due to mirror surface errors. This database for metrology (DABAM) aims to provide to the users of simulation tools the data of real mirrors. The data included in the database are described in this paper, with details of how the mirror parameters are stored. An accompanying software is provided to allow simple access and processing of these data, calculate the most usual statistical parameters, and also include the option of creating input files for most used simulation codes. Some optics simulations are presented and discussed to illustrate the real use of the profiles from the database.
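
    As a sketch of the kind of statistical parameters the accompanying software computes, the fragment below derives a slope profile from a height profile by finite differences and reports RMS figures of merit. The five-point profile is invented for illustration; real DABAM metrology files are far denser.

```python
import math

# Hypothetical mirror height profile: positions x (m) and heights h (m),
# standing in for a DABAM-style two-column metrology file
x = [0.0, 0.01, 0.02, 0.03, 0.04]
h = [0.0, 2e-9, 3e-9, 2e-9, 0.0]

# Slope profile by central finite differences (one-sided at the ends)
slopes = []
for i in range(len(x)):
    if i == 0:
        s = (h[1] - h[0]) / (x[1] - x[0])
    elif i == len(x) - 1:
        s = (h[-1] - h[-2]) / (x[-1] - x[-2])
    else:
        s = (h[i + 1] - h[i - 1]) / (x[i + 1] - x[i - 1])
    slopes.append(s)

def rms(values):
    """Root mean square about the mean, the usual surface-error figure."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

height_error_rms = rms(h)       # metres
slope_error_rms = rms(slopes)   # radians
```

    For this toy profile the height error RMS is 1.2 nm and the slope error RMS is about 0.16 µrad; synchrotron mirrors are typically specified by exactly these two numbers.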

  2. DABAM: an open-source database of X-ray mirrors metrology

    PubMed Central

    Sanchez del Rio, Manuel; Bianchi, Davide; Cocco, Daniele; Glass, Mark; Idir, Mourad; Metz, Jim; Raimondi, Lorenzo; Rebuffi, Luca; Reininger, Ruben; Shi, Xianbo; Siewert, Frank; Spielmann-Jaeggi, Sibylle; Takacs, Peter; Tomasset, Muriel; Tonnessen, Tom; Vivo, Amparo; Yashchuk, Valeriy

    2016-01-01

    An open-source database containing metrology data for X-ray mirrors is presented. It makes available metrology data (mirror heights and slopes profiles) that can be used with simulation tools for calculating the effects of optical surface errors in the performances of an optical instrument, such as a synchrotron beamline. A typical case is the degradation of the intensity profile at the focal position in a beamline due to mirror surface errors. This database for metrology (DABAM) aims to provide to the users of simulation tools the data of real mirrors. The data included in the database are described in this paper, with details of how the mirror parameters are stored. An accompanying software is provided to allow simple access and processing of these data, calculate the most usual statistical parameters, and also include the option of creating input files for most used simulation codes. Some optics simulations are presented and discussed to illustrate the real use of the profiles from the database. PMID:27140145

  3. DABAM: An open-source database of X-ray mirrors metrology

    DOE PAGES

    Sanchez del Rio, Manuel; Bianchi, Davide; Cocco, Daniele; ...

    2016-05-01

    An open-source database containing metrology data for X-ray mirrors is presented. It makes available metrology data (mirror heights and slopes profiles) that can be used with simulation tools for calculating the effects of optical surface errors in the performances of an optical instrument, such as a synchrotron beamline. A typical case is the degradation of the intensity profile at the focal position in a beamline due to mirror surface errors. This database for metrology (DABAM) aims to provide to the users of simulation tools the data of real mirrors. The data included in the database are described in this paper, with details of how the mirror parameters are stored. An accompanying software is provided to allow simple access and processing of these data, calculate the most usual statistical parameters, and also include the option of creating input files for most used simulation codes. In conclusion, some optics simulations are presented and discussed to illustrate the real use of the profiles from the database.

  4. BIRS – Bioterrorism Information Retrieval System

    PubMed Central

    Tewari, Ashish Kumar; Rashi; Wadhwa, Gulshan; Sharma, Sanjeev Kumar; Jain, Chakresh Kumar

    2013-01-01

    Bioterrorism is the intended use of pathogenic strains of microbes to spread terror in a population. There is a definite need to promote research on the development of vaccines, therapeutics and diagnostic methods as part of preparedness for any future bioterror attack. BIRS is an open-access database of collective information on the organisms related to bioterrorism. The database architecture utilizes current open-source technology, viz. PHP ver. 5.3.19, MySQL and IIS server under the Windows platform. The database stores information on literature, generic information and unique pathways of about 10 microorganisms involved in bioterrorism. This may serve as a collective repository to accelerate the drug discovery and vaccine design process against such bioterrorist agents (microbes). The available data have been validated against various online resources and by literature mining in order to provide the user with a comprehensive information system. Availability: The database is freely available at http://www.bioterrorism.biowaves.org PMID:23390356

  5. DABAM: an open-source database of X-ray mirrors metrology

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sanchez del Rio, Manuel; Bianchi, Davide; Cocco, Daniele

    An open-source database containing metrology data for X-ray mirrors is presented. It makes available metrology data (mirror heights and slopes profiles) that can be used with simulation tools for calculating the effects of optical surface errors in the performances of an optical instrument, such as a synchrotron beamline. A typical case is the degradation of the intensity profile at the focal position in a beamline due to mirror surface errors. This database for metrology (DABAM) aims to provide to the users of simulation tools the data of real mirrors. The data included in the database are described in this paper, with details of how the mirror parameters are stored. An accompanying software is provided to allow simple access and processing of these data, calculate the most usual statistical parameters, and also include the option of creating input files for most used simulation codes. Some optics simulations are presented and discussed to illustrate the real use of the profiles from the database.

  6. DABAM: An open-source database of X-ray mirrors metrology

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sanchez del Rio, Manuel; Bianchi, Davide; Cocco, Daniele

    An open-source database containing metrology data for X-ray mirrors is presented. It makes available metrology data (mirror heights and slopes profiles) that can be used with simulation tools for calculating the effects of optical surface errors in the performances of an optical instrument, such as a synchrotron beamline. A typical case is the degradation of the intensity profile at the focal position in a beamline due to mirror surface errors. This database for metrology (DABAM) aims to provide to the users of simulation tools the data of real mirrors. The data included in the database are described in this paper, with details of how the mirror parameters are stored. An accompanying software is provided to allow simple access and processing of these data, calculate the most usual statistical parameters, and also include the option of creating input files for most used simulation codes. In conclusion, some optics simulations are presented and discussed to illustrate the real use of the profiles from the database.

  7. Construction of a nasopharyngeal carcinoma 2D/MS repository with Open Source XML database--Xindice.

    PubMed

    Li, Feng; Li, Maoyu; Xiao, Zhiqiang; Zhang, Pengfei; Li, Jianling; Chen, Zhuchu

    2006-01-11

    Many proteomics initiatives require integration of all information with uniform criteria, from collection of samples and data display to publication of experimental results. The integration and exchange of these data of different formats and structures poses a great challenge. XML technology shows promise for handling this task due to its simplicity and flexibility. Nasopharyngeal carcinoma (NPC) is one of the most common cancers in southern China and Southeast Asia, with marked geographic and racial differences in incidence. Although some cancer proteome databases now exist, there is still no NPC proteome database. The raw NPC proteome experiment data were captured into one XML document with the Human Proteome Markup Language (HUP-ML) editor and imported into the native XML database Xindice. The 2D/MS repository of the NPC proteome was constructed with Apache, PHP and Xindice to provide access to the database via the Internet. On our website, two methods, keyword query and click query, are provided to access the entries of the NPC proteome database. Our 2D/MS repository can be used to share the raw NPC proteomics data that are generated from gel-based proteomics experiments. The database, as well as the PHP source code for constructing users' own proteome repositories, can be accessed at http://www.xyproteomics.org/.
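
    The keyword-query access such a repository provides can be imitated over an XML record with the standard library. The element names and spot entries below are invented for illustration, not actual HUP-ML.

```python
import xml.etree.ElementTree as ET

# A tiny HUP-ML-flavoured experiment record; structure is illustrative only
doc = """
<experiment id="NPC-2D-001">
  <sample tissue="nasopharynx" condition="carcinoma"/>
  <spots>
    <spot id="s1"><protein>Cathepsin D</protein><mw>44.5</mw></spot>
    <spot id="s2"><protein>Annexin A1</protein><mw>38.7</mw></spot>
  </spots>
</experiment>
"""

root = ET.fromstring(doc)

# Keyword query: find gel spots whose protein name contains "Annexin"
hits = [
    spot.get("id")
    for spot in root.findall(".//spot")
    if "Annexin" in spot.findtext("protein", "")
]
```

    A native XML database such as Xindice runs the same kind of query server-side (via XPath) instead of loading each document into memory.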

  8. Technical Aspects of Interfacing MUMPS to an External SQL Relational Database Management System

    PubMed Central

    Kuzmak, Peter M.; Walters, Richard F.; Penrod, Gail

    1988-01-01

    This paper describes an interface connecting InterSystems MUMPS (M/VX) to an external relational DBMS, the SYBASE Database Management System. The interface enables MUMPS to operate in a relational environment and gives the MUMPS language full access to a complete set of SQL commands. MUMPS generates SQL statements as ASCII text and sends them to the RDBMS. The RDBMS executes the statements and returns ASCII results to MUMPS. The interface suggests that the language features of MUMPS make it an attractive tool for use in the relational database environment. The approach described in this paper separates MUMPS from the relational database. Positioning the relational database outside of MUMPS promotes data sharing and permits a number of different options to be used for working with the data. Other languages like C, FORTRAN, and COBOL can access the RDBMS database. Advanced tools provided by the relational database vendor can also be used. SYBASE is an advanced high-performance transaction-oriented relational database management system for the VAX/VMS and UNIX operating systems. SYBASE is designed using a distributed open-systems architecture, and is relatively easy to interface with MUMPS.

  9. Making geospatial data in ASF archive readily accessible

    NASA Astrophysics Data System (ADS)

    Gens, R.; Hogenson, K.; Wolf, V. G.; Drew, L.; Stern, T.; Stoner, M.; Shapran, M.

    2015-12-01

    The way geospatial data is searched, managed, processed and used has changed significantly in recent years. A data archive such as the one at the Alaska Satellite Facility (ASF), one of NASA's twelve interlinked Distributed Active Archive Centers (DAACs), used to be searched solely via user interfaces that were specifically developed for its particular archive and data sets. ASF then moved to using an application programming interface (API) that defined a set of routines, protocols, and tools for distributing the geospatial information stored in the database in real time. This provided more flexible access to the geospatial data. Yet, it was up to the user to develop the tools to get more tailored access to the data they needed. We present two new approaches for serving data to users. In response to the recent Nepal earthquake we developed a data feed for distributing ESA's Sentinel data. Users can subscribe to the data feed and are provided with the relevant metadata the moment a new data set is available for download. The second approach was an Open Geospatial Consortium (OGC) web feature service (WFS). The WFS hosts the metadata along with a direct link from which the data can be downloaded. It uses the open-source GeoServer software (Youngblood and Iacovella, 2013) and provides an interface to include the geospatial information in the archive directly into the user's geographic information system (GIS) as an additional data layer. Both services are run on top of a geospatial PostGIS database, an open-source geographic extension for the PostgreSQL object-relational database (Marquez, 2015). Marquez, A., 2015. PostGIS essentials. Packt Publishing, 198 p. Youngblood, B. and Iacovella, S., 2013. GeoServer Beginner's Guide, Packt Publishing, 350 p.
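
    An OGC WFS like the one described is queried with simple key-value GetFeature requests, which can be built with the standard library alone. The endpoint, layer name, and bounding box below are placeholders, not the actual ASF service.

```python
from urllib.parse import urlencode

# Hypothetical GeoServer WFS endpoint and layer; the real ASF service
# details are not given in the abstract
base_url = "https://example.org/geoserver/wfs"
params = {
    "service": "WFS",
    "version": "2.0.0",
    "request": "GetFeature",
    "typeNames": "asf:granule_metadata",
    "outputFormat": "application/json",
    # Bounding box filter: minLon,minLat,maxLon,maxLat,CRS
    "bbox": "-150.0,60.0,-140.0,65.0,EPSG:4326",
    "count": "10",
}
url = base_url + "?" + urlencode(params)
```

    Because the request is plain HTTP, the same URL works from a browser, a script, or a desktop GIS that consumes WFS layers directly.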

  10. Software reuse example and challenges at NSIDC

    NASA Astrophysics Data System (ADS)

    Billingsley, B. W.; Brodzik, M.; Collins, J. A.

    2009-12-01

    NSIDC has created a new data discovery and access system, Searchlight, to provide users with the data they want in the format they want. NSIDC Searchlight supports discovery and access to disparate data types with on-the-fly reprojection, regridding and reformatting. Architected to both reuse open source systems and be reused itself, Searchlight reuses GDAL and Proj4 for manipulating data and format conversions, the netCDF Java library for creating netCDF output, MapServer and OpenLayers for defining spatial criteria and the JTS Topology Suite (JTS) in conjunction with Hibernate Spatial for database interaction and rich OGC-compliant spatial objects. The application reuses popular Java and JavaScript libraries including Struts 2, Spring, JPA (Hibernate), Sitemesh, JFreeChart, JQuery, DOJO and a PostGIS PostgreSQL database. Future reuse of Searchlight components is supported at varying architecture levels, ranging from the database and model components to web services. We present the tools, libraries and programs that Searchlight has reused. We describe the architecture of Searchlight and explain the strategies deployed for reusing existing software and how Searchlight is built for reuse. We will discuss NSIDC reuse of the Searchlight components to support rapid development of new data delivery systems.

  11. PRGdb: a bioinformatics platform for plant resistance gene analysis

    PubMed Central

    Sanseverino, Walter; Roma, Guglielmo; De Simone, Marco; Faino, Luigi; Melito, Sara; Stupka, Elia; Frusciante, Luigi; Ercolano, Maria Raffaella

    2010-01-01

    PRGdb is a web-accessible open-source (http://www.prgdb.org) database that represents the first bioinformatic resource providing a comprehensive overview of resistance genes (R-genes) in plants. PRGdb holds more than 16 000 known and putative R-genes belonging to 192 plant species challenged by 115 different pathogens and linked with useful biological information. The complete database includes a set of 73 manually curated reference R-genes, 6308 putative R-genes collected from NCBI and 10 463 computationally predicted putative R-genes. Thanks to a user-friendly interface, data can be examined using different query tools. An in-house prediction pipeline called Disease Resistance Analysis and Gene Orthology (DRAGO), based on reference R-gene sequence data, was developed to search for plant resistance genes in public datasets such as Unigene and GenBank. New putative R-gene classes containing unknown domain combinations were discovered and characterized. The development of the PRG platform represents an important starting point to conduct various experimental tasks. The inferred cross-link between genomic and phenotypic information allows access to a large body of information to find answers to several biological questions. The database structure also permits easy integration with other data types and opens up prospects for future implementations. PMID:19906694

  12. ABrowse--a customizable next-generation genome browser framework.

    PubMed

    Kong, Lei; Wang, Jun; Zhao, Shuqi; Gu, Xiaocheng; Luo, Jingchu; Gao, Ge

    2012-01-05

    With the rapid growth of genome sequencing projects, genome browser is becoming indispensable, not only as a visualization system but also as an interactive platform to support open data access and collaborative work. Thus a customizable genome browser framework with rich functions and flexible configuration is needed to facilitate various genome research projects. Based on next-generation web technologies, we have developed a general-purpose genome browser framework ABrowse which provides interactive browsing experience, open data access and collaborative work support. By supporting Google-map-like smooth navigation, ABrowse offers end users highly interactive browsing experience. To facilitate further data analysis, multiple data access approaches are supported for external platforms to retrieve data from ABrowse. To promote collaborative work, an online user-space is provided for end users to create, store and share comments, annotations and landmarks. For data providers, ABrowse is highly customizable and configurable. The framework provides a set of utilities to import annotation data conveniently. To build ABrowse on existing annotation databases, data providers could specify SQL statements according to database schema. And customized pages for detailed information display of annotation entries could be easily plugged in. For developers, new drawing strategies could be integrated into ABrowse for new types of annotation data. In addition, standard web service is provided for data retrieval remotely, providing underlying machine-oriented programming interface for open data access. ABrowse framework is valuable for end users, data providers and developers by providing rich user functions and flexible customization approaches. The source code is published under GNU Lesser General Public License v3.0 and is accessible at http://www.abrowse.org/. To demonstrate all the features of ABrowse, a live demo for Arabidopsis thaliana genome has been built at http://arabidopsis.cbi.edu.cn/.

  13. Brave New World: Data Intensive Science with SDSS and the VO

    NASA Astrophysics Data System (ADS)

    Thakar, A. R.; Szalay, A. S.; O'Mullane, W.; Nieto-Santisteban, M.; Budavari, T.; Li, N.; Carliles, S.; Haridas, V.; Malik, T.; Gray, J.

    2004-12-01

    With the advent of digital archives and the VO, astronomy is quickly changing from a data-hungry to a data-intensive science. Local and specialized access to data will remain the most direct and efficient way to get data out of individual archives, especially if you know what you are looking for. However, the enormous sizes of the upcoming archives will preclude this type of access for most institutions, and will not allow researchers to tap the vast potential for discovery in cross-matching and comparing data between different archives. The VO makes this type of interoperability and distributed data access possible by adopting industry standards for data access (SQL) and data interchange (SOAP/XML) with platform independence (Web services). As a sneak preview of this brave new world where astronomers may need to become SQL warriors, we present a look at VO-enabled access to catalog data in the SDSS Catalog Archive Server (CAS): CasJobs - a workbench environment that allows arbitrarily complex SQL queries and your own personal database (MyDB) that you can share with collaborators; OpenSkyQuery - an IVOA (International Virtual Observatory Alliance) compliant federation of multiple archives (OpenSkyNodes) that currently links nearly 20 catalogs and allows cross-match queries (in ADQL - Astronomical Data Query Language) between them; Spectrum and Filter Profile Web services that provide access to an open database of spectra (registered users may add their own spectra); and VO-enabled Mirage - a Java visualization tool developed at Bell Labs and enhanced at JHU that allows side-by-side comparison of SDSS catalog and FITS image data. Anticipating the next generation of Petabyte archives like LSST by the end of the decade, we are developing a parallel cross-match engine for all-sky cross-matches between large surveys, along with a 100-Terabyte data intensive science laboratory with high-speed parallel data access.
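
    The cross-match queries described here reduce to pairing sources whose angular separation falls under a tolerance. A brute-force sketch, with made-up catalog entries and a 1-arcsecond radius, looks like this; production engines such as the one described use spatial indexing (zones, HTM) rather than an all-pairs loop.

```python
import math

def angular_sep_deg(ra1, dec1, ra2, dec2):
    """Great-circle separation between two sky positions, in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    d = math.sin(dec1) * math.sin(dec2) + \
        math.cos(dec1) * math.cos(dec2) * math.cos(ra1 - ra2)
    return math.degrees(math.acos(min(1.0, max(-1.0, d))))

# Two hypothetical catalogs as (id, ra_deg, dec_deg)
cat_a = [("a1", 180.000, 1.000), ("a2", 185.000, -3.000)]
cat_b = [("b1", 180.0002, 1.0001), ("b2", 200.000, 10.000)]

# Brute-force positional cross-match within a 1-arcsecond radius
radius_deg = 1.0 / 3600.0
matches = [
    (ia, ib)
    for ia, ra_a, dec_a in cat_a
    for ib, ra_b, dec_b in cat_b
    if angular_sep_deg(ra_a, dec_a, ra_b, dec_b) <= radius_deg
]
```

    In ADQL the same intent is expressed declaratively, e.g. with a `CONTAINS(POINT(...), CIRCLE(...))` predicate, and the archive's cross-match engine chooses the execution strategy.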

  14. ForC: a global database of forest carbon stocks and fluxes.

    PubMed

    Anderson-Teixeira, Kristina J; Wang, Maria M H; McGarvey, Jennifer C; Herrmann, Valentine; Tepley, Alan J; Bond-Lamberty, Ben; LeBauer, David S

    2018-06-01

    Forests play an influential role in the global carbon (C) cycle, storing roughly half of terrestrial C and annually exchanging with the atmosphere more than five times the carbon dioxide (CO2) emitted by anthropogenic activities. Yet, scaling up from field-based measurements of forest C stocks and fluxes to understand global scale C cycling and its climate sensitivity remains an important challenge. Tens of thousands of forest C measurements have been made, but these data have yet to be integrated into a single database that makes them accessible for integrated analyses. Here we present an open-access global Forest Carbon database (ForC) containing previously published records of field-based measurements of ecosystem-level C stocks and annual fluxes, along with disturbance history and methodological information. ForC expands upon the previously published tropical portion of this database, TropForC (https://doi.org/10.5061/dryad.t516f), now including 17,367 records (previously 3,568) representing 2,731 plots (previously 845) in 826 geographically distinct areas. The database covers all forested biogeographic and climate zones, represents forest stands of all ages, and currently includes data collected between 1934 and 2015. We expect that ForC will prove useful for macroecological analyses of forest C cycling, for evaluation of model predictions or remote sensing products, for quantifying the contribution of forests to the global C cycle, and for supporting international efforts to inventory forest carbon and greenhouse gas exchange. A dynamic version of ForC is maintained on GitHub (https://GitHub.com/forc-db), and we encourage the research community to collaborate in updating, correcting, expanding, and utilizing this database. ForC is an open access database, and we encourage use of the data for scientific research and education purposes. Data may not be used for commercial purposes without written permission of the database PI. Any publications using ForC data should cite this publication and Anderson-Teixeira et al. (2016a) (see Metadata S1). No other copyright or cost restrictions are associated with the use of this data set. © 2018 by the Ecological Society of America.

  15. The CompTox Chemistry Dashboard - A Community Data Resource for Environmental Chemistry

    EPA Science Inventory

    Despite an abundance of online databases providing access to chemical data, there is increasing demand for high-quality, structure-curated, open data to meet the various needs of the environmental sciences and computational toxicology communities. The U.S. Environmental Protectio...

  16. 77 FR 21808 - Privacy Act of 1974; System of Records

    Federal Register 2010, 2011, 2012, 2013, 2014

    2012-04-11

    ... and open source records and commercial database. EXEMPTIONS CLAIMED FOR THE SYSTEM: The Attorney... notification procedures, the record access procedures, the contesting record procedures, the record source..., confidential sources, and victims of crimes. The offenses and alleged offenses associated with the individuals...

  17. Publishing SNP genotypes of human embryonic stem cell lines: policy statement of the International Stem Cell Forum Ethics Working Party.

    PubMed

    Knoppers, Bartha M; Isasi, Rosario; Benvenisty, Nissim; Kim, Ock-Joo; Lomax, Geoffrey; Morris, Clive; Murray, Thomas H; Lee, Eng Hin; Perry, Margery; Richardson, Genevra; Sipp, Douglas; Tanner, Klaus; Wahlström, Jan; de Wert, Guido; Zeng, Fanyi

    2011-09-01

    Novel methods and associated tools permitting individual identification in publicly accessible SNP databases have become a debatable issue. There is growing concern that current technical and ethical safeguards to protect the identities of donors could be insufficient. In the context of human embryonic stem cell research, there are no studies focusing on the probability that an hESC line donor could be identified by analyzing published SNP profiles and associated genotypic and phenotypic information. We present the International Stem Cell Forum (ISCF) Ethics Working Party's Policy Statement on "Publishing SNP Genotypes of Human Embryonic Stem Cell Lines (hESC)". The Statement prospectively addresses issues surrounding the publication of genotypic data and associated annotations of hESC lines in open access databases. It proposes a balanced approach between the goals of open science and data sharing with the respect for fundamental bioethical principles (autonomy, privacy, beneficence, justice and research merit and integrity).

  18. Database Organisation in a Web-Enabled Free and Open-Source Software (foss) Environment for Spatio-Temporal Landslide Modelling

    NASA Astrophysics Data System (ADS)

    Das, I.; Oberai, K.; Sarathi Roy, P.

    2012-07-01

    Landslides exhibit themselves in different mass movement processes and are considered among the most complex natural hazards occurring on the earth surface. Making landslide databases available online via the WWW (World Wide Web) promotes the spreading and reach of landslide information to all stakeholders. The aim of this research is to present a comprehensive database for generating landslide hazard scenarios with the help of available historic records of landslides and geo-environmental factors, and to make them available over the Web using geospatial Free & Open Source Software (FOSS). FOSS reduces the cost of the project drastically, as proprietary software is very costly. Landslide data generated for the period 1982 to 2009 were compiled along the national highway road corridor in the Indian Himalayas. All the geo-environmental datasets along with the landslide susceptibility map were served through a WebGIS client interface. The open source University of Minnesota (UMN) MapServer was used as GIS server software for developing the web-enabled landslide geospatial database. A PHP/Mapscript server-side application serves as the front-end and PostgreSQL with the PostGIS extension serves as the backend for the web-enabled landslide spatio-temporal databases. This dynamic virtual visualization process through a web platform brings an insight into the understanding of landslides and the resulting damage closer to the affected people and user community. The landslide susceptibility dataset is also made available as an Open Geospatial Consortium (OGC) Web Feature Service (WFS) which can be accessed through any OGC-compliant open source or proprietary GIS software.

  19. Resolving the problem of multiple accessions of the same transcript deposited across various public databases.

    PubMed

    Weirick, Tyler; John, David; Uchida, Shizuka

    2017-03-01

    Maintaining the consistency of genomic annotations is an increasingly complex task because of the iterative and dynamic nature of assembly and annotation, growing numbers of biological databases and insufficient integration of annotations across databases. As information exchange among databases is poor, a 'novel' sequence from one reference annotation may already be annotated in another. Furthermore, relationships to nearby or overlapping annotated transcripts are even more complicated when using different genome assemblies. To better understand these problems, we surveyed current and previous versions of genomic assemblies and annotations across a number of public databases containing long noncoding RNA. We identified numerous discrepancies of transcripts regarding their genomic locations, transcript lengths and identifiers. Further investigation showed that the positional differences between reference annotations of essentially the same transcript could lead to differences in its measured expression at the RNA level. To aid in resolving these problems, we present the algorithm 'Universal Genomic Accession Hash (UGAHash)' and created an open source web tool to encourage the usage of the UGAHash algorithm. The UGAHash web tool (http://ugahash.uni-frankfurt.de) can be accessed freely without registration. The web tool allows researchers to generate Universal Genomic Accessions for genomic features or to explore annotations deposited in the public databases of the past and present versions. We anticipate that the UGAHash web tool will be a valuable tool to check for the existence of transcripts before declaring newly discovered transcripts novel. © The Author 2016. Published by Oxford University Press. For Permissions, please email: journals.permissions@oup.com.
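    The core idea behind a coordinate-derived accession can be sketched in a few lines. This is only an illustration of the general approach, not the published UGAHash specification: it derives a stable identifier from a transcript's genomic footprint (assembly, chromosome, span, strand) rather than from any database-specific ID, so the same feature always hashes to the same accession.

```python
import hashlib

def coordinate_accession(chrom, start, end, strand, assembly):
    """Hash a transcript's genomic footprint into a fixed-length accession.

    Illustrative sketch only; the real UGAHash algorithm has its own
    specification. The 'UGA-' prefix and 16-hex-digit length are assumptions.
    """
    key = f"{assembly}:{chrom}:{start}-{end}:{strand}".encode()
    digest = hashlib.sha256(key).hexdigest()[:16]
    return f"UGA-{digest}"

# Identical coordinates always yield the identical accession,
# regardless of which database the transcript came from.
acc1 = coordinate_accession("chr7", 5527151, 5530601, "-", "GRCh38")
acc2 = coordinate_accession("chr7", 5527151, 5530601, "-", "GRCh38")
```

    Because the accession is a pure function of the coordinates, two databases that annotate the same locus independently will agree on the identifier without exchanging any data.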

  20. Learning Deep Representations for Ground to Aerial Geolocalization (Open Access)

    DTIC Science & Technology

    2015-10-15

    The proposed approach, Where-CNN, is inspired by deep learning success in face verification and achieves significant improvements over traditional hand-crafted features and existing deep features learned from other large-scale databases. We show the effectiveness of Where-CNN in finding matches…

  1. Improving Land Cover Mapping: a Mobile Application Based on ESA Sentinel 2 Imagery

    NASA Astrophysics Data System (ADS)

    Melis, M. T.; Dessì, F.; Loddo, P.; La Mantia, C.; Da Pelo, S.; Deflorio, A. M.; Ghiglieri, G.; Hailu, B. T.; Kalegele, K.; Mwasi, B. N.

    2018-04-01

    The increasing availability of satellite data is of real value for the enhancement of environmental knowledge and land management. Possibilities to integrate different sources of geo-data are growing, and methodologies to create thematic databases are becoming very sophisticated. Moreover, access to internet services, and in particular to web mapping services, is well developed and widespread among both expert users and citizens. Web map services such as Google Maps or OpenStreetMap give access to updated optical imagery or topographic maps, but information on land cover/use is still not provided. Therefore, non-specialized users face many shortcomings in the general use of, and access to, those maps. This issue is particularly felt where digital (web) maps could form the basis for land use management, as they are more economical and accessible than paper maps. These conditions are well known in many African countries where, while internet access is becoming open to all, the local map agencies and their products are not widespread.

  2. NCBI GEO: mining millions of expression profiles--database and tools.

    PubMed

    Barrett, Tanya; Suzek, Tugba O; Troup, Dennis B; Wilhite, Stephen E; Ngau, Wing-Chi; Ledoux, Pierre; Rudnev, Dmitry; Lash, Alex E; Fujibuchi, Wataru; Edgar, Ron

    2005-01-01

    The Gene Expression Omnibus (GEO) at the National Center for Biotechnology Information (NCBI) is the largest fully public repository for high-throughput molecular abundance data, primarily gene expression data. The database has a flexible and open design that allows the submission, storage and retrieval of many data types. These data include microarray-based experiments measuring the abundance of mRNA, genomic DNA and protein molecules, as well as non-array-based technologies such as serial analysis of gene expression (SAGE) and mass spectrometry proteomic technology. GEO currently holds over 30,000 submissions representing approximately half a billion individual molecular abundance measurements, for over 100 organisms. Here, we describe recent database developments that facilitate effective mining and visualization of these data. Features are provided to examine data from both experiment- and gene-centric perspectives using user-friendly Web-based interfaces accessible to those without computational or microarray-related analytical expertise. The GEO database is publicly accessible through the World Wide Web at http://www.ncbi.nlm.nih.gov/geo.

  3. BioMart Central Portal: an open database network for the biological community.

    PubMed

    Guberman, Jonathan M; Ai, J; Arnaiz, O; Baran, Joachim; Blake, Andrew; Baldock, Richard; Chelala, Claude; Croft, David; Cros, Anthony; Cutts, Rosalind J; Di Génova, A; Forbes, Simon; Fujisawa, T; Gadaleta, E; Goodstein, D M; Gundem, Gunes; Haggarty, Bernard; Haider, Syed; Hall, Matthew; Harris, Todd; Haw, Robin; Hu, S; Hubbard, Simon; Hsu, Jack; Iyer, Vivek; Jones, Philip; Katayama, Toshiaki; Kinsella, R; Kong, Lei; Lawson, Daniel; Liang, Yong; Lopez-Bigas, Nuria; Luo, J; Lush, Michael; Mason, Jeremy; Moreews, Francois; Ndegwa, Nelson; Oakley, Darren; Perez-Llamas, Christian; Primig, Michael; Rivkin, Elena; Rosanoff, S; Shepherd, Rebecca; Simon, Reinhard; Skarnes, B; Smedley, Damian; Sperling, Linda; Spooner, William; Stevenson, Peter; Stone, Kevin; Teague, J; Wang, Jun; Wang, Jianxin; Whitty, Brett; Wong, D T; Wong-Erasmus, Marie; Yao, L; Youens-Clark, Ken; Yung, Christina; Zhang, Junjun; Kasprzyk, Arek

    2011-01-01

    BioMart Central Portal is a first of its kind, community-driven effort to provide unified access to dozens of biological databases spanning genomics, proteomics, model organisms, cancer data, ontology information and more. Anybody can contribute an independently maintained resource to the Central Portal, allowing it to be exposed to and shared with the research community, and linking it with the other resources in the portal. Users can take advantage of the common interface to quickly utilize different sources without learning a new system for each. The system also simplifies cross-database searches that might otherwise require several complicated steps. Several integrated tools streamline common tasks, such as converting between ID formats and retrieving sequences. The combination of a wide variety of databases, an easy-to-use interface, robust programmatic access and the array of tools make Central Portal a one-stop shop for biological data querying. Here, we describe the structure of Central Portal and show example queries to demonstrate its capabilities.

  4. The surge of predatory open-access in neurosciences and neurology.

    PubMed

    Manca, Andrea; Martinez, Gianluca; Cugusi, Lucia; Dragone, Daniele; Dvir, Zeevi; Deriu, Franca

    2017-06-14

    Predatory open access is a controversial publishing business model that exploits the open-access system by charging publication fees in the absence of transparent editorial services. The credibility of academic publishing is now seriously threatened by predatory journals, whose articles are accorded real citations and thus contaminate the genuine scientific records of legitimate journals. This is of particular concern for public health since clinical practice relies on the findings generated by scholarly articles. The aim of this study was to compile a list of predatory journals targeting the neurosciences and neurology disciplines and to analyze the magnitude and geographical distribution of the phenomenon in these fields. Eighty-seven predatory journals operate in neurosciences and 101 in neurology, for a total of 2404 and 3134 articles issued, respectively. Publication fees range from 521 to 637 USD, much less than those charged by genuine open-access journals. The country of origin of 26.0-37.0% of the publishers was impossible to determine due to poor websites or provision of vague or non-credible locations. Of the rest, 35.3-42.0% reported their headquarters in the USA, 19.0-39.2% in India, and 3.0-9.8% in other countries. Although calling themselves "open-access", none of the journals retrieved was listed in the Directory of Open Access Journals. However, 14.9-24.7% of them were found to be indexed in PubMed and PubMed Central, which raises concerns about the criteria for inclusion of journals and publishers imposed by these popular databases. Scholars in the neurosciences are advised to use all the available tools to recognize predatory practices and avoid the downsides of predatory journals. Copyright © 2017 IBRO. Published by Elsevier Ltd. All rights reserved.

  5. The DNA Data Bank of Japan launches a new resource, the DDBJ Omics Archive of functional genomics experiments.

    PubMed

    Kodama, Yuichi; Mashima, Jun; Kaminuma, Eli; Gojobori, Takashi; Ogasawara, Osamu; Takagi, Toshihisa; Okubo, Kousaku; Nakamura, Yasukazu

    2012-01-01

    The DNA Data Bank of Japan (DDBJ; http://www.ddbj.nig.ac.jp) maintains and provides archival, retrieval and analytical resources for biological information. The central DDBJ resource consists of public, open-access nucleotide sequence databases including raw sequence reads, assembly information and functional annotation. Database content is exchanged with EBI and NCBI within the framework of the International Nucleotide Sequence Database Collaboration (INSDC). In 2011, DDBJ launched two new resources: the 'DDBJ Omics Archive' (DOR; http://trace.ddbj.nig.ac.jp/dor) and BioProject (http://trace.ddbj.nig.ac.jp/bioproject). DOR is an archival database of functional genomics data generated by microarray and highly parallel new generation sequencers. Data are exchanged between the ArrayExpress at EBI and DOR in the common MAGE-TAB format. BioProject provides an organizational framework to access metadata about research projects and the data from the projects that are deposited into different databases. In this article, we describe major changes and improvements introduced to the DDBJ services, and the launch of two new resources: DOR and BioProject.

  6. Using the STOQS Web Application for Access to in situ Oceanographic Data

    NASA Astrophysics Data System (ADS)

    McCann, M. P.

    2012-12-01

    Mike McCann, 7 August 2012. With increasing measurement and sampling capabilities of autonomous oceanographic platforms (e.g. gliders, autonomous underwater vehicles, Wavegliders), the need to efficiently access and visualize the data they collect is growing. The Monterey Bay Aquarium Research Institute has designed and built the Spatial Temporal Oceanographic Query System (STOQS) specifically to address this issue. The need for STOQS arises from inefficiencies discovered from using CF-NetCDF point observation conventions for these data. The problem is that access efficiency decreases with decreasing dimension of CF-NetCDF data. For example, the Trajectory Common Data Model feature type has only one coordinate dimension, usually Time; positions along the trajectory (Depth, Latitude, Longitude) are stored as non-indexed record variables within the NetCDF file. If client software needs to access data between two depth values or from a bounded geographic area, then the whole data set must be read and the selection made within the client software. This is very inefficient. What is needed is a way to easily select data of interest from an archive given any number of spatial, temporal, or other constraints. Geospatial relational database technology provides this capability. The full STOQS application consists of a Postgres/PostGIS database, Mapserver, and Python-Django running on a server and Web 2.0 technology (jQuery, OpenLayers, Twitter Bootstrap) running in a modern web browser. The web application provides faceted search capabilities allowing a user to quickly drill into the data of interest. Data selection can be constrained by spatial, temporal, and depth selections as well as by parameter value and platform name. 
The web application layer also provides a REST (Representational State Transfer) Application Programming Interface, allowing tools such as the Matlab stoqstoolbox to retrieve data directly from the database. STOQS is an open source software project built upon a framework of free and open source software and is available for anyone to use to make their data more accessible and usable. For more information please see: http://code.google.com/p/stoqs/. In the accompanying screen grab, a user has selected the "mass_concentration_of_chlorophyll_in_sea_water" parameter and a time-depth range that includes three weeks of AUV missions in just the upper 5 meters.
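    A client like the Matlab stoqstoolbox retrieves data by composing constrained URLs against that REST interface. The sketch below shows the general pattern in Python; the deployment URL is hypothetical and the double-underscore field names follow the Django-filter convention STOQS is built on, but are assumptions here rather than the documented query vocabulary.

```python
from urllib.parse import urlencode

def stoqs_query_url(base, parameter, min_depth, max_depth, start, end):
    """Build a URL selecting measurements of one parameter within a
    depth range and time window, REST-style.

    Field names below are assumed Django-filter-style lookups, not the
    verified STOQS API; consult the project documentation before use.
    """
    params = {
        "parameter__name": parameter,
        "depth__gte": min_depth,
        "depth__lte": max_depth,
        "measurement__instantpoint__timevalue__gte": start,
        "measurement__instantpoint__timevalue__lte": end,
    }
    return f"{base}/api/measuredparameter.json?{urlencode(params)}"

# Hypothetical campaign URL; mirrors the chlorophyll/upper-5-m selection
# described in the abstract.
url = stoqs_query_url(
    "http://example.org/stoqs_campaign",
    "mass_concentration_of_chlorophyll_in_sea_water",
    0, 5, "2012-07-01 00:00:00", "2012-07-21 00:00:00",
)
```

    The server evaluates the constraints inside PostGIS, so only the matching records cross the wire, which is exactly the inefficiency of whole-file CF-NetCDF reads that STOQS was built to avoid.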

  7. National security and national competitiveness: Open source solutions; NASA requirements and capabilities

    NASA Technical Reports Server (NTRS)

    Cotter, Gladys A.

    1993-01-01

    Foreign competitors are challenging the world leadership of the U.S. aerospace industry, and increasingly tight budgets everywhere make international cooperation in aerospace science necessary. The NASA STI Program has as part of its mission to support NASA R&D, and to that end has developed a knowledge base of aerospace-related information known as the NASA Aerospace Database. The NASA STI Program is already involved in international cooperation with NATO/AGARD/TIP, CENDI, ICSU/ICSTI, and the U.S. Japan Committee on STI. With the new more open political climate, the perceived dearth of foreign information in the NASA Aerospace Database, and the development of the ESA database and DELURA, the German databases, the NASA STI Program is responding by sponsoring workshops on foreign acquisitions and by increasing its cooperation with international partners and with other U.S. agencies. The STI Program looks to the future of improved database access through networking and a GUI; new media; optical disk, video, and full text; and a Technology Focus Group that will keep the NASA STI Program current with technology.

  8. Statistical Learning in Specific Language Impairment: A Meta-Analysis

    ERIC Educational Resources Information Center

    Lammertink, Imme; Boersma, Paul; Wijnen, Frank; Rispens, Judith

    2017-01-01

    Purpose: The current meta-analysis provides a quantitative overview of published and unpublished studies on statistical learning in the auditory verbal domain in people with and without specific language impairment (SLI). The database used for the meta-analysis is accessible online and open to updates (Community-Augmented Meta-Analysis), which…

  9. Cellular Consequences of Telomere Shortening in Histologically Normal Breast Tissues

    DTIC Science & Technology

    2013-09-01

    using the open source, Java-based image analysis software package ImageJ (http://rsb.info.nih.gov/ij/) and a custom-designed plugin ("Telometer…"). Tabulated data were stored in a MySQL (http://www.mysql.com) database and viewed through Microsoft Access (Microsoft Corp.). Statistical Analysis: For…

  10. ORFer--retrieval of protein sequences and open reading frames from GenBank and storage into relational databases or text files.

    PubMed

    Büssow, Konrad; Hoffmann, Steve; Sievert, Volker

    2002-12-19

    Functional genomics involves the parallel experimentation with large sets of proteins. This requires management of large sets of open reading frames as a prerequisite of the cloning and recombinant expression of these proteins. A Java program was developed for retrieval of protein and nucleic acid sequences and annotations from NCBI GenBank, using the XML sequence format. Annotations retrieved by ORFer include sequence name, organism and also the completeness of the sequence. The program has a graphical user interface, although it can be used in a non-interactive mode. For protein sequences, the program also extracts the open reading frame sequence, if available, and checks its correct translation. ORFer accepts user input in the form of single or lists of GenBank GI identifiers or accession numbers. It can be used to extract complete sets of open reading frames and protein sequences from any kind of GenBank sequence entry, including complete genomes or chromosomes. Sequences are either stored with their features in a relational database or can be exported as text files in FASTA or tab-delimited format. The ORFer program is freely available at http://www.proteinstrukturfabrik.de/orfer. The ORFer program allows for fast retrieval of DNA sequences, protein sequences and their open reading frames and sequence annotations from GenBank. Furthermore, storage of sequences and features in a relational database is supported. Such a database can supplement a laboratory information management system (LIMS) with appropriate sequence information.
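    The kind of open-reading-frame check described above can be illustrated with a minimal forward-strand ORF scan. This is a generic sketch of the technique, not ORFer's actual Java code: it walks each reading frame, starts an ORF at ATG, and extends codon by codon until a stop codon.

```python
STOPS = {"TAA", "TAG", "TGA"}

def find_orfs(seq, min_codons=10):
    """Return (start, end) spans of ATG-to-stop ORFs on the forward strand.

    A deliberately simple sketch: no reverse strand, no nested-ORF handling.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):          # scan all three forward reading frames
        i = frame
        while i + 3 <= len(seq):
            if seq[i:i + 3] == "ATG":           # potential start codon
                j = i + 3
                while j + 3 <= len(seq) and seq[j:j + 3] not in STOPS:
                    j += 3                      # extend codon by codon
                if j + 3 <= len(seq) and (j + 3 - i) // 3 >= min_codons:
                    orfs.append((i, j + 3))     # include the stop codon
                    i = j + 3                   # resume after this ORF
                    continue
            i += 3
    return orfs
```

    A tool like ORFer additionally verifies that the extracted ORF translates to the protein sequence annotated in the GenBank record, catching frame-shift and annotation errors.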

  11. Milliarcsecond Astronomy with the CHARA Array

    NASA Astrophysics Data System (ADS)

    Schaefer, Gail; ten Brummelaar, Theo; Gies, Douglas; Jones, Jeremy; Farrington, Christopher

    2018-01-01

    The Center for High Angular Resolution Astronomy offers 50 nights per year of open access time at the CHARA Array. The Array consists of six telescopes linked together as an interferometer, providing sub-milliarcsecond resolution in the optical and near-infrared. The Array enables a variety of scientific studies, including measuring stellar angular diameters, imaging stellar shapes and surface features, mapping the orbits of close binary companions, and resolving circumstellar environments. The open access time is part of an NSF/MSIP funded program to open the CHARA Array to the broader astronomical community. As part of the program, we will build a searchable database for the CHARA data archive and run a series of one-day community workshops at different locations across the country to expand the user base for stellar interferometry and encourage new scientific investigations with the CHARA Array.

  12. The YeastGenome app: the Saccharomyces Genome Database at your fingertips.

    PubMed

    Wong, Edith D; Karra, Kalpana; Hitz, Benjamin C; Hong, Eurie L; Cherry, J Michael

    2013-01-01

    The Saccharomyces Genome Database (SGD) is a scientific database that provides researchers with high-quality curated data about the genes and gene products of Saccharomyces cerevisiae. To provide instant and easy access to this information on mobile devices, we have developed YeastGenome, a native application for the Apple iPhone and iPad. YeastGenome can be used to quickly find basic information about S. cerevisiae genes and chromosomal features regardless of internet connectivity. With or without network access, you can view basic information and Gene Ontology annotations about a gene of interest by searching gene names and gene descriptions or by browsing the database within the app to find the gene of interest. With internet access, the app provides more detailed information about the gene, including mutant phenotypes, references and protein and genetic interactions, as well as provides hyperlinks to retrieve detailed information by showing SGD pages and views of the genome browser. SGD provides online help describing basic ways to navigate the mobile version of SGD, highlights key features and answers frequently asked questions related to the app. The app is available from iTunes (http://itunes.com/apps/yeastgenome). The YeastGenome app is provided freely as a service to our community, as part of SGD's mission to provide free and open access to all its data and annotations.

  13. Reusable Software and Open Data Incorporate Ecological Understanding To Optimize Agriculture and Improve Crops.

    NASA Astrophysics Data System (ADS)

    LeBauer, D.

    2015-12-01

    Humans need a secure and sustainable food supply, and science can help. We have an opportunity to transform agriculture by combining knowledge of organisms and ecosystems to engineer ecosystems that sustainably produce food, fuel, and other services. The challenge is that the information we have is fragmented: measurements, theories, and laws found in publications, notebooks, software, and human brains are difficult to combine. We homogenize, encode, and automate the synthesis of data and mechanistic understanding in a way that links understanding at different scales and across domains. This allows extrapolation, prediction, and assessment. Reusable components allow automated construction of new knowledge that can be used to assess, predict, and optimize agro-ecosystems. Developing reusable software and open-access databases is hard, and examples will illustrate how we use the Predictive Ecosystem Analyzer (PEcAn, pecanproject.org), the Biofuel Ecophysiological Traits and Yields database (BETYdb, betydb.org), and ecophysiological crop models to predict crop yield, decide which crops to plant, and which traits can be selected for the next generation of data-driven crop improvement. A next step is to automate the use of sensors mounted on robots, drones, and tractors to assess plants in the field. The TERRA Reference Phenotyping Platform (TERRA-Ref, terraref.github.io) will provide an open access database and computing platform on which researchers can use and develop tools that use sensor data to assess and manage agricultural and other terrestrial ecosystems. TERRA-Ref will adopt existing standards and develop modular software components and common interfaces, in collaboration with researchers from iPlant, NEON, AgMIP, USDA, rOpenSci, ARPA-E, many scientists and industry partners. Our goal is to advance science by enabling efficient use, reuse, exchange, and creation of knowledge.

  14. [Predatory journals: how their publishers operate and how to avoid them].

    PubMed

    Kratochvíl, Jiří; Plch, Lukáš

    Authors who publish in scientific or scholarly journals today face the risk of publishing in so-called predatory journals. These journals exploit the noble idea of the Open Access movement, whose goal is to make the latest scientific findings available for free. Predatory journals, unlike the reputable ones working on an Open Access basis, neglect the review process and publish low-quality submissions. The basic attributes of predatory journals are a very quick review process or even none at all, failure to be transparent about author fees for publishing an article, misleading potential authors by imitating the names of well-established journals, and false information on indexing in renowned databases or assigned impact factor. Some preventive measures against publishing in predatory journals or drawing information from them are: a thorough credibility check of the journal's webpages, and verification of the journal's indexing on Beall's List and in the following databases: Web of Science Core Collection, Scopus, ERIH PLUS and DOAJ. Asking other scientists or scholars about their experience with a given journal can also be helpful. Without these necessary steps authors face an increased risk of publishing in a journal of poor quality, which will prevent them from obtaining Research and Development Council points (awarded based on the Information Register of Research & Development results); even more importantly, it may damage their reputation as well as the good name of their home institution in the professional community. Key words: academic writing - medical journals - Open Access - predatory journals - predatory publishers - scientific publications.

  15. PubChem BioAssay: A Decade's Development toward Open High-Throughput Screening Data Sharing.

    PubMed

    Wang, Yanli; Cheng, Tiejun; Bryant, Stephen H

    2017-07-01

    High-throughput screening (HTS) is now routinely conducted for drug discovery by both pharmaceutical companies and screening centers at academic institutions and universities. Rapid advance in assay development, robot automation, and computer technology has led to the generation of terabytes of data in screening laboratories. Despite the technology development toward HTS productivity, fewer efforts were devoted to HTS data integration and sharing. As a result, the huge amount of HTS data was rarely made available to the public. To fill this gap, the PubChem BioAssay database ( https://www.ncbi.nlm.nih.gov/pcassay/ ) was set up in 2004 to provide open access to the screening results tested on chemicals and RNAi reagents. With more than 10 years' development and contributions from the community, PubChem has now become the largest public repository for chemical structures and biological data, which provides an information platform to worldwide researchers supporting drug development, medicinal chemistry study, and chemical biology research. This work presents a review of the HTS data content in the PubChem BioAssay database and the progress of data deposition to stimulate knowledge discovery and data sharing. It also provides a description of the database's data standard and basic utilities facilitating information access and use for new users.

  16. Revitalizing the drug pipeline: AntibioticDB, an open access database to aid antibacterial research and development.

    PubMed

    Farrell, L J; Lo, R; Wanford, J J; Jenkins, A; Maxwell, A; Piddock, L J V

    2018-06-11

    The current state of antibiotic discovery, research and development is insufficient to respond to the need for new treatments for drug-resistant bacterial infections. The process has changed over the last decade, with most new agents that are in Phases 1-3, or recently approved, having been discovered in small- and medium-sized enterprises or academia. These agents have then been licensed or sold to large companies for further development with the goal of taking them to market. However, early drug discovery and development, including the possibility of developing previously discontinued agents, would benefit from a database of antibacterial compounds for scrutiny by the developers. This article describes the first free, open-access searchable database of antibacterial compounds, including discontinued agents, drugs under pre-clinical development and those in clinical trials: AntibioticDB (AntibioticDB.com). Data were obtained from publicly available sources. This article summarizes the compounds and drugs in AntibioticDB, including their drug class, mode of action, development status and propensity to select drug-resistant bacteria. AntibioticDB includes compounds currently in pre-clinical development and 834 that have been discontinued and that reached varying stages of development. These may serve as starting points for future research and development.

  17. The BioExtract Server: a web-based bioinformatic workflow platform

    PubMed Central

    Lushbough, Carol M.; Jennewein, Douglas M.; Brendel, Volker P.

    2011-01-01

    The BioExtract Server (bioextract.org) is an open, web-based system designed to aid researchers in the analysis of genomic data by providing a platform for the creation of bioinformatic workflows. Scientific workflows are created within the system by recording tasks performed by the user. These tasks may include querying multiple, distributed data sources, saving query results as searchable data extracts, and executing local and web-accessible analytic tools. The series of recorded tasks can then be saved as a reproducible, sharable workflow available for subsequent execution with the original or modified inputs and parameter settings. Integrated data resources include interfaces to the National Center for Biotechnology Information (NCBI) nucleotide and protein databases, the European Molecular Biology Laboratory (EMBL-Bank) non-redundant nucleotide database, the Universal Protein Resource (UniProt), and the UniProt Reference Clusters (UniRef) database. The system offers access to numerous preinstalled, curated analytic tools and also provides researchers with the option of selecting computational tools from a large list of web services including the European Molecular Biology Open Software Suite (EMBOSS), BioMoby, and the Kyoto Encyclopedia of Genes and Genomes (KEGG). The system further allows users to integrate local command line tools residing on their own computers through a client-side Java applet. PMID:21546552

  18. ASGARD: an open-access database of annotated transcriptomes for emerging model arthropod species.

    PubMed

    Zeng, Victor; Extavour, Cassandra G

    2012-01-01

    The increased throughput and decreased cost of next-generation sequencing (NGS) have shifted the bottleneck in genomic research from sequencing to annotation, analysis and accessibility. This is particularly challenging for research communities working on organisms that lack the basic infrastructure of a sequenced genome, or an efficient way to utilize whatever sequence data may be available. Here we present a new database, the Assembled Searchable Giant Arthropod Read Database (ASGARD). This database is a repository and search engine for transcriptomic data from arthropods that are of high interest to multiple research communities but currently lack sequenced genomes. We demonstrate the functionality and utility of ASGARD using de novo assembled transcriptomes from the milkweed bug Oncopeltus fasciatus, the cricket Gryllus bimaculatus and the amphipod crustacean Parhyale hawaiensis. We have annotated these transcriptomes to assign putative orthology, coding region determination, protein domain identification and Gene Ontology (GO) term annotation to all possible assembly products. ASGARD allows users to search all assemblies by orthology annotation, GO term annotation or Basic Local Alignment Search Tool. User-friendly features of ASGARD include search term auto-completion suggestions based on database content, the ability to download assembly product sequences in FASTA format, direct links to NCBI data for predicted orthologs and graphical representation of the location of protein domains and matches to similar sequences from the NCBI non-redundant database. ASGARD will be a useful repository for transcriptome data from future NGS studies on these and other emerging model arthropods, regardless of sequencing platform, assembly or annotation status. This database thus provides easy, one-stop access to multi-species annotated transcriptome information. 
We anticipate that this database will be useful for members of multiple research communities, including developmental biology, physiology, evolutionary biology, ecology, comparative genomics and phylogenomics. Database URL: asgard.rc.fas.harvard.edu.

  19. The Colima Volcano WebGIS: system acquisition, application and database development in an open-source environment

    NASA Astrophysics Data System (ADS)

    Manea, M.; Norini, G.; Capra, L.; Manea, V. C.

    2009-04-01

    The Colima Volcano is currently the most active Mexican volcano. After the 1913 Plinian activity, the volcano presented several eruptive phases that lasted a few years, but since 1991 its activity has become more persistent, with Vulcanian eruptions, lava flows and dome extrusions. During the last 15 years the volcano underwent several eruptive episodes, as in 1991, 1994, 1998-1999, 2001-2003, 2004 and 2005, with the emplacement of pyroclastic flows. During rainy seasons lahars are frequent, affecting infrastructure such as bridges and electric towers. Researchers from different institutions (Mexico, USA, Germany, Italy, and Spain) are currently working on several aspects of the volcano, from remote sensing, field data on old and recent deposits, structural framework, and monitoring (rain, seismicity, deformation and visual observations) to laboratory experiments (analogue models and numerical simulations). Each investigation is focused on explaining a single process, but it is fundamental to visualize the global status of the volcano in order to understand its behavior and to mitigate future hazards. The Colima Volcano WebGIS represents an initiative aimed at collecting and storing on a systematic basis all the data obtained so far for the volcano, and at continuously updating the database with new information. The Colima Volcano WebGIS is hosted on the Computational Geodynamics Laboratory web server and is based entirely on Open Source software. The web pages, written in PHP/HTML, extract information from a MySQL relational database, which hosts the information needed for the MapBender application. 
Two types of users are intended: 1) researchers working on the Colima Volcano who are interested in this initiative and collaborating on common projects will be given open access to the database and will be able to contribute their own data, results, interpretations or recommendations; 2) general users interested in information on the Colima Volcano will be given restricted access and will be able to view maps, images, diagrams, and current activity. The website can be visited at: http://www.geociencias.unam.mx/colima
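As a rough illustration of the relational storage the abstract describes, the sketch below uses Python's built-in sqlite3 in place of MySQL; the `observations` table, its columns, and the sample rows are invented for this example and are not taken from the actual WebGIS schema.

```python
import sqlite3

# Illustrative stand-in for the WebGIS relational database: a single
# observations table holding monitoring records from different groups.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE observations (
        id INTEGER PRIMARY KEY,
        category TEXT,      -- e.g. 'seismicity', 'deformation', 'rainfall'
        observed_on TEXT,   -- ISO date
        value REAL,
        contributor TEXT
    )
""")
rows = [
    ("seismicity", "2005-03-10", 42.0, "group A"),
    ("rainfall", "2005-06-01", 118.5, "group A"),
    ("seismicity", "2005-03-11", 37.0, "group B"),
]
conn.executemany(
    "INSERT INTO observations (category, observed_on, value, contributor) "
    "VALUES (?, ?, ?, ?)", rows)

# Typical map-layer query: all seismicity records, newest first.
cur = conn.execute(
    "SELECT observed_on, value FROM observations "
    "WHERE category = ? ORDER BY observed_on DESC", ("seismicity",))
seismic = cur.fetchall()
print(seismic)  # -> [('2005-03-11', 37.0), ('2005-03-10', 42.0)]
```

In a PHP/MySQL deployment such as the one described, the same query pattern would feed each thematic map layer served to MapBender.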

  20. Traditional Chinese Medical Journals currently published in mainland China.

    PubMed

    Fan, Wei-Yu; Tong, Yuan-Yuan; Pan, Yan-Li; Shang, Wen-Ling; Shen, Jia-Yi; Li, Wei; Li, Li-Jun

    2008-06-01

Traditional Chinese Medical (TCM) journals have long played an important role in scholarly communication in China. However, information about these periodicals has not been readily available to international readers. This study aims to provide an overview of TCM journals in China. TCM journals currently published in mainland China were identified from Chinese databases and journal subscription catalogs. Data on publication start year, publishing region, language, core-journal status, indexing in major international databases, and availability of an accessible URL were collected, and the journals were categorized by subject. One hundred and forty-nine (149) TCM journals are currently published in mainland China; 88.59% of them are academic journals. Their subjects are varied, ranging from general TCM, integrative medicine and herbal medicines to veterinary TCM. Publication is distributed across 27 regions, with Beijing publishing the most TCM journals. One hundred and forty-two (142) of these periodicals are in Chinese, while 4 are also published in English and 3 in other languages. Only 8 TCM journals are recognized as core journals, and 5 are identified as both core journals and journals with high-impact articles by all evaluation systems in China. A few of the TCM journals from mainland China are indexed in PubMed/MEDLINE (10), EMBASE (5), Biological Abstracts (2), or AMED (1). The online full-text Chinese databases CJFD, COJ, and CSTPD cover most of the TCM journals published in the country. One hundred (100) TCM journals have accessible URLs, but only 3 are open access with free full texts. TCM journals have been active in academic communication in China over the past 20 years. However, only a few have received high evaluations, English-language information about them is insufficient, and open access is not yet widely adopted. The accessibility of these journals to international readers needs to be improved.

  1. Public Access to Digital Material; A Call to Researchers: Digital Libraries Need Collaboration across Disciplines; Greenstone: Open-Source Digital Library Software; Retrieval Issues for the Colorado Digitization Project's Heritage Database; Report on the 5th European Conference on Digital Libraries, ECDL 2001; Report on the First Joint Conference on Digital Libraries.

    ERIC Educational Resources Information Center

    Kahle, Brewster; Prelinger, Rick; Jackson, Mary E.; Boyack, Kevin W.; Wylie, Brian N.; Davidson, George S.; Witten, Ian H.; Bainbridge, David; Boddie, Stefan J.; Garrison, William A.; Cunningham, Sally Jo; Borgman, Christine L.; Hessel, Heather

    2001-01-01

    These six articles discuss various issues relating to digital libraries. Highlights include public access to digital materials; intellectual property concerns; the need for collaboration across disciplines; Greenstone software for construction and presentation of digital information collections; the Colorado Digitization Project; and conferences…

  2. JASPAR 2016: a major expansion and update of the open-access database of transcription factor binding profiles

    PubMed Central

    Mathelier, Anthony; Fornes, Oriol; Arenillas, David J.; Chen, Chih-yu; Denay, Grégoire; Lee, Jessica; Shi, Wenqiang; Shyr, Casper; Tan, Ge; Worsley-Hunt, Rebecca; Zhang, Allen W.; Parcy, François; Lenhard, Boris; Sandelin, Albin; Wasserman, Wyeth W.

    2016-01-01

    JASPAR (http://jaspar.genereg.net) is an open-access database storing curated, non-redundant transcription factor (TF) binding profiles representing transcription factor binding preferences as position frequency matrices for multiple species in six taxonomic groups. For this 2016 release, we expanded the JASPAR CORE collection with 494 new TF binding profiles (315 in vertebrates, 11 in nematodes, 3 in insects, 1 in fungi and 164 in plants) and updated 59 profiles (58 in vertebrates and 1 in fungi). The introduced profiles represent an 83% expansion and 10% update when compared to the previous release. We updated the structural annotation of the TF DNA binding domains (DBDs) following a published hierarchical structural classification. In addition, we introduced 130 transcription factor flexible models trained on ChIP-seq data for vertebrates, which capture dinucleotide dependencies within TF binding sites. This new JASPAR release is accompanied by a new web tool to infer JASPAR TF binding profiles recognized by a given TF protein sequence. Moreover, we provide the users with a Ruby module complementing the JASPAR API to ease programmatic access and use of the JASPAR collection of profiles. Finally, we provide the JASPAR2016 R/Bioconductor data package with the data of this release. PMID:26531826

  3. Prospects for research in haemophilia with real-world data-An analysis of German registry and secondary data.

    PubMed

    Schopohl, D; Bidlingmaier, C; Herzig, D; Klamroth, R; Kurnik, K; Rublee, D; Schramm, W; Schwarzkopf, L; Berger, K

    2018-02-28

Open questions in haemophilia, such as the effectiveness of innovative therapies, clinical and patient-reported outcomes (PROs), epidemiology and cost, await answers. The aim was to identify the data attributes required and to investigate the availability, appropriateness and accessibility of real-world data (RWD) from German registries and secondary databases to answer these questions. Systematic searches were conducted in BIOSIS, EMBASE and MEDLINE to identify non-commercial secondary healthcare databases and registries of patients with haemophilia (PWH). Inclusion of German patients, type of patients, data elements (stratified by use in epidemiology, safety, outcomes and health economics research) and accessibility were investigated by desk research. Screening of 676 hits led to the identification of four registries [national PWH (DHR), national/international paediatric (GEPARD, PEDNET), international safety monitoring (EUHASS)] and seven national secondary databases. Access was limited to participants in three registries and to employees in one secondary database. One registry collects PROs. Limitations of the secondary databases stem from the ICD coding system (which lacks the severity of haemophilia and the presence of inhibitory antibodies), data protection laws and the need to monitor reliability. Rigorous observational analysis of German haemophilia RWD shows that there is potential to supplement current knowledge and begin to address selected policy goals. To improve the value of existing RWD, the following efforts are proposed: ethical, legal and methodological discussions on data linkage across different sources, formulation of transparent governance rules for data access, redefinition of the ICD coding, standardized collection of outcome data and implementation of incentives for treatment centres to improve data collection. © 2018 John Wiley & Sons Ltd.

  4. Minimally invasive treatment of ureteropelvic junction obstruction: a critical analysis of results.

    PubMed

    Eden, Christopher G

    2007-10-01

    To analyse the indications and long-term results of endoscopic and minimal access approaches for the treatment of ureteropelvic junction (UPJ) obstruction and to compare them to open surgery. A review of the literature from 1950 to January 2007 was conducted using the Ovid Medline database. A lack of standardisation of techniques used to diagnose UPJ obstruction and to follow up treated patients introduces a degree of inaccuracy in interpreting the success rates of the various modalities of treatment. However, there is no indication that any one of these techniques is affected by this to a greater or lesser extent than another. Open pyeloplasty achieves very good (90-100% success) results, endopyelotomy and balloon disruption of the UPJ fail to match these results by 15-20%, and minimal access pyeloplasty produces results that are at least as good as those of open surgery but with the advantages of a minimal access approach. Minimal access pyeloplasty is likely to gradually replace endopyelotomy and balloon disruption of the UPJ for the treatment of UPJ obstruction. The much higher cost of robotic pyeloplasty and greater availability of laparoscopic expertise in teaching centres are likely to limit the dissemination of robotic pyeloplasty.

  5. What Trends Do Turkish Biology Education Studies Indicate?

    ERIC Educational Resources Information Center

    Topsakal, Unsal Umdu; Calik, Muammer; Cavus, Ragip

    2012-01-01

    The aim of this study is to determine what trends Turkish biology education studies indicate. To achieve this aim, the researchers examined online databases of the Higher Education Council and open access archives of graduate theses in web sites of Turkish universities. Finally, totally 138 graduate theses were elicited to analyze in regard to…

  6. Open access of evidence-based publications: the case of the orthopedic and musculoskeletal literature.

    PubMed

    Yammine, Kaissar

    2015-11-01

The open access model, where researchers can publish their work and make it freely available to the whole medical community, is gaining ground on the traditional type of publication. However, fees must be paid by either the authors or their institutions. The purpose of this paper is to assess the proportion and type of open access evidence-based articles, in the form of systematic reviews and meta-analyses, in the field of musculoskeletal disorders and orthopedic surgery. The PubMed database was searched, and the results showed the greatest numbers of hits for low back pain and total hip arthroplasty. We demonstrated that despite a 10-fold increase in the number of evidence-based publications in the past 10 years, the rate of free systematic reviews in the general biomedical literature did not change over the last two decades. In addition, the average percentage of free open access systematic reviews and meta-analyses for the commonest painful musculoskeletal conditions and orthopedic procedures was 20% and 18%, respectively. These results were significantly lower than those for systematic reviews and meta-analyses in the remaining biomedical research. Such findings could indicate a divergence between the efforts engaged in promoting evidence-based principles and those in disseminating evidence-based findings in the field of musculoskeletal disease and trauma. High processing fees are thought to be a major limitation of the open access model of publication. © 2015 Chinese Cochrane Center, West China Hospital of Sichuan University and Wiley Publishing Asia Pty Ltd.

  7. HomeBank: An Online Repository of Daylong Child-Centered Audio Recordings

    PubMed Central

    VanDam, Mark; Warlaumont, Anne S.; Bergelson, Elika; Cristia, Alejandrina; Soderstrom, Melanie; De Palma, Paul; MacWhinney, Brian

    2017-01-01

    HomeBank is introduced here. It is a public, permanent, extensible, online database of daylong audio recorded in naturalistic environments. HomeBank serves two primary purposes. First, it is a repository for raw audio and associated files: one database requires special permissions, and another redacted database allows unrestricted public access. Associated files include metadata such as participant demographics and clinical diagnostics, automated annotations, and human-generated transcriptions and annotations. Many recordings use the child-perspective LENA recorders (LENA Research Foundation, Boulder, Colorado, United States), but various recordings and metadata can be accommodated. The HomeBank database can have both vetted and unvetted recordings, with different levels of accessibility. Additionally, HomeBank is an open repository for processing and analysis tools for HomeBank or similar data sets. HomeBank is flexible for users and contributors, making primary data available to researchers, especially those in child development, linguistics, and audio engineering. HomeBank facilitates researchers’ access to large-scale data and tools, linking the acoustic, auditory, and linguistic characteristics of children’s environments with a variety of variables including socioeconomic status, family characteristics, language trajectories, and disorders. Automated processing applied to daylong home audio recordings is now becoming widely used in early intervention initiatives, helping parents to provide richer speech input to at-risk children. PMID:27111272

  8. MINEs: open access databases of computationally predicted enzyme promiscuity products for untargeted metabolomics.

    PubMed

    Jeffryes, James G; Colastani, Ricardo L; Elbadawi-Sidhu, Mona; Kind, Tobias; Niehaus, Thomas D; Broadbelt, Linda J; Hanson, Andrew D; Fiehn, Oliver; Tyo, Keith E J; Henry, Christopher S

    2015-01-01

In spite of its great promise, metabolomics has proven difficult to execute in an untargeted and generalizable manner. Liquid chromatography-mass spectrometry (LC-MS) has made it possible to gather data on thousands of cellular metabolites. However, matching metabolites to their spectral features continues to be a bottleneck, meaning that much of the collected information remains uninterpreted and that new metabolites are seldom discovered in untargeted studies. These challenges require new approaches that consider compounds beyond those available in curated biochemistry databases. Here we present Metabolic In silico Network Expansions (MINEs), an extension of known metabolite databases to include molecules that have not been observed, but are likely to occur based on known metabolites and common biochemical reactions. We utilize an algorithm called the Biochemical Network Integrated Computational Explorer (BNICE) and expert-curated reaction rules based on the Enzyme Commission classification system to propose the novel chemical structures and reactions that comprise MINE databases. Starting from the Kyoto Encyclopedia of Genes and Genomes (KEGG) COMPOUND database, the MINE contains over 571,000 compounds, of which 93% are not present in the PubChem database. However, these MINE compounds have on average higher structural similarity to natural products than compounds from KEGG or PubChem. MINE databases were able to propose annotations for 98.6% of a set of 667 MassBank spectra, 14% more than KEGG alone and equivalent to PubChem, while returning far fewer candidates per spectrum than PubChem (46 vs. 1715 median candidates). Application of MINEs to LC-MS accurate mass data enabled the identity of an unknown peak to be confidently predicted. MINE databases are freely accessible for non-commercial use via user-friendly web tools at http://minedatabase.mcs.anl.gov and developer-friendly APIs.
MINEs improve metabolomics peak identification as compared to general chemical databases, whose results include irrelevant synthetic compounds. Furthermore, MINEs complement and expand on previous in silico generated compound databases that focus on human metabolism. We are actively developing the database; future versions of this resource will incorporate transformation rules for spontaneous chemical reactions and more advanced filtering and prioritization of candidate structures. Graphical abstract: MINE database construction and access methods. The process of constructing a MINE database from the curated source databases is depicted on the left; the methods for accessing the database are shown on the right.
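The network-expansion idea behind BNICE can be illustrated with a toy sketch: starting from a set of known metabolites, generalized reaction rules are applied breadth-first to enumerate predicted compounds. The compound labels and the two "rules" below are placeholders, not real BNICE reaction operators.

```python
from collections import deque

# Toy network expansion in the spirit of BNICE: apply generalized
# reaction rules to known metabolites to predict compounds one or
# more steps beyond the curated database. Purely illustrative.

def hydroxylate(compound):
    return compound + "-OH"

def methylate(compound):
    return compound + "-CH3"

RULES = [hydroxylate, methylate]

def expand(seeds, generations=1):
    """Breadth-first expansion of the metabolite set."""
    known = set(seeds)
    frontier = deque(seeds)
    for _ in range(generations):
        next_frontier = deque()
        while frontier:
            compound = frontier.popleft()
            for rule in RULES:
                product = rule(compound)
                if product not in known:   # keep predictions non-redundant
                    known.add(product)
                    next_frontier.append(product)
        frontier = next_frontier
    return known

mine = expand({"A", "B"}, generations=1)
print(sorted(mine))  # -> ['A', 'A-CH3', 'A-OH', 'B', 'B-CH3', 'B-OH']
```

In the real system the "rules" operate on molecular structures rather than labels, which is why the predicted set grows so quickly (over 571,000 compounds from KEGG COMPOUND alone).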

  9. The Open Spectral Database: an open platform for sharing and searching spectral data.

    PubMed

    Chalk, Stuart J

    2016-01-01

A number of websites make spectral data available for download (typically as JCAMP-DX text files), and one (ChemSpider) also allows users to contribute spectral files. Even so, searching and retrieving such spectral data can be time consuming, and the data are difficult to reuse if they are stored compressed inside the JCAMP-DX file. What is needed is a single resource that allows submission of JCAMP-DX files, export of the raw data in multiple formats, and searching based on multiple chemical identifiers, and that is open in terms of license and access. To address these issues a new online resource, the Open Spectral Database (OSDB), http://osdb.info/, has been developed and is now available. Built using open-source tools, using open code (hosted on GitHub), providing open data, and open to community input on design and functionality, the OSDB allows anyone to submit spectral data, making it searchable and available to the scientific community. This paper details the concept and coding, the internal architecture, the export formats, the Representational State Transfer (REST) Application Programming Interface and the options for submitting data. The OSDB website went live in November 2015. Concurrently, the GitHub repository was made available at https://github.com/stuchalk/OSDB/ and is open for collaborators to join the project, submit issues, and contribute code. The combination of a scripting environment (PHPStorm), a PHP framework (CakePHP), a relational database (MySQL) and a code repository (GitHub) provides all the capabilities needed to develop REST-based websites for the ingestion, curation and exposure of open chemical data to the community at all levels. It is hoped this software stack (or equivalent stacks in other scripting languages) will be leveraged to make more chemical data available to both humans and computers.
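As a rough illustration of the kind of data the OSDB ingests, the sketch below parses the simple uncompressed `(XY..XY)` form of a JCAMP-DX file; the sample spectrum is invented, and real JCAMP-DX files often use compressed ASDF encodings that a production parser would also need to handle.

```python
# Minimal sketch of parsing the uncompressed (XY..XY) pair form of a
# JCAMP-DX spectrum. The sample content below is invented for
# illustration; it is not an actual OSDB record.
SAMPLE = """\
##TITLE=Example spectrum
##XUNITS=1/CM
##YUNITS=ABSORBANCE
##XYPOINTS=(XY..XY)
450.0, 0.012; 452.0, 0.015
454.0, 0.021
##END=
"""

def parse_jcamp_pairs(text):
    meta, points, in_data = {}, [], False
    for line in text.splitlines():
        line = line.strip()
        if line.startswith("##"):
            key, _, value = line[2:].partition("=")
            if key == "XYPOINTS":
                in_data = True        # data block follows
            elif key == "END":
                in_data = False
            else:
                meta[key] = value     # labelled-data-record header
        elif in_data and line:
            for pair in line.split(";"):
                x, y = pair.split(",")
                points.append((float(x), float(y)))
    return meta, points

meta, points = parse_jcamp_pairs(SAMPLE)
print(meta["XUNITS"], len(points))  # -> 1/CM 3
```

Once decoded into plain (x, y) pairs like this, the data can be re-exported in friendlier formats, which is one of the gaps the OSDB aims to fill.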

  10. OpenFlyData: an exemplar data web integrating gene expression data on the fruit fly Drosophila melanogaster.

    PubMed

    Miles, Alistair; Zhao, Jun; Klyne, Graham; White-Cooper, Helen; Shotton, David

    2010-10-01

Integrating heterogeneous data across distributed sources is a major requirement for in silico bioinformatics supporting translational research. For example, genome-scale data on patterns of gene expression in the fruit fly Drosophila melanogaster are widely used in functional genomic studies in many organisms to inform candidate gene selection and validate experimental results. However, current data integration solutions tend to be heavyweight, and require significant initial and ongoing investment of effort. Development of a common Web-based data integration infrastructure (a.k.a. data web), using Semantic Web standards, promises to alleviate these difficulties, but little is known about the feasibility, costs, risks or practical means of migrating to such an infrastructure. We describe the development of OpenFlyData, a proof-of-concept system integrating gene expression data on D. melanogaster, combining Semantic Web standards with lightweight approaches to Web programming based on Web 2.0 design patterns. To support researchers designing and validating functional genomic studies, OpenFlyData includes user-facing search applications providing intuitive access to and comparison of gene expression data from FlyAtlas, the BDGP in situ database, and FlyTED, using data from FlyBase to expand and disambiguate gene names. OpenFlyData's services are also openly accessible, and are available for reuse by other bioinformaticians and application developers. Semi-automated methods and tools were developed to support the labour- and knowledge-intensive tasks involved in deploying SPARQL services. These include methods for generating ontologies and relational-to-RDF mappings for relational databases, which we illustrate using the FlyBase Chado database schema, and methods for mapping gene identifiers between databases. The advantages of using Semantic Web standards for biomedical data integration are discussed, as are open issues.
In particular, although the performance of open source SPARQL implementations is sufficient to query gene expression data directly from user-facing applications such as Web-based data fusions (a.k.a. mashups), we found open SPARQL endpoints to be vulnerable to denial-of-service-type problems, which must be mitigated to ensure reliability of services based on this standard. These results are relevant to data integration activities in translational bioinformatics. The gene expression search applications and SPARQL endpoints developed for OpenFlyData are deployed at http://openflydata.org. FlyUI, a library of JavaScript widgets providing re-usable user-interface components for Drosophila gene expression data, is available at http://flyui.googlecode.com. Software and ontologies to support transformation of data from FlyBase, FlyAtlas, BDGP and FlyTED to RDF are available at http://openflydata.googlecode.com. SPARQLite, an implementation of the SPARQL protocol, is available at http://sparqlite.googlecode.com. All software is provided under the GPL version 3 open source license.
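The relational-to-RDF mapping step described above can be sketched simply: each row becomes a subject URI, each non-key column a predicate, and each cell value an object. The namespace, table name, and column names below are invented for illustration and do not reflect the actual FlyBase Chado mapping.

```python
# Toy relational-to-RDF mapping: turn a database row into triples.
# The base URI and the example row are hypothetical placeholders.
BASE = "http://example.org/flydata/"

def row_to_triples(table, pk, row):
    """Map one relational row to a list of (subject, predicate, object)."""
    subject = f"{BASE}{table}/{row[pk]}"
    return [
        (subject, f"{BASE}{table}#{column}", value)
        for column, value in row.items()
        if column != pk               # the key identifies the subject itself
    ]

row = {"gene_id": "FBgn0000490", "symbol": "dpp", "tissue": "testis"}
triples = row_to_triples("gene_expression", "gene_id", row)
for triple in triples:
    print(triple)
```

Triples generated this way are what a SPARQL endpoint such as the ones OpenFlyData deploys would serve; production mappings additionally type the objects and reuse shared ontology terms so that identifiers line up across databases.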

  11. Visualization of Vgi Data Through the New NASA Web World Wind Virtual Globe

    NASA Astrophysics Data System (ADS)

    Brovelli, M. A.; Kilsedar, C. E.; Zamboni, G.

    2016-06-01

GeoWeb 2.0, which laid the foundations of Volunteered Geographic Information (VGI) systems, has led to platforms where users can contribute to an openly accessible body of geographic knowledge. Moreover, advances in 3D visualization have produced virtual globes able to display geographic data directly in the browser. However, the integration of VGI systems and virtual globes has not yet been fully realized. The study presented here aims to visualize volunteered data in 3D, with attention to ease of use for the general public, using Free and Open Source Software (FOSS). NASA's new Application Programming Interface (API), Web World Wind, is written in JavaScript, based on the Web Graphics Library (WebGL), and cross-platform and cross-browser. A virtual globe created with this API is therefore accessible through any WebGL-supporting browser on different operating systems and devices, requiring no installation or configuration on the client side, which makes the collected data more usable; this is not the case with World Wind for Java, where installation and configuration of the Java Virtual Machine (JVM) are required. Furthermore, the data collected through various VGI platforms might arrive in different formats and be stored in a traditional relational database or in a NoSQL database. The project developed here aims to visualize and query data collected through the Open Data Kit (ODK) platform and a cross-platform application, with the data stored in a relational PostgreSQL database and a NoSQL CouchDB database, respectively.

  12. A public HTLV-1 molecular epidemiology database for sequence management and data mining.

    PubMed

    Araujo, Thessika Hialla Almeida; Souza-Brito, Leandro Inacio; Libin, Pieter; Deforche, Koen; Edwards, Dustin; de Albuquerque-Junior, Antonio Eduardo; Vandamme, Anne-Mieke; Galvao-Castro, Bernardo; Alcantara, Luiz Carlos Junior

    2012-01-01

It is estimated that 15 to 20 million people are infected with the human T-cell lymphotropic virus type 1 (HTLV-1). At present, more than 2,000 unique HTLV-1 isolate sequences have been published. A central database aggregating sequence information relevant to a range of epidemiological aspects, including HTLV-1 infection, pathogenesis, origins, and evolutionary dynamics, would be useful to scientists and physicians worldwide. Here we describe a database we have developed that collects and annotates sequence data and can be accessed through a user-friendly search interface. The HTLV-1 Molecular Epidemiology Database website is available at http://htlv1db.bahia.fiocruz.br/. All data were obtained from publications available in GenBank or through contact with the authors. The database was developed using the Apache 2.1.6 web server and the MySQL database management system. The web interfaces were developed in HTML, with server-side scripting written in PHP. The HTLV-1 Molecular Epidemiology Database is hosted on the Gonçalo Moniz/FIOCRUZ Research Center server. There are currently 2,457 registered sequences, of which 2,024 (82.37%) represent unique isolates. Of these sequences, 803 (39.67%) contain information about clinical status (TSP/HAM, 17.19%; ATL, 7.41%; asymptomatic, 12.89%; other diseases, 2.17%; and no information, 60.32%). Further, 7.26% of sequences contain information on patient gender, while 5.23% provide the age of the patient. The HTLV-1 Molecular Epidemiology Database retrieves and stores annotated HTLV-1 proviral sequences from clinical, epidemiological, and geographical studies. The collected sequences and related information are now accessible on a publicly available and user-friendly website. This open-access database will support clinical research and vaccine development related to viral genotype.

  13. KEGG orthology-based annotation of the predicted proteome of Acropora digitifera: ZoophyteBase - an open access and searchable database of a coral genome

    PubMed Central

    2013-01-01

    Background Contemporary coral reef research has firmly established that a genomic approach is urgently needed to better understand the effects of anthropogenic environmental stress and global climate change on coral holobiont interactions. Here we present KEGG orthology-based annotation of the complete genome sequence of the scleractinian coral Acropora digitifera and provide the first comprehensive view of the genome of a reef-building coral by applying advanced bioinformatics. Description Sequences from the KEGG database of protein function were used to construct hidden Markov models. These models were used to search the predicted proteome of A. digitifera to establish complete genomic annotation. The annotated dataset is published in ZoophyteBase, an open access format with different options for searching the data. A particularly useful feature is the ability to use a Google-like search engine that links query words to protein attributes. We present features of the annotation that underpin the molecular structure of key processes of coral physiology that include (1) regulatory proteins of symbiosis, (2) planula and early developmental proteins, (3) neural messengers, receptors and sensory proteins, (4) calcification and Ca2+-signalling proteins, (5) plant-derived proteins, (6) proteins of nitrogen metabolism, (7) DNA repair proteins, (8) stress response proteins, (9) antioxidant and redox-protective proteins, (10) proteins of cellular apoptosis, (11) microbial symbioses and pathogenicity proteins, (12) proteins of viral pathogenicity, (13) toxins and venom, (14) proteins of the chemical defensome and (15) coral epigenetics. 
Conclusions We advocate that providing annotation in an open-access searchable database available to the public domain will give an unprecedented foundation to interrogate the fundamental molecular structure and interactions of coral symbiosis and allow critical questions to be addressed at the genomic level based on combined aspects of evolutionary, developmental, metabolic, and environmental perspectives. PMID:23889801
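A hedged sketch of the annotation step described above: after searching the predicted proteome against profile HMMs (e.g. with HMMER's `hmmsearch --tblout`), one typically keeps the best-scoring hit per protein below an E-value cutoff. The table below mimics the tblout column order for the first six fields, but the protein identifiers, KEGG orthology terms, and values are invented for illustration.

```python
# Parse a (hypothetical) hmmsearch --tblout-style table and keep the
# best hit per target protein under an E-value cutoff. The rows below
# are invented; they are not output from the actual pipeline.
HITS = """\
# target      accession query   accession  evalue  score
adi_v1.00001  -         K00001  -          1.2e-40 135.2
adi_v1.00001  -         K00002  -          3.0e-05 18.1
adi_v1.00002  -         K03283  -          5.5e-22 80.4
"""

def best_hits(table, cutoff=1e-10):
    best = {}
    for line in table.splitlines():
        if line.startswith("#") or not line.strip():
            continue                       # skip comments and blanks
        target, _, query, _, evalue, score = line.split()[:6]
        evalue = float(evalue)
        # keep only significant hits, and only the best one per protein
        if evalue <= cutoff and (target not in best or evalue < best[target][1]):
            best[target] = (query, evalue)
    return best

annotations = best_hits(HITS)
print(annotations)
# -> {'adi_v1.00001': ('K00001', 1.2e-40), 'adi_v1.00002': ('K03283', 5.5e-22)}
```

The resulting protein-to-orthology mapping is the kind of record a searchable annotation database such as ZoophyteBase would then index and expose.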

  14. Open resource metagenomics: a model for sharing metagenomic libraries.

    PubMed

    Neufeld, J D; Engel, K; Cheng, J; Moreno-Hagelsieb, G; Rose, D R; Charles, T C

    2011-11-30

Both sequence-based and activity-based exploitation of environmental DNA have provided unprecedented access to the genomic content of cultivated and uncultivated microorganisms. Although researchers deposit microbial strains in culture collections and DNA sequences in databases, activity-based metagenomic studies typically only publish sequences from the hits retrieved from specific screens. Physical metagenomic libraries, conceptually similar to entire sequence datasets, are usually not straightforward to obtain by interested parties subsequent to publication. In order to facilitate unrestricted distribution of metagenomic libraries, we propose the adoption of open resource metagenomics, in line with the trend towards open access publishing, and similar to culture- and mutant-strain collections that have been the backbone of traditional microbiology and microbial genetics. The concept of open resource metagenomics includes preparation of physical DNA libraries, preferably in versatile vectors that facilitate screening in a diversity of host organisms, and pooling of clones so that single aliquots containing complete libraries can be easily distributed upon request. Database deposition of associated metadata and sequence data for each library provides researchers with information to select the most appropriate libraries for further research projects. As a starting point, we have established the Canadian MetaMicroBiome Library (CM2BL [1]). The CM2BL is a publicly accessible collection of cosmid libraries containing environmental DNA from soils collected from across Canada, spanning multiple biomes. The libraries were constructed such that the cloned DNA can be easily transferred to Gateway® compliant vectors, facilitating functional screening in virtually any surrogate microbial host for which there are available plasmid vectors.
The libraries, which we are placing in the public domain, will be distributed upon request without restriction to members of both the academic research community and industry. This article invites the scientific community to adopt this philosophy of open resource metagenomics to extend the utility of functional metagenomics beyond initial publication, circumventing the need to start from scratch with each new research project.

  15. Open resource metagenomics: a model for sharing metagenomic libraries

    PubMed Central

    Neufeld, J.D.; Engel, K.; Cheng, J.; Moreno-Hagelsieb, G.; Rose, D.R.; Charles, T.C.

    2011-01-01

    Both sequence-based and activity-based exploitation of environmental DNA have provided unprecedented access to the genomic content of cultivated and uncultivated microorganisms. Although researchers deposit microbial strains in culture collections and DNA sequences in databases, activity-based metagenomic studies typically only publish sequences from the hits retrieved from specific screens. Physical metagenomic libraries, conceptually similar to entire sequence datasets, are usually not straightforward to obtain by interested parties subsequent to publication. In order to facilitate unrestricted distribution of metagenomic libraries, we propose the adoption of open resource metagenomics, in line with the trend towards open access publishing, and similar to culture- and mutant-strain collections that have been the backbone of traditional microbiology and microbial genetics. The concept of open resource metagenomics includes preparation of physical DNA libraries, preferably in versatile vectors that facilitate screening in a diversity of host organisms, and pooling of clones so that single aliquots containing complete libraries can be easily distributed upon request. Database deposition of associated metadata and sequence data for each library provides researchers with information to select the most appropriate libraries for further research projects. As a starting point, we have established the Canadian MetaMicroBiome Library (CM2BL [1]). The CM2BL is a publicly accessible collection of cosmid libraries containing environmental DNA from soils collected from across Canada, spanning multiple biomes. The libraries were constructed such that the cloned DNA can be easily transferred to Gateway® compliant vectors, facilitating functional screening in virtually any surrogate microbial host for which there are available plasmid vectors. 
The libraries, which we are placing in the public domain, will be distributed upon request without restriction to members of both the academic research community and industry. This article invites the scientific community to adopt this philosophy of open resource metagenomics to extend the utility of functional metagenomics beyond initial publication, circumventing the need to start from scratch with each new research project. PMID:22180823

  16. BGD: a database of bat genomes.

    PubMed

    Fang, Jianfei; Wang, Xuan; Mu, Shuo; Zhang, Shuyi; Dong, Dong

    2015-01-01

    Bats account for ~20% of mammalian species and are the only mammals capable of true powered flight. Because of these specialized phenotypic traits, much research has been devoted to examining the evolution of bats. Several whole bat genomes have now been assembled and annotated; however, a uniform resource for the annotated bat genomes is still unavailable. To make the extensive data associated with bat genomes accessible to the general biological community, we established the Bat Genome Database (BGD). BGD is an open-access, web-available portal that integrates available data on bat genomes and genes. It hosts data from six bat species, including two megabats and four microbats. Users can query the gene annotations using an efficient search engine, and browsable tracks of the bat genomes are offered. Furthermore, an easy-to-use phylogenetic analysis tool is provided to facilitate online phylogenetic study of genes. To the best of our knowledge, BGD is the first database of bat genomes. It will extend our understanding of bat evolution and aid the analysis of bat sequences. BGD is freely available at: http://donglab.ecnu.edu.cn/databases/BatGenome/.

  17. The Fossil Calibration Database-A New Resource for Divergence Dating.

    PubMed

    Ksepka, Daniel T; Parham, James F; Allman, James F; Benton, Michael J; Carrano, Matthew T; Cranston, Karen A; Donoghue, Philip C J; Head, Jason J; Hermsen, Elizabeth J; Irmis, Randall B; Joyce, Walter G; Kohli, Manpreet; Lamm, Kristin D; Leehr, Dan; Patané, José S L; Polly, P David; Phillips, Matthew J; Smith, N Adam; Smith, Nathan D; Van Tuinen, Marcel; Ware, Jessica L; Warnock, Rachel C M

    2015-09-01

    Fossils provide the principal basis for temporal calibrations, which are critical to the accuracy of divergence dating analyses. Translating fossil data into minimum and maximum bounds for calibrations is the most important, and often least appreciated, step of divergence dating. Properly justified calibrations require the synthesis of phylogenetic, paleontological, and geological evidence and can be difficult for nonspecialists to formulate. The dynamic nature of the fossil record (e.g., new discoveries, taxonomic revisions, updates of global or local stratigraphy) requires that calibration data be updated continually lest they become obsolete. Here, we announce the Fossil Calibration Database (http://fossilcalibrations.org), a new open-access resource providing vetted fossil calibrations to the scientific community. Calibrations accessioned into this database are based on individual fossil specimens and follow best practices for phylogenetic justification and geochronological constraint. The associated Fossil Calibration Series, a calibration-themed publication series at Palaeontologia Electronica, will serve as a key pipeline for peer-reviewed calibrations to enter the database. © The Author(s) 2015. Published by Oxford University Press, on behalf of the Society of Systematic Biologists. All rights reserved. For Permissions, please email: journals.permissions@oup.com.
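
    The minimum/maximum bounds described in this record can be captured in a small validated record type. The following Python sketch uses invented field names and an illustrative turtle calibration for demonstration only; it is not the database's actual schema:

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class FossilCalibration:
    """Hypothetical calibration record with the min/max bounds described above.

    Ages are in millions of years before present (Ma), so the minimum
    bound must be numerically no larger than the maximum bound.
    """
    node: str                 # calibrated clade
    fossil: str               # specimen justifying the minimum bound
    min_age: float            # hard minimum (Ma)
    max_age: Optional[float]  # soft maximum (Ma); may be absent

    def __post_init__(self):
        if self.min_age <= 0:
            raise ValueError("minimum age must be positive")
        if self.max_age is not None and self.max_age < self.min_age:
            raise ValueError("maximum bound cannot be younger than the minimum")

# Illustrative entry (values approximate, for demonstration only).
cal = FossilCalibration(node="Pan-Testudines",
                        fossil="Odontochelys semitestacea",
                        min_age=220.0, max_age=252.2)
```

    The validation in `__post_init__` mirrors the vetting idea: a calibration whose bounds are internally inconsistent is rejected at construction time.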

  18. Phynx: an open source software solution supporting data management and web-based patient-level data review for drug safety studies in the general practice research database and other health care databases.

    PubMed

    Egbring, Marco; Kullak-Ublick, Gerd A; Russmann, Stefan

    2010-01-01

    To develop a software solution that supports management and clinical review of patient data from electronic medical record databases or claims databases for pharmacoepidemiological drug safety studies. We used open source software to build a data management system and an Internet application with a Flex client on a Java application server with a MySQL database backend. The application is hosted on Amazon Elastic Compute Cloud. This solution, named Phynx, supports data management, Web-based display of electronic patient information, and interactive review of patient-level information in the individual clinical context. The system was applied to a dataset from the UK General Practice Research Database (GPRD). Our solution can be set up and customized with limited programming resources, and there is almost no extra cost for software. Access times are short, the displayed information is structured in chronological order and visually attractive, and selected information such as drug exposure can be blinded. External experts can review patient profiles and save evaluations and comments via a common Web browser. Phynx provides a flexible and economical solution for patient-level review of electronic medical information from databases, taking the individual clinical context into account. It can therefore make an important contribution to efficient validation of outcome assessment in drug safety database studies.
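
    The blinding feature described in this record can be sketched in a few lines. This is a hypothetical record layout, not Phynx's actual schema; it only illustrates masking nominated keys (such as drug exposure) while leaving the chronological clinical events visible to reviewers:

```python
import copy

def blind_fields(profile: dict, fields: set) -> dict:
    """Return a copy of a patient profile with selected attributes masked."""
    blinded = copy.deepcopy(profile)
    for event in blinded.get("events", []):
        for key in fields & event.keys():
            event[key] = "[blinded]"
    return blinded

# Invented example profile; dates and lab values are illustrative only.
profile = {"id": "P001",
           "events": [{"date": "2009-03-01", "drug": "drug A", "alt": 35},
                      {"date": "2009-04-15", "drug": "drug A", "alt": 210}]}
review_copy = blind_fields(profile, {"drug"})
```

    Working on a deep copy keeps the unblinded source record intact, so the same profile can be shown blinded to external reviewers and unblinded to the study team.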

  19. An open repository of earthquake-triggered ground-failure inventories

    USGS Publications Warehouse

    Schmitt, Robert G.; Tanyas, Hakan; Nowicki Jessee, M. Anna; Zhu, Jing; Biegel, Katherine M.; Allstadt, Kate E.; Jibson, Randall W.; Thompson, Eric M.; van Westen, Cees J.; Sato, Hiroshi P.; Wald, David J.; Godt, Jonathan W.; Gorum, Tolga; Xu, Chong; Rathje, Ellen M.; Knudsen, Keith L.

    2017-12-20

    Earthquake-triggered ground failure, such as landsliding and liquefaction, can contribute significantly to losses, but our current ability to accurately include them in earthquake-hazard analyses is limited. The development of robust and widely applicable models requires access to numerous inventories of ground failures triggered by earthquakes that span a broad range of terrains, shaking characteristics, and climates. We present an openly accessible, centralized earthquake-triggered ground-failure inventory repository in the form of a ScienceBase Community to provide open access to these data with the goal of accelerating research progress. The ScienceBase Community hosts digital inventories created by both U.S. Geological Survey (USGS) and non-USGS authors. We present the original digital inventory files (when available) as well as an integrated database with uniform attributes. We also summarize the mapping methodology and level of completeness as reported by the original author(s) for each inventory. This document describes the steps taken to collect, process, and compile the inventories and the process for adding additional ground-failure inventories to the ScienceBase Community in the future.
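
    The integration of heterogeneous inventories into a database with uniform attributes can be sketched as a per-author field mapping. The attribute and column names below are hypothetical, not the repository's actual schema:

```python
def to_uniform(record: dict, mapping: dict) -> dict:
    """Map one author's inventory attributes onto a shared target schema.

    Unmapped target fields are filled with None so that every integrated
    record carries the same attribute set.
    """
    target_fields = ("event_id", "failure_type", "area_m2", "mapping_method")
    out = {f: None for f in target_fields}
    for src_key, value in record.items():
        if src_key in mapping:
            out[mapping[src_key]] = value
    return out

# Invented source columns for one contributed inventory.
mapping = {"EQ": "event_id", "LS_TYPE": "failure_type", "Shape_Area": "area_m2"}
uniform = to_uniform({"EQ": "us2008ryan", "LS_TYPE": "landslide",
                      "Shape_Area": 5400.0}, mapping)
```

    Keeping one such mapping per contributed inventory lets the original files stay untouched while the integrated table remains queryable with a single set of attributes.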

  20. HDDTOOLS: an R package serving Hydrological Data Discovery Tools

    NASA Astrophysics Data System (ADS)

    Vitolo, C.; Buytaert, W.

    2014-12-01

    Many governmental bodies and institutions are currently committed to publishing open data as part of a trend towards increasing transparency, through which a wide variety of information produced at public expense is becoming open and freely available to improve public involvement in decision and policy making. Discovery, access and retrieval of this information is, however, not always a simple task. Especially when programmatic access to data resources is not provided, downloading metadata catalogues, selecting the information needed, requesting datasets, decompressing, converting, and manually filtering and parsing them can become rather tedious. The R package "hddtools" is an open-source project designed to make all of the above operations more efficient by means of reusable functions. The package facilitates programmatic access to various online data sources such as the Global Runoff Data Centre, NASA's TRMM mission and the Data60UK database, amongst others. This package complements R's growing functionality in environmental web technologies to bridge the gap between data providers and data consumers, and it is designed to be the starting building block of scientific workflows for linking data and models in a seamless fashion.

  1. What Can OpenEI Do For You?

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    None

    2010-12-10

    Open Energy Information (OpenEI) is an open source web platform—similar to the one used by Wikipedia—developed by the US Department of Energy (DOE) and the National Renewable Energy Laboratory (NREL) to make the large amounts of energy-related data and information more easily searched, accessed, and used both by people and automated machine processes. Built utilizing the standards and practices of the Linked Open Data community, the OpenEI platform is much more robust and powerful than typical web sites and databases. As an open platform, all users can search, edit, add, and access data in OpenEI for free. The user community contributes the content and ensures its accuracy and relevance; as the community expands, so does the content's comprehensiveness and quality. The data are structured and tagged with descriptors to enable cross-linking among related data sets, advanced search functionality, and consistent, usable formatting. Data input protocols and quality standards help ensure the content is structured and described properly and derived from a credible source. Although DOE/NREL is developing OpenEI and seeding it with initial data, it is designed to become a true community model with millions of users, a large core of active contributors, and numerous sponsors.

  2. What Can OpenEI Do For You?

    ScienceCinema

    None

    2018-02-06

    Open Energy Information (OpenEI) is an open source web platform—similar to the one used by Wikipedia—developed by the US Department of Energy (DOE) and the National Renewable Energy Laboratory (NREL) to make the large amounts of energy-related data and information more easily searched, accessed, and used both by people and automated machine processes. Built utilizing the standards and practices of the Linked Open Data community, the OpenEI platform is much more robust and powerful than typical web sites and databases. As an open platform, all users can search, edit, add, and access data in OpenEI for free. The user community contributes the content and ensures its accuracy and relevance; as the community expands, so does the content's comprehensiveness and quality. The data are structured and tagged with descriptors to enable cross-linking among related data sets, advanced search functionality, and consistent, usable formatting. Data input protocols and quality standards help ensure the content is structured and described properly and derived from a credible source. Although DOE/NREL is developing OpenEI and seeding it with initial data, it is designed to become a true community model with millions of users, a large core of active contributors, and numerous sponsors.

  3. Database technology and the management of multimedia data in the Mirror project

    NASA Astrophysics Data System (ADS)

    de Vries, Arjen P.; Blanken, H. M.

    1998-10-01

    Multimedia digital libraries require an open distributed architecture instead of a monolithic database system. In the Mirror project, we use the Monet extensible database kernel to manage different representations of multimedia objects. To maintain independence between content, meta-data, and the creation of meta-data, we allow distribution of data and operations using CORBA. This open architecture introduces new problems for data access. From an end user's perspective, the problem is how to search the available representations to fulfill an actual information need; the conceptual gap between human perceptual processes and the meta-data is too large. From a system's perspective, several representations of the data may semantically overlap or be irrelevant. We address these problems with an iterative query process and active user participation through relevance feedback. A retrieval model based on inference networks assists the user with query formulation. The integration of this model into the database design has two advantages. First, the user can query both the logical and the content structure of multimedia objects. Second, the use of different data models in the logical and the physical database design provides data independence and allows algebraic query optimization. We illustrate query processing with a music retrieval application.
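
    The iterative, feedback-driven query loop described in this record can be illustrated with the classic Rocchio update. Note the Mirror project itself uses an inference-network retrieval model; this Python sketch with invented term weights is only the simplest stand-in for query reweighting from user-marked relevant and non-relevant results:

```python
def rocchio(query, relevant, nonrelevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Rocchio relevance-feedback update over {term: weight} vectors.

    Moves the query toward the centroid of relevant documents and away
    from the centroid of non-relevant ones; negative weights are clipped.
    """
    terms = (set(query)
             | {t for d in relevant for t in d}
             | {t for d in nonrelevant for t in d})

    def centroid(docs, term):
        return sum(d.get(term, 0.0) for d in docs) / len(docs) if docs else 0.0

    return {t: max(0.0, alpha * query.get(t, 0.0)
                        + beta * centroid(relevant, t)
                        - gamma * centroid(nonrelevant, t))
            for t in terms}

# One feedback round in a toy music-retrieval setting.
q2 = rocchio({"piano": 1.0},
             relevant=[{"piano": 0.5, "sonata": 0.8}],
             nonrelevant=[])
```

    After one round, the query gains weight on terms that co-occur in documents the user marked relevant, which is the mechanism the iterative query process relies on.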

  4. Herpesvirus systematics☆

    PubMed Central

    Davison, Andrew J.

    2010-01-01

    This paper is about the taxonomy and genomics of herpesviruses. Each theme is presented as a digest of current information flanked by commentaries on past activities and future directions. The International Committee on Taxonomy of Viruses recently instituted a major update of herpesvirus classification. The former family Herpesviridae was elevated to a new order, the Herpesvirales, which now accommodates 3 families, 3 subfamilies, 17 genera and 90 species. Future developments will include revisiting the herpesvirus species definition and the criteria used for taxonomic assignment, particularly in regard to the possibilities of classifying the large number of herpesviruses detected only as DNA sequences by polymerase chain reaction. Nucleotide sequence accessions in primary databases, such as GenBank, consist of the sequences plus annotations of the genetic features. The quality of these accessions is important because they provide a knowledge base that is used widely by the research community. However, updating the accessions to take account of improved knowledge is essentially reserved to the original depositors, and this activity is rarely undertaken. Thus, the primary databases are likely to become antiquated. In contrast, secondary databases are open to curation by experts other than the original depositors, thus increasing the likelihood that they will remain up to date. One of the most promising secondary databases is RefSeq, which aims to furnish the best available annotations for complete genome sequences. Progress in regard to improving the RefSeq herpesvirus accessions is discussed, and insights into particular aspects of herpesvirus genomics arising from this work are reported. PMID:20346601

  5. Database Resources of the BIG Data Center in 2018

    PubMed Central

    Xu, Xingjian; Hao, Lili; Zhu, Junwei; Tang, Bixia; Zhou, Qing; Song, Fuhai; Chen, Tingting; Zhang, Sisi; Dong, Lili; Lan, Li; Wang, Yanqing; Sang, Jian; Hao, Lili; Liang, Fang; Cao, Jiabao; Liu, Fang; Liu, Lin; Wang, Fan; Ma, Yingke; Xu, Xingjian; Zhang, Lijuan; Chen, Meili; Tian, Dongmei; Li, Cuiping; Dong, Lili; Du, Zhenglin; Yuan, Na; Zeng, Jingyao; Zhang, Zhewen; Wang, Jinyue; Shi, Shuo; Zhang, Yadong; Pan, Mengyu; Tang, Bixia; Zou, Dong; Song, Shuhui; Sang, Jian; Xia, Lin; Wang, Zhennan; Li, Man; Cao, Jiabao; Niu, Guangyi; Zhang, Yang; Sheng, Xin; Lu, Mingming; Wang, Qi; Xiao, Jingfa; Zou, Dong; Wang, Fan; Hao, Lili; Liang, Fang; Li, Mengwei; Sun, Shixiang; Zou, Dong; Li, Rujiao; Yu, Chunlei; Wang, Guangyu; Sang, Jian; Liu, Lin; Li, Mengwei; Li, Man; Niu, Guangyi; Cao, Jiabao; Sun, Shixiang; Xia, Lin; Yin, Hongyan; Zou, Dong; Xu, Xingjian; Ma, Lina; Chen, Huanxin; Sun, Yubin; Yu, Lei; Zhai, Shuang; Sun, Mingyuan; Zhang, Zhang; Zhao, Wenming; Xiao, Jingfa; Bao, Yiming; Song, Shuhui; Hao, Lili; Li, Rujiao; Ma, Lina; Sang, Jian; Wang, Yanqing; Tang, Bixia; Zou, Dong; Wang, Fan

    2018-01-01

    Abstract The BIG Data Center at Beijing Institute of Genomics (BIG) of the Chinese Academy of Sciences provides freely open access to a suite of database resources in support of worldwide research activities in both academia and industry. With the vast amounts of omics data generated at ever-greater scales and rates, the BIG Data Center is continually expanding, updating and enriching its core database resources through big-data integration and value-added curation, including BioCode (a repository archiving bioinformatics tool codes), BioProject (a biological project library), BioSample (a biological sample library), Genome Sequence Archive (GSA, a data repository for archiving raw sequence reads), Genome Warehouse (GWH, a centralized resource housing genome-scale data), Genome Variation Map (GVM, a public repository of genome variations), Gene Expression Nebulas (GEN, a database of gene expression profiles based on RNA-Seq data), Methylation Bank (MethBank, an integrated databank of DNA methylomes), and Science Wikis (a series of biological knowledge wikis for community annotations). In addition, three featured web services are provided, viz., BIG Search (search as a service; a scalable inter-domain text search engine), BIG SSO (single sign-on as a service; a user access control system to gain access to multiple independent systems with a single ID and password) and Gsub (submission as a service; a unified submission service for all relevant resources). All of these resources are publicly accessible through the home page of the BIG Data Center at http://bigd.big.ac.cn. PMID:29036542

  6. Advanced telemedicine development

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Forslund, D.W.; George, J.E.; Gavrilov, E.M.

    1998-12-31

    This is the final report of a one-year, Laboratory Directed Research and Development (LDRD) project at the Los Alamos National Laboratory (LANL). The objective of this project was to develop a Java-based, electronic, medical-record system that can handle multimedia data and work over a wide-area network based on open standards, and that can utilize an existing database back end. The physician is to be totally unaware that there is a database behind the scenes and is only aware that he/she can access and manage the relevant information to treat the patient.

  7. Relational Database for the Geology of the Northern Rocky Mountains - Idaho, Montana, and Washington

    USGS Publications Warehouse

    Causey, J. Douglas; Zientek, Michael L.; Bookstrom, Arthur A.; Frost, Thomas P.; Evans, Karl V.; Wilson, Anna B.; Van Gosen, Bradley S.; Boleneus, David E.; Pitts, Rebecca A.

    2008-01-01

    A relational database was created to prepare and organize geologic map-unit and lithologic descriptions for input into a spatial database for the geology of the northern Rocky Mountains, a compilation of forty-three geologic maps for parts of Idaho, Montana, and Washington in U.S. Geological Survey Open File Report 2005-1235. Not all of the information was transferred to and incorporated in the spatial database due to physical file limitations. This report releases that part of the relational database that was completed for that earlier product. In addition to descriptive geologic information for the northern Rocky Mountains region, the relational database contains a substantial bibliography of geologic literature for the area. The relational database nrgeo.mdb (linked below) is available in Microsoft Access version 2000, a proprietary database program. The relational database contains data tables and other tables used to define terms, relationships between the data tables, and hierarchical relationships in the data; forms used to enter data; and queries used to extract data.
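
    The kind of map-unit and lithology tables, relationships, and queries the report describes can be sketched with an in-memory SQL database. The column names and sample data below are invented for illustration and are not the actual nrgeo.mdb schema:

```python
import sqlite3

# Build two related tables: one map unit can have several lithologies.
con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE map_unit (unit_id TEXT PRIMARY KEY, name TEXT, age TEXT);
CREATE TABLE lithology (unit_id TEXT REFERENCES map_unit(unit_id),
                        rock_type TEXT);
INSERT INTO map_unit VALUES ('Yb', 'Belt Supergroup', 'Mesoproterozoic');
INSERT INTO lithology VALUES ('Yb', 'argillite'), ('Yb', 'quartzite');
""")

# A query joining unit descriptions to their lithologies, as a report
# on the geology compilation might extract them.
rows = con.execute("""
SELECT m.name, l.rock_type
FROM map_unit m JOIN lithology l ON m.unit_id = l.unit_id
ORDER BY l.rock_type
""").fetchall()
```

    The hierarchical relationships mentioned in the record correspond to exactly this kind of one-to-many join between a data table and its child tables.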

  8. Nuclear Receptor Signaling Atlas: Opening Access to the Biology of Nuclear Receptor Signaling Pathways.

    PubMed

    Becnel, Lauren B; Darlington, Yolanda F; Ochsner, Scott A; Easton-Marks, Jeremy R; Watkins, Christopher M; McOwiti, Apollo; Kankanamge, Wasula H; Wise, Michael W; DeHart, Michael; Margolis, Ronald N; McKenna, Neil J

    2015-01-01

    Signaling pathways involving nuclear receptors (NRs), their ligands and coregulators, regulate tissue-specific transcriptomes in diverse processes, including development, metabolism, reproduction, the immune response and neuronal function, as well as in their associated pathologies. The Nuclear Receptor Signaling Atlas (NURSA) is a Consortium focused around a Hub website (www.nursa.org) that annotates and integrates diverse 'omics datasets originating from the published literature and NURSA-funded Data Source Projects (NDSPs). These datasets are then exposed to the scientific community on an Open Access basis through user-friendly data browsing and search interfaces. Here, we describe the redesign of the Hub, version 3.0, to deploy "Web 2.0" technologies and add richer, more diverse content. The Molecule Pages, which aggregate information relevant to NR signaling pathways from myriad external databases, have been enhanced to include resources for basic scientists, such as post-translational modification sites and targeting miRNAs, and for clinicians, such as clinical trials. A portal to NURSA's Open Access, PubMed-indexed journal Nuclear Receptor Signaling has been added to facilitate manuscript submissions. Datasets and information on reagents generated by NDSPs are available, as is information concerning periodic new NDSP funding solicitations. Finally, the new website integrates the Transcriptomine analysis tool, which allows for mining of millions of richly annotated public transcriptomic data points in the field, providing an environment for dataset re-use and citation, bench data validation and hypothesis generation. We anticipate that this new release of the NURSA database will have tangible, long term benefits for both basic and clinical research in this field.

  9. Influenza Research Database: an integrated bioinformatics resource for influenza research and surveillance

    PubMed Central

    Squires, R. Burke; Noronha, Jyothi; Hunt, Victoria; García‐Sastre, Adolfo; Macken, Catherine; Baumgarth, Nicole; Suarez, David; Pickett, Brett E.; Zhang, Yun; Larsen, Christopher N.; Ramsey, Alvin; Zhou, Liwei; Zaremba, Sam; Kumar, Sanjeev; Deitrich, Jon; Klem, Edward; Scheuermann, Richard H.

    2012-01-01

    Please cite this paper as: Squires et al. (2012) Influenza research database: an integrated bioinformatics resource for influenza research and surveillance. Influenza and Other Respiratory Viruses 6(6), 404–416. Background  The recent emergence of the 2009 pandemic influenza A/H1N1 virus has highlighted the value of free and open access to influenza virus genome sequence data integrated with information about other important virus characteristics. Design  The Influenza Research Database (IRD, http://www.fludb.org) is a free, open, publicly‐accessible resource funded by the U.S. National Institute of Allergy and Infectious Diseases through the Bioinformatics Resource Centers program. IRD provides a comprehensive, integrated database and analysis resource for influenza sequence, surveillance, and research data, including user‐friendly interfaces for data retrieval, visualization and comparative genomics analysis, together with personal log in‐protected ‘workbench’ spaces for saving data sets and analysis results. IRD integrates genomic, proteomic, immune epitope, and surveillance data from a variety of sources, including public databases, computational algorithms, external research groups, and the scientific literature. Results  To demonstrate the utility of the data and analysis tools available in IRD, two scientific use cases are presented. A comparison of hemagglutinin sequence conservation and epitope coverage information revealed highly conserved protein regions that can be recognized by the human adaptive immune system as possible targets for inducing cross‐protective immunity. Phylogenetic and geospatial analysis of sequences from wild bird surveillance samples revealed a possible evolutionary connection between influenza virus from Delaware Bay shorebirds and Alberta ducks. 
Conclusions  The IRD provides a wealth of integrated data and information about influenza virus to support research of the genetic determinants dictating virus pathogenicity, host range restriction and transmission, and to facilitate development of vaccines, diagnostics, and therapeutics. PMID:22260278

  10. Semantic-JSON: a lightweight web service interface for Semantic Web contents integrating multiple life science databases.

    PubMed

    Kobayashi, Norio; Ishii, Manabu; Takahashi, Satoshi; Mochizuki, Yoshiki; Matsushima, Akihiro; Toyoda, Tetsuro

    2011-07-01

    Global cloud frameworks for bioinformatics research databases have become huge and heterogeneous; solutions face various diametric challenges comprising cross-integration, retrieval, security and openness. To address this, as of March 2011 organizations including RIKEN published 192 mammalian, plant and protein life sciences databases having 8.2 million data records, integrated as Linked Open or Private Data (LOD/LPD) using SciNetS.org, the Scientists' Networking System. The huge quantity of linked data this database integration framework covers is based on the Semantic Web, where researchers collaborate by managing metadata across public and private databases in a secured data space. This outstripped the data query capacity of existing interface tools like SPARQL. Actual research also requires specialized tools for data analysis using raw original data. To solve these challenges, in December 2009 we developed the lightweight Semantic-JSON interface to access each fragment of linked and raw life sciences data securely under the control of programming languages popularly used by bioinformaticians such as Perl and Ruby. Researchers successfully used the interface across 28 million semantic relationships for biological applications including genome design, sequence processing, inference over phenotype databases, full-text search indexing and human-readable contents like ontology and LOD tree viewers. Semantic-JSON services of SciNetS.org are provided at http://semanticjson.org.
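
    The appeal of a JSON interface over SPARQL is that linked-data fragments can be consumed directly from a scripting language. The response shape below is invented (the real Semantic-JSON wire format is not reproduced here); the sketch only illustrates walking predicate/object links in plain JSON:

```python
import json

# Hypothetical response for one subject and its outgoing links.
payload = json.loads("""
{"subject": "riken:gene/0001",
 "links": [{"predicate": "rdfs:label", "object": "Abc1"},
           {"predicate": "obo:RO_0002331", "object": "riken:pathway/42"}]}
""")

def objects_of(doc, predicate):
    """Collect every object attached to the subject via one predicate."""
    return [link["object"] for link in doc["links"]
            if link["predicate"] == predicate]

labels = objects_of(payload, "rdfs:label")
```

    In Perl or Ruby the same traversal is a one-line map over the decoded structure, which is the "lightweight" property the record emphasizes.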

  11. Development of a Prototype Detailing Management System for the Civil Engineer Corps

    DTIC Science & Technology

    2002-09-01

    Figure 30. Associate Members To Billets SQL Statement ... ASCII: American Standard Code for Information Interchange; EMPRS: Electronic Military Personnel Record System; VBA: Visual Basic for Applications; SDLC ... capturing keystrokes or carrying out a series of actions when opening an Access database. In the place of macros, VBA should be used because of its ...

  12. Integrative interactive visualization of crystal structure, band structure, and Brillouin zone

    NASA Astrophysics Data System (ADS)

    Hanson, Robert; Hinke, Ben; van Koevering, Matthew; Oses, Corey; Toher, Cormac; Hicks, David; Gossett, Eric; Plata Ramos, Jose; Curtarolo, Stefano; Aflow Collaboration

    The AFLOW library is an open-access database for high throughput ab-initio calculations that serves as a resource for the dissemination of computational results in the area of materials science. Our project aims to create an interactive web-based visualization of any structure in the AFLOW database that has associate band structure data in a way that allows novel simultaneous exploration of the crystal structure, band structure, and Brillouin zone. Interactivity is obtained using two synchronized JSmol implementations, one for the crystal structure and one for the Brillouin zone, along with a D3-based band-structure diagram produced on the fly from data obtained from the AFLOW database. The current website portal (http://aflowlib.mems.duke.edu/users/jmolers/matt/website) allows interactive access and visualization of crystal structure, Brillouin zone and band structure for more than 55,000 inorganic crystal structures. This work was supported by the US Navy Office of Naval Research through a Broad Area Announcement administered by Duke University.

  13. GenomeRNAi: a database for cell-based RNAi phenotypes.

    PubMed

    Horn, Thomas; Arziman, Zeynep; Berger, Juerg; Boutros, Michael

    2007-01-01

    RNA interference (RNAi) has emerged as a powerful tool to generate loss-of-function phenotypes in a variety of organisms. Combined with the sequence information of almost completely annotated genomes, RNAi technologies have opened new avenues to conduct systematic genetic screens for every annotated gene in the genome. As increasing large datasets of RNAi-induced phenotypes become available, an important challenge remains the systematic integration and annotation of functional information. Genome-wide RNAi screens have been performed both in Caenorhabditis elegans and Drosophila for a variety of phenotypes and several RNAi libraries have become available to assess phenotypes for almost every gene in the genome. These screens were performed using different types of assays from visible phenotypes to focused transcriptional readouts and provide a rich data source for functional annotation across different species. The GenomeRNAi database provides access to published RNAi phenotypes obtained from cell-based screens and maps them to their genomic locus, including possible non-specific regions. The database also gives access to sequence information of RNAi probes used in various screens. It can be searched by phenotype, by gene, by RNAi probe or by sequence and is accessible at http://rnai.dkfz.de.

  14. GenomeRNAi: a database for cell-based RNAi phenotypes

    PubMed Central

    Horn, Thomas; Arziman, Zeynep; Berger, Juerg; Boutros, Michael

    2007-01-01

    RNA interference (RNAi) has emerged as a powerful tool to generate loss-of-function phenotypes in a variety of organisms. Combined with the sequence information of almost completely annotated genomes, RNAi technologies have opened new avenues to conduct systematic genetic screens for every annotated gene in the genome. As increasing large datasets of RNAi-induced phenotypes become available, an important challenge remains the systematic integration and annotation of functional information. Genome-wide RNAi screens have been performed both in Caenorhabditis elegans and Drosophila for a variety of phenotypes and several RNAi libraries have become available to assess phenotypes for almost every gene in the genome. These screens were performed using different types of assays from visible phenotypes to focused transcriptional readouts and provide a rich data source for functional annotation across different species. The GenomeRNAi database provides access to published RNAi phenotypes obtained from cell-based screens and maps them to their genomic locus, including possible non-specific regions. The database also gives access to sequence information of RNAi probes used in various screens. It can be searched by phenotype, by gene, by RNAi probe or by sequence and is accessible at http://rnai.dkfz.de. PMID:17135194

  15. Public health and epidemiology journals published in Brazil and other Portuguese speaking countries

    PubMed Central

    Barreto, Mauricio L; Barata, Rita Barradas

    2008-01-01

    It is well known that papers written in languages other than English run a great risk of being ignored simply because these languages are not accessible to the international scientific community. The objective of this paper is to facilitate access to the public health and epidemiology literature available in Portuguese-speaking countries. This literature was found to be concentrated in Brazil, with a few examples in Portugal and none in the other Portuguese-speaking countries. It is predominantly written in Portuguese, but also in other languages such as English or Spanish. The paper describes the several journals, as well as the bibliographic databases that index them and how to access them. Most journals provide open access, with direct links in the indexing databases. The importance of this scientific production for the development of epidemiology as a scientific discipline and as a basic discipline for public health practice is discussed. Marginalizing these publications has implications for a more balanced knowledge and understanding of health problems and their determinants at a worldwide level. PMID:18826592

  16. Conversion of National Health Insurance Service-National Sample Cohort (NHIS-NSC) Database into Observational Medical Outcomes Partnership-Common Data Model (OMOP-CDM).

    PubMed

    You, Seng Chan; Lee, Seongwon; Cho, Soo-Yeon; Park, Hojun; Jung, Sungjae; Cho, Jaehyeong; Yoon, Dukyong; Park, Rae Woong

    2017-01-01

    It is increasingly necessary to generate medical evidence applicable to Asian people, as compared with those in Western countries. Observational Health Data Sciences and Informatics (OHDSI) is an international collaboration that aims to facilitate the generation of high-quality evidence by creating and applying open-source data analytic solutions to a large network of health databases across countries. We aimed to incorporate Korean nationwide cohort data into the OHDSI network by converting the national sample cohort into the Observational Medical Outcomes Partnership-Common Data Model (OMOP-CDM). Data from 1.13 million subjects were converted to OMOP-CDM, with an average conversion rate of 99.1%. ACHILLES, an open-source OMOP-CDM-based data profiling tool, was run on the converted database to visualize data-driven characterizations and assess data quality. The OMOP-CDM version of the National Health Insurance Service-National Sample Cohort (NHIS-NSC) can be a valuable tool for multiple aspects of medical research through its incorporation into the OHDSI research network.

  17. From Pharmacovigilance to Clinical Care Optimization.

    PubMed

    Celi, Leo Anthony; Moseley, Edward; Moses, Christopher; Ryan, Padhraig; Somai, Melek; Stone, David; Tang, Kai-Ou

    2014-09-01

    In order to ensure the continued, safe administration of pharmaceuticals, particularly those agents that have been recently introduced into the market, there is a need for improved surveillance after product release. This is particularly so because drugs are used by a variety of patients whose particular characteristics may not have been fully captured in the original market approval studies. Even well-conducted, randomized controlled trials are likely to have excluded a large proportion of individuals because of any number of issues. The digitization of medical care, which yields rich and accessible drug data amenable to analytic techniques, provides an opportunity to capture the required information via observational studies. We propose the development of an open, accessible database containing properly de-identified data, to provide the substrate for the required improvement in pharmacovigilance. A range of stakeholders could use this to identify delayed and low-frequency adverse events. Moreover, its power as a research tool could extend to the detection of complex interactions, potential novel uses, and subtle subpopulation effects. This far-reaching potential is demonstrated by our experience with the open Multi-parameter Intelligent Monitoring in Intensive Care (MIMIC) intensive care unit database. The new database could also inform the development of objective, robust clinical practice guidelines. Careful systematization and deliberate standardization of a fully digitized pharmacovigilance process is likely to save both time and resources for healthcare in general.

  18. Open-access MIMIC-II database for intensive care research.

    PubMed

    Lee, Joon; Scott, Daniel J; Villarroel, Mauricio; Clifford, Gari D; Saeed, Mohammed; Mark, Roger G

    2011-01-01

    The critical state of intensive care unit (ICU) patients demands close monitoring, and as a result a large volume of multi-parameter data is collected continuously. This represents a unique opportunity for researchers interested in clinical data mining. We sought to foster a more transparent and efficient intensive care research community by building a publicly available ICU database, namely Multiparameter Intelligent Monitoring in Intensive Care II (MIMIC-II). The data harnessed in MIMIC-II were collected from the ICUs of Beth Israel Deaconess Medical Center from 2001 to 2008 and represent 26,870 adult hospital admissions (version 2.6). MIMIC-II consists of two major components: clinical data and physiological waveforms. The clinical data, which include patient demographics, intravenous medication drip rates, and laboratory test results, were organized into a relational database. The physiological waveforms, including 125 Hz signals recorded at bedside and corresponding vital signs, were stored in an open-source format. MIMIC-II data were also deidentified in order to remove protected health information. Any interested researcher can gain access to MIMIC-II free of charge after signing a data use agreement and completing human subjects training. MIMIC-II can support a wide variety of research studies, ranging from the development of clinical decision support algorithms to retrospective clinical studies. We anticipate that MIMIC-II will be an invaluable resource for intensive care research by stimulating fair comparisons among different studies.

  19. Enabling cross-disciplinary research by linking data to Open Access publications

    NASA Astrophysics Data System (ADS)

    Rettberg, N.

    2012-04-01

    OpenAIREplus focuses on the linking of research data to associated publications. The interlinking of research objects has implications for optimising the research process, allowing the sharing, enrichment and reuse of data, and ultimately serving to make open data an essential part of first class research. The growing call for more concrete data management and sharing plans, apparent at funder and national level, is complemented by the increasing support for a scientific infrastructure that supports the seamless access to a range of research materials. This paper will describe the recently launched OpenAIREplus and will detail how it plans to achieve its goals of developing an Open Access participatory infrastructure for scientific information. OpenAIREplus extends the current collaborative OpenAIRE project, which provides European researchers with a service network for the deposit of peer-reviewed FP7 grant-funded Open Access publications. This new project will focus on opening up the infrastructure to data sources from subject-specific communities to provide metadata about research data and publications, facilitating the linking between these objects. The ability to link within a publication out to a citable database, or other research data material, is fairly innovative and this project will enable users to search, browse, view, and create relationships between different information objects. In this regard, OpenAIREplus will build on prototypes of so-called "Enhanced Publications", originally conceived in the DRIVER-II project. OpenAIREplus recognizes the importance of representing the context of publications and datasets, thus linking to resources about the authors, their affiliation, location, project data and funding. The project will explore how links between text-based publications and research data are managed in different scientific fields. 
This complements a previous study in OpenAIRE on current disciplinary practices and future needs for infrastructural Open Access services, taking into account the variety within research approaches. Adopting Linked Data mechanisms on top of citation and content mining, it will approach the interchange of data between generic infrastructures such as OpenAIREplus and subject specific service providers. The paper will also touch on the other challenges envisaged in the project with regard to the culture of sharing data, as well as IPR, licensing and organisational issues.

  20. Using a centralised database system and server in the European Union Framework Programme 7 project SEPServer

    NASA Astrophysics Data System (ADS)

    Heynderickx, Daniel

    2012-07-01

    The main objective of the SEPServer project (EU FP7 project 262773) is to produce a new tool that greatly facilitates the investigation of solar energetic particles (SEPs) and their origin: a server providing SEP data, related electromagnetic (EM) observations and analysis methods, a comprehensive catalogue of the observed SEP events, and educational/outreach material on solar eruptions. The project is coordinated by the University of Helsinki and combines data and knowledge from 11 European partners and several collaborating parties from Europe and the US. The datasets provided by the consortium partners are collected in a MySQL database (using the ESA Open Data Interface under licence) on a server operated by DH Consultancy, which also hosts a web interface providing browsing, plotting, post-processing and analysis tools developed by the consortium, as well as a solar energetic particle event catalogue. At this stage of the project, a prototype server has been established and is undergoing testing by users inside the consortium. Using a centralized database has numerous advantages, including: homogeneous storage of the data, which eliminates the need for dataset-specific file access routines once the data are ingested in the database; a homogeneous set of metadata describing the datasets on both a global and a detailed level, allowing automated access to and presentation of the various data products; standardised access to the data from different programming environments (e.g. PHP, IDL); and elimination of the need to download data for individual data requests. SEPServer will thus add value to several space missions and Earth-based observations by facilitating the coordinated exploitation of, and open access to, SEP data and related EM observations, and by promoting correct use of these data for the entire space research community. 
This will lead to new knowledge on the production and transport of SEPs during solar eruptions and facilitate the development of models for predicting solar radiation storms and calculation of expected fluxes/fluences of SEPs encountered by spacecraft in the interplanetary medium.
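    The homogeneous-schema advantage described above can be sketched with a stand-in database. The table layout, instrument names and flux values below are hypothetical illustrations (SQLite stands in for the project's MySQL server); the point is that once every dataset shares one schema, a single parameterized query serves all instruments.

```python
import sqlite3

# In-memory SQLite stand-in for the centralized MySQL event catalogue.
con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE sep_events (
    instrument TEXT, onset_utc TEXT, peak_flux REAL)""")

# Hypothetical event records from two different instruments.
con.executemany("INSERT INTO sep_events VALUES (?, ?, ?)", [
    ("SOHO/ERNE", "2003-10-28T11:10", 154.2),
    ("GOES-11",   "2003-10-28T11:20", 310.0),
])

# The same query works regardless of which instrument produced the data:
# no dataset-specific file access routines are needed after ingestion.
rows = con.execute(
    "SELECT instrument, peak_flux FROM sep_events WHERE onset_utc >= ? "
    "ORDER BY peak_flux DESC", ("2003-10-28T00:00",)).fetchall()
print(rows[0][0])  # GOES-11
```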

  1. Development of a database of instruments for resource-use measurement: purpose, feasibility, and design.

    PubMed

    Ridyard, Colin H; Hughes, Dyfrig A

    2012-01-01

    Health economists frequently rely on methods based on patient recall to estimate resource utilization. Access to questionnaires and diaries, however, is often limited. This study examined the feasibility of establishing an open-access Database of Instruments for Resource-Use Measurement, identified relevant fields for data extraction, and outlined its design. An electronic survey was sent to authors of full UK economic evaluations listed in the National Health Service Economic Evaluation Database (2008-2010), authors of monographs of Health Technology Assessments (1998-2010), and subscribers to the JISCMail health economics e-mailing list. The survey included questions on piloting, validation, recall period, and data capture method. Responses were analyzed and data extracted to generate relevant fields for the database. A total of 143 responses to the survey provided data on 54 resource-use instruments for inclusion in the database. All were reliant on patient or carer recall, and a majority (47) were questionnaires. Thirty-seven were designed for self-completion by the patient, carer, or guardian, and the remainder were designed for completion by researchers or health care professionals while interviewing patients. Methods of development were diverse, particularly in areas such as the planning of resource itemization (evident in 25 instruments), piloting (25), and validation (29). On the basis of the present analysis, we developed a Web-enabled Database of Instruments for Resource-Use Measurement, accessible via www.DIRUM.org. This database may serve as a practical resource for health economists, as well as a means to facilitate further research in the area of resource-use data collection. Copyright © 2012 International Society for Pharmacoeconomics and Outcomes Research (ISPOR). Published by Elsevier Inc. All rights reserved.

  2. RRSM: The European Rapid Raw Strong-Motion Database

    NASA Astrophysics Data System (ADS)

    Cauzzi, C.; Clinton, J. F.; Sleeman, R.; Domingo Ballesta, J.; Kaestli, P.; Galanis, O.

    2014-12-01

    We introduce the European Rapid Raw Strong-Motion database (RRSM), a Europe-wide system that provides parameterised strong motion information, as well as access to waveform data, within minutes of the occurrence of strong earthquakes. The RRSM significantly differs from traditional earthquake strong motion dissemination in Europe, which has focused on providing reviewed, processed strong motion parameters, typically with significant delays. As the RRSM provides rapid open access to raw waveform data and metadata and does not rely on external manual waveform processing, RRSM information is tailored to seismologists and strong-motion data analysts, earthquake and geotechnical engineers, international earthquake response agencies and the educated general public. Access to the RRSM database is via a portal at http://www.orfeus-eu.org/rrsm/ that allows users to query earthquake information, peak ground motion parameters and amplitudes of spectral response, and to select and download earthquake waveforms. All information is available within minutes of any earthquake with magnitude ≥ 3.5 occurring in the Euro-Mediterranean region. Waveform processing and database population are performed using the waveform processing module scwfparam, which is integrated in SeisComP3 (SC3; http://www.seiscomp3.org/). Earthquake information is provided by the EMSC (http://www.emsc-csem.org/) and all seismic waveform data are accessed at the European Integrated waveform Data Archive (EIDA) at ORFEUS (http://www.orfeus-eu.org/index.html), where all on-scale data are used in the fully automated processing. As the EIDA community is continually growing, the already significant number of strong-motion stations is increasing, and the importance of this product is expected to grow accordingly. Real-time RRSM processing started in June 2014, while past events have been processed in order to provide a complete database back to 2005.

  3. Chinese journals: a guide for epidemiologists

    PubMed Central

    Fung, Isaac CH

    2008-01-01

    Chinese journals in epidemiology, preventive medicine and public health contain much that is of potential international interest. However, few non-Chinese speakers are acquainted with this literature. This article therefore provides an overview of the contemporary scene in Chinese biomedical journal publication, Chinese bibliographic databases and Chinese journals in epidemiology, preventive medicine and public health. The challenge of switching to English as the medium of publication, the development of publishing bibliometric data from Chinese databases, the prospect of an Open Access publication model in China, the issue of language bias in literature reviews and the quality of Chinese journals are discussed. Epidemiologists are encouraged to search the Chinese bibliographic databases for Chinese journal articles. PMID:18826604

  4. Evaluation of an open-access CBT-based Internet program for social anxiety: Patterns of use, retention, and outcomes.

    PubMed

    Dryman, M Taylor; McTeague, Lisa M; Olino, Thomas M; Heimberg, Richard G

    2017-10-01

    Internet-delivered cognitive-behavioral therapy (ICBT) has been established as both efficacious and effective in reducing symptoms of social anxiety. However, most research has been conducted in controlled settings, and little is known regarding the utility of such programs in an open-access format. The present study examined the use, adherence, and effectiveness of Joyable, an open-access, Internet-delivered, coach-supported CBT-based intervention for social anxiety. Participants were 3,384 registered users (Mage [SD] = 29.82 [7.89]; 54% male) who created an account between 2014 and 2016. Characteristics of use, factors related to attrition and adherence, and within-group outcomes were examined. The primary outcome measure was the Social Phobia Inventory. On average, participants remained in the program for 81.02 days (SD = 60.50), during which they completed 12.14 activities (SD = 11.09) and 1.53 exposures (SD = 3.18). About half (57%) had contact with a coach. Full adherence to the program was achieved by 16% of participants, a rate higher than in previously published open-access studies of ICBT. Social anxiety symptoms were significantly reduced for participants who engaged in the program, with medium within-group effects from baseline through the cognitive restructuring module (d = 0.63-0.76) and large effects from baseline through the exposure module (d = 1.40-1.83). Response rates were high (72%). Exposures and coach contact were significant predictors of retention and outcome. This open-access online CBT-based program is effective in reducing social anxiety symptoms and has the potential to extend Internet-based mental health services to socially anxious individuals unwilling or unable to seek face-to-face evidence-based therapy. (PsycINFO Database Record (c) 2017 APA, all rights reserved).

  5. KID Project: an internet-based digital video atlas of capsule endoscopy for research purposes.

    PubMed

    Koulaouzidis, Anastasios; Iakovidis, Dimitris K; Yung, Diana E; Rondonotti, Emanuele; Kopylov, Uri; Plevris, John N; Toth, Ervin; Eliakim, Abraham; Wurm Johansson, Gabrielle; Marlicz, Wojciech; Mavrogenis, Georgios; Nemeth, Artur; Thorlacius, Henrik; Tontini, Gian Eugenio

    2017-06-01

    Capsule endoscopy (CE) has revolutionized small-bowel (SB) investigation. Computational methods can enhance diagnostic yield (DY); however, incorporating machine learning algorithms (MLAs) into CE reading is difficult, as large amounts of image annotations are required for training. Current databases lack graphic annotations of pathologies and so cannot be used for this purpose. A novel database, KID, aims to provide a reference for research and development of medical decision support systems (MDSS) for CE. Open-source software was used to build the KID database. Clinicians contribute anonymized, annotated CE images and videos. Graphic annotations are supported by an open-access annotation tool (Ratsnake). We detail an experiment based on the KID database, examining differences in SB lesion measurement between human readers and an MLA. The Jaccard Index (JI) was used to evaluate the similarity between annotations by the MLA and human readers. The MLA performed best in measuring lymphangiectasias, with a JI of 81 ± 6 %. The results for the other lesion types were: angioectasias (JI 64 ± 11 %), aphthae (JI 64 ± 8 %), chylous cysts (JI 70 ± 14 %), polypoid lesions (JI 75 ± 21 %), and ulcers (JI 56 ± 9 %). An MLA can perform as well as human readers in the measurement of SB angioectasias in white light (WL). Automated lesion measurement is therefore feasible. KID is currently the only open-source CE database developed specifically to aid development of MDSS. Our experiment demonstrates this potential.
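    The Jaccard Index used to score annotation agreement above is straightforward to compute: the size of the intersection of two annotated regions divided by the size of their union. A minimal sketch, with hypothetical pixel sets standing in for real lesion outlines:

```python
def jaccard_index(a, b):
    """Jaccard similarity between two sets (e.g. annotated pixel coordinates)."""
    a, b = set(a), set(b)
    union = a | b
    # Two empty annotations are conventionally treated as identical here.
    return len(a & b) / len(union) if union else 1.0

# Hypothetical 10x10 reader annotation vs. an algorithm outline shifted
# by 2 pixels: overlap 80 px, union 120 px.
reader = {(x, y) for x in range(10) for y in range(10)}
algorithm = {(x, y) for x in range(2, 12) for y in range(10)}
print(round(jaccard_index(reader, algorithm), 2))  # 0.67
```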

  6. Social media based NLP system to find and retrieve ARM data: Concept paper

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Devarakonda, Ranjeet; Giansiracusa, Michael T.; Kumar, Jitendra

    Information connectivity and retrieval play a role in our daily lives. The most pervasive source of online information is databases. The amount of data is growing at a rapid rate, and database technology is improving and having a profound effect. Almost all online applications store and retrieve information from databases. One challenge in supplying the public with wider access to informational databases is the need for knowledge of database languages like Structured Query Language (SQL). Although the SQL language has been published in many forms, not everybody is able to write SQL queries. Another challenge is that it may not be practical to make the public aware of the structure of the database. There is a need for novice users to query relational databases in their natural language. To solve this problem, many natural language interfaces to structured databases have been developed. The goal is to provide a more intuitive method for generating database queries and delivering responses. Social media makes it possible to interact with a wide section of the population. Through this medium, and with the help of Natural Language Processing (NLP), we can make the data of the Atmospheric Radiation Measurement Data Center (ADC) more accessible to the public. We propose an architecture for using Apache Lucene/Solr [1], OpenML [2,3], and Kafka [4] to generate an automated query/response system with inputs from Twitter [5], our Cassandra DB, and our log database. Using the Twitter API and NLP, we can give the public the ability to ask questions of our database and get automated responses.
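    The core idea of a natural language interface to a relational database can be sketched as keyword spotting over parameterized SQL templates. Everything below (the table, columns, and site/variable vocabularies) is a hypothetical illustration, not the actual ADC schema or proposed pipeline:

```python
import re

# Hypothetical schema and controlled vocabularies; a real system would
# draw these from the database catalogue rather than hard-code them.
TEMPLATE = "SELECT * FROM measurements WHERE site = ? AND variable = ?"
KNOWN_SITES = {"sgp", "nsa", "ena"}
KNOWN_VARIABLES = {"temperature", "precipitation", "radiation"}

def question_to_query(question):
    """Map recognized entities in a free-text question to a SQL template."""
    tokens = set(re.findall(r"[a-z]+", question.lower()))
    site = next(iter(tokens & KNOWN_SITES), None)
    variable = next(iter(tokens & KNOWN_VARIABLES), None)
    if site and variable:
        return TEMPLATE, (site, variable)
    return None, None  # question not understood

sql, params = question_to_query("What is the temperature at SGP today?")
print(params)  # ('sgp', 'temperature')
```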

  8. Actionable, long-term stable and semantic web compatible identifiers for access to biological collection objects

    PubMed Central

    Hyam, Roger; Hagedorn, Gregor; Chagnoux, Simon; Röpert, Dominik; Casino, Ana; Droege, Gabi; Glöckler, Falko; Gödderz, Karsten; Groom, Quentin; Hoffmann, Jana; Holleman, Ayco; Kempa, Matúš; Koivula, Hanna; Marhold, Karol; Nicolson, Nicky; Smith, Vincent S.; Triebel, Dagmar

    2017-01-01

    With biodiversity research activities being increasingly shifted to the web, the need for a system of persistent and stable identifiers for physical collection objects becomes increasingly pressing. The Consortium of European Taxonomic Facilities agreed on a common system of HTTP-URI-based stable identifiers which is now rolled out to its member organizations. The system follows Linked Open Data principles and implements redirection mechanisms to human-readable and machine-readable representations of specimens facilitating seamless integration into the growing semantic web. The implementation of stable identifiers across collection organizations is supported with open source provider software scripts, best practices documentations and recommendations for RDF metadata elements facilitating harmonized access to collection information in web portals. Database URL: http://cetaf.org/cetaf-stable-identifiers PMID:28365724
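    The redirection mechanism described above rests on standard HTTP content negotiation: the same stable URI is dereferenced with different Accept headers, and the server redirects to a human-readable (HTML) or machine-readable (RDF) representation accordingly. A minimal sketch of the client side; the specimen URI is illustrative, not a real CETAF identifier:

```python
from urllib.request import Request

def specimen_request(uri, machine_readable=False):
    """Build a request for a stable specimen identifier.

    The same URI is used in both cases; only the Accept header tells the
    server which representation to redirect to.
    """
    accept = "application/rdf+xml" if machine_readable else "text/html"
    return Request(uri, headers={"Accept": accept})

req = specimen_request("http://example.org/specimen/12345", machine_readable=True)
print(req.get_header("Accept"))  # application/rdf+xml
```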

  9. A Scalable Data Access Layer to Manage Structured Heterogeneous Biomedical Data.

    PubMed

    Delussu, Giovanni; Lianas, Luca; Frexia, Francesca; Zanetti, Gianluigi

    2016-01-01

    This work presents a scalable data access layer, called PyEHR, designed to support the implementation of data management systems for secondary use of structured heterogeneous biomedical and clinical data. PyEHR adopts the openEHR's formalisms to guarantee the decoupling of data descriptions from implementation details and exploits structure indexing to accelerate searches. Data persistence is guaranteed by a driver layer with a common driver interface. Interfaces for two NoSQL Database Management Systems are already implemented: MongoDB and Elasticsearch. We evaluated the scalability of PyEHR experimentally through two types of tests, called "Constant Load" and "Constant Number of Records", with queries of increasing complexity on synthetic datasets of ten million records each, containing very complex openEHR archetype structures, distributed on up to ten computing nodes.
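    The decoupling described above, with a common driver interface behind which concrete back ends sit, can be sketched as a small abstract base class. The method names and the in-memory back end below are illustrative assumptions, not PyEHR's actual API; real implementations would wrap MongoDB or Elasticsearch clients.

```python
from abc import ABC, abstractmethod

class RecordDriver(ABC):
    """Common interface every persistence back end must implement."""
    @abstractmethod
    def save(self, record: dict) -> str: ...
    @abstractmethod
    def find(self, query: dict) -> list: ...

class InMemoryDriver(RecordDriver):
    """Dictionary-backed stand-in used here instead of MongoDB/Elasticsearch."""
    def __init__(self):
        self._store = {}
    def save(self, record):
        key = str(len(self._store))
        self._store[key] = record
        return key
    def find(self, query):
        # Return records whose fields all match the query.
        return [r for r in self._store.values()
                if all(r.get(k) == v for k, v in query.items())]

db = InMemoryDriver()
db.save({"archetype": "openEHR-EHR-OBSERVATION.blood_pressure.v1",
         "systolic": 120})
print(len(db.find({"systolic": 120})))  # 1
```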

  10. Physical Science Informatics: Providing Open Science Access to Microheater Array Boiling Experiment Data

    NASA Technical Reports Server (NTRS)

    McQuillen, John; Green, Robert D.; Henrie, Ben; Miller, Teresa; Chiaramonte, Fran

    2014-01-01

    The Physical Science Informatics (PSI) system is the next step in an effort to make NASA-sponsored flight data available to the scientific and engineering community, along with the general public. The experimental data, drawn from six disciplines, including Combustion Science, Fluid Physics, Complex Fluids, Fundamental Physics, and Materials Science, will present some unique challenges. Besides data in textual or numerical format, large portions of both the raw and analyzed data for many of these experiments are digital images and video, imposing large data storage requirements. In addition, the accessible data will include experiment design and engineering data (including applicable drawings), any analytical or numerical models, publications, reports, and patents, and any commercial products developed as a result of the research. The objectives of this paper are to present the preliminary layout (Figure 2) of MABE data within the PSI database, to obtain feedback on the layout, and to present the procedure for obtaining access to this database.

  11. An open access database for the evaluation of heart sound algorithms.

    PubMed

    Liu, Chengyu; Springer, David; Li, Qiao; Moody, Benjamin; Juan, Ricardo Abad; Chorro, Francisco J; Castells, Francisco; Roig, José Millet; Silva, Ikaro; Johnson, Alistair E W; Syed, Zeeshan; Schmidt, Samuel E; Papadaniil, Chrysa D; Hadjileontiadis, Leontios; Naseri, Hosein; Moukadem, Ali; Dieterlen, Alain; Brandt, Christian; Tang, Hong; Samieinasab, Maryam; Samieinasab, Mohammad Reza; Sameni, Reza; Mark, Roger G; Clifford, Gari D

    2016-12-01

    In the past few decades, analysis of heart sound signals (i.e. the phonocardiogram or PCG), especially for automated heart sound segmentation and classification, has been widely studied and has been reported to have the potential value to detect pathology accurately in clinical applications. However, comparative analyses of algorithms in the literature have been hindered by the lack of high-quality, rigorously validated, and standardized open databases of heart sound recordings. This paper describes a public heart sound database, assembled for an international competition, the PhysioNet/Computing in Cardiology (CinC) Challenge 2016. The archive comprises nine different heart sound databases sourced from multiple research groups around the world. It includes 2435 heart sound recordings in total collected from 1297 healthy subjects and patients with a variety of conditions, including heart valve disease and coronary artery disease. The recordings were collected from a variety of clinical or nonclinical (such as in-home visits) environments and equipment. The length of recording varied from several seconds to several minutes. This article reports detailed information about the subjects/patients including demographics (number, age, gender), recordings (number, location, state and time length), associated synchronously recorded signals, sampling frequency and sensor type used. We also provide a brief summary of the commonly used heart sound segmentation and classification methods, including open source code provided concurrently for the Challenge. A description of the PhysioNet/CinC Challenge 2016, including the main aims, the training and test sets, the hand corrected annotations for different heart sound states, the scoring mechanism, and associated open source code are provided. In addition, several potential benefits from the public heart sound database are discussed.

  12. Supporting Building Portfolio Investment and Policy Decision Making through an Integrated Building Utility Data Platform

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Aziz, Azizan; Lasternas, Bertrand; Alschuler, Elena

    The American Recovery and Reinvestment Act stimulus funding of 2009 for smart grid projects resulted in the tripling of smart meter deployments. In 2012, the Green Button initiative provided utility customers with access to their real-time energy usage. The availability of finely granular data provides an enormous potential for energy data analytics and energy benchmarking. The sheer volume of time-series utility data from a large number of buildings also poses challenges in data collection, quality control, and database management for rigorous and meaningful analyses. In this paper, we describe a building portfolio-level data analytics tool for operational optimization, business investment, and policy assessment using utility data at intervals ranging from 15 minutes to monthly. The analytics tool is developed on top of the U.S. Department of Energy's Standard Energy Efficiency Data (SEED) platform, an open-source software application that manages energy performance data of large groups of buildings. To support the significantly large volume of granular interval data, we integrated a parallel time-series database with the existing relational database. The time-series database improves on the current utility data input, focusing on real-time data collection, storage, analytics, and data quality control. The fully integrated data platform supports APIs for utility app development by third-party software developers. These apps will provide actionable intelligence for building owners and facilities managers. Unlike a commercial system, this platform is an open-source platform funded by the U.S. Government, accessible to the public, researchers, and other developers, to support initiatives in reducing building energy consumption.
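    The kind of aggregation such a platform performs on granular interval data can be sketched as a roll-up from 15-minute meter readings to a monthly total before the result joins the relational benchmarking records. The readings below are fabricated for illustration; a real deployment would pull them from the time-series store.

```python
from collections import defaultdict
from datetime import datetime, timedelta

# Fabricated 15-minute meter readings (1.5 kWh each) covering two days.
readings = []
t = datetime(2015, 3, 1)
while t < datetime(2015, 3, 3):
    readings.append((t, 1.5))
    t += timedelta(minutes=15)

# Roll up interval data to a monthly total keyed by (year, month).
monthly = defaultdict(float)
for ts, kwh in readings:
    monthly[(ts.year, ts.month)] += kwh

print(monthly[(2015, 3)])  # 96 intervals/day * 2 days * 1.5 kWh = 288.0
```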

  13. Large-Scale Event Extraction from Literature with Multi-Level Gene Normalization

    PubMed Central

    Wei, Chih-Hsuan; Hakala, Kai; Pyysalo, Sampo; Ananiadou, Sophia; Kao, Hung-Yu; Lu, Zhiyong; Salakoski, Tapio; Van de Peer, Yves; Ginter, Filip

    2013-01-01

    Text mining for the life sciences aims to aid database curation, knowledge summarization and information retrieval through the automated processing of biomedical texts. To provide comprehensive coverage and enable full integration with existing biomolecular database records, it is crucial that text mining tools scale up to millions of articles and that their analyses can be unambiguously linked to information recorded in resources such as UniProt, KEGG, BioGRID and NCBI databases. In this study, we investigate how fully automated text mining of complex biomolecular events can be augmented with a normalization strategy that identifies biological concepts in text, mapping them to identifiers at varying levels of granularity, ranging from canonicalized symbols to unique genes and proteins and to broad gene families. To this end, we have combined two state-of-the-art text mining components, previously evaluated on two community-wide challenges, and have extended and improved upon these methods by exploiting their complementary nature. Using these systems, we perform normalization and event extraction to create a large-scale resource that is publicly available, unique in semantic scope, and covers all 21.9 million PubMed abstracts and 460 thousand PubMed Central open access full-text articles. This dataset contains 40 million biomolecular events involving 76 million gene/protein mentions, linked to 122 thousand distinct genes from 5032 species across the full taxonomic tree. Detailed evaluations and analyses reveal promising results for application of this data in database and pathway curation efforts. The main software components used in this study are released under an open-source license. Further, the resulting dataset is freely accessible through a novel API, providing programmatic and customized access (http://www.evexdb.org/api/v001/). 
Finally, to allow for large-scale bioinformatic analyses, the entire resource is available for bulk download from http://evexdb.org/download/, under the Creative Commons – Attribution – Share Alike (CC BY-SA) license. PMID:23613707
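The normalization strategy described above, which maps gene/protein mentions in text to identifiers at varying granularity, can be sketched as a dictionary lookup over canonicalized surface forms. The lexicon below is a toy stand-in, not EVEX's actual resources; only the lookup pattern is illustrated:

```python
# Toy illustration of gene-mention normalization: canonicalize a surface
# form and map it to identifiers at two granularities (specific gene vs.
# gene family). The lexicon is a small invented stand-in.

LEXICON = {
    "p53":  {"gene_id": "GeneID:7157", "family": "TP53 family"},
    "tp53": {"gene_id": "GeneID:7157", "family": "TP53 family"},
    "erk1": {"gene_id": "GeneID:5595", "family": "MAPK family"},
    "erk2": {"gene_id": "GeneID:5594", "family": "MAPK family"},
}

def normalize(mention: str) -> dict:
    """Canonicalize a mention (strip, lowercase) and look it up."""
    entry = LEXICON.get(mention.strip().lower())
    if entry is None:
        return {"mention": mention, "gene_id": None, "family": None}
    return {"mention": mention, **entry}

print(normalize("TP53"))  # resolves to GeneID:7157, TP53 family
```

A production system additionally disambiguates by species and context; this sketch shows only the string-to-identifier mapping step.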

  14. Forest-Observation-System.net - towards a global in-situ data repository for biomass datasets validation

    NASA Astrophysics Data System (ADS)

    Shchepashchenko, D.; Chave, J.; Phillips, O. L.; Davies, S. J.; Lewis, S. L.; Perger, C.; Dresel, C.; Fritz, S.; Scipal, K.

    2017-12-01

    Forest monitoring is high on the scientific and political agenda. Global measurements of forest height, biomass and how they change with time are urgently needed as essential climate and ecosystem variables. The Forest Observation System - FOS (http://forest-observation-system.net/) is an international cooperation to establish a global in-situ forest biomass database to support Earth observation and to encourage investment in relevant field-based observations and science. FOS aims to link the Remote Sensing (RS) community with ecologists who measure forest biomass and estimate biodiversity in the field, for common benefit. For the RS community, FOS offers partnership with the most established teams and networks that manage permanent forest plots globally, overcoming data sharing issues and introducing a standard biomass data flow from tree-level measurement to plot-level aggregation, served in the form most suitable for RS users. Ecologists benefit from FOS through improved access to global biomass information, data standards, gap identification and potentially improved funding opportunities to address the known gaps and deficiencies in the data. FOS closely collaborates with the Center for Tropical Forest Science (CTFS-ForestGEO), ForestPlots.net (incl. RAINFOR, AfriTRON and T-FORCES), AusCover, the Tropical managed Forests Observatory and the IIASA network. FOS is an open initiative, and other networks and teams are most welcome to join. The online database provides open access to both metadata (e.g. who conducted the measurements, where, and which parameters) and actual data for a subset of plots where the authors have granted access. The minimum set of database values includes: principal investigator and institution, plot coordinates, number of trees, forest type and tree species composition, wood density, canopy height and above-ground biomass of trees. Plot size is 0.25 ha or larger.
The database will be essential for validating and calibrating satellite observations and various models.
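The minimum set of plot-level values listed above can be sketched as a simple record type with the stated plot-size criterion attached. The field names below are illustrative, not the database's actual schema:

```python
from dataclasses import dataclass

# Sketch of the minimum FOS plot record described in the abstract.
# Field names and units are assumptions for illustration only.

@dataclass
class ForestPlot:
    principal_investigator: str
    institution: str
    latitude: float          # decimal degrees
    longitude: float         # decimal degrees
    n_trees: int
    forest_type: str
    wood_density: float      # stand-mean, g/cm^3
    canopy_height_m: float
    agb_mg_per_ha: float     # above-ground biomass of trees
    plot_size_ha: float

    def meets_fos_size(self) -> bool:
        """FOS plots are 0.25 ha or larger."""
        return self.plot_size_ha >= 0.25

plot = ForestPlot("J. Doe", "Example Inst.", -3.1, -60.0, 612,
                  "terra firme", 0.62, 28.5, 310.0, 1.0)
print(plot.meets_fos_size())  # True
```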

  15. 3DSDSCAR--a three dimensional structural database for sialic acid-containing carbohydrates through molecular dynamics simulation.

    PubMed

    Veluraja, Kasinadar; Selvin, Jeyasigamani F A; Venkateshwari, Selvakumar; Priyadarzini, Thanu R K

    2010-09-23

    The inherent flexibility and lack of strong intramolecular interactions of oligosaccharides demand the use of theoretical methods for their structural elucidation. Despite these methodological developments, comparatively little glycoinformatics research has been done to date relative to bioinformatics research on proteins and nucleic acids. We have developed a three-dimensional structural database for sialic acid-containing carbohydrates (3DSDSCAR). This is an open-access database that provides 3D structural models of a given sialic acid-containing carbohydrate. At present, 3DSDSCAR contains 60 conformational models, belonging to 14 different sialic acid-containing carbohydrates, deduced through 10 ns molecular dynamics (MD) simulations. The database is available at the URL: http://www.3dsdscar.org. Copyright 2010 Elsevier Ltd. All rights reserved.

  16. JASPAR 2016: a major expansion and update of the open-access database of transcription factor binding profiles.

    PubMed

    Mathelier, Anthony; Fornes, Oriol; Arenillas, David J; Chen, Chih-Yu; Denay, Grégoire; Lee, Jessica; Shi, Wenqiang; Shyr, Casper; Tan, Ge; Worsley-Hunt, Rebecca; Zhang, Allen W; Parcy, François; Lenhard, Boris; Sandelin, Albin; Wasserman, Wyeth W

    2016-01-04

    JASPAR (http://jaspar.genereg.net) is an open-access database storing curated, non-redundant transcription factor (TF) binding profiles, represented as position frequency matrices, for multiple species in six taxonomic groups. For this 2016 release, we expanded the JASPAR CORE collection with 494 new TF binding profiles (315 in vertebrates, 11 in nematodes, 3 in insects, 1 in fungi and 164 in plants) and updated 59 profiles (58 in vertebrates and 1 in fungi). The introduced profiles represent an 83% expansion and 10% update when compared to the previous release. We updated the structural annotation of the TF DNA binding domains (DBDs) following a published hierarchical structural classification. In addition, we introduced 130 transcription factor flexible models trained on ChIP-seq data for vertebrates, which capture dinucleotide dependencies within TF binding sites. This new JASPAR release is accompanied by a new web tool to infer JASPAR TF binding profiles recognized by a given TF protein sequence. Moreover, we provide users with a Ruby module complementing the JASPAR API to ease programmatic access and use of the JASPAR collection of profiles. Finally, we provide the JASPAR2016 R/Bioconductor data package with the data of this release. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
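Position frequency matrices like those JASPAR stores are commonly converted to position weight matrices of log-odds scores before scanning sequences. A minimal sketch, assuming a pseudocount and a uniform background (conventional choices, not JASPAR-specific settings):

```python
import math

# Convert a position frequency matrix (PFM) of observed base counts into a
# position weight matrix (PWM) of log2-odds scores against a background.
# The pseudocount avoids log(0) for unobserved bases.

def pfm_to_pwm(pfm, pseudocount=0.8, background=0.25):
    """pfm: dict mapping base -> list of counts, one per motif position."""
    length = len(next(iter(pfm.values())))
    pwm = {base: [] for base in pfm}
    for i in range(length):
        total = sum(pfm[b][i] for b in pfm) + pseudocount
        for b in pfm:
            p = (pfm[b][i] + pseudocount / len(pfm)) / total
            pwm[b].append(math.log2(p / background))
    return pwm

# Two-position toy motif: A dominates position 1, C dominates position 2.
pfm = {"A": [10, 0], "C": [0, 10], "G": [0, 0], "T": [0, 0]}
pwm = pfm_to_pwm(pfm)
print(round(pwm["A"][0], 2))  # strongly positive: A favoured at position 1
```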

  17. CMD: a Cotton Microsatellite Database resource for Gossypium genomics

    PubMed Central

    Blenda, Anna; Scheffler, Jodi; Scheffler, Brian; Palmer, Michael; Lacape, Jean-Marc; Yu, John Z; Jesudurai, Christopher; Jung, Sook; Muthukumar, Sriram; Yellambalase, Preetham; Ficklin, Stephen; Staton, Margaret; Eshelman, Robert; Ulloa, Mauricio; Saha, Sukumar; Burr, Ben; Liu, Shaolin; Zhang, Tianzhen; Fang, Deqiu; Pepper, Alan; Kumpatla, Siva; Jacobs, John; Tomkins, Jeff; Cantrell, Roy; Main, Dorrie

    2006-01-01

    Background The Cotton Microsatellite Database (CMD) is a curated and integrated web-based relational database providing centralized access to publicly available cotton microsatellites, an invaluable resource for basic and applied research in cotton breeding. Description At present CMD contains publication, sequence, primer, mapping and homology data for nine major cotton microsatellite projects, collectively representing 5,484 microsatellites. In addition, CMD displays data for three of the microsatellite projects that have been screened against a panel of core germplasm. The standardized panel consists of 12 diverse genotypes including genetic standards, mapping parents, BAC donors, subgenome representatives, unique breeding lines, exotic introgression sources, and contemporary Upland cottons with significant acreage. A suite of online microsatellite data mining tools is accessible at CMD. These include an SSR server which identifies microsatellites, primers, open reading frames, and GC-content of uploaded sequences; BLAST and FASTA servers providing sequence similarity searches against the existing cotton SSR sequences and primers; a CAP3 server to assemble EST sequences into longer transcripts prior to mining for SSRs; and CMap, a viewer for comparing cotton SSR maps. Conclusion The collection of publicly available cotton SSR markers in a centralized, readily accessible and curated web-enabled database provides a more efficient utilization of microsatellite resources and will help accelerate basic and applied research in molecular breeding and genetic mapping in Gossypium spp. PMID:16737546

  18. pvsR: An Open Source Interface to Big Data on the American Political Sphere.

    PubMed

    Matter, Ulrich; Stutzer, Alois

    2015-01-01

    Digital data from the political sphere is abundant, omnipresent, and more and more directly accessible through the Internet. Project Vote Smart (PVS) is a prominent example of this big public data and covers various aspects of U.S. politics in astonishing detail. Despite the vast potential of PVS' data for political science, economics, and sociology, it is hardly used in empirical research. The systematic compilation of semi-structured data can be complicated and time-consuming as the data format is not designed for conventional scientific research. This paper presents a new tool that makes the data easily accessible to a broad scientific community. We provide the software called pvsR as an add-on to the R programming environment for statistical computing. This open source interface (OSI) serves as a direct link between a statistical analysis and the large PVS database. The free and open code is expected to substantially reduce the cost of research with PVS' new big public data in a vast variety of possible applications. We discuss its advantages vis-à-vis traditional methods of data generation as well as already existing interfaces. The validity of the library is documented based on an illustration involving female representation in local politics. In addition, pvsR facilitates the replication of research with PVS data at low cost, including the pre-processing of data. Similar OSIs are recommended for other big public databases.

  19. The U.S. Geological Survey Monthly Water Balance Model Futures Portal

    USGS Publications Warehouse

    Bock, Andrew R.; Hay, Lauren E.; Markstrom, Steven L.; Emmerich, Christopher; Talbert, Marian

    2017-05-03

    The U.S. Geological Survey Monthly Water Balance Model Futures Portal (https://my.usgs.gov/mows/) is a user-friendly interface that summarizes monthly historical and simulated future conditions for seven hydrologic and meteorological variables (actual evapotranspiration, potential evapotranspiration, precipitation, runoff, snow water equivalent, atmospheric temperature, and streamflow) at locations across the conterminous United States (CONUS). The estimates of these hydrologic and meteorological variables were derived using a Monthly Water Balance Model (MWBM), a modular system that simulates monthly estimates of components of the hydrologic cycle using monthly precipitation and atmospheric temperature inputs. Precipitation and atmospheric temperature from 222 climate datasets spanning historical conditions (1952 through 2005) and simulated future conditions (2020 through 2099) were summarized for hydrographic features and used to drive the MWBM for the CONUS. The MWBM input and output variables were organized into an open-access database. An Open Geospatial Consortium, Inc., Web Feature Service allows the querying and identification of hydrographic features across the CONUS. To connect the Web Feature Service to the open-access database, a user interface—the Monthly Water Balance Model Futures Portal—was developed to allow the dynamic generation of summary files and plots based on plot type, geographic location, specific climate datasets, period of record, MWBM variable, and other options. Both the plots and the data files are made available to the user for download.
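As a rough illustration of the bucket-style accounting a monthly water balance model performs (precipitation fills soil storage, evapotranspiration drains it, overflow becomes runoff), consider the toy model below. It is a pedagogical sketch only, not the USGS MWBM, whose structure, parameters, and snow components differ:

```python
# Toy monthly water-balance bucket: soil storage gains precipitation,
# loses actual evapotranspiration (capped by available water), and spills
# runoff once a fixed storage capacity is exceeded. All quantities in mm.

def step_month(storage, precip, pet, capacity=150.0):
    """Advance one month. Returns (new_storage, aet, runoff)."""
    water = storage + precip
    aet = min(pet, water)                # actual ET limited by supply
    water -= aet
    runoff = max(0.0, water - capacity)  # overflow once the bucket is full
    return water - runoff, aet, runoff

storage = 100.0
for precip, pet in [(120.0, 60.0), (20.0, 90.0)]:
    storage, aet, runoff = step_month(storage, precip, pet)
    print(f"storage={storage:.0f} aet={aet:.0f} runoff={runoff:.0f}")
```

Note that each step conserves water: initial storage plus precipitation equals new storage plus evapotranspiration plus runoff.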

  20. Peer-to-peer architecture for multi-departmental distributed PACS

    NASA Astrophysics Data System (ADS)

    Rosset, Antoine; Heuberger, Joris; Pysher, Lance; Ratib, Osman

    2006-03-01

    We have elected to explore peer-to-peer technology as an alternative to centralized PACS architecture to meet the increasing requirements for wide access to images inside and outside a radiology department. The goal is to allow users across the enterprise to access any study anytime without the need for prefetching or routing of images from a central archive. Images can be accessed between different workstations and local storage nodes. We implemented Bonjour, Apple's zero-configuration networking technology, which allows applications to share data and files remotely with optimized data access and data transfer. Our open-source image display platform, OsiriX, was adapted to share local DICOM images by making each workstation's local SQL database directly accessible from any other OsiriX workstation over the network. A server version of the OsiriX Core Data database likewise allows access to distributed archive servers. The infrastructure implemented allows fast and efficient access to any image, anywhere, anytime, independently of the actual physical location of the data. It also benefits from the performance of distributed low-cost, high-capacity storage servers, which provide caching of PACS data that was found to be 10 to 20 times faster than accessing the same data from the central PACS archive. It is particularly suitable for large hospitals and academic environments where clinical conferences, interdisciplinary discussions and successive sessions of image processing are often part of complex workflows for patient management and decision making.

  1. Study of Large Data Resources for Multilingual Training and System Porting (Pub Version, Open Access)

    DTIC Science & Technology

    2016-05-03

    extraction trained on a large database corpus – English Fisher. Although the performance of the ported monolingual system would be worse in comparison...

        Language      TE     LI     HA     LA     ZU
        LLP hours     8.6    9.6    7.9    8.1    8.4
        LM sentences  11935  10743  9861   11577  10644
        LM words      68175  83157  93131  93328  60832
        dictionary    14505

  2. mantisGRID: a grid platform for DICOM medical images management in Colombia and Latin America.

    PubMed

    Garcia Ruiz, Manuel; Garcia Chaves, Alvin; Ruiz Ibañez, Carlos; Gutierrez Mazo, Jorge Mario; Ramirez Giraldo, Juan Carlos; Pelaez Echavarria, Alejandro; Valencia Diaz, Edison; Pelaez Restrepo, Gustavo; Montoya Munera, Edwin Nelson; Garcia Loaiza, Bernardo; Gomez Gonzalez, Sebastian

    2011-04-01

    This paper presents the mantisGRID project, an interinstitutional initiative from Colombian medical and academic centers aiming to provide medical grid services for Colombia and Latin America. The mantisGRID is a GRID platform, based on open source grid infrastructure, that provides the necessary services to access and exchange medical images and associated information following digital imaging and communications in medicine (DICOM) and health level 7 standards. The paper focuses first on the data abstraction architecture, which is achieved via Open Grid Services Architecture Data Access and Integration (OGSA-DAI) services and supported by the Globus Toolkit. The grid currently uses a 30-Mb bandwidth of the Colombian High Technology Academic Network, RENATA, connected to Internet 2. The paper also discusses the relational database created to handle the DICOM objects, which were represented using Extensible Markup Language Schema documents, as well as other features implemented such as data security, user authentication, and patient confidentiality. Grid performance was tested using the three current operative nodes, and the results demonstrated comparable query times between the mantisGRID (OGSA-DAI) and distributed MySQL databases, especially for a large number of records.

  3. Database Resources of the BIG Data Center in 2018.

    PubMed

    2018-01-04

    The BIG Data Center at Beijing Institute of Genomics (BIG) of the Chinese Academy of Sciences provides free, open access to a suite of database resources in support of worldwide research activities in both academia and industry. With the vast amounts of omics data generated at ever-greater scales and rates, the BIG Data Center is continually expanding, updating and enriching its core database resources through big-data integration and value-added curation, including BioCode (a repository archiving bioinformatics tool codes), BioProject (a biological project library), BioSample (a biological sample library), Genome Sequence Archive (GSA, a data repository for archiving raw sequence reads), Genome Warehouse (GWH, a centralized resource housing genome-scale data), Genome Variation Map (GVM, a public repository of genome variations), Gene Expression Nebulas (GEN, a database of gene expression profiles based on RNA-Seq data), Methylation Bank (MethBank, an integrated databank of DNA methylomes), and Science Wikis (a series of biological knowledge wikis for community annotations). In addition, three featured web services are provided, viz., BIG Search (search as a service; a scalable inter-domain text search engine), BIG SSO (single sign-on as a service; a user access control system to gain access to multiple independent systems with a single ID and password) and Gsub (submission as a service; a unified submission service for all relevant resources). All of these resources are publicly accessible through the home page of the BIG Data Center at http://bigd.big.ac.cn. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.

  4. Scale out databases for CERN use cases

    NASA Astrophysics Data System (ADS)

    Baranowski, Zbigniew; Grzybek, Maciej; Canali, Luca; Lanza Garcia, Daniel; Surdy, Kacper

    2015-12-01

    Data generation rates are expected to grow very fast for some database workloads going into LHC run 2 and beyond. In particular this is expected for data coming from controls, logging and monitoring systems. Storing, administering and accessing big data sets in a relational database system can quickly become a very hard technical challenge, as the size of the active data set and the number of concurrent users increase. Scale-out database technologies are a rapidly developing set of solutions for deploying and managing very large data warehouses on commodity hardware and with open source software. In this paper we will describe the architecture and tests on database systems based on Hadoop and the Cloudera Impala engine. We will discuss the results of our tests, including tests of data loading and integration with existing data sources and in particular with relational databases. We will report on query performance tests done with various data sets of interest at CERN, notably data from the accelerator log database.

  5. ClearedLeavesDB: an online database of cleared plant leaf images

    PubMed Central

    2014-01-01

    Background Leaf vein networks are critical to both the structure and function of leaves. A growing body of recent work has linked leaf vein network structure to the physiology, ecology and evolution of land plants. In the process, multiple institutions and individual researchers have assembled collections of cleared leaf specimens in which vascular bundles (veins) are rendered visible. In an effort to facilitate analysis and digitally preserve these specimens, high-resolution images are usually created, either of entire leaves or of magnified leaf subsections. In a few cases, collections of digital images of cleared leaves are available for use online. However, these collections do not share a common platform nor is there a means to digitally archive cleared leaf images held by individual researchers (in addition to those held by institutions). Hence, there is a growing need for a digital archive that enables online viewing, sharing and disseminating of cleared leaf image collections held by both institutions and individual researchers. Description The Cleared Leaf Image Database (ClearedLeavesDB) is a web-based resource for a community of researchers to contribute, access and share cleared leaf images. ClearedLeavesDB leverages resources of large-scale, curated collections while enabling the aggregation of small-scale collections within the same online platform. ClearedLeavesDB is built on Drupal, an open source content management platform. It allows plant biologists to store leaf images online with corresponding meta-data, share image collections with a user community and discuss images and collections via a common forum. We provide tools to upload processed images and results to the database via a web services client application that can be downloaded from the database. Conclusions We developed ClearedLeavesDB, a database focusing on cleared leaf images that combines interactions between users and data via an intuitive web interface.
The web interface allows storage of large collections and integrates with leaf image analysis applications via an open application programming interface (API). The open API allows uploading of processed images and other trait data to the database, further enabling distribution and documentation of analyzed data within the community. The initial database is seeded with nearly 19,000 cleared leaf images representing over 40 GB of image data. Extensible storage and growth of the database is ensured by using the data storage resources of the iPlant Discovery Environment. ClearedLeavesDB can be accessed at http://clearedleavesdb.org. PMID:24678985

  6. ClearedLeavesDB: an online database of cleared plant leaf images.

    PubMed

    Das, Abhiram; Bucksch, Alexander; Price, Charles A; Weitz, Joshua S

    2014-03-28

    Leaf vein networks are critical to both the structure and function of leaves. A growing body of recent work has linked leaf vein network structure to the physiology, ecology and evolution of land plants. In the process, multiple institutions and individual researchers have assembled collections of cleared leaf specimens in which vascular bundles (veins) are rendered visible. In an effort to facilitate analysis and digitally preserve these specimens, high-resolution images are usually created, either of entire leaves or of magnified leaf subsections. In a few cases, collections of digital images of cleared leaves are available for use online. However, these collections do not share a common platform nor is there a means to digitally archive cleared leaf images held by individual researchers (in addition to those held by institutions). Hence, there is a growing need for a digital archive that enables online viewing, sharing and disseminating of cleared leaf image collections held by both institutions and individual researchers. The Cleared Leaf Image Database (ClearedLeavesDB) is a web-based resource for a community of researchers to contribute, access and share cleared leaf images. ClearedLeavesDB leverages resources of large-scale, curated collections while enabling the aggregation of small-scale collections within the same online platform. ClearedLeavesDB is built on Drupal, an open source content management platform. It allows plant biologists to store leaf images online with corresponding meta-data, share image collections with a user community and discuss images and collections via a common forum. We provide tools to upload processed images and results to the database via a web services client application that can be downloaded from the database. We developed ClearedLeavesDB, a database focusing on cleared leaf images that combines interactions between users and data via an intuitive web interface.
The web interface allows storage of large collections and integrates with leaf image analysis applications via an open application programming interface (API). The open API allows uploading of processed images and other trait data to the database, further enabling distribution and documentation of analyzed data within the community. The initial database is seeded with nearly 19,000 cleared leaf images representing over 40 GB of image data. Extensible storage and growth of the database is ensured by using the data storage resources of the iPlant Discovery Environment. ClearedLeavesDB can be accessed at http://clearedleavesdb.org.

  7. Examination of Industry Payments to Radiation Oncologists in 2014 Using the Centers for Medicare and Medicaid Services Open Payments Database.

    PubMed

    Jairam, Vikram; Yu, James B

    2016-01-01

    To use the Centers for Medicare and Medicaid Services Open Payments database to characterize payments made to radiation oncologists and compare their payment profile with that of medical and surgical oncologists. The June 2015 release of the Open Payments database was accessed, containing all payments made to physicians in 2014. The general payments dataset was used for analysis. Data on payments made to medical, surgical, and radiation oncologists were obtained and compared. Within radiation oncology, data regarding payment category, sponsorship, and geographic distribution were identified. Basic statistics including mean, median, range, and sum were calculated by provider and by transaction. Among the 3 oncologic specialties, radiation oncology had the smallest proportion (58%) of compensated physicians and the lowest mean ($1620) and median ($112) payment per provider. Surgical oncology had the highest proportion (84%) of compensated physicians, whereas medical oncology had the highest mean ($6371) and median ($448) payment per physician. Within radiation oncology, nonconsulting services accounted for the largest payments to physicians ($1,042,556), whereas the majority of the sponsors were medical device companies (52%). Radiation oncologists in the West accepted the most money ($2,041,603) of any US Census region. Radiation oncologists in 2014 received a large number of payments from industry, although less than their medical or surgical counterparts. As the Open Payments database continues to be improved, it remains to be seen whether this information will be used by patients to inform choice of providers or by lawmakers to enact policy regulating physician-industry relationships. Copyright © 2016 Elsevier Inc. All rights reserved.
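The per-provider summaries reported above (share of compensated physicians, mean and median payment per provider) are straightforward to compute from payment records. The sketch below uses invented figures, not the 2014 Open Payments data:

```python
from statistics import mean, median

# Per-provider payment summary of the kind reported in the abstract:
# total each provider's general payments, then compute the share of
# compensated providers and the mean/median total among them.

payments = {                  # provider -> list of general payments ($)
    "A": [15.0, 120.0, 1500.0],
    "B": [40.0],
    "C": [],                  # provider with no industry payments
}

totals = {p: sum(v) for p, v in payments.items()}
compensated = [t for t in totals.values() if t > 0]
share = len(compensated) / len(totals)
print(f"compensated: {share:.0%}, "
      f"mean ${mean(compensated):.2f}, median ${median(compensated):.2f}")
```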

  8. Authorial and institutional stratification in open access publishing: the case of global health research.

    PubMed

    Siler, Kyle; Haustein, Stefanie; Smith, Elise; Larivière, Vincent; Alperin, Juan Pablo

    2018-01-01

    Using a database of recent articles published in the field of Global Health research, we examine institutional sources of stratification in publishing access outcomes. Traditionally, research on inequality in scientific publishing has focused on prestige hierarchies in established print journals. This project examines stratification in contemporary publishing with a particular focus on subscription vs. various Open Access (OA) publishing options. Findings show that authors working at lower-ranked universities are more likely to publish in closed/paywalled outlets, and less likely to choose outlets that involve an Article Processing Charge (APC; gold or hybrid OA). We also analyze institutional differences and stratification in the APC costs paid in various journals. Authors affiliated with higher-ranked institutions, as well as with hospitals and non-profit organizations, pay relatively higher APCs for gold and hybrid OA publications. The results suggest that authors affiliated with high-ranked universities and well-funded institutions tend to have more resources to pay for publishing options. Our research suggests that new professional hierarchies are developing in contemporary publishing, where various OA publishing options are becoming increasingly prominent. Just as there is stratification in institutional representation between different types of publishing access, there is also inequality within access types.

  9. ScienceDirect through SciVerse: a new way to approach Elsevier.

    PubMed

    Bengtson, Jason

    2011-01-01

    SciVerse is the new combined portal from Elsevier that services their ScienceDirect collection, SciTopics, and their Scopus database. Using SciVerse to access ScienceDirect is the specific focus of this review. Along with advanced keyword searching and citation searching options, SciVerse also incorporates a very useful image search feature. The aim seems to be not only to create an interface that provides broad functionality on par with other database search tools that many searchers use regularly but also to create an open platform that could be changed to respond effectively to the needs of customers.

  10. Open access tools for quality-assured and efficient data entry in a large, state-wide tobacco survey in India.

    PubMed

    Shewade, Hemant Deepak; Vidhubala, E; Subramani, Divyaraj Prabhakar; Lal, Pranay; Bhatt, Neelam; Sundaramoorthi, C; Singh, Rana J; Kumar, Ajay M V

    2017-01-01

    A large state-wide tobacco survey was conducted using a modified version of the pretested, globally validated Global Adult Tobacco Survey (GATS) questionnaire in 2015-2016 in Tamil Nadu, India. Due to resource constraints, data collection was carried out using paper-based questionnaires (unlike GATS-India, 2009-2010, which used hand-held computer devices), while data entry was done using open access tools. The objective of this paper is to describe the data entry process and assess its quality assurance and efficiency. In EpiData terminology, a variable is referred to as a 'field' and a questionnaire (a set of fields) as a 'record'. EpiData software was used for double data entry with adequate checks, followed by validation. TeamViewer was used for remote training and troubleshooting. The EpiData databases (one for each district and each zone in Chennai city) were housed in shared Dropbox folders, which enabled secure file sharing and automatic back-up. Each district/zone database had separate files for data entry of the household-level and individual-level questionnaires. The 32,945 surveyed households included 111,363 individuals aged ≥15 years. The average proportion of records with data entry errors for a district/zone in the household-level and individual-level files was 4% and 24%, respectively. These are errors that would have gone unnoticed if single entry had been used. The median (inter-quartile range) time taken for double data entry of a single household-level and individual-level questionnaire was 30 (24, 40) s and 86 (64, 126) s, respectively. Efficient and quality-assured near-real-time data entry in a large sub-national tobacco survey was achieved through innovative, resource-efficient use of open access tools.
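The core of the double data entry check described above is simple: the same questionnaire is keyed twice, and any fields that disagree are flagged for review against the paper form. A minimal sketch follows (field names invented; EpiData implements this comparison natively):

```python
# Double data entry validation: key the same record twice and report every
# field where the two passes disagree, so it can be checked against the
# paper questionnaire.

def compare_entries(first: dict, second: dict) -> list:
    """Return (field, value_pass1, value_pass2) for each mismatch."""
    return [(f, first.get(f), second.get(f))
            for f in sorted(set(first) | set(second))
            if first.get(f) != second.get(f)]

entry1 = {"age": 34, "sex": "M", "tobacco_use": "yes"}
entry2 = {"age": 43, "sex": "M", "tobacco_use": "yes"}  # keying error in age

print(compare_entries(entry1, entry2))  # [('age', 34, 43)]
```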

  11. Nuclear Receptor Signaling Atlas: Opening Access to the Biology of Nuclear Receptor Signaling Pathways

    PubMed Central

    Becnel, Lauren B.; Darlington, Yolanda F.; Ochsner, Scott A.; Easton-Marks, Jeremy R.; Watkins, Christopher M.; McOwiti, Apollo; Kankanamge, Wasula H.; Wise, Michael W.; DeHart, Michael; Margolis, Ronald N.; McKenna, Neil J.

    2015-01-01

    Signaling pathways involving nuclear receptors (NRs), their ligands and coregulators, regulate tissue-specific transcriptomes in diverse processes, including development, metabolism, reproduction, the immune response and neuronal function, as well as in their associated pathologies. The Nuclear Receptor Signaling Atlas (NURSA) is a Consortium focused around a Hub website (www.nursa.org) that annotates and integrates diverse ‘omics datasets originating from the published literature and NURSA-funded Data Source Projects (NDSPs). These datasets are then exposed to the scientific community on an Open Access basis through user-friendly data browsing and search interfaces. Here, we describe the redesign of the Hub, version 3.0, to deploy “Web 2.0” technologies and add richer, more diverse content. The Molecule Pages, which aggregate information relevant to NR signaling pathways from myriad external databases, have been enhanced to include resources for basic scientists, such as post-translational modification sites and targeting miRNAs, and for clinicians, such as clinical trials. A portal to NURSA’s Open Access, PubMed-indexed journal Nuclear Receptor Signaling has been added to facilitate manuscript submissions. Datasets and information on reagents generated by NDSPs are available, as is information concerning periodic new NDSP funding solicitations. Finally, the new website integrates the Transcriptomine analysis tool, which allows for mining of millions of richly annotated public transcriptomic data points in the field, providing an environment for dataset re-use and citation, bench data validation and hypothesis generation. We anticipate that this new release of the NURSA database will have tangible, long term benefits for both basic and clinical research in this field. PMID:26325041

  12. Open Geoscience Database

    NASA Astrophysics Data System (ADS)

    Bashev, A.

    2012-04-01

    Currently there is an enormous number of geoscience databases. Unfortunately, the only users of most of them are their own developers. There are several reasons for this: incompatibility, the specificity of tasks and objects, and so on. The main obstacles to wide usage of geoscience databases, however, are complexity for developers and complication for users. Complex architectures lead to high costs that block public access, while complicated interfaces prevent users from understanding when and how to use a database. Only databases associated with Google Maps avoid these drawbacks, but they can hardly be called "geoscience" databases. Nevertheless, an open and simple geoscience database is necessary, at least for educational purposes (see our abstract for ESSI20/EOS12). We developed such a database, together with a web interface for working with it, and it is now accessible at maps.sch192.ru. In this database, a result is a value of a parameter (of any kind) at a station with a certain position, associated with metadata: the date when the result was obtained, the type of station (lake, soil, etc.) and the contributor who sent the result. Each contributor has a profile, which allows the reliability of the data to be estimated. The results can be displayed on a Google Maps satellite image as points at the stations' positions, coloured according to the value of the parameter. There are default colour scales, and each registered user can create their own. The results can also be extracted as a *.csv file. For both types of representation, the data can be filtered by date, object type, parameter type, area and contributor. Data are uploaded in *.csv format with the fields: name of the station; latitude (dd.dddddd); longitude (ddd.dddddd); station type; parameter type; parameter value; date (yyyy-mm-dd). The contributor is recognised on login. This is the minimal set of features required to connect a parameter value with a position and view the results. 
All complicated data treatment can be carried out in other programs after extracting the filtered data into a *.csv file, which makes the database understandable for non-experts. The database uses an open data format (*.csv) and widespread tools: PHP as the programming language, MySQL as the database management system, JavaScript for interaction with Google Maps and jQuery UI for the user interface. The database is multilingual: association tables connect translations with the elements of the database. In total, development took about 150 hours. The database still has several problems. The main one is the reliability of the data: properly estimating reliability would require an expert system, but building such a system would take more resources than the database itself. The second is the problem of stream selection: how to select stations that are connected with each other (for example, belonging to one water stream) and indicate their sequence. Currently the interface is available in English and Russian, but it can easily be translated into other languages. Some problems have already been solved. For example, the "same station" problem (sometimes the distance between stations is smaller than the positional error): when a new station is added, the application automatically finds existing stations near that place. The problem of object and parameter types (how to treat "EC" and "electrical conductivity" as the same parameter) has likewise been solved using associative tables. If you would like to see the interface in your language, just contact us and we will send you the list of terms and phrases to translate. The main advantage of the database is that it is totally open: anyone can view and extract data from the database and use them free of charge for non-commercial purposes. Registered users can contribute to the database on a voluntary basis. 
We hope that it will be widely used, first of all for educational purposes, but professional scientists could use it as well.
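The semicolon-separated upload format the abstract describes is simple enough to parse with standard tools. A minimal sketch in Python (the field names are illustrative, not taken from the actual system):

```python
import csv
import io
from datetime import datetime

# Columns as listed in the abstract; these names are illustrative only.
FIELDS = ["station", "latitude", "longitude", "station_type",
          "parameter_type", "parameter_value", "date"]

def parse_rows(text):
    """Parse semicolon-separated upload rows into validated dicts."""
    rows = []
    for raw in csv.reader(io.StringIO(text), delimiter=";"):
        if not raw:
            continue  # skip blank lines
        rec = dict(zip(FIELDS, (v.strip() for v in raw)))
        rec["latitude"] = float(rec["latitude"])          # dd.dddddd
        rec["longitude"] = float(rec["longitude"])        # ddd.dddddd
        rec["parameter_value"] = float(rec["parameter_value"])
        rec["date"] = datetime.strptime(rec["date"], "%Y-%m-%d").date()
        rows.append(rec)
    return rows
```

Parsing a single row such as `"Lake A;55.123456;037.654321;lake;EC;420.5;2012-04-01"` yields a dict with typed coordinates, value and date, ready for insertion into the database.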

  13. RNAcentral: A vision for an international database of RNA sequences

    PubMed Central

    Bateman, Alex; Agrawal, Shipra; Birney, Ewan; Bruford, Elspeth A.; Bujnicki, Janusz M.; Cochrane, Guy; Cole, James R.; Dinger, Marcel E.; Enright, Anton J.; Gardner, Paul P.; Gautheret, Daniel; Griffiths-Jones, Sam; Harrow, Jen; Herrero, Javier; Holmes, Ian H.; Huang, Hsien-Da; Kelly, Krystyna A.; Kersey, Paul; Kozomara, Ana; Lowe, Todd M.; Marz, Manja; Moxon, Simon; Pruitt, Kim D.; Samuelsson, Tore; Stadler, Peter F.; Vilella, Albert J.; Vogel, Jan-Hinnerk; Williams, Kelly P.; Wright, Mathew W.; Zwieb, Christian

    2011-01-01

    During the last decade there has been a great increase in the number of noncoding RNA genes identified, including new classes such as microRNAs and piRNAs. There is also a large growth in the amount of experimental characterization of these RNA components. Despite this growth in information, it is still difficult for researchers to access RNA data, because key data resources for noncoding RNAs have not yet been created. The most pressing omission is the lack of a comprehensive RNA sequence database, much like UniProt, which provides a comprehensive set of protein knowledge. In this article we propose the creation of a new open public resource that we term RNAcentral, which will contain a comprehensive collection of RNA sequences and fill an important gap in the provision of biomedical databases. We envision RNA researchers from all over the world joining a federated RNAcentral network, contributing specialized knowledge and databases. RNAcentral would centralize key data that are currently held across a variety of databases, allowing researchers instant access to a single, unified resource. This resource would facilitate the next generation of RNA research and help drive further discoveries, including those that improve food production and human and animal health. We encourage additional RNA database resources and research groups to join this effort. We aim to obtain international network funding to further this endeavor. PMID:21940779

  14. Legal assessment tool (LAT): an interactive tool to address privacy and data protection issues for data sharing.

    PubMed

    Kuchinke, Wolfgang; Krauth, Christian; Bergmann, René; Karakoyun, Töresin; Woollard, Astrid; Schluender, Irene; Braasch, Benjamin; Eckert, Martin; Ohmann, Christian

    2016-07-07

    Data in the life sciences are generated and stored in many different databases at an unprecedented rate. An ever-increasing share of these data are human health data and therefore fall under legal data protection regulations. As part of the BioMedBridges project, which created infrastructures connecting more than 10 ESFRI research infrastructures (RIs), the legal and ethical prerequisites of data sharing were examined using a novel, pragmatic approach. We employed concepts from computer science to create legal requirement clusters that enable legal interoperability between databases in the areas of data protection, data security, intellectual property (IP) and the security of biosample data. We analysed and extracted access rules and constraints from all data providers (databases) involved in building the data bridges, covering many of Europe's most important databases. These requirement clusters were applied to five usage scenarios representing the data flow in different data bridges: Image bridge, Phenotype data bridge, Personalised medicine data bridge, Structural data bridge, and Biosample data bridge. A matrix was built to relate the key concepts of data protection regulation (e.g. pseudonymisation, identifiability, access control, consent management) to the results of the requirement clusters, and an interactive user interface for querying the matrix for the requirements necessary for compliant data sharing was created. To guide researchers through the legal requirements without the need for expert legal knowledge, an interactive tool, the Legal Assessment Tool (LAT), was developed. LAT interactively leads researchers through a selection process to characterise the types of data and databases involved, and provides suitable requirements and recommendations for concrete data access and sharing situations. 
The results provided by LAT are based on an analysis of the data access and sharing conditions for different kinds of data in major European databases. Human health data must be made available for research purposes, and LAT is one means of achieving this aim. In summary, LAT provides, in an interactive way, the requirements for compliant data access and sharing with appropriate safeguards, restrictions and responsibilities, introducing a culture of responsibility and data governance when dealing with human data.

  15. Retracted Publications in the Biomedical Literature from Open Access Journals.

    PubMed

    Wang, Tao; Xing, Qin-Rui; Wang, Hui; Chen, Wei

    2018-03-07

    The number of articles published in open access journals (OAJs) has increased dramatically in recent years. Simultaneously, the quality of publications in these journals has been called into question. Few studies have explored the retraction rate from OAJs. The purpose of the current study was to determine the reasons for retractions of articles from OAJs in biomedical research. The Medline database was searched through PubMed to identify retracted publications in OAJs. The journals were identified by the Directory of Open Access Journals. Data were extracted from each retracted article, including the time from publication to retraction, causes, journal impact factor, and country of origin. Trends in the characteristics related to retraction were determined. Data from 621 retracted studies were included in the analysis. The number and rate of retractions have increased since 2010. The most common reasons for retraction are errors (148), plagiarism (142), duplicate publication (101), fraud/suspected fraud (98) and invalid peer review (93). The number of retracted articles from OAJs has been steadily increasing. Misconduct was the primary reason for retraction. The majority of retracted articles were from journals with low impact factors and authored by researchers from China, India, Iran, and the USA.

  16. A Quality-Control-Oriented Database for a Mesoscale Meteorological Observation Network

    NASA Astrophysics Data System (ADS)

    Lussana, C.; Ranci, M.; Uboldi, F.

    2012-04-01

    In the operational context of a local weather service, data accessibility and quality related issues must be managed by taking into account a wide set of user needs. This work describes the structure and the operational choices made for the operational implementation of a database system storing data from highly automated observing stations, metadata and information on data quality. Lombardy's environmental protection agency, ARPA Lombardia, manages a highly automated mesoscale meteorological network. A Quality Assurance System (QAS) ensures that reliable observational information is collected and disseminated to the users. The weather unit in ARPA Lombardia, at the same time an important QAS component and an intensive data user, has developed a database specifically aimed to: 1) providing quick access to data for operational activities and 2) ensuring data quality for real-time applications, by means of an Automatic Data Quality Control (ADQC) procedure. Quantities stored in the archive include hourly aggregated observations of: precipitation amount, temperature, wind, relative humidity, pressure, global and net solar radiation. The ADQC performs several independent tests on raw data and compares their results in a decision-making procedure. An important ADQC component is the Spatial Consistency Test based on Optimal Interpolation. Interpolated and Cross-Validation analysis values are also stored in the database, providing further information to human operators and useful estimates in case of missing data. The technical solution adopted is based on a LAMP (Linux, Apache, MySQL and Php) system, constituting an open source environment suitable for both development and operational practice. The ADQC procedure itself is performed by R scripts directly interacting with the MySQL database. Users and network managers can access the database by using a set of web-based Php applications.
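As an illustration of the kind of independent tests such an ADQC procedure combines, here is a simplified sketch in Python. This is a stand-in, not ARPA Lombardia's actual R implementation, and the optimal-interpolation spatial consistency test is approximated by a plain neighbour-mean comparison:

```python
def range_check(value, lo, hi):
    """Plausibility test: flag values outside stated physical limits."""
    return lo <= value <= hi

def spatial_consistency(obs, neighbor_values, tol):
    """Simplified stand-in for an OI-based spatial test: compare the
    observation with an estimate from nearby stations (here, their mean)."""
    if not neighbor_values:
        return True  # no neighbours: the test cannot reject the value
    estimate = sum(neighbor_values) / len(neighbor_values)
    return abs(obs - estimate) <= tol

def adqc_accept(value, lo, hi, neighbor_values, tol):
    """Decision step: accept only if all independent tests pass."""
    return range_check(value, lo, hi) and spatial_consistency(
        value, neighbor_values, tol)
```

For an hourly temperature of 20.0 °C with neighbours reporting 19.5, 20.5 and 21.0 °C, the value is accepted; a reading of 60 °C fails the range check, and 20 °C surrounded by neighbours near 5 °C fails the spatial test.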

  17. Semantic-JSON: a lightweight web service interface for Semantic Web contents integrating multiple life science databases

    PubMed Central

    Kobayashi, Norio; Ishii, Manabu; Takahashi, Satoshi; Mochizuki, Yoshiki; Matsushima, Akihiro; Toyoda, Tetsuro

    2011-01-01

    Global cloud frameworks for bioinformatics research databases become huge and heterogeneous; solutions face various diametric challenges comprising cross-integration, retrieval, security and openness. To address this, as of March 2011 organizations including RIKEN published 192 mammalian, plant and protein life sciences databases having 8.2 million data records, integrated as Linked Open or Private Data (LOD/LPD) using SciNetS.org, the Scientists' Networking System. The huge quantity of linked data this database integration framework covers is based on the Semantic Web, where researchers collaborate by managing metadata across public and private databases in a secured data space. This outstripped the data query capacity of existing interface tools like SPARQL. Actual research also requires specialized tools for data analysis using raw original data. To solve these challenges, in December 2009 we developed the lightweight Semantic-JSON interface to access each fragment of linked and raw life sciences data securely under the control of programming languages popularly used by bioinformaticians such as Perl and Ruby. Researchers successfully used the interface across 28 million semantic relationships for biological applications including genome design, sequence processing, inference over phenotype databases, full-text search indexing and human-readable contents like ontology and LOD tree viewers. Semantic-JSON services of SciNetS.org are provided at http://semanticjson.org. PMID:21632604

  18. Health Information-Seeking Patterns of the General Public and Indications for Disease Surveillance: Register-Based Study Using Lyme Disease.

    PubMed

    Pesälä, Samuli; Virtanen, Mikko J; Sane, Jussi; Mustonen, Pekka; Kaila, Minna; Helve, Otto

    2017-11-06

    People using the Internet to find information on health issues, such as specific diseases, usually start their search from a general search engine, for example, Google. Internet searches such as these may yield results and data of questionable quality and reliability. Health Library is a free-of-charge medical portal on the Internet providing medical information for the general public. Physician's Databases, an Internet evidence-based medicine source, provides medical information for health care professionals (HCPs) to support their clinical practice. Both databases are available throughout Finland, but the latter is used only by health professionals and pharmacies. Little is known about how the general public seeks medical information from medical sources on the Internet, how this behavior differs from HCPs' queries, and what causes possible differences in behavior. The aim of our study was to evaluate how the general public's and HCPs' information-seeking trends from Internet medical databases differ seasonally and temporally. In addition, we aimed to evaluate whether the general public's information-seeking trends could be utilized for disease surveillance and whether media coverage could affect these seeking trends. Lyme disease, serving as a well-defined disease model with distinct seasonal variation, was chosen as a case study. Two Internet medical databases, Health Library and Physician's Databases, were used. We compared the general public's article openings on Lyme disease from Health Library to HCPs' article openings on Lyme disease from Physician's Databases seasonally across Finland from 2011 to 2015. Additionally, media publications related to Lyme disease were searched from the largest and most popular media websites in Finland. Both databases, Health Library and Physician's Databases, show visually similar patterns in temporal variations of article openings on Lyme disease in Finland from 2011 to 2015. 
However, Health Library openings show not only an increasing trend over time but also greater fluctuations, especially during peak opening seasons. Outside these seasons, publications in the media coincide with Health Library article openings only occasionally. Lyme disease-related information-seeking behaviors between the general public and HCPs from Internet medical portals share similar temporal variations, which is consistent with the trend seen in epidemiological data. Therefore, the general public's article openings could be used as a supplementary source of information for disease surveillance. The fluctuations in article openings appeared stronger among the general public, suggesting that different factors, such as media coverage, affect the information-seeking behaviors of the public versus professionals. However, media coverage may also have an influence on HCPs. Not every publication was associated with an increase in openings, but the higher the media coverage by some publications, the higher the general public's access to Health Library. ©Samuli Pesälä, Mikko J Virtanen, Jussi Sane, Pekka Mustonen, Minna Kaila, Otto Helve. Originally published in JMIR Public Health and Surveillance (http://publichealth.jmir.org), 06.11.2017.

  19. Health Information–Seeking Patterns of the General Public and Indications for Disease Surveillance: Register-Based Study Using Lyme Disease

    PubMed Central

    Virtanen, Mikko J; Sane, Jussi; Mustonen, Pekka; Kaila, Minna; Helve, Otto

    2017-01-01

    Background People using the Internet to find information on health issues, such as specific diseases, usually start their search from a general search engine, for example, Google. Internet searches such as these may yield results and data of questionable quality and reliability. Health Library is a free-of-charge medical portal on the Internet providing medical information for the general public. Physician’s Databases, an Internet evidence-based medicine source, provides medical information for health care professionals (HCPs) to support their clinical practice. Both databases are available throughout Finland, but the latter is used only by health professionals and pharmacies. Little is known about how the general public seeks medical information from medical sources on the Internet, how this behavior differs from HCPs’ queries, and what causes possible differences in behavior. Objective The aim of our study was to evaluate how the general public’s and HCPs’ information-seeking trends from Internet medical databases differ seasonally and temporally. In addition, we aimed to evaluate whether the general public’s information-seeking trends could be utilized for disease surveillance and whether media coverage could affect these seeking trends. Methods Lyme disease, serving as a well-defined disease model with distinct seasonal variation, was chosen as a case study. Two Internet medical databases, Health Library and Physician’s Databases, were used. We compared the general public’s article openings on Lyme disease from Health Library to HCPs’ article openings on Lyme disease from Physician’s Databases seasonally across Finland from 2011 to 2015. Additionally, media publications related to Lyme disease were searched from the largest and most popular media websites in Finland. Results Both databases, Health Library and Physician’s Databases, show visually similar patterns in temporal variations of article openings on Lyme disease in Finland from 2011 to 2015. 
However, Health Library openings show not only an increasing trend over time but also greater fluctuations, especially during peak opening seasons. Outside these seasons, publications in the media coincide with Health Library article openings only occasionally. Conclusions Lyme disease–related information-seeking behaviors between the general public and HCPs from Internet medical portals share similar temporal variations, which is consistent with the trend seen in epidemiological data. Therefore, the general public’s article openings could be used as a supplementary source of information for disease surveillance. The fluctuations in article openings appeared stronger among the general public, suggesting that different factors, such as media coverage, affect the information-seeking behaviors of the public versus professionals. However, media coverage may also have an influence on HCPs. Not every publication was associated with an increase in openings, but the higher the media coverage by some publications, the higher the general public’s access to Health Library. PMID:29109071

  20. Mobile service for open data visualization on geo-based images

    NASA Astrophysics Data System (ADS)

    Lee, Kiwon; Kim, Kwangseob; Kang, Sanggoo

    2015-12-01

    Since the early 2010s, governments in most countries have adopted and promoted open data policies and open data platforms. Korea is in the same situation: government and public organizations have operated publicly accessible open data portal systems since 2011, and the number and types of open datasets have been increasing every year. These trends are even more expandable or extensible in mobile environments. The purpose of this study is to design and implement a mobile application service to visualize public open data of various types and formats together with geo-based images on the mobile web. Open data covers downloadable datasets or open-accessible data application programming interfaces (APIs). Geo-based images here mean multi-sensor satellite imagery that is georeferenced and matched with digital map sets. The system components for the mobile service are fully based on open sources and open development environments, without any commercial tools: PostgreSQL for the database management system, OTB for remote sensing image processing, GDAL for data conversion, GeoServer for the application server, OpenLayers for mobile web mapping, R for data analysis and D3.js for web-based data graphics. The client-side mobile application was implemented using HTML5 for cross-browser and cross-platform support. The result shows several advantages, such as linking open data with geo-based data, integrating open data with open source, and demonstrating mobile applications with open data. We expect this approach to be a cost-effective and process-efficient implementation strategy for intelligent Earth observation data.

  1. A Scalable Data Access Layer to Manage Structured Heterogeneous Biomedical Data

    PubMed Central

    Lianas, Luca; Frexia, Francesca; Zanetti, Gianluigi

    2016-01-01

    This work presents a scalable data access layer, called PyEHR, designed to support the implementation of data management systems for secondary use of structured heterogeneous biomedical and clinical data. PyEHR adopts the openEHR’s formalisms to guarantee the decoupling of data descriptions from implementation details and exploits structure indexing to accelerate searches. Data persistence is guaranteed by a driver layer with a common driver interface. Interfaces for two NoSQL Database Management Systems are already implemented: MongoDB and Elasticsearch. We evaluated the scalability of PyEHR experimentally through two types of tests, called “Constant Load” and “Constant Number of Records”, with queries of increasing complexity on synthetic datasets of ten million records each, containing very complex openEHR archetype structures, distributed on up to ten computing nodes. PMID:27936191

  2. PubChem BioAssay: 2017 update

    PubMed Central

    Wang, Yanli; Bryant, Stephen H.; Cheng, Tiejun; Wang, Jiyao; Gindulyte, Asta; Shoemaker, Benjamin A.; Thiessen, Paul A.; He, Siqian; Zhang, Jian

    2017-01-01

    PubChem's BioAssay database (https://pubchem.ncbi.nlm.nih.gov) has served as a public repository for small-molecule and RNAi screening data since 2004, providing open access to its data content for the community. PubChem accepts data submissions from researchers worldwide in academia, industry and government agencies, and also collaborates with other chemical biology database stakeholders on data exchange. With over a decade of development effort, it has become an important information resource supporting drug discovery and chemical biology research. To facilitate data discovery, PubChem is integrated with all other databases at NCBI. In this work, we provide an update for the PubChem BioAssay database describing several recent developments, including added sources of research data, a redesigned BioAssay record page, a new BioAssay classification browser and new features in the Upload system that facilitate data sharing. PMID:27899599

  3. DamaGIS: a multisource geodatabase for collection of flood-related damage data

    NASA Astrophysics Data System (ADS)

    Saint-Martin, Clotilde; Javelle, Pierre; Vinet, Freddy

    2018-06-01

    Every year in France, recurring flood events result in several million euros of damage, and reducing the heavy consequences of floods has become a high priority. However, actions to reduce the impact of floods are often hindered by the lack of damage data on past flood events. The present paper introduces a new database for collection and assessment of flood-related damage. The DamaGIS database offers an innovative bottom-up approach to gather and identify damage data from multiple sources, including new media. The study area has been defined as the south of France considering the high frequency of floods over the past years. This paper presents the structure and contents of the database. It also presents operating instructions in order to keep collecting damage data within the database. This paper also describes an easily reproducible method to assess the severity of flood damage regardless of the location or date of occurrence. A first analysis of the damage contents is also provided in order to assess data quality and the relevance of the database. According to this analysis, despite its lack of comprehensiveness, the DamaGIS database presents many advantages. Indeed, DamaGIS provides a high accuracy of data as well as simplicity of use. It also has the additional benefit of being accessible in multiple formats and is open access. The DamaGIS database is available at https://doi.org/10.5281/zenodo.1241089.

  4. KID Project: an internet-based digital video atlas of capsule endoscopy for research purposes

    PubMed Central

    Koulaouzidis, Anastasios; Iakovidis, Dimitris K.; Yung, Diana E.; Rondonotti, Emanuele; Kopylov, Uri; Plevris, John N.; Toth, Ervin; Eliakim, Abraham; Wurm Johansson, Gabrielle; Marlicz, Wojciech; Mavrogenis, Georgios; Nemeth, Artur; Thorlacius, Henrik; Tontini, Gian Eugenio

    2017-01-01

    Background and aims  Capsule endoscopy (CE) has revolutionized small-bowel (SB) investigation. Computational methods can enhance diagnostic yield (DY); however, incorporating machine learning algorithms (MLAs) into CE reading is difficult as large amounts of image annotations are required for training. Current databases lack graphic annotations of pathologies and cannot be used. A novel database, KID, aims to provide a reference for research and development of medical decision support systems (MDSS) for CE. Methods  Open-source software was used for the KID database. Clinicians contribute anonymized, annotated CE images and videos. Graphic annotations are supported by an open-access annotation tool (Ratsnake). We detail an experiment based on the KID database, examining differences in SB lesion measurement between human readers and a MLA. The Jaccard Index (JI) was used to evaluate similarity between annotations by the MLA and human readers. Results  The MLA performed best in measuring lymphangiectasias with a JI of 81 ± 6 %. The other lesion types were: angioectasias (JI 64 ± 11 %), aphthae (JI 64 ± 8 %), chylous cysts (JI 70 ± 14 %), polypoid lesions (JI 75 ± 21 %), and ulcers (JI 56 ± 9 %). Conclusion  MLA can perform as well as human readers in the measurement of SB angioectasias in white light (WL). Automated lesion measurement is therefore feasible. KID is currently the only open-source CE database developed specifically to aid development of MDSS. Our experiment demonstrates this potential. PMID:28580415
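The Jaccard Index used above to compare annotations is the size of the intersection of two annotated regions divided by the size of their union. A minimal sketch, treating each annotation as a set of pixel coordinates (the data layout is an assumption for illustration, not the KID database's actual representation):

```python
def jaccard_index(a, b):
    """Jaccard Index |A ∩ B| / |A ∪ B| between two annotated regions,
    each given as an iterable of (row, col) pixel coordinates."""
    a, b = set(a), set(b)
    union = a | b
    if not union:
        return 1.0  # two empty annotations are identical by convention
    return len(a & b) / len(union)
```

For example, a reader annotation of 4 pixels and an MLA annotation of 4 pixels that share 3 pixels give an intersection of 3 and a union of 5, i.e. a Jaccard Index of 0.6.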

  5. Standardization of search methods for guideline development: an international survey of evidence-based guideline development groups.

    PubMed

    Deurenberg, Rikie; Vlayen, Joan; Guillo, Sylvie; Oliver, Thomas K; Fervers, Beatrice; Burgers, Jako

    2008-03-01

    Effective literature searching is particularly important for clinical practice guideline development. Sophisticated searching and filtering mechanisms are needed to help ensure that all relevant research is reviewed. To assess the methods used for the selection of evidence for guideline development by evidence-based guideline development organizations. A semistructured questionnaire assessing the databases, search filters and evaluation methods used for literature retrieval was distributed to eight major organizations involved in evidence-based guideline development. All of the organizations used search filters as part of guideline development. The MEDLINE database was the primary source accessed for literature retrieval. The Ovid or SilverPlatter interfaces were used in preference to the freely accessed PubMed interface. The Cochrane Library, EMBASE, CINAHL and PsycINFO databases were also frequently used by the organizations. All organizations reported the intention to improve and validate their filters for finding literature specifically relevant for guidelines. In the first international survey of its kind, eight major guideline development organizations indicated a strong interest in identifying, improving and standardizing search filters to improve guideline development. It is to be hoped that this will result in the standardization of, and open access to, search filters, an improvement in literature searching outcomes and greater collaboration among guideline development organizations.

  6. Global health equity in United Kingdom university research: a landscape of current policies and practices.

    PubMed

    Gotham, Dzintars; Meldrum, Jonathan; Nageshwaran, Vaitehi; Counts, Christopher; Kumari, Nina; Martin, Manuel; Beattie, Ben; Post, Nathan

    2016-10-10

    Universities are significant contributors to research and technologies in health; however, the health needs of the world's poor are historically neglected in research. Medical discoveries are frequently licensed exclusively to one producer, allowing a monopoly and inequitable pricing. Similarly, research is often published in ways that make it inaccessible. Universities can adopt policies and practices to overcome neglect and ensure equitable access to research and its products. For 25 United Kingdom universities, data on health research funding were extracted from the top five United Kingdom funders' databases and coded as research on neglected diseases (NDs) and/or health in low- and lower-middle-income countries (hLLMIC). Data on intellectual property licensing policies and practices and open-access policies were obtained from publicly available sources and by direct contact with universities. Proportions of research articles published as open-access were extracted from PubMed and PubMed Central. Across United Kingdom universities, the median proportion of 2011-2014 health research funds attributable to ND research was 2.6% and for hLLMIC it was 1.7%. Overall, 79% of all ND funding and 74% of hLLMIC funding were granted to the top four institutions within each category. Seven institutions had policies to ensure that technologies developed from their research are affordable globally. Mostly, universities licensed their inventions to third parties in a way that confers monopoly rights. Fifteen institutions had an institutional open-access publishing policy; three had an institutional open-access publishing fund. The proportion of health-related articles with full-text versions freely available online ranged from 58% to 100% across universities (2012-2013); 23% of articles also had a Creative Commons (CC-BY) license. 
There is wide variation in the amount of global health research undertaken by United Kingdom universities, with a large proportion of total research funding awarded to a few institutions. To meet a level of research commitment in line with the global burden of disease, most universities should seek to expand their research activity. Most universities do not license their intellectual property in a way that is likely to encourage access in resource-poor settings, and lack policies to do so. The majority of recent research publications are published open-access, but not as gold standard (CC-BY) open-access.

  7. RoMEO Studies 8: Self-Archiving: The Logic behind the Colour-Coding Used in the Copyright Knowledge Bank

    ERIC Educational Resources Information Center

    Jenkins, Celia; Probets, Steve; Oppenheim, Charles; Hubbard, Bill

    2007-01-01

    Purpose: The purpose of this research is to show how the self-archiving of journal papers is a major step towards providing open access to research. However, copyright transfer agreements (CTAs) that are signed by an author prior to publication often indicate whether, and in what form, self-archiving is allowed. The SHERPA/RoMEO database enables…

  8. Research Models Used in Doctoral Theses on Sport Management in Turkey: A Content Analysis

    ERIC Educational Resources Information Center

    Atalay, Ahmet

    2018-01-01

    The aim of this study was to examine the methodological tendencies in the doctorate theses which were prepared in the field of Sports Management in Turkish between 2007 and 2016 and which were open to access in the database of the Council of Higher Education (CHE) National Theses Center. In this context, 111 doctorate theses prepared in the last…

  9. Statistical Literacy in Data Revolution Era: Building Blocks and Instructional Dilemmas

    ERIC Educational Resources Information Center

    Prodromou, Theodosia; Dunne, Tim

    2017-01-01

    The data revolution has given citizens access to enormous large-scale open databases. In order to take into account the full complexity of data, we have to change the way we think in terms of the nature of data and its availability, the ways in which it is displayed and used, and the skills that are required for its interpretation. Substantial…

  10. SureChEMBL: a large-scale, chemically annotated patent document database.

    PubMed

    Papadatos, George; Davies, Mark; Dedman, Nathan; Chambers, Jon; Gaulton, Anna; Siddle, James; Koks, Richard; Irvine, Sean A; Pettersson, Joe; Goncharoff, Nicko; Hersey, Anne; Overington, John P

    2016-01-04

    SureChEMBL is a publicly available large-scale resource containing compounds extracted from the full text, images and attachments of patent documents. The data are extracted from the patent literature according to an automated text and image-mining pipeline on a daily basis. SureChEMBL provides access to a previously unavailable, open and timely set of annotated compound-patent associations, complemented with sophisticated combined structure and keyword-based search capabilities against the compound repository and patent document corpus; given the wealth of knowledge hidden in patent documents, analysis of SureChEMBL data has immediate applications in drug discovery, medicinal chemistry and other commercial areas of chemical science. Currently, the database contains 17 million compounds extracted from 14 million patent documents. Access is available through a dedicated web-based interface and data downloads at: https://www.surechembl.org/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.

  11. A web accessible resource for investigating cassava phenomics and genomics information: BIOGEN BASE

    PubMed Central

    Jayakodi, Murukarthick; selvan, Sreedevi Ghokhilamani; Natesan, Senthil; Muthurajan, Raveendran; Duraisamy, Raghu; Ramineni, Jana Jeevan; Rathinasamy, Sakthi Ambothi; Karuppusamy, Nageswari; Lakshmanan, Pugalenthi; Chokkappan, Mohan

    2011-01-01

    The goal of our research is to establish a unique portal to bring out the potential outcomes of research on the cassava crop. The Biogen base for cassava clearly brings out the variation in different traits of the germplasm maintained at the Tapioca and Castor Research Station, Tamil Nadu Agricultural University. Phenotypic and genotypic variations of the accessions are clearly depicted, allowing users to browse and interpret the variation using microsatellite markers. The database (BIOGEN BASE - CASSAVA) is designed using PHP and MySQL and is equipped with extensive search options. It is user-friendly and publicly available, intended to improve the research and development of cassava by making a wealth of genetics and genomics data available through an open, common, worldwide forum for all individuals interested in the field. Availability: The database is freely available at http://www.tnaugenomics.com/biogenbase/casava.php PMID:21904428
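The abstract above describes a PHP/MySQL database with extensive search options over phenotyped, marker-genotyped accessions. As a minimal sketch of that kind of accession search, here is a Python version using the built-in sqlite3 module as a stand-in for MySQL; the schema, column names, and records are hypothetical, not BIOGEN BASE's actual design:

```python
import sqlite3

# Hypothetical germplasm table; column names are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE accessions (
    accession_id TEXT PRIMARY KEY,
    trait TEXT,   -- phenotypic trait description
    marker TEXT   -- microsatellite (SSR) marker genotyped
)""")
conn.executemany(
    "INSERT INTO accessions VALUES (?, ?, ?)",
    [("CAS-001", "high starch", "SSRY13"),
     ("CAS-002", "drought tolerant", "SSRY21"),
     ("CAS-003", "high starch", "SSRY21")],
)

def search(trait_keyword, marker=None):
    """Return accession IDs matching a trait keyword and, optionally, a marker."""
    sql = "SELECT accession_id FROM accessions WHERE trait LIKE ?"
    params = ["%" + trait_keyword + "%"]
    if marker is not None:
        sql += " AND marker = ?"
        params.append(marker)
    return [row[0] for row in conn.execute(sql, params)]

print(search("starch"))            # trait keyword only
print(search("starch", "SSRY21"))  # narrowed by marker
```

Parameterized queries (the `?` placeholders) are the standard way to keep such a public search form safe from SQL injection, in PHP/MySQL just as in Python.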

  12. A web accessible resource for investigating cassava phenomics and genomics information: BIOGEN BASE.

    PubMed

    Jayakodi, Murukarthick; Selvan, Sreedevi Ghokhilamani; Natesan, Senthil; Muthurajan, Raveendran; Duraisamy, Raghu; Ramineni, Jana Jeevan; Rathinasamy, Sakthi Ambothi; Karuppusamy, Nageswari; Lakshmanan, Pugalenthi; Chokkappan, Mohan

    2011-01-01

    The goal of our research is to establish a unique portal to bring out the potential outcomes of research on the cassava crop. The Biogen base for cassava clearly brings out the variation in different traits of the germplasm maintained at the Tapioca and Castor Research Station, Tamil Nadu Agricultural University. Phenotypic and genotypic variations of the accessions are clearly depicted, allowing users to browse and interpret the variation using microsatellite markers. The database (BIOGEN BASE - CASSAVA) is designed using PHP and MySQL and is equipped with extensive search options. It is user-friendly and publicly available, intended to improve the research and development of cassava by making a wealth of genetics and genomics data available through an open, common, worldwide forum for all individuals interested in the field. The database is freely available at http://www.tnaugenomics.com/biogenbase/casava.php.

  13. Chemical and isotopic database of water and gas from hydrothermal systems with an emphasis for the western United States

    USGS Publications Warehouse

    Mariner, R.H.; Venezky, D.Y.; Hurwitz, S.

    2006-01-01

    Chemical and isotope data accumulated by two USGS projects (led by I. Barnes and R. Mariner) over a period of about 40 years can now be found using a basic web search or through an image search. The data are primarily chemical and isotopic analyses of waters (thermal, mineral, or fresh) and associated gases (free and/or dissolved) collected from hot springs, mineral springs, cold springs, geothermal wells, fumaroles, and gas seeps. Additional information is available about the collection methods and analysis procedures. The chemical and isotope data are stored in a MySQL database and accessed using PHP from a basic search form. Data can also be accessed using an open-source GIS called WorldKit. Additional information is available about WorldKit, including the files used to set up the site.

  14. SureChEMBL: a large-scale, chemically annotated patent document database

    PubMed Central

    Papadatos, George; Davies, Mark; Dedman, Nathan; Chambers, Jon; Gaulton, Anna; Siddle, James; Koks, Richard; Irvine, Sean A.; Pettersson, Joe; Goncharoff, Nicko; Hersey, Anne; Overington, John P.

    2016-01-01

    SureChEMBL is a publicly available large-scale resource containing compounds extracted from the full text, images and attachments of patent documents. The data are extracted from the patent literature according to an automated text and image-mining pipeline on a daily basis. SureChEMBL provides access to a previously unavailable, open and timely set of annotated compound-patent associations, complemented with sophisticated combined structure and keyword-based search capabilities against the compound repository and patent document corpus; given the wealth of knowledge hidden in patent documents, analysis of SureChEMBL data has immediate applications in drug discovery, medicinal chemistry and other commercial areas of chemical science. Currently, the database contains 17 million compounds extracted from 14 million patent documents. Access is available through a dedicated web-based interface and data downloads at: https://www.surechembl.org/. PMID:26582922

  15. The EarthServer project: Exploiting Identity Federations, Science Gateways and Social and Mobile Clients for Big Earth Data Analysis

    NASA Astrophysics Data System (ADS)

    Barbera, Roberto; Bruno, Riccardo; Calanducci, Antonio; Messina, Antonio; Pappalardo, Marco; Passaro, Gianluca

    2013-04-01

    The EarthServer project (www.earthserver.eu), funded by the European Commission under its Seventh Framework Program, aims at establishing open access and ad-hoc analytics on extreme-size Earth Science data, based on and extending leading-edge Array Database technology. The core idea is to use database query languages as client/server interface to achieve barrier-free "mix & match" access to multi-source, any-size, multi-dimensional space-time data -- in short: "Big Earth Data Analytics" - based on the open standards of the Open Geospatial Consortium Web Coverage Processing Service (OGC WCPS) and the W3C XQuery. EarthServer combines both, thereby achieving a tight data/metadata integration. Further, the rasdaman Array Database System (www.rasdaman.com) is extended with further space-time coverage data types. On server side, highly effective optimizations - such as parallel and distributed query processing - ensure scalability to Exabyte volumes. Six Lighthouse Applications are being established in EarthServer, each of which poses distinct challenges on Earth Data Analytics: Cryospheric Science, Airborne Science, Atmospheric Science, Geology, Oceanography, and Planetary Science. Altogether, they cover all Earth Science domains; the Planetary Science use case has been added to challenge concepts and standards in non-standard environments. In addition, EarthLook (maintained by Jacobs University) showcases use of OGC standards in 1D through 5D use cases. In this contribution we will report on the first applications integrated in the EarthServer Science Gateway and on the clients for mobile appliances developed to access them. We will also show how federated and social identity services can allow Big Earth Data Providers to expose their data in a distributed environment keeping a strict and fine-grained control on user authentication and authorisation. 
The degree to which the EarthServer implementation fulfils the recommendations made in the recent TERENA Study on AAA Platforms For Scientific Resources in Europe (https://confluence.terena.org/display/aaastudy/AAA+Study+Home+Page) will also be assessed.

  16. Poor quality evidence suggests that failure rates for atraumatic restorative treatment and conventional amalgam are similar.

    PubMed

    Hurst, Dominic

    2012-06-01

    The Medline, Cochrane CENTRAL, Biomed Central, Directory of Open Access Journals (DOAJ), OpenJ-Gate, Bibliografia Brasileira de Odontologia (BBO), LILACS, IndMed, Sabinet, Scielo, Scirus (Medicine), OpenSIGLE and Google Scholar databases were searched. Hand searching was performed for journals not indexed in the databases. References of included trials were checked. Prospective clinical trials with test and control groups and a follow-up of at least one year were included. Data abstraction was conducted independently, and clinically and methodologically homogeneous data were pooled using a fixed-effects model. Eighteen trials were included. From these, 32 individual dichotomous datasets were extracted and analysed. The majority of the results show no differences between the two types of intervention. A high risk of selection, performance, detection and attrition bias was identified. Existing research gaps are mainly due to a lack of trials and small sample sizes. The current evidence indicates that the failure rate of high-viscosity GIC/ART restorations is not higher than, but similar to, that of conventional amalgam fillings after periods longer than one year. These results are in line with the conclusions drawn during the original systematic review. There is a high risk that these results are affected by bias, and thus confirmation by further trials with suitably high numbers of participants is needed.
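The fixed-effects pooling of dichotomous datasets mentioned above can be sketched as inverse-variance weighting of log risk ratios. The trial counts below are invented for illustration only; they are not the review's data:

```python
import math

# Each hypothetical trial: (failures_ART, n_ART, failures_amalgam, n_amalgam)
trials = [(8, 100, 10, 100), (5, 80, 6, 80), (12, 150, 11, 150)]

weights, weighted_effects = [], []
for a, n1, c, n2 in trials:
    log_rr = math.log((a / n1) / (c / n2))  # log risk ratio for this trial
    var = 1/a - 1/n1 + 1/c - 1/n2           # approximate variance of log RR
    w = 1 / var                             # inverse-variance weight
    weights.append(w)
    weighted_effects.append(w * log_rr)

pooled_log_rr = sum(weighted_effects) / sum(weights)
se = math.sqrt(1 / sum(weights))
rr = math.exp(pooled_log_rr)
ci = (math.exp(pooled_log_rr - 1.96 * se),
      math.exp(pooled_log_rr + 1.96 * se))
print(f"pooled RR = {rr:.2f}, 95% CI {ci[0]:.2f}-{ci[1]:.2f}")
```

A confidence interval straddling 1.0, as here, corresponds to the review's finding of no detectable difference between the two interventions.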

  17. Installation of the National Transport Code Collaboration Data Server at the ITPA International Multi-tokamak Confinement Profile Database

    NASA Astrophysics Data System (ADS)

    Roach, Colin; Carlsson, Johan; Cary, John R.; Alexander, David A.

    2002-11-01

    The National Transport Code Collaboration (NTCC) has developed an array of software, including a data client/server. The data server, which is written in C++, serves local data (in the ITER Profile Database format) as well as remote data (by accessing one or several MDS+ servers). The client, a web-invocable Java applet, provides a uniform, intuitive, user-friendly, graphical interface to the data server. The uniformity of the interface relieves users of the trouble of mastering the differences between data formats and lets them focus on the essentials: plotting and viewing the data. The user runs the client by visiting a web page using any Java-capable web browser. The client is automatically downloaded and run by the browser. A reference to the data server is then retrieved via the standard web protocol (HTTP). Communication between the client and the server is then handled by the mature, industry-standard CORBA middleware. CORBA has bindings for all common languages, and many high-quality implementations are available (both open source and commercial). The NTCC data server has been installed at the ITPA International Multi-tokamak Confinement Profile Database, which is hosted by the UKAEA at Culham Science Centre. The installation of the data server is protected by an Internet firewall. To make it accessible to clients outside the firewall, some modifications of the server were required. The working version of the ITPA confinement profile database is not open to the public. Authentication of legitimate users is handled by built-in Java security features that require a password to download the client. We present an overview of the NTCC data client/server and some details of how the CORBA firewall-traversal issues were resolved and how user authentication is implemented.

  18. Digital geologic map and GIS database of Venezuela

    USGS Publications Warehouse

    Garrity, Christopher P.; Hackley, Paul C.; Urbani, Franco

    2006-01-01

    The digital geologic map and GIS database of Venezuela captures GIS compatible geologic and hydrologic data from the 'Geologic Shaded Relief Map of Venezuela,' which was released online as U.S. Geological Survey Open-File Report 2005-1038. Digital datasets and corresponding metadata files are stored in ESRI geodatabase format; accessible via ArcGIS 9.X. Feature classes in the geodatabase include geologic unit polygons, open water polygons, coincident geologic unit linework (contacts, faults, etc.) and non-coincident geologic unit linework (folds, drainage networks, etc.). Geologic unit polygon data were attributed for age, name, and lithologic type following the Lexico Estratigrafico de Venezuela. All digital datasets were captured from source data at 1:750,000. Although users may view and analyze data at varying scales, the authors make no guarantee as to the accuracy of the data at scales larger than 1:750,000.

  19. Database of mineral deposits in the Islamic Republic of Mauritania (phase V, deliverables 90 and 91): Chapter S in Second projet de renforcement institutionnel du secteur minier de la République Islamique de Mauritanie (PRISM-II)

    USGS Publications Warehouse

    Marsh, Erin; Anderson, Eric D.

    2015-01-01

    Three ore deposit databases from previous studies were evaluated and combined with newly identified mineral occurrences into one database, which can now be used to manage information about the known mineral occurrences of Mauritania. The Microsoft Access 2010 database opens with the list of tables and forms held within the database and a Switchboard control panel from which to easily navigate through the existing mineral deposit data and to enter data for new deposit locations. The database is a helpful tool for organizing the basic information about the mineral occurrences of Mauritania. It is suggested that the database be administered by a single operator in order to avoid the data overlap and overwriting that can result from shared real-time data entry. It is proposed that the mineral occurrence database be used in concert with the geologic maps and the geophysics and geochemistry datasets, as a publicly advertised interface to the abundant geospatial information that the Mauritanian government can provide to interested parties.

  20. PsyGeNET: a knowledge platform on psychiatric disorders and their genes.

    PubMed

    Gutiérrez-Sacristán, Alba; Grosdidier, Solène; Valverde, Olga; Torrens, Marta; Bravo, Àlex; Piñero, Janet; Sanz, Ferran; Furlong, Laura I

    2015-09-15

    PsyGeNET (Psychiatric disorders and Genes association NETwork) is a knowledge platform for the exploratory analysis of psychiatric diseases and their associated genes. PsyGeNET is composed of a database and a web interface supporting data search, visualization, filtering and sharing. PsyGeNET integrates information from DisGeNET and data extracted from the literature by text mining, which has been curated by domain experts. It currently contains 2642 associations between 1271 genes and 37 psychiatric disease concepts. In its first release, PsyGeNET is focused on three psychiatric disorders: major depression, alcohol and cocaine use disorders. PsyGeNET represents a comprehensive, open access resource for the analysis of the molecular mechanisms underpinning psychiatric disorders and their comorbidities. The PsyGeNET platform is freely available at http://www.psygenet.org/. The PsyGeNET database is made available under the Open Database License (http://opendatacommons.org/licenses/odbl/1.0/). lfurlong@imim.es Supplementary data are available at Bioinformatics online. © The Author 2015. Published by Oxford University Press.

  1. The CTBTO Link to the database of the International Seismological Centre (ISC)

    NASA Astrophysics Data System (ADS)

    Bondar, I.; Storchak, D. A.; Dando, B.; Harris, J.; Di Giacomo, D.

    2011-12-01

    The CTBTO Link to the database of the International Seismological Centre (ISC) is a project to provide access to seismological data sets maintained by the ISC using specially designed interactive tools. The Link is open to National Data Centres and to the CTBTO. By means of graphical interfaces and database queries tailored to the needs of the monitoring community, the users are given access to a multitude of products. These include the ISC and ISS bulletins, covering the seismicity of the Earth since 1904; nuclear and chemical explosions; the EHB bulletin; the IASPEI Reference Event list (ground truth database); and the IDC Reviewed Event Bulletin. The searches are divided into three main categories: The Area Based Search (a spatio-temporal search based on the ISC Bulletin), the REB search (a spatio-temporal search based on specific events in the REB) and the IMS Station Based Search (a search for historical patterns in the reports of seismic stations close to a particular IMS seismic station). The outputs are HTML based web-pages with a simplified version of the ISC Bulletin showing the most relevant parameters with access to ISC, GT, EHB and REB Bulletins in IMS1.0 format for single or multiple events. The CTBTO Link offers a tool to view REB events in context within the historical seismicity, look at observations reported by non-IMS networks, and investigate station histories and residual patterns for stations registered in the International Seismographic Station Registry.

  2. Latest developments for the IAGOS database: Interoperability and metadata

    NASA Astrophysics Data System (ADS)

    Boulanger, Damien; Gautron, Benoit; Thouret, Valérie; Schultz, Martin; van Velthoven, Peter; Broetz, Bjoern; Rauthe-Schöch, Armin; Brissebrat, Guillaume

    2014-05-01

    In-service Aircraft for a Global Observing System (IAGOS, http://www.iagos.org) aims at the provision of long-term, frequent, regular, accurate, and spatially resolved in situ observations of the atmospheric composition. IAGOS observation systems are deployed on a fleet of commercial aircraft. The IAGOS database is an essential part of the global atmospheric monitoring network. Data access is handled by an open access policy based on the submission of research requests, which are reviewed by the PIs. Users can access the data through the following web sites: http://www.iagos.fr or http://www.pole-ether.fr as the IAGOS database is part of the French atmospheric chemistry data centre ETHER (CNES and CNRS). The database is in continuous development and improvement. In the framework of the IGAS project (IAGOS for GMES/COPERNICUS Atmospheric Service), major achievements will be reached, such as metadata and format standardisation in order to interoperate with international portals and other databases, QA/QC procedures and traceability, CARIBIC (Civil Aircraft for the Regular Investigation of the Atmosphere Based on an Instrument Container) data integration within the central database, and real-time data transmission. IGAS work package 2 aims at providing the IAGOS data to users in a standardized format including the necessary metadata and information on data processing, data quality and uncertainties. We are currently redefining and standardizing the IAGOS metadata for interoperable use within GMES/Copernicus. The metadata are compliant with the ISO 19115, INSPIRE and NetCDF-CF conventions. IAGOS data will be provided to users in NetCDF or NASA Ames format.
We are also implementing interoperability between all the involved IAGOS data services, including the central IAGOS database, the former MOZAIC and CARIBIC databases, the Aircraft Research DLR database and the Jülich WCS web application JOIN (Jülich OWS Interface), which combines model outputs with in situ data for intercomparison. The optimal data transfer protocol is being investigated to ensure interoperability. To facilitate satellite and model validation, tools will be made available for co-location and comparison with IAGOS. We will enhance the JOIN application in order to properly display aircraft data as vertical profiles and along individual flight tracks, and to allow for graphical comparison to model results that are accessible through interoperable web services, such as the daily products from the GMES/Copernicus atmospheric service.

  3. Data shopping in an open marketplace: Introducing the Ontogrator web application for marking up data using ontologies and browsing using facets.

    PubMed

    Morrison, Norman; Hancock, David; Hirschman, Lynette; Dawyndt, Peter; Verslyppe, Bert; Kyrpides, Nikos; Kottmann, Renzo; Yilmaz, Pelin; Glöckner, Frank Oliver; Grethe, Jeff; Booth, Tim; Sterk, Peter; Nenadic, Goran; Field, Dawn

    2011-04-29

    In the future, we hope to see an open and thriving data market in which users can find and select data from a wide range of data providers. In such an open access market, data are products that must be packaged accordingly. Increasingly, eCommerce sellers present heterogeneous product lines to buyers using faceted browsing. Using this approach we have developed the Ontogrator platform, which allows for rapid retrieval of data in a way that would be familiar to any online shopper. Using Knowledge Organization Systems (KOS), especially ontologies, Ontogrator uses text mining to mark up data and faceted browsing to help users navigate, query and retrieve data. Ontogrator offers the potential to impact scientific research in two major ways: 1) by significantly improving the retrieval of relevant information; and 2) by significantly reducing the time required to compose standard database queries and assemble information for further research. Here we present a pilot implementation developed in collaboration with the Genomic Standards Consortium (GSC) that includes content from the StrainInfo, GOLD, CAMERA, Silva and PubMed databases. This implementation demonstrates the power of ontogration and highlights that the usefulness of this approach is fully dependent on both the quality of data and the KOS (ontologies) used. Ideally, the use and further expansion of this collaborative system will help to surface issues associated with the underlying quality of annotation and could lead to a systematic means for accessing integrated data resources.
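The faceted browsing Ontogrator borrows from eCommerce can be sketched in a few lines: records marked up with facet terms, value counts displayed per facet, and progressive narrowing by selected values. The records and facet names below are invented, not taken from the pilot implementation:

```python
# Records marked up with ontology terms (facet -> term); invented examples.
records = [
    {"id": "r1", "organism": "E. coli", "habitat": "soil", "source": "GOLD"},
    {"id": "r2", "organism": "E. coli", "habitat": "marine", "source": "CAMERA"},
    {"id": "r3", "organism": "B. subtilis", "habitat": "soil", "source": "GOLD"},
]

def facet_counts(recs, facet):
    """Counts shown next to each facet value, eCommerce-style."""
    counts = {}
    for r in recs:
        counts[r[facet]] = counts.get(r[facet], 0) + 1
    return counts

def select(recs, **chosen):
    """Narrow the record set by the facet values chosen so far."""
    return [r for r in recs if all(r[f] == v for f, v in chosen.items())]

print(facet_counts(records, "habitat"))
print([r["id"] for r in select(records, habitat="soil", source="GOLD")])
```

In a full system the facet terms would come from text-mined ontology annotations rather than hand-entered fields, but the browse-and-narrow interaction is the same.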

  4. Data shopping in an open marketplace: Introducing the Ontogrator web application for marking up data using ontologies and browsing using facets

    PubMed Central

    Morrison, Norman; Hancock, David; Hirschman, Lynette; Dawyndt, Peter; Verslyppe, Bert; Kyrpides, Nikos; Kottmann, Renzo; Yilmaz, Pelin; Glöckner, Frank Oliver; Grethe, Jeff; Booth, Tim; Sterk, Peter; Nenadic, Goran; Field, Dawn

    2011-01-01

    In the future, we hope to see an open and thriving data market in which users can find and select data from a wide range of data providers. In such an open access market, data are products that must be packaged accordingly. Increasingly, eCommerce sellers present heterogeneous product lines to buyers using faceted browsing. Using this approach we have developed the Ontogrator platform, which allows for rapid retrieval of data in a way that would be familiar to any online shopper. Using Knowledge Organization Systems (KOS), especially ontologies, Ontogrator uses text mining to mark up data and faceted browsing to help users navigate, query and retrieve data. Ontogrator offers the potential to impact scientific research in two major ways: 1) by significantly improving the retrieval of relevant information; and 2) by significantly reducing the time required to compose standard database queries and assemble information for further research. Here we present a pilot implementation developed in collaboration with the Genomic Standards Consortium (GSC) that includes content from the StrainInfo, GOLD, CAMERA, Silva and PubMed databases. This implementation demonstrates the power of ontogration and highlights that the usefulness of this approach is fully dependent on both the quality of data and the KOS (ontologies) used. Ideally, the use and further expansion of this collaborative system will help to surface issues associated with the underlying quality of annotation and could lead to a systematic means for accessing integrated data resources. PMID:21677865

  5. pvsR: An Open Source Interface to Big Data on the American Political Sphere

    PubMed Central

    2015-01-01

    Digital data from the political sphere is abundant, omnipresent, and more and more directly accessible through the Internet. Project Vote Smart (PVS) is a prominent example of this big public data and covers various aspects of U.S. politics in astonishing detail. Despite the vast potential of PVS' data for political science, economics, and sociology, it is hardly used in empirical research. The systematic compilation of semi-structured data can be complicated and time consuming, as the data format is not designed for conventional scientific research. This paper presents a new tool that makes the data easily accessible to a broad scientific community. We provide the software called pvsR as an add-on to the R programming environment for statistical computing. This open source interface (OSI) serves as a direct link between a statistical analysis and the large PVS database. The free and open code is expected to substantially reduce the cost of research with PVS' new big public data in a vast variety of possible applications. We discuss its advantages vis-à-vis traditional methods of data generation as well as already existing interfaces. The validity of the library is documented based on an illustration involving female representation in local politics. In addition, pvsR facilitates the replication of research with PVS data at low cost, including the pre-processing of data. Similar OSIs are recommended for other big public databases. PMID:26132154

  6. A review of accessibility of administrative healthcare databases in the Asia-Pacific region.

    PubMed

    Milea, Dominique; Azmi, Soraya; Reginald, Praveen; Verpillat, Patrice; Francois, Clement

    2015-01-01

    We describe and compare the availability and accessibility of administrative healthcare databases (AHDB) in several Asia-Pacific countries: Australia, Japan, South Korea, Taiwan, Singapore, China, Thailand, and Malaysia. The study included hospital records, reimbursement databases, prescription databases, and data linkages. Databases were first identified through PubMed, Google Scholar, and the ISPOR database register. Database custodians were contacted. Six criteria were used to assess the databases and provided the basis for a tool to categorise databases into seven levels ranging from least accessible (Level 1) to most accessible (Level 7). We also categorised overall data accessibility for each country as high, medium, or low based on accessibility of databases as well as the number of academic articles published using the databases. Fifty-four administrative databases were identified. Only a limited number of databases allowed access to raw data and were at Level 7 [Medical Data Vision EBM Provider, Japan Medical Data Centre (JMDC) Claims database and Nihon-Chouzai Pharmacy Claims database in Japan, and Medicare, Pharmaceutical Benefits Scheme (PBS), Centre for Health Record Linkage (CHeReL), HealthLinQ, Victorian Data Linkages (VDL), SA-NT DataLink in Australia]. At Levels 3-6 were several databases from Japan [Hamamatsu Medical University Database, Medi-Trend, Nihon University School of Medicine Clinical Data Warehouse (NUSM)], Australia [Western Australia Data Linkage (WADL)], Taiwan [National Health Insurance Research Database (NHIRD)], South Korea [Health Insurance Review and Assessment Service (HIRA)], and Malaysia [United Nations University (UNU)-Casemix]. Countries were categorised as having a high level of data accessibility (Australia, Taiwan, and Japan), medium level of accessibility (South Korea), or a low level of accessibility (Thailand, China, Malaysia, and Singapore). 
In some countries, data may be available but access is restricted by requirements set by data custodians. Compared with previous research, this study describes the landscape of databases in the selected countries with more granularity, using an assessment tool developed for this purpose. A high number of databases were identified, but most had restricted access, limiting their potential use to support research. We hope that this study helps to improve understanding of the AHDB landscape and to increase data sharing and database research in Asia-Pacific countries.
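The six-criteria, seven-level accessibility scheme described above could, in principle, be expressed as a simple scoring function. The criteria names and the cumulative scoring rule below are invented for illustration; the paper's actual assessment tool is not reproduced in the abstract:

```python
# Hypothetical cumulative criteria, ordered from weakest to strongest access.
CRITERIA = ["exists_publicly", "documented", "external_requests_allowed",
            "aggregate_results", "record_level_access", "raw_data_export"]

def accessibility_level(db):
    """Map a database's yes/no criteria to a level from 1 (least accessible)
    to 7 (most accessible): one level gained per satisfied criterion, in order."""
    level = 1
    for criterion in CRITERIA:
        if db.get(criterion):
            level += 1
        else:
            break  # criteria are treated as cumulative in this sketch
    return level

raw_access_db = dict.fromkeys(CRITERIA, True)  # e.g. a Level-7 claims database
print(accessibility_level(raw_access_db))      # 7
print(accessibility_level({"exists_publicly": True, "documented": True}))  # 3
```

A database granting raw-data access satisfies every tier, so it lands at Level 7, matching the paper's observation that only a handful of databases reached that level.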

  7. Rapid sample classification using an open port sampling interface coupled with liquid introduction atmospheric pressure ionization mass spectrometry.

    PubMed

    Van Berkel, Gary J; Kertesz, Vilmos

    2017-02-15

    An "Open Access"-like mass spectrometric platform to fully utilize the simplicity of the manual open port sampling interface for rapid characterization of unprocessed samples by liquid introduction atmospheric pressure ionization mass spectrometry has been lacking. The in-house developed integrated software with a simple, small and relatively low-cost mass spectrometry system introduced here fills this void. Software was developed to operate the mass spectrometer, to collect and process mass spectrometric data files, to build a database and to classify samples using such a database. These tasks were accomplished via the vendor-provided software libraries. Sample classification based on spectral comparison utilized the spectral contrast angle method. Using the developed software platform near real-time sample classification is exemplified using a series of commercially available blue ink rollerball pens and vegetable oils. In the case of the inks, full scan positive and negative ion ESI mass spectra were both used for database generation and sample classification. For the vegetable oils, full scan positive ion mode APCI mass spectra were recorded. The overall accuracy of the employed spectral contrast angle statistical model was 95.3% and 98% in case of the inks and oils, respectively, using leave-one-out cross-validation. This work illustrates that an open port sampling interface/mass spectrometer combination, with appropriate instrument control and data processing software, is a viable direct liquid extraction sampling and analysis system suitable for the non-expert user and near real-time sample classification via database matching. Published in 2016. This article is a U.S. Government work and is in the public domain in the USA.

  8. Examination of Industry Payments to Radiation Oncologists in 2014 Using the Centers for Medicare and Medicaid Services Open Payments Database

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Jairam, Vikram; Yu, James B., E-mail: james.b.yu@yale.edu

    Purpose: To use the Centers for Medicare and Medicaid Services Open Payments database to characterize payments made to radiation oncologists and compare their payment profile with that of medical and surgical oncologists. Methods and Materials: The June 2015 release of the Open Payments database was accessed, containing all payments made to physicians in 2014. The general payments dataset was used for analysis. Data on payments made to medical, surgical, and radiation oncologists were obtained and compared. Within radiation oncology, data regarding payment category, sponsorship, and geographic distribution were identified. Basic statistics, including mean, median, range, and sum, were calculated by provider and by transaction. Results: Among the 3 oncologic specialties, radiation oncology had the smallest proportion (58%) of compensated physicians and the lowest mean ($1620) and median ($112) payment per provider. Surgical oncology had the highest proportion (84%) of compensated physicians, whereas medical oncology had the highest mean ($6371) and median ($448) payment per physician. Within radiation oncology, nonconsulting services accounted for the most money paid to physicians ($1,042,556), whereas the majority of sponsors were medical device companies (52%). Radiation oncologists in the West accepted the most money ($2,041,603) of any US Census region. Conclusions: Radiation oncologists in 2014 received a large number of payments from industry, although fewer than their medical or surgical counterparts. As the Open Payments database continues to be improved, it remains to be seen whether this information will be used by patients to inform choice of providers or by lawmakers to enact policy regulating physician–industry relationships.

  9. PEP725 Pan European Phenological Database

    NASA Astrophysics Data System (ADS)

    Koch, Elisabeth; Adler, Silke; Ungersböck, Markus; Zach-Hermann, Susanne

    2010-05-01

    Europe is in the fortunate situation that it has a long tradition of phenological networking: the history of collecting phenological data and using them in climatology has its starting point in 1751, when Carl von Linné outlined in his work Philosophia Botanica methods for compiling annual plant calendars of leaf opening, flowering, fruiting and leaf fall together with climatological observations "so as to show how areas differ". The Societas Meteorologica Palatina at Mannheim, well known for its first Europe-wide meteorological network, also established a phenological network, which was active from 1781 to 1792. More recently, phenological observations have been carried out routinely in most European countries for more than 50 years by different governmental and non-governmental organisations following different observation guidelines, with the data stored in different places and formats. This has seriously hampered pan-European studies, as one has to approach many National Observation Programs (NOPs) to get access to the data before one can begin to bring them into a uniform style. From 2004 to 2009 the COST Action 725 ran with the main objective of establishing a European reference data set of phenological observations that can be used for climatological purposes, especially climate monitoring and the detection of changes. So far the common database/reference data set of COST725 comprises 7,687,248 records from 7,285 observation sites in 15 countries and the International Phenological Gardens (IPG), spanning the timeframe from 1951 to 2000. ZAMG is hosting the database. PEP725 started in January 2010 and will not only maintain and update the database, but also bring in phenological data from the time before 1951, develop better quality-checking procedures and ensure open access to the database.
    An attractive webpage will make phenology and climate impacts on vegetation more visible to the public, enabling monitoring of vegetation development.

  10. PEP725 Pan European Phenological Database

    NASA Astrophysics Data System (ADS)

    Koch, E.; Adler, S.; Lipa, W.; Ungersböck, M.; Zach-Hermann, S.

    2010-09-01

    Europe is in the fortunate situation that it has a long tradition of phenological networking: the history of collecting phenological data and using them in climatology has its starting point in 1751, when Carl von Linné outlined in his work Philosophia Botanica methods for compiling annual plant calendars of leaf opening, flowering, fruiting and leaf fall together with climatological observations "so as to show how areas differ". More recently, phenological observations have been carried out routinely in most European countries for more than 50 years by different governmental and non-governmental organisations following different observation guidelines, with the data stored in different places and formats. This has seriously hampered pan-European studies, as one has to approach many network operators to get access to the data before one can begin to bring them into a uniform style. From 2004 to 2009 the COST Action 725 established a Europe-wide data set of phenological observations. The deliverables of this COST action were not only the common phenological database and common observation guidelines: COST725 also helped to trigger a revival of some old networks and to establish new ones, for instance in Sweden. At the end of the COST action in 2009, the database comprised about 8 million records in total from 15 European countries plus the data from the International Phenological Gardens (IPG). In January 2010 PEP725 began its work as a follow-up project with funding from EUMETNET, the network of European meteorological services, and from ZAMG, the Austrian national meteorological service. PEP725 will not only maintain and update the COST725 database, but also bring in phenological data from the time before 1951, develop better quality-checking procedures and ensure open access to the database.
    An attractive webpage will make phenology and climate impacts on vegetation more visible to the public, enabling monitoring of vegetation development.

  11. WikiGenomes: an open web application for community consumption and curation of gene annotation data in Wikidata

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Putman, Tim E.; Lelong, Sebastien; Burgstaller-Muehlbacher, Sebastian

    With the advancement of genome-sequencing technologies, new genomes are being sequenced daily. Although these sequences are deposited in publicly available data warehouses, their functional and genomic annotations (beyond genes, which are predicted automatically) mostly reside in the text of primary publications. Professional curators are hard at work extracting those annotations from the literature for the most studied organisms and depositing them in structured databases. However, the resources do not exist to fund the comprehensive curation of the thousands of newly sequenced organisms in this manner. Here, we describe WikiGenomes (wikigenomes.org), a web application that facilitates the consumption and curation of genomic data by the entire scientific community. WikiGenomes is based on Wikidata, an openly editable knowledge graph with the goal of aggregating published knowledge into a free and open database. WikiGenomes empowers the individual genomic researcher to contribute their expertise to the curation effort and integrates the knowledge into Wikidata, enabling it to be accessed by anyone without restriction.

  12. WikiGenomes: an open web application for community consumption and curation of gene annotation data in Wikidata

    DOE PAGES

    Putman, Tim E.; Lelong, Sebastien; Burgstaller-Muehlbacher, Sebastian; ...

    2017-03-06

    With the advancement of genome-sequencing technologies, new genomes are being sequenced daily. Although these sequences are deposited in publicly available data warehouses, their functional and genomic annotations (beyond genes, which are predicted automatically) mostly reside in the text of primary publications. Professional curators are hard at work extracting those annotations from the literature for the most studied organisms and depositing them in structured databases. However, the resources do not exist to fund the comprehensive curation of the thousands of newly sequenced organisms in this manner. Here, we describe WikiGenomes (wikigenomes.org), a web application that facilitates the consumption and curation of genomic data by the entire scientific community. WikiGenomes is based on Wikidata, an openly editable knowledge graph with the goal of aggregating published knowledge into a free and open database. WikiGenomes empowers the individual genomic researcher to contribute their expertise to the curation effort and integrates the knowledge into Wikidata, enabling it to be accessed by anyone without restriction.

  13. The future application of GML database in GIS

    NASA Astrophysics Data System (ADS)

    Deng, Yuejin; Cheng, Yushu; Jing, Lianwen

    2006-10-01

    In 2004, the Geography Markup Language (GML) Implementation Specification (version 3.1.1) was published by the Open Geospatial Consortium, Inc. More and more applications in geospatial data sharing and interoperability now depend on GML. The primary purpose of GML is the exchange and transport of geo-information through standard modelling and encoding of geographic phenomena. However, the problem of how to organize and access large volumes of GML data effectively arises in applications, and research on GML databases focuses on this problem. The effective storage of GML data is a hot topic in the GIS community today. A GML Database Management System (GDBMS) mainly deals with the storage and management of GML data. Two types of XML database are currently distinguished: native XML databases and XML-enabled databases. Since GML is an application of the XML standard to geographic data, XML database systems can also be used for the management of GML. In this paper, we review the state of the art of XML databases, including storage, indexing, query languages and management systems, and then move on to the GML database. At the end, the future prospects of GML databases in GIS applications are presented.
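    Because GML is an application of XML, ordinary XML tooling is the natural entry point for storing and querying GML documents. A minimal parsing sketch, assuming a simplified GML 3 point feature (the coordinate values are illustrative):

```python
import xml.etree.ElementTree as ET

# A minimal, illustrative GML 3 fragment; the namespace follows the
# OGC GML schema, while the coordinate content is made up.
GML = """<gml:Point xmlns:gml="http://www.opengis.net/gml" srsName="EPSG:4326">
  <gml:pos>48.2 16.37</gml:pos>
</gml:Point>"""

ns = {"gml": "http://www.opengis.net/gml"}
root = ET.fromstring(GML)
lat, lon = (float(v) for v in root.find("gml:pos", ns).text.split())
print(lat, lon)
```

    A GDBMS builds on the same parsed representation, adding persistent storage, spatial indexing and a query language on top.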

  14. Deployment of Directory Service for IEEE N Bus Test System Information

    NASA Astrophysics Data System (ADS)

    Barman, Amal; Sil, Jaya

    2008-10-01

    Exchanging information over the Internet and intranets has become a de facto standard in computer applications among various users and organizations. Distributed system studies, e-governance and similar applications require transparent information exchange between applications, constituencies, manufacturers and vendors. To serve these purposes, a database system is needed for storing system data and other relevant information. A directory service, which is a specialized database together with an access protocol, could be a single solution since it runs over TCP/IP, is supported by all POSIX-compliant platforms and is based on open standards. This paper describes a way to deploy a directory service to store IEEE n-bus test system data and to integrate a load flow program with it.

  15. Reflective Database Access Control

    ERIC Educational Resources Information Center

    Olson, Lars E.

    2009-01-01

    "Reflective Database Access Control" (RDBAC) is a model in which a database privilege is expressed as a database query itself, rather than as a static privilege contained in an access control list. RDBAC aids the management of database access controls by improving the expressiveness of policies. However, such policies introduce new interactions…
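    The core RDBAC idea, a privilege that is itself a query rather than a static ACL entry, can be sketched with a database view whose definition consults session state. All table, view and user names below are invented for this illustration:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE employees (name TEXT, salary INTEGER, manager TEXT);
INSERT INTO employees VALUES
  ('alice', 90, NULL), ('bob', 60, 'alice'), ('carol', 55, 'alice');
CREATE TABLE session (username TEXT);
-- Reflective policy: the privilege is expressed as a query --
-- a user may read the salaries of their own direct reports.
CREATE VIEW my_reports AS
  SELECT name, salary FROM employees
  WHERE manager = (SELECT username FROM session);
""")
con.execute("INSERT INTO session VALUES ('alice')")
rows = con.execute("SELECT name FROM my_reports ORDER BY name").fetchall()
print(rows)  # only alice's reports are visible through the view
```

    Because the policy is evaluated at query time against live data, it adapts automatically when the org chart changes, which is the expressiveness gain the abstract refers to.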

  16. BioMart: a data federation framework for large collaborative projects.

    PubMed

    Zhang, Junjun; Haider, Syed; Baran, Joachim; Cros, Anthony; Guberman, Jonathan M; Hsu, Jack; Liang, Yong; Yao, Long; Kasprzyk, Arek

    2011-01-01

    BioMart is a freely available, open source, federated database system that provides unified access to disparate, geographically distributed data sources. It is designed to be data agnostic and platform independent, such that existing databases can easily be incorporated into the BioMart framework. BioMart allows databases hosted on different servers to be presented seamlessly to users, facilitating collaborative projects between different research groups. BioMart contains several levels of query optimization to efficiently manage large data sets and offers a diverse selection of graphical user interfaces and application programming interfaces to ensure that queries can be performed in whatever manner is most convenient for the user. The software has now been adopted by a large number of different biological databases spanning a wide range of data types, providing a rich source of annotation available to bioinformaticians and biologists alike.

  17. PmiRExAt: plant miRNA expression atlas database and web applications

    PubMed Central

    Gurjar, Anoop Kishor Singh; Panwar, Abhijeet Singh; Gupta, Rajinder; Mantri, Shrikant S.

    2016-01-01

    High-throughput small RNA (sRNA) sequencing technology enables an entirely new perspective for plant microRNA (miRNA) research and has immense potential to unravel regulatory networks. Novel insights gained through data mining of the publicly available, rich resource of sRNA data will help in designing biotechnology-based approaches for crop improvement to enhance plant yield and nutritional value. Bioinformatics resources enabling meta-analysis of miRNA expression across multiple plant species are still evolving. Here, we report PmiRExAt, a new online database resource that provides a plant miRNA expression atlas. The web-based repository comprises miRNA expression profiles and a query tool for 1859 wheat, 2330 rice and 283 maize miRNAs. The database interface offers open and easy access to miRNA expression profiles and helps in identifying tissue-preferential, differential and constitutively expressing miRNAs. A feature enabling expression study of conserved miRNAs across multiple species is also implemented. A custom expression analysis feature enables expression analysis of novel miRNAs in a total of 117 datasets. New sRNA datasets can also be uploaded for analysing miRNA expression profiles for 73 plant species. The PmiRExAt application program interface, a Simple Object Access Protocol web service, allows other programmers to remotely invoke the methods written for performing programmatic search operations on the PmiRExAt database. Database URL: http://pmirexat.nabi.res.in. PMID:27081157

  18. Addressing Open Water Data Challenges in the Bureau of Reclamation

    NASA Astrophysics Data System (ADS)

    Brekke, L. D.; Danner, A.; Nagode, J.; Rocha, J.; Poulton, S.; Anderson, A.

    2017-12-01

    The Bureau of Reclamation is the largest wholesaler of water in the United States. Operating in the 17 western states, Reclamation serves water to 31 million people, provides irrigation water to 20 percent of Western farmers, and is the second largest producer of hydroelectric power in the United States. Through these activities, Reclamation generates large amounts of water and water-related data describing reservoir and river system conditions, hydropower, environmental compliance activities, infrastructure assets, and other aspects of Reclamation's mission. Reclamation aims to make water and water-related data sets more easily found, accessed, and used in decision-making in order to benefit the public, private sector, and research communities. Historically, there has not been an integrated, bureau-wide system to store data in machine-readable formats, nor a system to permit centralized browsing, open access, and web services. Reclamation began addressing these limitations by developing the Reclamation Water Information System (RWIS), released in spring 2017 (https://water.usbr.gov/). A bureau-wide team contributed to RWIS development, including water data stewards, database administrators, and information technology (IT) specialists. The first RWIS release publishes reservoir time series data from Reclamation's five regions and includes a map interface for site identification, a query interface for data discovery and access, and web services for automated retrieval. As RWIS enhancement continues, the development team is building a companion system, the Reclamation Information Sharing Environment (RISE), to provide access to other data subjects and types (geospatial, documents).
    While RWIS and RISE are promising starts, Reclamation continues to face challenges in addressing its open water data goals: making data consolidation and open publishing a value-added activity for programs that publish data locally, going beyond open access to also provide decision support, and scaling up IT solutions for future success as Reclamation programs increasingly elect to publish more and more data through RWIS/RISE, thereby creating a big data challenge. This presentation will highlight activity status, lessons learned, and future directions.
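    The abstract mentions web services for automated retrieval of reservoir time series. The JSON layout below is purely hypothetical (the real RWIS schema is not described here); the sketch only shows the kind of client-side handling such a service implies:

```python
import json

# Hypothetical time-series payload; site name, parameter code and
# field names are invented, not the actual RWIS response format.
payload = json.loads("""
{
  "site": "example-reservoir",
  "parameter": "storage_af",
  "values": [{"date": "2017-01-01", "value": 1200.5},
             {"date": "2017-01-02", "value": 1198.0}]
}
""")
# Flatten the response into (date, value) pairs for analysis.
series = [(v["date"], v["value"]) for v in payload["values"]]
print(series)
```

    In a real client the string would come from an HTTP request to the service's query endpoint rather than an inline literal.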

  19. Fracture Systems - Digital Field Data Capture

    NASA Astrophysics Data System (ADS)

    Haslam, Richard

    2017-04-01

    Fracture systems play a key role in subsurface resources and developments, including groundwater and nuclear waste repositories. There is increasing recognition of the need to record and quantify fracture systems to better understand the potential risks and opportunities. With the advent of smart phones and digital field geology, numerous systems have been designed for field data collection. Digital field data collection allows for rapid data collection and interpretation. However, many of the current systems have been designed to cover the full range of field mapping and data needs, making them large and complex, and many do not offer the tools necessary for the collection of fracture-specific data. A new multiplatform data recording app has been developed for the collection of field data on faults and joint/fracture systems, and a relational database designed for storage and retrieval. The app has been developed to collect fault and joint/fracture data and is based on an open source platform. Data are captured in a form-based approach, including validity checks to ensure data are collected systematically. In addition to typical structural data collection, the International Society for Rock Mechanics (ISRM) "Suggested Methods for the Quantitative Description of Discontinuities in Rock Masses" is included, allowing industry standards to be followed and opening up the tools to industry as well as research. All data are uploaded automatically to a secure server, and users can view their own data and open access data as required. Users can decide whether the data they produce should remain private or be open access. A series of automatic reports can be produced and/or the data downloaded. The database will hold a national archive, and data retrieval will be made through a web interface.

  20. Database Access Systems.

    ERIC Educational Resources Information Center

    Dalrymple, Prudence W.; Roderer, Nancy K.

    1994-01-01

    Highlights the changes that have occurred from 1987-93 in database access systems. Topics addressed include types of databases, including CD-ROMs; enduser interface; database selection; database access management, including library instruction and use of primary literature; economic issues; database users; the search process; and improving…

  1. VIEWCACHE: An incremental pointer-based access method for autonomous interoperable databases

    NASA Technical Reports Server (NTRS)

    Roussopoulos, N.; Sellis, Timos

    1992-01-01

    One of the biggest problems facing NASA today is providing scientists efficient access to a large number of distributed databases. Our pointer-based incremental database access method, VIEWCACHE, provides such an interface for accessing distributed data sets and directories. VIEWCACHE allows database browsing and searching, performing inter-database cross-referencing with no actual data movement between database sites. This organization and processing is especially suitable for managing astrophysics databases which are physically distributed all over the world. Once the search is complete, the set of collected pointers pointing to the desired data is cached. VIEWCACHE includes spatial access methods for accessing image data sets, which provide much easier query formulation by referring directly to the image and very efficient search for objects contained within a two-dimensional window. We will develop and optimize a VIEWCACHE External Gateway Access to database management systems to facilitate distributed database search.
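    The pointer-caching idea can be sketched in a few lines: a search collects pointers to remote records rather than moving the data, and a two-dimensional window query then filters the cached pointers. Site names, record ids and coordinates below are invented for illustration:

```python
# Toy sketch of the VIEWCACHE idea: each cached entry is a pointer
# (site, record id, position) to data that stays at its home site.
cached_pointers = [
    {"site": "archive-A", "id": 1, "x": 10.0, "y": 20.0},
    {"site": "archive-B", "id": 7, "x": 55.0, "y": 42.0},
    {"site": "archive-A", "id": 9, "x": 52.0, "y": 40.0},
]

def window_query(pointers, xmin, ymin, xmax, ymax):
    """Return the pointers whose position lies inside the 2-D window."""
    return [p for p in pointers
            if xmin <= p["x"] <= xmax and ymin <= p["y"] <= ymax]

hits = window_query(cached_pointers, 50, 35, 60, 45)
print([(p["site"], p["id"]) for p in hits])
```

    Only after the window query narrows the candidate set would the actual image data be fetched from the owning sites.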

  2. Human Connectome Project Informatics: quality control, database services, and data visualization

    PubMed Central

    Marcus, Daniel S.; Harms, Michael P.; Snyder, Abraham Z.; Jenkinson, Mark; Wilson, J Anthony; Glasser, Matthew F.; Barch, Deanna M.; Archie, Kevin A.; Burgess, Gregory C.; Ramaratnam, Mohana; Hodge, Michael; Horton, William; Herrick, Rick; Olsen, Timothy; McKay, Michael; House, Matthew; Hileman, Michael; Reid, Erin; Harwell, John; Coalson, Timothy; Schindler, Jon; Elam, Jennifer S.; Curtiss, Sandra W.; Van Essen, David C.

    2013-01-01

    The Human Connectome Project (HCP) has developed protocols, standard operating and quality control procedures, and a suite of informatics tools to enable high throughput data collection, data sharing, automated data processing and analysis, and data mining and visualization. Quality control procedures include methods to maintain data collection consistency over time, to measure head motion, and to establish quantitative modality-specific overall quality assessments. Database services developed as customizations of the XNAT imaging informatics platform support both internal daily operations and open access data sharing. The Connectome Workbench visualization environment enables user interaction with HCP data and is increasingly integrated with the HCP's database services. Here we describe the current state of these procedures and tools and their application in the ongoing HCP study. PMID:23707591

  3. JASPAR 2018: update of the open-access database of transcription factor binding profiles and its web framework.

    PubMed

    Khan, Aziz; Fornes, Oriol; Stigliani, Arnaud; Gheorghe, Marius; Castro-Mondragon, Jaime A; van der Lee, Robin; Bessy, Adrien; Chèneby, Jeanne; Kulkarni, Shubhada R; Tan, Ge; Baranasic, Damir; Arenillas, David J; Sandelin, Albin; Vandepoele, Klaas; Lenhard, Boris; Ballester, Benoît; Wasserman, Wyeth W; Parcy, François; Mathelier, Anthony

    2018-01-04

    JASPAR (http://jaspar.genereg.net) is an open-access database of curated, non-redundant transcription factor (TF)-binding profiles stored as position frequency matrices (PFMs) and TF flexible models (TFFMs) for TFs across multiple species in six taxonomic groups. In the 2018 release of JASPAR, the CORE collection has been expanded with 322 new PFMs (60 for vertebrates and 262 for plants) and 33 PFMs were updated (24 for vertebrates, 8 for plants and 1 for insects). These new profiles represent a 30% expansion compared to the 2016 release. In addition, we have introduced 316 TFFMs (95 for vertebrates, 218 for plants and 3 for insects). This release incorporates clusters of similar PFMs in each taxon and each TF class per taxon. The JASPAR 2018 CORE vertebrate collection of PFMs was used to predict TF-binding sites in the human genome. The predictions are made available to the scientific community through a UCSC Genome Browser track data hub. Finally, this update comes with a new web framework with an interactive and responsive user-interface, along with new features. All the underlying data can be retrieved programmatically using a RESTful API and through the JASPAR 2018 R/Bioconductor package. © The Author(s) 2017. Published by Oxford University Press on behalf of Nucleic Acids Research.
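    A position frequency matrix, the representation JASPAR stores, can be used directly to score candidate binding sites by summing per-position base counts. A toy sketch (the 4 x 3 matrix and sequence are made up; real matrices come from the database, and in practice PFMs are usually converted to log-odds weight matrices before scanning):

```python
# Illustrative PFM: counts of each base observed at each of the
# three motif positions (values are invented, not a JASPAR profile).
PFM = {
    "A": [8, 0, 1],
    "C": [1, 9, 0],
    "G": [0, 1, 9],
    "T": [1, 0, 0],
}

def score(window):
    """Sum the per-position counts for the bases in the window."""
    return sum(PFM[base][i] for i, base in enumerate(window))

def best_site(seq):
    """Slide the matrix along seq; return (best score, offset)."""
    width = len(PFM["A"])
    return max((score(seq[i:i + width]), i)
               for i in range(len(seq) - width + 1))

print(best_site("TTACGTT"))  # the ACG window dominates
```

    The JASPAR RESTful API and the R/Bioconductor package mentioned in the abstract are the supported routes for retrieving real profiles programmatically.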

  4. JASPAR 2018: update of the open-access database of transcription factor binding profiles and its web framework

    PubMed Central

    Fornes, Oriol; Stigliani, Arnaud; Gheorghe, Marius; Castro-Mondragon, Jaime A; Bessy, Adrien; Chèneby, Jeanne; Kulkarni, Shubhada R; Tan, Ge; Baranasic, Damir; Arenillas, David J; Vandepoele, Klaas; Parcy, François

    2018-01-01

    JASPAR (http://jaspar.genereg.net) is an open-access database of curated, non-redundant transcription factor (TF)-binding profiles stored as position frequency matrices (PFMs) and TF flexible models (TFFMs) for TFs across multiple species in six taxonomic groups. In the 2018 release of JASPAR, the CORE collection has been expanded with 322 new PFMs (60 for vertebrates and 262 for plants) and 33 PFMs were updated (24 for vertebrates, 8 for plants and 1 for insects). These new profiles represent a 30% expansion compared to the 2016 release. In addition, we have introduced 316 TFFMs (95 for vertebrates, 218 for plants and 3 for insects). This release incorporates clusters of similar PFMs in each taxon and each TF class per taxon. The JASPAR 2018 CORE vertebrate collection of PFMs was used to predict TF-binding sites in the human genome. The predictions are made available to the scientific community through a UCSC Genome Browser track data hub. Finally, this update comes with a new web framework with an interactive and responsive user-interface, along with new features. All the underlying data can be retrieved programmatically using a RESTful API and through the JASPAR 2018 R/Bioconductor package. PMID:29140473

  5. The AMMA database

    NASA Astrophysics Data System (ADS)

    Boichard, Jean-Luc; Brissebrat, Guillaume; Cloche, Sophie; Eymard, Laurence; Fleury, Laurence; Mastrorillo, Laurence; Moulaye, Oumarou; Ramage, Karim

    2010-05-01

    The AMMA project includes aircraft, ground-based and ocean measurements, intensive use of satellite data and diverse modelling studies. The AMMA database therefore aims at storing a great amount and a large variety of data, and at providing the data as rapidly and safely as possible to the AMMA research community. In order to stimulate the exchange of information and collaboration between researchers from different disciplines or using different tools, the database provides a detailed description of the products and uses standardized formats. The AMMA database contains: - AMMA field campaign datasets; - historical data in West Africa from 1850 onwards (operational networks and previous scientific programs); - satellite products from past and future satellites, (re-)mapped on a regular latitude/longitude grid and stored in NetCDF format (CF convention); - model outputs from atmosphere or ocean operational (re-)analyses and forecasts, and from research simulations. The model outputs are processed in the same way as the satellite products. Before accessing the data, every user has to sign the AMMA data and publication policy. This charter only covers the use of data in the framework of scientific objectives and categorically excludes the redistribution of data to third parties and usage for commercial applications. Collaboration between data producers and users, and mention of the AMMA project in any publication, is also required. The AMMA database and the associated on-line tools have been fully developed and are managed by two teams in France (IPSL Database Centre, Paris and OMP, Toulouse). Users can access data from both data centres through a single web portal. This website is composed of different modules: - Registration: forms to register and to read and sign the data use charter when a user visits for the first time; - Data access interface: a user-friendly tool for building a data extraction request by selecting various criteria such as location, time and parameters.
    The request can concern local, satellite and model data. - Documentation: a catalogue of all the available data and their metadata. These tools have been developed using standard and free languages and software: - a Linux system with an Apache web server and a Tomcat application server; - J2EE tools: JSF and Struts frameworks, Hibernate; - relational database management systems: PostgreSQL and MySQL; - an OpenLDAP directory. In order to facilitate access to the data by African scientists, the complete system has been mirrored at the AGRHYMET Regional Centre in Niamey and has been operational there since January 2009. Users can now access metadata and request data through either of two equivalent portals: http://database.amma-international.org or http://amma.agrhymet.ne/amma-data.

  6. Authorial and institutional stratification in open access publishing: the case of global health research

    PubMed Central

    Haustein, Stefanie; Smith, Elise; Larivière, Vincent; Alperin, Juan Pablo

    2018-01-01

    Using a database of recent articles published in the field of Global Health research, we examine institutional sources of stratification in publishing access outcomes. Traditionally, attention to inequality in scientific publishing has focused on prestige hierarchies in established print journals. This project examines stratification in contemporary publishing with a particular focus on subscription versus various Open Access (OA) publishing options. Findings show that authors working at lower-ranked universities are more likely to publish in closed/paywalled outlets, and less likely to choose outlets that involve an article processing charge (APC; gold or hybrid OA). We also analyze institutional differences and stratification in the APC costs paid in various journals. Authors affiliated with higher-ranked institutions, as well as hospitals and non-profit organizations, pay relatively higher APCs for gold and hybrid OA publications. The results suggest that authors affiliated with high-ranked universities and well-funded institutions tend to have more resources to choose paid publishing options. Our research suggests new professional hierarchies developing in contemporary publishing, where various OA publishing options are becoming increasingly prominent. Just as there is stratification in institutional representation between different types of publishing access, there is also inequality within access types. PMID:29479492

  7. Screening the Medicines for Malaria Venture Pathogen Box across Multiple Pathogens Reclassifies Starting Points for Open-Source Drug Discovery

    PubMed Central

    Sykes, Melissa L.; Jones, Amy J.; Shelper, Todd B.; Simpson, Moana; Lang, Rebecca; Poulsen, Sally-Ann; Sleebs, Brad E.

    2017-01-01

    Open-access drug discovery provides a substantial resource for diseases primarily affecting the poor and disadvantaged. The open-access Pathogen Box collection comprises compounds with demonstrated biological activity against specific pathogenic organisms. The supply of this resource by the Medicines for Malaria Venture has the potential to provide new chemical starting points for a number of tropical and neglected diseases, through repurposing of these compounds for use in drug discovery campaigns for these additional pathogens. We tested the Pathogen Box against kinetoplastid parasites and malaria life cycle stages in vitro. Consequently, chemical starting points for malaria, human African trypanosomiasis, Chagas disease, and leishmaniasis drug discovery efforts have been identified. Inclusive of this in vitro biological evaluation, outcomes from extensive literature reviews and database searches are provided. This information encompasses commercial availability, literature reference citations, other aliases and ChEMBL numbers with associated biological activity, where available. The release of this new data for the Pathogen Box collection into the public domain will aid the open-source model of drug discovery. Importantly, this will provide novel chemical starting points for drug discovery and target identification in tropical disease research. PMID:28674055

  8. Screening the Medicines for Malaria Venture Pathogen Box across Multiple Pathogens Reclassifies Starting Points for Open-Source Drug Discovery.

    PubMed

    Duffy, Sandra; Sykes, Melissa L; Jones, Amy J; Shelper, Todd B; Simpson, Moana; Lang, Rebecca; Poulsen, Sally-Ann; Sleebs, Brad E; Avery, Vicky M

    2017-09-01

    Open-access drug discovery provides a substantial resource for diseases primarily affecting the poor and disadvantaged. The open-access Pathogen Box collection comprises compounds with demonstrated biological activity against specific pathogenic organisms. The supply of this resource by the Medicines for Malaria Venture has the potential to provide new chemical starting points for a number of tropical and neglected diseases, through repurposing of these compounds for use in drug discovery campaigns for these additional pathogens. We tested the Pathogen Box against kinetoplastid parasites and malaria life cycle stages in vitro. Consequently, chemical starting points for malaria, human African trypanosomiasis, Chagas disease, and leishmaniasis drug discovery efforts have been identified. Inclusive of this in vitro biological evaluation, outcomes from extensive literature reviews and database searches are provided. This information encompasses commercial availability, literature reference citations, other aliases and ChEMBL numbers with associated biological activity, where available. The release of this new data for the Pathogen Box collection into the public domain will aid the open-source model of drug discovery. Importantly, this will provide novel chemical starting points for drug discovery and target identification in tropical disease research. Copyright © 2017 Duffy et al.

  9. Federated Web-accessible Clinical Data Management within an Extensible NeuroImaging Database

    PubMed Central

    Keator, David B.; Wei, Dingying; Fennema-Notestine, Christine; Pease, Karen R.; Bockholt, Jeremy; Grethe, Jeffrey S.

    2010-01-01

    Managing vast datasets collected throughout multiple clinical imaging communities has become critical with the ever increasing and diverse nature of datasets. Development of data management infrastructure is further complicated by technical and experimental advances that drive modifications to existing protocols and acquisition of new types of research data to be incorporated into existing data management systems. In this paper, an extensible data management system for clinical neuroimaging studies is introduced: The Human Clinical Imaging Database (HID) and Toolkit. The database schema is constructed to support the storage of new data types without changes to the underlying schema. The complex infrastructure allows management of experiment data, such as image protocol and behavioral task parameters, as well as subject-specific data, including demographics, clinical assessments, and behavioral task performance metrics. Of significant interest, embedded clinical data entry and management tools enhance both consistency of data reporting and automatic entry of data into the database. The Clinical Assessment Layout Manager (CALM) allows users to create on-line data entry forms for use within and across sites, through which data is pulled into the underlying database via the generic clinical assessment management engine (GAME). Importantly, the system is designed to operate in a distributed environment, serving both human users and client applications in a service-oriented manner. Querying capabilities use a built-in multi-database parallel query builder/result combiner, allowing web-accessible queries within and across multiple federated databases. The system along with its documentation is open-source and available from the Neuroimaging Informatics Tools and Resource Clearinghouse (NITRC) site. PMID:20567938

  10. Federated web-accessible clinical data management within an extensible neuroimaging database.

    PubMed

    Ozyurt, I Burak; Keator, David B; Wei, Dingying; Fennema-Notestine, Christine; Pease, Karen R; Bockholt, Jeremy; Grethe, Jeffrey S

    2010-12-01

    Managing vast datasets collected throughout multiple clinical imaging communities has become critical with the ever increasing and diverse nature of datasets. Development of data management infrastructure is further complicated by technical and experimental advances that drive modifications to existing protocols and acquisition of new types of research data to be incorporated into existing data management systems. In this paper, an extensible data management system for clinical neuroimaging studies is introduced: The Human Clinical Imaging Database (HID) and Toolkit. The database schema is constructed to support the storage of new data types without changes to the underlying schema. The complex infrastructure allows management of experiment data, such as image protocol and behavioral task parameters, as well as subject-specific data, including demographics, clinical assessments, and behavioral task performance metrics. Of significant interest, embedded clinical data entry and management tools enhance both consistency of data reporting and automatic entry of data into the database. The Clinical Assessment Layout Manager (CALM) allows users to create on-line data entry forms for use within and across sites, through which data is pulled into the underlying database via the generic clinical assessment management engine (GAME). Importantly, the system is designed to operate in a distributed environment, serving both human users and client applications in a service-oriented manner. Querying capabilities use a built-in multi-database parallel query builder/result combiner, allowing web-accessible queries within and across multiple federated databases. The system along with its documentation is open-source and available from the Neuroimaging Informatics Tools and Resource Clearinghouse (NITRC) site.

  11. A review of accessibility of administrative healthcare databases in the Asia-Pacific region

    PubMed Central

    Milea, Dominique; Azmi, Soraya; Reginald, Praveen; Verpillat, Patrice; Francois, Clement

    2015-01-01

    Objective We describe and compare the availability and accessibility of administrative healthcare databases (AHDB) in several Asia-Pacific countries: Australia, Japan, South Korea, Taiwan, Singapore, China, Thailand, and Malaysia. Methods The study included hospital records, reimbursement databases, prescription databases, and data linkages. Databases were first identified through PubMed, Google Scholar, and the ISPOR database register. Database custodians were contacted. Six criteria were used to assess the databases and provided the basis for a tool to categorise databases into seven levels ranging from least accessible (Level 1) to most accessible (Level 7). We also categorised overall data accessibility for each country as high, medium, or low based on accessibility of databases as well as the number of academic articles published using the databases. Results Fifty-four administrative databases were identified. Only a limited number of databases allowed access to raw data and were at Level 7 [Medical Data Vision EBM Provider, Japan Medical Data Centre (JMDC) Claims database and Nihon-Chouzai Pharmacy Claims database in Japan, and Medicare, Pharmaceutical Benefits Scheme (PBS), Centre for Health Record Linkage (CHeReL), HealthLinQ, Victorian Data Linkages (VDL), SA-NT DataLink in Australia]. At Levels 3–6 were several databases from Japan [Hamamatsu Medical University Database, Medi-Trend, Nihon University School of Medicine Clinical Data Warehouse (NUSM)], Australia [Western Australia Data Linkage (WADL)], Taiwan [National Health Insurance Research Database (NHIRD)], South Korea [Health Insurance Review and Assessment Service (HIRA)], and Malaysia [United Nations University (UNU)-Casemix]. Countries were categorised as having a high level of data accessibility (Australia, Taiwan, and Japan), medium level of accessibility (South Korea), or a low level of accessibility (Thailand, China, Malaysia, and Singapore). 
In some countries, data may be available but accessibility was restricted based on requirements by data custodians. Conclusions Compared with previous research, this study describes the landscape of databases in the selected countries with more granularity, using an assessment tool developed for this purpose. A high number of databases were identified, but most had restricted access, preventing their potential use to support research. We hope that this study helps to improve understanding of the AHDB landscape and to increase data sharing and database research in Asia-Pacific countries. PMID:27123180

  12. U.S. Army Research Laboratory (ARL) multimodal signatures database

    NASA Astrophysics Data System (ADS)

    Bennett, Kelly

    2008-04-01

    The U.S. Army Research Laboratory (ARL) Multimodal Signatures Database (MMSDB) is a centralized collection of sensor data of various modalities that are co-located and co-registered. The signatures include ground and air vehicles, personnel, mortar, artillery, small arms gunfire from potential sniper weapons, explosives, and many other high value targets. This data is made available to Department of Defense (DoD) and DoD contractors, intelligence agencies, other government agencies (OGA), and academia for use in developing target detection, tracking, and classification algorithms and systems to protect our Soldiers. A platform independent Web interface disseminates the signatures to researchers and engineers within the scientific community. Hierarchical Data Format 5 (HDF5) signature models provide an excellent solution for the sharing of complex multimodal signature data for algorithmic development and database requirements. Many open source tools for viewing and plotting HDF5 signatures are available over the Web. Seamless integration of HDF5 signatures is possible in both proprietary computational environments, such as MATLAB, and Free and Open Source Software (FOSS) computational environments, such as Octave and Python, for performing signal processing, analysis, and algorithm development. Future developments include extending the Web interface into a portal system for accessing ARL algorithms, signatures, and High Performance Computing (HPC) resources, and integrating existing database and signature architectures into sensor networking environments.
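
    The HDF5 signature models mentioned above can be read from Python with the third-party h5py package. A minimal sketch follows; the group/dataset names and the toy waveform are invented for illustration, not taken from the MMSDB's actual layout.

```python
import os
import tempfile

import h5py          # third-party: pip install h5py
import numpy as np

# Write a toy signature file so the read-back below is self-contained.
# The "acoustic/waveform" layout is hypothetical.
path = os.path.join(tempfile.mkdtemp(), "signature.h5")
with h5py.File(path, "w") as f:
    grp = f.create_group("acoustic")
    grp.create_dataset("waveform", data=np.arange(8, dtype="f4"))
    grp.attrs["sample_rate_hz"] = 1000.0

# Read it back, as an analyst working in Python (or Octave/MATLAB) would.
with h5py.File(path, "r") as f:
    wave = f["acoustic/waveform"][:]
    rate = f["acoustic"].attrs["sample_rate_hz"]
print(wave.shape, rate)
```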

  13. pE-DB: a database of structural ensembles of intrinsically disordered and of unfolded proteins.

    PubMed

    Varadi, Mihaly; Kosol, Simone; Lebrun, Pierre; Valentini, Erica; Blackledge, Martin; Dunker, A Keith; Felli, Isabella C; Forman-Kay, Julie D; Kriwacki, Richard W; Pierattelli, Roberta; Sussman, Joel; Svergun, Dmitri I; Uversky, Vladimir N; Vendruscolo, Michele; Wishart, David; Wright, Peter E; Tompa, Peter

    2014-01-01

    The goal of pE-DB (http://pedb.vib.be) is to serve as an openly accessible database for the deposition of structural ensembles of intrinsically disordered proteins (IDPs) and of denatured proteins based on nuclear magnetic resonance spectroscopy, small-angle X-ray scattering and other data measured in solution. Owing to the inherent flexibility of IDPs, solution techniques are particularly appropriate for characterizing their biophysical properties, and structural ensembles in agreement with these data provide a convenient tool for describing the underlying conformational sampling. Database entries consist of (i) primary experimental data with descriptions of the acquisition methods and algorithms used for the ensemble calculations, and (ii) the structural ensembles consistent with these data, provided as a set of models in Protein Data Bank format. pE-DB is open for submissions from the community, and is intended as a forum for disseminating the structural ensembles and the methodologies used to generate them. While the need to represent IDP structures is clear, methods for determining and evaluating the structural ensembles are still evolving. The availability of the pE-DB database is expected to promote the development of new modeling methods and lead to a better understanding of how function arises from disordered states.

  14. Establishment of an international database for genetic variants in esophageal cancer.

    PubMed

    Vihinen, Mauno

    2016-10-01

    The establishment of a database has been suggested in order to collect, organize, and distribute genetic information about esophageal cancer. The World Organization for Specialized Studies on Diseases of the Esophagus and the Human Variome Project will be in charge of a central database of information about esophageal cancer-related variations from publications, databases, and laboratories; in addition to genetic details, clinical parameters will also be included. The aim will be to get all the central players in research, clinical, and commercial laboratories to contribute. The database will follow established recommendations and guidelines. The database will require a team of dedicated curators with different backgrounds. Numerous layers of systematics will be applied to facilitate computational analyses. The data items will be extensively integrated with other information sources. The database will be distributed as open access to ensure exchange of the data with other databases. Variations will be reported in relation to reference sequences on three levels (DNA, RNA, and protein) whenever applicable. In the first phase, the database will concentrate on genetic variations including both somatic and germline variations for susceptibility genes. Additional types of information can be integrated at a later stage. © 2016 New York Academy of Sciences.

  15. Respiratory cancer database: An open access database of respiratory cancer gene and miRNA.

    PubMed

    Choubey, Jyotsna; Choudhari, Jyoti Kant; Patel, Ashish; Verma, Mukesh Kumar

    2017-01-01

    Respiratory cancer database (RespCanDB) is a genomic and proteomic database of cancers of the respiratory organs. It also includes information on medicinal plants used in the treatment of various respiratory cancers, with the structures of their active constituents, as well as pharmacological and chemical information on drugs associated with various respiratory cancers. Data in RespCanDB have been manually collected from published research articles and from other databases. Data have been integrated using MySQL, a relational database management system. MySQL manages all data in the back-end and provides commands to retrieve and store the data in the database. The web interface of the database has been built in ASP. RespCanDB is expected to contribute to the scientific community's understanding of respiratory cancer biology as well as to the development of new ways of diagnosing and treating respiratory cancer. Currently, the database consists of oncogenomic information on lung cancer, laryngeal cancer, and nasopharyngeal cancer. Data for other cancers, such as oral and tracheal cancers, will be added in the near future. The URL of RespCanDB is http://ridb.subdic-bioinformatics-nitrr.in/.
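
    A minimal sketch of the kind of back-end storage and retrieval described above. SQLite stands in for MySQL so the example is self-contained, and the table layout is hypothetical, not RespCanDB's actual schema.

```python
import sqlite3

# Hypothetical, simplified schema in the spirit of the record's back-end;
# SQLite stands in for MySQL here.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE gene (
        symbol TEXT PRIMARY KEY,
        cancer_type TEXT,   -- e.g. 'lung', 'laryngeal', 'nasopharyngeal'
        description TEXT
    )
""")
conn.executemany(
    "INSERT INTO gene (symbol, cancer_type, description) VALUES (?, ?, ?)",
    [("EGFR", "lung", "epidermal growth factor receptor"),
     ("TP53", "laryngeal", "tumor protein p53")],
)

# Retrieve all genes recorded for a given respiratory cancer.
rows = conn.execute(
    "SELECT symbol FROM gene WHERE cancer_type = ?", ("lung",)
).fetchall()
print([r[0] for r in rows])  # → ['EGFR']
```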

  16. Smoke and Emissions Model Intercomparison Project (SEMIP)

    NASA Astrophysics Data System (ADS)

    Larkin, N. K.; Raffuse, S.; Strand, T.; Solomon, R.; Sullivan, D.; Wheeler, N.

    2008-12-01

    Fire emissions and smoke impacts from wildland fire are a growing concern due to increasing fire season severity, dwindling tolerance of smoke by the public, tightening air quality regulations, and their role in climate change issues. Unfortunately, while a number of models and modeling system solutions are available to address these issues, the lack of quantitative information on the limitations of and differences between smoke and emissions models impedes the use of these tools for real-world applications (JFSP, 2007). We describe a new project to directly address this issue, the open-access Smoke and Emissions Model Intercomparison Project (SEMIP), and invite the community to participate. Preliminary work utilizing the modular BlueSky framework to directly compare fire location and size information, fuel loading amounts, fuel consumption rates, and fire emissions from a number of current models has found model-to-model variability as high as two orders of magnitude for an individual fire. Fire emissions inventories also show significant variability on both regional and national scales that is dependent on the fire location information used (ground report vs. satellite), the fuel loading maps assumed, and the fire consumption models employed. SEMIP expands on this work and creates an open-access database of model results and observations with the goal of furthering model development and model prediction usability for real-world decision support.

  17. Development of an Integrated Hydrologic Modeling System for Rainfall-Runoff Simulation

    NASA Astrophysics Data System (ADS)

    Lu, B.; Piasecki, M.

    2008-12-01

    This paper aims to present the development of an integrated hydrological model which involves functionalities of digital watershed processing, online data retrieval, hydrologic simulation and post-event analysis. The proposed system is intended to work as a back end to the CUAHSI HIS cyberinfrastructure developments. As a first step in developing this system, a physics-based distributed hydrologic model, PIHM (Penn State Integrated Hydrologic Model), is wrapped into the OpenMI (Open Modeling Interface and Environment) environment so as to seamlessly interact with OpenMI-compliant meteorological models. The graphical user interface is being developed from the open-source GIS application MapWindow, which permits functionality expansion through the addition of plug-ins. Modules required to set up through the GUI workboard include those for retrieving meteorological data from existing databases or meteorological prediction models, obtaining geospatial data from the output of digital watershed processing, and importing initial and boundary conditions. They are connected to the OpenMI-compliant PIHM to simulate rainfall-runoff processes, and a module automatically displays output after the simulation. Online databases are accessed through the WaterOneFlow web services, and the retrieved data are either stored in an observation database (OD) following the schema of the Observation Data Model (ODM), in the case of time series, or in a grid-based storage facility, which may be a format like netCDF or a grid-based database schema. Specific development steps include the creation of a bridge to overcome the interoperability issue between PIHM and the ODM, as well as the embedding of TauDEM (Terrain Analysis Using Digital Elevation Models) into the model. This module is responsible for developing the watershed and stream network from digital elevation models. Visualizing and editing geospatial data is achieved through MapWinGIS, an ActiveX control developed by the MapWindow team. After application to a practical watershed, the performance of the model can be tested by the post-event analysis module.

  18. Development of an expert analysis tool based on an interactive subsidence hazard map for urban land use in the city of Celaya, Mexico

    NASA Astrophysics Data System (ADS)

    Alloy, A.; Gonzalez Dominguez, F.; Nila Fonseca, A. L.; Ruangsirikulchai, A.; Gentle, J. N., Jr.; Cabral, E.; Pierce, S. A.

    2016-12-01

    Land subsidence as a result of groundwater extraction in central Mexico's larger urban centers began in the 1980s as a result of population and economic growth. The city of Celaya has undergone subsidence for several decades, and one consequence is the development of an active normal fault system that affects its urban infrastructure and residential areas. To facilitate its analysis and the land use decision-making process, we created an online interactive map enabling users to easily obtain information associated with land subsidence. Geological and socioeconomic data for the city were collected, including fault locations and population data; other important infrastructure and structural data have been obtained from fieldwork as part of a study abroad interchange undergraduate course. The subsidence and associated faulting hazard map was created using an InSAR-derived subsidence velocity map and population data from INEGI to identify hazard zones through a spatial analysis approach based on a subsidence gradient and population risk matrix. This interactive map provides a simple perspective on different vulnerable urban elements. As an accessible visualization tool, it will enhance communication between scientific and socio-economic disciplines. Our project also lays the groundwork for a future expert analysis system with an open-source and easily accessible Python-coded, SQLite-database-driven website that archives fault and subsidence data along with visual documentation of damage to civil structures. This database takes field notes and provides an entry form for uniform datasets, which are used to generate a JSON. Such a database is useful because it allows geoscientists to have a centralized repository and access to their observations over time. Because of the widespread presence of the subsidence phenomenon throughout cities in central Mexico, the spatial analysis has been automated using the open-source software R. The raster, rgeos, shapefiles, and rgdal libraries have been used to develop the script, which generates raster maps of horizontal gradient and population density. An advantage is that this analysis can be automated for periodic updates or repurposed for similar analyses in other cities, providing an easily accessible tool for land subsidence hazard assessments.
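
    A minimal sketch of the SQLite-plus-JSON field-note archive idea described above, with invented table and column names rather than the Celaya project's actual schema.

```python
import json
import sqlite3

# Hypothetical field-note table; SQLite keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE field_note (
        site_id TEXT,
        subsidence_mm_yr REAL,   -- InSAR-derived vertical velocity
        damage_note TEXT
    )
""")
conn.execute(
    "INSERT INTO field_note VALUES (?, ?, ?)",
    ("CEL-01", -45.2, "cracked pavement along fault scarp"),
)

# Export the archive as JSON, e.g. to feed an interactive web map.
records = [
    dict(zip(("site_id", "subsidence_mm_yr", "damage_note"), row))
    for row in conn.execute("SELECT * FROM field_note")
]
print(json.dumps(records))
```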

  19. VIEWCACHE: An incremental pointer-based access method for autonomous interoperable databases

    NASA Technical Reports Server (NTRS)

    Roussopoulos, N.; Sellis, Timos

    1993-01-01

    One of the biggest problems facing NASA today is to provide scientists efficient access to a large number of distributed databases. Our pointer-based incremental database access method, VIEWCACHE, provides such an interface for accessing distributed datasets and directories. VIEWCACHE allows database browsing and searching, performing inter-database cross-referencing with no actual data movement between database sites. This organization and processing is especially suitable for managing astrophysics databases which are physically distributed all over the world. Once the search is complete, the set of collected pointers to the desired data is cached. VIEWCACHE includes spatial access methods for accessing image datasets, which provide much easier query formulation by referring directly to the image and very efficient search for objects contained within a two-dimensional window. We will develop and optimize a VIEWCACHE External Gateway Access to database management systems to facilitate database search.
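
    The VIEWCACHE idea of caching pointers rather than moving data can be illustrated with a toy sketch; all site names, record ids, and tags below are invented, and this is not the VIEWCACHE implementation itself.

```python
# A cross-database search collects *pointers* (site, record id) instead of
# copying the records themselves, and caches the pointer set for reuse.
from typing import Dict, List, Tuple

Pointer = Tuple[str, int]  # (database site, record id)

# Stand-ins for two remote catalogs; only ids and a searchable tag are local.
SITES: Dict[str, Dict[int, str]] = {
    "site-a": {1: "quasar", 2: "nebula"},
    "site-b": {7: "quasar"},
}

_cache: Dict[str, List[Pointer]] = {}

def search(tag: str) -> List[Pointer]:
    """Return cached pointers for a tag, searching remote sites on a miss."""
    if tag not in _cache:
        _cache[tag] = [(site, rid)
                       for site, catalog in SITES.items()
                       for rid, t in catalog.items() if t == tag]
    return _cache[tag]

print(search("quasar"))  # pointers only; no record data is moved
```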

  20. A prospective international cooperative information technology platform built using open-source tools for improving the access to and safety of bone marrow transplantation in low- and middle-income countries.

    PubMed

    Agarwal, Rajat Kumar; Sedai, Amit; Dhimal, Sunil; Ankita, Kumari; Clemente, Luigi; Siddique, Sulman; Yaqub, Naila; Khalid, Sadaf; Itrat, Fatima; Khan, Anwar; Gilani, Sarah Khan; Marwah, Priya; Soni, Rajpreet; Missiry, Mohamed El; Hussain, Mohamed Hamed; Uderzo, Cornelio; Faulkner, Lawrence

    2014-01-01

    Jagriti Innovations developed a collaboration tool in partnership with the Cure2Children Foundation that has been used by health professionals in Italy, Pakistan, and India for the collaborative management of patients undergoing bone marrow transplantation (BMT) for thalassemia major since August 2008. This online open-access database covers data recording, analyzing, and reporting besides enabling knowledge exchange, telemedicine, capacity building, and quality assurance. As of February 2014, over 2400 patients have been registered and 112 BMTs have been performed with outcomes comparable to international standards, but at a fraction of the cost. This approach avoids medical emigration and contributes to local healthcare strengthening and competitiveness. This paper presents the experience and clinical outcomes associated with the use of this platform built using open-source tools and focusing on a locally pertinent tertiary care procedure-BMT. Published by the BMJ Publishing Group Limited. For permission to use (where not already granted under a licence) please go to http://group.bmj.com/group/rights-licensing/permissions.

  1. R-Syst::diatom: an open-access and curated barcode database for diatoms and freshwater monitoring.

    PubMed

    Rimet, Frédéric; Chaumeil, Philippe; Keck, François; Kermarrec, Lenaïg; Vasselon, Valentin; Kahlert, Maria; Franc, Alain; Bouchez, Agnès

    2016-01-01

    Diatoms are micro-algal indicators of freshwater pollution. Current standardized methodologies are based on microscopic determinations, which are time-consuming and prone to identification uncertainties. The use of DNA barcoding has been proposed as a way to avoid these flaws. Combining barcoding with next-generation sequencing enables collection of a large quantity of barcodes from natural samples. These barcodes are assigned to diatom taxa by algorithmically comparing the sequences to a reference barcoding library. Proof of concept was recently demonstrated for synthetic and natural communities and underlined the importance of the quality of this reference library. We present an open-access and curated reference barcoding database for diatoms, called R-Syst::diatom, developed in the framework of R-Syst, the network of systematics supported by INRA (French National Institute for Agricultural Research); see http://www.rsyst.inra.fr/en. R-Syst::diatom links DNA barcodes to their taxonomic identifications and is dedicated to identifying barcodes from natural samples. The data come from two sources: a culture collection of freshwater algae maintained at INRA, in which new strains are regularly deposited and barcoded, and the NCBI (National Center for Biotechnology Information) nucleotide database. Two kinds of barcodes were chosen to support the database, 18S (18S ribosomal RNA) and rbcL (ribulose-1,5-bisphosphate carboxylase/oxygenase), because of their efficiency. Data are curated using innovative (Declic) and classical bioinformatic tools (BLAST, classical phylogenies) and up-to-date taxonomy (catalogues and peer-reviewed papers). Every 6 months R-Syst::diatom is updated. The database is available through the R-Syst microalgae website (http://www.rsyst.inra.fr/) and a platform dedicated to next-generation sequencing data analysis, virtual_BiodiversityL@b (https://galaxy-pgtp.pierroton.inra.fr/). We present here the content of the library in terms of the number of barcodes and diatom taxa. In addition to this information, morphological features (e.g. biovolumes, chloroplasts…), life-forms (mobility, colony-type) and ecological features (taxa preferenda to pollution) are indicated in R-Syst::diatom. Database URL: http://www.rsyst.inra.fr/. © The Author(s) 2016. Published by Oxford University Press.
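
    The assignment of environmental barcodes to taxa by comparison against a reference library can be illustrated with a toy sketch. Real pipelines use BLAST or similar tools against the full R-Syst::diatom library; the short sequences and the identity threshold below are invented.

```python
def identity(a: str, b: str) -> float:
    """Fraction of matching positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)

# A miniature stand-in for a reference barcode library (invented sequences).
reference = {
    "Navicula sp.":   "ATGGCTTACCGT",
    "Fragilaria sp.": "ATGCGTTACGGA",
}

def classify(read: str, min_identity: float = 0.8) -> str:
    """Assign a read to the closest reference taxon above a threshold."""
    taxon, score = max(((t, identity(read, s)) for t, s in reference.items()),
                       key=lambda p: p[1])
    return taxon if score >= min_identity else "unclassified"

print(classify("ATGGCTTACCGA"))  # one mismatch vs the Navicula reference
```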

  2. R-Syst::diatom: an open-access and curated barcode database for diatoms and freshwater monitoring

    PubMed Central

    Rimet, Frédéric; Chaumeil, Philippe; Keck, François; Kermarrec, Lenaïg; Vasselon, Valentin; Kahlert, Maria; Franc, Alain; Bouchez, Agnès

    2016-01-01

    Diatoms are micro-algal indicators of freshwater pollution. Current standardized methodologies are based on microscopic determinations, which are time-consuming and prone to identification uncertainties. The use of DNA barcoding has been proposed as a way to avoid these flaws. Combining barcoding with next-generation sequencing enables collection of a large quantity of barcodes from natural samples. These barcodes are assigned to diatom taxa by algorithmically comparing the sequences to a reference barcoding library. Proof of concept was recently demonstrated for synthetic and natural communities and underlined the importance of the quality of this reference library. We present an open-access and curated reference barcoding database for diatoms, called R-Syst::diatom, developed in the framework of R-Syst, the network of systematics supported by INRA (French National Institute for Agricultural Research); see http://www.rsyst.inra.fr/en. R-Syst::diatom links DNA barcodes to their taxonomic identifications and is dedicated to identifying barcodes from natural samples. The data come from two sources: a culture collection of freshwater algae maintained at INRA, in which new strains are regularly deposited and barcoded, and the NCBI (National Center for Biotechnology Information) nucleotide database. Two kinds of barcodes were chosen to support the database, 18S (18S ribosomal RNA) and rbcL (ribulose-1,5-bisphosphate carboxylase/oxygenase), because of their efficiency. Data are curated using innovative (Declic) and classical bioinformatic tools (BLAST, classical phylogenies) and up-to-date taxonomy (catalogues and peer-reviewed papers). Every 6 months R-Syst::diatom is updated. The database is available through the R-Syst microalgae website (http://www.rsyst.inra.fr/) and a platform dedicated to next-generation sequencing data analysis, virtual_BiodiversityL@b (https://galaxy-pgtp.pierroton.inra.fr/). We present here the content of the library in terms of the number of barcodes and diatom taxa. In addition to this information, morphological features (e.g. biovolumes, chloroplasts…), life-forms (mobility, colony-type) and ecological features (taxa preferenda to pollution) are indicated in R-Syst::diatom. Database URL: http://www.rsyst.inra.fr/ PMID:26989149

  3. Introduction to an Open Source Internet-Based Testing Program for Medical Student Examinations

    PubMed Central

    2009-01-01

    The author developed a freely available open source internet-based testing program for medical examinations. PHP and JavaScript were used as the programming languages and PostgreSQL as the database management system, on an Apache web server and the Linux operating system. The system approach was that a super user inputs the items, each school administrator inputs the examinees' information, and examinees access the system. The examinee's score is displayed immediately after the examination, with item analysis. The set-up of the system, beginning with installation, is described. This may help medical professors to easily adopt an internet-based testing system for medical education. PMID:20046457
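
    The item analysis reported alongside each score can be illustrated by two classical indices: item difficulty (proportion of examinees answering correctly) and an upper-lower discrimination index. This is a hypothetical Python sketch of those indices, not the program's actual PHP implementation.

```python
def item_analysis(responses):
    """responses: list of per-examinee lists of 0/1 item scores."""
    n_items = len(responses[0])
    ranked = sorted(responses, key=sum)   # weakest to strongest examinee
    half = len(responses) // 2
    lower, upper = ranked[:half], ranked[-half:]
    report = []
    for i in range(n_items):
        # Difficulty: proportion of all examinees who got the item right.
        difficulty = sum(r[i] for r in responses) / len(responses)
        # Discrimination: upper-half minus lower-half proportion correct.
        discrimination = (sum(r[i] for r in upper)
                          - sum(r[i] for r in lower)) / half
        report.append({"item": i, "difficulty": round(difficulty, 2),
                       "discrimination": round(discrimination, 2)})
    return report

# Four examinees, three items (1 = correct answer).
scores = [[1, 1, 0], [1, 0, 0], [1, 1, 1], [0, 0, 0]]
for row in item_analysis(scores):
    print(row)
```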

  4. Introduction to an open source internet-based testing program for medical student examinations.

    PubMed

    Lee, Yoon-Hwan

    2009-12-20

    The author developed a freely available open source internet-based testing program for medical examinations. PHP and JavaScript were used as the programming languages and PostgreSQL as the database management system, on an Apache web server and the Linux operating system. The system approach was that a super user inputs the items, each school administrator inputs the examinees' information, and examinees access the system. The examinee's score is displayed immediately after the examination, with item analysis. The set-up of the system, beginning with installation, is described. This may help medical professors to easily adopt an internet-based testing system for medical education.

  5. PathwayAccess: CellDesigner plugins for pathway databases.

    PubMed

    Van Hemert, John L; Dickerson, Julie A

    2010-09-15

    CellDesigner provides a user-friendly interface for graphical biochemical pathway description. Many pathway databases are not directly exportable to CellDesigner models. PathwayAccess is an extensible suite of CellDesigner plugins, which connect CellDesigner directly to pathway databases using respective Java application programming interfaces. The process is streamlined for creating new PathwayAccess plugins for specific pathway databases. Three PathwayAccess plugins, MetNetAccess, BioCycAccess and ReactomeAccess, directly connect CellDesigner to the pathway databases MetNetDB, BioCyc and Reactome. PathwayAccess plugins enable CellDesigner users to expose pathway data to analytical CellDesigner functions, curate their pathway databases and visually integrate pathway data from different databases using standard Systems Biology Markup Language and Systems Biology Graphical Notation. Implemented in Java, PathwayAccess plugins run with CellDesigner version 4.0.1 and were tested on Ubuntu Linux, Windows XP and 7, and MacOSX. Source code, binaries, documentation and video walkthroughs are freely available at http://vrac.iastate.edu/~jlv.

  6. JASPAR RESTful API: accessing JASPAR data from any programming language.

    PubMed

    Khan, Aziz; Mathelier, Anthony

    2018-05-01

    JASPAR is a widely used open-access database of curated, non-redundant transcription factor binding profiles. Currently, data from JASPAR can be retrieved as flat files or by using programming language-specific interfaces. Here, we present a programming language-independent application programming interface (API) to access JASPAR data using the Representational State Transfer (REST) architecture. The REST API enables programmatic access to JASPAR by most programming languages and returns data in eight widely used formats. Several endpoints are available to access the data and an endpoint is available to infer the TF binding profile(s) likely bound by a given DNA binding domain protein sequence. Additionally, it provides an interactive browsable interface for bioinformatics tool developers. This REST API is implemented in Python using the Django REST Framework. It is accessible at http://jaspar.genereg.net/api/ and the source code is freely available at https://bitbucket.org/CBGR/jaspar under GPL v3 license. aziz.khan@ncmm.uio.no or anthony.mathelier@ncmm.uio.no. Supplementary data are available at Bioinformatics online.
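The language-independent access described above amounts to building an HTTP request and parsing a JSON response. The sketch below shows the shape of such a client; the `/api/v1/matrix/` endpoint path and the `name`/`pfm` response fields follow the published service but should be treated as assumptions to verify against the live documentation at http://jaspar.genereg.net/api/.

```python
# Minimal sketch of a JASPAR REST client: build the request URL for one
# binding-profile matrix and pull fields out of the JSON body. Endpoint
# path and field names are assumptions; check the live API docs.
import json
from urllib.parse import urlencode

BASE = "http://jaspar.genereg.net/api/v1"

def matrix_url(matrix_id, fmt="json"):
    """URL for a single transcription factor binding profile."""
    return f"{BASE}/matrix/{matrix_id}/?{urlencode({'format': fmt})}"

def parse_matrix(payload):
    """Extract the profile name and position frequency matrix from a JSON body."""
    data = json.loads(payload)
    return data["name"], data["pfm"]
```

Any language with an HTTP client and a JSON parser can follow the same two steps, which is the point of exposing JASPAR through REST rather than language-specific bindings.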

  7. Designing for Peta-Scale in the LSST Database

    NASA Astrophysics Data System (ADS)

    Kantor, J.; Axelrod, T.; Becla, J.; Cook, K.; Nikolaev, S.; Gray, J.; Plante, R.; Nieto-Santisteban, M.; Szalay, A.; Thakar, A.

    2007-10-01

The Large Synoptic Survey Telescope (LSST), a proposed ground-based 8.4 m telescope with a 10 deg^2 field of view, will generate 15 TB of raw images every observing night. When calibration and processed data are added, the image archive, catalogs, and meta-data will grow 15 PB yr^{-1} on average. The LSST Data Management System (DMS) must capture, process, store, index, replicate, and provide open access to these data. Alerts must be triggered within 30 s of data acquisition. Doing this in real time at these data volumes will require advances in data management, database, and file system techniques. This paper describes the design of the LSST DMS and emphasizes features for peta-scale data. The LSST DMS will employ a combination of distributed database and file systems, with schema, partitioning, and indexing oriented for parallel operations. Image files are stored in a distributed file system, with references to, and meta-data from, each file stored in the databases. The schema design supports pipeline processing, rapid ingest, and efficient query. Vertical partitioning reduces disk input/output requirements, while horizontal partitioning allows parallel data access using arrays of servers and disks. Indexing is extensive, utilizing both conventional RAM-resident indexes and column-narrow, row-deep tag tables/covering indices that are extracted from tables that contain many more attributes. The DMS Data Access Framework is encapsulated in a middleware framework to provide a uniform service interface to all framework capabilities. This framework will provide the automated work-flow, replication, and data analysis capabilities necessary to make data processing and data quality analysis feasible at this scale.
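The horizontal-partitioning idea above can be illustrated with a toy spatial chunking scheme: each catalog object is assigned to a chunk by declination band and RA stripe, so a region query only needs to visit the servers holding overlapping chunks. The chunk geometry here is illustrative, not the actual LSST schema.

```python
# Toy spatial partitioning: map sky positions to chunk ids so that a box
# query can fan out in parallel to only the chunks it overlaps.
# Band/stripe counts are illustrative, not the real LSST layout.
N_DEC_BANDS = 18      # 10-degree declination bands
N_RA_STRIPES = 36     # 10-degree right-ascension stripes

def chunk_id(ra_deg, dec_deg):
    band = min(int((dec_deg + 90.0) / 180.0 * N_DEC_BANDS), N_DEC_BANDS - 1)
    stripe = min(int(ra_deg / 360.0 * N_RA_STRIPES), N_RA_STRIPES - 1)
    return band * N_RA_STRIPES + stripe

def chunks_for_box(ra_min, ra_max, dec_min, dec_max, steps=20):
    """Chunk ids a box query must visit; a real system intersects geometry
    exactly rather than sampling points."""
    ids = set()
    for i in range(steps + 1):
        for j in range(steps + 1):
            ra = ra_min + (ra_max - ra_min) * i / steps
            dec = dec_min + (dec_max - dec_min) * j / steps
            ids.add(chunk_id(ra, dec))
    return sorted(ids)
```

Because each chunk lives on a known server, the query planner can dispatch the per-chunk scans concurrently and merge the results, which is the parallelism the abstract's "arrays of servers and disks" refers to.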

  8. The Starchive: An open access, open source archive of nearby and young stars and their planets

    NASA Astrophysics Data System (ADS)

    Tanner, Angelle; Gelino, Chris; Elfeki, Mario

    2015-12-01

Historically, astronomers have utilized a piecemeal set of archives such as SIMBAD, the Washington Double Star Catalog, various exoplanet encyclopedias and electronic tables from the literature to cobble together stellar and exo-planetary parameters in the absence of corresponding images and spectra. As the search for planets around young stars through direct imaging, transits and infrared/optical radial velocity surveys blossoms, there is a void in the available set of resources for creating comprehensive lists of the stellar parameters of nearby stars, especially for important parameters such as metallicity and stellar activity indicators. For direct imaging surveys, we need better resources for downloading existing high contrast images to help confirm new discoveries and find ideal target stars. Once we have discovered new planets, we need a uniform database of stellar and planetary parameters from which to look for correlations, to better understand the formation and evolution of these systems. As a solution to these issues, we are developing the Starchive - an open access stellar archive in the spirit of the Open Exoplanet Catalogue, the Kepler Community Follow-up Program and many others. The archive will allow users to download various datasets, upload new images, spectra and metadata, and will contain multiple plotting tools to use in presentations and data interpretations. While we will closely regulate and constantly validate the data placed into the archive, the open nature of its design is intended to allow the database to be expanded efficiently and to have the level of versatility necessary in today's fast-moving, big-data community. Finally, the front-end scripts will be placed on GitHub, and users will be encouraged to contribute new plotting tools. Here, I will introduce the community to the content and expected capabilities of the archive and query the audience for community feedback.

  9. Database citation in full text biomedical articles.

    PubMed

    Kafkas, Şenay; Kim, Jee-Hyub; McEntyre, Johanna R

    2013-01-01

    Molecular biology and literature databases represent essential infrastructure for life science research. Effective integration of these data resources requires that there are structured cross-references at the level of individual articles and biological records. Here, we describe the current patterns of how database entries are cited in research articles, based on analysis of the full text Open Access articles available from Europe PMC. Focusing on citation of entries in the European Nucleotide Archive (ENA), UniProt and Protein Data Bank, Europe (PDBe), we demonstrate that text mining doubles the number of structured annotations of database record citations supplied in journal articles by publishers. Many thousands of new literature-database relationships are found by text mining, since these relationships are also not present in the set of articles cited by database records. We recommend that structured annotation of database records in articles is extended to other databases, such as ArrayExpress and Pfam, entries from which are also cited widely in the literature. The very high precision and high-throughput of this text-mining pipeline makes this activity possible both accurately and at low cost, which will allow the development of new integrated data services.
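The text-mining step described above boils down to pattern matching for accession numbers in article text. The sketch below uses deliberately simplified regular expressions (the real ENA, UniProt and PDBe identifier grammars have more cases, and the pipeline in the study applies additional context and validation); treat the patterns as illustrative only.

```python
# Simplified accession-number mining. Real pipelines validate hits against
# the target database and use contextual cues; these regexes are toy
# approximations of the ENA, UniProt and PDBe identifier shapes.
import re

PATTERNS = {
    "ENA":     re.compile(r"\b[A-Z]{1,2}\d{5,6}\b"),       # e.g. AB123456
    "UniProt": re.compile(r"\b[OPQ][0-9][A-Z0-9]{3}[0-9]\b"),  # e.g. P12345
    "PDBe":    re.compile(r"\b[1-9][A-Za-z0-9]{3}\b"),     # e.g. 1TIM
}

def mine_citations(text):
    """Return {database: sorted unique accession-like strings found in text}."""
    return {db: sorted(set(p.findall(text))) for db, p in PATTERNS.items()}
```

Note that the toy patterns overlap (a UniProt accession like P12345 also matches the simplified ENA shape), which is exactly why production pipelines must disambiguate candidates against the databases themselves to reach the high precision the abstract reports.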

  11. Database resources of the National Center for Biotechnology

    PubMed Central

    Wheeler, David L.; Church, Deanna M.; Federhen, Scott; Lash, Alex E.; Madden, Thomas L.; Pontius, Joan U.; Schuler, Gregory D.; Schriml, Lynn M.; Sequeira, Edwin; Tatusova, Tatiana A.; Wagner, Lukas

    2003-01-01

In addition to maintaining the GenBank(R) nucleic acid sequence database, the National Center for Biotechnology Information (NCBI) provides data analysis and retrieval resources for the data in GenBank and other biological data made available through NCBI's Web site. NCBI resources include Entrez, PubMed, PubMed Central (PMC), LocusLink, the NCBI Taxonomy Browser, BLAST, BLAST Link (BLink), Electronic PCR (e-PCR), Open Reading Frame (ORF) Finder, Reference Sequence (RefSeq), UniGene, HomoloGene, ProtEST, Database of Single Nucleotide Polymorphisms (dbSNP), Human/Mouse Homology Map, Cancer Chromosome Aberration Project (CCAP), Entrez Genomes and related tools, the Map Viewer, Model Maker (MM), Evidence Viewer (EV), Clusters of Orthologous Groups (COGs) database, Retroviral Genotyping Tools, SAGEmap, Gene Expression Omnibus (GEO), Online Mendelian Inheritance in Man (OMIM), the Molecular Modeling Database (MMDB), the Conserved Domain Database (CDD), and the Conserved Domain Architecture Retrieval Tool (CDART). Augmenting many of the Web applications are custom implementations of the BLAST program optimized to search specialized data sets. All of the resources can be accessed through the NCBI home page at: http://www.ncbi.nlm.nih.gov. PMID:12519941

  12. Development of an open source laboratory information management system for 2-D gel electrophoresis-based proteomics workflow

    PubMed Central

    Morisawa, Hiraku; Hirota, Mikako; Toda, Tosifusa

    2006-01-01

    Background In the post-genome era, most research scientists working in the field of proteomics are confronted with difficulties in management of large volumes of data, which they are required to keep in formats suitable for subsequent data mining. Therefore, a well-developed open source laboratory information management system (LIMS) should be available for their proteomics research studies. Results We developed an open source LIMS appropriately customized for 2-D gel electrophoresis-based proteomics workflow. The main features of its design are compactness, flexibility and connectivity to public databases. It supports the handling of data imported from mass spectrometry software and 2-D gel image analysis software. The LIMS is equipped with the same input interface for 2-D gel information as a clickable map on public 2DPAGE databases. The LIMS allows researchers to follow their own experimental procedures by reviewing the illustrations of 2-D gel maps and well layouts on the digestion plates and MS sample plates. Conclusion Our new open source LIMS is now available as a basic model for proteome informatics, and is accessible for further improvement. We hope that many research scientists working in the field of proteomics will evaluate our LIMS and suggest ways in which it can be improved. PMID:17018156

  13. WOVOdat - An online, growing library of worldwide volcanic unrest

    NASA Astrophysics Data System (ADS)

    Newhall, C. G.; Costa, F.; Ratdomopurbo, A.; Venezky, D. Y.; Widiwijayanti, C.; Win, Nang Thin Zar; Tan, K.; Fajiculay, E.

    2017-10-01

    The World Organization of Volcano Observatories (WOVO), with major support from the Earth Observatory of Singapore, is developing a web-accessible database of seismic, geodetic, gas, hydrologic, and other unrest from volcanoes around the world. This database, WOVOdat, is intended for reference during volcanic crises, comparative studies, basic research on pre-eruption processes, teaching, and outreach. Data are already processed to have physical meaning, e.g. earthquake hypocenters rather than voltages or arrival times, and are historical rather than real-time, ranging in age from a few days to several decades. Data from > 900 episodes of unrest covering > 75 volcanoes are already accessible. Users can visualize and compare changes from one episode of unrest or from one volcano to the next. As the database grows more complete, users will be able to analyze patterns of unrest in the same way that epidemiologists study the spatial and temporal patterns and associations among diseases. WOVOdat was opened for station and data visualization in August 2013, and now includes utilities for data downloads and Boolean searches. Many more data sets are being added, as well as utilities interfacing to new applications, e.g., the construction of event trees. For more details, please see www.wovodat.org.

  14. The BIG Data Center: from deposition to integration to translation

    PubMed Central

    2017-01-01

Biological data are being generated at unprecedented, exponentially growing rates, posing considerable challenges for big data deposition, integration and translation. The BIG Data Center, established at Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, provides a suite of database resources, including (i) Genome Sequence Archive, a data repository specialized for archiving raw sequence reads, (ii) Gene Expression Nebulas, a data portal of gene expression profiles based entirely on RNA-Seq data, (iii) Genome Variation Map, a comprehensive collection of genome variations for featured species, (iv) Genome Warehouse, a centralized resource housing genome-scale data with particular focus on economically important animals and plants, (v) Methylation Bank, an integrated database of whole-genome single-base resolution methylomes and (vi) Science Wikis, a central access point for biological wikis developed for community annotations. The BIG Data Center is dedicated to constructing and maintaining biological databases through big data integration and value-added curation, conducting basic research to translate big data into big knowledge and providing freely open access to a variety of data resources in support of worldwide research activities in both academia and industry. All of these resources are publicly available and can be found at http://bigd.big.ac.cn. PMID:27899658

  15. Open access tools for quality-assured and efficient data entry in a large, state-wide tobacco survey in India

    PubMed Central

    Shewade, Hemant Deepak; Vidhubala, E; Subramani, Divyaraj Prabhakar; Lal, Pranay; Bhatt, Neelam; Sundaramoorthi, C.; Singh, Rana J.; Kumar, Ajay M. V.

    2017-01-01

ABSTRACT Background: A large state-wide tobacco survey was conducted using a modified version of the pretested, globally validated Global Adult Tobacco Survey (GATS) questionnaire in 2015–2016 in Tamil Nadu, India. Due to resource constraints, data collection was carried out using paper-based questionnaires (unlike GATS-India, 2009–2010, which used hand-held computer devices), while data entry was done using open access tools. The objective of this paper is to describe the process of data entry and assess its quality assurance and efficiency. Methods: In EpiData terminology, a variable is referred to as a 'field' and a questionnaire (a set of fields) as a 'record'. EpiData software was used for double data entry with adequate checks, followed by validation. TeamViewer was used for remote training and troubleshooting. The EpiData databases (one for each district and each zone in Chennai city) were housed in shared Dropbox folders, which enabled secure sharing of files and automatic back-up. Each district/zone database had separate files for data entry of the household level and individual level questionnaires. Results: The 32,945 surveyed households included 111,363 individuals aged ≥15 years. The average proportion of records with data entry errors for a district/zone was 4% in the household level files and 24% in the individual level files. These errors would have gone unnoticed if single entry had been used. The median (inter-quartile range) time taken for double data entry of a single household level and individual level questionnaire was 30 (24, 40) s and 86 (64, 126) s, respectively. Conclusion: Efficient and quality-assured near-real-time data entry in a large sub-national tobacco survey was performed through innovative, resource-efficient use of open access tools. PMID:29092673
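The double data entry validation step at the heart of this workflow can be sketched as a field-by-field comparison of the two independently keyed copies of each record; any disagreement is flagged for re-checking against the paper questionnaire. This is a minimal illustration of the idea, not EpiData's implementation, and the field names are hypothetical.

```python
# Toy double-entry validation: compare two independently keyed passes of
# the same records and list every field where they disagree.
# Key and field names are illustrative.
def validate_double_entry(first_pass, second_pass, key="hh_id"):
    """Return a list of (record key, field, first value, second value)."""
    second = {r[key]: r for r in second_pass}
    discrepancies = []
    for rec in first_pass:
        other = second.get(rec[key])
        if other is None:
            discrepancies.append((rec[key], "<missing record>", None, None))
            continue
        for field, value in rec.items():
            if other.get(field) != value:
                discrepancies.append((rec[key], field, value, other.get(field)))
    return discrepancies
```

The 4% and 24% error rates reported above are precisely the records this comparison surfaces, which single entry would have silently accepted.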

  16. TERRA REF: Advancing phenomics with high resolution, open access sensor and genomics data

    NASA Astrophysics Data System (ADS)

    LeBauer, D.; Kooper, R.; Burnette, M.; Willis, C.

    2017-12-01

Automated plant measurement has the potential to improve understanding of genetic and environmental controls on plant traits (phenotypes). The application of sensors and software to the automation of high throughput phenotyping reflects a fundamental shift from labor-intensive hand measurements to drone-, tractor-, and robot-mounted sensing platforms. These tools are expected to speed the rate of crop improvement by enabling plant breeders to more accurately select plants with improved yields, resource use efficiency, and stress tolerance. However, high throughput phenomics faces many challenges: sensors and platforms are expensive, there are currently few standard methods of data collection and storage, and the analysis of large data sets requires high performance computers and automated, reproducible computing pipelines. To overcome these obstacles and advance the science of high throughput phenomics, the TERRA Phenotyping Reference Platform (TERRA-REF) team is developing an open-access database of high resolution sensor data. TERRA-REF is an integrated field and greenhouse phenotyping system that includes: a reference field scanner with fifteen sensors that can generate terabytes of data each day at mm resolution; UAV, tractor, and fixed field sensing platforms; and an automated controlled-environment scanner. These platforms will enable the investigation of diverse sensing modalities and of traits under both controlled and field environments. It is the goal of TERRA-REF to lower the barrier to entry for academic and industry researchers by providing high-resolution data, open source software, and online computing resources. Our project is unique in that all data will be made fully public in November 2018, and the data are already available to early adopters through the beta-user program.
We will describe the datasets and how to use them, as well as the databases and computing pipeline and how these can be reused and remixed in other phenomics pipelines. Finally, we will describe the National Data Service workbench, a cloud computing platform that can access the petabyte-scale data while supporting reproducible research.

  17. The Open Data Repository's Data Publisher

    NASA Technical Reports Server (NTRS)

    Stone, N.; Lafuente, B.; Downs, R. T.; Blake, D.; Bristow, T.; Fonda, M.; Pires, A.

    2015-01-01

Data management and data publication are becoming increasingly important components of researchers' workflows. The complexity of managing, publishing, and archiving data has not decreased significantly even as computing access and power have greatly increased. The Open Data Repository's Data Publisher software strives to make data archiving, management, and publication a standard part of a researcher's workflow using simple, web-based tools and commodity server hardware. The publication engine allows for uploading, searching, and display of data, with graphing capabilities and downloadable files. Access is controlled through a robust permissions system that can control publication at the field level; data can be made available to the general public or protected so that only registered users at various permission levels receive access. Data Publisher also allows researchers to subscribe to metadata standards through a plugin system, embargo data publication at their discretion, and collaborate with other researchers through various levels of data sharing. As the software matures, semantic data standards will be implemented to facilitate machine reading of data, and each database will provide a REST application programming interface for programmatic access. Additionally, a citation system will allow snapshots of any data set to be archived and cited for publication while the data itself can remain living and continuously evolve beyond the snapshot date. The software runs on a traditional LAMP (Linux, Apache, MySQL, PHP) server and is available on GitHub (http://github.com/opendatarepository) under a GPLv2 open source license. The goal of the Open Data Repository is to lower the cost and training barrier to entry so that any researcher can easily publish their data and ensure it is archived for posterity.
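The field-level permission model described above can be sketched as follows: each field carries a minimum permission level, and a record is rendered with only the fields the requesting user is entitled to see. This is a minimal illustration under assumed level names, not Data Publisher's actual implementation.

```python
# Toy field-level access control: filter a record down to the fields the
# requesting user's permission level allows. Level names are illustrative.
LEVELS = {"public": 0, "registered": 1, "collaborator": 2, "owner": 3}

def visible_fields(record, field_acl, user_level="public"):
    """record: field -> value; field_acl: field -> minimum required level.
    Fields absent from the ACL default to public visibility."""
    rank = LEVELS[user_level]
    return {f: v for f, v in record.items()
            if rank >= LEVELS.get(field_acl.get(f, "public"), 0)}
```

Publishing a dataset then just means choosing, per field, the lowest level that may see it, which matches the abstract's mix of fully public and registered-user-only publication.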

  18. EV@LUTIL: An open access database on occupational exposures to asbestos and man-made mineral fibres.

    PubMed

    Orlowski, Ewa; Audignon-Durand, Sabyne; Goldberg, Marcel; Imbernon, Ellen; Brochard, Patrick

    2015-10-01

The aim of Evalutil is to document occupational exposures to asbestos and man-made mineral fibers. These databases provide grouped descriptive and metrological data from observed situations of occupational exposure, collected through the analysis of scientific articles and technical reports by industrial hygienists. Over 5,000 measurements have been collected. We describe the occupations, economic activities, fiber-containing products, and the operations on them that have been documented most often. Graphical syntheses of these measurement data show that the documented situations for asbestos and RCF, though not for mineral wools, report fiber concentrations mainly above historical occupational exposure limits. Free access to these data in French and in English on the Internet (https://ssl2.isped.u-bordeaux2.fr/eva_003/) helps public health and prevention professionals identify and characterize occupational exposures to fibers. Recently extended to nanoscale particles, Evalutil continues to contribute to improving knowledge about exposure to inhaled particles and the associated health risks. © 2015 Wiley Periodicals, Inc.

  19. The Auroral Planetary Imaging and Spectroscopy (APIS) service

    NASA Astrophysics Data System (ADS)

    Lamy, L.; Prangé, R.; Henry, F.; Le Sidaner, P.

    2015-06-01

The Auroral Planetary Imaging and Spectroscopy (APIS) service, accessible online, provides open and interactive access to processed auroral observations of the outer planets and their satellites. Such observations are of interest for a wide community at the interface between planetology, magnetospheric and heliospheric physics. APIS consists of (i) a high level database, built from planetary auroral observations acquired by the Hubble Space Telescope (HST) since 1997 with its most used far-ultraviolet spectro-imagers, (ii) a dedicated search interface aimed at browsing this database efficiently through relevant conditional search criteria and (iii) the ability to work interactively with the data online through plotting tools developed by the Virtual Observatory (VO) community, such as Aladin and Specview. This service is VO compliant and can therefore also be queried by external search tools of the VO community. The diversity of available data and the capability to sort them by relevant physical criteria should in particular facilitate statistical studies, on long-term scales and/or in multi-instrumental, multi-spectral combined analyses.

  20. The Cambridge Structural Database in retrospect and prospect.

    PubMed

    Groom, Colin R; Allen, Frank H

    2014-01-13

    The Cambridge Crystallographic Data Centre (CCDC) was established in 1965 to record numerical, chemical and bibliographic data relating to published organic and metal-organic crystal structures. The Cambridge Structural Database (CSD) now stores data for nearly 700,000 structures and is a comprehensive and fully retrospective historical archive of small-molecule crystallography. Nearly 40,000 new structures are added each year. As X-ray crystallography celebrates its centenary as a subject, and the CCDC approaches its own 50th year, this article traces the origins of the CCDC as a publicly funded organization and its onward development into a self-financing charitable institution. Principally, however, we describe the growth of the CSD and its extensive associated software system, and summarize its impact and value as a basis for research in structural chemistry, materials science and the life sciences, including drug discovery and drug development. Finally, the article considers the CCDC's funding model in relation to open access and open data paradigms. Copyright © 2014 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  1. Structural Fingerprinting of Nanocrystals in the Transmission Electron Microscope

    NASA Astrophysics Data System (ADS)

    Rouvimov, Sergei; Plachinda, Pavel; Moeck, Peter

    2010-03-01

Three novel strategies for the structural identification of nanocrystals in a transmission electron microscope are presented. Either a single high-resolution transmission electron microscopy image [1] or a single precession electron diffractogram (PED) [2] may be employed. PEDs from fine-grained crystal powders may also be utilized. Automation of the former two strategies is in progress and should lead to statistically significant results on ensembles of nanocrystals. Open-access databases such as the Crystallography Open Database, which provides more than 81,500 crystal structure data sets [3], or its mainly inorganic and educational subsets [4] may be utilized. [1] http://www.scientificjournals.org/journals 2007/j/of/dissertation.htm [2] P. Moeck and S. Rouvimov, in: Drugs and the Pharmaceutical Sciences, Vol. 191, 2009, 270-313 [3] http://cod.ibt.lt, http://www.crystallography.net, http://cod.ensicaen.fr, http://nanocrystallography.org, http://nanocrystallography.net, http://journals.iucr.org/j/issues/2009/04/00/kk5039/kk5039.pdf [4] http://nanocrystallography.research.pdx.edu/CIF-searchable

  2. MOSAIC: an online database dedicated to the comparative genomics of bacterial strains at the intra-species level.

    PubMed

    Chiapello, Hélène; Gendrault, Annie; Caron, Christophe; Blum, Jérome; Petit, Marie-Agnès; El Karoui, Meriem

    2008-11-27

    The recent availability of complete sequences for numerous closely related bacterial genomes opens up new challenges in comparative genomics. Several methods have been developed to align complete genomes at the nucleotide level but their use and the biological interpretation of results are not straightforward. It is therefore necessary to develop new resources to access, analyze, and visualize genome comparisons. Here we present recent developments on MOSAIC, a generalist comparative bacterial genome database. This database provides the bacteriologist community with easy access to comparisons of complete bacterial genomes at the intra-species level. The strategy we developed for comparison allows us to define two types of regions in bacterial genomes: backbone segments (i.e., regions conserved in all compared strains) and variable segments (i.e., regions that are either specific to or variable in one of the aligned genomes). Definition of these segments at the nucleotide level allows precise comparative and evolutionary analyses of both coding and non-coding regions of bacterial genomes. Such work is easily performed using the MOSAIC Web interface, which allows browsing and graphical visualization of genome comparisons. The MOSAIC database now includes 493 pairwise comparisons and 35 multiple maximal comparisons representing 78 bacterial species. Genome conserved regions (backbones) and variable segments are presented in various formats for further analysis. A graphical interface allows visualization of aligned genomes and functional annotations. The MOSAIC database is available online at http://genome.jouy.inra.fr/mosaic.
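The backbone/variable-segment decomposition MOSAIC describes can be sketched as an interval computation: conserved alignment intervals on one genome are merged into the backbone, and the uncovered stretches become variable segments. The coordinates below are illustrative; MOSAIC's own pipeline derives these segments from whole-genome nucleotide alignments.

```python
# Toy backbone/variable decomposition: merge conserved intervals into the
# backbone and report the uncovered gaps as variable segments.
# Intervals are 0-based, half-open; coordinates are illustrative.
def backbone_and_variable(conserved, genome_len):
    """conserved: list of (start, end) conserved intervals on one genome."""
    merged = []
    for start, end in sorted(conserved):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    variable, pos = [], 0
    for start, end in merged:
        if start > pos:
            variable.append((pos, start))
        pos = end
    if pos < genome_len:
        variable.append((pos, genome_len))
    return merged, variable
```

Defining both segment types at nucleotide resolution, as here, is what lets MOSAIC compare coding and non-coding regions alike.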

  3. A framework for integration of scientific applications into the OpenTopography workflow

    NASA Astrophysics Data System (ADS)

    Nandigam, V.; Crosby, C.; Baru, C.

    2012-12-01

The NSF-funded OpenTopography facility provides online access to Earth science-oriented high-resolution LIDAR topography data, online processing tools, and derivative products. The underlying cyberinfrastructure employs a multi-tier, service-oriented architecture comprising an infrastructure tier, a processing services tier, and an application tier. The infrastructure tier consists of storage and compute resources as well as supporting databases. The services tier consists of the set of processing routines, each deployed as a Web service. The application tier provides client interfaces to the system (e.g., the portal). We propose a "pluggable" infrastructure design that will allow new scientific algorithms and processing routines, developed and maintained by the community, to be integrated into the OpenTopography system so that the wider earth science community can benefit from their availability. All core components in OpenTopography are available as Web services using a customized open-source Opal toolkit. The Opal toolkit provides mechanisms to manage and track job submissions with the help of a back-end database, and allows monitoring of job and system status through charting tools. All core components in OpenTopography have been developed, maintained and wrapped as Web services using Opal by OpenTopography developers. However, as the scientific community develops new processing and analysis approaches, this integration approach does not scale efficiently. Most new scientific applications have their own active development teams performing regular updates, maintenance and other improvements. It would be optimal to have each application remain co-located with its developers, who can continue to actively work on it while still making it accessible within the OpenTopography workflow for processing capabilities. We will utilize a software framework for remote integration of these scientific applications into the OpenTopography system. 
This will be accomplished by virtually extending the OpenTopography service over the various infrastructures running these scientific applications and processing routines. This involves packaging and distributing a customized instance of the Opal toolkit that will wrap the software application as an OPAL-based web service and integrate it into the OpenTopography framework. We plan to make this as automated as possible. A structured specification of service inputs and outputs along with metadata annotations encoded in XML can be utilized to automate the generation of user interfaces, with appropriate tools tips and user help features, and generation of other internal software. The OpenTopography Opal toolkit will also include the customizations that will enable security authentication, authorization and the ability to write application usage and job statistics back to the OpenTopography databases. This usage information could then be reported to the original service providers and used for auditing and performance improvements. This pluggable framework will enable the application developers to continue to work on enhancing their application while making the latest iteration available in a timely manner to the earth sciences community. This will also help us establish an overall framework that other scientific application providers will also be able to use going forward.

  4. Access to emergency hospital care provided by the public sector in sub-Saharan Africa in 2015: a geocoded inventory and spatial analysis.

    PubMed

    Ouma, Paul O; Maina, Joseph; Thuranira, Pamela N; Macharia, Peter M; Alegana, Victor A; English, Mike; Okiro, Emelda A; Snow, Robert W

    2018-03-01

    Timely access to emergency care can substantially reduce mortality. International benchmarks for access to emergency hospital care have been established to guide ambitions for universal health care by 2030. However, no pan-African database of hospital locations exists; we therefore aimed to complete a geocoded inventory of hospital services in Africa in relation to how populations might access these services in 2015, with a focus on women of childbearing age. We assembled a geocoded inventory of public hospitals across 48 countries and islands of sub-Saharan Africa, including Zanzibar, using data from various sources. We included only public hospitals with emergency services that were managed by national or local governments, faith-based organisations, or non-governmental organisations. For hospital listings without geographical coordinates, we geocoded each facility using Microsoft Encarta (version 2009), Google Earth (version 7.3), Geonames, Fallingrain, OpenStreetMap, and other national digital gazetteers. We obtained estimates for the total population and for women of childbearing age (15-49 years) at a 1 km² spatial resolution from the WorldPop database for 2015. Additionally, we assembled road network data from the Google Map Maker Project and OpenStreetMap using ArcMap (version 10.5), and combined the road network with the population locations to form a travel impedance surface. We then formulated a cost distance algorithm based on the location of public hospitals and the travel impedance surface in AccessMod (version 5) to compute the proportion of the population living within a combined walking and motorised travel time of 2 h of emergency hospital services. We consulted 100 databases from 48 sub-Saharan countries and islands, including Zanzibar, and identified 4908 public hospitals, 2701 of which had either full or partial information about their geographical coordinates.
We estimated that 287 282 013 (29·0%) people and 64 495 526 (28·2%) women of childbearing age were located more than 2 h of travel time from the nearest hospital. Marked differences were observed within and between countries, ranging from less than 25% of the population within 2 h travel time of a public hospital in South Sudan to more than 90% in Nigeria, Kenya, Cape Verde, Swaziland, South Africa, Burundi, Comoros, São Tomé and Príncipe, and Zanzibar. Only 16 countries reached the international benchmark of more than 80% of the population living within a 2 h travel time of the nearest hospital. Physical access to emergency hospital care provided by the public sector in Africa remains poor and varies substantially within and between countries; innovative targeting of emergency care services is necessary to reduce these inequities. This study provides the first spatial census of public hospital services in Africa. Funding: Wellcome Trust and the UK Department for International Development. Copyright © 2018 The Author(s). Published by Elsevier Ltd. This is an Open Access article under the CC BY 4.0 license.
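The cost-distance idea the study describes can be illustrated with a minimal sketch: run a shortest-path search over a travel-impedance grid (minutes needed to cross each cell) starting from hospital cells, then measure what share of the population falls within a 2 h limit. The grid, impedance values and population counts below are invented toy data, not the study's inputs, and the real analysis was performed with AccessMod, not this code.

```python
import heapq

def travel_times(impedance, sources):
    """Dijkstra over a grid; impedance[r][c] = minutes to cross cell (r, c)."""
    rows, cols = len(impedance), len(impedance[0])
    best = {s: 0.0 for s in sources}
    heap = [(0.0, s) for s in sources]
    while heap:
        t, (r, c) = heapq.heappop(heap)
        if t > best.get((r, c), float("inf")):
            continue  # stale heap entry
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < rows and 0 <= nc < cols:
                nt = t + impedance[nr][nc]
                if nt < best.get((nr, nc), float("inf")):
                    best[(nr, nc)] = nt
                    heapq.heappush(heap, (nt, (nr, nc)))
    return best

def share_within(impedance, population, hospitals, limit_minutes):
    """Fraction of total population within limit_minutes of any hospital."""
    times = travel_times(impedance, hospitals)
    total = sum(sum(row) for row in population)
    reached = sum(population[r][c]
                  for (r, c), t in times.items() if t <= limit_minutes)
    return reached / total

impedance = [[30, 30, 60],
             [30, 90, 60],
             [30, 30, 30]]          # minutes per cell (toy values)
population = [[100, 50, 10],
              [80, 20, 5],
              [60, 40, 30]]
print(share_within(impedance, population, [(0, 0)], 120))  # share within 2 h
```

The same structure scales to the study's setting: a 1 km² population raster, an impedance surface derived from the road network, and hospital point locations snapped to grid cells.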

  5. Introduction to TETHYS—an interdisciplinary GIS database for studying continental collisions

    NASA Astrophysics Data System (ADS)

    Khan, S. D.; Flower, M. F. J.; Sultan, M. I.; Sandvol, E.

    2006-05-01

    The TETHYS GIS database is being developed as a way to integrate relevant geologic, geophysical, geochemical, geochronologic, and remote sensing data bearing on Tethyan continental plate collisions. The project is predicated on a need for actualistic model 'templates' for interpreting the Earth's geologic record. Because of their time-transgressive character, Tethyan collisions offer 'actualistic' models for features such as continental 'escape', collision-induced upper mantle flow and magmatism, and marginal basin opening associated with modern convergent plate margins. Large integrated geochemical and geophysical databases allow such models to be tested against the geologic record, leading to a better understanding of continental accretion throughout Earth history. The TETHYS database combines digital topographic and geologic information, remote sensing images, sample-based geochemical, geochronologic, and isotopic data (for pre- and post-collision igneous activity), and data for seismic tomography, shear-wave splitting, space geodesy, and plate tectonic reconstructions. Here, we report progress on developing the database and the tools for manipulating and visualizing integrated 2-, 3-, and 4-D data sets, with examples of research applications in progress. Based on an Oracle database system linked with ArcIMS via ArcSDE, the TETHYS project is an evolving resource for researchers, educators, and others interested in studying the role of plate collisions in the process of continental accretion, and will be accessible as a node of the national Geosciences Cyberinfrastructure Network (GEON) via the World Wide Web and the ultra-high-speed Internet2. Interim partial access to the data and metadata is available at: http://geoinfo.geosc.uh.edu/Tethys/ and http://www.esrs.wmich.edu/tethys.htm. We demonstrate the utility of the TETHYS database in building a framework for lithospheric interactions in continental collision and accretion.

  6. Development of Human Face Literature Database Using Text Mining Approach: Phase I.

    PubMed

    Kaur, Paramjit; Krishan, Kewal; Sharma, Suresh K

    2018-06-01

    The face is an important part of the human body, by which an individual communicates in society. Its importance is underlined by the central role the face plays in identification and social interaction. The number of experiments performed and research papers published on the human face has surged in the past few decades. Several scientific disciplines conduct research on the human face, including medical science, anthropology, information technology (biometrics, robotics, artificial intelligence, etc.), psychology, forensic science and neuroscience. This highlights the need to collect and manage data concerning the human face so that free, public access to it can be provided to the scientific community. This can be achieved by developing databases and tools on the human face using a bioinformatics approach. The current research focuses on creating a database of the literature on the human face. The database can be searched by specific keywords, journal name, date of publication, author's name, etc. The collected research papers are stored in the database, so the database will be beneficial to the research community, as comprehensive information dedicated to the human face can be found in one place. Information related to facial morphological features, facial disorders, facial asymmetry, facial abnormalities, and many other parameters can be extracted from this database. The front end has been developed using Hypertext Markup Language (HTML) and Cascading Style Sheets (CSS); the back end has been developed using the hypertext preprocessor (PHP), with JavaScript as the scripting language. MySQL is used for database development, as it is the most widely used relational database management system. XAMPP (X (cross-platform), Apache, MySQL, PHP, Perl) open-source web application software has been used as the server. The database is still in the developmental phase; this paper discusses the initial steps of its creation and the work done to date.
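The combined keyword/journal/author lookup described above can be sketched as follows. The real system uses PHP and MySQL; this illustration uses Python's built-in sqlite3, and the table name, column names and sample records are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE papers ("
    "  id INTEGER PRIMARY KEY,"
    "  title TEXT, authors TEXT, journal TEXT, pub_date TEXT)"
)
# Invented sample records for illustration only.
conn.executemany(
    "INSERT INTO papers (title, authors, journal, pub_date) VALUES (?, ?, ?, ?)",
    [("Facial asymmetry in adults", "Doe J", "J Anat", "2017-04-01"),
     ("Face recognition survey", "Roe A", "IEEE TPAMI", "2016-09-01")],
)

def search(keyword=None, journal=None, author=None):
    """Combine whichever filters were supplied, as a search form would."""
    clauses, params = [], []
    if keyword:
        clauses.append("title LIKE ?")
        params.append(f"%{keyword}%")
    if journal:
        clauses.append("journal = ?")
        params.append(journal)
    if author:
        clauses.append("authors LIKE ?")
        params.append(f"%{author}%")
    where = " AND ".join(clauses) or "1=1"
    return conn.execute(
        f"SELECT title FROM papers WHERE {where}", params).fetchall()

print(search(keyword="asymmetry"))  # [('Facial asymmetry in adults',)]
```

Parameterized queries (the `?` placeholders) keep the dynamically assembled WHERE clause safe from injection, which matters for any web-facing search form of this kind.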

  7. Questioning the efficacy of 'gold' open access to published articles.

    PubMed

    Fredericks, Suzanne

    2015-07-01

    To question the efficacy of 'gold' open access to published articles. Open access is unrestricted access to academic, theoretical and research literature that is scholarly and peer-reviewed. Two models of open access exist: 'gold' and 'green'. Gold open access provides everyone with access to articles during all stages of publication, with processing charges paid by the author(s). Green open access involves placing an already published article into a repository to provide unrestricted access, with processing charges incurred by the publisher. This is a discussion paper. An exploration of the relative benefits and drawbacks of the 'gold' and 'green' open access systems. Green open access is a more economic and efficient means of granting open access to scholarly literature but a large number of researchers select gold open access journals as their first choices for manuscript submissions. This paper questions the efficacy of gold open access models and presents an examination of green open access models to encourage nurse researchers to consider this approach. In the current academic environment, with increased pressures to publish and low funding success rates, it is difficult to understand why gold open access still exists. Green open access enhances the visibility of an academic's work, as increased downloads of articles tend to lead to increased citations. Green open access is the cheaper option, as well as the most beneficial choice, for universities that want to provide unrestricted access to all literature at minimal risk.

  8. Computational toxicology using the OpenTox application programming interface and Bioclipse

    PubMed Central

    2011-01-01

    Background Toxicity is a complex phenomenon involving the potential adverse effect on a range of biological functions. Predicting toxicity involves using a combination of experimental data (endpoints) and computational methods to generate a set of predictive models. Such models rely strongly on being able to integrate information from many sources. The required integration of biological and chemical information sources requires, however, a common language to express our knowledge ontologically, and interoperating services to build reliable predictive toxicology applications. Findings This article describes progress in extending the integrative bio- and cheminformatics platform Bioclipse to interoperate with OpenTox, a semantic web framework which supports open data exchange and toxicology model building. The Bioclipse workbench environment enables functionality from OpenTox web services and easy access to OpenTox resources for evaluating toxicity properties of query molecules. Relevant cases and interfaces based on ten neurotoxins are described to demonstrate the capabilities provided to the user. The integration takes advantage of semantic web technologies, thereby providing an open and simplifying communication standard. Additionally, the use of ontologies ensures proper interoperation and reliable integration of toxicity information from both experimental and computational sources. Conclusions A novel computational toxicity assessment platform was generated from integration of two open science platforms related to toxicology: Bioclipse, that combines a rich scriptable and graphical workbench environment for integration of diverse sets of information sources, and OpenTox, a platform for interoperable toxicology data and computational services. The combination provides improved reliability and operability for handling large data sets by the use of the Open Standards from the OpenTox Application Programming Interface. 
This enables simultaneous access to a variety of distributed predictive toxicology databases, and algorithm and model resources, taking advantage of the Bioclipse workbench handling the technical layers. PMID:22075173

  9. Item response theory analysis of the Amyotrophic Lateral Sclerosis Functional Rating Scale-Revised in the Pooled Resource Open-Access ALS Clinical Trials Database.

    PubMed

    Bacci, Elizabeth D; Staniewska, Dorota; Coyne, Karin S; Boyer, Stacey; White, Leigh Ann; Zach, Neta; Cedarbaum, Jesse M

    2016-01-01

    Our objective was to examine the dimensionality and item-level performance of the Amyotrophic Lateral Sclerosis Functional Rating Scale-Revised (ALSFRS-R) across time using classical and modern test theory approaches. Confirmatory factor analysis (CFA) and item response theory (IRT) analyses were conducted using data from patients with amyotrophic lateral sclerosis (ALS) in the Pooled Resource Open-Access ALS Clinical Trials (PRO-ACT) database with complete ALSFRS-R data (n = 888) at three time-points (Time 0, Time 1 (6 months), and Time 2 (1 year)). In this population of 888 patients, mean age was 54.6 years, 64.4% were male, and 93.7% were Caucasian. The CFA supported a four-factor, individual-domain structure (bulbar, gross motor, fine motor, and respiratory domains). IRT analysis within each domain revealed misfitting items and overlapping item response category thresholds at all time-points, particularly among the gross motor and respiratory domain items. Results indicate that many ALSFRS-R items may sub-optimally distinguish among levels of disability assessed by each domain, particularly in patients with less severe disability. Measure performance improved across time as patient disability severity increased. In conclusion, modifications to select ALSFRS-R items may improve the instrument's specificity to disability level and sensitivity to treatment effects.
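To see why overlapping category thresholds degrade an item, here is a minimal sketch using a graded response model, a standard IRT model for ordered response categories. The abstract does not state which IRT model was fitted, and every parameter value below is invented for illustration.

```python
import math

def p_at_least(theta, a, b):
    """P(response >= category) for ability theta, discrimination a, threshold b."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def category_probs(theta, a, thresholds):
    """Probability of each response category given ordered thresholds."""
    cum = [1.0] + [p_at_least(theta, a, b) for b in thresholds] + [0.0]
    return [cum[k] - cum[k + 1] for k in range(len(cum) - 1)]

# Well-separated thresholds: every category is clearly the most likely
# response somewhere along the disability continuum.
good = category_probs(0.0, a=1.5, thresholds=[-2.0, 0.0, 2.0])

# Nearly coincident (overlapping) thresholds: the middle categories are
# never clearly preferred, so they add little information -- the kind of
# item behaviour the analysis above reports.
bad = category_probs(0.0, a=1.5, thresholds=[-0.1, 0.0, 0.1])

print([round(p, 3) for p in good])  # [0.047, 0.453, 0.453, 0.047]
print([round(p, 3) for p in bad])   # [0.463, 0.037, 0.037, 0.463]
```

With the overlapping thresholds, responses pile up in the extreme categories regardless of ability, so the item cannot distinguish intermediate levels of disability.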

  10. Sources of Free and Open Source Spatial Data for Natural Disasters and Principles for Use in Developing Country Contexts

    NASA Astrophysics Data System (ADS)

    Taylor, Faith E.; Malamud, Bruce D.; Millington, James D. A.

    2016-04-01

    Access to reliable spatial and quantitative datasets (e.g., infrastructure maps, historical observations, environmental variables) at regional and site-specific scales can be a limiting factor for understanding hazards and risks in developing country settings. Here we present a 'living database' of >75 freely available data sources relevant to hazard and risk in Africa (and more globally). Data sources include national scientific foundations, non-governmental bodies, crowd-sourced efforts, academic projects, special interest groups and others. The database is available at http://tinyurl.com/africa-datasets and is continually being updated, particularly in the context of broader natural hazards research we are conducting in Malawi and Kenya. For each data source, we review the spatiotemporal resolution and extent, and make our own assessments of the reliability and usability of the datasets. Although such freely available datasets are sometimes presented as a panacea for improving our understanding of hazards and risk in developing countries, there are both pitfalls and opportunities unique to using this type of freely available data. These include factors such as resolution, homogeneity, uncertainty, access to metadata and training for usage. Based on our experience, use in the field, and the grey and peer-reviewed literature, we present a suggested set of guidelines for using these free and open source data in developing country contexts.

  11. SIMAP—the database of all-against-all protein sequence similarities and annotations with new interfaces and increased coverage

    PubMed Central

    Arnold, Roland; Goldenberg, Florian; Mewes, Hans-Werner; Rattei, Thomas

    2014-01-01

    The Similarity Matrix of Proteins (SIMAP, http://mips.gsf.de/simap/) database has been designed to massively accelerate computationally expensive protein sequence analysis tasks in bioinformatics. It provides pre-calculated sequence similarities interconnecting the entire known protein sequence universe, complemented by pre-calculated protein features and domains, similarity clusters and functional annotations. SIMAP covers all major public protein databases as well as many consistently re-annotated metagenomes from different repositories. As of September 2013, SIMAP contains >163 million proteins corresponding to ∼70 million non-redundant sequences. SIMAP uses the sensitive FASTA search heuristics, the Smith–Waterman alignment algorithm, the InterPro database of protein domain models and the BLAST2GO functional annotation algorithm. SIMAP assists biologists by facilitating the interactive exploration of the protein sequence universe. Web-Service and DAS interfaces allow connecting SIMAP with any other bioinformatic tool and resource. All-against-all protein sequence similarity matrices of project-specific protein collections are generated on request. Recent improvements allow SIMAP to cover the rapidly growing sequenced protein sequence universe. New Web-Service interfaces enhance the connectivity of SIMAP. Novel tools for interactive extraction of protein similarity networks have been added. Open access to SIMAP is provided through the web portal; the portal also contains instructions and links for software access and flat file downloads. PMID:24165881
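The Smith-Waterman algorithm named above can be sketched in a minimal, scoring-only form. SIMAP itself uses substitution matrices and highly optimized implementations; the simple match/mismatch/linear-gap scheme below is an illustrative assumption.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Return the best local alignment score between sequences a and b.

    Classic dynamic program: cell h[i][j] is the best score of any local
    alignment ending at a[i-1], b[j-1]; negative scores are clamped to 0,
    which is what makes the alignment local rather than global.
    """
    rows, cols = len(a) + 1, len(b) + 1
    h = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            diag = h[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            h[i][j] = max(0, diag, h[i - 1][j] + gap, h[i][j - 1] + gap)
            best = max(best, h[i][j])
    return best

print(smith_waterman_score("HEAGAWGHEE", "PAWHEAE"))
```

Pre-computing exactly these pairwise scores for the entire known protein sequence universe, so that no user ever has to rerun the quadratic dynamic program, is the core idea behind SIMAP's similarity matrix.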

  12. On the Reporting of Experimental and Control Therapies in Stroke Rehabilitation Trials: A Systematic Review.

    PubMed

    Lohse, Keith R; Pathania, Anupriya; Wegman, Rebecca; Boyd, Lara A; Lang, Catherine E

    2018-03-01

    To use the Centralized Open-Access Rehabilitation database for Stroke to explore reporting of both experimental and control interventions in randomized controlled trials for stroke rehabilitation (including upper and lower extremity therapies). The Centralized Open-Access Rehabilitation database for Stroke was created from a search of MEDLINE, Embase, Cochrane Central Register of Controlled Trials, Cochrane Database of Systematic Reviews, and Cumulative Index of Nursing and Allied Health from the earliest available date to May 31, 2014. A total of 2892 titles were reduced to 514 that were screened by full text. This screening left 215 randomized controlled trials in the database (489 independent groups representing 12,847 patients). Using a mixture of qualitative and quantitative methods, we performed a text-based analysis of how the procedures of experimental and control therapies were described. Experimental and control groups were rated by 2 independent coders according to the Template for Intervention Description and Replication criteria. Linear mixed-effects regression with a random effect of study (groups nested within studies) showed that experimental groups had statistically more words in their procedures (mean, 271.8 words) than did control groups (mean, 154.8 words) (P<.001). Experimental groups had statistically more references in their procedures (mean, 1.60 references) than did control groups (mean, 0.82 references) (P<.001). Experimental groups also scored significantly higher on the total Template for Intervention Description and Replication checklist (mean score, 7.43 points) than did control groups (mean score, 5.23 points) (P<.001). Control treatments in stroke motor rehabilitation trials are underdescribed relative to experimental treatments. These poor descriptions are especially problematic for "conventional" therapy control groups. Poor reporting is a threat to the internal validity and generalizability of clinical trial results.
We recommend authors use preregistered protocols and established reporting criteria to improve transparency. Copyright © 2018 American Congress of Rehabilitation Medicine. Published by Elsevier Inc. All rights reserved.
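The word-count portion of the text-based analysis described above can be illustrated with a toy sketch: count the words in each group's procedure description and compare means. The two description lists below are invented examples, not trial data, and the real analysis used mixed-effects regression rather than a simple comparison of means.

```python
import statistics

# Invented procedure descriptions: detailed for experimental groups,
# terse for control groups (mirroring the pattern the review reports).
experimental = [
    "Task-specific reaching practice, 60 min per session, 3 sessions per "
    "week for 8 weeks, progressed by reducing trunk support.",
    "Robot-assisted gait training with body-weight support, intensity "
    "increased weekly according to a published protocol.",
]
control = [
    "Conventional therapy.",
    "Usual care as provided on the ward.",
]

def mean_words(descriptions):
    """Mean whitespace-delimited word count across descriptions."""
    return statistics.mean(len(d.split()) for d in descriptions)

print(round(mean_words(experimental), 1), round(mean_words(control), 1))  # 16.5 4.5
```

Even this crude metric exposes the asymmetry the review quantifies: "conventional therapy" tells a reader almost nothing about what the control group actually received.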

  13. Combining data from multiple sources using the CUAHSI Hydrologic Information System

    NASA Astrophysics Data System (ADS)

    Tarboton, D. G.; Ames, D. P.; Horsburgh, J. S.; Goodall, J. L.

    2012-12-01

    The Consortium of Universities for the Advancement of Hydrologic Science, Inc. (CUAHSI) has developed a Hydrologic Information System (HIS) to provide better access to data by enabling the publication, cataloging, discovery, retrieval, and analysis of hydrologic data using web services. The CUAHSI HIS is an Internet-based system composed of hydrologic databases and servers connected through web services, together with software for data publication, discovery and access. The HIS metadata catalog lists close to 100 web services registered to provide data through this system, ranging from large federal agency data sets to experimental watersheds managed by university investigators. The system's flexibility in storing and enabling public access to similarly formatted data and metadata has created a community data resource from governmental and academic data that might otherwise remain private or be analyzed only in isolation. Comprehensive understanding of hydrology requires integration of this information from multiple sources. HydroDesktop is the client application developed as part of HIS to support data discovery and access through this system. HydroDesktop is founded on an open-source GIS client and has a plug-in architecture that has enabled the integration of modeling and analysis capability with the functionality for data discovery and access. Model integration is possible through a plug-in built on the OpenMI standard, and data visualization and analysis are supported by an R plug-in. This presentation will demonstrate HydroDesktop, showing how it provides an analysis environment within which data from multiple sources can be discovered, accessed and integrated.

  14. FunRich proteomics software analysis, let the fun begin!

    PubMed

    Benito-Martin, Alberto; Peinado, Héctor

    2015-08-01

    Protein MS analysis is the preferred method for unbiased protein identification. It is normally applied to a large number of both small-scale and high-throughput studies. However, user-friendly computational tools for protein analysis are still needed. In this issue, Mathivanan and colleagues (Proteomics 2015, 15, 2597-2601) report the development of FunRich software, an open-access software that facilitates the analysis of proteomics data, providing tools for functional enrichment and interaction network analysis of genes and proteins. FunRich is a reinterpretation of proteomic software, a standalone tool combining ease of use with customizable databases, free access, and graphical representations. © 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  15. The Fabric for Frontier Experiments Project at Fermilab

    NASA Astrophysics Data System (ADS)

    Kirby, Michael

    2014-06-01

    The FabrIc for Frontier Experiments (FIFE) project is a new, far-reaching initiative within the Fermilab Scientific Computing Division to drive the future of computing services for experiments at FNAL and elsewhere. It is a collaborative effort between computing professionals and experiment scientists to produce an end-to-end, fully integrated set of services for computing on the grid and clouds, managing data, accessing databases, and collaborating within experiments. FIFE includes 1) easy to use job submission services for processing physics tasks on the Open Science Grid and elsewhere; 2) an extensive data management system for managing local and remote caches, cataloging, querying, moving, and tracking the use of data; 3) custom and generic database applications for calibrations, beam information, and other purposes; 4) collaboration tools including an electronic log book, speakers bureau database, and experiment membership database. All of these aspects will be discussed in detail. FIFE sets the direction of computing at Fermilab experiments now and in the future, and therefore is a major driver in the design of computing services worldwide.

  16. The Biomolecular Interaction Network Database and related tools 2005 update

    PubMed Central

    Alfarano, C.; Andrade, C. E.; Anthony, K.; Bahroos, N.; Bajec, M.; Bantoft, K.; Betel, D.; Bobechko, B.; Boutilier, K.; Burgess, E.; Buzadzija, K.; Cavero, R.; D'Abreo, C.; Donaldson, I.; Dorairajoo, D.; Dumontier, M. J.; Dumontier, M. R.; Earles, V.; Farrall, R.; Feldman, H.; Garderman, E.; Gong, Y.; Gonzaga, R.; Grytsan, V.; Gryz, E.; Gu, V.; Haldorsen, E.; Halupa, A.; Haw, R.; Hrvojic, A.; Hurrell, L.; Isserlin, R.; Jack, F.; Juma, F.; Khan, A.; Kon, T.; Konopinsky, S.; Le, V.; Lee, E.; Ling, S.; Magidin, M.; Moniakis, J.; Montojo, J.; Moore, S.; Muskat, B.; Ng, I.; Paraiso, J. P.; Parker, B.; Pintilie, G.; Pirone, R.; Salama, J. J.; Sgro, S.; Shan, T.; Shu, Y.; Siew, J.; Skinner, D.; Snyder, K.; Stasiuk, R.; Strumpf, D.; Tuekam, B.; Tao, S.; Wang, Z.; White, M.; Willis, R.; Wolting, C.; Wong, S.; Wrong, A.; Xin, C.; Yao, R.; Yates, B.; Zhang, S.; Zheng, K.; Pawson, T.; Ouellette, B. F. F.; Hogue, C. W. V.

    2005-01-01

    The Biomolecular Interaction Network Database (BIND) (http://bind.ca) archives biomolecular interaction, reaction, complex and pathway information. Our aim is to curate the details about molecular interactions that arise from published experimental research and to provide this information, as well as tools to enable data analysis, freely to researchers worldwide. BIND data are curated into a comprehensive, machine-readable archive of computable information that provides users with methods to discover interactions and molecular mechanisms. BIND has worked to develop new methods for visualization that amplify the underlying annotation of genes and proteins to facilitate the study of molecular interaction networks. BIND has maintained an open database policy since its inception in 1999. Data growth has proceeded at a tremendous rate, with the database now approaching 100 000 records. New services include a new BIND Query and Submission interface, a Simple Object Access Protocol (SOAP) service and the Small Molecule Interaction Database (http://smid.blueprint.org), which allows users to determine probable small-molecule binding sites of new sequences and examine conserved binding residues. PMID:15608229

  17. CGDSNPdb: a database resource for error-checked and imputed mouse SNPs.

    PubMed

    Hutchins, Lucie N; Ding, Yueming; Szatkiewicz, Jin P; Von Smith, Randy; Yang, Hyuna; de Villena, Fernando Pardo-Manuel; Churchill, Gary A; Graber, Joel H

    2010-07-06

    The Center for Genome Dynamics Single Nucleotide Polymorphism Database (CGDSNPdb) is an open-source value-added database with more than nine million mouse single nucleotide polymorphisms (SNPs), drawn from multiple sources, with genotypes assigned to multiple inbred strains of laboratory mice. All SNPs are checked for accuracy and annotated for properties specific to the SNP as well as those implied by changes to overlapping protein-coding genes. CGDSNPdb serves as the primary interface to two unique data sets, the 'imputed genotype resource' in which a Hidden Markov Model was used to assess local haplotypes and the most probable base assignment at several million genomic loci in tens of strains of mice, and the Affymetrix Mouse Diversity Genotyping Array, a high density microarray with over 600,000 SNPs and over 900,000 invariant genomic probes. CGDSNPdb is accessible online through either a web-based query tool or a MySQL public login. Database URL: http://cgd.jax.org/cgdsnpdb/
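The haplotype-based imputation idea mentioned above (a hidden Markov model assessed over local haplotypes) can be sketched with a toy Viterbi decoding. The states, transition and emission probabilities, and observed alleles below are invented illustrations, not CGDSNPdb parameters.

```python
import math

STATES = ["hapA", "hapB"]
START = {"hapA": 0.5, "hapB": 0.5}
# Haplotype blocks are locally stable, so switches between them are rare.
TRANS = {"hapA": {"hapA": 0.95, "hapB": 0.05},
         "hapB": {"hapA": 0.05, "hapB": 0.95}}
# Emission: probability of the observed allele given the underlying haplotype.
EMIT = {"hapA": {"A": 0.98, "G": 0.02},
        "hapB": {"A": 0.02, "G": 0.98}}

def viterbi(observations):
    """Most probable haplotype state sequence for a run of observed alleles."""
    v = [{s: math.log(START[s]) + math.log(EMIT[s][observations[0]])
          for s in STATES}]
    back = []
    for obs in observations[1:]:
        col, ptr = {}, {}
        for s in STATES:
            prev, score = max(
                ((p, v[-1][p] + math.log(TRANS[p][s])) for p in STATES),
                key=lambda x: x[1])
            col[s] = score + math.log(EMIT[s][obs])
            ptr[s] = prev
        v.append(col)
        back.append(ptr)
    state = max(STATES, key=lambda s: v[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]

print(viterbi(["A", "A", "G", "G", "G"]))
# ['hapA', 'hapA', 'hapB', 'hapB', 'hapB']
```

Once the most probable haplotype is decoded at each locus, the most probable base at an unobserved SNP follows from that haplotype's allele, which is the essence of an imputed genotype resource.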

  18. DOE Office of Scientific and Technical Information (OSTI.GOV)

    Sanchez del Rio, Manuel; Bianchi, Davide; Cocco, Daniele

    An open-source database containing metrology data for X-ray mirrors is presented. It makes available metrology data (mirror height and slope profiles) that can be used with simulation tools to calculate the effects of optical surface errors on the performance of an optical instrument, such as a synchrotron beamline. A typical case is the degradation of the intensity profile at the focal position of a beamline due to mirror surface errors. This database for metrology (DABAM) aims to provide users of simulation tools with data from real mirrors. The data included in the database are described in this paper, with details of how the mirror parameters are stored. Accompanying software is provided to allow simple access to and processing of these data, to calculate the most common statistical parameters, and to create input files for the most widely used simulation codes. Some optics simulations are presented and discussed to illustrate practical use of the profiles from the database.
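A typical statistical parameter computed from such metrology profiles is the RMS slope error: derive a slope profile from the surface heights by finite differences, then take its root mean square. The height values below are invented toy numbers, not DABAM data, and the bundled DABAM software computes these quantities differently and more completely.

```python
import math

def slopes(x, heights):
    """Central-difference slope profile (rad) from a height profile (m)."""
    return [(heights[i + 1] - heights[i - 1]) / (x[i + 1] - x[i - 1])
            for i in range(1, len(x) - 1)]

def rms(values):
    """Root mean square deviation about the mean."""
    mean = sum(values) / len(values)
    return math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))

# 1 mm sampling along the mirror, nanometre-scale height errors (toy data).
x = [i * 1e-3 for i in range(7)]
heights = [0.0, 2e-9, 1e-9, -1e-9, -2e-9, 0.0, 1e-9]

slope_profile = slopes(x, heights)
print(f"RMS slope error: {rms(slope_profile):.2e} rad")
```

Ray-tracing codes consume exactly this kind of profile to predict how surface errors broaden the focal spot, which is the use case the database targets.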

  19. StreptomycesInforSys: A web-enabled information repository

    PubMed Central

    Jain, Chakresh Kumar; Gupta, Vidhi; Gupta, Ashvarya; Gupta, Sanjay; Wadhwa, Gulshan; Sharma, Sanjeev Kumar; Sarethy, Indira P

    2012-01-01

    Members of Streptomyces produce 70% of natural bioactive products. A considerable amount of information is available for the classification of Streptomyces based on a polyphasic approach. This information, based on phenotypic, genotypic and bioactive-component production profiles, is crucial for pharmacological screening programmes, but it is scattered across various journals, books and other resources, many of which are not freely accessible. The designed database incorporates polyphasic typing information, using combinations of search options to aid efficient screening of new isolates and help with their preliminary categorization into appropriate groups. It is a free relational database compatible with existing operating systems. A cross-platform technology with the XAMPP Web server has been used to develop and manage the database and to handle user queries effectively. PHP, a platform-independent scripting language embedded in HTML, together with the database management software MySQL, facilitates dynamic information storage and retrieval. The user-friendly, open and flexible freeware stack (PHP, MySQL and Apache) is expected to reduce running and maintenance costs. Availability: www.sis.biowaves.org PMID:23275736

  1. A New Data Management System for Biological and Chemical Oceanography

    NASA Astrophysics Data System (ADS)

    Groman, R. C.; Chandler, C.; Allison, D.; Glover, D. M.; Wiebe, P. H.

    2007-12-01

    The Biological and Chemical Oceanography Data Management Office (BCO-DMO) was created to serve PIs principally funded by NSF to conduct marine chemical and ecological research. The new office is dedicated to providing open access to data and information developed in the course of scientific research on short and intermediate time-frames. The data management system developed in support of U.S. JGOFS and U.S. GLOBEC programs is being modified to support the larger scope of the BCO-DMO effort, which includes ultimately providing a way to exchange data with other data systems. The open access system is based on a philosophy of data stewardship, support for existing and evolving data standards, and use of public domain software. The DMO staff work closely with originating PIs to manage data gathered as part of their individual programs. In the new BCO-DMO data system, project and data set metadata records designed to support re-use of the data are stored in a relational database (MySQL) and the data are stored in or made accessible by the JGOFS/GLOBEC object-oriented, relational, data management system. Data access will be provided via any standard Web browser client user interface through a GIS application (Open Source, OGC-compliant MapServer), a directory listing from the data holdings catalog, or a custom search engine that facilitates data discovery. In an effort to maximize data system interoperability, data will also be available via Web Services; and data set descriptions will be generated to comply with a variety of metadata content standards. The office is located at the Woods Hole Oceanographic Institution and web access is via http://www.bco-dmo.org.

  2. HEPData: a repository for high energy physics data

    NASA Astrophysics Data System (ADS)

    Maguire, Eamonn; Heinrich, Lukas; Watt, Graeme

    2017-10-01

    The Durham High Energy Physics Database (HEPData) has been built up over the past four decades as a unique open-access repository for scattering data from experimental particle physics papers. It comprises data points underlying several thousand publications. Over the last two years, the HEPData software has been completely rewritten using modern computing technologies as an overlay on the Invenio v3 digital library framework. The software is open source with the new site available at https://hepdata.net now replacing the previous site at http://hepdata.cedar.ac.uk. In this write-up, we describe the development of the new site and explain some of the advantages it offers over the previous platform.

  3. Spatial DBMS Architecture for a Free and Open Source BIM

    NASA Astrophysics Data System (ADS)

    Logothetis, S.; Valari, E.; Karachaliou, E.; Stylianidis, E.

    2017-08-01

    Recent research in the field of Building Information Modelling (BIM) technology revealed that, except for a few accessible and free BIM viewers, there is a lack of Free & Open Source Software (FOSS) covering the complete BIM process. With this in mind, and considering BIM as the technological advancement of Computer-Aided Design (CAD) systems, the current work proposes the use of a FOSS CAD package in order to extend its capabilities and transform it gradually into a FOSS BIM platform. Towards this undertaking, a first approach to developing a spatial Database Management System (DBMS) able to store, organize and manage the overall amount of information within a single application is presented.

  4. Meet Spinky: An Open-Source Spindle and K-Complex Detection Toolbox Validated on the Open-Access Montreal Archive of Sleep Studies (MASS).

    PubMed

    Lajnef, Tarek; O'Reilly, Christian; Combrisson, Etienne; Chaibi, Sahbi; Eichenlaub, Jean-Baptiste; Ruby, Perrine M; Aguera, Pierre-Emmanuel; Samet, Mounir; Kachouri, Abdennaceur; Frenette, Sonia; Carrier, Julie; Jerbi, Karim

    2017-01-01

    Sleep spindles and K-complexes are among the most prominent micro-events observed in electroencephalographic (EEG) recordings during sleep. These EEG microstructures are thought to be hallmarks of sleep-related cognitive processes. Although tedious and time-consuming, their identification and quantification are important for sleep studies in both healthy subjects and patients with sleep disorders. Therefore, procedures for automatic detection of spindles and K-complexes could provide valuable assistance to researchers and clinicians in the field. Recently, we proposed a framework for joint spindle and K-complex detection (Lajnef et al., 2015a) based on a Tunable Q-factor Wavelet Transform (TQWT; Selesnick, 2011a) and morphological component analysis (MCA). Using a wide range of performance metrics, the present article provides critical validation and benchmarking of the proposed approach by applying it to open-access EEG data from the Montreal Archive of Sleep Studies (MASS; O'Reilly et al., 2014). Importantly, the obtained scores were compared to alternative methods that were previously tested on the same database. With respect to spindle detection, our method achieved higher performance than most of the alternative methods. This was corroborated with statistical tests that took into account both sensitivity and precision (i.e., Matthews correlation coefficient (MCC), F1, Cohen's κ). Our proposed method has been made available to the community via an open-source tool named Spinky (for spindle and K-complex detection). Thanks to a GUI implementation and access to Matlab and Python resources, Spinky is expected to contribute to an open-science approach that will enhance replicability and reliable comparisons of classifier performances for the detection of sleep EEG microstructure in both healthy and patient populations.

  5. Identification and Validation of Human Missing Proteins and Peptides in Public Proteome Databases: Data Mining Strategy.

    PubMed

    Elguoshy, Amr; Hirao, Yoshitoshi; Xu, Bo; Saito, Suguru; Quadery, Ali F; Yamamoto, Keiko; Mitsui, Toshiaki; Yamamoto, Tadashi

    2017-12-01

    In an attempt to complete the Human Proteome Project (HPP), the Chromosome-Centric Human Proteome Project (C-HPP) launched the journey of missing protein (MP) investigation in 2012. However, 2579 and 572 protein entries in neXtProt (2017-1) are still considered missing and uncertain proteins, respectively. Thus, in this study, we proposed a pipeline to analyze, identify, and validate human missing and uncertain proteins in open-access transcriptomics and proteomics databases. Analysis of the RNA expression pattern for missing proteins in the Human Protein Atlas showed that 28% of them, such as Olfactory receptor 1I1 (O60431), had no RNA expression, suggesting the necessity of considering uncommon tissues for transcriptomic and proteomic studies. Interestingly, 21% had an elevated expression level in a particular tissue (tissue-enriched proteins), indicating the importance of targeting such proteins in their elevated tissues. Additionally, analysis of RNA expression levels for missing proteins showed that 95% had no or low expression (0-10 transcripts per million), indicating that low abundance is one of the major obstacles to the detection of missing proteins. Moreover, missing proteins are predicted to generate fewer unique tryptic peptides than identified proteins. Searching for the predicted unique tryptic peptides corresponding to missing and uncertain proteins in the experimental peptide lists of open-access MS-based databases (PA, GPM) resulted in the detection of 402 missing and 19 uncertain proteins with at least two unique peptides (≥9 aa) at <(5 × 10⁻⁴)% FDR. Finally, matching the native spectra of the experimentally detected peptides with their SRMAtlas synthetic counterparts at three transition sources (QQQ, QTOF, QTRAP) gave us an opportunity to validate 41 missing proteins by ≥2 proteotypic peptides.

  6. StatsDB: platform-agnostic storage and understanding of next generation sequencing run metrics

    PubMed Central

    Ramirez-Gonzalez, Ricardo H.; Leggett, Richard M.; Waite, Darren; Thanki, Anil; Drou, Nizar; Caccamo, Mario; Davey, Robert

    2014-01-01

    Modern sequencing platforms generate enormous quantities of data in ever-decreasing amounts of time. Additionally, techniques such as multiplex sequencing allow one run to contain hundreds of different samples. With such data comes a significant challenge: understanding its quality, and how quality and yield are changing across instruments and over time. As well as the desire to understand historical data, sequencing centres often have a duty to provide clear summaries of individual run performance to collaborators or customers. We present StatsDB, an open-source software package for storage and analysis of next generation sequencing run metrics. The system has been designed for incorporation into a primary analysis pipeline, either at the programmatic level or via integration into existing user interfaces. Statistics are stored in an SQL database and APIs provide the ability to store and access the data while abstracting the underlying database design. This abstraction allows simpler, wider querying across multiple fields than is possible by the manual steps and calculation required to dissect individual reports, e.g. "provide metrics about nucleotide bias in libraries using adaptor barcode X, across all runs on sequencer A, within the last month". The software is supplied with modules for storage of statistics from FastQC, a commonly used tool for analysis of sequence reads, but the open nature of the database schema means it can be easily adapted to other tools. Currently at The Genome Analysis Centre (TGAC), reports are accessed through our LIMS system or through a standalone GUI tool, but the API and supplied examples make it easy to develop custom reports and to interface with other packages. PMID:24627795
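    The example query quoted in the abstract can be sketched as plain SQL against a metrics table. The schema below is illustrative only (it is not StatsDB's actual schema, which stores FastQC-derived statistics); SQLite stands in for the SQL backend.

```python
import sqlite3

# Toy run-metrics table: one row per (sequencer, barcode, run, metric).
db = sqlite3.connect(":memory:")
db.execute("""CREATE TABLE run_metrics (
    sequencer TEXT, barcode TEXT, run_date TEXT, metric TEXT, value REAL)""")
db.executemany("INSERT INTO run_metrics VALUES (?,?,?,?,?)", [
    ("A", "X", "2014-03-01", "nucleotide_bias", 0.02),
    ("A", "X", "2014-03-10", "nucleotide_bias", 0.04),
    ("B", "X", "2014-03-05", "nucleotide_bias", 0.09),
])

# "nucleotide bias in libraries using adaptor barcode X, across all runs
#  on sequencer A, within the last month" as one declarative query:
rows = db.execute("""
    SELECT run_date, value FROM run_metrics
    WHERE metric = 'nucleotide_bias' AND barcode = 'X'
      AND sequencer = 'A' AND run_date >= '2014-03-01'
    ORDER BY run_date""").fetchall()
```

    The point of the abstraction is that this one query replaces manually opening and dissecting each per-run report.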

  7. An offline-online Web-GIS Android application for fast data acquisition of landslide hazard and risk

    NASA Astrophysics Data System (ADS)

    Olyazadeh, Roya; Sudmeier-Rieux, Karen; Jaboyedoff, Michel; Derron, Marc-Henri; Devkota, Sanjaya

    2017-04-01

    Regional landslide assessment and mapping have been effectively pursued by research institutions, national and local governments, non-governmental organizations (NGOs), and other stakeholders for some time, and a wide range of methodologies and technologies have consequently been proposed. Land-use maps and hazard event inventories are mostly created from remote-sensing data, which is subject to difficulties, such as accessibility and terrain, that need to be overcome. Likewise, landslide data acquired in the field can improve the accuracy of databases and analyses. Open-source Web and mobile GIS tools can be used for improved ground-truthing of critical areas, strengthening the analysis of hazard patterns and triggering factors. This paper reviews the implementation and selected results of a secure mobile-mapping application called ROOMA (Rapid Offline-Online Mapping Application) for the rapid collection of landslide hazard and risk data. This prototype assists the quick creation of landslide inventory maps (LIMs) by collecting information on the type, features, volume, date, and patterns of landslides using open-source Web-GIS technologies such as Leaflet maps, Cordova, GeoServer, PostgreSQL as the DBMS (database management system), and PostGIS as its plug-in for spatial database management. The application comprises Leaflet maps coupled with satellite images as a base layer, drawing tools, geolocation (using GPS and the Internet), photo mapping, and event clustering. All features and information are recorded into a GeoJSON text file in the offline version (Android) and subsequently uploaded to the online mode (usable in all browsers) when an Internet connection is available. Finally, the events can be accessed and edited after approval by an administrator and then visualized by the general public.
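    The offline step described above amounts to serializing each field observation as a GeoJSON Feature and appending it to a text file for later upload. A minimal sketch of that record format follows; the property names and values are assumptions for illustration, not ROOMA's actual schema.

```python
import json
from datetime import date

def record_event(lon, lat, landslide_type, volume_m3, observed):
    """One landslide observation as a GeoJSON Feature (RFC 7946 layout)."""
    return {
        "type": "Feature",
        "geometry": {"type": "Point", "coordinates": [lon, lat]},
        "properties": {
            "type": landslide_type,
            "volume_m3": volume_m3,
            "date": observed.isoformat(),
        },
    }

# Offline: collect events into a FeatureCollection and dump to a text file.
event = record_event(85.3, 27.7, "debris flow", 1200, date(2016, 7, 14))
offline_log = json.dumps({"type": "FeatureCollection", "features": [event]})
```

    Because GeoJSON is plain text, the same file can be written without connectivity on the device and parsed server-side (e.g. by PostGIS) once uploaded.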

  8. Insight: An ontology-based integrated database and analysis platform for epilepsy self-management research.

    PubMed

    Sahoo, Satya S; Ramesh, Priya; Welter, Elisabeth; Bukach, Ashley; Valdez, Joshua; Tatsuoka, Curtis; Bamps, Yvan; Stoll, Shelley; Jobst, Barbara C; Sajatovic, Martha

    2016-10-01

    We present Insight, an integrated database and analysis platform for epilepsy self-management research developed as part of the national Managing Epilepsy Well Network. Insight is the only available informatics platform for accessing and analyzing integrated data from multiple epilepsy self-management research studies, with several new data management features and user-friendly functionalities. The features of Insight include: (1) use of Common Data Elements defined by members of the research community, together with an epilepsy domain ontology, for data integration and querying; (2) visualization tools to support real-time exploration of data distribution across research studies; and (3) an interactive visual query interface for provenance-enabled research cohort identification. The Insight platform contains data from five completed epilepsy self-management research studies covering various categories of data, including depression, quality of life, seizure frequency, and socioeconomic information. The data represent over 400 participants with 7552 data points. The Insight data exploration and cohort identification query interface has been developed using Ruby on Rails Web technology and the open-source Web Ontology Language Application Programming Interface to support ontology-based reasoning. We have developed an efficient ontology management module that automatically updates the ontology mappings each time a new version of the Epilepsy and Seizure Ontology is released. The Insight platform features a Role-based Access Control module to authenticate and effectively manage user access to different research studies. User access to Insight is managed by the Managing Epilepsy Well Network database steering committee, consisting of representatives of all current collaborating centers of the Managing Epilepsy Well Network.
New research studies are being continuously added to the Insight database and the size as well as the unique coverage of the dataset allows investigators to conduct aggregate data analysis that will inform the next generation of epilepsy self-management studies. Copyright © 2016 Elsevier Ireland Ltd. All rights reserved.
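    Cohort identification over Common Data Elements, as described above, reduces to filtering integrated participant records on shared fields. The sketch below illustrates the idea with hypothetical field names and thresholds; it is not Insight's query interface or its actual data model.

```python
# Toy integrated dataset: records from several studies sharing CDE fields.
participants = [
    {"study": "S1", "id": 1, "depression_score": 14, "seizure_freq": 3},
    {"study": "S1", "id": 2, "depression_score": 4,  "seizure_freq": 0},
    {"study": "S2", "id": 3, "depression_score": 11, "seizure_freq": 5},
]

def cohort(records, **criteria):
    """Select records whose CDE values meet every (field, minimum) criterion."""
    return [r for r in records
            if all(r.get(field, 0) >= minimum
                   for field, minimum in criteria.items())]

# Cross-study cohort: depressed participants with at least one seizure.
high_risk = cohort(participants, depression_score=10, seizure_freq=1)
```

    The value of shared CDEs is visible even in this sketch: because both studies encode the same fields the same way, one filter spans all of them.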

  9. Selective access and editing in a database

    NASA Technical Reports Server (NTRS)

    Maluf, David A. (Inventor); Gawdiak, Yuri O. (Inventor)

    2010-01-01

    Method and system for providing selective access to different portions of a database by different subgroups of database users. Where N users are involved, up to 2^N − 1 distinguishable access subgroups in a group space can be formed, where no two access subgroups have the same members. Two or more members of a given access subgroup can edit, substantially simultaneously, a document accessible to each member.
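    The 2^N − 1 bound is just the count of non-empty subsets of N users, since no two subgroups may share the same membership. A direct enumeration confirms it for a small example:

```python
from itertools import combinations

# Every distinguishable access subgroup is a non-empty subset of the users.
users = ["alice", "bob", "carol"]
subgroups = [set(combo)
             for size in range(1, len(users) + 1)
             for combo in combinations(users, size)]

# For N users there are 2**N - 1 such subgroups (here N = 3 gives 7).
assert len(subgroups) == 2 ** len(users) - 1
```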

  10. HydroClim: a Continental-Scale Database of Contemporary and Future Streamflow and Stream Temperature Estimates for Aquatic Ecosystem Studies

    NASA Astrophysics Data System (ADS)

    Knouft, J.; Ficklin, D. L.; Bart, H. L.; Rios, N. E.

    2017-12-01

    Streamflow and water temperature are primary factors influencing the traits, distribution, and diversity of freshwater species. Ongoing changes in climate are causing directional alteration of these environmental conditions, which can impact local ecological processes. Accurate estimation of these variables is critical for predicting the responses of species to ongoing changes in freshwater habitat, yet ecologically relevant high-resolution data describing variation in streamflow and water temperature across North America are not available. Considering the vast amount of web-accessible freshwater biodiversity data, development and application of appropriate hydrologic data are critical to the advancement of our understanding of freshwater systems. To address this issue, we are developing the "HydroClim" database, which will provide web-accessible (www.hydroclim.org) historical and projected monthly streamflow and water temperature data for stream sections in all major watersheds across the United States and Canada from 1950-2099. These data will also be integrated with FishNet 2 (www.fishnet2.net), an online biodiversity database that provides open access to over 2 million localities of freshwater fish species in the United States and Canada, thus allowing for the characterization of the habitat requirements of freshwater species across this region. HydroClim should provide a vast array of opportunities for a greater understanding of water resources as well as information for the conservation of freshwater biodiversity in the United States and Canada in the coming century.

  11. The Coral Trait Database, a curated database of trait information for coral species from the global oceans

    NASA Astrophysics Data System (ADS)

    Madin, Joshua S.; Anderson, Kristen D.; Andreasen, Magnus Heide; Bridge, Tom C. L.; Cairns, Stephen D.; Connolly, Sean R.; Darling, Emily S.; Diaz, Marcela; Falster, Daniel S.; Franklin, Erik C.; Gates, Ruth D.; Hoogenboom, Mia O.; Huang, Danwei; Keith, Sally A.; Kosnik, Matthew A.; Kuo, Chao-Yang; Lough, Janice M.; Lovelock, Catherine E.; Luiz, Osmar; Martinelli, Julieta; Mizerek, Toni; Pandolfi, John M.; Pochon, Xavier; Pratchett, Morgan S.; Putnam, Hollie M.; Roberts, T. Edward; Stat, Michael; Wallace, Carden C.; Widman, Elizabeth; Baird, Andrew H.

    2016-03-01

    Trait-based approaches advance ecological and evolutionary research because traits provide a strong link to an organism’s function and fitness. Trait-based research might lead to a deeper understanding of the functions of, and services provided by, ecosystems, thereby improving management, which is vital in the current era of rapid environmental change. Coral reef scientists have long collected trait data for corals; however, these are difficult to access and often under-utilized in addressing large-scale questions. We present the Coral Trait Database initiative that aims to bring together physiological, morphological, ecological, phylogenetic and biogeographic trait information into a single repository. The database houses species- and individual-level data from published field and experimental studies alongside contextual data that provide important framing for analyses. In this data descriptor, we release data for 56 traits for 1547 species, and present a collaborative platform on which other trait data are being actively federated. Our overall goal is for the Coral Trait Database to become an open-source, community-led data clearinghouse that accelerates coral reef research.

  12. The Coral Trait Database, a curated database of trait information for coral species from the global oceans

    PubMed Central

    Madin, Joshua S.; Anderson, Kristen D.; Andreasen, Magnus Heide; Bridge, Tom C.L.; Cairns, Stephen D.; Connolly, Sean R.; Darling, Emily S.; Diaz, Marcela; Falster, Daniel S.; Franklin, Erik C.; Gates, Ruth D.; Hoogenboom, Mia O.; Huang, Danwei; Keith, Sally A.; Kosnik, Matthew A.; Kuo, Chao-Yang; Lough, Janice M.; Lovelock, Catherine E.; Luiz, Osmar; Martinelli, Julieta; Mizerek, Toni; Pandolfi, John M.; Pochon, Xavier; Pratchett, Morgan S.; Putnam, Hollie M.; Roberts, T. Edward; Stat, Michael; Wallace, Carden C.; Widman, Elizabeth; Baird, Andrew H.

    2016-01-01

    Trait-based approaches advance ecological and evolutionary research because traits provide a strong link to an organism’s function and fitness. Trait-based research might lead to a deeper understanding of the functions of, and services provided by, ecosystems, thereby improving management, which is vital in the current era of rapid environmental change. Coral reef scientists have long collected trait data for corals; however, these are difficult to access and often under-utilized in addressing large-scale questions. We present the Coral Trait Database initiative that aims to bring together physiological, morphological, ecological, phylogenetic and biogeographic trait information into a single repository. The database houses species- and individual-level data from published field and experimental studies alongside contextual data that provide important framing for analyses. In this data descriptor, we release data for 56 traits for 1547 species, and present a collaborative platform on which other trait data are being actively federated. Our overall goal is for the Coral Trait Database to become an open-source, community-led data clearinghouse that accelerates coral reef research. PMID:27023900

  13. The Coral Trait Database, a curated database of trait information for coral species from the global oceans.

    PubMed

    Madin, Joshua S; Anderson, Kristen D; Andreasen, Magnus Heide; Bridge, Tom C L; Cairns, Stephen D; Connolly, Sean R; Darling, Emily S; Diaz, Marcela; Falster, Daniel S; Franklin, Erik C; Gates, Ruth D; Harmer, Aaron; Hoogenboom, Mia O; Huang, Danwei; Keith, Sally A; Kosnik, Matthew A; Kuo, Chao-Yang; Lough, Janice M; Lovelock, Catherine E; Luiz, Osmar; Martinelli, Julieta; Mizerek, Toni; Pandolfi, John M; Pochon, Xavier; Pratchett, Morgan S; Putnam, Hollie M; Roberts, T Edward; Stat, Michael; Wallace, Carden C; Widman, Elizabeth; Baird, Andrew H

    2016-03-29

    Trait-based approaches advance ecological and evolutionary research because traits provide a strong link to an organism's function and fitness. Trait-based research might lead to a deeper understanding of the functions of, and services provided by, ecosystems, thereby improving management, which is vital in the current era of rapid environmental change. Coral reef scientists have long collected trait data for corals; however, these are difficult to access and often under-utilized in addressing large-scale questions. We present the Coral Trait Database initiative that aims to bring together physiological, morphological, ecological, phylogenetic and biogeographic trait information into a single repository. The database houses species- and individual-level data from published field and experimental studies alongside contextual data that provide important framing for analyses. In this data descriptor, we release data for 56 traits for 1547 species, and present a collaborative platform on which other trait data are being actively federated. Our overall goal is for the Coral Trait Database to become an open-source, community-led data clearinghouse that accelerates coral reef research.

  14. Catalogue of HI PArameters (CHIPA)

    NASA Astrophysics Data System (ADS)

    Saponara, J.; Benaglia, P.; Koribalski, B.; Andruchow, I.

    2015-08-01

    The Catalogue of HI PArameters of galaxies (CHIPA) is the natural continuation of the compilation by M.C. Martin in 1998. CHIPA provides the most important parameters of nearby galaxies derived from observations of the neutral hydrogen line. The catalogue contains information on 1400 galaxies across the sky and of different morphological types. Parameters such as the optical diameter of the galaxy, the blue magnitude, the distance, the morphological type and the HI extension are listed, among others. Maps of the HI distribution, velocity and velocity dispersion can also be displayed in some cases. The main objective of this catalogue is to facilitate bibliographic queries through a database accessible from the internet that will be available in 2015 (the website is under construction). The database was built using MySQL, an open-source relational database management system based on SQL (Structured Query Language), while the website was built with HTML (Hypertext Markup Language) and PHP (Hypertext Preprocessor).

  15. Teaching resources for dermatology on the WWW--quiz system and dynamic lecture scripts using a HTTP-database demon.

    PubMed Central

    Bittorf, A.; Diepgen, T. L.

    1996-01-01

    The World Wide Web (WWW) is becoming the major way of acquiring information in all scientific disciplines as well as in business. It is well suited to the fast distribution and exchange of up-to-date teaching resources. However, to date most teaching applications on the Web do not use its full power by integrating interactive components. We have set up a computer-based training (CBT) framework for Dermatology, which consists of dynamic lecture scripts, case reports, an atlas and a quiz system. All these components rely heavily on an underlying image database that permits the creation of dynamic documents. We used a demon process that keeps the database open and can be accessed via HTTP, to achieve better performance and avoid the overhead of starting CGI processes. The result of our evaluation was very encouraging. PMID:8947625

  16. Poisonings and clinical toxicology: a template for Ireland.

    PubMed

    Tormey, W P; Moore, T

    2013-03-01

    Poisons information is accessed around the clock in the British Isles from six centres, of which two are in Ireland, at Dublin and Belfast, accompanied by a consultant toxicologist advisory service. The number of calls in Ireland is down to about 40 per day owing to easy access to online databases. Access to Toxbase, the clinical toxicology database of the National Poisons Information Service, is available to National Health Service (NHS) health professionals and to Emergency Departments and Intensive Care Units in the Republic of Ireland. There are 59 Toxbase users in the Republic of Ireland and 99% of activity originates in Emergency Departments. All United States Poison Control Centres primarily use Poisindex, a commercial database from Thomson Reuters. Information on paracetamol, diazepam, analgesics and psychoactive compounds constitutes the commonest queries. Data from telephone and computer accesses provide an indicator of future trends in both licit and illicit drug poisonings, which may direct developments in laboratory analytical services. Data from the National Drug-Related Deaths Index are the most accurate information on toxicological deaths in Ireland. Laboratory toxicology requirements to support emergency departments are listed. Recommendations are made for a web-based open-access Toxbase or equivalent; for co-location of poisons information and laboratory clinical toxicology; for the establishment of a National Clinical Toxicology Institute for Ireland; for a list of accredited medical advisors in clinical toxicology; for multidisciplinary case conferences in complex toxicology cases for coroners; and for the establishment of a national clinical toxicology referral out-patient service in Ireland.

  17. Beware of the predatory science journal: A potential threat to the integrity of medical research.

    PubMed

    Johal, Jaspreet; Ward, Robert; Gielecki, Jerzy; Walocha, Jerzy; Natsis, Kostantinos; Tubbs, R Shane; Loukas, Marios

    2017-09-01

    The issue of predatory journals has become increasingly prevalent over the past decade as the open-access model of publishing has gained prominence. Although the open-access model is well intentioned, aiming to increase the accessibility of biomedical research, it is vulnerable to exploitation by those looking to corrupt medical academia and circumvent ethics and research standards. Predatory journals achieve publication either by soliciting unsuspecting researchers who have legitimate research but fall victim to these predators, or by attracting researchers looking to publish quickly without a thorough review process. Features of predatory journals include a quick non-peer-review process, falsely listing or exaggerating the credibility of editorial board members, and the lack or falsification of institutional affiliations and database listings. These predatory journals are a serious threat to the integrity of medical research, as they infect the available literature with unsubstantiated articles and allow low-quality research. A number of steps can be taken to prevent the spread of predatory publishers and to increase awareness of them, and these must be taken to maintain the integrity of medical academia. Clin. Anat. 30:767-773, 2017. © 2017 Wiley Periodicals, Inc.

  18. Web Monitoring of EOS Front-End Ground Operations, Science Downlinks and Level 0 Processing

    NASA Technical Reports Server (NTRS)

    Cordier, Guy R.; Wilkinson, Chris; McLemore, Bruce

    2008-01-01

    This paper addresses the efforts undertaken and the technology deployed to aggregate and distribute the metadata characterizing the real-time operations associated with NASA Earth Observing Systems (EOS) high-rate front-end systems and the science data collected at multiple ground stations and forwarded to the Goddard Space Flight Center for level 0 processing. Station operators, mission project management personnel, spacecraft flight operations personnel and data end-users for various EOS missions can retrieve the information at any time from any location having access to the internet. The users are distributed and the EOS systems are distributed but the centralized metadata accessed via an external web server provide an effective global and detailed view of the enterprise-wide events as they are happening. The data-driven architecture and the implementation of applied middleware technology, open source database, open source monitoring tools, and external web server converge nicely to fulfill the various needs of the enterprise. The timeliness and content of the information provided are key to making timely and correct decisions which reduce project risk and enhance overall customer satisfaction. The authors discuss security measures employed to limit access of data to authorized users only.

  19. A Multi-Purpose Data Dissemination Infrastructure for the Marine-Earth Observations

    NASA Astrophysics Data System (ADS)

    Hanafusa, Y.; Saito, H.; Kayo, M.; Suzuki, H.

    2015-12-01

    To open the data from a variety of observations, the Japan Agency for Marine-Earth Science and Technology (JAMSTEC) has developed a multi-purpose data dissemination infrastructure. Although many observations have been made in the earth sciences, not all the data are fully open. We think data centers may provide researchers with a universal data dissemination service that can handle various kinds of observation data with little effort. For this purpose the JAMSTEC Data Management Office has developed the "Information Catalog Infrastructure System (Catalog System)". This is a kind of catalog management system that can create, renew and delete catalogs (i.e. databases) and has the following features: - The Catalog System does not depend on data types or the granularity of data records. - By registering a new metadata schema with the system, a new database can be created on the same system without system modification. - As web pages are defined by cascading style sheets, databases can have different look and feel, and operability. - The Catalog System provides databases with basic search tools: search by text, selection from a category tree, and selection from a timeline chart. - For domestic users it creates the Japanese and English pages at the same time and has a dictionary to control terminology and proper nouns. As of August 2015 JAMSTEC operates 7 databases on the Catalog System. We expect to transfer existing databases to this system, or to create new databases on it. In comparison with a dedicated database developed for a specific dataset, the Catalog System is suitable for the dissemination of small datasets at minimum cost. Metadata held in the catalogs may be transferred to other metadata schemas for exchange with global databases or portals.
    Examples: JAMSTEC Data Catalog: http://www.godac.jamstec.go.jp/catalog/data_catalog/metadataList?lang=en ; JAMSTEC Document Catalog: http://www.godac.jamstec.go.jp/catalog/doc_catalog/metadataList?lang=en&tab=category ; Research Information and Data Access Site of TEAMS: http://www.i-teams.jp/catalog/rias/metadataList?lang=en&tab=list

  20. Progress on the Fabric for Frontier Experiments Project at Fermilab

    NASA Astrophysics Data System (ADS)

    Box, Dennis; Boyd, Joseph; Dykstra, Dave; Garzoglio, Gabriele; Herner, Kenneth; Kirby, Michael; Kreymer, Arthur; Levshina, Tanya; Mhashilkar, Parag; Sharma, Neha

    2015-12-01

    The FabrIc for Frontier Experiments (FIFE) project is an ambitious, major-impact initiative within the Fermilab Scientific Computing Division designed to lead the computing model for Fermilab experiments. FIFE is a collaborative effort between experimenters and computing professionals to design and develop integrated computing models for experiments of varying needs and infrastructure. The major focus of the FIFE project is the development, deployment, and integration of Open Science Grid solutions for high throughput computing, data management, database access and collaboration within experiments. To accomplish this goal, FIFE has developed workflows that utilize Open Science Grid sites along with dedicated and commercial cloud resources. The FIFE project has made significant progress integrating several services into experiment computing operations, including new job submission services, software and reference data distribution through CVMFS repositories, a flexible data transfer client, and access to opportunistic resources on the Open Science Grid. The progress with current experiments and plans for expansion to additional projects will be discussed. FIFE has taken a leading role in the definition of the computing model for Fermilab experiments, aided in the design of computing for experiments beyond Fermilab, and will continue to define the future direction of high throughput computing for future physics experiments worldwide.

  1. Challenges in Database Design with Microsoft Access

    ERIC Educational Resources Information Center

    Letkowski, Jerzy

    2014-01-01

    Design, development and exploration of databases are popular topics covered in introductory courses taught at business schools. Microsoft Access is the most popular software used in those courses. Despite the considerable complexity of Access, it is considered one of the friendliest database programs for beginners. A typical Access textbook…

  2. An open source web interface for linking models to infrastructure system databases

    NASA Astrophysics Data System (ADS)

    Knox, S.; Mohamed, K.; Harou, J. J.; Rheinheimer, D. E.; Medellin-Azuara, J.; Meier, P.; Tilmant, A.; Rosenberg, D. E.

    2016-12-01

    Models of networked engineered resource systems such as water or energy systems are often built collaboratively by developers from different domains working at different locations. These models can be linked to large-scale real-world databases, and they are constantly being improved and extended. As the development and application of these models becomes more sophisticated, and the computing power required for simulations and/or optimisations increases, so too does the need for online services and tools that enable the efficient development and deployment of these models. Hydra Platform is an open-source, web-based data management system that allows modellers of network-based models to remotely store network topology and associated data in a generalised manner, allowing it to serve multiple disciplines. Hydra Platform exposes a JSON web API through which external programs (referred to as 'Apps') interact with its stored networks and perform actions such as importing data, running models, or exporting the networks to different formats. Hydra Platform supports multiple users accessing the same network and has a suite of functions for managing users and data. We present ongoing development in Hydra Platform: the Hydra Web User Interface, through which users can collaboratively manage network data and models in a web browser. The web interface allows multiple users to graphically access, edit and share their networks, run apps and view results. Through apps, which are located on the server, the web interface can give users access to external data sources and models without the need to install or configure any software. This also ensures model results can be reproduced by removing platform or version dependence. Managing data and deploying models via the web interface provides a way for multiple modellers to collaboratively manage data, deploy and monitor model runs, and analyse results.
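    The style of 'App' interaction described above, external programs exchanging JSON with the platform, can be sketched roughly as follows. This is a minimal illustration only: the action name, parameter names, and request shape are assumptions, not the actual Hydra Platform API.

```python
import json

def build_app_request(action, **params):
    # Encode a hypothetical App call as a JSON body; the action and
    # parameter names below are illustrative stand-ins, not the real
    # Hydra Platform API.
    return json.dumps({action: params})

# An App asking the server for a stored network (hypothetical call):
body = build_app_request("get_network", network_id=42, include_data=True)
print(body)
```

An App would POST such a body to the platform's endpoint and parse the JSON response; the point is only that a language-neutral JSON API lets importers, exporters, and model runners interoperate with the stored networks.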

  3. Ionosphere Waves Service (IWS) - a problem-oriented tool in ionosphere and Space Weather research produced by POPDAT project

    NASA Astrophysics Data System (ADS)

    Ferencz, Csaba; Lizunov, Georgii; Crespon, François; Price, Ivan; Bankov, Ludmil; Przepiórka, Dorota; Brieß, Klaus; Dudkin, Denis; Girenko, Andrey; Korepanov, Valery; Kuzmych, Andrii; Skorokhod, Tetiana; Marinov, Pencho; Piankova, Olena; Rothkaehl, Hanna; Shtus, Tetyana; Steinbach, Péter; Lichtenberger, János; Sterenharz, Arnold; Vassileva, Any

    2014-05-01

    In the frame of the FP7 POPDAT project, the Ionosphere Waves Service (IWS) has been developed and opened for public access by ionosphere experts. IWS forms a database derived from archived ionospheric wave records to assist ionosphere and Space Weather research and to answer the following questions: How can the data of earlier ionospheric missions be reprocessed with current algorithms to gain more profitable results? How could the scientific community be provided with new insight into wave processes that take place in the ionosphere? The answer is a specific and unique data-mining service accessing a collection of topical catalogs that characterize a huge number of recorded occurrences of Whistler-like Electromagnetic Wave Phenomena, Atmosphere Gravity Waves, and Traveling Ionosphere Disturbances. The IWS online service (http://popdat.cbk.waw.pl) lets end users query a chosen set of predefined wave phenomena and their detailed characteristics, which were collected by target-specific event detection algorithms from selected satellite records during the database build-up phase. The results of this wave processing thus provide useful information for statistical or comparative investigations of wave types, listed in a detailed catalog of ionospheric wave phenomena. The IWS provides wave event characteristics extracted by specific software systems from data records of the selected satellite missions. End users can access targets through specific searches and use the statistical modules within the service in their field of interest. The IWS therefore opens a new way in ionosphere and Space Weather research. The scientific applications covered by IWS extend beyond Space Weather to other fields such as earthquake precursors, ionosphere climatology, geomagnetic storms, troposphere-ionosphere energy transfer, and trans-ionosphere link perturbations.

  4. Introducing the PRIDE Archive RESTful web services.

    PubMed

    Reisinger, Florian; del-Toro, Noemi; Ternent, Tobias; Hermjakob, Henning; Vizcaíno, Juan Antonio

    2015-07-01

    The PRIDE (PRoteomics IDEntifications) database is one of the world-leading public repositories of mass spectrometry (MS)-based proteomics data and is a founding member of the ProteomeXchange Consortium of proteomics resources. In the original PRIDE database system, users could access data programmatically through the web services provided by the PRIDE BioMart interface. New REST (REpresentational State Transfer) web services have been developed to serve the most popular functionality provided by BioMart (now discontinued due to data scalability issues) and to address the data access requirements of the newly developed PRIDE Archive. Using the API (Application Programming Interface), it is now possible to programmatically query for and retrieve peptide and protein identifications, project and assay metadata, and the originally submitted files. Searching and filtering are also possible by metadata information, such as sample details (e.g. species and tissues), instrumentation (mass spectrometer), keywords and other provided annotations. The PRIDE Archive web services were first made available in April 2014. The API has already been adopted by a few applications and standalone tools such as PeptideShaker, PRIDE Inspector, the Unipept web application and the Python-based BioServices package. This application is free and open to all users with no login requirement and can be accessed at http://www.ebi.ac.uk/pride/ws/archive/. © The Author(s) 2015. Published by Oxford University Press on behalf of Nucleic Acids Research.
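    A programmatic query against a REST service of this kind amounts to building a URL with query parameters and issuing an HTTP GET. The sketch below uses the base URL given in the abstract, but the `/project/list` path and the parameter names are assumptions for illustration; consult the live API documentation for the exact resource names and filters.

```python
from urllib.parse import urlencode

BASE = "http://www.ebi.ac.uk/pride/ws/archive"  # base URL from the abstract

def project_search_url(query, species=None, page_size=10):
    # The '/project/list' path and the 'query'/'show'/'speciesFilter'
    # parameter names are illustrative assumptions, not guaranteed to
    # match the deployed service.
    params = {"query": query, "show": page_size}
    if species is not None:
        params["speciesFilter"] = species
    return BASE + "/project/list?" + urlencode(params)

url = project_search_url("phosphoproteome", species="9606")
print(url)
```

Fetching `url` with any HTTP client would then return JSON metadata that can be paged through and filtered client-side.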

  5. Isfahan MISP Dataset.

    PubMed

    Kashefpur, Masoud; Kafieh, Rahele; Jorjandi, Sahar; Golmohammadi, Hadis; Khodabande, Zahra; Abbasi, Mohammadreza; Teifuri, Nilufar; Fakharzadeh, Ali Akbar; Kashefpoor, Maryam; Rabbani, Hossein

    2017-01-01

    An online repository was introduced to share clinical ground truth with the public and provide open access for researchers to evaluate their computer-aided algorithms. PHP was used for web programming and MySQL for database management. The website was entitled "biosigdata.com." It is a fast, secure, and easy-to-use online database for medical signals and images. Freely registered users can download the datasets and can also share their own supplementary materials while retaining their rights (citation and fees). Commenting is available for all datasets, and an automatic sitemap and semi-automatic SEO indexing have been set up for the site. A comprehensive list of available websites for medical datasets is also presented as a supplementary file (http://journalonweb.com/tempaccess/4800.584.JMSS_55_16I3253.pdf).

  6. InChI in the wild: an assessment of InChIKey searching in Google

    PubMed Central

    2013-01-01

    While chemical databases can be queried using the InChI string and InChIKey (IK), the latter was designed for open-web searching. It is becoming increasingly effective for this as more sources enhance crawling of their websites by the Googlebot and the consequent IK indexing. Searchers who use Google as an adjunct to database access may be less familiar with the advantages of using the IK, as explored in this review. As an example, the IK for atorvastatin retrieves ~200 low-redundancy links from a Google search in 0.3 seconds. These include most major databases, with a very low false-positive rate. Results encompass less familiar but potentially useful sources and can be extended to isomer capture by using just the skeleton layer of the IK. Google Advanced Search can be used to filter large result sets. Image searching with the IK is also effective and complementary to open-web queries. Results can be particularly useful for less-common structures, as exemplified by a major metabolite of atorvastatin giving only three hits. Testing also demonstrated document-to-document and document-to-database joins via structure matching. The necessary generation of an IK from chemical names can be accomplished using open tools and resources for patents, papers, abstracts or other text sources. Active global sharing of local IK-linked information can be accomplished via surfacing in open laboratory notebooks, blogs, Twitter, figshare and other routes. While information-rich chemistry (e.g. approved drugs) can exhibit swamping and redundancy effects, the much smaller IK result sets for link-poor structures become a transformative first-pass option. IK indexing has therefore turned Google into a de facto open global chemical information hub by merging links to most significant sources, including over 50 million PubChem and ChemSpider records. The simplicity, specificity and speed of matching make it a useful option for biologists or others less familiar with chemical searching. However, compared with rigorously maintained major databases, users need to be circumspect about the consistency of Google results and the provenance of retrieved links. In addition, community engagement may be necessary to ameliorate possible future degradation of utility. PMID:23399051
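    The "skeleton layer" search mentioned above relies on the fixed three-block structure of an InChIKey: the 14-character first block hashes the molecular connectivity, so querying on it alone widens a search to stereoisomers and isotopologues of the same skeleton. A minimal sketch (the atorvastatin IK string used as the example is an assumption, not taken from the text):

```python
def skeleton_block(inchikey):
    # An InChIKey has three hyphen-separated blocks; the 14-letter
    # first block encodes the connectivity (skeleton) layer.
    first = inchikey.split("-")[0]
    if len(first) != 14 or not first.isalpha():
        raise ValueError("not a well-formed InChIKey")
    return first

# Commonly cited IK for atorvastatin (assumed example):
print(skeleton_block("XUKUURHRXDUEBC-KAYWLYCHSA-N"))  # XUKUURHRXDUEBC
```

Pasting just that first block into Google then performs the isomer-capture search the review describes.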

  7. Water-Related Power Plant Curtailments: An Overview of Incidents and Contributing Factors

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    McCall, James; Macknick, Jordan

    Water temperatures and water availability can affect the reliable operation of power plants in the United States. Data on water-related impacts on the energy sector are not consolidated and are reported by multiple agencies. This study provides an overview of historical incidents in which water resources have affected power plant operations, discusses the various data sources providing information, and creates a publicly available, open access database that contains consolidated information about water-related power plant curtailment and shut-down incidents. Power plants can be affected by water resources if incoming water temperatures are too high, water discharge temperatures are too high, or if there is not enough water available to operate. Changes in climate have the potential to exacerbate uncertainty over water resource availability and temperature. Power plant impacts from water resources include curtailment of generation, plant shut-downs, and requests for regulatory variances. In addition, many power plants have developed adaptation approaches to reduce the potential risks of water-related issues by investing in new technologies or by developing and implementing plans to follow during droughts or heatwaves. This study identifies 42 incidents of water-related power plant issues from 2000-2015, drawing from a variety of datasets. These incidents occurred throughout the U.S. and affected coal and nuclear plants that use once-through, recirculating, and pond cooling systems. Water temperature violations reported to the Environmental Protection Agency are also considered, with 35 temperature violations noted from 2012-2015. Beyond providing background information on incidents, this effort has also created an open access database on the Open Energy Information platform that contains information about water-related power plant issues and that can be updated by users.

  8. Astronomical Publishing: Yesterday, Today and Tomorrow

    NASA Astrophysics Data System (ADS)

    Huchra, John

    In just the last few years, scientific publishing has moved rapidly away from the modes that served it well for over two centuries. As "digital natives" take over the field and rapid, open access comes to dominate the way we communicate, both scholarly journals and libraries need to adopt new business models to serve their communities. This is best done by identifying new "added value" such as databases, full-text searching, and full cross-indexing, while at the same time retaining the high quality of peer-reviewed publication.

  9. Abstracting data warehousing issues in scientific research.

    PubMed

    Tews, Cody; Bracio, Boris R

    2002-01-01

    This paper presents the design and implementation of the Idaho Biomedical Data Management System (IBDMS). The system preprocesses biomedical data from the IMPROVE (Improving Control of Patient Status in Critical Care) library via an Open Database Connectivity (ODBC) connection. The ODBC connection allows local and remote simulations to access filtered, joined, and sorted data using the Structured Query Language (SQL). The tool is capable of providing an overview of available data in addition to user-defined data subsets for verification of models of the human respiratory system.
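    The filtered, joined, and sorted access pattern described above can be sketched with Python's built-in sqlite3 module standing in for an ODBC data source. The table and column names below are invented for illustration and are not those of the IMPROVE library; the same SQL would simply be sent through an ODBC connection instead.

```python
import sqlite3

# In-memory stand-in for an ODBC data source; schema is hypothetical.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE patients (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE samples (patient_id INTEGER, t REAL, spo2 REAL);
    INSERT INTO patients VALUES (1, 'A'), (2, 'B');
    INSERT INTO samples VALUES (1, 0.0, 97.0), (1, 1.0, 95.5), (2, 0.0, 99.1);
""")

# A filtered, joined, and sorted subset, exactly the kind of query
# the abstract says simulations issue over ODBC.
rows = con.execute("""
    SELECT p.name, s.t, s.spo2
    FROM samples AS s JOIN patients AS p ON p.id = s.patient_id
    WHERE s.spo2 < 98.0
    ORDER BY p.name, s.t
""").fetchall()
print(rows)  # [('A', 0.0, 97.0), ('A', 1.0, 95.5)]
```

Because the filtering and joining happen inside the database engine, a simulation only receives the subset it asked for rather than the whole library.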

  10. Editorial policies and background in editing Macedonian Medical Review and BANTAO journal.

    PubMed

    Spasovski, Goce

    2014-01-01

    Even in as small a country as R. Macedonia, with limited resources allocated for science, there are many journals trying to establish good editorial practices and policies in publishing scientific work. Among the currently existing medical journals, Macedonian Medical Review (MMR), ISSN 0025-1097, deserves attention as the oldest journal in continuous publication since its first appearance as the journal of the Macedonian Medical Association (MMA). Since its first issue, published in 1946, some 4500 peer-reviewed papers have appeared in more than 210 issues and some 80 supplements from various congresses and meetings. In this regard, great respect should be paid not only to the editorial boards but also to the collaborators who have contributed to its successful continuity over all these years. In line with the need for further development of the journal and for access to world databases, the Editorial Board of MMR has made every effort to improve and modernize its work as well as the technical quality of the journal. Hence, MMA has signed a contract with De Gruyter Open, a leading publisher of Open Access academic content, for further improvement and promotion of the journal and to facilitate the Medline application. BANTAO Journal is published on behalf of the Balkan Cities Association of Nephrology, Dialysis, Transplantation and Artificial Organs (BANTAO), ISSN 1312-2517. The first issue was published in 2003, ten years after BANTAO was born, and its appearance was an extremely important event in the existence of BANTAO. The first official editor of the journal was Dimitar Nenov, Varna (2003-2005), followed by Ali Basci (Izmir, Turkey) and, since 2009, Goce Spasovski (Skopje, Macedonia) as editor-in-chief. Over the years, the journal has been included in the EBSCO, DOAJ and SCOPUS/SCIMAGO databases. The journal is published biannually. To date, 345 papers have been published over 11 years, in 21 regular issues and 3 supplements. It may be said that the journal is the "glue" between the nephrologists of the Balkan cities, reflecting the high-quality research and scientific potential of Balkan nephrologists. The entire process of submitting and reviewing manuscripts is handled electronically, and after acceptance articles are freely available (open access journal) on the website of the association and the journal: www.bantao.org. In this regard, the current President of BANTAO has also signed a contract with De Gruyter Open, a leading publisher of Open Access academic content, for further improvement and promotion of the journal and its Medline application.

  11. Sci-Hub provides access to nearly all scholarly literature

    PubMed Central

    Romero, Ariel Rodriguez; Levernier, Jacob G; Munro, Thomas Anthony; McLaughlin, Stephen Reid; Greshake Tzovaras, Bastian

    2018-01-01

    The website Sci-Hub enables users to download PDF versions of scholarly articles, including many articles that are paywalled at their journal’s site. Sci-Hub has grown rapidly since its creation in 2011, but the extent of its coverage has been unclear. Here we report that, as of March 2017, Sci-Hub’s database contains 68.9% of the 81.6 million scholarly articles registered with Crossref and 85.1% of articles published in toll access journals. We find that coverage varies by discipline and publisher, and that Sci-Hub preferentially covers popular, paywalled content. For toll access articles, we find that Sci-Hub provides greater coverage than the University of Pennsylvania, a major research university in the United States. Green open access to toll access articles via licit services, on the other hand, remains quite limited. Our interactive browser at https://greenelab.github.io/scihub allows users to explore these findings in more detail. For the first time, nearly all scholarly literature is available gratis to anyone with an Internet connection, suggesting the toll access business model may become unsustainable. PMID:29424689

  12. Using SMILES strings for the description of chemical connectivity in the Crystallography Open Database.

    PubMed

    Quirós, Miguel; Gražulis, Saulius; Girdzijauskaitė, Saulė; Merkys, Andrius; Vaitkus, Antanas

    2018-05-18

    Computer descriptions of chemical molecular connectivity are necessary for searching chemical databases and for predicting chemical properties from molecular structure. In this article, the ongoing work to describe the chemical connectivity of entries contained in the Crystallography Open Database (COD) in SMILES format is reported. This collection of SMILES is publicly available for chemical (substructure) searching or for any other purpose on an open-access basis, as is the COD itself. The conventions that have been followed for the representation of compounds that do not fit into valence bond theory are outlined for the most frequently encountered cases. The procedure for deriving SMILES from the CIF files starts by checking whether the atoms in the asymmetric unit are a chemically acceptable image of the compound. When they are not (molecule on a symmetry element, disorder, polymeric species, etc.), the previously published cif_molecule program is used to obtain such an image in many cases. The Open Babel program package is then applied to generate SMILES strings from the CIF files (either those taken directly from the COD or those produced by cif_molecule, when applicable). The results are then checked and/or fixed by a human editor, in a computer-aided task that at present still consumes a great deal of human time. Even if the procedure still needs to be improved to make it more automatic (and hence faster), it has already yielded more than 160,000 curated chemical structures, and the purpose of this article is to announce this work to the chemical community and to encourage use of its results.

  13. The database on transgenic luminescent microorganisms as an instrument of studying a microbial component of closed ecosystems

    NASA Astrophysics Data System (ADS)

    Boyandin, A. N.; Lankin, Y. P.; Kargatova, T. V.; Popova, L. Y.; Pechurkin, N. S.

    Luminescent transgenic microorganisms are widely used to study the functioning of microbial communities, including closed ones. Bioluminescence is highly sensitive to the effects of different environmental factors, and integrating lux genes into different metabolic pathways allows many aspects of microbial life to be studied, permitting measurements to be carried out in situ. Much information is available on applications of bioluminescent bacteria in different research areas, but effective use of these data requires that they be summarized and accumulated in a common source. An information system on the characteristics of transgenic microorganisms with cloned lux genes was therefore created, together with the database and associated client software. The database structure includes information on the common characteristics of cloned lux genes, their sources and properties; on the regulation of gene expression in bacterial cells; and on the dependence of bioluminescence on biotic, abiotic and anthropogenic environmental factors. The database can also store descriptions of changes in bacterial populations in response to environmental changes. It allows bibliographic information to be stored and used, along with links to the websites of world collections of microorganisms. Internet publishing software providing access to the database through the Internet has been developed.

  14. Arkheia: Data Management and Communication for Open Computational Neuroscience

    PubMed Central

    Antolík, Ján; Davison, Andrew P.

    2018-01-01

    Two trends have been unfolding in computational neuroscience during the last decade. First, a shift of focus to increasingly complex and heterogeneous neural network models, with a concomitant increase in the level of collaboration within the field (whether direct or in the form of building on top of existing tools and results). Second, a general trend in science toward more open communication, both internally, with other potential scientific collaborators, and externally, with the wider public. This multi-faceted development toward more integrative approaches and more intense communication within and outside of the field poses major new challenges for modelers, as currently there is a severe lack of tools to help with automatic communication and sharing of all aspects of a simulation workflow to the rest of the community. To address this important gap in the current computational modeling software infrastructure, here we introduce Arkheia. Arkheia is a web-based open science platform for computational models in systems neuroscience. It provides an automatic, interactive, graphical presentation of simulation results, experimental protocols, and interactive exploration of parameter searches, in a web browser-based application. Arkheia is focused on automatic presentation of these resources with minimal manual input from users. Arkheia is written in a modular fashion with a focus on future development of the platform. The platform is designed in an open manner, with a clearly defined and separated API for database access, so that any project can write its own backend translating its data into the Arkheia database format. Arkheia is not a centralized platform, but allows any user (or group of users) to set up their own repository, either for public access by the general population, or locally for internal use. 
Overall, Arkheia provides users with an automatic means to communicate information about not only their models but also individual simulation results and the entire experimental context in an approachable graphical manner, thus facilitating the user's ability to collaborate in the field and outreach to a wider audience. PMID:29556187

  16. Comprehensive Routing Security Development and Deployment for the Internet

    DTIC Science & Technology

    2015-02-01

    RPSTIR depends on several other open source packages:
    • MySQL: a widely used and popular open source database package, chosen for the local RPKI database cache.
    • OpenSSL: used for the cryptographic libraries for X.509 certificates.
    • ODBC MySQL Connector: ODBC (Open Database Connectivity) is a standard programming interface (API) for database access.

  17. The IAGOS information system

    NASA Astrophysics Data System (ADS)

    Boulanger, Damien; Gautron, Benoit; Schultz, Martin; Brötz, Björn; Rauthe-Schöch, Armin; Thouret, Valérie

    2015-04-01

    IAGOS (In-service Aircraft for a Global Observing System) aims to provide long-term, frequent, regular, accurate, and spatially resolved in situ observations of atmospheric composition. IAGOS observation systems are deployed on a fleet of commercial aircraft. The IAGOS database is an essential part of the global atmospheric monitoring network. Data access follows an open access policy based on the submission of research requests, which are reviewed by the PIs. The IAGOS database (http://www.iagos.fr, damien.boulanger@obs-mip.fr) is part of the French atmospheric chemistry data centre Ether (CNES and CNRS). In the framework of the IGAS project (IAGOS for Copernicus Atmospheric Service), interoperability with international portals and other databases is being implemented in order to improve IAGOS data discovery. The IGAS data network is composed of three data centres: the IAGOS database in Toulouse, including IAGOS-core data and, since January 2015, IAGOS-CARIBIC (Civil Aircraft for the Regular Investigation of the Atmosphere Based on an Instrument Container) data; the HALO research aircraft database at DLR (https://halo-db.pa.op.dlr.de); and the MACC data centre in Jülich (http://join.iek.fz-juelich.de). The MACC (Monitoring Atmospheric Composition and Climate) project is a prominent user of the IGAS data network. In June 2015 a new version of the IAGOS database will be released, providing improved services such as download in NetCDF or NASA Ames formats, graphical tools (maps, scatter plots, etc.), standardized metadata (ISO 19115) and improved user management. The link with the MACC data centre, through JOIN (Jülich OWS Interface), will make it possible to combine model outputs with IAGOS data for intercomparison. The interoperability within the IGAS data network, implemented through many web services, will improve the functionality of the web interfaces of each data centre.

  18. A scalable database model for multiparametric time series: a volcano observatory case study

    NASA Astrophysics Data System (ADS)

    Montalto, Placido; Aliotta, Marco; Cassisi, Carmelo; Prestifilippo, Michele; Cannata, Andrea

    2014-05-01

    The variables collected by a sensor network constitute a heterogeneous data source that needs to be properly organized in order to be used in research and geophysical monitoring. By the term time series we refer to a set of observations of a given phenomenon acquired sequentially in time; when the time intervals are equally spaced, one speaks of the period or sampling frequency. Our work describes in detail a possible methodology for the storage and management of time series using a specific data structure. We designed a framework, hereinafter called TSDSystem (Time Series Database System), to acquire time series from different data sources and standardize them within a relational database. This standardization makes it possible to perform operations, such as querying and visualization, on many measures, synchronizing them on a common time scale. The proposed architecture follows a multiple-layer paradigm (Loaders layer, Database layer and Business Logic layer). Each layer is specialized in performing particular operations for the reorganization and archiving of data from different sources such as ASCII, Excel, ODBC (Open DataBase Connectivity), and files accessible from the Internet (web pages, XML). In particular, the Loaders layer performs a security check of the working status of each running software component through a heartbeat system, in order to automate the discovery of acquisition issues and other warning conditions. Although our system has to manage huge amounts of data, performance is guaranteed by a smart table-partitioning strategy that keeps the percentage of data stored in each database table balanced. TSDSystem also contains modules for the visualization of acquired data, which make it possible to query different time series over a specified time range, or to follow real-time signal acquisition, according to a per-user data access policy.
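    The heartbeat idea described above, a checker that flags loaders whose last beat is older than some threshold, can be sketched as follows. All names and the five-minute threshold are illustrative assumptions, not details of TSDSystem itself.

```python
from datetime import datetime, timedelta

def stale_loaders(last_beats, now, max_age=timedelta(minutes=5)):
    # Return the (sorted) names of loaders whose most recent heartbeat
    # is older than max_age; these are candidates for an acquisition
    # warning. The dict maps loader name -> time of last beat.
    return sorted(name for name, t in last_beats.items() if now - t > max_age)

now = datetime(2014, 5, 1, 12, 0, 0)
beats = {
    "seismic": now - timedelta(minutes=1),     # healthy
    "infrasound": now - timedelta(minutes=30), # silent for 30 min
}
print(stale_loaders(beats, now))  # ['infrasound']
```

In a deployed system each loader would periodically update its own entry (e.g. a row in the database), and the checker would run on a schedule and raise the warning conditions mentioned in the abstract.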

  19. A multidisciplinary database for geophysical time series management

    NASA Astrophysics Data System (ADS)

    Montalto, P.; Aliotta, M.; Cassisi, C.; Prestifilippo, M.; Cannata, A.

    2013-12-01

    The variables collected by a sensor network constitute a heterogeneous data source that needs to be properly organized in order to be used in research and geophysical monitoring. By the term time series we refer to a set of observations of a given phenomenon acquired sequentially in time; when the time intervals are equally spaced, one speaks of the period or sampling frequency. Our work describes in detail a possible methodology for the storage and management of time series using a specific data structure. We designed a framework, hereinafter called TSDSystem (Time Series Database System), to acquire time series from different data sources and standardize them within a relational database. This standardization makes it possible to perform operations, such as querying and visualization, on many measures, synchronizing them on a common time scale. The proposed architecture follows a multiple-layer paradigm (Loaders layer, Database layer and Business Logic layer). Each layer is specialized in performing particular operations for the reorganization and archiving of data from different sources such as ASCII, Excel, ODBC (Open DataBase Connectivity), and files accessible from the Internet (web pages, XML). In particular, the Loaders layer performs a security check of the working status of each running software component through a heartbeat system, in order to automate the discovery of acquisition issues and other warning conditions. Although our system has to manage huge amounts of data, performance is guaranteed by a smart table-partitioning strategy that keeps the percentage of data stored in each database table balanced. TSDSystem also contains modules for the visualization of acquired data, which make it possible to query different time series over a specified time range, or to follow real-time signal acquisition, according to a per-user data access policy.

  20. SEED Servers: High-Performance Access to the SEED Genomes, Annotations, and Metabolic Models

    PubMed Central

    Aziz, Ramy K.; Devoid, Scott; Disz, Terrence; Edwards, Robert A.; Henry, Christopher S.; Olsen, Gary J.; Olson, Robert; Overbeek, Ross; Parrello, Bruce; Pusch, Gordon D.; Stevens, Rick L.; Vonstein, Veronika; Xia, Fangfang

    2012-01-01

    The remarkable advance in sequencing technology and the rising interest in medical and environmental microbiology, biotechnology, and synthetic biology resulted in a deluge of published microbial genomes. Yet, genome annotation, comparison, and modeling remain a major bottleneck to the translation of sequence information into biological knowledge, hence computational analysis tools are continuously being developed for rapid genome annotation and interpretation. Among the earliest, most comprehensive resources for prokaryotic genome analysis, the SEED project, initiated in 2003 as an integration of genomic data and analysis tools, now contains >5,000 complete genomes, a constantly updated set of curated annotations embodied in a large and growing collection of encoded subsystems, a derived set of protein families, and hundreds of genome-scale metabolic models. Until recently, however, maintaining current copies of the SEED code and data at remote locations has been a pressing issue. To allow high-performance remote access to the SEED database, we developed the SEED Servers (http://www.theseed.org/servers): four network-based servers intended to expose the data in the underlying relational database, support basic annotation services, offer programmatic access to the capabilities of the RAST annotation server, and provide access to a growing collection of metabolic models that support flux balance analysis. The SEED servers offer open access to regularly updated data, the ability to annotate prokaryotic genomes, the ability to create metabolic reconstructions and detailed models of metabolism, and access to hundreds of existing metabolic models. This work offers and supports a framework upon which other groups can build independent research efforts. 
Large integrations of genomic data represent one of the major intellectual resources driving research in biology, and programmatic access to the SEED data will provide significant utility to a broad collection of potential users. PMID:23110173

  1. 50 CFR 660.332 - Open access daily trip limit (DTL) fishery for sablefish.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 50 Wildlife and Fisheries 9 2010-10-01 2010-10-01 false Open access daily trip limit (DTL) fishery... COAST STATES West Coast Groundfish-Open Access Fisheries § 660.332 Open access daily trip limit (DTL) fishery for sablefish. (a) Open access DTL fisheries both north and south of 36° N. lat. Open access...

  2. UKPMC: a full text article resource for the life sciences.

    PubMed

    McEntyre, Johanna R; Ananiadou, Sophia; Andrews, Stephen; Black, William J; Boulderstone, Richard; Buttery, Paula; Chaplin, David; Chevuru, Sandeepreddy; Cobley, Norman; Coleman, Lee-Ann; Davey, Paul; Gupta, Bharti; Haji-Gholam, Lesley; Hawkins, Craig; Horne, Alan; Hubbard, Simon J; Kim, Jee-Hyub; Lewin, Ian; Lyte, Vic; MacIntyre, Ross; Mansoor, Sami; Mason, Linda; McNaught, John; Newbold, Elizabeth; Nobata, Chikashi; Ong, Ernest; Pillai, Sharmila; Rebholz-Schuhmann, Dietrich; Rosie, Heather; Rowbotham, Rob; Rupp, C J; Stoehr, Peter; Vaughan, Philip

    2011-01-01

    UK PubMed Central (UKPMC) is a full-text article database that extends the functionality of the original PubMed Central (PMC) repository. The UKPMC project was launched as the first 'mirror' site to PMC, which, in analogy to the International Nucleotide Sequence Database Collaboration, aims to provide international preservation of the open and free-access biomedical literature. UKPMC (http://ukpmc.ac.uk) has undergone considerable development since its inception in 2007 and now includes both a UKPMC and PubMed search, as well as access to other records such as Agricola, Patents and recent biomedical theses. UKPMC also differs from PubMed/PMC in that the full text and abstract information can be searched in an integrated manner from one input box. Furthermore, UKPMC contains 'Cited By' information as an alternative way to navigate the literature and has incorporated text-mining approaches to semantically enrich content and integrate it with related database resources. Finally, UKPMC also offers added-value services (UKPMC+) that enable grantees to deposit manuscripts, link papers to grants, publish online portfolios and view citation information on their papers. Here we describe UKPMC and clarify the relationship between PMC and UKPMC, providing historical context and future directions, 10 years on from when PMC was first launched.

  3. Web-based access to near real-time and archived high-density time-series data: cyber infrastructure challenges & developments in the open-source Waveform Server

    NASA Astrophysics Data System (ADS)

    Reyes, J. C.; Vernon, F. L.; Newman, R. L.; Steidl, J. H.

    2010-12-01

    The Waveform Server is an interactive web-based interface to multi-station, multi-sensor and multi-channel high-density time-series data stored in Center for Seismic Studies (CSS) 3.0 schema relational databases (Newman et al., 2009). In the last twelve months, based on expanded specifications and current user feedback, both the server-side infrastructure and client-side interface have been extensively rewritten. The Python Twisted server-side code base has been fundamentally modified and now presents waveform data stored in cluster-based databases using a multi-threaded architecture, in addition to supporting the pre-existing single-database model. This allows interactive web-based access to high-density (broadband @ 40 Hz to strong motion @ 200 Hz) waveform data that can span multiple years, the common lifetime of broadband seismic networks. The client-side interface expands on its use of simple JSON-based AJAX queries to incorporate a variety of user interface (UI) improvements, including standardized calendars for defining time ranges, on-the-fly data calibration to display SI-unit data, and increased rendering speed. This presentation will outline the various cyberinfrastructure challenges we have faced while developing this application, the existing use cases, and the limitations of web-based application development.
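The client-side ideas mentioned above, a JSON query describing a waveform request and on-the-fly calibration of raw counts into SI units, can be sketched as follows. The field names and the calibration factor are hypothetical illustrations, not the Waveform Server's actual wire format:

```python
import json

# Hypothetical JSON request body for a station/channel/time-range query,
# of the kind a JSON-based AJAX client might send.
request = json.dumps({
    "sta": "ANMO", "chan": "BHZ",
    "start": "2010-01-01T00:00:00", "end": "2010-01-01T00:10:00",
})

def calibrate(raw_counts, calib):
    """Convert raw digitizer counts to physical units by applying a
    CSS3.0-style calib factor stored alongside the waveform rows."""
    return [c * calib for c in raw_counts]

# Example calib value chosen purely for illustration.
print(calibrate([100, -250, 75], calib=0.0625))   # → [6.25, -15.625, 4.6875]
```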

  4. The BioGRID interaction database: 2013 update.

    PubMed

    Chatr-Aryamontri, Andrew; Breitkreutz, Bobby-Joe; Heinicke, Sven; Boucher, Lorrie; Winter, Andrew; Stark, Chris; Nixon, Julie; Ramage, Lindsay; Kolas, Nadine; O'Donnell, Lara; Reguly, Teresa; Breitkreutz, Ashton; Sellam, Adnane; Chen, Daici; Chang, Christie; Rust, Jennifer; Livstone, Michael; Oughtred, Rose; Dolinski, Kara; Tyers, Mike

    2013-01-01

    The Biological General Repository for Interaction Datasets (BioGRID: http://thebiogrid.org) is an open access archive of genetic and protein interactions that are curated from the primary biomedical literature for all major model organism species. As of September 2012, BioGRID houses more than 500 000 manually annotated interactions from more than 30 model organisms. BioGRID maintains complete curation coverage of the literature for the budding yeast Saccharomyces cerevisiae, the fission yeast Schizosaccharomyces pombe and the model plant Arabidopsis thaliana. A number of themed curation projects in areas of biomedical importance are also supported. BioGRID has established collaborations and/or shares data records for the annotation of interactions and phenotypes with most major model organism databases, including Saccharomyces Genome Database, PomBase, WormBase, FlyBase and The Arabidopsis Information Resource. BioGRID also actively engages with the text-mining community to benchmark and deploy automated tools to expedite curation workflows. BioGRID data are freely accessible through both a user-defined interactive interface and in batch downloads in a wide variety of formats, including PSI-MI 2.5 and tab-delimited files. BioGRID records can also be interrogated and analyzed with a series of new bioinformatics tools, which include a post-translational modification viewer, a graphical viewer, a REST service and a Cytoscape plugin.
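As a hedged illustration of the batch-download route mentioned above, the snippet below parses a small BioGRID-style tab-delimited record with the standard library. The column names are placeholders; the exact header depends on the BioGRID release and download format chosen:

```python
import csv
import io

# A tiny, made-up tab-delimited excerpt standing in for a BioGRID batch
# download; the real files have many more columns.
sample = (
    "Interactor A\tInteractor B\tExperimental System\n"
    "YFG1\tYFG2\tTwo-hybrid\n"
)

# DictReader maps each row to the header names, so downstream code can
# refer to columns by name rather than by position.
rows = list(csv.DictReader(io.StringIO(sample), delimiter="\t"))
print(rows[0]["Interactor B"])   # → YFG2
```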

  5. The BIG Data Center: from deposition to integration to translation.

    PubMed

    2017-01-04

    Biological data are being generated at an unprecedented, exponentially growing rate, posing considerable challenges in big data deposition, integration and translation. The BIG Data Center, established at Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, provides a suite of database resources, including (i) Genome Sequence Archive, a data repository specialized for archiving raw sequence reads, (ii) Gene Expression Nebulas, a data portal of gene expression profiles based entirely on RNA-Seq data, (iii) Genome Variation Map, a comprehensive collection of genome variations for featured species, (iv) Genome Warehouse, a centralized resource housing genome-scale data with particular focus on economically important animals and plants, (v) Methylation Bank, an integrated database of whole-genome single-base resolution methylomes and (vi) Science Wikis, a central access point for biological wikis developed for community annotations. The BIG Data Center is dedicated to constructing and maintaining biological databases through big data integration and value-added curation, conducting basic research to translate big data into big knowledge and providing freely open access to a variety of data resources in support of worldwide research activities in both academia and industry. All of these resources are publicly available and can be found at http://bigd.big.ac.cn. © The Author(s) 2016. Published by Oxford University Press on behalf of Nucleic Acids Research.

  6. UKPMC: a full text article resource for the life sciences

    PubMed Central

    McEntyre, Johanna R.; Ananiadou, Sophia; Andrews, Stephen; Black, William J.; Boulderstone, Richard; Buttery, Paula; Chaplin, David; Chevuru, Sandeepreddy; Cobley, Norman; Coleman, Lee-Ann; Davey, Paul; Gupta, Bharti; Haji-Gholam, Lesley; Hawkins, Craig; Horne, Alan; Hubbard, Simon J.; Kim, Jee-Hyub; Lewin, Ian; Lyte, Vic; MacIntyre, Ross; Mansoor, Sami; Mason, Linda; McNaught, John; Newbold, Elizabeth; Nobata, Chikashi; Ong, Ernest; Pillai, Sharmila; Rebholz-Schuhmann, Dietrich; Rosie, Heather; Rowbotham, Rob; Rupp, C. J.; Stoehr, Peter; Vaughan, Philip

    2011-01-01

    UK PubMed Central (UKPMC) is a full-text article database that extends the functionality of the original PubMed Central (PMC) repository. The UKPMC project was launched as the first ‘mirror’ site to PMC, which, in analogy to the International Nucleotide Sequence Database Collaboration, aims to provide international preservation of the open and free-access biomedical literature. UKPMC (http://ukpmc.ac.uk) has undergone considerable development since its inception in 2007 and now includes both a UKPMC and PubMed search, as well as access to other records such as Agricola, Patents and recent biomedical theses. UKPMC also differs from PubMed/PMC in that the full text and abstract information can be searched in an integrated manner from one input box. Furthermore, UKPMC contains ‘Cited By’ information as an alternative way to navigate the literature and has incorporated text-mining approaches to semantically enrich content and integrate it with related database resources. Finally, UKPMC also offers added-value services (UKPMC+) that enable grantees to deposit manuscripts, link papers to grants, publish online portfolios and view citation information on their papers. Here we describe UKPMC and clarify the relationship between PMC and UKPMC, providing historical context and future directions, 10 years on from when PMC was first launched. PMID:21062818

  7. The plant phenological online database (PPODB): an online database for long-term phenological data.

    PubMed

    Dierenbach, Jonas; Badeck, Franz-W; Schaber, Jörg

    2013-09-01

    We present an online database that provides unrestricted and free access to over 16 million plant phenological observations from over 8,000 stations in Central Europe between the years 1880 and 2009. Unique features are (1) flexible and unrestricted access to a full-fledged database, allowing for a wide range of individual queries and data retrieval, (2) historical data for Germany before 1951 ranging back to 1880, and (3) more than 480 curated long-term time series covering more than 100 years for individual phenological phases and plants combined over Natural Regions in Germany. Time series for single stations or Natural Regions can be accessed through a user-friendly graphical geo-referenced interface. The joint databases made available with the plant phenological database PPODB render accessible an important data source for further analyses of long-term changes in phenology. The database can be accessed via www.ppodb.de.

  8. The ChArMEx database

    NASA Astrophysics Data System (ADS)

    Ferré, Helene; Belmahfoud, Nizar; Boichard, Jean-Luc; Brissebrat, Guillaume; Descloitres, Jacques; Fleury, Laurence; Focsa, Loredana; Henriot, Nicolas; Mastrorillo, Laurence; Mière, Arnaud; Vermeulen, Anne

    2014-05-01

    The Chemistry-Aerosol Mediterranean Experiment (ChArMEx, http://charmex.lsce.ipsl.fr/) aims at a scientific assessment of the present and future state of the atmospheric environment in the Mediterranean Basin, and of its impacts on the regional climate, air quality, and marine biogeochemistry. The project includes long-term monitoring of environmental parameters, intensive field campaigns, use of satellite data and modelling studies. ChArMEx scientists therefore produce and need access to a wide diversity of data. In this context, the objective of the database task is to organize data management, the distribution system and services, facilitating the exchange of information and stimulating collaboration between researchers within the ChArMEx community and beyond. The database relies on a strong collaboration between the OMP and ICARE data centres and has been set up in the framework of the Mediterranean Integrated Studies at Regional And Local Scales (MISTRALS) program data portal. All the data produced by or of interest to the ChArMEx community will be documented in the data catalogue and accessible through the database website: http://mistrals.sedoo.fr/ChArMEx. At present, the ChArMEx database contains about 75 datasets, including 50 in situ datasets (2012 and 2013 campaigns, Ersa background monitoring station), 25 model outputs (dust model intercomparison, MEDCORDEX scenarios), and a high-resolution emission inventory over the Mediterranean. Many in situ datasets have been inserted in a relational database, in order to enable more accurate data selection and the download of different datasets in a shared format. The database website offers several tools:
    - A registration procedure that enables any scientist to accept the data policy and apply for a user database account.
    - A data catalogue that complies with international metadata standards (ISO 19115-19139; INSPIRE European Directive; Global Change Master Directory Thesaurus).
    - Metadata forms to document observations or products that will be provided to the database.
    - A search tool to browse the catalogue using thematic, geographic and/or temporal criteria.
    - A shopping-cart web interface to order in situ data files.
    - A web interface to select and access homogenized datasets.
    Interoperability between the two data centres is being set up using the OPeNDAP protocol. The data portal will soon propose user-friendly access to satellite products managed by the ICARE data centre (SEVIRI, TRMM, PARASOL...). In order to meet the operational needs of the airborne and ground-based observational teams during the ChArMEx 2012 and 2013 campaigns, a day-to-day chart and report display website has also been developed: http://choc.sedoo.org. It offers a convenient way to browse weather conditions and chemical composition during the campaign periods.

  9. Fermilab Security Site Access Request Database

    Science.gov Websites

    Fermilab Security Site Access Request Database. Use of the online version of the Fermilab Security Site Access Request Database requires logging into the ESH&Q Web Site. This page was generated from the ESH&Q Section's Oracle database on May 27, 2018 05:48 AM.

  10. NONATObase: a database for Polychaeta (Annelida) from the Southwestern Atlantic Ocean.

    PubMed

    Pagliosa, Paulo R; Doria, João G; Misturini, Dairana; Otegui, Mariana B P; Oortman, Mariana S; Weis, Wilson A; Faroni-Perez, Larisse; Alves, Alexandre P; Camargo, Maurício G; Amaral, A Cecília Z; Marques, Antonio C; Lana, Paulo C

    2014-01-01

    Networks can greatly advance data-sharing practices by providing organized and useful data sets on marine biodiversity in a friendly and shared scientific environment. NONATObase, the interactive database on polychaetes presented herein, will provide new macroecological and taxonomic insights into the Southwestern Atlantic region. The database was developed by the NONATO network, a team of South American researchers, who integrated available information on polychaetes from between 5°N and 80°S in the Atlantic Ocean and near the Antarctic. The guiding principle of the database is to keep free and open access to data based on partnerships. Its architecture consists of a relational database integrated in the MySQL and PHP framework. Its web application allows access to the data from three different directions: species (qualitative data), abundance (quantitative data) and data set (reference data). The database has built-in functionality, such as filtering data by user-defined taxonomic levels and by characteristics of the site, sample, sampler and mesh size used. Considering that there are still many taxonomic issues related to the poorly known regional fauna, a scientific committee was created to work out consistent solutions to current misidentifications and the equivocal taxonomic status of some species. Expertise from this committee will be incorporated into NONATObase continually. The use of quantitative data was made possible by standardization of a sample unit. All data, distribution maps and references from a data set or a specified query can be visualized and exported to data formats commonly used in statistical analysis or reference manager software. With NONATObase, the NONATO network has launched a valuable resource for marine ecologists and taxonomists. The database is expected to grow in functionality as it proves useful, particularly regarding the challenges of dealing with molecular genetic data and tools to assess the effects of global environmental change.
Database URL: http://nonatobase.ufsc.br/.
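The three access "directions" described above (species, abundance and data set) can be illustrated with a toy relational schema. All table names, column names and example values below are hypothetical, not NONATObase's actual MySQL schema:

```python
import sqlite3

# Toy schema linking reference data (dataset), site/sample metadata and
# quantitative abundance records, mirroring the three access directions.
db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE dataset  (id INTEGER PRIMARY KEY, reference TEXT);
CREATE TABLE sample   (id INTEGER PRIMARY KEY, dataset_id INT,
                       site TEXT, mesh_mm REAL);
CREATE TABLE abundance(sample_id INT, species TEXT, count INT);
""")
db.execute("INSERT INTO dataset VALUES (1, 'Example et al. 1997')")
db.execute("INSERT INTO sample VALUES (1, 1, 'Example Bay', 0.5)")
db.execute("INSERT INTO abundance VALUES (1, 'Laeonereis acuta', 42)")

# Filter by a user-defined sample characteristic (mesh size), joining
# abundance back to its sample and source data set.
row = db.execute("""
  SELECT a.species, a.count, d.reference
  FROM abundance a
  JOIN sample  s ON a.sample_id  = s.id
  JOIN dataset d ON s.dataset_id = d.id
  WHERE s.mesh_mm <= 0.5
""").fetchone()
print(row)   # → ('Laeonereis acuta', 42, 'Example et al. 1997')
```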

  11. NONATObase: a database for Polychaeta (Annelida) from the Southwestern Atlantic Ocean

    PubMed Central

    Pagliosa, Paulo R.; Doria, João G.; Misturini, Dairana; Otegui, Mariana B. P.; Oortman, Mariana S.; Weis, Wilson A.; Faroni-Perez, Larisse; Alves, Alexandre P.; Camargo, Maurício G.; Amaral, A. Cecília Z.; Marques, Antonio C.; Lana, Paulo C.

    2014-01-01

    Networks can greatly advance data-sharing practices by providing organized and useful data sets on marine biodiversity in a friendly and shared scientific environment. NONATObase, the interactive database on polychaetes presented herein, will provide new macroecological and taxonomic insights into the Southwestern Atlantic region. The database was developed by the NONATO network, a team of South American researchers, who integrated available information on polychaetes from between 5°N and 80°S in the Atlantic Ocean and near the Antarctic. The guiding principle of the database is to keep free and open access to data based on partnerships. Its architecture consists of a relational database integrated in the MySQL and PHP framework. Its web application allows access to the data from three different directions: species (qualitative data), abundance (quantitative data) and data set (reference data). The database has built-in functionality, such as filtering data by user-defined taxonomic levels and by characteristics of the site, sample, sampler and mesh size used. Considering that there are still many taxonomic issues related to the poorly known regional fauna, a scientific committee was created to work out consistent solutions to current misidentifications and the equivocal taxonomic status of some species. Expertise from this committee will be incorporated into NONATObase continually. The use of quantitative data was made possible by standardization of a sample unit. All data, distribution maps and references from a data set or a specified query can be visualized and exported to data formats commonly used in statistical analysis or reference manager software. With NONATObase, the NONATO network has launched a valuable resource for marine ecologists and taxonomists. The database is expected to grow in functionality as it proves useful, particularly regarding the challenges of dealing with molecular genetic data and tools to assess the effects of global environmental change.
Database URL: http://nonatobase.ufsc.br/ PMID:24573879

  12. 47 CFR 54.410 - Subscriber eligibility determination and certification.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... eligibility by accessing one or more databases containing information regarding the subscriber's income (“income databases”), the eligible telecommunications carrier must access such income databases and... carrier cannot determine a prospective subscriber's income-based eligibility by accessing income databases...

  13. 47 CFR 54.410 - Subscriber eligibility determination and certification.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... eligibility by accessing one or more databases containing information regarding the subscriber's income (“income databases”), the eligible telecommunications carrier must access such income databases and... carrier cannot determine a prospective subscriber's income-based eligibility by accessing income databases...

  14. 47 CFR 54.410 - Subscriber eligibility determination and certification.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... eligibility by accessing one or more databases containing information regarding the subscriber's income (“income databases”), the eligible telecommunications carrier must access such income databases and... carrier cannot determine a prospective subscriber's income-based eligibility by accessing income databases...

  15. MINEs: Open access databases of computationally predicted enzyme promiscuity products for untargeted metabolomics

    DOE PAGES

    Jeffryes, James G.; Colastani, Ricardo L.; Elbadawi-Sidhu, Mona; ...

    2015-08-28

    Metabolomics has proven difficult to execute in an untargeted and generalizable manner. Liquid chromatography–mass spectrometry (LC–MS) has made it possible to gather data on thousands of cellular metabolites. However, matching metabolites to their spectral features continues to be a bottleneck, meaning that much of the collected information remains uninterpreted and that new metabolites are seldom discovered in untargeted studies. These challenges require new approaches that consider compounds beyond those available in curated biochemistry databases. Here we present Metabolic In silico Network Expansions (MINEs), an extension of known metabolite databases to include molecules that have not been observed, but are likely to occur based on known metabolites and common biochemical reactions. We utilize an algorithm called the Biochemical Network Integrated Computational Explorer (BNICE) and expert-curated reaction rules based on the Enzyme Commission classification system to propose the novel chemical structures and reactions that comprise MINE databases. Starting from the Kyoto Encyclopedia of Genes and Genomes (KEGG) COMPOUND database, the MINE contains over 571,000 compounds, of which 93% are not present in the PubChem database. However, these MINE compounds have on average higher structural similarity to natural products than compounds from KEGG or PubChem. MINE databases were able to propose annotations for 98.6% of a set of 667 MassBank spectra, 14% more than KEGG alone and equivalent to PubChem, while returning far fewer candidates per spectrum than PubChem (46 vs. 1715 median candidates). Application of MINEs to LC–MS accurate mass data enabled the identity of an unknown peak to be confidently predicted. MINE databases are freely accessible for non-commercial use via user-friendly web tools at http://minedatabase.mcs.anl.gov and developer-friendly APIs.
    MINEs improve metabolomics peak identification compared with general chemical databases, whose results include irrelevant synthetic compounds, and they complement and expand on previous in silico generated compound databases that focus on human metabolism. We are actively developing the database; future versions of this resource will incorporate transformation rules for spontaneous chemical reactions and more advanced filtering and prioritization of candidate structures.
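The network-expansion idea behind MINEs can be caricatured in a few lines: apply curated reaction "rules" to known compounds and keep only the novel products. Real MINE rules operate on chemical structures via BNICE; here a rule is just a function on a made-up molecular-formula dictionary, purely to show the shape of the expansion loop:

```python
# Toy illustration of in silico network expansion. Everything here is a
# simplified stand-in: real rules transform structures, not formulas.

def hydroxylate(compound):
    # EC 1.14.x-style monooxygenation caricature: add one oxygen atom.
    out = dict(compound)
    out["O"] = out.get("O", 0) + 1
    return out

known = [{"C": 6, "H": 12, "O": 6}]   # e.g. a hexose, as a formula dict
rules = [hydroxylate]

expanded = []
for compound in known:
    for rule in rules:
        product = rule(compound)
        # Keep only predictions that are not already known compounds.
        if product not in known and product not in expanded:
            expanded.append(product)

print(expanded)   # → [{'C': 6, 'H': 12, 'O': 7}]
```

Iterating this loop, feeding each generation's novel products back into `known`, is what grows the database far beyond its curated starting set.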

  16. Large and linked in scientific publishing

    PubMed Central

    2012-01-01

    We are delighted to announce the launch of GigaScience, an online open-access journal that focuses on research using or producing large datasets in all areas of biological and biomedical sciences. GigaScience is a new type of journal that provides standard scientific publishing linked directly to a database that hosts all the relevant data. The primary goals for the journal, detailed in this editorial, are to promote more rapid data release, broader use and reuse of data, improved reproducibility of results, and direct, easy access between analyses and their data. Direct and permanent connections of scientific analyses and their data (achieved by assigning all hosted data a citable DOI) will enable better analysis and deeper interpretation of the data in the future. PMID:23587310

  17. Large and linked in scientific publishing.

    PubMed

    Goodman, Laurie; Edmunds, Scott C; Basford, Alexandra T

    2012-07-12

    We are delighted to announce the launch of GigaScience, an online open-access journal that focuses on research using or producing large datasets in all areas of biological and biomedical sciences. GigaScience is a new type of journal that provides standard scientific publishing linked directly to a database that hosts all the relevant data. The primary goals for the journal, detailed in this editorial, are to promote more rapid data release, broader use and reuse of data, improved reproducibility of results, and direct, easy access between analyses and their data. Direct and permanent connections of scientific analyses and their data (achieved by assigning all hosted data a citable DOI) will enable better analysis and deeper interpretation of the data in the future.

  18. IPD-MHC 2.0: an improved inter-species database for the study of the major histocompatibility complex

    PubMed Central

    Maccari, Giuseppe; Robinson, James; Ballingall, Keith; Guethlein, Lisbeth A.; Grimholt, Unni; Kaufman, Jim; Ho, Chak-Sum; de Groot, Natasja G.; Flicek, Paul; Bontrop, Ronald E.; Hammond, John A.; Marsh, Steven G. E.

    2017-01-01

    The IPD-MHC Database project (http://www.ebi.ac.uk/ipd/mhc/) collects and expertly curates sequences of the major histocompatibility complex from non-human species and provides the infrastructure and tools to enable accurate analysis. Since the first release of the database in 2003, IPD-MHC has grown and currently hosts a number of specific sections, with more than 7000 alleles from 70 species, including non-human primates, canines, felines, equids, ovids, suids, bovins, salmonids and murids. These sequences are expertly curated and made publicly available through an open access website. The IPD-MHC Database is a key resource in its field, and this has led to an average of 1500 unique visitors and more than 5000 viewed pages per month. As the database has grown in size and complexity, it has created a number of challenges in maintaining and organizing information, particularly the need to standardize nomenclature and taxonomic classification, while incorporating new allele submissions. Here, we describe the latest database release, the IPD-MHC 2.0 and discuss planned developments. This release incorporates sequence updates and new tools that enhance database queries and improve the submission procedure by utilizing common tools that are able to handle the varied requirements of each MHC-group. PMID:27899604

  19. Implementation of GenePattern within the Stanford Microarray Database.

    PubMed

    Hubble, Jeremy; Demeter, Janos; Jin, Heng; Mao, Maria; Nitzberg, Michael; Reddy, T B K; Wymore, Farrell; Zachariah, Zachariah K; Sherlock, Gavin; Ball, Catherine A

    2009-01-01

    Hundreds of researchers across the world use the Stanford Microarray Database (SMD; http://smd.stanford.edu/) to store, annotate, view, analyze and share microarray data. In addition to providing registered users at Stanford access to their own data, SMD also provides access to public data, and tools with which to analyze those data, to any public user anywhere in the world. Previously, the addition of new microarray data analysis tools to SMD has been limited by available engineering resources, and in addition, the existing suite of tools did not provide a simple way to design, execute and share analysis pipelines, or to document such pipelines for the purposes of publication. To address this, we have incorporated the GenePattern software package directly into SMD, providing access to many new analysis tools, as well as a plug-in architecture that allows users to directly integrate and share additional tools through SMD. In this article, we describe our implementation of the GenePattern microarray analysis software package into the SMD code base. This extension is available with the SMD source code that is fully and freely available to others under an Open Source license, enabling other groups to create a local installation of SMD with an enriched data analysis capability.
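The plug-in architecture described above, in which users integrate and share additional analysis tools, can be sketched as a simple registry. This is a generic illustration of the pattern, not SMD's or GenePattern's actual API:

```python
import math

# Registry mapping tool names to analysis functions; plug-ins add
# themselves at import time via the decorator below.
TOOLS = {}

def register(name):
    """Decorator that adds an analysis function to the shared registry."""
    def wrap(fn):
        TOOLS[name] = fn
        return fn
    return wrap

@register("log2_transform")
def log2_transform(values):
    # A typical microarray preprocessing step: log2-transform intensities.
    return [math.log2(v) for v in values]

# The host application can now discover and invoke tools by name.
print(TOOLS["log2_transform"]([1, 2, 8]))   # → [0.0, 1.0, 3.0]
```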

  20. The Arabidopsis Information Resource: Making and Mining the ‘Gold Standard’ Annotated Reference Plant Genome

    PubMed Central

    Berardini, Tanya Z.; Reiser, Leonore; Li, Donghui; Mezheritsky, Yarik; Muller, Robert; Strait, Emily; Huala, Eva

    2015-01-01

    The Arabidopsis Information Resource (TAIR) is a continuously updated, online database of genetic and molecular biology data for the model plant Arabidopsis thaliana that provides a global research community with centralized access to data for over 30,000 Arabidopsis genes. TAIR’s biocurators systematically extract, organize, and interconnect experimental data from the literature along with computational predictions, community submissions, and high throughput datasets to present a high quality and comprehensive picture of Arabidopsis gene function. TAIR provides tools for data visualization and analysis, and enables ordering of seed and DNA stocks, protein chips and other experimental resources. TAIR actively engages with its users who contribute expertise and data that augments the work of the curatorial staff. TAIR’s focus in an extensive and evolving ecosystem of online resources for plant biology is on the critically important role of extracting experimentally-based research findings from the literature and making that information computationally accessible. In response to the loss of government grant funding, the TAIR team founded a nonprofit entity, Phoenix Bioinformatics, with the aim of developing sustainable funding models for biological databases, using TAIR as a test case. Phoenix has successfully transitioned TAIR to subscription-based funding while still keeping its data relatively open and accessible. PMID:26201819

  1. The EPA CompTox Chemistry Dashboard - an online resource ...

    EPA Pesticide Factsheets

    The U.S. Environmental Protection Agency (EPA) Computational Toxicology Program integrates advances in biology, chemistry, and computer science to help prioritize chemicals for further research based on potential human health risks. This work involves computational and data-driven approaches that integrate chemistry, exposure and biological data. As an outcome of these efforts, the National Center for Computational Toxicology (NCCT) has measured, assembled and delivered an enormous quantity and diversity of data for the environmental sciences, including high-throughput in vitro screening data, in vivo and functional use data, exposure models and chemical databases with associated properties. A series of software applications and databases have been produced over the past decade to deliver these data. Recent work has focused on the development of a new architecture that assembles the resources into a single platform. With a focus on delivering access to Open Data streams, web service integration, accessibility and a user-friendly web application, the CompTox Dashboard provides access to data associated with ~720,000 chemical substances. These data include research data in the form of bioassay screening data associated with the ToxCast program, experimental and predicted physicochemical properties, product and functional use information, and related data of value to environmental scientists. This presentation will provide an overview of the CompTox Dashboard and its va

  2. Is there enough research output of EU projects available to assess and improve health system performance? An attempt to understand and categorise the output of EU projects conducted between 2002 and 2012.

    PubMed

    Zander, Britta; Busse, Reinhard

    2017-02-22

    Adequate performance assessment benefits from the use of disaggregated data to allow a proper evaluation of health systems. Since routinely collected data are usually not disaggregated enough to allow stratified analyses of healthcare needs, utilisation, cost and quality across different sectors, international research projects could fill this gap by exploring means of data collection or even by providing individual-level data. The aim of this paper is therefore (1) to study the availability and accessibility of relevant European-funded health projects, and (2) to analyse their contents and methodologies. The European Commission Public Health Projects Database and CORDIS were searched for eligible projects, which were then analysed using information openly available online. Overall, only a few of the 39 identified projects produced data useful for proper performance assessment, due, for example, to data that were unavailable or inaccessible, or to poor linkage of health status to costs and patient experiences. Other problems were databases insufficient for identifying projects and poor communication of project contents and results. A new approach is necessary to improve accessibility to, and coverage of, data on outcomes, quality and costs of health systems, enabling decision-makers and health professionals to properly assess performance.

  3. 50 CFR 660.313 - Open access fishery-recordkeeping and reporting.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 50 Wildlife and Fisheries 11 2011-10-01 2011-10-01 false Open access fishery-recordkeeping and... West Coast Groundfish-Open Access Fisheries § 660.313 Open access fishery—recordkeeping and reporting... to open access fisheries. (b) Declaration reports for vessels using nontrawl gear. Declaration...

  4. 50 CFR 660.313 - Open access fishery-recordkeeping and reporting.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 50 Wildlife and Fisheries 9 2010-10-01 2010-10-01 false Open access fishery-recordkeeping and... West Coast Groundfish-Open Access Fisheries § 660.313 Open access fishery—recordkeeping and reporting... to open access fisheries. (b) Declaration reports for vessels using nontrawl gear. Declaration...

  5. 50 CFR 660.313 - Open access fishery-recordkeeping and reporting.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 50 Wildlife and Fisheries 13 2014-10-01 2014-10-01 false Open access fishery-recordkeeping and... West Coast Groundfish-Open Access Fisheries § 660.313 Open access fishery—recordkeeping and reporting... to open access fisheries. (b) Declaration reports for vessels using nontrawl gear. Declaration...

  6. 50 CFR 660.313 - Open access fishery-recordkeeping and reporting.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 50 Wildlife and Fisheries 13 2012-10-01 2012-10-01 false Open access fishery-recordkeeping and... West Coast Groundfish-Open Access Fisheries § 660.313 Open access fishery—recordkeeping and reporting... to open access fisheries. (b) Declaration reports for vessels using nontrawl gear. Declaration...

  7. 50 CFR 660.313 - Open access fishery-recordkeeping and reporting.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 50 Wildlife and Fisheries 13 2013-10-01 2013-10-01 false Open access fishery-recordkeeping and... West Coast Groundfish-Open Access Fisheries § 660.313 Open access fishery—recordkeeping and reporting... to open access fisheries. (b) Declaration reports for vessels using nontrawl gear. Declaration...

  8. Pragmatic open space box utilization: asteroid survey model using distributed objects management based articulation (DOMBA)

    NASA Astrophysics Data System (ADS)

    Mohammad, Atif Farid; Straub, Jeremy

    2015-05-01

    A multi-craft asteroid survey has significant data synchronization needs, and limited communication speeds drive exacting performance requirements. Relational databases organize data into structured tables; DOMBA (Distributed Objects Management Based Articulation) instead deals with data in terms of collections, so no read/write roadblocks to the data exist. A master/slave architecture is created by utilizing the Gossip protocol, which facilitates expanding a mission, via the launch of another spacecraft, when an important discovery is made. The Open Space Box Framework supports the foregoing while also providing a virtual caching layer to ensure that continuously accessed data remain in memory and that, upon closing the data file, recharging is applied to the data.
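    The collection-based, lock-free replication suggested above can be sketched as a last-writer-wins merge of timestamped object collections. This is an illustrative toy under assumed semantics, not DOMBA's actual protocol, and the collection contents are made up.

```python
def merge(local, remote):
    """Merge two {key: (timestamp, value)} collections; the newer write wins.

    Because merging is commutative and needs no locks, two spacecraft can
    gossip their collections in either direction and converge on one state.
    """
    out = dict(local)
    for key, (ts, val) in remote.items():
        if key not in out or ts > out[key][0]:
            out[key] = (ts, val)
    return out


a = {"obj1": (1, "survey-a"), "obj2": (5, "spectra")}
b = {"obj1": (3, "survey-b"), "obj3": (2, "imagery")}
merged = merge(a, b)
# obj1 takes the remote value (newer timestamp); obj2 and obj3 pass through.
```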

  9. The AAS Working Group on Accessibility and Disability (WGAD) Year 1 Highlights and Database Access

    NASA Astrophysics Data System (ADS)

    Knierman, Karen A.; Diaz Merced, Wanda; Aarnio, Alicia; Garcia, Beatriz; Monkiewicz, Jacqueline A.; Murphy, Nicholas Arnold

    2017-06-01

    The AAS Working Group on Accessibility and Disability (WGAD) was formed in January 2016 with the express purpose of seeking equity of opportunity and building inclusive practices for disabled astronomers at all educational and career stages. In this presentation, we will provide a summary of current activities, focusing on developing best practices for accessibility with respect to astronomical databases, publications, and meetings. Because the space sciences rely heavily on databases, user-centered design of data retrieval systems is important. The cognitive overload that users of current databases may experience can be mitigated by multi-modal interfaces such as xSonify. Such interfaces would run in parallel with, or outside, the original database and would not require additional software effort from the original database. WGAD is partnering with the IAU Commission C1 WG Astronomy for Equity and Inclusion to develop such accessibility tools for databases and methods for user testing. To collect data on accessibility considerations at astronomical conferences and meetings, WGAD solicited feedback from January AAS attendees via a web form. These data, together with upcoming input from the community and analysis of accessibility documents from similar conferences, will be used to create a meeting accessibility document. Additionally, we will report on the progress of journal access guidelines and on our social media presence via Twitter. We recommend that astronomical journals form committees to evaluate the accessibility of their publications by performing user-centered usability studies.

  10. The QuakeSim Project: Web Services for Managing Geophysical Data and Applications

    NASA Astrophysics Data System (ADS)

    Pierce, Marlon E.; Fox, Geoffrey C.; Aktas, Mehmet S.; Aydin, Galip; Gadgil, Harshawardhan; Qi, Zhigang; Sayar, Ahmet

    2008-04-01

    We describe our distributed systems research efforts to build the “cyberinfrastructure” components that constitute a geophysical Grid, or more accurately, a Grid of Grids. Service-oriented computing principles are used to build a distributed infrastructure of Web accessible components for accessing data and scientific applications. Our data services fall into two major categories: Archival, database-backed services based around Geographical Information System (GIS) standards from the Open Geospatial Consortium, and streaming services that can be used to filter and route real-time data sources such as Global Positioning System data streams. Execution support services include application execution management services and services for transferring remote files. These data and execution service families are bound together through metadata information and workflow services for service orchestration. Users may access the system through the QuakeSim scientific Web portal, which is built using a portlet component approach.
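    The archival services mentioned above follow Open Geospatial Consortium standards, whose requests are plain HTTP with standardized query parameters. As a hedged sketch, the helper below builds a WFS GetFeature request URL with the standard library; the endpoint and feature type name are placeholders, not QuakeSim's actual ones.

```python
from urllib.parse import urlencode


def wfs_getfeature_url(endpoint, type_name, max_features=None):
    """Build an OGC WFS 1.1.0 GetFeature request URL."""
    params = {
        "service": "WFS",          # standard OGC service identifier
        "version": "1.1.0",
        "request": "GetFeature",
        "typeName": type_name,     # which feature type to fetch
    }
    if max_features is not None:
        params["maxFeatures"] = str(max_features)
    return endpoint + "?" + urlencode(params)


# Hypothetical endpoint and layer name, for illustration only.
url = wfs_getfeature_url("http://example.org/wfs", "quake:faults", 10)
```

    Because the interface is just URL parameters over HTTP, any OGC-aware client (or a browser) can consume the same service, which is the interoperability argument behind the Grid-of-Grids design.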

  11. The RCSB Protein Data Bank: views of structural biology for basic and applied research and education

    PubMed Central

    Rose, Peter W.; Prlić, Andreas; Bi, Chunxiao; Bluhm, Wolfgang F.; Christie, Cole H.; Dutta, Shuchismita; Green, Rachel Kramer; Goodsell, David S.; Westbrook, John D.; Woo, Jesse; Young, Jasmine; Zardecki, Christine; Berman, Helen M.; Bourne, Philip E.; Burley, Stephen K.

    2015-01-01

    The RCSB Protein Data Bank (RCSB PDB, http://www.rcsb.org) provides access to 3D structures of biological macromolecules and is one of the leading resources in biology and biomedicine worldwide. Our efforts over the past 2 years focused on enabling a deeper understanding of structural biology and providing new structural views of biology that support both basic and applied research and education. Herein, we describe recently introduced data annotations including integration with external biological resources, such as gene and drug databases, new visualization tools and improved support for the mobile web. We also describe access to data files, web services and open access software components to enable software developers to more effectively mine the PDB archive and related annotations. Our efforts are aimed at expanding the role of 3D structure in understanding biology and medicine. PMID:25428375

  12. 50 CFR 660.330 - Open access fishery-management measures.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 50 Wildlife and Fisheries 13 2014-10-01 2014-10-01 false Open access fishery-management measures... West Coast Groundfish-Open Access Fisheries § 660.330 Open access fishery—management measures. (a) General. Groundfish species taken in open access fisheries will be managed with cumulative trip limits...

  13. 50 CFR 660.330 - Open access fishery-management measures.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 50 Wildlife and Fisheries 9 2010-10-01 2010-10-01 false Open access fishery-management measures... West Coast Groundfish-Open Access Fisheries § 660.330 Open access fishery—management measures. (a) General. Groundfish species taken in open access fisheries will be managed with cumulative trip limits...

  14. 50 CFR 660.330 - Open access fishery-management measures.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 50 Wildlife and Fisheries 13 2013-10-01 2013-10-01 false Open access fishery-management measures... West Coast Groundfish-Open Access Fisheries § 660.330 Open access fishery—management measures. (a) General. Groundfish species taken in open access fisheries will be managed with cumulative trip limits...

  15. 50 CFR 660.320 - Open access fishery-crossover provisions.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 50 Wildlife and Fisheries 9 2010-10-01 2010-10-01 false Open access fishery-crossover provisions... West Coast Groundfish-Open Access Fisheries § 660.320 Open access fishery—crossover provisions. (a) Operating in both limited entry and open access fisheries. See provisions at § 660.60, subpart C. (b...

  16. 50 CFR 660.320 - Open access fishery-crossover provisions.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 50 Wildlife and Fisheries 11 2011-10-01 2011-10-01 false Open access fishery-crossover provisions... West Coast Groundfish-Open Access Fisheries § 660.320 Open access fishery—crossover provisions. (a) Operating in both limited entry and open access fisheries. See provisions at § 660.60, subpart C. (b...

  17. 50 CFR 660.312 - Open access fishery-prohibitions.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 50 Wildlife and Fisheries 13 2012-10-01 2012-10-01 false Open access fishery-prohibitions. 660.312... Groundfish-Open Access Fisheries § 660.312 Open access fishery—prohibitions. General groundfish prohibitions..., possess, or land groundfish in excess of the landing limit for the open access fishery without having a...

  18. 50 CFR 660.330 - Open access fishery-management measures.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 50 Wildlife and Fisheries 13 2012-10-01 2012-10-01 false Open access fishery-management measures... West Coast Groundfish-Open Access Fisheries § 660.330 Open access fishery—management measures. (a) General. Groundfish species taken in open access fisheries will be managed with cumulative trip limits...

  19. 50 CFR 660.312 - Open access fishery-prohibitions.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 50 Wildlife and Fisheries 13 2014-10-01 2014-10-01 false Open access fishery-prohibitions. 660.312... Groundfish-Open Access Fisheries § 660.312 Open access fishery—prohibitions. General groundfish prohibitions..., possess, or land groundfish in excess of the landing limit for the open access fishery without having a...

  20. 50 CFR 660.312 - Open access fishery-prohibitions.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 50 Wildlife and Fisheries 9 2010-10-01 2010-10-01 false Open access fishery-prohibitions. 660.312... Groundfish-Open Access Fisheries § 660.312 Open access fishery—prohibitions. General groundfish prohibitions..., possess, or land groundfish in excess of the landing limit for the open access fishery without having a...

  1. 50 CFR 660.312 - Open access fishery-prohibitions.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 50 Wildlife and Fisheries 13 2013-10-01 2013-10-01 false Open access fishery-prohibitions. 660.312... Groundfish-Open Access Fisheries § 660.312 Open access fishery—prohibitions. General groundfish prohibitions..., possess, or land groundfish in excess of the landing limit for the open access fishery without having a...

  2. 50 CFR 660.312 - Open access fishery-prohibitions.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 50 Wildlife and Fisheries 11 2011-10-01 2011-10-01 false Open access fishery-prohibitions. 660.312... Groundfish-Open Access Fisheries § 660.312 Open access fishery—prohibitions. General groundfish prohibitions..., possess, or land groundfish in excess of the landing limit for the open access fishery without having a...

  3. 50 CFR 660.330 - Open access fishery-management measures.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 50 Wildlife and Fisheries 11 2011-10-01 2011-10-01 false Open access fishery-management measures... West Coast Groundfish-Open Access Fisheries § 660.330 Open access fishery—management measures. (a) General. Groundfish species taken in open access fisheries will be managed with cumulative trip limits...

  4. Medical education and information literacy in the era of open access.

    PubMed

    Brower, Stewart M

    2010-01-01

    The Open Access movement in scholarly communications poses new issues and concerns for medical education in general and information literacy education specifically. For medical educators, Open Access can affect the availability of new information, instructional materials, and scholarship in medical education. For students, Open Access materials continue to be available to them post-graduation, regardless of affiliation. Libraries and information literacy librarians are challenged in their responses to the Open Access publishing movement in how best to support Open Access endeavors within their own institutions, and how best to educate their user base about Open Access in general.

  5. LSST communications middleware implementation

    NASA Astrophysics Data System (ADS)

    Mills, Dave; Schumacher, German; Lotz, Paul

    2016-07-01

    The LSST communications middleware is based on a set of software abstractions that provide standard interfaces for common communications services. The observatory requires communication between diverse subsystems, implemented by different contractors, and comprehensive archiving of subsystem status data. The Service Abstraction Layer (SAL) is implemented using open-source packages that implement the open standards DDS (Data Distribution Service) for data communication and SQL (Structured Query Language) for database access. For every subsystem, abstractions for each of the telemetry data streams, along with commands/responses and events, have been agreed with the appropriate component vendor (such as Dome, TMA, Hexapod) and captured in ICDs (Interface Control Documents). The OpenSplice (PrismTech) Community Edition of DDS provides an LGPL-licensed distribution that may be freely redistributed. The availability of the full source code provides assurance that the project will be able to maintain it over the full 10-year survey, independent of the fortunes of the original providers.
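    The pattern the SAL wraps around DDS topics (named telemetry streams that decouple publishers from subscribers) can be illustrated with a minimal in-process publish/subscribe bus. This is a toy sketch of the pattern only, with hypothetical topic and field names; the real SAL and OpenSplice APIs differ.

```python
from collections import defaultdict


class TelemetryBus:
    """Toy pub/sub bus: subscribers register callbacks on a named topic."""

    def __init__(self):
        self._subs = defaultdict(list)

    def subscribe(self, topic, callback):
        self._subs[topic].append(callback)

    def publish(self, topic, sample):
        # Every subscriber to the topic receives every published sample;
        # publishers never know who (archive, GUI, health monitor) listens.
        for cb in self._subs[topic]:
            cb(sample)


bus = TelemetryBus()
received = []
bus.subscribe("dome_azimuth", received.append)      # e.g. an archiver
bus.publish("dome_azimuth", {"angle_deg": 123.4})   # e.g. the Dome subsystem
```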

  6. Data Sharing Effect on Article Citation Rate in Paleoceanography

    NASA Astrophysics Data System (ADS)

    Sears, J. R.

    2011-12-01

    The validation of scientific results requires reproducible methods and data. Often, however, data sets supporting research articles are not openly accessible and interlinked. This analysis tests whether open sharing and linking of supporting data through the PANGAEA data library measurably increases the citation rate of articles published between 1993 and 2010 in the journal Paleoceanography, as reported in the Thomson Reuters Web of Science database. The 12.85% (171) of articles with publicly available supporting data sets received 19.94% (8,056) of the aggregate citations (40,409). Publicly available data were thus significantly (p=0.007, 95% confidence interval) associated with about 35% more citations per article than the average of all 1,331 articles sampled over the 18-year study period, and the increase is fairly consistent over time (14 of 18 years). This relationship between openly available, curated data and increased citation rate may incentivize researchers to share their data.
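    The article and citation shares quoted above can be re-derived from the reported counts; a minimal check in Python (the study's own per-article comparison may use a different baseline than the simple ratios shown here):

```python
# Counts reported in the abstract.
total_articles = 1331
total_citations = 40409
open_articles = 171       # articles with publicly available supporting data
open_citations = 8056     # citations received by those articles

share_articles = open_articles / total_articles      # ~0.1285, i.e. 12.85%
share_citations = open_citations / total_citations   # ~0.1994, i.e. 19.94%

# Per-article citation rates for the open-data subset vs. the full sample.
cites_per_open = open_citations / open_articles      # ~47 citations/article
cites_per_avg = total_citations / total_articles     # ~30 citations/article
```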

  7. LexisNexis

    EPA Pesticide Factsheets

    LexisNexis provides access to electronic legal and non-legal research databases to the Agency's attorneys, administrative law judges, law clerks, investigators, and certain non-legal staff (e.g. staff in the Office of Public Affairs). The agency requires access to the following types of electronic databases: Legal databases, Non-legal databases, Public Records databases, and Financial databases.

  8. CDGP, the data center for deep geothermal data from Alsace

    NASA Astrophysics Data System (ADS)

    Schaming, Marc; Grunberg, Marc; Jahn, Markus; Schmittbuhl, Jean; Cuenot, Nicolas; Genter, Albert; Dalmais, Eléonore

    2016-04-01

    CDGP (Centre de données de géothermie profonde, the deep geothermal data center, http://cdgp.u-strasbg.fr) was set up by the LabEx G-EAU-THERMIE PROFONDE to archive the high-quality data collected at the Upper Rhine Graben geothermal sites and to distribute them to the scientific community for R&D activities, taking IPR (Intellectual Property Rights) into account. The collected datasets cover the whole life of a geothermal project, from exploration to drilling, stimulation, circulation and production. They originate from the Soultz-sous-Forêts pilot plant but also include more recent projects such as the ECOGI project at Rittershoffen, Alsace, France. Historically they are separated into two rather independent categories: geophysical datasets, mostly related to the industrial management of the geothermal reservoir, and seismological data from the seismic monitoring of both stimulations and circulations. The geophysical datasets, up to now mainly from the Soultz-sous-Forêts project, were stored on office shelves and on old digital media. Some inventories have been carried out recently, and a first step of integrating these reservoir data into a PostgreSQL/PostGIS database (ISO 19107 compatible) has been performed. The database links depths, temperatures, pressures and flows to periods (times) and locations (geometries). Other geophysical data are still stored in structured directories as a data bank and remain to be included in the database. Seismological datasets are of two kinds: seismological waveforms and seismicity bulletins. The former are stored in a standardized way, both in format (miniSEED) and in file and directory structure (SDS), following the international standards of the seismological community (FDSN); the latter are stored in a database following the open QuakeML standard. CDGP uses a cataloging application (GeoNetwork) to manage the metadata resources; it provides metadata editing and search functions as well as a web map viewer.
    The metadata editor supports the ISO 19115/19119/19110 standards used for spatial resources. A step forward will be to add specific metadata records, as defined by the Open Geospatial Consortium, to provide geophysical/geologic/reservoir information: Observations and Measurements (O&M) to describe the acquisition of information from a primary source, and SensorML to describe the sensors. Seismological metadata, which describe the full instrument response, use the dataless SEED standard. Access to data will be handled in an additional step using the geOrchestra spatial data infrastructure (SDI). Direct access will be granted after registration and validation, using a single sign-on authentication system. Access to the data will also be granted via the EPOS-IP Anthropogenic Hazards project. Access to episodes (time-correlated collections of geophysical, technological and other relevant geo-data over a geothermal area) and application of analyses (time- and technology-dependent probabilistic seismic hazard analysis, multi-hazard and multi-risk assessment) are services accessible via a portal and will require AAAI (Authentication, Authorization, Accounting and Identification).
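    The SDS convention used for the waveform archive places miniSEED day files under YEAR/NET/STA/CHAN.TYPE/ with file names of the form NET.STA.LOC.CHAN.TYPE.YEAR.DAY. A small helper sketching that path layout; the network and station codes below are made up for illustration, not CDGP's actual ones.

```python
def sds_path(net, sta, loc, chan, year, day_of_year, dtype="D"):
    """Return the SDS-relative path of one miniSEED day file.

    dtype is the SDS data type code ("D" = waveform data); day_of_year
    is zero-padded to three digits per the convention.
    """
    fname = f"{net}.{sta}.{loc}.{chan}.{dtype}.{year}.{day_of_year:03d}"
    return f"{year}/{net}/{sta}/{chan}.{dtype}/{fname}"


# Hypothetical French network/station codes, day 32 of 2016.
p = sds_path("FR", "SSF01", "00", "HHZ", 2016, 32)
# -> "2016/FR/SSF01/HHZ.D/FR.SSF01.00.HHZ.D.2016.032"
```

    Because the path is fully determined by the stream identifiers and the date, clients can locate any day of data without consulting an index, which is the main appeal of the SDS layout.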

  9. Publishing in open access era: focus on respiratory journals

    PubMed Central

    Xu, Dingyao; Zhong, Xiyao; Li, Li; Ling, Qibo; Bu, Zhaode

    2014-01-01

    We have entered an open access publishing era. The impact and significance of open access are still under debate after two decades of evolution. Open access journals benefit researchers and the general public by promoting visibility, sharing and communication. Non-mainstream journals should turn the challenge of open access into an opportunity to present their best research articles to the global readership. Open access journals need to optimize their business models to promote their healthy and continuous development. PMID:24822120

  10. Publishing in open access era: focus on respiratory journals.

    PubMed

    Dai, Ni; Xu, Dingyao; Zhong, Xiyao; Li, Li; Ling, Qibo; Bu, Zhaode

    2014-05-01

    We have entered an open access publishing era. The impact and significance of open access are still under debate after two decades of evolution. Open access journals benefit researchers and the general public by promoting visibility, sharing and communication. Non-mainstream journals should turn the challenge of open access into an opportunity to present their best research articles to the global readership. Open access journals need to optimize their business models to promote their healthy and continuous development.

  11. PEP725 Pan European Phenological Database

    NASA Astrophysics Data System (ADS)

    Koch, E.; Lipa, W.; Ungersböck, M.; Zach-Hermann, S.

    2012-04-01

    PEP725 is a 5-year project whose main objective is to promote and facilitate phenological research by delivering a pan-European phenological database with open, unrestricted data access for science, research and education. PEP725 is funded by EUMETNET (the network of European meteorological services), ZAMG and the Austrian ministry for science & research bm:w_f. So far, 16 European national meteorological services and 7 partners from different national phenological network operators have joined PEP725. Data access is very easy via the homepage www.pep725.eu. Having accepted the PEP725 data policy and registered, users can download data by different criteria, for instance selecting a specific plant or all data from one country. At present more than 300,000 new records are available in the PEP725 database, coming from 31 European countries and from 8,150 stations. For 154 further stations, metadata (location and data holder) are provided. Links to the network operators and data owners are also on the webpage in case you have more sophisticated questions about the data. Another objective of PEP725 is to bring together network operators and scientists by organizing workshops. In April 2012 the second of these workshops will take place on the premises of ZAMG, with invited speakers giving presentations spanning the whole study area of phenology, from observations to modelling. Quality checking is also a big issue; at the moment we are studying the literature to find appropriate methods.

  12. Global Ocean Currents Database

    NASA Astrophysics Data System (ADS)

    Boyer, T.; Sun, L.

    2016-02-01

    NOAA's National Centers for Environmental Information has released an ocean currents database portal that aims (1) to integrate global ocean currents observations from a variety of instruments, with different resolution, accuracy and response to spatial and temporal variability, into a uniform network common data form (NetCDF) format, and (2) to provide dedicated online data discovery and access to NCEI-hosted and distributed sources of ocean currents data. The portal provides a tailored web application that allows users to search for ocean currents data by platform type and by spatial/temporal range of interest; it is available at http://www.nodc.noaa.gov/gocd/index.html. The NetCDF format supports widely used data access protocols and catalog services such as OPeNDAP (Open-source Project for a Network Data Access Protocol) and THREDDS (Thematic Real-time Environmental Distributed Data Services), so GOCD users can work with the data files in their favorite analysis and visualization client software without downloading them to a local machine. The potential users of the ocean currents database include, but are not limited to, (1) ocean modelers assessing model skill, (2) scientists and researchers studying the impact of ocean circulation on climate variability, (3) the ocean shipping industry, for safe navigation and for finding optimal routes for ship fuel efficiency, (4) ocean resource managers planning optimal sites for waste and sewage dumping and for renewable hydrokinetic energy, and (5) state and federal governments, to provide historical (analyzed) ocean circulations as an aid for search and rescue.
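    The OPeNDAP access mentioned above works by appending a DAP constraint expression to the dataset URL, so a client can request just a slice of a remote variable. A hedged sketch of building such a subset URL; the dataset URL and variable name are placeholders, not GOCD's actual ones.

```python
def dap_subset_url(dataset_url, var, *ranges, ext="dods"):
    """Build a DAP2-style subset URL for one variable.

    Each range is a (start, stride, stop) triple in index space,
    rendered as the [start:stride:stop] hyperslab syntax of DAP
    constraint expressions. ext selects the response form
    ("dods" for binary data, "ascii" for a text dump).
    """
    ce = var + "".join(f"[{lo}:{step}:{hi}]" for lo, step, hi in ranges)
    return f"{dataset_url}.{ext}?{ce}"


# Hypothetical dataset and variable: first 10 times, first 5 depths of "u".
url = dap_subset_url("http://example.org/gocd/currents.nc", "u",
                     (0, 1, 9), (0, 1, 4))
# -> "http://example.org/gocd/currents.nc.dods?u[0:1:9][0:1:4]"
```

    Subsetting server-side is what lets analysis clients work against a remote archive without first transferring whole files.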

  13. Publishing in Open Access Education Journals: The Authors' Perspectives

    ERIC Educational Resources Information Center

    Coonin, Bryna; Younce, Leigh M.

    2010-01-01

    Open access publishing is now an accepted method of scholarly communication. However, the greatest traction for open access publishing thus far has been in the sciences. Penetration of open access publishing has been much slower among the social sciences. This study surveys 309 authors from recent issues of open access journals in education to…

  14. 50 CFR 660.311 - Open access fishery-definitions.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 50 Wildlife and Fisheries 13 2012-10-01 2012-10-01 false Open access fishery-definitions. 660.311... Groundfish-Open Access Fisheries § 660.311 Open access fishery—definitions. General definitions for the... specific to the open access fishery covered in this subpart and are in addition to those specified at § 660...

  15. 50 CFR 660.311 - Open access fishery-definitions.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... 50 Wildlife and Fisheries 11 2011-10-01 2011-10-01 false Open access fishery-definitions. 660.311... Groundfish-Open Access Fisheries § 660.311 Open access fishery—definitions. General definitions for the... specific to the open access fishery covered in this subpart and are in addition to those specified at § 660...

  16. 50 CFR 660.320 - Open access fishery-crossover provisions.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 50 Wildlife and Fisheries 13 2014-10-01 2014-10-01 false Open access fishery-crossover provisions... West Coast Groundfish-Open Access Fisheries § 660.320 Open access fishery—crossover provisions. The crossover provisions listed at § 660.60(h)(7), apply to vessels fishing in the open access fishery. [76 FR...

  17. 50 CFR 660.320 - Open access fishery-crossover provisions.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 50 Wildlife and Fisheries 13 2013-10-01 2013-10-01 false Open access fishery-crossover provisions... West Coast Groundfish-Open Access Fisheries § 660.320 Open access fishery—crossover provisions. The crossover provisions listed at § 660.60(h)(7), apply to vessels fishing in the open access fishery. [76 FR...

  18. 50 CFR 660.319 - Open access fishery gear identification and marking.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 50 Wildlife and Fisheries 13 2013-10-01 2013-10-01 false Open access fishery gear identification... COAST STATES West Coast Groundfish-Open Access Fisheries § 660.319 Open access fishery gear identification and marking. (a) Gear identification. (1) Open access fixed gear (longline, trap or pot, set net...

  19. 50 CFR 660.311 - Open access fishery-definitions.

    Code of Federal Regulations, 2010 CFR

    2010-10-01

    ... 50 Wildlife and Fisheries 9 2010-10-01 2010-10-01 false Open access fishery-definitions. 660.311... Groundfish-Open Access Fisheries § 660.311 Open access fishery—definitions. General definitions for the... specific to the open access fishery covered in this subpart and are in addition to those specified at § 660...

  20. 50 CFR 660.320 - Open access fishery-crossover provisions.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... 50 Wildlife and Fisheries 13 2012-10-01 2012-10-01 false Open access fishery-crossover provisions... West Coast Groundfish-Open Access Fisheries § 660.320 Open access fishery—crossover provisions. The crossover provisions listed at § 660.60(h)(7), apply to vessels fishing in the open access fishery. [76 FR...

  1. 50 CFR 660.311 - Open access fishery-definitions.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... 50 Wildlife and Fisheries 13 2013-10-01 2013-10-01 false Open access fishery-definitions. 660.311... Groundfish-Open Access Fisheries § 660.311 Open access fishery—definitions. General definitions for the... specific to the open access fishery covered in this subpart and are in addition to those specified at § 660...

  2. 50 CFR 660.311 - Open access fishery-definitions.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... 50 Wildlife and Fisheries 13 2014-10-01 2014-10-01 false Open access fishery-definitions. 660.311... Groundfish-Open Access Fisheries § 660.311 Open access fishery—definitions. General definitions for the... specific to the open access fishery covered in this subpart and are in addition to those specified at § 660...

  3. Education Scholars' Perceptions and Practices toward Open Access Publishing

    ERIC Educational Resources Information Center

    Ellingford, Lori Michelle

    2012-01-01

    Although open access publishing has been available since 1998, we know little regarding scholars' perceptions and practices toward publishing in open access outlets, especially in the social science community. Open access publishing has been slow to penetrate the field of education, yet the potential impact of open access could make this…

  4. Your Personal Analysis Toolkit - An Open Source Solution

    NASA Astrophysics Data System (ADS)

    Mitchell, T.

    2009-12-01

    Open source software is commonly known for its web browsers, word processors and programming languages. However, there is a vast array of open source software focused on geographic information management and geospatial application building in general. As geo-professionals, having easy access to tools for our jobs is crucial. Open source software provides the opportunity to add a tool to your tool belt and carry it with you for your entire career - with no license fees, a supportive community and the opportunity to test, adopt and upgrade at your own pace. OSGeo is a US registered non-profit representing more than a dozen mature geospatial data management applications and programming resources. Tools cover areas such as desktop GIS, web-based mapping frameworks, metadata cataloging, spatial database analysis, image processing and more. Learn about some of these tools as they apply to AGU members, as well as how you can join OSGeo and its members in getting the job done with powerful open source tools. If you haven't heard of OSSIM, MapServer, OpenLayers, PostGIS, GRASS GIS or the many other projects under our umbrella - then you need to hear this talk. Invest in yourself - use open source!

  5. Genelab: Scientific Partnerships and an Open-Access Database to Maximize Usage of Omics Data from Space Biology Experiments

    NASA Technical Reports Server (NTRS)

    Reinsch, S. S.; Galazka, J..; Berrios, D. C; Chakravarty, K.; Fogle, H.; Lai, S.; Bokyo, V.; Timucin, L. R.; Tran, P.; Skidmore, M.

    2016-01-01

    NASA's mission includes expanding our understanding of biological systems to improve life on Earth and to enable long-duration human exploration of space. The GeneLab Data System (GLDS) is NASA's premier open-access omics data platform for biological experiments. GLDS houses standards-compliant, high-throughput sequencing and other omics data from spaceflight-relevant experiments. The GeneLab project at NASA-Ames Research Center is developing the database and partnering with spaceflight projects through sharing or augmentation of experiment samples to expand omics analyses on precious spaceflight samples. The partnerships ensure that the maximum amount of data is garnered from spaceflight experiments and made publicly available as rapidly as possible via the GLDS. GLDS Version 1.0 went online in April 2015. Software updates and new data releases occur at least quarterly. As of October 2016, the GLDS contains 80 datasets and has search and download capabilities. Version 2.0 is slated for release in September 2017 and will have expanded, integrated search capabilities leveraging other public omics databases (NCBI GEO, PRIDE, MG-RAST). Future versions in this multi-phase project will provide a collaborative platform for omics data analysis. Data from experiments that explore the biological effects of the spaceflight environment on a wide variety of model organisms are housed in the GLDS, including data from rodents, invertebrates, plants and microbes. Human datasets are currently limited to those with anonymized data (e.g., from cultured cell lines). GeneLab ensures prompt release and open access to high-throughput genomics, transcriptomics, proteomics, and metabolomics data from spaceflight and ground-based simulations of microgravity, radiation or other space environment factors. The data are meticulously curated to assure that accurate experimental and sample processing metadata are included with each data set.
GLDS download volumes indicate strong interest of the scientific community in these data. To date GeneLab has partnered with multiple experiments, including two plant (Arabidopsis thaliana) experiments, two mouse experiments, and several microbe experiments. GeneLab optimized protocols in the rodent partnerships for maximum yield of RNA, DNA and protein from tissues harvested and preserved during the SpaceX-4 mission, as well as from tissues from mice that were frozen intact during spaceflight and later dissected on the ground. Analysis of GeneLab data will contribute fundamental knowledge of how the space environment affects biological systems, as well as yield terrestrial benefits resulting from mitigation strategies to prevent effects observed during exposure to space environments.

  6. The Open Data Repository's Data Publisher

    NASA Astrophysics Data System (ADS)

    Stone, N.; Lafuente, B.; Downs, R. T.; Bristow, T.; Blake, D. F.; Fonda, M.; Pires, A.

    2015-12-01

    Data management and data publication are becoming increasingly important components of research workflows. The complexity of managing data, publishing data online, and archiving data has not decreased significantly even as computing access and power have greatly increased. The Open Data Repository's Data Publisher software (http://www.opendatarepository.org) strives to make data archiving, management, and publication a standard part of a researcher's workflow using simple, web-based tools and commodity server hardware. The publication engine allows for uploading, searching, and display of data with graphing capabilities and downloadable files. Access is controlled through a robust permissions system that can control publication at the field level and can be granted to the general public or protected so that only registered users at various permission levels receive access. Data Publisher also allows researchers to subscribe to meta-data standards through a plugin system, embargo data publication at their discretion, and collaborate with other researchers through various levels of data sharing. As the software matures, semantic data standards will be implemented to facilitate machine reading of data and each database will provide a REST application programming interface for programmatic access. Additionally, a citation system will allow snapshots of any data set to be archived and cited for publication while the data itself can remain living and continuously evolve beyond the snapshot date. The software runs on a traditional LAMP (Linux, Apache, MySQL, PHP) server and is available on GitHub (http://github.com/opendatarepository) under a GPLv2 open source license. The goal of the Open Data Repository is to lower the cost and training barrier to entry so that any researcher can easily publish their data and ensure it is archived for posterity.
We gratefully acknowledge the support for this study by the Science-Enabling Research Activity (SERA), and NASA NNX11AP82A, Mars Science Laboratory Investigations and University of Arizona Geosciences.
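    The field-level permission model described above can be sketched in a few lines. The levels, field names, and record layout below are invented for illustration and are not the Data Publisher's actual API.

```python
# Hypothetical sketch of field-level publication control, in the spirit of
# the Data Publisher permission system. Levels and fields are illustrative.

PUBLIC, REGISTERED, COLLABORATOR, OWNER = 0, 1, 2, 3

def visible_fields(record, user_level=PUBLIC):
    """Return only the fields whose required access level the user meets."""
    return {name: value
            for name, (value, required) in record.items()
            if user_level >= required}

# Each field pairs a value with the minimum level needed to view it.
sample = {
    "mineral_name": ("olivine", PUBLIC),
    "raman_spectrum": ("spectrum.csv", REGISTERED),
    "field_notes": ("draft notes", OWNER),
}

print(visible_fields(sample))              # public view
print(visible_fields(sample, REGISTERED))  # registered-user view
```

    Finer-grained policies (per-user grants, embargo dates) would extend the same check rather than replace it.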

  7. The plant phenological online database (PPODB): an online database for long-term phenological data

    NASA Astrophysics Data System (ADS)

    Dierenbach, Jonas; Badeck, Franz-W.; Schaber, Jörg

    2013-09-01

    We present an online database that provides unrestricted and free access to over 16 million plant phenological observations from over 8,000 stations in Central Europe between the years 1880 and 2009. Unique features are (1) a flexible and unrestricted access to a full-fledged database, allowing for a wide range of individual queries and data retrieval, (2) historical data for Germany before 1951 ranging back to 1880, and (3) more than 480 curated long-term time series covering more than 100 years for individual phenological phases and plants combined over Natural Regions in Germany. Time series for single stations or Natural Regions can be accessed through a user-friendly graphical geo-referenced interface. The joint databases made available with the plant phenological database PPODB render accessible an important data source for further analyses of long-term changes in phenology. The database can be accessed via www.ppodb.de.
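    A minimal sketch of the kind of trend analysis such long-term series support, using an ordinary least-squares slope. The flowering-onset observations below are invented, not PPODB data.

```python
# Illustrative only: least-squares trend on a made-up phenological series,
# the kind of analysis long-term time series like PPODB's enable.

def trend_days_per_decade(years, doy):
    """Ordinary least-squares slope of day-of-year vs. year, per decade."""
    n = len(years)
    my = sum(years) / n
    md = sum(doy) / n
    num = sum((y - my) * (d - md) for y, d in zip(years, doy))
    den = sum((y - my) ** 2 for y in years)
    return 10 * num / den

# Hypothetical flowering-onset observations (year, day of year).
years = [1950, 1960, 1970, 1980, 1990, 2000]
doy = [130, 129, 127, 126, 124, 123]
print(round(trend_days_per_decade(years, doy), 2))  # negative = earlier onset
```

    A negative slope indicates an advancing (earlier) phenological phase over the period.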

  8. 47 CFR 15.711 - Interference avoidance methods.

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... channel availability for a TVBD is determined based on the geo-location and database access method described in paragraphs (a) and (b) of this section. (a) Geo-location and database access. A TVBD shall rely on the geo-location and database access mechanism to identify available television channels...

  9. 47 CFR 15.711 - Interference avoidance methods.

    Code of Federal Regulations, 2013 CFR

    2013-10-01

    ... channel availability for a TVBD is determined based on the geo-location and database access method described in paragraphs (a) and (b) of this section. (a) Geo-location and database access. A TVBD shall rely on the geo-location and database access mechanism to identify available television channels...

  10. 47 CFR 15.711 - Interference avoidance methods.

    Code of Federal Regulations, 2014 CFR

    2014-10-01

    ... channel availability for a TVBD is determined based on the geo-location and database access method described in paragraphs (a) and (b) of this section. (a) Geo-location and database access. A TVBD shall rely on the geo-location and database access mechanism to identify available television channels...

  11. 47 CFR 15.711 - Interference avoidance methods.

    Code of Federal Regulations, 2012 CFR

    2012-10-01

    ... channel availability for a TVBD is determined based on the geo-location and database access method described in paragraphs (a) and (b) of this section. (a) Geo-location and database access. A TVBD shall rely on the geo-location and database access mechanism to identify available television channels...

  12. Towards structured sharing of raw and derived neuroimaging data across existing resources

    PubMed Central

    Keator, D.B.; Helmer, K.; Steffener, J.; Turner, J.A.; Van Erp, T.G.M.; Gadde, S.; Ashish, N.; Burns, G.A.; Nichols, B.N.

    2013-01-01

    Data sharing efforts increasingly contribute to the acceleration of scientific discovery. Neuroimaging data is accumulating in distributed domain-specific databases and there is currently no integrated access mechanism nor an accepted format for the critically important meta-data that is necessary for making use of the combined, available neuroimaging data. In this manuscript, we present work from the Derived Data Working Group, an open-access group sponsored by the Biomedical Informatics Research Network (BIRN) and the International Neuroimaging Coordinating Facility (INCF) focused on practical tools for distributed access to neuroimaging data. The working group develops models and tools facilitating the structured interchange of neuroimaging meta-data and is making progress towards a unified set of tools for such data and meta-data exchange. We report on the key components required for integrated access to raw and derived neuroimaging data as well as associated meta-data and provenance across neuroimaging resources. The components include (1) a structured terminology that provides semantic context to data, (2) a formal data model for neuroimaging with robust tracking of data provenance, (3) a web service-based application programming interface (API) that provides a consistent mechanism to access and query the data model, and (4) a provenance library that can be used for the extraction of provenance data by image analysts and imaging software developers. We believe that the framework and set of tools outlined in this manuscript have great potential for solving many of the issues the neuroimaging community faces when sharing raw and derived neuroimaging data across the various existing database systems for the purpose of accelerating scientific discovery. PMID:23727024
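    The provenance-tracking component described above can be illustrated with a toy record structure; the classes and field names here are invented for illustration and are not the working group's actual data model.

```python
from dataclasses import dataclass, field
from typing import List

# Invented sketch of a provenance chain of the kind such tools track:
# each derived image keeps the ordered processing steps that produced it.

@dataclass
class Step:
    tool: str
    version: str
    parameters: dict

@dataclass
class DerivedImage:
    name: str
    source: str
    steps: List[Step] = field(default_factory=list)

    def lineage(self):
        """Human-readable provenance chain, source first."""
        chain = [self.source] + [f"{s.tool} {s.version}" for s in self.steps]
        return " -> ".join(chain)

img = DerivedImage("smoothed_bold", "raw_bold.nii")
img.steps.append(Step("motion_correct", "2.1", {"ref_volume": 0}))
img.steps.append(Step("smooth", "2.1", {"fwhm_mm": 6}))
print(img.lineage())
```

    Recording tool versions and parameters alongside the data is what makes a derived result reproducible by a third party.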

  13. Database of Industrial Technological Information in Kanagawa : Networks for Technology Activities

    NASA Astrophysics Data System (ADS)

    Saito, Akira; Shindo, Tadashi

    This system is a member-participation database premised on opening all of its data. Aiming at free technological cooperation and exchange among industries, it was constructed by Kanagawa Prefecture in collaboration with enterprises located there. The input data comprise 36 items that technologically characterize each enterprise, such as major products, special or advantageous technologies, technologies sought for cooperation, and facilities and equipment. Entries are written in up to 2,000 characters of natural language, including Kanji, except for some coded items. Twenty-four search items can be queried in natural language, enabling extensive searching in addition to interactive, menu-driven procedures. The information service started in October 1986, covering data from 2,000 enterprises.

  14. Isfahan MISP Dataset

    PubMed Central

    Kashefpur, Masoud; Kafieh, Rahele; Jorjandi, Sahar; Golmohammadi, Hadis; Khodabande, Zahra; Abbasi, Mohammadreza; Teifuri, Nilufar; Fakharzadeh, Ali Akbar; Kashefpoor, Maryam; Rabbani, Hossein

    2017-01-01

    An online depository was introduced to share clinical ground truth with the public and provide open access for researchers to evaluate their computer-aided algorithms. PHP was used for web programming and MySQL for database managing. The website was entitled “biosigdata.com.” It was a fast, secure, and easy-to-use online database for medical signals and images. Freely registered users could download the datasets and could also share their own supplementary materials while maintaining their privacy (citation and fee). Commenting was also available for all datasets, and automatic sitemap and semi-automatic SEO indexing have been set for the site. A comprehensive list of available websites for medical datasets is also presented as a Supplementary (http://journalonweb.com/tempaccess/4800.584.JMSS_55_16I3253.pdf). PMID:28487832

  15. 10 years experience with pioneering open access publishing in health informatics: the Journal of Medical Internet Research (JMIR).

    PubMed

    Eysenbach, Gunther

    2010-01-01

    Peer-reviewed journals remain important vehicles for knowledge transfer and dissemination in health informatics, yet, their format, processes and business models are changing only slowly. Up to the end of last century, it was common for individual researchers and scientific organizations to leave the business of knowledge transfer to professional publishers, signing away their rights to the works in the process, which in turn impeded wider dissemination. Traditional medical informatics journals are poorly cited and the visibility and uptake of articles beyond the medical informatics community remain limited. In 1999, the Journal of Medical Internet Research (JMIR; http://www.jmir.org) was launched, featuring several innovations including 1) ownership and copyright retained by the authors, 2) electronic-only, "lean" not-for-profit publishing, 3) openly accessible articles with a reversed business model (author pays instead of reader pays), 4) technological innovations such as automatic XML tagging and reference checking, on-the-fly PDF generation from XML, etc., enabling wide distribution in various bibliographic and full-text databases. In the past 10 years, despite limited resources, the journal has emerged as a leading journal in health informatics, and is presently ranked the top journal in the medical informatics and health services research categories by impact factor. The paper summarizes some of the features of the Journal, and uses bibliometric and access data to compare the influence of the Journal on the discipline of medical informatics and other disciplines. While traditional medical informatics journals are primarily cited by other Medical Informatics journals (33%-46% of citations), JMIR papers are more often cited by "end-users" (policy, public health, clinical journals), which may be partly attributable to the "open access advantage".

  16. Open Access, Open Source and Digital Libraries: A Current Trend in University Libraries around the World

    ERIC Educational Resources Information Center

    Krishnamurthy, M.

    2008-01-01

    Purpose: The purpose of this paper is to describe the open access and open source movement in the digital library world. Design/methodology/approach: A review of key developments in the open access and open source movement is provided. Findings: Open source software and open access to research findings are of great use to scholars in developing…

  17. The Fabric for Frontier Experiments Project at Fermilab

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Kirby, Michael

    2014-01-01

    The FabrIc for Frontier Experiments (FIFE) project is a new, far-reaching initiative within the Fermilab Scientific Computing Division to drive the future of computing services for experiments at FNAL and elsewhere. It is a collaborative effort between computing professionals and experiment scientists to produce an end-to-end, fully integrated set of services for computing on the grid and clouds, managing data, accessing databases, and collaborating within experiments. FIFE includes 1) easy to use job submission services for processing physics tasks on the Open Science Grid and elsewhere, 2) an extensive data management system for managing local and remote caches, cataloging, querying, moving, and tracking the use of data, 3) custom and generic database applications for calibrations, beam information, and other purposes, 4) collaboration tools including an electronic log book, speakers bureau database, and experiment membership database. All of these aspects will be discussed in detail. FIFE sets the direction of computing at Fermilab experiments now and in the future, and therefore is a major driver in the design of computing services worldwide.

  18. Improving agricultural knowledge management: The AgTrials experience

    PubMed Central

    Hyman, Glenn; Espinosa, Herlin; Camargo, Paola; Abreu, David; Devare, Medha; Arnaud, Elizabeth; Porter, Cheryl; Mwanzia, Leroy; Sonder, Kai; Traore, Sibiry

    2017-01-01

    Background: Opportunities to use data and information to address challenges in international agricultural research and development are expanding rapidly. The use of agricultural trial and evaluation data has enormous potential to improve crops and management practices. However, for a number of reasons, this potential has yet to be realized. This paper reports on the experience of the AgTrials initiative, an effort to build an online database of agricultural trials applying principles of interoperability and open access. Methods: Our analysis evaluates what worked and what did not work in the development of the AgTrials information resource. We analyzed data on our users and their interaction with the platform. We also surveyed our users to gauge their perceptions of the utility of the online database. Results: The study revealed barriers to participation and impediments to interaction, opportunities for improving agricultural knowledge management and a large potential for the use of trial and evaluation data.  Conclusions: Technical and logistical mechanisms for developing interoperable online databases are well advanced.  More effort will be needed to advance organizational and institutional work for these types of databases to realize their potential. PMID:28580127

  19. A comparative cellular and molecular biology of longevity database.

    PubMed

    Stuart, Jeffrey A; Liang, Ping; Luo, Xuemei; Page, Melissa M; Gallagher, Emily J; Christoff, Casey A; Robb, Ellen L

    2013-10-01

    Discovering key cellular and molecular traits that promote longevity is a major goal of aging and longevity research. One experimental strategy is to determine which traits have been selected during the evolution of longevity in naturally long-lived animal species. This comparative approach has been applied to lifespan research for nearly four decades, yielding hundreds of datasets describing aspects of cell and molecular biology hypothesized to relate to animal longevity. Here, we introduce a Comparative Cellular and Molecular Biology of Longevity Database, available at http://genomics.brocku.ca/ccmbl/, as a compendium of comparative cell and molecular data presented in the context of longevity. This open access database will facilitate the meta-analysis of amalgamated datasets using standardized maximum lifespan (MLSP) data (from AnAge). The first edition contains over 800 data records describing experimental measurements of cellular stress resistance, reactive oxygen species metabolism, membrane composition, protein homeostasis, and genome homeostasis as they relate to vertebrate species MLSP. The purpose of this review is to introduce the database and briefly demonstrate its use in the meta-analysis of combined datasets.
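    A toy version of the meta-analysis such a database enables: correlating a hypothetical cellular trait with log-transformed MLSP across species. All values below are invented, not records from the database.

```python
import math

# Sketch only: Pearson correlation of an invented stress-resistance score
# against log10 of species maximum lifespan (MLSP), the comparative-biology
# style of analysis the database is built to support.

def pearson(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical species records: (MLSP in years, stress-resistance score).
records = [(3, 1.0), (8, 1.8), (20, 2.1), (40, 2.9), (120, 3.8)]
log_mlsp = [math.log10(m) for m, _ in records]
trait = [t for _, t in records]
r = pearson(log_mlsp, trait)
print(round(r, 2))
```

    Log-transforming MLSP before correlating is common because lifespans span two orders of magnitude across vertebrates.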

  20. Text mining for metabolic pathways, signaling cascades, and protein networks.

    PubMed

    Hoffmann, Robert; Krallinger, Martin; Andres, Eduardo; Tamames, Javier; Blaschke, Christian; Valencia, Alfonso

    2005-05-10

    The complexity of the information stored in databases and publications on metabolic and signaling pathways, the high throughput of experimental data, and the growing number of publications make it imperative to provide systems to help the researcher navigate through these interrelated information resources. Text-mining methods have started to play a key role in the creation and maintenance of links between the information stored in biological databases and its original sources in the literature. These links will be extremely useful for database updating and curation, especially if a number of technical problems can be solved satisfactorily, including the identification of protein and gene names (entities in general) and the characterization of their types of interactions. The first generation of openly accessible text-mining systems, such as iHOP (Information Hyperlinked over Proteins), provides additional functions to facilitate the reconstruction of protein interaction networks, combine database and text information, and support the scientist in the formulation of novel hypotheses. The next challenge is the generation of comprehensive information regarding the general function of signaling pathways and protein interaction networks.
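    The sentence-level co-occurrence step underlying systems like iHOP can be sketched as follows. The protein lexicon and abstract text are invented, and real systems use far more sophisticated entity recognition than this uppercase-token filter.

```python
import itertools
import re
from collections import Counter

# Toy co-occurrence extraction: count protein-name pairs appearing in the
# same sentence. Lexicon and text are invented for illustration.

LEXICON = {"TP53", "MDM2", "BRCA1", "AKT1"}

def cooccurrences(text):
    pairs = Counter()
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        found = sorted({w for w in re.findall(r"\b[A-Z0-9]+\b", sentence)
                        if w in LEXICON})
        for a, b in itertools.combinations(found, 2):
            pairs[(a, b)] += 1
    return pairs

abstract = ("MDM2 binds TP53 and promotes its degradation. "
            "BRCA1 expression was unchanged. "
            "TP53 stabilization inhibits MDM2 feedback.")
print(cooccurrences(abstract))
```

    Co-occurrence alone says nothing about the *type* of interaction; as the abstract above notes, characterizing interaction types is one of the harder open problems.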

  1. Open-access evidence database of controlled trials and systematic reviews in youth mental health.

    PubMed

    De Silva, Stefanie; Bailey, Alan P; Parker, Alexandra G; Montague, Alice E; Hetrick, Sarah E

    2018-06-01

    To present an update to an evidence-mapping project that consolidates the evidence base of interventions in youth mental health. To promote dissemination of this resource, the evidence map has been translated into a free online database (https://orygen.org.au/Campus/Expert-Network/Evidence-Finder or https://headspace.org.au/research-database/). Included studies are extensively indexed to facilitate searching. A systematic search for prevention and treatment studies in young people (mean age 6-25 years) is conducted annually using Embase, MEDLINE, PsycINFO and the Cochrane Library. Included studies are restricted to controlled trials and systematic reviews published since 1980. To date, 221 866 publications have been screened, of which 2680 have been included in the database. Updates are conducted annually. This shared resource can be utilized to substantially reduce the amount of time involved with conducting literature searches. It is designed to promote the uptake of evidence-based practice and facilitate research to address gaps in youth mental health. © 2017 John Wiley & Sons Australia, Ltd.

  2. The International Experimental Thermal Hydraulic Systems database – TIETHYS: A new NEA validation tool

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Rohatgi, Upendra S.

    Nuclear reactor codes require validation with appropriate data representing the plant for specific scenarios. The thermal-hydraulic data is scattered in different locations and in different formats. Some of the data is in danger of being lost. A relational database is being developed to organize the international thermal hydraulic test data for various reactor concepts and different scenarios. At the reactor system level, that data is organized to include separate effect tests and integral effect tests for specific scenarios and corresponding phenomena. The database relies on the phenomena identification sections of expert developed PIRTs. The database will provide a summary of appropriate data, review of facility information, test description, instrumentation, references for the experimental data and some examples of application of the data for validation. The current database platform includes scenarios for PWR, BWR, VVER, and specific benchmarks for CFD modelling data and is to be expanded to include references for molten salt reactors. There are place holders for high temperature gas cooled reactors, CANDU and liquid metal reactors. This relational database is called The International Experimental Thermal Hydraulic Systems (TIETHYS) database and currently resides at Nuclear Energy Agency (NEA) of the OECD and is freely open to public access. Going forward the database will be extended to include additional links and data as they become available. https://www.oecd-nea.org/tiethysweb/
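    The relational organization described above (facilities, tests, PIRT phenomena) might be sketched with a toy schema; every table, column, and row below is invented and is not the actual TIETHYS schema.

```python
import sqlite3

# Invented minimal relational sketch of a validation database linking
# facilities, experiments (separate/integral effect tests), and phenomena.

con = sqlite3.connect(":memory:")
con.executescript("""
CREATE TABLE facility (id INTEGER PRIMARY KEY, name TEXT, reactor_type TEXT);
CREATE TABLE test (id INTEGER PRIMARY KEY,
                   facility_id INTEGER REFERENCES facility(id),
                   name TEXT, kind TEXT);  -- 'SET' or 'IET'
CREATE TABLE phenomenon (id INTEGER PRIMARY KEY, label TEXT);
CREATE TABLE test_phenomenon (test_id INTEGER, phenomenon_id INTEGER);
""")
con.executemany("INSERT INTO facility VALUES (?, ?, ?)",
                [(1, "Facility A", "PWR"), (2, "Facility B", "BWR")])
con.executemany("INSERT INTO test VALUES (?, ?, ?, ?)",
                [(1, 1, "Blowdown test", "IET"), (2, 2, "CCFL test", "SET")])
con.executemany("INSERT INTO phenomenon VALUES (?, ?)",
                [(1, "critical flow"), (2, "counter-current flow limitation")])
con.executemany("INSERT INTO test_phenomenon VALUES (?, ?)", [(1, 1), (2, 2)])

# Which tests cover a given phenomenon, and where were they run?
rows = con.execute("""
    SELECT t.name, f.name FROM test t
    JOIN facility f ON f.id = t.facility_id
    JOIN test_phenomenon tp ON tp.test_id = t.id
    JOIN phenomenon p ON p.id = tp.phenomenon_id
    WHERE p.label = 'critical flow'
""").fetchall()
print(rows)
```

    The join from phenomenon back to facility is what lets a code validator start from a PIRT entry and find every relevant dataset.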

  3. Concentrations of indoor pollutants (CIP) database user's manual (Version 4. 0)

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Apte, M.G.; Brown, S.R.; Corradi, C.A.

    1990-10-01

    This is the latest release of the database and the user manual. The user manual is a tutorial and reference for utilizing the CIP Database system. An installation guide is included to cover various hardware configurations. Numerous examples and explanations of the dialogue between the user and the database program are provided. It is hoped that this resource will, along with on-line help and the menu-driven software, make for a quick and easy learning curve. For the purposes of this manual, it is assumed that the user is acquainted with the goals of the CIP Database, which are: (1) to collect existing measurements of concentrations of indoor air pollutants in a user-oriented database and (2) to provide a repository of references citing measured field results openly accessible to a wide audience of researchers, policy makers, and others interested in the issues of indoor air quality. The database software, as distinct from the data, is contained in two files, CIP.EXE and PFIL.COM. CIP.EXE is made up of a number of programs written in dBase III command code and compiled using Clipper into a single, executable file. PFIL.COM is a program written in Turbo Pascal that handles the output of summary text files and is called from CIP.EXE. Version 4.0 of the CIP Database is current through March 1990.

  4. Plant Reactome: a resource for plant pathways and comparative analysis

    PubMed Central

    Naithani, Sushma; Preece, Justin; D'Eustachio, Peter; Gupta, Parul; Amarasinghe, Vindhya; Dharmawardhana, Palitha D.; Wu, Guanming; Fabregat, Antonio; Elser, Justin L.; Weiser, Joel; Keays, Maria; Fuentes, Alfonso Munoz-Pomer; Petryszak, Robert; Stein, Lincoln D.; Ware, Doreen; Jaiswal, Pankaj

    2017-01-01

    Plant Reactome (http://plantreactome.gramene.org/) is a free, open-source, curated plant pathway database portal, provided as part of the Gramene project. The database provides intuitive bioinformatics tools for the visualization, analysis and interpretation of pathway knowledge to support genome annotation, genome analysis, modeling, systems biology, basic research and education. Plant Reactome employs the structural framework of a plant cell to show metabolic, transport, genetic, developmental and signaling pathways. We manually curate molecular details of pathways in these domains for reference species Oryza sativa (rice) supported by published literature and annotation of well-characterized genes. Two hundred twenty-two rice pathways, 1025 reactions associated with 1173 proteins, 907 small molecules and 256 literature references have been curated to date. These reference annotations were used to project pathways for 62 model, crop and evolutionarily significant plant species based on gene homology. Database users can search and browse various components of the database, visualize curated baseline expression of pathway-associated genes provided by the Expression Atlas and upload and analyze their Omics datasets. The database also offers data access via Application Programming Interfaces (APIs) and in various standardized pathway formats, such as SBML and BioPAX. PMID:27799469

  5. The BioGRID interaction database: 2017 update

    PubMed Central

    Chatr-aryamontri, Andrew; Oughtred, Rose; Boucher, Lorrie; Rust, Jennifer; Chang, Christie; Kolas, Nadine K.; O'Donnell, Lara; Oster, Sara; Theesfeld, Chandra; Sellam, Adnane; Stark, Chris; Breitkreutz, Bobby-Joe; Dolinski, Kara; Tyers, Mike

    2017-01-01

    The Biological General Repository for Interaction Datasets (BioGRID: https://thebiogrid.org) is an open access database dedicated to the annotation and archival of protein, genetic and chemical interactions for all major model organism species and humans. As of September 2016 (build 3.4.140), the BioGRID contains 1 072 173 genetic and protein interactions, and 38 559 post-translational modifications, as manually annotated from 48 114 publications. This dataset represents interaction records for 66 model organisms and represents a 30% increase compared to the previous 2015 BioGRID update. BioGRID curates the biomedical literature for major model organism species, including humans, with a recent emphasis on central biological processes and specific human diseases. To facilitate network-based approaches to drug discovery, BioGRID now incorporates 27 501 chemical–protein interactions for human drug targets, as drawn from the DrugBank database. A new dynamic interaction network viewer allows the easy navigation and filtering of all genetic and protein interaction data, as well as for bioactive compounds and their established targets. BioGRID data are directly downloadable without restriction in a variety of standardized formats and are freely distributed through partner model organism databases and meta-databases. PMID:27980099
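    BioGRID's unrestricted downloads are tab-delimited; the toy parser below uses an invented, simplified four-column layout (gene A, gene B, experimental system, PubMed ID) rather than the real BioGRID column set, just to show the shape of such processing.

```python
from collections import Counter

# Toy parser for a simplified, invented tab-delimited interaction format,
# NOT the actual BioGRID TAB layout: geneA, geneB, experimental system, PMID.

data = """\
TP53\tMDM2\tTwo-hybrid\t123456
TP53\tEP300\tAffinity Capture-MS\t123457
ACT1\tMYO2\tSynthetic Lethality\t123458
"""

def count_by_system(tab_text):
    """Tally interaction records by experimental system."""
    systems = Counter()
    for line in tab_text.splitlines():
        if not line.strip():
            continue
        _a, _b, system, _pmid = line.split("\t")
        systems[system] += 1
    return systems

print(count_by_system(data))
```

    Grouping by experimental system is a common first filter, since physical and genetic evidence types are usually analyzed separately.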

  6. Making your database available through Wikipedia: the pros and cons.

    PubMed

    Finn, Robert D; Gardner, Paul P; Bateman, Alex

    2012-01-01

    Wikipedia, the online encyclopedia, is the most famous wiki in use today. It contains over 3.7 million pages of content; with many pages written on scientific subject matters that include peer-reviewed citations, yet are written in an accessible manner and generally reflect the consensus opinion of the community. In this, the 19th Annual Database Issue of Nucleic Acids Research, there are 11 articles that describe the use of a wiki in relation to a biological database. In this commentary, we discuss how biological databases can be integrated with Wikipedia, thereby utilising the pre-existing infrastructure, tools and above all, large community of authors (or Wikipedians). The limitations to the content that can be included in Wikipedia are highlighted, with examples drawn from articles found in this issue and other wiki-based resources, indicating why other wiki solutions are necessary. We discuss the merits of using open wikis, like Wikipedia, versus other models, with particular reference to potential vandalism. Finally, we raise the question about the future role of dedicated database biocurators in context of the thousands of crowdsourced, community annotations that are now being stored in wikis.

  7. Viral taxonomy needs a spring clean; its exploration era is over.

    PubMed

    Gibbs, Adrian J

    2013-08-09

The International Committee on Taxonomy of Viruses has recently changed its approved definition of a viral species, and also discontinued work on its database of virus descriptions. These events indicate that the exploration era of viral taxonomy has ended; over the past century the principles of viral taxonomy have been established, the tools for phylogenetic inference invented, and the ultimate discriminatory data required for taxonomy, namely gene sequences, are now readily available. Further changes would make viral taxonomy more informative. First, the status of a 'taxonomic species' with an italicized name should only be given to viruses that are specifically linked with a single 'type genomic sequence', like those in the NCBI Reference Sequence Database. Secondly, all approved taxa should be predominantly monophyletic, and uninformative higher taxa disendorsed. These are 'quality assurance' measures and would improve the value of viral nomenclature to its users. The ICTV should also promote the use of a public database, such as Wikipedia, to replace the ICTV database as a store of the primary metadata of individual viruses, and should publish abstracts of the ICTV Reports in that database, so that they are 'Open Access'.

  8. Making your database available through Wikipedia: the pros and cons

    PubMed Central

    Finn, Robert D.; Gardner, Paul P.; Bateman, Alex

    2012-01-01

Wikipedia, the online encyclopedia, is the most famous wiki in use today. It contains over 3.7 million pages of content, with many pages written on scientific subjects that include peer-reviewed citations yet are written in an accessible manner and generally reflect the consensus opinion of the community. In this, the 19th Annual Database Issue of Nucleic Acids Research, there are 11 articles that describe the use of a wiki in relation to a biological database. In this commentary, we discuss how biological databases can be integrated with Wikipedia, thereby utilising the pre-existing infrastructure, tools and, above all, large community of authors (or Wikipedians). The limitations to the content that can be included in Wikipedia are highlighted, with examples drawn from articles found in this issue and other wiki-based resources, indicating why other wiki solutions are necessary. We discuss the merits of using open wikis, like Wikipedia, versus other models, with particular reference to potential vandalism. Finally, we raise the question of the future role of dedicated database biocurators in the context of the thousands of crowdsourced, community annotations that are now being stored in wikis. PMID:22144683

  9. Freely Accessible Chemical Database Resources of Compounds for in Silico Drug Discovery.

    PubMed

    Yang, JingFang; Wang, Di; Jia, Chenyang; Wang, Mengyao; Hao, GeFei; Yang, GuangFu

    2018-05-07

In silico drug discovery has proved to be a solidly established key component of early drug discovery. However, this task is hampered by limitations in the quantity and quality of compound databases for screening. To overcome these obstacles, freely accessible compound database resources have bloomed in recent years. Nevertheless, choosing appropriate tools to work with these freely accessible databases is crucial. To the best of our knowledge, this is the first systematic review of this issue. In this review, the advantages and drawbacks of chemical databases were analyzed and summarized, based on six categories of freely accessible chemical databases collected from the literature. Suggestions on how, and under which conditions, the use of these databases is reasonable were provided. Tools and procedures for building 3D-structure chemical libraries were also introduced. In summary, we describe the freely accessible chemical database resources for in silico drug discovery. In particular, the chemical information available for building chemical databases is an attractive resource for drug design, alleviating experimental pressure. Copyright© Bentham Science Publishers.

  10. Correlates of Access to Business Research Databases

    ERIC Educational Resources Information Center

    Gottfried, John C.

    2010-01-01

    This study examines potential correlates of business research database access through academic libraries serving top business programs in the United States. Results indicate that greater access to research databases is related to enrollment in graduate business programs, but not to overall enrollment or status as a public or private institution.…

  11. 47 CFR 51.217 - Nondiscriminatory access: Telephone numbers, operator services, directory assistance services...

    Code of Federal Regulations, 2011 CFR

    2011-10-01

    ... to have access to its directory assistance services, including directory assistance databases, so... provider, including transfer of the LECs' directory assistance databases in readily accessible magnetic.... Updates to the directory assistance database shall be made in the same format as the initial transfer...

  12. A Web-Based Database for Nurse Led Outreach Teams (NLOT) in Toronto.

    PubMed

    Li, Shirley; Kuo, Mu-Hsing; Ryan, David

    2016-01-01

    A web-based system can provide access to real-time data and information. Healthcare is moving towards digitizing patients' medical information and securely exchanging it through web-based systems. In one of Ontario's health regions, Nurse Led Outreach Teams (NLOT) provide emergency mobile nursing services to help reduce unnecessary transfers from long-term care homes to emergency departments. Currently the NLOT team uses a Microsoft Access database to keep track of the health information on the residents that they serve. The Access database lacks scalability, portability, and interoperability. The objective of this study is the development of a web-based database using Oracle Application Express that is easily accessible from mobile devices. The web-based database will allow NLOT nurses to enter and access resident information anytime and from anywhere.

  13. Advancements in web-database applications for rabies surveillance.

    PubMed

    Rees, Erin E; Gendron, Bruno; Lelièvre, Frédérick; Coté, Nathalie; Bélanger, Denise

    2011-08-02

Protection of public health from rabies is informed by the analysis of surveillance data from human and animal populations. In Canada, public health, agricultural and wildlife agencies at the provincial and federal level are responsible for rabies disease control, and this has led to multiple agency-specific data repositories. Aggregation of agency-specific data into one database application would enable more comprehensive data analyses and effective communication among participating agencies. In Québec, RageDB was developed to house surveillance data for the raccoon rabies variant, representing the next generation in web-based database applications that provide a key resource for the protection of public health. RageDB incorporates data from, and grants access to, all agencies responsible for the surveillance of raccoon rabies in Québec. Technological advancements of RageDB over previous rabies surveillance databases include (1) automatic integration of multi-agency data and diagnostic results on a daily basis; (2) a web-based data editing interface that enables authorized users to add, edit and extract data; and (3) an interactive dashboard to help visualize data simply and efficiently, in table, chart, and cartographic formats. Furthermore, RageDB stores data from citizens who voluntarily report sightings of rabies-suspect animals. We also discuss how sightings data can indicate public perception of the risk of raccoon rabies and thus aid in directing the allocation of disease control resources for protecting public health. RageDB provides an example in the evolution of spatio-temporal database applications for the storage, analysis and communication of disease surveillance data. The database was fast and inexpensive to develop by using open-source technologies, simple and efficient design strategies, and shared web hosting. The database increases communication among agencies collaborating to protect human health from raccoon rabies. Furthermore, health agencies have real-time access to a wide assortment of data documenting new developments in the raccoon rabies epidemic, and this enables a more timely and appropriate response.

  14. Advancements in web-database applications for rabies surveillance

    PubMed Central

    2011-01-01

Background Protection of public health from rabies is informed by the analysis of surveillance data from human and animal populations. In Canada, public health, agricultural and wildlife agencies at the provincial and federal level are responsible for rabies disease control, and this has led to multiple agency-specific data repositories. Aggregation of agency-specific data into one database application would enable more comprehensive data analyses and effective communication among participating agencies. In Québec, RageDB was developed to house surveillance data for the raccoon rabies variant, representing the next generation in web-based database applications that provide a key resource for the protection of public health. Results RageDB incorporates data from, and grants access to, all agencies responsible for the surveillance of raccoon rabies in Québec. Technological advancements of RageDB over previous rabies surveillance databases include 1) automatic integration of multi-agency data and diagnostic results on a daily basis; 2) a web-based data editing interface that enables authorized users to add, edit and extract data; and 3) an interactive dashboard to help visualize data simply and efficiently, in table, chart, and cartographic formats. Furthermore, RageDB stores data from citizens who voluntarily report sightings of rabies-suspect animals. We also discuss how sightings data can indicate public perception of the risk of raccoon rabies and thus aid in directing the allocation of disease control resources for protecting public health. Conclusions RageDB provides an example in the evolution of spatio-temporal database applications for the storage, analysis and communication of disease surveillance data. The database was fast and inexpensive to develop by using open-source technologies, simple and efficient design strategies, and shared web hosting. The database increases communication among agencies collaborating to protect human health from raccoon rabies. Furthermore, health agencies have real-time access to a wide assortment of data documenting new developments in the raccoon rabies epidemic, and this enables a more timely and appropriate response. PMID:21810215

  15. Open access for operational research publications from low- and middle-income countries: who pays?

    PubMed Central

    Kumar, A. M. V.; Reid, A. J.; Van den Bergh, R.; Isaakidis, P.; Draguez, B.; Delaunois, P.; Nagaraja, S. B.; Ramsay, A.; Reeder, J. C.; Denisiuk, O.; Ali, E.; Khogali, M.; Hinderaker, S. G.; Kosgei, R. J.; van Griensven, J.; Quaglio, G. L.; Maher, D.; Billo, N. E.; Terry, R. F.; Harries, A. D.

    2014-01-01

    Open-access journal publications aim to ensure that new knowledge is widely disseminated and made freely accessible in a timely manner so that it can be used to improve people's health, particularly those in low- and middle-income countries. In this paper, we briefly explain the differences between closed- and open-access journals, including the evolving idea of the ‘open-access spectrum’. We highlight the potential benefits of supporting open access for operational research, and discuss the conundrum and ways forward as regards who pays for open access. PMID:26400799

  16. The continued movement for open access to peer-reviewed literature.

    PubMed

    Liesegang, Thomas J

    2013-09-01

To provide a current overview of the movement for open access to the peer-reviewed literature. Perspective. Literature review of recent advances in the open access movement, with a personal viewpoint on the nuances of the movement. The open access movement is complex, with many different constituents. The idealists of the open access movement are seeking open access not only to the literature but also to the data that constitute the research within the manuscript. The business model of the traditional subscription journal is being scrutinized in relation to the surge in the number of open access journals. Within this environment, authors should beware of predatory practices. More governments and funding agencies are mandating open access to the research they fund. This open access movement will continue to be disruptive until a business model ensures continuity of the scientific record. A flood of open access articles that might enrich, but also might pollute or confuse, the medical literature has altered the filtering mechanism provided by the traditional peer review system. At some point there may be a shake-out, with some literature being lost in cyberspace. The open access movement is maturing and must be embraced in some format. The challenge is to establish a sustainable financial business model that will permit the use of digital technology yet not endanger the decades-old traditional publication model and peer review system. Authors seem to be slower in adopting open access than the idealists in the movement. Copyright © 2013 Elsevier Inc. All rights reserved.

  17. Progress on the FabrIc for Frontier Experiments project at Fermilab

    DOE PAGES

    Box, Dennis; Boyd, Joseph; Dykstra, Dave; ...

    2015-12-23

The FabrIc for Frontier Experiments (FIFE) project is an ambitious, major-impact initiative within the Fermilab Scientific Computing Division designed to lead the computing model for Fermilab experiments. FIFE is a collaborative effort between experimenters and computing professionals to design and develop integrated computing models for experiments of varying needs and infrastructure. The major focus of the FIFE project is the development, deployment, and integration of Open Science Grid solutions for high throughput computing, data management, database access and collaboration within experiments. To accomplish this goal, FIFE has developed workflows that utilize Open Science Grid sites along with dedicated and commercial cloud resources. The FIFE project has made significant progress integrating several services into experiment computing operations, including new job submission services, software and reference data distribution through CVMFS repositories, a flexible data transfer client, and access to opportunistic resources on the Open Science Grid. Progress with current experiments and plans for expansion to additional projects are discussed. FIFE has taken a leading role in defining the computing model for Fermilab experiments, has aided in the design of computing for experiments beyond Fermilab, and will continue to define the future direction of high throughput computing for future physics experiments worldwide.

  18. "Big data" and "open data": What kind of access should researchers enjoy?

    PubMed

    Chatellier, Gilles; Varlet, Vincent; Blachier-Poisson, Corinne

    2016-02-01

The healthcare sector is currently facing a new paradigm, the explosion of "big data". Coupled with advances in computer technology, the field of "big data" appears promising, allowing us to better understand the natural history of diseases, to follow up on the implementation of new technologies (devices, drugs), and to participate in precision medicine, etc. Data sources are multiple (medical and administrative data, electronic medical records, data from rapidly developing technologies such as DNA sequencing, connected devices, etc.) and heterogeneous, while their use requires complex methods for accurate analysis. Moreover, faced with this new paradigm, we must determine who could (or should) have access to which data, how to combine collective interest and protection of personal data, and how to finance in the long term both operating costs and database interrogation. This article analyses the opportunities and challenges related to the use of open and/or "big data", from the viewpoint of pharmacologists and representatives of the pharmaceutical and medical device industry. Copyright © 2016 Société française de pharmacologie et de thérapeutique. Published by Elsevier Masson SAS. All rights reserved.

  19. The Future of the ASP Conference Series

    NASA Astrophysics Data System (ADS)

    Jensen, Joseph B.; Barnes, Jonathan; Moody, J. Ward; Szkody, Paula

    The Astronomical Society of the Pacific (ASP) has been publishing the proceedings of conferences in astronomy and astrophysics for more than 20 years. The ASP Conference Series (ASPCS) is widely known for its affordable and high quality printed volumes. The ASPCS is adapting to the changing market by making electronically published volumes available to subscribers around the world, including papers in the Astrophysics Data System (ADS) database, and allowing authors to post papers on e-print archives. We discuss the role of the printed book in our future plans, and how electronic publishing affects the types of products and services we offer. Recently there has been increasing pressure in the academic world for open access (electronic copies of scholarly publications made freely-available immediately after publication), and we discuss how the ASPCS is responding to the needs of the professional astronomical community, the ASP, and humanity at large. While we cannot yet provide full open access and stay in business, we are actively pursuing several initiatives to improve the quality of our product and the impact of the papers we publish.

  20. ForC: a global database of forest carbon stocks and fluxes

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Anderson-Teixeira, Kristina J.; Wang, Maria M. H.; McGarvey, Jennifer C.

Forests play an influential role in the global carbon (C) cycle, storing roughly half of terrestrial C and annually exchanging with the atmosphere more than ten times the carbon dioxide (CO 2) emitted by anthropogenic activities. Yet, scaling up from ground-based measurements of forest C stocks and fluxes to understand global scale C cycling and its climate sensitivity remains an important challenge. Tens of thousands of forest C measurements have been made, but these data have yet to be integrated into a single database that makes them accessible for integrated analyses. Here we present an open-access global Forest Carbon database (ForC) containing records of ground-based measurements of ecosystem-level C stocks and annual fluxes, along with disturbance history and methodological information. ForC expands upon the previously published tropical portion of this database, TropForC (DOI: 10.5061/dryad.t516f), now including 17,538 records (previously 3568) representing 2,731 plots (previously 845) in 826 geographically distinct areas (previously 178). The database covers all forested biogeographic and climate zones, represents forest stands of all ages, and includes 89 C cycle variables collected between 1934 and 2015. We expect that ForC will prove useful for macroecological analyses of forest C cycling, for evaluation of model predictions or remote sensing products, for quantifying the contribution of forests to the global C cycle, and for supporting international efforts to inventory forest carbon and greenhouse gas exchange. A dynamic version of ForC-db is maintained at https://github.com/forc-db, and we encourage the research community to collaborate in updating, correcting, expanding, and utilizing this database.

  1. Biopython: freely available Python tools for computational molecular biology and bioinformatics.

    PubMed

    Cock, Peter J A; Antao, Tiago; Chang, Jeffrey T; Chapman, Brad A; Cox, Cymon J; Dalke, Andrew; Friedberg, Iddo; Hamelryck, Thomas; Kauff, Frank; Wilczynski, Bartek; de Hoon, Michiel J L

    2009-06-01

The Biopython project is a mature open source international collaboration of volunteer developers, providing Python libraries for a wide range of bioinformatics problems. Biopython includes modules for reading and writing different sequence file formats and multiple sequence alignments, dealing with 3D macromolecular structures, interacting with common tools such as BLAST, ClustalW and EMBOSS, accessing key online databases, as well as providing numerical methods for statistical learning. Biopython is freely available, with documentation and source code at www.biopython.org under the Biopython license.
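Sequence-format parsing is one of the core tasks the abstract lists. As a self-contained illustration (using only the standard library, since Biopython may not be installed), here is a minimal FASTA reader that sketches the `(id, sequence)` records an iterator like Biopython's `Bio.SeqIO.parse(handle, "fasta")` yields; with Biopython available, prefer `SeqIO` directly.

```python
import io

def parse_fasta(handle):
    """Yield (record_id, sequence) pairs from a FASTA-format handle."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            # The record id is the first whitespace-delimited token.
            name, chunks = line[1:].split()[0], []
        elif line:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)

fasta = io.StringIO(">seq1 demo record\nATGGCC\nTTAA\n>seq2\nGGGCCC\n")
records = dict(parse_fasta(fasta))
print(records)  # {'seq1': 'ATGGCCTTAA', 'seq2': 'GGGCCC'}
```

This mirrors the iteration style of Biopython's parsers (one record at a time, lazily), which is what makes them practical on large database dumps.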

  2. NCBI2RDF: enabling full RDF-based access to NCBI databases.

    PubMed

    Anguita, Alberto; García-Remesal, Miguel; de la Iglesia, Diana; Maojo, Victor

    2013-01-01

    RDF has become the standard technology for enabling interoperability among heterogeneous biomedical databases. The NCBI provides access to a large set of life sciences databases through a common interface called Entrez. However, the latter does not provide RDF-based access to such databases, and, therefore, they cannot be integrated with other RDF-compliant databases and accessed via SPARQL query interfaces. This paper presents the NCBI2RDF system, aimed at providing RDF-based access to the complete NCBI data repository. This API creates a virtual endpoint for servicing SPARQL queries over different NCBI repositories and presenting to users the query results in SPARQL results format, thus enabling this data to be integrated and/or stored with other RDF-compliant repositories. SPARQL queries are dynamically resolved, decomposed, and forwarded to the NCBI-provided E-utilities programmatic interface to access the NCBI data. Furthermore, we show how our approach increases the expressiveness of the native NCBI querying system, allowing several databases to be accessed simultaneously. This feature significantly boosts productivity when working with complex queries and saves time and effort to biomedical researchers. Our approach has been validated with a large number of SPARQL queries, thus proving its reliability and enhanced capabilities in biomedical environments.
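The abstract describes SPARQL queries being decomposed and forwarded to NCBI's E-utilities programmatic interface. As a minimal sketch of the kind of E-utilities request such a system would ultimately dispatch, the snippet below builds an ESearch URL with the standard library; no network access is performed, and the query term is an arbitrary example.

```python
from urllib.parse import urlencode

EUTILS = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils"

def esearch_url(db, term, retmax=20):
    """Build an E-utilities ESearch query URL for the given NCBI database."""
    params = urlencode({
        "db": db,          # which Entrez database to search
        "term": term,      # the search expression
        "retmax": retmax,  # maximum number of IDs to return
        "retmode": "json", # response format
    })
    return f"{EUTILS}/esearch.fcgi?{params}"

url = esearch_url("pubmed", "open access[Title]")
print(url)
```

A mediator like the one described would generate many such calls, one per decomposed query fragment, and merge the returned ID lists into a single SPARQL result set.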

  3. Open Access Journal Policies: A Systematic Analysis of Radiology Journals.

    PubMed

    Narayan, Anand; Lobner, Katie; Fritz, Jan

    2018-02-01

    The open access movement has pushed for greater access to scientific knowledge by expanding access to scientific journal articles. There is limited information about the extent to which open access policies have been adopted by radiology journals. We performed a systematic analysis to ascertain the proportion of radiology journals with open access options. A search was performed with the assistance of a clinical informationist. Full and mixed English-language diagnostic and interventional radiology Web of Science journals (impact factors > 1.0) were included. Nuclear medicine, radiation oncology, physics, and solicitation-only journals were excluded. Primary outcome was open access option (yes or no) with additional outcomes including presence or absence of embargo, complete or partial copyright transfer, publication fees, and self-archiving policies. Secondary outcomes included journal citations, journal impact factors, immediacy, Eigenfactor, and article influence scores. Independent double readings were performed with differences resolved by consensus, supplemented by contacting editorial staff at each journal. In all, 125 journals were identified; review yielded 49 journals (39%, mean impact factor of 2.61). Thirty-six of the journals had open access options (73.4%), and four journals were exclusively open access (8.2%). Twelve-month embargoes were most commonly cited (90.6%) with 28.6% of journals stating that they did not require a complete transfer of copyright. Prices for open access options ranged from $750 to $4,000 (median $3,000). No statistically significant differences were found in journal impact measures comparing journals with open access options to journals without open access options. Diagnostic and interventional radiology journals have widely adopted open access options with a few radiology journals being exclusively open access. Copyright © 2017 American College of Radiology. Published by Elsevier Inc. All rights reserved.

  4. GestuRe and ACtion Exemplar (GRACE) video database: stimuli for research on manners of human locomotion and iconic gestures.

    PubMed

    Aussems, Suzanne; Kwok, Natasha; Kita, Sotaro

    2018-06-01

Human locomotion is a fundamental class of events, and manners of locomotion (e.g., how the limbs are used to achieve a change of location) are commonly encoded in language and gesture. To our knowledge, there is no openly accessible database containing normed human locomotion stimuli. Therefore, we introduce the GestuRe and ACtion Exemplar (GRACE) video database, which contains 676 videos of actors performing novel manners of human locomotion (i.e., moving from one location to another in an unusual manner) and videos of a female actor producing iconic gestures that represent these actions. The usefulness of the database was demonstrated across four norming experiments. First, our database contains clear matches and mismatches between iconic gesture videos and action videos. Second, the male and female actors whose action videos best matched the gestures perform the same actions in very similar manners and different actions in highly distinct manners. Third, all the actions in the database are distinct from each other. Fourth, adult native English speakers were unable to describe the 26 different actions concisely, indicating that the actions are unusual. This normed stimulus set is useful for experimental psychologists working in the language, gesture, visual perception, categorization, memory, and other related domains.

  5. SNPversity: a web-based tool for visualizing diversity

    PubMed Central

    Schott, David A; Vinnakota, Abhinav G; Portwood, John L; Andorf, Carson M

    2018-01-01

    Abstract Many stand-alone desktop software suites exist to visualize single nucleotide polymorphism (SNP) diversity, but web-based software that can be easily implemented and used for biological databases is absent. SNPversity was created to answer this need by building an open-source visualization tool that can be implemented on a Unix-like machine and served through a web browser that can be accessible worldwide. SNPversity consists of a HDF5 database back-end for SNPs, a data exchange layer powered by TASSEL libraries that represent data in JSON format, and an interface layer using PHP to visualize SNP information. SNPversity displays data in real-time through a web browser in grids that are color-coded according to a given SNP’s allelic status and mutational state. SNPversity is currently available at MaizeGDB, the maize community’s database, and will be soon available at GrainGenes, the clade-oriented database for Triticeae and Avena species, including wheat, barley, rye, and oat. The code and documentation are uploaded onto github, and they are freely available to the public. We expect that the tool will be highly useful for other biological databases with a similar need to display SNP diversity through their web interfaces. Database URL: https://www.maizegdb.org/snpversity PMID:29688387
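The abstract describes a data exchange layer that serves SNP information to the browser as JSON. As an illustration of that pattern, the snippet below encodes and consumes a hypothetical JSON payload of genotype calls; the field names (`snps`, `calls`, the line names) are invented for the example and are not SNPversity's actual schema.

```python
import json

# Hypothetical JSON payload of the kind a SNP-visualization exchange
# layer might serve: one genotype call per sample per site.
payload = json.dumps({
    "snps": [
        {"id": "snp001", "chrom": "1", "pos": 10234,
         "calls": {"lineA": "A/A", "lineB": "A/G"}},
        {"id": "snp002", "chrom": "1", "pos": 10877,
         "calls": {"lineA": "C/C", "lineB": "C/C"}},
    ]
})

data = json.loads(payload)
# Flag sites where the two lines differ -- the diversity a viewer
# would color-code in its grid display.
divergent = [s["id"] for s in data["snps"]
             if s["calls"]["lineA"] != s["calls"]["lineB"]]
print(divergent)  # ['snp001']
```

Serving compact JSON like this is what lets an interface layer (PHP, in SNPversity's case) render allelic status in real time without the browser touching the HDF5 back-end directly.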

  6. Comet: an open-source MS/MS sequence database search tool.

    PubMed

    Eng, Jimmy K; Jahan, Tahmina A; Hoopmann, Michael R

    2013-01-01

    Proteomics research routinely involves identifying peptides and proteins via MS/MS sequence database search. Thus the database search engine is an integral tool in many proteomics research groups. Here, we introduce the Comet search engine to the existing landscape of commercial and open-source database search tools. Comet is open source, freely available, and based on one of the original sequence database search tools that has been widely used for many years. © 2012 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim.

  7. Sci-Hub provides access to nearly all scholarly literature.

    PubMed

    Himmelstein, Daniel S; Romero, Ariel Rodriguez; Levernier, Jacob G; Munro, Thomas Anthony; McLaughlin, Stephen Reid; Greshake Tzovaras, Bastian; Greene, Casey S

    2018-03-01

    The website Sci-Hub enables users to download PDF versions of scholarly articles, including many articles that are paywalled at their journal's site. Sci-Hub has grown rapidly since its creation in 2011, but the extent of its coverage has been unclear. Here we report that, as of March 2017, Sci-Hub's database contains 68.9% of the 81.6 million scholarly articles registered with Crossref and 85.1% of articles published in toll access journals. We find that coverage varies by discipline and publisher, and that Sci-Hub preferentially covers popular, paywalled content. For toll access articles, we find that Sci-Hub provides greater coverage than the University of Pennsylvania, a major research university in the United States. Green open access to toll access articles via licit services, on the other hand, remains quite limited. Our interactive browser at https://greenelab.github.io/scihub allows users to explore these findings in more detail. For the first time, nearly all scholarly literature is available gratis to anyone with an Internet connection, suggesting the toll access business model may become unsustainable. © 2018, Himmelstein et al.
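The reported coverage figures translate directly into article counts; a back-of-envelope check of the 68.9% figure against the 81.6 million Crossref-registered articles:

```python
# Back-of-envelope: how many articles does 68.9% coverage of
# 81.6 million Crossref-registered articles imply?
registered = 81.6e6
coverage = 0.689
in_scihub = coverage * registered
print(f"{in_scihub / 1e6:.1f} million articles")  # 56.2 million articles
```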

  8. First results of MAO NASU SS bodies photographic archive digitizing

    NASA Astrophysics Data System (ADS)

    Pakuliak, L.; Andruk, V.; Shatokhina, S.; Golovnya, V.; Yizhakevych, O.; Kulyk, I.

    2013-05-01

The MAO NASU glass archive contains about 1800 photographic plates with planets and their satellites (including nearly 80 images of Uranus, Pluto and Neptune), about 1700 plates with minor planets and about 900 plates with comets. Plates were made during 1949-1999 using 11 telescopes of different focal lengths, mostly the Double Wide-angle Astrograph (F/D=2000/400) and the Double Long-focus Astrograph (F/D=5500/400) of MAO NASU. Observational sites are Kyiv, Lviv (Ukraine), Biurakan (Armenia), Abastumani (Georgia), Mt. Maidanak (Uzbekistan), and Quito (Ecuador). Tables present the numbers of plates, sub-divided by year and object. The database with metadata of plates (DBGPA) is available on the computer cluster of MAO (http://gua.db.ukr-vo.org) via open access. The database accumulates the archives of four Ukrainian observatories involved in the UkrVO national project. Together with the archive managing system, the database serves as a test area for JDA - Joint Digital Archive - the core of the UkrVO.

  9. ERAIZDA: a model for holistic annotation of animal infectious and zoonotic diseases

    PubMed Central

    Buza, Teresia M.; Jack, Sherman W.; Kirunda, Halid; Khaitsa, Margaret L.; Lawrence, Mark L.; Pruett, Stephen; Peterson, Daniel G.

    2015-01-01

There is an urgent need for a unified resource that integrates trans-disciplinary annotations of emerging and reemerging animal infectious and zoonotic diseases. Such data integration will provide a wonderful opportunity for epidemiologists, researchers and health policy makers to make data-driven decisions designed to improve animal health. Integrating emerging and reemerging animal infectious and zoonotic disease data from a large variety of sources into a unified open-access resource provides a more plausible basis for achieving a better understanding of infectious and zoonotic diseases. We have developed a model for interlinking annotations of these diseases. These diseases are of particular interest because of the threats they pose to animal health, human health and global health security. We demonstrated the application of this model using brucellosis, an infectious and zoonotic disease. Preliminary annotations were deposited into the VetBioBase database (http://vetbiobase.igbb.msstate.edu). This database is associated with user-friendly tools to facilitate searching, retrieving and downloading of disease-related information. Database URL: http://vetbiobase.igbb.msstate.edu PMID:26581408

  10. Investigating the Potential Impacts of Energy Production in the Marcellus Shale Region Using the Shale Network Database

    NASA Astrophysics Data System (ADS)

    Brantley, S.; Brazil, L.

    2017-12-01

    The Shale Network's extensive database of water quality observations enables educational experiences, grounded in real data, about the potential impacts of resource extraction. Through tools that are open source and free to use, researchers, educators, and citizens can access and analyze the same data that the Shale Network team has used in peer-reviewed publications about the potential impacts of hydraulic fracturing on water. The development of the Shale Network database has been made possible through efforts led by an academic team and involving numerous individuals from government agencies, citizen science organizations, and private industry. Thus far, these tools and data have been used to engage high school students, university undergraduate and graduate students, and citizens so that all can discover how energy production affects the Marcellus Shale region, which includes Pennsylvania and other nearby states. This presentation will describe these data tools, how the Shale Network has used them in developing lesson plans, and the resources available to learn more.

  11. Seabird databases and the new paradigm for scientific publication and attribution

    USGS Publications Warehouse

    Hatch, Scott A.

    2010-01-01

    For more than 300 years, the peer-reviewed journal article has been the principal medium for packaging and delivering scientific data. With new tools for managing digital data, a new paradigm is emerging—one that demands open and direct access to data and that enables and rewards a broad-based approach to scientific questions. Ground-breaking papers in the future will increasingly be those that creatively mine and synthesize vast stores of data available on the Internet. This is especially true for conservation science, in which essential data can be readily captured in standard record formats. For seabird professionals, a number of globally shared databases are in the offing, or should be. These databases will capture the salient results of inventories and monitoring, pelagic surveys, diet studies, and telemetry. A number of real or perceived barriers to data sharing exist, but none is insurmountable. Our discipline should take an important stride now by adopting a specially designed markup language for annotating and sharing seabird data.
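
    A shared markup language of the kind proposed above would let any consumer parse seabird records without custom code. The element names and values in this sketch are purely hypothetical, not an adopted standard:

```python
import xml.etree.ElementTree as ET

# A toy record in a hypothetical seabird markup language; the element
# names are invented for illustration and do not reflect any real schema.
record = ET.Element("colonyCount")
ET.SubElement(record, "species").text = "Rissa tridactyla"
ET.SubElement(record, "site").text = "Middleton Island"
ET.SubElement(record, "year").text = "2009"
ET.SubElement(record, "nests").text = "4825"

xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)

# Any recipient can round-trip the shared record with a standard parser:
parsed = ET.fromstring(xml_text)
print(parsed.findtext("species"))  # Rissa tridactyla
```

    The benefit of a standard schema is exactly this round trip: producers and consumers agree on element names once, and tooling follows.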

  12. Open Access in the Natural and Social Sciences: The Correspondence of Innovative Moves to Enhance Access, Inclusion and Impact in Scholarly Communication

    ERIC Educational Resources Information Center

    Armbruster, Chris

    2008-01-01

    Online, open access is the superior model for scholarly communication. A variety of scientific communities in physics, the life sciences and economics have gone furthest in innovating their scholarly communication through open access, enhancing accessibility for scientists, students and the interested public. Open access enjoys a comparative…

  13. Distributed data discovery, access and visualization services to Improve Data Interoperability across different data holdings

    NASA Astrophysics Data System (ADS)

    Palanisamy, G.; Krassovski, M.; Devarakonda, R.; Santhana Vannan, S.

    2012-12-01

    The current climate debate is highlighting the importance of free, open, and authoritative sources of high-quality climate data that are available for peer review and for collaborative purposes. It is increasingly important to allow organizations around the world to share climate data openly and to enable dynamic processing of that data. Such advanced access can be provided via Web-based services, using common, community-agreed standards, without requiring providers to change the internal structures they use to describe their data. The modern scientific community has become diverse and increasingly complex in nature. To meet the demands of such a diverse user community, the modern data supplier has to provide data and related information through searchable, data- and process-oriented tools. This can be accomplished by setting up an online, Web-based system with a relational database as a back end. The proposed presentation will outline the following common features of Web data access and search systems:
    - flexible data discovery;
    - data in commonly used formats (e.g., CSV, NetCDF);
    - metadata prepared in standard formats (FGDC, ISO 19115, EML, DIF, etc.);
    - data subsetting capabilities, down to individual data elements;
    - standards-based data access protocols and mechanisms (SOAP, REST, OPeNDAP, OGC, etc.);
    - integration of services across different data systems (discovery to access, visualization and subsetting).
    The presentation will also include specific examples of the integration of data systems developed by Oak Ridge National Laboratory's Climate Change Science Institute, and of their ability to communicate with each other to enable better data interoperability and integration. References: [1] Devarakonda, Ranjeet, and Harold Shanafield. "Drupal: Collaborative framework for science research." Collaboration Technologies and Systems (CTS), 2011 International Conference on. IEEE, 2011. [2] Devarakonda, R., Shrestha, B., Palanisamy, G., Hook, L. A., Killeffer, T. S., Boden, T. A., ... & Lazer, K. (2014). The new online metadata editor for generating structured metadata. Oak Ridge National Laboratory (ORNL).
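
    A standards-based subsetting request of the kind listed above is usually just a parameterized URL. The endpoint and parameter names in this sketch are assumptions for illustration, not any specific service's API:

```python
from urllib.parse import urlencode

def subset_url(base, variable, bbox, start, end, fmt="csv"):
    """Compose a query URL narrowing a dataset to one variable, a bounding
    box (west, south, east, north), and a time window. Parameter names are
    hypothetical; real services define their own."""
    params = {
        "var": variable,
        "bbox": ",".join(str(v) for v in bbox),
        "time_start": start,
        "time_end": end,
        "format": fmt,
    }
    return base + "?" + urlencode(params)

url = subset_url(
    "https://example.org/api/v1/subset",   # hypothetical endpoint
    "air_temperature",
    (-85.0, 35.0, -80.0, 37.0),
    "2010-01-01", "2010-12-31",
)
print(url)
```

    Fetching the URL with any HTTP client would then return the subset in the requested format, e.g. CSV.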

  14. CovalentDock Cloud: a web server for automated covalent docking.

    PubMed

    Ouyang, Xuchang; Zhou, Shuo; Ge, Zemei; Li, Runtao; Kwoh, Chee Keong

    2013-07-01

    Covalent binding is an important mechanism by which many drugs gain their function. We developed a computational algorithm to model this chemical event and extended it into a web server, CovalentDock Cloud, making it accessible directly online without any local installation or configuration. It provides a simple yet user-friendly web interface for performing covalent docking experiments and analysis online. The web server accepts the structures of both the ligand and the receptor, uploaded by the user or retrieved from online databases with a valid access ID. It identifies the potential covalent binding patterns, carries out the covalent docking experiments and provides visualization of the results for user analysis. This web server is free and open to all users at http://docking.sce.ntu.edu.sg/.
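
    Submitting structures to such a server programmatically amounts to an HTTP POST. The form-field names and placeholder structure text below are assumptions; the abstract only documents the server's web interface, so this is a sketch, not the actual CovalentDock Cloud API:

```python
import urllib.parse
import urllib.request

# Placeholder structure text; real submissions would carry full MOL2/PDB files.
ligand_mol2 = "@<TRIPOS>MOLECULE\nexample_ligand\n..."
receptor_pdb = "HEADER    EXAMPLE RECEPTOR\n..."

# Hypothetical form-field names ("ligand", "receptor"):
payload = urllib.parse.urlencode({
    "ligand": ligand_mol2,
    "receptor": receptor_pdb,
}).encode("utf-8")

req = urllib.request.Request(
    "http://docking.sce.ntu.edu.sg/",  # server URL from the abstract
    data=payload,
    method="POST",
)
# The request is only constructed here, not sent; a real submission would
# call urllib.request.urlopen(req) and then poll the server for results.
print(req.get_method())  # POST
```

    In practice the web form is the supported interface; this sketch only shows what an automated submission would look like in outline.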

  15. Finding research information on the web: how to make the most of Google and other free search tools.

    PubMed

    Blakeman, Karen

    2013-01-01

    The Internet and the World Wide Web have had a major impact on the accessibility of research information. The move towards open access and the development of institutional repositories have resulted in increasing amounts of information being made available free of charge. Many of these resources are not included in conventional subscription databases, and Google is not always the best way to ensure that one is picking up all relevant material on a topic. This article looks at how Google's search engine works and how to use Google more effectively for identifying research information, covers alternatives to Google, and reviews some of the specialist tools that have evolved to cope with the diverse forms of information that now exist in electronic form.
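
    Using Google more effectively often comes down to its documented advanced operators (e.g. site: and filetype:), which can narrow a query to institutional repositories. The helper below simply composes such a query string; the function name is invented for illustration:

```python
def repository_query(terms, site=None, filetype=None):
    """Compose a Google query using the documented site: and filetype:
    operators to target repository content."""
    parts = [terms]
    if site:
        parts.append(f"site:{site}")       # restrict results to one domain
    if filetype:
        parts.append(f"filetype:{filetype}")  # restrict to one file format
    return " ".join(parts)

q = repository_query("open access microarray", site="ac.uk", filetype="pdf")
print(q)  # open access microarray site:ac.uk filetype:pdf
```

    The same operators work pasted directly into the search box; the helper just makes them scriptable.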

  16. CovalentDock Cloud: a web server for automated covalent docking

    PubMed Central

    Ouyang, Xuchang; Zhou, Shuo; Ge, Zemei; Li, Runtao; Kwoh, Chee Keong

    2013-01-01

    Covalent binding is an important mechanism for many drugs to gain its function. We developed a computational algorithm to model this chemical event and extended it to a web server, the CovalentDock Cloud, to make it accessible directly online without any local installation and configuration. It provides a simple yet user-friendly web interface to perform covalent docking experiments and analysis online. The web server accepts the structures of both the ligand and the receptor uploaded by the user or retrieved from online databases with valid access id. It identifies the potential covalent binding patterns, carries out the covalent docking experiments and provides visualization of the result for user analysis. This web server is free and open to all users at http://docking.sce.ntu.edu.sg/. PMID:23677616

  17. The Longhorn Array Database (LAD): An Open-Source, MIAME compliant implementation of the Stanford Microarray Database (SMD)

    PubMed Central

    Killion, Patrick J; Sherlock, Gavin; Iyer, Vishwanath R

    2003-01-01

    Background: The power of microarray analysis can be realized only if data is systematically archived and linked to biological annotations as well as analysis algorithms. Description: The Longhorn Array Database (LAD) is a MIAME-compliant microarray database that operates on PostgreSQL and Linux. It is a fully open-source version of the Stanford Microarray Database (SMD), one of the largest microarray databases, and is freely available. Conclusions: Our development of LAD provides a simple, free, open, reliable and proven solution for the storage and analysis of two-color microarray data. PMID:12930545

  18. Building a multi-scaled geospatial temporal ecology database from disparate data sources: fostering open science and data reuse.

    PubMed

    Soranno, Patricia A; Bissell, Edward G; Cheruvelil, Kendra S; Christel, Samuel T; Collins, Sarah M; Fergus, C Emi; Filstrup, Christopher T; Lapierre, Jean-Francois; Lottig, Noah R; Oliver, Samantha K; Scott, Caren E; Smith, Nicole J; Stopyak, Scott; Yuan, Shuai; Bremigan, Mary Tate; Downing, John A; Gries, Corinna; Henry, Emily N; Skaff, Nick K; Stanley, Emily H; Stow, Craig A; Tan, Pang-Ning; Wagner, Tyler; Webster, Katherine E

    2015-01-01

    Although there are considerable site-based data for individual or groups of ecosystems, these datasets are widely scattered, have different data formats and conventions, and often have limited accessibility. At the broader scale, national datasets exist for a large number of geospatial features of land, water, and air that are needed to fully understand variation among these ecosystems. However, such datasets originate from different sources and have different spatial and temporal resolutions. By taking an open-science perspective and by combining site-based ecosystem datasets and national geospatial datasets, science gains the ability to ask important research questions related to grand environmental challenges that operate at broad scales. Documentation of such complicated database integration efforts, through peer-reviewed papers, is recommended to foster reproducibility and future use of the integrated database. Here, we describe the major steps, challenges, and considerations in building an integrated database of lake ecosystems, called LAGOS (LAke multi-scaled GeOSpatial and temporal database), that was developed at the sub-continental study extent of 17 US states (1,800,000 km2). LAGOS includes two modules: LAGOSGEO, with geospatial data on every lake with surface area larger than 4 ha in the study extent (~50,000 lakes), including climate, atmospheric deposition, land use/cover, hydrology, geology, and topography measured across a range of spatial and temporal extents; and LAGOSLIMNO, with lake water quality data compiled from ~100 individual datasets for a subset of lakes in the study extent (~10,000 lakes). Procedures for the integration of datasets included: creating a flexible database design; authoring and integrating metadata; documenting data provenance; quantifying spatial measures of geographic data; quality-controlling integrated and derived data; and extensively documenting the database. 
Our procedures make a large, complex, and integrated database reproducible and extensible, allowing users to ask new research questions with the existing database or through the addition of new data. The largest challenge of this task was the heterogeneity of the data, formats, and metadata. Many steps of data integration need manual input from experts in diverse fields, requiring close collaboration.
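
    Two of the integration steps listed above, harmonizing heterogeneous source formats into one schema and carrying provenance with every row, can be sketched in miniature. The column names, delimiters, and values below are invented for illustration, not LAGOS's actual schema:

```python
import csv
import io

# Two toy source files with different column names and delimiters:
source_a = "lake,tp_ugL\nClear Lake,12.5\nMud Lake,48.0\n"
source_b = "waterbody;phosphorus_ppb\nRound Lake;31.2\n"

def harmonize(text, name_col, value_col, delimiter, source_id):
    """Map one source's columns onto a common schema, tagging each row
    with its source so provenance survives integration."""
    rows = []
    for rec in csv.DictReader(io.StringIO(text), delimiter=delimiter):
        rows.append({
            "lake_name": rec[name_col],
            "tp_ug_per_L": float(rec[value_col]),
            "source": source_id,   # provenance travels with each row
        })
    return rows

integrated = (
    harmonize(source_a, "lake", "tp_ugL", ",", "dataset_A")
    + harmonize(source_b, "waterbody", "phosphorus_ppb", ";", "dataset_B")
)
print(len(integrated))  # 3
print(sorted({r["source"] for r in integrated}))
```

    At LAGOS scale the same mapping is driven by authored metadata rather than hand-written per-source code, which is why metadata integration is its own step.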

  19. Building a multi-scaled geospatial temporal ecology database from disparate data sources: Fostering open science through data reuse

    USGS Publications Warehouse

    Soranno, Patricia A.; Bissell, E.G.; Cheruvelil, Kendra S.; Christel, Samuel T.; Collins, Sarah M.; Fergus, C. Emi; Filstrup, Christopher T.; Lapierre, Jean-Francois; Lottig, Noah R.; Oliver, Samantha K.; Scott, Caren E.; Smith, Nicole J.; Stopyak, Scott; Yuan, Shuai; Bremigan, Mary Tate; Downing, John A.; Gries, Corinna; Henry, Emily N.; Skaff, Nick K.; Stanley, Emily H.; Stow, Craig A.; Tan, Pang-Ning; Wagner, Tyler; Webster, Katherine E.

    2015-01-01

    Although there are considerable site-based data for individual or groups of ecosystems, these datasets are widely scattered, have different data formats and conventions, and often have limited accessibility. At the broader scale, national datasets exist for a large number of geospatial features of land, water, and air that are needed to fully understand variation among these ecosystems. However, such datasets originate from different sources and have different spatial and temporal resolutions. By taking an open-science perspective and by combining site-based ecosystem datasets and national geospatial datasets, science gains the ability to ask important research questions related to grand environmental challenges that operate at broad scales. Documentation of such complicated database integration efforts, through peer-reviewed papers, is recommended to foster reproducibility and future use of the integrated database. Here, we describe the major steps, challenges, and considerations in building an integrated database of lake ecosystems, called LAGOS (LAke multi-scaled GeOSpatial and temporal database), that was developed at the sub-continental study extent of 17 US states (1,800,000 km2). LAGOS includes two modules: LAGOSGEO, with geospatial data on every lake with surface area larger than 4 ha in the study extent (~50,000 lakes), including climate, atmospheric deposition, land use/cover, hydrology, geology, and topography measured across a range of spatial and temporal extents; and LAGOSLIMNO, with lake water quality data compiled from ~100 individual datasets for a subset of lakes in the study extent (~10,000 lakes). Procedures for the integration of datasets included: creating a flexible database design; authoring and integrating metadata; documenting data provenance; quantifying spatial measures of geographic data; quality-controlling integrated and derived data; and extensively documenting the database. 
Our procedures make a large, complex, and integrated database reproducible and extensible, allowing users to ask new research questions with the existing database or through the addition of new data. The largest challenge of this task was the heterogeneity of the data, formats, and metadata. Many steps of data integration need manual input from experts in diverse fields, requiring close collaboration.

  20. ArrayBridge: Interweaving declarative array processing with high-performance computing

    DOE Office of Scientific and Technical Information (OSTI.GOV)

    Xing, Haoyuan; Floratos, Sofoklis; Blanas, Spyros

    Scientists are increasingly turning to datacenter-scale computers to produce and analyze massive arrays. Despite decades of database research that extols the virtues of declarative query processing, scientists still write, debug and parallelize imperative HPC kernels even for the most mundane queries. This impedance mismatch has been partly attributed to the cumbersome data loading process; in response, the database community has proposed in situ mechanisms to access data in scientific file formats. Scientists, however, desire more than a passive access method that reads arrays from files. This paper describes ArrayBridge, a bi-directional array view mechanism for scientific file formats that aims to make declarative array manipulations interoperable with imperative file-centric analyses. Our prototype implementation of ArrayBridge uses HDF5 as the underlying array storage library and seamlessly integrates into the SciDB open-source array database system. In addition to fast querying over external array objects, ArrayBridge produces arrays in the HDF5 file format just as easily as it can read from it. ArrayBridge also supports time travel queries from imperative kernels through the unmodified HDF5 API, and automatically deduplicates between array versions for space efficiency. Our extensive performance evaluation at NERSC, a large-scale scientific computing facility, shows that ArrayBridge exhibits statistically indistinguishable performance and I/O scalability relative to the native SciDB storage engine.
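
    The version-deduplication idea can be illustrated with a small content-addressed store: identical data committed twice is kept once. This is a simplification for illustration, not ArrayBridge's actual mechanism, and the class and method names are invented:

```python
import hashlib

class VersionStore:
    """Toy content-addressed store: each committed chunk is keyed by its
    hash, so byte-identical versions share one stored copy."""

    def __init__(self):
        self._blobs = {}      # digest -> serialized array chunk
        self._versions = []   # version i -> digest

    def commit(self, chunk: bytes) -> int:
        digest = hashlib.sha256(chunk).hexdigest()
        self._blobs.setdefault(digest, chunk)  # store identical data once
        self._versions.append(digest)
        return len(self._versions) - 1         # new version number

    def unique_chunks(self) -> int:
        return len(self._blobs)

store = VersionStore()
store.commit(b"\x00" * 64)   # version 0
store.commit(b"\x00" * 64)   # version 1: unchanged, deduplicated
store.commit(b"\x01" * 64)   # version 2: new content
print(store.unique_chunks())  # 2
```

    Three logical versions exist, but only two physical chunks are stored, which is the space saving the paper's time-travel support relies on.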
